
Pipelines

bc_dev_tools has two primary data pipelines: the Import & Compare pipeline for matching BC fault codes against FAQ issue types, and the Failed CSM Resolver pipeline for resolving failed service order submissions.

Import & Compare Pipeline

Triggered by python -m bc_dev_tools pipeline or interactive menu option 1. This is the core matching workflow.

[Diagram: Import & Compare Pipeline]

Flow:

  1. Data Sources (green) -- FAQ issue types are fetched from the FAQ REST API; BC fault code relationships are read from an Excel file.
  2. Import Layer (gray) -- The FAQ API client authenticates and retrieves data; the database manager uses pandas to read the Excel file. Both data sets are stored as flat tables in SQLite.
  3. Normalization (blue) -- Each flat table is decomposed into dimension tables (see Data Model) and a normalized fact table. SQL views are created for convenience.
  4. Matching (purple) -- The faqbc_data_matcher compares each BC row against every FAQ row using difflib.SequenceMatcher across four mapped column pairs. The default similarity threshold is 0.85.
  5. Output (red) -- Match results are written back to the BC source table, then exported to comparison_results.xlsx with three sheets: All Results, Matched, and Unmatched.
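
The matching step above can be illustrated with a minimal sketch. It is not the faqbc_data_matcher implementation: the column names in COLUMN_PAIRS are hypothetical placeholders for the four mapped pairs, and averaging the per-column ratios into one row score is an assumption made for illustration. Only difflib.SequenceMatcher and the 0.85 default threshold come from the description above.

```python
from difflib import SequenceMatcher

# Hypothetical (BC column, FAQ column) pairs; the real mapping is not shown here.
COLUMN_PAIRS = [
    ("bc_fault_area", "faq_fault_area"),
    ("bc_symptom", "faq_symptom"),
    ("bc_fault_code", "faq_fault_code"),
    ("bc_resolution", "faq_resolution"),
]
THRESHOLD = 0.85  # default similarity threshold from the pipeline description


def similarity(a, b):
    """Return a 0..1 ratio for two strings, ignoring case and outer whitespace."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def best_match(bc_row, faq_rows, threshold=THRESHOLD):
    """Compare one BC row against every FAQ row.

    Returns (faq_row, score) for the highest-scoring FAQ row, or (None, score)
    when the best score is below the threshold.
    Assumption: the row score is the mean of the per-column ratios.
    """
    best, best_score = None, 0.0
    for faq_row in faq_rows:
        scores = [similarity(bc_row[bc], faq_row[faq]) for bc, faq in COLUMN_PAIRS]
        score = sum(scores) / len(scores)
        if score > best_score:
            best, best_score = faq_row, score
    if best_score >= threshold:
        return best, best_score
    return None, best_score
```

Because every BC row is compared against every FAQ row, the cost grows with the product of the two table sizes, which is usually acceptable for lookup tables of this kind.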

Failed CSM Resolver Pipeline

Triggered by python -m bc_dev_tools resolve run. This pipeline automates the resolution of failed CSM quality report submissions.

[Diagram: Failed CSM Resolver Pipeline]

Flow:

  1. Configuration (gray) -- Determines the date cutoff. By default, loads the last successful run date from .last_run.json for incremental processing. --full fetches all records; --since YYYY-MM-DD sets a custom cutoff.
  2. Data Fetch (green) -- Two parallel fetches: BC OData API retrieves posted service invoices where csmQualityReportFailed = true and orderDate >= cutoff; FAQ REST API retrieves the issue type hierarchy.
  3. Matching & Resolution (blue) -- Each invoice line's fault descriptions are matched against the FAQ hierarchy top-down (Fault Area, Symptom Code, Fault Code, Resolution). If all four levels meet the similarity threshold, FAQ descriptions become the resolved values. Unresolved rows are flagged for manual review.
  4. Dry Run Gate -- --dry-run exits here with a preview, writing nothing.
  5. Spreadsheet Output (yellow) -- Results are written to the resolver spreadsheet at the FAILED_CSM_SPREADSHEET path as a formatted Excel table.
  6. Optional Actions (red) -- --submit sends matched rows to the CSM 1.0 API via ImportQuality. --mark-sent PATCHes BC records to set csmQualityReportSent via OData.
  7. State Save -- The cutoff date is saved to .last_run.json for the next incremental run.
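
The Configuration and State Save steps above can be sketched as follows. This is a rough illustration under stated assumptions, not the tool's code: the function names, the "last_run" JSON key, the precedence of --full over --since, and the fallback to a full fetch when no state file exists are all hypothetical. Only the .last_run.json file name and the three modes (incremental, --full, --since YYYY-MM-DD) come from the description.

```python
import json
from datetime import date
from pathlib import Path

STATE_FILE = Path(".last_run.json")
STATE_KEY = "last_run"  # hypothetical key name inside the state file


def resolve_cutoff(full=False, since=None, state_file=STATE_FILE):
    """Return the cutoff date, or None to fetch all records.

    Assumed precedence: --full, then --since, then the saved state file.
    """
    if full:
        return None
    if since is not None:
        # Raises ValueError if the value is not in YYYY-MM-DD form.
        return date.fromisoformat(since)
    if state_file.exists():
        data = json.loads(state_file.read_text(encoding="utf-8"))
        return date.fromisoformat(data[STATE_KEY])
    # Assumption: with no previous run recorded, behave like a full fetch.
    return None


def save_run_date(run_date, state_file=STATE_FILE):
    """Persist the date that the next incremental run should start from."""
    payload = {STATE_KEY: run_date.isoformat()}
    state_file.write_text(json.dumps(payload), encoding="utf-8")
```

In this sketch a --dry-run invocation would simply skip the call to save_run_date, so a preview never moves the incremental cutoff forward.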