feature(dev-branch-pacbio) #3453

Merged
merged 76 commits into from
Aug 15, 2024

@ChrOertlin ChrOertlin commented Jul 22, 2024

Description

This PR introduces a new structure for the run devices post-processing:

flowchart TD
    X([sequencing_dir]) --> B[Run Data Generator]
    A([run_name]) --> B
    B --> C([Run Data])
    C --> D[Run File Manager]
    D --> E(["list[path]"])
    D --> F(["list[path]"])
    E --> G[Metrics Parser]
    G --> H([Metrics])
    H --> I[Transfer Service]
    I --> J([DTOs])
    J --> K[Store Service]
    F --> L[HK Service]
    K --> M[Post-processing Service]
    L --> M[Post-processing Service]

Added

  • CLI command cg post-process run <run-name>, which currently supports only PacBio SMRT cells but is intended to work for any sequencing run in the future
  • Abstract interfaces for the classes described in the diagram above, suitable for overriding with the proper classes for each device
  • Classes for the PacBio post-processing
  • CRUD functions to create and read PacBio DB entries
  • Util functions
  • Test functions, fixtures and fixture files
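
A minimal sketch of how the abstract interfaces in the diagram could be laid out and overridden per device. `RunDataGenerator`, `get_files_to_parse` and the `run_data` argument appear in this PR; the `RunData` fields, `get_run_data`, `get_files_to_store` and `ToyRunDataGenerator` are hypothetical names used only for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from pathlib import Path


@dataclass
class RunData:
    """Hypothetical stand-in for the run data object in the diagram."""

    run_name: str
    full_path: Path


class RunDataGenerator(ABC):
    """Turns a run name and a sequencing directory into run data."""

    @abstractmethod
    def get_run_data(self, run_name: str, sequencing_dir: str) -> RunData:
        """Return the run data for the given run (method name is an assumption)."""


class RunFileManager(ABC):
    """Locates the files that the later steps need."""

    @abstractmethod
    def get_files_to_parse(self, run_data: RunData) -> list[Path]:
        """Return the files the metrics parser should read."""

    @abstractmethod
    def get_files_to_store(self, run_data: RunData) -> list[Path]:
        """Return the files to store in Housekeeper (method name is an assumption)."""


class ToyRunDataGenerator(RunDataGenerator):
    """Illustrative device-specific override; not the PacBio implementation."""

    def get_run_data(self, run_name: str, sequencing_dir: str) -> RunData:
        # Join the sequencing directory and the run name into one path.
        return RunData(run_name=run_name, full_path=Path(sequencing_dir, run_name))
```

Each device would supply its own subclasses, so the post-processing service only needs to depend on the abstract types.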

Changed

  • Function store_fastq_path_in_housekeeper in housekeeper reworked into create_bundle_and_add_file_with_tags so that it handles files in general, not only FASTQ files

How to prepare for test

  • Ssh to relevant server (depending on type of change)
  • Use stage: us
  • Paxa the environment: paxa
  • Install on stage (example for Hasta):
    bash /home/proj/production/servers/resources/hasta.scilifelab.se/update-tool-stage.sh -e S_cg -t cg -b dev-pacbio-flow -a

How to test

See below

Review

  • Tests executed by SD
  • "Merge and deploy" approved by SD
    Thanks for filling in who performed the code review and the test!

This version is a

  • MINOR - when you add functionality in a backwards compatible manner

Implementation Plan

  • Deployed to stage:
INFO  [alembic.runtime.migration] Context impl MySQLImpl.
INFO  [alembic.runtime.migration] Will assume non-transactional DDL.
fatal: Not a git repository (or any parent up to mount point /home)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
repository is clean
Logging deploy ...
Getting deployer... done.
Getting last commit message and SHA... done.
Getting version of deploy scripts... /home/js.diazboada
done.
Log deploy... done.
cg, version 62.1.0
[js.diazboada@hasta:~] [S_base] $ up
  • Deployed to production:
Log deploy... done.
cg, version 62.1.0

@ChrOertlin ChrOertlin requested a review from a team as a code owner July 22, 2024 09:29
@diitaz93 diitaz93 marked this pull request as draft July 24, 2024 05:45
diitaz93 and others added 3 commits July 24, 2024 08:58
# Descriptions

Adds the implementation of the RunDataGenerator interface for the Pacbio post processing.
## Description
Closes Clinical-Genomics/add-new-tech#67
Add PacBioRunFileManager, fixture and tests


---------

Co-authored-by: Christian Oertlin <[email protected]>
diitaz93 added 3 commits July 25, 2024 12:15
## Description
Refactor PacBio metrics parser to comply with the post-processing flow


### Changed

- Condensed all individual parsers into one
- Refactored tests
## Description
Closes Clinical-Genomics/add-new-tech#64
The read metrics we parsed came from a file containing only HiFi data. It is important to parse the failed read metrics too. There is a file containing both HiFi and failed metrics (`m84202_240522_135641_s1.ccs_report.json`). It is, however, generated by **different software**, so the values are not exactly the same as the metrics parsed before. @J35P312 assured us that the difference is negligible.

### Added

- New parameters to parse from ccs file:
  - [x] <Q20 Reads
  - [x] <Q20 Yield (bp)
  - [x] <Q20 Read Length (mean, bp)

### Changed

- Renamed `HiFiMetrics` model to `ReadMetrics`
- The path to the ccs file

### Fixed

- Removed old ccs file usage and fixture
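
As a rough illustration of the new parameters, the sketch below pulls the three <Q20 values out of a report. It assumes the ccs report JSON exposes an `attributes` list whose entries carry `name` and `value` keys; that layout and the function name `parse_failed_read_metrics` are assumptions, not taken from this PR.

```python
import json

# Metric names as listed under "Added" above.
FAILED_READ_METRICS = (
    "<Q20 Reads",
    "<Q20 Yield (bp)",
    "<Q20 Read Length (mean, bp)",
)


def parse_failed_read_metrics(report_text: str) -> dict[str, float]:
    """Return the <Q20 metrics found in the text of a ccs report."""
    report = json.loads(report_text)
    # ASSUMPTION: metrics are stored as {"name": ..., "value": ...} entries
    # under a top-level "attributes" key.
    values = {attr["name"]: attr["value"] for attr in report.get("attributes", [])}
    missing = [name for name in FAILED_READ_METRICS if name not in values]
    if missing:
        raise KeyError(f"Missing metrics in ccs report: {missing}")
    return {name: values[name] for name in FAILED_READ_METRICS}
```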
@diitaz93

Test on stage: Non-existent smrt cell should fail with correct error message

$ cg -l DEBUG post-process run r84202_20241119_150802/1_A01
Running cg post-processing.
Instantiating post-processing services
Instantiating PacBio post-processing service
Instantiating status db
Instantiating housekeeper api
Initializing Store
Starting PacBio post-processing for run: r84202_20241119_150802/1_A01
File or directory /home/proj/stage/sequencing_data/pacbio/r84202_20241119_150802/1_A01 does not exist
Traceback (most recent call last):
  File "/home/proj/stage/bin/miniconda3/envs/S_cg/lib/python3.11/site-packages/cg/services/run_devices/error_handler.py", line 16, in wrapper
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/proj/stage/bin/miniconda3/envs/S_cg/lib/python3.11/site-packages/cg/services/run_devices/pacbio/run_file_manager/run_file_manager.py", line 21, in get_files_to_parse
    validate_files_or_directories_exist([run_path])
  File "/home/proj/stage/bin/miniconda3/envs/S_cg/lib/python3.11/site-packages/cg/services/run_devices/validators.py", line 25, in validate_files_or_directories_exist
    raise FileNotFoundError("Some of the provided paths do not exist")
FileNotFoundError: Some of the provided paths do not exist

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/proj/stage/bin/miniconda3/envs/S_cg/lib/python3.11/site-packages/cg/services/run_devices/error_handler.py", line 16, in wrapper
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/proj/stage/bin/miniconda3/envs/S_cg/lib/python3.11/site-packages/cg/services/run_devices/pacbio/metrics_parser/metrics_parser.py", line 41, in parse_metrics
    metrics_files: list[Path] = self.file_manager.get_files_to_parse(run_data)
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/proj/stage/bin/miniconda3/envs/S_cg/lib/python3.11/site-packages/cg/services/run_devices/error_handler.py", line 18, in wrapper
    raise to_raise(error) from error
cg.services.run_devices.exc.PostProcessingRunFileManagerError: Some of the provided paths do not exist

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/proj/stage/bin/miniconda3/envs/S_cg/lib/python3.11/site-packages/cg/services/run_devices/error_handler.py", line 16, in wrapper
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/proj/stage/bin/miniconda3/envs/S_cg/lib/python3.11/site-packages/cg/services/run_devices/pacbio/data_transfer_service/data_transfer_service.py", line 35, in get_post_processing_dtos
    metrics: PacBioMetrics = self.metrics_service.parse_metrics(run_data)
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/proj/stage/bin/miniconda3/envs/S_cg/lib/python3.11/site-packages/cg/services/run_devices/error_handler.py", line 18, in wrapper
    raise to_raise(error) from error
cg.services.run_devices.exc.PostProcessingParsingError: Some of the provided paths do not exist

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/proj/stage/bin/miniconda3/envs/S_cg/lib/python3.11/site-packages/cg/services/run_devices/error_handler.py", line 16, in wrapper
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/proj/stage/bin/miniconda3/envs/S_cg/lib/python3.11/site-packages/cg/services/run_devices/pacbio/data_storage_service/pacbio_store_service.py", line 55, in store_post_processing_data
    dtos: PacBioDTOs = self.data_transfer_service.get_post_processing_dtos(run_data)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/proj/stage/bin/miniconda3/envs/S_cg/lib/python3.11/site-packages/cg/services/run_devices/error_handler.py", line 20, in wrapper
    raise CgError(f"{error}") from error
cg.exc.CgError: Some of the provided paths do not exist

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/proj/stage/bin/miniconda3/envs/S_cg/lib/python3.11/site-packages/cg/services/run_devices/error_handler.py", line 16, in wrapper
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/proj/stage/bin/miniconda3/envs/S_cg/lib/python3.11/site-packages/cg/services/run_devices/pacbio/post_processing_service.py", line 53, in post_process
    self.store_service.store_post_processing_data(run_data=run_data, dry_run=dry_run)
  File "/home/proj/stage/bin/miniconda3/envs/S_cg/lib/python3.11/site-packages/cg/services/run_devices/error_handler.py", line 20, in wrapper
    raise CgError(f"{error}") from error
cg.exc.CgError: Some of the provided paths do not exist

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/proj/stage/bin/miniconda3/envs/S_cg/bin/cg", line 8, in <module>
    sys.exit(base())
             ^^^^^^
  File "/home/proj/stage/bin/miniconda3/envs/S_cg/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/proj/stage/bin/miniconda3/envs/S_cg/lib/python3.11/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/home/proj/stage/bin/miniconda3/envs/S_cg/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/proj/stage/bin/miniconda3/envs/S_cg/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/proj/stage/bin/miniconda3/envs/S_cg/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/proj/stage/bin/miniconda3/envs/S_cg/lib/python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/proj/stage/bin/miniconda3/envs/S_cg/lib/python3.11/site-packages/click/decorators.py", line 45, in new_func
    return f(get_current_context().obj, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/proj/stage/bin/miniconda3/envs/S_cg/lib/python3.11/site-packages/cg/cli/post_process/post_process.py", line 35, in post_process_sequencing_run
    post_processing_service.post_process(run_name=run_name, dry_run=dry_run)
  File "/home/proj/stage/bin/miniconda3/envs/S_cg/lib/python3.11/site-packages/cg/services/run_devices/error_handler.py", line 20, in wrapper
    raise CgError(f"{error}") from error
cg.exc.CgError: Some of the provided paths do not exist
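
The chained tracebacks above come from a wrapper in `error_handler.py` that re-raises each failure as a layer-specific exception. A minimal sketch of such a decorator, reconstructed only from the three lines visible in the traceback (`return fn(*args, **kwargs)`, `raise to_raise(error) from error`, `raise CgError(f"{error}") from error`): the decorator name `handle_post_processing_errors`, its signature, and the local exception classes are assumptions.

```python
from functools import wraps
from typing import Callable, Type


class CgError(Exception):
    """Local stand-in for cg.exc.CgError."""


class PostProcessingRunFileManagerError(CgError):
    """Local stand-in; the parent class is an assumption."""


def handle_post_processing_errors(
    to_except: tuple[Type[Exception], ...], to_raise: Type[Exception]
) -> Callable:
    """Translate expected errors into `to_raise` and anything else into CgError."""

    def decorator(fn: Callable) -> Callable:
        @wraps(fn)
        def wrapper(*args, **kwargs):
            try:
                return fn(*args, **kwargs)
            except to_except as error:
                # Expected failure: re-raise as the layer-specific exception.
                raise to_raise(error) from error
            except Exception as error:
                # Unexpected failure: fall back to the generic error.
                raise CgError(f"{error}") from error

        return wrapper

    return decorator


@handle_post_processing_errors(
    to_except=(FileNotFoundError,), to_raise=PostProcessingRunFileManagerError
)
def get_files_to_parse() -> list:
    """Toy function that fails the same way as in the log above."""
    raise FileNotFoundError("Some of the provided paths do not exist")
```

Because every layer uses `raise ... from error`, the original `FileNotFoundError` stays reachable through `__cause__`, which is why the log prints one traceback per layer.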

@diitaz93

Test on stage: Dry run

$ cg -l DEBUG post-process run r84202_20240319_150802/1_A01 --dry-run
Running cg post-processing.
Instantiating post-processing services
Instantiating PacBio post-processing service
Instantiating status db
Instantiating housekeeper api
Initializing Store
Starting PacBio post-processing for run: r84202_20240319_150802/1_A01
Dry run, no entries will be added to database for SMRT cell /home/proj/stage/sequencing_data/pacbio/r84202_20240319_150802/1_A01.
Dry run: would have added /home/proj/stage/sequencing_data/pacbio/r84202_20240319_150802/1_A01/statistics/unzipped_reports/control.report.json to Housekeeper.
Dry run: would have added /home/proj/stage/sequencing_data/pacbio/r84202_20240319_150802/1_A01/statistics/unzipped_reports/loading.report.json to Housekeeper.
Dry run: would have added /home/proj/stage/sequencing_data/pacbio/r84202_20240319_150802/1_A01/statistics/unzipped_reports/raw_data.report.json to Housekeeper.
Dry run: would have added /home/proj/stage/sequencing_data/pacbio/r84202_20240319_150802/1_A01/statistics/unzipped_reports/smrtlink-datasets.json to Housekeeper.
Dry run: would have added /home/proj/stage/sequencing_data/pacbio/r84202_20240319_150802/1_A01/statistics/m84202_240319_154410_s1.ccs_report.json to Housekeeper.
Dry run: would have added /home/proj/stage/sequencing_data/pacbio/r84202_20240319_150802/1_A01/hifi_reads/m84202_240319_154410_s1.hifi_reads.bam to Housekeeper.

@diitaz93 diitaz93 marked this pull request as ready for review August 14, 2024 14:54
@diitaz93

diitaz93 commented Aug 14, 2024

Test on stage: post-processing a PacBio run cell

Check that post-processing finishes successfully

$ cg -l DEBUG post-process run r84202_20240319_150802/1_A01
Running cg post-processing.
Instantiating post-processing services
Instantiating PacBio post-processing service
Instantiating status db
Instantiating housekeeper api
Initializing Store
Starting PacBio post-processing for run: r84202_20240319_150802/1_A01
Fetching bundle with name: EA094816
Bundle with name EA094816 already exists
Fetching tag with name: EA094816
Fetch latest version from bundle EA094816
Found Housekeeper version object for EA094816: <housekeeper.store.models.Version object at 0x7fa351cf3450>
Bundle EA094816 already has a file with the same name as /home/proj/stage/sequencing_data/pacbio/r84202_20240319_150802/1_A01/statistics/unzipped_reports/control.report.json
Fetching bundle with name: EA094816
Bundle with name EA094816 already exists
Fetching tag with name: EA094816
Fetch latest version from bundle EA094816
Found Housekeeper version object for EA094816: <housekeeper.store.models.Version object at 0x7fa351d10cd0>
Bundle EA094816 already has a file with the same name as /home/proj/stage/sequencing_data/pacbio/r84202_20240319_150802/1_A01/statistics/unzipped_reports/loading.report.json
Fetching bundle with name: EA094816
Bundle with name EA094816 already exists
Fetching tag with name: EA094816
Fetch latest version from bundle EA094816
Found Housekeeper version object for EA094816: <housekeeper.store.models.Version object at 0x7fa351d11bd0>
Bundle EA094816 already has a file with the same name as /home/proj/stage/sequencing_data/pacbio/r84202_20240319_150802/1_A01/statistics/unzipped_reports/raw_data.report.json
Fetching bundle with name: EA094816
Bundle with name EA094816 already exists
Fetching tag with name: EA094816
Fetch latest version from bundle EA094816
Found Housekeeper version object for EA094816: <housekeeper.store.models.Version object at 0x7fa351d12150>
Bundle EA094816 already has a file with the same name as /home/proj/stage/sequencing_data/pacbio/r84202_20240319_150802/1_A01/statistics/unzipped_reports/smrtlink-datasets.json
Fetching bundle with name: EA094816
Bundle with name EA094816 already exists
Fetching tag with name: EA094816
Fetch latest version from bundle EA094816
Found Housekeeper version object for EA094816: <housekeeper.store.models.Version object at 0x7fa351d122d0>
Fetch latest version from bundle EA094816
Fetching tag with name: ccs-report
Fetching tag with name: EA094816
Fetching tag with name: ccs-report
Fetching tag with name: EA094816
Created new bundle version dir: /home/proj/stage/housekeeper-bundles/EA094816/2024-08-14
Linked file: /home/proj/stage/sequencing_data/pacbio/r84202_20240319_150802/1_A01/statistics/m84202_240319_154410_s1.ccs_report.json -> /home/proj/stage/housekeeper-bundles/EA094816/2024-08-14/m84202_240319_154410_s1.ccs_report.json
File added to Housekeeper bundle EA094816
Fetching bundle with name: 2023-41693-02
Created new bundle: 2023-41693-02
Created new version
New bundle created with name 2023-41693-02
Fetching tag with name: 2023-41693-02
Fetch latest version from bundle 2023-41693-02
Found Housekeeper version object for 2023-41693-02: <housekeeper.store.models.Version object at 0x7fa351d2ad10>
Fetch latest version from bundle 2023-41693-02
Fetching tag with name: bam
Fetching tag with name: 2023-41693-02
Fetching tag with name: bam
Fetching tag with name: 2023-41693-02
Created new bundle version dir: /home/proj/stage/housekeeper-bundles/2023-41693-02/2024-08-14
Linked file: /home/proj/stage/sequencing_data/pacbio/r84202_20240319_150802/1_A01/hifi_reads/m84202_240319_154410_s1.hifi_reads.bam -> /home/proj/stage/housekeeper-bundles/2023-41693-02/2024-08-14/m84202_240319_154410_s1.hifi_reads.bam
File added to Housekeeper bundle 2023-41693-02

Check that files were added to Housekeeper in the sample bundle and in the SMRT cell bundle:

$ housekeeper get bundle EA094816
2024-08-14 15:24:36 hasta.scilifelab.se housekeeper.cli.core[119715] INFO Use root path /home/proj/stage/housekeeper-bundles
                📦 Bundle table 📦                 
┏━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ ID     ┃ Bundle name ┃ Version IDs ┃ Created    ┃
┡━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ 190806 │ EA094816    │ 195684      │ 2024-08-14 │
└────────┴─────────────┴─────────────┴────────────┘
                              📕 Version table 📕                               
┏━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━┓
┃ ID     ┃ Bundle name ┃ Nr files ┃ Included ┃ Archived ┃ Created    ┃ Expires ┃
┡━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━┩
│ 195684 │ EA094816    │ 5        │          │          │ 2024-08-14 │         │
└────────┴─────────────┴──────────┴──────────┴──────────┴────────────┴─────────┘
                                                            📜 Local files 📜                                                             
┏━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ ID      ┃ File name                                                                                        ┃ Tags                      ┃
┡━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ 6502566 │ /home/proj/stage/housekeeper-bundles/EA094816/2024-08-14/control.report.json                     │ EA094816, control-report  │
│ 6502567 │ /home/proj/stage/housekeeper-bundles/EA094816/2024-08-14/loading.report.json                     │ EA094816, loading-report  │
│ 6502568 │ /home/proj/stage/housekeeper-bundles/EA094816/2024-08-14/raw_data.report.json                    │ EA094816, raw-data-report │
│ 6502569 │ /home/proj/stage/housekeeper-bundles/EA094816/2024-08-14/smrtlink-datasets.json                  │ EA094816, datasets-report │
│ 6502571 │ /home/proj/stage/housekeeper-bundles/EA094816/2024-08-14/m84202_240319_154410_s1.ccs_report.json │ EA094816, ccs-report      │
└─────────┴──────────────────────────────────────────────────────────────────────────────────────────────────┴───────────────────────────┘
   📜 Remote files 📜    
┏━━━━┳━━━━━━━━━━━┳━━━━━━┓
┃ ID ┃ File name ┃ Tags ┃
┡━━━━╇━━━━━━━━━━━╇━━━━━━┩
└────┴───────────┴──────┘
[15:24] [hiseq.clinical@hasta:/home/proj/stage/sequencing_data/pacbio] [S_base]  $ housekeeper get bundle 2023-41693-02
2024-08-14 15:25:02 hasta.scilifelab.se housekeeper.cli.core[120146] INFO Use root path /home/proj/stage/housekeeper-bundles
                 📦 Bundle table 📦                  
┏━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ ID     ┃ Bundle name   ┃ Version IDs ┃ Created    ┃
┡━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ 190807 │ 2023-41693-02 │ 195685      │ 2024-08-14 │
└────────┴───────────────┴─────────────┴────────────┘
                               📕 Version table 📕                                
┏━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━┓
┃ ID     ┃ Bundle name   ┃ Nr files ┃ Included ┃ Archived ┃ Created    ┃ Expires ┃
┡━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━┩
│ 195685 │ 2023-41693-02 │ 1        │          │          │ 2024-08-14 │         │
└────────┴───────────────┴──────────┴──────────┴──────────┴────────────┴─────────┘
                                                           📜 Local files 📜                                                           
┏━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┓
┃ ID      ┃ File name                                                                                            ┃ Tags               ┃
┡━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━┩
│ 6502572 │ /home/proj/stage/housekeeper-bundles/2023-41693-02/2024-08-14/m84202_240319_154410_s1.hifi_reads.bam │ bam, 2023-41693-02 │
└─────────┴──────────────────────────────────────────────────────────────────────────────────────────────────────┴────────────────────┘
   📜 Remote files 📜    
┏━━━━┳━━━━━━━━━━━┳━━━━━━┓
┃ ID ┃ File name ┃ Tags ┃
┡━━━━╇━━━━━━━━━━━╇━━━━━━┩
└────┴───────────┴──────┘

Check that entries were added to StatusDB:

Screenshot 2024-08-14 at 17 00 51

@diitaz93

Tests on stage:

  • Run the new CLI command as follows:
    • show the help message, verify that the post-processing shows up
    $ cg --help
    Usage: cg [OPTIONS] COMMAND [ARGS]...
    
      cg - interface between tools at Clinical Genomics.
      ...
      Commands:
        ...
        post-process   Post-process sequencing runs from the sequencing instruments.
        ...
    
    • show the help message of the post-processing sub-command, verify that the correct message shows up
    $ cg post-process --help
    Usage: cg post-process [OPTIONS] COMMAND [ARGS]...
    
        Post-process sequencing runs from the sequencing instruments.
      
      Options:
        --help  Show this message and exit.
      
      Commands:
        run  Post-process a sequencing run from the PacBio instrument.
    
    • show the help message of the post-processing run, verify that the correct message shows up
    $ cg post-process run --help
      Running cg post-processing.
      Usage: cg post-process run [OPTIONS] RUN_NAME
      
        Post-process a sequencing run from the PacBio instrument.
      
        run-name is the full name of the sequencing unit of run. For example:     PacBio: 'r84202_20240522_133539/1_A01'
      
      Options:
        --dry-run  Runs the command without making any changes
        --help     Show this message and exit.
    
    • Pass a wrong SMRT cell ID, e.g. an Illumina flow cell name, and verify that an error is raised stating that the run name does not match any known pattern:
    $ cg post-process run 20231108_LH00188_0028_B22F52TLT3
    ...
    NameError: Run name 20231108_LH00188_0028_B22F52TLT3 does not match with any known sequencing run name pattern
    

@diitaz93 diitaz93 merged commit 33e9f5e into master Aug 15, 2024
9 checks passed

@Vince-janv Vince-janv left a comment

Herculean effort 🦁 Well done ⭐

cg/cli/post_process/post_process.py
Comment on lines +16 to +18
device: str = get_item_by_pattern_in_source(
    source=run_name, pattern_map=PATTERN_TO_DEVICE_MAP
)

I'm not sure we want to use the same function for mapping files to tags as we do to map directories to post-process classes. I feel like that function might be too general. I don't think having a helper function that only applies PATTERN_TO_DEVICE_MAP is problematic

I think we can wrap the map plus the function that returns the device.
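
One way the suggestion above could look, as a sketch only: a dedicated helper that owns the map and returns the device. The helper name `get_device_from_run_name`, the regular expression (inferred from the example run name 'r84202_20240522_133539/1_A01') and the device value "pacbio" are assumptions; the error message mirrors the one in the stage test.

```python
import re

# ASSUMPTION: pattern inferred from the example run name in the CLI help text.
PATTERN_TO_DEVICE_MAP: dict[str, str] = {
    r"^r\d+_\d{8}_\d{6}/\d_[A-Z]\d{2}$": "pacbio",
}


def get_device_from_run_name(run_name: str) -> str:
    """Return the device whose run name pattern matches `run_name`."""
    for pattern, device in PATTERN_TO_DEVICE_MAP.items():
        if re.match(pattern, run_name):
            return device
    raise NameError(
        f"Run name {run_name} does not match with any known sequencing run name pattern"
    )
```

Keeping the map and the lookup together leaves the more general pattern-matching function free to serve the file-to-tag use case on its own.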

diitaz93 added a commit that referenced this pull request Sep 2, 2024
## Description
Address last comments of #3453 regarding PacBio post-processing