Prevent Overwriting of Existing Sample Sheets (#3915) #4012

Merged
ahdamin merged 3 commits into master from prevent-overwriting-existing-sheets
Dec 11, 2024

@ahdamin ahdamin commented Dec 10, 2024

Description

Fixed

  • Added a samefile check to ensure the sample sheet from Housekeeper is not copied if it already matches the existing sample sheet in the sequencing run directory.
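A minimal sketch of what such a check could look like, assuming it is built on `pathlib.Path.samefile` and that a missing file on either side means the sheet should still be placed in the run directory. The function name `should_replace_sample_sheet` and the file names are hypothetical and are not taken from the cg code base; the log messages mirror the ones in the test logs further down.

```python
import logging
import os
import tempfile
from pathlib import Path

LOG = logging.getLogger(__name__)


def should_replace_sample_sheet(housekeeper_sheet: Path, run_dir_sheet: Path) -> bool:
    """Return False when both paths already refer to the same file on disk.

    Hypothetical helper for illustration only, not the actual cg implementation.
    """
    try:
        if housekeeper_sheet.samefile(run_dir_sheet):
            LOG.info(
                "Sample sheet from Housekeeper is the same as the sequencing directory sample sheet"
            )
            return False
    except FileNotFoundError:
        # samefile() stats both paths, so a missing file on either side ends up here.
        LOG.info("Sample sheet or target path does not exist")
    return True


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    hk_sheet = Path(tmp, "hk_SampleSheet.csv")
    hk_sheet.write_text("[Header]\n")

    linked_sheet = Path(tmp, "SampleSheet.csv")
    os.link(hk_sheet, linked_sheet)  # hard link: same file under a second name

    other_sheet = Path(tmp, "Other.csv")
    other_sheet.write_text("[Header]\n")  # separate file, even though the text matches

    missing_sheet = Path(tmp, "Missing.csv")  # never created

    results = (
        should_replace_sample_sheet(hk_sheet, linked_sheet),
        should_replace_sample_sheet(hk_sheet, other_sheet),
        should_replace_sample_sheet(hk_sheet, missing_sheet),
    )
```

In this sketch only the hard-linked sheet is left alone; a separate file or a missing target both lead to the sheet being replaced.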

How to prepare for test

  • SSH to the relevant server (depending on the type of change)
  • Use stage: us
  • Reserve (paxa) the environment: paxa
  • Install on stage (example for Hasta):
    bash /home/proj/production/servers/resources/hasta.scilifelab.se/update-tool-stage.sh -e S_cg -t cg -b prevent-overwriting-existing-sheets -a

How to test

  1. The sample sheet is not overwritten when it is the same as the one in HK
     • Choose a flow cell that has the same sample sheet in HK and in the sequencing dir
     • Run cg demultiplex samplesheet create <flow_cell_id>
     • Verify from the logs that it was not overwritten
  2. The sample sheet is overwritten when it differs from the one in HK
     • Modify a sample sheet in the sequencing dir
     • Run cg demultiplex samplesheet create <flow_cell_id>
     • Verify from the logs that the sample sheet is overwritten
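One nuance worth keeping in mind when preparing these two scenarios, assuming the check relies on `pathlib.Path.samefile`: that call compares file identity (device and inode), not file contents. The sketch below uses made-up file names and assumes the sheet in the run directory is a hard link to the Housekeeper file; it contrasts a hard link with a byte-identical copy.

```python
import filecmp
import os
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    hk_sheet = Path(tmp, "hk_SampleSheet.csv")
    hk_sheet.write_text("[Header]\nFileFormatVersion,2\n")

    # Assumption: the run directory sheet is a hard link to the Housekeeper file.
    hard_link = Path(tmp, "linked_SampleSheet.csv")
    os.link(hk_sheet, hard_link)

    # A plain copy has identical bytes but is a different file on disk.
    plain_copy = Path(tmp, "copied_SampleSheet.csv")
    shutil.copy(hk_sheet, plain_copy)

    link_is_same_file = hk_sheet.samefile(hard_link)
    copy_is_same_file = hk_sheet.samefile(plain_copy)
    copy_has_same_content = filecmp.cmp(hk_sheet, plain_copy, shallow=False)
```

Under these assumptions a sheet that was copied into the sequencing dir by hand would count as different even when its contents match, so the first scenario needs a flow cell whose sheet really is the Housekeeper file.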

Review

  • Tests executed by AA
  • "Merge and deploy" approved by SD CO
    Thanks for filling in who performed the code review and the test!

This version is a

  • PATCH - when you make backwards compatible bug fixes or documentation/instructions

Implementation Plan

  • Deploy this branch on hasta stage and prod

@ahdamin ahdamin linked an issue Dec 10, 2024 that may be closed by this pull request

@diitaz93 diitaz93 left a comment

Looks great! I can add instructions on how to test in the PR description

Resolved review thread (outdated): cg/apps/demultiplex/sample_sheet/api.py
Resolved review thread (outdated): cg/apps/demultiplex/sample_sheet/api.py

@ChrOertlin ChrOertlin left a comment

💯

Resolved review thread (outdated): cg/apps/demultiplex/sample_sheet/api.py
@diitaz93 diitaz93 marked this pull request as ready for review December 11, 2024 10:34
@diitaz93 diitaz93 requested a review from a team as a code owner December 11, 2024 10:34

ahdamin commented Dec 11, 2024

Testing in Stage

$ cg -l DEBUG demultiplex samplesheet create 180508_ST-E00269_0269_AHL32LCCXY
Running cg demultiplex.
Getting a valid sample sheet for flow cell 180508_ST-E00269_0269_AHL32LCCXY
Instantiating sample sheet API
Instantiating housekeeper api
Initializing Store
Instantiating lims api
Called undefined __pydantic_serializer__ on HousekeeperAPI, please wrap
Called undefined __dataclass_fields__ on HousekeeperAPI, please wrap
Set dry run to False
Set force to False
Instantiating IlluminaRunDirectoryData with path /home/proj/stage/sequencing_data/illumina/sequencing-runs/180508_ST-E00269_0269_AHL32LCCXY
Set sequencing run id to AHL32LCCXY
Fetching and validating sample sheet for 180508_ST-E00269_0269_AHL32LCCXY from Housekeeper
Fetch latest version from bundle HL32LCCXY
Fetching files with tags in [HL32LCCXY,samplesheet]
Fetching files from version 75318
Validating sample sheet
Validating that the sample sheet has all the necessary sections
Looking for index settings in the sample sheet
Found index settings: NoReverseComplements
Looking for read and index run cycles in the sample sheet
Validating samples
Order samples by lane
Validate that samples are unique in lane: 1
Validate that samples are unique in lane: 2
Validate that samples are unique in lane: 3
Validate that samples are unique in lane: 4
Validate that samples are unique in lane: 5
Validate that samples are unique in lane: 6
Validate that samples are unique in lane: 7
Validate that samples are unique in lane: 8
Validating override cycles for all samples
Samplesheet passed validation
Sample sheet from Housekeeper is the same as the sequencing directory sample sheet

Test FileNotFoundError

$ mv SampleSheet.csv SampleSheet_amin.csv
$ cg -l DEBUG demultiplex samplesheet create 161206_ST-E00201_0170_BH57JWALXX
Running cg demultiplex.
Getting a valid sample sheet for flow cell 161206_ST-E00201_0170_BH57JWALXX
Instantiating sample sheet API
Instantiating housekeeper api
Initializing Store
Instantiating lims api
Called undefined __pydantic_serializer__ on HousekeeperAPI, please wrap
Called undefined __dataclass_fields__ on HousekeeperAPI, please wrap
Set dry run to False
Set force to False
Instantiating IlluminaRunDirectoryData with path /home/proj/stage/sequencing_data/illumina/sequencing-runs/161206_ST-E00201_0170_BH57JWALXX
Set sequencing run id to BH57JWALXX
Fetching and validating sample sheet for 161206_ST-E00201_0170_BH57JWALXX from Housekeeper
Fetch latest version from bundle H57JWALXX
Fetching files with tags in [H57JWALXX,samplesheet]
Fetching files from version 75433
Validating sample sheet
Validating that the sample sheet has all the necessary sections
Looking for index settings in the sample sheet
Found index settings: NoReverseComplements
Looking for read and index run cycles in the sample sheet
Validating samples
Order samples by lane
Validate that samples are unique in lane: 1
Validate that samples are unique in lane: 2
Validate that samples are unique in lane: 3
Validate that samples are unique in lane: 4
Validate that samples are unique in lane: 5
Validate that samples are unique in lane: 6
Validate that samples are unique in lane: 7
Validate that samples are unique in lane: 8
Validating override cycles for all samples
Samplesheet passed validation
Sample sheet or target path does not exist. Housekeeper sample sheet path: /home/proj/stage/housekeeper-bundles/H57JWALXX/2022-01-18/SampleSheet.csv, Target sample sheet path: /home/proj/stage/sequencing_data/illumina/sequencing-runs/161206_ST-E00201_0170_BH57JWALXX/SampleSheet.csv
Sample sheet from Housekeeper is valid. Copying it to sequencing run directory
Linked /home/proj/stage/housekeeper-bundles/H57JWALXX/2022-01-18/SampleSheet.csv to /home/proj/stage/sequencing_data/illumina/sequencing-runs/161206_ST-E00201_0170_BH57JWALXX/SampleSheet.csv

@ahdamin ahdamin merged commit f0d62a0 into master Dec 11, 2024
10 checks passed
@ahdamin ahdamin deleted the prevent-overwriting-existing-sheets branch December 11, 2024 13:04

ahdamin commented Dec 11, 2024

Deploy to stage

Log deploy... done.
cg, version 64.5.28


ahdamin commented Dec 11, 2024

Deploy to PROD

Log deploy... done.
cg, version 64.5.28

Linked issue: Generation of sample sheets hindering cleaning of Illumina runs
4 participants