Feature/atac demux #726

Merged
DriesSchaumont merged 170 commits into develop from feature/ataq-demux
Mar 25, 2024

Conversation

@VladimirShitov (Collaborator) commented Mar 1, 2024

Changelog

  • Add a script for downloading the ATAC tiny BCL test file
  • Add the reference/build_cellranger_arc_reference component. It creates a reference compatible with both the Multiome (ATAC + RNA) and ATAC-only pipelines. For more information, see the Cell Ranger documentation
  • Add the demux/cellranger_atac_mkfastq component for generating FASTQ files from BCL files
  • Fix a bug caused by a missing import in src/demux/cellranger_mkfastq/test.py
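
For readers unfamiliar with the demultiplexing step, here is a minimal sketch of the kind of command the demux/cellranger_atac_mkfastq component wraps. It is not the component's actual script. The helper name `build_mkfastq_command` and its arguments are hypothetical. The `--id`, `--run` and `--csv` options are the ones described in the Cell Ranger ATAC mkfastq documentation, so check them against your installed version.

```python
from pathlib import Path
from typing import List, Union

PathLike = Union[str, Path]


def build_mkfastq_command(run_dir: PathLike, sample_csv: PathLike, run_id: str) -> List[str]:
    """Assemble an argv list for `cellranger-atac mkfastq`.

    Hypothetical helper for illustration only; it does not run anything.
    """
    return [
        "cellranger-atac",
        "mkfastq",
        f"--id={run_id}",             # name of the output run folder
        f"--run={Path(run_dir)}",     # Illumina BCL run directory
        f"--csv={Path(sample_csv)}",  # simple CSV layout: lane, sample, index
    ]


if __name__ == "__main__":
    # Placeholder paths; replace them with a real BCL folder and layout file.
    print(" ".join(build_mkfastq_command("tiny_bcl", "layout.csv", "demo_run")))
```

If `cellranger-atac` is on the PATH, the returned list can be passed to `subprocess.run(cmd, check=True)`. Building the argument list separately keeps that part testable without the binary installed.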

Issue ticket number and link

Contributes to #398

Checklist before requesting a review

  • I have performed a self-review of my code

  • Conforms to the Contributor's guide

  • Check the correct box. Does this PR contain:

    • Breaking changes
    • New functionality
    • Major changes
    • Minor changes
    • Documentation
    • Bug fixes
  • Proposed changes are described in the CHANGELOG.md

  • CI tests succeed!

@VladimirShitov VladimirShitov changed the title from "Feature/ataq demux" to "Feature/atac demux" Mar 6, 2024
Member

@DriesSchaumont DriesSchaumont left a comment

LGTM! Thanks @VladimirShitov

@DriesSchaumont DriesSchaumont changed the base branch from main to develop March 25, 2024 10:34
@DriesSchaumont DriesSchaumont merged commit 9ccc4a3 into develop Mar 25, 2024
9 checks passed
@DriesSchaumont DriesSchaumont deleted the feature/ataq-demux branch March 25, 2024 10:35
dorien-er pushed a commit that referenced this pull request Apr 25, 2024
VladimirShitov added a commit that referenced this pull request Apr 26, 2024
DriesSchaumont pushed a commit that referenced this pull request Jun 12, 2024
DriesSchaumont pushed a commit that referenced this pull request Jun 12, 2024
DriesSchaumont added a commit that referenced this pull request Jun 14, 2024
…813)

* preproc script

* preproc script

* tokenize and pad script

* tokenize and pad script

* embedding script

* test resources and evaluation script

* cross check gene set

* pad_tokenize module

* update image

* remove test resources, update inputs

* use pytorch image

* remove integration component

* remove nvidia reqs

* remove load_model option

* Fix retag for viash-hub not using correct namespace separator (#745)

* CI - Build: Fix second occurance of namespace separator (#746)

* script to download scgpt test data

* remove test resources script

* adjust preprocessing script

* add scgpt full preproc module

* integration submodule

* integration submodule and add normalize_total flag

* add params

* Add script to download scgpt test resources (#750)

* add script to download scgpt test resources

* Update resources_test_scripts/scgpt.sh

Co-authored-by: Dries Schaumont <[email protected]>

* add drive folders containing data and model

* chmod +x

---------

Co-authored-by: Dries Schaumont <[email protected]>

* embedding module

* add unit tests

* undo subsampling test data

* update tests

* update tests

* update memory requirements

* update tests

* update changelog

* update component name

* fix tests, update changelog

* run tests on subsampled data

* adjust shm size

* update test

* update memory requirements nextflow

* update test

* update test

* update test

* expand unit tests, update script with loggers and todo

* Add ATAC demux (#726)

* run tests with subsampled data

* use specific model input files instead of directory

* update test data

* Remove muon as test dependency for concatenate_h5mu. (#773)

* scGPT binning component (#765)

Co-authored-by: Dries Schaumont <[email protected]>

* update embedding dependencies and gene name layer handling

* update input handling

* include dsbn logic

* update unit tests

* update config

* expand unit tests, fix dsbn

* Update CHANGELOG.md

Co-authored-by: Dries Schaumont <[email protected]>

* Update src/scgpt/embedding/config.vsh.yaml

Co-authored-by: Dries Schaumont <[email protected]>

* update required, remove shared memory docker

* Add scGPT padding and tokenization component (#754)

* add module for scgpt padding and tokenization

* remove base requirement

* update changelog

* update component name

* expand unit tests, update script with loggers and todo

* fix unit tests

* remove annotation script

* run tests with subsampled data

* use specific model input files instead of directory

* remove unused binning script

* update layer names and handling

* Add script to download scgpt test resources (#750)

* add script to download scgpt test resources

* Update resources_test_scripts/scgpt.sh

Co-authored-by: Dries Schaumont <[email protected]>

* add drive folders containing data and model

* chmod +x

---------

Co-authored-by: Dries Schaumont <[email protected]>

* preproc script

* preproc script

* tokenize and pad script

* tokenize and pad script

* embedding script

* test resources and evaluation script

* cross check gene set

* Fix retag for viash-hub not using correct namespace separator (#745)

* CI - Build: Fix second occurance of namespace separator (#746)

* script to download scgpt test data

* remove test resources script

* pad_tokenize module

* update image

* remove test resources, update inputs

* use pytorch image

* remove integration component

* remove nvidia reqs

* remove load_model option

* adjust preprocessing script

* add scgpt full preproc module

* integration submodule

* integration submodule and add normalize_total flag

* add params

* update scanpy version

* remove branch irrelevant scripts

* update output handling

* update unit tests, add output compression

* update key name input output

* fix test

* update unit tests

* Update CHANGELOG.md

Co-authored-by: Dries Schaumont <[email protected]>

* add pars to logging

---------

Co-authored-by: Dries Schaumont <[email protected]>

* enable gpu device option

* update dsbn

* remove temporary, unused components

* update error messages, remove device param

* remove dropout param

* fix typo

* fix typo

* Generate scgpt cross check genes module (#758)

* base script and config added

* config extended + logger set up + tests in progress

* config working + script improved + tests in progress

* exception handling, extended tests

* extended tests + better logging

* changelog entry added

* test resource path in config fixed

* python test setup added to config

* PR comments fixed

* updated to use subset data

* remove batch id column logic

* update authors

* resources, tests and dependencies fixes

* update key name input output

* update key name input output

* update var gene names

* update config

* compression param added + minor fixes

---------

Co-authored-by: dorien-er <[email protected]>
Co-authored-by: DriesSchaumont <[email protected]>

* undo concat changes

---------

Co-authored-by: jakubmajercik <[email protected]>
Co-authored-by: Dries Schaumont <[email protected]>
Co-authored-by: Vladimir Shitov <[email protected]>
Co-authored-by: Jakub Majercik <[email protected]>
5 participants