SILO input transformer #644

Open
13 tasks
Tracked by #518
fengelniederhammer opened this issue Nov 13, 2024 · 0 comments
Labels
epic Collection of multiple issues for a larger feature

fengelniederhammer commented Nov 13, 2024

Integrate and finalize the SILO input transformer.

This is a follow-up to #562. This issue removes the TSV/FASTA input format from the SILO code, leaving SILO with only the NDJSON input. The SILO input transformer is a separate tool that translates the TSV/FASTA format into the NDJSON input, so that SILO still "supports" both formats (possibly by running the input transformer first).
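To illustrate the idea, here is a minimal Python sketch of such a translation: it joins a TSV metadata file with a FASTA file and writes one JSON object per line. It is not the actual transformer code. The field names `metadata` and `sequence`, the `accession` key column, and the function names are assumptions made for this example and do not reproduce SILO's real NDJSON schema.

```python
import csv
import io
import json


def read_fasta(handle):
    """Yield (identifier, sequence) pairs from a FASTA text stream.

    The whole header line after '>' is used as the identifier.
    An empty stream simply yields nothing instead of raising an error.
    """
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].strip(), []
        elif name is not None:
            # Sequence lines that appear before the first header are ignored.
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def to_ndjson(tsv_handle, fasta_handle, out_handle, key_column="accession"):
    """Write one JSON object per TSV row, joined with its sequence.

    The output layout ("metadata" / "sequence") is hypothetical.
    Rows without a matching FASTA entry get a null sequence.
    """
    # Holding all sequences in memory keeps the sketch short; a real tool
    # would more likely sort and merge via intermediate files.
    sequences = dict(read_fasta(fasta_handle))
    for row in csv.DictReader(tsv_handle, delimiter="\t"):
        record = {"metadata": row, "sequence": sequences.get(row[key_column])}
        out_handle.write(json.dumps(record) + "\n")
```

Usage would look like `to_ndjson(open("metadata.tsv"), open("sequences.fasta"), open("out.ndjson", "w"))`, with file names chosen freely.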

This is a list of tasks that need to be done (not necessarily in a good order for implementation):

  • SILO input transformer: Integrate the code into the SILO repository.
  • SILO input transformer: add logging
  • SILO input transformer: Allow compressed input files
  • SILO input transformer: compress sequences in intermediate steps
    • to reduce the amount of data that needs to be transferred between the program and the disk.
  • SILO input transformer: allow command line arguments
    • especially specifying where the config is located and whether to keep intermediate files (and maybe the log level)
  • SILO input transformer: error messages
    • We already have error handling, but the errors might be hard to understand for users.
  • SILO input transformer: add more unit tests
  • SILO input transformer: add some e2e tests
  • SILO input transformer: parallelize where possible
  • SILO input transformer: support SAM files
  • SILO input transformer: include insertions
    • to do: think about whether we want to keep the old input format
  • SILO input transformer: clean up temp folders
  • SILO input transformer: accept empty FASTA files
    • currently the FASTA parser complains when a FASTA file is empty.
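
As a rough illustration of two of the tasks above (command line arguments and compressed input files), here is a small Python sketch. Everything named in it is hypothetical: the program name, the flags `--config`, `--keep-intermediate-files` and `--log-level`, the default config path, and the choice to detect compression by file extension are assumptions for this example, not decisions made in this issue.

```python
import argparse
import gzip
import lzma
from pathlib import Path


def open_text(path):
    """Open a possibly compressed file as text, chosen by file extension.

    Only gzip and xz are handled here because both are in the Python
    standard library; other formats would need additional dependencies.
    """
    suffix = Path(path).suffix
    if suffix == ".gz":
        return gzip.open(path, "rt")
    if suffix == ".xz":
        return lzma.open(path, "rt")
    return open(path, "rt")


def build_parser():
    """Build a command line parser with hypothetical flag names."""
    parser = argparse.ArgumentParser(prog="silo-input-transformer")
    parser.add_argument(
        "--config",
        type=Path,
        default=Path("config.yaml"),
        help="location of the config file",
    )
    parser.add_argument(
        "--keep-intermediate-files",
        action="store_true",
        help="do not delete intermediate files after the run",
    )
    parser.add_argument(
        "--log-level",
        choices=["error", "warn", "info", "debug"],
        default="info",
        help="verbosity of the log output",
    )
    return parser
```

A call such as `build_parser().parse_args(["--config", "my_config.yaml", "--log-level", "debug"])` would then yield a namespace with `config`, `keep_intermediate_files` and `log_level` attributes.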
@fengelniederhammer fengelniederhammer added the epic Collection of multiple issues for a larger feature label Nov 13, 2024
@fengelniederhammer fengelniederhammer added this to the SILO 1.0 milestone Nov 13, 2024