Use Snakemake to build transcripts #70
Comments
Would be good to automate uploading releases as this is pretty tedious, could do:
Made a script "generate_transcript_data/github_release_upload.sh", which makes a release easier.
Looking at the bash scripts, a lot of the complexity is due to looping over URLs and dealing with RefSeq URLs having identical file names, eg:
So it's not so easy to just download it and carry on. I think with Snakemake we should just explicitly list everything out in YAML files, and use that config to run a pipeline common between everything. We could make urls a dictionary, with the "nice name" for each URL as its key. That would allow us to move code into config, which would be a lot nicer.
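A minimal sketch of the "urls as a dictionary" idea, assuming the key is used to build a unique local filename so that identically named remote files no longer collide. The keys, URLs, and the `local_filename` helper are hypothetical placeholders, not the project's actual config or code:

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical config: "nice name" -> URL. In the pipeline this would come from
# a YAML file; the URLs are placeholders that share the same remote filename.
ANNOTATION_URLS = {
    "refseq_release_a": "https://example.org/annotation/release_a/genomic.gff.gz",
    "refseq_release_b": "https://example.org/annotation/release_b/genomic.gff.gz",
}


def local_filename(nice_name, url):
    """Name the local file after the config key, keeping the remote extensions."""
    basename = posixpath.basename(urlparse(url).path)
    # Split at the first dot: "genomic.gff.gz" -> ("genomic", ".", "gff.gz")
    _, dot, extensions = basename.partition(".")
    return f"{nice_name}{dot}{extensions}"


# Map each unique local filename back to the URL it should be downloaded from.
download_targets = {
    local_filename(name, url): url for name, url in ANNOTATION_URLS.items()
}
```

Because dictionary keys are unique, the local filenames are unique too, even when every remote file has the same name.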
…names (handle RefSeq's duplicated filenames)
OK, I have started on this (in generate_transcript_data). I wanted to run the code with different config files, but couldn't work out a way to do it; Snakemake seems to only want 1 config file. I thus combined everything in "config/*.yaml" into "cdot_transcripts.yaml". I'm having an issue at the moment with ambiguous rules for downloading files.
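For illustration, combining several per-source configs into one could look roughly like the recursive merge below. This is only a sketch under assumptions: the configs are shown as already-loaded dictionaries, and the `merge_configs` helper and the example keys are hypothetical rather than taken from the repository:

```python
def merge_configs(*configs):
    """Recursively merge dictionaries; later configs win on conflicting keys."""
    merged = {}
    for config in configs:
        for key, value in config.items():
            if isinstance(value, dict) and isinstance(merged.get(key), dict):
                # Both sides are mappings: merge them instead of overwriting.
                merged[key] = merge_configs(merged[key], value)
            else:
                merged[key] = value
    return merged


# Hypothetical per-source configs, as they might look after loading the YAML files.
refseq_config = {"urls": {"refseq_release_a": "https://example.org/a/genomic.gff.gz"}}
ensembl_config = {"urls": {"ensembl_release_x": "https://example.org/x/annotation.gtf.gz"}}

combined = merge_configs(refseq_config, ensembl_config)
```

Here `combined["urls"]` holds the entries from both sources, and neither input dictionary is modified.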
@tedil @holtgrewe - I've finished v1 of the Snakemake pipeline. Could you check it out, as it's the first one I ever wrote: https://github.com/SACGF/cdot/blob/main/generate_transcript_data/Snakefile Happy to hear feedback, e.g. if I should have structured it a different way.
Great, thank you! I will have a look when I am back from vacation.
Sure, no hurry, enjoy your time off.
At the moment we have file existence tests instead of proper dependency management
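To illustrate the difference, here is a rough sketch of a make-style timestamp check, which is the general idea behind dependency management in workflow tools. It is not how the existing bash scripts or Snakemake are implemented, and the `needs_rebuild` helper is hypothetical:

```python
import os


def needs_rebuild(target, dependencies):
    """Return True if target is missing or older than any of its dependencies."""
    # A plain existence test stops here: once the file exists, it is never redone.
    if not os.path.exists(target):
        return True
    # Dependency-aware check: also rebuild when any input is newer than the output.
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(dep) > target_mtime for dep in dependencies)
```

With only an existence test, a stale output built from an outdated download would be silently reused; comparing against the inputs catches that case.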