Allow reference sequence as input to fill in uncalled variants for a fasta alignment #3

Open
kche309 opened this issue Aug 13, 2024 · 0 comments
Labels
enhancement New feature or request

kche309 commented Aug 13, 2024

Fill in sites that are uncalled in the VCF with the REF allele taken from the reference FASTA.
The reference FASTA file is assumed to be sorted, with 1-based positions 1, ..., N.

Usage:

cd vcf2fasta
python src/vcf2fasta.py <vcf file> <output fasta> --ref <ref fasta>

Example:

python src/vcf2fasta.py data/variants.vcf data/test.fasta --ref data/filtered_sequence.fna
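
For discussion, the fill-in step described above could look roughly like the sketch below. This is not the code in src/vcf2fasta.py; the function name `fill_uncalled` and the `called` mapping (1-based position to allele, for one sample) are hypothetical, and it assumes single-base alleles only, with no indels.

```python
def fill_uncalled(ref_seq, called):
    """Return a sequence as long as ref_seq, using called alleles where present.

    ref_seq: reference bases; index 0 holds 1-based position 1 (assumption).
    called:  hypothetical dict mapping 1-based position -> single-base allele.
    """
    out = list(ref_seq)  # start from the REF allele at every site
    for pos, allele in called.items():
        if not 1 <= pos <= len(ref_seq):
            raise ValueError(
                f"position {pos} is outside the reference (1..{len(ref_seq)})"
            )
        out[pos - 1] = allele  # convert 1-based position to 0-based index
    return "".join(out)


reference = "ACGTACGT"
sample_calls = {2: "T", 7: "A"}  # sites called in the VCF for this sample
print(fill_uncalled(reference, sample_calls))  # ATGTACAT
```

Positions that fall outside 1, ..., N raise an error here rather than being silently dropped, which would surface a mismatch between the VCF and the reference early.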

Future TODO:
A more robust input format could:

  • use a BED style file with chrom, pos, REF as the header, or
  • use a Reference VCF with chrom, pos, REF as the fields
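
As a starting point for the first option, a reader for a tab-separated table with a chrom, pos, REF header might look like the sketch below. The file layout, the function name `read_ref_table`, and the choice of 1-based `pos` values are all assumptions made for illustration. Note that standard BED files have no header row and use 0-based start coordinates, so a "BED style" file would need its coordinate convention stated explicitly.

```python
import csv
import io


def read_ref_table(handle):
    """Read a tab-separated table with a chrom/pos/REF header (hypothetical format).

    Returns a dict mapping (chrom, 1-based pos) -> REF allele.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    ref = {}
    for row in reader:
        key = (row["chrom"], int(row["pos"]))
        if key in ref:
            raise ValueError(f"duplicate site {key}")
        ref[key] = row["REF"]
    return ref


# Small in-memory example standing in for a real file.
example = "chrom\tpos\tREF\nchr1\t1\tA\nchr1\t2\tC\n"
table = read_ref_table(io.StringIO(example))
print(table[("chr1", 2)])  # C
```

Keying on (chrom, pos) would remove the need for the reference to be sorted or to cover positions 1, ..., N contiguously.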
@kche309 kche309 added the enhancement New feature or request label Aug 13, 2024