Creating a protein sequence FASTA file
Pablo Cingolani edited this page Dec 6, 2017 · 1 revision
The SnpEff ann command has a command-line option, -fastaProt, that tells SnpEff to write the "original" and "resulting" protein sequences for each variant to a FASTA file.
That is, for each variant, the output FASTA file contains an entry with the protein sequence that results from applying the variant to the reference sequence, alongside an entry for the unmodified reference protein.
Here is an example:
$ cat z.vcf
1 889455 . G A . . .
$ java -Xmx6g -jar snpEff.jar ann -fastaProt z.prot.fa hg19 z.vcf > z.ann.vcf
The resulting FASTA file, z.prot.fa, looks like this (lines edited for readability):
>NM_015658.3 Ref
MAAAGSR...LLFGKVAKDSSRMLQPSSSPLWGKLRVDIKAYLGS...
>NM_015658.3 Variant 1:889455-889455 Ref:G Alt:A HGVS.p:p.Gln236*
MAAAGSR...LLFGKVAKDSSRML*PSSSPLWGKLRVDIKAYLGS...
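
To check such a file programmatically, one option is to pair each variant entry with the reference entry of the same transcript and report where the two sequences first differ. The following is a minimal Python sketch, not part of SnpEff. It assumes the header layout shown in the example above (transcript ID first, then either "Ref" or "Variant ...") and that each "Ref" entry appears before its variants. The function names read_fasta, first_difference and compare_to_reference are hypothetical helpers introduced here for illustration.

```python
from typing import Dict, Iterable, Iterator, List, Optional, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs; wrapped sequence lines are joined."""
    header: Optional[str] = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def first_difference(ref: str, alt: str) -> Optional[int]:
    """Return the 1-based position of the first differing residue, or None."""
    for pos, (a, b) in enumerate(zip(ref, alt), start=1):
        if a != b:
            return pos
    if len(ref) != len(alt):
        # One sequence is a prefix of the other; they diverge right after it.
        return min(len(ref), len(alt)) + 1
    return None


def compare_to_reference(lines: Iterable[str]) -> List[Tuple[str, Optional[int]]]:
    """Pair each 'Variant' entry with the 'Ref' entry of the same transcript.

    Assumption: headers look like '<transcript> Ref' or
    '<transcript> Variant ...', as in the example output above, and the
    'Ref' entry for a transcript precedes its variants.
    """
    refs: Dict[str, str] = {}
    results: List[Tuple[str, Optional[int]]] = []
    for header, seq in read_fasta(lines):
        fields = header.split()
        if len(fields) < 2:
            continue  # header does not match the assumed layout
        transcript, kind = fields[0], fields[1]
        if kind == "Ref":
            refs[transcript] = seq
        elif kind == "Variant" and transcript in refs:
            results.append((header, first_difference(refs[transcript], seq)))
    return results
```

For the file above, this could be called as compare_to_reference(open("z.prot.fa")). For a stop-gained variant such as p.Gln236*, the reported position would be expected to match the residue number in the HGVS.p annotation.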