HW5
- You have two files named `MyAwesomeSamples_1.fastq` and `MyAwesomeSamples_2.fastq` in your current working directory. You have run `module load fastqc` on newton and are ready to check the quality. What two command lines would you use to perform the fastqc analysis? You may assume you have 2 cores available for use. (1 point)
- Next, you load the trimmomatic software by typing `module load trimmomatic`. You have decided that you need adapter clipping with the TruSeq3-PE adapters (the same ones we used in the class example), a sliding-window quality trim with an average phred value of 10 over a 4-base window, and a minimum length of 30. The beginning and end look OK, so you've decided not to trim those explicitly. Please write the trimmomatic command that will do the 3 trimming tasks listed above and produce reasonably named output files. (2 points)
- Find the bwa manual online. You have a command from a collaborator:

      bwa mem \
        -t 10 \
        -k 12 \
        -T 45 \
        reference.fna \
        somereads_1.trimmed.paired.fastq \
        somereads_2.trimmed.paired.fastq \
        > mappedreads.sam

  What does each of the 3 flags mean? (3 points)
- In class we used the following command to identify variants from the filtered vcf file:

      samtools mpileup \
        -uf GCF_000005845.2_ASM584v2_genomic.fna DRR021342.bam \
        | bcftools call -mv > DRR021342.raw.vcf

Notes:
- samtools and bcftools have undergone many changes as new versions have been released. Please use the most up-to-date manuals to answer the questions.
- samtools and bcftools have a particular syntax where flags can be combined. For example, in the samtools mpileup command, `-uf GCF_000005845.2_ASM584v2_genomic.fna` is the same as `-u -f GCF_000005845.2_ASM584v2_genomic.fna`.
- The flags for the subcommands may be different. For example, for `samtools view` the `-b` flag refers to an output format, but in `samtools mpileup` it refers to input files.

Based on the information above, what is the meaning of each samtools and bcftools flag in the command above? There are 4 flags in total to explain. (4 points)
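
To illustrate the note about combined flags, here is a toy shell sketch. It is not samtools: `parse_flags` is a hypothetical function built on the shell's `getopts`, with a switch `-u` and a value-taking `-f` chosen only to mirror the `-uf` example. It says nothing about what those flags mean in samtools.

```shell
# Hypothetical toy parser (NOT samtools): -u is a switch, -f takes a value.
parse_flags() {
  OPTIND=1          # reset getopts state so the function can be called repeatedly
  u_set=no
  f_value=""
  while getopts "uf:" opt; do
    case "$opt" in
      u) u_set=yes ;;
      f) f_value="$OPTARG" ;;   # the value is the argument following the flag group
      *) return 1 ;;
    esac
  done
  echo "u=$u_set f=$f_value"
}

# Combined and separate spellings are parsed identically.
parse_flags -uf ref.fna
parse_flags -u -f ref.fna
```

Both calls print the same line, which is the sense in which the combined and separate spellings are equivalent. Note that the value-taking flag has to come last in a combined group so that the next argument is read as its value.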
- Provide the command line to convert the filtered vcf file from class to a bcf file. To make sure we are all starting from the exact same file, the command `md5sum DRR021342.flt.vcf` should yield `c8ef3d0fc0ea342561de58cb8f7eefa1 DRR021342.flt.vcf`. Provide the file size for the original vcf and for the new bcf file.
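
If you would rather not compare the checksum by eye, a minimal sketch of an automated check follows. It assumes GNU `md5sum` and `awk` are available (as on most Linux clusters). `verify_md5` is a hypothetical helper name, not part of any tool used in class.

```shell
# verify_md5 EXPECTED FILE: succeed only if FILE's MD5 digest equals EXPECTED.
# Hypothetical helper for illustration; assumes GNU md5sum is on the PATH.
verify_md5() {
  expected="$1"
  file="$2"
  # md5sum prints the digest followed by the file name; keep only the digest.
  actual="$(md5sum "$file" | awk '{print $1}')"
  if [ "$actual" = "$expected" ]; then
    echo "OK: $file"
  else
    echo "MISMATCH: $file (got $actual)" >&2
    return 1
  fi
}

# Example call, guarded so the sketch also runs where the class file is absent.
if [ -f DRR021342.flt.vcf ]; then
  verify_md5 c8ef3d0fc0ea342561de58cb8f7eefa1 DRR021342.flt.vcf
fi
```

A non-zero exit status from `verify_md5` means your file differs from the one used in class, so the file sizes you report may not match either.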