Meg Staton edited this page Sep 21, 2016 · 6 revisions
  1. You have two files named MyAwesomeSamples_1.fastq and MyAwesomeSamples_2.fastq in your current working directory. You have run module load fastqc on newton and are ready to check the quality. What two command lines would you use to run the fastqc analysis on these files? You may assume you have 2 cores available for use. (1 point)

  2. Next you load the trimmomatic software by typing module load trimmomatic. You have decided you need to clip adapters using the TruSeq3-PE adapters (the same ones we used in the class example), apply a sliding-window quality trim requiring an average phred score of 10 over a 4-base window, and enforce a minimum read length of 30. The beginning and end of the reads look ok, so you've decided not to trim those explicitly. Please write the trimmomatic command that will do the 3 trimming tasks listed above and produce reasonably named output files. (2 points)

  3. Find the bwa manual online. You have a command from a collaborator:

    bwa mem \
    -t 10 \
    -k 12 \
    -T 45 \
    reference.fna \
    somereads_1.trimmed.paired.fastq \
    somereads_2.trimmed.paired.fastq \
    > mappedreads.sam
    

    What does each of the 3 flags mean? (3 points)

  4. In class we used the following command to call variants from the bam file and write them to a raw vcf file:

    samtools mpileup \
    -uf GCF_000005845.2_ASM584v2_genomic.fna DRR021342.bam \
    | \
    bcftools call -mv > DRR021342.raw.vcf
    

    Notes:

    • samtools and bcftools have undergone many changes as new versions have been released; please use the most up-to-date manuals to answer the questions
    • samtools and bcftools have a particular syntax where flags can be combined. For example, in the samtools mpileup command, the -uf GCF_000005845.2_ASM584v2_genomic.fna is the same as -u -f GCF_000005845.2_ASM584v2_genomic.fna.
    • the flags for the subcommands may be different. For example, for samtools view the -b flag refers to an output format, but in samtools mpileup, it refers to input files.
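
The flag-combining convention in the second note can be illustrated with a small shell sketch. This is not how samtools itself parses its arguments; it only assumes the common getopts-style convention, and parse_flags and reference.fna are hypothetical names used for illustration.

```shell
# Hypothetical helper (not part of samtools): parses -u (no argument)
# and -f <file> (takes an argument) using the shell's getopts builtin.
parse_flags() {
  OPTIND=1        # reset so the function can be called more than once
  u=0
  f=""
  while getopts "uf:" opt "$@"; do
    case "$opt" in
      u) u=1 ;;
      f) f="$OPTARG" ;;
    esac
  done
  echo "u=$u f=$f"
}

# Combined and separate forms are parsed identically.
parse_flags -uf reference.fna     # prints: u=1 f=reference.fna
parse_flags -u -f reference.fna   # prints: u=1 f=reference.fna
```

The flag that takes an argument has to come last in a combined group, because whatever follows it is read as its value.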

    Based on the information above, what is the meaning of the samtools and bcftools flags in the above command? There are 4 total flags to explain. (4 points)

EC (3 points)

Provide the command line to convert the filtered vcf file from class to a bcf file. To make sure we are all starting from the exact same file, the command

md5sum DRR021342.flt.vcf

should yield

c8ef3d0fc0ea342561de58cb8f7eefa1  DRR021342.flt.vcf
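
If you would rather not compare the two hashes by eye, a sketch along these lines lets md5sum do the comparison. It assumes GNU coreutils md5sum; check_md5 is a hypothetical helper name, not something provided in class.

```shell
# Hypothetical helper: compare a file's MD5 sum against an expected value.
# Assumes GNU coreutils md5sum is available.
check_md5() {
  expected="$1"
  file="$2"
  # md5sum -c reads lines of the form "<hash>  <filename>" (two spaces);
  # the trailing "-" tells it to read that line from standard input.
  printf '%s  %s\n' "$expected" "$file" | md5sum -c -
}

# Usage, assuming the filtered vcf from class is in the current directory:
# check_md5 c8ef3d0fc0ea342561de58cb8f7eefa1 DRR021342.flt.vcf
```

On a match, md5sum reports the file name followed by OK and exits with status 0; on a mismatch it exits with a non-zero status.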

EC (1 point)

Provide the file size for the original vcf and for the new bcf file.
