2018 HW7
Send me your code and your answers to the questions by midnight on Wednesday, Nov 7th.
Write a Python program that takes a list of accessions (given below) and uses regular expressions to print the ones that meet certain criteria (see the comments in the code below).
UPDATE: A bit more detail on the approach. For each section that requires a regular expression (each question), iterate over the list and use a conditional statement to decide whether each entry matches the regular expression. If it does, print it. If it does not, do not print it. Make sure to use all the code below in your solution, including the print statements, so that the output is clear when I execute your submission.
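The loop-plus-conditional pattern described above can be sketched roughly as follows. This is only an illustration: the `demo_ids` list and the "contains the letter q" criterion are made up for this example and are not one of the graded questions, and `re.search` is just one reasonable choice among the functions in Python's `re` module.

```python
import re

# Hypothetical example data, NOT the homework list
demo_ids = ["abc123", "qrs77", "x9q"]

# Hypothetical criterion: the entry contains the letter q
pattern = re.compile(r"q")

matches = []
print("The demo entries with the letter q are: ")
for entry in demo_ids:
    # search() scans the whole string and returns None when nothing matches
    if pattern.search(entry):
        print(entry)
        matches.append(entry)
```

Note that `re.search` looks anywhere in the string, while `re.match` only tries at the start, so the choice between them matters for some of the questions.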
## This is the list of accessions
accessions = ['zyx385960', 'jnd48659', 'ape309', 'aovr89235', 'jneixp9347', 'jdqpt7839', '383nalkdn', '38374dn']
## print the accessions with the number 5 (1 point)
print("The accessions with the number 5 are: ")
## your code here
## print the accessions that have either a j or a p (1 point)
print("The accessions that have either a j or a p are: ")
## your code here
## print the accessions that have a j followed by an n or a p (1 point)
print("The accessions that have a j followed by an n or a p are: ")
## your code here
## print the accessions that begin with the number 3 (1 point)
print("The accessions that begin with the number 3 are: ")
## your code here
## print the accessions that end with a number (1 point)
print("The accessions that end with a number are: ")
## your code here
## print the accessions that contain three or more digits in a row (1 point)
print("The accessions that contain three or more digits in a row are: ")
## your code here
Download the reads from this dataset: https://www.ncbi.nlm.nih.gov/sra/SRX4883423[accn]
- How many reads are in this file? (1 point)
- Map the reads to the Arabidopsis genome with STAR. Provide the command line you used to do this. (1 point)
- Find the Log.final.out file. What is the percent of reads uniquely mapped? (1 point)
- What is the percent of reads mapped to multiple loci? (1 point)
- Look in the STAR manual on GitHub. STAR can detect chimeric alignments. How do you turn this feature on? (1 point)
- Continue examining the STAR manual. STAR can output alignment coordinates in terms of transcripts instead of in terms of the genome. How do you turn this feature on, and what file will be created to hold those coordinates? (1 point)
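For the read-count and Log.final.out questions, a small Python helper can be handy for double-checking your numbers. The sketch below rests on two assumptions: the FASTQ file is uncompressed with exactly four lines per read, and each statistic in Log.final.out sits on a line of the form `label |<tab>value`. The function names are hypothetical, and this is not the only way to get these numbers.

```python
def count_fastq_reads(path):
    """Count reads, assuming a plain-text FASTQ with four lines per record."""
    with open(path) as handle:
        n_lines = sum(1 for _ in handle)
    return n_lines // 4


def parse_star_log(path):
    """Return a dict of label -> value from a Log.final.out-style file.

    Assumes each statistic is on a line of the form 'label |<tab>value'.
    """
    stats = {}
    with open(path) as handle:
        for line in handle:
            if "|" not in line:
                continue  # skip section headers and blank lines
            label, value = line.split("|", 1)
            stats[label.strip()] = value.strip()
    return stats
```

After running STAR, you could call `parse_star_log` on your Log.final.out and look up a statistic by its label exactly as it appears in the file.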