Mitch Peer Review #4

mrezzoni · 2018-10-27T02:02:16Z

Sup big pimpin,

Nice implementation of call and re. I also plan on using a set to store unique files; I think it is the most efficient data type for the scale of our task. Definitely a good idea to check the legitimacy of the UMI before proceeding with the rest of your code. Can the user just upload a set of UMIs instead of having the program use resources to search for them? I guess the answer to this depends on whether you implement UMI correction.
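A rough sketch of the "user supplies the UMIs" idea, in case it helps. The function names are made up for illustration, and it assumes the UMI is the last colon-separated field of the read name, which may not match your parseSAM output.

```python
def load_known_umis(lines):
    """Build a set of UMIs from an iterable of text lines (one UMI per line)."""
    # Blank lines are skipped; UMIs are upper-cased so lookups are case-insensitive.
    return {line.strip().upper() for line in lines if line.strip()}


def umi_from_qname(qname):
    """Return the UMI, ASSUMING it is the last colon-separated field of QNAME."""
    return qname.rsplit(":", 1)[-1].upper()


def has_known_umi(qname, known_umis):
    """True if the read's UMI is in the user-supplied set (average O(1) lookup)."""
    return umi_from_qname(qname) in known_umis
```

Usage would be along the lines of `known = load_known_umis(open(umi_path))` followed by `has_known_umi(qname, known)` per read, where `umi_path` is whatever file the user passes in. If you add UMI correction later, the same set can serve as the list of correction targets.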

I'm sure you thought of this, but it'd be cool to use your clever duplicate and total read counters to print summary stats, like the raw duplicate count or the percentage of duplicates.
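For the summary output, a minimal sketch might look like the following. The function and argument names are hypothetical; plug in whatever your counters are actually called.

```python
def duplicate_summary(total_reads, duplicate_reads):
    """Format a short report from the two counters (names are placeholders)."""
    # Guard against an empty input file so we never divide by zero.
    pct = 100.0 * duplicate_reads / total_reads if total_reads else 0.0
    return (
        f"Total reads: {total_reads}\n"
        f"Duplicates removed: {duplicate_reads} ({pct:.2f}%)"
    )
```

Calling `print(duplicate_summary(total, dups))` once at the end of the run keeps the report out of the main loop.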

I like your truePOS function, but make sure you account for strandedness (sense vs. antisense). Maybe you can have it check your output from parseSAM for + vs. - and adjust accordingly. A more accurate description of this function would be that it finds the 5' position. How will it account for other CIGAR ops like N, I, D, etc.?
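Here is one way the strand and CIGAR handling could go, as a sketch rather than a drop-in replacement for truePOS. It assumes you have the 1-based SAM POS, the CIGAR string, and the bitwise FLAG (bit 16 set means reverse strand). The name `five_prime_pos` is made up, and hard clips are simply ignored.

```python
import re

# One CIGAR element is a length followed by a single operation letter.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that advance along the reference (I, S, H, P do not).
REF_CONSUMING = set("MDN=X")


def five_prime_pos(pos, cigar, flag):
    """Return the 5' position of a read, adjusted for soft clipping and strand."""
    # Drop hard clips so a soft clip next to them is still seen as the end op.
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar) if op != "H"]
    if not flag & 16:
        # Forward strand: 5' end is the leftmost base, so back off the left clip.
        leading_clip = ops[0][0] if ops and ops[0][1] == "S" else 0
        return pos - leading_clip
    # Reverse strand: 5' end is the rightmost base. Walk the reference span
    # (M, D, N, =, X), then add the right-hand soft clip. Insertions are skipped.
    ref_len = sum(n for n, op in ops if op in REF_CONSUMING)
    trailing_clip = ops[-1][0] if ops and ops[-1][1] == "S" else 0
    return pos + ref_len + trailing_clip - 1
```

The `- 1` makes the reverse-strand result the coordinate of the last base itself; dropping it is fine too, as long as every reverse-strand read is treated the same way when comparing for duplicates.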

Overall, nice start. Please let me know if my comments suck because I'd be glad to elaborate or provide further feedback.

Love,
Mitch
