Helena "Don't Google Me" Klein,
I like how you check the legitimacy of the UMI, but think about moving this check earlier in the process to conserve resources. Consider using samtools to sort by chromosome rather than by position, as there are far fewer chromosomes than positions. Maybe use a set instead of NumPy arrays, because checking whether something is already in a set takes roughly constant time, while checking an array means scanning it.
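A minimal sketch of the early, set-based UMI check, in case it helps. It assumes the UMI is the last colon-separated field of the QNAME; `KNOWN_UMIS`, `umi_from_qname`, and `has_known_umi` are hypothetical names, and the three UMIs are placeholders for whatever list your script loads.

```python
# Hypothetical whitelist; in practice this would be loaded from your UMI file.
KNOWN_UMIS = {"AACGCCAT", "AAGGTACG", "AATTCCGG"}


def umi_from_qname(qname: str) -> str:
    """Return the UMI, assumed to be the last colon-separated field of QNAME."""
    return qname.rsplit(":", 1)[-1]


def has_known_umi(sam_line: str, known_umis: set) -> bool:
    """Check the UMI using only the first SAM column, before any other parsing."""
    qname = sam_line.split("\t", 1)[0]
    return umi_from_qname(qname) in known_umis
```

Calling `has_known_umi` first means the rest of the line (FLAG, POS, CIGAR) is only parsed for reads that can actually be kept.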
If you have moved past all the unique alignments on a particular chromosome and have progressed to the next chromosome, consider re-initializing whatever data structure you were using to store your unique alignments. This will help save memory, assuming the unique alignments have already been written out to a file.
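The reset could look something like the sketch below. It assumes the input is already grouped by chromosome (for example by samtools sort) and that each record has been reduced to a `(chrom, umi, strand, start)` tuple; `dedupe_sorted` and that tuple layout are made up for illustration.

```python
def dedupe_sorted(records):
    """Yield the first record seen for each (umi, strand, start) key.

    `records` is an iterable of (chrom, umi, strand, start) tuples that is
    assumed to be grouped by chromosome already.
    """
    seen = set()
    current_chrom = None
    for chrom, umi, strand, start in records:
        if chrom != current_chrom:
            # New chromosome: earlier keys can never match again, so drop them.
            seen = set()
            current_chrom = chrom
        key = (umi, strand, start)
        if key not in seen:
            seen.add(key)
            yield (chrom, umi, strand, start)
```

Because the function yields each unique record as soon as it is found, the caller can write it out immediately, and the set never holds more than one chromosome's worth of keys.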
Good job taking the size of your sliding window into account. Make sure it is capable of handling chromosomes for species from exoplanets.
Nice high-level functions. Based on your use of the term "start position" rather than "left-most position", it seems like you have considered the implications of strandedness. Will soft_clip be capable of handling other CIGAR ops such as I or D? Maybe create a function that searches for Ns only after an alignment has passed the other tests, such as UMI legitimacy, so that you avoid doing that work for alignments that might not end up being used.
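To make the CIGAR question concrete, here is one possible strand-aware position function. It is a sketch, not your soft_clip: `five_prime_position` is a hypothetical name, hard clips (H) and padding (P) are ignored, and the reverse-strand coordinate uses an inclusive end (POS + reference length - 1), which is a convention chosen here for illustration.

```python
import re

# One CIGAR element: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def five_prime_position(pos: int, cigar: str, reverse: bool) -> int:
    """Return the soft-clip-corrected 5' position of an alignment.

    `pos` is the 1-based left-most mapped position from the SAM POS column.
    Hard clips and padding are dropped (an assumption of this sketch).
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar) if op not in "HP"]
    if not reverse:
        # Forward strand: a leading soft clip pushes the true start to the left.
        if ops and ops[0][1] == "S":
            return pos - ops[0][0]
        return pos
    # Reverse strand: the 5' end is the right-most reference position.
    # M, D, N, = and X consume reference; I and S do not.
    ref_len = sum(n for n, op in ops if op in "MDN=X")
    end = pos + ref_len - 1
    # A trailing soft clip pushes the true 5' end further to the right.
    if ops and ops[-1][1] == "S":
        end += ops[-1][0]
    return end
```

For example, `five_prime_position(100, "5M2I5M", True)` skips the insertion, while `"5M2D5M"` and `"5M100N5M"` both extend the reference span, so D and N matter only on the reverse strand.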
Please let me know if you'd like any elaboration on my feedback. Good luck!
Mitch