suboptimal alignment in endsfree mode when match score != 0 #102
Another note, I tried using match score=0 with Not sure if I missed anything. Any help would be appreciated. Thanks!
Hi,
I noticed that the program can return a suboptimal alignment in endsfree mode when the match score != 0.
Given a pattern (sequencing read)
GGGGCGCGTCGGGCTCCGGGTGTGGGGGGGGTGTGGGGGGGGGGGGTGGTGTGTGGGGGTGTGGCTGGTGAATGGGGTGAGGGTGGTGAGTGGAGTGAGGGTGGTGAGTGGGGTGAGGGTGGTGAATGGGGTGAGGGTGGTGAGTGGGGT
and text (repetitive DNA in a tandem repeat region)
CCTGAGGCCCCGGGTGTGGAGCGGAGGTGGACCAGAGGTGGACACAGACCCACGGGCCGCCAAGGCCCACCCAGGATCCCCCGGGGGCCATCCACATCTGGTAAAGCCGAGGTGTGGGCGGACCCCAGGAAGCAGCCCCCACCCCTGCCCCCAGTGGCTCAGGCCTGGGCAGAGAAAACAGGCCCAGCAGGGCGGCAGGGTGGGATCCCCACGATTCACCGAGGATGCGTCTTCCACAGGGAGAGTTTGGGGGAGCTGTGTGTGAAAATGTGAGTAACGTACATAAATCAGTATCACAGGAATCCAGGCGGGCGGAGGATGCATGACTGAACTTGGAGGACGCTCATCAGGGAGGTCAGTGCTCCCCTCCGGGGACAGGATCCTGCCTTCGCCTGGCCTGCGGGACAGGGCTCCCCTTGCCGGCCAGGGGCTACTGGCCACTGATGCTCACTTTGGGCTTCCGCCCCCCAGGGGAAGGGGTGCTGAGAGCCCCGTGTCCGGAGGGCTGGTGAGTGGGGCTGAGGCTGGTGGAGTGGGGGTGAGGCTGGTGAATGGGGTGAGGGTGGTGAGTGGAGTGAGGGTGGTGAGTGGGGGTGAGGGTGGTGAGTGGAGTGAGGGTGGTGAGTGGGGGTGAGGGTGGTGAGTGGGGGTGAGGGTGGTGAGTGGAGTGAGGGTGGTGTGGGTGAGTGGTGAGTGGAGTGAGGGTGGTGAGTGGGGGTGAGGGTGATGAGTGGGGTGAGGGTGGTGAGTGGAGTGAGGGTGGTGAGTGGGGGTGAGGGTGGTGAGTGGGGTGAGGGTGGTGAATGGGGTGAGGCTGGTGAATGGGGTGAGGGTGATGAGTGGGGTGAGGGTGGTGAGTGGGGTGAGGCTGGTGAGTGGGGGTGAGGGTGATGAGTGGGGTGAGGGTGGTGAGTGGGGTGAGGCTGGTGAGTGGGGGTGAGGGTGGTGAATGGAGTGAGGGTGGTGAGTGGGGGTGAGGGTGGTGAGTGGGGGTGAGGCTGGTGAGTGGGGGTGAGGGTGGTGAGTGGGGGTGAGGGTGATGAGTGGGGTGAGGGTGGTGAGTGGAGTGAGGGTGGTGAATGGGGTGAGGGTGGTGAGTGGGGGTGAGGGTGGTGAGTGGGGTGAGGGTGGTGAATGGGGTGAGGGTGATGAGTGGAGTGAGGGTGATGAGTGGGGTGAGGGTGGTGAGTGGAGTGAGGGTGGTGAGTGGGGGTGAGGGTGGTGAGTGGGGGTGAGGGTGGTGAGTGGGGGTGAGGGTGGTGAGTGGGGGTGAGGGTGGTGAGTGGGGGTGAGGGTGGTGAGTGGGGGTGAGGGTGGTGAGTGGGGGTGAGGGTGATGAGTGGGGTGAGGGTGGTGAGTGGAGTGAGGGTGGTGAATGGGGTGAGGGTGGTGAGTGGGGGTGAGGGTGGTGAGTGGGGTGAGGGTGGTGAATGGGGTGAGGCTGGTGAATGGGGTGAGGGTGATGAGTGGGGTGAGGGTGGTGAGTGGGGTGAGGCTGGTGAGTGGGGGTGAGGGTGATGAGTGGGGTGAGGGTGGTGAGTGGGGTGAGGCTGGTGAGTGGGGGTGAGGGTGGTGAGTGGAGTGAGGGTGGTGAGTGGGGGTGAGGCTGGTGAATGGGGTGAGGGTGGTGAGTGGGGTGAGGCTGGTGAGTGGGGGTGAGGCTGGTGAATGGGGTGAGGGTGGTGAGTGGGGTGAGGGTGGTGAGTGGGGGTGAGGGTGATGAGTGGGGTGAGGGTGGTGAGTGGGGTGAGGCTGGTGAGTGGGGGTGAGGGTGGTGAGTGGAGTGAGGGTGGTGAGTGGGGGTGAGGGTGGTGAGTGGGGGTGAGGCTGGTGAGTGGGGTGAGGGTAGTGGGTGGGGCTGAGGTTATTCCAGCCTCGGGCACTGGATCTTCTCGGGGTGGGGGGGTTTGTGAGCGCTGACCCCCTGGGCTGTCTCCACCTTGTCCTGGGGCTGGGTCCCCGGACGACGCGGCCACAGCTCCTGGGAGAGTGGCCAGCCC
TCGGACAGCTGTGAGCCCCCACGGGGGTGTCTGGGTTCGAGGCCACGTTGCAGACCCGCTGGCTGCTGGGGCTCAGGGAGGAAATGACCTGGCCTCCTGGAGCTTCAGATTCCTCATCTGTGTGCTGAGGGAAGGGGCACATCTCGGAGCCTGGGGACTCCCGGCGTGTGGGCTGCTTGCCTGGCACCCGCTCACCCAGGAGTTGTCCTTGCTGTGGGCTCTGAGCCTCCGGGATGGAGTGGGGCTGAGAGCGTGTCCACCACCTCCACCACATCAGCCTGTCCCTGGTCCTGCTCCGCCAGATGACAAATCTCTGGGAAATCTTCTTTAATTTTGTTCTCTGGGAAGTGGTAGGTTTTGGAGA
the output has an alignment score of 136, with the CIGAR string being

The optimal alignment should have a score of 160 against this substring of the text
GGGGTGAGGCTGGTGAATGGGGTGAGGGTGGTGAGTGGGGTGAGGCTGGTGAGTGGGGGTGAGGCTGGTGAATGGGGTGAGGGTGGTGAGTGGGGTGAGGGTGGTGAGTGGGGGTGAGGGTGATGAGTGGGGTGAGGGTGGTGAGTGGGGT
with the following CIGAR string
MMMMXMXMXMXMMXXXXXMMXMMMXMMMXMMMMXMXMMMMXMXMMXMMMMMXMMMMMMMMMXMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMXMMMMMMMMMMMMMMMDMMMMMMMMMMMMXMMMXMMMMMMMMMMMMMMMMMMMMMMMM
Upon inspection, the suboptimal alignment has a better prefix than the optimal alignment, as shown below (M replaced with = for visualization):
====X=X=X=X==XXXXX==X===X===X====X=X====X=X==X=====X=========X===============================X===============D============X===X======================== optimal cigar
====X=X=X=X==XXXXX==X===X===X====X=X====X=X========X=========X==X==X===X=====================X======X============D========M===X===X===================X suboptimal cigar
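To compare two expanded CIGAR strings like the ones above under a given scoring scheme, a small helper along these lines can recompute their scores. This is only a sketch: the `Scoring` struct, the `score_cigar` function, and the higher-is-better gap-affine convention are assumptions made for illustration, not part of the library, and the actual scoring values used for this report are not stated here.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical scoring scheme (higher score is better). The values used for
// the reported alignments are not given, so every field is an assumption.
struct Scoring {
    int match;       // reward added per matching base
    int mismatch;    // penalty subtracted per mismatching base
    int gap_open;    // penalty subtracted once at the start of each gap
    int gap_extend;  // penalty subtracted for every base inside a gap
};

// Score an expanded CIGAR string (one character per operation).
// Both 'M' and '=' are treated as matches, 'X' as a mismatch,
// and 'I' / 'D' as gap bases.
int score_cigar(const std::string& cigar, const Scoring& s) {
    int score = 0;
    char prev = '\0';
    for (char op : cigar) {
        switch (op) {
            case 'M':
            case '=':
                score += s.match;
                break;
            case 'X':
                score -= s.mismatch;
                break;
            case 'I':
            case 'D':
                // A gap is opened whenever the operation type changes.
                if (op != prev) score -= s.gap_open;
                score -= s.gap_extend;
                break;
            default:
                throw std::invalid_argument("unexpected CIGAR operation");
        }
        prev = op;
    }
    return score;
}
```

Calling `score_cigar` on both CIGAR strings with the same `Scoring` values makes it easy to check which alignment really scores higher, independently of the aligner.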
It seems that a nonzero match score, combined with the better prefix of the suboptimal substring, causes the program to keep extending the suboptimal wavefront as long as its alignment score doesn't drop below that of the second-best wavefront. I'm not sure how easily this can be fixed. My initial thought is that, under a scoring scheme with a nonzero match score, we may need to consider the potential of each wavefront to surpass the best wavefront.
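The "potential" idea can be sketched as an optimistic upper bound: a partial alignment can gain at most one match reward per base that can still be paired, so it can only be discarded safely once that bound falls below the best finished score. The code below is a minimal illustration of that reasoning, not the library's implementation; `Candidate`, `potential_score`, and `can_still_win` are hypothetical names, and it assumes a higher-is-better score with non-negative mismatch and gap penalties.

```cpp
#include <algorithm>

// Hypothetical description of a partial alignment (not a library type).
struct Candidate {
    int score;        // score accumulated so far (higher is better)
    int pattern_pos;  // number of pattern bases already aligned
    int text_pos;     // number of text bases already consumed
};

// Optimistic upper bound on the final score: assume every base that can
// still be paired is a match. Mismatches and gaps can only lower the score
// (assuming non-negative penalties), so the true final score cannot exceed it.
int potential_score(const Candidate& c, int pattern_len, int text_len,
                    int match) {
    const int remaining =
        std::min(pattern_len - c.pattern_pos, text_len - c.text_pos);
    return c.score + match * std::max(remaining, 0);
}

// A candidate may only be dropped once even its optimistic bound
// cannot beat the best completed alignment found so far.
bool can_still_win(const Candidate& c, int best_final_score, int pattern_len,
                   int text_len, int match) {
    return potential_score(c, pattern_len, text_len, match) > best_final_score;
}
```

With a match score of 0 the bound collapses to the current score, which would be consistent with the problem only showing up when the match score is nonzero.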
This test was performed with the following C++ template
Thanks,
Tony