Enable prefix-grouping for one-to-one filtering #66
Previously, the `-f one-to-one` filter was applied to all mappings at once. When users mapped multiple query genomes to one or more target sequences with the `--skipPrefix #` flag, the one-to-one filter treated all query sequences as part of the same genome, even if they had unique prefixes.

This patch applies the one-to-one plane-sweep filter to each pair of query and reference groups independently, ensuring that `-n` mappings are retained for each pair. A "group" is the set of sequences that share the same prefix up to the last occurrence of the character `c`, where `--skipPrefix c` is specified.
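
The grouping rule and the per-pair filtering described above can be sketched as follows. This is a minimal Python illustration, not the code in this patch: `group_key`, `filter_per_group_pair`, `keep_top_n`, and the dictionary-based mapping records are all hypothetical names chosen for the example. The simple top-`n` filter only stands in for the real plane-sweep filter, and the sketch assumes a name without the delimiter forms its own group.

```python
from collections import defaultdict


def group_key(name, delim):
    """Return the prefix of `name` up to the last occurrence of `delim`.

    Assumption: a name that does not contain `delim` is its own group.
    """
    head, sep, _ = name.rpartition(delim)
    return head if sep else name


def filter_per_group_pair(mappings, delim, n, filter_fn):
    """Apply `filter_fn` to each (query group, target group) bucket separately."""
    buckets = defaultdict(list)
    for m in mappings:
        key = (group_key(m["query"], delim), group_key(m["target"], delim))
        buckets[key].append(m)

    kept = []
    for key in sorted(buckets):  # sorted only to make the output order stable
        kept.extend(filter_fn(buckets[key], n))
    return kept


def keep_top_n(group, n):
    """Stand-in filter: keep the n highest-scoring mappings (not a plane sweep)."""
    return sorted(group, key=lambda m: m["score"], reverse=True)[:n]


mappings = [
    {"query": "genomeA#chr1", "target": "ref#chr1", "score": 90},
    {"query": "genomeA#chr2", "target": "ref#chr1", "score": 80},
    {"query": "genomeB#chr1", "target": "ref#chr1", "score": 70},
]

kept = filter_per_group_pair(mappings, "#", 1, keep_top_n)
for m in kept:
    print(m["query"], m["target"], m["score"])
```

With `n = 1`, filtering all three mappings together would keep only the `genomeA#chr1` mapping. Filtering per group pair keeps one mapping for `genomeA` and one for `genomeB`, which is the behavior the patch describes.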