pad the target when aligning #312

Open
wants to merge 59 commits into
base: main

ekg
Collaborator

@ekg ekg commented Jan 22, 2025

Changes so far:

  1. A new parameter target_padding was added to the Parameters struct in align_parameters.hpp. This parameter adds additional padding around target sequences.

  2. In computeAlignments.hpp, the main changes are:

    • Modified parseMashmapRow function to accept a new target_padding parameter
    • Added logic to apply padding to reference sequence coordinates (rStartPos and rEndPos) while ensuring they stay within valid bounds (not below 0 or above reference length)
    • Updated function calls to pass through the new target_padding parameter

  3. In parse_args.hpp, added command-line support for the target padding feature:

    • Added new command-line option -E or --target-padding to specify padding around target sequence
    • Default value is set to 0
    • Added validation to ensure the padding value is non-negative
    • Integrated the parameter into the alignment parameters structure
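
The clamping described for computeAlignments.hpp can be sketched as below. This is a minimal illustration and not the code in this PR: the helper name `apply_target_padding`, the `TargetInterval` struct, and the use of unsigned 64-bit coordinates are assumptions; only `rStartPos`, `rEndPos`, and `target_padding` come from the description above.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical result type: a half-open target interval [start, end).
struct TargetInterval {
    uint64_t start;
    uint64_t end;
};

// Hypothetical helper: widen the target interval by target_padding on both
// sides, never going below 0 or beyond the reference length.
inline TargetInterval apply_target_padding(uint64_t rStartPos,
                                           uint64_t rEndPos,
                                           uint64_t ref_len,
                                           uint64_t target_padding) {
    // Unsigned subtraction would wrap around, so compare before subtracting.
    const uint64_t start =
        rStartPos > target_padding ? rStartPos - target_padding : 0;

    // Cap the end at the reference length; computing the remaining room
    // first avoids overflow in rEndPos + target_padding.
    const uint64_t capped_end = std::min(rEndPos, ref_len);
    const uint64_t room = ref_len - capped_end;
    const uint64_t end = capped_end + std::min(target_padding, room);

    return TargetInterval{start, end};
}
```

With the default padding of 0 the interval is returned unchanged, which matches the default behaviour described for -E / --target-padding.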

Objectives:

  • Integrate https://github.com/ekg/indelswizzle/blob/main/cigar_swap.cpp, ideally within the biWFA alignment function in wflign. This means left-aligning leading indels and right-aligning trailing indels, so that the gaps sit at the start and end of the alignment. We should then trim back the target coordinates to exclude those gaps.
  • Test!
  • Only run this on mappings that have been split (via -P 50k, for instance), and not at the starts and ends of chains. This requires adding information to the mapping records that says where we are in the chain, in addition to which chain id we have. We could stash that in a single variable or put several fields on the row. Three values are needed: chain id, chain length, and position in chain.
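
One way to carry the three values from the last objective is sketched below. The `ChainInfo` struct, its field names, the 1-based position convention, and `is_internal_chunk` are all hypothetical, chosen only to show the check for "split mapping, not at the start or end of its chain".

```cpp
#include <cstdint>

// Hypothetical per-mapping chain record (names are illustrative only).
struct ChainInfo {
    uint64_t chain_id;      // which chain this mapping belongs to
    uint64_t chain_length;  // number of mappings (chunks) in the chain
    uint64_t chain_pos;     // assumed 1-based position within the chain
};

// True only for chunks that have a neighbour on both sides, i.e. the chain
// was split into several pieces and this piece is neither first nor last.
inline bool is_internal_chunk(const ChainInfo& info) {
    return info.chain_length > 2 &&
           info.chain_pos > 1 &&
           info.chain_pos < info.chain_length;
}
```

A chain of length 1 or 2 has no internal chunks under this definition, so unsplit mappings are never swizzled.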

ekg added 30 commits December 2, 2024 16:47
This commit addresses several compilation errors in the wflign module:
- Removed nested namespace in wflign_swizzle.hpp
- Added missing function definitions for try_swap_start_pattern and try_swap_end_pattern
- Added using namespace directive to resolve type resolution issues
- Corrected namespace closure and function implementations
ekg added 29 commits January 22, 2025 18:13
This commit introduces:
1. CIGAR string conversion utilities in `wflign_cigar_utils.hpp`
2. Enhanced indel-swizzling logic in `do_biwfa_alignment`
3. Debug logging for chain and CIGAR processing
4. Coordinate updates based on swizzled CIGAR

Key changes:
- Convert WFA edit_cigar to/from string format
- Apply swizzling only for internal chain chunks
- Update alignment coordinates when CIGAR changes
- Add detailed debug logging

Remaining tasks:
- Add boundary validation
- Comprehensive testing
- Optional configuration for swizzling behavior
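
As a rough illustration of the CIGAR string round trip and the coordinate update listed in this commit, the sketch below parses a CIGAR string, drops leading and trailing target-only gaps, and adjusts the target interval. It is not the code from `wflign_cigar_utils.hpp`: all names are hypothetical, and it assumes the SAM-style convention in which `D` consumes target bases only, which may differ from the convention WFA uses internally.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

using CigarOp = std::pair<uint64_t, char>;  // (length, operation)

// Parse a run-length CIGAR string such as "5D10=2X3D" into operations.
inline std::vector<CigarOp> parse_cigar(const std::string& cigar) {
    std::vector<CigarOp> ops;
    uint64_t len = 0;
    for (char c : cigar) {
        if (std::isdigit(static_cast<unsigned char>(c))) {
            len = len * 10 + static_cast<uint64_t>(c - '0');
        } else {
            ops.emplace_back(len, c);
            len = 0;
        }
    }
    return ops;
}

// Inverse of parse_cigar.
inline std::string cigar_to_string(const std::vector<CigarOp>& ops) {
    std::string out;
    for (const auto& op : ops) {
        out += std::to_string(op.first);
        out += op.second;
    }
    return out;
}

struct TrimmedAlignment {
    std::string cigar;
    uint64_t target_start;
    uint64_t target_end;
};

// Remove leading/trailing 'D' runs (assumed target-only gaps) and shrink the
// target interval [target_start, target_end) by the removed lengths.
// Assumes the CIGAR is consistent with the given interval.
inline TrimmedAlignment trim_target_gaps(const std::string& cigar,
                                         uint64_t target_start,
                                         uint64_t target_end) {
    const std::vector<CigarOp> ops = parse_cigar(cigar);
    std::size_t first = 0;
    std::size_t last = ops.size();
    while (first < last && ops[first].second == 'D') {
        target_start += ops[first].first;
        ++first;
    }
    while (last > first && ops[last - 1].second == 'D') {
        target_end -= ops[last - 1].first;
        --last;
    }
    const std::vector<CigarOp> kept(
        ops.begin() + static_cast<std::ptrdiff_t>(first),
        ops.begin() + static_cast<std::ptrdiff_t>(last));
    return TrimmedAlignment{cigar_to_string(kept), target_start, target_end};
}
```

The boundary validation listed under the remaining tasks would sit around a helper like this, for example checking that the removed gap lengths never exceed the target interval.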
feat: Add detailed sequence debug output for swizzle pattern matching