dup longer than 100 bases converted back to delins (due to hardcoding of 100 in code) #57

Open
davmlaw opened this issue May 28, 2021 · 1 comment

davmlaw commented May 28, 2021

Expected: Converting a long HGVS dup to variant coordinates and then back again should produce a dup
Actual: A long dup is converted to a delins:

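from pyhgvs import parse_hgvs_name, variant_to_hgvs_name

# Assumed to be set up beforehand (not shown in this report):
#   f          - the reference genome sequence object
#   transcript - the transcript object for NM_003193.4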

g_hgvs_str = "NC_000001.10:g.235611675_235611994dup"
c_hgvs_str = "NM_003193.4(TBCE):c.1411_1501dup"


chrom, offset, ref, alt = parse_hgvs_name(g_hgvs_str, f, None)
g_hgvs_name = variant_to_hgvs_name(chrom, offset, ref, alt, f, None)

print(f"{g_hgvs_str=} => {g_hgvs_name=}")

chrom, offset, ref, alt = parse_hgvs_name(c_hgvs_str, f, transcript)
c_hgvs_name = variant_to_hgvs_name(chrom, offset, ref, alt, f, transcript)

print(f"{c_hgvs_str=} => {c_hgvs_name=}")

Output:

g_hgvs_str='NC_000001.10:g.235611675_235611994dup' => g_hgvs_name=HGVSName('g.235611773_235611774ins320')
c_hgvs_str='NM_003193.4(TBCE):c.1411_1501dup' => c_hgvs_name=HGVSName('NM_003193.4(TBCE):c.1491+18_1491+19ins320')

This is because hgvs_justify_indel only looks at a hardcoded 100 bases around the indel, which is too short to cover a 320-base dup.

If you change the code to:

    size = max(len(ref), len(alt)) + 1
    start = max(offset - size, 0)
    end = offset + size

It keeps the dup:

g_hgvs_str='NC_000001.10:g.235611675_235611994dup' => g_hgvs_name=HGVSName('g.235611675_235611994dup320')
c_hgvs_str='NM_003193.4(TBCE):c.1411_1501dup' => c_hgvs_name=HGVSName('NM_003193.4(TBCE):c.1411_1501dup320')
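
To illustrate why the window size matters, here is a toy sketch. It is not the pyhgvs implementation: `is_dup_in_window` and the random test genome are hypothetical and only mimic the idea of comparing an insertion against the sequence inside a limited window. With a fixed 100-base window a 320-base insertion can never match its upstream copy, while a window sized from the alleles can.

```python
import random


def is_dup_in_window(genome, offset, inserted, window):
    """Hypothetical helper: report whether `inserted`, placed before 0-based
    `offset`, repeats the bases immediately upstream, while only looking at
    `window` bases on either side of the insertion point."""
    start = max(offset - window, 0)
    end = offset + window
    context = genome[start:end]
    rel = offset - start  # insertion point relative to the window
    upstream = context[max(rel - len(inserted), 0):rel]
    return upstream == inserted


# Synthetic sequence, purely for illustration.
random.seed(0)
genome = "".join(random.choice("ACGT") for _ in range(2000))
offset = 1000

ref = ""
alt = genome[offset - 320:offset]          # a 320-base duplication
short_alt = genome[offset - 10:offset]     # a 10-base duplication

# Fixed window of 100 bases: too small to see the full 320-base copy.
fixed = is_dup_in_window(genome, offset, alt, 100)

# Window derived from the allele lengths, as in the change proposed above.
size = max(len(ref), len(alt)) + 1
sized = is_dup_in_window(genome, offset, alt, size)

# Short dups fit inside the fixed window, so they were never affected.
short = is_dup_in_window(genome, offset, short_alt, 100)

print(fixed, sized, short)
```

The sketch only checks the upstream side; the real justification logic also shifts the indel, but the size limit applies in the same way.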

davmlaw commented May 25, 2023

Updated to use both genomic and coding HGVS, and added a note that it's due to hardcoding 100 bases around the indel rather than using the indel size.

davmlaw added a commit to SACGF/hgvs that referenced this issue May 25, 2023
davmlaw changed the title from "c. dup converted back to delins" to "dup longer than 100 bases converted back to delins (due to hardcoding of 100 in code)" on May 25, 2023