report all REVEL scores from dbNSFP #179

Open
andrewsu opened this issue Nov 15, 2023 · 1 comment
@andrewsu
Member

Reported in an email to the mailing list:

I have a question for you about the REVEL scores. REVEL scores are transcript-specific, and dbNSFP updated their site in early 2022 to include all REVEL scores and link the transcripts to them. However, I'm not seeing multiple scores available in the JSON output by myvariant.info.

Example variant: CA415086302

This variant has multiple REVEL scores:

[image: screenshot of the REVEL scores for this variant]

However, the myvariant.info section for this variant only displays the 0.173 value: https://myvariant.info/v1/variant/chrX:g.153693944C%3EA?assembly=hg38&format=html

Can you help me understand if we can get the various REVEL scores and their respective transcripts from myvariant.info? dbNSFP does provide the multiple scores; I've attached a download of their output for this variant.

Presumably an update of our dbNSFP parser is required...
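
To check what the API currently returns for this variant, a minimal Python sketch follows. The URL pattern is taken from the link in the email; the `fields=dbnsfp.revel` filter and the `dbnsfp.revel.score` path in the response are assumptions about the JSON layout, and the helper names `variant_url` and `revel_scores` are made up for illustration.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

BASE = "https://myvariant.info/v1/variant/"


def variant_url(hgvs_id, assembly="hg38", fields="dbnsfp.revel"):
    """Build the variant URL; '>' in the HGVS id is percent-encoded."""
    query = urlencode({"assembly": assembly, "fields": fields})
    return BASE + quote(hgvs_id, safe=":") + "?" + query


def revel_scores(doc):
    """Return REVEL scores as a list, whether the API gives one value or many.

    Assumes the scores live under doc["dbnsfp"]["revel"]["score"].
    """
    score = doc.get("dbnsfp", {}).get("revel", {}).get("score")
    if score is None:
        return []
    return list(score) if isinstance(score, list) else [score]


def fetch_revel_scores(hgvs_id):
    """Query the API (needs network access) and extract the scores."""
    with urlopen(variant_url(hgvs_id)) as resp:
        return revel_scores(json.load(resp))


print(variant_url("chrX:g.153693944C>A"))
```

Calling `fetch_revel_scores("chrX:g.153693944C>A")` would then show whether one score or several come back for the example variant.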

@liammulh

liammulh commented Dec 6, 2023

Hi, @andrewsu and @everaldorodrigo. I am a software developer at Stanford ClinGen. I work with Christine Preston who sent the email to you. We use myvariant.info to pull in REVEL scores.

I did some investigation of this issue yesterday. I'm not sure if my findings will be helpful to you, but I will summarize them here in case they are. More detail in this issue.

I downloaded dbNSFP and searched through dbNSFP4.3a_variant.chrX to ascertain whether your dbNSFP parsing code needs an update. I searched for "152959399" (the hg19_pos common to each REVEL score in the screenshot provided by Christine). Five lines matched the search. I then searched those lines for 0.173, 0.109, and 0.653, but was only able to find 0.173 and 0.653. The result for 0.653 was delimited by a semicolon:

[image: search-results]

I wrote a script that prints out the columns of my search results. Here is the REVEL_score column for the five lines that match "152959399":

('REVEL_score', '0.173', '0.653;0.653', '0.177', '0.653;0.653', '0.606;0.606')
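
A minimal sketch of such a column-printing script is below. It is not the script that produced the output above: it assumes the dbNSFP chromosome file is tab-delimited with a header row prefixed by `#`, the function name `column_for_matches` is hypothetical, and the toy rows use simplified column names rather than the real dbNSFP header.

```python
import csv
import io


def column_for_matches(lines, column, needle):
    """Return (column, value, value, ...) for rows where any field equals `needle`.

    Assumes tab-delimited input whose first line is the header; a leading
    '#' on the header is stripped.
    """
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader)
    header[0] = header[0].lstrip("#")
    idx = header.index(column)
    values = [row[idx] for row in reader if needle in row]
    return (column, *values)


# Toy input for illustration only; real dbNSFP rows have many more columns.
sample = (
    "#chr\thg19_pos\tREVEL_score\n"
    "X\t152959399\t0.173\n"
    "X\t152959399\t0.653;0.653\n"
    "X\t100\t0.5\n"
)
print(column_for_matches(io.StringIO(sample), "REVEL_score", "152959399"))
# ('REVEL_score', '0.173', '0.653;0.653')
```

For the real file, an open file handle on dbNSFP4.3a_variant.chrX can be passed in place of the `io.StringIO` object.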

Based on my reading of the DbnsfpReader class in the dbnsfp_parser_43a.py module, your code seems to anticipate a semicolon delimiter (specifically in the _iter_read_group method).
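
To illustrate what handling the semicolon delimiter could look like, here is a minimal sketch. It is not the code in `_iter_read_group`. The function `pair_revel_scores` is hypothetical, and it rests on two assumptions: that the REVEL_score field lines up position by position with a semicolon-delimited transcript ID field from the same row, and that `.` marks a missing score.

```python
def pair_revel_scores(transcript_field, score_field, sep=";"):
    """Pair each transcript ID with its REVEL score.

    Both arguments are raw, semicolon-delimited strings from one row.
    A score of "." is treated as missing and returned as None (assumption).
    """
    transcripts = transcript_field.split(sep)
    raw_scores = score_field.split(sep)
    if len(transcripts) != len(raw_scores):
        raise ValueError(
            f"{len(transcripts)} transcripts but {len(raw_scores)} scores"
        )
    scores = [None if s == "." else float(s) for s in raw_scores]
    return list(zip(transcripts, scores))


# "ENST_A" and "ENST_B" are placeholder transcript IDs, not real ones.
print(pair_revel_scores("ENST_A;ENST_B", "0.653;0.653"))
# [('ENST_A', 0.653), ('ENST_B', 0.653)]
```

Raising on a length mismatch, rather than guessing how to align the fields, keeps any disagreement between the two columns visible.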

I'm not able to spend more time investigating this at the moment. I would have liked to run your code on some of the data I extracted from dbNSFP with some breakpoints. I hope this helps! If you have any questions, please @ me.
