Overview
There are some phenotypes that are caused by various genes, and we generally (always?) do not want to include these in the pipeline; they should not be marked as having a causal germline mutation with an associated gene in morbidmap.txt. The problem is that the information about whether a phenotype is caused by various genes is not contained in morbidmap.txt or the other data files. It is found in the first paragraph of the "Text" subsection on omim.org/entry pages.
Possible solution
a. Web scrape, and include that paragraph of text in the spreadsheet
b. Web scrape, and maybe look for specific phrases like “various genes”, and toggle a boolean column when such phrases are found.
c. Do nothing, and import these anyway even though we ideally should/would not.
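To make option (b) more concrete, here is a minimal Python sketch of the phrase check, assuming the first "Text" paragraph for each entry has already been retrieved somehow (the retrieval step is deliberately left out). The function names, the `mim_number` key, and the `various_genes` column are hypothetical; the only phrase taken from this issue is "various genes", and the list would likely need to grow after looking at real entries.

```python
import re

# Phrases that suggest a phenotype is caused by various genes.
# Only "various genes" comes from this issue; extend as needed.
PHRASES = ["various genes"]


def normalize(text):
    """Lowercase and collapse whitespace so line wraps do not break matching."""
    return re.sub(r"\s+", " ", text).strip().lower()


def caused_by_various_genes(first_paragraph, phrases=PHRASES):
    """Return True if any of the phrases occurs in the paragraph."""
    norm = normalize(first_paragraph)
    return any(normalize(p) in norm for p in phrases)


def annotate(rows, text_by_mim):
    """Add a boolean 'various_genes' column to each row.

    rows: list of dicts, each with a (hypothetical) 'mim_number' key.
    text_by_mim: dict mapping MIM number to the first 'Text' paragraph.
    Rows with no known paragraph are flagged False.
    """
    annotated = []
    for row in rows:
        paragraph = text_by_mim.get(row["mim_number"], "")
        annotated.append({**row, "various_genes": caused_by_various_genes(paragraph)})
    return annotated
```

Keeping the check separate from the retrieval means the same column could be filled in for option (a) as well, by storing the paragraph next to the flag so that a curator can review borderline matches.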
Background