Separate description source from gene description text #11
Comments
I'm looking to extract the gene_description source information in SQL, but when I use the
Also, I don't think
Based on these issues, it seems like we should parse the description in Python instead.
Noting that not all descriptions have source information. Here are some examples without:
There are also cases where gene_description is null.
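A rough sketch of what the Python parsing could look like, covering the three cases mentioned here (trailing source, no source, null). The regex is an assumption based only on the "[Source:HGNC Symbol;Acc:HGNC:11858]" example in this issue, and `parse_gene_description` and `ParsedDescription` are hypothetical names, not existing code in this repo.

```python
import re
from typing import NamedTuple, Optional

# Assumed format: free text followed by a trailing "[Source:SOURCE;Acc:CURIE]".
# Inferred from the single example in this issue, not verified for all species.
_SOURCE_PATTERN = re.compile(
    r"^(?P<description>.*?)\s*"
    r"\[Source:(?P<source>[^;\]]+);Acc:(?P<accession>[^\]]+)\]\s*$",
    flags=re.DOTALL,
)


class ParsedDescription(NamedTuple):
    """Hypothetical container for the separated fields."""

    description: Optional[str]
    source: Optional[str]
    accession: Optional[str]


def parse_gene_description(text: Optional[str]) -> ParsedDescription:
    """Split a gene description into description text, source, and accession."""
    if text is None:
        # Null gene_description: nothing to parse.
        return ParsedDescription(None, None, None)
    match = _SOURCE_PATTERN.match(text)
    if match is None:
        # No trailing source block: keep the text as the description.
        return ParsedDescription(text.strip() or None, None, None)
    return ParsedDescription(
        match["description"] or None,
        match["source"],
        match["accession"],
    )
```

With a made-up description, `parse_gene_description("some description [Source:HGNC Symbol;Acc:HGNC:11858]")` gives `description="some description"`, `source="HGNC Symbol"`, `accession="HGNC:11858"`. Descriptions that don't match the assumed pattern are passed through unchanged rather than dropped.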
Rerunning 105 exports in https://github.com/related-sciences/ensembl-genes/actions/runs/1564648697 to include gene description updates.
Example gene descriptions by species:
Notice the trailing bracketed source information like "[Source:HGNC Symbol;Acc:HGNC:11858]". It would be nice to separate this description source information into a separate column, such that it's possible to isolate the actual description.
Question: is the source string always going to be in the format of
[Source:SOURCE;Acc:CURIE]
for all species and descriptions?
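One way to probe this empirically would be to bucket every description by how its tail looks and count the buckets per species. The sketch below is only an illustration: `classify_description` and `summarize_formats` are hypothetical helpers, the category labels are made up, and the strict pattern is again just the `[Source:SOURCE;Acc:CURIE]` shape assumed above.

```python
import re
from collections import Counter
from typing import Iterable, Optional

# Strict pattern: the format asked about in this issue (an assumption).
_STRICT = re.compile(r"\[Source:[^;\]]+;Acc:[^\]]+\]\s*$")
# Loose pattern: any trailing bracketed block, whatever its contents.
_TRAILING_BRACKET = re.compile(r"\[[^\]]*\]\s*$")


def classify_description(text: Optional[str]) -> str:
    """Return a coarse label describing how a description ends."""
    if text is None:
        return "null"
    if _STRICT.search(text):
        return "source_and_accession"
    if _TRAILING_BRACKET.search(text):
        # Ends with brackets but not in the expected format.
        return "other_bracket"
    return "no_source"


def summarize_formats(descriptions: Iterable[Optional[str]]) -> Counter:
    """Count how many descriptions fall into each category."""
    return Counter(classify_description(d) for d in descriptions)
```

Any nonzero `other_bracket` count for a species would point to descriptions that need a closer look before relying on a single format.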