add support for rat species #6
ensembl_genes/species.py
# FIXME: mhc coordinates
mhc_chromosome="6",
mhc_lower=28_510_120,
mhc_upper=33_480_577,
xmhc_lower=25_726_063,
xmhc_upper=33_410_226,
@ACastanza do you happen to know the MHC/xMHC boundaries for rats?
I don't, sorry. I did find this publication: https://www.ncbi.nlm.nih.gov/labs/pmc/articles/PMC383307/ but it's pretty old and almost certainly doesn't correspond to the current genome assembly, which is Rnor_6.0.
Speaking of which, the mouse coordinates will need to be taken from GRCm39.
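For illustration, here is a minimal sketch of how a species entry could carry MHC boundaries that are allowed to be unknown, so that a species without verified coordinates does not silently inherit another assembly's values. The `MhcRegion` class and `in_mhc` function are hypothetical names, not the project's actual API; the coordinates are the ones from the diff under review.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MhcRegion:
    """Hypothetical container for MHC boundaries (not the project's class)."""

    chromosome: str
    lower: int
    upper: int

    def overlaps(self, chromosome: str, start: int, end: int) -> bool:
        # Closed-interval overlap test, restricted to the same chromosome.
        return (
            chromosome == self.chromosome
            and start <= self.upper
            and end >= self.lower
        )


def in_mhc(
    region: Optional[MhcRegion], chromosome: str, start: int, end: int
) -> Optional[bool]:
    """Return None when the boundaries are unknown instead of guessing."""
    if region is None:
        return None
    return region.overlaps(chromosome, start, end)


# Coordinates copied from the diff under review (flagged there with FIXME).
placeholder_mhc = MhcRegion(chromosome="6", lower=28_510_120, upper=33_480_577)

# Assumption: no verified boundaries for rat yet, so the region is left unset.
rat_mhc = None
```

With this shape, a downstream column such as an MHC flag would come out as missing for rat rather than being computed from coordinates that belong to a different genome.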
Here's the output README for rat from a local export. @ACastanza does this look okay, besides the OutputRelease info?
Table heads
The first 10 rows of each exported table are shown below.

genes
Primary table of Ensembl genes with IDs, symbols, and genomic location information. Most users will want to filter this dataset to representative genes only, via the

alt_alleles
This is an intermediate table that groups genes if they are alternate alleles of each other. A representative gene is selected from each group.

old_to_newest
This table maps outdated gene symbols to their newest gene symbol, traversing multiple levels of replacement if necessary. When

updates
This dataset updates Ensembl genes to current, representative Ensembl genes. We refer to it as the 'omni-updater'. When ingesting external datasets that use Ensembl gene IDs, we recommend joining with this table. Current, representative genes map to themselves.

xrefs
This dataset contains cross-references (xrefs) from Ensembl genes to various external gene resources.

xref_ncbigene
This dataset contains cross-references (xrefs) from Ensembl genes to NCBI (Entrez) genes.

xref_go
This dataset contains cross-references (xrefs) from Ensembl genes to Gene Ontology terms, as asserted by Gene Ontology annotations.
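The "traversing multiple levels of replacement" step described for old_to_newest can be sketched as follows. This is only an illustration of the idea, not the project's implementation: the `resolve_newest` function, the `replaced_by` mapping, and the symbols in it are all hypothetical, and it assumes each outdated symbol has a single direct replacement.

```python
def resolve_newest(symbol, replaced_by):
    """Follow old -> new replacements until no further replacement exists."""
    seen = {symbol}
    while symbol in replaced_by:
        symbol = replaced_by[symbol]
        if symbol in seen:
            # Guard against malformed input that loops back on itself.
            raise ValueError(f"cycle detected at {symbol!r}")
        seen.add(symbol)
    return symbol


# Hypothetical one-step replacements: OLD1 was replaced by OLD2, then by NEW.
replaced_by = {"OLD1": "OLD2", "OLD2": "NEW"}

# Collapse every chain so each outdated symbol points straight at the newest.
old_to_newest = {old: resolve_newest(old, replaced_by) for old in replaced_by}
```

Resolving the chains once up front means a consumer needs only a single lookup per symbol, however many renames happened in between.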
since we do not yet know the actual MHC/xMHC boundaries
This looks pretty good. I also really like the omni-updater; tracking old Ensembl IDs was something I'd always intended to get to, but the data was difficult to obtain.
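As a rough sketch of how the omni-updater might be used when ingesting an external dataset: each record's Ensembl gene ID is looked up in the updates mapping, and current representative genes map to themselves. The function name, the column names, and the gene IDs below are hypothetical placeholders, and the sketch assumes each input ID maps to at most one updated ID.

```python
def update_gene_ids(records, updates):
    """Attach the current representative gene ID to each external record.

    IDs absent from the updates mapping get None, mirroring a left join.
    """
    updated = []
    for record in records:
        old_id = record["ensembl_gene_id"]
        updated.append(
            {**record, "ensembl_representative_gene_id": updates.get(old_id)}
        )
    return updated


# Hypothetical updates table: a retired ID maps to its replacement, and a
# current representative gene maps to itself.
updates = {"GENE_RETIRED": "GENE_CURRENT", "GENE_CURRENT": "GENE_CURRENT"}

# Hypothetical external dataset keyed by Ensembl gene ID.
external = [
    {"ensembl_gene_id": "GENE_RETIRED", "score": 0.9},
    {"ensembl_gene_id": "GENE_CURRENT", "score": 0.4},
    {"ensembl_gene_id": "GENE_UNKNOWN", "score": 0.1},
]

joined = update_gene_ids(external, updates)
```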
refs #4