Automate detection & export of new ensembl releases #1
Comments
@dhimmel thanks for making an issue. I would have done so myself, but I was on the run when I tweeted at you. Here's a little more context. The code you'd need:

    import bioversions

    ensembl_version = bioversions.get_version("ensembl")

This code executes a live request to the Ensembl website and does some HTML parsing/traversal to pick out the version number. This actually runs on a nightly build (along with all of the other version getter functions in Bioversions) that writes to a YAML file on the Bioversions GitHub repository, so you can use this alternative code that doesn't rely on Bioversions as a Python dependency:

    import requests
    import yaml

    url = "https://raw.githubusercontent.com/biopragmatics/bioversions/main/docs/_data/versions.yml"
    res = requests.get(url)
    res_yaml = yaml.safe_load(res.text)
    versions = {
        entry["prefix"]: entry["releases"][-1]["version"]
        for entry in res_yaml["database"]
        if "prefix" in entry
    }
    ensembl_version = versions["ensembl"]
Note: I forgot that the single source of truth for the daily updated data is natively stored in JSON at https://raw.githubusercontent.com/biopragmatics/bioversions/main/src/bioversions/resources/versions.json. A better way that doesn't rely on a YAML parser would be:

    import requests

    url = "https://raw.githubusercontent.com/biopragmatics/bioversions/main/src/bioversions/resources/versions.json"
    res_json = requests.get(url).json()
    versions = {
        entry["prefix"]: entry["releases"][-1]["version"]
        for entry in res_json["database"]
        if "prefix" in entry
    }
    ensembl_version = versions["ensembl"]
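For use in a scheduled job, the lookup above could be wrapped in a small helper that fails loudly on network errors. This is only a sketch, not something from the Bioversions docs: the function names `latest_versions` and `fetch_latest_version` are made up here, it swaps `requests` for the standard library's `urllib` so there is no third-party dependency at all, and it assumes the JSON keeps the `database` / `prefix` / `releases` / `version` layout used in the snippet above.

```python
import json
from urllib.request import urlopen

VERSIONS_URL = (
    "https://raw.githubusercontent.com/biopragmatics/bioversions/"
    "main/src/bioversions/resources/versions.json"
)


def latest_versions(payload: dict) -> dict:
    """Map each prefix to the version of its last listed release.

    Entries without a prefix or without any releases are skipped.
    """
    return {
        entry["prefix"]: entry["releases"][-1]["version"]
        for entry in payload["database"]
        if "prefix" in entry and entry.get("releases")
    }


def fetch_latest_version(prefix: str, url: str = VERSIONS_URL, timeout: float = 30.0) -> str:
    """Download the versions file and return the latest version for one prefix.

    urlopen raises on HTTP errors, and a missing prefix raises KeyError,
    so a CI job calling this stops instead of exporting with a bad version.
    """
    with urlopen(url, timeout=timeout) as response:
        payload = json.load(response)
    return latest_versions(payload)[prefix]
```

Calling `fetch_latest_version("ensembl")` would then give the same value as `versions["ensembl"]` in the snippet above, as long as the file layout hasn't changed.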
Okay, I added scheduled export builds in b75c893, along with an overwrite option that controls whether to re-export when an output branch already exists. Both scheduled and dispatch jobs now default to overwrite=false. You must set overwrite=true on a dispatch to overwrite.
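The overwrite behavior described here boils down to a small decision rule. A rough sketch of that rule follows; `should_export` is a hypothetical name for illustration and is not taken from the commit.

```python
def should_export(branch_exists: bool, overwrite: bool = False) -> bool:
    """Decide whether an export job should run.

    Export when no output branch exists yet, or when overwriting was
    explicitly requested. The default mirrors overwrite=false.
    """
    return overwrite or not branch_exists


# Walk through all four combinations of the two inputs.
for branch_exists in (False, True):
    for overwrite in (False, True):
        decision = should_export(branch_exists, overwrite)
        print(f"branch_exists={branch_exists} overwrite={overwrite} -> export={decision}")
```

With the default, a scheduled run only skips the export when the output branch is already there.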
Here are two export CI logs
@cthoyt tweeted:
This is a great idea and would reduce future maintenance. Happy to use bioversions for this.
We will need to detect if an output already exists. Should be able to do this by looking at the git branches.
Sometimes exports will fail, for example if a release changes the schema. These changes take a non-trivial amount of effort to fix. For this reason I lean towards weekly scheduled jobs, so that a failing export becomes a weekly rather than a daily annoyance.
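One way the branch check could look, as a sketch only: list the remote heads with `git ls-remote --heads` and test whether a branch for the release is present. The branch naming template `output/ensembl-{version}` and all function names are assumptions for illustration, not this repository's actual convention.

```python
import subprocess


def parse_remote_branches(ls_remote_output: str) -> set:
    """Extract branch names from `git ls-remote --heads` output.

    Each line has the form "<sha>\trefs/heads/<branch>".
    """
    branches = set()
    for line in ls_remote_output.splitlines():
        _sha, _, ref = line.partition("\t")
        if ref.startswith("refs/heads/"):
            branches.add(ref[len("refs/heads/"):])
    return branches


def remote_branches(remote: str = "origin") -> set:
    """Return the set of branch names on the given remote."""
    result = subprocess.run(
        ["git", "ls-remote", "--heads", remote],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_remote_branches(result.stdout)


def output_exists(version: str, branches: set,
                  template: str = "output/ensembl-{version}") -> bool:
    """Check for an export branch; the template is a hypothetical naming scheme."""
    return template.format(version=version) in branches
```

A scheduled job could call `output_exists(ensembl_version, remote_branches())` and skip the export when it returns true.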