Add taxon subsets #2811
Add two new subsets: human-view and mouse-view. They are automatically generated by applying the taxon constraints declared in the ontology to exclude any class that can be inferred not to exist in human and mouse, respectively. This is the same strategy used to produce similar subsets in Uberon, and it reuses the `create-species-subset` custom ROBOT command developed for that purpose in the Uberon ROBOT plugin.
Now that we can generate tag files for the taxon subsets, we merge those into the main cl.owl release artefact. This requires that we no longer build the taxon subsets from the same release artefact, to avoid an obvious chicken-and-egg problem. So we now generate the subsets from cl-full.owl instead (which is, in effect, almost the same thing as cl.owl, modulo the ontology annotation).
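The chicken-and-egg problem mentioned above can be illustrated with a toy dependency graph. This is only a sketch: the file names come from this PR, but the exact edges are assumptions made for illustration, not a transcription of the real Makefile.

```python
from graphlib import TopologicalSorter, CycleError

def build_order(deps):
    """Return a valid build order, or None if the dependencies are circular.

    `deps` maps each target to the set of files it is built from.
    """
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError:
        return None

# Assumed edges: tags built from cl.owl, and cl.owl built from the tags.
circular = {
    "cl.owl": {"cl-full.owl", "subsets/human-tags.ofn"},
    "subsets/human-tags.ofn": {"cl.owl"},
}

# Assumed edges: tags built from cl-full.owl instead.
acyclic = {
    "cl.owl": {"cl-full.owl", "subsets/human-tags.ofn"},
    "subsets/human-tags.ofn": {"cl-full.owl"},
}

print(build_order(circular))  # no valid order exists
print(build_order(acyclic))   # cl-full.owl comes first, cl.owl last
```

Building the tag files from `cl-full.owl` removes the edge that closes the loop, which is why the second graph can be ordered.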
In the preprocess step, where we make use of the FlyBase ROBOT plugin, make the 'all_robot_plugins' target an order-only prerequisite, so that once the plugins have been installed, we do not always trigger a rebuild of the preprocessed ontology, which in turn would trigger a rebuild of everything else.
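As a rough model of why an order-only prerequisite helps here, the following sketch mimics how GNU Make decides whether a target is out of date. The function name and the timestamps are hypothetical; only the rule itself (order-only prerequisites are not compared against the target's timestamp) reflects Make's documented behaviour.

```python
from typing import Optional, Sequence

def is_out_of_date(target_mtime: Optional[float],
                   normal_prereqs: Sequence[float],
                   order_only_prereqs: Sequence[float]) -> bool:
    """Decide whether a target must be rebuilt.

    A missing target is always rebuilt. Otherwise only normal prerequisites
    that are newer than the target trigger a rebuild; order-only
    prerequisites are ignored for this check (Make merely ensures they
    have been built before the target is).
    """
    if target_mtime is None:
        return True
    return any(mtime > target_mtime for mtime in normal_prereqs)

# Hypothetical timestamps: the preprocessed ontology was built at t=100,
# its source was last edited at t=90, and the plugins look newer (t=200).
source, plugins, preprocessed = 90.0, 200.0, 100.0

# Plugins as a normal prerequisite: the target is rebuilt.
print(is_out_of_date(preprocessed, [source, plugins], []))
# Plugins as an order-only prerequisite: no rebuild.
print(is_out_of_date(preprocessed, [source], [plugins]))
```

In Makefile syntax, order-only prerequisites are the ones listed after a `|` on the prerequisite line.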
The subset files are generated upon every release and are release artefacts, so there is no need to commit them.
Looks great. One minor nitpick, but I will leave it to you to ignore or act on.
# tags for the taxon subsets.
POSTPROCESS_ADDITIONS = subsets/human-tags.ofn \
                        subsets/mouse-tags.ofn
$(ONT).owl: $(ONT)-full.owl $(POSTPROCESS_ADDITIONS)
There is some mild itch in my nose to overwrite x.owl rather than x-full.owl, just because it means that the primary release does not correspond directly to any of the known ontology types. Minor itch though; if you don't share that sentiment I am fine with this.
I guess this way you avoid circularity.
> the primary release does not correspond directly to any of the known ontology types

Never thought this could be a concern. o_O `uberon.owl` also does not correspond to any of the ODK-defined “ontology types”.
And in this case, the difference between `cl-full.owl` and `cl.owl` is quite minor, since it is only the addition of the `oboInOwl:inSubset` annotations, whereas in Uberon, `uberon.owl` uses a completely different pipeline than `uberon-full.owl`.
Roger that!
A longer-term alternative could be to have some kind of `POSTPROCESS` step in the ODK (similar to the `PREPROCESS` that is already in there), which would be a no-op by default but that could be overridden by projects if they need to perform some last-minute changes at the very end of the build process.
This PR adds taxon-specific subsets to CL.
A taxon-specific subset for a given taxon is automatically generated by using the taxon constraints declared in the ontology to exclude any class that, due to the constraints, is known not to exist in the taxon. This is the same approach used for similar subsets in Uberon (obophenotype/uberon#3363).
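The filtering idea can be sketched on a toy model. Everything below is hypothetical (the taxonomy, the class names and the `never_in` / `only_in` keys); the real subsets are produced by the `create-species-subset` ROBOT command working on the ontology's actual taxon constraints, not by this code.

```python
from typing import Dict, List, Optional, Set

# Hypothetical toy taxonomy: child taxon -> parent taxon.
TAXON_PARENT: Dict[str, str] = {
    "human": "mammal",
    "mouse": "mammal",
    "mammal": "vertebrate",
    "fly": "insect",
}

# Hypothetical classes, each with an optional parent and taxon constraints.
CLASSES: Dict[str, Dict[str, Optional[str]]] = {
    "cell": {"parent": None},
    "generic_cell": {"parent": "cell"},
    "insect_only_cell": {"parent": "cell", "only_in": "insect"},
    "insect_only_child": {"parent": "insect_only_cell"},
    "non_mammal_cell": {"parent": "cell", "never_in": "mammal"},
}

def lineage(taxon: str) -> Set[str]:
    """Return the taxon together with all of its ancestors."""
    taxa = {taxon}
    while taxon in TAXON_PARENT:
        taxon = TAXON_PARENT[taxon]
        taxa.add(taxon)
    return taxa

def exists_in(cls: str, taxon: str) -> bool:
    """Check the constraints on a class and on every one of its ancestors."""
    taxa = lineage(taxon)
    node: Optional[str] = cls
    while node is not None:
        info = CLASSES[node]
        if info.get("never_in") in taxa:
            return False  # excluded by a never-in-taxon constraint
        only_in = info.get("only_in")
        if only_in is not None and only_in not in taxa:
            return False  # restricted to a taxon outside this lineage
        node = info.get("parent")
    return True

def species_subset(taxon: str) -> List[str]:
    """Keep every class that cannot be shown to be absent from the taxon."""
    return sorted(c for c in CLASSES if exists_in(c, taxon))

print(species_subset("human"))
print(species_subset("fly"))
```

Note that constraints are inherited: `insect_only_child` declares nothing itself but is still excluded from the human subset because of its parent.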
Two new release artefacts are added: `human-view.owl` (the human-specific subset) and `mouse-view.owl` (the mouse-specific subset). In addition, in the main release product (`cl.owl`), classes that were found to belong in the human (respectively mouse) subset are tagged with a `http://purl.obolibrary.org/obo/cl#human_subset` (respectively `http://purl.obolibrary.org/obo/cl#mouse_subset`) subset annotation (again, similar to what was done in Uberon).