How to handle non diseases in EFO? #50
Comments
We were missing "injury, poisoning or other complication" as a non-disease TA (fixed in 1b59233). However, this does let us see what some classifications are like on non-disease terms.
Many make sense, but some seem off, like "spinal injury" and "adverse effect" being classified as high precision. This makes me reconsider whether it entirely makes sense to apply the model outside of its trained domain.
explodes #35 (comment)
We currently ignore non-diseases by only training and predicting on terms that are diseases as per `get_disease_nodes`.
Our training labels only apply to diseases. Therefore, I think it makes sense to continue training only on diseases. However, there is the possibility that we could also predict on non-diseases and add an `is_disease` marker column to the output.
While predictions on non-diseases would likely be of lower quality due to the lack of training coverage, many of the same concepts of grouping terms versus more specific terms would still apply. Users could then discard all predictions where `is_disease` is False to keep the current behavior.
There could be a benefit to having precision predictions for non-diseases. For example, classifications of symptoms (one example being pain) would make sense along a precision axis.
@yonromai: I'll bring this up with the data team at our next meeting, so no need to do anything until then. CC @eric-czech
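To make the `is_disease` marker idea concrete, here is a minimal sketch in plain Python. It is only an illustration under assumptions: the row layout, the `term_id` and `predicted_class` keys, the placeholder identifiers and class values, and both helper functions are hypothetical and not taken from the project code. Only the `is_disease` column name comes from the proposal above.

```python
# Hypothetical sketch: mark each prediction row with is_disease, then let
# users drop non-disease rows to reproduce the current behavior.

def add_is_disease(rows, disease_ids):
    """Return copies of the rows with an is_disease marker added.

    rows: list of dicts, each with a "term_id" key (assumed layout).
    disease_ids: set of term ids considered diseases, e.g. as collected
    from something like get_disease_nodes (assumption).
    """
    return [{**row, "is_disease": row["term_id"] in disease_ids} for row in rows]


def keep_diseases_only(rows):
    """Discard every prediction where is_disease is False."""
    return [row for row in rows if row["is_disease"]]


# Placeholder data: identifiers and predicted classes are made up.
predictions = [
    {"term_id": "TERM:0001", "label": "some disease", "predicted_class": "class_a"},
    {"term_id": "TERM:0002", "label": "pain", "predicted_class": "class_b"},
]
disease_ids = {"TERM:0001"}

marked = add_is_disease(predictions, disease_ids)
diseases_only = keep_diseases_only(marked)
```

With this shape, the full output keeps predictions for every term, while `keep_diseases_only` gives the same set of terms we output today.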