How to handle non diseases in EFO? #50

Open
dhimmel opened this issue Oct 17, 2023 · 1 comment

dhimmel commented Oct 17, 2023

explodes #35 (comment)

We currently ignore non-diseases by training and predicting only on terms that are diseases according to get_disease_nodes.

Our training labels apply only to diseases, so I think it makes sense to continue training only on diseases. However, we could:

  1. create an is_disease marker column that is part of the output
  2. compute features for non-diseases
  3. compute predictions for non-diseases

While predictions on non-diseases would likely be of lower quality due to the lack of training coverage, many of the same concepts of grouping terms versus more specific terms would still apply. Users could then decide to discard all predictions when is_disease is False to continue with the current behavior.
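The proposal above can be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: the term IDs (`EX:...`), the `annotate` and `diseases_only` helpers, and the `stub_predict` function are all hypothetical stand-ins, and the real set of disease IDs would presumably come from get_disease_nodes.

```python
def annotate(terms, disease_ids, predict):
    """Predict a class for every term and attach an is_disease marker."""
    rows = []
    for term in terms:
        rows.append(
            {
                "id": term["id"],
                "label": term["label"],
                # Marker column: True only for terms inside the trained domain.
                "is_disease": term["id"] in disease_ids,
                "predicted_class": predict(term),
            }
        )
    return rows


def diseases_only(rows):
    """Reproduce the current behavior by dropping non-disease predictions."""
    return [row for row in rows if row["is_disease"]]


# Hypothetical placeholder terms; not real ontology identifiers.
terms = [
    {"id": "EX:0000001", "label": "example disease"},
    {"id": "EX:0000002", "label": "example symptom"},
]
disease_ids = {"EX:0000001"}  # assumed to be derived from get_disease_nodes


def stub_predict(term):
    """Stand-in for the trained model; always returns the same class."""
    return "02-disease-root"


all_rows = annotate(terms, disease_ids, stub_predict)
kept = diseases_only(all_rows)
print([row["id"] for row in kept])
```

With this shape, the output always carries predictions for every term, and the decision to trust out-of-domain predictions is left to the consumer.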

There could be a benefit to having precision predictions for non-diseases. For example, classifications of symptoms (one example being pain) would make sense along a precision axis:

[image attachment in the original issue]

@yonromai: I'll bring this up with the data team at our next meeting, so no need to do anything until then. CC @eric-czech

dhimmel commented Oct 18, 2023

We were missing "injury, poisoning or other complication" (OTAR:0000009) as a non-disease therapeutic area (TA), which is fixed in 1b59233. However, the omission does let us see what some classifications look like on non-disease terms:

| efo_otar_slim_id | efo_label | class_new |
| --- | --- | --- |
| EFO:0010686 | muscle strain | 02-disease-root |
| EFO:0010725 | aseptic loosening | 02-disease-root |
| EFO:0010581 | organophosphate poisoning | 02-disease-root |
| EFO:0011061 | toxicity | 03-disease-area |
| EFO:0020910 | thermal burn | 02-disease-root |
| EFO:0020930 | immune-mediated adverse reaction | 02-disease-root |
| EFO:0600078 | Achilles tendon injury | 01-disease-subtype |
| EFO:0000546 | injury | 03-disease-area |
| EFO:0002687 | ischemia reperfusion injury | 02-disease-root |
| MONDO:0037747 | spinal injury | 01-disease-subtype |
| MONDO:0700220 | disease related to transplantation | 03-disease-area |
| MONDO:0700222 | disease related to hematopoietic stem cell transplant | 03-disease-area |
| OTAR:0000009 | injury, poisoning or other complication | 03-disease-area |
| MONDO:0800373 | carbon monoxide poisoning | 03-disease-area |
| EFO:0007430 | persian gulf syndrome | 02-disease-root |
| EFO:0009485 | eye injury | 01-disease-subtype |
| EFO:0009518 | complication | 03-disease-area |
| EFO:0009582 | sprain | 02-disease-root |
| EFO:0009508 | leg injury | 02-disease-root |
| EFO:0009574 | intoxication | 02-disease-root |
| EFO:0009503 | caustic injury | 02-disease-root |
| EFO:0009565 | radiation-induced disorder | 03-disease-area |
| EFO:0009504 | crush injury | 02-disease-root |
| EFO:0009816 | perineal laceration during delivery | 02-disease-root |
| EFO:0009516 | burn | 03-disease-area |
| EFO:0009509 | limb injury | 03-disease-area |
| EFO:0009658 | adverse effect | 01-disease-subtype |
| EFO:0009434 | death by undetermined cause | 01-disease-subtype |
| EFO:0009521 | dislocation | 02-disease-root |
| EFO:0009887 | intrathoracic organ injury | 02-disease-root |
| EFO:0009506 | heart injury | 02-disease-root |
| EFO:0008546 | poisoning | 03-disease-area |
| EFO:0009507 | knee injury | 02-disease-root |
| EFO:0009527 | frostbite | 02-disease-root |
| EFO:0009888 | trauma complication | 01-disease-subtype |
| EFO:0009623 | nose injury | 02-disease-root |
| EFO:0009502 | abdominal injury | 02-disease-root |
| EFO:0009525 | foreign body | 03-disease-area |
| EFO:0009476 | neck injury | 02-disease-root |
| EFO:0009519 | device complication | 02-disease-root |
| EFO:0009833 | kidney injury | 02-disease-root |
| EFO:0009505 | head injury | 02-disease-root |
| MONDO:0019088 | post-transplant lymphoproliferative disease | 01-disease-subtype |
| EFO:1001291 | ciguatera poisoning | 02-disease-root |
| EFO:1001788 | Eye Burns | 02-disease-root |
| EFO:1001328 | fluoride poisoning | 02-disease-root |
| EFO:1001518 | heavy metal poisoning | 03-disease-area |
| EFO:1001756 | Acrodynia | 02-disease-root |
| EFO:1001373 | Multiple Organ Failure | 02-disease-root |
| EFO:1001768 | cadmium poisoning | 02-disease-root |

Many make sense, but some seem off, such as "spinal injury" and "adverse effect" being classified as high precision (01-disease-subtype). This makes me reconsider whether it entirely makes sense to apply the model outside of its trained domain.
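One way to make this kind of spot check repeatable is to tally the predicted classes and list the non-disease terms that landed in the highest-precision class. The sketch below uses a handful of rows copied from the table in this comment; the tuple layout and the variable names are assumptions for illustration, not the project's data format.

```python
from collections import Counter

# A few rows copied from the table: (efo_otar_slim_id, efo_label, class_new).
rows = [
    ("EFO:0010686", "muscle strain", "02-disease-root"),
    ("EFO:0000546", "injury", "03-disease-area"),
    ("MONDO:0037747", "spinal injury", "01-disease-subtype"),
    ("EFO:0009658", "adverse effect", "01-disease-subtype"),
    ("EFO:0009516", "burn", "03-disease-area"),
    ("EFO:0009505", "head injury", "02-disease-root"),
]

# Distribution of predicted classes across the sampled non-disease terms.
class_counts = Counter(class_new for _, _, class_new in rows)

# Terms predicted as the most specific class, which deserve manual review.
flagged = [label for _, label, class_new in rows if class_new == "01-disease-subtype"]

print(class_counts)
print(flagged)
```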
