The MEDQA-USMLE-Symp dataset consists of clinical cases drawn from the MEDQA-USMLE dataset. The clinical cases are annotated with medical entity types such as:
- Sign or Symptom
- Findings
- Temporal Concept
- Location
- Population Group
- Age Group
- No Symptom Occurrence
For details on the creation of the dataset and the experiments based on it, please see the paper cited below.
The MEDQA-USMLE-Symp dataset is released under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License:
https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode
If you are using our dataset for research purposes, please cite the following paper:
Marro S, Molinet B, Cabrio E, Villata S. Natural Language Explanatory Arguments for Correct and Incorrect Diagnoses of Clinical Cases. In ICAART 2023 - 15th International Conference on Agents and Artificial Intelligence, 2023 Feb 22 (Vol. 1, pp. 438-449).