Update README.md
bglocker authored Oct 24, 2021
1 parent b6739ff commit 678e028
Showing 1 changed file with 9 additions and 8 deletions.
17 changes: 9 additions & 8 deletions README.md
@@ -55,14 +55,15 @@ In order to replicate the results presented in the paper, please follow these st
 
 1. Download the [CheXpert dataset](https://stanfordmlgroup.github.io/competitions/chexpert/), copy the file `train.csv` to the `datafiles` folder
 2. Download the [CheXpert demographics data](https://stanfordaimi.azurewebsites.net/datasets/192ada7c-4d43-466e-b8bb-b81992bb80cf), copy the file `CHEXPERT DEMO.xlsx` to the `datafiles` folder
-3. Run the notebook `chexpert.sample.ipynb` to generate the study data
-4. Run the script `chexpert.disease.py` to train a disease detection model
-5. Run the script `chexpert.sex.py` to train a sex classification model
-6. Run the script `chexpert.race.py` to train a race classification model
-7. Run the notebook `chexpert.predictions.ipynb` to evaluate all three prediction models
-8. Run the notebook `chexpert.explorer.ipynb` for the unsupervised exploration of feature representations
+3. Run the notebook [`chexpert.sample.ipynb`](notebooks/chexpert.sample.ipynb) to generate the study data
+4. Adjust the variable `img_data_dir` to point to the imaging data and run the following scripts
+- Run the script [`chexpert.disease.py`](prediction/chexpert.disease.py) to train a disease detection model
+- Run the script [`chexpert.sex.py`](prediction/chexpert.sex.py) to train a sex classification model
+- Run the script [`chexpert.race.py`](prediction/chexpert.race.py) to train a race classification model
+5. Run the notebook `chexpert.predictions.ipynb` to evaluate all three prediction models
+6. Run the notebook `chexpert.explorer.ipynb` for the unsupervised exploration of feature representations
 
-Additionally, there are scripts `chexpert.sex.split.py` and `chexpert.race.split.py` to run SPLIT on the disease detection model. The default setting in all scripts is to train a DenseNet-121 using the training data from all patients. The results for models trained on subgroups only can be produced by changing the path to the datafiles (e.g., using `full_sample_train_white.csv` and `full_sample_val_white.csv` instead of `full_sample_train.csv` and `full_sample_val.csv`).
+Additionally, there are scripts [`chexpert.sex.split.py`](prediction/chexpert.sex.split.py) and [`chexpert.race.split.py`](prediction/chexpert.race.split.py) to run SPLIT on the disease detection model. The default setting in all scripts is to train a DenseNet-121 using the training data from all patients. The results for models trained on subgroups only can be produced by changing the path to the datafiles (e.g., using `full_sample_train_white.csv` and `full_sample_val_white.csv` instead of `full_sample_train.csv` and `full_sample_val.csv`).
 
 Note, the Python scripts also contain code for running the experiments using a ResNet-34 backbone which requires less GPU memory.
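
The README text in this change says that subgroup-only results come from swapping the datafile paths and that `img_data_dir` must point to the imaging data. A minimal sketch of that path switch could look like the following. The helper `subgroup_datafiles`, the `datafiles` location of the generated CSVs, and the placeholder image path are assumptions made for illustration and are not taken from the repository's scripts. Only the `white` subgroup suffix appears in the README.

```python
from pathlib import Path


def subgroup_datafiles(data_dir, subgroup=None):
    """Return (train_csv, val_csv) for all patients or for one subgroup.

    Hypothetical helper: it only mirrors the file naming shown in the README,
    e.g. full_sample_train.csv versus full_sample_train_white.csv.
    """
    suffix = f"_{subgroup}" if subgroup else ""
    data_dir = Path(data_dir)
    return (
        data_dir / f"full_sample_train{suffix}.csv",
        data_dir / f"full_sample_val{suffix}.csv",
    )


# Placeholder value: point this at the local copy of the imaging data.
img_data_dir = Path("/path/to/CheXpert/images")

# Default setting: training data from all patients.
train_csv, val_csv = subgroup_datafiles("datafiles")

# Subgroup setting: the 'white' example named in the README.
train_csv_white, val_csv_white = subgroup_datafiles("datafiles", subgroup="white")

print(train_csv.name, val_csv.name)
print(train_csv_white.name, val_csv_white.name)
```

In the actual scripts the paths are presumably set directly as variables, so in practice the change may be a one-line edit per file rather than a helper like this.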
