Jinan Bao, Hanshi Sun, Hanqiu Deng, Yinsheng He, Zhaoxiang Zhang, Xingyu Li†
Our paper can be accessed at arXiv.
In medical imaging, anomaly detection (AD) is especially vital for detecting and diagnosing anomalies that may indicate rare diseases or conditions. However, there is no universal and fair benchmark for evaluating AD methods on medical images, which hinders the development of more generalized and robust AD methods in this domain. To bridge this gap, we introduce a comprehensive evaluation benchmark for assessing anomaly detection methods on medical images. The benchmark encompasses six reorganized datasets from five medical domains (i.e., brain MRI, liver CT, retinal OCT, chest X-ray, and digital histopathology), three key evaluation metrics, and a total of fourteen state-of-the-art AD algorithms. This standardized, well-curated medical benchmark and its well-structured codebase enable comprehensive comparisons among recently proposed anomaly detection methods. It will help the community conduct fair comparisons and advance the field of AD on medical imaging.
BMAD includes six medical datasets from five different domains for medical anomaly detection, as summarized in Table 1. Of these datasets, three support pixel-level evaluation of anomaly detection, while the remaining three are for sample-level assessment only. Note that the validation set is intended for model hyper-parameter tuning and training strategy selection, while the test set should remain untouched until the final evaluation stage. The datasets can be downloaded from Google Drive.
Taking the Histopathology dataset (anomaly detection) as an example, the structure is as follows:
camelyon16
├── train
│   └── good
│       ├── 1000.png
│       ├── 1001.png
│       └── ...
├── valid
│   ├── good
│   │   ├── 1080.png
│   │   ├── 1081.png
│   │   └── ...
│   └── Ungood
│       ├── 1000.png
│       ├── 1001.png
│       └── ...
└── test
    ├── good
    │   ├── 1000.png
    │   ├── 1001.png
    │   └── ...
    └── Ungood
        ├── 100.png
        ├── 101.png
        └── ...
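The sample-level layout above can be walked with a few lines of Python. The following is a minimal sketch, not part of the BMAD codebase: the function name `list_samples` and the 0/1 label encoding are assumptions made for illustration, and only the folder names (`train`, `valid`, `test`, `good`, `Ungood`) come from the tree above.

```python
from pathlib import Path

# Assumption: "good" holds normal images and "Ungood" holds anomalous ones,
# as in the directory tree above. The 0/1 encoding is an illustrative choice.
LABELS = {"good": 0, "Ungood": 1}


def list_samples(root, split):
    """Return sorted (image_path, label) pairs for one split of a sample-level dataset."""
    samples = []
    for class_name, label in LABELS.items():
        class_dir = Path(root) / split / class_name
        if not class_dir.is_dir():
            # The train split contains only "good" images, so "Ungood" may be absent.
            continue
        for image_path in sorted(class_dir.glob("*.png")):
            samples.append((image_path, label))
    return samples
```

For instance, `list_samples("camelyon16", "valid")` would return the validation images with their labels, assuming the dataset has been unpacked into the current directory.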
Taking the Brain dataset (anomaly detection and localization) as an example, the structure is as follows:
Brain
├── train
│   └── good
│       └── img
│           ├── 00003_60.png
│           ├── 00003_61.png
│           └── ...
├── valid
│   ├── good
│   │   └── img
│   │       ├── 00025_99.png
│   │       ├── 00100_60.png
│   │       └── ...
│   └── Ungood
│       ├── img
│       │   ├── 00124_60.png
│       │   ├── 00124_70.png
│       │   └── ...
│       └── anomaly_mask
│           ├── 00124_60.png
│           ├── 00124_70.png
│           └── ...
└── test
    ├── good
    │   └── img
    │       ├── 00000_96.png
    │       ├── 00000_97.png
    │       └── ...
    └── Ungood
        ├── img
        │   ├── 00002_60.png
        │   ├── 00002_68.png
        │   └── ...
        └── anomaly_mask
            ├── 00002_60.png
            ├── 00002_68.png
            └── ...
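For the pixel-level datasets, each anomalous image has a mask with the same file name under `anomaly_mask`. A minimal sketch of pairing the two is given below. It is not the BMAD data loader: the function name `pair_images_with_masks` and the choice of `None` as the mask for normal images are assumptions for illustration.

```python
from pathlib import Path


def _sorted_pngs(directory):
    """Return the sorted PNG files in a directory, or an empty list if it is missing."""
    return sorted(directory.glob("*.png")) if directory.is_dir() else []


def pair_images_with_masks(root, split):
    """Return (image_path, mask_path_or_None, label) triples for one split."""
    split_dir = Path(root) / split
    triples = []
    # Normal images have no mask in the layout above, so the mask is None.
    for image_path in _sorted_pngs(split_dir / "good" / "img"):
        triples.append((image_path, None, 0))
    # Anomalous images are matched to their masks by identical file name.
    mask_dir = split_dir / "Ungood" / "anomaly_mask"
    for image_path in _sorted_pngs(split_dir / "Ungood" / "img"):
        mask_path = mask_dir / image_path.name
        if not mask_path.is_file():
            raise FileNotFoundError(f"no anomaly mask found for {image_path}")
        triples.append((image_path, mask_path, 1))
    return triples
```

Raising on a missing mask, rather than skipping the image silently, is a deliberate choice here: a pixel-level metric computed on a partially matched set would be hard to interpret.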
BMAD supports 14 algorithms; their results are shown in Table 2. Our trained checkpoints can be downloaded from Google Drive.
You can train a model by running main.py with arguments. For example, to train an RD4AD model on the RESC dataset, run the following command:
python main.py --mode train --data RESC --model RD4AD
You can change the hyperparameters by modifying the config files in the config/ folder. Taking the cflow model as an example, the hyperparameters for the cflow model on the camelyon dataset are in the config/camelyon_cflow.yaml file:
...
coupling_blocks: 8
clamp_alpha: 1.9
fiber_batch_size: 64
lr: 0.0001
...
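When sweeping a hyperparameter, editing the file by hand becomes tedious. The sketch below rewrites one value in the text of such a config using only the standard library. It is an illustration under a strong assumption: the hyperparameter sits on its own `key: value` line, as in the excerpt above. It is not a general YAML parser, it drops any trailing comment on the edited line, and the helper name `override_hyperparameter` is hypothetical.

```python
import re


def override_hyperparameter(config_text, key, value):
    """Return config_text with the first `key: ...` line set to the new value."""
    # Match the key at the start of a line (optionally indented) and keep the
    # "key: " prefix so the original indentation and spacing are preserved.
    pattern = re.compile(rf"^([ \t]*{re.escape(key)}[ \t]*:[ \t]*).*$", re.MULTILINE)
    new_text, count = pattern.subn(
        lambda match: f"{match.group(1)}{value}", config_text, count=1
    )
    if count == 0:
        raise KeyError(f"hyperparameter {key!r} not found in config")
    return new_text
```

A sweep script could read config/camelyon_cflow.yaml, call `override_hyperparameter(text, "lr", 0.001)`, and write the result to a copy of the file before launching a run.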
You can test a model by running main.py with arguments. For example, to test a PaDiM model on the liver dataset with the weight file results/padim/liver/run/weights/model.ckpt, run the following command:
python main.py --mode test --data liver --model padim --weight results/padim/liver/run/weights/model.ckpt
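To run several models or datasets in sequence, the train and test commands above can be assembled programmatically. This is a minimal sketch: the flags `--mode`, `--data`, `--model`, and `--weight` are the ones shown in the commands above, while the helper names `build_command` and `run_experiment` are hypothetical and not part of the BMAD codebase.

```python
import subprocess


def build_command(mode, data, model, weight=None):
    """Build the argument list for one main.py invocation."""
    command = ["python", "main.py", "--mode", mode, "--data", data, "--model", model]
    if weight is not None:
        # The weight file is only passed in the test example above.
        command += ["--weight", weight]
    return command


def run_experiment(mode, data, model, weight=None):
    """Run main.py from the current directory and fail loudly on a non-zero exit."""
    subprocess.run(build_command(mode, data, model, weight), check=True)
```

Called from the repository root, `run_experiment("train", "RESC", "RD4AD")` would launch the same training run as the command shown earlier.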
Brain MRI Anomaly Detection and Localization Benchmark : BraTS2021 Dataset
Liver CT Anomaly Detection and Localization Benchmark : BTCV Dataset + LiTS Dataset
Retinal OCT Anomaly Detection and Localization Benchmark : RESC Dataset + OCT2017 Dataset
Chest X-ray Anomaly Detection Benchmark : RSNA Dataset
Digital Histopathology Anomaly Detection Benchmark : Camelyon16 Dataset
RD4AD, PatchCore, DRAEM, DeepSVDD, MKD, PaDiM, CFLOW, CS-Flow, CutPaste, GANomaly, UTRAD, STFPM, f-AnoGAN, CFA
Our original resources and supported algorithms come from the above references; thanks to their authors for their splendid work!