1) Dataset
The first training was done on 20 subjects from the spine-generic and OpenNeuro datasets. The ventral rootlets were segmented manually, while the dorsal rootlet segmentations were taken from the D5 dataset described here.
2) Model training
An nnUNet 3d_fullres model was trained on 20 subjects.
To initialize the dataset, the following command was used:
nnUNetv2_plan_and_preprocess -d 104 --verify_dataset_integrity -c 3d_fullres
To start the training, the following command was used:
CUDA_VISIBLE_DEVICES=0 nnUNetv2_train 104 3d_fullres 0
To run inference on new images, the following command was used:
nnUNetv2_predict -i nnUNet_raw/Dataset104_M1/imagesTs -o nnUNet_results/Dataset104_M1/labels_results -d 104 -c 3d_fullres -f 0
Here, the Dataset104_M1/imagesTs folder contains the images on which inference was run.
3) Results
Here are the learning curves for the training.
The next graph shows the performance of the V1 (Dataset101) and V2 (Dataset104) models on the test subjects. The V2 model produces better segmentations than the V1 model.
The relatively low mean Dice score and the large standard deviation arise because both the V1 and V2 models have difficulty labeling the spinal levels correctly on one of the test subjects, which results in a large number of spinal level mislabeling (SLM) errors (see the tables below).
V1 model performance on test subjects

| Image name | TP | SLM | FP | FN | Dice |
|---|---|---|---|---|---|
| sub-007_ses-headNormal_009.nii.gz | 2398 | 0 | 1454 | 2165 | 0.56993464 |
| sub-010_ses-headUp_015.nii.gz | 3817 | 0 | 535 | 2755 | 0.69882827 |
| sub-amu02_215.nii.gz | 645 | 769 | 813 | 1196 | 0.26669423 |
| sub-barcelona01_212.nii.gz | 1985 | 0 | 209 | 1803 | 0.66365764 |
| sub-brnoUhb03_209.nii.gz | 5390 | 0 | 1381 | 2934 | 0.71414376 |
| Mean | | | | | 0.58265171 |
V2 model performance on test subjects

| Image name | TP | SLM | FP | FN | Dice |
|---|---|---|---|---|---|
| sub-007_ses-headNormal_009.nii.gz | 2363 | 0 | 1279 | 2200 | 0.57599025 |
| sub-010_ses-headUp_015.nii.gz | 4158 | 0 | 516 | 2414 | 0.73946292 |
| sub-amu02_215.nii.gz | 976 | 310 | 686 | 1324 | 0.42601484 |
| sub-barcelona01_212.nii.gz | 2492 | 0 | 389 | 1296 | 0.74733843 |
| sub-brnoUhb03_209.nii.gz | 5357 | 0 | 1157 | 2967 | 0.72206497 |
| Mean | | | | | 0.64217428 |
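
The issue does not say how the Dice column is computed from the voxel counts. The reported values are consistent with counting each SLM voxel twice (once as a false positive for the wrong level and once as a false negative for the correct level), i.e. Dice = 2·TP / (2·TP + 2·SLM + FP + FN). This formula is inferred from the numbers in the tables, not confirmed by the evaluation script. The function name `dice_with_slm` is hypothetical. A minimal sketch that recomputes the V2 scores, their mean, and the sample standard deviation under that assumption:

```python
from statistics import mean, stdev


def dice_with_slm(tp: int, slm: int, fp: int, fn: int) -> float:
    """Dice score from voxel counts.

    Assumption (inferred from the tables, not from the evaluation code):
    each SLM voxel is penalized twice in the denominator.
    """
    denominator = 2 * tp + 2 * slm + fp + fn
    return 2 * tp / denominator if denominator else float("nan")


# (TP, SLM, FP, FN) per test image, copied from the V2 table above.
v2_counts = {
    "sub-007_ses-headNormal_009.nii.gz": (2363, 0, 1279, 2200),
    "sub-010_ses-headUp_015.nii.gz": (4158, 0, 516, 2414),
    "sub-amu02_215.nii.gz": (976, 310, 686, 1324),
    "sub-barcelona01_212.nii.gz": (2492, 0, 389, 1296),
    "sub-brnoUhb03_209.nii.gz": (5357, 0, 1157, 2967),
}

v2_dice = {name: dice_with_slm(*counts) for name, counts in v2_counts.items()}
v2_mean = mean(list(v2_dice.values()))
v2_std = stdev(list(v2_dice.values()))  # sample standard deviation

for name, score in v2_dice.items():
    print(f"{name}: {score:.8f}")
print(f"mean = {v2_mean:.8f}, std = {v2_std:.8f}")
```

Swapping in the V1 counts gives the V1 column in the same way. With only five test subjects, a single outlier such as sub-amu02 has a strong effect on both the mean and the standard deviation.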
For this specific subject (sub-amu02), the spine is more strongly curved at the lower levels, which results in SLM at those levels, as shown in the image below (left: V1 model; right: V2 model). Green pixels are correctly labeled, yellow pixels are false negatives, red pixels are false positives, and blue pixels are SLM.
It's great to see that the model has improved (i.e., Dice is higher) when more subjects are added!
Could you please run training_scripts/plot_nnunet_training_log.py to generate a figure showing the validation pseudo Dice for each class (i.e., each rootlet level), so we can see which levels the model struggles with?
Also, the figure is slightly difficult to follow; could you please generate a GIF (you can toggle overlays), for example, as done here.