Bug: Re-Binarization of labels in pre-processing routine #18
Comments
Good catch! This is definitely a problem when the model expects a binary mask as input. However, when using SoftSeg, the input mask should/can be non-binary (a float ranging between 0 and 1). Also, I think the problematic line, referring to resampling the label (not the image), is this one:
So, in order to address this issue, one possibility would be to output both a binary and a non-binary mask, and your training config file would fetch the appropriate label.
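The suggestion above (emitting both a soft and a binarized label) could be sketched as follows. This is a numpy-only illustration; the function name and the default threshold are my own assumptions, not part of the actual pipeline:

```python
import numpy as np

def split_label(soft_label, threshold=0.5):
    """Return both a soft (float in [0, 1]) and a binarized copy of a label.

    The threshold of 0.5 is an illustrative assumption; the thread below
    discusses using a much smaller value (e.g. 1e-12) to keep borders.
    """
    soft = np.clip(soft_label.astype(np.float32), 0.0, 1.0)
    binary = (soft > threshold).astype(np.uint8)
    return soft, binary

soft, binary = split_label(np.array([0.0, 0.1, 0.6, 1.0]))
```

A training config could then point at whichever of the two files is appropriate for the model being trained.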
Hey @jqmcginnis, thanks for notifying about this! I have a few comments. As for the low scores obtained by @kiristern, I would attribute them to sub-optimal hyperparameters and/or the sizes of the input image patches (note, this was a 3D model with a relatively very large dimension in the S-I direction because of the stitching), rather than to anything to do with non-binarized labels. Now, as for nnUNet expecting binarized labels: I had the same problem with MONAI's default parameters, so I had to use a preprocessing transform on MONAI's side to convert the soft labels to binarized labels (granted, this already loses information, but some MONAI functions are not ready to accept soft labels). HENCE, I was wondering if nnUNet has something similar as well? That is, a conversion to binary labels before processing? OR, as @jcohenadad suggested, we could modify the pre-processing script to output both binary and soft labels instead and choose whichever is appropriate for the model we're training.
Don't you want to go with a cropped version of the input for the lesion segmentation task (i.e., a cascaded approach)?
I've looked into the different functions, and I have not discovered anything in the nnUNet repo to do this. I see two options.
Personally, I would recommend option 2), as this ensures that each current model (and future models) will always use the same binarized version; e.g., different libraries may use different thresholds for generating a binary label.
Hi all 🙂,
I just discovered this small bug / missing post-processing step in the pre-processing routine.
Discovery / Problem
When running nnUNet's
`nnUNet_plan_and_preprocess -t 501 --verify_dataset_integrity`
on the pre-processed dataset, we get the following error message:
Similarly, @kiristern is dealing with low Dice values for the Modified U-Net baseline:
Although I am not familiar with ivadomed, I suspect that the use of multi_class_dice_score indicates that ivadomed faces similar problems with the non-binary labels and interprets them as a multi-class problem instead... but why?
When we resample the images to an isotropic resolution, we introduce sampling artifacts: the edges of the labels are blurred, leading to smoothed contours. Thus, we observe values other than {0, 1} in the labels. This can easily be verified by looking at any of the many label examples in the dataset.
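The effect described here is easy to reproduce. In this minimal sketch, scipy's `zoom` stands in for the actual resampling step (an assumption on my part; the pipeline uses SCT's resampling, but any interpolating resampler shows the same behavior):

```python
import numpy as np
from scipy.ndimage import zoom

# A perfectly binary 2D "label": a square of ones on a zero background.
label = np.zeros((8, 8), dtype=np.float32)
label[2:6, 2:6] = 1.0

# Resampling with linear interpolation (order=1) blurs the label edges,
# introducing values strictly between 0 and 1 along the contour.
resampled = zoom(label, 1.5, order=1)
fractional = (resampled > 0) & (resampled < 1)
```

Checking `fractional.any()` confirms that the resampled label is no longer binary, which is exactly what trips up consumers that expect {0, 1} labels.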
Solution
We can mitigate this effect by adding the following post-processing step after this line:
bavaria-quebec/preprocessing/preprocess_data.sh
Line 137 in fc4f71d
sct_maths -i ${file}_T2w_crop_res.nii.gz -bin 1e-12 -o ${file}_T2w_crop_res.nii.gz
I haven't looked into finding an optimal value for the threshold
1e-12
, but I've chosen it to be extremely low so that we keep the whole label borders.