Lower performance of later version (eg: 2.5 vs. 2.0) #126

Open
jcohenadad opened this issue Nov 28, 2024 · 9 comments

@jcohenadad (Member) commented Nov 28, 2024

[screenshot attached]

Possible avenues:

  • more aggressive data augmentation
  • more careful selection of the balance across all contrasts
@jcohenadad (Member, Author)

More aggressive data augmentation was already tested in #119 (specifically, these params); however, the results did not improve.

Maybe @NathanMolinier or @yw7 have some insights, as they used very aggressive DA; maybe there are some transformations that were not explored here?

@NathanMolinier

All the transformations used to train totalspineseg are available here.

@naga-karthik (Collaborator) commented Nov 28, 2024

Here are the categories of transformations applied in totalspineseg, as per these lines:

    Augmentation includes:
    1. Contrast augmentation (Laplace, Gamma, Histogram Equalization, Log, Sqrt, Exp, Sin, Sig, Inverse)
    2. Image from segmentation augmentation
    3. Redistribute segmentation values
    4. Artifacts augmentation (Motion, Ghosting, Spike, Bias Field, Blur, Noise)
    5. Spatial augmentation (Flip, BSpline, Affine, Elastic)
    6. Anisotropy augmentation
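
For concreteness, here is a minimal sketch of what a comparable composition could look like with TorchIO (an illustration only, not totalspineseg's actual code, which implements its own transforms; categories 2 and 3 have no off-the-shelf TorchIO equivalent):

```python
import torchio as tio

# Illustrative approximation of the augmentation categories listed above,
# built from standard TorchIO transforms (NOT totalspineseg's actual pipeline).
augment = tio.Compose([
    # 1. Contrast augmentation (only Gamma has a direct TorchIO equivalent;
    #    Laplace/Log/Sqrt/Exp/Sin/Sig/Inverse would need custom remaps)
    tio.RandomGamma(log_gamma=(-0.3, 0.3), p=0.5),
    # 4. Artifacts augmentation: pick one artifact at a time
    tio.OneOf({
        tio.RandomMotion(): 0.2,
        tio.RandomGhosting(): 0.2,
        tio.RandomSpike(): 0.1,
        tio.RandomBiasField(): 0.2,
        tio.RandomBlur(): 0.15,
        tio.RandomNoise(): 0.15,
    }, p=0.5),
    # 5. Spatial augmentation (applied jointly to image and label)
    tio.RandomFlip(axes=('LR',), p=0.5),
    tio.RandomAffine(scales=0.1, degrees=10, translation=5, p=0.5),
    tio.RandomElasticDeformation(p=0.2),
    # 6. Anisotropy augmentation (simulate thick-slice acquisitions)
    tio.RandomAnisotropy(downsampling=(1.5, 4), p=0.3),
])
```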

In my set of transforms, I cover:

  1. Contrast augmentation (Gamma; L29)
  2. Not applicable because it is essentially SynthSeg: it works well when synthesizing an image from many labels, but not when only the SC label is present (the resulting image is too unrealistic)
  3. Could look into it
  4. Artifacts augmentation (Bias Field, Blur, Noise; L30-32)
  5. Spatial (Affine, Flip and Elastic; L21, L25, and L34)
  6. LowResolution (L28)

I notice that I indeed cover most of the transforms used in TotalSpineSeg (also with larger probabilities). There are 1-2 transforms I have not used yet (e.g. ghosting and motion), which I will consider adding.

@NathanMolinier
Copy link

I believe you should try more contrast augmentations, like Laplace, Log, Sqrt, Exp, Sin, Sig, Inverse... Gamma is definitely not enough.
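
To make this concrete, a minimal sketch of what such intensity remaps could look like (my own illustration; the exact formulas used in totalspineseg may differ), applied to an image normalized to [0, 1]:

```python
import numpy as np

def _rescale(x):
    """Rescale intensities to [0, 1]."""
    x = x - x.min()
    return x / max(x.max(), 1e-8)

# Simple intensity remaps in the spirit of the transforms listed above
# (illustrative only).
CONTRAST_REMAPS = {
    "log":     lambda x: np.log1p(x),
    "sqrt":    lambda x: np.sqrt(x),
    "exp":     lambda x: np.exp(x),
    "sin":     lambda x: np.sin(x * np.pi / 2),
    "sigmoid": lambda x: 1 / (1 + np.exp(-10 * (x - 0.5))),
    "inverse": lambda x: 1 - x,
}

def random_contrast_remap(img, rng=None):
    """Apply one randomly chosen remap to an image normalized to [0, 1]."""
    rng = np.random.default_rng() if rng is None else rng
    x = _rescale(img.astype(np.float32))
    name = rng.choice(list(CONTRAST_REMAPS))
    return _rescale(CONTRAST_REMAPS[name](x)), name
```

Each remap yields a different, sometimes unrealistic, contrast from the same anatomy, which is presumably what pushes the model toward contrast-invariant features.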

@naga-karthik (Collaborator)

do you have some references that have used these transformations (i.e. log, sqrt, exp, etc.)? curious to know how these simple math operations are helping

@NathanMolinier

I can't tell which transformation helped the most, but I know that totalspineseg works on MP2RAGE, PSIR, proton density or CT scans without being trained on these contrasts.

@NathanMolinier commented Nov 28, 2024

A good test would be to run inference multiple times on the same image after applying different transformations, to evaluate the impact of each on the prediction.
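
A minimal sketch of such a test, assuming a hypothetical run_model() hook around the actual inference call and restricting to intensity-only transforms so predictions stay in the same space and can be compared voxel-wise:

```python
import numpy as np
import torchio as tio

def dice(a, b):
    """Dice overlap between two binary masks."""
    inter = np.logical_and(a, b).sum()
    return 2.0 * inter / max(a.sum() + b.sum(), 1)

def prediction_stability(run_model, image_path, threshold=0.5):
    """Compare predictions on perturbed copies of one image to the baseline.

    run_model is a placeholder: any callable mapping a TorchIO ScalarImage to a
    soft segmentation (numpy array in the same space), to be wired to the actual
    inference code. Intensity-only transforms avoid the need to resample back.
    """
    perturbations = {
        "gamma":      tio.RandomGamma(log_gamma=(-0.3, 0.3)),
        "bias_field": tio.RandomBiasField(),
        "noise":      tio.RandomNoise(std=(0, 0.05)),
        "blur":       tio.RandomBlur(std=(0, 1)),
        "ghosting":   tio.RandomGhosting(),
    }
    subject = tio.Subject(image=tio.ScalarImage(image_path))
    baseline = run_model(subject.image) > threshold
    for name, transform in perturbations.items():
        perturbed = transform(subject)
        pred = run_model(perturbed.image) > threshold
        print(f"{name}: Dice vs. un-augmented prediction = {dice(pred, baseline):.3f}")
```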

@yw7 commented Nov 29, 2024

[screenshot attached]

It seems to me that the model is able to recognize the spinal cord quite well but misses some voxels at the edges. This suggests the issue might not be the lack of augmentation but rather problems with the ground truth labels or, most probably, certain data augmentation transformations. This might be some new dataset or new data augmentation that was introduced in the later versions. For example, some transformations might have misaligned the image and segmentation by a few voxels, or certain augmentations could have artificially altered the spinal cord boundaries (e.g., enlarging or modifying the edge voxels). As a result, the model might have learned to exclude these edge voxels from the segmentation.
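
If this hypothesis is worth checking, one safeguard on the augmentation side (sketched here with TorchIO, assuming that or a similar backend; adapt to whatever the training pipeline actually uses, and with example file paths) is to apply every spatial transform to image and label through a single Subject:

```python
import torchio as tio

# Wrapping image and label in one Subject guarantees that a spatial transform
# uses the same parameters for both, and that the LabelMap is resampled with
# nearest-neighbor interpolation, so the mask cannot shift relative to the image.
subject = tio.Subject(
    image=tio.ScalarImage("sub-01_T2w.nii.gz"),       # example paths
    seg=tio.LabelMap("sub-01_T2w_seg.nii.gz"),
)

spatial = tio.Compose([
    tio.RandomAffine(scales=0.1, degrees=10, translation=5),
    tio.RandomElasticDeformation(num_control_points=7, max_displacement=7),
])

augmented = spatial(subject)

# Sanity check: image and label still share the same grid after augmentation.
assert augmented.image.spatial_shape == augmented.seg.spatial_shape
```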

From my experience, data augmentation helps the model generalize to detect the object (in this case, the spinal cord) even under varying contrasts not present in the training set. However, this doesn’t seem to be the issue here. It might be worth revisiting the augmentations to identify any that could have unintentionally introduced such inconsistencies.

Besides that, as @NathanMolinier mentioned, it is worth testing whether some of the methods used in totalspineseg can help or replace some data augmentation here, to make the model generalize better without affecting the segmentation.

In addition, the focus of totalspineseg was on the labeling and not on the accuracy of the segmentation, so I am not sure that what works there will work here. Also, in totalspineseg the model is trained to learn the level (cervical, thoracic, lumbar), which might help it generalize to detect the specific structure of the spinal cord in each region. The use of more classes might also help the model gain more knowledge and force it to generalize.

@naga-karthik (Collaborator)

    suggests the issue might not be the lack of augmentation but rather problems with the ground truth labels

@yw7 I have the same feeling as well!

    This might be some new dataset or new data augmentation that was introduced in the later versions.

Precisely! That's why I kept the augmentations the same as in v2.0 but added a lot more datasets, causing a heavy imbalance in the number of images per contrast.
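
On the imbalance point, one possible mitigation (a sketch only, not something already in the repo as far as I know) is to oversample under-represented contrasts with a weighted sampler, so that each contrast contributes roughly equally per epoch:

```python
from collections import Counter

import torch
from torch.utils.data import DataLoader, WeightedRandomSampler

def make_contrast_balanced_loader(dataset, contrasts, batch_size=4):
    """Build a DataLoader where each contrast is drawn roughly equally often.

    dataset[i] must correspond to contrasts[i] (e.g. 'T1w', 'T2w', 'T2star', ...).
    Each sample is weighted by 1 / (count of its contrast), so contrasts with
    few images are oversampled (with replacement) relative to abundant ones.
    """
    counts = Counter(contrasts)
    weights = torch.tensor([1.0 / counts[c] for c in contrasts], dtype=torch.double)
    sampler = WeightedRandomSampler(weights, num_samples=len(dataset), replacement=True)
    return DataLoader(dataset, batch_size=batch_size, sampler=sampler)
```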
