Lower performance of later version (eg: 2.5 vs. 2.0) #126

Open
jcohenadad opened this issue Nov 28, 2024 · 9 comments

@jcohenadad (Member) commented Nov 28, 2024

[screenshot attached]

Possible avenues:

  • more aggressive data augmentation
  • more careful selection of the balance across all contrasts
@jcohenadad (Member, Author)

More aggressive data augmentation was already tested in #119 (specifically, these params); however, the results did not improve.

Maybe @NathanMolinier or @yw7 have some insights, as they used very aggressive DA; maybe there are some transformations that were not explored here?

@NathanMolinier

All the transformations used to train totalspineseg are available here.

@naga-karthik (Collaborator) commented Nov 28, 2024

Here are the categories of transformations applied in totalspineseg, as per these lines:

    Augmentation includes:
    1. Contrast augmentation (Laplace, Gamma, Histogram Equalization, Log, Sqrt, Exp, Sin, Sig, Inverse)
    2. Image from segmentation augmentation
    3. Redistribute segmentation values
    4. Artifacts augmentation (Motion, Ghosting, Spike, Bias Field, Blur, Noise)
    5. Spatial augmentation (Flip, BSpline, Affine, Elastic)
    6. Anisotropy augmentation
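
For concreteness, here is a minimal sketch of what a comparable composition could look like with TorchIO (an illustration only, not totalspineseg's actual code, which implements its own transforms; categories 2 and 3 have no off-the-shelf TorchIO equivalent):

```python
import torchio as tio

# Illustrative approximation of the augmentation categories listed above,
# built from standard TorchIO transforms (NOT totalspineseg's actual pipeline).
augment = tio.Compose([
    # 1. Contrast augmentation (only Gamma has a direct TorchIO equivalent;
    #    Laplace/Log/Sqrt/Exp/Sin/Sig/Inverse would need custom remaps)
    tio.RandomGamma(log_gamma=(-0.3, 0.3), p=0.5),
    # 4. Artifacts augmentation: pick one artifact at a time
    tio.OneOf({
        tio.RandomMotion(): 0.2,
        tio.RandomGhosting(): 0.2,
        tio.RandomSpike(): 0.1,
        tio.RandomBiasField(): 0.2,
        tio.RandomBlur(): 0.15,
        tio.RandomNoise(): 0.15,
    }, p=0.5),
    # 5. Spatial augmentation (applied jointly to image and label)
    tio.RandomFlip(axes=('LR',), p=0.5),
    tio.RandomAffine(scales=0.1, degrees=10, translation=5, p=0.5),
    tio.RandomElasticDeformation(p=0.2),
    # 6. Anisotropy augmentation (simulate thick-slice acquisitions)
    tio.RandomAnisotropy(downsampling=(1.5, 4), p=0.3),
])
```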

In my set of transforms, I cover:

  1. Contrast augmentation (Gamma; L29)
  2. Not applicable because it is essentially SynthSeg: it works well when synthesizing an image from many labels, but not when only the SC label is present (the resulting image is too unrealistic)
  3. Could look into it
  4. Artifacts augmentation (Bias Field, Blur, Noise; L30-32)
  5. Spatial (Affine, Flip and Elastic; L21, L25, and L34)
  6. LowResolution (L28)

I notice that I indeed cover most of the transforms used in TotalSpineSeg (also with larger probabilities). There are 1-2 transforms I have not used yet (e.g. ghosting and motion), which I will consider adding.

@NathanMolinier
Copy link

I believe you should try more contrast augmentations, like Laplace, Log, Sqrt, Exp, Sin, Sig, Inverse... Gamma is definitely not enough.
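
To make this concrete, a minimal sketch of what such intensity remaps could look like (my own illustration; the exact formulas used in totalspineseg may differ), applied to an image normalized to [0, 1]:

```python
import numpy as np

def _rescale(x):
    """Rescale intensities to [0, 1]."""
    x = x - x.min()
    return x / max(x.max(), 1e-8)

# Simple intensity remaps in the spirit of the transforms listed above
# (illustrative only).
CONTRAST_REMAPS = {
    "log":     lambda x: np.log1p(x),
    "sqrt":    lambda x: np.sqrt(x),
    "exp":     lambda x: np.exp(x),
    "sin":     lambda x: np.sin(x * np.pi / 2),
    "sigmoid": lambda x: 1 / (1 + np.exp(-10 * (x - 0.5))),
    "inverse": lambda x: 1 - x,
}

def random_contrast_remap(img, rng=None):
    """Apply one randomly chosen remap to an image normalized to [0, 1]."""
    rng = np.random.default_rng() if rng is None else rng
    x = _rescale(img.astype(np.float32))
    name = rng.choice(list(CONTRAST_REMAPS))
    return _rescale(CONTRAST_REMAPS[name](x)), name
```

Each remap yields a different, sometimes unrealistic, contrast from the same anatomy, which is presumably what pushes the model toward contrast-invariant features.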

@naga-karthik (Collaborator)

do you have some references that have used these transformations (i.e. log, sqrt, exp, etc.)? curious to know how these simple math operations are helping

@NathanMolinier

I can't tell which transformation helped the most, but I know that totalspineseg works on MP2RAGE, PSIR, proton density or CT scans without being trained on these contrasts.

@NathanMolinier commented Nov 28, 2024

A good test would be to run inference multiple times on the same image after applying different transformations, to evaluate the impact of each on the prediction.
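
A minimal sketch of such a test, assuming a hypothetical run_model() hook around the actual inference call and restricting to intensity-only transforms so predictions stay in the same space and can be compared voxel-wise:

```python
import numpy as np
import torchio as tio

def dice(a, b):
    """Dice overlap between two binary masks."""
    inter = np.logical_and(a, b).sum()
    return 2.0 * inter / max(a.sum() + b.sum(), 1)

def prediction_stability(run_model, image_path, threshold=0.5):
    """Compare predictions on perturbed copies of one image to the baseline.

    run_model is a placeholder: any callable mapping a TorchIO ScalarImage to a
    soft segmentation (numpy array in the same space), to be wired to the actual
    inference code. Intensity-only transforms avoid the need to resample back.
    """
    perturbations = {
        "gamma":      tio.RandomGamma(log_gamma=(-0.3, 0.3)),
        "bias_field": tio.RandomBiasField(),
        "noise":      tio.RandomNoise(std=(0, 0.05)),
        "blur":       tio.RandomBlur(std=(0, 1)),
        "ghosting":   tio.RandomGhosting(),
    }
    subject = tio.Subject(image=tio.ScalarImage(image_path))
    baseline = run_model(subject.image) > threshold
    for name, transform in perturbations.items():
        perturbed = transform(subject)
        pred = run_model(perturbed.image) > threshold
        print(f"{name}: Dice vs. un-augmented prediction = {dice(pred, baseline):.3f}")
```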

@yw7 commented Nov 29, 2024

[screenshot attached]

It seems to me that the model is able to recognize the spinal cord quite well but misses some voxels at the edges. This suggests the issue might not be the lack of augmentation but rather problems with the ground truth labels or, most probably, certain data augmentation transformations. This might be some new dataset or new data augmentation that was introduced in the later versions. For example, some transformations might have misaligned the image and segmentation by a few voxels, or certain augmentations could have artificially altered the spinal cord boundaries (e.g., enlarging or modifying the edge voxels). As a result, the model might have learned to exclude these edge voxels from the segmentation.
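
If this hypothesis is worth checking, one safeguard on the augmentation side (sketched here with TorchIO, assuming that or a similar backend; adapt to whatever the training pipeline actually uses, and with example file paths) is to apply every spatial transform to image and label through a single Subject:

```python
import torchio as tio

# Wrapping image and label in one Subject guarantees that a spatial transform
# uses the same parameters for both, and that the LabelMap is resampled with
# nearest-neighbor interpolation, so the mask cannot shift relative to the image.
subject = tio.Subject(
    image=tio.ScalarImage("sub-01_T2w.nii.gz"),       # example paths
    seg=tio.LabelMap("sub-01_T2w_seg.nii.gz"),
)

spatial = tio.Compose([
    tio.RandomAffine(scales=0.1, degrees=10, translation=5),
    tio.RandomElasticDeformation(num_control_points=7, max_displacement=7),
])

augmented = spatial(subject)

# Sanity check: image and label still share the same grid after augmentation.
assert augmented.image.spatial_shape == augmented.seg.spatial_shape
```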

From my experience, data augmentation helps the model generalize to detect the object (in this case, the spinal cord) even under varying contrasts not present in the training set. However, this doesn’t seem to be the issue here. It might be worth revisiting the augmentations to identify any that could have unintentionally introduced such inconsistencies.

Besides that, as @NathanMolinier mentioned, it is worth testing whether some of the methods used in totalspineseg can help or replace some data augmentation here, to make the model generalize better without affecting the segmentation.

In addition, the focus of totalspineseg was on the labeling and not on the accuracy of the segmentation, so I am not sure that what works there will work here. Also, in totalspineseg the model is trained to learn the level (cervical, thoracic, lumbar), which might help it generalize to detect the specific structure of the spinal cord in each region. The use of more classes might also help the model gain more knowledge and force it to generalize.

@naga-karthik (Collaborator)

    suggests the issue might not be the lack of augmentation but rather problems with the ground truth labels

@yw7 I have the same feeling as well!

    This might be some new dataset or new data augmentation that was introduced in the later versions.

Precisely! That's why I kept the augmentations the same as in v2.0 but added a lot more datasets, causing a heavy imbalance in the number of images per contrast.
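
On the imbalance point, one possible mitigation (a sketch only, not something already in the repo as far as I know) is to oversample under-represented contrasts with a weighted sampler, so that each contrast contributes roughly equally per epoch:

```python
from collections import Counter

import torch
from torch.utils.data import DataLoader, WeightedRandomSampler

def make_contrast_balanced_loader(dataset, contrasts, batch_size=4):
    """Build a DataLoader where each contrast is drawn roughly equally often.

    dataset[i] must correspond to contrasts[i] (e.g. 'T1w', 'T2w', 'T2star', ...).
    Each sample is weighted by 1 / (count of its contrast), so contrasts with
    few images are oversampled (with replacement) relative to abundant ones.
    """
    counts = Counter(contrasts)
    weights = torch.tensor([1.0 / counts[c] for c in contrasts], dtype=torch.double)
    sampler = WeightedRandomSampler(weights, num_samples=len(dataset), replacement=True)
    return DataLoader(dataset, batch_size=batch_size, sampler=sampler)
```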
