MHub / GC - Add SPIDER baseline model/algorithm #53

Merged
merged 26 commits into from
Apr 19, 2024

Conversation

silvandeleemput
Contributor

@silvandeleemput silvandeleemput commented Sep 5, 2023

This PR adds the SPIDER IIS baseline model (from the SPIDER challenge) to MHub.
GitHub Repo: https://github.com/DIAGNijmegen/SPIDER-Baseline-IIS
GC page: https://grand-challenge.org/algorithms/spider-baseline-iis/

Algorithm I/O

Caveats

  • The PR target is main, but should be something like m-gc-spider-baseline
  • In the Dockerfile the MHub model repo integration is currently marked TODO since it requires the creation of the appropriate branch for this code first.
  • Support for Dicom input was initially not added for this algorithm since we could not find any specific Spine MR data on IDC; if it exists, let us know and we can add it. Update: support has been added.
  • Support for DicomSeg output was initially not added for this algorithm because of the complexity and sheer number of output segmentation labels. Update: basic support for DicomSeg has been added, covering labels 1-25 (the different vertebrae, numbered from the bottom, i.e. L5 = 1) through the dseg rois.

@LennyN95
Member

LennyN95 commented Sep 7, 2023

Hi Sil,

Support for DicomSeg output has not been added for this algorithm because of the complexity and sheer number of output segmentation labels.

We need to discuss this, ideally with @fedorov.

The vertebrae (1-25) should all be found in SegDB. For the partially visible vertebrae, maybe there is some kind of modifier to indicate incompleteness; Andrey will know this best.

SCT has codes for the intervertebral discs, so this shouldn't be a problem.

So I think this is solvable; if some variable delineations cannot be covered by Dicom for any reason, we might exclude them and only predict those that are (we can always export additional data). But those we can standardize, we need to. After all, a Dicom-Dicom default pipeline is a hard requirement.

Best,
Leo.

@fedorov
Member

fedorov commented Sep 7, 2023

Sagittal spine MR image (MHA) [...] Support for Dicom input has not been added for this algorithm since we could not find any specific Spine MR data on IDC, if it exists let us know then we can add it.

My understanding is that availability of the applicable data in IDC was precisely one of the criteria used to select the models for the contract. So I assume you did find it at the time the selection was made, right?

Sagittal spine MR segmentation:

  • 0: background
  • 1-25: different vertebrae, numbered from the bottom (i.e. L5 = 1)
  • 100: spinal canal
  • 101-125: partially visible vertebrae
  • 201-225: different intervertebral discs (also numbered from the bottom, i.e. L5/S1 = 201)

Individual vertebrae are already mapped into SNOMED in https://github.com/MHubAI/models/blob/main/models/totalsegmentator/config/dicomseg_metadata_whole.json, so we can just reuse those.

For the spinal canal, is there an anatomic designation of spinal canal segments corresponding to the individual vertebrae? If not, I am not sure if it makes sense to have a vertebrae-labeled canal segmentation. Also, I am confused about "partially visible vertebrae". What is the reasoning for the logic: "if vertebra is partial, we segment spinal canal"?

image

For the disks, there are codes to leverage from SNOMED (https://browser.ihtsdotools.org/?perspective=full&conceptId1=404684003&edition=MAIN/2023-09-01&release=&languages=en):

image

@silvandeleemput
Contributor Author

My understanding is that availability of the applicable data in IDC was precisely one of the criteria used to select the models for the contract. So I assume you did find it at the time the selection was made, right?

Yes, there is some spine MR available on IDC:

However, we found out when adding the model that these are not very suited for running with the model. There is however a very suitable open-source DICOM dataset available outside of IDC here: https://www.cg.informatik.uni-siegen.de/en/spine-segmentation-and-analysis. Hence, I have re-enabled DICOM support for the inputs for this model.

Thanks for the detailed instructions regarding the DicomSeg labels. I think it should be possible to add these, but I understood that it is required for the segmentation labels to be presented in sequence without any gaps in order for the DicomSegConverter to properly work. Wouldn't this require us to remap the internal labels to be without any gaps?
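To make the question above concrete, here is a minimal sketch of a gap-free remapping. It assumes the segmentation is available as a NumPy integer array with non-negative labels; the function name `make_labels_consecutive` is hypothetical and not part of MHub or the DicomSegConverter.

```python
import numpy as np


def make_labels_consecutive(seg):
    """Relabel `seg` so that its labels run 0..N without gaps.

    Returns the relabelled array and a dict mapping each new label
    back to the original label. Background (0) always stays 0.
    """
    seg = np.asarray(seg)
    # Sorted unique labels; 0 is appended so background keeps index 0
    # even if the image happens to contain no background voxels.
    original = np.unique(np.append(seg.ravel(), 0))
    # The position of each voxel value in the sorted label list is its new label.
    relabelled = np.searchsorted(original, seg).astype(seg.dtype)
    new_to_original = {i: int(v) for i, v in enumerate(original)}
    return relabelled, new_to_original
```

The returned dictionary lets the original SPIDER label be recovered for each consecutive label, which is needed to pick the right segment code afterwards.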

Also, I am confused about "partially visible vertebrae". What is the reasoning for the logic: "if vertebra is partial, we segment spinal canal"?

I am not entirely sure I understand the algorithm authors' idea here, but it seems to be something like what you mentioned: if a vertebra is on the edge of the image and not fully visible within the scan (so partially visible), label it as such, then search for the next segment, label it, and so on.

@fedorov
Member

fedorov commented Sep 12, 2023

However, we found out when adding the model that these are not very suited for running with the model.

Can you please elaborate on this?

@silvandeleemput
Contributor Author

silvandeleemput commented Sep 13, 2023

Can you please elaborate on this?

Yes, we tried running the algorithm on some of the MR spine IDC data, but the algorithm failed to segment any spine segments, resulting in failing and empty output segmentations. Upon inspection, the MR images found in the IDC data have fewer slices and a much higher slice thickness and also have more surrounding organ tissue than the scans the model was trained on which seems to be causing the issue. The other DICOM dataset that I mentioned contains MR Spine images with properties closer to the original training data set and returns good output segmentations.

@silvandeleemput
Contributor Author

Basic support for DicomSeg has been added with support for 1-25: different vertebrae numbered from the bottom (i.e. L5 = 1) through the dseg rois. Currently, the partially visible vertebrae labels have just been remapped to the normal vertebrae labels, and the spinal canal and intervertebral discs are ignored.

The implementation was achieved through using the DsegConverter using the already available dseg rois. If the other segmentation rois get added it should be very easy to add the other segmentation outputs as well (remapping code is already available). If you guys prefer a dseg.json setup instead let me know then I can generate that as well.
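For illustration, the remapping described above could look roughly like the following. This is a sketch under the label scheme quoted earlier in this thread (1-25 vertebrae, 100 spinal canal, 101-125 partially visible vertebrae, 201-225 discs), not the code in SpiderBaselineRunner.py; the function name `remap_for_dicomseg` is hypothetical.

```python
import numpy as np


def remap_for_dicomseg(seg):
    """Keep only vertebra labels, folding partial vertebrae into full ones.

    - 1-25    (vertebrae)                   -> kept as-is
    - 101-125 (partially visible vertebrae) -> mapped to 1-25
    - everything else (spinal canal, discs) -> 0 (ignored)
    """
    seg = np.asarray(seg)
    out = np.zeros_like(seg)
    full = (seg >= 1) & (seg <= 25)
    partial = (seg >= 101) & (seg <= 125)
    out[full] = seg[full]
    out[partial] = seg[partial] - 100
    return out
```

Once the canal and disc entries are usable from SegDB, the same pattern extends by adding further masks instead of zeroing those labels.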

@fedorov
Member

fedorov commented Oct 19, 2023

Here's a better link that has discs codes: https://dicom.nema.org/medical/dicom/current/output/chtml/part16/sect_CID_7604.html

@silvandeleemput
Contributor Author

Could you please confirm that you guys prefer a dseg.json including the disks and spinal canal over the current implementation using the available values in https://github.com/MHubAI/SegDB (only vertebrae)?

If so could someone help with the DicomSeg codes for the spinal canal as well?

@LennyN95
Member

LennyN95 commented Nov 2, 2023

Could you please confirm that you guys prefer a dseg.json including the disks and spinal canal over the current implementation using the available values in https://github.com/MHubAI/SegDB (only vertebrae)?

No, we're always preferring the usage of SegDB codes ;) Sorry if that wasn't clear; The dseg.json is only required if SegDB entries are missing and will ultimately be integrated into SegDB. In that case, we'll update the implementation to use SegDB instead.

@fedorov commented earlier on the spinal canal codes. @sil, the problem is that you have multiple segments of the spinal canal, correct? Couldn't we assign the spinal canal SCT code to all these segments for now @fedorov? We can always update with more specific codes in the future.

A nice feature of MHub is, that besides the standardized output, all original outputs of the pipeline are still available. So we can always have a workflow that returns those output files.

@fedorov
Member

fedorov commented Nov 2, 2023

the problem is that you have multiple segments of the spinal canal, correct? Couldn't we assign the spinal canal SCT code to all these segments for now @fedorov? We can always update with more specific codes in the future.

You mean assign generic "Spinal canal" code instead of more specific ones?

@silvandeleemput
Contributor Author

silvandeleemput commented Nov 6, 2023

Could you please confirm that you guys prefer a dseg.json including the disks and spinal canal over the current implementation using the available values in https://github.com/MHubAI/SegDB (only vertebrae)?

No, we're always preferring the usage of SegDB codes ;) Sorry if that wasn't clear; The dseg.json is only required if SegDB entries are missing and will ultimately be integrated into SegDB. In that case, we'll update the implementation to use SegDB instead.

Thanks!

@fedorov commented earlier on the spinal canal codes. @sil, the problem is that you have multiple segments of the spinal canal, correct? Couldn't we assign the spinal canal SCT code to all these segments for now @fedorov? We can always update with more specific codes in the future.

That's exactly the implementation in place right now, and I am using the current spinal canal SCT codes available in the SegDB.
However, the algorithm provides more segmentation labels, so the codes that are missing in SegDB are:

I think it would be great if you could add these to the SegDB.

A nice feature of MHub is, that besides the standardized output, all original outputs of the pipeline are still available. So we can always have a workflow that returns those output files.

Yes, this is currently the case, I also output the original MHA segmentation alongside the DicomSeg output.

@silvandeleemput
Contributor Author

silvandeleemput commented Nov 6, 2023

You mean assign generic "Spinal canal" code instead of more specific ones?

Yes, I don't think this algorithm/model is very specific and just segments it as a whole.

@fedorov
Member

fedorov commented Nov 6, 2023

Vertebrae disc codes (see link of Federov:

Fedorov

I don't think this algorithm/model is very specific and just segments it as a whole.

You can use "Spinal canal" SCT=61853006 https://browser.ihtsdotools.org/?perspective=full&conceptId1=61853006&edition=MAIN/2023-11-01&release=&languages=en
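As a rough illustration of where such a code ends up, the sketch below builds a segment description using the attribute names of dcmqi-style segmentation metadata. The helper names, the `SegmentAlgorithmName` value, and the choice of "Anatomical Structure" (SCT 123037004) as property category are assumptions for this example; in this PR the codes are resolved through SegDB rather than written by hand.

```python
def coded_concept(value, scheme, meaning):
    """Build a DICOM code triplet (value, coding scheme, meaning) as a dict."""
    return {
        "CodeValue": value,
        "CodingSchemeDesignator": scheme,
        "CodeMeaning": meaning,
    }


def spinal_canal_segment(label_id):
    """Describe one segment that carries the generic spinal canal code."""
    return {
        "labelID": label_id,
        "SegmentDescription": "Spinal canal",
        "SegmentAlgorithmType": "AUTOMATIC",
        # Hypothetical algorithm name, chosen for this example only.
        "SegmentAlgorithmName": "SPIDER baseline",
        # Assumed category: generic anatomical structure.
        "SegmentedPropertyCategoryCodeSequence": coded_concept(
            "123037004", "SCT", "Anatomical Structure"
        ),
        # Code given in the comment above.
        "SegmentedPropertyTypeCodeSequence": coded_concept(
            "61853006", "SCT", "Spinal canal"
        ),
    }
```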

…eRunner, added support for CT and some other configuration options
models/gc_spider_baseline/utils/SpiderBaselineRunner.py (6 review threads, outdated, resolved)
models/gc_spider_baseline/dockerfiles/Dockerfile (2 review threads, outdated, resolved)
models/gc_spider_baseline/config/default.yml (2 review threads, outdated, resolved)
@LennyN95
Member

You can use "Spinal canal" SCT=61853006 https://browser.ihtsdotools.org/?perspective=full&conceptId1=61853006&edition=MAIN/2023-11-01&release=&languages=en

I updated SegDB.
The generic id for the spinal canal is SPINAL_CANAL.

@silvandeleemput
Contributor Author

@LennyN95 Did you also add the Vertebrae discs to SegDB yet?
Vertebrae disc codes (see link of Fedorov: https://dicom.nema.org/medical/dicom/current/output/chtml/part16/sect_CID_7604.html)

@LennyN95
Member

Yup, I pushed the changes a second ago.
You can search for "disk" here: https://github.com/MHubAI/SegDB/blob/main/segdb/data/segmentations.csv

The SegDB IDs follow the schema VERTEBRAE_DISK_C2C3, VERTEBRAE_DISK_C3C4, ...
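A small sketch of how the model's disc labels might be turned into IDs of this form. It assumes the schema extends to the lumbar levels in the same way (upper vertebra first, e.g. `VERTEBRAE_DISK_L5S1`), assumes standard anatomy with five lumbar vertebrae, and uses the numbering quoted earlier in this thread (201 = L5/S1, counting upwards). The resulting IDs are not verified here and should be checked against segmentations.csv; `disc_segdb_id` is a hypothetical helper.

```python
# Vertebral levels from the bottom up, limited to the lumbar region
# and its direct neighbours (the model targets lumbar spine MR).
LEVELS_BOTTOM_UP = ["S1", "L5", "L4", "L3", "L2", "L1", "T12"]


def disc_segdb_id(label):
    """Map a SPIDER disc label (201 = L5/S1, 202 = L4/L5, ...) to an
    assumed SegDB ID following the VERTEBRAE_DISK_<upper><lower> schema."""
    index = label - 201
    if not 0 <= index < len(LEVELS_BOTTOM_UP) - 1:
        raise ValueError(f"disc label {label} is outside the lumbar range covered here")
    lower = LEVELS_BOTTOM_UP[index]
    upper = LEVELS_BOTTOM_UP[index + 1]
    return f"VERTEBRAE_DISK_{upper}{lower}"
```

Labels above 206 raise an error on purpose, so that thoracic or cervical discs are not silently assigned a guessed ID.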

models/gc_spider_baseline/meta.json (6 review threads, outdated, resolved)
models/gc_spider_baseline/utils/SpiderBaselineRunner.py (2 review threads, outdated, resolved)
@silvandeleemput
Contributor Author

output.zip

@silvandeleemput
Contributor Author

silvandeleemput commented Apr 15, 2024

/test

sample:
  idc_version: 17.0
  data:
    - SeriesInstanceUID: 1.3.6.1.4.1.14519.5.2.1.1600.1202.719041608264172429762267513295
      aws_url: s3://idc-open-data/6417b84a-f531-4c4a-a47d-3a887bf00f2f/*
      path: dicom

reference:
  url: https://github.com/MHubAI/models/files/14976834/output.zip

Test Results (24.04.18_22.31.12_E3krCTzQUg)
id: cc8604b9-7b1d-4085-a257-518c035608d9
date: '2024-04-18 22:41:35'
checked_files:
- file: spider_baseline_vertebrae_segmentation_raw.mha
  path: /app/test/src/1.3.6.1.4.1.14519.5.2.1.1600.1202.719041608264172429762267513295/spider_baseline_vertebrae_segmentation_raw.mha
  checks:
  - checker: ImageFileCheck
    notes:
    - label: Data Type
      description: Data type of the reference image
      info: uint16
    - label: Dice Score
      description: Dice score between reference and test image
      info: 0.999989
- file: spider_baseline_vertebrae.seg.dcm
  path: /app/test/src/1.3.6.1.4.1.14519.5.2.1.1600.1202.719041608264172429762267513295/spider_baseline_vertebrae.seg.dcm
  checks:
  - checker: DicomsegContentCheck
    notes:
    - label: Segment Count
      description: The number of segments identified in the inspected dicomseg file.
      info: 17
- file: spider_baseline_vertebrae.seg.mha
  path: /app/test/src/1.3.6.1.4.1.14519.5.2.1.1600.1202.719041608264172429762267513295/spider_baseline_vertebrae.seg.mha
  checks:
  - checker: ImageFileCheck
    notes:
    - label: Data Type
      description: Data type of the reference image
      info: uint16
    - label: Dice Score
      description: Dice score between reference and test image
      info: 0.999989
summary:
  files_missing: 0
  files_extra: 0
  checks:
    ImageFileCheck:
      files: 2
    DicomsegContentCheck:
      files: 1
conclusion: true
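For readers unfamiliar with the metric reported above, here is a minimal sketch of a Dice score between a reference and a test label image. It assumes both images are NumPy arrays of the same shape and compares foreground (non-zero) voxels only; how ImageFileCheck computes its score internally is not shown in this thread, so treat this as an illustration rather than the checker's implementation.

```python
import numpy as np


def dice_score(reference, test):
    """Dice overlap of the foreground (label > 0) of two label images.

    Dice = 2 * |A and B| / (|A| + |B|); 1.0 means identical foreground.
    """
    ref_fg = np.asarray(reference) > 0
    test_fg = np.asarray(test) > 0
    denominator = ref_fg.sum() + test_fg.sum()
    if denominator == 0:
        # Both images are empty: treat them as a perfect match.
        return 1.0
    intersection = np.logical_and(ref_fg, test_fg).sum()
    return float(2.0 * intersection / denominator)
```

A value such as 0.999989 means the test output differs from the reference in only a handful of voxels.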

@silvandeleemput
Contributor Author

silvandeleemput commented Apr 15, 2024

/review

The meta.json file has been updated and the comments have been addressed. For the provided test file it should be noted that I was unable to find a single case with all the annotated labels of the spine. Typically only a part of the spine is scanned using MR, i.e. the lumbar, thoracic, or cervical vertebrae. The example scan from IDC that I provided is of the lumbar spine.

edit - as discussed, this model only functions properly on input images of the lumbar spine, since it was only trained on these. I think the currently provided test image should be sufficient for testing.

@github-actions github-actions bot added the REQUEST REVIEW label Apr 15, 2024
models/gc_spider_baseline/meta.json (3 review threads, resolved; 2 outdated)
Member

@LennyN95 LennyN95 left a comment

Looking good & tests passed.

@LennyN95 LennyN95 merged commit 7ea9653 into MHubAI:main Apr 19, 2024
1 check passed
@LennyN95 LennyN95 removed the REQUEST REVIEW and TEST REQUESTED labels Apr 19, 2024
@silvandeleemput silvandeleemput deleted the m-gc-spider-baseline branch April 22, 2024 08:20
4 participants