Consider adding dicom package tests that would utilize selected samples from IDC #1208

Open
fedorov opened this issue Aug 23, 2024 · 3 comments
Labels: collaboration:idc - Identify feature or fix partially funded by the NCI Imaging Data Commons (IDC) initiative

Comments

Member

fedorov commented Aug 23, 2024

Since all of the IDC data is available in public buckets, with the content available via S3 API or HTTPS, without authentication, it might be good to add regression tests that utilize hand-picked DICOM samples that stress specific aspects of the functionality.

Specific examples that we have already run into, each with the SeriesInstanceUID of a corresponding sample from the current IDC v18 data release:

  • large DICOM SEG for TotalSegmentator: 1.2.276.0.7230010.3.1.3.313263360.35955.1706319184.882151
  • CT with SpacingBetweenSlices = -4: 1.3.6.1.4.1.32722.99.99.239963936032720978832553442140518002510
  • NM with SpacingBetweenSlices = -2: 1.3.6.1.4.1.14519.5.2.1.7009.2403.484725606860278331095617627781

Given a UID from the list above, the corresponding file(s) can be retrieved in just 2 steps:

  1. $ pip install --upgrade idc-index
  2. $ idc download <SeriesInstanceUID>
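For use from a test harness, the two steps above could be wrapped in a small helper. The following is only a sketch: the sample names, `build_download_command`, and `download_sample` are hypothetical names introduced here, and the only external command assumed is the `idc download <SeriesInstanceUID>` call quoted above. The UID check follows the general DICOM rule that a UID consists of dot-separated digit groups and is at most 64 characters long.

```python
import re
import subprocess

# SeriesInstanceUIDs quoted in this issue (IDC v18 data release).
# The dictionary keys are hypothetical labels chosen for this sketch.
SAMPLES = {
    "seg_totalsegmentator": "1.2.276.0.7230010.3.1.3.313263360.35955.1706319184.882151",
    "ct_spacing_minus_4": "1.3.6.1.4.1.32722.99.99.239963936032720978832553442140518002510",
    "nm_spacing_minus_2": "1.3.6.1.4.1.14519.5.2.1.7009.2403.484725606860278331095617627781",
}

# Dot-separated groups of digits.
_UID_RE = re.compile(r"^[0-9]+(\.[0-9]+)*$")


def build_download_command(series_uid):
    """Return the argument list for `idc download <SeriesInstanceUID>`."""
    if len(series_uid) > 64 or not _UID_RE.match(series_uid):
        raise ValueError(f"not a valid DICOM UID: {series_uid!r}")
    return ["idc", "download", series_uid]


def download_sample(name, run=subprocess.run):
    """Download one named sample by invoking the idc command-line tool.

    `run` is injectable so that the helper can be exercised without
    network access or an installed idc-index package.
    """
    command = build_download_command(SAMPLES[name])
    run(command, check=True)
    return command
```

Calling `download_sample("ct_spacing_minus_4")` would then run the same command as step 2, provided step 1 has been done and `idc` is on the PATH.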

Other dimensions we may want to consider testing include various transfer syntaxes, diffusion images from different manufacturers, series with missing slices, series with inconsistent PixelSpacing or ImageOrientationPatient, gantry tilt, presentation states, and samples that contain attributes that are invalid per the standard but may be encountered "in the wild". I think we should be able to find samples for many of the situations that need to be regression-tested.
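As an illustration of what one such check might look like, here is a minimal sketch for the "missing slices" case. It assumes the slice positions have already been projected onto the slice normal (in mm) by whatever reader is under test. The function name `find_slice_gaps` and the tolerance are hypothetical, and using the median gap as the nominal spacing is an assumption that only holds when most slices are present.

```python
from statistics import median


def find_slice_gaps(positions, rel_tol=0.1):
    """Return (lower, upper) position pairs separated by more than the nominal spacing.

    `positions` are slice locations along the slice normal, in mm, in any order.
    The nominal spacing is estimated as the median gap between sorted positions.
    """
    ordered = sorted(positions)
    gaps = [upper - lower for lower, upper in zip(ordered, ordered[1:])]
    if not gaps:
        # Zero or one slice: there is nothing to compare.
        return []
    nominal = median(gaps)
    return [
        (lower, upper)
        for lower, upper, gap in zip(ordered, ordered[1:], gaps)
        if gap > nominal * (1 + rel_tol)
    ]
```

Because the positions are sorted first, a series stored in descending order (as one with a negative SpacingBetweenSlices might be) is handled the same way as an ascending one.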

I have not done this myself, but it looks like CMake supports such external data sources: https://cmake.org/cmake/help/book/mastering-cmake/chapter/Testing%20With%20CMake%20and%20CTest.html#managing-test-data

I am happy to help with selection of the relevant samples for the tasks we agree should be tested and answer any questions related to IDC.

I think something like the above has been a dream of @pieper for many years now. I believe we finally can make it come true!

Member Author

fedorov commented Aug 23, 2024

This is where tests are right now and it seems they are propagated from dcmqi: https://github.com/InsightSoftwareConsortium/ITK-Wasm/blob/main/packages/dicom/dcmtk/CMakeLists.txt#L88

jcfr added the collaboration:idc label on Aug 30, 2024
jcfr assigned jadh4v and jcfr on Sep 16, 2024
Member

jadh4v commented Sep 16, 2024

> This is where tests are right now and it seems they are propagated from dcmqi: https://github.com/InsightSoftwareConsortium/ITK-Wasm/blob/main/packages/dicom/dcmtk/CMakeLists.txt#L88

@fedorov

The tests in the CMake file are mostly run as a sanity check on native binaries.
The more comprehensive TypeScript and Python tests are here:
https://github.com/InsightSoftwareConsortium/ITK-Wasm/tree/main/packages/dicom/typescript/test
https://github.com/InsightSoftwareConsortium/ITK-Wasm/tree/main/packages/dicom/python/itkwasm-dicom-wasi/tests

Also, GDCM is available through both image-io and the dicom subpackage for reading image series.
I don't believe DCMTK is currently being used for reading the imaging modalities of DICOM series (@thewtex correct me if I'm wrong).

Member Author

fedorov commented Sep 16, 2024

Yes, I understand. The idea is to augment the existing dcmqi tests (which are basically small toy examples) with tests on real data from IDC, and also to add tests of the image-io package using data from IDC. There is no need to add DCMTK to image-io for this purpose; the goal is just to improve testing of the existing GDCM-based functionality.
