Naming convention for adding new contrasts to the training set #102

Open
naga-karthik opened this issue Mar 6, 2024 · 12 comments

@naga-karthik
Collaborator

naga-karthik commented Mar 6, 2024

I am working on adding new datasets/contrasts to augment the existing spine-generic dataset. I wanted to clarify/confirm a few things about the naming conventions to be used for the outputs derived from the contrast-agnostic model (which will be used for training).

Example folder structure for `basel-mp2rage`
basel-mp2rage
├── README.md
├── dataset_description.json
├── participants.tsv
├── participants.json
├── code/
├── derivatives
│   ├── labels
│   │   ├── dataset_description.json
│   │   ├── README.md
│   │   └── sub-CXXX
│   │       └── anat
│   │           ├── sub-CXXX_UNIT1_label-SC_seg.nii.gz
│   │           └── sub-CXXX_UNIT1_label-SC_seg.json
│   └── labels_softseg_bin     ---> this folder is newly added
│       ├── dataset_description.json
│       ├── README.md
│       └── sub-CXXX
│           └── anat
│               ├── sub-CXXX_UNIT1_desc-softseg_label-SC_seg.nii.gz
│               └── sub-CXXX_UNIT1_desc-softseg_label-SC_seg.json
└── sub-CXXX
    └── anat
        ├── sub-CXXX_UNIT1.nii.gz
        └── sub-CXXX_UNIT1.json
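
The naming pattern in the tree above can be sketched as a small helper. This is only an illustration: `derivative_seg_path` and its parameters are hypothetical names, and the `desc-softseg` entity is copied as-is from the example tree, which is still under discussion in this thread.

```python
from pathlib import Path


def derivative_seg_path(dataset_root, subject, contrast,
                        derivative="labels_softseg_bin", desc="softseg",
                        label="SC", ext=".nii.gz"):
    """Build the path of a segmentation file inside a derivatives folder.

    Hypothetical helper: the entity order (subject, contrast, desc, label, seg)
    simply mirrors the filenames shown in the example tree.
    """
    entities = [subject, contrast]
    if desc:  # the plain "labels" folder in the tree has no desc entity
        entities.append(f"desc-{desc}")
    entities.append(f"label-{label}")
    entities.append("seg")
    filename = "_".join(entities) + ext
    return Path(dataset_root) / "derivatives" / derivative / subject / "anat" / filename


print(derivative_seg_path("basel-mp2rage", "sub-CXXX", "UNIT1").as_posix())
# basel-mp2rage/derivatives/labels_softseg_bin/sub-CXXX/anat/sub-CXXX_UNIT1_desc-softseg_label-SC_seg.nii.gz
```

Passing `derivative="labels", desc=None` reproduces the filenames of the existing `labels` folder.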

Contents of the json file

{
    "Name": "contrast-agnostic model",
    "Version": "SCT v6.2" / "v2.0", # Should I use the SCT version or the tag v2.0 of the contrast-agnostic repo?
    "Date": "2024-03-21"
}

The issue is that the model currently in SCT v6.2 is the original soft model trained on soft GTs, but the model I will use for inference and for training with new contrasts is the soft_bin model.

EDIT: updated the filename for labels_softseg_bin

@sandrinebedard
Member

If it is under labels_softseg_bin, the name softseg is not the right one. You can check the naming convention we decided on in my branch of the spine-generic data-multi-subject dataset.

I would go with the C-A version since 2.0 is not in SCT, is that right?

@naga-karthik
Collaborator Author

naga-karthik commented Mar 7, 2024

Right, I checked and updated folder structure in my comment! The filenames are like this now:

sub-CXXX_UNIT1_desc-softseg_label-SC_seg.nii.gz

> I would go with C-A version since the 2.0 is not in sct, is that right?

Correct, it's not in SCT. The 2.0 essentially comes from our latest release (technically, even this is not the soft_bin model, but I think it's okay; the difference from the original soft model is small).

@sandrinebedard
Member

Maybe we should create a release (like v.1?)

@Nilser3

Nilser3 commented Mar 7, 2024

For nih-ms-mp2rage I have generated these JSON files:

{
  "GeneratedBy": [
    {
      "Name": "contrast-agnostic-softseg-spinalcord",
      "Version": "2.0"
    },
    {
      "Author": "Nilser Laines Medina",
      "Date": "2023-12-14",
      "Note": "Binarised spinal cord soft segmentation"
    }
  ]
}

@naga-karthik
Collaborator Author

Thanks @Nilser3 for your JSON example! I think I will go with contrast-agnostic-softseg-spinalcord as the Name but with a different version.

@sandrinebedard Sure, we can create a new release v2.1 today and I will be using this for the JSON sidecars

@NathanMolinier

NathanMolinier commented Mar 7, 2024

For nih-ms-mp2rage I have generated these JSON files

{
  "GeneratedBy": [
    {
      "Name": "contrast-agnostic-softseg-spinalcord",
      "Version": "2.0"
    },
    {
      "Author": "Nilser Laines Medina",
      "Date": "2023-12-14",
      "Note": "Binarised spinal cord soft segmentation"
    }
  ]
}

The "Name" field is missing from the second entry in this example. How was the second step generated? @Nilser3
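
A check like this could be automated. The following is a minimal sketch, assuming only that every "GeneratedBy" entry is expected to carry a "Name" as discussed here; `entries_missing_name` is a hypothetical helper, not part of SCT or any validator.

```python
def entries_missing_name(sidecar):
    """Return the indices of "GeneratedBy" entries without a non-empty "Name"."""
    return [
        index
        for index, entry in enumerate(sidecar.get("GeneratedBy", []))
        if not entry.get("Name")
    ]


# The sidecar quoted above: the second entry has no "Name".
sidecar = {
    "GeneratedBy": [
        {"Name": "contrast-agnostic-softseg-spinalcord", "Version": "2.0"},
        {
            "Author": "Nilser Laines Medina",
            "Date": "2023-12-14",
            "Note": "Binarised spinal cord soft segmentation",
        },
    ]
}

print(entries_missing_name(sidecar))  # [1]
```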

@Nilser3

Nilser3 commented Mar 7, 2024

Thanks for the feedback @NathanMolinier
I see that it did not yet follow the new convention.
I think something like this would be better:

{
  "SpatialReference": "orig",
  "GeneratedBy": [
    {
      "Name": "contrast-agnostic-softseg-spinalcord",
      "Version": "2.0"
    },
    {
      "Name": "Manual",
      "Author": "Nilser Laines Medina",
      "Date": "2023-12-14",
      "Note": "Binarised spinal cord soft segmentation"
    }
  ]
}

@valosekj
Member

valosekj commented Mar 7, 2024

Just a nitpick, we concluded here that we should use yyyy-mm-dd hh:mm:ss format for Date to make it easy to distinguish the order of corrections.
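
One reason this format helps: zero-padded `yyyy-mm-dd hh:mm:ss` strings sort chronologically even when compared as plain text. A minimal sketch, where `timestamp` and the sample entries are made up for illustration:

```python
from datetime import datetime

DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def timestamp(now=None):
    """Format a datetime (default: current local time) as yyyy-mm-dd hh:mm:ss."""
    return (now or datetime.now()).strftime(DATE_FORMAT)


# Hypothetical correction history, deliberately out of order.
corrections = [
    {"Author": "rater-2", "Date": "2024-03-08 17:46:09"},
    {"Author": "rater-1", "Date": "2023-12-14 09:30:00"},
    {"Author": "rater-3", "Date": "2024-03-08 09:05:41"},
]

# Plain string comparison is enough to recover the chronological order.
ordered = sorted(corrections, key=lambda c: c["Date"])
print([c["Author"] for c in ordered])  # ['rater-1', 'rater-3', 'rater-2']
```

With a date-only value, the two corrections made on 2024-03-08 could not be ordered.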

@NathanMolinier

NathanMolinier commented Mar 7, 2024

{
  "SpatialReference": "orig",
  "GeneratedBy": [
    {
      "Name": "contrast-agnostic-softseg-spinalcord",
      "Version": "2.0"
    },
    {
      "Name": "Manual",
      "Author": "Nilser Laines Medina",
      "Date": "2023-12-14",
      "Note": "Binarised spinal cord soft segmentation"
    }
  ]
}

Just out of curiosity, did you really binarize manually? Or did you use a custom script that does the thresholding?

@Nilser3

Nilser3 commented Mar 7, 2024

It was binarized with sct_maths after generating the soft masks.

But I think I will remove this "Note", because I will regenerate these SC masks with the latest version of the contrast-agnostic model (there, the result is already binary).

@NathanMolinier

> was binarized sct_maths after generating the soft masks,

You should then specify the method, sct_maths, in the "Name" field instead of Manual, and ideally provide the command you ran, like below:

{
  "SpatialReference": "orig",
  "GeneratedBy": [
    {
      "Name": "contrast-agnostic-softseg-spinalcord",
      "Version": "2.0"
    },
    {
      "Name": "sct_maths",
      "Param": "-thr 0.8",
      "Version": "SCT v6.2",
      "Note": "Binarised spinal cord soft segmentation"
    }
  ]
}

@naga-karthik
Collaborator Author

It's pretty cool that you can define the JSON dict directly in the sct_run_batch script and create the JSON file for each subject from the contents of that dict.

code snippet
# timestamp in the agreed "yyyy-mm-dd hh:mm:ss" format
date_time=$(date +"%Y-%m-%d %H:%M:%S")
json_dict='{
  "GeneratedBy": [
    {
      "Name": "contrast-agnostic-softseg-spinalcord",
      "Version": "2.1",
      "Date": "'"${date_time}"'"
    }
  ]
}'

PATH_DATA_PROCESSED_CLEAN="${PATH_DATA_PROCESSED}_clean"
PATH_OUT="${PATH_DATA_PROCESSED_CLEAN}/derivatives/labels_softseg_bin/${SUBJECT}/anat"
# create new folder and copy only the predictions
mkdir -p "${PATH_OUT}"

rsync -avzh "${file}_seg_monai.nii.gz" "${PATH_OUT}/${file}_desc-softseg_label-SC_seg.nii.gz"

# create json file (written from json_dict, so no existing sidecar needs to be copied first)
echo "${json_dict}" > "${PATH_OUT}/${file}_desc-softseg_label-SC_seg.json"
# re-save json file with indentation
python -c "
import json
json_file = '${PATH_OUT}/${file}_desc-softseg_label-SC_seg.json'
with open(json_file, 'r') as f:
    data = json.load(f)
with open(json_file, 'w') as f:
    json.dump(data, f, indent=4)
"
contents of json file
{
    "GeneratedBy": [
        {
            "Name": "contrast-agnostic-softseg-spinalcord",
            "Version": "2.1",
            "Date": "2024-03-08 17:46:09"
        }
    ]
}
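
For comparison, the same sidecar could be written in one step from Python, which avoids the write-then-reindent round trip. This is a sketch under assumptions: `write_sidecar` is a hypothetical helper, and the temporary directory only stands in for the real `derivatives/labels_softseg_bin/<subject>/anat` folder.

```python
import json
import tempfile
from datetime import datetime
from pathlib import Path


def write_sidecar(json_path, name, version, date=None):
    """Write a "GeneratedBy" sidecar with 4-space indentation and return its content."""
    if date is None:
        date = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    sidecar = {"GeneratedBy": [{"Name": name, "Version": version, "Date": date}]}
    json_path = Path(json_path)
    json_path.parent.mkdir(parents=True, exist_ok=True)  # like `mkdir -p`
    with open(json_path, "w") as f:
        json.dump(sidecar, f, indent=4)
    return sidecar


# Demo in a throwaway directory so nothing is written into a real dataset.
with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "sub-CXXX" / "anat" / "sub-CXXX_UNIT1_desc-softseg_label-SC_seg.json"
    write_sidecar(out, "contrast-agnostic-softseg-spinalcord", "2.1")
    print(out.read_text())
```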
