Useful statistics for training models #20

Open
naga-karthik opened this issue Jan 28, 2023 · 3 comments

Comments

@naga-karthik
Collaborator

naga-karthik commented Jan 28, 2023

I computed some statistics about the dataset that could be useful when training the segmentation models.

Subjects with and without lesions

Knowing the exact number of each helps in designing a curriculum learning strategy: train the model for a few "warm-up" epochs only on the subjects with lesions, then gradually introduce the subjects without lesions.

Number of subjects without lesion: 56

['sub-m969884', 'sub-m139339', 'sub-m456943', 'sub-m831195', 'sub-m376420', 'sub-m824387', 'sub-m116619', 'sub-m598270', 'sub-m769014', 'sub-m804794', 'sub-m362157', 'sub-m927055', 
'sub-m665521', 'sub-m906416', 'sub-m162129', 'sub-m709160', 'sub-m852614', 'sub-m902962', 'sub-m659205', 'sub-m843987', 'sub-m128628', 'sub-m884947', 'sub-m012474', 'sub-m053662', 
'sub-m998939', 'sub-m373162', 'sub-m711452', 'sub-m073580', 'sub-m380212', 'sub-m597981', 'sub-m116500', 'sub-m790028', 'sub-m300747', 'sub-m991840', 'sub-m987382', 'sub-m936588',
 'sub-m747612', 'sub-m854598', 'sub-m838132', 'sub-m431499', 'sub-m387058', 'sub-m737478', 'sub-m090343', 'sub-m627960', 'sub-m441629', 'sub-m339071', 'sub-m206271', 'sub-m550628', 
'sub-m472036', 'sub-m553941', 'sub-m358902', 'sub-m826180', 'sub-m491476', 'sub-m554105', 'sub-m919335', 'sub-m299563']

Number of subjects with lesion: 163

['sub-m703984', 'sub-m545591', 'sub-m052556', 'sub-m552033', 'sub-m425924', 'sub-m531168', 'sub-m508874', 'sub-m205815', 'sub-m990877', 'sub-m484245', 'sub-m757346', 'sub-m868134', 
'sub-m723132', 'sub-m738530', 'sub-m322775', 'sub-m362600', 'sub-m707812', 'sub-m463857', 'sub-m597865', 'sub-m378204', 'sub-m026506', 'sub-m818513', 'sub-m718495', 'sub-m572861', 
'sub-m563712', 'sub-m977362', 'sub-m978163', 'sub-m829931', 'sub-m991145', 'sub-m295736', 'sub-m159764', 'sub-m531317', 'sub-m158425', 'sub-m360832', 'sub-m243433', 'sub-m142435', 
'sub-m221398', 'sub-m762797', 'sub-m724575', 'sub-m786260', 'sub-m560928', 'sub-m275415', 'sub-m818091', 'sub-m808926', 'sub-m522051', 'sub-m117189', 'sub-m556439', 'sub-m774069', 
'sub-m220491', 'sub-m434248', 'sub-m916671', 'sub-m694074', 'sub-m222399', 'sub-m839135', 'sub-m350871', 'sub-m763939', 'sub-m739531', 'sub-m793289', 'sub-m205610', 'sub-m023917', 
'sub-m310073', 'sub-m778290', 'sub-m717470', 'sub-m631090', 'sub-m704693', 'sub-m354066', 'sub-m772796', 'sub-m094254', 'sub-m698534', 'sub-m063690', 'sub-m757043', 'sub-m556894', 
'sub-m595577', 'sub-m573737', 'sub-m168132', 'sub-m356340', 'sub-m356026', 'sub-m816146', 'sub-m751383', 'sub-m944619', 'sub-m663069', 'sub-m698817', 'sub-m126053', 'sub-m621782', 
'sub-m909606', 'sub-m508941', 'sub-m673334', 'sub-m785774', 'sub-m978546', 'sub-m085197', 'sub-m312155', 'sub-m492109', 'sub-m798409', 'sub-m104714', 'sub-m993488', 'sub-m751075', 
'sub-m040509', 'sub-m843491', 'sub-m949797', 'sub-m977227', 'sub-m469393', 'sub-m558234', 'sub-m474555', 'sub-m878455', 'sub-m043194', 'sub-m664123', 'sub-m527202', 'sub-m029034', 
'sub-m087754', 'sub-m545924', 'sub-m809689', 'sub-m779887', 'sub-m403171', 'sub-m275864', 'sub-m569425', 'sub-m729353', 'sub-m617186', 'sub-m701054', 'sub-m333631', 'sub-m315309', 
'sub-m027847', 'sub-m707324', 'sub-m397667', 'sub-m339845', 'sub-m941876', 'sub-m841476', 'sub-m846990', 'sub-m870870', 'sub-m251271', 'sub-m243881', 'sub-m220667', 'sub-m124504', 
'sub-m172680', 'sub-m901378', 'sub-m245390', 'sub-m886317', 'sub-m094503', 'sub-m979943', 'sub-m640779', 'sub-m493131', 'sub-m379862', 'sub-m438239', 'sub-m730546', 'sub-m762599', 
'sub-m781551', 'sub-m072533', 'sub-m189434', 'sub-m115467', 'sub-m438273', 'sub-m838420', 'sub-m986156', 'sub-m644597', 'sub-m819426', 'sub-m329161', 'sub-m479421', 'sub-m684459', 
'sub-m504077', 'sub-m157227', 'sub-m551363', 'sub-m034619', 'sub-m412427', 'sub-m037477', 'sub-m292834']
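
The warm-up idea above could be sketched roughly as follows. This is only an illustration, not the project's actual training code: the function name `curriculum_subjects`, the parameters `warmup_epochs` and `ramp_epochs`, and the linear ramp are all assumptions made for the example, and the two lists are short excerpts of the lists above.

```python
import random


def curriculum_subjects(with_lesion, without_lesion, epoch,
                        warmup_epochs=5, ramp_epochs=10, seed=0):
    """Return the subjects to train on at a given (0-indexed) epoch.

    Hypothetical schedule: lesion subjects only during warm-up, then a
    linearly growing share of lesion-free subjects over `ramp_epochs`.
    """
    if epoch < warmup_epochs:
        return list(with_lesion)
    # Fraction of lesion-free subjects to include, capped at 1.0.
    fraction = min(1.0, (epoch - warmup_epochs + 1) / ramp_epochs)
    n_extra = round(fraction * len(without_lesion))
    # Same seed every call, so the lesion-free subset only grows over epochs.
    order = list(without_lesion)
    random.Random(seed).shuffle(order)
    return list(with_lesion) + order[:n_extra]


# Excerpts of the two lists above, truncated for brevity.
with_lesion = ['sub-m703984', 'sub-m545591', 'sub-m052556']
without_lesion = ['sub-m969884', 'sub-m139339', 'sub-m456943']

for epoch in (0, 5, 10, 14):
    subjects = curriculum_subjects(with_lesion, without_lesion, epoch)
    print(epoch, len(subjects))
```

With the full lists (163 with lesion, 56 without), the warm-up epochs would see 163 subjects and the final schedule all 219. How fast to ramp is a tuning choice that the numbers above do not settle.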

Min/max sizes of the images and labels

To decide on an optimal cropping size, it is useful to know the largest and smallest dimensions across all subjects:

min along each dimension: [ 18  42 201]
max along each dimension: [160 175 713]

EDIT: Note that these dimensions correspond to the sizes of the preprocessed images that have been cropped using the spinal cord segmentation mask.
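
For reference, per-dimension extents like the ones above can be computed from a list of image shapes along these lines. This is a minimal sketch rather than the script actually used: `shape_extents` and `dims_needing_padding` are hypothetical helper names, `toy_shapes` is made-up data, and the candidate crop size is an assumption chosen only for illustration. The one real input is the reported minimum `[18, 42, 201]`.

```python
def shape_extents(shapes):
    """Return (per-dimension minimum, per-dimension maximum) over image shapes."""
    if not shapes:
        raise ValueError("need at least one shape")
    # Transpose: one tuple of sizes per dimension.
    per_dim = list(zip(*shapes))
    return [min(d) for d in per_dim], [max(d) for d in per_dim]


def dims_needing_padding(crop_size, min_shape):
    """Return indices of dimensions where the crop exceeds the smallest image."""
    return [i for i, (c, m) in enumerate(zip(crop_size, min_shape)) if c > m]


# Made-up shapes, only to show the call; not taken from the dataset.
toy_shapes = [(20, 50, 300), (64, 48, 256), (32, 96, 512)]
mins, maxs = shape_extents(toy_shapes)
print(mins, maxs)

# Reported minimum from this issue, checked against a hypothetical crop size.
reported_min = [18, 42, 201]
print(dims_needing_padding((64, 64, 256), reported_min))
```

Any dimension returned by `dims_needing_padding` means that at least one subject would have to be padded along that axis for the chosen crop size.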

@jcohenadad
Member

Thanks! But do we need to know the min/max of the images if, in the end, we use the spinal cord segmentation for cropping?

@naga-karthik
Collaborator Author

These are actually the min/max of the (preprocessed) images which have already been cropped using the SC segmentation mask!

@jcohenadad
Member

👍
