Model not classifying obvious pneumothorax #81

Open
AceMcAwesome77 opened this issue Jan 5, 2022 · 4 comments

@AceMcAwesome77

AceMcAwesome77 commented Jan 5, 2022

Hi, I am trying to run a quick test using this model. I have 2 different chest XR images that show an entire lung collapsed from pneumothorax, and I'd like to verify that the model correctly picks them up. However, it doesn't. My starting image is "img_data", which is a numpy.ndarray of shape (2800, 3408) that looks like this:

[image]

below is the code I'm running:

import torchxrayvision as xrv
import skimage
import torchvision
import torch

model = xrv.models.ResNet(weights="resnet50-res512-all")

# img_data is the (2800, 3408) numpy array described above
img = xrv.datasets.normalize(img_data, 255)

# Keep a single channel if the image has a color axis
if len(img.shape) > 2:
    img = img[:, :, 0]
if len(img.shape) < 2:
    print("error, dimension lower than 2 for image")

# Add the channel axis: (H, W) -> (1, H, W)
img = img[None, :, :]

transform = torchvision.transforms.Compose([xrv.datasets.XRayCenterCrop(),
                                            xrv.datasets.XRayResizer(512)])

img = transform(img)

output = {}
with torch.no_grad():
    # Add the batch axis: (1, H, W) -> (1, 1, H, W)
    img = torch.from_numpy(img).unsqueeze(0)
    preds = model(img).cpu()
    output["preds"] = dict(zip(xrv.datasets.default_pathologies, preds[0].detach().numpy()))

Running all this code, I get the following "output" variable:

{'preds': {'Atelectasis': 0.031930413,
'Consolidation': 0.0079838885,
'Infiltration': 0.022067936,
'Pneumothorax': 0.012027948,
'Edema': 3.992413e-06,
'Emphysema': 0.008683062,
'Fibrosis': 0.0037461556,
'Effusion': 0.012206978,
'Pneumonia': 0.005400587,
'Pleural_Thickening': 0.043657843,
'Cardiomegaly': 0.0010988085,
'Nodule': 0.011990261,
'Mass': 0.20278542,
'Hernia': 1.3901392e-05,
'Lung Lesion': 0.5,
'Fracture': 0.033246215,
'Lung Opacity': 0.04536338,
'Enlarged Cardiomediastinum': 0.5}}

Here Pneumothorax has a score of 0.012, which should be much higher given the obvious pneumothorax. The other test image behaves the same way: it shows an obvious pneumothorax but scores about 0.01 with this pipeline. What am I doing wrong here? Thanks much!
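One way to double-check a pipeline like the one in this comment is to trace the array shape through each step. The sketch below does only that arithmetic and does not call torchxrayvision. It assumes that the center crop keeps the largest centered square and that the resizer outputs a 512 x 512 image; the helper names `trace_shapes` and `trimmed_per_side` are hypothetical and exist only for this illustration.

```python
def trace_shapes(height, width, target=512):
    """Return (step, shape) pairs for the preprocessing steps in the comment.

    Assumption: the center crop keeps the largest centered square and the
    resizer produces a target x target image.
    """
    side = min(height, width)
    return [
        ("raw image", (height, width)),
        ("img[None, :, :]", (1, height, width)),
        ("center crop", (1, side, side)),
        ("resize", (1, target, target)),
        ("unsqueeze(0)", (1, 1, target, target)),
    ]


def trimmed_per_side(height, width):
    """Pixels removed from each end of the longer axis by a centered square crop."""
    return abs(height - width) // 2


# Shape of the image described in the comment
for step, shape in trace_shapes(2800, 3408):
    print(f"{step:>16}: {shape}")
print("columns trimmed per side:", trimmed_per_side(2800, 3408))
```

Under these assumptions, the 2800 x 3408 image loses 304 columns on each side and is then downscaled by a factor of roughly 5.5 before it reaches the model, so it may be worth viewing the cropped and resized array to confirm that the finding is still visible.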

@ieee8023
Member

ieee8023 commented Jan 7, 2022

Hey, sorry for the delay in getting back to you. Your code looks correct. I processed the image you posted using this script: https://github.com/mlmed/torchxrayvision/blob/master/scripts/process_image.py It seems the densenet at 224x224 resolution predicts higher, but it could just be relying on some spuriously correlated signal.

Perhaps a pneumothorax that big was rare in the training data so the model didn't learn any features for it. These models are not perfect.

$ python3 process_image.py test-pneumo.png -weights densenet121-res224-all
Warning: Input size (252x252) is not the native resolution (224x224) for this model. A resize will be performed but this could impact performance.
{'preds': {'Atelectasis': 0.2336309,
           'Cardiomegaly': 0.5088244,
           'Consolidation': 0.49505916,
           'Edema': 0.006818563,
           'Effusion': 0.16614664,
           'Emphysema': 0.5041779,
           'Enlarged Cardiomediastinum': 0.44423354,
           'Fibrosis': 0.06360667,
           'Fracture': 0.50422204,
           'Hernia': 0.38043112,
           'Infiltration': 0.19527479,
           'Lung Lesion': 0.04211408,
           'Lung Opacity': 0.16715923,
           'Mass': 0.5078881,
           'Nodule': 0.22657393,
           'Pleural_Thickening': 0.16447839,
           'Pneumonia': 0.12326898,
           'Pneumothorax': 0.50655806}}

$ python3 process_image.py test-pneumo.png -weights resnet50-res512-all
Warning: Input size (252x252) is not the native resolution (512x512) for this model. A resize will be performed but this could impact performance.
{'preds': {'Atelectasis': 0.07039986,
           'Cardiomegaly': 0.006025043,
           'Consolidation': 0.010440136,
           'Edema': 0.00018646289,
           'Effusion': 0.06766936,
           'Emphysema': 0.0020499881,
           'Enlarged Cardiomediastinum': 0.5,
           'Fibrosis': 0.0077288793,
           'Fracture': 0.01869749,
           'Hernia': 0.00023267824,
           'Infiltration': 0.03506834,
           'Lung Lesion': 0.5,
           'Lung Opacity': 0.007614809,
           'Mass': 0.042145472,
           'Nodule': 0.02053261,
           'Pleural_Thickening': 0.009434696,
           'Pneumonia': 0.004180993,
           'Pneumothorax': 0.0058075087}}
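To make the two runs above easier to compare, here is a small sketch that ranks Pneumothorax within each set of scores and lists the labels above a cutoff. The scores are copied from the two outputs above. The helper names `rank_of` and `labels_above` and the 0.5 cutoff are assumptions made for illustration and are not part of process_image.py.

```python
# Scores copied from the densenet121-res224-all run above
densenet_224 = {
    "Atelectasis": 0.2336309, "Cardiomegaly": 0.5088244,
    "Consolidation": 0.49505916, "Edema": 0.006818563,
    "Effusion": 0.16614664, "Emphysema": 0.5041779,
    "Enlarged Cardiomediastinum": 0.44423354, "Fibrosis": 0.06360667,
    "Fracture": 0.50422204, "Hernia": 0.38043112,
    "Infiltration": 0.19527479, "Lung Lesion": 0.04211408,
    "Lung Opacity": 0.16715923, "Mass": 0.5078881,
    "Nodule": 0.22657393, "Pleural_Thickening": 0.16447839,
    "Pneumonia": 0.12326898, "Pneumothorax": 0.50655806,
}

# Scores copied from the resnet50-res512-all run above
resnet_512 = {
    "Atelectasis": 0.07039986, "Cardiomegaly": 0.006025043,
    "Consolidation": 0.010440136, "Edema": 0.00018646289,
    "Effusion": 0.06766936, "Emphysema": 0.0020499881,
    "Enlarged Cardiomediastinum": 0.5, "Fibrosis": 0.0077288793,
    "Fracture": 0.01869749, "Hernia": 0.00023267824,
    "Infiltration": 0.03506834, "Lung Lesion": 0.5,
    "Lung Opacity": 0.007614809, "Mass": 0.042145472,
    "Nodule": 0.02053261, "Pleural_Thickening": 0.009434696,
    "Pneumonia": 0.004180993, "Pneumothorax": 0.0058075087,
}


def rank_of(scores, label):
    """1-based rank of label when scores are sorted from highest to lowest."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return ordered.index(label) + 1


def labels_above(scores, cutoff=0.5):
    """Labels whose score is strictly greater than the (illustrative) cutoff."""
    return sorted(name for name, score in scores.items() if score > cutoff)


for name, scores in (("densenet121-res224-all", densenet_224),
                     ("resnet50-res512-all", resnet_512)):
    print(name)
    print("  Pneumothorax:", scores["Pneumothorax"],
          "rank", rank_of(scores, "Pneumothorax"), "of", len(scores))
    print("  labels above 0.5:", labels_above(scores))
```

In the densenet run, Pneumothorax is one of five labels sitting just above 0.5, alongside Cardiomegaly, Emphysema, Fracture, and Mass, so its score does not stand out from the others. In the resnet run no label exceeds 0.5.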

@ieee8023
Member

ieee8023 commented Jan 8, 2022

I also took a look at what the 224x224 densenet model was looking at using the gifsplanation approach (https://arxiv.org/abs/2102.09475) using this code: https://colab.research.google.com/github/mlmed/gifsplanation/blob/main/demo.ipynb

The images are not great, but they give an impression of what changes the model prediction, and it seems to be looking at the right thing from what I can see.

[image: test-pneumo1]

[video: test-pneumo1.mp4]

@AceMcAwesome77
Author

Thanks for the reply! The gifsplanation is super interesting, I'll take a look at that paper.

@croraf

croraf commented May 1, 2024

Explicit resize seems to be problematic (see #152); perhaps it also messes up your case.
