about imagenet-r #1

Open

zhaoedf opened this issue Oct 26, 2022 · 7 comments

Comments

@zhaoedf commented Oct 26, 2022

Why not reproduce ImageNet-R? I used your code to reproduce ImageNet-R and got significantly better results compared to the original paper. Is there anything wrong?

@JH-LEE-KR (Owner) commented Oct 29, 2022

Hi,
Thanks for your comment.

I will reproduce Split ImageNet-R in the future.

How much better were your results than the paper's?
It seems to me that there is no implementation problem.

Reproducing Split CIFAR-100 with the official code also gave results different from the paper.
The reproduced results were worse in my environment with an RTX 3090.
My PyTorch implementation actually achieved better results than the paper on Split CIFAR-100.

I think it's likely due to the library difference between PyTorch and TensorFlow, or to the hardware environment difference.

If you have any additional comments, please feel free to let me know.

Best,
Jaeho Lee.

@zhaoedf (Author) commented Nov 3, 2022

Sorry I missed your reply.

Yes, I double-checked your code, which is almost a replica of the original JAX code with the API changed, and I also checked the config to make sure it is the same as the one the authors published.

I got 85 for CIFAR, which is similar to yours, but I got an average accuracy of 79 for ImageNet-R, which is 10% higher than the paper reported, which is quite bizarre.

Currently I simply (1) add the ImageNet-R dataset and (2) modify the config to match the official one. Did I miss anything?

More questions:

You mentioned that you ran the official code on an RTX 3090. I was wondering how many cards you need? I tried to run it too, but my eight 3090s seem not to be enough for batch_size=128.

Thanks a lot! I can tell you spent a lot of time reproducing the code, e.g., handling the difference between "torch.unique" and "jnp.unique".
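For context, here is a minimal sketch of that difference (an illustration, not code from either repo): under jit, jnp.unique needs a static output size and pads its result, while torch.unique returns a variable-length tensor.

```python
# Illustration only (not from the repo): how unique() differs between libraries.
import torch
import jax.numpy as jnp

x = [3, 1, 3, 2, 1]

# torch.unique returns a sorted, variable-length tensor.
t = torch.unique(torch.tensor(x))  # tensor([1, 2, 3]), shape (3,)

# jnp.unique must be given a static `size` to work under jit, and it pads
# the output with `fill_value`, so code that consumes it (e.g., counting
# selected prompt keys in L2P-style models) has to handle the padding.
j = jnp.unique(jnp.array(x), size=5, fill_value=-1)  # [1, 2, 3, -1, -1]

print(t, j)
```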

@JH-LEE-KR (Owner) commented

Hi,
Thanks for the comment.

I don't know for sure because I haven't tested ImageNet-R yet.
As of now, the problems I suspect are as follows.

  1. Did you match the dataset transform to the official code?
    In my experience, while experimenting with Split CIFAR-100, I found that there was a large difference in performance depending on the transform or augmentation.

  2. Did you set args.unscale_lr to True? (See the sketch after this list.)
    The official code scales the learning rate according to the number of GPUs, but my code does not do this by default.

  3. (Regarding your "more questions".) I've tried running on a single GPU, and also on 8 GPUs.
    Did an out-of-memory issue occur?
    If so, please refer to the link.
    It's the solution I found.
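For reference, a hedged sketch of the DeiT-style linear LR scaling in question (the base divisor and flag wiring here are illustrative assumptions, not copied from either repo):

```python
# Hypothetical sketch of GPU-count-aware learning-rate scaling.
# The base divisor (256) mirrors common DeiT-style setups; check the
# actual training scripts before relying on these exact numbers.
def effective_lr(base_lr: float, batch_size: int, world_size: int,
                 unscale_lr: bool) -> float:
    if unscale_lr:
        return base_lr                       # use the configured LR as-is
    global_batch = batch_size * world_size   # total samples per optimizer step
    return base_lr * global_batch / 256.0    # linear scaling rule

# On 8 GPUs the scaled LR is 8x the single-GPU value, which alone can
# move final accuracy by several points.
print(effective_lr(1e-3, 128, 8, unscale_lr=False))  # 0.004
print(effective_lr(1e-3, 128, 1, unscale_lr=False))  # 0.0005
```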

Please reply if you need further discussion.

Best,
Jaeho Lee.

@zhaoedf (Author) commented Nov 4, 2022

  1. Regarding the data transform, I looked into the official code, and the data augmentations I found for ImageNet-R are as follows (see the screenshot and the sketch below); I think this is rather weak augmentation and can't get any weaker.
    [screenshot: the ImageNet-R transform code from the official repo]
  2. I use a single GPU now, but I will look into this. Honestly, I don't think this could cause the 10% performance increase.
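For concreteness, a hedged torchvision sketch of what such a minimal, "weak" pipeline typically looks like (the exact ops and image sizes are assumptions, not a transcription of the screenshot):

```python
# Hypothetical minimal ImageNet-R transforms; an illustration of weak
# augmentation, not a copy of the official code.
from torchvision import transforms

train_transform = transforms.Compose([
    transforms.RandomResizedCrop(224),   # the only spatial augmentation
    transforms.RandomHorizontalFlip(),   # plus a horizontal flip
    transforms.ToTensor(),
])

eval_transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])
```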

Thanks for your quick reply!

@JH-LEE-KR (Owner) commented

Sorry for the delayed reply.

I think I implemented the transformation and augmentation in the same way as the official code.
I don't know where the 10% performance improvement came from.

However, I have now implemented the Split ImageNet-R code and achieved reasonable results, similar to the paper and to my reproduction of the official code.

I updated the code and README, so please check them.

Please feel free to discuss.

Best,
Jaeho Lee.

@zhaoedf (Author) commented Nov 20, 2022

Thanks a lot! I followed the link about JAX above and successfully ran the official code.

@jcy132 commented Feb 12, 2023

What was the main factor behind the 10% improvement for ImageNet-R? What change led to results similar to the paper?
