clarification on augreg2 models #2420
Hi,
@mueller-mp yes, I re-did the fine-tune from the original in21k checkpoint, mostly to show lucas that they could be better :) The biggest difference was that these fine-tunes used the timm scripts & augmentations (the original pretrained & fine-tuned models used the Google JAX train code). Using layer-wise LR decay was the biggest single hparam change; I'll see if I still have those config files somewhere...
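For context, layer-wise LR decay scales each parameter group's learning rate geometrically with depth, so early (more general) layers move less than the head during fine-tuning. A minimal sketch of how the per-group scales are typically computed (a generic illustration, not timm's exact implementation):

```python
def layer_wise_lr_scales(num_layers, decay=0.75):
    """Return num_layers + 1 LR multipliers, increasing geometrically
    toward 1.0: index 0 is the earliest group (e.g. patch embedding),
    the last index is the final group (e.g. head) at full LR."""
    return [decay ** (num_layers - i) for i in range(num_layers + 1)]

# Example: a ViT with 12 transformer blocks
scales = layer_wise_lr_scales(12, decay=0.75)

# The head group trains at the full base LR, the earliest group
# at 0.75 ** 12 of it.
base_lr = 1e-3
per_group_lrs = [base_lr * s for s in scales]
```

In practice each scaled LR would be assigned to the corresponding PyTorch optimizer parameter group; timm's training script exposes this via a layer-decay option, so no manual wiring is needed there.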