
SigLIP impl #634

Merged
rwightman merged 10 commits into main from siglip on Sep 22, 2023
Conversation

@rwightman (Collaborator) commented Sep 15, 2023:

Re #618

@rwightman changed the title from "Initial SigLIP impl" to "SigLIP impl" on Sep 16, 2023
@lucasb-eyer commented:

Can't comment on the distributed part of the code as I don't know that part of PyTorch, but the rest (loss details, bias/temp/inits) LGTM.
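For context, the loss details lucasb-eyer is signing off on here are those of the pairwise sigmoid loss from the SigLIP paper (Zhai et al., 2023), with a learnable temperature and bias. A minimal single-device sketch for reference (illustrative names, not the PR's actual code):

```python
import torch
import torch.nn.functional as F

def sigmoid_loss(image_features, text_features, logit_scale, logit_bias):
    """Pairwise sigmoid loss, single device (no chunking).

    Positive pairs sit on the diagonal; every other pair is a negative.
    """
    logits = logit_scale.exp() * image_features @ text_features.T + logit_bias
    n = logits.shape[0]
    labels = 2 * torch.eye(n, device=logits.device) - 1  # +1 diag, -1 elsewhere
    return -F.logsigmoid(labels * logits).sum() / n

# Paper inits: temperature t = exp(t') starts at 10, bias starts at -10
logit_scale = torch.nn.Parameter(torch.log(torch.tensor(10.0)))
logit_bias = torch.nn.Parameter(torch.tensor(-10.0))
```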

@rwightman (Collaborator, Author) commented:

@lucasb-eyer thanks for taking a look, yeah the dist part is where a lot of the risk is, but it seems to be behaving on local cc12m runs comparing single-GPU to 4x GPU.
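The risky distributed piece is the chunked formulation: instead of materializing the full global similarity matrix, each rank rotates text features around a ring and accumulates the loss against every remote chunk. A simplified sketch of that pattern (hypothetical helper names; the PR's actual implementation also routes gradients through the exchange and offers a bidirectional variant):

```python
import torch
import torch.distributed as dist
import torch.nn.functional as F

def ring_exchange(tensor, rank, world_size):
    # One unidirectional ring step: send our chunk right, receive from the left.
    recv = torch.empty_like(tensor)
    ops = [
        dist.P2POp(dist.isend, tensor.contiguous(), (rank + 1) % world_size),
        dist.P2POp(dist.irecv, recv, (rank - 1) % world_size),
    ]
    for work in dist.batch_isend_irecv(ops):
        work.wait()
    return recv

def chunked_sigmoid_loss(img, txt, logit_scale, logit_bias, rank, world_size):
    n = img.shape[0]
    # Local chunk: positives on the diagonal, negatives elsewhere.
    labels = 2 * torch.eye(n, device=img.device) - 1
    logits = logit_scale.exp() * img @ txt.T + logit_bias
    loss = -F.logsigmoid(labels * logits).sum() / n
    # Rotate text chunks around the ring; every remote pair is a negative.
    remote_txt = txt
    for _ in range(world_size - 1):
        remote_txt = ring_exchange(remote_txt, rank, world_size)
        logits = logit_scale.exp() * img @ remote_txt.T + logit_bias
        loss = loss - F.logsigmoid(-logits).sum() / n
    return loss
```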

@lucasb-eyer commented:

FYI: in our code, Basil implemented a small unit test checking the chunked vs non-chunked formulations for "almost-equalness"; this gave us good reassurance in the implementation (along with looking at the profiler for memory use).
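Such a test can be approximated single-process by chunking a batch, simulating the ring rotation, and checking the result against the full N×N loss; a sketch along those lines (hypothetical names, not the test from Basil's code):

```python
import torch
import torch.nn.functional as F

def full_loss(img, txt, scale, bias):
    n = img.shape[0]
    labels = 2 * torch.eye(n) - 1  # +1 on the global diagonal
    return -F.logsigmoid(labels * (scale * img @ txt.T + bias)).sum() / n

def simulated_chunked_loss(img_chunks, txt_chunks, scale, bias):
    # Stand-in for the distributed exchange: "rank" r sees its own text chunk
    # first, then each remote chunk in ring order (remote pairs are all negatives).
    world = len(img_chunks)
    total = 0.0
    for r in range(world):
        img, m = img_chunks[r], img_chunks[r].shape[0]
        for step in range(world):
            txt = txt_chunks[(r + step) % world]
            logits = scale * img @ txt.T + bias
            labels = (2 * torch.eye(m) - 1) if step == 0 else -torch.ones(m, m)
            total = total - F.logsigmoid(labels * logits).sum() / m
    return total / world  # mean of per-rank losses matches the full loss

def test_chunked_matches_full():
    torch.manual_seed(0)
    img = F.normalize(torch.randn(16, 8), dim=-1)
    txt = F.normalize(torch.randn(16, 8), dim=-1)
    full = full_loss(img, txt, scale=10.0, bias=-10.0)
    chunked = simulated_chunked_loss(img.chunk(4), txt.chunk(4), scale=10.0, bias=-10.0)
    torch.testing.assert_close(chunked, full)
```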

@rwightman (Collaborator, Author) commented Sep 22, 2023:

I've tested:

  • convnext_tiny on cc12m: old InfoNCE run vs new SigLIP (36.13 vs 36.46 zero-shot IN1k, 4 GPU)
  • initial convergence with SigLIP + grad accum enabled
  • initial convergence of custom text and original CLIP models w/o SigLIP (original CLIP InfoNCE loss)
  • initial convergence with both bidirectional and unidirectional exchange
  • validating several existing models

Will merge shortly to prevent this from getting stale.

@rwightman merged commit a6a80c4 into main on Sep 22, 2023
5 checks passed
@rwightman deleted the siglip branch on September 22, 2023 at 19:17
Interpause pushed a commit to Interpause/open_clip that referenced this pull request on May 23, 2024:
* Initial SigLIP impl

* Add logit_bias to custom text clip

* non-dict model output wrong way around wrt logit_bias

* Disable dividing loss by world size, better without

* A bit of cleanup

* Add bidirectional exchange option, more cleanup

* Add reference in siglip docstring

* Remove some comments after further verification

* bidir exchange by default

* Proper bidir default