The meaning of "samples seen scales" #446
-
Hi @yuezewang, it refers to the number of image-text pairs seen by the model during training. For example, if a model has a batch size of 100k and trains for 100k steps, it "sees" 10 billion samples during training. This number can be scaled up and down; in the paper, we have experiments ranging from 3 billion to 34 billion samples seen.
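The arithmetic in the reply above can be sketched as follows (a minimal example; the batch size and step count are the illustrative numbers from the reply, not actual training settings):

```python
# Samples seen = global batch size * number of optimizer steps.
batch_size = 100_000  # image-text pairs processed per step
steps = 100_000       # total training steps

samples_seen = batch_size * steps
print(samples_seen)   # 10000000000, i.e. 10 billion samples seen
```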
-
Thank you for the kind answer! Does this mean one needs to repeatedly sample a pre-defined number of samples (3B / 13B / 34B) from the original dataset in order to construct a "new" dataset? Or should the number of steps simply accumulate across epochs of training until the pre-defined number of samples (3B / 13B / 34B) is reached?
-
@yuezewang, you don't need to explicitly construct a new dataset. For example, if you train on a dataset with 300M samples for 10 epochs, that will be 3B samples seen. Does this make sense?
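The equivalent epoch-based view from this reply, as a minimal sketch (dataset size and epoch count are the example numbers from the reply):

```python
# Equivalent view: samples seen = dataset size * number of epochs,
# since each epoch passes over every sample once.
dataset_size = 300_000_000  # e.g. a dataset of 300M image-text pairs
epochs = 10

samples_seen = dataset_size * epochs
print(samples_seen)         # 3000000000, i.e. 3 billion samples seen
```

Either formulation gives the same target: you pick a samples-seen budget and train until the product reaches it, without materializing a resampled dataset.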
-
Hi~ May I know the meaning of "samples seen scales" in the OpenCLIP paper (Reproducible scaling laws for contrastive language-image learning)?