The meaning of "samples seen scales" #446
-
Hi @yuezewang, it refers to the number of image-text pairs seen by the model during training. For example, if a model has a batch size of 100k and trains for 100k steps, it "sees" 10 billion samples during training. This number can be scaled up and down; in the paper, we have experiments ranging from 3 billion to 34 billion samples seen.
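The arithmetic in the reply above can be sketched as follows (a minimal example; the batch size and step count are the illustrative numbers from the reply, not actual training settings):

```python
# Samples seen = global batch size * number of optimizer steps.
batch_size = 100_000  # image-text pairs processed per step
steps = 100_000       # total training steps

samples_seen = batch_size * steps
print(samples_seen)   # 10000000000, i.e. 10 billion samples seen
```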
-
Thank you for the kind answer! Does this mean one needs to repeatedly sample a pre-defined number of samples (3B / 13B / 34B) from the original dataset in order to construct a "new" dataset? Or should the number of steps simply accumulate across epochs of training until the pre-defined number of samples (3B / 13B / 34B) is reached?
-
@yuezewang, you don't need to explicitly construct a new dataset. For example, if you train on a dataset with 300M samples for 10 epochs, that will be 3B samples seen. Does this make sense?
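The equivalent epoch-based view from this reply, as a minimal sketch (dataset size and epoch count are the example numbers from the reply):

```python
# Equivalent view: samples seen = dataset size * number of epochs,
# since each epoch passes over every sample once.
dataset_size = 300_000_000  # e.g. a dataset of 300M image-text pairs
epochs = 10

samples_seen = dataset_size * epochs
print(samples_seen)         # 3000000000, i.e. 3 billion samples seen
```

Either formulation gives the same target: you pick a samples-seen budget and train until the product reaches it, without materializing a resampled dataset.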
-
Hi~ May I know the meaning of "samples seen scales" in the OpenCLIP paper (Reproducible scaling laws for contrastive language-image learning)?