Add optional flag in the text model config to return embeddings for all tokens in sequence instead of just the EOS embedding. #456

mkaic · 2023-03-02T23:37:58Z

This is a small quality-of-life PR that adds one optional parameter to the text model config.

Currently, there is no way to have the CLIP text model return its full output tensor of shape (batch, seq_length, embed_dim). That output is quite useful if you want to use your model as part of, say, a Stable Diffusion training run.

I've added an optional parameter, `return_all_embeddings`, that can be included in the `text_cfg` section of a model config. It defaults to `False`, so this pull request doesn't change any default behavior. If it's set to `True`, the model returns the output vectors for all tokens in its input, not just the one for the EOS (end-of-text) token.
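For anyone skimming, here is a rough sketch of the behavior described above. It is not the code in this PR: it uses NumPy instead of the model's real tensors, `select_text_output` is a hypothetical helper, and every config key other than `return_all_embeddings` is a placeholder value. It also assumes the EOS token has the highest id in each sequence, so its position can be found with an argmax over the token ids.

```python
import numpy as np

# Hypothetical config fragment. Only `return_all_embeddings` comes from this
# PR's description; the other keys and values are illustrative placeholders.
model_cfg = {
    "embed_dim": 8,
    "text_cfg": {
        "context_length": 6,
        "vocab_size": 100,
        "return_all_embeddings": True,
    },
}


def select_text_output(token_embeddings, token_ids, return_all_embeddings=False):
    """Pick what the text tower returns (hypothetical helper).

    token_embeddings: array of shape (batch, seq_length, embed_dim)
    token_ids:        integer array of shape (batch, seq_length)
    """
    if return_all_embeddings:
        # Flag set: hand back one vector per input token.
        return token_embeddings
    # Default: keep only the vector at the EOS position of each sequence.
    # Assumption: the EOS token has the highest id in every sequence.
    eos_positions = token_ids.argmax(axis=-1)
    batch_index = np.arange(token_embeddings.shape[0])
    return token_embeddings[batch_index, eos_positions]


# Toy inputs: 2 sequences, 6 tokens each, 8-dim embeddings; id 99 plays EOS.
embeddings = np.arange(2 * 6 * 8, dtype=np.float32).reshape(2, 6, 8)
token_ids = np.array([[98, 5, 7, 99, 0, 0],
                      [98, 3, 99, 0, 0, 0]])

pooled = select_text_output(embeddings, token_ids)
per_token = select_text_output(
    embeddings,
    token_ids,
    return_all_embeddings=model_cfg["text_cfg"]["return_all_embeddings"],
)
print(pooled.shape, per_token.shape)
```

With the flag off, the result has shape (batch, embed_dim); with it on, the result keeps the sequence dimension, which is the form a per-token consumer would need.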

mkaic added 2 commits March 2, 2023 23:24

allow for returning embeddings for all timesteps instead of just EOT

b133c1d

undo accidental format changes

b54b209

mkaic marked this pull request as draft March 3, 2023 03:18

mkaic marked this pull request as ready for review March 10, 2023 21:11

Merge branch 'main' into return_all_embeddings

83f981f
