[T5 1.1] Enable v1.1 Presets #1948

Merged 10 commits on Oct 30, 2024


DavidLandup0
Collaborator

@DavidLandup0 DavidLandup0 commented Oct 22, 2024

It turns out the architecture already supported T5 1.1 (the gated activations were in place), but weight conversion and presets only covered the vanilla T5 models.

This PR updates the conversion script to include the T5 1.1 variants:

  • google/t5-v1_1-small
  • google/t5-v1_1-base
  • google/t5-v1_1-large
  • google/t5-v1_1-xl
  • google/t5-v1_1-xxl
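
For reference, the relationship between the Hugging Face checkpoints above and the KerasHub preset names can be sketched as follows. Only `t5_1.1_small` appears in this PR description; the other preset names and the helper `hf_id_to_preset` are assumptions that follow the same pattern, not a copy of the conversion script.

```python
# Hypothetical helper: derive a KerasHub-style preset name from a Hugging Face
# T5 1.1 checkpoint id. Only "t5_1.1_small" is confirmed by this PR; the other
# names are assumed to follow the same naming pattern.
HF_CHECKPOINTS = [
    "google/t5-v1_1-small",
    "google/t5-v1_1-base",
    "google/t5-v1_1-large",
    "google/t5-v1_1-xl",
    "google/t5-v1_1-xxl",
]

HF_PREFIX = "google/t5-v1_1-"


def hf_id_to_preset(hf_id: str) -> str:
    """Map e.g. 'google/t5-v1_1-small' to 't5_1.1_small'."""
    if not hf_id.startswith(HF_PREFIX):
        raise ValueError(f"Not a T5 1.1 checkpoint id: {hf_id!r}")
    size = hf_id[len(HF_PREFIX):]
    return f"t5_1.1_{size}"


# Lookup table from Hugging Face id to (assumed) preset name.
PRESET_MAP = {hf_id: hf_id_to_preset(hf_id) for hf_id in HF_CHECKPOINTS}
```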

For example:

import keras_hub

t5_small = keras_hub.models.T5Backbone.from_preset("t5_1.1_small")
tokenizer = keras_hub.models.T5Tokenizer.from_preset("t5_1.1_small")

It also updates the conversion script to use save_to_preset(), fixes assertions that raised exceptions, and saves the tokenizer alongside the model.
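
A rough sketch of what saving with save_to_preset() looks like, assuming the backbone and tokenizer are written into the same local directory. The function name `save_and_reload` and the output directory are hypothetical, and the actual conversion script builds the model from the Hugging Face weights rather than from an existing preset.

```python
def save_and_reload(preset: str, out_dir: str):
    """Save a T5 backbone and tokenizer to one local preset dir, then reload both.

    Hypothetical helper for illustration; not part of the conversion script.
    """
    # Imported lazily so that defining this function does not require KerasHub.
    import keras_hub

    backbone = keras_hub.models.T5Backbone.from_preset(preset)
    tokenizer = keras_hub.models.T5Tokenizer.from_preset(preset)

    # Both calls write into the same directory, so the saved preset carries
    # the model config and weights together with the tokenizer assets.
    backbone.save_to_preset(out_dir)
    tokenizer.save_to_preset(out_dir)

    # A local directory can be passed to from_preset() in place of a preset name.
    reloaded_backbone = keras_hub.models.T5Backbone.from_preset(out_dir)
    reloaded_tokenizer = keras_hub.models.T5Tokenizer.from_preset(out_dir)
    return reloaded_backbone, reloaded_tokenizer
```

Calling `save_and_reload("t5_1.1_small", "./t5_1.1_small_local")` would download the preset first, so it needs network access and KerasHub installed.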

Numerical Equivalence

small = keras_hub.models.T5Backbone.from_preset("t5_1.1_small")
keras_tokenizer = keras_hub.models.T5Tokenizer.from_preset("t5_1.1_small")

Behaves equivalently to:

import transformers

hf_tokenizer = transformers.AutoTokenizer.from_pretrained("google/t5-v1_1-small")
hf_model = transformers.T5ForConditionalGeneration.from_pretrained("google/t5-v1_1-small")

PCA on flattened outputs running on the same input:

[image: PCA plot of the flattened outputs]
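
The PR does not say how the comparison or the plot was produced, so the following is only a sketch of one way to check equivalence: a max-absolute-difference test plus a shared 2D PCA projection computed with NumPy's SVD. The random arrays stand in for real model outputs, and `compare_outputs`, `pca_2d`, and the tolerance are assumptions.

```python
import numpy as np


def compare_outputs(keras_out, hf_out, atol=1e-4):
    """Return (max_abs_diff, is_close) for two output arrays of the same shape."""
    a = np.asarray(keras_out, dtype=np.float64)
    b = np.asarray(hf_out, dtype=np.float64)
    if a.shape != b.shape:
        raise ValueError(f"Shape mismatch: {a.shape} vs {b.shape}")
    max_abs_diff = float(np.max(np.abs(a - b)))
    return max_abs_diff, bool(np.allclose(a, b, atol=atol, rtol=0.0))


def pca_2d(points):
    """Project the rows of `points` onto their first two principal components."""
    x = np.asarray(points, dtype=np.float64)
    x = x - x.mean(axis=0, keepdims=True)
    # Rows of vt are the principal directions, ordered by singular value.
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    return x @ vt[:2].T


# Stand-in data (NOT real model outputs): two nearly identical arrays of
# shape (tokens, hidden_dim), mimicking a Keras output and a Hugging Face output.
rng = np.random.default_rng(0)
keras_out = rng.normal(size=(8, 16))
hf_out = keras_out + 1e-6 * rng.normal(size=(8, 16))

max_diff, close = compare_outputs(keras_out, hf_out)

# Stack both outputs so they are projected with one shared PCA basis;
# equivalent models should land on (almost) the same 2D points.
projected = pca_2d(np.concatenate([keras_out, hf_out], axis=0))
keras_2d, hf_2d = projected[:8], projected[8:]
```

With real models, `keras_out` and `hf_out` would be the hidden states produced for the same tokenized input, flattened to two dimensions before being passed in.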

Notes

The XXL version (11B params, 44GB for weights) is too large to run on consumer hardware, so I can't run the conversion script on it. I'll get the XL weights up on Kaggle as soon as the download is finished.
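
As a sanity check on the 44GB figure: it matches 11B parameters stored as 4-byte float32 values, counted in decimal gigabytes. The dtype is an assumption here (the PR does not state it), and `weight_size_gb` is just an illustrative helper.

```python
def weight_size_gb(num_params: int, bytes_per_param: int = 4) -> float:
    """Approximate weight size in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


xxl_params = 11_000_000_000  # 11B parameters, as quoted in the note above

fp32_gb = weight_size_gb(xxl_params)      # 4 bytes per parameter (assumed float32)
fp16_gb = weight_size_gb(xxl_params, 2)   # 2 bytes per parameter, for comparison

print(f"float32: {fp32_gb:.0f} GB, float16: {fp16_gb:.0f} GB")
```

This only counts the stored weights; running the conversion needs more memory than that, since both the source and converted models have to be held at some point.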

/cc @divyashreepathihalli

@divyashreepathihalli divyashreepathihalli added the kokoro:force-run Runs Tests on GPU label Oct 24, 2024
@kokoro-team kokoro-team removed the kokoro:force-run Runs Tests on GPU label Oct 24, 2024
@divyashreepathihalli
Collaborator

@DavidLandup0 the GPU tests are failing, can you please take a look?

@DavidLandup0
Collaborator Author

DavidLandup0 commented Oct 29, 2024

@DavidLandup0 the GPU tests are failing, can you please take a look?

@divyashreepathihalli - Fixed - there was a missing Kaggle link for the XL preset

@divyashreepathihalli divyashreepathihalli added the kokoro:force-run Runs Tests on GPU label Oct 29, 2024
@kokoro-team kokoro-team removed the kokoro:force-run Runs Tests on GPU label Oct 29, 2024
@divyashreepathihalli divyashreepathihalli merged commit bd57aed into keras-team:master Oct 30, 2024
10 checks passed