
From pretrained normally? #64

Open
johnml1135 opened this issue Nov 20, 2023 · 3 comments

@johnml1135
Collaborator

So, it takes 4 minutes to build all the weights, even for a 600 MB distilled model, on my RTX 3090. If I am correct (I may not be), we should be able to cache checkpoints at position 0 for the NLLB models, which could dramatically reduce that startup time. That would be very helpful for debugging quick builds and for running E2E testing. I am unsure exactly what code change to make, but the idea would be something like:

  • in hugging_face_model_trainer, when about to create the Seq2SeqTrainer:
    • First, check whether the model is a string (is this how it comes in when starting fresh?)
    • If so, check whether there is a cached version of the model in a cache folder.
    • If there isn't, create the model and save it to the cache.
    • Keep going.
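The steps above could be sketched roughly as below. This is only an illustration of the cache-or-build flow, not the actual trainer code: `load_or_build`, `build_fn`, and the pickle-based cache are all hypothetical stand-ins (a real implementation would presumably use `save_pretrained`/`from_pretrained` on a local directory instead of pickling).

```python
import os
import pickle

def load_or_build(model, cache_dir, build_fn):
    """Return the model, using an on-disk cache of the freshly
    initialized checkpoint to skip the slow build on later runs.

    `model` may already be an instantiated model object (pass it
    through) or a string ID such as "facebook/nllb-200-distilled-600M".
    `build_fn` stands in for the slow path, e.g. from_pretrained.
    """
    if not isinstance(model, str):
        return model  # already instantiated; nothing to do

    os.makedirs(cache_dir, exist_ok=True)
    # One cache entry per model type (e.g. per NLLB variant),
    # shared across projects rather than duplicated per project.
    path = os.path.join(cache_dir, model.replace("/", "--") + ".pkl")

    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)  # fast path: reuse the cached checkpoint

    built = build_fn(model)  # slow path: build all the weights once
    with open(path, "wb") as f:
        pickle.dump(built, f)
    return built
```

Keying the cache file on the model ID alone is what would make it one cache per NLLB model type rather than one per project.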

I am unsure whether there would need to be a separate cached version for each project (undesirable) or whether there could be one per NLLB model type.

I could be going about this wrong, but I saw some things in the codebase that looked similar to these ideas, and nothing that was a slam dunk.

@johnml1135
Collaborator Author

@ddaspit do you have any insight into this? It could dramatically reduce the "10 step" build time from 6 minutes to 2 minutes.

@ddaspit
Contributor

ddaspit commented Dec 1, 2023

I have no idea if this is possible. I am not aware of a way to do this in Hugging Face or PyTorch. I think we would need to do more investigation to determine the exact cause of the long startup time.

@johnml1135
Collaborator Author

This may be of help - huggingface/transformers#21913.
