
Deploy error for Llama-3.2-vision-11B: "Sharded is not supported for AutoModel" #2571

Open

xuan1905 opened this issue Sep 26, 2024 · 4 comments

@xuan1905

System Info

Hi Team,
When deploying the model on AWS with huggingface-pytorch-tgi-inference:2.3.0-tgi2.2.0, I get the error above ("Sharded is not supported for AutoModel").
Could you tell me when TGI will provide a new image? Is there any way to work around the issue in the meantime?

Information

  • Docker
  • The CLI directly

Tasks

  • An officially supported command
  • My own modifications

Reproduction

Run the image huggingface-pytorch-tgi-inference:2.3.0-tgi2.2.0 on SageMaker and deploy Llama 3.2 Vision 11B.
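
For concreteness, a minimal sketch of the failing deployment via the SageMaker Python SDK. The role ARN, ECR region, model ID, and instance type here are illustrative assumptions, not taken from the original report:

```python
# Sketch of the failing deployment (role ARN, region, model ID, and
# instance type are placeholders / assumptions).
from sagemaker.huggingface import HuggingFaceModel

model = HuggingFaceModel(
    role="arn:aws:iam::<account-id>:role/<sagemaker-execution-role>",  # placeholder
    # DLC bundling TGI 2.2.0 -- the version that fails on mllama models
    image_uri=(
        "763104351884.dkr.ecr.us-east-1.amazonaws.com/"
        "huggingface-pytorch-tgi-inference:2.3.0-tgi2.2.0"
    ),
    env={
        "HF_MODEL_ID": "meta-llama/Llama-3.2-11B-Vision-Instruct",  # assumed checkpoint
        "SM_NUM_GPUS": "4",        # >1 GPU makes TGI shard, which triggers the error
        "HF_TOKEN": "<hf-token>",  # the model is gated on the Hub
    },
)

# Container startup fails with: "Sharded is not supported for AutoModel"
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.12xlarge",  # placeholder multi-GPU instance
)
```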

Expected behavior

TGI deploys the Llama 3.2 model successfully.

@dossjjx commented Sep 27, 2024

Same issue here with the 90B model. Number of shards: 4.

@xuan1905 (author) commented Oct 5, 2024

Is there any update?

@renambot commented Oct 7, 2024

TGI v2.3.1 now works with Llama 3.2 Vision (mllama models).
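
If it helps, a sketch of pointing the SageMaker SDK at a newer Hugging Face LLM DLC once one ships with TGI ≥ 2.3.1. The version string "2.3.1" is an assumption about what AWS will publish; check the available versions in your region first:

```python
# Sketch: resolve a TGI >= 2.3.1 image once AWS publishes the DLC
# (the "2.3.1" version string is an assumption -- verify availability).
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

image_uri = get_huggingface_llm_image_uri("huggingface", version="2.3.1")

model = HuggingFaceModel(
    role="arn:aws:iam::<account-id>:role/<sagemaker-execution-role>",  # placeholder
    image_uri=image_uri,
    env={
        "HF_MODEL_ID": "meta-llama/Llama-3.2-11B-Vision-Instruct",  # assumed checkpoint
        "SM_NUM_GPUS": "4",
        "HF_TOKEN": "<hf-token>",
    },
)
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.12xlarge",  # placeholder multi-GPU instance
)
```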

@xuan1905 (author) commented Oct 8, 2024

Great, thanks. Is it available in the AWS Deep Learning Container images?
