TEI failed to serve fine-tuned bge-m3 model #385

Open
KCFindstr opened this issue Aug 15, 2024 · 0 comments
System Info

Tested with TEI 1.2, 1.4, and latest (ghcr.io/huggingface/text-embeddings-inference:cuda-latest)
OS: Docker on Debian 12
Model: dophys/bge-m3_finetuned
Hardware: 1x NVIDIA L4

Information

  • [x] Docker
  • [ ] The CLI directly

Tasks

  • [x] An officially supported command
  • [ ] My own modifications

Reproduction

```bash
#!/bin/bash
IMAGE="ghcr.io/huggingface/text-embeddings-inference:cuda-latest"
MODEL=dophys/bge-m3_finetuned

docker pull "$IMAGE"

# Serve the model on GPU 0; TEI listens on 8080 inside the container,
# mapped to 7080 on the host.
docker run \
  --shm-size=1G \
  -e NVIDIA_VISIBLE_DEVICES=0 \
  -e MODEL_ID="$MODEL" \
  -e JSON_OUTPUT=true \
  -e PORT=8080 \
  -p 7080:8080 \
  --runtime=nvidia \
  "$IMAGE"
```

Got this error:

```
Error: Could not create backend

Caused by:
    Could not start backend: cannot find tensor embeddings.word_embeddings.weight
```

(With TEI 1.2 and 1.4, startup fails earlier with a different error about tokenizer.json.)
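
The error suggests the tensor names in the uploaded checkpoint don't match what TEI expects for a bge-m3 (XLM-RoBERTa) model. A minimal triage sketch, assuming the repo ships at least one .safetensors file, to list the tensor names the backend would see (the repo id is from this report; `embeddings.word_embeddings.weight` is the name the error says TEI is looking for):

```bash
pip install -q huggingface_hub safetensors numpy
python - <<'EOF'
from huggingface_hub import hf_hub_download, list_repo_files
from safetensors import safe_open

repo = "dophys/bge-m3_finetuned"
# Find whatever .safetensors files the repo actually ships.
files = [f for f in list_repo_files(repo) if f.endswith(".safetensors")]
print("safetensors files:", files)

# Print the tensor names; check whether embeddings.word_embeddings.weight
# (the tensor TEI reports as missing) is present under a different prefix.
path = hf_hub_download(repo, files[0])
with safe_open(path, framework="np") as f:
    for name in sorted(f.keys()):
        print(name)
EOF
```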

Expected behavior

Expected the model to be served successfully, since its base model BAAI/bge-m3 can be served with TEI and the fine-tuned model card carries the text-embeddings-inference tag.
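
A possible workaround in the meantime (untested; it assumes the weights themselves are intact and only the serialization differs from the base model): re-export the checkpoint with transformers, which writes canonical tensor names and regenerates tokenizer.json, then point TEI at the local copy:

```bash
python - <<'EOF'
from transformers import AutoModel, AutoTokenizer

# Re-save the fine-tuned checkpoint; save_pretrained writes standard
# tensor names, and the fast tokenizer writes a fresh tokenizer.json.
model = AutoModel.from_pretrained("dophys/bge-m3_finetuned")
tokenizer = AutoTokenizer.from_pretrained("dophys/bge-m3_finetuned")
model.save_pretrained("./bge-m3_reexport", safe_serialization=True)
tokenizer.save_pretrained("./bge-m3_reexport")
EOF

# Mount the re-exported copy and serve it by local path instead of repo id
# (same invocation as the repro, with MODEL_ID pointing into the mount).
docker run \
  --shm-size=1G \
  -e NVIDIA_VISIBLE_DEVICES=0 \
  -e MODEL_ID=/data/bge-m3_reexport \
  -e PORT=8080 \
  -p 7080:8080 \
  -v "$PWD/bge-m3_reexport:/data/bge-m3_reexport" \
  --runtime=nvidia \
  ghcr.io/huggingface/text-embeddings-inference:cuda-latest
```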
