TEI failed to serve fine-tuned bge-m3 model #385

Open
KCFindstr opened this issue Aug 15, 2024 · 0 comments
System Info

Tested with TEI 1.2, 1.4, and latest (ghcr.io/huggingface/text-embeddings-inference:cuda-latest)
OS: Docker on Debian 12
Model: dophys/bge-m3_finetuned
Hardware: 1x NVIDIA L4

Information

  • [x] Docker
  • [ ] The CLI directly

Tasks

  • [x] An officially supported command
  • [ ] My own modifications

Reproduction

```bash
#!/bin/bash
IMAGE="ghcr.io/huggingface/text-embeddings-inference:cuda-latest"
MODEL=dophys/bge-m3_finetuned

docker pull "$IMAGE"

# Serve the model on GPU 0; TEI listens on 8080 inside the container,
# mapped to 7080 on the host.
docker run \
  --shm-size=1G \
  -e NVIDIA_VISIBLE_DEVICES=0 \
  -e MODEL_ID="$MODEL" \
  -e JSON_OUTPUT=true \
  -e PORT=8080 \
  -p 7080:8080 \
  --runtime=nvidia \
  "$IMAGE"
```

Got this error:

```
Error: Could not create backend

Caused by:
    Could not start backend: cannot find tensor embeddings.word_embeddings.weight
```

(With TEI 1.2 and 1.4, startup fails earlier with a different error about tokenizer.json.)
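
The error suggests the tensor names in the uploaded checkpoint don't match what TEI expects for a bge-m3 (XLM-RoBERTa) model. A minimal triage sketch, assuming the repo ships at least one .safetensors file, to list the tensor names the backend would see (the repo id is from this report; `embeddings.word_embeddings.weight` is the name the error says TEI is looking for):

```bash
pip install -q huggingface_hub safetensors numpy
python - <<'EOF'
from huggingface_hub import hf_hub_download, list_repo_files
from safetensors import safe_open

repo = "dophys/bge-m3_finetuned"
# Find whatever .safetensors files the repo actually ships.
files = [f for f in list_repo_files(repo) if f.endswith(".safetensors")]
print("safetensors files:", files)

# Print the tensor names; check whether embeddings.word_embeddings.weight
# (the tensor TEI reports as missing) is present under a different prefix.
path = hf_hub_download(repo, files[0])
with safe_open(path, framework="np") as f:
    for name in sorted(f.keys()):
        print(name)
EOF
```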

Expected behavior

Expected the model to be served successfully, since its base model BAAI/bge-m3 can be served with TEI and the fine-tuned model card carries the text-embeddings-inference tag.
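
A possible workaround in the meantime (untested; it assumes the weights themselves are intact and only the serialization differs from the base model): re-export the checkpoint with transformers, which writes canonical tensor names and regenerates tokenizer.json, then point TEI at the local copy:

```bash
python - <<'EOF'
from transformers import AutoModel, AutoTokenizer

# Re-save the fine-tuned checkpoint; save_pretrained writes standard
# tensor names, and the fast tokenizer writes a fresh tokenizer.json.
model = AutoModel.from_pretrained("dophys/bge-m3_finetuned")
tokenizer = AutoTokenizer.from_pretrained("dophys/bge-m3_finetuned")
model.save_pretrained("./bge-m3_reexport", safe_serialization=True)
tokenizer.save_pretrained("./bge-m3_reexport")
EOF

# Mount the re-exported copy and serve it by local path instead of repo id
# (same invocation as the repro, with MODEL_ID pointing into the mount).
docker run \
  --shm-size=1G \
  -e NVIDIA_VISIBLE_DEVICES=0 \
  -e MODEL_ID=/data/bge-m3_reexport \
  -e PORT=8080 \
  -p 7080:8080 \
  -v "$PWD/bge-m3_reexport:/data/bge-m3_reexport" \
  --runtime=nvidia \
  ghcr.io/huggingface/text-embeddings-inference:cuda-latest
```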
