Releases · huggingface/text-embeddings-inference
v1.5.1
What's Changed
- Download `model.onnx_data` by @kozistr in #343
- Rename 'Sentence Transformers' to 'sentence-transformers' in docstrings by @Wauplin in #342
- fix: add serde default for truncation direction by @drbh in #399
- fix: metrics unbounded memory by @OlivierDehaene in #409
- Fix to allow health check w/o auth by @kozistr in #360
- Update `ort` crate version to `2.0.0-rc.4` to support ONNX IR version 10 by @kozistr in #361
- adds curl to fix healthcheck by @WissamAntoun in #376
- fix: use num_cpus::get to check as get_physical does not check cgroups by @OlivierDehaene in #410
- fix: use status code 400 when batch is empty by @OlivierDehaene in #413
- fix: add cls pooling as default for BERT variants by @OlivierDehaene in #426
- feat: auto limit string if truncate is set by @OlivierDehaene in #428 (see the example below)
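Several of the fixes above touch the truncation behavior of the `/embed` route. The sketch below shows how a client might exercise them; the endpoint URL and the exact field names `truncate` and `truncation_direction` are assumptions inferred from the PR titles, not confirmed API details.

```python
import requests

# Hypothetical local TEI endpoint; adjust to your deployment.
TEI_URL = "http://localhost:8080"

# With `truncate` set, over-long inputs are automatically limited to the
# model's max sequence length (#428). The serde default fix (#399) means
# `truncation_direction` can also be omitted entirely.
resp = requests.post(
    f"{TEI_URL}/embed",
    json={
        "inputs": "A very long document that may exceed the model context...",
        "truncate": True,
        "truncation_direction": "Right",  # assumed field name and value
    },
)
resp.raise_for_status()
embedding = resp.json()[0]
print(len(embedding))  # embedding dimension

# An empty batch now yields HTTP 400 rather than a server error (#413).
bad = requests.post(f"{TEI_URL}/embed", json={"inputs": []})
print(bad.status_code)  # expected: 400
```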
New Contributors
- @Wauplin made their first contribution in #342
- @XciD made their first contribution in #345
- @WissamAntoun made their first contribution in #376
Full Changelog: v1.5.0...v1.5.1
v1.5.0
Notable Changes
- ONNX runtime for CPU deployments: greatly improves CPU deployment throughput
- Add `/similarity` route (example below)
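A sketch of calling the new `/similarity` route. The request schema below (a `source_sentence` scored against a list of `sentences`) is assumed to mirror the Hugging Face Inference API sentence-similarity task; the endpoint URL is a placeholder.

```python
import requests

TEI_URL = "http://localhost:8080"  # placeholder for your TEI deployment

# Assumed schema: one source sentence scored against several candidates.
payload = {
    "inputs": {
        "source_sentence": "What is deep learning?",
        "sentences": [
            "Deep learning is a subset of machine learning.",
            "The weather is nice today.",
        ],
    }
}

resp = requests.post(f"{TEI_URL}/similarity", json=payload)
resp.raise_for_status()
print(resp.json())  # e.g. one similarity score per candidate sentence
```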
What's Changed
- tokenizer max limit on input size by @ErikKaum in #324
- docs: air-gapped deployments by @OlivierDehaene in #326
- feat(onnx): add onnx runtime for better CPU perf by @OlivierDehaene in #328
- feat: add `/similarity` route by @OlivierDehaene in #331
- fix(ort): fix mean pooling by @OlivierDehaene in #332
- chore(candle): update flash attn by @OlivierDehaene in #335
- v1.5.0 by @OlivierDehaene in #336
New Contributors
- @ErikKaum made their first contribution in #324
Full Changelog: v1.4.0...v1.5.0
v1.4.0
Notable Changes
- Cuda support for the Qwen2 model architecture
What's Changed
- feat(candle): support Qwen2 on Cuda by @OlivierDehaene in #316
- fix(candle): fix last token pooling
Full Changelog: v1.3.0...v1.4.0
v1.3.0
Notable Changes
- New truncation direction parameter
- Cuda support for JinaCode model architecture
- Cuda support for Mistral model architecture
- Cuda support for Alibaba GTE model architecture
- New prompt name parameter: you can now pass a prompt name in the request body to prepend a pre-prompt to your input, based on the model's Sentence Transformers configuration. You can also set a default prompt or prompt name so that every request gets a pre-prompt. An example request is sketched below.
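A hedged sketch of the prompt name parameter: the field name `prompt_name`, the example prompt keys, and the endpoint URL are all assumptions based on the description above and on the Sentence Transformers `prompts` convention.

```python
import requests

TEI_URL = "http://localhost:8080"  # placeholder

# Assuming the model's sentence-transformers config defines prompts such as
# {"query": "query: ", "passage": "passage: "}, selecting a `prompt_name`
# prepends the matching pre-prompt to the input before embedding.
resp = requests.post(
    f"{TEI_URL}/embed",
    json={
        "inputs": "How do I deploy TEI on CPU?",
        "prompt_name": "query",  # assumed field name
    },
)
resp.raise_for_status()
print(resp.json()[0][:8])  # first few dimensions of the embedding
```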
What's Changed
- Ci migration to K8s by @glegendre01 in #269
- chore: map compute_cap from GPU name by @haixiw in #276
- chore: cover Nvidia T4/L4 GPU by @haixiw in #284
- feat(ci): add trufflehog secrets detection by @McPatate in #286
- Community contribution code of conduct by @LysandreJik in #291
- Update README.md by @michaelfeil in #277
- Upgrade tokenizers to 0.19.1 to deal with breaking change in tokenizers by @scriptator in #266
- Add env for OTLP service name by @kozistr in #285
- Fix CI build timeout by @fxmarty in #296
- fix(router): payload limit was not correctly applied by @OlivierDehaene in #298
- feat(candle): better cuda error by @OlivierDehaene in #300
- feat(router): add truncation direction parameter by @OlivierDehaene in #299
- Support for Jina Code model by @patricebechard in #292
- feat(router): add base64 encoding_format for OpenAI API by @OlivierDehaene in #301
- fix(candle): fix FlashJinaCodeModel by @OlivierDehaene in #302
- fix: use malloc_trim to cleanup pages by @OlivierDehaene in #307
- feat(candle): add FlashMistral by @OlivierDehaene in #308
- feat(candle): add flash gte by @OlivierDehaene in #310
- feat: add default prompts by @OlivierDehaene in #312
- Add optional CORS allow any option value in http server cli by @kir-gadjello in #260
- Update `HUGGING_FACE_HUB_TOKEN` to `HF_API_TOKEN` in README by @kevinhu in #263
- v1.3.0 by @OlivierDehaene in #313
New Contributors
- @haixiw made their first contribution in #276
- @McPatate made their first contribution in #286
- @LysandreJik made their first contribution in #291
- @michaelfeil made their first contribution in #277
- @scriptator made their first contribution in #266
- @fxmarty made their first contribution in #296
- @patricebechard made their first contribution in #292
- @kir-gadjello made their first contribution in #260
- @kevinhu made their first contribution in #263
Full Changelog: v1.2.3...v1.3.0
v1.2.3
What's Changed
- fix limit peak memory to build cuda-all docker image by @OlivierDehaene in #246
Full Changelog: v1.2.2...v1.2.3
v1.2.2
What's Changed
- fix(gke): accept null values for vertex env vars by @OlivierDehaene in #243
- fix: fix cpu image to not default on the sagemaker entrypoint
Full Changelog: v1.2.1...v1.2.2
v1.2.1
TEI is now Apache 2.0!
What's Changed
- Document how to send batched inputs by @osanseviero in #222 (see the example after this list)
- feat: add auto-truncate arg by @OlivierDehaene in #224
- feat: add PredictPair to proto by @OlivierDehaene in #225
- fix: fix auto_truncate for openai by @OlivierDehaene in #228
- Change license to Apache 2.0 by @OlivierDehaene in #231
- feat: Amazon SageMaker compatible images by @JGalego in #103
- fix(CI): fix build all by @OlivierDehaene in #236
- fix: fix cuda-all image by @OlivierDehaene in #239
- Add SageMaker CPU images and validate by @philschmid in #240
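The batched-inputs documentation from #222 boils down to sending a list of strings instead of a single string. A minimal sketch, with the endpoint URL assumed:

```python
import requests

TEI_URL = "http://localhost:8080"  # assumed local deployment

# Batched request: `inputs` takes a list of strings, and the response is a
# list of embeddings in the same order.
resp = requests.post(
    f"{TEI_URL}/embed",
    json={"inputs": ["The first sentence.", "The second sentence."]},
)
resp.raise_for_status()
embeddings = resp.json()
print(len(embeddings), len(embeddings[0]))  # 2 embeddings, model dimension each
```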
New Contributors
- @osanseviero made their first contribution in #222
- @JGalego made their first contribution in #103
- @philschmid made their first contribution in #240
Full Changelog: v1.2.0...v1.2.1
v1.2.0
What's Changed
- add cuda all image to facilitate deployment by @OlivierDehaene in #186
- add splade pooling to Bert by @OlivierDehaene in #187
- support vertex api endpoint by @drbh in #184
- readme examples by @plaggy in #180
- add_pooling_layer for bert classification by @OlivierDehaene in #190
- add /embed_sparse route by @OlivierDehaene in #191
- Applying `Cargo.toml` optimization options by @somehowchris in #201
- Add Dockerfile-arm64 to allow docker builds on Apple M1/M2 architecture by @iandoe in #209
- configurable payload limit by @OlivierDehaene in #210
- add api_key for request authorization by @OlivierDehaene in #211
- add all methods to vertex API by @OlivierDehaene in #192
- add `/decode` route by @OlivierDehaene in #212 (see the sketch after this list)
- Input Types Compatibility with OpenAI's API (#112) by @OlivierDehaene in #214
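Two items above lend themselves to a short client sketch: `api_key` request authorization (#211) and the `/decode` route (#212). The Bearer auth scheme, the server-side `--api-key` flag, and the `ids` request field are assumptions, not confirmed API details.

```python
import requests

TEI_URL = "http://localhost:8080"  # placeholder
API_KEY = "my-secret-key"          # value configured on the server (assumed --api-key flag)

# Assumed Bearer scheme for the api_key authorization added in #211.
headers = {"Authorization": f"Bearer {API_KEY}"}

# Assumed /decode schema: a list of token ids decoded back to text.
resp = requests.post(
    f"{TEI_URL}/decode",
    json={"ids": [101, 7592, 2088, 102]},  # example BERT-style token ids
    headers=headers,
)
resp.raise_for_status()
print(resp.json())
```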
New Contributors
- @drbh made their first contribution in #184
- @plaggy made their first contribution in #180
- @somehowchris made their first contribution in #201
- @iandoe made their first contribution in #209
Full Changelog: v1.1.0...v1.2.0
v1.1.0
Highlights
- Splade pooling
What's Changed
- Update Dockerfile to install curl by @jpbalarini in #117
- fix loading of bert classification models by @OlivierDehaene in #173
- splade pooling by @OlivierDehaene in #174
New Contributors
- @jpbalarini made their first contribution in #117
Full Changelog: v1.0.0...v1.1.0
v1.0.0
Highlights
- Support for Nomic models
- Support for Flash Attention for Jina models
- Metal backend for M* users
- `/tokenize` route to directly access the internal TEI tokenizer
- `/embed_all` route to allow client-level pooling (examples of both routes below)
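A sketch of the two new routes from the highlights. The request bodies are assumed to use the same `inputs` field as `/embed`, and the endpoint URL is a placeholder.

```python
import requests

TEI_URL = "http://localhost:8080"  # placeholder for your TEI deployment

# /tokenize exposes the internal tokenizer directly.
tok = requests.post(f"{TEI_URL}/tokenize", json={"inputs": "Hello world"})
tok.raise_for_status()
print(tok.json())  # expected: the tokens/ids produced for the input

# /embed_all skips server-side pooling and returns one vector per token,
# so pooling can be done client-side (here, a simple mean over tokens).
raw = requests.post(f"{TEI_URL}/embed_all", json={"inputs": "Hello world"})
raw.raise_for_status()
token_vectors = raw.json()[0]
mean_pooled = [sum(col) / len(token_vectors) for col in zip(*token_vectors)]
print(len(token_vectors), len(mean_pooled))
```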
What's Changed
- fix: limit the number of buckets for prom metrics by @OlivierDehaene in #114
- feat: support flash attention for Jina by @OlivierDehaene in #119
- feat: add support for Metal by @OlivierDehaene in #120
- fix: fix turing for Jina and limit concurrency in docker build by @OlivierDehaene in #121
- fix(router): fix panics on partial_cmp and empty req.texts by @OlivierDehaene in #138
- feat(router): add /tokenize route by @OlivierDehaene in #139
- feat(backend): support classification for bert by @OlivierDehaene in #155
- feat: add embed_raw route to get all embeddings without pooling by @OlivierDehaene in #154
- added docs for `OTLP_ENDPOINT` around the defaults and format sent by @MarcusDunn in #157
- fix: use mimalloc to solve memory "leak" by @OlivierDehaene in #161
- fix: remove modif of tokenizer by @OlivierDehaene in #163
- fix: add cors_allow_origin to cli by @OlivierDehaene in #162
- fix: use st max_seq_length by @OlivierDehaene in #167
- feat: support nomic models by @OlivierDehaene in #166
New Contributors
- @MarcusDunn made their first contribution in #157
Full Changelog: v0.6.0...v1.0.0