[REQUEST] Better Infinity Embeddings support #211

Open
3 tasks done
arbi-dev opened this issue Sep 24, 2024 · 1 comment

Comments

@arbi-dev

Problem

It's great that Tabby supports the fantastic infinity_emb backend for local embeddings, but two features are missing:

  1. infinity_emb is not included in the official Docker image. It can be added by building a custom image, but ideally it would work out of the box.
  2. infinity_emb can load a reranker in addition to an embedding model. It would be great to load both models on the same Tabby instance (unless I am mistaken, this is not currently possible).

Solution

- Rebuild the official Docker image with [infinity_emb]
- Support multiple --embedding-model-name values at the same time

Alternatives

No response

Explanation

Tabby, Exllama and Infinity are great options for environments like Kubernetes, where it's better to use official image builds. Having multiple models on one instance also makes more efficient use of GPU resources.

Thank you, everyone, for your hard work on these projects!

Examples

No response

Additional context

No response

Acknowledgements

  • I have looked for similar requests before submitting this one.
  • I understand that the developers have lives and my issue will be answered when possible.
  • I understand the developers of this program are human, and I will make my requests politely.
@bdashore3
Member

The docker change shouldn't be a big issue. It's possible to make it so docker pulls extras before compiling. However, re-ranking models are a different story.

Any infinity-emb model type other than embeddings falls out of scope for TabbyAPI's purposes and would bloat the codebase. If you'd like to use different types of models at that level, I'd suggest running infinity-emb itself. Then, write a program that bridges tabby and infinity-emb and exposes both to the end user.

Tabby is meant to be a cog in the machine, not the entire machine.
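
As a rough illustration of the bridge idea above, here is a minimal sketch that picks a backend by request path and forwards the request. Everything specific in it is an assumption, not something confirmed in this thread: the two base URLs and ports, the `/rerank` path prefix, and the helper names `pick_backend` and `forward` are hypothetical and should be adjusted to the actual deployment.

```python
from urllib import request as urlrequest

# Hypothetical upstream addresses (assumptions); adjust to your deployment.
TABBY_URL = "http://localhost:5000"
INFINITY_URL = "http://localhost:7997"

# Path prefixes assumed to be handled by infinity-emb; all other paths go to Tabby.
INFINITY_PREFIXES = ("/rerank",)


def pick_backend(path: str) -> str:
    """Return the upstream base URL that should serve this request path."""
    if path.startswith(INFINITY_PREFIXES):
        return INFINITY_URL
    return TABBY_URL


def forward(path: str, body: bytes, timeout: float = 30.0) -> bytes:
    """POST a JSON body to the backend that owns the path; return the raw response."""
    req = urlrequest.Request(
        pick_backend(path) + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlrequest.urlopen(req, timeout=timeout) as resp:
        return resp.read()
```

A small HTTP server in front of `forward` would give clients a single address while each backend stays a separate process. Authentication headers and streaming responses are left out of this sketch.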
