-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for Hugging Face #14412
Add support for Hugging Face #14412
Conversation
fixed #14411 Signed-off-by: Jonas Helming <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In principle it seems to work. I was able to communicate with starcoder hosted at huggingface.
- There is an issue with the language model management which I described in detail in the comments
- I used the suggested
as the code-completion prompt but only got very bad completions back 🤷
The language is {{language}}. <fim_prefix>{{textUntilCurrentPosition}}<fim_suffix>{{textAfterCurrentPosition}}<fim_middle>
if (!token) { | ||
throw new Error('Please provide a Hugging Face API token.'); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not throw errors for expected code paths. This is even the default code path as the api key will be undefined
by default. Therefore this will show up as an error for all users in their logs.
If we don't want to be able to configure hugging face models without keys, then the huggingface-language-model-manager
should not even create them.
Note that if the user deletes the apiKey
in their preferences, you will still use it as the hfInference
is only created once here. In fact the whole dynamic apiKey
provider is not utilized in this implementation as it's only called once.
I would like to suggest to either make the apiKey
static here and making sure that the models are removed/recreated if the apiKey
changes or refactor them to being able to handle non existent api keys at request time, like in the open ai language model implementation.
packages/ai-hugging-face/src/node/huggingface-language-models-manager-impl.ts
Show resolved
Hide resolved
@sdirix I changed the behavior so that API Key changes are respected (used the APIkeyProvider). It is now similar to the openAI provider. If no API key is set, the models are visible, but an error is thrown. This is catched by the chat and shown to the user (just like with the open AI provider) |
Signed-off-by: Jonas Helming <[email protected]>
31945b1
to
caa370a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Works for me with the suggested starcoder model. I have one more question regarding the code.
parameters: { | ||
temperature: 0.1, | ||
max_new_tokens: 200, | ||
return_full_text: false, | ||
do_sample: true, | ||
stop: ['<|endoftext|>'] | ||
} | ||
}); | ||
|
||
const asyncIterator = { | ||
async *[Symbol.asyncIterator](): AsyncIterator<LanguageModelStreamResponsePart> { | ||
for await (const chunk of stream) { | ||
const content = chunk.token.text.replace(/<\|endoftext\|>/g, ''); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this work for most/all hugging face models or only for starcoder? If this is starcoder specific then we should check for that model before setting the parameters / replacing the content.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found that almost all HuggingFace Model return stop charakters. And that it is most robust to explcitly specify it and then replace it. Good test case is meta-llama/Llama-3.2-3B-Instruct which is always "warm"
fixed #14411
What it does
Adds support for using Hugging Face as an inference provider. This was possible before by using the OpenAI API, but some models require custom parameters
How to test
Create a HF Account and an API Key. Copy any model in the settings and select it to be used in the AI configuration view.
For simplicity, select a model that supports server less so that inference is for free.
For StarCoder code completion you can use:
Follow-ups
Review checklist
Reminder for reviewers