
Add support for Hugging Face #14412

Merged
merged 2 commits into master on Nov 12, 2024

Conversation

JonasHelming (Contributor)

fixed #14411

What it does

Adds support for using Hugging Face as an inference provider. This was possible before by using the OpenAI API, but some models require custom parameters.

How to test

Create a Hugging Face account and an API key. Copy any model into the settings and select it to be used in the AI configuration view.
For simplicity, select a model that supports serverless inference so that inference is free.
For StarCoder code completion you can use:

The language is {{language}}.
<fim_prefix>{{textUntilCurrentPosition}}<fim_suffix>{{textAfterCurrentPosition}}<fim_middle>
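
As a minimal sketch of how such a template could be filled in (the placeholder names come from the prompt above, but the `fillTemplate` helper is illustrative and not the actual Theia implementation):

```typescript
// Hypothetical helper: substitutes {{placeholder}} variables in a prompt
// template. Unknown placeholders are left untouched.
function fillTemplate(template: string, values: Record<string, string>): string {
    return template.replace(/\{\{(\w+)\}\}/g, (match, key) =>
        key in values ? values[key] : match);
}

// The StarCoder fill-in-the-middle template from the PR description.
const starcoderTemplate =
    'The language is {{language}}.\n' +
    '<fim_prefix>{{textUntilCurrentPosition}}<fim_suffix>{{textAfterCurrentPosition}}<fim_middle>';

const prompt = fillTemplate(starcoderTemplate, {
    language: 'typescript',
    textUntilCurrentPosition: 'function add(a: number, b: number) {',
    textAfterCurrentPosition: '}'
});
```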

Follow-ups

  • We should think about prompt management for agents so that you can define different prompts depending on the connected model.
  • We might want to harmonize the LLM providers we have and extract common behavior soon.

Review checklist

Reminder for reviewers

fixed #14411

Signed-off-by: Jonas Helming <[email protected]>
@sdirix (Member) left a comment

In principle it seems to work. I was able to communicate with StarCoder hosted at Hugging Face.

  • There is an issue with the language model management which I described in detail in the comments
  • I used the suggested
    The language is {{language}}.
    <fim_prefix>{{textUntilCurrentPosition}}<fim_suffix>{{textAfterCurrentPosition}}<fim_middle>
    
    as the code-completion prompt but only got very bad completions back 🤷

Comment on lines 61 to 63
if (!token) {
    throw new Error('Please provide a Hugging Face API token.');
}
sdirix (Member)

We should not throw errors for expected code paths. This is even the default code path, as the API key will be undefined by default; the error would therefore show up in the logs of all users.

If we don't want to allow configuring Hugging Face models without keys, then the huggingface-language-model-manager should not even create them.

Note that if the user deletes the apiKey in their preferences, you will still use it, as the hfInference is only created once here. In fact, the whole dynamic apiKey provider is not utilized in this implementation, as it is only called once.

I would suggest either making the apiKey static here and ensuring that the models are removed/recreated when the apiKey changes, or refactoring them to handle non-existent API keys at request time, like in the OpenAI language model implementation.
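
The request-time variant the reviewer describes could look roughly like this (all names here are hypothetical stand-ins, not the actual Theia API):

```typescript
// Hypothetical sketch: resolve the API key through a provider on every
// request instead of capturing it once at construction time, so that
// preference changes take effect immediately.
type ApiKeyProvider = () => string | undefined;

class HuggingFaceModelSketch {
    constructor(private readonly apiKeyProvider: ApiKeyProvider) {}

    request(prompt: string): string {
        // Re-read the key on each request.
        const apiKey = this.apiKeyProvider();
        if (!apiKey) {
            // Surface the missing key to the caller (e.g. the chat UI)
            // only when a request is actually made, rather than logging
            // an error on the default code path at startup.
            throw new Error('Please provide a Hugging Face API token.');
        }
        return `sending "${prompt}" with key ${apiKey.slice(0, 4)}...`;
    }
}
```

With this shape, deleting the key in the preferences immediately affects the next request, because nothing about the key is cached in the model instance.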

@JonasHelming (Contributor, Author)

@sdirix I changed the behavior so that API key changes are respected (I used the ApiKeyProvider). It is now similar to the OpenAI provider: if no API key is set, the models are visible, but an error is thrown. This is caught by the chat and shown to the user (just like with the OpenAI provider).

Signed-off-by: Jonas Helming <[email protected]>
@sdirix (Member) left a comment

Works for me with the suggested starcoder model. I have one more question regarding the code.

Comment on lines +94 to +106
parameters: {
    temperature: 0.1,
    max_new_tokens: 200,
    return_full_text: false,
    do_sample: true,
    stop: ['<|endoftext|>']
}
});

const asyncIterator = {
    async *[Symbol.asyncIterator](): AsyncIterator<LanguageModelStreamResponsePart> {
        for await (const chunk of stream) {
            const content = chunk.token.text.replace(/<\|endoftext\|>/g, '');
sdirix (Member)

Will this work for most/all Hugging Face models or only for StarCoder? If this is StarCoder-specific, then we should check for that model before setting the parameters / replacing the content.

JonasHelming (Contributor, Author)

I found that almost all Hugging Face models return stop characters, and that it is most robust to explicitly specify them and then replace them. A good test case is meta-llama/Llama-3.2-3B-Instruct, which is always "warm".
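
The approach discussed above can be sketched as follows; the chunk shape mirrors the snippet quoted in the review, but everything else here is a hypothetical stand-in for the real streaming client:

```typescript
// Illustrative sketch: specify the stop sequence explicitly in the
// request parameters, then strip it from each streamed chunk.
const STOP_TOKEN = '<|endoftext|>';

// Minimal chunk shape, matching the quoted snippet's chunk.token.text access.
interface StreamChunk {
    token: { text: string };
}

// Wraps a raw token stream and yields cleaned text, skipping chunks that
// become empty once the stop token is removed.
async function* stripStopTokens(
    stream: AsyncIterable<StreamChunk>
): AsyncGenerator<string> {
    for await (const chunk of stream) {
        const content = chunk.token.text.replace(/<\|endoftext\|>/g, '');
        if (content.length > 0) {
            yield content;
        }
    }
}
```

Since the stop token is both requested (via `stop`) and stripped, the same code path works for models that echo the stop sequence and for those that do not.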

@JonasHelming JonasHelming merged commit 8c69d73 into master Nov 12, 2024
11 checks passed
@github-actions github-actions bot added this to the 1.56.0 milestone Nov 12, 2024
Successfully merging this pull request may close these issues.

[Theia AI] Support for Hugging Face