-
Notifications
You must be signed in to change notification settings - Fork 810
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improvements on Exposed ORT support #976
base: main
Are you sure you want to change the base?
Conversation
Hi again 👋 Apologies for the late reply. For this PR, could you explain the meaning/origin of the |
That's ok, saw that you had a loot of work to do!
Currently the runner will not look on it to execute, it does based on the environment. So the But the problem is that Example, if we don't specify a device, or anything instead of const pipe = await pipeline('feature-extraction', 'supabase/gte-small')
// Error: Unsupported device "wasm", Should be one of: .
const pipe = await pipeline('feature-extraction', 'supabase/gte-small', { device: 'cpu' })
// Error: Unsupported device "cpu", Should be one of: .
const pipe = await pipeline('feature-extraction', 'supabase/gte-small', { device: 'webgpu' })
// Error: Unsupported device "webgpu", Should be one of: . But if I do const pipe = await pipeline('feature-extraction', 'supabase/gte-small', { device: 'auto' })
// [.... embeddings ] Custom ORT exposed:console.log(globalThis[Symbol.for("onnxruntime"])
/*
{
Tensor: [class Tensor],
env: {},
InferenceSession: { create: [AsyncFunction: fromBuffer] }
}
*/ So the purpose of this PR is to automatically set I invite you to try run npx supabase functions new "ort-test"
npx supabase functions serve |
Thanks for the additional context! 👍 Just on that note, what if an exposed ORT library does allow for different devices to be specified, and the user could choose the version they would like? For Supabase's runner, specifically, I imagine it is running on CPU, right? So "cpu" could be mapped to the default device on their side perhaps? |
Sure, it makes sense. But this way they still need to manually specify a device?
At this moment, could you give me any suggestion? about how can we achieve: the following, without it throws an error? const pipe = await pipeline('feature-extraction', 'supabase/gte-small') Do I need to export some property that says "Hey EDIT: I'd look in both |
Note that there is no "auto" execution provider in onnxruntime-web/onnxruntime-node (and this is a layer added on top by Transformers.js). That said, I think I see what the problem is: for the custom runtime, we don't specify transformers.js/src/backends/onnx.js Lines 91 to 92 in 2c92943
device -> execution provider mapping: transformers.js/src/backends/onnx.js Lines 117 to 136 in 2c92943
So, I think the solution here would be to set them both to |
Hey Joshua, thanks for your suggestion
Yes it worked, 42ae1fe |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for iterating! ✅ Looks good!
- Add global variable to check if `IS_EXPOSED_RUNTIME_ENV` -> true if Js exposes their own custom runtime. - Applying 'auto' device as default for exposed runtime environment.
- Adding checkings for 'Tensor' and 'InferenceSession' members of the exposed custom ort.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks again for the modifications! I was talking to a colleague about this PR and one thing which was brought up was the possibility of accidental namespace clashes (if onnxruntime
is detected globally).
For that reason, I think this should be "opt-in" behaviour based on the presence or absence of an environment variable, and should not be the default for all users.
In the supabase runtime, are you able to set default environment variables. e.g., HF_TRANSFORMERS_USE_EXPOSED_RUNTIME
(name subject to change, ofc).
We can then use this to detect when to use the exposed runtime or not.
Hi Joshua, I hope that you had a happy new year!
I agree with this suggestion, the only thing to consider is that Btw I think that
|
- Introduce `HF_TRANSFORMERS_USE_EXPOSED_RUNTIME` env variable to opt-in the exposed runtime behaviour
- Using array loop to avoid code duplication
Hey there 🙏,
Coming from #947, I did test the new
Exposed ORT
feature inalpha-v20
andalpha-21
releases. I'd notice that to make it work we must specify'auto'
as inferencedevice
:Current behaviour:
Without specify a device:
Explicit using
auto
as device:To solve this annoying config, the PR improves the environment support for
Exposed ORT
by implicitly selecting'auto'
as the default fallback instead of'wasm'
.