Feature request
TEI 1.5 introduced the ONNX runtime on CPU in "feat(onnx): add onnx runtime for better CPU perf" (#328). It seems there is no way to opt out of the ONNX backend when running on CPU. I'm requesting the ability to not use it. The relevant code:
pub async fn download_weights(api: &ApiRepo) -> Result<Vec<PathBuf>, ApiError> {
    let model_files = if cfg!(feature = "python") || cfg!(feature = "candle") {
        match download_safetensors(api).await {
            Ok(p) => p,
            Err(_) => {
                tracing::warn!("safetensors weights not found. Using `pytorch_model.bin` instead. Model loading will be significantly slower.");
                tracing::info!("Downloading `pytorch_model.bin`");
                let p = api.get("pytorch_model.bin").await?;
                vec![p]
            }
        }
    } else if cfg!(feature = "ort") {
        tracing::info!("Downloading `model.onnx`");
        match api.get("model.onnx").await {
            Ok(p) => vec![p],
            Err(err) => {
                tracing::warn!("Could not download `model.onnx`: {err}");
                tracing::info!("Downloading `onnx/model.onnx`");
                let p = api.get("onnx/model.onnx").await?;
                vec![p.parent().unwrap().to_path_buf()]
            }
        }
    } else {
        unreachable!()
    };
    Ok(model_files)
}
Motivation
Not all models have .onnx files published. For users who are not concerned about CPU performance, there doesn't seem to be a reason to restrict the CPU runtime to models that ship ONNX weights.
Your contribution
Happy to open a PR that adds an option to opt out of the ONNX runtime and builds the old CPU image as it was.
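To make the proposal concrete, here is a rough sketch of the selection logic such an opt-out could use. It is not TEI code: `BackendPreference`, `WeightChoice`, and `choose_weights` are hypothetical names, and the sketch assumes the list of files in the model repo is already known and that the non-ONNX weights are named `model.safetensors` or `pytorch_model.bin`.

```rust
use std::collections::HashSet;

/// Hypothetical user preference; not part of TEI's actual API.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BackendPreference {
    /// Use ONNX weights when the repo has them, otherwise fall back.
    PreferOnnx,
    /// Never use ONNX weights (the opt-out requested in this issue).
    CandleOnly,
}

/// Which weight file was picked, tagged by the backend that would load it.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum WeightChoice {
    Onnx(&'static str),
    Candle(&'static str),
}

/// Pick a weight file from `repo_files` according to `pref`.
/// Returns `None` when no usable weights are present.
pub fn choose_weights(
    repo_files: &HashSet<&str>,
    pref: BackendPreference,
) -> Option<WeightChoice> {
    // Same ONNX lookup order as the `ort` branch quoted above.
    const ONNX: [&str; 2] = ["model.onnx", "onnx/model.onnx"];
    // Assumed file names for the non-ONNX path.
    const CANDLE: [&str; 2] = ["model.safetensors", "pytorch_model.bin"];

    if pref == BackendPreference::PreferOnnx {
        if let Some(f) = ONNX.iter().copied().find(|f| repo_files.contains(f)) {
            return Some(WeightChoice::Onnx(f));
        }
    }
    // Either the user opted out of ONNX or no ONNX file was found.
    CANDLE
        .iter()
        .copied()
        .find(|f| repo_files.contains(f))
        .map(WeightChoice::Candle)
}
```

With `PreferOnnx`, a repo that only has `model.safetensors` would still load through the non-ONNX path instead of failing. With `CandleOnly`, ONNX files are ignored even when present.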