Adding vllm speculative decoding example #317

htrivedi99 · 2024-07-01T15:05:44Z

No description provided.

bolasim · 2024-07-01T21:23:05Z

vllm-speculative-decoding/config.yaml

+base_image:
+  image: nvcr.io/nvidia/pytorch:23.11-py3
+  python_executable_path: /usr/bin/python3


Just curious: why do we need this base image? Can you add a comment?

Without this base image, the build does not succeed. The Baseten base image does not have nvcc, which is required for the developer build of vLLM.
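For anyone checking a different base image, a quick way to confirm the point above is to look for `nvcc` on the `PATH` before attempting the build. This is only a minimal sketch: the helper name `has_nvcc` is hypothetical and not part of this PR, and finding the binary does not guarantee that the CUDA toolkit version matches what vLLM expects.

```python
import shutil
from typing import Optional


def has_nvcc(search_path: Optional[str] = None) -> bool:
    """Return True if an executable named `nvcc` is found.

    `search_path` is an os.pathsep-separated list of directories;
    when it is None, the current PATH environment variable is used.
    (Hypothetical helper, for illustration only.)
    """
    return shutil.which("nvcc", path=search_path) is not None


if __name__ == "__main__":
    if has_nvcc():
        print("nvcc found; the image can compile CUDA extensions")
    else:
        print("nvcc not found; a CUDA development image is probably needed")
```

Running this inside the candidate image shows whether the CUDA compiler is present without waiting for a full build to fail.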

bolasim · 2024-07-01T21:23:15Z

vllm-speculative-decoding/config.yaml

+  tensor_parallel: 1
+  max_num_seqs: 16
+model_name: vLLM Speculative Decoding
+python_version: py310


Drop `python_version` if using a base image.

bolasim · 2024-07-01T21:23:26Z

vllm-speculative-decoding/config.yaml

+  python_executable_path: /usr/bin/python3
+build_commands: []
+environment_variables:
+  HF_TOKEN: ""


Why set this here rather than in secrets?

vLLM reads only this specific environment variable for the access token. It doesn't work with secrets.
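If keeping the token out of `config.yaml` becomes a requirement later, one possible workaround is to copy a secret into `HF_TOKEN` at load time, before the engine is created. The sketch below is untested against this example and rests on assumptions: that the secrets arrive as a dict-like mapping, that the secret is named `hf_access_token`, and that the engine starts in the same process after this call. The helper `export_hf_token` is hypothetical.

```python
import os
from typing import Mapping, MutableMapping


def export_hf_token(
    secrets: Mapping[str, str],
    secret_name: str = "hf_access_token",  # assumed secret name
    env: MutableMapping[str, str] = os.environ,
) -> None:
    """Copy a secret value into the HF_TOKEN environment variable.

    Raises ValueError if the secret is missing or empty, so a
    misconfigured deployment fails early rather than at download time.
    """
    token = secrets.get(secret_name)
    if not token:
        raise ValueError(f"secret {secret_name!r} is missing or empty")
    env["HF_TOKEN"] = token


if __name__ == "__main__":
    fake_env: dict = {}
    export_hf_token({"hf_access_token": "example-token"}, env=fake_env)
    print(sorted(fake_env))  # the only key set is HF_TOKEN
```

Passing `env` explicitly keeps the helper easy to test; in a real model class the default `os.environ` would be used.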

Adding vllm spec dec example

c438c2d

htrivedi99 requested review from bolasim and vshulman July 1, 2024 21:00

bolasim reviewed Jul 1, 2024


minor fixes

fd5be0c
