Adding vllm speculative decoding example #317

Open
wants to merge 2 commits into
base: main

htrivedi99
Contributor

No description provided.

htrivedi99 requested review from bolasim and vshulman on July 1, 2024, 21:00
Comment on lines +1 to +3
base_image:
  image: nvcr.io/nvidia/pytorch:23.11-py3
  python_executable_path: /usr/bin/python3
Contributor

Just curious: why do we need this base image? Can you add a comment?

Contributor Author

Without this base image, the build does not succeed. The Baseten base image does not include nvcc, which is required for the developer build of vLLM.
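
A quick way to confirm whether a given image has the compiler is to look for `nvcc` on the PATH. The following is only a minimal sketch: the helper name `nvcc_available` is hypothetical, and it checks only that an executable is discoverable, not that the CUDA toolkit version suits vLLM.

```python
import shutil


def nvcc_available(which=shutil.which):
    """Return True if an `nvcc` executable is found on PATH.

    `which` is injectable so the check can be exercised without CUDA installed.
    """
    return which("nvcc") is not None


if __name__ == "__main__":
    if nvcc_available():
        print("nvcc found")
    else:
        print("nvcc missing; a source build of vLLM would likely fail")
```

Running this inside the candidate image before kicking off the build gives an early signal instead of a failure partway through compilation.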

tensor_parallel: 1
max_num_seqs: 16
model_name: vLLM Speculative Decoding
python_version: py310
Contributor

Drop python_version if using a base image.

python_executable_path: /usr/bin/python3
build_commands: []
environment_variables:
  HF_TOKEN: ""
Contributor

Why set this here rather than in secrets?

Contributor Author

vLLM reads the access token only from this specific environment variable. It doesn't work with secrets.
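
One possible way to reconcile the two, sketched here under assumptions rather than taken from this PR, is to copy a secret into `HF_TOKEN` at model load time, before the vLLM engine is created. The helper name `export_hf_token`, the secret name `hf_access_token`, and the idea that the secrets object behaves like a mapping are all hypothetical; whether this works depends on how the model code receives secrets.

```python
import os


def export_hf_token(secrets, env=os.environ, secret_name="hf_access_token"):
    """Copy a secret into the HF_TOKEN environment variable.

    `secrets` is assumed to be a mapping-like object with `.get`;
    `secret_name` is a hypothetical name chosen for illustration.
    Returns True if the variable was set, False if no secret was found.
    """
    token = secrets.get(secret_name)
    if token:
        env["HF_TOKEN"] = token
        return True
    return False
```

This would need to run before any vLLM code that reads the variable, and it keeps the token out of the committed config file.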
