- See: https://github.com/vllm-project/vllm/blob/main/examples/lora_with_quantization_inference.py
- What I have observed is that when I try to deploy the model using `qlora_adapter_name_or_path` for the QLoRA adapter, deployment fails with the error raised here: https://github.com/vllm-project/vllm/blob/main/vllm/engine/arg_utils.py#L899-L911. To deploy a QLoRA adapter, should I use the `--lora-modules` or the `adapter-cache` parameter? What is the best approach here?