- See: https://github.com/vllm-project/vllm/blob/main/examples/lora_with_quantization_inference.py
- What I have observed is that when I try to deploy the model using `qlora_adapter_name_or_path` for the QLoRA adapter, deployment fails with the error raised here: https://github.com/vllm-project/vllm/blob/main/vllm/engine/arg_utils.py#L899-L911. To deploy a QLoRA adapter, should I use the `--lora-modules` or the `adapter-cache` parameter? What is the best approach here?