
Is AWQ 4-bit deployment of Yi-34B supported now? #291

Open
xyfZzz opened this issue Jan 9, 2024 · 5 comments

Comments

@xyfZzz

xyfZzz commented Jan 9, 2024

No description provided.

@hiworldwzj
Collaborator

@xyfZzz It is not well supported yet. For one thing, the open-source Triton int4 weight-only GEMM kernel does not perform very well. For another, loading AWQ weights directly requires adapting the weight-loading code to that checkpoint format. We will continue to optimize and improve this.
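For readers unfamiliar with the layout involved: in int4 weight-only inference, two 4-bit weights are typically packed into each byte, and the kernel dequantizes them on the fly before the matmul. Below is a minimal NumPy sketch of those reference semantics only (no performance claims); the function names are hypothetical and not lightllm's actual kernels.

```python
import numpy as np

def pack_int4(q):
    # q: int8 matrix with values in [-8, 7]; columns must be even in number.
    # Shift to the unsigned range [0, 15] and store two nibbles per byte.
    u = (q + 8).astype(np.uint8)
    return u[:, 0::2] | (u[:, 1::2] << 4)

def unpack_int4(packed):
    # Recover the low and high nibbles and shift back to [-8, 7].
    lo = (packed & 0x0F).astype(np.int8) - 8
    hi = (packed >> 4).astype(np.int8) - 8
    out = np.empty((packed.shape[0], packed.shape[1] * 2), dtype=np.int8)
    out[:, 0::2] = lo
    out[:, 1::2] = hi
    return out

def int4_weightonly_gemm(x, packed_w, scale):
    # Reference semantics of a weight-only GEMM: dequantize the packed
    # weights, then do a plain float matmul. A real Triton kernel fuses
    # the dequantization into the GEMM instead of materializing w.
    w = unpack_int4(packed_w).astype(np.float32) * scale
    return x @ w.T
```

A fused kernel avoids ever writing the dequantized `w` to memory, which is where the performance difficulty mentioned above comes from.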

@xyfZzz
Author

xyfZzz commented Jan 9, 2024


Got it. Then is 4-bit GPTQ also unsupported for now?

@hiworldwzj
Collaborator

@xyfZzz Currently only some of the quantized compute kernels are supported. By default the original weights are quantized directly, without any PTQ-style weight adjustment, and loading GPTQ-quantized checkpoints has not been adapted yet either.
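"Quantizing the original weights directly" as described above usually means round-to-nearest symmetric quantization with one scale per output channel, with no calibration data involved. A minimal sketch of that idea, assuming that interpretation (function names are illustrative, not lightllm's API):

```python
import numpy as np

def quantize_int4_per_channel(w):
    # w: (out_features, in_features) float weight matrix.
    # Symmetric int4 uses [-8, 7]; divide by 7 so the per-row max maps to +/-7.
    max_abs = np.abs(w).max(axis=1, keepdims=True)
    scale = max_abs / 7.0
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Reconstruct an approximation of the original weights.
    return q.astype(np.float32) * scale
```

Methods like AWQ and GPTQ improve on this baseline by adjusting the weights (or scales) using calibration data before rounding, which is the PTQ-style step the comment above says is not done here.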

@xyfZzz
Author

xyfZzz commented Jan 9, 2024


Got it, thanks a lot!

@RanchiZhao

Is this available now? I simply applied GPTQ and AWQ to Yi-6B and tried LoRA training on it, but the loss is NaN.
