
[Feature Request] Mixtral training support #31

Open
epicfilemcnulty opened this issue Dec 12, 2023 · 2 comments · May be fixed by #1541
Labels: feature request, good first issue, on roadmap

Comments

@epicfilemcnulty

For reference, LLaMA-Factory claims that, using their toolkit, you can QLoRA fine-tune Mixtral with 28 GB of VRAM.
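
For context, the kind of setup behind that claim is 4-bit (NF4) quantization plus LoRA adapters on the attention projections. Below is a minimal sketch using Hugging Face transformers, peft, and bitsandbytes; it is not LLaMA-Factory's actual configuration, and the model id, LoRA rank, and target modules are illustrative placeholders.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "mistralai/Mixtral-8x7B-v0.1"  # placeholder base checkpoint

# 4-bit NF4 quantization is what makes fitting a ~47B-parameter MoE
# into roughly 28 GB of VRAM plausible for QLoRA fine-tuning.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# LoRA adapters on the attention projections; rank/alpha are illustrative defaults.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```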

@danielhanchen
Contributor

@epicfilemcnulty We're working on it for a later release!!

@danielhanchen added the on roadmap label Jan 27, 2024
@danielhanchen added the feature request and good first issue labels Oct 9, 2024
@danielhanchen changed the title from "Mixtral training support" to "[Feature Request] Mixtral training support" Oct 9, 2024
@Itssshikhar linked a pull request Jan 14, 2025 that will close this issue
@Itssshikhar

Hi Everyone,

I have made the updates requested in this issue by adding Mixtral model support. I've tested that the model loads successfully; QLoRA fine-tuning and memory usage are yet to be tested due to GPU constraints on my side.

If anyone has suggestions for further testing or optimizations, please let me know! I’m happy to make any additional changes or improvements.
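
For anyone with enough VRAM to pick this up, a smoke test along the following lines would cover the untested parts (4-bit load, adapter attachment, peak memory). This is only a sketch assuming the Unsloth FastLanguageModel API; the model id and LoRA hyperparameters are placeholders, not necessarily what the linked PR uses.

```python
import torch
from unsloth import FastLanguageModel

# Placeholder model id; swap in whichever Mixtral checkpoint the PR targets.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="mistralai/Mixtral-8x7B-Instruct-v0.1",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters; rank/alpha are illustrative, not the PR's settings.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_alpha=16,
)

# Report peak VRAM after loading the 4-bit model and attaching adapters.
print(f"peak allocated: {torch.cuda.max_memory_allocated() / 1024**3:.1f} GiB")
```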
