
[Feature Request] Mixtral training support #31

Open
epicfilemcnulty opened this issue Dec 12, 2023 · 2 comments · May be fixed by #1541
Labels: feature request, good first issue, on roadmap

Comments

@epicfilemcnulty

For reference, LLaMA-Factory claims that, using their toolkit, you can QLoRA fine-tune Mixtral with 28 GB of VRAM.
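
For context, the kind of setup behind that claim is 4-bit (NF4) quantization plus LoRA adapters on the attention projections. Below is a minimal sketch using Hugging Face transformers, peft, and bitsandbytes; it is not LLaMA-Factory's actual configuration, and the model id, LoRA rank, and target modules are illustrative placeholders.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "mistralai/Mixtral-8x7B-v0.1"  # placeholder base checkpoint

# 4-bit NF4 quantization is what makes fitting a ~47B-parameter MoE
# into roughly 28 GB of VRAM plausible for QLoRA fine-tuning.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# LoRA adapters on the attention projections; rank/alpha are illustrative defaults.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```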

@danielhanchen
Contributor

@epicfilemcnulty We're working on it for a later release!!

@danielhanchen added the on roadmap label Jan 27, 2024
@danielhanchen added the feature request and good first issue labels Oct 9, 2024
@danielhanchen changed the title from "Mixtral training support" to "[Feature Request] Mixtral training support" Oct 9, 2024
@Itssshikhar linked a pull request Jan 14, 2025 that will close this issue
@Itssshikhar

Hi Everyone,

I have made the updates requested in this issue by adding Mixtral model support. I've tested that the model loads successfully; QLoRA fine-tuning and memory usage are yet to be tested due to GPU constraints on my side.

If anyone has suggestions for further testing or optimizations, please let me know! I’m happy to make any additional changes or improvements.
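
For anyone with enough VRAM to pick this up, a smoke test along the following lines would cover the untested parts (4-bit load, adapter attachment, peak memory). This is only a sketch assuming the Unsloth FastLanguageModel API; the model id and LoRA hyperparameters are placeholders, not necessarily what the linked PR uses.

```python
import torch
from unsloth import FastLanguageModel

# Placeholder model id; swap in whichever Mixtral checkpoint the PR targets.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="mistralai/Mixtral-8x7B-Instruct-v0.1",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters; rank/alpha are illustrative, not the PR's settings.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_alpha=16,
)

# Report peak VRAM after loading the 4-bit model and attaching adapters.
print(f"peak allocated: {torch.cuda.max_memory_allocated() / 1024**3:.1f} GiB")
```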
