Error saving GGUF of vision model #1504

Open
SergioRubio01 opened this issue Jan 5, 2025 · 1 comment


SergioRubio01 commented Jan 5, 2025

Hey,
Congrats on the work!
I have a fine-tuned vision model, but when I try to save it as GGUF for use with Ollama, I get this exception:

Loading checkpoint shards: 100%|██████████| 2/2 [00:03<00:00,  1.57s/it]
2025-01-05 04:09:07,810 - ERROR - STDERR: max_steps is given, it will override any value given in num_train_epochs
2025-01-05 04:09:07,810 - ERROR - STDERR: ==((====))==  Unsloth - 2x faster free finetuning | Num GPUs = 1
2025-01-05 04:09:07,810 - ERROR - STDERR: \\   /|    Num examples = 54 | Num Epochs = 1
2025-01-05 04:09:07,820 - ERROR - STDERR: O^O/ \_/ \    Batch size per device = 2 | Gradient Accumulation steps = 4
2025-01-05 04:09:07,820 - ERROR - STDERR: \        /    Total batch size = 8 | Total steps = 5
2025-01-05 04:09:07,824 - ERROR - STDERR: "-____-"     Number of trainable parameters = 16,793,600
2025-01-05 04:09:07,826 - ERROR - STDERR: 🦥 Unsloth needs about 1-3 minutes to load everything - please wait!
100%|██████████| 5/5 [01:20<00:00, 16.08s/it]
2025-01-05 04:09:07,829 - ERROR - STDERR: Traceback (most recent call last):
2025-01-05 04:09:07,832 - ERROR - STDERR: File "/home/Ubuntu/finetune.py", line 145, in <module>
2025-01-05 04:09:07,833 - ERROR - STDERR: model.save_pretrained_gguf(MODEL_NAME, tokenizer)
2025-01-05 04:09:07,835 - ERROR - STDERR: File "/home/Ubuntu/miniconda3/lib/python3.12/site-packages/unsloth/save.py", line 2238, in not_implemented_save 
2025-01-05 04:09:07,836 - ERROR - STDERR: raise NotImplementedError("Unsloth: Sorry GGUF is currently not supported for vision models!")
2025-01-05 04:09:07,837 - ERROR - STDERR: NotImplementedError: Unsloth: Sorry GGUF is currently not supported for vision models!
2025-01-05 04:09:07,840 - ERROR - Command failed with non-zero exit status: bash -l -c 'eval "$(~/miniconda3/bin/conda shell.bash hook)" &&         conda activate &&         python3 finetune.py'

Is this currently being developed? Or is there another way to use my fine-tuned models in Ollama?

I followed the steps from the Docs:

...
trainer_stats = trainer.train()

model.save_pretrained(MODEL_NAME) # Local saving
tokenizer.save_pretrained(MODEL_NAME)
# model.push_to_hub("your_name/lora_model", token = "...") # Online saving
# tokenizer.push_to_hub("your_name/lora_model", token = "...") # Online saving

used_memory = round(torch.cuda.max_memory_reserved()/1024/1024/1024, 3)
used_memory_for_lora = round(used_memory - start_gpu_memory, 3)
used_percentage = round(used_memory/max_memory*100, 3)
lora_percentage = round(used_memory_for_lora/max_memory*100,3)
print(f"{trainer_stats.metrics['train_runtime']} seconds used for training.")
print(f"Peak reserved memory = {used_memory} GB.")
print(f"Peak reserved memory for training = {used_memory_for_lora} GB.")
print(f"Peak reserved memory % of max memory = {used_percentage} %.")
print(f"Peak reserved memory for training % of max memory = {lora_percentage} %.")

# Save model in GGUF format
model.save_pretrained_gguf(MODEL_NAME, tokenizer)
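
For anyone hitting the same traceback: a minimal sketch of how the script could guard the GGUF call so the run does not end with a non-zero exit status after training has already succeeded. The helper name `save_with_gguf_fallback` and its return labels are hypothetical (not part of Unsloth); it only uses the three save calls that already appear in the script above, and it relies on the `NotImplementedError` shown in the log. It does not make the model usable in Ollama; it only ensures the regular save is on disk.

```python
def save_with_gguf_fallback(model, tokenizer, output_dir):
    """Try the GGUF export; fall back to the regular save if it is unsupported.

    Hypothetical helper, not an Unsloth API. Returns "gguf" or "standard"
    so the caller knows which format was written.
    """
    try:
        # Same call as in the script above. According to the traceback,
        # this raises NotImplementedError for vision models.
        model.save_pretrained_gguf(output_dir, tokenizer)
        return "gguf"
    except NotImplementedError as exc:
        # Report the reason, then write the regular format instead.
        print(f"GGUF export skipped: {exc}")
        model.save_pretrained(output_dir)
        tokenizer.save_pretrained(output_dir)
        return "standard"
```

In the script, the last line would then become `save_with_gguf_fallback(model, tokenizer, MODEL_NAME)`.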

danielhanchen (Contributor) commented

Sorry for the delay - I'm planning to add GGUF support in the coming days!
