gguf_init_from_file: invalid magic characters - Fine Tuned Model - #1500

Open
dynamite9999 opened this issue Jan 4, 2025 · 2 comments

@dynamite9999

Hello,
I followed the sample Colab notebook and fine-tuned the "unsloth/Meta-Llama-3.1-8B-bnb-4bit" model.

I used the latest llama.cpp compiled with flags cmake -B build -DGGML_CUDA=ON -DGGML_CUDA_ENABLE_UNIFIED_MEMORY=1

The GGUF file was generated without any reported problem, but when I tried to load it I got this error:
c$ ./main -m ./models/unsloth.Q4_K_M.gguf -p "hello"
Log start
main: build = 3482 (e54c35e4)
main: built with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu
main: seed = 1735960410
gguf_init_from_file: invalid magic characters ''
llama_model_load: error loading model: llama_model_loader: failed to load model from ./models/unsloth.Q4_K_M.gguf

llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model './models/unsloth.Q4_K_M.gguf'
main: error: unable to load model

Here are the first few bytes of the generated GGUF file. Can anyone spot an issue with it?

(netai) d@d:/hp/NetAnalytics/dev/netai/syslog/syslog_scraper_netai/t80/rc$ hexdump -C ./models/unsloth.Q4_K_M.gguf | head -n 10
00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00777e20 00 00 80 3f 00 00 80 3f 00 00 80 3f 00 00 80 3f |...?...?...?...?|
*
00777e50 00 00 80 3f 00 00 80 3f 00 00 80 3f c2 5e d3 3f |...?...?...?.^.?|
00777e60 6f b4 52 40 ee aa 1a 41 00 00 00 42 00 00 00 42 |o.R@...A...B...B|
00777e70 00 00 00 42 00 00 00 42 00 00 00 42 00 00 00 42 |...B...B...B...B|
*
00777ea0 dc 5a 06 ac 97 b8 0f 2a 94 88 da 3f c1 7d 8e 71 |.Z.....*...?.}.q|
00777eb0 f4 a2 db 17 fe 31 75 eb 87 6f 00 0b 58 39 54 44 |.....1u..o..X9TD|
(netai) d@d:/hp/NetAnalytics/dev/netai/syslog/syslog_scraper_netai/t80/rc$

Any ideas on how to start debugging this?
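
One possible first debugging step, sketched in Python: a valid GGUF file begins with the four ASCII bytes "GGUF", while the hexdump above starts with a long run of zero bytes. The helper names below (`check_gguf_magic`, `leading_zero_bytes`) are made up for illustration and are not part of llama.cpp or unsloth.

```python
GGUF_MAGIC = b"GGUF"  # a valid GGUF file starts with these four bytes


def check_gguf_magic(path):
    """Return (ok, first_bytes); ok is True if the file starts with b'GGUF'."""
    with open(path, "rb") as f:
        head = f.read(4)
    return head == GGUF_MAGIC, head


def leading_zero_bytes(path, chunk_size=1 << 20):
    """Count how many zero bytes the file starts with (hypothetical helper)."""
    count = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                return count  # reached end of file: it was all zeros
            stripped = chunk.lstrip(b"\x00")
            count += len(chunk) - len(stripped)
            if stripped:
                return count  # found the first non-zero byte
```

Calling `check_gguf_magic("./models/unsloth.Q4_K_M.gguf")` on the file from this report would presumably return `False` with four zero bytes, which matches the empty magic string in the llama.cpp error. A large value from `leading_zero_bytes` suggests the header region was never written, rather than a format-version mismatch.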

@dynamite9999
Author

More info.
The generated 16-bit model works fine. While building the quantized 4-bit unsloth/llama3.2 model, the process first creates a 16-bit bfloat16 (BF16) model, and that one loads correctly. Here is some more info:
model_q4_k_m$ ls -lrt
total 7002584
-rw-rw-r-- 1 d d 54628 Jan 3 19:10 tokenizer_config.json
-rw-rw-r-- 1 d d 454 Jan 3 19:10 special_tokens_map.json
-rw-rw-r-- 1 d d 17209920 Jan 3 19:10 tokenizer.json
-rw-rw-r-- 1 d d 994 Jan 3 19:10 config.json
-rw-rw-r-- 1 d d 234 Jan 3 19:10 generation_config.json
-rw-rw-r-- 1 d d 4417802560 Jan 3 19:10 model.safetensors
-rw-rw-r-- 1 d d 2479595168 Jan 3 19:10 unsloth.BF16.gguf <<<<< WORKS FINE.
-rw-rw-r-- 1 d d 255954592 Jan 3 19:10 unsloth.Q4_K_M.gguf <<<< DOES NOT WORK -INVALID MAGIC NUMBER
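
A rough sanity check on the two file sizes in the listing above, as a sketch: if the BF16 file stores about 16 bits per weight, the size ratio gives an approximate bits-per-weight figure for the quantized file. The 4.0 threshold is an assumption (a 4-bit quantization should not come out below 4 bits per weight), and the calculation ignores metadata and any tensors kept at higher precision.

```python
def effective_bits_per_weight(quant_bytes, bf16_bytes):
    """Approximate bits per weight of the quantized file, assuming the
    BF16 file holds the same weights at 16 bits each (metadata ignored)."""
    return 16.0 * quant_bytes / bf16_bytes


def looks_truncated(quant_bytes, bf16_bytes, min_bits=4.0):
    """Heuristic: flag the quantized file if it is smaller than a 4-bit
    quantization could plausibly be. min_bits is an assumed threshold."""
    return effective_bits_per_weight(quant_bytes, bf16_bytes) < min_bits


# File sizes taken from the ls output above.
BF16_BYTES = 2479595168
Q4_K_M_BYTES = 255954592

print(round(effective_bits_per_weight(Q4_K_M_BYTES, BF16_BYTES), 2))
print(looks_truncated(Q4_K_M_BYTES, BF16_BYTES))
```

With these numbers the quantized file works out to well under 2 bits per weight, which is consistent with an incomplete write rather than a correctly quantized model.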

@danielhanchen
Contributor

Apologies for the delay. This error generally means you ran out of disk space while the GGUF file was being written. I'm working on making GGUF saving much easier in the coming days. Sorry about the issue!
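
For anyone hitting the same error, a minimal sketch of checking free disk space before exporting, using only the Python standard library. The function name `has_room_for` and the idea of budgeting for both the intermediate 16-bit file and the quantized output are assumptions for illustration, not something unsloth does for you.

```python
import shutil


def has_room_for(directory, required_bytes):
    """Return True if the filesystem holding `directory` has at least
    `required_bytes` free. Hypothetical helper, not an unsloth API."""
    free = shutil.disk_usage(directory).free
    return free >= required_bytes


def print_free_space(directory="."):
    """Print the free space on the filesystem holding `directory` in GiB."""
    free_gib = shutil.disk_usage(directory).free / 2**30
    print(f"free space in {directory!r}: {free_gib:.1f} GiB")
```

A conservative budget would cover the merged safetensors, the intermediate 16-bit GGUF, and the quantized GGUF all at once, since the directory listing in this thread shows all three sitting side by side.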
