
Continued pretraining for llama 3.2 not working #1530

Open
methosmythos opened this issue Jan 12, 2025 · 2 comments
@methosmythos

Hi,

I'm trying to fine-tune a Llama 3.2 3B model for a new language.
I followed this notebook locally: https://colab.research.google.com/drive/1tEd1FrOXWMnCU9UIvdYhs61tkxdMuKZu?usp=sharing
The only thing I changed was the model name: instead of "unsloth/mistral-7b-v0.3" I set it to "unsloth/Llama-3.2-3B-bnb-4bit".
When trainer_stats = trainer.train() is called, a strange error occurs:

Exception has occurred: Unsupported
generator
KeyError: <code object sort_logit_avg at 0x773e9003e230, file "./llama-fine-tune/.venv/lib/python3.12/site-packages/cut_cross_entropy/cce.py", line 30>

During handling of the above exception, another exception occurred:

File "./llama-fine-tune/main.py", line 214, in
trainer_stats = trainer.train()
^^^^^^^^^^^^^^^
torch._dynamo.exc.Unsupported: generator

Thanks in advance for any help.

@danielhanchen
Contributor

Would you be able to print the Unsloth environment info (the PyTorch version, etc.)?
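If it helps, here is a small stdlib-only sketch of one way to dump that info. The package names are taken from this thread; add or remove any you like:

```python
# Collect Python and package versions for a bug report using only the
# standard library (importlib.metadata), so it runs in any environment.
import platform
from importlib import metadata

def pkg_version(name: str) -> str:
    """Return the installed version of a distribution, or 'not installed'."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"

if __name__ == "__main__":
    print("python             ", platform.python_version())
    for name in ("torch", "transformers", "unsloth", "unsloth_zoo",
                 "cut-cross-entropy", "triton", "xformers", "bitsandbytes"):
        print(f"{name:<19}{pkg_version(name)}")
```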

@methosmythos
Author

Hi,

sure. First I tried with the latest versions:

Package                  Version
------------------------ ------------
accelerate               1.2.1
aiohappyeyeballs         2.4.4
aiohttp                  3.11.11
aiosignal                1.3.2
attrs                    24.3.0
bitsandbytes             0.45.0
certifi                  2024.12.14
charset-normalizer       3.4.1
cut-cross-entropy        25.1.1
datasets                 3.2.0
dill                     0.3.8
docstring_parser         0.16
filelock                 3.16.1
frozenlist               1.5.0
fsspec                   2024.9.0
hf_transfer              0.1.9
huggingface-hub          0.27.1
idna                     3.10
Jinja2                   3.1.5
markdown-it-py           3.0.0
MarkupSafe               3.0.2
mdurl                    0.1.2
mpmath                   1.3.0
multidict                6.1.0
multiprocess             0.70.16
networkx                 3.4.2
numpy                    2.2.1
nvidia-cublas-cu12       12.4.5.8
nvidia-cuda-cupti-cu12   12.4.127
nvidia-cuda-nvrtc-cu12   12.4.127
nvidia-cuda-runtime-cu12 12.4.127
nvidia-cudnn-cu12        9.1.0.70
nvidia-cufft-cu12        11.2.1.3
nvidia-curand-cu12       10.3.5.147
nvidia-cusolver-cu12     11.6.1.9
nvidia-cusparse-cu12     12.3.1.170
nvidia-nccl-cu12         2.21.5
nvidia-nvjitlink-cu12    12.4.127
nvidia-nvtx-cu12         12.4.127
packaging                24.2
pandas                   2.2.3
peft                     0.14.0
pillow                   11.1.0
pip                      24.3.1
propcache                0.2.1
protobuf                 3.20.3
psutil                   6.1.1
pyarrow                  18.1.0
Pygments                 2.19.1
python-dateutil          2.9.0.post0
pytz                     2024.2
PyYAML                   6.0.2
regex                    2024.11.6
requests                 2.32.3
rich                     13.9.4
safetensors              0.5.2
sentencepiece            0.2.0
setuptools               75.8.0
shtab                    1.7.1
six                      1.17.0
sympy                    1.13.1
tokenizers               0.21.0
torch                    2.5.1
tqdm                     4.67.1
transformers             4.48.0
triton                   3.1.0
trl                      0.13.0
typeguard                4.4.1
typing_extensions        4.12.2
tyro                     0.9.8
tzdata                   2024.2
unsloth                  2025.1.5
unsloth_zoo              2025.1.3
urllib3                  2.3.0
wheel                    0.45.1
xformers                 0.0.29.post1
xxhash                   3.5.0
yarl                     1.18.3

and then I tried downgrading to:

unsloth                  2024.12.11
unsloth_zoo              2024.12.6

The same error occurred with both.
If I use unsloth/mistral-7b-v0.3-bnb-4bit instead, everything works fine.
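For reference, this is a condensed sketch of the reproduction, not a fix. The max_seq_length value and the import guard are my additions (assuming the FastLanguageModel API from unsloth 2025.1.5); actually loading a model requires a CUDA GPU:

```python
# Reproduction sketch: only the model name differs between the failing
# and the working run described in this thread.
import importlib.util

HAS_UNSLOTH = importlib.util.find_spec("unsloth") is not None

def load_model(model_name: str):
    """Load a 4-bit model the way the notebook does (needs a CUDA GPU)."""
    from unsloth import FastLanguageModel  # assumption: unsloth 2025.1.5 API
    return FastLanguageModel.from_pretrained(
        model_name=model_name,
        max_seq_length=2048,  # my placeholder; use the notebook's value
        load_in_4bit=True,
    )

if __name__ == "__main__":
    if not HAS_UNSLOTH:
        print("unsloth not installed - skipping reproduction")
    else:
        # Fails with torch._dynamo.exc.Unsupported: generator, per this report:
        model, tokenizer = load_model("unsloth/Llama-3.2-3B-bnb-4bit")
        # Works fine, per the same report:
        # model, tokenizer = load_model("unsloth/mistral-7b-v0.3-bnb-4bit")
```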
