
[FIXED] Flash-Attn breaks with flash_attn_gpu #1437

Open
jmparejaz opened this issue Dec 17, 2024 · 9 comments

Labels
fixed - pending confirmation Fixed, waiting for confirmation from poster

Comments

@jmparejaz

I have:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Thu_Mar_28_02:18:24_PDT_2024
Cuda compilation tools, release 12.4, V12.4.131
Build cuda_12.4.r12.4/compiler.34097967_0

Name: torch
Version: 2.4.1+cu124
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: [email protected]
License: BSD-3
Location: /usr/local/lib/python3.11/dist-packages
Requires: filelock, fsspec, jinja2, networkx, nvidia-cublas-cu12, nvidia-cuda-cupti-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-runtime-cu12, nvidia-cudnn-cu12, nvidia-cufft-cu12, nvidia-curand-cu12, nvidia-cusolver-cu12, nvidia-cusparse-cu12, nvidia-nccl-cu12, nvidia-nvjitlink-cu12, nvidia-nvtx-cu12, sympy, triton, typing-extensions
Required-by: accelerate, bitsandbytes, cut-cross-entropy, flash-attn, peft, torchaudio, torchvision, unsloth_zoo, xformers

Name: flash-attn
Version: 2.7.2.post1
Summary: Flash Attention: Fast and Memory-Efficient Exact Attention
Home-page: https://github.com/Dao-AILab/flash-attention
Author: Tri Dao
Author-email: [email protected]
License: 
Location: /usr/local/lib/python3.11/dist-packages
Requires: einops, torch
Required-by: 

I installed unsloth with the wget command and I still got this warning message:

🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
Unsloth: Your Flash Attention 2 installation seems to be broken?
A possible explanation is you have a new CUDA version which isn't
yet compatible with FA2? Please file a ticket to Unsloth or FA2.
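
To narrow this down, a quick diagnostic sketch (it only assumes flash-attn itself is importable) is to check which symbol flash_attn.flash_attn_interface actually exposes, since a failed import of flash_attn_cuda appears to be what triggers the warning:

    import importlib

    # Older flash-attn releases exposed flash_attn_cuda; newer ones expose flash_attn_gpu.
    fai = importlib.import_module("flash_attn.flash_attn_interface")
    print("flash_attn_cuda:", hasattr(fai, "flash_attn_cuda"))
    print("flash_attn_gpu: ", hasattr(fai, "flash_attn_gpu"))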
@shimmyshimmer
Collaborator

Try uninstalling and reinstalling Unsloth. If the error still persists, let us know:
https://docs.unsloth.ai/get-started/install-update/updating

@shimmyshimmer shimmyshimmer added the unsure bug? I'm unsure label Dec 18, 2024
@jmparejaz
Author

The error message is still happening.

@R3xpook

R3xpook commented Dec 28, 2024

I have the same problem with the latest flash-attn version; with 2.7.0.post2 it works fine (WSL).

@hengck23

hengck23 commented Dec 31, 2024

Maybe edit this file:
lib/python3.11/site-packages/unsloth/models/_utils.py

I think flash_attn_cuda no longer exists; instead we now have flash_attn_gpu (an alias for the flash_attn_2_cuda extension).
Please verify that the change is correct (e.g., run on some data and compute some metric), as I have not tried it.

    # from flash_attn.flash_attn_interface import flash_attn_cuda
    from flash_attn.flash_attn_interface import flash_attn_gpu as flash_attn_cuda
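
A more defensive variant of that edit (just a sketch, untested, assuming only that one of the two names exists in the installed release) keeps working across both old and new flash-attn versions:

    # Try the new name first, then fall back to the old one.
    try:
        from flash_attn.flash_attn_interface import flash_attn_gpu as flash_attn_cuda
    except ImportError:
        from flash_attn.flash_attn_interface import flash_attn_cuda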

@n9Mtq4

n9Mtq4 commented Jan 4, 2025

I'm getting the same thing.

In Dao-AILab/flash-attention#1203 (specific location), flash_attn_cuda was renamed to flash_attn_gpu, which causes unsloth to think FA is broken.

I went with a different workaround: re-adding flash_attn_cuda to flash-attn, since I wanted to make sure flash_attn_cuda stays available in case it is imported elsewhere.

I just edited venv/lib/python3.12/site-packages/flash_attn/flash_attn_interface.py:

USE_TRITON_ROCM = os.getenv("FLASH_ATTENTION_TRITON_AMD_ENABLE", "FALSE") == "TRUE"
if USE_TRITON_ROCM:
    from .flash_attn_triton_amd import interface_fa as flash_attn_gpu
else:
    import flash_attn_2_cuda as flash_attn_gpu
+   flash_attn_cuda = flash_attn_gpu  # added line: keep the old name as an alias
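
If you would rather not edit files inside site-packages, setting a runtime alias before importing unsloth should have the same effect (a sketch under the assumption that everything importing flash_attn_cuda does so after this runs):

    # Restore the old attribute name at runtime instead of patching the installed package.
    import flash_attn.flash_attn_interface as fai

    if not hasattr(fai, "flash_attn_cuda") and hasattr(fai, "flash_attn_gpu"):
        fai.flash_attn_cuda = fai.flash_attn_gpu  # alias the renamed backend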

@Taimin

Taimin commented Jan 15, 2025

Same here. I got a warning about flash-attn not being installed. I used pip install flash-attn --no-build-isolation to install the package, and solved the issue with the workaround mentioned above.

@danielhanchen
Contributor

Apologies a lot for this, and sorry I missed it entirely. I added a fix to the nightly branch. So sorry about the issue!

@danielhanchen danielhanchen added currently fixing Am fixing now! and removed unsure bug? I'm unsure labels Jan 16, 2025
@danielhanchen danielhanchen changed the title how to fix flash attention broken installation Flash-Attn breaks with flash_attn_gpu Jan 16, 2025
@weiminw

weiminw commented Jan 17, 2025

+1

@danielhanchen
Contributor

I just fixed it! Apologies for the delay!

For local machines, please update Unsloth via:

pip install --upgrade --no-cache-dir --force-reinstall --no-deps unsloth unsloth_zoo
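
After upgrading, a quick check (a sketch; it assumes a CUDA environment with a recent flash-attn installed) is to import everything again and confirm the warning above no longer appears:

    # With the fix applied, importing unsloth should no longer print the
    # "Your Flash Attention 2 installation seems to be broken?" warning.
    import unsloth
    from flash_attn.flash_attn_interface import flash_attn_gpu
    print("flash-attn backend resolved:", flash_attn_gpu is not None)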

@danielhanchen danielhanchen added fixed - pending confirmation Fixed, waiting for confirmation from poster and removed currently fixing Am fixing now! labels Jan 20, 2025
@danielhanchen danielhanchen changed the title Flash-Attn breaks with flash_attn_gpu [FIXED] Flash-Attn breaks with flash_attn_gpu Jan 20, 2025
@danielhanchen danielhanchen pinned this issue Jan 20, 2025