
[Issue]: Tesla M40 GPU reports CUBLAS_STATUS_NOT_SUPPORTED #3552

Open
2 tasks done
edward-kirk opened this issue Nov 2, 2024 · 1 comment
Labels
platform Platform specific problem

Comments

@edward-kirk

Issue Description

I am getting the following error on a Tesla M40 24 GB:
Python: version=3.10.15 platform=Linux
bin="/home/kirk/sdNext/venv/bin/python3"
venv="/home/kirk/sdNext/venv"
12:42:45-813967 INFO Version: app=sd.next updated=2024-11-02 hash=65ddc611
branch=master
url=https://github.com/vladmandic/automatic/tree/master ui=main
12:42:46-146439 INFO Platform: arch=x86_64 cpu=x86_64 system=Linux
release=6.8.0-48-generic python=3.10.15
12:42:46-147900 INFO Args: []
12:42:46-156748 INFO CUDA: nVidia toolkit detected
12:42:46-157737 INFO Install: package="onnxruntime-gpu" mode=pip
12:43:04-069380 INFO Install: package="torch==2.5.1+cu124 torchvision==0.20.1+cu124
--index-url https://download.pytorch.org/whl/cu124" mode=pip
12:44:36-079777 INFO Install: package="onnx" mode=pip
12:44:39-368315 INFO Install: package="onnxruntime" mode=pip
12:54:52-085814 INFO Base: class=StableDiffusionPipeline
12:54:52-500710 ERROR Prompt parser encode: CUDA error: CUBLAS_STATUS_NOT_SUPPORTED when
calling cublasGemmStridedBatchedEx(handle, opa, opb, (int)m, (int)n, (int)k, (void*)&falpha, a, CUDA_R_16BF, (int)lda, stridea, b, CUDA_R_16BF, (int)ldb, strideb, (void*)&fbeta, c, CUDA_R_16BF, (int)ldc, stridec, (int)num_batches, compute_type, CUBLAS_GEMM_DEFAULT_TENSOR_OP)
12:54:52-525193 ERROR Processing: step=base args={'prompt': ['test'], 'negative_prompt':
[''], 'guidance_scale': 6, 'generator': [<torch._C.Generator
object at 0x7e3650e3c5f0>], 'callback_on_step_end': <function
diffusers_callback at 0x7e3697f4cf70>,
'callback_on_step_end_tensor_inputs': ['latents', 'prompt_embeds',
'negative_prompt_embeds'], 'num_inference_steps': 20, 'eta': 1.0,
'guidance_rescale': 0.7, 'output_type': 'latent', 'width': 1024,
'height': 1024} CUDA error: CUBLAS_STATUS_NOT_SUPPORTED when
calling cublasGemmStridedBatchedEx(handle, opa, opb, (int)m, (int)n, (int)k, (void*)&falpha, a, CUDA_R_16BF, (int)lda, stridea, b, CUDA_R_16BF, (int)ldb, strideb, (void*)&fbeta, c, CUDA_R_16BF, (int)ldc, stridec, (int)num_batches, compute_type, CUBLAS_GEMM_DEFAULT_TENSOR_OP)
12:54:52-529346 ERROR Processing: RuntimeError

Version Platform Description

No response

Relevant log output

No response

Backend

Diffusers

UI

Standard

Branch

Master

Model

StableDiffusion 1.5

Acknowledgements

  • I have read the above and searched for existing issues
  • I confirm that this is classified correctly and it's not an extension issue
@vladmandic
Owner

Try forcing dtype in settings to fp16 instead of auto, since this is a really old GPU architecture.
If that doesn't work, you may need to search for a version of torch that works with the M40; there's not much I can do about that.
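The error above comes from cuBLAS rejecting a bf16 (`CUDA_R_16BF`) batched GEMM: bf16 GEMM kernels require Ampere-class (sm_80) or newer hardware, while the Tesla M40 is Maxwell (sm_52), so the "auto" dtype choice picks a precision the card cannot execute. A minimal, hypothetical sketch of the capability-to-dtype logic being suggested here (the function name and thresholds are illustrative, not sd.next's actual implementation):

```python
# Hypothetical sketch: choose a GEMM-safe dtype from the CUDA compute capability.
# bf16 cuBLAS GEMMs are only supported on Ampere (sm_80) and newer; on Maxwell
# (sm_52, e.g. Tesla M40) a CUDA_R_16BF call fails with
# CUBLAS_STATUS_NOT_SUPPORTED, while fp16 still works.
def pick_dtype(major: int, minor: int) -> str:
    if major >= 8:
        # Ampere and newer: bf16 tensor-core GEMMs are available
        return "bfloat16"
    if major >= 5:
        # Maxwell/Pascal/Volta/Turing: fp16 works, bf16 does not
        return "float16"
    # Very old parts: stay in full precision
    return "float32"

print(pick_dtype(5, 2))  # Tesla M40 -> float16 (the fix suggested above)
print(pick_dtype(8, 6))  # e.g. RTX 3090 -> bfloat16
```

This mirrors the advice in the comment: on an M40, overriding "auto" with fp16 sidesteps the unsupported bf16 code path entirely.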

@vladmandic vladmandic changed the title [Issue]: CUBLAS_STATUS_NOT_SUPPORTED when calling `cublasGemmStridedBatchedEx` to [Issue]: Tesla M40 GPU reports CUBLAS_STATUS_NOT_SUPPORTED Nov 2, 2024
@vladmandic vladmandic added the platform Platform specific problem label Nov 2, 2024