
[Issue]: Tesla M40 GPU reports CUBLAS_STATUS_NOT_SUPPORTED #3552

Open
2 tasks done
edward-kirk opened this issue Nov 2, 2024 · 1 comment
Labels
platform Platform specific problem

Comments

@edward-kirk

Issue Description

I am getting the following error on a Tesla M40 24 GB:
Python: version=3.10.15 platform=Linux
bin="/home/kirk/sdNext/venv/bin/python3"
venv="/home/kirk/sdNext/venv"
12:42:45-813967 INFO Version: app=sd.next updated=2024-11-02 hash=65ddc611
branch=master
url=https://github.com/vladmandic/automatic/tree/master ui=main
12:42:46-146439 INFO Platform: arch=x86_64 cpu=x86_64 system=Linux
release=6.8.0-48-generic python=3.10.15
12:42:46-147900 INFO Args: []
12:42:46-156748 INFO CUDA: nVidia toolkit detected
12:42:46-157737 INFO Install: package="onnxruntime-gpu" mode=pip
12:43:04-069380 INFO Install: package="torch==2.5.1+cu124 torchvision==0.20.1+cu124
--index-url https://download.pytorch.org/whl/cu124" mode=pip
12:44:36-079777 INFO Install: package="onnx" mode=pip
12:44:39-368315 INFO Install: package="onnxruntime" mode=pip
12:54:52-085814 INFO Base: class=StableDiffusionPipeline
12:54:52-500710 ERROR Prompt parser encode: CUDA error: CUBLAS_STATUS_NOT_SUPPORTED when
calling cublasGemmStridedBatchedEx(handle, opa, opb, (int)m, (int)n, (int)k, (void*)&falpha, a, CUDA_R_16BF, (int)lda, stridea, b, CUDA_R_16BF, (int)ldb, strideb, (void*)&fbeta, c, CUDA_R_16BF, (int)ldc, stridec, (int)num_batches, compute_type, CUBLAS_GEMM_DEFAULT_TENSOR_OP)
12:54:52-525193 ERROR Processing: step=base args={'prompt': ['test'], 'negative_prompt':
[''], 'guidance_scale': 6, 'generator': [<torch._C.Generator
object at 0x7e3650e3c5f0>], 'callback_on_step_end': <function
diffusers_callback at 0x7e3697f4cf70>,
'callback_on_step_end_tensor_inputs': ['latents', 'prompt_embeds',
'negative_prompt_embeds'], 'num_inference_steps': 20, 'eta': 1.0,
'guidance_rescale': 0.7, 'output_type': 'latent', 'width': 1024,
'height': 1024} CUDA error: CUBLAS_STATUS_NOT_SUPPORTED when
calling cublasGemmStridedBatchedEx(handle, opa, opb, (int)m, (int)n, (int)k, (void*)&falpha, a, CUDA_R_16BF, (int)lda, stridea, b, CUDA_R_16BF, (int)ldb, strideb, (void*)&fbeta, c, CUDA_R_16BF, (int)ldc, stridec, (int)num_batches, compute_type, CUBLAS_GEMM_DEFAULT_TENSOR_OP)
12:54:52-529346 ERROR Processing: RuntimeError

Version Platform Description

No response

Relevant log output

No response

Backend

Diffusers

UI

Standard

Branch

Master

Model

StableDiffusion 1.5

Acknowledgements

  • I have read the above and searched for existing issues
  • I confirm that this is classified correctly and it's not an extension issue
@vladmandic
Owner

Try forcing dtype in settings to fp16 instead of auto, since this is a really old GPU architecture.
If that doesn't work, you may need to search for a version of torch that works with the M40; there's not much I can do about that.
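The error above comes from cuBLAS rejecting a bf16 (`CUDA_R_16BF`) batched GEMM: bf16 GEMM kernels require Ampere-class (sm_80) or newer hardware, while the Tesla M40 is Maxwell (sm_52), so the "auto" dtype choice picks a precision the card cannot execute. A minimal, hypothetical sketch of the capability-to-dtype logic being suggested here (the function name and thresholds are illustrative, not sd.next's actual implementation):

```python
# Hypothetical sketch: choose a GEMM-safe dtype from the CUDA compute capability.
# bf16 cuBLAS GEMMs are only supported on Ampere (sm_80) and newer; on Maxwell
# (sm_52, e.g. Tesla M40) a CUDA_R_16BF call fails with
# CUBLAS_STATUS_NOT_SUPPORTED, while fp16 still works.
def pick_dtype(major: int, minor: int) -> str:
    if major >= 8:
        # Ampere and newer: bf16 tensor-core GEMMs are available
        return "bfloat16"
    if major >= 5:
        # Maxwell/Pascal/Volta/Turing: fp16 works, bf16 does not
        return "float16"
    # Very old parts: stay in full precision
    return "float32"

print(pick_dtype(5, 2))  # Tesla M40 -> float16 (the fix suggested above)
print(pick_dtype(8, 6))  # e.g. RTX 3090 -> bfloat16
```

This mirrors the advice in the comment: on an M40, overriding "auto" with fp16 sidesteps the unsupported bf16 code path entirely.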

@vladmandic vladmandic changed the title [Issue]: CUBLAS_STATUS_NOT_SUPPORTED when calling `cublasGemmStridedBatchedEx` to [Issue]: Tesla M40 GPU reports CUBLAS_STATUS_NOT_SUPPORTED Nov 2, 2024
@vladmandic vladmandic added the platform Platform specific problem label Nov 2, 2024