testing_dgeqrf fails in CUDA runs #70

Closed
therault opened this issue Mar 17, 2023 · 1 comment
Labels
bug Something isn't working

Comments

@therault
Contributor

Describe the bug

DGEQRF (at least the PTG version) appears to be broken for CUDA runs.

To Reproduce

Steps to reproduce the behavior:

  1. Check out current master
  2. Compile with CUDA enabled (e.g. use modules hwloc cuda gcc openmpi gdb ninja cmake intel-mkl python on leconte) and let cmake detect everything
  3. Run ./tests/testing_dgeqrf -N 4096 -t 1024 -x -g 1
  4. See error

Expected behavior

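The test should run to completion. Instead, the CUDA driver complains of misaligned memory accesses (the log below reports "an illegal memory access was encountered") and bails out: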

~/dplasma/out/build/Debug $ ./tests/testing_dgeqrf -N 4096 -t 1024 -x -g 1
W@00000 /!\ PERFORMANCE MIGHT BE REDUCED /!\: The binding defined by --parsec_bind has been ignored!
	This option requires a build with HWLOC with bitmap support.
#+++++ cores detected       : 80
#+++++ nodes x cores + gpu  : 1 x 80 + 1 (80+1)
#+++++ thread mode          : THREAD_SERIALIZED
#+++++ P x Q                : 1 x 1 (1/1)
#+++++ M x N x K|NRHS       : 4096 x 4096 x 1
#+++++ MB x NB , IB         : 1024 x 1024 , 32
#+++++ KP x KQ              : 4 x 1
W@00000 /home/herault/dplasma/parsec/parsec/mca/device/cuda/device_cuda_module.c:2012 (progress_stream) cudaEventQuery an illegal memory access was encountered
W@00000 Critical issue related to the GPU discovered. Giving up
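
When this reproduction is run from a script or CI job, the failure can be detected from the captured output rather than by eye. The following is a minimal sketch, assuming the two warning strings in the log above are stable markers of this failure; the function name `find_gpu_failures` and the pattern list are hypothetical and not part of DPLASMA or PaRSEC.

```python
import re

# Patterns copied from the log in this report. Treating them as stable
# markers of the failure is an assumption, not documented behavior.
GPU_FAILURE_PATTERNS = (
    re.compile(r"cudaEventQuery an illegal memory access was encountered"),
    re.compile(r"Critical issue related to the GPU discovered"),
)


def find_gpu_failures(log_text):
    """Return the log lines that match any of the assumed GPU failure patterns."""
    return [
        line
        for line in log_text.splitlines()
        if any(pattern.search(line) for pattern in GPU_FAILURE_PATTERNS)
    ]


if __name__ == "__main__":
    # Sample text taken from the output above (source path shortened for brevity).
    sample = (
        "#+++++ MB x NB , IB         : 1024 x 1024 , 32\n"
        "W@00000 device_cuda_module.c:2012 (progress_stream) "
        "cudaEventQuery an illegal memory access was encountered\n"
        "W@00000 Critical issue related to the GPU discovered. Giving up\n"
    )
    for failure in find_gpu_failures(sample):
        print(failure)
```

Feeding it the captured stdout and stderr of `./tests/testing_dgeqrf -N 4096 -t 1024 -x -g 1` and checking for a non-empty result would flag the run as failed.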
@therault therault added the bug Something isn't working label Mar 17, 2023
@abouteiller
Contributor

More information in #110.
