OOB trap is fatal for CPU back-ends #249

Open
maleadt opened this issue Oct 4, 2024 · 4 comments

Comments

@maleadt
Member

maleadt commented Oct 4, 2024

julia> using OpenCL, pocl_jll

julia> function kernel(x)
       x[1]
       return
       end
kernel (generic function with 1 method)

julia> @opencl kernel(CLArray{Int}(undef, 0))
ERROR: Out-of-bounds array access.

[8653] signal (4.1): Illegal instruction: 4
@maleadt
Member Author

maleadt commented Oct 8, 2024

The problem is that SPIR-V only has OpUnreachable, with no equivalent of PTX's exit. And signal_exception already does nothing right now (which ought to be fixed in its own right); the trap comes from Julia, and is relatively hard to avoid without rewriting the IR right before SPIR-V generation. And even then there's no guarantee that the back-end's LLVM won't reconstruct the unreachable and insert a trap again...

@VarLad

VarLad commented Oct 9, 2024

Apologies for adding noise. Another way to run on a CPU back-end (on Linux) would be:

RUSTICL_ENABLE=llvmpipe RUSTICL_DEVICE_TYPE=gpu julia

which interestingly gives a different error:

julia> @opencl kernel(CLArray{Int}(undef, 0))
ERROR: Out-of-bounds array access.
OpenCL.HostKernel{typeof(kernel), Tuple{CLDeviceVector{Int64, 1}}}(kernel, OpenCL.Kernel("_Z6kernel13CLDeviceArrayI5Int64Ll1ELl1EE" nargs=1))

The error is the same if I run it on NVIDIA/AMD GPUs as well.
Not sure if it's of any relevance, just thought it'd be cool to share :)

@maleadt
Member Author

maleadt commented Oct 15, 2024

Definitely interesting, as it means that rusticl lowers trap differently than pocl does. May be worth opening an issue about.

EDIT: opened an issue, pocl/pocl#1619

@maleadt
Member Author

maleadt commented Oct 15, 2024

@VarLad Could you check whether the kernel actually exits after encountering an exception, i.e., if no code after it gets executed? And/or, if possible, post the generated native code for the kernel.
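
One way to run that check could look like the sketch below. It is not taken from the thread: the `probe` kernel and the `flag` array are hypothetical names, and it assumes `CLArray(zeros(Int, 1))` uploads a host array and `Array(flag)` copies it back. It is only meaningful on a back-end where the launch returns (as reported for rusticl above); with pocl the process is expected to die on the trap before the read-back.

```julia
using OpenCL

# Hypothetical probe kernel: the out-of-bounds read comes first,
# followed by a write that should never happen if the kernel exits.
function probe(x, flag)
    x[1]          # out-of-bounds on an empty array, raises the exception
    flag[1] = 1   # only reached if execution continues past the exception
    return
end

# Assumption: constructing a CLArray from a host Array uploads its contents.
flag = CLArray(zeros(Int, 1))

@opencl probe(CLArray{Int}(undef, 0), flag)

# Copy the flag back to the host; 0 means the write was skipped,
# 1 means code after the exception was executed.
result = Array(flag)
@show result
```

Since the code after the failed bounds check is reached through an `unreachable`, whatever this prints is formally undefined behavior; it only shows what a particular driver happens to do.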


2 participants