-
Notifications
You must be signed in to change notification settings - Fork 165
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DFT] Introduce the cuFFT backend for the DFT interface. #284
Conversation
evidence of the tests running. real-real transforms aren't implemented. cufft_backend_run.txt |
Multidimensional tests (#275) rebased onto these changes - |
@anantsrivastava30 could you please take a look at this PR? Thanks! |
@anantsrivastava30 The conflicts caused by the #282 are quite big so I will need some time to sort that out. I'll give you a ping when I'm ready. |
@anantsrivastava30 this is ready for review now. Sorry for the wait, I had a horrible bug with cufftDestroy changing the cuda context for no known reason. Should be fine now. |
@anantsrivastava30 I've fixed that merge conflict now so please review when you're ready |
@@ -165,6 +163,8 @@ int DFT_Test<precision, domain>::test_in_place_buffer() { | |||
} | |||
} | |||
|
|||
// account for scaling that occurs during DFT | |||
std::for_each(input.begin(), input.end(), [this](auto& x) { x *= forward_elements; }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you need to do that instead of using the backward scale as it was before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cuFFT doesn't support forward or backward scaling. We consider adding a kernel that would do a simple multiplication, but decided that users could easily integrate this into their own kernels and get much better performance (since they could perform the scaling at the same time as other work, avoiding the cost of loading and storing all the data just to perform a single divide). Testing for forward and backward scale is included in our testing roadmap, a which point we will test many values, not just 1/N.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if a user of the oneMKL Interface library is using the oneMKL SYCL APIs with cuFFT backend and setting the bwd_scale then the scale will not be applied to the result of the FFT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that does seem bad.
I could add a check to the commit step for cuFFT so that an exception is thrown when BACKWARD|FORWARD_SCALE != 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a fix in fb06c46. This does have a problem in that I had to rearrange the test since the invalid scale was stored, and then all subsequent attempts to commit became invalid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should find a way to make the backward, forward scale works for the cuFFT backend.
The goal of this project is to demonstrate that we can implement the oneMKL SYCL APIs as defined in the oneMKL specification for many platforms.
So as is we have a gap for Nvidia platform.
We could add the scale for the Nvidia backend in a different PR though and as a first step maybe throw an "unimplemented" exception for those scale for the cuFFT backend and update the test only for the cuFFT backend to either apply the scale on the test side if the exception is catch or just simply skip scaling in the cuFFT backend tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the team is open to adding an extra kernel that will do the scaling when FORWARD/BACKWARD_SCALE != 1
if it comes with some documentation that for cuFFT it will likely add overhead. We would like to add this in a later PR though.
When we get to adding the FORWARD/BACKWARD_SCALE tests, a range of values will be tested and the test will skip when a backend throws an "unimplemented" exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please open an issue to track this?
Thanks for the log! Could you also please check that all oneMKL GPU backend tests are passing after these changes and attached the log to the PR? |
can you add the output for the exmaple with NVIDIA backend. |
Log from running after the latest changes. Output of the example @anantsrivastava30
|
Test run after the strides validity check changes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me as long as the unimplemented scale is tracked with an issue. Thanks!
Test log cufft_run.log |
…on#284) * [DFT] Rearrange DFT compute tests so unimplemented always skips (uxlfoundation#311) * rearrange tests so unimplemented always skips * wait to wait_and_throw, detect skipped tests * Initial cuFFT integration Currently only has support for inplace complex-to-complex single precision transforms * throw from host task directly * remove detail namespace where possible * format * update after rebase * style change * Implemented all cufft execution functions * Increase the relative error margin so cufft backend passes tests * Fix swapped input and output strides * fix compile-time tests for cufft * fix macro typo * fix non cuda build and increase test accuracy error margin * update README * format with clang-format-10 * enable recommit in cuda backend * change cuda context after call to cufftDestroy * update dft example cmake * update example readme * typo in ENABLE_CUFFT_BACKEND description * Update help text for the various backends * use the correct copyright headers * Fix cmake comment * fix binary name in example * Add an exception for when the user tries to scale with cufft * fix warnings * removed forward_scale in runtime example for cufft * avoid creating plans with invalid strides
Description
Introduce a cuFFT backend for the dft interface. I believe this exposes as much of the cuFFT library as we can use.
Checklist
All Submissions