Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DFT] Introduce the cuFFT backend for the DFT interface. #284

Merged
merged 28 commits into from
May 9, 2023

Conversation

FMarno
Copy link
Contributor

@FMarno FMarno commented Mar 1, 2023

Description

Introduce a cuFFT backend for the dft interface. I believe this exposes as much of the cuFFT library as we can use.

Checklist

All Submissions

  • Do all unit tests pass locally? Attach a log.
  • Have you formatted the code using clang-format?

@FMarno FMarno changed the title [DFT] Complete the cuFFT backend for the DFT interface. [DFT] Introduce the cuFFT backend for the DFT interface. Mar 9, 2023
@FMarno
Copy link
Contributor Author

FMarno commented Mar 10, 2023

evidence of the tests running. real-real transforms aren't implemented. cufft_backend_run.txt

@FMarno
Copy link
Contributor Author

FMarno commented Mar 13, 2023

Multidimensional tests (#275) rebased onto these changes -
cufft_backend_run.txt

@lhuot
Copy link
Contributor

lhuot commented Mar 29, 2023

@anantsrivastava30 could you please take a look at this PR? Thanks!

@FMarno
Copy link
Contributor Author

FMarno commented Mar 29, 2023

@anantsrivastava30 The conflicts caused by the #282 are quite big so I will need some time to sort that out. I'll give you a ping when I'm ready.

@FMarno FMarno marked this pull request as draft March 29, 2023 11:22
CMakeLists.txt Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
@FMarno FMarno marked this pull request as ready for review April 5, 2023 16:25
@FMarno
Copy link
Contributor Author

FMarno commented Apr 5, 2023

@anantsrivastava30 this is ready for review now. Sorry for the wait, I had a horrible bug with cufftDestroy changing the cuda context for no known reason. Should be fine now.

@FMarno
Copy link
Contributor Author

FMarno commented Apr 18, 2023

@anantsrivastava30 I've fixed that merge conflict now so please review when you're ready

CMakeLists.txt Outdated Show resolved Hide resolved
examples/README.md Show resolved Hide resolved
src/dft/backends/cufft/backward.cpp Show resolved Hide resolved
@@ -165,6 +163,8 @@ int DFT_Test<precision, domain>::test_in_place_buffer() {
}
}

// account for scaling that occurs during DFT
std::for_each(input.begin(), input.end(), [this](auto& x) { x *= forward_elements; });
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you need to do that instead of using the backward scale as it was before?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cuFFT doesn't support forward or backward scaling. We consider adding a kernel that would do a simple multiplication, but decided that users could easily integrate this into their own kernels and get much better performance (since they could perform the scaling at the same time as other work, avoiding the cost of loading and storing all the data just to perform a single divide). Testing for forward and backward scale is included in our testing roadmap, a which point we will test many values, not just 1/N.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if a user of the oneMKL Interface library is using the oneMKL SYCL APIs with cuFFT backend and setting the bwd_scale then the scale will not be applied to the result of the FFT?

Copy link
Contributor Author

@FMarno FMarno Apr 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that does seem bad.
I could add a check to the commit step for cuFFT so that an exception is thrown when BACKWARD|FORWARD_SCALE != 1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a fix in fb06c46. This does have a problem in that I had to rearrange the test since the invalid scale was stored, and then all subsequent attempts to commit became invalid.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should find a way to make the backward, forward scale works for the cuFFT backend.
The goal of this project is to demonstrate that we can implement the oneMKL SYCL APIs as defined in the oneMKL specification for many platforms.
So as is we have a gap for Nvidia platform.
We could add the scale for the Nvidia backend in a different PR though and as a first step maybe throw an "unimplemented" exception for those scale for the cuFFT backend and update the test only for the cuFFT backend to either apply the scale on the test side if the exception is catch or just simply skip scaling in the cuFFT backend tests.

Copy link
Contributor Author

@FMarno FMarno Apr 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the team is open to adding an extra kernel that will do the scaling when FORWARD/BACKWARD_SCALE != 1 if it comes with some documentation that for cuFFT it will likely add overhead. We would like to add this in a later PR though.
When we get to adding the FORWARD/BACKWARD_SCALE tests, a range of values will be tested and the test will skip when a backend throws an "unimplemented" exception.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please open an issue to track this?

@lhuot
Copy link
Contributor

lhuot commented Apr 19, 2023

evidence of the tests running. real-real transforms aren't implemented. cufft_backend_run.txt

Thanks for the log! Could you also please check that all oneMKL GPU backend tests are passing after these changes and attached the log to the PR?

@anantsrivastava30
Copy link
Contributor

can you add the output for the exmaple with NVIDIA backend.

@FMarno
Copy link
Contributor Author

FMarno commented Apr 20, 2023

Log from running after the latest changes.

cufft_run.log
mklgpu_run.log

Output of the example @anantsrivastava30


########################################################################
# DFTI complex in-place forward transform with USM API example:
#
# Using APIs:
#   USM forward complex in-place
#   Run-time dispatch
#
# Using single precision (float) data type
#
# Device will be selected during runtime.
# The environment variable SYCL_DEVICE_FILTER can be used to specify
# SYCL device
#
########################################################################

Running DFT complex forward example on GPU device
Device name is: NVIDIA A100-PCIE-40GB
Running with single precision real data type:
DFT example run_time dispatch
DFT example ran OK

@FMarno
Copy link
Contributor Author

FMarno commented Apr 21, 2023

Test run after the strides validity check changes
cufft_run.log

@FMarno FMarno force-pushed the cuFFT_complete branch from e34aae2 to 677ad7d Compare May 1, 2023 09:28
Copy link
Contributor

@lhuot lhuot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me as long as the unimplemented scale is tracked with an issue. Thanks!

@FMarno
Copy link
Contributor Author

FMarno commented May 9, 2023

Thanks @lhuot. I've created an issue here #313.
I have fixed the merge conflict but github seems to not be picking up the changes I've pushed???
When I get a chance I will run the tests once more

@FMarno
Copy link
Contributor Author

FMarno commented May 9, 2023

Test log cufft_run.log

@FMarno FMarno merged commit 8155847 into uxlfoundation:develop May 9, 2023
@FMarno FMarno deleted the cuFFT_complete branch May 9, 2023 16:47
normallytangent pushed a commit to normallytangent/oneMKL that referenced this pull request Aug 6, 2024
…on#284)

* [DFT] Rearrange DFT compute tests so unimplemented always skips (uxlfoundation#311)

* rearrange tests so unimplemented always skips

* wait to wait_and_throw, detect skipped tests

* Initial cuFFT integration

Currently only has support for inplace complex-to-complex single precision transforms

* throw from host task directly

* remove detail namespace where possible

* format

* update after rebase

* style change

* Implemented all cufft execution functions

* Increase the relative error margin so cufft backend passes tests

* Fix swapped input and output strides

* fix compile-time tests for cufft

* fix macro typo

* fix non cuda build and increase test accuracy error margin

* update README

* format with clang-format-10

* enable recommit in cuda backend

* change cuda context after call to cufftDestroy

* update dft example cmake

* update example readme

* typo in ENABLE_CUFFT_BACKEND description

* Update help text for the various backends

* use the correct copyright headers

* Fix cmake comment

* fix binary name in example

* Add an exception for when the user tries to scale with cufft

* fix warnings

* removed forward_scale in runtime example for cufft

* avoid creating plans with invalid strides
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants