[DFT] Introduce the cuFFT backend for the DFT interface. #284

FMarno · 2023-03-01T16:29:29Z

Description

Introduce a cuFFT backend for the dft interface. I believe this exposes as much of the cuFFT library as we can use.

Checklist

All Submissions

Do all unit tests pass locally? Attach a log.
Have you formatted the code using clang-format?

FMarno · 2023-03-10T18:55:03Z

Evidence of the tests running: cufft_backend_run.txt. Real-real transforms aren't implemented.

FMarno · 2023-03-13T15:57:15Z

Multidimensional tests (#275) rebased onto these changes: cufft_backend_run.txt

lhuot · 2023-03-29T07:14:32Z

@anantsrivastava30 could you please take a look at this PR? Thanks!

FMarno · 2023-03-29T11:21:52Z

@anantsrivastava30 The conflicts caused by #282 are quite big, so I will need some time to sort them out. I'll give you a ping when I'm ready.

CMakeLists.txt

README.md

FMarno · 2023-04-05T16:26:31Z

@anantsrivastava30 this is ready for review now. Sorry for the wait; I had a horrible bug where cufftDestroy changed the CUDA context for no known reason. Should be fine now.
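One common way to guard against a library call that changes "current" state as a side effect is to save that state beforehand and restore it afterwards. The sketch below illustrates the pattern with plain integers standing in for contexts. It is not the code from this PR: `restore_on_exit`, `get_current_context`, `set_current_context`, `destroy_plan_clobbering_context` and `destroy_plan_preserving_context` are all hypothetical names, and a real backend would use the CUDA context calls instead of these stand-ins.

```cpp
#include <utility>

// Restores a saved piece of state by calling `setter(saved)` on scope exit.
template <typename State, typename Setter>
class restore_on_exit {
public:
    restore_on_exit(State saved, Setter setter)
            : saved_(std::move(saved)),
              setter_(std::move(setter)) {}
    ~restore_on_exit() {
        setter_(saved_);
    }
    restore_on_exit(const restore_on_exit&) = delete;
    restore_on_exit& operator=(const restore_on_exit&) = delete;

private:
    State saved_;
    Setter setter_;
};

// Hypothetical stand-ins for "get/set the current device context".
static int g_current_context = 0;
int get_current_context() {
    return g_current_context;
}
void set_current_context(int ctx) {
    g_current_context = ctx;
}

// Hypothetical stand-in for a destroy call that clobbers the current context.
void destroy_plan_clobbering_context() {
    g_current_context = -1;
}

// Wraps the destroy call so the caller's context is back in place on return.
void destroy_plan_preserving_context() {
    restore_on_exit<int, void (*)(int)> guard(get_current_context(), &set_current_context);
    destroy_plan_clobbering_context();
}
```

After `set_current_context(7)`, calling `destroy_plan_preserving_context()` leaves the current context at 7, whereas calling the clobbering function directly leaves it at -1.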

FMarno · 2023-04-18T10:26:33Z

@anantsrivastava30 I've fixed that merge conflict now, so please review when you're ready.

CMakeLists.txt

examples/README.md

src/dft/backends/cufft/backward.cpp

lhuot · 2023-04-19T12:11:06Z

tests/unit_tests/dft/include/compute_inplace.hpp

@@ -165,6 +163,8 @@ int DFT_Test<precision, domain>::test_in_place_buffer() {
        }
    }

+    // account for scaling that occurs during DFT
+    std::for_each(input.begin(), input.end(), [this](auto& x) { x *= forward_elements; });


lhuot · 2023-04-19

Why did you need to do that instead of using the backward scale as it was before?

FMarno · 2023-04-19

cuFFT doesn't support forward or backward scaling. We considered adding a kernel that would do a simple multiplication, but decided that users could easily integrate this into their own kernels and get much better performance (they could perform the scaling at the same time as other work, avoiding the cost of loading and storing all the data just to perform a single divide). Testing for forward and backward scale is included in our testing roadmap, at which point we will test many values, not just 1/N.
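To illustrate why the test now multiplies the input by `forward_elements`: when neither transform applies a scale factor, a forward DFT followed by a backward DFT returns the original data multiplied by N, the number of elements. The following is a minimal CPU-only sketch of that effect using a naive DFT. The names `naive_dft`, `unscaled_round_trip` and `apply_scale` are hypothetical helpers for this illustration and are not part of oneMKL or cuFFT.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Naive O(N^2) DFT with no scaling; sign = -1 for forward, +1 for backward.
std::vector<cplx> naive_dft(const std::vector<cplx>& in, int sign) {
    const std::size_t n = in.size();
    const double two_pi = 2.0 * std::acos(-1.0);
    std::vector<cplx> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        cplx sum{ 0.0, 0.0 };
        for (std::size_t j = 0; j < n; ++j) {
            const double angle =
                sign * two_pi * static_cast<double>(j * k) / static_cast<double>(n);
            sum += in[j] * std::polar(1.0, angle);
        }
        out[k] = sum;
    }
    return out;
}

// Forward then backward transform, neither of which applies a scale.
// The result is the input multiplied by N.
std::vector<cplx> unscaled_round_trip(const std::vector<cplx>& in) {
    return naive_dft(naive_dft(in, -1), +1);
}

// The scaling step a user would otherwise fold into one of their own kernels.
void apply_scale(std::vector<cplx>& data, double scale) {
    for (auto& x : data) {
        x *= scale;
    }
}
```

For a four-element input, `unscaled_round_trip` yields four times the input (up to rounding), and `apply_scale(out, 1.0 / 4.0)` recovers it. In a GPU application the same multiplication could be merged into a neighbouring kernel rather than run as a separate pass.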

lhuot · 2023-04-19

So if a user of the oneMKL Interface library is using the oneMKL SYCL APIs with the cuFFT backend and sets bwd_scale, the scale will not be applied to the result of the FFT?

FMarno · 2023-04-19

Yes, that does seem bad.
I could add a check to the commit step for cuFFT so that an exception is thrown when FORWARD_SCALE or BACKWARD_SCALE != 1.

FMarno · 2023-04-19

I've added a fix in fb06c46. This does have a problem: I had to rearrange the test, since the invalid scale was stored and then all subsequent attempts to commit became invalid.
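A rough sketch of the kind of commit-time check described here, assuming a heavily simplified descriptor. `descriptor_values`, `unimplemented_error` and `commit_without_scaling` are hypothetical stand-ins for illustration and do not reflect the actual code in src/dft/backends/cufft/commit.cpp.

```cpp
#include <stdexcept>

// Hypothetical stand-in for the library's "unimplemented" exception type.
struct unimplemented_error : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical subset of the values a DFT descriptor stores.
struct descriptor_values {
    double fwd_scale = 1.0;
    double bwd_scale = 1.0;
    bool committed = false;
};

// Commit step for a backend that cannot apply scaling: any scale other
// than 1 is rejected before a plan would be created.
void commit_without_scaling(descriptor_values& desc) {
    if (desc.fwd_scale != 1.0 || desc.bwd_scale != 1.0) {
        desc.committed = false;
        throw unimplemented_error("scaling is not supported by this backend");
    }
    desc.committed = true;
}
```

Because the rejected scale stays stored in the descriptor, every later commit keeps throwing until the value is set back to 1, which is the behaviour that forced the test to be rearranged.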

lhuot · 2023-04-20

I think we should find a way to make the backward and forward scales work for the cuFFT backend.
The goal of this project is to demonstrate that we can implement the oneMKL SYCL APIs as defined in the oneMKL specification for many platforms.
So, as is, we have a gap for the Nvidia platform.
We could add the scale for the Nvidia backend in a different PR, though. As a first step, maybe throw an "unimplemented" exception for those scales in the cuFFT backend, and update the test only for the cuFFT backend to either apply the scale on the test side if the exception is caught, or simply skip scaling in the cuFFT backend tests.

FMarno · 2023-04-20

I think the team is open to adding an extra kernel that will do the scaling when FORWARD_SCALE or BACKWARD_SCALE != 1, provided it comes with documentation noting that it will likely add overhead for cuFFT. We would like to add this in a later PR, though.
When we get to adding the FORWARD/BACKWARD_SCALE tests, a range of values will be tested and the test will skip when a backend throws an "unimplemented" exception.
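The "skip when a backend throws unimplemented" behaviour could look roughly like the sketch below. It is framework-agnostic and uses hypothetical names (`unimplemented_error`, `test_result`, `run_or_skip`); the real tests use the project's own test infrastructure and exception types.

```cpp
#include <functional>
#include <stdexcept>

// Hypothetical stand-in for the library's "unimplemented" exception type.
struct unimplemented_error : std::runtime_error {
    using std::runtime_error::runtime_error;
};

enum class test_result { passed, failed, skipped };

// Runs a test body and maps an "unimplemented" exception to a skip,
// so a backend without a feature does not count as a failure.
test_result run_or_skip(const std::function<bool()>& test_body) {
    try {
        return test_body() ? test_result::passed : test_result::failed;
    }
    catch (const unimplemented_error&) {
        return test_result::skipped;
    }
}
```

A scale test would then call `run_or_skip` once per scale value; on a backend that rejects scaling at commit time, every value other than 1 would report `test_result::skipped` instead of failing.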

lhuot · 2023-05-09

Could you please open an issue to track this?

tests/unit_tests/dft/include/compute_tester.hpp

lhuot · 2023-04-19T12:13:52Z

evidence of the tests running. real-real transforms aren't implemented. cufft_backend_run.txt

Thanks for the log! Could you also please check that all oneMKL GPU backend tests are passing after these changes and attach the log to the PR?

anantsrivastava30 · 2023-04-20T07:10:09Z

Can you add the output for the example with the NVIDIA backend?

FMarno · 2023-04-20T17:23:52Z

Log from running after the latest changes.

cufft_run.log
mklgpu_run.log

Output of the example @anantsrivastava30


########################################################################
# DFTI complex in-place forward transform with USM API example:
#
# Using APIs:
#   USM forward complex in-place
#   Run-time dispatch
#
# Using single precision (float) data type
#
# Device will be selected during runtime.
# The environment variable SYCL_DEVICE_FILTER can be used to specify
# SYCL device
#
########################################################################

Running DFT complex forward example on GPU device
Device name is: NVIDIA A100-PCIE-40GB
Running with single precision real data type:
DFT example run_time dispatch
DFT example ran OK

src/dft/backends/cufft/commit.cpp

FMarno · 2023-04-21T17:40:19Z

Test run after the strides validity check changes
cufft_run.log

lhuot

Looks good to me as long as the unimplemented scale is tracked with an issue. Thanks!

FMarno · 2023-05-09T13:15:46Z

Thanks @lhuot. I've created an issue here: #313.
I have fixed the merge conflict, but GitHub doesn't seem to be picking up the changes I've pushed.
When I get a chance I will run the tests once more.

FMarno · 2023-05-09T16:44:53Z

Test log cufft_run.log

…on#284)

* [DFT] Rearrange DFT compute tests so unimplemented always skips (uxlfoundation#311)
* rearrange tests so unimplemented always skips
* wait to wait_and_throw, detect skipped tests
* Initial cuFFT integration Currently only has support for inplace complex-to-complex single precision transforms
* throw from host task directly
* remove detail namespace where possible
* format
* update after rebase
* style change
* Implemented all cufft execution functions
* Increase the relative error margin so cufft backend passes tests
* Fix swapped input and output strides
* fix compile-time tests for cufft
* fix macro typo
* fix non cuda build and increase test accuracy error margin
* update README
* format with clang-format-10
* enable recommit in cuda backend
* change cuda context after call to cufftDestroy
* update dft example cmake
* update example readme
* typo in ENABLE_CUFFT_BACKEND description
* Update help text for the various backends
* use the correct copyright headers
* Fix cmake comment
* fix binary name in example
* Add an exception for when the user tries to scale with cufft
* fix warnings
* removed forward_scale in runtime example for cufft
* avoid creating plans with invalid strides

t4c1 approved these changes Mar 2, 2023

FMarno mentioned this pull request Mar 2, 2023

[DFT] Multi-Dimension Many-Batch DFT tests #275

Merged

2 tasks

FMarno force-pushed the cuFFT_complete branch from 6358f18 to 1a0e8aa Compare March 9, 2023 11:15

FMarno mentioned this pull request Mar 9, 2023

[DFT] Introduce a cuFFT backend for the DFT interface #278

Closed

2 tasks

FMarno changed the title ~~[DFT] Complete the cuFFT backend for the DFT interface.~~ [DFT] Introduce the cuFFT backend for the DFT interface. Mar 9, 2023

Rbiessy approved these changes Mar 13, 2023

hjabird mentioned this pull request Mar 14, 2023

FFT support for Nvidia/AMD question #272

Closed

FMarno force-pushed the cuFFT_complete branch from 789f341 to 146cf56 Compare March 27, 2023 16:11

FMarno marked this pull request as draft March 29, 2023 11:22

FMarno force-pushed the cuFFT_complete branch from f5204c3 to f29c788 Compare March 30, 2023 16:46

Rbiessy reviewed Mar 31, 2023

CMakeLists.txt (outdated)

Rbiessy reviewed Mar 31, 2023

README.md (outdated)

FMarno force-pushed the cuFFT_complete branch from f29c788 to bccaf05 Compare April 5, 2023 14:42

FMarno marked this pull request as ready for review April 5, 2023 16:25

FMarno force-pushed the cuFFT_complete branch from e6bd53e to 5a5eedd Compare April 18, 2023 10:25

lhuot reviewed Apr 19, 2023

FMarno force-pushed the cuFFT_complete branch from fb06c46 to 9b028cc Compare April 20, 2023 16:04

lhuot reviewed Apr 21, 2023

src/dft/backends/cufft/commit.cpp (outdated)

FMarno force-pushed the cuFFT_complete branch from e34aae2 to 677ad7d Compare May 1, 2023 09:28

lhuot approved these changes May 9, 2023

FMarno added 23 commits May 9, 2023 12:39

719b0e1 update after rebase
1c53bfd style change
12d2e9f Implemented all cufft execution functions
4860fdd Increase the relative error margin so cufft backend passes tests
121f554 Fix swapped input and output strides
46b51b3 fix compile-time tests for cufft
2110c2e fix macro typo
3128522 fix non cuda build and increase test accuracy error margin
a9d8154 update README
745f332 format with clang-format-10
0d006c4 enable recommit in cuda backend
14dda42 change cuda context after call to cufftDestroy
00ff378 update dft example cmake
e70a4a6 update example readme
8df0fee typo in ENABLE_CUFFT_BACKEND description
4f63869 Update help text for the various backends
87cdf0d use the correct copyright headers
5e13967 Fix cmake comment
d4465d8 fix binary name in example
bd503eb Add an exception for when the user tries to scale with cufft
0d7fdd5 fix warnings
4b1d0b1 removed forward_scale in runtime example for cufft
82639ca avoid creating plans with invalid strides

FMarno mentioned this pull request May 9, 2023

[DFT] cuFFT FORWARD/BACKWARD_SCALE #313

Open

FMarno merged commit 8155847 into uxlfoundation:develop May 9, 2023

FMarno deleted the cuFFT_complete branch May 9, 2023 16:47

Rbiessy mentioned this pull request Oct 15, 2024

[DOCS]Update new backend guide to support the latest develop #593

Open
