Why is there no "cudaDeviceSynchronize" after the cufftExecC2C call? #199

Open
zhuwanggg opened this issue Jul 22, 2024 · 1 comment

@zhuwanggg

CUFFT_CALL(cufftExecC2C(plan, d_data, d_data, CUFFT_INVERSE));

CUDA_RT_CALL(cudaMemcpyAsync(data.data(), d_data, sizeof(data_type) * data.size(),
                                 cudaMemcpyDeviceToHost, stream));

CUDA_RT_CALL(cudaStreamSynchronize(stream));

But the [cuFFT API Reference](https://docs.nvidia.com/cuda/cufft/) includes a sample that calls cudaDeviceSynchronize right after the call to cufftExecC2C.

@JanuszL added the cuFFT label Aug 5, 2024
@ayushpareek2003

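There is no need for cudaDeviceSynchronize here. Work submitted to a single stream executes in issue order, so the cudaMemcpyAsync only starts after the cufftExecC2C work has finished, and cudaStreamSynchronize(stream) blocks the host until everything queued in that stream is complete, including the FFT. This assumes the plan is bound to the same stream with cufftSetStream; otherwise the FFT is issued to the default stream and the guarantee does not necessarily hold. Adding cudaDeviceSynchronize would also wait on every other stream on the device, which is unnecessary and less efficient.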


3 participants