D3D12RaytracingRealTimeDenoisedAmbientOcclusion - large per frame upload #682

Open
wants to merge 2 commits into
base: master

mackrol commented Dec 19, 2020

Improvement for #681.

m_hemisphereSamplesGPUBuffer.CopyStagingToGpu(frameIndex) is now executed only when the AO samples are recreated, three times in total (once per back buffer).

The array was previously uploaded to the GPU on every frame. Since the array is quite large (87031808 bytes), this caused unnecessary CPU overhead (8 ms of CPU time on an i7 4790K 4.4GHz, 32GB RAM, 2070 RTX). On simpler scenes this was the main bottleneck: in the default scene with grass disabled (#define RENDER_GRASS_GEOMETRY 0), the framerate tripled (~100 to ~300 fps) with GPU timings unaffected.
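The "upload only after a change, once per back buffer" idea can be sketched roughly as below. This is a minimal illustration, not the code from this PR: `PendingUploadCounter`, `MarkDirty`, `ConsumeUpload` and `CountUploads` are hypothetical names, and the sketch assumes that consecutive frames cycle through every frame index, so that decrementing a counter once per frame touches each per-frame copy exactly once.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of the sample): tracks how many per-frame
// copies of a buffer still have to be refreshed after the CPU data changed.
class PendingUploadCounter {
public:
    explicit PendingUploadCounter(std::uint32_t frameCount)
        : m_frameCount(frameCount) {
        assert(frameCount > 0);
    }

    // Call when the CPU-side data is regenerated (e.g. AO samples recreated).
    void MarkDirty() { m_pending = m_frameCount; }

    // Call once per frame. Returns true if this frame's copy must be uploaded.
    bool ConsumeUpload() {
        if (m_pending == 0) return false;
        --m_pending;
        return true;
    }

    std::uint32_t Pending() const { return m_pending; }

private:
    std::uint32_t m_frameCount;
    std::uint32_t m_pending = 0;
};

// Simulates `frames` consecutive frames and counts how many uploads occur.
int CountUploads(PendingUploadCounter& counter, int frames) {
    int uploads = 0;
    for (int i = 0; i < frames; ++i) {
        if (counter.ConsumeUpload()) {
            // In the sample, this is where a call such as
            // m_hemisphereSamplesGPUBuffer.CopyStagingToGpu(frameIndex)
            // would be issued.
            ++uploads;
        }
    }
    return uploads;
}
```

With a frame count of 3, calling `MarkDirty()` once leads to exactly three uploads over the following frames and none afterwards, which is the behavior described above.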

TODO:

  • The array is unnecessarily triple-buffered, so it has to be uploaded three times on consecutive frames. Only a single upload is required since the content is unchanged.
  • To reduce its size, the precision of the array can be reduced to half float or even 8-bit SNORM.
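As a rough illustration of the 8-bit SNORM option, the sketch below quantizes components in [-1, 1] to signed bytes and back. It is an assumption-laden example rather than a proposed patch: `EncodeSnorm8`, `DecodeSnorm8`, `PackedSample` and `PackSample` are hypothetical names, and the actual element layout of the sample array is not stated in this PR, so the 4-byte layout here is purely illustrative.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Encodes a float in [-1, 1] as 8-bit SNORM. Values outside the range are
// clamped; the result lies in [-127, 127].
inline std::int8_t EncodeSnorm8(float v) {
    const float clamped = std::max(-1.0f, std::min(1.0f, v));
    return static_cast<std::int8_t>(std::lround(clamped * 127.0f));
}

// Decodes 8-bit SNORM back to a float. Both -128 and -127 map to -1.0.
inline float DecodeSnorm8(std::int8_t q) {
    return std::max(static_cast<float>(q) / 127.0f, -1.0f);
}

// Hypothetical packed layout: three direction components plus one padding
// byte, 4 bytes in total (versus 16 bytes for four 32-bit floats).
struct PackedSample {
    std::int8_t x, y, z, w;
};

inline PackedSample PackSample(float x, float y, float z) {
    return PackedSample{EncodeSnorm8(x), EncodeSnorm8(y), EncodeSnorm8(z), 0};
}
```

The trade-off is precision: each component is stored in steps of 1/127, so the round-trip error is at most about 0.004 per component. Whether that is acceptable for AO sample directions would need to be checked visually.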

…ue mask, write mask is 7 and store value mask is 0.

ghost commented Dec 19, 2020

CLA assistant check
All CLA requirements met.

walbourn added the samples label (Issues related to Samples) on Feb 4, 2022
3 participants