[WIP] Resource management #740

Draft
wants to merge 73 commits into
base: vulkan

Conversation

w23
Owner

@w23 w23 commented May 7, 2024

  • extract resource management from vk_rtx.c to vk_resources.c
  • correct resource registration by their producers
  • resource state tracking for barrier/sync purposes
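A minimal sketch of what per-resource state tracking for barriers could look like. Every name here (`sync_state_t`, `barrier_t`, `syncUse()`, the `STAGE_*` bits) is hypothetical and not taken from this PR; real code would store `VkPipelineStageFlags` and `VkAccessFlags`. The idea is that each resource remembers its last write and the readers seen since, so that a barrier is requested only on read-after-write, write-after-read and write-after-write.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stage bits, standing in for VkPipelineStageFlags. */
enum { STAGE_TRANSFER = 1u << 0, STAGE_COMPUTE = 1u << 1, STAGE_VERTEX = 1u << 2 };

typedef struct {
    uint32_t write_stages; /* stages of the last write, 0 if never written */
    uint32_t read_stages;  /* stages that have already read since that write */
} sync_state_t;

typedef struct {
    uint32_t src_stages;
    uint32_t dst_stages;
    bool src_wrote; /* true if src writes must be made visible to dst */
} barrier_t;

/* Registers a new use of the resource. Returns true and fills *out when
 * a barrier has to be recorded before that use. */
bool syncUse(sync_state_t *s, bool write, uint32_t stages, barrier_t *out) {
    assert(s && out && stages != 0);
    bool need = false;
    if (write) {
        /* Write-after-read and write-after-write: wait for all prior users. */
        const uint32_t src = s->write_stages | s->read_stages;
        if (src != 0) {
            *out = (barrier_t){ src, stages, s->write_stages != 0 };
            need = true;
        }
        s->write_stages = stages;
        s->read_stages = 0;
    } else {
        /* Read-after-write: only stages that have not seen the write yet. */
        const uint32_t unsynced = stages & ~s->read_stages;
        if (s->write_stages != 0 && unsynced != 0) {
            *out = (barrier_t){ s->write_stages, unsynced, true };
            need = true;
        }
        s->read_stages |= stages;
    }
    return need;
}
```

Repeated reads from an already synchronized stage produce no barrier, which is where most of the savings over manual barriers would come from.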

w23 added 30 commits May 7, 2024 11:06
- rename ray_resources to vk_resources
- add agenda and notes
functionally it's the same, the move is mostly mechanical
Adds r_vk_barrier_t type for collecting data needed for pipeline barriers.
Makes all resources, passes and dependencies visible. Can be a precursor
to bytecode generator, where render graph is serialized into passes and
barriers in python code, and native engine ends up only interpreting
this bytecode.
Do not build BLASes on model creation. Collect all BLASes to be built,
and then build them just before the TLAS is built.

Known issues:
- crashes (timeout + device lost) on RADV (AMD) on dynamic model update.
Also add dynamic array (not used in this commit)
This is untested PoC quality. Staging regions are not tracked properly
yet. Image upload commit is also done at a weird place.
fixes building on windows
handles canceling holes

still, corrupts some textures for some reason
Prior to this change, the `R_VkImageClear()` function was causing a SYNC-HAZARD-WRITE-AFTER-READ error: validation thought that clearing the `[dest]` image was not synchronized with the blit from the previous frame. However, there's an explicit semaphore sync with the previous frame, so this validation complaint seems baseless.

I'd make a simple repro and submit it to validation repo, but who am i kidding, i have like 10 minutes left to do anything today, and i likely won't be able to get back to this in several days.
also add some notes about clangdb and staging problems
Apparently now it is possible to handle emissive brush surfaces at the same time as generating geometry. No second pass for emissive extraction is needed.

This allows skipping extra `R_VkStagingFlushSync()`.

Not all flush-sync usages are removed, though.
It's an incomplete intermediary change. This commit doesn't work.

It compiles tho.

Changes:
- Move buffer staging tracking to vk_buffer
- Sketch automatic buffer barriers tied to vk_combuf
- Remove all combuf handling from staging. That was just gross.

Breaks:
- Everything.
- RT AS building is commented out for now
it renders quite a few traditional frames
but then fails with cross-cmdbuf sync validation errors
- print out vkQueueSubmit with its semaphores
- print out buffer barriers
- print out when buffer copy submission happens
- print various ref_vk initialization stages
Validation was complaining about odd SYNC-HAZARD-WRITE-AFTER-READ lack of buffer barrier at the very beginning of a frame, while I thought that command buffers are properly serialized by semaphores.
Turns out, `VkSubmitInfo::pWaitDstStageMask` should accompany each wait semaphore with its corresponding stage.
Properly setting TOP_OF_PIPE for the wait semaphore of a previous submission fixes the complaint.
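A sketch of keeping wait semaphores and their stages in lockstep, so that one cannot be added without the other. The types and names (`wait_list_t`, `waitListAdd()`, `semaphore_t`, `stage_flags_t`) are hypothetical stand-ins so that the snippet builds without the Vulkan SDK; in real code the two arrays would be passed as `VkSubmitInfo::pWaitSemaphores` and `VkSubmitInfo::pWaitDstStageMask`, with `count` as `waitSemaphoreCount`.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for VkSemaphore and VkPipelineStageFlags (assumption: the
 * real code includes <vulkan/vulkan.h> instead). */
typedef uint64_t semaphore_t;
typedef uint32_t stage_flags_t;

/* Same numeric value as VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT. */
#define STAGE_TOP_OF_PIPE 0x00000001u

#define MAX_WAITS 4 /* hypothetical limit */

typedef struct {
    uint32_t count;                  /* -> waitSemaphoreCount */
    semaphore_t semaphores[MAX_WAITS]; /* -> pWaitSemaphores */
    stage_flags_t stages[MAX_WAITS];   /* -> pWaitDstStageMask */
} wait_list_t;

/* Adds a semaphore together with the stage that should wait on it.
 * A zero stage mask is treated as a mistake in this sketch. */
void waitListAdd(wait_list_t *w, semaphore_t sem, stage_flags_t stage) {
    assert(w->count < MAX_WAITS);
    assert(stage != 0);
    w->semaphores[w->count] = sem;
    w->stages[w->count] = stage;
    ++w->count;
}
```

For the previous-submission semaphore described above, the call would be `waitListAdd(&waits, prev_sem, STAGE_TOP_OF_PIPE)`, where `prev_sem` is whatever handle that submission signals.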
This is likely not the right way to do this. Address this when focusing on correct gamma overall for traditional renderer. Currently this is here just to make it have some non-zero values early.
Previous 20.04 doesn't have the latest Vulkan SDK.
Also print deps script commands verbosely for easier CI debugging.
w23 added 30 commits December 11, 2024 23:42
Allows removing a small pile of manual barriers, yay.
It compiles, but it's broken and doesn't pass validation yet. Resource part doesn't collect barriers correctly somehow, needs debugging.
Now it works!
Needs a bit of a cleanup, though.
This adds explicit staging user tracking, which allows:
- tracking whether there are any unclaimed items, and pushing them (or ignoring, if the user decides so, for transient stuff)
- having more granular stats for staging, i.e. which buffer/subsystem used staging in this frame, and how much (not implemented yet)

This commit also changes staging from using flip buffer to just ring buffer allocator.
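For illustration, a ring buffer allocator of this kind could look roughly like the sketch below. It is not the code from this PR: `ring_t`, `ringAlloc()`, `ringMark()` and `ringFreeTo()` are hypothetical names, and the choice of monotonic 64-bit counters (to avoid the full/empty ambiguity) is an assumption. Frame boundaries are tracked by the caller: it takes a mark at the end of a frame and frees up to that mark once the GPU is done with the frame.

```c
#include <assert.h>
#include <stdint.h>

#define RING_ALLOC_FAILED UINT32_MAX

typedef struct {
    uint32_t size; /* capacity in bytes */
    uint64_t head; /* total bytes ever handed out, padding included */
    uint64_t tail; /* total bytes ever released */
} ring_t;

static uint64_t alignUp(uint64_t value, uint64_t align) {
    return (value + align - 1) / align * align;
}

/* Returns an offset into the buffer, or RING_ALLOC_FAILED if the
 * region would overlap data that is still in flight. */
uint32_t ringAlloc(ring_t *r, uint32_t size, uint32_t align) {
    assert(r->size != 0 && size != 0 && align != 0);
    const uint64_t offset = r->head % r->size;
    uint64_t start = alignUp(offset, align);
    uint64_t pad = start - offset;
    if (start + size > r->size) {
        /* Does not fit before the end: waste the remainder and wrap. */
        pad = r->size - offset;
        start = 0;
    }
    const uint64_t in_flight = r->head - r->tail;
    if (size > r->size || in_flight + pad + size > r->size)
        return RING_ALLOC_FAILED;
    r->head += pad + size;
    return (uint32_t)start;
}

/* Remembers the current position, e.g. at the end of a frame. */
uint64_t ringMark(const ring_t *r) {
    return r->head;
}

/* Releases everything allocated before the given mark. */
void ringFreeTo(ring_t *r, uint64_t mark) {
    assert(mark >= r->tail && mark <= r->head);
    r->tail = mark;
}
```

Compared to a flip buffer, nothing is reserved per frame up front: a frame that uploads little leaves the rest of the ring available to the next one.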
`staging.<USER>.size` and `staging.<USER>.allocs` r_speeds metrics are now available for every staging user.
Move draw_instance into ray_accel module. Then, when building TLAS, go through all instances, and check whether their BLASes need to be (re)built. Enqueue those that need to be rebuilt before building TLAS.

Fixes crashing when doing changelevel w/o rt, and then enabling rt.
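The deferred build described above can be sketched as follows. All types and names (`blas_t`, `instance_t`, `buildPendingBlases()`, `MAX_PENDING_BLASES`) are hypothetical, the actual GPU build is replaced by a counter, and skipping a BLAS shared by several instances is an assumption of this sketch rather than something stated in the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING_BLASES 64 /* hypothetical limit */

typedef struct {
    bool needs_build; /* set on creation and on dynamic update */
    int build_count;  /* stand-in for the real GPU build */
} blas_t;

typedef struct {
    blas_t *blas;
} instance_t;

/* Walks all instances, builds every BLAS that is flagged for (re)build
 * exactly once, and returns how many were built. The TLAS build itself
 * is omitted and would follow right after. */
size_t buildPendingBlases(const instance_t *instances, size_t count) {
    blas_t *queue[MAX_PENDING_BLASES];
    size_t pending = 0;

    for (size_t i = 0; i < count; ++i) {
        blas_t *const blas = instances[i].blas;
        if (!blas || !blas->needs_build)
            continue;
        /* The same BLAS may be referenced by several instances. */
        bool queued = false;
        for (size_t j = 0; j < pending && !queued; ++j)
            queued = (queue[j] == blas);
        if (queued)
            continue;
        assert(pending < MAX_PENDING_BLASES);
        queue[pending++] = blas;
    }

    for (size_t i = 0; i < pending; ++i) {
        queue[i]->build_count++; /* real code would record the build here */
        queue[i]->needs_build = false;
    }
    return pending;
}
```

Because the check happens at TLAS build time rather than at model creation, models loaded while rt was disabled simply stay flagged until rt is enabled.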
…rriers

Group by access/stage, not by src/dst. Makes logs a bit more readable.
Previously we forced src image layout to be UNDEFINED if the image was
to be written into. This led the RADV driver to completely clear our so
painfully constructed ray traced frame.

The correct layout transition should probably be something like this: if
we're not to _read_ from image contents, only then we can be sure that
its contents are not needed anymore, and can be discarded by setting the
src layout to UNDEFINED.
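The rule can be written down as a tiny helper. This is only a sketch: `layout_t`, the `ACCESS_*` bits and `pickOldLayout()` are hypothetical stand-ins (real code would use `VkImageLayout` and `VkAccessFlags`), and whether a write-only pass overwrites the whole image is not considered here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for VkImageLayout. */
typedef enum {
    LAYOUT_UNDEFINED = 0,
    LAYOUT_GENERAL,
    LAYOUT_TRANSFER_DST
} layout_t;

/* Hypothetical access bits for the upcoming use of the image. */
enum { ACCESS_READ = 1u << 0, ACCESS_WRITE = 1u << 1 };

/* Picks the old (src) layout for a transition. Contents are discarded,
 * by reporting UNDEFINED, only when the next use does not read them. */
layout_t pickOldLayout(layout_t current, uint32_t next_access) {
    assert(next_access != 0);
    return (next_access & ACCESS_READ) ? current : LAYOUT_UNDEFINED;
}
```

The old behaviour corresponds to testing `ACCESS_WRITE` instead, which also discards images that are both read and written.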
This makes c0a0d toxic pool emissive again, but it still doesn't make all known toxic water objects emissive.
This does fix a bunch of emissive water surfaces missing, but not all. It also entangles the code even more.

Not sure if it's worth it, maybe a better approach is possible.
This was a bug in validation layers, it's been fixed back in 2021.
Swapchain framebuffer image being in VK_IMAGE_LAYOUT_PRESENT_SRC_KHR layout has zero access flags, and is probably synced with bottom-of-pipe stage.

At least this does please validation layers.
Implements new totally automatic barrier placement. Also, staging is refactored.

- [x] image staging
  - [x] some images are corrupted
  - ~~[ ] #745~~ -- postponed until next time we'd need to touch images; current code works well enough for now.
  - [x] use combuf auto barriers everywhere where it makes sense
- [x] corrupted geometry in playdemo ...
- [x] buffer staging
  - [x] #743 
  - [x] track copied staging regions: i.e. staging must know that it has been drained fully
- [x] RT-trad dynamic toggle
  - [x] push-pull staging boundary
- [x] frame dependency tracking: automatically free/flip buffers when frame using them is done
- [x] replace ALL barriers with combuf ones
  - [x] buffers in rtx/resources
  - [x] images
    - [x] track images sync state inline where possible
  - [x] find other uses
- [x] improve staging
  - [x] track staging users explicitly
    - [x] per-user stats: sizes, allocations, etc
    - [x] push remaining data for stale users
  - [x] use ring buffer directly, track frame boundaries externally in fctl
- [x] crash in `buildBlases()`:
  1. load map with rt disabled
  2. change to another map
  3. enable rt
  4. 💥
- [x] suboptimal barrier, see comment #742 (comment)
- [x] simplify creating and building TLAS
- [x] Run rendering tests
  - [x] missing emissive toxic waters
    - Leave as a known problem: it's due to inadvertently skipping some water surfaces when looking for emissive ones, see:
      - #56
      - #752
  - [x] slightly different indirect blur
    - Assuming that this is due to À-Trous filtering, which could've sneaked through before the gold images were set. Not going to investigate, as we're about to submit a big change to the denoiser.
Brush model water surfaces might be emissive for e.g. toxicgrn textures.
For now we're making original, not tessellated, surfaces emissive for
performance reasons. However, tessellated surfaces are not coplanar with
emissive ones, and therefore cast weird shadows.

Add material flag to make sure that given model doesn't cast shadows
(might be useful not only for water, but e.g. windows and other things).

Add geometry bit for opaque models casting shadows. I.e. not all opaque
models cast shadows now.
Fix a bunch of issues with emissive (toxicgrn) water:
- extract emissive surfaces
- exclude water from shadow
1 participant