-
Notifications
You must be signed in to change notification settings - Fork 131
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
n64: improve and extend cache coherency checks #1314
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
rasky
force-pushed
the
cache_coherency
branch
6 times, most recently
from
December 3, 2023 21:06
e48b9eb
to
19dfb4a
Compare
To make sure to intercept all possible errors, the check are now performed by the RDRAM module, whenever a RDRAM read/write behind a cacheline happens. CPU writes to cache is now tracking dirtyness at the byte level rather than whole cacheline level, so that hardware accessing memory does not trigger a false positive for false-shared variables. An initial round of testing has shown that the check would trigger far too much otherwise. Error reporting is also much improved to provide more context to analyze the issue, including tracking the PC at which the hardware DMA was triggered.
It is quite normal for RSP ucode to fetch extra data to IMEM/DMEM. For instance, games tend to load 4 KiB of ucode even if the actual ucode is smaller; or they can fetch a command buffer in fixed size chunks of 256 bytes, and then just ignore data past the actual end. To avoid tons of false positives, we track the actual DMEM cells that contain tainted data, that is, data read from RDRAM in a non coherent state. If and only if those DMEM cells are accessed, we issue the warning.
rasky
force-pushed
the
cache_coherency
branch
from
December 4, 2023 12:08
d520e02
to
a600f6b
Compare
invertego
pushed a commit
to invertego/ares
that referenced
this pull request
Dec 13, 2023
To make sure to intercept all possible errors, the check are now performed by the RDRAM module, whenever a RDRAM read/write behind a cacheline happens. CPU writes to cache is now tracking dirtyness at the byte level rather than whole cacheline level, so that hardware accessing memory does not trigger a false positive for false-shared variables. An initial round of testing has shown that the check would trigger far too much otherwise. Error reporting is also much improved to provide more context to analyze the issue, including tracking the PC at which the hardware DMA was triggered. For RSP DMA, we do even more: instead of reporting an issue when area of a cached memory is DMA into RSP DMEM/IMEM, we actually just mark those memory locations as tainted, and emit the warning only if RSP later reads them. This is necessary because it is extremely common for RSP ucode to read data beyond the actual buffers it cares about, even though that data is then never accesses. Some of the warnings issues by Ares have been analyzed on Super Mario 64 and Zelda OOT and in both cases they have been confirmed to as real bugs in cache management made by the game. For instance, in Mario 64, the programmers forgot to invalidate the cache before loading data for the initial Mario head animation, as reported by Ares now: ``` [unusual] PI DMA writing to RDRAM address 0x390650 which is cached (missing cache invalidation?) Cacheline was loaded at CPU PC: 0xffffffff80183b38 PI DMA started at CPU PC: 0xffffffff80328558 ``` The game just happens to work because after loading, the game does something else and manages to get the cache invalidated by touching other locations, but otherwise it would be a real bug. As another example, in Zelda OOT, we get these warnings at boot: ``` [unusual] AI reading from RDRAM address 190820 which is modified in the cache (missing cache writeback?) Cacheline was loaded at CPU PC: ffffffff800b54fc Cacheline was last written at CPU PC: ffffffff800b5528 [unusual] RSP reading from DMEM address fe0 which contains a value which is not cache coherent Current RSP PC: d10 The value read was previously written by RSP DMA from RDRAM address 00199e40 RSP DMA started at RSP PC: abc The relative CPU cacheline was dirty (missing cache writeback?) Cacheline was loaded at CPU PC: ffffffff800b5244 Cacheline was last written at CPU PC: ffffffff800b5264 ``` These warnings appear to be real bugs in the audio library. Quoting Thar0: ``` * AudioHeap_ClearCurrentAiBuffer/AudioHeap_ResetStep wipe the AI buffers with CPU writes and then don't write back the cache, and I guess an AI DMA is either in progress or starts before it can be properly flushed. On boot this shouldn't matter as gAudioHeap is BSS and is already zero, but if the driver is reset later it might cause some minor issues? * The RSP cache coherency problems is from loading filters in AudioHeap_LoadLowPassFilter. The heap allocator returns a pointer to which the CPU writes the filter data.. but the allocator writes back the cache before the CPU writes the data 😂 ``` Moreover, this PR also implements all the known SysAD-related CPU freezes that cause the console to crash. We now pass the [n64-systemcrash](https://github.com/rasky/n64-systemcrash) testsuite in full.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
To make sure to intercept all possible errors, the check are now performed by the RDRAM module, whenever a RDRAM read/write behind a cacheline happens.
CPU writes to cache is now tracking dirtyness at the byte level rather than whole cacheline level, so that hardware accessing memory does not trigger a false positive for false-shared variables. An initial round of testing has shown that the check would trigger far too much otherwise. Error reporting is also much improved to provide more context to analyze the issue, including tracking the PC at which the hardware DMA was triggered.
For RSP DMA, we do even more: instead of reporting an issue when an area of a cached memory is DMA’d into RSP DMEM/IMEM, we actually just mark those memory locations as tainted, and emit the warning only if RSP later reads them. This is necessary because it is extremely common for RSP ucode to read data beyond the actual buffers it cares about, even though that data is then never accessed.
Some of the warnings issued by Ares have been analyzed on Super Mario 64 and Zelda OOT and in both cases they have been confirmed to be real bugs in cache management made by the game.
For instance, in Mario 64, the programmers forgot to invalidate the cache before loading data for the initial Mario head animation, as reported by Ares now:
The game just happens to work because after loading, the game does something else and manages to get the cache invalidated by touching other locations, but otherwise it would be a real bug.
As another example, in Zelda OOT, we get these warnings at boot:
These warnings appear to be real bugs in the audio library. Quoting Thar0:
Moreover, this PR also implements all the known SysAD-related CPU freezes that cause the console to crash. We now pass the n64-systemcrash testsuite in full.