n64: improve and extend cache coherency checks #1314

rasky · 2023-12-01T11:02:46Z

To make sure we intercept all possible errors, the checks are now performed by the RDRAM module, whenever an RDRAM read/write behind a cacheline happens.

CPU writes to the cache now track dirtiness at the byte level rather than at the whole-cacheline level, so that hardware accessing memory does not trigger a false positive for false-shared variables (unrelated data that happens to live in the same cacheline). An initial round of testing showed that the check would otherwise trigger far too often. Error reporting is also much improved and provides more context to analyze the issue, including the PC at which the hardware DMA was triggered.
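As a rough illustration of byte-level dirtiness tracking, here is a minimal C++ sketch. It is not the code from this PR: `CacheLine`, `dirtyMask`, `cpuWrite` and `hardwareReadIsStale` are hypothetical names, and the only hardware fact assumed is the 16-byte line size of the VR4300 data cache.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of one 16-byte data cacheline (names are not from ares).
struct CacheLine {
  static constexpr uint32_t kSize = 16;
  uint32_t base = 0;       // RDRAM address of the first byte of the line
  bool valid = false;
  uint16_t dirtyMask = 0;  // bit i set: byte (base + i) written, not yet written back

  // Record a CPU store of `len` bytes at `addr` (must fit inside the line).
  void cpuWrite(uint32_t addr, uint32_t len) {
    assert(valid && addr >= base && addr + len <= base + kSize);
    for (uint32_t i = 0; i < len; i++) {
      dirtyMask |= static_cast<uint16_t>(1u << (addr - base + i));
    }
  }

  // True if a hardware read of [addr, addr + len) touches a dirty byte,
  // meaning the device would see stale RDRAM contents.
  bool hardwareReadIsStale(uint32_t addr, uint32_t len) const {
    if (!valid) return false;
    for (uint32_t i = 0; i < len; i++) {
      uint32_t a = addr + i;
      if (a < base || a >= base + kSize) continue;  // outside this line
      if ((dirtyMask >> (a - base)) & 1u) return true;
    }
    return false;
  }
};
```

In this model, a CPU write to bytes 0-3 of a line followed by a device read of bytes 8-15 of the same line is not flagged, whereas a single per-line dirty bit would have reported it.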

For RSP DMA, we go even further: instead of reporting an issue when an area of cached memory is DMA’d into RSP DMEM/IMEM, we just mark those memory locations as tainted, and emit the warning only if the RSP later reads them. This is necessary because it is extremely common for RSP ucode to read data beyond the actual buffers it cares about, even though that data is then never accessed.
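The deferred warning can be sketched as a per-byte taint map over DMEM. This is only an illustration under simplifying assumptions, not the implementation in this PR: `DmemTaint` and its methods are hypothetical names, and the coherency verdict is passed once per DMA chunk rather than computed per source byte.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>

// Hypothetical taint tracker for the 4 KiB RSP DMEM (names are not from ares).
struct DmemTaint {
  static constexpr uint32_t kSize = 4096;
  std::bitset<kSize> tainted;  // one flag per DMEM byte
  int warnings = 0;

  // A DMA chunk landed in DMEM. Simplification: the caller passes one
  // coherency verdict per chunk instead of one per source byte.
  void dmaWrite(uint32_t addr, uint32_t len, bool sourceWasCoherent) {
    assert(len <= kSize);
    for (uint32_t i = 0; i < len; i++) tainted[(addr + i) % kSize] = !sourceWasCoherent;
  }

  // An RSP store replaces the data, so the bytes are no longer tainted.
  void rspWrite(uint32_t addr, uint32_t len) {
    for (uint32_t i = 0; i < len; i++) tainted[(addr + i) % kSize] = false;
  }

  // An RSP load: the only place where a warning is counted.
  bool rspRead(uint32_t addr, uint32_t len) {
    for (uint32_t i = 0; i < len; i++) {
      if (tainted[(addr + i) % kSize]) { warnings++; return true; }
    }
    return false;
  }
};
```

The point of the design is that `dmaWrite` never warns by itself: a 256-byte chunk fetched from non-coherent memory stays silent unless the ucode actually loads one of those bytes.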

Some of the warnings issued by Ares have been analyzed on Super Mario 64 and Zelda OOT, and in both cases they were confirmed to be real cache management bugs in the game.

For instance, in Mario 64, the programmers forgot to invalidate the cache before loading data for the initial Mario head animation, as reported by Ares now:

[unusual] PI DMA writing to RDRAM address 0x390650 which is cached (missing cache invalidation?)
	Cacheline was loaded at CPU PC: 0xffffffff80183b38
	PI DMA started at CPU PC: 0xffffffff80328558

The game just happens to work because after loading, the game does something else and manages to get the cache invalidated by touching other locations, but otherwise it would be a real bug.
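To make the failure mode concrete, here is a toy C++ model of a single cacheline in front of a small memory. It is a sketch only: `ReadCacheModel` and its methods are hypothetical, and a real direct-mapped cache has many lines, but the effect is the same. A DMA write leaves the cached copy stale until the line is either invalidated explicitly or evicted by an unrelated access.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical one-line cache in front of a 64-byte "RDRAM" (illustration only).
struct ReadCacheModel {
  std::array<uint8_t, 64> rdram{};
  std::array<uint8_t, 16> line{};
  uint32_t lineBase = 0;
  bool valid = false;

  // CPU load: fill the line on a miss, then serve the byte from the cache.
  uint8_t cpuRead(uint32_t addr) {
    assert(addr < rdram.size());
    uint32_t base = addr & ~15u;
    if (!valid || lineBase != base) {
      std::memcpy(line.data(), &rdram[base], 16);
      lineBase = base;
      valid = true;
    }
    return line[addr - base];
  }

  // Device DMA writes RDRAM directly and never sees the cache.
  void dmaWrite(uint32_t addr, uint8_t value) {
    assert(addr < rdram.size());
    rdram[addr] = value;
  }

  // Drop the cached copy so the next CPU load refetches from RDRAM.
  void invalidate(uint32_t addr) {
    if (valid && lineBase == (addr & ~15u)) valid = false;
  }
};
```

In this model, reading another address that replaces the line has the same effect as `invalidate`, which mirrors how the game gets away with the missing invalidation.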

As another example, in Zelda OOT, we get these warnings at boot:

[unusual] AI reading from RDRAM address 190820 which is modified in the cache (missing cache writeback?)
    Cacheline was loaded at CPU PC: ffffffff800b54fc
    Cacheline was last written at CPU PC: ffffffff800b5528
    
[unusual] RSP reading from DMEM address fe0 which contains a value which is not cache coherent
    Current RSP PC: d10
    The value read was previously written by RSP DMA from RDRAM address 00199e40
    RSP DMA started at RSP PC: abc
    The relative CPU cacheline was dirty (missing cache writeback?)
    Cacheline was loaded at CPU PC: ffffffff800b5244
    Cacheline was last written at CPU PC: ffffffff800b5264

These warnings appear to be real bugs in the audio library. Quoting Thar0:

* AudioHeap_ClearCurrentAiBuffer/AudioHeap_ResetStep wipe the AI buffers with CPU writes and 
  then don't write back the cache, and I guess an AI DMA is either in progress or starts before it can
  be properly flushed. On boot this shouldn't matter as gAudioHeap is BSS and is already zero, but
  if the driver is reset later it might cause some minor issues?
* The RSP cache coherency problems is from loading filters in AudioHeap_LoadLowPassFilter.
  The heap allocator returns a pointer to which the CPU writes the filter data.. but the allocator
  writes back the cache before the CPU writes the data 😂
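The ordering problem described in the second bullet can be reproduced with a toy write-back model. This is a hedged sketch, not code from the game or from ares: `WriteBackModel`, `flushThenFill` and `fillThenFlush` are hypothetical names, and the model has a single cacheline that is written back as a whole.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical single-line write-back cache (illustration only).
struct WriteBackModel {
  std::array<uint8_t, 16> rdram{};  // what AI or RSP DMA reads
  std::array<uint8_t, 16> line{};   // the CPU's cached copy of the same bytes
  bool dirty = false;

  // CPU store: lands in the cache only.
  void cpuWrite(uint32_t off, uint8_t value) {
    assert(off < line.size());
    line[off] = value;
    dirty = true;
  }

  // Cache writeback: copy the line to RDRAM if it holds unwritten data.
  void writeback() {
    if (dirty) rdram = line;
    dirty = false;
  }

  // Device read: bypasses the cache entirely.
  uint8_t deviceRead(uint32_t off) const { return rdram[off]; }
};

// Order from the quote: write back first, fill the buffer afterwards.
inline uint8_t flushThenFill(WriteBackModel& m, uint8_t value) {
  m.writeback();
  m.cpuWrite(0, value);
  return m.deviceRead(0);  // device still sees the old RDRAM contents
}

// Coherent order: fill the buffer, then write it back before the device reads.
inline uint8_t fillThenFlush(WriteBackModel& m, uint8_t value) {
  m.cpuWrite(0, value);
  m.writeback();
  return m.deviceRead(0);
}
```

With the first order the line is still dirty when the device reads, which is the condition the new warnings report as "missing cache writeback?".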

Moreover, this PR also implements all the known SysAD-related CPU freezes that cause the console to crash. We now pass the n64-systemcrash testsuite (https://github.com/rasky/n64-systemcrash) in full.

It is quite normal for RSP ucode to fetch extra data into IMEM/DMEM. For instance, games tend to load 4 KiB of ucode even if the actual ucode is smaller, or they fetch a command buffer in fixed-size chunks of 256 bytes and then just ignore data past the actual end. To avoid tons of false positives, we track the actual DMEM cells that contain tainted data, that is, data read from RDRAM in a non-coherent state. If and only if those DMEM cells are accessed, we issue the warning.


rasky marked this pull request as draft December 1, 2023 16:13

rasky force-pushed the cache_coherency branch 6 times, most recently from e48b9eb to 19dfb4a Compare December 3, 2023 21:06

rasky added 7 commits December 4, 2023 13:08

n64: simulate CPU freeze when accessing physical addresses >= 0x80000000

06843fb

n64: implement all known sysad freezes

46538d5

n64: avoid stalling the emulator when a sysad freeze happens

7812299

n64: add support to RDP DMA (and XBUS) to cache coherency checks

53348a8

n64: add support for RSP unaligned reads/writes to cache coherency checks

a600f6b

rasky force-pushed the cache_coherency branch from d520e02 to a600f6b Compare December 4, 2023 12:08

rasky marked this pull request as ready for review December 4, 2023 12:09

LukeUsher merged commit c95dc1b into ares-emulator:master Dec 7, 2023
9 checks passed

Dragorn421 mentioned this pull request Jan 21, 2024

Homebrew mode: split features into tracers #1371

Open
