Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

n64: improve and extend cache coherency checks #1314

Merged
merged 7 commits into from
Dec 7, 2023

Conversation

rasky
Copy link
Collaborator

@rasky rasky commented Dec 1, 2023

To make sure to intercept all possible errors, the check are now performed by the RDRAM module, whenever a RDRAM read/write behind a cacheline happens.

CPU writes to cache is now tracking dirtyness at the byte level rather than whole cacheline level, so that hardware accessing memory does not trigger a false positive for false-shared variables. An initial round of testing has shown that the check would trigger far too much otherwise. Error reporting is also much improved to provide more context to analyze the issue, including tracking the PC at which the hardware DMA was triggered.

For RSP DMA, we do even more: instead of reporting an issue when an area of a cached memory is DMA’d into RSP DMEM/IMEM, we actually just mark those memory locations as tainted, and emit the warning only if RSP later reads them. This is necessary because it is extremely common for RSP ucode to read data beyond the actual buffers it cares about, even though that data is then never accessed.

Some of the warnings issued by Ares have been analyzed on Super Mario 64 and Zelda OOT and in both cases they have been confirmed to be real bugs in cache management made by the game.

For instance, in Mario 64, the programmers forgot to invalidate the cache before loading data for the initial Mario head animation, as reported by Ares now:

[unusual] PI DMA writing to RDRAM address 0x390650 which is cached (missing cache invalidation?)
	Cacheline was loaded at CPU PC: 0xffffffff80183b38
	PI DMA started at CPU PC: 0xffffffff80328558

The game just happens to work because after loading, the game does something else and manages to get the cache invalidated by touching other locations, but otherwise it would be a real bug.

As another example, in Zelda OOT, we get these warnings at boot:

[unusual] AI reading from RDRAM address 190820 which is modified in the cache (missing cache writeback?)
    Cacheline was loaded at CPU PC: ffffffff800b54fc
    Cacheline was last written at CPU PC: ffffffff800b5528
    
[unusual] RSP reading from DMEM address fe0 which contains a value which is not cache coherent
    Current RSP PC: d10
    The value read was previously written by RSP DMA from RDRAM address 00199e40
    RSP DMA started at RSP PC: abc
    The relative CPU cacheline was dirty (missing cache writeback?)
    Cacheline was loaded at CPU PC: ffffffff800b5244
    Cacheline was last written at CPU PC: ffffffff800b5264

These warnings appear to be real bugs in the audio library. Quoting Thar0:

* AudioHeap_ClearCurrentAiBuffer/AudioHeap_ResetStep wipe the AI buffers with CPU writes and 
  then don't write back the cache, and I guess an AI DMA is either in progress or starts before it can
  be properly flushed. On boot this shouldn't matter as gAudioHeap is BSS and is already zero, but
  if the driver is reset later it might cause some minor issues?
* The RSP cache coherency problems is from loading filters in AudioHeap_LoadLowPassFilter.
  The heap allocator returns a pointer to which the CPU writes the filter data.. but the allocator
  writes back the cache before the CPU writes the data 😂

Moreover, this PR also implements all the known SysAD-related CPU freezes that cause the console to crash. We now pass the n64-systemcrash testsuite in full.

@rasky rasky marked this pull request as draft December 1, 2023 16:13
@rasky rasky force-pushed the cache_coherency branch 6 times, most recently from e48b9eb to 19dfb4a Compare December 3, 2023 21:06
rasky added 7 commits December 4, 2023 13:08
To make sure to intercept all possible errors, the check are now
performed by the RDRAM module, whenever a RDRAM read/write behind
a cacheline happens.

CPU writes to cache is now tracking dirtyness at the byte level rather
than whole cacheline level, so that hardware accessing memory does not
trigger a false positive for false-shared variables. An initial
round of testing has shown that the check would trigger far too much
otherwise.

Error reporting is also much improved to provide more context to
analyze the issue, including tracking the PC at which the hardware DMA
was triggered.
It is quite normal for RSP ucode to fetch extra data to IMEM/DMEM.
For instance, games tend to load 4 KiB of ucode even if the actual
ucode is smaller; or they can fetch a command buffer in fixed size
chunks of 256 bytes, and then just ignore data past the actual end.

To avoid tons of false positives, we track the actual DMEM cells that
contain tainted data, that is, data read from RDRAM in a non coherent
state. If and only if those DMEM cells are accessed, we issue the
warning.
@rasky rasky marked this pull request as ready for review December 4, 2023 12:09
@LukeUsher LukeUsher merged commit c95dc1b into ares-emulator:master Dec 7, 2023
9 checks passed
invertego pushed a commit to invertego/ares that referenced this pull request Dec 13, 2023
To make sure to intercept all possible errors, the check are now
performed by the RDRAM module, whenever a RDRAM read/write behind a
cacheline happens.

CPU writes to cache is now tracking dirtyness at the byte level rather
than whole cacheline level, so that hardware accessing memory does not
trigger a false positive for false-shared variables. An initial round of
testing has shown that the check would trigger far too much otherwise.
Error reporting is also much improved to provide more context to analyze
the issue, including tracking the PC at which the hardware DMA was
triggered.

For RSP DMA, we do even more: instead of reporting an issue when area of
a cached memory is DMA into RSP DMEM/IMEM, we actually just mark those
memory locations as tainted, and emit the warning only if RSP later
reads them. This is necessary because it is extremely common for RSP
ucode to read data beyond the actual buffers it cares about, even though
that data is then never accesses.

Some of the warnings issues by Ares have been analyzed on Super Mario 64
and Zelda OOT and in both cases they have been confirmed to as real bugs
in cache management made by the game.

For instance, in Mario 64, the programmers forgot to invalidate the
cache before loading data for the initial Mario head animation, as
reported by Ares now:
```
[unusual] PI DMA writing to RDRAM address 0x390650 which is cached (missing cache invalidation?)
	Cacheline was loaded at CPU PC: 0xffffffff80183b38
	PI DMA started at CPU PC: 0xffffffff80328558
```

The game just happens to work because after loading, the game does
something else and manages to get the cache invalidated by touching
other locations, but otherwise it would be a real bug.

As another example, in Zelda OOT, we get these warnings at boot:

```
[unusual] AI reading from RDRAM address 190820 which is modified in the cache (missing cache writeback?)
    Cacheline was loaded at CPU PC: ffffffff800b54fc
    Cacheline was last written at CPU PC: ffffffff800b5528
    
[unusual] RSP reading from DMEM address fe0 which contains a value which is not cache coherent
    Current RSP PC: d10
    The value read was previously written by RSP DMA from RDRAM address 00199e40
    RSP DMA started at RSP PC: abc
    The relative CPU cacheline was dirty (missing cache writeback?)
    Cacheline was loaded at CPU PC: ffffffff800b5244
    Cacheline was last written at CPU PC: ffffffff800b5264
```

These warnings appear to be real bugs in the audio library. Quoting
Thar0:
```
* AudioHeap_ClearCurrentAiBuffer/AudioHeap_ResetStep wipe the AI buffers with CPU writes and 
  then don't write back the cache, and I guess an AI DMA is either in progress or starts before it can
  be properly flushed. On boot this shouldn't matter as gAudioHeap is BSS and is already zero, but
  if the driver is reset later it might cause some minor issues?
* The RSP cache coherency problems is from loading filters in AudioHeap_LoadLowPassFilter.
  The heap allocator returns a pointer to which the CPU writes the filter data.. but the allocator
  writes back the cache before the CPU writes the data 😂
```

Moreover, this PR also implements all the known SysAD-related CPU
freezes that cause the console to crash. We now pass the
[n64-systemcrash](https://github.com/rasky/n64-systemcrash) testsuite in
full.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants