n64: Fix 32-bit signed integer multiplication and division when the input operands are not sign-extended 32-bit values. (#1701)

Verified by the test ROM at https://github.com/Thar0/N64-IntegerMulDiv-Test; see the readme there for full details.

Note that the behavior of the div instruction is still not well understood when bits 31 and 63 of the divisor in rt are not equal. Such inputs will still give incorrect results for now. The test ROM classifies tests with such inputs as "failed emulations" since their behavior is not understood well enough, while "failures" consists of tests with inputs whose behavior is known and still failed.

Ares result before this patch:
![image](https://github.com/user-attachments/assets/f6e68e2c-b4be-4b38-8a5a-e7c2950817b1)

Hardware result (apologies for the poor video quality 😬):
![image](https://github.com/user-attachments/assets/51b5c5be-75cb-48a3-9828-9ed638722966)

Ares result after this patch:
![image](https://github.com/user-attachments/assets/6f19524b-eaa4-47c5-9e6a-1dd77ca400e4)
ares-emulator · Nov 18, 2024 · 8cc98f9
1 parent 7e5d749
Showing 1 changed file with 9 additions and 5 deletions.
diff --git a/ares/n64/cpu/interpreter-ipu.cpp b/ares/n64/cpu/interpreter-ipu.cpp
@@ -287,10 +287,12 @@ auto CPU::DDIVU(cr64& rs, cr64& rt) -> void {
 
 auto CPU::DIV(cr64& rs, cr64& rt) -> void {
   if(!context.kernelMode() && context.bits == 32) return exception.reservedInstruction();
-  if(rt.s32) {
-    //cast to s64 to prevent exception on INT32_MIN / -1
-    LO.u64 = s32(s64(rs.s32) / s64(rt.s32));
-    HI.u64 = s32(s64(rs.s32) % s64(rt.s32));
+  if(rt.s64) {
+    //using s64 to match hardware behavior when input operands are not properly sign-extended
+    //note: this does not give correct results when sgn(rt.s32) != sgn(rt.s64); on hardware this
+    //results in a meaningless quotient, it's not clear how the result is reached in that case
+    LO.u64 = s32(s64(rs.s32) / rt.s64);
+    HI.u64 = s32(s64(rs.s32) % rt.s64);
   } else {
     LO.u64 = rs.s32 < 0 ? +1 : -1;
     HI.u64 = rs.s32;
@@ -780,7 +782,9 @@ auto CPU::MTLO(cr64& rs) -> void {
 }
 
 auto CPU::MULT(cr64& rs, cr64& rt) -> void {
-  u64 result = s64(rs.s32) * s64(rt.s32);
+  //using s64 to match hardware behavior when input operands are not properly sign-extended,
+  //the hardware behavior appears to be a signed 64-bit by 35-bit multiplication
+  u64 result = rs.s64 * (rt.s64 << 29 >> 29);
   LO.u64 = s32(result >>  0);
   HI.u64 = s32(result >> 32);
   step((5 - 1) * 2);
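
The patched `DIV` and `MULT` semantics can be tried outside the emulator with a minimal standalone sketch. It is an illustration under stated assumptions, not code from ares: the names `HiLo`, `signExtend32`, `signExtend35`, `divisorIsUnderstood`, `multModel`, and `divModel` are hypothetical, registers are passed as plain `uint64_t` values, and the multiplication is done in unsigned arithmetic (same low 64 bits as the signed product) to avoid signed-overflow undefined behavior. The kernel-mode check and cycle stepping are omitted.

```cpp
#include <cstdint>

// Hypothetical container for the HI/LO result registers (not an ares type).
struct HiLo {
  uint64_t hi;
  uint64_t lo;
};

// Sign-extend the low 32 bits of v to 64 bits, like `LO.u64 = s32(...)` in the diff.
inline uint64_t signExtend32(uint64_t v) {
  return uint64_t(int64_t(int32_t(uint32_t(v))));
}

// Sign-extend the low 35 bits of v to 64 bits; mirrors `rt.s64 << 29 >> 29`
// without left-shifting a negative signed value.
inline int64_t signExtend35(uint64_t v) {
  const uint64_t sign = 1ull << 34;
  v &= (1ull << 35) - 1;
  return int64_t((v ^ sign) - sign);
}

// True when bits 31 and 63 of the divisor agree, i.e. the case the commit
// message describes as understood for DIV.
inline bool divisorIsUnderstood(uint64_t rt) {
  return ((rt >> 31) & 1) == ((rt >> 63) & 1);
}

// Model of the patched MULT: full 64-bit rs times rt truncated to 35 signed bits.
inline HiLo multModel(uint64_t rs, uint64_t rt) {
  // Assumption: unsigned wraparound stands in for the two's-complement product.
  const uint64_t product = rs * uint64_t(signExtend35(rt));
  return {signExtend32(product >> 32), signExtend32(product)};
}

// Model of the patched DIV: 32-bit dividend, full 64-bit divisor.
inline HiLo divModel(uint64_t rs, uint64_t rt) {
  const int64_t dividend = int32_t(uint32_t(rs));
  const int64_t divisor = int64_t(rt);
  if (divisor == 0) {
    // Division by zero: LO is +1 for a negative dividend, otherwise -1.
    return {signExtend32(rs), dividend < 0 ? uint64_t(1) : ~uint64_t(0)};
  }
  // The dividend fits in 32 bits, so INT64_MIN / -1 cannot occur here.
  return {signExtend32(uint64_t(dividend % divisor)),
          signExtend32(uint64_t(dividend / divisor))};
}
```

For example, `divModel(7, 0x0000000100000000)` takes the normal division path because the full 64-bit divisor is non-zero, whereas the pre-patch code looked only at the low 32 bits and took the divide-by-zero path. Likewise, `multModel(0x0000000100000000, 1)` now sees the upper bits of `rs` instead of treating the operand as zero.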