Why is `Rem` of `strength_reduced_u32` using 128-bit arithmetic? #10

benruijl · 2023-01-21T12:13:58Z

I would expect that any computation involving 32-bit modulos only requires 64-bit arithmetic, instead of the much slower (and often emulated) 128-bit arithmetic.

When I look at the `Rem` implementation, however, I see 128-bit multiplication:

                    let product = rhs.multiplier.wrapping_mul(self as u64) as u128;
                    let divisor = rhs.divisor as u128;

                    let shifted = (product * divisor) >> 64;
                    shifted as $primitive_type

The div_rem routine does not use u128.

Why is this? I'd expect quite a bit of loss of performance, which is the main reason why I want to use 32-bit modulos instead of 64-bit ones.

ejmahler · 2023-01-22T21:12:58Z

Because we immediately shift right by 64 bits and then convert back to u64, this is basically just a hint to the compiler to do a u64*u64 multiplication and keep only the high bits of the result. AFAIK the compiler ends up producing optimal code here, although I've never directly checked; I've only run performance tests, in which this function does well.
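For readers unfamiliar with the idiom, here is a minimal sketch of the "widen, multiply, keep the high half" pattern described above. The function names `mul_high_u64` and `fastmod_u32` are hypothetical and are not part of the crate; the multiplier formula is the usual direct-remainder construction and is recomputed on every call here only to keep the example self-contained (the snippet in the question reads a precomputed value from `rhs.multiplier`).

```rust
/// High 64 bits of the full 128-bit product of two u64 values.
/// The u128 casts only exist to express the widening multiply;
/// the low half of the product is discarded by the shift.
fn mul_high_u64(a: u64, b: u64) -> u64 {
    ((a as u128 * b as u128) >> 64) as u64
}

/// Hypothetical remainder routine built on the helper above.
/// Assumes `d` is nonzero (division by zero panics).
fn fastmod_u32(n: u32, d: u32) -> u32 {
    // floor((2^64 - 1) / d) + 1; wraps to 0 when d == 1, which still yields 0.
    let multiplier = (u64::MAX / d as u64).wrapping_add(1);
    // Low 64 bits of multiplier * n.
    let low_bits = multiplier.wrapping_mul(n as u64);
    // High 64 bits of low_bits * d is the remainder.
    mul_high_u64(low_bits, d as u64) as u32
}
```

Whether the compiler lowers `mul_high_u64` to a single widening multiply depends on the target and optimization level, so inspecting the generated assembly is the only way to be sure.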

I'm open to improving this. If you know of a better way to express this that results in better codegen and doesn't involve platform-specific intrinsics, I'd be happy to accept a PR.
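One portable way to express the same computation without `u128`, shown here only as a sketch: because the divisor fits in 32 bits, the 64x32-bit product can be assembled from two 64-bit partial products. The names `mul_high_64x32` and `rem_u32_no_u128` are hypothetical and not part of the crate, and I have not measured whether this generates better code than the `u128` form.

```rust
/// Computes (product * divisor) >> 64 using only 64-bit arithmetic.
/// Splitting `product` into 32-bit halves keeps every intermediate
/// value below 2^64, so nothing overflows.
fn mul_high_64x32(product: u64, divisor: u32) -> u64 {
    let d = divisor as u64;
    let hi = product >> 32;
    let lo = product & 0xFFFF_FFFF;
    // hi * d <= (2^32 - 1)^2 and (lo * d) >> 32 < 2^32, so the sum fits in u64.
    (hi * d + ((lo * d) >> 32)) >> 32
}

/// Hypothetical remainder routine; assumes `divisor` is nonzero.
/// The multiplier is recomputed on each call only for illustration.
fn rem_u32_no_u128(n: u32, divisor: u32) -> u32 {
    let multiplier = (u64::MAX / divisor as u64).wrapping_add(1);
    let product = multiplier.wrapping_mul(n as u64);
    mul_high_64x32(product, divisor) as u32
}
```

This trades one widening multiply for two 64-bit multiplies plus shifts, so it is only likely to help on targets where 128-bit multiplication is emulated; benchmarking on the target of interest would be needed before drawing conclusions.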
