Commit
Math: Optimize 16-bit elementwise matrix multiplication function
Optimize the 16-bit elementwise matrix multiplication function:

- Change the accumulator data type from int64_t to int32_t, which
  reduces the instruction cycle count by ~51.18%.
- Enhance pointer arithmetic within loops for better readability and
  compiler optimization opportunities.
- Eliminate unnecessary conditionals by handling Q0 data directly in
  the algorithm's core logic.
- Update the fractional bit shift and rounding logic for more accurate
  fixed-point calculations.

These optimizations also reduce memory usage for the elementwise matrix
multiplication by 1.08%.

Signed-off-by: Shriram Shastry <[email protected]>
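For illustration only, the technique described in this message can be sketched as below. This is not the code from the commit: the function name `q_mul_elementwise_16`, its parameters, and the saturation step are assumptions made for the example. It shows a 32-bit accumulator holding the product of two 16-bit values, a rounding constant that becomes zero for Q0 data so no separate branch is needed, and pointer increments inside the loop.

```c
#include <stdint.h>

/*
 * Hypothetical sketch, not the function changed by this commit.
 * Multiplies n pairs of 16-bit fixed-point values and shifts each product
 * right by `shift` fractional bits (0..15) with round-half-up.
 * Assumes the compiler implements >> on negative int32_t as an arithmetic
 * shift, which is implementation-defined in C but common in practice.
 */
static void q_mul_elementwise_16(const int16_t *a, const int16_t *b,
				 int16_t *c, int n, int shift)
{
	/* Half of one output LSB; evaluates to 0 when shift == 0 (Q0 data),
	 * so Q0 goes through the same loop without a special case.
	 */
	const int32_t round = ((int32_t)1 << shift) >> 1;
	int i;

	for (i = 0; i < n; i++) {
		/* |a * b| <= 2^30, so the product plus the rounding term
		 * fits in int32_t and no 64-bit accumulator is needed.
		 */
		int32_t p = (int32_t)(*a++) * (*b++);

		p = (p + round) >> shift;

		/* Saturate to the int16_t range (an assumption for this sketch) */
		if (p > INT16_MAX)
			p = INT16_MAX;
		else if (p < INT16_MIN)
			p = INT16_MIN;

		*c++ = (int16_t)p;
	}
}
```

With Q15 inputs and `shift` set to 15, multiplying 0.5 by 0.5 (16384 by 16384) yields 8192, which is 0.25 in Q15.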