New JMH benchmark method - vdot8s that implement int8 dotProduct in C…
… using Neon intrinsics
Ankur Goel committed Jul 18, 2024
1 parent 22ca695, commit 356d194
Showing 10 changed files with 193 additions and 8 deletions.
@@ -0,0 +1,33 @@
// dotProduct.c
#include <arm_neon.h>
#include <stdio.h>

// https://developer.arm.com/architectures/instruction-sets/intrinsics/
int vdot8s(char vec1[], char vec2[], int limit) {
    int result = 0;
    int32x4_t acc = vdupq_n_s32(0);
    int i = 0;

    for (; i + 16 <= limit; i += 16) {
        // Read 16 signed 8-bit values from each input into a 128-bit vector
        int8x16_t va8 = vld1q_s8((const int8_t *) (vec1 + i));
        int8x16_t vb8 = vld1q_s8((const int8_t *) (vec2 + i));
        acc = vdotq_s32(acc, va8, vb8);
    }
    // REDUCE: Add the four 32-bit lanes of acc and write the sum to a scalar
    result += vaddvq_s32(acc);

    // Scalar tail (cast because plain char may be unsigned on AArch64). TODO: Use FMA
    for (; i < limit; i++) {
        result += (int8_t) vec1[i] * (int8_t) vec2[i];
    }
    return result;
}

int dot8s(char vec1[], char vec2[], int limit) {
    int result = 0;
    for (int i = 0; i < limit; i++) {
        result += (int8_t) vec1[i] * (int8_t) vec2[i];
    }
    return result;
}
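The NEON loop in `vdot8s` can only be built and run on an Arm target with the dot-product extension, which makes it awkward to reason about elsewhere. As a rough aid, here is a portable sketch of the same blocking scheme in plain C. It is not part of the commit: the names `dot8s_ref` and `dot8s_blocked` are hypothetical, and the inner loops only imitate what `vdotq_s32` is documented to do (each of the four 32-bit lanes accumulates the products of four adjacent int8 pairs). It assumes signed 8-bit inputs passed as `int8_t`.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar reference: widen each int8 pair to int before multiplying. */
int dot8s_ref(const int8_t *a, const int8_t *b, int limit) {
    int result = 0;
    for (int i = 0; i < limit; i++) {
        result += (int)a[i] * (int)b[i];
    }
    return result;
}

/* Hypothetical portable stand-in for the NEON loop in vdot8s.
 * acc[] plays the role of the int32x4_t accumulator. */
int dot8s_blocked(const int8_t *a, const int8_t *b, int limit) {
    assert(limit >= 0);
    int32_t acc[4] = {0, 0, 0, 0};
    int i = 0;

    /* Main loop: 16 values per step, like one vld1q_s8 + vdotq_s32 pair. */
    for (; i + 16 <= limit; i += 16) {
        for (int lane = 0; lane < 4; lane++) {
            /* Each lane sums the products of 4 adjacent int8 pairs. */
            for (int k = 0; k < 4; k++) {
                int idx = i + 4 * lane + k;
                acc[lane] += (int32_t)a[idx] * (int32_t)b[idx];
            }
        }
    }

    /* Horizontal reduction, the counterpart of vaddvq_s32. */
    int result = acc[0] + acc[1] + acc[2] + acc[3];

    /* Scalar tail for the remaining limit % 16 elements. */
    for (; i < limit; i++) {
        result += (int)a[i] * (int)b[i];
    }
    return result;
}
```

Comparing `dot8s_blocked` against `dot8s_ref` on lengths that are not multiples of 16, and on inputs containing negative values, exercises the same edge cases that matter for the intrinsic version: the scalar tail and sign handling.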
@@ -0,0 +1,3 @@

int vdot8s(char vec1[], char vec2[], int limit);
int dot8s(char vec1[], char vec2[], int limit);