[GPU] Bugfix accumulate sum of FC dyn-quan #27657
Signed-off-by: Min, Byungil <[email protected]>
@@ -53,7 +53,7 @@ KERNEL(quantize_input)(
 half4 buff = input_0[i] / (half4)quan_scale;
 quantized_value[i] = CAT(CAT(convert_, MAKE_VECTOR_TYPE(DQ_TYPE, INPUT_LOAD_SIZE)), _rte)(buff);
 #if COMPRESSED_WEIGHTS_INT8
-quantized_sum += (buff[0] + buff[1] + buff[2] + buff[3]);
+quantized_sum += ((half)quantized_value[i][0] + (half)quantized_value[i][1] + (half)quantized_value[i][2] + (half)quantized_value[i][3]);
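The changed line sums the values that were actually stored after rounding (`quantized_value`) instead of the scaled but still unrounded values (`buff`). The following is a minimal C sketch of why the two sums can differ. It uses plain `float` in place of `half`, and the helpers `round_half_even`, `sum_before_rounding` and `sum_after_rounding` are hypothetical names introduced for illustration, not part of the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an OpenCL convert_*_rte call: rounds to the
 * nearest integer, ties to even. Intended only for small |x| (int8 range). */
static int round_half_even(float x) {
    int lo = (int)x;                    /* truncates toward zero */
    if ((float)lo > x) lo--;            /* turn truncation into floor for negatives */
    float frac = x - (float)lo;         /* fractional part, in [0, 1) */
    if (frac > 0.5f) return lo + 1;
    if (frac < 0.5f) return lo;
    return (lo % 2 == 0) ? lo : lo + 1; /* exact tie: pick the even neighbour */
}

/* Sum of the scaled values before rounding (the old behaviour, using buff). */
static float sum_before_rounding(const float *x, size_t n, float scale) {
    assert(scale > 0.0f);
    float acc = 0.0f;
    for (size_t i = 0; i < n; ++i) acc += x[i] / scale;
    return acc;
}

/* Sum of the rounded values (the new behaviour, using quantized_value). */
static int sum_after_rounding(const float *x, size_t n, float scale) {
    assert(scale > 0.0f);
    int acc = 0;
    for (size_t i = 0; i < n; ++i) acc += round_half_even(x[i] / scale);
    return acc;
}
```

With four inputs of 0.4 and a scale of 1, every element rounds to 0, so the sum of the stored values is 0 while the sum taken before rounding is about 1.6. Only the former matches the data written to `quantized_input`.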
What about using int for accumulation instead of half? It would reduce the quantization error.
Applied.
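To illustrate the reviewer's point, here is a rough C sketch that compares exact `int` accumulation with an accumulator that is rounded to 11 significant bits after every add, which is the precision of a `half`. The helper `half_like` is a hypothetical emulation written for this example (it ignores overflow and adds one element at a time rather than four). It is not how the kernel is implemented.

```c
#include <assert.h>
#include <stddef.h>

/* Rounds an integer to the nearest value with at most 11 significant bits,
 * ties to even: a rough stand-in for storing it in a half.
 * Assumption: |v| <= 65504, the largest finite half value. */
static int half_like(int v) {
    int mag = v < 0 ? -v : v;
    assert(mag <= 65504);
    int step = 1;
    while (mag / step >= 2048) step *= 2;  /* spacing of representable values */
    int q = mag / step;
    int rem = mag % step;
    if (2 * rem > step || (2 * rem == step && (q & 1))) q++;
    return v < 0 ? -(q * step) : q * step;
}

/* Exact accumulation in int, as the reviewer suggests. */
static int sum_int(const signed char *q, size_t n) {
    int acc = 0;
    for (size_t i = 0; i < n; ++i) acc += q[i];
    return acc;
}

/* Accumulation that rounds after every add, as a half accumulator would. */
static int sum_half_like(const signed char *q, size_t n) {
    int acc = 0;
    for (size_t i = 0; i < n; ++i) acc = half_like(acc + q[i]);
    return acc;
}
```

For 40 quantized activations that are all 127, `sum_int` returns the exact 5080, while `sum_half_like` drifts to 5104 because every add past 2048 is rounded to the coarser half spacing.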
 #endif
 vstore4(quantized_value[i], 0, &quantized_input[input_offset + i * 4]);
 }

 // Pair of quantizing_scale and quantized activation_sum for each group
 quan_var[offset * 2] = quan_scale;
 #if COMPRESSED_WEIGHTS_INT8
-quan_var[(offset * 2) + 1] = quantized_sum;
+quan_var[(offset * 2) + 1] = CAT(convert_, INPUT0_TYPE)(quantized_sum);
Please use _rte.
Applied
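The `_rte` suffix requests round-to-nearest-even on the conversion. The following C sketch shows how nearest-even differs from truncation (`_rtz`) once an integer sum needs more than the 11 significant bits a `half` can hold. The function `to_half_grid` and the `round_mode` enum are hypothetical names for this illustration and only handle non-negative sums.

```c
#include <assert.h>

enum round_mode { ROUND_RTE, ROUND_RTZ };

/* Emulates converting a non-negative int to a format with 11 significant
 * bits, either rounding to nearest even (RTE) or toward zero (RTZ).
 * Assumption: 0 <= v <= 65504, the largest finite half value. */
static int to_half_grid(int v, enum round_mode mode) {
    assert(v >= 0 && v <= 65504);
    int step = 1;
    while (v / step >= 2048) step *= 2;  /* spacing of representable values */
    int q = v / step;                    /* truncated quotient: the RTZ result */
    int rem = v % step;
    if (mode == ROUND_RTE &&
        (2 * rem > step || (2 * rem == step && (q & 1)))) {
        q++;                             /* round up, ties go to the even quotient */
    }
    return q * step;
}
```

For example, 4207 lies between the representable values 4204 and 4208. Nearest-even picks 4208 (off by 1), whereas truncation picks 4204 (off by 3).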
### Details:
- Updated dyn-quantize of fully_connected_gpu_bf_tiled kernel

### Tickets:
- CVS-151707

---------

Signed-off-by: Min, Byungil <[email protected]>