[GPU] Bugfix accumulate sum of FC dyn-quan #27657

byungilm · 2024-11-21T06:25:18Z

Details:

Updated the dynamic-quantization (dyn-quan) path of the fully_connected_gpu_bf_tiled kernel

Tickets:

CVS-151707

Signed-off-by: Min, Byungil <[email protected]>

isanghao · 2024-11-22T02:08:58Z

src/plugins/intel_gpu/src/kernel_selector/cl_kernels/fully_connected_gpu_bf_tiled.cl

@@ -53,7 +53,7 @@ KERNEL(quantize_input)(
        half4 buff = input_0[i] / (half4)quan_scale;
        quantized_value[i] = CAT(CAT(convert_, MAKE_VECTOR_TYPE(DQ_TYPE, INPUT_LOAD_SIZE)), _rte)(buff);
        #if COMPRESSED_WEIGHTS_INT8
-            quantized_sum += (buff[0] + buff[1] + buff[2] + buff[3]);
+            quantized_sum += ((half)quantized_value[i][0] + (half)quantized_value[i][1] + (half)quantized_value[i][2] + (half)quantized_value[i][3]);


What about using int for accumulation instead of half? It would reduce the quantization error.
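
To make the two points in this hunk concrete, here is a small host-side Python model (not kernel code). It assumes `half` is IEEE 754 binary16 and ignores saturation to the int8 range; the helper names `to_half`, `quantize`, and `half_sum` are hypothetical and exist only for this sketch. It shows (1) that the sum of the pre-rounding values can differ from the sum of the values that were actually quantized, which is what the diff changes, and (2) that accumulating quantized values in half can drift once the running sum exceeds 2048, whereas an integer accumulator stays exact.

```python
import struct


def to_half(x):
    """Round x to the nearest IEEE 754 binary16 value (ties to even)."""
    return struct.unpack("<e", struct.pack("<e", float(x)))[0]


def quantize(values, scale):
    """Model of convert_*_rte(input / scale): scale in half, then round
    to the nearest integer with ties to even. Saturation is not modeled."""
    return [round(to_half(v / scale)) for v in values]


def half_sum(values):
    """Accumulate in half precision: every partial sum is rounded to half."""
    acc = 0.0
    for v in values:
        acc = to_half(acc + to_half(v))
    return acc


# (1) Sum before rounding vs. sum of the quantized values.
scaled = [0.4, 0.4, 0.4, 0.4]           # stands in for buff = input / scale
quantized = quantize(scaled, 1.0)       # every element rounds to 0
sum_before_rounding = half_sum(scaled)  # roughly 1.6
sum_after_rounding = sum(quantized)     # exactly 0

# (2) Half vs. int accumulation of already-quantized values.
large_group = [127] * 64                # hypothetical group of int8 maxima
int_total = sum(large_group)            # exact integer sum
half_total = half_sum(large_group)      # drifts once the sum passes 2048
```

In this model the integer accumulator reproduces the exact sum of the stored quantized values, which is the property the reviewer's suggestion is after. The group size of 64 is an illustrative assumption, not a value taken from the kernel.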

isanghao · 2024-11-22T10:58:32Z

src/plugins/intel_gpu/src/kernel_selector/cl_kernels/fully_connected_gpu_bf_tiled.cl

        #endif
        vstore4(quantized_value[i], 0, &quantized_input[input_offset + i * 4]);
    }

    // Pair of quantizing_scale and quantized activation_sum for each group
    quan_var[offset * 2] = quan_scale;
    #if COMPRESSED_WEIGHTS_INT8
-        quan_var[(offset * 2) + 1] = quantized_sum;
+        quan_var[(offset * 2) + 1] = CAT(convert_, INPUT0_TYPE)(quantized_sum);


Please use the _rte (round-to-nearest-even) variant of the conversion here.
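
As a rough illustration of why the rounding mode of this final conversion can matter, the Python sketch below models converting an integer sum to half with round-to-nearest-even (rte) and, for comparison, round-toward-zero (rtz). It assumes INPUT0_TYPE is half (IEEE 754 binary16) in this path; the helpers `half_rte` and `half_rtz` are hypothetical names for this sketch and are not OpenCL functions. A half has an 11-bit significand, so not every integer above 2048 is representable and the conversion has to round.

```python
import struct


def _half_from_bits(bits):
    """Reinterpret a 16-bit pattern as an IEEE 754 binary16 value."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


def half_rte(x):
    """Convert to half with round-to-nearest-even.
    Python's struct 'e' format rounds ties to even."""
    return struct.unpack("<e", struct.pack("<e", float(x)))[0]


def half_rtz(x):
    """Convert to half with round-toward-zero (for comparison only)."""
    bits = struct.unpack("<H", struct.pack("<e", float(x)))[0]
    if abs(_half_from_bits(bits)) > abs(x):
        # The nearest value overshot in magnitude: step one ulp toward zero.
        # The magnitude is nonzero here, so the sign bit is unaffected.
        bits -= 1
    return _half_from_bits(bits)


# Between 2048 and 4096 the spacing of half values is 2,
# between 4096 and 8192 it is 4.
for total in (2049, 2051, 4099):
    print(total, half_rte(total), half_rtz(total))
```

For 4099, for example, the rte result is 4100 while the rtz result is 4096, so the error of the stored sum depends on the rounding mode whenever the sum is not exactly representable.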

[GPU] Bugfix accumulate sum of FC dyn-quan

df02880

Signed-off-by: Min, Byungil <[email protected]>

github-actions bot added the category: GPU OpenVINO GPU plugin label Nov 21, 2024

byungilm marked this pull request as ready for review November 21, 2024 07:36

byungilm requested review from a team as code owners November 21, 2024 07:36

byungilm self-assigned this Nov 21, 2024

isanghao reviewed Nov 22, 2024

isanghao modified the milestones: 2024.6, 2023.3 Nov 22, 2024

[GPU] Apply comments

d449123

Signed-off-by: Min, Byungil <[email protected]>

byungilm requested a review from isanghao November 22, 2024 05:27

[GPU] Add _rte for conversion

99cb557

Signed-off-by: Min, Byungil <[email protected]>

isanghao reviewed Nov 22, 2024

isanghao approved these changes Nov 22, 2024

isanghao enabled auto-merge November 22, 2024 11:34

byungilm requested review from isanghao November 22, 2024 11:48

Merge branch 'master' into bugfix_dyn_qaun_int8_sd

f3cabe2

isanghao added this pull request to the merge queue Nov 26, 2024

Merged via the queue into openvinotoolkit:master with commit 611796c Nov 26, 2024
155 checks passed
