NPUW: Disable AVX2 code with ENABLE_AVX2=OFF #26890
Conversation
LGTM. Certain AVX instructions require the F16C module; it would be hard to differentiate them, and moreover OpenVINO doesn't have this level of granularity in its settings, so as a first step I think this is OK.
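For context, a minimal sketch (not OpenVINO code; `HAVE_AVX2` is a hypothetical stand-in for whatever macro `ENABLE_AVX2` maps to) of why the two instruction sets end up coupled: the half-to-float conversion intrinsic is F16C, yet it typically sits in the same AVX2-guarded translation unit, so a single `ENABLE_AVX2` switch has to cover both.

```cpp
#include <immintrin.h>
#include <cstdint>

#if defined(HAVE_AVX2)  // hypothetical build macro, assumed to follow ENABLE_AVX2
// Convert 8 fp16 values to fp32. _mm256_cvtph_ps() is an F16C instruction,
// but it lives inside the AVX2-guarded code path, so disabling ENABLE_AVX2
// removes the F16C usage as well.
inline void f16_to_f32_block8(const uint16_t* src, float* dst) {
    const __m128i half = _mm_loadu_si128(reinterpret_cast<const __m128i*>(src));
    _mm256_storeu_ps(dst, _mm256_cvtph_ps(half));
}
#endif
```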
Force-pushed from 586b4c3 to 6ea0420
Please rebase this PR once the Keras fix (#26912) is merged.
Force-pushed from f006def to dd02296
Force-pushed from 0a49d14 to 1e6748b
Force-pushed from 02ceca9 to 04ee8ec
Let's discuss if needed. If you want to dispatch over the top-level `unpack`, please keep the old `unpack` somewhere in util and call `XARCH::unpack` from it.
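A rough illustration of this suggestion (a sketch only, assuming the project's existing `util.hpp`/`util_xarch.hpp` headers; not the actual patch): the public entry point stays in `ov::npuw::util` and forwards to the cross-compiled variant, so callers never see `XARCH`.

```cpp
#include "util.hpp"        // assumed: declares ov::npuw::util::unpack
#include "util_xarch.hpp"  // assumed: declares ov::npuw::util::XARCH::unpack

// Platform-independent entry point kept in util; the architecture-specific
// dispatch happens behind this call, invisible to the callers.
void ov::npuw::util::unpack(const ov::SoPtr<ov::ITensor>& from,
                            const ov::SoPtr<ov::ITensor>& to) {
    ov::npuw::util::XARCH::unpack(from, to);
}
```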
```cmake
ARCH AVX2 ANY
     npuw/util_xarch.cpp
API  npuw/util_xarch.hpp
NAME unpack unpack_scale unpack_scale_zp to_f16
```
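The NAME entries above correspond to the functions exposed from npuw/util_xarch.hpp. A sketch of how those declarations likely look (signatures taken from the definitions shown further down; `to_f16` is omitted because its signature isn't visible in this diff, and the include paths are assumptions):

```cpp
#pragma once

#include <openvino/runtime/itensor.hpp>  // assumed include for ov::ITensor
#include <openvino/runtime/so_ptr.hpp>   // assumed include for ov::SoPtr

namespace ov::npuw::util::XARCH {

void unpack(const ov::SoPtr<ov::ITensor>& from,
            const ov::SoPtr<ov::ITensor>& to);

void unpack_scale(const ov::SoPtr<ov::ITensor>& from,
                  const ov::SoPtr<ov::ITensor>& scale,
                  const ov::SoPtr<ov::ITensor>& to);

void unpack_scale_zp(const ov::SoPtr<ov::ITensor>& from,
                     const ov::SoPtr<ov::ITensor>& zerop,
                     const ov::SoPtr<ov::ITensor>& scale,
                     const ov::SoPtr<ov::ITensor>& to);

// to_f16 also appears in the NAME list, but its signature isn't shown here.

}  // namespace ov::npuw::util::XARCH
```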
I thought we could keep the top-level unpacks platform-independent, but have their `unpack_u4f16`, `unpack_i4f16`, etc. versions separated. That could make handling the "overloaded names" problem easier.
```diff
             ov::npuw::util::XARCH::unpack_scale_zp(ov::get_tensor_impl(closure),
                                                    ov::get_tensor_impl(comp_model_desc.zerops[cidx]),
                                                    ov::get_tensor_impl(comp_model_desc.scales[cidx]),
                                                    clparam);
         } else if (!comp_model_desc.scales.empty() && comp_model_desc.scales[cidx]) {
             // Unpacking this weight requires scaling
-            ov::npuw::util::unpack(ov::get_tensor_impl(closure),
+            ov::npuw::util::XARCH::unpack_scale(ov::get_tensor_impl(closure),
                                                 ov::get_tensor_impl(comp_model_desc.scales[cidx]),
                                                 clparam);
         } else {
             // Unpacking this weight doesn't require scaling
-            ov::npuw::util::unpack(ov::get_tensor_impl(closure), clparam);
+            ov::npuw::util::XARCH::unpack(ov::get_tensor_impl(closure), clparam);
```
Let's keep these top-level unpack APIs intact. This isn't the only code using them; there's also LazyTensor, which I see was also updated.
A util function should be easy to use; this XARCH thing should be hidden inside if possible.
```cpp
void ov::npuw::util::XARCH::unpack(const ov::SoPtr<ov::ITensor>& from,
                                   const ov::SoPtr<ov::ITensor>& to) {
    unpack_impl(from, to);
}

void ov::npuw::util::XARCH::unpack_scale(const ov::SoPtr<ov::ITensor>& from,
                                         const ov::SoPtr<ov::ITensor>& scale,
                                         const ov::SoPtr<ov::ITensor>& to) {
    unpack_scale_impl(from, scale, to);
}

void ov::npuw::util::XARCH::unpack_scale_zp(const ov::SoPtr<ov::ITensor>& from,
                                            const ov::SoPtr<ov::ITensor>& zerop,
                                            const ov::SoPtr<ov::ITensor>& scale,
                                            const ov::SoPtr<ov::ITensor>& to) {
    unpack_scale_zp_impl(from, zerop, scale, to);
}
```
Please, let's move this dispatch one level below. These three functions call another three or four, based on data-type combinations. Let's dispatch over those (it must be a pretty modest set) but keep the rest intact.
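To illustrate the suggestion, a hedged sketch (the `unpack_u4f16`/`unpack_i4f16` names come from the discussion above; the exact signatures and element-type checks are assumptions, not the real code): the top-level unpack keeps its current shape, and only the per-type kernels move behind XARCH.

```cpp
#include <openvino/core/except.hpp>  // assumed include for OPENVINO_THROW

// Top-level unpack stays platform-independent; only the element-type
// specific kernels (a modest set) are dispatched through XARCH.
void ov::npuw::util::unpack(const ov::SoPtr<ov::ITensor>& from,
                            const ov::SoPtr<ov::ITensor>& to) {
    const auto from_type = from->get_element_type();
    const auto to_type = to->get_element_type();

    if (from_type == ov::element::u4 && to_type == ov::element::f16) {
        ov::npuw::util::XARCH::unpack_u4f16(from, to);  // cross-compiled kernel
    } else if (from_type == ov::element::i4 && to_type == ov::element::f16) {
        ov::npuw::util::XARCH::unpack_i4f16(from, to);  // cross-compiled kernel
    } else {
        OPENVINO_THROW("Unsupported element type combination in unpack()");
    }
}
```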
Force-pushed from 28457f2 to 7060669
Great!
Details:
Tickets: