Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NPUW: Disable AVX2 code with ENABLE_AVX2=OFF #26890

Merged
merged 11 commits into from
Oct 17, 2024

Conversation

eshiryae
Copy link
Contributor

@eshiryae eshiryae commented Oct 2, 2024

Details:

  • Disable AVX2 code with ENABLE_AVX2=OFF

Tickets:

  • E-141645

@eshiryae eshiryae requested review from dmatveev and esmirno October 2, 2024 16:28
@eshiryae eshiryae requested review from a team as code owners October 2, 2024 16:28
@github-actions github-actions bot added category: NPU OpenVINO NPU plugin category: NPUW NPUW plugin labels Oct 2, 2024
@dmatveev
Copy link
Contributor

dmatveev commented Oct 2, 2024

Thanks @eshiryae for the quick turnaround! It seems you need to run clang_format_fix_all in your branch
@esmirno can you please review the changes? I'll come up with review later

@dmatveev dmatveev added this to the 2024.5 milestone Oct 2, 2024
@dmatveev dmatveev self-assigned this Oct 2, 2024
@esmirno
Copy link
Contributor

esmirno commented Oct 2, 2024

LGTM certain avx instruction requires F16C module, while it will be hard to differentiate, and more over openvino doesn't have this level of granularity in settings, so as first step i think this is ok.

@esmirno
Copy link
Contributor

esmirno commented Oct 4, 2024

please rebase this PR, once Keras fix merged : #26912

@github-actions github-actions bot added the category: build OpenVINO cmake script / infra label Oct 14, 2024
@eshiryae eshiryae changed the title NPUW: Disable AVX2 code with ENABLE_AVX2=OFF [DO NOT MERGE] NPUW: Disable AVX2 code with ENABLE_AVX2=OFF Oct 14, 2024
Copy link
Contributor

@dmatveev dmatveev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's discuss if needed.

If you want to dispatch over top-level unpack, please keep the old unpack back somewhere in util and call XARCH::unpack from it.

ARCH AVX2 ANY
npuw/util_xarch.cpp
API npuw/util_xarch.hpp
NAME unpack unpack_scale unpack_scale_zp to_f16
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we could stay with top-level unpacks being platform-independent, but have their unpack_u4f16, unpack_i4f16, etc. etc. versions separated. It could make easier handling the "overloaded names" problem.

Comment on lines 621 to 632
ov::npuw::util::XARCH::unpack_scale_zp(ov::get_tensor_impl(closure),
ov::get_tensor_impl(comp_model_desc.zerops[cidx]),
ov::get_tensor_impl(comp_model_desc.scales[cidx]),
clparam);
} else if (!comp_model_desc.scales.empty() && comp_model_desc.scales[cidx]) {
// Unpacking this weight requires scaling
ov::npuw::util::unpack(ov::get_tensor_impl(closure),
ov::npuw::util::XARCH::unpack_scale(ov::get_tensor_impl(closure),
ov::get_tensor_impl(comp_model_desc.scales[cidx]),
clparam);
} else {
// Unpacking this weight doesn't require scaling
ov::npuw::util::unpack(ov::get_tensor_impl(closure), clparam);
ov::npuw::util::XARCH::unpack(ov::get_tensor_impl(closure), clparam);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's keep these top-level unpack APIs intact. That's not the only code which is using those, there's also LazyTensor which I see was also updated.

An util function should be easy to use, this XARCH thing should be hidden inside if possible.

Comment on lines 12 to 28
void ov::npuw::util::XARCH::unpack(const ov::SoPtr<ov::ITensor>& from,
const ov::SoPtr<ov::ITensor>& to) {
unpack_impl(from, to);
}

void ov::npuw::util::XARCH::unpack_scale(const ov::SoPtr<ov::ITensor>& from,
const ov::SoPtr<ov::ITensor>& scale,
const ov::SoPtr<ov::ITensor>& to) {
unpack_scale_impl(from, scale, to);
}

void ov::npuw::util::XARCH::unpack_scale_zp(const ov::SoPtr<ov::ITensor>& from,
const ov::SoPtr<ov::ITensor>& zerop,
const ov::SoPtr<ov::ITensor>& scale,
const ov::SoPtr<ov::ITensor>& to) {
unpack_scale_zp_impl(from, zerop, scale, to);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please let's make this dispatch one level below?

These three functions call another three-four, based on data type combinations. Let's make dispatch over those (must be a pretty modest set) but keep the rest intact.

@eshiryae eshiryae requested a review from dmatveev October 16, 2024 18:34
@eshiryae eshiryae changed the title [DO NOT MERGE] NPUW: Disable AVX2 code with ENABLE_AVX2=OFF NPUW: Disable AVX2 code with ENABLE_AVX2=OFF Oct 17, 2024
Copy link
Contributor

@dmatveev dmatveev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great!

@ilya-lavrenov ilya-lavrenov added this pull request to the merge queue Oct 17, 2024
Merged via the queue into openvinotoolkit:master with commit 6dbca1f Oct 17, 2024
134 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category: build OpenVINO cmake script / infra category: NPU OpenVINO NPU plugin category: NPUW NPUW plugin
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants