NPUW: Disable AVX2 code with ENABLE_AVX2=OFF #26890
Conversation
LGTM. Certain AVX instructions require the F16C module; it would be hard to differentiate them, and moreover OpenVINO doesn't have this level of granularity in its settings, so as a first step I think this is OK.
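For context, a minimal sketch (not OpenVINO code; `HAVE_AVX2` is a hypothetical stand-in for whatever macro `ENABLE_AVX2` maps to) of why the two instruction sets end up coupled: the half-to-float conversion intrinsic is F16C, yet it typically sits in the same AVX2-guarded translation unit, so a single `ENABLE_AVX2` switch has to cover both.

```cpp
#include <immintrin.h>
#include <cstdint>

#if defined(HAVE_AVX2)  // hypothetical build macro, assumed to follow ENABLE_AVX2
// Convert 8 fp16 values to fp32. _mm256_cvtph_ps() is an F16C instruction,
// but it lives inside the AVX2-guarded code path, so disabling ENABLE_AVX2
// removes the F16C usage as well.
inline void f16_to_f32_block8(const uint16_t* src, float* dst) {
    const __m128i half = _mm_loadu_si128(reinterpret_cast<const __m128i*>(src));
    _mm256_storeu_ps(dst, _mm256_cvtph_ps(half));
}
#endif
```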
Force-pushed from 586b4c3 to 6ea0420
Please rebase this PR once the Keras fix (#26912) is merged.
Force-pushed from f006def to dd02296
Force-pushed from 0a49d14 to 1e6748b
Force-pushed from 02ceca9 to 04ee8ec
Let's discuss if needed. If you want to dispatch over the top-level `unpack`, please keep the old `unpack` somewhere in util and call `XARCH::unpack` from it.
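A rough illustration of this suggestion (a sketch only, assuming the project's existing `util.hpp`/`util_xarch.hpp` headers; not the actual patch): the public entry point stays in `ov::npuw::util` and forwards to the cross-compiled variant, so callers never see `XARCH`.

```cpp
#include "util.hpp"        // assumed: declares ov::npuw::util::unpack
#include "util_xarch.hpp"  // assumed: declares ov::npuw::util::XARCH::unpack

// Platform-independent entry point kept in util; the architecture-specific
// dispatch happens behind this call, invisible to the callers.
void ov::npuw::util::unpack(const ov::SoPtr<ov::ITensor>& from,
                            const ov::SoPtr<ov::ITensor>& to) {
    ov::npuw::util::XARCH::unpack(from, to);
}
```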
```cmake
ARCH AVX2 ANY
     npuw/util_xarch.cpp
API  npuw/util_xarch.hpp
NAME unpack unpack_scale unpack_scale_zp to_f16
```
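The NAME entries above correspond to the functions exposed from npuw/util_xarch.hpp. A sketch of how those declarations likely look (signatures taken from the definitions shown further down; `to_f16` is omitted because its signature isn't visible in this diff, and the include paths are assumptions):

```cpp
#pragma once

#include <openvino/runtime/itensor.hpp>  // assumed include for ov::ITensor
#include <openvino/runtime/so_ptr.hpp>   // assumed include for ov::SoPtr

namespace ov::npuw::util::XARCH {

void unpack(const ov::SoPtr<ov::ITensor>& from,
            const ov::SoPtr<ov::ITensor>& to);

void unpack_scale(const ov::SoPtr<ov::ITensor>& from,
                  const ov::SoPtr<ov::ITensor>& scale,
                  const ov::SoPtr<ov::ITensor>& to);

void unpack_scale_zp(const ov::SoPtr<ov::ITensor>& from,
                     const ov::SoPtr<ov::ITensor>& zerop,
                     const ov::SoPtr<ov::ITensor>& scale,
                     const ov::SoPtr<ov::ITensor>& to);

// to_f16 also appears in the NAME list, but its signature isn't shown here.

}  // namespace ov::npuw::util::XARCH
```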
I thought we could keep the top-level unpacks platform-independent, but have their `unpack_u4f16`, `unpack_i4f16`, etc. versions separated. That could make handling the "overloaded names" problem easier.
```diff
             ov::npuw::util::XARCH::unpack_scale_zp(ov::get_tensor_impl(closure),
                                                    ov::get_tensor_impl(comp_model_desc.zerops[cidx]),
                                                    ov::get_tensor_impl(comp_model_desc.scales[cidx]),
                                                    clparam);
         } else if (!comp_model_desc.scales.empty() && comp_model_desc.scales[cidx]) {
             // Unpacking this weight requires scaling
-            ov::npuw::util::unpack(ov::get_tensor_impl(closure),
+            ov::npuw::util::XARCH::unpack_scale(ov::get_tensor_impl(closure),
                                                 ov::get_tensor_impl(comp_model_desc.scales[cidx]),
                                                 clparam);
         } else {
             // Unpacking this weight doesn't require scaling
-            ov::npuw::util::unpack(ov::get_tensor_impl(closure), clparam);
+            ov::npuw::util::XARCH::unpack(ov::get_tensor_impl(closure), clparam);
```
Let's keep these top-level unpack APIs intact. This isn't the only code using them; there's also LazyTensor, which I see was also updated.
A util function should be easy to use; this XARCH thing should be hidden inside if possible.
```cpp
void ov::npuw::util::XARCH::unpack(const ov::SoPtr<ov::ITensor>& from,
                                   const ov::SoPtr<ov::ITensor>& to) {
    unpack_impl(from, to);
}

void ov::npuw::util::XARCH::unpack_scale(const ov::SoPtr<ov::ITensor>& from,
                                         const ov::SoPtr<ov::ITensor>& scale,
                                         const ov::SoPtr<ov::ITensor>& to) {
    unpack_scale_impl(from, scale, to);
}

void ov::npuw::util::XARCH::unpack_scale_zp(const ov::SoPtr<ov::ITensor>& from,
                                            const ov::SoPtr<ov::ITensor>& zerop,
                                            const ov::SoPtr<ov::ITensor>& scale,
                                            const ov::SoPtr<ov::ITensor>& to) {
    unpack_scale_zp_impl(from, zerop, scale, to);
}
```
Please, let's move this dispatch one level below. These three functions call another three or four, based on data-type combinations. Let's dispatch over those (it must be a pretty modest set) but keep the rest intact.
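To illustrate the suggestion, a hedged sketch (the `unpack_u4f16`/`unpack_i4f16` names come from the discussion above; the exact signatures and element-type checks are assumptions, not the real code): the top-level unpack keeps its current shape, and only the per-type kernels move behind XARCH.

```cpp
#include <openvino/core/except.hpp>  // assumed include for OPENVINO_THROW

// Top-level unpack stays platform-independent; only the element-type
// specific kernels (a modest set) are dispatched through XARCH.
void ov::npuw::util::unpack(const ov::SoPtr<ov::ITensor>& from,
                            const ov::SoPtr<ov::ITensor>& to) {
    const auto from_type = from->get_element_type();
    const auto to_type = to->get_element_type();

    if (from_type == ov::element::u4 && to_type == ov::element::f16) {
        ov::npuw::util::XARCH::unpack_u4f16(from, to);  // cross-compiled kernel
    } else if (from_type == ov::element::i4 && to_type == ov::element::f16) {
        ov::npuw::util::XARCH::unpack_i4f16(from, to);  // cross-compiled kernel
    } else {
        OPENVINO_THROW("Unsupported element type combination in unpack()");
    }
}
```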
Force-pushed from 28457f2 to 7060669
Great!
Details:
Tickets: