New caching api #27644

Closed
wants to merge 7 commits into from

Contributor

@olpipi olpipi commented Nov 20, 2024

Details:

  • New API for plugins to import a model using std::istream + ov::AlignedBuffer. It lets a plugin read weights directly from the buffer.
  • To enable this feature, a plugin should add the internal property ov::internal::caching_with_mmap and implement the corresponding method.
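
A minimal sketch of the idea, not the code from this PR: `Buffer`, `Weights`, `import_model_sketch`, and `make_blob` are hypothetical stand-ins, and the blob layout (a header holding the offset and size of the weights) is an assumption made only for illustration. It shows how a plugin that receives both the stream and the mapped buffer could point at the weights inside the buffer instead of copying them out of the stream.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <istream>
#include <memory>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for ov::AlignedBuffer: owns (or maps) the cached blob.
struct Buffer {
    std::vector<char> bytes;
    const char* data() const { return bytes.data(); }
    std::size_t size() const { return bytes.size(); }
};

// Non-owning view of the weights; `owner` keeps the underlying memory alive.
struct Weights {
    std::shared_ptr<Buffer> owner;
    const char* ptr;
    std::size_t size;
};

// Assumed layout: two uint64 values (weights offset, weights size), then the payload.
// The header is parsed from the stream; the weights are referenced in the buffer.
Weights import_model_sketch(std::istream& model, std::shared_ptr<Buffer> model_buffer) {
    std::uint64_t offset = 0, size = 0;
    model.read(reinterpret_cast<char*>(&offset), sizeof(offset));
    model.read(reinterpret_cast<char*>(&size), sizeof(size));
    if (!model || !model_buffer || offset > model_buffer->size() ||
        size > model_buffer->size() - offset) {
        return Weights{nullptr, nullptr, 0};  // bad header or no buffer: use the stream path instead
    }
    return Weights{model_buffer, model_buffer->data() + offset, static_cast<std::size_t>(size)};
}

// Builds a blob with the assumed layout, for demonstration only.
std::shared_ptr<Buffer> make_blob(const std::string& weights) {
    const std::uint64_t offset = 2 * sizeof(std::uint64_t);
    const std::uint64_t size = weights.size();
    auto blob = std::make_shared<Buffer>();
    blob->bytes.resize(offset + size);
    std::memcpy(blob->bytes.data(), &offset, sizeof(offset));
    std::memcpy(blob->bytes.data() + sizeof(offset), &size, sizeof(size));
    std::memcpy(blob->bytes.data() + offset, weights.data(), size);
    return blob;
}
```

Feeding the same bytes through a `std::istringstream` and passing the blob as `model_buffer` yields a `Weights` view whose `ptr` lies inside the blob, so no weight bytes are copied.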

Tickets:

@github-actions github-actions bot added category: inference OpenVINO Runtime library - Inference category: CPU OpenVINO CPU plugin labels Nov 20, 2024
@olpipi
Contributor Author

olpipi commented Nov 20, 2024

Hi @ilya-lavrenov @PatrikStepan @nshchego @MirceaDan99 @sshlyapn,
Please take a look. Is this API acceptable to you as a way to share the mmap buffer with a plugin?
Another option is to fully replace std::istream with OwningSharedStreamBuffer. The plugin would then be able to get the ov::AlignedBuffer from it.
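
The second option could look roughly like the sketch below. `OwningStreamBuf`, `Buffer`, and `buffer_of` are hypothetical names used for illustration; they are not OpenVINO's OwningSharedStreamBuffer or ov::AlignedBuffer. The stream buffer shares ownership of the memory it reads from, and a plugin recovers that memory from an ordinary `std::istream` through `rdbuf()`.

```cpp
#include <cassert>
#include <istream>
#include <memory>
#include <sstream>
#include <streambuf>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for ov::AlignedBuffer.
using Buffer = std::vector<char>;

// Hypothetical stream buffer that shares ownership of the memory it reads from.
class OwningStreamBuf : public std::streambuf {
public:
    explicit OwningStreamBuf(std::shared_ptr<Buffer> buffer) : m_buffer(std::move(buffer)) {
        char* begin = m_buffer->data();
        setg(begin, begin, begin + m_buffer->size());  // expose the whole buffer as the get area
    }
    std::shared_ptr<Buffer> buffer() const { return m_buffer; }

private:
    std::shared_ptr<Buffer> m_buffer;
};

// Plugin side: recover the shared buffer, or nullptr when the stream is an ordinary one.
std::shared_ptr<Buffer> buffer_of(std::istream& stream) {
    auto* owning = dynamic_cast<OwningStreamBuf*>(stream.rdbuf());
    return owning ? owning->buffer() : nullptr;
}
```

With this shape the import_model signature stays stream-only, at the cost of a `dynamic_cast` in every plugin that wants the buffer.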

@MirceaDan99
Contributor

I updated my functional POC that supports the new import_model API: master...MirceaDan99:openvino:POC/add_caching_with_mmap_property

I still believe we'll need a mechanism for handling offsets into the buffer held by the ov::AlignedBuffer. My own approach to this can be found in commits:

@olpipi olpipi marked this pull request as ready for review November 28, 2024 13:48
@olpipi olpipi requested review from a team as code owners November 28, 2024 13:48
@olpipi olpipi force-pushed the new_caching_api branch 2 times, most recently from c3df271 to 1e72a49 Compare November 28, 2024 14:26
Contributor

@praasz praasz left a comment

Could a test that uses the new API be added?

src/inference/src/dev/iplugin.cpp
src/inference/src/dev/plugin.hpp
src/plugins/intel_cpu/src/plugin.cpp
@olpipi olpipi requested review from a team as code owners December 4, 2024 13:42
@github-actions github-actions bot added category: AUTO BATCH OpenVINO Auto Batch plugin category: PROXY OpenVINO proxy plugin category: NPU OpenVINO NPU plugin category: NPUW NPUW plugin labels Dec 4, 2024
@@ -40,6 +40,11 @@ class MockIPlugin : public ov::IPlugin {
import_model,
(std::istream&, const ov::SoPtr<ov::IRemoteContext>&, const ov::AnyMap&),
(const));
MOCK_METHOD(std::shared_ptr<ov::ICompiledModel>, import_model, (std::istream&, std::shared_ptr<ov::AlignedBuffer>, const ov::AnyMap&), (const));
Contributor

Why not string + Tensor?

Contributor Author

This is an internal plugin API. All plugins support model import with std::istream. I just added ov::AlignedBuffer as an additional argument to simplify the implementation for plugins. They can now reuse the existing implementation and add logic to get weights directly from ov::AlignedBuffer.
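
One way to read "reuse the existing implementation" is sketched below. All types here (`PluginBase`, `LegacyPlugin`, `MmapPlugin`, `CompiledModel`, `Buffer`, `Properties`) are hypothetical stand-ins, and the default fallback in the base class is an assumption, not confirmed behavior of ov::IPlugin. A plugin that never overrides the new overload keeps working through its stream-only path, while a plugin that opts in takes weights from the buffer.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <memory>
#include <sstream>
#include <string>
#include <vector>

using Buffer = std::vector<char>;                        // stand-in for ov::AlignedBuffer
using Properties = std::map<std::string, std::string>;   // stand-in for ov::AnyMap
struct CompiledModel { bool weights_from_buffer; };      // stand-in for ov::ICompiledModel

class PluginBase {
public:
    virtual ~PluginBase() = default;
    // Existing stream-only import that every plugin already implements.
    virtual std::shared_ptr<CompiledModel> import_model(std::istream& model,
                                                        const Properties& properties) const = 0;
    // New overload; the assumed default ignores the buffer and reuses the stream path.
    virtual std::shared_ptr<CompiledModel> import_model(std::istream& model,
                                                        std::shared_ptr<Buffer> model_buffer,
                                                        const Properties& properties) const {
        (void)model_buffer;
        return import_model(model, properties);
    }
};

// Plugin that has not opted in: only the stream-only overload is implemented.
class LegacyPlugin : public PluginBase {
public:
    std::shared_ptr<CompiledModel> import_model(std::istream&, const Properties&) const override {
        return std::make_shared<CompiledModel>(CompiledModel{false});
    }
};

// Plugin that opts in: same stream parsing, but weights come from the buffer when present.
class MmapPlugin : public PluginBase {
public:
    std::shared_ptr<CompiledModel> import_model(std::istream&, const Properties&) const override {
        return std::make_shared<CompiledModel>(CompiledModel{false});
    }
    std::shared_ptr<CompiledModel> import_model(std::istream& model,
                                                std::shared_ptr<Buffer> model_buffer,
                                                const Properties& properties) const override {
        if (!model_buffer) {
            return import_model(model, properties);  // no buffer: fall back to the stream path
        }
        return std::make_shared<CompiledModel>(CompiledModel{true});
    }
};
```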

@olpipi
Contributor Author

olpipi commented Dec 5, 2024

Could a test that uses the new API be added?

Tests were added.

@olpipi olpipi requested review from praasz and ilya-lavrenov and removed request for praasz December 5, 2024 17:31
@olpipi olpipi enabled auto-merge December 6, 2024 15:57
@olpipi olpipi added this pull request to the merge queue Dec 6, 2024
@ilya-lavrenov ilya-lavrenov removed this pull request from the merge queue due to a manual request Dec 6, 2024
*/
virtual std::shared_ptr<ov::ICompiledModel> import_model(std::istream& model,
std::shared_ptr<ov::AlignedBuffer> model_buffer,
const ov::AnyMap& properties) const;
Contributor

Why not pass this model_buffer via properties? It would require fewer changes.

In the future, we are moving to a weightless cache for all plugins, so model_buffer will be less important.
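
A rough sketch of what passing the buffer through properties could look like. `AnyMap` here is a plain `std::map<std::string, std::any>` (C++17) standing in for ov::AnyMap, and the key name and the `take_model_buffer` helper are hypothetical; none of this is taken from the actual replacement PR.

```cpp
#include <any>
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

using Buffer = std::vector<char>;                  // stand-in for ov::AlignedBuffer
using AnyMap = std::map<std::string, std::any>;    // stand-in for ov::AnyMap

// Hypothetical property key; not a real OpenVINO property name.
const char* const kModelBufferKey = "HYPOTHETICAL_MODEL_BUFFER";

// Plugin side: returns the buffer if the caller supplied one, nullptr otherwise.
std::shared_ptr<Buffer> take_model_buffer(const AnyMap& properties) {
    auto it = properties.find(kModelBufferKey);
    if (it == properties.end()) {
        return nullptr;
    }
    // The pointer form of any_cast returns nullptr on a type mismatch instead of throwing.
    const auto* held = std::any_cast<std::shared_ptr<Buffer>>(&it->second);
    return held ? *held : nullptr;
}
```

The import_model signature stays unchanged in this variant. The trade-off is that the buffer becomes an optional, stringly-keyed input that each plugin has to look up and type-check itself.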

@ilya-lavrenov
Contributor

An alternative solution is #27981.

@olpipi
Contributor Author

olpipi commented Dec 10, 2024

This PR is closed in favor of another solution, #27981.

github-merge-queue bot pushed a commit that referenced this pull request Dec 10, 2024
### Details:
- Replacement for #27644

### Tickets:
 - CVS-154602
 - CVS-157192
11happy pushed a commit to 11happy/openvino that referenced this pull request Dec 23, 2024
### Details:
- Replacement for openvinotoolkit#27644

### Tickets:
 - CVS-154602
 - CVS-157192
Labels
- category: AUTO BATCH (OpenVINO Auto Batch plugin)
- category: AUTO (OpenVINO AUTO device selection plugin)
- category: CPU (OpenVINO CPU plugin)
- category: GPU (OpenVINO GPU plugin)
- category: HETERO (OpenVINO HETERO plugin)
- category: IE Tests (OpenVINO Test: plugins and common)
- category: inference (OpenVINO Runtime library - Inference)
- category: NPU (OpenVINO NPU plugin)
- category: NPUW (NPUW plugin)
- category: PROXY (OpenVINO proxy plugin)
- category: TEMPLATE (OpenVINO Template plugin)
4 participants