-
Notifications
You must be signed in to change notification settings - Fork 114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introducing a loading layer in FAISS native engine. #2139
Introducing a loading layer in FAISS native engine. #2139
Conversation
Signed-off-by: Dooyong Kim <[email protected]>
Signed-off-by: Dooyong Kim <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall code looks good to me. Couple of things:
- Please add the unit test for the C++ layer.
- Add the reference for the benchmarks performed using this code.
src/main/java/org/opensearch/knn/index/memory/NativeMemoryLoadStrategy.java
Outdated
Show resolved
Hide resolved
src/main/java/org/opensearch/knn/index/memory/NativeMemoryLoadStrategy.java
Outdated
Show resolved
Hide resolved
|
||
final int indexSize = (int) directory.fileLength(logicalIndexPath); | ||
|
||
try (IndexInput readStream = directory.openInput(logicalIndexPath, IOContext.READONCE)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check when openInput is called what is the memory spike we are observing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need more input for this, which issue are you referring to??
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not talking about any issue here. So my question is in your tests when you the queries are run, can we graph the spikes in memory during the query. As IndexInput.openInput maps the files to memory. So want to know if there is any spike we are seeing and how much is that spike actually.
src/main/java/org/opensearch/knn/index/util/IndexInputWithBuffer.java
Outdated
Show resolved
Hide resolved
src/main/java/org/opensearch/knn/index/util/IndexInputWithBuffer.java
Outdated
Show resolved
Hide resolved
@@ -286,6 +286,7 @@ private Map<Integer, Float> doANNSearch( | |||
try { | |||
indexAllocation = nativeMemoryCacheManager.get( | |||
new NativeMemoryEntryContext.IndexEntryContext( | |||
reader.directory(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we have added the directory code here, are we going to remove the dependency from FSDirectory in upcoming PRs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you're referring to using FSDirectory
to get indexPath
right?
If so, yeah, that is the plan! We must deprecate FileWatcher first then moving on cutting FSDirectory
entirely.
Passing Directory
here to let the internal loading strategy be able to create IndexInput
from it, in case there's not yet a vector index was loaded.
Signed-off-by: Dooyong Kim <[email protected]>
Appendix 1. Performance Details
|
@navneet1v Address your comment. Can you take a look at it? |
@0ctopus13prime please ensure that all CIs are successful |
Seems like |
…meter. Signed-off-by: Dooyong Kim <[email protected]>
ceb73cb
to
fe33151
Compare
if (KNNEngine.NMSLIB == knnEngine) { | ||
throw new UnsupportedOperationException("Loading NMSLIB with a read stream is not supported at the moment"); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems redundant
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had it here for being conservative.
Also I thought it would helpful for the reader to have a easier understanding why there's only FAISS in this method.
But, yeah! there won't be NMSLIB to be passed down to here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The thing is the last throw is just dead code as it stands now. Would recommend removing if its handled correctly in favor of reducing code since there is no ambiguity here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I'm understanding The thing is the last throw is just dead code as it stands now.
.
When we throw it there, the thrown exception will be eaten silently?
Hmm... I think KNNWeight would trigger this, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was talking about this, I will leave it up to you on this. not a blocker
throw new IllegalArgumentException(
String.format(Locale.ROOT, "LoadIndex not supported for provided engine : %s", knnEngine.getName())
);
@@ -547,6 +547,34 @@ void knn_jni::JNIUtil::SetByteArrayRegion(JNIEnv *env, jbyteArray array, jsize s | |||
this->HasExceptionInStack(env, "Unable to set byte array region"); | |||
} | |||
|
|||
jobject knn_jni::JNIUtil::GetObjectField(JNIEnv * env, jobject obj, jfieldID fieldID) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These seem to be just wrappers over JNIEnv, is this done for testing purposes? can we avoid this maintenance?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is for testing purposes.
I could not come up with a better solution to have it here.
Also I think the motivation of introducing this class is on the testing purposes as well, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah but most of the methods here have added logic, would be nice if we can mock JNIEnv. but should be fine if its not possible. One of the reason for this file is also caching which is handled by statics in this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, in order to let gtest mock, the target has to be a virtual class...
And JNIEnv is a struct, and defined APIs are also not virtual methods.
// Loads an index with a reader implemented IOReader | ||
// | ||
// Returns a pointer of the loaded index | ||
jlong LoadIndexWithStream(faiss::IOReader* ioReader); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets move these methods to index service?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a parity of LoadIndex
which will be deprecated soon.
Tried not to modify the whole structure too much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Load index is a pending refactor/move, if there is enough bandwidth we should move new code to index service to start with
@heemin32 thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can I make the changes in the upcoming PR?
Which will be raised right after this - Streaming support NMSLIB.
Will ask you for the review for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better to have it in index service. Having it in the next PR is fine for me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good!
Will move the method in index service in the next PR where I will also ask you to review over the refactoring.
Thank you both.
Signed-off-by: Dooyong Kim <[email protected]>
c995564
to
da17102
Compare
Loading Time ComparisonThe numbers below were measured through
Conclusion: When a new index file was just created, and it's not system cached, then there's no trivial different to loading time between baseline and streaming fashion. ExperimentIndex size : 6.4G 1. Baseline (Using fread)
2. Using Stream2.1. 4KB
2.2. 64KB
2.3. 1M
|
Signed-off-by: Dooyong Kim <[email protected]>
@navneet1v StreamingBaseline |
Signed-off-by: Dooyong Kim <[email protected]>
Hi @shatejas, included unit tests in the commit! |
Thanks @0ctopus13prime this is was some really good benchmarks and also the way you performed the benchmarks. I really like the idea of dropping the file from page cache and then running the benchmarks. But I am not completely sold on the conclusion of sticking to 4KB as buffer size. I believe we should move to towards little higher buffer size to ensure that we are as close to cached performance. We has a similar buffer kind of thing when we transfer the vectors from Java Heap to off heap. There we use 1% of the heap memory. can we run 1 test with a higher value like 1% of the heap in a 32GB heap JVM. Another thing, please paste these benchmarks on the RFC too. This is really good benchmarks. |
I am little surprised by this could you share more on details on what was your testing strategy and what was the metrics gathering technique to build the above graphs. |
@navneet1v Testing machine : c5a.12xlarge | c5.12xlarge | 48vcpu | 96cores | EBS-Only | 12 Gbits | Testing strategyJVM option: Benchmark procedures
Metrics gathering
Why there's no memory peak?I think it is reasonable to think that opening IndexInput will not result in memory peak since the default directory ( Additionally, because Of course, if user configured a different Directory that is similar to RAMDirectory, then memory peak is expected as JVM will try to load file contents into its heap. |
@navneet1v Reasons:
|
@0ctopus13prime thanks for the details. Overall code looks good to me. Please fix the conflicts, |
Signed-off-by: Dooyong Kim <[email protected]>
Signed-off-by: Doo Yong Kim <[email protected]>
In CI, Submitting tasks sequentially to a thread pool does not guarantee that they will be executed in the same order they were submitted. Therefore, even we submitted 1.
Will fix this testing code and rerun CI. |
Thanks @0ctopus13prime for taking a look at this. We already have a GH issue for this flaky test. @kotwanikunal was supposed to look into this. If you think fix is small please go ahead and fix it. But it that is not a blocker for this PR to merge. |
Signed-off-by: Dooyong Kim <[email protected]>
* Introducing a loading layer in FAISS native engine. Signed-off-by: Dooyong Kim <[email protected]> * Update change log. Signed-off-by: Dooyong Kim <[email protected]> * Added unit tests for Faiss stream support. Signed-off-by: Dooyong Kim <[email protected]> * Fix a bug to pass a KB size integer value as a byte size integer parameter. Signed-off-by: Dooyong Kim <[email protected]> * Fix a casting bugs when it tries to laod more than 4G sized index file. Signed-off-by: Dooyong Kim <[email protected]> * Added unit tests for new methods in JNIService. Signed-off-by: Dooyong Kim <[email protected]> * Fix formatting and removed nmslib_stream_support. Signed-off-by: Dooyong Kim <[email protected]> * Removing redundant exception message in JNIService.loadIndex. Signed-off-by: Dooyong Kim <[email protected]> * Fix a flaky testing - testIndexAllocation_closeBlocking Signed-off-by: Dooyong Kim <[email protected]> --------- Signed-off-by: Dooyong Kim <[email protected]> Signed-off-by: Doo Yong Kim <[email protected]> Co-authored-by: Dooyong Kim <[email protected]> (cherry picked from commit f3b2bd0)
* Introducing a loading layer in FAISS native engine. Signed-off-by: Dooyong Kim <[email protected]> * Update change log. Signed-off-by: Dooyong Kim <[email protected]> * Added unit tests for Faiss stream support. Signed-off-by: Dooyong Kim <[email protected]> * Fix a bug to pass a KB size integer value as a byte size integer parameter. Signed-off-by: Dooyong Kim <[email protected]> * Fix a casting bugs when it tries to laod more than 4G sized index file. Signed-off-by: Dooyong Kim <[email protected]> * Added unit tests for new methods in JNIService. Signed-off-by: Dooyong Kim <[email protected]> * Fix formatting and removed nmslib_stream_support. Signed-off-by: Dooyong Kim <[email protected]> * Removing redundant exception message in JNIService.loadIndex. Signed-off-by: Dooyong Kim <[email protected]> * Fix a flaky testing - testIndexAllocation_closeBlocking Signed-off-by: Dooyong Kim <[email protected]> --------- Signed-off-by: Dooyong Kim <[email protected]> Signed-off-by: Doo Yong Kim <[email protected]> Co-authored-by: Dooyong Kim <[email protected]> (cherry picked from commit f3b2bd0) Co-authored-by: Doo Yong Kim <[email protected]>
Description
This PR is the first commit making the loading layer in native engines available.
Please refer to this issue for more details. - #2033
Related Issues
Resolves #2033
#2033
Check List
--signoff
.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.