Support Multi-threaded Benchmarks for Production-Realistic Performance Metrics #568
Comments
@mesibo thanks for the initiative, I agree that reporting on non-single-thread results is interesting. Could you detail what you are missing with the present
Actually, the default
You are correct that the Unlike most synchronous algorithms, our algorithm is asynchronous and analyzes patterns in vectors before indexing them. It uses multi-core systems for this pre-processing and hence has to be asynchronous. However, this poses no issue during indexing as it’s a one-time operation, and we only need to wait in the However, querying is also asynchronous, and the algorithm can handle multiple query vectors in parallel. This is where integrating it into the While We are still investigating potential solutions, but any suggestions or guidance you can provide would be greatly appreciated.
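One way to picture the integration problem is a minimal sketch in which an asynchronous query call is wrapped in a blocking, batch-style interface: submit every query, then wait once for all of them. Everything here is an assumption made for illustration: `HypotheticalAsyncIndex` and `query_async` stand in for an asynchronous library and are not its real API, and the method names `batch_query` and `get_batch_results` are chosen to resemble a batch-mode interface rather than copied from the framework.

```python
from concurrent.futures import ThreadPoolExecutor, wait


class HypotheticalAsyncIndex:
    """Stand-in for a library whose query call returns a future immediately."""

    def __init__(self, vectors, workers=4):
        self._vectors = vectors
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def _search(self, q, k):
        # Exact search by squared L2 distance; a real library would do ANN here.
        order = sorted(
            range(len(self._vectors)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(self._vectors[i], q)),
        )
        return order[:k]

    def query_async(self, q, k):
        # Hypothetical asynchronous entry point: returns without blocking.
        return self._pool.submit(self._search, q, k)

    def close(self):
        self._pool.shutdown(wait=True)


class BatchAdapter:
    """Exposes the asynchronous index through a blocking batch interface."""

    def __init__(self, index):
        self._index = index
        self._results = None

    def batch_query(self, queries, k):
        futures = [self._index.query_async(q, k) for q in queries]
        wait(futures)  # block until every outstanding query has completed
        self._results = [f.result() for f in futures]

    def get_batch_results(self):
        return self._results


if __name__ == "__main__":
    index = HypotheticalAsyncIndex([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
    adapter = BatchAdapter(index)
    adapter.batch_query([[0.1, 0.0], [4.0, 4.5]], k=2)
    print(adapter.get_batch_results())  # [[0, 1], [2, 1]]
    index.close()
```

The design choice in this sketch is that the benchmark only ever sees a synchronous call, while the parallelism stays inside the wrapped index, so timing the whole batch captures the asynchronous work.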
Thanks for making such a strong case, @mesibo. I'd be very interested to see the work that you are doing and how the current framework stands in your way to achieve the best performance for your implementation. I'm not entirely sure I understand what you mean with
Batch-mode will bypass the The only part where we are using a multiprocessing queue is our implementation of
Batch-mode should work, the concern was memory, but as we’ve learned,
Currently, ANN-Benchmarks enforces single-CPU execution during experiments, disabling multi-threading at the hardware level (AWS single-CPU mode), so even libraries that support multi-core or multi-threaded execution cannot use it. Batch mode exists, but it does not address the potential performance benefits that multi-threading could offer.

Some ANN libraries support multi-core or multi-threaded processing to improve performance, and a single-threaded benchmark is unlikely to reflect their true potential in a production environment. Most end users are likely to run ANN on multi-core CPUs, and they should not end up choosing the wrong implementation based on single-CPU benchmarks. Running and publishing both single- and multi-threaded benchmarks would give users more realistic numbers that match a production setup.
We request that you consider adding multi-threaded benchmarks alongside the existing single-threaded tests.
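As a rough illustration of the kind of side-by-side measurement being requested, here is a minimal sketch that runs the same query set with one worker and with several, and reports queries per second for each. It is not the framework's runner: `brute_force_query`, `make_vectors`, and `measure_qps` are hypothetical helpers, and the exact search is only a stand-in for a real library call.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor


def make_vectors(n, dim, seed):
    """Generate n deterministic pseudo-random vectors (illustrative data only)."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(dim)] for _ in range(n)]


def brute_force_query(index, q, k):
    """Hypothetical stand-in for a library query: exact k-NN by squared L2."""
    dists = [(sum((a - b) ** 2 for a, b in zip(v, q)), i) for i, v in enumerate(index)]
    dists.sort()
    return [i for _, i in dists[:k]]


def measure_qps(index, queries, k, n_threads):
    """Run all queries with n_threads workers; return (results, queries/second)."""
    start = time.perf_counter()
    if n_threads == 1:
        results = [brute_force_query(index, q, k) for q in queries]
    else:
        with ThreadPoolExecutor(max_workers=n_threads) as pool:
            # pool.map preserves input order, so results line up with queries.
            results = list(pool.map(lambda q: brute_force_query(index, q, k), queries))
    elapsed = max(time.perf_counter() - start, 1e-9)
    return results, len(queries) / elapsed


if __name__ == "__main__":
    index = make_vectors(200, 8, seed=0)
    queries = make_vectors(20, 8, seed=1)
    for n_threads in (1, 4):
        _, qps = measure_qps(index, queries, k=10, n_threads=n_threads)
        print(f"threads={n_threads} qps={qps:.1f}")
```

Note that in standard CPython the pure-Python stand-in above is serialized by the global interpreter lock, so threads are not expected to speed it up; a difference would only show with a native library that releases the lock or parallelizes internally, and only when the benchmark environment exposes more than one CPU.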
We're happy to contribute to the implementation if needed and have already made some changes for local benchmarking to test ANN in multi-threaded mode.
Looking forward to your thoughts on this.