Releases: triton-inference-server/model_analyzer
Release 1.45.0 corresponding to NGC container 24.10
Update README for r24.10 (#939)
Release 1.44.0 corresponding to NGC container 24.09
Update README and branch name for r24.09 (#934)
Release 1.43.0 corresponding to NGC container 24.08
v1.43.0: Update README.md for the 24.08 release (#928)
Release 1.42.0 corresponding to NGC container 24.07
New Features and Improvements
- Optuna search mode
- Uses a hyperparameter optimization framework (Optuna) to search over any parameter that can be specified in the model configuration
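As a rough sketch, Optuna mode would be selected through Model Analyzer's search-mode setting in a profile config. The key names below are assumptions based on the option described above; verify them against the 24.07 documentation before use:

```yaml
# Hypothetical Model Analyzer profile config (key names assumed)
model_repository: /path/to/model_repository
profile_models: my_model
run_config_search_mode: optuna   # assumed value enabling the Optuna search mode
```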
Release 1.41.0 corresponding to NGC container 24.06
v1.41.0 Update README.md for 24.06 (#905)
Release 1.40.0 corresponding to NGC container 24.05
Update README and versions for 1.40.0 / 24.05 (#883)
Release 1.39.0 corresponding to NGC container 24.04
New Features and Improvements
- Model Analyzer now supports profiling Large Language Models (LLMs) using GenAI-Perf
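A minimal config for LLM profiling might look like the following. The `model_type` key is an assumption about how the GenAI-Perf path is selected; consult the 24.04 release documentation for the actual option name:

```yaml
# Hypothetical sketch of LLM profiling via GenAI-Perf (key names assumed)
model_repository: /path/to/model_repository
profile_models: my_llm
model_type: LLM   # assumed switch that routes profiling through GenAI-Perf
```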
Release 1.38.0 corresponding to NGC container 24.03
v1.38.0: Update README.md for 24.03 (#848)
Release 1.37.0 corresponding to NGC container 24.02
v1.37.0 Update README.md for 24.02 (#830)
Release 1.36.0 corresponding to NGC container 24.01
New Features and Improvements
- Model Analyzer now correctly loads and optimizes ensemble models
- Model Analyzer now correctly works with SSL via gRPC
- Model Analyzer can now optimize a model served by a remote Triton server without requiring a local GPU
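The remote and gRPC/SSL improvements above could be combined in a config along these lines. This is a sketch under stated assumptions: the endpoint and launch-mode keys mirror Model Analyzer's existing remote-mode options, and the SSL flag is assumed to be passed through to Perf Analyzer; check the 24.01 docs for exact names:

```yaml
# Hypothetical sketch: profiling against a remote Triton server over gRPC
triton_launch_mode: remote            # no local Triton (or GPU) required
triton_grpc_endpoint: triton-host:8001
client_protocol: grpc
profile_models: my_model
perf_analyzer_flags:
  ssl-grpc-use-ssl: true              # assumed pass-through of Perf Analyzer's gRPC SSL option
```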