v15.4: latest TensorRT library
## TRT
- Upgraded to TensorRT 10.4.0.
## General
- Upgraded to CUDA 12.6.0.
## vsmlrt.py
- Added support for Ani4K-v2 model by @srk24 in #105
- Added support for RIFE v4.23 and v4.24 models.
- Added `max_tactics` option to the `TRT` backend, which can reduce engine build time by limiting the number of tactics to time (see the sketch below). By default, TensorRT determines the number of tactics based on its own heuristic.
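A minimal sketch of the new option; the model path and the value `4` are placeholders for illustration, and omitting `max_tactics` keeps TensorRT's default heuristic:

```python
import vapoursynth as vs
import vsmlrt

core = vs.core

# Placeholder input; any RGBS clip works here.
clip = core.std.BlankClip(format=vs.RGBS, width=1920, height=1080)

# max_tactics caps how many tactics TensorRT benchmarks while building
# the engine: a smaller cap builds faster but may pick a slower kernel.
# The value 4 is an arbitrary example, not a recommendation.
backend = vsmlrt.Backend.TRT(fp16=True, max_tactics=4)

# "model.onnx" is a placeholder path to any supported onnx model.
output = vsmlrt.inference(clip, "model.onnx", backend=backend)
```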
## Batch Inference (Preview)
The latest vsmlrt.py (not included in v15.4) provides experimental support for batch inference via the `batch_size` option in `inference()` and `flexible_inference()`, which may improve device utilization for inference on small inputs with some small models.

This feature requires the flexible output support introduced in vs-mlrt v15 and is inspired by styler00dollar/VSGAN-tensorrt-docker@ac47012. Note that not all ONNX models are supported.
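A minimal sketch of the experimental option, assuming a vsmlrt.py newer than v15.4 and a batching-compatible model (the model path is a placeholder):

```python
import vapoursynth as vs
import vsmlrt

core = vs.core

# Small input, matching the benchmark setup below.
clip = core.std.BlankClip(format=vs.RGBS, width=720, height=480)

# batch_size=2 processes two frames per inference call, which can raise
# device utilization for small models on small inputs. Requires an onnx
# model whose graph is compatible with batching.
output = vsmlrt.inference(
    clip,
    "realesrgan_compact.onnx",  # placeholder model path
    backend=vsmlrt.Backend.TRT(fp16=True, use_cuda_graph=True, num_streams=2),
    batch_size=2,
)
```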
Preliminary benchmark:
- NVIDIA GeForce RTX 4090
- driver 560.94
- Windows Server 2019
- python 3.12.6, vapoursynth-classic R57.A10
- input: 720x480 RGBS
- backend: `TRT(fp16=True, use_cuda_graph=True)`
Measurements: FPS / Device Memory (MB)
| model | batch 1 | batch 2 |
|---|---|---|
| realesrgan compact (1 stream) | 73.01 / 708 | 138.68 / 950 |
| realesrgan compact (2 streams) | 107.81 / 914 | 263.87 / 1347 |
| realesrgan compact (3 streams) | 108.30 / 1128 | 348.23 / 1738 |
| realesrgan ultracompact (1 stream) | 99.43 / 702 | 165.52 / 950 |
| realesrgan ultracompact (2 streams) | 184.48 / 908 | 302.56 / 1344 |
| realesrgan ultracompact (3 streams) | 184.69 / 1114 | 458.18 / 1738 |
Full Changelog: v15.3...v15.4