WolframRhodium edited this page Jan 20, 2022 · 41 revisions

Welcome to the vs-mlrt wiki!

The goal of the project is to provide a highly optimized AI inference runtime for VapourSynth.

Runtimes

  • vs-ov: OpenVINO-based pure CPU AI inference runtime
  • vs-ort: ONNX Runtime-based CPU/CUDA AI inference runtime
  • vs-trt: TensorRT-based CUDA AI inference runtime

Models

The following models are available:

Benchmarking

DPIR

Model Input Size Speed (fps) VRAM Usage (MB) Backend
drunet_gray 1920x1080 2.45 5188 ort-cuda[1]
drunet_gray 1920x1080 5.20 3018 ort-cuda[1], fp16
drunet_gray 1920x1080 2.55 3791 trt[1]
drunet_gray 1920x1080 7.33 2429 trt[1], fp16
drunet_gray 1920x1080 2.57 6795 trt[1], 2 streams
drunet_gray 1920x1080 7.74 4075 trt[1], fp16, 2 streams
drunet_gray 1920x1080 2.27 11552 pytorch[2]
drunet_gray 1920x1080 5.10 11024 pytorch[2], fp16
drunet_gray 1920x1080 2.45 14853 pytorch[2], trt
drunet_gray 1920x1080 6.90 13565 pytorch[2], trt, fp16
drunet_gray 1920x1080 2.34 5791 ort-cuda[9]
drunet_gray 1920x1080 3.73 3621 ort-cuda[9], fp16
drunet_gray 1920x1080 2.62 4011 trt[9]
drunet_gray 1920x1080 5.93 2911 trt[9], fp16
drunet_gray 1920x1080 6.11 2915 trt[9], fp16, use graph
drunet_gray 1920x1080 6.67 3437 trt[*], fp16
drunet_gray 1920x1080 2.20 11837 pytorch[10]
drunet_gray 1920x1080 3.72 11583 pytorch[10], fp16
drunet_gray 1920x1080 2.67 4189 pytorch[10], trt
drunet_gray 1920x1080 6.17 4079 pytorch[10], trt, fp16
drunet_color 1920x1080 2.39 5220 ort-cuda[1]
drunet_color 1920x1080 4.95 3058 ort-cuda[1], fp16
drunet_color 1920x1080 2.48 3983 trt[1]
drunet_color 1920x1080 6.95 2457 trt[1], fp16
drunet_color 1920x1080 2.56 7187 trt[1], 2 streams
drunet_color 1920x1080 7.70 4135 trt[1], fp16, 2 streams
drunet_color 1920x1080 2.12 11558 pytorch[2]
drunet_color 1920x1080 4.29 11302 pytorch[2], fp16
drunet_color 1920x1080 2.26 14879 pytorch[2], trt
drunet_color 1920x1080 5.60 13575 pytorch[2], trt, fp16
drunet_color 1920x1080 2.29 5823 ort-cuda[9]
drunet_color 1920x1080 3.65 3661 ort-cuda[9], fp16
drunet_color 1920x1080 2.58 4043 trt[9]
drunet_color 1920x1080 5.57 2941 trt[9], fp16
drunet_color 1920x1080 2.12 11853 pytorch[10]
drunet_color 1920x1080 3.45 11597 pytorch[10], fp16
drunet_color 1920x1080 2.54 4209 pytorch[10], trt
drunet_color 1920x1080 5.25 4103 pytorch[10], trt, fp16
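
To compare backends, it can help to turn the rows above into ratios. The following is a minimal sketch that copies four `drunet_gray` rows for setup [1] from the table and computes the fp16 speedup and the fraction of VRAM saved. The `Row` type and the `fp16_gain` helper are illustrative names, not part of vs-mlrt.

```python
from typing import NamedTuple, Tuple


class Row(NamedTuple):
    backend: str
    fps: float
    vram_mb: int


# Rows copied from the DPIR table (drunet_gray, 1920x1080, setup [1]).
ROWS = {
    "ort-cuda": Row("ort-cuda[1]", 2.45, 5188),
    "ort-cuda fp16": Row("ort-cuda[1], fp16", 5.20, 3018),
    "trt": Row("trt[1]", 2.55, 3791),
    "trt fp16": Row("trt[1], fp16", 7.33, 2429),
}


def fp16_gain(fp32: Row, fp16: Row) -> Tuple[float, float]:
    """Return (speedup factor, fraction of VRAM saved) of fp16 over fp32."""
    speedup = fp16.fps / fp32.fps
    vram_saved = 1 - fp16.vram_mb / fp32.vram_mb
    return speedup, vram_saved


for name in ("ort-cuda", "trt"):
    speedup, saved = fp16_gain(ROWS[name], ROWS[f"{name} fp16"])
    print(f"{name}: {speedup:.2f}x faster, {saved:.0%} less VRAM with fp16")
```

On these rows, fp16 gives roughly 2.12x the throughput on ort-cuda and roughly 2.87x on trt, while using about 42% and 36% less VRAM respectively.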

Waifu2x

Model Input Size Speed (fps) Device RAM Usage (MB) Backend
upconv7 1920x1080 5.98 5065 ort-cuda[1]
upconv7 1920x1080 10.4 5189 ort-cuda[1], fp16
upconv7 1920x1080 6.54 5031 trt[1]
upconv7 1920x1080 13.7 3041 trt[1], fp16
upconv7 1920x1080 8.47 9283 trt[1], 2 streams
upconv7 1920x1080 25.7 5303 trt[1], fp16, 2 streams
upconv7 1920x1080 5.66 3355 ort-cuda[1], 540p patch
upconv7 1920x1080 1.63 3248 caffe[3], 540p patch
upconv7 1920x1080 1.14 15547 ov-cpu[4]
upconv7 1920x1080 0.37 8612 ov-cpu[5]
upconv7 1920x1080 6.94 9765 ort-cuda[9]
upconv7 1920x1080 9.66 6049 ort-cuda[9], fp16
upconv7 1920x1080 7.71 5513 trt[9]
upconv7 1920x1080 15.8 3855 trt[9], fp16
upconv7 1920x1080 8.55 9759 trt[9], 2 streams
upconv7 1920x1080 19.5 6093 trt[9], fp16, 2 streams
upresnet10 1920x1080 4.36 5061 ort-cuda[1]
upresnet10 1920x1080 6.43 5059 ort-cuda[1], fp16
upresnet10 1920x1080 4.27 1879 ort-cuda[1], 540p patch
upresnet10 1920x1080 1.54 7232 caffe[3], 540p patch
upresnet10 1920x1080 1.27 7245 ov-cpu[4]
upresnet10 1920x1080 0.44 7143 ov-cpu[5]
upresnet10 1920x1080 3.90 5665 ort-cuda[9]
upresnet10 1920x1080 6.53 5663 ort-cuda[9], fp16
cunet 1920x1080 2.58 9155 ort-cuda[1]
cunet 1920x1080 4.10 9535 ort-cuda[1], fp16
cunet 1920x1080 2.48 4955 ort-cuda[1], 540p patch
cunet 1920x1080 1.11 11657 caffe[3], 540p patch
cunet 1920x1080 0.57 10943 ov-cpu[4]
cunet 1920x1080 0.23 10943 ov-cpu[5]
cunet 1920x1080 2.17 18469 ort-cuda[9]
cunet 1920x1080 3.22 10017 ort-cuda[9], fp16
anime rgb 1920x1080 0.62 15578 ov-cpu[4]
anime rgb 1920x1080 0.21 15439 ov-cpu[5]
anime rgb 1920x1080 0.048 1145 w2xc[6]
anime rgb 1920x1080 0.039 1183 w2xc[7]
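
Several rows above are labelled "540p patch". Assuming this means the 1920x1080 input is split into 960x540 tiles that are inferred one at a time (an interpretation, not something this page states), the tile layout can be sketched as follows. `tile_boxes` is a hypothetical helper, and it ignores the overlap or padding that tiled inference would normally add between tiles.

```python
from typing import Iterator, Tuple

Box = Tuple[int, int, int, int]  # (left, top, width, height)


def tile_boxes(width: int, height: int, tile_w: int, tile_h: int) -> Iterator[Box]:
    """Yield non-overlapping tiles covering a width x height frame, row by row.

    Tiles at the right and bottom edges are clipped to the frame.
    """
    for top in range(0, height, tile_h):
        for left in range(0, width, tile_w):
            yield (left, top, min(tile_w, width - left), min(tile_h, height - top))


# Assumed reading of "540p patch": 960x540 tiles over a 1920x1080 frame.
boxes = list(tile_boxes(1920, 1080, 960, 540))
for box in boxes:
    print(box)
```

Under this assumption each tile holds a quarter of the frame's pixels, which is consistent with the lower memory figures in the "540p patch" rows (for example, 3355 MB against 5065 MB for upconv7 on ort-cuda[1]) at a small cost in speed (5.66 fps against 5.98 fps).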

RealESRGANv2

Model Input Size Speed (fps) Device RAM Usage (MB) Backend
animevideo-xsx2 1920x1080 5.27 2213 ort-cuda[1]
animevideo-xsx2 1920x1080 5.98 1811 trt[1]
animevideo-xsx2 1920x1080 11.0 1291 trt[1], fp16
animevideo-xsx2 1920x1080 6.70 2971 trt[1], 2 streams
animevideo-xsx2 1920x1080 17.2 1933 trt[1], fp16, 2 streams
animevideo-xsx2 1920x1080 3.64 6799 pytorch[2]
animevideo-xsx2 1920x1080 4.72 4291 pytorch[2], fp16
animevideo-xsx2 1920x1080 1.48 5239 ov-cpu[4]
animevideo-xsx2 1920x1080 0.42 5201 ov-cpu[5]
animevideo-xsx2 1920x1080 0.064 2883 pytorch[8]
animevideo-xsx2 1920x1080 4.15 2817 ort-cuda[9]
animevideo-xsx2 1920x1080 4.78 1941 trt[9]
animevideo-xsx2 1920x1080 10.3 1475 trt[9], fp16
animevideo-xsx2 1920x1080 5.02 3093 trt[9], 2 streams
animevideo-xsx2 1920x1080 10.8 2163 trt[9], fp16, 2 streams
animevideo-xsx2 1920x1080 3.47 7075 pytorch[10]
animevideo-xsx2 1920x1080 4.90 4585 pytorch[10], fp16
  1. VapourSynth R57, Tesla V100, Windows Server 2019, Graphics Driver 511.23, vs-mlrt v5
  2. VapourSynth R57, Tesla V100, Windows Server 2019, Graphics Driver 511.23, vs-dpir v1.7.1, vs-realesrgan v2.0.0, PyTorch 1.10.1+cu113, TensorRT 8.2.2, torch2trt 2732b35
  3. VapourSynth R57, Tesla V100, Windows Server 2019, Graphics Driver 511.23, VapourSynth-Waifu2x-caffe r14
  4. VapourSynth R57, Icelake Server 32C64T @2.90 GHz, Windows Server 2019, vs-mlrt v5
  5. VapourSynth R57, EPYC Milan 16C32T @2.55 GHz, Windows Server 2019, vs-mlrt v5
  6. VapourSynth R57, Icelake Server 32C64T @2.90 GHz, Windows Server 2019, VapourSynth-Waifu2x-w2xc r8
  7. VapourSynth R57, EPYC Milan 16C32T @2.55 GHz, Windows Server 2019, VapourSynth-Waifu2x-w2xc r8
  8. VapourSynth R57, Icelake Server 32C64T @2.90 GHz, Windows Server 2019, vs-dpir v1.7.1, PyTorch 1.10.1, NumPy 1.21.5+mkl
  9. VapourSynth R57, Tesla A10, Windows Server 2019, Graphics Driver 511.23, vs-mlrt v5
  10. VapourSynth R57, Tesla A10, Windows Server 2019, Graphics Driver 511.23, vs-dpir v1.7.1, vs-realesrgan v2.0.0, PyTorch 1.10.1+cu113, TensorRT 8.2.2, torch2trt 2732b35

*: VapourSynth R57, Tesla A10, Windows Server 2019, Graphics Driver 511.23, vs-mlrt 343be9e (TRT.static_shape=True)
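
The bracketed key in each Backend cell refers to one of the setups listed above, and the remaining comma-separated items are options such as `fp16` or `2 streams`. When working with these tables programmatically, a cell can be split into those parts. This is only a sketch: `BackendLabel` and `parse_backend` are hypothetical names, and the pattern assumes every cell follows the `runtime[key], option, ...` shape used on this page.

```python
import re
from typing import List, NamedTuple


class BackendLabel(NamedTuple):
    runtime: str        # e.g. "trt", "ort-cuda", "ov-cpu"
    setup: str          # footnote key: a number, or "*"
    options: List[str]  # remaining flags, e.g. ["fp16", "2 streams"]


# Runtime name followed by a footnote key in square brackets.
_LABEL = re.compile(r"^(?P<runtime>[\w-]+)\[(?P<setup>\d+|\*)\]$")


def parse_backend(label: str) -> BackendLabel:
    """Split a Backend cell such as 'trt[9], fp16, 2 streams' into its parts."""
    head, *options = [part.strip() for part in label.split(",")]
    match = _LABEL.match(head)
    if match is None:
        raise ValueError(f"unrecognised backend label: {label!r}")
    return BackendLabel(match["runtime"], match["setup"], options)


print(parse_backend("trt[9], fp16, 2 streams"))
print(parse_backend("trt[*], fp16"))
```

Grouping rows by `setup` keeps comparisons on the same hardware, since speeds measured on different setups are not directly comparable.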