v9 Major release: Intel GPU support & much more
This is a major release.
- Added support for Intel GPUs (both discrete [Xe Arc series] and integrated [Gen 8+ on Broadwell+]):
  - In `vsmlrt.py`, this corresponds to the `OV_GPU` backend.
  - The openvino library is now dynamically linked because of the integration of oneDNN for GPU.
- Added support for `RealESRGANv3` and `cugan-pro` models.
- Upgraded the CUDA toolkit to 11.7.0, TensorRT to 8.4.1 and cuDNN to 8.4.1. It is now possible to build TRT engines for `CUGAN`, waifu2x `cunet` and `upresnet10` models on RTX 2000 and RTX 3000 series GPUs.
- The trt backend in the `vsmlrt.py` wrapper now creates a log file for `trtexec` output in the TEMP directory (this only works when using the bundled `trtexec.exe`). The log file is only retained if `trtexec` fails, and the vsmlrt exception message will include the full path of the log file. To send the log to a specific file, set the environment variable `TRTEXEC_LOG_FILE` to its absolute path. To disable this behavior, set `log=False` when creating the backend (e.g. `vsmlrt.Backend.TRT(log=False)`).
- The cuda bundles now include VC runtime DLLs, so `trtexec.exe` should run even on systems without the proper VC runtime redistributable packages installed (e.g. freshly installed Windows).
- The ov backend can now configure model compilation via `config`. Available configurations can be found here.
  - Example: `core.ov.Model(..., config = lambda: dict(CPU_THROUGHPUT_STREAMS=core.num_threads, CPU_BIND_THREAD="NO"))`

    This configuration may be useful for improving processor utilization at the expense of significantly increased memory consumption (only try this if you have a huge number of cores underutilized by the default settings). The equivalent form for the Python wrapper is `backend = vsmlrt.Backend.OV_CPU(num_streams=core.num_threads, bind_thread=False)`.
- The `vsmlrt.py` wrapper no longer creates temporary onnx files (e.g. when using non-default `alpha` CUGAN parameters). Instead, the modified ONNX network is passed directly into the various ML runtime filters. Those filters now support `(network_path=b'raw onnx protobuf serialization', path_is_serialization=True)` for this. This feature also opens the door to generating ONNX on the fly (ever dreamed of GPU-accelerated 2D convolution or `std.Expr`?)
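The trt log-file behavior above amounts to a small selection rule. Here is a minimal sketch of that rule, assuming only the `TRTEXEC_LOG_FILE` variable and the TEMP-directory default described above (`trtexec_log_path` is a hypothetical helper for illustration, not part of vsmlrt):

```python
import os
import tempfile

def trtexec_log_path(env=None):
    """Pick the trtexec log destination (illustrative only).

    If TRTEXEC_LOG_FILE is set, that absolute path wins; otherwise
    the log goes to a file in the TEMP directory, matching the
    default behavior described in these notes.
    """
    if env is None:
        env = dict(os.environ)
    custom = env.get("TRTEXEC_LOG_FILE")
    if custom:
        return custom
    return os.path.join(tempfile.gettempdir(), "trtexec.log")
```

Remember that the file is only retained when `trtexec` fails; on success it is cleaned up.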
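The `path_is_serialization` convention can be pictured as a dispatch on the argument type: a `str` names an ONNX file on disk, while `bytes` carry a raw ONNX protobuf serialization passed in memory. A sketch only (`make_network_args` is a hypothetical helper, not a vsmlrt API):

```python
def make_network_args(network):
    """Build keyword arguments for a runtime filter call (sketch).

    bytes -> raw ONNX protobuf serialization, passed directly
    str   -> path to an ONNX file on disk
    """
    if isinstance(network, (bytes, bytearray)):
        return dict(network_path=bytes(network), path_is_serialization=True)
    return dict(network_path=network, path_is_serialization=False)
```

A wrapper could then call, for example, `core.trt.Model(clip, **make_network_args(onnx_bytes))` without ever touching the filesystem.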
Update Instructions
- Delete the previous `vsmlrt-cuda`, `vsov`, `vsort` and `vstrt` directories, as well as `vsov.dll`, `vsort.dll` and `vstrt.dll`, from your VS plugins directory, and then extract the newly released files. (Specifically, do not leave files from the previous version in place and simply overwrite with the new release, as the new release might have removed some files in those four directories.)
- Replace `vsmlrt.py` in your Python package directory.
- Update the `models` directories by overwriting them with the new release. (Models are generally append-only. We will make special notices and bump the model release tag if we change any of the previously released models.)
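The deletion step above can be sketched as a short shell sequence. This assumes a POSIX shell and that `PLUGINS` points at your VS plugins directory (both are assumptions; the actual path depends on your installation, and on Windows you may prefer to do this in Explorer):

```shell
# Adjust PLUGINS to your actual VS plugins directory (assumed path).
PLUGINS="${PLUGINS:-$HOME/vapoursynth/plugins}"

# Remove the four old directories entirely (do not merely overwrite).
rm -rf "$PLUGINS/vsmlrt-cuda" "$PLUGINS/vsov" "$PLUGINS/vsort" "$PLUGINS/vstrt"

# Remove the old plugin DLLs.
rm -f "$PLUGINS/vsov.dll" "$PLUGINS/vsort.dll" "$PLUGINS/vstrt.dll"

# Then extract the new release archive into "$PLUGINS".
```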
Compatibility Notes
`vsmlrt.py` in this release is not compatible with the binaries from previous releases; only script-level compatibility is maintained. Generally, please make sure to upgrade the filters and `vsmlrt.py` as a whole.
We strive to maintain script source-level compatibility as much as possible (i.e. there won't be a great api4-style breakage), which means scripts written for v7 (for example) will continue to function for the foreseeable future. Minor issues (like the non-monotonic denoise setting of cugan) will be documented instead of fixed with a breaking change.
Known issue
`CUGAN(version=2)` (a.k.a. cugan-pro) may produce a blank clip when using the `ORT_CUDA(fp16)` backend. This is fixed in the v10 release.
Full Changelog: v8...v9