v9 Major release: Intel GPU support & much more
This is a major release.
- Added support for Intel GPUs (both discrete [Xe Arc series] and integrated [Gen 8+ on Broadwell+]):
  - In `vsmlrt.py`, this corresponds to the `OV_GPU` backend.
  - The openvino library is now dynamically linked because of the integration of oneDNN for GPU.
- Added support for `RealESRGANv3` and `cugan-pro` models.
- Upgraded the CUDA toolkit to 11.7.0, TensorRT to 8.4.1 and cuDNN to 8.4.1. It is now possible to build TRT engines for `CUGAN`, waifu2x `cunet` and `upresnet10` models on RTX 2000 and RTX 3000 series GPUs.
- The trt backend in the `vsmlrt.py` wrapper now creates a log file for `trtexec` output in the TEMP directory (this only works when using the bundled `trtexec.exe`). The log file is only retained if `trtexec` fails, and the vsmlrt exception message will include the full path of the log file. To send the log to a specific file, set the environment variable `TRTEXEC_LOG_FILE` to its absolute path. To disable this behavior, set `log=False` when creating the backend (e.g. `vsmlrt.Backend.TRT(log=False)`).
- The cuda bundles now include VC runtime DLLs, so `trtexec.exe` should run even on systems without the proper VC runtime redistributable packages installed (e.g. freshly installed Windows).
- The ov backend can now configure model compilation via `config`. Available configurations can be found here.
  - Example: `core.ov.Model(..., config = lambda: dict(CPU_THROUGHPUT_STREAMS=core.num_threads, CPU_BIND_THREAD="NO"))`

    This configuration may be useful for improving processor utilization at the expense of significantly increased memory consumption (only try this if you have a huge number of cores underutilized by the default settings). The equivalent form for the Python wrapper is `backend = vsmlrt.Backend.OV_CPU(num_streams=core.num_threads, bind_thread=False)`.
- The `vsmlrt.py` wrapper no longer creates temporary onnx files (e.g. when using non-default `alpha` CUGAN parameters). Instead, the modified ONNX network is passed directly into the various ML runtime filters. Those filters now support `(network_path=b'raw onnx protobuf serialization', path_is_serialization=True)` for this. This feature also opens the door to generating ONNX on the fly (ever dreamed of GPU-accelerated 2D convolution or `std.Expr`?)
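The trt log-file behavior above amounts to a small selection rule. Here is a minimal sketch of that rule, assuming only the `TRTEXEC_LOG_FILE` variable and the TEMP-directory default described above (`trtexec_log_path` is a hypothetical helper for illustration, not part of vsmlrt):

```python
import os
import tempfile

def trtexec_log_path(env=None):
    """Pick the trtexec log destination (illustrative only).

    If TRTEXEC_LOG_FILE is set, that absolute path wins; otherwise
    the log goes to a file in the TEMP directory, matching the
    default behavior described in these notes.
    """
    if env is None:
        env = dict(os.environ)
    custom = env.get("TRTEXEC_LOG_FILE")
    if custom:
        return custom
    return os.path.join(tempfile.gettempdir(), "trtexec.log")
```

Remember that the file is only retained when `trtexec` fails; on success it is cleaned up.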
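The `path_is_serialization` convention can be pictured as a dispatch on the argument type: a `str` names an ONNX file on disk, while `bytes` carry a raw ONNX protobuf serialization passed in memory. A sketch only (`make_network_args` is a hypothetical helper, not a vsmlrt API):

```python
def make_network_args(network):
    """Build keyword arguments for a runtime filter call (sketch).

    bytes -> raw ONNX protobuf serialization, passed directly
    str   -> path to an ONNX file on disk
    """
    if isinstance(network, (bytes, bytearray)):
        return dict(network_path=bytes(network), path_is_serialization=True)
    return dict(network_path=network, path_is_serialization=False)
```

A wrapper could then call, for example, `core.trt.Model(clip, **make_network_args(onnx_bytes))` without ever touching the filesystem.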
Update Instructions
- Delete the previous `vsmlrt-cuda`, `vsov`, `vsort` and `vstrt` directories, as well as `vsov.dll`, `vsort.dll` and `vstrt.dll`, from your VS plugins directory, and then extract the newly released files. (Specifically, do not leave files from the previous version in place and simply overwrite with the new release, as the new release might have removed some files in those four directories.)
- Replace `vsmlrt.py` in your Python package directory.
- Update the `models` directories by overwriting them with the new release. (Models are generally append-only. We will make special notices and bump the model release tag if we change any of the previously released models.)
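The deletion step above can be sketched as a short shell sequence. This assumes a POSIX shell and that `PLUGINS` points at your VS plugins directory (both are assumptions; the actual path depends on your installation, and on Windows you may prefer to do this in Explorer):

```shell
# Adjust PLUGINS to your actual VS plugins directory (assumed path).
PLUGINS="${PLUGINS:-$HOME/vapoursynth/plugins}"

# Remove the four old directories entirely (do not merely overwrite).
rm -rf "$PLUGINS/vsmlrt-cuda" "$PLUGINS/vsov" "$PLUGINS/vsort" "$PLUGINS/vstrt"

# Remove the old plugin DLLs.
rm -f "$PLUGINS/vsov.dll" "$PLUGINS/vsort.dll" "$PLUGINS/vstrt.dll"

# Then extract the new release archive into "$PLUGINS".
```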
Compatibility Notes
`vsmlrt.py` in this release is not compatible with the binaries from previous releases; only script-level compatibility is maintained. Generally, please make sure to upgrade the filters and `vsmlrt.py` as a whole.
We strive to maintain script source-level compatibility as much as possible (i.e. there won't be a great api4-style breakage), which means scripts written for v7 (for example) will continue to function for the foreseeable future. Minor issues (like the non-monotonic denoise setting of cugan) will be documented instead of fixed with a breaking change.
Known issue
`CUGAN(version=2)` (a.k.a. cugan-pro) may produce a blank clip when using the `ORT_CUDA(fp16)` backend. This is fixed in the v10 release.
Full Changelog: v8...v9