-
If tunings like increasing This problem is specific to RIFE in vs-mlrt, and I will try to solve it. Thanks for the information! Anyway, for advanced usage, it is possible to build a single TensorRT model for multiple video resolutions by setting parameters
-
I need to pass this on to the SVP team, as I don't know where to check parameters such as num_threads. But I can confirm that I run the 4090 in a PCIe 4.0 x16 slot, and another user confirms my framerate.
-
We need additional information to determine why the performance degrades. It would help if you could provide profiling information such as Nsight Systems data. The installer can be downloaded without registration from the CUDA Toolkit.
-
2160p video has 4x as much data as 1080p. If your system can do 115 fps at 1080p, then the best it can reach at 2160p is 115/4 fps, which is 28.75 fps. You also need to consider the cost of data movement: to sustain 22 fps at 4K, the GPU requires 7.74 GB of data from system DRAM and outputs 2.11 GB. How much DRAM bandwidth do you have in your system? Even if it is not limited by PCIe bandwidth, it will very likely hit the DRAM bandwidth ceiling. Remember that RIFE is not the only thing that uses DRAM bandwidth; the rest of the VS script and SVP consume it as well.
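The scaling argument above can be written out as a quick calculation. This is only a back-of-the-envelope sketch: it assumes, as the comment does, that throughput scales inversely with pixel count, and it uses the 115 fps and 22 fps figures reported in this thread. The variable names are illustrative.

```python
# Rough scaling estimate, assuming fps is inversely proportional to pixel count.

def pixels(width, height):
    """Number of pixels in one frame."""
    return width * height

# 2160p has four times as many pixels as 1080p.
scale = pixels(3840, 2160) / pixels(1920, 1080)

fps_1080p = 115                          # figure reported by the original poster
fps_2160p_upper_bound = fps_1080p / scale

# One 3840x2160 RGBS frame: 3 planes of 32-bit (4-byte) floats.
bytes_per_rgbs_frame = pixels(3840, 2160) * 3 * 4

# How close the reported 22 fps is to the estimated ceiling.
measured_fps_2160p = 22
fraction_of_bound = measured_fps_2160p / fps_2160p_upper_bound

print(f"scale factor:        {scale}")
print(f"2160p upper bound:   {fps_2160p_upper_bound} fps")
print(f"bytes per RGBS frame: {bytes_per_rgbs_frame}")
print(f"measured / bound:    {fraction_of_bound:.3f}")
```

By this estimate the reported 22 fps is already about three quarters of the ceiling, so any remaining headroom at 4K is limited even before data movement is taken into account.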
-
Launch a new SVP process by specifying the path to the SVP executable in Nsight Systems, and then transcode for a short period of time. Collection of GPU metrics should be enabled.
-
Based on my personal experience, mpv's vapoursynth filter is not fast enough. Even the simplest script makes 4K video playback laggy on my i7-10875H CPU.
-
I'm using a 5900X + 3070 Ti and got so it's clear that OP's setup has problems.
-
Could you please run a baseline benchmark? test.vpy (run with

```python
import vapoursynth as vs
from vsmlrt import RIFE, Backend, RIFEModel

core = vs.core

num_streams = 2
use_cuda_graph = True
peak_performance = True

if peak_performance:
    # Feed RGBS frames directly, already at 3840x2176, so no conversion work is measured.
    src = core.std.BlankClip(width=3840, height=2176, length=2000, format=vs.RGBS, color=[0.5]*3)
else:
    # Start from a YUV420P8 source and add scene detection, RGB conversion and padding.
    src = core.std.BlankClip(width=3840, height=2160, length=2000, format=vs.YUV420P8, color=[128]*3)
    src = core.misc.SCDetect(src)
    src = core.resize.Bicubic(src, format=vs.RGBS, matrix_in_s="709")
    # Pad 16 rows at the bottom: 2160 -> 2176.
    src = core.std.AddBorders(src, 0, 0, 0, 16)

backend = Backend.TRT(fp16=True, num_streams=num_streams, use_cuda_graph=use_cuda_graph)
flt = RIFE(src, multi=2, model=RIFEModel.v4_6, backend=backend)

if not peak_performance:
    # Remove the 16 padded rows again.
    flt = core.std.Crop(flt, 0, 0, 0, 16)

flt.set_output()
```
It seems that |
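The script above pads the 2160-row clip to 2176 rows with `AddBorders` and crops the extra rows off after interpolation. Assuming the purpose is to make the frame height a multiple of 32 (2176 = 68 × 32), which is an inference and not something stated in this thread, the padding amount can be computed as in the sketch below. `bottom_padding` is a hypothetical helper, not part of vsmlrt or VapourSynth.

```python
def bottom_padding(height, multiple=32):
    """Rows to add so that `height` becomes a multiple of `multiple`.

    Hypothetical helper for illustration only; the multiple of 32 is an
    assumption based on the 2160 -> 2176 padding in the benchmark script.
    """
    return (-height) % multiple

pad = bottom_padding(2160)      # rows to add below a 2160-row frame
padded_height = 2160 + pad      # height passed to the model

print(pad, padded_height)
```

The same value would then be used for both the bottom argument of `AddBorders` and the bottom argument of `Crop`, so the output keeps the source height.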
-
Great work!
-
I just want to report that turning OFF hardware-accelerated GPU scheduling greatly increases FPS on Windows 11. (Screenshots: ON vs. OFF.)
-
Here are some results I got with vsmlrt v13.1. I used the script above and benchmarked it with vspipe. The results are very weird, and I don't know whether the problem is on my end.
Some reports, if anyone is interested: https://drive.google.com/drive/folders/1zhigwILaQm13FgY4O5w7oZ18iABY5Lhw?usp=share_link Only RIFE io-fp16 at peak performance reached 100% CUDA usage and gave more than 120 fps; the others could not. Note: I didn't swap the io-fp32 peak and overhead results. The peak-performance run really is slower than the mpv-simulation run, and I have no idea why.
-
https://www.svp-team.com/forum/viewtopic.php?pid=81483#p81483 I integrated this as a test: in 1080p transcoding I reached around 115 fps with an RTX 4090, and 22 fps for 2160p. Any ideas for improving 4K performance? :)