Video stream corrupt with bitrates over 6000 on Raspberry Pi Zero #374
Comments
We haven't added any RPi-specific compile options, so maybe that's worth a shot? I don't know the network code; I've only skimmed it, I'm not experienced with that, and I don't have a Zero to do problem solving with. Maybe @thestr4ng3r has an idea where to look? Regarding your Qt stumbling block, what I'd try is producing a Zero build of just 'libQt5Multimedia.so.5' and hacking that into the build, as I believe that might be what deals with the sound; I've seen people mention that library specifically when having sound issues. That might be easier than trying to build the entirety of Qt5. I can take a look and see if I can do something like that, as I believe the Qt build didn't go through last time I tried, but I only tried once. I would try compiling with the RPi 1 compile flags. What OS are you using? Buster? Is there a possibility that the broken sound clogs up the network handling?
This needs concrete profiling to find the bottleneck, but I would suspect either encryption or FEC to be the culprit on this slow CPU. IIRC there is a NEON implementation of gf-complete that is currently not used; this might help if FEC is the bottleneck.
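For context on why FEC can be so CPU-hungry (a hypothetical illustration, not code from this project): erasure codes like the one gf-complete backs are built on Galois-field arithmetic, and a naive GF(2^8) multiply costs a full bit-loop per byte of parity data, which is exactly what SIMD (NEON) implementations exist to avoid. A minimal sketch, using the common 0x11B reduction polynomial:

```python
def gf256_mul(a: int, b: int) -> int:
    """Naive GF(2^8) multiply with reduction polynomial x^8+x^4+x^3+x+1.

    Illustrative only -- gf-complete replaces this bit-loop with table
    lookups or SIMD, since an erasure code runs it per byte of parity.
    """
    p = 0
    for _ in range(8):
        if b & 1:
            p ^= a
        carry = a & 0x80
        a = (a << 1) & 0xFF
        if carry:
            a ^= 0x1B  # reduce modulo the field polynomial
        b >>= 1
    return p

# Well-known pair of inverses in this field (used in the AES S-box):
print(hex(gf256_mul(0x53, 0xCA)))  # -> 0x1
```

Eight iterations of shifting and XOR per output byte adds up fast on an ARMv6 core with no SIMD, which is why the "is FEC the bottleneck?" question matters here.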
Thanks for the insight. Looking at top while running shows that I am hitting 100% CPU right when things go bad, so that confirms it's a CPU load issue. I'm not sure what FEC or gf-complete are, or what their purpose is, so I don't really have any ideas there, but I can say that NEON isn't supported on ARMv6, so that won't be of much help.
I attempted to do some profiling on this, but did not have any luck. I tried both gprof and google-perftools, but both broke streaming entirely; it looks like they don't play well with the multithreaded networking code. So for the moment I've resorted to the highly scientific method of trying random things to see what effect they have. One thing I did discover is that setting the fullscreen flag on OMX gave a noticeable improvement; I suspect that allows the driver to make some optimizations somewhere. With that, I was able to play for a few minutes at a time before things went bad. It might be worth setting that flag when using the F11 fullscreen mode.

Something else I've observed in testing is that when the stream goes bad, it never recovers. Once it starts artifacting, it eventually goes completely grey with only occasional blotches of color, and even if I leave the controls alone and let it sit on a still screen, the video never comes back. I have to wonder if something is "snowballing" internally that shouldn't be. Unfortunately, I don't understand enough about the network streaming code to have any ideas beyond that 😅

One thing I can say about FEC: poking around at Moonlight, they have an implementation of that as well, based on this. But I don't see any obvious signs of better optimizations than gf-complete (no intrinsics for ARMv6 or anything), so I'd be a little surprised if that was the root of the issue here. But that's just a guess: I don't know what I don't know 😆
With the fullscreen fix, my wild guess is that something else gets turned off when that happens; the OMX parts seem to take a tiny amount of CPU, as far as I can tell. This was the first time I've ever tried to do profiling on Linux, so I might be doing it completely wrong, but I have managed to get some output at least. I invoked the profiler in gui/main.cpp, just around the StreamWindow.
The top of the output I keep getting looks like this:
It would be great to know what actual functions these are.
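The general technique being described here is scoped profiling: bracketing only the code of interest with start/stop calls instead of profiling the whole process (gperftools exposes this as the ProfilerStart/ProfilerStop C API). A hypothetical Python sketch of the same idea, with a stand-in workload in place of the real streaming loop:

```python
import cProfile
import io
import pstats

def run_stream():
    # Stand-in for the real streaming session (hypothetical workload).
    total = 0
    for i in range(100_000):
        total += i * i
    return total

prof = cProfile.Profile()
prof.enable()                  # ~ ProfilerStart("stream.prof") in gperftools
result = run_stream()
prof.disable()                 # ~ ProfilerStop()

# Dump the hottest entries, sorted by cumulative time.
s = io.StringIO()
pstats.Stats(prof, stream=s).sort_stats("cumulative").print_stats(5)
summary = s.getvalue().strip().splitlines()[0]
print(summary)                 # e.g. "4 function calls in 0.0xx seconds"
```

Profiling only the session keeps startup and UI code out of the report, which is why bracketing around the StreamWindow makes sense here.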
If there are lots of matrix multiplications, could this be used? (I too have no idea what all the crypto parts are doing) https://github.com/jetpacapp/pi-gemm Maybe less chance of anything useful here:
I'm checking out the spec comparisons from the Raspberry Pi team themselves, here: (Pi 4) 800 / (Pi Zero) 50 = 16 times faster, all things being equal and assuming I haven't messed up my numbers.
Edit: top instead shows me maxing out at ~20% on one of the four cores. I'm going to see if I can save out continuous values per core during play and graph them. HERE's a graph per core, with a sum graph:
The fact that the Pi 4 is a multi-core ARMv7 with different features makes it quite different from the Zero. I'm afraid it's like comparing apples to oranges, really.

I was able to get a newer version of gperftools built and working on the device, and set it up to only profile the streaming session. But so far I'm not getting a lot of good info; call-graph information is mostly missing. I suspect I need to statically compile in all libs to get what I'm looking for. Based on what I'm seeing, almost everything at the top is in OpenSSL code, and there's not much that can be done about that. One oddity is that I consistently see opus_get_version_string in one of the top couple spots. This seems very strange, as that's not something I would expect to be called frequently, but without more call-graph info it's tough to say where that's coming from.

I did make one breakthrough, however. My theory that something was 'snowballing' was correct: it was actually the logging that was doing it. As soon as the stream went bad, tons of output was being written, which itself took up precious resources, causing the stream to have further problems, and so on. Disabling logging (specifically the file logging) has reduced the issue dramatically. Now when the video glitches out, it recovers (usually pretty quickly) instead of becoming unusable and requiring me to close the stream. So at this point it's actually playable at higher bitrates, as long as I don't mind the occasional artifacting 😄
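The snowball described here is a classic feedback loop: errors produce log output, log output steals CPU, which produces more errors. Besides disabling file logging outright, a common mitigation is rate-limiting the logger so it can never consume more than a fixed budget. A hypothetical sketch (not the project's actual logging code):

```python
import time

class RateLimitedLog:
    """Drop log lines beyond a per-second budget so logging under an
    error storm cannot starve the real work."""

    def __init__(self, max_per_sec=50, now=time.monotonic):
        self.max_per_sec = max_per_sec
        self.now = now
        self.window_start = now()
        self.count = 0
        self.dropped = 0

    def log(self, msg: str) -> bool:
        t = self.now()
        if t - self.window_start >= 1.0:
            # New one-second window: reset the budget.
            self.window_start, self.count = t, 0
        if self.count < self.max_per_sec:
            self.count += 1
            return True   # caller actually writes msg
        self.dropped += 1
        return False      # dropped under load

log = RateLimitedLog(max_per_sec=2)
results = [log.log(f"frame error {i}") for i in range(5)]
print(results, log.dropped)  # [True, True, False, False, False] 3
```

Counting the dropped lines (and logging that count once per window) preserves a signal that something went wrong without the I/O cost of every individual message.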
If you can identify the core loops or math functions that are frequently used, then maybe we could use the QPU cores to speed some things up. Sounds like you just need a bit of help to lower the CPU usage. I've never done anything like that, but it seems possible.
https://github.com/mn416/QPULib
Environment
Describe the bug
The video stream becomes corrupt and unusable when the bitrate is set above 6000. This doesn't happen immediately, but only once there is movement in the game: menus work, but once you move a character around, the stream becomes blocky, then bugs out and becomes unplayable.
The obvious culprit would be the network setup, but the Pi is connected via Cat 6 to the PS4 with only a gigabit switch in between. Additionally, other streaming software (Moonlight) runs stably on this same setup at a bitrate of 20000. This leads me to believe something else might be causing the issue.
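As a quick sanity check on the "it's not the network" reasoning, a 6000 kbps stream uses well under one percent of a gigabit link:

```python
# Back-of-the-envelope link utilization (values from the report above).
bitrate_kbps = 6000        # stream bitrate where corruption starts
link_mbps = 1000           # gigabit switch between the Pi and the PS4

utilization = bitrate_kbps / 1000 / link_mbps
print(f"{utilization:.1%}")  # -> 0.6%
```

Even Moonlight's stable 20000 kbps setting is only 2% of the link, so raw bandwidth is very unlikely to be the constraint.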
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Bitrates up to 10000 provide a stable video stream.
Log Files
Additional context
I also tried using a 5GHz wifi dongle, which is how I typically use moonlight on this device, and had the same exact results.
Thanks for any help that can be provided!