Large performance improvements and a couple of nice to have new features #96
…xt if sticking around) and added 1st/2nd/3rd with config var
# Performance Branch

The most computationally expensive thing to do on any computer is render a font. It's what Babbage fought with for years.

I profiled the main loop of this code by vendoring in Luma, removing the threadpool, and running cProfile over it. That revealed two things:

* Running the seconds as "hotspot" was burning up the CPU
* The rest of the CPU time was spent "scrolling" the "calling at stations" as it required a full re-render of the string each time a character was dropped off the front of the string.
* We've changed the scrolling behaviour to scroll the bitmap at 1 pixel per frame, rather than 1 character per frame. Smooth!
* There's also a snazzy little rising-up animation for fun.
* The other frequent font calls (in the loop) have all been bitmap cached too.
* The frame regulator is now configurable; it fights with the CPU on a Pi Zero and is better disabled there; but on a Pi3 you want a regulator to stop burning the CPU up unnecessarily.

So, main fixes were to put the seconds as an interval-updated zone with 0.1 second resolution, and to pre-render all other commonly used TTF operations in the main loop.

## Results

On a Pi Zero (the oldest possible device!), here's the performance on "main":

![353895428_652705433582787_8305305711463697195_n](https://github.com/CalamityJames/train-departure-display/assets/1850718/823cfcc8-1f6b-4730-ae5d-f49e655af10f)

And here's the performance on "performance", with `targetFPS` set to `0`:

![image](https://github.com/CalamityJames/train-departure-display/assets/1850718/c1a260a1-cd26-4872-b204-4654293caa9a)

# Changelog

See updated CHANGELOG

---------

Co-authored-by: James <[email protected]>
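The scrolling change in the commit message above (render the string once, then move a window across the cached bitmap 1 pixel per frame) can be sketched roughly as follows. This is a stdlib-only illustration, not the project's code: `render_text` and `scroll_frames` are hypothetical names, and the "bitmap" is a tuple of columns standing in for what a real TTF render through Luma would produce.

```python
from functools import lru_cache


@lru_cache(maxsize=32)
def render_text(text: str) -> tuple:
    """Hypothetical stand-in for an expensive TTF render.

    Returns one entry per pixel column. Each character becomes 6 columns
    (5 of "ink" plus a 1-column gap); a real implementation would draw
    with a font library instead. The cache means each distinct string is
    rendered only once.
    """
    columns = []
    for ch in text:
        columns.extend([ch] * 5 + [" "])
    return tuple(columns)


def scroll_frames(text: str, viewport_width: int):
    """Yield one viewport-wide window per frame, advancing 1 pixel each time."""
    bitmap = render_text(text)  # rendered once, then only sliced
    last_offset = max(len(bitmap) - viewport_width, 0)
    for offset in range(last_offset + 1):
        yield bitmap[offset:offset + viewport_width]
```

Each frame is then a cheap slice of the cached bitmap, so the per-frame cost no longer depends on re-rendering the remaining string.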
I don't have two screens so couldn't test the multi-screen performance. Sorry if that totally breaks it! Will be ordering a second screen but my previous ebay supplier is now away till August!
Just a note here to say that I reformatted my Pi Zero and installed Python 3.11 (rather than 3.7 which the image comes with), which has many CPU optimisations, and the result is significant: 95fps on a Pi Zero is a big leap! It looks like the image here: https://hub.docker.com/layers/balenalib/raspberry-pi-debian-python/3.11-buster-run/images/sha256-d74b72c912b9f0d019308d0995e50c82b54c106466656d094ebdd30d831e72f7?context=explore could be used, but I haven't run this project through Balena yet, so I'm not going to fiddle with Balena-specific settings.
* zooom zoom
* bleep
* tweak; remove zlib, but libjpeg is required at runtime
* rm emu
* tbh
* rewrite for merge to 0.5.0
I've proceeded with the move to Python 3.11 and put both the build and the run Docker containers on Alpine. Also got this all working with Balena, which is frustrating to start with, but quite neato when you get used to it. Updated the PR to note performance running via BalenaOS is acceptable, but understandably lower.
Wow guys! I'll take a look at this immediately! @CalamityJames I've got a spare screen I can send you if you need it for testing purposes - email me your address [email protected]
@chrisys thanks for the offer - I've dropped you an email :) I should have a Pi Zero 1 (and 2) with me in the next couple of days too so I can test on more realistic devices than my OP Pi3!
@CalamityJames @cr3ative I've updated my sign (running on balena) with this and it worked first time and is just beautiful, it's the point I always dreamed we could get to, the scrolling is just 😍 @cr3ative I'm shipping James a display to help with testing but I'm happy to do the same for you too if it helps! I have a couple of spare white ones as I was planning on working on #62 but realistically I'm not going to get to it any time soon.
That's really kind of you to say! I hope we got the Balena bits right - hopefully you could tweak them for us anyway. I'll take a display, especially if it saves me taking the dang headers off it! Will drop you an email.
Yep as far as the balena side is concerned it all looks great! The resultant container is a bit larger than it was previously (225MB vs 89MB) but I agree it makes the maintenance (and development) easier.
I would just like to say a huge thank you to everyone who contributed to this update. The Pi Zero display under my monitor has gone up from approx 0.9fps easily into the 40s. A truly huge improvement and I'm not seeing the clock sticking as it used to either. One silly question though: my TZ is set to the default of "Europe/London" but the clock is an hour behind, and changing the setting makes no difference. Has anyone else encountered this, and if so what have I forgotten to change? The departure times are correct; it's the real-time clock at the bottom that is an hour behind.
Hah, just glanced behind me and noticed mine is also reporting as 16:27 currently! Will have a look and see if it's something we've broken!
Fix identified I believe, just testing and will submit a new PR!
See #97 for Time Zone fix.
Obsoletes #95 (included in this PR)
Massive thanks to @cr3ative for his stellar work optimising the scrolling performance; he is the real star of this PR!
Base OS and Python Upgrade
We've moved the base operating system to Alpine, cleaned up the build chain, and simplified the Dockerfile. The resulting image on Balena is about 220MB, about the same as before, but hopefully easier to approach.
This PR has now been tested in Balena and deploys as expected.
Python Profiling
The most computationally expensive thing to do on any computer is render a font. It's what Babbage fought with for years.
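I profiled the main loop of this code by vendoring in Luma, removing the threadpool, and running cProfile over it. That revealed two things:
* Running the seconds as "hotspot" was burning up the CPU
* The rest of the CPU time was spent "scrolling" the "calling at stations" as it required a full re-render of the string each time a character was dropped off the front of the string.
* We've changed the scrolling behaviour to scroll the bitmap at 1 pixel per frame, rather than 1 character per frame. Smooth!
* There's also a snazzy little rising-up animation for fun.
* The other frequent font calls (in the loop) have all been bitmap cached too.
* The frame regulator is now configurable; it fights with the CPU on a Pi Zero and is better disabled there; but on a Pi3 you want a regulator to stop burning the CPU up unnecessarily.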
So, main fixes were to put the seconds as an interval-updated zone with 0.1 second resolution, and to pre-render all other commonly used TTF operations in the main loop.
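For anyone wanting to repeat the profiling step, a minimal sketch of running cProfile over a single-threaded loop might look like this. `draw_frame` and `main_loop` are hypothetical stand-ins, not functions from this repo; the point is only the `cProfile.Profile` / `pstats.Stats` plumbing. cProfile records only the thread it is enabled in, which is presumably why the threadpool had to come out first.

```python
import cProfile
import io
import pstats


def draw_frame(n: int) -> int:
    """Hypothetical stand-in for one iteration of the display's main loop."""
    return sum(i * i for i in range(n))


def main_loop(frames: int) -> None:
    """Hypothetical main loop: draws a fixed number of frames, no threads."""
    for _ in range(frames):
        draw_frame(1000)


def profile_main_loop(frames: int = 50, top: int = 10) -> str:
    """Profile main_loop and return the report sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    main_loop(frames)
    profiler.disable()

    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(top)
    return out.getvalue()


if __name__ == "__main__":
    print(profile_main_loop())
```

Sorting by cumulative time puts the functions that dominate the loop at the top of the report, which is how hot paths such as font rendering tend to show up.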
Results
On a Pi Zero (the oldest possible device!), here's the performance on "main" running Raspbian:
And here's the performance on "performance", with targetFPS set to 0:
Using BalenaOS, the Pi Zero can manage about 33fps when targeting 45fps, at about 70% CPU; this leaves some headroom for the supervisor services.
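The idea of a configurable frame regulator, where a target of `0` switches regulation off, can be sketched like this. `FrameRegulator` is a hypothetical class written for illustration under that assumption; it is not the regulator this project or Luma actually ships, and only the `targetFPS` setting name comes from the PR.

```python
import time


class FrameRegulator:
    """Hypothetical regulator: caps a loop at target_fps; 0 disables it."""

    def __init__(self, target_fps: float, clock=time.monotonic, sleep=time.sleep):
        # Time budget per frame in seconds; 0.0 means "never sleep".
        self.frame_time = 1.0 / target_fps if target_fps > 0 else 0.0
        self._clock = clock
        self._sleep = sleep
        self._start = 0.0

    def __enter__(self):
        self._start = self._clock()
        return self

    def __exit__(self, *exc):
        if self.frame_time == 0.0:
            return False  # regulation disabled: run as fast as possible
        remaining = self.frame_time - (self._clock() - self._start)
        if remaining > 0:
            self._sleep(remaining)  # give the unused frame budget back to the CPU
        return False


if __name__ == "__main__":
    regulator = FrameRegulator(target_fps=45)  # e.g. the value of targetFPS
    for _ in range(3):
        with regulator:
            pass  # draw one frame here
```

On a device that cannot reach the target anyway, `remaining` is never positive and the regulator only adds bookkeeping, which fits the observation that it is better disabled on a Pi Zero.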
Changelog
We edited the changelog that shouldn't have been edited, sorry! Changes pasted in from changelog: