
Brightness limiter utilizing global LED buffer and improved color restoration #3272

Closed
wants to merge 4 commits

Conversation

Aircoookie
Owner

Fixes #3264

  • Decoupled the global buffer from the optional per-segment leds buffer; per-segment buffers are now only allocated if the global buffer is disabled. (A per-segment buffer could still be useful with the global buffer enabled in some cases, as it stays lossless even when less than maximum segment opacity is set, so we may consider changing the buffer option to a dropdown with "Global", "None", "Per-segment", "Global + per-segment".)
  • Added improved lossy color restoration to BusDigital::getPixelColor(). Such functionality was part of NeoPixelBrightnessBus but not NeoPixelBusLg, yet it is absolutely essential if no double buffer is used. The restoration equation is improved to be lossless when setting repeatedly, e.g. calling setPixelColor(i, getPixelColor(i)) will not gradually degrade the color to black.
  • The ABL makePostAdjustments() hotfix is replaced by reading the global buffer if enabled, else using restored colors from the bus (lower performance)
  • The global buffer is now of type uint32_t for ease of implementation (see the packing sketch below). Using RgbColor or CRGB types would also work and is up for consideration, as enabling the global buffer would then use 1 byte less per LED (perhaps only if the setup contains no white channel LEDs at all?). It would cause headaches with how to get the white channel for ABL, though, and might impact performance slightly by having to convert from/to uint32_t.
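For reference, a minimal sketch of how a uint32_t entry in the global buffer can carry RGBW; the white-in-top-byte packing follows WLED's usual 32-bit color layout, and the helper names here are hypothetical:

```cpp
#include <stdint.h>

// Hypothetical helpers (not WLED's own macros): pack/unpack a WRGB color in a
// uint32_t buffer entry and pull out the white channel, e.g. for ABL summing.
static inline uint32_t packWRGB(uint8_t w, uint8_t r, uint8_t g, uint8_t b) {
  return ((uint32_t)w << 24) | ((uint32_t)r << 16) | ((uint32_t)g << 8) | b;
}
static inline uint8_t whiteChannel(uint32_t c) { return (c >> 24) & 0xFF; }
```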

@Makuna if you are interested in testing the improved lossy restoration even though NPBrightnessBus is already deprecated, just replacing *ptr++ = (value << 8) / scale; (L78) with *ptr++ = ((value << 8) + _brightness) / scale; should do the trick. While there is still significant loss, the recovered color is frequently one step closer to the one originally set, and most importantly, repeatedly getting and setting no longer degrades the color any further.
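To illustrate why the added `+ _brightness` term helps, here is a small standalone sketch (not library code) that assumes the usual `(value * brightness) >> 8` dimming and treats `scale` as equal to the brightness; with the rounding term, dimming the restored value returns exactly the stored value again, so repeated get/set cannot drift:

```cpp
#include <cassert>
#include <cstdint>

// Dimming as commonly done for 8-bit channels: value * brightness / 256.
static uint8_t dim(uint8_t v, uint8_t bri) { return (uint16_t(v) * bri) >> 8; }

// Old restoration: (value << 8) / scale; it can lose a step on every round trip.
static uint8_t restoreOld(uint8_t d, uint8_t bri) {
  if (bri == 0) return 0;
  uint16_t r = (uint16_t(d) << 8) / bri;
  return r > 255 ? 255 : uint8_t(r);
}

// Improved restoration: ((value << 8) + _brightness) / scale; still lossy,
// but dim(restore(d)) == d, so repeated get/set no longer drifts toward black.
static uint8_t restoreNew(uint8_t d, uint8_t bri) {
  if (bri == 0) return 0;
  uint16_t r = ((uint16_t(d) << 8) + bri) / bri;
  return r > 255 ? 255 : uint8_t(r);
}

int main() {
  // The old equation drifts: with bri = 200, v = 100 the stored value is 78,
  // restoreOld() gives 99, and dimming 99 again stores 77, one step lost.
  assert(dim(restoreOld(dim(100, 200), 200), 200) == 77);

  // The improved equation is stable for every value/brightness combination.
  for (int bri = 1; bri < 256; bri++)
    for (int v = 0; v < 256; v++) {
      uint8_t d = dim(uint8_t(v), uint8_t(bri));
      assert(dim(restoreNew(d, uint8_t(bri)), uint8_t(bri)) == d);
    }
  return 0;
}
```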

@blazoncek
Collaborator

At first glance I must say that I do not like the complexity of the solution.

I especially do not like mixing the Bus class into the handling of global brightness. IMO it should only happen within the WS2812FX class if you ever want to use NeoPixelBus in the future without refactoring PolyBus.

The other thing I do not like is the removal of the global buffer from the Segment class. Segments must have direct access to the buffer and should prefer the global buffer to a local one to reduce memory use.

I will take a deeper look later.

BTW I would also prefer this to be a feature of WLED 0.15.

@Aircoookie
Owner Author

Thank you for the feedback!

Color restoration does need to be done in BusDigital::getPixelColor(); we would need to add that regardless of the global buffer changes, as NeoPixelBusLg does not in itself recover colors. Having recovered colors is necessary not only for the ABL calculation to work properly, but also for use of getPixelColor() by effects if the buffer is disabled, and for a more vibrant Peek preview at lower brightness levels, so I had this on my todo list before final 0.14 anyway. Switching from NeoPixelBusLg to NeoPixelBus is as easy as adding brightness scaling to BusDigital::setPixelColor(); the reason I haven't done that (yet) is that NeoPixelBusLg takes care of proper brightness scaling on 16-bit LEDs for us.

Segments must have direct access to the buffer and should prefer the global buffer to a local one to reduce memory use.

Agreed that the global buffer should be preferred. Local buffers are also suboptimal, as they may cause memory fragmentation due to frequent reallocations if the segment setup is modified.
Why would direct buffer access from the segment be necessary? All effects access it only through set/getPixelColor (and they have to, else grouping/spacing/mirroring would not work).
The only drawback I see in my solution is that SEGMENT.opacity scaling is applied to the values in the buffer, so the retrieved value would be scaled. We could also add color restoration here to alleviate this.
If segments wrote the color to the global buffer directly without applying opacity scaling, the buffer could not be used for ABL, nor for setting the LEDs, nor for Peek; which would kind of defeat the purpose of a global buffer IMO.

The best of both worlds would be a GlobalLedBuffer class, storing the unscaled color directly from segments and offering one method for reading colors directly (for use in effects) and one for reading them with segment opacity scaling applied. That would increase implementation complexity and probably incur a performance hit though, and it is a headache in case of overlapping segments (with overlay effects and effect transitions in mind).
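Roughly what such a class could look like; this is only a sketch with hypothetical names, and it deliberately ignores the overlapping-segment problem mentioned above:

```cpp
#include <stdint.h>
#include <vector>

// Hypothetical GlobalLedBuffer sketch: stores unscaled WRGB colors written by
// segments, offers a raw read for effects and an opacity-scaled read for
// output / ABL / peek.
class GlobalLedBuffer {
  std::vector<uint32_t> buf;
public:
  explicit GlobalLedBuffer(uint16_t len) : buf(len, 0) {}

  void set(uint16_t i, uint32_t c) { if (i < buf.size()) buf[i] = c; }

  // lossless read-back for effects (no opacity applied)
  uint32_t getRaw(uint16_t i) const { return i < buf.size() ? buf[i] : 0; }

  // read with segment opacity applied, for show()/ABL/peek
  uint32_t getScaled(uint16_t i, uint8_t opacity) const {
    uint32_t c = getRaw(i), out = 0;
    for (int s = 0; s < 32; s += 8)
      out |= ((((c >> s) & 0xFF) * opacity) >> 8) << s;
    return out;
  }
};
```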

@blazoncek
Collaborator

Let me briefly comment, before I take a second look at the changes. 😃

Why do you think color restoration needs to happen in Bus if we do double buffering? Let the Bus class only deal with low-level writing and delegate color retrieval to the WS2812FX class. Consider this workflow: Segment::getPixelColor() -> WS2812FX::getPixelColor() -> BusManager::getPixelColor() -> Bus::getPixelColor() -> PolyBus::getPixelColor() -> NeoPixelBus::GetPixelColor()
That is a lot of jumps, and unnecessary if you use the global buffer, as restoration could happen in the WS2812FX variant in that case. In case of a local buffer it can happen in Segment (but it would need to be enumerated in current estimation); only if no double buffering happens at all should the underlying NeoPixelBus be used. There is no need to handle low-level color retrieval from NeoPixelBus as we have all the information in the global buffer. You would only need to do it in case the user disables the global buffer (which will prevent us from using plain NeoPixelBus anyway).

My thought was also about show(), as it is where current limiting is applied. Why not also let it handle global brightness? It would cost a few CPU cycles, but most of those could be stolen from elsewhere. This can be achieved by ending Segment::setPixelColor() early, only writing to the global buffer and not calling NeoPixelBus. show() would then need to do BusManager::setPixelColor() after applying segment opacity and global brightness. (Here I admit that segment opacity may need different handling.)
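Schematically, the flow described above could look like this (hypothetical free-standing names rather than the real Segment/WS2812FX/BusManager methods):

```cpp
#include <stdint.h>
#include <vector>

// Sketch only: effects write solely to the global buffer; the bus is written
// once per frame in show(), where opacity and global brightness are applied.
static std::vector<uint32_t> gBuf(300, 0);            // global buffer, unscaled WRGB
static void busSetPixelColor(uint16_t, uint32_t) {}   // stand-in for BusManager::setPixelColor()
static void busShow() {}                              // stand-in for BusManager::show()

static uint32_t applyBri(uint32_t c, uint8_t bri) {   // per-channel c * bri / 256
  uint32_t out = 0;
  for (int s = 0; s < 32; s += 8)
    out |= ((((c >> s) & 0xFF) * bri) >> 8) << s;
  return out;
}

void segmentSetPixelColor(uint16_t i, uint32_t c) {
  if (i < gBuf.size()) gBuf[i] = c;                   // ends here, no NeoPixelBus call
}

void stripShow(uint8_t opacity, uint8_t bri) {
  for (uint16_t i = 0; i < gBuf.size(); i++)          // opacity and brightness applied late
    busSetPixelColor(i, applyBri(applyBri(gBuf[i], opacity), bri));
  busShow();
}
```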

TL;DR: having a global buffer can simplify color recovery and speed up the process, at the expense of RAM. And here we stumble on the dreaded ESP8266 support. 😅

@Aircoookie
Owner Author

Why do you think color restoration needs to happen in Bus if we do double buffering?

It doesn't! :) In fact, busses.getPixelColor() is never called if double buffering is enabled. Color restoration is only used when the global buffer is disabled. I would like to enable the global buffer by default on ESP32.

As for why I added it to BusDigital and not WS2812FX: only digital busses store scaled colors. Analog and network busses store correct unscaled colors, and trying to "restore" them would not be good. Distinguishing whether a color came from a digital or an analog/network bus in WS2812FX would be a lot more involved than just doing it in BusDigital. Besides, the brightness scaling is done in NPB, which is one level further down the stack still.
Alternatively, I could move all brightness scaling (regardless of bus type) to WS2812FX and adopt the standard NeoPixelBus. The issue I see with that is that we would need to make some modifications to the Bus API to support more than 8 bits, and that the auto-white and CCT-from-RGB equations would likely be less accurate on already scaled colors. Therefore I would prefer to do all brightness handling on the bus level.

This can be achieved by ending Segment::setPixelColor() early, only writing to the global buffer and not calling NeoPixelBus.

It also does this! Only show() writes to busses now if double buffering is enabled, which even slightly increases performance (fewer for-loops and function calls). Segment opacity does need further thought, I completely agree on that. Color restoration from the global buffer would be good enough for now IMO, but it is definitely not optimal; I'll think about whether there is a better way, but I'm not aware of one right now that wouldn't involve code complexity going 📈

@Makuna

Makuna commented Jun 27, 2023

@Makuna if you are interested in testing the improved lossy restoration even though NPBrightnessBus is already deprecated, just replacing *ptr++ = (value << 8) / scale; (L78) with *ptr++ = ((value << 8) + _brightness) / scale; should do the trick. While there is still significant loss, the recovered color is frequently one step closer to the one originally set, and most importantly, repeatedly getting and setting no longer degrades the color any further.

The real concern is not really just the luminance effects (even though low original values still cause more error than higher values), it's the effect of reversing gamma correction and then adding luminance on top of that.

@Makuna

Makuna commented Jun 27, 2023

Also note, there is nothing stopping you from copying NeoPixelBrightnessBus into your project and modifying it to do exactly what you want. It was meant to be a compatibility layer between Adafruit's library and mine, not an advanced feature.

@blazoncek
Collaborator

The main problem here is legacy. Users are accustomed to having global brightness even though we have segment opacity (which behaves like local brightness). If there is only one segment, one of them is redundant.

My (radical) view would be to dump global brightness in favour of using only segment opacity. But I cannot afford that. 😁

I know you are unlikely to change NeoPixelBusLg @Makuna, but what if you only applied gamma and luminance in the ISR while sending out buffer data? Is that too CPU intensive for an ISR? Or are you not using ISRs and/or does the buffer need to be prepared in advance?

@blazoncek
Collaborator

@Aircoookie what if we implement double buffering on the bus level (the way you like it) in the same way as we did it for BusNetwork?

That would require no change on the upper layers (and we could go without setUpLeds()) but we'd need to send pixel data to NeoPixelBus in the show() method instead of setPixelColor(). That would make double buffering mandatory, unfortunately.
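Something along the lines of this sketch (hypothetical names, simplified to a uint32_t buffer): setPixelColor() only fills a bus-local copy, and show() pushes everything to the underlying driver.

```cpp
#include <stdint.h>
#include <stdlib.h>

// Sketch of bus-level double buffering in the spirit of BusNetwork (names are
// hypothetical): the bus keeps its own unscaled copy of every pixel.
struct BusSketch {
  uint16_t  len  = 0;
  uint32_t* data = nullptr;

  bool begin(uint16_t n) {
    len  = n;
    data = (uint32_t*)calloc(n, sizeof(uint32_t));  // the extra RAM this costs
    return data != nullptr;
  }
  void setPixelColor(uint16_t i, uint32_t c) { if (i < len) data[i] = c; }
  uint32_t getPixelColor(uint16_t i) const   { return i < len ? data[i] : 0; }

  void show(uint8_t bri) {
    for (uint16_t i = 0; i < len; i++)
      driverSetPixel(i, data[i], bri);              // brightness applied only here
    driverShow();
  }
  // stand-ins for the underlying NeoPixelBus calls
  static void driverSetPixel(uint16_t, uint32_t, uint8_t) {}
  static void driverShow() {}
};
```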

What do you think?

@Makuna

Makuna commented Jun 27, 2023

I know you are unlikely to change NeoPixelBusLg @Makuna, but what if you only applied gamma and luminance in the ISR while sending out buffer data? Is that too CPU intensive for an ISR? Or are you not using ISRs and/or does the buffer need to be prepared in advance?

The issue here is that while this can work with Methods that do a translation into a hardware-specific format to send the actual bits, not all Methods have this extra step; some only send what is in the buffer, untranslated.
The buffer stack is layered roughly like this:
  • RGB - DIB layer (optional), for quicker management of non-dimmed, non-gamma-corrected pixels. Outside the NPB instance.
  • GRB - DDB layer (dimmed and gamma-corrected data in the format of the specific LED chips, with pre-ambles and post-ambles for one-wire chips (timing), as seen on the bus). Inside the NPB instance.
  • HW - (optional) hardware-Method-specific format required for getting bits out of a pin (I2S requires a full buffer, RMT is translated on the fly in gulps, for BitBang this isn't even present). Also inside the NPB instance.

Remember, NeoPixelBus supports other platforms than ESP, so this architecture is designed to be supportable across all.

I have a future design issue logged to review buffering at all the layers listed above and abstract them, so that Methods like the ESP32 hardware ones could merge layers in the way you are thinking. Right now that would mean merging luminance and gamma correction with the hardware requirement. And this is why NeoPixelBusLg was created: to start users on a supportable path for a future where those two features (luminance/gamma) are moved lower in the stack and are no longer accessible for retrieval, since they would then require (un)translation from a hardware-specific sending format rather than just the LED chip format used today.

@blazoncek
Collaborator

I was afraid so. 😄

The problem we are currently facing is that NPBLg doesn't apply brightness/luminance to pixels that were already SetPixelColor-ed (prior to SetLuminance), which we need for brightness/current limiting. And it seems this is totally on us to solve, by using double buffering (DIB, as you call it).

And we are at the dreaded RAM limitations of ESP8266.

@Makuna

Makuna commented Jun 27, 2023

I was afraid so. 😄

The problem we are currently facing is that NPBLg doesn't apply brightness/luminance to pixels that were already SetPixelColor-ed (prior to SetLuminance), which we need for brightness/current limiting. And it seems this is totally on us to solve, by using double buffering (DIB, as you call it).

And we are at the dreaded RAM limitations of ESP8266.

Yeah, right now this would be an issue for ESP8266. The I2S (DMA) method today has two buffers: one for the async send in the DMA format, and one in the DDB format mentioned above. In the future, the DDB layer could be done away with and either replaced with an internally managed DIB or, my current thinking, an external DIB that you pass to Show()/Render(). The detail on this last method: due to the async nature of DMA, Show/Render would have to wait for the previous send to complete before the data is translated into the single DDB. Not often an issue, as you have the external DIB to work with while it sends. This is analogous to double-buffered rendering in video games, BTW. It could then also allow triple buffering for those who want advanced drawing.

Maybe this conversation should be moved somewhere else:

But the detail I am hearing from your concern is how to limit overall power usage, where the measurement happens at a lower layer (after luminance/gamma), but the reaction would preferably be done at a higher layer (before luminance/gamma). Today, you have to apply luminance/gamma, check the overall power usage, and then mitigate overdraw how? By starting over and re-applying a lower luminance? By capping only the peak-brightness pixels?

@blazoncek
Collaborator

Today, you have to apply luminance/gamma, check the overall power usage, and then mitigate overdraw how? By starting over and re-applying a lower luminance? By capping only the peak-brightness pixels?

The process is rather simple overall.

  1. set (global) brightness (from UI)
  2. draw pixels as intended at full brightness (not scaled within FX, will be scaled down by NPBLg)
  3. sum all pixel values and calculate current draw
  4. if the current draw is above budget, reduce global brightness further (applied to all pixels, not just the brightest ones); see the sketch below
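A condensed sketch of that loop (illustrative names and constants; the real WS2812FX::estimateCurrentAndLimitBri() also accounts for per-bus settings and the MCU's own current draw):

```cpp
#include <stdint.h>

// Illustrative ABL sketch, not the actual WLED code. Assumes roughly 20 mA per
// fully lit 8-bit channel at brightness 255 and a user-configured mA budget.
uint8_t limitBrightness(const uint32_t* buf, uint16_t len, uint8_t bri, uint32_t budgetMa) {
  uint64_t sum = 0;                                   // sum of all channel values (step 3)
  for (uint16_t i = 0; i < len; i++)
    for (int s = 0; s < 32; s += 8) sum += (buf[i] >> s) & 0xFF;

  // estimated draw once the bus applies global brightness
  uint32_t powerMa = (uint32_t)(sum * 20 * bri / (255 * 255));
  if (powerMa <= budgetMa) return bri;                // within budget: keep brightness

  // over budget: scale global brightness down for all pixels, not just the brightest (step 4)
  return (uint8_t)((uint64_t)bri * budgetMa / powerMa);
}
```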

@blazoncek
Collaborator

While I was working on an implementation of double buffering at a lower level than this PR (within our bus) and was checking NPBLg, I noticed that ApplyPostAdjustments() only does SetPixelColor(pix, Shader.Dim(GetPixelColor(pix))), as we are not using gamma correction.

So, the simplest* solution is to decouple BusDigital::setBrightness() from PolyBus::setBrightness() and then do PolyBus::setBrightness(_bri); PolyBus::applyPostAdjustments(); PolyBus::show(); in BusDigital::show(), followed by PolyBus::setBrightness(255) once PolyBus::show() completes. This will require a slight change in WS2812FX::estimateCurrentAndLimitBri(), but that is manageable (a simple scale8(R(c),_brightness) will do it).

In that case strip brightness is applied just before outputting to the pixels, and full brightness (for the shader) is restored after the data has been sent to the pixels.
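A stub of that call order (hypothetical stub signatures; the real PolyBus calls naturally take more arguments):

```cpp
#include <stdint.h>

// Stand-ins for the real PolyBus helpers, just to show the order of calls.
namespace PolyBusStub {
  void setBrightness(uint8_t) {}
  void applyPostAdjustments() {}         // NPBLg: re-dims every already-set pixel
  void show() {}
}

struct BusDigitalSketch {
  uint8_t _bri = 255;                    // strip brightness, possibly lowered by ABL
  void show() {
    PolyBusStub::setBrightness(_bri);    // brightness pushed into the Lg shader late
    PolyBusStub::applyPostAdjustments(); // scale pixels that were set at full brightness
    PolyBusStub::show();                 // output to the LEDs
    PolyBusStub::setBrightness(255);     // restore so the shader no longer dims new sets
  }
};
```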

simplest*: it will add complexity to BusDigital::getPixelColor(), as it will read scaled-down values on the next effect rendering pass until the pixels are overwritten. Unfortunately I cannot think of any good solution to this problem. (Perhaps adding a dirty flag to the Segment or WS2812FX class?)

On the other hand, the implementation of a double buffer at the Bus level is very clean and fast and needs no other adjustments, but it will require twice the amount of RAM.

@softhack007
Collaborator

Is this PR still active, or is it fully included in alt-buffer #3280?

@blazoncek
Collaborator

Is this PR still active, or is it fully included in alt-buffer #3280?

Replaced by it. But leave it open for now.
