Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove the FIFO thread #7568

Open
wants to merge 25 commits into
base: master
Choose a base branch
from
Open

Conversation

sakertooth
Copy link
Contributor

@sakertooth sakertooth commented Oct 31, 2024

This PR removes AudioEngine::fifoWriter, which is a thread that facilities the multiple buffering mechanism we have currently for our audio rendering. The way the mechanism works (to my understanding):

  1. The FIFO thread, when started, splits the buffer size into chunks of 256 sample frames and generates them as fast as it can.
  2. The FIFO thread stops generating audio when the number of chunks generated equal original_buffer_size / 256 and waits for someone to request a chunk.
  3. The main audio thread requests chunks from the FIFO and uses that for playback.
  4. While the main audio thread is requesting and receiving chunks, the FIFO thread continues rendering more data.

Pros to having this thread:

  1. Automation changes happen more regularly - Most processing in LMMS treat its parameters as having one value. That same value is used throughout the buffer. As a result, without the FIFO thread and the chunking of the full audio buffer, larger buffer sizes reduce the amount of automation that occurs through time.
    1. Sample-exactness can fix this for most parameters, which makes it so that parameters are treated as having values per sample in the buffer, rather than just one single value. There are complaints of having sample-exactness for most of our native parameters will cause major drops in performance, which is why removing this thread may not be that straightforward. Note: In the case of VST2 parameters, I believe sample-exactness cannot be applied, as sample-exact automation was added in VST3. This is to show that inevitably, not all the parameters can be made sample-exact, and most professional DAWs deal with this same issue and also have problems involving automation timing accuracy.
    2. I fixed this problem but was still able to remove the FIFO thread by doing the chunking on the callback thread.

Cons to having this thread:

  1. Export bugs - This was the main reason why I wanted to remove this thread. The communication between the main audio thread, the export thread, and the FIFO thread is complex and has lead to deadlocks when exporting like in Constant freezing at 0% during export #7320. Trying to fix the FIFO thread and remove the deadlock does not feel like a good use of time considering that it really shouldn't be there in the first place if things were more correctly implemented in the beginning.
  2. Higher buffer sizes != more performance - For reasons I am currently unaware of, higher buffer sizes do not necessarily imply more performance when using the FIFO thread (at least in this instance). I discovered this when I was investigating the infamous "automating VST knobs" bottleneck. The video I attached demonstrates a project automating a VST knob with an LFO controller with this branch merged in my build. In the video, you can see how having a buffer size of 1024 samples keeps the CPU meter at acceptable levels, while in contrast a buffer size of 256 samples has a bad impact on performance.
    1. This improvement in performance was only seen when this thread was removed and the main audio thread was directly rendering the audio buffers. Current master exhibits bad performance regardless of what buffer size you select in the settings when automating VST knobs, while here the situation is less of a problem if a higher buffer size is selected.
    2. Note: I checked out master and moved into a new branch in the video for some reason, but merged this branch and nothing else. I was planning to do a real fix, but @DomClark seems to be working on it already, and I quickly came up with the idea to see if this PR would help performance, so I merged it in and it did.

The solution now is to keep the chunking of the buffer size and do it on the audio callback thread, and still remove the FIFO thread.

output.mp4

@sakertooth
Copy link
Contributor Author

Changed the PR to still remove the FIFO thread but keep the chunking of the buffer size (to avoid problems with non sample accurate automation but to also fix the problems this thread has caused).

@sakertooth sakertooth marked this pull request as ready for review December 17, 2024 16:10
@sakertooth sakertooth added needs code review A functional code review is currently required for this PR needs testing This pull request needs more testing labels Dec 18, 2024
@sakertooth sakertooth linked an issue Dec 21, 2024 that may be closed by this pull request
1 task
@Rossmaxx
Copy link
Contributor

This shoulda had higher priority. @LMMS/testers

@Rossmaxx
Copy link
Contributor

Wait the tester role is only for peki?

Copy link
Contributor

@Rossmaxx Rossmaxx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fatloss on this is insane. I am not approving for now because I am doubtful if something might have gotten lost somewhere. I will approve after i get around to testing.

Also, don't understand what is now happening with the chunking mechanism. Is the audio thread processing in chunks of 256 and then outputting to the playback/export at the user defined buffer size? Or is there something I missed.

src/core/AudioEngine.cpp Outdated Show resolved Hide resolved
src/core/AudioEngine.cpp Outdated Show resolved Hide resolved
src/core/audio/AudioDevice.cpp Show resolved Hide resolved
@bratpeki
Copy link
Member

bratpeki commented Jan 16, 2025

@LMMS/testers

Reporting!

I'm busy with managing #7477, #7444 and #7366.

Is this making the code more readable without reducing performance? Or is it a performance boosting PR? In any case, great stuff, I'll look at it when I'm done with these three PRs!

@Rossmaxx
Copy link
Contributor

This is supposed to solve the performance degradation on master compared to 1.2 and the rare case with deadlocks making a mess, along with the buffer size and automation performance, among other stuff. @sakertooth correct me if I'm wrong. If what i guess is right, this PR has the potential to straighten up a lot of the performance bugs.

@bratpeki
Copy link
Member

Wait the tester role is only for peki?

For the time being, yeah. I think I get rights to review PRs and whatnot, so it's probably not wise to give it away to anyone wanting to test, but I'm not entirely sure about if that's how it behaves, just speculation.

@sakertooth
Copy link
Contributor Author

This is supposed to solve the performance degradation on master compared to 1.2 and the rare case with deadlocks making a mess, along with the buffer size and automation performance, among other stuff. @sakertooth correct me if I'm wrong. If what i guess is right, this PR has the potential to straighten up a lot of the performance bugs.

It's mostly to fix deadlock bugs involving this thread waiting forever to close. It's less of a performance benefit (though that could be possible) but more so to simplify the thread communication within the audio pipeline.

@CorruptVoidSoul
Copy link

So I tested this PR.
Exporting works fine, no hearable differences.
Playback is still pretty much the same as on master, I made it play the performance intensive part of a song, my phone is still in pain, no noticeable gain or loss there.
I had two random crashes but I expect this to be a me problem because well, it's a phone and this mmpz is uhh... a bit tough sometimes.
Still on the "me problem" side, usually Equalizer takes a minute to load for the first time, in this PR it takes 30 seconds so it's better, not gonna complain.
Now back on playback, the playhead will desync itself from the actual sound after a huge lag spike, but that's a general performance problem, it doesn't have much to do with this PR I think.

Final feedback is this seems good, did Peki test that too ? There could be differences with a computer that can handle heavy workload.

@sakertooth sakertooth removed the needs testing This pull request needs more testing label Jan 17, 2025
@sakertooth
Copy link
Contributor Author

Final feedback is this seems good, did Peki test that too ? There could be differences with a computer that can handle heavy workload.

I don't think Peki has tested this yet, but they're probably busy with other PRs so it's fine. Thank you for testing this for me 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs code review A functional code review is currently required for this PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Constant freezing at 0% during export
4 participants