Upstairs periodically sends flush commands to Downstairs, even if the guest is not asking for disk flushes. Originally this was every 5 seconds. I believe, though am not sure, that it was added to prevent jobs piling up in the upstairs queue.
We adjusted this to 0.5 seconds back when we were still using the SQLite backend (crucible/upstairs/src/lib.rs, line 9675 at 8757b3f):

let flush_timeout_secs = opt.flush_timeout.unwrap_or(0.5);

We adjusted to 0.5 because, at the time, the cost of an extent flush required a lot of work clearing old metadata contexts out of the SQLite database. That work scaled directly with the number of writes that had hit an extent since the last flush, and was causing some pretty terrible latency bubbles that we wanted to avoid. Bryan did some testing and found that 0.5 gave the best results (#757). Note that this was also pre fast-write-ack.
And it remains 0.5 to this day (crucible/upstairs/src/upstairs.rs, line 396 at 7d6c7e1). This is configurable in the VCR, but Nexus is written to always pass None and accept our default.
Why we might want to change it
Well, for one thing, we are not on the SQLite backend anymore, and the new backend has different flush performance characteristics. But also, we may be sending more fsyncs to ZFS than the guest actually cares about, and those have a cost.
Questions
Does adjusting the flush timeout affect performance or CPU usage?
If it is also still serving the role of preventing job pileup, is it still the right mechanism for that?
The flush timeout does still play a role in preventing job pileup. For replay to work, we keep a list of all jobs since the last ack'd flush; if a downstairs goes away and comes back, we replay all the work since that last confirmed flush. The frequent flushes let us discard that work rather than keeping it all in memory. So, we would need to find another way to keep replay working correctly if we want to increase the flush timeout.
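As a rough illustration of that replay mechanism: a minimal sketch, where all names (ReplayBuffer, Job, flush_acked, etc.) are hypothetical and not Crucible's actual types. The point is just that an acked flush acts as a trim point for the in-memory replay list, which is why flushing more often keeps the buffer small.

```rust
// Hypothetical sketch of the replay-buffer behavior described above.
// These names are illustrative, not Crucible's real API.

#[derive(Clone, Debug, PartialEq)]
enum Job {
    Write(u64),
    Flush(u64),
}

struct ReplayBuffer {
    // All jobs issued since the last ack'd flush; these would be
    // replayed to a downstairs that disconnects and reconnects.
    pending: Vec<Job>,
}

impl ReplayBuffer {
    fn new() -> Self {
        Self { pending: Vec::new() }
    }

    fn submit(&mut self, job: Job) {
        self.pending.push(job);
    }

    // Once a flush is confirmed by the downstairs, everything up to
    // and including it is durable, so it can be dropped from memory.
    fn flush_acked(&mut self, flush_id: u64) {
        if let Some(pos) =
            self.pending.iter().position(|j| *j == Job::Flush(flush_id))
        {
            self.pending.drain(..=pos);
        }
    }

    // On reconnect, replay whatever is still buffered.
    fn replay(&self) -> &[Job] {
        &self.pending
    }
}

fn main() {
    let mut buf = ReplayBuffer::new();
    buf.submit(Job::Write(1));
    buf.submit(Job::Write(2));
    buf.submit(Job::Flush(3)); // periodic flush, e.g. every 0.5 s
    buf.submit(Job::Write(4));

    // Before the flush is ack'd, a reconnect would replay everything.
    assert_eq!(buf.replay().len(), 4);

    // After the flush is confirmed, jobs up through the flush are
    // discarded; only work newer than the flush must be kept.
    buf.flush_acked(3);
    assert_eq!(buf.replay(), &[Job::Write(4)]);
    println!("ok");
}
```

Under this model, a longer flush timeout directly means a longer trim interval, so the buffer (and the replay on reconnect) grows proportionally.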