
revisit flush_timeout #1358

Open
faithanalog opened this issue Jun 17, 2024 · 1 comment

@faithanalog (Contributor) commented Jun 17, 2024

Background:

Upstairs periodically sends flush commands to Downstairs, even if the guest is not asking for disk flushes. Originally this was every 5 seconds. I believe, though I am not sure, that it was added to prevent jobs from piling up in the upstairs queue.
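As a rough illustration of that behavior, here is a minimal sketch of a timer-driven automatic flush, assuming a tokio runtime; automatic_flush_loop and submit_flush are hypothetical names, not the actual Upstairs code:

use std::time::Duration;

// Minimal sketch only: issue a flush on the guest's behalf every
// flush_timeout_secs seconds. The real Upstairs only needs to flush when
// there is outstanding unflushed work; this just shows the cadence.
async fn automatic_flush_loop(flush_timeout_secs: f32) {
    let period = Duration::from_secs_f32(flush_timeout_secs);
    loop {
        tokio::time::sleep(period).await;
        submit_flush().await;
    }
}

// Hypothetical placeholder for submitting a flush job to the Downstairs.
async fn submit_flush() {}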

We adjusted this to 0.5 seconds back when we were still using the sqlite backend:

let flush_timeout = timeout.unwrap_or(0.5);

We adjusted it to 0.5 because, at the time, an extent flush required a lot of work clearing old metadata contexts out of the sqlite database. That work scaled directly with the number of writes that had hit an extent since the last flush, and it was causing some pretty terrible latency bubbles that we wanted to avoid. Bryan did some testing and found that 0.5 gave the best results ( #757 ). Note that this was also before fast-write-ack.

And it remains 0.5 to this day:

let flush_timeout_secs = opt.flush_timeout.unwrap_or(0.5);

This is configurable in the VCR, but Nexus is written to always pass None and accept our default.
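For illustration, the Option plumbing above behaves like this; a sketch of the unwrap_or pattern only, where 2.0 is an arbitrary example value and none of this is the actual Nexus or VCR code:

fn main() {
    // What Nexus effectively does today: pass None and take the 0.5 s default.
    let from_nexus: Option<f32> = None;
    assert_eq!(from_nexus.unwrap_or(0.5), 0.5);

    // Setting a value in the VCR would mean carrying Some(value) through instead.
    let overridden: Option<f32> = Some(2.0);
    assert_eq!(overridden.unwrap_or(0.5), 2.0);
}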

Why we might want to change it

For one thing, we are not on the sqlite backend anymore, and the new backend has different flush performance characteristics. But also, we may be sending more fsyncs to ZFS than the guest actually cares about, and that has a cost.

Questions

  • Does adjusting the flush timeout affect performance or CPU usage?
  • If it is also still serving the role of preventing job pileup, is it still the right mechanism for that?
@leftwo (Contributor) commented Jun 21, 2024

The flush timeout does still play a role in preventing job pileup. For replay to work, we keep a list of all jobs since the last acked flush, and if a downstairs goes away and comes back, we replay all the work since that last confirmed flush. The frequent flushes let us discard that work instead of having to keep it in memory.

So, we would need to find another way of allowing replay to work correctly if we want to increase the flush timeout.
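To make the tradeoff concrete, here is a minimal sketch of the bookkeeping described above; Job and ReplayBuffer are hypothetical names, not Crucible's actual data structures. The point is that a longer flush interval means more jobs sit in this buffer waiting to be discarded:

use std::collections::VecDeque;

struct Job {
    id: u64,
    // write payload elided
}

#[derive(Default)]
struct ReplayBuffer {
    // Every job issued since the last acked flush.
    unconfirmed: VecDeque<Job>,
}

impl ReplayBuffer {
    // Each new job is retained until a later flush is acked.
    fn record(&mut self, job: Job) {
        self.unconfirmed.push_back(job);
    }

    // An acked flush means everything at or before flush_id is durable on
    // that downstairs, so it no longer needs to be kept for replay.
    fn on_flush_ack(&mut self, flush_id: u64) {
        self.unconfirmed.retain(|job| job.id > flush_id);
    }

    // If a downstairs goes away and comes back, everything still retained
    // gets resent.
    fn jobs_to_replay(&self) -> &VecDeque<Job> {
        &self.unconfirmed
    }
}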
