Only send flushes when Downstairs is idle; send Barrier otherwise #1505
Rebased to stage on top of #1507, because we want to only send |
panic!("expected Barrier, got message {m:?}");
}
harness.ds2.ack_barrier().await;
harness.ds3.ack_barrier().await;
Do we have any way to check that replay is no longer available to this downstairs? I'm not sure if the test has access at this layer to it.
The end of this unit test is checking that live-repair works, so I think we're implicitly testing that we don't do replay.
Ah, I was thinking more that the sending of a barrier operation has also flipped the `can_replay` in the `Downstairs` struct. I'm not sure if we can do that though here (or in a test elsewhere), as `can_replay` might not be exposed like that.
Yeah, it's not easy to check from the outside (when we only have the `Guest` handle). If you really want it, we could probably add it to the `downstairs_state` helper function.
I don't think it's worth adding it to the `downstairs_state` helper function.
I don't see anywhere else that seems like a good place either. Nothing in the upstairs tests covers this case.
With this, we can probably close #1358 as fixed.
We should probably tighten up `IO_CACHED_MAX_BYTES` soon. Right now we can theoretically buffer up to 1 GiB of jobs for replay, and that feels like quite a lot to me. We usually won't: we will only get to that point if the guest is doing large amounts of IO, continuously so that we never go idle, and the guest never sends a flush. That's unlikely during normal filesystem operations (but will be easier to hit when writing to the raw block device like iodriver does). Because of that, we shouldn't expect more than a few VMs to hit this under normal operation. But in the interest of not overcommitting resources, I think that bound should be lower.
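To make the concern concrete, here is a minimal sketch of byte-bounded accounting for jobs held for replay. It is not Crucible's implementation: `ReplayCache`, `MAX_CACHED_BYTES`, and the idea that hitting the cap is the trigger for sending a `Barrier` are all assumptions made for illustration. Only the 1 GiB figure comes from the comment above.

```rust
/// Hypothetical cap, set to the 1 GiB figure mentioned above.
/// This is NOT the real `IO_CACHED_MAX_BYTES` definition.
const MAX_CACHED_BYTES: u64 = 1 << 30;

/// Hypothetical tracker for bytes held by completed jobs that are
/// being kept around in case they must be replayed.
#[derive(Debug, Default)]
struct ReplayCache {
    cached_bytes: u64,
}

impl ReplayCache {
    /// Account for one more completed job of `job_bytes` bytes.
    /// Returns `true` once the cap is reached, which (by assumption)
    /// is when the caller would send a Barrier to retire old jobs.
    fn push(&mut self, job_bytes: u64) -> bool {
        self.cached_bytes = self.cached_bytes.saturating_add(job_bytes);
        self.cached_bytes >= MAX_CACHED_BYTES
    }

    /// Drop all cached jobs, as would happen when they are retired.
    fn retire_all(&mut self) {
        self.cached_bytes = 0;
    }
}
```

With a cap this large, a guest that streams big writes without ever flushing or going idle keeps the full 1 GiB resident until the cap forces retirement, which is the overcommit risk described above. Lowering the constant shortens that window at the cost of more frequent retirement.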
This PR removes automatic flushes, per RFD 518. Instead, the new `Barrier` operation is sent. If the system is idle for a particular amount of time, we send a final `Flush` to put everything into a known state.

When the Upstairs retires jobs after a barrier operation, the system as a whole becomes ineligible for replay. This state determines whether the new Downstairs reconnects through `Offline` (which does replay) or `Faulted` (which does live-repair instead).

Removing automatic flushes is a noticeable performance improvement:
(tested on the London mini-cluster, with Upstairs and 3x Downstairs on different sleds)
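The behavior described in this PR can be sketched as a small state machine. This is an illustrative model only, not the code in this PR: `SyncPolicy`, `SyncOp`, `ReconnectState`, and all method names are hypothetical, and the sketch assumes that a `Flush` leaves replay eligibility unchanged, which the description does not specify.

```rust
use std::time::{Duration, Instant};

// All names below are hypothetical illustrations, not Crucible's real types.

/// Which synchronization operation would be sent next.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SyncOp {
    Flush,
    Barrier,
}

/// Which path a reconnecting Downstairs would take.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ReconnectState {
    /// Replay is still possible.
    Offline,
    /// Replay is not possible; live-repair is needed instead.
    Faulted,
}

#[derive(Debug)]
struct SyncPolicy {
    idle_timeout: Duration,
    last_io: Instant,
    can_replay: bool,
}

impl SyncPolicy {
    fn new(idle_timeout: Duration, now: Instant) -> Self {
        Self { idle_timeout, last_io: now, can_replay: true }
    }

    /// Note that guest IO happened at `now`, resetting the idle clock.
    fn record_io(&mut self, now: Instant) {
        self.last_io = now;
    }

    /// Send a Flush only once the system has been idle for the timeout;
    /// while IO is still arriving, send a Barrier instead.
    fn next_sync_op(&self, now: Instant) -> SyncOp {
        if now.saturating_duration_since(self.last_io) >= self.idle_timeout {
            SyncOp::Flush
        } else {
            SyncOp::Barrier
        }
    }

    /// Retiring jobs after a Barrier makes the system ineligible for replay.
    /// Assumption: a Flush leaves eligibility unchanged in this sketch.
    fn retire_after(&mut self, op: SyncOp) {
        if op == SyncOp::Barrier {
            self.can_replay = false;
        }
    }

    /// Replay-eligible systems reconnect through Offline, others through Faulted.
    fn reconnect_state(&self) -> ReconnectState {
        if self.can_replay {
            ReconnectState::Offline
        } else {
            ReconnectState::Faulted
        }
    }
}
```

Passing `now` explicitly rather than calling `Instant::now()` inside the methods keeps the sketch deterministic, so the idle and busy paths can both be exercised without sleeping.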