Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Phil/pruning is hard #362

Merged
merged 3 commits into from
Jan 31, 2024
Merged

Phil/pruning is hard #362

merged 3 commits into from
Jan 31, 2024

Conversation

psFried
Copy link
Contributor

@psFried psFried commented Jan 30, 2024

This reverts the previous attempt to fix recovery log pruning, and replaces it with a slightly different approach. Individual commit messages have more details, but the gist is to avoid trying to merge segment sets from different hints, and instead just check each set of hints separately.

One question I have is how to handle recovery logs that are used by multiple shards (that have since split). For example, say we have shards a/0 and a/1, which both have hints covering a/log. As of this PR, we would still potentially be merging the hints from multiple shards into a single set, since the maps are keyed on journal name and hints name, and both shards will have hints with the same name. So I'm thinking it may be safer to key the hints on journal name and a combined shardId+hintsName (for example "a/log": { "a/0-primary": <hints>, "a/1-primary": <hints>, ...}). But I'm not really sure whether that's necessary, so curious to get your thoughts.


This change is Reviewable

This reverts commit e2cace3.

Unfortunately, this did not actually work in practice to allow shard pruning to
work based on the combined set of backup and primary hints. Trying a new thing
in the next commit.
@psFried psFried force-pushed the phil/pruning-is-hard branch 2 times, most recently from bb00810 to 07c98c9 Compare January 31, 2024 17:05
Copy link
Contributor

@jgraettinger jgraettinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM % comments, thanks

@@ -101,7 +101,11 @@ func (cmd *cmdShardsPrune) Execute([]string) error {
metrics.fragmentsTotal++
metrics.bytesTotal += spec.ContentLength()

if len(segments.Intersect(journal, spec.Begin, spec.End)) == 0 {
// belt-with-suspenders, to ensure we never try to prune un-persisted fragments.
if spec.End == 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All fragments -- even un-persisted ones -- are always placed within the journal (have a Begin and End).
AFAIK it's not even possible for brokers to return a fragment with End == 0

What lead to this check?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What lead to this check?

Just a bit of paranoia, and feeling like the zeroing of LastOffset in the last segment seemed a bit tricky. I'll remove it

@@ -123,6 +127,16 @@ func (cmd *cmdShardsPrune) Execute([]string) error {
return nil
}

func overlapsAnySegment(segments []recoverylog.Segment, fragment pb.Fragment) bool {
for _, seg := range segments {
if (seg.FirstOffset < fragment.End || fragment.End == 0) &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd expect the fragment.End check can be dropped per above.

Otherwise, I ran through a bunch of mental exercises proving to myself this otherwise handles all of the possible overlap case. I presume you did the same. I couldn't come up with any cases where it misbehaves.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I presume you did the same.

Yeah, and also updated the tests to use this function so that we're exercising it. I waffled a bit on whether to add unit tests for just this function, and would be happy to do so if you think it'd help

psFried and others added 2 commits January 31, 2024 13:16
…tely

When pruning recovery logs,  keep a slice of all the segments from each set of hints.
Then check each segment separately, pruning only fragments that are not overlapped by
any of them. This works around difficulties we've had in the past with merging
segments from different hints into a single `SegmentSet` (which is not what
it was designed for).
@psFried psFried merged commit 4caca87 into master Jan 31, 2024
1 check passed
@psFried psFried deleted the phil/pruning-is-hard branch January 31, 2024 18:28
psFried added a commit to estuary/flow that referenced this pull request Jan 31, 2024
psFried added a commit to estuary/flow that referenced this pull request Jan 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants