Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enhance backup ready conditions #619
Enhance backup ready conditions #619
Changes from all commits
82aa681
3a3f343
44cc1ae
9645733
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no real need for
createBackupConditionResult
function. You do not save on the number of lines of code, in-fact you have more lines of code :) and there is no readability improvement over just creating an instance of a struct.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
conType: druidv1alpha1.ConditionTypeBackupReady
is common to all backupReady condition result, so it made sense to pull it out into a separate function, just to avoid adding theconType
every single time when returning. Of course, the previous method was to create a default result at the beginning of the function and then simply change the values when returning, but @seshachalam-yv pointed out that it was not the most readable, since one has to check the default result as well as the changed values to figure out the final result being returned.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have create functions for delta and full snapshot grace period? These can be methods on etcd resource itself, if you do not like that then these can just be standalone functions. One single place to compute these preventing duplicating these multipliers in more than one place in future code changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you notice, the grace period isn't same across functions. Grace period for renewing delta snapshot lease that was already renewed previously is 2x delta snapshot period, while grace period for delta snapshotting to start once full snapshot has been taken is 5x delta snapshot period. So it's not achievable by one single function. Also, this is condition-specific information, and may not make sense to be part of the API, because the API has no knowledge of which conditions are set on the etcd status by druid, and it shouldn't as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not entirely convinced if this results in
BackupFailed
. Can you semantically define this term to better gauge if this is the correct reason code? The reason is that you still have a full-snapshot that has been successfully backed-up. Should you call thisBackupFailed
orDeltaSnapshotBackupFailed
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That again now depends on how granular we want our
Reason
codes to be.BackupFailed
denotes that either of the backups failed, and either case is dangerous. Without a full snapshot taken on time, we risk longer restoration time and hence a longer RTO. And without a delta snapshot taken on time, we violate SLAs since RPOs is affected. So differentiating between which snapshot failed in theReason
does not provide much benefit to operators. If they want more info on whyBackupFailed
was set, they can look into theMessage
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not clear on why the reason code should be
Unknown
. Also using the same condition for full and delta is quite confusing to correctly determine the status of the overall backup. If we now see this condition in the etcd status.Conditions then one must always remember that this condition means that a full snapshot was indeed taken and we are waiting for delta snapshot backups to be taken. This is not clear from this condition as we have decided to overload one condition for 2 different snapshots.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed that something like
PartiallyUnknown
sounds better thanUnknown
. But it simply denotes that there is an "unknown" in the cluster that druid does not know about, hence it sets this reason. It provides more detailed information in the message.