Implement recreate ExistingResourcePolicy to restore API #6354

kaovilai · 2023-06-05T13:55:43Z

HackMD where we discussed usecases/solutions: https://hackmd.io/KKWcslVSR3yysufvC76MHg#Solutions-for-usecase
Signed-off-by: Tiger Kaovilai [email protected]

Thank you for contributing to Velero!

Please add a summary of your change

Does your change fix a particular issue?

Fixes #6142

Please indicate you've done the following:

Accepted the DCO. Commits without the DCO will delay acceptance.
Created a changelog file or added /kind changelog-not-required as a comment on this pull request.
Updated the corresponding documentation in site/content/docs/main.

pkg/restore/restore.go

codecov-commenter · 2023-06-25T19:51:28Z

Codecov Report

Attention: Patch coverage is 14.28571% with 66 lines in your changes are missing coverage. Please review.

Project coverage is 61.54%. Comparing base (1579842) to head (447845d).

Files	Patch %	Lines
pkg/restore/restore.go	12.16%	64 Missing and 1 partial ⚠️
pkg/cmd/cli/restore/create.go	66.66%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #6354      +/-   ##
==========================================
- Coverage   61.66%   61.54%   -0.12%     
==========================================
  Files         263      263              
  Lines       28634    28700      +66     
==========================================
+ Hits        17657    17664       +7     
- Misses       9734     9794      +60     
+ Partials     1243     1242       -1

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

anshulahuja98 · 2023-07-19T08:27:40Z

design/Implemented/existing-resource-policy_design.md

@@ -55,10 +55,11 @@ skip restoration.
   - Changed resources: Velero will first try to patch the changed resource, Now if the patch:
       - succeeds: Then the in-cluster resource gets updated with the labels as well as the resource diff
       - fails: Velero adds a restore warning and tries to just update the backup/restore labels on the resource, if the labels patch also fails then we add restore error.
-3. `recreate`:  If resource already exists, then Velero will delete it and recreate the resource.
+3. `recreate`:  Similar to `update` but if resource already exists and is immutable (such as Pods), then Velero attempt to recreate the resource by deleting it first.


did you try this out with CSI PVC restore?
will it make sure that the original PVC is deleted and new PVC with CSI Snapshot source is created.

I believe this has a lot of value in that context of doing actual volume restore which was not supported by current none / patch scenarios.

@anshulahuja98 without any special code for specific resource types (which I don't think we have right now), we would only delete and recreate something if:

the kubernetes metadata in-cluster is different from what's in-cluster

the attempt to patch the resource failed.

Also, at the point where the delete-and-recreate happens, all plugins, etc. have already run. However, it's possible that this will actually work if there are immutable fields that we're attempting to patch here. Otherwise, there may be value in making an exception for the PVC itself in the code. I think that may be better handled as a follow-on change to handle the specific PVC use case, since there may be a number of edge cases to deal with -- CSI-with-datamover, CSI-without-datamover, native snapshots, etc.

I thought further on this. I am now trying to think from the angle of, what is recreate exactly solving for the end user. I have a worry that it might lead to the cluster in a bad shape than expected restore with the current design approach of deleting if patch fails.

For example we have a deployment and a pvc attached to it. As per the RestoreORder used by velero today it will first go to PVC -> patch will fail most likely(let's assume it will) -> velero will try to delete PVC but it will never complete since PVC is mounted on deployment -> velero will now move forward after just marking PVC for deletion -> Velero will now try to restore deployment (where let's assume recreate kicks in ) -> Deployment gets deleted -> leading to the terminating state PVC getting cleaned up -> Deployment created but PVC is gone / or if recreate didn't kick in for deployment, PVC will be in terminating state.

As a result we caused the user data loss by marking the current PVC deleted while not restoring the data back. The cluster would be in a bad state. IMO: this is kind of a P0 in my opinion to avoid users shooting themselves in the foot.

I believe from k8s core API this is very basic use case which we should think how it can be supported as part of the design or even if a follow on PR can solve this properly if we special case PVC. The question is - is the design extensible in current velero to support it properly?

I have a similar concern with problems because of complex relationships between objects:

An object may belong to an application which consists of multiple objects

If one object exists at the time of restore, it may indicate that the entire application is running with many other objects

We don't know the relationship of these objects, there may be one possibility -- one other object is monitoring the kind of the current object, if it is missing, a new one with the same kind & name will be created

Then even though we successfully delete the object as part of the recreate logic, there may still be further problems:

Either the creation of the object as the other part of the recreate logic will fail because the application recreates the object first

Or the existing application itself goes to a chaos or modifies the newly created object erroneously which cause the object unusable

Finally, it looks like we have successfully created all the objects due for the restore, but the restored application cannot work appropriately

Totally speaking, this indicates that it is more safe to delete the application/namespace as a whole instead of deleting single objects one by one for this recreate logic, because the latter is not in a consistent manner.

@anshulahuja98 "I thought further on this. I am now trying to think from the angle of, what is recreate exactly solving for the end user. I have a worry that it might lead to the cluster in a bad shape than expected restore with the current design approach of deleting if patch fails."

The main thing this is supposed to resolve is if the backup contains different metadata for immutable fields (or for immutable objects). If the patch attempts to modify an immutable field or an immutable resource, it will fail.

As for the concern that this will break things in some cases (PVCs, etc.), I had similar concerns, which is why my original recommendation was to implement this after we had implemented the per-resource-type overrides from the original design document. Right now, policy is all-or-none. We either do nothing across the board, or we attempt to patch resources across the board, or (with this PR), attempt to delete-and-recreate for everything that fails patching. I'm starting to think we may want to first implement the per-resource override functionality and then the recreate policy. That would allow us, for example, to set the update/patch policy for (almost) everything, but then recreate for just the specific resource type that the user has hit patch failures with. This will limit the potential breakage from the recreate policy.

this might work for purely k8s objects backup.. if there are other CRDs for other controllers, the order of creation of objects may matter and therefore the chances that everything work is not guaranteed.. so perhaps we introduce it as an alpha feature?

your point is valid for other controllers as well.
But a mass deletion before creation of resources / deleting entire namespaces is still more likely to succeed. Since atleast resources do not stay stuck in deletion for the most part.
I believe we need to go back to design phase for this. Introducing current impl as alpha feature is not useful for any real customer scenario and will introduce more complications in understanding.
But that is just my opinion on this particular feature. Others please share your thoughts.

It might be useful to check compete offerings on how they tackle this issue to come up with the design.

a mass deletion before creation of resources / deleting entire namespaces is still more likely to succeed.

I disagree on this part.. I've seen resource unable to be created/applied because it's being deleted (blocked by finalizer).

unless we also mass remove finalizers

pkg/restore/restore.go

anshulahuja98 · 2023-07-19T14:01:43Z

pkg/restore/restore.go

+			time.Sleep(10 * time.Second)
+			continue
+		}
+		break


It might be a very common scenario that the resources have finalizers which prevent deletion.
Can we perhaps remove those finalizers if deletion doesn't happen in 2mins and try deletion again.

Or otherwise can we try to think through these finalizer cases...?

how often are finalizer workloads going to restore/delete cleanly with just a simple finalizer removal?

I think most controllers with finalizer just need extra time to process deletion and go away on its own.. others are intentional deletion blocks that need more attention for deletion.

hmm, I get your point. I think we can skip it then as part of the current work.
My thought around finalizers stemmed from the points I have put in the comment above.

resolving this for now.

Similarly, I think there are more cases that the deletion of objects never succeeds in the recreate situation because of various unpredicted mechanisms, i.e., orders of deletion. Here is one example:

As the current restore order, the restore of VolumeSnapshot objects is prior to PVC objects

Suppose there is a PVC with its DataSource referring to a VolumeSnapshot object

If we delete the VolumeSnapshot object first, it will always fail and complaining that this operation is blocked because a PVC is referring it.

As far as I remember, the VS object is not marked as on-deletion in this case, that is, even though we delete the PVC later, the VS will not be deleted

@kaovilai "others are intentional deletion blocks that need more attention for deletion." -- this could be a problem for restore. Following my comment above about implementing this after we have per-resource-type overrides, maybe we could have 2 different policies:

recreate-if-no-finalizers -- this would attempt to patch, and if patch failed, would delete-and-recreate only if there are no finalizers on the resource.

recreate-and-delete-finalizers -- this would attempt to patch and if patch failed, would remove finalizers before attempting to delete

The latter is potentially dangerous, so we would not want to allow it as an overall policy, but might make sense for some specific named resource/crd that the user knows is safe to delete-and-recreate in this way.

shubham-pampattiwar · 2023-07-19T16:55:12Z

pkg/restore/restore.go

+	for timeStarted := time.Now(); apierrors.IsNotFound(err) || time.Now().After(timeStarted.Add(2*time.Minute)); {
+		_, err = resourceClient.Get(obj.GetName(), metav1.GetOptions{})
+		if !apierrors.IsNotFound(err) {
+			ctx.log.Warnf("get attempt to check object is gone failed for %s %s: %v", fromCluster.GroupVersionKind(), kube.NamespaceAndName(fromCluster), err)


Maybe update this warning to add a little context on why we are fetching the object

pkg/apis/velero/v1/restore_types.go

shubham-pampattiwar · 2023-07-19T16:56:21Z

@kaovilai The PR needs a rebase.

Lyndon-Li · 2023-07-20T15:16:41Z

pkg/restore/restore.go

+			time.Sleep(10 * time.Second)
+			continue
+		}
+		break


Similarly, I think there are more cases that the deletion of objects never succeeds in the recreate situation because of various unpredicted mechanisms, i.e., orders of deletion. Here is one example:

As the current restore order, the restore of VolumeSnapshot objects is prior to PVC objects

Suppose there is a PVC with its DataSource referring to a VolumeSnapshot object

If we delete the VolumeSnapshot object first, it will always fail and complaining that this operation is blocked because a PVC is referring it.

As far as I remember, the VS object is not marked as on-deletion in this case, that is, even though we delete the PVC later, the VS will not be deleted

Lyndon-Li · 2023-07-20T15:24:29Z

design/Implemented/existing-resource-policy_design.md

@@ -55,10 +55,11 @@ skip restoration.
   - Changed resources: Velero will first try to patch the changed resource, Now if the patch:
       - succeeds: Then the in-cluster resource gets updated with the labels as well as the resource diff
       - fails: Velero adds a restore warning and tries to just update the backup/restore labels on the resource, if the labels patch also fails then we add restore error.
-3. `recreate`:  If resource already exists, then Velero will delete it and recreate the resource.
+3. `recreate`:  Similar to `update` but if resource already exists and is immutable (such as Pods), then Velero attempt to recreate the resource by deleting it first.


I have a similar concern with problems because of complex relationships between objects:

An object may belong to an application which consists of multiple objects

If one object exists at the time of restore, it may indicate that the entire application is running with many other objects

We don't know the relationship of these objects, there may be one possibility -- one other object is monitoring the kind of the current object, if it is missing, a new one with the same kind & name will be created

Then even though we successfully delete the object as part of the recreate logic, there may still be further problems:

Either the creation of the object as the other part of the recreate logic will fail because the application recreates the object first

Or the existing application itself goes to a chaos or modifies the newly created object erroneously which cause the object unusable

Finally, it looks like we have successfully created all the objects due for the restore, but the restored application cannot work appropriately

Totally speaking, this indicates that it is more safe to delete the application/namespace as a whole instead of deleting single objects one by one for this recreate logic, because the latter is not in a consistent manner.

Lyndon-Li · 2023-07-20T15:25:42Z

design/Implemented/existing-resource-policy_design.md


-*Note:* The `recreate` option is a non-goal for this enhancement proposal, but it is considered as a future scope.
-Another thing to highlight is that Velero will not be deleting any resources in any of the policy options proposed in 
+*Note:* The `recreate` option is being implemented in [#6354](https://github.com/vmware-tanzu/velero/pull/6354) separate from original proposal.


Other than just clarifying this, is it a better idea to modify this design in the current PR to elaborate on the detailed design of the recreate option?

If we're going through with the recreate option, then I can add more details here if desired.

Lyndon-Li · 2023-07-20T15:30:45Z

pkg/restore/restore.go

+	}
+	// Create object from latest backup/restore)
+	obj.SetNamespace(namespace)
+	_, err = resourceClient.Create(obj)


If the above polling timeout, it means the object still exists, then resourceClient.Create will definitely fail, so maybe we don't need to go with this try, but break by adding the error and explicitly saying that the deletion of the existing object failed

Thanks for the feedback!

kaovilai · 2023-08-02T20:18:49Z

from the community call this PR is bumped out of 1.12.0 release while we discuss and update designs.

addresses vmware-tanzu#6354 (review) Signed-off-by: Tiger Kaovilai <[email protected]>

draghuram · 2024-02-29T20:45:59Z

Since this a longstanding issue with many old comments, it would be nice if someone can highlight the overall changes being implemented. In particular, is there any resource type specific logic or delete+recreate is applied to all resource types? One particular scenario comes to mind: if a pod is deleted, deployment may try to start the pod, racing with Velero which also will try to create it?

I remember discussions about allowing "replace" for only specific resource types. I am guessing that is not being considered in this PR?

kaovilai · 2024-03-01T04:56:08Z

if someone can highlight the overall changes being implemented.

I'm just going to go with the simplest approach first, and see what reviewers think. Progress over perfection. I'm not sure there's any reasonable logic for recreate that would apply for all resource types; it would be specific to items that are not practical or possible to use update ExistingResourcePolicy.

is there any resource type specific logic

If it's universally applicable in all k8s, such as pods/deployments then I will try to make it work for that.
Other types such as custom resources will have to be user input logic, if user input configuration is implemented.

kaovilai · 2024-03-01T21:09:25Z

if a pod is deleted, deployment may try to start the pod, racing with Velero which also will try to create it?

For a pod owned by deployment, we probably don't bother delete+recreate if possible given that you can just update the deployment to achieve this effect.

The case that recreate could make sense would be a lone pod without a deployment/replicaset.

draghuram · 2024-03-01T21:18:03Z

Agreed. That is why I wanted to know if the replace logic uniformly tries to delete and recreate for all resource types or whether it incorporates some specific logic, depending on the resource type.

kaovilai · 2024-03-01T21:23:42Z

I wanted it to be on resource types, but in the issue there were suggestions by @anshulahuja98 that it may need two passes for deletion to prevent being stuck

If we do delete and restore 1 by 1, it is likely resources will stay stuck in deletion.

I will try to make it only delete specific types that's enabled, and only do so on ones without ownerReferences perhaps.

addresses vmware-tanzu#6354 (review) Signed-off-by: Tiger Kaovilai <[email protected]>

Signed-off-by: Tiger Kaovilai <[email protected]> use wait.PollImmediate addresses vmware-tanzu#6354 (review) Signed-off-by: Tiger Kaovilai <[email protected]>

kaovilai · 2024-03-05T21:30:56Z

Closing, consolidating PVC data restore in-place into #7481.

For a single immutable item recreate, the usecase is currently not being considered.

github-actions bot added the has-changelog label Jun 5, 2023

kaovilai commented Jun 5, 2023

View reviewed changes

pkg/restore/restore.go Outdated Show resolved Hide resolved

kaovilai force-pushed the existingresourcepolicy-delete-recreate branch 2 times, most recently from 6723627 to b8ea1c4 Compare June 11, 2023 10:52

kaovilai force-pushed the existingresourcepolicy-delete-recreate branch from b8ea1c4 to f8a0bcc Compare June 25, 2023 19:39

kaovilai force-pushed the existingresourcepolicy-delete-recreate branch 4 times, most recently from 290c65c to da6fdac Compare July 19, 2023 00:58

github-actions bot added the Documentation label Jul 19, 2023

kaovilai force-pushed the existingresourcepolicy-delete-recreate branch from da6fdac to 24e31cc Compare July 19, 2023 01:43

kaovilai marked this pull request as ready for review July 19, 2023 01:43

github-actions bot requested review from Lyndon-Li and ywk253100 July 19, 2023 01:43

github-actions bot assigned kaovilai Jul 19, 2023

reasonerjt added this to the v1.12 milestone Jul 19, 2023

anshulahuja98 reviewed Jul 19, 2023

View reviewed changes

anshulahuja98 requested changes Jul 19, 2023

View reviewed changes

pkg/restore/restore.go Outdated Show resolved Hide resolved

anshulahuja98 reviewed Jul 19, 2023

View reviewed changes

shubham-pampattiwar reviewed Jul 19, 2023

View reviewed changes

pkg/apis/velero/v1/restore_types.go Show resolved Hide resolved

Lyndon-Li reviewed Jul 20, 2023

View reviewed changes

kaovilai force-pushed the existingresourcepolicy-delete-recreate branch from 24e31cc to b635e85 Compare July 25, 2023 20:19

kaovilai added a commit to kaovilai/velero that referenced this pull request Aug 2, 2023

use wait.PollImmediate

6e25841

addresses vmware-tanzu#6354 (review) Signed-off-by: Tiger Kaovilai <[email protected]>

kaovilai added a commit to kaovilai/velero that referenced this pull request Aug 2, 2023

use wait.PollImmediate

51a87d1

addresses vmware-tanzu#6354 (review) Signed-off-by: Tiger Kaovilai <[email protected]>

kaovilai force-pushed the existingresourcepolicy-delete-recreate branch from 6e25841 to 51a87d1 Compare August 2, 2023 21:30

reasonerjt removed this from the v1.12 milestone Aug 9, 2023

kaovilai marked this pull request as draft September 13, 2023 19:47

kaovilai added a commit to kaovilai/velero that referenced this pull request Feb 29, 2024

use wait.PollImmediate

07c9d6c

addresses vmware-tanzu#6354 (review) Signed-off-by: Tiger Kaovilai <[email protected]>

kaovilai force-pushed the existingresourcepolicy-delete-recreate branch from 51a87d1 to 07c9d6c Compare February 29, 2024 04:33

kaovilai added a commit to kaovilai/velero that referenced this pull request Mar 1, 2024

use wait.PollImmediate

f0075aa

addresses vmware-tanzu#6354 (review) Signed-off-by: Tiger Kaovilai <[email protected]>

kaovilai force-pushed the existingresourcepolicy-delete-recreate branch from 07c9d6c to f0075aa Compare March 1, 2024 21:25

kaovilai added a commit to kaovilai/velero that referenced this pull request Mar 1, 2024

use wait.PollImmediate

8ed1cfb

addresses vmware-tanzu#6354 (review) Signed-off-by: Tiger Kaovilai <[email protected]>

kaovilai force-pushed the existingresourcepolicy-delete-recreate branch from f0075aa to 8ed1cfb Compare March 1, 2024 21:26

kaovilai added a commit to kaovilai/velero that referenced this pull request Mar 1, 2024

use wait.PollImmediate

cd499d1

addresses vmware-tanzu#6354 (review) Signed-off-by: Tiger Kaovilai <[email protected]>

kaovilai force-pushed the existingresourcepolicy-delete-recreate branch 2 times, most recently from cd499d1 to 2f21a24 Compare March 1, 2024 22:55

kaovilai added a commit to kaovilai/velero that referenced this pull request Mar 1, 2024

use wait.PollImmediate

2f21a24

addresses vmware-tanzu#6354 (review) Signed-off-by: Tiger Kaovilai <[email protected]>

kaovilai force-pushed the existingresourcepolicy-delete-recreate branch from 2f21a24 to f4a20fc Compare March 1, 2024 22:55

kaovilai force-pushed the existingresourcepolicy-delete-recreate branch from f4a20fc to 5658921 Compare March 2, 2024 03:23

Implement recreate ExistingResourcePolicy to restore API

447845d

Signed-off-by: Tiger Kaovilai <[email protected]> use wait.PollImmediate addresses vmware-tanzu#6354 (review) Signed-off-by: Tiger Kaovilai <[email protected]>

kaovilai force-pushed the existingresourcepolicy-delete-recreate branch from 5658921 to 447845d Compare March 2, 2024 03:29

anshulahuja98 mentioned this pull request Mar 4, 2024

Add design for only restore volume data. #7481

Open

3 tasks

kaovilai closed this Mar 5, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement recreate ExistingResourcePolicy to restore API #6354

Implement recreate ExistingResourcePolicy to restore API #6354

kaovilai commented Jun 5, 2023 •

edited

Loading

codecov-commenter commented Jun 25, 2023 •

edited by codecov bot

Loading

anshulahuja98 Jul 19, 2023

sseago Jul 19, 2023

anshulahuja98 Jul 20, 2023

Lyndon-Li Jul 20, 2023

sseago Jul 21, 2023

kaovilai Aug 1, 2023

anshulahuja98 Aug 2, 2023

anshulahuja98 Aug 2, 2023

kaovilai Aug 2, 2023

kaovilai Aug 2, 2023

anshulahuja98 Jul 19, 2023

kaovilai Jul 19, 2023

anshulahuja98 Jul 20, 2023

Lyndon-Li Jul 20, 2023

sseago Jul 21, 2023

shubham-pampattiwar Jul 19, 2023

shubham-pampattiwar commented Jul 19, 2023

Lyndon-Li Jul 20, 2023

Lyndon-Li Jul 20, 2023

Lyndon-Li Jul 20, 2023

kaovilai Jul 21, 2023

Lyndon-Li Jul 20, 2023

kaovilai Jul 21, 2023

kaovilai commented Aug 2, 2023

draghuram commented Feb 29, 2024

kaovilai commented Mar 1, 2024 •

edited

Loading

kaovilai commented Mar 1, 2024

draghuram commented Mar 1, 2024

kaovilai commented Mar 1, 2024

kaovilai commented Mar 5, 2024

Implement recreate ExistingResourcePolicy to restore API #6354

Implement recreate ExistingResourcePolicy to restore API #6354

Conversation

kaovilai commented Jun 5, 2023 • edited Loading

Please add a summary of your change

Does your change fix a particular issue?

Please indicate you've done the following:

codecov-commenter commented Jun 25, 2023 • edited by codecov bot Loading

Codecov Report

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

shubham-pampattiwar commented Jul 19, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kaovilai commented Aug 2, 2023

draghuram commented Feb 29, 2024

kaovilai commented Mar 1, 2024 • edited Loading

kaovilai commented Mar 1, 2024

draghuram commented Mar 1, 2024

kaovilai commented Mar 1, 2024

kaovilai commented Mar 5, 2024

kaovilai commented Jun 5, 2023 •

edited

Loading

codecov-commenter commented Jun 25, 2023 •

edited by codecov bot

Loading

kaovilai commented Mar 1, 2024 •

edited

Loading