Add detailed backup information in the MedusaBackup CRD status (#1047)
adejanovski authored Sep 14, 2023
1 parent 8614857 commit a843b32
Showing 11 changed files with 266 additions and 14 deletions.
1 change: 1 addition & 0 deletions CHANGELOG/CHANGELOG-1.9.md
@@ -15,6 +15,7 @@ When cutting a new release, update the `unreleased` heading to the tag being gen

 ## unreleased

+* [ENHANCEMENT] [#1046](https://github.com/k8ssandra/k8ssandra-operator/issues/1046) Add detailed backup information in the MedusaBackup CRD status
 * [BUGFIX] [#1027](https://github.com/k8ssandra/k8ssandra-operator/issues/1027) Point system-logger image to use the v1.16.0 tag instead of latest
 * [BUGFIX] [#1026](https://github.com/k8ssandra/k8ssandra-operator/issues/1026) Fix DC name overrides not being properly handled
 * [BUGFIX] [#981](https://github.com/k8ssandra/k8ssandra-operator/issues/981) Fix race condition in K8ssandraTask status update
20 changes: 18 additions & 2 deletions apis/medusa/v1alpha1/medusabackup_types.go
@@ -37,12 +37,28 @@ type MedusaBackupSpec struct {

 // MedusaBackupStatus defines the observed state of MedusaBackup
 type MedusaBackupStatus struct {
-	StartTime  metav1.Time `json:"startTime,omitempty"`
-	FinishTime metav1.Time `json:"finishTime,omitempty"`
+	StartTime     metav1.Time         `json:"startTime,omitempty"`
+	FinishTime    metav1.Time         `json:"finishTime,omitempty"`
+	TotalNodes    int32               `json:"totalNodes,omitempty"`
+	FinishedNodes int32               `json:"finishedNodes,omitempty"`
+	Nodes         []*MedusaBackupNode `json:"nodes,omitempty"`
+	Status        string              `json:"status,omitempty"`
 }
+
+type MedusaBackupNode struct {
+	Host       string  `json:"host,omitempty"`
+	Tokens     []int64 `json:"tokens,omitempty"`
+	Datacenter string  `json:"datacenter,omitempty"`
+	Rack       string  `json:"rack,omitempty"`
+}

 //+kubebuilder:object:root=true
 //+kubebuilder:subresource:status
+//+kubebuilder:printcolumn:name="Started",type=date,JSONPath=".status.startTime",description="Backup start time"
+//+kubebuilder:printcolumn:name="Finished",type=date,JSONPath=".status.finishTime",description="Backup finish time"
+//+kubebuilder:printcolumn:name="Nodes",type=string,JSONPath=".status.totalNodes",description="Total number of nodes at the time of the backup"
+//+kubebuilder:printcolumn:name="Completed",type=string,JSONPath=".status.finishedNodes",description="Number of nodes that completed this backup"
+//+kubebuilder:printcolumn:name="Status",type=string,JSONPath=".status.status",description="Backup completion status"

 // MedusaBackup is the Schema for the medusabackups API
 type MedusaBackup struct {
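A side effect of the `omitempty` tags on the new status fields is that zero values are dropped from the serialized status, so a backup with no finished nodes carries no `finishedNodes` key at all. The following minimal sketch illustrates this with local stand-in types copied from the diff above; the `metav1.Time` fields are left out so the sketch has no dependencies, and `renderStatus` is a hypothetical helper, not part of the operator.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local stand-ins for the status types in this diff. The time fields are
// omitted on purpose to keep the sketch dependency-free.
type MedusaBackupNode struct {
	Host       string  `json:"host,omitempty"`
	Tokens     []int64 `json:"tokens,omitempty"`
	Datacenter string  `json:"datacenter,omitempty"`
	Rack       string  `json:"rack,omitempty"`
}

type MedusaBackupStatus struct {
	TotalNodes    int32               `json:"totalNodes,omitempty"`
	FinishedNodes int32               `json:"finishedNodes,omitempty"`
	Nodes         []*MedusaBackupNode `json:"nodes,omitempty"`
	Status        string              `json:"status,omitempty"`
}

// renderStatus (hypothetical helper) marshals a status to JSON and panics
// on the unexpected marshalling error.
func renderStatus(s MedusaBackupStatus) string {
	out, err := json.Marshal(s)
	if err != nil {
		panic(err)
	}
	return string(out)
}

func main() {
	// FinishedNodes, Nodes and Status are zero values here, so omitempty
	// drops them: prints {"totalNodes":3}
	fmt.Println(renderStatus(MedusaBackupStatus{TotalNodes: 3}))

	// A fully populated status keeps every field, in struct order.
	fmt.Println(renderStatus(MedusaBackupStatus{
		TotalNodes:    3,
		FinishedNodes: 3,
		Status:        "SUCCESS",
		Nodes: []*MedusaBackupNode{
			{Host: "host1", Tokens: []int64{1, 2, 3}, Datacenter: "dc1", Rack: "rack1"},
		},
	}))
}
```

Consumers that read the status should therefore treat a missing `finishedNodes` or `totalNodes` key as zero rather than as an error.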
6 changes: 4 additions & 2 deletions apis/medusa/v1alpha1/medusabackupjob_types.go
@@ -49,8 +49,10 @@ type MedusaBackupJobStatus struct {
 	Failed []string `json:"failed,omitempty"`
 }

-//+kubebuilder:object:root=true
-//+kubebuilder:subresource:status
+// +kubebuilder:object:root=true
+// +kubebuilder:subresource:status
+// +kubebuilder:printcolumn:name="Started",type=date,JSONPath=".status.startTime",description="Backup start time"
+// +kubebuilder:printcolumn:name="Finished",type=date,JSONPath=".status.finishTime",description="Backup finish time"

 // MedusaBackupJob is the Schema for the medusabackupjobs API
 type MedusaBackupJob struct {
31 changes: 31 additions & 0 deletions apis/medusa/v1alpha1/zz_generated.deepcopy.go

(Generated file; the diff is not rendered.)

11 changes: 10 additions & 1 deletion config/crd/bases/medusa.k8ssandra.io_medusabackupjobs.yaml
@@ -15,7 +15,16 @@ spec:
     singular: medusabackupjob
   scope: Namespaced
   versions:
-  - name: v1alpha1
+  - additionalPrinterColumns:
+    - description: Backup start time
+      jsonPath: .status.startTime
+      name: Started
+      type: date
+    - description: Backup finish time
+      jsonPath: .status.finishTime
+      name: Finished
+      type: date
+    name: v1alpha1
     schema:
       openAPIV3Schema:
         description: MedusaBackupJob is the Schema for the medusabackupjobs API
47 changes: 46 additions & 1 deletion config/crd/bases/medusa.k8ssandra.io_medusabackups.yaml
@@ -15,7 +15,28 @@ spec:
     singular: medusabackup
   scope: Namespaced
   versions:
-  - name: v1alpha1
+  - additionalPrinterColumns:
+    - description: Backup start time
+      jsonPath: .status.startTime
+      name: Started
+      type: date
+    - description: Backup finish time
+      jsonPath: .status.finishTime
+      name: Finished
+      type: date
+    - description: Total number of nodes at the time of the backup
+      jsonPath: .status.totalNodes
+      name: Nodes
+      type: string
+    - description: Number of nodes that completed this backup
+      jsonPath: .status.finishedNodes
+      name: Completed
+      type: string
+    - description: Backup completion status
+      jsonPath: .status.status
+      name: Status
+      type: string
+    name: v1alpha1
     schema:
       openAPIV3Schema:
         description: MedusaBackup is the Schema for the medusabackups API
@@ -54,9 +75,33 @@ spec:
               finishTime:
                 format: date-time
                 type: string
+              finishedNodes:
+                format: int32
+                type: integer
+              nodes:
+                items:
+                  properties:
+                    datacenter:
+                      type: string
+                    host:
+                      type: string
+                    rack:
+                      type: string
+                    tokens:
+                      items:
+                        format: int64
+                        type: integer
+                      type: array
+                  type: object
+                type: array
               startTime:
                 format: date-time
                 type: string
+              status:
+                type: string
+              totalNodes:
+                format: int32
+                type: integer
             type: object
         type: object
     served: true
39 changes: 37 additions & 2 deletions controllers/medusa/medusabackupjob_controller.go
@@ -128,7 +128,12 @@ func (r *MedusaBackupJobReconciler) Reconcile(ctx context.Context, req ctrl.Requ
 	logger.Info("backup complete")

 	// The MedusaBackupJob is finished and we now need to create the MedusaBackup object.
-	if err := r.createMedusaBackup(ctx, backup, logger); err != nil {
+	backupSummary, err := r.getBackupSummary(ctx, backup, pods, logger)
+	if err != nil {
+		logger.Error(err, "Failed to get backup summary")
+		return ctrl.Result{RequeueAfter: r.DefaultDelay}, err
+	}
+	if err := r.createMedusaBackup(ctx, backup, backupSummary, logger); err != nil {
 		logger.Error(err, "Failed to create MedusaBackup")
 		return ctrl.Result{RequeueAfter: r.DefaultDelay}, err
 	}
@@ -210,7 +215,25 @@ func (r *MedusaBackupJobReconciler) Reconcile(ctx context.Context, req ctrl.Requ
 	return ctrl.Result{RequeueAfter: r.DefaultDelay}, nil
 }

-func (r *MedusaBackupJobReconciler) createMedusaBackup(ctx context.Context, backup *medusav1alpha1.MedusaBackupJob, logger logr.Logger) error {
+func (r *MedusaBackupJobReconciler) getBackupSummary(ctx context.Context, backup *medusav1alpha1.MedusaBackupJob, pods []corev1.Pod, logger logr.Logger) (*medusa.BackupSummary, error) {
+	for _, pod := range pods {
+		if remoteBackups, err := GetBackups(ctx, &pod, r.ClientFactory); err != nil {
+			logger.Error(err, "failed to list backups", "CassandraPod", pod.Name)
+			return nil, err
+		} else {
+			for _, remoteBackup := range remoteBackups {
+				logger.Info("found backup", "CassandraPod", pod.Name, "Backup", remoteBackup.BackupName)
+				if backup.ObjectMeta.Name == remoteBackup.BackupName {
+					return remoteBackup, nil
+				}
+				logger.Info("backup name does not match", "CassandraPod", pod.Name, "Backup", remoteBackup.BackupName)
+			}
+		}
+	}
+	return nil, nil
+}
+
+func (r *MedusaBackupJobReconciler) createMedusaBackup(ctx context.Context, backup *medusav1alpha1.MedusaBackupJob, backupSummary *medusa.BackupSummary, logger logr.Logger) error {
 	// Create a MedusaBackup object after a successful MedusaBackupJob execution.
 	logger.Info("Creating MedusaBackup object", "MedusaBackup", backup.Name)
 	backupKey := types.NamespacedName{Namespace: backup.ObjectMeta.Namespace, Name: backup.Name}
@@ -239,6 +262,18 @@ func (r *MedusaBackupJobReconciler) createMedusaBackup(ctx context.Context, back
 	backupPatch := client.MergeFrom(backupResource.DeepCopy())
 	backupResource.Status.StartTime = startTime
 	backupResource.Status.FinishTime = finishTime
+	backupResource.Status.TotalNodes = backupSummary.TotalNodes
+	backupResource.Status.FinishedNodes = backupSummary.FinishedNodes
+	backupResource.Status.Nodes = make([]*medusav1alpha1.MedusaBackupNode, len(backupSummary.Nodes))
+	for i, node := range backupSummary.Nodes {
+		backupResource.Status.Nodes[i] = &medusav1alpha1.MedusaBackupNode{
+			Host:       node.Host,
+			Tokens:     node.Tokens,
+			Datacenter: node.Datacenter,
+			Rack:       node.Rack,
+		}
+	}
+	backupResource.Status.Status = backupSummary.Status.String()
 	if err := r.Status().Patch(ctx, backupResource, backupPatch); err != nil {
 		logger.Error(err, "failed to patch status with finish time")
 		return err
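As far as this diff shows, `getBackupSummary` returns `nil, nil` when no remote backup matches the job name, while `createMedusaBackup` reads `backupSummary.TotalNodes` without a nil check. One possible guard is to return an error instead of a nil summary. The sketch below is illustrative only: `BackupSummary`, `findSummary`, and `errSummaryNotFound` are hypothetical stand-ins, not the operator's API.

```go
package main

import (
	"errors"
	"fmt"
)

// BackupSummary is a hypothetical stand-in for medusa.BackupSummary,
// reduced to the fields this sketch needs.
type BackupSummary struct {
	BackupName    string
	TotalNodes    int32
	FinishedNodes int32
}

// errSummaryNotFound is a hypothetical sentinel error for the "no match" case.
var errSummaryNotFound = errors.New("backup summary not found")

// findSummary returns the summary whose name matches. When nothing matches
// it returns a wrapped errSummaryNotFound rather than (nil, nil), so a
// caller that only checks err can never dereference a nil summary.
func findSummary(name string, remote []*BackupSummary) (*BackupSummary, error) {
	for _, s := range remote {
		if s != nil && s.BackupName == name {
			return s, nil
		}
	}
	return nil, fmt.Errorf("%w: %q", errSummaryNotFound, name)
}

func main() {
	remote := []*BackupSummary{{BackupName: "backup1", TotalNodes: 3, FinishedNodes: 3}}

	if s, err := findSummary("backup1", remote); err == nil {
		fmt.Println(s.FinishedNodes, "of", s.TotalNodes, "nodes finished")
	}

	if _, err := findSummary("missing", remote); errors.Is(err, errSummaryNotFound) {
		fmt.Println("lookup failed:", err)
	}
}
```

With this shape, the existing `if err != nil` branch in `Reconcile` would requeue on a missing summary without any further change to the caller.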
40 changes: 36 additions & 4 deletions controllers/medusa/medusabackupjob_controller_test.go
@@ -235,6 +235,16 @@ func createAndVerifyMedusaBackup(dcKey framework.ClusterKey, dc *cassdcapi.Cassa
 		return !updated.Status.FinishTime.IsZero() && len(updated.Status.Finished) == 3 && len(updated.Status.InProgress) == 0
 	}, timeout, interval)

+	t.Log("verify that the MedusaBackup is created")
+	medusaBackupKey := framework.NewClusterKey(dcKey.K8sContext, dcKey.Namespace, backupName)
+	medusaBackup := &api.MedusaBackup{}
+	err = f.Get(ctx, medusaBackupKey, medusaBackup)
+	require.NoError(err, "failed to get MedusaBackup")
+	require.Equal(medusaBackup.Status.TotalNodes, dc.Spec.Size, "backup total nodes doesn't match dc nodes")
+	require.Equal(medusaBackup.Status.FinishedNodes, dc.Spec.Size, "backup finished nodes doesn't match dc nodes")
+	require.Equal(len(medusaBackup.Status.Nodes), int(dc.Spec.Size), "backup topology doesn't match dc topology")
+	require.Equal(medusa.StatusType_SUCCESS.String(), medusaBackup.Status.Status, "backup status is not success")
+
 	require.Equal(int(dc.Spec.Size), len(medusaClientFactory.GetRequestedBackups(dc.DatacenterName())))

 	return true
@@ -339,10 +349,32 @@ func (c *fakeMedusaClient) GetBackups(ctx context.Context) ([]*medusa.BackupSumm
 	backups := make([]*medusa.BackupSummary, 0)
 	for _, name := range c.RequestedBackups {
 		backup := &medusa.BackupSummary{
-			BackupName: name,
-			StartTime:  0,
-			FinishTime: 10,
-			Status:     *medusa.StatusType_SUCCESS.Enum(),
+			BackupName:    name,
+			StartTime:     0,
+			FinishTime:    10,
+			TotalNodes:    3,
+			FinishedNodes: 3,
+			Status:        *medusa.StatusType_SUCCESS.Enum(),
+			Nodes: []*medusa.BackupNode{
+				{
+					Host:       "host1",
+					Tokens:     []int64{1, 2, 3},
+					Datacenter: "dc1",
+					Rack:       "rack1",
+				},
+				{
+					Host:       "host2",
+					Tokens:     []int64{1, 2, 3},
+					Datacenter: "dc1",
+					Rack:       "rack1",
+				},
+				{
+					Host:       "host3",
+					Tokens:     []int64{1, 2, 3},
+					Datacenter: "dc1",
+					Rack:       "rack1",
+				},
+			},
 		}
 		backups = append(backups, backup)
 	}
17 changes: 15 additions & 2 deletions controllers/medusa/medusatask_controller.go
@@ -265,7 +265,7 @@ func (r *MedusaTaskReconciler) syncOperation(ctx context.Context, task *medusav1
 	}
 	for _, pod := range pods {
 		logger.Info("Listing Backups...", "CassandraPod", pod.Name)
-		if remoteBackups, err := getBackups(ctx, &pod, r.ClientFactory); err != nil {
+		if remoteBackups, err := GetBackups(ctx, &pod, r.ClientFactory); err != nil {
 			logger.Error(err, "failed to list backups", "CassandraPod", pod.Name)
 		} else {
 			for _, backup := range remoteBackups {
@@ -344,6 +344,19 @@ func createMedusaBackup(logger logr.Logger, backup *medusa.BackupSummary, datace
 	backupPatch := client.MergeFrom(backupResource.DeepCopy())
 	backupResource.Status.StartTime = startTime
 	backupResource.Status.FinishTime = finishTime
+	backupResource.Status.TotalNodes = backup.TotalNodes
+	backupResource.Status.FinishedNodes = backup.FinishedNodes
+	backupResource.Status.Nodes = make([]*medusav1alpha1.MedusaBackupNode, len(backup.Nodes))
+	for i, node := range backup.Nodes {
+		backupResource.Status.Nodes[i] = &medusav1alpha1.MedusaBackupNode{
+			Host:       node.Host,
+			Tokens:     node.Tokens,
+			Datacenter: node.Datacenter,
+			Rack:       node.Rack,
+		}
+	}
+	backupResource.Status.Status = backup.Status.String()
+
 	if err := r.Status().Patch(ctx, backupResource, backupPatch); err != nil {
 		logger.Error(err, "failed to patch status with finish time")
 		return true, ctrl.Result{}, err
@@ -401,7 +414,7 @@ func prepareRestore(ctx context.Context, task *medusav1alpha1.MedusaTask, pod *c
 	}
 }

-func getBackups(ctx context.Context, pod *corev1.Pod, clientFactory medusa.ClientFactory) ([]*medusa.BackupSummary, error) {
+func GetBackups(ctx context.Context, pod *corev1.Pod, clientFactory medusa.ClientFactory) ([]*medusa.BackupSummary, error) {
 	addr := net.JoinHostPort(pod.Status.PodIP, fmt.Sprint(shared.BackupSidecarPort))
 	if medusaClient, err := clientFactory.NewClient(addr); err != nil {
 		return nil, err
55 changes: 55 additions & 0 deletions docs/content/en/tasks/backup-restore/_index.md
@@ -147,8 +147,63 @@ status:
 ```

+The start and finish times are also displayed in the output of the `kubectl get` command:
+
+```sh
+% kubectl get MedusaBackupJob -A
+NAME             STARTED   FINISHED
+backup1          25m       24m
+medusa-backup1   19m       19m
+```
+
+
 All pods having completed the backup will be in the `finished` list.
 At the end of the backup operation, a `MedusaBackup` custom resource will be created with the same name as the `MedusaBackupJob` object. It materializes the backup locally on the Kubernetes cluster.
+The `MedusaBackup` object status contains the total number of nodes in the cluster at the time of the backup, the number of nodes that successfully completed the backup, and the topology of the DC at the time of the backup:
+
+```yaml
+apiVersion: medusa.k8ssandra.io/v1alpha1
+kind: MedusaBackup
+metadata:
+  name: backup1
+status:
+  startTime: '2023-09-13T12:15:57Z'
+  finishTime: '2023-09-13T12:16:12Z'
+  totalNodes: 2
+  finishedNodes: 2
+  nodes:
+    - datacenter: dc1
+      host: firstcluster-dc1-default-sts-0
+      rack: default
+      tokens:
+        - -110555885826893
+        - -1149279817337332700
+        - -1222258121654772000
+        - -127355705089199870
+    - datacenter: dc1
+      host: firstcluster-dc1-default-sts-1
+      rack: default
+      tokens:
+        - -1032268962284829800
+        - -1054373523049285200
+        - -1058110708807841300
+        - -107256661843445790
+  status: SUCCESS
+spec:
+  backupType: differential
+  cassandraDatacenter: dc1
+```
+
+The `kubectl get` output for `MedusaBackup` objects shows a subset of this information:
+
+```sh
+kubectl get MedusaBackup -A
+NAME             STARTED   FINISHED   NODES   COMPLETED   STATUS
+backup1          29m       28m        2       2           SUCCESS
+medusa-backup1   23m       23m        2       2           SUCCESS
+```

 For a restore to be possible, a `MedusaBackup` object must exist.
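The status fields documented in this change can also be checked programmatically. The sketch below assumes the object was fetched as JSON (for example with `kubectl get medusabackup backup1 -o json`) and assumes that a backup counts as complete when `status` is `SUCCESS` and `finishedNodes` equals `totalNodes`. That completeness rule and the `isComplete` helper are illustrative assumptions, not operator behavior.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// backupStatus mirrors only the status fields this sketch inspects.
type backupStatus struct {
	TotalNodes    int32  `json:"totalNodes"`
	FinishedNodes int32  `json:"finishedNodes"`
	Status        string `json:"status"`
}

// backupObject is the minimal envelope around the status of a MedusaBackup.
type backupObject struct {
	Status backupStatus `json:"status"`
}

// isComplete (hypothetical helper) reports whether every node finished and
// the overall status is SUCCESS. This rule is an assumption of the sketch.
func isComplete(raw []byte) (bool, error) {
	var obj backupObject
	if err := json.Unmarshal(raw, &obj); err != nil {
		return false, err
	}
	s := obj.Status
	return s.Status == "SUCCESS" && s.TotalNodes > 0 && s.FinishedNodes == s.TotalNodes, nil
}

func main() {
	// Status values taken from the documentation example in this change.
	raw := []byte(`{"status":{"totalNodes":2,"finishedNodes":2,"status":"SUCCESS"}}`)
	ok, err := isComplete(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println("complete:", ok)
}
```

Because the status fields are tagged `omitempty`, a missing `finishedNodes` key decodes to zero here, which the check treats as incomplete.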