Issue 153: Add Rollback support for Pravega Cluster #255

Merged
merged 18 commits on Sep 20, 2019
94 changes: 94 additions & 0 deletions doc/rollback-cluster.md
@@ -0,0 +1,94 @@
# Pravega Cluster Rollback

This document details how a manual rollback can be triggered after a Pravega cluster upgrade fails.
Note that a rollback can be triggered only after an upgrade failure.

## Upgrade Failure

An upgrade can fail for any of the following reasons:

1. Incorrect configuration (wrong quota, permissions, limit ranges)
2. Network issues (image pull errors)
3. Kubernetes cluster issues
4. Application issues (runtime misconfiguration or code bugs)

An upgrade failure can manifest as a pod staying in `Pending` state forever, or continuously restarting or crashing (`CrashLoopBackOff`).
Such a component deployment failure needs to be detected and mapped to an upgrade failure of the Pravega cluster.
The operator tries to fail fast by explicitly checking for common causes of deployment failure, such as image pull errors or the `CrashLoopBackOff` state, and marking the upgrade as failed if any pod runs into one of these states during the upgrade.
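A quick way to spot such pods is to check their status directly; the output below is illustrative (pod names depend on the cluster name):

```
$ kubectl get pods
NAME                               READY   STATUS             RESTARTS   AGE
pravega-pravega-segmentstore-0     0/1     ImagePullBackOff   0          5m
```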

The following Pravega Cluster Status Condition indicates a Failed Upgrade:

```
ClusterConditionType: Error
Status: True
Reason: UpgradeFailed
Message: <Details of exception/cause of failure>
```
After an upgrade failure, the output of `kubectl describe pravegacluster pravega` would look like this:

```
$> kubectl describe pravegacluster pravega
. . .
Spec:
. . .
Version: 0.6.0-2252.b6f6512
. . .
Status:
. . .
Conditions:
Last Transition Time: 2019-09-06T09:00:13Z
Last Update Time: 2019-09-06T09:00:13Z
Status: False
Type: Upgrading
Last Transition Time: 2019-09-06T08:58:40Z
Last Update Time: 2019-09-06T08:58:40Z
Status: False
Type: PodsReady
Last Transition Time: 2019-09-06T09:00:13Z
Last Update Time: 2019-09-06T09:00:13Z
Message: failed to sync segmentstore version. pod pravega-pravega-segmentstore-0 update failed because of ImagePullBackOff
Reason: UpgradeFailed
Status: True
Type: Error
. . .
Current Version: 0.6.0-2239.6e24df7
. . .
Version History:
0.6.0-2239.6e24df7
```
where `0.6.0-2252.b6f6512` is the version we tried to upgrade to and `0.6.0-2239.6e24df7` is the version the cluster was running before the upgrade.

## Manual Rollback Trigger
A rollback is triggered when a Pravega cluster is in the `UpgradeFailed` error state and a user manually updates the `version` field in the PravegaCluster spec to point to the last stable cluster version (see the example after the note below).

Note:
1. Only a rollback to the last stable cluster version is supported at this point.
2. Changing the cluster spec version to the previous cluster version when the cluster is not in the `UpgradeFailed` state will not trigger a rollback.
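A minimal sketch using `kubectl patch`, assuming the cluster is named `pravega` (as in the example above) and the last stable version is `0.6.0-2239.6e24df7`:

```
$ kubectl patch pravegacluster pravega --type merge -p '{"spec":{"version":"0.6.0-2239.6e24df7"}}'
```

Alternatively, `kubectl edit pravegacluster pravega` can be used to change the `version` field interactively.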

## Rollback Implementation
When a rollback is started, the cluster moves into the `RollbackInProgress` cluster condition.
Once the rollback completes, this condition is set to false.
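While the rollback is running, the condition's `Reason` names the component currently being rolled back and its `Message` carries the number of updated replicas, following the same convention as the `Upgrading` condition. An illustrative example:

```
ClusterConditionType: RollbackInProgress
Status: True
Reason: Updating Segmentstore
Message: 1
```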

The operator rolls back components in the reverse of the upgrade order:

1. Pravega Controller
2. Pravega Segment Store
3. BookKeeper

A new field, `versionHistory`, has been added to the Pravega `ClusterStatus` to maintain the history of cluster versions; it can be inspected as shown below.
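A sketch of how to view it, assuming a cluster named `pravega` (the path follows the `versionHistory` JSON tag added in this PR):

```
$ kubectl get pravegacluster pravega -o jsonpath='{.status.versionHistory}'
```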

Rollback involves moving all components in the cluster back to the last stable cluster version. As with upgrades, the operator rolls back one component at a time and one pod at a time to preserve high-availability.

If the rollback completes successfully, the cluster state goes back to `PodsReady`, which means the cluster is in a stable state again.
If the rollback fails, the cluster moves to the `RollbackFailed` state, indicated by this cluster condition:
```
ClusterConditionType: Error
Status: True
Reason: RollbackFailed
Message: <Details of exception/cause of failure>
```

When a rollback failure happens, manual intervention is required to resolve it.
After checking and resolving the root cause of the failure, a user can upgrade to:
1. The version to which the user initially intended to upgrade (the one that caused the upgrade failure).
2. Any other supported version, based on the versions of all pods in the cluster.
45 changes: 39 additions & 6 deletions doc/upgrade-cluster.md
@@ -20,8 +20,6 @@ Check out [Pravega documentation](http://pravega.io/docs/latest/) for more infor

## Pending tasks

- The rollback mechanism is on the roadmap but not implemented yet. Check out [this issue](https://github.com/pravega/pravega-operator/issues/153).
- Manual recovery from an upgrade is possible but it has not been defined yet. Check out [this issue](https://github.com/pravega/pravega-operator/issues/157).
- There is no validation of the configured desired version. Check out [this issue](https://github.com/pravega/pravega-operator/issues/156)


@@ -35,6 +33,19 @@ NAME VERSION DESIRED MEMBERS READY MEMBERS AGE
example 0.4.0 7 7 11m
```

## Upgrade Supported Versions Matrix

| BASE VERSION | TARGET VERSION |
| ------------ | ---------------- |
| 0.1.0 | 0.1.0 |
| 0.2.0 | 0.2.0 |
| 0.3.0 | 0.3.0, 0.3.1, 0.3.2|
| 0.3.1 | 0.3.1, 0.3.2 |
| 0.3.2 | 0.3.2 |
| 0.4.0 | 0.4.0 |
| 0.5.0 | 0.5.0, 0.6.0 |
| 0.6.0 | 0.6.0 |

## Trigger an upgrade

To initiate an upgrade process, a user has to update the `spec.version` field on the `PravegaCluster` custom resource. This can be done in three different ways using the `kubectl` command.
@@ -103,8 +114,7 @@ Segment Store instances need access to a persistent volume to store the cache. L

Also, Segment Store pods need to be individually accessed by clients, so having a stable network identifier provided by the Statefulset and a headless service is very convenient.

Same as BookKeeper, we use the `OnDelete` strategy for the Segment Store. The reason we don't use the `RollingUpdate` strategy here is that we found it convenient to manage upgrade and rollback in the same fashion. Using `RollingUpdate` would bring in the Kubernetes rollback mechanism, which would interfere with our implementation.

### Pravega Controller upgrade

@@ -131,6 +141,29 @@ NAME VERSION DESIRED MEMBERS READY MEMBERS AGE
example 0.5.0 8 8 1h
```

To see the progress of the upgrade, you can run `kubectl describe`:
```
$ kubectl describe PravegaCluster example
...
Status:
Conditions:
Status: True
Type: Upgrading
Reason: Updating Bookkeeper
Message: 1
Last Transition Time: 2019-04-01T19:42:37+02:00
Last Update Time: 2019-04-01T19:42:37+02:00
Status: False
Type: PodsReady
Last Transition Time: 2019-04-01T19:43:08+02:00
Last Update Time: 2019-04-01T19:43:08+02:00
Status: False
Type: Error
...

```
The `Reason` field of the `Upgrading` condition shows the component currently being upgraded, and the `Message` field reflects the number of successfully upgraded replicas in that component.
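The same information can be pulled out directly with JSONPath; a sketch assuming the conditions are serialized with the usual lower-case field names (`type`, `reason`, `message`):

```
$ kubectl get pravegacluster example -o jsonpath='{.status.conditions[?(@.type=="Upgrading")].reason}'
$ kubectl get pravegacluster example -o jsonpath='{.status.conditions[?(@.type=="Upgrading")].message}'
```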

If your upgrade has failed, you can describe the status section of your Pravega cluster to discover why.

```
@@ -181,10 +214,10 @@ INFO[5899] Reconciling PravegaCluster default/example
INFO[5900] statefulset (example-bookie) status: 1 updated, 2 ready, 3 target
INFO[5929] Reconciling PravegaCluster default/example
INFO[5930] statefulset (example-bookie) status: 1 updated, 2 ready, 3 target
INFO[5930] error syncing cluster version, need manual intervention. failed to sync bookkeeper version. pod example-bookie-0 is restarting
INFO[5930] error syncing cluster version, upgrade failed. failed to sync bookkeeper version. pod example-bookie-0 is restarting
...
```

### Recovering from a failed upgrade

Not defined yet. Check [this issue](https://github.com/pravega/pravega-operator/issues/157) for tracking.
See [Rollback](rollback-cluster.md)
113 changes: 109 additions & 4 deletions pkg/apis/pravega/v1alpha1/status.go
@@ -11,6 +11,7 @@
package v1alpha1

import (
"log"
"time"

corev1 "k8s.io/api/core/v1"
@@ -21,12 +22,13 @@ type ClusterConditionType string
const (
ClusterConditionPodsReady ClusterConditionType = "PodsReady"
ClusterConditionUpgrading = "Upgrading"
ClusterConditionRollback = "RollbackInProgress"
ClusterConditionError = "Error"

// Reasons for the cluster upgrading/rollback progress conditions
UpgradingControllerReason = "UpgradingController"
UpgradingSegmentstoreReason = "UpgradingSegmentstore"
UpgradingBookkeeperReason = "UpgradingBookkeeper"
UpdatingControllerReason = "Updating Controller"
UpdatingSegmentstoreReason = "Updating Segmentstore"
UpdatingBookkeeperReason = "Updating Bookkeeper"
)

// ClusterStatus defines the observed state of PravegaCluster
@@ -41,6 +43,8 @@ type ClusterStatus struct {
// If the cluster is not upgrading, TargetVersion is empty.
TargetVersion string `json:"targetVersion,omitempty"`

VersionHistory []string `json:"versionHistory,omitempty"`

// Replicas is the number of desired replicas in the cluster
Replicas int32 `json:"replicas"`

@@ -83,7 +87,8 @@ type ClusterCondition struct {
LastTransitionTime string `json:"lastTransitionTime,omitempty"`
}

func (ps *ClusterStatus) InitConditions() {
func (ps *ClusterStatus) Init() {
// Initialise conditions
conditionTypes := []ClusterConditionType{
ClusterConditionPodsReady,
ClusterConditionUpgrading,
@@ -95,6 +100,12 @@ func (ps *ClusterStatus) InitConditions() {
ps.setClusterCondition(*c)
}
}

// Set current cluster version in version history,
// so if the first upgrade fails we can rollback to this version
if ps.VersionHistory == nil && ps.CurrentVersion != "" {
ps.VersionHistory = []string{ps.CurrentVersion}
}
}

func (ps *ClusterStatus) SetPodsReadyConditionTrue() {
@@ -127,6 +138,15 @@ func (ps *ClusterStatus) SetErrorConditionFalse() {
ps.setClusterCondition(*c)
}

func (ps *ClusterStatus) SetRollbackConditionTrue(reason, message string) {
c := newClusterCondition(ClusterConditionRollback, corev1.ConditionTrue, reason, message)
ps.setClusterCondition(*c)
}

func (ps *ClusterStatus) SetRollbackConditionFalse() {
c := newClusterCondition(ClusterConditionRollback, corev1.ConditionFalse, "", "")
ps.setClusterCondition(*c)
}

func newClusterCondition(condType ClusterConditionType, status corev1.ConditionStatus, reason, message string) *ClusterCondition {
return &ClusterCondition{
Type: condType,
@@ -170,3 +190,88 @@ func (ps *ClusterStatus) setClusterCondition(newCondition ClusterCondition) {

ps.Conditions[position] = *existingCondition
}

// AddToVersionHistory appends a version to the version history, ignoring
// empty versions and versions that match the most recent entry.
func (ps *ClusterStatus) AddToVersionHistory(version string) {
	lastIndex := len(ps.VersionHistory) - 1
	if version != "" && (lastIndex < 0 || ps.VersionHistory[lastIndex] != version) {
		ps.VersionHistory = append(ps.VersionHistory, version)
		log.Printf("Updating version history adding version %v", version)
	}
}

// GetLastVersion returns the most recent entry in the version history.
func (ps *ClusterStatus) GetLastVersion() (previousVersion string) {
	n := len(ps.VersionHistory)
	return ps.VersionHistory[n-1]
}

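// IsClusterInUpgradeFailedState returns true when the Error condition is true with reason UpgradeFailed.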
func (ps *ClusterStatus) IsClusterInUpgradeFailedState() bool {
_, errorCondition := ps.GetClusterCondition(ClusterConditionError)
if errorCondition == nil {
return false
}
if errorCondition.Status == corev1.ConditionTrue && errorCondition.Reason == "UpgradeFailed" {
return true
}
return false
}

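// IsClusterInUpgradeFailedOrRollbackState returns true when the cluster is in the upgrade-failed state or a rollback is in progress.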
func (ps *ClusterStatus) IsClusterInUpgradeFailedOrRollbackState() bool {
if ps.IsClusterInUpgradeFailedState() || ps.IsClusterInRollbackState() {
return true
}
return false
}

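// IsClusterInRollbackState returns true when the RollbackInProgress condition is true.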
func (ps *ClusterStatus) IsClusterInRollbackState() bool {
_, rollbackCondition := ps.GetClusterCondition(ClusterConditionRollback)
if rollbackCondition == nil {
return false
}
if rollbackCondition.Status == corev1.ConditionTrue {
return true
}
return false
}

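// IsClusterInUpgradingState returns true when the Upgrading condition is true.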
func (ps *ClusterStatus) IsClusterInUpgradingState() bool {
_, upgradeCondition := ps.GetClusterCondition(ClusterConditionUpgrading)
if upgradeCondition == nil {
return false
}
if upgradeCondition.Status == corev1.ConditionTrue {
return true
}
return false
}

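// IsClusterInRollbackFailedState returns true when the Error condition is true with reason RollbackFailed.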
func (ps *ClusterStatus) IsClusterInRollbackFailedState() bool {
_, errorCondition := ps.GetClusterCondition(ClusterConditionError)
if errorCondition == nil {
return false
}
if errorCondition.Status == corev1.ConditionTrue && errorCondition.Reason == "RollbackFailed" {
return true
}
return false
}

// UpdateProgress records progress of the operation that is currently running:
// the Upgrading condition during an upgrade, or the RollbackInProgress condition during a rollback.
// The reason names the component being updated and the message carries the number of updated replicas.
func (ps *ClusterStatus) UpdateProgress(reason, updatedReplicas string) {
	if ps.IsClusterInUpgradingState() {
		// Upgrading: report progress on the Upgrading condition
		ps.SetUpgradingConditionTrue(reason, updatedReplicas)
	} else {
		// Otherwise a rollback is in progress: report progress on the RollbackInProgress condition
		ps.SetRollbackConditionTrue(reason, updatedReplicas)
	}
}

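// GetLastCondition returns the Upgrading condition while an upgrade is in progress,
// the RollbackInProgress condition while a rollback is in progress, and nil otherwise.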
func (ps *ClusterStatus) GetLastCondition() (lastCondition *ClusterCondition) {
if ps.IsClusterInUpgradingState() {
_, lastCondition := ps.GetClusterCondition(ClusterConditionUpgrading)
return lastCondition
} else if ps.IsClusterInRollbackState() {
_, lastCondition := ps.GetClusterCondition(ClusterConditionRollback)
return lastCondition
}
// nothing to do if we are neither upgrading nor rolling back,
return nil
}