Add operational-mode annotation to resolve VRG state ambiguity during… #1479

Draft
wants to merge 1 commit into
base: main


BenamarMk
Member

For CephFS PVCs, we need two VRGs: one primary on the primary cluster and one secondary on the failover cluster. Suppose the workload has been relocated from the secondary cluster back to the primary, and the primary then goes offline. When the DRPC reconciles, it finds the primary inaccessible and detects only the secondary VRG on the failover cluster. This situation is difficult to resolve because the DRPC cannot distinguish a VRG that is in its final secondary state from one that is still transitioning from primary to secondary. As a result, the PeerReady condition is turned off, and the user cannot fail over the application from the UI.

The fix is to add a hint that tells the DRPC whether the VRG is in its final state or still transitioning to primary or secondary. The hint is provided through an annotation.
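To illustrate the idea, here is a minimal sketch of how an annotation could separate a settled secondary from a transitioning one. The annotation key, its values, and the `vrgView` type are hypothetical stand-ins chosen for this example; the actual names used by the PR are not shown in this conversation.

```go
package main

import "fmt"

// Hypothetical annotation key and values, assumed for illustration only.
const (
	opModeAnnotation    = "example.com/operational-mode"
	opModeFinal         = "final"         // VRG has settled in its role
	opModeTransitioning = "transitioning" // VRG is still changing roles
)

// vrgView is a simplified, assumed stand-in for the VRG fields a DRPC might inspect.
type vrgView struct {
	ReplicationState string // "primary" or "secondary"
	Annotations      map[string]string
}

// isSettledSecondary reports whether the VRG is a secondary that has finished
// transitioning. A missing annotation is treated as unknown, not as settled.
func isSettledSecondary(v vrgView) bool {
	if v.ReplicationState != "secondary" {
		return false
	}
	// Reading from a nil map is safe in Go and yields the empty string.
	return v.Annotations[opModeAnnotation] == opModeFinal
}

func main() {
	settled := vrgView{
		ReplicationState: "secondary",
		Annotations:      map[string]string{opModeAnnotation: opModeFinal},
	}
	moving := vrgView{
		ReplicationState: "secondary",
		Annotations:      map[string]string{opModeAnnotation: opModeTransitioning},
	}
	fmt.Println("settled:", isSettledSecondary(settled))
	fmt.Println("moving:", isSettledSecondary(moving))
}
```

Treating a missing annotation as "not settled" is a conservative choice made for this sketch; the PR may handle VRGs without the annotation differently.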

Fixes: bz-2264765

@BenamarMk BenamarMk force-pushed the fix-drpc-vrg-transition branch from 09057aa to 068e7ec Compare December 16, 2024 12:12
curHomeCluster, err := d.validateAndSelectCurrentPrimary(preferredCluster)
if err != nil {
currentPrimary, transitioning, err := d.isCurrentPrimaryValidForRelocation(preferredCluster)
if (currentPrimary == "" && !transitioning) || err != nil {
Member

The "transitioning" flag should be checked for true, not false.

Member Author

True. Good catch, Elena.

… failover

- Problem: During failover, if the primary cluster is inaccessible, the DRPC cannot
  distinguish between a VRG in its final secondary state and one still transitioning
  to secondary. This ambiguity prevents the user from failing over the application.
- Solution: Introduced an annotation to indicate the current operational mode of the VRG.

This fix allows the DRPC to correctly identify the VRG's state and maintain the
PeerReady condition.

Fixes: bz-2264765

Signed-off-by: Benamar Mekhissi <[email protected]>
@BenamarMk BenamarMk force-pushed the fix-drpc-vrg-transition branch from 068e7ec to 6c0a8ce Compare December 16, 2024 22:26
@BenamarMk BenamarMk marked this pull request as draft December 19, 2024 15:44
2 participants