Infinite loop while updating EDS #178

Closed
fbloo opened this issue May 21, 2021 · 1 comment
fbloo commented May 21, 2021

First of all, I appreciate you sharing this operator. I'm currently gaining some hands-on experience with it and, to the best of my knowledge, I'm encountering some strange behaviour. I'm trying to apply a new configuration to my EDS, but the operator is having trouble updating the individual pods. Please correct me if I'm doing anything stupid or unsupported.

Expected Behavior

  1. Apply a YAML update to the EDS.
  2. The ES operator updates all pods within the EDS by draining, deleting, and redeploying them, one pod at a time.

Actual Behavior

  1. Apply a YAML update to the EDS.
  2. The operator starts draining the pod and successfully deletes it.
  3. A new pod is scheduled. However, the ES operator immediately reports that the pod should be updated again.
  4. The ES operator starts draining again, and this continues as an infinite loop.

The logs below show all relevant logging from a single "loop". Notice how the operator reports that it deleted pod demo/es-data1-0, and then immediately reports that pod demo/es-data1-0 should be updated.

Steps to Reproduce the Problem

  1. In the EDS scaling settings, set enabled: true, minReplicas: 1, minIndexReplicas: 0 (see the example manifest below).
  2. Apply a YAML update to the EDS.
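
For context, a minimal sketch of the kind of EDS manifest involved. The apiVersion and kind match the ObjectReference in the logs below and the scaling values are the ones from step 1; everything else (replica bounds, labels, image registry path, trimmed container config) is an illustrative assumption, not my exact manifest:

apiVersion: zalando.org/v1
kind: ElasticsearchDataSet
metadata:
  name: es-data1
  namespace: demo
spec:
  replicas: 2                # assumed starting replica count
  scaling:
    enabled: true            # step 1
    minReplicas: 1           # step 1
    maxReplicas: 3           # assumed
    minIndexReplicas: 0      # step 1
    maxIndexReplicas: 1      # assumed
  template:
    metadata:
      labels:
        application: es-data1          # assumed label
    spec:
      containers:
        - name: elasticsearch
          # registry path assumed; version as listed under Specifications
          image: docker.elastic.co/elasticsearch/elasticsearch-oss:7.5.1
          # remaining Elasticsearch container config trimmed for brevity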

Specifications

  • Version: es-operator:latest; elasticsearch-oss:7.5.1
  • Platform: Azure Kubernetes
  • Subsystem: any

Logs:

time="2021-05-20T13:12:25Z" level=info msg="Ensuring cluster is in green state" endpoint="http://es-data1.demo.svc.cluster.local.:9200"
time="2021-05-20T13:12:25Z" level=info msg="Event(v1.ObjectReference{Kind:\"ElasticsearchDataSet\", Namespace:\"demo\", Name:\"es-data1\", UID:\"22b3dd79-41b6-4165-bc7f-ad78557d7959\", APIVersion:\"zalando.org/v1\", ResourceVersion:\"6271955\", FieldPath:\"\"}): type: 'Normal' reason: 'DrainingPod' Draining Pod 'demo/es-data1-0'"
time="2021-05-20T13:12:25Z" level=info msg="Disabling auto-rebalance" endpoint="http://es-data1.demo.svc.cluster.local.:9200"
time="2021-05-20T13:12:26Z" level=info msg="Excluding pod demo/es-data1-0 from shard allocation" endpoint="http://es-data1.demo.svc.cluster.local.:9200"
time="2021-05-20T13:12:26Z" level=info msg="Waiting for draining to finish" endpoint="http://es-data1.demo.svc.cluster.local.:9200"
time="2021-05-20T13:12:26Z" level=info msg="Found 0 remaining shards on demo/es-data1-0 (10.244.3.147)" endpoint="http://es-data1.demo.svc.cluster.local.:9200"
time="2021-05-20T13:12:26Z" level=info msg="Event(v1.ObjectReference{Kind:\"ElasticsearchDataSet\", Namespace:\"demo\", Name:\"es-data1\", UID:\"22b3dd79-41b6-4165-bc7f-ad78557d7959\", APIVersion:\"zalando.org/v1\", ResourceVersion:\"6271955\", FieldPath:\"\"}): type: 'Normal' reason: 'DrainedPod' Successfully drained Pod 'demo/es-data1-0'"
time="2021-05-20T13:12:26Z" level=info msg="Event(v1.ObjectReference{Kind:\"ElasticsearchDataSet\", Namespace:\"demo\", Name:\"es-data1\", UID:\"22b3dd79-41b6-4165-bc7f-ad78557d7959\", APIVersion:\"zalando.org/v1\", ResourceVersion:\"6271955\", FieldPath:\"\"}): type: 'Normal' reason: 'DeletingPod' Deleting Pod 'demo/es-data1-0'"
time="2021-05-20T13:12:42Z" level=info msg="Event(v1.ObjectReference{Kind:\"ElasticsearchDataSet\", Namespace:\"demo\", Name:\"es-data1\", UID:\"22b3dd79-41b6-4165-bc7f-ad78557d7959\", APIVersion:\"zalando.org/v1\", ResourceVersion:\"6271955\", FieldPath:\"\"}): type: 'Normal' reason: 'DeletedPod' Successfully deleted Pod 'demo/es-data1-0'"
time="2021-05-20T13:12:42Z" level=info msg="Setting exclude list to ''" endpoint="http://es-data1.demo.svc.cluster.local.:9200"
time="2021-05-20T13:12:42Z" level=info msg="Enabling auto-rebalance" endpoint="http://es-data1.demo.svc.cluster.local.:9200"
time="2021-05-20T13:12:43Z" level=info msg="Pod demo/es-data1-0 should be updated. Priority: 5 (NodeSelector,PodOldRevision,STSReplicaDiff)"
time="2021-05-20T13:12:43Z" level=info msg="Pod demo/es-data1-1 should be updated. Priority: 5 (NodeSelector,PodOldRevision,STSReplicaDiff)"
time="2021-05-20T13:12:43Z" level=info msg="Found 2 Pods on StatefulSet demo/es-data1 to update"
time="2021-05-20T13:12:43Z" level=info msg="StatefulSet demo/es-data1 has 1/2 ready replicas"

fbloo changed the title from "Infinity loop while updating EDS" to "Infinite loop while updating EDS" on May 21, 2021

fbloo commented May 25, 2021

This issue is related to #69.

My operator was running with the argument --priority-node-selector=lifecycle-status=ready while I hadn't specified a matching nodeSelector on my pods. I removed the argument and now it seems to be working fine.
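
For anyone hitting the same symptom, a rough sketch of where that flag lives, assuming a standard Deployment of the operator; the names, labels, and image reference are placeholders, and only the --priority-node-selector argument is the relevant part:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: es-operator                  # placeholder name
spec:
  replicas: 1
  selector:
    matchLabels:
      application: es-operator       # placeholder label
  template:
    metadata:
      labels:
        application: es-operator
    spec:
      containers:
        - name: es-operator
          image: es-operator:latest  # as listed under Specifications; registry path omitted
          args:
            # other operator flags omitted for brevity
            - --priority-node-selector=lifecycle-status=ready   # removing this argument stopped the loop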

fbloo closed this as completed on May 25, 2021.