[Bug]: The StrimziPodSetController is not checking controller flag of the old owner before adding new one #10128
jakubmalek
started this conversation in
General
Replies: 1 comment
The owner reference does not prevent the Pod from being deleted, so I'm not sure what exactly your problem was. Sharing the YAMLs and the logs might make it easier to understand. But in general, the PodSet controller is the owner of the pod as well as its controller. I would not expect you to have any other owner references or controllers managing it.
Bug Description
I've noticed a somewhat strange problem. During an update of the Kafka cluster, a Zookeeper pod started to produce errors, so the Zookeeper pods were manually deleted.
All the pods were left hanging in the terminating state, and the operator was continuously producing errors for the pod update with the following reason: `Only one reference can have Controller set to true`.

It seems that the owner reference set in the pod was still pointing to the old `StrimziPodSet` for Zookeeper. I'm not sure why the reference wasn't automatically removed after the CRD was updated, but as a consequence the operator wasn't able to add the new owner reference to the pod, because the old one still had the `controller` flag set to `true`.

Unfortunately, I don't have many details about the incident, as I wasn't directly involved in it, so I can't provide reliable steps to reproduce the issue with the owners. In the end, the problem was resolved by force-deleting the pods.
But during my brief investigation I noticed that the `StrimziPodSetController` only checks whether the owner references are set, and then simply tries to add the new owner: https://github.com/strimzi/strimzi-kafka-operator/blob/0.41.0/cluster-operator/src/main/java/io/strimzi/operator/cluster/operator/assembly/StrimziPodSetController.java#L484-L485
I think it's worth checking whether at least one owner reference entry already has the `controller` flag set to `true`. In that case, the pod update can be skipped, with a potential warning in the logs, until the old reference is removed or its controller flag is disabled.
Alternatively, the operator may try to disable the controller flag in the previous owner, to potentially fix the problem.
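
To make the two options concrete, here is a minimal sketch of what such a check could look like. It is not the operator's actual code: `OwnerReferenceCheck`, `OwnerRef`, `conflictingController` and `withDesiredController` are hypothetical names, and `OwnerRef` is a simplified stand-in for the Kubernetes owner reference model so that the example stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of Strimzi. It only illustrates the proposed check.
public final class OwnerReferenceCheck {

    /** Simplified stand-in for a Kubernetes owner reference (assumed shape, for illustration only). */
    public record OwnerRef(String kind, String name, String uid, boolean controller) { }

    private OwnerReferenceCheck() { }

    /**
     * Option 1: detect the conflict before patching the pod.
     * Returns an existing reference that has controller=true and belongs to a
     * different owner (different UID) than the one we want to add, if there is one.
     * The caller could then skip the update and log a warning.
     */
    public static Optional<OwnerRef> conflictingController(List<OwnerRef> existing, OwnerRef desired) {
        return existing.stream()
                .filter(OwnerRef::controller)
                .filter(ref -> !ref.uid().equals(desired.uid()))
                .findFirst();
    }

    /**
     * Option 2: build a new list in which every other owner has controller=false,
     * and the desired owner is added (or replaced) as the last entry.
     */
    public static List<OwnerRef> withDesiredController(List<OwnerRef> existing, OwnerRef desired) {
        List<OwnerRef> result = new ArrayList<>();
        for (OwnerRef ref : existing) {
            if (ref.uid().equals(desired.uid())) {
                continue; // replaced by the desired reference added below
            }
            // Keep the old owner, but make sure it no longer claims to be the controller
            result.add(ref.controller()
                    ? new OwnerRef(ref.kind(), ref.name(), ref.uid(), false)
                    : ref);
        }
        result.add(desired);
        return result;
    }
}
```

With the first option the pod is left untouched until someone cleans up the stale reference. The second option changes the old reference on its own, which might hide the underlying reason why it was still there.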
Steps to reproduce
N/A
Expected behavior
The operator should check whether an old owner reference in the pod has the controller flag set to `true` before adding the new owner. Alternatively, it can disable the controller flag in the old owner, or even remove the old reference, to fix the issue.
Strimzi version
0.39.0
Kubernetes version
1.29.4
Installation method
Helm
Infrastructure
Azure
Configuration files and logs
No response
Additional context
No response