
Handle the failure due to reaching the servlet capacity when getting user tasks #10768

Merged · 3 commits merged into strimzi:main on Nov 6, 2024

Conversation

@tinaselenge (Contributor) commented Oct 28, 2024

Type of change

Select the type of your PR

  • Bugfix

Description

If getting the user tasks fails because the Cruise Control servlet capacity has been reached, the operator no longer transitions the KafkaRebalance status to NotReady. Instead, the status is left unchanged and the operator retries getting the user tasks in the next reconciliation. The operator also reports a warning message.

Resolves #10704
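The error-handling pattern described above can be sketched as follows. This is a minimal, hypothetical illustration of the "leave the status unchanged and retry" idea; the class, method, and exception names below are invented for this sketch and are not the actual Strimzi code.

```java
/**
 * Hypothetical sketch of the retry-on-capacity-error pattern described in
 * this PR; names are illustrative, not taken from the Strimzi codebase.
 */
public class UserTasksFetcher {

    /** Signals that Cruise Control's servlet capacity has been reached. */
    public static class ServletCapacityException extends RuntimeException { }

    public interface StatusUpdater {
        void markNotReady(String reason); // transition KafkaRebalance to NotReady
        void warn(String message);        // record a warning, status unchanged
    }

    /**
     * Runs the fetch. On a capacity error the status is left untouched and a
     * warning is recorded so the next reconciliation retries; any other error
     * still transitions the resource to NotReady.
     */
    public static boolean fetchUserTasks(Runnable fetch, StatusUpdater status) {
        try {
            fetch.run();
            return true;
        } catch (ServletCapacityException e) {
            // Capacity reached: do NOT transition to NotReady; just warn and
            // let the next reconciliation retry the request.
            status.warn("Cruise Control servlet capacity reached; will retry on next reconciliation");
            return false;
        } catch (RuntimeException e) {
            // Any other failure still moves the resource to NotReady.
            status.markNotReady(e.getMessage());
            return false;
        }
    }
}
```

The key design choice is that a capacity error is treated as transient back-pressure rather than a terminal condition, so the resource's status never regresses to NotReady for a failure that is expected to clear on its own.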

Checklist

Please go through this checklist and make sure all applicable tasks have been done

  • Write tests
  • Make sure all tests pass
  • Update documentation
  • Check RBAC rights for Kubernetes / OpenShift roles
  • Try your changes from Pod inside your Kubernetes and OpenShift cluster, not just locally
  • Reference relevant issue(s) and close them after merging
  • Update CHANGELOG.md
  • Supply screenshots for visual changes, such as Grafana dashboards

@scholzj (Member) commented Oct 29, 2024

/azp run build

Azure Pipelines successfully started running 1 pipeline(s).

@tinaselenge tinaselenge marked this pull request as ready for review October 29, 2024 12:56
@ppatierno ppatierno added this to the 0.45.0 milestone Oct 30, 2024
@ppatierno ppatierno requested review from a team and ppatierno October 30, 2024 08:42
@katheris (Contributor) left a comment:
Couple of small suggestions, but otherwise LGTM

@ppatierno (Member)

/azp run regression

Azure Pipelines successfully started running 1 pipeline(s).

@ppatierno (Member) left a comment:

LGTM. Thanks!

@ppatierno ppatierno requested a review from a team November 4, 2024 09:13
@tinaselenge (Contributor, Author)

Thank you all for reviewing the PR. I pushed an update addressing the suggestions from @katheris.

@tinaselenge (Contributor, Author)

@ppatierno can you please kick off the build and regression tests again?

@im-konge (Member) commented Nov 4, 2024

/azp run build

Azure Pipelines successfully started running 1 pipeline(s).

@im-konge (Member) commented Nov 4, 2024

/azp run regression

Azure Pipelines successfully started running 1 pipeline(s).

@ppatierno (Member)

@tinaselenge I see some failures on regression. While some of them may be unrelated, can you please double-check the failure on testAutoKafkaRebalanceScaleUpScaleDown, since this PR is about tweaking the rebalance operator? Thanks!

@tinaselenge (Contributor, Author)

Looks like testAutoKafkaRebalanceScaleUpScaleDown was added recently and did not yet exist in my branch. I rebased onto main and ran the test locally, and it passed. I'm also running KafkaRollerST#testKafkaDoesNotRollsWhenTopicIsUnderReplicated locally just in case, as that test timed out during the regression run, though it shouldn't be related to these changes anyway.

@ppatierno (Member)

I ran the failed jobs again, let's see.

@ppatierno (Member)

Regression was ok. Going to merge this one.

@ppatierno ppatierno merged commit 9a9a6df into strimzi:main Nov 6, 2024
13 checks passed
@tinaselenge (Contributor, Author)

Thank you @ppatierno !

@tinaselenge tinaselenge deleted the issue#10704 branch November 6, 2024 10:50
Successfully merging this pull request may close these issues:

  • [Bug]: Error getting status of rebalance task via /user_tasks endpoint results in "NotReady" state (#10704)
5 participants