docs(refactor): updates the cruise control rebalance concepts #10810
base: main
Signed-off-by: prmellor <[email protected]>
Hi @PaulRMellor, thanks. I must say that our CC doc is quite good :)
I left some comments for your consideration.
Cruise Control also provides a REST API for client interactions, which Strimzi uses to support these features: | ||
|
||
* Generating optimization proposals from optimization goals | ||
* Rebalancing a Kafka cluster based on an optimization proposal |
- Changing topic replication factor
|
||
include::../../modules/cruise-control/con-cruise-control-overview.adoc[leveloffset=+1] | ||
As Kafka clusters evolve, some brokers may become overloaded while others remain underutilized. | ||
Cruise Control addresses this imbalance by modeling resource utilization--CPU, disk, network load--and generating optimization proposals (that you can approve or reject) for balanced partition assignments based on configurable optimization goals. |
Cruise Control addresses this imbalance by modeling resource utilization--CPU, disk, network load--and generating optimization proposals (that you can approve or reject) for balanced partition assignments based on configurable optimization goals. | |
Cruise Control addresses this imbalance by modeling resource utilization at replica-granularity--CPU, disk, network load--and generating optimization proposals (that you can approve or reject) for balanced partition assignments based on configurable optimization goals. |
* *Main goals* are inherited from Cruise Control, some preset as hard goals, used by default. | ||
* *Default goals* are the same as main goals by default but customizable. |
As the name suggests, the Analyzer's `default.goals` is what is used by default when requesting a proposal, and it is equal to the Analyzer's `goals` if the user does not set `default.goals`.
The `goals` property is customizable in case the user wants to restrict the supported goals to a subset of what Cruise Control provides by default. This means that, if a goal is not present in the Analyzer's `goals`, then it cannot be used in the Analyzer's `default.goals` or in a proposal's goals.
The rest of the documentation seems to match my understanding. Do you think we can improve the wording a bit here?
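To make the relationship concrete, here is a minimal Python sketch of the subset rule described in this comment. It is not Cruise Control code: the helper name `validate_goal_config` is hypothetical, and the real validation happens inside Cruise Control and may differ in detail.

```python
from typing import List, Optional, Sequence


def validate_goal_config(
    goals: Sequence[str],
    default_goals: Optional[Sequence[str]] = None,
    proposal_goals: Optional[Sequence[str]] = None,
) -> List[str]:
    """Return the goals a proposal request would use (hypothetical helper).

    goals          -- the Analyzer's `goals` (the supported set)
    default_goals  -- the Analyzer's `default.goals`, or None if unset
    proposal_goals -- goals named on a single proposal request, or None
    """
    supported = list(goals)

    # Assumption from the comment: `default.goals` equals `goals` when unset.
    effective_default = list(default_goals) if default_goals is not None else supported

    # A goal missing from `goals` cannot appear in `default.goals`.
    for goal in effective_default:
        if goal not in supported:
            raise ValueError(f"{goal} is in default.goals but not in goals")

    if proposal_goals is None:
        return effective_default

    # A goal missing from `goals` cannot be requested for a proposal either.
    for goal in proposal_goals:
        if goal not in supported:
            raise ValueError(f"{goal} is requested but not in goals")
    return list(proposal_goals)
```

For example, with `goals` set to `["RackAwareGoal", "ReplicaCapacityGoal"]` (shortened names used for illustration) and `default.goals` unset, a request without its own goals would use both, while a request naming a goal outside that list would be rejected.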
* *Soft goals* are optional and can be set aside if hard goals are met. | ||
* *Main goals* are inherited from Cruise Control, some preset as hard goals, used by default. | ||
* *Default goals* are the same as main goals by default but customizable. | ||
* *User-provided goals* are a subset of default goals configured for specific proposals. |
"User provided" (here and in other places) is also a bit confusing for me, because all goals configurations can be user provided. What about `KafkaRebalance` goals?
. Load Monitor collects the metrics from Kafka brokers, including CPU, disk, and network utilization data. | ||
. Anomaly Detector continuously monitors the collected metrics to identify anomalies, such as broker failures or disk capacity issues, that could impact cluster stability. | ||
. Analyzer processes the collected metrics and constructs a _workload model_ of the current state of the Kafka cluster. | ||
It generates an optimization proposal based on configured goals (and constraints) such as balancing partition distribution across brokers, which is sent to the status of the `KafkaRebalance` resource. |
It generates an optimization proposal based on configured goals (and constraints) such as balancing partition distribution across brokers, which is sent to the status of the `KafkaRebalance` resource. | |
Based on configured goals and capacities, it generates an optimization proposal for balancing partition distribution across brokers, which is finally reflected in the status of the `KafkaRebalance` resource. |
Intra-broker rebalancing moves data between disks on the same broker when you are using a JBOD storage configuration. | ||
Such information can be useful even if you don't go ahead and approve the proposal. | ||
|
||
You might reject an optimization proposal, or delay its approval, because of the additional load on a Kafka cluster when rebalancing. |
We might want to add that if the proposal is too old, the cluster load may have changed significantly, so it would be better to request a new proposal.
---- | ||
|
||
The proposal will also move 24 partition leaders to different brokers. | ||
This requires a change to the ZooKeeper configuration, which has a low impact on performance. |
This requires a change to the ZooKeeper configuration, which has a low impact on performance. | |
This requires a change to the cluster metadata, which has a low impact on performance. |
|
||
They are categorized as follows: | ||
|
||
* *Hard goals* are preset and mandatory for a proposal to succeed. |
What do we mean by "mandatory" here? That the hard goals must be set? Or "must be satisfied"? The latter is a key distinction of hard goals that we don't want to lose from the docs
They are categorized as follows: | ||
|
||
* *Hard goals* are preset and mandatory for a proposal to succeed. | ||
* *Soft goals* are optional and can be set aside if hard goals are met. |
Even in the original doc, the phrase "They can be set aside if it means that all hard goals are met" sounds as if the soft goals can be ignored or are unimportant if hard goals are met. However, even if the hard goals are met, the soft goals are still important. Although the soft goals are best effort and will not prevent an optimization proposal from being generated, they are still taken into account when creating an optimization proposal.
I would keep the original bullets concerning "hard goals" and "soft goals" but fix the line "They can be set aside if it means that all hard goals are met." to show that soft goals are "best effort" but will not block an optimization proposal from being created if the hard goals are met/satisfied.
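The distinction I have in mind can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how Cruise Control is implemented: `Goal`, `evaluate`, and `max_per_broker` are hypothetical names, and an assignment is reduced to a partition-to-broker mapping.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Toy model: partition name -> broker id. Not Cruise Control's real workload model.
Assignment = Dict[str, int]


@dataclass
class Goal:
    name: str
    hard: bool
    is_satisfied: Callable[[Assignment], bool]


def evaluate(candidate: Assignment, goals: List[Goal]) -> Optional[List[str]]:
    """Check a candidate assignment against goals in their listed order.

    Returns None when a hard goal is violated (no proposal is produced).
    Otherwise returns the names of the soft goals that were not met:
    they are still evaluated, but they never block the proposal.
    """
    violated_soft: List[str] = []
    for goal in goals:
        if goal.is_satisfied(candidate):
            continue
        if goal.hard:
            return None  # hard goals must be satisfied
        violated_soft.append(goal.name)  # soft goals are best effort
    return violated_soft


def max_per_broker(limit: int) -> Callable[[Assignment], bool]:
    """Build a check that no broker hosts more than `limit` partitions."""
    def check(assignment: Assignment) -> bool:
        counts: Dict[int, int] = {}
        for broker in assignment.values():
            counts[broker] = counts.get(broker, 0) + 1
        return all(count <= limit for count in counts.values())
    return check
```

With a hard goal built from `max_per_broker(2)` and a soft goal built from `max_per_broker(1)`, placing two partitions on one broker still yields a result that reports the soft goal as unmet, whereas placing three partitions on one broker yields no result at all.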
@@ -5,17 +5,21 @@ | |||
[id='cruise-control-concepts-{context}'] | |||
= Using Cruise Control for cluster rebalancing | |||
|
|||
include::../../modules/cruise-control/con-cruise-control-description.adoc[leveloffset=+1] | |||
[role="_abstract"] | |||
Cruise Control is an open-source system for Kafka that monitors broker loads and rebalances partitions to optimize use of resources across the cluster. |
I see we use the word "system" in the original doc before these changes but is it the correct word? Wouldn't "application" be a better fit here?
Cruise Control is an open-source system for Kafka that monitors broker loads and rebalances partitions to optimize use of resources across the cluster. | |
Cruise Control is an open-source application designed to run alongside Kafka to help optimize use of cluster resources by: | |
* Monitoring cluster workload | |
* Rebalancing partitions based on predefined constraints |
include::../../modules/cruise-control/con-cruise-control-description.adoc[leveloffset=+1] | ||
[role="_abstract"] | ||
Cruise Control is an open-source system for Kafka that monitors broker loads and rebalances partitions to optimize use of resources across the cluster. | ||
Rebalances help with running a more balanced Kafka cluster that uses brokers more efficiently. |
Since we use Cruise Control for more than just rebalances, for example topic replication factor changes, I would suggest we alter the original line like this
Rebalances help with running a more balanced Kafka cluster that uses brokers more efficiently. | |
Cruise Control operations help with running a more balanced Kafka cluster that uses brokers more efficiently. |
The classification of hard and soft goals is fixed in Cruise Control code and cannot be changed. | ||
|
||
A proposal meeting all hard goals is valid, even if it violates some soft goals. | ||
Cruise Control prioritizes satisfying hard goals and then maximizes soft goals. |
Cruise Control prioritizes satisfying hard goals and then maximizes soft goals. | |
Cruise Control prioritizes satisfying hard goals and then prioritizes satisfying soft goals in the order in which they are listed. |
|
||
=== Hard and soft goals | ||
|
||
Hard goals are mandatory and must be satisfied for optimization proposals to be valid. |
Hard goals are mandatory and must be satisfied for optimization proposals to be valid. | |
Hard goals are mandatory and must be satisfied for optimization proposals to be generated. |
* To specify hard goals, list them in `hard.goals`. | ||
* To exclude a hard goal, ensure it's not in either `default.goals` or `hard.goals`. | ||
|
||
Increasing the number of configured hard goals will reduce the likelihood of Cruise Control generating valid optimization proposals. |
It'll reduce the likelihood of generating an optimization proposal, period. From what I understand, there is no concept of an invalid optimization proposal. If the hard goals cannot be satisfied, a proposal will not be generated.
Increasing the number of configured hard goals will reduce the likelihood of Cruise Control generating valid optimization proposals. | |
Increasing the number of configured hard goals will reduce the likelihood of Cruise Control generating optimization proposals. |
Documentation
Refactor and refresh of Cruise Control concepts