docs(refactor): updates the cruise control rebalance concepts #10810

Open
wants to merge 1 commit into
base: main
PaulRMellor
Contributor

@PaulRMellor PaulRMellor commented Nov 6, 2024

Documentation

Refactor and refresh of Cruise Control concepts

  • Adds new optimization proposal process flow description and diagram, including description of partition reassignment commands
  • Less verbose and more direct intro and concepts
  • Removes three overview files to create a single "components and features" file for goals and proposals concepts
  • Consolidates related conceptual information (goals, proposals) into single sections
  • Retitled sections to provide more direction to readers from ToC

Checklist

Please go through this checklist and make sure all applicable tasks have been done

  • Write tests
  • Make sure all tests pass
  • Update documentation
  • Check RBAC rights for Kubernetes / OpenShift roles
  • Try your changes from Pod inside your Kubernetes and OpenShift cluster, not just locally
  • Reference relevant issue(s) and close them after merging
  • Update CHANGELOG.md
  • Supply screenshots for visual changes, such as Grafana dashboards

@PaulRMellor PaulRMellor added this to the 0.45.0 milestone Nov 6, 2024
@PaulRMellor PaulRMellor requested review from kyguy, fvaleri and a team November 6, 2024 16:32
@PaulRMellor PaulRMellor self-assigned this Nov 6, 2024
Contributor

@fvaleri fvaleri left a comment

Hi @PaulRMellor, thanks. I must say that our CC doc is quite good :)

I left some comments for your consideration.

Cruise Control also provides a REST API for client interactions, which Strimzi uses to support these features:

* Generating optimization proposals from optimization goals
* Rebalancing a Kafka cluster based on an optimization proposal
Contributor

@fvaleri fvaleri Nov 7, 2024

  • Changing topic replication factor


include::../../modules/cruise-control/con-cruise-control-overview.adoc[leveloffset=+1]
As Kafka clusters evolve, some brokers may become overloaded while others remain underutilized.
Cruise Control addresses this imbalance by modeling resource utilization--CPU, disk, network load--and generating optimization proposals (that you can approve or reject) for balanced partition assignments based on configurable optimization goals.
Contributor

Suggested change
Cruise Control addresses this imbalance by modeling resource utilization--CPU, disk, network load--and generating optimization proposals (that you can approve or reject) for balanced partition assignments based on configurable optimization goals.
Cruise Control addresses this imbalance by modeling resource utilization at replica-granularity--CPU, disk, network load--and generating optimization proposals (that you can approve or reject) for balanced partition assignments based on configurable optimization goals.

Comment on lines +31 to +32
* *Main goals* are inherited from Cruise Control, some preset as hard goals, used by default.
* *Default goals* are the same as main goals by default but customizable.
Contributor

As the name suggests, the Analyzer's `default.goals` is what is used by default when requesting a proposal, and it is equal to the Analyzer's `goals` if the user does not set `default.goals`.

The `goals` property is customizable in case the user wants to restrict the supported goals to a subset of what Cruise Control provides by default. This means that, if a goal is not present in the Analyzer's `goals`, then it cannot be used in the Analyzer's `default.goals` or in a proposal's goals.

The rest of the documentation seems to match my understanding. Do you think we can improve the wording a bit here?
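
The relationship described in this comment could be sketched roughly as follows. This is an illustrative Python model only, not Cruise Control or Strimzi code: the function `resolve_goals`, its parameter names, and the goal strings in the usage note are hypothetical.

```python
from typing import List, Optional, Sequence


def resolve_goals(
    goals: Sequence[str],
    default_goals: Optional[Sequence[str]] = None,
    proposal_goals: Optional[Sequence[str]] = None,
) -> List[str]:
    """Return the goals used for one proposal request (hypothetical model).

    goals          -- stands in for the Analyzer's `goals` (supported goals)
    default_goals  -- stands in for `default.goals`; falls back to `goals`
    proposal_goals -- goals requested for one specific proposal, if any
    """
    supported = list(goals)
    # default.goals equals goals when the user does not set it.
    defaults = list(default_goals) if default_goals is not None else supported

    # A goal missing from `goals` cannot appear in default.goals
    # or in the goals of an individual proposal.
    for name, candidate in (
        ("default.goals", defaults),
        ("proposal goals", list(proposal_goals or [])),
    ):
        unsupported = [g for g in candidate if g not in supported]
        if unsupported:
            raise ValueError(f"{name} contains unsupported goals: {unsupported}")

    # Goals given for a specific proposal take precedence over the defaults.
    return list(proposal_goals) if proposal_goals else defaults
```

For example, `resolve_goals(["A", "B"])` falls back to both supported goals, while `resolve_goals(["A"], default_goals=["C"])` raises a `ValueError` because `"C"` is not in the supported list.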

* *Soft goals* are optional and can be set aside if hard goals are met.
* *Main goals* are inherited from Cruise Control, some preset as hard goals, used by default.
* *Default goals* are the same as main goals by default but customizable.
* *User-provided goals* are a subset of default goals configured for specific proposals.
Contributor

"User provided" (here and in other places) is also a bit confusing to me, because all goal configurations can be user-provided. What about "`KafkaRebalance` goals" instead?

. Load Monitor collects the metrics from Kafka brokers, including CPU, disk, and network utilization data.
. Anomaly Detector continuously monitors the collected metrics to identify anomalies, such as broker failures or disk capacity issues, that could impact cluster stability.
. Analyzer processes the collected metrics and constructs a _workload model_ of the current state of the Kafka cluster.
It generates an optimization proposal based on configured goals (and constraints) such as balancing partition distribution across brokers, which is sent to the status of the `KafkaRebalance` resource.
Contributor

Suggested change
It generates an optimization proposal based on configured goals (and constraints) such as balancing partition distribution across brokers, which is sent to the status of the `KafkaRebalance` resource.
Based on configured goals and capacities, it generates an optimization proposal for balancing partition distribution across brokers, which is finally reflected in the status of the `KafkaRebalance` resource.

Intra-broker rebalancing moves data between disks on the same broker when you are using a JBOD storage configuration.
Such information can be useful even if you don't go ahead and approve the proposal.

You might reject an optimization proposal, or delay its approval, because of the additional load on a Kafka cluster when rebalancing.
Contributor

We might want to add that if the proposal is too old, the cluster load may have changed significantly, so it would be better to request a new proposal.
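
One way to picture the staleness concern raised here is a simple age check before approval. This is a minimal sketch under stated assumptions: the helper `proposal_is_stale` and the 30-minute default threshold are hypothetical and do not come from Strimzi or Cruise Control.

```python
from datetime import datetime, timedelta


def proposal_is_stale(
    generated_at: datetime,
    now: datetime,
    max_age: timedelta = timedelta(minutes=30),  # hypothetical threshold
) -> bool:
    """Return True when the proposal is older than max_age.

    An older proposal reflects an older view of the cluster load, so
    requesting a fresh one may be safer than approving it.
    """
    return now - generated_at > max_age
```

A suitable threshold would depend on how quickly the load on a particular cluster changes.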

----

The proposal will also move 24 partition leaders to different brokers.
This requires a change to the ZooKeeper configuration, which has a low impact on performance.
Contributor

Suggested change
This requires a change to the ZooKeeper configuration, which has a low impact on performance.
This requires a change to the cluster metadata, which has a low impact on performance.


They are categorized as follows:

* *Hard goals* are preset and mandatory for a proposal to succeed.
Member

@kyguy kyguy Nov 7, 2024

What do we mean by "mandatory" here? That the hard goals must be set, or that they must be satisfied? The latter is a key distinction of hard goals that we don't want to lose from the docs.

They are categorized as follows:

* *Hard goals* are preset and mandatory for a proposal to succeed.
* *Soft goals* are optional and can be set aside if hard goals are met.
Member

@kyguy kyguy Nov 7, 2024

Even in the original doc, the phrase "They can be set aside if it means that all hard goals are met" sounds as if the soft goals can be ignored or are unimportant if hard goals are met. However, even if the hard goals are met, the soft goals are still important. Although the soft goals are best effort and will not prevent an optimization proposal from being generated, they are still taken into account when creating an optimization proposal.

I would keep the original bullets concerning "hard goals" and "soft goals" but fix the line "They can be set aside if it means that all hard goals are met." to show that soft goals are "best effort" but will not block an optimization proposal from being created if the hard goals are met/satisfied.
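
The distinction drawn in this comment could be sketched as follows. This is a simplified, hypothetical Python model and not how Cruise Control is implemented: `evaluate_proposal`, the `GoalCheck` type, and the dictionary layout are assumptions made for illustration, and the sketch only checks a candidate assignment rather than optimizing it.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# A goal check takes a candidate assignment and reports whether it is satisfied.
GoalCheck = Callable[[dict], bool]


def evaluate_proposal(
    candidate: dict,
    hard_goals: Dict[str, GoalCheck],
    soft_goals: Sequence[Tuple[str, GoalCheck]],
) -> Optional[dict]:
    """Hypothetical model: hard goals gate the proposal, soft goals do not."""
    # If any hard goal is not satisfied, no proposal is generated at all.
    for check in hard_goals.values():
        if not check(candidate):
            return None

    # Soft goals are still evaluated, in the order listed, but a violation
    # is only recorded; it never blocks the proposal.
    violated: List[str] = [
        name for name, check in soft_goals if not check(candidate)
    ]
    return {"assignment": candidate, "violated_soft_goals": violated}
```

In this model a result of `None` means no proposal exists, whereas a proposal with a non-empty `violated_soft_goals` list is still returned.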

@@ -5,17 +5,21 @@
[id='cruise-control-concepts-{context}']
= Using Cruise Control for cluster rebalancing

include::../../modules/cruise-control/con-cruise-control-description.adoc[leveloffset=+1]
[role="_abstract"]
Cruise Control is an open-source system for Kafka that monitors broker loads and rebalances partitions to optimize use of resources across the cluster.
Member

I see we use the word "system" in the original doc before these changes but is it the correct word? Wouldn't "application" be a better fit here?

Suggested change
Cruise Control is an open-source system for Kafka that monitors broker loads and rebalances partitions to optimize use of resources across the cluster.
Cruise Control is an open-source application designed to run alongside Kafka to help optimize use of cluster resources by:
* Monitoring cluster workload
* Rebalancing partitions based on predefined constraints

include::../../modules/cruise-control/con-cruise-control-description.adoc[leveloffset=+1]
[role="_abstract"]
Cruise Control is an open-source system for Kafka that monitors broker loads and rebalances partitions to optimize use of resources across the cluster.
Rebalances help with running a more balanced Kafka cluster that uses brokers more efficiently.
Member

@kyguy kyguy Nov 7, 2024

Since we use Cruise Control for more than just rebalances, for example topic replication factor changes, I would suggest we alter the original line like this:

Suggested change
Rebalances help with running a more balanced Kafka cluster that uses brokers more efficiently.
Cruise Control operations help with running a more balanced Kafka cluster that uses brokers more efficiently.

The classification of hard and soft goals is fixed in Cruise Control code and cannot be changed.

A proposal meeting all hard goals is valid, even if it violates some soft goals.
Cruise Control prioritizes satisfying hard goals and then maximizes soft goals.
Member

@kyguy kyguy Nov 7, 2024

Suggested change
Cruise Control prioritizes satisfying hard goals and then maximizes soft goals.
Cruise Control prioritizes satisfying hard goals and then prioritizes satisfying soft goals in the order in which they are listed.


=== Hard and soft goals

Hard goals are mandatory and must be satisfied for optimization proposals to be valid.
Member

Suggested change
Hard goals are mandatory and must be satisfied for optimization proposals to be valid.
Hard goals are mandatory and must be satisfied for optimization proposals to be generated.

* To specify hard goals, list them in `hard.goals`.
* To exclude a hard goal, ensure it's not in either `default.goals` or `hard.goals`.

Increasing the number of configured hard goals will reduce the likelihood of Cruise Control generating valid optimization proposals.
Member

It'll reduce the likelihood of generating an optimization proposal, period. From what I understand, there is no concept of an invalid optimization proposal. If the hard goals cannot be satisfied, a proposal will not be generated.

Suggested change
Increasing the number of configured hard goals will reduce the likelihood of Cruise Control generating valid optimization proposals.
Increasing the number of configured hard goals will reduce the likelihood of Cruise Control generating optimization proposals.

3 participants