docs(auto-rebalance): adds new and updated content for auto-rebalancing #10698

Merged (4 commits) on Oct 21, 2024
@@ -0,0 +1,69 @@
// Module included in the following assemblies:
//
// deploying.adoc

[id='assembly-scaling-kafka-clusters-{context}']
= Scaling clusters by adding or removing brokers

[role="_abstract"]
Scaling Kafka clusters by adding brokers can improve performance and reliability.
Increasing the number of brokers provides more resources, enabling the cluster to handle larger workloads and process more messages.
It also enhances fault tolerance by providing additional replicas.
Conversely, removing underutilized brokers can reduce resource consumption and increase efficiency.
Redistributing partitions across brokers reduces the load on individual brokers and increases the overall throughput of the cluster.
However, scaling must be done carefully to avoid disruption or data loss.

Adjusting the `replicas` configuration in a node pool changes the number of brokers in a cluster.
For example, a node pool with `replicas: 3` runs three brokers, which can then each hold a replica of a topic partition, ensuring fault tolerance in case of broker failure:

.Example node pool configuration for the number of replicas
[source,yaml,subs="+attributes"]
----
apiVersion: {KafkaNodePoolApiVersion}
kind: KafkaNodePool
metadata:
  name: my-node-pool
  labels:
    strimzi.io/cluster: my-cluster
spec:
  replicas: 3
  # ...
----
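If the Strimzi version in use enables the scale subresource on the `KafkaNodePool` custom resource, the broker count can also be adjusted with `kubectl scale` instead of editing the resource directly. A sketch, with illustrative resource and namespace names:

.Example command to scale a node pool
[source,shell]
----
# Scale the node pool from 3 to 4 brokers
kubectl scale kafkanodepool my-node-pool --replicas=4 -n my-namespace
----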

The actual replication factor for topics depends on the number of available brokers and how many brokers store replicas of each topic partition (configured by `default.replication.factor`).
The minimum number of replicas that must acknowledge a write for it to be considered successful is defined by `min.insync.replicas`:

.Example configuration for topic replication
[source,yaml,subs="+attributes"]
----
apiVersion: {KafkaApiVersion}
kind: Kafka
metadata:
  name: my-cluster
spec:
  kafka:
    config:
      default.replication.factor: 3
      min.insync.replicas: 2
  # ...
----

When adding brokers by changing the number of `replicas`, node IDs start at 0, and the Cluster Operator assigns the next lowest available ID to new brokers.
Removing brokers starts with the pod that has the highest node ID.
Additionally, when scaling clusters with node pools, you can xref:proc-managing-node-pools-ids-{context}[assign node IDs for scaling operations].

Strimzi can automatically reassign partitions when brokers are added or removed if Cruise Control is deployed and auto-rebalancing is enabled in the Kafka resource.
If auto-rebalancing is disabled, you can use Cruise Control to generate optimization proposals before manually rebalancing the cluster.

Cruise Control provides `add-brokers` and `remove-brokers` modes for scaling:

* Use the `add-brokers` mode after scaling up to move partition replicas to the new brokers.
* Use the `remove-brokers` mode before scaling down to move partition replicas off the brokers being removed.

With auto-rebalancing, these modes run automatically using the default Cruise Control configuration or custom settings from a rebalancing template.
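Auto-rebalancing for these modes is enabled under `spec.cruiseControl.autoRebalance` in the `Kafka` resource. The following is a minimal sketch; the referenced `KafkaRebalance` template name is illustrative, and the exact schema may vary by Strimzi version:

.Example auto-rebalancing configuration for scaling events
[source,yaml,subs="+attributes"]
----
apiVersion: {KafkaApiVersion}
kind: Kafka
metadata:
  name: my-cluster
spec:
  cruiseControl:
    autoRebalance:
      # Rebalance automatically after brokers are added
      - mode: add-brokers
        template:
          name: my-rebalancing-template
      # Rebalance automatically before brokers are removed
      - mode: remove-brokers
        template:
          name: my-rebalancing-template
  # ...
----

If no `template` is referenced, the default Cruise Control configuration is used for the rebalance.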

NOTE: To increase the throughput of a Kafka topic, you can increase the number of partitions for that topic, distributing the load across multiple brokers.
However, if all brokers are constrained by a resource (such as I/O), adding more partitions won't improve throughput, and adding more brokers is necessary.
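For example, the partition count for a topic managed by the Topic Operator is set in the `KafkaTopic` resource. A sketch, with an illustrative topic name and the API version current at the time of writing:

.Example topic configuration with an increased partition count
[source,yaml]
----
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: my-topic
  labels:
    strimzi.io/cluster: my-cluster
spec:
  partitions: 12 # increased to spread load across more brokers
  replicas: 3
----

Note that the partition count of an existing topic can only be increased, never decreased.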

include::../../modules/cruise-control/proc-automating-rebalances.adoc[leveloffset=+1]
include::../../modules/configuring/con-skipping-scale-down-checks.adoc[leveloffset=+1]
2 changes: 1 addition & 1 deletion documentation/deploying/deploying.adoc
@@ -45,7 +45,7 @@ include::assemblies/security/assembly-security.adoc[leveloffset=+1]
//security context for all pods
include::assemblies/configuring/assembly-security-providers.adoc[leveloffset=+1]
//Scaling clusters
include::modules/configuring/con-scaling-kafka-clusters.adoc[leveloffset=+1]
include::assemblies/configuring/assembly-scaling-kafka-clusters.adoc[leveloffset=+1]
//Using Cruise Control for rebalancing
include::assemblies/cruise-control/assembly-cruise-control-concepts.adoc[leveloffset=+1]
//Using Cruise Control for changing topic replication factor
4 changes: 3 additions & 1 deletion documentation/modules/configuring/con-config-examples.adoc
@@ -46,6 +46,8 @@ examples
<5> xref:assembly-metrics-config-files-{context}[Metrics configuration], including Prometheus installation and Grafana dashboard files.
<6> `Kafka` and `KafkaNodePool` custom resource configurations for a deployment of Kafka clusters that use ZooKeeper mode. Includes example configuration for an ephemeral or persistent single or multi-node deployment.
<7> `Kafka` and `KafkaNodePool` configurations for a deployment of Kafka clusters that use KRaft (Kafka Raft metadata) mode.
<8> `Kafka` custom resource with a deployment configuration for Cruise Control. Includes `KafkaRebalance` custom resources to generate optimization proposals from Cruise Control, with example configurations to use the default or user optimization goals.
<8> `Kafka` and `KafkaRebalance` configurations for deploying and using Cruise Control to manage clusters.
`Kafka` configuration examples enable auto-rebalancing on scaling events and set default optimization goals.
`KafkaRebalance` configuration examples set user-provided optimization goals and generate optimization proposals in various supported modes.
<9> `KafkaConnect` and `KafkaConnector` custom resource configuration for a deployment of Kafka Connect. Includes example configurations for a single or multi-node deployment.
<10> `KafkaBridge` custom resource configuration for a deployment of Kafka Bridge.
56 changes: 0 additions & 56 deletions documentation/modules/configuring/con-scaling-kafka-clusters.adoc

This file was deleted.

13 changes: 9 additions & 4 deletions documentation/modules/configuring/proc-moving-node-pools.adoc
@@ -20,14 +20,17 @@ We scale up `pool-a`, and reassign partitions and scale down `pool-b`, which res
* `pool-a` with four replicas
* `pool-b` with three replicas

Currently, scaling is only possible for broker-only node pools, whose nodes run as dedicated brokers.

NOTE: During this process, the ID of the node that holds the partition replicas changes. Consider any dependencies that reference the node ID.

.Prerequisites

* xref:deploying-cluster-operator-str[The Cluster Operator must be deployed.]
* xref:proc-configuring-deploying-cruise-control-str[Cruise Control is deployed with Kafka.]
* (Optional) For scale up and scale down operations, xref:proc-managing-node-pools-ids-{context}[you can specify the range of node IDs to use].
+
* (Optional) xref:proc-automating-rebalances-{context}[Auto-rebalancing is enabled]. +
If auto-rebalancing is enabled, partition reassignment happens automatically during the node scaling process, so you don't need to manually initiate the reassignment through Cruise Control.
* (Optional) For scale up and scale down operations, xref:proc-managing-node-pools-ids-{context}[you can specify the range of node IDs to use]. +
If you have assigned node IDs for the operation, the ID of the node being added or removed is determined by the sequence of nodes given.
Otherwise, the lowest available node ID across the cluster is used when adding nodes; and the node with the highest available ID in the node pool is removed.
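As a sketch of assigning node IDs for the operation described above, the Strimzi annotations for next and removed node IDs can be applied before changing `replicas` (pool names follow the example in this procedure; verify the annotation names against the linked procedure for your version):

[source,shell]
----
# Assign ID 7 to the next node added when scaling up pool-a
kubectl annotate kafkanodepool pool-a strimzi.io/next-node-ids="[7]"

# Remove the node with ID 6 when scaling down pool-b
kubectl annotate kafkanodepool pool-b strimzi.io/remove-node-ids="[6]"
----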

@@ -65,10 +68,12 @@ my-cluster-pool-b-6 1/1 Running 0
+
Node IDs are appended to the name of the node on creation.
We add node `my-cluster-pool-a-7`, which has a node ID of `7`.
+
If auto-rebalancing is enabled, partitions are reassigned to new nodes and moved off brokers that are going to be removed automatically, so you can skip the next step.

. Reassign the partitions from the old node to the new node.
. If auto-rebalancing is not enabled, reassign partitions before decreasing the number of nodes in the source node pool.
+
Before scaling down the source node pool, use the Cruise Control `remove-brokers` mode to move partition replicas off the brokers that are going to be removed.
Use the Cruise Control `remove-brokers` mode to move partition replicas off the brokers that are going to be removed.
+
.Using Cruise Control to reassign partition replicas
[source,shell,subs="+attributes"]
@@ -30,8 +30,9 @@ NOTE: During this process, the ID of the node that holds the partition replicas

* xref:deploying-cluster-operator-str[The Cluster Operator must be deployed.]
* xref:proc-configuring-deploying-cruise-control-str[Cruise Control is deployed with Kafka.]
* (Optional) For scale down operations, xref:proc-managing-node-pools-ids-{context}[you can specify the node IDs to use in the operation].
+
* (Optional) xref:proc-automating-rebalances-{context}[Auto-rebalancing is enabled]. +
If auto-rebalancing is enabled, partition reassignment happens automatically during the node scaling process, so you don't need to manually initiate the reassignment through Cruise Control.
* (Optional) For scale down operations, xref:proc-managing-node-pools-ids-{context}[you can specify the node IDs to use in the operation]. +
If you have assigned a range of node IDs for the operation, the ID of the node being removed is determined by the sequence of nodes given.
If you have assigned a single node ID, the node with the specified ID is removed.
Otherwise, the node with the highest available ID in the node pool is removed.
@@ -40,7 +41,8 @@ Otherwise, the node with the highest available ID in the node pool is removed.

. Reassign the partitions before decreasing the number of nodes in the node pool.
+
Before scaling down a node pool, use the Cruise Control `remove-brokers` mode to move partition replicas off the brokers that are going to be removed.
* If auto-rebalancing is enabled, partitions are moved off brokers that are going to be removed automatically, so you can skip this step.
* If auto-rebalancing is not enabled, use the Cruise Control `remove-brokers` mode to move partition replicas off the brokers that are going to be removed.
+
.Using Cruise Control to reassign partition replicas
[source,shell,subs="+attributes"]
@@ -29,8 +29,9 @@ NOTE: During this process, the ID of the node that holds the partition replicas

* xref:deploying-cluster-operator-str[The Cluster Operator must be deployed.]
* xref:proc-configuring-deploying-cruise-control-str[Cruise Control is deployed with Kafka.]
* (Optional) For scale up operations, xref:proc-managing-node-pools-ids-{context}[you can specify the node IDs to use in the operation].
+
* (Optional) xref:proc-automating-rebalances-{context}[Auto-rebalancing is enabled]. +
If auto-rebalancing is enabled, partition reassignment happens automatically during the node scaling process, so you don't need to manually initiate the reassignment through Cruise Control.
* (Optional) For scale up operations, xref:proc-managing-node-pools-ids-{context}[you can specify the node IDs to use in the operation]. +
If you have assigned a range of node IDs for the operation, the ID of the node being added is determined by the sequence of nodes given.
If you have assigned a single node ID, a node is added with the specified ID.
Otherwise, the lowest available node ID across the cluster is used.
@@ -61,11 +62,12 @@ my-cluster-pool-a-0 1/1 Running 0
my-cluster-pool-a-1 1/1 Running 0
my-cluster-pool-a-2 1/1 Running 0
my-cluster-pool-a-3 1/1 Running 0
----

. Reassign the partitions after increasing the number of nodes in the node pool.
+
After scaling up a node pool, use the Cruise Control `add-brokers` mode to move partition replicas from existing brokers to the newly added brokers.
* If auto-rebalancing is enabled, partitions are reassigned to new nodes automatically, so you can skip this step.
* If auto-rebalancing is not enabled, use the Cruise Control `add-brokers` mode to move partition replicas from existing brokers to the newly added brokers.
+
.Using Cruise Control to reassign partition replicas
[source,shell,subs="+attributes"]