We have noticed that the Ophan cluster spends many hours rebalancing after a node rotation.
When vacating a node using node exclusion, Elasticsearch will move the shards away, but that doesn’t mean they will all go to the new node. The old node quickly passes its shards on to other nodes and is then terminated, but a lot of rebalancing follows as the cluster works to satisfy its balancing heuristics. This rebalancing proceeds at a slow pace to minimize the impact on cluster performance.
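For reference, node exclusion is applied through a cluster settings update. Below is a minimal sketch of the request body, assuming the exclusion filters on node name (`_name`) and is set as a transient setting; the tool may well filter on `_ip` or `_id` instead. `exclusion_settings` is a hypothetical helper for illustration, not code from this repo.

```python
import json
from typing import Optional


def exclusion_settings(node_name: Optional[str]) -> dict:
    """Build a body for PUT /_cluster/settings.

    Assumption: exclusion by node name, as a transient setting.
    Passing None produces a JSON null, which clears the exclusion.
    """
    return {
        "transient": {
            "cluster.routing.allocation.exclude._name": node_name
        }
    }


# "old-node" is a made-up node name for illustration.
print(json.dumps(exclusion_settings("old-node"), indent=2))
```

The setting only tells Elasticsearch where shards must not live. It says nothing about where they should go, which is why the shards end up spread across the remaining nodes.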
Elastic have suggested that another option would be to move the shards from the old to the new node using Cluster Reroute. As all shards are moved to the new node, this should cause minimal rebalance, if any.
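To make the suggestion concrete, here is a rough sketch of building a Cluster Reroute request body with one `move` command per shard on the old node. It assumes the shard list comes from `GET _cat/shards?format=json` (rows with `index`, `shard`, `state` and `node` fields, all strings). `reroute_body` and the sample rows are hypothetical, and the sketch does not send anything to a cluster.

```python
from typing import Dict, List


def reroute_body(cat_shards: List[Dict[str, str]],
                 old_node: str, new_node: str) -> dict:
    """Build a body for POST /_cluster/reroute.

    Emits one "move" command for every STARTED shard currently on
    old_node. Assumption: new_node holds no copy of these shards yet;
    Elasticsearch will not place two copies of a shard on one node.
    """
    commands = [
        {
            "move": {
                "index": row["index"],
                "shard": int(row["shard"]),  # _cat output gives strings
                "from_node": old_node,
                "to_node": new_node,
            }
        }
        for row in cat_shards
        if row.get("node") == old_node and row.get("state") == "STARTED"
    ]
    return {"commands": commands}


# Made-up rows in the shape returned by _cat/shards?format=json.
sample_rows = [
    {"index": "events", "shard": "0", "prirep": "p", "state": "STARTED", "node": "old-node"},
    {"index": "events", "shard": "1", "prirep": "r", "state": "STARTED", "node": "other-node"},
    {"index": "events", "shard": "2", "prirep": "p", "state": "RELOCATING", "node": "old-node"},
]
body = reroute_body(sample_rows, "old-node", "new-node")
print(body)
```

Only shards in the `STARTED` state are included, so shards that are already relocating are left alone. In practice the moves would probably need to be batched, since each move is still subject to the cluster's allocation and recovery limits.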
@davidfurey - I agree that the extended period of rebalancing is undesirable; it was always our intention to migrate all data from the old node onto the newest node. Perhaps this happened to work OK on the smaller clusters that we were testing with, or perhaps we just never fulfilled this requirement correctly... either way I definitely think the suggestion from you/Elastic would be an improvement on the current behaviour.
@tomrf1 might have thoughts on this too? (Perhaps your memory is better than mine!)
My memory of this is not great...
But I do remember us spending a lot of time making sure the shards go from the old node to the new node without any further re-allocation. Perhaps something has changed since?
Cluster Reroute is new to me, but that sounds like exactly what we want!