bugfix: refactor alerts to accomodate for single-node clusters #1010

rexagod · 2025-01-06T06:29:38Z

For the sake of brevity, let:

Q:  kube_node_status_allocatable{job="kube-state-metrics",resource="cpu"} (allocatable)
QQ: namespace_cpu:kube_pod_container_resource_requests:sum{} (requested)

Both quota alert expressions relevant here (KubeCPUOvercommit and KubeMemoryOvercommit) have the form:

sum(QQ) by (cluster) - (sum(Q) by (cluster) - max(Q) by (cluster)) > 0 and (sum(Q) by (cluster) - max(Q) by (cluster)) > 0

In a single-node cluster, sum(Q) by (cluster) = max(Q) by (cluster), so this reduces to sum(QQ) by (cluster) > 0, i.e., the alert fires if any resource requests exist.

To address this, drop the max(Q) by (cluster) buffer, which is assumed for non-SNO clusters, on SNO (single-node OpenShift). The expression then reduces to:

sum(QQ) by (cluster) - sum(Q) by (cluster) > 0

That is, the alert triggers when total requested - total allocatable > 0. With only a single node, a buffer of one node's capacity does not make sense.
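The arithmetic can be checked with made-up numbers. The sketch below uses vector() literals in place of the real series, assuming a hypothetical single node with 4 allocatable cores and 1 requested core in total. It evaluates only the first comparison of the current rule, next to the proposed single-node form:

```promql
# Hypothetical values: requested = 1 core, allocatable = 4 cores on the only node.

# First comparison of the current rule: requested - (sum - max) > 0.
# On a single node, sum - max = 4 - 4 = 0, so this evaluates 1 - 0 > 0 and returns 1.
vector(1) - (vector(4) - vector(4)) > 0

# Proposed single-node form: requested - allocatable > 0.
# 1 - 4 = -3, which the comparison filters out, so nothing is returned.
vector(1) - vector(4) > 0
```

Each expression can be pasted separately into the Prometheus expression browser to confirm the results.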

Signed-off-by: Pranshu Srivastava <[email protected]>

rexagod · 2025-01-06T08:15:48Z

alerts/resource_alerts.libsonnet

              and
-              (sum(kube_node_status_allocatable{%(kubeStateMetricsSelector)s,resource="cpu"}) by (%(clusterLabel)s) - max(kube_node_status_allocatable{%(kubeStateMetricsSelector)s,resource="cpu"}) by (%(clusterLabel)s)) > 0
+              sum(namespace_cpu:kube_pod_container_resource_requests:sum{%(ignoringOverprovisionedWorkloadSelector)s}) by (%(clusterLabel)s) -
+              sum(kube_node_status_allocatable{%(kubeStateMetricsSelector)s,resource="cpu"}) by (%(clusterLabel)s) > 0)


Suggested change

-              sum(kube_node_status_allocatable{%(kubeStateMetricsSelector)s,resource="cpu"}) by (%(clusterLabel)s) > 0)
+              0.95 * sum(kube_node_status_allocatable{%(kubeStateMetricsSelector)s,resource="cpu"}) by (%(clusterLabel)s) > 0)

Since a max(Q) buffer does not apply on SNO, how about a numeric buffer of 5% (or more)? That should help the alert fire before requests exceed the allocatable budget.
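As a rough illustration of the suggested buffer, again with vector() literals and a hypothetical node with 4 allocatable cores, the threshold moves from 4 cores to 0.95 * 4 = 3.8 cores:

```promql
# Hypothetical values: allocatable = 4 cores, so the buffered threshold is 3.8 cores.

# 3.9 requested cores: 3.9 - 3.8 is roughly 0.1, so the comparison passes.
vector(3.9) - 0.95 * vector(4) > 0

# 3.7 requested cores: 3.7 - 3.8 is negative, so nothing is returned.
vector(3.7) - 0.95 * vector(4) > 0
```

The 0.95 factor is the value from the suggestion above; any other fraction would shift the threshold the same way.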

skl · 2025-01-07T18:00:39Z

alerts/resource_alerts.libsonnet

@@ -34,18 +34,34 @@
          } +
          if $._config.showMultiCluster then {
            expr: |||
-              sum(namespace_cpu:kube_pod_container_resource_requests:sum{%(ignoringOverprovisionedWorkloadSelector)s}) by (%(clusterLabel)s) - (sum(kube_node_status_allocatable{%(kubeStateMetricsSelector)s,resource="cpu"}) by (%(clusterLabel)s) - max(kube_node_status_allocatable{%(kubeStateMetricsSelector)s,resource="cpu"}) by (%(clusterLabel)s)) > 0
+              (count(kube_node_info) == 1


If showMultiCluster is true, the cluster label is available, so the check here should probably use the cluster label (so that each cluster is checked for whether it has a single node).

Additionally, I suggest de-duplicating across multiple kube-state-metrics (KSM) instances using max, like so:

Suggested change

-              (count(kube_node_info) == 1
+              (count by (cluster) (max by (cluster, node) (kube_node_info)) == 1


simonpasquier · 2025-01-09T08:36:35Z

alerts/resource_alerts.libsonnet

@@ -34,18 +34,34 @@
          } +
          if $._config.showMultiCluster then {
            expr: |||
-              sum(namespace_cpu:kube_pod_container_resource_requests:sum{%(ignoringOverprovisionedWorkloadSelector)s}) by (%(clusterLabel)s) - (sum(kube_node_status_allocatable{%(kubeStateMetricsSelector)s,resource="cpu"}) by (%(clusterLabel)s) - max(kube_node_status_allocatable{%(kubeStateMetricsSelector)s,resource="cpu"}) by (%(clusterLabel)s)) > 0
+              (count(kube_node_info) == 1


Wouldn't (count(kube_node_info) == 1 and ... mess up the returned value (i.e., it would always return 1)? Given the complexity of the expression, I'd advocate for adding unit tests first to assert the behavior of the current rule.

rexagod · Jan 9, 2025

I'll add some unit tests around this, but I'm not sure why this would always return 1?

simonpasquier · Jan 10, 2025

foo == 1 and bar always returns 1: the result takes its value from the left-hand side, and the right-hand side is only used for label matching.
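A minimal sketch of this behavior, using vector() literals with made-up values instead of the real series:

```promql
# The left-hand side carries the value: the result is 1, not the difference 3.
vector(1) == 1 and vector(5) - vector(2) > 0

# With the operands swapped, the result is 3, again the left-hand side's value.
vector(5) - vector(2) > 0 and vector(1) == 1
```

Under that reading, placing the overcommit comparison on the left of and would keep the overcommitted amount as the alert's value. Whether the rule should be restructured that way is a judgment call for this PR.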

rexagod · Jan 12, 2025

(count by (%(clusterLabel)s) (max by (%(clusterLabel)s, node) (kube_node_info)) == 1 and sum by (%(clusterLabel)s) (namespace_cpu:kube_pod_container_resource_requests:sum{%(ignoringOverprovisionedWorkloadSelector)s}) - sum by (%(clusterLabel)s) (kube_node_status_allocatable{%(kubeStateMetricsSelector)s,resource="cpu"}) > 0)

Apologies if I'm missing something here, but this seems to have comparison expressions on both sides of and, similar to, say, vector(1) == 1 and vector(2) - vector(1) > 0 (the RHS of and is a scalar-valued vector rather than a labeled instant vector, so no label matching is involved)?

If this were the aforementioned case, I would have preferred the bool modifier to avoid the default filtering behavior, but the expression seemed to suffice without it.
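For reference, the difference between the default filtering behavior and the bool modifier can be sketched with vector() literals (made-up values):

```promql
# Default comparison: non-matching samples are dropped, so this returns nothing.
vector(2) == 1

# With bool, the sample is kept and its value becomes 0 (false).
vector(2) == bool 1

# With bool and a true comparison, the value becomes 1.
vector(1) == bool 1
```

Without bool, a comparison acts as a filter that keeps the left-hand value. With bool, it always yields a 0 or 1 sample.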

rexagod force-pushed the KubeCPUOvercommit-SNO branch from 5b96fb5 to 5cd53d6 Compare January 6, 2025 07:07

rexagod marked this pull request as ready for review January 6, 2025 07:08

rexagod requested review from povilasv and skl as code owners January 6, 2025 07:08

rexagod marked this pull request as draft January 6, 2025 07:17

rexagod marked this pull request as ready for review January 6, 2025 08:13

rexagod mentioned this pull request Jan 7, 2025

chore(manifests): disable --auto-gomemlimit for Prometheus on SNO unt… openshift/cluster-monitoring-operator#2549

skl reviewed Jan 7, 2025

skl self-assigned this Jan 7, 2025

simonpasquier reviewed Jan 9, 2025
