redpanda: make lifecycle hooks debuggable
Prior to this commit, debugging issues with our lifecycle hooks was next to
impossible. This is primarily because Kubernetes provides little to no output
about hooks except in the case of failure. Our hooks are wrapped with `; true`
to ensure they never fail, which makes the issue worse.

This commit adds a more complex wrapper around the PostStart and PreStop hooks
that sends all hook output to the stdout of the redpanda process, so it appears
in `kubectl logs` with a timestamp and a prefix indicating which hook produced
it.
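
As a rough illustration of the wrapper described above, the following standalone sketch mirrors `wrapLifecycleHook` from this commit's diff and prints the command it would hand to Kubernetes. It is not the chart's code: `wrapHook` is a hypothetical name, `strings.Join` stands in for `helmette.Join`, and the 45-second timeout is an arbitrary example value.

```go
package main

import (
	"fmt"
	"strings"
)

// wrapHook is a standalone stand-in for the chart's wrapLifecycleHook.
// Assumption for this sketch: strings.Join replaces helmette.Join.
func wrapHook(hook string, timeoutSeconds int64, cmd []string) []string {
	wrapped := strings.Join(cmd, " ")
	script := fmt.Sprintf(
		"timeout -v %d %s 2>&1 | sed \"s/^/lifecycle-hook %s $(date): /\" | tee /proc/1/fd/1; true",
		timeoutSeconds, wrapped, hook,
	)
	// bash -c runs the whole pipeline; the trailing "true" forces exit status 0.
	return []string{"bash", "-c", script}
}

func main() {
	// 45 is an arbitrary example timeout, not a chart default.
	args := wrapHook("pre-stop", 45, []string{"bash", "-x", "/var/lifecycle/preStop.sh"})
	for _, arg := range args {
		fmt.Println(arg)
	}
}
```

One consequence of this shape is worth keeping in mind when reading the logs: bash expands `$(date)` once, when it builds the `sed` command line, so every line from a single hook run carries the same timestamp. That matches the example output in this commit message, where all hook lines read 18:23:02 even though the script sleeps between steps.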

Additionally, this commit removes a buggy but seemingly benign step of the
PostStart hook that claimed to create the bootstrap user. This logic is
already handled by either the bootstrap environment variable or the
config-watcher container.

Example output from `kubectl logs -f` on a terminating node:
```
INFO  2024-10-10 18:23:02,637 [shard 0:main] cluster - members_table.cc:258 - marking node 2 in maintenance state
INFO  2024-10-10 18:23:02,637 [shard 0:main] cluster - drain_manager.cc:54 - Node draining is starting
INFO  2024-10-10 18:23:02,637 [shard 0:main] cluster - drain_manager.cc:150 - Node draining has started
INFO  2024-10-10 18:23:02,637 [shard 0:main] cluster - drain_manager.cc:183 - Node draining has completed on shard 0
lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: + touch /tmp/preStopHookStarted
lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: + source /var/lifecycle/common.sh
lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: ++ CURL_URL=https://redpanda-2.redpanda.default.svc.cluster.local:9644
lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: ++ CURL_NODE_ID_CMD='curl --silent --fail --cacert /etc/tls/certs/default/ca.crt https://redpanda-2.redpanda.default.svc.cluster.local:9644/v1/node_config'
lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: ++ CURL_MAINTENANCE_DELETE_CMD_PREFIX='curl -X DELETE --silent -o /dev/null -w "%{http_code}"'
lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: ++ CURL_MAINTENANCE_PUT_CMD_PREFIX='curl -X PUT --silent -o /dev/null -w "%{http_code}"'
lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: ++ CURL_MAINTENANCE_GET_CMD='curl -X GET --silent --cacert /etc/tls/certs/default/ca.crt https://redpanda-2.redpanda.default.svc.cluster.local:9644/v1/maintenance'
lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: + set -x
lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: + preStopHook
lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: ++ curl --silent --fail --cacert /etc/tls/certs/default/ca.crt https://redpanda-2.redpanda.default.svc.cluster.local:9644/v1/node_config
lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: ++ grep -o '\"node_id\":[^,}]*'
lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: ++ grep -o '[^: ]*$'
lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: + NODE_ID=2
lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: + echo 'Setting maintenance mode on node 2'
lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: Setting maintenance mode on node 2
lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: + CURL_MAINTENANCE_PUT_CMD='curl -X PUT --silent -o /dev/null -w "%{http_code}" --cacert /etc/tls/certs/default/ca.crt https://redpanda-2.redpanda.default.svc.cluster.local:9644/v1/brokers/2/maintenance'
lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: + '[' '' = '"200"' ']'
lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: ++ curl -X PUT --silent -o /dev/null -w '"%{http_code}"' --cacert /etc/tls/certs/default/ca.crt https://redpanda-2.redpanda.default.svc.cluster.local:9644/v1/brokers/2/maintenance
lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: + status='"200"'
lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: + sleep 0.5
lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: + '[' '"200"' = '"200"' ']'
lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: + '[' '' = true ']'
lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: + '[' '' = false ']'
lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: ++ curl -X GET --silent --cacert /etc/tls/certs/default/ca.crt https://redpanda-2.redpanda.default.svc.cluster.local:9644/v1/maintenance
lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: + res='{"draining": true, "finished": true, "errors": false, "partitions": 2, "eligible": 0}'
lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: ++ echo '{"draining":' true, '"finished":' true, '"errors":' false, '"partitions":' 2, '"eligible":' '0}'
lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: ++ grep -o '\"finished\":[^,}]*'
lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: ++ grep -o '[^: ]*$'
lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: + finished=true
lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: ++ echo '{"draining":' true, '"finished":' true, '"errors":' false, '"partitions":' 2, '"eligible":' '0}'
lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: ++ grep -o '\"draining\":[^,}]*'
lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: ++ grep -o '[^: ]*$'
lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: + draining=true
lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: + sleep 0.5
lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: + '[' true = true ']'
lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: + touch /tmp/preStopHookFinished
lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: + true
INFO  2024-10-10 18:23:03,400 [shard 0:main] main - application.cc:466 - Stopping...
```
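
Since hook output is now interleaved with Redpanda's own log lines, it can be handy to pull out just the lines for one hook. The sketch below is one way to do that, assuming only the `lifecycle-hook ` prefix and a `: ` separator before the message; `hookLines` is a hypothetical helper, not part of the chart. Note that the example output above places the date before the hook name, while the `sed` expression in the diff places the hook name first; the sketch accepts either order by searching the whole header for the hook name.

```go
package main

import (
	"fmt"
	"strings"
)

const hookPrefix = "lifecycle-hook "

// hookLines returns the message part of every log line that starts with the
// lifecycle-hook prefix and names the given hook before the first ": ".
func hookLines(logs, hook string) []string {
	var out []string
	for _, line := range strings.Split(logs, "\n") {
		if !strings.HasPrefix(line, hookPrefix) {
			continue // a regular Redpanda log line
		}
		rest := strings.TrimPrefix(line, hookPrefix)
		// The header holds the date and hook name; the message follows ": ".
		header, msg, found := strings.Cut(rest, ": ")
		if !found || !strings.Contains(header, hook) {
			continue
		}
		out = append(out, msg)
	}
	return out
}

func main() {
	// Sample lines copied from the example output above.
	logs := strings.Join([]string{
		"INFO  2024-10-10 18:23:02,637 [shard 0:main] cluster - drain_manager.cc:54 - Node draining is starting",
		"lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: + touch /tmp/preStopHookStarted",
		"lifecycle-hook Thu Oct 10 18:23:02 UTC 2024 pre-stop: Setting maintenance mode on node 2",
		"INFO  2024-10-10 18:23:03,400 [shard 0:main] main - application.cc:466 - Stopping...",
	}, "\n")
	for _, msg := range hookLines(logs, "pre-stop") {
		fmt.Println(msg)
	}
}
```

In practice the input would come from `kubectl logs` rather than a string literal; the filtering logic stays the same.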
chrisseto committed Oct 11, 2024
1 parent 751f11d commit 045ef15
Showing 5 changed files with 547 additions and 740 deletions.
13 changes: 1 addition & 12 deletions charts/redpanda/secrets.go
@@ -100,24 +100,13 @@ func SecretSTSLifecycle(dot *helmette.Dot) *corev1.Secret {
` status=$(${CURL_MAINTENANCE_DELETE_CMD})`,
` sleep 0.5`,
` done`,
}
if values.Auth.SASL.Enabled && values.Auth.SASL.SecretRef != "" {
postStartSh = append(postStartSh,
` # Setup and export SASL bootstrap-user`,
` IFS=":" read -r USER_NAME PASSWORD MECHANISM < <(grep "" $(find /etc/secrets/users/* -print))`,
fmt.Sprintf(` MECHANISM=${MECHANISM:-%s}`, helmette.Dig(dot.Values.AsMap(), "SCRAM-SHA-512", "auth", "sasl", "mechanism")),
` rpk acl user create ${USER_NAME} -p {PASSWORD} --mechanism ${MECHANISM} || true`,
)
}
postStartSh = append(postStartSh,
``,

` touch /tmp/postStartHookFinished`,
`}`,
``,
`postStartHook`,
`true`,
)
}
secret.StringData["postStart.sh"] = helmette.Join("\n", postStartSh)

preStopSh := []string{
43 changes: 21 additions & 22 deletions charts/redpanda/statefulset.go
@@ -555,6 +555,17 @@ func StatefulSetContainers(dot *helmette.Dot) []corev1.Container {
return containers
}

// wrapLifecycleHook wraps the given command in an attempt to make it more friendly for Kubernetes' lifecycle hooks.
// - It attaches a maximum time limit by wrapping the command with `timeout -v <timeout>`.
// - It redirects stderr to stdout so all logs from cmd get the same treatment.
// - It prepends "lifecycle-hook $(hook) $(date)" to all lines emitted by the hook for easy identification.
// - It tees the output to fd 1 of pid 1 so it shows up in kubectl logs.
// - It terminates the entire command with "true" so it never fails, which would cause the Pod to get killed.
func wrapLifecycleHook(hook string, timeoutSeconds int64, cmd []string) []string {
wrapped := helmette.Join(" ", cmd)
return []string{"bash", "-c", fmt.Sprintf("timeout -v %d %s 2>&1 | sed \"s/^/lifecycle-hook %s $(date): /\" | tee /proc/1/fd/1; true", timeoutSeconds, wrapped, hook)}
}

func statefulSetContainerRedpanda(dot *helmette.Dot) *corev1.Container {
values := helmette.Unwrap[Values](dot.Values)

@@ -568,32 +579,20 @@ func statefulSetContainerRedpanda(dot *helmette.Dot) *corev1.Container {
// finish the lifecycle scripts with "true" to prevent them from terminating the pod prematurely
PostStart: &corev1.LifecycleHandler{
Exec: &corev1.ExecAction{
Command: []string{
`/bin/bash`,
`-c`,
helmette.Join("\n", []string{
fmt.Sprintf(`timeout -v %d bash -x /var/lifecycle/postStart.sh`,
values.Statefulset.TerminationGracePeriodSeconds/2,
),
`true`,
``,
}),
},
Command: wrapLifecycleHook(
"post-start",
values.Statefulset.TerminationGracePeriodSeconds/2,
[]string{"bash", "-x", "/var/lifecycle/postStart.sh"},
),
},
},
PreStop: &corev1.LifecycleHandler{
Exec: &corev1.ExecAction{
Command: []string{
`/bin/bash`,
`-c`,
helmette.Join("\n", []string{
fmt.Sprintf(`timeout -v %d bash -x /var/lifecycle/preStop.sh`,
values.Statefulset.TerminationGracePeriodSeconds/2,
),
`true # do not fail and cause the pod to terminate`,
``,
}),
},
Command: wrapLifecycleHook(
"pre-stop",
values.Statefulset.TerminationGracePeriodSeconds/2,
[]string{"bash", "-x", "/var/lifecycle/preStop.sh"},
),
},
},
},
6 changes: 1 addition & 5 deletions charts/redpanda/templates/_secrets.go.tpl
@@ -37,11 +37,7 @@
{{- $secret := (mustMergeOverwrite (dict "metadata" (dict "creationTimestamp" (coalesce nil) ) ) (mustMergeOverwrite (dict ) (dict "apiVersion" "v1" "kind" "Secret" )) (dict "metadata" (mustMergeOverwrite (dict "creationTimestamp" (coalesce nil) ) (dict "name" (printf "%s-sts-lifecycle" (get (fromJson (include "redpanda.Fullname" (dict "a" (list $dot) ))) "r")) "namespace" $dot.Release.Namespace "labels" (get (fromJson (include "redpanda.FullLabels" (dict "a" (list $dot) ))) "r") )) "type" "Opaque" "stringData" (dict ) )) -}}
{{- $adminCurlFlags := (get (fromJson (include "redpanda.adminTLSCurlFlags" (dict "a" (list $dot) ))) "r") -}}
{{- $_ := (set $secret.stringData "common.sh" (join "\n" (list `#!/usr/bin/env bash` `` `# the SERVICE_NAME comes from the metadata.name of the pod, essentially the POD_NAME` (printf `CURL_URL="%s"` (get (fromJson (include "redpanda.adminInternalURL" (dict "a" (list $dot) ))) "r")) `` `# commands used throughout` (printf `CURL_NODE_ID_CMD="curl --silent --fail %s ${CURL_URL}/v1/node_config"` $adminCurlFlags) `` `CURL_MAINTENANCE_DELETE_CMD_PREFIX='curl -X DELETE --silent -o /dev/null -w "%{http_code}"'` `CURL_MAINTENANCE_PUT_CMD_PREFIX='curl -X PUT --silent -o /dev/null -w "%{http_code}"'` (printf `CURL_MAINTENANCE_GET_CMD="curl -X GET --silent %s ${CURL_URL}/v1/maintenance"` $adminCurlFlags)))) -}}
{{- $postStartSh := (list `#!/usr/bin/env bash` `# This code should be similar if not exactly the same as that found in the panda-operator, see` `# https://github.com/redpanda-data/redpanda/blob/e51d5b7f2ef76d5160ca01b8c7a8cf07593d29b6/src/go/k8s/pkg/resources/secret.go` `` `# path below should match the path defined on the statefulset` `source /var/lifecycle/common.sh` `` `postStartHook () {` ` set -x` `` ` touch /tmp/postStartHookStarted` `` ` until NODE_ID=$(${CURL_NODE_ID_CMD} | grep -o '\"node_id\":[^,}]*' | grep -o '[^: ]*$'); do` ` sleep 0.5` ` done` `` ` echo "Clearing maintenance mode on node ${NODE_ID}"` (printf ` CURL_MAINTENANCE_DELETE_CMD="${CURL_MAINTENANCE_DELETE_CMD_PREFIX} %s ${CURL_URL}/v1/brokers/${NODE_ID}/maintenance"` $adminCurlFlags) ` # a 400 here would mean not in maintenance mode` ` until [ "${status:-}" = '"200"' ] || [ "${status:-}" = '"400"' ]; do` ` status=$(${CURL_MAINTENANCE_DELETE_CMD})` ` sleep 0.5` ` done`) -}}
{{- if (and $values.auth.sasl.enabled (ne $values.auth.sasl.secretRef "")) -}}
{{- $postStartSh = (concat (default (list ) $postStartSh) (list ` # Setup and export SASL bootstrap-user` ` IFS=":" read -r USER_NAME PASSWORD MECHANISM < <(grep "" $(find /etc/secrets/users/* -print))` (printf ` MECHANISM=${MECHANISM:-%s}` (dig "auth" "sasl" "mechanism" "SCRAM-SHA-512" $dot.Values.AsMap)) ` rpk acl user create ${USER_NAME} -p {PASSWORD} --mechanism ${MECHANISM} || true`)) -}}
{{- end -}}
{{- $postStartSh = (concat (default (list ) $postStartSh) (list `` ` touch /tmp/postStartHookFinished` `}` `` `postStartHook` `true`)) -}}
{{- $postStartSh := (list `#!/usr/bin/env bash` `# This code should be similar if not exactly the same as that found in the panda-operator, see` `# https://github.com/redpanda-data/redpanda/blob/e51d5b7f2ef76d5160ca01b8c7a8cf07593d29b6/src/go/k8s/pkg/resources/secret.go` `` `# path below should match the path defined on the statefulset` `source /var/lifecycle/common.sh` `` `postStartHook () {` ` set -x` `` ` touch /tmp/postStartHookStarted` `` ` until NODE_ID=$(${CURL_NODE_ID_CMD} | grep -o '\"node_id\":[^,}]*' | grep -o '[^: ]*$'); do` ` sleep 0.5` ` done` `` ` echo "Clearing maintenance mode on node ${NODE_ID}"` (printf ` CURL_MAINTENANCE_DELETE_CMD="${CURL_MAINTENANCE_DELETE_CMD_PREFIX} %s ${CURL_URL}/v1/brokers/${NODE_ID}/maintenance"` $adminCurlFlags) ` # a 400 here would mean not in maintenance mode` ` until [ "${status:-}" = '"200"' ] || [ "${status:-}" = '"400"' ]; do` ` status=$(${CURL_MAINTENANCE_DELETE_CMD})` ` sleep 0.5` ` done` `` ` touch /tmp/postStartHookFinished` `}` `` `postStartHook` `true`) -}}
{{- $_ := (set $secret.stringData "postStart.sh" (join "\n" $postStartSh)) -}}
{{- $preStopSh := (list `#!/usr/bin/env bash` `# This code should be similar if not exactly the same as that found in the panda-operator, see` `# https://github.com/redpanda-data/redpanda/blob/e51d5b7f2ef76d5160ca01b8c7a8cf07593d29b6/src/go/k8s/pkg/resources/secret.go` `` `touch /tmp/preStopHookStarted` `` `# path below should match the path defined on the statefulset` `source /var/lifecycle/common.sh` `` `set -x` `` `preStopHook () {` ` until NODE_ID=$(${CURL_NODE_ID_CMD} | grep -o '\"node_id\":[^,}]*' | grep -o '[^: ]*$'); do` ` sleep 0.5` ` done` `` ` echo "Setting maintenance mode on node ${NODE_ID}"` (printf ` CURL_MAINTENANCE_PUT_CMD="${CURL_MAINTENANCE_PUT_CMD_PREFIX} %s ${CURL_URL}/v1/brokers/${NODE_ID}/maintenance"` $adminCurlFlags) ` until [ "${status:-}" = '"200"' ]; do` ` status=$(${CURL_MAINTENANCE_PUT_CMD})` ` sleep 0.5` ` done` `` ` until [ "${finished:-}" = "true" ] || [ "${draining:-}" = "false" ]; do` ` res=$(${CURL_MAINTENANCE_GET_CMD})` ` finished=$(echo $res | grep -o '\"finished\":[^,}]*' | grep -o '[^: ]*$')` ` draining=$(echo $res | grep -o '\"draining\":[^,}]*' | grep -o '[^: ]*$')` ` sleep 0.5` ` done` `` ` touch /tmp/preStopHookFinished` `}`) -}}
{{- if (and (gt ($values.statefulset.replicas | int) (2 | int)) (not (get (fromJson (include "_shims.typeassertion" (dict "a" (list "bool" (dig "recovery_mode_enabled" false $values.config.node)) ))) "r"))) -}}
15 changes: 14 additions & 1 deletion charts/redpanda/templates/_statefulset.go.tpl
@@ -337,13 +337,26 @@
{{- end -}}
{{- end -}}

{{- define "redpanda.wrapLifecycleHook" -}}
{{- $hook := (index .a 0) -}}
{{- $timeoutSeconds := (index .a 1) -}}
{{- $cmd := (index .a 2) -}}
{{- range $_ := (list 1) -}}
{{- $_is_returning := false -}}
{{- $wrapped := (join " " $cmd) -}}
{{- $_is_returning = true -}}
{{- (dict "r" (list "bash" "-c" (printf "timeout -v %d %s 2>&1 | sed \"s/^/lifecycle-hook %s $(date): /\" | tee /proc/1/fd/1; true" $timeoutSeconds $wrapped $hook))) | toJson -}}
{{- break -}}
{{- end -}}
{{- end -}}

{{- define "redpanda.statefulSetContainerRedpanda" -}}
{{- $dot := (index .a 0) -}}
{{- range $_ := (list 1) -}}
{{- $_is_returning := false -}}
{{- $values := $dot.Values.AsMap -}}
{{- $internalAdvertiseAddress := (printf "%s.%s" "$(SERVICE_NAME)" (get (fromJson (include "redpanda.InternalDomain" (dict "a" (list $dot) ))) "r")) -}}
{{- $container := (mustMergeOverwrite (dict "name" "" "resources" (dict ) ) (dict "name" (get (fromJson (include "redpanda.Name" (dict "a" (list $dot) ))) "r") "image" (printf `%s:%s` $values.image.repository (get (fromJson (include "redpanda.Tag" (dict "a" (list $dot) ))) "r")) "env" (get (fromJson (include "redpanda.bootstrapEnvVars" (dict "a" (list $dot (get (fromJson (include "redpanda.statefulSetRedpandaEnv" (dict "a" (list ) ))) "r")) ))) "r") "lifecycle" (mustMergeOverwrite (dict ) (dict "postStart" (mustMergeOverwrite (dict ) (dict "exec" (mustMergeOverwrite (dict ) (dict "command" (list `/bin/bash` `-c` (join "\n" (list (printf `timeout -v %d bash -x /var/lifecycle/postStart.sh` ((div ($values.statefulset.terminationGracePeriodSeconds | int64) (2 | int64)) | int64)) `true` ``))) )) )) "preStop" (mustMergeOverwrite (dict ) (dict "exec" (mustMergeOverwrite (dict ) (dict "command" (list `/bin/bash` `-c` (join "\n" (list (printf `timeout -v %d bash -x /var/lifecycle/preStop.sh` ((div ($values.statefulset.terminationGracePeriodSeconds | int64) (2 | int64)) | int64)) `true # do not fail and cause the pod to terminate` ``))) )) )) )) "startupProbe" (mustMergeOverwrite (dict ) (mustMergeOverwrite (dict ) (dict "exec" (mustMergeOverwrite (dict ) (dict "command" (list `/bin/sh` `-c` (join "\n" (list `set -e` (printf `RESULT=$(curl --silent --fail -k -m 5 %s "%s://%s/v1/status/ready")` (get (fromJson (include "redpanda.adminTLSCurlFlags" (dict "a" (list $dot) ))) "r") (get (fromJson (include "redpanda.adminInternalHTTPProtocol" (dict "a" (list $dot) ))) "r") (get (fromJson (include "redpanda.adminApiURLs" (dict "a" (list $dot) ))) "r")) `echo $RESULT` `echo $RESULT | grep ready` ``))) )) )) (dict "initialDelaySeconds" ($values.statefulset.startupProbe.initialDelaySeconds | int) "periodSeconds" ($values.statefulset.startupProbe.periodSeconds | int) "failureThreshold" ($values.statefulset.startupProbe.failureThreshold | int) )) "livenessProbe" (mustMergeOverwrite (dict 
) (mustMergeOverwrite (dict ) (dict "exec" (mustMergeOverwrite (dict ) (dict "command" (list `/bin/sh` `-c` (printf `curl --silent --fail -k -m 5 %s "%s://%s/v1/status/ready"` (get (fromJson (include "redpanda.adminTLSCurlFlags" (dict "a" (list $dot) ))) "r") (get (fromJson (include "redpanda.adminInternalHTTPProtocol" (dict "a" (list $dot) ))) "r") (get (fromJson (include "redpanda.adminApiURLs" (dict "a" (list $dot) ))) "r"))) )) )) (dict "initialDelaySeconds" ($values.statefulset.livenessProbe.initialDelaySeconds | int) "periodSeconds" ($values.statefulset.livenessProbe.periodSeconds | int) "failureThreshold" ($values.statefulset.livenessProbe.failureThreshold | int) )) "command" (list `rpk` `redpanda` `start` (printf `--advertise-rpc-addr=%s:%d` $internalAdvertiseAddress ($values.listeners.rpc.port | int))) "volumeMounts" (concat (default (list ) (get (fromJson (include "redpanda.StatefulSetVolumeMounts" (dict "a" (list $dot) ))) "r")) (default (list ) (get (fromJson (include "redpanda.templateToVolumeMounts" (dict "a" (list $dot $values.statefulset.extraVolumeMounts) ))) "r"))) "securityContext" (get (fromJson (include "redpanda.ContainerSecurityContext" (dict "a" (list $dot) ))) "r") "resources" (mustMergeOverwrite (dict ) (dict )) )) -}}
{{- $container := (mustMergeOverwrite (dict "name" "" "resources" (dict ) ) (dict "name" (get (fromJson (include "redpanda.Name" (dict "a" (list $dot) ))) "r") "image" (printf `%s:%s` $values.image.repository (get (fromJson (include "redpanda.Tag" (dict "a" (list $dot) ))) "r")) "env" (get (fromJson (include "redpanda.bootstrapEnvVars" (dict "a" (list $dot (get (fromJson (include "redpanda.statefulSetRedpandaEnv" (dict "a" (list ) ))) "r")) ))) "r") "lifecycle" (mustMergeOverwrite (dict ) (dict "postStart" (mustMergeOverwrite (dict ) (dict "exec" (mustMergeOverwrite (dict ) (dict "command" (get (fromJson (include "redpanda.wrapLifecycleHook" (dict "a" (list "post-start" ((div ($values.statefulset.terminationGracePeriodSeconds | int64) (2 | int64)) | int64) (list "bash" "-x" "/var/lifecycle/postStart.sh")) ))) "r") )) )) "preStop" (mustMergeOverwrite (dict ) (dict "exec" (mustMergeOverwrite (dict ) (dict "command" (get (fromJson (include "redpanda.wrapLifecycleHook" (dict "a" (list "pre-stop" ((div ($values.statefulset.terminationGracePeriodSeconds | int64) (2 | int64)) | int64) (list "bash" "-x" "/var/lifecycle/preStop.sh")) ))) "r") )) )) )) "startupProbe" (mustMergeOverwrite (dict ) (mustMergeOverwrite (dict ) (dict "exec" (mustMergeOverwrite (dict ) (dict "command" (list `/bin/sh` `-c` (join "\n" (list `set -e` (printf `RESULT=$(curl --silent --fail -k -m 5 %s "%s://%s/v1/status/ready")` (get (fromJson (include "redpanda.adminTLSCurlFlags" (dict "a" (list $dot) ))) "r") (get (fromJson (include "redpanda.adminInternalHTTPProtocol" (dict "a" (list $dot) ))) "r") (get (fromJson (include "redpanda.adminApiURLs" (dict "a" (list $dot) ))) "r")) `echo $RESULT` `echo $RESULT | grep ready` ``))) )) )) (dict "initialDelaySeconds" ($values.statefulset.startupProbe.initialDelaySeconds | int) "periodSeconds" ($values.statefulset.startupProbe.periodSeconds | int) "failureThreshold" ($values.statefulset.startupProbe.failureThreshold | int) )) "livenessProbe" 
(mustMergeOverwrite (dict ) (mustMergeOverwrite (dict ) (dict "exec" (mustMergeOverwrite (dict ) (dict "command" (list `/bin/sh` `-c` (printf `curl --silent --fail -k -m 5 %s "%s://%s/v1/status/ready"` (get (fromJson (include "redpanda.adminTLSCurlFlags" (dict "a" (list $dot) ))) "r") (get (fromJson (include "redpanda.adminInternalHTTPProtocol" (dict "a" (list $dot) ))) "r") (get (fromJson (include "redpanda.adminApiURLs" (dict "a" (list $dot) ))) "r"))) )) )) (dict "initialDelaySeconds" ($values.statefulset.livenessProbe.initialDelaySeconds | int) "periodSeconds" ($values.statefulset.livenessProbe.periodSeconds | int) "failureThreshold" ($values.statefulset.livenessProbe.failureThreshold | int) )) "command" (list `rpk` `redpanda` `start` (printf `--advertise-rpc-addr=%s:%d` $internalAdvertiseAddress ($values.listeners.rpc.port | int))) "volumeMounts" (concat (default (list ) (get (fromJson (include "redpanda.StatefulSetVolumeMounts" (dict "a" (list $dot) ))) "r")) (default (list ) (get (fromJson (include "redpanda.templateToVolumeMounts" (dict "a" (list $dot $values.statefulset.extraVolumeMounts) ))) "r"))) "securityContext" (get (fromJson (include "redpanda.ContainerSecurityContext" (dict "a" (list $dot) ))) "r") "resources" (mustMergeOverwrite (dict ) (dict )) )) -}}
{{- if (not (get (fromJson (include "_shims.typeassertion" (dict "a" (list "bool" (dig `recovery_mode_enabled` false $values.config.node)) ))) "r")) -}}
{{- $_ := (set $container "readinessProbe" (mustMergeOverwrite (dict ) (mustMergeOverwrite (dict ) (dict "exec" (mustMergeOverwrite (dict ) (dict "command" (list `/bin/sh` `-c` (join "\n" (list `set -x` `RESULT=$(rpk cluster health)` `echo $RESULT` `echo $RESULT | grep 'Healthy:.*true'` ``))) )) )) (dict "initialDelaySeconds" ($values.statefulset.readinessProbe.initialDelaySeconds | int) "timeoutSeconds" ($values.statefulset.readinessProbe.timeoutSeconds | int) "periodSeconds" ($values.statefulset.readinessProbe.periodSeconds | int) "successThreshold" ($values.statefulset.readinessProbe.successThreshold | int) "failureThreshold" ($values.statefulset.readinessProbe.failureThreshold | int) ))) -}}
{{- end -}}