Commit
Add multiple concurrent node reboot feature (#660)
* Add ability to have multiple nodes get a lock

  Currently in kured a single node can get a lock with Acquire. There could be situations where multiple nodes might want a lock in the event that a cluster can handle multiple nodes being rebooted. This adds the side-by-side implementation for a multiple node lock situation.

  Signed-off-by: Thomas Stringer <[email protected]>

* Refactor to use the same code path for a single lock and a multilock

  Signed-off-by: Thomas Stringer <[email protected]>

* test: force rebuild

  Signed-off-by: Christian Kotzbauer <[email protected]>

* build: log pod-logs

  Signed-off-by: Christian Kotzbauer <[email protected]>

* fix: change condition

  Signed-off-by: Christian Kotzbauer <[email protected]>

* build: fix test-script

  Signed-off-by: Christian Kotzbauer <[email protected]>

* build: add concurrent test

  Signed-off-by: Christian Kotzbauer <[email protected]>

* fix: final changes

  Signed-off-by: Christian Kotzbauer <[email protected]>

---------

Signed-off-by: Thomas Stringer <[email protected]>
Signed-off-by: Christian Kotzbauer <[email protected]>
Co-authored-by: Christian Kotzbauer <[email protected]>
1 parent f22b1ab, commit 3b9b190
Showing 6 changed files with 474 additions and 10 deletions.
@@ -179,3 +179,90 @@ jobs:
          DEBUG: true
        run: |
          ./tests/kind/follow-coordinated-reboot.sh

  # This ensures the latest code works with the manifests built from tree.
  # It is useful for two things:
  # - Test manifests changes (obviously), ensuring they don't break existing clusters
  # - Ensure manifests work with the latest versions even with no manifest change
  #     (compared to helm charts, manifests cannot easily template changes based on versions)
  # Helm charts are _trailing_ releases, while manifests are done during development.
  # Concurrency = 2
  e2e-manifests-concurent:
    name: End-to-End test with kured with code and manifests from HEAD (concurrent)
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        kubernetes:
          - "1.25"
          - "1.26"
          - "1.27"
    steps:
      - uses: actions/checkout@v3
      - name: Ensure go version
        uses: actions/setup-go@v4
        with:
          go-version-file: 'go.mod'
          check-latest: true
      - name: Set up QEMU
        uses: docker/setup-qemu-action@v2
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
      - name: Setup GoReleaser
        run: make bootstrap-tools
      - name: Find current tag version
        run: echo "sha_short=$(git rev-parse --short HEAD)" >> $GITHUB_OUTPUT
        id: tags
      - name: Build artifacts
        run: |
          VERSION="${{ steps.tags.outputs.sha_short }}" make image
          VERSION="${{ steps.tags.outputs.sha_short }}" make manifest
      - name: Workaround "Failed to attach 1 to compat systemd cgroup /actions_job/..." on gh actions
        run: |
          sudo bash << EOF
              cp /etc/docker/daemon.json /etc/docker/daemon.json.old
              echo '{}' > /etc/docker/daemon.json
              systemctl restart docker || journalctl --no-pager -n 500
              systemctl status docker
          EOF
      # Default name for helm/kind-action kind clusters is "chart-testing"
      - name: Create kind cluster with 5 nodes
        uses: helm/[email protected]
        with:
          config: .github/kind-cluster-${{ matrix.kubernetes }}.yaml
          version: v0.14.0

      - name: Preload previously built images onto kind cluster
        run: kind load docker-image ghcr.io/${{ github.repository }}:${{ steps.tags.outputs.sha_short }} --name chart-testing

      - name: Do not wait for an hour before detecting the rebootSentinel
        run: |
          sed -i 's/#\(.*\)--period=1h/\1--period=30s/g' kured-ds.yaml
          sed -i 's/#\(.*\)--concurrency=1/\1--concurrency=2/g' kured-ds.yaml
      - name: Install kured with kubectl
        run: |
          kubectl apply -f kured-rbac.yaml && kubectl apply -f kured-ds.yaml
      - name: Ensure kured is ready
        uses: nick-invision/[email protected]
        with:
          timeout_minutes: 10
          max_attempts: 10
          retry_wait_seconds: 60
          # DESIRED CURRENT READY UP-TO-DATE AVAILABLE should all be = to cluster_size
          command: "kubectl get ds -n kube-system kured | grep -E 'kured.*5.*5.*5.*5.*5'"

      - name: Create reboot sentinel files
        run: |
          ./tests/kind/create-reboot-sentinels.sh
      - name: Follow reboot until success
        env:
          DEBUG: true
        run: |
          ./tests/kind/follow-coordinated-reboot.sh
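The "Ensure kured is ready" step retries until the DaemonSet line shows five 5s in a row. The sketch below runs the same `grep -E` pattern against two sample lines. Both lines are hypothetical stand-ins for `kubectl get ds -n kube-system kured` output on a 5-node cluster, and `is_ready` is a helper name invented here for illustration.

```shell
# Hypothetical sample lines; columns are
# NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE-SELECTOR AGE.
ready_line='kured   5         5         5       5            5           kubernetes.io/os=linux   2m'
partial_line='kured   5         5         3       5            3           kubernetes.io/os=linux   1m'

# is_ready is a hypothetical helper wrapping the pattern from the workflow:
# the name "kured" followed by five 5s, in order, anywhere on the line.
is_ready() {
  printf '%s\n' "$1" | grep -qE 'kured.*5.*5.*5.*5.*5'
}

if is_ready "$ready_line"; then ready_status=ready; else ready_status=waiting; fi
if is_ready "$partial_line"; then partial_status=ready; else partial_status=waiting; fi

echo "all pods ready:  $ready_status"
echo "two pods behind: $partial_status"
```

The pattern is loose by design: it does not pin each 5 to a column, so a 5 elsewhere on the line (for example in the AGE column) could also count toward the match. With a 10-minute retry budget this is unlikely to matter, but a stricter check would compare the columns individually.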
@@ -85,3 +85,4 @@ spec:
#            - --log-format=text
#            - --metrics-host=""
#            - --metrics-port=8080
#            - --concurrency=1
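The concurrent test job enables this flag by rewriting the commented line with `sed`. The sketch below applies the workflow's two `sed` expressions to a small sample file. The sample content and its indentation are assumptions standing in for `kured-ds.yaml`, and the sketch prints the result rather than editing in place with `-i` as the workflow does.

```shell
# Hypothetical two-line excerpt standing in for the commented flags in kured-ds.yaml.
# The indentation is an assumption; only the leading '#' and the flag text matter.
manifest="$(mktemp)"
cat > "$manifest" << 'EOF'
#            - --period=1h
#            - --concurrency=1
EOF

# Same expressions as the workflow step. '#\(.*\)' captures everything between
# the '#' and the flag, so '\1' restores the indentation and the list dash
# while the '#' itself is dropped and the flag value is replaced.
result="$(sed -e 's/#\(.*\)--period=1h/\1--period=30s/g' \
              -e 's/#\(.*\)--concurrency=1/\1--concurrency=2/g' "$manifest")"

printf '%s\n' "$result"
rm -f "$manifest"
```

Because the capture group keeps the original whitespace, the uncommented lines stay aligned with the other container arguments, which matters for YAML list parsing.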