Tip

Ongoing and occasional updates and improvements.

using keepalived as a sidecar to maintain VIP for pods

The customer wants to use Kubernetes (K8S) like a traditional platform: they plan to expose IP addresses directly from pods, and run two pods on two hosts with a VIP that can migrate between them. In other words, they do not use many of the native features offered by Kubernetes, and treat it mainly as a container management tool.

After summarizing, we have clarified the customer's requirements:

  1. it must be possible to run multiple Pods on the development platform;
  2. the Pods must be able to expose IP addresses directly for external access;
  3. a VIP must be configured across multiple Pods, and this VIP must be able to migrate between different Pods;
  4. after the VIP migrates from pod-01 to pod-02, it must not automatically fail back to pod-01; any later migration back is performed manually;
  5. the system only detects node failure; application failure is not checked.

We will use macvlan on a second network to demo the VIP for the pods, with keepalived running as a sidecar to maintain the VIP.

Here is the architecture diagram of our testing:

Key points of this solution:

  1. keepalived as a sidecar to maintain the VIP for the pods.
  2. keepalived will change the route table (default gateway) when the VIP is migrated to another pod.
  3. pods run with macvlan on 2nd network.
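The failover between the two pods is plain VRRP priority arithmetic. As a quick sanity check of the numbers used in the keepalived configs later in this doc (MASTER priority 100; BACKUP priority 90 with a +20 check-script weight):

```bash
# VRRP election: the highest effective priority holds the VIP.
# The values below match the keepalived configs used later in this doc.
master_priority=100
backup_base=90
check_weight=20   # vrrp_script weight on the backup

# While the backup's track_script succeeds, its effective priority is:
backup_effective=$((backup_base + check_weight))
echo "$backup_effective"   # 110 > 100, so a promoted backup keeps the VIP
```

This is why the backup does not hand the VIP back once promoted: with its check script succeeding, its effective priority beats the returning master's.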

Tip

Only VIPs require a public address; all other addresses can be private.

macvlan on 2nd network

First, we need to configure the macvlan-related settings in the cluster. Once this configuration is complete, we can reference the network attachment when deploying Deployments and Pods.
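The NetworkAttachmentDefinition below embeds its CNI config as a JSON string, and a typo there only surfaces when a pod is scheduled. A quick way to sanity-check the snippet first (assumes python3 is available):

```bash
# Validate the embedded CNI config before applying the manifest.
# This is the same JSON used in the NetworkAttachmentDefinition below.
cni_config='{
  "cniVersion": "0.3.1",
  "name": "macvlan-net",
  "type": "macvlan",
  "_master": "eth1",
  "linkInContainer": false,
  "mode": "bridge",
  "ipam": {
      "type": "static"
    }
}'

printf '%s' "$cni_config" | python3 -m json.tool > /dev/null \
  && echo "valid JSON" \
  || echo "invalid JSON"
```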

var_namespace='demo-playground'

# create demo project
oc new-project $var_namespace


# create the macvlan config
# note: IPAM is static, so each pod sets its own IP address via annotation
oc delete -f ${BASE_DIR}/data/install/macvlan.conf

var_namespace='demo-playground'
cat << EOF > ${BASE_DIR}/data/install/macvlan.conf
---
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: $var_namespace-macvlan
  namespace: $var_namespace
spec:
  config: |- 
    {
      "cniVersion": "0.3.1",
      "name": "macvlan-net",
      "type": "macvlan",
      "_master": "eth1",
      "linkInContainer": false,
      "mode": "bridge",
      "ipam": {
          "type": "static"
        }
    }

EOF

oc apply -f ${BASE_DIR}/data/install/macvlan.conf

test with pods

Next, we deploy two Pods in the cluster. Each Pod is assigned an IP address from the macvlan network. We will then run some commands on the Pods to verify the configuration.

# create demo pods
oc delete -f ${BASE_DIR}/data/install/pod.yaml

var_namespace='demo-playground'
cat << EOF > ${BASE_DIR}/data/install/pod.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tinypod-01
  namespace: $var_namespace
  labels:
    app: tinypod-01
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tinypod-01
  template:
    metadata:
      annotations:
        k8s.v1.cni.cncf.io/networks: '[
          {
            "name": "$var_namespace-macvlan", 
            "_mac": "02:03:04:05:06:07", 
            "_interface": "myiface1", 
            "ips": [
              "192.168.99.91/24"
              ] 
          }
        ]'
      labels:
        app: tinypod-01
        wzh-run: tinypod-testing
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - tinypod-02
              topologyKey: "kubernetes.io/hostname"
      containers:
      - image: registry.k8s.io/e2e-test-images/agnhost:2.43
        imagePullPolicy: IfNotPresent
        name: agnhost-container
        command: [ "/agnhost", "serve-hostname"]

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tinypod-02
  namespace: $var_namespace
  labels:
    app: tinypod-02
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tinypod-02
  template:
    metadata:
      annotations:
        k8s.v1.cni.cncf.io/networks: '[
          {
            "name": "$var_namespace-macvlan", 
            "_mac": "02:03:04:05:06:07", 
            "_interface": "myiface1", 
            "ips": [
              "192.168.99.92/24"
              ] 
          }
        ]'
      labels:
        app: tinypod-02
        wzh-run: tinypod-testing
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - tinypod-01
              topologyKey: "kubernetes.io/hostname"
      containers:
      - image: registry.k8s.io/e2e-test-images/agnhost:2.43
        imagePullPolicy: IfNotPresent
        name: agnhost-container
        command: [ "/agnhost", "serve-hostname"]

EOF

oc apply -f ${BASE_DIR}/data/install/pod.yaml

# run commands on the pods belonging to both deployments
# Get the list of pod names
pods=$(oc get pods -n $var_namespace -l wzh-run=tinypod-testing -o jsonpath='{.items[*].metadata.name}')

# Loop through each pod and execute the command
for pod in $pods; do
  echo "Pod: $pod"
  oc exec -it $pod -n $var_namespace -- /bin/sh -c "ip a"
  echo
done

# Pod: tinypod-01-64f74695d5-qzdkr
# 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
#     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
#     inet 127.0.0.1/8 scope host lo
#        valid_lft forever preferred_lft forever
#     inet6 ::1/128 scope host
#        valid_lft forever preferred_lft forever
# 2: eth0@if18: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UP group default
#     link/ether 0a:58:0a:86:00:0a brd ff:ff:ff:ff:ff:ff link-netnsid 0
#     inet 10.134.0.10/23 brd 10.134.1.255 scope global eth0
#        valid_lft forever preferred_lft forever
#     inet6 fe80::858:aff:fe86:a/64 scope link
#        valid_lft forever preferred_lft forever
# 3: net1@if7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
#     link/ether 12:8f:74:c6:ef:18 brd ff:ff:ff:ff:ff:ff link-netnsid 0
#     inet 192.168.99.91/24 brd 192.168.99.255 scope global net1
#        valid_lft forever preferred_lft forever
#     inet6 fe80::108f:74ff:fec6:ef18/64 scope link
#        valid_lft forever preferred_lft forever

# Pod: tinypod-02-597bb4db87-wmh74
# 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
#     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
#     inet 127.0.0.1/8 scope host lo
#        valid_lft forever preferred_lft forever
#     inet6 ::1/128 scope host
#        valid_lft forever preferred_lft forever
# 2: eth0@if20: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UP group default
#     link/ether 0a:58:0a:85:00:0c brd ff:ff:ff:ff:ff:ff link-netnsid 0
#     inet 10.133.0.12/23 brd 10.133.1.255 scope global eth0
#        valid_lft forever preferred_lft forever
#     inet6 fe80::858:aff:fe85:c/64 scope link
#        valid_lft forever preferred_lft forever
# 3: net1@if7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
#     link/ether c2:4f:09:dc:ea:43 brd ff:ff:ff:ff:ff:ff link-netnsid 0
#     inet 192.168.99.92/24 brd 192.168.99.255 scope global net1
#        valid_lft forever preferred_lft forever
#     inet6 fe80::c04f:9ff:fedc:ea43/64 scope link
#        valid_lft forever preferred_lft forever

keepalived as a sidecar

Next, we will deploy a keepalived container as a sidecar to maintain the VIP for the pods. The keepalived container will be responsible for monitoring the health of the pods and managing the VIP.
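At its core the sidecar just runs keepalived against a config mounted from a ConfigMap. The minimal shape of such a config (values here are placeholders; the full configs used in this demo appear below) is:

```
vrrp_instance VI_1 {
    state MASTER            # or BACKUP on the peer
    interface net1          # the macvlan interface inside the pod
    virtual_router_id 51    # must match on both peers
    priority 100            # the higher number wins the election
    advert_int 1
    virtual_ipaddress {
        192.168.77.100/24 dev net1
    }
}
```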

keepalived image

There are some keepalived container images available on GitHub, but they have not been updated for a long time, so we will build our own keepalived image.

mkdir -p /data/keepalived
cd /data/keepalived

cat << EOF > init.sh
#!/bin/bash

set -e
set -o pipefail

/usr/sbin/keepalived -n -l -D -f /etc/keepalived/keepalived.conf
EOF

cat << EOF > Dockerfile
FROM registry.access.redhat.com/ubi9

# Update the image to get the latest CVE updates
RUN dnf update -y \
 && dnf install -y --nodocs --allowerasing \
    bash       \
    curl       \
    iproute    \
    keepalived \
 && rm /etc/keepalived/keepalived.conf

COPY init.sh /init.sh

RUN chmod +x init.sh

CMD ["./init.sh"]
EOF

podman build -t quay.io/wangzheng422/qimgs:keepalived-2024-09-06-v01 .

podman push quay.io/wangzheng422/qimgs:keepalived-2024-09-06-v01

settings

# create an SCC for keepalived; we need the NET_ADMIN, NET_BROADCAST, NET_RAW capabilities
oc delete -f ${BASE_DIR}/data/install/keepalived-scc.yaml

cat << EOF > ${BASE_DIR}/data/install/keepalived-scc.yaml
apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
  name: keepalived-scc
allowPrivilegedContainer: false
allowedCapabilities:
- NET_ADMIN
- NET_BROADCAST
- NET_RAW
runAsUser:
  type: RunAsAny
seLinuxContext:
  type: RunAsAny
fsGroup:
  type: RunAsAny
supplementalGroups:
  type: RunAsAny
users: []
groups: []
EOF
oc apply -f ${BASE_DIR}/data/install/keepalived-scc.yaml

# create a sa
oc delete -f ${BASE_DIR}/data/install/keepalived-sa.yaml
var_namespace='demo-playground'
cat << EOF > ${BASE_DIR}/data/install/keepalived-sa.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: keepalived-sa
  namespace: $var_namespace
EOF
oc apply -f ${BASE_DIR}/data/install/keepalived-sa.yaml

# add scc to sa
oc adm policy add-scc-to-user keepalived-scc -z keepalived-sa -n $var_namespace

run with 2 nodes

ip addresses:

  • VIP : 192.168.77.100 (public ip address)
  • app / keepalived pod 1: 192.168.99.91
  • app / keepalived pod 2: 192.168.99.92
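The notify scripts in the manifest below recover the cluster SDN gateway by grepping the pod's route table. That parsing can be exercised locally against a captured route table (the routes are taken from this doc's own outputs):

```bash
# notify_backup.sh extracts the gateway of the 10.132.0.0/14 cluster route
# and re-points the default route at it. Dry run against captured output:
routes='default via 192.168.77.1 dev net1
10.132.0.0/14 via 10.133.0.1 dev eth0
10.133.0.0/23 dev eth0 proto kernel scope link src 10.133.0.19
100.64.0.0/16 via 10.133.0.1 dev eth0'

GATEWAY=$(printf '%s\n' "$routes" | grep "10.132.0.0/14" | awk '{print $3}')
echo "$GATEWAY"   # 10.133.0.1
```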
# create demo pods
# 192.168.77.100 is our VIP
oc delete -f ${BASE_DIR}/data/install/pod.yaml

var_namespace='demo-playground'
cat << EOF > ${BASE_DIR}/data/install/pod.yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: keepalived-config
  namespace: $var_namespace
data:
  keepalived.conf: |
    global_defs {
        log_level 7
        script_user root
        # enable_script_security
    }
    vrrp_script chk_ip {
        script "/etc/keepalived/check_ip.sh"
        interval 2
    }
    vrrp_instance VI_1 {
        state MASTER
        # state BACKUP
        interface net1
        virtual_router_id 51
        priority 100
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass 1111
        }
        virtual_ipaddress {
            192.168.77.100/24 dev net1
        }
        track_interface {
            net1
        }
        track_script {
            chk_ip 
        }
        notify_master "/etc/keepalived/notify_master.sh"
        notify_backup "/etc/keepalived/notify_backup.sh"
    }
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: keepalived-scripts
  namespace: $var_namespace
data:
  check_ip.sh: |
    #!/bin/sh
    if curl --max-time 0.1 -s http://192.168.99.91:9376 > /dev/null 2>&1 ; then
      exit 0
    else
      exit 1
    fi
  notify_master.sh: |
    #!/bin/sh
    ip route del default
    ip route add default via 192.168.77.1 dev net1
  notify_backup.sh: |
    #!/bin/sh
    ip route del default
    GATEWAY=\$(ip r | grep "10.132.0.0/14" | awk '{print \$3}')
    ip route add default via \$GATEWAY dev eth0   
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tinypod-01
  namespace: $var_namespace
  labels:
    app: tinypod-01
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tinypod-01
  template:
    metadata:
      annotations:
        k8s.v1.cni.cncf.io/networks: '[
          {
            "name": "$var_namespace-macvlan", 
            "_mac": "02:03:04:05:06:07", 
            "_interface": "myiface1", 
            "ips": [
              "192.168.99.91/24"
              ] 
          }
        ]'
      labels:
        app: tinypod-01
        wzh-run: tinypod-testing
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - tinypod-02
              topologyKey: "kubernetes.io/hostname"
      serviceAccountName: keepalived-sa
      initContainers:
      - name: init-permissions
        image: docker.io/busybox
        command: ['sh', '-c', 'cp /etc/keepalived/*.sh /tmp/keepalived/ && chmod 755 /tmp/keepalived/*.sh && chown root:root /tmp/keepalived/*.sh']
        volumeMounts:
        - name: keepalived-scripts
          mountPath: /etc/keepalived
        - name: writable-scripts
          mountPath: /tmp/keepalived
      containers:
      - name: application-container
        image: registry.k8s.io/e2e-test-images/agnhost:2.43
        imagePullPolicy: IfNotPresent
        command: [ "/agnhost", "serve-hostname"]
      - name: keepalived
        image: quay.io/wangzheng422/qimgs:keepalived-2024-09-06-v01
        imagePullPolicy: IfNotPresent
        securityContext:
          # privileged: true
          # runAsUser: 0
          capabilities:
            add: ["NET_ADMIN", "NET_BROADCAST", "NET_RAW"]
        volumeMounts:
        - name: keepalived-config
          mountPath: /etc/keepalived/keepalived.conf
          subPath: keepalived.conf
        - name: writable-scripts
          mountPath: /etc/keepalived
      volumes:
      - name: keepalived-config
        configMap:
          name: keepalived-config
      - name: keepalived-scripts
        configMap:
          name: keepalived-scripts
      - name: writable-scripts
        emptyDir: {}
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: keepalived-config-backup
  namespace: $var_namespace
data:
  keepalived.conf: |
    global_defs {
        log_level 7
        script_user root
        # enable_script_security
    }
    vrrp_script chk_ip {
        script "/etc/keepalived/check_ip.sh"
        interval 2
        weight +20
    }
    vrrp_instance VI_1 {
        state BACKUP
        interface net1
        virtual_router_id 51
        priority 90
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass 1111
        }
        virtual_ipaddress {
            192.168.77.100/24 dev net1
        }
        track_interface {
            net1
        }
        track_script {
            chk_ip
        }
        notify_master "/etc/keepalived/notify_master.sh"
        notify_backup "/etc/keepalived/notify_backup.sh"
    }
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: keepalived-backup-scripts
  namespace: $var_namespace
data:
  check_ip.sh: |
    #!/bin/sh
    
    # is our own endpoint ok?
    if curl --max-time 0.1 -s http://192.168.99.92:9376 > /dev/null 2>&1 ; then
      # exit 0
      # continue: our own endpoint is ok, keep checking the peer
      :
    else
      exit 1
    fi
    
    # Define the local file to record failure
    FAILURE_RECORD_FILE="/tmp/failure_record.txt"

    # Check if the failure record file exists
    # if so, we will still be the master
    if [ -f "\$FAILURE_RECORD_FILE" ]; then
      exit 0  # return Success (will add weight)
    fi

    # if the peer fails, we should add weight
    if curl --max-time 0.1 -s http://192.168.99.91:9376 > /dev/null 2>&1 ; then
      # exit 0
      exit 1  # curl ok, return Failure (no change in weight)
    else
      # exit 1
      # Record the failure by creating the file
      touch "\$FAILURE_RECORD_FILE"
      exit 0  # curl fail, return Success (will add weight)
    fi
  notify_master.sh: |
    #!/bin/sh
    ip route del default
    ip route add default via 192.168.77.1 dev net1
  notify_backup.sh: |
    #!/bin/sh
    ip route del default
    GATEWAY=\$(ip r | grep "10.132.0.0/14" | awk '{print \$3}')
    ip route add default via \$GATEWAY dev eth0 
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tinypod-02
  namespace: $var_namespace
  labels:
    app: tinypod-02
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tinypod-02
  template:
    metadata:
      annotations:
        k8s.v1.cni.cncf.io/networks: '[
          {
            "name": "$var_namespace-macvlan", 
            "_mac": "02:03:04:05:06:07", 
            "_interface": "myiface1", 
            "ips": [
              "192.168.99.92/24"
              ] 
          }
        ]'
      labels:
        app: tinypod-02
        wzh-run: tinypod-testing
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - tinypod-01
              topologyKey: "kubernetes.io/hostname"
      serviceAccountName: keepalived-sa
      initContainers:
      - name: init-permissions
        image: docker.io/busybox
        command: ['sh', '-c', 'cp /etc/keepalived/*.sh /tmp/keepalived/ && chmod 755 /tmp/keepalived/*.sh && chown root:root /tmp/keepalived/*.sh']
        volumeMounts:
        - name: keepalived-scripts
          mountPath: /etc/keepalived
        - name: writable-scripts
          mountPath: /tmp/keepalived
      containers:
      - name: application-container
        image: registry.k8s.io/e2e-test-images/agnhost:2.43
        imagePullPolicy: IfNotPresent
        command: [ "/agnhost", "serve-hostname"]
      - name: keepalived
        image: quay.io/wangzheng422/qimgs:keepalived-2024-09-06-v01
        imagePullPolicy: IfNotPresent
        securityContext:
          # privileged: true
          # runAsUser: 0
          capabilities:
            add: ["NET_ADMIN", "NET_BROADCAST", "NET_RAW"]
        volumeMounts:
        - name: keepalived-config
          mountPath: /etc/keepalived/keepalived.conf
          subPath: keepalived.conf
        - name: writable-scripts
          mountPath: /etc/keepalived
      volumes:
      - name: keepalived-config
        configMap:
          name: keepalived-config-backup
      - name: keepalived-scripts
        configMap:
          name: keepalived-backup-scripts
      - name: writable-scripts
        emptyDir: {}

EOF

oc apply -f ${BASE_DIR}/data/install/pod.yaml


# run commands on the pods belonging to both deployments
# Get the list of pod names
pods=$(oc get pods -n $var_namespace -l wzh-run=tinypod-testing -o jsonpath='{.items[*].metadata.name}')

# Loop through each pod and execute the command
# check the IP addresses and route tables
for pod in $pods; do
  echo "Pod: $pod"
  oc exec -it $pod -n $var_namespace -- /bin/sh -c "ip a"
  echo
done

# Pod: tinypod-01-7899f4c557-wnvd2
# Defaulted container "agnhost-container" out of: agnhost-container, keepalived
# 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
#     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
#     inet 127.0.0.1/8 scope host lo
#        valid_lft forever preferred_lft forever
#     inet6 ::1/128 scope host
#        valid_lft forever preferred_lft forever
# 2: eth0@if19: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UP group default
#     link/ether 0a:58:0a:86:00:0b brd ff:ff:ff:ff:ff:ff link-netnsid 0
#     inet 10.134.0.11/23 brd 10.134.1.255 scope global eth0
#        valid_lft forever preferred_lft forever
#     inet6 fe80::858:aff:fe86:b/64 scope link
#        valid_lft forever preferred_lft forever
# 3: net1@if7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
#     link/ether 72:24:2e:8b:df:a5 brd ff:ff:ff:ff:ff:ff link-netnsid 0
#     inet 192.168.99.91/24 brd 192.168.99.255 scope global net1
#        valid_lft forever preferred_lft forever
#     inet 192.168.77.100/24 scope global net1
#        valid_lft forever preferred_lft forever
#     inet6 fe80::7024:2eff:fe8b:dfa5/64 scope link
#        valid_lft forever preferred_lft forever

# Pod: tinypod-02-65b5989698-q5t5p
# Defaulted container "agnhost-container" out of: agnhost-container, keepalived
# 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
#     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
#     inet 127.0.0.1/8 scope host lo
#        valid_lft forever preferred_lft forever
#     inet6 ::1/128 scope host
#        valid_lft forever preferred_lft forever
# 2: eth0@if22: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UP group default
#     link/ether 0a:58:0a:85:00:0e brd ff:ff:ff:ff:ff:ff link-netnsid 0
#     inet 10.133.0.14/23 brd 10.133.1.255 scope global eth0
#        valid_lft forever preferred_lft forever
#     inet6 fe80::858:aff:fe85:e/64 scope link
#        valid_lft forever preferred_lft forever
# 3: net1@if7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
#     link/ether 62:ec:10:28:3a:c9 brd ff:ff:ff:ff:ff:ff link-netnsid 0
#     inet 192.168.99.92/24 brd 192.168.99.255 scope global net1
#        valid_lft forever preferred_lft forever
#     inet6 fe80::60ec:10ff:fe28:3ac9/64 scope link
#        valid_lft forever preferred_lft forever


# Get the list of pod names
pods=$(oc get pods -n $var_namespace -l wzh-run=tinypod-testing -o jsonpath='{.items[*].metadata.name}')

# Loop through each pod and execute the command
# here is the route table
for pod in $pods; do
  echo "Pod: $pod"
  oc exec -it $pod -n $var_namespace -- /bin/sh -c "ip r"
  echo
done

# Pod: tinypod-01-7fbb8c856b-n22cd
# Defaulted container "agnhost-container" out of: agnhost-container, keepalived, init-permissions (init)
# default via 192.168.77.1 dev net1
# 10.132.0.0/14 via 10.133.0.1 dev eth0
# 10.133.0.0/23 dev eth0 proto kernel scope link src 10.133.0.19
# 100.64.0.0/16 via 10.133.0.1 dev eth0
# 172.22.0.0/16 via 10.133.0.1 dev eth0
# 192.168.77.0/24 dev net1 proto kernel scope link src 192.168.77.100
# 192.168.99.0/24 dev net1 proto kernel scope link src 192.168.99.91

# Pod: tinypod-02-bb499c57-l8tng
# Defaulted container "agnhost-container" out of: agnhost-container, keepalived, init-permissions (init)
# default via 10.134.0.1 dev eth0
# 10.132.0.0/14 via 10.134.0.1 dev eth0
# 10.134.0.0/23 dev eth0 proto kernel scope link src 10.134.0.9
# 100.64.0.0/16 via 10.134.0.1 dev eth0
# 172.22.0.0/16 via 10.134.0.1 dev eth0
# 192.168.99.0/24 dev net1 proto kernel scope link src 192.168.99.92


# curl http://192.168.77.100:9376
# curl http://192.168.99.91:9376
# curl http://192.168.99.92:9376

# we check the service via the VIP; it prints a timestamp and the name of the pod serving the request.

while true; do
  TIMESTAMP=$(date +"%Y-%m-%d %H:%M:%S")
  RESPONSE=$(curl --max-time 0.05 -s -w "%{http_code}" http://192.168.77.100:9376)
  HTTP_CODE="${RESPONSE: -3}"
  CONTENT="${RESPONSE:0:-3}"

  if [ "$HTTP_CODE" -eq 200 ]; then
      echo "$TIMESTAMP - $CONTENT"
  else
      echo "$TIMESTAMP - call failed"
  fi

  sleep 1
done

# after the node's power is cut, the VIP takes about 3 seconds to fail over to pod-02

# 2024-09-05 23:10:49 - tinypod-01-6fc4fb867-tcmgx
# 2024-09-05 23:10:50 - tinypod-01-6fc4fb867-tcmgx
# 2024-09-05 23:10:51 - tinypod-01-6fc4fb867-tcmgx
# 2024-09-05 23:10:52 - tinypod-01-6fc4fb867-tcmgx
# 2024-09-05 23:10:53 - call failed
# 2024-09-05 23:10:54 - call failed
# 2024-09-05 23:10:55 - call failed
# 2024-09-05 23:10:56 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:10:57 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:10:58 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:10:59 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:11:00 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:11:01 - tinypod-02-c788654d4-hlsw5


# after the node powers back on, the VIP does not fail back to pod-01,
# because of the check_ip.sh script logic.

# 2024-09-05 23:12:30 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:12:31 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:12:32 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:12:33 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:12:34 - tinypod-02-c788654d4-hlsw5


# after a normal (graceful) node power-off, the VIP takes about 1 second to fail over to pod-02

# 2024-09-05 23:14:45 - tinypod-01-6fc4fb867-tcmgx
# 2024-09-05 23:14:46 - tinypod-01-6fc4fb867-tcmgx
# 2024-09-05 23:14:47 - tinypod-01-6fc4fb867-tcmgx
# 2024-09-05 23:14:48 - tinypod-01-6fc4fb867-tcmgx
# 2024-09-05 23:14:49 - tinypod-01-6fc4fb867-tcmgx
# 2024-09-05 23:14:50 - tinypod-01-6fc4fb867-tcmgx
# 2024-09-05 23:14:51 - tinypod-01-6fc4fb867-tcmgx
# 2024-09-05 23:14:52 - tinypod-01-6fc4fb867-tcmgx
# 2024-09-05 23:14:53 - call failed
# 2024-09-05 23:14:54 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:14:55 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:14:56 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:14:57 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:14:58 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:14:59 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:15:00 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:15:01 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:15:02 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:15:03 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:15:04 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:15:05 - tinypod-02-c788654d4-hlsw5

# after a normal node power-on, the VIP will not move back to pod-01,
# because of the check_ip.sh script logic.

# 2024-09-05 23:17:28 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:17:29 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:17:30 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:17:31 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:17:32 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:17:33 - tinypod-01-6fc4fb867-ml4rh
# 2024-09-05 23:17:34 - tinypod-01-6fc4fb867-ml4rh
# 2024-09-05 23:17:35 - tinypod-01-6fc4fb867-ml4rh
# 2024-09-05 23:17:36 - tinypod-01-6fc4fb867-ml4rh
# 2024-09-05 23:17:37 - tinypod-01-6fc4fb867-ml4rh
# 2024-09-05 23:17:38 - tinypod-01-6fc4fb867-ml4rh
# 2024-09-05 23:17:39 - tinypod-01-6fc4fb867-ml4rh

oc get pod -o wide
# NAME                         READY   STATUS        RESTARTS   AGE   IP            NODE             NOMINATED NODE   READINESS GATES
# tinypod-01-6fc4fb867-ml4rh   2/2     Running       0          97s   10.132.0.98   master-01-demo   <none>           <none>
# tinypod-01-6fc4fb867-tcmgx   2/2     Terminating   2          14m   10.134.0.8    worker-02-demo   <none>           <none>
# tinypod-02-c788654d4-hlsw5   2/2     Running       0          14m   10.133.0.24   worker-01-demo   <none>           <none>


# on the host, run tcpdump to verify the VIP is working
# run `curl 8.8.8.8` in the master pod, and check the tcpdump output
tcpdump -i br-ocp tcp and dst host 8.8.8.8
# dropped privs to tcpdump
# tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
# listening on br-ocp, link-type EN10MB (Ethernet), snapshot length 262144 bytes
# 07:25:20.189841 IP 192.168.77.100.33636 > dns.google.http: Flags [S], seq 1576966623, win 32120, options [mss 1460,sackOK,TS val 1509135670 ecr 0,nop,wscale 7], length 0
# 07:25:21.229779 IP 192.168.77.100.33636 > dns.google.http: Flags [S], seq 1576966623, win 32120, options [mss 1460,sackOK,TS val 1509136710 ecr 0,nop,wscale 7], length 0
# 07:25:23.277818 IP 192.168.77.100.33636 > dns.google.http: Flags [S], seq 1576966623, win 32120, options [mss 1460,sackOK,TS val 1509138758 ecr 0,nop,wscale 7], length 0
# 07:25:27.309816 IP 192.168.77.100.33636 > dns.google.http: Flags [S], seq 1576966623, win 32120, options [mss 1460,sackOK,TS val 1509142790 ecr 0,nop,wscale 7], length 0

check on node level

The requirement is to fail over the VIP to another pod only when the node is down, so we should not check the application endpoint. Instead, we create a simple HTTP server in a separate pod and check whether that web server is responding; if it is not, we know the node is down.
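With this design, the keepalived track_script probes the check pod (and therefore the node), not the application container, so an application restart alone never moves the VIP. A toy decision table of that behavior (vip_moves is an illustrative helper, not part of the deployment):

```bash
# Failover is driven only by the check pod's reachability (node state);
# the application's own state is deliberately ignored.
vip_moves() {
  # $1 = app pod state (up/down), $2 = check pod state (up/down)
  [ "$2" = up ] && echo stay || echo failover
}

vip_moves down up    # stay: app crashed, but the node is alive
vip_moves up   down  # failover: check pod (node) unreachable
```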

Here is the architecture:

ip addresses:

  • VIP : 192.168.77.100 (public ip address)
  • app / keepalived pod 1: 192.168.99.91
  • app / keepalived pod 2: 192.168.99.92
  • check pod 1: 192.168.99.81
  • check pod 2: 192.168.99.82
# create demo pods
# 192.168.77.100 is our VIP
oc delete -f ${BASE_DIR}/data/install/pod.yaml

var_namespace='demo-playground'
cat << EOF > ${BASE_DIR}/data/install/pod.yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: keepalived-config
  namespace: $var_namespace
data:
  keepalived.conf: |
    global_defs {
        log_level 7
        script_user root
        # enable_script_security
    }
    vrrp_script chk_ip {
        script "/etc/keepalived/check_ip.sh"
        interval 2
    }
    vrrp_instance VI_1 {
        state MASTER
        # state BACKUP
        interface net1
        virtual_router_id 51
        priority 100
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass 1111
        }
        virtual_ipaddress {
            192.168.77.100/24 dev net1
        }
        track_interface {
            net1
        }
        track_script {
            chk_ip 
        }
        notify_master "/etc/keepalived/notify_master.sh"
        notify_backup "/etc/keepalived/notify_backup.sh"
    }
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: keepalived-scripts
  namespace: $var_namespace
data:
  check_ip.sh: |
    #!/bin/sh
    if curl --max-time 0.1 -s http://192.168.99.81:9376 > /dev/null 2>&1 ; then
      exit 0
    else
      exit 1
    fi
  notify_master.sh: |
    #!/bin/sh
    ip route del default
    ip route add default via 192.168.77.1 dev net1
  notify_backup.sh: |
    #!/bin/sh
    ip route del default
    GATEWAY=\$(ip r | grep "10.132.0.0/14" | awk '{print \$3}')
    ip route add default via \$GATEWAY dev eth0   
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tinypod-01
  namespace: $var_namespace
  labels:
    app: tinypod-01
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tinypod-01
  template:
    metadata:
      annotations:
        k8s.v1.cni.cncf.io/networks: '[
          {
            "name": "$var_namespace-macvlan", 
            "_mac": "02:03:04:05:06:07", 
            "_interface": "myiface1", 
            "ips": [
              "192.168.99.91/24"
              ] 
          }
        ]'
      labels:
        app: tinypod-01
        wzh-run: tinypod-testing
    spec:
      # prefer not to run on the same node as tinypod-02,
      # but it can land on the same node in extreme cases
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - tinypod-02
              topologyKey: "kubernetes.io/hostname"
      serviceAccountName: keepalived-sa
      initContainers:
      - name: init-permissions
        image: docker.io/busybox
        command: ['sh', '-c', 'cp /etc/keepalived/*.sh /tmp/keepalived/ && chmod 755 /tmp/keepalived/*.sh && chown root:root /tmp/keepalived/*.sh']
        volumeMounts:
        - name: keepalived-scripts
          mountPath: /etc/keepalived
        - name: writable-scripts
          mountPath: /tmp/keepalived
      containers:
      - name: application-container
        image: registry.k8s.io/e2e-test-images/agnhost:2.43
        imagePullPolicy: IfNotPresent
        command: [ "/agnhost", "serve-hostname"]
      - name: keepalived
        image: quay.io/wangzheng422/qimgs:keepalived-2024-09-06-v01
        imagePullPolicy: IfNotPresent
        securityContext:
          # privileged: true
          # runAsUser: 0
          capabilities:
            add: ["NET_ADMIN", "NET_BROADCAST", "NET_RAW"]
        volumeMounts:
        - name: keepalived-config
          mountPath: /etc/keepalived/keepalived.conf
          subPath: keepalived.conf
        - name: writable-scripts
          mountPath: /etc/keepalived
      volumes:
      - name: keepalived-config
        configMap:
          name: keepalived-config
      - name: keepalived-scripts
        configMap:
          name: keepalived-scripts
      - name: writable-scripts
        emptyDir: {}
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tinypod-01-check
  namespace: $var_namespace
  labels:
    app: tinypod-01-check
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tinypod-01-check
  template:
    metadata:
      annotations:
        k8s.v1.cni.cncf.io/networks: '[
          {
            "name": "$var_namespace-macvlan", 
            "_mac": "02:03:04:05:06:07", 
            "_interface": "myiface1", 
            "ips": [
              "192.168.99.81/24"
              ] 
          }
        ]'
      labels:
        app: tinypod-01-check
        wzh-run: tinypod-testing
    spec:
      # run with tinypod-01 app
      affinity:
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - tinypod-01
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: endpoint-container
        image: registry.k8s.io/e2e-test-images/agnhost:2.43
        imagePullPolicy: IfNotPresent
        command: [ "/agnhost", "serve-hostname"]
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: keepalived-config-backup
  namespace: $var_namespace
data:
  keepalived.conf: |
    global_defs {
        log_level 7
        script_user root
        # enable_script_security
    }
    vrrp_script chk_ip {
        script "/etc/keepalived/check_ip.sh"
        interval 2
        weight +20
    }
    vrrp_instance VI_1 {
        state BACKUP
        interface net1
        virtual_router_id 51
        # priority should be lower than the master's, but we do not
        # want to fail back: check_ip.sh adds +20 weight, raising the
        # effective priority to 110 once this node has taken over
        priority 90
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass 1111
        }
        virtual_ipaddress {
            192.168.77.100/24 dev net1
        }
        track_interface {
            net1
        }
        track_script {
            chk_ip
        }
        notify_master "/etc/keepalived/notify_master.sh"
        notify_backup "/etc/keepalived/notify_backup.sh"
    }
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: keepalived-backup-scripts
  namespace: $var_namespace
data:
  check_ip.sh: |
    #!/bin/sh

    # Define the local file to record failure
    FAILURE_RECORD_FILE="/tmp/failure_record.txt"

    # is our own check endpoint reachable?
    if curl --max-time 0.1 -s http://192.168.99.82:9376 > /dev/null 2>&1 ; then
      # exit 0
      # continue, only ourself is ok.
      :
    else
      # return Failure (no change in weight)
      # will not be the master
      /bin/rm -f "\$FAILURE_RECORD_FILE"
      exit 1
    fi
    
    # Check if the failure record file exists
    # if so, we will still be the master
    if [ -f "\$FAILURE_RECORD_FILE" ]; then
      exit 0  # return Success (will add weight)
    fi

    # if the peer fails, add weight to take over
    if curl --max-time 0.1 -s http://192.168.99.81:9376 > /dev/null 2>&1 ; then
      # exit 0
      exit 1  # curl ok, return Failure (no change in weight)
    else
      # exit 1
      # Record the failure by creating the file
      touch "\$FAILURE_RECORD_FILE"
      exit 0  # curl fail, return Success (will add weight)
    fi
  notify_master.sh: |
    #!/bin/sh
    ip route del default
    ip route add default via 192.168.77.1 dev net1
  notify_backup.sh: |
    #!/bin/sh
    ip route del default
    GATEWAY=\$(ip r | grep "10.132.0.0/14" | awk '{print \$3}')
    ip route add default via \$GATEWAY dev eth0 
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tinypod-02
  namespace: $var_namespace
  labels:
    app: tinypod-02
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tinypod-02
  template:
    metadata:
      annotations:
        k8s.v1.cni.cncf.io/networks: '[
          {
            "name": "$var_namespace-macvlan", 
            "_mac": "02:03:04:05:06:07", 
            "_interface": "myiface1", 
            "ips": [
              "192.168.99.92/24"
              ] 
          }
        ]'
      labels:
        app: tinypod-02
        wzh-run: tinypod-testing
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - tinypod-01
              topologyKey: "kubernetes.io/hostname"
      serviceAccountName: keepalived-sa
      initContainers:
      - name: init-permissions
        image: docker.io/busybox
        command: ['sh', '-c', 'cp /etc/keepalived/*.sh /tmp/keepalived/ && chmod 755 /tmp/keepalived/*.sh && chown root:root /tmp/keepalived/*.sh']
        volumeMounts:
        - name: keepalived-scripts
          mountPath: /etc/keepalived
        - name: writable-scripts
          mountPath: /tmp/keepalived
      containers:
      - name: application-container
        image: registry.k8s.io/e2e-test-images/agnhost:2.43
        imagePullPolicy: IfNotPresent
        command: [ "/agnhost", "serve-hostname"]
      - name: keepalived
        image: quay.io/wangzheng422/qimgs:keepalived-2024-09-06-v01
        imagePullPolicy: IfNotPresent
        securityContext:
          # privileged: true
          # runAsUser: 0
          capabilities:
            add: ["NET_ADMIN", "NET_BROADCAST", "NET_RAW"]
        volumeMounts:
        - name: keepalived-config
          mountPath: /etc/keepalived/keepalived.conf
          subPath: keepalived.conf
        - name: writable-scripts
          mountPath: /etc/keepalived
      volumes:
      - name: keepalived-config
        configMap:
          name: keepalived-config-backup
      - name: keepalived-scripts
        configMap:
          name: keepalived-backup-scripts
      - name: writable-scripts
        emptyDir: {}
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tinypod-02-check
  namespace: $var_namespace
  labels:
    app: tinypod-02-check
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tinypod-02-check
  template:
    metadata:
      annotations:
        k8s.v1.cni.cncf.io/networks: '[
          {
            "name": "$var_namespace-macvlan", 
            "_mac": "02:03:04:05:06:07", 
            "_interface": "myiface1", 
            "ips": [
              "192.168.99.82/24"
              ] 
          }
        ]'
      labels:
        app: tinypod-02-check
        wzh-run: tinypod-testing
    spec:
      # run with tinypod-02 app
      affinity:
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - tinypod-02
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: endpoint-container
        image: registry.k8s.io/e2e-test-images/agnhost:2.43
        imagePullPolicy: IfNotPresent
        command: [ "/agnhost", "serve-hostname"]
EOF

oc apply -f ${BASE_DIR}/data/install/pod.yaml


# run commands on the pods belonging to both deployments
# Get the list of pod names
pods=$(oc get pods -n $var_namespace -l wzh-run=tinypod-testing -o jsonpath='{.items[*].metadata.name}')

# Loop through each pod and execute the command
# check the IP addresses of each pod
for pod in $pods; do
  echo "Pod: $pod"
  oc exec -it $pod -n $var_namespace -- /bin/sh -c "ip a"
  echo
done

# Pod: tinypod-01-974b4cc84-x9rsz
# Defaulted container "application-container" out of: application-container, keepalived, init-permissions (init)
# 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
#     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
#     inet 127.0.0.1/8 scope host lo
#        valid_lft forever preferred_lft forever
#     inet6 ::1/128 scope host
#        valid_lft forever preferred_lft forever
# 2: eth0@if26: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UP group default
#     link/ether 0a:58:0a:86:00:15 brd ff:ff:ff:ff:ff:ff link-netnsid 0
#     inet 10.134.0.21/23 brd 10.134.1.255 scope global eth0
#        valid_lft forever preferred_lft forever
#     inet6 fe80::858:aff:fe86:15/64 scope link
#        valid_lft forever preferred_lft forever
# 3: net1@if8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
#     link/ether 36:70:c0:f8:8d:07 brd ff:ff:ff:ff:ff:ff link-netnsid 0
#     inet 192.168.99.91/24 brd 192.168.99.255 scope global net1
#        valid_lft forever preferred_lft forever
#     inet 192.168.77.100/24 scope global net1
#        valid_lft forever preferred_lft forever
#     inet6 fe80::3470:c0ff:fef8:8d07/64 scope link
#        valid_lft forever preferred_lft forever

# Pod: tinypod-01-check-668b5d9498-ncmk5
# 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
#     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
#     inet 127.0.0.1/8 scope host lo
#        valid_lft forever preferred_lft forever
#     inet6 ::1/128 scope host
#        valid_lft forever preferred_lft forever
# 2: eth0@if27: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UP group default
#     link/ether 0a:58:0a:86:00:16 brd ff:ff:ff:ff:ff:ff link-netnsid 0
#     inet 10.134.0.22/23 brd 10.134.1.255 scope global eth0
#        valid_lft forever preferred_lft forever
#     inet6 fe80::858:aff:fe86:16/64 scope link
#        valid_lft forever preferred_lft forever
# 3: net1@if8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
#     link/ether ae:82:93:bd:38:f2 brd ff:ff:ff:ff:ff:ff link-netnsid 0
#     inet 192.168.99.81/24 brd 192.168.99.255 scope global net1
#        valid_lft forever preferred_lft forever
#     inet6 fe80::ac82:93ff:febd:38f2/64 scope link
#        valid_lft forever preferred_lft forever

# Pod: tinypod-02-97d4bfd8b-kxsbn
# Defaulted container "application-container" out of: application-container, keepalived, init-permissions (init)
# 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
#     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
#     inet 127.0.0.1/8 scope host lo
#        valid_lft forever preferred_lft forever
#     inet6 ::1/128 scope host
#        valid_lft forever preferred_lft forever
# 2: eth0@if22: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UP group default
#     link/ether 0a:58:0a:85:00:0f brd ff:ff:ff:ff:ff:ff link-netnsid 0
#     inet 10.133.0.15/23 brd 10.133.1.255 scope global eth0
#        valid_lft forever preferred_lft forever
#     inet6 fe80::858:aff:fe85:f/64 scope link
#        valid_lft forever preferred_lft forever
# 3: net1@if8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
#     link/ether 5a:2e:41:7c:8c:1c brd ff:ff:ff:ff:ff:ff link-netnsid 0
#     inet 192.168.99.92/24 brd 192.168.99.255 scope global net1
#        valid_lft forever preferred_lft forever
#     inet6 fe80::582e:41ff:fe7c:8c1c/64 scope link
#        valid_lft forever preferred_lft forever

# Pod: tinypod-02-check-645b69c854-6nk9r
# 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
#     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
#     inet 127.0.0.1/8 scope host lo
#        valid_lft forever preferred_lft forever
#     inet6 ::1/128 scope host
#        valid_lft forever preferred_lft forever
# 2: eth0@if24: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UP group default
#     link/ether 0a:58:0a:85:00:11 brd ff:ff:ff:ff:ff:ff link-netnsid 0
#     inet 10.133.0.17/23 brd 10.133.1.255 scope global eth0
#        valid_lft forever preferred_lft forever
#     inet6 fe80::858:aff:fe85:11/64 scope link
#        valid_lft forever preferred_lft forever
# 3: net1@if8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
#     link/ether 2e:5e:cb:96:ee:20 brd ff:ff:ff:ff:ff:ff link-netnsid 0
#     inet 192.168.99.82/24 brd 192.168.99.255 scope global net1
#        valid_lft forever preferred_lft forever
#     inet6 fe80::2c5e:cbff:fe96:ee20/64 scope link
#        valid_lft forever preferred_lft forever



# Get the list of pod names
pods=$(oc get pods -n $var_namespace -l wzh-run=tinypod-testing -o jsonpath='{.items[*].metadata.name}')

# Loop through each pod and execute the command
# this time, check the route table of each pod
for pod in $pods; do
  echo "Pod: $pod"
  oc exec -it $pod -n $var_namespace -- /bin/sh -c "ip r"
  echo
done

# Pod: tinypod-01-974b4cc84-x9rsz
# Defaulted container "application-container" out of: application-container, keepalived, init-permissions (init)
# default via 192.168.77.1 dev net1
# 10.132.0.0/14 via 10.134.0.1 dev eth0
# 10.134.0.0/23 dev eth0 proto kernel scope link src 10.134.0.21
# 100.64.0.0/16 via 10.134.0.1 dev eth0
# 172.22.0.0/16 via 10.134.0.1 dev eth0
# 192.168.77.0/24 dev net1 proto kernel scope link src 192.168.77.100
# 192.168.99.0/24 dev net1 proto kernel scope link src 192.168.99.91

# Pod: tinypod-01-check-668b5d9498-ncmk5
# default via 10.134.0.1 dev eth0
# 10.132.0.0/14 via 10.134.0.1 dev eth0
# 10.134.0.0/23 dev eth0 proto kernel scope link src 10.134.0.22
# 100.64.0.0/16 via 10.134.0.1 dev eth0
# 172.22.0.0/16 via 10.134.0.1 dev eth0
# 192.168.99.0/24 dev net1 proto kernel scope link src 192.168.99.81

# Pod: tinypod-02-97d4bfd8b-kxsbn
# Defaulted container "application-container" out of: application-container, keepalived, init-permissions (init)
# default via 10.133.0.1 dev eth0
# 10.132.0.0/14 via 10.133.0.1 dev eth0
# 10.133.0.0/23 dev eth0 proto kernel scope link src 10.133.0.15
# 100.64.0.0/16 via 10.133.0.1 dev eth0
# 172.22.0.0/16 via 10.133.0.1 dev eth0
# 192.168.99.0/24 dev net1 proto kernel scope link src 192.168.99.92

# Pod: tinypod-02-check-645b69c854-6nk9r
# default via 10.133.0.1 dev eth0
# 10.132.0.0/14 via 10.133.0.1 dev eth0
# 10.133.0.0/23 dev eth0 proto kernel scope link src 10.133.0.17
# 100.64.0.0/16 via 10.133.0.1 dev eth0
# 172.22.0.0/16 via 10.133.0.1 dev eth0
# 192.168.99.0/24 dev net1 proto kernel scope link src 192.168.99.82

# curl http://192.168.77.100:9376
# curl http://192.168.99.91:9376
# curl http://192.168.99.92:9376

# query the service via the VIP; each response gives a timestamp and the name of the pod serving the request.

while true; do
  TIMESTAMP=$(date +"%Y-%m-%d %H:%M:%S")
  RESPONSE=$(curl --max-time 0.05 -s -w "%{http_code}" http://192.168.77.100:9376)
  HTTP_CODE="${RESPONSE: -3}"
  CONTENT="${RESPONSE:0:-3}"

  if [ "$HTTP_CODE" -eq 200 ]; then
      echo "$TIMESTAMP - $CONTENT"
  else
      echo "$TIMESTAMP - call failed"
  fi

  sleep 1
done

# after the node loses power abruptly, the VIP takes about 3 seconds to fail over to pod-02

# 2024-09-05 23:10:49 - tinypod-01-6fc4fb867-tcmgx
# 2024-09-05 23:10:50 - tinypod-01-6fc4fb867-tcmgx
# 2024-09-05 23:10:51 - tinypod-01-6fc4fb867-tcmgx
# 2024-09-05 23:10:52 - tinypod-01-6fc4fb867-tcmgx
# 2024-09-05 23:10:53 - call failed
# 2024-09-05 23:10:54 - call failed
# 2024-09-05 23:10:55 - call failed
# 2024-09-05 23:10:56 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:10:57 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:10:58 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:10:59 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:11:00 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:11:01 - tinypod-02-c788654d4-hlsw5


# after the node powers back on, the VIP does not fail back to pod-01
# because of the check_ip.sh script logic.

# 2024-09-05 23:12:30 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:12:31 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:12:32 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:12:33 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:12:34 - tinypod-02-c788654d4-hlsw5


# after a graceful node power-off, the VIP takes about 1 second to fail over to pod-02

# 2024-09-05 23:14:45 - tinypod-01-6fc4fb867-tcmgx
# 2024-09-05 23:14:46 - tinypod-01-6fc4fb867-tcmgx
# 2024-09-05 23:14:47 - tinypod-01-6fc4fb867-tcmgx
# 2024-09-05 23:14:48 - tinypod-01-6fc4fb867-tcmgx
# 2024-09-05 23:14:49 - tinypod-01-6fc4fb867-tcmgx
# 2024-09-05 23:14:50 - tinypod-01-6fc4fb867-tcmgx
# 2024-09-05 23:14:51 - tinypod-01-6fc4fb867-tcmgx
# 2024-09-05 23:14:52 - tinypod-01-6fc4fb867-tcmgx
# 2024-09-05 23:14:53 - call failed
# 2024-09-05 23:14:54 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:14:55 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:14:56 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:14:57 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:14:58 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:14:59 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:15:00 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:15:01 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:15:02 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:15:03 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:15:04 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:15:05 - tinypod-02-c788654d4-hlsw5

# after the node powers back on normally, the VIP does not fail back to pod-01
# because of the check_ip.sh script logic.

# 2024-09-05 23:17:28 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:17:29 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:17:30 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:17:31 - tinypod-02-c788654d4-hlsw5
# 2024-09-05 23:17:32 - tinypod-02-c788654d4-hlsw5
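The no-failback behavior follows from VRRP priority arithmetic: the master's static priority is 100, while the backup starts at 90 and check_ip.sh adds a +20 weight once it has recorded a master failure. A minimal sketch of the comparison (values taken from the keepalived.conf files above):

```shell
# Effective VRRP priority comparison (values from the configs above).
MASTER_PRIO=100        # tinypod-01: static priority, no weight bonus
BACKUP_BASE=90         # tinypod-02: static priority
CHECK_WEIGHT=20        # added while check_ip.sh exits 0 (peer failure recorded)

BACKUP_EFFECTIVE=$((BACKUP_BASE + CHECK_WEIGHT))
echo "backup effective priority: $BACKUP_EFFECTIVE"

if [ "$BACKUP_EFFECTIVE" -gt "$MASTER_PRIO" ]; then
  echo "backup keeps the VIP even after the old master returns"
fi
```

Since 110 > 100, the backup continues to win VRRP elections after the old master comes back, which is exactly why the VIP stays on pod-02.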

do not start the app on the backup node

The client has two additional requirements:

  1. The app should not start on the backup node; it should start only after the pod becomes the master.
  2. The pod can have multiple VIPs, with each VIP maintained by a separate VRRP ID.

These requirements are not complicated; we can meet them with a few additional techniques within the same architecture:

  1. Run a script as a daemon process and start the app through that daemon.
  2. The daemon watches a set of switch files: when all of them exist it starts the app, and when any of them is missing it shuts the app down.
  3. In Keepalived, configure multiple VRRP instances.
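The switch-file gating in steps 1 and 2 can be sketched locally like this (a temp directory stands in for the shared `/mnt/switch` emptyDir used in the pod):

```shell
# Local sketch of the switch-file gate used by the app daemon.
SWITCH_DIR=$(mktemp -d)
SWITCH_FILES="77 88"

should_run() {
  # the app runs only while every expected switch file exists
  for f in $SWITCH_FILES; do
    [ -f "$SWITCH_DIR/$f" ] || return 1
  done
  return 0
}

should_run && echo "start app" || echo "stop app"   # stop app: no files yet

touch "$SWITCH_DIR/77" "$SWITCH_DIR/88"             # as the notify_master_* scripts would
should_run && echo "start app" || echo "stop app"   # start app

rm "$SWITCH_DIR/77"                                 # as notify_backup_vi_1.sh would
should_run && echo "start app" || echo "stop app"   # stop app
```

Requiring all switch files means the app only runs on the pod that holds every VIP; losing any one VRRP instance stops it.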

However, this solution has a shortcoming:

  1. The daemon's logs and the app's logs are mixed together on stdout, so a separate logging solution may be needed.

# create demo pods
# 192.168.77.100 and 192.168.88.100 are our VIPs

var_namespace='demo-playground'

# create master pod
oc delete -f ${BASE_DIR}/data/install/pod.yaml

cat << EOF > ${BASE_DIR}/data/install/pod.yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: keepalived-config
  namespace: $var_namespace
data:
  keepalived.conf: |
    global_defs {
        log_level 7
        script_user root
        # enable_script_security
    }
    vrrp_script chk_ip {
        script "/etc/keepalived/check_ip.sh"
        interval 2
    }
    vrrp_instance VI_1 {
        state MASTER
        interface net1
        virtual_router_id 51
        priority 100
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass 1111
        }
        virtual_ipaddress {
            192.168.77.100/24 dev net1
        }
        track_interface {
            net1
        }
        track_script {
            chk_ip 
        }
        notify_master "/etc/keepalived/notify_master_vi_1.sh"
        notify_backup "/etc/keepalived/notify_backup_vi_1.sh"
    }
    vrrp_instance VI_2 {
        state MASTER
        interface net1
        virtual_router_id 52
        priority 100
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass 2222
        }
        virtual_ipaddress {
            192.168.88.100/24 dev net1
        }
        track_interface {
            net1
        }
        track_script {
            chk_ip 
        }
        notify_master "/etc/keepalived/notify_master_vi_2.sh"
        notify_backup "/etc/keepalived/notify_backup_vi_2.sh"
    }
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: keepalived-scripts
  namespace: $var_namespace
data:
  check_ip.sh: |
    #!/bin/sh
    if curl --max-time 0.1 -s http://192.168.99.81:9376 > /dev/null 2>&1 ; then
      exit 0
    else
      exit 1
    fi
  notify_master_vi_1.sh: |
    #!/bin/sh
    ip route del default
    ip route add default via 192.168.77.1 dev net1

    # create switch file, to trigger app start
    touch /mnt/switch/77
  notify_backup_vi_1.sh: |
    #!/bin/sh
    ip route del default
    GATEWAY=\$(ip r | grep "10.132.0.0/14" | awk '{print \$3}')
    ip route add default via \$GATEWAY dev eth0   

    # remove switch file, to trigger app stop
    rm /mnt/switch/77
  notify_master_vi_2.sh: |
    #!/bin/sh

    # create switch file, to trigger app start
    touch /mnt/switch/88
  notify_backup_vi_2.sh: |
    #!/bin/sh

    # remove switch file, to trigger app stop
    rm /mnt/switch/88
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-start-script
  namespace: $var_namespace
data:
  startup.sh: |
    #!/bin/bash

    # Directory containing the switch files
    SWITCH_DIR="/mnt/switch"
    # List of switch files to monitor
    SWITCH_FILES=("88" "77")

    # Infinite loop
    while true; do

      # Check if agnhost process is running
      AGNHOST_PID=\$(pgrep -f "agnhost serve-hostname")

      # Flag to check if all switch files exist
      all_files_exist=true

      # Check each switch file
      for file in "\${SWITCH_FILES[@]}"; do
        if [[ ! -f "\$SWITCH_DIR/\$file" ]]; then
          all_files_exist=false
          break
        fi
      done

      if \$all_files_exist; then
        echo "All switch files found."
        # Check if agnhost process is running
        if [[ -z "\$AGNHOST_PID" ]]; then
          # Start agnhost in the background if not running
          echo "Starting agnhost..."
          /agnhost serve-hostname &
        else
          echo "agnhost is already running."
        fi
      else
        echo "Not all switch files found."
        # Find and terminate the agnhost process
        if [[ -n "\$AGNHOST_PID" ]]; then
          kill -9 "\$AGNHOST_PID"
          echo "agnhost process terminated."
        else
          echo "agnhost process not running."
        fi
      fi

      # Sleep for 1 second
      echo "Sleeping for 1 second..."
      sleep 1
      
    done
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tinypod-01
  namespace: $var_namespace
  labels:
    app: tinypod-01
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tinypod-01
  template:
    metadata:
      annotations:
        k8s.v1.cni.cncf.io/networks: '[
          {
            "name": "$var_namespace-macvlan", 
            "_mac": "02:03:04:05:06:07", 
            "_interface": "myiface1", 
            "ips": [
              "192.168.99.91/24"
              ] 
          }
        ]'
      labels:
        app: tinypod-01
        wzh-run: tinypod-testing
    spec:
      # do not run on the same node as tinypod-02
      # (preferred, not required: in extreme cases both may land on the same node)
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - tinypod-02
              topologyKey: "kubernetes.io/hostname"
      serviceAccountName: keepalived-sa
      initContainers:
      - name: init-permissions
        image: docker.io/busybox
        command: ['sh', '-c', 'cp /etc/keepalived/*.sh /tmp/keepalived/ && chmod 755 /tmp/keepalived/*.sh && chown root:root /tmp/keepalived/*.sh']
        volumeMounts:
        - name: keepalived-scripts
          mountPath: /etc/keepalived
        - name: writable-scripts
          mountPath: /tmp/keepalived
      containers:
      - name: application-container
        image: registry.k8s.io/e2e-test-images/agnhost:2.43
        imagePullPolicy: IfNotPresent
        command: [ "/bin/bash", "/mnt/scripts/startup.sh"]
        volumeMounts:
        - name: share-switch
          mountPath: /mnt/switch
        - name: app-start-script
          mountPath: /mnt/scripts
      - name: keepalived
        image: quay.io/wangzheng422/qimgs:keepalived-2024-09-06-v01
        imagePullPolicy: IfNotPresent
        securityContext:
          # privileged: true
          # runAsUser: 0
          capabilities:
            add: ["NET_ADMIN", "NET_BROADCAST", "NET_RAW"]
        volumeMounts:
        - name: keepalived-config
          mountPath: /etc/keepalived/keepalived.conf
          subPath: keepalived.conf
        - name: writable-scripts
          mountPath: /etc/keepalived
        - name: share-switch
          mountPath: /mnt/switch
      volumes:
      - name: keepalived-config
        configMap:
          name: keepalived-config
      - name: keepalived-scripts
        configMap:
          name: keepalived-scripts
      - name: writable-scripts
        emptyDir: {}
      - name: share-switch
        emptyDir: {}
      - name: app-start-script
        configMap:
          name: app-start-script
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tinypod-01-check
  namespace: $var_namespace
  labels:
    app: tinypod-01-check
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tinypod-01-check
  template:
    metadata:
      annotations:
        k8s.v1.cni.cncf.io/networks: '[
          {
            "name": "$var_namespace-macvlan", 
            "_mac": "02:03:04:05:06:07", 
            "_interface": "myiface1", 
            "ips": [
              "192.168.99.81/24"
              ] 
          }
        ]'
      labels:
        app: tinypod-01-check
        wzh-run: tinypod-check
    spec:
      # run with tinypod-01 app
      affinity:
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - tinypod-01
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: endpoint-container
        image: registry.k8s.io/e2e-test-images/agnhost:2.43
        imagePullPolicy: IfNotPresent
        command: [ "/agnhost", "serve-hostname"]
EOF

oc apply -f ${BASE_DIR}/data/install/pod.yaml


# get the logs of the application container, selected by label
oc logs -n $var_namespace -l app=tinypod-01 -c application-container
# Sleeping for 1 second...
# Not all switch files found.
# agnhost process not running.
# Sleeping for 1 second...
# Not all switch files found.
# agnhost process not running.
# Sleeping for 1 second...
# All switch files found.
# Starting agnhost...
# I0922 04:15:22.582473      16 log.go:198] Serving on port 9376.


# create backup pod
oc delete -f ${BASE_DIR}/data/install/pod-02.yaml

cat << EOF > ${BASE_DIR}/data/install/pod-02.yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: keepalived-config-backup
  namespace: $var_namespace
data:
  keepalived.conf: |
    global_defs {
        log_level 7
        script_user root
        # enable_script_security
    }
    vrrp_script chk_ip {
        script "/etc/keepalived/check_ip.sh"
        interval 2
        weight +20
    }
    vrrp_instance VI_1 {
        state BACKUP
        interface net1
        virtual_router_id 51
        # priority should be lower than the master's, but we do not
        # want to fail back: check_ip.sh adds +20 weight, raising the
        # effective priority to 110, so once the master fails we take
        # over and keep the VIP
        priority 90
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass 1111
        }
        virtual_ipaddress {
            192.168.77.100/24 dev net1
        }
        track_interface {
            net1
        }
        track_script {
            chk_ip
        }
        notify_master "/etc/keepalived/notify_master_vi_1.sh"
        notify_backup "/etc/keepalived/notify_backup_vi_1.sh"
    }
    vrrp_instance VI_2 {
        state BACKUP
        interface net1
        virtual_router_id 52
        priority 90
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass 2222
        }
        virtual_ipaddress {
            192.168.88.100/24 dev net1
        }
        track_interface {
            net1
        }
        track_script {
            chk_ip 
        }
        notify_master "/etc/keepalived/notify_master_vi_2.sh"
        notify_backup "/etc/keepalived/notify_backup_vi_2.sh"
    }
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: keepalived-backup-scripts
  namespace: $var_namespace
data:
  check_ip.sh: |
    #!/bin/sh

    # Define the local file to record failure
    FAILURE_RECORD_FILE="/tmp/failure_record.txt"

    # is our own check endpoint reachable?
    if curl --max-time 0.1 -s http://192.168.99.82:9376 > /dev/null 2>&1 ; then
      # exit 0
      # continue, only ourself is ok.
      :
    else
      # return Failure (no change in weight)
      # will not be the master
      /bin/rm -f "\$FAILURE_RECORD_FILE"
      exit 1
    fi
    
    # Check if the failure record file exists
    # if so, we will still be the master
    if [ -f "\$FAILURE_RECORD_FILE" ]; then
      exit 0  # return Success (will add weight)
    fi

    # if the peer fails, add weight to take over
    if curl --max-time 0.1 -s http://192.168.99.81:9376 > /dev/null 2>&1 ; then
      # exit 0
      exit 1  # curl ok, return Failure (no change in weight)
    else
      # exit 1
      # Record the failure by creating the file
      touch "\$FAILURE_RECORD_FILE"
      exit 0  # curl fail, return Success (will add weight)
    fi
  notify_master_vi_1.sh: |
    #!/bin/sh
    ip route del default
    ip route add default via 192.168.77.1 dev net1

    # create switch file, to trigger app start
    touch /mnt/switch/77
  notify_backup_vi_1.sh: |
    #!/bin/sh
    ip route del default
    GATEWAY=\$(ip r | grep "10.132.0.0/14" | awk '{print \$3}')
    ip route add default via \$GATEWAY dev eth0   

    # remove switch file, to trigger app stop
    rm /mnt/switch/77
  notify_master_vi_2.sh: |
    #!/bin/sh

    # create switch file, to trigger app start
    touch /mnt/switch/88
  notify_backup_vi_2.sh: |
    #!/bin/sh

    # remove switch file, to trigger app stop
    rm /mnt/switch/88
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tinypod-02
  namespace: $var_namespace
  labels:
    app: tinypod-02
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tinypod-02
  template:
    metadata:
      annotations:
        k8s.v1.cni.cncf.io/networks: '[
          {
            "name": "$var_namespace-macvlan", 
            "_mac": "02:03:04:05:06:07", 
            "_interface": "myiface1", 
            "ips": [
              "192.168.99.92/24"
              ] 
          }
        ]'
      labels:
        app: tinypod-02
        wzh-run: tinypod-testing
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - tinypod-01
              topologyKey: "kubernetes.io/hostname"
      serviceAccountName: keepalived-sa
      initContainers:
      - name: init-permissions
        image: docker.io/busybox
        command: ['sh', '-c', 'cp /etc/keepalived/*.sh /tmp/keepalived/ && chmod 755 /tmp/keepalived/*.sh && chown root:root /tmp/keepalived/*.sh']
        volumeMounts:
        - name: keepalived-scripts
          mountPath: /etc/keepalived
        - name: writable-scripts
          mountPath: /tmp/keepalived
      containers:
      - name: application-container
        image: registry.k8s.io/e2e-test-images/agnhost:2.43
        imagePullPolicy: IfNotPresent
        command: [ "/bin/bash", "/mnt/scripts/startup.sh"]
        volumeMounts:
        - name: share-switch
          mountPath: /mnt/switch
        - name: app-start-script
          mountPath: /mnt/scripts
      - name: keepalived
        image: quay.io/wangzheng422/qimgs:keepalived-2024-09-06-v01
        imagePullPolicy: IfNotPresent
        securityContext:
          # privileged: true
          # runAsUser: 0
          capabilities:
            add: ["NET_ADMIN", "NET_BROADCAST", "NET_RAW"]
        volumeMounts:
        - name: keepalived-config
          mountPath: /etc/keepalived/keepalived.conf
          subPath: keepalived.conf
        - name: writable-scripts
          mountPath: /etc/keepalived
        - name: share-switch
          mountPath: /mnt/switch
      volumes:
      - name: keepalived-config
        configMap:
          name: keepalived-config-backup
      - name: keepalived-scripts
        configMap:
          name: keepalived-backup-scripts
      - name: writable-scripts
        emptyDir: {}
      - name: share-switch
        emptyDir: {}
      - name: app-start-script
        configMap:
          name: app-start-script
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tinypod-02-check
  namespace: $var_namespace
  labels:
    app: tinypod-02-check
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tinypod-02-check
  template:
    metadata:
      annotations:
        k8s.v1.cni.cncf.io/networks: '[
          {
            "name": "$var_namespace-macvlan", 
            "_mac": "02:03:04:05:06:07", 
            "_interface": "myiface1", 
            "ips": [
              "192.168.99.82/24"
              ] 
          }
        ]'
      labels:
        app: tinypod-02-check
        wzh-run: tinypod-check
    spec:
      # schedule on the same node as the tinypod-02 app
      affinity:
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - tinypod-02
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: endpoint-container
        image: registry.k8s.io/e2e-test-images/agnhost:2.43
        imagePullPolicy: IfNotPresent
        command: [ "/agnhost", "serve-hostname"]
EOF

oc apply -f ${BASE_DIR}/data/install/pod-02.yaml

# get the logs of the app containers using the label
oc logs -n $var_namespace -l app=tinypod-01 -c application-container
# Sleeping for 1 second...
# Not all switch files found.
# agnhost process not running.
# Sleeping for 1 second...
# Not all switch files found.
# agnhost process not running.
# Sleeping for 1 second...
# All switch files found.
# Starting agnhost...
# I0922 04:15:22.582473      16 log.go:198] Serving on port 9376.

oc logs -n $var_namespace -l app=tinypod-02 -c application-container
# Sleeping for 1 second...
# Not all switch files found.
# agnhost process not running.
# Sleeping for 1 second...
# Not all switch files found.
# agnhost process not running.
# Sleeping for 1 second...
# Not all switch files found.
# agnhost process not running.
# Sleeping for 1 second...

# Get the list of pod names
pods=$(oc get pods -n $var_namespace -l wzh-run=tinypod-testing -o jsonpath='{.items[*].metadata.name}')

# Loop through each pod and execute the command
# and print its route table
for pod in $pods; do
  echo "Pod: $pod"
  oc exec -it $pod -n $var_namespace -- /bin/sh -c "ip r"
  echo
done
# Pod: tinypod-01-779d86fd54-xn4hb
# Defaulted container "application-container" out of: application-container, keepalived, init-permissions (init)
# default via 192.168.77.1 dev net1
# 10.132.0.0/14 via 10.133.0.1 dev eth0
# 10.133.0.0/23 dev eth0 proto kernel scope link src 10.133.0.13
# 100.64.0.0/16 via 10.133.0.1 dev eth0
# 172.22.0.0/16 via 10.133.0.1 dev eth0
# 192.168.77.0/24 dev net1 proto kernel scope link src 192.168.77.100
# 192.168.88.0/24 dev net1 proto kernel scope link src 192.168.88.100
# 192.168.99.0/24 dev net1 proto kernel scope link src 192.168.99.91

# Pod: tinypod-02-7d44467c79-zwk4h
# Defaulted container "application-container" out of: application-container, keepalived, init-permissions (init)
# default via 10.134.0.1 dev eth0
# 10.132.0.0/14 via 10.134.0.1 dev eth0
# 10.134.0.0/23 dev eth0 proto kernel scope link src 10.134.0.7
# 100.64.0.0/16 via 10.134.0.1 dev eth0
# 172.22.0.0/16 via 10.134.0.1 dev eth0
# 192.168.99.0/24 dev net1 proto kernel scope link src 192.168.99.92
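The route tables above show how the MASTER/BACKUP role can be read directly from each pod: only the MASTER's default route points at the VIP gateway `192.168.77.1` (set by `notify_master_vi_1.sh`), while the BACKUP's default route goes back out via `eth0`. A minimal sketch of that check, fed with the sample route tables captured above (`route_state` is an illustrative helper name, not part of the deployment):

```shell
#!/bin/sh
# Decide MASTER/BACKUP from a route table: the MASTER's default route
# points at the VIP gateway, as set by notify_master_vi_1.sh.
route_state() {
  # $1: output of "ip r" inside the pod
  gw=$(printf '%s\n' "$1" | awk '/^default/ {print $3}')
  if [ "$gw" = "192.168.77.1" ]; then echo MASTER; else echo BACKUP; fi
}

# Sample route tables taken from the two pods above
routes_01="default via 192.168.77.1 dev net1
192.168.99.0/24 dev net1 proto kernel scope link src 192.168.99.91"
routes_02="default via 10.134.0.1 dev eth0
192.168.99.0/24 dev net1 proto kernel scope link src 192.168.99.92"

route_state "$routes_01"   # MASTER
route_state "$routes_02"   # BACKUP
```

In the cluster, the same check can be run against live pods with `oc exec <pod> -n $var_namespace -- ip r`.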


# get the ps -ef result of all app containers using the label
# Get the list of pods with the label wzh-run=tinypod-testing
pods=$(oc get pods -n $var_namespace -l wzh-run=tinypod-testing -o jsonpath='{.items[*].metadata.name}')

# Iterate over each pod and execute the ps -ef command inside the container
for pod in $pods; do
  echo "ps -ef result for pod: $pod"
  oc exec $pod -n $var_namespace -- ip r
  oc exec -n $var_namespace $pod -- ps -ef
  oc exec -n $var_namespace $pod -- ls /mnt/switch
  echo "-----------------------------------"
done

# ps -ef result for pod: tinypod-01-779d86fd54-fqrnp
# Defaulted container "application-container" out of: application-container, keepalived, init-permissions (init)
# default via 192.168.77.1 dev net1
# 10.132.0.0/14 via 10.133.0.1 dev eth0
# 10.133.0.0/23 dev eth0 proto kernel scope link src 10.133.0.25
# 100.64.0.0/16 via 10.133.0.1 dev eth0
# 172.22.0.0/16 via 10.133.0.1 dev eth0
# 192.168.77.0/24 dev net1 proto kernel scope link src 192.168.77.100
# 192.168.88.0/24 dev net1 proto kernel scope link src 192.168.88.100
# 192.168.99.0/24 dev net1 proto kernel scope link src 192.168.99.91
# Defaulted container "application-container" out of: application-container, keepalived, init-permissions (init)
# PID   USER     TIME  COMMAND
#     1 root      0:00 /bin/bash /mnt/scripts/startup.sh
#   214 root      0:00 /agnhost serve-hostname
#   517 root      0:00 sleep 1
#   518 root      0:00 ps -ef
# Defaulted container "application-container" out of: application-container, keepalived, init-permissions (init)
# 77
# 88
# -----------------------------------
# ps -ef result for pod: tinypod-02-7d44467c79-8xpvz
# Defaulted container "application-container" out of: application-container, keepalived, init-permissions (init)
# default via 10.134.0.1 dev eth0
# 10.132.0.0/14 via 10.134.0.1 dev eth0
# 10.134.0.0/23 dev eth0 proto kernel scope link src 10.134.0.16
# 100.64.0.0/16 via 10.134.0.1 dev eth0
# 172.22.0.0/16 via 10.134.0.1 dev eth0
# 192.168.99.0/24 dev net1 proto kernel scope link src 192.168.99.92
# Defaulted container "application-container" out of: application-container, keepalived, init-permissions (init)
# PID   USER     TIME  COMMAND
#     1 root      0:00 /bin/bash /mnt/scripts/startup.sh
#   476 root      0:00 sleep 1
#   483 root      0:00 ps -ef
# Defaulted container "application-container" out of: application-container, keepalived, init-permissions (init)
# -----------------------------------


# scale deployment tinypod-01-check to 0
oc scale deployment tinypod-01-check --replicas=0 -n $var_namespace
oc scale deployment tinypod-02-check --replicas=1 -n $var_namespace


# or scale deployment tinypod-02-check to 0
oc scale deployment tinypod-02-check --replicas=0 -n $var_namespace
oc scale deployment tinypod-01-check --replicas=1 -n $var_namespace

# restore: scale both check deployments back to 1
oc scale deployment tinypod-01-check --replicas=1 -n $var_namespace
oc scale deployment tinypod-02-check --replicas=1 -n $var_namespace
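
Scaling a `*-check` deployment to 0 is what drives the failover: the keepalived `track_script` in each pod curls a check pod, and (as the script fragment at the top of this section shows) the script exits 0 when the curl fails, so keepalived applies the configured weight and the VRRP priorities shift until the VIP migrates. A hedged sketch of that return-code convention, with the probe command injectable so it can run outside the cluster (`chk_peer` and the probe argument are illustrative names, not the exact script from the ConfigMap; the sign and target of the weight come from `keepalived.conf` earlier in this document):

```shell
#!/bin/sh
# Sketch of the track_script convention: exit 0 ("success") when the
# probe fails, so keepalived applies the configured weight.
chk_peer() {
  # $1: probe command, e.g. "curl -s --max-time 1 http://192.168.99.82:9376"
  if $1 >/dev/null 2>&1; then
    return 1   # check pod reachable: script "fails", no weight applied
  else
    return 0   # curl fail, return success (keepalived applies weight)
  fi
}

# Stand-in probes (true/false) instead of curl, to keep the sketch self-contained
chk_peer false && echo "probe failed: weight applied"
chk_peer true  || echo "check pod reachable: no weight change"
```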




# get the ps -ef result of all app containers using the label
# Get the list of pods with the label wzh-run=tinypod-testing
pods=$(oc get pods -n $var_namespace -l wzh-run=tinypod-testing -o jsonpath='{.items[*].metadata.name}')

# Iterate over each pod and execute the ps -ef command inside the container
for pod in $pods; do
  echo "ps -ef result for pod: $pod"
  oc exec $pod -n $var_namespace -- ip r
  oc exec -n $var_namespace $pod -- ps -ef
  oc exec -n $var_namespace $pod -- ls /mnt/switch
  echo "-----------------------------------"
done

# ps -ef result for pod: tinypod-01-779d86fd54-fqrnp
# Defaulted container "application-container" out of: application-container, keepalived, init-permissions (init)
# default via 10.133.0.1 dev eth0
# 10.132.0.0/14 via 10.133.0.1 dev eth0
# 10.133.0.0/23 dev eth0 proto kernel scope link src 10.133.0.25
# 100.64.0.0/16 via 10.133.0.1 dev eth0
# 172.22.0.0/16 via 10.133.0.1 dev eth0
# 192.168.99.0/24 dev net1 proto kernel scope link src 192.168.99.91
# Defaulted container "application-container" out of: application-container, keepalived, init-permissions (init)
# PID   USER     TIME  COMMAND
#     1 root      0:00 /bin/bash /mnt/scripts/startup.sh
#   815 root      0:00 sleep 1
#   816 root      0:00 ps -ef
# Defaulted container "application-container" out of: application-container, keepalived, init-permissions (init)
# -----------------------------------
# ps -ef result for pod: tinypod-02-7d44467c79-8xpvz
# Defaulted container "application-container" out of: application-container, keepalived, init-permissions (init)
# default via 192.168.77.1 dev net1
# 10.132.0.0/14 via 10.134.0.1 dev eth0
# 10.134.0.0/23 dev eth0 proto kernel scope link src 10.134.0.16
# 100.64.0.0/16 via 10.134.0.1 dev eth0
# 172.22.0.0/16 via 10.134.0.1 dev eth0
# 192.168.77.0/24 dev net1 proto kernel scope link src 192.168.77.100
# 192.168.88.0/24 dev net1 proto kernel scope link src 192.168.88.100
# 192.168.99.0/24 dev net1 proto kernel scope link src 192.168.99.92
# Defaulted container "application-container" out of: application-container, keepalived, init-permissions (init)
# PID   USER     TIME  COMMAND
#     1 root      0:00 /bin/bash /mnt/scripts/startup.sh
#   582 root      0:00 /agnhost serve-hostname
#   785 root      0:00 sleep 1
#   792 root      0:00 ps -ef
# Defaulted container "application-container" out of: application-container, keepalived, init-permissions (init)
# 77
# 88
# -----------------------------------

end

demo application image

We need an application container image for the demo, and it needs to restart fast so that we can test keepalived failover quickly.

mkdir -p /data/caddy
cd /data/caddy

echo "wzh hello world" > index.html

cat << EOF > Caddyfile
:9376 {
        # Set this path to your site's directory.
        root * /usr/share/caddy

        # Enable the static file server.
        file_server

}
EOF

cat << EOF > Dockerfile
FROM docker.io/caddy:alpine

COPY index.html /usr/share/caddy/index.html

COPY Caddyfile /etc/caddy/Caddyfile

EOF

podman build -t quay.io/wangzheng422/qimgs:caddy-2024-09-09-v02 .

podman push quay.io/wangzheng422/qimgs:caddy-2024-09-09-v02
