docs(workshop): add documentation for kubecon optional steps #568

Merged
merged 28 commits into from
Nov 10, 2024
add documentation for kubecon optional steps
Charlie McBride committed Nov 8, 2024
commit 848509a7bb9bfa0348bbdca17a366f380eae1e10
238 changes: 238 additions & 0 deletions docs/workshops/12_scheduling_constraints.md
@@ -0,0 +1,238 @@

## Deploy NodePool:

### 2. Percentage-Base Disruption

Use the following command instead of the NodePool deployment listed under `2. Percentage-Base Disruption` of `Scheduling Constraints`. It deploys a `NodePool` and an `AKSNodeClass`, with the NodePool's disruption budget set to `40%`.

```bash
cd ~/environment/karpenter
cat > ndb-nodepool.yaml << EOF
# This example NodePool will provision general purpose instances
---
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
  annotations:
    kubernetes.io/description: "Basic NodePool for generic workloads"
spec:
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 30s
    budgets:
    - nodes: "40%"
  limits:
    cpu: "20"
  template:
    metadata:
      labels:
        # required for Karpenter to predict overhead from cilium DaemonSet
        kubernetes.azure.com/ebpf-dataplane: cilium
        eks-immersion-team: my-team
    spec:
      expireAfter: 720h # 30 days
      startupTaints:
        # https://karpenter.sh/docs/concepts/nodepools/#cilium-startup-taint
        - key: node.cilium.io/agent-not-ready
          effect: NoExecute
          value: "true"
      requirements:
        - key: karpenter.azure.com/sku-family
          operator: In
          values: [D]
        - key: karpenter.azure.com/sku-cpu
          operator: Lt
          values: ["3"]
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: kubernetes.io/os
          operator: In
          values: ["linux"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
      nodeClassRef:
        group: karpenter.azure.com
        kind: AKSNodeClass
        name: default
---
apiVersion: karpenter.azure.com/v1alpha2
kind: AKSNodeClass
metadata:
  name: default
  annotations:
    kubernetes.io/description: "Basic AKSNodeClass for running Ubuntu2204 nodes"
spec:
  imageFamily: Ubuntu2204
EOF

kubectl apply -f ndb-nodepool.yaml
```

```
nodepool.karpenter.sh/default created
aksnodeclass.karpenter.azure.com/default created
```
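
As a rough illustration of what a `40%` budget permits: Karpenter's documentation describes percentage budgets as being rounded up to a whole number of nodes. The sketch below assumes that rounding rule and uses a hypothetical helper name, `allowed_disruptions`, which is not part of Karpenter or the workshop. It also ignores nodes that are already being deleted or are not ready, which Karpenter takes into account as well.

```bash
# Hypothetical helper (not part of Karpenter or the workshop):
# estimate how many nodes a percentage budget allows to be disrupted at once.
# Assumption: the percentage is rounded up to the next whole node.
allowed_disruptions() {
  local total_nodes=$1 percent=$2
  # Integer ceiling of total_nodes * percent / 100
  echo $(( (total_nodes * percent + 99) / 100 ))
}

allowed_disruptions 10 40   # 4
allowed_disruptions 3 40    # 2 (1.2 rounded up)
```

With small NodePools the rounding matters: under this assumption, even a 3-node pool would allow 2 nodes to be disrupted at the same time.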

### 3. Multiple Budget Policies

Use the following command instead of the first NodePool deployment listed under `3. Multiple Budget Policies` of `Scheduling Constraints`. It updates the `NodePool` to add a second budget that caps disruptions at `2` nodes, and a scheduled budget that blocks all disruptions during a defined window.

> Note: Change the `schedule` value so that the no-disruption window starts at the current UTC time. The cron expression is evaluated in UTC.
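
One way to derive that value, as a sketch: the hypothetical helper `budget_schedule` below (not part of the workshop) prints a daily cron expression that starts at minute 0 of the current UTC hour, so the 3-hour window is already active when the manifest is applied. It assumes a bash shell and a `date` command that supports `-u` and `+%H`.

```bash
# Hypothetical helper: build a daily cron expression that starts at minute 0
# of the given hour (defaults to the current UTC hour).
budget_schedule() {
  local hour=${1:-$(date -u +%H)}
  # 10# forces base 10 so hours like "08" and "09" are not parsed as octal
  echo "0 $((10#$hour)) * * *"
}

budget_schedule      # for example "0 21 * * *" when run between 21:00 and 21:59 UTC
budget_schedule 07   # 0 7 * * *
```

Copy the printed expression into the `schedule` field of the manifest before applying it.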

```bash
cd ~/environment/karpenter
cat > ndb-nodepool.yaml << EOF
# This example NodePool will provision general purpose instances
---
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
  annotations:
    kubernetes.io/description: "Basic NodePool for generic workloads"
spec:
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 30s
    budgets:
    - nodes: "40%"
      reasons:
      - "Empty"
      - "Drifted"
    - nodes: "2"
    - nodes: "0"
      schedule: "0 21 * * *" # modify this line to the current UTC time
      duration: 3h
  limits:
    cpu: "40"
  template:
    metadata:
      labels:
        # required for Karpenter to predict overhead from cilium DaemonSet
        kubernetes.azure.com/ebpf-dataplane: cilium
        eks-immersion-team: my-team
    spec:
      expireAfter: 720h # 30 days
      startupTaints:
        # https://karpenter.sh/docs/concepts/nodepools/#cilium-startup-taint
        - key: node.cilium.io/agent-not-ready
          effect: NoExecute
          value: "true"
      requirements:
        - key: karpenter.azure.com/sku-family
          operator: In
          values: [D]
        - key: karpenter.azure.com/sku-cpu
          operator: Lt
          values: ["3"]
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: kubernetes.io/os
          operator: In
          values: ["linux"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
      nodeClassRef:
        group: karpenter.azure.com
        kind: AKSNodeClass
        name: default
---
apiVersion: karpenter.azure.com/v1alpha2
kind: AKSNodeClass
metadata:
  name: default
  annotations:
    kubernetes.io/description: "Basic AKSNodeClass for running Ubuntu2204 nodes"
spec:
  imageFamily: Ubuntu2204
EOF

kubectl apply -f ndb-nodepool.yaml
```

```
nodepool.karpenter.sh/default created
aksnodeclass.karpenter.azure.com/default created
```
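
When several budgets apply at once, Karpenter's documentation describes the most restrictive one as winning. The sketch below illustrates that rule for the three budgets in this manifest, assuming a NodePool with 10 nodes (so the `40%` budget works out to 4). The helper name `effective_budget` is hypothetical and only models the "take the minimum" step.

```bash
# Hypothetical helper: the effective limit is the smallest of all budgets
# that apply to a given disruption reason at the current time.
effective_budget() {
  local min=$1 n
  for n in "$@"; do
    if (( n < min )); then
      min=$n
    fi
  done
  echo "$min"
}

# Assumed 10 nodes: the 40% budget allows 4 (Empty/Drifted only), the second budget allows 2.
effective_budget 4 2      # Empty or Drifted, outside the scheduled window: 2
effective_budget 2        # other reasons, outside the scheduled window: 2
effective_budget 4 2 0    # any reason, inside the scheduled window: 0
```

This is why no nodes are disrupted while the scheduled `nodes: "0"` budget is active, regardless of the other two budgets.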

Use the following command instead of the second NodePool deployment listed under `3. Multiple Budget Policies` of `Scheduling Constraints`. It removes the scheduled budget that was blocking all disruptions.

```bash
cd ~/environment/karpenter
cat > ndb-nodepool.yaml << EOF
# This example NodePool will provision general purpose instances
---
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
  annotations:
    kubernetes.io/description: "Basic NodePool for generic workloads"
spec:
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 30s
    budgets:
    - nodes: "40%"
      reasons:
      - "Empty"
      - "Drifted"
    - nodes: "2"
  limits:
    cpu: "10"
  template:
    metadata:
      labels:
        # required for Karpenter to predict overhead from cilium DaemonSet
        kubernetes.azure.com/ebpf-dataplane: cilium
        eks-immersion-team: my-team
    spec:
      expireAfter: 720h # 30 days
      startupTaints:
        # https://karpenter.sh/docs/concepts/nodepools/#cilium-startup-taint
        - key: node.cilium.io/agent-not-ready
          effect: NoExecute
          value: "true"
      requirements:
        - key: karpenter.azure.com/sku-family
          operator: In
          values: [D]
        - key: karpenter.azure.com/sku-cpu
          operator: Lt
          values: ["3"]
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: kubernetes.io/os
          operator: In
          values: ["linux"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
      nodeClassRef:
        group: karpenter.azure.com
        kind: AKSNodeClass
        name: default
---
apiVersion: karpenter.azure.com/v1alpha2
kind: AKSNodeClass
metadata:
  name: default
  annotations:
    kubernetes.io/description: "Basic AKSNodeClass for running Ubuntu2204 nodes"
spec:
  imageFamily: Ubuntu2204
EOF

kubectl apply -f ndb-nodepool.yaml
```

```
nodepool.karpenter.sh/default created
aksnodeclass.karpenter.azure.com/default created
```
72 changes: 72 additions & 0 deletions docs/workshops/13_disruption_controls.md
@@ -0,0 +1,72 @@
## Deploy NodePool:

Use the following command to deploy a `NodePool` and an `AKSNodeClass` for Disruption Controls. Here `expireAfter` is set to 2 minutes, so Karpenter will try to remove each node 2 minutes after it is created.

> Note: We've set `terminationGracePeriod` in addition to `expireAfter` here. Together they define an absolute maximum on the lifetime of a node: deletion starts at `expireAfter`, and draining should finish within the `terminationGracePeriod` that follows. Anything that blocks eviction, such as PDBs or the `do-not-disrupt` annotation, will hold up draining only until the `terminationGracePeriod` is reached.

```bash
cd ~/environment/karpenter
cat > eviction.yaml << EOF
# This example NodePool will provision general purpose instances
---
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
  annotations:
    kubernetes.io/description: "Basic NodePool for generic workloads"
spec:
  disruption:
    consolidationPolicy: WhenEmpty
    consolidateAfter: 30s
  limits:
    cpu: "10"
  template:
    metadata:
      labels:
        # required for Karpenter to predict overhead from cilium DaemonSet
        kubernetes.azure.com/ebpf-dataplane: cilium
        eks-immersion-team: my-team
    spec:
      expireAfter: 2m0s
      terminationGracePeriod: 2m0s
      startupTaints:
        # https://karpenter.sh/docs/concepts/nodepools/#cilium-startup-taint
        - key: node.cilium.io/agent-not-ready
          effect: NoExecute
          value: "true"
      requirements:
        - key: karpenter.azure.com/sku-family
          operator: In
          values: [D]
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: kubernetes.io/os
          operator: In
          values: ["linux"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
      nodeClassRef:
        group: karpenter.azure.com
        kind: AKSNodeClass
        name: default
---
apiVersion: karpenter.azure.com/v1alpha2
kind: AKSNodeClass
metadata:
  name: default
  annotations:
    kubernetes.io/description: "Basic AKSNodeClass for running Ubuntu2204 nodes"
spec:
  imageFamily: Ubuntu2204
EOF

kubectl apply -f eviction.yaml
```

```
nodepool.karpenter.sh/default created
aksnodeclass.karpenter.azure.com/default created
```
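
To make the "absolute maximum" in the note concrete, the sketch below adds the two durations from this manifest. The helper name `max_node_lifetime` is hypothetical, and the calculation assumes the upper bound is simply `expireAfter` plus `terminationGracePeriod`, both given in seconds.

```bash
# Hypothetical helper: upper bound on a node's lifetime, in seconds,
# assuming deletion starts at expireAfter and draining is cut off
# once terminationGracePeriod has elapsed.
max_node_lifetime() {
  local expire_after=$1 termination_grace_period=$2
  echo $(( expire_after + termination_grace_period ))
}

max_node_lifetime 120 120   # 240 seconds, i.e. 4 minutes for this NodePool
```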
36 changes: 34 additions & 2 deletions docs/workshops/kubecon_azure_track.md
@@ -1,7 +1,7 @@
Table of contents:
- [Overview](#overview)
- [Basic Cheet Sheet](#basic-cheet-sheet)
- [Adjustments](#adjustments)
- [Main Topics](#main-topics)
- [Step: Install Karpenter](#step-install-karpenter)
- [Step: Basic NodePool](#step-basic-nodepool)
- [Step: Scaling Application](#step-scaling-application)
@@ -12,6 +12,10 @@ Table of contents:
- [Step: Consolidation](#step-consolidation)
- [Step: Single Node Consolidation](#step-single-node-consolidation)
- [Step: Multi Node Consolidation](#step-multi-node-consolidation)
- [Bonus Content (optional)](#bonus-content-optional)
- [Step: Scheduling Constraints](#step-scheduling-constraints)
- [Step: Disruption Control](#step-disruption-control)


## Overview

@@ -23,7 +27,7 @@ To follow along using this workshop, simply go through the steps detailed in thi

When you see `eks-node-viewer` use `aks-node-viewer` instead.

## Adjusted Instructions
## Main Topics

### Step: [Install Karpenter](https://catalog.workshops.aws/karpenter/en-US/install-karpenter)

@@ -107,3 +111,31 @@ When you see `eks-node-viewer` use `aks-node-viewer` instead.
kubectl delete aksnodeclass default
```
- The same concepts within the workshop generally translate to AKS, but with different instances/pricing. However, for the deployment step of the NodePool, use the new deployment command with consolidation enabled, found in [10_multi_node_consolidation.md](https://github.com/Azure/karpenter-provider-azure/tree/main/docs/workshops/10_multi_node_consolidation.md)

## Bonus Content (optional)

Everything beyond this point is optional.

### Step: [Scheduling Constraints](https://catalog.workshops.aws/karpenter/en-US/scheduling-constraints#how-does-it-work)

> Concepts translate to Azure.

### Step: [NodePool Disruption Budgets](https://catalog.workshops.aws/karpenter/en-US/scheduling-constraints/nodepool-disruption-budgets)

- Adjustments:
    - In the initial cleanup, replace the command that cleans up the `ec2nodeclass` with:
        > Note: this command might pause for a few seconds
        ```bash
        kubectl delete aksnodeclass default
        ```
    - The same concepts within the workshop generally translate to AKS. However, for the 3 NodePool deployment commands, use the replacement deployment commands listed in [12_scheduling_constraints.md](https://github.com/Azure/karpenter-provider-azure/tree/main/docs/workshops/12_scheduling_constraints.md)

### Step: [Disruption Control](https://catalog.workshops.aws/karpenter/en-US/scheduling-constraints/disable-eviction)

- Adjustments:
    - In the initial cleanup, replace the command that cleans up the `ec2nodeclass` with:
        > Note: this command might pause for a few seconds
        ```bash
        kubectl delete aksnodeclass default
        ```
    - The same concepts within the workshop generally translate to AKS. However, for the deployment step of the NodePool, use the deployment command found in [13_disruption_controls.md](https://github.com/Azure/karpenter-provider-azure/tree/main/docs/workshops/13_disruption_controls.md)