Port asg_node_group to v1.25
* We have some legacy clusters on 1.24 that need to be upgraded before
extended support ends.
* They are still using cluster-autoscaler (not Karpenter)
* The simplest upgrade procedure is to forward-port the asg_node_group
  module.

Note: this module adds a `node_instance_profile` variable, as the cluster
config no longer exposes an instance profile for the node group.

To avoid any issues, this needs to be set to the same value you were
using in the 1.24 `iam_config`.

Also note that the IAM role of nodes in the node group will need to be
manually added to the cluster's `aws_auth_role_map`, as this is no longer
defaulted.
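
For example, a sketch of the new wiring (the profile name is a
placeholder; use the value from your existing 1.24 `iam_config`):

```hcl
module "nodes" {
  source         = "cookpad/eks/aws//modules/asg_node_group"
  cluster_config = module.cluster.config

  # Must match the instance profile the 1.24 iam_config was providing.
  node_instance_profile = "my-cluster-node-instance-profile"
}
```

The IAM role used by that instance profile then needs an entry in the
cluster's `aws_auth_role_map`, mapping it to the `system:bootstrappers`
and `system:nodes` groups.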
errm committed Nov 22, 2024
1 parent 943156d commit 10ff44a
Showing 6 changed files with 670 additions and 0 deletions.
229 changes: 229 additions & 0 deletions modules/asg_node_group/README.md
@@ -0,0 +1,229 @@
# asg_node_group

This module provisions nodes for your cluster by managing AWS auto scaling groups.

## Features

* Will manage spot or on-demand instances.
* Provisions an auto scaling group per availability zone, to support
  applications utilising EBS volumes via PVCs.
* Prepares the auto scaling group(s) to be scaled by the cluster autoscaler.
* Uses the official AWS EKS optimised Amazon Linux AMI.

## Usage

```hcl
module "nodes" {
source = "cookpad/eks/aws//modules/asg_node_group"
cluster_config = module.cluster.config
max_size = 60
instance_family = "memory_optimized"
instance_size = "4xlarge"
}
```

### Instance type selection

There are two ways to choose the instance types launched by the autoscaling
groups:

#### `instance_family` & `instance_size`

The module has 4 preset instance families to choose from (the default is `general_purpose`):

| family | instance types (x86_64) | instance types (arm64) |
|--------|-------------------------|------------------------|
| `memory_optimized` | `r5`, `r5d`, `r5n`, `r5dn`, `r5a`, `r5ad` | `r6g`, `r6gd` |
| `general_purpose` | `m5`, `m5d`, `m5n`, `m5dn`, `m5a`, `m5ad` | `m6g`, `m6gd` |
| `compute_optimized` | `c5`, `c5n`, `c5d` | `c6g`, `c6gn`, `c6gd`, `c7g` |
| `burstable` | `t3`, `t3a` | `t4g` |

This is combined with `instance_size` to choose the instance types that the
group will launch.

These preset families are useful when utilising spot instances, as the
diversity of instance types helps avoid the effects of price spikes in any
single spot pool.

When using on-demand instances diversity is not required, so only the
first instance type in the family is used.

e.g.
```hcl
module "nodes" {
source = "cookpad/eks/aws//modules/asg_node_group"
cluster_config = module.cluster.config
instance_family = "compute_optimized"
instance_lifecycle = "on_demand"
}
```

#### `instance_family` & `instance_types`

Alternatively, `instance_types` can be used to provide a list of the exact
instance types that will be launched; in this case `instance_family` and
`instance_size` are only used to form part of the ASG name.

e.g.
```hcl
module "nodes" {
source = "cookpad/eks/aws//modules/asg_node_group"
cluster_config = module.cluster.config
max_size = 16
instance_family = "io_optimised"
instance_size = "xlarge"
instance_types = ["i3.xlarge", "i3en.xlarge"]
}
```

### GPU Nodes

In order to use a GPU optimised AMI, set the `gpu` variable to `true`.

It is recommended to set the `k8s.amazonaws.com/accelerator` label (via the
`labels` variable) to prevent the cluster autoscaler from adding too many
nodes whilst the GPU driver is initialising. See https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler/cloudprovider/aws#gpu-node-groups for more info.

If you are running mixed workloads on your cluster, you could
add a taint to your GPU nodes to avoid running non-GPU workloads on expensive
GPU instances.

Note: currently you would need to manually add the appropriate toleration
to your workloads, as EKS doesn't enable the `ExtendedResourceToleration`
admission controller, see: https://github.com/aws/containers-roadmap/issues/739

```hcl
module "gpu_nodes" {
source = "cookpad/eks/aws//modules/asg_node_group"
cluster_config = module.cluster.config
gpu = true
instance_family = "gpu"
instance_size = "2xlarge"
instance_types = ["p3.2xlarge"]
labels = {
"k8s.amazonaws.com/accelerator" = "nvidia-tesla-v100"
}
taints = {
"nvidia.com/gpu" = "gpu:NoSchedule"
}
}
```

### Labels & taints

You can provide Kubernetes labels and/or taints for the nodes, to give you
some control over where your workloads are scheduled.

* https://kubernetes.io/docs/concepts/configuration/assign-pod-node/
* https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/

e.g.
```hcl
module "nodes" {
source = "cookpad/eks/aws//modules/asg_node_group"
cluster_config = module.cluster.config
labels = {
"cookpad.com/environment_name" = "production"
"cookpad.com/department" = "machine-learning"
}
taints = {
"dedicated" = "gpu:PreferNoSchedule"
}
}
```

### Volume size

You can configure the root volume size (it defaults to 40 GiB).

e.g.

```hcl
module "nodes" {
source = "cookpad/eks/aws//modules/asg_node_group"
cluster_config = module.cluster.config
root_volume_size = 10
}
```

### Zone awareness

By default the module provisions one ASG per availability zone, so that the
cluster autoscaler can create instances in a particular zone.

If this is not required you can disable this behaviour, and the module will
create a single ASG that launches instances in any of your cluster's
availability zones.

e.g.

```hcl
module "nodes" {
source = "cookpad/eks/aws//modules/asg_node_group"
cluster_config = module.cluster.config
zone_awareness = false
}
```

### Security groups

The module automatically applies the node security group provided by the
cluster module to each node. This allows the nodes to communicate with the
control plane, and allows intra-cluster communication between pods running
on the cluster.

If you need to add any additional security groups, e.g. for SSH access,
configure `security_groups` with the security group IDs.
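
e.g. a sketch, where `aws_security_group.ssh` is a hypothetical security
group defined elsewhere in your configuration:

```hcl
module "nodes" {
  source         = "cookpad/eks/aws//modules/asg_node_group"
  cluster_config = module.cluster.config

  # Added alongside the node security group from the cluster module.
  security_groups = [aws_security_group.ssh.id]
}
```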

### SSH key

Set `key_name` to configure an SSH key pair.
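
e.g. (`"my-key-pair"` is a placeholder for an existing EC2 key pair in the
same region):

```hcl
module "nodes" {
  source         = "cookpad/eks/aws//modules/asg_node_group"
  cluster_config = module.cluster.config

  # Placeholder: the name of an existing EC2 key pair.
  key_name = "my-key-pair"
}
```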

### Cloud config

The module configures the instance user data to use cloud config to add
each node to the cluster, via the EKS bootstrap script, as well as setting
the instance's Name tag.

If you need to provide any additional cloud config, it will be merged with
the config generated by the module,
see https://cloudinit.readthedocs.io/en/latest/topics/merging.html for more info.
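
A sketch, assuming the module exposes a `cloud_config` variable for the
additional document (the variable name here is an assumption, check the
module's `variables.tf`):

```hcl
module "nodes" {
  source         = "cookpad/eks/aws//modules/asg_node_group"
  cluster_config = module.cluster.config

  # Assumed variable name: an extra #cloud-config document that
  # cloud-init merges with the user data generated by the module.
  cloud_config = <<-EOF
    #cloud-config
    packages:
      - htop
  EOF
}
```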

### Bottlerocket

[Bottlerocket](https://github.com/bottlerocket-os/bottlerocket) is a free and open-source Linux-based operating system meant for hosting containers.

To use Bottlerocket, set the `bottlerocket` variable to `true`.

```hcl
module "bottlerocket_nodes" {
source = "cookpad/eks/aws//modules/asg_node_group"
cluster_config = module.cluster.config
bottlerocket = true
}
```
⚠️ Bottlerocket now [supports GPU nodes](https://github.com/bottlerocket-os/bottlerocket/blob/develop/QUICKSTART-EKS.md#aws-k8s--nvidia-variants), set `gpu = true` to enable them. Ensure that you set `instance_types` to a GPU instance type.
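
For example, a sketch of a Bottlerocket GPU group (the instance type is
illustrative, pick one supported by the NVIDIA variant):

```hcl
module "bottlerocket_gpu_nodes" {
  source         = "cookpad/eks/aws//modules/asg_node_group"
  cluster_config = module.cluster.config
  bottlerocket   = true
  gpu            = true
  instance_types = ["g4dn.xlarge"]
}
```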

📝 If you want to get a shell session on your instances via Bottlerocket's SSM agent
you will need to attach the `arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore` policy
to your node instance profile. If you use the `cookpad/eks/aws//modules/iam` module to
provision your node role, then this is done by default!
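
If you manage the node role yourself, the attachment looks something like
this sketch (`aws_iam_role.node` is a hypothetical role resource):

```hcl
# Attach the SSM managed policy to a hypothetical node role so the
# Bottlerocket SSM agent can open shell sessions.
resource "aws_iam_role_policy_attachment" "node_ssm" {
  role       = aws_iam_role.node.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
}
```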

### IMDSv2 instead of v1, to prevent attackers from obtaining AWS credentials from nodes

By default IMDSv2 is required, controlled by the `nodes_metadata_http_tokens` variable.

⚠️ If you are using kube2iam, change the value to `"optional"` (see the
[Terraform launch template metadata options](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/launch_template#metadata-options)).
Once no clusters are using kube2iam, this variable can be removed and the
token made unconditionally required.
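
e.g. a sketch for a cluster that still runs kube2iam:

```hcl
module "nodes" {
  source         = "cookpad/eks/aws//modules/asg_node_group"
  cluster_config = module.cluster.config

  # Allow IMDSv1 responses so kube2iam can proxy metadata requests.
  nodes_metadata_http_tokens = "optional"
}
```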
14 changes: 14 additions & 0 deletions modules/asg_node_group/bottlerocket_config.toml.tpl
@@ -0,0 +1,14 @@
[settings.kubernetes]
cluster-name = "${cluster_name}"
api-server = "${cluster_endpoint}"
cluster-certificate = "${cluster_ca_data}"
[settings.kubernetes.node-labels]
${node_labels}
[settings.kubernetes.node-taints]
${node_taints}
[settings.host-containers.admin]
enabled = ${admin_container_enabled}
superpowered = ${admin_container_superpowered}
%{ if admin_container_source != "" }
source = "${admin_container_source}"
%{ endif }
7 changes: 7 additions & 0 deletions modules/asg_node_group/cloud_config.tpl
@@ -0,0 +1,7 @@
## template: jinja
#cloud-config
fqdn: eks-node-${cluster_name}-{{ v1.instance_id }}
runcmd:
- [aws, --region={{ v1.region }}, ec2, create-tags, --resources={{ v1.instance_id }}, "--tags=Key=Name,Value=eks-node-${cluster_name}-{{ v1.instance_id }}"]
- [systemctl, restart, docker]
- [/etc/eks/bootstrap.sh, ${cluster_name}, --kubelet-extra-args, '--node-labels=${labels} --register-with-taints="${taints}"']
