Skip to content

Latest commit

 

History

History
534 lines (409 loc) · 38.3 KB

README.md

File metadata and controls

534 lines (409 loc) · 38.3 KB

Jenkins X EKS Module

This repository contains a Terraform module for creating an EKS cluster and all the necessary infrastructure to install Jenkins X as described in https://jenkins-x.io/v3/admin/platforms/eks/.

What is a Terraform module

A Terraform module refers to a self-contained package of Terraform configurations that are managed as a group. For more information about modules refer to the Terraform documentation.

How do you use this module

Prerequisites

This Terraform module allows you to create an EKS cluster ready for the installation of Jenkins X. You need the following binaries locally installed and configured on your PATH:

  • terraform (>= 1.0.0, < 2.0.0)
  • kubectl (>= 1.10)
  • aws-cli
  • helm (>= 3.0)

Cluster provisioning

From version 3.0.0 this module creates neither the EKS cluster nor the VPC.

We recommend using the Terraform modules terraform-aws-modules/eks/aws to create the cluster and terraform-aws-modules/vpc/aws to create the VPC.

A Jenkins X ready cluster can be provisioned using the configuration in jx3-terraform-eks as described in https://jenkins-x.io/v3/admin/platforms/eks/.

All s3 buckets created by the module use Server-Side Encryption with Amazon S3-Managed Encryption Keys (SSE-S3) by default. You can set the value of use_kms_s3 to true to use server-side encryption with AWS KMS (SSE-KMS). If you don't specify the value of s3_kms_arn, then the default aws managed cmk is used (aws/s3)

⚠️ Note: Using AWS KMS with customer managed keys has cost considerations.

You should have your AWS CLI configured correctly.

AWS_REGION

In addition, you should make sure to specify the region via the AWS_REGION environment variable. e.g. export AWS_REGION=us-east-1 and the region variable (make sure the region variable matches the environment variable)

The IAM user does not need any permissions attached to it.

Once you have your initial configuration, you can apply it by running:

terraform init
terraform apply

This creates an EKS cluster with all possible configuration options defaulted.

⚠️ Note: This example is for getting up and running quickly. It is not intended for a production cluster. Refer to Production cluster considerations for things to consider when creating a production cluster.

Migrating to current version of module from a version prior to 3.0.0

If you already have created an EKS cluster using a pre 3.0.0 version of this module there is unfortunately no easy way to upgrade without recreating the cluster. If you already create the cluster in some other way and now set create_eks = false you only n eed to remove some inputs. I won't cover that much simpler case here.

While it would be a bit easier if you started using the same version of terraform-aws-modules/eks/aws as previously used by this module we would advise against that. The reason is that this version is very old and doesn't support a lot of feature currently available with AWS.

Let's say you created your cluster using an old version of the template and change your configuration to a current version. If you then run terraform plan you will see that basically everything would be destroyed and then created. To mitigate that you can move resources in the terraform state to the new addresses. In some cases there are no corresponding new address, instead you are better off removing resources to avoid that they get destroyed before the new resources are created. This means that you need to remove those cloud resources manually later. You can also tweak configurations to prevent resources from be replaced. If you check the output from terraform plan you will see that resources marked as "must be replaced" have one or more inputs with the comment "# forces replacement". If it is a resource that you need to keep to prevent disruption or data loss you should try to tweak the configuration so that the inputs value is reverted to what it was before.

terraform state mv module.eks-jx.random_pet.current  random_pet.current # Only needed if cluster_name wasn't specified
terraform state mv module.eks-jx.module.cluster.module.vpc module.vpc # Only needed if create_vpc wasn't false
terraform state mv module.eks-jx.module.cluster.module.eks module.eks
terraform state mv 'module.eks.aws_iam_role.cluster[0]' 'module.eks.aws_iam_role.this[0]'

# If the following two commands fail it is because you are migrating from a version of this module that didn't 
# create these resource. That is not a problem, but if you have installed the add on in some other way you will 
# need to issue some other terraform command: either "terraform state mv" command or "terragrunt import"
terraform state mv 'module.eks-jx.module.cluster.aws_eks_addon.ebs_addon[0]' 'module.eks.aws_eks_addon.this["aws-ebs-csi-driver"]'
terraform state mv module.eks-jx.module.cluster.module.ebs_csi_irsa_role module.ebs_csi_irsa_role

# Removing the following resources from the state prevent terraform apply from destroying existing node groups and 
# related resources before new ones are created. But this means that ypu need to delete the resources manually later.  
terraform state rm module.eks.module.node_groups 
terraform state rm 'module.eks.aws_iam_role_policy_attachment.workers_AmazonEC2ContainerRegistryReadOnly[0]' 'module.eks.aws_iam_role_policy_attachment.workers_AmazonEKSWorkerNodePolicy[0]' 'module.eks.aws_iam_role_policy_attachment.workers_AmazonEKS_CNI_Policy[0]'
terraform state rm 'module.eks.aws_security_group.workers[0]'  'module.eks.aws_iam_role.workers[0]'
terraform state rm $(terraform state list | grep aws_security_group_rule.workers)
terraform state rm $(terraform state list | grep aws_security_group_rule.cluster) 

In main.tf some tweaks are needed. Add the following inputs to the module eks

  prefix_separator                   = ""
  iam_role_name                      = local.cluster_name
  cluster_security_group_name        = local.cluster_name
  cluster_security_group_description = "EKS cluster security group."

Cluster add ons

If you already create cluster addons with terraform you can either remove the corresponding addon from the cluster_addons input of the eks module or use terraform state mv to change the address in the state file and thus prevent destroying and creating the add-on.

aws-auth config map

If you have configured the config map aws-auth by setting any of the inputs map_accounts, map_roles or map_users you will need to either configure aws-auth ins some other way, see https://registry.terraform. io/modules/terraform-aws-modules/eks/aws/20.20.0/submodules/aws-auth or switch to using access entries. See the documentation for the input access_entries in terraform-aws-modules/eks/aws and the AWS documentation.

If you keep aws-auth you should remove the old configuration, so the config map isn't deleted temporarily during terraform apply:

terraform state rm 'module.eks.kubernetes_config_map.aws_auth[0]'

Cluster Autoscaling

This does not automatically install cluster-autoscaler, it installs all of the prerequisite policies and roles required to install autoscaler.

Create a pull request for your cluster repository the changes created by the following command (with the root of your cluster repo as current directory):

jx gitops helmfile add --chart  autoscaler/cluster-autoscaler --repository https://kubernetes.github.io/autoscaler  --namespace kube-system

In the file kube-system/helmfile.yaml you should now configure a version of cluster autoscaler suitable for your version of Kubernetes by adding values for the chart:

- chart: autoscaler/cluster-autoscaler
  values:
  - image:
      tag: v1.30.0

Notice the image tag is v1.30.0 - this tag goes with clusters running Kubernetes 1.30. If you are running another version, you will need to find the image tag that matches your cluster version.

Open the Cluster Autoscaler releases page and find the latest Cluster Autoscaler version that matches your cluster's Kubernetes major and minor version. For example, if your cluster's Kubernetes version is 1.29 find the latest Cluster Autoscaler release that begins with 1.29. Use the semantic version number (1.29.3 for example) for that release to form the tag.

Other values to configure for the chart (apart from image.tag) can be seen in the documentation.

The verify pipeline for the cluster repository will add some default values to helmfile.yaml. When this is done the PR can be merged by approving it.

⚠️ Note: If you later on remove helmfiles/kube-system/helmfile.yaml from the root helmfiles.yaml the jx boot job will try to remove the kube-system namespace, which would make the Kubernetes cluster non-functional. To prevent this you would need to remove the label gitops.jenkins-x.io/pipeline from the kube-system namespace (i.e. run kubectl label ns kube-system gitops.jenkins-x.io/pipeline-) before the change to the root helmfiles.yaml.

Long Term Storage

You can choose to create S3 buckets for long term storage of Jenkins X build artefacts with enable_logs_storage, enable_reports_storage and enable_repository_storage.

During terraform apply the enabled S3 buckets are created, and the jx_requirements output will contain the following section:

storage:
  logs:
    enabled: ${enable_logs_storage}
    url: s3://${logs_storage_bucket}
  reports:
    enabled: ${enable_reports_storage}
    url: s3://${reports_storage_bucket}
  repository:
    enabled: ${enable_repository_storage}
    url: s3://${repository_storage_bucket}

If you just want to experiment with Jenkins X, you can set the variable force_destroy to true. This allows you to remove all generated buckets when running terraform destroy.

⚠️ Note: If you set force_destroy to false, and run a terraform destroy, it will fail. In that case empty the s3 buckets from the aws s3 console, and re run terraform destroy.

⚠️ Note: A notice from Amazon: Amazon S3 will automatically enable S3 Block Public Access and disable access control lists for all new buckets starting in April 2023. To accomodate this acl setting was removed for buckets and the enable_acl variable was introduced and set to false (default). If the requirement is to provide ACL with bucket ownership conrols for the bucket, then set the enable_acl variable to true.

Secrets Management

Vault is the default tool used by Jenkins X for managing secrets. Part of this module's responsibilities is the installation of Vault Operator which in turn install vault.

You can also configure an existing Vault instance for use with Jenkins X. In this case

  • provide the Vault URL via the vault_url input variable
  • set the boot_secrets in main.tf to this value:
boot_secrets = [
    {
      name  = "jxBootJobEnvVarSecrets.EXTERNAL_VAULT"
      value = "true"
      type  = "string"
    },
    {
      name  = "jxBootJobEnvVarSecrets.VAULT_ADDR"
      value = "https://enter-your-vault-url:8200"
      type  = "string"
    }
  ]
  • follow the Jenkins X documentation around the installation of an external Vault instance.

To use AWS Secrets Manager instead of vault, set use_vault variable to false, and use_asm variable to true. You will also need a role that grants access to AWS Secrets Manager, this will be created for you by setting create_asm_role variable to true. Setting the above variables will add the asm role arn to the boot job service account, which is required for the boot job to interact with AWS secrets manager to populate secrets.

NGINX

The module can install the nginx chart by setting create_nginx flag to true. Example can be found here. You can specify a nginx_values.yaml file or the module will use the default one stored here. If you are using terraform to create nginx resources, do not use the chart specified in the versionstream. Remove the entry in the helmfile.yaml referencing the nginx chart

path: helmfiles/nginx/helmfile.yaml

ExternalDNS

You can enable ExternalDNS with the enable_external_dns variable. This modifies the generated jx-requirements.yml file to enable External DNS when running jx boot.

If enable_external_dns is true, additional configuration is required.

If you want to use a domain with an already existing Route 53 Hosted Zone, you can provide it through the apex_domain variable:

This domain will be configured in the jx_requirements output in the following section:

ingress:
  domain: ${domain}
  ignoreLoadBalancer: true
  externalDNS: ${enable_external_dns}

If you want to use a subdomain and have this module create and configure a new Hosted Zone with DNS delegation, you can provide the following variables:

subdomain: This subdomain is added to the apex domain and configured in the resulting jx-requirements.yml file.

create_and_configure_subdomain: This flag instructs the script to create a new Route53 Hosted Zone for your subdomain and configure DNS delegation with the apex domain.

By providing these variables, the script creates a new Route 53 HostedZone that looks like <subdomain>.<apex_domain>, then it delegates the resolving of DNS to the apex domain. This is done by creating a NS RecordSet in the apex domain's Hosted Zone with the subdomain's HostedZone nameservers.

This ensures that the newly created HostedZone for the subdomain is instantly resolvable instead of having to wait for DNS propagation.

cert-manager

You can enable cert-manager to use TLS for your cluster through LetsEncrypt with the enable_tls variable.

LetsEncrypt has two environments, staging and production.

If you use staging, you will receive self-signed certificates, but you are not rate-limited, if you use the production environment, you receive certificates signed by LetsEncrypt, but you can be rate limited.

You can choose to use the production environment with the production_letsencrypt variable:

You need to provide a valid email to register your domain in LetsEncrypt with tls_email.

Customer's CA certificates

Customer has got signed certificates from CA and want to use it instead of LetsEncrypt certificates. Terraform creates k8s tls-ingress-certificates-ca secret with tls_key and tls_cert in default namespace. User should define:

enable_external_dns = true
apex_domain         = "office.com"
subdomain           = "subdomain"
enable_tls          = true
tls_email           = "[email protected]"

// Signed Certificate must match the domain: *.subdomain.office.com
tls_cert            = "/opt/CA/cert.crt"
tls_key             = "LS0tLS1C....BLRVktLS0tLQo="

Velero Backups

This module can set up the resources required for running backups with Velero on your cluster by setting the flag enable_backup to true.

Enabling backups on pre-existing clusters

If your cluster is pre-existing and already contains a namespace named velero, then enabling backups will initially fail with an error that you are trying to create a namespace which already exists.

Error: namespaces "velero" already exists

If you get this error, consider it a warning - you may then adjust accordingly by importing that namespace to be managed by Terraform, deleting the previously existing ns if it wasn't actually in use, or setting enable_backup back to false to continue managing Velero in the previous manner.

The recommended way is to import the namespace and then run another Terraform plan and apply:

terraform import module.eks-jx.module.backup.kubernetes_namespace.velero_namespace velero

Production cluster considerations

The configuration, as seen in Cluster provisioning, is not suited for creating and maintaining a production Jenkins X cluster. The following is a list of considerations for a production use case.

  • Specify the version attribute of the module, for example:

    module "eks-jx" {
      source  = "github.com/jenkins-x/terraform-aws-eks-jx"
      version = "1.0.0"
      # insert your configuration
    }
    
    output "jx_requirements" {
      value = module.eks-jx.jx_requirements
    }

    Specifying the version ensures that you are using a fixed version and that version upgrades cannot occur unintended.

  • Keep the Terraform configuration under version control by creating a dedicated repository for your cluster configuration or by adding it to an already existing infrastructure repository.

  • Setup a Terraform backend to securely store and share the state of your cluster. For more information refer to Configuring a Terraform backend.

  • Disable public API for the EKS cluster. If that is not not possible, restrict access to it by specifying the cidr blocks which can access it.

Configuring a Terraform backend

A "backend" in Terraform determines how state is loaded and how an operation such as apply is executed. By default, Terraform uses the local backend, which keeps the state of the created resources on the local file system. This is problematic since sensitive information will be stored on disk and it is not possible to share state across a team. When working with AWS a good choice for your Terraform backend is the s3 backend which stores the Terraform state in an AWS S3 bucket. The examples directory of this repository contains configuration examples for using the s3 backed.

To use the s3 backend, you will need to create the bucket upfront. You need the S3 bucket as well as a Dynamo table for state locks. You can use terraform-aws-tfstate-backend to create these required resources.

Examples

You can find examples for different configurations in the examples folder.

Each example generates a valid jx-requirements.yml file that can be used to boot a Jenkins X cluster.

Module configuration

Providers

Name Version
aws 5.60.0

Modules

Name Source Version
backup ./modules/backup n/a
cluster ./modules/cluster n/a
dns ./modules/dns n/a
health ./modules/health n/a
nginx ./modules/nginx n/a
vault ./modules/vault n/a

Requirements

Name Version
terraform >= 0.13.0, < 2.0.0
aws > 4.0
helm ~> 2.0
kubernetes ~> 2.0
null ~> 3.0

Inputs

Name Description Type Default Required
additional_tekton_role_policy_arns Additional Policy ARNs to attach to Tekton IRSA Role list(string) [] no
apex_domain The main domain to either use directly or to configure a subdomain from string "" no
asm_role DEPRECATED: Use the new bot_iam_role input with he same semantics instead. string "" no
boot_iam_role Specify arn of the role to apply to the boot job service account string "" no
boot_secrets n/a
list(object({
name = string
value = string
type = string
}))
[] no
cluster_name Variable to provide your desired name for the cluster string n/a yes
cluster_oidc_issuer_url The oidc provider url for the clustrer string n/a yes
create_and_configure_subdomain Flag to create an NS record set for the subdomain in the apex domain's Hosted Zone bool false no
create_asm_role Flag to control AWS Secrets Manager iam roles creation bool false no
create_autoscaler_role Flag to control cluster autoscaler iam role creation bool true no
create_bucketrepo_role Flag to control bucketrepo role bool true no
create_cm_role Flag to control cert manager iam role creation bool true no
create_cmcainjector_role Flag to control cert manager ca-injector iam role creation bool true no
create_ctrlb_role Flag to control controller build iam role creation bool true no
create_exdns_role Flag to control external dns iam role creation bool true no
create_nginx Decides whether we want to create nginx resources using terraform or not bool false no
create_nginx_namespace Boolean to control nginx namespace creation bool true no
create_pipeline_vis_role Flag to control pipeline visualizer role bool true no
create_ssm_role Flag to control AWS Parameter Store iam roles creation bool false no
create_tekton_role Flag to control tekton iam role creation bool true no
create_velero_role Flag to control velero iam role creation bool true no
eks_cluster_tags Add tags for the EKS Cluster map(any) {} no
enable_acl Flag to enable ACL instead of bucket ownership for S3 storage bool false no
enable_backup Whether or not Velero backups should be enabled bool false no
enable_external_dns Flag to enable or disable External DNS in the final jx-requirements.yml file bool false no
enable_key_rotation Flag to enable kms key rotation bool true no
enable_logs_storage Flag to enable or disable long term storage for logs bool true no
enable_reports_storage Flag to enable or disable long term storage for reports bool true no
enable_repository_storage Flag to enable or disable the repository bucket storage bool true no
enable_tls Flag to enable TLS in the final jx-requirements.yml file bool false no
expire_logs_after_days Number of days objects in the logs bucket are stored number 90 no
force_destroy Flag to determine whether storage buckets get forcefully destroyed. If set to false, empty the bucket first in the aws s3 console, else terraform destroy will fail with BucketNotEmpty error bool false no
force_destroy_subdomain Flag to determine whether subdomain zone get forcefully destroyed. If set to false, empty the sub domain first in the aws Route 53 console, else terraform destroy will fail with HostedZoneNotEmpty error bool false no
ignoreLoadBalancer Flag to specify if jx boot will ignore loadbalancer DNS to resolve to an IP bool false no
install_kuberhealthy Flag to specify if kuberhealthy operator should be installed bool false no
install_vault Whether or not this modules creates and manages the Vault instance. If set to false and use_vault is true either an external Vault URL needs to be provided or you need to install vault operator and instance using helmfile. bool true no
jx_bot_token Bot token used to interact with the Jenkins X cluster git repository string "" no
jx_bot_username Bot username used to interact with the Jenkins X cluster git repository string "" no
jx_git_operator_values Extra values for jx-git-operator chart as a list of yaml formated strings list(string) [] no
jx_git_url URL for the Jenkins X cluster git repository string "" no
local-exec-interpreter If provided, this is a list of interpreter arguments used to execute the command list(string)
[
"/bin/bash",
"-c"
]
no
manage_apex_domain Flag to control if apex domain should be managed/updated by this module. Set this to false,if your apex domain is managed in a different AWS account or different provider bool true no
manage_subdomain Flag to control subdomain creation/management bool true no
nginx_chart_version nginx chart version string n/a yes
nginx_namespace Name of the nginx namespace string "nginx" no
nginx_release_name Name of the nginx release name string "nginx-ingress" no
nginx_values_file Name of the values file which holds the helm chart values string "nginx_values.yaml" no
production_letsencrypt Flag to use the production environment of letsencrypt in the jx-requirements.yml file bool false no
region The region to create the resources into string "us-east-1" no
registry Registry used to store images string "" no
s3_extra_tags Add new tags for s3 buckets map(any) {} no
s3_kms_arn ARN of the kms key used for encrypting s3 buckets string "" no
subdomain The subdomain to be added to the apex domain. If subdomain is set, it will be appended to the apex domain in jx-requirements-eks.yml file string "" no
subnets The subnet ids to create EKS cluster in if create_vpc is false list(string) [] no
tls_cert TLS certificate encrypted with Base64 string "" no
tls_email The email to register the LetsEncrypt certificate with. Added to the jx-requirements.yml file string "" no
tls_key TLS key encrypted with Base64 string "" no
use_asm Flag to specify if AWS Secrets manager is being used bool false no
use_kms_s3 Flag to determine whether kms should be used for encrypting s3 buckets bool false no
use_vault Flag to control vault resource creation bool true no
vault_instance_values Extra values for vault-instance chart as a list of yaml formated strings list(string) [] no
vault_operator_values Extra values for vault-operator chart as a list of yaml formated strings list(string) [] no
vault_url URL to an external Vault instance in case Jenkins X does not create its own system Vault string "" no
velero_namespace Kubernetes namespace for Velero string "velero" no
velero_schedule The Velero backup schedule in cron notation to be set in the Velero Schedule CRD (see default-backup.yaml) string "0 * * * *" no
velero_ttl The the lifetime of a velero backup to be set in the Velero Schedule CRD (see default-backup.yaml) string "720h0m0s" no
velero_username The username to be assigned to the Velero IAM user string "velero" no
vpc_id The VPC to create EKS cluster in if create_vpc is false string "" no

Outputs

Name Description
backup_bucket_url The bucket where backups from velero will be stored
cert_manager_iam_role The IAM Role that the Cert Manager pod will assume to authenticate
cluster_asm_iam_role The IAM Role that the External Secrets pod will assume to authenticate (Secrets Manager)
cluster_autoscaler_iam_role The IAM Role that the Jenkins X UI pod will assume to authenticate
cluster_name The name of the created cluster
cluster_ssm_iam_role The IAM Role that the External Secrets pod will assume to authenticate (Parameter Store)
cm_cainjector_iam_role The IAM Role that the CM CA Injector pod will assume to authenticate
connect "The cluster connection string to use once Terraform apply finishes,
this command is already executed as part of the apply, you may have to provide the region and
profile as environment variables "
controllerbuild_iam_role The IAM Role that the ControllerBuild pod will assume to authenticate
external_dns_iam_role The IAM Role that the External DNS pod will assume to authenticate
jx_requirements The jx-requirements rendered output
lts_logs_bucket The bucket where logs from builds will be stored
lts_reports_bucket The bucket where test reports will be stored
lts_repository_bucket The bucket that will serve as artifacts repository
pipeline_viz_iam_role The IAM Role that the pipeline visualizer pod will assume to authenticate
subdomain_nameservers ---------------------------------------------------------------------------- DNS ----------------------------------------------------------------------------
tekton_bot_iam_role The IAM Role that the build pods will assume to authenticate

FAQ: Frequently Asked Questions

IAM Roles for Service Accounts

This module sets up a series of IAM Policies and Roles. These roles will be annotated into a few Kubernetes Service accounts. This allows us to make use of IAM Roles for Sercive Accounts to set fine-grained permissions on a pod per pod basis. There is no way to provide your own roles or define other Service Accounts by variables, but you can always modify the modules/cluster/irsa.tf Terraform file.

How can I contribute

Contributions are very welcome! Check out the Contribution Guidelines for instructions.