diff --git a/README.md b/README.md index c51f5224..a8bd61fc 100644 --- a/README.md +++ b/README.md @@ -8,25 +8,26 @@

-**Kamaji** is a tool aimed to build and operate a **Managed Kubernetes Service** with a fraction of the operational burden. With **Kamaji**, you can deploy and operate hundreds of Kubernetes clusters as a hyper-scale cloud provider. +**Kamaji** deploys and operates **Kubernetes** at scale with a fraction of the operational burden.

-> This project is still in the early development stage which means it's not ready for production as APIs, commands, flags, etc. are subject to change, but also that your feedback can still help to shape it. Please try it out and let us know what you like, dislike, what works, what doesn't, etc. - ## Why we are building it? -Global hyper-scalers are leading the Managed Kubernetes space, while regional cloud providers, as well as large corporations, are struggling to offer the same level of experience to their developers because of the lack of the right tools. Also, Kubernetes solutions for the on-premises are designed with an enterprise-first approach and they are too costly when deployed at scale. Project Kamaji aims to solve this pain by leveraging multi-tenancy and simplifying how to run Kubernetes clusters at scale with a fraction of the operational burden. +Global hyper-scalers are leading the Managed Kubernetes space, while other cloud providers, as well as large corporations, are struggling to offer the same experience to their DevOps teams because of the lack of the right tools. Also, current Kubernetes solutions are mainly designed with an enterprise-first approach and they are too costly when deployed at scale. -## How it works -Kamaji turns any CNCF conformant Kubernetes cluster into an _“admin cluster”_ to orchestrate other Kubernetes clusters we're calling _“tenant clusters”_. As with every Kubernetes cluster, a tenant cluster has a set of nodes and a control plane, composed of several components: `APIs server`, `scheduler`, `controller manager`. What Kamaji does is deploy those components as a pod running in the admin cluster. +**Kamaji** aims to solve these pains by leveraging multi-tenancy and simplifying how to run multiple control planes on the same infrastructure with a fraction of the operational burden. +## How it works +Kamaji turns any Kubernetes cluster into an _“admin cluster”_ to orchestrate other Kubernetes clusters called _“tenant clusters”_. What makes Kamaji special is that Control Planes of _“tenant clusters”_ are just regular pods running in the _“admin cluster”_ instead of dedicated Virtual Machines. This solution makes running control planes at scale cheaper and easier to deploy and operate. View [Core Concepts](./docs/concepts.md) for a deeper understanding of principles behind Kamaji's design. -And what about the tenant worker nodes? They are just worker nodes: regular instances, e.g. virtual or bare metal, connecting to the APIs server of the tenant cluster. Kamaji's goal is to manage the lifecycle of hundreds of these clusters, not only one, so how can we add another tenant cluster? As you could expect, Kamaji just deploys a new tenant control plane as a new pod in the admin cluster, and then it joins the new tenant worker nodes. +
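For a quick feel of what this means in practice, here is a minimal sketch of how a tenant control plane shows up in the admin cluster once Kamaji is installed. It assumes the `TenantControlPlane` custom resource (short name `tcp`) described in [Core Concepts](./docs/concepts.md) and an illustrative `tenants` namespace:

```bash
# List the tenant control planes registered in the admin cluster;
# "tcp" is the documented short name for the TenantControlPlane resource.
kubectl get tcp --all-namespaces

# Each tenant control plane is just a set of regular pods in its namespace
# (here assumed to be called "tenants").
kubectl get pods -n tenants
```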

+ +

- +

All the tenant clusters built with Kamaji are fully compliant CNCF Kubernetes clusters and are compatible with the standard Kubernetes toolchains everybody knows and loves. @@ -35,57 +36,32 @@ All the tenant clusters built with Kamaji are fully compliant CNCF Kubernetes cl

-## Save the state -Putting the tenant cluster control plane in a pod is the easiest part. Also, we have to make sure each tenant cluster saves the state to be able to store and retrieve data. - -A dedicated `etcd` cluster for each tenant cluster doesn’t scale well for a managed service because `etcd` data persistence can be cumbersome at scale, rising the operational effort to mitigate it. So we have to find an alternative keeping in mind our goal for a resilient and cost-optimized solution at the same time. As we can deploy any Kubernetes cluster with an external `etcd` cluster, we explored this option for the tenant control planes. On the admin cluster, we deploy a multi-tenant `etcd` cluster storing the state of multiple tenant clusters. - -With this solution, the resiliency is guaranteed by the usual `etcd` mechanism, and the pods' count remains under control, so it solves the main goal of resiliency and costs optimization. The trade-off here is that we have to operate an external `etcd` cluster and manage the access to be sure that each tenant cluster uses only its data. Also, there are limits in size in `etcd`, defaulted to 2GB and configurable to a maximum of 8GB. We’re solving this issue by pooling multiple `etcd` and sharding the tenant control planes. - ## Getting started Please refer to the [Getting Started guide](./docs/getting-started-with-kamaji.md) to deploy a minimal setup of Kamaji on KinD. -## Use cases -Kamaji project has been initially started as a solution for actual and common problems such as minimizing the Total Cost of Ownership while running Kubernetes at large scale. However, it can open a wider range of use cases. Here are a few: - -### Managed Kubernetes -Enabling companies to provide Cloud Native Infrastructure with ease by introducing a strong separation of concerns between management and workloads. Centralize clusters management, monitoring, and observability by leaving developers to focus on the applications, increase productivity and reduce operational costs. - -### Kubernetes as a Service -Provide Kubernetes clusters in a self-service fashion by running management and workloads on different infrastructures and cost centers with the option of Bring Your Own Device - BYOD. - -### Control Plane as a Service -Provide multiple Kubernetes control planes running on top of a single Kubernetes cluster. Tenants who use namespaces based isolation often still need access to cluster wide resources like Cluster Roles, Admission Webhooks, or Custom Resource Definitions. +> This project is still in the early development stage which means it's not ready for production as APIs, commands, flags, etc. are subject to change, but also that your feedback can still help to shape it. Please try it out and let us know what you like, dislike, what works, what doesn't, etc. -### Edge Computing -Distribute Kubernetes workloads across edge computing locations without having to manage multiple clusters across various providers. Centralize management of hundreds of control planes while leaving workloads to run isolated on their own dedicated infrastructure. +## Use cases +Kamaji project has been initially started as a solution for actual and common problems such as minimizing the Total Cost of Ownership while running Kubernetes at large scale. However, it can open a wider range of use cases. -### Cluster Simulation -Check new Kubernetes API or experimental flag or a new tool without impacting production operations. 
Kamaji will let you simulate such things in a safe and controlled environment. 
+Here are a few:
-### Workloads Testing
-Check the behaviour of your workloads on different and multiple versions of Kubernetes with ease by deploying multiple Control Planes in a single cluster.
+- **Managed Kubernetes:** enable companies to provide Cloud Native Infrastructure with ease by introducing a strong separation of concerns between management and workloads. Centralize cluster management, monitoring, and observability while leaving developers free to focus on applications, increasing productivity and reducing operational costs.
+- **Kubernetes as a Service:** provide Kubernetes clusters in a self-service fashion by running management and workloads on different infrastructures, with the option of Bring Your Own Device (BYOD).
+- **Control Plane as a Service:** provide multiple Kubernetes control planes running on top of a single Kubernetes cluster. Tenants who use namespace-based isolation often still need access to cluster-wide resources like Cluster Roles, Admission Webhooks, or Custom Resource Definitions.
+- **Edge Computing:** distribute Kubernetes workloads across edge computing locations without having to manage multiple clusters across various providers. Centralize management of hundreds of control planes while leaving workloads to run isolated on their own dedicated infrastructure.
+- **Cluster Simulation:** test a new Kubernetes API, an experimental flag, or a new tool without impacting production operations. Kamaji lets you simulate such things in a safe and controlled environment.
+- **Workloads Testing:** check the behaviour of your workloads across different versions of Kubernetes with ease by deploying multiple Control Planes in a single cluster.
## Features
-### Self Service Kubernetes
-Leave users the freedom to self-provision their Kubernetes clusters according to the assigned boundaries.
-
-### Multi-cluster Management
-Centrally manage multiple tenant clusters from a single admin cluster. Happy SREs.
-
-### Cheaper Control Planes
-Place multiple tenant control planes on a single node, instead of having three nodes for a single control plane.
-
-### Stronger Multi-Tenancy
-Leave tenants to access the control plane with admin permissions while keeping the tenant isolated at the infrastructure level.
-
-### Kubernetes Inception
-Use Kubernetes to manage Kubernetes by re-using all the Kubernetes goodies you already know and love.
-
-### Full APIs compliant
-Tenant clusters are fully CNCF compliant built with upstream Kubernetes binaries. A client does not see differences between a Kamaji provisioned cluster and a dedicated cluster.
+- **Self Service Kubernetes:** give users the freedom to self-provision their Kubernetes clusters according to the assigned boundaries.
+- **Multi-cluster Management:** centrally manage multiple tenant clusters from a single admin cluster. Happy SREs.
+- **Cheaper Control Planes:** place multiple tenant control planes on a single node, instead of having three nodes for a single control plane.
+- **Stronger Multi-Tenancy:** let tenants access the control plane with admin permissions while keeping the tenant isolated at the infrastructure level.
+- **Kubernetes Inception:** use Kubernetes to manage Kubernetes by re-using all the Kubernetes goodies you already know and love.
+- **Full APIs compliant:** tenant clusters are fully CNCF compliant, built with upstream Kubernetes binaries. 
A user does not see differences between a Kamaji-provisioned cluster and a dedicated cluster.
## Roadmap
@@ -94,10 +70,14 @@ Tenant clusters are fully CNCF compliant built with upstream Kubernetes binaries
- [x] Zero Downtime Tenant Control Plane upgrade
- [x] `konnectivity` integration
- [ ] Provisioning of Tenant Control Plane through Cluster APIs
-- [ ] Prometheus metrics for monitoring and alerting
-- [ ] `kine` integration, i.e. use MySQL, SQLite, PostgreSQL as datastore
+- [ ] Terraform provider
+- [ ] Custom Prometheus metrics for monitoring and alerting
+- [x] `kine` integration for MySQL as datastore
+- [ ] `kine` integration for PostgreSQL as datastore
- [ ] Deeper `kubeadm` integration
-- [ ] `etcd` pooling
+- [ ] Pooling of multiple `etcd` datastores
+- [ ] Autoscaling of Tenant Control Plane pods
+
## Documentation
Please, check the project's [documentation](./docs/) for getting started with Kamaji.
@@ -108,27 +88,23 @@ Kamaji is Open Source with Apache 2 license and any contribution is welcome.
## Community
Join the [Kubernetes Slack Workspace](https://slack.k8s.io/) and the [`#kamaji`](https://kubernetes.slack.com/archives/C03GLTTMWNN) channel to meet end-users and contributors.
-## FAQ
+## FAQs
Q. What does Kamaji mean?
A. Kamaji is named after the character _Kamaji_ from the Japanese movie [_Spirited Away_](https://en.wikipedia.org/wiki/Spirited_Away).
Q. Is Kamaji another Kubernetes distribution?
-A. No, Kamaji is a tool you can install on top of any CNCF conformant Kubernetes to provide hundreds of managed Tenant clusters as a service. We tested Kamaji on vanilla Kubernetes 1.23+, KinD, and MS Azure AKS. We expect it to work smoothly on other Kubernetes distributions. The tenant clusters made with Kamaji are conformant CNCF Kubernetes vanilla clusters built with `kubeadm`.
-
-Q. Is it safe to run Kubernetes control plane components in a pod?
-
-A. Yes, the tenant control plane components are packaged in the same way they are running in bare metal or virtual nodes. We leverage the `kubeadm` code to set up the control plane components as they were running on a server. The same unchanged images of upstream `APIs server`, `scheduler`, `controller manager` are used.
+A. No, Kamaji is a Kubernetes Operator you can install on top of any Kubernetes cluster to provide hundreds of managed Kubernetes clusters as a service. We tested Kamaji on vanilla Kubernetes 1.22+, KinD, and Azure AKS. We expect it to work smoothly on other Kubernetes distributions. The tenant clusters made with Kamaji are conformant CNCF Kubernetes clusters as we leverage [`kubeadm`](https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/).
-Q. And what about multi-tenant `etcd`? I never heard of it.
+Q. Is it safe to run Kubernetes control plane components in a pod instead of dedicated virtual machines?
-A. Even if multi-tenant deployment for `etcd` is not a common practice, multi-tenancy, RBAC, and client authentication has been [supported](https://etcd.io/docs/v3.5/op-guide/authentication/) in `etcd` from a long time.
+A. Yes, the tenant control plane components are packaged in the same way they run on bare metal or virtual nodes. We leverage the `kubeadm` code to set up the control plane components as if they were running on their own server. The unchanged images of upstream `kube-apiserver`, `kube-scheduler`, and `kube-controller-manager` are used.
Q. You already provide a Kubernetes multi-tenancy solution with [Capsule](capsule.clastix.io). Why does Kamaji matter?
-A. 
Lighter Multi-Tenancy solutions, like Capsule shares the Kubernetes control plane among all tenants keeping tenant namespaces isolated by policies. While these solutions are the right choice by balancing between features and ease of usage, there are cases where a tenant user requires access to the control plane, for example, when a tenant requires to manage CRDs on his own. With Kamaji, you can provide admin permissions to the tenants. +A. A multi-tenancy solution, like Capsule shares the Kubernetes control plane among all tenants keeping tenant namespaces isolated by policies. While the solution is the right choice by balancing between features and ease of usage, there are cases where a tenant user requires access to the control plane, for example, when a tenant requires to manage CRDs on his own. With Kamaji, you can provide cluster admin permissions to the tenant. -Q. So I need a costly cloud infrastructure to try Kamaji? +Q. Well you convinced me, how to get a try? -A. No, it is possible to getting started Kamaji on your laptop with [KinD](./docs/getting-started-with-kamaji.md). +A. It is possible to get started with Kamaji on a laptop with [KinD](./docs/getting-started-with-kamaji.md) installed. diff --git a/assets/kamaji-dark.png b/assets/kamaji-dark.png index 2d12867d..8c9b5c0f 100644 Binary files a/assets/kamaji-dark.png and b/assets/kamaji-dark.png differ diff --git a/assets/kamaji-dark.svg b/assets/kamaji-dark.svg deleted file mode 100644 index a00b32cc..00000000 --- a/assets/kamaji-dark.svg +++ /dev/null @@ -1,16 +0,0 @@ - - - - - - - Kamaji Admin ClusterMulti-tenantetcdTenant 01Control Plane PodTenant 02Control Plane PodTenant XControl Plane PodTenant 01ClusterAny CNCFConformantKubernetesworker nodesworker nodesworker nodesTenant 02Clusterworker nodesworker nodesworker nodesTenant XClusterworker nodesworker nodesworker nodes \ No newline at end of file diff --git a/assets/kamaji-light.png b/assets/kamaji-light.png index 1a3a3c96..dc06a3c1 100644 Binary files a/assets/kamaji-light.png and b/assets/kamaji-light.png differ diff --git a/assets/kamaji-light.svg b/assets/kamaji-light.svg deleted file mode 100644 index f544308b..00000000 --- a/assets/kamaji-light.svg +++ /dev/null @@ -1,16 +0,0 @@ - - - - - - - Kamaji Admin ClusterMulti-tenantetcdTenant 01Control Plane PodTenant 02Control Plane PodTenant XControl Plane PodTenant 01ClusterAny CNCFConformantKubernetesworker nodesworker nodesworker nodesTenant 02Clusterworker nodesworker nodesworker nodesTenant XClusterworker nodesworker nodesworker nodes \ No newline at end of file diff --git a/deploy/etcd/etcd-cluster.yaml b/deploy/etcd/etcd-cluster.yaml new file mode 100644 index 00000000..d5fe1c9d --- /dev/null +++ b/deploy/etcd/etcd-cluster.yaml @@ -0,0 +1,120 @@ +apiVersion: v1 +kind: ServiceAccount +metadata: + name: etcd + namespace: +--- +apiVersion: v1 +kind: Service +metadata: + name: etcd-server + namespace: +spec: + type: ClusterIP + ports: + - name: client + port: 2379 + protocol: TCP + targetPort: 2379 + selector: + app: etcd +--- +apiVersion: v1 +kind: Service +metadata: + name: etcd + namespace: +spec: + clusterIP: None + ports: + - port: 2379 + name: client + - port: 2380 + name: peer + selector: + app: etcd +--- +apiVersion: apps/v1 +kind: StatefulSet +metadata: + name: etcd + labels: + app: etcd + namespace: +spec: + serviceName: etcd + selector: + matchLabels: + app: etcd + replicas: 3 + template: + metadata: + name: etcd + labels: + app: etcd + spec: + serviceAccountName: etcd + topologySpreadConstraints: + - 
maxSkew: 1 + topologyKey: topology.kubernetes.io/zone + whenUnsatisfiable: DoNotSchedule + labelSelector: + matchLabels: + app: etcd + volumes: + - name: certs + secret: + secretName: etcd-certs + containers: + - name: etcd + image: quay.io/coreos/etcd:v3.5.1 + ports: + - containerPort: 2379 + name: client + - containerPort: 2380 + name: peer + volumeMounts: + - name: data + mountPath: /var/run/etcd + - name: certs + mountPath: /etc/etcd/pki + command: + - etcd + - --data-dir=/var/run/etcd + - --name=$(POD_NAME) + - --initial-cluster-state=new + - --initial-cluster=etcd-0=https://etcd-0.etcd.$(POD_NAMESPACE).svc.cluster.local:2380,etcd-1=https://etcd-1.etcd.$(POD_NAMESPACE).svc.cluster.local:2380,etcd-2=https://etcd-2.etcd.$(POD_NAMESPACE).svc.cluster.local:2380 + - --initial-advertise-peer-urls=https://$(POD_NAME).etcd.$(POD_NAMESPACE).svc.cluster.local:2380 + - --initial-cluster-token=kamaji + - --listen-client-urls=https://0.0.0.0:2379 + - --advertise-client-urls=https://etcd-0.etcd.$(POD_NAMESPACE).svc.cluster.local:2379,https://etcd-1.etcd.$(POD_NAMESPACE).svc.cluster.local:2379,https://etcd-2.etcd.$(POD_NAMESPACE).svc.cluster.local:2379,https://etcd-server.$(POD_NAMESPACE).svc.cluster.local:2379 + - --client-cert-auth=true + - --trusted-ca-file=/etc/etcd/pki/ca.crt + - --cert-file=/etc/etcd/pki/server.pem + - --key-file=/etc/etcd/pki/server-key.pem + - --listen-peer-urls=https://0.0.0.0:2380 + - --peer-client-cert-auth=true + - --peer-trusted-ca-file=/etc/etcd/pki/ca.crt + - --peer-cert-file=/etc/etcd/pki/peer.pem + - --peer-key-file=/etc/etcd/pki/peer-key.pem + - --snapshot-count=8000 + - --auto-compaction-mode=periodic + - --auto-compaction-retention=5m + - --quota-backend-bytes=8589934592 + env: + - name: POD_NAME + valueFrom: + fieldRef: + fieldPath: metadata.name + - name: POD_NAMESPACE + valueFrom: + fieldRef: + fieldPath: metadata.namespace + volumeClaimTemplates: + - metadata: + name: data + spec: + accessModes: ["ReadWriteOnce"] + resources: + requests: + storage: 8Gi diff --git a/deploy/kamaji-azure.env b/deploy/kamaji-azure.env index d50063b4..3f52cd84 100644 --- a/deploy/kamaji-azure.env +++ b/deploy/kamaji-azure.env @@ -1,5 +1,30 @@ +# azure parameters export KAMAJI_REGION=westeurope export KAMAJI_RG=Kamaji # https://docs.microsoft.com/en-us/azure/aks/faq#why-are-two-resource-groups-created-with-aks export KAMAJI_CLUSTER=kamaji export KAMAJI_NODE_RG=MC_${KAMAJI_RG}_${KAMAJI_CLUSTER}_${KAMAJI_REGION} + +# kamaji parameters +export KAMAJI_NAMESPACE=kamaji-system + +# tenant cluster parameters +export TENANT_NAMESPACE=tenants +export TENANT_NAME=tenant-00 +export TENANT_DOMAIN=$KAMAJI_REGION.cloudapp.azure.com +export TENANT_VERSION=v1.23.5 +export TENANT_PORT=6443 # port used to expose the tenant api server +export TENANT_PROXY_PORT=8132 # port used to expose the konnectivity server +export TENANT_POD_CIDR=10.36.0.0/16 +export TENANT_SVC_CIDR=10.96.0.0/16 +export TENANT_DNS_SERVICE=10.96.0.10 + +export TENANT_VM_SIZE=Standard_D2ds_v4 +export TENANT_VM_IMAGE=UbuntuLTS +export TENANT_RG=$TENANT_NAME +export TENANT_NSG=$TENANT_NAME-nsg +export TENANT_VNET_NAME=$TENANT_NAME +export TENANT_VNET_ADDRESS=172.12.0.0/16 +export TENANT_SUBNET_NAME=$TENANT_NAME-subnet +export TENANT_SUBNET_ADDRESS=172.12.10.0/24 +export TENANT_VMSS=$TENANT_NAME-vmss \ No newline at end of file diff --git a/deploy/kamaji-external-etcd.env b/deploy/kamaji-external-etcd.env deleted file mode 100644 index dfc92aa4..00000000 --- a/deploy/kamaji-external-etcd.env +++ /dev/null @@ -1,5 +0,0 @@ -# etcd 
machine addresses -export ETCD0=192.168.32.10 -export ETCD1=192.168.32.11 -export ETCD2=192.168.32.12 -export ETCDHOSTS=($ETCD0 $ETCD1 $ETCD2) diff --git a/deploy/kamaji-internal-etcd.env b/deploy/kamaji-internal-etcd.env deleted file mode 100644 index 70152b1c..00000000 --- a/deploy/kamaji-internal-etcd.env +++ /dev/null @@ -1,5 +0,0 @@ -# etcd endpoints -export ETCD_NAMESPACE=etcd-system -export ETCD0=etcd-0.etcd.${ETCD_NAMESPACE}.svc.cluster.local -export ETCD1=etcd-1.etcd.${ETCD_NAMESPACE}.svc.cluster.local -export ETCD2=etcd-2.etcd.${ETCD_NAMESPACE}.svc.cluster.local \ No newline at end of file diff --git a/deploy/kamaji-tenant-azure.env b/deploy/kamaji-tenant-azure.env deleted file mode 100644 index bc510970..00000000 --- a/deploy/kamaji-tenant-azure.env +++ /dev/null @@ -1,22 +0,0 @@ -export KAMAJI_REGION=westeurope - -# tenant cluster parameters -export TENANT_NAMESPACE=tenants -export TENANT_NAME=tenant-00 -export TENANT_DOMAIN=$KAMAJI_REGION.cloudapp.azure.com -export TENANT_VERSION=v1.23.4 -export TENANT_ADDR=10.240.0.100 # IP used to expose the tenant control plane -export TENANT_PORT=6443 # PORT used to expose the tenant control plane -export TENANT_POD_CIDR=10.36.0.0/16 -export TENANT_SVC_CIDR=10.96.0.0/16 -export TENANT_DNS_SERVICE=10.96.0.10 - -export TENANT_VM_SIZE=Standard_D2ds_v4 -export TENANT_VM_IMAGE=UbuntuLTS -export TENANT_RG=$TENANT_NAME -export TENANT_NSG=$TENANT_NAME-nsg -export TENANT_VNET_NAME=$TENANT_NAME -export TENANT_VNET_ADDRESS=192.168.0.0/16 -export TENANT_SUBNET_NAME=$TENANT_NAME-subnet -export TENANT_SUBNET_ADDRESS=192.168.10.0/24 -export TENANT_VMSS=$TENANT_NAME-vmss diff --git a/deploy/kamaji-tenant.env b/deploy/kamaji-tenant.env deleted file mode 100644 index a0beae1c..00000000 --- a/deploy/kamaji-tenant.env +++ /dev/null @@ -1,17 +0,0 @@ - -# tenant cluster parameters -export TENANT_NAMESPACE=tenants -export TENANT_NAME=tenant-00 -export TENANT_DOMAIN=clastix.labs -export TENANT_VERSION=v1.23.1 -export TENANT_ADDR=192.168.32.150 # IP used to expose the tenant control plane -export TENANT_PORT=6443 # PORT used to expose the tenant control plane -export TENANT_POD_CIDR=10.36.0.0/16 -export TENANT_SVC_CIDR=10.96.0.0/16 -export TENANT_DNS_SERVICE=10.96.0.10 - -# tenant node addresses -export WORKER0=192.168.32.90 -export WORKER1=192.168.32.91 -export WORKER2=192.168.32.92 -export WORKER3=192.168.32.93 diff --git a/deploy/kamaji.env b/deploy/kamaji.env new file mode 100644 index 00000000..4025a2d0 --- /dev/null +++ b/deploy/kamaji.env @@ -0,0 +1,18 @@ +# kamaji parameters +export KAMAJI_NAMESPACE=kamaji-system + +# tenant cluster parameters +export TENANT_NAMESPACE=tenants +export TENANT_NAME=tenant-00 +export TENANT_DOMAIN=clastix.labs +export TENANT_VERSION=v1.23.5 +export TENANT_PORT=6443 # port used to expose the tenant api server +export TENANT_PROXY_PORT=8132 # port used to expose the konnectivity server +export TENANT_POD_CIDR=10.36.0.0/16 +export TENANT_SVC_CIDR=10.96.0.0/16 +export TENANT_DNS_SERVICE=10.96.0.10 + +# tenant node addresses +export WORKER0=172.12.0.10 +export WORKER1=172.12.0.11 +export WORKER2=172.12.0.12 diff --git a/deploy/nodes-prerequisites.sh b/deploy/nodes-prerequisites.sh old mode 100644 new mode 100755 diff --git a/docs/README.md b/docs/README.md index d06d1d71..2bc98dc1 100644 --- a/docs/README.md +++ b/docs/README.md @@ -1,10 +1,13 @@ # Kamaji documentation +- [Core Concepts](./concepts.md) - [Architecture](./architecture.md) -- [Concepts](./concepts.md) - [Getting started](./getting-started-with-kamaji.md) -- [Kamaji 
Deployment](./kamaji-deployment-guide.md)
-- [Tenant deployment](./kamaji-tenant-deployment-guide.md)
-- Deployment on cloud providers:
- - [Azure](./kamaji-azure-deployment-guide.md)
+- Guides:
+ - [Deploy Kamaji](./kamaji-deployment-guide.md)
+ - [Deploy Kamaji on Azure](./kamaji-azure-deployment-guide.md)
+ - Deploy Kamaji on AWS
+ - Deploy Kamaji on GCP
+ - Deploy Kamaji on OpenStack
- [Reference](./reference.md)
+- [Versioning](./versioning.md)
\ No newline at end of file
diff --git a/docs/concepts.md b/docs/concepts.md
index a867a7a6..f724a47e 100644
--- a/docs/concepts.md
+++ b/docs/concepts.md
@@ -1 +1,31 @@
-# Kamaji concepts
\ No newline at end of file
+# Core Concepts
+
+Kamaji is a Kubernetes Operator. It turns any Kubernetes cluster into an _“admin cluster”_ to orchestrate other Kubernetes clusters called _“tenant clusters”_.
+
+## Tenant Control Plane
+What makes Kamaji special is that the Control Plane of a _“tenant cluster”_ is just one or more regular pods running in a namespace of the _“admin cluster”_ instead of a dedicated set of Virtual Machines. This solution makes running control planes at scale cheaper and easier to deploy and operate. The Tenant Control Plane components are packaged in the same way they run on bare metal or virtual nodes. We leverage the `kubeadm` code to set up the control plane components as if they were running on their own server. The unchanged images of upstream `kube-apiserver`, `kube-scheduler`, and `kube-controller-manager` are used.
+
+High Availability and rolling updates of the Tenant Control Plane pods are provided by a regular Deployment. Autoscaling based on metrics is available. A Service is used to expose the Tenant Control Plane outside of the _“admin cluster”_. The `LoadBalancer` service type is used, while `NodePort` and `ClusterIP` with an Ingress Controller are still viable options, depending on the case.
+
+Kamaji offers a [Custom Resource Definition](https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/) to provide a declarative approach to managing a Tenant Control Plane. This *CRD* is called `TenantControlPlane`, or `tcp` for short.
+
+## Tenant worker nodes
+And what about the tenant worker nodes? They are just _"worker nodes"_, i.e. regular virtual or bare metal machines, connecting to the API server of the Tenant Control Plane. Kamaji's goal is to manage the lifecycle of hundreds of these _“tenant clusters”_, not only one, so how do you add another tenant cluster to Kamaji? As you could expect, you just deploy a new Tenant Control Plane in one of the _“admin cluster”_ namespaces and then join the tenant worker nodes to it.
+
+All the tenant clusters built with Kamaji are fully compliant CNCF Kubernetes clusters and are compatible with the standard Kubernetes toolchains everybody knows and loves.
+
+## Save the state
+Putting the Tenant Control Plane in a pod is the easiest part. Also, we have to make sure each tenant cluster saves the state to be able to store and retrieve data. A dedicated `etcd` cluster for each tenant cluster doesn’t scale well for a managed service because `etcd` data persistence can be cumbersome at scale, raising the operational effort to mitigate it. So we have to find an alternative, keeping in mind our goal of a resilient and cost-optimized solution. As we can deploy any Kubernetes cluster with an external `etcd` cluster, we explored this option for the tenant control planes. 
On the admin cluster, we deploy a multi-tenant `etcd` cluster storing the state of multiple tenant clusters.
+
+With this solution, the resiliency is guaranteed by the usual `etcd` mechanism, and the pods' count remains under control, so it meets the main goals of resiliency and cost optimization. The trade-off here is that we have to operate an external `etcd` cluster, in addition to the `etcd` of the _“admin cluster”_, and manage access to make sure that each _“tenant cluster”_ uses only its own data. Also, there are size limits in `etcd`, defaulting to 2GB and configurable to a maximum of 8GB. We’re solving this issue by pooling multiple `etcd` clusters together and sharding the Tenant Control Planes.
+
+Optionally, Kamaji offers the possibility of using a different storage system than `etcd` to save the state of the tenants' clusters, like MySQL or PostgreSQL compatible databases, thanks to the [kine](https://github.com/k3s-io/kine) integration.
+
+## Requirements of design
+These are the design requirements behind Kamaji:
+
+- Communication between the _“admin cluster”_ and a _“tenant cluster”_ is unidirectional. The _“admin cluster”_ manages a _“tenant cluster”_, but a _“tenant cluster”_ has no awareness of the _“admin cluster”_.
+- Communication between different _“tenant clusters”_ is not allowed.
+- The worker nodes of a tenant should not run anything beyond the tenant's workloads.
+
+Goals and scope may vary as the project evolves.
\ No newline at end of file
diff --git a/docs/kamaji-azure-deployment-guide.md b/docs/kamaji-azure-deployment-guide.md
index 10ecf328..8fb2a6bd 100644
--- a/docs/kamaji-azure-deployment-guide.md
+++ b/docs/kamaji-azure-deployment-guide.md
@@ -1,42 +1,48 @@
# Setup Kamaji on Azure
-
-In this section, we're going to setup Kamaji on MS Azure:
+This guide will lead you through the process of creating a working Kamaji setup on MS Azure. It requires:
- one bootstrap local workstation
-- a regular AKS cluster as Kamaji Admin Cluster
-- a multi-tenant etcd internal cluster running on AKS
-- an arbitrary number of Azure virtual machines hosting `Tenant`s' workloads
+- an AKS Kubernetes cluster to run the Admin and Tenant Control Planes
+- an arbitrary number of Azure virtual machines to host `Tenant`s' workloads
-## Bootstrap machine
-This getting started guide is supposed to be run from a remote or local bootstrap machine.
-First, prepare the workspace directory:
+ * [Prepare the bootstrap workspace](#prepare-the-bootstrap-workspace)
+ * [Access Admin cluster](#access-admin-cluster)
+ * [Install Kamaji controller](#install-kamaji-controller)
+ * [Create Tenant Cluster](#create-tenant-cluster)
+ * [Cleanup](#cleanup)
-```
+## Prepare the bootstrap workspace
+This guide is supposed to be run from a remote or local bootstrap machine. First, clone the repo and prepare the workspace directory:
+
+```bash
git clone https://github.com/clastix/kamaji
cd kamaji/deploy
```
-1. Follow the instructions in [Prepare the bootstrap workspace](./getting-started-with-kamaji.md#prepare-the-bootstrap-workspace).
-2. Install the [Azure CLI](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli).
-3. Make sure you have a valid Azure subscription
-4. 
Login to Azure: +We assume you have installed on your workstation: + +- [kubectl](https://kubernetes.io/docs/tasks/tools/) +- [helm](https://helm.sh/docs/intro/install/) +- [jq](https://stedolan.github.io/jq/) +- [openssl](https://www.openssl.org/) +- [Azure CLI](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli) + +Make sure you have a valid Azure subscription, and login to Azure: ```bash az account set --subscription "MySubscription" az login ``` -> Currently, the Kamaji setup, including Admin and Tenant clusters need to be deployed within the same Azure region. Cross-regions deployments are not (yet) validated. +> Currently, the Kamaji setup, including Admin and Tenant clusters need to be deployed within the same Azure region. Cross-regions deployments are not supported. + +## Access Admin cluster +In Kamaji, an Admin Cluster is a regular Kubernetes cluster which hosts zero to many Tenant Cluster Control Planes. The admin cluster acts as management cluster for all the Tenant clusters and implements Monitoring, Logging, and Governance of all the Kamaji setup, including all Tenant clusters. For this guide, we're going to use an instance of Azure Kubernetes Service - AKS as the Admin Cluster. -## Setup Admin cluster on AKS -Throughout the instructions, shell variables are used to indicate values that you should adjust to your own Azure environment: +Throughout the following instructions, shell variables are used to indicate values that you should adjust to your own Azure environment: ```bash source kamaji-azure.env -``` -> we use the Azure CLI to setup the Kamaji Admin cluster on AKS. - -``` az group create \ --name $KAMAJI_RG \ --location $KAMAJI_REGION @@ -66,34 +72,46 @@ And check you can access: kubectl cluster-info ``` -## Setup internal multi-tenant etcd -Follow the instructions [here](./kamaji-deployment-guide.md#setup-internal-multi-tenant-etcd). +## Install Kamaji +There are multiple ways to deploy Kamaji, including a [single YAML file](../config/install.yaml) and [Helm Chart](../helm/kamaji). -## Install Kamaji controller -Follow the instructions [here](./kamaji-deployment-guide.md#install-kamaji-controller). +### Multi-tenant datastore +The Kamaji controller needs to access a multi-tenant datastore in order to save data of the tenants' clusters. Install a multi-tenant `etcd` in the admin cluster as three replicas StatefulSet with data persistence. The Helm [Chart](../helm/kamaji/) provides the installation of an internal `etcd`. However, an externally managed `etcd` is highly recommended. If you'd like to use an external one, you can specify the overrides by setting the value `etcd.deploy=false`. -## Create Tenant Clusters -To create a Tenant Cluster in Kamaji on AKS, we have to work on both the Kamaji and Azure infrastructure sides. +Optionally, Kamaji offers the possibility of using a different storage system than `etcd` for the tenants' clusters, like MySQL compatible database, thanks to the [kine](https://github.com/k3s-io/kine) integration [here](../deploy/mysql/README.md). 
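For example, here is a minimal sketch of installing the chart against an externally managed datastore instead of the bundled one. The `etcd.deploy=false` override is the one mentioned above; any connection settings for the external `etcd` depend on your chart values and environment and are not shown here:

```bash
# Skip the in-cluster etcd shipped with the Helm Chart and rely on an
# externally managed datastore; the rest mirrors the install command below.
helm install --create-namespace --namespace kamaji-system kamaji ../helm/kamaji \
  --set etcd.deploy=false
```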
+### Install with Helm Chart +Install with the `helm` in a dedicated namespace of the Admin cluster: + +```bash +helm install --create-namespace --namespace kamaji-system kamaji ../helm/kamaji ``` -source kamaji-tenant-azure.env + +The Kamaji controller and the multi-tenant `etcd` are now running: + +```bash +kubectl -n kamaji-system get pods +NAME READY STATUS RESTARTS AGE +etcd-0 1/1 Running 0 120m +etcd-1 1/1 Running 0 120m +etcd-2 1/1 Running 0 119m +kamaji-857fcdf599-4fb2p 2/2 Running 0 120m ``` -### On Kamaji side -With Kamaji on AKS, the tenant control plane is accessible: +You just turned your AKS cluster into a Kamaji cluster to run multiple Tenant Control Planes. -- from tenant work nodes through an internal loadbalancer as `https://${TENANT_ADDR}:6443` -- from tenant admin user through an external loadbalancer `https://${TENANT_NAME}.${KAMAJI_REGION}.cloudapp.azure.com:443` +## Create Tenant Cluster -Where `TENANT_ADDR` is the Azure internal IP address assigned to the LoadBalancer service created by Kamaji to expose the Tenant Control Plane endpoint. +### Tenant Control Plane +With Kamaji on AKS, the tenant control plane is accessible: -#### Create the Tenant Control Plane +- from tenant worker nodes through an internal loadbalancer +- from tenant admin user through an external loadbalancer responding to `https://${TENANT_NAME}.${TENANT_NAME}.${TENANT_DOMAIN}:443` -Create the manifest for Tenant Control Plane: +Create a tenant control plane of example: ```yaml cat > ${TENANT_NAMESPACE}-${TENANT_NAME}-tcp.yaml < 1m v1.23.4 -kamaji-tenant-worker-01 NotReady 1m v1.23.4 -kamaji-tenant-worker-02 NotReady 1m v1.23.4 -kamaji-tenant-worker-03 NotReady 1m v1.23.4 +NAME STATUS ROLES AGE VERSION +tenant-00-000000 NotReady 112s v1.23.5 +tenant-00-000002 NotReady 92s v1.23.5 +tenant-00-000003 NotReady 71s v1.23.5 ``` The cluster needs a [CNI](https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/network-plugins/) plugin to get the nodes ready. In our case, we are going to install [calico](https://projectcalico.docs.tigera.io/about/about-calico). @@ -371,7 +382,7 @@ kubectl apply -f calico-cni/calico-azure.yaml --kubeconfig=${TENANT_NAMESPACE}-$ And after a while, `kube-system` pods will be running. ```bash -kubectl get po -n kube-system --kubeconfig=${TENANT_NAMESPACE}-${TENANT_NAME}.kubeconfig +kubectl --kubeconfig=${TENANT_NAMESPACE}-${TENANT_NAME}.kubeconfig get po -n kube-system NAME READY STATUS RESTARTS AGE calico-kube-controllers-8594699699-dlhbj 1/1 Running 0 3m @@ -386,14 +397,14 @@ kube-proxy-m48v4 1/1 Running 0 3m And the nodes will be ready ```bash -kubectl get nodes --kubeconfig=${TENANT_NAMESPACE}-${TENANT_NAME}.kubeconfig +kubectl --kubeconfig=${TENANT_NAMESPACE}-${TENANT_NAME}.kubeconfig get nodes -NAME STATUS ROLES AGE VERSION -kamaji-tenant-worker-01 Ready 10m v1.23.4 -kamaji-tenant-worker-02 Ready 10m v1.23.4 +NAME STATUS ROLES AGE VERSION +tenant-00-000000 Ready 3m38s v1.23.5 +tenant-00-000002 Ready 3m18s v1.23.5 +tenant-00-000003 Ready 2m57s v1.23.5 ``` - ## Cleanup To get rid of the Tenant infrastructure, remove the RESOURCE_GROUP: diff --git a/docs/kamaji-deployment-guide.md b/docs/kamaji-deployment-guide.md index 77ea13e6..f8f3d40b 100644 --- a/docs/kamaji-deployment-guide.md +++ b/docs/kamaji-deployment-guide.md @@ -1,603 +1,479 @@ -# Install a Kamaji environment -This guide will lead you through the process of creating a basic working Kamaji setup. 
+# Setup Kamaji +This guide will lead you through the process of creating a working Kamaji setup on a generic Kubernetes cluster. It requires: -Kamaji requires: +- one bootstrap local workstation +- a Kubernetes cluster 1.22+, to run the Admin and Tenant Control Planes +- an arbitrary number of machines to host Tenants' workloads -- (optional) a bootstrap node; -- a multi-tenant `etcd` cluster made of 3 nodes hosting the datastore for the `Tenant`s' clusters -- a Kubernetes cluster, running the admin and Tenant Control Planes -- an arbitrary number of machines hosting `Tenant`s' workloads - -> In this guide, we assume all machines are running `Ubuntu 20.04`. +> In this guide, we assume the machines are running `Ubuntu 20.04`. * [Prepare the bootstrap workspace](#prepare-the-bootstrap-workspace) * [Access Admin cluster](#access-admin-cluster) - * [Setup external multi-tenant etcd](#setup-external-multi-tenant-etcd) - * [Setup internal multi-tenant etcd](#setup-internal-multi-tenant-etcd) * [Install Kamaji controller](#install-kamaji-controller) - * [Setup Tenant cluster](#setup-tenant-cluster) + * [Create Tenant Cluster](#create-tenant-cluster) + * [Cleanup](#cleanup) ## Prepare the bootstrap workspace -This guide is supposed to be run from a remote or local bootstrap machine. -First, prepare the workspace directory: +This guide is supposed to be run from a remote or local bootstrap machine. First, clone the repo and prepare the workspace directory: -``` +```bash git clone https://github.com/clastix/kamaji cd kamaji/deploy ``` -Throughout the instructions, shell variables are used to indicate values that you should adjust to your own environment. +We assume you have installed on your workstation: -### Install required tools -On the bootstrap machine, install all the required tools to work with a Kamaji setup. +- [kubectl](https://kubernetes.io/docs/tasks/tools/) +- [helm](https://helm.sh/docs/intro/install/) +- [jq](https://stedolan.github.io/jq/) +- [openssl](https://www.openssl.org/) -#### cfssl and cfssljson -The `cfssl` and `cfssljson` command line utilities will be used in addition to `kubeadm` to provision the PKI Infrastructure and generate TLS certificates. +## Access Admin cluster +In Kamaji, an Admin Cluster is a regular Kubernetes cluster which hosts zero to many Tenant Cluster Control Planes. The admin cluster acts as management cluster for all the Tenant clusters and implements Monitoring, Logging, and Governance of all the Kamaji setup, including all Tenant clusters. -``` -wget -q --show-progress --https-only --timestamping \ -https://storage.googleapis.com/kubernetes-the-hard-way/cfssl/1.4.1/linux/cfssl \ -https://storage.googleapis.com/kubernetes-the-hard-way/cfssl/1.4.1/linux/cfssljson +Throughout the following instructions, shell variables are used to indicate values that you should adjust to your environment: -chmod +x cfssl cfssljson -sudo mv cfssl cfssljson /usr/local/bin/ +```bash +source kamaji.env ``` -#### Kubernetes tools -Install `kubeadm` and `kubectl` +Any regular and conformant Kubernetes v1.22+ cluster can be turned into a Kamaji setup. 
To work properly, the admin cluster should provide: -```bash -sudo apt update && sudo apt install -y apt-transport-https ca-certificates curl && \ -sudo curl -fsSLo /usr/share/keyrings/kubernetes-archive-keyring.gpg https://packages.cloud.google.com/apt/doc/apt-key.gpg && \ -echo "deb [signed-by=/usr/share/keyrings/kubernetes-archive-keyring.gpg] https://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list && \ -sudo apt update && sudo apt install -y kubeadm kubectl --allow-change-held-packages && \ -sudo apt-mark hold kubeadm kubectl -``` +- CNI module installed, eg. [Calico](https://github.com/projectcalico/calico), [Cilium](https://github.com/cilium/cilium). +- CSI module installed with a Storage Class for the Tenants' `etcd`. +- Support for LoadBalancer Service Type, or alternatively, an Ingress Controller, eg. [ingress-nginx](https://github.com/kubernetes/ingress-nginx), [haproxy](https://github.com/haproxytech/kubernetes-ingress). +- Monitoring Stack, eg. [Prometheus](https://github.com/prometheus-community). + +Make sure you have a `kubeconfig` file with admin permissions on the cluster you want to turn into Kamaji Admin Cluster. + +## Install Kamaji +There are multiple ways to deploy Kamaji, including a [single YAML file](../config/install.yaml) and [Helm Chart](../helm/kamaji). -#### etcdctl -For administration of the `etcd` cluster, download and install the `etcdctl` CLI utility on the bootstrap machine +### Multi-tenant datastore +The Kamaji controller needs to access a multi-tenant datastore in order to save data of the tenants' clusters. Install a multi-tenant `etcd` in the admin cluster as three replicas StatefulSet with data persistence. The Helm [Chart](../helm/kamaji/) provides the installation of an internal `etcd`. However, an externally managed `etcd` is highly recommended. If you'd like to use an external one, you can specify the overrides by setting the value `etcd.deploy=false`. + +Optionally, Kamaji offers the possibility of using a different storage system than `etcd` for the tenants' clusters, like MySQL compatible database, thanks to the [kine](https://github.com/k3s-io/kine) integration [here](../deploy/mysql/README.md). + +### Install with Helm Chart +Install with the `helm` in a dedicated namespace of the Admin cluster: ```bash -ETCD_VER=v3.5.1 -ETCD_URL=https://storage.googleapis.com/etcd -curl -L ${ETCD_URL}/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz -o etcd-${ETCD_VER}-linux-amd64.tar.gz -tar xzvf etcd-${ETCD_VER}-linux-amd64.tar.gz etcd-${ETCD_VER}-linux-amd64/etcdctl -sudo cp etcd-${ETCD_VER}-linux-amd64/etcdctl /usr/bin/etcdctl -rm -rf etcd-${ETCD_VER}-linux-amd64* +helm install --create-namespace --namespace kamaji-system kamaji ../helm/kamaji ``` -Verify `etcdctl` version is installed +The Kamaji controller and the multi-tenant `etcd` are now running: ```bash -etcdctl version -etcdctl version: 3.5.1 -API version: 3.5 +kubectl -n kamaji-system get pods +NAME READY STATUS RESTARTS AGE +etcd-0 1/1 Running 0 120m +etcd-1 1/1 Running 0 120m +etcd-2 1/1 Running 0 119m +kamaji-857fcdf599-4fb2p 2/2 Running 0 120m ``` +You just turned your Kubernetes cluster into a Kamaji cluster to run multiple Tenant Control Planes. -## Access Admin cluster -In Kamaji, an Admin Cluster is a regular Kubernetes cluster which hosts zero to many Tenant Cluster Control Planes running as pods. 
The admin cluster acts as management cluster for all the Tenant clusters and implements Monitoring, Logging, and Governance of all the Kamaji setup, including all Tenant clusters. +## Create Tenant Cluster -Any regular and conformant Kubernetes v1.22+ cluster can be turned into a Kamaji setup. Currently we tested: +### Tenant Control Plane -- [Kubernetes installed with `kubeadm`](https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/). -- [Azure AKS managed service](./kamaji-on-azure.md). -- [KinD for local development](./getting-started-with-kamaji.md ). +A tenant control plane of example looks like: -The admin cluster should provide: +```yaml +cat > ${TENANT_NAMESPACE}-${TENANT_NAME}-tcp.yaml < Do not use a password. +The `LoadBalancer` service type is used to expose the Tenant Control Plane. However, `NodePort` and `ClusterIP` with an Ingress Controller are still viable options, depending on the case. High Availability and rolling updates of the Tenant Control Plane are provided by the `tcp` Deployment and all the resources reconcilied by the Kamaji controller. -Distribute the key to the other cluster hosts. +### Konnectivity +In addition to the standard control plane containers, Kamaji creates an instance of [konnectivity-server](https://kubernetes.io/docs/concepts/architecture/control-plane-node-communication/) running as sidecar container in the `tcp` pod and exposed on port `8132` of the `tcp` service. -Depending on your environment, use a bash loop: +This is required when the tenant worker nodes are not reachable from the `tcp` pods. The Konnectivity service consists of two parts: the Konnectivity server in the tenant control plane pod and the Konnectivity agents running on the tenant worker nodes. After worker nodes joined the tenant control plane, the Konnectivity agents initiate connections to the Konnectivity server and maintain the network connections. After enabling the Konnectivity service, all control plane to worker nodes traffic goes through these connections. -```bash -HOSTS=(${ETCD0} ${ETCD1} ${ETCD2}) -for i in "${!HOSTS[@]}"; do - HOST=${HOSTS[$i]} - ssh-copy-id -i ~/.ssh/id_rsa.pub $HOST; -done -``` +> In Kamaji, Konnectivity is enabled by default and can be disabled when not required. -> Alternatively, inject the generated public key into machines metadata. +### Working with Tenant Control Plane -Confirm that you can access each host from bootstrap machine: +Collect the external IP address of the `tcp` service: ```bash -HOSTS=(${ETCD0} ${ETCD1} ${ETCD2}) -for i in "${!HOSTS[@]}"; do - HOST=${HOSTS[$i]} - ssh ${USER}@${HOST} -t 'hostname'; -done +TENANT_ADDR=$(kubectl -n ${TENANT_NAMESPACE} get svc ${TENANT_NAME} -o json | jq -r ."spec.loadBalancerIP") ``` -### Configure disk layout -As per `etcd` [requirements](https://etcd.io/docs/v3.5/op-guide/hardware/#disks), back `etcd`’s storage with a SSD. A SSD usually provides lower write latencies and with less variance than a spinning disk, thus improving the stability and reliability of `etcd`. 
- -For each `etcd` machine, we assume an additional `sdb` disk of 10GB: +and check it out: +```bash +curl -k https://${TENANT_ADDR}:${TENANT_PORT}/healthz +curl -k https://${TENANT_ADDR}:${TENANT_PORT}/version ``` -clastix@kamaji-etcd-00:~$ lsblk -NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT -sda 8:0 0 16G 0 disk -├─sda1 8:1 0 15.9G 0 part / -├─sda14 8:14 0 4M 0 part -└─sda15 8:15 0 106M 0 part /boot/efi -sdb 8:16 0 10G 0 disk -sr0 11:0 1 4M 0 rom -``` - -Create partition, format, and mount the `etcd` disk, by running the script below from the bootstrap machine: -> If you already used the `etcd` disks, please make sure to wipe the partitions with `sudo wipefs --all --force /dev/sdb` before to attempt to recreate them. +The `kubeconfig` required to access the Tenant Control Plane is stored in a secret: ```bash -for i in "${!ETCDHOSTS[@]}"; do - HOST=${ETCDHOSTS[$i]} - ssh ${USER}@${HOST} -t 'echo type=83 | sudo sfdisk -f -q /dev/sdb' - ssh ${USER}@${HOST} -t 'sudo mkfs -F -q -t ext4 /dev/sdb1' - ssh ${USER}@${HOST} -t 'sudo mkdir -p /var/lib/etcd' - ssh ${USER}@${HOST} -t 'sudo e2label /dev/sdb1 ETCD' - ssh ${USER}@${HOST} -t 'echo LABEL=ETCD /var/lib/etcd ext4 defaults 0 1 | sudo tee -a /etc/fstab' - ssh ${USER}@${HOST} -t 'sudo mount -a' - ssh ${USER}@${HOST} -t 'sudo lsblk -f' -done +kubectl get secrets -n ${TENANT_NAMESPACE} ${TENANT_NAME}-admin-kubeconfig -o json \ + | jq -r '.data["admin.conf"]' \ + | base64 -d \ + > ${TENANT_NAMESPACE}-${TENANT_NAME}.kubeconfig ``` -### Install prerequisites -Use bash script `nodes-prerequisites.sh` to install all the dependencies on all the cluster nodes: - -- Install `containerd` as container runtime -- Install `crictl`, the command line for working with `containerd` -- Install `kubectl`, `kubelet`, and `kubeadm` in the desired version, eg. 
`v1.24.0` - -Run the installation script: +and let's check it out: ```bash -VERSION=v1.24.0 -./nodes-prerequisites.sh ${VERSION:1} ${HOSTS[@]} -``` +kubectl --kubeconfig=${TENANT_NAMESPACE}-${TENANT_NAME}.kubeconfig cluster-info -### Configure kubelet +Kubernetes control plane is running at https://192.168.32.240:6443 +CoreDNS is running at https://192.168.32.240:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy +``` -On each `etcd` node, configure the `kubelet` service to start `etcd` static pods using `containerd` as container runtime, by running the script below from the bootstrap machine: +Check out how the Tenant control Plane advertises itself to workloads: ```bash -cat << EOF > 20-etcd-service-manager.conf -[Service] -ExecStart= -ExecStart=/usr/bin/kubelet --address=127.0.0.1 --pod-manifest-path=/etc/kubernetes/manifests --cgroup-driver=systemd --container-runtime=remote --container-runtime-endpoint=/run/containerd/containerd.sock -Restart=always -EOF -``` +kubectl --kubeconfig=${TENANT_NAMESPACE}-${TENANT_NAME}.kubeconfig get svc +NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE +default kubernetes ClusterIP 10.32.0.1 443/TCP 6m ``` -for i in "${!ETCDHOSTS[@]}"; do - HOST=${ETCDHOSTS[$i]} - scp 20-etcd-service-manager.conf ${USER}@${HOST}: - ssh ${USER}@${HOST} -t 'sudo chown -R root:root 20-etcd-service-manager.conf && sudo mv 20-etcd-service-manager.conf /etc/systemd/system/kubelet.service.d/20-etcd-service-manager.conf' - ssh ${USER}@${HOST} -t 'sudo systemctl daemon-reload' - ssh ${USER}@${HOST} -t 'sudo systemctl start kubelet' - ssh ${USER}@${HOST} -t 'sudo systemctl enable kubelet' -done -rm -f 20-etcd-service-manager.conf +```bash +kubectl --kubeconfig=${TENANT_NAMESPACE}-${TENANT_NAME}.kubeconfig get ep + +NAME ENDPOINTS AGE +kubernetes 192.168.32.240:6443 18m ``` -### Create configuration -Create temp directories to store files that will end up on `etcd` hosts: +And make sure it is `${TENANT_ADDR}:${TENANT_PORT}`. -```bash -mkdir -p /tmp/${ETCD0}/ /tmp/${ETCD1}/ /tmp/${ETCD2}/ -NAMES=("etcd00" "etcd01" "etcd02") - -for i in "${!ETCDHOSTS[@]}"; do -HOST=${ETCDHOSTS[$i]} -NAME=${NAMES[$i]} - -cat < Note: -> -> ##### Etcd compaction -> -> By enabling `etcd` authentication, it prevents the tenant apiservers (clients of `etcd`) to issue compaction requests. We set `etcd` to automatically compact the keyspace with the `--auto-compaction-*` option with a period of hours or minutes. When `--auto-compaction-mode=periodic` and `--auto-compaction-retention=5m` and writes per minute are about 1000, `etcd` compacts revision 5000 for every 5 minute. -> -> ##### Etcd storage quota -> -> Currently, `etcd` is limited in storage size, defaulted to `2GB` and configurable with `--quota-backend-bytes` flag up to `8GB`. In Kamaji, we use a single `etcd` to store multiple tenant clusters, so we need to increase this size. Please, note `etcd` warns at startup if the configured value exceeds `8GB`. +### Preparing Worker Nodes to join -### Generate certificates -On the bootstrap machine, using `kubeadm` init phase, create and distribute `etcd` CA certificates: +Currently Kamaji does not provide any helper for creation of tenant worker nodes. You should get a set of machines from your infrastructure provider, turn them into worker nodes, and then join to the tenant control plane with the `kubeadm`. In the future, we'll provide integration with Cluster APIs and other IaC tools. 
-```bash -sudo kubeadm init phase certs etcd-ca -mkdir kamaji -sudo cp -r /etc/kubernetes/pki/etcd kamaji -sudo chown -R ${USER}. kamaji/etcd -``` +Use bash script `nodes-prerequisites.sh` to install the dependencies on all the worker nodes: -For each `etcd` host: +- Install `containerd` as container runtime +- Install `crictl`, the command line for working with `containerd` +- Install `kubectl`, `kubelet`, and `kubeadm` in the desired version -```bash -for i in "${!ETCDHOSTS[@]}"; do - HOST=${ETCDHOSTS[$i]} - sudo kubeadm init phase certs etcd-server --config=/tmp/${HOST}/kubeadmcfg.yaml - sudo kubeadm init phase certs etcd-peer --config=/tmp/${HOST}/kubeadmcfg.yaml - sudo kubeadm init phase certs etcd-healthcheck-client --config=/tmp/${HOST}/kubeadmcfg.yaml - sudo cp -R /etc/kubernetes/pki /tmp/${HOST}/ - sudo find /etc/kubernetes/pki -not -name ca.crt -not -name ca.key -type f -delete -done -``` +> Warning: we assume worker nodes are machines running `Ubuntu 20.04` -### Startup the cluster -Upload certificates on each `etcd` node and restart the `kubelet` +Run the installation script: ```bash -for i in "${!ETCDHOSTS[@]}"; do - HOST=${ETCDHOSTS[$i]} - sudo chown -R ${USER}. /tmp/${HOST} - scp -r /tmp/${HOST}/* ${USER}@${HOST}: - ssh ${USER}@${HOST} -t 'sudo chown -R root:root pki' - ssh ${USER}@${HOST} -t 'sudo mv pki /etc/kubernetes/' - ssh ${USER}@${HOST} -t 'sudo kubeadm init phase etcd local --config=kubeadmcfg.yaml' - ssh ${USER}@${HOST} -t 'sudo systemctl daemon-reload' - ssh ${USER}@${HOST} -t 'sudo systemctl restart kubelet' -done +HOSTS=(${WORKER0} ${WORKER1} ${WORKER2}) +./nodes-prerequisites.sh ${TENANT_VERSION:1} ${HOSTS[@]} ``` -This will start the static `etcd` pod on each node and then the cluster gets formed. - -Generate certificates for the `root` user +### Join Command -```bash -cat > root-csr.json < 25s v1.23.5 +tenant-00-worker-01 NotReady 17s v1.23.5 +tenant-00-worker-02 NotReady 9s v1.23.5 ``` -The result should be something like this: +The cluster needs a [CNI](https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/network-plugins/) plugin to get the nodes ready. In our case, we are going to install [calico](https://projectcalico.docs.tigera.io/about/about-calico). -``` -+------------------+---------+--------+----------------------------+----------------------------+------------+ -| ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS | IS LEARNER | -+------------------+---------+--------+----------------------------+----------------------------+------------+ -| 72657d6307364226 | started | etcd01 | https://192.168.32.11:2380 | https://192.168.32.11:2379 | false | -| 91eb892c5ee87610 | started | etcd00 | https://192.168.32.10:2380 | https://192.168.32.10:2379 | false | -| e9971c576949c34e | started | etcd02 | https://192.168.32.12:2380 | https://192.168.32.12:2379 | false | -+------------------+---------+--------+----------------------------+----------------------------+------------+ +```bash +kubectl apply -f calico-cni/calico-crd.yaml --kubeconfig=${TENANT_NAMESPACE}-${TENANT_NAME}.kubeconfig +kubectl apply -f calico-cni/calico.yaml --kubeconfig=${TENANT_NAMESPACE}-${TENANT_NAME}.kubeconfig ``` -### Enable multi-tenancy -The `root` user has full access to `etcd`, must be created before activating authentication. The `root` user must have the `root` role and is allowed to change anything inside `etcd`. +And after a while, `kube-system` pods will be running. 
```bash -etcdctl user add --no-password=true root -etcdctl role add root -etcdctl user grant-role root root -etcdctl auth enable +kubectl --kubeconfig=${TENANT_NAMESPACE}-${TENANT_NAME}.kubeconfig get pods -n kube-system + +NAME READY STATUS RESTARTS AGE +calico-kube-controllers-8594699699-dlhbj 1/1 Running 0 3m +calico-node-kxf6n 1/1 Running 0 3m +calico-node-qtdlw 1/1 Running 0 3m +coredns-64897985d-2v5lc 1/1 Running 0 5m +coredns-64897985d-nq276 1/1 Running 0 5m +kube-proxy-cwdww 1/1 Running 0 3m +kube-proxy-m48v4 1/1 Running 0 3m ``` -### Cleanup -If you want to get rid of the etcd cluster, for each node, login and clean it: +And the nodes will be ready ```bash -HOSTS=(${ETCD0} ${ETCD1} ${ETCD2}) -for i in "${!HOSTS[@]}"; do - HOST=${HOSTS[$i]} - ssh ${USER}@${HOST} -t 'sudo kubeadm reset -f'; - ssh ${USER}@${HOST} -t 'sudo systemctl reboot'; -done +kubectl get nodes --kubeconfig=${TENANT_NAMESPACE}-${TENANT_NAME}.kubeconfig +NAME STATUS ROLES AGE VERSION +tenant-00-worker-00 Ready 2m48s v1.23.5 +tenant-00-worker-01 Ready 2m40s v1.23.5 +tenant-00-worker-02 Ready 2m32s v1.23.5 ``` -## Setup internal multi-tenant etcd -If you opted for an internal etcd cluster running in the Kamaji admin cluster, follow steps below. +## Smoke test -From the bootstrap machine load the environment for internal `etcd` setup: +The tenant cluster is now ready to accept workloads. + +Export its `kubeconfig` file ```bash -source kamaji-internal-etcd.env +export KUBECONFIG=${TENANT_NAMESPACE}-${TENANT_NAME}.kubeconfig ``` -### Generate certificates -On the bootstrap machine, using `kubeadm` init phase, create the `etcd` CA certificates: +#### Deployment +Deploy a `nginx` application on the tenant cluster ```bash -sudo kubeadm init phase certs etcd-ca -mkdir kamaji -sudo cp -r /etc/kubernetes/pki/etcd kamaji -sudo chown -R ${USER}. 
kamaji/etcd -``` - -Generate the `etcd` certificates for peers: - -``` -cat << EOF | tee kamaji/etcd/peer-csr.json -{ - "CN": "etcd", - "key": { - "algo": "rsa", - "size": 2048 - }, - "hosts": [ - "127.0.0.1", - "etcd-0", - "etcd-0.etcd", - "etcd-0.etcd.${ETCD_NAMESPACE}.svc", - "etcd-0.etcd.${ETCD_NAMESPACE}.svc.cluster.local", - "etcd-1", - "etcd-1.etcd", - "etcd-1.etcd.${ETCD_NAMESPACE}.svc", - "etcd-1.etcd.${ETCD_NAMESPACE}.svc.cluster.local", - "etcd-2", - "etcd-2.etcd", - "etcd-2.etcd.${ETCD_NAMESPACE}.svc", - "etcd-2.etcd.${ETCD_NAMESPACE}.cluster.local" - ] -} -EOF - -cfssl gencert -ca=kamaji/etcd/ca.crt -ca-key=kamaji/etcd/ca.key \ - -config=cfssl-cert-config.json \ - -profile=peer-authentication kamaji/etcd/peer-csr.json | cfssljson -bare kamaji/etcd/peer - +kubectl create deployment nginx --image=nginx ``` -Generate the `etcd` certificates for server: +and check the `nginx` pod gets scheduled -``` -cat << EOF | tee kamaji/etcd/server-csr.json -{ - "CN": "etcd", - "key": { - "algo": "rsa", - "size": 2048 - }, - "hosts": [ - "127.0.0.1", - "etcd-server", - "etcd-server.${ETCD_NAMESPACE}.svc", - "etcd-server.${ETCD_NAMESPACE}.svc.cluster.local", - "etcd-0.etcd.${ETCD_NAMESPACE}.svc.cluster.local", - "etcd-1.etcd.${ETCD_NAMESPACE}.svc.cluster.local", - "etcd-2.etcd.${ETCD_NAMESPACE}.svc.cluster.local" - ] -} -EOF +```bash +kubectl get pods -o wide -cfssl gencert -ca=kamaji/etcd/ca.crt -ca-key=kamaji/etcd/ca.key \ - -config=cfssl-cert-config.json \ - -profile=peer-authentication kamaji/etcd/server-csr.json | cfssljson -bare kamaji/etcd/server +NAME READY STATUS RESTARTS AGE IP NODE +nginx-6799fc88d8-4sgcb 1/1 Running 0 33s 172.12.121.1 worker02 ``` -Generate certificates for the `root` user of the `etcd` +#### Port Forwarding +Verify the ability to access applications remotely using port forwarding. 
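+Before forwarding, you can optionally wait for the pod to report `Ready` (the `app=nginx` label is set by `kubectl create deployment`):
+
+```bash
+# Wait until the nginx pod is Ready, up to two minutes
+kubectl wait pod -l app=nginx --for=condition=Ready --timeout=120s
+```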
-``` -cat << EOF | tee kamaji/etcd/root-csr.json -{ - "CN": "root", - "key": { - "algo": "rsa", - "size": 2048 - } -} -EOF +Retrieve the full name of the `nginx` pod: -cfssl gencert -ca=kamaji/etcd/ca.crt -ca-key=kamaji/etcd/ca.key \ - -config=cfssl-cert-config.json \ - -profile=client-authentication kamaji/etcd/root-csr.json | cfssljson -bare kamaji/etcd/root +```bash +POD_NAME=$(kubectl get pods -l app=nginx -o jsonpath="{.items[0].metadata.name}") ``` -Install the `etcd` in the Kamaji admin cluster +Forward port 8080 on your local machine to port 80 of the `nginx` pod: ```bash -kubectl create namespace ${ETCD_NAMESPACE} - -kubectl -n ${ETCD_NAMESPACE} create secret generic etcd-certs \ - --from-file=kamaji/etcd/ca.crt \ - --from-file=kamaji/etcd/ca.key \ - --from-file=kamaji/etcd/peer-key.pem --from-file=kamaji/etcd/peer.pem \ - --from-file=kamaji/etcd/server-key.pem --from-file=kamaji/etcd/server.pem +kubectl port-forward $POD_NAME 8080:80 -kubectl -n ${ETCD_NAMESPACE} apply -f etcd/etcd-cluster.yaml +Forwarding from 127.0.0.1:8080 -> 80 +Forwarding from [::1]:8080 -> 80 ``` -Install an `etcd` client to interact with the `etcd` server +In a new terminal make an HTTP request using the forwarding address: ```bash -kubectl -n ${ETCD_NAMESPACE} create secret tls root-client-certs \ - --key=kamaji/etcd/root-key.pem \ - --cert=kamaji/etcd/root.pem +curl --head http://127.0.0.1:8080 -kubectl -n ${ETCD_NAMESPACE} apply -f etcd/etcd-client.yaml +HTTP/1.1 200 OK +Server: nginx/1.21.0 +Date: Sat, 19 Jun 2021 08:19:01 GMT +Content-Type: text/html +Content-Length: 612 +Last-Modified: Tue, 25 May 2021 12:28:56 GMT +Connection: keep-alive +ETag: "60aced88-264" +Accept-Ranges: bytes ``` -Wait the etcd instances discover each other and the cluster is formed: - -```bash -kubectl -n ${ETCD_NAMESPACE} wait pod --for=condition=ready -l app=etcd --timeout=120s -echo -n "\nChecking endpoint's health..." - -kubectl -n ${ETCD_NAMESPACE} exec etcd-root-client -- /bin/bash -c "etcdctl endpoint health 1>/dev/null 2>/dev/null; until [ \$$? -eq 0 ]; do sleep 10; printf "."; etcdctl endpoint health 1>/dev/null 2>/dev/null; done;" -echo -n "\netcd cluster's health:\n" +Switch back to the previous terminal and stop the port forwarding to the `nginx` pod. -kubectl -n ${ETCD_NAMESPACE} exec etcd-root-client -- /bin/bash -c "etcdctl endpoint health" -echo -n "\nWaiting for all members..." +#### Logs +Verify the ability to retrieve container logs. -kubectl -n ${ETCD_NAMESPACE} exec etcd-root-client -- /bin/bash -c "until [ \$$(etcdctl member list 2>/dev/null | wc -l) -eq 3 ]; do sleep 10; printf '.'; done;" - @echo -n "\netcd's members:\n" +Print the `nginx` pod logs: -kubectl -n ${ETCD_NAMESPACE} exec etcd-root-client -- /bin/bash -c "etcdctl member list -w table" +```bash +kubectl logs $POD_NAME +... +127.0.0.1 - - [19/Jun/2021:08:19:01 +0000] "HEAD / HTTP/1.1" 200 0 "-" "curl/7.68.0" "-" ``` -### Enable multi-tenancy -The `root` user has full access to `etcd`, must be created before activating authentication. The `root` user must have the `root` role and is allowed to change anything inside `etcd`. +#### Kubelet tunnel +Verify the ability to execute commands in a container. 
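+Like the logs check above, `kubectl exec` requires the tenant control plane to reach the kubelet on the worker node. As an optional extra check, you can probe a kubelet through the API server proxy; replace the node name with one of your workers:
+
+```bash
+# Returns "ok" when the tenant API server can reach the kubelet
+kubectl get --raw "/api/v1/nodes/tenant-00-worker-00/proxy/healthz"
+```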
+ +Print the `nginx` version by executing the `nginx -v` command in the `nginx` container: ```bash -kubectl -n ${ETCD_NAMESPACE} exec etcd-root-client -- etcdctl user add --no-password=true root -kubectl -n ${ETCD_NAMESPACE} exec etcd-root-client -- etcdctl role add root -kubectl -n ${ETCD_NAMESPACE} exec etcd-root-client -- etcdctl user grant-role root root -kubectl -n ${ETCD_NAMESPACE} exec etcd-root-client -- etcdctl auth enable +kubectl exec -ti $POD_NAME -- nginx -v +nginx version: nginx/1.21.0 ``` +#### Services +Verify the ability to expose applications using a service. -## Install Kamaji controller -Currently, the behaviour of the Kamaji controller for Tenant Control Plane is controlled by (in this order): +Expose the `nginx` deployment using a `NodePort` service: -- CLI flags -- Environment variables -- Configuration file `kamaji.yaml` built into the image - -By default Kamaji search for the configuration file and uses parameters found inside of it. In case some environment variable are passed, this will override configuration file parameters. In the end, if also a CLI flag is passed, this will override both env vars and config file as well. - -There are multiple ways to deploy the Kamaji controller: +```bash +kubectl expose deployment nginx --port 80 --type NodePort +``` -- Use the single YAML file installer -- Use Kustomize with Makefile -- Use the Kamaji Helm Chart +Retrieve the node port assigned to the `nginx` service: -The Kamaji controller needs to access the multi-tenant `etcd` in order to provision the access for tenant `kube-apiserver`. +```bash +NODE_PORT=$(kubectl get svc nginx \ + --output=jsonpath='{range .spec.ports[0]}{.nodePort}') +``` -Create the secrets containing the `etcd` certificates +Retrieve the IP address of a worker instance and make an HTTP request: ```bash -kubectl create namespace kamaji-system -kubectl -n kamaji-system create secret generic etcd-certs \ - --from-file=kamaji/etcd/ca.crt \ - --from-file=kamaji/etcd/ca.key +curl -I http://${WORKER0}:${NODE_PORT} -kubectl -n kamaji-system create secret tls root-client-certs \ - --cert=kamaji/etcd/root.crt \ - --key=kamaji/etcd/root.key +HTTP/1.1 200 OK +Server: nginx/1.21.0 +Date: Sat, 19 Jun 2021 09:29:01 GMT +Content-Type: text/html +Content-Length: 612 +Last-Modified: Tue, 25 May 2021 12:28:56 GMT +Connection: keep-alive +ETag: "60aced88-264" +Accept-Ranges: bytes ``` -### Install with a single manifest -Install with the single YAML file installer: +## Cleanup +Remove the worker nodes joined the tenant control plane ```bash -kubectl -n kamaji-system apply -f ../config/install.yaml +kubectl delete nodes --all --kubeconfig=${TENANT_NAMESPACE}-${TENANT_NAME}.kubeconfig ``` -Make sure to patch the `etcd` endpoints of the Kamaji controller, according to your environment: +For each worker node, login and clean it ```bash -cat > patch-deploy.yaml < ${TENANT_NAMESPACE}-${TENANT_NAME}-tcp.yaml < ${TENANT_NAMESPACE}-${TENANT_NAME}-scheduler.kubeconfig -``` - -```bash -kubectl get secrets -n ${TENANT_NAMESPACE} ${TENANT_NAME}-controller-manager-kubeconfig -o json \ - | jq -r '.data["controller-manager.conf"]' \ - | base64 -d \ - > ${TENANT_NAMESPACE}-${TENANT_NAME}-controller-manager.kubeconfig -``` - -## Working with Tenant Control Plane - -A new Tenant cluster will be available at this moment but, it will not be useful without having worker nodes joined to it. - -### Getting Tenant Control Plane Kubeconfig - -Let's retrieve the `kubeconfig` in order to work with the tenant control plane. 
- -```bash -kubectl get secrets -n ${TENANT_NAMESPACE} ${TENANT_NAME}-admin-kubeconfig -o json \ - | jq -r '.data["admin.conf"]' \ - | base64 -d \ - > ${TENANT_NAMESPACE}-${TENANT_NAME}.kubeconfig -``` - -and let's check it out: - -```bash -kubectl --kubeconfig=${TENANT_NAMESPACE}-${TENANT_NAME}.kubeconfig get svc - -NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE -default kubernetes ClusterIP 10.32.0.1 443/TCP 6m -``` - -Check out how the Tenant control Plane advertises itself to workloads: - -```bash -kubectl --kubeconfig=${TENANT_NAMESPACE}-${TENANT_NAME}.kubeconfig get ep - -NAME ENDPOINTS AGE -kubernetes 192.168.32.150:6443 18m -``` - -Make sure it's `${TENANT_ADDR}:${TENANT_PORT}`. - -### Preparing Worker Nodes to join - -Currently Kamaji does not provide any helper for creation of tenant worker nodes. You should get a set of machines from your infrastructure provider, turn them into worker nodes, and then join to the tenant control plane with the `kubeadm`. In the future, we'll provide integration with Cluster APIs and other IaC tools. - -Use bash script `nodes-prerequisites.sh` to install the dependencies on all the worker nodes: - -- Install `containerd` as container runtime -- Install `crictl`, the command line for working with `containerd` -- Install `kubectl`, `kubelet`, and `kubeadm` in the desired version - -> Warning: we assume worker nodes are machines running `Ubuntu 20.04` - -Run the installation script: - -```bash -HOSTS=(${WORKER0} ${WORKER1} ${WORKER2} ${WORKER3}) -./nodes-prerequisites.sh ${TENANT_VERSION:1} ${HOSTS[@]} -``` - -### Join Command - -The current approach for joining nodes is to use the kubeadm one therefore, we will create a bootstrap token to perform the action. In order to facilitate the step, we will store the entire command of joining in a variable. - -```bash -JOIN_CMD=$(echo "sudo ")$(kubeadm --kubeconfig=${TENANT_NAMESPACE}-${TENANT_NAME}.kubeconfig token create --print-join-command) -``` - -### Adding Worker Nodes - -A bash loop will be used to join all the available nodes. - -```bash -HOSTS=(${WORKER0} ${WORKER1} ${WORKER2} ${WORKER3}) -for i in "${!HOSTS[@]}"; do - HOST=${HOSTS[$i]} - ssh ${USER}@${HOST} -t ${JOIN_CMD}; -done -``` - -Checking the nodes: - -```bash -kubectl --kubeconfig=${TENANT_NAMESPACE}-${TENANT_NAME}.kubeconfig get nodes - -NAME STATUS ROLES AGE VERSION -kamaji-tenant-worker-00 NotReady 1m v1.23.4 -kamaji-tenant-worker-01 NotReady 1m v1.23.4 -kamaji-tenant-worker-02 NotReady 1m v1.23.4 -kamaji-tenant-worker-03 NotReady 1m v1.23.4 -``` - -The cluster needs a [CNI](https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/network-plugins/) plugin to get the nodes ready. In our case, we are going to install [calico](https://projectcalico.docs.tigera.io/about/about-calico). - -```bash -kubectl apply -f calico-cni/calico-crd.yaml --kubeconfig=${TENANT_NAMESPACE}-${TENANT_NAME}.kubeconfig -kubectl apply -f calico-cni/calico.yaml --kubeconfig=${TENANT_NAMESPACE}-${TENANT_NAME}.kubeconfig -``` - -And after a while, `kube-system` pods will be running. 
- -```bash -kubectl --kubeconfig=${TENANT_NAMESPACE}-${TENANT_NAME}.kubeconfig get pods -n kube-system - -NAME READY STATUS RESTARTS AGE -calico-kube-controllers-8594699699-dlhbj 1/1 Running 0 3m -calico-node-kxf6n 1/1 Running 0 3m -calico-node-qtdlw 1/1 Running 0 3m -coredns-64897985d-2v5lc 1/1 Running 0 5m -coredns-64897985d-nq276 1/1 Running 0 5m -kube-proxy-cwdww 1/1 Running 0 3m -kube-proxy-m48v4 1/1 Running 0 3m -``` - -And the nodes will be ready - -```bash -kubectl get nodes --kubeconfig=${TENANT_NAMESPACE}-${TENANT_NAME}.kubeconfig - -NAME STATUS ROLES AGE VERSION -kamaji-tenant-worker-01 Ready 10m v1.23.4 -kamaji-tenant-worker-02 Ready 10m v1.23.4 -``` - -## Smoke test - -The tenant cluster is now ready to accept workloads. - -Export its `kubeconfig` file - -```bash -export KUBECONFIG=${TENANT_NAMESPACE}-${TENANT_NAME}.kubeconfig -``` - -#### Deployment -Deploy a `nginx` application on the tenant cluster - -```bash -kubectl create deployment nginx --image=nginx -``` - -and check the `nginx` pod gets scheduled - -```bash -kubectl get pods -o wide - -NAME READY STATUS RESTARTS AGE IP NODE -nginx-6799fc88d8-4sgcb 1/1 Running 0 33s 172.12.121.1 worker02 -``` - -#### Port Forwarding -Verify the ability to access applications remotely using port forwarding. - -Retrieve the full name of the `nginx` pod: - -```bash -POD_NAME=$(kubectl get pods -l app=nginx -o jsonpath="{.items[0].metadata.name}") -``` - -Forward port 8080 on your local machine to port 80 of the `nginx` pod: - -```bash -kubectl port-forward $POD_NAME 8080:80 - -Forwarding from 127.0.0.1:8080 -> 80 -Forwarding from [::1]:8080 -> 80 -``` - -In a new terminal make an HTTP request using the forwarding address: - -```bash -curl --head http://127.0.0.1:8080 - -HTTP/1.1 200 OK -Server: nginx/1.21.0 -Date: Sat, 19 Jun 2021 08:19:01 GMT -Content-Type: text/html -Content-Length: 612 -Last-Modified: Tue, 25 May 2021 12:28:56 GMT -Connection: keep-alive -ETag: "60aced88-264" -Accept-Ranges: bytes -``` - -Switch back to the previous terminal and stop the port forwarding to the `nginx` pod. - -#### Logs -Verify the ability to retrieve container logs. - -Print the `nginx` pod logs: - -```bash -kubectl logs $POD_NAME -... -127.0.0.1 - - [19/Jun/2021:08:19:01 +0000] "HEAD / HTTP/1.1" 200 0 "-" "curl/7.68.0" "-" -``` - -#### Kubelet tunnel -Verify the ability to execute commands in a container. - -Print the `nginx` version by executing the `nginx -v` command in the `nginx` container: - -```bash -kubectl exec -ti $POD_NAME -- nginx -v -nginx version: nginx/1.21.0 -``` - -#### Services -Verify the ability to expose applications using a service. 
- -Expose the `nginx` deployment using a `NodePort` service: - -```bash -kubectl expose deployment nginx --port 80 --type NodePort -``` - -Retrieve the node port assigned to the `nginx` service: - -```bash -NODE_PORT=$(kubectl get svc nginx \ - --output=jsonpath='{range .spec.ports[0]}{.nodePort}') -``` - -Retrieve the IP address of a worker instance and make an HTTP request: - -```bash -curl -I http://${WORKER0}:${NODE_PORT} - -HTTP/1.1 200 OK -Server: nginx/1.21.0 -Date: Sat, 19 Jun 2021 09:29:01 GMT -Content-Type: text/html -Content-Length: 612 -Last-Modified: Tue, 25 May 2021 12:28:56 GMT -Connection: keep-alive -ETag: "60aced88-264" -Accept-Ranges: bytes -``` - -## Cleanup Tenant cluster -Remove the worker nodes joined the tenant control plane - -```bash -kubectl delete nodes --all --kubeconfig=${TENANT_NAMESPACE}-${TENANT_NAME}.kubeconfig -``` - -For each worker node, login and clean it - -```bash -HOSTS=(${WORKER0} ${WORKER1} ${WORKER2} ${WORKER3}) -for i in "${!HOSTS[@]}"; do - HOST=${HOSTS[$i]} - ssh ${USER}@${HOST} -t 'sudo kubeadm reset -f'; - ssh ${USER}@${HOST} -t 'sudo rm -rf /etc/cni/net.d'; - ssh ${USER}@${HOST} -t 'sudo systemctl reboot'; -done -``` - -Delete the tenant control plane from kamaji - -```bash -kubectl delete -f ${TENANT_NAMESPACE}-${TENANT_NAME}-tcp.yaml -``` diff --git a/docs/reference.md b/docs/reference.md index 3f001e5e..53d85cd1 100644 --- a/docs/reference.md +++ b/docs/reference.md @@ -1,12 +1,12 @@ ## Configuration -Currently **kamaji** supports (in this order): +Currently **Kamaji** supports (in this order): * CLI flags * Environment variables * Configuration files -By default **kamaji** search for the configuration file (`kamaji.yaml`) and uses parameters found inside of it. In case some environment variable are passed, this will override configuration file parameters. In the end, if also a CLI flag is passed, this will override both env vars and config file as well. +By default **Kamaji** search for the configuration file (`kamaji.yaml`) and uses parameters found inside of it. In case some environment variable are passed, this will override configuration file parameters. In the end, if also a CLI flag is passed, this will override both env vars and config file as well. This is easily explained in this way: @@ -92,12 +92,13 @@ $ make yaml-installation-file It will generate a yaml installation file at `config/install.yaml`. It should be customize accordingly. - ## Tenant Control Planes +**Kamaji** offers a [CRD](https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/) to provide a declarative approach of managing tenant control planes. This *CRD* is called `TenantControlPlane`, or `tcp` in short. Use the command `kubectl explain tcp.spec` to understand the fields and their usage. + ### Add-ons -Kamaji provides optional installations into the deployed tenant control plane through add-ons. Is it possible to enable/disable them through the `tcp` definition. +**Kamaji** provides optional installations into the deployed tenant control plane through add-ons. Is it possible to enable/disable them through the `tcp` definition. 
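+To discover which add-ons are available and which fields they accept, you can inspect the CRD directly (assuming the `TenantControlPlane` CRD is already installed in the admin cluster):
+
+```bash
+kubectl explain tcp.spec.addons
+```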
### Core DNS @@ -117,15 +118,15 @@ addons: ```yaml addons: - konnectivity: - proxyPort: 31132 # mandatory - version: v0.0.31 - resources: - requests: - cpu: 100m - memory: 128Mi - limits: - cpu: 100m - memory: 128Mi - serverImage: us.gcr.io/k8s-artifacts-prod/kas-network-proxy/proxy-server - agentImage: us.gcr.io/k8s-artifacts-prod/kas-network-proxy/proxy-agent + konnectivity: + proxyPort: 31132 # mandatory + version: v0.0.31 + resources: + requests: + cpu: 100m + memory: 128Mi + limits: + cpu: 100m + memory: 128Mi + serverImage: us.gcr.io/k8s-artifacts-prod/kas-network-proxy/proxy-server + agentImage: us.gcr.io/k8s-artifacts-prod/kas-network-proxy/proxy-agent diff --git a/docs/versioning.md b/docs/versioning.md new file mode 100644 index 00000000..3a5d2bc8 --- /dev/null +++ b/docs/versioning.md @@ -0,0 +1,8 @@ +# Versioning and support +In Kamaji, there are different components that might require independent versioning and support level: + +|Kamaji|Admin Cluster|Tenant Cluster (min)|Tenant Cluster (max)|Konnectivity|Tenant etcd | +|------|-------------|--------------------|--------------------|------------|------------| +|0.0.1 |1.22.0+ |1.21.0 |1.23.x |0.0.32 |3.5.4 | + +Other combinations might work but have not been tested. \ No newline at end of file
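+To check which combination you are actually running, compare the admin and tenant API server versions. The kubeconfig name below follows the getting-started guide, and `jq` is assumed to be available:
+
+```bash
+# Admin cluster API server version
+kubectl version -o json | jq -r '.serverVersion.gitVersion'
+
+# Tenant cluster API server version
+kubectl --kubeconfig=${TENANT_NAMESPACE}-${TENANT_NAME}.kubeconfig version -o json | jq -r '.serverVersion.gitVersion'
+```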