This repository guides you through deploying a private GKE cluster and provides a base platform for hands-on exploration of several GKE-related topics that leverage or integrate with that infrastructure. After completing the exercises in all topic areas, you will have a deeper understanding of several core components of GKE and GCP as configured in an enterprise environment.
To follow this guide successfully:
- Install the prerequisite tools.
- Deploy the base GKE Cluster in a project of your choosing.
- Proceed to the guided demos section to learn more about each topic area via hands-on instruction.
Additional topics will be added as they are integrated into this demo structure, so check back often.
Note: when you clone this repo, specify --recursive to pull down dependencies (i.e., submodules).
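As a minimal sketch of the clone step, the helpers below wrap the two git invocations involved. The function names are hypothetical, and the repository URL is left as an argument because it is not stated here. The second helper covers the case where a checkout was made without --recursive and the submodules need to be fetched afterwards.

```shell
# Hypothetical helpers; the names are illustrative and not part of this repo.

# Clone a repository together with its submodules.
# $1 = repository URL (placeholder; substitute the real URL)
clone_with_submodules() {
  git clone --recursive "$1"
}

# Fetch submodules for a checkout that was cloned without --recursive.
# $1 = path to the existing checkout
update_submodules() {
  git -C "$1" submodule update --init --recursive
}
```

For example, update_submodules . run from the root of an existing checkout should bring the submodules in line with what a recursive clone would have produced.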
The gke-tf
CLI tool in combination with the gke-tf-demo.yaml
configuration file will generate the necessary terraform
infrastructure-as-code in the ./terraform
directory. Within the GCP project where you have Project Owner
permissions, the generated terraform
will be used to manage the lifecycle of all the required resources. This includes the VPC networks, firewall rules, subnets, service accounts, IAM roles, GCE instances, and the GKE Cluster.
Note that this regional GKE cluster is configured as a private GKE cluster, so a dedicated "bastion" host GCE instance is provided to protect the GKE API from the open Internet. Accessing the GKE API requires first running an SSH tunnel to the bastion host while forwarding a local port (8888
). The GKE worker nodes have egress access via a Cloud NAT instance to be able to pull container images and other assets as needed.
Click the button below to run the demo in a Google Cloud Shell.
When using Cloud Shell, run the following command to set up the gcloud CLI. When prompted, set your region and zone.
gcloud init
- gke-tf for your architecture in your
$PATH
Move on to the Tools section for installation instructions.
- A Google Cloud Platform project where you have
Project Owner
permissions to create VPC networks, service accounts, IAM Roles, GKE clusters, and more.
- bash or a bash-compatible shell
- Google Cloud SDK version >= 244.0.0
- kubectl matching the latest GKE version.
- gke-tf for your architecture in your
$PATH
- Terraform >= 0.12.3
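The prerequisites above can be checked with a short script before starting. The following is a sketch only: the function names missing_tools and version_ge are hypothetical, and the version comparison assumes a sort that supports the -V (version sort) option, as GNU coreutils does.

```shell
# Hypothetical preflight helpers; names are illustrative.

# Print the name of every tool in the argument list that is not on $PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Succeed when version $1 is greater than or equal to version $2.
# Assumes `sort -V` is available (GNU coreutils and recent BSD sort).
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}
```

A typical use would be missing_tools gcloud kubectl gke-tf terraform, which prints nothing when all four are installed, followed by a check such as version_ge "$(terraform version | head -n1 | sed 's/^Terraform v//')" 0.12.3.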
The Google Cloud SDK is used to interact with your GCP resources. Installation instructions for multiple platforms are available online.
The kubectl CLI is used to interact with both Kubernetes Engine and Kubernetes in general.
Installation instructions
for multiple platforms are available online. Ensure that you download a version of kubectl
that is equal to or newer than the version of the GKE cluster you are accessing.
The gke-tf
CLI is used for generating the necessary Terraform infrastructure-as-code source files to build the VPC, networks, service accounts, IAM roles, and GKE cluster from a single configuration YAML file. Installation instructions are available online.
Terraform is used to automate the manipulation of cloud infrastructure. Its installation instructions are also available online.
Prior to running this demo, ensure you have authenticated your gcloud client by running the following command:
gcloud auth application-default login
Also, confirm the gcloud
configuration is properly pointing at your desired project. Run gcloud config list
and make sure that compute/zone
, compute/region
and core/project
are populated with values that work for you. You can set their values with the following commands:
# Where the region is us-east1
gcloud config set compute/region us-east1
Updated property [compute/region].
# Where the zone inside the region is us-east1-c
gcloud config set compute/zone us-east1-c
Updated property [compute/zone].
# Where the project id is my-project-id
gcloud config set project my-project-id
Updated property [core/project].
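To confirm that all three properties are populated without reading the gcloud config list output by eye, a check along these lines may help. It is a sketch: check_gcloud_config is a hypothetical name, and the code treats both an empty value and the literal text (unset) as missing, on the assumption that gcloud config get-value reports unset properties in one of those two ways.

```shell
# Hypothetical helper: report any of the three required properties that is unset.
# Returns non-zero if at least one property is missing.
check_gcloud_config() {
  rc=0
  for prop in core/project compute/region compute/zone; do
    value="$(gcloud config get-value "$prop" 2>/dev/null)"
    if [ -z "$value" ] || [ "$value" = "(unset)" ]; then
      echo "unset: $prop"
      rc=1
    fi
  done
  return "$rc"
}
```

Running check_gcloud_config prints one line per missing property and exits with a non-zero status if any are missing, so it can gate later steps in a script.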
The steps below will walk you through using Terraform to deploy a Kubernetes Engine cluster that you will then use for installing test users, applications and RBAC roles.
The Terraform generated by gke-tf
will enable the following Google Cloud Service APIs in the target project:
cloudresourcemanager.googleapis.com
container.googleapis.com
compute.googleapis.com
iam.googleapis.com
logging.googleapis.com
monitoring.googleapis.com
Review the gke-tf-demo.yaml
file in the root of this repository for an understanding of how the GKE Cluster will be configured. You may wish to edit the region:
field to one that is geographically closer to your location. The default is us-central1
unless changed.
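If you prefer to change the region from the command line rather than in an editor, something like the following could work. This is a sketch under a stated assumption: the layout of gke-tf-demo.yaml is not reproduced here, so the code assumes the file contains a single region: key on its own line. The function name set_region is hypothetical.

```shell
# Hypothetical helper: rewrite the value of the `region:` key in a YAML file.
# $1 = path to the config file, $2 = new region (for example us-east1)
# Assumes one `region:` key on its own line; a backup is kept as <file>.bak.
set_region() {
  sed -i.bak "s/^\([[:space:]]*region:\).*/\1 $2/" "$1"
}
```

For example, set_region gke-tf-demo.yaml us-east1 would leave the original file at gke-tf-demo.yaml.bak. Review the result before generating Terraform, since a file with more than one region: key would have every occurrence rewritten.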
With gke-tf
in your $PATH
, generate the Terraform necessary to build the cluster for this demo. The command below will send the generated Terraform files to the terraform
directory inside this repository and use the gke-tf-demo.yaml
as the cluster configuration file input. The GCP project is passed to this command as well.
export PROJECT="$(gcloud config list project --format='value(core.project)')"
gke-tf gen -d ./terraform -f gke-tf-demo.yaml -o -p ${PROJECT}
I0719 16:05:08.219900 57205 gen.go:78]
+-------------------------------------------------------------------+
| __.--/) .-~~ ~~>>>>>>>> .-. gke-tf |
| (._\~ \ ( ~~>>>>>>>>.~.-' |
| -~} \_~-, )~~>>>>>>>' / |
| { ~/ /~~~~~~. _.-~ |
| ~.( '--~~/ /~ ~. |
| .--~~~~_\ \--~( -.-~~-. \ |
| '''-'~~ / / ~-. \ .--~ / |
| (((_.' (((__.' '''-' |
+-------------------------------------------------------------------+
I0719 16:05:08.225777 57205 gen.go:91] Creating terraform for your GKE cluster demo-cluster.
I0719 16:05:08.227777 57205 templates.go:150] Created terraform file: main.tf
I0719 16:05:08.228081 57205 templates.go:150] Created terraform file: network.tf
I0719 16:05:08.228309 57205 templates.go:150] Created terraform file: outputs.tf
I0719 16:05:08.228507 57205 templates.go:150] Created terraform file: variables.tf
I0719 16:05:08.228520 57205 templates.go:153] Finished creating terraform files in: ./terraform
Review the generated Terraform files in the terraform
directory to understand what will be built inside your GCP project. If anything needs modifying, edit the gke-tf-demo.yaml
and re-run the gke-tf gen
command above. The newly generated Terraform files will reflect your changes. You are then ready to proceed to using Terraform to build the cluster and supporting resources.
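Because the gke-tf gen command is likely to be re-run after each configuration change, it can be convenient to wrap it. The sketch below uses the same flags as the command above and adds one guard: it stops early if no project is configured, since an empty value would otherwise be passed to -p. The function name generate_terraform is hypothetical.

```shell
# Hypothetical wrapper around the gke-tf gen command shown earlier.
# Refuses to run when gcloud has no project configured.
generate_terraform() {
  project="$(gcloud config list project --format='value(core.project)' 2>/dev/null)"
  if [ -z "$project" ]; then
    echo "core/project is not set; run: gcloud config set project <project-id>" >&2
    return 1
  fi
  # Same flags as the documented command; the project is quoted to avoid word splitting.
  gke-tf gen -d ./terraform -f gke-tf-demo.yaml -o -p "$project"
}
```

Run generate_terraform from the root of the repository so that the relative paths to ./terraform and gke-tf-demo.yaml resolve.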
Next, apply the terraform configuration with:
cd terraform
terraform init
terraform plan
terraform apply
Enter yes
to deploy the environment when prompted after running terraform apply
. This will take several minutes to build all the necessary GCP resources and GKE Cluster.
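A variation on the sequence above is to save the plan to a file and apply exactly that plan, so that what is applied matches what was reviewed. The sketch below assumes it is run from the root of the repository; the function name deploy_cluster and the plan file name tfplan are arbitrary choices. Note that applying a saved plan does not prompt for confirmation, so the review happens when the plan output is printed.

```shell
# Hypothetical wrapper: init, write a plan file, then apply that exact plan.
# Runs in a subshell so the caller's working directory is unchanged.
deploy_cluster() {
  (
    cd terraform &&
      terraform init &&
      terraform plan -out=tfplan &&
      terraform apply tfplan
  )
}
```

Each step only runs if the previous one succeeded, so a failed init or plan stops the sequence before anything is created.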
When Terraform has finished creating the cluster, you will see several generated outputs that will help you to access the private control plane:
Apply complete! Resources: 20 added, 0 changed, 0 destroyed.
Outputs:
bastion_kubectl = HTTPS_PROXY=localhost:8888 kubectl get pods --all-namespaces
bastion_ssh = gcloud compute ssh demo-cluster-bastion --project my-project-id --zone us-central1-a -- -L8888:127.0.0.1:8888
cluster_ca_certificate = <sensitive>
cluster_endpoint = 172.16.0.18
cluster_location = us-central1
cluster_name = demo-cluster
get_credentials = gcloud container clusters get-credentials --project my-project-id --region us-central1 --internal-ip demo-cluster
In addition to the GKE cluster, a small GCE instance known as a "bastion host" was also provisioned. It supports SSH tunneling and HTTP proxying to allow remote API Server access in a more secure manner. To access the GKE cluster, first run the following command to obtain a valid set of Kubernetes credentials:
echo $(terraform output get_credentials)
$(terraform output get_credentials)
Fetching cluster endpoint and auth data.
kubeconfig entry generated for demo-cluster.
Notice that the gcloud container clusters get-credentials
command specified the --internal-ip
flag to use the private GKE Control Plane IP.
Next, open up a second terminal in the ./terraform
directory and run the following command:
$(terraform output bastion_ssh)
...snip...
permitted by applicable law.
myusername@demo-cluster-bastion:~$
With this "SSH Tunnel" running and forwarding port 8888
, any web traffic sent to our localhost:8888
will be sent down the tunnel and connect to the tiny proxy instance running on the demo-cluster-bastion
host listening on localhost:8888
.
If this SSH session disconnects, you will need to re-run the above command to reconnect and reach the GKE API.
Because kubectl
honors the HTTPS_PROXY
environment variable, this means that our kubectl
commands can be sent securely over the SSH tunnel and through the HTTP(S) proxy and reach the GKE control plane inside that VPC network via its private IP. While it's possible to run export HTTPS_PROXY=localhost:8888
in the current session, that environment variable is honored by other applications, which might not be desirable. For the duration of this terminal session, setting a simple shell alias will make all kubectl
commands use the SSH tunnel's HTTP proxy:
alias k="HTTPS_PROXY=localhost:8888 kubectl"
Now, every time k
is used within this terminal session, the shell will silently replace it with HTTPS_PROXY=localhost:8888 kubectl
, and the connection will work as expected.
k get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-node-f49fd 2/2 Running 0 25m
kube-system calico-node-sj8pp 2/2 Running 0 25m
kube-system calico-node-tw84c 2/2 Running 0 26m
...snip...
kube-system prometheus-to-sd-4xb67 1/1 Running 0 27m
kube-system prometheus-to-sd-fnd2l 1/1 Running 0 27m
kube-system stackdriver-metadata-agent-cluster-level-594ff5c995-htszq 1/1 Running 3 28m
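In bash, aliases are generally not expanded in non-interactive scripts, so the alias above is mainly useful at the prompt. A shell function is one possible alternative that behaves the same way interactively and also works inside scripts. This is a sketch rather than part of the demo; it reuses the name k and the proxy address localhost:8888 from the alias.

```shell
# Function equivalent of the alias: run kubectl with HTTPS_PROXY set for that
# one command only, passing all arguments through unchanged.
k() {
  HTTPS_PROXY=localhost:8888 kubectl "$@"
}
```

With this defined, k get pods --all-namespaces sends the request through the SSH tunnel exactly as the aliased form does, while other programs in the session remain unaffected by the proxy setting.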
After following the guidance in the Prerequisites section and successfully creating the base GKE Cluster and supporting resources in the Deployment section, you will first want to configure Anthos Configuration Management in your cluster.
- Anthos Configuration Management - Learn how to centrally manage your fleet of GKE Clusters using a "git-ops" workflow.
After completing the Anthos Configuration Management configuration, you can explore the following topics in any order you choose:
- Binary Authorization - Learn how to enforce which containers run inside your GKE Cluster.
- Role-Based Access Control - Understand how RBAC can be used to grant specific permissions to users and groups accessing the Kubernetes API.
- Logging with Stackdriver - Learn how GKE Clusters send logs and metrics to Stackdriver and how to export those to Google Cloud Storage (GCS) Buckets for long term storage and BigQuery datasets for analysis.
- Monitoring with Stackdriver - Learn how GKE Clusters send metrics to Stackdriver to monitor your cluster and container application performance.
This teardown step will remove the base GKE cluster and supporting resources that each topic area uses. Only perform the following procedures when you have completed all the desired topics and wish to fully remove all demo resources.
If you have completed any of the guided demos, be sure to follow the Teardown section of each one to fully remove the resources that were created. After those are removed, you can remove the base cluster and its supporting resources.
Log out of the bastion host by typing exit
in that terminal session, then run the following from the base of the repository in the current terminal to destroy the environment via Terraform:
cd terraform
terraform destroy
...snip...
google_compute_network.demo-network: Still destroying... (ID: demo-network, 10s elapsed)
google_compute_network.demo-network: Still destroying... (ID: demo-network, 20s elapsed)
google_compute_network.demo-network: Destruction complete after 25s
Destroy complete! Resources: 20 destroyed.
If you have already followed the Teardown steps to delete the Cloud Source Repository, you can delete the local anthos-demo
repository folder:
rm -rf anthos/anthos-demo
All resources should now be fully removed.
During the make create
command, the gcloud compute ssh
command is run to create the SSH tunnel, forward the local port 8888
, and background the session. If it stops running and kubectl
commands are no longer working, rerun it:
`echo $(terraform output --state=../../terraform/terraform.tfstate bastion_ssh) -f tail -f /dev/null`
Because gcloud
leverages the host's SSH client binary to run SSH sessions, the process name may vary. The most reliable way to stop the tunnel is to find the process ID
of the SSH session and run kill <pid>
or pkill <processname>:
ps -ef | grep "ssh.*L8888:127.0.0.1:8888" | grep -v grep
579761 83734 1 0 9:53AM ?? 0:00.02 /usr/local/bin/gnubby-ssh -t -i /Users/myuser/.ssh/google_compute_engine -o CheckHostIP=no -o HostKeyAlias=compute.192NNNNNNNN -o IdentitiesOnly=yes -o StrictHostKeyChecking=yes -o UserKnownHostsFile=/Users/myuser/.ssh/google_compute_known_hosts [email protected] -L8888:127.0.0.1:8888 -f tail -f /dev/null /dev/null
In this case, running pkill gnubby-ssh
or kill 83734
would end this SSH session.
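The lookup above can be reduced to printing just the process ID. The sketch below filters ps -ef style output on standard input and prints the second column, which is the PID in the sample output shown; the function name tunnel_pids is hypothetical, and the column position is an assumption that should be checked against your platform's ps format.

```shell
# Hypothetical helper: read `ps -ef` style lines on stdin and print the PID
# (second field) of any SSH process forwarding local port 8888.
tunnel_pids() {
  awk '/ssh.*L8888:127\.0\.0\.1:8888/ && !/awk/ { print $2 }'
}
```

Used as ps -ef | tunnel_pids, it prints one PID per matching tunnel, and each PID can then be passed to kill.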
The credentials that Terraform is using do not provide the necessary permissions to create resources in the selected project. Ensure that the account listed in gcloud config list
has the necessary permissions to create resources. If it does, regenerate the application default credentials using gcloud auth application-default login
.
Note: this is not an officially supported Google product.