This is the documentation - and executable code! - for the Service Mesh Academy workshop about new 2.15 features. The easiest way to use this file is to execute it with demosh.
Things in Markdown comments are safe to ignore when reading this later. When
executing this with demosh, things after the horizontal rule below (which
is just before a commented @SHOW
directive) will get displayed.
This workshop will create a multizone k3d cluster for you. Make sure you don't already have a cluster named "features".
OK, let's get this show on the road by creating a multi-zone cluster. This
will be a four-node k3d cluster, with each Node in a different zone (in this
case, `east`, `west`, and `central`). We'll do our usual dance of setting up
the cluster to expose ports 80 & 443 on the host network, and of specifying a
named network so that we can hook other things up to it.
There's enough to this that we'll use a YAML file to specify the cluster rather than doing it all on the command line.
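Broadly, a k3d config for this kind of cluster looks something like the sketch below. This is illustrative, not necessarily identical to the file we'll look at next; the zone labels and the `features` network name are assumptions based on what we use later in the workshop.

```yaml
apiVersion: k3d.io/v1alpha5
kind: Simple
metadata:
  name: features
servers: 1
agents: 3
network: features            # named network, so other containers can join it
ports:
  - port: 80:80              # expose HTTP on the host network...
    nodeFilters: [loadbalancer]
  - port: 443:443            # ...and HTTPS too
    nodeFilters: [loadbalancer]
options:
  k3s:
    nodeLabels:              # one plausible way to spread zones across Nodes
      - label: topology.kubernetes.io/zone=east
        nodeFilters: [server:0, agent:0]
      - label: topology.kubernetes.io/zone=west
        nodeFilters: [agent:1]
      - label: topology.kubernetes.io/zone=central
        nodeFilters: [agent:2]
```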
bat k3d-multizone.yaml
k3d cluster create -c k3d-multizone.yaml --wait
kubectl cluster-info
Next up, instead of installing the Linkerd CRDs and control plane by hand,
we're going to use the Linkerd Operator from Buoyant, which will do all the
heavy lifting for us. This is more involved than just running `linkerd
install`, but it's a lot more flexible and powerful.
This does require you to sign up for a free account with Buoyant. But really, it's worth it, and we won't sell your information to anyone! To get set up, go to https://enterprise.buoyant.io/ and sign up.
Once done, you'll get to a page that'll show you three environment variables:
- `API_CLIENT_ID`
- `API_CLIENT_SECRET`
- `BUOYANT_LICENSE`
You'll need all of those set in your environment as you continue!
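For example, in your shell (the values here are placeholders; substitute the real ones from your Buoyant account page):

```shell
# Placeholder values -- substitute the real ones from enterprise.buoyant.io.
export API_CLIENT_ID="your-client-id-here"
export API_CLIENT_SECRET="your-client-secret-here"
export BUOYANT_LICENSE="your-license-key-here"

# The Helm install below also references $CLUSTER_NAME; in this workshop
# the k3d cluster is named "features".
export CLUSTER_NAME="features"
```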
Once you have all three of those, you can install the Linkerd Operator.
helm repo add linkerd-buoyant https://helm.buoyant.cloud
helm repo update
helm install linkerd-buoyant \
--create-namespace \
--namespace linkerd-buoyant \
--set metadata.agentName=$CLUSTER_NAME \
--set api.clientID=$API_CLIENT_ID \
--set api.clientSecret=$API_CLIENT_SECRET \
linkerd-buoyant/linkerd-buoyant
kubectl rollout status daemonset/buoyant-cloud-metrics -n linkerd-buoyant
linkerd buoyant check
Now that we have the Linkerd Operator installed, we need to create secrets for
our control plane to use. We'll use `step` to create a trust anchor and an
identity issuer in the `certs` directory:
#@immed
rm -rf certs
mkdir certs
# The `root-ca` profile is correct for a Linkerd trust anchor. We
# don't need a password, and we acknowledge that this is insecure.
step certificate create root.linkerd.cluster.local \
certs/ca.crt certs/ca.key \
--profile root-ca \
--no-password --insecure
# The `intermediate-ca` profile is correct for a Linkerd identity
# issuer. Drop its lifetime to 1 year (8760 hours) and use the trust
# anchor to sign it.
step certificate create identity.linkerd.cluster.local \
certs/issuer.crt certs/issuer.key \
--profile intermediate-ca --not-after 8760h \
--no-password --insecure \
--ca certs/ca.crt --ca-key certs/ca.key
Once we have these secrets, we need to store them in a Kubernetes Secret so
that the Linkerd control plane can use them later. Sadly, we can't use
`kubectl create secret` for this, because we need three keys in this secret...
so we're going to use a stupid Python script instead, to avoid copying and
pasting things everywhere.
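For reference, what the script emits should be roughly this shape: Linkerd conventionally looks for an issuer Secret of type `kubernetes.io/tls` with three keys, which is the sticking point for `kubectl create secret tls` (it only handles two). The name and namespace below follow Linkerd's conventions; the actual script may differ.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: linkerd-identity-issuer   # conventional name; a sketch, not the script's output
  namespace: linkerd
type: kubernetes.io/tls
data:
  ca.crt: <base64 of certs/ca.crt>       # the trust anchor
  tls.crt: <base64 of certs/issuer.crt>  # the identity issuer cert
  tls.key: <base64 of certs/issuer.key>  # the identity issuer key
```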
bat make-identity-secret.py
python make-identity-secret.py | kubectl apply -f -
Once we have our secrets, we need to create a ControlPlane CRD, which is what the Linkerd Operator uses to manage the Linkerd control plane for us. We'll use another stupid Python script to generate this CRD, but this time we're going to dump it to a file to look at it before we apply it.
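We'll see the real thing in a moment via `bat`; broadly, a ControlPlane resource has this kind of shape. Treat the field names below as a sketch based on the Buoyant operator's documented CRD, not as the exact file the script generates.

```yaml
apiVersion: linkerd.buoyant.io/v1alpha1
kind: ControlPlane
metadata:
  name: linkerd-control-plane
spec:
  components:
    linkerd:
      version: enterprise-2.15.1        # the version we pass to the script
      license: <your BUOYANT_LICENSE>
      controlPlaneConfig:
        identityTrustAnchorsPEM: |
          <contents of certs/ca.crt>
        identity:
          issuer:
            scheme: kubernetes.io/tls   # use the Secret we just created
```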
bat make-control-plane.py
python make-control-plane.py enterprise-2.15.1 > buoyant/control-plane.yaml
bat buoyant/control-plane.yaml
kubectl apply -f buoyant/control-plane.yaml
At this point the Operator is merrily getting Linkerd installed for us.
kubectl get controlplane
linkerd check
kubectl get controlplane
We'll finish this part of our setup by installing a DataPlane CRD to tell the
Operator that it should manage the data plane in the `linkerd-buoyant`
namespace. No need for a Python script here!
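A DataPlane resource is small; it looks roughly like this (a sketch based on the Buoyant operator's CRD, with fields assumed rather than copied from the workshop file):

```yaml
apiVersion: linkerd.buoyant.io/v1alpha1
kind: DataPlane
metadata:
  name: linkerd-buoyant
  namespace: linkerd-buoyant   # the namespace whose data plane the Operator manages
spec:
  workloadSelector:
    matchLabels: {}            # empty selector: manage every workload in the namespace
```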
bat buoyant/dataplane.yaml
kubectl apply -f buoyant/dataplane.yaml
At this point, the Operator is installing the data plane for the
`linkerd-buoyant` namespace. We're not going to wait for it, though -- let's
go ahead and get our app installed.
For our application, we'll use the Faces demo behind Emissary, as usual. However, we're going to install things differently:
- We'll install three replicas of Emissary, with anti-affinity rules to ensure that we have one on each Node. This is actually the recommended way to run Emissary, though it's not the way we typically do for SMA demos!
- We'll put the `face` and `smiley` workloads for Faces in the `east` zone.
- Finally, we'll install three different `color` Deployments so that we can
  put one in each zone. All three Deployments will be behind the same Service,
  though. (The reason we're using multiple Deployments like this is so we can
  independently scale them for the demo.)
We'll also tell Faces NOT to fail all the time; this isn't a resilience demo!
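Pinning a workload to a zone is plain Kubernetes scheduling: a `nodeSelector` on the standard zone label does the trick. An illustrative Deployment fragment (not one of this workshop's files):

```yaml
# Illustrative fragment: pin a Deployment's Pods to the east zone.
spec:
  template:
    spec:
      nodeSelector:
        topology.kubernetes.io/zone: east
```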
Emissary is pretty easy, we can just use Helm with a custom values file to set up anti-affinity rules:
bat emissary/values.yaml
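The key part of that values file is a pod anti-affinity rule. The usual shape for "one replica per Node" is something like this sketch (the label selector here is an assumption; the actual file may differ):

```yaml
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app.kubernetes.io/name: emissary-ingress   # assumed label
        topologyKey: kubernetes.io/hostname            # spread across Nodes
```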
# Create our namespace and annotate it for Linkerd
kubectl create ns emissary
kubectl annotate ns emissary linkerd.io/inject=enabled
# Install Emissary CRDs
helm install emissary-crds -n emissary \
oci://ghcr.io/emissary-ingress/emissary-crds-chart \
--version 0.0.0-test \
--wait
# Install Emissary itself
helm install emissary-ingress \
oci://ghcr.io/emissary-ingress/emissary-chart \
-n emissary \
--version 0.0.0-test \
-f emissary/values.yaml
# Wait for everything to be running
kubectl rollout status -n emissary deploy
# Verify that our Pods are running on different Nodes
kubectl get pods -n emissary -o wide
# Finally, install the bootstrap configuration.
kubectl apply -f emissary/bootstrap
In addition to the usual Emissary setup to listen on ports 80 & 443, the bootstrap configuration also includes a DataPlane for the Emissary namespace, so that the Operator can manage that for us later. (You still need the injection annotation, though, or the DataPlane resource will have no effect.)
Faces is a bit weirder, since we want to explicitly control the zones. I just
used `helm template` to dump out the YAML from the Helm chart, then edited
things by hand, so let's take a look:
# faces/faces.yaml contains most of what Faces needs
bat faces/faces.yaml
# faces/colors.yaml contains the three `color` Deployments
bat faces/colors.yaml
# faces/bootstrap contains the bootstrap configuration
bat faces/bootstrap/*
Given all that, let's get this show on the road:
kubectl create ns faces
kubectl annotate ns faces linkerd.io/inject=enabled
kubectl apply -f faces/faces.yaml
kubectl apply -f faces/colors.yaml
kubectl apply -f faces/bootstrap
kubectl rollout status -n faces deploy
Once again, let's double-check to make sure that our Pods are really running in the correct zones.
kubectl get pods -n faces -o wide
OK! At this point everything should be running for us -- let's make sure of that in the browser.
One of the really cool things that Kubernetes 1.29 brings us is native support for sidecars. (This was present in 1.28, but it was alpha and not recommended. In 1.29, it's officially good to go.)
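Mechanically, a "native sidecar" is just an init container with `restartPolicy: Always`: it starts before the app containers, keeps running alongside them, and gets shut down automatically after they exit. A minimal illustration (not one of this workshop's files; the names and image are hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: native-sidecar-demo       # hypothetical name
spec:
  restartPolicy: Never
  initContainers:
    - name: sidecar
      image: example/proxy:latest # hypothetical image
      restartPolicy: Always       # this one field makes it a native sidecar
  containers:
    - name: main
      image: busybox
      command: ["sh", "-c", "echo hello"]
```

That automatic teardown is exactly what will let our Job complete once we enable native sidecars for the Linkerd proxy later on.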
Let's try a quick test using a Job to run `curl` to our `face` workload.
bat job.yaml
kubectl apply -f job.yaml
We can watch the Job run...
watch "kubectl get -n faces pods | grep curl"
...and, hmmm, that's not good. It's never finishing, even though it ran.
kubectl get jobs -n faces
kubectl logs -n faces job/curl -c curl | bat
This is the classic problem with sidecars and Jobs before native sidecars: the `curl` container has finished, but the proxy sidecar is still running, so the Job never completes.
Let's clean up the dead job and try again with native sidecar support.
kubectl delete job -n faces curl
To enable native sidecar support, we just need to set `proxy.nativeSidecar=true`
in the ControlPlane configuration.
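Concretely, that's another stanza under `linkerd.controlPlaneConfig` (a sketch; the nesting follows Linkerd's Helm values):

```yaml
proxy:
  nativeSidecar: true
```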
${EDITOR} buoyant/control-plane.yaml
kubectl apply -f buoyant/control-plane.yaml
Applying that, we'll see that the Operator immediately starts updating everything.
watch "kubectl get controlplane; kubectl get pods -n linkerd"
Once it's done, we can try our Job again.
kubectl apply -f job.yaml
watch "kubectl get -n faces pods | grep curl"
This time it finishes! And we can make sure it actually ran by checking its logs.
kubectl logs -n faces job/curl -c curl
So that's that. Let's clean up the job before continuing.
kubectl delete job -n faces curl
If we go to the browser, we'll see a mix of background colors:
- blue for east
- green for west
- yellow for central
In fact, we'll mostly see colors other than blue, because `color-east` is
actually a little slower (by design) than the other two.
We can also see this in stats:
linkerd dg proxy-metrics -n faces deploy/face | python crunch-metrics.py
The problem here is that the `face` Deployment is running in the `east` zone,
and really, routing a ton of its traffic out to other zones isn't ideal. We
can use HAZL, the High Availability Zone-aware Load Balancer, to fix this.
We enable HAZL by modifying the ControlPlane resource to contain this rather
messy stanza under `linkerd.controlPlaneConfig`:

destinationController:
  additionalArgs:
    - -ext-endpoint-zone-weights
So let's go ahead and do that.
${EDITOR} buoyant/control-plane.yaml
kubectl apply -f buoyant/control-plane.yaml
This will cause updates:
watch "kubectl get controlplane; kubectl get pods -n linkerd"
If we head back to the browser, we should see all blue now!
So that didn't work.
We could look at a ton of stuff here, but the actual cause is pretty simple: way back at the beginning, I installed `enterprise-2.15.1` rather than `enterprise-2.15.2`. Sigh.
Fortunately, this is really easy to fix with the ControlPlane.
${EDITOR} buoyant/control-plane.yaml
kubectl apply -f buoyant/control-plane.yaml
watch "kubectl get controlplane; kubectl get dataplane -A; kubectl get pods -n linkerd"
watch "kubectl get controlplane; kubectl get dataplane -A; kubectl get pods -n faces; kubectl get pods -n emissary"
...and NOW if we go back to the browser, we'll see all blue!
This, of course, we could probably get with Kubernetes' own topology-aware
routing. What topology-aware routing doesn't give us is resilience. Suppose
our `color-east` workload crashes?
kubectl scale -n faces deploy/color-east --replicas=0
Over in the browser, we'll see that we've just seamlessly switched to a different zone.
If `color-east` comes back up, we'll see it start taking traffic again.
kubectl scale -n faces deploy/color-east --replicas=1
HAZL is also smart enough to know that if we overwhelm `color-east`, it should
bring in workloads from the other zones to help out. Let's fire up a traffic
generator to see this in action.
bat faces/load.yaml
kubectl apply -f faces/load.yaml
At 10 RPS of additional load, nothing much will happen in the browser, though we can see the request rate change in Buoyant Cloud.
So let's just escalate things here.
kubectl set env -n faces deploy/load LOAD_RPS=50
There we go. And, once again, if we drop the load back down, we expect to see only blue faces.
kubectl scale -n faces deploy/load --replicas=0
We've done a couple of Service Mesh Academy sessions on Linkerd's newfound ability to run workloads that aren't in Kubernetes at all, but we haven't done one using BEL yet. So let's do that now!
We don't have time to dive deep into exactly how the mesh-expansion setup is
built (you can find that in the `2-15-mesh-expansion` directory of Service
Mesh Academy!), but let's at least see it in action.
First up, let's break everything by completely removing the `smiley` workload.
kubectl delete -n faces deploy,service smiley
If we head over to the browser now, we'll see all cursing faces.
Let's fire up the `smiley` workload in a Docker container, outside of our
cluster. We're running this using exactly the same setup as we did for the
2.15 Mesh Expansion SMA earlier, including the horrible hackery of running
both a SPIRE server and a SPIRE agent in our Docker container. Don't do that
in the real world.
- Our external workload needs to route to the cluster's Pod CIDR range via
  one of the Nodes. We'll use the `server-0` Node for that (why not?), so
  let's grab its IP address.
NODE_IP=$(kubectl get node k3d-features-server-0 -ojsonpath='{.status.addresses[0].address}')
#@immed
echo "NODE_IP is ${NODE_IP}"
POD_CIDR=$(kubectl get nodes -ojsonpath='{.items[0].spec.podCIDR}')
#@immed
echo "POD_CIDR is ${POD_CIDR}"
- We need to set up DNS so that references to things like
  `face.faces.svc.cluster.local` actually resolve to addresses inside the
  cluster! We're going to tackle this by first editing the `kube-dns` Service
  to make it a NodePort on UDP port 30000, so we can talk to it from our
  Node, then running `dnsmasq` as a separate Docker container to forward DNS
  requests for cluster Services to the `kube-dns` Service.

  This isn't really the best way to tackle this in production, but it's not
  completely awful: we probably don't want to completely expose the cluster's
  DNS to the outside world. So, first we'll switch `kube-dns` to a NodePort:
kubectl edit -n kube-system svc kube-dns
kubectl get -n kube-system svc kube-dns
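After the edit, the Service's spec should look roughly like this fragment (other ports elided; the `nodePort` of 30000 is what our DNS forwarder will expect):

```yaml
spec:
  type: NodePort
  ports:
    - name: dns
      port: 53
      protocol: UDP
      targetPort: 53
      nodePort: 30000   # reachable on every Node's IP
```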
Next, we need the `dnsmasq` container. This is a little weird: we're going to
use the `drpsychick/dnsmasq` image, but we'll volume-mount our own entrypoint
script:
bat expansion/dns-forwarder.sh
#@immed
docker kill dnsmasq >/dev/null 2>&1
#@immed
docker rm dnsmasq >/dev/null 2>&1
docker run --detach --rm --cap-add NET_ADMIN --net=features \
-v $(pwd)/expansion/dns-forwarder.sh:/usr/local/bin/dns-forwarder.sh \
--entrypoint sh \
-e DNS_HOST=${NODE_IP} \
--name dnsmasq drpsychick/dnsmasq \
-c /usr/local/bin/dns-forwarder.sh
Once that's done, we can get the IP address of the `dnsmasq` container to
use later.
DNS_IP=$(docker inspect dnsmasq | jq -r '.[].NetworkSettings.Networks["features"].IPAddress')
#@immed
echo "DNS_IP is ${DNS_IP}"
So let's actually get our `smiley` container running! We're using our
`ghcr.io/buoyantio/faces-external-workload:1.0.0` image from the Mesh
Expansion SMA here.
#@immed
docker kill smiley >/dev/null 2>&1
#@immed
docker rm smiley >/dev/null 2>&1
docker run --rm --detach \
--cap-add=NET_ADMIN \
--network=features \
--dns=${DNS_IP} \
--name=smiley \
-v "$(pwd)/certs:/opt/spire/certs" \
-e WORKLOAD_NAME=smiley \
-e WORKLOAD_NAMESPACE=faces \
-e NODE_NAME='$(hostname)' \
-e FACES_SERVICE=smiley \
-e DELAY_BUCKETS=0,50,100,200,500,1000 \
ghcr.io/buoyantio/faces-external-workload:1.0.0 \
&& docker exec smiley ip route add ${POD_CIDR} via ${NODE_IP}
OK, let's make sure that it's running.
docker ps -a
Next, we need to create an ExternalWorkload resource so that Linkerd knows how
to use our `smiley` workload. This is kind of analogous to a Kubernetes Pod:
it's a way of associating a name and an IP address with our workload.
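The template we're about to fill in produces something along these lines. This is a sketch: the `apiVersion`, identity string, and port are assumptions, and the real template (plus whatever else it creates, such as the `smiley` Service) lives in `expansion/external-workload.yaml.tmpl`.

```yaml
apiVersion: workload.linkerd.io/v1beta1
kind: ExternalWorkload
metadata:
  name: %%NAME%%                  # filled in by the sed command below
  namespace: faces
  labels:
    app: %%NAME%%
spec:
  meshTLS:
    identity: "spiffe://root.linkerd.cluster.local/%%NAME%%"  # assumed SPIFFE ID
    serverName: "%%NAME%%.cluster.local"
  workloadIPs:
    - ip: %%IP%%                  # the Docker container's IP
  ports:
    - port: 8000                  # assumed Faces workload port
      name: http
```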
SMILEY_ADDR=$(docker inspect smiley | jq -r '.[].NetworkSettings.Networks["features"].IPAddress')
#@immed
echo "SMILEY_ADDR is ${SMILEY_ADDR}"
sed -e "s/%%NAME%%/smiley/" -e "s/%%IP%%/${SMILEY_ADDR}/g" \
< ./expansion/external-workload.yaml.tmpl \
> /tmp/smiley.yaml
bat /tmp/smiley.yaml
kubectl apply -f /tmp/smiley.yaml
At this point we should see our new `smiley` Service, and we should see that
it has EndpointSlices, too:
it has EndpointSlices, too:
kubectl get svc -n faces
kubectl get endpointslices -n faces
And if we go back to the browser, we should see grinning faces again!
So there we have it: a tour of HAZL, native sidecar support, the Linkerd Operator, and mesh expansion with BEL! This is a lot to cover, but it's all pretty cool stuff, and we look forward to digging more into it with everyone!