This is a summary of a thread from the #spinkube CNCF slack. Thank you @asteurer for discovering this issue!
The issue
Initial discovery: When running a CPU-intensive Spin app with the shim, CPU usage reporting on the Pod stays static as load (requests) increases: the output of kubectl top pods does not change. This makes it impossible to use the Horizontal Pod Autoscaler with SpinKube. The behavior was seen on every tested K8s distribution below except K3d:
| distro | containerd | works |
| --- | --- | --- |
| k3d | v1.7.7-k3s1.27 | yes |
| AKS | 1.7.15-1 | no |
| k3s | 1.6.28 | no |
| Kind | 1.7.15 | no |
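To see why static CPU reporting blocks the Horizontal Pod Autoscaler, here is a minimal sketch of the replica calculation described in the Kubernetes HPA documentation: `desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)`. The function name, the min/max bounds, and the sample values are hypothetical, and the real controller applies further details (tolerance, stabilization windows, missing-metric handling) that are left out here.

```python
import math


def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas=1, max_replicas=10):
    """Simplified HPA scaling rule, clamped to the configured bounds.

    current_value and target_value are in the same unit, for example
    average CPU utilization in percent.
    """
    raw = math.ceil(current_replicas * (current_value / target_value))
    return max(min_replicas, min(max_replicas, raw))


# Hypothetical values: reported usage stuck at 0 keeps the replica count at
# the minimum, no matter how much load the app actually receives.
print(desired_replicas(1, 0, 50))

# If usage were reported at three times the target, the HPA would scale out.
print(desired_replicas(1, 150, 50))
```

With these inputs the first call yields 1 replica and the second yields 3, which is the scale-out that never happens when the reported value does not move.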
Repro steps
Apply a CPU intensive spin app deployment and the HPA:
Notice how the Pod CPU and memory usage values are 0 while the container has properly propagated values.
Load the app to see if the HPA increases replicas
If Pod metrics were properly reported, the app replicas would increase.
# After port forwarding to port 3000
bombardier localhost:3000 -n 10 -t 30s
Calling the stats API during the load test shows that while the container usageNanoCores jumped from 4728 to 497486, the pod metrics did not change nor did the app replica count or the output of kubectl top pods for that Pod.
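The mismatch described above can be checked mechanically. The sketch below assumes the stats API in question is the kubelet Summary API, whose response lists pods with a pod-level `cpu.usageNanoCores` and a `containers` array carrying the same field per container. The function name and the sample data are hypothetical; only the value 497486 comes from the observation above.

```python
from typing import Any


def pods_missing_cpu(summary: dict[str, Any]) -> list[str]:
    """Return names of pods whose pod-level CPU usage is zero or absent
    while at least one of their containers reports non-zero usage."""
    flagged = []
    for pod in summary.get("pods", []):
        pod_cpu = pod.get("cpu", {}).get("usageNanoCores", 0)
        container_cpu = sum(
            c.get("cpu", {}).get("usageNanoCores", 0)
            for c in pod.get("containers", [])
        )
        if pod_cpu == 0 and container_cpu > 0:
            flagged.append(pod["podRef"]["name"])
    return flagged


# Hypothetical sample shaped like a stats summary response.
sample = {
    "pods": [
        {
            "podRef": {"name": "spin-app", "namespace": "default"},
            "cpu": {"usageNanoCores": 0},
            "containers": [{"name": "app", "cpu": {"usageNanoCores": 497486}}],
        },
        {
            "podRef": {"name": "regular-app", "namespace": "default"},
            "cpu": {"usageNanoCores": 120000},
            "containers": [{"name": "app", "cpu": {"usageNanoCores": 110000}}],
        },
    ]
}

print(pods_missing_cpu(sample))
```

Running the check before and during a load test makes it easy to confirm whether the pod-level value ever moves.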
Other investigation
Pod metrics are surfaced for normal containers not executed with the shim (without runtime class wasmtime-spin-v2 specified). However, if that same container is executed with the shim, Pod metrics are no longer surfaced.
I wonder if this may be the issue: containerd/cri#922. Specifically, we may need to add the io.kubernetes.container.name=="POD" label to the pause container. This may also explain why this works on k3d if it is using docker for the container runtime instead of containerd (not sure if this is the case though).
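To make the hypothesis concrete, the sketch below shows what a label-based lookup for the pod sandbox could look like. It assumes a metrics collector picks the pod-level entry by the `io.kubernetes.container.name` label with value `POD`; whether that is what actually happens in the affected code path is exactly the open question. The container records and the function name are hypothetical.

```python
SANDBOX_LABEL = "io.kubernetes.container.name"
SANDBOX_VALUE = "POD"


def find_sandbox_ids(containers):
    """Return IDs of containers labeled as the pod sandbox (pause container)."""
    return [
        c["id"]
        for c in containers
        if c.get("labels", {}).get(SANDBOX_LABEL) == SANDBOX_VALUE
    ]


# Hypothetical records: one pause container carries the label, one does not.
labeled = [
    {"id": "pause-1", "labels": {SANDBOX_LABEL: "POD"}},
    {"id": "app-1", "labels": {SANDBOX_LABEL: "app"}},
]
unlabeled = [
    {"id": "pause-2", "labels": {}},
    {"id": "app-2", "labels": {SANDBOX_LABEL: "app"}},
]

print(find_sandbox_ids(labeled))
print(find_sandbox_ids(unlabeled))
```

Under this assumption, a pause container without the label yields no sandbox match, which would be consistent with pod-level metrics staying empty.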
@jprendes and I spent some time setting up GDB debugging with the shim for this. While we did not come to any new conclusions, I wanted to share our repro steps:
Debugging with GDB
Install K3s. This uses kwasm to configure the containerd config to use a shim at /opt/kwasm/bin/containerd-shim-spin-v2 so be sure to move your debug binary here
Create a script that runs gdb with sudo:
#!/bin/bash
# Forward all arguments to gdb unchanged, preserving any quoting.
sudo gdb "$@"
Create gdb launch.json
{
    // Use IntelliSense to learn about possible attributes.
    // Hover to view descriptions of existing attributes.
    // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
    "version": "0.2.0",
    "configurations": [
        {
            "type": "gdb",
            "request": "attach",
            "name": "Attach to PID",
            "target": "{PID}",
            "cwd": "${workspaceRoot}",
            "valuesFormatting": "parseText",
            "gdbpath": "/home/kagold/projects/containerd-shim-spin/_scratch/resources-debug/sudo-gdb.sh"
        }
    ]
}
Build debug version of shim and move it to expected shim location for k8s distro
Apply spin app deployment
Get spin process PID and update launch.json to target it
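The last step can be scripted. The sketch below assumes the PID has already been found (for example with `ps` or `pgrep`) and rewrites the `"target"` value in the launch.json text. It uses a regular expression rather than a JSON parser because launch.json may contain `//` comments, which Python's `json` module rejects. The function name and the sample text are hypothetical.

```python
import re

# Matches the "target" key and its current string value.
TARGET_RE = re.compile(r'("target"\s*:\s*)"[^"]*"')


def set_attach_target(launch_json_text: str, pid: int) -> str:
    """Return the launch.json text with the first "target" value set to pid."""
    updated, count = TARGET_RE.subn(
        lambda m: f'{m.group(1)}"{pid}"', launch_json_text, count=1
    )
    if count == 0:
        raise ValueError('no "target" entry found in launch.json text')
    return updated


example = """{
    // comments like this one are why json.loads is not used here
    "version": "0.2.0",
    "configurations": [
        { "type": "gdb", "request": "attach", "target": "{PID}" }
    ]
}"""

print(set_attach_target(example, 12345))
```

To apply it to a real file, read the file contents, pass them through `set_attach_target`, and write the result back.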
Possible solutions and areas to investigate
Some areas to investigate that @jsturtevant and @radu-matei mentioned are the following: