-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add ability to include images in custom node build #3634
base: main
Are you sure you want to change the base?
Conversation
We have a real problem with cross-architecture, there's going to be really confusing results if it's from local side-loading instead of pulling. I'm wondering if we should just only permit pull-based and require using a local or remote registry for adding additional pre-baked images. |
(the docker/podman local store doesn't really have a good way to keep an image at both architectures, they have to have unique names) |
(also don't take that as a given and rewrite everything yet ... just a thought .... this is kind of tricky, there's a lot of confusing behaviors and we should do our best to avoid confusion) |
I've been doing some additional testing, and it looks like even the existing |
That's a bug, because I use this to publish multi-arch images on every release: |
(For |
That may very well be the case - I would not be surprised at all. If you (or someone else that has this working) are able to test this change in your environment, that would make me more comfortable. So I'm not clear from the comments above if something more is needed here. We are pulling only, not side-loading from local files. But do you mean not even that? That we should spin up the builder container, then pull from within there directly? |
OK, I played around a bit, and it's actually a little cleaner doing it the way I think you were suggesting. Updated to pull directly from the builder container. Just let me know if I'm off track here. ;) |
THX for this PR ... I tried it on my arm Mac and build a kind node image twice once for amd64 and then for amr64. That part work very well. A small issue popped up ... In this two step approach there is this base image kindest/node:v1.30.0@sha256:047357... used. Basically an archive for amd64 and then another archive for arm64 pulled is pull by my docker desktop locally to do the build for and or arm. Well, both archive have the same multi-archive hash, so that the second step failed because it seems to be the case that we can't have two different kindest/node:v... images below one (multi-archive) sha256 hash in place at the local machine. No real problem. After each step of building the custom node images I pushed it into my registry. I cleared everything locally and the way was free. Another new multi-archive custom node image could be build again for the other arch. In a final step by pulling the custom images again from it and creating the multi-archive by hand. |
Awesome, thanks for trying things out and reporting back @xlf12! Great to hear it's (mostly) working for someone else too.
I believe this is a known issue with image building today. IIRC @BenTheElder has mentioned something similar, so maybe he can confirm or give some helpful hints on that. |
The feature itself is so valuable to us that this multi-archive issue IMHO could simply be ignored and perhaps - my Go skills are too poor for go on it - so might be passed on to a future hero. It's just a bit of a usability issue, I'd say. It doesn't prevent us from creating images in multiple archives. You can always work around that with a script, which works fine. Not everyone will have a use case like that, either. |
I'd like to track down this bug, which might inform us to how to approach adding more image building commands ... I don't think what we have currently is very nice.
I'd rather not do this sort of "ignore it until later and then need to majorly redesign and introduce breaking changes", there are already alternatives to this entire feature, I recommend https://kind.sigs.k8s.io/docs/user/local-registry/
This is a docker limitation. Docker doesn't actually store images by (image:tag, arch), only by (image:tag/image@sha). Previously we're forced to use the run + commit approach because we needed --privileged to get containerd running, but we might be able to come up with another work around today, either getting image imports to work without priv or maybe https://docs.docker.com/reference/cli/docker/buildx/build/#allow If we built images in buildkit mode, it can handle this, and it can natively handle producing a multi-arch artifact (though again only if you push it to a registry, it cannot store it in the local docker storage that you can This is why building the node images is clunky and looks like this: (basically how you described building and pushing your image) |
[sorry we are deep in the infra migration stuff, I'm going to dive into this soonish ...] |
This adds a new `kind build add-image` command that provides the ability to "preload" container images in a custom node image. Signed-off-by: Sean McGinnis <[email protected]> Co-authored-by: Curt Bushko <[email protected]> Signed-off-by: Sean McGinnis <[email protected]>
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: stmcginnis The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
When I tested this branch to add images from our local antifactory, I saw below error, any idea why I get this error? Same works fine if I use public images. host="": failed to do request: Head "*": tls: failed to verify certificate: x509: certificate signed by unknown authority Stack Trace: |
This is one of the problems with the current approach -- pulling happens inside the node containers and the node containers do not have additional certs installed on your host (most likely that's what's happening here anyhow) This is not very easy to solve. |
Hmm, what if we mount |
We haven't currently settled on a solution to this with the running nodes either, something like that might work but ... that only solves this particular variation on the problem even if it works. (And ... it has portability issues) The user may also e.g. have docker credential plugins enabled for image pulling (and containerd doesn't even have these, there are CRI auth plugins but that's different). What would be ideal is if we had a solution to pull images in an architecture specific way without conflicts and then import those into the node, but we haven't yet (and we also have to consider nerdctl and podman these days). Also, once we commit to either pulling in the node or pulling on the host, users will depend on it, so we will have to decide though. |
We've kinda punted these with One possible answer is requiring adding via tarballs, but that's less convenient. On the other hand if we attempt to support pulling from third party hosts and we don't pull via the host runtime, we're going to get a LOT of support issues confused when auth / networking doesn't work. But if we pull to the host, we're back to the problems we ran into before with architectures, and I don't think we want to build new features that aren't multi-arch aware. That one might be solvable, it needs revisiting. |
I think maybe we should start by just supporting the equivalent of image-archive, and let the caller be responsible for supplying the images with their preferred tooling. I think this will fall into three use cases:
Those could also work for 2/3) but this could certainly be more convenient, especially because local registry requires replacing the manifests with your local image in the case of third party workloads you are not actively developing, and because loading at runtime is not optimally efficient. In a subset of those cases, this may be a better option, in even more cases it may just feel more convenient. In all cases, the actual pull part is painful to do portably, and an image archive will contain the intended tags and also give the user the ability to rewrite tags (pull, retag, save, then run this command). Maybe we start there and then based on user feedback we can revisit kind handling the pulling? The user has to identify the set of images anyhow, I don't think telling them to perform the pull and export first is a huge burden and it has efficiency wins (may be cached on the host at their discretion) then it's up to the user to handle authentication and the architecture issues (some users may be better off using something like crane for multi-arch pull and export, YMMV) |
cc @aojea |
that is, take image archives as the input? and avoid being responsible for image pulling mess, let the user decide between ease and caching (docker pull; docker save ...) versus cross-platform flexibility (crane pull --platform) or local builds (bazel or ko or ...) |
Just a note: the latest responses have been on my queue to get back to and address, just haven't had the time to lately. I should be able to get to it soon. At first glance, just supporting image archives seems like a reasonable first step to avoid some of the complexity and would add value. |
FWIW I would love to see this merged. I have a use case where I'm running an authorizing webhook as a static pod and want the webhook image pre-loaded in the kind node image. My workaround today is to run |
Dockerfile
|
I think we're going to rewrite it, pending bandwidth, to work around image archives rather than pulling images, for the reasons discussed above. |
Current suggested alternative is to use a local registry for the image. Though Antonio's suggestion is also possible if trickier. |
Also there's no guarantee that we don't make changes to the images that break this in the future, we only support that the node images can create kubernetes clusters. Depending on specific tools to be installed on them is not supported and subject to change. We've changed distros and container runtimes before and we're always looking to minimize the base image size. So it's workable, but may not be stable ... The registry approach is pretty stable, just make sure to use the current version that avoids depending on deprecated containerd config :| |
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Updated / alternate take on #2303.
This adds a new
kind build add-image
command that provides the ability to "preload" container images in a custom node image.Closes: #2160
Help output
Updated
kind build
output:$ kind build Build a kind node image by creating a new image from source, or combining existing images in a new image. Usage: kind build [command] Available Commands: add-image Update node image with extra images preloaded node-image Build the node image
And
add-image
help output:$ kind build add-image -h Use an existing node image as a base and preload extra container images. Usage: kind build add-image <IMAGE> [IMAGE...] [flags] Flags: --arch string architecture to build for, defaults to the host architecture --base-image string name:tag of the base image to use for the build (default "kindest/node:v1.30.0@sha256:047357ac0cfea04663786a612ba1eaba9702bef25227a794b52890dd8bcd692e") -h, --help help for add-image --image string name:tag of the resulting image to be built (default "kindest/custom-node:latest")
Command execution
The new command can be run to preload more container images into the new node image, using the default base image and target image:
$ kind build add-image nginx postgres golang:1.22 Adding [nginx postgres golang:1.22] images to base image Creating build container based on "kindest/node:v1.30.0@sha256:047357ac0cfea04663786a612ba1eaba9702bef25227a794b52890dd8bcd692e" Importing images into build container kind-build-1717165213-626335550 Saving new image kindest/custom-node:latest sha256:38ad8e7460ce80324d4a9dbb76a94f7c5434f7ea1d6260c8566f9c797933c6c3 Add image build completed.
New cluster creation
The custom node image can then be used to create a new cluster:
$ cat <<EOF | kind create cluster --config=- kind: Cluster apiVersion: kind.x-k8s.io/v1alpha4 nodes: - role: control-plane image: kindest/custom-node:latest EOF Creating cluster "kind" ... ✓ Ensuring node image (kindest/custom-node:latest) 🖼 ✓ Preparing nodes 📦 ✓ Writing configuration 📜 ✓ Starting control-plane 🕹️ ✓ Installing CNI 🔌 ✓ Installing StorageClass 💾 Set kubectl context to "kind-kind" You can now use your cluster with: kubectl cluster-info --context kind-kind Have a nice day! 👋
Verification
We can see that the cluster created with our modified node image has the expected images included:
$ docker exec -it kind-control-plane crictl images IMAGE TAG IMAGE ID SIZE ... docker.io/library/golang 1.22 5905f95343e84 846MB docker.io/library/nginx latest 4f67c83422ec7 192MB docker.io/library/postgres latest cff6b68a194a6 439MB ...