✨ Fast Deploy Part Deux (Experimental) #851
Open: akutz wants to merge 1 commit into vmware-tanzu:main from akutz:feature/fast-deploy-direct
akutz force-pushed the feature/fast-deploy-direct branch 3 times, most recently from e63089b to b0b5e20 (January 7, 2025 17:41)
akutz force-pushed the feature/fast-deploy-direct branch 24 times, most recently from 43ff6b5 to b55a286 (January 10, 2025 18:04)
akutz force-pushed the feature/fast-deploy-direct branch 4 times, most recently from 0b56871 to 6469a96 (January 10, 2025 19:37)
akutz force-pushed the feature/fast-deploy-direct branch from 6469a96 to ad2cf24 (January 10, 2025 20:11)
akutz force-pushed the feature/fast-deploy-direct branch 2 times, most recently from 950b243 to e8333fa (January 10, 2025 21:41)
**What does this PR do, and why is it needed?**

This patch adds support for the Fast Deploy Direct and Linked features, i.e. the ability to cache images per-datastore and quickly provision a VM from these caches, either directly or as a linked clone. This is an experimental feature that must be enabled manually. Many aspects of this feature may change before it is ready for production.

The patch notes below are broken down into several sections:

* **Goals** -- What is currently supported
* **Non-goals** -- What is not on the table right now
* **Architecture**
    * **Activation** -- How to enable this experimental feature
    * **Placement** -- Request datastore recommendations
    * **Image cache** -- A general-purpose VM image cache
    * **Create VM** -- Create directly from cached disk

## Goals

The following goals are considered in scope for this experimental feature at this time. An item's absence from this list does not mean it will not be added before the feature is made generally available:

* Support all VM images that are OVFs
* Support multiple zones
* Support workload-domain isolation
* Support all datastore types, including host-local and vSAN
* Support for configuring a default fast-deploy mode
* Support picking the fast-deploy mode per VM (direct, linked)
* Support disabling fast-deploy per VM
* Support VM encryption for VMs deployed with fast deploy direct
* Support backup/restore for VMs deployed with fast deploy direct
* Support site replication for VMs deployed with fast deploy direct
* Support datastore maintenance/migration for VMs deployed with fast deploy direct

## Non-goals

The following non-goals are not in scope at this time, although most of them should be revisited before this feature graduates to production:

* Support VM images that are VM templates (VMTX)

    The architecture behind Fast Deploy makes it trivial to support deploying VM images that point to VM templates. While not in scope at this time, this will likely become part of the feature before it graduates to production-ready.

## Architecture

The architecture is broken down into the following sections:

* **Activation** -- How to enable this experimental feature
* **Placement** -- Request datastore recommendations
* **Image cache** -- A general-purpose VM image cache
* **Create VM** -- Create directly from cached disk

### Activation

Enabling the experimental Fast Deploy feature requires setting the environment variable `FSS_WCP_VMSERVICE_FAST_DEPLOY` to `true` in the VM Operator deployment. The environment variable `FAST_DEPLOY_MODE` may be set to one of the following values to configure the default mode for the fast-deploy feature:

* `direct` -- VMs are deployed using cached disks
* `linked` -- VMs are deployed as a linked clone
* the value is empty -- `direct` mode is used
* the value is anything else -- fast deploy is disabled

It is possible to override the default mode per VM by setting the annotation `vmoperator.vmware.com/fast-deploy`. The values of this annotation follow the same rules described above.

Please note, setting the environment variable `FAST_DEPLOY_MODE` or the annotation `vmoperator.vmware.com/fast-deploy` has no effect if the feature is not enabled.

### Placement

Please refer to PR #823 for information on placement, as the logic from that change is unchanged in this one.

### Image cache

The way the images/disks are cached has completely changed since PR #823. There is now a new API named `VirtualMachineImageCache` that is:

* not visible to DevOps users
* a namespace-scoped resource that only exists in the same namespace as the VM Operator controller pod
* used to cache the OVF and an image's disks

A `VirtualMachineImageCache` resource is created per unique library item resource. That means even if there are 20,000 VMI resources spread across a multitude of namespaces or at the cluster scope, if they all point to the same underlying library item, then for all those VMI resources there will be a single `VirtualMachineImageCache` resource in the VM Operator namespace.

The `VirtualMachineImageCache` controller caches the OVF for the image in a `ConfigMap` resource in the VM Operator namespace. This completely obviates the need to maintain a bespoke, in-memory OVF cache.

The `VirtualMachineImageCache` resource caches the image's disks on specified datastores by setting `spec.locations` with entries that map to unique datacenter/datastore IDs. The resource's status reveals the location(s) of the cached disk(s).

For a more in-depth look at how the disks are actually cached, please refer to PR #823.

### Create VM

If the `VirtualMachineImageCache` object is not ready with the cached OVF or disks, then the VM will be re-enqueued once the `VirtualMachineImageCache` _is_ ready. Please note, while placement is required to know where to cache the disks, additional placement calls are not issued if a VM is actively awaiting a `VirtualMachineImageCache` resource. Beyond that, the create VM workflow depends on the fast-deploy mode:

#### Direct

1. The cached disks are copied into the VM's folder.
2. The ConfigSpec is updated to reference the disks.
    a. Please note, if the VM is encrypted, the disks are not encrypted as part of the create call. This is because it is not possible to change the encryption state of disks when adding them to a VM. Thus the disks are encrypted after the VM is created, before it is powered on.
3. The `CreateVM_Task` VMODL1 API is used to create the VM.

#### Linked

1. The `VirtualDisk` devices in the ConfigSpec used to create the VM are updated with `VirtualDiskFlatVer2BackingInfo` backings that specify a parent backing which refers to the cached, base disk from above. The path to each of the VM's disks is constructed based on the index of the disk, e.g.: `[<DATASTORE>] <KUBE_VM_OBJ_UUID>/<KUBE_VM_NAME>-<DISK_INDEX>.vmdk`.
2. The `CreateVM_Task` VMODL1 API is used to create the VM. Because the VM's disks have parent backings, this new VM is effectively a linked clone.
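The disk path pattern described for linked mode can be illustrated with a minimal Go sketch. The function name `linkedDiskPath` and the sample values are hypothetical and not taken from the patch; starting the disk index at 0 is also an assumption, since the description only says the path is based on the index of the disk.

```go
package main

import "fmt"

// linkedDiskPath builds a datastore path following the pattern
// "[<DATASTORE>] <KUBE_VM_OBJ_UUID>/<KUBE_VM_NAME>-<DISK_INDEX>.vmdk".
// This helper is hypothetical; it only mirrors the pattern from the
// description and is not the function used by VM Operator.
func linkedDiskPath(datastore, vmUID, vmName string, diskIndex int) string {
	return fmt.Sprintf("[%s] %s/%s-%d.vmdk", datastore, vmUID, vmName, diskIndex)
}

func main() {
	// Sample values are made up for illustration.
	// Assumption: disk indices start at 0.
	for i := 0; i < 2; i++ {
		fmt.Println(linkedDiskPath("datastore1", "0f1e2d3c", "my-vm", i))
	}
}
```

Each child disk gets its own path inside the VM's folder, while its parent backing would point at the cached base disk on the same datastore.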
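The mode-selection rules from the Activation section can be sketched in Go as follows. This is an illustration under stated assumptions, not the code from the patch: the names `Mode`, `parseMode`, and `resolveMode` are hypothetical, the environment is passed in as a map to keep the sketch testable, the feature gate is compared against the exact string `true`, and an annotation that is present but empty is treated like an empty `FAST_DEPLOY_MODE` (that is, `direct`).

```go
package main

import "fmt"

const (
	envFeatureGate = "FSS_WCP_VMSERVICE_FAST_DEPLOY"
	envDefaultMode = "FAST_DEPLOY_MODE"
	annotationMode = "vmoperator.vmware.com/fast-deploy"
)

// Mode is a hypothetical type naming the possible outcomes.
type Mode string

const (
	ModeDisabled Mode = "disabled"
	ModeDirect   Mode = "direct"
	ModeLinked   Mode = "linked"
)

// parseMode applies the value rules from the description:
// "direct" or empty selects direct, "linked" selects linked,
// and anything else disables fast deploy.
func parseMode(v string) Mode {
	switch v {
	case "", "direct":
		return ModeDirect
	case "linked":
		return ModeLinked
	default:
		return ModeDisabled
	}
}

// resolveMode returns the effective mode for one VM.
// Assumption: only the exact value "true" enables the feature.
// Assumption: a present annotation, even if empty, overrides the default.
func resolveMode(env, annotations map[string]string) Mode {
	if env[envFeatureGate] != "true" {
		// Neither the default mode nor the annotation has any effect.
		return ModeDisabled
	}
	if v, ok := annotations[annotationMode]; ok {
		return parseMode(v)
	}
	return parseMode(env[envDefaultMode])
}

func main() {
	env := map[string]string{envFeatureGate: "true", envDefaultMode: "linked"}
	fmt.Println(resolveMode(env, nil))
	fmt.Println(resolveMode(env, map[string]string{annotationMode: "direct"}))
	fmt.Println(resolveMode(env, map[string]string{annotationMode: "off"}))
}
```

Under these assumptions, a VM can opt out of fast deploy by setting the annotation to any value other than `direct`, `linked`, or the empty string.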
akutz force-pushed the feature/fast-deploy-direct branch from e8333fa to 1a7f124 (January 10, 2025 21:43)
**Which issue(s) is/are addressed by this PR?** (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged):

Fixes NA

**Are there any special notes for your reviewer:**

**Please add a release note if necessary:**