Fast Deploy Direct & Linked (Experimental)
This patch adds support for the Fast Deploy Direct and Linked features,
i.e. the ability to cache images per-datastore and quickly provision a
VM from these caches, either directly or as a linked clone. This is an
experimental feature that must be enabled manually. There are many
things about this feature that may change prior to it being ready for
production.

The patch notes below are broken down into several sections:

* **Goals** -- What is currently supported
* **Non-goals** -- What is not on the table right now
* **Architecture**
    * **Activation** -- How to enable this experimental feature
    * **Placement** --  Request datastore recommendations
    * **Image cache** -- A general-purpose VM image cache
    * **Create VM** -- Create directly from cached disk

The following goals are considered in scope for this experimental
feature at this time. An item's absence from this list does not mean it
will not be added before the feature is made generally available:

* Support all VM images that are OVFs
* Support multiple zones
* Support workload-domain isolation
* Support all datastore types, including host-local and vSAN
* Support for configuring a default fast-deploy mode
* Support picking the fast-deploy mode per VM (direct, linked)
* Support disabling fast-deploy per VM
* Support VM encryption for VMs deployed with fast deploy direct
* Support backup/restore for VMs deployed with fast deploy direct
* Support site replication for VMs deployed with fast deploy direct
* Support datastore maintenance/migration for VMs deployed with fast
  deploy direct

The following is a list of non-goals that are not in scope at this time,
although most of them should be revisited prior to this feature
graduating to production:

* Support VM images that are VM templates (VMTX)

    The architecture behind Fast Deploy makes it trivial to support
    deploying VM images that point to VM templates. While not in scope
    at this time, this is likely to become part of the feature before it
    graduates to production.

The architecture is broken down into the following sections:

* **Activation** -- How to enable this experimental feature
* **Placement**  -- Request datastore recommendations
* **Image cache** -- A general-purpose VM image cache
* **Create VM**  -- Create directly from cached disk

Enabling the experimental Fast Deploy feature requires setting the
environment variable `FSS_WCP_VMSERVICE_FAST_DEPLOY` to `true` in the VM
Operator deployment. The environment variable `FAST_DEPLOY_MODE` may be
set to one of the following values to configure the default mode for the
fast-deploy feature:

* `direct` -- VMs are deployed using cached disks
* `linked` -- VMs are deployed as a linked clone
* the value is empty -- `direct` mode is used
* the value is anything else -- fast deploy is disabled

It is possible to override the default mode per-VM by setting the
annotation `vmoperator.vmware.com/fast-deploy`. The values of this
annotation follow the same rules described above.

Please note, setting the environment variable `FAST_DEPLOY_MODE` or the
annotation `vmoperator.vmware.com/fast-deploy` has no effect if the
feature is not enabled.
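
The activation rules above can be sketched as a small resolver. This is
only an illustration, not the code from this patch: the function and
constant names are hypothetical, and it assumes that a present
annotation always overrides the default while an absent annotation falls
back to it.

```go
package main

import "fmt"

// Return values for ResolveMode. ModeDisabled is a hypothetical sentinel
// used by this sketch to mean "do not use fast deploy".
const (
	ModeDirect   = "direct"
	ModeLinked   = "linked"
	ModeDisabled = "disabled"
)

// annotationKey is the per-VM override annotation named above.
const annotationKey = "vmoperator.vmware.com/fast-deploy"

// normalize maps a raw value to a mode: empty or "direct" selects direct,
// "linked" selects linked, and anything else disables fast deploy.
func normalize(v string) string {
	switch v {
	case "", ModeDirect:
		return ModeDirect
	case ModeLinked:
		return ModeLinked
	default:
		return ModeDisabled
	}
}

// ResolveMode returns the effective mode for one VM.
// Assumption: an annotation that is present wins over the default mode,
// and neither setting has any effect when the feature is not enabled.
func ResolveMode(featureEnabled bool, defaultMode string, annotations map[string]string) string {
	if !featureEnabled {
		return ModeDisabled
	}
	if v, ok := annotations[annotationKey]; ok {
		return normalize(v)
	}
	return normalize(defaultMode)
}

func main() {
	fmt.Println(ResolveMode(true, "", nil))
	fmt.Println(ResolveMode(true, "direct", map[string]string{annotationKey: "linked"}))
	fmt.Println(ResolveMode(false, "linked", nil))
}
```

Here `featureEnabled` stands in for `FSS_WCP_VMSERVICE_FAST_DEPLOY` and
`defaultMode` for `FAST_DEPLOY_MODE`.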

Please refer to PR #823 for information on placement as the logic from
that change has stayed the same in this one.

The way the images/disks are cached has completely changed since the
earlier PR. Images and disks are now cached by the
`VirtualMachineImageCache` resource, which is:

* not visible to DevOps users
* a namespace-scoped resource that only exists in the same namespace as
  the VM Operator controller pod
* used to cache the OVF and an image's disks

A `VirtualMachineImageCache` resource is created per unique library item
resource. That means even if there are 20,000 VMI resources spread
across a multitude of namespaces or at the cluster scope, if they all
point to the same underlying library item, then for all those VMI
resources there will be a single `VirtualMachineImageCache` resource in
the VM Operator namespace.
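
The one-cache-per-library-item relationship can be illustrated with a
short sketch. The `image` type and `cacheKeys` function are hypothetical
simplifications, not types from this patch. The sketch only shows that
any number of images collapse to one key per unique library item.

```go
package main

import (
	"fmt"
	"sort"
)

// image is a hypothetical, simplified stand-in for a VMI resource.
type image struct {
	Namespace     string // empty for a cluster-scoped image
	Name          string
	LibraryItemID string
}

// cacheKeys returns one key per unique library item, regardless of how
// many images in how many namespaces point at that item.
func cacheKeys(images []image) []string {
	seen := map[string]struct{}{}
	for _, img := range images {
		seen[img.LibraryItemID] = struct{}{}
	}
	keys := make([]string, 0, len(seen))
	for k := range seen {
		keys = append(keys, k)
	}
	sort.Strings(keys) // stable order for display
	return keys
}

func main() {
	images := []image{
		{Namespace: "ns-a", Name: "vmi-1", LibraryItemID: "item-1"},
		{Namespace: "ns-b", Name: "vmi-2", LibraryItemID: "item-1"},
		{Namespace: "", Name: "vmi-3", LibraryItemID: "item-2"},
	}
	fmt.Println(cacheKeys(images))
}
```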

The `VirtualMachineImageCache` controller caches the OVF for the image
in a `ConfigMap` resource in the VM Operator namespace. This completely
obviates the need to maintain a bespoke, in-memory OVF cache.

The `VirtualMachineImageCache` resource caches the image's disks on
specified datastores by setting `spec.locations` with entries that map
to unique datacenter/datastore IDs. The resource's status reveals the
location(s) of the cached disk(s).
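
A minimal sketch of how entries in `spec.locations` might be kept unique
per datacenter/datastore pair. The struct and field names below are
hypothetical and chosen for illustration. They are not the API types
added by this patch.

```go
package main

import "fmt"

// location is a hypothetical entry identifying where disks should be cached.
type location struct {
	DatacenterID string
	DatastoreID  string
}

// cacheSpec is a hypothetical, simplified view of the resource's spec.
type cacheSpec struct {
	Locations []location
}

// addLocation appends the datacenter/datastore pair unless it is already
// present. It reports whether the spec was changed.
func (s *cacheSpec) addLocation(datacenterID, datastoreID string) bool {
	want := location{DatacenterID: datacenterID, DatastoreID: datastoreID}
	for _, have := range s.Locations {
		if have == want {
			return false
		}
	}
	s.Locations = append(s.Locations, want)
	return true
}

func main() {
	var spec cacheSpec
	fmt.Println(spec.addLocation("datacenter-1", "datastore-1"))
	fmt.Println(spec.addLocation("datacenter-1", "datastore-1"))
	fmt.Println(spec.addLocation("datacenter-1", "datastore-2"))
	fmt.Println(len(spec.Locations))
}
```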

For a more in-depth look at how the disks are actually cached, please
refer to PR #823.

If the `VirtualMachineImageCache` object is not ready with the cached
OVF or disks, then the VM will be re-enqueued once the
`VirtualMachineImageCache` _is_ ready. Please note, while placement is
required to know where to cache the disks, additional placement calls
are not issued if a VM is actively awaiting a `VirtualMachineImageCache`
resource. Beyond that, the create VM workflow depends on the fast-deploy
mode:

In `direct` mode:

1. The cached disks are copied into the VM's folder.

2. The ConfigSpec is updated to reference the disks.

  a. Please note, if the VM is encrypted, the disks are not added as
     part of the create call. This is because it is not possible to
     change the encryption state of disks when adding them to a VM. Thus
     the disks are encrypted after the VM is created, before it is
     powered on.

3. The `CreateVM_Task` VMODL1 API is used to create the VM.

In `linked` mode:

1. The `VirtualDisk` devices in the ConfigSpec used to create the VM are
   updated with `VirtualDiskFlatVer2BackingInfo` backings that specify a
   parent backing which refers to the cached, base disk from above.

   The path to each of the VM's disks is constructed based on the index
   of the disk, e.g.:
   `[<DATASTORE>] <KUBE_VM_OBJ_UUID>/<KUBE_VM_NAME>-<DISK_INDEX>.vmdk`.

2. The `CreateVM_Task` VMODL1 API is used to create the VM. Because the
   VM's disks have parent backings, this new VM is effectively a
   linked clone.
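
The disk path format and the parent-backing relationship described above
can be sketched as follows. The `backing` struct is a simplified,
hypothetical stand-in for `VirtualDiskFlatVer2BackingInfo`, the helper
names are made up for illustration, and the sample datastore, UUID, and
cached disk path are placeholder values.

```go
package main

import "fmt"

// backing is a simplified, hypothetical stand-in for
// VirtualDiskFlatVer2BackingInfo: a file name plus an optional parent.
type backing struct {
	FileName string
	Parent   *backing
}

// diskPath builds a path of the form
// "[<DATASTORE>] <KUBE_VM_OBJ_UUID>/<KUBE_VM_NAME>-<DISK_INDEX>.vmdk".
func diskPath(datastore, vmUID, vmName string, diskIndex int) string {
	return fmt.Sprintf("[%s] %s/%s-%d.vmdk", datastore, vmUID, vmName, diskIndex)
}

// linkedBacking returns a backing for the VM's own disk whose parent
// refers to the cached base disk, which is what makes the VM a linked clone.
func linkedBacking(childPath, cachedBasePath string) backing {
	return backing{
		FileName: childPath,
		Parent:   &backing{FileName: cachedBasePath},
	}
}

func main() {
	child := diskPath("datastore1", "1234-abcd", "my-vm", 0)
	b := linkedBacking(child, "[datastore1] cache/base.vmdk")
	fmt.Println(b.FileName)
	fmt.Println(b.Parent.FileName)
}
```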
akutz committed Jan 10, 2025
1 parent 0936429 commit ad2cf24
Showing 71 changed files with 6,155 additions and 2,644 deletions.
234 changes: 120 additions & 114 deletions .golangci.yml
@@ -82,6 +82,8 @@ linters-settings:
pkg: github.com/vmware-tanzu/vm-operator/pkg/config
- alias: pkgctx
pkg: github.com/vmware-tanzu/vm-operator/pkg/context
- alias: pkgerr
pkg: github.com/vmware-tanzu/vm-operator/pkg/errors
- alias: ctxop
pkg: github.com/vmware-tanzu/vm-operator/pkg/context/operation
- alias: pkgmgr
@@ -90,59 +92,63 @@ linters-settings:
pkg: github.com/vmware-tanzu/vm-operator/pkg/util
- alias: proberctx
pkg: github.com/vmware-tanzu/vm-operator/pkg/prober/context
- alias: dsutil
pkg: github.com/vmware-tanzu/vm-operator/pkg/util/vsphere/datastore
- alias: clsutil
pkg: github.com/vmware-tanzu/vm-operator/pkg/util/vsphere/library

depguard:
rules:
main:
list-mode: lax # allow unless explicitly denied
files:
- $all
- "!$test"
- "!**/test/builder/*.go"
- "!**/matcher.go"
- $all
- "!$test"
- "!**/test/builder/*.go"
- "!**/matcher.go"
deny:
- pkg: io/ioutil
desc: "replaced by io and os packages since Go 1.16: https://tip.golang.org/doc/go1.16#ioutil"
- pkg: github.com/pkg/errors
desc: "replaced by stdlib errors package since Go.13: https://go.dev/blog/go1.13-errors"
- pkg: k8s.io/utils
desc: "replaced by internal packages like pkg/util/ptr"
- pkg: testing
desc: "do not import testing packages in non-test sources"
- pkg: github.com/onsi/ginkgo$
desc: "do not import testing packages in non-test sources"
- pkg: github.com/onsi/ginkgo/v2
desc: "do not import testing packages in non-test sources"
- pkg: github.com/onsi/gomega
desc: "do not import testing packages in non-test sources"
- pkg: io/ioutil
desc: "replaced by io and os packages since Go 1.16: https://tip.golang.org/doc/go1.16#ioutil"
- pkg: github.com/pkg/errors
desc: "replaced by stdlib errors package since Go.13: https://go.dev/blog/go1.13-errors"
- pkg: k8s.io/utils
desc: "replaced by internal packages like pkg/util/ptr"
- pkg: testing
desc: "do not import testing packages in non-test sources"
- pkg: github.com/onsi/ginkgo$
desc: "do not import testing packages in non-test sources"
- pkg: github.com/onsi/ginkgo/v2
desc: "do not import testing packages in non-test sources"
- pkg: github.com/onsi/gomega
desc: "do not import testing packages in non-test sources"
test:
list-mode: lax # allow unless explicitly denied
files:
- $test
- $test
deny:
- pkg: io/ioutil
desc: "replaced by io and os packages since Go 1.16: https://tip.golang.org/doc/go1.16#ioutil"
- pkg: github.com/pkg/errors
desc: "replaced by stdlib errors package since Go.13: https://go.dev/blog/go1.13-errors"
- pkg: k8s.io/utils
desc: "replaced by internal packages like pkg/util/ptr"
- pkg: github.com/onsi/ginkgo$
desc: "replaced by github.com/onsi/ginkgo/v2"
- pkg: io/ioutil
desc: "replaced by io and os packages since Go 1.16: https://tip.golang.org/doc/go1.16#ioutil"
- pkg: github.com/pkg/errors
desc: "replaced by stdlib errors package since Go.13: https://go.dev/blog/go1.13-errors"
- pkg: k8s.io/utils
desc: "replaced by internal packages like pkg/util/ptr"
- pkg: github.com/onsi/ginkgo$
desc: "replaced by github.com/onsi/ginkgo/v2"
test-builder:
list-mode: lax # allow unless explicitly denied
files:
- "**/test/builder/*.go"
- "**/matcher.go"
- "!$test"
- "**/test/builder/*.go"
- "**/matcher.go"
- "!$test"
deny:
- pkg: io/ioutil
desc: "replaced by io and os packages since Go 1.16: https://tip.golang.org/doc/go1.16#ioutil"
- pkg: github.com/pkg/errors
desc: "replaced by stdlib errors package since Go.13: https://go.dev/blog/go1.13-errors"
- pkg: k8s.io/utils
desc: "replaced by internal packages like pkg/util/ptr"
- pkg: github.com/onsi/ginkgo$
desc: "replaced by github.com/onsi/ginkgo/v2"
- pkg: io/ioutil
desc: "replaced by io and os packages since Go 1.16: https://tip.golang.org/doc/go1.16#ioutil"
- pkg: github.com/pkg/errors
desc: "replaced by stdlib errors package since Go.13: https://go.dev/blog/go1.13-errors"
- pkg: k8s.io/utils
desc: "replaced by internal packages like pkg/util/ptr"
- pkg: github.com/onsi/ginkgo$
desc: "replaced by github.com/onsi/ginkgo/v2"
errorlint:
# Check whether fmt.Errorf uses the %w verb for formatting errors.
errorf: false
@@ -152,38 +158,38 @@ linters-settings:
linters:
disable-all: true
enable:
- errorlint
- asciicheck
- bodyclose
- depguard
- dogsled
- errcheck
- exportloopref
- goconst
- gocritic
- gocyclo
- godot
- gofmt
- goimports
- goprintffuncname
- gosec
- gosimple
- govet
- importas
- ineffassign
- misspell
- nakedret
- nilerr
- nolintlint
- prealloc
- revive
- rowserrcheck
- staticcheck
- stylecheck
- typecheck
- unconvert
- unparam
- unused
- errorlint
- asciicheck
- bodyclose
- depguard
- dogsled
- errcheck
- exportloopref
- goconst
- gocritic
- gocyclo
- godot
- gofmt
- goimports
- goprintffuncname
- gosec
- gosimple
- govet
- importas
- ineffassign
- misspell
- nakedret
- nilerr
- nolintlint
- prealloc
- revive
- rowserrcheck
- staticcheck
- stylecheck
- typecheck
- unconvert
- unparam
- unused

issues:
max-same-issues: 0
@@ -193,51 +199,51 @@ issues:
# nitpicking.
exclude-use-default: false
exclude-dirs:
- external
- pkg/util/cloudinit/schema
- pkg/util/netplan/schema
- external
- pkg/util/cloudinit/schema
- pkg/util/netplan/schema
exclude-files:
- ".*generated.*\\.go"
- ".*generated.*\\.go"
exclude:
# TODO: Remove the following exclusions over time once we have fixed those.
- "ST1000: at least one file in a package should have a package comment"
# TODO: Remove the following exclusions over time once we have fixed those.
- "ST1000: at least one file in a package should have a package comment"
# List of regexps of issue texts to exclude, empty list by default.
exclude-rules:
- linters:
- staticcheck
text: "^SA1019: [^.]+.Wait is deprecated: Please use WaitEx instead."
- linters:
- staticcheck
text: "^SA1019: [^.]+.WaitForResult is deprecated: Please use WaitForResultEx instead."
- linters:
- revive
text: ".*should have (a package )?comment.*"
- linters:
- revive
text: "^exported: comment on exported const"
- linters:
- staticcheck
text: "^SA1019: .*TCPSocket is deprecated"
- linters:
- govet
text: "printf: non-constant format string in call"
# Dot imports for gomega or ginkgo are allowed within test files.
- path: test/builder/intg_test_context.go
text: should not use dot imports
- path: test/builder/test_suite.go
text: should not use dot imports
- path: test/builder/vcsim_test_context.go
text: should not use dot imports
- path: _test.go
text: should not use dot imports
# All of our webhooks follow the pattern of passing the webhook context which
# contains fields like the Client. Ignore the linter warnings for now.
- path: webhooks/
text: ".* `ctx` is unused"
- path: _test.go
linters:
- gosec
- depguard
- linters:
- revive
text: "unused-parameter: parameter"
- linters:
- staticcheck
text: "^SA1019: [^.]+.Wait is deprecated: Please use WaitEx instead."
- linters:
- staticcheck
text: "^SA1019: [^.]+.WaitForResult is deprecated: Please use WaitForResultEx instead."
- linters:
- revive
text: ".*should have (a package )?comment.*"
- linters:
- revive
text: "^exported: comment on exported const"
- linters:
- staticcheck
text: "^SA1019: .*TCPSocket is deprecated"
- linters:
- govet
text: "printf: non-constant format string in call"
# Dot imports for gomega or ginkgo are allowed within test files.
- path: test/builder/intg_test_context.go
text: should not use dot imports
- path: test/builder/test_suite.go
text: should not use dot imports
- path: test/builder/vcsim_test_context.go
text: should not use dot imports
- path: _test.go
text: should not use dot imports
# All of our webhooks follow the pattern of passing the webhook context which
# contains fields like the Client. Ignore the linter warnings for now.
- path: webhooks/
text: ".* `ctx` is unused"
- path: _test.go
linters:
- gosec
- depguard
- linters:
- revive
text: "unused-parameter: parameter"
12 changes: 10 additions & 2 deletions api/v1alpha3/virtualmachineimage_types.go
@@ -252,6 +252,14 @@ type VirtualMachineImageStatus struct {
Type string `json:"type,omitempty"`
}

+func (i VirtualMachineImageStatus) GetConditions() []metav1.Condition {
+	return i.Conditions
+}
+
+func (i *VirtualMachineImageStatus) SetConditions(conditions []metav1.Condition) {
+	i.Conditions = conditions
+}

// +kubebuilder:object:root=true
// +kubebuilder:resource:scope=Namespaced,shortName=vmi;vmimage
// +kubebuilder:storageversion
@@ -273,7 +281,7 @@ type VirtualMachineImage struct {
Status VirtualMachineImageStatus `json:"status,omitempty"`
}

-func (i *VirtualMachineImage) GetConditions() []metav1.Condition {
+func (i VirtualMachineImage) GetConditions() []metav1.Condition {
return i.Status.Conditions
}

@@ -311,7 +319,7 @@ type ClusterVirtualMachineImage struct {
Status VirtualMachineImageStatus `json:"status,omitempty"`
}

-func (i *ClusterVirtualMachineImage) GetConditions() []metav1.Condition {
+func (i ClusterVirtualMachineImage) GetConditions() []metav1.Condition {
return i.Status.Conditions
}
