add user stories for disk support #1681

---
title: multiple-disk-support
authors:
- "@kannon92"
reviewers:
- "tbd"
approvers:
- "tbd"
api-approvers:
- "tbd"
creation-date: 2024-09-17
last-updated: 2024-09-17
tracking-link:
- "https://issues.redhat.com/browse/OCPSTRAT-615"
- "https://issues.redhat.com/browse/OCPSTRAT-1592"
see-also:
- "https://github.com/openshift/enhancements/pull/1657"
replaces:
superseded-by:
---

# Disk Support in OpenShift

## Release Signoff Checklist

- [ ] Enhancement is `implementable`
- [ ] Design details are appropriately documented from clear requirements
- [ ] Test plan is defined
- [ ] Operational readiness criteria is defined
- [ ] Graduation criteria for dev preview, tech preview, GA
- [ ] User-facing documentation is created in [openshift-docs](https://github.com/openshift/openshift-docs/)

## Summary

OpenShift has traditionally been a single-disk system: the installer lays down the OpenShift filesystem on a single disk. However, new customer use cases have highlighted the need to configure additional disks at installation time.

## Motivation

Customers request the ability to add disks for day 0 and day 1 operations (day 0 covering planning and install-time configuration, day 1 covering initial deployment and configuration of the cluster). Common requests include a dedicated disk for etcd, a dedicated disk for swap partitions, a separate disk for the container runtime filesystem, and a separate filesystem for container images.

All of these features can be supported through some combination of machine configs and Machine API changes; throughout this document, "infrastructure platform" refers to that platform-specific combination of machine config and Machine API configuration.

However, support varies across these use cases, and there is a need to define a general API for disk support so that we can offer a common experience across the different methods.
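
To make this concrete, the following is a purely hypothetical sketch of what a platform-agnostic disk declaration could look like. None of these fields exist today; every name and role below is invented for illustration.

```yaml
# Hypothetical API sketch: no such fields exist today. The idea is a
# single declaration of additional disks and their intended roles,
# translated by each infrastructure platform into its own machine
# config and Machine API settings.
dataDisks:
  - name: etcd
    sizeGiB: 128
    role: etcd               # mounted at /var/lib/etcd
  - name: swap
    sizeGiB: 32
    role: swap               # formatted and enabled as swap
  - name: container-runtime
    sizeGiB: 256
    role: container-runtime  # mounted at /var/lib/containers
```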

### Workflow Description

### Goals

- Define a common interface for infrastructure platforms to implement to use additional disks for a defined set of specific uses
- Implement common behaviour to safely use the above disks when they have been presented by the infrastructure platform


### Non-Goals

- Adding disk support in CAPI providers where it is not supported upstream
- Adding generic support for mounting arbitrary additional disks


## Proposal

### User Stories

#### Dedicated Disk for etcd

As a user, I would like to install a cluster with a dedicated disk for etcd.
Our recommended practices for etcd suggest using a dedicated disk for optimal performance.
Managing disk mounting through MCO can be challenging and may introduce additional issues.
Cluster API supports running etcd on a dedicated disk.

An example of this done via MCO is available on our [documentation pages](https://docs.openshift.com/container-platform/4.13/scalability_and_performance/recommended-performance-scale-practices/recommended-etcd-practices.html#move-etcd-different-disk_recommended-etcd-practices).
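
For illustration, a minimal Butane sketch of the mount-unit approach follows; the device name /dev/nvme1n1 is an assumption, and the linked documentation additionally covers migrating existing etcd data, which this omits.

```yaml
# Minimal sketch, not the full documented procedure. Assumes a spare
# device at /dev/nvme1n1; compile to a MachineConfig with butane.
variant: openshift
version: 4.14.0
metadata:
  name: 98-var-lib-etcd
  labels:
    machineconfiguration.openshift.io/role: master
systemd:
  units:
    - name: var-lib-etcd.mount
      enabled: true
      contents: |
        [Unit]
        Description=Mount a dedicated disk at /var/lib/etcd
        Before=local-fs.target
        [Mount]
        What=/dev/nvme1n1
        Where=/var/lib/etcd
        Type=xfs
        Options=defaults,prjquota
        [Install]
        WantedBy=local-fs.target
```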

#### Dedicated Disk for Swap Partitions

As a user, I would like to install a cluster with swap enabled on each node and utilize a dedicated disk for swap partitions.
A dedicated disk for swap would help prevent swap activity from impacting node performance.

If the swap partition lives on the same disk as the node filesystem, swap I/O can contend with system and workload I/O.
Placing swap on a dedicated disk mitigates that contention.
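
As a hedged sketch of one way this could look today, the Butane config below delivers a systemd swap unit; the device name /dev/nvme2n1 is an assumption, the device must already be formatted as swap, and enabling swap in the kubelet (for example `failSwapOn: false` and a `swapBehavior` via a KubeletConfig) is a separate step not shown here.

```yaml
# Illustrative sketch only; assumes /dev/nvme2n1 is already formatted
# as swap (mkswap). systemd requires the swap unit name to match the
# escaped device path.
variant: openshift
version: 4.14.0
metadata:
  name: 98-worker-swap
  labels:
    machineconfiguration.openshift.io/role: worker
systemd:
  units:
    - name: dev-nvme2n1.swap
      enabled: true
      contents: |
        [Unit]
        Description=Swap on a dedicated disk
        [Swap]
        What=/dev/nvme2n1
        [Install]
        WantedBy=multi-user.target
```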

#### Dedicated Disk for Container Runtime

As a user, I would like to install a cluster and assign a separate disk for the container runtime for each node.

This has been supported in Kubernetes for a long time, but it has been poorly documented; the [container runtime filesystem](https://kubernetes.io/blog/2024/01/23/kubernetes-separate-image-filesystem) blog post describes how it works.

A frequently linked KCS article on this topic is available [here](https://access.redhat.com/solutions/4952011).
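
The same mount-unit pattern as the etcd sketch above applies here; a hedged Butane sketch, again assuming a device at /dev/nvme1n1 (moving existing container storage may also require restoring SELinux labels, which this omits):

```yaml
# Sketch only; assumes a spare device at /dev/nvme1n1.
variant: openshift
version: 4.14.0
metadata:
  name: 98-var-lib-containers
  labels:
    machineconfiguration.openshift.io/role: worker
systemd:
  units:
    - name: var-lib-containers.mount
      enabled: true
      contents: |
        [Unit]
        Description=Mount a dedicated disk at /var/lib/containers
        Before=local-fs.target
        [Mount]
        What=/dev/nvme1n1
        Where=/var/lib/containers
        Type=xfs
        Options=defaults,prjquota
        [Install]
        WantedBy=local-fs.target
```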

#### Dedicated Disk for Image Storage

As a user, I would like to install a cluster with images stored on a dedicated disk, while keeping ephemeral storage on the node's main filesystem.

This story was the motivation for [KEP-4191](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/4191-split-image-filesystem/README.md).
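
As a hedged sketch, assuming the `imagestore` option in containers/storage and a dedicated disk already mounted at /mnt/image-store, the split could be configured roughly as follows; the paths and option placement are assumptions to verify against the containers/storage documentation.

```yaml
# Illustrative only: overwrites /etc/containers/storage.conf. Assumes a
# dedicated disk is already mounted at /mnt/image-store, and that the
# imagestore option is available in this release.
variant: openshift
version: 4.14.0
metadata:
  name: 98-worker-image-store
  labels:
    machineconfiguration.openshift.io/role: worker
storage:
  files:
    - path: /etc/containers/storage.conf
      mode: 0644
      overwrite: true
      contents:
        inline: |
          [storage]
          driver = "overlay"
          runroot = "/run/containers/storage"
          graphroot = "/var/lib/containers/storage"

          [storage.options]
          # Read-only image layers land here; writable container
          # layers stay on graphroot.
          imagestore = "/mnt/image-store"
```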

#### Container logs

As a user, I would like my container logs to be stored on a filesystem separate from both the node filesystem and the container runtime filesystem.

[OCPSTRAT-188](https://issues.redhat.com/browse/OCPSTRAT-188) is one feature request for this.

[RFE-2734](https://issues.redhat.com/browse/RFE-2734) is another request for separating logs.

### API Extensions

### Implementation Details/Notes/Constraints

### Topology Considerations

#### Hypershift / Hosted Control Planes

HyperShift does not leverage the Machine API, so this proposal does not directly affect it.
However, Hosted Control Planes do support injecting a MachineConfig into the NodePool definition on the management cluster, so the use cases described above (particularly swap) should be achievable in that form factor as well.
There is notable interest in running OpenShift Virtualization on HCP clusters with bare-metal workers to reduce the number of physical nodes needed for control planes, which makes a consistent set of disk layout options important even when the configuration is defined externally via NodePool/MachineConfig injection.
We will make sure to consider NodePool in this design, as sketched below.

#### Standalone Clusters


#### Single-node Deployments or MicroShift

Single Node OpenShift and MicroShift do not leverage the Machine API.

### Risks and Mitigations

N/A

## Design Details

### Open Questions

## Test Plan

## Graduation Criteria

### Dev Preview -> Tech Preview

N/A

### Tech Preview -> GA

N/A

### Removing a deprecated feature

No features will be removed as a part of this proposal.

## Upgrade / Downgrade Strategy

## Version Skew Strategy

N/A

## Operational Aspects of API Extensions

#### Failure Modes


## Support Procedures

## Implementation History

N/A

### Drawbacks

N/A

## Alternatives

### Future implementation