Support larger than 1000GiB volumes via multiple sub-volumes #1450

Open
leftwo opened this issue Sep 5, 2024 · 0 comments
leftwo commented Sep 5, 2024

Right now, the maximum size of a virtual disk in Nexus is capped at 1000GiB.
Nexus has this cap because we currently support only a single SubVolume in a VolumeConstructionRequest, and the maximum size of a single SubVolume is 1000GiB.

This SubVolume max, which is the largest size of a downstairs region we can create, was chosen as a compromise relating to:

  • The size of the SSD itself
  • How many open files a downstairs process consumes
  • Striking a balance between giant extent files and what they mean for repair times and flush overhead.

To increase the size of a virtual disk in Nexus, we need to implement support for multiple SubVolumes in a Volume. Making larger virtual disks by chaining together SubVolumes has been the plan for a while, and we already do something similar with read-only-parents.
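
As a rough illustration of what chaining could look like at the addressing level, here is a minimal sketch. It assumes sub-volumes are laid out back to back in the volume's block address space; `SubVolumeInfo`, `locate_block`, and `sub_volumes_needed` are hypothetical names for this example and are not Crucible's actual types or functions.

```rust
/// Hypothetical description of one sub-volume (not a Crucible type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SubVolumeInfo {
    /// Size of this sub-volume in blocks.
    pub blocks: u64,
}

/// Map a volume-level block address to (sub-volume index, block offset
/// inside that sub-volume), assuming sub-volumes are concatenated in order.
/// Returns None if the address is past the end of the volume or the
/// running total overflows.
pub fn locate_block(sub_volumes: &[SubVolumeInfo], lba: u64) -> Option<(usize, u64)> {
    let mut start = 0u64;
    for (index, sv) in sub_volumes.iter().enumerate() {
        let end = start.checked_add(sv.blocks)?;
        if lba < end {
            return Some((index, lba - start));
        }
        start = end;
    }
    None
}

/// How many sub-volumes a disk of `disk_gib` needs when each sub-volume
/// holds at most `max_sub_volume_gib` (ceiling division).
/// Returns None if the cap is zero.
pub fn sub_volumes_needed(disk_gib: u64, max_sub_volume_gib: u64) -> Option<u64> {
    if max_sub_volume_gib == 0 {
        return None;
    }
    let full = disk_gib / max_sub_volume_gib;
    let partial = u64::from(disk_gib % max_sub_volume_gib != 0);
    Some(full + partial)
}
```

With the current 1000GiB cap, a 2500GiB disk would need three sub-volumes under this layout, and an I/O that crosses a sub-volume boundary would have to be split into one request per sub-volume.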

There will be a different issue specifically for the work required in Nexus to stitch this together, but there is work in the Crucible repo we need to do first. Much of that will probably be around the Volume layer, but other places may need changes as we work through the details.

Some things to consider:

  • Unknown impact on performance, and on all the careful tuning and backpressure work.
  • This means two “upstairs”, one for each SubVolume. (This is currently how read_only_parents work)
  • Need to update crutest to use Volume instead of Guest, or make a new Volume level test?
  • SubVolumes in the same volume must share block size, but they don’t need to share extent size or count. (Could this lead to weird performance behavior?)
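
The block-size constraint in the last item can be sketched as a validation step. This is only an illustration under assumed names: `SubVolumeGeometry`, `GeometryError`, `check_block_sizes`, and `total_blocks` are hypothetical and do not reflect how Crucible's Volume layer is written.

```rust
/// Hypothetical geometry of one sub-volume (not a Crucible type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SubVolumeGeometry {
    pub block_size: u64,
    pub blocks_per_extent: u64,
    pub extent_count: u64,
}

#[derive(Debug, PartialEq, Eq)]
pub enum GeometryError {
    /// No sub-volumes were supplied.
    Empty,
    /// Sub-volume `index` has a block size that differs from the first one.
    BlockSizeMismatch { index: usize, expected: u64, found: u64 },
}

/// Check that every sub-volume shares the first one's block size.
/// Extent size and extent count are deliberately not compared.
pub fn check_block_sizes(svs: &[SubVolumeGeometry]) -> Result<u64, GeometryError> {
    let first = svs.first().ok_or(GeometryError::Empty)?;
    for (index, sv) in svs.iter().enumerate().skip(1) {
        if sv.block_size != first.block_size {
            return Err(GeometryError::BlockSizeMismatch {
                index,
                expected: first.block_size,
                found: sv.block_size,
            });
        }
    }
    Ok(first.block_size)
}

/// Total volume size in blocks, or None on arithmetic overflow.
pub fn total_blocks(svs: &[SubVolumeGeometry]) -> Option<u64> {
    svs.iter().try_fold(0u64, |acc, sv| {
        acc.checked_add(sv.blocks_per_extent.checked_mul(sv.extent_count)?)
    })
}
```

Because extent size and count are allowed to differ, two sub-volumes with the same block size but different extent layouts pass this check, which is exactly the case where the performance question above would apply.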

If we update crutest to support Volume instead of Guest:

  • The Guest type supports activate_with_gen(), query_work_queue(), and query_extent_size(), but Volume and the generic BlockIO trait do not. Support would need to be added to the Volume layer (other BlockIO types could NoOp these calls).
  • Many crutest tests look at extent size/count and test based on that. They would need new logic both to gather extent info from multiple layers and to decide how to test, which may change depending on how the sub volumes are constructed.
  • Maybe time to go through crutest and remove old/unused tests?
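
One way the "other BlockIO types could NoOp these calls" idea might look is a trait with default method bodies. The sketch below uses a simplified, synchronous stand-in trait: `BlockIoSketch`, `InMemoryDisk`, and `VolumeSketch` are hypothetical, and the method signatures are assumptions for illustration rather than the real BlockIO or Guest signatures.

```rust
/// Simplified stand-in for a BlockIO-style trait (hypothetical; the
/// signatures here are assumed and the real trait may differ).
pub trait BlockIoSketch {
    fn name(&self) -> &str;

    /// Default no-op for types that have no notion of a generation number.
    fn activate_with_gen(&self, _generation: u64) -> Result<(), String> {
        Ok(())
    }

    /// Default: None, meaning this type has no single extent size to report.
    fn query_extent_size(&self) -> Option<u64> {
        None
    }
}

/// A type that relies entirely on the defaults.
pub struct InMemoryDisk;

impl BlockIoSketch for InMemoryDisk {
    fn name(&self) -> &str {
        "in-memory"
    }
}

/// A volume-like type that overrides the extent size query.
pub struct VolumeSketch {
    /// Extent size (in blocks) of each sub-volume.
    pub sub_volume_extent_sizes: Vec<u64>,
}

impl BlockIoSketch for VolumeSketch {
    fn name(&self) -> &str {
        "volume"
    }

    /// Reports an extent size only when every sub-volume agrees on it.
    fn query_extent_size(&self) -> Option<u64> {
        let first = *self.sub_volume_extent_sizes.first()?;
        self.sub_volume_extent_sizes
            .iter()
            .all(|&size| size == first)
            .then_some(first)
    }
}
```

Returning a single optional value is just one possible design. Tests that depend on per-sub-volume extent layout would more likely need a query that returns one entry per sub-volume.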

If we make a new voltest:

  • Similar to crutest, but it could take a VCR in directly, as JSON or on the command line?
  • We would have to duplicate a bunch of functionality and tests that crutest provides now.