Huge page support for guest memory #2139

Open
3 tasks
pclesr opened this issue Sep 16, 2020 · 6 comments
Labels
Roadmap: Tracked Items tracked on the roadmap project. Type: Enhancement Indicates new feature requests

Comments

@pclesr

pclesr commented Sep 16, 2020

Why is this feature request important? What are the use cases? Please describe.

Page faulting can increase startup time. In the development of Nitro Hypervisor support for Arm, supporting huge pages for guest memory made the difference between hitting the target SPEC numbers and not. Since there was no hugepage fs in Nitro Hypervisor, the only alternative was to make the kernel support huge pud/pmd for Arm. I would not advocate that route; I would rather have the ability to use a mounted hugetlbfs.

For embedded environments where everything is restricted, being able to allocate a specific number of huge pages at boot that can be used for guests would decrease the startup time of the app by reducing faults.
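As an illustration of the boot-time pool idea: on Linux such a pool can be reserved with the `hugepages=N` kernel parameter, and its state shows up in the `HugePages_Total`, `HugePages_Free`, and `Hugepagesize` fields of `/proc/meminfo`. The sketch below is not Firecracker code. The `HugePagePool` type and both helper functions are hypothetical names, and the check only looks at the default huge page size.

```rust
/// Snapshot of the kernel's default huge page pool (hypothetical helper type).
#[derive(Debug, PartialEq)]
pub struct HugePagePool {
    pub total: u64,
    pub free: u64,
    pub page_size_kib: u64,
}

/// Parse the huge page fields out of text in `/proc/meminfo` format.
/// Returns `None` if any of the three fields is missing or malformed.
pub fn parse_hugepage_pool(meminfo: &str) -> Option<HugePagePool> {
    let field = |name: &str| -> Option<u64> {
        meminfo.lines().find_map(|line| {
            let rest = line.strip_prefix(name)?.strip_prefix(':')?;
            // Values look like "      64" or "    2048 kB"; keep the number.
            rest.split_whitespace().next()?.parse::<u64>().ok()
        })
    };
    Some(HugePagePool {
        total: field("HugePages_Total")?,
        free: field("HugePages_Free")?,
        page_size_kib: field("Hugepagesize")?,
    })
}

/// True if the free part of the pool could back a guest of `guest_mib` MiB.
pub fn pool_can_back_guest(pool: &HugePagePool, guest_mib: u64) -> bool {
    if pool.page_size_kib == 0 {
        return false;
    }
    let needed_kib = guest_mib * 1024;
    // Round up: a partially used huge page is still consumed whole.
    let pages_needed = (needed_kib + pool.page_size_kib - 1) / pool.page_size_kib;
    pages_needed <= pool.free
}
```

In practice the input would come from `std::fs::read_to_string("/proc/meminfo")`, and the result could decide whether to request huge pages for a guest at all.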

Describe the desired solution

Since not every environment will have huge pages, there should be either a build-time or a run-time option to have the guest memory that is specified in the KVM_SET_USER_MEMORY_REGION ioctl backed by huge pages. I would propose using hugetlbfs and mmap() to allocate the anonymous memory. Since I barely know Rust, I can't tell how guest memory is currently allocated, but I would assume it is some mmap() call.
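For context, the argument of the KVM_SET_USER_MEMORY_REGION ioctl is a small struct that ties a guest physical range to the host address returned by mmap(). A minimal sketch of that struct follows, together with a hypothetical sanity check (`is_huge_page_aligned`, not part of any library) that the guest address, host address, and size are all multiples of an assumed 2 MiB huge page size. That alignment is generally what lets KVM map the region with large mappings.

```rust
/// Mirrors the layout of Linux's `struct kvm_userspace_memory_region`,
/// the argument of the KVM_SET_USER_MEMORY_REGION ioctl.
#[repr(C)]
#[derive(Clone, Copy, Debug, Default, PartialEq)]
pub struct KvmUserspaceMemoryRegion {
    pub slot: u32,
    pub flags: u32,
    pub guest_phys_addr: u64,
    pub memory_size: u64,    // size of the region in bytes
    pub userspace_addr: u64, // start of the mmap()ed host memory
}

/// Assumption for this sketch: 2 MiB huge pages.
pub const HUGE_PAGE_SIZE: u64 = 2 << 20;

/// Hypothetical check that a region can be covered by whole huge pages.
pub fn is_huge_page_aligned(region: &KvmUserspaceMemoryRegion) -> bool {
    let mask = HUGE_PAGE_SIZE - 1;
    (region.guest_phys_addr & mask) == 0
        && (region.userspace_addr & mask) == 0
        && (region.memory_size & mask) == 0
}
```

A mapping created with MAP_HUGETLB or on hugetlbfs is already aligned on the host side, so in this sketch the check mostly guards the guest physical address and the size.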

Describe possible alternatives

perf tracing shows that a lot of time is spent page faulting, especially on Arm. Obviously it works without huge pages, but they could reduce not only startup time but fault time as well.

Additional context

Running on a poor, resource-starved 2-core A57 in an embedded environment. Since it's embedded, there is control over everything. The goal is to reduce startup time and overhead of the KVM calls.

Checks

  • Have you searched the Firecracker Issues database for similar requests?
  • Have you read all the existing relevant Firecracker documentation?
  • Have you read and understood Firecracker's core tenets?
@pclesr
Author

pclesr commented Sep 23, 2020

As an additional data point, when running on a two-core A57, putting the kernel image and initrd into a hugetlb filesystem resulted in a 20% improvement. When running 'perf -ag' and looking at where the time is being spent, it was no longer spending all of its time faulting from devices::virtio::block::device::Block::process_queue. It still spends a lot of time handling faults from process_queue(), but at least it is no longer at the very top of the perf output.

Also, the console on Arm is extremely expensive; 'quiet' on the kernel command line is your friend.

@iulianbarbu

iulianbarbu commented Oct 5, 2020

Hi, @pclesr! Sorry for the delay and thanks for logging this feature request. I think we are interested in exposing such capabilities. We're currently using the guest memory primitives from rust-vmm/vm-memory. We need to contribute huge page support there first and release a new vm-memory version which can be consumed by Firecracker.

I've opened an issue on rust-vmm/vm-memory. We'll keep this issue here to track the progress on rust-vmm/vm-memory and the discussions around how Firecracker consumes and exposes the feature.

@EmeraldShift
Contributor

Hi, @iulianbarbu, I responded to the issue you opened on rust-vmm/vm-memory, expressing interest in that issue. We'd also like to work on the Firecracker end of that feature, too. Is there more information/context you can provide to help us get started?

@serban300
Contributor

Hi @EmeraldShift! Personally, I think the issue on the Firecracker end would be how exactly to expose the feature to the customer. Should it be an API call? Or should it be something else? We haven't discussed anything yet. In any case, we should wait for the rust-vmm/vm-memory implementation first. Some aspects might depend on the design that is adopted there.

@pclesr
Author

pclesr commented Oct 27, 2020

I did a simple proof of concept by patching MmapRegion() in src/mmap_unix.rs and saw that page faults went through the hugetlb handler (verified by running perf and looking at the kernel stacks).

diff --git a/src/mmap_unix.rs b/src/mmap_unix.rs
index 5d23de0..f0983d1 100644
--- a/src/mmap_unix.rs
+++ b/src/mmap_unix.rs
@@ -109,7 +109,7 @@ impl MmapRegion {
             None,
             size,
             libc::PROT_READ | libc::PROT_WRITE,
-            libc::MAP_ANONYMOUS | libc::MAP_NORESERVE | libc::MAP_PRIVATE,
+            libc::MAP_ANONYMOUS | libc::MAP_NORESERVE | libc::MAP_PRIVATE | libc::MAP_HUGETLB,
         )
     }

Obviously, this is not generic, but I wanted to just see if it would work.
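A more generic variant of the patch above might try huge pages first and fall back to normal pages when the attempt fails. The following is a minimal sketch under several assumptions, not the vm-memory implementation: `AnonRegion` is a hypothetical type, the flag values are the Linux ones for x86_64 and aarch64 written out by hand so that no external crate is needed (real code would use `libc`), and the huge page size is taken to be 2 MiB. Unlike the patch, the huge page attempt leaves out MAP_NORESERVE so that the kernel reserves the pages at mmap() time and an undersized pool is reported as an error there, rather than as a fault when the memory is first touched.

```rust
use std::ffi::c_void;
use std::io;

// Declared by hand so the sketch builds without the `libc` crate.
extern "C" {
    fn mmap(addr: *mut c_void, len: usize, prot: i32, flags: i32, fd: i32, offset: i64)
        -> *mut c_void;
    fn munmap(addr: *mut c_void, len: usize) -> i32;
}

// Linux values for x86_64 and aarch64 (assumption: 64-bit Linux target).
const PROT_READ: i32 = 0x1;
const PROT_WRITE: i32 = 0x2;
const MAP_PRIVATE: i32 = 0x02;
const MAP_ANONYMOUS: i32 = 0x20;
const MAP_NORESERVE: i32 = 0x4000;
const MAP_HUGETLB: i32 = 0x40000;
const HUGE_PAGE_SIZE: usize = 2 << 20; // assuming 2 MiB default huge pages

/// Hypothetical owner of an anonymous mapping used as guest memory.
pub struct AnonRegion {
    addr: *mut u8,
    len: usize,
    huge: bool,
}

impl AnonRegion {
    /// Map `len` bytes, preferring huge pages when `want_huge` is set.
    pub fn new(len: usize, want_huge: bool) -> io::Result<Self> {
        if want_huge {
            // hugetlb mappings are handled in whole huge pages, so round up.
            let huge_len = (len + HUGE_PAGE_SIZE - 1) & !(HUGE_PAGE_SIZE - 1);
            // No MAP_NORESERVE here: pages are reserved up front or mmap() fails.
            let flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB;
            if let Ok(addr) = Self::map(huge_len, flags) {
                return Ok(Self { addr, len: huge_len, huge: true });
            }
            // Huge pages unavailable: fall through to normal pages.
        }
        let addr = Self::map(len, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE)?;
        Ok(Self { addr, len, huge: false })
    }

    fn map(len: usize, flags: i32) -> io::Result<*mut u8> {
        // SAFETY: anonymous mapping with a null hint; the kernel picks the address.
        let p = unsafe {
            mmap(std::ptr::null_mut(), len, PROT_READ | PROT_WRITE, flags, -1, 0)
        };
        // mmap() signals failure with MAP_FAILED, which is (void *) -1.
        if (p as usize) == usize::MAX {
            Err(io::Error::last_os_error())
        } else {
            Ok(p as *mut u8)
        }
    }

    pub fn as_mut_ptr(&self) -> *mut u8 {
        self.addr
    }

    pub fn len(&self) -> usize {
        self.len
    }

    pub fn is_huge(&self) -> bool {
        self.huge
    }
}

impl Drop for AnonRegion {
    fn drop(&mut self) {
        // SAFETY: `addr` and `len` describe a mapping created in `new`.
        unsafe {
            munmap(self.addr as *mut c_void, self.len);
        }
    }
}
```

A caller would write something like `AnonRegion::new(128 << 20, true)` and check `is_huge()` afterwards to learn which path was taken. Whether to fall back silently or to fail loudly is a policy choice that this sketch does not settle.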

One possibility for Firecracker would be an option in the machine config that controls whether the pages for guest memory are backed by huge pages or normal pages.

@JonathanWoollett-Light JonathanWoollett-Light added Type: Enhancement Indicates new feature requests and removed Performance: Memory labels Mar 23, 2023
@roypat roypat added Roadmap: New Request Status: Parked Indicates that an issues or pull request will be revisited later labels Dec 4, 2023
@ShadowCurse ShadowCurse removed the Status: Parked Indicates that an issues or pull request will be revisited later label Jan 15, 2024
@ShadowCurse ShadowCurse moved this to We're Working On It in Firecracker Roadmap Jan 15, 2024
@roypat roypat moved this from We're Working On It to Coming Soon in Firecracker Roadmap Jan 24, 2024
@roypat roypat moved this from Coming Soon to Developer Preview in Firecracker Roadmap Mar 25, 2024
@roypat
Contributor

roypat commented Mar 25, 2024

Hey all,
We have added support for backing guest memory by 2M hugetlb pages with Firecracker 1.7. Please also see #4360 and https://github.com/firecracker-microvm/firecracker/blob/main/docs/hugepages.md. I'm keeping this issue open to track that hugepages support is in developer preview for now, so please let us know if you have any feedback on the feature!
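For readers who want to try this, the following is a rough sketch of building a machine configuration request body with huge pages turned on. It assumes that the setting is a `huge_pages` field on the `/machine-config` endpoint whose values are `"None"` and `"2M"`; please verify the field name and values against the linked hugepages.md. The `HugePages` enum and `machine_config_body` function are hypothetical helpers, not part of Firecracker.

```rust
/// Huge page setting for guest memory (hypothetical helper type).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum HugePages {
    None,
    Size2M,
}

impl HugePages {
    /// String form assumed to be accepted by the API; check hugepages.md.
    fn as_api_str(self) -> &'static str {
        match self {
            HugePages::None => "None",
            HugePages::Size2M => "2M",
        }
    }
}

/// Build a JSON body for a PUT request to `/machine-config`.
pub fn machine_config_body(vcpu_count: u8, mem_size_mib: u64, huge_pages: HugePages) -> String {
    format!(
        "{{\"vcpu_count\": {}, \"mem_size_mib\": {}, \"huge_pages\": \"{}\"}}",
        vcpu_count,
        mem_size_mib,
        huge_pages.as_api_str()
    )
}
```

The resulting string would be sent to the Firecracker API socket before the microVM is started. The host also needs enough free 2M hugetlb pages for the configured memory size.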

@roypat roypat added Roadmap: Tracked Items tracked on the roadmap project. and removed Roadmap: New Request labels Jun 19, 2024
Projects
Status: Developer Preview
8 participants