Use virtiofsd for sharing file system data with host
Switch over to using virtiofsd for sharing file system data with the
host. virtiofs is a file system designed for the needs of virtual
machines and environments. That is in contrast to the 9P file system,
which we currently use for sharing data with the host and which is
first and foremost a network file system.
9P is problematic if for no other reason than that it lacks proper
support for the "open-unlink-fstat idiom", in which files are unlinked
and later referenced via file descriptor (see danobi#83). virtiofs does
not have this problem.
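
To illustrate the idiom (a minimal self-contained sketch, not code from
this repository; the path is made up), the point is that a file stays
fully usable through an open descriptor after its path has been
unlinked:

  use std::fs::{self, File};
  use std::io::{Read, Seek, SeekFrom, Write};

  fn main() -> std::io::Result<()> {
      // Open (and create) a file, then unlink it by path while keeping
      // the descriptor around.
      let mut file = File::options()
          .read(true)
          .write(true)
          .create(true)
          .open("/mnt/vmtest/scratch")?;
      fs::remove_file("/mnt/vmtest/scratch")?;

      // The descriptor remains usable: write, fstat(2) (via `metadata`),
      // seek, and read all keep working even though the path is gone.
      // This is exactly the pattern that is brittle on 9P.
      file.write_all(b"data")?;
      assert_eq!(file.metadata()?.len(), 4);
      file.seek(SeekFrom::Start(0))?;
      let mut buf = String::new();
      file.read_to_string(&mut buf)?;
      assert_eq!(buf, "data");
      Ok(())
  }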

This change replaces usage of 9P with that of virtiofs. In order to
work, virtiofs needs a user space server: virtiofsd. Contrary to what
is the case for 9P, this server is not currently integrated into Qemu
itself and so we have to manage it separately (and require the user to
install it).
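
Roughly, the host-side wiring then looks like the following sketch (a
hypothetical, simplified version; socket and directory paths are made
up, and the real code generates a unique socket per export): one
virtiofsd process per shared directory, plus the matching Qemu
arguments.

  use std::process::Command;

  fn main() -> std::io::Result<()> {
      // Hypothetical paths for illustration only.
      let socket = "/tmp/vmtest-virtiofsd.sock";
      let shared = "/path/to/shared/dir";

      // One virtiofsd server per exported directory.
      let mut server = Command::new("virtiofsd")
          .arg(format!("--socket-path={socket}"))
          .arg(format!("--shared-dir={shared}"))
          .spawn()?;

      // Qemu connects to that socket through a vhost-user-fs device and
      // a shared memory backend; the guest later mounts the export by tag.
      let chardev = format!("socket,id=shared,path={socket}");
      let status = Command::new("qemu-system-x86_64")
          .args(["-object", "memory-backend-memfd,id=mem,share=on,size=4G"])
          .args(["-numa", "node,memdev=mem"])
          .args(["-chardev", chardev.as_str()])
          .args([
              "-device",
              "vhost-user-fs-pci,queue-size=1024,chardev=shared,tag=vmtest-shared",
          ])
          // ...kernel/image and the remaining usual arguments...
          .status()?;

      println!("qemu exited with {status}");
      let _ = server.kill();
      Ok(())
  }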

I benchmarked both the current master and this version with a
bare-bones custom kernel:
  Benchmark 1: target/release/vmtest -k bzImage-9p 'echo test'
    Time (mean ± σ):      1.316 s ±  0.087 s    [User: 0.462 s, System: 1.104 s]
    Range (min … max):    1.232 s …  1.463 s    10 runs

  Benchmark 1: target/release/vmtest -k bzImage-virtiofsd 'echo test'
    Time (mean ± σ):      1.244 s ±  0.011 s    [User: 0.307 s, System: 0.358 s]
    Range (min … max):    1.227 s …  1.260 s    10 runs

So it seems there is a ~0.07s speed up, on average (and significantly
less system time being used). This is great, but I suspect a more
pronounced speed advantage will be visible when working with large
files, where virtiofs is said to significantly outperform 9P
(typically >2x from what I understand, but I have not done any
benchmarks of that nature).

A few other notes:
- we rely solely on guest-level read-only mounts to enforce read-only
  state (see the sketch after this list). The virtiofsd recommended way
  is to use read-only bind mounts [0], but doing so would require root.
- we are not using DAX, because it is still incomplete and apparently
  requires building Qemu (?) from source. In any event, it should not
  change anything functionally and would solely be a performance
  improvement.
- interestingly, there may be the option of just consuming the virtiofsd
  crate as a library and not requiring any shelling out. That would be
  *much* nicer, but the current APIs make this somewhat cumbersome: I
  think we'd pretty much have to reimplement their entire main()
  functionality [1]. I consider this way out of scope for this first
  version.
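
For the first note, a minimal sketch of how the guest-side mount
arguments end up looking (the function name is made up; it mirrors the
mount_in_guest change in src/qemu.rs below). Read-only state is
enforced purely by the guest mount flags:

  /// Build the `mount` arguments used inside the guest for a virtiofs
  /// export. Read-only state is enforced here, not on the host side.
  fn guest_mount_args<'a>(mount_tag: &'a str, guest_path: &'a str, ro: bool) -> Vec<&'a str> {
      let mut args = vec!["-t", "virtiofs", mount_tag, guest_path];
      if ro {
          // Equivalent to `mount -o ro`; the host-side export stays writable.
          args.push("-oro");
      }
      args
  }

  fn main() {
      assert_eq!(
          guest_mount_args("vmtest-shared", "/mnt/vmtest", true),
          ["-t", "virtiofs", "vmtest-shared", "/mnt/vmtest", "-oro"],
      );
  }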

I have adjusted the configs, but because I don't have Docker handy I
can't really create those kernels. CI seems incapable of producing the
artifacts without doing a full-blown release dance. No idea what
"empty" is about, really. I suspect the test failures we see are
because it lacks support?

Some additional resources worth keeping around:
  - https://virtio-fs.gitlab.io/howto-boot.html
  - https://virtio-fs.gitlab.io/howto-qemu.html

[0] https://gitlab.com/virtio-fs/virtiofsd/-/blob/main/README.md?ref_type=heads#faq
[1] https://gitlab.com/virtio-fs/virtiofsd/-/blob/main/src/main.rs?ref_type=heads#L1242

Closes: danobi#16
Closes: danobi#83

Signed-off-by: Daniel Müller <[email protected]>
d-e-s-o committed Aug 27, 2024
1 parent b713eba commit b8fbb61
Showing 11 changed files with 248 additions and 66 deletions.
5 changes: 3 additions & 2 deletions .github/workflows/rust.yml
@@ -16,7 +16,8 @@ env:
jobs:
build-test:

runs-on: ubuntu-latest
# ubuntu-24.04 is the earliest image with a `virtiofsd` package.
runs-on: ubuntu-24.04

steps:
- uses: actions/checkout@v3
@@ -36,7 +37,7 @@ jobs:
run: |
sudo apt-get update
# Virtualization deps
sudo apt-get install -y qemu-system-x86-64 qemu-guest-agent qemu-utils ovmf
sudo apt-get install -y qemu-system-x86-64 qemu-guest-agent qemu-utils ovmf virtiofsd
- name: Cache test assets
uses: actions/cache@v3
2 changes: 1 addition & 1 deletion Cargo.toml
@@ -7,7 +7,7 @@ readme = "README.md"
version = "0.14.0"
edition = "2021"
license = "Apache-2.0"
rust-version = "1.70.0"
rust-version = "1.80.0"

[dependencies]
anyhow = "1.0.66"
10 changes: 5 additions & 5 deletions README.md
@@ -26,14 +26,15 @@ The following are required dependencies, grouped by location:

Host machine:

* [`qemu`](https://pkgs.org/download/qemu)
* [`qemu`](https://pkgs.org/download/qemu) (version 5.9 or higher)
* [`qemu-guest-agent`](https://pkgs.org/search/?q=qemu-guest-agent)
* [`OVMF`](https://pkgs.org/download/ovmf)
* [`virtiofsd`](https://gitlab.com/virtio-fs/virtiofsd)

Virtual machine image:

* `qemu-guest-agent`
* Kernel 9p filesystem support, either compiled in or as modules (see kernel
* Kernel `virtiofs` support, either compiled in or as modules (see kernel
dependencies)
* Most (if not all) distros already ship support as modules or better

@@ -42,9 +43,8 @@ Kernel:
* `CONFIG_VIRTIO=y`
* `CONFIG_VIRTIO_PCI=y`
* `CONFIG_VIRTIO_CONSOLE=y`
* `CONFIG_NET_9P=y`
* `CONFIG_NET_9P_VIRTIO=y`
* `CONFIG_9P_FS=y`
* `CONFIG_FUSE_FS=y`
* `CONFIG_VIRTIO_FS=y`

Note the virtual machine image dependencies are only required if you're using
the `image` target parameter. Likewise, the same applies for kernel
1 change: 1 addition & 0 deletions src/lib.rs
@@ -17,3 +17,4 @@ pub use crate::vmtest::*;
mod qemu;
mod qga;
mod util;
mod virtiofsd;
5 changes: 5 additions & 0 deletions src/output.rs
@@ -10,6 +10,11 @@ use anyhow::Result;
/// Receivers should treat failures as terminal and not expect any more
/// updates.
pub enum Output {
/// On-host initialization starts
InitializeStart,
/// Initialization finished with the provided result
InitializeEnd(Result<()>),

/// VM boot begins
BootStart,
/// Output related to VM boot
130 changes: 77 additions & 53 deletions src/qemu.rs
@@ -27,18 +27,17 @@ use tinytemplate::{format_unescaped, TinyTemplate};
use crate::output::Output;
use crate::qga::QgaWrapper;
use crate::util::gen_sock;
use crate::virtiofsd::Virtiofsd;
use crate::{Mount, Target, VMConfig};

const INIT_TEMPLATE: &str = include_str!("init/init.sh.template");
const COMMAND_TEMPLATE: &str = include_str!("init/command.template");
// Needs to be `/dev/root` for kernel to "find" the 9pfs as rootfs
const ROOTFS_9P_FS_MOUNT_TAG: &str = "/dev/root";
const SHARED_9P_FS_MOUNT_TAG: &str = "vmtest-shared";
const ROOT_FS_MOUNT_TAG: &str = "rootfs";
const SHARED_FS_MOUNT_TAG: &str = "vmtest-shared";
const COMMAND_OUTPUT_PORT_NAME: &str = "org.qemu.virtio_serial.0";
const MAGIC_INTERACTIVE_COMMAND: &str = "-";

const SHARED_9P_FS_MOUNT_PATH: &str = "/mnt/vmtest";
const MOUNT_OPTS_9P_FS: &str = "trans=virtio,cache=mmap,msize=1048576";
const SHARED_FS_MOUNT_PATH: &str = "/mnt/vmtest";
const OVMF_PATHS: &[&str] = &[
// Fedora
"/usr/share/edk2/ovmf/OVMF_CODE.fd",
@@ -55,6 +54,8 @@ type QmpUnixStream = qapi::Stream<BufReader<UnixStream>, UnixStream>;
/// Represents a single QEMU instance
pub struct Qemu {
process: Command,
/// `virtiofsd` instances for each of the mounts in use.
virtiofsds: Vec<Virtiofsd>,
qga_sock: PathBuf,
qmp_sock: PathBuf,
command: String,
Expand Down Expand Up @@ -241,6 +242,18 @@ fn guest_agent_args(sock: &Path) -> Vec<OsString> {
args
}

/// Generate general arguments necessary for working with `virtiofs`.
fn virtiofs_general_args(vm: &VMConfig) -> Vec<OsString> {
let mut args: Vec<OsString> = Vec::new();

args.push("-object".into());
args.push(format!("memory-backend-memfd,id=mem,share=on,size={}", vm.memory.as_str()).into());
args.push("-numa".into());
args.push("node,memdev=mem".into());

args
}

/// Generate arguments for full KVM virtualization if host supports it
fn kvm_args(arch: &str) -> Vec<&'static str> {
let mut args = Vec::new();
Expand Down Expand Up @@ -291,30 +304,17 @@ fn machine_protocol_args(sock: &Path) -> Vec<OsString> {
args
}

/// Generate arguments for setting up 9p FS server on host
/// Generate per-file-system arguments necessary for working with `virtiofs`.
///
/// `id` is the ID for the FS export (currently unused AFAICT)
/// `id` is the ID for the FS export
/// `mount_tag` is used inside guest to find the export
fn plan9_fs_args(host_shared: &Path, id: &str, mount_tag: &str, ro: bool) -> Vec<OsString> {
fn virtiofs_per_fs_args(virtiofsd: &Virtiofsd, id: &str, mount_tag: &str) -> Vec<OsString> {
let mut args: Vec<OsString> = Vec::new();

args.push("-virtfs".into());

let mut arg = OsString::new();
arg.push(format!("local,id={id},path="));
arg.push(if host_shared.as_os_str().is_empty() {
// This case occurs when the config file path is just "vmtest.toml"
Path::new(".")
} else {
host_shared
});
arg.push(format!(
",mount_tag={mount_tag},security_model=none,multidevs=remap"
));
if ro {
arg.push(",readonly=on")
}
args.push(arg);
args.push("-chardev".into());
args.push(format!("socket,id={id},path={}", virtiofsd.socket_path().display()).into());
args.push("-device".into());
args.push(format!("vhost-user-fs-pci,queue-size=1024,chardev={id},tag={mount_tag}").into());

args
}
Expand Down Expand Up @@ -371,9 +371,9 @@ fn kernel_args(
// The guest kernel command line args
let mut cmdline: Vec<OsString> = Vec::new();

// Tell kernel the rootfs is 9p
cmdline.push("rootfstype=9p".into());
cmdline.push(format!("rootflags={}", MOUNT_OPTS_9P_FS).into());
// Tell kernel the rootfs is on a virtiofs and what "tag" it uses.
cmdline.push("rootfstype=virtiofs".into());
cmdline.push(format!("root={ROOT_FS_MOUNT_TAG}").into());

// Mount rootfs readable/writable to make experience more smooth.
// Lots of tools expect to be able to write logs or change global
Expand Down Expand Up @@ -455,16 +455,6 @@ fn vmconfig_args(vm: &VMConfig) -> Vec<OsString> {
vm.memory.clone().into(),
];

for mount in vm.mounts.values() {
let name = format!("mount{}", hash(&mount.host_path));
args.append(&mut plan9_fs_args(
&mount.host_path,
&name,
&name,
!mount.writable,
));
}

let mut extra_args = vm
.extra_args
.clone()
Expand Down Expand Up @@ -650,6 +640,7 @@ impl Qemu {
let command_sock = gen_sock("cmdout");
let (init, guest_init) = gen_init(&target.rootfs).context("Failed to generate init")?;

let mut virtiofsds = Vec::new();
let mut c = Command::new(format!("qemu-system-{}", target.arch));

c.args(QEMU_DEFAULT_ARGS)
@@ -660,6 +651,7 @@
.args(machine_args(&target.arch))
.args(machine_protocol_args(&qmp_sock))
.args(guest_agent_args(&qga_sock))
.args(virtiofs_general_args(&target.vm))
.args(virtio_serial_args(&command_sock));
// Always ensure the rootfs is first.
if let Some(image) = &target.image {
@@ -668,28 +660,42 @@
c.args(uefi_firmware_args(target.vm.bios.as_deref()));
}
} else if let Some(kernel) = &target.kernel {
c.args(plan9_fs_args(
target.rootfs.as_path(),
let virtiofsd = Virtiofsd::new(target.rootfs.as_path())?;
c.args(virtiofs_per_fs_args(
&virtiofsd,
"root",
ROOTFS_9P_FS_MOUNT_TAG,
false,
ROOT_FS_MOUNT_TAG,
));
c.args(kernel_args(
kernel,
&target.arch,
guest_init.as_path(),
target.kernel_args.as_ref(),
));
virtiofsds.push(virtiofsd);
} else {
panic!("Config validation should've enforced XOR");
}

// Now add the shared mount and other extra mounts.
c.args(plan9_fs_args(
host_shared,
let virtiofsd = Virtiofsd::new(host_shared)?;
c.args(virtiofs_per_fs_args(
&virtiofsd,
"shared",
SHARED_9P_FS_MOUNT_TAG,
false,
SHARED_FS_MOUNT_TAG,
));
virtiofsds.push(virtiofsd);

for mount in target.vm.mounts.values() {
let name = format!("mount{}", hash(&mount.host_path));
let virtiofsd = Virtiofsd::new(&mount.host_path)?;
c.args(virtiofs_per_fs_args(
&virtiofsd,
&name,
&name,
));
virtiofsds.push(virtiofsd);
}
c.args(vmconfig_args(&target.vm));

if log_enabled!(Level::Error) {
Expand All @@ -706,6 +712,7 @@ impl Qemu {

let mut qemu = Self {
process: c,
virtiofsds,
qga_sock,
qmp_sock,
command: target.command.to_string(),
Expand Down Expand Up @@ -838,16 +845,18 @@ impl Qemu {
// We can race with VM/qemu coming up. So retry a few times with growing backoff.
let mut rc = 0;
for i in 0..5 {
let mount_opts = if ro {
format!("{},ro", MOUNT_OPTS_9P_FS)
} else {
MOUNT_OPTS_9P_FS.into()
};
let mut args = vec![
"-t", "virtiofs", mount_tag, guest_path
];
if ro {
args.push("-oro")
}

rc = run_in_vm(
qga,
&output_fn,
"mount",
&["-t", "9p", "-o", &mount_opts, mount_tag, guest_path],
&args,
false,
None,
)?;
Expand Down Expand Up @@ -1052,7 +1061,7 @@ impl Qemu {
// Mount shared directory inside guest
let _ = self.updates.send(Output::SetupStart);
if let Err(e) =
self.mount_in_guest(qga, SHARED_9P_FS_MOUNT_PATH, SHARED_9P_FS_MOUNT_TAG, false)
self.mount_in_guest(qga, SHARED_FS_MOUNT_PATH, SHARED_FS_MOUNT_TAG, false)
{
return Err(e).context("Failed to mount shared directory in guest");
}
@@ -1075,6 +1084,21 @@
/// Errors and return status are reported through the `updates` channel passed into the
/// constructor.
pub fn run(mut self) {
let _ = self.updates.send(Output::InitializeStart);
for phase_fn in [Virtiofsd::launch, Virtiofsd::await_launched] {
for virtiofsd in self.virtiofsds.iter_mut() {
match phase_fn(virtiofsd) {
Ok(()) => (),
Err(e) => {
let _ = self.updates.send(Output::InitializeEnd(Err(e)));
return;
}
}
}
}

let _ = self.updates.send(Output::InitializeEnd(Ok(())));

// Start QEMU
let (mut child, qga, mut qmp) = match self.boot_vm() {
Ok((c, qga, qmp)) => (c, qga, qmp),
10 changes: 10 additions & 0 deletions src/ui.rs
@@ -177,6 +177,16 @@ impl Ui {
};

match &msg {
Output::InitializeStart => {
stage = Stage::new(term.clone(), &heading("Initializing host environment", 2), Some(stage));
stages += 1;
}
Output::InitializeEnd(r) => {
if let Err(e) = r {
error_out_stage(&mut stage, e);
errors += 1;
}
}
Output::BootStart => {
stage = Stage::new(term.clone(), &heading("Booting", 2), Some(stage));
stages += 1;
