libvirt-shim: libvirt to run qemu in a different container

Sunday, 26 March 2023

Hey all,

I'm Itamar Holder, a Kubevirt developer.
Lately we came across a problem w.r.t. properly supporting VMs with
dedicated CPUs on Kubernetes. The full details can be seen in this PR
<https://github.com/kubevirt/kubevirt/pull/8869> [1], but to make a very
long story short, we would like to use two different containers in the
virt-launcher pod that is responsible to run a VM:

   - "Managerial container": would be allocated with a shared cpuset. Would
   run all of the virtualization infrastructure, such as libvirtd and its
   dependencies.
   - "Emulator container": would be allocated with a dedicated cpuset.
   Would run the qemu process.

There are many reasons for choosing this design, but in short, the main
reasons are that it's impossible to allocate both shared and dedicated cpus
to a single container, and that it would allow finer-grained control and
isolation for the different containers.

Since there is no way to start the qemu process in a different container, I
tried to start qemu in the "managerial" container, then move it into the
"emulator" container. This fails however, since libvirt uses
sched_setaffinity to pin the vcpus into the dedicated cpuset, which is not
allocated to the managerial container, resulting in an EINVAL error.

Therefore, I thought about discussing a new approach - introducing a small
shim that could communicate with libvirtd in order to start and control the
qemu process that would run on a different container.

As I see it, the main workflow could be described as follows:

   - The emulator container would start with the shim.
   - libvirtd, running in the managerial container, would ask for some
   information from the target, e.g. cpuset.
   - libvirtd would create the domain xml and would transfer to the shim
   everything needed in order to launch the guest.
   - The shim, running in the emulator container, would run the
   qemu-process.

What do you think? Feedback is much appreciated.

Best Regards,
Itamar Holder.

[1] https://github.com/kubevirt/kubevirt/pull/8869

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005