Re: qemu:///embed and isolation from global components

Monday, 9 March 2020

On Fri, 2020-03-06 at 17:49 +0000, Daniel P. Berrangé wrote:
...
 On Fri, Mar 06, 2020 at 06:24:15PM +0100, Andrea Bolognani wrote:
 >   * it does, however, show up in the output of 'machinectl', with
 >     class=vm and service=libvirt-qemu;

 This is bad. It is one of the gaps we need to deal with.

 A long while back I proposed a domain XML option for this:

   https://www.redhat.com/archives/libvir-list/2017-December/msg00729.html

     <resource register="none|direct|machined|systemd"/>

 The "none" case meant inherit the cgroups placement of the parent
 libvirtd (or now the app using the embedded driver), and would be
 a reasonable default for the embedded case.

Agreed. In my case I'll want to use "direct" instead, but "none"
would indeed be a sane default for the embedded driver.

Aside: instead of a per-VM setting, would it make sense for this to
be a connection-level setting? That way, even on traditional libvirt
deployments, the hypervisor admin could e.g. opt out of machinectl
registration if they so desire; at the same time, you'd avoid the
situation where most VMs are confined using cgroups but a few are
not, which is probably not a desirable scenario.

...
 For the other cases, we certainly need to do something to ensure
 uniqueness. This is possibly as simple as including a fixed
 prefix like "qemu-$PID" where $PID is the app embedding the
 driver. That said, we support closing the app, and re-starting
 using the same embedded driver directory, which would change
 PID.

Right now we're already doing

  qemu-$domid-$domname

where I'm not entirely sure how much $domid actually buys us.

I think machine ids need to be unique per host, not per service,
which is kind of a bummer because the obvious choice would be to
generate a service name based on the embedded root... Michal already
suggested another solution, perhaps that one is more viable.
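
As a rough illustration of the "name based on the embedded root" idea,
here is a minimal Python sketch. Everything in it is hypothetical: the
function name, the "qemu-embed" prefix and the digest length are my
assumptions, not what libvirt does. The point is only that a prefix
derived from the root directory stays the same across application
restarts, unlike one derived from $PID.

```python
import hashlib
import os


def embed_machine_name(root: str, domid: int, domname: str) -> str:
    """Hypothetical: build a machine name that is stable per embedded root."""
    # Normalize so "/tmp/app/" and "/tmp/app" yield the same prefix.
    canonical = os.path.normpath(os.path.abspath(root))
    # A short digest keeps the name compact; the truncation length is arbitrary.
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:8]
    return f"qemu-embed-{digest}-{domid}-{domname}"


if __name__ == "__main__":
    # Same domain id and name, different roots: the names differ.
    print(embed_machine_name("/tmp/app-a", 1, "guest"))
    print(embed_machine_name("/tmp/app-b", 1, "guest"))
```

Two applications using different roots would then get distinct names
even for identical $domid and $domname, which is the host-wide
uniqueness machined seems to want.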

Anyway, I think it's reasonable to expect that, when it comes to VMs
created via the embedded driver, the same way you'd not be able to
control them via virsh, you'd also not be able to do so via
machinectl, so I'm not too concerned about this once we flip the
default to "none" as discussed above.

...
 >   * virtlogd is automatically spawned, if not already active, to
 >     take care of the domain's log file.

 This is trickier. The use of virtlogd was to fix a security DoS
 where malicious QEMU can write to serial console, or trigger
 QEMU to write to stdout/err, such that it exhausts the host
 filesystem.  IIUC, virtlogd should still end up writing to
 the logfile path associated with the embedded driver root
 directory prefix, so there shouldn't be a filename clash
 unless I screwed up.

 Since introducing virtlogd, however, I did think of a different
 strategy, which would be to connect a FIFO to QEMU as the log
 file FD. The QEMU driver itself can own the other end of the FIFO
 and do rollover.
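
To make the FIFO idea concrete, a minimal Python sketch of the reading
side follows. It is an illustration under assumptions, not libvirt
code: the function name, the single ".0" backup and the chunk size are
made up, and an anonymous pipe stands in for the FIFO whose write end
would be handed to QEMU. Because the log is rotated once it would
exceed max_bytes, the writer cannot grow the file without bound.

```python
import os


def pump_with_rollover(read_fd: int, path: str, max_bytes: int,
                       chunk_size: int = 4096) -> None:
    """Copy data from read_fd into path, rotating before it exceeds max_bytes."""
    size = os.path.getsize(path) if os.path.exists(path) else 0
    out = open(path, "ab")
    try:
        while True:
            chunk = os.read(read_fd, chunk_size)
            if not chunk:  # the writer closed its end
                break
            if size > 0 and size + len(chunk) > max_bytes:
                out.close()
                os.replace(path, path + ".0")  # keep a single backup
                out = open(path, "ab")
                size = 0
            out.write(chunk)
            size += len(chunk)
    finally:
        out.close()


if __name__ == "__main__":
    r, w = os.pipe()
    os.write(w, b"hello from the guest\n")
    os.close(w)
    pump_with_rollover(r, "guest.log", max_bytes=1024)
    os.close(r)
```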

 Of course you can turn off virtlogd via qemu.conf

That's what I'm doing right now and it works well enough, but I'm
afraid that requiring a bunch of setup will discourage developers
from using the embedded driver. We should aim for a reasonable out
of the box experience instead.
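
For reference, the setup in question is small. A minimal Python sketch,
assuming the embedded driver reads its configuration from etc/qemu.conf
under the root directory passed in the URI (please verify the location
against your libvirt version); the helper name is made up. The
stdio_handler setting is the qemu.conf knob that selects between
virtlogd ("logd") and plain files ("file").

```python
from pathlib import Path


def disable_virtlogd(root: str) -> Path:
    """Hypothetical helper: write a qemu.conf that bypasses virtlogd."""
    # Assumed location of the embedded driver's configuration file.
    conf = Path(root) / "etc" / "qemu.conf"
    conf.parent.mkdir(parents=True, exist_ok=True)
    # Note: this overwrites any existing file at that path.
    # "file" makes QEMU write its log directly instead of going via virtlogd.
    conf.write_text('stdio_handler = "file"\n')
    return conf
```

Small as it is, it is still something every application would have to
do before opening the connection.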

...
 > The first one is expected, but the other two were a surprise, at
 > least to me. Basically, what I would expect is that qemu:///embed
 > would work in a completely isolated way, only interacting with
 > system-wide components when that's explicitly requested.

 The goal wasn't explicitly to avoid system-wide components,
 but it certainly was intended to avoid clashing resources.

 IOW, machined, virtlogd are both valid to use, as long as
 the resource clashes are avoided. We should definitely have
 a way to disable these too.

I'd argue that most users of the embedded driver would probably
prefer it didn't interact with system-wide components: if that
wasn't the case, they'd just use qemu:///system or qemu:///session
instead.

Having a way to turn off those behaviors would certainly be a step
in the right direction, but I think ultimately we want to be in a
situation where developers opt in rather than out of them.

...
 > In other words, I would expect virtlogd not to be spawned, and the
 > domain not to be registered with machinectl; at the same time, if
 > the domain configuration is such that it contains for example
 > 
 >   <interface type='network'>
 >     <source network='default'/>
 >     <model type='virtio'/>
 >   </interface>
 > 
 > then I would expect to see a failure unless a connection to
 > network:///system had been explicitly and pre-emptively opened, and
 > not the current behavior which apparently is to automatically open
 > that connection and spawn virtnetworkd as a result.

 The QEMU embedded driver is explicitly intended to be able to
 interact with other global secondary drivers.

 If you don't want to use virtnetworkd, then you won't be
 creating such an <interface> config in the first place.
 The app will have the option to open an embedded secondary
 driver if desired. Some of the drivers won't really make
 sense as embedded things though, at least not without
 extra work. I.e., an embedded network or nwfilter driver has
 no value unless your app has moved into a new network
 namespace, as otherwise it will just fight with the
 global network driver.

Again, I think our defaults for qemu:///embed should be consistent
and conservative: instead of having developers opt out of using
network:///system, they should have to opt in before global
resources are involved.

If we don't do that, I'm afraid developers will lose trust in the
whole qemu:///embed idea. Speaking from my own experience, I was
certainly surprised when I accidentally realized my qemu:///embed
VMs were showing up in the output of machinectl, and now I'm kinda
wondering how many other ways the application I'm working on, for
which the use of libvirt is just an implementation detail, is poking
at the system without my knowledge...

-- 
Andrea Bolognani / Red Hat / Virtualization
