On 3/10/20 4:42 PM, Andrea Bolognani wrote:
On Mon, 2020-03-09 at 18:04 +0000, Daniel P. Berrangé wrote:
> On Mon, Mar 09, 2020 at 06:09:13PM +0100, Andrea Bolognani wrote:
>> On Fri, 2020-03-06 at 17:49 +0000, Daniel P. Berrangé wrote:
>>> On Fri, Mar 06, 2020 at 06:24:15PM +0100, Andrea Bolognani wrote:
[...]
>> Aside: instead of a per-VM setting, would it make sense for this to
>> be a connection-level setting? That way, even on traditional libvirt
>> deployments, the hypervisor admin could e.g. opt out of machinectl
>> registration if they so desire; at the same time, you'd avoid the
>> situation where most VMs are confined using CGroups but a few are
>> not, which is probably not a desirable scenario.
>
> Yes, functionally it would work fine as a connection-level setting
> too, though this hides the behaviour from anyone looking at the
> guest config. We've previously punted quite a few things to the
> qemu.conf because we didn't want to go through the process of representing
> them in the domain XML. This was OK when the qemu.conf settings were
> something done once at host deployment time.
>
> With the embedded driver, I think this is not so desirable, as it
> means that, to get the configuration they want for a VM, they need
> to deal with two distinct config files. The ideal would be that
> everything that is commonly needed can be achieved solely in the
> domain XML, and I think the resource control backend is one such
> common tunable.
I don't have a strong opinion either way, and as far as my current
use is concerned it doesn't bother me to have to deal with a second
configuration file. The reason why I thought a per-VM setting might
not be desirable is that applications would then be able to
override it, and so maybe VMs created with virt-manager would be
registered against machinectl but VMs created using GNOME Boxes
would not, and if the sysadmin likes to use machinectl to get a
comprehensive view of the system they'd no longer be guaranteed
that. But if that's not the kind of scenario we think we should
prevent, then a per-VM setting is fine by me :)
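Just to make the two alternatives concrete: the per-VM variant would
presumably be some new tunable in the domain XML, say along the lines
of the entirely hypothetical attribute below (not actual libvirt
syntax; today the <resource> element only carries <partition>), while
the connection-level variant would be yet another knob in qemu.conf:

  <!-- hypothetical: opt out of machined registration for this VM -->
  <resource register='none'/>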
I still don't quite see the value in machinectl (maybe because I'm not
using systemd :-D), but anyway - it's a system-wide monitor of virtual
machines. Therefore it makes sense to register a domain started under
qemu:///embed there. I don't view embed mode as a way of starting VMs
secretly. It's a way of starting VMs privately and that's a different
thing. Other users might learn that my app is running a VM (plain 'ps'
would give it away), but they cannot tamper with it in any way, e.g.
change its XML.
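For instance (the root path and domain name below are just examples,
and the invocation is from memory):

  # the VM is started privately, under the app's own root directory...
  $ virsh -c 'qemu:///embed?root=/var/tmp/myapp' start demo
  # ...but the QEMU process itself is still visible to everybody
  $ ps -ef | grep qemu-system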
[...]
>> Right now we're already doing
>>
>> qemu-$domid-$domname
>>
>> where I'm not entirely sure how much $domid actually buys us.
>
> IIRC $domid was a hack because at one time we had problems with
> systemd not cleaning up the transient scope when QEMU was killed.
> This would prevent libvirt starting the guest again thereafter.
> I can't remember now if this was a bug we fixed in systemd or
> libvirt or both or neither.
It was introduced by bd773e74f0d1d1b9ebbfcaa645178316b4f2265c and the
commit message links to this bug:
https://bugs.freedesktop.org/show_bug.cgi?id=68370 which looks like
it was fixed in systemd.
I see! It would be neat if we could get rid of it, assuming of course
it's no longer needed on the platforms we target.
I don't think it's that simple. Machined imposes some limitations on
the name: it either has to be a FQDN or a simple name without any
dots. And according to our code the strlen() must be <= 64 (I don't
know whether that comes from machined or is just our own limitation).
Therefore, if you have two domains whose names would clash after our
processing, the domain ID guarantees that unique strings are passed
to machined.
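A minimal sketch of the collision problem, in made-up code rather
than what libvirt actually does (since dots are only allowed when the
whole name is a valid FQDN, assume a sanitizer that maps them to '-'):

  /* Not libvirt's actual code; just illustrates why the ID prefix
   * keeps the machine name unique after sanitization. */
  #include <stdio.h>

  #define MACHINE_NAME_MAX 64

  static void make_machine_name(char *out, size_t outlen,
                                int domid, const char *domname)
  {
      char safe[MACHINE_NAME_MAX + 1];
      size_t i;

      /* assume dots must be mapped away for non-FQDN names */
      for (i = 0; domname[i] && i < MACHINE_NAME_MAX; i++)
          safe[i] = (domname[i] == '.') ? '-' : domname[i];
      safe[i] = '\0';

      /* "qemu-$domid-$domname", kept within the 64 byte limit */
      snprintf(out, outlen, "qemu-%d-%.48s", domid, safe);
  }

  int main(void)
  {
      char a[MACHINE_NAME_MAX + 1], b[MACHINE_NAME_MAX + 1];

      /* without the ID these would both end up as "qemu-web-prod" */
      make_machine_name(a, sizeof(a), 1, "web.prod");
      make_machine_name(b, sizeof(b), 2, "web-prod");
      printf("%s\n%s\n", a, b);  /* qemu-1-web-prod / qemu-2-web-prod */
      return 0;
  }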
[...]
>>> Of course you can turn off virtlogd via qemu.conf
>>
>> That's what I'm doing right now and it works well enough, but I'm
>> afraid that requiring a bunch of setup will discourage developers
>> from using the embedded driver. We should aim for a reasonable out
>> of the box experience instead.
>
> Why do you need to turn it off though? It should already
> "do the right thing" as the log files should appear under a
> different location and not have any clash. Turning it off
> immediately creates a denial of service CVE in your application.
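For reference, the qemu.conf knob in question is stdio_handler;
setting it to "file" makes QEMU append to its log files directly
instead of going through virtlogd, which is exactly what reopens the
unbounded log file concern:

  # qemu.conf
  stdio_handler = "file"   # the default, "logd", routes through virtlogd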
I was getting the same SELinux denial that Michal reported a few
days back: virtlogd wants to verify it's being connected to by a
process running as root, but it's only allowed by the policy to
look into libvirtd's /proc/$pid for this information. So, for the
same reason virtqemud can't currently connect to virtlogd when
SELinux is in enforcing mode, neither can my qemu:///embed-using
application.
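(For anyone trying to reproduce this: with SELinux enforcing, the
denial should show up in the audit log, e.g. via

  $ ausearch -m AVC -ts recent
)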
Besides that, there is the fact that a lot of people, mainly those
coming from a containers background, are not happy with having extra
daemons running. I'm not saying they would prefer being hit by a DoS
to having virtlogd running :) but they really, really don't like
daemons :)
> None the less, as above I think we need common things to be
> controllable via the domain XML. So either we need to make a
> tunable there for use of logd or not, or we need to implement
> the FIFO idea to avoid the need for logd at all.
It seems like the FIFO idea (though I'll admit I don't fully
understand it) would be the best of both worlds.
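If I picture it correctly, the idea is that QEMU writes its output to
a named pipe and libvirt drains it in-process, enforcing the size cap
itself, so no separate daemon needs to hold the log file open. A
rough sketch of that mechanism, with every name and limit made up for
illustration and error handling omitted:

  /* Rough sketch only: drain a FIFO that the emulator writes to and
   * cap the log size in-process, instead of delegating to virtlogd. */
  #include <fcntl.h>
  #include <sys/stat.h>
  #include <unistd.h>

  int main(void)
  {
      const char *fifo_path = "/tmp/demo-vm.log.fifo";
      char buf[4096];
      ssize_t n;
      off_t written = 0, limit = 1024 * 1024;  /* 1 MiB cap */
      int fifo, log;

      mkfifo(fifo_path, 0600);
      /* blocks until the writer (QEMU, in the real design) opens it */
      fifo = open(fifo_path, O_RDONLY);
      log = open("/tmp/demo-vm.log", O_WRONLY | O_CREAT | O_APPEND, 0600);

      while ((n = read(fifo, buf, sizeof(buf))) > 0) {
          if (written + n > limit) {
              ftruncate(log, 0);  /* crude rollover; real code would rotate */
              written = 0;
          }
          written += write(log, buf, n);
      }

      close(fifo);
      close(log);
      unlink(fifo_path);
      return 0;
  }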
[...]
>>> If you don't want to use virtnetworkd, then you won't be
>>> creating such an <interface> config in the first place.
>>> The app will have the option to open an embedded secondary
>>> driver if desired. Some of the drivers won't really make
>>> sense as embedded things though, at least not without
>>> extra work, i.e. an embedded network or nwfilter driver has
>>> no value unless your app has moved into a new network
>>> namespace, as otherwise it will just fight with the
>>> global network driver.
>>
>> Again, I think our defaults for qemu:///embed should be consistent
>> and conservative: instead of having developers opt out of using
>> network:///system, they should have to opt in before global
>> resources are involved.
>
> They already opt in to using the network driver by virtue of
> configuring their guest to request its use. We don't need to
> make them opt in twice.
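(The opt-in being talked about is simply having something like this
in the guest config, which cannot be honoured without talking to the
network driver:

  <interface type='network'>
    <source network='default'/>
  </interface>
)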
For some applications that's definitely an option, but others like
virt-qemu-run accept a fairly arbitrary configuration and having to
figure out whether that would result in e.g. virtnetworkd being used,
and whether that's acceptable, is extra work as well as potential
duplication of logic that already exists in libvirt...
Then again, something like virt-qemu-run is probably expected to
expose basically the entire feature set of libvirt, as opposed to
more targeted applications which will use qemu:///embed internally
and only rely on a relatively small subset.
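With virt-qemu-run being, from memory, an invocation along these
lines, where the guest XML comes from the user and may pull in
arbitrary secondary drivers:

  $ virt-qemu-run --root=/var/tmp/myapp demo.xml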
>> If we don't do that, I'm afraid developers will lose trust in the
>> whole qemu:///embed idea. Speaking from my own experience, I was
>> certainly surprised when I accidentally realized my qemu:///embed
>> VMs were showing up in the output of machinectl, and now I'm kinda
>> wondering how many other ways the application I'm working on, for
>> which the use of libvirt is just an implementation detail, is poking
>> at the system without my knowledge...
>
> First off, this misunderstanding highlights a need for better
> documentation to set out what the embedded driver is and is not
> expected to do.
We definitely need to document this very clearly if we want
qemu:///embed to gain traction.
> At a high level the embedded QEMU driver
>
> - Isolated from any other instance of the QEMU driver
Yup.
> - Process context of app is inherited by QEMU (namespaces,
> cgroups, CPU pinning, security context, etc)
Optionally! The fact that libvirt can deal with these is a big
selling point in some scenarios.
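One quick way to see that inheritance in action, assuming a single
qemu-system process and with paths written from memory:

  # compare the app's context with the QEMU it spawned
  $ cat /proc/self/cgroup /proc/$(pgrep -f qemu-system)/cgroup
  $ ls -l /proc/self/ns /proc/$(pgrep -f qemu-system)/ns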
> - All other functionality is unchanged.
>
> Your comments above are equating two distinct scenarios, one which
> had a serious functional bug & violated the first two design goals,
> and one which does not cause any functional issues at all.
I'm not equating the two, just reporting a bunch of behaviors that
I ran into while trying to use qemu:///embed in my application and
that I found to be surprising, in an attempt to figure out which
ones are intentional and whether even those are necessarily
something that we want to keep around in their current form.
> There's no question that we must fix the machined problem.
I will try to post patches for this.
Michal