On Fri, Nov 04, 2022 at 02:56:51PM -0400, Andrea Bolognani wrote:
On Thu, Nov 03, 2022 at 05:23:27PM +0000, Daniel P. Berrangé wrote:
> On Thu, Nov 03, 2022 at 12:35:15PM -0400, Andrea Bolognani wrote:
> > On Thu, Nov 03, 2022 at 03:39:44PM +0100, Peter Krempa wrote:
> > > On Thu, Nov 03, 2022 at 12:13:53 +0100, Andrea Bolognani wrote:
> > > > Distros that use AppArmor, such as Debian and Ubuntu, install
> > > > QEMU under /usr/bin/qemu-system-*, and our AppArmor profile is
> > > > written with that assumption in mind.
> > > >
> > > > If you try to run the RHEL or CentOS version of libvirt and
> > > > QEMU inside a privileged container on such distros, however,
> > > > that will result in an error, because the path
> > > > /usr/libexec/qemu-kvm is used instead.
> > >
> > > So IIUC by this patch you modify the profile which gets installed into
> > > the Debian/Ubuntu host system by the Debian/Ubuntu package which then in
> > > turn allows the non-Debian/Ubuntu libvirt in the container to do its
> > > job?
> >
> > Pretty much.
> >
> > > I'm basing the above on the fact that the RHEL/CentOS package is
> > > compiled with:
> > >
> > > -Dapparmor=disabled \
> > > -Dapparmor_profiles=disabled \
> > > -Dsecdriver_apparmor=disabled \
> > >
> > > By extension, does that mean that you have to install libvirt on your
> > > host so that you can in turn run a container (which I'd presume is
> > > opaque) with libvirt bundled inside?
> >
> > It's actually the other way around :)
> >
> > If you don't have libvirt installed on the Debian/Ubuntu host, then
> > the AppArmor profile won't be present and the containerized CentOS
> > libvirt will be allowed to start the containerized CentOS QEMU.
> >
> > If you *do* have libvirt installed on the Debian/Ubuntu host, then
> > the AppArmor profile will also be applied to the containerized CentOS
> > libvirt and running the containerized CentOS QEMU will be forbidden.
> >
> > Patching the AppArmor policy is supposed to help with the second
> > scenario.
>
> I don't see how this can work properly.
>
> If running with AppArmor, I would expect libvirtd itself needs to be
> built with AppArmor, so that when launching a VM it can spawn
> virt-aa-helper to create the per-VM customized profile. The CentOS
> based libvirt running inside the container will be built without
> virt-aa-helper, so won't load this.
>
> I would rather expect that AppArmor does not attempt to control
> any processes inside the containers, other than with a generic
> 'docker' AppArmor profile. It makes no sense for profiles from
> the host OS install to apply to stuff in containers, as we can't
> assume the host + container installs are the same versions. Are
> you sure the KubeVirt problem isn't simply a mis-configuration
> of the host environment allowing AppArmor to leak inside the
> container?

IIUC a specific profile (cri-containerd.apparmor.d) is used for
unprivileged containers such as virt-launcher, but a privileged one
such as virt-handler falls under the same profile as the host.

This makes some amount of sense to me: unprivileged containers are
already limited in what they can do by the usual restrictions on user
processes. Privileged containers, on the other hand, are effectively
root processes, so it's advisable to be significantly more cautious
with them.

I still consider that situation to be broken by design. If the
privileged container is running a completely different software
stack from the host OS, using the host OS AppArmor profile
to confine the container binary is never going to be a reliable
setup. Either the privileged container has to run without
confinement, or it needs to be confined using policy provided
by the container (which is likely not viable anyway).

Note that this is just my current understanding of the situation, and
I'm far from an expert when it comes to containers in general and
their interactions with AppArmor in particular. I recommend taking a
look at

  https://github.com/kubevirt/kubevirt/pull/8692

and the issues linked therein, which will provide more context coming
from people who actually know what they're talking about :)

I did read that and it didn't give me any more confidence that
this setup is sensible.

Now that I've typed all of the above, I wonder if the problem
wouldn't be better solved by making sure that KubeVirt runs the
libvirtd instance used to figure out node capabilities in an
unprivileged container? Maybe there's something that prevents them
from doing so.

I'll bring up the idea and see what they think of it.

With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|