Re: [libvirt] RFC: a couple ideas regarding selinux/DAC labeling

30 Apr 2015

      On Thu, Apr 30, 2015 at 01:20:30PM +0200, Michal Privoznik wrote:
...
On 28.04.2015 00:04, Cole Robinson wrote:
...
Hi all,
libvirt's selinux/DAC labeling behavior has been a repeated source of
frustration for desktop virt users. Granted much of the frustration comes from
the fact that virt-manager runs as $user, talks to libvirtd running as root,
which launches VMs running as qemu, and we are mixing it all with selinux :)
But the flip side is that I've lost count of how many times I've heard someone
complain 'why is libvirt chowning/relabeling my .iso? it's world readable'
I spent some time looking at bug reports and looking at the relevant libvirt
code, and I have a couple ideas for reducing user complaints in this area. The
patches should be pretty simple but I haven't written them yet since I'd like
other opinions on this stuff, given it could impact security. Here's my two
suggestions:
Ideally, we should check if qemu process running under configured
seclabels will have access to all the resources and relabel only those,
which are inaccessible due to wrong/missing seclabels. However, this
type of operation is much more complex than one would think (if even
possible). Therefore we went with (rather unconditional) relabeling of
all resources.
Yeah, we've historically pretty much assumed that a VM only has ability
to access svirt_t / virt_content_t labels, and hence always explicitly
relabelled everything.

Over time the SELinux policy grew the ability to allow a VM to access
a handful of other things read-only, but we've never figured out a good
story around how mgmt applications should deal with this.

We finally added details of the default labels to the capabilities XML.
Perhaps one idea is to extend the capabilities to include a list of
supplementary labels to which access is permitted ? The other option
is to punt that knowledge to apps, but that seems rather bad.

Or instead of relabel=yes|no, we could have relabel=optional, which
would tell libvirt that the file should be relabelled, if-and-only-if
access would otherwise be denied.  That way virt-manager could use
relabel=optional on any thing like CDROM ISOs, and libvirtd would
figure out if its current label needed changing or could be left
alone.
...
...
1) Don't attempt to change selinux label for readonly/shareable images, if the
image already has a label that svirt_t can access.
Background: svirt is about giving r/w disk images a unique selinux label tied
to one running VM, so that if other VMs are compromised, they can't access
another VMs disk image. Maintaining this guarantee is important, which is why
libvirt will raise an explicit error if attempting to change the disk image
label errors with EPERM and selinux==Enforcing. Makes sense.
The problem is with readonly media. We don't give a unique label to RO media,
we just give it plain virt_content_t. RO media labeling isn't about VM
isolation, it's about ensuring the svirt VM can even access the media.
However, if labeling fails with EPERM, we still fail in the same way mentioned
above.
Here's a situation where that is problematic:
- Running as qemu:///session (unprivileged)
- Point a VM at /path/to/my/isos/foo.iso
- foo.iso is: owner=root, mode=644, label=public_content_t. We
  are sharing this media for qemu:///system usage as well
Without any libvirt label changing, the VM should be able to access that ISO:
permissions are fine, and svirt_t can access public_content_t. However the
unprivileged user does not have permissions to change the label to
virt_content_t, libvirt fails to set the label, raises the error, VM startup
fails.
I suggest that in the case of readonly (and maybe shareable which uses the
plain svirt_image_t label) disk, we first check to see if svirt_t can access
the existing label, and if so, don't do any explicit labeling. This makes
things 'just work' in the above example, and doesn't seem to have a
detrimental effect on security. I don't know exactly how to do this with the
selinux API but I assume there's something that reproduces what sesearch does
on the CLI.
Another case this helps: qemu:///system VM wants to use /dev/sr0, changes the
label to virt_content_t, which can generate selinux violations from udev when
media is ejected/inserted and the VM is running (I don't remember specifics
here but we've had bug reports in the past). Instead it just leaves /dev/sr0
as removable_device_t, which svirt_t can already read from.
(Libvirt already does something along these lines: If the labeling fails, it
checks to see if the disk image _already_ has the label we just requested, and
if so, it's not an error condition. But this suggestion goes a step further
than that)
2) DAC shouldn't change owner if qemu will be able to access the disk.
Correct. Therefore, when I tried to fix this behaviour in the past I
came up with ACLs. Unfortunately, it wasn't accepted because not all
filesystems out there support ACLs. My argument was, that on those
systems/kernels (e.g. *BSD) we can fallback to chown(), but the
conversation went stale. Maybe we can resurrect that and implement ACLs
correctly. This way we don't even need to bother with R/O disks since
one file can have multiple ACL entries.
I agree that ACLs are really the way forward here. I'm fine with it
falling back to chown() on filesystems which lack support. I thought
our patches stalled because of the issues around tracking ref-count
for the ACLs on shared filesystems like NFS ?
...
On the other hand, each disk can have 'relabel="no"' attribute and we
won't touch that disk at all. Maybe we can advertise this more so users
stop complaining? Or even more, if dynamicOwnership is turned off, we
don't relabel anything and let users set the correct labels themselves.
We could use the same relabel=optional trick for DAC too, so if the file
was world readable, that would tell libvirtd to skip relabelling of it.
...
...
Background: My understanding of the DAC driver is it's _just_ to facilitate
running qemu as non-root; we chown the image to ensure the emulator has
permissions to access it so that VM startup doesn't fail. So I suggest that if
libvirt determines the disk is already accessible by the qemu user, we skip
the chown step.
Coupled with the above selinux erroring mentioned above, unnecessary chowning
can cause issues when sharing media between qemu:///system and
qemu:///session; once the file is owned by qemu, qemu:///session can't change
the selinux label ever again, and labeling failures may mean the VM fails to
start.
Thankfully we have libvirt infrastructure for checking whether a file can be
read by arbitrary uid:gid: virStorageFileInitAs and friends. We already use
that to check the file is searchable at qemu startup time, via:
    qemuProcessState->
      qemuDomainCheckDiskPresence->
        qemuDomainDetermineDiskChain->
          virStorageFileGetMetadata->
            virStorageFileGetMetadataRecurse
So I suggest we add a check along those lines in security_dac.c, and if the
file is already accessible for the disk's needs, we skip the chown'ing.
This works only for DAC, not for SELinux. And I don't know anything
about AppArmomr. We have qemuFileOpenAs() which seems like copying the
functionality. Maybe it can be merged together?
IIUC, AppArmour doesn't touch anything on the file at all. It generates a
custom AppArmour configuration file for each VM that is booted, listing
all the things that VM needs to access. So it wouldn't really have this
kind of problem I believe.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

Re: [libvirt] RFC: a couple ideas regarding selinux/DAC labeling

Daniel P. Berrange