On Thu, Apr 30, 2015 at 01:20:30PM +0200, Michal Privoznik wrote:
On 28.04.2015 00:04, Cole Robinson wrote:
> Hi all,
>
> libvirt's selinux/DAC labeling behavior has been a repeated source of
> frustration for desktop virt users. Granted much of the frustration comes from
> the fact that virt-manager runs as $user, talks to libvirtd running as root,
> which launches VMs running as qemu, and we are mixing it all with selinux :)
>
> But the flip side is that I've lost count of how many times I've heard
someone
> complain 'why is libvirt chowning/relabeling my .iso? it's world
readable'
>
> I spent some time looking at bug reports and looking at the relevant libvirt
> code, and I have a couple ideas for reducing user complaints in this area. The
> patches should be pretty simple but I haven't written them yet since I'd
like
> other opinions on this stuff, given it could impact security. Here's my two
> suggestions:
>
Ideally, we should check if qemu process running under configured
seclabels will have access to all the resources and relabel only those,
which are inaccessible due to wrong/missing seclabels. However, this
type of operation is much more complex than one would think (if even
possible). Therefore we went with (rather unconditional) relabeling of
all resources.
Yeah, we've historically pretty much assumed that a VM only has ability
to access svirt_t / virt_content_t labels, and hence always explicitly
relabelled everything.
Over time the SELinux policy grew the ability to allow a VM to access
a handful of other things read-only, but we've never figured out a good
story around how mgmt applications should deal with this.
We finally added details of the default labels to the capabilities XML.
Perhaps one idea is to extend the capabilities to include a list of
supplementary labels to which access is permitted ? The other option
is to punt that knowledge to apps, but that seems rather bad.
Or instead of relabel=yes|no, we could have relabel=optional, which
would tell libvirt that the file should be relabelled, if-and-only-if
access would otherwise be denied. That way virt-manager could use
relabel=optional on any thing like CDROM ISOs, and libvirtd would
figure out if its current label needed changing or could be left
alone.
> 1) Don't attempt to change selinux label for
readonly/shareable images, if the
> image already has a label that svirt_t can access.
>
> Background: svirt is about giving r/w disk images a unique selinux label tied
> to one running VM, so that if other VMs are compromised, they can't access
> another VMs disk image. Maintaining this guarantee is important, which is why
> libvirt will raise an explicit error if attempting to change the disk image
> label errors with EPERM and selinux==Enforcing. Makes sense.
>
> The problem is with readonly media. We don't give a unique label to RO media,
> we just give it plain virt_content_t. RO media labeling isn't about VM
> isolation, it's about ensuring the svirt VM can even access the media.
> However, if labeling fails with EPERM, we still fail in the same way mentioned
> above.
>
> Here's a situation where that is problematic:
>
> - Running as qemu:///session (unprivileged)
> - Point a VM at /path/to/my/isos/foo.iso
> - foo.iso is: owner=root, mode=644, label=public_content_t. We
> are sharing this media for qemu:///system usage as well
>
> Without any libvirt label changing, the VM should be able to access that ISO:
> permissions are fine, and svirt_t can access public_content_t. However the
> unprivileged user does not have permissions to change the label to
> virt_content_t, libvirt fails to set the label, raises the error, VM startup
> fails.
>
> I suggest that in the case of readonly (and maybe shareable which uses the
> plain svirt_image_t label) disk, we first check to see if svirt_t can access
> the existing label, and if so, don't do any explicit labeling. This makes
> things 'just work' in the above example, and doesn't seem to have a
> detrimental effect on security. I don't know exactly how to do this with the
> selinux API but I assume there's something that reproduces what sesearch does
> on the CLI.
>
> Another case this helps: qemu:///system VM wants to use /dev/sr0, changes the
> label to virt_content_t, which can generate selinux violations from udev when
> media is ejected/inserted and the VM is running (I don't remember specifics
> here but we've had bug reports in the past). Instead it just leaves /dev/sr0
> as removable_device_t, which svirt_t can already read from.
>
> (Libvirt already does something along these lines: If the labeling fails, it
> checks to see if the disk image _already_ has the label we just requested, and
> if so, it's not an error condition. But this suggestion goes a step further
> than that)
>
>
> 2) DAC shouldn't change owner if qemu will be able to access the disk.
Correct. Therefore, when I tried to fix this behaviour in the past I
came up with ACLs. Unfortunately, it wasn't accepted because not all
filesystems out there support ACLs. My argument was, that on those
systems/kernels (e.g. *BSD) we can fallback to chown(), but the
conversation went stale. Maybe we can resurrect that and implement ACLs
correctly. This way we don't even need to bother with R/O disks since
one file can have multiple ACL entries.
I agree that ACLs are really the way forward here. I'm fine with it
falling back to chown() on filesystems which lack support. I thought
our patches stalled because of the issues around tracking ref-count
for the ACLs on shared filesystems like NFS ?
On the other hand, each disk can have
'relabel="no"' attribute and we
won't touch that disk at all. Maybe we can advertise this more so users
stop complaining? Or even more, if dynamicOwnership is turned off, we
don't relabel anything and let users set the correct labels themselves.
We could use the same relabel=optional trick for DAC too, so if the file
was world readable, that would tell libvirtd to skip relabelling of it.
> Background: My understanding of the DAC driver is it's
_just_ to facilitate
> running qemu as non-root; we chown the image to ensure the emulator has
> permissions to access it so that VM startup doesn't fail. So I suggest that if
> libvirt determines the disk is already accessible by the qemu user, we skip
> the chown step.
>
> Coupled with the above selinux erroring mentioned above, unnecessary chowning
> can cause issues when sharing media between qemu:///system and
> qemu:///session; once the file is owned by qemu, qemu:///session can't change
> the selinux label ever again, and labeling failures may mean the VM fails to
> start.
>
> Thankfully we have libvirt infrastructure for checking whether a file can be
> read by arbitrary uid:gid: virStorageFileInitAs and friends. We already use
> that to check the file is searchable at qemu startup time, via:
> qemuProcessState->
> qemuDomainCheckDiskPresence->
> qemuDomainDetermineDiskChain->
> virStorageFileGetMetadata->
> virStorageFileGetMetadataRecurse
>
> So I suggest we add a check along those lines in security_dac.c, and if the
> file is already accessible for the disk's needs, we skip the chown'ing.
This works only for DAC, not for SELinux. And I don't know anything
about AppArmomr. We have qemuFileOpenAs() which seems like copying the
functionality. Maybe it can be merged together?
IIUC, AppArmour doesn't touch anything on the file at all. It generates a
custom AppArmour configuration file for each VM that is booted, listing
all the things that VM needs to access. So it wouldn't really have this
kind of problem I believe.
Regards,
Daniel
--
|:
http://berrange.com -o-
http://www.flickr.com/photos/dberrange/ :|
|:
http://libvirt.org -o-
http://virt-manager.org :|
|:
http://autobuild.org -o-
http://search.cpan.org/~danberr/ :|
|:
http://entangle-photo.org -o-
http://live.gnome.org/gtk-vnc :|