Hi all,
libvirt's selinux/DAC labeling behavior has been a repeated source of
frustration for desktop virt users. Granted much of the frustration comes from
the fact that virt-manager runs as $user, talks to libvirtd running as root,
which launches VMs running as qemu, and we are mixing it all with selinux :)
But the flip side is that I've lost count of how many times I've heard someone
complain 'why is libvirt chowning/relabeling my .iso? it's world readable'
I spent some time looking at bug reports and looking at the relevant libvirt
code, and I have a couple ideas for reducing user complaints in this area. The
patches should be pretty simple but I haven't written them yet since I'd like
other opinions on this stuff, given it could impact security. Here's my two
suggestions:
1) Don't attempt to change selinux label for readonly/shareable images, if the
image already has a label that svirt_t can access.
Background: svirt is about giving r/w disk images a unique selinux label tied
to one running VM, so that if other VMs are compromised, they can't access
another VMs disk image. Maintaining this guarantee is important, which is why
libvirt will raise an explicit error if attempting to change the disk image
label errors with EPERM and selinux==Enforcing. Makes sense.
The problem is with readonly media. We don't give a unique label to RO media,
we just give it plain virt_content_t. RO media labeling isn't about VM
isolation, it's about ensuring the svirt VM can even access the media.
However, if labeling fails with EPERM, we still fail in the same way mentioned
above.
Here's a situation where that is problematic:
- Running as qemu:///session (unprivileged)
- Point a VM at /path/to/my/isos/foo.iso
- foo.iso is: owner=root, mode=644, label=public_content_t. We
are sharing this media for qemu:///system usage as well
Without any libvirt label changing, the VM should be able to access that ISO:
permissions are fine, and svirt_t can access public_content_t. However the
unprivileged user does not have permissions to change the label to
virt_content_t, libvirt fails to set the label, raises the error, VM startup
fails.
I suggest that in the case of readonly (and maybe shareable which uses the
plain svirt_image_t label) disk, we first check to see if svirt_t can access
the existing label, and if so, don't do any explicit labeling. This makes
things 'just work' in the above example, and doesn't seem to have a
detrimental effect on security. I don't know exactly how to do this with the
selinux API but I assume there's something that reproduces what sesearch does
on the CLI.
Another case this helps: qemu:///system VM wants to use /dev/sr0, changes the
label to virt_content_t, which can generate selinux violations from udev when
media is ejected/inserted and the VM is running (I don't remember specifics
here but we've had bug reports in the past). Instead it just leaves /dev/sr0
as removable_device_t, which svirt_t can already read from.
(Libvirt already does something along these lines: If the labeling fails, it
checks to see if the disk image _already_ has the label we just requested, and
if so, it's not an error condition. But this suggestion goes a step further
than that)
2) DAC shouldn't change owner if qemu will be able to access the disk.
Background: My understanding of the DAC driver is it's _just_ to facilitate
running qemu as non-root; we chown the image to ensure the emulator has
permissions to access it so that VM startup doesn't fail. So I suggest that if
libvirt determines the disk is already accessible by the qemu user, we skip
the chown step.
Coupled with the above selinux erroring mentioned above, unnecessary chowning
can cause issues when sharing media between qemu:///system and
qemu:///session; once the file is owned by qemu, qemu:///session can't change
the selinux label ever again, and labeling failures may mean the VM fails to
start.
Thankfully we have libvirt infrastructure for checking whether a file can be
read by arbitrary uid:gid: virStorageFileInitAs and friends. We already use
that to check the file is searchable at qemu startup time, via:
qemuProcessState->
qemuDomainCheckDiskPresence->
qemuDomainDetermineDiskChain->
virStorageFileGetMetadata->
virStorageFileGetMetadataRecurse
So I suggest we add a check along those lines in security_dac.c, and if the
file is already accessible for the disk's needs, we skip the chown'ing.
Thoughts/comments appreciated.
Thanks,
Cole