On 04/30/2015 07:30 AM, Daniel P. Berrange wrote:
On Thu, Apr 30, 2015 at 01:20:30PM +0200, Michal Privoznik wrote:
> On 28.04.2015 00:04, Cole Robinson wrote:
>> Hi all,
>>
>> libvirt's selinux/DAC labeling behavior has been a repeated source of
>> frustration for desktop virt users. Granted much of the frustration comes from
>> the fact that virt-manager runs as $user, talks to libvirtd running as root,
>> which launches VMs running as qemu, and we are mixing it all with selinux :)
>>
>> But the flip side is that I've lost count of how many times I've heard
someone
>> complain 'why is libvirt chowning/relabeling my .iso? it's world
readable'
>>
>> I spent some time looking at bug reports and looking at the relevant libvirt
>> code, and I have a couple ideas for reducing user complaints in this area. The
>> patches should be pretty simple but I haven't written them yet since I'd
like
>> other opinions on this stuff, given it could impact security. Here's my two
>> suggestions:
>>
>
> Ideally, we should check if qemu process running under configured
> seclabels will have access to all the resources and relabel only those,
> which are inaccessible due to wrong/missing seclabels. However, this
> type of operation is much more complex than one would think (if even
> possible). Therefore we went with (rather unconditional) relabeling of
> all resources.
This seems to give some output, so the check is possible at least:
sesearch --allow --source svirt_t --target removable_device_t
Which uses libqpol from setools-libs. Example code:
https://searchcode.com/codesearch/view/19370387/
https://searchcode.com/codesearch/view/19676526/
I asked dwalsh about this on IRC years ago and he said it was doable via the
API but we didn't talk details.
sesearch seems to dump raw policy output though, maybe that's where difficulty
stems from.
Yeah, we've historically pretty much assumed that a VM only has
ability
to access svirt_t / virt_content_t labels, and hence always explicitly
relabelled everything.
Over time the SELinux policy grew the ability to allow a VM to access
a handful of other things read-only, but we've never figured out a good
story around how mgmt applications should deal with this.
We finally added details of the default labels to the capabilities XML.
Perhaps one idea is to extend the capabilities to include a list of
supplementary labels to which access is permitted ? The other option
is to punt that knowledge to apps, but that seems rather bad.
Or instead of relabel=yes|no, we could have relabel=optional, which
would tell libvirt that the file should be relabelled, if-and-only-if
access would otherwise be denied. That way virt-manager could use
relabel=optional on any thing like CDROM ISOs, and libvirtd would
figure out if its current label needed changing or could be left
alone.
I'd rather we switch the default to relabel=optional and instead add a
relabel=force option for people that reallly want to use virt_content_t and
friends. I figure if selinux guys let svirt_t access a host label like
removable_device_t that should be good enough for the default
To clarify, WRT selinux this label skipping would only apply to RO or
shareable media. RW disks that get per-VM labels would never have their
selinux labeling skipped if the disk was already labeled public_content_t for
example, since that breaks the svirt guarantee of isolation from other VMs.
>> 1) Don't attempt to change selinux label for
readonly/shareable images, if the
>> image already has a label that svirt_t can access.
>>
>> Background: svirt is about giving r/w disk images a unique selinux label tied
>> to one running VM, so that if other VMs are compromised, they can't access
>> another VMs disk image. Maintaining this guarantee is important, which is why
>> libvirt will raise an explicit error if attempting to change the disk image
>> label errors with EPERM and selinux==Enforcing. Makes sense.
>>
>> The problem is with readonly media. We don't give a unique label to RO
media,
>> we just give it plain virt_content_t. RO media labeling isn't about VM
>> isolation, it's about ensuring the svirt VM can even access the media.
>> However, if labeling fails with EPERM, we still fail in the same way mentioned
>> above.
>>
>> Here's a situation where that is problematic:
>>
>> - Running as qemu:///session (unprivileged)
>> - Point a VM at /path/to/my/isos/foo.iso
>> - foo.iso is: owner=root, mode=644, label=public_content_t. We
>> are sharing this media for qemu:///system usage as well
>>
>> Without any libvirt label changing, the VM should be able to access that ISO:
>> permissions are fine, and svirt_t can access public_content_t. However the
>> unprivileged user does not have permissions to change the label to
>> virt_content_t, libvirt fails to set the label, raises the error, VM startup
>> fails.
>>
>> I suggest that in the case of readonly (and maybe shareable which uses the
>> plain svirt_image_t label) disk, we first check to see if svirt_t can access
>> the existing label, and if so, don't do any explicit labeling. This makes
>> things 'just work' in the above example, and doesn't seem to have a
>> detrimental effect on security. I don't know exactly how to do this with the
>> selinux API but I assume there's something that reproduces what sesearch
does
>> on the CLI.
>>
>> Another case this helps: qemu:///system VM wants to use /dev/sr0, changes the
>> label to virt_content_t, which can generate selinux violations from udev when
>> media is ejected/inserted and the VM is running (I don't remember specifics
>> here but we've had bug reports in the past). Instead it just leaves /dev/sr0
>> as removable_device_t, which svirt_t can already read from.
>>
>> (Libvirt already does something along these lines: If the labeling fails, it
>> checks to see if the disk image _already_ has the label we just requested, and
>> if so, it's not an error condition. But this suggestion goes a step further
>> than that)
>>
>>
>> 2) DAC shouldn't change owner if qemu will be able to access the disk.
>
> Correct. Therefore, when I tried to fix this behaviour in the past I
> came up with ACLs. Unfortunately, it wasn't accepted because not all
> filesystems out there support ACLs. My argument was, that on those
> systems/kernels (e.g. *BSD) we can fallback to chown(), but the
> conversation went stale. Maybe we can resurrect that and implement ACLs
> correctly. This way we don't even need to bother with R/O disks since
> one file can have multiple ACL entries.
FWIW I like the ACL idea too... especially if it sets user:qemu:x on the
directory chain so that we know the disk will actually be accessible :)
I agree that ACLs are really the way forward here. I'm fine with
it
falling back to chown() on filesystems which lack support. I thought
our patches stalled because of the issues around tracking ref-count
for the ACLs on shared filesystems like NFS ?
> On the other hand, each disk can have 'relabel="no"' attribute and
we
> won't touch that disk at all. Maybe we can advertise this more so users
> stop complaining? Or even more, if dynamicOwnership is turned off, we
> don't relabel anything and let users set the correct labels themselves.
We could use the same relabel=optional trick for DAC too, so if the file
was world readable, that would tell libvirtd to skip relabelling of it.
See below
>> Background: My understanding of the DAC driver is it's
_just_ to facilitate
>> running qemu as non-root; we chown the image to ensure the emulator has
>> permissions to access it so that VM startup doesn't fail. So I suggest that
if
>> libvirt determines the disk is already accessible by the qemu user, we skip
>> the chown step.
>>
If this initial premise is correct, that DAC is only about ensuring qemu can
run non-root $user:$group, then the mandate should be 'do the bare minimum to
make qemu run'. So I think the proposed behavior/relabel=optional should be
the default, apps/users shouldn't have to opt in.
>> Coupled with the above selinux erroring mentioned above,
unnecessary chowning
>> can cause issues when sharing media between qemu:///system and
>> qemu:///session; once the file is owned by qemu, qemu:///session can't
change
>> the selinux label ever again, and labeling failures may mean the VM fails to
>> start.
>>
>> Thankfully we have libvirt infrastructure for checking whether a file can be
>> read by arbitrary uid:gid: virStorageFileInitAs and friends. We already use
>> that to check the file is searchable at qemu startup time, via:
>> qemuProcessState->
>> qemuDomainCheckDiskPresence->
>> qemuDomainDetermineDiskChain->
>> virStorageFileGetMetadata->
>> virStorageFileGetMetadataRecurse
>>
>> So I suggest we add a check along those lines in security_dac.c, and if the
>> file is already accessible for the disk's needs, we skip the chown'ing.
>
> This works only for DAC, not for SELinux. And I don't know anything
> about AppArmomr. We have qemuFileOpenAs() which seems like copying the
> functionality. Maybe it can be merged together?
Right, for selinux we would need to look for an sesearch equivalent. This
suggestion was just about determining if we need to trigger DAC reassignment
- Cole