Re: [libvirt] New QEMU daemon for persistent reservations

27 Nov 2017

      On Mon, Nov 27, 2017 at 12:51:33PM +0100, Paolo Bonzini wrote:
...
On 27/11/2017 12:37, Daniel P. Berrange wrote:
...
On Mon, Nov 27, 2017 at 12:13:24PM +0100, Paolo Bonzini wrote:
...
Hm, I see what you mean now.  But it would be "just" a qemu-pr-helper
bug that it trusts the caller to have "ioctl" permissions on the file
descriptor, wouldn't it?
And it could be a feature even, since the remote QEMU process also has
to have "connect" permissions on the qemu-pr-helper socket.  So you
could give it ioctl access *limited to persistent reservations* by
granting the appropriate permissions on the socket.
We can't grant access to the persistent reservation helper's socket on a
per QEMU basis. Permissions are granted on the domain type svirt_t, and
we don't want to invent a new domain type just for having access to the
PR helper.
You can do so via DAC and MAC on the socket itself, or is that not enough?
Whatever is done with DAC is irrelevant when considering the MAC policy.

Yes, you can do MAC on the socket, but that merely gives you protection
betweeen the host & guest. The goal of sVirt is to give protection
between VMs, and MAC on the socket alone doesn't guarantee that.

Consider what the kernel would enforce if the SCSI PR was done inside
QEMU. It would check

   allowed(process context, file context, ...file class.., ...ioctl perm...)

where

   process context == svirt_t:s0,c340,c524
   file context == svirt_image_t:s0,c340,c524

In the case of a per-QEMU  SCSI PR, we can get the same effect because
when this is checked by the kernel, the PR helper has a process context
still containing a MCS level, eg

   process context ==  qemu_pr_helper_t:s0,c340,c524
   file context ==  svirt_image_t:s0,c340,c524

In the shared-PR helper though, the PR helper context would not have an
MAC level applied, so the MAC check is now

   process context == qemu_pr_helper_t:s0
   file context == svirt_image_t:s0,c340,c524

This is still a MAC check, but it is a weaker check than what sVirt promises,
because a QEMU with MCS   c340,c524 could pass a file descriptor with a MCS
level c123,c422.  QEMU would be denied to use this bogus FD, but the PR
helper would happily allow this bad access.

So to satisfy sVirt separation of QEMU instances, we need the PR helper to
be able to validate the MCS levels, as the kernel no longer has enough info
todo this automatically. 

In the PR helper this is achieved with something like

    getpeercon(socketfd, &qemucon)  => returns svirt_t:s0,c340,c524
    fgetfilecont(filefd, &diskcon)  => returns svirt_image_t:s0,c340,c524

Then it would call something like

   security_compute_av(qemucon, diskcon, ...file class..., ...ioctl perm...)

to get a MAC decision from the kernel. This preserves the sVirt isolation
of individual QEMU instances.
...
In other words, what are the SELinux best practices when you don't want
a process to have blanket access to a permission, but you may be fine
with a subset of those?  If you think of unpriv_sgio=0 as a very simple
MAC, this is actually the very case that the PR helper is designed for.
It depends on the security goal of the MAC policy. The original SELinux
policy was just protecting the host from QEMUs. For sVirt the goal is
stronger, to have isolation between individual QEMUs

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|