[libvirt] [PATCH 0/2] qemu_cgroup: allow access to /dev/dri/render*

Technically a v2, but v1 is already pushed. This version is based on the <gl enable> in <spice> instead of accel3d="yes" in <video><model type="virtio". It also only allows access to the render* devices, instead of all of them. https://bugzilla.redhat.com/show_bug.cgi?id=1337290 Ján Tomko (2): Revert "qemu_cgroup: allow access to /dev/dri for virtio-vga" qemu_cgroup: allow access to /dev/dri/render* src/qemu/qemu_cgroup.c | 71 ++++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 57 insertions(+), 14 deletions(-) -- 2.7.3

This reverts commit 3943bdd60c7ff1de00e73f66d907664b74a88a3f. --- src/qemu/qemu_cgroup.c | 15 --------------- 1 file changed, 15 deletions(-) diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c index 46634f4..1e04a68 100644 --- a/src/qemu/qemu_cgroup.c +++ b/src/qemu/qemu_cgroup.c @@ -51,7 +51,6 @@ static const char *const defaultDeviceACL[] = { }; #define DEVICE_PTY_MAJOR 136 #define DEVICE_SND_MAJOR 116 -#define DEVICE_DRI_MAJOR 226 static int @@ -627,20 +626,6 @@ qemuSetupDevicesCgroup(virQEMUDriverPtr driver, goto cleanup; } - if (vm->def->nvideos) { - /* currently libvirt only allows the primary video to be virtio */ - virDomainVideoDefPtr vid = vm->def->videos[0]; - if (vid->type == VIR_DOMAIN_VIDEO_TYPE_VIRTIO && - vid->accel && vid->accel->accel3d == VIR_TRISTATE_BOOL_YES) { - rv = virCgroupAllowDevice(priv->cgroup, 'c', DEVICE_DRI_MAJOR, -1, - VIR_CGROUP_DEVICE_RW); - virDomainAuditCgroupMajor(vm, priv->cgroup, "allow", DEVICE_DRI_MAJOR, - "video", "rw", rv == 0); - if (rv < 0) - goto cleanup; - } - } - for (i = 0; deviceACL[i] != NULL; i++) { if (!virFileExists(deviceACL[i])) { VIR_DEBUG("Ignoring non-existent device %s", deviceACL[i]); -- 2.7.3

On 05/19/2016 07:29 AM, Ján Tomko wrote:
This reverts commit 3943bdd60c7ff1de00e73f66d907664b74a88a3f. --- src/qemu/qemu_cgroup.c | 15 --------------- 1 file changed, 15 deletions(-)
ACK, at least this should go in until we figure come to agreement on the correct solution - Cole

On Fri, May 20, 2016 at 04:47:20PM -0400, Cole Robinson wrote:
On 05/19/2016 07:29 AM, Ján Tomko wrote:
This reverts commit 3943bdd60c7ff1de00e73f66d907664b74a88a3f. --- src/qemu/qemu_cgroup.c | 15 --------------- 1 file changed, 15 deletions(-)
ACK, at least this should go in until we figure come to agreement on the correct solution
Thanks, I have pushed this revert. Jan

Allow access to /dev/dri/render* devices for domains using <graphics type="spice"> with <gl enable="yes"/> https://bugzilla.redhat.com/show_bug.cgi?id=1337290 --- src/qemu/qemu_cgroup.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 58 insertions(+) diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c index 1e04a68..5d810fa 100644 --- a/src/qemu/qemu_cgroup.c +++ b/src/qemu/qemu_cgroup.c @@ -567,6 +567,61 @@ qemuSetupFirmwareCgroup(virDomainObjPtr vm) static int +qemuSetupGraphicsCgroup(virDomainObjPtr vm) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + const char *dripath = "/dev/dri"; + char *devpath = NULL; + struct dirent *ent; + int ret = -1; + DIR *dir; + int rv, rc; + size_t i; + + for (i = 0; i < vm->def->ngraphics; i++) { + virDomainGraphicsDefPtr gfx = vm->def->graphics[i]; + if (gfx->type == VIR_DOMAIN_GRAPHICS_TYPE_SPICE && + gfx->data.spice.gl == VIR_TRISTATE_BOOL_YES) + break; + } + + if (i == vm->def->ngraphics) + return 0; + + if (!(dir = opendir(dripath))) { + virReportSystemError(errno, + _("Could not open directory '%s'"), + dripath); + return -1; + } + + while ((rv = virDirRead(dir, &ent, dripath)) > 0) { + if (STRPREFIX(ent->d_name, "render")) { + if (virAsprintf(&devpath, "%s/%s", dripath, ent->d_name) < 0) + goto cleanup; + + rc = virCgroupAllowDevicePath(priv->cgroup, devpath, + VIR_CGROUP_DEVICE_RW, false); + virDomainAuditCgroupPath(vm, priv->cgroup, "allow", devpath, + "rw", rc == 0); + if (rc < 0) + goto cleanup; + VIR_FREE(devpath); + } + } + + if (rv < 0) + goto cleanup; + + ret = 0; + cleanup: + VIR_FREE(devpath); + closedir(dir); + return ret; +} + + +static int qemuSetupDevicesCgroup(virQEMUDriverPtr driver, virDomainObjPtr vm) { @@ -626,6 +681,9 @@ qemuSetupDevicesCgroup(virQEMUDriverPtr driver, goto cleanup; } + if (qemuSetupGraphicsCgroup(vm) < 0) + goto cleanup; + for (i = 0; deviceACL[i] != NULL; i++) { if (!virFileExists(deviceACL[i])) { VIR_DEBUG("Ignoring non-existent device %s", deviceACL[i]); -- 2.7.3

On Thu, May 19, 2016 at 01:29:07PM +0200, Ján Tomko wrote:
Allow access to /dev/dri/render* devices for domains using <graphics type="spice"> with <gl enable="yes"/>
Ignoring cgroups for a minute, how exactly does QEMU get access to the /dev/dri/render* devices in general ? ie when QEMU is running as the 'qemu:qemu' user/group account, with selinux enforcing I don't see how it can possibly open these files, as we're not granting access to them in any of the security drivers. Given this, allowing them in cgroups seems like the least of our problems. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 05/19/2016 08:21 AM, Daniel P. Berrange wrote:
On Thu, May 19, 2016 at 01:29:07PM +0200, Ján Tomko wrote:
Allow access to /dev/dri/render* devices for domains using <graphics type="spice"> with <gl enable="yes"/>
Ignoring cgroups for a minute, how exactly does QEMU get access to the /dev/dri/render* devices in general ? ie when QEMU is running as the 'qemu:qemu' user/group account, with selinux enforcing I don't see how it can possibly open these files, as we're not granting access to them in any of the security drivers. Given this, allowing them in cgroups seems like the least of our problems.
The svirt bits can at least be temporarily worked around with chmod 666 /dev/dri/render* and setenforce 0. The cgroup bit requires duplicating the entire cgroup_device_acl block in qemu.conf which is less friendly and not very future proof. Seems like an easy win But yes, there needs to be a larger discussion about how to correctly handle this WRT svirt for both qemu:///system and qemu:///session. selinux bug here: https://bugzilla.redhat.com/show_bug.cgi?id=1337333 - Cole

On Thu, May 19, 2016 at 08:36:35AM -0400, Cole Robinson wrote:
On 05/19/2016 08:21 AM, Daniel P. Berrange wrote:
On Thu, May 19, 2016 at 01:29:07PM +0200, Ján Tomko wrote:
Allow access to /dev/dri/render* devices for domains using <graphics type="spice"> with <gl enable="yes"/>
Ignoring cgroups for a minute, how exactly does QEMU get access to the /dev/dri/render* devices in general ? ie when QEMU is running as the 'qemu:qemu' user/group account, with selinux enforcing I don't see how it can possibly open these files, as we're not granting access to them in any of the security drivers. Given this, allowing them in cgroups seems like the least of our problems.
The svirt bits can at least be temporarily worked around with chmod 666 /dev/dri/render* and setenforce 0. The cgroup bit requires duplicating the entire cgroup_device_acl block in qemu.conf which is less friendly and not very future proof. Seems like an easy win
There's a potential issue though with going down a path now which is not viable long term, which we then get stuck supporting for upgradability. eg if we start granting permission to use these devices to multiple QEMUs concurrently will we regret doing that later and have to break people's deployments to fix it properly. Without sVirt integration though I'd suggest we don't really advertize this to users, as telling them to chmod / setenforce is not really a supportable strategy for usage in any case.
But yes, there needs to be a larger discussion about how to correctly handle this WRT svirt for both qemu:///system and qemu:///session. selinux bug here:
Looks like we'd need to consider those separately - as in the session case, even libvirtd won't have the option to fix permissioning. It is something that would have to be done at the OS level to grant access. Once granting access to just an unprivileged QEMU you might as well just grant access to all a user's processes, since there's no separation stopping other processes in the user session getting access to the devices via QEMU. IOW, if you want qemu:///session mode to have access you end up with a chmod 666 world, where everyone has access. I don't know enough about it to know if that's reasonable or not. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 05/19/2016 08:52 AM, Daniel P. Berrange wrote:
On Thu, May 19, 2016 at 08:36:35AM -0400, Cole Robinson wrote:
On 05/19/2016 08:21 AM, Daniel P. Berrange wrote:
On Thu, May 19, 2016 at 01:29:07PM +0200, Ján Tomko wrote:
Allow access to /dev/dri/render* devices for domains using <graphics type="spice"> with <gl enable="yes"/>
Ignoring cgroups for a minute, how exactly does QEMU get access to the /dev/dri/render* devices in general ? ie when QEMU is running as the 'qemu:qemu' user/group account, with selinux enforcing I don't see how it can possibly open these files, as we're not granting access to them in any of the security drivers. Given this, allowing them in cgroups seems like the least of our problems.
The svirt bits can at least be temporarily worked around with chmod 666 /dev/dri/render* and setenforce 0. The cgroup bit requires duplicating the entire cgroup_device_acl block in qemu.conf which is less friendly and not very future proof. Seems like an easy win
There's a potential issue though with going down a path now which is not viable long term, which we then get stuck supporting for upgradability. eg if we start granting permission to use these devices to multiple QEMUs concurrently will we regret doing that later and have to break people's deployments to fix it properly.
Hmm, I see. CCing gl guys How this works on my f24 host: $ ls -lZ /dev/dri/ total 0 crw-rw----+ 1 root video system_u:object_r:dri_device_t:s0 226, 0 May 18 19:17 card0 crw-------. 1 root video system_u:object_r:dri_device_t:s0 226, 64 May 18 19:17 controlD64 crw-rw----+ 1 root video system_u:object_r:dri_device_t:s0 226, 128 May 18 19:17 renderD128 qemu itself loops over /dev/dri, grabs the first matching renderD* that it can open, then does its magic. Some questions I have in general: - is there only ever one render node? or one per video card? - is it okay to use the same node for multiple VMs simultaneously? Maybe the long term fix is to have libvirt pass in a pre-opened fd for qemu:///system, since I don't know if it would be acceptable to chown qemu:qemu on the render node, but maybe we use setfacl instead.
Without sVirt integration though I'd suggest we don't really advertize this to users, as telling them to chmod / setenforce is not really a supportable strategy for usage in any case.
I certainly agree with that at least WRT UI tools... after playing with this stuff a bit I won't be adding UI clicky 'enable gl' in virt-manager in the short term, there's just too many operational caveats. But advertisement in the form of blog posts with all the caveats listed will probably save us bug reports, and maybe a command line virt-xml one liner to turn it on for an existing VM. And of course have virt-manager work with it correctly on the viewer side, patches in git and fedora build coming soon
But yes, there needs to be a larger discussion about how to correctly handle this WRT svirt for both qemu:///system and qemu:///session. selinux bug here:
Looks like we'd need to consider those separately - as in the session case, even libvirtd won't have the option to fix permissioning. It is something that would have to be done at the OS level to grant access. Once granting access to just an unprivileged QEMU you might as well just grant access to all a user's processes, since there's no separation stopping other processes in the user session getting access to the devices via QEMU. IOW, if you want qemu:///session mode to have access you end up with a chmod 666 world, where everyone has access. I don't know enough about it to know if that's reasonable or not.
Actually qemu:///session DAC permissions are fine, because the logged in user has ACL access to the render node already, like /dev/snd/* for example. It's just the svirt selinux policy that is rejecting access The DAC permissions are an issue with qemu:qemu on qemu:///system though - Cole

Hi,
$ ls -lZ /dev/dri/ total 0 crw-rw----+ 1 root video system_u:object_r:dri_device_t:s0 226, 0 May 18 19:17 card0 crw-------. 1 root video system_u:object_r:dri_device_t:s0 226, 64 May 18 19:17 controlD64 crw-rw----+ 1 root video system_u:object_r:dri_device_t:s0 226, 128 May 18 19:17 renderD128
qemu itself loops over /dev/dri, grabs the first matching renderD* that it can open, then does its magic. Some questions I have in general:
- is there only ever one render node? or one per video card?
One per video card.
- is it okay to use the same node for multiple VMs simultaneously?
Yes.
Maybe the long term fix is to have libvirt pass in a pre-opened fd for qemu:///system, since I don't know if it would be acceptable to chown qemu:qemu on the render node, but maybe we use setfacl instead.
chown isn't a good idea I think. But doesn't use libvirt setfacl anyway for simliar things (i.e. /dev/bus/usb/... for usb pass-through) ?
I certainly agree with that at least WRT UI tools... after playing with this stuff a bit I won't be adding UI clicky 'enable gl' in virt-manager in the short term, there's just too many operational caveats. But advertisement in the form of blog posts with all the caveats listed will probably save us bug reports, and maybe a command line virt-xml one liner to turn it on for an existing VM. And of course have virt-manager work with it correctly on the viewer side, patches in git and fedora build coming soon
Agree, we should first get things going smooth before adding a big "enable gl" switch in the UI. cheers, Gerd

On Thu, May 19, 2016 at 04:12:52PM +0200, Gerd Hoffmann wrote:
Hi,
$ ls -lZ /dev/dri/ total 0 crw-rw----+ 1 root video system_u:object_r:dri_device_t:s0 226, 0 May 18 19:17 card0 crw-------. 1 root video system_u:object_r:dri_device_t:s0 226, 64 May 18 19:17 controlD64 crw-rw----+ 1 root video system_u:object_r:dri_device_t:s0 226, 128 May 18 19:17 renderD128
qemu itself loops over /dev/dri, grabs the first matching renderD* that it can open, then does its magic. Some questions I have in general:
- is there only ever one render node? or one per video card?
One per video card.
Is there a way to tell QEMU which video card to use ? If so we need to somehow represent this in libvirt.
- is it okay to use the same node for multiple VMs simultaneously?
Yes.
Presumably they're going to compete for execution time and potentially VRAM at least ? I assume they have 100% security isolation from each other though. IOW, permissioning is really just there to prevent a rogue processes from doing denial of service on the GPU resources, rather than actively compromising other users of the GPU ?
Maybe the long term fix is to have libvirt pass in a pre-opened fd for qemu:///system, since I don't know if it would be acceptable to chown qemu:qemu on the render node, but maybe we use setfacl instead.
chown isn't a good idea I think. But doesn't use libvirt setfacl anyway for simliar things (i.e. /dev/bus/usb/... for usb pass-through) ?
No, we exclusively switch access to QEMU only. Obviously the DRI stuff is different as we expected the host OS to have continued concurrent use of the video card. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 20 May 2016 at 00:23, Daniel P. Berrange <berrange@redhat.com> wrote:
On Thu, May 19, 2016 at 04:12:52PM +0200, Gerd Hoffmann wrote:
Hi,
$ ls -lZ /dev/dri/ total 0 crw-rw----+ 1 root video system_u:object_r:dri_device_t:s0 226, 0 May 18 19:17 card0 crw-------. 1 root video system_u:object_r:dri_device_t:s0 226, 64 May 18 19:17 controlD64 crw-rw----+ 1 root video system_u:object_r:dri_device_t:s0 226, 128 May 18 19:17 renderD128
qemu itself loops over /dev/dri, grabs the first matching renderD* that it can open, then does its magic. Some questions I have in general:
- is there only ever one render node? or one per video card?
One per video card.
Is there a way to tell QEMU which video card to use ? If so we need to somehow represent this in libvirt.
We should probably add support for using an explicit path as the backing for a particular virtio-gpu device. At the moment I think we just open the first which may or may not be a great decision.
- is it okay to use the same node for multiple VMs simultaneously?
Yes.
Presumably they're going to compete for execution time and potentially VRAM at least ? I assume they have 100% security isolation from each other though. IOW, permissioning is really just there to prevent a rogue processes from doing denial of service on the GPU resources, rather than actively compromising other users of the GPU ?
Securing 3D accelerated VM access to 100% is unlikely to ever be possible, the GPU hardware just doesn't support this in some cases, later GPU hardware is a lot better, but there will always be DoS and possible info leaks through a GPU. I don't think VMware or anyone else do much different here. What using a render node does is blocks you from deliberately /accidentally accessing other users buffers using the defined API the old drm API had a global namespace you could stumble through for shared buffers.
Maybe the long term fix is to have libvirt pass in a pre-opened fd for qemu:///system, since I don't know if it would be acceptable to chown qemu:qemu on the render node, but maybe we use setfacl instead.
chown isn't a good idea I think. But doesn't use libvirt setfacl anyway for simliar things (i.e. /dev/bus/usb/... for usb pass-through) ?
No, we exclusively switch access to QEMU only.
Obviously the DRI stuff is different as we expected the host OS to have continued concurrent use of the video card.
chowning wouldn't be acceptable, adding an acl for qemu:qemu would be fine though. Dave.

On Thu, May 19, 2016 at 01:52:00PM +0100, Daniel P. Berrange wrote:
On Thu, May 19, 2016 at 08:36:35AM -0400, Cole Robinson wrote:
On 05/19/2016 08:21 AM, Daniel P. Berrange wrote:
On Thu, May 19, 2016 at 01:29:07PM +0200, Ján Tomko wrote:
Allow access to /dev/dri/render* devices for domains using <graphics type="spice"> with <gl enable="yes"/>
Ignoring cgroups for a minute, how exactly does QEMU get access to the /dev/dri/render* devices in general ? ie when QEMU is running as the 'qemu:qemu' user/group account, with selinux enforcing I don't see how it can possibly open these files, as we're not granting access to them in any of the security drivers. Given this, allowing them in cgroups seems like the least of our problems.
I saw this more as "not denying access" instead of allowing access. For dac/SELinux, if the user adds qemu to the video group/adds ACLs or creates a SELinux rule for it (or the more realistic solution mentioned by Cole), libvirt will not interfere. But it would deny "*:*" devices, giving a "Permission denied" (which is also harder to debug than the other two security measures)
The svirt bits can at least be temporarily worked around with chmod 666 /dev/dri/render* and setenforce 0. The cgroup bit requires duplicating the entire cgroup_device_acl block in qemu.conf which is less friendly and not very future proof. Seems like an easy win
There's a potential issue though with going down a path now which is not viable long term, which we then get stuck supporting for upgradability. eg if we start granting permission to use these devices to multiple QEMUs concurrently will we regret doing that later and have to break people's deployments to fix it properly.
Without sVirt integration though I'd suggest we don't really advertize this to users, as telling them to chmod / setenforce is not really a supportable strategy for usage in any case.
I'm afraid we'll end up with: 1. 'add qemu to this group/acl' 2. 'run this setsebool' Since I'm not sure how we could differentiate between QEMUs that can and QEMUs that cannot access this shared resource. Jan

On Fri, May 20, 2016 at 09:54:29AM +0200, Ján Tomko wrote:
On Thu, May 19, 2016 at 01:52:00PM +0100, Daniel P. Berrange wrote:
On Thu, May 19, 2016 at 08:36:35AM -0400, Cole Robinson wrote:
On 05/19/2016 08:21 AM, Daniel P. Berrange wrote:
On Thu, May 19, 2016 at 01:29:07PM +0200, Ján Tomko wrote:
Allow access to /dev/dri/render* devices for domains using <graphics type="spice"> with <gl enable="yes"/>
Ignoring cgroups for a minute, how exactly does QEMU get access to the /dev/dri/render* devices in general ? ie when QEMU is running as the 'qemu:qemu' user/group account, with selinux enforcing I don't see how it can possibly open these files, as we're not granting access to them in any of the security drivers. Given this, allowing them in cgroups seems like the least of our problems.
I saw this more as "not denying access" instead of allowing access. For dac/SELinux, if the user adds qemu to the video group/adds ACLs or creates a SELinux rule for it (or the more realistic solution mentioned by Cole), libvirt will not interfere. But it would deny "*:*" devices, giving a "Permission denied" (which is also harder to debug than the other two security measures)
The svirt bits can at least be temporarily worked around with chmod 666 /dev/dri/render* and setenforce 0. The cgroup bit requires duplicating the entire cgroup_device_acl block in qemu.conf which is less friendly and not very future proof. Seems like an easy win
There's a potential issue though with going down a path now which is not viable long term, which we then get stuck supporting for upgradability. eg if we start granting permission to use these devices to multiple QEMUs concurrently will we regret doing that later and have to break people's deployments to fix it properly.
Without sVirt integration though I'd suggest we don't really advertize this to users, as telling them to chmod / setenforce is not really a supportable strategy for usage in any case.
I'm afraid we'll end up with: 1. 'add qemu to this group/acl' 2. 'run this setsebool' Since I'm not sure how we could differentiate between QEMUs that can and QEMUs that cannot access this shared resource.
We will have to figure this out somehow, becasue if usage requires those explicit admin steps outside of libvirt, then history shows the feature will be so painful to use that most people won't bother and those who try will file an endless stream of bugs. IOW without solving this problem, the feature might as well not exist for the majority of people. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 05/20/2016 03:54 AM, Ján Tomko wrote:
On Thu, May 19, 2016 at 01:52:00PM +0100, Daniel P. Berrange wrote:
On Thu, May 19, 2016 at 08:36:35AM -0400, Cole Robinson wrote:
On 05/19/2016 08:21 AM, Daniel P. Berrange wrote:
On Thu, May 19, 2016 at 01:29:07PM +0200, Ján Tomko wrote:
Allow access to /dev/dri/render* devices for domains using <graphics type="spice"> with <gl enable="yes"/>
Ignoring cgroups for a minute, how exactly does QEMU get access to the /dev/dri/render* devices in general ? ie when QEMU is running as the 'qemu:qemu' user/group account, with selinux enforcing I don't see how it can possibly open these files, as we're not granting access to them in any of the security drivers. Given this, allowing them in cgroups seems like the least of our problems.
I saw this more as "not denying access" instead of allowing access. For dac/SELinux, if the user adds qemu to the video group/adds ACLs or creates a SELinux rule for it (or the more realistic solution mentioned by Cole), libvirt will not interfere. But it would deny "*:*" devices, giving a "Permission denied" (which is also harder to debug than the other two security measures)
I agree with this, and the patch in general. Dave and Gerd confirmed that this is expected to be a shared resource, so I think extending the cgroups whitelist is going to happen eventually anyways. The svirt/dac issues are real and we need to figure that out, but it's kind of irrelevant at this stage; the word is out on this stuff and people are going to find a way to enable it one way or the other. I've already had discussions with 5 people this week about when support will be available in virt-manager. Heck I know people were already using the qemu command line passthrough magic to test GL with the qemu gtk frontend with older qemu. This is just one of those features that people are going to want to play with, integration issues be damned, so IMO better that we get out in front of it. What I'm afraid of specifically with this cgroups issue is people will work around it with a custom cgroup_device_acl, then some time later when we bump the default list to make some new qemu feature work, suddenly that old cgroup list doesn't cut it and the user get's weird errors with new qemu, which we have to debug. svirt and dac workarounds are less scary in that regard - Cole

Hi What's the status with this patch? If I understand the discussion, it is needed, but not enough. Now that SELinux has been fixed (both in f24/f25 now), I can see only the ACL left: setfacl -m u:qemu:rw /dev/dri/renderD128 + this patch allows me to setup a system VM with virgl. (though tbh, I would be fine restricting virgl to qemu:///session only) thanks On Sat, May 21, 2016 at 1:10 AM Cole Robinson <crobinso@redhat.com> wrote:
On 05/20/2016 03:54 AM, Ján Tomko wrote:
On Thu, May 19, 2016 at 01:52:00PM +0100, Daniel P. Berrange wrote:
On Thu, May 19, 2016 at 08:36:35AM -0400, Cole Robinson wrote:
On 05/19/2016 08:21 AM, Daniel P. Berrange wrote:
On Thu, May 19, 2016 at 01:29:07PM +0200, Ján Tomko wrote:
Allow access to /dev/dri/render* devices for domains using <graphics type="spice"> with <gl enable="yes"/>
Ignoring cgroups for a minute, how exactly does QEMU get access to the /dev/dri/render* devices in general ? ie when QEMU is running as the 'qemu:qemu' user/group account, with selinux enforcing I don't see how it can possibly open these files, as we're not granting access to them in any of the security drivers. Given this, allowing them in cgroups seems like the least of our problems.
I saw this more as "not denying access" instead of allowing access. For dac/SELinux, if the user adds qemu to the video group/adds ACLs or creates a SELinux rule for it (or the more realistic solution mentioned by Cole), libvirt will not interfere. But it would deny "*:*" devices, giving a "Permission denied" (which is also harder to debug than the other two security measures)
I agree with this, and the patch in general. Dave and Gerd confirmed that this is expected to be a shared resource, so I think extending the cgroups whitelist is going to happen eventually anyways.
The svirt/dac issues are real and we need to figure that out, but it's kind of irrelevant at this stage; the word is out on this stuff and people are going to find a way to enable it one way or the other. I've already had discussions with 5 people this week about when support will be available in virt-manager. Heck I know people were already using the qemu command line passthrough magic to test GL with the qemu gtk frontend with older qemu. This is just one of those features that people are going to want to play with, integration issues be damned, so IMO better that we get out in front of it.
What I'm afraid of specifically with this cgroups issue is people will work around it with a custom cgroup_device_acl, then some time later when we bump the default list to make some new qemu feature work, suddenly that old cgroup list doesn't cut it and the user get's weird errors with new qemu, which we have to debug. svirt and dac workarounds are less scary in that regard
- Cole
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
-- Marc-André Lureau

On Thu, Nov 10, 2016 at 10:35:46AM +0000, Marc-André Lureau wrote:
Hi
What's the status with this patch? If I understand the discussion, it is needed, but not enough. Now that SELinux has been fixed (both in f24/f25 now), I can see only the ACL left: setfacl -m u:qemu:rw /dev/dri/renderD128 + this patch allows me to setup a system VM with virgl. (though tbh, I would be fine restricting virgl to qemu:///session only)
This ties in with the discussion we've just been having around udev and DAC/MAC labelling of device nodes. With my proposed solution of using a new mount namespace + dedicated /dev per VM, then granting DAC access to the DRI nodes is easy. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://entangle-photo.org -o- http://search.cpan.org/~danberr/ :|

On 11/10/2016 05:59 AM, Daniel P. Berrange wrote:
On Thu, Nov 10, 2016 at 10:35:46AM +0000, Marc-André Lureau wrote:
Hi
What's the status with this patch? If I understand the discussion, it is needed, but not enough. Now that SELinux has been fixed (both in f24/f25 now), I can see only the ACL left: setfacl -m u:qemu:rw /dev/dri/renderD128 + this patch allows me to setup a system VM with virgl. (though tbh, I would be fine restricting virgl to qemu:///session only)
This ties in with the discussion we've just been having around udev and DAC/MAC labelling of device nodes. With my proposed solution of using a new mount namespace + dedicated /dev per VM, then granting DAC access to the DRI nodes is easy.
The DAC thing at least has an easy workaround like Marc-André pointed out. The only workaround for the cgroup issue is a custom cgroup_device_acl in qemu.conf, which sucks: if a user adds a custom list to their qemu.conf, and then forgets about it, future libvirt updates might extend the default cgroup_device_acl, the user misses these updates, possibly causing hard to diagnose errors or bugs. In the meantime we have people that are trying to make this work regardless of workarounds (see libvirt-users thread, and comments on bug 1337290). So IMO better to make the needed workarounds less intrusive. So I still vote for this patch. But if it's still not acceptable, maybe we can add a new qemu.conf option like cgroup_device_acl_append= which users can manually edit, which avoids the upgrade issues of cgroup_device_acl= - Cole
participants (6)
-
Cole Robinson
-
Daniel P. Berrange
-
Dave Airlie
-
Gerd Hoffmann
-
Ján Tomko
-
Marc-André Lureau