[libvirt] [PATCH v4 00/14] Introduce vGPU mdev framework to libvirt

since v1: - new <hostdev> attribute model introduced which tells libvirt which device API should be considered when auto-assigning guest address - device_api is properly checked, thus taking the 'model' attribute only as a hint to assign "some" address - new address type 'mdev' is introduced rather than using plain <uuid> element, since the address element is more conveniently extendable. - the emulated mtty driver now works as well out of the box, so no HW needed to review this series --> let's try it :) - fixed all the nits from v1 since v2: - dropped the patch introducing new address type 'mdev' since I added by mistake and only after that realized that the device address type enum is used for guest addresses only --> the mdevs are still identified by address element containing an 'uuid' attribute, I just dropped the enum - resolved the driver hostdev list race condition raised by Pavel in his review --> the device API is now checked every time our internal mdev object is created as opposed to the previous version where because of the model being checked separately, the locking issues arose. - rewrote the docs, reflecting the mdev address type drop change - squashed all security related stuff into 1 patch, also added app-armor bits - as Pavel suggested, moved most of the mdev-related functions out of virhostdev.c to virmdev.c - added a few more test cases - created a new branch 'mdev-next' on my github (more suitable name than a strict version number) on https://github.com/eskultety/libvirt/commits/mdev-next since v3: - 'undo' an accidental squash of virmdev.{c,h} module introduction into patch 4/15 and made it a separate patch again - squash 5/15 into 4/15 as Pavel suggested - dropped the NEWS patch, as I've so far got at least 4 merge conflicts because of it when rebasing...I'll add it before the series is ready to be merged...or I'll forget about it like I usually do and add it later :/ Erik Erik Skultety (14): conf: hostdev: Enforce enum-in-switch compile-time checks conf: hostdev: Introduce virDomainHostdevSubsysSCSIClear conf: Introduce virDomainHostdevDefPostParse util: Introduce new module virmdev conf: Introduce new hostdev device type mdev security: Enable labeling of vfio mediated devices conf: Enable cold-plug of a mediated device qemu: Assign PCI addresses for mediated devices as well hostdev: Maintain a driver list of active mediated devices qemu: cgroup: Adjust cgroups' logic to allow mediated devices qemu: Bump the memory locking limit for mdevs as well qemu: Format mdevs on qemu command line test: Add some test cases for our test suite regarding the mdevs docs: Document the new hostdev and address type 'mdev' docs/formatdomain.html.in | 46 +- docs/schemas/domaincommon.rng | 22 + po/POTFILES.in | 1 + src/Makefile.am | 1 + src/conf/domain_conf.c | 225 ++++++++-- src/conf/domain_conf.h | 9 + src/libvirt_private.syms | 25 ++ src/qemu/qemu_command.c | 45 ++ src/qemu/qemu_command.h | 5 + src/qemu/qemu_domain.c | 24 +- src/qemu/qemu_domain.h | 1 + src/qemu/qemu_domain_address.c | 14 +- src/qemu/qemu_hostdev.c | 56 +++ src/qemu/qemu_hostdev.h | 10 + src/qemu/qemu_hotplug.c | 2 + src/security/security_apparmor.c | 22 + src/security/security_dac.c | 43 ++ src/security/security_selinux.c | 45 ++ src/util/virhostdev.c | 165 ++++++- src/util/virhostdev.h | 23 + src/util/virmdev.c | 487 +++++++++++++++++++++ src/util/virmdev.h | 123 ++++++ tests/domaincapsschemadata/full.xml | 1 + ...ml2argv-hostdev-mdev-invalid-target-address.xml | 33 ++ ...muxml2argv-hostdev-mdev-src-address-invalid.xml | 35 ++ .../qemuxml2argv-hostdev-mdev-unmanaged.args | 25 ++ .../qemuxml2argv-hostdev-mdev-unmanaged.xml | 35 ++ tests/qemuxml2argvtest.c | 9 + .../qemuxml2xmlout-hostdev-mdev-unmanaged.xml | 40 ++ tests/qemuxml2xmltest.c | 1 + 30 files changed, 1518 insertions(+), 55 deletions(-) create mode 100644 src/util/virmdev.c create mode 100644 src/util/virmdev.h create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-invalid-target-address.xml create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-src-address-invalid.xml create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.xml create mode 100644 tests/qemuxml2xmloutdata/qemuxml2xmlout-hostdev-mdev-unmanaged.xml -- 2.12.1

Enforce virDomainHostdevSubsysType checking during compilation. Again, one of a few spots in our code where we should enforce the typecast to the enum type, thus not forgetting to update *all* switch occurrences dealing with the give enum. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/conf/domain_conf.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 6bbc6a2a7b..17909820be 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -2320,7 +2320,7 @@ void virDomainHostdevDefClear(virDomainHostdevDefPtr def) switch (def->mode) { case VIR_DOMAIN_HOSTDEV_MODE_CAPABILITIES: - switch (def->source.caps.type) { + switch ((virDomainHostdevCapsType) def->source.caps.type) { case VIR_DOMAIN_HOSTDEV_CAPS_TYPE_STORAGE: VIR_FREE(def->source.caps.u.storage.block); break; @@ -2331,6 +2331,8 @@ void virDomainHostdevDefClear(virDomainHostdevDefPtr def) VIR_FREE(def->source.caps.u.net.ifname); virNetDevIPInfoClear(&def->source.caps.u.net.ip); break; + case VIR_DOMAIN_HOSTDEV_CAPS_TYPE_LAST: + break; } break; case VIR_DOMAIN_HOSTDEV_MODE_SUBSYS: -- 2.12.1

On 03/22/2017 11:27 AM, Erik Skultety wrote:
Enforce virDomainHostdevSubsysType checking during compilation. Again, one of a few spots in our code where we should enforce the typecast to the enum type, thus not forgetting to update *all* switch occurrences dealing with the give enum.
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/conf/domain_conf.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 6bbc6a2a7b..17909820be 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -2320,7 +2320,7 @@ void virDomainHostdevDefClear(virDomainHostdevDefPtr def)
switch (def->mode) { case VIR_DOMAIN_HOSTDEV_MODE_CAPABILITIES: - switch (def->source.caps.type) { + switch ((virDomainHostdevCapsType) def->source.caps.type) { case VIR_DOMAIN_HOSTDEV_CAPS_TYPE_STORAGE: VIR_FREE(def->source.caps.u.storage.block); break; @@ -2331,6 +2331,8 @@ void virDomainHostdevDefClear(virDomainHostdevDefPtr def) VIR_FREE(def->source.caps.u.net.ifname); virNetDevIPInfoClear(&def->source.caps.u.net.ip); break; + case VIR_DOMAIN_HOSTDEV_CAPS_TYPE_LAST: + break; } break; case VIR_DOMAIN_HOSTDEV_MODE_SUBSYS:
ACK.

Just a tiny wrapper over the SCSI def clearing logic to drop some if-else branches from a switch, mainly because extending the switch in the future would render the current code with branching less readable. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/conf/domain_conf.c | 33 ++++++++++++++++++++++----------- 1 file changed, 22 insertions(+), 11 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 17909820be..568bf6722e 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -2303,6 +2303,17 @@ virDomainHostdevSubsysSCSIiSCSIClear(virDomainHostdevSubsysSCSIiSCSIPtr iscsisrc iscsisrc->auth = NULL; } + +static void +virDomainHostdevSubsysSCSIClear(virDomainHostdevSubsysSCSIPtr scsisrc) +{ + if (scsisrc->protocol == VIR_DOMAIN_HOSTDEV_SCSI_PROTOCOL_TYPE_ISCSI) + virDomainHostdevSubsysSCSIiSCSIClear(&scsisrc->u.iscsi); + else + VIR_FREE(scsisrc->u.host.adapter); +} + + void virDomainHostdevDefClear(virDomainHostdevDefPtr def) { if (!def) @@ -2336,17 +2347,17 @@ void virDomainHostdevDefClear(virDomainHostdevDefPtr def) } break; case VIR_DOMAIN_HOSTDEV_MODE_SUBSYS: - if (def->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI) { - virDomainHostdevSubsysSCSIPtr scsisrc = &def->source.subsys.u.scsi; - if (scsisrc->protocol == - VIR_DOMAIN_HOSTDEV_SCSI_PROTOCOL_TYPE_ISCSI) { - virDomainHostdevSubsysSCSIiSCSIClear(&scsisrc->u.iscsi); - } else { - VIR_FREE(scsisrc->u.host.adapter); - } - } else if (def->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST) { - virDomainHostdevSubsysSCSIVHostPtr hostsrc = &def->source.subsys.u.scsi_host; - VIR_FREE(hostsrc->wwpn); + switch ((virDomainHostdevSubsysType) def->source.subsys.type) { + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI: + virDomainHostdevSubsysSCSIClear(&def->source.subsys.u.scsi); + break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST: + VIR_FREE(def->source.subsys.u.scsi_host.wwpn); + break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: + break; } break; } -- 2.12.1

On 03/22/2017 11:27 AM, Erik Skultety wrote:
Just a tiny wrapper over the SCSI def clearing logic to drop some if-else branches from a switch, mainly because extending the switch in the future would render the current code with branching less readable.
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/conf/domain_conf.c | 33 ++++++++++++++++++++++----------- 1 file changed, 22 insertions(+), 11 deletions(-)
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 17909820be..568bf6722e 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -2303,6 +2303,17 @@ virDomainHostdevSubsysSCSIiSCSIClear(virDomainHostdevSubsysSCSIiSCSIPtr iscsisrc iscsisrc->auth = NULL; }
+ +static void +virDomainHostdevSubsysSCSIClear(virDomainHostdevSubsysSCSIPtr scsisrc) +{ + if (scsisrc->protocol == VIR_DOMAIN_HOSTDEV_SCSI_PROTOCOL_TYPE_ISCSI) + virDomainHostdevSubsysSCSIiSCSIClear(&scsisrc->u.iscsi); + else + VIR_FREE(scsisrc->u.host.adapter); +} + + void virDomainHostdevDefClear(virDomainHostdevDefPtr def) { if (!def) @@ -2336,17 +2347,17 @@ void virDomainHostdevDefClear(virDomainHostdevDefPtr def) } break; case VIR_DOMAIN_HOSTDEV_MODE_SUBSYS: - if (def->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI) { - virDomainHostdevSubsysSCSIPtr scsisrc = &def->source.subsys.u.scsi; - if (scsisrc->protocol == - VIR_DOMAIN_HOSTDEV_SCSI_PROTOCOL_TYPE_ISCSI) { - virDomainHostdevSubsysSCSIiSCSIClear(&scsisrc->u.iscsi); - } else { - VIR_FREE(scsisrc->u.host.adapter); - } - } else if (def->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST) { - virDomainHostdevSubsysSCSIVHostPtr hostsrc = &def->source.subsys.u.scsi_host; - VIR_FREE(hostsrc->wwpn); + switch ((virDomainHostdevSubsysType) def->source.subsys.type) { + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI: + virDomainHostdevSubsysSCSIClear(&def->source.subsys.u.scsi); + break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST: + VIR_FREE(def->source.subsys.u.scsi_host.wwpn); + break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: + break; } break; }
ACK.

Just to make the code a bit cleaner, move hostdev specific post parse code to its own function just in case it grows in the future. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/conf/domain_conf.c | 75 +++++++++++++++++++++++++++++++------------------- 1 file changed, 47 insertions(+), 28 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 568bf6722e..a5ab42297d 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -4219,6 +4219,50 @@ virDomainHostdevAssignAddress(virDomainXMLOptionPtr xmlopt, static int +virDomainHostdevDefPostParse(virDomainHostdevDefPtr dev, + const virDomainDef *def, + virDomainXMLOptionPtr xmlopt) +{ + if (dev->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS) + return 0; + + switch (dev->source.subsys.type) { + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI: + if (dev->info->type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE && + virDomainHostdevAssignAddress(xmlopt, def, dev) < 0) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Cannot assign SCSI host device address")); + return -1; + } else { + /* Ensure provided address doesn't conflict with existing + * scsi disk drive address + */ + virDomainDeviceDriveAddressPtr addr = &dev->info->addr.drive; + if (virDomainDriveAddressIsUsedByDisk(def, + VIR_DOMAIN_DISK_BUS_SCSI, + addr)) { + virReportError(VIR_ERR_XML_ERROR, + _("SCSI host address controller='%u' " + "bus='%u' target='%u' unit='%u' in " + "use by a SCSI disk"), + addr->controller, addr->bus, + addr->target, addr->unit); + return -1; + } + } + break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: + break; + } + + return 0; +} + + +static int virDomainDeviceDefPostParseInternal(virDomainDeviceDefPtr dev, const virDomainDef *def, virCapsPtr caps ATTRIBUTE_UNUSED, @@ -4299,34 +4343,9 @@ virDomainDeviceDefPostParseInternal(virDomainDeviceDefPtr dev, video->vram = VIR_ROUND_UP_POWER_OF_TWO(video->vram); } - if (dev->type == VIR_DOMAIN_DEVICE_HOSTDEV) { - virDomainHostdevDefPtr hdev = dev->data.hostdev; - - if (virHostdevIsSCSIDevice(hdev)) { - if (hdev->info->type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE && - virDomainHostdevAssignAddress(xmlopt, def, hdev) < 0) { - virReportError(VIR_ERR_XML_ERROR, "%s", - _("Cannot assign SCSI host device address")); - return -1; - } else { - /* Ensure provided address doesn't conflict with existing - * scsi disk drive address - */ - virDomainDeviceDriveAddressPtr addr = &hdev->info->addr.drive; - if (virDomainDriveAddressIsUsedByDisk(def, - VIR_DOMAIN_DISK_BUS_SCSI, - addr)) { - virReportError(VIR_ERR_XML_ERROR, - _("SCSI host address controller='%u' " - "bus='%u' target='%u' unit='%u' in " - "use by a SCSI disk"), - addr->controller, addr->bus, - addr->target, addr->unit); - return -1; - } - } - } - } + if (dev->type == VIR_DOMAIN_DEVICE_HOSTDEV && + virDomainHostdevDefPostParse(dev->data.hostdev, def, xmlopt) < 0) + return -1; if (dev->type == VIR_DOMAIN_DEVICE_CONTROLLER) { virDomainControllerDefPtr cdev = dev->data.controller; -- 2.12.1

On 03/22/2017 11:27 AM, Erik Skultety wrote:
Just to make the code a bit cleaner, move hostdev specific post parse code to its own function just in case it grows in the future.
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/conf/domain_conf.c | 75 +++++++++++++++++++++++++++++++------------------- 1 file changed, 47 insertions(+), 28 deletions(-)
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 568bf6722e..a5ab42297d 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -4219,6 +4219,50 @@ virDomainHostdevAssignAddress(virDomainXMLOptionPtr xmlopt,
static int +virDomainHostdevDefPostParse(virDomainHostdevDefPtr dev, + const virDomainDef *def, + virDomainXMLOptionPtr xmlopt) +{ + if (dev->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS) + return 0; + + switch (dev->source.subsys.type) {
I had to actually go look at the definition of virHostdevIsSCSIDevice() to see that the above lines do exactly duplicate the original functionality. And I like it the way you've done it (putting in a switch() on the type rather than hiding the check in a utility function) better - it will make more sense when it's expanded. ACK.
+ case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI: + if (dev->info->type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE && + virDomainHostdevAssignAddress(xmlopt, def, dev) < 0) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Cannot assign SCSI host device address")); + return -1; + } else { + /* Ensure provided address doesn't conflict with existing + * scsi disk drive address + */ + virDomainDeviceDriveAddressPtr addr = &dev->info->addr.drive; + if (virDomainDriveAddressIsUsedByDisk(def, + VIR_DOMAIN_DISK_BUS_SCSI, + addr)) { + virReportError(VIR_ERR_XML_ERROR, + _("SCSI host address controller='%u' " + "bus='%u' target='%u' unit='%u' in " + "use by a SCSI disk"), + addr->controller, addr->bus, + addr->target, addr->unit); + return -1; + } + } + break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: + break; + } + + return 0; +} + + +static int virDomainDeviceDefPostParseInternal(virDomainDeviceDefPtr dev, const virDomainDef *def, virCapsPtr caps ATTRIBUTE_UNUSED, @@ -4299,34 +4343,9 @@ virDomainDeviceDefPostParseInternal(virDomainDeviceDefPtr dev, video->vram = VIR_ROUND_UP_POWER_OF_TWO(video->vram); }
- if (dev->type == VIR_DOMAIN_DEVICE_HOSTDEV) { - virDomainHostdevDefPtr hdev = dev->data.hostdev; - - if (virHostdevIsSCSIDevice(hdev)) { - if (hdev->info->type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE && - virDomainHostdevAssignAddress(xmlopt, def, hdev) < 0) { - virReportError(VIR_ERR_XML_ERROR, "%s", - _("Cannot assign SCSI host device address")); - return -1; - } else { - /* Ensure provided address doesn't conflict with existing - * scsi disk drive address - */ - virDomainDeviceDriveAddressPtr addr = &hdev->info->addr.drive; - if (virDomainDriveAddressIsUsedByDisk(def, - VIR_DOMAIN_DISK_BUS_SCSI, - addr)) { - virReportError(VIR_ERR_XML_ERROR, - _("SCSI host address controller='%u' " - "bus='%u' target='%u' unit='%u' in " - "use by a SCSI disk"), - addr->controller, addr->bus, - addr->target, addr->unit); - return -1; - } - } - } - } + if (dev->type == VIR_DOMAIN_DEVICE_HOSTDEV && + virDomainHostdevDefPostParse(dev->data.hostdev, def, xmlopt) < 0) + return -1;
if (dev->type == VIR_DOMAIN_DEVICE_CONTROLLER) { virDomainControllerDefPtr cdev = dev->data.controller;

Beside creation, disposal, getter, and setter methods the module exports methods to work with lists of mediated devices. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- po/POTFILES.in | 1 + src/Makefile.am | 1 + src/libvirt_private.syms | 22 +++ src/util/virmdev.c | 487 +++++++++++++++++++++++++++++++++++++++++++++++ src/util/virmdev.h | 123 ++++++++++++ 5 files changed, 634 insertions(+) create mode 100644 src/util/virmdev.c create mode 100644 src/util/virmdev.h diff --git a/po/POTFILES.in b/po/POTFILES.in index 64cb88cfa5..df3d3cfe80 100644 --- a/po/POTFILES.in +++ b/po/POTFILES.in @@ -224,6 +224,7 @@ src/util/virlease.c src/util/virlockspace.c src/util/virlog.c src/util/virmacmap.c +src/util/virmdev.c src/util/virnetdev.c src/util/virnetdevbandwidth.c src/util/virnetdevbridge.c diff --git a/src/Makefile.am b/src/Makefile.am index 3b1bb1da35..01b7f4e0be 100644 --- a/src/Makefile.am +++ b/src/Makefile.am @@ -188,6 +188,7 @@ UTIL_SOURCES = \ util/virvhba.c util/virvhba.h \ util/virxdrdefs.h \ util/virxml.c util/virxml.h \ + util/virmdev.c util/virmdev.h \ $(NULL) EXTRA_DIST += $(srcdir)/util/keymaps.csv $(srcdir)/util/virkeycode-mapgen.py diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 57acfdbb19..c51b295d30 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1983,6 +1983,28 @@ virMacMapNew; virMacMapRemove; virMacMapWriteFile; +# util/virmdev.h +virMediatedDeviceFree; +virMediatedDeviceGetIOMMUGroupDev; +virMediatedDeviceGetIOMMUGroupNum; +virMediatedDeviceGetSysfsPath; +virMediatedDeviceGetUsedBy; +virMediatedDeviceIsUsed; +virMediatedDeviceListAdd; +virMediatedDeviceListCount; +virMediatedDeviceListDel; +virMediatedDeviceListFind; +virMediatedDeviceListGet; +virMediatedDeviceListMarkDevices; +virMediatedDeviceListNew; +virMediatedDeviceListSteal; +virMediatedDeviceListStealIndex; +virMediatedDeviceModelTypeFromString; +virMediatedDeviceModelTypeToString; +virMediatedDeviceNew; +virMediatedDeviceSetUsedBy; + + # util/virnetdev.h virNetDevAddMulti; diff --git a/src/util/virmdev.c b/src/util/virmdev.c new file mode 100644 index 0000000000..d1692a982a --- /dev/null +++ b/src/util/virmdev.c @@ -0,0 +1,487 @@ +/* + * virmdev.c: helper APIs for managing host mediated devices + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library. If not, see + * <http://www.gnu.org/licenses/>. + */ + +#include <config.h> + +#include <fcntl.h> +#include <inttypes.h> +#include <limits.h> +#include <stdio.h> +#include <string.h> +#include <sys/types.h> +#include <sys/stat.h> +#include <unistd.h> +#include <stdlib.h> + +#include "virmdev.h" +#include "dirname.h" +#include "virlog.h" +#include "viralloc.h" +#include "vircommand.h" +#include "virerror.h" +#include "virfile.h" +#include "virkmod.h" +#include "virstring.h" +#include "virutil.h" +#include "viruuid.h" + +#define VIR_FROM_THIS VIR_FROM_NONE + +VIR_LOG_INIT("util.mdev"); + +struct _virMediatedDevice { + char *path; /* sysfs path */ + virMediatedDeviceModelType model; + + char *used_by_drvname; + char *used_by_domname; +}; + +struct _virMediatedDeviceList { + virObjectLockable parent; + + size_t count; + virMediatedDevicePtr *devs; +}; + +VIR_ENUM_IMPL(virMediatedDeviceModel, VIR_MDEV_MODEL_TYPE_LAST, + "vfio-pci") + +static virClassPtr virMediatedDeviceListClass; + +static void virMediatedDeviceListDispose(void *obj); + +static int virMediatedOnceInit(void) +{ + if (!(virMediatedDeviceListClass = virClassNew(virClassForObjectLockable(), + "virMediatedDeviceList", + sizeof(virMediatedDeviceList), + virMediatedDeviceListDispose))) + return -1; + + return 0; +} + +VIR_ONCE_GLOBAL_INIT(virMediated) + +static int +virMediatedDeviceGetSysfsDeviceAPI(virMediatedDevicePtr dev, + char **device_api) +{ + int ret = -1; + char *buf = NULL; + char *tmp = NULL; + char *file = NULL; + + if (virAsprintf(&file, "%s/mdev_type/device_api", dev->path) < 0) + goto cleanup; + + /* TODO - make this a generic method to access sysfs files for various + * kinds of devices + */ + if (!virFileExists(file)) { + virReportSystemError(errno, _("Failed to read '%s'"), file); + goto cleanup; + } + + if (virFileReadAll(file, 1024, &buf) < 0) + goto cleanup; + + if ((tmp = strchr(buf, '\n'))) + *tmp = '\0'; + + *device_api = buf; + buf = NULL; + + ret = 0; + cleanup: + VIR_FREE(file); + VIR_FREE(buf); + return ret; +} + + +static int +virMediatedDeviceCheckModel(virMediatedDevicePtr dev, + virMediatedDeviceModelType model) +{ + int ret = -1; + char *dev_api = NULL; + int actual_model; + + if (virMediatedDeviceGetSysfsDeviceAPI(dev, &dev_api) < 0) + return -1; + + /* safeguard in case we've got an older libvirt which doesn't know newer + * device_api models yet + */ + if ((actual_model = virMediatedDeviceModelTypeFromString(dev_api)) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Device API '%s' not supported yet"), + dev_api); + goto cleanup; + } + + if (actual_model != model) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Invalid device API '%s' for device %s: " + "device only supports '%s'"), + virMediatedDeviceModelTypeToString(model), + dev->path, dev_api); + goto cleanup; + } + + ret = 0; + cleanup: + VIR_FREE(dev_api); + return ret; +} + + +#ifdef __linux__ +# define MDEV_SYSFS_DEVICES "/sys/bus/mdev/devices/" + +virMediatedDevicePtr +virMediatedDeviceNew(const char *uuidstr, virMediatedDeviceModelType model) +{ + virMediatedDevicePtr ret = NULL; + virMediatedDevicePtr dev = NULL; + + if (VIR_ALLOC(dev) < 0) + return NULL; + + if (!(dev->path = virMediatedDeviceGetSysfsPath(uuidstr))) + goto cleanup; + + /* Check whether the user-provided model corresponds with the actually + * supported mediated device's API. + */ + if (virMediatedDeviceCheckModel(dev, model)) + goto cleanup; + + dev->model = model; + VIR_STEAL_PTR(ret, dev); + + cleanup: + virMediatedDeviceFree(dev); + return ret; +} + +#else + +virMediatedDevicePtr +virMediatedDeviceNew(virPCIDeviceAddressPtr pciaddr ATTRIBUTE_UNUSED, + const char *uuidstr ATTRIBUTE_UNUSED) +{ + virRerportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("not supported on non-linux platforms")); + return NULL; +} + +#endif /* __linux__ */ + +void +virMediatedDeviceFree(virMediatedDevicePtr dev) +{ + if (!dev) + return; + VIR_FREE(dev->path); + VIR_FREE(dev->used_by_drvname); + VIR_FREE(dev->used_by_domname); + VIR_FREE(dev); +} + + +const char * +virMediatedDeviceGetPath(virMediatedDevicePtr dev) +{ + return dev->path; +} + + +/* Returns an absolute canonicalized path to the device used to control the + * mediated device's IOMMU group (e.g. "/dev/vfio/15"). Caller is responsible + * for freeing the result. + */ +char * +virMediatedDeviceGetIOMMUGroupDev(virMediatedDevicePtr dev) +{ + char *resultpath = NULL; + char *iommu_path = NULL; + char *vfio_path = NULL; + + if (virAsprintf(&iommu_path, "%s/iommu_group", dev->path) < 0) + return NULL; + + if (!virFileExists(iommu_path)) { + virReportSystemError(errno, _("Failed to access '%s'"), iommu_path); + goto cleanup; + } + + if (virFileResolveLink(iommu_path, &resultpath) < 0) { + virReportSystemError(errno, _("Failed to resolve '%s'"), iommu_path); + goto cleanup; + } + + if (virAsprintf(&vfio_path, "/dev/vfio/%s", last_component(resultpath)) < 0) + goto cleanup; + + cleanup: + VIR_FREE(resultpath); + VIR_FREE(iommu_path); + return vfio_path; +} + + +int +virMediatedDeviceGetIOMMUGroupNum(virMediatedDevicePtr dev) +{ + char *vfio_path = NULL; + char *group_num_str = NULL; + unsigned int group_num = -1; + + if (!(vfio_path = virMediatedDeviceGetIOMMUGroupDev(dev))) + return -1; + + group_num_str = last_component(vfio_path); + ignore_value(virStrToLong_ui(group_num_str, NULL, 10, &group_num)); + + VIR_FREE(vfio_path); + return group_num; +} + + +void +virMediatedDeviceGetUsedBy(virMediatedDevicePtr dev, + const char **drvname, const char **domname) +{ + *drvname = dev->used_by_drvname; + *domname = dev->used_by_domname; +} + + +int +virMediatedDeviceSetUsedBy(virMediatedDevicePtr dev, + const char *drvname, + const char *domname) +{ + VIR_FREE(dev->used_by_drvname); + VIR_FREE(dev->used_by_domname); + if (VIR_STRDUP(dev->used_by_drvname, drvname) < 0) + return -1; + if (VIR_STRDUP(dev->used_by_domname, domname) < 0) + return -1; + + return 0; +} + + +virMediatedDeviceListPtr +virMediatedDeviceListNew(void) +{ + virMediatedDeviceListPtr list; + + if (virMediatedInitialize() < 0) + return NULL; + + if (!(list = virObjectLockableNew(virMediatedDeviceListClass))) + return NULL; + + return list; +} + + +static void +virMediatedDeviceListDispose(void *obj) +{ + virMediatedDeviceListPtr list = obj; + size_t i; + + for (i = 0; i < list->count; i++) { + virMediatedDeviceFree(list->devs[i]); + list->devs[i] = NULL; + } + + list->count = 0; + VIR_FREE(list->devs); +} + + +int +virMediatedDeviceListAdd(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev) +{ + if (virMediatedDeviceListFind(list, dev)) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Device %s is already in use"), dev->path); + return -1; + } + return VIR_APPEND_ELEMENT(list->devs, list->count, dev); +} + + +virMediatedDevicePtr +virMediatedDeviceListGet(virMediatedDeviceListPtr list, + ssize_t idx) +{ + if (idx < 0 || idx >= list->count) + return NULL; + + return list->devs[idx]; +} + + +size_t +virMediatedDeviceListCount(virMediatedDeviceListPtr list) +{ + return list->count; +} + + +virMediatedDevicePtr +virMediatedDeviceListStealIndex(virMediatedDeviceListPtr list, + ssize_t idx) +{ + virMediatedDevicePtr ret; + + if (idx < 0 || idx >= list->count) + return NULL; + + ret = list->devs[idx]; + VIR_DELETE_ELEMENT(list->devs, idx, list->count); + return ret; +} + + +virMediatedDevicePtr +virMediatedDeviceListSteal(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev) +{ + int idx = virMediatedDeviceListFindIndex(list, dev); + + return virMediatedDeviceListStealIndex(list, idx); +} + + +void +virMediatedDeviceListDel(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev) +{ + virMediatedDevicePtr ret = virMediatedDeviceListSteal(list, dev); + virMediatedDeviceFree(ret); +} + + +int +virMediatedDeviceListFindIndex(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev) +{ + size_t i; + + for (i = 0; i < list->count; i++) { + virMediatedDevicePtr other = list->devs[i]; + if (STREQ(other->path, dev->path)) + return i; + } + return -1; +} + + +virMediatedDevicePtr +virMediatedDeviceListFind(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev) +{ + int idx; + + if ((idx = virMediatedDeviceListFindIndex(list, dev)) >= 0) + return list->devs[idx]; + else + return NULL; +} + + +bool +virMediatedDeviceIsUsed(virMediatedDevicePtr dev, + virMediatedDeviceListPtr list) +{ + const char *drvname, *domname; + virMediatedDevicePtr tmp = NULL; + + if ((tmp = virMediatedDeviceListFind(list, dev))) { + virMediatedDeviceGetUsedBy(tmp, &drvname, &domname); + virReportError(VIR_ERR_OPERATION_INVALID, + _("Mediated device %s is in use by " + "driver %s, domain %s"), + tmp->path, drvname, domname); + } + + return !!tmp; +} + + +char * +virMediatedDeviceGetSysfsPath(const char *uuidstr) +{ + char *ret = NULL; + + ignore_value(virAsprintf(&ret, MDEV_SYSFS_DEVICES "%s", uuidstr)); + return ret; +} + + +int +virMediatedDeviceListMarkDevices(virMediatedDeviceListPtr dst, + virMediatedDeviceListPtr src, + const char *drvname, + const char *domname) +{ + int ret = -1; + size_t count = virMediatedDeviceListCount(src); + size_t i, j; + + virObjectLock(dst); + for (i = 0; i < count; i++) { + virMediatedDevicePtr mdev = virMediatedDeviceListGet(src, i); + + if (virMediatedDeviceIsUsed(mdev, dst) || + virMediatedDeviceSetUsedBy(mdev, drvname, domname) < 0) + goto cleanup; + + /* Copy mdev references to the driver list: + * - caller is responsible for NOT freeing devices in @list on success + * - we're responsible for performing a rollback on failure + */ + if (virMediatedDeviceListAdd(dst, mdev) < 0) + goto rollback; + + VIR_DEBUG("'%s' added to list of active mediated devices used by '%s'", + mdev->path, domname); + } + + ret = 0; + cleanup: + virObjectUnlock(dst); + return ret; + + rollback: + for (j = 0; j < i; j++) { + virMediatedDevicePtr tmp = virMediatedDeviceListGet(src, j); + virMediatedDeviceListSteal(dst, tmp); + } + goto cleanup; +} diff --git a/src/util/virmdev.h b/src/util/virmdev.h new file mode 100644 index 0000000000..2f3d6bb840 --- /dev/null +++ b/src/util/virmdev.h @@ -0,0 +1,123 @@ +/* + * virmdev.h: helper APIs for managing host mediated devices + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library. If not, see + * <http://www.gnu.org/licenses/>. + */ + +#ifndef __VIR_MDEV_H__ +# define __VIR_MDEV_H__ + +# include "internal.h" +# include "virobject.h" +# include "virutil.h" +# include "virpci.h" + +typedef enum { + VIR_MDEV_MODEL_TYPE_VFIO_PCI = 0, + + VIR_MDEV_MODEL_TYPE_LAST +} virMediatedDeviceModelType; + +VIR_ENUM_DECL(virMediatedDeviceModel) + + +typedef struct _virMediatedDevice virMediatedDevice; +typedef virMediatedDevice *virMediatedDevicePtr; +typedef struct _virMediatedDeviceAddress virMediatedDeviceAddress; +typedef virMediatedDeviceAddress *virMediatedDeviceAddressPtr; +typedef struct _virMediatedDeviceList virMediatedDeviceList; +typedef virMediatedDeviceList *virMediatedDeviceListPtr; + +typedef int (*virMediatedDeviceCallback)(virMediatedDevicePtr dev, + const char *path, void *opaque); + +virMediatedDevicePtr +virMediatedDeviceNew(const char *uuidstr, virMediatedDeviceModelType model); + +virMediatedDevicePtr +virMediatedDeviceCopy(virMediatedDevicePtr dev); + +void +virMediatedDeviceFree(virMediatedDevicePtr dev); + +const char * +virMediatedDeviceGetPath(virMediatedDevicePtr dev); + +void +virMediatedDeviceGetUsedBy(virMediatedDevicePtr dev, + const char **drvname, const char **domname); + +int +virMediatedDeviceSetUsedBy(virMediatedDevicePtr dev, + const char *drvname, + const char *domname); + +char * +virMediatedDeviceGetIOMMUGroupDev(virMediatedDevicePtr dev); + +int +virMediatedDeviceGetIOMMUGroupNum(virMediatedDevicePtr dev); + +char * +virMediatedDeviceGetSysfsPath(const char *uuidstr); + +bool +virMediatedDeviceIsUsed(virMediatedDevicePtr dev, + virMediatedDeviceListPtr list); + +bool +virMediatedDeviceIsUsed(virMediatedDevicePtr dev, + virMediatedDeviceListPtr list); + +virMediatedDeviceListPtr +virMediatedDeviceListNew(void); + +int +virMediatedDeviceListAdd(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev); + +virMediatedDevicePtr +virMediatedDeviceListGet(virMediatedDeviceListPtr list, + ssize_t idx); + +size_t +virMediatedDeviceListCount(virMediatedDeviceListPtr list); + +virMediatedDevicePtr +virMediatedDeviceListSteal(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev); + +virMediatedDevicePtr +virMediatedDeviceListStealIndex(virMediatedDeviceListPtr list, + ssize_t idx); + +void +virMediatedDeviceListDel(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev); + +virMediatedDevicePtr +virMediatedDeviceListFind(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev); + +int +virMediatedDeviceListFindIndex(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev); + +int +virMediatedDeviceListMarkDevices(virMediatedDeviceListPtr dst, + virMediatedDeviceListPtr src, + const char *drvname, + const char *domname); +#endif /* __VIR_MDEV_H__ */ -- 2.12.1

On 03/22/2017 11:27 AM, Erik Skultety wrote:
Beside creation, disposal, getter, and setter methods the module exports methods to work with lists of mediated devices.
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- po/POTFILES.in | 1 + src/Makefile.am | 1 + src/libvirt_private.syms | 22 +++ src/util/virmdev.c | 487 +++++++++++++++++++++++++++++++++++++++++++++++ src/util/virmdev.h | 123 ++++++++++++ 5 files changed, 634 insertions(+) create mode 100644 src/util/virmdev.c create mode 100644 src/util/virmdev.h
diff --git a/po/POTFILES.in b/po/POTFILES.in index 64cb88cfa5..df3d3cfe80 100644 --- a/po/POTFILES.in +++ b/po/POTFILES.in @@ -224,6 +224,7 @@ src/util/virlease.c src/util/virlockspace.c src/util/virlog.c src/util/virmacmap.c +src/util/virmdev.c src/util/virnetdev.c src/util/virnetdevbandwidth.c src/util/virnetdevbridge.c diff --git a/src/Makefile.am b/src/Makefile.am index 3b1bb1da35..01b7f4e0be 100644 --- a/src/Makefile.am +++ b/src/Makefile.am @@ -188,6 +188,7 @@ UTIL_SOURCES = \ util/virvhba.c util/virvhba.h \ util/virxdrdefs.h \ util/virxml.c util/virxml.h \ + util/virmdev.c util/virmdev.h \ $(NULL)
EXTRA_DIST += $(srcdir)/util/keymaps.csv $(srcdir)/util/virkeycode-mapgen.py diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 57acfdbb19..c51b295d30 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1983,6 +1983,28 @@ virMacMapNew; virMacMapRemove; virMacMapWriteFile;
+# util/virmdev.h +virMediatedDeviceFree; +virMediatedDeviceGetIOMMUGroupDev; +virMediatedDeviceGetIOMMUGroupNum; +virMediatedDeviceGetSysfsPath; +virMediatedDeviceGetUsedBy; +virMediatedDeviceIsUsed; +virMediatedDeviceListAdd; +virMediatedDeviceListCount; +virMediatedDeviceListDel; +virMediatedDeviceListFind; +virMediatedDeviceListGet; +virMediatedDeviceListMarkDevices; +virMediatedDeviceListNew; +virMediatedDeviceListSteal; +virMediatedDeviceListStealIndex; +virMediatedDeviceModelTypeFromString; +virMediatedDeviceModelTypeToString; +virMediatedDeviceNew; +virMediatedDeviceSetUsedBy; + +
# util/virnetdev.h virNetDevAddMulti; diff --git a/src/util/virmdev.c b/src/util/virmdev.c new file mode 100644 index 0000000000..d1692a982a --- /dev/null +++ b/src/util/virmdev.c @@ -0,0 +1,487 @@ +/* + * virmdev.c: helper APIs for managing host mediated devices + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library. If not, see + * <http://www.gnu.org/licenses/>. + */ + +#include <config.h> + +#include <fcntl.h> +#include <inttypes.h> +#include <limits.h> +#include <stdio.h> +#include <string.h> +#include <sys/types.h> +#include <sys/stat.h> +#include <unistd.h> +#include <stdlib.h>
None of the above are needed. Compilation completes just fine without any of them.
+ +#include "virmdev.h" +#include "dirname.h"
dirname.h is from gnulib. Move it out separate from all the vir*.h includes.
+#include "virlog.h" +#include "viralloc.h" +#include "vircommand.h" +#include "virerror.h" +#include "virfile.h" +#include "virkmod.h" +#include "virstring.h" +#include "virutil.h" +#include "viruuid.h"
Out of all the above, I found that nothing is used from vircommand.h, virkmod.h, or viruuid.h, so they definitely should be removed. In addition, it looks like maybe nothing from virutil.h is used either (compiles fine without it). It also compiles okay without virerror.h, but that's just because it's incidentally included by some other include, so it can/should stay.
+ +#define VIR_FROM_THIS VIR_FROM_NONE + +VIR_LOG_INIT("util.mdev"); + +struct _virMediatedDevice { + char *path; /* sysfs path */ + virMediatedDeviceModelType model; + + char *used_by_drvname; + char *used_by_domname; +}; + +struct _virMediatedDeviceList { + virObjectLockable parent; + + size_t count; + virMediatedDevicePtr *devs; +}; + +VIR_ENUM_IMPL(virMediatedDeviceModel, VIR_MDEV_MODEL_TYPE_LAST, + "vfio-pci") + +static virClassPtr virMediatedDeviceListClass; + +static void virMediatedDeviceListDispose(void *obj); + +static int virMediatedOnceInit(void) +{ + if (!(virMediatedDeviceListClass = virClassNew(virClassForObjectLockable(), + "virMediatedDeviceList", + sizeof(virMediatedDeviceList), + virMediatedDeviceListDispose))) + return -1; + + return 0; +} + +VIR_ONCE_GLOBAL_INIT(virMediated)
(I really wish there was a less verbose way to do all the above. Not your problem though...)
+ +static int +virMediatedDeviceGetSysfsDeviceAPI(virMediatedDevicePtr dev, + char **device_api) +{ + int ret = -1; + char *buf = NULL; + char *tmp = NULL; + char *file = NULL;
Since buf and file both point to strings that we need to free, and tmp doesn't, I think it would be better if tmp was after the others and not initialized since it's never used before it's set) to differentiate it from the other two (you can't differentiate by making it const, since it's used to replace the \n with 0. A comment saying that anything it points to doesn't need to be freed might be nice too. (Generally I assume that const char* doesn't need its data freed, but char* does. Seeing it not freed sets off alarms that then need brain cells to suppress).
+ + if (virAsprintf(&file, "%s/mdev_type/device_api", dev->path) < 0) + goto cleanup; + + /* TODO - make this a generic method to access sysfs files for various + * kinds of devices + */ + if (!virFileExists(file)) { + virReportSystemError(errno, _("Failed to read '%s'"), file);
We're consistently inconsistent about it throughout the code (and I am personally inconsisent about it even within the same path) but I *think* there is a preference for starting error messages with lower case. Not worth an edit just for that though.
+ goto cleanup; + } + + if (virFileReadAll(file, 1024, &buf) < 0) + goto cleanup;
If it was 1983, or even 1990, I would complain that you're going to use up 1024 bytes for a string that will surely never be longer than 15-20 characters. But of course we no longer run our software on Z-80s with 64k of RAM, so I must keep my mouth shut :-) (I haven't looked in many years, but I would be surprised if a modern malloc gave out memory in chunks less than 1024 bytes anyway...)
+ + if ((tmp = strchr(buf, '\n'))) + *tmp = '\0'; + + *device_api = buf; + buf = NULL; + + ret = 0; + cleanup: + VIR_FREE(file); + VIR_FREE(buf); + return ret; +} + + +static int +virMediatedDeviceCheckModel(virMediatedDevicePtr dev, + virMediatedDeviceModelType model) +{ + int ret = -1; + char *dev_api = NULL; + int actual_model; + + if (virMediatedDeviceGetSysfsDeviceAPI(dev, &dev_api) < 0) + return -1; + + /* safeguard in case we've got an older libvirt which doesn't know newer + * device_api models yet + */ + if ((actual_model = virMediatedDeviceModelTypeFromString(dev_api)) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Device API '%s' not supported yet"), + dev_api); + goto cleanup; + } + + if (actual_model != model) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Invalid device API '%s' for device %s: " + "device only supports '%s'"), + virMediatedDeviceModelTypeToString(model), + dev->path, dev_api); + goto cleanup; + } + + ret = 0; + cleanup: + VIR_FREE(dev_api); + return ret; +} + + +#ifdef __linux__
Why did you only start the #ifdef part here - surely most of the stuff up above would never be used on a non-linux system either? virMediatedDeviceGetSysfsDeviceAPI for example references files in linux sysfs.
+# define MDEV_SYSFS_DEVICES "/sys/bus/mdev/devices/" + +virMediatedDevicePtr +virMediatedDeviceNew(const char *uuidstr, virMediatedDeviceModelType model) +{ + virMediatedDevicePtr ret = NULL; + virMediatedDevicePtr dev = NULL; + + if (VIR_ALLOC(dev) < 0) + return NULL; + + if (!(dev->path = virMediatedDeviceGetSysfsPath(uuidstr))) + goto cleanup; + + /* Check whether the user-provided model corresponds with the actually + * supported mediated device's API. + */ + if (virMediatedDeviceCheckModel(dev, model)) + goto cleanup; + + dev->model = model; + VIR_STEAL_PTR(ret, dev);
Heh. Never noticed that one before :-)
+ + cleanup: + virMediatedDeviceFree(dev); + return ret; +} + +#else + +virMediatedDevicePtr +virMediatedDeviceNew(virPCIDeviceAddressPtr pciaddr ATTRIBUTE_UNUSED, + const char *uuidstr ATTRIBUTE_UNUSED) +{ + virRerportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("not supported on non-linux platforms"));
Maybe say "mediate devices ...." in case someone sees/reports the log message from the output of virsh (where the function name isn't a part of the log message).
+ return NULL; +} + +#endif /* __linux__ */ + +void +virMediatedDeviceFree(virMediatedDevicePtr dev) +{ + if (!dev) + return; + VIR_FREE(dev->path); + VIR_FREE(dev->used_by_drvname); + VIR_FREE(dev->used_by_domname); + VIR_FREE(dev); +} + + +const char * +virMediatedDeviceGetPath(virMediatedDevicePtr dev) +{ + return dev->path; +} + + +/* Returns an absolute canonicalized path to the device used to control the + * mediated device's IOMMU group (e.g. "/dev/vfio/15"). Caller is responsible + * for freeing the result. + */ +char * +virMediatedDeviceGetIOMMUGroupDev(virMediatedDevicePtr dev) +{ + char *resultpath = NULL; + char *iommu_path = NULL; + char *vfio_path = NULL; + + if (virAsprintf(&iommu_path, "%s/iommu_group", dev->path) < 0) + return NULL; + + if (!virFileExists(iommu_path)) { + virReportSystemError(errno, _("Failed to access '%s'"), iommu_path); + goto cleanup; + } + + if (virFileResolveLink(iommu_path, &resultpath) < 0) { + virReportSystemError(errno, _("Failed to resolve '%s'"), iommu_path); + goto cleanup; + } + + if (virAsprintf(&vfio_path, "/dev/vfio/%s", last_component(resultpath)) < 0) + goto cleanup; + + cleanup: + VIR_FREE(resultpath); + VIR_FREE(iommu_path); + return vfio_path; +} + + +int +virMediatedDeviceGetIOMMUGroupNum(virMediatedDevicePtr dev) +{ + char *vfio_path = NULL; + char *group_num_str = NULL; + unsigned int group_num = -1; + + if (!(vfio_path = virMediatedDeviceGetIOMMUGroupDev(dev))) + return -1; + + group_num_str = last_component(vfio_path); + ignore_value(virStrToLong_ui(group_num_str, NULL, 10, &group_num));
I'm assuming the caller logs a message on failure? If so, then the ignore_value is okay. (end of comments) To sum it up: ACK if you: * remove the unnecessary #includes * make it obvious when defined that tmp doesn't "own" any memory that should be freed (just to make life easier for maintainers) * modify the "unsupported" log message to start with "mediated devices" * maybe put more/most/all(?) of the functions inside #ifdef __linux__
+ + VIR_FREE(vfio_path); + return group_num; +} + + +void +virMediatedDeviceGetUsedBy(virMediatedDevicePtr dev, + const char **drvname, const char **domname) +{ + *drvname = dev->used_by_drvname; + *domname = dev->used_by_domname; +} + + +int +virMediatedDeviceSetUsedBy(virMediatedDevicePtr dev, + const char *drvname, + const char *domname) +{ + VIR_FREE(dev->used_by_drvname); + VIR_FREE(dev->used_by_domname); + if (VIR_STRDUP(dev->used_by_drvname, drvname) < 0) + return -1; + if (VIR_STRDUP(dev->used_by_domname, domname) < 0) + return -1; + + return 0; +} + + +virMediatedDeviceListPtr +virMediatedDeviceListNew(void) +{ + virMediatedDeviceListPtr list; + + if (virMediatedInitialize() < 0) + return NULL; + + if (!(list = virObjectLockableNew(virMediatedDeviceListClass))) + return NULL; + + return list; +} + + +static void +virMediatedDeviceListDispose(void *obj) +{ + virMediatedDeviceListPtr list = obj; + size_t i; + + for (i = 0; i < list->count; i++) { + virMediatedDeviceFree(list->devs[i]); + list->devs[i] = NULL; + } + + list->count = 0; + VIR_FREE(list->devs); +} + + +int +virMediatedDeviceListAdd(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev) +{ + if (virMediatedDeviceListFind(list, dev)) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Device %s is already in use"), dev->path); + return -1; + } + return VIR_APPEND_ELEMENT(list->devs, list->count, dev); +} + + +virMediatedDevicePtr +virMediatedDeviceListGet(virMediatedDeviceListPtr list, + ssize_t idx) +{ + if (idx < 0 || idx >= list->count) + return NULL; + + return list->devs[idx]; +} + + +size_t +virMediatedDeviceListCount(virMediatedDeviceListPtr list) +{ + return list->count; +} + + +virMediatedDevicePtr +virMediatedDeviceListStealIndex(virMediatedDeviceListPtr list, + ssize_t idx) +{ + virMediatedDevicePtr ret; + + if (idx < 0 || idx >= list->count) + return NULL; + + ret = list->devs[idx]; + VIR_DELETE_ELEMENT(list->devs, idx, list->count); + return ret; +} + + +virMediatedDevicePtr +virMediatedDeviceListSteal(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev) +{ + int idx = virMediatedDeviceListFindIndex(list, dev); + + return virMediatedDeviceListStealIndex(list, idx); +} + + +void +virMediatedDeviceListDel(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev) +{ + virMediatedDevicePtr ret = virMediatedDeviceListSteal(list, dev); + virMediatedDeviceFree(ret); +} + + +int +virMediatedDeviceListFindIndex(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev) +{ + size_t i; + + for (i = 0; i < list->count; i++) { + virMediatedDevicePtr other = list->devs[i]; + if (STREQ(other->path, dev->path)) + return i; + } + return -1; +} + + +virMediatedDevicePtr +virMediatedDeviceListFind(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev) +{ + int idx; + + if ((idx = virMediatedDeviceListFindIndex(list, dev)) >= 0) + return list->devs[idx]; + else + return NULL; +} + + +bool +virMediatedDeviceIsUsed(virMediatedDevicePtr dev, + virMediatedDeviceListPtr list) +{ + const char *drvname, *domname; + virMediatedDevicePtr tmp = NULL; + + if ((tmp = virMediatedDeviceListFind(list, dev))) { + virMediatedDeviceGetUsedBy(tmp, &drvname, &domname); + virReportError(VIR_ERR_OPERATION_INVALID, + _("Mediated device %s is in use by " + "driver %s, domain %s"), + tmp->path, drvname, domname); + } + + return !!tmp; +} + + +char * +virMediatedDeviceGetSysfsPath(const char *uuidstr) +{ + char *ret = NULL; + + ignore_value(virAsprintf(&ret, MDEV_SYSFS_DEVICES "%s", uuidstr)); + return ret; +} + + +int +virMediatedDeviceListMarkDevices(virMediatedDeviceListPtr dst, + virMediatedDeviceListPtr src, + const char *drvname, + const char *domname) +{ + int ret = -1; + size_t count = virMediatedDeviceListCount(src); + size_t i, j; + + virObjectLock(dst); + for (i = 0; i < count; i++) { + virMediatedDevicePtr mdev = virMediatedDeviceListGet(src, i); + + if (virMediatedDeviceIsUsed(mdev, dst) || + virMediatedDeviceSetUsedBy(mdev, drvname, domname) < 0) + goto cleanup; + + /* Copy mdev references to the driver list: + * - caller is responsible for NOT freeing devices in @list on success + * - we're responsible for performing a rollback on failure + */ + if (virMediatedDeviceListAdd(dst, mdev) < 0) + goto rollback; + + VIR_DEBUG("'%s' added to list of active mediated devices used by '%s'", + mdev->path, domname); + } + + ret = 0; + cleanup: + virObjectUnlock(dst); + return ret; + + rollback: + for (j = 0; j < i; j++) { + virMediatedDevicePtr tmp = virMediatedDeviceListGet(src, j); + virMediatedDeviceListSteal(dst, tmp); + } + goto cleanup; +} diff --git a/src/util/virmdev.h b/src/util/virmdev.h new file mode 100644 index 0000000000..2f3d6bb840 --- /dev/null +++ b/src/util/virmdev.h @@ -0,0 +1,123 @@ +/* + * virmdev.h: helper APIs for managing host mediated devices + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library. If not, see + * <http://www.gnu.org/licenses/>. + */ + +#ifndef __VIR_MDEV_H__ +# define __VIR_MDEV_H__ + +# include "internal.h" +# include "virobject.h" +# include "virutil.h" +# include "virpci.h" + +typedef enum { + VIR_MDEV_MODEL_TYPE_VFIO_PCI = 0, + + VIR_MDEV_MODEL_TYPE_LAST +} virMediatedDeviceModelType; + +VIR_ENUM_DECL(virMediatedDeviceModel) + + +typedef struct _virMediatedDevice virMediatedDevice; +typedef virMediatedDevice *virMediatedDevicePtr; +typedef struct _virMediatedDeviceAddress virMediatedDeviceAddress; +typedef virMediatedDeviceAddress *virMediatedDeviceAddressPtr; +typedef struct _virMediatedDeviceList virMediatedDeviceList; +typedef virMediatedDeviceList *virMediatedDeviceListPtr; + +typedef int (*virMediatedDeviceCallback)(virMediatedDevicePtr dev, + const char *path, void *opaque); + +virMediatedDevicePtr +virMediatedDeviceNew(const char *uuidstr, virMediatedDeviceModelType model); + +virMediatedDevicePtr +virMediatedDeviceCopy(virMediatedDevicePtr dev); + +void +virMediatedDeviceFree(virMediatedDevicePtr dev); + +const char * +virMediatedDeviceGetPath(virMediatedDevicePtr dev); + +void +virMediatedDeviceGetUsedBy(virMediatedDevicePtr dev, + const char **drvname, const char **domname); + +int +virMediatedDeviceSetUsedBy(virMediatedDevicePtr dev, + const char *drvname, + const char *domname); + +char * +virMediatedDeviceGetIOMMUGroupDev(virMediatedDevicePtr dev); + +int +virMediatedDeviceGetIOMMUGroupNum(virMediatedDevicePtr dev); + +char * +virMediatedDeviceGetSysfsPath(const char *uuidstr); + +bool +virMediatedDeviceIsUsed(virMediatedDevicePtr dev, + virMediatedDeviceListPtr list); + +bool +virMediatedDeviceIsUsed(virMediatedDevicePtr dev, + virMediatedDeviceListPtr list); + +virMediatedDeviceListPtr +virMediatedDeviceListNew(void); + +int +virMediatedDeviceListAdd(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev); + +virMediatedDevicePtr +virMediatedDeviceListGet(virMediatedDeviceListPtr list, + ssize_t idx); + +size_t +virMediatedDeviceListCount(virMediatedDeviceListPtr list); + +virMediatedDevicePtr +virMediatedDeviceListSteal(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev); + +virMediatedDevicePtr +virMediatedDeviceListStealIndex(virMediatedDeviceListPtr list, + ssize_t idx); + +void +virMediatedDeviceListDel(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev); + +virMediatedDevicePtr +virMediatedDeviceListFind(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev); + +int +virMediatedDeviceListFindIndex(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev); + +int +virMediatedDeviceListMarkDevices(virMediatedDeviceListPtr dst, + virMediatedDeviceListPtr src, + const char *drvname, + const char *domname); +#endif /* __VIR_MDEV_H__ */

+ +static int +virMediatedDeviceGetSysfsDeviceAPI(virMediatedDevicePtr dev, + char **device_api) +{ + int ret = -1; + char *buf = NULL; + char *tmp = NULL; + char *file = NULL;
Since buf and file both point to strings that we need to free, and tmp doesn't, I think it would be better if tmp was after the others and not initialized since it's never used before it's set) to differentiate it from the other two (you can't differentiate by making it const, since it's used to replace the \n with 0. A comment saying that anything it points to doesn't need to be freed might be nice too. (Generally I assume that const char* doesn't need its data freed, but char* does. Seeing it not freed sets off alarms that then need brain cells to suppress).
Well, the name 'tmp' sort of indicates that it is going to be used for something temporary/volatile which could be anything, so the pointer could be used further in the code for something that indeed needs freeing, combine it with forgetting to remove the comment in that case and you created an unnecessary confusion. Although I get your point and internally agree that it could be useful, I also think that generally the best approach is just to scroll through the code and look at how a specific variable is used (however sad it sounds...). [..]
+ + +static int +virMediatedDeviceCheckModel(virMediatedDevicePtr dev, + virMediatedDeviceModelType model) +{ + int ret = -1; + char *dev_api = NULL; + int actual_model; + + if (virMediatedDeviceGetSysfsDeviceAPI(dev, &dev_api) < 0) + return -1; + + /* safeguard in case we've got an older libvirt which doesn't know newer + * device_api models yet + */ + if ((actual_model = virMediatedDeviceModelTypeFromString(dev_api)) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Device API '%s' not supported yet"), + dev_api); + goto cleanup; + } + + if (actual_model != model) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Invalid device API '%s' for device %s: " + "device only supports '%s'"), + virMediatedDeviceModelTypeToString(model), + dev->path, dev_api); + goto cleanup; + } + + ret = 0; + cleanup: + VIR_FREE(dev_api); + return ret; +} + + +#ifdef __linux__
Why did you only start the #ifdef part here - surely most of the stuff up above would never be used on a non-linux system either? virMediatedDeviceGetSysfsDeviceAPI for example references files in linux sysfs.
As per [1] I dropped the #ifdef part covering all the functions, but I think it would be a good practice to at least cover the *CheckModel and *GetSysfsDeviceAPI functions since you're right about it, although it's not necessary at the moment because they're only being called from virMediatedDeviceNew which is correctly guarded by the #ifdef. [1] https://www.redhat.com/archives/libvir-list/2017-February/msg00171.html
+virMediatedDevicePtr +virMediatedDeviceNew(virPCIDeviceAddressPtr pciaddr ATTRIBUTE_UNUSED, + const char *uuidstr ATTRIBUTE_UNUSED) +{ + virRerportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("not supported on non-linux platforms"));
Maybe say "mediate devices ...." in case someone sees/reports the log message from the output of virsh (where the function name isn't a part of the log message).
+ return NULL; +int +virMediatedDeviceGetIOMMUGroupNum(virMediatedDevicePtr dev) +{ + char *vfio_path = NULL; + char *group_num_str = NULL; + unsigned int group_num = -1; + + if (!(vfio_path = virMediatedDeviceGetIOMMUGroupDev(dev))) + return -1; + + group_num_str = last_component(vfio_path); + ignore_value(virStrToLong_ui(group_num_str, NULL, 10, &group_num));
I'm assuming the caller logs a message on failure? If so, then the ignore_value is okay.
Well, they have to. If you think about it, we're converting string which we obtained by ourselves from sysfs. If virMediatedDeviceGetIOMMUGroupDev failed, it'd log error by itself, so the only point of failure in this function is "last_component" from gnulib. So supposing that our code obtaining file paths is correct, there is really no way how virStrToLong* could fail reasonably, so I decided to just ignore the return value. [...]
(end of comments)
To sum it up: ACK if you:
* remove the unnecessary #includes
* make it obvious when defined that tmp doesn't "own" any memory that should be freed (just to make life easier for maintainers)
- I'd like to, but see my comment above, in any case, it can be added later
* modify the "unsupported" log message to start with "mediated devices"
* maybe put more/most/all(?) of the functions inside #ifdef __linux__
Adjusted, thanks. Erik

A mediated device will be identified by a UUID (with 'model' now being a mandatory <hostdev> attribute to represent the mediated device API) of the user pre-created mediated device. The data necessary to identify a mediated device can be easily extended in the future, e.g. when auto-creation of mediated devices should be enabled. We also need to make sure that if user explicitly provides a guest address for a mdev device, the address type will be matching the device API supported on that specific mediated device and error out with an incorrect XML message. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- docs/schemas/domaincommon.rng | 22 +++++++++ src/conf/domain_conf.c | 99 ++++++++++++++++++++++++++++++++++++- src/conf/domain_conf.h | 9 ++++ src/qemu/qemu_domain.c | 1 + src/qemu/qemu_hotplug.c | 2 + src/security/security_apparmor.c | 3 ++ src/security/security_dac.c | 2 + src/security/security_selinux.c | 2 + tests/domaincapsschemadata/full.xml | 1 + 9 files changed, 140 insertions(+), 1 deletion(-) diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index fbedc9b1f9..edc225fe50 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -4034,6 +4034,7 @@ <ref name="hostdevsubsysusb"/> <ref name="hostdevsubsysscsi"/> <ref name="hostdevsubsyshost"/> + <ref name="hostdevsubsysmdev"/> </choice> </define> @@ -4184,6 +4185,20 @@ </element> </define> + <define name="hostdevsubsysmdev"> + <attribute name="type"> + <value>mdev</value> + </attribute> + <attribute name="model"> + <choice> + <value>vfio-pci</value> + </choice> + </attribute> + <element name="source"> + <ref name="mdevaddress"/> + </element> + </define> + <define name="hostdevcapsstorage"> <attribute name="type"> <value>storage</value> @@ -4342,6 +4357,13 @@ </attribute> </optional> </define> + <define name="mdevaddress"> + <element name="address"> + <attribute name="uuid"> + <ref name="UUID"/> + </attribute> + </element> + </define> <define name="devices"> <element name="devices"> <interleave> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index a5ab42297d..63ac65e8ab 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -56,6 +56,7 @@ #include "virstring.h" #include "virnetdev.h" #include "virhostdev.h" +#include "virmdev.h" #define VIR_FROM_THIS VIR_FROM_DOMAIN @@ -652,7 +653,8 @@ VIR_ENUM_IMPL(virDomainHostdevSubsys, VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST, "usb", "pci", "scsi", - "scsi_host") + "scsi_host", + "mdev") VIR_ENUM_IMPL(virDomainHostdevSubsysPCIBackend, VIR_DOMAIN_HOSTDEV_PCI_BACKEND_TYPE_LAST, @@ -2356,6 +2358,7 @@ void virDomainHostdevDefClear(virDomainHostdevDefPtr def) break; case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: break; } @@ -4251,6 +4254,23 @@ virDomainHostdevDefPostParse(virDomainHostdevDefPtr dev, } } break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: { + int model = dev->source.subsys.u.mdev.model; + + if (dev->info->type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE) + return 0; + + if (model == VIR_MDEV_MODEL_TYPE_VFIO_PCI && + dev->info->type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_PCI) { + virReportError(VIR_ERR_XML_ERROR, + _("Unsupported address type '%s' with mediated " + "device model '%s'"), + virDomainDeviceAddressTypeToString(dev->info->type), + virMediatedDeviceModelTypeToString(model)); + return -1; + } + } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST: @@ -6397,6 +6417,41 @@ virDomainHostdevSubsysSCSIVHostDefParseXML(xmlNodePtr sourcenode, return ret; } +static int +virDomainHostdevSubsysMediatedDevDefParseXML(virDomainHostdevDefPtr def, + xmlXPathContextPtr ctxt) +{ + int ret = -1; + unsigned char uuid[VIR_UUID_BUFLEN] = {0}; + char *uuidxml = NULL; + xmlNodePtr node = NULL; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &def->source.subsys.u.mdev; + + if (!(node = virXPathNode("./source/address", ctxt))) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("Missing <address> element")); + goto cleanup; + } + + if (!(uuidxml = virXMLPropString(node, "uuid"))) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("Missing 'uuid' attribute for element <address>")); + goto cleanup; + } + + if (virUUIDParse(uuidxml, uuid) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + "%s", + _("Cannot parse uuid attribute of element <address>")); + goto cleanup; + } + + virUUIDFormat(uuid, mdevsrc->uuidstr); + ret = 0; + cleanup: + VIR_FREE(uuidxml); + return ret; +} static int virDomainHostdevDefParseXMLSubsys(xmlNodePtr node, @@ -6410,10 +6465,12 @@ virDomainHostdevDefParseXMLSubsys(xmlNodePtr node, char *sgio = NULL; char *rawio = NULL; char *backendStr = NULL; + char *model = NULL; int backend; int ret = -1; virDomainHostdevSubsysPCIPtr pcisrc = &def->source.subsys.u.pci; virDomainHostdevSubsysSCSIPtr scsisrc = &def->source.subsys.u.scsi; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &def->source.subsys.u.mdev; /* @managed can be read from the xml document - it is always an * attribute of the toplevel element, no matter what type of @@ -6427,6 +6484,7 @@ virDomainHostdevDefParseXMLSubsys(xmlNodePtr node, sgio = virXMLPropString(node, "sgio"); rawio = virXMLPropString(node, "rawio"); + model = virXMLPropString(node, "model"); /* @type is passed in from the caller rather than read from the * xml document, because it is specified in different places for @@ -6493,6 +6551,28 @@ virDomainHostdevDefParseXMLSubsys(xmlNodePtr node, } } + if (def->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV) { + if (model) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("model is only supported with mediated devices")); + goto error; + } + } else { + if (!model) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Missing 'model' attribute in mediated device's " + "<hostdev> element")); + goto error; + } + + if ((mdevsrc->model = virMediatedDeviceModelTypeFromString(model)) < 0) { + virReportError(VIR_ERR_XML_ERROR, + _("unknown hostdev model '%s'"), + model); + goto error; + } + } + switch (def->source.subsys.type) { case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: if (virDomainHostdevSubsysPCIDefParseXML(sourcenode, def, flags) < 0) @@ -6525,6 +6605,10 @@ virDomainHostdevDefParseXMLSubsys(xmlNodePtr node, if (virDomainHostdevSubsysSCSIVHostDefParseXML(sourcenode, def) < 0) goto error; break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + if (virDomainHostdevSubsysMediatedDevDefParseXML(def, ctxt) < 0) + goto error; + break; default: virReportError(VIR_ERR_CONFIG_UNSUPPORTED, @@ -6539,6 +6623,7 @@ virDomainHostdevDefParseXMLSubsys(xmlNodePtr node, VIR_FREE(sgio); VIR_FREE(rawio); VIR_FREE(backendStr); + VIR_FREE(model); return ret; } @@ -13384,6 +13469,7 @@ virDomainHostdevDefParseXML(virDomainXMLOptionPtr xmlopt, } break; case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: break; } @@ -14318,6 +14404,7 @@ virDomainHostdevMatchSubsys(virDomainHostdevDefPtr a, return 1; else return 0; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: return 0; } @@ -21319,6 +21406,7 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf, virDomainHostdevSubsysPCIPtr pcisrc = &def->source.subsys.u.pci; virDomainHostdevSubsysSCSIPtr scsisrc = &def->source.subsys.u.scsi; virDomainHostdevSubsysSCSIVHostPtr hostsrc = &def->source.subsys.u.scsi_host; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &def->source.subsys.u.mdev; virDomainHostdevSubsysSCSIHostPtr scsihostsrc = &scsisrc->u.host; virDomainHostdevSubsysSCSIiSCSIPtr iscsisrc = &scsisrc->u.iscsi; @@ -21423,6 +21511,10 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf, break; case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST: break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + virBufferAsprintf(buf, "<address uuid='%s'/>\n", + mdevsrc->uuidstr); + break; default: virReportError(VIR_ERR_INTERNAL_ERROR, _("unexpected hostdev type %d"), @@ -23318,6 +23410,7 @@ virDomainHostdevDefFormat(virBufferPtr buf, { const char *mode = virDomainHostdevModeTypeToString(def->mode); virDomainHostdevSubsysSCSIPtr scsisrc = &def->source.subsys.u.scsi; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &def->source.subsys.u.mdev; const char *type; if (!mode) { @@ -23367,6 +23460,10 @@ virDomainHostdevDefFormat(virBufferPtr buf, virBufferAsprintf(buf, " rawio='%s'", virTristateBoolTypeToString(scsisrc->rawio)); } + + if (def->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV) + virBufferAsprintf(buf, " model='%s'", + virMediatedDeviceModelTypeToString(mdevsrc->model)); } virBufferAddLit(buf, ">\n"); virBufferAdjustIndent(buf, 2); diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 8e6d874178..47eaacef3d 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -295,6 +295,7 @@ typedef enum { VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI, VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI, VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST, + VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV, VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST } virDomainHostdevSubsysType; @@ -369,6 +370,13 @@ struct _virDomainHostdevSubsysSCSI { } u; }; +typedef struct _virDomainHostdevSubsysMediatedDev virDomainHostdevSubsysMediatedDev; +typedef virDomainHostdevSubsysMediatedDev *virDomainHostdevSubsysMediatedDevPtr; +struct _virDomainHostdevSubsysMediatedDev { + int model; /* enum virMediatedDeviceModelType */ + char uuidstr[VIR_UUID_STRING_BUFLEN]; /* mediated device's uuid string */ +}; + typedef enum { VIR_DOMAIN_HOSTDEV_SUBSYS_SCSI_HOST_PROTOCOL_TYPE_NONE, VIR_DOMAIN_HOSTDEV_SUBSYS_SCSI_HOST_PROTOCOL_TYPE_VHOST, @@ -394,6 +402,7 @@ struct _virDomainHostdevSubsys { virDomainHostdevSubsysPCI pci; virDomainHostdevSubsysSCSI scsi; virDomainHostdevSubsysSCSIVHost scsi_host; + virDomainHostdevSubsysMediatedDev mdev; } u; }; diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index c239a06767..c0f060b0a3 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -7102,6 +7102,7 @@ qemuDomainGetHostdevPath(virDomainDefPtr def, break; } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: break; } diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index ddcbc5e259..66c469e55e 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -3907,6 +3907,8 @@ qemuDomainRemoveHostDevice(virQEMUDriverPtr driver, case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST: qemuDomainRemoveSCSIVHostDevice(driver, vm, hostdev); break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + break; case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: break; } diff --git a/src/security/security_apparmor.c b/src/security/security_apparmor.c index 0d3e891a71..f5b72e1c2d 100644 --- a/src/security/security_apparmor.c +++ b/src/security/security_apparmor.c @@ -901,6 +901,9 @@ AppArmorSetSecurityHostdevLabel(virSecurityManagerPtr mgr, break; } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; diff --git a/src/security/security_dac.c b/src/security/security_dac.c index d6a2daf747..4e968f29c0 100644 --- a/src/security/security_dac.c +++ b/src/security/security_dac.c @@ -964,6 +964,7 @@ virSecurityDACSetHostdevLabel(virSecurityManagerPtr mgr, break; } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; @@ -1119,6 +1120,7 @@ virSecurityDACRestoreHostdevLabel(virSecurityManagerPtr mgr, break; } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; diff --git a/src/security/security_selinux.c b/src/security/security_selinux.c index a6bcf9e01f..7b3276dc34 100644 --- a/src/security/security_selinux.c +++ b/src/security/security_selinux.c @@ -1838,6 +1838,7 @@ virSecuritySELinuxSetHostdevSubsysLabel(virSecurityManagerPtr mgr, break; } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; @@ -2065,6 +2066,7 @@ virSecuritySELinuxRestoreHostdevSubsysLabel(virSecurityManagerPtr mgr, break; } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; diff --git a/tests/domaincapsschemadata/full.xml b/tests/domaincapsschemadata/full.xml index 6a676253c3..82a92322e5 100644 --- a/tests/domaincapsschemadata/full.xml +++ b/tests/domaincapsschemadata/full.xml @@ -89,6 +89,7 @@ <value>pci</value> <value>scsi</value> <value>scsi_host</value> + <value>mdev</value> </enum> <enum name='capsType'> <value>storage</value> -- 2.12.1

On 03/22/2017 11:27 AM, Erik Skultety wrote:
A mediated device will be identified by a UUID (with 'model' now being a mandatory <hostdev> attribute to represent the mediated device API) of the user pre-created mediated device. The data necessary to identify a mediated device can be easily extended in the future, e.g. when auto-creation of mediated devices should be enabled. We also need to
s/should be/is/ (although... it really goes without saying that the data needed can be easily extended. After all, you wrote it, right? :-P)
make sure that if user explicitly provides a guest address for a mdev device, the address type will be matching the device API supported on that specific mediated device and error out with an incorrect XML message.
Please include a sample of the xml for an mdev <hostdev> in the commit log.
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- docs/schemas/domaincommon.rng | 22 +++++++++ src/conf/domain_conf.c | 99 ++++++++++++++++++++++++++++++++++++- src/conf/domain_conf.h | 9 ++++ src/qemu/qemu_domain.c | 1 + src/qemu/qemu_hotplug.c | 2 + src/security/security_apparmor.c | 3 ++ src/security/security_dac.c | 2 + src/security/security_selinux.c | 2 + tests/domaincapsschemadata/full.xml | 1 + 9 files changed, 140 insertions(+), 1 deletion(-)
diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index fbedc9b1f9..edc225fe50 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -4034,6 +4034,7 @@ <ref name="hostdevsubsysusb"/> <ref name="hostdevsubsysscsi"/> <ref name="hostdevsubsyshost"/> + <ref name="hostdevsubsysmdev"/> </choice> </define>
@@ -4184,6 +4185,20 @@ </element> </define>
+ <define name="hostdevsubsysmdev"> + <attribute name="type"> + <value>mdev</value> + </attribute> + <attribute name="model"> + <choice> + <value>vfio-pci</value> + </choice> + </attribute> + <element name="source"> + <ref name="mdevaddress"/> + </element> + </define> + <define name="hostdevcapsstorage"> <attribute name="type"> <value>storage</value> @@ -4342,6 +4357,13 @@ </attribute> </optional> </define> + <define name="mdevaddress"> + <element name="address"> + <attribute name="uuid"> + <ref name="UUID"/> + </attribute> + </element> + </define> <define name="devices"> <element name="devices"> <interleave> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index a5ab42297d..63ac65e8ab 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -56,6 +56,7 @@ #include "virstring.h" #include "virnetdev.h" #include "virhostdev.h" +#include "virmdev.h"
#define VIR_FROM_THIS VIR_FROM_DOMAIN
@@ -652,7 +653,8 @@ VIR_ENUM_IMPL(virDomainHostdevSubsys, VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST, "usb", "pci", "scsi", - "scsi_host") + "scsi_host", + "mdev")
VIR_ENUM_IMPL(virDomainHostdevSubsysPCIBackend, VIR_DOMAIN_HOSTDEV_PCI_BACKEND_TYPE_LAST, @@ -2356,6 +2358,7 @@ void virDomainHostdevDefClear(virDomainHostdevDefPtr def) break; case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: break; } @@ -4251,6 +4254,23 @@ virDomainHostdevDefPostParse(virDomainHostdevDefPtr dev, } } break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: { + int model = dev->source.subsys.u.mdev.model; + + if (dev->info->type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE) + return 0; + + if (model == VIR_MDEV_MODEL_TYPE_VFIO_PCI && + dev->info->type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_PCI) { + virReportError(VIR_ERR_XML_ERROR, + _("Unsupported address type '%s' with mediated " + "device model '%s'"), + virDomainDeviceAddressTypeToString(dev->info->type), + virMediatedDeviceModelTypeToString(model)); + return -1; + } + } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST: @@ -6397,6 +6417,41 @@ virDomainHostdevSubsysSCSIVHostDefParseXML(xmlNodePtr sourcenode, return ret; }
+static int +virDomainHostdevSubsysMediatedDevDefParseXML(virDomainHostdevDefPtr def, + xmlXPathContextPtr ctxt) +{ + int ret = -1; + unsigned char uuid[VIR_UUID_BUFLEN] = {0}; + char *uuidxml = NULL; + xmlNodePtr node = NULL; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &def->source.subsys.u.mdev; + + if (!(node = virXPathNode("./source/address", ctxt))) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("Missing <address> element")); + goto cleanup; + } + + if (!(uuidxml = virXMLPropString(node, "uuid"))) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("Missing 'uuid' attribute for element <address>")); + goto cleanup; + } + + if (virUUIDParse(uuidxml, uuid) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + "%s", + _("Cannot parse uuid attribute of element <address>")); + goto cleanup; + } + + virUUIDFormat(uuid, mdevsrc->uuidstr); + ret = 0; + cleanup: + VIR_FREE(uuidxml); + return ret; +}
static int virDomainHostdevDefParseXMLSubsys(xmlNodePtr node, @@ -6410,10 +6465,12 @@ virDomainHostdevDefParseXMLSubsys(xmlNodePtr node, char *sgio = NULL; char *rawio = NULL; char *backendStr = NULL; + char *model = NULL; int backend; int ret = -1; virDomainHostdevSubsysPCIPtr pcisrc = &def->source.subsys.u.pci; virDomainHostdevSubsysSCSIPtr scsisrc = &def->source.subsys.u.scsi; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &def->source.subsys.u.mdev;
/* @managed can be read from the xml document - it is always an * attribute of the toplevel element, no matter what type of @@ -6427,6 +6484,7 @@ virDomainHostdevDefParseXMLSubsys(xmlNodePtr node,
sgio = virXMLPropString(node, "sgio"); rawio = virXMLPropString(node, "rawio"); + model = virXMLPropString(node, "model");
/* @type is passed in from the caller rather than read from the * xml document, because it is specified in different places for @@ -6493,6 +6551,28 @@ virDomainHostdevDefParseXMLSubsys(xmlNodePtr node, } }
+ if (def->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV) { + if (model) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("model is only supported with mediated devices"));
This error message should be more specific, e.g. "model attribute is supported in <hostdev> devices only if type='mdev'".
+ goto error; + } + } else { + if (!model) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Missing 'model' attribute in mediated device's " + "<hostdev> element"));
Do you actually required it in the original config? I had thought it should be optional, and filled in from the device-api if it's missing (hmm, although I guess the problem is that we don't have access to the device-api at the time of definition, so maybe not :-(. Still, maybe we can divine a default based on the machinetype - for x86 vfio-pci is an extremely safe bet, for example). Anyway, that could be a lengthy discussion, so not worth holding everything else up for it - we can always loosen the restrictions and supply a default later. (it *would* be nice to be able to completely specify the config without needing to use "vfio" anywhere...) (end of comments) ACK with the commit log changed, and the one error log made more specific.
+ goto error; + } + + if ((mdevsrc->model = virMediatedDeviceModelTypeFromString(model)) < 0) { + virReportError(VIR_ERR_XML_ERROR, + _("unknown hostdev model '%s'"), + model); + goto error; + } + } + switch (def->source.subsys.type) { case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: if (virDomainHostdevSubsysPCIDefParseXML(sourcenode, def, flags) < 0) @@ -6525,6 +6605,10 @@ virDomainHostdevDefParseXMLSubsys(xmlNodePtr node, if (virDomainHostdevSubsysSCSIVHostDefParseXML(sourcenode, def) < 0) goto error; break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + if (virDomainHostdevSubsysMediatedDevDefParseXML(def, ctxt) < 0) + goto error; + break;
default: virReportError(VIR_ERR_CONFIG_UNSUPPORTED, @@ -6539,6 +6623,7 @@ virDomainHostdevDefParseXMLSubsys(xmlNodePtr node, VIR_FREE(sgio); VIR_FREE(rawio); VIR_FREE(backendStr); + VIR_FREE(model); return ret; }
@@ -13384,6 +13469,7 @@ virDomainHostdevDefParseXML(virDomainXMLOptionPtr xmlopt, } break; case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: break; } @@ -14318,6 +14404,7 @@ virDomainHostdevMatchSubsys(virDomainHostdevDefPtr a, return 1; else return 0; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: return 0; } @@ -21319,6 +21406,7 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf, virDomainHostdevSubsysPCIPtr pcisrc = &def->source.subsys.u.pci; virDomainHostdevSubsysSCSIPtr scsisrc = &def->source.subsys.u.scsi; virDomainHostdevSubsysSCSIVHostPtr hostsrc = &def->source.subsys.u.scsi_host; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &def->source.subsys.u.mdev; virDomainHostdevSubsysSCSIHostPtr scsihostsrc = &scsisrc->u.host; virDomainHostdevSubsysSCSIiSCSIPtr iscsisrc = &scsisrc->u.iscsi;
@@ -21423,6 +21511,10 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf, break; case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST: break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + virBufferAsprintf(buf, "<address uuid='%s'/>\n", + mdevsrc->uuidstr); + break; default: virReportError(VIR_ERR_INTERNAL_ERROR, _("unexpected hostdev type %d"), @@ -23318,6 +23410,7 @@ virDomainHostdevDefFormat(virBufferPtr buf, { const char *mode = virDomainHostdevModeTypeToString(def->mode); virDomainHostdevSubsysSCSIPtr scsisrc = &def->source.subsys.u.scsi; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &def->source.subsys.u.mdev; const char *type;
if (!mode) { @@ -23367,6 +23460,10 @@ virDomainHostdevDefFormat(virBufferPtr buf, virBufferAsprintf(buf, " rawio='%s'", virTristateBoolTypeToString(scsisrc->rawio)); } + + if (def->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV) + virBufferAsprintf(buf, " model='%s'", + virMediatedDeviceModelTypeToString(mdevsrc->model)); } virBufferAddLit(buf, ">\n"); virBufferAdjustIndent(buf, 2); diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 8e6d874178..47eaacef3d 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -295,6 +295,7 @@ typedef enum { VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI, VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI, VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST, + VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV,
VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST } virDomainHostdevSubsysType; @@ -369,6 +370,13 @@ struct _virDomainHostdevSubsysSCSI { } u; };
+typedef struct _virDomainHostdevSubsysMediatedDev virDomainHostdevSubsysMediatedDev; +typedef virDomainHostdevSubsysMediatedDev *virDomainHostdevSubsysMediatedDevPtr; +struct _virDomainHostdevSubsysMediatedDev { + int model; /* enum virMediatedDeviceModelType */ + char uuidstr[VIR_UUID_STRING_BUFLEN]; /* mediated device's uuid string */ +}; + typedef enum { VIR_DOMAIN_HOSTDEV_SUBSYS_SCSI_HOST_PROTOCOL_TYPE_NONE, VIR_DOMAIN_HOSTDEV_SUBSYS_SCSI_HOST_PROTOCOL_TYPE_VHOST, @@ -394,6 +402,7 @@ struct _virDomainHostdevSubsys { virDomainHostdevSubsysPCI pci; virDomainHostdevSubsysSCSI scsi; virDomainHostdevSubsysSCSIVHost scsi_host; + virDomainHostdevSubsysMediatedDev mdev; } u; };
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index c239a06767..c0f060b0a3 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -7102,6 +7102,7 @@ qemuDomainGetHostdevPath(virDomainDefPtr def, break; }
+ case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: break; } diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index ddcbc5e259..66c469e55e 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -3907,6 +3907,8 @@ qemuDomainRemoveHostDevice(virQEMUDriverPtr driver, case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST: qemuDomainRemoveSCSIVHostDevice(driver, vm, hostdev); break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + break; case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: break; } diff --git a/src/security/security_apparmor.c b/src/security/security_apparmor.c index 0d3e891a71..f5b72e1c2d 100644 --- a/src/security/security_apparmor.c +++ b/src/security/security_apparmor.c @@ -901,6 +901,9 @@ AppArmorSetSecurityHostdevLabel(virSecurityManagerPtr mgr, break; }
+ case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; diff --git a/src/security/security_dac.c b/src/security/security_dac.c index d6a2daf747..4e968f29c0 100644 --- a/src/security/security_dac.c +++ b/src/security/security_dac.c @@ -964,6 +964,7 @@ virSecurityDACSetHostdevLabel(virSecurityManagerPtr mgr, break; }
+ case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; @@ -1119,6 +1120,7 @@ virSecurityDACRestoreHostdevLabel(virSecurityManagerPtr mgr, break; }
+ case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; diff --git a/src/security/security_selinux.c b/src/security/security_selinux.c index a6bcf9e01f..7b3276dc34 100644 --- a/src/security/security_selinux.c +++ b/src/security/security_selinux.c @@ -1838,6 +1838,7 @@ virSecuritySELinuxSetHostdevSubsysLabel(virSecurityManagerPtr mgr, break; }
+ case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; @@ -2065,6 +2066,7 @@ virSecuritySELinuxRestoreHostdevSubsysLabel(virSecurityManagerPtr mgr, break; }
+ case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; diff --git a/tests/domaincapsschemadata/full.xml b/tests/domaincapsschemadata/full.xml index 6a676253c3..82a92322e5 100644 --- a/tests/domaincapsschemadata/full.xml +++ b/tests/domaincapsschemadata/full.xml @@ -89,6 +89,7 @@ <value>pci</value> <value>scsi</value> <value>scsi_host</value> + <value>mdev</value> </enum> <enum name='capsType'> <value>storage</value>

On Sun, Mar 26, 2017 at 02:16:10PM -0400, Laine Stump wrote:
On 03/22/2017 11:27 AM, Erik Skultety wrote:
A mediated device will be identified by a UUID (with 'model' now being a mandatory <hostdev> attribute to represent the mediated device API) of the user pre-created mediated device. The data necessary to identify a mediated device can be easily extended in the future, e.g. when auto-creation of mediated devices should be enabled. We also need to
s/should be/is/
(although... it really goes without saying that the data needed can be easily extended. After all, you wrote it, right? :-P)
make sure that if user explicitly provides a guest address for a mdev device, the address type will be matching the device API supported on that specific mediated device and error out with an incorrect XML message.
Please include a sample of the xml for an mdev <hostdev> in the commit log.
[...]
+ goto error; + } + } else { + if (!model) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Missing 'model' attribute in mediated device's " + "<hostdev> element"));
Do you actually required it in the original config? I had thought it should be optional, and filled in from the device-api if it's missing (hmm, although I guess the problem is that we don't have access to the device-api at the time of definition, so maybe not :-(. Still, maybe we can divine a default based on the machinetype - for x86 vfio-pci is an extremely safe bet, for example).
Anyway, that could be a lengthy discussion, so not worth holding everything else up for it - we can always loosen the restrictions and supply a default later. (it *would* be nice to be able to completely specify the config without needing to use "vfio" anywhere...)
Right, I'd wait until we're supposed to be able to do more than manage pre-created devices by the user, because the 'default' would need significant adjustments, otherwise you could end up throwing an error about a wrong model type, even though the user specified 'default' thinking that libvirt will probably know better, but unfortunately in the meantime someone introduced vfio-pci2 (I think we talked about this as well...) so our 'safe' assumption about vfio-pci on the x86 platform would no longer be correct and would lead to an error. Thus, what we probably will want to do is leave the model as default (verbatim) and stall the resolution of the default type to some reasonable type until we absolutely need it, which is just before calling virHostdevPrepareDevices, and then just simply look into sysfs and assign the value we find over there, rather than trying to guess. Erik
(end of comments)
ACK with the commit log changed, and the one error log made more specific.
+ goto error; + } + + if ((mdevsrc->model = virMediatedDeviceModelTypeFromString(model)) < 0) { + virReportError(VIR_ERR_XML_ERROR, + _("unknown hostdev model '%s'"), + model); + goto error; + } + } + switch (def->source.subsys.type) { case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: if (virDomainHostdevSubsysPCIDefParseXML(sourcenode, def, flags) < 0) @@ -6525,6 +6605,10 @@ virDomainHostdevDefParseXMLSubsys(xmlNodePtr node, if (virDomainHostdevSubsysSCSIVHostDefParseXML(sourcenode, def) < 0) goto error; break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + if (virDomainHostdevSubsysMediatedDevDefParseXML(def, ctxt) < 0) + goto error; + break;
default: virReportError(VIR_ERR_CONFIG_UNSUPPORTED, @@ -6539,6 +6623,7 @@ virDomainHostdevDefParseXMLSubsys(xmlNodePtr node, VIR_FREE(sgio); VIR_FREE(rawio); VIR_FREE(backendStr); + VIR_FREE(model); return ret; }
@@ -13384,6 +13469,7 @@ virDomainHostdevDefParseXML(virDomainXMLOptionPtr xmlopt, } break; case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: break; } @@ -14318,6 +14404,7 @@ virDomainHostdevMatchSubsys(virDomainHostdevDefPtr a, return 1; else return 0; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: return 0; } @@ -21319,6 +21406,7 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf, virDomainHostdevSubsysPCIPtr pcisrc = &def->source.subsys.u.pci; virDomainHostdevSubsysSCSIPtr scsisrc = &def->source.subsys.u.scsi; virDomainHostdevSubsysSCSIVHostPtr hostsrc = &def->source.subsys.u.scsi_host; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &def->source.subsys.u.mdev; virDomainHostdevSubsysSCSIHostPtr scsihostsrc = &scsisrc->u.host; virDomainHostdevSubsysSCSIiSCSIPtr iscsisrc = &scsisrc->u.iscsi;
@@ -21423,6 +21511,10 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf, break; case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST: break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + virBufferAsprintf(buf, "<address uuid='%s'/>\n", + mdevsrc->uuidstr); + break; default: virReportError(VIR_ERR_INTERNAL_ERROR, _("unexpected hostdev type %d"), @@ -23318,6 +23410,7 @@ virDomainHostdevDefFormat(virBufferPtr buf, { const char *mode = virDomainHostdevModeTypeToString(def->mode); virDomainHostdevSubsysSCSIPtr scsisrc = &def->source.subsys.u.scsi; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &def->source.subsys.u.mdev; const char *type;
if (!mode) { @@ -23367,6 +23460,10 @@ virDomainHostdevDefFormat(virBufferPtr buf, virBufferAsprintf(buf, " rawio='%s'", virTristateBoolTypeToString(scsisrc->rawio)); } + + if (def->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV) + virBufferAsprintf(buf, " model='%s'", + virMediatedDeviceModelTypeToString(mdevsrc->model)); } virBufferAddLit(buf, ">\n"); virBufferAdjustIndent(buf, 2); diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 8e6d874178..47eaacef3d 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -295,6 +295,7 @@ typedef enum { VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI, VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI, VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST, + VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV,
VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST } virDomainHostdevSubsysType; @@ -369,6 +370,13 @@ struct _virDomainHostdevSubsysSCSI { } u; };
+typedef struct _virDomainHostdevSubsysMediatedDev virDomainHostdevSubsysMediatedDev; +typedef virDomainHostdevSubsysMediatedDev *virDomainHostdevSubsysMediatedDevPtr; +struct _virDomainHostdevSubsysMediatedDev { + int model; /* enum virMediatedDeviceModelType */ + char uuidstr[VIR_UUID_STRING_BUFLEN]; /* mediated device's uuid string */ +}; + typedef enum { VIR_DOMAIN_HOSTDEV_SUBSYS_SCSI_HOST_PROTOCOL_TYPE_NONE, VIR_DOMAIN_HOSTDEV_SUBSYS_SCSI_HOST_PROTOCOL_TYPE_VHOST, @@ -394,6 +402,7 @@ struct _virDomainHostdevSubsys { virDomainHostdevSubsysPCI pci; virDomainHostdevSubsysSCSI scsi; virDomainHostdevSubsysSCSIVHost scsi_host; + virDomainHostdevSubsysMediatedDev mdev; } u; };
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index c239a06767..c0f060b0a3 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -7102,6 +7102,7 @@ qemuDomainGetHostdevPath(virDomainDefPtr def, break; }
+ case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: break; } diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index ddcbc5e259..66c469e55e 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -3907,6 +3907,8 @@ qemuDomainRemoveHostDevice(virQEMUDriverPtr driver, case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST: qemuDomainRemoveSCSIVHostDevice(driver, vm, hostdev); break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + break; case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: break; } diff --git a/src/security/security_apparmor.c b/src/security/security_apparmor.c index 0d3e891a71..f5b72e1c2d 100644 --- a/src/security/security_apparmor.c +++ b/src/security/security_apparmor.c @@ -901,6 +901,9 @@ AppArmorSetSecurityHostdevLabel(virSecurityManagerPtr mgr, break; }
+ case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; diff --git a/src/security/security_dac.c b/src/security/security_dac.c index d6a2daf747..4e968f29c0 100644 --- a/src/security/security_dac.c +++ b/src/security/security_dac.c @@ -964,6 +964,7 @@ virSecurityDACSetHostdevLabel(virSecurityManagerPtr mgr, break; }
+ case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; @@ -1119,6 +1120,7 @@ virSecurityDACRestoreHostdevLabel(virSecurityManagerPtr mgr, break; }
+ case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; diff --git a/src/security/security_selinux.c b/src/security/security_selinux.c index a6bcf9e01f..7b3276dc34 100644 --- a/src/security/security_selinux.c +++ b/src/security/security_selinux.c @@ -1838,6 +1838,7 @@ virSecuritySELinuxSetHostdevSubsysLabel(virSecurityManagerPtr mgr, break; }
+ case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; @@ -2065,6 +2066,7 @@ virSecuritySELinuxRestoreHostdevSubsysLabel(virSecurityManagerPtr mgr, break; }
+ case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; diff --git a/tests/domaincapsschemadata/full.xml b/tests/domaincapsschemadata/full.xml index 6a676253c3..82a92322e5 100644 --- a/tests/domaincapsschemadata/full.xml +++ b/tests/domaincapsschemadata/full.xml @@ -89,6 +89,7 @@ <value>pci</value> <value>scsi</value> <value>scsi_host</value> + <value>mdev</value> </enum> <enum name='capsType'> <value>storage</value>

Label the VFIO IOMMU devices under /dev/vfio/ referenced by the symlinks in the sysfs (e.g. /sys/class/mdev_bus/<uuid>/iommu_group) which what qemu actually gets formatted on the command line. This patch updates all of our security drivers. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/security/security_apparmor.c | 21 +++++++++++++++++- src/security/security_dac.c | 45 ++++++++++++++++++++++++++++++++++++-- src/security/security_selinux.c | 47 ++++++++++++++++++++++++++++++++++++++-- 3 files changed, 108 insertions(+), 5 deletions(-) diff --git a/src/security/security_apparmor.c b/src/security/security_apparmor.c index f5b72e1c2d..fc55815261 100644 --- a/src/security/security_apparmor.c +++ b/src/security/security_apparmor.c @@ -51,6 +51,7 @@ #include "virlog.h" #include "virstring.h" #include "virscsi.h" +#include "virmdev.h" #define VIR_FROM_THIS VIR_FROM_SECURITY @@ -813,6 +814,7 @@ AppArmorSetSecurityHostdevLabel(virSecurityManagerPtr mgr, virDomainHostdevSubsysPCIPtr pcisrc = &dev->source.subsys.u.pci; virDomainHostdevSubsysSCSIPtr scsisrc = &dev->source.subsys.u.scsi; virDomainHostdevSubsysSCSIVHostPtr hostsrc = &dev->source.subsys.u.scsi_host; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &dev->source.subsys.u.mdev; if (!secdef || !secdef->relabel) return 0; @@ -901,8 +903,25 @@ AppArmorSetSecurityHostdevLabel(virSecurityManagerPtr mgr, break; } - case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: { + char *vfiodev = NULL; + virMediatedDevicePtr mdev = virMediatedDeviceNew(mdevsrc->uuidstr, + mdevsrc->model); + + if (!mdev) + goto done; + + if (!(vfiodev = virMediatedDeviceGetIOMMUGroupDev(mdev))) { + virMediatedDeviceFree(mdev); + goto done; + } + + ret = AppArmorSetSecurityHostdevLabelHelper(vfiodev, ptr); + + VIR_FREE(vfiodev); + virMediatedDeviceFree(mdev); break; + } case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; diff --git a/src/security/security_dac.c b/src/security/security_dac.c index 4e968f29c0..922e484942 100644 --- a/src/security/security_dac.c +++ b/src/security/security_dac.c @@ -33,6 +33,7 @@ #include "virfile.h" #include "viralloc.h" #include "virlog.h" +#include "virmdev.h" #include "virpci.h" #include "virusb.h" #include "virscsi.h" @@ -867,6 +868,7 @@ virSecurityDACSetHostdevLabel(virSecurityManagerPtr mgr, virDomainHostdevSubsysPCIPtr pcisrc = &dev->source.subsys.u.pci; virDomainHostdevSubsysSCSIPtr scsisrc = &dev->source.subsys.u.scsi; virDomainHostdevSubsysSCSIVHostPtr hostsrc = &dev->source.subsys.u.scsi_host; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &dev->source.subsys.u.mdev; int ret = -1; if (!priv->dynamicOwnership) @@ -964,7 +966,26 @@ virSecurityDACSetHostdevLabel(virSecurityManagerPtr mgr, break; } - case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: { + char *vfiodev = NULL; + virMediatedDevicePtr mdev = virMediatedDeviceNew(mdevsrc->uuidstr, + mdevsrc->model); + + if (!mdev) + goto done; + + if (!(vfiodev = virMediatedDeviceGetIOMMUGroupDev(mdev))) { + virMediatedDeviceFree(mdev); + goto done; + } + + ret = virSecurityDACSetHostdevLabelHelper(vfiodev, &cbdata); + + VIR_FREE(vfiodev); + virMediatedDeviceFree(mdev); + break; + } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; @@ -1032,6 +1053,7 @@ virSecurityDACRestoreHostdevLabel(virSecurityManagerPtr mgr, virDomainHostdevSubsysPCIPtr pcisrc = &dev->source.subsys.u.pci; virDomainHostdevSubsysSCSIPtr scsisrc = &dev->source.subsys.u.scsi; virDomainHostdevSubsysSCSIVHostPtr hostsrc = &dev->source.subsys.u.scsi_host; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &dev->source.subsys.u.mdev; int ret = -1; secdef = virDomainDefGetSecurityLabelDef(def, SECURITY_DAC_NAME); @@ -1120,7 +1142,26 @@ virSecurityDACRestoreHostdevLabel(virSecurityManagerPtr mgr, break; } - case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: { + char *vfiodev = NULL; + virMediatedDevicePtr mdev = virMediatedDeviceNew(mdevsrc->uuidstr, + mdevsrc->model); + + if (!mdev) + goto done; + + if (!(vfiodev = virMediatedDeviceGetIOMMUGroupDev(mdev))) { + virMediatedDeviceFree(mdev); + goto done; + } + + ret = virSecurityDACRestoreFileLabel(virSecurityManagerGetPrivateData(mgr), + vfiodev); + VIR_FREE(vfiodev); + virMediatedDeviceFree(mdev); + break; + } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; diff --git a/src/security/security_selinux.c b/src/security/security_selinux.c index 7b3276dc34..df7c96833e 100644 --- a/src/security/security_selinux.c +++ b/src/security/security_selinux.c @@ -36,6 +36,7 @@ #include "virerror.h" #include "viralloc.h" #include "virlog.h" +#include "virmdev.h" #include "virpci.h" #include "virusb.h" #include "virscsi.h" @@ -1741,6 +1742,7 @@ virSecuritySELinuxSetHostLabel(virSCSIVHostDevicePtr dev ATTRIBUTE_UNUSED, return virSecuritySELinuxSetHostdevLabelHelper(file, opaque); } + static int virSecuritySELinuxSetHostdevSubsysLabel(virSecurityManagerPtr mgr, virDomainDefPtr def, @@ -1752,6 +1754,7 @@ virSecuritySELinuxSetHostdevSubsysLabel(virSecurityManagerPtr mgr, virDomainHostdevSubsysPCIPtr pcisrc = &dev->source.subsys.u.pci; virDomainHostdevSubsysSCSIPtr scsisrc = &dev->source.subsys.u.scsi; virDomainHostdevSubsysSCSIVHostPtr hostsrc = &dev->source.subsys.u.scsi_host; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &dev->source.subsys.u.mdev; virSecuritySELinuxCallbackData data = {.mgr = mgr, .def = def}; int ret = -1; @@ -1838,7 +1841,26 @@ virSecuritySELinuxSetHostdevSubsysLabel(virSecurityManagerPtr mgr, break; } - case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: { + char *vfiodev = NULL; + virMediatedDevicePtr mdev = virMediatedDeviceNew(mdevsrc->uuidstr, + mdevsrc->model); + + if (!mdev) + goto done; + + if (!(vfiodev = virMediatedDeviceGetIOMMUGroupDev(mdev))) { + virMediatedDeviceFree(mdev); + goto done; + } + + ret = virSecuritySELinuxSetHostdevLabelHelper(vfiodev, &data); + + VIR_FREE(vfiodev); + virMediatedDeviceFree(mdev); + break; + } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; @@ -1973,6 +1995,7 @@ virSecuritySELinuxRestoreHostLabel(virSCSIVHostDevicePtr dev ATTRIBUTE_UNUSED, return virSecuritySELinuxRestoreFileLabel(mgr, file); } + static int virSecuritySELinuxRestoreHostdevSubsysLabel(virSecurityManagerPtr mgr, virDomainHostdevDefPtr dev, @@ -1983,6 +2006,7 @@ virSecuritySELinuxRestoreHostdevSubsysLabel(virSecurityManagerPtr mgr, virDomainHostdevSubsysPCIPtr pcisrc = &dev->source.subsys.u.pci; virDomainHostdevSubsysSCSIPtr scsisrc = &dev->source.subsys.u.scsi; virDomainHostdevSubsysSCSIVHostPtr hostsrc = &dev->source.subsys.u.scsi_host; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &dev->source.subsys.u.mdev; int ret = -1; /* Like virSecuritySELinuxRestoreImageLabelInt() for a networked @@ -2066,7 +2090,26 @@ virSecuritySELinuxRestoreHostdevSubsysLabel(virSecurityManagerPtr mgr, break; } - case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: { + char *vfiodev = NULL; + virMediatedDevicePtr mdev = virMediatedDeviceNew(mdevsrc->uuidstr, + mdevsrc->model); + + if (!mdev) + goto done; + + if (!(vfiodev = virMediatedDeviceGetIOMMUGroupDev(mdev))) { + virMediatedDeviceFree(mdev); + goto done; + } + + ret = virSecuritySELinuxRestoreFileLabel(mgr, vfiodev); + + VIR_FREE(vfiodev); + virMediatedDeviceFree(mdev); + break; + } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; -- 2.12.1

On 03/22/2017 11:27 AM, Erik Skultety wrote:
Label the VFIO IOMMU devices under /dev/vfio/ referenced by the symlinks in the sysfs (e.g. /sys/class/mdev_bus/<uuid>/iommu_group) which what qemu actually gets formatted on the command line.
The sentence above is confused (i.e. I don't understand it), but I won't know how to unconfuse it until I've gone through the patch.
This patch updates all of our security drivers.
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/security/security_apparmor.c | 21 +++++++++++++++++- src/security/security_dac.c | 45 ++++++++++++++++++++++++++++++++++++-- src/security/security_selinux.c | 47 ++++++++++++++++++++++++++++++++++++++-- 3 files changed, 108 insertions(+), 5 deletions(-)
diff --git a/src/security/security_apparmor.c b/src/security/security_apparmor.c index f5b72e1c2d..fc55815261 100644 --- a/src/security/security_apparmor.c +++ b/src/security/security_apparmor.c @@ -51,6 +51,7 @@ #include "virlog.h" #include "virstring.h" #include "virscsi.h" +#include "virmdev.h"
#define VIR_FROM_THIS VIR_FROM_SECURITY
@@ -813,6 +814,7 @@ AppArmorSetSecurityHostdevLabel(virSecurityManagerPtr mgr, virDomainHostdevSubsysPCIPtr pcisrc = &dev->source.subsys.u.pci; virDomainHostdevSubsysSCSIPtr scsisrc = &dev->source.subsys.u.scsi; virDomainHostdevSubsysSCSIVHostPtr hostsrc = &dev->source.subsys.u.scsi_host; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &dev->source.subsys.u.mdev;
if (!secdef || !secdef->relabel) return 0; @@ -901,8 +903,25 @@ AppArmorSetSecurityHostdevLabel(virSecurityManagerPtr mgr, break; }
- case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: { + char *vfiodev = NULL; + virMediatedDevicePtr mdev = virMediatedDeviceNew(mdevsrc->uuidstr, + mdevsrc->model); + + if (!mdev) + goto done; + + if (!(vfiodev = virMediatedDeviceGetIOMMUGroupDev(mdev))) { + virMediatedDeviceFree(mdev); + goto done; + }
Going through the various patches and seeing this (or similar) sequences so often makes me think it might be cleaner to have APIs that take a uuidstr and model instead (or maybe define virDomainHostdevSubsysMediatedDevPtr in util instead of conf, then pass the mdevsrc directly - that would make it continue to work if/once we add different ways to specify the device. But things currently work exactly the same way for PCI devices, so no sense rewriting just for that. These are all internal APIs, so we can tweak them to our hearts' content in the future.
+ + ret = AppArmorSetSecurityHostdevLabelHelper(vfiodev, ptr); + + VIR_FREE(vfiodev); + virMediatedDeviceFree(mdev); break; + }
case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; diff --git a/src/security/security_dac.c b/src/security/security_dac.c index 4e968f29c0..922e484942 100644 --- a/src/security/security_dac.c +++ b/src/security/security_dac.c @@ -33,6 +33,7 @@ #include "virfile.h" #include "viralloc.h" #include "virlog.h" +#include "virmdev.h" #include "virpci.h" #include "virusb.h" #include "virscsi.h" @@ -867,6 +868,7 @@ virSecurityDACSetHostdevLabel(virSecurityManagerPtr mgr, virDomainHostdevSubsysPCIPtr pcisrc = &dev->source.subsys.u.pci; virDomainHostdevSubsysSCSIPtr scsisrc = &dev->source.subsys.u.scsi; virDomainHostdevSubsysSCSIVHostPtr hostsrc = &dev->source.subsys.u.scsi_host; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &dev->source.subsys.u.mdev; int ret = -1;
if (!priv->dynamicOwnership) @@ -964,7 +966,26 @@ virSecurityDACSetHostdevLabel(virSecurityManagerPtr mgr, break; }
- case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: { + char *vfiodev = NULL; + virMediatedDevicePtr mdev = virMediatedDeviceNew(mdevsrc->uuidstr, + mdevsrc->model); + + if (!mdev) + goto done; + + if (!(vfiodev = virMediatedDeviceGetIOMMUGroupDev(mdev))) { + virMediatedDeviceFree(mdev); + goto done; + }
(see what I mean?)
+ + ret = virSecurityDACSetHostdevLabelHelper(vfiodev, &cbdata); + + VIR_FREE(vfiodev); + virMediatedDeviceFree(mdev); + break; + } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; @@ -1032,6 +1053,7 @@ virSecurityDACRestoreHostdevLabel(virSecurityManagerPtr mgr, virDomainHostdevSubsysPCIPtr pcisrc = &dev->source.subsys.u.pci; virDomainHostdevSubsysSCSIPtr scsisrc = &dev->source.subsys.u.scsi; virDomainHostdevSubsysSCSIVHostPtr hostsrc = &dev->source.subsys.u.scsi_host; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &dev->source.subsys.u.mdev; int ret = -1;
secdef = virDomainDefGetSecurityLabelDef(def, SECURITY_DAC_NAME); @@ -1120,7 +1142,26 @@ virSecurityDACRestoreHostdevLabel(virSecurityManagerPtr mgr, break; }
- case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: { + char *vfiodev = NULL; + virMediatedDevicePtr mdev = virMediatedDeviceNew(mdevsrc->uuidstr, + mdevsrc->model); + + if (!mdev) + goto done; + + if (!(vfiodev = virMediatedDeviceGetIOMMUGroupDev(mdev))) { + virMediatedDeviceFree(mdev); + goto done; + } + + ret = virSecurityDACRestoreFileLabel(virSecurityManagerGetPrivateData(mgr), + vfiodev); + VIR_FREE(vfiodev); + virMediatedDeviceFree(mdev); + break; + } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; diff --git a/src/security/security_selinux.c b/src/security/security_selinux.c index 7b3276dc34..df7c96833e 100644 --- a/src/security/security_selinux.c +++ b/src/security/security_selinux.c @@ -36,6 +36,7 @@ #include "virerror.h" #include "viralloc.h" #include "virlog.h" +#include "virmdev.h" #include "virpci.h" #include "virusb.h" #include "virscsi.h" @@ -1741,6 +1742,7 @@ virSecuritySELinuxSetHostLabel(virSCSIVHostDevicePtr dev ATTRIBUTE_UNUSED, return virSecuritySELinuxSetHostdevLabelHelper(file, opaque); }
+ static int virSecuritySELinuxSetHostdevSubsysLabel(virSecurityManagerPtr mgr, virDomainDefPtr def, @@ -1752,6 +1754,7 @@ virSecuritySELinuxSetHostdevSubsysLabel(virSecurityManagerPtr mgr, virDomainHostdevSubsysPCIPtr pcisrc = &dev->source.subsys.u.pci; virDomainHostdevSubsysSCSIPtr scsisrc = &dev->source.subsys.u.scsi; virDomainHostdevSubsysSCSIVHostPtr hostsrc = &dev->source.subsys.u.scsi_host; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &dev->source.subsys.u.mdev; virSecuritySELinuxCallbackData data = {.mgr = mgr, .def = def};
int ret = -1; @@ -1838,7 +1841,26 @@ virSecuritySELinuxSetHostdevSubsysLabel(virSecurityManagerPtr mgr, break; }
- case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: { + char *vfiodev = NULL; + virMediatedDevicePtr mdev = virMediatedDeviceNew(mdevsrc->uuidstr, + mdevsrc->model); + + if (!mdev) + goto done; + + if (!(vfiodev = virMediatedDeviceGetIOMMUGroupDev(mdev))) { + virMediatedDeviceFree(mdev); + goto done; + } + + ret = virSecuritySELinuxSetHostdevLabelHelper(vfiodev, &data); + + VIR_FREE(vfiodev); + virMediatedDeviceFree(mdev); + break; + } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; @@ -1973,6 +1995,7 @@ virSecuritySELinuxRestoreHostLabel(virSCSIVHostDevicePtr dev ATTRIBUTE_UNUSED, return virSecuritySELinuxRestoreFileLabel(mgr, file); }
+ static int virSecuritySELinuxRestoreHostdevSubsysLabel(virSecurityManagerPtr mgr, virDomainHostdevDefPtr dev, @@ -1983,6 +2006,7 @@ virSecuritySELinuxRestoreHostdevSubsysLabel(virSecurityManagerPtr mgr, virDomainHostdevSubsysPCIPtr pcisrc = &dev->source.subsys.u.pci; virDomainHostdevSubsysSCSIPtr scsisrc = &dev->source.subsys.u.scsi; virDomainHostdevSubsysSCSIVHostPtr hostsrc = &dev->source.subsys.u.scsi_host; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &dev->source.subsys.u.mdev; int ret = -1;
/* Like virSecuritySELinuxRestoreImageLabelInt() for a networked @@ -2066,7 +2090,26 @@ virSecuritySELinuxRestoreHostdevSubsysLabel(virSecurityManagerPtr mgr, break; }
- case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: { + char *vfiodev = NULL; + virMediatedDevicePtr mdev = virMediatedDeviceNew(mdevsrc->uuidstr, + mdevsrc->model); + + if (!mdev) + goto done; + + if (!(vfiodev = virMediatedDeviceGetIOMMUGroupDev(mdev))) { + virMediatedDeviceFree(mdev); + goto done; + } + + ret = virSecuritySELinuxRestoreFileLabel(mgr, vfiodev); + + VIR_FREE(vfiodev); + virMediatedDeviceFree(mdev); + break; + } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break;
ACK.

On Sun, Mar 26, 2017 at 02:25:02PM -0400, Laine Stump wrote:
On 03/22/2017 11:27 AM, Erik Skultety wrote:
Label the VFIO IOMMU devices under /dev/vfio/ referenced by the symlinks in the sysfs (e.g. /sys/class/mdev_bus/<uuid>/iommu_group) which what qemu actually gets formatted on the command line.
The sentence above is confused (i.e. I don't understand it), but I won't know how to unconfuse it until I've gone through the patch.
Now that I'm reading it again, of course I know what I meant, but that's only because I wrote it, but from the native speaker's perspective, I guess the reaction must have been something like "wut?!". So I simplified it to the bare minimum in terms of the patch just adding labeling for mdevs as well. [...]
- case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: { + char *vfiodev = NULL; + virMediatedDevicePtr mdev = virMediatedDeviceNew(mdevsrc->uuidstr, + mdevsrc->model); + + if (!mdev) + goto done; + + if (!(vfiodev = virMediatedDeviceGetIOMMUGroupDev(mdev))) { + virMediatedDeviceFree(mdev); + goto done; + }
Going through the various patches and seeing this (or similar) sequences so often makes me think it might be cleaner to have APIs that take a uuidstr and model instead (or maybe define virDomainHostdevSubsysMediatedDevPtr in util instead of conf, then pass the mdevsrc directly - that would make it continue to work if/once we add different ways to specify the device.
But things currently work exactly the same way for PCI devices, so no sense rewriting just for that. These are all internal APIs, so we can tweak them to our hearts' content in the future.
Yeah, there's definitely some room for 'refurbishment' of the hostdev code. I'll put that on my TODO list.
- case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: { + char *vfiodev = NULL; + virMediatedDevicePtr mdev = virMediatedDeviceNew(mdevsrc->uuidstr, + mdevsrc->model); + + if (!mdev) + goto done; + + if (!(vfiodev = virMediatedDeviceGetIOMMUGroupDev(mdev))) { + virMediatedDeviceFree(mdev); + goto done; + }
(see what I mean?)
Yep.

This merely introduces virDomainHostdevMatchSubsysMediatedDev method that is supposed to check whether device being cold-plugged does not already exist in the domain configuration. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/conf/domain_conf.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 63ac65e8ab..a4ed605c27 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -14375,6 +14375,19 @@ virDomainHostdevMatchSubsysSCSIiSCSI(virDomainHostdevDefPtr first, } static int +virDomainHostdevMatchSubsysMediatedDev(virDomainHostdevDefPtr a, + virDomainHostdevDefPtr b) +{ + virDomainHostdevSubsysMediatedDevPtr src_a = &a->source.subsys.u.mdev; + virDomainHostdevSubsysMediatedDevPtr src_b = &b->source.subsys.u.mdev; + + if (STREQ(src_a->uuidstr, src_b->uuidstr)) + return 1; + + return 0; +} + +static int virDomainHostdevMatchSubsys(virDomainHostdevDefPtr a, virDomainHostdevDefPtr b) { @@ -14405,6 +14418,7 @@ virDomainHostdevMatchSubsys(virDomainHostdevDefPtr a, else return 0; case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + return virDomainHostdevMatchSubsysMediatedDev(a, b); case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: return 0; } -- 2.12.1

On 03/22/2017 11:27 AM, Erik Skultety wrote:
This merely introduces virDomainHostdevMatchSubsysMediatedDev method that is supposed to check whether device being cold-plugged does not already exist in the domain configuration.
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/conf/domain_conf.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 63ac65e8ab..a4ed605c27 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -14375,6 +14375,19 @@ virDomainHostdevMatchSubsysSCSIiSCSI(virDomainHostdevDefPtr first, }
static int +virDomainHostdevMatchSubsysMediatedDev(virDomainHostdevDefPtr a, + virDomainHostdevDefPtr b) +{ + virDomainHostdevSubsysMediatedDevPtr src_a = &a->source.subsys.u.mdev; + virDomainHostdevSubsysMediatedDevPtr src_b = &b->source.subsys.u.mdev; + + if (STREQ(src_a->uuidstr, src_b->uuidstr)) + return 1; + + return 0; +} + +static int virDomainHostdevMatchSubsys(virDomainHostdevDefPtr a, virDomainHostdevDefPtr b) { @@ -14405,6 +14418,7 @@ virDomainHostdevMatchSubsys(virDomainHostdevDefPtr a, else return 0; case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + return virDomainHostdevMatchSubsysMediatedDev(a, b);
This points out that the practice of typecasting all values that are being used for a switch() and removing the default: case isn't really all that big of a help. Sure, it forces you to add new clauses to all the relevant switches in the patch where you add the new enum value, but it would just add in an *empty* clause thinking "I'll fill it in later", then you're once again relegating the responsibility for adding the code to your own memory, which is exactly what we were trying to avoid :-P (Nothing wrong with what you're doing, I just felt like pontificating a bit :-)
case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: return 0; }
ACK.

So far, the official support is for x86_64 arch guests so unless a different device API than vfio-pci is available let's only turn on support for PCI address assignment. Once a different device API is introduced, we can enable another address type easily. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/qemu/qemu_domain.h | 1 + src/qemu/qemu_domain_address.c | 14 +++++++++++--- 2 files changed, 12 insertions(+), 3 deletions(-) diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 1f266bffb5..4377560261 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -34,6 +34,7 @@ # include "qemu_agent.h" # include "qemu_conf.h" # include "qemu_capabilities.h" +# include "virmdev.h" # include "virchrdev.h" # include "virobject.h" # include "logging/log_manager.h" diff --git a/src/qemu/qemu_domain_address.c b/src/qemu/qemu_domain_address.c index 6d3a627868..22d8bf67d9 100644 --- a/src/qemu/qemu_domain_address.c +++ b/src/qemu/qemu_domain_address.c @@ -619,7 +619,8 @@ qemuDomainDeviceCalculatePCIConnectFlags(virDomainDeviceDefPtr dev, virPCIDeviceAddressPtr hostAddr = &hostdev->source.subsys.u.pci.addr; if (hostdev->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS || - hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI) { + (hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && + hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV)) { return 0; } @@ -643,6 +644,9 @@ qemuDomainDeviceCalculatePCIConnectFlags(virDomainDeviceDefPtr dev, return pcieFlags; } + if (hostdev->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV) + return pcieFlags; + if (!(pciDev = virPCIDeviceNew(hostAddr->domain, hostAddr->bus, hostAddr->slot, @@ -1727,13 +1731,17 @@ qemuDomainAssignDevicePCISlots(virDomainDefPtr def, /* Host PCI devices */ for (i = 0; i < def->nhostdevs; i++) { + virDomainHostdevSubsysPtr subsys = &def->hostdevs[i]->source.subsys; if (!virDeviceInfoPCIAddressWanted(def->hostdevs[i]->info)) continue; if (def->hostdevs[i]->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS) continue; - if (def->hostdevs[i]->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && - def->hostdevs[i]->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST) + if (subsys->type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && + subsys->type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST && + !(subsys->type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV && + subsys->u.mdev.model == VIR_MDEV_MODEL_TYPE_VFIO_PCI)) { continue; + } if (qemuDomainPCIAddressReserveNextAddr(addrs, def->hostdevs[i]->info) < 0) -- 2.12.1

On 03/22/2017 11:27 AM, Erik Skultety wrote:
So far, the official support is for x86_64 arch guests so unless a different device API than vfio-pci is available let's only turn on support for PCI address assignment. Once a different device API is introduced, we can enable another address type easily.
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/qemu/qemu_domain.h | 1 + src/qemu/qemu_domain_address.c | 14 +++++++++++--- 2 files changed, 12 insertions(+), 3 deletions(-)
diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 1f266bffb5..4377560261 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -34,6 +34,7 @@ # include "qemu_agent.h" # include "qemu_conf.h" # include "qemu_capabilities.h" +# include "virmdev.h" # include "virchrdev.h" # include "virobject.h" # include "logging/log_manager.h" diff --git a/src/qemu/qemu_domain_address.c b/src/qemu/qemu_domain_address.c index 6d3a627868..22d8bf67d9 100644 --- a/src/qemu/qemu_domain_address.c +++ b/src/qemu/qemu_domain_address.c @@ -619,7 +619,8 @@ qemuDomainDeviceCalculatePCIConnectFlags(virDomainDeviceDefPtr dev, virPCIDeviceAddressPtr hostAddr = &hostdev->source.subsys.u.pci.addr;
if (hostdev->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS || - hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI) { + (hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && + hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV)) { return 0; }
@@ -643,6 +644,9 @@ qemuDomainDeviceCalculatePCIConnectFlags(virDomainDeviceDefPtr dev, return pcieFlags; }
+ if (hostdev->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV) + return pcieFlags; +
I think we talked about this a long time ago, and agreed this was the correct thing to do, but I just want to make sure everyone sees/understands/agrees - the bit right above this comment assumes that all mediated devices are PCI Express devices and should be plugged into a PCIe port (which would include adding a pcie-root-port and plugging it into that rather than directly into pcie-root). (Hopefully nobody will come up with a new mediated device that is legacy-pci only...)
if (!(pciDev = virPCIDeviceNew(hostAddr->domain, hostAddr->bus, hostAddr->slot, @@ -1727,13 +1731,17 @@ qemuDomainAssignDevicePCISlots(virDomainDefPtr def,
/* Host PCI devices */ for (i = 0; i < def->nhostdevs; i++) { + virDomainHostdevSubsysPtr subsys = &def->hostdevs[i]->source.subsys; if (!virDeviceInfoPCIAddressWanted(def->hostdevs[i]->info)) continue; if (def->hostdevs[i]->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS) continue; - if (def->hostdevs[i]->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && - def->hostdevs[i]->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST) + if (subsys->type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && + subsys->type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST && + !(subsys->type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV && + subsys->u.mdev.model == VIR_MDEV_MODEL_TYPE_VFIO_PCI)) { continue; + }
if (qemuDomainPCIAddressReserveNextAddr(addrs, def->hostdevs[i]->info) < 0)
ACK.

On Sun, Mar 26, 2017 at 02:32:23PM -0400, Laine Stump wrote:
On 03/22/2017 11:27 AM, Erik Skultety wrote:
So far, the official support is for x86_64 arch guests so unless a different device API than vfio-pci is available let's only turn on support for PCI address assignment. Once a different device API is introduced, we can enable another address type easily.
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/qemu/qemu_domain.h | 1 + src/qemu/qemu_domain_address.c | 14 +++++++++++--- 2 files changed, 12 insertions(+), 3 deletions(-)
diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 1f266bffb5..4377560261 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -34,6 +34,7 @@ # include "qemu_agent.h" # include "qemu_conf.h" # include "qemu_capabilities.h" +# include "virmdev.h" # include "virchrdev.h" # include "virobject.h" # include "logging/log_manager.h" diff --git a/src/qemu/qemu_domain_address.c b/src/qemu/qemu_domain_address.c index 6d3a627868..22d8bf67d9 100644 --- a/src/qemu/qemu_domain_address.c +++ b/src/qemu/qemu_domain_address.c @@ -619,7 +619,8 @@ qemuDomainDeviceCalculatePCIConnectFlags(virDomainDeviceDefPtr dev, virPCIDeviceAddressPtr hostAddr = &hostdev->source.subsys.u.pci.addr;
if (hostdev->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS || - hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI) { + (hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && + hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV)) { return 0; }
@@ -643,6 +644,9 @@ qemuDomainDeviceCalculatePCIConnectFlags(virDomainDeviceDefPtr dev, return pcieFlags; }
+ if (hostdev->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV) + return pcieFlags; +
I think we talked about this a long time ago, and agreed this was the correct thing to do, but I just want to make sure everyone sees/understands/agrees - the bit right above this comment assumes that all mediated devices are PCI Express devices and should be plugged into a PCIe port (which would include adding a pcie-root-port and plugging it into that rather than directly into pcie-root).
(Hopefully nobody will come up with a new mediated device that is legacy-pci only...)
Hopefully not, but then, we can always adjust the code, since this doesn't have any implications on the XML. Erik

Keep track of the assigned mediated devices the same way we do it for the rest of hostdevs. Methods like 'Prepare', 'Update', and 'ReAttach' are introduced by this patch. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/libvirt_private.syms | 3 + src/qemu/qemu_hostdev.c | 56 ++++++++++++++++ src/qemu/qemu_hostdev.h | 10 +++ src/util/virhostdev.c | 165 ++++++++++++++++++++++++++++++++++++++++++++++- src/util/virhostdev.h | 23 +++++++ 5 files changed, 256 insertions(+), 1 deletion(-) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index c51b295d30..8f3b9697e4 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1730,16 +1730,19 @@ virHostdevPCINodeDeviceDetach; virHostdevPCINodeDeviceReAttach; virHostdevPCINodeDeviceReset; virHostdevPrepareDomainDevices; +virHostdevPrepareMediatedDevices; virHostdevPreparePCIDevices; virHostdevPrepareSCSIDevices; virHostdevPrepareSCSIVHostDevices; virHostdevPrepareUSBDevices; virHostdevReAttachDomainDevices; +virHostdevReAttachMediatedDevices; virHostdevReAttachPCIDevices; virHostdevReAttachSCSIDevices; virHostdevReAttachSCSIVHostDevices; virHostdevReAttachUSBDevices; virHostdevUpdateActiveDomainDevices; +virHostdevUpdateActiveMediatedDevices; virHostdevUpdateActivePCIDevices; virHostdevUpdateActiveSCSIDevices; virHostdevUpdateActiveUSBDevices; diff --git a/src/qemu/qemu_hostdev.c b/src/qemu/qemu_hostdev.c index 7cd49e4aa5..685bf5b59f 100644 --- a/src/qemu/qemu_hostdev.c +++ b/src/qemu/qemu_hostdev.c @@ -83,6 +83,22 @@ qemuHostdevUpdateActiveSCSIDevices(virQEMUDriverPtr driver, QEMU_DRIVER_NAME, def->name); } + +int +qemuHostdevUpdateActiveMediatedDevices(virQEMUDriverPtr driver, + virDomainDefPtr def) +{ + virHostdevManagerPtr mgr = driver->hostdevMgr; + + if (!def->nhostdevs) + return 0; + + return virHostdevUpdateActiveMediatedDevices(mgr, def->hostdevs, + def->nhostdevs, + QEMU_DRIVER_NAME, def->name); +} + + int qemuHostdevUpdateActiveDomainDevices(virQEMUDriverPtr driver, virDomainDefPtr def) @@ -99,6 +115,9 @@ qemuHostdevUpdateActiveDomainDevices(virQEMUDriverPtr driver, if (qemuHostdevUpdateActiveSCSIDevices(driver, def) < 0) return -1; + if (qemuHostdevUpdateActiveMediatedDevices(driver, def) < 0) + return -1; + return 0; } @@ -305,6 +324,24 @@ qemuHostdevPrepareSCSIVHostDevices(virQEMUDriverPtr driver, } int +qemuHostdevPrepareMediatedDevices(virQEMUDriverPtr driver, + const char *name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs) +{ + virHostdevManagerPtr hostdev_mgr = driver->hostdevMgr; + + if (!qemuHostdevHostSupportsPassthroughVFIO()) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("host doesn't support VFIO PCI interface")); + return -1; + } + + return virHostdevPrepareMediatedDevices(hostdev_mgr, QEMU_DRIVER_NAME, + name, hostdevs, nhostdevs); +} + +int qemuHostdevPrepareDomainDevices(virQEMUDriverPtr driver, virDomainDefPtr def, virQEMUCapsPtr qemuCaps, @@ -330,6 +367,10 @@ qemuHostdevPrepareDomainDevices(virQEMUDriverPtr driver, def->hostdevs, def->nhostdevs) < 0) return -1; + if (qemuHostdevPrepareMediatedDevices(driver, def->name, + def->hostdevs, def->nhostdevs) < 0) + return -1; + return 0; } @@ -397,6 +438,18 @@ qemuHostdevReAttachSCSIVHostDevices(virQEMUDriverPtr driver, } void +qemuHostdevReAttachMediatedDevices(virQEMUDriverPtr driver, + const char *name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs) +{ + virHostdevManagerPtr hostdev_mgr = driver->hostdevMgr; + + virHostdevReAttachMediatedDevices(hostdev_mgr, QEMU_DRIVER_NAME, + name, hostdevs, nhostdevs); +} + +void qemuHostdevReAttachDomainDevices(virQEMUDriverPtr driver, virDomainDefPtr def) { @@ -414,4 +467,7 @@ qemuHostdevReAttachDomainDevices(virQEMUDriverPtr driver, qemuHostdevReAttachSCSIVHostDevices(driver, def->name, def->hostdevs, def->nhostdevs); + + qemuHostdevReAttachMediatedDevices(driver, def->name, def->hostdevs, + def->nhostdevs); } diff --git a/src/qemu/qemu_hostdev.h b/src/qemu/qemu_hostdev.h index 74a7d4f34e..9a7c7f143c 100644 --- a/src/qemu/qemu_hostdev.h +++ b/src/qemu/qemu_hostdev.h @@ -30,6 +30,8 @@ bool qemuHostdevHostSupportsPassthroughLegacy(void); bool qemuHostdevHostSupportsPassthroughVFIO(void); +int qemuHostdevUpdateActiveMediatedDevices(virQEMUDriverPtr driver, + virDomainDefPtr def); int qemuHostdevUpdateActivePCIDevices(virQEMUDriverPtr driver, virDomainDefPtr def); int qemuHostdevUpdateActiveUSBDevices(virQEMUDriverPtr driver, @@ -59,6 +61,10 @@ int qemuHostdevPrepareSCSIVHostDevices(virQEMUDriverPtr driver, const char *name, virDomainHostdevDefPtr *hostdevs, int nhostdevs); +int qemuHostdevPrepareMediatedDevices(virQEMUDriverPtr driver, + const char *name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs); int qemuHostdevPrepareDomainDevices(virQEMUDriverPtr driver, virDomainDefPtr def, virQEMUCapsPtr qemuCaps, @@ -80,6 +86,10 @@ void qemuHostdevReAttachSCSIVHostDevices(virQEMUDriverPtr driver, const char *name, virDomainHostdevDefPtr *hostdevs, int nhostdevs); +void qemuHostdevReAttachMediatedDevices(virQEMUDriverPtr driver, + const char *name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs); void qemuHostdevReAttachDomainDevices(virQEMUDriverPtr driver, virDomainDefPtr def); diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index 86ca8e0473..0c337e6116 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -1,6 +1,6 @@ /* virhostdev.c: hostdev management * - * Copyright (C) 2006-2007, 2009-2016 Red Hat, Inc. + * Copyright (C) 2006-2007, 2009-2017 Red Hat, Inc. * Copyright (C) 2006 Daniel P. Berrange * Copyright (C) 2014 SUSE LINUX Products GmbH, Nuernberg, Germany. * @@ -147,6 +147,7 @@ virHostdevManagerDispose(void *obj) virObjectUnref(hostdevMgr->activeUSBHostdevs); virObjectUnref(hostdevMgr->activeSCSIHostdevs); virObjectUnref(hostdevMgr->activeSCSIVHostHostdevs); + virObjectUnref(hostdevMgr->activeMediatedHostdevs); VIR_FREE(hostdevMgr->stateDir); } @@ -174,6 +175,9 @@ virHostdevManagerNew(void) if (!(hostdevMgr->activeSCSIVHostHostdevs = virSCSIVHostDeviceListNew())) goto error; + if (!(hostdevMgr->activeMediatedHostdevs = virMediatedDeviceListNew())) + goto error; + if (privileged) { if (VIR_STRDUP(hostdevMgr->stateDir, HOSTDEV_STATE_DIR) < 0) goto error; @@ -1147,6 +1151,50 @@ virHostdevUpdateActiveSCSIDevices(virHostdevManagerPtr mgr, return ret; } + +int +virHostdevUpdateActiveMediatedDevices(virHostdevManagerPtr mgr, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs, + const char *drv_name, + const char *dom_name) +{ + int ret = -1; + size_t i; + virMediatedDevicePtr mdev = NULL; + + if (nhostdevs == 0) + return 0; + + virObjectLock(mgr->activeMediatedHostdevs); + for (i = 0; i < nhostdevs; i++) { + virDomainHostdevDefPtr hostdev = hostdevs[i]; + virDomainHostdevSubsysMediatedDevPtr mdevsrc; + + mdevsrc = &hostdev->source.subsys.u.mdev; + + if (hostdev->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS || + hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV) { + continue; + } + + if (!(mdev = virMediatedDeviceNew(mdevsrc->uuidstr, mdevsrc->model))) + goto cleanup; + + virMediatedDeviceSetUsedBy(mdev, drv_name, dom_name); + + if (virMediatedDeviceListAdd(mgr->activeMediatedHostdevs, mdev) < 0) + goto cleanup; + } + + ret = 0; + cleanup: + virMediatedDeviceFree(mdev); + virObjectUnlock(mgr->activeMediatedHostdevs); + return ret; +} + + static int virHostdevMarkUSBDevices(virHostdevManagerPtr mgr, const char *drv_name, @@ -1595,6 +1643,70 @@ virHostdevPrepareSCSIVHostDevices(virHostdevManagerPtr mgr, return -1; } + +int +virHostdevPrepareMediatedDevices(virHostdevManagerPtr mgr, + const char *drv_name, + const char *dom_name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs) +{ + size_t i; + int ret = -1; + virMediatedDeviceListPtr list; + + if (!nhostdevs) + return 0; + + /* To prevent situation where mediated device is assigned to multiple + * domains we maintain a driver list of currently assigned mediated devices. + * A device is appended to the driver list after a series of preparations. + */ + if (!(list = virMediatedDeviceListNew())) + goto cleanup; + + /* Loop 1: Build a temporary list of ALL mediated devices. */ + for (i = 0; i < nhostdevs; i++) { + virDomainHostdevDefPtr hostdev = hostdevs[i]; + virDomainHostdevSubsysMediatedDevPtr src = &hostdev->source.subsys.u.mdev; + virMediatedDevicePtr mdev; + + if (hostdev->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS) + continue; + if (hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV) + continue; + + if (!(mdev = virMediatedDeviceNew(src->uuidstr, src->model))) + goto cleanup; + + if (virMediatedDeviceListAdd(list, mdev) < 0) { + virMediatedDeviceFree(mdev); + goto cleanup; + } + } + + /* Mark the devices in the list as used by @drv_name-@dom_name and copy the + * references to the driver list + */ + if (virMediatedDeviceListMarkDevices(mgr->activeMediatedHostdevs, + list, drv_name, dom_name) < 0) + goto cleanup; + + /* Loop 2: Temporary list was successfully merged with + * driver list, so steal all items to avoid freeing them + * in cleanup label. + */ + while (virMediatedDeviceListCount(list) > 0) { + virMediatedDevicePtr tmp = virMediatedDeviceListGet(list, 0); + virMediatedDeviceListSteal(list, tmp); + } + + ret = 0; + cleanup: + virObjectUnref(list); + return ret; +} + void virHostdevReAttachUSBDevices(virHostdevManagerPtr mgr, const char *drv_name, @@ -1789,6 +1901,57 @@ virHostdevReAttachSCSIVHostDevices(virHostdevManagerPtr mgr, virObjectUnlock(mgr->activeSCSIVHostHostdevs); } +void +virHostdevReAttachMediatedDevices(virHostdevManagerPtr mgr, + const char *drv_name, + const char *dom_name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs) +{ + const char *used_by_drvname = NULL; + const char *used_by_domname = NULL; + size_t i; + + if (nhostdevs == 0) + return; + + virObjectLock(mgr->activeMediatedHostdevs); + for (i = 0; i < nhostdevs; i++) { + virMediatedDevicePtr mdev, tmp; + virDomainHostdevSubsysMediatedDevPtr mdevsrc; + virDomainHostdevDefPtr hostdev = hostdevs[i]; + + mdevsrc = &hostdev->source.subsys.u.mdev; + + if (hostdev->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS || + hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV) { + continue; + } + + if (!(mdev = virMediatedDeviceNew(mdevsrc->uuidstr, + mdevsrc->model))) + continue; + + /* Remove from the list only mdevs assigned to @drv_name/@dom_name */ + + tmp = virMediatedDeviceListFind(mgr->activeMediatedHostdevs, mdev); + virMediatedDeviceFree(mdev); + + /* skip inactive devices */ + if (!tmp) + continue; + + virMediatedDeviceGetUsedBy(tmp, &used_by_drvname, &used_by_domname); + if (STREQ_NULLABLE(drv_name, used_by_drvname) && + STREQ_NULLABLE(dom_name, used_by_domname)) { + VIR_DEBUG("Removing %s dom=%s from activeMediatedHostdevs", + mdevsrc->uuidstr, dom_name); + virMediatedDeviceListDel(mgr->activeMediatedHostdevs, tmp); + } + } + virObjectUnlock(mgr->activeMediatedHostdevs); +} + int virHostdevPCINodeDeviceDetach(virHostdevManagerPtr mgr, virPCIDevicePtr pci) diff --git a/src/util/virhostdev.h b/src/util/virhostdev.h index 1202136c29..42e898211e 100644 --- a/src/util/virhostdev.h +++ b/src/util/virhostdev.h @@ -32,6 +32,7 @@ # include "virscsi.h" # include "virscsivhost.h" # include "conf/domain_conf.h" +# include "virmdev.h" typedef enum { VIR_HOSTDEV_STRICT_ACS_CHECK = (1 << 0), /* strict acs check */ @@ -55,6 +56,7 @@ struct _virHostdevManager { virUSBDeviceListPtr activeUSBHostdevs; virSCSIDeviceListPtr activeSCSIHostdevs; virSCSIVHostDeviceListPtr activeSCSIVHostHostdevs; + virMediatedDeviceListPtr activeMediatedHostdevs; }; virHostdevManagerPtr virHostdevManagerGetDefault(void); @@ -96,6 +98,13 @@ virHostdevPrepareSCSIVHostDevices(virHostdevManagerPtr hostdev_mgr, virDomainHostdevDefPtr *hostdevs, int nhostdevs) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3); +int +virHostdevPrepareMediatedDevices(virHostdevManagerPtr hostdev_mgr, + const char *drv_name, + const char *dom_name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3); void virHostdevReAttachPCIDevices(virHostdevManagerPtr hostdev_mgr, const char *drv_name, @@ -125,6 +134,13 @@ virHostdevReAttachSCSIVHostDevices(virHostdevManagerPtr hostdev_mgr, virDomainHostdevDefPtr *hostdevs, int nhostdevs) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3); +void +virHostdevReAttachMediatedDevices(virHostdevManagerPtr hostdev_mgr, + const char *drv_name, + const char *dom_name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3); int virHostdevUpdateActivePCIDevices(virHostdevManagerPtr mgr, virDomainHostdevDefPtr *hostdevs, @@ -147,6 +163,13 @@ virHostdevUpdateActiveSCSIDevices(virHostdevManagerPtr mgr, const char *dom_name) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(4) ATTRIBUTE_NONNULL(5); int +virHostdevUpdateActiveMediatedDevices(virHostdevManagerPtr mgr, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs, + const char *drv_name, + const char *dom_name) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(4) ATTRIBUTE_NONNULL(5); +int virHostdevUpdateActiveDomainDevices(virHostdevManagerPtr mgr, const char *driver, virDomainDefPtr def, -- 2.12.1

(I'm unable to apply this patch to the head of master with "git am -3", and it won't show me the conflicts (it just fails saying "fatal: sha1 information is lacking or useless (src/libvirt_private.syms), error: could not build fake ancestor". Because of this, all further review is based purely on examining the patch emails.) On 03/22/2017 11:27 AM, Erik Skultety wrote:
Keep track of the assigned mediated devices the same way we do it for the rest of hostdevs. Methods like 'Prepare', 'Update', and 'ReAttach' are introduced by this patch.
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/libvirt_private.syms | 3 + src/qemu/qemu_hostdev.c | 56 ++++++++++++++++ src/qemu/qemu_hostdev.h | 10 +++ src/util/virhostdev.c | 165 ++++++++++++++++++++++++++++++++++++++++++++++- src/util/virhostdev.h | 23 +++++++ 5 files changed, 256 insertions(+), 1 deletion(-)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index c51b295d30..8f3b9697e4 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1730,16 +1730,19 @@ virHostdevPCINodeDeviceDetach; virHostdevPCINodeDeviceReAttach; virHostdevPCINodeDeviceReset; virHostdevPrepareDomainDevices; +virHostdevPrepareMediatedDevices; virHostdevPreparePCIDevices; virHostdevPrepareSCSIDevices; virHostdevPrepareSCSIVHostDevices; virHostdevPrepareUSBDevices; virHostdevReAttachDomainDevices; +virHostdevReAttachMediatedDevices; virHostdevReAttachPCIDevices; virHostdevReAttachSCSIDevices; virHostdevReAttachSCSIVHostDevices; virHostdevReAttachUSBDevices; virHostdevUpdateActiveDomainDevices; +virHostdevUpdateActiveMediatedDevices; virHostdevUpdateActivePCIDevices; virHostdevUpdateActiveSCSIDevices; virHostdevUpdateActiveUSBDevices;
BTW, I'm assuming that you've run "make syntax-check && make check" as each patch is applied, so that we're sure the functions in libvirt_private.syms are in proper alphabetic order (among other things). I've run it at a few stages of applying the patches, but not all.
diff --git a/src/qemu/qemu_hostdev.c b/src/qemu/qemu_hostdev.c index 7cd49e4aa5..685bf5b59f 100644 --- a/src/qemu/qemu_hostdev.c +++ b/src/qemu/qemu_hostdev.c @@ -83,6 +83,22 @@ qemuHostdevUpdateActiveSCSIDevices(virQEMUDriverPtr driver, QEMU_DRIVER_NAME, def->name); }
+ +int +qemuHostdevUpdateActiveMediatedDevices(virQEMUDriverPtr driver, + virDomainDefPtr def) +{ + virHostdevManagerPtr mgr = driver->hostdevMgr; + + if (!def->nhostdevs) + return 0; + + return virHostdevUpdateActiveMediatedDevices(mgr, def->hostdevs, + def->nhostdevs, + QEMU_DRIVER_NAME, def->name); +} + + int qemuHostdevUpdateActiveDomainDevices(virQEMUDriverPtr driver, virDomainDefPtr def) @@ -99,6 +115,9 @@ qemuHostdevUpdateActiveDomainDevices(virQEMUDriverPtr driver, if (qemuHostdevUpdateActiveSCSIDevices(driver, def) < 0) return -1;
+ if (qemuHostdevUpdateActiveMediatedDevices(driver, def) < 0) + return -1; + return 0; }
@@ -305,6 +324,24 @@ qemuHostdevPrepareSCSIVHostDevices(virQEMUDriverPtr driver, }
int +qemuHostdevPrepareMediatedDevices(virQEMUDriverPtr driver, + const char *name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs) +{ + virHostdevManagerPtr hostdev_mgr = driver->hostdevMgr; + + if (!qemuHostdevHostSupportsPassthroughVFIO()) {
Sshhhh! We won't let Alex know this function exists :-)
+ virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("host doesn't support VFIO PCI interface")); + return -1; + } + + return virHostdevPrepareMediatedDevices(hostdev_mgr, QEMU_DRIVER_NAME, + name, hostdevs, nhostdevs); +} + +int qemuHostdevPrepareDomainDevices(virQEMUDriverPtr driver, virDomainDefPtr def, virQEMUCapsPtr qemuCaps, @@ -330,6 +367,10 @@ qemuHostdevPrepareDomainDevices(virQEMUDriverPtr driver, def->hostdevs, def->nhostdevs) < 0) return -1;
+ if (qemuHostdevPrepareMediatedDevices(driver, def->name, + def->hostdevs, def->nhostdevs) < 0) + return -1; + return 0; }
@@ -397,6 +438,18 @@ qemuHostdevReAttachSCSIVHostDevices(virQEMUDriverPtr driver, }
void +qemuHostdevReAttachMediatedDevices(virQEMUDriverPtr driver, + const char *name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs) +{ + virHostdevManagerPtr hostdev_mgr = driver->hostdevMgr; + + virHostdevReAttachMediatedDevices(hostdev_mgr, QEMU_DRIVER_NAME, + name, hostdevs, nhostdevs);
What does "ReAttach" mean for a mediated device? Isn't this a NOP? Oh, nevermind - I looked ahead and compared to the PCI version of the ReAttach function. Turns out the PCI version does a *bunch* of things other than reattaching the device to its host driver, including restoring netdev config (which doesn't apply for mdevs and calling virPCIDeviceReset() (which is a NOP even for PCI devices as long as we're using VFIO, which *everybody* is these days). *BUT* the one other thing it does is move the devices that had been in use by the domain from the active list to the inactive list, and that's what the mdev version of the function does. Someday in the future we may want to clean these up and rename the functions appropriately, but for now having them named similarly makes it easier to review, so forget I said anything :-)
+} + +void qemuHostdevReAttachDomainDevices(virQEMUDriverPtr driver, virDomainDefPtr def) { @@ -414,4 +467,7 @@ qemuHostdevReAttachDomainDevices(virQEMUDriverPtr driver,
qemuHostdevReAttachSCSIVHostDevices(driver, def->name, def->hostdevs, def->nhostdevs); + + qemuHostdevReAttachMediatedDevices(driver, def->name, def->hostdevs, + def->nhostdevs); } diff --git a/src/qemu/qemu_hostdev.h b/src/qemu/qemu_hostdev.h index 74a7d4f34e..9a7c7f143c 100644 --- a/src/qemu/qemu_hostdev.h +++ b/src/qemu/qemu_hostdev.h @@ -30,6 +30,8 @@ bool qemuHostdevHostSupportsPassthroughLegacy(void); bool qemuHostdevHostSupportsPassthroughVFIO(void);
+int qemuHostdevUpdateActiveMediatedDevices(virQEMUDriverPtr driver, + virDomainDefPtr def); int qemuHostdevUpdateActivePCIDevices(virQEMUDriverPtr driver, virDomainDefPtr def); int qemuHostdevUpdateActiveUSBDevices(virQEMUDriverPtr driver, @@ -59,6 +61,10 @@ int qemuHostdevPrepareSCSIVHostDevices(virQEMUDriverPtr driver, const char *name, virDomainHostdevDefPtr *hostdevs, int nhostdevs); +int qemuHostdevPrepareMediatedDevices(virQEMUDriverPtr driver, + const char *name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs); int qemuHostdevPrepareDomainDevices(virQEMUDriverPtr driver, virDomainDefPtr def, virQEMUCapsPtr qemuCaps, @@ -80,6 +86,10 @@ void qemuHostdevReAttachSCSIVHostDevices(virQEMUDriverPtr driver, const char *name, virDomainHostdevDefPtr *hostdevs, int nhostdevs); +void qemuHostdevReAttachMediatedDevices(virQEMUDriverPtr driver, + const char *name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs); void qemuHostdevReAttachDomainDevices(virQEMUDriverPtr driver, virDomainDefPtr def);
diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index 86ca8e0473..0c337e6116 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -1,6 +1,6 @@ /* virhostdev.c: hostdev management * - * Copyright (C) 2006-2007, 2009-2016 Red Hat, Inc. + * Copyright (C) 2006-2007, 2009-2017 Red Hat, Inc. * Copyright (C) 2006 Daniel P. Berrange * Copyright (C) 2014 SUSE LINUX Products GmbH, Nuernberg, Germany. * @@ -147,6 +147,7 @@ virHostdevManagerDispose(void *obj) virObjectUnref(hostdevMgr->activeUSBHostdevs); virObjectUnref(hostdevMgr->activeSCSIHostdevs); virObjectUnref(hostdevMgr->activeSCSIVHostHostdevs); + virObjectUnref(hostdevMgr->activeMediatedHostdevs); VIR_FREE(hostdevMgr->stateDir); }
@@ -174,6 +175,9 @@ virHostdevManagerNew(void) if (!(hostdevMgr->activeSCSIVHostHostdevs = virSCSIVHostDeviceListNew())) goto error;
+ if (!(hostdevMgr->activeMediatedHostdevs = virMediatedDeviceListNew())) + goto error; + if (privileged) { if (VIR_STRDUP(hostdevMgr->stateDir, HOSTDEV_STATE_DIR) < 0) goto error; @@ -1147,6 +1151,50 @@ virHostdevUpdateActiveSCSIDevices(virHostdevManagerPtr mgr, return ret; }
+ +int +virHostdevUpdateActiveMediatedDevices(virHostdevManagerPtr mgr, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs, + const char *drv_name, + const char *dom_name) +{ + int ret = -1; + size_t i; + virMediatedDevicePtr mdev = NULL; + + if (nhostdevs == 0) + return 0; + + virObjectLock(mgr->activeMediatedHostdevs); + for (i = 0; i < nhostdevs; i++) { + virDomainHostdevDefPtr hostdev = hostdevs[i]; + virDomainHostdevSubsysMediatedDevPtr mdevsrc; + + mdevsrc = &hostdev->source.subsys.u.mdev; + + if (hostdev->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS || + hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV) { + continue; + } + + if (!(mdev = virMediatedDeviceNew(mdevsrc->uuidstr, mdevsrc->model))) + goto cleanup;
Sigh. *still* with the temporary object creation just to call some utility function that could have just as easily taken different args...
+ + virMediatedDeviceSetUsedBy(mdev, drv_name, dom_name); + + if (virMediatedDeviceListAdd(mgr->activeMediatedHostdevs, mdev) < 0) + goto cleanup; + } + + ret = 0; + cleanup: + virMediatedDeviceFree(mdev); + virObjectUnlock(mgr->activeMediatedHostdevs); + return ret; +} + + static int virHostdevMarkUSBDevices(virHostdevManagerPtr mgr, const char *drv_name, @@ -1595,6 +1643,70 @@ virHostdevPrepareSCSIVHostDevices(virHostdevManagerPtr mgr, return -1; }
+ +int +virHostdevPrepareMediatedDevices(virHostdevManagerPtr mgr, + const char *drv_name, + const char *dom_name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs) +{ + size_t i; + int ret = -1; + virMediatedDeviceListPtr list; + + if (!nhostdevs) + return 0; + + /* To prevent situation where mediated device is assigned to multiple + * domains we maintain a driver list of currently assigned mediated devices. + * A device is appended to the driver list after a series of preparations. + */ + if (!(list = virMediatedDeviceListNew())) + goto cleanup; + + /* Loop 1: Build a temporary list of ALL mediated devices. */ + for (i = 0; i < nhostdevs; i++) { + virDomainHostdevDefPtr hostdev = hostdevs[i]; + virDomainHostdevSubsysMediatedDevPtr src = &hostdev->source.subsys.u.mdev; + virMediatedDevicePtr mdev; + + if (hostdev->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS) + continue; + if (hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV) + continue; + + if (!(mdev = virMediatedDeviceNew(src->uuidstr, src->model))) + goto cleanup; + + if (virMediatedDeviceListAdd(list, mdev) < 0) { + virMediatedDeviceFree(mdev); + goto cleanup; + } + } + + /* Mark the devices in the list as used by @drv_name-@dom_name and copy the + * references to the driver list + */ + if (virMediatedDeviceListMarkDevices(mgr->activeMediatedHostdevs, + list, drv_name, dom_name) < 0) + goto cleanup; + + /* Loop 2: Temporary list was successfully merged with + * driver list, so steal all items to avoid freeing them + * in cleanup label. + */ + while (virMediatedDeviceListCount(list) > 0) { + virMediatedDevicePtr tmp = virMediatedDeviceListGet(list, 0); + virMediatedDeviceListSteal(list, tmp); + } + + ret = 0; + cleanup: + virObjectUnref(list); + return ret; +} + void virHostdevReAttachUSBDevices(virHostdevManagerPtr mgr, const char *drv_name, @@ -1789,6 +1901,57 @@ virHostdevReAttachSCSIVHostDevices(virHostdevManagerPtr mgr, virObjectUnlock(mgr->activeSCSIVHostHostdevs); }
+void +virHostdevReAttachMediatedDevices(virHostdevManagerPtr mgr, + const char *drv_name, + const char *dom_name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs) +{ + const char *used_by_drvname = NULL; + const char *used_by_domname = NULL; + size_t i; + + if (nhostdevs == 0) + return; + + virObjectLock(mgr->activeMediatedHostdevs); + for (i = 0; i < nhostdevs; i++) { + virMediatedDevicePtr mdev, tmp; + virDomainHostdevSubsysMediatedDevPtr mdevsrc; + virDomainHostdevDefPtr hostdev = hostdevs[i]; + + mdevsrc = &hostdev->source.subsys.u.mdev; + + if (hostdev->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS || + hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV) { + continue; + } + + if (!(mdev = virMediatedDeviceNew(mdevsrc->uuidstr, + mdevsrc->model))) + continue; + + /* Remove from the list only mdevs assigned to @drv_name/@dom_name */ + + tmp = virMediatedDeviceListFind(mgr->activeMediatedHostdevs, mdev); + virMediatedDeviceFree(mdev); + + /* skip inactive devices */ + if (!tmp) + continue; + + virMediatedDeviceGetUsedBy(tmp, &used_by_drvname, &used_by_domname); + if (STREQ_NULLABLE(drv_name, used_by_drvname) && + STREQ_NULLABLE(dom_name, used_by_domname)) { + VIR_DEBUG("Removing %s dom=%s from activeMediatedHostdevs", + mdevsrc->uuidstr, dom_name); + virMediatedDeviceListDel(mgr->activeMediatedHostdevs, tmp); + } + } + virObjectUnlock(mgr->activeMediatedHostdevs); +} + int virHostdevPCINodeDeviceDetach(virHostdevManagerPtr mgr, virPCIDevicePtr pci) diff --git a/src/util/virhostdev.h b/src/util/virhostdev.h index 1202136c29..42e898211e 100644 --- a/src/util/virhostdev.h +++ b/src/util/virhostdev.h @@ -32,6 +32,7 @@ # include "virscsi.h" # include "virscsivhost.h" # include "conf/domain_conf.h" +# include "virmdev.h"
typedef enum { VIR_HOSTDEV_STRICT_ACS_CHECK = (1 << 0), /* strict acs check */ @@ -55,6 +56,7 @@ struct _virHostdevManager { virUSBDeviceListPtr activeUSBHostdevs; virSCSIDeviceListPtr activeSCSIHostdevs; virSCSIVHostDeviceListPtr activeSCSIVHostHostdevs; + virMediatedDeviceListPtr activeMediatedHostdevs; };
virHostdevManagerPtr virHostdevManagerGetDefault(void); @@ -96,6 +98,13 @@ virHostdevPrepareSCSIVHostDevices(virHostdevManagerPtr hostdev_mgr, virDomainHostdevDefPtr *hostdevs, int nhostdevs) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3); +int +virHostdevPrepareMediatedDevices(virHostdevManagerPtr hostdev_mgr, + const char *drv_name, + const char *dom_name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3); void virHostdevReAttachPCIDevices(virHostdevManagerPtr hostdev_mgr, const char *drv_name, @@ -125,6 +134,13 @@ virHostdevReAttachSCSIVHostDevices(virHostdevManagerPtr hostdev_mgr, virDomainHostdevDefPtr *hostdevs, int nhostdevs) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3); +void +virHostdevReAttachMediatedDevices(virHostdevManagerPtr hostdev_mgr, + const char *drv_name, + const char *dom_name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3); int virHostdevUpdateActivePCIDevices(virHostdevManagerPtr mgr, virDomainHostdevDefPtr *hostdevs, @@ -147,6 +163,13 @@ virHostdevUpdateActiveSCSIDevices(virHostdevManagerPtr mgr, const char *dom_name) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(4) ATTRIBUTE_NONNULL(5); int +virHostdevUpdateActiveMediatedDevices(virHostdevManagerPtr mgr, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs, + const char *drv_name, + const char *dom_name) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(4) ATTRIBUTE_NONNULL(5); +int virHostdevUpdateActiveDomainDevices(virHostdevManagerPtr mgr, const char *driver, virDomainDefPtr def,
ACK.

On Sun, Mar 26, 2017 at 03:00:58PM -0400, Laine Stump wrote:
(I'm unable to apply this patch to the head of master with "git am -3", and it won't show me the conflicts (it just fails saying "fatal: sha1 information is lacking or useless (src/libvirt_private.syms), error: could not build fake ancestor". Because of this, all further review is based purely on examining the patch emails.)
Hmm, I also got a merge conflict (not sure if it's the same one you got) at virHostdevReAttachMediatedDevices, but not with a fatal error. Wondering what the problem at your end might be.
BTW, I'm assuming that you've run "make syntax-check && make check" as each patch is applied, so that we're sure the functions in
I have.
libvirt_private.syms are in proper alphabetic order (among other things). I've run it at a few stages of applying the patches, but not all.
[..]
void +qemuHostdevReAttachMediatedDevices(virQEMUDriverPtr driver, + const char *name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs) +{ + virHostdevManagerPtr hostdev_mgr = driver->hostdevMgr; + + virHostdevReAttachMediatedDevices(hostdev_mgr, QEMU_DRIVER_NAME, + name, hostdevs, nhostdevs);
What does "ReAttach" mean for a mediated device? Isn't this a NOP?
Oh, nevermind - I looked ahead and compared to the PCI version of the ReAttach function. Turns out the PCI version does a *bunch* of things other than reattaching the device to its host driver, including restoring netdev config (which doesn't apply for mdevs and calling virPCIDeviceReset() (which is a NOP even for PCI devices as long as we're using VFIO, which *everybody* is these days). *BUT* the one other thing it does is move the devices that had been in use by the domain from the active list to the inactive list, and that's what the mdev version of the function does.
Yeah, this was merely just to stay consistent, I hated it from the very moment I introduced the function, but I checked with other types of devices and thought "it just might be easier to read", but I will add a comment that it's there just because of consistency reasons and re-attach doesn't make any sense for mdevs.
Someday in the future we may want to clean these up and rename the functions appropriately, but for now having them named similarly makes it easier to review, so forget I said anything :-)
Yep. Erik

On Wed, Mar 22, 2017 at 04:27:37PM +0100, Erik Skultety wrote:
Keep track of the assigned mediated devices the same way we do it for the rest of hostdevs. Methods like 'Prepare', 'Update', and 'ReAttach' are introduced by this patch.
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/libvirt_private.syms | 3 + src/qemu/qemu_hostdev.c | 56 ++++++++++++++++ src/qemu/qemu_hostdev.h | 10 +++ src/util/virhostdev.c | 165 ++++++++++++++++++++++++++++++++++++++++++++++- src/util/virhostdev.h | 23 +++++++ 5 files changed, 256 insertions(+), 1 deletion(-)
@@ -305,6 +324,24 @@ qemuHostdevPrepareSCSIVHostDevices(virQEMUDriverPtr driver, }
int +qemuHostdevPrepareMediatedDevices(virQEMUDriverPtr driver, + const char *name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs) +{ + virHostdevManagerPtr hostdev_mgr = driver->hostdevMgr; + + if (!qemuHostdevHostSupportsPassthroughVFIO()) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("host doesn't support VFIO PCI interface")); + return -1; + }
This is unconditionally breaking *all* use of host devices on libvirt if the system lacks VFIO, as it is not actually checking if any of the 'hostdevs' are actually mediated devices, or indeed whether they are even PCI devices. ie i can no longer boot a guest that uses USB host device passthrough.
+ + return virHostdevPrepareMediatedDevices(hostdev_mgr, QEMU_DRIVER_NAME, + name, hostdevs, nhostdevs); +} + +int qemuHostdevPrepareDomainDevices(virQEMUDriverPtr driver, virDomainDefPtr def, virQEMUCapsPtr qemuCaps, @@ -330,6 +367,10 @@ qemuHostdevPrepareDomainDevices(virQEMUDriverPtr driver, def->hostdevs, def->nhostdevs) < 0) return -1;
+ if (qemuHostdevPrepareMediatedDevices(driver, def->name, + def->hostdevs, def->nhostdevs) < 0) + return -1;
Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://entangle-photo.org -o- http://search.cpan.org/~danberr/ :|

As goes for all the other hostdev device types, grant the qemu process access to /dev/vfio/<mediated_device_iommu_group>. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/qemu/qemu_domain.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index c0f060b0a3..d1ac1d641b 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -6999,10 +6999,12 @@ qemuDomainGetHostdevPath(virDomainDefPtr def, virDomainHostdevSubsysPCIPtr pcisrc = &dev->source.subsys.u.pci; virDomainHostdevSubsysSCSIPtr scsisrc = &dev->source.subsys.u.scsi; virDomainHostdevSubsysSCSIVHostPtr hostsrc = &dev->source.subsys.u.scsi_host; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &dev->source.subsys.u.mdev; virPCIDevicePtr pci = NULL; virUSBDevicePtr usb = NULL; virSCSIDevicePtr scsi = NULL; virSCSIVHostDevicePtr host = NULL; + virMediatedDevicePtr mdev = NULL; char *tmpPath = NULL; bool freeTmpPath = false; bool includeVFIO = false; @@ -7103,6 +7105,17 @@ qemuDomainGetHostdevPath(virDomainDefPtr def, } case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + if (!(mdev = virMediatedDeviceNew(mdevsrc->uuidstr, + mdevsrc->model))) + goto cleanup; + + if (!(tmpPath = virMediatedDeviceGetIOMMUGroupDev(mdev))) + goto cleanup; + + freeTmpPath = true; + includeVFIO = true; + perm = VIR_CGROUP_DEVICE_RW; + break; case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: break; } @@ -7152,6 +7165,7 @@ qemuDomainGetHostdevPath(virDomainDefPtr def, virUSBDeviceFree(usb); virSCSIDeviceFree(scsi); virSCSIVHostDeviceFree(host); + virMediatedDeviceFree(mdev); if (freeTmpPath) VIR_FREE(tmpPath); return ret; -- 2.12.1

On 03/22/2017 11:27 AM, Erik Skultety wrote:
As goes for all the other hostdev device types, grant the qemu process access to /dev/vfio/<mediated_device_iommu_group>.
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/qemu/qemu_domain.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index c0f060b0a3..d1ac1d641b 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -6999,10 +6999,12 @@ qemuDomainGetHostdevPath(virDomainDefPtr def, virDomainHostdevSubsysPCIPtr pcisrc = &dev->source.subsys.u.pci; virDomainHostdevSubsysSCSIPtr scsisrc = &dev->source.subsys.u.scsi; virDomainHostdevSubsysSCSIVHostPtr hostsrc = &dev->source.subsys.u.scsi_host; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &dev->source.subsys.u.mdev; virPCIDevicePtr pci = NULL; virUSBDevicePtr usb = NULL; virSCSIDevicePtr scsi = NULL; virSCSIVHostDevicePtr host = NULL; + virMediatedDevicePtr mdev = NULL; char *tmpPath = NULL; bool freeTmpPath = false; bool includeVFIO = false; @@ -7103,6 +7105,17 @@ qemuDomainGetHostdevPath(virDomainDefPtr def, }
case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + if (!(mdev = virMediatedDeviceNew(mdevsrc->uuidstr, + mdevsrc->model))) + goto cleanup; + + if (!(tmpPath = virMediatedDeviceGetIOMMUGroupDev(mdev))) + goto cleanup; + + freeTmpPath = true; + includeVFIO = true; + perm = VIR_CGROUP_DEVICE_RW; + break; case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: break; } @@ -7152,6 +7165,7 @@ qemuDomainGetHostdevPath(virDomainDefPtr def, virUSBDeviceFree(usb); virSCSIDeviceFree(scsi); virSCSIVHostDeviceFree(host); + virMediatedDeviceFree(mdev); if (freeTmpPath) VIR_FREE(tmpPath); return ret;
Yep. Faithful recreation of what's done to PCI devices. ACK.

Since mdevs are just another type of VFIO devices, we should increase the memory locking limit the same way we do for VFIO PCI devices. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/qemu/qemu_domain.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index d1ac1d641b..04e64b47ea 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -6347,11 +6347,12 @@ qemuDomainRequiresMemLock(virDomainDefPtr def) return true; for (i = 0; i < def->nhostdevs; i++) { - virDomainHostdevDefPtr dev = def->hostdevs[i]; + virDomainHostdevSubsysPtr subsys = &def->hostdevs[i]->source.subsys; - if (dev->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS && - dev->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && - dev->source.subsys.u.pci.backend == VIR_DOMAIN_HOSTDEV_PCI_BACKEND_VFIO) + if (def->hostdevs[i]->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS && + (subsys->type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV || + (subsys->type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && + subsys->u.pci.backend == VIR_DOMAIN_HOSTDEV_PCI_BACKEND_VFIO))) return true; } -- 2.12.1

On 03/22/2017 11:27 AM, Erik Skultety wrote:
Since mdevs are just another type of VFIO devices, we should increase the memory locking limit the same way we do for VFIO PCI devices.
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/qemu/qemu_domain.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index d1ac1d641b..04e64b47ea 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -6347,11 +6347,12 @@ qemuDomainRequiresMemLock(virDomainDefPtr def) return true;
for (i = 0; i < def->nhostdevs; i++) { - virDomainHostdevDefPtr dev = def->hostdevs[i]; + virDomainHostdevSubsysPtr subsys = &def->hostdevs[i]->source.subsys;
- if (dev->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS && - dev->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && - dev->source.subsys.u.pci.backend == VIR_DOMAIN_HOSTDEV_PCI_BACKEND_VFIO) + if (def->hostdevs[i]->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS && + (subsys->type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV || + (subsys->type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && + subsys->u.pci.backend == VIR_DOMAIN_HOSTDEV_PCI_BACKEND_VFIO))) return true; }
ACK.

On Wed, 2017-03-22 at 16:27 +0100, Erik Skultety wrote:
Since mdevs are just another type of VFIO devices, we should increase the memory locking limit the same way we do for VFIO PCI devices. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/qemu/qemu_domain.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index d1ac1d641b..04e64b47ea 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -6347,11 +6347,12 @@ qemuDomainRequiresMemLock(virDomainDefPtr def) return true; for (i = 0; i < def->nhostdevs; i++) { - virDomainHostdevDefPtr dev = def->hostdevs[i]; + virDomainHostdevSubsysPtr subsys = &def->hostdevs[i]->source.subsys; - if (dev->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS && - dev->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && - dev->source.subsys.u.pci.backend == VIR_DOMAIN_HOSTDEV_PCI_BACKEND_VFIO) + if (def->hostdevs[i]->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS && + (subsys->type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV || + (subsys->type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && + subsys->u.pci.backend == VIR_DOMAIN_HOSTDEV_PCI_BACKEND_VFIO))) return true; }
Now that we have test suite coverage for the calculation of memory locking limits (qemumemlocktest), it would be neat if you could add test cases for mdevs as well. Maybe the comment above the loop (which has been moved to the qemuDomainGetMemLockLimitBytes() function now) could be updated to mention mdev-specific information, if any? -- Andrea Bolognani / Red Hat / Virtualization

On Wed, Mar 29, 2017 at 10:41:52AM +0200, Andrea Bolognani wrote:
On Wed, 2017-03-22 at 16:27 +0100, Erik Skultety wrote:
Since mdevs are just another type of VFIO devices, we should increase the memory locking limit the same way we do for VFIO PCI devices.
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/qemu/qemu_domain.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index d1ac1d641b..04e64b47ea 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -6347,11 +6347,12 @@ qemuDomainRequiresMemLock(virDomainDefPtr def) return true;
for (i = 0; i < def->nhostdevs; i++) { - virDomainHostdevDefPtr dev = def->hostdevs[i]; + virDomainHostdevSubsysPtr subsys = &def->hostdevs[i]->source.subsys;
- if (dev->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS && - dev->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && - dev->source.subsys.u.pci.backend == VIR_DOMAIN_HOSTDEV_PCI_BACKEND_VFIO) + if (def->hostdevs[i]->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS && + (subsys->type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV || + (subsys->type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && + subsys->u.pci.backend == VIR_DOMAIN_HOSTDEV_PCI_BACKEND_VFIO))) return true; }
Now that we have test suite coverage for the calculation of memory locking limits (qemumemlocktest), it would be neat if you could add test cases for mdevs as well.
I'll look into that, thanks for raising this up.
Maybe the comment above the loop (which has been moved to the qemuDomainGetMemLockLimitBytes() function now) could be updated to mention mdev-specific information, if any?
None that I'm aware of at the moment, but I'll discuss this with Alex just to be sure. Erik
-- Andrea Bolognani / Red Hat / Virtualization

Format the mediated devices on the qemu command line as -device vfio-pci,sysfsdev='/path/to/device/in/syfs'. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/qemu/qemu_command.c | 45 +++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_command.h | 5 +++++ 2 files changed, 50 insertions(+) diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 2045c2e7cf..2a2ab3e9b0 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -58,6 +58,7 @@ #include "virscsi.h" #include "virnuma.h" #include "virgic.h" +#include "virmdev.h" #if defined(__linux__) # include <linux/capability.h> #endif @@ -5220,6 +5221,31 @@ qemuBuildChrChardevStr(virLogManagerPtr logManager, return ret; } +char * +qemuBuildHostdevMediatedDevStr(const virDomainDef *def, + virDomainHostdevDefPtr dev, + virQEMUCapsPtr qemuCaps) +{ + virBuffer buf = VIR_BUFFER_INITIALIZER; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &dev->source.subsys.u.mdev; + char *ret = NULL; + + virBufferAddLit(&buf, "vfio-pci"); + virBufferAsprintf(&buf, ",sysfsdev=%s", + virMediatedDeviceGetSysfsPath(mdevsrc->uuidstr)); + + if (qemuBuildDeviceAddressStr(&buf, def, dev->info, qemuCaps) < 0) + goto cleanup; + + if (virBufferCheckError(&buf) < 0) + goto cleanup; + + ret = virBufferContentAndReset(&buf); + + cleanup: + virBufferFreeAndReset(&buf); + return ret; +} static int qemuBuildHostdevCommandLine(virCommandPtr cmd, @@ -5408,6 +5434,25 @@ qemuBuildHostdevCommandLine(virCommandPtr cmd, VIR_FREE(devstr); } } + + /* MDEV */ + if (hostdev->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS && + subsys->type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV) { + + if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE_VFIO_PCI)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("VFIO PCI device assignment is not " + "supported by this version of qemu")); + return -1; + } + + virCommandAddArg(cmd, "-device"); + if (!(devstr = + qemuBuildHostdevMediatedDevStr(def, hostdev, qemuCaps))) + return -1; + virCommandAddArg(cmd, devstr); + VIR_FREE(devstr); + } } return 0; diff --git a/src/qemu/qemu_command.h b/src/qemu/qemu_command.h index f3ed9e7e4e..7da92c8c98 100644 --- a/src/qemu/qemu_command.h +++ b/src/qemu/qemu_command.h @@ -168,6 +168,11 @@ qemuBuildSCSIVHostHostdevDevStr(const virDomainDef *def, virQEMUCapsPtr qemuCaps, char *vhostfdName); +char * +qemuBuildHostdevMediatedDevStr(const virDomainDef *def, + virDomainHostdevDefPtr dev, + virQEMUCapsPtr qemuCaps); + char *qemuBuildRedirdevDevStr(const virDomainDef *def, virDomainRedirdevDefPtr dev, virQEMUCapsPtr qemuCaps); -- 2.12.1

On 03/22/2017 11:27 AM, Erik Skultety wrote:
Format the mediated devices on the qemu command line as -device vfio-pci,sysfsdev='/path/to/device/in/syfs'.
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/qemu/qemu_command.c | 45 +++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_command.h | 5 +++++ 2 files changed, 50 insertions(+)
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 2045c2e7cf..2a2ab3e9b0 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -58,6 +58,7 @@ #include "virscsi.h" #include "virnuma.h" #include "virgic.h" +#include "virmdev.h" #if defined(__linux__) # include <linux/capability.h> #endif @@ -5220,6 +5221,31 @@ qemuBuildChrChardevStr(virLogManagerPtr logManager, return ret; }
+char * +qemuBuildHostdevMediatedDevStr(const virDomainDef *def, + virDomainHostdevDefPtr dev, + virQEMUCapsPtr qemuCaps) +{ + virBuffer buf = VIR_BUFFER_INITIALIZER; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &dev->source.subsys.u.mdev; + char *ret = NULL; + + virBufferAddLit(&buf, "vfio-pci"); + virBufferAsprintf(&buf, ",sysfsdev=%s", + virMediatedDeviceGetSysfsPath(mdevsrc->uuidstr)); + + if (qemuBuildDeviceAddressStr(&buf, def, dev->info, qemuCaps) < 0) + goto cleanup; + + if (virBufferCheckError(&buf) < 0) + goto cleanup; + + ret = virBufferContentAndReset(&buf); + + cleanup: + virBufferFreeAndReset(&buf); + return ret; +}
static int qemuBuildHostdevCommandLine(virCommandPtr cmd, @@ -5408,6 +5434,25 @@ qemuBuildHostdevCommandLine(virCommandPtr cmd, VIR_FREE(devstr); } } + + /* MDEV */ + if (hostdev->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS && + subsys->type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV) { + + if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE_VFIO_PCI)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("VFIO PCI device assignment is not " + "supported by this version of qemu")); + return -1; + } + + virCommandAddArg(cmd, "-device"); + if (!(devstr = + qemuBuildHostdevMediatedDevStr(def, hostdev, qemuCaps))) + return -1; + virCommandAddArg(cmd, devstr); + VIR_FREE(devstr); + } }
return 0; diff --git a/src/qemu/qemu_command.h b/src/qemu/qemu_command.h index f3ed9e7e4e..7da92c8c98 100644 --- a/src/qemu/qemu_command.h +++ b/src/qemu/qemu_command.h @@ -168,6 +168,11 @@ qemuBuildSCSIVHostHostdevDevStr(const virDomainDef *def, virQEMUCapsPtr qemuCaps, char *vhostfdName);
+char * +qemuBuildHostdevMediatedDevStr(const virDomainDef *def, + virDomainHostdevDefPtr dev, + virQEMUCapsPtr qemuCaps); + char *qemuBuildRedirdevDevStr(const virDomainDef *def, virDomainRedirdevDefPtr dev, virQEMUCapsPtr qemuCaps);
ACK.

For now, these only cover the unmanaged, i.e. user pre-created devices. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- ...ml2argv-hostdev-mdev-invalid-target-address.xml | 33 ++++++++++++++++++ ...muxml2argv-hostdev-mdev-src-address-invalid.xml | 35 +++++++++++++++++++ .../qemuxml2argv-hostdev-mdev-unmanaged.args | 25 ++++++++++++++ .../qemuxml2argv-hostdev-mdev-unmanaged.xml | 35 +++++++++++++++++++ tests/qemuxml2argvtest.c | 9 +++++ .../qemuxml2xmlout-hostdev-mdev-unmanaged.xml | 40 ++++++++++++++++++++++ tests/qemuxml2xmltest.c | 1 + 7 files changed, 178 insertions(+) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-invalid-target-address.xml create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-src-address-invalid.xml create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.xml create mode 100644 tests/qemuxml2xmloutdata/qemuxml2xmlout-hostdev-mdev-unmanaged.xml diff --git a/tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-invalid-target-address.xml b/tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-invalid-target-address.xml new file mode 100644 index 0000000000..e522febaa9 --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-invalid-target-address.xml @@ -0,0 +1,33 @@ +<domain type='qemu'> + <name>QEMUGuest2</name> + <uuid>c7a5fdbd-edaf-9466-926a-d65c16db1809</uuid> + <memory unit='KiB'>219100</memory> + <currentMemory unit='KiB'>219100</currentMemory> + <vcpu placement='static'>1</vcpu> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu</emulator> + <disk type='block' device='disk'> + <source dev='/dev/HostVG/QEMUGuest2'/> + <target dev='hda' bus='ide'/> + <address type='drive' controller='0' bus='0' target='0' unit='0'/> + </disk> + <controller type='usb' index='0'/> + <controller type='ide' index='0'/> + <controller type='pci' index='0' model='pci-root'/> + <hostdev mode='subsystem' type='mdev' model='vfio-pci'> + <source> + <address uuid='53764d0e-85a0-42b4-af5c-2046b460b1dc'/> + </source> + <address type='drive' controller='0' bus='0' target='0' unit='1'/> + </hostdev> + <memballoon model='none'/> + </devices> +</domain> diff --git a/tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-src-address-invalid.xml b/tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-src-address-invalid.xml new file mode 100644 index 0000000000..707a239f24 --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-src-address-invalid.xml @@ -0,0 +1,35 @@ +<domain type='qemu'> + <name>QEMUGuest2</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>1</vcpu> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu</emulator> + <disk type='block' device='disk'> + <driver name='qemu' type='raw'/> + <source dev='/dev/HostVG/QEMUGuest2'/> + <target dev='hda' bus='ide'/> + <address type='drive' controller='0' bus='0' target='0' unit='0'/> + </disk> + <controller type='usb' index='0'> + </controller> + <controller type='pci' index='0' model='pci-root'/> + <controller type='ide' index='0'> + </controller> + <hostdev mode='subsystem' type='mdev' model='vfio-pci'> + <source> + <address domain='0x0000' bus='0x06' slot='0x12' function='0x5'/> + </source> + </hostdev> + <memballoon model='none'/> + </devices> +</domain> diff --git a/tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.args b/tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.args new file mode 100644 index 0000000000..fdefeb6104 --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.args @@ -0,0 +1,25 @@ +LC_ALL=C \ +PATH=/bin \ +HOME=/home/test \ +USER=test \ +LOGNAME=test \ +QEMU_AUDIO_DRV=none \ +/usr/bin/qemu \ +-name QEMUGuest2 \ +-S \ +-M pc \ +-m 214 \ +-smp 1,sockets=1,cores=1,threads=1 \ +-uuid c7a5fdbd-edaf-9455-926a-d65c16db1809 \ +-nographic \ +-nodefconfig \ +-nodefaults \ +-monitor unix:/tmp/lib/domain--1-QEMUGuest2/monitor.sock,server,nowait \ +-no-acpi \ +-boot c \ +-usb \ +-drive file=/dev/HostVG/QEMUGuest2,format=raw,if=none,id=drive-ide0-0-0 \ +-device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 \ +-device vfio-pci,\ +sysfsdev=/sys/bus/mdev/devices/53764d0e-85a0-42b4-af5c-2046b460b1dc,bus=pci.0,\ +addr=0x3 diff --git a/tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.xml b/tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.xml new file mode 100644 index 0000000000..2ea18a2b43 --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.xml @@ -0,0 +1,35 @@ +<domain type='qemu'> + <name>QEMUGuest2</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>1</vcpu> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu</emulator> + <disk type='block' device='disk'> + <driver name='qemu' type='raw'/> + <source dev='/dev/HostVG/QEMUGuest2'/> + <target dev='hda' bus='ide'/> + <address type='drive' controller='0' bus='0' target='0' unit='0'/> + </disk> + <controller type='usb' index='0'> + </controller> + <controller type='pci' index='0' model='pci-root'/> + <controller type='ide' index='0'> + </controller> + <hostdev mode='subsystem' type='mdev' model='vfio-pci'> + <source> + <address uuid='53764d0e-85a0-42b4-af5c-2046b460b1dc'/> + </source> + </hostdev> + <memballoon model='none'/> + </devices> +</domain> diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c index 64e14af891..15b65dafb5 100644 --- a/tests/qemuxml2argvtest.c +++ b/tests/qemuxml2argvtest.c @@ -1469,6 +1469,15 @@ mymain(void) DO_TEST("hostdev-vfio-multidomain", QEMU_CAPS_NODEFCONFIG, QEMU_CAPS_DEVICE_VFIO_PCI, QEMU_CAPS_HOST_PCI_MULTIDOMAIN); + DO_TEST("hostdev-mdev-unmanaged", + QEMU_CAPS_NODEFCONFIG, + QEMU_CAPS_DEVICE_VFIO_PCI); + DO_TEST_PARSE_ERROR("hostdev-mdev-src-address-invalid", + QEMU_CAPS_NODEFCONFIG, + QEMU_CAPS_DEVICE_VFIO_PCI); + DO_TEST_PARSE_ERROR("hostdev-mdev-invalid-target-address", + QEMU_CAPS_NODEFCONFIG, + QEMU_CAPS_DEVICE_VFIO_PCI); DO_TEST_FAILURE("hostdev-vfio-multidomain", QEMU_CAPS_NODEFCONFIG, QEMU_CAPS_DEVICE_VFIO_PCI); DO_TEST("pci-rom", diff --git a/tests/qemuxml2xmloutdata/qemuxml2xmlout-hostdev-mdev-unmanaged.xml b/tests/qemuxml2xmloutdata/qemuxml2xmlout-hostdev-mdev-unmanaged.xml new file mode 100644 index 0000000000..328b3aeb2f --- /dev/null +++ b/tests/qemuxml2xmloutdata/qemuxml2xmlout-hostdev-mdev-unmanaged.xml @@ -0,0 +1,40 @@ +<domain type='qemu'> + <name>QEMUGuest2</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>1</vcpu> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu</emulator> + <disk type='block' device='disk'> + <driver name='qemu' type='raw'/> + <source dev='/dev/HostVG/QEMUGuest2'/> + <target dev='hda' bus='ide'/> + <address type='drive' controller='0' bus='0' target='0' unit='0'/> + </disk> + <controller type='usb' index='0'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/> + </controller> + <controller type='pci' index='0' model='pci-root'/> + <controller type='ide' index='0'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/> + </controller> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <hostdev mode='subsystem' type='mdev' managed='no' model='vfio-pci'> + <source> + <address uuid='53764d0e-85a0-42b4-af5c-2046b460b1dc'/> + </source> + <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> + </hostdev> + <memballoon model='none'/> + </devices> +</domain> diff --git a/tests/qemuxml2xmltest.c b/tests/qemuxml2xmltest.c index 266b9c0233..81ea0cc7b0 100644 --- a/tests/qemuxml2xmltest.c +++ b/tests/qemuxml2xmltest.c @@ -557,6 +557,7 @@ mymain(void) DO_TEST("hostdev-usb-address", NONE); DO_TEST("hostdev-pci-address", NONE); DO_TEST("hostdev-vfio", NONE); + DO_TEST("hostdev-mdev-unmanaged", NONE); DO_TEST("pci-rom", NONE); DO_TEST("pci-serial-dev-chardev", NONE); -- 2.12.1

On 03/22/2017 11:27 AM, Erik Skultety wrote:
For now, these only cover the unmanaged, i.e. user pre-created devices.
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- ...ml2argv-hostdev-mdev-invalid-target-address.xml | 33 ++++++++++++++++++ ...muxml2argv-hostdev-mdev-src-address-invalid.xml | 35 +++++++++++++++++++ .../qemuxml2argv-hostdev-mdev-unmanaged.args | 25 ++++++++++++++ .../qemuxml2argv-hostdev-mdev-unmanaged.xml | 35 +++++++++++++++++++ tests/qemuxml2argvtest.c | 9 +++++ .../qemuxml2xmlout-hostdev-mdev-unmanaged.xml | 40 ++++++++++++++++++++++ tests/qemuxml2xmltest.c | 1 + 7 files changed, 178 insertions(+) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-invalid-target-address.xml create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-src-address-invalid.xml create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.xml create mode 100644 tests/qemuxml2xmloutdata/qemuxml2xmlout-hostdev-mdev-unmanaged.xml
ACK, but you probably should rename the "unmanaged" to "preexisting" or something, since I think we've all agreed to no re-use "managed='yes'" to mdevs.

Signed-off-by: Erik Skultety <eskultet@redhat.com> --- docs/formatdomain.html.in | 46 ++++++++++++++++++++++++++++++++++++++++------ 1 file changed, 40 insertions(+), 6 deletions(-) diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index 4a3123e989..1eb6c44b6f 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -3886,6 +3886,19 @@ </devices> ...</pre> + <p>or:</p> + +<pre> + ... + <devices> + <hostdev mode='subsystem' type='mdev' model='vfio-pci'> + <source> + <address uuid='c2177883-f1bb-47f0-914d-32a22e3a8804'> + </source> + </hostdev> + </devices> + ...</pre> + <dl> <dt><code>hostdev</code></dt> <dd>The <code>hostdev</code> element is the main container for describing @@ -3930,12 +3943,22 @@ <code>type</code> passes all LUNs presented by a single HBA to the guest. </dd> + <dt><code>mdev</code></dt> + <dd>For mediated devices (<span class="since">Since 3.2.0</span>) + the <code>model</code> attribute specifies the device API which + determines how the host's vfio driver will expose the device to the + guest. Currently, only <code>vfio-pci</code> model is supported. + There are also some implications on the usage of guest's address type + depending on the <code>model</code> attribute, see the + <code>address</code> element below.</dd> </dl> <p> - Note: The <code>managed</code> attribute is only used with PCI devices - and is ignored by all the other device types, thus setting - <code>managed</code> explicitly with other than PCI device has the same - effect as omitting it. + Note: The <code>managed</code> attribute is only used with PCI and is + ignored by all the other device types, thus setting + <code>managed</code> explicitly with other than a PCI device has the + same effect as omitting it. Similarly, <code>model</code> attribute is + only supported by mediated devices and ignored by all other device + types. </p> </dd> <dt><code>source</code></dt> @@ -4000,6 +4023,12 @@ is the vhost_scsi wwpn (16 hexadecimal digits with a prefix of "naa.") established in the host configfs. </dd> + <dt><code>mdev</code></dt> + <dd>Mediated devices (<span class="since">Since 3.2.0</span>) are + described by the <code>address</code> element. The + <code>address</code> element contains so far a single mandatory + attribute <code>uuid</code>. + </dd> </dl> </dd> <dt><code>vendor</code>, <code>product</code></dt> @@ -4043,8 +4072,13 @@ For PCI devices the element carries 4 attributes allowing to designate the device as can be found with the <code>lspci</code> or with <code>virsh nodedev-list</code>. For SCSI devices a 'drive' - address type must be used. <a href="#elementsAddress">See above</a> for - more details on the address element.</dd> + address type must be used. For mediated devices, which are only software + devices defining an allocation of resources on the physical parent device, + the address type used must conform to the <code>model</code> attribute + of element <code>hostdev</code>, e.g. any address type other than PCI for + <code>vfio-pci</code> device API will result in an error. + <a href="#elementsAddress">See above</a> for more details on the address + element.</dd> <dt><code>driver</code></dt> <dd> PCI devices can have an optional <code>driver</code> -- 2.12.1

On 03/22/2017 11:27 AM, Erik Skultety wrote:
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- docs/formatdomain.html.in | 46 ++++++++++++++++++++++++++++++++++++++++------ 1 file changed, 40 insertions(+), 6 deletions(-)
I always like to put the docs changes in the same patch that modifies the schema and XML parsing functions, but that's not a hard rule, so I'm fine with this.
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index 4a3123e989..1eb6c44b6f 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -3886,6 +3886,19 @@ </devices> ...</pre>
+ <p>or:</p> + +<pre> + ... + <devices> + <hostdev mode='subsystem' type='mdev' model='vfio-pci'> + <source> + <address uuid='c2177883-f1bb-47f0-914d-32a22e3a8804'> + </source> + </hostdev> + </devices> + ...</pre> + <dl> <dt><code>hostdev</code></dt> <dd>The <code>hostdev</code> element is the main container for describing @@ -3930,12 +3943,22 @@ <code>type</code> passes all LUNs presented by a single HBA to the guest. </dd> + <dt><code>mdev</code></dt> + <dd>For mediated devices (<span class="since">Since 3.2.0</span>) + the <code>model</code> attribute specifies the device API which + determines how the host's vfio driver will expose the device to the + guest. Currently, only <code>vfio-pci</code> model is supported.
either "only the vfio-pci model is supported" or "only model='vfio-pci' is supported".
+ There are also some implications on the usage of guest's address type + depending on the <code>model</code> attribute, see the + <code>address</code> element below.</dd>
I'm not really comfortable with this - it means that the code that parses <address> has to have information from a higher level rather than being self contained. Remind me why you decided to remove "type='mdev'" (or whatever it was) from the <address> syntax?
</dl> <p> - Note: The <code>managed</code> attribute is only used with PCI devices
s/PCI/type='pci'/
- and is ignored by all the other device types, thus setting - <code>managed</code> explicitly with other than PCI device has the same - effect as omitting it. + Note: The <code>managed</code> attribute is only used with PCI and is + ignored by all the other device types, thus setting + <code>managed</code> explicitly with other than a PCI device has the + same effect as omitting it. Similarly, <code>model</code> attribute is + only supported by mediated devices and ignored by all other device + types.
Do we want to ignore it, or log an error if someone specifies it when they shouldn't? Doing the latter would prevent someone from thinking they've done "A" when actually they've done "B".
</p> </dd> <dt><code>source</code></dt> @@ -4000,6 +4023,12 @@ is the vhost_scsi wwpn (16 hexadecimal digits with a prefix of "naa.") established in the host configfs. </dd> + <dt><code>mdev</code></dt> + <dd>Mediated devices (<span class="since">Since 3.2.0</span>) are + described by the <code>address</code> element. The + <code>address</code> element contains so far a single mandatory
s/so far//
+ attribute <code>uuid</code>. + </dd> </dl> </dd> <dt><code>vendor</code>, <code>product</code></dt> @@ -4043,8 +4072,13 @@ For PCI devices the element carries 4 attributes allowing to designate the device as can be found with the <code>lspci</code> or with <code>virsh nodedev-list</code>. For SCSI devices a 'drive' - address type must be used. <a href="#elementsAddress">See above</a> for - more details on the address element.</dd> + address type must be used. For mediated devices, which are only software
s/only software/software-only/
+ devices defining an allocation of resources on the physical parent device, + the address type used must conform to the <code>model</code> attribute + of element <code>hostdev</code>, e.g. any address type other than PCI for + <code>vfio-pci</code> device API will result in an error. + <a href="#elementsAddress">See above</a> for more details on the address + element.</dd> <dt><code>driver</code></dt> <dd> PCI devices can have an optional <code>driver</code>
I'm still unsure about having no type specifier in <address>. I know that's already the case for PCI addresses in <source>, but I actually think that was an incorrect decision - it's cleaner if we can parse/format <address> without reaching up into higher levels. Maybe I can be convinced though :-) In the meantime, I think it would be better to get this all in and get some solid testing outside of the few people directly involved, so ACK to this patch too (which makes the entire series ACKed (with a few small changes requested), and we can have one last debate about <address> during the next couple days.

On Sun, Mar 26, 2017 at 03:21:04PM -0400, Laine Stump wrote:
On 03/22/2017 11:27 AM, Erik Skultety wrote:
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- docs/formatdomain.html.in | 46 ++++++++++++++++++++++++++++++++++++++++------ 1 file changed, 40 insertions(+), 6 deletions(-)
I always like to put the docs changes in the same patch that modifies the schema and XML parsing functions, but that's not a hard rule, so I'm fine with this.
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index 4a3123e989..1eb6c44b6f 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -3886,6 +3886,19 @@ </devices> ...</pre>
+ <p>or:</p> + +<pre> + ... + <devices> + <hostdev mode='subsystem' type='mdev' model='vfio-pci'> + <source> + <address uuid='c2177883-f1bb-47f0-914d-32a22e3a8804'> + </source> + </hostdev> + </devices> + ...</pre> + <dl> <dt><code>hostdev</code></dt> <dd>The <code>hostdev</code> element is the main container for describing @@ -3930,12 +3943,22 @@ <code>type</code> passes all LUNs presented by a single HBA to the guest. </dd> + <dt><code>mdev</code></dt> + <dd>For mediated devices (<span class="since">Since 3.2.0</span>) + the <code>model</code> attribute specifies the device API which + determines how the host's vfio driver will expose the device to the + guest. Currently, only <code>vfio-pci</code> model is supported.
either "only the vfio-pci model is supported" or "only model='vfio-pci' is supported".
+ There are also some implications on the usage of guest's address type + depending on the <code>model</code> attribute, see the + <code>address</code> element below.</dd>
I'm not really comfortable with this - it means that the code that parses <address> has to have information from a higher level rather than being self contained. Remind me why you decided to remove "type='mdev'" (or whatever it was) from the <address> syntax?
Because we're kind of abusing the <address> element when using it within source. If you look at our docs [1] we define it only for guest addresses where we actually require the type, but when we use it under source, we define what kind of attributes are expected. Internally, we have an enum for each address type we know (which we also check only when assigning guest addresses) which means that I would have to add the enum for mdev and also document it under the 'Device Addresses' paragraph (which I did before), but the only thing I documented then was that it's not supposed to be used as a guest address, except that the paragraph headline indicates it's indeed about guest addresses. Do you see the confusion in that, i.e. so do we or do we actually not support mdev as a guest address?? (that was a rhetorical question...) Also, besides our schema not being happy about that, you can happily define a domain with an source address element containing some additional rubbish attributes which will be silently ignored by the parser. [1] http://libvirt.org/formatdomain.html#elementsAddress
</dl> <p> - Note: The <code>managed</code> attribute is only used with PCI devices
s/PCI/type='pci'/
- and is ignored by all the other device types, thus setting - <code>managed</code> explicitly with other than PCI device has the same - effect as omitting it. + Note: The <code>managed</code> attribute is only used with PCI and is + ignored by all the other device types, thus setting + <code>managed</code> explicitly with other than a PCI device has the + same effect as omitting it. Similarly, <code>model</code> attribute is + only supported by mediated devices and ignored by all other device + types.
Do we want to ignore it, or log an error if someone specifies it when they shouldn't? Doing the latter would prevent someone from thinking they've done "A" when actually they've done "B".
As I wrote above, our parser is quite permissive in semantics and since usually the case is that we ignore everything that does not make sense for us and only error out if we got a value different from what we expected or we're missing a mandatory attribute in general or there is a conflict in attributes, etc. If someone then later complains that they have a different hostdev and expected the model to do something, we can still confront them with the documentation which states that it will be ignored for unsupported device types.
I'm still unsure about having no type specifier in <address>. I know that's already the case for PCI addresses in <source>, but I actually think that was an incorrect decision - it's cleaner if we can parse/format <address> without reaching up into higher levels. Maybe I can be convinced though :-)
See my response above
In the meantime, I think it would be better to get this all in and get some solid testing outside of the few people directly involved, so ACK to this patch too (which makes the entire series ACKed (with a few small changes requested), and we can have one last debate about <address> during the next couple days.
Thanks a lot for review, hopefully I didn't forget to adjust any of the patches and pushed the series. Erik

Verify Summary: * the none rooted mode starting a high-privileges VM actually. The configurations is source generated default value except tls disabled. 1. rooted virsh define ./libvirt/vgpu-win10.xml Domain vgpu-win10 defined from ./libvirt/vgpu-win10.xml ubuntu@z-nuc-11:~/vgpu-meta/libvirt-stage$ virsh start vgpu-win10 2017-03-26 23:28:57.385+0000: 2886: info : libvirt version: 3.2.0 2017-03-26 23:28:57.385+0000: 2886: info : hostname: z-nuc-11.maas 2017-03-26 23:28:57.385+0000: 2886: warning : qemuDomainObjTaint:4155 : Domain id=1 name='vgpu-win10' uuid=916c5c36-0437-11e7-a23d-830ed1295d00 is tainted: high-privileges 2017-03-26 23:28:58.010+0000: 2886: warning : virDomainAuditHostdev:456 : Unexpected hostdev type while encoding audit message: 4 Domain vgpu-win10 started 2. None rooted virsh -c qemu:///session Welcome to lt-virsh, the virtualization interactive terminal. virsh # define ./libvirt/vgpu-win10.xml Domain vgpu-win10 defined from ./libvirt/vgpu-win10.xml virsh # start vgpu-win10 2017-03-26 23:38:11.220+0000: 2882: warning : qemuDomainObjTaint:4155 : Domain id=4 name='vgpu-win10' uuid=916c5c36-0437-11e7-a23d-830ed1295d00 is tainted: high-privileges 2017-03-26 23:38:12.356+0000: 2882: warning : virDomainAuditHostdev:456 : Unexpected hostdev type while encoding audit message: 4 Domain vgpu-win10 started Regards Yongli He
since v1: - new <hostdev> attribute model introduced which tells libvirt which device API should be considered when auto-assigning guest address - device_api is properly checked, thus taking the 'model' attribute only as a hint to assign "some" address - new address type 'mdev' is introduced rather than using plain <uuid> element, since the address element is more conveniently extendable. - the emulated mtty driver now works as well out of the box, so no HW needed to review this series --> let's try it :) - fixed all the nits from v1
since v2: - dropped the patch introducing new address type 'mdev' since I added by mistake and only after that realized that the device address type enum is used for guest addresses only --> the mdevs are still identified by address element containing an 'uuid' attribute, I just dropped the enum - resolved the driver hostdev list race condition raised by Pavel in his review --> the device API is now checked every time our internal mdev object is created as opposed to the previous version where because of the model being checked separately, the locking issues arose. - rewrote the docs, reflecting the mdev address type drop change - squashed all security related stuff into 1 patch, also added app-armor bits - as Pavel suggested, moved most of the mdev-related functions out of virhostdev.c to virmdev.c - added a few more test cases - created a new branch 'mdev-next' on my github (more suitable name than a strict version number) on https://github.com/eskultety/libvirt/commits/mdev-next
since v3: - 'undo' an accidental squash of virmdev.{c,h} module introduction into patch 4/15 and made it a separate patch again - squash 5/15 into 4/15 as Pavel suggested - dropped the NEWS patch, as I've so far got at least 4 merge conflicts because of it when rebasing...I'll add it before the series is ready to be merged...or I'll forget about it like I usually do and add it later :/
Erik
Erik Skultety (14): conf: hostdev: Enforce enum-in-switch compile-time checks conf: hostdev: Introduce virDomainHostdevSubsysSCSIClear conf: Introduce virDomainHostdevDefPostParse util: Introduce new module virmdev conf: Introduce new hostdev device type mdev security: Enable labeling of vfio mediated devices conf: Enable cold-plug of a mediated device qemu: Assign PCI addresses for mediated devices as well hostdev: Maintain a driver list of active mediated devices qemu: cgroup: Adjust cgroups' logic to allow mediated devices qemu: Bump the memory locking limit for mdevs as well qemu: Format mdevs on qemu command line test: Add some test cases for our test suite regarding the mdevs docs: Document the new hostdev and address type 'mdev'
docs/formatdomain.html.in | 46 +- docs/schemas/domaincommon.rng | 22 + po/POTFILES.in | 1 + src/Makefile.am | 1 + src/conf/domain_conf.c | 225 ++++++++-- src/conf/domain_conf.h | 9 + src/libvirt_private.syms | 25 ++ src/qemu/qemu_command.c | 45 ++ src/qemu/qemu_command.h | 5 + src/qemu/qemu_domain.c | 24 +- src/qemu/qemu_domain.h | 1 + src/qemu/qemu_domain_address.c | 14 +- src/qemu/qemu_hostdev.c | 56 +++ src/qemu/qemu_hostdev.h | 10 + src/qemu/qemu_hotplug.c | 2 + src/security/security_apparmor.c | 22 + src/security/security_dac.c | 43 ++ src/security/security_selinux.c | 45 ++ src/util/virhostdev.c | 165 ++++++- src/util/virhostdev.h | 23 + src/util/virmdev.c | 487 +++++++++++++++++++++ src/util/virmdev.h | 123 ++++++ tests/domaincapsschemadata/full.xml | 1 + ...ml2argv-hostdev-mdev-invalid-target-address.xml | 33 ++ ...muxml2argv-hostdev-mdev-src-address-invalid.xml | 35 ++ .../qemuxml2argv-hostdev-mdev-unmanaged.args | 25 ++ .../qemuxml2argv-hostdev-mdev-unmanaged.xml | 35 ++ tests/qemuxml2argvtest.c | 9 + .../qemuxml2xmlout-hostdev-mdev-unmanaged.xml | 40 ++ tests/qemuxml2xmltest.c | 1 + 30 files changed, 1518 insertions(+), 55 deletions(-) create mode 100644 src/util/virmdev.c create mode 100644 src/util/virmdev.h create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-invalid-target-address.xml create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-src-address-invalid.xml create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.xml create mode 100644 tests/qemuxml2xmloutdata/qemuxml2xmlout-hostdev-mdev-unmanaged.xml

On 2017年03月27日 15:42, yonglihe wrote:
Verify Summary: * the none rooted mode starting a high-privileges VM actually.
The configurations is source generated default value except tls disabled.
1. rooted
virsh define ./libvirt/vgpu-win10.xml Domain vgpu-win10 defined from ./libvirt/vgpu-win10.xml
ubuntu@z-nuc-11:~/vgpu-meta/libvirt-stage$ virsh start vgpu-win10 2017-03-26 23:28:57.385+0000: 2886: info : libvirt version: 3.2.0 2017-03-26 23:28:57.385+0000: 2886: info : hostname: z-nuc-11.maas 2017-03-26 23:28:57.385+0000: 2886: warning : qemuDomainObjTaint:4155 : Domain id=1 name='vgpu-win10' uuid=916c5c36-0437-11e7-a23d-830ed1295d00 is tainted: high-privileges 2017-03-26 23:28:58.010+0000: 2886: warning : virDomainAuditHostdev:456 : Unexpected hostdev type while encoding audit message: 4 Domain vgpu-win10 started
2. None rooted virsh -c qemu:///session Welcome to lt-virsh, the virtualization interactive terminal.
virsh # define ./libvirt/vgpu-win10.xml Domain vgpu-win10 defined from ./libvirt/vgpu-win10.xml
virsh # start vgpu-win10 2017-03-26 23:38:11.220+0000: 2882: warning : qemuDomainObjTaint:4155 : Domain id=4 name='vgpu-win10' uuid=916c5c36-0437-11e7-a23d-830ed1295d00 is tainted: high-privileges 2017-03-26 23:38:12.356+0000: 2882: warning : virDomainAuditHostdev:456 : Unexpected hostdev type while encoding audit message: 4 Domain vgpu-win10 started
Please ignore above none rooted testing result, my fault. the proper test given following result: to successfully starting a non rooted vm, the following operation needed: 1.change the ownership/access right of the mdev corresponding vfio sudo chown ubuntu:ubuntu /dev/vfio/0 2. set a correct ulimit -l for the vm sudo sh -c "ulimit -l 3074424832 && exec su $LOGNAME" otherwise, it running into the following error: virsh # start vgpu-win10 internal error: Process exited prior to exec: libvirt: error : cannot limit locked memory to 3074424832: Operation not permitted my testing bed is Ubuntu 14.04, there is a similar bug ever reported: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1276719 I could not make sure if there is special requirements run virsh directly from the source tree using the ./run scripts. fix me. Yongli He
Regards Yongli He
since v1: - new <hostdev> attribute model introduced which tells libvirt which device API should be considered when auto-assigning guest address - device_api is properly checked, thus taking the 'model' attribute only as a hint to assign "some" address - new address type 'mdev' is introduced rather than using plain <uuid> element, since the address element is more conveniently extendable. - the emulated mtty driver now works as well out of the box, so no HW needed to review this series --> let's try it :) - fixed all the nits from v1
since v2: - dropped the patch introducing new address type 'mdev' since I added by mistake and only after that realized that the device address type enum is used for guest addresses only --> the mdevs are still identified by address element containing an 'uuid' attribute, I just dropped the enum - resolved the driver hostdev list race condition raised by Pavel in his review --> the device API is now checked every time our internal mdev object is created as opposed to the previous version where because of the model being checked separately, the locking issues arose. - rewrote the docs, reflecting the mdev address type drop change - squashed all security related stuff into 1 patch, also added app-armor bits - as Pavel suggested, moved most of the mdev-related functions out of virhostdev.c to virmdev.c - added a few more test cases - created a new branch 'mdev-next' on my github (more suitable name than a strict version number) on https://github.com/eskultety/libvirt/commits/mdev-next
since v3: - 'undo' an accidental squash of virmdev.{c,h} module introduction into patch 4/15 and made it a separate patch again - squash 5/15 into 4/15 as Pavel suggested - dropped the NEWS patch, as I've so far got at least 4 merge conflicts because of it when rebasing...I'll add it before the series is ready to be merged...or I'll forget about it like I usually do and add it later :/
Erik
Erik Skultety (14): conf: hostdev: Enforce enum-in-switch compile-time checks conf: hostdev: Introduce virDomainHostdevSubsysSCSIClear conf: Introduce virDomainHostdevDefPostParse util: Introduce new module virmdev conf: Introduce new hostdev device type mdev security: Enable labeling of vfio mediated devices conf: Enable cold-plug of a mediated device qemu: Assign PCI addresses for mediated devices as well hostdev: Maintain a driver list of active mediated devices qemu: cgroup: Adjust cgroups' logic to allow mediated devices qemu: Bump the memory locking limit for mdevs as well qemu: Format mdevs on qemu command line test: Add some test cases for our test suite regarding the mdevs docs: Document the new hostdev and address type 'mdev'
docs/formatdomain.html.in | 46 +- docs/schemas/domaincommon.rng | 22 + po/POTFILES.in | 1 + src/Makefile.am | 1 + src/conf/domain_conf.c | 225 ++++++++-- src/conf/domain_conf.h | 9 + src/libvirt_private.syms | 25 ++ src/qemu/qemu_command.c | 45 ++ src/qemu/qemu_command.h | 5 + src/qemu/qemu_domain.c | 24 +- src/qemu/qemu_domain.h | 1 + src/qemu/qemu_domain_address.c | 14 +- src/qemu/qemu_hostdev.c | 56 +++ src/qemu/qemu_hostdev.h | 10 + src/qemu/qemu_hotplug.c | 2 + src/security/security_apparmor.c | 22 + src/security/security_dac.c | 43 ++ src/security/security_selinux.c | 45 ++ src/util/virhostdev.c | 165 ++++++- src/util/virhostdev.h | 23 + src/util/virmdev.c | 487 +++++++++++++++++++++ src/util/virmdev.h | 123 ++++++ tests/domaincapsschemadata/full.xml | 1 + ...ml2argv-hostdev-mdev-invalid-target-address.xml | 33 ++ ...muxml2argv-hostdev-mdev-src-address-invalid.xml | 35 ++ .../qemuxml2argv-hostdev-mdev-unmanaged.args | 25 ++ .../qemuxml2argv-hostdev-mdev-unmanaged.xml | 35 ++ tests/qemuxml2argvtest.c | 9 + .../qemuxml2xmlout-hostdev-mdev-unmanaged.xml | 40 ++ tests/qemuxml2xmltest.c | 1 + 30 files changed, 1518 insertions(+), 55 deletions(-) create mode 100644 src/util/virmdev.c create mode 100644 src/util/virmdev.h create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-invalid-target-address.xml create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-src-address-invalid.xml create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.xml create mode 100644 tests/qemuxml2xmloutdata/qemuxml2xmlout-hostdev-mdev-unmanaged.xml
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

On 03/28/2017 10:05 PM, yonglihe wrote:
On 2017年03月27日 15:42, yonglihe wrote:
Verify Summary: * the none rooted mode starting a high-privileges VM actually.
The configurations is source generated default value except tls disabled.
1. rooted
virsh define ./libvirt/vgpu-win10.xml Domain vgpu-win10 defined from ./libvirt/vgpu-win10.xml
ubuntu@z-nuc-11:~/vgpu-meta/libvirt-stage$ virsh start vgpu-win10 2017-03-26 23:28:57.385+0000: 2886: info : libvirt version: 3.2.0 2017-03-26 23:28:57.385+0000: 2886: info : hostname: z-nuc-11.maas 2017-03-26 23:28:57.385+0000: 2886: warning : qemuDomainObjTaint:4155 : Domain id=1 name='vgpu-win10' uuid=916c5c36-0437-11e7-a23d-830ed1295d00 is tainted: high-privileges 2017-03-26 23:28:58.010+0000: 2886: warning : virDomainAuditHostdev:456 : Unexpected hostdev type while encoding audit message: 4 Domain vgpu-win10 started
2. None rooted virsh -c qemu:///session Welcome to lt-virsh, the virtualization interactive terminal.
The above line makes me think that you're mixing up "unprivileged libvirtd" with "unprivileged qemu". When you connect to virsh with "virsh -c qemu:///session" you are using an unprivileged copy of libvirtd started for your specific uid, and that libvirtd will: 1) not do any of the uid/permission/selinux/apparmor changes to the files/devices that will be used by the qemu process. (because it can't) 2) try to modify the locked memory limit for the qemu process, but likely fail because it needs more than the user's default limit. (I see below that you ran into this). 3) run qemu as the same unprivileged user. When you connect to virsh with the default URL (qemu:///system) you will connect to the system instance of libvirtd, which is running as root. It will: 1) modify uid/permissions/selinux/apparmor settings of any files/devices according to the "user" setting in /etc/libvirt/qemu.conf. and after forking the qemu process: 2) modify the locked memory limit to accommodate the needs of any assigned devices and 3) change the uid of the qemu process to the "user" setting from qemu.conf and drop all privileges (in the case that the "user" in qemu.conf is set to root, then step 3 doesn't happen). It sounds like you are using an "unprivileged libvirtd" in your tests, which will create the need to chown the various device files and manually change the ulimit for the login session that is running "virsh -c qemu:///session" (and thus starting up the unprivileged libvirtd which gets started on demand). The more common scenario is to use virsh -c qemu:///system (or simply run virsh as root and not add the URL so that the default is used), and to leave the qemu user set to "qemu" (or in some distros I think it is set to "kvm" by default).
virsh # define ./libvirt/vgpu-win10.xml Domain vgpu-win10 defined from ./libvirt/vgpu-win10.xml
virsh # start vgpu-win10 2017-03-26 23:38:11.220+0000: 2882: warning : qemuDomainObjTaint:4155 : Domain id=4 name='vgpu-win10' uuid=916c5c36-0437-11e7-a23d-830ed1295d00 is tainted: high-privileges 2017-03-26 23:38:12.356+0000: 2882: warning : virDomainAuditHostdev:456 : Unexpected hostdev type while encoding audit message: 4 Domain vgpu-win10 started
Please ignore above none rooted testing result, my fault. the proper test given following result:
to successfully starting a non rooted vm, the following operation needed: 1.change the ownership/access right of the mdev corresponding vfio sudo chown ubuntu:ubuntu /dev/vfio/0
2. set a correct ulimit -l for the vm sudo sh -c "ulimit -l 3074424832 && exec su $LOGNAME"
otherwise, it running into the following error: virsh # start vgpu-win10 internal error: Process exited prior to exec: libvirt: error : cannot limit locked memory to 3074424832: Operation not permitted
This is to be expected - both of these extra steps are also needed if you try to assign a standard PCI device using VFIO using unprivileged libvirtd. This is the best that can be expected without any component having root privileges. If you run the same test using qemu:///system, both of these should be taken care of automatically.
my testing bed is Ubuntu 14.04, there is a similar bug ever reported: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1276719
The original report (filed 2.5 years ago, and resolved soon after) was due to apparmor not doing the right stuff to the necessary files in sysfs. Many later comments and error messages were from people who were using the tools incorrectly (e.g. using <qemu:commandline> to manually add "-device vfio-pci" args to the qemu process, making it impossible for libvirt to recognize that it must perform steps 2 & 3 listed above.
I could not make sure if there is special requirements run virsh directly from the source tree using the ./run scripts. fix me.
I'm fairly certain the reason you're needing to perform those two extra steps are because you're using qemu:///session instead of qemu:///system.

On 03/28/2017 10:05 PM, yonglihe wrote:
On 2017年03月27日 15:42, yonglihe wrote:
Verify Summary: * the none rooted mode starting a high-privileges VM actually.
The configurations is source generated default value except tls disabled.
1. rooted
virsh define ./libvirt/vgpu-win10.xml Domain vgpu-win10 defined from ./libvirt/vgpu-win10.xml
ubuntu@z-nuc-11:~/vgpu-meta/libvirt-stage$ virsh start vgpu-win10 2017-03-26 23:28:57.385+0000: 2886: info : libvirt version: 3.2.0 2017-03-26 23:28:57.385+0000: 2886: info : hostname: z-nuc-11.maas 2017-03-26 23:28:57.385+0000: 2886: warning : qemuDomainObjTaint:4155 : Domain id=1 name='vgpu-win10' uuid=916c5c36-0437-11e7-a23d-830ed1295d00 is tainted: high-privileges 2017-03-26 23:28:58.010+0000: 2886: warning : virDomainAuditHostdev:456 : Unexpected hostdev type while encoding audit message: 4 Domain vgpu-win10 started
2. None rooted virsh -c qemu:///session Welcome to lt-virsh, the virtualization interactive terminal.
The above line makes me think that you're mixing up "unprivileged libvirtd" with "unprivileged qemu".
When you connect to virsh with "virsh -c qemu:///session" you are using an unprivileged copy of libvirtd started for your specific uid, and that libvirtd will:
1) not do any of the uid/permission/selinux/apparmor changes to the files/devices that will be used by the qemu process. (because it can't)
2) try to modify the locked memory limit for the qemu process, but likely fail because it needs more than the user's default limit. (I see below that you ran into this).
3) run qemu as the same unprivileged user.
When you connect to virsh with the default URL (qemu:///system) you will connect to the system instance of libvirtd, which is running as root. It will:
1) modify uid/permissions/selinux/apparmor settings of any files/devices according to the "user" setting in /etc/libvirt/qemu.conf.
and after forking the qemu process:
2) modify the locked memory limit to accommodate the needs of any assigned devices and
3) change the uid of the qemu process to the "user" setting from qemu.conf and drop all privileges
(in the case that the "user" in qemu.conf is set to root, then step 3 doesn't happen).
It sounds like you are using an "unprivileged libvirtd" in your tests, which will create the need to chown the various device files and manually change the ulimit for the login session that is running "virsh -c qemu:///session" (and thus starting up the unprivileged libvirtd which gets started on demand).
The more common scenario is to use virsh -c qemu:///system (or simply run virsh as root and not add the URL so that the default is used), and to leave the qemu user set to "qemu" (or in some distros I think it is set to "kvm" by default). thanks explain all of these, this is so big help to better understanding the processes of libvirt and what problem i'm encounter, thanks, very much!
Regards Yongli He
virsh # define ./libvirt/vgpu-win10.xml Domain vgpu-win10 defined from ./libvirt/vgpu-win10.xml
virsh # start vgpu-win10 2017-03-26 23:38:11.220+0000: 2882: warning : qemuDomainObjTaint:4155 : Domain id=4 name='vgpu-win10' uuid=916c5c36-0437-11e7-a23d-830ed1295d00 is tainted: high-privileges 2017-03-26 23:38:12.356+0000: 2882: warning : virDomainAuditHostdev:456 : Unexpected hostdev type while encoding audit message: 4 Domain vgpu-win10 started Please ignore above none rooted testing result, my fault. the proper test given following result:
to successfully starting a non rooted vm, the following operation needed: 1.change the ownership/access right of the mdev corresponding vfio sudo chown ubuntu:ubuntu /dev/vfio/0
2. set a correct ulimit -l for the vm sudo sh -c "ulimit -l 3074424832 && exec su $LOGNAME"
otherwise, it running into the following error: virsh # start vgpu-win10 internal error: Process exited prior to exec: libvirt: error : cannot limit locked memory to 3074424832: Operation not permitted This is to be expected - both of these extra steps are also needed if you try to assign a standard PCI device using VFIO using unprivileged libvirtd. This is the best that can be expected without any component having root privileges.
If you run the same test using qemu:///system, both of these should be taken care of automatically.
my testing bed is Ubuntu 14.04, there is a similar bug ever reported: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1276719 The original report (filed 2.5 years ago, and resolved soon after) was due to apparmor not doing the right stuff to the necessary files in sysfs. Many later comments and error messages were from people who were using the tools incorrectly (e.g. using <qemu:commandline> to manually add "-device vfio-pci" args to the qemu process, making it impossible for libvirt to recognize that it must perform steps 2 & 3 listed above.
I could not make sure if there is special requirements run virsh directly from the source tree using the ./run scripts. fix me. I'm fairly certain the reason you're needing to perform those two extra steps are because you're using qemu:///session instead of qemu:///system.
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
participants (5)
-
Andrea Bolognani
-
Daniel P. Berrange
-
Erik Skultety
-
Laine Stump
-
yonglihe