[libvirt] [RFC PATCH v2 00/18] Introduce vGPU mdev framework to libvirt

since v1: - new <hostdev> attribute model introduced which tells libvirt which device API should be considered when auto-assigning guest address - device_api is properly checked, thus taking the 'model' attribute only as a hint to assign "some" address - new address type 'mdev' is introduced rather than using plain <uuid> element, since the address element is more conveniently extendable. - the emulated mtty driver now works as well out of the box, so no HW needed to review this series --> let's try it :) - fixed all the nits from v1 Erik Skultety (18): util: Introduce new module virmdev conf: Introduce new hostdev device type mdev conf: Introduce new address type mdev conf: Update XML parser, formatter, and RNG schema to support mdev conf: Introduce virDomainHostdevDefPostParse conf: Add post parse code for mdevs to virDomainHostdevDefPostParse security: dac: Enable labeling of vfio mediated devices security: selinux: Enable labeling of vfio mediated devices conf: Enable cold-plug of a mediated device qemu: Assign PCI addresses for mediated devices as well hostdev: Maintain a driver list of active mediated devices hostdev: Introduce a reattach method for mediated devices qemu: cgroup: Adjust cgroups' logic to allow mediated devices qemu: namespace: Hook up the discovery of mdevs into the namespace code qemu: Format mdevs on the qemu command line test: Add some test cases for our test suite regarding the mdevs docs: Document the new hostdev and address type 'mdev' news: Update the NEWS.xml about the new mdev feature docs/formatdomain.html.in | 48 ++- docs/news.xml | 11 + docs/schemas/domaincommon.rng | 26 ++ po/POTFILES.in | 1 + src/Makefile.am | 1 + src/conf/device_conf.h | 1 + src/conf/domain_conf.c | 203 ++++++++++-- src/conf/domain_conf.h | 9 + src/libvirt_private.syms | 20 ++ src/qemu/qemu_cgroup.c | 34 ++ src/qemu/qemu_command.c | 49 +++ src/qemu/qemu_command.h | 5 + src/qemu/qemu_domain.c | 12 + src/qemu/qemu_domain.h | 1 + src/qemu/qemu_domain_address.c | 16 +- src/qemu/qemu_hostdev.c | 37 +++ src/qemu/qemu_hostdev.h | 8 + src/qemu/qemu_hotplug.c | 2 + src/security/security_apparmor.c | 3 + src/security/security_dac.c | 55 ++++ src/security/security_selinux.c | 54 ++++ src/util/virhostdev.c | 229 ++++++++++++- src/util/virhostdev.h | 16 + src/util/virmdev.c | 358 +++++++++++++++++++++ src/util/virmdev.h | 93 ++++++ tests/domaincapsschemadata/full.xml | 1 + .../qemuxml2argv-hostdev-mdev-unmanaged.args | 25 ++ .../qemuxml2argv-hostdev-mdev-unmanaged.xml | 37 +++ tests/qemuxml2argvtest.c | 6 + .../qemuxml2xmlout-hostdev-mdev-unmanaged.xml | 40 +++ tests/qemuxml2xmltest.c | 1 + 31 files changed, 1362 insertions(+), 40 deletions(-) create mode 100644 src/util/virmdev.c create mode 100644 src/util/virmdev.h create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.xml create mode 100644 tests/qemuxml2xmloutdata/qemuxml2xmlout-hostdev-mdev-unmanaged.xml -- 2.10.2

Beside creation, disposal, getter, and setter methods the module exports methods to work with lists of mediated devices. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- po/POTFILES.in | 1 + src/Makefile.am | 1 + src/libvirt_private.syms | 18 +++ src/util/virmdev.c | 358 +++++++++++++++++++++++++++++++++++++++++++++++ src/util/virmdev.h | 93 ++++++++++++ 5 files changed, 471 insertions(+) create mode 100644 src/util/virmdev.c create mode 100644 src/util/virmdev.h diff --git a/po/POTFILES.in b/po/POTFILES.in index 365ea66..53c674e 100644 --- a/po/POTFILES.in +++ b/po/POTFILES.in @@ -218,6 +218,7 @@ src/util/virlease.c src/util/virlockspace.c src/util/virlog.c src/util/virmacmap.c +src/util/virmdev.c src/util/virnetdev.c src/util/virnetdevbandwidth.c src/util/virnetdevbridge.c diff --git a/src/Makefile.am b/src/Makefile.am index 2f32d41..6510e1d 100644 --- a/src/Makefile.am +++ b/src/Makefile.am @@ -186,6 +186,7 @@ UTIL_SOURCES = \ util/viruuid.c util/viruuid.h \ util/virxdrdefs.h \ util/virxml.c util/virxml.h \ + util/virmdev.c util/virmdev.h \ $(NULL) EXTRA_DIST += $(srcdir)/util/keymaps.csv $(srcdir)/util/virkeycode-mapgen.py diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 6bbb36b..48c86a2 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1962,6 +1962,24 @@ virMacMapNew; virMacMapRemove; virMacMapWriteFile; +# util/virmdev.h +virMediatedDeviceFree; +virMediatedDeviceGetDeviceAPI; +virMediatedDeviceGetIOMMUGroupDev; +virMediatedDeviceGetPath; +virMediatedDeviceGetUsedBy; +virMediatedDeviceListAdd; +virMediatedDeviceListCount; +virMediatedDeviceListDel; +virMediatedDeviceListFind; +virMediatedDeviceListGet; +virMediatedDeviceListNew; +virMediatedDeviceListSteal; +virMediatedDeviceListStealIndex; +virMediatedDeviceNew; +virMediatedDeviceSetUsedBy; + + # util/virnetdev.h virNetDevAddMulti; diff --git a/src/util/virmdev.c b/src/util/virmdev.c new file mode 100644 index 0000000..9c80647 --- /dev/null +++ b/src/util/virmdev.c @@ -0,0 +1,358 @@ +/* + * virmdev.c: helper APIs for managing host MDEV devices + * + * Copyright (C) 2017-2018 Red Hat, Inc. + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library. If not, see + * <http://www.gnu.org/licenses/>. + */ + +#include <config.h> + +#include <fcntl.h> +#include <inttypes.h> +#include <limits.h> +#include <stdio.h> +#include <string.h> +#include <sys/types.h> +#include <sys/stat.h> +#include <unistd.h> +#include <stdlib.h> + +#include "virmdev.h" +#include "dirname.h" +#include "virlog.h" +#include "viralloc.h" +#include "vircommand.h" +#include "virerror.h" +#include "virfile.h" +#include "virkmod.h" +#include "virstring.h" +#include "virutil.h" +#include "viruuid.h" +#include "virhostdev.h" + +VIR_LOG_INIT("util.mdev"); + +struct _virMediatedDevice { + char *path; /* sysfs path */ + + char *used_by_drvname; + char *used_by_domname; +}; + +struct _virMediatedDeviceList { + virObjectLockable parent; + + size_t count; + virMediatedDevicePtr *devs; +}; + + +/* For virReportOOMError() and virReportSystemError() */ +#define VIR_FROM_THIS VIR_FROM_NONE + +static virClassPtr virMediatedDeviceListClass; + +static void virMediatedDeviceListDispose(void *obj); + +static int virMediatedOnceInit(void) +{ + if (!(virMediatedDeviceListClass = virClassNew(virClassForObjectLockable(), + "virMediatedDeviceList", + sizeof(virMediatedDeviceList), + virMediatedDeviceListDispose))) + return -1; + + return 0; +} + +VIR_ONCE_GLOBAL_INIT(virMediated) + +#ifdef __linux__ +# define MDEV_SYSFS_DEVICES "/sys/bus/mdev/devices/" + +virMediatedDevicePtr +virMediatedDeviceNew(const char *uuidstr) +{ + virMediatedDevicePtr dev = NULL, ret = NULL; + + if (VIR_ALLOC(dev) < 0) + return NULL; + + if (virAsprintf(&dev->path, MDEV_SYSFS_DEVICES "%s", uuidstr) < 0) + goto cleanup; + + ret = dev; + dev = NULL; + + cleanup: + virMediatedDeviceFree(dev); + return ret; +} + +#else + +virMediatedDevicePtr +virMediatedDeviceNew(virPCIDeviceAddressPtr pciaddr ATTRIBUTE_UNUSED, + const char *uuidstr ATTRIBUTE_UNUSED) +{ + virRerportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("not supported on non-linux platforms")); + return NULL; +} + +#endif /* __linux__ */ + +void +virMediatedDeviceFree(virMediatedDevicePtr dev) +{ + if (!dev) + return; + VIR_FREE(dev->path); + VIR_FREE(dev->used_by_drvname); + VIR_FREE(dev->used_by_domname); + VIR_FREE(dev); +} + + +const char * +virMediatedDeviceGetPath(virMediatedDevicePtr dev) +{ + return dev->path; +} + + +/* Returns an absolute canonicalized path to the device used to control the + * mediated device's IOMMU group (e.g. "/dev/vfio/15") + */ +char * +virMediatedDeviceGetIOMMUGroupDev(virMediatedDevicePtr dev) +{ + char *resultpath = NULL; + char *iommu_linkpath = NULL; + char *vfio_path = NULL; + + if (virAsprintf(&iommu_linkpath, "%s/iommu_group", dev->path) < 0) + return NULL; + + if (virFileIsLink(iommu_linkpath) != 1) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("IOMMU group file %s is not a symlink"), + iommu_linkpath); + goto cleanup; + } + + if (virFileResolveLink(iommu_linkpath, &resultpath) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Unable to resolve IOMMU group symlink %s"), + iommu_linkpath); + goto cleanup; + } + + if (virAsprintf(&vfio_path, "/dev/vfio/%s", last_component(resultpath)) < 0) + goto cleanup; + + cleanup: + VIR_FREE(resultpath); + VIR_FREE(iommu_linkpath); + return vfio_path; +} + + +void +virMediatedDeviceGetUsedBy(virMediatedDevicePtr dev, + const char **drvname, const char **domname) +{ + *drvname = dev->used_by_drvname; + *domname = dev->used_by_domname; +} + + +int +virMediatedDeviceSetUsedBy(virMediatedDevicePtr dev, + const char *drvname, + const char *domname) +{ + VIR_FREE(dev->used_by_drvname); + VIR_FREE(dev->used_by_domname); + if (VIR_STRDUP(dev->used_by_drvname, drvname) < 0) + return -1; + if (VIR_STRDUP(dev->used_by_domname, domname) < 0) + return -1; + + return 0; +} + + +virMediatedDeviceListPtr +virMediatedDeviceListNew(void) +{ + virMediatedDeviceListPtr list; + + if (virMediatedInitialize() < 0) + return NULL; + + if (!(list = virObjectLockableNew(virMediatedDeviceListClass))) + return NULL; + + return list; +} + + +static void +virMediatedDeviceListDispose(void *obj) +{ + virMediatedDeviceListPtr list = obj; + size_t i; + + for (i = 0; i < list->count; i++) { + virMediatedDeviceFree(list->devs[i]); + list->devs[i] = NULL; + } + + list->count = 0; + VIR_FREE(list->devs); +} + + +int +virMediatedDeviceListAdd(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev) +{ + if (virMediatedDeviceListFind(list, dev)) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Device %s is already in use"), dev->path); + return -1; + } + return VIR_APPEND_ELEMENT(list->devs, list->count, dev); +} + + +virMediatedDevicePtr +virMediatedDeviceListGet(virMediatedDeviceListPtr list, + ssize_t idx) +{ + if (idx < 0 || idx >= list->count) + return NULL; + + return list->devs[idx]; +} + + +size_t +virMediatedDeviceListCount(virMediatedDeviceListPtr list) +{ + return list->count; +} + + +virMediatedDevicePtr +virMediatedDeviceListStealIndex(virMediatedDeviceListPtr list, + ssize_t idx) +{ + virMediatedDevicePtr ret; + + if (idx < 0 || idx >= list->count) + return NULL; + + ret = list->devs[idx]; + VIR_DELETE_ELEMENT(list->devs, idx, list->count); + return ret; +} + + +virMediatedDevicePtr +virMediatedDeviceListSteal(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev) +{ + int idx = virMediatedDeviceListFindIndex(list, dev); + + return virMediatedDeviceListStealIndex(list, idx); +} + + +void +virMediatedDeviceListDel(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev) +{ + virMediatedDevicePtr ret = virMediatedDeviceListSteal(list, dev); + virMediatedDeviceFree(ret); +} + + +int +virMediatedDeviceListFindIndex(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev) +{ + size_t i; + + for (i = 0; i < list->count; i++) { + virMediatedDevicePtr other = list->devs[i]; + if (STREQ(other->path, dev->path)) + return i; + } + return -1; +} + + +virMediatedDevicePtr +virMediatedDeviceListFind(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev) +{ + int idx; + + if ((idx = virMediatedDeviceListFindIndex(list, dev)) >= 0) + return list->devs[idx]; + else + return NULL; +} + + +int +virMediatedDeviceGetDeviceAPI(virMediatedDevicePtr dev, + char **device_api) +{ + int ret = -1; + char *buf = NULL; + char *tmp = NULL; + char *sysfs_path = NULL; + + if (virAsprintf(&sysfs_path, "%s/mdev_type/device_api", dev->path) < 0) + goto cleanup; + + /* TODO - make this a generic method to access sysfs files for various + * kinds of devices + */ + if (!virFileExists(sysfs_path)) { + virReportError(VIR_ERR_OPERATION_INVALID, "%s", + _("mediated devices are not supported by this kernel")); + goto cleanup; + } + + if (virFileReadAll(sysfs_path, 1024, &buf) < 0) + goto cleanup; + + if ((tmp = strchr(buf, '\n'))) + *tmp = '\0'; + + *device_api = buf; + buf = NULL; + + ret = 0; + cleanup: + VIR_FREE(sysfs_path); + VIR_FREE(buf); + return ret; +} diff --git a/src/util/virmdev.h b/src/util/virmdev.h new file mode 100644 index 0000000..78a7df5 --- /dev/null +++ b/src/util/virmdev.h @@ -0,0 +1,93 @@ +/* + * virmdev.h: helper APIs for managing host mediated devices + * + * Copyright (C) 2017-2018 Red Hat, Inc. + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library. If not, see + * <http://www.gnu.org/licenses/>. + */ + +#ifndef __VIR_MDEV_H__ +# define __VIR_MDEV_H__ + +# include "internal.h" +# include "virobject.h" +# include "virutil.h" +# include "virpci.h" + +typedef enum { + VIR_MDEV_MODEL_TYPE_DEFAULT = 0, + VIR_MDEV_MODEL_TYPE_VFIO_PCI, + + VIR_MDEV_MODEL_TYPE_LAST +} virMediatedDeviceModelType; + +VIR_ENUM_DECL(virMediatedDeviceModel) + + +typedef struct _virMediatedDevice virMediatedDevice; +typedef virMediatedDevice *virMediatedDevicePtr; +typedef struct _virMediatedDeviceAddress virMediatedDeviceAddress; +typedef virMediatedDeviceAddress *virMediatedDeviceAddressPtr; +typedef struct _virMediatedDeviceList virMediatedDeviceList; +typedef virMediatedDeviceList *virMediatedDeviceListPtr; + +typedef int (*virMediatedDeviceCallback)(virMediatedDevicePtr dev, + const char *path, void *opaque); + +virMediatedDevicePtr virMediatedDeviceNew(const char *uuidstr); + +virMediatedDevicePtr virMediatedDeviceCopy(virMediatedDevicePtr dev); + +void virMediatedDeviceFree(virMediatedDevicePtr dev); + +const char *virMediatedDeviceGetPath(virMediatedDevicePtr dev); + +void virMediatedDeviceGetUsedBy(virMediatedDevicePtr dev, + const char **drvname, const char **domname); + +int virMediatedDeviceSetUsedBy(virMediatedDevicePtr dev, + const char *drvname, + const char *domname); + +char *virMediatedDeviceGetIOMMUGroupDev(virMediatedDevicePtr dev); + +int virMediatedDeviceGetDeviceAPI(virMediatedDevicePtr dev, char **device_api); + +virMediatedDeviceListPtr virMediatedDeviceListNew(void); + +int virMediatedDeviceListAdd(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev); + +virMediatedDevicePtr virMediatedDeviceListGet(virMediatedDeviceListPtr list, + ssize_t idx); + +size_t virMediatedDeviceListCount(virMediatedDeviceListPtr list); + +virMediatedDevicePtr virMediatedDeviceListSteal(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev); + +virMediatedDevicePtr virMediatedDeviceListStealIndex(virMediatedDeviceListPtr list, + ssize_t idx); + +void virMediatedDeviceListDel(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev); + +virMediatedDevicePtr virMediatedDeviceListFind(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev); + +int virMediatedDeviceListFindIndex(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev); + +#endif /* __VIR_MDEV_H__ */ -- 2.10.2

A mediated device will be identified by a UUID of the user pre-created mediated device. The data necessary to identify a mediated device can be easily extended in the future, e.g. when auto-creation of mediated devices should be enabled. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/conf/domain_conf.c | 36 +++++++++++++++++++++++++++++++++++- src/conf/domain_conf.h | 9 +++++++++ src/qemu/qemu_cgroup.c | 5 +++++ src/qemu/qemu_domain.c | 1 + src/qemu/qemu_hotplug.c | 2 ++ src/security/security_apparmor.c | 3 +++ src/security/security_dac.c | 2 ++ src/security/security_selinux.c | 2 ++ tests/domaincapsschemadata/full.xml | 1 + 9 files changed, 60 insertions(+), 1 deletion(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 1bc72a4..4911b0b 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -56,6 +56,7 @@ #include "virstring.h" #include "virnetdev.h" #include "virhostdev.h" +#include "virmdev.h" #define VIR_FROM_THIS VIR_FROM_DOMAIN @@ -649,7 +650,8 @@ VIR_ENUM_IMPL(virDomainHostdevSubsys, VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST, "usb", "pci", "scsi", - "scsi_host") + "scsi_host", + "mdev") VIR_ENUM_IMPL(virDomainHostdevSubsysPCIBackend, VIR_DOMAIN_HOSTDEV_PCI_BACKEND_TYPE_LAST, @@ -668,6 +670,11 @@ VIR_ENUM_IMPL(virDomainHostdevSubsysSCSIHostProtocol, "none", "vhost") +VIR_ENUM_IMPL(virMediatedDeviceModel, + VIR_MDEV_MODEL_TYPE_LAST, + "default", + "vfio-pci") + VIR_ENUM_IMPL(virDomainHostdevCaps, VIR_DOMAIN_HOSTDEV_CAPS_TYPE_LAST, "storage", "misc", @@ -6348,10 +6355,12 @@ virDomainHostdevDefParseXMLSubsys(xmlNodePtr node, char *sgio = NULL; char *rawio = NULL; char *backendStr = NULL; + char *model = NULL; int backend; int ret = -1; virDomainHostdevSubsysPCIPtr pcisrc = &def->source.subsys.u.pci; virDomainHostdevSubsysSCSIPtr scsisrc = &def->source.subsys.u.scsi; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &def->source.subsys.u.mdev; /* @managed can be read from the xml document - it is always an * attribute of the toplevel element, no matter what type of @@ -6431,6 +6440,21 @@ virDomainHostdevDefParseXMLSubsys(xmlNodePtr node, } } + if (model) { + if (def->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("model is only supported with mediated devices")); + goto error; + } + + if ((mdevsrc->model = virMediatedDeviceModelTypeFromString(model)) <= 0) { + virReportError(VIR_ERR_XML_ERROR, + _("unknown hostdev model '%s'"), + model); + goto error; + } + } + switch (def->source.subsys.type) { case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: if (virDomainHostdevSubsysPCIDefParseXML(sourcenode, def, flags) < 0) @@ -6463,6 +6487,8 @@ virDomainHostdevDefParseXMLSubsys(xmlNodePtr node, if (virDomainHostdevSubsysSCSIVHostDefParseXML(sourcenode, def) < 0) goto error; break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + break; default: virReportError(VIR_ERR_CONFIG_UNSUPPORTED, @@ -13291,6 +13317,7 @@ virDomainHostdevDefParseXML(virDomainXMLOptionPtr xmlopt, } break; case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: break; } @@ -14182,6 +14209,7 @@ virDomainHostdevMatchSubsys(virDomainHostdevDefPtr a, return 1; else return 0; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: return 0; } @@ -23105,6 +23133,7 @@ virDomainHostdevDefFormat(virBufferPtr buf, { const char *mode = virDomainHostdevModeTypeToString(def->mode); virDomainHostdevSubsysSCSIPtr scsisrc = &def->source.subsys.u.scsi; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &def->source.subsys.u.mdev; const char *type; if (!mode) { @@ -23154,6 +23183,11 @@ virDomainHostdevDefFormat(virBufferPtr buf, virBufferAsprintf(buf, " rawio='%s'", virTristateBoolTypeToString(scsisrc->rawio)); } + + if (def->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV && + mdevsrc->model) + virBufferAsprintf(buf, " model='%s'", + virMediatedDeviceModelTypeToString(mdevsrc->model)); } virBufferAddLit(buf, ">\n"); virBufferAdjustIndent(buf, 2); diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index dd79206..e7bcfbf 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -295,6 +295,7 @@ typedef enum { VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI, VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI, VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST, + VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV, VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST } virDomainHostdevSubsysType; @@ -369,6 +370,13 @@ struct _virDomainHostdevSubsysSCSI { } u; }; +typedef struct _virDomainHostdevSubsysMediatedDev virDomainHostdevSubsysMediatedDev; +typedef virDomainHostdevSubsysMediatedDev *virDomainHostdevSubsysMediatedDevPtr; +struct _virDomainHostdevSubsysMediatedDev { + int model; /* enum virMediatedDeviceModelType */ + char uuidstr[VIR_UUID_STRING_BUFLEN]; /* mediated device's uuid string */ +}; + typedef enum { VIR_DOMAIN_HOSTDEV_SUBSYS_SCSI_HOST_PROTOCOL_TYPE_NONE, VIR_DOMAIN_HOSTDEV_SUBSYS_SCSI_HOST_PROTOCOL_TYPE_VHOST, @@ -394,6 +402,7 @@ struct _virDomainHostdevSubsys { virDomainHostdevSubsysPCI pci; virDomainHostdevSubsysSCSI scsi; virDomainHostdevSubsysSCSIVHost scsi_host; + virDomainHostdevSubsysMediatedDev mdev; } u; }; diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c index 6c90d46..0902624 100644 --- a/src/qemu/qemu_cgroup.c +++ b/src/qemu/qemu_cgroup.c @@ -433,6 +433,9 @@ qemuSetupHostdevCgroup(virDomainObjPtr vm, break; } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: break; } @@ -501,6 +504,8 @@ qemuTeardownHostdevCgroup(virDomainObjPtr vm, case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST: /* nothing to tear down for scsi_host */ break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + break; case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: break; } diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index be44843..c4d7a47 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -6929,6 +6929,7 @@ qemuDomainGetHostdevPath(virDomainHostdevDefPtr dev, break; } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: break; } diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index 2f209f1..96c27ad 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -3854,6 +3854,8 @@ qemuDomainRemoveHostDevice(virQEMUDriverPtr driver, case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST: qemuDomainRemoveSCSIVHostDevice(driver, vm, hostdev); break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + break; case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: break; } diff --git a/src/security/security_apparmor.c b/src/security/security_apparmor.c index 0d3e891..f5b72e1 100644 --- a/src/security/security_apparmor.c +++ b/src/security/security_apparmor.c @@ -901,6 +901,9 @@ AppArmorSetSecurityHostdevLabel(virSecurityManagerPtr mgr, break; } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; diff --git a/src/security/security_dac.c b/src/security/security_dac.c index 6721917..ecce1d3 100644 --- a/src/security/security_dac.c +++ b/src/security/security_dac.c @@ -964,6 +964,7 @@ virSecurityDACSetHostdevLabel(virSecurityManagerPtr mgr, break; } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; @@ -1119,6 +1120,7 @@ virSecurityDACRestoreHostdevLabel(virSecurityManagerPtr mgr, break; } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; diff --git a/src/security/security_selinux.c b/src/security/security_selinux.c index e22de06..e152c72 100644 --- a/src/security/security_selinux.c +++ b/src/security/security_selinux.c @@ -1782,6 +1782,7 @@ virSecuritySELinuxSetHostdevSubsysLabel(virSecurityManagerPtr mgr, break; } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; @@ -2009,6 +2010,7 @@ virSecuritySELinuxRestoreHostdevSubsysLabel(virSecurityManagerPtr mgr, break; } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; diff --git a/tests/domaincapsschemadata/full.xml b/tests/domaincapsschemadata/full.xml index 6abd499..6b43069 100644 --- a/tests/domaincapsschemadata/full.xml +++ b/tests/domaincapsschemadata/full.xml @@ -88,6 +88,7 @@ <value>pci</value> <value>scsi</value> <value>scsi_host</value> + <value>mdev</value> </enum> <enum name='capsType'> <value>storage</value> -- 2.10.2

Since the address element is much easier extendible by attributes adding more and more elements, starting with <uuid>, to the <source> element once we find out we need more data to identify a mediated device. By introducing a new address type rather than using plain elements, we also remain consistent with all other devices that can make use of the address element. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/conf/device_conf.h | 1 + src/conf/domain_conf.c | 11 +++++++++-- 2 files changed, 10 insertions(+), 2 deletions(-) diff --git a/src/conf/device_conf.h b/src/conf/device_conf.h index f435fb5..8b67208 100644 --- a/src/conf/device_conf.h +++ b/src/conf/device_conf.h @@ -47,6 +47,7 @@ typedef enum { VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_MMIO, VIR_DOMAIN_DEVICE_ADDRESS_TYPE_ISA, VIR_DOMAIN_DEVICE_ADDRESS_TYPE_DIMM, + VIR_DOMAIN_DEVICE_ADDRESS_TYPE_MDEV, VIR_DOMAIN_DEVICE_ADDRESS_TYPE_LAST } virDomainDeviceAddressType; diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 4911b0b..053f3cb 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -259,7 +259,8 @@ VIR_ENUM_IMPL(virDomainDeviceAddress, VIR_DOMAIN_DEVICE_ADDRESS_TYPE_LAST, "ccw", "virtio-mmio", "isa", - "dimm") + "dimm", + "mdev") VIR_ENUM_IMPL(virDomainDiskDevice, VIR_DOMAIN_DISK_DEVICE_LAST, "disk", @@ -3289,6 +3290,7 @@ int virDomainDeviceAddressIsValid(virDomainDeviceInfoPtr info, return virDomainDeviceCCWAddressIsValid(&info->addr.ccw); case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_USB: + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_MDEV: return 1; } @@ -3381,6 +3383,7 @@ virDomainDeviceInfoAddressIsEqual(const virDomainDeviceInfo *a, /* address types below don't have any specific data */ case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_MMIO: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_S390: + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_MDEV: break; case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_PCI: @@ -5129,7 +5132,8 @@ virDomainDeviceInfoFormat(virBufferPtr buf, } if (info->type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE || - info->type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_S390) + info->type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_S390 || + info->type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_MDEV) return 0; /* We'll be in domain/devices/[device type]/ so 3 level indent */ @@ -5212,6 +5216,7 @@ virDomainDeviceInfoFormat(virBufferPtr buf, break; case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_S390: + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_MDEV: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_LAST: break; @@ -5773,6 +5778,7 @@ virDomainDeviceInfoParseXML(xmlNodePtr node, goto cleanup; break; + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_MDEV: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_LAST: break; @@ -18540,6 +18546,7 @@ virDomainDeviceInfoCheckABIStability(virDomainDeviceInfoPtr src, case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_S390: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_CCW: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_MMIO: + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_MDEV: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_LAST: break; -- 2.10.2

To keep the domain XML as much platform agnostic as possible, do not expose an element/attribute which would contain path directly to the syfs filesystem which the mediated devices are build upon. Instead, identify each mediated device by a UUID attribute as well as specifying the device API (e.g. vfio-pci) through the 'model' attribute. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- docs/schemas/domaincommon.rng | 26 ++++++++++++++++++++++ src/conf/domain_conf.c | 50 +++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 76 insertions(+) diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index d715bff..ad8f47c 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -4014,6 +4014,7 @@ <ref name="hostdevsubsysusb"/> <ref name="hostdevsubsysscsi"/> <ref name="hostdevsubsyshost"/> + <ref name="hostdevsubsysmdev"/> </choice> </define> @@ -4164,6 +4165,20 @@ </element> </define> + <define name="hostdevsubsysmdev"> + <attribute name="type"> + <value>mdev</value> + </attribute> + <attribute name="model"> + <choice> + <value>vfio-pci</value> + </choice> + </attribute> + <element name="source"> + <ref name="address"/> + </element> + </define> + <define name="hostdevcapsstorage"> <attribute name="type"> <value>storage</value> @@ -4322,6 +4337,11 @@ </attribute> </optional> </define> + <define name="mdevaddress"> + <attribute name="uuid"> + <ref name="UUID"/> + </attribute> + </define> <define name="devices"> <element name="devices"> <interleave> @@ -4703,6 +4723,12 @@ </attribute> <ref name="dimmaddress"/> </group> + <group> + <attribute name="type"> + <value>mdev</value> + </attribute> + <ref name="mdevaddress"/> + </group> </choice> </element> </define> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 053f3cb..870aec6 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -6348,6 +6348,47 @@ virDomainHostdevSubsysSCSIVHostDefParseXML(xmlNodePtr sourcenode, return ret; } +static int +virDomainHostdevSubsysMediatedDevDefParseXML(virDomainHostdevDefPtr def, + xmlXPathContextPtr ctxt) +{ + int ret = -1; + unsigned char uuid[VIR_UUID_BUFLEN] = {0}; + char *uuidxml = NULL; + xmlNodePtr node = NULL; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &def->source.subsys.u.mdev; + + if (mdevsrc->model == 0) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Missing 'model' attribute for element <hostdev>")); + goto cleanup; + } + + if (!(node = virXPathNode("./source/address", ctxt))) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("Missing <address> element")); + goto cleanup; + } + + if (!(uuidxml = virXMLPropString(node, "uuid"))) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("Missing 'uuid' attribute for element <address>")); + goto cleanup; + } + + if (virUUIDParse(uuidxml, uuid) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + "%s", + _("Cannot parse uuid attribute of element <address>")); + goto cleanup; + } + + virUUIDFormat(uuid, mdevsrc->uuidstr); + ret = 0; + cleanup: + VIR_FREE(uuidxml); + return ret; +} static int virDomainHostdevDefParseXMLSubsys(xmlNodePtr node, @@ -6380,6 +6421,7 @@ virDomainHostdevDefParseXMLSubsys(xmlNodePtr node, sgio = virXMLPropString(node, "sgio"); rawio = virXMLPropString(node, "rawio"); + model = virXMLPropString(node, "model"); /* @type is passed in from the caller rather than read from the * xml document, because it is specified in different places for @@ -6494,6 +6536,8 @@ virDomainHostdevDefParseXMLSubsys(xmlNodePtr node, goto error; break; case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + if (virDomainHostdevSubsysMediatedDevDefParseXML(def, ctxt) < 0) + goto error; break; default: @@ -6509,6 +6553,7 @@ virDomainHostdevDefParseXMLSubsys(xmlNodePtr node, VIR_FREE(sgio); VIR_FREE(rawio); VIR_FREE(backendStr); + VIR_FREE(model); return ret; } @@ -21180,6 +21225,7 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf, virDomainHostdevSubsysPCIPtr pcisrc = &def->source.subsys.u.pci; virDomainHostdevSubsysSCSIPtr scsisrc = &def->source.subsys.u.scsi; virDomainHostdevSubsysSCSIVHostPtr hostsrc = &def->source.subsys.u.scsi_host; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &def->source.subsys.u.mdev; virDomainHostdevSubsysSCSIHostPtr scsihostsrc = &scsisrc->u.host; virDomainHostdevSubsysSCSIiSCSIPtr iscsisrc = &scsisrc->u.iscsi; @@ -21284,6 +21330,10 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf, break; case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST: break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + virBufferAsprintf(buf, "<address type='mdev' uuid='%s'/>\n", + mdevsrc->uuidstr); + break; default: virReportError(VIR_ERR_INTERNAL_ERROR, _("unexpected hostdev type %d"), -- 2.10.2

Just to make the code a bit cleaner, move hostdev specific post parse code to its own function just in case it grows in the future. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/conf/domain_conf.c | 76 +++++++++++++++++++++++++++++++------------------- 1 file changed, 48 insertions(+), 28 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 870aec6..151a308 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -4207,6 +4207,51 @@ virDomainHostdevAssignAddress(virDomainXMLOptionPtr xmlopt, static int +virDomainHostdevDefPostParse(virDomainHostdevDefPtr dev, + const virDomainDef *def, + virDomainXMLOptionPtr xmlopt) +{ + if (dev->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS) + return 0; + + switch (dev->source.subsys.type) { + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI: + if (dev->info->type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE && + virDomainHostdevAssignAddress(xmlopt, def, dev) < 0) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Cannot assign SCSI host device address")); + return -1; + } else { + /* Ensure provided address doesn't conflict with existing + * scsi disk drive address + */ + virDomainDeviceDriveAddressPtr addr = &dev->info->addr.drive; + if (virDomainDriveAddressIsUsedByDisk(def, + VIR_DOMAIN_DISK_BUS_SCSI, + addr)) { + virReportError(VIR_ERR_XML_ERROR, + _("SCSI host address controller='%u' " + "bus='%u' target='%u' unit='%u' in " + "use by a SCSI disk"), + addr->controller, addr->bus, + addr->target, addr->unit); + return -1; + } + } + break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: + break; + } + + return 0; +} + + +static int virDomainDeviceDefPostParseInternal(virDomainDeviceDefPtr dev, const virDomainDef *def, virCapsPtr caps ATTRIBUTE_UNUSED, @@ -4287,34 +4332,9 @@ virDomainDeviceDefPostParseInternal(virDomainDeviceDefPtr dev, video->vram = VIR_ROUND_UP_POWER_OF_TWO(video->vram); } - if (dev->type == VIR_DOMAIN_DEVICE_HOSTDEV) { - virDomainHostdevDefPtr hdev = dev->data.hostdev; - - if (virHostdevIsSCSIDevice(hdev)) { - if (hdev->info->type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE && - virDomainHostdevAssignAddress(xmlopt, def, hdev) < 0) { - virReportError(VIR_ERR_XML_ERROR, "%s", - _("Cannot assign SCSI host device address")); - return -1; - } else { - /* Ensure provided address doesn't conflict with existing - * scsi disk drive address - */ - virDomainDeviceDriveAddressPtr addr = &hdev->info->addr.drive; - if (virDomainDriveAddressIsUsedByDisk(def, - VIR_DOMAIN_DISK_BUS_SCSI, - addr)) { - virReportError(VIR_ERR_XML_ERROR, - _("SCSI host address controller='%u' " - "bus='%u' target='%u' unit='%u' in " - "use by a SCSI disk"), - addr->controller, addr->bus, - addr->target, addr->unit); - return -1; - } - } - } - } + if (dev->type == VIR_DOMAIN_DEVICE_HOSTDEV && + virDomainHostdevDefPostParse(dev->data.hostdev, def, xmlopt) < 0) + return -1; if (dev->type == VIR_DOMAIN_DEVICE_CONTROLLER) { virDomainControllerDefPtr cdev = dev->data.controller; -- 2.10.2

We need to make sure that if user explicitly provides a guest address for a mdev device, the address type will be matching the device API supported on that specific mediated device and error out with an incorrect XML message. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/conf/domain_conf.c | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 151a308..69874f0 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -4239,10 +4239,26 @@ virDomainHostdevDefPostParse(virDomainHostdevDefPtr dev, } } break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: { + int model = dev->source.subsys.u.mdev.model; + + if (dev->info->type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE) + return 0; + + if (model == VIR_MDEV_MODEL_TYPE_VFIO_PCI && + dev->info->type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_PCI) { + virReportError(VIR_ERR_XML_ERROR, + _("Unsupported address type '%s' with mediated " + "device model '%s'"), + virDomainDeviceAddressTypeToString(dev->info->type), + virMediatedDeviceModelTypeToString(model)); + return -1; + } + } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST: - case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: break; } -- 2.10.2

Label the VFIO IOMMU devices under /dev/vfio/ referenced by the symlinks in the sysfs (e.g. /sys/class/mdev_bus/<uuid>/iommu_group) which what qemu actually gets formatted on the command line. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/security/security_dac.c | 57 +++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 55 insertions(+), 2 deletions(-) diff --git a/src/security/security_dac.c b/src/security/security_dac.c index ecce1d3..45bd24e 100644 --- a/src/security/security_dac.c +++ b/src/security/security_dac.c @@ -33,6 +33,7 @@ #include "virfile.h" #include "viralloc.h" #include "virlog.h" +#include "virmdev.h" #include "virpci.h" #include "virusb.h" #include "virscsi.h" @@ -856,6 +857,15 @@ virSecurityDACSetHostLabel(virSCSIVHostDevicePtr dev ATTRIBUTE_UNUSED, static int +virSecurityDACSetMediatedDevLabel(virMediatedDevicePtr dev ATTRIBUTE_UNUSED, + const char *file, + void *opaque) +{ + return virSecurityDACSetHostdevLabelHelper(file, opaque); +} + + +static int virSecurityDACSetHostdevLabel(virSecurityManagerPtr mgr, virDomainDefPtr def, virDomainHostdevDefPtr dev, @@ -867,7 +877,9 @@ virSecurityDACSetHostdevLabel(virSecurityManagerPtr mgr, virDomainHostdevSubsysPCIPtr pcisrc = &dev->source.subsys.u.pci; virDomainHostdevSubsysSCSIPtr scsisrc = &dev->source.subsys.u.scsi; virDomainHostdevSubsysSCSIVHostPtr hostsrc = &dev->source.subsys.u.scsi_host; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &dev->source.subsys.u.mdev; int ret = -1; + virMediatedDevicePtr mdev = NULL; if (!priv->dynamicOwnership) return 0; @@ -964,13 +976,26 @@ virSecurityDACSetHostdevLabel(virSecurityManagerPtr mgr, break; } - case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: { + char *vfio_dev = NULL; + if (!(mdev = virMediatedDeviceNew(mdevsrc->uuidstr))) + goto done; + + if (!(vfio_dev = virMediatedDeviceGetIOMMUGroupDev(mdev))) + goto done; + + ret = virSecurityDACSetMediatedDevLabel(mdev, vfio_dev, &cbdata); + VIR_FREE(vfio_dev); + break; + } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; } done: + virMediatedDeviceFree(mdev); return ret; } @@ -1018,6 +1043,15 @@ virSecurityDACRestoreHostLabel(virSCSIVHostDevicePtr dev ATTRIBUTE_UNUSED, return virSecurityDACRestoreFileLabel(priv, file); } +static int +virSecurityDACRestoreMediatedDevLabel(virMediatedDevicePtr dev ATTRIBUTE_UNUSED, + const char *file, + void *opaque) +{ + virSecurityManagerPtr mgr = opaque; + virSecurityDACDataPtr priv = virSecurityManagerGetPrivateData(mgr); + return virSecurityDACRestoreFileLabel(priv, file); +} static int virSecurityDACRestoreHostdevLabel(virSecurityManagerPtr mgr, @@ -1032,6 +1066,7 @@ virSecurityDACRestoreHostdevLabel(virSecurityManagerPtr mgr, virDomainHostdevSubsysPCIPtr pcisrc = &dev->source.subsys.u.pci; virDomainHostdevSubsysSCSIPtr scsisrc = &dev->source.subsys.u.scsi; virDomainHostdevSubsysSCSIVHostPtr hostsrc = &dev->source.subsys.u.scsi_host; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &dev->source.subsys.u.mdev; int ret = -1; secdef = virDomainDefGetSecurityLabelDef(def, SECURITY_DAC_NAME); @@ -1120,7 +1155,25 @@ virSecurityDACRestoreHostdevLabel(virSecurityManagerPtr mgr, break; } - case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: { + char *vfiodev = NULL; + virMediatedDevicePtr mdev = virMediatedDeviceNew(mdevsrc->uuidstr); + + if (!mdev) + goto done; + + if (!(vfiodev = virMediatedDeviceGetIOMMUGroupDev(mdev))) { + virMediatedDeviceFree(mdev); + goto done; + } + + ret = virSecurityDACRestoreMediatedDevLabel(mdev, vfiodev, mgr); + + VIR_FREE(vfiodev); + virMediatedDeviceFree(mdev); + break; + } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; -- 2.10.2

Label the VFIO IOMMU devices under /dev/vfio/ referenced by the symlinks in the sysfs (e.g. /sys/class/mdev_bus/<uuid>/iommu_group) which what qemu actually gets formatted on the command line. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/security/security_selinux.c | 56 +++++++++++++++++++++++++++++++++++++++-- 1 file changed, 54 insertions(+), 2 deletions(-) diff --git a/src/security/security_selinux.c b/src/security/security_selinux.c index e152c72..60bdb1c 100644 --- a/src/security/security_selinux.c +++ b/src/security/security_selinux.c @@ -36,6 +36,7 @@ #include "virerror.h" #include "viralloc.h" #include "virlog.h" +#include "virmdev.h" #include "virpci.h" #include "virusb.h" #include "virscsi.h" @@ -1686,6 +1687,13 @@ virSecuritySELinuxSetHostLabel(virSCSIVHostDevicePtr dev ATTRIBUTE_UNUSED, } static int +virSecuritySELinuxSetMediatedDevLabel(virMediatedDevicePtr dev ATTRIBUTE_UNUSED, + const char *file, void *opaque) +{ + return virSecuritySELinuxSetHostdevLabelHelper(file, opaque); +} + +static int virSecuritySELinuxSetHostdevSubsysLabel(virSecurityManagerPtr mgr, virDomainDefPtr def, virDomainHostdevDefPtr dev, @@ -1696,7 +1704,9 @@ virSecuritySELinuxSetHostdevSubsysLabel(virSecurityManagerPtr mgr, virDomainHostdevSubsysPCIPtr pcisrc = &dev->source.subsys.u.pci; virDomainHostdevSubsysSCSIPtr scsisrc = &dev->source.subsys.u.scsi; virDomainHostdevSubsysSCSIVHostPtr hostsrc = &dev->source.subsys.u.scsi_host; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &dev->source.subsys.u.mdev; virSecuritySELinuxCallbackData data = {.mgr = mgr, .def = def}; + virMediatedDevicePtr mdev = NULL; int ret = -1; @@ -1782,13 +1792,26 @@ virSecuritySELinuxSetHostdevSubsysLabel(virSecurityManagerPtr mgr, break; } - case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: { + char *vfio_dev = NULL; + if (!(mdev = virMediatedDeviceNew(mdevsrc->uuidstr))) + goto done; + + if (!(vfio_dev = virMediatedDeviceGetIOMMUGroupDev(mdev))) + goto done; + + ret = virSecuritySELinuxSetMediatedDevLabel(mdev, vfio_dev, &data); + VIR_FREE(vfio_dev); + break; + } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; } done: + virMediatedDeviceFree(mdev); return ret; } @@ -1918,6 +1941,16 @@ virSecuritySELinuxRestoreHostLabel(virSCSIVHostDevicePtr dev ATTRIBUTE_UNUSED, } static int +virSecuritySELinuxRestoreMediatedDevLabel(virMediatedDevicePtr dev ATTRIBUTE_UNUSED, + const char *file, + void *opaque) +{ + virSecurityManagerPtr mgr = opaque; + + return virSecuritySELinuxRestoreFileLabel(mgr, file); +} + +static int virSecuritySELinuxRestoreHostdevSubsysLabel(virSecurityManagerPtr mgr, virDomainHostdevDefPtr dev, const char *vroot) @@ -1927,6 +1960,7 @@ virSecuritySELinuxRestoreHostdevSubsysLabel(virSecurityManagerPtr mgr, virDomainHostdevSubsysPCIPtr pcisrc = &dev->source.subsys.u.pci; virDomainHostdevSubsysSCSIPtr scsisrc = &dev->source.subsys.u.scsi; virDomainHostdevSubsysSCSIVHostPtr hostsrc = &dev->source.subsys.u.scsi_host; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &dev->source.subsys.u.mdev; int ret = -1; /* Like virSecuritySELinuxRestoreImageLabelInt() for a networked @@ -2010,7 +2044,25 @@ virSecuritySELinuxRestoreHostdevSubsysLabel(virSecurityManagerPtr mgr, break; } - case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: { + char *vfiodev = NULL; + virMediatedDevicePtr mdev = virMediatedDeviceNew(mdevsrc->uuidstr); + + if (!mdev) + goto done; + + if (!(vfiodev = virMediatedDeviceGetIOMMUGroupDev(mdev))) { + virMediatedDeviceFree(mdev); + goto done; + } + + ret = virSecuritySELinuxRestoreMediatedDevLabel(mdev, vfiodev, mgr); + + VIR_FREE(vfiodev); + virMediatedDeviceFree(mdev); + break; + } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; -- 2.10.2

This merely introduces virDomainHostdevMatchSubsysMediatedDev method that is supposed to check whether device being cold-plugged does not already exist in the domain configuration. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/conf/domain_conf.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 69874f0..0fcf9f8 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -14267,6 +14267,19 @@ virDomainHostdevMatchSubsysSCSIiSCSI(virDomainHostdevDefPtr first, } static int +virDomainHostdevMatchSubsysMediatedDev(virDomainHostdevDefPtr a, + virDomainHostdevDefPtr b) +{ + virDomainHostdevSubsysMediatedDevPtr src_a = &a->source.subsys.u.mdev; + virDomainHostdevSubsysMediatedDevPtr src_b = &b->source.subsys.u.mdev; + + if (STREQ(src_a->uuidstr, src_b->uuidstr)) + return 1; + + return 0; +} + +static int virDomainHostdevMatchSubsys(virDomainHostdevDefPtr a, virDomainHostdevDefPtr b) { @@ -14297,6 +14310,7 @@ virDomainHostdevMatchSubsys(virDomainHostdevDefPtr a, else return 0; case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + return virDomainHostdevMatchSubsysMediatedDev(a, b); case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: return 0; } -- 2.10.2

So far, the official support is for x86_64 arch guests so unless a different device API than vfio-pci is available let's only turn on support for PCI address assignment. Once a different device API is introduced, we can enable another address type easily. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/qemu/qemu_domain.h | 1 + src/qemu/qemu_domain_address.c | 16 ++++++++++++---- 2 files changed, 13 insertions(+), 4 deletions(-) diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 524a672..341264b 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -34,6 +34,7 @@ # include "qemu_agent.h" # include "qemu_conf.h" # include "qemu_capabilities.h" +# include "virmdev.h" # include "virchrdev.h" # include "virobject.h" # include "logging/log_manager.h" diff --git a/src/qemu/qemu_domain_address.c b/src/qemu/qemu_domain_address.c index 5b75044..cacd131 100644 --- a/src/qemu/qemu_domain_address.c +++ b/src/qemu/qemu_domain_address.c @@ -618,9 +618,11 @@ qemuDomainDeviceCalculatePCIConnectFlags(virDomainDeviceDefPtr dev, virPCIDeviceAddressPtr hostAddr = &hostdev->source.subsys.u.pci.addr; if (hostdev->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS || - hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI) { + (hostdev->source.subsys.type != + VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && + hostdev->source.subsys.type != + VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV)) return 0; - } if (pciFlags == pcieFlags) { /* This arch/qemu only supports legacy PCI, so there @@ -642,6 +644,9 @@ qemuDomainDeviceCalculatePCIConnectFlags(virDomainDeviceDefPtr dev, return pcieFlags; } + if (hostdev->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV) + return pcieFlags; + if (!(pciDev = virPCIDeviceNew(hostAddr->domain, hostAddr->bus, hostAddr->slot, @@ -1725,12 +1730,15 @@ qemuDomainAssignDevicePCISlots(virDomainDefPtr def, /* Host PCI devices */ for (i = 0; i < def->nhostdevs; i++) { + virDomainHostdevSubsysPtr subsys = &def->hostdevs[i]->source.subsys; if (!virDeviceInfoPCIAddressWanted(def->hostdevs[i]->info)) continue; if (def->hostdevs[i]->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS) continue; - if (def->hostdevs[i]->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && - def->hostdevs[i]->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST) + if (subsys->type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && + subsys->type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST && + !(subsys->type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV && + subsys->u.mdev.model == VIR_MDEV_MODEL_TYPE_VFIO_PCI)) continue; if (qemuDomainPCIAddressReserveNextAddr(addrs, -- 2.10.2

Keep track of the assigned mediated devices the same way we do it for the rest of hostdevs. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/libvirt_private.syms | 1 + src/qemu/qemu_hostdev.c | 22 ++++++ src/qemu/qemu_hostdev.h | 4 ++ src/util/virhostdev.c | 176 ++++++++++++++++++++++++++++++++++++++++++++++- src/util/virhostdev.h | 9 +++ 5 files changed, 211 insertions(+), 1 deletion(-) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 48c86a2..97f81ee 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1709,6 +1709,7 @@ virHostdevPCINodeDeviceDetach; virHostdevPCINodeDeviceReAttach; virHostdevPCINodeDeviceReset; virHostdevPrepareDomainDevices; +virHostdevPrepareMediatedDevices; virHostdevPreparePCIDevices; virHostdevPrepareSCSIDevices; virHostdevPrepareSCSIVHostDevices; diff --git a/src/qemu/qemu_hostdev.c b/src/qemu/qemu_hostdev.c index 7cd49e4..45b731c 100644 --- a/src/qemu/qemu_hostdev.c +++ b/src/qemu/qemu_hostdev.c @@ -305,6 +305,24 @@ qemuHostdevPrepareSCSIVHostDevices(virQEMUDriverPtr driver, } int +qemuHostdevPrepareMediatedDevices(virQEMUDriverPtr driver, + const char *name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs) +{ + virHostdevManagerPtr hostdev_mgr = driver->hostdevMgr; + + if (!qemuHostdevHostSupportsPassthroughVFIO()) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("host doesn't support VFIO PCI interface")); + return -1; + } + + return virHostdevPrepareMediatedDevices(hostdev_mgr, QEMU_DRIVER_NAME, + name, hostdevs, nhostdevs); +} + +int qemuHostdevPrepareDomainDevices(virQEMUDriverPtr driver, virDomainDefPtr def, virQEMUCapsPtr qemuCaps, @@ -330,6 +348,10 @@ qemuHostdevPrepareDomainDevices(virQEMUDriverPtr driver, def->hostdevs, def->nhostdevs) < 0) return -1; + if (qemuHostdevPrepareMediatedDevices(driver, def->name, + def->hostdevs, def->nhostdevs) < 0) + return -1; + return 0; } diff --git a/src/qemu/qemu_hostdev.h b/src/qemu/qemu_hostdev.h index 74a7d4f..9399241 100644 --- a/src/qemu/qemu_hostdev.h +++ b/src/qemu/qemu_hostdev.h @@ -59,6 +59,10 @@ int qemuHostdevPrepareSCSIVHostDevices(virQEMUDriverPtr driver, const char *name, virDomainHostdevDefPtr *hostdevs, int nhostdevs); +int qemuHostdevPrepareMediatedDevices(virQEMUDriverPtr driver, + const char *name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs); int qemuHostdevPrepareDomainDevices(virQEMUDriverPtr driver, virDomainDefPtr def, virQEMUCapsPtr qemuCaps, diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index 86ca8e0..681f720 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -1,6 +1,6 @@ /* virhostdev.c: hostdev management * - * Copyright (C) 2006-2007, 2009-2016 Red Hat, Inc. + * Copyright (C) 2006-2007, 2009-2017 Red Hat, Inc. * Copyright (C) 2006 Daniel P. Berrange * Copyright (C) 2014 SUSE LINUX Products GmbH, Nuernberg, Germany. * @@ -147,6 +147,7 @@ virHostdevManagerDispose(void *obj) virObjectUnref(hostdevMgr->activeUSBHostdevs); virObjectUnref(hostdevMgr->activeSCSIHostdevs); virObjectUnref(hostdevMgr->activeSCSIVHostHostdevs); + virObjectUnref(hostdevMgr->activeMediatedHostdevs); VIR_FREE(hostdevMgr->stateDir); } @@ -174,6 +175,9 @@ virHostdevManagerNew(void) if (!(hostdevMgr->activeSCSIVHostHostdevs = virSCSIVHostDeviceListNew())) goto error; + if (!(hostdevMgr->activeMediatedHostdevs = virMediatedDeviceListNew())) + goto error; + if (privileged) { if (VIR_STRDUP(hostdevMgr->stateDir, HOSTDEV_STATE_DIR) < 0) goto error; @@ -1595,6 +1599,176 @@ virHostdevPrepareSCSIVHostDevices(virHostdevManagerPtr mgr, return -1; } + +static bool +virHostdevMediatedDeviceIsUsed(virMediatedDevicePtr dev, + virMediatedDeviceListPtr list) +{ + const char *drvname, *domname; + virMediatedDevicePtr tmp = NULL; + + virObjectLock(list); + + if ((tmp = virMediatedDeviceListFind(list, dev))) { + virMediatedDeviceGetUsedBy(tmp, &drvname, &domname); + virReportError(VIR_ERR_OPERATION_INVALID, + _("Mediated device %s is in use by " + "driver %s, domain %s"), + virMediatedDeviceGetPath(tmp), + drvname, domname); + } + + virObjectUnlock(list); + return !!tmp; +} + + +static int +virHostdevMediatedDeviceCheckModel(virMediatedDevicePtr dev, + virMediatedDeviceModelType model) +{ + int ret = -1; + char *dev_api = NULL; + int actual_model; + + if (virMediatedDeviceGetDeviceAPI(dev, &dev_api) < 0) + return -1; + + /* safeguard in case we've got an older libvirt which doesn't know newer + * device_api models yet + */ + if ((actual_model = virMediatedDeviceModelTypeFromString(dev_api)) <= 0) + goto cleanup; + + if (actual_model != model) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Invalid device API '%s' for device %s: " + "device only supports '%s'"), + virMediatedDeviceModelTypeToString(model), + virMediatedDeviceGetPath(dev), dev_api); + goto cleanup; + } + + ret = 0; + cleanup: + VIR_FREE(dev_api); + return ret; +} + + +static int +virHostdevMarkMediatedDevices(virHostdevManagerPtr mgr, + const char *drvname, + const char *domname, + virMediatedDeviceListPtr list) +{ + int ret = -1; + size_t i, j, count; + + virObjectLock(mgr->activeMediatedHostdevs); + + count = virMediatedDeviceListCount(list); + for (i = 0; i < count; i++) { + virMediatedDevicePtr mdev = virMediatedDeviceListGet(list, i); + VIR_DEBUG("Adding %s to activeMediatedHostdevs", + virMediatedDeviceGetPath(mdev)); + + virMediatedDeviceSetUsedBy(mdev, drvname, domname); + + /* Copy mdev references to the driver list: + * - caller is responsible for NOT freeing devices in @list on success + * - we're responsible for performing a rollback on failure + */ + if (virMediatedDeviceListAdd(mgr->activeMediatedHostdevs, mdev) < 0) + goto rollback; + } + + ret = 0; + cleanup: + virObjectUnlock(mgr->activeMediatedHostdevs); + return ret; + + rollback: + for (j = 0; j < i; j++) { + virMediatedDevicePtr tmp = virMediatedDeviceListGet(list, j); + virMediatedDeviceListSteal(mgr->activeMediatedHostdevs, tmp); + } + goto cleanup; +} + + +int +virHostdevPrepareMediatedDevices(virHostdevManagerPtr mgr, + const char *drv_name, + const char *dom_name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs) +{ + size_t i; + int ret = -1; + virMediatedDeviceListPtr list; + + if (!nhostdevs) + return 0; + + /* To prevent situation where mediated device is assigned to multiple + * domains we maintain a driver list of currently assigned mediated devices. + * A device is appended to the driver list after a series of preparations. + */ + if (!(list = virMediatedDeviceListNew())) + goto cleanup; + + /* Loop 1: Build a temporary list of unused mediated devices. */ + for (i = 0; i < nhostdevs; i++) { + virDomainHostdevDefPtr hostdev = hostdevs[i]; + virDomainHostdevSubsysMediatedDevPtr src = &hostdev->source.subsys.u.mdev; + virMediatedDevicePtr mdev; + + if (hostdev->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS) + continue; + if (hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV) + continue; + + if (!(mdev = virMediatedDeviceNew(src->uuidstr))) + goto cleanup; + + if (virHostdevMediatedDeviceIsUsed(mdev, mgr->activeMediatedHostdevs)) + goto cleanup; + + /* Check whether the user-provided model corresponds with the actually + * supported mediated device's API. + */ + if (virHostdevMediatedDeviceCheckModel(mdev, src->model) < 0) + goto cleanup; + + if (virMediatedDeviceListAdd(list, mdev) < 0) { + virMediatedDeviceFree(mdev); + goto cleanup; + } + } + + + /* Mark the devices in the list as used by @drv_name-@dom_name and copy the + * references to the driver list + */ + if (virHostdevMarkMediatedDevices(mgr, drv_name, dom_name, list) < 0) + goto cleanup; + + /* Loop 2: Temporary list was successfully merged with + * driver list, so steal all items to avoid freeing them + * in cleanup label. + */ + while (virMediatedDeviceListCount(list) > 0) { + virMediatedDevicePtr tmp = virMediatedDeviceListGet(list, 0); + virMediatedDeviceListSteal(list, tmp); + } + + ret = 0; + cleanup: + virObjectUnref(list); + return ret; +} + void virHostdevReAttachUSBDevices(virHostdevManagerPtr mgr, const char *drv_name, diff --git a/src/util/virhostdev.h b/src/util/virhostdev.h index 4c1fea3..b077089 100644 --- a/src/util/virhostdev.h +++ b/src/util/virhostdev.h @@ -31,6 +31,7 @@ # include "virusb.h" # include "virscsi.h" # include "virscsivhost.h" +# include "virmdev.h" # include "domain_conf.h" typedef enum { @@ -55,6 +56,7 @@ struct _virHostdevManager { virUSBDeviceListPtr activeUSBHostdevs; virSCSIDeviceListPtr activeSCSIHostdevs; virSCSIVHostDeviceListPtr activeSCSIVHostHostdevs; + virMediatedDeviceListPtr activeMediatedHostdevs; }; virHostdevManagerPtr virHostdevManagerGetDefault(void); @@ -96,6 +98,13 @@ virHostdevPrepareSCSIVHostDevices(virHostdevManagerPtr hostdev_mgr, virDomainHostdevDefPtr *hostdevs, int nhostdevs) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3); +int +virHostdevPrepareMediatedDevices(virHostdevManagerPtr hostdev_mgr, + const char *drv_name, + const char *dom_name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3); void virHostdevReAttachPCIDevices(virHostdevManagerPtr hostdev_mgr, const char *drv_name, -- 2.10.2

The name "reattach" does not really reflect the truth behind mediated devices, since these are purely software devices that do not need to be plugged back into the host in any way, however this patch pushes for naming consistency for methods where the logic behind the underlying operation stays the same except that in case of mdevs the operation itself is effectively a NO-OP. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/libvirt_private.syms | 1 + src/qemu/qemu_hostdev.c | 15 ++++++++++++++ src/qemu/qemu_hostdev.h | 4 ++++ src/util/virhostdev.c | 53 ++++++++++++++++++++++++++++++++++++++++++++++++ src/util/virhostdev.h | 7 +++++++ 5 files changed, 80 insertions(+) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 97f81ee..52aa7eb 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1715,6 +1715,7 @@ virHostdevPrepareSCSIDevices; virHostdevPrepareSCSIVHostDevices; virHostdevPrepareUSBDevices; virHostdevReAttachDomainDevices; +virHostdevReAttachMediatedDevices; virHostdevReAttachPCIDevices; virHostdevReAttachSCSIDevices; virHostdevReAttachSCSIVHostDevices; diff --git a/src/qemu/qemu_hostdev.c b/src/qemu/qemu_hostdev.c index 45b731c..6a7232f 100644 --- a/src/qemu/qemu_hostdev.c +++ b/src/qemu/qemu_hostdev.c @@ -419,6 +419,18 @@ qemuHostdevReAttachSCSIVHostDevices(virQEMUDriverPtr driver, } void +qemuHostdevReAttachMediatedDevices(virQEMUDriverPtr driver, + const char *name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs) +{ + virHostdevManagerPtr hostdev_mgr = driver->hostdevMgr; + + virHostdevReAttachMediatedDevices(hostdev_mgr, QEMU_DRIVER_NAME, + name, hostdevs, nhostdevs); +} + +void qemuHostdevReAttachDomainDevices(virQEMUDriverPtr driver, virDomainDefPtr def) { @@ -436,4 +448,7 @@ qemuHostdevReAttachDomainDevices(virQEMUDriverPtr driver, qemuHostdevReAttachSCSIVHostDevices(driver, def->name, def->hostdevs, def->nhostdevs); + + qemuHostdevReAttachMediatedDevices(driver, def->name, def->hostdevs, + def->nhostdevs); } diff --git a/src/qemu/qemu_hostdev.h b/src/qemu/qemu_hostdev.h index 9399241..0096497 100644 --- a/src/qemu/qemu_hostdev.h +++ b/src/qemu/qemu_hostdev.h @@ -84,6 +84,10 @@ void qemuHostdevReAttachSCSIVHostDevices(virQEMUDriverPtr driver, const char *name, virDomainHostdevDefPtr *hostdevs, int nhostdevs); +void qemuHostdevReAttachMediatedDevices(virQEMUDriverPtr driver, + const char *name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs); void qemuHostdevReAttachDomainDevices(virQEMUDriverPtr driver, virDomainDefPtr def); diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index 681f720..2a43b4d 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -1963,6 +1963,59 @@ virHostdevReAttachSCSIVHostDevices(virHostdevManagerPtr mgr, virObjectUnlock(mgr->activeSCSIVHostHostdevs); } +void +virHostdevReAttachMediatedDevices(virHostdevManagerPtr mgr, + const char *drv_name, + const char *dom_name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs) +{ + const char *used_by_drvname = NULL; + const char *used_by_domname = NULL; + virDomainHostdevDefPtr hostdev = NULL; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = NULL; + size_t i; + + if (nhostdevs == 0) + return; + + virObjectLock(mgr->activeMediatedHostdevs); + for (i = 0; i < nhostdevs; i++) { + virMediatedDevicePtr mdev, tmp; + + hostdev = hostdevs[i]; + mdevsrc = &hostdev->source.subsys.u.mdev; + + if (hostdev->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS || + hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV) + continue; + + if (!(mdev = virMediatedDeviceNew(mdevsrc->uuidstr))) { + VIR_WARN("Failed to reattach mediated device %s attached to " + "domain %s", mdevsrc->uuidstr, dom_name); + continue; + } + + /* Remove from the list only mdevs assigned to @drv_name/@dom_name */ + + tmp = virMediatedDeviceListFind(mgr->activeMediatedHostdevs, mdev); + virMediatedDeviceFree(mdev); + + /* skip inactive devices */ + if (!tmp) + continue; + + virMediatedDeviceGetUsedBy(tmp, &used_by_drvname, &used_by_domname); + if (STREQ_NULLABLE(drv_name, used_by_drvname) && + STREQ_NULLABLE(dom_name, used_by_domname)) { + VIR_DEBUG("Removing %s dom=%s from activeMediatedHostdevs", + mdevsrc->uuidstr, dom_name); + virMediatedDeviceListDel(mgr->activeMediatedHostdevs, tmp); + } + } + virObjectUnlock(mgr->activeMediatedHostdevs); +} + int virHostdevPCINodeDeviceDetach(virHostdevManagerPtr mgr, virPCIDevicePtr pci) diff --git a/src/util/virhostdev.h b/src/util/virhostdev.h index b077089..d0875d8 100644 --- a/src/util/virhostdev.h +++ b/src/util/virhostdev.h @@ -134,6 +134,13 @@ virHostdevReAttachSCSIVHostDevices(virHostdevManagerPtr hostdev_mgr, virDomainHostdevDefPtr *hostdevs, int nhostdevs) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3); +void +virHostdevReAttachMediatedDevices(virHostdevManagerPtr hostdev_mgr, + const char *drv_name, + const char *dom_name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3); int virHostdevUpdateActivePCIDevices(virHostdevManagerPtr mgr, virDomainHostdevDefPtr *hostdevs, -- 2.10.2

As goes for all the other hostdev device types, grant the qemu process access to /dev/vfio/<mediated_device_iommu_group>. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/qemu/qemu_cgroup.c | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c index 0902624..bcd10db 100644 --- a/src/qemu/qemu_cgroup.c +++ b/src/qemu/qemu_cgroup.c @@ -318,6 +318,23 @@ qemuSetupHostSCSIVHostDeviceCgroup(virSCSIVHostDevicePtr dev ATTRIBUTE_UNUSED, return ret; } +static int +qemuSetupHostMediatedDeviceCgroup(virMediatedDevicePtr dev ATTRIBUTE_UNUSED, + const char *path, + void *opaque) +{ + virDomainObjPtr vm = opaque; + qemuDomainObjPrivatePtr priv = vm->privateData; + int ret = -1; + + VIR_DEBUG("Process path '%s' for mediated device", path); + ret = virCgroupAllowDevicePath(priv->cgroup, path, + VIR_CGROUP_DEVICE_RW, false); + virDomainAuditCgroupPath(vm, priv->cgroup, "allow", path, "rw", ret == 0); + + return ret; +} + int qemuSetupHostdevCgroup(virDomainObjPtr vm, virDomainHostdevDefPtr dev) @@ -328,10 +345,12 @@ qemuSetupHostdevCgroup(virDomainObjPtr vm, virDomainHostdevSubsysPCIPtr pcisrc = &dev->source.subsys.u.pci; virDomainHostdevSubsysSCSIPtr scsisrc = &dev->source.subsys.u.scsi; virDomainHostdevSubsysSCSIVHostPtr hostsrc = &dev->source.subsys.u.scsi_host; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &dev->source.subsys.u.mdev; virPCIDevicePtr pci = NULL; virUSBDevicePtr usb = NULL; virSCSIDevicePtr scsi = NULL; virSCSIVHostDevicePtr host = NULL; + virMediatedDevicePtr mdev = NULL; char *path = NULL; /* currently this only does something for PCI devices using vfio @@ -434,6 +453,15 @@ qemuSetupHostdevCgroup(virDomainObjPtr vm, } case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + if (!(mdev = virMediatedDeviceNew(mdevsrc->uuidstr))) + goto cleanup; + + if (!(path = virMediatedDeviceGetIOMMUGroupDev(mdev))) + goto cleanup; + + if (qemuSetupHostMediatedDeviceCgroup(mdev, path, vm) < 0) + goto cleanup; + break; case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: @@ -447,6 +475,7 @@ qemuSetupHostdevCgroup(virDomainObjPtr vm, virUSBDeviceFree(usb); virSCSIDeviceFree(scsi); virSCSIVHostDeviceFree(host); + virMediatedDeviceFree(mdev); VIR_FREE(path); return ret; } -- 2.10.2

Again, as for all the other hostdev device types, make sure that the /dev/vfio/<mdev_iommu_group> device will be added to the qemu namespace. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/qemu/qemu_domain.c | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index c4d7a47..eb86385 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -6852,10 +6852,12 @@ qemuDomainGetHostdevPath(virDomainHostdevDefPtr dev, virDomainHostdevSubsysPCIPtr pcisrc = &dev->source.subsys.u.pci; virDomainHostdevSubsysSCSIPtr scsisrc = &dev->source.subsys.u.scsi; virDomainHostdevSubsysSCSIVHostPtr hostsrc = &dev->source.subsys.u.scsi_host; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &dev->source.subsys.u.mdev; virPCIDevicePtr pci = NULL; virUSBDevicePtr usb = NULL; virSCSIDevicePtr scsi = NULL; virSCSIVHostDevicePtr host = NULL; + virMediatedDevicePtr mdev = NULL; char *tmpPath = NULL; bool freeTmpPath = false; @@ -6930,6 +6932,14 @@ qemuDomainGetHostdevPath(virDomainHostdevDefPtr dev, } case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + if (!(mdev = virMediatedDeviceNew(mdevsrc->uuidstr))) + goto cleanup; + + if (!(tmpPath = virMediatedDeviceGetIOMMUGroupDev(mdev))) + goto cleanup; + + freeTmpPath = true; + break; case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: break; } @@ -6950,6 +6960,7 @@ qemuDomainGetHostdevPath(virDomainHostdevDefPtr dev, virUSBDeviceFree(usb); virSCSIDeviceFree(scsi); virSCSIVHostDeviceFree(host); + virMediatedDeviceFree(mdev); if (freeTmpPath) VIR_FREE(tmpPath); return ret; -- 2.10.2

Format the mediated devices on the qemu command line as -device vfio-pci,sysfsdev='/path/to/device/in/syfs'. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/qemu/qemu_command.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_command.h | 5 +++++ 2 files changed, 54 insertions(+) diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index c00a47a..9c12c21 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -57,6 +57,7 @@ #include "virscsi.h" #include "virnuma.h" #include "virgic.h" +#include "virmdev.h" #if defined(__linux__) # include <linux/capability.h> #endif @@ -5161,6 +5162,35 @@ qemuBuildChrChardevStr(virLogManagerPtr logManager, return ret; } +char * +qemuBuildHostdevMediatedDevStr(const virDomainDef *def, + virDomainHostdevDefPtr dev, + virQEMUCapsPtr qemuCaps) +{ + virBuffer buf = VIR_BUFFER_INITIALIZER; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &dev->source.subsys.u.mdev; + virMediatedDevicePtr mdev = NULL; + char *ret = NULL; + + if (!(mdev = virMediatedDeviceNew(mdevsrc->uuidstr))) + goto cleanup; + + virBufferAddLit(&buf, "vfio-pci"); + virBufferAsprintf(&buf, ",sysfsdev=%s", virMediatedDeviceGetPath(mdev)); + + if (qemuBuildDeviceAddressStr(&buf, def, dev->info, qemuCaps) < 0) + goto cleanup; + + if (virBufferCheckError(&buf) < 0) + goto cleanup; + + ret = virBufferContentAndReset(&buf); + + cleanup: + virBufferFreeAndReset(&buf); + virMediatedDeviceFree(mdev); + return ret; +} static int qemuBuildHostdevCommandLine(virCommandPtr cmd, @@ -5349,6 +5379,25 @@ qemuBuildHostdevCommandLine(virCommandPtr cmd, VIR_FREE(devstr); } } + + /* MDEV */ + if (hostdev->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS && + subsys->type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV) { + + if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE_VFIO_PCI)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("VFIO PCI device assignment is not " + "supported by this version of qemu")); + return -1; + } + + virCommandAddArg(cmd, "-device"); + if (!(devstr = + qemuBuildHostdevMediatedDevStr(def, hostdev, qemuCaps))) + return -1; + virCommandAddArg(cmd, devstr); + VIR_FREE(devstr); + } } return 0; diff --git a/src/qemu/qemu_command.h b/src/qemu/qemu_command.h index 69fe846..b263b87 100644 --- a/src/qemu/qemu_command.h +++ b/src/qemu/qemu_command.h @@ -171,6 +171,11 @@ qemuBuildSCSIVHostHostdevDevStr(const virDomainDef *def, virQEMUCapsPtr qemuCaps, char *vhostfdName); +char * +qemuBuildHostdevMediatedDevStr(const virDomainDef *def, + virDomainHostdevDefPtr dev, + virQEMUCapsPtr qemuCaps); + char *qemuBuildRedirdevDevStr(const virDomainDef *def, virDomainRedirdevDefPtr dev, virQEMUCapsPtr qemuCaps); -- 2.10.2

For now, these only cover the unmanaged, i.e. user pre-created devices. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- .../qemuxml2argv-hostdev-mdev-unmanaged.args | 25 ++++++++++++++ .../qemuxml2argv-hostdev-mdev-unmanaged.xml | 37 ++++++++++++++++++++ tests/qemuxml2argvtest.c | 6 ++++ .../qemuxml2xmlout-hostdev-mdev-unmanaged.xml | 40 ++++++++++++++++++++++ tests/qemuxml2xmltest.c | 1 + 5 files changed, 109 insertions(+) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.xml create mode 100644 tests/qemuxml2xmloutdata/qemuxml2xmlout-hostdev-mdev-unmanaged.xml diff --git a/tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.args b/tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.args new file mode 100644 index 0000000..fdefeb6 --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.args @@ -0,0 +1,25 @@ +LC_ALL=C \ +PATH=/bin \ +HOME=/home/test \ +USER=test \ +LOGNAME=test \ +QEMU_AUDIO_DRV=none \ +/usr/bin/qemu \ +-name QEMUGuest2 \ +-S \ +-M pc \ +-m 214 \ +-smp 1,sockets=1,cores=1,threads=1 \ +-uuid c7a5fdbd-edaf-9455-926a-d65c16db1809 \ +-nographic \ +-nodefconfig \ +-nodefaults \ +-monitor unix:/tmp/lib/domain--1-QEMUGuest2/monitor.sock,server,nowait \ +-no-acpi \ +-boot c \ +-usb \ +-drive file=/dev/HostVG/QEMUGuest2,format=raw,if=none,id=drive-ide0-0-0 \ +-device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 \ +-device vfio-pci,\ +sysfsdev=/sys/bus/mdev/devices/53764d0e-85a0-42b4-af5c-2046b460b1dc,bus=pci.0,\ +addr=0x3 diff --git a/tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.xml b/tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.xml new file mode 100644 index 0000000..144b769 --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.xml @@ -0,0 +1,37 @@ +<domain type='qemu'> + <name>QEMUGuest2</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>1</vcpu> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu</emulator> + <disk type='block' device='disk'> + <driver name='qemu' type='raw'/> + <source dev='/dev/HostVG/QEMUGuest2'/> + <target dev='hda' bus='ide'/> + <address type='drive' controller='0' bus='0' target='0' unit='0'/> + </disk> + <controller type='usb' index='0'> + </controller> + <controller type='pci' index='0' model='pci-root'/> + <controller type='ide' index='0'> + </controller> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <hostdev mode='subsystem' type='mdev' model='vfio-pci'> + <source> + <address type='mdev' uuid='53764d0e-85a0-42b4-af5c-2046b460b1dc'/> + </source> + </hostdev> + <memballoon model='none'/> + </devices> +</domain> diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c index faa99c6..e1173ac 100644 --- a/tests/qemuxml2argvtest.c +++ b/tests/qemuxml2argvtest.c @@ -1475,6 +1475,12 @@ mymain(void) DO_TEST("hostdev-vfio-multidomain", QEMU_CAPS_NODEFCONFIG, QEMU_CAPS_DEVICE_VFIO_PCI, QEMU_CAPS_HOST_PCI_MULTIDOMAIN); + DO_TEST("hostdev-mdev-unmanaged", + QEMU_CAPS_NODEFCONFIG, + QEMU_CAPS_DEVICE_VFIO_PCI); + DO_TEST_PARSE_ERROR("hostdev-mdev-unmanaged-no-uuid", + QEMU_CAPS_NODEFCONFIG, + QEMU_CAPS_DEVICE_VFIO_PCI); DO_TEST_FAILURE("hostdev-vfio-multidomain", QEMU_CAPS_NODEFCONFIG, QEMU_CAPS_DEVICE_VFIO_PCI); DO_TEST("pci-rom", diff --git a/tests/qemuxml2xmloutdata/qemuxml2xmlout-hostdev-mdev-unmanaged.xml b/tests/qemuxml2xmloutdata/qemuxml2xmlout-hostdev-mdev-unmanaged.xml new file mode 100644 index 0000000..55f19c4 --- /dev/null +++ b/tests/qemuxml2xmloutdata/qemuxml2xmlout-hostdev-mdev-unmanaged.xml @@ -0,0 +1,40 @@ +<domain type='qemu'> + <name>QEMUGuest2</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>1</vcpu> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu</emulator> + <disk type='block' device='disk'> + <driver name='qemu' type='raw'/> + <source dev='/dev/HostVG/QEMUGuest2'/> + <target dev='hda' bus='ide'/> + <address type='drive' controller='0' bus='0' target='0' unit='0'/> + </disk> + <controller type='usb' index='0'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/> + </controller> + <controller type='pci' index='0' model='pci-root'/> + <controller type='ide' index='0'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/> + </controller> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <hostdev mode='subsystem' type='mdev' managed='no' model='vfio-pci'> + <source> + <address type='mdev' uuid='53764d0e-85a0-42b4-af5c-2046b460b1dc'/> + </source> + <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> + </hostdev> + <memballoon model='none'/> + </devices> +</domain> diff --git a/tests/qemuxml2xmltest.c b/tests/qemuxml2xmltest.c index 0702f58..fbb5192 100644 --- a/tests/qemuxml2xmltest.c +++ b/tests/qemuxml2xmltest.c @@ -550,6 +550,7 @@ mymain(void) DO_TEST("hostdev-usb-address", NONE); DO_TEST("hostdev-pci-address", NONE); DO_TEST("hostdev-vfio", NONE); + DO_TEST("hostdev-mdev-unmanaged", NONE); DO_TEST("pci-rom", NONE); DO_TEST("pci-serial-dev-chardev", NONE); -- 2.10.2

Signed-off-by: Erik Skultety <eskultet@redhat.com> --- docs/formatdomain.html.in | 48 +++++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 44 insertions(+), 4 deletions(-) diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index 0a115f5..8f21436 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -3277,8 +3277,20 @@ attributes: <code>iobase</code> and <code>irq</code>. <span class="since">Since 1.2.1</span> </dd> + <dt><code>mdev</code></dt> + <dd>Mediated devices' addresses have so far only one mandatory attribute + <code>uuid</code> (<span class="since">since 3.1.0</span>) which + uniquely identifies a mediated device under the syfs file system. + </dd> </dl> + <p> + Note: Due to nature of mediated devices, being only software devices + defining an allocation of resources on the physical parent device, the + address type <code>mdev</code> is supposed to be used to identify a + device on the host only, rather than identifying it in the guest. + </p> + <h4><a name="elementsControllers">Controllers</a></h4> <p> @@ -3774,6 +3786,19 @@ </devices> ...</pre> + <p>or:</p> + +<pre> + ... + <devices> + <hostdev mode='subsystem' type='mdev' model='vfio-pci'> + <source> + <address type='mdev' uuid='c2177883-f1bb-47f0-914d-32a22e3a8804'> + </source> + </hostdev> + </devices> + ...</pre> + <dl> <dt><code>hostdev</code></dt> <dd>The <code>hostdev</code> element is the main container for describing @@ -3818,12 +3843,23 @@ <code>type</code> passes all LUNs presented by a single HBA to the guest. </dd> + <dt><code>mdev</code></dt> + <dd>For mediated devices (<span class="since">Since 3.1.0</span>) + the <code>model</code> attribute specifies the device API which + determines how the host's vfio driver will expose the device to the + guest. Currently, only <code>vfio-pci</code> model is supported. + The model also has implications on the guest's address type, i.e. + for <code>vfio-pci</code> device API any address type other than PCI + will result in an error. + </dd> </dl> <p> - Note: The <code>managed</code> attribute is only used with PCI devices - and is ignored by all the other device types, thus setting - <code>managed</code> explicitly with other than PCI device has the same - effect as omitting it. + Note: The <code>managed</code> attribute is only used with PCI and is + ignored by all the other device types, thus setting + <code>managed</code> explicitly with other than a PCI device has the + same effect as omitting it. Similarly, <code>model</code> attribute is + only supported by mediated devices and ignored by all other device + types. </p> </dd> <dt><code>source</code></dt> @@ -3888,6 +3924,10 @@ is the vhost_scsi wwpn (16 hexadecimal digits with a prefix of "naa.") established in the host configfs. </dd> + <dt><code>mdev</code></dt> + <dd>Mediated devices (<span class="since">Since 3.1.0</span>) are + described by the <code>address</code> element. + </dd> </dl> </dd> <dt><code>vendor</code>, <code>product</code></dt> -- 2.10.2

Signed-off-by: Erik Skultety <eskultet@redhat.com> --- docs/news.xml | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/docs/news.xml b/docs/news.xml index b756a97..55fa5ed 100644 --- a/docs/news.xml +++ b/docs/news.xml @@ -53,6 +53,17 @@ was <code>virtio-net</code>. </description> </change> + <change> + <summary> + qemu: add mediated devices framework support + </summary> + <description> + With the upcomming kernel changes which introduce the new mediated + device framework, provide an initial support of this framework into + libvirt, mainly introduce new device type as well as an address type + in the XML. + </description> + </change> </section> <section title="Improvements"> <change> -- 2.10.2

Hi Erik, On Wed, 15 Feb 2017 22:32:17 +0100 Erik Skultety <eskultet@redhat.com> wrote:
since v1: - new <hostdev> attribute model introduced which tells libvirt which device API should be considered when auto-assigning guest address - device_api is properly checked, thus taking the 'model' attribute only as a hint to assign "some" address - new address type 'mdev' is introduced rather than using plain <uuid> element, since the address element is more conveniently extendable. - the emulated mtty driver now works as well out of the box, so no HW needed to review this series --> let's try it :) - fixed all the nits from v1
Yay! Is this missing the part where the locked memory limits get bumped for vfio devices though? We don't know how much any given mdev device will need to lock, but we should plan for up to the full VM size, just like vfio devices. Otherwise it seems like it works! Thanks, Alex
Erik Skultety (18): util: Introduce new module virmdev conf: Introduce new hostdev device type mdev conf: Introduce new address type mdev conf: Update XML parser, formatter, and RNG schema to support mdev conf: Introduce virDomainHostdevDefPostParse conf: Add post parse code for mdevs to virDomainHostdevDefPostParse security: dac: Enable labeling of vfio mediated devices security: selinux: Enable labeling of vfio mediated devices conf: Enable cold-plug of a mediated device qemu: Assign PCI addresses for mediated devices as well hostdev: Maintain a driver list of active mediated devices hostdev: Introduce a reattach method for mediated devices qemu: cgroup: Adjust cgroups' logic to allow mediated devices qemu: namespace: Hook up the discovery of mdevs into the namespace code qemu: Format mdevs on the qemu command line test: Add some test cases for our test suite regarding the mdevs docs: Document the new hostdev and address type 'mdev' news: Update the NEWS.xml about the new mdev feature
docs/formatdomain.html.in | 48 ++- docs/news.xml | 11 + docs/schemas/domaincommon.rng | 26 ++ po/POTFILES.in | 1 + src/Makefile.am | 1 + src/conf/device_conf.h | 1 + src/conf/domain_conf.c | 203 ++++++++++-- src/conf/domain_conf.h | 9 + src/libvirt_private.syms | 20 ++ src/qemu/qemu_cgroup.c | 34 ++ src/qemu/qemu_command.c | 49 +++ src/qemu/qemu_command.h | 5 + src/qemu/qemu_domain.c | 12 + src/qemu/qemu_domain.h | 1 + src/qemu/qemu_domain_address.c | 16 +- src/qemu/qemu_hostdev.c | 37 +++ src/qemu/qemu_hostdev.h | 8 + src/qemu/qemu_hotplug.c | 2 + src/security/security_apparmor.c | 3 + src/security/security_dac.c | 55 ++++ src/security/security_selinux.c | 54 ++++ src/util/virhostdev.c | 229 ++++++++++++- src/util/virhostdev.h | 16 + src/util/virmdev.c | 358 +++++++++++++++++++++ src/util/virmdev.h | 93 ++++++ tests/domaincapsschemadata/full.xml | 1 + .../qemuxml2argv-hostdev-mdev-unmanaged.args | 25 ++ .../qemuxml2argv-hostdev-mdev-unmanaged.xml | 37 +++ tests/qemuxml2argvtest.c | 6 + .../qemuxml2xmlout-hostdev-mdev-unmanaged.xml | 40 +++ tests/qemuxml2xmltest.c | 1 + 31 files changed, 1362 insertions(+), 40 deletions(-) create mode 100644 src/util/virmdev.c create mode 100644 src/util/virmdev.h create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.xml create mode 100644 tests/qemuxml2xmloutdata/qemuxml2xmlout-hostdev-mdev-unmanaged.xml

Since mdevs are just another type of VFIO devices, we should increase the memory locking limit the same way we do for VFIO PCI devices. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/qemu/qemu_domain.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index eb86385..bdd687f 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -6222,11 +6222,12 @@ qemuDomainRequiresMemLock(virDomainDefPtr def) return true; for (i = 0; i < def->nhostdevs; i++) { - virDomainHostdevDefPtr dev = def->hostdevs[i]; + virDomainHostdevSubsysPtr subsys = &def->hostdevs[i]->source.subsys; - if (dev->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS && - dev->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && - dev->source.subsys.u.pci.backend == VIR_DOMAIN_HOSTDEV_PCI_BACKEND_VFIO) + if (def->hostdevs[i]->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS && + (subsys->type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV || + (subsys->type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && + subsys->u.pci.backend == VIR_DOMAIN_HOSTDEV_PCI_BACKEND_VFIO))) return true; } -- 2.10.2

On Thu, 16 Feb 2017 17:49:45 +0100 Erik Skultety <eskultet@redhat.com> wrote:
Since mdevs are just another type of VFIO devices, we should increase the memory locking limit the same way we do for VFIO PCI devices.
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/qemu/qemu_domain.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index eb86385..bdd687f 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -6222,11 +6222,12 @@ qemuDomainRequiresMemLock(virDomainDefPtr def) return true;
for (i = 0; i < def->nhostdevs; i++) { - virDomainHostdevDefPtr dev = def->hostdevs[i]; + virDomainHostdevSubsysPtr subsys = &def->hostdevs[i]->source.subsys;
- if (dev->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS && - dev->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && - dev->source.subsys.u.pci.backend == VIR_DOMAIN_HOSTDEV_PCI_BACKEND_VFIO) + if (def->hostdevs[i]->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS && + (subsys->type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV || + (subsys->type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && + subsys->u.pci.backend == VIR_DOMAIN_HOSTDEV_PCI_BACKEND_VFIO))) return true; }
Nice! For series: Tested-by: Alex Williamson <alex.williamson@redhat.com> (with KVMGT vGPU mdev)

On Thu, 16 Feb 2017 11:04:39 -0700 Alex Williamson <alex.williamson@redhat.com> wrote:
On Thu, 16 Feb 2017 17:49:45 +0100 Erik Skultety <eskultet@redhat.com> wrote:
Since mdevs are just another type of VFIO devices, we should increase the memory locking limit the same way we do for VFIO PCI devices.
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/qemu/qemu_domain.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index eb86385..bdd687f 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -6222,11 +6222,12 @@ qemuDomainRequiresMemLock(virDomainDefPtr def) return true;
for (i = 0; i < def->nhostdevs; i++) { - virDomainHostdevDefPtr dev = def->hostdevs[i]; + virDomainHostdevSubsysPtr subsys = &def->hostdevs[i]->source.subsys;
- if (dev->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS && - dev->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && - dev->source.subsys.u.pci.backend == VIR_DOMAIN_HOSTDEV_PCI_BACKEND_VFIO) + if (def->hostdevs[i]->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS && + (subsys->type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV || + (subsys->type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && + subsys->u.pci.backend == VIR_DOMAIN_HOSTDEV_PCI_BACKEND_VFIO))) return true; }
Nice! For series:
Tested-by: Alex Williamson <alex.williamson@redhat.com>
(with KVMGT vGPU mdev)
Nit, if I configure a VM for an invalid mdev uuid, I get the following error message: Error starting domain: Requested operation is not valid: mediated devices are not supported by this kernel Traceback (most recent call last): File "/usr/share/virt-manager/virtManager/asyncjob.py", line 88, in cb_wrapper callback(asyncjob, *args, **kwargs) File "/usr/share/virt-manager/virtManager/asyncjob.py", line 124, in tmpcb callback(*args, **kwargs) File "/usr/share/virt-manager/virtManager/libvirtobject.py", line 83, in newfn ret = fn(self, *args, **kwargs) File "/usr/share/virt-manager/virtManager/domain.py", line 1404, in startup self._backend.create() File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1035, in create if ret == -1: raise libvirtError ('virDomainCreate() failed', dom=self) libvirtError: Requested operation is not valid: mediated devices are not supported by this kernel In this case it should really just be a device not found error, the speculation that the kernel doesn't support mediated devices is incorrect. Thanks, Alex

Nice! For series:
Tested-by: Alex Williamson <alex.williamson@redhat.com>
(with KVMGT vGPU mdev)
Thank you very much for the effort Alex. Given its current state, I'm glad KVMGT was testable with my patches :).
Nit, if I configure a VM for an invalid mdev uuid, I get the following error message:
Error starting domain: Requested operation is not valid: mediated devices are not supported by this kernel
Traceback (most recent call last): File "/usr/share/virt-manager/virtManager/asyncjob.py", line 88, in cb_wrapper callback(asyncjob, *args, **kwargs) File "/usr/share/virt-manager/virtManager/asyncjob.py", line 124, in tmpcb callback(*args, **kwargs) File "/usr/share/virt-manager/virtManager/libvirtobject.py", line 83, in newfn ret = fn(self, *args, **kwargs) File "/usr/share/virt-manager/virtManager/domain.py", line 1404, in startup self._backend.create() File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1035, in create if ret == -1: raise libvirtError ('virDomainCreate() failed', dom=self) libvirtError: Requested operation is not valid: mediated devices are not supported by this kernel
In this case it should really just be a device not found error, the speculation that the kernel doesn't support mediated devices is incorrect. Thanks,
Noted, I'll address this as part of, presumably, v3 when I get a patch review. Thanks, Erik
participants (2)
-
Alex Williamson
-
Erik Skultety