[libvirt] [RFC PATCH v2 REBASE 00/18] Introduce vGPU mdev framework to libvirt

since the original v2 [1]: - resolved a few merge conflicts caused by @9d92f533 which refactored out some duplicate code which eventually lead to dropping patch 14/18 from the original series due to being unnecessary - rebased onto fresh HEAD [1] https://www.redhat.com/archives/libvir-list/2017-February/msg00739.html Erik Skultety (18): util: Introduce new module virmdev conf: Introduce new hostdev device type mdev conf: Introduce new address type mdev conf: Update XML parser, formatter, and RNG schema to support mdev conf: Introduce virDomainHostdevDefPostParse conf: Add post parse code for mdevs to virDomainHostdevDefPostParse security: dac: Enable labeling of vfio mediated devices security: selinux: Enable labeling of vfio mediated devices conf: Enable cold-plug of a mediated device qemu: Assign PCI addresses for mediated devices as well hostdev: Maintain a driver list of active mediated devices hostdev: Introduce a reattach method for mediated devices qemu: cgroup: Adjust cgroups' logic to allow mediated devices qemu: Bump the memory locking limit for mdevs as well qemu: Format mdevs on qemu command line test: Add some test cases for our test suite regarding the mdevs docs: Document the new hostdev and address type 'mdev' news: Update the NEWS.xml about the new mdev feature docs/formatdomain.html.in | 48 ++- docs/news.xml | 9 + docs/schemas/domaincommon.rng | 26 ++ po/POTFILES.in | 1 + src/Makefile.am | 1 + src/conf/device_conf.h | 1 + src/conf/domain_conf.c | 203 ++++++++++-- src/conf/domain_conf.h | 9 + src/libvirt_private.syms | 20 ++ src/qemu/qemu_command.c | 49 +++ src/qemu/qemu_command.h | 5 + src/qemu/qemu_domain.c | 23 +- src/qemu/qemu_domain.h | 1 + src/qemu/qemu_domain_address.c | 16 +- src/qemu/qemu_hostdev.c | 37 +++ src/qemu/qemu_hostdev.h | 8 + src/qemu/qemu_hotplug.c | 2 + src/security/security_apparmor.c | 3 + src/security/security_dac.c | 55 ++++ src/security/security_selinux.c | 54 ++++ src/util/virhostdev.c | 229 ++++++++++++- src/util/virhostdev.h | 16 + src/util/virmdev.c | 358 +++++++++++++++++++++ src/util/virmdev.h | 93 ++++++ tests/domaincapsschemadata/full.xml | 1 + .../qemuxml2argv-hostdev-mdev-unmanaged.args | 25 ++ .../qemuxml2argv-hostdev-mdev-unmanaged.xml | 37 +++ tests/qemuxml2argvtest.c | 6 + .../qemuxml2xmlout-hostdev-mdev-unmanaged.xml | 40 +++ tests/qemuxml2xmltest.c | 1 + 30 files changed, 1333 insertions(+), 44 deletions(-) create mode 100644 src/util/virmdev.c create mode 100644 src/util/virmdev.h create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.xml create mode 100644 tests/qemuxml2xmloutdata/qemuxml2xmlout-hostdev-mdev-unmanaged.xml -- 2.10.2

Beside creation, disposal, getter, and setter methods the module exports methods to work with lists of mediated devices. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- po/POTFILES.in | 1 + src/Makefile.am | 1 + src/libvirt_private.syms | 18 +++ src/util/virmdev.c | 358 +++++++++++++++++++++++++++++++++++++++++++++++ src/util/virmdev.h | 93 ++++++++++++ 5 files changed, 471 insertions(+) create mode 100644 src/util/virmdev.c create mode 100644 src/util/virmdev.h diff --git a/po/POTFILES.in b/po/POTFILES.in index 9f66697..c857211 100644 --- a/po/POTFILES.in +++ b/po/POTFILES.in @@ -218,6 +218,7 @@ src/util/virlease.c src/util/virlockspace.c src/util/virlog.c src/util/virmacmap.c +src/util/virmdev.c src/util/virnetdev.c src/util/virnetdevbandwidth.c src/util/virnetdevbridge.c diff --git a/src/Makefile.am b/src/Makefile.am index 46ca272..7bc2d3b 100644 --- a/src/Makefile.am +++ b/src/Makefile.am @@ -188,6 +188,7 @@ UTIL_SOURCES = \ util/virvhba.c util/virvhba.h \ util/virxdrdefs.h \ util/virxml.c util/virxml.h \ + util/virmdev.c util/virmdev.h \ $(NULL) EXTRA_DIST += $(srcdir)/util/keymaps.csv $(srcdir)/util/virkeycode-mapgen.py diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index e6ccd69..8af65a1 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1959,6 +1959,24 @@ virMacMapNew; virMacMapRemove; virMacMapWriteFile; +# util/virmdev.h +virMediatedDeviceFree; +virMediatedDeviceGetDeviceAPI; +virMediatedDeviceGetIOMMUGroupDev; +virMediatedDeviceGetPath; +virMediatedDeviceGetUsedBy; +virMediatedDeviceListAdd; +virMediatedDeviceListCount; +virMediatedDeviceListDel; +virMediatedDeviceListFind; +virMediatedDeviceListGet; +virMediatedDeviceListNew; +virMediatedDeviceListSteal; +virMediatedDeviceListStealIndex; +virMediatedDeviceNew; +virMediatedDeviceSetUsedBy; + + # util/virnetdev.h virNetDevAddMulti; diff --git a/src/util/virmdev.c b/src/util/virmdev.c new file mode 100644 index 0000000..9c80647 --- /dev/null +++ b/src/util/virmdev.c @@ -0,0 +1,358 @@ +/* + * virmdev.c: helper APIs for managing host MDEV devices + * + * Copyright (C) 2017-2018 Red Hat, Inc. + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library. If not, see + * <http://www.gnu.org/licenses/>. + */ + +#include <config.h> + +#include <fcntl.h> +#include <inttypes.h> +#include <limits.h> +#include <stdio.h> +#include <string.h> +#include <sys/types.h> +#include <sys/stat.h> +#include <unistd.h> +#include <stdlib.h> + +#include "virmdev.h" +#include "dirname.h" +#include "virlog.h" +#include "viralloc.h" +#include "vircommand.h" +#include "virerror.h" +#include "virfile.h" +#include "virkmod.h" +#include "virstring.h" +#include "virutil.h" +#include "viruuid.h" +#include "virhostdev.h" + +VIR_LOG_INIT("util.mdev"); + +struct _virMediatedDevice { + char *path; /* sysfs path */ + + char *used_by_drvname; + char *used_by_domname; +}; + +struct _virMediatedDeviceList { + virObjectLockable parent; + + size_t count; + virMediatedDevicePtr *devs; +}; + + +/* For virReportOOMError() and virReportSystemError() */ +#define VIR_FROM_THIS VIR_FROM_NONE + +static virClassPtr virMediatedDeviceListClass; + +static void virMediatedDeviceListDispose(void *obj); + +static int virMediatedOnceInit(void) +{ + if (!(virMediatedDeviceListClass = virClassNew(virClassForObjectLockable(), + "virMediatedDeviceList", + sizeof(virMediatedDeviceList), + virMediatedDeviceListDispose))) + return -1; + + return 0; +} + +VIR_ONCE_GLOBAL_INIT(virMediated) + +#ifdef __linux__ +# define MDEV_SYSFS_DEVICES "/sys/bus/mdev/devices/" + +virMediatedDevicePtr +virMediatedDeviceNew(const char *uuidstr) +{ + virMediatedDevicePtr dev = NULL, ret = NULL; + + if (VIR_ALLOC(dev) < 0) + return NULL; + + if (virAsprintf(&dev->path, MDEV_SYSFS_DEVICES "%s", uuidstr) < 0) + goto cleanup; + + ret = dev; + dev = NULL; + + cleanup: + virMediatedDeviceFree(dev); + return ret; +} + +#else + +virMediatedDevicePtr +virMediatedDeviceNew(virPCIDeviceAddressPtr pciaddr ATTRIBUTE_UNUSED, + const char *uuidstr ATTRIBUTE_UNUSED) +{ + virRerportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("not supported on non-linux platforms")); + return NULL; +} + +#endif /* __linux__ */ + +void +virMediatedDeviceFree(virMediatedDevicePtr dev) +{ + if (!dev) + return; + VIR_FREE(dev->path); + VIR_FREE(dev->used_by_drvname); + VIR_FREE(dev->used_by_domname); + VIR_FREE(dev); +} + + +const char * +virMediatedDeviceGetPath(virMediatedDevicePtr dev) +{ + return dev->path; +} + + +/* Returns an absolute canonicalized path to the device used to control the + * mediated device's IOMMU group (e.g. "/dev/vfio/15") + */ +char * +virMediatedDeviceGetIOMMUGroupDev(virMediatedDevicePtr dev) +{ + char *resultpath = NULL; + char *iommu_linkpath = NULL; + char *vfio_path = NULL; + + if (virAsprintf(&iommu_linkpath, "%s/iommu_group", dev->path) < 0) + return NULL; + + if (virFileIsLink(iommu_linkpath) != 1) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("IOMMU group file %s is not a symlink"), + iommu_linkpath); + goto cleanup; + } + + if (virFileResolveLink(iommu_linkpath, &resultpath) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Unable to resolve IOMMU group symlink %s"), + iommu_linkpath); + goto cleanup; + } + + if (virAsprintf(&vfio_path, "/dev/vfio/%s", last_component(resultpath)) < 0) + goto cleanup; + + cleanup: + VIR_FREE(resultpath); + VIR_FREE(iommu_linkpath); + return vfio_path; +} + + +void +virMediatedDeviceGetUsedBy(virMediatedDevicePtr dev, + const char **drvname, const char **domname) +{ + *drvname = dev->used_by_drvname; + *domname = dev->used_by_domname; +} + + +int +virMediatedDeviceSetUsedBy(virMediatedDevicePtr dev, + const char *drvname, + const char *domname) +{ + VIR_FREE(dev->used_by_drvname); + VIR_FREE(dev->used_by_domname); + if (VIR_STRDUP(dev->used_by_drvname, drvname) < 0) + return -1; + if (VIR_STRDUP(dev->used_by_domname, domname) < 0) + return -1; + + return 0; +} + + +virMediatedDeviceListPtr +virMediatedDeviceListNew(void) +{ + virMediatedDeviceListPtr list; + + if (virMediatedInitialize() < 0) + return NULL; + + if (!(list = virObjectLockableNew(virMediatedDeviceListClass))) + return NULL; + + return list; +} + + +static void +virMediatedDeviceListDispose(void *obj) +{ + virMediatedDeviceListPtr list = obj; + size_t i; + + for (i = 0; i < list->count; i++) { + virMediatedDeviceFree(list->devs[i]); + list->devs[i] = NULL; + } + + list->count = 0; + VIR_FREE(list->devs); +} + + +int +virMediatedDeviceListAdd(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev) +{ + if (virMediatedDeviceListFind(list, dev)) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Device %s is already in use"), dev->path); + return -1; + } + return VIR_APPEND_ELEMENT(list->devs, list->count, dev); +} + + +virMediatedDevicePtr +virMediatedDeviceListGet(virMediatedDeviceListPtr list, + ssize_t idx) +{ + if (idx < 0 || idx >= list->count) + return NULL; + + return list->devs[idx]; +} + + +size_t +virMediatedDeviceListCount(virMediatedDeviceListPtr list) +{ + return list->count; +} + + +virMediatedDevicePtr +virMediatedDeviceListStealIndex(virMediatedDeviceListPtr list, + ssize_t idx) +{ + virMediatedDevicePtr ret; + + if (idx < 0 || idx >= list->count) + return NULL; + + ret = list->devs[idx]; + VIR_DELETE_ELEMENT(list->devs, idx, list->count); + return ret; +} + + +virMediatedDevicePtr +virMediatedDeviceListSteal(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev) +{ + int idx = virMediatedDeviceListFindIndex(list, dev); + + return virMediatedDeviceListStealIndex(list, idx); +} + + +void +virMediatedDeviceListDel(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev) +{ + virMediatedDevicePtr ret = virMediatedDeviceListSteal(list, dev); + virMediatedDeviceFree(ret); +} + + +int +virMediatedDeviceListFindIndex(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev) +{ + size_t i; + + for (i = 0; i < list->count; i++) { + virMediatedDevicePtr other = list->devs[i]; + if (STREQ(other->path, dev->path)) + return i; + } + return -1; +} + + +virMediatedDevicePtr +virMediatedDeviceListFind(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev) +{ + int idx; + + if ((idx = virMediatedDeviceListFindIndex(list, dev)) >= 0) + return list->devs[idx]; + else + return NULL; +} + + +int +virMediatedDeviceGetDeviceAPI(virMediatedDevicePtr dev, + char **device_api) +{ + int ret = -1; + char *buf = NULL; + char *tmp = NULL; + char *sysfs_path = NULL; + + if (virAsprintf(&sysfs_path, "%s/mdev_type/device_api", dev->path) < 0) + goto cleanup; + + /* TODO - make this a generic method to access sysfs files for various + * kinds of devices + */ + if (!virFileExists(sysfs_path)) { + virReportError(VIR_ERR_OPERATION_INVALID, "%s", + _("mediated devices are not supported by this kernel")); + goto cleanup; + } + + if (virFileReadAll(sysfs_path, 1024, &buf) < 0) + goto cleanup; + + if ((tmp = strchr(buf, '\n'))) + *tmp = '\0'; + + *device_api = buf; + buf = NULL; + + ret = 0; + cleanup: + VIR_FREE(sysfs_path); + VIR_FREE(buf); + return ret; +} diff --git a/src/util/virmdev.h b/src/util/virmdev.h new file mode 100644 index 0000000..78a7df5 --- /dev/null +++ b/src/util/virmdev.h @@ -0,0 +1,93 @@ +/* + * virmdev.h: helper APIs for managing host mediated devices + * + * Copyright (C) 2017-2018 Red Hat, Inc. + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library. If not, see + * <http://www.gnu.org/licenses/>. + */ + +#ifndef __VIR_MDEV_H__ +# define __VIR_MDEV_H__ + +# include "internal.h" +# include "virobject.h" +# include "virutil.h" +# include "virpci.h" + +typedef enum { + VIR_MDEV_MODEL_TYPE_DEFAULT = 0, + VIR_MDEV_MODEL_TYPE_VFIO_PCI, + + VIR_MDEV_MODEL_TYPE_LAST +} virMediatedDeviceModelType; + +VIR_ENUM_DECL(virMediatedDeviceModel) + + +typedef struct _virMediatedDevice virMediatedDevice; +typedef virMediatedDevice *virMediatedDevicePtr; +typedef struct _virMediatedDeviceAddress virMediatedDeviceAddress; +typedef virMediatedDeviceAddress *virMediatedDeviceAddressPtr; +typedef struct _virMediatedDeviceList virMediatedDeviceList; +typedef virMediatedDeviceList *virMediatedDeviceListPtr; + +typedef int (*virMediatedDeviceCallback)(virMediatedDevicePtr dev, + const char *path, void *opaque); + +virMediatedDevicePtr virMediatedDeviceNew(const char *uuidstr); + +virMediatedDevicePtr virMediatedDeviceCopy(virMediatedDevicePtr dev); + +void virMediatedDeviceFree(virMediatedDevicePtr dev); + +const char *virMediatedDeviceGetPath(virMediatedDevicePtr dev); + +void virMediatedDeviceGetUsedBy(virMediatedDevicePtr dev, + const char **drvname, const char **domname); + +int virMediatedDeviceSetUsedBy(virMediatedDevicePtr dev, + const char *drvname, + const char *domname); + +char *virMediatedDeviceGetIOMMUGroupDev(virMediatedDevicePtr dev); + +int virMediatedDeviceGetDeviceAPI(virMediatedDevicePtr dev, char **device_api); + +virMediatedDeviceListPtr virMediatedDeviceListNew(void); + +int virMediatedDeviceListAdd(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev); + +virMediatedDevicePtr virMediatedDeviceListGet(virMediatedDeviceListPtr list, + ssize_t idx); + +size_t virMediatedDeviceListCount(virMediatedDeviceListPtr list); + +virMediatedDevicePtr virMediatedDeviceListSteal(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev); + +virMediatedDevicePtr virMediatedDeviceListStealIndex(virMediatedDeviceListPtr list, + ssize_t idx); + +void virMediatedDeviceListDel(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev); + +virMediatedDevicePtr virMediatedDeviceListFind(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev); + +int virMediatedDeviceListFindIndex(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev); + +#endif /* __VIR_MDEV_H__ */ -- 2.10.2

On Mon, Feb 20, 2017 at 03:28:14PM +0100, Erik Skultety wrote:
Beside creation, disposal, getter, and setter methods the module exports methods to work with lists of mediated devices.
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- po/POTFILES.in | 1 + src/Makefile.am | 1 + src/libvirt_private.syms | 18 +++ src/util/virmdev.c | 358 +++++++++++++++++++++++++++++++++++++++++++++++ src/util/virmdev.h | 93 ++++++++++++ 5 files changed, 471 insertions(+) create mode 100644 src/util/virmdev.c create mode 100644 src/util/virmdev.h
diff --git a/po/POTFILES.in b/po/POTFILES.in index 9f66697..c857211 100644 --- a/po/POTFILES.in +++ b/po/POTFILES.in @@ -218,6 +218,7 @@ src/util/virlease.c src/util/virlockspace.c src/util/virlog.c src/util/virmacmap.c +src/util/virmdev.c src/util/virnetdev.c src/util/virnetdevbandwidth.c src/util/virnetdevbridge.c diff --git a/src/Makefile.am b/src/Makefile.am index 46ca272..7bc2d3b 100644 --- a/src/Makefile.am +++ b/src/Makefile.am @@ -188,6 +188,7 @@ UTIL_SOURCES = \ util/virvhba.c util/virvhba.h \ util/virxdrdefs.h \ util/virxml.c util/virxml.h \ + util/virmdev.c util/virmdev.h \ $(NULL)
EXTRA_DIST += $(srcdir)/util/keymaps.csv $(srcdir)/util/virkeycode-mapgen.py diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index e6ccd69..8af65a1 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1959,6 +1959,24 @@ virMacMapNew; virMacMapRemove; virMacMapWriteFile;
+# util/virmdev.h +virMediatedDeviceFree; +virMediatedDeviceGetDeviceAPI; +virMediatedDeviceGetIOMMUGroupDev; +virMediatedDeviceGetPath; +virMediatedDeviceGetUsedBy; +virMediatedDeviceListAdd; +virMediatedDeviceListCount; +virMediatedDeviceListDel; +virMediatedDeviceListFind; +virMediatedDeviceListGet; +virMediatedDeviceListNew; +virMediatedDeviceListSteal; +virMediatedDeviceListStealIndex; +virMediatedDeviceNew; +virMediatedDeviceSetUsedBy; + +
# util/virnetdev.h virNetDevAddMulti; diff --git a/src/util/virmdev.c b/src/util/virmdev.c new file mode 100644 index 0000000..9c80647 --- /dev/null +++ b/src/util/virmdev.c @@ -0,0 +1,358 @@ +/* + * virmdev.c: helper APIs for managing host MDEV devices + * + * Copyright (C) 2017-2018 Red Hat, Inc.
s/2017-2018/2017/
+ * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library. If not, see + * <http://www.gnu.org/licenses/>. + */ + +#include <config.h> + +#include <fcntl.h> +#include <inttypes.h> +#include <limits.h> +#include <stdio.h> +#include <string.h> +#include <sys/types.h> +#include <sys/stat.h> +#include <unistd.h> +#include <stdlib.h> + +#include "virmdev.h" +#include "dirname.h" +#include "virlog.h" +#include "viralloc.h" +#include "vircommand.h" +#include "virerror.h" +#include "virfile.h" +#include "virkmod.h" +#include "virstring.h" +#include "virutil.h" +#include "viruuid.h" +#include "virhostdev.h" + +VIR_LOG_INIT("util.mdev"); + +struct _virMediatedDevice { + char *path; /* sysfs path */ + + char *used_by_drvname; + char *used_by_domname; +}; + +struct _virMediatedDeviceList { + virObjectLockable parent; + + size_t count; + virMediatedDevicePtr *devs; +}; + + +/* For virReportOOMError() and virReportSystemError() */ +#define VIR_FROM_THIS VIR_FROM_NONE
We usually place this define right after all includes and I think that you can drop the comment.
+ +static virClassPtr virMediatedDeviceListClass; + +static void virMediatedDeviceListDispose(void *obj); + +static int virMediatedOnceInit(void) +{ + if (!(virMediatedDeviceListClass = virClassNew(virClassForObjectLockable(), + "virMediatedDeviceList", + sizeof(virMediatedDeviceList), + virMediatedDeviceListDispose))) + return -1; + + return 0; +} + +VIR_ONCE_GLOBAL_INIT(virMediated) + +#ifdef __linux__ +# define MDEV_SYSFS_DEVICES "/sys/bus/mdev/devices/" + +virMediatedDevicePtr +virMediatedDeviceNew(const char *uuidstr) +{ + virMediatedDevicePtr dev = NULL, ret = NULL; + + if (VIR_ALLOC(dev) < 0) + return NULL; + + if (virAsprintf(&dev->path, MDEV_SYSFS_DEVICES "%s", uuidstr) < 0) + goto cleanup; + + ret = dev; + dev = NULL; + + cleanup: + virMediatedDeviceFree(dev); + return ret; +} + +#else + +virMediatedDevicePtr +virMediatedDeviceNew(virPCIDeviceAddressPtr pciaddr ATTRIBUTE_UNUSED, + const char *uuidstr ATTRIBUTE_UNUSED) +{ + virRerportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("not supported on non-linux platforms")); + return NULL; +} + +#endif /* __linux__ */ + +void +virMediatedDeviceFree(virMediatedDevicePtr dev) +{ + if (!dev) + return; + VIR_FREE(dev->path); + VIR_FREE(dev->used_by_drvname); + VIR_FREE(dev->used_by_domname); + VIR_FREE(dev); +} + + +const char * +virMediatedDeviceGetPath(virMediatedDevicePtr dev) +{ + return dev->path; +} + + +/* Returns an absolute canonicalized path to the device used to control the + * mediated device's IOMMU group (e.g. "/dev/vfio/15") + */ +char * +virMediatedDeviceGetIOMMUGroupDev(virMediatedDevicePtr dev) +{ + char *resultpath = NULL; + char *iommu_linkpath = NULL; + char *vfio_path = NULL; + + if (virAsprintf(&iommu_linkpath, "%s/iommu_group", dev->path) < 0) + return NULL; + + if (virFileIsLink(iommu_linkpath) != 1) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("IOMMU group file %s is not a symlink"), + iommu_linkpath); + goto cleanup; + } + + if (virFileResolveLink(iommu_linkpath, &resultpath) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Unable to resolve IOMMU group symlink %s"), + iommu_linkpath); + goto cleanup; + } + + if (virAsprintf(&vfio_path, "/dev/vfio/%s", last_component(resultpath)) < 0) + goto cleanup; + + cleanup: + VIR_FREE(resultpath); + VIR_FREE(iommu_linkpath); + return vfio_path; +} + + +void +virMediatedDeviceGetUsedBy(virMediatedDevicePtr dev, + const char **drvname, const char **domname) +{ + *drvname = dev->used_by_drvname; + *domname = dev->used_by_domname; +} + + +int +virMediatedDeviceSetUsedBy(virMediatedDevicePtr dev, + const char *drvname, + const char *domname) +{ + VIR_FREE(dev->used_by_drvname); + VIR_FREE(dev->used_by_domname); + if (VIR_STRDUP(dev->used_by_drvname, drvname) < 0) + return -1; + if (VIR_STRDUP(dev->used_by_domname, domname) < 0) + return -1; + + return 0; +} + + +virMediatedDeviceListPtr +virMediatedDeviceListNew(void) +{ + virMediatedDeviceListPtr list; + + if (virMediatedInitialize() < 0) + return NULL; + + if (!(list = virObjectLockableNew(virMediatedDeviceListClass))) + return NULL; + + return list; +} + + +static void +virMediatedDeviceListDispose(void *obj) +{ + virMediatedDeviceListPtr list = obj; + size_t i; + + for (i = 0; i < list->count; i++) { + virMediatedDeviceFree(list->devs[i]); + list->devs[i] = NULL; + } + + list->count = 0; + VIR_FREE(list->devs); +} + + +int +virMediatedDeviceListAdd(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev) +{ + if (virMediatedDeviceListFind(list, dev)) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Device %s is already in use"), dev->path); + return -1; + } + return VIR_APPEND_ELEMENT(list->devs, list->count, dev); +} + + +virMediatedDevicePtr +virMediatedDeviceListGet(virMediatedDeviceListPtr list, + ssize_t idx) +{ + if (idx < 0 || idx >= list->count) + return NULL; + + return list->devs[idx]; +} + + +size_t +virMediatedDeviceListCount(virMediatedDeviceListPtr list) +{ + return list->count; +} + + +virMediatedDevicePtr +virMediatedDeviceListStealIndex(virMediatedDeviceListPtr list, + ssize_t idx) +{ + virMediatedDevicePtr ret; + + if (idx < 0 || idx >= list->count) + return NULL; + + ret = list->devs[idx]; + VIR_DELETE_ELEMENT(list->devs, idx, list->count); + return ret; +} + + +virMediatedDevicePtr +virMediatedDeviceListSteal(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev) +{ + int idx = virMediatedDeviceListFindIndex(list, dev); + + return virMediatedDeviceListStealIndex(list, idx); +} + + +void +virMediatedDeviceListDel(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev) +{ + virMediatedDevicePtr ret = virMediatedDeviceListSteal(list, dev); + virMediatedDeviceFree(ret); +} + + +int +virMediatedDeviceListFindIndex(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev) +{ + size_t i; + + for (i = 0; i < list->count; i++) { + virMediatedDevicePtr other = list->devs[i]; + if (STREQ(other->path, dev->path)) + return i; + } + return -1; +} + + +virMediatedDevicePtr +virMediatedDeviceListFind(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev) +{ + int idx; + + if ((idx = virMediatedDeviceListFindIndex(list, dev)) >= 0) + return list->devs[idx]; + else + return NULL; +} + + +int +virMediatedDeviceGetDeviceAPI(virMediatedDevicePtr dev, + char **device_api) +{ + int ret = -1; + char *buf = NULL; + char *tmp = NULL; + char *sysfs_path = NULL; + + if (virAsprintf(&sysfs_path, "%s/mdev_type/device_api", dev->path) < 0) + goto cleanup; + + /* TODO - make this a generic method to access sysfs files for various + * kinds of devices + */ + if (!virFileExists(sysfs_path)) { + virReportError(VIR_ERR_OPERATION_INVALID, "%s", + _("mediated devices are not supported by this kernel")); + goto cleanup; + } + + if (virFileReadAll(sysfs_path, 1024, &buf) < 0) + goto cleanup; + + if ((tmp = strchr(buf, '\n'))) + *tmp = '\0'; + + *device_api = buf; + buf = NULL; + + ret = 0; + cleanup: + VIR_FREE(sysfs_path); + VIR_FREE(buf); + return ret; +} diff --git a/src/util/virmdev.h b/src/util/virmdev.h new file mode 100644 index 0000000..78a7df5 --- /dev/null +++ b/src/util/virmdev.h @@ -0,0 +1,93 @@ +/* + * virmdev.h: helper APIs for managing host mediated devices + * + * Copyright (C) 2017-2018 Red Hat, Inc.
s/2017-2018/2017/
+ * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library. If not, see + * <http://www.gnu.org/licenses/>. + */ + +#ifndef __VIR_MDEV_H__ +# define __VIR_MDEV_H__ + +# include "internal.h" +# include "virobject.h" +# include "virutil.h" +# include "virpci.h" + +typedef enum { + VIR_MDEV_MODEL_TYPE_DEFAULT = 0, + VIR_MDEV_MODEL_TYPE_VFIO_PCI, + + VIR_MDEV_MODEL_TYPE_LAST +} virMediatedDeviceModelType; + +VIR_ENUM_DECL(virMediatedDeviceModel) + + +typedef struct _virMediatedDevice virMediatedDevice; +typedef virMediatedDevice *virMediatedDevicePtr; +typedef struct _virMediatedDeviceAddress virMediatedDeviceAddress; +typedef virMediatedDeviceAddress *virMediatedDeviceAddressPtr; +typedef struct _virMediatedDeviceList virMediatedDeviceList; +typedef virMediatedDeviceList *virMediatedDeviceListPtr; + +typedef int (*virMediatedDeviceCallback)(virMediatedDevicePtr dev, + const char *path, void *opaque); + +virMediatedDevicePtr virMediatedDeviceNew(const char *uuidstr);
I think it's time to start using the same format as for .c files especially when introducing new header files. Pavel
+ +virMediatedDevicePtr virMediatedDeviceCopy(virMediatedDevicePtr dev); + +void virMediatedDeviceFree(virMediatedDevicePtr dev); + +const char *virMediatedDeviceGetPath(virMediatedDevicePtr dev); + +void virMediatedDeviceGetUsedBy(virMediatedDevicePtr dev, + const char **drvname, const char **domname); + +int virMediatedDeviceSetUsedBy(virMediatedDevicePtr dev, + const char *drvname, + const char *domname); + +char *virMediatedDeviceGetIOMMUGroupDev(virMediatedDevicePtr dev); + +int virMediatedDeviceGetDeviceAPI(virMediatedDevicePtr dev, char **device_api); + +virMediatedDeviceListPtr virMediatedDeviceListNew(void); + +int virMediatedDeviceListAdd(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev); + +virMediatedDevicePtr virMediatedDeviceListGet(virMediatedDeviceListPtr list, + ssize_t idx); + +size_t virMediatedDeviceListCount(virMediatedDeviceListPtr list); + +virMediatedDevicePtr virMediatedDeviceListSteal(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev); + +virMediatedDevicePtr virMediatedDeviceListStealIndex(virMediatedDeviceListPtr list, + ssize_t idx); + +void virMediatedDeviceListDel(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev); + +virMediatedDevicePtr virMediatedDeviceListFind(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev); + +int virMediatedDeviceListFindIndex(virMediatedDeviceListPtr list, + virMediatedDevicePtr dev); + +#endif /* __VIR_MDEV_H__ */ -- 2.10.2
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

A mediated device will be identified by a UUID of the user pre-created mediated device. The data necessary to identify a mediated device can be easily extended in the future, e.g. when auto-creation of mediated devices should be enabled. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/conf/domain_conf.c | 36 +++++++++++++++++++++++++++++++++++- src/conf/domain_conf.h | 9 +++++++++ src/qemu/qemu_domain.c | 1 + src/qemu/qemu_hotplug.c | 2 ++ src/security/security_apparmor.c | 3 +++ src/security/security_dac.c | 2 ++ src/security/security_selinux.c | 2 ++ tests/domaincapsschemadata/full.xml | 1 + 8 files changed, 55 insertions(+), 1 deletion(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index a56ea82..83aa15f 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -56,6 +56,7 @@ #include "virstring.h" #include "virnetdev.h" #include "virhostdev.h" +#include "virmdev.h" #define VIR_FROM_THIS VIR_FROM_DOMAIN @@ -649,7 +650,8 @@ VIR_ENUM_IMPL(virDomainHostdevSubsys, VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST, "usb", "pci", "scsi", - "scsi_host") + "scsi_host", + "mdev") VIR_ENUM_IMPL(virDomainHostdevSubsysPCIBackend, VIR_DOMAIN_HOSTDEV_PCI_BACKEND_TYPE_LAST, @@ -668,6 +670,11 @@ VIR_ENUM_IMPL(virDomainHostdevSubsysSCSIHostProtocol, "none", "vhost") +VIR_ENUM_IMPL(virMediatedDeviceModel, + VIR_MDEV_MODEL_TYPE_LAST, + "default", + "vfio-pci") + VIR_ENUM_IMPL(virDomainHostdevCaps, VIR_DOMAIN_HOSTDEV_CAPS_TYPE_LAST, "storage", "misc", @@ -6349,10 +6356,12 @@ virDomainHostdevDefParseXMLSubsys(xmlNodePtr node, char *sgio = NULL; char *rawio = NULL; char *backendStr = NULL; + char *model = NULL; int backend; int ret = -1; virDomainHostdevSubsysPCIPtr pcisrc = &def->source.subsys.u.pci; virDomainHostdevSubsysSCSIPtr scsisrc = &def->source.subsys.u.scsi; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &def->source.subsys.u.mdev; /* @managed can be read from the xml document - it is always an * attribute of the toplevel element, no matter what type of @@ -6432,6 +6441,21 @@ virDomainHostdevDefParseXMLSubsys(xmlNodePtr node, } } + if (model) { + if (def->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("model is only supported with mediated devices")); + goto error; + } + + if ((mdevsrc->model = virMediatedDeviceModelTypeFromString(model)) <= 0) { + virReportError(VIR_ERR_XML_ERROR, + _("unknown hostdev model '%s'"), + model); + goto error; + } + } + switch (def->source.subsys.type) { case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: if (virDomainHostdevSubsysPCIDefParseXML(sourcenode, def, flags) < 0) @@ -6464,6 +6488,8 @@ virDomainHostdevDefParseXMLSubsys(xmlNodePtr node, if (virDomainHostdevSubsysSCSIVHostDefParseXML(sourcenode, def) < 0) goto error; break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + break; default: virReportError(VIR_ERR_CONFIG_UNSUPPORTED, @@ -13297,6 +13323,7 @@ virDomainHostdevDefParseXML(virDomainXMLOptionPtr xmlopt, } break; case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: break; } @@ -14188,6 +14215,7 @@ virDomainHostdevMatchSubsys(virDomainHostdevDefPtr a, return 1; else return 0; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: return 0; } @@ -23121,6 +23149,7 @@ virDomainHostdevDefFormat(virBufferPtr buf, { const char *mode = virDomainHostdevModeTypeToString(def->mode); virDomainHostdevSubsysSCSIPtr scsisrc = &def->source.subsys.u.scsi; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &def->source.subsys.u.mdev; const char *type; if (!mode) { @@ -23170,6 +23199,11 @@ virDomainHostdevDefFormat(virBufferPtr buf, virBufferAsprintf(buf, " rawio='%s'", virTristateBoolTypeToString(scsisrc->rawio)); } + + if (def->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV && + mdevsrc->model) + virBufferAsprintf(buf, " model='%s'", + virMediatedDeviceModelTypeToString(mdevsrc->model)); } virBufferAddLit(buf, ">\n"); virBufferAdjustIndent(buf, 2); diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 7e1afa4..e1a7a36 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -295,6 +295,7 @@ typedef enum { VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI, VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI, VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST, + VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV, VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST } virDomainHostdevSubsysType; @@ -369,6 +370,13 @@ struct _virDomainHostdevSubsysSCSI { } u; }; +typedef struct _virDomainHostdevSubsysMediatedDev virDomainHostdevSubsysMediatedDev; +typedef virDomainHostdevSubsysMediatedDev *virDomainHostdevSubsysMediatedDevPtr; +struct _virDomainHostdevSubsysMediatedDev { + int model; /* enum virMediatedDeviceModelType */ + char uuidstr[VIR_UUID_STRING_BUFLEN]; /* mediated device's uuid string */ +}; + typedef enum { VIR_DOMAIN_HOSTDEV_SUBSYS_SCSI_HOST_PROTOCOL_TYPE_NONE, VIR_DOMAIN_HOSTDEV_SUBSYS_SCSI_HOST_PROTOCOL_TYPE_VHOST, @@ -394,6 +402,7 @@ struct _virDomainHostdevSubsys { virDomainHostdevSubsysPCI pci; virDomainHostdevSubsysSCSI scsi; virDomainHostdevSubsysSCSIVHost scsi_host; + virDomainHostdevSubsysMediatedDev mdev; } u; }; diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index ea4b282..b905f9a 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -6978,6 +6978,7 @@ qemuDomainGetHostdevPath(virDomainDefPtr def, break; } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: break; } diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index 2f209f1..96c27ad 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -3854,6 +3854,8 @@ qemuDomainRemoveHostDevice(virQEMUDriverPtr driver, case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST: qemuDomainRemoveSCSIVHostDevice(driver, vm, hostdev); break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + break; case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: break; } diff --git a/src/security/security_apparmor.c b/src/security/security_apparmor.c index 0d3e891..f5b72e1 100644 --- a/src/security/security_apparmor.c +++ b/src/security/security_apparmor.c @@ -901,6 +901,9 @@ AppArmorSetSecurityHostdevLabel(virSecurityManagerPtr mgr, break; } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; diff --git a/src/security/security_dac.c b/src/security/security_dac.c index 6721917..ecce1d3 100644 --- a/src/security/security_dac.c +++ b/src/security/security_dac.c @@ -964,6 +964,7 @@ virSecurityDACSetHostdevLabel(virSecurityManagerPtr mgr, break; } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; @@ -1119,6 +1120,7 @@ virSecurityDACRestoreHostdevLabel(virSecurityManagerPtr mgr, break; } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; diff --git a/src/security/security_selinux.c b/src/security/security_selinux.c index e22de06..e152c72 100644 --- a/src/security/security_selinux.c +++ b/src/security/security_selinux.c @@ -1782,6 +1782,7 @@ virSecuritySELinuxSetHostdevSubsysLabel(virSecurityManagerPtr mgr, break; } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; @@ -2009,6 +2010,7 @@ virSecuritySELinuxRestoreHostdevSubsysLabel(virSecurityManagerPtr mgr, break; } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; diff --git a/tests/domaincapsschemadata/full.xml b/tests/domaincapsschemadata/full.xml index 6abd499..6b43069 100644 --- a/tests/domaincapsschemadata/full.xml +++ b/tests/domaincapsschemadata/full.xml @@ -88,6 +88,7 @@ <value>pci</value> <value>scsi</value> <value>scsi_host</value> + <value>mdev</value> </enum> <enum name='capsType'> <value>storage</value> -- 2.10.2

On Mon, Feb 20, 2017 at 03:28:15PM +0100, Erik Skultety wrote:
A mediated device will be identified by a UUID of the user pre-created mediated device. The data necessary to identify a mediated device can be easily extended in the future, e.g. when auto-creation of mediated devices should be enabled.
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/conf/domain_conf.c | 36 +++++++++++++++++++++++++++++++++++- src/conf/domain_conf.h | 9 +++++++++ src/qemu/qemu_domain.c | 1 + src/qemu/qemu_hotplug.c | 2 ++ src/security/security_apparmor.c | 3 +++ src/security/security_dac.c | 2 ++ src/security/security_selinux.c | 2 ++ tests/domaincapsschemadata/full.xml | 1 + 8 files changed, 55 insertions(+), 1 deletion(-)
Patches 04/18 and 06/18 should be merged into this one, it adds the same functionality.
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index a56ea82..83aa15f 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -56,6 +56,7 @@ #include "virstring.h" #include "virnetdev.h" #include "virhostdev.h" +#include "virmdev.h"
#define VIR_FROM_THIS VIR_FROM_DOMAIN
@@ -649,7 +650,8 @@ VIR_ENUM_IMPL(virDomainHostdevSubsys, VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST, "usb", "pci", "scsi", - "scsi_host") + "scsi_host", + "mdev")
VIR_ENUM_IMPL(virDomainHostdevSubsysPCIBackend, VIR_DOMAIN_HOSTDEV_PCI_BACKEND_TYPE_LAST, @@ -668,6 +670,11 @@ VIR_ENUM_IMPL(virDomainHostdevSubsysSCSIHostProtocol, "none", "vhost")
+VIR_ENUM_IMPL(virMediatedDeviceModel, + VIR_MDEV_MODEL_TYPE_LAST, + "default", + "vfio-pci") + VIR_ENUM_IMPL(virDomainHostdevCaps, VIR_DOMAIN_HOSTDEV_CAPS_TYPE_LAST, "storage", "misc", @@ -6349,10 +6356,12 @@ virDomainHostdevDefParseXMLSubsys(xmlNodePtr node, char *sgio = NULL; char *rawio = NULL; char *backendStr = NULL; + char *model = NULL; int backend; int ret = -1; virDomainHostdevSubsysPCIPtr pcisrc = &def->source.subsys.u.pci; virDomainHostdevSubsysSCSIPtr scsisrc = &def->source.subsys.u.scsi; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &def->source.subsys.u.mdev;
/* @managed can be read from the xml document - it is always an * attribute of the toplevel element, no matter what type of @@ -6432,6 +6441,21 @@ virDomainHostdevDefParseXMLSubsys(xmlNodePtr node, } }
+ if (model) { + if (def->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("model is only supported with mediated devices")); + goto error; + } + + if ((mdevsrc->model = virMediatedDeviceModelTypeFromString(model)) <= 0) { + virReportError(VIR_ERR_XML_ERROR, + _("unknown hostdev model '%s'"), + model); + goto error; + }
This condition means that "VIR_MDEV_MODEL_TYPE_DEFAULT" is also unknown which is not correct, however your code later on depends on this check because in function virDomainHostdevSubsysMediatedDevDefParseXML you require that the model must by > 0. This should be handled by postParse code to check whether the model has a correct value. If you want to check whether model attribute was provided you shouldn't do it by the parsed value but whether the attribute was present in the XML. The rest of the patch looks good. Pavel
+ } + switch (def->source.subsys.type) { case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: if (virDomainHostdevSubsysPCIDefParseXML(sourcenode, def, flags) < 0) @@ -6464,6 +6488,8 @@ virDomainHostdevDefParseXMLSubsys(xmlNodePtr node, if (virDomainHostdevSubsysSCSIVHostDefParseXML(sourcenode, def) < 0) goto error; break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + break;
default: virReportError(VIR_ERR_CONFIG_UNSUPPORTED, @@ -13297,6 +13323,7 @@ virDomainHostdevDefParseXML(virDomainXMLOptionPtr xmlopt, } break; case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: break; } @@ -14188,6 +14215,7 @@ virDomainHostdevMatchSubsys(virDomainHostdevDefPtr a, return 1; else return 0; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: return 0; } @@ -23121,6 +23149,7 @@ virDomainHostdevDefFormat(virBufferPtr buf, { const char *mode = virDomainHostdevModeTypeToString(def->mode); virDomainHostdevSubsysSCSIPtr scsisrc = &def->source.subsys.u.scsi; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &def->source.subsys.u.mdev; const char *type;
if (!mode) { @@ -23170,6 +23199,11 @@ virDomainHostdevDefFormat(virBufferPtr buf, virBufferAsprintf(buf, " rawio='%s'", virTristateBoolTypeToString(scsisrc->rawio)); } + + if (def->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV && + mdevsrc->model) + virBufferAsprintf(buf, " model='%s'", + virMediatedDeviceModelTypeToString(mdevsrc->model)); } virBufferAddLit(buf, ">\n"); virBufferAdjustIndent(buf, 2); diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 7e1afa4..e1a7a36 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -295,6 +295,7 @@ typedef enum { VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI, VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI, VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST, + VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV,
VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST } virDomainHostdevSubsysType; @@ -369,6 +370,13 @@ struct _virDomainHostdevSubsysSCSI { } u; };
+typedef struct _virDomainHostdevSubsysMediatedDev virDomainHostdevSubsysMediatedDev; +typedef virDomainHostdevSubsysMediatedDev *virDomainHostdevSubsysMediatedDevPtr; +struct _virDomainHostdevSubsysMediatedDev { + int model; /* enum virMediatedDeviceModelType */ + char uuidstr[VIR_UUID_STRING_BUFLEN]; /* mediated device's uuid string */ +}; + typedef enum { VIR_DOMAIN_HOSTDEV_SUBSYS_SCSI_HOST_PROTOCOL_TYPE_NONE, VIR_DOMAIN_HOSTDEV_SUBSYS_SCSI_HOST_PROTOCOL_TYPE_VHOST, @@ -394,6 +402,7 @@ struct _virDomainHostdevSubsys { virDomainHostdevSubsysPCI pci; virDomainHostdevSubsysSCSI scsi; virDomainHostdevSubsysSCSIVHost scsi_host; + virDomainHostdevSubsysMediatedDev mdev; } u; };
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index ea4b282..b905f9a 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -6978,6 +6978,7 @@ qemuDomainGetHostdevPath(virDomainDefPtr def, break; }
+ case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: break; } diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index 2f209f1..96c27ad 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -3854,6 +3854,8 @@ qemuDomainRemoveHostDevice(virQEMUDriverPtr driver, case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST: qemuDomainRemoveSCSIVHostDevice(driver, vm, hostdev); break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + break; case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: break; } diff --git a/src/security/security_apparmor.c b/src/security/security_apparmor.c index 0d3e891..f5b72e1 100644 --- a/src/security/security_apparmor.c +++ b/src/security/security_apparmor.c @@ -901,6 +901,9 @@ AppArmorSetSecurityHostdevLabel(virSecurityManagerPtr mgr, break; }
+ case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; diff --git a/src/security/security_dac.c b/src/security/security_dac.c index 6721917..ecce1d3 100644 --- a/src/security/security_dac.c +++ b/src/security/security_dac.c @@ -964,6 +964,7 @@ virSecurityDACSetHostdevLabel(virSecurityManagerPtr mgr, break; }
+ case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; @@ -1119,6 +1120,7 @@ virSecurityDACRestoreHostdevLabel(virSecurityManagerPtr mgr, break; }
+ case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; diff --git a/src/security/security_selinux.c b/src/security/security_selinux.c index e22de06..e152c72 100644 --- a/src/security/security_selinux.c +++ b/src/security/security_selinux.c @@ -1782,6 +1782,7 @@ virSecuritySELinuxSetHostdevSubsysLabel(virSecurityManagerPtr mgr, break; }
+ case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; @@ -2009,6 +2010,7 @@ virSecuritySELinuxRestoreHostdevSubsysLabel(virSecurityManagerPtr mgr, break; }
+ case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; diff --git a/tests/domaincapsschemadata/full.xml b/tests/domaincapsschemadata/full.xml index 6abd499..6b43069 100644 --- a/tests/domaincapsschemadata/full.xml +++ b/tests/domaincapsschemadata/full.xml @@ -88,6 +88,7 @@ <value>pci</value> <value>scsi</value> <value>scsi_host</value> + <value>mdev</value> </enum> <enum name='capsType'> <value>storage</value> -- 2.10.2
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

Since the address element is much easier extendible by attributes adding more and more elements, starting with <uuid>, to the <source> element once we find out we need more data to identify a mediated device. By introducing a new address type rather than using plain elements, we also remain consistent with all other devices that can make use of the address element. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/conf/device_conf.h | 1 + src/conf/domain_conf.c | 11 +++++++++-- 2 files changed, 10 insertions(+), 2 deletions(-) diff --git a/src/conf/device_conf.h b/src/conf/device_conf.h index f435fb5..8b67208 100644 --- a/src/conf/device_conf.h +++ b/src/conf/device_conf.h @@ -47,6 +47,7 @@ typedef enum { VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_MMIO, VIR_DOMAIN_DEVICE_ADDRESS_TYPE_ISA, VIR_DOMAIN_DEVICE_ADDRESS_TYPE_DIMM, + VIR_DOMAIN_DEVICE_ADDRESS_TYPE_MDEV, VIR_DOMAIN_DEVICE_ADDRESS_TYPE_LAST } virDomainDeviceAddressType; diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 83aa15f..947a902 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -259,7 +259,8 @@ VIR_ENUM_IMPL(virDomainDeviceAddress, VIR_DOMAIN_DEVICE_ADDRESS_TYPE_LAST, "ccw", "virtio-mmio", "isa", - "dimm") + "dimm", + "mdev") VIR_ENUM_IMPL(virDomainDiskDevice, VIR_DOMAIN_DISK_DEVICE_LAST, "disk", @@ -3290,6 +3291,7 @@ int virDomainDeviceAddressIsValid(virDomainDeviceInfoPtr info, return virDomainDeviceCCWAddressIsValid(&info->addr.ccw); case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_USB: + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_MDEV: return 1; } @@ -3382,6 +3384,7 @@ virDomainDeviceInfoAddressIsEqual(const virDomainDeviceInfo *a, /* address types below don't have any specific data */ case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_MMIO: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_S390: + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_MDEV: break; case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_PCI: @@ -5130,7 +5133,8 @@ virDomainDeviceInfoFormat(virBufferPtr buf, } if (info->type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE || - info->type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_S390) + info->type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_S390 || + info->type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_MDEV) return 0; /* We'll be in domain/devices/[device type]/ so 3 level indent */ @@ -5213,6 +5217,7 @@ virDomainDeviceInfoFormat(virBufferPtr buf, break; case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_S390: + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_MDEV: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_LAST: break; @@ -5774,6 +5779,7 @@ virDomainDeviceInfoParseXML(xmlNodePtr node, goto cleanup; break; + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_MDEV: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_LAST: break; @@ -18546,6 +18552,7 @@ virDomainDeviceInfoCheckABIStability(virDomainDeviceInfoPtr src, case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_S390: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_CCW: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_MMIO: + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_MDEV: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_LAST: break; -- 2.10.2

On Mon, Feb 20, 2017 at 03:28:16PM +0100, Erik Skultety wrote:
Since the address element is much easier extendible by attributes adding more and more elements, starting with <uuid>, to the <source> element once we find out we need more data to identify a mediated device. By introducing a new address type rather than using plain elements, we also remain consistent with all other devices that can make use of the address element.
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/conf/device_conf.h | 1 + src/conf/domain_conf.c | 11 +++++++++-- 2 files changed, 10 insertions(+), 2 deletions(-)
ACK to the code, however I think that this patch should be placed before "[PATCH 02/18] conf: Introduce new hostdev device type mdev". Pavel

To keep the domain XML as much platform agnostic as possible, do not expose an element/attribute which would contain path directly to the syfs filesystem which the mediated devices are build upon. Instead, identify each mediated device by a UUID attribute as well as specifying the device API (e.g. vfio-pci) through the 'model' attribute. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- docs/schemas/domaincommon.rng | 26 ++++++++++++++++++++++ src/conf/domain_conf.c | 50 +++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 76 insertions(+) diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index c5f1013..7b17a17 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -4019,6 +4019,7 @@ <ref name="hostdevsubsysusb"/> <ref name="hostdevsubsysscsi"/> <ref name="hostdevsubsyshost"/> + <ref name="hostdevsubsysmdev"/> </choice> </define> @@ -4169,6 +4170,20 @@ </element> </define> + <define name="hostdevsubsysmdev"> + <attribute name="type"> + <value>mdev</value> + </attribute> + <attribute name="model"> + <choice> + <value>vfio-pci</value> + </choice> + </attribute> + <element name="source"> + <ref name="address"/> + </element> + </define> + <define name="hostdevcapsstorage"> <attribute name="type"> <value>storage</value> @@ -4327,6 +4342,11 @@ </attribute> </optional> </define> + <define name="mdevaddress"> + <attribute name="uuid"> + <ref name="UUID"/> + </attribute> + </define> <define name="devices"> <element name="devices"> <interleave> @@ -4708,6 +4728,12 @@ </attribute> <ref name="dimmaddress"/> </group> + <group> + <attribute name="type"> + <value>mdev</value> + </attribute> + <ref name="mdevaddress"/> + </group> </choice> </element> </define> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 947a902..3aee8b6 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -6349,6 +6349,47 @@ virDomainHostdevSubsysSCSIVHostDefParseXML(xmlNodePtr sourcenode, return ret; } +static int +virDomainHostdevSubsysMediatedDevDefParseXML(virDomainHostdevDefPtr def, + xmlXPathContextPtr ctxt) +{ + int ret = -1; + unsigned char uuid[VIR_UUID_BUFLEN] = {0}; + char *uuidxml = NULL; + xmlNodePtr node = NULL; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &def->source.subsys.u.mdev; + + if (mdevsrc->model == 0) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Missing 'model' attribute for element <hostdev>")); + goto cleanup; + } + + if (!(node = virXPathNode("./source/address", ctxt))) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("Missing <address> element")); + goto cleanup; + } + + if (!(uuidxml = virXMLPropString(node, "uuid"))) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("Missing 'uuid' attribute for element <address>")); + goto cleanup; + } + + if (virUUIDParse(uuidxml, uuid) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + "%s", + _("Cannot parse uuid attribute of element <address>")); + goto cleanup; + } + + virUUIDFormat(uuid, mdevsrc->uuidstr); + ret = 0; + cleanup: + VIR_FREE(uuidxml); + return ret; +} static int virDomainHostdevDefParseXMLSubsys(xmlNodePtr node, @@ -6381,6 +6422,7 @@ virDomainHostdevDefParseXMLSubsys(xmlNodePtr node, sgio = virXMLPropString(node, "sgio"); rawio = virXMLPropString(node, "rawio"); + model = virXMLPropString(node, "model"); /* @type is passed in from the caller rather than read from the * xml document, because it is specified in different places for @@ -6495,6 +6537,8 @@ virDomainHostdevDefParseXMLSubsys(xmlNodePtr node, goto error; break; case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + if (virDomainHostdevSubsysMediatedDevDefParseXML(def, ctxt) < 0) + goto error; break; default: @@ -6510,6 +6554,7 @@ virDomainHostdevDefParseXMLSubsys(xmlNodePtr node, VIR_FREE(sgio); VIR_FREE(rawio); VIR_FREE(backendStr); + VIR_FREE(model); return ret; } @@ -21186,6 +21231,7 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf, virDomainHostdevSubsysPCIPtr pcisrc = &def->source.subsys.u.pci; virDomainHostdevSubsysSCSIPtr scsisrc = &def->source.subsys.u.scsi; virDomainHostdevSubsysSCSIVHostPtr hostsrc = &def->source.subsys.u.scsi_host; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &def->source.subsys.u.mdev; virDomainHostdevSubsysSCSIHostPtr scsihostsrc = &scsisrc->u.host; virDomainHostdevSubsysSCSIiSCSIPtr iscsisrc = &scsisrc->u.iscsi; @@ -21290,6 +21336,10 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf, break; case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST: break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + virBufferAsprintf(buf, "<address type='mdev' uuid='%s'/>\n", + mdevsrc->uuidstr); + break; default: virReportError(VIR_ERR_INTERNAL_ERROR, _("unexpected hostdev type %d"), -- 2.10.2

Just to make the code a bit cleaner, move hostdev specific post parse code to its own function just in case it grows in the future. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/conf/domain_conf.c | 76 +++++++++++++++++++++++++++++++------------------- 1 file changed, 48 insertions(+), 28 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 3aee8b6..522553c 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -4208,6 +4208,51 @@ virDomainHostdevAssignAddress(virDomainXMLOptionPtr xmlopt, static int +virDomainHostdevDefPostParse(virDomainHostdevDefPtr dev, + const virDomainDef *def, + virDomainXMLOptionPtr xmlopt) +{ + if (dev->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS) + return 0; + + switch (dev->source.subsys.type) { + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI: + if (dev->info->type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE && + virDomainHostdevAssignAddress(xmlopt, def, dev) < 0) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Cannot assign SCSI host device address")); + return -1; + } else { + /* Ensure provided address doesn't conflict with existing + * scsi disk drive address + */ + virDomainDeviceDriveAddressPtr addr = &dev->info->addr.drive; + if (virDomainDriveAddressIsUsedByDisk(def, + VIR_DOMAIN_DISK_BUS_SCSI, + addr)) { + virReportError(VIR_ERR_XML_ERROR, + _("SCSI host address controller='%u' " + "bus='%u' target='%u' unit='%u' in " + "use by a SCSI disk"), + addr->controller, addr->bus, + addr->target, addr->unit); + return -1; + } + } + break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: + break; + } + + return 0; +} + + +static int virDomainDeviceDefPostParseInternal(virDomainDeviceDefPtr dev, const virDomainDef *def, virCapsPtr caps ATTRIBUTE_UNUSED, @@ -4288,34 +4333,9 @@ virDomainDeviceDefPostParseInternal(virDomainDeviceDefPtr dev, video->vram = VIR_ROUND_UP_POWER_OF_TWO(video->vram); } - if (dev->type == VIR_DOMAIN_DEVICE_HOSTDEV) { - virDomainHostdevDefPtr hdev = dev->data.hostdev; - - if (virHostdevIsSCSIDevice(hdev)) { - if (hdev->info->type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE && - virDomainHostdevAssignAddress(xmlopt, def, hdev) < 0) { - virReportError(VIR_ERR_XML_ERROR, "%s", - _("Cannot assign SCSI host device address")); - return -1; - } else { - /* Ensure provided address doesn't conflict with existing - * scsi disk drive address - */ - virDomainDeviceDriveAddressPtr addr = &hdev->info->addr.drive; - if (virDomainDriveAddressIsUsedByDisk(def, - VIR_DOMAIN_DISK_BUS_SCSI, - addr)) { - virReportError(VIR_ERR_XML_ERROR, - _("SCSI host address controller='%u' " - "bus='%u' target='%u' unit='%u' in " - "use by a SCSI disk"), - addr->controller, addr->bus, - addr->target, addr->unit); - return -1; - } - } - } - } + if (dev->type == VIR_DOMAIN_DEVICE_HOSTDEV && + virDomainHostdevDefPostParse(dev->data.hostdev, def, xmlopt) < 0) + return -1; if (dev->type == VIR_DOMAIN_DEVICE_CONTROLLER) { virDomainControllerDefPtr cdev = dev->data.controller; -- 2.10.2

On Mon, Feb 20, 2017 at 03:28:18PM +0100, Erik Skultety wrote:
Just to make the code a bit cleaner, move hostdev specific post parse code to its own function just in case it grows in the future.
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/conf/domain_conf.c | 76 +++++++++++++++++++++++++++++++------------------- 1 file changed, 48 insertions(+), 28 deletions(-)
ACK, but I would move this patch at the first place in the series because it's a cleanup before introducing new feature and usually these cleanups can be pushed separately. Pavel

We need to make sure that if user explicitly provides a guest address for a mdev device, the address type will be matching the device API supported on that specific mediated device and error out with an incorrect XML message. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/conf/domain_conf.c | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 522553c..1fdab6a 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -4240,10 +4240,26 @@ virDomainHostdevDefPostParse(virDomainHostdevDefPtr dev, } } break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: { + int model = dev->source.subsys.u.mdev.model; + + if (dev->info->type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE) + return 0; + + if (model == VIR_MDEV_MODEL_TYPE_VFIO_PCI && + dev->info->type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_PCI) { + virReportError(VIR_ERR_XML_ERROR, + _("Unsupported address type '%s' with mediated " + "device model '%s'"), + virDomainDeviceAddressTypeToString(dev->info->type), + virMediatedDeviceModelTypeToString(model)); + return -1; + } + } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST: - case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: break; } -- 2.10.2

Label the VFIO IOMMU devices under /dev/vfio/ referenced by the symlinks in the sysfs (e.g. /sys/class/mdev_bus/<uuid>/iommu_group) which what qemu actually gets formatted on the command line. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/security/security_dac.c | 57 +++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 55 insertions(+), 2 deletions(-) diff --git a/src/security/security_dac.c b/src/security/security_dac.c index ecce1d3..45bd24e 100644 --- a/src/security/security_dac.c +++ b/src/security/security_dac.c @@ -33,6 +33,7 @@ #include "virfile.h" #include "viralloc.h" #include "virlog.h" +#include "virmdev.h" #include "virpci.h" #include "virusb.h" #include "virscsi.h" @@ -856,6 +857,15 @@ virSecurityDACSetHostLabel(virSCSIVHostDevicePtr dev ATTRIBUTE_UNUSED, static int +virSecurityDACSetMediatedDevLabel(virMediatedDevicePtr dev ATTRIBUTE_UNUSED, + const char *file, + void *opaque) +{ + return virSecurityDACSetHostdevLabelHelper(file, opaque); +} + + +static int virSecurityDACSetHostdevLabel(virSecurityManagerPtr mgr, virDomainDefPtr def, virDomainHostdevDefPtr dev, @@ -867,7 +877,9 @@ virSecurityDACSetHostdevLabel(virSecurityManagerPtr mgr, virDomainHostdevSubsysPCIPtr pcisrc = &dev->source.subsys.u.pci; virDomainHostdevSubsysSCSIPtr scsisrc = &dev->source.subsys.u.scsi; virDomainHostdevSubsysSCSIVHostPtr hostsrc = &dev->source.subsys.u.scsi_host; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &dev->source.subsys.u.mdev; int ret = -1; + virMediatedDevicePtr mdev = NULL; if (!priv->dynamicOwnership) return 0; @@ -964,13 +976,26 @@ virSecurityDACSetHostdevLabel(virSecurityManagerPtr mgr, break; } - case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: { + char *vfio_dev = NULL; + if (!(mdev = virMediatedDeviceNew(mdevsrc->uuidstr))) + goto done; + + if (!(vfio_dev = virMediatedDeviceGetIOMMUGroupDev(mdev))) + goto done; + + ret = virSecurityDACSetMediatedDevLabel(mdev, vfio_dev, &cbdata); + VIR_FREE(vfio_dev); + break; + } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; } done: + virMediatedDeviceFree(mdev); return ret; } @@ -1018,6 +1043,15 @@ virSecurityDACRestoreHostLabel(virSCSIVHostDevicePtr dev ATTRIBUTE_UNUSED, return virSecurityDACRestoreFileLabel(priv, file); } +static int +virSecurityDACRestoreMediatedDevLabel(virMediatedDevicePtr dev ATTRIBUTE_UNUSED, + const char *file, + void *opaque) +{ + virSecurityManagerPtr mgr = opaque; + virSecurityDACDataPtr priv = virSecurityManagerGetPrivateData(mgr); + return virSecurityDACRestoreFileLabel(priv, file); +} static int virSecurityDACRestoreHostdevLabel(virSecurityManagerPtr mgr, @@ -1032,6 +1066,7 @@ virSecurityDACRestoreHostdevLabel(virSecurityManagerPtr mgr, virDomainHostdevSubsysPCIPtr pcisrc = &dev->source.subsys.u.pci; virDomainHostdevSubsysSCSIPtr scsisrc = &dev->source.subsys.u.scsi; virDomainHostdevSubsysSCSIVHostPtr hostsrc = &dev->source.subsys.u.scsi_host; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &dev->source.subsys.u.mdev; int ret = -1; secdef = virDomainDefGetSecurityLabelDef(def, SECURITY_DAC_NAME); @@ -1120,7 +1155,25 @@ virSecurityDACRestoreHostdevLabel(virSecurityManagerPtr mgr, break; } - case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: { + char *vfiodev = NULL; + virMediatedDevicePtr mdev = virMediatedDeviceNew(mdevsrc->uuidstr); + + if (!mdev) + goto done; + + if (!(vfiodev = virMediatedDeviceGetIOMMUGroupDev(mdev))) { + virMediatedDeviceFree(mdev); + goto done; + } + + ret = virSecurityDACRestoreMediatedDevLabel(mdev, vfiodev, mgr); + + VIR_FREE(vfiodev); + virMediatedDeviceFree(mdev); + break; + } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; -- 2.10.2

On Mon, Feb 20, 2017 at 03:28:20PM +0100, Erik Skultety wrote:
Label the VFIO IOMMU devices under /dev/vfio/ referenced by the symlinks in the sysfs (e.g. /sys/class/mdev_bus/<uuid>/iommu_group) which what qemu actually gets formatted on the command line.
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/security/security_dac.c | 57 +++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 55 insertions(+), 2 deletions(-)
diff --git a/src/security/security_dac.c b/src/security/security_dac.c index ecce1d3..45bd24e 100644 --- a/src/security/security_dac.c +++ b/src/security/security_dac.c @@ -33,6 +33,7 @@ #include "virfile.h" #include "viralloc.h" #include "virlog.h" +#include "virmdev.h" #include "virpci.h" #include "virusb.h" #include "virscsi.h" @@ -856,6 +857,15 @@ virSecurityDACSetHostLabel(virSCSIVHostDevicePtr dev ATTRIBUTE_UNUSED,
static int +virSecurityDACSetMediatedDevLabel(virMediatedDevicePtr dev ATTRIBUTE_UNUSED, + const char *file, + void *opaque) +{ + return virSecurityDACSetHostdevLabelHelper(file, opaque); +}
This wrapper is not required, mediated devices don't have an *Iterate() function (which is in most cases only yet another wrapper for a simple function call).
+ + +static int virSecurityDACSetHostdevLabel(virSecurityManagerPtr mgr, virDomainDefPtr def, virDomainHostdevDefPtr dev, @@ -867,7 +877,9 @@ virSecurityDACSetHostdevLabel(virSecurityManagerPtr mgr, virDomainHostdevSubsysPCIPtr pcisrc = &dev->source.subsys.u.pci; virDomainHostdevSubsysSCSIPtr scsisrc = &dev->source.subsys.u.scsi; virDomainHostdevSubsysSCSIVHostPtr hostsrc = &dev->source.subsys.u.scsi_host; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &dev->source.subsys.u.mdev; int ret = -1; + virMediatedDevicePtr mdev = NULL;
if (!priv->dynamicOwnership) return 0; @@ -964,13 +976,26 @@ virSecurityDACSetHostdevLabel(virSecurityManagerPtr mgr, break; }
- case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: { + char *vfio_dev = NULL; + if (!(mdev = virMediatedDeviceNew(mdevsrc->uuidstr))) + goto done; + + if (!(vfio_dev = virMediatedDeviceGetIOMMUGroupDev(mdev))) + goto done; + + ret = virSecurityDACSetMediatedDevLabel(mdev, vfio_dev, &cbdata);
You can use virSecurityDACSetHostdevLabelHelper directly.
+ VIR_FREE(vfio_dev); + break; + } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; }
done: + virMediatedDeviceFree(mdev); return ret; }
@@ -1018,6 +1043,15 @@ virSecurityDACRestoreHostLabel(virSCSIVHostDevicePtr dev ATTRIBUTE_UNUSED, return virSecurityDACRestoreFileLabel(priv, file); }
+static int +virSecurityDACRestoreMediatedDevLabel(virMediatedDevicePtr dev ATTRIBUTE_UNUSED, + const char *file, + void *opaque) +{ + virSecurityManagerPtr mgr = opaque; + virSecurityDACDataPtr priv = virSecurityManagerGetPrivateData(mgr); + return virSecurityDACRestoreFileLabel(priv, file); +}
static int virSecurityDACRestoreHostdevLabel(virSecurityManagerPtr mgr, @@ -1032,6 +1066,7 @@ virSecurityDACRestoreHostdevLabel(virSecurityManagerPtr mgr, virDomainHostdevSubsysPCIPtr pcisrc = &dev->source.subsys.u.pci; virDomainHostdevSubsysSCSIPtr scsisrc = &dev->source.subsys.u.scsi; virDomainHostdevSubsysSCSIVHostPtr hostsrc = &dev->source.subsys.u.scsi_host; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &dev->source.subsys.u.mdev; int ret = -1;
secdef = virDomainDefGetSecurityLabelDef(def, SECURITY_DAC_NAME); @@ -1120,7 +1155,25 @@ virSecurityDACRestoreHostdevLabel(virSecurityManagerPtr mgr, break; }
- case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: { + char *vfiodev = NULL; + virMediatedDevicePtr mdev = virMediatedDeviceNew(mdevsrc->uuidstr); + + if (!mdev) + goto done; + + if (!(vfiodev = virMediatedDeviceGetIOMMUGroupDev(mdev))) { + virMediatedDeviceFree(mdev); + goto done; + } + + ret = virSecurityDACRestoreMediatedDevLabel(mdev, vfiodev, mgr);
Same here, you don't have to use this wrapper, use virSecurityDACRestoreFileLabel directly. This applies to security_selinux as well and I think that you should merge the security_dac and security_selinux patches together and you are missing security_apparmor patch. Pavel
+ + VIR_FREE(vfiodev); + virMediatedDeviceFree(mdev); + break; + } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; -- 2.10.2
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

Label the VFIO IOMMU devices under /dev/vfio/ referenced by the symlinks in the sysfs (e.g. /sys/class/mdev_bus/<uuid>/iommu_group) which what qemu actually gets formatted on the command line. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/security/security_selinux.c | 56 +++++++++++++++++++++++++++++++++++++++-- 1 file changed, 54 insertions(+), 2 deletions(-) diff --git a/src/security/security_selinux.c b/src/security/security_selinux.c index e152c72..60bdb1c 100644 --- a/src/security/security_selinux.c +++ b/src/security/security_selinux.c @@ -36,6 +36,7 @@ #include "virerror.h" #include "viralloc.h" #include "virlog.h" +#include "virmdev.h" #include "virpci.h" #include "virusb.h" #include "virscsi.h" @@ -1686,6 +1687,13 @@ virSecuritySELinuxSetHostLabel(virSCSIVHostDevicePtr dev ATTRIBUTE_UNUSED, } static int +virSecuritySELinuxSetMediatedDevLabel(virMediatedDevicePtr dev ATTRIBUTE_UNUSED, + const char *file, void *opaque) +{ + return virSecuritySELinuxSetHostdevLabelHelper(file, opaque); +} + +static int virSecuritySELinuxSetHostdevSubsysLabel(virSecurityManagerPtr mgr, virDomainDefPtr def, virDomainHostdevDefPtr dev, @@ -1696,7 +1704,9 @@ virSecuritySELinuxSetHostdevSubsysLabel(virSecurityManagerPtr mgr, virDomainHostdevSubsysPCIPtr pcisrc = &dev->source.subsys.u.pci; virDomainHostdevSubsysSCSIPtr scsisrc = &dev->source.subsys.u.scsi; virDomainHostdevSubsysSCSIVHostPtr hostsrc = &dev->source.subsys.u.scsi_host; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &dev->source.subsys.u.mdev; virSecuritySELinuxCallbackData data = {.mgr = mgr, .def = def}; + virMediatedDevicePtr mdev = NULL; int ret = -1; @@ -1782,13 +1792,26 @@ virSecuritySELinuxSetHostdevSubsysLabel(virSecurityManagerPtr mgr, break; } - case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: { + char *vfio_dev = NULL; + if (!(mdev = virMediatedDeviceNew(mdevsrc->uuidstr))) + goto done; + + if (!(vfio_dev = virMediatedDeviceGetIOMMUGroupDev(mdev))) + goto done; + + ret = virSecuritySELinuxSetMediatedDevLabel(mdev, vfio_dev, &data); + VIR_FREE(vfio_dev); + break; + } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; } done: + virMediatedDeviceFree(mdev); return ret; } @@ -1918,6 +1941,16 @@ virSecuritySELinuxRestoreHostLabel(virSCSIVHostDevicePtr dev ATTRIBUTE_UNUSED, } static int +virSecuritySELinuxRestoreMediatedDevLabel(virMediatedDevicePtr dev ATTRIBUTE_UNUSED, + const char *file, + void *opaque) +{ + virSecurityManagerPtr mgr = opaque; + + return virSecuritySELinuxRestoreFileLabel(mgr, file); +} + +static int virSecuritySELinuxRestoreHostdevSubsysLabel(virSecurityManagerPtr mgr, virDomainHostdevDefPtr dev, const char *vroot) @@ -1927,6 +1960,7 @@ virSecuritySELinuxRestoreHostdevSubsysLabel(virSecurityManagerPtr mgr, virDomainHostdevSubsysPCIPtr pcisrc = &dev->source.subsys.u.pci; virDomainHostdevSubsysSCSIPtr scsisrc = &dev->source.subsys.u.scsi; virDomainHostdevSubsysSCSIVHostPtr hostsrc = &dev->source.subsys.u.scsi_host; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &dev->source.subsys.u.mdev; int ret = -1; /* Like virSecuritySELinuxRestoreImageLabelInt() for a networked @@ -2010,7 +2044,25 @@ virSecuritySELinuxRestoreHostdevSubsysLabel(virSecurityManagerPtr mgr, break; } - case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: { + char *vfiodev = NULL; + virMediatedDevicePtr mdev = virMediatedDeviceNew(mdevsrc->uuidstr); + + if (!mdev) + goto done; + + if (!(vfiodev = virMediatedDeviceGetIOMMUGroupDev(mdev))) { + virMediatedDeviceFree(mdev); + goto done; + } + + ret = virSecuritySELinuxRestoreMediatedDevLabel(mdev, vfiodev, mgr); + + VIR_FREE(vfiodev); + virMediatedDeviceFree(mdev); + break; + } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; -- 2.10.2

This merely introduces virDomainHostdevMatchSubsysMediatedDev method that is supposed to check whether device being cold-plugged does not already exist in the domain configuration. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/conf/domain_conf.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 1fdab6a..3df7576 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -14273,6 +14273,19 @@ virDomainHostdevMatchSubsysSCSIiSCSI(virDomainHostdevDefPtr first, } static int +virDomainHostdevMatchSubsysMediatedDev(virDomainHostdevDefPtr a, + virDomainHostdevDefPtr b) +{ + virDomainHostdevSubsysMediatedDevPtr src_a = &a->source.subsys.u.mdev; + virDomainHostdevSubsysMediatedDevPtr src_b = &b->source.subsys.u.mdev; + + if (STREQ(src_a->uuidstr, src_b->uuidstr)) + return 1; + + return 0; +} + +static int virDomainHostdevMatchSubsys(virDomainHostdevDefPtr a, virDomainHostdevDefPtr b) { @@ -14303,6 +14316,7 @@ virDomainHostdevMatchSubsys(virDomainHostdevDefPtr a, else return 0; case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + return virDomainHostdevMatchSubsysMediatedDev(a, b); case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: return 0; } -- 2.10.2

So far, the official support is for x86_64 arch guests so unless a different device API than vfio-pci is available let's only turn on support for PCI address assignment. Once a different device API is introduced, we can enable another address type easily. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/qemu/qemu_domain.h | 1 + src/qemu/qemu_domain_address.c | 16 ++++++++++++---- 2 files changed, 13 insertions(+), 4 deletions(-) diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 8ba807c..8d51bb8 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -34,6 +34,7 @@ # include "qemu_agent.h" # include "qemu_conf.h" # include "qemu_capabilities.h" +# include "virmdev.h" # include "virchrdev.h" # include "virobject.h" # include "logging/log_manager.h" diff --git a/src/qemu/qemu_domain_address.c b/src/qemu/qemu_domain_address.c index 5b75044..cacd131 100644 --- a/src/qemu/qemu_domain_address.c +++ b/src/qemu/qemu_domain_address.c @@ -618,9 +618,11 @@ qemuDomainDeviceCalculatePCIConnectFlags(virDomainDeviceDefPtr dev, virPCIDeviceAddressPtr hostAddr = &hostdev->source.subsys.u.pci.addr; if (hostdev->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS || - hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI) { + (hostdev->source.subsys.type != + VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && + hostdev->source.subsys.type != + VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV)) return 0; - } if (pciFlags == pcieFlags) { /* This arch/qemu only supports legacy PCI, so there @@ -642,6 +644,9 @@ qemuDomainDeviceCalculatePCIConnectFlags(virDomainDeviceDefPtr dev, return pcieFlags; } + if (hostdev->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV) + return pcieFlags; + if (!(pciDev = virPCIDeviceNew(hostAddr->domain, hostAddr->bus, hostAddr->slot, @@ -1725,12 +1730,15 @@ qemuDomainAssignDevicePCISlots(virDomainDefPtr def, /* Host PCI devices */ for (i = 0; i < def->nhostdevs; i++) { + virDomainHostdevSubsysPtr subsys = &def->hostdevs[i]->source.subsys; if (!virDeviceInfoPCIAddressWanted(def->hostdevs[i]->info)) continue; if (def->hostdevs[i]->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS) continue; - if (def->hostdevs[i]->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && - def->hostdevs[i]->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST) + if (subsys->type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && + subsys->type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST && + !(subsys->type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV && + subsys->u.mdev.model == VIR_MDEV_MODEL_TYPE_VFIO_PCI)) continue; if (qemuDomainPCIAddressReserveNextAddr(addrs, -- 2.10.2

On Mon, Feb 20, 2017 at 03:28:23PM +0100, Erik Skultety wrote:
So far, the official support is for x86_64 arch guests so unless a different device API than vfio-pci is available let's only turn on support for PCI address assignment. Once a different device API is introduced, we can enable another address type easily.
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/qemu/qemu_domain.h | 1 + src/qemu/qemu_domain_address.c | 16 ++++++++++++---- 2 files changed, 13 insertions(+), 4 deletions(-)
diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 8ba807c..8d51bb8 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -34,6 +34,7 @@ # include "qemu_agent.h" # include "qemu_conf.h" # include "qemu_capabilities.h" +# include "virmdev.h" # include "virchrdev.h" # include "virobject.h" # include "logging/log_manager.h" diff --git a/src/qemu/qemu_domain_address.c b/src/qemu/qemu_domain_address.c index 5b75044..cacd131 100644 --- a/src/qemu/qemu_domain_address.c +++ b/src/qemu/qemu_domain_address.c @@ -618,9 +618,11 @@ qemuDomainDeviceCalculatePCIConnectFlags(virDomainDeviceDefPtr dev, virPCIDeviceAddressPtr hostAddr = &hostdev->source.subsys.u.pci.addr;
if (hostdev->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS || - hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI) { + (hostdev->source.subsys.type != + VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && + hostdev->source.subsys.type != + VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV))
We don't have to strictly follow the 80 columns rule, in this case it's better to not indent.
return 0; - }
I wouldn't remove the curly bracers, in case that the if condition is on more lines it is recommended to use the curly braces to make it clear. However we don't have a syntax check for this case and we don't follow this rule everywhere.
if (pciFlags == pcieFlags) { /* This arch/qemu only supports legacy PCI, so there @@ -642,6 +644,9 @@ qemuDomainDeviceCalculatePCIConnectFlags(virDomainDeviceDefPtr dev, return pcieFlags; }
+ if (hostdev->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV) + return pcieFlags; + if (!(pciDev = virPCIDeviceNew(hostAddr->domain, hostAddr->bus, hostAddr->slot, @@ -1725,12 +1730,15 @@ qemuDomainAssignDevicePCISlots(virDomainDefPtr def,
/* Host PCI devices */ for (i = 0; i < def->nhostdevs; i++) { + virDomainHostdevSubsysPtr subsys = &def->hostdevs[i]->source.subsys; if (!virDeviceInfoPCIAddressWanted(def->hostdevs[i]->info)) continue; if (def->hostdevs[i]->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS) continue; - if (def->hostdevs[i]->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && - def->hostdevs[i]->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST) + if (subsys->type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && + subsys->type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST && + !(subsys->type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV && + subsys->u.mdev.model == VIR_MDEV_MODEL_TYPE_VFIO_PCI)) continue;
It would be nice to have curly braces here. Pavel

Keep track of the assigned mediated devices the same way we do it for the rest of hostdevs. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/libvirt_private.syms | 1 + src/qemu/qemu_hostdev.c | 22 ++++++ src/qemu/qemu_hostdev.h | 4 ++ src/util/virhostdev.c | 176 ++++++++++++++++++++++++++++++++++++++++++++++- src/util/virhostdev.h | 9 +++ 5 files changed, 211 insertions(+), 1 deletion(-) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 8af65a1..e6d1282 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1706,6 +1706,7 @@ virHostdevPCINodeDeviceDetach; virHostdevPCINodeDeviceReAttach; virHostdevPCINodeDeviceReset; virHostdevPrepareDomainDevices; +virHostdevPrepareMediatedDevices; virHostdevPreparePCIDevices; virHostdevPrepareSCSIDevices; virHostdevPrepareSCSIVHostDevices; diff --git a/src/qemu/qemu_hostdev.c b/src/qemu/qemu_hostdev.c index 7cd49e4..45b731c 100644 --- a/src/qemu/qemu_hostdev.c +++ b/src/qemu/qemu_hostdev.c @@ -305,6 +305,24 @@ qemuHostdevPrepareSCSIVHostDevices(virQEMUDriverPtr driver, } int +qemuHostdevPrepareMediatedDevices(virQEMUDriverPtr driver, + const char *name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs) +{ + virHostdevManagerPtr hostdev_mgr = driver->hostdevMgr; + + if (!qemuHostdevHostSupportsPassthroughVFIO()) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("host doesn't support VFIO PCI interface")); + return -1; + } + + return virHostdevPrepareMediatedDevices(hostdev_mgr, QEMU_DRIVER_NAME, + name, hostdevs, nhostdevs); +} + +int qemuHostdevPrepareDomainDevices(virQEMUDriverPtr driver, virDomainDefPtr def, virQEMUCapsPtr qemuCaps, @@ -330,6 +348,10 @@ qemuHostdevPrepareDomainDevices(virQEMUDriverPtr driver, def->hostdevs, def->nhostdevs) < 0) return -1; + if (qemuHostdevPrepareMediatedDevices(driver, def->name, + def->hostdevs, def->nhostdevs) < 0) + return -1; + return 0; } diff --git a/src/qemu/qemu_hostdev.h b/src/qemu/qemu_hostdev.h index 74a7d4f..9399241 100644 --- a/src/qemu/qemu_hostdev.h +++ b/src/qemu/qemu_hostdev.h @@ -59,6 +59,10 @@ int qemuHostdevPrepareSCSIVHostDevices(virQEMUDriverPtr driver, const char *name, virDomainHostdevDefPtr *hostdevs, int nhostdevs); +int qemuHostdevPrepareMediatedDevices(virQEMUDriverPtr driver, + const char *name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs); int qemuHostdevPrepareDomainDevices(virQEMUDriverPtr driver, virDomainDefPtr def, virQEMUCapsPtr qemuCaps, diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index 86ca8e0..681f720 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -1,6 +1,6 @@ /* virhostdev.c: hostdev management * - * Copyright (C) 2006-2007, 2009-2016 Red Hat, Inc. + * Copyright (C) 2006-2007, 2009-2017 Red Hat, Inc. * Copyright (C) 2006 Daniel P. Berrange * Copyright (C) 2014 SUSE LINUX Products GmbH, Nuernberg, Germany. * @@ -147,6 +147,7 @@ virHostdevManagerDispose(void *obj) virObjectUnref(hostdevMgr->activeUSBHostdevs); virObjectUnref(hostdevMgr->activeSCSIHostdevs); virObjectUnref(hostdevMgr->activeSCSIVHostHostdevs); + virObjectUnref(hostdevMgr->activeMediatedHostdevs); VIR_FREE(hostdevMgr->stateDir); } @@ -174,6 +175,9 @@ virHostdevManagerNew(void) if (!(hostdevMgr->activeSCSIVHostHostdevs = virSCSIVHostDeviceListNew())) goto error; + if (!(hostdevMgr->activeMediatedHostdevs = virMediatedDeviceListNew())) + goto error; + if (privileged) { if (VIR_STRDUP(hostdevMgr->stateDir, HOSTDEV_STATE_DIR) < 0) goto error; @@ -1595,6 +1599,176 @@ virHostdevPrepareSCSIVHostDevices(virHostdevManagerPtr mgr, return -1; } + +static bool +virHostdevMediatedDeviceIsUsed(virMediatedDevicePtr dev, + virMediatedDeviceListPtr list) +{ + const char *drvname, *domname; + virMediatedDevicePtr tmp = NULL; + + virObjectLock(list); + + if ((tmp = virMediatedDeviceListFind(list, dev))) { + virMediatedDeviceGetUsedBy(tmp, &drvname, &domname); + virReportError(VIR_ERR_OPERATION_INVALID, + _("Mediated device %s is in use by " + "driver %s, domain %s"), + virMediatedDeviceGetPath(tmp), + drvname, domname); + } + + virObjectUnlock(list); + return !!tmp; +} + + +static int +virHostdevMediatedDeviceCheckModel(virMediatedDevicePtr dev, + virMediatedDeviceModelType model) +{ + int ret = -1; + char *dev_api = NULL; + int actual_model; + + if (virMediatedDeviceGetDeviceAPI(dev, &dev_api) < 0) + return -1; + + /* safeguard in case we've got an older libvirt which doesn't know newer + * device_api models yet + */ + if ((actual_model = virMediatedDeviceModelTypeFromString(dev_api)) <= 0) + goto cleanup; + + if (actual_model != model) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Invalid device API '%s' for device %s: " + "device only supports '%s'"), + virMediatedDeviceModelTypeToString(model), + virMediatedDeviceGetPath(dev), dev_api); + goto cleanup; + } + + ret = 0; + cleanup: + VIR_FREE(dev_api); + return ret; +} + + +static int +virHostdevMarkMediatedDevices(virHostdevManagerPtr mgr, + const char *drvname, + const char *domname, + virMediatedDeviceListPtr list) +{ + int ret = -1; + size_t i, j, count; + + virObjectLock(mgr->activeMediatedHostdevs); + + count = virMediatedDeviceListCount(list); + for (i = 0; i < count; i++) { + virMediatedDevicePtr mdev = virMediatedDeviceListGet(list, i); + VIR_DEBUG("Adding %s to activeMediatedHostdevs", + virMediatedDeviceGetPath(mdev)); + + virMediatedDeviceSetUsedBy(mdev, drvname, domname); + + /* Copy mdev references to the driver list: + * - caller is responsible for NOT freeing devices in @list on success + * - we're responsible for performing a rollback on failure + */ + if (virMediatedDeviceListAdd(mgr->activeMediatedHostdevs, mdev) < 0) + goto rollback; + } + + ret = 0; + cleanup: + virObjectUnlock(mgr->activeMediatedHostdevs); + return ret; + + rollback: + for (j = 0; j < i; j++) { + virMediatedDevicePtr tmp = virMediatedDeviceListGet(list, j); + virMediatedDeviceListSteal(mgr->activeMediatedHostdevs, tmp); + } + goto cleanup; +} + + +int +virHostdevPrepareMediatedDevices(virHostdevManagerPtr mgr, + const char *drv_name, + const char *dom_name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs) +{ + size_t i; + int ret = -1; + virMediatedDeviceListPtr list; + + if (!nhostdevs) + return 0; + + /* To prevent situation where mediated device is assigned to multiple + * domains we maintain a driver list of currently assigned mediated devices. + * A device is appended to the driver list after a series of preparations. + */ + if (!(list = virMediatedDeviceListNew())) + goto cleanup; + + /* Loop 1: Build a temporary list of unused mediated devices. */ + for (i = 0; i < nhostdevs; i++) { + virDomainHostdevDefPtr hostdev = hostdevs[i]; + virDomainHostdevSubsysMediatedDevPtr src = &hostdev->source.subsys.u.mdev; + virMediatedDevicePtr mdev; + + if (hostdev->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS) + continue; + if (hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV) + continue; + + if (!(mdev = virMediatedDeviceNew(src->uuidstr))) + goto cleanup; + + if (virHostdevMediatedDeviceIsUsed(mdev, mgr->activeMediatedHostdevs)) + goto cleanup; + + /* Check whether the user-provided model corresponds with the actually + * supported mediated device's API. + */ + if (virHostdevMediatedDeviceCheckModel(mdev, src->model) < 0) + goto cleanup; + + if (virMediatedDeviceListAdd(list, mdev) < 0) { + virMediatedDeviceFree(mdev); + goto cleanup; + } + } + + + /* Mark the devices in the list as used by @drv_name-@dom_name and copy the + * references to the driver list + */ + if (virHostdevMarkMediatedDevices(mgr, drv_name, dom_name, list) < 0) + goto cleanup; + + /* Loop 2: Temporary list was successfully merged with + * driver list, so steal all items to avoid freeing them + * in cleanup label. + */ + while (virMediatedDeviceListCount(list) > 0) { + virMediatedDevicePtr tmp = virMediatedDeviceListGet(list, 0); + virMediatedDeviceListSteal(list, tmp); + } + + ret = 0; + cleanup: + virObjectUnref(list); + return ret; +} + void virHostdevReAttachUSBDevices(virHostdevManagerPtr mgr, const char *drv_name, diff --git a/src/util/virhostdev.h b/src/util/virhostdev.h index 4c1fea3..b077089 100644 --- a/src/util/virhostdev.h +++ b/src/util/virhostdev.h @@ -31,6 +31,7 @@ # include "virusb.h" # include "virscsi.h" # include "virscsivhost.h" +# include "virmdev.h" # include "domain_conf.h" typedef enum { @@ -55,6 +56,7 @@ struct _virHostdevManager { virUSBDeviceListPtr activeUSBHostdevs; virSCSIDeviceListPtr activeSCSIHostdevs; virSCSIVHostDeviceListPtr activeSCSIVHostHostdevs; + virMediatedDeviceListPtr activeMediatedHostdevs; }; virHostdevManagerPtr virHostdevManagerGetDefault(void); @@ -96,6 +98,13 @@ virHostdevPrepareSCSIVHostDevices(virHostdevManagerPtr hostdev_mgr, virDomainHostdevDefPtr *hostdevs, int nhostdevs) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3); +int +virHostdevPrepareMediatedDevices(virHostdevManagerPtr hostdev_mgr, + const char *drv_name, + const char *dom_name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3); void virHostdevReAttachPCIDevices(virHostdevManagerPtr hostdev_mgr, const char *drv_name, -- 2.10.2

On Mon, Feb 20, 2017 at 03:28:24PM +0100, Erik Skultety wrote:
Keep track of the assigned mediated devices the same way we do it for the rest of hostdevs.
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/libvirt_private.syms | 1 + src/qemu/qemu_hostdev.c | 22 ++++++ src/qemu/qemu_hostdev.h | 4 ++ src/util/virhostdev.c | 176 ++++++++++++++++++++++++++++++++++++++++++++++- src/util/virhostdev.h | 9 +++ 5 files changed, 211 insertions(+), 1 deletion(-)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 8af65a1..e6d1282 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1706,6 +1706,7 @@ virHostdevPCINodeDeviceDetach; virHostdevPCINodeDeviceReAttach; virHostdevPCINodeDeviceReset; virHostdevPrepareDomainDevices; +virHostdevPrepareMediatedDevices; virHostdevPreparePCIDevices; virHostdevPrepareSCSIDevices; virHostdevPrepareSCSIVHostDevices; diff --git a/src/qemu/qemu_hostdev.c b/src/qemu/qemu_hostdev.c index 7cd49e4..45b731c 100644 --- a/src/qemu/qemu_hostdev.c +++ b/src/qemu/qemu_hostdev.c @@ -305,6 +305,24 @@ qemuHostdevPrepareSCSIVHostDevices(virQEMUDriverPtr driver, }
int +qemuHostdevPrepareMediatedDevices(virQEMUDriverPtr driver, + const char *name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs) +{ + virHostdevManagerPtr hostdev_mgr = driver->hostdevMgr; + + if (!qemuHostdevHostSupportsPassthroughVFIO()) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("host doesn't support VFIO PCI interface")); + return -1; + } + + return virHostdevPrepareMediatedDevices(hostdev_mgr, QEMU_DRIVER_NAME, + name, hostdevs, nhostdevs); +} + +int qemuHostdevPrepareDomainDevices(virQEMUDriverPtr driver, virDomainDefPtr def, virQEMUCapsPtr qemuCaps, @@ -330,6 +348,10 @@ qemuHostdevPrepareDomainDevices(virQEMUDriverPtr driver, def->hostdevs, def->nhostdevs) < 0) return -1;
+ if (qemuHostdevPrepareMediatedDevices(driver, def->name, + def->hostdevs, def->nhostdevs) < 0) + return -1; + return 0; }
diff --git a/src/qemu/qemu_hostdev.h b/src/qemu/qemu_hostdev.h index 74a7d4f..9399241 100644 --- a/src/qemu/qemu_hostdev.h +++ b/src/qemu/qemu_hostdev.h @@ -59,6 +59,10 @@ int qemuHostdevPrepareSCSIVHostDevices(virQEMUDriverPtr driver, const char *name, virDomainHostdevDefPtr *hostdevs, int nhostdevs); +int qemuHostdevPrepareMediatedDevices(virQEMUDriverPtr driver, + const char *name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs); int qemuHostdevPrepareDomainDevices(virQEMUDriverPtr driver, virDomainDefPtr def, virQEMUCapsPtr qemuCaps, diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index 86ca8e0..681f720 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -1,6 +1,6 @@ /* virhostdev.c: hostdev management * - * Copyright (C) 2006-2007, 2009-2016 Red Hat, Inc. + * Copyright (C) 2006-2007, 2009-2017 Red Hat, Inc. * Copyright (C) 2006 Daniel P. Berrange * Copyright (C) 2014 SUSE LINUX Products GmbH, Nuernberg, Germany. * @@ -147,6 +147,7 @@ virHostdevManagerDispose(void *obj) virObjectUnref(hostdevMgr->activeUSBHostdevs); virObjectUnref(hostdevMgr->activeSCSIHostdevs); virObjectUnref(hostdevMgr->activeSCSIVHostHostdevs); + virObjectUnref(hostdevMgr->activeMediatedHostdevs); VIR_FREE(hostdevMgr->stateDir); }
@@ -174,6 +175,9 @@ virHostdevManagerNew(void) if (!(hostdevMgr->activeSCSIVHostHostdevs = virSCSIVHostDeviceListNew())) goto error;
+ if (!(hostdevMgr->activeMediatedHostdevs = virMediatedDeviceListNew())) + goto error; + if (privileged) { if (VIR_STRDUP(hostdevMgr->stateDir, HOSTDEV_STATE_DIR) < 0) goto error; @@ -1595,6 +1599,176 @@ virHostdevPrepareSCSIVHostDevices(virHostdevManagerPtr mgr, return -1; }
+ +static bool +virHostdevMediatedDeviceIsUsed(virMediatedDevicePtr dev, + virMediatedDeviceListPtr list) +{
This function can be moved into util/virmdev.c.
+ const char *drvname, *domname; + virMediatedDevicePtr tmp = NULL; + + virObjectLock(list); + + if ((tmp = virMediatedDeviceListFind(list, dev))) { + virMediatedDeviceGetUsedBy(tmp, &drvname, &domname); + virReportError(VIR_ERR_OPERATION_INVALID, + _("Mediated device %s is in use by " + "driver %s, domain %s"), + virMediatedDeviceGetPath(tmp), + drvname, domname); + } + + virObjectUnlock(list); + return !!tmp; +} + + +static int +virHostdevMediatedDeviceCheckModel(virMediatedDevicePtr dev, + virMediatedDeviceModelType model) +{
This function can be moved into util/virmedev.c.
+ int ret = -1; + char *dev_api = NULL; + int actual_model; + + if (virMediatedDeviceGetDeviceAPI(dev, &dev_api) < 0) + return -1; + + /* safeguard in case we've got an older libvirt which doesn't know newer + * device_api models yet + */ + if ((actual_model = virMediatedDeviceModelTypeFromString(dev_api)) <= 0) + goto cleanup; + + if (actual_model != model) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Invalid device API '%s' for device %s: " + "device only supports '%s'"), + virMediatedDeviceModelTypeToString(model), + virMediatedDeviceGetPath(dev), dev_api); + goto cleanup; + } + + ret = 0; + cleanup: + VIR_FREE(dev_api); + return ret; +} + + +static int +virHostdevMarkMediatedDevices(virHostdevManagerPtr mgr, + const char *drvname, + const char *domname, + virMediatedDeviceListPtr list) +{
You can pass only mgr->activeMediatedHostdevs and this function can be moved to util/virmdev.c.
+ int ret = -1; + size_t i, j, count; + + virObjectLock(mgr->activeMediatedHostdevs); + + count = virMediatedDeviceListCount(list); + for (i = 0; i < count; i++) { + virMediatedDevicePtr mdev = virMediatedDeviceListGet(list, i); + VIR_DEBUG("Adding %s to activeMediatedHostdevs", + virMediatedDeviceGetPath(mdev)); + + virMediatedDeviceSetUsedBy(mdev, drvname, domname); + + /* Copy mdev references to the driver list: + * - caller is responsible for NOT freeing devices in @list on success + * - we're responsible for performing a rollback on failure + */ + if (virMediatedDeviceListAdd(mgr->activeMediatedHostdevs, mdev) < 0) + goto rollback; + } + + ret = 0; + cleanup: + virObjectUnlock(mgr->activeMediatedHostdevs); + return ret; + + rollback: + for (j = 0; j < i; j++) { + virMediatedDevicePtr tmp = virMediatedDeviceListGet(list, j); + virMediatedDeviceListSteal(mgr->activeMediatedHostdevs, tmp); + } + goto cleanup; +} + + +int +virHostdevPrepareMediatedDevices(virHostdevManagerPtr mgr, + const char *drv_name, + const char *dom_name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs) +{ + size_t i; + int ret = -1; + virMediatedDeviceListPtr list; + + if (!nhostdevs) + return 0; + + /* To prevent situation where mediated device is assigned to multiple + * domains we maintain a driver list of currently assigned mediated devices. + * A device is appended to the driver list after a series of preparations. + */ + if (!(list = virMediatedDeviceListNew())) + goto cleanup; + + /* Loop 1: Build a temporary list of unused mediated devices. */ + for (i = 0; i < nhostdevs; i++) { + virDomainHostdevDefPtr hostdev = hostdevs[i]; + virDomainHostdevSubsysMediatedDevPtr src = &hostdev->source.subsys.u.mdev; + virMediatedDevicePtr mdev; + + if (hostdev->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS) + continue; + if (hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV) + continue; + + if (!(mdev = virMediatedDeviceNew(src->uuidstr))) + goto cleanup; + + if (virHostdevMediatedDeviceIsUsed(mdev, mgr->activeMediatedHostdevs)) + goto cleanup;
This checks whether the mdev is used by another domain but the *activeMediatedHostdevs* list is unlocked in this whole function so if two domains starts at the same time they both can get the same device as unused ... [1]
+ + /* Check whether the user-provided model corresponds with the actually + * supported mediated device's API. + */ + if (virHostdevMediatedDeviceCheckModel(mdev, src->model) < 0) + goto cleanup; + + if (virMediatedDeviceListAdd(list, mdev) < 0) { + virMediatedDeviceFree(mdev); + goto cleanup; + } + } + + + /* Mark the devices in the list as used by @drv_name-@dom_name and copy the + * references to the driver list + */ + if (virHostdevMarkMediatedDevices(mgr, drv_name, dom_name, list) < 0) + goto cleanup;
[1] and there both domain can mark the mdev as used without triggering any error and with the side effect that the last domain to call this function would be stored as an owner of that mdev. To fix this issue you can move the virHostdevMediatedDeviceIsUsed check into virHostdevMarkMediatedDevices where you can hold the lock of *activeMediatedHostdevs* list for the whole time. Pavel
+ + /* Loop 2: Temporary list was successfully merged with + * driver list, so steal all items to avoid freeing them + * in cleanup label. + */ + while (virMediatedDeviceListCount(list) > 0) { + virMediatedDevicePtr tmp = virMediatedDeviceListGet(list, 0); + virMediatedDeviceListSteal(list, tmp); + } + + ret = 0; + cleanup: + virObjectUnref(list); + return ret; +} + void virHostdevReAttachUSBDevices(virHostdevManagerPtr mgr, const char *drv_name, diff --git a/src/util/virhostdev.h b/src/util/virhostdev.h index 4c1fea3..b077089 100644 --- a/src/util/virhostdev.h +++ b/src/util/virhostdev.h @@ -31,6 +31,7 @@ # include "virusb.h" # include "virscsi.h" # include "virscsivhost.h" +# include "virmdev.h" # include "domain_conf.h"
typedef enum { @@ -55,6 +56,7 @@ struct _virHostdevManager { virUSBDeviceListPtr activeUSBHostdevs; virSCSIDeviceListPtr activeSCSIHostdevs; virSCSIVHostDeviceListPtr activeSCSIVHostHostdevs; + virMediatedDeviceListPtr activeMediatedHostdevs; };
virHostdevManagerPtr virHostdevManagerGetDefault(void); @@ -96,6 +98,13 @@ virHostdevPrepareSCSIVHostDevices(virHostdevManagerPtr hostdev_mgr, virDomainHostdevDefPtr *hostdevs, int nhostdevs) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3); +int +virHostdevPrepareMediatedDevices(virHostdevManagerPtr hostdev_mgr, + const char *drv_name, + const char *dom_name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3); void virHostdevReAttachPCIDevices(virHostdevManagerPtr hostdev_mgr, const char *drv_name, -- 2.10.2
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

The name "reattach" does not really reflect the truth behind mediated devices, since these are purely software devices that do not need to be plugged back into the host in any way, however this patch pushes for naming consistency for methods where the logic behind the underlying operation stays the same except that in case of mdevs the operation itself is effectively a NO-OP. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/libvirt_private.syms | 1 + src/qemu/qemu_hostdev.c | 15 ++++++++++++++ src/qemu/qemu_hostdev.h | 4 ++++ src/util/virhostdev.c | 53 ++++++++++++++++++++++++++++++++++++++++++++++++ src/util/virhostdev.h | 7 +++++++ 5 files changed, 80 insertions(+) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index e6d1282..f2e1a74 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1712,6 +1712,7 @@ virHostdevPrepareSCSIDevices; virHostdevPrepareSCSIVHostDevices; virHostdevPrepareUSBDevices; virHostdevReAttachDomainDevices; +virHostdevReAttachMediatedDevices; virHostdevReAttachPCIDevices; virHostdevReAttachSCSIDevices; virHostdevReAttachSCSIVHostDevices; diff --git a/src/qemu/qemu_hostdev.c b/src/qemu/qemu_hostdev.c index 45b731c..6a7232f 100644 --- a/src/qemu/qemu_hostdev.c +++ b/src/qemu/qemu_hostdev.c @@ -419,6 +419,18 @@ qemuHostdevReAttachSCSIVHostDevices(virQEMUDriverPtr driver, } void +qemuHostdevReAttachMediatedDevices(virQEMUDriverPtr driver, + const char *name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs) +{ + virHostdevManagerPtr hostdev_mgr = driver->hostdevMgr; + + virHostdevReAttachMediatedDevices(hostdev_mgr, QEMU_DRIVER_NAME, + name, hostdevs, nhostdevs); +} + +void qemuHostdevReAttachDomainDevices(virQEMUDriverPtr driver, virDomainDefPtr def) { @@ -436,4 +448,7 @@ qemuHostdevReAttachDomainDevices(virQEMUDriverPtr driver, qemuHostdevReAttachSCSIVHostDevices(driver, def->name, def->hostdevs, def->nhostdevs); + + qemuHostdevReAttachMediatedDevices(driver, def->name, def->hostdevs, + def->nhostdevs); } diff --git a/src/qemu/qemu_hostdev.h b/src/qemu/qemu_hostdev.h index 9399241..0096497 100644 --- a/src/qemu/qemu_hostdev.h +++ b/src/qemu/qemu_hostdev.h @@ -84,6 +84,10 @@ void qemuHostdevReAttachSCSIVHostDevices(virQEMUDriverPtr driver, const char *name, virDomainHostdevDefPtr *hostdevs, int nhostdevs); +void qemuHostdevReAttachMediatedDevices(virQEMUDriverPtr driver, + const char *name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs); void qemuHostdevReAttachDomainDevices(virQEMUDriverPtr driver, virDomainDefPtr def); diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index 681f720..2a43b4d 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -1963,6 +1963,59 @@ virHostdevReAttachSCSIVHostDevices(virHostdevManagerPtr mgr, virObjectUnlock(mgr->activeSCSIVHostHostdevs); } +void +virHostdevReAttachMediatedDevices(virHostdevManagerPtr mgr, + const char *drv_name, + const char *dom_name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs) +{ + const char *used_by_drvname = NULL; + const char *used_by_domname = NULL; + virDomainHostdevDefPtr hostdev = NULL; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = NULL; + size_t i; + + if (nhostdevs == 0) + return; + + virObjectLock(mgr->activeMediatedHostdevs); + for (i = 0; i < nhostdevs; i++) { + virMediatedDevicePtr mdev, tmp; + + hostdev = hostdevs[i]; + mdevsrc = &hostdev->source.subsys.u.mdev; + + if (hostdev->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS || + hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV) + continue; + + if (!(mdev = virMediatedDeviceNew(mdevsrc->uuidstr))) { + VIR_WARN("Failed to reattach mediated device %s attached to " + "domain %s", mdevsrc->uuidstr, dom_name); + continue; + } + + /* Remove from the list only mdevs assigned to @drv_name/@dom_name */ + + tmp = virMediatedDeviceListFind(mgr->activeMediatedHostdevs, mdev); + virMediatedDeviceFree(mdev); + + /* skip inactive devices */ + if (!tmp) + continue; + + virMediatedDeviceGetUsedBy(tmp, &used_by_drvname, &used_by_domname); + if (STREQ_NULLABLE(drv_name, used_by_drvname) && + STREQ_NULLABLE(dom_name, used_by_domname)) { + VIR_DEBUG("Removing %s dom=%s from activeMediatedHostdevs", + mdevsrc->uuidstr, dom_name); + virMediatedDeviceListDel(mgr->activeMediatedHostdevs, tmp); + } + } + virObjectUnlock(mgr->activeMediatedHostdevs); +} + int virHostdevPCINodeDeviceDetach(virHostdevManagerPtr mgr, virPCIDevicePtr pci) diff --git a/src/util/virhostdev.h b/src/util/virhostdev.h index b077089..d0875d8 100644 --- a/src/util/virhostdev.h +++ b/src/util/virhostdev.h @@ -134,6 +134,13 @@ virHostdevReAttachSCSIVHostDevices(virHostdevManagerPtr hostdev_mgr, virDomainHostdevDefPtr *hostdevs, int nhostdevs) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3); +void +virHostdevReAttachMediatedDevices(virHostdevManagerPtr hostdev_mgr, + const char *drv_name, + const char *dom_name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3); int virHostdevUpdateActivePCIDevices(virHostdevManagerPtr mgr, virDomainHostdevDefPtr *hostdevs, -- 2.10.2

On Mon, Feb 20, 2017 at 03:28:25PM +0100, Erik Skultety wrote:
The name "reattach" does not really reflect the truth behind mediated devices, since these are purely software devices that do not need to be plugged back into the host in any way, however this patch pushes for naming consistency for methods where the logic behind the underlying operation stays the same except that in case of mdevs the operation itself is effectively a NO-OP.
This could be merged with the previous patch and you are also missing the case when libvirtd is restarted, see qemuHostdevUpdateActiveDomainDevices. Pavel
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/libvirt_private.syms | 1 + src/qemu/qemu_hostdev.c | 15 ++++++++++++++ src/qemu/qemu_hostdev.h | 4 ++++ src/util/virhostdev.c | 53 ++++++++++++++++++++++++++++++++++++++++++++++++ src/util/virhostdev.h | 7 +++++++ 5 files changed, 80 insertions(+)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index e6d1282..f2e1a74 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1712,6 +1712,7 @@ virHostdevPrepareSCSIDevices; virHostdevPrepareSCSIVHostDevices; virHostdevPrepareUSBDevices; virHostdevReAttachDomainDevices; +virHostdevReAttachMediatedDevices; virHostdevReAttachPCIDevices; virHostdevReAttachSCSIDevices; virHostdevReAttachSCSIVHostDevices; diff --git a/src/qemu/qemu_hostdev.c b/src/qemu/qemu_hostdev.c index 45b731c..6a7232f 100644 --- a/src/qemu/qemu_hostdev.c +++ b/src/qemu/qemu_hostdev.c @@ -419,6 +419,18 @@ qemuHostdevReAttachSCSIVHostDevices(virQEMUDriverPtr driver, }
void +qemuHostdevReAttachMediatedDevices(virQEMUDriverPtr driver, + const char *name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs) +{ + virHostdevManagerPtr hostdev_mgr = driver->hostdevMgr; + + virHostdevReAttachMediatedDevices(hostdev_mgr, QEMU_DRIVER_NAME, + name, hostdevs, nhostdevs); +} + +void qemuHostdevReAttachDomainDevices(virQEMUDriverPtr driver, virDomainDefPtr def) { @@ -436,4 +448,7 @@ qemuHostdevReAttachDomainDevices(virQEMUDriverPtr driver,
qemuHostdevReAttachSCSIVHostDevices(driver, def->name, def->hostdevs, def->nhostdevs); + + qemuHostdevReAttachMediatedDevices(driver, def->name, def->hostdevs, + def->nhostdevs); } diff --git a/src/qemu/qemu_hostdev.h b/src/qemu/qemu_hostdev.h index 9399241..0096497 100644 --- a/src/qemu/qemu_hostdev.h +++ b/src/qemu/qemu_hostdev.h @@ -84,6 +84,10 @@ void qemuHostdevReAttachSCSIVHostDevices(virQEMUDriverPtr driver, const char *name, virDomainHostdevDefPtr *hostdevs, int nhostdevs); +void qemuHostdevReAttachMediatedDevices(virQEMUDriverPtr driver, + const char *name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs); void qemuHostdevReAttachDomainDevices(virQEMUDriverPtr driver, virDomainDefPtr def);
diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index 681f720..2a43b4d 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -1963,6 +1963,59 @@ virHostdevReAttachSCSIVHostDevices(virHostdevManagerPtr mgr, virObjectUnlock(mgr->activeSCSIVHostHostdevs); }
+void +virHostdevReAttachMediatedDevices(virHostdevManagerPtr mgr, + const char *drv_name, + const char *dom_name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs) +{ + const char *used_by_drvname = NULL; + const char *used_by_domname = NULL; + virDomainHostdevDefPtr hostdev = NULL; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = NULL; + size_t i; + + if (nhostdevs == 0) + return; + + virObjectLock(mgr->activeMediatedHostdevs); + for (i = 0; i < nhostdevs; i++) { + virMediatedDevicePtr mdev, tmp; + + hostdev = hostdevs[i]; + mdevsrc = &hostdev->source.subsys.u.mdev; + + if (hostdev->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS || + hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV) + continue; + + if (!(mdev = virMediatedDeviceNew(mdevsrc->uuidstr))) { + VIR_WARN("Failed to reattach mediated device %s attached to " + "domain %s", mdevsrc->uuidstr, dom_name); + continue; + } + + /* Remove from the list only mdevs assigned to @drv_name/@dom_name */ + + tmp = virMediatedDeviceListFind(mgr->activeMediatedHostdevs, mdev); + virMediatedDeviceFree(mdev); + + /* skip inactive devices */ + if (!tmp) + continue; + + virMediatedDeviceGetUsedBy(tmp, &used_by_drvname, &used_by_domname); + if (STREQ_NULLABLE(drv_name, used_by_drvname) && + STREQ_NULLABLE(dom_name, used_by_domname)) { + VIR_DEBUG("Removing %s dom=%s from activeMediatedHostdevs", + mdevsrc->uuidstr, dom_name); + virMediatedDeviceListDel(mgr->activeMediatedHostdevs, tmp); + } + } + virObjectUnlock(mgr->activeMediatedHostdevs); +} + int virHostdevPCINodeDeviceDetach(virHostdevManagerPtr mgr, virPCIDevicePtr pci) diff --git a/src/util/virhostdev.h b/src/util/virhostdev.h index b077089..d0875d8 100644 --- a/src/util/virhostdev.h +++ b/src/util/virhostdev.h @@ -134,6 +134,13 @@ virHostdevReAttachSCSIVHostDevices(virHostdevManagerPtr hostdev_mgr, virDomainHostdevDefPtr *hostdevs, int nhostdevs) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3); +void +virHostdevReAttachMediatedDevices(virHostdevManagerPtr hostdev_mgr, + const char *drv_name, + const char *dom_name, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3); int virHostdevUpdateActivePCIDevices(virHostdevManagerPtr mgr, virDomainHostdevDefPtr *hostdevs, -- 2.10.2
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

As goes for all the other hostdev device types, grant the qemu process access to /dev/vfio/<mediated_device_iommu_group>. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/qemu/qemu_domain.c | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index b905f9a..86db8b7 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -6875,10 +6875,12 @@ qemuDomainGetHostdevPath(virDomainDefPtr def, virDomainHostdevSubsysPCIPtr pcisrc = &dev->source.subsys.u.pci; virDomainHostdevSubsysSCSIPtr scsisrc = &dev->source.subsys.u.scsi; virDomainHostdevSubsysSCSIVHostPtr hostsrc = &dev->source.subsys.u.scsi_host; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &dev->source.subsys.u.mdev; virPCIDevicePtr pci = NULL; virUSBDevicePtr usb = NULL; virSCSIDevicePtr scsi = NULL; virSCSIVHostDevicePtr host = NULL; + virMediatedDevicePtr mdev = NULL; char *tmpPath = NULL; bool freeTmpPath = false; bool includeVFIO = false; @@ -6979,6 +6981,16 @@ qemuDomainGetHostdevPath(virDomainDefPtr def, } case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: + if (!(mdev = virMediatedDeviceNew(mdevsrc->uuidstr))) + goto cleanup; + + if (!(tmpPath = virMediatedDeviceGetIOMMUGroupDev(mdev))) + goto cleanup; + + freeTmpPath = true; + includeVFIO = true; + perm = VIR_CGROUP_DEVICE_RW; + break; case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: break; } @@ -7028,6 +7040,7 @@ qemuDomainGetHostdevPath(virDomainDefPtr def, virUSBDeviceFree(usb); virSCSIDeviceFree(scsi); virSCSIVHostDeviceFree(host); + virMediatedDeviceFree(mdev); if (freeTmpPath) VIR_FREE(tmpPath); return ret; -- 2.10.2

Since mdevs are just another type of VFIO devices, we should increase the memory locking limit the same way we do for VFIO PCI devices. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/qemu/qemu_domain.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 86db8b7..27f6f17 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -6223,11 +6223,12 @@ qemuDomainRequiresMemLock(virDomainDefPtr def) return true; for (i = 0; i < def->nhostdevs; i++) { - virDomainHostdevDefPtr dev = def->hostdevs[i]; + virDomainHostdevSubsysPtr subsys = &def->hostdevs[i]->source.subsys; - if (dev->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS && - dev->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && - dev->source.subsys.u.pci.backend == VIR_DOMAIN_HOSTDEV_PCI_BACKEND_VFIO) + if (def->hostdevs[i]->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS && + (subsys->type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV || + (subsys->type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && + subsys->u.pci.backend == VIR_DOMAIN_HOSTDEV_PCI_BACKEND_VFIO))) return true; } -- 2.10.2

Format the mediated devices on the qemu command line as -device vfio-pci,sysfsdev='/path/to/device/in/syfs'. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/qemu/qemu_command.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_command.h | 5 +++++ 2 files changed, 54 insertions(+) diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index f4bcfd4..0532156 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -57,6 +57,7 @@ #include "virscsi.h" #include "virnuma.h" #include "virgic.h" +#include "virmdev.h" #if defined(__linux__) # include <linux/capability.h> #endif @@ -5161,6 +5162,35 @@ qemuBuildChrChardevStr(virLogManagerPtr logManager, return ret; } +char * +qemuBuildHostdevMediatedDevStr(const virDomainDef *def, + virDomainHostdevDefPtr dev, + virQEMUCapsPtr qemuCaps) +{ + virBuffer buf = VIR_BUFFER_INITIALIZER; + virDomainHostdevSubsysMediatedDevPtr mdevsrc = &dev->source.subsys.u.mdev; + virMediatedDevicePtr mdev = NULL; + char *ret = NULL; + + if (!(mdev = virMediatedDeviceNew(mdevsrc->uuidstr))) + goto cleanup; + + virBufferAddLit(&buf, "vfio-pci"); + virBufferAsprintf(&buf, ",sysfsdev=%s", virMediatedDeviceGetPath(mdev)); + + if (qemuBuildDeviceAddressStr(&buf, def, dev->info, qemuCaps) < 0) + goto cleanup; + + if (virBufferCheckError(&buf) < 0) + goto cleanup; + + ret = virBufferContentAndReset(&buf); + + cleanup: + virBufferFreeAndReset(&buf); + virMediatedDeviceFree(mdev); + return ret; +} static int qemuBuildHostdevCommandLine(virCommandPtr cmd, @@ -5349,6 +5379,25 @@ qemuBuildHostdevCommandLine(virCommandPtr cmd, VIR_FREE(devstr); } } + + /* MDEV */ + if (hostdev->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS && + subsys->type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV) { + + if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE_VFIO_PCI)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("VFIO PCI device assignment is not " + "supported by this version of qemu")); + return -1; + } + + virCommandAddArg(cmd, "-device"); + if (!(devstr = + qemuBuildHostdevMediatedDevStr(def, hostdev, qemuCaps))) + return -1; + virCommandAddArg(cmd, devstr); + VIR_FREE(devstr); + } } return 0; diff --git a/src/qemu/qemu_command.h b/src/qemu/qemu_command.h index 69fe846..b263b87 100644 --- a/src/qemu/qemu_command.h +++ b/src/qemu/qemu_command.h @@ -171,6 +171,11 @@ qemuBuildSCSIVHostHostdevDevStr(const virDomainDef *def, virQEMUCapsPtr qemuCaps, char *vhostfdName); +char * +qemuBuildHostdevMediatedDevStr(const virDomainDef *def, + virDomainHostdevDefPtr dev, + virQEMUCapsPtr qemuCaps); + char *qemuBuildRedirdevDevStr(const virDomainDef *def, virDomainRedirdevDefPtr dev, virQEMUCapsPtr qemuCaps); -- 2.10.2

For now, these only cover the unmanaged, i.e. user pre-created devices. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- .../qemuxml2argv-hostdev-mdev-unmanaged.args | 25 ++++++++++++++ .../qemuxml2argv-hostdev-mdev-unmanaged.xml | 37 ++++++++++++++++++++ tests/qemuxml2argvtest.c | 6 ++++ .../qemuxml2xmlout-hostdev-mdev-unmanaged.xml | 40 ++++++++++++++++++++++ tests/qemuxml2xmltest.c | 1 + 5 files changed, 109 insertions(+) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.xml create mode 100644 tests/qemuxml2xmloutdata/qemuxml2xmlout-hostdev-mdev-unmanaged.xml diff --git a/tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.args b/tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.args new file mode 100644 index 0000000..fdefeb6 --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.args @@ -0,0 +1,25 @@ +LC_ALL=C \ +PATH=/bin \ +HOME=/home/test \ +USER=test \ +LOGNAME=test \ +QEMU_AUDIO_DRV=none \ +/usr/bin/qemu \ +-name QEMUGuest2 \ +-S \ +-M pc \ +-m 214 \ +-smp 1,sockets=1,cores=1,threads=1 \ +-uuid c7a5fdbd-edaf-9455-926a-d65c16db1809 \ +-nographic \ +-nodefconfig \ +-nodefaults \ +-monitor unix:/tmp/lib/domain--1-QEMUGuest2/monitor.sock,server,nowait \ +-no-acpi \ +-boot c \ +-usb \ +-drive file=/dev/HostVG/QEMUGuest2,format=raw,if=none,id=drive-ide0-0-0 \ +-device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 \ +-device vfio-pci,\ +sysfsdev=/sys/bus/mdev/devices/53764d0e-85a0-42b4-af5c-2046b460b1dc,bus=pci.0,\ +addr=0x3 diff --git a/tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.xml b/tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.xml new file mode 100644 index 0000000..144b769 --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.xml @@ -0,0 +1,37 @@ +<domain type='qemu'> + <name>QEMUGuest2</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>1</vcpu> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu</emulator> + <disk type='block' device='disk'> + <driver name='qemu' type='raw'/> + <source dev='/dev/HostVG/QEMUGuest2'/> + <target dev='hda' bus='ide'/> + <address type='drive' controller='0' bus='0' target='0' unit='0'/> + </disk> + <controller type='usb' index='0'> + </controller> + <controller type='pci' index='0' model='pci-root'/> + <controller type='ide' index='0'> + </controller> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <hostdev mode='subsystem' type='mdev' model='vfio-pci'> + <source> + <address type='mdev' uuid='53764d0e-85a0-42b4-af5c-2046b460b1dc'/> + </source> + </hostdev> + <memballoon model='none'/> + </devices> +</domain> diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c index f55b04b..0740ba3 100644 --- a/tests/qemuxml2argvtest.c +++ b/tests/qemuxml2argvtest.c @@ -1475,6 +1475,12 @@ mymain(void) DO_TEST("hostdev-vfio-multidomain", QEMU_CAPS_NODEFCONFIG, QEMU_CAPS_DEVICE_VFIO_PCI, QEMU_CAPS_HOST_PCI_MULTIDOMAIN); + DO_TEST("hostdev-mdev-unmanaged", + QEMU_CAPS_NODEFCONFIG, + QEMU_CAPS_DEVICE_VFIO_PCI); + DO_TEST_PARSE_ERROR("hostdev-mdev-unmanaged-no-uuid", + QEMU_CAPS_NODEFCONFIG, + QEMU_CAPS_DEVICE_VFIO_PCI); DO_TEST_FAILURE("hostdev-vfio-multidomain", QEMU_CAPS_NODEFCONFIG, QEMU_CAPS_DEVICE_VFIO_PCI); DO_TEST("pci-rom", diff --git a/tests/qemuxml2xmloutdata/qemuxml2xmlout-hostdev-mdev-unmanaged.xml b/tests/qemuxml2xmloutdata/qemuxml2xmlout-hostdev-mdev-unmanaged.xml new file mode 100644 index 0000000..55f19c4 --- /dev/null +++ b/tests/qemuxml2xmloutdata/qemuxml2xmlout-hostdev-mdev-unmanaged.xml @@ -0,0 +1,40 @@ +<domain type='qemu'> + <name>QEMUGuest2</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>1</vcpu> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu</emulator> + <disk type='block' device='disk'> + <driver name='qemu' type='raw'/> + <source dev='/dev/HostVG/QEMUGuest2'/> + <target dev='hda' bus='ide'/> + <address type='drive' controller='0' bus='0' target='0' unit='0'/> + </disk> + <controller type='usb' index='0'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/> + </controller> + <controller type='pci' index='0' model='pci-root'/> + <controller type='ide' index='0'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/> + </controller> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <hostdev mode='subsystem' type='mdev' managed='no' model='vfio-pci'> + <source> + <address type='mdev' uuid='53764d0e-85a0-42b4-af5c-2046b460b1dc'/> + </source> + <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> + </hostdev> + <memballoon model='none'/> + </devices> +</domain> diff --git a/tests/qemuxml2xmltest.c b/tests/qemuxml2xmltest.c index 0702f58..fbb5192 100644 --- a/tests/qemuxml2xmltest.c +++ b/tests/qemuxml2xmltest.c @@ -550,6 +550,7 @@ mymain(void) DO_TEST("hostdev-usb-address", NONE); DO_TEST("hostdev-pci-address", NONE); DO_TEST("hostdev-vfio", NONE); + DO_TEST("hostdev-mdev-unmanaged", NONE); DO_TEST("pci-rom", NONE); DO_TEST("pci-serial-dev-chardev", NONE); -- 2.10.2

Signed-off-by: Erik Skultety <eskultet@redhat.com> --- docs/formatdomain.html.in | 48 +++++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 44 insertions(+), 4 deletions(-) diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index b69bd4c..13cb767 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -3277,8 +3277,20 @@ attributes: <code>iobase</code> and <code>irq</code>. <span class="since">Since 1.2.1</span> </dd> + <dt><code>mdev</code></dt> + <dd>Mediated devices' addresses have so far only one mandatory attribute + <code>uuid</code> (<span class="since">since 3.1.0</span>) which + uniquely identifies a mediated device under the syfs file system. + </dd> </dl> + <p> + Note: Due to nature of mediated devices, being only software devices + defining an allocation of resources on the physical parent device, the + address type <code>mdev</code> is supposed to be used to identify a + device on the host only, rather than identifying it in the guest. + </p> + <h4><a name="elementsControllers">Controllers</a></h4> <p> @@ -3774,6 +3786,19 @@ </devices> ...</pre> + <p>or:</p> + +<pre> + ... + <devices> + <hostdev mode='subsystem' type='mdev' model='vfio-pci'> + <source> + <address type='mdev' uuid='c2177883-f1bb-47f0-914d-32a22e3a8804'> + </source> + </hostdev> + </devices> + ...</pre> + <dl> <dt><code>hostdev</code></dt> <dd>The <code>hostdev</code> element is the main container for describing @@ -3818,12 +3843,23 @@ <code>type</code> passes all LUNs presented by a single HBA to the guest. </dd> + <dt><code>mdev</code></dt> + <dd>For mediated devices (<span class="since">Since 3.1.0</span>) + the <code>model</code> attribute specifies the device API which + determines how the host's vfio driver will expose the device to the + guest. Currently, only <code>vfio-pci</code> model is supported. + The model also has implications on the guest's address type, i.e. + for <code>vfio-pci</code> device API any address type other than PCI + will result in an error. + </dd> </dl> <p> - Note: The <code>managed</code> attribute is only used with PCI devices - and is ignored by all the other device types, thus setting - <code>managed</code> explicitly with other than PCI device has the same - effect as omitting it. + Note: The <code>managed</code> attribute is only used with PCI and is + ignored by all the other device types, thus setting + <code>managed</code> explicitly with other than a PCI device has the + same effect as omitting it. Similarly, <code>model</code> attribute is + only supported by mediated devices and ignored by all other device + types. </p> </dd> <dt><code>source</code></dt> @@ -3888,6 +3924,10 @@ is the vhost_scsi wwpn (16 hexadecimal digits with a prefix of "naa.") established in the host configfs. </dd> + <dt><code>mdev</code></dt> + <dd>Mediated devices (<span class="since">Since 3.1.0</span>) are + described by the <code>address</code> element. + </dd> </dl> </dd> <dt><code>vendor</code>, <code>product</code></dt> -- 2.10.2

On 2/20/2017 7:58 PM, Erik Skultety wrote:
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- docs/formatdomain.html.in | 48 +++++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 44 insertions(+), 4 deletions(-)
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index b69bd4c..13cb767 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -3277,8 +3277,20 @@ attributes: <code>iobase</code> and <code>irq</code>. <span class="since">Since 1.2.1</span> </dd> + <dt><code>mdev</code></dt> + <dd>Mediated devices' addresses have so far only one mandatory attribute + <code>uuid</code> (<span class="since">since 3.1.0</span>) which + uniquely identifies a mediated device under the syfs file system. + </dd> </dl>
+ <p> + Note: Due to nature of mediated devices, being only software devices + defining an allocation of resources on the physical parent device, the + address type <code>mdev</code> is supposed to be used to identify a + device on the host only, rather than identifying it in the guest. + </p> + <h4><a name="elementsControllers">Controllers</a></h4>
<p> @@ -3774,6 +3786,19 @@ </devices> ...</pre>
+ <p>or:</p> + +<pre> + ... + <devices> + <hostdev mode='subsystem' type='mdev' model='vfio-pci'> + <source> + <address type='mdev' uuid='c2177883-f1bb-47f0-914d-32a22e3a8804'> + </source> + </hostdev> + </devices> + ...</pre> +
Hi Erik, Changes looks good so far. I did some basic testing with these patches. I used above example snippet of hostdev element and it throws below error. Add a '/' before '>' and after uuid in address line. error: (domain_definition):74: Opening and ending tag mismatch: address line 73 and source </source> ---------------^ Failed. Try again? [y,n,i,f,?]: Thanks, Kirti
<dl> <dt><code>hostdev</code></dt> <dd>The <code>hostdev</code> element is the main container for describing @@ -3818,12 +3843,23 @@ <code>type</code> passes all LUNs presented by a single HBA to the guest. </dd> + <dt><code>mdev</code></dt> + <dd>For mediated devices (<span class="since">Since 3.1.0</span>) + the <code>model</code> attribute specifies the device API which + determines how the host's vfio driver will expose the device to the + guest. Currently, only <code>vfio-pci</code> model is supported. + The model also has implications on the guest's address type, i.e. + for <code>vfio-pci</code> device API any address type other than PCI + will result in an error. + </dd> </dl> <p> - Note: The <code>managed</code> attribute is only used with PCI devices - and is ignored by all the other device types, thus setting - <code>managed</code> explicitly with other than PCI device has the same - effect as omitting it. + Note: The <code>managed</code> attribute is only used with PCI and is + ignored by all the other device types, thus setting + <code>managed</code> explicitly with other than a PCI device has the + same effect as omitting it. Similarly, <code>model</code> attribute is + only supported by mediated devices and ignored by all other device + types. </p> </dd> <dt><code>source</code></dt> @@ -3888,6 +3924,10 @@ is the vhost_scsi wwpn (16 hexadecimal digits with a prefix of "naa.") established in the host configfs. </dd> + <dt><code>mdev</code></dt> + <dd>Mediated devices (<span class="since">Since 3.1.0</span>) are + described by the <code>address</code> element. + </dd> </dl> </dd> <dt><code>vendor</code>, <code>product</code></dt>
----------------------------------------------------------------------------------- This email message is for the sole use of the intended recipient(s) and may contain confidential information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message. -----------------------------------------------------------------------------------

On Wed, Feb 22, 2017 at 08:20:06PM +0530, Kirti Wankhede wrote:
On 2/20/2017 7:58 PM, Erik Skultety wrote:
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- docs/formatdomain.html.in | 48 +++++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 44 insertions(+), 4 deletions(-)
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index b69bd4c..13cb767 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -3277,8 +3277,20 @@ attributes: <code>iobase</code> and <code>irq</code>. <span class="since">Since 1.2.1</span> </dd> + <dt><code>mdev</code></dt> + <dd>Mediated devices' addresses have so far only one mandatory attribute + <code>uuid</code> (<span class="since">since 3.1.0</span>) which + uniquely identifies a mediated device under the syfs file system. + </dd> </dl>
+ <p> + Note: Due to nature of mediated devices, being only software devices + defining an allocation of resources on the physical parent device, the + address type <code>mdev</code> is supposed to be used to identify a + device on the host only, rather than identifying it in the guest. + </p> + <h4><a name="elementsControllers">Controllers</a></h4>
<p> @@ -3774,6 +3786,19 @@ </devices> ...</pre>
+ <p>or:</p> + +<pre> + ... + <devices> + <hostdev mode='subsystem' type='mdev' model='vfio-pci'> + <source> + <address type='mdev' uuid='c2177883-f1bb-47f0-914d-32a22e3a8804'> + </source> + </hostdev> + </devices> + ...</pre> +
Hi Erik,
Changes looks good so far. I did some basic testing with these patches. I used above example snippet of hostdev element and it throws below error. Add a '/' before '>' and after uuid in address line.
error: (domain_definition):74: Opening and ending tag mismatch: address line 73 and source </source> ---------------^ Failed. Try again? [y,n,i,f,?]:
Thanks, Kirti
Oh, right, good catch. Thanks, Erik

On 20/02/17 15:28 +0100, Erik Skultety wrote:
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- docs/formatdomain.html.in | 48 +++++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 44 insertions(+), 4 deletions(-)
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index b69bd4c..13cb767 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -3277,8 +3277,20 @@ attributes: <code>iobase</code> and <code>irq</code>. <span class="since">Since 1.2.1</span> </dd> + <dt><code>mdev</code></dt> + <dd>Mediated devices' addresses have so far only one mandatory attribute + <code>uuid</code> (<span class="since">since 3.1.0</span>) which + uniquely identifies a mediated device under the syfs file system. + </dd> </dl>
+ <p> + Note: Due to nature of mediated devices, being only software devices + defining an allocation of resources on the physical parent device, the + address type <code>mdev</code> is supposed to be used to identify a + device on the host only, rather than identifying it in the guest. + </p> + <h4><a name="elementsControllers">Controllers</a></h4>
<p> @@ -3774,6 +3786,19 @@ </devices> ...</pre>
+ <p>or:</p> + +<pre> + ... + <devices> + <hostdev mode='subsystem' type='mdev' model='vfio-pci'> + <source> + <address type='mdev' uuid='c2177883-f1bb-47f0-914d-32a22e3a8804'> + </source> + </hostdev> + </devices> + ...</pre>
Can't really test it yet, but from the docs/code seems to be OK for oVirt.
+ <dl> <dt><code>hostdev</code></dt> <dd>The <code>hostdev</code> element is the main container for describing @@ -3818,12 +3843,23 @@ <code>type</code> passes all LUNs presented by a single HBA to the guest. </dd> + <dt><code>mdev</code></dt> + <dd>For mediated devices (<span class="since">Since 3.1.0</span>) + the <code>model</code> attribute specifies the device API which + determines how the host's vfio driver will expose the device to the + guest. Currently, only <code>vfio-pci</code> model is supported. + The model also has implications on the guest's address type, i.e. + for <code>vfio-pci</code> device API any address type other than PCI + will result in an error. + </dd> </dl> <p> - Note: The <code>managed</code> attribute is only used with PCI devices - and is ignored by all the other device types, thus setting - <code>managed</code> explicitly with other than PCI device has the same - effect as omitting it. + Note: The <code>managed</code> attribute is only used with PCI and is + ignored by all the other device types, thus setting + <code>managed</code> explicitly with other than a PCI device has the + same effect as omitting it. Similarly, <code>model</code> attribute is + only supported by mediated devices and ignored by all other device + types. </p> </dd> <dt><code>source</code></dt> @@ -3888,6 +3924,10 @@ is the vhost_scsi wwpn (16 hexadecimal digits with a prefix of "naa.") established in the host configfs. </dd> + <dt><code>mdev</code></dt> + <dd>Mediated devices (<span class="since">Since 3.1.0</span>) are + described by the <code>address</code> element. + </dd> </dl> </dd> <dt><code>vendor</code>, <code>product</code></dt> -- 2.10.2
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

On Mon, Feb 27, 2017 at 09:33:51AM +0100, Martin Polednik wrote:
On 20/02/17 15:28 +0100, Erik Skultety wrote:
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- docs/formatdomain.html.in | 48 +++++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 44 insertions(+), 4 deletions(-)
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index b69bd4c..13cb767 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -3277,8 +3277,20 @@ attributes: <code>iobase</code> and <code>irq</code>. <span class="since">Since 1.2.1</span> </dd> + <dt><code>mdev</code></dt> + <dd>Mediated devices' addresses have so far only one mandatory attribute + <code>uuid</code> (<span class="since">since 3.1.0</span>) which + uniquely identifies a mediated device under the syfs file system. + </dd> </dl>
+ <p> + Note: Due to nature of mediated devices, being only software devices + defining an allocation of resources on the physical parent device, the + address type <code>mdev</code> is supposed to be used to identify a + device on the host only, rather than identifying it in the guest. + </p> + <h4><a name="elementsControllers">Controllers</a></h4>
<p> @@ -3774,6 +3786,19 @@ </devices> ...</pre>
+ <p>or:</p> + +<pre> + ... + <devices> + <hostdev mode='subsystem' type='mdev' model='vfio-pci'> + <source> + <address type='mdev' uuid='c2177883-f1bb-47f0-914d-32a22e3a8804'> + </source> + </hostdev> + </devices> + ...</pre>
Can't really test it yet, but from the docs/code seems to be OK for oVirt.
Actually you can, with kernel 4.10, I don't know how the distro-packaged kernels are configured, so my honest guess just would be to try modprobe mtty and see what happens. Anyway, you can build your own kernel and just make sure the vfio mediated devices framework is either included or modularized and the CONFIG_SAMPLE_VFIO_MDEV_MTTY sample driver is checked and you're good to go. Erik

On Mon, 27 Feb 2017 11:03:42 +0100 Erik Skultety <eskultet@redhat.com> wrote:
On Mon, Feb 27, 2017 at 09:33:51AM +0100, Martin Polednik wrote:
On 20/02/17 15:28 +0100, Erik Skultety wrote:
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- docs/formatdomain.html.in | 48 +++++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 44 insertions(+), 4 deletions(-)
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index b69bd4c..13cb767 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -3277,8 +3277,20 @@ attributes: <code>iobase</code> and <code>irq</code>. <span class="since">Since 1.2.1</span> </dd> + <dt><code>mdev</code></dt> + <dd>Mediated devices' addresses have so far only one mandatory attribute + <code>uuid</code> (<span class="since">since 3.1.0</span>) which + uniquely identifies a mediated device under the syfs file system. + </dd> </dl>
+ <p> + Note: Due to nature of mediated devices, being only software devices + defining an allocation of resources on the physical parent device, the + address type <code>mdev</code> is supposed to be used to identify a + device on the host only, rather than identifying it in the guest. + </p> + <h4><a name="elementsControllers">Controllers</a></h4>
<p> @@ -3774,6 +3786,19 @@ </devices> ...</pre>
+ <p>or:</p> + +<pre> + ... + <devices> + <hostdev mode='subsystem' type='mdev' model='vfio-pci'> + <source> + <address type='mdev' uuid='c2177883-f1bb-47f0-914d-32a22e3a8804'> + </source> + </hostdev> + </devices> + ...</pre>
Can't really test it yet, but from the docs/code seems to be OK for oVirt.
Actually you can, with kernel 4.10, I don't know how the distro-packaged kernels are configured, so my honest guess just would be to try modprobe mtty and see what happens. Anyway, you can build your own kernel and just make sure the vfio mediated devices framework is either included or modularized and the CONFIG_SAMPLE_VFIO_MDEV_MTTY sample driver is checked and you're good to go.
Distro kernels aren't likely to include sample drivers, but KVMGT (ie. Intel vGPUs) also went in for v4.10 and makes use of the vfio-mdev framework as well. These should be supported on Broadwell and newer systems. There's work yet to do for stability and robustness of this particular mdev driver, but the sysfs interface mostly behaves as intended and it can be expected to improve significantly though v4.11 kernels and v4.10 stable backports. Thanks, Alex

Signed-off-by: Erik Skultety <eskultet@redhat.com> --- docs/news.xml | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/docs/news.xml b/docs/news.xml index 8d53e07..5523f29 100644 --- a/docs/news.xml +++ b/docs/news.xml @@ -62,6 +62,15 @@ strong encryption and doesn't require any extra network connection other than what's required for remote access of libvirtd. </description> + <summary> + qemu: add mediated devices framework support + </summary> + <description> + With the upcomming kernel changes which introduce the new mediated + device framework, provide an initial support of this framework into + libvirt, mainly introduce new device type as well as an address type + in the XML. + </description> </change> <change> <summary> -- 2.10.2

On Mon, 20 Feb 2017 15:28:13 +0100 Erik Skultety <eskultet@redhat.com> wrote:
since the original v2 [1]: - resolved a few merge conflicts caused by @9d92f533 which refactored out some duplicate code which eventually lead to dropping patch 14/18 from the original series due to being unnecessary - rebased onto fresh HEAD
[1] https://www.redhat.com/archives/libvir-list/2017-February/msg00739.html
Erik Skultety (18): util: Introduce new module virmdev conf: Introduce new hostdev device type mdev conf: Introduce new address type mdev conf: Update XML parser, formatter, and RNG schema to support mdev conf: Introduce virDomainHostdevDefPostParse conf: Add post parse code for mdevs to virDomainHostdevDefPostParse security: dac: Enable labeling of vfio mediated devices security: selinux: Enable labeling of vfio mediated devices conf: Enable cold-plug of a mediated device qemu: Assign PCI addresses for mediated devices as well hostdev: Maintain a driver list of active mediated devices hostdev: Introduce a reattach method for mediated devices qemu: cgroup: Adjust cgroups' logic to allow mediated devices qemu: Bump the memory locking limit for mdevs as well qemu: Format mdevs on qemu command line test: Add some test cases for our test suite regarding the mdevs docs: Document the new hostdev and address type 'mdev' news: Update the NEWS.xml about the new mdev feature
docs/formatdomain.html.in | 48 ++- docs/news.xml | 9 + docs/schemas/domaincommon.rng | 26 ++ po/POTFILES.in | 1 + src/Makefile.am | 1 + src/conf/device_conf.h | 1 + src/conf/domain_conf.c | 203 ++++++++++-- src/conf/domain_conf.h | 9 + src/libvirt_private.syms | 20 ++
I don't understand how these get generated, so I won't suggest where they should be added, but a usb test fails for me without adding these to this syms file: +virMediatedDeviceModelTypeFromString; +virMediatedDeviceModelTypeToString; Thanks, Alex
src/qemu/qemu_command.c | 49 +++ src/qemu/qemu_command.h | 5 + src/qemu/qemu_domain.c | 23 +- src/qemu/qemu_domain.h | 1 + src/qemu/qemu_domain_address.c | 16 +- src/qemu/qemu_hostdev.c | 37 +++ src/qemu/qemu_hostdev.h | 8 + src/qemu/qemu_hotplug.c | 2 + src/security/security_apparmor.c | 3 + src/security/security_dac.c | 55 ++++ src/security/security_selinux.c | 54 ++++ src/util/virhostdev.c | 229 ++++++++++++- src/util/virhostdev.h | 16 + src/util/virmdev.c | 358 +++++++++++++++++++++ src/util/virmdev.h | 93 ++++++ tests/domaincapsschemadata/full.xml | 1 + .../qemuxml2argv-hostdev-mdev-unmanaged.args | 25 ++ .../qemuxml2argv-hostdev-mdev-unmanaged.xml | 37 +++ tests/qemuxml2argvtest.c | 6 + .../qemuxml2xmlout-hostdev-mdev-unmanaged.xml | 40 +++ tests/qemuxml2xmltest.c | 1 + 30 files changed, 1333 insertions(+), 44 deletions(-) create mode 100644 src/util/virmdev.c create mode 100644 src/util/virmdev.h create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.xml create mode 100644 tests/qemuxml2xmloutdata/qemuxml2xmlout-hostdev-mdev-unmanaged.xml

On Fri, Feb 24, 2017 at 11:10:00AM -0700, Alex Williamson wrote:
On Mon, 20 Feb 2017 15:28:13 +0100 Erik Skultety <eskultet@redhat.com> wrote:
since the original v2 [1]: - resolved a few merge conflicts caused by @9d92f533 which refactored out some duplicate code which eventually lead to dropping patch 14/18 from the original series due to being unnecessary - rebased onto fresh HEAD
[1] https://www.redhat.com/archives/libvir-list/2017-February/msg00739.html
Erik Skultety (18): util: Introduce new module virmdev conf: Introduce new hostdev device type mdev conf: Introduce new address type mdev conf: Update XML parser, formatter, and RNG schema to support mdev conf: Introduce virDomainHostdevDefPostParse conf: Add post parse code for mdevs to virDomainHostdevDefPostParse security: dac: Enable labeling of vfio mediated devices security: selinux: Enable labeling of vfio mediated devices conf: Enable cold-plug of a mediated device qemu: Assign PCI addresses for mediated devices as well hostdev: Maintain a driver list of active mediated devices hostdev: Introduce a reattach method for mediated devices qemu: cgroup: Adjust cgroups' logic to allow mediated devices qemu: Bump the memory locking limit for mdevs as well qemu: Format mdevs on qemu command line test: Add some test cases for our test suite regarding the mdevs docs: Document the new hostdev and address type 'mdev' news: Update the NEWS.xml about the new mdev feature
docs/formatdomain.html.in | 48 ++- docs/news.xml | 9 + docs/schemas/domaincommon.rng | 26 ++ po/POTFILES.in | 1 + src/Makefile.am | 1 + src/conf/device_conf.h | 1 + src/conf/domain_conf.c | 203 ++++++++++-- src/conf/domain_conf.h | 9 + src/libvirt_private.syms | 20 ++
I don't understand how these get generated, so I won't suggest where they should be added, but a usb test fails for me without adding these to this syms file:
Hmm, weird, nothing fails for me, even rebased onto the current master. Anyhow, I checked that we indeed put these generated methods into the .syms file which I didn't know (never had a problem with that), but I'll fix it. Thanks, Erik

On Mon, 27 Feb 2017 10:55:26 +0100 Erik Skultety <eskultet@redhat.com> wrote:
On Fri, Feb 24, 2017 at 11:10:00AM -0700, Alex Williamson wrote:
On Mon, 20 Feb 2017 15:28:13 +0100 Erik Skultety <eskultet@redhat.com> wrote:
since the original v2 [1]: - resolved a few merge conflicts caused by @9d92f533 which refactored out some duplicate code which eventually lead to dropping patch 14/18 from the original series due to being unnecessary - rebased onto fresh HEAD
[1] https://www.redhat.com/archives/libvir-list/2017-February/msg00739.html
Erik Skultety (18): util: Introduce new module virmdev conf: Introduce new hostdev device type mdev conf: Introduce new address type mdev conf: Update XML parser, formatter, and RNG schema to support mdev conf: Introduce virDomainHostdevDefPostParse conf: Add post parse code for mdevs to virDomainHostdevDefPostParse security: dac: Enable labeling of vfio mediated devices security: selinux: Enable labeling of vfio mediated devices conf: Enable cold-plug of a mediated device qemu: Assign PCI addresses for mediated devices as well hostdev: Maintain a driver list of active mediated devices hostdev: Introduce a reattach method for mediated devices qemu: cgroup: Adjust cgroups' logic to allow mediated devices qemu: Bump the memory locking limit for mdevs as well qemu: Format mdevs on qemu command line test: Add some test cases for our test suite regarding the mdevs docs: Document the new hostdev and address type 'mdev' news: Update the NEWS.xml about the new mdev feature
docs/formatdomain.html.in | 48 ++- docs/news.xml | 9 + docs/schemas/domaincommon.rng | 26 ++ po/POTFILES.in | 1 + src/Makefile.am | 1 + src/conf/device_conf.h | 1 + src/conf/domain_conf.c | 203 ++++++++++-- src/conf/domain_conf.h | 9 + src/libvirt_private.syms | 20 ++
I don't understand how these get generated, so I won't suggest where they should be added, but a usb test fails for me without adding these to this syms file:
Hmm, weird, nothing fails for me, even rebased onto the current master. Anyhow, I checked that we indeed put these generated methods into the .syms file which I didn't know (never had a problem with that), but I'll fix it.
The error occurs for me on the make rpm target with the following: FAIL: virusbtest ================ /home/alwillia/rpmbuild/BUILD/libvirt-3.1.0/tests/.libs/lt-virusbtest: symbol lookup error: /home/alwillia/rpmbuild/BUILD/libvirt-3.1.0/tests/.libs/virusbmock.so: undefined symbol: virMediatedDeviceModelTypeFromString FAIL virusbtest (exit status: 127) I'm based on the v3.1.0-rc1 tag plus this series. Thanks, Alex

On behalf of Yongli, He who tested these series with KVMGT vGPU mdev. Hi, Erik Here is the libvirt testing result. in general speaking, it works well. while start libvirt-d and starting the vm, there are some call traces, i attached them in the very end of this mail. this mail will be the test-by content later. Test env summary ============== 0. Test Bed hardware summary cat /proc/cpuinfo vendor_id : GenuineIntel cpu family : 6 model : 78 model name : Intel(R) Core(TM) i5-6260U CPU @ 1.80GHz stepping : 3 microcode : 0x84 physical id : 0 cpu cores : 2 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp bugs : 1. test steps and software version following this guide, https://github.com/01org/gvt-linux/wiki/GVTg_Setup_Guide#322-build-kernel-so... 2. Host env uname -a Linux z-nuc-11 4.10.0-vgt #1 SMP Thu Mar 9 15:55:49 CST 2017 x86_64 x86_64 x86_64 GNU/Linux cat /etc/lsb-release DISTRIB_ID=Ubuntu DISTRIB_RELEASE=14.04 DISTRIB_CODENAME=trusty DISTRIB_DESCRIPTION="Ubuntu 14.04.4 LTS" kernel build from source: git describe gvt-fixes-2017-01-25-1560-gcedbc34 qemu build from source: qemu-system-x86_64 --version QEMU emulator version 2.8.50 (v2.8.0-2059-geba44e9-dirty) Copyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers libvirt, use your branch https://github.com/eskultety/libvirt.git git describe v1.2.17-rc2-5530-g202402e git branch master * mdev-v3 3. guest xml template, refer to attachment 4. related scripts, refer to attachment myvirsh: source build libvirt wrapper gdb-libvirtd: libvirtd start up wrapper the screen call trace while start the virtlogd ================================== ubuntu@z-nuc-11:~/vgpu-meta/libvirt-stage$ ./libvirt/gdb-libvirt & *************************************************** start virtlogd [2] 2005 ubuntu@z-nuc-11:~/vgpu-meta/libvirt-stage$ *************************************************** start libvirt-d 2017-03-09 19:04:57.211+0000: 2059: info : libvirt version: 3.1.0 2017-03-09 19:04:57.211+0000: 2059: info : hostname: z-nuc-11.maas 2017-03-09 19:04:57.211+0000: 2059: error : qemuMonitorOpenUnix:367 : failed to connect to monitor socket: No such process 2017-03-09 19:04:57.213+0000: 2059: error : virMediatedDeviceGetIOMMUGroupDev:153 : internal error: IOMMU group file /sys/bus/mdev/devices/894f3983-1a36-42b3-b52c-1024aca216be/iommu_group is not a symlink 2017-03-09 19:04:57.213+0000: 2003: info : libvirt version: 3.1.0 2017-03-09 19:04:57.213+0000: 2003: info : hostname: z-nuc-11.maas 2017-03-09 19:04:57.213+0000: 2003: error : virNetSocketReadWire:1800 : End of file while reading data: Input/output error the screen call trace while start the VM (same for Ubuntu, Win10 etc) ====================================================== ubuntu@z-nuc-11:~/vgpu-meta/libvirt-stage$ myvirsh start vgpu-ubuntu 2017-03-09 19:06:50.483+0000: 2232: info : libvirt version: 3.1.0 2017-03-09 19:06:50.483+0000: 2232: info : hostname: z-nuc-11.maas 2017-03-09 19:06:50.483+0000: 2232: warning : qemuDomainObjTaint:4056 : Domain id=1 name='vgpu-ubuntu' uuid=972b5e38-0437-11e7-8f97-d36dba74552d is tainted: high-privileges 2017-03-09 19:06:50.819+0000: 2204: info : libvirt version: 3.1.0 2017-03-09 19:06:50.819+0000: 2232: warning : virDomainAuditHostdev:456 : Unexpected hostdev type while encoding audit message: 4 2017-03-09 19:06:50.819+0000: 2204: info : hostname: z-nuc-11.maas 2017-03-09 19:06:50.819+0000: 2204: error : virNetSocketReadWire:1800 : End of file while reading data: Input/output error Domain vgpu-ubuntu started Tested-by: Yongli, He <yongli.he@intel.com>
-----Original Message----- From: libvir-list-bounces@redhat.com [mailto:libvir-list-bounces@redhat.com] On Behalf Of Erik Skultety Sent: Monday, February 20, 2017 10:28 PM To: libvir-list@redhat.com Cc: Erik Skultety <eskultet@redhat.com> Subject: [libvirt] [RFC PATCH v2 REBASE 00/18] Introduce vGPU mdev framework to libvirt
since the original v2 [1]: - resolved a few merge conflicts caused by @9d92f533 which refactored out some duplicate code which eventually lead to dropping patch 14/18 from the original series due to being unnecessary - rebased onto fresh HEAD
[1] https://www.redhat.com/archives/libvir-list/2017-February/msg00739.html
Erik Skultety (18): util: Introduce new module virmdev conf: Introduce new hostdev device type mdev conf: Introduce new address type mdev conf: Update XML parser, formatter, and RNG schema to support mdev conf: Introduce virDomainHostdevDefPostParse conf: Add post parse code for mdevs to virDomainHostdevDefPostParse security: dac: Enable labeling of vfio mediated devices security: selinux: Enable labeling of vfio mediated devices conf: Enable cold-plug of a mediated device qemu: Assign PCI addresses for mediated devices as well hostdev: Maintain a driver list of active mediated devices hostdev: Introduce a reattach method for mediated devices qemu: cgroup: Adjust cgroups' logic to allow mediated devices qemu: Bump the memory locking limit for mdevs as well qemu: Format mdevs on qemu command line test: Add some test cases for our test suite regarding the mdevs docs: Document the new hostdev and address type 'mdev' news: Update the NEWS.xml about the new mdev feature
docs/formatdomain.html.in | 48 ++- docs/news.xml | 9 + docs/schemas/domaincommon.rng | 26 ++ po/POTFILES.in | 1 + src/Makefile.am | 1 + src/conf/device_conf.h | 1 + src/conf/domain_conf.c | 203 ++++++++++-- src/conf/domain_conf.h | 9 + src/libvirt_private.syms | 20 ++ src/qemu/qemu_command.c | 49 +++ src/qemu/qemu_command.h | 5 + src/qemu/qemu_domain.c | 23 +- src/qemu/qemu_domain.h | 1 + src/qemu/qemu_domain_address.c | 16 +- src/qemu/qemu_hostdev.c | 37 +++ src/qemu/qemu_hostdev.h | 8 + src/qemu/qemu_hotplug.c | 2 + src/security/security_apparmor.c | 3 + src/security/security_dac.c | 55 ++++ src/security/security_selinux.c | 54 ++++ src/util/virhostdev.c | 229 ++++++++++++- src/util/virhostdev.h | 16 + src/util/virmdev.c | 358 +++++++++++++++++++++ src/util/virmdev.h | 93 ++++++ tests/domaincapsschemadata/full.xml | 1 + .../qemuxml2argv-hostdev-mdev-unmanaged.args | 25 ++ .../qemuxml2argv-hostdev-mdev-unmanaged.xml | 37 +++ tests/qemuxml2argvtest.c | 6 + .../qemuxml2xmlout-hostdev-mdev-unmanaged.xml | 40 +++ tests/qemuxml2xmltest.c | 1 + 30 files changed, 1333 insertions(+), 44 deletions(-) create mode 100644 src/util/virmdev.c create mode 100644 src/util/virmdev.h create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev- unmanaged.xml create mode 100644 tests/qemuxml2xmloutdata/qemuxml2xmlout-hostdev- mdev-unmanaged.xml
-- 2.10.2
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

On 03/16/2017 03:17 AM, Chen, Xiaoguang wrote:
the screen call trace while start the VM (same for Ubuntu, Win10 etc) ======================================================
ubuntu@z-nuc-11:~/vgpu-meta/libvirt-stage$ myvirsh start vgpu-ubuntu 2017-03-09 19:06:50.483+0000: 2232: info : libvirt version: 3.1.0 2017-03-09 19:06:50.483+0000: 2232: info : hostname: z-nuc-11.maas 2017-03-09 19:06:50.483+0000: 2232: warning : qemuDomainObjTaint:4056 : Domain id=1 name='vgpu-ubuntu' uuid=972b5e38-0437-11e7-8f97-d36dba74552d is tainted: high-privileges
I haven't considered any of the rest of the log yet, but this caught my eye on a first pass - "high-privileges" means that you're running qemu as root, so your test is bypassing several issues that could cause vfio device assignment to fail on a "standard" system. It shouldn't be necessary to run qemu as root in order for device assignment to work. Is there some specific reason that you're doing it this way? (I'm guessing that you've set "user = root" in /etc/libvirt/qemu.conf)

-----Original Message----- From: sendmail [mailto:justsendmailnothingelse@gmail.com] On Behalf Of Laine Stump Sent: Thursday, March 16, 2017 10:01 PM To: libvir-list@redhat.com Cc: Chen, Xiaoguang <xiaoguang.chen@intel.com>; Erik Skultety <eskultet@redhat.com>; He, Yongli <yongli.he@intel.com> Subject: Re: [libvirt] [RFC PATCH v2 REBASE 00/18] Introduce vGPU mdev framework to libvirt
On 03/16/2017 03:17 AM, Chen, Xiaoguang wrote:
the screen call trace while start the VM (same for Ubuntu, Win10 etc) ======================================================
ubuntu@z-nuc-11:~/vgpu-meta/libvirt-stage$ myvirsh start vgpu-ubuntu 2017-03-09 19:06:50.483+0000: 2232: info : libvirt version: 3.1.0 2017-03-09 19:06:50.483+0000: 2232: info : hostname: z-nuc-11.maas 2017-03-09 19:06:50.483+0000: 2232: warning : qemuDomainObjTaint:4056 : Domain id=1 name='vgpu-ubuntu' uuid=972b5e38-0437-11e7-8f97-d36dba74552d is tainted: high-privileges
I haven't considered any of the rest of the log yet, but this caught my eye on a first pass - "high-privileges" means that you're running qemu as root, so your test is bypassing several issues that could cause vfio device assignment to fail on a "standard" system. What do you mean for 'cause vfio device assignment to fail on a standard system'?
It shouldn't be necessary to run qemu as root in order for device assignment to work. Is there some specific reason that you're doing it this way? (I'm guessing that you've set "user = root" in /etc/libvirt/qemu.conf) No. we will test the v3 using a non-root user.

On 03/17/2017 01:57 AM, Chen, Xiaoguang wrote:
-----Original Message----- From: sendmail [mailto:justsendmailnothingelse@gmail.com] On Behalf Of Laine Stump Sent: Thursday, March 16, 2017 10:01 PM To: libvir-list@redhat.com Cc: Chen, Xiaoguang <xiaoguang.chen@intel.com>; Erik Skultety <eskultet@redhat.com>; He, Yongli <yongli.he@intel.com> Subject: Re: [libvirt] [RFC PATCH v2 REBASE 00/18] Introduce vGPU mdev framework to libvirt
On 03/16/2017 03:17 AM, Chen, Xiaoguang wrote:
the screen call trace while start the VM (same for Ubuntu, Win10 etc) ======================================================
ubuntu@z-nuc-11:~/vgpu-meta/libvirt-stage$ myvirsh start vgpu-ubuntu 2017-03-09 19:06:50.483+0000: 2232: info : libvirt version: 3.1.0 2017-03-09 19:06:50.483+0000: 2232: info : hostname: z-nuc-11.maas 2017-03-09 19:06:50.483+0000: 2232: warning : qemuDomainObjTaint:4056 : Domain id=1 name='vgpu-ubuntu' uuid=972b5e38-0437-11e7-8f97-d36dba74552d is tainted: high-privileges
I haven't considered any of the rest of the log yet, but this caught my eye on a first pass - "high-privileges" means that you're running qemu as root, so your test is bypassing several issues that could cause vfio device assignment to fail on a "standard" system. What do you mean for 'cause vfio device assignment to fail on a standard system'?
I mean that there are some device/file setup operations that libvirt should be performing in order to make vfio device assignment work properly in the case that the qemu process is unprivileged, and those are often the source of bugs; if you run your tests with a privileged qemu process, you're not testing any of the code that performs those operations.
It shouldn't be necessary to run qemu as root in order for device assignment to work. Is there some specific reason that you're doing it this way? (I'm guessing that you've set "user = root" in /etc/libvirt/qemu.conf) No. we will test the v3 using a non-root user.

[2] 2005 ubuntu@z-nuc-11:~/vgpu-meta/libvirt-stage$ *************************************************** start libvirt-d 2017-03-09 19:04:57.211+0000: 2059: info : libvirt version: 3.1.0 2017-03-09 19:04:57.211+0000: 2059: info : hostname: z-nuc-11.maas 2017-03-09 19:04:57.211+0000: 2059: error : qemuMonitorOpenUnix:367 : failed to connect to monitor socket: No such process 2017-03-09 19:04:57.213+0000: 2059: error : virMediatedDeviceGetIOMMUGroupDev:153 : internal error: IOMMU group file /sys/bus/mdev/devices/894f3983-1a36-42b3-b52c-1024aca216be/iommu_group is not a symlink
When I saw this error message for the first time in the original thread, I got confused, since this just checks whether the symlink exists, if it doesn't, the vfio device probably also doesn't exist (but take this with a grain of salt, I haven't investigated that deep) and libvirt needs it to pass it onto qemu command line. I hit this issue once by accident in the past and at that time I didn't understand what caused it, but after a reboot it was gone. So seeing it here it caught my eye and I investigated it last week. What I found out was that it's caused by the vfio-mdev module not being loaded automatically as a dependency. I solved it by autoloading the module on system boot. So this is not a libvirt issue, but just for a reference, there is a BZ on this [1].
2017-03-09 19:04:57.213+0000: 2003: info : libvirt version: 3.1.0 2017-03-09 19:04:57.213+0000: 2003: info : hostname: z-nuc-11.maas 2017-03-09 19:04:57.213+0000: 2003: error : virNetSocketReadWire:1800 : End of file while reading data: Input/output error
I suppose this corresponds to the problem above, do you hit this error if you work around the vfio-mdev module problem described above?
the screen call trace while start the VM (same for Ubuntu, Win10 etc) ======================================================
ubuntu@z-nuc-11:~/vgpu-meta/libvirt-stage$ myvirsh start vgpu-ubuntu 2017-03-09 19:06:50.483+0000: 2232: info : libvirt version: 3.1.0 2017-03-09 19:06:50.483+0000: 2232: info : hostname: z-nuc-11.maas 2017-03-09 19:06:50.483+0000: 2232: warning : qemuDomainObjTaint:4056 : Domain id=1 name='vgpu-ubuntu' uuid=972b5e38-0437-11e7-8f97-d36dba74552d is tainted: high-privileges 2017-03-09 19:06:50.819+0000: 2204: info : libvirt version: 3.1.0 2017-03-09 19:06:50.819+0000: 2232: warning : virDomainAuditHostdev:456 : Unexpected hostdev type while encoding audit message: 4
This one's interesting, again, are you able to hit the error when you work around the missing vfio-mdev module? I'll have a look at this if you actually can hit the error, even if the XML is correct. I posted v3 of the series and also created a new branch 'mdev-next' on my github [2]. I dropped the attribute 'type' from the source address element, so follow the example in the updated docs. Thanks for giving it a try. Erik [1] https://bugzilla.redhat.com/show_bug.cgi?id=1420572 [2] https://github.com/eskultety/libvirt/commits/mdev-next

-----Original Message----- From: Erik Skultety [mailto:eskultet@redhat.com] Sent: Thursday, March 16, 2017 10:41 PM To: Chen, Xiaoguang <xiaoguang.chen@intel.com> Cc: libvir-list@redhat.com; He, Yongli <yongli.he@intel.com> Subject: Re: [libvirt] [RFC PATCH v2 REBASE 00/18] Introduce vGPU mdev framework to libvirt
[2] 2005 ubuntu@z-nuc-11:~/vgpu-meta/libvirt-stage$ *************************************************** start libvirt-d 2017-03-09 19:04:57.211+0000: 2059: info : libvirt version: 3.1.0 2017-03-09 19:04:57.211+0000: 2059: info : hostname: z-nuc-11.maas 2017-03-09 19:04:57.211+0000: 2059: error : qemuMonitorOpenUnix:367 : failed to connect to monitor socket: No such process 2017-03-09 19:04:57.213+0000: 2059: error : virMediatedDeviceGetIOMMUGroupDev:153 : internal error: IOMMU group file /sys/bus/mdev/devices/894f3983-1a36-42b3-b52c- 1024aca216be/iommu_group is not a symlink
When I saw this error message for the first time in the original thread, I got confused, since this just checks whether the symlink exists, if it doesn't, the vfio device probably also doesn't exist (but take this with a grain of salt, I haven't investigated that deep) and libvirt needs it to pass it onto qemu command line. I hit this issue once by accident in the past and at that time I didn't understand what caused it, but after a reboot it was gone. So seeing it here it caught my eye and I investigated it last week. What I found out was that it's caused by the vfio-mdev module not being loaded automatically as a dependency. I solved it by autoloading the module on system boot. So this is not a libvirt issue, but just for a reference, there is a BZ on this [1].
2017-03-09 19:04:57.213+0000: 2003: info : libvirt version: 3.1.0 2017-03-09 19:04:57.213+0000: 2003: info : hostname: z-nuc-11.maas 2017-03-09 19:04:57.213+0000: 2003: error : virNetSocketReadWire:1800 : End of file while reading data: Input/output error
I suppose this corresponds to the problem above, do you hit this error if you work around the vfio-mdev module problem described above? Based on the KVMGT setup guide we should add kvmgt, vfio-iommu-type1 and vfio-mdev modules into initfamfs. So I don't think it is the same problem. But we will double confirm it later. Yong li is on vacation this week when he come back we will do the test again.
the screen call trace while start the VM (same for Ubuntu, Win10 etc) ======================================================
ubuntu@z-nuc-11:~/vgpu-meta/libvirt-stage$ myvirsh start vgpu-ubuntu 2017-03-09 19:06:50.483+0000: 2232: info : libvirt version: 3.1.0 2017-03-09 19:06:50.483+0000: 2232: info : hostname: z-nuc-11.maas 2017-03-09 19:06:50.483+0000: 2232: warning : qemuDomainObjTaint:4056 : Domain id=1 name='vgpu-ubuntu' uuid=972b5e38-0437-11e7-8f97-d36dba74552d is tainted: high-privileges 2017-03-09 19:06:50.819+0000: 2204: info : libvirt version: 3.1.0 2017-03-09 19:06:50.819+0000: 2232: warning : virDomainAuditHostdev:456 : Unexpected hostdev type while encoding audit message: 4
This one's interesting, again, are you able to hit the error when you work around the missing vfio-mdev module? I'll have a look at this if you actually can hit the error, even if the XML is correct.
I posted v3 of the series and also created a new branch 'mdev-next' on my github [2]. I dropped the attribute 'type' from the source address element, so follow the example in the updated docs.
Will test the v3 when yong li back from vacation.
Thanks for giving it a try. Erik
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1420572 [2] https://github.com/eskultety/libvirt/commits/mdev-next

tested v3, on the mdev-next branch: none-root: ====== 1. hacker the privilege operations sudo sh -c "ulimit -l 3074424832 && exec su $LOGNAME" sudo chown ubuntu:ubuntu /dev/vfio/0 RFC: i don't know the correct way to do such thing while build it from sources. updated me, thanks. 2. myvirsh -c qemu:///session #define ./libvirt/vgpu-win10.xml Domain vgpu-win10 defined from ./libvirt/vgpu-win10.xml #start vgpu-win10 Domain vgpu-win10 started 3.ps aux | grep qemu ubuntu 3262 141 12.3 2929432 2017172 ? SLl 23:58 0:51 /usr/bin/qemu-system-x86_64 -name guest=vgpu-win10,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/home/ubuntu/.config/libvirt/qemu/lib/domain-1-vgpu-win10/master-key.aes -machine pc-i440fx-2.3,accel=kvm,usb=off,dump-guest-core=off -m 1908 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid 916c5c36-0437-11e7-a23d-830ed1295d00 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/home/ubuntu/.config/libvirt/qemu/lib/domain-1-vgpu-win10/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/home/ubuntu/vgpu-meta/libvirt-stage/win10-64.qcow2,format=qcow2,if=none,id=drive-ide0-0-0,cache=none,aio=native -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -vnc 127.0.0.1:0 -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device vfio-pci,sysfsdev=/sys/bus/mdev/devices/894f3983-1a36-42b3-b52c-1024aca216be,bus=pci.0,addr=0x4 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3 -msg timestamp=on ubuntu 3276 0.0 0.0 10468 2216 pts/0 S+ 23:59 0:00 grep --color=auto qemu rooted mode ======== start libvirt-d trace: -------------------------- this trace shows occasionally while starting the libvirt-d, not every time. 2017-03-19 19:22:45.559+0000: 13104: info : libvirt version: 3.2.0 2017-03-19 19:22:45.559+0000: 13104: info : hostname: z-nuc-11.maas 2017-03-19 19:22:45.559+0000: 13104: error : qemuMonitorOpenUnix:367 : failed to connect to monitor socket: No such process 2017-03-19 19:22:45.562+0000: 13000: info : libvirt version: 3.2.0 2017-03-19 19:22:45.562+0000: 13000: info : hostname: z-nuc-11.maas 2017-03-19 19:22:45.562+0000: 13000: error : virNetSocketReadWire:1800 : End of file while reading data: Input/output error start domian trace: -------------------------- some time there is trace while starting the domain. 2017-03-19 19:22:50.912+0000: 13034: warning : qemuDomainObjTaint:4113 : Domain id=3 name='vgpu-win10' uuid=916c5c36-0437-11e7-a23d-830ed1295d00 is tainted: high-privileges 2017-03-19 19:22:51.859+0000: 13000: error : virNetSocketReadWire:1800 : End of file while reading data: Input/output error 2017-03-19 19:22:51.859+0000: 13034: warning : virDomainAuditHostdev:456 : Unexpected hostdev type while encoding audit message: 4 Domain vgpu-win10 started NOTES: there is no traces under none root mode, though i don't think the trace is related to user privilege. fix me. Regards Yongli He On 2017年03月16日 22:41, Erik Skultety wrote:
[2] 2005 ubuntu@z-nuc-11:~/vgpu-meta/libvirt-stage$ *************************************************** start libvirt-d 2017-03-09 19:04:57.211+0000: 2059: info : libvirt version: 3.1.0 2017-03-09 19:04:57.211+0000: 2059: info : hostname: z-nuc-11.maas 2017-03-09 19:04:57.211+0000: 2059: error : qemuMonitorOpenUnix:367 : failed to connect to monitor socket: No such process 2017-03-09 19:04:57.213+0000: 2059: error : virMediatedDeviceGetIOMMUGroupDev:153 : internal error: IOMMU group file /sys/bus/mdev/devices/894f3983-1a36-42b3-b52c-1024aca216be/iommu_group is not a symlink When I saw this error message for the first time in the original thread, I got confused, since this just checks whether the symlink exists, if it doesn't, the vfio device probably also doesn't exist (but take this with a grain of salt, I haven't investigated that deep) and libvirt needs it to pass it onto qemu command line. I hit this issue once by accident in the past and at that time I didn't understand what caused it, but after a reboot it was gone. So seeing it here it caught my eye and I investigated it last week. What I found out was that it's caused by the vfio-mdev module not being loaded automatically as a dependency. I solved it by autoloading the module on system boot. So this is not a libvirt issue, but just for a reference, there is a BZ on this [1].
2017-03-09 19:04:57.213+0000: 2003: info : libvirt version: 3.1.0 2017-03-09 19:04:57.213+0000: 2003: info : hostname: z-nuc-11.maas 2017-03-09 19:04:57.213+0000: 2003: error : virNetSocketReadWire:1800 : End of file while reading data: Input/output error I suppose this corresponds to the problem above, do you hit this error if you work around the vfio-mdev module problem described above?
the screen call trace while start the VM (same for Ubuntu, Win10 etc) ======================================================
ubuntu@z-nuc-11:~/vgpu-meta/libvirt-stage$ myvirsh start vgpu-ubuntu 2017-03-09 19:06:50.483+0000: 2232: info : libvirt version: 3.1.0 2017-03-09 19:06:50.483+0000: 2232: info : hostname: z-nuc-11.maas 2017-03-09 19:06:50.483+0000: 2232: warning : qemuDomainObjTaint:4056 : Domain id=1 name='vgpu-ubuntu' uuid=972b5e38-0437-11e7-8f97-d36dba74552d is tainted: high-privileges 2017-03-09 19:06:50.819+0000: 2204: info : libvirt version: 3.1.0 2017-03-09 19:06:50.819+0000: 2232: warning : virDomainAuditHostdev:456 : Unexpected hostdev type while encoding audit message: 4 This one's interesting, again, are you able to hit the error when you work around the missing vfio-mdev module? I'll have a look at this if you actually can hit the error, even if the XML is correct.
I posted v3 of the series and also created a new branch 'mdev-next' on my github [2]. I dropped the attribute 'type' from the source address element, so follow the example in the updated docs.
Thanks for giving it a try. Erik
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1420572 [2] https://github.com/eskultety/libvirt/commits/mdev-next

On Mon, Mar 20, 2017 at 03:27:19PM +0800, yonglihe wrote:
tested v3, on the mdev-next branch:
none-root: ====== 1. hacker the privilege operations sudo sh -c "ulimit -l 3074424832 && exec su $LOGNAME" sudo chown ubuntu:ubuntu /dev/vfio/0
RFC: i don't know the correct way to do such thing while build it from sources. updated me, thanks.
I'm not sure what you're referring to, so could you be more specific, so we can work from there and find the right solution? [...]
rooted mode ========
start libvirt-d trace: --------------------------
this trace shows occasionally while starting the libvirt-d, not every time.
2017-03-19 19:22:45.559+0000: 13104: info : libvirt version: 3.2.0 2017-03-19 19:22:45.559+0000: 13104: info : hostname: z-nuc-11.maas 2017-03-19 19:22:45.559+0000: 13104: error : qemuMonitorOpenUnix:367 : failed to connect to monitor socket: No such process 2017-03-19 19:22:45.562+0000: 13000: info : libvirt version: 3.2.0 2017-03-19 19:22:45.562+0000: 13000: info : hostname: z-nuc-11.maas 2017-03-19 19:22:45.562+0000: 13000: error : virNetSocketReadWire:1800 : End of file while reading data: Input/output error
Hmm, the virNetSocketReadWire might go away with the recent libvirt patches (I rebased my branch onto current master) if you're using either virtlogd (which is the default in qemu.conf on RHEL if I'm not mistaken) or virtlockd, so try with the rebased branch. But I doubt the qemuMonitorOpenUnix bit would disappear with the mentioned libvirt fix. But I will need a fresh and complete daemon log to investigate further, as well as libvirtd.conf and qemu.conf (just in case), so that I can compare to my environment, since I'm not able to reproduce this. [...]
NOTES: there is no traces under none root mode, though i don't think the trace is related to user privilege. fix me.
It's indeed odd that you don't see the same behaviour when running qemu as root. As I said, I will need the daemon log and preferably the configs as well to investigate further. Thanks, Erik

On Mon, Mar 20, 2017 at 11:03:40AM +0100, Erik Skultety wrote:
On Mon, Mar 20, 2017 at 03:27:19PM +0800, yonglihe wrote:
tested v3, on the mdev-next branch:
none-root: ====== 1. hacker the privilege operations sudo sh -c "ulimit -l 3074424832 && exec su $LOGNAME" sudo chown ubuntu:ubuntu /dev/vfio/0
RFC: i don't know the correct way to do such thing while build it from sources. updated me, thanks.
I'm not sure what you're referring to, so could you be more specific, so we can work from there and find the right solution?
[...]
rooted mode ========
start libvirt-d trace: --------------------------
this trace shows occasionally while starting the libvirt-d, not every time.
2017-03-19 19:22:45.559+0000: 13104: info : libvirt version: 3.2.0 2017-03-19 19:22:45.559+0000: 13104: info : hostname: z-nuc-11.maas 2017-03-19 19:22:45.559+0000: 13104: error : qemuMonitorOpenUnix:367 : failed to connect to monitor socket: No such process 2017-03-19 19:22:45.562+0000: 13000: info : libvirt version: 3.2.0 2017-03-19 19:22:45.562+0000: 13000: info : hostname: z-nuc-11.maas 2017-03-19 19:22:45.562+0000: 13000: error : virNetSocketReadWire:1800 : End of file while reading data: Input/output error
Hmm, the virNetSocketReadWire might go away with the recent libvirt patches (I rebased my branch onto current master) if you're using either virtlogd (which is the default in qemu.conf on RHEL if I'm not mistaken) or virtlockd, so try
^^The fix I mentioned would only make difference if all the daemons would log into the same file or to journal, but the log format above is different from what you'd see in the journal and I assume there is a dedicated log file for each daemon, so please ignore my post above. I'll still need complete logs and the configs though. Erik
with the rebased branch. But I doubt the qemuMonitorOpenUnix bit would disappear with the mentioned libvirt fix. But I will need a fresh and complete daemon log to investigate further, as well as libvirtd.conf and qemu.conf (just in case), so that I can compare to my environment, since I'm not able to reproduce this.
[...]
NOTES: there is no traces under none root mode, though i don't think the trace is related to user privilege. fix me.
It's indeed odd that you don't see the same behaviour when running qemu as root. As I said, I will need the daemon log and preferably the configs as well to investigate further.
Thanks, Erik
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

On 03/20/2017 07:37 PM, Erik Skultety wrote: > On Mon, Mar 20, 2017 at 11:03:40AM +0100, Erik Skultety wrote: >> On Mon, Mar 20, 2017 at 03:27:19PM +0800, yonglihe wrote: >>> tested v3, on the mdev-next branch: >>> >>> none-root: >>> ====== >>> 1. hacker the privilege operations >>> sudo sh -c "ulimit -l 3074424832 && exec su $LOGNAME" >>> sudo chown ubuntu:ubuntu /dev/vfio/0 >>> >>> RFC: i don't know the correct way to do such thing while build it from >>> sources. updated me, thanks. >> I'm not sure what you're referring to, so could you be more specific, so we can >> work from there and find the right solution? while i try starting a non-privileged vm, there is 2 problem i run into: 1. the ulimit can not be changed cause the virsh user don't have that right to do so. 2. the /dev/vfio/0 (i got only one mdev), is does not own by the user. that's all I’m talking about. >> >> [...] >> >>> rooted mode >>> ======== >>> >>> >>> start libvirt-d trace: >>> -------------------------- >>> >>> this trace shows occasionally while starting the libvirt-d, not every time. >>> >>> 2017-03-19 19:22:45.559+0000: 13104: info : libvirt version: 3.2.0 >>> 2017-03-19 19:22:45.559+0000: 13104: info : hostname: z-nuc-11.maas >>> 2017-03-19 19:22:45.559+0000: 13104: error : qemuMonitorOpenUnix:367 : >>> failed to connect to monitor socket: No such process >>> 2017-03-19 19:22:45.562+0000: 13000: info : libvirt version: 3.2.0 >>> 2017-03-19 19:22:45.562+0000: 13000: info : hostname: z-nuc-11.maas >>> 2017-03-19 19:22:45.562+0000: 13000: error : virNetSocketReadWire:1800 : End >>> of file while reading data: Input/output error >> Hmm, the virNetSocketReadWire might go away with the recent libvirt patches (I >> rebased my branch onto current master) if you're using either virtlogd (which >> is the default in qemu.conf on RHEL if I'm not mistaken) or virtlockd, so try > ^^The fix I mentioned would only make difference if all the daemons would log > into the same file or to journal, but the log format above is different from > what you'd see in the journal and I assume there is a dedicated log file for > each daemon, so please ignore my post above. I'll still need complete logs and refer to attachment. the logfile is messed up. i will update you later. i'm going to try pull new patches, and retest it. Yongli He > the configs though. > > Erik > >> with the rebased branch. But I doubt the qemuMonitorOpenUnix bit would >> disappear with the mentioned libvirt fix. But I will need a fresh and complete >> daemon log to investigate further, as well as libvirtd.conf and qemu.conf (just >> in case), so that I can compare to my environment, since I'm not able to >> reproduce this. >> >> [...] >> >>> NOTES: >>> there is no traces under none root mode, though i don't think the trace >>> is related to user privilege. fix me. >> It's indeed odd that you don't see the same behaviour when running qemu as >> root. As I said, I will need the daemon log and preferably the configs as well >> to investigate further. >> >> Thanks, >> Erik >> >> -- >> libvir-list mailing list >> libvir-list@redhat.com >> https://www.redhat.com/mailman/listinfo/libvir-list

[2] 2005 ubuntu@z-nuc-11:~/vgpu-meta/libvirt-stage$ *************************************************** start libvirt-d 2017-03-09 19:04:57.211+0000: 2059: info : libvirt version: 3.1.0 2017-03-09 19:04:57.211+0000: 2059: info : hostname: z-nuc-11.maas 2017-03-09 19:04:57.211+0000: 2059: error : qemuMonitorOpenUnix:367 : failed to connect to monitor socket: No such process 2017-03-09 19:04:57.213+0000: 2059: error : virMediatedDeviceGetIOMMUGroupDev:153 : internal error: IOMMU group file /sys/bus/mdev/devices/894f3983-1a36-42b3-b52c-1024aca216be/iommu_group is not a symlink When I saw this error message for the first time in the original thread, I got confused, since this just checks whether the symlink exists, if it doesn't, the vfio device probably also doesn't exist (but take this with a grain of salt, I haven't investigated that deep) and libvirt needs it to pass it onto qemu command line. I hit this issue once by accident in the past and at that time I didn't understand what caused it, but after a reboot it was gone. So seeing it here it caught my eye and I investigated it last week. What I found out was that it's caused by the vfio-mdev module not being loaded automatically as a dependency. I solved it by autoloading the module on system boot. So this is not a libvirt issue, but just for a reference, there is a BZ on this [1]. it's does not show up every time. might be my bad.
2017-03-09 19:04:57.213+0000: 2003: info : libvirt version: 3.1.0 2017-03-09 19:04:57.213+0000: 2003: info : hostname: z-nuc-11.maas 2017-03-09 19:04:57.213+0000: 2003: error : virNetSocketReadWire:1800 : End of file while reading data: Input/output error I suppose this corresponds to the problem above, do you hit this error if you work around the vfio-mdev module problem described above?
the screen call trace while start the VM (same for Ubuntu, Win10 etc) ======================================================
ubuntu@z-nuc-11:~/vgpu-meta/libvirt-stage$ myvirsh start vgpu-ubuntu 2017-03-09 19:06:50.483+0000: 2232: info : libvirt version: 3.1.0 2017-03-09 19:06:50.483+0000: 2232: info : hostname: z-nuc-11.maas 2017-03-09 19:06:50.483+0000: 2232: warning : qemuDomainObjTaint:4056 : Domain id=1 name='vgpu-ubuntu' uuid=972b5e38-0437-11e7-8f97-d36dba74552d is tainted: high-privileges 2017-03-09 19:06:50.819+0000: 2204: info : libvirt version: 3.1.0 2017-03-09 19:06:50.819+0000: 2232: warning : virDomainAuditHostdev:456 : Unexpected hostdev type while encoding audit message: 4 This one's interesting, again, are you able to hit the error when you work around the missing vfio-mdev module? I'll have a look at this if you actually can hit the error, even if the XML is correct.
On 2017年03月16日 22:41, Erik Skultety wrote: the module should be there, the domain is running and virtual gpu is working. ubuntu@z-nuc-11:~/vgpu-meta/libvirt-stage$ sudo lsmod| grep mdev vfio_mdev 16384 1 mdev 20480 2 kvmgt,vfio_mdev vfio 28672 6 vfio_iommu_type1,kvmgt,vfio_mdev on the v3, the trace still there. refer to my "test by" email on mdev-next. Regards Yongli He
I posted v3 of the series and also created a new branch 'mdev-next' on my github [2]. I dropped the attribute 'type' from the source address element, so follow the example in the updated docs.
Thanks for giving it a try. Erik
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1420572 [2] https://github.com/eskultety/libvirt/commits/mdev-next
participants (8)
-
Alex Williamson
-
Chen, Xiaoguang
-
Erik Skultety
-
Kirti Wankhede
-
Laine Stump
-
Martin Polednik
-
Pavel Hrdina
-
yonglihe