[libvirt PATCH 0/6] Add ability to create mediated devices in libvirt

This is the first portion of an effort to support persistent mediated devices with libvirt. This first series simply enables creating and destroying non-persistent mediated devices via the virNodeDeviceCreateXML() and virNodeDeviceDestroy() functions. The 'mdevctl' utility[1] provides the backend implementation. [1] https://github.com/mdevctl/mdevctl Jonathon Jongsma (6): nodedev: factor out nodeDeviceHasCapability() nodedev: add support for mdev attributes nodedev: refactor nodeDeviceFindNewDevice() nodedev: store mdev UUID in mdev caps nodedev: add mdev support to virNodeDeviceCreateXML() nodedev: add mdev support to virNodeDeviceDestroy() docs/schemas/nodedev.rng | 6 + libvirt.spec.in | 3 + m4/virt-external-programs.m4 | 3 + src/conf/node_device_conf.c | 59 ++++- src/conf/node_device_conf.h | 3 + src/conf/virnodedeviceobj.c | 34 +++ src/conf/virnodedeviceobj.h | 3 + src/libvirt_private.syms | 3 + src/node_device/node_device_driver.c | 326 ++++++++++++++++++++++++--- src/node_device/node_device_udev.c | 5 +- src/util/virmdev.c | 12 + src/util/virmdev.h | 11 + 12 files changed, 425 insertions(+), 43 deletions(-) -- 2.21.1

Currently nodeDeviceCreateXML() and nodeDeviceDestroy() only support NPIV HBAs, but we want to be able to create mdev devices as well. This is a first step to enabling that support. Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/node_device/node_device_driver.c | 90 ++++++++++++++++++---------- 1 file changed, 57 insertions(+), 33 deletions(-) diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index ee175e1095..6a96a77d92 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -487,6 +487,20 @@ nodeDeviceFindNewDevice(virConnectPtr conn, return device; } +static bool +nodeDeviceHasCapability(virNodeDeviceDefPtr def, virNodeDevCapType type) +{ + virNodeDevCapsDefPtr cap = NULL; + + cap = def->caps; + while (cap != NULL) { + if (cap->data.type == type) + return true; + cap = cap->next; + } + + return false; +} virNodeDevicePtr nodeDeviceCreateXML(virConnectPtr conn, @@ -513,24 +527,29 @@ nodeDeviceCreateXML(virConnectPtr conn, if (virNodeDeviceCreateXMLEnsureACL(conn, def) < 0) return NULL; - if (virNodeDeviceGetWWNs(def, &wwnn, &wwpn) == -1) - return NULL; + if (nodeDeviceHasCapability(def, VIR_NODE_DEV_CAP_SCSI_HOST)) { + if (virNodeDeviceGetWWNs(def, &wwnn, &wwpn) == -1) + return NULL; - if ((parent_host = virNodeDeviceObjListGetParentHost(driver->devs, def)) < 0) - return NULL; + if ((parent_host = virNodeDeviceObjListGetParentHost(driver->devs, def)) < 0) + return NULL; - if (virVHBAManageVport(parent_host, wwpn, wwnn, VPORT_CREATE) < 0) - return NULL; + if (virVHBAManageVport(parent_host, wwpn, wwnn, VPORT_CREATE) < 0) + return NULL; - device = nodeDeviceFindNewDevice(conn, wwnn, wwpn); - /* We don't check the return value, because one way or another, - * we're returning what we get... */ + device = nodeDeviceFindNewDevice(conn, wwnn, wwpn); + /* We don't check the return value, because one way or another, + * we're returning what we get... */ - if (device == NULL) - virReportError(VIR_ERR_NO_NODE_DEVICE, - _("no node device for '%s' with matching " - "wwnn '%s' and wwpn '%s'"), - def->name, wwnn, wwpn); + if (device == NULL) + virReportError(VIR_ERR_NO_NODE_DEVICE, + _("no node device for '%s' with matching " + "wwnn '%s' and wwpn '%s'"), + def->name, wwnn, wwpn); + } else { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("Unsupported device type")); + } return device; } @@ -557,31 +576,36 @@ nodeDeviceDestroy(virNodeDevicePtr device) if (virNodeDeviceDestroyEnsureACL(device->conn, def) < 0) goto cleanup; - if (virNodeDeviceGetWWNs(def, &wwnn, &wwpn) < 0) - goto cleanup; + if (nodeDeviceHasCapability(def, VIR_NODE_DEV_CAP_SCSI_HOST)) { + if (virNodeDeviceGetWWNs(def, &wwnn, &wwpn) < 0) + goto cleanup; - /* Because we're about to release the lock and thus run into a race - * possibility (however improbable) with a udevAddOneDevice change - * event which would essentially free the existing @def (obj->def) and - * replace it with something new, we need to grab the parent field - * and then find the parent obj in order to manage the vport */ - parent = g_strdup(def->parent); + /* Because we're about to release the lock and thus run into a race + * possibility (however improbable) with a udevAddOneDevice change + * event which would essentially free the existing @def (obj->def) and + * replace it with something new, we need to grab the parent field + * and then find the parent obj in order to manage the vport */ + parent = g_strdup(def->parent); - virNodeDeviceObjEndAPI(&obj); + virNodeDeviceObjEndAPI(&obj); - if (!(obj = virNodeDeviceObjListFindByName(driver->devs, parent))) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find parent '%s' definition"), parent); - goto cleanup; - } + if (!(obj = virNodeDeviceObjListFindByName(driver->devs, parent))) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("cannot find parent '%s' definition"), parent); + goto cleanup; + } - if (virSCSIHostGetNumber(parent, &parent_host) < 0) - goto cleanup; + if (virSCSIHostGetNumber(parent, &parent_host) < 0) + goto cleanup; - if (virVHBAManageVport(parent_host, wwpn, wwnn, VPORT_DELETE) < 0) - goto cleanup; + if (virVHBAManageVport(parent_host, wwpn, wwnn, VPORT_DELETE) < 0) + goto cleanup; - ret = 0; + ret = 0; + } else { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("Unsupported device type")); + } cleanup: virNodeDeviceObjEndAPI(&obj); -- 2.21.1

On 4/30/20 11:42 PM, Jonathon Jongsma wrote:
Currently nodeDeviceCreateXML() and nodeDeviceDestroy() only support NPIV HBAs, but we want to be able to create mdev devices as well. This is a first step to enabling that support.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/node_device/node_device_driver.c | 90 ++++++++++++++++++---------- 1 file changed, 57 insertions(+), 33 deletions(-)
diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index ee175e1095..6a96a77d92 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -487,6 +487,20 @@ nodeDeviceFindNewDevice(virConnectPtr conn, return device; }
+static bool +nodeDeviceHasCapability(virNodeDeviceDefPtr def, virNodeDevCapType type) +{ + virNodeDevCapsDefPtr cap = NULL; + + cap = def->caps;
This variable can be initialized to def->caps directly. No need to go through NULL, IMO.
+ while (cap != NULL) { + if (cap->data.type == type) + return true; + cap = cap->next; + } + + return false; +}
Also, here and for the rest of the patchset - the file uses two spaces between functions. Please keep that. Michal

On Mon, May 11, 2020 at 03:28:03PM +0200, Michal Privoznik wrote:
On 4/30/20 11:42 PM, Jonathon Jongsma wrote:
Currently nodeDeviceCreateXML() and nodeDeviceDestroy() only support NPIV HBAs, but we want to be able to create mdev devices as well. This is a first step to enabling that support.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/node_device/node_device_driver.c | 90 ++++++++++++++++++---------- 1 file changed, 57 insertions(+), 33 deletions(-)
diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index ee175e1095..6a96a77d92 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -487,6 +487,20 @@ nodeDeviceFindNewDevice(virConnectPtr conn, return device; } +static bool +nodeDeviceHasCapability(virNodeDeviceDefPtr def, virNodeDevCapType type) +{ + virNodeDevCapsDefPtr cap = NULL; + + cap = def->caps;
This variable can be initialized to def->caps directly. No need to go through NULL, IMO.
+ while (cap != NULL) { + if (cap->data.type == type) + return true; + cap = cap->next; + } + + return false; +}
Also, here and for the rest of the patchset - the file uses two spaces between functions. Please keep that.
Michal
With the suggested changes: Reviewed-by: Erik Skultety <eskultet@redhat.com>

Mediated devices support arbitrary vendor-specific attributes that can be attached to a mediated device. These attributes are ordered, and are written to sysfs in order after a device is created. This patch adds support for these attributes to the mdev data types and XML schema. Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- docs/schemas/nodedev.rng | 6 ++++ src/conf/node_device_conf.c | 58 +++++++++++++++++++++++++++++++++++-- src/conf/node_device_conf.h | 2 ++ src/libvirt_private.syms | 2 ++ src/util/virmdev.c | 12 ++++++++ src/util/virmdev.h | 11 +++++++ 6 files changed, 88 insertions(+), 3 deletions(-) diff --git a/docs/schemas/nodedev.rng b/docs/schemas/nodedev.rng index fe6ffa0b53..a1ce09af54 100644 --- a/docs/schemas/nodedev.rng +++ b/docs/schemas/nodedev.rng @@ -634,6 +634,12 @@ <ref name='unsignedInt'/> </attribute> </element> + <zeroOrMore> + <element name="attr"> + <attribute name="name"/> + <attribute name="value"/> + </element> + </zeroOrMore> </define> <define name='capccwdev'> diff --git a/src/conf/node_device_conf.c b/src/conf/node_device_conf.c index 4cf5b6e3d7..8cffe48d23 100644 --- a/src/conf/node_device_conf.c +++ b/src/conf/node_device_conf.c @@ -500,6 +500,20 @@ virNodeDeviceCapStorageDefFormat(virBufferPtr buf, virBufferAddLit(buf, "<capability type='hotpluggable'/>\n"); } +static void +virNodeDeviceCapMdevDefFormat(virBufferPtr buf, + const virNodeDevCapData *data) +{ + size_t i; + virBufferEscapeString(buf, "<type id='%s'/>\n", data->mdev.type); + virBufferAsprintf(buf, "<iommuGroup number='%u'/>\n", + data->mdev.iommuGroupNumber); + for (i = 0; i < data->mdev.nattributes; i++) { + virMediatedDeviceAttrPtr attr = data->mdev.attributes[i]; + virBufferAsprintf(buf, "<attribute name='%s' value='%s'/>\n", + attr->name, attr->value); + } +} char * virNodeDeviceDefFormat(const virNodeDeviceDef *def) @@ -583,9 +597,7 @@ virNodeDeviceDefFormat(const virNodeDeviceDef *def) virBufferEscapeString(&buf, "<type>%s</type>\n", virNodeDevDRMTypeToString(data->drm.type)); break; case VIR_NODE_DEV_CAP_MDEV: - virBufferEscapeString(&buf, "<type id='%s'/>\n", data->mdev.type); - virBufferAsprintf(&buf, "<iommuGroup number='%u'/>\n", - data->mdev.iommuGroupNumber); + virNodeDeviceCapMdevDefFormat(&buf, data); break; case VIR_NODE_DEV_CAP_CCW_DEV: virBufferAsprintf(&buf, "<cssid>0x%x</cssid>\n", @@ -1788,6 +1800,35 @@ virNodeDevCapSystemParseXML(xmlXPathContextPtr ctxt, return ret; } +static int +virNodeDevCapMdevAttributeParseXML(xmlXPathContextPtr ctxt, + xmlNodePtr node, + virNodeDevCapMdevPtr mdev) +{ + xmlNodePtr orig_node = node; + ctxt->node = node; + int ret = -1; + + g_autoptr(virMediatedDeviceAttr) attr = virMediatedDeviceAttrNew(); + attr->name = virXPathString("string(./@name)", ctxt); + attr->value = virXPathString("string(./@value)", ctxt); + if (!attr->name || !attr->value) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("mdev attribute missing name or value")); + goto cleanup; + } + + if (VIR_EXPAND_N(mdev->attributes, mdev->nattributes, 1) < 0) + goto cleanup; + + mdev->attributes[mdev->nattributes - 1] = g_steal_pointer(&attr); + + ret = 0; + + cleanup: + ctxt->node = orig_node; + return ret; +} static int virNodeDevCapMdevParseXML(xmlXPathContextPtr ctxt, @@ -1797,6 +1838,9 @@ virNodeDevCapMdevParseXML(xmlXPathContextPtr ctxt, { xmlNodePtr orignode; int ret = -1; + int nattrs = 0; + xmlNodePtr *attrs; + size_t i; orignode = ctxt->node; ctxt->node = node; @@ -1815,6 +1859,11 @@ virNodeDevCapMdevParseXML(xmlXPathContextPtr ctxt, "'%s'")) < 0) goto out; + if ((nattrs = virXPathNodeSet("./attribute", ctxt, &attrs)) < 0) + goto out; + for (i = 0; i < nattrs; i++) + virNodeDevCapMdevAttributeParseXML(ctxt, attrs[i], mdev); + ret = 0; out: ctxt->node = orignode; @@ -2205,6 +2254,9 @@ virNodeDevCapsDefFree(virNodeDevCapsDefPtr caps) break; case VIR_NODE_DEV_CAP_MDEV: VIR_FREE(data->mdev.type); + for (i = 0; i < data->mdev.nattributes; i++) + virMediatedDeviceAttrFree(data->mdev.attributes[i]); + VIR_FREE(data->mdev.attributes); break; case VIR_NODE_DEV_CAP_MDEV_TYPES: case VIR_NODE_DEV_CAP_DRM: diff --git a/src/conf/node_device_conf.h b/src/conf/node_device_conf.h index 9e4b0847fb..e3e1e788d4 100644 --- a/src/conf/node_device_conf.h +++ b/src/conf/node_device_conf.h @@ -141,6 +141,8 @@ typedef virNodeDevCapMdev *virNodeDevCapMdevPtr; struct _virNodeDevCapMdev { char *type; unsigned int iommuGroupNumber; + virMediatedDeviceAttrPtr *attributes; + size_t nattributes; }; typedef struct _virNodeDevCapPCIDev virNodeDevCapPCIDev; diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 935ef7303b..94b206aa21 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2482,6 +2482,8 @@ virMacMapRemove; virMacMapWriteFile; # util/virmdev.h +virMediatedDeviceAttrFree; +virMediatedDeviceAttrNew; virMediatedDeviceFree; virMediatedDeviceGetIOMMUGroupDev; virMediatedDeviceGetIOMMUGroupNum; diff --git a/src/util/virmdev.c b/src/util/virmdev.c index c2499c0a20..6c37eae00b 100644 --- a/src/util/virmdev.c +++ b/src/util/virmdev.c @@ -518,3 +518,15 @@ virMediatedDeviceTypeReadAttrs(const char *sysfspath, return 0; } + +virMediatedDeviceAttrPtr virMediatedDeviceAttrNew(void) +{ + return g_new0(virMediatedDeviceAttr, 1); +} + +void virMediatedDeviceAttrFree(virMediatedDeviceAttrPtr attr) +{ + g_free(attr->name); + g_free(attr->value); + g_free(attr); +} diff --git a/src/util/virmdev.h b/src/util/virmdev.h index 51f7f608a2..eb167ccb48 100644 --- a/src/util/virmdev.h +++ b/src/util/virmdev.h @@ -37,6 +37,17 @@ typedef struct _virMediatedDevice virMediatedDevice; typedef virMediatedDevice *virMediatedDevicePtr; typedef struct _virMediatedDeviceList virMediatedDeviceList; typedef virMediatedDeviceList *virMediatedDeviceListPtr; +typedef struct _virMediatedDeviceAttr virMediatedDeviceAttr; +typedef virMediatedDeviceAttr *virMediatedDeviceAttrPtr; + +struct _virMediatedDeviceAttr { + char *name; + char *value; +}; + +virMediatedDeviceAttrPtr virMediatedDeviceAttrNew(void); +void virMediatedDeviceAttrFree(virMediatedDeviceAttrPtr attr); +G_DEFINE_AUTOPTR_CLEANUP_FUNC(virMediatedDeviceAttr, virMediatedDeviceAttrFree); G_DEFINE_AUTOPTR_CLEANUP_FUNC(virMediatedDeviceList, virObjectUnref); -- 2.21.1

On 4/30/20 11:42 PM, Jonathon Jongsma wrote:
Mediated devices support arbitrary vendor-specific attributes that can be attached to a mediated device. These attributes are ordered, and are written to sysfs in order after a device is created. This patch adds support for these attributes to the mdev data types and XML schema.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- docs/schemas/nodedev.rng | 6 ++++ src/conf/node_device_conf.c | 58 +++++++++++++++++++++++++++++++++++-- src/conf/node_device_conf.h | 2 ++ src/libvirt_private.syms | 2 ++ src/util/virmdev.c | 12 ++++++++ src/util/virmdev.h | 11 +++++++ 6 files changed, 88 insertions(+), 3 deletions(-)
diff --git a/docs/schemas/nodedev.rng b/docs/schemas/nodedev.rng index fe6ffa0b53..a1ce09af54 100644 --- a/docs/schemas/nodedev.rng +++ b/docs/schemas/nodedev.rng @@ -634,6 +634,12 @@ <ref name='unsignedInt'/> </attribute> </element> + <zeroOrMore> + <element name="attr"> + <attribute name="name"/> + <attribute name="value"/> + </element> + </zeroOrMore> </define>
<define name='capccwdev'> diff --git a/src/conf/node_device_conf.c b/src/conf/node_device_conf.c index 4cf5b6e3d7..8cffe48d23 100644 --- a/src/conf/node_device_conf.c +++ b/src/conf/node_device_conf.c @@ -500,6 +500,20 @@ virNodeDeviceCapStorageDefFormat(virBufferPtr buf, virBufferAddLit(buf, "<capability type='hotpluggable'/>\n"); }
+static void +virNodeDeviceCapMdevDefFormat(virBufferPtr buf, + const virNodeDevCapData *data) +{ + size_t i;
We like declarations separated by an empty line from the rest of the code.
+ virBufferEscapeString(buf, "<type id='%s'/>\n", data->mdev.type); + virBufferAsprintf(buf, "<iommuGroup number='%u'/>\n", + data->mdev.iommuGroupNumber); + for (i = 0; i < data->mdev.nattributes; i++) { + virMediatedDeviceAttrPtr attr = data->mdev.attributes[i]; + virBufferAsprintf(buf, "<attribute name='%s' value='%s'/>\n", + attr->name, attr->value); + } +}
char * virNodeDeviceDefFormat(const virNodeDeviceDef *def) @@ -583,9 +597,7 @@ virNodeDeviceDefFormat(const virNodeDeviceDef *def) virBufferEscapeString(&buf, "<type>%s</type>\n", virNodeDevDRMTypeToString(data->drm.type)); break; case VIR_NODE_DEV_CAP_MDEV: - virBufferEscapeString(&buf, "<type id='%s'/>\n", data->mdev.type); - virBufferAsprintf(&buf, "<iommuGroup number='%u'/>\n", - data->mdev.iommuGroupNumber); + virNodeDeviceCapMdevDefFormat(&buf, data); break; case VIR_NODE_DEV_CAP_CCW_DEV: virBufferAsprintf(&buf, "<cssid>0x%x</cssid>\n", @@ -1788,6 +1800,35 @@ virNodeDevCapSystemParseXML(xmlXPathContextPtr ctxt, return ret; }
+static int +virNodeDevCapMdevAttributeParseXML(xmlXPathContextPtr ctxt, + xmlNodePtr node, + virNodeDevCapMdevPtr mdev) +{ + xmlNodePtr orig_node = node; + ctxt->node = node; + int ret = -1; + + g_autoptr(virMediatedDeviceAttr) attr = virMediatedDeviceAttrNew();
Here I'd move the empty line below this @attr declaration.
+ attr->name = virXPathString("string(./@name)", ctxt); + attr->value = virXPathString("string(./@value)", ctxt); + if (!attr->name || !attr->value) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("mdev attribute missing name or value")); + goto cleanup; + } + + if (VIR_EXPAND_N(mdev->attributes, mdev->nattributes, 1) < 0) + goto cleanup; + + mdev->attributes[mdev->nattributes - 1] = g_steal_pointer(&attr);
We have VIR_APPEND_ELEMENT() which would set attr = NULL on success.
+ + ret = 0; + + cleanup: + ctxt->node = orig_node; + return ret; +}
static int virNodeDevCapMdevParseXML(xmlXPathContextPtr ctxt, @@ -1797,6 +1838,9 @@ virNodeDevCapMdevParseXML(xmlXPathContextPtr ctxt, { xmlNodePtr orignode; int ret = -1; + int nattrs = 0; + xmlNodePtr *attrs; + size_t i;
orignode = ctxt->node; ctxt->node = node; @@ -1815,6 +1859,11 @@ virNodeDevCapMdevParseXML(xmlXPathContextPtr ctxt, "'%s'")) < 0) goto out;
+ if ((nattrs = virXPathNodeSet("./attribute", ctxt, &attrs)) < 0) + goto out;
This allocates @attrs, so they must be VIR_FREE()-d or g_autofree()-d on return from the function. And to avoid freeing uninitialized pointer, @attrs should be set to NULL.
+ for (i = 0; i < nattrs; i++) + virNodeDevCapMdevAttributeParseXML(ctxt, attrs[i], mdev);
This needs to have its retval checked, because otherwise we would accept incomplete XML (e.g. with missing attribute name).
+ ret = 0; out: ctxt->node = orignode; @@ -2205,6 +2254,9 @@ virNodeDevCapsDefFree(virNodeDevCapsDefPtr caps) break; case VIR_NODE_DEV_CAP_MDEV: VIR_FREE(data->mdev.type); + for (i = 0; i < data->mdev.nattributes; i++) + virMediatedDeviceAttrFree(data->mdev.attributes[i]); + VIR_FREE(data->mdev.attributes); break; case VIR_NODE_DEV_CAP_MDEV_TYPES: case VIR_NODE_DEV_CAP_DRM: diff --git a/src/conf/node_device_conf.h b/src/conf/node_device_conf.h index 9e4b0847fb..e3e1e788d4 100644 --- a/src/conf/node_device_conf.h +++ b/src/conf/node_device_conf.h @@ -141,6 +141,8 @@ typedef virNodeDevCapMdev *virNodeDevCapMdevPtr; struct _virNodeDevCapMdev { char *type; unsigned int iommuGroupNumber; + virMediatedDeviceAttrPtr *attributes; + size_t nattributes; };
typedef struct _virNodeDevCapPCIDev virNodeDevCapPCIDev; diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 935ef7303b..94b206aa21 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2482,6 +2482,8 @@ virMacMapRemove; virMacMapWriteFile;
# util/virmdev.h +virMediatedDeviceAttrFree; +virMediatedDeviceAttrNew; virMediatedDeviceFree; virMediatedDeviceGetIOMMUGroupDev; virMediatedDeviceGetIOMMUGroupNum; diff --git a/src/util/virmdev.c b/src/util/virmdev.c index c2499c0a20..6c37eae00b 100644 --- a/src/util/virmdev.c +++ b/src/util/virmdev.c @@ -518,3 +518,15 @@ virMediatedDeviceTypeReadAttrs(const char *sysfspath,
return 0; } + +virMediatedDeviceAttrPtr virMediatedDeviceAttrNew(void) +{ + return g_new0(virMediatedDeviceAttr, 1); +} + +void virMediatedDeviceAttrFree(virMediatedDeviceAttrPtr attr) +{
All free functions must accept NULL. Because only then it's safe to use g_autoptr() in non-trivial functions.
+ g_free(attr->name); + g_free(attr->value); + g_free(attr); +} diff --git a/src/util/virmdev.h b/src/util/virmdev.h index 51f7f608a2..eb167ccb48 100644 --- a/src/util/virmdev.h +++ b/src/util/virmdev.h @@ -37,6 +37,17 @@ typedef struct _virMediatedDevice virMediatedDevice; typedef virMediatedDevice *virMediatedDevicePtr; typedef struct _virMediatedDeviceList virMediatedDeviceList; typedef virMediatedDeviceList *virMediatedDeviceListPtr; +typedef struct _virMediatedDeviceAttr virMediatedDeviceAttr; +typedef virMediatedDeviceAttr *virMediatedDeviceAttrPtr; + +struct _virMediatedDeviceAttr { + char *name; + char *value; +}; + +virMediatedDeviceAttrPtr virMediatedDeviceAttrNew(void); +void virMediatedDeviceAttrFree(virMediatedDeviceAttrPtr attr); +G_DEFINE_AUTOPTR_CLEANUP_FUNC(virMediatedDeviceAttr, virMediatedDeviceAttrFree);
G_DEFINE_AUTOPTR_CLEANUP_FUNC(virMediatedDeviceList, virObjectUnref);
Michal

In preparation for creating mediated devices in libvirt, we will need to wait for new mediated devices to be created as well. Refactor nodeDeviceFindNewDevice() so that we can re-use the main logic from this function to wait for different device types by passing a different 'find' function. Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/node_device/node_device_driver.c | 33 +++++++++++++++++++++++----- 1 file changed, 28 insertions(+), 5 deletions(-) diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index 6a96a77d92..f948dfbf69 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -446,7 +446,8 @@ nodeDeviceGetTime(time_t *t) return ret; } - +typedef virNodeDevicePtr (*nodeDeviceFindNewDeviceFunc)(virConnectPtr conn, + const void* opaque); /* When large numbers of devices are present on the host, it's * possible for udev not to realize that it has work to do before we * get here. We thus keep trying to find the new device we just @@ -462,8 +463,8 @@ nodeDeviceGetTime(time_t *t) */ static virNodeDevicePtr nodeDeviceFindNewDevice(virConnectPtr conn, - const char *wwnn, - const char *wwpn) + nodeDeviceFindNewDeviceFunc func, + const void *opaque) { virNodeDevicePtr device = NULL; time_t start = 0, now = 0; @@ -474,7 +475,7 @@ nodeDeviceFindNewDevice(virConnectPtr conn, virWaitForDevices(); - device = nodeDeviceLookupSCSIHostByWWN(conn, wwnn, wwpn, 0); + device = func(conn, opaque); if (device != NULL) break; @@ -487,6 +488,28 @@ nodeDeviceFindNewDevice(virConnectPtr conn, return device; } +typedef struct _NewSCISHostFuncData +{ + const char *wwnn; + const char *wwpn; +} NewSCSIHostFuncData; +static virNodeDevicePtr +nodeDeviceFindNewSCSIHostFunc(virConnectPtr conn, + const void *opaque) +{ + const NewSCSIHostFuncData *data = opaque; + return nodeDeviceLookupSCSIHostByWWN(conn, data->wwnn, data->wwpn, 0); +} + +static virNodeDevicePtr +nodeDeviceFindNewSCSIHost(virConnectPtr conn, + const char *wwnn, + const char *wwpn) +{ + NewSCSIHostFuncData data = { .wwnn = wwnn, .wwpn = wwpn}; + return nodeDeviceFindNewDevice(conn, nodeDeviceFindNewSCSIHostFunc, &data); +} + static bool nodeDeviceHasCapability(virNodeDeviceDefPtr def, virNodeDevCapType type) { @@ -537,7 +560,7 @@ nodeDeviceCreateXML(virConnectPtr conn, if (virVHBAManageVport(parent_host, wwpn, wwnn, VPORT_CREATE) < 0) return NULL; - device = nodeDeviceFindNewDevice(conn, wwnn, wwpn); + device = nodeDeviceFindNewSCSIHost(conn, wwnn, wwpn); /* We don't check the return value, because one way or another, * we're returning what we get... */ -- 2.21.1

On 4/30/20 11:42 PM, Jonathon Jongsma wrote:
In preparation for creating mediated devices in libvirt, we will need to wait for new mediated devices to be created as well. Refactor nodeDeviceFindNewDevice() so that we can re-use the main logic from this function to wait for different device types by passing a different 'find' function.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/node_device/node_device_driver.c | 33 +++++++++++++++++++++++----- 1 file changed, 28 insertions(+), 5 deletions(-)
diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index 6a96a77d92..f948dfbf69 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -446,7 +446,8 @@ nodeDeviceGetTime(time_t *t) return ret; }
- +typedef virNodeDevicePtr (*nodeDeviceFindNewDeviceFunc)(virConnectPtr conn, + const void* opaque);
Again, some spaces here, and especially below.
/* When large numbers of devices are present on the host, it's * possible for udev not to realize that it has work to do before we * get here. We thus keep trying to find the new device we just @@ -462,8 +463,8 @@ nodeDeviceGetTime(time_t *t) */ static virNodeDevicePtr nodeDeviceFindNewDevice(virConnectPtr conn, - const char *wwnn, - const char *wwpn) + nodeDeviceFindNewDeviceFunc func, + const void *opaque) { virNodeDevicePtr device = NULL; time_t start = 0, now = 0; @@ -474,7 +475,7 @@ nodeDeviceFindNewDevice(virConnectPtr conn,
virWaitForDevices();
- device = nodeDeviceLookupSCSIHostByWWN(conn, wwnn, wwpn, 0); + device = func(conn, opaque);
if (device != NULL) break; @@ -487,6 +488,28 @@ nodeDeviceFindNewDevice(virConnectPtr conn, return device; }
+typedef struct _NewSCISHostFuncData
s/SCIS/SCSI/
+{ + const char *wwnn; + const char *wwpn; +} NewSCSIHostFuncData;
We tend to like this split: typedef struct _someStruct someStruct; struct _someStruct { const char *member; }; Honestly, I don't understand why, I guess it's just a coding style.
+static virNodeDevicePtr +nodeDeviceFindNewSCSIHostFunc(virConnectPtr conn, + const void *opaque) +{ + const NewSCSIHostFuncData *data = opaque; + return nodeDeviceLookupSCSIHostByWWN(conn, data->wwnn, data->wwpn, 0); +} + +static virNodeDevicePtr +nodeDeviceFindNewSCSIHost(virConnectPtr conn, + const char *wwnn, + const char *wwpn) +{ + NewSCSIHostFuncData data = { .wwnn = wwnn, .wwpn = wwpn}; + return nodeDeviceFindNewDevice(conn, nodeDeviceFindNewSCSIHostFunc, &data); +} + static bool nodeDeviceHasCapability(virNodeDeviceDefPtr def, virNodeDevCapType type) { @@ -537,7 +560,7 @@ nodeDeviceCreateXML(virConnectPtr conn, if (virVHBAManageVport(parent_host, wwpn, wwnn, VPORT_CREATE) < 0) return NULL;
- device = nodeDeviceFindNewDevice(conn, wwnn, wwpn); + device = nodeDeviceFindNewSCSIHost(conn, wwnn, wwpn); /* We don't check the return value, because one way or another, * we're returning what we get... */
Michal

On Mon, May 11, 2020 at 03:27:58PM +0200, Michal Privoznik wrote:
On 4/30/20 11:42 PM, Jonathon Jongsma wrote:
In preparation for creating mediated devices in libvirt, we will need to wait for new mediated devices to be created as well. Refactor nodeDeviceFindNewDevice() so that we can re-use the main logic from this function to wait for different device types by passing a different 'find' function.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/node_device/node_device_driver.c | 33 +++++++++++++++++++++++----- 1 file changed, 28 insertions(+), 5 deletions(-)
diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index 6a96a77d92..f948dfbf69 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -446,7 +446,8 @@ nodeDeviceGetTime(time_t *t) return ret; } - +typedef virNodeDevicePtr (*nodeDeviceFindNewDeviceFunc)(virConnectPtr conn, + const void* opaque);
Again, some spaces here, and especially below.
/* When large numbers of devices are present on the host, it's * possible for udev not to realize that it has work to do before we * get here. We thus keep trying to find the new device we just @@ -462,8 +463,8 @@ nodeDeviceGetTime(time_t *t) */ static virNodeDevicePtr nodeDeviceFindNewDevice(virConnectPtr conn, - const char *wwnn, - const char *wwpn) + nodeDeviceFindNewDeviceFunc func, + const void *opaque) { virNodeDevicePtr device = NULL; time_t start = 0, now = 0; @@ -474,7 +475,7 @@ nodeDeviceFindNewDevice(virConnectPtr conn, virWaitForDevices(); - device = nodeDeviceLookupSCSIHostByWWN(conn, wwnn, wwpn, 0); + device = func(conn, opaque); if (device != NULL) break; @@ -487,6 +488,28 @@ nodeDeviceFindNewDevice(virConnectPtr conn, return device; } +typedef struct _NewSCISHostFuncData
s/SCIS/SCSI/
+{ + const char *wwnn; + const char *wwpn; +} NewSCSIHostFuncData;
We tend to like this split:
typedef struct _someStruct someStruct; struct _someStruct { const char *member; };
Honestly, I don't understand why, I guess it's just a coding style.
The typedef doesn't really bring any benefit in this case. I don't have any tangible statistics for this, just a plain git grep, but it looks like that the prevailing style for such private structures actually *is* without any typedefing which also makes sense. No further comments on this patch from my side. -- Erik Skultety

In order to allow libvirt to create and start new mediated devices, we need to be able to verify that the device has been started. In order to do this, we'll need to save the UUID of newly-discovered devices within the virNodeDevCapMdev structure. This allows us to search the device list by UUID and verify whether the expected device has been started. Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/conf/node_device_conf.c | 1 + src/conf/node_device_conf.h | 1 + src/node_device/node_device_udev.c | 5 ++--- 3 files changed, 4 insertions(+), 3 deletions(-) diff --git a/src/conf/node_device_conf.c b/src/conf/node_device_conf.c index 8cffe48d23..88e1e84f1c 100644 --- a/src/conf/node_device_conf.c +++ b/src/conf/node_device_conf.c @@ -2254,6 +2254,7 @@ virNodeDevCapsDefFree(virNodeDevCapsDefPtr caps) break; case VIR_NODE_DEV_CAP_MDEV: VIR_FREE(data->mdev.type); + VIR_FREE(data->mdev.uuid); for (i = 0; i < data->mdev.nattributes; i++) virMediatedDeviceAttrFree(data->mdev.attributes[i]); VIR_FREE(data->mdev.attributes); diff --git a/src/conf/node_device_conf.h b/src/conf/node_device_conf.h index e3e1e788d4..9b8c7aadea 100644 --- a/src/conf/node_device_conf.h +++ b/src/conf/node_device_conf.h @@ -141,6 +141,7 @@ typedef virNodeDevCapMdev *virNodeDevCapMdevPtr; struct _virNodeDevCapMdev { char *type; unsigned int iommuGroupNumber; + char *uuid; virMediatedDeviceAttrPtr *attributes; size_t nattributes; }; diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index 386f23ef3a..bdf0b03add 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -1013,7 +1013,6 @@ udevProcessMediatedDevice(struct udev_device *dev, virNodeDeviceDefPtr def) { int ret = -1; - const char *uuidstr = NULL; int iommugrp = -1; char *linkpath = NULL; char *canonicalpath = NULL; @@ -1041,8 +1040,8 @@ udevProcessMediatedDevice(struct udev_device *dev, data->type = g_path_get_basename(canonicalpath); - uuidstr = udev_device_get_sysname(dev); - if ((iommugrp = virMediatedDeviceGetIOMMUGroupNum(uuidstr)) < 0) + data->uuid = g_strdup(udev_device_get_sysname(dev)); + if ((iommugrp = virMediatedDeviceGetIOMMUGroupNum(data->uuid)) < 0) goto cleanup; if (udevGenerateDeviceName(dev, def, NULL) != 0) -- 2.21.1

With recent additions to the node device xml schema, an xml schema can now describe a mdev device sufficiently for libvirt to create and start the device using the mdevctl utility. Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- libvirt.spec.in | 3 + m4/virt-external-programs.m4 | 3 + src/conf/virnodedeviceobj.c | 34 +++++ src/conf/virnodedeviceobj.h | 3 + src/libvirt_private.syms | 1 + src/node_device/node_device_driver.c | 177 +++++++++++++++++++++++++++ 6 files changed, 221 insertions(+) diff --git a/libvirt.spec.in b/libvirt.spec.in index 6abf97df85..411846a9fc 100644 --- a/libvirt.spec.in +++ b/libvirt.spec.in @@ -466,6 +466,9 @@ Requires: dbus # For uid creation during pre Requires(pre): shadow-utils +# For managing persistent mediated devices +Recommends: mdevctl + %description daemon Server side daemon required to manage the virtualization capabilities of recent versions of Linux. Requires a hypervisor specific sub-RPM diff --git a/m4/virt-external-programs.m4 b/m4/virt-external-programs.m4 index 9046e3bf07..bd3cb1f757 100644 --- a/m4/virt-external-programs.m4 +++ b/m4/virt-external-programs.m4 @@ -65,6 +65,7 @@ AC_DEFUN([LIBVIRT_CHECK_EXTERNAL_PROGRAMS], [ AC_PATH_PROG([OVSVSCTL], [ovs-vsctl], [ovs-vsctl], [$LIBVIRT_SBIN_PATH]) AC_PATH_PROG([SCRUB], [scrub], [scrub], [$LIBVIRT_SBIN_PATH]) AC_PATH_PROG([ADDR2LINE], [addr2line], [addr2line], [$LIBVIRT_SBIN_PATH]) + AC_PATH_PROG([MDEVCTL], [mdevctl], [mdevctl], [$LIBVIRT_SBIN_PATH]) AC_DEFINE_UNQUOTED([DMIDECODE], ["$DMIDECODE"], [Location or name of the dmidecode program]) @@ -88,6 +89,8 @@ AC_DEFUN([LIBVIRT_CHECK_EXTERNAL_PROGRAMS], [ [Location or name of the scrub program (for wiping algorithms)]) AC_DEFINE_UNQUOTED([ADDR2LINE], ["$ADDR2LINE"], [Location of addr2line program]) + AC_DEFINE_UNQUOTED([MDEVCTL], ["$MDEVCTL"], + [Location or name of the mdevctl program]) AC_PATH_PROG([IP_PATH], [ip], [/sbin/ip], [$LIBVIRT_SBIN_PATH]) AC_DEFINE_UNQUOTED([IP_PATH], ["$IP_PATH"], [path to ip binary]) diff --git a/src/conf/virnodedeviceobj.c b/src/conf/virnodedeviceobj.c index 3a34a324ca..fd20d5f9e2 100644 --- a/src/conf/virnodedeviceobj.c +++ b/src/conf/virnodedeviceobj.c @@ -399,6 +399,40 @@ virNodeDeviceObjListFindSCSIHostByWWNs(virNodeDeviceObjListPtr devs, &data); } +static int +virNodeDeviceObjListFindMediatedDeviceByUUIDCallback(const void *payload, + const void *name G_GNUC_UNUSED, + const void *opaque G_GNUC_UNUSED) +{ + virNodeDeviceObjPtr obj = (virNodeDeviceObjPtr) payload; + const char *uuid = (const char *) opaque; + virNodeDevCapsDefPtr cap; + int want = 0; + + virObjectLock(obj); + + for (cap = obj->def->caps; cap != NULL; cap = cap->next) { + if (cap->data.type == VIR_NODE_DEV_CAP_MDEV) { + if (STREQ(cap->data.mdev.uuid, uuid)) { + want = 1; + break; + } + } + } + + virObjectUnlock(obj); + return want; +} + + +virNodeDeviceObjPtr +virNodeDeviceObjListFindMediatedDeviceByUUID(virNodeDeviceObjListPtr devs, + const char *uuid) +{ + return virNodeDeviceObjListSearch(devs, + virNodeDeviceObjListFindMediatedDeviceByUUIDCallback, + uuid); +} static void virNodeDeviceObjListDispose(void *obj) diff --git a/src/conf/virnodedeviceobj.h b/src/conf/virnodedeviceobj.h index c9df8dedab..6efdb23d36 100644 --- a/src/conf/virnodedeviceobj.h +++ b/src/conf/virnodedeviceobj.h @@ -118,3 +118,6 @@ virNodeDeviceObjListExport(virConnectPtr conn, void virNodeDeviceObjSetSkipUpdateCaps(virNodeDeviceObjPtr obj, bool skipUpdateCaps); +virNodeDeviceObjPtr +virNodeDeviceObjListFindMediatedDeviceByUUID(virNodeDeviceObjListPtr devs, + const char *uuid); diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 94b206aa21..0468f68f52 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1178,6 +1178,7 @@ virNodeDeviceObjListAssignDef; virNodeDeviceObjListExport; virNodeDeviceObjListFindByName; virNodeDeviceObjListFindBySysfsPath; +virNodeDeviceObjListFindMediatedDeviceByUUID; virNodeDeviceObjListFindSCSIHostByWWNs; virNodeDeviceObjListFree; virNodeDeviceObjListGetNames; diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index f948dfbf69..c271f3bae5 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -30,6 +30,7 @@ #include "datatypes.h" #include "viralloc.h" #include "virfile.h" +#include "virjson.h" #include "virstring.h" #include "node_device_conf.h" #include "node_device_event.h" @@ -40,6 +41,7 @@ #include "viraccessapicheck.h" #include "virnetdev.h" #include "virutil.h" +#include "vircommand.h" #define VIR_FROM_THIS VIR_FROM_NODEDEV @@ -304,6 +306,30 @@ nodeDeviceLookupSCSIHostByWWN(virConnectPtr conn, return device; } +static virNodeDevicePtr +nodeDeviceLookupMediatedDeviceByUUID(virConnectPtr conn, + const char *uuid, + unsigned int flags) +{ + virNodeDeviceObjPtr obj = NULL; + virNodeDeviceDefPtr def; + virNodeDevicePtr device = NULL; + + virCheckFlags(0, NULL); + + if (!(obj = virNodeDeviceObjListFindMediatedDeviceByUUID(driver->devs, + uuid))) + return NULL; + + def = virNodeDeviceObjGetDef(obj); + + if ((device = virGetNodeDevice(conn, def->name))) + device->parentName = g_strdup(def->parent); + + virNodeDeviceObjEndAPI(&obj); + return device; +} + char * nodeDeviceGetXMLDesc(virNodeDevicePtr device, @@ -488,6 +514,23 @@ nodeDeviceFindNewDevice(virConnectPtr conn, return device; } +static virNodeDevicePtr +nodeDeviceFindNewMediatedDeviceFunc(virConnectPtr conn, + const void *opaque) +{ + const char *uuid = opaque; + return nodeDeviceLookupMediatedDeviceByUUID(conn, uuid, 0); +} + +static virNodeDevicePtr +nodeDeviceFindNewMediatedDevice(virConnectPtr conn, + const char *mdev_uuid) +{ + return nodeDeviceFindNewDevice(conn, + nodeDeviceFindNewMediatedDeviceFunc, + mdev_uuid); +} + typedef struct _NewSCISHostFuncData { const char *wwnn; @@ -525,6 +568,68 @@ nodeDeviceHasCapability(virNodeDeviceDefPtr def, virNodeDevCapType type) return false; } +static int +virNodeDeviceDefToMdevctlConfig(virNodeDeviceDefPtr def, char **buf) +{ + size_t i; + virNodeDevCapMdevPtr mdev = &def->caps->data.mdev; + g_autoptr(virJSONValue) json = virJSONValueNewObject(); + + if (virJSONValueObjectAppendString(json, "mdev_type", mdev->type) < 0) + return -1; + if (virJSONValueObjectAppendString(json, "start", "manual") < 0) + return -1; + if (mdev->attributes) { + g_autoptr(virJSONValue) attributes = virJSONValueNewArray(); + for (i = 0; i < mdev->nattributes; i++) { + virMediatedDeviceAttrPtr attr = mdev->attributes[i]; + g_autoptr(virJSONValue) jsonattr = virJSONValueNewObject(); + + if (virJSONValueObjectAppendString(jsonattr, attr->name, attr->value) < 0) + return -1; + if (virJSONValueArrayAppend(attributes, g_steal_pointer(&jsonattr)) < 0) + return -1; + } + if (virJSONValueObjectAppend(json, "attrs", g_steal_pointer(&attributes)) < 0) + return -1; + } + + *buf = virJSONValueToString(json, false); + if (*buf == NULL) + return -1; + + return 0; +} + +static int +virMdevctlStart(const char *parent, const char *json_file, char **uuid) +{ + int status; + g_autofree char *mdevctl = virFindFileInPath(MDEVCTL); + if (!mdevctl) + return -1; + + g_autoptr(virCommand) cmd = virCommandNewArgList(mdevctl, + "start", + "-p", + parent, + "--jsonfile", + json_file, + NULL); + virCommandAddEnvPassCommon(cmd); + virCommandSetOutputBuffer(cmd, uuid); + + /* an auto-generated uuid is returned via stdout if no uuid is specified in + * the mdevctl args */ + if (virCommandRun(cmd, &status) < 0 || status != 0) + return -1; + + /* remove newline */ + *uuid = g_strstrip(*uuid); + + return 0; +} + virNodeDevicePtr nodeDeviceCreateXML(virConnectPtr conn, const char *xmlDesc, @@ -569,6 +674,78 @@ nodeDeviceCreateXML(virConnectPtr conn, _("no node device for '%s' with matching " "wwnn '%s' and wwpn '%s'"), def->name, wwnn, wwpn); + } else if (nodeDeviceHasCapability(def, VIR_NODE_DEV_CAP_MDEV)) { + virNodeDeviceObjPtr parent_dev = NULL; + virNodeDeviceDefPtr parent_def = NULL; + virNodeDevCapsDefPtr parent_caps = NULL; + g_autofree char *uuid = NULL; + g_autofree char *parent_pci = NULL; + + if (def->parent == NULL) { + virReportError(VIR_ERR_NO_NODE_DEVICE, "%s", + _("cannot create a mediated device without a parent")); + return NULL; + } + + if ((parent_dev = virNodeDeviceObjListFindByName(driver->devs, def->parent)) == NULL) { + virReportError(VIR_ERR_NO_NODE_DEVICE, + _("could not find parent device '%s'"), def->parent); + return NULL; + } + + parent_def = virNodeDeviceObjGetDef(parent_dev); + for (parent_caps = parent_def->caps; parent_caps != NULL; parent_caps = parent_caps->next) { + if (parent_caps->data.type == VIR_NODE_DEV_CAP_PCI_DEV) { + virPCIDeviceAddress addr = { + .domain = parent_caps->data.pci_dev.domain, + .bus = parent_caps->data.pci_dev.bus, + .slot = parent_caps->data.pci_dev.slot, + .function = parent_caps->data.pci_dev.function + }; + + parent_pci = virPCIDeviceAddressAsString(&addr); + break; + } + } + + virNodeDeviceObjEndAPI(&parent_dev); + + if (parent_pci == NULL) { + virReportError(VIR_ERR_NO_NODE_DEVICE, + _("unable to find PCI address for parent device '%s'"), def->parent); + return NULL; + } + + /* Convert node device definition to a JSON string formatted for + * mdevctl */ + g_autofree char *json = NULL; + if (virNodeDeviceDefToMdevctlConfig(def, &json) < 0) { + virReportError(VIR_ERR_NO_NODE_DEVICE, "%s", + _("couldn't convert node device def to mdevctl JSON")); + return NULL; + } + + g_autofree char *tmp = g_build_filename(driver->stateDir, "mdev-json-XXXXXX", NULL); + gint tmpfd; + + if ((tmpfd = g_mkstemp_full(tmp, O_RDWR | O_CLOEXEC, S_IRUSR | S_IWUSR)) < 0) { + virReportSystemError(errno, "%s", _("Failed to open temp file for write")); + return NULL; + } + if (safewrite(tmpfd, json, strlen(json)) != strlen(json) || + VIR_CLOSE(tmpfd) < 0) { + virReportSystemError(errno, "%s", _("Failed to write temp file")); + return NULL; + } + + if (virMdevctlStart(parent_pci, tmp, &uuid) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Unable to start mediated device")); + return NULL; + } + + if (uuid && uuid[0] != '\0') + device = nodeDeviceFindNewMediatedDevice(conn, uuid); } else { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("Unsupported device type")); -- 2.21.1

On 4/30/20 11:42 PM, Jonathon Jongsma wrote:
With recent additions to the node device xml schema, an xml schema can now describe a mdev device sufficiently for libvirt to create and start the device using the mdevctl utility.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- libvirt.spec.in | 3 + m4/virt-external-programs.m4 | 3 + src/conf/virnodedeviceobj.c | 34 +++++ src/conf/virnodedeviceobj.h | 3 + src/libvirt_private.syms | 1 + src/node_device/node_device_driver.c | 177 +++++++++++++++++++++++++++ 6 files changed, 221 insertions(+)
diff --git a/libvirt.spec.in b/libvirt.spec.in index 6abf97df85..411846a9fc 100644 --- a/libvirt.spec.in +++ b/libvirt.spec.in @@ -466,6 +466,9 @@ Requires: dbus # For uid creation during pre Requires(pre): shadow-utils
+# For managing persistent mediated devices +Recommends: mdevctl +
I wonder whether we could make this for nodedev driver RPM only. Because, if somebody installs say client library + virsh only (for remote mgmt), then there is no reason they should install mdevctl with that.
%description daemon Server side daemon required to manage the virtualization capabilities of recent versions of Linux. Requires a hypervisor specific sub-RPM diff --git a/m4/virt-external-programs.m4 b/m4/virt-external-programs.m4 index 9046e3bf07..bd3cb1f757 100644 --- a/m4/virt-external-programs.m4 +++ b/m4/virt-external-programs.m4 @@ -65,6 +65,7 @@ AC_DEFUN([LIBVIRT_CHECK_EXTERNAL_PROGRAMS], [ AC_PATH_PROG([OVSVSCTL], [ovs-vsctl], [ovs-vsctl], [$LIBVIRT_SBIN_PATH]) AC_PATH_PROG([SCRUB], [scrub], [scrub], [$LIBVIRT_SBIN_PATH]) AC_PATH_PROG([ADDR2LINE], [addr2line], [addr2line], [$LIBVIRT_SBIN_PATH]) + AC_PATH_PROG([MDEVCTL], [mdevctl], [mdevctl], [$LIBVIRT_SBIN_PATH])
AC_DEFINE_UNQUOTED([DMIDECODE], ["$DMIDECODE"], [Location or name of the dmidecode program]) @@ -88,6 +89,8 @@ AC_DEFUN([LIBVIRT_CHECK_EXTERNAL_PROGRAMS], [ [Location or name of the scrub program (for wiping algorithms)]) AC_DEFINE_UNQUOTED([ADDR2LINE], ["$ADDR2LINE"], [Location of addr2line program]) + AC_DEFINE_UNQUOTED([MDEVCTL], ["$MDEVCTL"], + [Location or name of the mdevctl program])
AC_PATH_PROG([IP_PATH], [ip], [/sbin/ip], [$LIBVIRT_SBIN_PATH]) AC_DEFINE_UNQUOTED([IP_PATH], ["$IP_PATH"], [path to ip binary]) diff --git a/src/conf/virnodedeviceobj.c b/src/conf/virnodedeviceobj.c index 3a34a324ca..fd20d5f9e2 100644 --- a/src/conf/virnodedeviceobj.c +++ b/src/conf/virnodedeviceobj.c @@ -399,6 +399,40 @@ virNodeDeviceObjListFindSCSIHostByWWNs(virNodeDeviceObjListPtr devs, &data); }
+static int +virNodeDeviceObjListFindMediatedDeviceByUUIDCallback(const void *payload, + const void *name G_GNUC_UNUSED, + const void *opaque G_GNUC_UNUSED) +{ + virNodeDeviceObjPtr obj = (virNodeDeviceObjPtr) payload; + const char *uuid = (const char *) opaque; + virNodeDevCapsDefPtr cap; + int want = 0; + + virObjectLock(obj); + + for (cap = obj->def->caps; cap != NULL; cap = cap->next) { + if (cap->data.type == VIR_NODE_DEV_CAP_MDEV) { + if (STREQ(cap->data.mdev.uuid, uuid)) { + want = 1; + break; + } + } + } + + virObjectUnlock(obj); + return want; +} + + +virNodeDeviceObjPtr +virNodeDeviceObjListFindMediatedDeviceByUUID(virNodeDeviceObjListPtr devs, + const char *uuid) +{ + return virNodeDeviceObjListSearch(devs, + virNodeDeviceObjListFindMediatedDeviceByUUIDCallback, + uuid); +}
static void virNodeDeviceObjListDispose(void *obj) diff --git a/src/conf/virnodedeviceobj.h b/src/conf/virnodedeviceobj.h index c9df8dedab..6efdb23d36 100644 --- a/src/conf/virnodedeviceobj.h +++ b/src/conf/virnodedeviceobj.h @@ -118,3 +118,6 @@ virNodeDeviceObjListExport(virConnectPtr conn, void virNodeDeviceObjSetSkipUpdateCaps(virNodeDeviceObjPtr obj, bool skipUpdateCaps); +virNodeDeviceObjPtr +virNodeDeviceObjListFindMediatedDeviceByUUID(virNodeDeviceObjListPtr devs, + const char *uuid); diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 94b206aa21..0468f68f52 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1178,6 +1178,7 @@ virNodeDeviceObjListAssignDef; virNodeDeviceObjListExport; virNodeDeviceObjListFindByName; virNodeDeviceObjListFindBySysfsPath; +virNodeDeviceObjListFindMediatedDeviceByUUID; virNodeDeviceObjListFindSCSIHostByWWNs; virNodeDeviceObjListFree; virNodeDeviceObjListGetNames; diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index f948dfbf69..c271f3bae5 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -30,6 +30,7 @@ #include "datatypes.h" #include "viralloc.h" #include "virfile.h" +#include "virjson.h" #include "virstring.h" #include "node_device_conf.h" #include "node_device_event.h" @@ -40,6 +41,7 @@ #include "viraccessapicheck.h" #include "virnetdev.h" #include "virutil.h" +#include "vircommand.h"
#define VIR_FROM_THIS VIR_FROM_NODEDEV
@@ -304,6 +306,30 @@ nodeDeviceLookupSCSIHostByWWN(virConnectPtr conn, return device; }
+static virNodeDevicePtr +nodeDeviceLookupMediatedDeviceByUUID(virConnectPtr conn, + const char *uuid, + unsigned int flags) +{ + virNodeDeviceObjPtr obj = NULL; + virNodeDeviceDefPtr def; + virNodeDevicePtr device = NULL; + + virCheckFlags(0, NULL); + + if (!(obj = virNodeDeviceObjListFindMediatedDeviceByUUID(driver->devs, + uuid))) + return NULL; + + def = virNodeDeviceObjGetDef(obj); + + if ((device = virGetNodeDevice(conn, def->name))) + device->parentName = g_strdup(def->parent); + + virNodeDeviceObjEndAPI(&obj); + return device; +} +
char * nodeDeviceGetXMLDesc(virNodeDevicePtr device, @@ -488,6 +514,23 @@ nodeDeviceFindNewDevice(virConnectPtr conn, return device; }
+static virNodeDevicePtr +nodeDeviceFindNewMediatedDeviceFunc(virConnectPtr conn, + const void *opaque) +{ + const char *uuid = opaque; + return nodeDeviceLookupMediatedDeviceByUUID(conn, uuid, 0); +} + +static virNodeDevicePtr +nodeDeviceFindNewMediatedDevice(virConnectPtr conn, + const char *mdev_uuid) +{ + return nodeDeviceFindNewDevice(conn, + nodeDeviceFindNewMediatedDeviceFunc, + mdev_uuid); +} + typedef struct _NewSCISHostFuncData { const char *wwnn; @@ -525,6 +568,68 @@ nodeDeviceHasCapability(virNodeDeviceDefPtr def, virNodeDevCapType type) return false; }
+static int +virNodeDeviceDefToMdevctlConfig(virNodeDeviceDefPtr def, char **buf) +{ + size_t i; + virNodeDevCapMdevPtr mdev = &def->caps->data.mdev; + g_autoptr(virJSONValue) json = virJSONValueNewObject(); + + if (virJSONValueObjectAppendString(json, "mdev_type", mdev->type) < 0) + return -1; + if (virJSONValueObjectAppendString(json, "start", "manual") < 0) + return -1; + if (mdev->attributes) { + g_autoptr(virJSONValue) attributes = virJSONValueNewArray(); + for (i = 0; i < mdev->nattributes; i++) { + virMediatedDeviceAttrPtr attr = mdev->attributes[i]; + g_autoptr(virJSONValue) jsonattr = virJSONValueNewObject(); + + if (virJSONValueObjectAppendString(jsonattr, attr->name, attr->value) < 0) + return -1; + if (virJSONValueArrayAppend(attributes, g_steal_pointer(&jsonattr)) < 0) + return -1; + } + if (virJSONValueObjectAppend(json, "attrs", g_steal_pointer(&attributes)) < 0) + return -1; + } + + *buf = virJSONValueToString(json, false); + if (*buf == NULL) + return -1; + + return 0; +} + +static int +virMdevctlStart(const char *parent, const char *json_file, char **uuid) +{ + int status; + g_autofree char *mdevctl = virFindFileInPath(MDEVCTL); + if (!mdevctl) + return -1; + + g_autoptr(virCommand) cmd = virCommandNewArgList(mdevctl, + "start", + "-p", + parent, + "--jsonfile", + json_file, + NULL); + virCommandAddEnvPassCommon(cmd); + virCommandSetOutputBuffer(cmd, uuid); + + /* an auto-generated uuid is returned via stdout if no uuid is specified in + * the mdevctl args */ + if (virCommandRun(cmd, &status) < 0 || status != 0) + return -1; + + /* remove newline */ + *uuid = g_strstrip(*uuid); + + return 0; +} + virNodeDevicePtr nodeDeviceCreateXML(virConnectPtr conn, const char *xmlDesc, @@ -569,6 +674,78 @@ nodeDeviceCreateXML(virConnectPtr conn, _("no node device for '%s' with matching " "wwnn '%s' and wwpn '%s'"), def->name, wwnn, wwpn); + } else if (nodeDeviceHasCapability(def, VIR_NODE_DEV_CAP_MDEV)) {
Can this whole block be moved into a separate function? I think it would look slightly better.
+ virNodeDeviceObjPtr parent_dev = NULL; + virNodeDeviceDefPtr parent_def = NULL; + virNodeDevCapsDefPtr parent_caps = NULL; + g_autofree char *uuid = NULL; + g_autofree char *parent_pci = NULL; + + if (def->parent == NULL) { + virReportError(VIR_ERR_NO_NODE_DEVICE, "%s", + _("cannot create a mediated device without a parent")); + return NULL; + } + + if ((parent_dev = virNodeDeviceObjListFindByName(driver->devs, def->parent)) == NULL) { + virReportError(VIR_ERR_NO_NODE_DEVICE, + _("could not find parent device '%s'"), def->parent); + return NULL; + } + + parent_def = virNodeDeviceObjGetDef(parent_dev); + for (parent_caps = parent_def->caps; parent_caps != NULL; parent_caps = parent_caps->next) { + if (parent_caps->data.type == VIR_NODE_DEV_CAP_PCI_DEV) { + virPCIDeviceAddress addr = { + .domain = parent_caps->data.pci_dev.domain, + .bus = parent_caps->data.pci_dev.bus, + .slot = parent_caps->data.pci_dev.slot, + .function = parent_caps->data.pci_dev.function + }; + + parent_pci = virPCIDeviceAddressAsString(&addr); + break; + } + } + + virNodeDeviceObjEndAPI(&parent_dev); + + if (parent_pci == NULL) { + virReportError(VIR_ERR_NO_NODE_DEVICE, + _("unable to find PCI address for parent device '%s'"), def->parent); + return NULL; + } + + /* Convert node device definition to a JSON string formatted for + * mdevctl */ + g_autofree char *json = NULL; + if (virNodeDeviceDefToMdevctlConfig(def, &json) < 0) { + virReportError(VIR_ERR_NO_NODE_DEVICE, "%s", + _("couldn't convert node device def to mdevctl JSON")); + return NULL; + } + + g_autofree char *tmp = g_build_filename(driver->stateDir, "mdev-json-XXXXXX", NULL); + gint tmpfd;
These variables should be declared upfront. We don't really like in-block declarations. Also, we have VIR_AUTOCLOSE ;-) Or even better, virFileWriteStr(). I know, it's not obvious, unless you know src/util/ by heart.
+ + if ((tmpfd = g_mkstemp_full(tmp, O_RDWR | O_CLOEXEC, S_IRUSR | S_IWUSR)) < 0) { + virReportSystemError(errno, "%s", _("Failed to open temp file for write")); + return NULL; + } + if (safewrite(tmpfd, json, strlen(json)) != strlen(json) || + VIR_CLOSE(tmpfd) < 0) { + virReportSystemError(errno, "%s", _("Failed to write temp file")); + return NULL; + } + + if (virMdevctlStart(parent_pci, tmp, &uuid) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Unable to start mediated device")); > + return NULL; + } + + if (uuid && uuid[0] != '\0') + device = nodeDeviceFindNewMediatedDevice(conn, uuid); } else { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("Unsupported device type"));
Michal

On Mon, May 11, 2020 at 03:27:56PM +0200, Michal Privoznik wrote:
On 4/30/20 11:42 PM, Jonathon Jongsma wrote:
With recent additions to the node device xml schema, an xml schema can now describe a mdev device sufficiently for libvirt to create and start the device using the mdevctl utility.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- libvirt.spec.in | 3 + m4/virt-external-programs.m4 | 3 + src/conf/virnodedeviceobj.c | 34 +++++ src/conf/virnodedeviceobj.h | 3 + src/libvirt_private.syms | 1 + src/node_device/node_device_driver.c | 177 +++++++++++++++++++++++++++ 6 files changed, 221 insertions(+)
diff --git a/libvirt.spec.in b/libvirt.spec.in index 6abf97df85..411846a9fc 100644 --- a/libvirt.spec.in +++ b/libvirt.spec.in @@ -466,6 +466,9 @@ Requires: dbus # For uid creation during pre Requires(pre): shadow-utils +# For managing persistent mediated devices +Recommends: mdevctl +
I wonder whether we could make this for nodedev driver RPM only. Because, if somebody installs say client library + virsh only (for remote mgmt), then there is no reason they should install mdevctl with that.
Yeah, it should be mandatory against the nodedev driver RPM. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Thu, Apr 30, 2020 at 04:42:57PM -0500, Jonathon Jongsma wrote:
With recent additions to the node device xml schema, an xml schema can now describe a mdev device sufficiently for libvirt to create and start the device using the mdevctl utility.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> ---
...
}
+static int +virNodeDeviceDefToMdevctlConfig(virNodeDeviceDefPtr def, char **buf) +{ + size_t i; + virNodeDevCapMdevPtr mdev = &def->caps->data.mdev; + g_autoptr(virJSONValue) json = virJSONValueNewObject(); + + if (virJSONValueObjectAppendString(json, "mdev_type", mdev->type) < 0) + return -1; + if (virJSONValueObjectAppendString(json, "start", "manual") < 0) + return -1; + if (mdev->attributes) { + g_autoptr(virJSONValue) attributes = virJSONValueNewArray(); + for (i = 0; i < mdev->nattributes; i++) { + virMediatedDeviceAttrPtr attr = mdev->attributes[i]; + g_autoptr(virJSONValue) jsonattr = virJSONValueNewObject(); + + if (virJSONValueObjectAppendString(jsonattr, attr->name, attr->value) < 0) + return -1; + if (virJSONValueArrayAppend(attributes, g_steal_pointer(&jsonattr)) < 0) + return -1; + } + if (virJSONValueObjectAppend(json, "attrs", g_steal_pointer(&attributes)) < 0) + return -1; + } + + *buf = virJSONValueToString(json, false); + if (*buf == NULL)
tiny nitpick - we test pointers simply as if(!ptr) (multiple occurrences)
+ return -1; + + return 0; +} + +static int +virMdevctlStart(const char *parent, const char *json_file, char **uuid) +{ + int status; + g_autofree char *mdevctl = virFindFileInPath(MDEVCTL);
insert an empty line here...
+ if (!mdevctl) + return -1; + + g_autoptr(virCommand) cmd = virCommandNewArgList(mdevctl, + "start", + "-p", + parent, + "--jsonfile", + json_file, + NULL);
So, I was wondering what if someone actually wants to specify the UUID for the device for some reason. We would have to expose a new element in the XML partially duplicating what the <name> already contains. On the other hand, I think we're good in create since these are truly supposed to be transient mdevs, so it's desirable to let someone else (or us internally for that matter) figure out the UUID for such device. Now, this is irrelevant to this patch because I'm already thinking ahead. In order to define an mdev, you will need to expose an element mapping to the UUID, you want to let apps create mdev profiles with expected UUID values and activate them on demand.
+ virCommandAddEnvPassCommon(cmd); + virCommandSetOutputBuffer(cmd, uuid); + + /* an auto-generated uuid is returned via stdout if no uuid is specified in + * the mdevctl args */ + if (virCommandRun(cmd, &status) < 0 || status != 0) + return -1; + + /* remove newline */ + *uuid = g_strstrip(*uuid); + + return 0; +} + virNodeDevicePtr nodeDeviceCreateXML(virConnectPtr conn, const char *xmlDesc, @@ -569,6 +674,78 @@ nodeDeviceCreateXML(virConnectPtr conn, _("no node device for '%s' with matching " "wwnn '%s' and wwpn '%s'"), def->name, wwnn, wwpn); + } else if (nodeDeviceHasCapability(def, VIR_NODE_DEV_CAP_MDEV)) { + virNodeDeviceObjPtr parent_dev = NULL; + virNodeDeviceDefPtr parent_def = NULL; + virNodeDevCapsDefPtr parent_caps = NULL; + g_autofree char *uuid = NULL; + g_autofree char *parent_pci = NULL; + + if (def->parent == NULL) { + virReportError(VIR_ERR_NO_NODE_DEVICE, "%s",
This IMO should be a VIR_ERR_XML_ERROR because it's a mandatory element to be specified.
+ _("cannot create a mediated device without a parent")); + return NULL; + } + + if ((parent_dev = virNodeDeviceObjListFindByName(driver->devs, def->parent)) == NULL) { + virReportError(VIR_ERR_NO_NODE_DEVICE, + _("could not find parent device '%s'"), def->parent); + return NULL; + } + + parent_def = virNodeDeviceObjGetDef(parent_dev); + for (parent_caps = parent_def->caps; parent_caps != NULL; parent_caps = parent_caps->next) { + if (parent_caps->data.type == VIR_NODE_DEV_CAP_PCI_DEV) { + virPCIDeviceAddress addr = { + .domain = parent_caps->data.pci_dev.domain, + .bus = parent_caps->data.pci_dev.bus, + .slot = parent_caps->data.pci_dev.slot, + .function = parent_caps->data.pci_dev.function + }; + + parent_pci = virPCIDeviceAddressAsString(&addr); + break; + } + } + + virNodeDeviceObjEndAPI(&parent_dev); + + if (parent_pci == NULL) { + virReportError(VIR_ERR_NO_NODE_DEVICE, + _("unable to find PCI address for parent device '%s'"), def->parent); + return NULL; + } + + /* Convert node device definition to a JSON string formatted for + * mdevctl */ + g_autofree char *json = NULL; + if (virNodeDeviceDefToMdevctlConfig(def, &json) < 0) { + virReportError(VIR_ERR_NO_NODE_DEVICE, "%s",
I think ^this should be an INTERNAL_ERROR instead
+ _("couldn't convert node device def to mdevctl JSON")); + return NULL; + } + + g_autofree char *tmp = g_build_filename(driver->stateDir, "mdev-json-XXXXXX", NULL); + gint tmpfd; + + if ((tmpfd = g_mkstemp_full(tmp, O_RDWR | O_CLOEXEC, S_IRUSR | S_IWUSR)) < 0) { + virReportSystemError(errno, "%s", _("Failed to open temp file for write")); + return NULL; + } + if (safewrite(tmpfd, json, strlen(json)) != strlen(json) || + VIR_CLOSE(tmpfd) < 0) { + virReportSystemError(errno, "%s", _("Failed to write temp file")); + return NULL;
You should discard the temporary JSON file at some point once the below is finished because it's littering the state dir.
+ } + + if (virMdevctlStart(parent_pci, tmp, &uuid) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Unable to start mediated device")); + return NULL; + } + + if (uuid && uuid[0] != '\0')
Is ^this check necessary? I mean, mdevctl either returns the UUID and retcode 0 or nothing and retcode 1 which we should have already checked in virCommandRun, so since it was successful, we now know that UUID is filled in or did I miss something in the code?
+ device = nodeDeviceFindNewMediatedDevice(conn, uuid); } else { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("Unsupported device type"));
-- Erik Skultety

Add the ability to destroy mdev node devices via the mdevctl utility. Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/node_device/node_device_driver.c | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+) diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index c271f3bae5..113cc063df 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -630,6 +630,27 @@ virMdevctlStart(const char *parent, const char *json_file, char **uuid) return 0; } +static int +virMdevctlStop(const char *uuid) +{ + int status; + g_autofree char *mdevctl = virFindFileInPath(MDEVCTL); + if (!mdevctl) + return -1; + + g_autoptr(virCommand) cmd = virCommandNewArgList(mdevctl, + "stop", + "-u", + uuid, + NULL); + virCommandAddEnvPassCommon(cmd); + + if (virCommandRun(cmd, &status) < 0 || status != 0) + return -1; + + return 0; +} + virNodeDevicePtr nodeDeviceCreateXML(virConnectPtr conn, const char *xmlDesc, @@ -801,6 +822,13 @@ nodeDeviceDestroy(virNodeDevicePtr device) if (virVHBAManageVport(parent_host, wwpn, wwnn, VPORT_DELETE) < 0) goto cleanup; + ret = 0; + } else if (nodeDeviceHasCapability(def, VIR_NODE_DEV_CAP_MDEV)) { + if (virMdevctlStop(def->caps->data.mdev.uuid) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Unable to stop mediated device")); + goto cleanup; + } ret = 0; } else { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", -- 2.21.1

On 4/30/20 11:42 PM, Jonathon Jongsma wrote:
This is the first portion of an effort to support persistent mediated devices with libvirt. This first series simply enables creating and destroying non-persistent mediated devices via the virNodeDeviceCreateXML() and virNodeDeviceDestroy() functions. The 'mdevctl' utility[1] provides the backend implementation.
[1] https://github.com/mdevctl/mdevctl
Jonathon Jongsma (6): nodedev: factor out nodeDeviceHasCapability() nodedev: add support for mdev attributes nodedev: refactor nodeDeviceFindNewDevice() nodedev: store mdev UUID in mdev caps nodedev: add mdev support to virNodeDeviceCreateXML() nodedev: add mdev support to virNodeDeviceDestroy()
docs/schemas/nodedev.rng | 6 + libvirt.spec.in | 3 + m4/virt-external-programs.m4 | 3 + src/conf/node_device_conf.c | 59 ++++- src/conf/node_device_conf.h | 3 + src/conf/virnodedeviceobj.c | 34 +++ src/conf/virnodedeviceobj.h | 3 + src/libvirt_private.syms | 3 + src/node_device/node_device_driver.c | 326 ++++++++++++++++++++++++--- src/node_device/node_device_udev.c | 5 +- src/util/virmdev.c | 12 + src/util/virmdev.h | 11 + 12 files changed, 425 insertions(+), 43 deletions(-)
Codewise, this looks good. I will let Erik review the semantics of creating mdevs. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Michal

On Mon, May 11, 2020 at 03:28:02PM +0200, Michal Privoznik wrote:
On 4/30/20 11:42 PM, Jonathon Jongsma wrote:
This is the first portion of an effort to support persistent mediated devices with libvirt. This first series simply enables creating and destroying non-persistent mediated devices via the virNodeDeviceCreateXML() and virNodeDeviceDestroy() functions. The 'mdevctl' utility[1] provides the backend implementation.
[1] https://github.com/mdevctl/mdevctl
Jonathon Jongsma (6): nodedev: factor out nodeDeviceHasCapability() nodedev: add support for mdev attributes nodedev: refactor nodeDeviceFindNewDevice() nodedev: store mdev UUID in mdev caps nodedev: add mdev support to virNodeDeviceCreateXML() nodedev: add mdev support to virNodeDeviceDestroy()
docs/schemas/nodedev.rng | 6 +
docs/formatnode.html.in needs some documentation and examples
libvirt.spec.in | 3 + m4/virt-external-programs.m4 | 3 + src/conf/node_device_conf.c | 59 ++++- src/conf/node_device_conf.h | 3 + src/conf/virnodedeviceobj.c | 34 +++ src/conf/virnodedeviceobj.h | 3 + src/libvirt_private.syms | 3 + src/node_device/node_device_driver.c | 326 ++++++++++++++++++++++++--- src/node_device/node_device_udev.c | 5 +- src/util/virmdev.c | 12 + src/util/virmdev.h | 11 + 12 files changed, 425 insertions(+), 43 deletions(-)
Codewise, this looks good. I will let Erik review the semantics of creating mdevs.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This is notably lacking any unit test coverage, so is not validating the RNG schema or the XML parser or conversion of XML to mdevctl args. I think that needs fixing before we accept it. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Mon, May 11, 2020 at 05:51:13PM +0100, Daniel P. Berrangé wrote:
On Mon, May 11, 2020 at 03:28:02PM +0200, Michal Privoznik wrote:
On 4/30/20 11:42 PM, Jonathon Jongsma wrote:
This is the first portion of an effort to support persistent mediated devices with libvirt. This first series simply enables creating and destroying non-persistent mediated devices via the virNodeDeviceCreateXML() and virNodeDeviceDestroy() functions. The 'mdevctl' utility[1] provides the backend implementation.
[1] https://github.com/mdevctl/mdevctl
Jonathon Jongsma (6): nodedev: factor out nodeDeviceHasCapability() nodedev: add support for mdev attributes nodedev: refactor nodeDeviceFindNewDevice() nodedev: store mdev UUID in mdev caps nodedev: add mdev support to virNodeDeviceCreateXML() nodedev: add mdev support to virNodeDeviceDestroy()
docs/schemas/nodedev.rng | 6 +
docs/formatnode.html.in needs some documentation and examples
libvirt.spec.in | 3 + m4/virt-external-programs.m4 | 3 + src/conf/node_device_conf.c | 59 ++++- src/conf/node_device_conf.h | 3 + src/conf/virnodedeviceobj.c | 34 +++ src/conf/virnodedeviceobj.h | 3 + src/libvirt_private.syms | 3 + src/node_device/node_device_driver.c | 326 ++++++++++++++++++++++++--- src/node_device/node_device_udev.c | 5 +- src/util/virmdev.c | 12 + src/util/virmdev.h | 11 + 12 files changed, 425 insertions(+), 43 deletions(-)
Codewise, this looks good. I will let Erik review the semantics of creating mdevs.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This is notably lacking any unit test coverage, so is not validating the RNG schema or the XML parser or conversion of XML to mdevctl args. I think that needs fixing before we accept it.
Agreed. So I played with the series and found a want to make a few points. Apparently, this is the minimalistic XML that would work: <device> <parent>pci_0000_06_00_0</parent> <capability type='mdev'> <type id='nvidia-11'/> <iommuGroup number='71'/> </capability> </device> Which means we should make iommuGroup optional, because it's a readonly element, users are not supposed to specify the iommu group and as a setting it's also ignored, because it's figured out by the parent device driver (I think) when the mdev is created. Like Daniel mentioned, some documentation would be nice, especially clarifying that the <name> and <path> elements are also read only and any attempt to set them would be ignored - well, simply because we're reusing the XML structure we've been using for ages to only report results back, not consume them back ourselves. It's good to think ahead to the future with the additional attributes, but I don't know about any attributes that vGPUs would accept, so I can't comment on that really even if I wanted to, have you tried with some other non-vGPU type of mdev? I finally got to try the mdevctl utility directly and seeing what it can do and how it does things, I have to re-iterate the question what benefit does libvirt bring in terms of creating/defining the mdevs? You can already define multiple profiles in JSON for mdevctl even with the same UUID and activate them on demand. In terms of migration, in order to migrate persistent mdevs with libvirt we've already talked about pre-creating the configs on the destination with the correct parent address and mdev type which is exactly the same thing one would have to do with the JSON configs of mdevctl and editing the relevant bits. I'm asking because I've been away from this feature for a while, so I may have missed substantial information in this area. Regards, -- Erik Skultety

On Wed, May 20, 2020 at 10:32:05AM +0200, Erik Skultety wrote:
On Mon, May 11, 2020 at 05:51:13PM +0100, Daniel P. Berrangé wrote:
On Mon, May 11, 2020 at 03:28:02PM +0200, Michal Privoznik wrote:
On 4/30/20 11:42 PM, Jonathon Jongsma wrote:
This is the first portion of an effort to support persistent mediated devices with libvirt. This first series simply enables creating and destroying non-persistent mediated devices via the virNodeDeviceCreateXML() and virNodeDeviceDestroy() functions. The 'mdevctl' utility[1] provides the backend implementation.
[1] https://github.com/mdevctl/mdevctl
Jonathon Jongsma (6): nodedev: factor out nodeDeviceHasCapability() nodedev: add support for mdev attributes nodedev: refactor nodeDeviceFindNewDevice() nodedev: store mdev UUID in mdev caps nodedev: add mdev support to virNodeDeviceCreateXML() nodedev: add mdev support to virNodeDeviceDestroy()
docs/schemas/nodedev.rng | 6 +
docs/formatnode.html.in needs some documentation and examples
libvirt.spec.in | 3 + m4/virt-external-programs.m4 | 3 + src/conf/node_device_conf.c | 59 ++++- src/conf/node_device_conf.h | 3 + src/conf/virnodedeviceobj.c | 34 +++ src/conf/virnodedeviceobj.h | 3 + src/libvirt_private.syms | 3 + src/node_device/node_device_driver.c | 326 ++++++++++++++++++++++++--- src/node_device/node_device_udev.c | 5 +- src/util/virmdev.c | 12 + src/util/virmdev.h | 11 + 12 files changed, 425 insertions(+), 43 deletions(-)
Codewise, this looks good. I will let Erik review the semantics of creating mdevs.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This is notably lacking any unit test coverage, so is not validating the RNG schema or the XML parser or conversion of XML to mdevctl args. I think that needs fixing before we accept it.
Agreed.
So I played with the series and found a want to make a few points.
Apparently, this is the minimalistic XML that would work: <device> <parent>pci_0000_06_00_0</parent> <capability type='mdev'> <type id='nvidia-11'/> <iommuGroup number='71'/> </capability> </device>
Which means we should make iommuGroup optional, because it's a readonly element, users are not supposed to specify the iommu group and as a setting it's also ignored, because it's figured out by the parent device driver (I think) when the mdev is created.
Like Daniel mentioned, some documentation would be nice, especially clarifying that the <name> and <path> elements are also read only and any attempt to set them would be ignored - well, simply because we're reusing the XML structure we've been using for ages to only report results back, not consume them back ourselves.
It's good to think ahead to the future with the additional attributes, but I don't know about any attributes that vGPUs would accept, so I can't comment on that really even if I wanted to, have you tried with some other non-vGPU type of mdev?
I finally got to try the mdevctl utility directly and seeing what it can do and how it does things, I have to re-iterate the question what benefit does libvirt bring in terms of creating/defining the mdevs?
Libvirt provides a consistent API to control the host, with authenticaton, acess control and separation. In OpenStack case, Nova runs as a non-root account and so anything where libvirt doesn't expose functionality would force them to resort to sudo rules which is not attractive. In the OpenShift demo installer, the libvirt client is running inside a container which is permitted to connect to libvirt. They don't ave ability to run mdevctl at all given the separate container image they're in. There's going to be similar benefits for other applications. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Wed, May 20, 2020 at 10:09:54AM +0100, Daniel P. Berrangé wrote:
On Wed, May 20, 2020 at 10:32:05AM +0200, Erik Skultety wrote:
On Mon, May 11, 2020 at 05:51:13PM +0100, Daniel P. Berrangé wrote:
On Mon, May 11, 2020 at 03:28:02PM +0200, Michal Privoznik wrote:
On 4/30/20 11:42 PM, Jonathon Jongsma wrote:
This is the first portion of an effort to support persistent mediated devices with libvirt. This first series simply enables creating and destroying non-persistent mediated devices via the virNodeDeviceCreateXML() and virNodeDeviceDestroy() functions. The 'mdevctl' utility[1] provides the backend implementation.
[1] https://github.com/mdevctl/mdevctl
Jonathon Jongsma (6): nodedev: factor out nodeDeviceHasCapability() nodedev: add support for mdev attributes nodedev: refactor nodeDeviceFindNewDevice() nodedev: store mdev UUID in mdev caps nodedev: add mdev support to virNodeDeviceCreateXML() nodedev: add mdev support to virNodeDeviceDestroy()
docs/schemas/nodedev.rng | 6 +
docs/formatnode.html.in needs some documentation and examples
libvirt.spec.in | 3 + m4/virt-external-programs.m4 | 3 + src/conf/node_device_conf.c | 59 ++++- src/conf/node_device_conf.h | 3 + src/conf/virnodedeviceobj.c | 34 +++ src/conf/virnodedeviceobj.h | 3 + src/libvirt_private.syms | 3 + src/node_device/node_device_driver.c | 326 ++++++++++++++++++++++++--- src/node_device/node_device_udev.c | 5 +- src/util/virmdev.c | 12 + src/util/virmdev.h | 11 + 12 files changed, 425 insertions(+), 43 deletions(-)
Codewise, this looks good. I will let Erik review the semantics of creating mdevs.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This is notably lacking any unit test coverage, so is not validating the RNG schema or the XML parser or conversion of XML to mdevctl args. I think that needs fixing before we accept it.
Agreed.
So I played with the series and found a want to make a few points.
Apparently, this is the minimalistic XML that would work: <device> <parent>pci_0000_06_00_0</parent> <capability type='mdev'> <type id='nvidia-11'/> <iommuGroup number='71'/> </capability> </device>
Which means we should make iommuGroup optional, because it's a readonly element, users are not supposed to specify the iommu group and as a setting it's also ignored, because it's figured out by the parent device driver (I think) when the mdev is created.
Like Daniel mentioned, some documentation would be nice, especially clarifying that the <name> and <path> elements are also read only and any attempt to set them would be ignored - well, simply because we're reusing the XML structure we've been using for ages to only report results back, not consume them back ourselves.
It's good to think ahead to the future with the additional attributes, but I don't know about any attributes that vGPUs would accept, so I can't comment on that really even if I wanted to, have you tried with some other non-vGPU type of mdev?
I finally got to try the mdevctl utility directly and seeing what it can do and how it does things, I have to re-iterate the question what benefit does libvirt bring in terms of creating/defining the mdevs?
Libvirt provides a consistent API to control the host, with authenticaton, acess control and separation. In OpenStack case, Nova runs as a non-root account and so anything where libvirt doesn't expose functionality would force them to resort to sudo rules which is not attractive. In the OpenShift demo installer, the libvirt client is running inside a container which is permitted to connect to libvirt. They don't ave ability to run mdevctl at all given the separate container image they're in. There's going to be similar benefits for other applications.
Okay, thanks for refreshing my memory, the OpenShift use case is new to me, but it makes sense.
participants (4)
-
Daniel P. Berrangé
-
Erik Skultety
-
Jonathon Jongsma
-
Michal Privoznik