[libvirt PATCH 0/7] Accumulated mdev fixes

These patches fix a couple of outstanding issues with mdev support. These include: 1. output proper xml for the mdev parent device when run under the test suite 2. mdevctl allows multiple devices with the same UUID (but different parents), so we have to be prepared to handle this. This also requires a change in nodedev name for mdev devices. From 'mdev_$UUID' to 'mdev_$UUID_$PARENTADDR' 3. validate input xml when defining or creating mdevs (e.g. ensure we have a proper parent device) This patch obsoletes the previous patch titled "nodedev: Handle inactive mdevs with the same UUID" Jonathon Jongsma (7): nodedev: add internal virNodeDeviceObjListFind() nodedev: fix xml output for mdev parents in test suite nodedev: cache parent address in mdev caps nodedev: Add parser validation for node devices nodedev: add PostParse callback for nodedev parsing nodedev: Handle inactive mdevs with the same UUID nodedev: look up mdevs by UUID and parent src/conf/node_device_conf.c | 37 +++- src/conf/node_device_conf.h | 21 +- src/conf/virnodedeviceobj.c | 71 +++++-- src/conf/virnodedeviceobj.h | 15 +- src/hypervisor/domain_driver.c | 7 +- src/libvirt_private.syms | 1 + src/node_device/node_device_driver.c | 182 ++++++++++++++---- src/node_device/node_device_driver.h | 6 + src/node_device/node_device_udev.c | 21 +- src/test/test_driver.c | 6 +- .../mdevctl-list-multiple.out.xml | 16 +- tests/nodedevmdevctltest.c | 36 +++- tests/nodedevxml2xmltest.c | 3 +- 13 files changed, 338 insertions(+), 84 deletions(-) -- 2.31.1

This is a generic function that you can provide your own predicate function to search for a particular device. It will be used in an upcoming commit. Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/conf/virnodedeviceobj.c | 53 +++++++++++++++++++++++++++++++------ src/conf/virnodedeviceobj.h | 11 +++++--- src/libvirt_private.syms | 1 + 3 files changed, 54 insertions(+), 11 deletions(-) diff --git a/src/conf/virnodedeviceobj.c b/src/conf/virnodedeviceobj.c index b213592b56..6e7b354f96 100644 --- a/src/conf/virnodedeviceobj.c +++ b/src/conf/virnodedeviceobj.c @@ -1026,19 +1026,19 @@ virNodeDeviceObjSetPersistent(virNodeDeviceObj *obj, } -struct virNodeDeviceObjListRemoveHelperData +typedef struct { - virNodeDeviceObjListRemoveIterator callback; + virNodeDeviceObjListPredicate predicate; void *opaque; -}; +} PredicateHelperData; static int virNodeDeviceObjListRemoveHelper(void *key G_GNUC_UNUSED, void *value, void *opaque) { - struct virNodeDeviceObjListRemoveHelperData *data = opaque; + PredicateHelperData *data = opaque; - return data->callback(value, data->opaque); + return data->predicate(value, data->opaque); } @@ -1054,11 +1054,11 @@ static int virNodeDeviceObjListRemoveHelper(void *key G_GNUC_UNUSED, */ void virNodeDeviceObjListForEachRemove(virNodeDeviceObjList *devs, - virNodeDeviceObjListRemoveIterator callback, + virNodeDeviceObjListPredicate callback, void *opaque) { - struct virNodeDeviceObjListRemoveHelperData data = { - .callback = callback, + PredicateHelperData data = { + .predicate = callback, .opaque = opaque }; @@ -1068,3 +1068,40 @@ virNodeDeviceObjListForEachRemove(virNodeDeviceObjList *devs, &data); virObjectRWUnlock(devs); } + + +static int virNodeDeviceObjListFindHelper(const void *payload, + const char *name G_GNUC_UNUSED, + const void *opaque) +{ + PredicateHelperData *data = (PredicateHelperData *) opaque; + virNodeDeviceObj *obj = (virNodeDeviceObj *) payload; + + return data->predicate(obj, data->opaque); +} + + +/** + * virNodeDeviceObjListFind + * @devs: Pointer to object list + * @predicate: function to test the device for a certain property + * @opaque: Opaque data to use as argument to helper + * + * For each object in @devs, call the @predicate helper using @opaque as + * an argument until it returns TRUE. The list may not be modified while + * iterating. + */ +virNodeDeviceObj * +virNodeDeviceObjListFind(virNodeDeviceObjList *devs, + virNodeDeviceObjListPredicate predicate, + void *opaque) +{ + PredicateHelperData data = { + .predicate = predicate, + .opaque = opaque + }; + + return virNodeDeviceObjListSearch(devs, + virNodeDeviceObjListFindHelper, + &data); +} diff --git a/src/conf/virnodedeviceobj.h b/src/conf/virnodedeviceobj.h index 7353e4619b..0cb78748a4 100644 --- a/src/conf/virnodedeviceobj.h +++ b/src/conf/virnodedeviceobj.h @@ -136,9 +136,14 @@ void virNodeDeviceObjSetPersistent(virNodeDeviceObj *obj, bool persistent); -typedef bool (*virNodeDeviceObjListRemoveIterator)(virNodeDeviceObj *obj, - const void *opaque); +typedef bool (*virNodeDeviceObjListPredicate)(virNodeDeviceObj *obj, + const void *opaque); void virNodeDeviceObjListForEachRemove(virNodeDeviceObjList *devs, - virNodeDeviceObjListRemoveIterator callback, + virNodeDeviceObjListPredicate callback, void *opaque); + +virNodeDeviceObj * +virNodeDeviceObjListFind(virNodeDeviceObjList *devs, + virNodeDeviceObjListPredicate callback, + void *opaque); diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 4bc2974e7f..9d16d7c6b3 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1287,6 +1287,7 @@ virNodeDeviceObjIsActive; virNodeDeviceObjIsPersistent; virNodeDeviceObjListAssignDef; virNodeDeviceObjListExport; +virNodeDeviceObjListFind; virNodeDeviceObjListFindByName; virNodeDeviceObjListFindBySysfsPath; virNodeDeviceObjListFindMediatedDeviceByUUID; -- 2.31.1

On 7/23/21 6:40 PM, Jonathon Jongsma wrote:
This is a generic function that you can provide your own predicate function to search for a particular device. It will be used in an upcoming commit.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/conf/virnodedeviceobj.c | 53 +++++++++++++++++++++++++++++++------ src/conf/virnodedeviceobj.h | 11 +++++--- src/libvirt_private.syms | 1 + 3 files changed, 54 insertions(+), 11 deletions(-)
diff --git a/src/conf/virnodedeviceobj.c b/src/conf/virnodedeviceobj.c index b213592b56..6e7b354f96 100644 --- a/src/conf/virnodedeviceobj.c +++ b/src/conf/virnodedeviceobj.c @@ -1026,19 +1026,19 @@ virNodeDeviceObjSetPersistent(virNodeDeviceObj *obj, }
-struct virNodeDeviceObjListRemoveHelperData +typedef struct { - virNodeDeviceObjListRemoveIterator callback; + virNodeDeviceObjListPredicate predicate; void *opaque; -}; +} PredicateHelperData;
We usually split this into: typedef struct _virXXX virXXX; struct _virXXX { ... }; and honestly I've never understood why. I blamed coding style. Michal

Commit 51fbbfdce8 attempted to get the proper nodedev name for the parent of an defined mdev by traversing the filesystem and looking for a device that had the appropriate sysfs path. This works, but it would be cleaner to to avoid mucking around in the filesystem and instead just just examine the list of devices we have in memory. We already had a function nodeDeviceFindAddressByName() which constructs an address for parent device in a format that can be used with mdevctl. So if we refactor this function into a a function that simply formats an address for an arbitrary virNodeDeviceObj*, then we can use this function as a predicate for our new virNodeDeviceObjListFind() function from the previous commit. This will search our list of devices for one whose address matches the address we get from mdevctl. One nice benefit of this approach is that our test cases will now display xml output with the proper parent name for mdevs (assuming that we've added the appropriate mock parent devices to the test driver). Previously they just displayed 'computer' for the parent because the alternative would have required specially constructing a mock filesystem environment with a sysfs that mapped to the appropriate parent. Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/node_device/node_device_driver.c | 55 ++++++++++--------- .../mdevctl-list-multiple.out.xml | 8 +-- tests/nodedevmdevctltest.c | 28 +++++++++- 3 files changed, 59 insertions(+), 32 deletions(-) diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index 6f8968f1f0..a3a261f08b 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -648,17 +648,11 @@ nodeDeviceDefToMdevctlConfig(virNodeDeviceDef *def, char **buf) static char * -nodeDeviceFindAddressByName(const char *name) +nodeDeviceObjFormatAddress(virNodeDeviceObj *obj) { - virNodeDeviceDef *def = NULL; virNodeDevCapsDef *caps = NULL; char *addr = NULL; - virNodeDeviceObj *dev = virNodeDeviceObjListFindByName(driver->devs, name); - - if (!dev) - return NULL; - - def = virNodeDeviceObjGetDef(dev); + virNodeDeviceDef *def = virNodeDeviceObjGetDef(obj); for (caps = def->caps; caps != NULL; caps = caps->next) { switch (caps->data.type) { case VIR_NODE_DEV_CAP_PCI_DEV: { @@ -714,8 +708,6 @@ nodeDeviceFindAddressByName(const char *name) break; } - virNodeDeviceObjEndAPI(&dev); - return addr; } @@ -730,6 +722,7 @@ nodeDeviceGetMdevctlCommand(virNodeDeviceDef *def, g_autoptr(virCommand) cmd = NULL; const char *subcommand = virMdevctlCommandTypeToString(cmd_type); g_autofree char *inbuf = NULL; + virNodeDeviceObj *parent_obj = NULL; switch (cmd_type) { case MDEVCTL_CMD_CREATE: @@ -754,7 +747,10 @@ nodeDeviceGetMdevctlCommand(virNodeDeviceDef *def, switch (cmd_type) { case MDEVCTL_CMD_CREATE: case MDEVCTL_CMD_DEFINE: - parent_addr = nodeDeviceFindAddressByName(def->parent); + if ((parent_obj = nodeDeviceObjFindByName(def->parent))) { + parent_addr = nodeDeviceObjFormatAddress(parent_obj); + virNodeDeviceObjEndAPI(&parent_obj); + } if (!parent_addr) { virReportError(VIR_ERR_INTERNAL_ERROR, @@ -1042,6 +1038,21 @@ static void mdevGenerateDeviceName(virNodeDeviceDef *dev) } +static bool +matchDeviceAddress(virNodeDeviceObj *obj, + const void *opaque) +{ + g_autofree char *addr = NULL; + bool want = false; + + virObjectLock(obj); + addr = nodeDeviceObjFormatAddress(obj); + want = STREQ_NULLABLE(addr, opaque); + virObjectUnlock(obj); + return want; +} + + static virNodeDeviceDef* nodeDeviceParseMdevctlChildDevice(const char *parent, virJSONValue *json) @@ -1051,7 +1062,7 @@ nodeDeviceParseMdevctlChildDevice(const char *parent, virJSONValue *props; virJSONValue *attrs; g_autoptr(virNodeDeviceDef) child = g_new0(virNodeDeviceDef, 1); - g_autofree char *parent_sysfs_path = NULL; + virNodeDeviceObj *parent_obj; /* the child object should have a single key equal to its uuid. * The value is an object describing the properties of the mdev */ @@ -1064,20 +1075,12 @@ nodeDeviceParseMdevctlChildDevice(const char *parent, /* Look up id of parent device. mdevctl supports defining mdevs for parent * devices that are not present on the system (to support starting mdevs on * hotplug, etc) so the parent may not actually exist. */ - parent_sysfs_path = g_strdup_printf("/sys/class/mdev_bus/%s", parent); - if (virFileExists(parent_sysfs_path)) { - g_autofree char *canon_syspath = virFileCanonicalizePath(parent_sysfs_path); - virNodeDeviceObj *parentobj = NULL; - - if ((parentobj = virNodeDeviceObjListFindBySysfsPath(driver->devs, - canon_syspath))) { - virNodeDeviceDef *parentdef = virNodeDeviceObjGetDef(parentobj); - child->parent = g_strdup(parentdef->name); - virNodeDeviceObjEndAPI(&parentobj); - - child->parent_sysfs_path = g_steal_pointer(&canon_syspath); - } - } + if ((parent_obj = virNodeDeviceObjListFind(driver->devs, matchDeviceAddress, + (void *)parent))) { + virNodeDeviceDef *parentdef = virNodeDeviceObjGetDef(parent_obj); + child->parent = g_strdup(parentdef->name); + virNodeDeviceObjEndAPI(&parent_obj); + }; if (!child->parent) child->parent = g_strdup("computer"); child->caps = g_new0(virNodeDevCapsDef, 1); diff --git a/tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml b/tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml index eead6f2a40..c23a3130c1 100644 --- a/tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml +++ b/tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml @@ -1,6 +1,6 @@ <device> <name>mdev_200f228a_c80a_4d50_bfb7_f5a0e4e34045</name> - <parent>computer</parent> + <parent>pci_0000_00_02_0</parent> <capability type='mdev'> <type id='i915-GVTg_V5_4'/> <uuid>200f228a-c80a-4d50-bfb7-f5a0e4e34045</uuid> @@ -9,7 +9,7 @@ </device> <device> <name>mdev_de807ffc_1923_4d5f_b6c9_b20ecebc6d4b</name> - <parent>computer</parent> + <parent>pci_0000_00_02_0</parent> <capability type='mdev'> <type id='i915-GVTg_V5_4'/> <uuid>de807ffc-1923-4d5f-b6c9-b20ecebc6d4b</uuid> @@ -18,7 +18,7 @@ </device> <device> <name>mdev_435722ea_5f43_468a_874f_da34f1217f13</name> - <parent>computer</parent> + <parent>pci_0000_00_02_0</parent> <capability type='mdev'> <type id='i915-GVTg_V5_8'/> <uuid>435722ea-5f43-468a-874f-da34f1217f13</uuid> @@ -28,7 +28,7 @@ </device> <device> <name>mdev_783e6dbb_ea0e_411f_94e2_717eaad438bf</name> - <parent>computer</parent> + <parent>matrix_matrix</parent> <capability type='mdev'> <type id='vfio_ap-passthrough'/> <uuid>783e6dbb-ea0e-411f-94e2-717eaad438bf</uuid> diff --git a/tests/nodedevmdevctltest.c b/tests/nodedevmdevctltest.c index e246de4d87..7af89ab5f1 100644 --- a/tests/nodedevmdevctltest.c +++ b/tests/nodedevmdevctltest.c @@ -231,7 +231,7 @@ fakeRootDevice(void) * parent of the mdev, and it needs a PCI address */ static virNodeDeviceDef * -fakeParentDevice(void) +fakePCIDevice(void) { virNodeDeviceDef *def = NULL; virNodeDevCapPCIDev *pci_dev; @@ -252,6 +252,29 @@ fakeParentDevice(void) return def; } + +/* Add a fake matrix device that can be used as a parent device for mediated + * devices. For our purposes, it only needs to have a name that matches the + * parent of the mdev, and it needs the proper name + */ +static virNodeDeviceDef * +fakeMatrixDevice(void) +{ + virNodeDeviceDef *def = NULL; + virNodeDevCapAPMatrix *cap; + + def = g_new0(virNodeDeviceDef, 1); + def->caps = g_new0(virNodeDevCapsDef, 1); + + def->name = g_strdup("matrix_matrix"); + def->parent = g_strdup("computer"); + + def->caps->data.type = VIR_NODE_DEV_CAP_AP_MATRIX; + cap = &def->caps->data.ap_matrix; + cap->addr = g_strdup("matrix"); + + return def; +} static int addDevice(virNodeDeviceDef *def) { @@ -274,7 +297,8 @@ static int nodedevTestDriverAddTestDevices(void) { if (addDevice(fakeRootDevice()) < 0 || - addDevice(fakeParentDevice()) < 0) + addDevice(fakePCIDevice()) < 0 || + addDevice(fakeMatrixDevice()) < 0) return -1; return 0; -- 2.31.1

On 7/23/21 6:40 PM, Jonathon Jongsma wrote:
Commit 51fbbfdce8 attempted to get the proper nodedev name for the parent of an defined mdev by traversing the filesystem and looking for a device that had the appropriate sysfs path. This works, but it would be cleaner to to avoid mucking around in the filesystem and instead just just examine the list of devices we have in memory.
We already had a function nodeDeviceFindAddressByName() which constructs an address for parent device in a format that can be used with mdevctl. So if we refactor this function into a a function that simply formats an address for an arbitrary virNodeDeviceObj*, then we can use this function as a predicate for our new virNodeDeviceObjListFind() function from the previous commit. This will search our list of devices for one whose address matches the address we get from mdevctl.
One nice benefit of this approach is that our test cases will now display xml output with the proper parent name for mdevs (assuming that we've added the appropriate mock parent devices to the test driver). Previously they just displayed 'computer' for the parent because the alternative would have required specially constructing a mock filesystem environment with a sysfs that mapped to the appropriate parent.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/node_device/node_device_driver.c | 55 ++++++++++--------- .../mdevctl-list-multiple.out.xml | 8 +-- tests/nodedevmdevctltest.c | 28 +++++++++- 3 files changed, 59 insertions(+), 32 deletions(-)
diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index 6f8968f1f0..a3a261f08b 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -648,17 +648,11 @@ nodeDeviceDefToMdevctlConfig(virNodeDeviceDef *def, char **buf)
static char * -nodeDeviceFindAddressByName(const char *name) +nodeDeviceObjFormatAddress(virNodeDeviceObj *obj) { - virNodeDeviceDef *def = NULL; virNodeDevCapsDef *caps = NULL; char *addr = NULL; - virNodeDeviceObj *dev = virNodeDeviceObjListFindByName(driver->devs, name); - - if (!dev) - return NULL; - - def = virNodeDeviceObjGetDef(dev); + virNodeDeviceDef *def = virNodeDeviceObjGetDef(obj); for (caps = def->caps; caps != NULL; caps = caps->next) { switch (caps->data.type) { case VIR_NODE_DEV_CAP_PCI_DEV: { @@ -714,8 +708,6 @@ nodeDeviceFindAddressByName(const char *name) break; }
- virNodeDeviceObjEndAPI(&dev); - return addr; }
@@ -730,6 +722,7 @@ nodeDeviceGetMdevctlCommand(virNodeDeviceDef *def, g_autoptr(virCommand) cmd = NULL; const char *subcommand = virMdevctlCommandTypeToString(cmd_type); g_autofree char *inbuf = NULL; + virNodeDeviceObj *parent_obj = NULL;
switch (cmd_type) { case MDEVCTL_CMD_CREATE: @@ -754,7 +747,10 @@ nodeDeviceGetMdevctlCommand(virNodeDeviceDef *def, switch (cmd_type) { case MDEVCTL_CMD_CREATE: case MDEVCTL_CMD_DEFINE: - parent_addr = nodeDeviceFindAddressByName(def->parent); + if ((parent_obj = nodeDeviceObjFindByName(def->parent))) { + parent_addr = nodeDeviceObjFormatAddress(parent_obj); + virNodeDeviceObjEndAPI(&parent_obj); + }
if (!parent_addr) { virReportError(VIR_ERR_INTERNAL_ERROR, @@ -1042,6 +1038,21 @@ static void mdevGenerateDeviceName(virNodeDeviceDef *dev) }
+static bool +matchDeviceAddress(virNodeDeviceObj *obj, + const void *opaque) +{ + g_autofree char *addr = NULL; + bool want = false; + + virObjectLock(obj); + addr = nodeDeviceObjFormatAddress(obj); + want = STREQ_NULLABLE(addr, opaque); + virObjectUnlock(obj); + return want; +} + + static virNodeDeviceDef* nodeDeviceParseMdevctlChildDevice(const char *parent, virJSONValue *json) @@ -1051,7 +1062,7 @@ nodeDeviceParseMdevctlChildDevice(const char *parent, virJSONValue *props; virJSONValue *attrs; g_autoptr(virNodeDeviceDef) child = g_new0(virNodeDeviceDef, 1); - g_autofree char *parent_sysfs_path = NULL; + virNodeDeviceObj *parent_obj;
/* the child object should have a single key equal to its uuid. * The value is an object describing the properties of the mdev */ @@ -1064,20 +1075,12 @@ nodeDeviceParseMdevctlChildDevice(const char *parent, /* Look up id of parent device. mdevctl supports defining mdevs for parent * devices that are not present on the system (to support starting mdevs on * hotplug, etc) so the parent may not actually exist. */ - parent_sysfs_path = g_strdup_printf("/sys/class/mdev_bus/%s", parent); - if (virFileExists(parent_sysfs_path)) { - g_autofree char *canon_syspath = virFileCanonicalizePath(parent_sysfs_path); - virNodeDeviceObj *parentobj = NULL; - - if ((parentobj = virNodeDeviceObjListFindBySysfsPath(driver->devs, - canon_syspath))) { - virNodeDeviceDef *parentdef = virNodeDeviceObjGetDef(parentobj); - child->parent = g_strdup(parentdef->name); - virNodeDeviceObjEndAPI(&parentobj); - - child->parent_sysfs_path = g_steal_pointer(&canon_syspath); - } - } + if ((parent_obj = virNodeDeviceObjListFind(driver->devs, matchDeviceAddress, + (void *)parent))) { + virNodeDeviceDef *parentdef = virNodeDeviceObjGetDef(parent_obj); + child->parent = g_strdup(parentdef->name); + virNodeDeviceObjEndAPI(&parent_obj); + }; if (!child->parent) child->parent = g_strdup("computer"); child->caps = g_new0(virNodeDevCapsDef, 1); diff --git a/tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml b/tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml index eead6f2a40..c23a3130c1 100644 --- a/tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml +++ b/tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml @@ -1,6 +1,6 @@ <device> <name>mdev_200f228a_c80a_4d50_bfb7_f5a0e4e34045</name> - <parent>computer</parent> + <parent>pci_0000_00_02_0</parent> <capability type='mdev'> <type id='i915-GVTg_V5_4'/> <uuid>200f228a-c80a-4d50-bfb7-f5a0e4e34045</uuid> @@ -9,7 +9,7 @@ </device> <device> <name>mdev_de807ffc_1923_4d5f_b6c9_b20ecebc6d4b</name> - <parent>computer</parent> + <parent>pci_0000_00_02_0</parent> <capability type='mdev'> <type id='i915-GVTg_V5_4'/> <uuid>de807ffc-1923-4d5f-b6c9-b20ecebc6d4b</uuid> @@ -18,7 +18,7 @@ </device> <device> <name>mdev_435722ea_5f43_468a_874f_da34f1217f13</name> - <parent>computer</parent> + <parent>pci_0000_00_02_0</parent> <capability type='mdev'> <type id='i915-GVTg_V5_8'/> <uuid>435722ea-5f43-468a-874f-da34f1217f13</uuid> @@ -28,7 +28,7 @@ </device> <device> <name>mdev_783e6dbb_ea0e_411f_94e2_717eaad438bf</name> - <parent>computer</parent> + <parent>matrix_matrix</parent>
Hello Jonathon, The nodedev object name of a matrix device is ap_matrix, so please change matrix_matrix to ap_matrix.
<capability type='mdev'> <type id='vfio_ap-passthrough'/> <uuid>783e6dbb-ea0e-411f-94e2-717eaad438bf</uuid> diff --git a/tests/nodedevmdevctltest.c b/tests/nodedevmdevctltest.c index e246de4d87..7af89ab5f1 100644 --- a/tests/nodedevmdevctltest.c +++ b/tests/nodedevmdevctltest.c @@ -231,7 +231,7 @@ fakeRootDevice(void) * parent of the mdev, and it needs a PCI address */ static virNodeDeviceDef * -fakeParentDevice(void) +fakePCIDevice(void) { virNodeDeviceDef *def = NULL; virNodeDevCapPCIDev *pci_dev; @@ -252,6 +252,29 @@ fakeParentDevice(void) return def; }
+ +/* Add a fake matrix device that can be used as a parent device for mediated + * devices. For our purposes, it only needs to have a name that matches the + * parent of the mdev, and it needs the proper name + */ +static virNodeDeviceDef * +fakeMatrixDevice(void) +{ + virNodeDeviceDef *def = NULL; + virNodeDevCapAPMatrix *cap; + + def = g_new0(virNodeDeviceDef, 1); + def->caps = g_new0(virNodeDevCapsDef, 1); + + def->name = g_strdup("matrix_matrix");
Please change matrix_matrix to ap_matrix in here as well, thank you.
+ def->parent = g_strdup("computer"); + + def->caps->data.type = VIR_NODE_DEV_CAP_AP_MATRIX; + cap = &def->caps->data.ap_matrix; + cap->addr = g_strdup("matrix"); + + return def; +} static int addDevice(virNodeDeviceDef *def) { @@ -274,7 +297,8 @@ static int nodedevTestDriverAddTestDevices(void) { if (addDevice(fakeRootDevice()) < 0 || - addDevice(fakeParentDevice()) < 0) + addDevice(fakePCIDevice()) < 0 || + addDevice(fakeMatrixDevice()) < 0) return -1;
return 0;
-- Kind regards Shalini Chellathurai Saroja Linux on Z and Virtualization Development Vorsitzende des Aufsichtsrats: Gregor Pillen Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294

mdevctl can report multiple defined devices with the same UUID but different parents, including parents that don't actually exist on the host machine. Libvirt sets the parent to the 'computer' device for all of the mdevs that have nonexistent parents. Because of this, it's possible that there are multiple devices with the same UUID and the same 'computer' device as their parent, so the combination of uuid and parent 'nodedev name is not guaranteed to be a unique name. We need to ensure that each nodedev has a unique name. If we can't use the UUID as a unique nodedev name, and we can't use the combination of UUID and nodedev parent name, we need to find another solution. By caching and using the parent name reported by mdevctl in combination with the UUID, we can achieve a unique name. mdevctl guarantees that its uuid/parent combination is unique. This value will be used to set the mdev nodedev name in a following commit. Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/conf/node_device_conf.h | 1 + src/node_device/node_device_driver.c | 1 + src/node_device/node_device_udev.c | 16 ++++++++++++++++ 3 files changed, 18 insertions(+) diff --git a/src/conf/node_device_conf.h b/src/conf/node_device_conf.h index a60562e4fe..556878b9a8 100644 --- a/src/conf/node_device_conf.h +++ b/src/conf/node_device_conf.h @@ -153,6 +153,7 @@ struct _virNodeDevCapMdev { char *uuid; virMediatedDeviceAttr **attributes; size_t nattributes; + char *parent_addr; }; typedef struct _virNodeDevCapPCIDev virNodeDevCapPCIDev; diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index a3a261f08b..ad2ca2a614 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -1088,6 +1088,7 @@ nodeDeviceParseMdevctlChildDevice(const char *parent, mdev = &child->caps->data.mdev; mdev->uuid = g_strdup(uuid); + mdev->parent_addr = g_strdup(parent); mdev->type = g_strdup(virJSONValueObjectGetString(props, "mdev_type")); diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index f99578414d..e7db74b325 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -1024,6 +1024,7 @@ udevProcessMediatedDevice(struct udev_device *dev, char *linkpath = NULL; char *canonicalpath = NULL; virNodeDevCapMdev *data = &def->caps->data.mdev; + struct udev_device *parent_device = NULL; /* Because of a kernel uevent race, we might get the 'add' event prior to * the sysfs tree being ready, so any attempt to access any sysfs attribute @@ -1051,6 +1052,21 @@ udevProcessMediatedDevice(struct udev_device *dev, if ((iommugrp = virMediatedDeviceGetIOMMUGroupNum(data->uuid)) < 0) goto cleanup; + /* lookup the address of parent device */ + parent_device = udev_device_get_parent(dev); + if (parent_device) { + const char *parent_sysfs_path = udev_device_get_syspath(parent_device); + if (parent_sysfs_path) + data->parent_addr = g_path_get_basename(parent_sysfs_path); + } + + if (!data->parent_addr) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Could not get parent of '%s'"), + udev_device_get_syspath(dev)); + return -1; + } + udevGenerateDeviceName(dev, def, NULL); data->iommuGroupNumber = iommugrp; -- 2.31.1

On 7/23/21 6:40 PM, Jonathon Jongsma wrote:
mdevctl can report multiple defined devices with the same UUID but different parents, including parents that don't actually exist on the host machine. Libvirt sets the parent to the 'computer' device for all of the mdevs that have nonexistent parents. Because of this, it's possible that there are multiple devices with the same UUID and the same 'computer' device as their parent, so the combination of uuid and parent 'nodedev name is not guaranteed to be a unique name.
We need to ensure that each nodedev has a unique name. If we can't use the UUID as a unique nodedev name, and we can't use the combination of UUID and nodedev parent name, we need to find another solution. By caching and using the parent name reported by mdevctl in combination with the UUID, we can achieve a unique name. mdevctl guarantees that its uuid/parent combination is unique.
This value will be used to set the mdev nodedev name in a following commit.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/conf/node_device_conf.h | 1 + src/node_device/node_device_driver.c | 1 + src/node_device/node_device_udev.c | 16 ++++++++++++++++ 3 files changed, 18 insertions(+)
The following should be squashed in to prevent memleak: diff --git i/src/conf/node_device_conf.c w/src/conf/node_device_conf.c index cd1c07092d..80ff0f973f 100644 --- i/src/conf/node_device_conf.c +++ w/src/conf/node_device_conf.c @@ -2329,6 +2329,7 @@ virNodeDevCapsDefFree(virNodeDevCapsDef *caps) for (i = 0; i < data->mdev.nattributes; i++) virMediatedDeviceAttrFree(data->mdev.attributes[i]); g_free(data->mdev.attributes); + g_free(data->mdev.parent_addr); break; case VIR_NODE_DEV_CAP_CSS_DEV: for (i = 0; i < data->ccw_dev.nmdev_types; i++) Michal

At the moment, this is only for mediated devices. When a new mediated device is created or defined, the xml is expected specify the nodedev name of an existing device as its parent. We were not previously validating this and were simply accepting any string here. Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/conf/node_device_conf.c | 30 +++++++++--- src/conf/node_device_conf.h | 20 +++++++- src/conf/virnodedeviceobj.h | 1 + src/hypervisor/domain_driver.c | 7 +-- src/node_device/node_device_driver.c | 68 +++++++++++++++++++++++++++- src/node_device/node_device_driver.h | 3 ++ src/node_device/node_device_udev.c | 2 + src/test/test_driver.c | 6 ++- tests/nodedevmdevctltest.c | 7 ++- tests/nodedevxml2xmltest.c | 3 +- 10 files changed, 129 insertions(+), 18 deletions(-) diff --git a/src/conf/node_device_conf.c b/src/conf/node_device_conf.c index 332b12f997..cd1c07092d 100644 --- a/src/conf/node_device_conf.c +++ b/src/conf/node_device_conf.c @@ -2174,10 +2174,12 @@ static virNodeDeviceDef * virNodeDeviceDefParse(const char *str, const char *filename, int create, - const char *virt_type) + const char *virt_type, + virNodeDeviceDefParserCallbacks *parserCallbacks, + void *opaque) { xmlDocPtr xml; - virNodeDeviceDef *def = NULL; + g_autoptr(virNodeDeviceDef) def = NULL; if ((xml = virXMLParse(filename, str, _("(node_device_definition)")))) { def = virNodeDeviceDefParseNode(xml, xmlDocGetRootElement(xml), @@ -2185,25 +2187,39 @@ virNodeDeviceDefParse(const char *str, xmlFreeDoc(xml); } - return def; + if (parserCallbacks) { + int ret = 0; + /* validate definition */ + if (parserCallbacks->validate) { + ret = parserCallbacks->validate(def, opaque); + if (ret < 0) + return NULL; + } + } + + return g_steal_pointer(&def); } virNodeDeviceDef * virNodeDeviceDefParseString(const char *str, int create, - const char *virt_type) + const char *virt_type, + virNodeDeviceDefParserCallbacks *parserCallbacks, + void *opaque) { - return virNodeDeviceDefParse(str, NULL, create, virt_type); + return virNodeDeviceDefParse(str, NULL, create, virt_type, parserCallbacks, opaque); } virNodeDeviceDef * virNodeDeviceDefParseFile(const char *filename, int create, - const char *virt_type) + const char *virt_type, + virNodeDeviceDefParserCallbacks *parserCallbacks, + void *opaque) { - return virNodeDeviceDefParse(NULL, filename, create, virt_type); + return virNodeDeviceDefParse(NULL, filename, create, virt_type, parserCallbacks, opaque); } diff --git a/src/conf/node_device_conf.h b/src/conf/node_device_conf.h index 556878b9a8..786de85060 100644 --- a/src/conf/node_device_conf.h +++ b/src/conf/node_device_conf.h @@ -349,15 +349,31 @@ struct _virNodeDeviceDef { char * virNodeDeviceDefFormat(const virNodeDeviceDef *def); + +typedef int (*virNodeDeviceDefPostParseCallback)(virNodeDeviceDef *dev, + void *opaque); + +typedef int (*virNodeDeviceDefValidateCallback)(virNodeDeviceDef *dev, + void *opaque); + +typedef struct _virNodeDeviceDefParserCallbacks { + virNodeDeviceDefPostParseCallback postParse; + virNodeDeviceDefValidateCallback validate; +} virNodeDeviceDefParserCallbacks; + virNodeDeviceDef * virNodeDeviceDefParseString(const char *str, int create, - const char *virt_type); + const char *virt_type, + virNodeDeviceDefParserCallbacks *callbacks, + void *opaque); virNodeDeviceDef * virNodeDeviceDefParseFile(const char *filename, int create, - const char *virt_type); + const char *virt_type, + virNodeDeviceDefParserCallbacks *callbacks, + void *opaque); virNodeDeviceDef * virNodeDeviceDefParseNode(xmlDocPtr xml, diff --git a/src/conf/virnodedeviceobj.h b/src/conf/virnodedeviceobj.h index 0cb78748a4..1fdd4f65da 100644 --- a/src/conf/virnodedeviceobj.h +++ b/src/conf/virnodedeviceobj.h @@ -47,6 +47,7 @@ struct _virNodeDeviceDriverState { /* Immutable pointer, self-locking APIs */ virObjectEventState *nodeDeviceEventState; + virNodeDeviceDefParserCallbacks parserCallbacks; }; void diff --git a/src/hypervisor/domain_driver.c b/src/hypervisor/domain_driver.c index 29e11c0447..2969d55173 100644 --- a/src/hypervisor/domain_driver.c +++ b/src/hypervisor/domain_driver.c @@ -395,7 +395,8 @@ virDomainDriverNodeDeviceReset(virNodeDevicePtr dev, if (!xml) return -1; - def = virNodeDeviceDefParseString(xml, EXISTING_DEVICE, NULL); + def = virNodeDeviceDefParseString(xml, EXISTING_DEVICE, NULL, + NULL, NULL); if (!def) return -1; @@ -440,7 +441,7 @@ virDomainDriverNodeDeviceReAttach(virNodeDevicePtr dev, if (!xml) return -1; - def = virNodeDeviceDefParseString(xml, EXISTING_DEVICE, NULL); + def = virNodeDeviceDefParseString(xml, EXISTING_DEVICE, NULL, NULL, NULL); if (!def) return -1; @@ -488,7 +489,7 @@ virDomainDriverNodeDeviceDetachFlags(virNodeDevicePtr dev, if (!xml) return -1; - def = virNodeDeviceDefParseString(xml, EXISTING_DEVICE, NULL); + def = virNodeDeviceDefParseString(xml, EXISTING_DEVICE, NULL, NULL, NULL); if (!def) return -1; diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index ad2ca2a614..8f39d79c68 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -896,7 +896,8 @@ nodeDeviceCreateXML(virConnectPtr conn, virt_type = virConnectGetType(conn); - if (!(def = virNodeDeviceDefParseString(xmlDesc, CREATE_DEVICE, virt_type))) + if (!(def = virNodeDeviceDefParseString(xmlDesc, CREATE_DEVICE, virt_type, + &driver->parserCallbacks, NULL))) return NULL; if (virNodeDeviceCreateXMLEnsureACL(conn, def) < 0) @@ -1376,7 +1377,8 @@ nodeDeviceDefineXML(virConnect *conn, virt_type = virConnectGetType(conn); - if (!(def = virNodeDeviceDefParseString(xmlDesc, CREATE_DEVICE, virt_type))) + if (!(def = virNodeDeviceDefParseString(xmlDesc, CREATE_DEVICE, virt_type, + &driver->parserCallbacks, NULL))) return NULL; if (virNodeDeviceDefineXMLEnsureACL(conn, def) < 0) @@ -1761,3 +1763,65 @@ nodeDeviceDefCopyFromMdevctl(virNodeDeviceDef *dst, return ret; } + + +/* validate that parent exists */ +static int nodeDeviceDefValidateMdev(virNodeDeviceDef *def, + G_GNUC_UNUSED virNodeDevCapMdev *mdev, + G_GNUC_UNUSED void *opaque) +{ + virNodeDeviceObj *obj = NULL; + if (!def->parent) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("missing parent device")); + return -1; + } + obj = virNodeDeviceObjListFindByName(driver->devs, def->parent); + if (!obj) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("invalid parent device '%s'"), + def->parent); + return -1; + } + + virNodeDeviceObjEndAPI(&obj); + return 0; +} + +int nodeDeviceDefValidate(virNodeDeviceDef *def, + G_GNUC_UNUSED void *opaque) +{ + virNodeDevCapsDef *caps = NULL; + for (caps = def->caps; caps != NULL; caps = caps->next) { + switch (caps->data.type) { + case VIR_NODE_DEV_CAP_MDEV: + if (nodeDeviceDefValidateMdev(def, &caps->data.mdev, opaque) < 0) + return -1; + break; + + case VIR_NODE_DEV_CAP_SYSTEM: + case VIR_NODE_DEV_CAP_PCI_DEV: + case VIR_NODE_DEV_CAP_USB_DEV: + case VIR_NODE_DEV_CAP_USB_INTERFACE: + case VIR_NODE_DEV_CAP_NET: + case VIR_NODE_DEV_CAP_SCSI_HOST: + case VIR_NODE_DEV_CAP_SCSI_TARGET: + case VIR_NODE_DEV_CAP_SCSI: + case VIR_NODE_DEV_CAP_STORAGE: + case VIR_NODE_DEV_CAP_FC_HOST: + case VIR_NODE_DEV_CAP_VPORTS: + case VIR_NODE_DEV_CAP_SCSI_GENERIC: + case VIR_NODE_DEV_CAP_DRM: + case VIR_NODE_DEV_CAP_MDEV_TYPES: + case VIR_NODE_DEV_CAP_CCW_DEV: + case VIR_NODE_DEV_CAP_CSS_DEV: + case VIR_NODE_DEV_CAP_VDPA: + case VIR_NODE_DEV_CAP_AP_CARD: + case VIR_NODE_DEV_CAP_AP_QUEUE: + case VIR_NODE_DEV_CAP_AP_MATRIX: + case VIR_NODE_DEV_CAP_LAST: + break; + } + } + return 0; +} diff --git a/src/node_device/node_device_driver.h b/src/node_device/node_device_driver.h index edd763f0e4..d9b9b7a961 100644 --- a/src/node_device/node_device_driver.h +++ b/src/node_device/node_device_driver.h @@ -171,3 +171,6 @@ bool nodeDeviceDefCopyFromMdevctl(virNodeDeviceDef *dst, int nodeDeviceCreate(virNodeDevice *dev, unsigned int flags); + +int nodeDeviceDefValidate(virNodeDeviceDef *def, + void *opaque); diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index e7db74b325..ccf94d8e7d 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -2243,6 +2243,8 @@ nodeStateInitialize(bool privileged, driver->privateData = priv; driver->nodeDeviceEventState = virObjectEventStateNew(); + driver->parserCallbacks.validate = nodeDeviceDefValidate; + if (udevPCITranslateInit(privileged) < 0) goto unlock; diff --git a/src/test/test_driver.c b/src/test/test_driver.c index 892dc978f2..c2d642ff6b 100644 --- a/src/test/test_driver.c +++ b/src/test/test_driver.c @@ -7481,7 +7481,8 @@ testNodeDeviceMockCreateVport(testDriver *driver, if (!xml) goto cleanup; - if (!(def = virNodeDeviceDefParseString(xml, EXISTING_DEVICE, NULL))) + if (!(def = virNodeDeviceDefParseString(xml, EXISTING_DEVICE, NULL, NULL, + NULL))) goto cleanup; VIR_FREE(def->name); @@ -7543,7 +7544,8 @@ testNodeDeviceCreateXML(virConnectPtr conn, virCheckFlags(0, NULL); - if (!(def = virNodeDeviceDefParseString(xmlDesc, CREATE_DEVICE, NULL))) + if (!(def = virNodeDeviceDefParseString(xmlDesc, CREATE_DEVICE, NULL, + NULL, NULL))) goto cleanup; /* We run this simply for validation - it essentially validates that diff --git a/tests/nodedevmdevctltest.c b/tests/nodedevmdevctltest.c index 7af89ab5f1..bf1d7b4984 100644 --- a/tests/nodedevmdevctltest.c +++ b/tests/nodedevmdevctltest.c @@ -12,6 +12,10 @@ #define VIRT_TYPE "QEMU" +static virNodeDeviceDefParserCallbacks parser_callbacks = { + .validate = nodeDeviceDefValidate +}; + struct TestInfo { const char *filename; virMdevctlCommand command; @@ -66,7 +70,8 @@ testMdevctlCmd(virMdevctlCommand cmd_type, return -1; } - if (!(def = virNodeDeviceDefParseFile(mdevxml, create, VIRT_TYPE))) + if (!(def = virNodeDeviceDefParseFile(mdevxml, create, VIRT_TYPE, + &parser_callbacks, NULL))) return -1; /* this function will set a stdin buffer containing the json configuration diff --git a/tests/nodedevxml2xmltest.c b/tests/nodedevxml2xmltest.c index e9716726d7..d8f81456ee 100644 --- a/tests/nodedevxml2xmltest.c +++ b/tests/nodedevxml2xmltest.c @@ -25,7 +25,8 @@ testCompareXMLToXMLFiles(const char *xml, const char *outfile) if (virTestLoadFile(xml, &xmlData) < 0) goto fail; - if (!(dev = virNodeDeviceDefParseString(xmlData, EXISTING_DEVICE, NULL))) + if (!(dev = virNodeDeviceDefParseString(xmlData, EXISTING_DEVICE, NULL, + NULL, NULL))) goto fail; /* Calculate some things that are not read in */ -- 2.31.1

On 7/23/21 6:40 PM, Jonathon Jongsma wrote:
At the moment, this is only for mediated devices. When a new mediated device is created or defined, the xml is expected specify the nodedev name of an existing device as its parent. We were not previously validating this and were simply accepting any string here.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/conf/node_device_conf.c | 30 +++++++++--- src/conf/node_device_conf.h | 20 +++++++- src/conf/virnodedeviceobj.h | 1 + src/hypervisor/domain_driver.c | 7 +-- src/node_device/node_device_driver.c | 68 +++++++++++++++++++++++++++- src/node_device/node_device_driver.h | 3 ++ src/node_device/node_device_udev.c | 2 + src/test/test_driver.c | 6 ++- tests/nodedevmdevctltest.c | 7 ++- tests/nodedevxml2xmltest.c | 3 +- 10 files changed, 129 insertions(+), 18 deletions(-)
diff --git a/src/conf/node_device_conf.c b/src/conf/node_device_conf.c index 332b12f997..cd1c07092d 100644 --- a/src/conf/node_device_conf.c +++ b/src/conf/node_device_conf.c @@ -2174,10 +2174,12 @@ static virNodeDeviceDef * virNodeDeviceDefParse(const char *str, const char *filename, int create, - const char *virt_type) + const char *virt_type, + virNodeDeviceDefParserCallbacks *parserCallbacks, + void *opaque) { xmlDocPtr xml; - virNodeDeviceDef *def = NULL; + g_autoptr(virNodeDeviceDef) def = NULL;
if ((xml = virXMLParse(filename, str, _("(node_device_definition)")))) { def = virNodeDeviceDefParseNode(xml, xmlDocGetRootElement(xml), @@ -2185,25 +2187,39 @@ virNodeDeviceDefParse(const char *str, xmlFreeDoc(xml); }
- return def; + if (parserCallbacks) { + int ret = 0; + /* validate definition */ + if (parserCallbacks->validate) { + ret = parserCallbacks->validate(def, opaque); + if (ret < 0) + return NULL;
Or simply without the @ret variable: if (parserCallbacks->validate && parserCallbacks->validate(def, opaque) < 0) return NULL; Maybe even a direct call of the ->validate() callback is sufficient because looking into the future there's not a case where parserCallbacks != NULL but ->validate == NULL; or is there?
+ } + } + + return g_steal_pointer(&def); }
virNodeDeviceDef * virNodeDeviceDefParseString(const char *str, int create, - const char *virt_type) + const char *virt_type, + virNodeDeviceDefParserCallbacks *parserCallbacks, + void *opaque) { - return virNodeDeviceDefParse(str, NULL, create, virt_type); + return virNodeDeviceDefParse(str, NULL, create, virt_type, parserCallbacks, opaque); }
virNodeDeviceDef * virNodeDeviceDefParseFile(const char *filename, int create, - const char *virt_type) + const char *virt_type, + virNodeDeviceDefParserCallbacks *parserCallbacks, + void *opaque) { - return virNodeDeviceDefParse(NULL, filename, create, virt_type); + return virNodeDeviceDefParse(NULL, filename, create, virt_type, parserCallbacks, opaque); }
diff --git a/src/conf/node_device_conf.h b/src/conf/node_device_conf.h index 556878b9a8..786de85060 100644 --- a/src/conf/node_device_conf.h +++ b/src/conf/node_device_conf.h @@ -349,15 +349,31 @@ struct _virNodeDeviceDef { char * virNodeDeviceDefFormat(const virNodeDeviceDef *def);
+ +typedef int (*virNodeDeviceDefPostParseCallback)(virNodeDeviceDef *dev, + void *opaque); + +typedef int (*virNodeDeviceDefValidateCallback)(virNodeDeviceDef *dev, + void *opaque); + +typedef struct _virNodeDeviceDefParserCallbacks { + virNodeDeviceDefPostParseCallback postParse; + virNodeDeviceDefValidateCallback validate; +} virNodeDeviceDefParserCallbacks; + virNodeDeviceDef * virNodeDeviceDefParseString(const char *str, int create, - const char *virt_type); + const char *virt_type, + virNodeDeviceDefParserCallbacks *callbacks, + void *opaque);
virNodeDeviceDef * virNodeDeviceDefParseFile(const char *filename, int create, - const char *virt_type); + const char *virt_type, + virNodeDeviceDefParserCallbacks *callbacks, + void *opaque);
virNodeDeviceDef * virNodeDeviceDefParseNode(xmlDocPtr xml, diff --git a/src/conf/virnodedeviceobj.h b/src/conf/virnodedeviceobj.h index 0cb78748a4..1fdd4f65da 100644 --- a/src/conf/virnodedeviceobj.h +++ b/src/conf/virnodedeviceobj.h @@ -47,6 +47,7 @@ struct _virNodeDeviceDriverState {
/* Immutable pointer, self-locking APIs */ virObjectEventState *nodeDeviceEventState; + virNodeDeviceDefParserCallbacks parserCallbacks; };
void diff --git a/src/hypervisor/domain_driver.c b/src/hypervisor/domain_driver.c index 29e11c0447..2969d55173 100644 --- a/src/hypervisor/domain_driver.c +++ b/src/hypervisor/domain_driver.c @@ -395,7 +395,8 @@ virDomainDriverNodeDeviceReset(virNodeDevicePtr dev, if (!xml) return -1;
- def = virNodeDeviceDefParseString(xml, EXISTING_DEVICE, NULL); + def = virNodeDeviceDefParseString(xml, EXISTING_DEVICE, NULL, + NULL, NULL); if (!def) return -1;
@@ -440,7 +441,7 @@ virDomainDriverNodeDeviceReAttach(virNodeDevicePtr dev, if (!xml) return -1;
- def = virNodeDeviceDefParseString(xml, EXISTING_DEVICE, NULL); + def = virNodeDeviceDefParseString(xml, EXISTING_DEVICE, NULL, NULL, NULL); if (!def) return -1;
@@ -488,7 +489,7 @@ virDomainDriverNodeDeviceDetachFlags(virNodeDevicePtr dev, if (!xml) return -1;
- def = virNodeDeviceDefParseString(xml, EXISTING_DEVICE, NULL); + def = virNodeDeviceDefParseString(xml, EXISTING_DEVICE, NULL, NULL, NULL); if (!def) return -1;
diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index ad2ca2a614..8f39d79c68 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -896,7 +896,8 @@ nodeDeviceCreateXML(virConnectPtr conn,
virt_type = virConnectGetType(conn);
- if (!(def = virNodeDeviceDefParseString(xmlDesc, CREATE_DEVICE, virt_type))) + if (!(def = virNodeDeviceDefParseString(xmlDesc, CREATE_DEVICE, virt_type, + &driver->parserCallbacks, NULL))) return NULL;
if (virNodeDeviceCreateXMLEnsureACL(conn, def) < 0) @@ -1376,7 +1377,8 @@ nodeDeviceDefineXML(virConnect *conn,
virt_type = virConnectGetType(conn);
- if (!(def = virNodeDeviceDefParseString(xmlDesc, CREATE_DEVICE, virt_type))) + if (!(def = virNodeDeviceDefParseString(xmlDesc, CREATE_DEVICE, virt_type, + &driver->parserCallbacks, NULL))) return NULL;
if (virNodeDeviceDefineXMLEnsureACL(conn, def) < 0) @@ -1761,3 +1763,65 @@ nodeDeviceDefCopyFromMdevctl(virNodeDeviceDef *dst,
return ret; } + + +/* validate that parent exists */ +static int nodeDeviceDefValidateMdev(virNodeDeviceDef *def, + G_GNUC_UNUSED virNodeDevCapMdev *mdev, + G_GNUC_UNUSED void *opaque) +{ + virNodeDeviceObj *obj = NULL; + if (!def->parent) {
Here and elsewhere in the series - please separate these two blocks with an empty line: virNodeDeviceObj *obj = NULL; if (!def->parent) ...
+ virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("missing parent device")); + return -1; + } + obj = virNodeDeviceObjListFindByName(driver->devs, def->parent); + if (!obj) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("invalid parent device '%s'"), + def->parent); + return -1; + } + + virNodeDeviceObjEndAPI(&obj); + return 0; +} +
Michal

This can be used similarly to other postparse callbacks in libvirt -- filling in additional information that can be determined by using the information provided in the XML. In this case, we determine the address of the parent device and cache it in the mdev caps so that we can use it for generating a unique name and interacting with mdevctl. Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/conf/node_device_conf.c | 7 +++++ src/node_device/node_device_driver.c | 45 +++++++++++++++++++++------- src/node_device/node_device_driver.h | 3 ++ src/node_device/node_device_udev.c | 1 + tests/nodedevmdevctltest.c | 1 + 5 files changed, 46 insertions(+), 11 deletions(-) diff --git a/src/conf/node_device_conf.c b/src/conf/node_device_conf.c index cd1c07092d..a7cd90b9e6 100644 --- a/src/conf/node_device_conf.c +++ b/src/conf/node_device_conf.c @@ -2189,6 +2189,13 @@ virNodeDeviceDefParse(const char *str, if (parserCallbacks) { int ret = 0; + /* fill in backend-specific aspects */ + if (parserCallbacks->postParse) { + ret = parserCallbacks->postParse(def, opaque); + if (ret < 0) + return NULL; + } + /* validate definition */ if (parserCallbacks->validate) { ret = parserCallbacks->validate(def, opaque); diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index 8f39d79c68..d76268285f 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -718,11 +718,9 @@ nodeDeviceGetMdevctlCommand(virNodeDeviceDef *def, char **outbuf, char **errbuf) { - g_autofree char *parent_addr = NULL; g_autoptr(virCommand) cmd = NULL; const char *subcommand = virMdevctlCommandTypeToString(cmd_type); g_autofree char *inbuf = NULL; - virNodeDeviceObj *parent_obj = NULL; switch (cmd_type) { case MDEVCTL_CMD_CREATE: @@ -747,12 +745,7 @@ nodeDeviceGetMdevctlCommand(virNodeDeviceDef *def, switch (cmd_type) { case MDEVCTL_CMD_CREATE: case MDEVCTL_CMD_DEFINE: - if ((parent_obj = nodeDeviceObjFindByName(def->parent))) { - parent_addr = nodeDeviceObjFormatAddress(parent_obj); - virNodeDeviceObjEndAPI(&parent_obj); - } - - if (!parent_addr) { + if (!def->caps->data.mdev.parent_addr) { virReportError(VIR_ERR_INTERNAL_ERROR, _("unable to find parent device '%s'"), def->parent); return NULL; @@ -764,7 +757,7 @@ nodeDeviceGetMdevctlCommand(virNodeDeviceDef *def, return NULL; } - virCommandAddArgPair(cmd, "--parent", parent_addr); + virCommandAddArgPair(cmd, "--parent", def->caps->data.mdev.parent_addr); virCommandAddArgPair(cmd, "--jsonfile", "/dev/stdin"); virCommandSetInputBuffer(cmd, inbuf); @@ -1765,9 +1758,30 @@ nodeDeviceDefCopyFromMdevctl(virNodeDeviceDef *dst, } +int nodeDeviceDefPostParse(virNodeDeviceDef *def, + G_GNUC_UNUSED void *opaque) +{ + virNodeDevCapsDef *caps = NULL; + for (caps = def->caps; caps != NULL; caps = caps->next) { + if (caps->data.type == VIR_NODE_DEV_CAP_MDEV) { + virNodeDeviceObj *obj = NULL; + + if (def->parent) + obj = virNodeDeviceObjListFindByName(driver->devs, def->parent); + + if (obj) { + caps->data.mdev.parent_addr = nodeDeviceObjFormatAddress(obj); + virNodeDeviceObjEndAPI(&obj); + } + } + } + return 0; +} + + /* validate that parent exists */ static int nodeDeviceDefValidateMdev(virNodeDeviceDef *def, - G_GNUC_UNUSED virNodeDevCapMdev *mdev, + virNodeDevCapMdev *mdev, G_GNUC_UNUSED void *opaque) { virNodeDeviceObj *obj = NULL; @@ -1783,8 +1797,17 @@ static int nodeDeviceDefValidateMdev(virNodeDeviceDef *def, def->parent); return -1; } - virNodeDeviceObjEndAPI(&obj); + + /* the post-parse callback should have found the address of the parent + * device and stored it in the mdev caps */ + if (!mdev->parent_addr) { + virReportError(VIR_ERR_PARSE_FAILED, + _("Unable to find address for parent device '%s'"), + def->parent); + return -1; + } + return 0; } diff --git a/src/node_device/node_device_driver.h b/src/node_device/node_device_driver.h index d9b9b7a961..fdc92b8aef 100644 --- a/src/node_device/node_device_driver.h +++ b/src/node_device/node_device_driver.h @@ -172,5 +172,8 @@ int nodeDeviceCreate(virNodeDevice *dev, unsigned int flags); +int nodeDeviceDefPostParse(virNodeDeviceDef *def, + void *opaque); + int nodeDeviceDefValidate(virNodeDeviceDef *def, void *opaque); diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index ccf94d8e7d..81037d8139 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -2243,6 +2243,7 @@ nodeStateInitialize(bool privileged, driver->privateData = priv; driver->nodeDeviceEventState = virObjectEventStateNew(); + driver->parserCallbacks.postParse = nodeDeviceDefPostParse; driver->parserCallbacks.validate = nodeDeviceDefValidate; if (udevPCITranslateInit(privileged) < 0) diff --git a/tests/nodedevmdevctltest.c b/tests/nodedevmdevctltest.c index bf1d7b4984..d88e08f485 100644 --- a/tests/nodedevmdevctltest.c +++ b/tests/nodedevmdevctltest.c @@ -13,6 +13,7 @@ #define VIRT_TYPE "QEMU" static virNodeDeviceDefParserCallbacks parser_callbacks = { + .postParse = nodeDeviceDefPostParse, .validate = nodeDeviceDefValidate }; -- 2.31.1

Unfortunately, mdevctl supports defining more than one mdev with the same UUID as long as they have different parent devices. (Only one of these devices can be active at any given time). This means that we can't use the UUID alone as a way to uniquely identify mdev node devices. Append the parent address to ensure uniqueness. For example: Before: mdev_88a6b868_46bd_4015_8e5b_26107f82da38 After: mdev_88a6b868_46bd_4015_8e5b_26107f82da38_0000_00_02_0 Related: https://bugzilla.redhat.com/show_bug.cgi?id=1979440 Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/node_device/node_device_driver.c | 3 ++- src/node_device/node_device_udev.c | 2 +- tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml | 8 ++++---- 3 files changed, 7 insertions(+), 6 deletions(-) diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index d76268285f..4e676bfd56 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -1028,7 +1028,8 @@ nodeDeviceGetMdevctlListCommand(bool defined, static void mdevGenerateDeviceName(virNodeDeviceDef *dev) { - nodeDeviceGenerateName(dev, "mdev", dev->caps->data.mdev.uuid, NULL); + nodeDeviceGenerateName(dev, "mdev", dev->caps->data.mdev.uuid, + dev->caps->data.mdev.parent_addr); } diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index 81037d8139..90a64f16b0 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -1067,7 +1067,7 @@ udevProcessMediatedDevice(struct udev_device *dev, return -1; } - udevGenerateDeviceName(dev, def, NULL); + udevGenerateDeviceName(dev, def, data->parent_addr); data->iommuGroupNumber = iommugrp; diff --git a/tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml b/tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml index c23a3130c1..3971898549 100644 --- a/tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml +++ b/tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml @@ -1,5 +1,5 @@ <device> - <name>mdev_200f228a_c80a_4d50_bfb7_f5a0e4e34045</name> + <name>mdev_200f228a_c80a_4d50_bfb7_f5a0e4e34045_0000_00_02_0</name> <parent>pci_0000_00_02_0</parent> <capability type='mdev'> <type id='i915-GVTg_V5_4'/> @@ -8,7 +8,7 @@ </capability> </device> <device> - <name>mdev_de807ffc_1923_4d5f_b6c9_b20ecebc6d4b</name> + <name>mdev_de807ffc_1923_4d5f_b6c9_b20ecebc6d4b_0000_00_02_0</name> <parent>pci_0000_00_02_0</parent> <capability type='mdev'> <type id='i915-GVTg_V5_4'/> @@ -17,7 +17,7 @@ </capability> </device> <device> - <name>mdev_435722ea_5f43_468a_874f_da34f1217f13</name> + <name>mdev_435722ea_5f43_468a_874f_da34f1217f13_0000_00_02_0</name> <parent>pci_0000_00_02_0</parent> <capability type='mdev'> <type id='i915-GVTg_V5_8'/> @@ -27,7 +27,7 @@ </capability> </device> <device> - <name>mdev_783e6dbb_ea0e_411f_94e2_717eaad438bf</name> + <name>mdev_783e6dbb_ea0e_411f_94e2_717eaad438bf_matrix</name> <parent>matrix_matrix</parent> <capability type='mdev'> <type id='vfio_ap-passthrough'/> -- 2.31.1

On 7/23/21 6:40 PM, Jonathon Jongsma wrote:
Unfortunately, mdevctl supports defining more than one mdev with the same UUID as long as they have different parent devices. (Only one of these devices can be active at any given time).
This means that we can't use the UUID alone as a way to uniquely identify mdev node devices. Append the parent address to ensure uniqueness. For example:
Before: mdev_88a6b868_46bd_4015_8e5b_26107f82da38 After: mdev_88a6b868_46bd_4015_8e5b_26107f82da38_0000_00_02_0
Related: https://bugzilla.redhat.com/show_bug.cgi?id=1979440
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/node_device/node_device_driver.c | 3 ++- src/node_device/node_device_udev.c | 2 +- tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml | 8 ++++---- 3 files changed, 7 insertions(+), 6 deletions(-)
The patch looks good, but for some reason it leaves API breakage aftertaste. But I guess we don't document anywhere what MDEV <name/> looks like, do we? IOW - we are free to change it if needed. Michal

On Mon, Jul 26, 2021 at 9:47 AM Michal Prívozník <mprivozn@redhat.com> wrote:
On 7/23/21 6:40 PM, Jonathon Jongsma wrote:
Unfortunately, mdevctl supports defining more than one mdev with the same UUID as long as they have different parent devices. (Only one of these devices can be active at any given time).
This means that we can't use the UUID alone as a way to uniquely identify mdev node devices. Append the parent address to ensure uniqueness. For example:
Before: mdev_88a6b868_46bd_4015_8e5b_26107f82da38 After: mdev_88a6b868_46bd_4015_8e5b_26107f82da38_0000_00_02_0
Related: https://bugzilla.redhat.com/show_bug.cgi?id=1979440
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/node_device/node_device_driver.c | 3 ++- src/node_device/node_device_udev.c | 2 +- tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml | 8 ++++---- 3 files changed, 7 insertions(+), 6 deletions(-)
The patch looks good, but for some reason it leaves API breakage aftertaste. But I guess we don't document anywhere what MDEV <name/> looks like, do we? IOW - we are free to change it if needed.
That is true -- it does have a bit of a bad aftertaste. As far as I know, we haven't documented or promised any specific naming format for mdevs. But it could certainly cause some pain for people that are using mdevs already, which might be reason enough to reject or adjust this patch. If a person already has assigned a mdev device to their vm with the old name, the name would change when upgrading libvirt. That could make the domain definition become invalid. But, we have to deal with this situation somehow. We don't have a choice but to handle the case where mdevctl might return multiple defined devices with the same UUID. I considered a couple of other approaches to handling this that I rejected, such as - only add a suffix to the second such device with the same UUID - rejected because the name of a given device depends on the order in which it was observed and the presence of other devices. For example, if we have two devices: mdev_$UUID and mdev_$UUID_$PARENT, and the first device was undefined, then when libvirt was restarted, mdev_$UUID_$PARENT would change its name to mdev_$UUID because it is now the only device with that UUID - name all *active* mdevs mdev_$UUID and use a different scheme for inactive, defined mdevs - rejected because the device would change names when it changed from inactive to active. - ignore this situation because probably almost nobody uses multiple devices with the same UUID - but do they? It's possible that we could make the transition a little bit less painful by automatically translating the old name to the new one for cases where it was unambiguous. I haven't looked into what it would require to do something like that, but it seems possible. Jonathon

On 7/27/21 12:08 AM, Jonathon Jongsma wrote:
On Mon, Jul 26, 2021 at 9:47 AM Michal Prívozník <mprivozn@redhat.com> wrote:
On 7/23/21 6:40 PM, Jonathon Jongsma wrote:
Unfortunately, mdevctl supports defining more than one mdev with the same UUID as long as they have different parent devices. (Only one of these devices can be active at any given time).
This means that we can't use the UUID alone as a way to uniquely identify mdev node devices. Append the parent address to ensure uniqueness. For example:
Before: mdev_88a6b868_46bd_4015_8e5b_26107f82da38 After: mdev_88a6b868_46bd_4015_8e5b_26107f82da38_0000_00_02_0
Related: https://bugzilla.redhat.com/show_bug.cgi?id=1979440
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/node_device/node_device_driver.c | 3 ++- src/node_device/node_device_udev.c | 2 +- tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml | 8 ++++---- 3 files changed, 7 insertions(+), 6 deletions(-)
The patch looks good, but for some reason it leaves API breakage aftertaste. But I guess we don't document anywhere what MDEV <name/> looks like, do we? IOW - we are free to change it if needed.
That is true -- it does have a bit of a bad aftertaste. As far as I know, we haven't documented or promised any specific naming format for mdevs. But it could certainly cause some pain for people that are using mdevs already, which might be reason enough to reject or adjust this patch. If a person already has assigned a mdev device to their vm with the old name, the name would change when upgrading libvirt. That could make the domain definition become invalid.
But, we have to deal with this situation somehow. We don't have a choice but to handle the case where mdevctl might return multiple defined devices with the same UUID. I considered a couple of other approaches to handling this that I rejected, such as - only add a suffix to the second such device with the same UUID - rejected because the name of a given device depends on the order in which it was observed and the presence of other devices. For example, if we have two devices: mdev_$UUID and mdev_$UUID_$PARENT, and the first device was undefined, then when libvirt was restarted, mdev_$UUID_$PARENT would change its name to mdev_$UUID because it is now the only device with that UUID
Agreed, this would not make situation any better, in fact worse.
- name all *active* mdevs mdev_$UUID and use a different scheme for inactive, defined mdevs - rejected because the device would change names when it changed from inactive to active.
Yeah, this is equally horrible.
- ignore this situation because probably almost nobody uses multiple devices with the same UUID - but do they?
Frankly, I don't have enough experience with MDEVs, but since it's possible to define an MDEV with an existing UUID, it is possible to define it also for the same parent? I mean, if I'd have an MDEV capable graphics card and I'd define two MDEVs with the same UUID they would both have the same parent, wouldn't they? Because in that case, appending PCI address to libvirt's name is not enough. Michal

On Tue, Jul 27, 2021 at 3:02 AM Michal Prívozník <mprivozn@redhat.com> wrote:
On 7/27/21 12:08 AM, Jonathon Jongsma wrote:
On Mon, Jul 26, 2021 at 9:47 AM Michal Prívozník <mprivozn@redhat.com> wrote:
On 7/23/21 6:40 PM, Jonathon Jongsma wrote:
Unfortunately, mdevctl supports defining more than one mdev with the same UUID as long as they have different parent devices. (Only one of these devices can be active at any given time).
This means that we can't use the UUID alone as a way to uniquely identify mdev node devices. Append the parent address to ensure uniqueness. For example:
Before: mdev_88a6b868_46bd_4015_8e5b_26107f82da38 After: mdev_88a6b868_46bd_4015_8e5b_26107f82da38_0000_00_02_0
Related: https://bugzilla.redhat.com/show_bug.cgi?id=1979440
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/node_device/node_device_driver.c | 3 ++- src/node_device/node_device_udev.c | 2 +- tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml | 8 ++++---- 3 files changed, 7 insertions(+), 6 deletions(-)
The patch looks good, but for some reason it leaves API breakage aftertaste. But I guess we don't document anywhere what MDEV <name/> looks like, do we? IOW - we are free to change it if needed.
That is true -- it does have a bit of a bad aftertaste. As far as I know, we haven't documented or promised any specific naming format for mdevs. But it could certainly cause some pain for people that are using mdevs already, which might be reason enough to reject or adjust this patch. If a person already has assigned a mdev device to their vm with the old name, the name would change when upgrading libvirt. That could make the domain definition become invalid.
But, we have to deal with this situation somehow. We don't have a choice but to handle the case where mdevctl might return multiple defined devices with the same UUID. I considered a couple of other approaches to handling this that I rejected, such as - only add a suffix to the second such device with the same UUID - rejected because the name of a given device depends on the order in which it was observed and the presence of other devices. For example, if we have two devices: mdev_$UUID and mdev_$UUID_$PARENT, and the first device was undefined, then when libvirt was restarted, mdev_$UUID_$PARENT would change its name to mdev_$UUID because it is now the only device with that UUID
Agreed, this would not make situation any better, in fact worse.
- name all *active* mdevs mdev_$UUID and use a different scheme for inactive, defined mdevs - rejected because the device would change names when it changed from inactive to active.
Yeah, this is equally horrible.
- ignore this situation because probably almost nobody uses multiple devices with the same UUID - but do they?
Frankly, I don't have enough experience with MDEVs, but since it's possible to define an MDEV with an existing UUID, it is possible to define it also for the same parent? I mean, if I'd have an MDEV capable graphics card and I'd define two MDEVs with the same UUID they would both have the same parent, wouldn't they? Because in that case, appending PCI address to libvirt's name is not enough.
Nope, it's the opposite in fact. mdevctl allows you to define two or more mdevs with the same UUID, but only if they have different parents. Only a single mdev with a given UUID can be activated at the same time. So among *active* devices, the UUID is indeed unique. But not among persistent device definitions. For defined devices, the UUID is guaranteed unique per parent. I discussed this with Alex a while ago and he thought that the reason that mdevctl supported this was to allow people to assign an mdev with a specific UUID to a vm, but that UUID could represent one of several (equivalent?) devices, depending on which parent device was present in the host or which device was instantiated. That feature doesn't seem very compelling to me, and Alex wasn't sure that anybody was actually using it that way. But the fact remains that mdevctl does allow people to configure things this way and it seems like we need to be prepared to deal with it. I briefly considered somehow rejecting mdevctl configurations in libvirt that contain duplicate UUIDS, but couldn't figure out a good way to handle that. Jonathon

On 7/27/21 4:09 PM, Jonathon Jongsma wrote:
On Tue, Jul 27, 2021 at 3:02 AM Michal Prívozník <mprivozn@redhat.com> wrote:
On 7/27/21 12:08 AM, Jonathon Jongsma wrote:
On Mon, Jul 26, 2021 at 9:47 AM Michal Prívozník <mprivozn@redhat.com> wrote:
On 7/23/21 6:40 PM, Jonathon Jongsma wrote:
Unfortunately, mdevctl supports defining more than one mdev with the same UUID as long as they have different parent devices. (Only one of these devices can be active at any given time).
This means that we can't use the UUID alone as a way to uniquely identify mdev node devices. Append the parent address to ensure uniqueness. For example:
Before: mdev_88a6b868_46bd_4015_8e5b_26107f82da38 After: mdev_88a6b868_46bd_4015_8e5b_26107f82da38_0000_00_02_0
Related: https://bugzilla.redhat.com/show_bug.cgi?id=1979440
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/node_device/node_device_driver.c | 3 ++- src/node_device/node_device_udev.c | 2 +- tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml | 8 ++++---- 3 files changed, 7 insertions(+), 6 deletions(-)
The patch looks good, but for some reason it leaves API breakage aftertaste. But I guess we don't document anywhere what MDEV <name/> looks like, do we? IOW - we are free to change it if needed.
That is true -- it does have a bit of a bad aftertaste. As far as I know, we haven't documented or promised any specific naming format for mdevs. But it could certainly cause some pain for people that are using mdevs already, which might be reason enough to reject or adjust this patch. If a person already has assigned a mdev device to their vm with the old name, the name would change when upgrading libvirt. That could make the domain definition become invalid.
But, we have to deal with this situation somehow. We don't have a choice but to handle the case where mdevctl might return multiple defined devices with the same UUID. I considered a couple of other approaches to handling this that I rejected, such as - only add a suffix to the second such device with the same UUID - rejected because the name of a given device depends on the order in which it was observed and the presence of other devices. For example, if we have two devices: mdev_$UUID and mdev_$UUID_$PARENT, and the first device was undefined, then when libvirt was restarted, mdev_$UUID_$PARENT would change its name to mdev_$UUID because it is now the only device with that UUID
Agreed, this would not make situation any better, in fact worse.
- name all *active* mdevs mdev_$UUID and use a different scheme for inactive, defined mdevs - rejected because the device would change names when it changed from inactive to active.
Yeah, this is equally horrible.
- ignore this situation because probably almost nobody uses multiple devices with the same UUID - but do they?
Frankly, I don't have enough experience with MDEVs, but since it's possible to define an MDEV with an existing UUID, it is possible to define it also for the same parent? I mean, if I'd have an MDEV capable graphics card and I'd define two MDEVs with the same UUID they would both have the same parent, wouldn't they? Because in that case, appending PCI address to libvirt's name is not enough.
Nope, it's the opposite in fact. mdevctl allows you to define two or more mdevs with the same UUID, but only if they have different parents. Only a single mdev with a given UUID can be activated at the same time. So among *active* devices, the UUID is indeed unique. But not among persistent device definitions. For defined devices, the UUID is guaranteed unique per parent.
I discussed this with Alex a while ago and he thought that the reason that mdevctl supported this was to allow people to assign an mdev with a specific UUID to a vm, but that UUID could represent one of several (equivalent?) devices, depending on which parent device was present in the host or which device was instantiated. That feature doesn't seem very compelling to me, and Alex wasn't sure that anybody was actually using it that way. But the fact remains that mdevctl does allow people to configure things this way and it seems like we need to be prepared to deal with it.
I am questioning how a user is supposed to correlate the nodedev representation with mdev_{uuid}_{parent} to a domain using the mdev like <devices> <hostdev mode='subsystem' type='mdev' model='vfio-ccw'> <source> <address uuid='{uuid}'/> </source> </hostdev> </devices> Reading uuid it tells me its universal unique (https://en.wikipedia.org/wiki/Universally_unique_identifier) but now this is not true for definitions which breaks the term universal in my small universe since for the defined state of an mdev it becomes "parental" unique... What is going to happen to the mdev nodedev object that is just defined and being started? As this mdev nodedev object at that point in time gains actually universal uniqueness. Is libvirt supposed to rename the nodedev object into with name pattern mdev_{uuid}? Another question is about the autostart feature mdevctl provides for mdev definitions. Let's assume more than one mdev with the same uuid is flagged to autostart. What happens is first come first starts all other fail and a autostart enabled domain using the uuid gets whatever definition has won the race at system boot? Isn't the example of individual devices having equivalent characteristic which Alex used in libvirt represented as pool?
I briefly considered somehow rejecting mdevctl configurations in libvirt that contain duplicate UUIDS, but couldn't figure out a good way to handle that.
Jonathon
-- Mit freundlichen Grüßen/Kind regards Boris Fiuczynski IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Gregor Pillen Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294

On Thu, Jul 29, 2021 at 1:35 PM Boris Fiuczynski <fiuczy@linux.ibm.com> wrote:
On 7/27/21 4:09 PM, Jonathon Jongsma wrote:
On Tue, Jul 27, 2021 at 3:02 AM Michal Prívozník <mprivozn@redhat.com> wrote:
On 7/27/21 12:08 AM, Jonathon Jongsma wrote:
On Mon, Jul 26, 2021 at 9:47 AM Michal Prívozník <mprivozn@redhat.com> wrote:
On 7/23/21 6:40 PM, Jonathon Jongsma wrote:
Unfortunately, mdevctl supports defining more than one mdev with the same UUID as long as they have different parent devices. (Only one of these devices can be active at any given time).
This means that we can't use the UUID alone as a way to uniquely identify mdev node devices. Append the parent address to ensure uniqueness. For example:
Before: mdev_88a6b868_46bd_4015_8e5b_26107f82da38 After: mdev_88a6b868_46bd_4015_8e5b_26107f82da38_0000_00_02_0
Related: https://bugzilla.redhat.com/show_bug.cgi?id=1979440
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/node_device/node_device_driver.c | 3 ++- src/node_device/node_device_udev.c | 2 +- tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml | 8 ++++---- 3 files changed, 7 insertions(+), 6 deletions(-)
The patch looks good, but for some reason it leaves API breakage aftertaste. But I guess we don't document anywhere what MDEV <name/> looks like, do we? IOW - we are free to change it if needed.
That is true -- it does have a bit of a bad aftertaste. As far as I know, we haven't documented or promised any specific naming format for mdevs. But it could certainly cause some pain for people that are using mdevs already, which might be reason enough to reject or adjust this patch. If a person already has assigned a mdev device to their vm with the old name, the name would change when upgrading libvirt. That could make the domain definition become invalid.
But, we have to deal with this situation somehow. We don't have a choice but to handle the case where mdevctl might return multiple defined devices with the same UUID. I considered a couple of other approaches to handling this that I rejected, such as - only add a suffix to the second such device with the same UUID - rejected because the name of a given device depends on the order in which it was observed and the presence of other devices. For example, if we have two devices: mdev_$UUID and mdev_$UUID_$PARENT, and the first device was undefined, then when libvirt was restarted, mdev_$UUID_$PARENT would change its name to mdev_$UUID because it is now the only device with that UUID
Agreed, this would not make situation any better, in fact worse.
- name all *active* mdevs mdev_$UUID and use a different scheme for inactive, defined mdevs - rejected because the device would change names when it changed from inactive to active.
Yeah, this is equally horrible.
- ignore this situation because probably almost nobody uses multiple devices with the same UUID - but do they?
Frankly, I don't have enough experience with MDEVs, but since it's possible to define an MDEV with an existing UUID, it is possible to define it also for the same parent? I mean, if I'd have an MDEV capable graphics card and I'd define two MDEVs with the same UUID they would both have the same parent, wouldn't they? Because in that case, appending PCI address to libvirt's name is not enough.
Nope, it's the opposite in fact. mdevctl allows you to define two or more mdevs with the same UUID, but only if they have different parents. Only a single mdev with a given UUID can be activated at the same time. So among *active* devices, the UUID is indeed unique. But not among persistent device definitions. For defined devices, the UUID is guaranteed unique per parent.
I discussed this with Alex a while ago and he thought that the reason that mdevctl supported this was to allow people to assign an mdev with a specific UUID to a vm, but that UUID could represent one of several (equivalent?) devices, depending on which parent device was present in the host or which device was instantiated. That feature doesn't seem very compelling to me, and Alex wasn't sure that anybody was actually using it that way. But the fact remains that mdevctl does allow people to configure things this way and it seems like we need to be prepared to deal with it.
I am questioning how a user is supposed to correlate the nodedev representation with mdev_{uuid}_{parent} to a domain using the mdev like <devices> <hostdev mode='subsystem' type='mdev' model='vfio-ccw'> <source> <address uuid='{uuid}'/> </source> </hostdev> </devices>
Sorry for potentially confusing the impact of this change with my previous comments. I had said that domains with mdevs assigned might become invalid if the nodedev name changes. But that's not actually true. The naming that we're discussing is only related to the name of the device within the libvirt nodedev driver. When adding a mediated device to a domain, you don't use this nodedev name, you use the UUID directly. This is similar to the way you add a physical passthrough PCI device to a domain using its PCI address rather than its 'pci_$domain_$bus_$slot_$fn' nodedev name.
Reading uuid it tells me its universal unique (https://en.wikipedia.org/wiki/Universally_unique_identifier) but now this is not true for definitions which breaks the term universal in my small universe since for the defined state of an mdev it becomes "parental" unique...
Yes, that's true. Which is why I think this is a questionable feature. But: A) defined devices are only *potential* devices.This is not all that different from the fact that a user can manually instantiate a mdev on parent A with a specific UUID and then later instantiate a totally different mdev on parent B with the exact same UUID. One UUID refers to different devices at different times. B) the question we're discussing is not whether mdevctl should allow duplicate UUIDs or not. mdevctl *already* allows it, so we need to figure out how to deal with it in libvirt. I suppose one potential possibility might be to change upstream mdevctl to disallow duplicate UUIDs and then require this new mdevctl version for libvirt. But that's a discussion we'd have to have in upstream mdevctl first.
What is going to happen to the mdev nodedev object that is just defined and being started? As this mdev nodedev object at that point in time gains actually universal uniqueness. Is libvirt supposed to rename the nodedev object into with name pattern mdev_{uuid}?
No, that's an approach that I just mentioned as considered and rejected.
Another question is about the autostart feature mdevctl provides for mdev definitions. Let's assume more than one mdev with the same uuid is flagged to autostart. What happens is first come first starts all other fail and a autostart enabled domain using the uuid gets whatever definition has won the race at system boot?
That's true. Which is a good reason not to use this feature in mdevctl. And as I mentioned in the earlier email, it's probably unlikely that anybody is actually using it. Jonathon

On 7/29/21 9:27 PM, Jonathon Jongsma wrote:
On Thu, Jul 29, 2021 at 1:35 PM Boris Fiuczynski <fiuczy@linux.ibm.com> wrote:
On 7/27/21 4:09 PM, Jonathon Jongsma wrote:
On Tue, Jul 27, 2021 at 3:02 AM Michal Prívozník <mprivozn@redhat.com> wrote:
On 7/27/21 12:08 AM, Jonathon Jongsma wrote:
On Mon, Jul 26, 2021 at 9:47 AM Michal Prívozník <mprivozn@redhat.com> wrote:
On 7/23/21 6:40 PM, Jonathon Jongsma wrote: > Unfortunately, mdevctl supports defining more than one mdev with the > same UUID as long as they have different parent devices. (Only one of > these devices can be active at any given time). > > This means that we can't use the UUID alone as a way to uniquely > identify mdev node devices. Append the parent address to ensure > uniqueness. For example: > > Before: mdev_88a6b868_46bd_4015_8e5b_26107f82da38 > After: mdev_88a6b868_46bd_4015_8e5b_26107f82da38_0000_00_02_0 > > Related: https://bugzilla.redhat.com/show_bug.cgi?id=1979440 > > Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> > --- > src/node_device/node_device_driver.c | 3 ++- > src/node_device/node_device_udev.c | 2 +- > tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml | 8 ++++---- > 3 files changed, 7 insertions(+), 6 deletions(-)
The patch looks good, but for some reason it leaves API breakage aftertaste. But I guess we don't document anywhere what MDEV <name/> looks like, do we? IOW - we are free to change it if needed.
That is true -- it does have a bit of a bad aftertaste. As far as I know, we haven't documented or promised any specific naming format for mdevs. But it could certainly cause some pain for people that are using mdevs already, which might be reason enough to reject or adjust this patch. If a person already has assigned a mdev device to their vm with the old name, the name would change when upgrading libvirt. That could make the domain definition become invalid.
But, we have to deal with this situation somehow. We don't have a choice but to handle the case where mdevctl might return multiple defined devices with the same UUID. I considered a couple of other approaches to handling this that I rejected, such as - only add a suffix to the second such device with the same UUID - rejected because the name of a given device depends on the order in which it was observed and the presence of other devices. For example, if we have two devices: mdev_$UUID and mdev_$UUID_$PARENT, and the first device was undefined, then when libvirt was restarted, mdev_$UUID_$PARENT would change its name to mdev_$UUID because it is now the only device with that UUID
Agreed, this would not make situation any better, in fact worse.
- name all *active* mdevs mdev_$UUID and use a different scheme for inactive, defined mdevs - rejected because the device would change names when it changed from inactive to active.
Yeah, this is equally horrible.
- ignore this situation because probably almost nobody uses multiple devices with the same UUID - but do they?
Frankly, I don't have enough experience with MDEVs, but since it's possible to define an MDEV with an existing UUID, it is possible to define it also for the same parent? I mean, if I'd have an MDEV capable graphics card and I'd define two MDEVs with the same UUID they would both have the same parent, wouldn't they? Because in that case, appending PCI address to libvirt's name is not enough.
Nope, it's the opposite in fact. mdevctl allows you to define two or more mdevs with the same UUID, but only if they have different parents. Only a single mdev with a given UUID can be activated at the same time. So among *active* devices, the UUID is indeed unique. But not among persistent device definitions. For defined devices, the UUID is guaranteed unique per parent.
I discussed this with Alex a while ago and he thought that the reason that mdevctl supported this was to allow people to assign an mdev with a specific UUID to a vm, but that UUID could represent one of several (equivalent?) devices, depending on which parent device was present in the host or which device was instantiated. That feature doesn't seem very compelling to me, and Alex wasn't sure that anybody was actually using it that way. But the fact remains that mdevctl does allow people to configure things this way and it seems like we need to be prepared to deal with it.
I am questioning how a user is supposed to correlate the nodedev representation with mdev_{uuid}_{parent} to a domain using the mdev like <devices> <hostdev mode='subsystem' type='mdev' model='vfio-ccw'> <source> <address uuid='{uuid}'/> </source> </hostdev> </devices>
Sorry for potentially confusing the impact of this change with my previous comments. I had said that domains with mdevs assigned might become invalid if the nodedev name changes. But that's not actually true. The naming that we're discussing is only related to the name of the device within the libvirt nodedev driver. When adding a mediated device to a domain, you don't use this nodedev name, you use the UUID directly. This is similar to the way you add a physical passthrough PCI device to a domain using its PCI address rather than its 'pci_$domain_$bus_$slot_$fn' nodedev name.
Reading uuid it tells me its universal unique (https://en.wikipedia.org/wiki/Universally_unique_identifier) but now this is not true for definitions which breaks the term universal in my small universe since for the defined state of an mdev it becomes "parental" unique...
Yes, that's true. Which is why I think this is a questionable feature. But: A) defined devices are only *potential* devices.This is not all that different from the fact that a user can manually instantiate a mdev on parent A with a specific UUID and then later instantiate a totally different mdev on parent B with the exact same UUID. One UUID refers to different devices at different times. B) the question we're discussing is not whether mdevctl should allow duplicate UUIDs or not. mdevctl *already* allows it, so we need to figure out how to deal with it in libvirt.
I suppose one potential possibility might be to change upstream mdevctl to disallow duplicate UUIDs and then require this new mdevctl version for libvirt. But that's a discussion we'd have to have in upstream mdevctl first.
Maybe we can start the discussion, but even in libvirt you are allowed to have two guests with the same name and UUID if they are in different drivers (e.g. one in QEMU and one in LXC). Meanwhile, I think these can be pushed as we don't have much options anyway. Michal

On 7/30/21 9:48 AM, Michal Prívozník wrote:
On 7/29/21 9:27 PM, Jonathon Jongsma wrote:
On Thu, Jul 29, 2021 at 1:35 PM Boris Fiuczynski <fiuczy@linux.ibm.com> wrote:
On 7/27/21 4:09 PM, Jonathon Jongsma wrote:
On Tue, Jul 27, 2021 at 3:02 AM Michal Prívozník <mprivozn@redhat.com> wrote:
On 7/27/21 12:08 AM, Jonathon Jongsma wrote:
On Mon, Jul 26, 2021 at 9:47 AM Michal Prívozník <mprivozn@redhat.com> wrote: > > On 7/23/21 6:40 PM, Jonathon Jongsma wrote: >> Unfortunately, mdevctl supports defining more than one mdev with the >> same UUID as long as they have different parent devices. (Only one of >> these devices can be active at any given time). >> >> This means that we can't use the UUID alone as a way to uniquely >> identify mdev node devices. Append the parent address to ensure >> uniqueness. For example: >> >> Before: mdev_88a6b868_46bd_4015_8e5b_26107f82da38 >> After: mdev_88a6b868_46bd_4015_8e5b_26107f82da38_0000_00_02_0 >> >> Related: https://bugzilla.redhat.com/show_bug.cgi?id=1979440 >> >> Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> >> --- >> src/node_device/node_device_driver.c | 3 ++- >> src/node_device/node_device_udev.c | 2 +- >> tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml | 8 ++++---- >> 3 files changed, 7 insertions(+), 6 deletions(-) > > The patch looks good, but for some reason it leaves API breakage > aftertaste. But I guess we don't document anywhere what MDEV <name/> > looks like, do we? IOW - we are free to change it if needed.
That is true -- it does have a bit of a bad aftertaste. As far as I know, we haven't documented or promised any specific naming format for mdevs. But it could certainly cause some pain for people that are using mdevs already, which might be reason enough to reject or adjust this patch. If a person already has assigned a mdev device to their vm with the old name, the name would change when upgrading libvirt. That could make the domain definition become invalid.
But, we have to deal with this situation somehow. We don't have a choice but to handle the case where mdevctl might return multiple defined devices with the same UUID. I considered a couple of other approaches to handling this that I rejected, such as - only add a suffix to the second such device with the same UUID - rejected because the name of a given device depends on the order in which it was observed and the presence of other devices. For example, if we have two devices: mdev_$UUID and mdev_$UUID_$PARENT, and the first device was undefined, then when libvirt was restarted, mdev_$UUID_$PARENT would change its name to mdev_$UUID because it is now the only device with that UUID
Agreed, this would not make situation any better, in fact worse.
- name all *active* mdevs mdev_$UUID and use a different scheme for inactive, defined mdevs - rejected because the device would change names when it changed from inactive to active.
Yeah, this is equally horrible.
- ignore this situation because probably almost nobody uses multiple devices with the same UUID - but do they?
Frankly, I don't have enough experience with MDEVs, but since it's possible to define an MDEV with an existing UUID, it is possible to define it also for the same parent? I mean, if I'd have an MDEV capable graphics card and I'd define two MDEVs with the same UUID they would both have the same parent, wouldn't they? Because in that case, appending PCI address to libvirt's name is not enough.
Nope, it's the opposite in fact. mdevctl allows you to define two or more mdevs with the same UUID, but only if they have different parents. Only a single mdev with a given UUID can be activated at the same time. So among *active* devices, the UUID is indeed unique. But not among persistent device definitions. For defined devices, the UUID is guaranteed unique per parent.
I discussed this with Alex a while ago and he thought that the reason that mdevctl supported this was to allow people to assign an mdev with a specific UUID to a vm, but that UUID could represent one of several (equivalent?) devices, depending on which parent device was present in the host or which device was instantiated. That feature doesn't seem very compelling to me, and Alex wasn't sure that anybody was actually using it that way. But the fact remains that mdevctl does allow people to configure things this way and it seems like we need to be prepared to deal with it.
I am questioning how a user is supposed to correlate the nodedev representation with mdev_{uuid}_{parent} to a domain using the mdev like <devices> <hostdev mode='subsystem' type='mdev' model='vfio-ccw'> <source> <address uuid='{uuid}'/> </source> </hostdev> </devices>
Sorry for potentially confusing the impact of this change with my previous comments. I had said that domains with mdevs assigned might become invalid if the nodedev name changes. But that's not actually true. The naming that we're discussing is only related to the name of the device within the libvirt nodedev driver. When adding a mediated device to a domain, you don't use this nodedev name, you use the UUID directly. This is similar to the way you add a physical passthrough PCI device to a domain using its PCI address rather than its 'pci_$domain_$bus_$slot_$fn' nodedev name.
Reading uuid it tells me its universal unique (https://en.wikipedia.org/wiki/Universally_unique_identifier) but now this is not true for definitions which breaks the term universal in my small universe since for the defined state of an mdev it becomes "parental" unique...
Yes, that's true. Which is why I think this is a questionable feature. But: A) defined devices are only *potential* devices.This is not all that different from the fact that a user can manually instantiate a mdev on parent A with a specific UUID and then later instantiate a totally different mdev on parent B with the exact same UUID. One UUID refers to different devices at different times. B) the question we're discussing is not whether mdevctl should allow duplicate UUIDs or not. mdevctl *already* allows it, so we need to figure out how to deal with it in libvirt.
I suppose one potential possibility might be to change upstream mdevctl to disallow duplicate UUIDs and then require this new mdevctl version for libvirt. But that's a discussion we'd have to have in upstream mdevctl first.
Maybe we can start the discussion, but even in libvirt you are allowed to have two guests with the same name and UUID if they are in different drivers (e.g. one in QEMU and one in LXC).
Meanwhile, I think these can be pushed as we don't have much options anyway.
Michal
I think that we have at least one option. We could rethink the approach to create nodedev objects based on mdevctl mdev definitions which actually are not a usable host resource for domains. -- Mit freundlichen Grüßen/Kind regards Boris Fiuczynski IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Gregor Pillen Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294

On Fri, Jul 30, 2021 at 8:01 AM Boris Fiuczynski <fiuczy@linux.ibm.com> wrote:
On 7/30/21 9:48 AM, Michal Prívozník wrote:
On 7/29/21 9:27 PM, Jonathon Jongsma wrote:
On Thu, Jul 29, 2021 at 1:35 PM Boris Fiuczynski <fiuczy@linux.ibm.com> wrote:
On 7/27/21 4:09 PM, Jonathon Jongsma wrote:
On Tue, Jul 27, 2021 at 3:02 AM Michal Prívozník <mprivozn@redhat.com> wrote:
On 7/27/21 12:08 AM, Jonathon Jongsma wrote: > On Mon, Jul 26, 2021 at 9:47 AM Michal Prívozník <mprivozn@redhat.com> wrote: >> >> On 7/23/21 6:40 PM, Jonathon Jongsma wrote: >>> Unfortunately, mdevctl supports defining more than one mdev with the >>> same UUID as long as they have different parent devices. (Only one of >>> these devices can be active at any given time). >>> >>> This means that we can't use the UUID alone as a way to uniquely >>> identify mdev node devices. Append the parent address to ensure >>> uniqueness. For example: >>> >>> Before: mdev_88a6b868_46bd_4015_8e5b_26107f82da38 >>> After: mdev_88a6b868_46bd_4015_8e5b_26107f82da38_0000_00_02_0 >>> >>> Related: https://bugzilla.redhat.com/show_bug.cgi?id=1979440 >>> >>> Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> >>> --- >>> src/node_device/node_device_driver.c | 3 ++- >>> src/node_device/node_device_udev.c | 2 +- >>> tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml | 8 ++++---- >>> 3 files changed, 7 insertions(+), 6 deletions(-) >> >> The patch looks good, but for some reason it leaves API breakage >> aftertaste. But I guess we don't document anywhere what MDEV <name/> >> looks like, do we? IOW - we are free to change it if needed. > > > That is true -- it does have a bit of a bad aftertaste. As far as I > know, we haven't documented or promised any specific naming format for > mdevs. But it could certainly cause some pain for people that are > using mdevs already, which might be reason enough to reject or adjust > this patch. If a person already has assigned a mdev device to their vm > with the old name, the name would change when upgrading libvirt. That > could make the domain definition become invalid. > > But, we have to deal with this situation somehow. We don't have a > choice but to handle the case where mdevctl might return multiple > defined devices with the same UUID. I considered a couple of other > approaches to handling this that I rejected, such as > - only add a suffix to the second such device with the same UUID > - rejected because the name of a given device depends on the order > in which it was observed and the presence of other devices. For > example, if we have two devices: mdev_$UUID and mdev_$UUID_$PARENT, > and the first device was undefined, then when libvirt was restarted, > mdev_$UUID_$PARENT would change its name to mdev_$UUID because it is > now the only device with that UUID
Agreed, this would not make situation any better, in fact worse.
> - name all *active* mdevs mdev_$UUID and use a different scheme for > inactive, defined mdevs > - rejected because the device would change names when it changed > from inactive to active.
Yeah, this is equally horrible.
> - ignore this situation because probably almost nobody uses multiple > devices with the same UUID > - but do they?
Frankly, I don't have enough experience with MDEVs, but since it's possible to define an MDEV with an existing UUID, it is possible to define it also for the same parent? I mean, if I'd have an MDEV capable graphics card and I'd define two MDEVs with the same UUID they would both have the same parent, wouldn't they? Because in that case, appending PCI address to libvirt's name is not enough.
Nope, it's the opposite in fact. mdevctl allows you to define two or more mdevs with the same UUID, but only if they have different parents. Only a single mdev with a given UUID can be activated at the same time. So among *active* devices, the UUID is indeed unique. But not among persistent device definitions. For defined devices, the UUID is guaranteed unique per parent.
I discussed this with Alex a while ago and he thought that the reason that mdevctl supported this was to allow people to assign an mdev with a specific UUID to a vm, but that UUID could represent one of several (equivalent?) devices, depending on which parent device was present in the host or which device was instantiated. That feature doesn't seem very compelling to me, and Alex wasn't sure that anybody was actually using it that way. But the fact remains that mdevctl does allow people to configure things this way and it seems like we need to be prepared to deal with it.
I am questioning how a user is supposed to correlate the nodedev representation with mdev_{uuid}_{parent} to a domain using the mdev like <devices> <hostdev mode='subsystem' type='mdev' model='vfio-ccw'> <source> <address uuid='{uuid}'/> </source> </hostdev> </devices>
Sorry for potentially confusing the impact of this change with my previous comments. I had said that domains with mdevs assigned might become invalid if the nodedev name changes. But that's not actually true. The naming that we're discussing is only related to the name of the device within the libvirt nodedev driver. When adding a mediated device to a domain, you don't use this nodedev name, you use the UUID directly. This is similar to the way you add a physical passthrough PCI device to a domain using its PCI address rather than its 'pci_$domain_$bus_$slot_$fn' nodedev name.
Reading uuid it tells me its universal unique (https://en.wikipedia.org/wiki/Universally_unique_identifier) but now this is not true for definitions which breaks the term universal in my small universe since for the defined state of an mdev it becomes "parental" unique...
Yes, that's true. Which is why I think this is a questionable feature. But: A) defined devices are only *potential* devices.This is not all that different from the fact that a user can manually instantiate a mdev on parent A with a specific UUID and then later instantiate a totally different mdev on parent B with the exact same UUID. One UUID refers to different devices at different times. B) the question we're discussing is not whether mdevctl should allow duplicate UUIDs or not. mdevctl *already* allows it, so we need to figure out how to deal with it in libvirt.
I suppose one potential possibility might be to change upstream mdevctl to disallow duplicate UUIDs and then require this new mdevctl version for libvirt. But that's a discussion we'd have to have in upstream mdevctl first.
Maybe we can start the discussion, but even in libvirt you are allowed to have two guests with the same name and UUID if they are in different drivers (e.g. one in QEMU and one in LXC).
Meanwhile, I think these can be pushed as we don't have much options anyway.
Michal
I think that we have at least one option. We could rethink the approach to create nodedev objects based on mdevctl mdev definitions which actually are not a usable host resource for domains.
If I understand your suggestion, I don't think it solves any problems. It seems to me that you're saying that libvirt should ignore all mdev definitions from mdevctl if the parent does not (yet?) exist on the host. (At least that's my interpretation of your phrase "definitions which are not a usable host resource"). But it's still possible to have definitions with duplicate UUIDs for devices which have parents that *do* exist on the host but are simply not instantiated yet. If I'm misinterpreting your comment, feel free to clarify. Jonathon

On 8/2/21 5:30 PM, Jonathon Jongsma wrote:
On Fri, Jul 30, 2021 at 8:01 AM Boris Fiuczynski <fiuczy@linux.ibm.com> wrote:
On 7/30/21 9:48 AM, Michal Prívozník wrote:
On 7/29/21 9:27 PM, Jonathon Jongsma wrote:
On Thu, Jul 29, 2021 at 1:35 PM Boris Fiuczynski <fiuczy@linux.ibm.com> wrote:
On 7/27/21 4:09 PM, Jonathon Jongsma wrote:
On Tue, Jul 27, 2021 at 3:02 AM Michal Prívozník <mprivozn@redhat.com> wrote: > > On 7/27/21 12:08 AM, Jonathon Jongsma wrote: >> On Mon, Jul 26, 2021 at 9:47 AM Michal Prívozník <mprivozn@redhat.com> wrote: >>> >>> On 7/23/21 6:40 PM, Jonathon Jongsma wrote: >>>> Unfortunately, mdevctl supports defining more than one mdev with the >>>> same UUID as long as they have different parent devices. (Only one of >>>> these devices can be active at any given time). >>>> >>>> This means that we can't use the UUID alone as a way to uniquely >>>> identify mdev node devices. Append the parent address to ensure >>>> uniqueness. For example: >>>> >>>> Before: mdev_88a6b868_46bd_4015_8e5b_26107f82da38 >>>> After: mdev_88a6b868_46bd_4015_8e5b_26107f82da38_0000_00_02_0 >>>> >>>> Related: https://bugzilla.redhat.com/show_bug.cgi?id=1979440 >>>> >>>> Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> >>>> --- >>>> src/node_device/node_device_driver.c | 3 ++- >>>> src/node_device/node_device_udev.c | 2 +- >>>> tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml | 8 ++++---- >>>> 3 files changed, 7 insertions(+), 6 deletions(-) >>> >>> The patch looks good, but for some reason it leaves API breakage >>> aftertaste. But I guess we don't document anywhere what MDEV <name/> >>> looks like, do we? IOW - we are free to change it if needed. >> >> >> That is true -- it does have a bit of a bad aftertaste. As far as I >> know, we haven't documented or promised any specific naming format for >> mdevs. But it could certainly cause some pain for people that are >> using mdevs already, which might be reason enough to reject or adjust >> this patch. If a person already has assigned a mdev device to their vm >> with the old name, the name would change when upgrading libvirt. That >> could make the domain definition become invalid. >> >> But, we have to deal with this situation somehow. We don't have a >> choice but to handle the case where mdevctl might return multiple >> defined devices with the same UUID. I considered a couple of other >> approaches to handling this that I rejected, such as >> - only add a suffix to the second such device with the same UUID >> - rejected because the name of a given device depends on the order >> in which it was observed and the presence of other devices. For >> example, if we have two devices: mdev_$UUID and mdev_$UUID_$PARENT, >> and the first device was undefined, then when libvirt was restarted, >> mdev_$UUID_$PARENT would change its name to mdev_$UUID because it is >> now the only device with that UUID > > Agreed, this would not make situation any better, in fact worse. > >> - name all *active* mdevs mdev_$UUID and use a different scheme for >> inactive, defined mdevs >> - rejected because the device would change names when it changed >> from inactive to active. > > Yeah, this is equally horrible. > >> - ignore this situation because probably almost nobody uses multiple >> devices with the same UUID >> - but do they? > > Frankly, I don't have enough experience with MDEVs, but since it's > possible to define an MDEV with an existing UUID, it is possible to > define it also for the same parent? I mean, if I'd have an MDEV capable > graphics card and I'd define two MDEVs with the same UUID they would > both have the same parent, wouldn't they? Because in that case, > appending PCI address to libvirt's name is not enough.
Nope, it's the opposite in fact. mdevctl allows you to define two or more mdevs with the same UUID, but only if they have different parents. Only a single mdev with a given UUID can be activated at the same time. So among *active* devices, the UUID is indeed unique. But not among persistent device definitions. For defined devices, the UUID is guaranteed unique per parent.
I discussed this with Alex a while ago and he thought that the reason that mdevctl supported this was to allow people to assign an mdev with a specific UUID to a vm, but that UUID could represent one of several (equivalent?) devices, depending on which parent device was present in the host or which device was instantiated. That feature doesn't seem very compelling to me, and Alex wasn't sure that anybody was actually using it that way. But the fact remains that mdevctl does allow people to configure things this way and it seems like we need to be prepared to deal with it.
I am questioning how a user is supposed to correlate the nodedev representation with mdev_{uuid}_{parent} to a domain using the mdev like <devices> <hostdev mode='subsystem' type='mdev' model='vfio-ccw'> <source> <address uuid='{uuid}'/> </source> </hostdev> </devices>
Sorry for potentially confusing the impact of this change with my previous comments. I had said that domains with mdevs assigned might become invalid if the nodedev name changes. But that's not actually true. The naming that we're discussing is only related to the name of the device within the libvirt nodedev driver. When adding a mediated device to a domain, you don't use this nodedev name, you use the UUID directly. This is similar to the way you add a physical passthrough PCI device to a domain using its PCI address rather than its 'pci_$domain_$bus_$slot_$fn' nodedev name.
Reading uuid it tells me its universal unique (https://en.wikipedia.org/wiki/Universally_unique_identifier) but now this is not true for definitions which breaks the term universal in my small universe since for the defined state of an mdev it becomes "parental" unique...
Yes, that's true. Which is why I think this is a questionable feature. But: A) defined devices are only *potential* devices.This is not all that different from the fact that a user can manually instantiate a mdev on parent A with a specific UUID and then later instantiate a totally different mdev on parent B with the exact same UUID. One UUID refers to different devices at different times. B) the question we're discussing is not whether mdevctl should allow duplicate UUIDs or not. mdevctl *already* allows it, so we need to figure out how to deal with it in libvirt.
I suppose one potential possibility might be to change upstream mdevctl to disallow duplicate UUIDs and then require this new mdevctl version for libvirt. But that's a discussion we'd have to have in upstream mdevctl first.
Maybe we can start the discussion, but even in libvirt you are allowed to have two guests with the same name and UUID if they are in different drivers (e.g. one in QEMU and one in LXC).
Meanwhile, I think these can be pushed as we don't have much options anyway.
Michal
I think that we have at least one option. We could rethink the approach to create nodedev objects based on mdevctl mdev definitions which actually are not a usable host resource for domains.
If I understand your suggestion, I don't think it solves any problems. It seems to me that you're saying that libvirt should ignore all mdev definitions from mdevctl if the parent does not (yet?) exist on the host. (At least that's my interpretation of your phrase "definitions which are not a usable host resource"). But it's still possible to have definitions with duplicate UUIDs for devices which have parents that *do* exist on the host but are simply not instantiated yet. If I'm misinterpreting your comment, feel free to clarify.
I used to think of the nodedev objects as libvirt showing the user its view of devices it can understand in the sysfs. This changed when mdev nodedev objects got created based on mdevctl's mediated device definition. These definition objects are different from the sysfs objects. One difference is state active/inactive which caused e.g. the introduction of the option "inactive" on list operation. When we now talk about an nodedev mdev object it could be a) an instantiated mediated devices without mdev definition, b) an instantiated mediated devices with mdev definition or c) just an mdev definition, but just an instantiated mediated device [part of a) and b)] is actually a host resource one can use in a domain. So what I meant by "...not usable host resources" was the definition itself is not usable for a domain. The sysfs based host resources are. I agree that mdev definitions and operations on these definitions have a value but I am getting the feeling that they do not fit so well into the nodedev objects especially with the new twist of the loss of identity uniqueness which results from my perspective in a very messy nodedev object space. There is one other thing I noticed. I can change an mdev definition after I initiated/started the mdev definition. This is kind of starting a domain and afterwards change/edit its persisted definition. So when dumping the xml one would need to be able to chose running/started or defined. This currently is not a big issue since there are not many actually no attributes that can be retrieved via mdevctl from a running/started mdev but it just adds to the disparity since currently the user of nodedev gets the mdev definition and running/started mdev in a merged nodedev object. -- Mit freundlichen Grüßen/Kind regards Boris Fiuczynski IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Gregor Pillen Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294

On Tue, Aug 3, 2021 at 6:24 AM Boris Fiuczynski <fiuczy@linux.ibm.com> wrote:
On 8/2/21 5:30 PM, Jonathon Jongsma wrote:
On Fri, Jul 30, 2021 at 8:01 AM Boris Fiuczynski <fiuczy@linux.ibm.com> wrote:
On 7/30/21 9:48 AM, Michal Prívozník wrote:
On 7/29/21 9:27 PM, Jonathon Jongsma wrote:
On Thu, Jul 29, 2021 at 1:35 PM Boris Fiuczynski <fiuczy@linux.ibm.com> wrote:
On 7/27/21 4:09 PM, Jonathon Jongsma wrote: > On Tue, Jul 27, 2021 at 3:02 AM Michal Prívozník <mprivozn@redhat.com> wrote: >> >> On 7/27/21 12:08 AM, Jonathon Jongsma wrote: >>> On Mon, Jul 26, 2021 at 9:47 AM Michal Prívozník <mprivozn@redhat.com> wrote: >>>> >>>> On 7/23/21 6:40 PM, Jonathon Jongsma wrote: >>>>> Unfortunately, mdevctl supports defining more than one mdev with the >>>>> same UUID as long as they have different parent devices. (Only one of >>>>> these devices can be active at any given time). >>>>> >>>>> This means that we can't use the UUID alone as a way to uniquely >>>>> identify mdev node devices. Append the parent address to ensure >>>>> uniqueness. For example: >>>>> >>>>> Before: mdev_88a6b868_46bd_4015_8e5b_26107f82da38 >>>>> After: mdev_88a6b868_46bd_4015_8e5b_26107f82da38_0000_00_02_0 >>>>> >>>>> Related: https://bugzilla.redhat.com/show_bug.cgi?id=1979440 >>>>> >>>>> Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> >>>>> --- >>>>> src/node_device/node_device_driver.c | 3 ++- >>>>> src/node_device/node_device_udev.c | 2 +- >>>>> tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml | 8 ++++---- >>>>> 3 files changed, 7 insertions(+), 6 deletions(-) >>>> >>>> The patch looks good, but for some reason it leaves API breakage >>>> aftertaste. But I guess we don't document anywhere what MDEV <name/> >>>> looks like, do we? IOW - we are free to change it if needed. >>> >>> >>> That is true -- it does have a bit of a bad aftertaste. As far as I >>> know, we haven't documented or promised any specific naming format for >>> mdevs. But it could certainly cause some pain for people that are >>> using mdevs already, which might be reason enough to reject or adjust >>> this patch. If a person already has assigned a mdev device to their vm >>> with the old name, the name would change when upgrading libvirt. That >>> could make the domain definition become invalid. >>> >>> But, we have to deal with this situation somehow. We don't have a >>> choice but to handle the case where mdevctl might return multiple >>> defined devices with the same UUID. I considered a couple of other >>> approaches to handling this that I rejected, such as >>> - only add a suffix to the second such device with the same UUID >>> - rejected because the name of a given device depends on the order >>> in which it was observed and the presence of other devices. For >>> example, if we have two devices: mdev_$UUID and mdev_$UUID_$PARENT, >>> and the first device was undefined, then when libvirt was restarted, >>> mdev_$UUID_$PARENT would change its name to mdev_$UUID because it is >>> now the only device with that UUID >> >> Agreed, this would not make situation any better, in fact worse. >> >>> - name all *active* mdevs mdev_$UUID and use a different scheme for >>> inactive, defined mdevs >>> - rejected because the device would change names when it changed >>> from inactive to active. >> >> Yeah, this is equally horrible. >> >>> - ignore this situation because probably almost nobody uses multiple >>> devices with the same UUID >>> - but do they? >> >> Frankly, I don't have enough experience with MDEVs, but since it's >> possible to define an MDEV with an existing UUID, it is possible to >> define it also for the same parent? I mean, if I'd have an MDEV capable >> graphics card and I'd define two MDEVs with the same UUID they would >> both have the same parent, wouldn't they? Because in that case, >> appending PCI address to libvirt's name is not enough. > > > Nope, it's the opposite in fact. mdevctl allows you to define two or > more mdevs with the same UUID, but only if they have different > parents. Only a single mdev with a given UUID can be activated at the > same time. So among *active* devices, the UUID is indeed unique. But > not among persistent device definitions. For defined devices, the UUID > is guaranteed unique per parent. > > I discussed this with Alex a while ago and he thought that the reason > that mdevctl supported this was to allow people to assign an mdev with > a specific UUID to a vm, but that UUID could represent one of several > (equivalent?) devices, depending on which parent device was present in > the host or which device was instantiated. That feature doesn't seem > very compelling to me, and Alex wasn't sure that anybody was actually > using it that way. But the fact remains that mdevctl does allow people > to configure things this way and it seems like we need to be prepared > to deal with it.
I am questioning how a user is supposed to correlate the nodedev representation with mdev_{uuid}_{parent} to a domain using the mdev like <devices> <hostdev mode='subsystem' type='mdev' model='vfio-ccw'> <source> <address uuid='{uuid}'/> </source> </hostdev> </devices>
Sorry for potentially confusing the impact of this change with my previous comments. I had said that domains with mdevs assigned might become invalid if the nodedev name changes. But that's not actually true. The naming that we're discussing is only related to the name of the device within the libvirt nodedev driver. When adding a mediated device to a domain, you don't use this nodedev name, you use the UUID directly. This is similar to the way you add a physical passthrough PCI device to a domain using its PCI address rather than its 'pci_$domain_$bus_$slot_$fn' nodedev name.
Reading uuid it tells me its universal unique (https://en.wikipedia.org/wiki/Universally_unique_identifier) but now this is not true for definitions which breaks the term universal in my small universe since for the defined state of an mdev it becomes "parental" unique...
Yes, that's true. Which is why I think this is a questionable feature. But: A) defined devices are only *potential* devices.This is not all that different from the fact that a user can manually instantiate a mdev on parent A with a specific UUID and then later instantiate a totally different mdev on parent B with the exact same UUID. One UUID refers to different devices at different times. B) the question we're discussing is not whether mdevctl should allow duplicate UUIDs or not. mdevctl *already* allows it, so we need to figure out how to deal with it in libvirt.
I suppose one potential possibility might be to change upstream mdevctl to disallow duplicate UUIDs and then require this new mdevctl version for libvirt. But that's a discussion we'd have to have in upstream mdevctl first.
Maybe we can start the discussion, but even in libvirt you are allowed to have two guests with the same name and UUID if they are in different drivers (e.g. one in QEMU and one in LXC).
Meanwhile, I think these can be pushed as we don't have much options anyway.
Michal
I think that we have at least one option. We could rethink the approach to create nodedev objects based on mdevctl mdev definitions which actually are not a usable host resource for domains.
If I understand your suggestion, I don't think it solves any problems. It seems to me that you're saying that libvirt should ignore all mdev definitions from mdevctl if the parent does not (yet?) exist on the host. (At least that's my interpretation of your phrase "definitions which are not a usable host resource"). But it's still possible to have definitions with duplicate UUIDs for devices which have parents that *do* exist on the host but are simply not instantiated yet. If I'm misinterpreting your comment, feel free to clarify.
I used to think of the nodedev objects as libvirt showing the user its view of devices it can understand in the sysfs. This changed when mdev nodedev objects got created based on mdevctl's mediated device definition. These definition objects are different from the sysfs objects.
Yes, but in a similar way, a defined storage pool is different from an active storage pool. Or a defined domain that is different from an active domain. There is plenty of precedent in libvirt of this kind of distinction.
One difference is state active/inactive which caused e.g. the introduction of the option "inactive" on list operation. When we now talk about an nodedev mdev object it could be a) an instantiated mediated devices without mdev definition, b) an instantiated mediated devices with mdev definition or c) just an mdev definition, but just an instantiated mediated device [part of a) and b)] is actually a host resource one can use in a domain. So what I meant by "...not usable host resources" was the definition itself is not usable for a domain. The sysfs based host resources are.
Those three options are also true of pretty much every other libvirt object as well. A domain can be transient or defined, active or inactive. Same for pools. I can define a domain in libvirt that uses devices that are not (currently) present on the host system. I can define a storage pool that refers to a nonexistent directory. When I attempt to instantiate these objects, libvirt will return an error, but I am allowed to define objects that refer to unavailable host resources. So all of those things mentioned above seem pretty analogous to existing libvirt objects.
I agree that mdev definitions and operations on these definitions have a value but I am getting the feeling that they do not fit so well into the nodedev objects especially with the new twist of the loss of identity uniqueness which results from my perspective in a very messy nodedev object space.
I don't think we've lost any identity uniqueness. It's just that we initially chose a device name schema for mdevs that turned out to not be unique when mdevctl is used as a backend. Also, I don't think this nodedev object space is any more "messy" than the other libvirt objects (e.g. pools and domains mentioned above). Are you arguing that we should not provide a way to define new persistent nodedevs in libvirt?
There is one other thing I noticed. I can change an mdev definition after I initiated/started the mdev definition. This is kind of starting a domain and afterwards change/edit its persisted definition. So when dumping the xml one would need to be able to chose running/started or defined. This currently is not a big issue since there are not many actually no attributes that can be retrieved via mdevctl from a running/started mdev but it just adds to the disparity since currently the user of nodedev gets the mdev definition and running/started mdev in a merged nodedev object.
Again, this is not all that different from existing libvirt objects. I can also edit a persistent definition of a storage pool after it is started, for example. It's possible that there are some issues unique to mdevs that I need to fix in this area, but those are just details. In general it seems pretty consistent with other libvirt objects. Jonathon

On 8/3/21 9:55 PM, Jonathon Jongsma wrote:
On Tue, Aug 3, 2021 at 6:24 AM Boris Fiuczynski <fiuczy@linux.ibm.com> wrote:
On 8/2/21 5:30 PM, Jonathon Jongsma wrote:
On Fri, Jul 30, 2021 at 8:01 AM Boris Fiuczynski <fiuczy@linux.ibm.com> wrote:
On 7/30/21 9:48 AM, Michal Prívozník wrote:
On 7/29/21 9:27 PM, Jonathon Jongsma wrote:
On Thu, Jul 29, 2021 at 1:35 PM Boris Fiuczynski <fiuczy@linux.ibm.com> wrote: > > On 7/27/21 4:09 PM, Jonathon Jongsma wrote: >> On Tue, Jul 27, 2021 at 3:02 AM Michal Prívozník <mprivozn@redhat.com> wrote: >>> >>> On 7/27/21 12:08 AM, Jonathon Jongsma wrote: >>>> On Mon, Jul 26, 2021 at 9:47 AM Michal Prívozník <mprivozn@redhat.com> wrote: >>>>> >>>>> On 7/23/21 6:40 PM, Jonathon Jongsma wrote: >>>>>> Unfortunately, mdevctl supports defining more than one mdev with the >>>>>> same UUID as long as they have different parent devices. (Only one of >>>>>> these devices can be active at any given time). >>>>>> >>>>>> This means that we can't use the UUID alone as a way to uniquely >>>>>> identify mdev node devices. Append the parent address to ensure >>>>>> uniqueness. For example: >>>>>> >>>>>> Before: mdev_88a6b868_46bd_4015_8e5b_26107f82da38 >>>>>> After: mdev_88a6b868_46bd_4015_8e5b_26107f82da38_0000_00_02_0 >>>>>> >>>>>> Related: https://bugzilla.redhat.com/show_bug.cgi?id=1979440 >>>>>> >>>>>> Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> >>>>>> --- >>>>>> src/node_device/node_device_driver.c | 3 ++- >>>>>> src/node_device/node_device_udev.c | 2 +- >>>>>> tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml | 8 ++++---- >>>>>> 3 files changed, 7 insertions(+), 6 deletions(-) >>>>> >>>>> The patch looks good, but for some reason it leaves API breakage >>>>> aftertaste. But I guess we don't document anywhere what MDEV <name/> >>>>> looks like, do we? IOW - we are free to change it if needed. >>>> >>>> >>>> That is true -- it does have a bit of a bad aftertaste. As far as I >>>> know, we haven't documented or promised any specific naming format for >>>> mdevs. But it could certainly cause some pain for people that are >>>> using mdevs already, which might be reason enough to reject or adjust >>>> this patch. If a person already has assigned a mdev device to their vm >>>> with the old name, the name would change when upgrading libvirt. That >>>> could make the domain definition become invalid. >>>> >>>> But, we have to deal with this situation somehow. We don't have a >>>> choice but to handle the case where mdevctl might return multiple >>>> defined devices with the same UUID. I considered a couple of other >>>> approaches to handling this that I rejected, such as >>>> - only add a suffix to the second such device with the same UUID >>>> - rejected because the name of a given device depends on the order >>>> in which it was observed and the presence of other devices. For >>>> example, if we have two devices: mdev_$UUID and mdev_$UUID_$PARENT, >>>> and the first device was undefined, then when libvirt was restarted, >>>> mdev_$UUID_$PARENT would change its name to mdev_$UUID because it is >>>> now the only device with that UUID >>> >>> Agreed, this would not make situation any better, in fact worse. >>> >>>> - name all *active* mdevs mdev_$UUID and use a different scheme for >>>> inactive, defined mdevs >>>> - rejected because the device would change names when it changed >>>> from inactive to active. >>> >>> Yeah, this is equally horrible. >>> >>>> - ignore this situation because probably almost nobody uses multiple >>>> devices with the same UUID >>>> - but do they? >>> >>> Frankly, I don't have enough experience with MDEVs, but since it's >>> possible to define an MDEV with an existing UUID, it is possible to >>> define it also for the same parent? I mean, if I'd have an MDEV capable >>> graphics card and I'd define two MDEVs with the same UUID they would >>> both have the same parent, wouldn't they? Because in that case, >>> appending PCI address to libvirt's name is not enough. >> >> >> Nope, it's the opposite in fact. mdevctl allows you to define two or >> more mdevs with the same UUID, but only if they have different >> parents. Only a single mdev with a given UUID can be activated at the >> same time. So among *active* devices, the UUID is indeed unique. But >> not among persistent device definitions. For defined devices, the UUID >> is guaranteed unique per parent. >> >> I discussed this with Alex a while ago and he thought that the reason >> that mdevctl supported this was to allow people to assign an mdev with >> a specific UUID to a vm, but that UUID could represent one of several >> (equivalent?) devices, depending on which parent device was present in >> the host or which device was instantiated. That feature doesn't seem >> very compelling to me, and Alex wasn't sure that anybody was actually >> using it that way. But the fact remains that mdevctl does allow people >> to configure things this way and it seems like we need to be prepared >> to deal with it. > > > I am questioning how a user is supposed to correlate the nodedev > representation with mdev_{uuid}_{parent} to a domain using the mdev like > <devices> > <hostdev mode='subsystem' type='mdev' model='vfio-ccw'> > <source> > <address uuid='{uuid}'/> > </source> > </hostdev> > </devices>
Sorry for potentially confusing the impact of this change with my previous comments. I had said that domains with mdevs assigned might become invalid if the nodedev name changes. But that's not actually true. The naming that we're discussing is only related to the name of the device within the libvirt nodedev driver. When adding a mediated device to a domain, you don't use this nodedev name, you use the UUID directly. This is similar to the way you add a physical passthrough PCI device to a domain using its PCI address rather than its 'pci_$domain_$bus_$slot_$fn' nodedev name.
> Reading uuid it tells me its universal unique > (https://en.wikipedia.org/wiki/Universally_unique_identifier) but now > this is not true for definitions which breaks the term universal in my > small universe since for the defined state of an mdev it becomes > "parental" unique...
Yes, that's true. Which is why I think this is a questionable feature. But: A) defined devices are only *potential* devices.This is not all that different from the fact that a user can manually instantiate a mdev on parent A with a specific UUID and then later instantiate a totally different mdev on parent B with the exact same UUID. One UUID refers to different devices at different times. B) the question we're discussing is not whether mdevctl should allow duplicate UUIDs or not. mdevctl *already* allows it, so we need to figure out how to deal with it in libvirt.
I suppose one potential possibility might be to change upstream mdevctl to disallow duplicate UUIDs and then require this new mdevctl version for libvirt. But that's a discussion we'd have to have in upstream mdevctl first.
Maybe we can start the discussion, but even in libvirt you are allowed to have two guests with the same name and UUID if they are in different drivers (e.g. one in QEMU and one in LXC).
Meanwhile, I think these can be pushed as we don't have much options anyway.
Michal
I think that we have at least one option. We could rethink the approach to create nodedev objects based on mdevctl mdev definitions which actually are not a usable host resource for domains.
If I understand your suggestion, I don't think it solves any problems. It seems to me that you're saying that libvirt should ignore all mdev definitions from mdevctl if the parent does not (yet?) exist on the host. (At least that's my interpretation of your phrase "definitions which are not a usable host resource"). But it's still possible to have definitions with duplicate UUIDs for devices which have parents that *do* exist on the host but are simply not instantiated yet. If I'm misinterpreting your comment, feel free to clarify.
I used to think of the nodedev objects as libvirt showing the user its view of devices it can understand in the sysfs. This changed when mdev nodedev objects got created based on mdevctl's mediated device definition. These definition objects are different from the sysfs objects.
Yes, but in a similar way, a defined storage pool is different from an active storage pool. Or a defined domain that is different from an active domain. There is plenty of precedent in libvirt of this kind of distinction.
One difference is state active/inactive which caused e.g. the introduction of the option "inactive" on list operation. When we now talk about an nodedev mdev object it could be a) an instantiated mediated devices without mdev definition, b) an instantiated mediated devices with mdev definition or c) just an mdev definition, but just an instantiated mediated device [part of a) and b)] is actually a host resource one can use in a domain. So what I meant by "...not usable host resources" was the definition itself is not usable for a domain. The sysfs based host resources are.
Those three options are also true of pretty much every other libvirt object as well. A domain can be transient or defined, active or inactive. Same for pools. I can define a domain in libvirt that uses devices that are not (currently) present on the host system. I can define a storage pool that refers to a nonexistent directory. When I attempt to instantiate these objects, libvirt will return an error, but I am allowed to define objects that refer to unavailable host resources. So all of those things mentioned above seem pretty analogous to existing libvirt objects.
I agree that mdev definitions and operations on these definitions have a value but I am getting the feeling that they do not fit so well into the nodedev objects especially with the new twist of the loss of identity uniqueness which results from my perspective in a very messy nodedev object space.
I don't think we've lost any identity uniqueness. It's just that we initially chose a device name schema for mdevs that turned out to not be unique when mdevctl is used as a backend.
Also, I don't think this nodedev object space is any more "messy" than the other libvirt objects (e.g. pools and domains mentioned above). Are you arguing that we should not provide a way to define new persistent nodedevs in libvirt?
There is one other thing I noticed. I can change an mdev definition after I initiated/started the mdev definition. This is kind of starting a domain and afterwards change/edit its persisted definition. So when dumping the xml one would need to be able to chose running/started or defined. This currently is not a big issue since there are not many actually no attributes that can be retrieved via mdevctl from a running/started mdev but it just adds to the disparity since currently the user of nodedev gets the mdev definition and running/started mdev in a merged nodedev object.
Again, this is not all that different from existing libvirt objects. I can also edit a persistent definition of a storage pool after it is started, for example. It's possible that there are some issues unique to mdevs that I need to fix in this area, but those are just details. In general it seems pretty consistent with other libvirt objects.
Jonathon
My arguing was based on looking at the node device driver. I know that the concepts exist elsewhere in libvirt but apparently not in the node device driver. I always though that the reason for it is/was that it is based on sysfs. -- Mit freundlichen Grüßen/Kind regards Boris Fiuczynski IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Gregor Pillen Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294

On 7/26/21 4:47 PM, Michal Prívozník wrote:
On 7/23/21 6:40 PM, Jonathon Jongsma wrote:
Unfortunately, mdevctl supports defining more than one mdev with the same UUID as long as they have different parent devices. (Only one of these devices can be active at any given time).
This means that we can't use the UUID alone as a way to uniquely identify mdev node devices. Append the parent address to ensure uniqueness. For example:
Before: mdev_88a6b868_46bd_4015_8e5b_26107f82da38 After: mdev_88a6b868_46bd_4015_8e5b_26107f82da38_0000_00_02_0
Related: https://bugzilla.redhat.com/show_bug.cgi?id=1979440
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/node_device/node_device_driver.c | 3 ++- src/node_device/node_device_udev.c | 2 +- tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml | 8 ++++---- 3 files changed, 7 insertions(+), 6 deletions(-)
The patch looks good, but for some reason it leaves API breakage aftertaste. But I guess we don't document anywhere what MDEV <name/> looks like, do we? IOW - we are free to change it if needed.
Michal
Michal, Jonathon, now that the name change broke the undocumented {bus}_{device} format pattern I am coping with the aftermath. Besides the fact that before- and after-format pattern support cannot easily be introspected until actually dealing with an mdev and than try to make sense from its name I am confronted with two general questions: 1. How stable are nodedev object names? 2. Do nodedev object names contain bus and device address information one can rely on or are they (since the format is undocumented) just arbitrary characters? -- Mit freundlichen Grüßen/Kind regards Boris Fiuczynski IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Gregor Pillen Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294

Since UUID is not guaranteed to be unique by mdevctl, we may have more than one nodedev with the same UUID. Therefore, we need to disambiguate when looking up mdevs by specifying the UUID and parent address, which mdevctl guarantees to be a unique combination. Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/conf/virnodedeviceobj.c | 18 ++++++++++++++---- src/conf/virnodedeviceobj.h | 3 ++- src/node_device/node_device_driver.c | 24 ++++++++++++++++-------- 3 files changed, 32 insertions(+), 13 deletions(-) diff --git a/src/conf/virnodedeviceobj.c b/src/conf/virnodedeviceobj.c index 6e7b354f96..d1d23fc857 100644 --- a/src/conf/virnodedeviceobj.c +++ b/src/conf/virnodedeviceobj.c @@ -401,13 +401,20 @@ virNodeDeviceObjListFindSCSIHostByWWNs(virNodeDeviceObjList *devs, &data); } + +typedef struct { + const char *uuid; + const char *parent_addr; +} FindMediatedDeviceData; + + static int virNodeDeviceObjListFindMediatedDeviceByUUIDCallback(const void *payload, const char *name G_GNUC_UNUSED, const void *opaque) { virNodeDeviceObj *obj = (virNodeDeviceObj *) payload; - const char *uuid = (const char *) opaque; + const FindMediatedDeviceData* data = opaque; virNodeDevCapsDef *cap; int want = 0; @@ -415,7 +422,8 @@ virNodeDeviceObjListFindMediatedDeviceByUUIDCallback(const void *payload, for (cap = obj->def->caps; cap != NULL; cap = cap->next) { if (cap->data.type == VIR_NODE_DEV_CAP_MDEV) { - if (STREQ(cap->data.mdev.uuid, uuid)) { + if (STREQ(cap->data.mdev.uuid, data->uuid) && + STREQ(cap->data.mdev.parent_addr, data->parent_addr)) { want = 1; break; } @@ -429,11 +437,13 @@ virNodeDeviceObjListFindMediatedDeviceByUUIDCallback(const void *payload, virNodeDeviceObj * virNodeDeviceObjListFindMediatedDeviceByUUID(virNodeDeviceObjList *devs, - const char *uuid) + const char *uuid, + const char *parent_addr) { + const FindMediatedDeviceData data = {uuid, parent_addr}; return virNodeDeviceObjListSearch(devs, virNodeDeviceObjListFindMediatedDeviceByUUIDCallback, - uuid); + &data); } static void diff --git a/src/conf/virnodedeviceobj.h b/src/conf/virnodedeviceobj.h index 1fdd4f65da..068c7c6f34 100644 --- a/src/conf/virnodedeviceobj.h +++ b/src/conf/virnodedeviceobj.h @@ -122,7 +122,8 @@ virNodeDeviceObjSetSkipUpdateCaps(virNodeDeviceObj *obj, bool skipUpdateCaps); virNodeDeviceObj * virNodeDeviceObjListFindMediatedDeviceByUUID(virNodeDeviceObjList *devs, - const char *uuid); + const char *uuid, + const char *parent_addr); bool virNodeDeviceObjIsActive(virNodeDeviceObj *obj); diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index 4e676bfd56..ddb9a9bca7 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -321,6 +321,7 @@ nodeDeviceLookupSCSIHostByWWN(virConnectPtr conn, static virNodeDevicePtr nodeDeviceLookupMediatedDeviceByUUID(virConnectPtr conn, const char *uuid, + const char *parent_addr, unsigned int flags) { virNodeDeviceObj *obj = NULL; @@ -330,7 +331,7 @@ nodeDeviceLookupMediatedDeviceByUUID(virConnectPtr conn, virCheckFlags(0, NULL); if (!(obj = virNodeDeviceObjListFindMediatedDeviceByUUID(driver->devs, - uuid))) + uuid, parent_addr))) return NULL; def = virNodeDeviceObjGetDef(obj); @@ -542,23 +543,29 @@ nodeDeviceFindNewDevice(virConnectPtr conn, } +typedef struct { + const char *uuid; + const char *parent_addr; +} NewMediatedDeviceData; + static virNodeDevicePtr nodeDeviceFindNewMediatedDeviceFunc(virConnectPtr conn, const void *opaque) { - const char *uuid = opaque; + const NewMediatedDeviceData *data = opaque; - return nodeDeviceLookupMediatedDeviceByUUID(conn, uuid, 0); + return nodeDeviceLookupMediatedDeviceByUUID(conn, data->uuid, data->parent_addr, 0); } static virNodeDevicePtr nodeDeviceFindNewMediatedDevice(virConnectPtr conn, - const char *mdev_uuid) + const char *mdev_uuid, + const char *parent_addr) { - return nodeDeviceFindNewDevice(conn, - nodeDeviceFindNewMediatedDeviceFunc, - mdev_uuid); + NewMediatedDeviceData data = {mdev_uuid, parent_addr}; + return nodeDeviceFindNewDevice(conn, nodeDeviceFindNewMediatedDeviceFunc, + &data); } @@ -867,7 +874,8 @@ nodeDeviceCreateXMLMdev(virConnectPtr conn, def->caps->data.mdev.uuid = g_steal_pointer(&uuid); } - return nodeDeviceFindNewMediatedDevice(conn, def->caps->data.mdev.uuid); + return nodeDeviceFindNewMediatedDevice(conn, def->caps->data.mdev.uuid, + def->caps->data.mdev.parent_addr); } -- 2.31.1

On 7/23/21 6:40 PM, Jonathon Jongsma wrote:
Since UUID is not guaranteed to be unique by mdevctl, we may have more than one nodedev with the same UUID. Therefore, we need to disambiguate when looking up mdevs by specifying the UUID and parent address, which mdevctl guarantees to be a unique combination.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/conf/virnodedeviceobj.c | 18 ++++++++++++++---- src/conf/virnodedeviceobj.h | 3 ++- src/node_device/node_device_driver.c | 24 ++++++++++++++++-------- 3 files changed, 32 insertions(+), 13 deletions(-)
diff --git a/src/conf/virnodedeviceobj.c b/src/conf/virnodedeviceobj.c index 6e7b354f96..d1d23fc857 100644 --- a/src/conf/virnodedeviceobj.c +++ b/src/conf/virnodedeviceobj.c @@ -401,13 +401,20 @@ virNodeDeviceObjListFindSCSIHostByWWNs(virNodeDeviceObjList *devs, &data); }
+ +typedef struct { + const char *uuid; + const char *parent_addr; +} FindMediatedDeviceData; +
Again, typedef on one line, struct on the other. Michal

On 7/23/21 6:40 PM, Jonathon Jongsma wrote:
These patches fix a couple of outstanding issues with mdev support. These include:
1. output proper xml for the mdev parent device when run under the test suite 2. mdevctl allows multiple devices with the same UUID (but different parents), so we have to be prepared to handle this. This also requires a change in nodedev name for mdev devices. From 'mdev_$UUID' to 'mdev_$UUID_$PARENTADDR' 3. validate input xml when defining or creating mdevs (e.g. ensure we have a proper parent device)
This patch obsoletes the previous patch titled "nodedev: Handle inactive mdevs with the same UUID"
Jonathon Jongsma (7): nodedev: add internal virNodeDeviceObjListFind() nodedev: fix xml output for mdev parents in test suite nodedev: cache parent address in mdev caps nodedev: Add parser validation for node devices nodedev: add PostParse callback for nodedev parsing nodedev: Handle inactive mdevs with the same UUID nodedev: look up mdevs by UUID and parent
src/conf/node_device_conf.c | 37 +++- src/conf/node_device_conf.h | 21 +- src/conf/virnodedeviceobj.c | 71 +++++-- src/conf/virnodedeviceobj.h | 15 +- src/hypervisor/domain_driver.c | 7 +- src/libvirt_private.syms | 1 + src/node_device/node_device_driver.c | 182 ++++++++++++++---- src/node_device/node_device_driver.h | 6 + src/node_device/node_device_udev.c | 21 +- src/test/test_driver.c | 6 +- .../mdevctl-list-multiple.out.xml | 16 +- tests/nodedevmdevctltest.c | 36 +++- tests/nodedevxml2xmltest.c | 3 +- 13 files changed, 338 insertions(+), 84 deletions(-)
If my assumption in 6/7 is correct and we don't break API by changing the name for MDEVs then: Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Michal
participants (4)
-
Boris Fiuczynski
-
Jonathon Jongsma
-
Michal Prívozník
-
Shalini Chellathurai Saroja