[libvirt PATCH v3 00/21] Add support for persistent mediated devices

This patch series follows the previously-merged series which added support for transient mediated devices. This series expands mdev support to include persistent device definitions. Again, it relies on mdevctl as the backend. It follows the common libvirt pattern of APIs by adding the following new APIs for node devices: - virNodeDeviceDefineXML() - defines a persistent device - virNodeDeviceUndefine() - undefines a persistent device - virNodeDeviceCreate() - starts a previously-defined device It also adds virsh commands mapping to these new APIs: nodedev-define, nodedev-undefine, and nodedev-start. Since we rely on mdevctl for the definition of ediated devices, we need a way to stay up-to-date with devices that are defined by mdevctl (outside of libvirt). The method for staying up-to-date is currently a little bit crude due to the fact that mdevctl does not emit any events when new devices are added or removed. As a workaround, we create a file monitor for the mdevctl config directory and re-query mdevctl when we detect changes within that directory. In the future, mdevctl may introduce a more elegant solution. changes in v3: - streamlined tests -- removed some unnecessary duplication - split out patch to factor out node device name generation function - split nodeDeviceParseMdevctlChildDevice() into a separate function - added follow-up patch to remove space-padded alignment in header - refactored the mdevctl update handling significantly: - no longer a separate persistent thread that gets signaled by a timer - now piggybacks onto the existing udev thread and signals the thread in t= he same way that the udev event does. - Daniel suggested spawning a throw-away thread to handle mdevctl updates, but that introduces the complexity of possibly serializing multiple throw-away threads (e.g. if we get an 'created' event followed immediate= ly by a 'deleted' event, two threads may be spawned and we'd need to ensure they are properly ordered) - added virNodeDeviceObjListForEach() and virNodeDeviceObjListRemoveLocked() to simplify removing devices that are removed from mdevctl. - coding style fixes - NOTE: per Erik's request, I experimented with changing the way that mdevctl commands were generated and tested (e.g. introducing something like virMdevctlGetCommand(def, MDEVCTL_COMMAND_<SUBCOMMAND>, ...)), but it was too invasive and awkward and didn't seem worthwhile Changes in v2: - rebase to latest git master Jonathon Jongsma (21): tests: remove extra trailing semicolon nodedev: introduce concept of 'active' node devices nodedev: Add ability to filter by active state nodedev: expose internal helper for naming devices nodedev: add ability to list and parse defined mdevs nodedev: add STOPPED/STARTED lifecycle events nodedev: add mdevctl devices to node device list nodedev: add helper functions to remove node devices nodedev: handle mdevs that disappear from mdevctl nodedev: rename dataReady to udevReady nodedev: Refresh mdev devices when changes are detected api: add virNodeDeviceDefineXML() virsh: Add --active, --inactive, --all to nodedev-list virsh: add nodedev-define command nodedev: refactor tests to support mdev undefine api: add virNodeDeviceUndefine() virsh: Factor out function to find node device virsh: add nodedev-undefine command api: add virNodeDeviceCreate() virsh: add "nodedev-start" command libvirt-nodedev.h: remove space-padded alignment examples/c/misc/event-test.c | 4 + include/libvirt/libvirt-nodedev.h | 98 ++-- src/access/viraccessperm.c | 2 +- src/access/viraccessperm.h | 6 + src/conf/node_device_conf.h | 9 + src/conf/virnodedeviceobj.c | 132 ++++- src/conf/virnodedeviceobj.h | 19 + src/driver-nodedev.h | 14 + src/libvirt-nodedev.c | 115 ++++ src/libvirt_private.syms | 4 + src/libvirt_public.syms | 3 + src/node_device/node_device_driver.c | 538 +++++++++++++++++- src/node_device/node_device_driver.h | 38 ++ src/node_device/node_device_udev.c | 308 ++++++++-- src/remote/remote_driver.c | 3 + src/remote/remote_protocol.x | 40 +- src/remote_protocol-structs | 16 + src/rpc/gendispatch.pl | 1 + ...19_36ea_4111_8f0a_8c9a70e21366-define.argv | 1 + ...19_36ea_4111_8f0a_8c9a70e21366-define.json | 1 + ...39_495e_4243_ad9f_beb3f14c23d9-define.argv | 1 + ...39_495e_4243_ad9f_beb3f14c23d9-define.json | 1 + ...16_1ca8_49ac_b176_871d16c13076-define.argv | 1 + ...16_1ca8_49ac_b176_871d16c13076-define.json | 1 + tests/nodedevmdevctldata/mdevctl-create.argv | 1 + .../mdevctl-list-defined.argv | 1 + .../mdevctl-list-multiple.json | 59 ++ .../mdevctl-list-multiple.out.xml | 39 ++ tests/nodedevmdevctltest.c | 222 +++++++- tools/virsh-nodedev.c | 276 +++++++-- 30 files changed, 1765 insertions(+), 189 deletions(-) create mode 100644 tests/nodedevmdevctldata/mdev_d069d019_36ea_4111_8f0a_8c9= a70e21366-define.argv create mode 100644 tests/nodedevmdevctldata/mdev_d069d019_36ea_4111_8f0a_8c9= a70e21366-define.json create mode 100644 tests/nodedevmdevctldata/mdev_d2441d39_495e_4243_ad9f_beb= 3f14c23d9-define.argv create mode 100644 tests/nodedevmdevctldata/mdev_d2441d39_495e_4243_ad9f_beb= 3f14c23d9-define.json create mode 100644 tests/nodedevmdevctldata/mdev_fedc4916_1ca8_49ac_b176_871= d16c13076-define.argv create mode 100644 tests/nodedevmdevctldata/mdev_fedc4916_1ca8_49ac_b176_871= d16c13076-define.json create mode 100644 tests/nodedevmdevctldata/mdevctl-create.argv create mode 100644 tests/nodedevmdevctldata/mdevctl-list-defined.argv create mode 100644 tests/nodedevmdevctldata/mdevctl-list-multiple.json create mode 100644 tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml --=20 2.26.2

The macro should not have a trailing semicolon so that when the macro is used, the user can add a semicolon themselves. Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- tests/nodedevmdevctltest.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/nodedevmdevctltest.c b/tests/nodedevmdevctltest.c index dab4b487b9..1d3cb00400 100644 --- a/tests/nodedevmdevctltest.c +++ b/tests/nodedevmdevctltest.c @@ -271,7 +271,7 @@ mymain(void) struct startTestInfo info = { virt_type, create, filename }; \ DO_TEST_FULL("mdevctl start " filename, testMdevctlStartHelper, info); \ } \ - while (0); + while (0) #define DO_TEST_START(filename) \ DO_TEST_START_FULL("QEMU", CREATE_DEVICE, filename) -- 2.26.2

On Thu, Dec 24, 2020 at 08:14:25AM -0600, Jonathon Jongsma wrote:
The macro should not have a trailing semicolon so that when the macro is used, the user can add a semicolon themselves.
Well, it can, it has no effect. However, for consistency reasons, this is the prevailing form, so sure, let's drop it.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> ---
Reviewed-by: Erik Skultety <eskultet@redhat.com>

we will be able to define mediated devices that can be started or stopped, so we need to be able to indicate whether the device is active or not, similar to other resources (storage pools, domains, etc.) Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/conf/virnodedeviceobj.c | 14 ++++++++++++++ src/conf/virnodedeviceobj.h | 7 +++++++ src/libvirt_private.syms | 2 ++ src/node_device/node_device_udev.c | 3 +++ 4 files changed, 26 insertions(+) diff --git a/src/conf/virnodedeviceobj.c b/src/conf/virnodedeviceobj.c index c9bda77b2e..92f58dbf7d 100644 --- a/src/conf/virnodedeviceobj.c +++ b/src/conf/virnodedeviceobj.c @@ -39,6 +39,7 @@ struct _virNodeDeviceObj { virNodeDeviceDefPtr def; /* device definition */ bool skipUpdateCaps; /* whether to skip checking host caps, used by testdriver */ + bool active; }; struct _virNodeDeviceObjList { @@ -976,3 +977,16 @@ virNodeDeviceObjSetSkipUpdateCaps(virNodeDeviceObjPtr obj, { obj->skipUpdateCaps = skipUpdateCaps; } + +bool +virNodeDeviceObjIsActive(virNodeDeviceObjPtr obj) +{ + return obj->active; +} + +void +virNodeDeviceObjSetActive(virNodeDeviceObjPtr obj, + bool active) +{ + obj->active = active; +} diff --git a/src/conf/virnodedeviceobj.h b/src/conf/virnodedeviceobj.h index 6efdb23d36..c119f4c51f 100644 --- a/src/conf/virnodedeviceobj.h +++ b/src/conf/virnodedeviceobj.h @@ -121,3 +121,10 @@ virNodeDeviceObjSetSkipUpdateCaps(virNodeDeviceObjPtr obj, virNodeDeviceObjPtr virNodeDeviceObjListFindMediatedDeviceByUUID(virNodeDeviceObjListPtr devs, const char *uuid); + +bool +virNodeDeviceObjIsActive(virNodeDeviceObjPtr obj); + +void +virNodeDeviceObjSetActive(virNodeDeviceObjPtr obj, + bool active); diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 583fc5800e..fecd012300 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1203,6 +1203,7 @@ virNetworkPortDefSaveStatus; # conf/virnodedeviceobj.h virNodeDeviceObjEndAPI; virNodeDeviceObjGetDef; +virNodeDeviceObjIsActive; virNodeDeviceObjListAssignDef; virNodeDeviceObjListExport; virNodeDeviceObjListFindByName; @@ -1215,6 +1216,7 @@ virNodeDeviceObjListGetParentHost; virNodeDeviceObjListNew; virNodeDeviceObjListNumOfDevices; virNodeDeviceObjListRemove; +virNodeDeviceObjSetActive; # conf/virnwfilterbindingdef.h diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index 55a2731681..208f256895 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -1536,6 +1536,7 @@ udevAddOneDevice(struct udev_device *device) else event = virNodeDeviceEventUpdateNew(objdef->name); + virNodeDeviceObjSetActive(obj, true); virNodeDeviceObjEndAPI(&obj); ret = 0; @@ -1921,6 +1922,8 @@ udevSetupSystemDev(void) if (!(obj = virNodeDeviceObjListAssignDef(driver->devs, def))) goto cleanup; + virNodeDeviceObjSetActive(obj, true); + virNodeDeviceObjEndAPI(&obj); ret = 0; -- 2.26.2

On Thu, Dec 24, 2020 at 08:14:26AM -0600, Jonathon Jongsma wrote:
we will be able to define mediated devices that can be started or stopped, so we need to be able to indicate whether the device is active or not, similar to other resources (storage pools, domains, etc.)
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- Reviewed-by: Erik Skultety <eskultet@redhat.com>

Add two flag values for virConnectListAllNodeDevices() so that we can list only node devices that are active or inactive. Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- include/libvirt/libvirt-nodedev.h | 9 +++-- src/conf/node_device_conf.h | 8 ++++ src/conf/virnodedeviceobj.c | 56 ++++++++++++++++------------ src/libvirt-nodedev.c | 2 + src/node_device/node_device_driver.c | 2 +- 5 files changed, 50 insertions(+), 27 deletions(-) diff --git a/include/libvirt/libvirt-nodedev.h b/include/libvirt/libvirt-nodedev.h index eab8abf6ab..d304283871 100644 --- a/include/libvirt/libvirt-nodedev.h +++ b/include/libvirt/libvirt-nodedev.h @@ -61,10 +61,9 @@ int virNodeListDevices (virConnectPtr conn, * virConnectListAllNodeDevices: * * Flags used to filter the returned node devices. Flags in each group - * are exclusive. Currently only one group to filter the devices by cap - * type. - */ + * are exclusive. */ typedef enum { + /* filter the devices by cap type */ VIR_CONNECT_LIST_NODE_DEVICES_CAP_SYSTEM = 1 << 0, /* System capability */ VIR_CONNECT_LIST_NODE_DEVICES_CAP_PCI_DEV = 1 << 1, /* PCI device */ VIR_CONNECT_LIST_NODE_DEVICES_CAP_USB_DEV = 1 << 2, /* USB device */ @@ -86,6 +85,10 @@ typedef enum { VIR_CONNECT_LIST_NODE_DEVICES_CAP_AP_CARD = 1 << 18, /* s390 AP Card device */ VIR_CONNECT_LIST_NODE_DEVICES_CAP_AP_QUEUE = 1 << 19, /* s390 AP Queue */ VIR_CONNECT_LIST_NODE_DEVICES_CAP_AP_MATRIX = 1 << 20, /* s390 AP Matrix */ + + /* filter the devices by active state */ + VIR_CONNECT_LIST_NODE_DEVICES_INACTIVE = 1 << 29, /* Inactive devices */ + VIR_CONNECT_LIST_NODE_DEVICES_ACTIVE = 1 << 30, /* Active devices */ } virConnectListAllNodeDeviceFlags; int virConnectListAllNodeDevices (virConnectPtr conn, diff --git a/src/conf/node_device_conf.h b/src/conf/node_device_conf.h index c67b8e2aeb..3d7872fd6e 100644 --- a/src/conf/node_device_conf.h +++ b/src/conf/node_device_conf.h @@ -422,6 +422,14 @@ virNodeDevCapsDefFree(virNodeDevCapsDefPtr caps); VIR_CONNECT_LIST_NODE_DEVICES_CAP_AP_QUEUE | \ VIR_CONNECT_LIST_NODE_DEVICES_CAP_AP_MATRIX) +#define VIR_CONNECT_LIST_NODE_DEVICES_FILTERS_ACTIVE \ + VIR_CONNECT_LIST_NODE_DEVICES_ACTIVE |\ + VIR_CONNECT_LIST_NODE_DEVICES_INACTIVE + +#define VIR_CONNECT_LIST_NODE_DEVICES_FILTERS_ALL \ + VIR_CONNECT_LIST_NODE_DEVICES_FILTERS_CAP | \ + VIR_CONNECT_LIST_NODE_DEVICES_FILTERS_ACTIVE + int virNodeDeviceGetSCSIHostCaps(virNodeDevCapSCSIHostPtr scsi_host); diff --git a/src/conf/virnodedeviceobj.c b/src/conf/virnodedeviceobj.c index 92f58dbf7d..6e9291264a 100644 --- a/src/conf/virnodedeviceobj.c +++ b/src/conf/virnodedeviceobj.c @@ -848,8 +848,10 @@ virNodeDeviceObjListGetNames(virNodeDeviceObjListPtr devs, } -#define MATCH(FLAG) ((flags & (VIR_CONNECT_LIST_NODE_DEVICES_CAP_ ## FLAG)) && \ - virNodeDeviceObjHasCap(obj, VIR_NODE_DEV_CAP_ ## FLAG)) +#define MATCH_CAP(FLAG) ((flags & (VIR_CONNECT_LIST_NODE_DEVICES_CAP_ ## FLAG)) && \ + virNodeDeviceObjHasCap(obj, VIR_NODE_DEV_CAP_ ## FLAG)) +#define MATCH(FLAG) (flags & (FLAG)) + static bool virNodeDeviceObjMatch(virNodeDeviceObjPtr obj, unsigned int flags) @@ -861,33 +863,41 @@ virNodeDeviceObjMatch(virNodeDeviceObjPtr obj, /* filter by cap type */ if (flags & VIR_CONNECT_LIST_NODE_DEVICES_FILTERS_CAP) { - if (!(MATCH(SYSTEM) || - MATCH(PCI_DEV) || - MATCH(USB_DEV) || - MATCH(USB_INTERFACE) || - MATCH(NET) || - MATCH(SCSI_HOST) || - MATCH(SCSI_TARGET) || - MATCH(SCSI) || - MATCH(STORAGE) || - MATCH(FC_HOST) || - MATCH(VPORTS) || - MATCH(SCSI_GENERIC) || - MATCH(DRM) || - MATCH(MDEV_TYPES) || - MATCH(MDEV) || - MATCH(CCW_DEV) || - MATCH(CSS_DEV) || - MATCH(VDPA) || - MATCH(AP_CARD) || - MATCH(AP_QUEUE) || - MATCH(AP_MATRIX))) + if (!(MATCH_CAP(SYSTEM) || + MATCH_CAP(PCI_DEV) || + MATCH_CAP(USB_DEV) || + MATCH_CAP(USB_INTERFACE) || + MATCH_CAP(NET) || + MATCH_CAP(SCSI_HOST) || + MATCH_CAP(SCSI_TARGET) || + MATCH_CAP(SCSI) || + MATCH_CAP(STORAGE) || + MATCH_CAP(FC_HOST) || + MATCH_CAP(VPORTS) || + MATCH_CAP(SCSI_GENERIC) || + MATCH_CAP(DRM) || + MATCH_CAP(MDEV_TYPES) || + MATCH_CAP(MDEV) || + MATCH_CAP(CCW_DEV) || + MATCH_CAP(CSS_DEV) || + MATCH_CAP(VDPA) || + MATCH_CAP(AP_CARD) || + MATCH_CAP(AP_QUEUE) || + MATCH_CAP(AP_MATRIX))) return false; } + if (MATCH(VIR_CONNECT_LIST_NODE_DEVICES_FILTERS_ACTIVE) && + !((MATCH(VIR_CONNECT_LIST_NODE_DEVICES_ACTIVE) && + virNodeDeviceObjIsActive(obj)) || + (MATCH(VIR_CONNECT_LIST_NODE_DEVICES_INACTIVE) && + !virNodeDeviceObjIsActive(obj)))) + return false; + return true; } #undef MATCH +#undef MATCH_CAP typedef struct _virNodeDeviceObjListExportData virNodeDeviceObjListExportData; diff --git a/src/libvirt-nodedev.c b/src/libvirt-nodedev.c index eb8c735a8c..375b907852 100644 --- a/src/libvirt-nodedev.c +++ b/src/libvirt-nodedev.c @@ -105,6 +105,8 @@ virNodeNumOfDevices(virConnectPtr conn, const char *cap, unsigned int flags) * VIR_CONNECT_LIST_NODE_DEVICES_CAP_AP_CARD * VIR_CONNECT_LIST_NODE_DEVICES_CAP_AP_QUEUE * VIR_CONNECT_LIST_NODE_DEVICES_CAP_AP_MATRIX + * VIR_CONNECT_LIST_NODE_DEVICES_ACTIVE + * VIR_CONNECT_LIST_NODE_DEVICES_INACTIVE * * Returns the number of node devices found or -1 and sets @devices to NULL in * case of error. On success, the array stored into @devices is guaranteed to diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index 51b20848ce..c992251121 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -217,7 +217,7 @@ nodeConnectListAllNodeDevices(virConnectPtr conn, virNodeDevicePtr **devices, unsigned int flags) { - virCheckFlags(VIR_CONNECT_LIST_NODE_DEVICES_FILTERS_CAP, -1); + virCheckFlags(VIR_CONNECT_LIST_NODE_DEVICES_FILTERS_ALL, -1); if (virConnectListAllNodeDevicesEnsureACL(conn) < 0) return -1; -- 2.26.2

On Thu, Dec 24, 2020 at 08:14:27AM -0600, Jonathon Jongsma wrote:
Add two flag values for virConnectListAllNodeDevices() so that we can list only node devices that are active or inactive.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- Reviewed-by: Erik Skultety <eskultet@redhat.com>

Expose a helper function that can be used by udev and mdevctl to generate device names for node devices. Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/node_device/node_device_driver.c | 25 +++++++++++++++++++++++++ src/node_device/node_device_driver.h | 6 ++++++ src/node_device/node_device_udev.c | 19 +++---------------- 3 files changed, 34 insertions(+), 16 deletions(-) diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index c992251121..6143459618 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -968,3 +968,28 @@ nodedevRegister(void) return udevNodeRegister(); #endif } + + +void +nodeDeviceGenerateName(virNodeDeviceDefPtr def, + const char *subsystem, + const char *sysname, + const char *s) +{ + size_t i; + g_auto(virBuffer) buf = VIR_BUFFER_INITIALIZER; + + virBufferAsprintf(&buf, "%s_%s", + subsystem, + sysname); + + if (s != NULL) + virBufferAsprintf(&buf, "_%s", s); + + def->name = virBufferContentAndReset(&buf); + + for (i = 0; i < strlen(def->name); i++) { + if (!(g_ascii_isalnum(*(def->name + i)))) + *(def->name + i) = '_'; + } +} diff --git a/src/node_device/node_device_driver.h b/src/node_device/node_device_driver.h index 2113d2b0a5..02baf56dab 100644 --- a/src/node_device/node_device_driver.h +++ b/src/node_device/node_device_driver.h @@ -118,3 +118,9 @@ nodeDeviceGetMdevctlStartCommand(virNodeDeviceDefPtr def, char **uuid_out); virCommandPtr nodeDeviceGetMdevctlStopCommand(const char *uuid); + +void +nodeDeviceGenerateName(virNodeDeviceDefPtr def, + const char *subsystem, + const char *sysname, + const char *s); diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index 208f256895..be59e6c6bc 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -308,22 +308,9 @@ udevGenerateDeviceName(struct udev_device *device, virNodeDeviceDefPtr def, const char *s) { - size_t i; - g_auto(virBuffer) buf = VIR_BUFFER_INITIALIZER; - - virBufferAsprintf(&buf, "%s_%s", - udev_device_get_subsystem(device), - udev_device_get_sysname(device)); - - if (s != NULL) - virBufferAsprintf(&buf, "_%s", s); - - def->name = virBufferContentAndReset(&buf); - - for (i = 0; i < strlen(def->name); i++) { - if (!(g_ascii_isalnum(*(def->name + i)))) - *(def->name + i) = '_'; - } + nodeDeviceGenerateName(def, + udev_device_get_subsystem(device), + udev_device_get_sysname(device), s); return 0; } -- 2.26.2

On Thu, Dec 24, 2020 at 08:14:28AM -0600, Jonathon Jongsma wrote:
Expose a helper function that can be used by udev and mdevctl to generate device names for node devices.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- Reviewed-by: Erik Skultety <eskultet@redhat.com>

This adds some internal API to query for persistent mediated devices that are defined by mdevctl. Following commits will make use of this information. This just provides the infrastructure and tests for this feature. One test verifies that we are executing mdevctl with the proper arguments, and the other test verifies that we can parse the returned JSON and convert it to our own internal node device representations. Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/node_device/node_device_driver.c | 150 ++++++++++++++++++ src/node_device/node_device_driver.h | 7 + .../mdevctl-list-defined.argv | 1 + .../mdevctl-list-multiple.json | 59 +++++++ .../mdevctl-list-multiple.out.xml | 39 +++++ tests/nodedevmdevctltest.c | 95 ++++++++++- 6 files changed, 349 insertions(+), 2 deletions(-) create mode 100644 tests/nodedevmdevctldata/mdevctl-list-defined.argv create mode 100644 tests/nodedevmdevctldata/mdevctl-list-multiple.json create mode 100644 tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index 6143459618..bbd373e32e 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -853,6 +853,156 @@ virMdevctlStop(virNodeDeviceDefPtr def) } +virCommandPtr +nodeDeviceGetMdevctlListCommand(bool defined, + char **output) +{ + virCommandPtr cmd = virCommandNewArgList(MDEVCTL, + "list", + "--dumpjson", + NULL); + + if (defined) + virCommandAddArg(cmd, "--defined"); + + virCommandSetOutputBuffer(cmd, output); + + return cmd; +} + + +static void mdevGenerateDeviceName(virNodeDeviceDefPtr dev) +{ + nodeDeviceGenerateName(dev, "mdev", dev->caps->data.mdev.uuid, NULL); +} + + +static virNodeDeviceDefPtr +nodeDeviceParseMdevctlChildDevice(const char *parent, + virJSONValuePtr json) +{ + virNodeDevCapMdevPtr mdev; + const char *uuid; + virJSONValuePtr props, attrs; + g_autoptr(virNodeDeviceDef) child = g_new0(virNodeDeviceDef, 1); + + /* the child object should have a single key equal to its uuid. + * The value is an object describing the properties of the mdev */ + if (virJSONValueObjectKeysNumber(json) != 1) + return NULL; + + uuid = virJSONValueObjectGetKey(json, 0); + props = virJSONValueObjectGetValue(json, 0); + + child->parent = g_strdup(parent); + child->caps = g_new0(virNodeDevCapsDef, 1); + child->caps->data.type = VIR_NODE_DEV_CAP_MDEV; + + mdev = &child->caps->data.mdev; + mdev->uuid = g_strdup(uuid); + mdev->type = + g_strdup(virJSONValueObjectGetString(props, "mdev_type")); + + attrs = virJSONValueObjectGet(props, "attrs"); + + if (attrs && virJSONValueIsArray(attrs)) { + size_t i; + int nattrs = virJSONValueArraySize(attrs); + + mdev->attributes = g_new0(virMediatedDeviceAttrPtr, nattrs); + mdev->nattributes = nattrs; + + for (i = 0; i < nattrs; i++) { + virJSONValuePtr attr = virJSONValueArrayGet(attrs, i); + virMediatedDeviceAttrPtr attribute; + virJSONValuePtr value; + + if (!virJSONValueIsObject(attr) || + virJSONValueObjectKeysNumber(attr) != 1) + return NULL; + + attribute = g_new0(virMediatedDeviceAttr, 1); + attribute->name = g_strdup(virJSONValueObjectGetKey(attr, 0)); + value = virJSONValueObjectGetValue(attr, 0); + attribute->value = g_strdup(virJSONValueGetString(value)); + mdev->attributes[i] = attribute; + } + } + mdevGenerateDeviceName(child); + + return g_steal_pointer(&child); +} + + +int +nodeDeviceParseMdevctlJSON(const char *jsonstring, + virNodeDeviceDefPtr **devs) +{ + int n; + g_autoptr(virJSONValue) json_devicelist = NULL; + virNodeDeviceDefPtr *outdevs = NULL; + size_t noutdevs = 0; + size_t i, j; + + json_devicelist = virJSONValueFromString(jsonstring); + + if (!json_devicelist) + goto parsefailure; + + if (!virJSONValueIsArray(json_devicelist)) + goto parsefailure; + + n = virJSONValueArraySize(json_devicelist); + + for (i = 0; i < n; i++) { + virJSONValuePtr obj = virJSONValueArrayGet(json_devicelist, i); + const char *parent; + virJSONValuePtr child_array; + int nchildren; + + if (!virJSONValueIsObject(obj)) + goto parsefailure; + + /* mdevctl returns an array of objects. Each object is a parent device + * object containing a single key-value pair which maps from the name + * of the parent device to an array of child devices */ + if (virJSONValueObjectKeysNumber(obj) != 1) + goto parsefailure; + + parent = virJSONValueObjectGetKey(obj, 0); + child_array = virJSONValueObjectGetValue(obj, 0); + + if (!virJSONValueIsArray(child_array)) + goto parsefailure; + + nchildren = virJSONValueArraySize(child_array); + + for (j = 0; j < nchildren; j++) { + g_autoptr(virNodeDeviceDef) child = NULL; + virJSONValuePtr child_obj = virJSONValueArrayGet(child_array, j); + + if (!(child = nodeDeviceParseMdevctlChildDevice(parent, child_obj))) + goto parsefailure; + + if (VIR_APPEND_ELEMENT(outdevs, noutdevs, child) < 0) + goto parsefailure; + } + } + + *devs = outdevs; + return noutdevs; + + parsefailure: + for (i = 0; i < noutdevs; i++) + virNodeDeviceDefFree(outdevs[i]); + VIR_FREE(outdevs); + + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("unable to parse JSON response")); + return -1; +} + + int nodeDeviceDestroy(virNodeDevicePtr device) { diff --git a/src/node_device/node_device_driver.h b/src/node_device/node_device_driver.h index 02baf56dab..80ac7c5320 100644 --- a/src/node_device/node_device_driver.h +++ b/src/node_device/node_device_driver.h @@ -119,6 +119,13 @@ nodeDeviceGetMdevctlStartCommand(virNodeDeviceDefPtr def, virCommandPtr nodeDeviceGetMdevctlStopCommand(const char *uuid); +virCommandPtr +nodeDeviceGetMdevctlListCommand(bool defined, char **output); + +int +nodeDeviceParseMdevctlJSON(const char *jsonstring, + virNodeDeviceDefPtr **devs); + void nodeDeviceGenerateName(virNodeDeviceDefPtr def, const char *subsystem, diff --git a/tests/nodedevmdevctldata/mdevctl-list-defined.argv b/tests/nodedevmdevctldata/mdevctl-list-defined.argv new file mode 100644 index 0000000000..72b5906e9e --- /dev/null +++ b/tests/nodedevmdevctldata/mdevctl-list-defined.argv @@ -0,0 +1 @@ +$MDEVCTL_BINARY$ list --dumpjson --defined diff --git a/tests/nodedevmdevctldata/mdevctl-list-multiple.json b/tests/nodedevmdevctldata/mdevctl-list-multiple.json new file mode 100644 index 0000000000..eefcd90c62 --- /dev/null +++ b/tests/nodedevmdevctldata/mdevctl-list-multiple.json @@ -0,0 +1,59 @@ +[ + { + "0000:00:02.0": [ + { + "200f228a-c80a-4d50-bfb7-f5a0e4e34045": { + "mdev_type": "i915-GVTg_V5_4", + "start": "manual" + } + }, + { + "de807ffc-1923-4d5f-b6c9-b20ecebc6d4b": { + "mdev_type": "i915-GVTg_V5_4", + "start": "auto" + } + }, + { + "435722ea-5f43-468a-874f-da34f1217f13": { + "mdev_type": "i915-GVTg_V5_8", + "start": "manual", + "attrs": [ + { + "testattr": "42" + } + ] + } + } + ] + }, + { + "matrix": [ + { "783e6dbb-ea0e-411f-94e2-717eaad438bf": { + "mdev_type": "vfio_ap-passthrough", + "start": "manual", + "attrs": [ + { + "assign_adapter": "5" + }, + { + "assign_adapter": "6" + }, + { + "assign_domain": "0xab" + }, + { + "assign_control_domain": "0xab" + }, + { + "assign_domain": "4" + }, + { + "assign_control_domain": "4" + } + ] + } + } + ] + } +] + diff --git a/tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml b/tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml new file mode 100644 index 0000000000..543ad916b7 --- /dev/null +++ b/tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml @@ -0,0 +1,39 @@ +<device> + <name>mdev_200f228a_c80a_4d50_bfb7_f5a0e4e34045</name> + <parent>0000:00:02.0</parent> + <capability type='mdev'> + <type id='i915-GVTg_V5_4'/> + <iommuGroup number='0'/> + </capability> +</device> +<device> + <name>mdev_de807ffc_1923_4d5f_b6c9_b20ecebc6d4b</name> + <parent>0000:00:02.0</parent> + <capability type='mdev'> + <type id='i915-GVTg_V5_4'/> + <iommuGroup number='0'/> + </capability> +</device> +<device> + <name>mdev_435722ea_5f43_468a_874f_da34f1217f13</name> + <parent>0000:00:02.0</parent> + <capability type='mdev'> + <type id='i915-GVTg_V5_8'/> + <iommuGroup number='0'/> + <attr name='testattr' value='42'/> + </capability> +</device> +<device> + <name>mdev_783e6dbb_ea0e_411f_94e2_717eaad438bf</name> + <parent>matrix</parent> + <capability type='mdev'> + <type id='vfio_ap-passthrough'/> + <iommuGroup number='0'/> + <attr name='assign_adapter' value='5'/> + <attr name='assign_adapter' value='6'/> + <attr name='assign_domain' value='0xab'/> + <attr name='assign_control_domain' value='0xab'/> + <attr name='assign_domain' value='4'/> + <attr name='assign_control_domain' value='4'/> + </capability> +</device> diff --git a/tests/nodedevmdevctltest.c b/tests/nodedevmdevctltest.c index 1d3cb00400..8bc464d59f 100644 --- a/tests/nodedevmdevctltest.c +++ b/tests/nodedevmdevctltest.c @@ -143,6 +143,87 @@ testMdevctlStop(const void *data) return ret; } +static int +testMdevctlListDefined(const void *data G_GNUC_UNUSED) +{ + virBuffer buf = VIR_BUFFER_INITIALIZER; + const char *actualCmdline = NULL; + int ret = -1; + g_autoptr(virCommand) cmd = NULL; + g_autofree char *output = NULL; + g_autofree char *cmdlinefile = + g_strdup_printf("%s/nodedevmdevctldata/mdevctl-list-defined.argv", + abs_srcdir); + + cmd = nodeDeviceGetMdevctlListCommand(true, &output); + + if (!cmd) + goto cleanup; + + virCommandSetDryRun(&buf, NULL, NULL); + if (virCommandRun(cmd, NULL) < 0) + goto cleanup; + + if (!(actualCmdline = virBufferCurrentContent(&buf))) + goto cleanup; + + if (nodedevCompareToFile(actualCmdline, cmdlinefile) < 0) + goto cleanup; + + ret = 0; + + cleanup: + virBufferFreeAndReset(&buf); + virCommandSetDryRun(NULL, NULL, NULL); + return ret; +} + +static int +testMdevctlParse(const void *data) +{ + g_autofree char *buf = NULL; + const char *filename = data; + g_autofree char *jsonfile = g_strdup_printf("%s/nodedevmdevctldata/%s.json", + abs_srcdir, filename); + g_autofree char *xmloutfile = g_strdup_printf("%s/nodedevmdevctldata/%s.out.xml", + abs_srcdir, filename); + int ret = -1; + int nmdevs = 0; + virNodeDeviceDefPtr *mdevs = NULL; + virBuffer xmloutbuf = VIR_BUFFER_INITIALIZER; + size_t i; + + if (virFileReadAll(jsonfile, 1024*1024, &buf) < 0) { + VIR_TEST_DEBUG("Unable to read file %s", jsonfile); + return -1; + } + + if ((nmdevs = nodeDeviceParseMdevctlJSON(buf, &mdevs)) < 0) { + VIR_TEST_DEBUG("Unable to parse json for %s", filename); + return -1; + } + + for (i = 0; i < nmdevs; i++) { + g_autofree char *devxml = virNodeDeviceDefFormat(mdevs[i]); + if (!devxml) + goto cleanup; + virBufferAddStr(&xmloutbuf, devxml); + } + + if (nodedevCompareToFile(virBufferCurrentContent(&xmloutbuf), xmloutfile) < 0) + goto cleanup; + + ret = 0; + + cleanup: + virBufferFreeAndReset(&xmloutbuf); + for (i = 0; i < nmdevs; i++) + virNodeDeviceDefFree(mdevs[i]); + g_free(mdevs); + + return ret; +} + static void nodedevTestDriverFree(virNodeDeviceDriverStatePtr drv) { @@ -263,13 +344,13 @@ mymain(void) } #define DO_TEST_FULL(desc, func, info) \ - if (virTestRun(desc, func, &info) < 0) \ + if (virTestRun(desc, func, info) < 0) \ ret = -1; #define DO_TEST_START_FULL(virt_type, create, filename) \ do { \ struct startTestInfo info = { virt_type, create, filename }; \ - DO_TEST_FULL("mdevctl start " filename, testMdevctlStartHelper, info); \ + DO_TEST_FULL("mdevctl start " filename, testMdevctlStartHelper, &info); \ } \ while (0) @@ -279,6 +360,12 @@ mymain(void) #define DO_TEST_STOP(uuid) \ DO_TEST_FULL("mdevctl stop " uuid, testMdevctlStop, uuid) +#define DO_TEST_LIST_DEFINED() \ + DO_TEST_FULL("mdevctl list --defined", testMdevctlListDefined, NULL) + +#define DO_TEST_PARSE_JSON(filename) \ + DO_TEST_FULL("parse mdevctl json " filename, testMdevctlParse, filename) + /* Test mdevctl start commands */ DO_TEST_START("mdev_d069d019_36ea_4111_8f0a_8c9a70e21366"); DO_TEST_START("mdev_fedc4916_1ca8_49ac_b176_871d16c13076"); @@ -287,6 +374,10 @@ mymain(void) /* Test mdevctl stop command, pass an arbitrary uuid */ DO_TEST_STOP("e2451f73-c95b-4124-b900-e008af37c576"); + DO_TEST_LIST_DEFINED(); + + DO_TEST_PARSE_JSON("mdevctl-list-multiple"); + done: nodedevTestDriverFree(driver); -- 2.26.2

On Thu, Dec 24, 2020 at 08:14:29AM -0600, Jonathon Jongsma wrote:
This adds some internal API to query for persistent mediated devices that are defined by mdevctl. Following commits will make use of this information. This just provides the infrastructure and tests for this feature. One test verifies that we are executing mdevctl with the proper arguments, and the other test verifies that we can parse the returned JSON and convert it to our own internal node device representations.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/node_device/node_device_driver.c | 150 ++++++++++++++++++ src/node_device/node_device_driver.h | 7 + .../mdevctl-list-defined.argv | 1 + .../mdevctl-list-multiple.json | 59 +++++++ .../mdevctl-list-multiple.out.xml | 39 +++++ tests/nodedevmdevctltest.c | 95 ++++++++++- 6 files changed, 349 insertions(+), 2 deletions(-) create mode 100644 tests/nodedevmdevctldata/mdevctl-list-defined.argv create mode 100644 tests/nodedevmdevctldata/mdevctl-list-multiple.json create mode 100644 tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml
diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index 6143459618..bbd373e32e 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -853,6 +853,156 @@ virMdevctlStop(virNodeDeviceDefPtr def) }
+virCommandPtr +nodeDeviceGetMdevctlListCommand(bool defined, + char **output) +{ + virCommandPtr cmd = virCommandNewArgList(MDEVCTL, + "list", + "--dumpjson", + NULL); + + if (defined) + virCommandAddArg(cmd, "--defined"); + + virCommandSetOutputBuffer(cmd, output); + + return cmd; +} +
At this point I think we could split this even further and add the parser code first and then introduce listing support. ...
+int +nodeDeviceParseMdevctlJSON(const char *jsonstring, + virNodeDeviceDefPtr **devs) +{ + int n; + g_autoptr(virJSONValue) json_devicelist = NULL; + virNodeDeviceDefPtr *outdevs = NULL; + size_t noutdevs = 0; + size_t i, j;
One declaration per line please (I've seen another one, but not sure if it was in this patch or some later one).
+ + json_devicelist = virJSONValueFromString(jsonstring); + + if (!json_devicelist) + goto parsefailure; + + if (!virJSONValueIsArray(json_devicelist)) + goto parsefailure; + + n = virJSONValueArraySize(json_devicelist); + + for (i = 0; i < n; i++) { + virJSONValuePtr obj = virJSONValueArrayGet(json_devicelist, i); + const char *parent; + virJSONValuePtr child_array; + int nchildren; + + if (!virJSONValueIsObject(obj)) + goto parsefailure; + + /* mdevctl returns an array of objects. Each object is a parent device + * object containing a single key-value pair which maps from the name + * of the parent device to an array of child devices */ + if (virJSONValueObjectKeysNumber(obj) != 1) + goto parsefailure; + + parent = virJSONValueObjectGetKey(obj, 0); + child_array = virJSONValueObjectGetValue(obj, 0); + + if (!virJSONValueIsArray(child_array)) + goto parsefailure; + + nchildren = virJSONValueArraySize(child_array); + + for (j = 0; j < nchildren; j++) { + g_autoptr(virNodeDeviceDef) child = NULL; + virJSONValuePtr child_obj = virJSONValueArrayGet(child_array, j); + + if (!(child = nodeDeviceParseMdevctlChildDevice(parent, child_obj))) + goto parsefailure; + + if (VIR_APPEND_ELEMENT(outdevs, noutdevs, child) < 0) + goto parsefailure; + } + } + + *devs = outdevs; + return noutdevs; + + parsefailure:
As pointed out in v2: s/parsefailure/error
+ for (i = 0; i < noutdevs; i++) + virNodeDeviceDefFree(outdevs[i]); + VIR_FREE(outdevs); + + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("unable to parse JSON response")); + return -1; +} + +
...
--- /dev/null +++ b/tests/nodedevmdevctldata/mdevctl-list-multiple.json @@ -0,0 +1,59 @@ +[ + { + "0000:00:02.0": [ + { + "200f228a-c80a-4d50-bfb7-f5a0e4e34045": { + "mdev_type": "i915-GVTg_V5_4", + "start": "manual" + } + }, + { + "de807ffc-1923-4d5f-b6c9-b20ecebc6d4b": { + "mdev_type": "i915-GVTg_V5_4", + "start": "auto" + } + }, + { + "435722ea-5f43-468a-874f-da34f1217f13": { + "mdev_type": "i915-GVTg_V5_8", + "start": "manual", + "attrs": [ + { + "testattr": "42" + } + ] + } + } + ] + }, + { + "matrix": [ + { "783e6dbb-ea0e-411f-94e2-717eaad438bf": { + "mdev_type": "vfio_ap-passthrough", + "start": "manual", + "attrs": [ + { + "assign_adapter": "5" + }, + { + "assign_adapter": "6" + },
Why are there 2 of each? Does that indicate to mdevctl that when creating an mdev on the matrix device it can pick any of the adapters for the mdev?
+ { + "assign_domain": "0xab" + }, + { + "assign_control_domain": "0xab" + }, + { + "assign_domain": "4" + }, + { + "assign_control_domain": "4"
Are ^these the real life attribute names mdevctl uses for vfio_ap-passthrough? ...
+<device> + <name>mdev_435722ea_5f43_468a_874f_da34f1217f13</name> + <parent>0000:00:02.0</parent> + <capability type='mdev'> + <type id='i915-GVTg_V5_8'/> + <iommuGroup number='0'/> + <attr name='testattr' value='42'/> + </capability> +</device> +<device> + <name>mdev_783e6dbb_ea0e_411f_94e2_717eaad438bf</name> + <parent>matrix</parent> + <capability type='mdev'> + <type id='vfio_ap-passthrough'/> + <iommuGroup number='0'/> + <attr name='assign_adapter' value='5'/> + <attr name='assign_adapter' value='6'/> + <attr name='assign_domain' value='0xab'/> + <attr name='assign_control_domain' value='0xab'/> + <attr name='assign_domain' value='4'/> + <attr name='assign_control_domain' value='4'/>
Bjoern was right in his response to v2 when he said this adds quite some noise for the end user and is not consistent at all with how we represent the AP scheme. On the other hand, at the time of posting the v2 we already had introduced and documented the <attr> element, so I still think what you have here is the right approach because should we have gone with vendor specific elements the end users might get the impression that libvirt owns the attributes or IOW is somehow responsible for whether the mdev is going to be created successfully with these attributes or not - whilst we're just the middle man and we're merely passing these attributes to mdevctl and we also document that it's a generic mechanism to allow vendor-specific attributes to be specified along with the mdev. However, the end users know nothing about what the names of the attributes are and what mdevctl accepts, so maybe we could, for XML purposes, name the attributes the same way as we name the AP elements, i.e. ap-adapter, ap-control (yeah, it should have been "ap_xyz" I overlooked this during review, so that is on me). The fact we're using mdevctl underneath is completely transparent to the end users, so we should strive for at least some kind of consistency so anyone can tell how the individual node devices relate to each other. Code-wise I don't see any outstanding problems anymore, so once we come to an agreement on what the XML needs to look like to make most (ideally all) parties happy, you have my: Reviewed-by: Erik Skultety <eskultet@redhat.com>

On Thu, Dec 24, 2020 at 08:14:29AM -0600, Jonathon Jongsma wrote:
This adds some internal API to query for persistent mediated devices that are defined by mdevctl. Following commits will make use of this information. This just provides the infrastructure and tests for this feature. One test verifies that we are executing mdevctl with the proper arguments, and the other test verifies that we can parse the returned JSON and convert it to our own internal node device representations.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/node_device/node_device_driver.c | 150 ++++++++++++++++++ src/node_device/node_device_driver.h | 7 + .../mdevctl-list-defined.argv | 1 + .../mdevctl-list-multiple.json | 59 +++++++ .../mdevctl-list-multiple.out.xml | 39 +++++ tests/nodedevmdevctltest.c | 95 ++++++++++- 6 files changed, 349 insertions(+), 2 deletions(-) create mode 100644 tests/nodedevmdevctldata/mdevctl-list-defined.argv create mode 100644 tests/nodedevmdevctldata/mdevctl-list-multiple.json create mode 100644 tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml
diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index 6143459618..bbd373e32e 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -853,6 +853,156 @@ virMdevctlStop(virNodeDeviceDefPtr def) }
+int +nodeDeviceParseMdevctlJSON(const char *jsonstring, + virNodeDeviceDefPtr **devs) +{ + int n; + g_autoptr(virJSONValue) json_devicelist = NULL; + virNodeDeviceDefPtr *outdevs = NULL; + size_t noutdevs = 0; + size_t i, j; + + json_devicelist = virJSONValueFromString(jsonstring); + + if (!json_devicelist) + goto parsefailure; + + if (!virJSONValueIsArray(json_devicelist)) + goto parsefailure; + + n = virJSONValueArraySize(json_devicelist); + + for (i = 0; i < n; i++) { + virJSONValuePtr obj = virJSONValueArrayGet(json_devicelist, i); + const char *parent; + virJSONValuePtr child_array; + int nchildren; + + if (!virJSONValueIsObject(obj)) + goto parsefailure; + + /* mdevctl returns an array of objects. Each object is a parent device + * object containing a single key-value pair which maps from the name + * of the parent device to an array of child devices */ + if (virJSONValueObjectKeysNumber(obj) != 1) + goto parsefailure; + + parent = virJSONValueObjectGetKey(obj, 0); + child_array = virJSONValueObjectGetValue(obj, 0); + + if (!virJSONValueIsArray(child_array)) + goto parsefailure; + + nchildren = virJSONValueArraySize(child_array); + + for (j = 0; j < nchildren; j++) { + g_autoptr(virNodeDeviceDef) child = NULL; + virJSONValuePtr child_obj = virJSONValueArrayGet(child_array, j); + + if (!(child = nodeDeviceParseMdevctlChildDevice(parent, child_obj))) + goto parsefailure; + + if (VIR_APPEND_ELEMENT(outdevs, noutdevs, child) < 0) + goto parsefailure; + } + } + + *devs = outdevs; + return noutdevs; + + parsefailure: + for (i = 0; i < noutdevs; i++) + virNodeDeviceDefFree(outdevs[i]); + VIR_FREE(outdevs); + + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("unable to parse JSON response"));
This error message is horrible for debugging as it tells us nothing about what part of parsing failed. Please report error messages at the original site of the error
+ return -1; +}
Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

Since a mediated device can be persistently defined by the mdevctl backend, we need additional lifecycle events beyond CREATED/DELETED to indicate that e.g. the device has been stopped but the device definition still exists. Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- examples/c/misc/event-test.c | 4 ++++ include/libvirt/libvirt-nodedev.h | 2 ++ src/conf/node_device_conf.h | 1 + src/node_device/node_device_driver.c | 1 + src/node_device/node_device_udev.c | 25 +++++++++++++++++++++++-- tools/virsh-nodedev.c | 4 +++- 6 files changed, 34 insertions(+), 3 deletions(-) diff --git a/examples/c/misc/event-test.c b/examples/c/misc/event-test.c index f164e825e1..d6eec648ec 100644 --- a/examples/c/misc/event-test.c +++ b/examples/c/misc/event-test.c @@ -381,6 +381,10 @@ nodeDeviceEventToString(int event) return "Created"; case VIR_NODE_DEVICE_EVENT_DELETED: return "Deleted"; + case VIR_NODE_DEVICE_EVENT_STOPPED: + return "Stopped"; + case VIR_NODE_DEVICE_EVENT_STARTED: + return "Started"; case VIR_NODE_DEVICE_EVENT_LAST: break; } diff --git a/include/libvirt/libvirt-nodedev.h b/include/libvirt/libvirt-nodedev.h index d304283871..a473563857 100644 --- a/include/libvirt/libvirt-nodedev.h +++ b/include/libvirt/libvirt-nodedev.h @@ -197,6 +197,8 @@ int virConnectNodeDeviceEventDeregisterAny(virConnectPtr conn, typedef enum { VIR_NODE_DEVICE_EVENT_CREATED = 0, VIR_NODE_DEVICE_EVENT_DELETED = 1, + VIR_NODE_DEVICE_EVENT_STOPPED = 2, + VIR_NODE_DEVICE_EVENT_STARTED = 3, # ifdef VIR_ENUM_SENTINELS VIR_NODE_DEVICE_EVENT_LAST diff --git a/src/conf/node_device_conf.h b/src/conf/node_device_conf.h index 3d7872fd6e..bbc28cf2b9 100644 --- a/src/conf/node_device_conf.h +++ b/src/conf/node_device_conf.h @@ -157,6 +157,7 @@ struct _virNodeDevCapMdev { char *uuid; virMediatedDeviceAttrPtr *attributes; size_t nattributes; + bool persistent; }; typedef struct _virNodeDevCapPCIDev virNodeDevCapPCIDev; diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index bbd373e32e..5309b8abd5 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -899,6 +899,7 @@ nodeDeviceParseMdevctlChildDevice(const char *parent, child->caps->data.type = VIR_NODE_DEV_CAP_MDEV; mdev = &child->caps->data.mdev; + mdev->persistent = true; mdev->uuid = g_strdup(uuid); mdev->type = g_strdup(virJSONValueObjectGetString(props, "mdev_type")); diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index be59e6c6bc..632413d046 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -1401,6 +1401,8 @@ udevRemoveOneDeviceSysPath(const char *path) virNodeDeviceObjPtr obj = NULL; virNodeDeviceDefPtr def; virObjectEventPtr event = NULL; + virNodeDevCapsDefPtr cap; + int event_type = VIR_NODE_DEVICE_EVENT_DELETED; if (!(obj = virNodeDeviceObjListFindBySysfsPath(driver->devs, path))) { VIR_DEBUG("Failed to find device to remove that has udev path '%s'", @@ -1409,13 +1411,32 @@ udevRemoveOneDeviceSysPath(const char *path) } def = virNodeDeviceObjGetDef(obj); + /* If the device is a mediated device that has been 'stopped', it may still + * be defined by mdevctl and can therefore be started again. Don't drop it + * from the list of node devices */ + cap = def->caps; + while (cap != NULL) { + if (cap->data.type == VIR_NODE_DEV_CAP_MDEV) { + if (cap->data.mdev.persistent) { + VIR_FREE(def->sysfs_path); + event_type = VIR_NODE_DEVICE_EVENT_STOPPED; + break; + } + } + cap = cap->next; + } + event = virNodeDeviceEventLifecycleNew(def->name, - VIR_NODE_DEVICE_EVENT_DELETED, + event_type, 0); VIR_DEBUG("Removing device '%s' with sysfs path '%s'", def->name, path); - virNodeDeviceObjListRemove(driver->devs, obj); + + if (event_type == VIR_NODE_DEVICE_EVENT_DELETED) + virNodeDeviceObjListRemove(driver->devs, obj); + else + virNodeDeviceObjSetActive(obj, false); virNodeDeviceObjEndAPI(&obj); virObjectEventStateQueue(driver->nodeDeviceEventState, event); diff --git a/tools/virsh-nodedev.c b/tools/virsh-nodedev.c index 69422e20f5..35117585ff 100644 --- a/tools/virsh-nodedev.c +++ b/tools/virsh-nodedev.c @@ -775,7 +775,9 @@ VIR_ENUM_DECL(virshNodeDeviceEvent); VIR_ENUM_IMPL(virshNodeDeviceEvent, VIR_NODE_DEVICE_EVENT_LAST, N_("Created"), - N_("Deleted")); + N_("Deleted"), + N_("Stopped"), + N_("Started")); static const char * virshNodeDeviceEventToString(int event) -- 2.26.2

On Thu, Dec 24, 2020 at 08:14:30AM -0600, Jonathon Jongsma wrote:
Since a mediated device can be persistently defined by the mdevctl backend, we need additional lifecycle events beyond CREATED/DELETED to indicate that e.g. the device has been stopped but the device definition still exists.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- examples/c/misc/event-test.c | 4 ++++ include/libvirt/libvirt-nodedev.h | 2 ++ src/conf/node_device_conf.h | 1 + src/node_device/node_device_driver.c | 1 + src/node_device/node_device_udev.c | 25 +++++++++++++++++++++++-- tools/virsh-nodedev.c | 4 +++- 6 files changed, 34 insertions(+), 3 deletions(-)
diff --git a/examples/c/misc/event-test.c b/examples/c/misc/event-test.c index f164e825e1..d6eec648ec 100644 --- a/examples/c/misc/event-test.c +++ b/examples/c/misc/event-test.c @@ -381,6 +381,10 @@ nodeDeviceEventToString(int event) return "Created"; case VIR_NODE_DEVICE_EVENT_DELETED: return "Deleted"; + case VIR_NODE_DEVICE_EVENT_STOPPED: + return "Stopped"; + case VIR_NODE_DEVICE_EVENT_STARTED: + return "Started"; case VIR_NODE_DEVICE_EVENT_LAST: break; } diff --git a/include/libvirt/libvirt-nodedev.h b/include/libvirt/libvirt-nodedev.h index d304283871..a473563857 100644 --- a/include/libvirt/libvirt-nodedev.h +++ b/include/libvirt/libvirt-nodedev.h @@ -197,6 +197,8 @@ int virConnectNodeDeviceEventDeregisterAny(virConnectPtr conn, typedef enum { VIR_NODE_DEVICE_EVENT_CREATED = 0, VIR_NODE_DEVICE_EVENT_DELETED = 1, + VIR_NODE_DEVICE_EVENT_STOPPED = 2, + VIR_NODE_DEVICE_EVENT_STARTED = 3,
# ifdef VIR_ENUM_SENTINELS VIR_NODE_DEVICE_EVENT_LAST diff --git a/src/conf/node_device_conf.h b/src/conf/node_device_conf.h index 3d7872fd6e..bbc28cf2b9 100644 --- a/src/conf/node_device_conf.h +++ b/src/conf/node_device_conf.h @@ -157,6 +157,7 @@ struct _virNodeDevCapMdev { char *uuid; virMediatedDeviceAttrPtr *attributes; size_t nattributes; + bool persistent;
So, this patch is essentially unchanged since v2. In v2 I suggested ^this attribute should be bound to the object not the capabability and you haven't replied to those comments, is there a reason why that is not the case? It's also consistent with what we do for other objects, like domains and networks. ...
if (!(obj = virNodeDeviceObjListFindBySysfsPath(driver->devs, path))) { VIR_DEBUG("Failed to find device to remove that has udev path '%s'", @@ -1409,13 +1411,32 @@ udevRemoveOneDeviceSysPath(const char *path) } def = virNodeDeviceObjGetDef(obj);
+ /* If the device is a mediated device that has been 'stopped', it may still + * be defined by mdevctl and can therefore be started again. Don't drop it + * from the list of node devices */ + cap = def->caps; + while (cap != NULL) { + if (cap->data.type == VIR_NODE_DEV_CAP_MDEV) { + if (cap->data.mdev.persistent) { + VIR_FREE(def->sysfs_path); + event_type = VIR_NODE_DEVICE_EVENT_STOPPED; + break; + } + } + cap = cap->next; + } +
...which would simplify ^this while loop to a mere: if (obj->persitent) { VIR_FREE(def->sysfs_path); virNodeDeviceObjSetActive(obj, false); }
event = virNodeDeviceEventLifecycleNew(def->name, - VIR_NODE_DEVICE_EVENT_DELETED, + event_type, 0);
VIR_DEBUG("Removing device '%s' with sysfs path '%s'", def->name, path); - virNodeDeviceObjListRemove(driver->devs, obj); + + if (event_type == VIR_NODE_DEVICE_EVENT_DELETED) + virNodeDeviceObjListRemove(driver->devs, obj); + else + virNodeDeviceObjSetActive(obj, false);
...and this 'else' branch would be unnecessary then.
virNodeDeviceObjEndAPI(&obj);
virObjectEventStateQueue(driver->nodeDeviceEventState, event);
Regards, Erik

On Tue, 5 Jan 2021 17:25:50 +0100 Erik Skultety <eskultet@redhat.com> wrote:
On Thu, Dec 24, 2020 at 08:14:30AM -0600, Jonathon Jongsma wrote:
Since a mediated device can be persistently defined by the mdevctl backend, we need additional lifecycle events beyond CREATED/DELETED to indicate that e.g. the device has been stopped but the device definition still exists.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- examples/c/misc/event-test.c | 4 ++++ include/libvirt/libvirt-nodedev.h | 2 ++ src/conf/node_device_conf.h | 1 + src/node_device/node_device_driver.c | 1 + src/node_device/node_device_udev.c | 25 +++++++++++++++++++++++-- tools/virsh-nodedev.c | 4 +++- 6 files changed, 34 insertions(+), 3 deletions(-)
diff --git a/examples/c/misc/event-test.c b/examples/c/misc/event-test.c index f164e825e1..d6eec648ec 100644 --- a/examples/c/misc/event-test.c +++ b/examples/c/misc/event-test.c @@ -381,6 +381,10 @@ nodeDeviceEventToString(int event) return "Created"; case VIR_NODE_DEVICE_EVENT_DELETED: return "Deleted"; + case VIR_NODE_DEVICE_EVENT_STOPPED: + return "Stopped"; + case VIR_NODE_DEVICE_EVENT_STARTED: + return "Started"; case VIR_NODE_DEVICE_EVENT_LAST: break; } diff --git a/include/libvirt/libvirt-nodedev.h b/include/libvirt/libvirt-nodedev.h index d304283871..a473563857 100644 --- a/include/libvirt/libvirt-nodedev.h +++ b/include/libvirt/libvirt-nodedev.h @@ -197,6 +197,8 @@ int virConnectNodeDeviceEventDeregisterAny(virConnectPtr conn, typedef enum { VIR_NODE_DEVICE_EVENT_CREATED = 0, VIR_NODE_DEVICE_EVENT_DELETED = 1, + VIR_NODE_DEVICE_EVENT_STOPPED = 2, + VIR_NODE_DEVICE_EVENT_STARTED = 3,
# ifdef VIR_ENUM_SENTINELS VIR_NODE_DEVICE_EVENT_LAST diff --git a/src/conf/node_device_conf.h b/src/conf/node_device_conf.h index 3d7872fd6e..bbc28cf2b9 100644 --- a/src/conf/node_device_conf.h +++ b/src/conf/node_device_conf.h @@ -157,6 +157,7 @@ struct _virNodeDevCapMdev { char *uuid; virMediatedDeviceAttrPtr *attributes; size_t nattributes; + bool persistent;
So, this patch is essentially unchanged since v2. In v2 I suggested ^this attribute should be bound to the object not the capabability and you haven't replied to those comments, is there a reason why that is not the case? It's also consistent with what we do for other objects, like domains and networks.
hmm, I apparently missed that comment. But I guess the reason that I added it to the mdev caps rather than the object is because it really only applies to mdev devices. For instance, we don't really have a mechanism to keep around the definition of e.g. a PCI device that has been removed from the machine. And I don't really see any use case for doing so. I suppose we could just set this value to 'false' for all non-mdev node devices, but that doesn't seem very elegant to me.
...
if (!(obj = virNodeDeviceObjListFindBySysfsPath(driver->devs, path))) { VIR_DEBUG("Failed to find device to remove that has udev path '%s'", @@ -1409,13 +1411,32 @@ udevRemoveOneDeviceSysPath(const char *path) } def = virNodeDeviceObjGetDef(obj);
+ /* If the device is a mediated device that has been 'stopped', it may still + * be defined by mdevctl and can therefore be started again. Don't drop it + * from the list of node devices */ + cap = def->caps; + while (cap != NULL) { + if (cap->data.type == VIR_NODE_DEV_CAP_MDEV) { + if (cap->data.mdev.persistent) { + VIR_FREE(def->sysfs_path); + event_type = VIR_NODE_DEVICE_EVENT_STOPPED; + break; + } + } + cap = cap->next; + } +
...which would simplify ^this while loop to a mere: if (obj->persitent) { VIR_FREE(def->sysfs_path); virNodeDeviceObjSetActive(obj, false); }
event = virNodeDeviceEventLifecycleNew(def->name, - VIR_NODE_DEVICE_EVENT_DELETED, + event_type, 0);
VIR_DEBUG("Removing device '%s' with sysfs path '%s'", def->name, path); - virNodeDeviceObjListRemove(driver->devs, obj); + + if (event_type == VIR_NODE_DEVICE_EVENT_DELETED) + virNodeDeviceObjListRemove(driver->devs, obj); + else + virNodeDeviceObjSetActive(obj, false);
...and this 'else' branch would be unnecessary then.
virNodeDeviceObjEndAPI(&obj);
virObjectEventStateQueue(driver->nodeDeviceEventState, event);
Regards, Erik

On Thu, Jan 14, 2021 at 05:07:10PM -0600, Jonathon Jongsma wrote:
On Tue, 5 Jan 2021 17:25:50 +0100 Erik Skultety <eskultet@redhat.com> wrote:
On Thu, Dec 24, 2020 at 08:14:30AM -0600, Jonathon Jongsma wrote:
Since a mediated device can be persistently defined by the mdevctl backend, we need additional lifecycle events beyond CREATED/DELETED to indicate that e.g. the device has been stopped but the device definition still exists.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- examples/c/misc/event-test.c | 4 ++++ include/libvirt/libvirt-nodedev.h | 2 ++ src/conf/node_device_conf.h | 1 + src/node_device/node_device_driver.c | 1 + src/node_device/node_device_udev.c | 25 +++++++++++++++++++++++-- tools/virsh-nodedev.c | 4 +++- 6 files changed, 34 insertions(+), 3 deletions(-)
diff --git a/examples/c/misc/event-test.c b/examples/c/misc/event-test.c index f164e825e1..d6eec648ec 100644 --- a/examples/c/misc/event-test.c +++ b/examples/c/misc/event-test.c @@ -381,6 +381,10 @@ nodeDeviceEventToString(int event) return "Created"; case VIR_NODE_DEVICE_EVENT_DELETED: return "Deleted"; + case VIR_NODE_DEVICE_EVENT_STOPPED: + return "Stopped"; + case VIR_NODE_DEVICE_EVENT_STARTED: + return "Started"; case VIR_NODE_DEVICE_EVENT_LAST: break; } diff --git a/include/libvirt/libvirt-nodedev.h b/include/libvirt/libvirt-nodedev.h index d304283871..a473563857 100644 --- a/include/libvirt/libvirt-nodedev.h +++ b/include/libvirt/libvirt-nodedev.h @@ -197,6 +197,8 @@ int virConnectNodeDeviceEventDeregisterAny(virConnectPtr conn, typedef enum { VIR_NODE_DEVICE_EVENT_CREATED = 0, VIR_NODE_DEVICE_EVENT_DELETED = 1, + VIR_NODE_DEVICE_EVENT_STOPPED = 2, + VIR_NODE_DEVICE_EVENT_STARTED = 3,
# ifdef VIR_ENUM_SENTINELS VIR_NODE_DEVICE_EVENT_LAST diff --git a/src/conf/node_device_conf.h b/src/conf/node_device_conf.h index 3d7872fd6e..bbc28cf2b9 100644 --- a/src/conf/node_device_conf.h +++ b/src/conf/node_device_conf.h @@ -157,6 +157,7 @@ struct _virNodeDevCapMdev { char *uuid; virMediatedDeviceAttrPtr *attributes; size_t nattributes; + bool persistent;
There are some trailing whitespaces FWIW.
So, this patch is essentially unchanged since v2. In v2 I suggested ^this attribute should be bound to the object not the capabability and you haven't replied to those comments, is there a reason why that is not the case? It's also consistent with what we do for other objects, like domains and networks.
hmm, I apparently missed that comment. But I guess the reason that I added it to the mdev caps rather than the object is because it really only applies to mdev devices. For instance, we don't really have a mechanism to keep around the definition of e.g. a PCI device that has been removed from the machine. And I don't really see any use case for doing so. I suppose we could just set this value to 'false' for all non-mdev node devices, but that doesn't seem very elegant to me.
I know, but I would not mind sacrificing a bit of elegance in exchange for remaining consistent across other drivers in libvirt - IOW persistence is a property of the XML definition, not the capability itself, regardless of the fact that for other types of devices we have no use for it - 'false' for everything else sounds like an acceptable trade-off to me. Unless you have some other compelling reason to break this consistency in which case I'm open to reconsider. Erik

On Thu, Dec 24, 2020 at 08:14:30AM -0600, Jonathon Jongsma wrote:
Since a mediated device can be persistently defined by the mdevctl backend, we need additional lifecycle events beyond CREATED/DELETED to indicate that e.g. the device has been stopped but the device definition still exists.
This doesn't feel right to me. STARTED/STOPPED are esstentially the same as CREATED/DELETED semantically. None of the other APIs emit different events for persistent vs transient objects. We're adding the ability to define/undefine inactive mdev configs, so what is missing is the DEFINED/UNDEFINED events I believe.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- examples/c/misc/event-test.c | 4 ++++ include/libvirt/libvirt-nodedev.h | 2 ++ src/conf/node_device_conf.h | 1 + src/node_device/node_device_driver.c | 1 + src/node_device/node_device_udev.c | 25 +++++++++++++++++++++++-- tools/virsh-nodedev.c | 4 +++- 6 files changed, 34 insertions(+), 3 deletions(-)
diff --git a/examples/c/misc/event-test.c b/examples/c/misc/event-test.c index f164e825e1..d6eec648ec 100644 --- a/examples/c/misc/event-test.c +++ b/examples/c/misc/event-test.c @@ -381,6 +381,10 @@ nodeDeviceEventToString(int event) return "Created"; case VIR_NODE_DEVICE_EVENT_DELETED: return "Deleted"; + case VIR_NODE_DEVICE_EVENT_STOPPED: + return "Stopped"; + case VIR_NODE_DEVICE_EVENT_STARTED: + return "Started"; case VIR_NODE_DEVICE_EVENT_LAST: break; } diff --git a/include/libvirt/libvirt-nodedev.h b/include/libvirt/libvirt-nodedev.h index d304283871..a473563857 100644 --- a/include/libvirt/libvirt-nodedev.h +++ b/include/libvirt/libvirt-nodedev.h @@ -197,6 +197,8 @@ int virConnectNodeDeviceEventDeregisterAny(virConnectPtr conn, typedef enum { VIR_NODE_DEVICE_EVENT_CREATED = 0, VIR_NODE_DEVICE_EVENT_DELETED = 1, + VIR_NODE_DEVICE_EVENT_STOPPED = 2, + VIR_NODE_DEVICE_EVENT_STARTED = 3,
# ifdef VIR_ENUM_SENTINELS VIR_NODE_DEVICE_EVENT_LAST diff --git a/src/conf/node_device_conf.h b/src/conf/node_device_conf.h index 3d7872fd6e..bbc28cf2b9 100644 --- a/src/conf/node_device_conf.h +++ b/src/conf/node_device_conf.h @@ -157,6 +157,7 @@ struct _virNodeDevCapMdev { char *uuid; virMediatedDeviceAttrPtr *attributes; size_t nattributes; + bool persistent; };
typedef struct _virNodeDevCapPCIDev virNodeDevCapPCIDev; diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index bbd373e32e..5309b8abd5 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -899,6 +899,7 @@ nodeDeviceParseMdevctlChildDevice(const char *parent, child->caps->data.type = VIR_NODE_DEV_CAP_MDEV;
mdev = &child->caps->data.mdev; + mdev->persistent = true; mdev->uuid = g_strdup(uuid); mdev->type = g_strdup(virJSONValueObjectGetString(props, "mdev_type")); diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index be59e6c6bc..632413d046 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -1401,6 +1401,8 @@ udevRemoveOneDeviceSysPath(const char *path) virNodeDeviceObjPtr obj = NULL; virNodeDeviceDefPtr def; virObjectEventPtr event = NULL; + virNodeDevCapsDefPtr cap; + int event_type = VIR_NODE_DEVICE_EVENT_DELETED;
if (!(obj = virNodeDeviceObjListFindBySysfsPath(driver->devs, path))) { VIR_DEBUG("Failed to find device to remove that has udev path '%s'", @@ -1409,13 +1411,32 @@ udevRemoveOneDeviceSysPath(const char *path) } def = virNodeDeviceObjGetDef(obj);
+ /* If the device is a mediated device that has been 'stopped', it may still + * be defined by mdevctl and can therefore be started again. Don't drop it + * from the list of node devices */ + cap = def->caps; + while (cap != NULL) { + if (cap->data.type == VIR_NODE_DEV_CAP_MDEV) { + if (cap->data.mdev.persistent) { + VIR_FREE(def->sysfs_path); + event_type = VIR_NODE_DEVICE_EVENT_STOPPED; + break; + } + } + cap = cap->next; + } + event = virNodeDeviceEventLifecycleNew(def->name, - VIR_NODE_DEVICE_EVENT_DELETED, + event_type, 0);
VIR_DEBUG("Removing device '%s' with sysfs path '%s'", def->name, path); - virNodeDeviceObjListRemove(driver->devs, obj); + + if (event_type == VIR_NODE_DEVICE_EVENT_DELETED) + virNodeDeviceObjListRemove(driver->devs, obj); + else + virNodeDeviceObjSetActive(obj, false); virNodeDeviceObjEndAPI(&obj);
virObjectEventStateQueue(driver->nodeDeviceEventState, event); diff --git a/tools/virsh-nodedev.c b/tools/virsh-nodedev.c index 69422e20f5..35117585ff 100644 --- a/tools/virsh-nodedev.c +++ b/tools/virsh-nodedev.c @@ -775,7 +775,9 @@ VIR_ENUM_DECL(virshNodeDeviceEvent); VIR_ENUM_IMPL(virshNodeDeviceEvent, VIR_NODE_DEVICE_EVENT_LAST, N_("Created"), - N_("Deleted")); + N_("Deleted"), + N_("Stopped"), + N_("Started"));
static const char * virshNodeDeviceEventToString(int event) -- 2.26.2
Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

At startup, query devices that are defined by 'mdevctl' and add them to the node device list. This adds a complication: we now have two potential sources of information for a node device: - udev for all devices and for activated mediated devices - mdevctl for persistent mediated devices Unfortunately, neither backend returns full information for a mediated device. For example, if a persistent mediated device in the list (with information provided from mdevctl) is 'started', that same device will now be detected by udev. If we simply overwrite the existing device definition with the new one provided by the udev backend, we will lose extra information that was provided by mdevctl (e.g. attributes, etc). To avoid this, make sure to copy the extra information into the new device definition. Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/node_device/node_device_driver.c | 76 ++++++++++++++++++++++++++++ src/node_device/node_device_driver.h | 3 ++ src/node_device/node_device_udev.c | 48 ++++++++++++++++++ 3 files changed, 127 insertions(+) diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index 5309b8abd5..0267005af1 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -1144,3 +1144,79 @@ nodeDeviceGenerateName(virNodeDeviceDefPtr def, *(def->name + i) = '_'; } } + + +static int +virMdevctlListDefined(virNodeDeviceDefPtr **devs) +{ + int status; + g_autofree char *output = NULL; + g_autoptr(virCommand) cmd = nodeDeviceGetMdevctlListCommand(true, &output); + + if (virCommandRun(cmd, &status) < 0) + return -1; + + if (!output) + return -1; + + return nodeDeviceParseMdevctlJSON(output, devs); +} + + +int +nodeDeviceUpdateMediatedDevices(void) +{ + g_autofree virNodeDeviceDefPtr *devs = NULL; + int ndevs; + size_t i; + + if ((ndevs = virMdevctlListDefined(&devs)) < 0) { + virReportSystemError(errno, "%s", + _("failed to query mdevs from mdevctl")); + return -1; + } + + for (i = 0; i < ndevs; i++) { + virNodeDeviceObjPtr obj; + virObjectEventPtr event; + virNodeDeviceDefPtr dev = devs[i]; + bool new_device = true; + + dev->driver = g_strdup("vfio_mdev"); + + /* If a device defined by mdevctl is already in the list, that means + * that it was found via the normal device discovery process and thus + * is already activated. Active devices contain some additional + * information (e.g. sysfs path) that is not provided by mdevctl, so + * preserve that info */ + if ((obj = virNodeDeviceObjListFindByName(driver->devs, dev->name))) { + virNodeDeviceDefPtr olddef = virNodeDeviceObjGetDef(obj); + + /* Copy any data from the existing device */ + dev->sysfs_path = g_strdup(olddef->sysfs_path); + dev->parent_sysfs_path = g_strdup(olddef->parent_sysfs_path); + dev->driver = g_strdup(olddef->driver); + dev->devnode = g_strdup(olddef->devnode); + dev->caps->data.mdev.iommuGroupNumber = olddef->caps->data.mdev.iommuGroupNumber; + + virNodeDeviceObjEndAPI(&obj); + new_device = false; + } + + if (!(obj = virNodeDeviceObjListAssignDef(driver->devs, dev))) { + virNodeDeviceDefFree(dev); + return -1; + } + + if (new_device) + event = virNodeDeviceEventLifecycleNew(dev->name, + VIR_NODE_DEVICE_EVENT_CREATED, + 0); + else + event = virNodeDeviceEventUpdateNew(dev->name); + virNodeDeviceObjEndAPI(&obj); + virObjectEventStateQueue(driver->nodeDeviceEventState, event); + } + + return 0; +} diff --git a/src/node_device/node_device_driver.h b/src/node_device/node_device_driver.h index 80ac7c5320..4315f6d6ed 100644 --- a/src/node_device/node_device_driver.h +++ b/src/node_device/node_device_driver.h @@ -126,6 +126,9 @@ int nodeDeviceParseMdevctlJSON(const char *jsonstring, virNodeDeviceDefPtr **devs); +int +nodeDeviceUpdateMediatedDevices(void); + void nodeDeviceGenerateName(virNodeDeviceDefPtr def, const char *subsystem, diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index 632413d046..223ee5a2ff 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -1494,6 +1494,50 @@ udevSetParent(struct udev_device *device, return 0; } +static virMediatedDeviceAttrPtr * +virMediatedDeviceAttrsCopy(virMediatedDeviceAttrPtr *attrs, + size_t nattrs) +{ + size_t i; + size_t j = 0; + g_autofree virMediatedDeviceAttrPtr *ret = NULL; + + if (nattrs == 0) + return NULL; + + ret = g_new0(virMediatedDeviceAttrPtr, nattrs); + + for (i = 0; i < nattrs; i++) { + virMediatedDeviceAttrPtr attr = virMediatedDeviceAttrNew(); + attr->name = g_strdup(attrs[i]->name); + attr->value = g_strdup(attrs[i]->value); + VIR_APPEND_ELEMENT_INPLACE(ret, j, attr); + } + + return g_steal_pointer(ret); +} + +/* An existing device definition may have additional info from mdevctl that is + * not available from udev. Transfer this data to the new definition */ +static void +nodeDeviceDefCopyExtraData(virNodeDeviceDefPtr dst, + virNodeDeviceDefPtr src) +{ + virNodeDevCapMdevPtr srcmdev; + virNodeDevCapMdevPtr dstmdev; + + if (dst->caps->data.type != VIR_NODE_DEV_CAP_MDEV) + return; + + srcmdev = &src->caps->data.mdev; + dstmdev = &dst->caps->data.mdev; + + dstmdev->persistent = srcmdev->persistent; + dstmdev->nattributes = srcmdev->nattributes; + dstmdev->attributes = virMediatedDeviceAttrsCopy(srcmdev->attributes, + srcmdev->nattributes); +} + static int udevAddOneDevice(struct udev_device *device) @@ -1527,6 +1571,8 @@ udevAddOneDevice(struct udev_device *device) goto cleanup; if ((obj = virNodeDeviceObjListFindByName(driver->devs, def->name))) { + nodeDeviceDefCopyExtraData(def, virNodeDeviceObjGetDef(obj)); + virNodeDeviceObjEndAPI(&obj); new_device = false; } @@ -1953,6 +1999,8 @@ nodeStateInitializeEnumerate(void *opaque) /* Populate with known devices */ if (udevEnumerateDevices(udev) != 0) goto error; + if (nodeDeviceUpdateMediatedDevices() != 0) + goto error; nodeDeviceLock(); driver->initialized = true; -- 2.26.2

On Thu, Dec 24, 2020 at 08:14:31AM -0600, Jonathon Jongsma wrote:
At startup, query devices that are defined by 'mdevctl' and add them to the node device list.
This adds a complication: we now have two potential sources of information for a node device: - udev for all devices and for activated mediated devices - mdevctl for persistent mediated devices
Unfortunately, neither backend returns full information for a mediated device. For example, if a persistent mediated device in the list (with information provided from mdevctl) is 'started', that same device will now be detected by udev. If we simply overwrite the existing device definition with the new one provided by the udev backend, we will lose extra information that was provided by mdevctl (e.g. attributes, etc). To avoid this, make sure to copy the extra information into the new device definition.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/node_device/node_device_driver.c | 76 ++++++++++++++++++++++++++++ src/node_device/node_device_driver.h | 3 ++ src/node_device/node_device_udev.c | 48 ++++++++++++++++++ 3 files changed, 127 insertions(+)
diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index 5309b8abd5..0267005af1 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -1144,3 +1144,79 @@ nodeDeviceGenerateName(virNodeDeviceDefPtr def, *(def->name + i) = '_'; } } + + +static int +virMdevctlListDefined(virNodeDeviceDefPtr **devs) +{ + int status; + g_autofree char *output = NULL; + g_autoptr(virCommand) cmd = nodeDeviceGetMdevctlListCommand(true, &output); + + if (virCommandRun(cmd, &status) < 0) + return -1; + + if (!output) + return -1; + + return nodeDeviceParseMdevctlJSON(output, devs); +} + + +int +nodeDeviceUpdateMediatedDevices(void)
^This was called mdevctlEnumerateDevices in v2, so I'm wondering why did you change the name? I think it was okay the way it was (I double checked that I didn't suggest this change in v2 by accident).
+{ + g_autofree virNodeDeviceDefPtr *devs = NULL; + int ndevs; + size_t i; + + if ((ndevs = virMdevctlListDefined(&devs)) < 0) { + virReportSystemError(errno, "%s", + _("failed to query mdevs from mdevctl")); + return -1; + } + + for (i = 0; i < ndevs; i++) { + virNodeDeviceObjPtr obj; + virObjectEventPtr event; + virNodeDeviceDefPtr dev = devs[i]; + bool new_device = true; + + dev->driver = g_strdup("vfio_mdev"); + + /* If a device defined by mdevctl is already in the list, that means + * that it was found via the normal device discovery process and thus + * is already activated. Active devices contain some additional + * information (e.g. sysfs path) that is not provided by mdevctl, so + * preserve that info */ + if ((obj = virNodeDeviceObjListFindByName(driver->devs, dev->name))) { + virNodeDeviceDefPtr olddef = virNodeDeviceObjGetDef(obj); + + /* Copy any data from the existing device */ + dev->sysfs_path = g_strdup(olddef->sysfs_path); + dev->parent_sysfs_path = g_strdup(olddef->parent_sysfs_path); + dev->driver = g_strdup(olddef->driver); + dev->devnode = g_strdup(olddef->devnode); + dev->caps->data.mdev.iommuGroupNumber = olddef->caps->data.mdev.iommuGroupNumber;
So, what data other than attributes are coming from mdevctl that we have to copy the udev provided data from the old def ^this way? When I look at virNodeDeviceDef, If I'm not missing anything, the added value from mdevctl are the caps, more specifically, attributes right? If that is true I think we should revert the logic and use the function that you introduce at the end of the patch that copies extra data and use it here as well. To describe my thoughts into more details, at the end of the patch you add calls to udevAddOneDevice and nodeStateInitializeEnumerate. In the former case, we already enumerated the devices, including mdevctl, so now when we start an mdev, we get an event from udev and what do we do? We take that info and copy the attributes from the existing definition to it, so the data flows more or less like udev <- mdevctl. In the latter case though, we first enumerated udev devices, then asked mdevctl which created a new def ptr for us and we copied the data from udev and then redefined the device, so mdevctl <- udev. If I'm not missing some crucial information I think the logic can be reverted in one of those cases so that we always copy from the same source to the same destination to apply a new definition.
+ + virNodeDeviceObjEndAPI(&obj); + new_device = false; + } + + if (!(obj = virNodeDeviceObjListAssignDef(driver->devs, dev))) { + virNodeDeviceDefFree(dev); + return -1; + } + + if (new_device) + event = virNodeDeviceEventLifecycleNew(dev->name, + VIR_NODE_DEVICE_EVENT_CREATED, + 0); + else + event = virNodeDeviceEventUpdateNew(dev->name); + virNodeDeviceObjEndAPI(&obj); + virObjectEventStateQueue(driver->nodeDeviceEventState, event); + } + + return 0; +} diff --git a/src/node_device/node_device_driver.h b/src/node_device/node_device_driver.h index 80ac7c5320..4315f6d6ed 100644 --- a/src/node_device/node_device_driver.h +++ b/src/node_device/node_device_driver.h @@ -126,6 +126,9 @@ int nodeDeviceParseMdevctlJSON(const char *jsonstring, virNodeDeviceDefPtr **devs);
+int +nodeDeviceUpdateMediatedDevices(void); + void nodeDeviceGenerateName(virNodeDeviceDefPtr def, const char *subsystem, diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index 632413d046..223ee5a2ff 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -1494,6 +1494,50 @@ udevSetParent(struct udev_device *device, return 0; }
+static virMediatedDeviceAttrPtr * +virMediatedDeviceAttrsCopy(virMediatedDeviceAttrPtr *attrs, + size_t nattrs) +{ + size_t i; + size_t j = 0; + g_autofree virMediatedDeviceAttrPtr *ret = NULL; + + if (nattrs == 0) + return NULL; + + ret = g_new0(virMediatedDeviceAttrPtr, nattrs); + + for (i = 0; i < nattrs; i++) { + virMediatedDeviceAttrPtr attr = virMediatedDeviceAttrNew(); + attr->name = g_strdup(attrs[i]->name); + attr->value = g_strdup(attrs[i]->value); + VIR_APPEND_ELEMENT_INPLACE(ret, j, attr); + } + + return g_steal_pointer(ret); +} + +/* An existing device definition may have additional info from mdevctl that is + * not available from udev. Transfer this data to the new definition */ +static void +nodeDeviceDefCopyExtraData(virNodeDeviceDefPtr dst, + virNodeDeviceDefPtr src)
^This name is too vague, we're likely going to only use it with mdevs (and if the time comes we need to make it generic, we can easily create a wrapper). I'd suggest nodeDeviceDefCopyFromMdevctl or something similar to make it clear at first glance what we're trying to copy, because extra data is basically "void *".
+{ + virNodeDevCapMdevPtr srcmdev; + virNodeDevCapMdevPtr dstmdev; + + if (dst->caps->data.type != VIR_NODE_DEV_CAP_MDEV) + return;
I think ^this check would be better on the caller side since we're tailoring it to mdevs.
+ + srcmdev = &src->caps->data.mdev; + dstmdev = &dst->caps->data.mdev; + + dstmdev->persistent = srcmdev->persistent; + dstmdev->nattributes = srcmdev->nattributes; + dstmdev->attributes = virMediatedDeviceAttrsCopy(srcmdev->attributes, + srcmdev->nattributes); +} +
static int udevAddOneDevice(struct udev_device *device) @@ -1527,6 +1571,8 @@ udevAddOneDevice(struct udev_device *device) goto cleanup;
if ((obj = virNodeDeviceObjListFindByName(driver->devs, def->name))) { + nodeDeviceDefCopyExtraData(def, virNodeDeviceObjGetDef(obj));
...the check mentioned above should come here, it'll help the code readability instantly to signal that we're not trying to change/copy anything unless the device for which udev generated an event is an mdev. Erik
+ virNodeDeviceObjEndAPI(&obj); new_device = false; } @@ -1953,6 +1999,8 @@ nodeStateInitializeEnumerate(void *opaque) /* Populate with known devices */ if (udevEnumerateDevices(udev) != 0) goto error; + if (nodeDeviceUpdateMediatedDevices() != 0) + goto error;
nodeDeviceLock(); driver->initialized = true; -- 2.26.2

On Wed, Jan 06, 2021 at 11:27:09AM +0100, Erik Skultety wrote:
On Thu, Dec 24, 2020 at 08:14:31AM -0600, Jonathon Jongsma wrote:
At startup, query devices that are defined by 'mdevctl' and add them to the node device list.
This adds a complication: we now have two potential sources of information for a node device: - udev for all devices and for activated mediated devices - mdevctl for persistent mediated devices
Unfortunately, neither backend returns full information for a mediated device. For example, if a persistent mediated device in the list (with information provided from mdevctl) is 'started', that same device will now be detected by udev. If we simply overwrite the existing device definition with the new one provided by the udev backend, we will lose extra information that was provided by mdevctl (e.g. attributes, etc). To avoid this, make sure to copy the extra information into the new device definition.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/node_device/node_device_driver.c | 76 ++++++++++++++++++++++++++++ src/node_device/node_device_driver.h | 3 ++ src/node_device/node_device_udev.c | 48 ++++++++++++++++++ 3 files changed, 127 insertions(+)
diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index 5309b8abd5..0267005af1 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -1144,3 +1144,79 @@ nodeDeviceGenerateName(virNodeDeviceDefPtr def, *(def->name + i) = '_'; } } + + +static int +virMdevctlListDefined(virNodeDeviceDefPtr **devs) +{ + int status; + g_autofree char *output = NULL; + g_autoptr(virCommand) cmd = nodeDeviceGetMdevctlListCommand(true, &output); + + if (virCommandRun(cmd, &status) < 0) + return -1; + + if (!output) + return -1; + + return nodeDeviceParseMdevctlJSON(output, devs); +} + + +int +nodeDeviceUpdateMediatedDevices(void)
^This was called mdevctlEnumerateDevices in v2, so I'm wondering why did you change the name? I think it was okay the way it was (I double checked that I didn't suggest this change in v2 by accident).
+{ + g_autofree virNodeDeviceDefPtr *devs = NULL; + int ndevs; + size_t i; + + if ((ndevs = virMdevctlListDefined(&devs)) < 0) { + virReportSystemError(errno, "%s", + _("failed to query mdevs from mdevctl")); + return -1; + } + + for (i = 0; i < ndevs; i++) { + virNodeDeviceObjPtr obj; + virObjectEventPtr event; + virNodeDeviceDefPtr dev = devs[i]; + bool new_device = true; + + dev->driver = g_strdup("vfio_mdev"); + + /* If a device defined by mdevctl is already in the list, that means + * that it was found via the normal device discovery process and thus + * is already activated. Active devices contain some additional + * information (e.g. sysfs path) that is not provided by mdevctl, so + * preserve that info */ + if ((obj = virNodeDeviceObjListFindByName(driver->devs, dev->name))) { + virNodeDeviceDefPtr olddef = virNodeDeviceObjGetDef(obj); + + /* Copy any data from the existing device */ + dev->sysfs_path = g_strdup(olddef->sysfs_path); + dev->parent_sysfs_path = g_strdup(olddef->parent_sysfs_path); + dev->driver = g_strdup(olddef->driver); + dev->devnode = g_strdup(olddef->devnode); + dev->caps->data.mdev.iommuGroupNumber = olddef->caps->data.mdev.iommuGroupNumber;
So, what data other than attributes are coming from mdevctl that we have to copy the udev provided data from the old def ^this way? When I look at virNodeDeviceDef, If I'm not missing anything, the added value from mdevctl are the caps, more specifically, attributes right? If that is true I think we should revert the logic and use the function that you introduce at the end of the patch that copies extra data and use it here as well. To describe my thoughts into more details, at the end of the patch you add calls to udevAddOneDevice and nodeStateInitializeEnumerate.
In the former case, we already enumerated the devices, including mdevctl, so now when we start an mdev, we get an event from udev and what do we do? We take that info and copy the attributes from the existing definition to it, so the data flows more or less like udev <- mdevctl. In the latter case though, we first enumerated udev devices, then asked mdevctl which created a new def ptr for us and we copied the data from udev and then redefined the device, so mdevctl <- udev. If I'm not missing some crucial information I think the logic can be reverted in one of those cases so that we always copy from the same source to the same destination to apply a new definition.
+ + virNodeDeviceObjEndAPI(&obj); + new_device = false; + } + + if (!(obj = virNodeDeviceObjListAssignDef(driver->devs, dev))) { + virNodeDeviceDefFree(dev); + return -1; + } + + if (new_device) + event = virNodeDeviceEventLifecycleNew(dev->name, + VIR_NODE_DEVICE_EVENT_CREATED, + 0); + else + event = virNodeDeviceEventUpdateNew(dev->name); + virNodeDeviceObjEndAPI(&obj); + virObjectEventStateQueue(driver->nodeDeviceEventState, event); + } + + return 0; +} diff --git a/src/node_device/node_device_driver.h b/src/node_device/node_device_driver.h index 80ac7c5320..4315f6d6ed 100644 --- a/src/node_device/node_device_driver.h +++ b/src/node_device/node_device_driver.h @@ -126,6 +126,9 @@ int nodeDeviceParseMdevctlJSON(const char *jsonstring, virNodeDeviceDefPtr **devs);
+int +nodeDeviceUpdateMediatedDevices(void); + void nodeDeviceGenerateName(virNodeDeviceDefPtr def, const char *subsystem, diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index 632413d046..223ee5a2ff 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -1494,6 +1494,50 @@ udevSetParent(struct udev_device *device, return 0; }
+static virMediatedDeviceAttrPtr * +virMediatedDeviceAttrsCopy(virMediatedDeviceAttrPtr *attrs, + size_t nattrs) +{ + size_t i; + size_t j = 0; + g_autofree virMediatedDeviceAttrPtr *ret = NULL; + + if (nattrs == 0) + return NULL; + + ret = g_new0(virMediatedDeviceAttrPtr, nattrs); + + for (i = 0; i < nattrs; i++) { + virMediatedDeviceAttrPtr attr = virMediatedDeviceAttrNew(); + attr->name = g_strdup(attrs[i]->name); + attr->value = g_strdup(attrs[i]->value); + VIR_APPEND_ELEMENT_INPLACE(ret, j, attr); + } + + return g_steal_pointer(ret); +} + +/* An existing device definition may have additional info from mdevctl that is + * not available from udev. Transfer this data to the new definition */ +static void +nodeDeviceDefCopyExtraData(virNodeDeviceDefPtr dst, + virNodeDeviceDefPtr src)
^This name is too vague, we're likely going to only use it with mdevs (and if the time comes we need to make it generic, we can easily create a wrapper). I'd suggest nodeDeviceDefCopyFromMdevctl or something similar to make it clear at first glance what we're trying to copy, because extra data is basically "void *".
+{ + virNodeDevCapMdevPtr srcmdev; + virNodeDevCapMdevPtr dstmdev; + + if (dst->caps->data.type != VIR_NODE_DEV_CAP_MDEV) + return;
I think ^this check would be better on the caller side since we're tailoring it to mdevs.
+ + srcmdev = &src->caps->data.mdev; + dstmdev = &dst->caps->data.mdev; + + dstmdev->persistent = srcmdev->persistent; + dstmdev->nattributes = srcmdev->nattributes; + dstmdev->attributes = virMediatedDeviceAttrsCopy(srcmdev->attributes, + srcmdev->nattributes); +} +
static int udevAddOneDevice(struct udev_device *device) @@ -1527,6 +1571,8 @@ udevAddOneDevice(struct udev_device *device) goto cleanup;
if ((obj = virNodeDeviceObjListFindByName(driver->devs, def->name))) { + nodeDeviceDefCopyExtraData(def, virNodeDeviceObjGetDef(obj));
...the check mentioned above should come here, it'll help the code readability instantly to signal that we're not trying to change/copy anything unless the device for which udev generated an event is an mdev.
Erik
+ virNodeDeviceObjEndAPI(&obj); new_device = false; } @@ -1953,6 +1999,8 @@ nodeStateInitializeEnumerate(void *opaque) /* Populate with known devices */ if (udevEnumerateDevices(udev) != 0) goto error;
Also, I think a one line commentary explaining why we're explicitly updating/enumerating mdevs from mdevctl would be nice too, something like "we don't all the data about mdevs from udev, we query mdevctl for the rest" Erik
+ if (nodeDeviceUpdateMediatedDevices() != 0) + goto error;
nodeDeviceLock(); driver->initialized = true; -- 2.26.2

On Wed, Jan 06, 2021 at 11:27:09AM +0100, Erik Skultety wrote:
On Thu, Dec 24, 2020 at 08:14:31AM -0600, Jonathon Jongsma wrote:
At startup, query devices that are defined by 'mdevctl' and add them to the node device list.
This adds a complication: we now have two potential sources of information for a node device: - udev for all devices and for activated mediated devices - mdevctl for persistent mediated devices
Unfortunately, neither backend returns full information for a mediated device. For example, if a persistent mediated device in the list (with information provided from mdevctl) is 'started', that same device will now be detected by udev. If we simply overwrite the existing device definition with the new one provided by the udev backend, we will lose extra information that was provided by mdevctl (e.g. attributes, etc). To avoid this, make sure to copy the extra information into the new device definition.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/node_device/node_device_driver.c | 76 ++++++++++++++++++++++++++++ src/node_device/node_device_driver.h | 3 ++ src/node_device/node_device_udev.c | 48 ++++++++++++++++++ 3 files changed, 127 insertions(+)
diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index 5309b8abd5..0267005af1 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -1144,3 +1144,79 @@ nodeDeviceGenerateName(virNodeDeviceDefPtr def, *(def->name + i) = '_'; } } + + +static int +virMdevctlListDefined(virNodeDeviceDefPtr **devs) +{ + int status; + g_autofree char *output = NULL; + g_autoptr(virCommand) cmd = nodeDeviceGetMdevctlListCommand(true, &output); + + if (virCommandRun(cmd, &status) < 0) + return -1; + + if (!output) + return -1; + + return nodeDeviceParseMdevctlJSON(output, devs); +} + + +int +nodeDeviceUpdateMediatedDevices(void)
^This was called mdevctlEnumerateDevices in v2, so I'm wondering why did you change the name? I think it was okay the way it was (I double checked that I didn't suggest this change in v2 by accident).
In patch 9 I see the reason and it's indeed better to be named this way, so you can ignore ^this comment, sorry, I was too hasty. Erik

On Thu, Dec 24, 2020 at 08:14:31AM -0600, Jonathon Jongsma wrote:
At startup, query devices that are defined by 'mdevctl' and add them to the node device list.
This adds a complication: we now have two potential sources of information for a node device: - udev for all devices and for activated mediated devices - mdevctl for persistent mediated devices
Unfortunately, neither backend returns full information for a mediated device. For example, if a persistent mediated device in the list (with information provided from mdevctl) is 'started', that same device will now be detected by udev. If we simply overwrite the existing device definition with the new one provided by the udev backend, we will lose extra information that was provided by mdevctl (e.g. attributes, etc). To avoid this, make sure to copy the extra information into the new device definition.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/node_device/node_device_driver.c | 76 ++++++++++++++++++++++++++++ src/node_device/node_device_driver.h | 3 ++ src/node_device/node_device_udev.c | 48 ++++++++++++++++++ 3 files changed, 127 insertions(+)
diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index 5309b8abd5..0267005af1 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -1144,3 +1144,79 @@ nodeDeviceGenerateName(virNodeDeviceDefPtr def, *(def->name + i) = '_'; } } + + +static int +virMdevctlListDefined(virNodeDeviceDefPtr **devs) +{ + int status; + g_autofree char *output = NULL; + g_autoptr(virCommand) cmd = nodeDeviceGetMdevctlListCommand(true, &output); + + if (virCommandRun(cmd, &status) < 0) + return -1; + + if (!output) + return -1; + + return nodeDeviceParseMdevctlJSON(output, devs); +} + + +int +nodeDeviceUpdateMediatedDevices(void) +{ + g_autofree virNodeDeviceDefPtr *devs = NULL; + int ndevs; + size_t i; + + if ((ndevs = virMdevctlListDefined(&devs)) < 0) { + virReportSystemError(errno, "%s", + _("failed to query mdevs from mdevctl")); + return -1; + } + + for (i = 0; i < ndevs; i++) { + virNodeDeviceObjPtr obj; + virObjectEventPtr event; + virNodeDeviceDefPtr dev = devs[i]; + bool new_device = true; + + dev->driver = g_strdup("vfio_mdev"); + + /* If a device defined by mdevctl is already in the list, that means + * that it was found via the normal device discovery process and thus + * is already activated. Active devices contain some additional + * information (e.g. sysfs path) that is not provided by mdevctl, so + * preserve that info */ + if ((obj = virNodeDeviceObjListFindByName(driver->devs, dev->name))) { + virNodeDeviceDefPtr olddef = virNodeDeviceObjGetDef(obj); + + /* Copy any data from the existing device */ + dev->sysfs_path = g_strdup(olddef->sysfs_path); + dev->parent_sysfs_path = g_strdup(olddef->parent_sysfs_path); + dev->driver = g_strdup(olddef->driver); + dev->devnode = g_strdup(olddef->devnode); + dev->caps->data.mdev.iommuGroupNumber = olddef->caps->data.mdev.iommuGroupNumber; + + virNodeDeviceObjEndAPI(&obj); + new_device = false; + } + + if (!(obj = virNodeDeviceObjListAssignDef(driver->devs, dev))) { + virNodeDeviceDefFree(dev); + return -1; + } + + if (new_device) + event = virNodeDeviceEventLifecycleNew(dev->name, + VIR_NODE_DEVICE_EVENT_CREATED, + 0);
As mentioned in the other patch, we need to have DEFINED/UNDEFINED events. In fact this method doens't seem to remove existing mdevs that no longer exist.
+ else + event = virNodeDeviceEventUpdateNew(dev->name);
This is triggering events for all devices every time any single device changes, even if those devices have no changes. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

When a mediated device is stopped or undefined by an application outside of libvirt, we need to remove it from our list of node devices within libvirt. This patch introduces virNodeDeviceObjListRemoveLocked() and virNodeDeviceObjListForEach() (which are analogous to other types of object lists in libvirt) to facilitate that. They will be used in coming commits. Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/conf/virnodedeviceobj.c | 62 ++++++++++++++++++++++++++++++++++--- src/conf/virnodedeviceobj.h | 12 +++++++ src/libvirt_private.syms | 2 ++ 3 files changed, 72 insertions(+), 4 deletions(-) diff --git a/src/conf/virnodedeviceobj.c b/src/conf/virnodedeviceobj.c index 6e9291264a..5f0b21f8bf 100644 --- a/src/conf/virnodedeviceobj.c +++ b/src/conf/virnodedeviceobj.c @@ -506,23 +506,29 @@ void virNodeDeviceObjListRemove(virNodeDeviceObjListPtr devs, virNodeDeviceObjPtr obj) { - virNodeDeviceDefPtr def; - if (!obj) return; - def = obj->def; virObjectRef(obj); virObjectUnlock(obj); virObjectRWLockWrite(devs); virObjectLock(obj); - virHashRemoveEntry(devs->objs, def->name); + virNodeDeviceObjListRemoveLocked(devs, obj); virObjectUnlock(obj); virObjectUnref(obj); virObjectRWUnlock(devs); } +/* The caller must hold lock on 'devs' */ +void +virNodeDeviceObjListRemoveLocked(virNodeDeviceObjListPtr devs, + virNodeDeviceObjPtr dev) +{ + virHashRemoveEntry(devs->objs, dev->def->name); +} + + /* * Return the NPIV dev's parent device name */ @@ -1000,3 +1006,51 @@ virNodeDeviceObjSetActive(virNodeDeviceObjPtr obj, { obj->active = active; } + +struct _virNodeDeviceObjListForEachData { + virNodeDeviceObjListIterator iter; + const void *opaque; +}; + +static int +virNodeDeviceObjListForEachCb(void *payload, + const char *name G_GNUC_UNUSED, + void *opaque) +{ + virNodeDeviceObjPtr obj = payload; + struct _virNodeDeviceObjListForEachData *data = opaque; + + /* Grab a reference so that we don't rely only on references grabbed by + * hash table earlier. Remember, an iterator can remove object from the + * hash table. */ + virObjectRef(obj); + virObjectLock(obj); + data->iter(obj, data->opaque); + virNodeDeviceObjEndAPI(&obj); + + return 0; +} + +/** + * virNodeDeviceObjListForEach + * @devs: Pointer to object list + * @iter: Callback iteration helper + * @opaque: Opaque data to use as argument to helper + * + * For each object in @devs, call the @iter helper using @opaque as + * an argument. + */ +void +virNodeDeviceObjListForEach(virNodeDeviceObjListPtr devs, + virNodeDeviceObjListIterator iter, + const void *opaque) +{ + struct _virNodeDeviceObjListForEachData data = { + .iter = iter, + .opaque = opaque + }; + + virObjectRWLockWrite(devs); + virHashForEach(devs->objs, virNodeDeviceObjListForEachCb, &data); + virObjectRWUnlock(devs); +} diff --git a/src/conf/virnodedeviceobj.h b/src/conf/virnodedeviceobj.h index c119f4c51f..e672d8af61 100644 --- a/src/conf/virnodedeviceobj.h +++ b/src/conf/virnodedeviceobj.h @@ -80,6 +80,10 @@ void virNodeDeviceObjListRemove(virNodeDeviceObjListPtr devs, virNodeDeviceObjPtr dev); +void +virNodeDeviceObjListRemoveLocked(virNodeDeviceObjListPtr devs, + virNodeDeviceObjPtr dev); + int virNodeDeviceObjListGetParentHost(virNodeDeviceObjListPtr devs, virNodeDeviceDefPtr def); @@ -128,3 +132,11 @@ virNodeDeviceObjIsActive(virNodeDeviceObjPtr obj); void virNodeDeviceObjSetActive(virNodeDeviceObjPtr obj, bool active); + +typedef void +(*virNodeDeviceObjListIterator)(virNodeDeviceObjPtr obj, + const void *opaque); + +void virNodeDeviceObjListForEach(virNodeDeviceObjListPtr devs, + virNodeDeviceObjListIterator iter, + const void *opaque); diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index fecd012300..181d91792e 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1210,12 +1210,14 @@ virNodeDeviceObjListFindByName; virNodeDeviceObjListFindBySysfsPath; virNodeDeviceObjListFindMediatedDeviceByUUID; virNodeDeviceObjListFindSCSIHostByWWNs; +virNodeDeviceObjListForEach; virNodeDeviceObjListFree; virNodeDeviceObjListGetNames; virNodeDeviceObjListGetParentHost; virNodeDeviceObjListNew; virNodeDeviceObjListNumOfDevices; virNodeDeviceObjListRemove; +virNodeDeviceObjListRemoveLocked; virNodeDeviceObjSetActive; -- 2.26.2

On Thu, Dec 24, 2020 at 08:14:32AM -0600, Jonathon Jongsma wrote:
When a mediated device is stopped or undefined by an application outside of libvirt, we need to remove it from our list of node devices within libvirt. This patch introduces virNodeDeviceObjListRemoveLocked() and virNodeDeviceObjListForEach() (which are analogous to other types of object lists in libvirt) to facilitate that. They will be used in coming commits.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- Reviewed-by: Erik Skultety <eskultet@redhat.com>

mdevctl does not currently provide any events when the list of defined devices changes, so we will need to poll mdevctl for the list of defined devices periodically. When a mediated device no longer exists from one iteration to the next, we need to treat it as an "undefine" event. When we get such an event, we remove the device from the list if it's not active. Otherwise, we simply mark it as non-persistent. Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/node_device/node_device_driver.c | 67 +++++++++++++++++++++++++--- 1 file changed, 62 insertions(+), 5 deletions(-) diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index 0267005af1..0bebd534d0 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -1163,23 +1163,73 @@ virMdevctlListDefined(virNodeDeviceDefPtr **devs) } +typedef struct _virMdevctlForEachData virMdevctlForEachData; +struct _virMdevctlForEachData { + int ndevs; + virNodeDeviceDefPtr *devs; +}; + + +/* This function keeps the list of persistent mediated devices consistent + * between the nodedev driver and mdevctl. + * @obj is a device that is currently known by the nodedev driver, and @opaque + * contains the most recent list of devices defined by mdevctl. If @obj is no + * longer defined in mdevctl, remove it from the driver as well. */ +static void +removeMissingPersistentMdevs(virNodeDeviceObjPtr obj, + const void *opaque) +{ + const virMdevctlForEachData *data = opaque; + size_t i; + virNodeDeviceDefPtr def = virNodeDeviceObjGetDef(obj); + virObjectEventPtr event; + + /* transient mdevs are populated via udev, so don't remove them from the + * nodedev driver just because they are not reported by by mdevctl */ + if (!(def->caps->data.type == VIR_NODE_DEV_CAP_MDEV && + def->caps->data.mdev.persistent)) + return; + + for (i = 0; i < data->ndevs; i++) { + /* OK, this mdev is still defined by mdevctl */ + if (STREQ(data->devs[i]->name, def->name)) + return; + } + + if (virNodeDeviceObjIsActive(obj)) { + /* The device is active, but no longer defined by mdevctl. Keep + * the device in the list, but mark it as non-persistent */ + def->caps->data.mdev.persistent = false; + return; + } + + /* the device is not active, and it is no longer defined by mdevctl, so + * remove it. */ + event = virNodeDeviceEventLifecycleNew(def->name, + VIR_NODE_DEVICE_EVENT_DELETED, + 0); + + virNodeDeviceObjListRemoveLocked(driver->devs, obj); + virObjectEventStateQueue(driver->nodeDeviceEventState, event); +} + + int nodeDeviceUpdateMediatedDevices(void) { - g_autofree virNodeDeviceDefPtr *devs = NULL; - int ndevs; size_t i; + virMdevctlForEachData data = { 0, }; - if ((ndevs = virMdevctlListDefined(&devs)) < 0) { + if ((data.ndevs = virMdevctlListDefined(&data.devs)) < 0) { virReportSystemError(errno, "%s", _("failed to query mdevs from mdevctl")); return -1; } - for (i = 0; i < ndevs; i++) { + for (i = 0; i < data.ndevs; i++) { virNodeDeviceObjPtr obj; virObjectEventPtr event; - virNodeDeviceDefPtr dev = devs[i]; + virNodeDeviceDefPtr dev = data.devs[i]; bool new_device = true; dev->driver = g_strdup("vfio_mdev"); @@ -1218,5 +1268,12 @@ nodeDeviceUpdateMediatedDevices(void) virObjectEventStateQueue(driver->nodeDeviceEventState, event); } + /* Any mdevs that were previously defined but were not returned by mdevctl + * this time will need to be checked to see if they should be removed from + * the device list */ + virNodeDeviceObjListForEach(driver->devs, removeMissingPersistentMdevs, &data); + + g_free(data.devs); + return 0; } -- 2.26.2

On Thu, Dec 24, 2020 at 08:14:33AM -0600, Jonathon Jongsma wrote:
mdevctl does not currently provide any events when the list of defined devices changes, so we will need to poll mdevctl for the list of defined devices periodically. When a mediated device no longer exists from one iteration to the next, we need to treat it as an "undefine" event.
When we get such an event, we remove the device from the list if it's not active. Otherwise, we simply mark it as non-persistent.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/node_device/node_device_driver.c | 67 +++++++++++++++++++++++++--- 1 file changed, 62 insertions(+), 5 deletions(-)
diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index 0267005af1..0bebd534d0 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -1163,23 +1163,73 @@ virMdevctlListDefined(virNodeDeviceDefPtr **devs) }
+typedef struct _virMdevctlForEachData virMdevctlForEachData; +struct _virMdevctlForEachData { + int ndevs; + virNodeDeviceDefPtr *devs; +}; + + +/* This function keeps the list of persistent mediated devices consistent + * between the nodedev driver and mdevctl. + * @obj is a device that is currently known by the nodedev driver, and @opaque + * contains the most recent list of devices defined by mdevctl. If @obj is no + * longer defined in mdevctl, remove it from the driver as well. */ +static void +removeMissingPersistentMdevs(virNodeDeviceObjPtr obj, + const void *opaque) +{ + const virMdevctlForEachData *data = opaque; + size_t i; + virNodeDeviceDefPtr def = virNodeDeviceObjGetDef(obj); + virObjectEventPtr event; + + /* transient mdevs are populated via udev, so don't remove them from the + * nodedev driver just because they are not reported by by mdevctl */ + if (!(def->caps->data.type == VIR_NODE_DEV_CAP_MDEV && + def->caps->data.mdev.persistent)) + return; + + for (i = 0; i < data->ndevs; i++) { + /* OK, this mdev is still defined by mdevctl */ + if (STREQ(data->devs[i]->name, def->name)) + return; + } + + if (virNodeDeviceObjIsActive(obj)) { + /* The device is active, but no longer defined by mdevctl. Keep + * the device in the list, but mark it as non-persistent */
^This commentary should most likely be outside of the 'if' condition just like the one a few lines above.
+ def->caps->data.mdev.persistent = false; + return; + } + + /* the device is not active, and it is no longer defined by mdevctl, so + * remove it. */ + event = virNodeDeviceEventLifecycleNew(def->name, + VIR_NODE_DEVICE_EVENT_DELETED, + 0); + + virNodeDeviceObjListRemoveLocked(driver->devs, obj);
For the sake of logically ordering the actions, I believe we should first remove the device and *then* emit an event - not that it would matter much in the end since virNodeDeviceObjListRemoveLocked doesn't return anything, so even if it failed to remove the device for some reason we'd emit the event anyway...
+ virObjectEventStateQueue(driver->nodeDeviceEventState, event); +} + + int nodeDeviceUpdateMediatedDevices(void) { - g_autofree virNodeDeviceDefPtr *devs = NULL; - int ndevs; size_t i; + virMdevctlForEachData data = { 0, };
- if ((ndevs = virMdevctlListDefined(&devs)) < 0) { + if ((data.ndevs = virMdevctlListDefined(&data.devs)) < 0) { virReportSystemError(errno, "%s", _("failed to query mdevs from mdevctl")); return -1; }
- for (i = 0; i < ndevs; i++) { + for (i = 0; i < data.ndevs; i++) { virNodeDeviceObjPtr obj; virObjectEventPtr event; - virNodeDeviceDefPtr dev = devs[i]; + virNodeDeviceDefPtr dev = data.devs[i]; bool new_device = true;
dev->driver = g_strdup("vfio_mdev"); @@ -1218,5 +1268,12 @@ nodeDeviceUpdateMediatedDevices(void) virObjectEventStateQueue(driver->nodeDeviceEventState, event); }
The context is missing here, but you'd leak data.devs on line 1258: if (!(obj = virNodeDeviceObjListAssignDef(driver->devs, dev))) { virNodeDeviceDefFree(dev); return -1; } With the adjustments: Reviewed-by: Erik Skultety <eskultet@redhat.com>

The udev thread handles received udev events. This is accomplished by setting dataReady to 'true' and signalling the thread. We also want to use this thread to handle mdev events, so we'll need to add another variable to indicate which event has woken the thread. To prepare for this, rename 'dataReady' so that it is clear that when this variable is set, the udev event needs to be handled. Mdevctl events will use a different variable name. Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/node_device/node_device_udev.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index 223ee5a2ff..d7f7ab4370 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -65,7 +65,7 @@ struct _udevEventData { virThread th; virCond threadCond; bool threadQuit; - bool dataReady; + bool udevReady; }; static virClassPtr udevEventDataClass; @@ -1809,7 +1809,7 @@ udevEventMonitorSanityCheck(udevEventDataPtr priv, * NB: Some older distros, such as CentOS 6, libudev opens sockets * without the NONBLOCK flag which might cause issues with event * based algorithm. Although the issue can be mitigated by resetting - * priv->dataReady for each event found; however, the scheduler issues + * priv->udevReady for each event found; however, the scheduler issues * would still come into play. */ static void @@ -1821,7 +1821,7 @@ udevEventHandleThread(void *opaque G_GNUC_UNUSED) /* continue rather than break from the loop on non-fatal errors */ while (1) { virObjectLock(priv); - while (!priv->dataReady && !priv->threadQuit) { + while (!priv->udevReady && !priv->threadQuit) { if (virCondWait(&priv->threadCond, &priv->parent.lock)) { virReportSystemError(errno, "%s", _("handler failed to wait on condition")); @@ -1858,11 +1858,11 @@ udevEventHandleThread(void *opaque G_GNUC_UNUSED) return; } - /* Trying to move the reset of the @priv->dataReady flag to + /* Trying to move the reset of the @priv->udevReady flag to * after the udev_monitor_receive_device wouldn't help much * due to event mgmt and scheduler timing. */ virObjectLock(priv); - priv->dataReady = false; + priv->udevReady = false; virObjectUnlock(priv); continue; @@ -1892,7 +1892,7 @@ udevEventHandleCallback(int watch G_GNUC_UNUSED, if (!udevEventMonitorSanityCheck(priv, fd)) priv->threadQuit = true; else - priv->dataReady = true; + priv->udevReady = true; virCondSignal(&priv->threadCond); virObjectUnlock(priv); -- 2.26.2

On Thu, Dec 24, 2020 at 08:14:34AM -0600, Jonathon Jongsma wrote:
The udev thread handles received udev events. This is accomplished by setting dataReady to 'true' and signalling the thread.
We also want to use this thread to handle mdev events, so we'll need to add another variable to indicate which event has woken the thread. To prepare for this, rename 'dataReady' so that it is clear that when this variable is set, the udev event needs to be handled. Mdevctl events will use a different variable name.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> ---
Depending on what we're going to do with 11/21, this can have a conditional: Reviewed-by: Erik Skultety <eskultet@redhat.com>

We need to periodically query mdevctl for changes to device definitions since an administrator can define new devices with mdevctl outside of libvirt. In the future, mdevctl may add a way to signal device add/remove via events, but for now we resort to a bit of a workaround: monitoring the mdevctl config directories for changes to files. When a change is detected, we query mdevctl and update our device list. The mdevctl querying is handled by the existing udev thread. Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/node_device/node_device_udev.c | 204 ++++++++++++++++++++++++----- 1 file changed, 171 insertions(+), 33 deletions(-) diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index d7f7ab4370..5729fea264 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -41,6 +41,7 @@ #include "virnetdev.h" #include "virmdev.h" #include "virutil.h" +#include <gio/gio.h> #include "configmake.h" @@ -66,6 +67,10 @@ struct _udevEventData { virCond threadCond; bool threadQuit; bool udevReady; + bool mdevReady; + + GList *mdevctlMonitors; + int mdevctlTimeout; }; static virClassPtr udevEventDataClass; @@ -85,6 +90,7 @@ udevEventDataDispose(void *obj) udev = udev_monitor_get_udev(priv->udev_monitor); udev_monitor_unref(priv->udev_monitor); udev_unref(udev); + g_list_free_full(priv->mdevctlMonitors, g_object_unref); virCondDestroy(&priv->threadCond); } @@ -1821,7 +1827,7 @@ udevEventHandleThread(void *opaque G_GNUC_UNUSED) /* continue rather than break from the loop on non-fatal errors */ while (1) { virObjectLock(priv); - while (!priv->udevReady && !priv->threadQuit) { + while (!priv->udevReady && !priv->mdevReady && !priv->threadQuit) { if (virCondWait(&priv->threadCond, &priv->parent.lock)) { virReportSystemError(errno, "%s", _("handler failed to wait on condition")); @@ -1835,46 +1841,58 @@ udevEventHandleThread(void *opaque G_GNUC_UNUSED) return; } - errno = 0; - device = udev_monitor_receive_device(priv->udev_monitor); - virObjectUnlock(priv); + if (priv->udevReady) { + errno = 0; + device = udev_monitor_receive_device(priv->udev_monitor); + virObjectUnlock(priv); - if (!device) { - if (errno == 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, "%s", - _("failed to receive device from udev monitor")); - return; - } + if (!device) { + if (errno == 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("failed to receive device from udev monitor")); + return; + } + + /* POSIX allows both EAGAIN and EWOULDBLOCK to be used + * interchangeably when the read would block or timeout was fired + */ + VIR_WARNINGS_NO_WLOGICALOP_EQUAL_EXPR + if (errno != EAGAIN && errno != EWOULDBLOCK) { + VIR_WARNINGS_RESET + virReportSystemError(errno, "%s", + _("failed to receive device from udev " + "monitor")); + return; + } + + /* Trying to move the reset of the @priv->udevReady flag to + * after the udev_monitor_receive_device wouldn't help much + * due to event mgmt and scheduler timing. */ + virObjectLock(priv); + priv->udevReady = false; + virObjectUnlock(priv); - /* POSIX allows both EAGAIN and EWOULDBLOCK to be used - * interchangeably when the read would block or timeout was fired - */ - VIR_WARNINGS_NO_WLOGICALOP_EQUAL_EXPR - if (errno != EAGAIN && errno != EWOULDBLOCK) { - VIR_WARNINGS_RESET - virReportSystemError(errno, "%s", - _("failed to receive device from udev " - "monitor")); - return; + continue; } - /* Trying to move the reset of the @priv->udevReady flag to - * after the udev_monitor_receive_device wouldn't help much - * due to event mgmt and scheduler timing. */ - virObjectLock(priv); - priv->udevReady = false; - virObjectUnlock(priv); + udevHandleOneDevice(device); + udev_device_unref(device); - continue; + /* Instead of waiting for the next event after processing @device + * data, let's keep reading from the udev monitor and only wait + * for the next event once either a EAGAIN or a EWOULDBLOCK error + * is encountered. */ } - udevHandleOneDevice(device); - udev_device_unref(device); + if (priv->mdevReady) { + virObjectUnlock(priv); + if (nodeDeviceUpdateMediatedDevices() < 0) + VIR_WARN("mdevctl failed to updated mediated devices"); - /* Instead of waiting for the next event after processing @device - * data, let's keep reading from the udev monitor and only wait - * for the next event once either a EAGAIN or a EWOULDBLOCK error - * is encountered. */ + virObjectLock(priv); + priv->mdevReady = false; + virObjectUnlock(priv); + } } } @@ -1899,6 +1917,117 @@ udevEventHandleCallback(int watch G_GNUC_UNUSED, } +static void +scheduleMdevctlHandler(int timer G_GNUC_UNUSED, void *opaque) +{ + udevEventDataPtr priv = opaque; + + if (priv->mdevctlTimeout > 0) { + virEventRemoveTimeout(priv->mdevctlTimeout); + priv->mdevctlTimeout = -1; + } + + virObjectLock(priv); + priv->mdevReady = true; + virCondSignal(&priv->threadCond); + virObjectUnlock(priv); +} + + +static void +mdevctlEventHandleCallback(GFileMonitor *monitor, + GFile *file, + GFile *other_file, + GFileMonitorEvent event_type, + gpointer user_data); + + +/* Recursively monitors a directory and its subdirectories for file changes and + * returns a GList of GFileMonitor objects */ +static GList* +monitorFileRecursively(udevEventDataPtr udev, + GFile *file) +{ + GList *monitors = NULL; + g_autoptr(GError) error = NULL; + g_autoptr(GFileEnumerator) children = NULL; + GFileMonitor *mon; + + if (!(children = g_file_enumerate_children(file, "standard::*", + G_FILE_QUERY_INFO_NONE, NULL, &error))) { + if (error->code == G_IO_ERROR_NOT_DIRECTORY) + return NULL; + goto error; + } + + if (!(mon = g_file_monitor(file, G_FILE_MONITOR_NONE, NULL, &error))) + goto error; + + g_signal_connect(mon, "changed", + G_CALLBACK(mdevctlEventHandleCallback), udev); + monitors = g_list_append(monitors, mon); + + while (true) { + GFileInfo *info; + GFile *child; + GList *child_monitors = NULL; + + if (!g_file_enumerator_iterate(children, &info, &child, NULL, &error)) + goto error; + if (!info) + break; + + child_monitors = monitorFileRecursively(udev, child); + if (child_monitors) + monitors = g_list_concat(monitors, child_monitors); + } + + return monitors; + + error: + g_list_free_full(monitors, g_object_unref); + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Unable to monitor directory: %s"), error->message); + return NULL; +} + + +static void +mdevctlEventHandleCallback(GFileMonitor *monitor G_GNUC_UNUSED, + GFile *file, + GFile *other_file G_GNUC_UNUSED, + GFileMonitorEvent event_type, + gpointer user_data) +{ + udevEventDataPtr priv = user_data; + + /* if a new directory appears, monitor that directory for changes */ + if (event_type == G_FILE_MONITOR_EVENT_CREATED && + g_file_query_file_type(file, G_FILE_QUERY_INFO_NONE, NULL) == + G_FILE_TYPE_DIRECTORY) { + GList *newmonitors = monitorFileRecursively(priv, file); + priv->mdevctlMonitors = g_list_concat(priv->mdevctlMonitors, newmonitors); + } + + /* When mdevctl creates a device, it can result in multiple notify events + * emitted for a single logical change (e.g. several CHANGED events, or a + * CREATED and CHANGED event followed by CHANGES_DONE_HINT). To avoid + * spawning a mdevctl thread multiple times for a single logical + * configuration change, try to coalesce these changes by waiting for the + * CHANGES_DONE_HINT event. As a fallback, add a timeout to trigger the + * signal if that event never comes */ + if (event_type != G_FILE_MONITOR_EVENT_CHANGES_DONE_HINT) { + if (priv->mdevctlTimeout > 0) + virEventRemoveTimeout(priv->mdevctlTimeout); + priv->mdevctlTimeout = virEventAddTimeout(100, scheduleMdevctlHandler, + priv, NULL); + return; + } + + scheduleMdevctlHandler(-1, priv); +} + + /* DMI is intel-compatible specific */ #if defined(__x86_64__) || defined(__i386__) || defined(__amd64__) static void @@ -2052,6 +2181,7 @@ nodeStateInitialize(bool privileged, udevEventDataPtr priv = NULL; struct udev *udev = NULL; virThread enumThread; + g_autoptr(GFile) mdevctlConfigDir = g_file_new_for_path("/etc/mdevctl.d"); if (root != NULL) { virReportError(VIR_ERR_INVALID_ARG, "%s", @@ -2153,6 +2283,14 @@ nodeStateInitialize(bool privileged, if (priv->watch == -1) goto unlock; + /* mdevctl may add notification events in the future: + * https://github.com/mdevctl/mdevctl/issues/27. For now, fall back to + * monitoring the mdevctl configuration directory for changes. + * mdevctl configuration is stored in a directory tree within + * /etc/mdevctl.d/. There is a directory for each parent device, which + * contains a file defining each mediated device */ + priv->mdevctlMonitors = monitorFileRecursively(priv, mdevctlConfigDir); + virObjectUnlock(priv); /* Create a fictional 'computer' device to root the device tree. */ -- 2.26.2

On Thu, Dec 24, 2020 at 08:14:35AM -0600, Jonathon Jongsma wrote:
We need to periodically query mdevctl for changes to device definitions since an administrator can define new devices with mdevctl outside of libvirt.
In the future, mdevctl may add a way to signal device add/remove via events, but for now we resort to a bit of a workaround: monitoring the mdevctl config directories for changes to files. When a change is detected, we query mdevctl and update our device list. The mdevctl querying is handled by the existing udev thread.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/node_device/node_device_udev.c | 204 ++++++++++++++++++++++++----- 1 file changed, 171 insertions(+), 33 deletions(-)
diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index d7f7ab4370..5729fea264 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -41,6 +41,7 @@ #include "virnetdev.h" #include "virmdev.h" #include "virutil.h" +#include <gio/gio.h>
System header includes should come before local header includes.
#include "configmake.h"
@@ -66,6 +67,10 @@ struct _udevEventData { virCond threadCond; bool threadQuit; bool udevReady; + bool mdevReady; + + GList *mdevctlMonitors; + int mdevctlTimeout; };
static virClassPtr udevEventDataClass; @@ -85,6 +90,7 @@ udevEventDataDispose(void *obj) udev = udev_monitor_get_udev(priv->udev_monitor); udev_monitor_unref(priv->udev_monitor); udev_unref(udev); + g_list_free_full(priv->mdevctlMonitors, g_object_unref);
virCondDestroy(&priv->threadCond); } @@ -1821,7 +1827,7 @@ udevEventHandleThread(void *opaque G_GNUC_UNUSED) /* continue rather than break from the loop on non-fatal errors */ while (1) { virObjectLock(priv); - while (!priv->udevReady && !priv->threadQuit) { + while (!priv->udevReady && !priv->mdevReady && !priv->threadQuit) { if (virCondWait(&priv->threadCond, &priv->parent.lock)) { virReportSystemError(errno, "%s", _("handler failed to wait on condition")); @@ -1835,46 +1841,58 @@ udevEventHandleThread(void *opaque G_GNUC_UNUSED) return; }
- errno = 0; - device = udev_monitor_receive_device(priv->udev_monitor); - virObjectUnlock(priv); + if (priv->udevReady) { + errno = 0; + device = udev_monitor_receive_device(priv->udev_monitor); + virObjectUnlock(priv);
- if (!device) { - if (errno == 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, "%s", - _("failed to receive device from udev monitor")); - return; - } + if (!device) { + if (errno == 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("failed to receive device from udev monitor")); + return; + } + + /* POSIX allows both EAGAIN and EWOULDBLOCK to be used + * interchangeably when the read would block or timeout was fired + */ + VIR_WARNINGS_NO_WLOGICALOP_EQUAL_EXPR + if (errno != EAGAIN && errno != EWOULDBLOCK) { + VIR_WARNINGS_RESET + virReportSystemError(errno, "%s", + _("failed to receive device from udev " + "monitor")); + return; + } + + /* Trying to move the reset of the @priv->udevReady flag to + * after the udev_monitor_receive_device wouldn't help much + * due to event mgmt and scheduler timing. */ + virObjectLock(priv); + priv->udevReady = false; + virObjectUnlock(priv);
- /* POSIX allows both EAGAIN and EWOULDBLOCK to be used - * interchangeably when the read would block or timeout was fired - */ - VIR_WARNINGS_NO_WLOGICALOP_EQUAL_EXPR - if (errno != EAGAIN && errno != EWOULDBLOCK) { - VIR_WARNINGS_RESET - virReportSystemError(errno, "%s", - _("failed to receive device from udev " - "monitor")); - return; + continue; }
- /* Trying to move the reset of the @priv->udevReady flag to - * after the udev_monitor_receive_device wouldn't help much - * due to event mgmt and scheduler timing. */ - virObjectLock(priv); - priv->udevReady = false; - virObjectUnlock(priv); + udevHandleOneDevice(device); + udev_device_unref(device);
- continue; + /* Instead of waiting for the next event after processing @device + * data, let's keep reading from the udev monitor and only wait + * for the next event once either a EAGAIN or a EWOULDBLOCK error + * is encountered. */ }
Hmm, we'll lose some responsiveness if we do it within a single thread, on systems which regularly spawns and destroys virtual functions and mdevs it may take some time until we actually get to processing mdevctl stuff, but it may not be that significant, although it doesn't feel quite right to me to do it in a single thread dedicated to udev. The event ordering discussed in v2 should be a matter of a single shared lock right? The first thing the throw-away thread does is to acquire a lock to minimize the window for a context switch and then process the event.
- udevHandleOneDevice(device); - udev_device_unref(device); + if (priv->mdevReady) { + virObjectUnlock(priv); + if (nodeDeviceUpdateMediatedDevices() < 0) + VIR_WARN("mdevctl failed to updated mediated devices");
- /* Instead of waiting for the next event after processing @device - * data, let's keep reading from the udev monitor and only wait - * for the next event once either a EAGAIN or a EWOULDBLOCK error - * is encountered. */ + virObjectLock(priv); + priv->mdevReady = false; + virObjectUnlock(priv); + } } }
@@ -1899,6 +1917,117 @@ udevEventHandleCallback(int watch G_GNUC_UNUSED, }
+static void +scheduleMdevctlHandler(int timer G_GNUC_UNUSED, void *opaque) +{ + udevEventDataPtr priv = opaque; + + if (priv->mdevctlTimeout > 0) { + virEventRemoveTimeout(priv->mdevctlTimeout); + priv->mdevctlTimeout = -1; + } + + virObjectLock(priv); + priv->mdevReady = true; + virCondSignal(&priv->threadCond); + virObjectUnlock(priv); +} + + +static void +mdevctlEventHandleCallback(GFileMonitor *monitor, + GFile *file, + GFile *other_file, + GFileMonitorEvent event_type, + gpointer user_data); + + +/* Recursively monitors a directory and its subdirectories for file changes and + * returns a GList of GFileMonitor objects */ +static GList* +monitorFileRecursively(udevEventDataPtr udev, + GFile *file) +{ + GList *monitors = NULL; + g_autoptr(GError) error = NULL; + g_autoptr(GFileEnumerator) children = NULL; + GFileMonitor *mon; + + if (!(children = g_file_enumerate_children(file, "standard::*", + G_FILE_QUERY_INFO_NONE, NULL, &error))) { + if (error->code == G_IO_ERROR_NOT_DIRECTORY)
We're not reporting an error in this code path, I think we should not leave it like that and report errors on all error code paths.
+ return NULL; + goto error; + } + + if (!(mon = g_file_monitor(file, G_FILE_MONITOR_NONE, NULL, &error))) + goto error; + + g_signal_connect(mon, "changed", + G_CALLBACK(mdevctlEventHandleCallback), udev); + monitors = g_list_append(monitors, mon); + + while (true) { + GFileInfo *info; + GFile *child; + GList *child_monitors = NULL; + + if (!g_file_enumerator_iterate(children, &info, &child, NULL, &error)) + goto error; + if (!info) + break; + + child_monitors = monitorFileRecursively(udev, child); + if (child_monitors) + monitors = g_list_concat(monitors, child_monitors); + } + + return monitors; + + error: + g_list_free_full(monitors, g_object_unref); + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Unable to monitor directory: %s"), error->message); + return NULL; +} + + +static void +mdevctlEventHandleCallback(GFileMonitor *monitor G_GNUC_UNUSED, + GFile *file, + GFile *other_file G_GNUC_UNUSED, + GFileMonitorEvent event_type, + gpointer user_data) +{ + udevEventDataPtr priv = user_data; + + /* if a new directory appears, monitor that directory for changes */ + if (event_type == G_FILE_MONITOR_EVENT_CREATED && + g_file_query_file_type(file, G_FILE_QUERY_INFO_NONE, NULL) == + G_FILE_TYPE_DIRECTORY) { + GList *newmonitors = monitorFileRecursively(priv, file); + priv->mdevctlMonitors = g_list_concat(priv->mdevctlMonitors, newmonitors); + } + + /* When mdevctl creates a device, it can result in multiple notify events + * emitted for a single logical change (e.g. several CHANGED events, or a + * CREATED and CHANGED event followed by CHANGES_DONE_HINT). To avoid + * spawning a mdevctl thread multiple times for a single logical + * configuration change, try to coalesce these changes by waiting for the + * CHANGES_DONE_HINT event. As a fallback, add a timeout to trigger the + * signal if that event never comes */
If the CHANGES_DONE_HINT event doesn't come, then the ordering is still broken even if we have a timer in place, isn't it? I'm wondering whether it actually happened to you that it didn't arrive. I mean, we should be able to rely on an event to be delivered properly, it's not like we're trying to wait for an event that kernel will never send when the whole filesystem tree is created for an mdev that was just created. Feel free to point out facts I'm not seeing on why we can't adopt the throw-away threads for the mdevctl events with a single shared lock. Regards, Erik
+ if (event_type != G_FILE_MONITOR_EVENT_CHANGES_DONE_HINT) { + if (priv->mdevctlTimeout > 0) + virEventRemoveTimeout(priv->mdevctlTimeout); + priv->mdevctlTimeout = virEventAddTimeout(100, scheduleMdevctlHandler, + priv, NULL); + return; + } + + scheduleMdevctlHandler(-1, priv); +} + + /* DMI is intel-compatible specific */ #if defined(__x86_64__) || defined(__i386__) || defined(__amd64__) static void @@ -2052,6 +2181,7 @@ nodeStateInitialize(bool privileged, udevEventDataPtr priv = NULL; struct udev *udev = NULL; virThread enumThread; + g_autoptr(GFile) mdevctlConfigDir = g_file_new_for_path("/etc/mdevctl.d");
if (root != NULL) { virReportError(VIR_ERR_INVALID_ARG, "%s", @@ -2153,6 +2283,14 @@ nodeStateInitialize(bool privileged, if (priv->watch == -1) goto unlock;
+ /* mdevctl may add notification events in the future: + * https://github.com/mdevctl/mdevctl/issues/27. For now, fall back to + * monitoring the mdevctl configuration directory for changes. + * mdevctl configuration is stored in a directory tree within + * /etc/mdevctl.d/. There is a directory for each parent device, which + * contains a file defining each mediated device */ + priv->mdevctlMonitors = monitorFileRecursively(priv, mdevctlConfigDir); + virObjectUnlock(priv);
/* Create a fictional 'computer' device to root the device tree. */ -- 2.26.2

With mediated devices, we can now define persistent node devices that can be started and stopped. In order to take advantage of this, we need an API to define new node devices. Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- include/libvirt/libvirt-nodedev.h | 4 + src/driver-nodedev.h | 6 ++ src/libvirt-nodedev.c | 42 ++++++++ src/libvirt_public.syms | 1 + src/node_device/node_device_driver.c | 97 ++++++++++++++++++- src/node_device/node_device_driver.h | 10 ++ src/node_device/node_device_udev.c | 1 + src/remote/remote_driver.c | 1 + src/remote/remote_protocol.x | 17 +++- src/remote_protocol-structs | 8 ++ src/rpc/gendispatch.pl | 1 + ...19_36ea_4111_8f0a_8c9a70e21366-define.argv | 1 + ...19_36ea_4111_8f0a_8c9a70e21366-define.json | 1 + ...39_495e_4243_ad9f_beb3f14c23d9-define.argv | 1 + ...39_495e_4243_ad9f_beb3f14c23d9-define.json | 1 + ...16_1ca8_49ac_b176_871d16c13076-define.argv | 1 + ...16_1ca8_49ac_b176_871d16c13076-define.json | 1 + tests/nodedevmdevctltest.c | 68 +++++++++---- 18 files changed, 239 insertions(+), 23 deletions(-) create mode 100644 tests/nodedevmdevctldata/mdev_d069d019_36ea_4111_8f0a_8c9a70e21366-define.argv create mode 100644 tests/nodedevmdevctldata/mdev_d069d019_36ea_4111_8f0a_8c9a70e21366-define.json create mode 100644 tests/nodedevmdevctldata/mdev_d2441d39_495e_4243_ad9f_beb3f14c23d9-define.argv create mode 100644 tests/nodedevmdevctldata/mdev_d2441d39_495e_4243_ad9f_beb3f14c23d9-define.json create mode 100644 tests/nodedevmdevctldata/mdev_fedc4916_1ca8_49ac_b176_871d16c13076-define.argv create mode 100644 tests/nodedevmdevctldata/mdev_fedc4916_1ca8_49ac_b176_871d16c13076-define.json diff --git a/include/libvirt/libvirt-nodedev.h b/include/libvirt/libvirt-nodedev.h index a473563857..3e57e9522a 100644 --- a/include/libvirt/libvirt-nodedev.h +++ b/include/libvirt/libvirt-nodedev.h @@ -132,6 +132,10 @@ virNodeDevicePtr virNodeDeviceCreateXML (virConnectPtr conn, int virNodeDeviceDestroy (virNodeDevicePtr dev); +virNodeDevicePtr virNodeDeviceDefineXML(virConnectPtr conn, + const char *xmlDesc, + unsigned int flags); + /** * VIR_NODE_DEVICE_EVENT_CALLBACK: * diff --git a/src/driver-nodedev.h b/src/driver-nodedev.h index d0fc7f19cf..16d787f3fc 100644 --- a/src/driver-nodedev.h +++ b/src/driver-nodedev.h @@ -74,6 +74,11 @@ typedef virNodeDevicePtr typedef int (*virDrvNodeDeviceDestroy)(virNodeDevicePtr dev); +typedef virNodeDevicePtr +(*virDrvNodeDeviceDefineXML)(virConnectPtr conn, + const char *xmlDesc, + unsigned int flags); + typedef int (*virDrvConnectNodeDeviceEventRegisterAny)(virConnectPtr conn, virNodeDevicePtr dev, @@ -113,4 +118,5 @@ struct _virNodeDeviceDriver { virDrvNodeDeviceListCaps nodeDeviceListCaps; virDrvNodeDeviceCreateXML nodeDeviceCreateXML; virDrvNodeDeviceDestroy nodeDeviceDestroy; + virDrvNodeDeviceDefineXML nodeDeviceDefineXML; }; diff --git a/src/libvirt-nodedev.c b/src/libvirt-nodedev.c index 375b907852..15eb70218a 100644 --- a/src/libvirt-nodedev.c +++ b/src/libvirt-nodedev.c @@ -765,6 +765,48 @@ virNodeDeviceDestroy(virNodeDevicePtr dev) } +/** + * virNodeDeviceDefineXML: + * @conn: pointer to the hypervisor connection + * @xmlDesc: string containing an XML description of the device to be defined + * @flags: extra flags; not used yet, so callers should always pass 0 + * + * Define a new device on the VM host machine, for example, a mediated device + * + * virNodeDeviceFree should be used to free the resources after the + * node device object is no longer needed. + * + * Returns a node device object if successful, NULL in case of failure + */ +virNodeDevicePtr +virNodeDeviceDefineXML(virConnectPtr conn, + const char *xmlDesc, + unsigned int flags) +{ + VIR_DEBUG("conn=%p, xmlDesc=%s, flags=0x%x", conn, NULLSTR(xmlDesc), flags); + + virResetLastError(); + + virCheckConnectReturn(conn, NULL); + virCheckReadOnlyGoto(conn->flags, error); + virCheckNonNullArgGoto(xmlDesc, error); + + if (conn->nodeDeviceDriver && + conn->nodeDeviceDriver->nodeDeviceDefineXML) { + virNodeDevicePtr dev = conn->nodeDeviceDriver->nodeDeviceDefineXML(conn, xmlDesc, flags); + if (dev == NULL) + goto error; + return dev; + } + + virReportUnsupportedError(); + + error: + virDispatchError(conn); + return NULL; +} + + /** * virConnectNodeDeviceEventRegisterAny: * @conn: pointer to the connection diff --git a/src/libvirt_public.syms b/src/libvirt_public.syms index cf31f937d5..8fae4352ff 100644 --- a/src/libvirt_public.syms +++ b/src/libvirt_public.syms @@ -877,6 +877,7 @@ LIBVIRT_6.10.0 { global: virDomainAuthorizedSSHKeysGet; virDomainAuthorizedSSHKeysSet; + virNodeDeviceDefineXML; } LIBVIRT_6.0.0; # .... define new API here using predicted next version number .... diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index 0bebd534d0..4fbe8743b4 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -698,9 +698,13 @@ nodeDeviceFindAddressByName(const char *name) } -virCommandPtr -nodeDeviceGetMdevctlStartCommand(virNodeDeviceDefPtr def, - char **uuid_out) +/* the mdevctl 'start' and 'define' commands accept almost the exact same + * arguments, so provide a common implementation that can be wrapped by a more + * specific function */ +static virCommandPtr +nodeDeviceGetMdevctlDefineStartCommand(virNodeDeviceDefPtr def, + const char *subcommand, + char **uuid_out) { virCommandPtr cmd; g_autofree char *json = NULL; @@ -718,7 +722,7 @@ nodeDeviceGetMdevctlStartCommand(virNodeDeviceDefPtr def, return NULL; } - cmd = virCommandNewArgList(MDEVCTL, "start", + cmd = virCommandNewArgList(MDEVCTL, subcommand, "-p", parent_addr, "--jsonfile", "/dev/stdin", NULL); @@ -729,6 +733,22 @@ nodeDeviceGetMdevctlStartCommand(virNodeDeviceDefPtr def, return cmd; } +virCommandPtr +nodeDeviceGetMdevctlStartCommand(virNodeDeviceDefPtr def, + char **uuid_out) +{ + return nodeDeviceGetMdevctlDefineStartCommand(def, "start", uuid_out); +} + +virCommandPtr +nodeDeviceGetMdevctlDefineCommand(virNodeDeviceDefPtr def, + char **uuid_out) +{ + return nodeDeviceGetMdevctlDefineStartCommand(def, "define", uuid_out); +} + + + static int virMdevctlStart(virNodeDeviceDefPtr def, char **uuid) { @@ -749,6 +769,26 @@ virMdevctlStart(virNodeDeviceDefPtr def, char **uuid) } +static int +virMdevctlDefine(virNodeDeviceDefPtr def, char **uuid) +{ + int status; + g_autoptr(virCommand) cmd = nodeDeviceGetMdevctlDefineCommand(def, uuid); + if (!cmd) + return -1; + + /* an auto-generated uuid is returned via stdout if no uuid is specified in + * the mdevctl args */ + if (virCommandRun(cmd, &status) < 0 || status != 0) + return -1; + + /* remove newline */ + *uuid = g_strstrip(*uuid); + + return 0; +} + + static virNodeDevicePtr nodeDeviceCreateXMLMdev(virConnectPtr conn, virNodeDeviceDefPtr def) @@ -1068,6 +1108,55 @@ nodeDeviceDestroy(virNodeDevicePtr device) return ret; } +virNodeDevicePtr +nodeDeviceDefineXML(virConnectPtr conn, + const char *xmlDesc, + unsigned int flags) +{ + g_autoptr(virNodeDeviceDef) def = NULL; + virNodeDevicePtr device = NULL; + const char *virt_type = NULL; + g_autofree char *uuid = NULL; + + virCheckFlags(0, NULL); + + if (nodeDeviceWaitInit() < 0) + return NULL; + + virt_type = virConnectGetType(conn); + + if (!(def = virNodeDeviceDefParseString(xmlDesc, CREATE_DEVICE, virt_type))) + return NULL; + + if (virNodeDeviceDefineXMLEnsureACL(conn, def) < 0) + return NULL; + + if (!nodeDeviceHasCapability(def, VIR_NODE_DEV_CAP_MDEV)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("Unsupported device type")); + return NULL; + } + + if (!def->parent) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("cannot define a mediated device without a parent")); + return NULL; + } + + if (virMdevctlDefine(def, &uuid) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Unable to define mediated device")); + return NULL; + } + + def->caps->data.mdev.uuid = g_strdup(uuid); + mdevGenerateDeviceName(def); + device = nodeDeviceFindNewMediatedDevice(conn, uuid); + + return device; +} + + int nodeConnectNodeDeviceEventRegisterAny(virConnectPtr conn, diff --git a/src/node_device/node_device_driver.h b/src/node_device/node_device_driver.h index 4315f6d6ed..e58bb0f124 100644 --- a/src/node_device/node_device_driver.h +++ b/src/node_device/node_device_driver.h @@ -102,6 +102,11 @@ nodeDeviceCreateXML(virConnectPtr conn, int nodeDeviceDestroy(virNodeDevicePtr dev); +virNodeDevicePtr +nodeDeviceDefineXML(virConnectPtr conn, + const char *xmlDesc, + unsigned int flags); + int nodeConnectNodeDeviceEventRegisterAny(virConnectPtr conn, virNodeDevicePtr dev, @@ -116,6 +121,11 @@ nodeConnectNodeDeviceEventDeregisterAny(virConnectPtr conn, virCommandPtr nodeDeviceGetMdevctlStartCommand(virNodeDeviceDefPtr def, char **uuid_out); + +virCommandPtr +nodeDeviceGetMdevctlDefineCommand(virNodeDeviceDefPtr def, + char **uuid_out); + virCommandPtr nodeDeviceGetMdevctlStopCommand(const char *uuid); diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index 5729fea264..2532c3189e 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -2338,6 +2338,7 @@ static virNodeDeviceDriver udevNodeDeviceDriver = { .nodeDeviceListCaps = nodeDeviceListCaps, /* 0.7.3 */ .nodeDeviceCreateXML = nodeDeviceCreateXML, /* 0.7.3 */ .nodeDeviceDestroy = nodeDeviceDestroy, /* 0.7.3 */ + .nodeDeviceDefineXML = nodeDeviceDefineXML, /* 6.5.0 */ }; diff --git a/src/remote/remote_driver.c b/src/remote/remote_driver.c index b0af3ee88e..f826b6f73e 100644 --- a/src/remote/remote_driver.c +++ b/src/remote/remote_driver.c @@ -8665,6 +8665,7 @@ static virNodeDeviceDriver node_device_driver = { .nodeDeviceNumOfCaps = remoteNodeDeviceNumOfCaps, /* 0.5.0 */ .nodeDeviceListCaps = remoteNodeDeviceListCaps, /* 0.5.0 */ .nodeDeviceCreateXML = remoteNodeDeviceCreateXML, /* 0.6.3 */ + .nodeDeviceDefineXML = remoteNodeDeviceDefineXML, /* 6.5.0 */ .nodeDeviceDestroy = remoteNodeDeviceDestroy /* 0.6.3 */ }; diff --git a/src/remote/remote_protocol.x b/src/remote/remote_protocol.x index 2df38cef77..d2250502b4 100644 --- a/src/remote/remote_protocol.x +++ b/src/remote/remote_protocol.x @@ -2142,6 +2142,15 @@ struct remote_node_device_destroy_args { remote_nonnull_string name; }; +struct remote_node_device_define_xml_args { + remote_nonnull_string xml_desc; + unsigned int flags; +}; + +struct remote_node_device_define_xml_ret { + remote_nonnull_node_device dev; +}; + /* * Events Register/Deregister: @@ -6714,5 +6723,11 @@ enum remote_procedure { * @generate: none * @acl: domain:write */ - REMOTE_PROC_DOMAIN_AUTHORIZED_SSH_KEYS_SET = 425 + REMOTE_PROC_DOMAIN_AUTHORIZED_SSH_KEYS_SET = 425, + + /** + * @generate: both + * @acl: node_device:write + */ + REMOTE_PROC_NODE_DEVICE_DEFINE_XML = 426 }; diff --git a/src/remote_protocol-structs b/src/remote_protocol-structs index 9bcd14603d..565c1673f1 100644 --- a/src/remote_protocol-structs +++ b/src/remote_protocol-structs @@ -1600,6 +1600,13 @@ struct remote_node_device_create_xml_ret { struct remote_node_device_destroy_args { remote_nonnull_string name; }; +struct remote_node_device_define_xml_args { + remote_nonnull_string xml_desc; + u_int flags; +}; +struct remote_node_device_define_xml_ret { + remote_nonnull_node_device dev; +}; struct remote_connect_domain_event_register_ret { int cb_registered; }; @@ -3588,4 +3595,5 @@ enum remote_procedure { REMOTE_PROC_DOMAIN_EVENT_MEMORY_FAILURE = 423, REMOTE_PROC_DOMAIN_AUTHORIZED_SSH_KEYS_GET = 424, REMOTE_PROC_DOMAIN_AUTHORIZED_SSH_KEYS_SET = 425, + REMOTE_PROC_NODE_DEVICE_DEFINE_XML = 426, }; diff --git a/src/rpc/gendispatch.pl b/src/rpc/gendispatch.pl index 0020273d9e..30108272f1 100755 --- a/src/rpc/gendispatch.pl +++ b/src/rpc/gendispatch.pl @@ -567,6 +567,7 @@ elsif ($mode eq "server") { if ($argtype =~ m/^remote_node_device_/ and !($argtype =~ m/^remote_node_device_lookup_by_name_/) and !($argtype =~ m/^remote_node_device_create_xml_/) and + !($argtype =~ m/^remote_node_device_define_xml_/) and !($argtype =~ m/^remote_node_device_lookup_scsi_host_by_wwn_/)) { $has_node_device = 1; push(@vars_list, "virNodeDevicePtr dev = NULL"); diff --git a/tests/nodedevmdevctldata/mdev_d069d019_36ea_4111_8f0a_8c9a70e21366-define.argv b/tests/nodedevmdevctldata/mdev_d069d019_36ea_4111_8f0a_8c9a70e21366-define.argv new file mode 100644 index 0000000000..773e98b963 --- /dev/null +++ b/tests/nodedevmdevctldata/mdev_d069d019_36ea_4111_8f0a_8c9a70e21366-define.argv @@ -0,0 +1 @@ +$MDEVCTL_BINARY$ define -p 0000:00:02.0 --jsonfile /dev/stdin diff --git a/tests/nodedevmdevctldata/mdev_d069d019_36ea_4111_8f0a_8c9a70e21366-define.json b/tests/nodedevmdevctldata/mdev_d069d019_36ea_4111_8f0a_8c9a70e21366-define.json new file mode 100644 index 0000000000..bfc6dcace3 --- /dev/null +++ b/tests/nodedevmdevctldata/mdev_d069d019_36ea_4111_8f0a_8c9a70e21366-define.json @@ -0,0 +1 @@ +{"mdev_type":"i915-GVTg_V5_8","start":"manual"} \ No newline at end of file diff --git a/tests/nodedevmdevctldata/mdev_d2441d39_495e_4243_ad9f_beb3f14c23d9-define.argv b/tests/nodedevmdevctldata/mdev_d2441d39_495e_4243_ad9f_beb3f14c23d9-define.argv new file mode 100644 index 0000000000..773e98b963 --- /dev/null +++ b/tests/nodedevmdevctldata/mdev_d2441d39_495e_4243_ad9f_beb3f14c23d9-define.argv @@ -0,0 +1 @@ +$MDEVCTL_BINARY$ define -p 0000:00:02.0 --jsonfile /dev/stdin diff --git a/tests/nodedevmdevctldata/mdev_d2441d39_495e_4243_ad9f_beb3f14c23d9-define.json b/tests/nodedevmdevctldata/mdev_d2441d39_495e_4243_ad9f_beb3f14c23d9-define.json new file mode 100644 index 0000000000..e5b22b2c44 --- /dev/null +++ b/tests/nodedevmdevctldata/mdev_d2441d39_495e_4243_ad9f_beb3f14c23d9-define.json @@ -0,0 +1 @@ +{"mdev_type":"i915-GVTg_V5_8","start":"manual","attrs":[{"example-attribute-1":"attribute-value-1"},{"example-attribute-2":"attribute-value-2"}]} \ No newline at end of file diff --git a/tests/nodedevmdevctldata/mdev_fedc4916_1ca8_49ac_b176_871d16c13076-define.argv b/tests/nodedevmdevctldata/mdev_fedc4916_1ca8_49ac_b176_871d16c13076-define.argv new file mode 100644 index 0000000000..773e98b963 --- /dev/null +++ b/tests/nodedevmdevctldata/mdev_fedc4916_1ca8_49ac_b176_871d16c13076-define.argv @@ -0,0 +1 @@ +$MDEVCTL_BINARY$ define -p 0000:00:02.0 --jsonfile /dev/stdin diff --git a/tests/nodedevmdevctldata/mdev_fedc4916_1ca8_49ac_b176_871d16c13076-define.json b/tests/nodedevmdevctldata/mdev_fedc4916_1ca8_49ac_b176_871d16c13076-define.json new file mode 100644 index 0000000000..2e03d0bd8e --- /dev/null +++ b/tests/nodedevmdevctldata/mdev_fedc4916_1ca8_49ac_b176_871d16c13076-define.json @@ -0,0 +1 @@ +{"mdev_type":"i915-GVTg_V5_8","start":"manual","attrs":[{"example-attribute":"attribute-value"}]} \ No newline at end of file diff --git a/tests/nodedevmdevctltest.c b/tests/nodedevmdevctltest.c index 8bc464d59f..981276d180 100644 --- a/tests/nodedevmdevctltest.c +++ b/tests/nodedevmdevctltest.c @@ -10,10 +10,16 @@ #define VIR_FROM_THIS VIR_FROM_NODEDEV +typedef enum { + MDEVCTL_CMD_START, + MDEVCTL_CMD_DEFINE, +} MdevctlCmd; + struct startTestInfo { const char *virt_type; int create; const char *filename; + MdevctlCmd command; }; /* capture stdin passed to command */ @@ -45,12 +51,17 @@ nodedevCompareToFile(const char *actual, return virTestCompareToFile(replacedCmdline, filename); } + +typedef virCommandPtr (*GetStartDefineCmdFunc)(virNodeDeviceDefPtr, char **); + + static int testMdevctlStart(const char *virt_type, int create, + GetStartDefineCmdFunc get_cmd_func, const char *mdevxml, - const char *startcmdfile, - const char *startjsonfile) + const char *cmdfile, + const char *jsonfile) { g_autoptr(virNodeDeviceDef) def = NULL; virNodeDeviceObjPtr obj = NULL; @@ -66,7 +77,7 @@ testMdevctlStart(const char *virt_type, /* this function will set a stdin buffer containing the json configuration * of the device. The json value is captured in the callback above */ - cmd = nodeDeviceGetMdevctlStartCommand(def, &uuid); + cmd = get_cmd_func(def, &uuid); if (!cmd) goto cleanup; @@ -78,10 +89,10 @@ testMdevctlStart(const char *virt_type, if (!(actualCmdline = virBufferCurrentContent(&buf))) goto cleanup; - if (nodedevCompareToFile(actualCmdline, startcmdfile) < 0) + if (nodedevCompareToFile(actualCmdline, cmdfile) < 0) goto cleanup; - if (virTestCompareToFile(stdinbuf, startjsonfile) < 0) + if (virTestCompareToFile(stdinbuf, jsonfile) < 0) goto cleanup; ret = 0; @@ -96,17 +107,31 @@ static int testMdevctlStartHelper(const void *data) { const struct startTestInfo *info = data; + const char *cmd; + GetStartDefineCmdFunc func; + g_autofree char *mdevxml = NULL; + g_autofree char *cmdlinefile = NULL; + g_autofree char *jsonfile = NULL; + + if (info->command == MDEVCTL_CMD_START) { + cmd = "start"; + func = nodeDeviceGetMdevctlStartCommand; + } else if (info->command == MDEVCTL_CMD_DEFINE) { + cmd = "define"; + func = nodeDeviceGetMdevctlDefineCommand; + } else { + return -1; + } - g_autofree char *mdevxml = g_strdup_printf("%s/nodedevschemadata/%s.xml", - abs_srcdir, info->filename); - g_autofree char *cmdlinefile = g_strdup_printf("%s/nodedevmdevctldata/%s-start.argv", - abs_srcdir, info->filename); - g_autofree char *jsonfile = g_strdup_printf("%s/nodedevmdevctldata/%s-start.json", - abs_srcdir, info->filename); + mdevxml = g_strdup_printf("%s/nodedevschemadata/%s.xml", abs_srcdir, + info->filename); + cmdlinefile = g_strdup_printf("%s/nodedevmdevctldata/%s-%s.argv", + abs_srcdir, info->filename, cmd); + jsonfile = g_strdup_printf("%s/nodedevmdevctldata/%s-%s.json", abs_srcdir, + info->filename, cmd); - return testMdevctlStart(info->virt_type, - info->create, mdevxml, cmdlinefile, - jsonfile); + return testMdevctlStart(info->virt_type, info->create, func, + mdevxml, cmdlinefile, jsonfile); } static int @@ -347,15 +372,18 @@ mymain(void) if (virTestRun(desc, func, info) < 0) \ ret = -1; -#define DO_TEST_START_FULL(virt_type, create, filename) \ +#define DO_TEST_START_FULL(desc, virt_type, create, filename, command) \ do { \ - struct startTestInfo info = { virt_type, create, filename }; \ - DO_TEST_FULL("mdevctl start " filename, testMdevctlStartHelper, &info); \ + struct startTestInfo info = { virt_type, create, filename, command }; \ + DO_TEST_FULL(desc, testMdevctlStartHelper, &info); \ } \ while (0) #define DO_TEST_START(filename) \ - DO_TEST_START_FULL("QEMU", CREATE_DEVICE, filename) + DO_TEST_START_FULL("mdevctl start " filename, "QEMU", CREATE_DEVICE, filename, MDEVCTL_CMD_START) + +#define DO_TEST_DEFINE(filename) \ + DO_TEST_START_FULL("mdevctl define " filename, "QEMU", CREATE_DEVICE, filename, MDEVCTL_CMD_DEFINE) #define DO_TEST_STOP(uuid) \ DO_TEST_FULL("mdevctl stop " uuid, testMdevctlStop, uuid) @@ -378,6 +406,10 @@ mymain(void) DO_TEST_PARSE_JSON("mdevctl-list-multiple"); + DO_TEST_DEFINE("mdev_d069d019_36ea_4111_8f0a_8c9a70e21366"); + DO_TEST_DEFINE("mdev_fedc4916_1ca8_49ac_b176_871d16c13076"); + DO_TEST_DEFINE("mdev_d2441d39_495e_4243_ad9f_beb3f14c23d9"); + done: nodedevTestDriverFree(driver); -- 2.26.2

On Thu, Dec 24, 2020 at 08:14:36AM -0600, Jonathon Jongsma wrote:
With mediated devices, we can now define persistent node devices that can be started and stopped. In order to take advantage of this, we need an API to define new node devices.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- ...
diff --git a/src/libvirt_public.syms b/src/libvirt_public.syms index cf31f937d5..8fae4352ff 100644 --- a/src/libvirt_public.syms +++ b/src/libvirt_public.syms @@ -877,6 +877,7 @@ LIBVIRT_6.10.0 {
This will likely be 7.1.0+ - just so that we don't forget to update it :)
global: virDomainAuthorizedSSHKeysGet; virDomainAuthorizedSSHKeysSet; + virNodeDeviceDefineXML; } LIBVIRT_6.0.0;
# .... define new API here using predicted next version number .... diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index 0bebd534d0..4fbe8743b4 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -698,9 +698,13 @@ nodeDeviceFindAddressByName(const char *name) }
-virCommandPtr -nodeDeviceGetMdevctlStartCommand(virNodeDeviceDefPtr def, - char **uuid_out) +/* the mdevctl 'start' and 'define' commands accept almost the exact same + * arguments, so provide a common implementation that can be wrapped by a more + * specific function */ +static virCommandPtr +nodeDeviceGetMdevctlDefineStartCommand(virNodeDeviceDefPtr def, + const char *subcommand, + char **uuid_out) { virCommandPtr cmd; g_autofree char *json = NULL; @@ -718,7 +722,7 @@ nodeDeviceGetMdevctlStartCommand(virNodeDeviceDefPtr def, return NULL; }
- cmd = virCommandNewArgList(MDEVCTL, "start", + cmd = virCommandNewArgList(MDEVCTL, subcommand, "-p", parent_addr, "--jsonfile", "/dev/stdin", NULL); @@ -729,6 +733,22 @@ nodeDeviceGetMdevctlStartCommand(virNodeDeviceDefPtr def, return cmd; }
+virCommandPtr +nodeDeviceGetMdevctlStartCommand(virNodeDeviceDefPtr def, + char **uuid_out) +{ + return nodeDeviceGetMdevctlDefineStartCommand(def, "start", uuid_out); +} +
^These hunks should be in a separate preparation patch ...
+ .nodeDeviceDefineXML = nodeDeviceDefineXML, /* 6.5.0 */
7.x.0
.nodeDeviceCreateXML = remoteNodeDeviceCreateXML, /* 0.6.3 */ + .nodeDeviceDefineXML = remoteNodeDeviceDefineXML, /* 6.5.0 */
7.x.0
.nodeDeviceDestroy = remoteNodeDeviceDestroy /* 0.6.3 */ };
/* capture stdin passed to command */ @@ -45,12 +51,17 @@ nodedevCompareToFile(const char *actual, return virTestCompareToFile(replacedCmdline, filename); }
<HUNK_START
+ +typedef virCommandPtr (*GetStartDefineCmdFunc)(virNodeDeviceDefPtr, char **);
MdevctlCmdFunc would be probably a better name.
+ + static int testMdevctlStart(const char *virt_type, int create, + GetStartDefineCmdFunc get_cmd_func,
mdevctl_cmd_func probably?
const char *mdevxml, - const char *startcmdfile, - const char *startjsonfile) + const char *cmdfile, + const char *jsonfile) { g_autoptr(virNodeDeviceDef) def = NULL; virNodeDeviceObjPtr obj = NULL; @@ -66,7 +77,7 @@ testMdevctlStart(const char *virt_type,
/* this function will set a stdin buffer containing the json configuration * of the device. The json value is captured in the callback above */ - cmd = nodeDeviceGetMdevctlStartCommand(def, &uuid); + cmd = get_cmd_func(def, &uuid);
if (!cmd) goto cleanup; @@ -78,10 +89,10 @@ testMdevctlStart(const char *virt_type, if (!(actualCmdline = virBufferCurrentContent(&buf))) goto cleanup;
- if (nodedevCompareToFile(actualCmdline, startcmdfile) < 0) + if (nodedevCompareToFile(actualCmdline, cmdfile) < 0) goto cleanup;
- if (virTestCompareToFile(stdinbuf, startjsonfile) < 0) + if (virTestCompareToFile(stdinbuf, jsonfile) < 0) goto cleanup;
ret = 0; @@ -96,17 +107,31 @@ static int testMdevctlStartHelper(const void *data) { const struct startTestInfo *info = data; + const char *cmd; + GetStartDefineCmdFunc func; + g_autofree char *mdevxml = NULL; + g_autofree char *cmdlinefile = NULL; + g_autofree char *jsonfile = NULL; + + if (info->command == MDEVCTL_CMD_START) { + cmd = "start"; + func = nodeDeviceGetMdevctlStartCommand;
<HUNK_END ^These hunks should be in a separate preparation patch (part of the one I mentioned earlier in this reply).
+ } else if (info->command == MDEVCTL_CMD_DEFINE) { + cmd = "define"; + func = nodeDeviceGetMdevctlDefineCommand; + } else { + return -1; + }
- g_autofree char *mdevxml = g_strdup_printf("%s/nodedevschemadata/%s.xml", - abs_srcdir, info->filename); - g_autofree char *cmdlinefile = g_strdup_printf("%s/nodedevmdevctldata/%s-start.argv", - abs_srcdir, info->filename); - g_autofree char *jsonfile = g_strdup_printf("%s/nodedevmdevctldata/%s-start.json", - abs_srcdir, info->filename); + mdevxml = g_strdup_printf("%s/nodedevschemadata/%s.xml", abs_srcdir, + info->filename); + cmdlinefile = g_strdup_printf("%s/nodedevmdevctldata/%s-%s.argv", + abs_srcdir, info->filename, cmd); + jsonfile = g_strdup_printf("%s/nodedevmdevctldata/%s-%s.json", abs_srcdir, + info->filename, cmd);
- return testMdevctlStart(info->virt_type, - info->create, mdevxml, cmdlinefile, - jsonfile); + return testMdevctlStart(info->virt_type, info->create, func, + mdevxml, cmdlinefile, jsonfile); }
The same applies to ^these hunks as well.
static int @@ -347,15 +372,18 @@ mymain(void) if (virTestRun(desc, func, info) < 0) \ ret = -1;
-#define DO_TEST_START_FULL(virt_type, create, filename) \ +#define DO_TEST_START_FULL(desc, virt_type, create, filename, command) \
At this point I think this wrapper macro should be called DO_TEST_CMD instead. The API functionality looks okay, ACK. Erik

Now that we can filter active and inactive node devices in virConnectListAllNodeDevices(), add these switches to the virsh command. Eventual output (once everything is hooked up): virsh # nodedev-list --inactive --cap mdev mdev_07d8b8b0_7e04_4c0f_97ed_9214ce12723c mdev_927c040f_ae7d_4a35_966e_286ba6ebbe1c virsh # nodedev-list --active --cap mdev mdev_bd2ea955_3402_4252_8c17_7468083a0f26 virsh # nodedev-list --all --cap mdev mdev_07d8b8b0_7e04_4c0f_97ed_9214ce12723c mdev_927c040f_ae7d_4a35_966e_286ba6ebbe1c mdev_bd2ea955_3402_4252_8c17_7468083a0f26 Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- tools/virsh-nodedev.c | 30 ++++++++++++++++++++++++++---- 1 file changed, 26 insertions(+), 4 deletions(-) diff --git a/tools/virsh-nodedev.c b/tools/virsh-nodedev.c index 35117585ff..e3261747e3 100644 --- a/tools/virsh-nodedev.c +++ b/tools/virsh-nodedev.c @@ -378,6 +378,14 @@ static const vshCmdOptDef opts_node_list_devices[] = { .completer = virshNodeDeviceCapabilityNameCompleter, .help = N_("capability names, separated by comma") }, + {.name = "inactive", + .type = VSH_OT_BOOL, + .help = N_("list inactive devices") + }, + {.name = "all", + .type = VSH_OT_BOOL, + .help = N_("list inactive & active devices") + }, {.name = NULL} }; @@ -393,18 +401,27 @@ cmdNodeListDevices(vshControl *ctl, const vshCmd *cmd G_GNUC_UNUSED) int ncaps = 0; virshNodeDeviceListPtr list = NULL; int cap_type = -1; + bool inactive, all; + inactive = vshCommandOptBool(cmd, "inactive"); + all = vshCommandOptBool(cmd, "all"); ignore_value(vshCommandOptStringQuiet(ctl, cmd, "cap", &cap_str)); if (cap_str) { - if (tree) { - vshError(ctl, "%s", _("Options --tree and --cap are incompatible")); - return false; - } if ((ncaps = vshStringToArray(cap_str, &caps)) < 0) return false; } + if (all && inactive) { + vshError(ctl, "%s", _("Option --all is incompatible with --inactive")); + return false; + } + + if (tree && (cap_str || inactive)) { + vshError(ctl, "%s", _("Option --tree is incompatible with other options")); + return false; + } + for (i = 0; i < ncaps; i++) { if ((cap_type = virNodeDevCapTypeFromString(caps[i])) < 0) { vshError(ctl, "%s", _("Invalid capability type")); @@ -481,6 +498,11 @@ cmdNodeListDevices(vshControl *ctl, const vshCmd *cmd G_GNUC_UNUSED) } } + if (inactive || all) + flags |= VIR_CONNECT_LIST_NODE_DEVICES_INACTIVE; + if (!inactive) + flags |= VIR_CONNECT_LIST_NODE_DEVICES_ACTIVE; + if (!(list = virshNodeDeviceListCollect(ctl, caps, ncaps, flags))) { ret = false; goto cleanup; -- 2.26.2

On Thu, Dec 24, 2020 at 08:14:37AM -0600, Jonathon Jongsma wrote:
Now that we can filter active and inactive node devices in virConnectListAllNodeDevices(), add these switches to the virsh command.
Eventual output (once everything is hooked up):
virsh # nodedev-list --inactive --cap mdev mdev_07d8b8b0_7e04_4c0f_97ed_9214ce12723c mdev_927c040f_ae7d_4a35_966e_286ba6ebbe1c
virsh # nodedev-list --active --cap mdev mdev_bd2ea955_3402_4252_8c17_7468083a0f26
This patch doesn't introduce --active (which is correct), so both the subject and the commit message need to be adjusted.
+ if (all && inactive) { + vshError(ctl, "%s", _("Option --all is incompatible with --inactive")); + return false; + } + + if (tree && (cap_str || inactive)) {
we should also check for 'all', since --tree is exclusive with everything else. Reviewed-by: Erik Skultety <eskultet@redhat.com>

Add a virsh command that maps to virNodeDeviceDefineXML(). Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- tools/virsh-nodedev.c | 58 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 58 insertions(+) diff --git a/tools/virsh-nodedev.c b/tools/virsh-nodedev.c index e3261747e3..07d48bbfbe 100644 --- a/tools/virsh-nodedev.c +++ b/tools/virsh-nodedev.c @@ -1014,6 +1014,58 @@ cmdNodeDeviceEvent(vshControl *ctl, const vshCmd *cmd) } +/* + * "nodedev-define" command + */ +static const vshCmdInfo info_node_device_define[] = { + {.name = "help", + .data = N_("Define a device by an xml file on a node") + }, + {.name = "desc", + .data = N_("Defines a persistent device on the node that can be " + "assigned to a domain. The device must be started before " + "it can be assigned to a domain.") + }, + {.name = NULL} +}; + +static const vshCmdOptDef opts_node_device_define[] = { + VIRSH_COMMON_OPT_FILE(N_("file containing an XML description " + "of the device")), + {.name = NULL} +}; + +static bool +cmdNodeDeviceDefine(vshControl *ctl, const vshCmd *cmd G_GNUC_UNUSED) +{ + virNodeDevicePtr dev = NULL; + const char *from = NULL; + bool ret = true; + char *buffer; + virshControlPtr priv = ctl->privData; + + if (vshCommandOptStringReq(ctl, cmd, "file", &from) < 0) + return false; + + if (virFileReadAll(from, VSH_MAX_XML_FILE, &buffer) < 0) + return false; + + dev = virNodeDeviceDefineXML(priv->conn, buffer, 0); + VIR_FREE(buffer); + + if (dev != NULL) { + vshPrintExtra(ctl, _("Node device %s defined from %s\n"), + virNodeDeviceGetName(dev), from); + virNodeDeviceFree(dev); + } else { + vshError(ctl, _("Failed to define node device from %s"), from); + ret = false; + } + + return ret; +} + + const vshCmdDef nodedevCmds[] = { {.name = "nodedev-create", .handler = cmdNodeDeviceCreate, @@ -1067,5 +1119,11 @@ const vshCmdDef nodedevCmds[] = { .info = info_node_device_event, .flags = 0 }, + {.name = "nodedev-define", + .handler = cmdNodeDeviceDefine, + .opts = opts_node_device_define, + .info = info_node_device_define, + .flags = 0 + }, {.name = NULL} }; -- 2.26.2

On Thu, Dec 24, 2020 at 08:14:38AM -0600, Jonathon Jongsma wrote:
Add a virsh command that maps to virNodeDeviceDefineXML().
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- tools/virsh-nodedev.c | 58 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 58 insertions(+)
diff --git a/tools/virsh-nodedev.c b/tools/virsh-nodedev.c index e3261747e3..07d48bbfbe 100644 --- a/tools/virsh-nodedev.c +++ b/tools/virsh-nodedev.c @@ -1014,6 +1014,58 @@ cmdNodeDeviceEvent(vshControl *ctl, const vshCmd *cmd) }
+/* + * "nodedev-define" command + */ +static const vshCmdInfo info_node_device_define[] = { + {.name = "help", + .data = N_("Define a device by an xml file on a node") + }, + {.name = "desc", + .data = N_("Defines a persistent device on the node that can be " + "assigned to a domain. The device must be started before " + "it can be assigned to a domain.") + }, + {.name = NULL} +}; + +static const vshCmdOptDef opts_node_device_define[] = { + VIRSH_COMMON_OPT_FILE(N_("file containing an XML description " + "of the device")), + {.name = NULL} +}; + +static bool +cmdNodeDeviceDefine(vshControl *ctl, const vshCmd *cmd G_GNUC_UNUSED) +{ + virNodeDevicePtr dev = NULL; + const char *from = NULL; + bool ret = true; + char *buffer; + virshControlPtr priv = ctl->privData; + + if (vshCommandOptStringReq(ctl, cmd, "file", &from) < 0) + return false; + + if (virFileReadAll(from, VSH_MAX_XML_FILE, &buffer) < 0) + return false; + + dev = virNodeDeviceDefineXML(priv->conn, buffer, 0); + VIR_FREE(buffer); + + if (dev != NULL) { + vshPrintExtra(ctl, _("Node device %s defined from %s\n"),
We should adopt a new style guideline and enclose any %s that may denote a name potentially containing spaces in '' (multiple occurrences). Reviewed-by: Erik Skultety <eskultet@redhat.com>

mdevctl 'stop' and 'undefine' commands take the same uuid parameter, so refactor the test infrastructure to share common implementation for both of these commands. The 'undefine' command will be introduced in a following patch. Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- tests/nodedevmdevctltest.c | 48 +++++++++++++++++++++++++++++++------- 1 file changed, 40 insertions(+), 8 deletions(-) diff --git a/tests/nodedevmdevctltest.c b/tests/nodedevmdevctltest.c index 981276d180..56265cfaea 100644 --- a/tests/nodedevmdevctltest.c +++ b/tests/nodedevmdevctltest.c @@ -12,7 +12,9 @@ typedef enum { MDEVCTL_CMD_START, + MDEVCTL_CMD_STOP, MDEVCTL_CMD_DEFINE, + MDEVCTL_CMD_UNDEFINE } MdevctlCmd; struct startTestInfo { @@ -134,19 +136,21 @@ testMdevctlStartHelper(const void *data) mdevxml, cmdlinefile, jsonfile); } +typedef virCommandPtr (*GetStopUndefineCmdFunc)(const char *uuid); +struct UuidCommandTestInfo { + const char *uuid; + MdevctlCmd command; +}; + static int -testMdevctlStop(const void *data) +testMdevctlUuidCommand(const char *uuid, GetStopUndefineCmdFunc func, const char *outfile) { - const char *uuid = data; g_auto(virBuffer) buf = VIR_BUFFER_INITIALIZER; const char *actualCmdline = NULL; int ret = -1; g_autoptr(virCommand) cmd = NULL; - g_autofree char *cmdlinefile = - g_strdup_printf("%s/nodedevmdevctldata/mdevctl-stop.argv", - abs_srcdir); - cmd = nodeDeviceGetMdevctlStopCommand(uuid); + cmd = func(uuid); if (!cmd) goto cleanup; @@ -158,7 +162,7 @@ testMdevctlStop(const void *data) if (!(actualCmdline = virBufferCurrentContent(&buf))) goto cleanup; - if (nodedevCompareToFile(actualCmdline, cmdlinefile) < 0) + if (nodedevCompareToFile(actualCmdline, outfile) < 0) goto cleanup; ret = 0; @@ -168,6 +172,27 @@ testMdevctlStop(const void *data) return ret; } +static int +testMdevctlUuidCommandHelper(const void *data) +{ + const struct UuidCommandTestInfo *info = data; + GetStopUndefineCmdFunc func; + const char *cmd; + g_autofree char *cmdlinefile = NULL; + + if (info->command == MDEVCTL_CMD_STOP) { + cmd = "stop"; + func = nodeDeviceGetMdevctlStopCommand; + } else { + return -1; + } + + cmdlinefile = g_strdup_printf("%s/nodedevmdevctldata/mdevctl-%s.argv", + abs_srcdir, cmd); + + return testMdevctlUuidCommand(info->uuid, func, cmdlinefile); +} + static int testMdevctlListDefined(const void *data G_GNUC_UNUSED) { @@ -385,8 +410,15 @@ mymain(void) #define DO_TEST_DEFINE(filename) \ DO_TEST_START_FULL("mdevctl define " filename, "QEMU", CREATE_DEVICE, filename, MDEVCTL_CMD_DEFINE) +#define DO_TEST_UUID_COMMAND_FULL(desc, uuid, command) \ + do { \ + struct UuidCommandTestInfo info = { uuid, command }; \ + DO_TEST_FULL(desc, testMdevctlUuidCommandHelper, &info); \ + } \ + while (0) + #define DO_TEST_STOP(uuid) \ - DO_TEST_FULL("mdevctl stop " uuid, testMdevctlStop, uuid) + DO_TEST_UUID_COMMAND_FULL("mdevctl stop " uuid, uuid, MDEVCTL_CMD_STOP) #define DO_TEST_LIST_DEFINED() \ DO_TEST_FULL("mdevctl list --defined", testMdevctlListDefined, NULL) -- 2.26.2

This interface allows you to undefine a persistently defined (but inactive) mediated devices. It is implemented via 'mdevctl' Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- include/libvirt/libvirt-nodedev.h | 2 + src/access/viraccessperm.c | 2 +- src/access/viraccessperm.h | 6 +++ src/driver-nodedev.h | 4 ++ src/libvirt-nodedev.c | 36 +++++++++++++++ src/libvirt_public.syms | 1 + src/node_device/node_device_driver.c | 67 ++++++++++++++++++++++++++++ src/node_device/node_device_driver.h | 6 +++ src/node_device/node_device_udev.c | 1 + src/remote/remote_driver.c | 1 + src/remote/remote_protocol.x | 14 +++++- src/remote_protocol-structs | 4 ++ tests/nodedevmdevctltest.c | 8 ++++ 13 files changed, 150 insertions(+), 2 deletions(-) diff --git a/include/libvirt/libvirt-nodedev.h b/include/libvirt/libvirt-nodedev.h index 3e57e9522a..d9dac97147 100644 --- a/include/libvirt/libvirt-nodedev.h +++ b/include/libvirt/libvirt-nodedev.h @@ -136,6 +136,8 @@ virNodeDevicePtr virNodeDeviceDefineXML(virConnectPtr conn, const char *xmlDesc, unsigned int flags); +int virNodeDeviceUndefine(virNodeDevicePtr dev); + /** * VIR_NODE_DEVICE_EVENT_CALLBACK: * diff --git a/src/access/viraccessperm.c b/src/access/viraccessperm.c index 33db7752b6..d4a0c98b9b 100644 --- a/src/access/viraccessperm.c +++ b/src/access/viraccessperm.c @@ -70,7 +70,7 @@ VIR_ENUM_IMPL(virAccessPermNodeDevice, VIR_ACCESS_PERM_NODE_DEVICE_LAST, "getattr", "read", "write", "start", "stop", - "detach", + "detach", "delete", ); VIR_ENUM_IMPL(virAccessPermNWFilter, diff --git a/src/access/viraccessperm.h b/src/access/viraccessperm.h index 42996b9741..051246a7b6 100644 --- a/src/access/viraccessperm.h +++ b/src/access/viraccessperm.h @@ -500,6 +500,12 @@ typedef enum { */ VIR_ACCESS_PERM_NODE_DEVICE_DETACH, + /** + * @desc: Delete node device + * @message: Deleting node device driver requires authorization + */ + VIR_ACCESS_PERM_NODE_DEVICE_DELETE, + VIR_ACCESS_PERM_NODE_DEVICE_LAST } virAccessPermNodeDevice; diff --git a/src/driver-nodedev.h b/src/driver-nodedev.h index 16d787f3fc..e18029ab48 100644 --- a/src/driver-nodedev.h +++ b/src/driver-nodedev.h @@ -79,6 +79,9 @@ typedef virNodeDevicePtr const char *xmlDesc, unsigned int flags); +typedef int +(*virDrvNodeDeviceUndefine)(virNodeDevicePtr dev); + typedef int (*virDrvConnectNodeDeviceEventRegisterAny)(virConnectPtr conn, virNodeDevicePtr dev, @@ -119,4 +122,5 @@ struct _virNodeDeviceDriver { virDrvNodeDeviceCreateXML nodeDeviceCreateXML; virDrvNodeDeviceDestroy nodeDeviceDestroy; virDrvNodeDeviceDefineXML nodeDeviceDefineXML; + virDrvNodeDeviceUndefine nodeDeviceUndefine; }; diff --git a/src/libvirt-nodedev.c b/src/libvirt-nodedev.c index 15eb70218a..618778517b 100644 --- a/src/libvirt-nodedev.c +++ b/src/libvirt-nodedev.c @@ -807,6 +807,42 @@ virNodeDeviceDefineXML(virConnectPtr conn, } +/** + * virNodeDeviceUndefine: + * @dev: a device object + * + * Undefine the device object. The virtual device is removed from the host + * operating system. This function may require privileged access. + * + * Returns 0 in case of success and -1 in case of failure. + */ +int +virNodeDeviceUndefine(virNodeDevicePtr dev) +{ + VIR_DEBUG("dev=%p", dev); + + virResetLastError(); + + virCheckNodeDeviceReturn(dev, -1); + virCheckReadOnlyGoto(dev->conn->flags, error); + + if (dev->conn->nodeDeviceDriver && + dev->conn->nodeDeviceDriver->nodeDeviceUndefine) { + int retval = dev->conn->nodeDeviceDriver->nodeDeviceUndefine(dev); + if (retval < 0) + goto error; + + return 0; + } + + virReportUnsupportedError(); + + error: + virDispatchError(dev->conn); + return -1; +} + + /** * virConnectNodeDeviceEventRegisterAny: * @conn: pointer to the connection diff --git a/src/libvirt_public.syms b/src/libvirt_public.syms index 8fae4352ff..16022a74bf 100644 --- a/src/libvirt_public.syms +++ b/src/libvirt_public.syms @@ -878,6 +878,7 @@ LIBVIRT_6.10.0 { virDomainAuthorizedSSHKeysGet; virDomainAuthorizedSSHKeysSet; virNodeDeviceDefineXML; + virNodeDeviceUndefine; } LIBVIRT_6.0.0; # .... define new API here using predicted next version number .... diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index 4fbe8743b4..e38aef0656 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -878,6 +878,17 @@ nodeDeviceGetMdevctlStopCommand(const char *uuid) } +virCommandPtr +nodeDeviceGetMdevctlUndefineCommand(const char *uuid) +{ + return virCommandNewArgList(MDEVCTL, + "undefine", + "-u", + uuid, + NULL); + +} + static int virMdevctlStop(virNodeDeviceDefPtr def) { @@ -893,6 +904,21 @@ virMdevctlStop(virNodeDeviceDefPtr def) } +static int +virMdevctlUndefine(virNodeDeviceDefPtr def) +{ + int status; + g_autoptr(virCommand) cmd = NULL; + + cmd = nodeDeviceGetMdevctlUndefineCommand(def->caps->data.mdev.uuid); + + if (virCommandRun(cmd, &status) < 0 || status != 0) + return -1; + + return 0; +} + + virCommandPtr nodeDeviceGetMdevctlListCommand(bool defined, char **output) @@ -1157,6 +1183,47 @@ nodeDeviceDefineXML(virConnectPtr conn, } +int +nodeDeviceUndefine(virNodeDevicePtr device) +{ + int ret = -1; + virNodeDeviceObjPtr obj = NULL; + virNodeDeviceDefPtr def; + + if (nodeDeviceWaitInit() < 0) + return -1; + + if (!(obj = nodeDeviceObjFindByName(device->name))) + return -1; + def = virNodeDeviceObjGetDef(obj); + + if (virNodeDeviceUndefineEnsureACL(device->conn, def) < 0) + goto cleanup; + + if (virNodeDeviceObjIsActive(obj)) { + virReportError(VIR_ERR_OPERATION_INVALID, + _("node device '%s' is still active"), + def->name); + goto cleanup; + } + + if (nodeDeviceHasCapability(def, VIR_NODE_DEV_CAP_MDEV)) { + if (virMdevctlUndefine(def) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Unable to undefine mediated device")); + goto cleanup; + } + ret = 0; + } else { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("Unsupported device type")); + } + + cleanup: + virNodeDeviceObjEndAPI(&obj); + return ret; +} + int nodeConnectNodeDeviceEventRegisterAny(virConnectPtr conn, diff --git a/src/node_device/node_device_driver.h b/src/node_device/node_device_driver.h index e58bb0f124..e06d62cee6 100644 --- a/src/node_device/node_device_driver.h +++ b/src/node_device/node_device_driver.h @@ -107,6 +107,9 @@ nodeDeviceDefineXML(virConnectPtr conn, const char *xmlDesc, unsigned int flags); +int +nodeDeviceUndefine(virNodeDevicePtr dev); + int nodeConnectNodeDeviceEventRegisterAny(virConnectPtr conn, virNodeDevicePtr dev, @@ -129,6 +132,9 @@ nodeDeviceGetMdevctlDefineCommand(virNodeDeviceDefPtr def, virCommandPtr nodeDeviceGetMdevctlStopCommand(const char *uuid); +virCommandPtr +nodeDeviceGetMdevctlUndefineCommand(const char *uuid); + virCommandPtr nodeDeviceGetMdevctlListCommand(bool defined, char **output); diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index 2532c3189e..d7521a9c18 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -2339,6 +2339,7 @@ static virNodeDeviceDriver udevNodeDeviceDriver = { .nodeDeviceCreateXML = nodeDeviceCreateXML, /* 0.7.3 */ .nodeDeviceDestroy = nodeDeviceDestroy, /* 0.7.3 */ .nodeDeviceDefineXML = nodeDeviceDefineXML, /* 6.5.0 */ + .nodeDeviceUndefine = nodeDeviceUndefine, /* 6.5.0 */ }; diff --git a/src/remote/remote_driver.c b/src/remote/remote_driver.c index f826b6f73e..de569b74e5 100644 --- a/src/remote/remote_driver.c +++ b/src/remote/remote_driver.c @@ -8666,6 +8666,7 @@ static virNodeDeviceDriver node_device_driver = { .nodeDeviceListCaps = remoteNodeDeviceListCaps, /* 0.5.0 */ .nodeDeviceCreateXML = remoteNodeDeviceCreateXML, /* 0.6.3 */ .nodeDeviceDefineXML = remoteNodeDeviceDefineXML, /* 6.5.0 */ + .nodeDeviceUndefine = remoteNodeDeviceUndefine, /* 6.5.0 */ .nodeDeviceDestroy = remoteNodeDeviceDestroy /* 0.6.3 */ }; diff --git a/src/remote/remote_protocol.x b/src/remote/remote_protocol.x index d2250502b4..7fee7539f0 100644 --- a/src/remote/remote_protocol.x +++ b/src/remote/remote_protocol.x @@ -2151,6 +2151,10 @@ struct remote_node_device_define_xml_ret { remote_nonnull_node_device dev; }; +struct remote_node_device_undefine_args { + remote_nonnull_string name; +}; + /* * Events Register/Deregister: @@ -6729,5 +6733,13 @@ enum remote_procedure { * @generate: both * @acl: node_device:write */ - REMOTE_PROC_NODE_DEVICE_DEFINE_XML = 426 + REMOTE_PROC_NODE_DEVICE_DEFINE_XML = 426, + + /** + * @generate: both + * @priority: high + * @acl: node_device:delete + */ + REMOTE_PROC_NODE_DEVICE_UNDEFINE = 427 + }; diff --git a/src/remote_protocol-structs b/src/remote_protocol-structs index 565c1673f1..6d136bdb66 100644 --- a/src/remote_protocol-structs +++ b/src/remote_protocol-structs @@ -1607,6 +1607,9 @@ struct remote_node_device_define_xml_args { struct remote_node_device_define_xml_ret { remote_nonnull_node_device dev; }; +struct remote_node_device_undefine_args { + remote_nonnull_string name; +}; struct remote_connect_domain_event_register_ret { int cb_registered; }; @@ -3596,4 +3599,5 @@ enum remote_procedure { REMOTE_PROC_DOMAIN_AUTHORIZED_SSH_KEYS_GET = 424, REMOTE_PROC_DOMAIN_AUTHORIZED_SSH_KEYS_SET = 425, REMOTE_PROC_NODE_DEVICE_DEFINE_XML = 426, + REMOTE_PROC_NODE_DEVICE_UNDEFINE = 427, }; diff --git a/tests/nodedevmdevctltest.c b/tests/nodedevmdevctltest.c index 56265cfaea..753cfd273e 100644 --- a/tests/nodedevmdevctltest.c +++ b/tests/nodedevmdevctltest.c @@ -183,6 +183,9 @@ testMdevctlUuidCommandHelper(const void *data) if (info->command == MDEVCTL_CMD_STOP) { cmd = "stop"; func = nodeDeviceGetMdevctlStopCommand; + } else if (info->command == MDEVCTL_CMD_UNDEFINE) { + cmd = "undefine"; + func = nodeDeviceGetMdevctlUndefineCommand; } else { return -1; } @@ -420,6 +423,9 @@ mymain(void) #define DO_TEST_STOP(uuid) \ DO_TEST_UUID_COMMAND_FULL("mdevctl stop " uuid, uuid, MDEVCTL_CMD_STOP) +#define DO_TEST_UNDEFINE(uuid) \ + DO_TEST_UUID_COMMAND_FULL("mdevctl undefine " uuid, uuid, MDEVCTL_CMD_UNDEFINE) + #define DO_TEST_LIST_DEFINED() \ DO_TEST_FULL("mdevctl list --defined", testMdevctlListDefined, NULL) @@ -442,6 +448,8 @@ mymain(void) DO_TEST_DEFINE("mdev_fedc4916_1ca8_49ac_b176_871d16c13076"); DO_TEST_DEFINE("mdev_d2441d39_495e_4243_ad9f_beb3f14c23d9"); + DO_TEST_UNDEFINE("d76a6b78-45ed-4149-a325-005f9abc5281"); + done: nodedevTestDriverFree(driver); -- 2.26.2

On Thu, Dec 24, 2020 at 08:14:40AM -0600, Jonathon Jongsma wrote:
This interface allows you to undefine a persistently defined (but inactive) mediated devices. It is implemented via 'mdevctl'
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> ---
This patch doesn't pass tests, because tests/nodedevmdevctldata/mdevctl-undefine.argv is missing. I'll post the review in a separate reply. Erik

Several functions accept providing a node device by name or by wwnn,wwpn pair. Extract the logic to do this into a function that can be used by both callers. Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- tools/virsh-nodedev.c | 58 +++++++++++++++++++++---------------------- 1 file changed, 28 insertions(+), 30 deletions(-) diff --git a/tools/virsh-nodedev.c b/tools/virsh-nodedev.c index 07d48bbfbe..5df75013de 100644 --- a/tools/virsh-nodedev.c +++ b/tools/virsh-nodedev.c @@ -111,23 +111,18 @@ static const vshCmdOptDef opts_node_device_destroy[] = { {.name = NULL} }; -static bool -cmdNodeDeviceDestroy(vshControl *ctl, const vshCmd *cmd) +static virNodeDevicePtr +vshFindNodeDevice(vshControl *ctl, const char *value) { virNodeDevicePtr dev = NULL; - bool ret = false; - const char *device_value = NULL; char **arr = NULL; int narr; virshControlPtr priv = ctl->privData; - if (vshCommandOptStringReq(ctl, cmd, "device", &device_value) < 0) - return false; - - if (strchr(device_value, ',')) { - narr = vshStringToArray(device_value, &arr); + if (strchr(value, ',')) { + narr = vshStringToArray(value, &arr); if (narr != 2) { - vshError(ctl, _("Malformed device value '%s'"), device_value); + vshError(ctl, _("Malformed device value '%s'"), value); goto cleanup; } @@ -136,9 +131,30 @@ cmdNodeDeviceDestroy(vshControl *ctl, const vshCmd *cmd) dev = virNodeDeviceLookupSCSIHostByWWN(priv->conn, arr[0], arr[1], 0); } else { - dev = virNodeDeviceLookupByName(priv->conn, device_value); + dev = virNodeDeviceLookupByName(priv->conn, value); } + if (!dev) { + vshError(ctl, "%s '%s'", _("Could not find matching device"), value); + goto cleanup; + } + + cleanup: + g_strfreev(arr); + return dev; +} + +static bool +cmdNodeDeviceDestroy(vshControl *ctl, const vshCmd *cmd) +{ + virNodeDevicePtr dev = NULL; + bool ret = false; + const char *device_value = NULL; + + if (vshCommandOptStringReq(ctl, cmd, "device", &device_value) < 0) + return false; + + dev = vshFindNodeDevice(ctl, device_value); if (!dev) { vshError(ctl, "%s '%s'", _("Could not find matching device"), device_value); goto cleanup; @@ -153,7 +169,6 @@ cmdNodeDeviceDestroy(vshControl *ctl, const vshCmd *cmd) ret = true; cleanup: - g_strfreev(arr); if (dev) virNodeDeviceFree(dev); return ret; @@ -579,28 +594,12 @@ cmdNodeDeviceDumpXML(vshControl *ctl, const vshCmd *cmd) virNodeDevicePtr device = NULL; char *xml = NULL; const char *device_value = NULL; - char **arr = NULL; - int narr; bool ret = false; - virshControlPtr priv = ctl->privData; if (vshCommandOptStringReq(ctl, cmd, "device", &device_value) < 0) return false; - if (strchr(device_value, ',')) { - narr = vshStringToArray(device_value, &arr); - if (narr != 2) { - vshError(ctl, _("Malformed device value '%s'"), device_value); - goto cleanup; - } - - if (!virValidateWWN(arr[0]) || !virValidateWWN(arr[1])) - goto cleanup; - - device = virNodeDeviceLookupSCSIHostByWWN(priv->conn, arr[0], arr[1], 0); - } else { - device = virNodeDeviceLookupByName(priv->conn, device_value); - } + device = vshFindNodeDevice(ctl, device_value); if (!device) { vshError(ctl, "%s '%s'", _("Could not find matching device"), device_value); @@ -614,7 +613,6 @@ cmdNodeDeviceDumpXML(vshControl *ctl, const vshCmd *cmd) ret = true; cleanup: - g_strfreev(arr); VIR_FREE(xml); if (device) virNodeDeviceFree(device); -- 2.26.2

On Thu, Dec 24, 2020 at 08:14:41AM -0600, Jonathon Jongsma wrote:
Several functions accept providing a node device by name or by wwnn,wwpn pair. Extract the logic to do this into a function that can be used by both callers.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- Reviewed-by: Erik Skultety <eskultet@redhat.com>

Add a virsh command that maps to virNodeDeviceUndefine(). Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- tools/virsh-nodedev.c | 65 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 65 insertions(+) diff --git a/tools/virsh-nodedev.c b/tools/virsh-nodedev.c index 5df75013de..97d2a34056 100644 --- a/tools/virsh-nodedev.c +++ b/tools/virsh-nodedev.c @@ -1012,6 +1012,65 @@ cmdNodeDeviceEvent(vshControl *ctl, const vshCmd *cmd) } +/* + * "nodedev-undefine" command + */ +static const vshCmdInfo info_node_device_undefine[] = { + {.name = "help", + .data = N_("Undefine an inactive node device") + }, + {.name = "desc", + .data = N_("Undefines the configuration for an inactive node device") + }, + {.name = NULL} +}; + +static const vshCmdOptDef opts_node_device_undefine[] = { + {.name = "name", + .type = VSH_OT_ALIAS, + .help = "device" + }, + {.name = "device", + .type = VSH_OT_DATA, + .flags = VSH_OFLAG_REQ, + .help = N_("device name or wwn pair in 'wwnn,wwpn' format"), + .completer = virshNodeDeviceNameCompleter, + }, + {.name = NULL} +}; + +static bool +cmdNodeDeviceUndefine(vshControl *ctl, const vshCmd *cmd G_GNUC_UNUSED) +{ + virNodeDevicePtr dev = NULL; + bool ret = false; + const char *device_value = NULL; + + if (vshCommandOptStringReq(ctl, cmd, "device", &device_value) < 0) + return false; + + dev = vshFindNodeDevice(ctl, device_value); + + if (!dev) { + vshError(ctl, "%s '%s'", _("Could not find matching device"), device_value); + goto cleanup; + } + + if (virNodeDeviceUndefine(dev) == 0) { + vshPrintExtra(ctl, _("Undefined node device '%s'\n"), device_value); + } else { + vshError(ctl, _("Failed to undefine node device '%s'"), device_value); + goto cleanup; + } + + ret = true; + cleanup: + if (dev) + virNodeDeviceFree(dev); + return ret; +} + + /* * "nodedev-define" command */ @@ -1123,5 +1182,11 @@ const vshCmdDef nodedevCmds[] = { .info = info_node_device_define, .flags = 0 }, + {.name = "nodedev-undefine", + .handler = cmdNodeDeviceUndefine, + .opts = opts_node_device_undefine, + .info = info_node_device_undefine, + .flags = 0 + }, {.name = NULL} }; -- 2.26.2

On Thu, Dec 24, 2020 at 08:14:42AM -0600, Jonathon Jongsma wrote:
Add a virsh command that maps to virNodeDeviceUndefine().
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- tools/virsh-nodedev.c | 65 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 65 insertions(+)
diff --git a/tools/virsh-nodedev.c b/tools/virsh-nodedev.c index 5df75013de..97d2a34056 100644 --- a/tools/virsh-nodedev.c +++ b/tools/virsh-nodedev.c @@ -1012,6 +1012,65 @@ cmdNodeDeviceEvent(vshControl *ctl, const vshCmd *cmd) }
+/* + * "nodedev-undefine" command + */ +static const vshCmdInfo info_node_device_undefine[] = { + {.name = "help", + .data = N_("Undefine an inactive node device") + }, + {.name = "desc", + .data = N_("Undefines the configuration for an inactive node device") + }, + {.name = NULL} +}; + +static const vshCmdOptDef opts_node_device_undefine[] = { + {.name = "name", + .type = VSH_OT_ALIAS, + .help = "device"
^This is a new command, so there should not introduce any alias. We tend to add aliases when at some point we realize the design was a mistake and we need the virsh argument names, e.g. for virt-admin commands Reviewed-by: Erik Skultety <eskultet@redhat.com>

This new API function provides a way to start a persistently-defined mediate device that was defined by virNodeDeviceDefineXML() (or one that was defined externally via mdevctl) Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- include/libvirt/libvirt-nodedev.h | 2 + src/driver-nodedev.h | 4 ++ src/libvirt-nodedev.c | 35 +++++++++++ src/libvirt_public.syms | 1 + src/node_device/node_device_driver.c | 63 ++++++++++++++++++++ src/node_device/node_device_driver.h | 6 ++ src/node_device/node_device_udev.c | 1 + src/remote/remote_driver.c | 1 + src/remote/remote_protocol.x | 13 +++- src/remote_protocol-structs | 4 ++ tests/nodedevmdevctldata/mdevctl-create.argv | 1 + tests/nodedevmdevctltest.c | 11 +++- 12 files changed, 140 insertions(+), 2 deletions(-) create mode 100644 tests/nodedevmdevctldata/mdevctl-create.argv diff --git a/include/libvirt/libvirt-nodedev.h b/include/libvirt/libvirt-nodedev.h index d9dac97147..71626bd837 100644 --- a/include/libvirt/libvirt-nodedev.h +++ b/include/libvirt/libvirt-nodedev.h @@ -138,6 +138,8 @@ virNodeDevicePtr virNodeDeviceDefineXML(virConnectPtr conn, int virNodeDeviceUndefine(virNodeDevicePtr dev); +int virNodeDeviceCreate(virNodeDevicePtr dev); + /** * VIR_NODE_DEVICE_EVENT_CALLBACK: * diff --git a/src/driver-nodedev.h b/src/driver-nodedev.h index e18029ab48..d00dd5845c 100644 --- a/src/driver-nodedev.h +++ b/src/driver-nodedev.h @@ -82,6 +82,9 @@ typedef virNodeDevicePtr typedef int (*virDrvNodeDeviceUndefine)(virNodeDevicePtr dev); +typedef int +(*virDrvNodeDeviceCreate)(virNodeDevicePtr def); + typedef int (*virDrvConnectNodeDeviceEventRegisterAny)(virConnectPtr conn, virNodeDevicePtr dev, @@ -123,4 +126,5 @@ struct _virNodeDeviceDriver { virDrvNodeDeviceDestroy nodeDeviceDestroy; virDrvNodeDeviceDefineXML nodeDeviceDefineXML; virDrvNodeDeviceUndefine nodeDeviceUndefine; + virDrvNodeDeviceCreate nodeDeviceCreate; }; diff --git a/src/libvirt-nodedev.c b/src/libvirt-nodedev.c index 618778517b..1d13466fc1 100644 --- a/src/libvirt-nodedev.c +++ b/src/libvirt-nodedev.c @@ -843,6 +843,41 @@ virNodeDeviceUndefine(virNodeDevicePtr dev) } +/** + * virNodeDeviceCreate: + * @dev: a device object + * + * Start a defined node device: + * + * Returns 0 in case of success and -1 in case of failure. + */ +int +virNodeDeviceCreate(virNodeDevicePtr dev) +{ + VIR_DEBUG("dev=%p", dev); + + virResetLastError(); + + virCheckNodeDeviceReturn(dev, -1); + virCheckReadOnlyGoto(dev->conn->flags, error); + + if (dev->conn->nodeDeviceDriver && + dev->conn->nodeDeviceDriver->nodeDeviceCreate) { + int retval = dev->conn->nodeDeviceDriver->nodeDeviceCreate(dev); + if (retval < 0) + goto error; + + return 0; + } + + virReportUnsupportedError(); + + error: + virDispatchError(dev->conn); + return -1; +} + + /** * virConnectNodeDeviceEventRegisterAny: * @conn: pointer to the connection diff --git a/src/libvirt_public.syms b/src/libvirt_public.syms index 16022a74bf..7c812ebb6e 100644 --- a/src/libvirt_public.syms +++ b/src/libvirt_public.syms @@ -879,6 +879,7 @@ LIBVIRT_6.10.0 { virDomainAuthorizedSSHKeysSet; virNodeDeviceDefineXML; virNodeDeviceUndefine; + virNodeDeviceCreate; } LIBVIRT_6.0.0; # .... define new API here using predicted next version number .... diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index e38aef0656..dc41e937cd 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -889,6 +889,17 @@ nodeDeviceGetMdevctlUndefineCommand(const char *uuid) } +virCommandPtr +nodeDeviceGetMdevctlCreateCommand(const char *uuid) +{ + return virCommandNewArgList(MDEVCTL, + "start", + "-u", + uuid, + NULL); + +} + static int virMdevctlStop(virNodeDeviceDefPtr def) { @@ -919,6 +930,21 @@ virMdevctlUndefine(virNodeDeviceDefPtr def) } +static int +virMdevctlCreate(virNodeDeviceDefPtr def) +{ + int status; + g_autoptr(virCommand) cmd = NULL; + + cmd = nodeDeviceGetMdevctlCreateCommand(def->caps->data.mdev.uuid); + + if (virCommandRun(cmd, &status) < 0 || status != 0) + return -1; + + return 0; +} + + virCommandPtr nodeDeviceGetMdevctlListCommand(bool defined, char **output) @@ -1225,6 +1251,43 @@ nodeDeviceUndefine(virNodeDevicePtr device) } +int nodeDeviceCreate(virNodeDevicePtr device) +{ + int ret = -1; + virNodeDeviceObjPtr obj = NULL; + virNodeDeviceDefPtr def; + + if (!(obj = nodeDeviceObjFindByName(device->name))) + return -1; + + if (virNodeDeviceObjIsActive(obj)) { + virReportError(VIR_ERR_OPERATION_INVALID, "%s", + _("Device is already active")); + goto cleanup; + } + def = virNodeDeviceObjGetDef(obj); + + if (virNodeDeviceCreateEnsureACL(device->conn, def) < 0) + goto cleanup; + + if (nodeDeviceHasCapability(def, VIR_NODE_DEV_CAP_MDEV)) { + if (virMdevctlCreate(def) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Unable to create mediated device")); + goto cleanup; + } + ret = 0; + } else { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("Unsupported device type")); + } + + cleanup: + virNodeDeviceObjEndAPI(&obj); + return ret; +} + + int nodeConnectNodeDeviceEventRegisterAny(virConnectPtr conn, virNodeDevicePtr device, diff --git a/src/node_device/node_device_driver.h b/src/node_device/node_device_driver.h index e06d62cee6..58a5958f02 100644 --- a/src/node_device/node_device_driver.h +++ b/src/node_device/node_device_driver.h @@ -150,3 +150,9 @@ nodeDeviceGenerateName(virNodeDeviceDefPtr def, const char *subsystem, const char *sysname, const char *s); + +virCommandPtr +nodeDeviceGetMdevctlCreateCommand(const char *uuid); + +int +nodeDeviceCreate(virNodeDevicePtr dev); diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index d7521a9c18..54c9bb5f2b 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -2340,6 +2340,7 @@ static virNodeDeviceDriver udevNodeDeviceDriver = { .nodeDeviceDestroy = nodeDeviceDestroy, /* 0.7.3 */ .nodeDeviceDefineXML = nodeDeviceDefineXML, /* 6.5.0 */ .nodeDeviceUndefine = nodeDeviceUndefine, /* 6.5.0 */ + .nodeDeviceCreate = nodeDeviceCreate, /* 6.5.0 */ }; diff --git a/src/remote/remote_driver.c b/src/remote/remote_driver.c index de569b74e5..a592fcb31d 100644 --- a/src/remote/remote_driver.c +++ b/src/remote/remote_driver.c @@ -8665,6 +8665,7 @@ static virNodeDeviceDriver node_device_driver = { .nodeDeviceNumOfCaps = remoteNodeDeviceNumOfCaps, /* 0.5.0 */ .nodeDeviceListCaps = remoteNodeDeviceListCaps, /* 0.5.0 */ .nodeDeviceCreateXML = remoteNodeDeviceCreateXML, /* 0.6.3 */ + .nodeDeviceCreate = remoteNodeDeviceCreate, /* 6.5.0 */ .nodeDeviceDefineXML = remoteNodeDeviceDefineXML, /* 6.5.0 */ .nodeDeviceUndefine = remoteNodeDeviceUndefine, /* 6.5.0 */ .nodeDeviceDestroy = remoteNodeDeviceDestroy /* 0.6.3 */ diff --git a/src/remote/remote_protocol.x b/src/remote/remote_protocol.x index 7fee7539f0..a7cf227e53 100644 --- a/src/remote/remote_protocol.x +++ b/src/remote/remote_protocol.x @@ -2155,6 +2155,10 @@ struct remote_node_device_undefine_args { remote_nonnull_string name; }; +struct remote_node_device_create_args { + remote_nonnull_string name; +}; + /* * Events Register/Deregister: @@ -6740,6 +6744,13 @@ enum remote_procedure { * @priority: high * @acl: node_device:delete */ - REMOTE_PROC_NODE_DEVICE_UNDEFINE = 427 + REMOTE_PROC_NODE_DEVICE_UNDEFINE = 427, + + /** + * @generate: both + * @priority: high + * @acl: node_device:start + */ + REMOTE_PROC_NODE_DEVICE_CREATE = 428 }; diff --git a/src/remote_protocol-structs b/src/remote_protocol-structs index 6d136bdb66..0f35efc5b3 100644 --- a/src/remote_protocol-structs +++ b/src/remote_protocol-structs @@ -1610,6 +1610,9 @@ struct remote_node_device_define_xml_ret { struct remote_node_device_undefine_args { remote_nonnull_string name; }; +struct remote_node_device_create_args { + remote_nonnull_string name; +}; struct remote_connect_domain_event_register_ret { int cb_registered; }; @@ -3600,4 +3603,5 @@ enum remote_procedure { REMOTE_PROC_DOMAIN_AUTHORIZED_SSH_KEYS_SET = 425, REMOTE_PROC_NODE_DEVICE_DEFINE_XML = 426, REMOTE_PROC_NODE_DEVICE_UNDEFINE = 427, + REMOTE_PROC_NODE_DEVICE_CREATE = 428, }; diff --git a/tests/nodedevmdevctldata/mdevctl-create.argv b/tests/nodedevmdevctldata/mdevctl-create.argv new file mode 100644 index 0000000000..802109340c --- /dev/null +++ b/tests/nodedevmdevctldata/mdevctl-create.argv @@ -0,0 +1 @@ +$MDEVCTL_BINARY$ start -u 8a05ad83-3472-497d-8631-8142f31460e8 diff --git a/tests/nodedevmdevctltest.c b/tests/nodedevmdevctltest.c index 753cfd273e..bc0b94f441 100644 --- a/tests/nodedevmdevctltest.c +++ b/tests/nodedevmdevctltest.c @@ -14,7 +14,8 @@ typedef enum { MDEVCTL_CMD_START, MDEVCTL_CMD_STOP, MDEVCTL_CMD_DEFINE, - MDEVCTL_CMD_UNDEFINE + MDEVCTL_CMD_UNDEFINE, + MDEVCTL_CMD_CREATE, } MdevctlCmd; struct startTestInfo { @@ -186,6 +187,9 @@ testMdevctlUuidCommandHelper(const void *data) } else if (info->command == MDEVCTL_CMD_UNDEFINE) { cmd = "undefine"; func = nodeDeviceGetMdevctlUndefineCommand; + }else if (info->command == MDEVCTL_CMD_CREATE) { + cmd = "create"; + func = nodeDeviceGetMdevctlCreateCommand; } else { return -1; } @@ -426,6 +430,9 @@ mymain(void) #define DO_TEST_UNDEFINE(uuid) \ DO_TEST_UUID_COMMAND_FULL("mdevctl undefine " uuid, uuid, MDEVCTL_CMD_UNDEFINE) +#define DO_TEST_CREATE(uuid) \ + DO_TEST_UUID_COMMAND_FULL("mdevctl create " uuid, uuid, MDEVCTL_CMD_CREATE) + #define DO_TEST_LIST_DEFINED() \ DO_TEST_FULL("mdevctl list --defined", testMdevctlListDefined, NULL) @@ -450,6 +457,8 @@ mymain(void) DO_TEST_UNDEFINE("d76a6b78-45ed-4149-a325-005f9abc5281"); + DO_TEST_CREATE("8a05ad83-3472-497d-8631-8142f31460e8"); + done: nodedevTestDriverFree(driver); -- 2.26.2

This virsh command maps to virNodeDeviceCreate(), which starts a node device that has been previously defined by virNodeDeviceDefineXML(). This is only supported for mediated devices. Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- tools/virsh-nodedev.c | 61 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 61 insertions(+) diff --git a/tools/virsh-nodedev.c b/tools/virsh-nodedev.c index 97d2a34056..861c3259f1 100644 --- a/tools/virsh-nodedev.c +++ b/tools/virsh-nodedev.c @@ -1123,6 +1123,61 @@ cmdNodeDeviceDefine(vshControl *ctl, const vshCmd *cmd G_GNUC_UNUSED) } +/* + * "nodedev-start" command + */ +static const vshCmdInfo info_node_device_start[] = { + {.name = "help", + .data = N_("Start an inactive node device") + }, + {.name = "desc", + .data = N_("Starts an inactive node device that was previously defined") + }, + {.name = NULL} +}; + +static const vshCmdOptDef opts_node_device_start[] = { + {.name = "name", + .type = VSH_OT_ALIAS, + .help = "device" + }, + {.name = "device", + .type = VSH_OT_DATA, + .flags = VSH_OFLAG_REQ, + .help = N_("device name"), + .completer = virshNodeDeviceNameCompleter, + }, + {.name = NULL} +}; + +static bool +cmdNodeDeviceStart(vshControl *ctl, const vshCmd *cmd) +{ + const char *name = NULL; + virNodeDevicePtr device; + bool ret = true; + virshControlPtr priv = ctl->privData; + + if (vshCommandOptStringReq(ctl, cmd, "device", &name) < 0) + return false; + + if (!(device = virNodeDeviceLookupByName(priv->conn, name))) { + vshError(ctl, _("Could not find matching device '%s'"), name); + return false; + } + + if (virNodeDeviceCreate(device) == 0) { + vshPrintExtra(ctl, _("Device %s started\n"), name); + } else { + vshError(ctl, _("Failed to start device %s"), name); + ret = false; + } + + virNodeDeviceFree(device); + return ret; +} + + const vshCmdDef nodedevCmds[] = { {.name = "nodedev-create", .handler = cmdNodeDeviceCreate, @@ -1188,5 +1243,11 @@ const vshCmdDef nodedevCmds[] = { .info = info_node_device_undefine, .flags = 0 }, + {.name = "nodedev-start", + .handler = cmdNodeDeviceStart, + .opts = opts_node_device_start, + .info = info_node_device_start, + .flags = 0 + }, {.name = NULL} }; -- 2.26.2

On Thu, Dec 24, 2020 at 08:14:44AM -0600, Jonathon Jongsma wrote:
This virsh command maps to virNodeDeviceCreate(), which starts a node device that has been previously defined by virNodeDeviceDefineXML(). This is only supported for mediated devices.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- tools/virsh-nodedev.c | 61 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 61 insertions(+)
diff --git a/tools/virsh-nodedev.c b/tools/virsh-nodedev.c index 97d2a34056..861c3259f1 100644 --- a/tools/virsh-nodedev.c +++ b/tools/virsh-nodedev.c @@ -1123,6 +1123,61 @@ cmdNodeDeviceDefine(vshControl *ctl, const vshCmd *cmd G_GNUC_UNUSED) }
+/* + * "nodedev-start" command + */ +static const vshCmdInfo info_node_device_start[] = { + {.name = "help", + .data = N_("Start an inactive node device") + }, + {.name = "desc", + .data = N_("Starts an inactive node device that was previously defined") + }, + {.name = NULL} +}; + +static const vshCmdOptDef opts_node_device_start[] = { + {.name = "name", + .type = VSH_OT_ALIAS, + .help = "device" + },
No alias here either. Reviewed-by: Erik Skultety <eskultet@redhat.com>

Most of the header files no longer use the space-padded function name and parameter alignment. Update the style to be more consistent with other headers. Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- include/libvirt/libvirt-nodedev.h | 79 ++++++++++++++++--------------- 1 file changed, 41 insertions(+), 38 deletions(-) diff --git a/include/libvirt/libvirt-nodedev.h b/include/libvirt/libvirt-nodedev.h index 71626bd837..663e8a9e5b 100644 --- a/include/libvirt/libvirt-nodedev.h +++ b/include/libvirt/libvirt-nodedev.h @@ -48,15 +48,14 @@ typedef struct _virNodeDevice virNodeDevice; typedef virNodeDevice *virNodeDevicePtr; -int virNodeNumOfDevices (virConnectPtr conn, - const char *cap, - unsigned int flags); - -int virNodeListDevices (virConnectPtr conn, - const char *cap, - char **const names, - int maxnames, - unsigned int flags); +int virNodeNumOfDevices(virConnectPtr conn, + const char *cap, + unsigned int flags); + +int virNodeListDevices(virConnectPtr conn, + const char *cap, + char **const names, + int maxnames, unsigned int flags); /* * virConnectListAllNodeDevices: * @@ -91,46 +90,50 @@ typedef enum { VIR_CONNECT_LIST_NODE_DEVICES_ACTIVE = 1 << 30, /* Active devices */ } virConnectListAllNodeDeviceFlags; -int virConnectListAllNodeDevices (virConnectPtr conn, - virNodeDevicePtr **devices, - unsigned int flags); +int virConnectListAllNodeDevices(virConnectPtr conn, + virNodeDevicePtr **devices, + unsigned int flags); -virNodeDevicePtr virNodeDeviceLookupByName (virConnectPtr conn, - const char *name); +virNodeDevicePtr virNodeDeviceLookupByName(virConnectPtr conn, + const char *name); -virNodeDevicePtr virNodeDeviceLookupSCSIHostByWWN (virConnectPtr conn, - const char *wwnn, - const char *wwpn, - unsigned int flags); +virNodeDevicePtr virNodeDeviceLookupSCSIHostByWWN(virConnectPtr conn, + const char *wwnn, + const char *wwpn, + unsigned int flags); -const char * virNodeDeviceGetName (virNodeDevicePtr dev); +const char *virNodeDeviceGetName(virNodeDevicePtr dev); -const char * virNodeDeviceGetParent (virNodeDevicePtr dev); +const char *virNodeDeviceGetParent(virNodeDevicePtr dev); -int virNodeDeviceNumOfCaps (virNodeDevicePtr dev); +int virNodeDeviceNumOfCaps(virNodeDevicePtr dev); -int virNodeDeviceListCaps (virNodeDevicePtr dev, - char **const names, - int maxnames); +int virNodeDeviceListCaps(virNodeDevicePtr dev, + char **const names, + int maxnames); -char * virNodeDeviceGetXMLDesc (virNodeDevicePtr dev, - unsigned int flags); +char * virNodeDeviceGetXMLDesc(virNodeDevicePtr dev, + unsigned int flags); -int virNodeDeviceRef (virNodeDevicePtr dev); -int virNodeDeviceFree (virNodeDevicePtr dev); +int virNodeDeviceRef(virNodeDevicePtr dev); -int virNodeDeviceDettach (virNodeDevicePtr dev); -int virNodeDeviceDetachFlags(virNodeDevicePtr dev, - const char *driverName, - unsigned int flags); -int virNodeDeviceReAttach (virNodeDevicePtr dev); -int virNodeDeviceReset (virNodeDevicePtr dev); +int virNodeDeviceFree(virNodeDevicePtr dev); -virNodeDevicePtr virNodeDeviceCreateXML (virConnectPtr conn, - const char *xmlDesc, - unsigned int flags); +int virNodeDeviceDettach(virNodeDevicePtr dev); -int virNodeDeviceDestroy (virNodeDevicePtr dev); +int virNodeDeviceDetachFlags(virNodeDevicePtr dev, + const char *driverName, + unsigned int flags); + +int virNodeDeviceReAttach(virNodeDevicePtr dev); + +int virNodeDeviceReset(virNodeDevicePtr dev); + +virNodeDevicePtr virNodeDeviceCreateXML(virConnectPtr conn, + const char *xmlDesc, + unsigned int flags); + +int virNodeDeviceDestroy(virNodeDevicePtr dev); virNodeDevicePtr virNodeDeviceDefineXML(virConnectPtr conn, const char *xmlDesc, -- 2.26.2

On Thu, Dec 24, 2020 at 08:14:45AM -0600, Jonathon Jongsma wrote:
Most of the header files no longer use the space-padded function name and parameter alignment. Update the style to be more consistent with other headers.
It's actually half of the public ones (9 out of 19 including this one) that are inconsistent, so I would suggest dropping this patch from this series and posting a standalone series to fix the inconsistencies in every header affected header (public or private). Erik

On Thu, Dec 24, 2020 at 10:15 PM Jonathon Jongsma <jjongsma@redhat.com> wrote:
This patch series follows the previously-merged series which added support for transient mediated devices. This series expands mdev support to include persistent device definitions. Again, it relies on mdevctl as the backend.
It follows the common libvirt pattern of APIs by adding the following new APIs for node devices: - virNodeDeviceDefineXML() - defines a persistent device - virNodeDeviceUndefine() - undefines a persistent device - virNodeDeviceCreate() - starts a previously-defined device
It also adds virsh commands mapping to these new APIs: nodedev-define, nodedev-undefine, and nodedev-start.
Trivial suggestions: 1. Mention the bug to be resolved by this series in commit msg: https://bugzilla.redhat.com/show_bug.cgi?id=1699274 2. Add doc of man page for the new virsh commands and options
Since we rely on mdevctl for the definition of ediated devices, we need a way to stay up-to-date with devices that are defined by mdevctl (outside of libvirt). The method for staying up-to-date is currently a little bit crude due to the fact that mdevctl does not emit any events when new devices are added or removed. As a workaround, we create a file monitor for the mdevctl config directory and re-query mdevctl when we detect changes within that directory. In the future, mdevctl may introduce a more elegant solution.
changes in v3: - streamlined tests -- removed some unnecessary duplication - split out patch to factor out node device name generation function - split nodeDeviceParseMdevctlChildDevice() into a separate function - added follow-up patch to remove space-padded alignment in header - refactored the mdevctl update handling significantly: - no longer a separate persistent thread that gets signaled by a timer - now piggybacks onto the existing udev thread and signals the thread in t= he same way that the udev event does. - Daniel suggested spawning a throw-away thread to handle mdevctl updates, but that introduces the complexity of possibly serializing multiple throw-away threads (e.g. if we get an 'created' event followed immediate= ly by a 'deleted' event, two threads may be spawned and we'd need to ensure they are properly ordered) - added virNodeDeviceObjListForEach() and virNodeDeviceObjListRemoveLocked() to simplify removing devices that are removed from mdevctl. - coding style fixes - NOTE: per Erik's request, I experimented with changing the way that mdevctl commands were generated and tested (e.g. introducing something like virMdevctlGetCommand(def, MDEVCTL_COMMAND_<SUBCOMMAND>, ...)), but it was too invasive and awkward and didn't seem worthwhile
Changes in v2: - rebase to latest git master
Jonathon Jongsma (21): tests: remove extra trailing semicolon nodedev: introduce concept of 'active' node devices nodedev: Add ability to filter by active state nodedev: expose internal helper for naming devices nodedev: add ability to list and parse defined mdevs nodedev: add STOPPED/STARTED lifecycle events nodedev: add mdevctl devices to node device list nodedev: add helper functions to remove node devices nodedev: handle mdevs that disappear from mdevctl nodedev: rename dataReady to udevReady nodedev: Refresh mdev devices when changes are detected api: add virNodeDeviceDefineXML() virsh: Add --active, --inactive, --all to nodedev-list virsh: add nodedev-define command nodedev: refactor tests to support mdev undefine api: add virNodeDeviceUndefine() virsh: Factor out function to find node device virsh: add nodedev-undefine command api: add virNodeDeviceCreate() virsh: add "nodedev-start" command libvirt-nodedev.h: remove space-padded alignment
examples/c/misc/event-test.c | 4 + include/libvirt/libvirt-nodedev.h | 98 ++-- src/access/viraccessperm.c | 2 +- src/access/viraccessperm.h | 6 + src/conf/node_device_conf.h | 9 + src/conf/virnodedeviceobj.c | 132 ++++- src/conf/virnodedeviceobj.h | 19 + src/driver-nodedev.h | 14 + src/libvirt-nodedev.c | 115 ++++ src/libvirt_private.syms | 4 + src/libvirt_public.syms | 3 + src/node_device/node_device_driver.c | 538 +++++++++++++++++- src/node_device/node_device_driver.h | 38 ++ src/node_device/node_device_udev.c | 308 ++++++++-- src/remote/remote_driver.c | 3 + src/remote/remote_protocol.x | 40 +- src/remote_protocol-structs | 16 + src/rpc/gendispatch.pl | 1 + ...19_36ea_4111_8f0a_8c9a70e21366-define.argv | 1 + ...19_36ea_4111_8f0a_8c9a70e21366-define.json | 1 + ...39_495e_4243_ad9f_beb3f14c23d9-define.argv | 1 + ...39_495e_4243_ad9f_beb3f14c23d9-define.json | 1 + ...16_1ca8_49ac_b176_871d16c13076-define.argv | 1 + ...16_1ca8_49ac_b176_871d16c13076-define.json | 1 + tests/nodedevmdevctldata/mdevctl-create.argv | 1 + .../mdevctl-list-defined.argv | 1 + .../mdevctl-list-multiple.json | 59 ++ .../mdevctl-list-multiple.out.xml | 39 ++ tests/nodedevmdevctltest.c | 222 +++++++- tools/virsh-nodedev.c | 276 +++++++-- 30 files changed, 1765 insertions(+), 189 deletions(-) create mode 100644 tests/nodedevmdevctldata/mdev_d069d019_36ea_4111_8f0a_8c9= a70e21366-define.argv create mode 100644 tests/nodedevmdevctldata/mdev_d069d019_36ea_4111_8f0a_8c9= a70e21366-define.json create mode 100644 tests/nodedevmdevctldata/mdev_d2441d39_495e_4243_ad9f_beb= 3f14c23d9-define.argv create mode 100644 tests/nodedevmdevctldata/mdev_d2441d39_495e_4243_ad9f_beb= 3f14c23d9-define.json create mode 100644 tests/nodedevmdevctldata/mdev_fedc4916_1ca8_49ac_b176_871= d16c13076-define.argv create mode 100644 tests/nodedevmdevctldata/mdev_fedc4916_1ca8_49ac_b176_871= d16c13076-define.json create mode 100644 tests/nodedevmdevctldata/mdevctl-create.argv create mode 100644 tests/nodedevmdevctldata/mdevctl-list-defined.argv create mode 100644 tests/nodedevmdevctldata/mdevctl-list-multiple.json create mode 100644 tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml
--=20 2.26.2

Tested with v6.10.0-283-g1948d4e61e. 1.Can define/start/destroy mdev device successfully; 2.'virsh nodedev-list' has no '--active' option, which is inconsistent with the description in the patch: # virsh nodedev-list --active error: command 'nodedev-list' doesn't support option --active 3.virsh client hang when trying to destroy a mdev device which is using by a vm, and after that all 'virsh nodev*' cmds will hang. If restarting llibvirtd after that, libvirtd will hang. 4.Define a mdev device with the uuid specified, but the mdev device defined seems using another uuid. Maybe it make a little confusion about the use of uuid in the xml: #cat mdev.xml <device> <name>mdev_85531b6d_e5e4_41c1_aa2a_8844717f066a</name> **** <path>/sys/devices/pci0000:00/0000:00:02.0/85531b6d-e5e4-41c1-aa2a-8844717f066a</path> *** <parent>pci_0000_00_02_0</parent> <driver> <name>vfio_mdev</name> </driver> <capability type='mdev'> <type id='i915-GVTg_V5_8'/> <iommuGroup number='0'/> </capability> </device> # virsh nodedev-define /root/mdev.xml Node device ***mdev_6cdbaad0_7215_4741_82b9_c51e1a4decda**** defined from /root/mdev.xml -- Tested by Yan Fu(yafu@redhat.com) On Fri, Dec 25, 2020 at 10:19 AM Han Han <hhan@redhat.com> wrote:
On Thu, Dec 24, 2020 at 10:15 PM Jonathon Jongsma <jjongsma@redhat.com> wrote:
This patch series follows the previously-merged series which added support for transient mediated devices. This series expands mdev support to include persistent device definitions. Again, it relies on mdevctl as the backend.
It follows the common libvirt pattern of APIs by adding the following new APIs for node devices: - virNodeDeviceDefineXML() - defines a persistent device - virNodeDeviceUndefine() - undefines a persistent device - virNodeDeviceCreate() - starts a previously-defined device
It also adds virsh commands mapping to these new APIs: nodedev-define, nodedev-undefine, and nodedev-start.
Trivial suggestions: 1. Mention the bug to be resolved by this series in commit msg: https://bugzilla.redhat.com/show_bug.cgi?id=1699274 2. Add doc of man page for the new virsh commands and options
Since we rely on mdevctl for the definition of ediated devices, we need a way to stay up-to-date with devices that are defined by mdevctl (outside of libvirt). The method for staying up-to-date is currently a little bit crude due to the fact that mdevctl does not emit any events when new devices are added or removed. As a workaround, we create a file monitor for the mdevctl config directory and re-query mdevctl when we detect changes within that directory. In the future, mdevctl may introduce a more elegant solution.
changes in v3: - streamlined tests -- removed some unnecessary duplication - split out patch to factor out node device name generation function - split nodeDeviceParseMdevctlChildDevice() into a separate function - added follow-up patch to remove space-padded alignment in header - refactored the mdevctl update handling significantly: - no longer a separate persistent thread that gets signaled by a timer - now piggybacks onto the existing udev thread and signals the thread in t= he same way that the udev event does. - Daniel suggested spawning a throw-away thread to handle mdevctl updates, but that introduces the complexity of possibly serializing multiple throw-away threads (e.g. if we get an 'created' event followed immediate= ly by a 'deleted' event, two threads may be spawned and we'd need to ensure they are properly ordered) - added virNodeDeviceObjListForEach() and virNodeDeviceObjListRemoveLocked() to simplify removing devices that are removed from mdevctl. - coding style fixes - NOTE: per Erik's request, I experimented with changing the way that mdevctl commands were generated and tested (e.g. introducing something like virMdevctlGetCommand(def, MDEVCTL_COMMAND_<SUBCOMMAND>, ...)), but it was too invasive and awkward and didn't seem worthwhile
Changes in v2: - rebase to latest git master
Jonathon Jongsma (21): tests: remove extra trailing semicolon nodedev: introduce concept of 'active' node devices nodedev: Add ability to filter by active state nodedev: expose internal helper for naming devices nodedev: add ability to list and parse defined mdevs nodedev: add STOPPED/STARTED lifecycle events nodedev: add mdevctl devices to node device list nodedev: add helper functions to remove node devices nodedev: handle mdevs that disappear from mdevctl nodedev: rename dataReady to udevReady nodedev: Refresh mdev devices when changes are detected api: add virNodeDeviceDefineXML() virsh: Add --active, --inactive, --all to nodedev-list virsh: add nodedev-define command nodedev: refactor tests to support mdev undefine api: add virNodeDeviceUndefine() virsh: Factor out function to find node device virsh: add nodedev-undefine command api: add virNodeDeviceCreate() virsh: add "nodedev-start" command libvirt-nodedev.h: remove space-padded alignment
examples/c/misc/event-test.c | 4 + include/libvirt/libvirt-nodedev.h | 98 ++-- src/access/viraccessperm.c | 2 +- src/access/viraccessperm.h | 6 + src/conf/node_device_conf.h | 9 + src/conf/virnodedeviceobj.c | 132 ++++- src/conf/virnodedeviceobj.h | 19 + src/driver-nodedev.h | 14 + src/libvirt-nodedev.c | 115 ++++ src/libvirt_private.syms | 4 + src/libvirt_public.syms | 3 + src/node_device/node_device_driver.c | 538 +++++++++++++++++- src/node_device/node_device_driver.h | 38 ++ src/node_device/node_device_udev.c | 308 ++++++++-- src/remote/remote_driver.c | 3 + src/remote/remote_protocol.x | 40 +- src/remote_protocol-structs | 16 + src/rpc/gendispatch.pl | 1 + ...19_36ea_4111_8f0a_8c9a70e21366-define.argv | 1 + ...19_36ea_4111_8f0a_8c9a70e21366-define.json | 1 + ...39_495e_4243_ad9f_beb3f14c23d9-define.argv | 1 + ...39_495e_4243_ad9f_beb3f14c23d9-define.json | 1 + ...16_1ca8_49ac_b176_871d16c13076-define.argv | 1 + ...16_1ca8_49ac_b176_871d16c13076-define.json | 1 + tests/nodedevmdevctldata/mdevctl-create.argv | 1 + .../mdevctl-list-defined.argv | 1 + .../mdevctl-list-multiple.json | 59 ++ .../mdevctl-list-multiple.out.xml | 39 ++ tests/nodedevmdevctltest.c | 222 +++++++- tools/virsh-nodedev.c | 276 +++++++-- 30 files changed, 1765 insertions(+), 189 deletions(-) create mode 100644 tests/nodedevmdevctldata/mdev_d069d019_36ea_4111_8f0a_8c9= a70e21366-define.argv create mode 100644 tests/nodedevmdevctldata/mdev_d069d019_36ea_4111_8f0a_8c9= a70e21366-define.json create mode 100644 tests/nodedevmdevctldata/mdev_d2441d39_495e_4243_ad9f_beb= 3f14c23d9-define.argv create mode 100644 tests/nodedevmdevctldata/mdev_d2441d39_495e_4243_ad9f_beb= 3f14c23d9-define.json create mode 100644 tests/nodedevmdevctldata/mdev_fedc4916_1ca8_49ac_b176_871= d16c13076-define.argv create mode 100644 tests/nodedevmdevctldata/mdev_fedc4916_1ca8_49ac_b176_871= d16c13076-define.json create mode 100644 tests/nodedevmdevctldata/mdevctl-create.argv create mode 100644 tests/nodedevmdevctldata/mdevctl-list-defined.argv create mode 100644 tests/nodedevmdevctldata/mdevctl-list-multiple.json create mode 100644 tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml
--=20 2.26.2

On Tue, Jan 5, 2021 at 11:50 AM Yan Fu <yafu@redhat.com> wrote:
Tested with v6.10.0-283-g1948d4e61e.
1.Can define/start/destroy mdev device successfully;
2.'virsh nodedev-list' has no '--active' option, which is inconsistent with the description in the patch: # virsh nodedev-list --active error: command 'nodedev-list' doesn't support option --active
3.virsh client hang when trying to destroy a mdev device which is using by a vm, and after that all 'virsh nodev*' cmds will hang. If restarting llibvirtd after that, libvirtd will hang.
Please provide the domain xml and the libvirtd log by the filter "1:qemu 1:node_device"
4.Define a mdev device with the uuid specified, but the mdev device defined seems using another uuid. Maybe it make a little confusion about the use of uuid in the xml: #cat mdev.xml <device> <name>mdev_85531b6d_e5e4_41c1_aa2a_8844717f066a</name> ****
<path>/sys/devices/pci0000:00/0000:00:02.0/85531b6d-e5e4-41c1-aa2a-8844717f066a</path> *** <parent>pci_0000_00_02_0</parent> <driver> <name>vfio_mdev</name> </driver> <capability type='mdev'> <type id='i915-GVTg_V5_8'/> <iommuGroup number='0'/> </capability> </device>
# virsh nodedev-define /root/mdev.xml Node device ***mdev_6cdbaad0_7215_4741_82b9_c51e1a4decda**** defined from /root/mdev.xml
-- Tested by Yan Fu(yafu@redhat.com)
It should be: Tested-by: Yan Fu <yafu@redhat.com> That means it works for me. If you think there are problems here, the tested-by should not be given.
On Fri, Dec 25, 2020 at 10:19 AM Han Han <hhan@redhat.com> wrote:
On Thu, Dec 24, 2020 at 10:15 PM Jonathon Jongsma <jjongsma@redhat.com> wrote:
This patch series follows the previously-merged series which added support for transient mediated devices. This series expands mdev support to include persistent device definitions. Again, it relies on mdevctl as the backend.
It follows the common libvirt pattern of APIs by adding the following new APIs for node devices: - virNodeDeviceDefineXML() - defines a persistent device - virNodeDeviceUndefine() - undefines a persistent device - virNodeDeviceCreate() - starts a previously-defined device
It also adds virsh commands mapping to these new APIs: nodedev-define, nodedev-undefine, and nodedev-start.
Trivial suggestions: 1. Mention the bug to be resolved by this series in commit msg: https://bugzilla.redhat.com/show_bug.cgi?id=1699274 2. Add doc of man page for the new virsh commands and options
Since we rely on mdevctl for the definition of ediated devices, we need a way to stay up-to-date with devices that are defined by mdevctl (outside of libvirt). The method for staying up-to-date is currently a little bit crude due to the fact that mdevctl does not emit any events when new devices are added or removed. As a workaround, we create a file monitor for the mdevctl config directory and re-query mdevctl when we detect changes within that directory. In the future, mdevctl may introduce a more elegant solution.
changes in v3: - streamlined tests -- removed some unnecessary duplication - split out patch to factor out node device name generation function - split nodeDeviceParseMdevctlChildDevice() into a separate function - added follow-up patch to remove space-padded alignment in header - refactored the mdevctl update handling significantly: - no longer a separate persistent thread that gets signaled by a timer - now piggybacks onto the existing udev thread and signals the thread in t= he same way that the udev event does. - Daniel suggested spawning a throw-away thread to handle mdevctl updates, but that introduces the complexity of possibly serializing multiple throw-away threads (e.g. if we get an 'created' event followed immediate= ly by a 'deleted' event, two threads may be spawned and we'd need to ensure they are properly ordered) - added virNodeDeviceObjListForEach() and virNodeDeviceObjListRemoveLocked() to simplify removing devices that are removed from mdevctl. - coding style fixes - NOTE: per Erik's request, I experimented with changing the way that mdevctl commands were generated and tested (e.g. introducing something like virMdevctlGetCommand(def, MDEVCTL_COMMAND_<SUBCOMMAND>, ...)), but it was too invasive and awkward and didn't seem worthwhile
Changes in v2: - rebase to latest git master
Jonathon Jongsma (21): tests: remove extra trailing semicolon nodedev: introduce concept of 'active' node devices nodedev: Add ability to filter by active state nodedev: expose internal helper for naming devices nodedev: add ability to list and parse defined mdevs nodedev: add STOPPED/STARTED lifecycle events nodedev: add mdevctl devices to node device list nodedev: add helper functions to remove node devices nodedev: handle mdevs that disappear from mdevctl nodedev: rename dataReady to udevReady nodedev: Refresh mdev devices when changes are detected api: add virNodeDeviceDefineXML() virsh: Add --active, --inactive, --all to nodedev-list virsh: add nodedev-define command nodedev: refactor tests to support mdev undefine api: add virNodeDeviceUndefine() virsh: Factor out function to find node device virsh: add nodedev-undefine command api: add virNodeDeviceCreate() virsh: add "nodedev-start" command libvirt-nodedev.h: remove space-padded alignment
examples/c/misc/event-test.c | 4 + include/libvirt/libvirt-nodedev.h | 98 ++-- src/access/viraccessperm.c | 2 +- src/access/viraccessperm.h | 6 + src/conf/node_device_conf.h | 9 + src/conf/virnodedeviceobj.c | 132 ++++- src/conf/virnodedeviceobj.h | 19 + src/driver-nodedev.h | 14 + src/libvirt-nodedev.c | 115 ++++ src/libvirt_private.syms | 4 + src/libvirt_public.syms | 3 + src/node_device/node_device_driver.c | 538 +++++++++++++++++- src/node_device/node_device_driver.h | 38 ++ src/node_device/node_device_udev.c | 308 ++++++++-- src/remote/remote_driver.c | 3 + src/remote/remote_protocol.x | 40 +- src/remote_protocol-structs | 16 + src/rpc/gendispatch.pl | 1 + ...19_36ea_4111_8f0a_8c9a70e21366-define.argv | 1 + ...19_36ea_4111_8f0a_8c9a70e21366-define.json | 1 + ...39_495e_4243_ad9f_beb3f14c23d9-define.argv | 1 + ...39_495e_4243_ad9f_beb3f14c23d9-define.json | 1 + ...16_1ca8_49ac_b176_871d16c13076-define.argv | 1 + ...16_1ca8_49ac_b176_871d16c13076-define.json | 1 + tests/nodedevmdevctldata/mdevctl-create.argv | 1 + .../mdevctl-list-defined.argv | 1 + .../mdevctl-list-multiple.json | 59 ++ .../mdevctl-list-multiple.out.xml | 39 ++ tests/nodedevmdevctltest.c | 222 +++++++- tools/virsh-nodedev.c | 276 +++++++-- 30 files changed, 1765 insertions(+), 189 deletions(-) create mode 100644 tests/nodedevmdevctldata/mdev_d069d019_36ea_4111_8f0a_8c9= a70e21366-define.argv create mode 100644 tests/nodedevmdevctldata/mdev_d069d019_36ea_4111_8f0a_8c9= a70e21366-define.json create mode 100644 tests/nodedevmdevctldata/mdev_d2441d39_495e_4243_ad9f_beb= 3f14c23d9-define.argv create mode 100644 tests/nodedevmdevctldata/mdev_d2441d39_495e_4243_ad9f_beb= 3f14c23d9-define.json create mode 100644 tests/nodedevmdevctldata/mdev_fedc4916_1ca8_49ac_b176_871= d16c13076-define.argv create mode 100644 tests/nodedevmdevctldata/mdev_fedc4916_1ca8_49ac_b176_871= d16c13076-define.json create mode 100644 tests/nodedevmdevctldata/mdevctl-create.argv create mode 100644 tests/nodedevmdevctldata/mdevctl-list-defined.argv create mode 100644 tests/nodedevmdevctldata/mdevctl-list-multiple.json create mode 100644 tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml
--=20 2.26.2

On Tue, Jan 05, 2021 at 11:50:11AM +0800, Yan Fu wrote:
Tested with v6.10.0-283-g1948d4e61e.
1.Can define/start/destroy mdev device successfully;
2.'virsh nodedev-list' has no '--active' option, which is inconsistent with the description in the patch: # virsh nodedev-list --active error: command 'nodedev-list' doesn't support option --active
3.virsh client hang when trying to destroy a mdev device which is using by a vm, and after that all 'virsh nodev*' cmds will hang. If restarting llibvirtd after that, libvirtd will hang.
It hangs because underneath a write to the 'remove' sysfs attribute is now blocking for some reason and since we're relying on mdevctl to do it for us, hence "it hangs". I'm not trying to make an excuse, it's plain wrong. I'd love to rely on such a basic functionality, but it looks like we'll have to go with a extremely ugly workaround and try to get the list of active domains from the nodedev driver and see whether any of them has the device assigned before we try to destroy the mdev via the nodedev driver. However, in my testing libvirtd never hung after a restart, can you elaborate on that?
4.Define a mdev device with the uuid specified, but the mdev device defined seems using another uuid. Maybe it make a little confusion about the use of uuid in the xml: #cat mdev.xml <device> <name>mdev_85531b6d_e5e4_41c1_aa2a_8844717f066a</name> ****
Yeah, the easy way out here is to document that the <name> element is read only, but that would be wrong, because we allow specifying it for domains, networks, interfaces, etc. So, we should give the end user the option to specify whatever name they want and generate one if none is provided. Regards, Erik

On Thu, 7 Jan 2021 17:43:54 +0100 Erik Skultety <eskultet@redhat.com> wrote:
4.Define a mdev device with the uuid specified, but the mdev device defined seems using another uuid. Maybe it make a little confusion about the use of uuid in the xml: #cat mdev.xml <device> <name>mdev_85531b6d_e5e4_41c1_aa2a_8844717f066a</name> ****
Yeah, the easy way out here is to document that the <name> element is read only, but that would be wrong, because we allow specifying it for domains, networks, interfaces, etc. So, we should give the end user the option to specify whatever name they want and generate one if none is provided.
Unfortunately, this appears to be difficult to achieve for persistent devices. For mediated devices, we are using mdevctl as our backend persistent device definition storage. But mdevctl doesn't provide a way to attach additional metadata along with a mdev definition. So we would need to either modify mdevctl to allow us to store additional data with a device definition, or we would need to create (and keep synchronized) a separate persistent storage for libvirt-specific metadata about mediated devices. It's also worth noting that 'nodedev-create' previously allowed you to create vHBA devices in libvirt, but the 'name' element is explicitly ignored (see https://wiki.libvirt.org/page/NPIV_in_libvirt): "NOTE: If you specify "name" for the vHBA, then it will be ignored. The kernel will automatically pick the next SCSI host name in sequence not already used." That said, I do think that it will be useful or even necessary to be able to create mdevs with a specific UUID, but I think that's a separate issue than being able to specify the 'name' of the nodedev. It seems better to do that by specifying a 'uuid' element within the mdev caps, rather than trying to parse a UUID from an arbitrary name string. For example: <device> <capability type='mdev'> <type id='i915-GVTg_V5_8'/> <uuid>901891a6-2077-4476-9465-53d8995b81b4</uuid> </capability> <parent>pci_0000_00_02_0</parent> </device> Thoughts? Jonathon

On Thu, 7 Jan 2021 17:43:54 +0100 Erik Skultety <eskultet@redhat.com> wrote:
Tested with v6.10.0-283-g1948d4e61e.
1.Can define/start/destroy mdev device successfully;
2.'virsh nodedev-list' has no '--active' option, which is inconsistent with the description in the patch: # virsh nodedev-list --active error: command 'nodedev-list' doesn't support option --active
3.virsh client hang when trying to destroy a mdev device which is using by a vm, and after that all 'virsh nodev*' cmds will hang. If restarting llibvirtd after that, libvirtd will hang.
It hangs because underneath a write to the 'remove' sysfs attribute is now blocking for some reason and since we're relying on mdevctl to do it for us, hence "it hangs". I'm not trying to make an excuse, it's plain wrong. I'd love to rely on such a basic functionality, but it looks like we'll have to go with a extremely ugly workaround and try to get the list of active domains from the nodedev driver and see whether any of them has the device assigned before we try to destroy the mdev via the nodedev driver.
So, I've been trying to figure out a way to do this, but as far as I know, there's no way to get a list of active domains from within the nodedev driver, and I can't think of any better ways to handle it. Any ideas? Jonathon

On Fri, Jan 29, 2021 at 05:34:36PM -0600, Jonathon Jongsma wrote:
On Thu, 7 Jan 2021 17:43:54 +0100 Erik Skultety <eskultet@redhat.com> wrote:
Tested with v6.10.0-283-g1948d4e61e.
1.Can define/start/destroy mdev device successfully;
2.'virsh nodedev-list' has no '--active' option, which is inconsistent with the description in the patch: # virsh nodedev-list --active error: command 'nodedev-list' doesn't support option --active
3.virsh client hang when trying to destroy a mdev device which is using by a vm, and after that all 'virsh nodev*' cmds will hang. If restarting llibvirtd after that, libvirtd will hang.
It hangs because underneath a write to the 'remove' sysfs attribute is now blocking for some reason and since we're relying on mdevctl to do it for us, hence "it hangs". I'm not trying to make an excuse, it's plain wrong. I'd love to rely on such a basic functionality, but it looks like we'll have to go with a extremely ugly workaround and try to get the list of active domains from the nodedev driver and see whether any of them has the device assigned before we try to destroy the mdev via the nodedev driver.
So, I've been trying to figure out a way to do this, but as far as I know, there's no way to get a list of active domains from within the nodedev driver, and I can't think of any better ways to handle it. Any ideas?
Correct, the nodedev driver isn't permitted to talk to any of the virt drivers. Is there anything in sysfs which reports whether the device is in use ? Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Mon, Feb 01, 2021 at 09:42:32AM +0000, Daniel P. Berrangé wrote:
On Fri, Jan 29, 2021 at 05:34:36PM -0600, Jonathon Jongsma wrote:
On Thu, 7 Jan 2021 17:43:54 +0100 Erik Skultety <eskultet@redhat.com> wrote:
Tested with v6.10.0-283-g1948d4e61e.
1.Can define/start/destroy mdev device successfully;
2.'virsh nodedev-list' has no '--active' option, which is inconsistent with the description in the patch: # virsh nodedev-list --active error: command 'nodedev-list' doesn't support option --active
3.virsh client hang when trying to destroy a mdev device which is using by a vm, and after that all 'virsh nodev*' cmds will hang. If restarting llibvirtd after that, libvirtd will hang.
It hangs because underneath a write to the 'remove' sysfs attribute is now blocking for some reason and since we're relying on mdevctl to do it for us, hence "it hangs". I'm not trying to make an excuse, it's plain wrong. I'd love to rely on such a basic functionality, but it looks like we'll have to go with a extremely ugly workaround and try to get the list of active domains from the nodedev driver and see whether any of them has the device assigned before we try to destroy the mdev via the nodedev driver.
So, I've been trying to figure out a way to do this, but as far as I know, there's no way to get a list of active domains from within the nodedev driver, and I can't think of any better ways to handle it. Any ideas?
Correct, the nodedev driver isn't permitted to talk to any of the virt drivers.
Oh, not even via secondary connection? What makes nodedev so special, since we can open a secondary connection from e.g. the storage driver?
Is there anything in sysfs which reports whether the device is in use ?
Nothing that I know of, the way it used to work was that you tried to write to sysfs and kernel returned a write error with "device in use" or something like that, but that has changed since :(. Erik

On Mon, Feb 01, 2021 at 10:45:43AM +0100, Erik Skultety wrote:
On Mon, Feb 01, 2021 at 09:42:32AM +0000, Daniel P. Berrangé wrote:
On Fri, Jan 29, 2021 at 05:34:36PM -0600, Jonathon Jongsma wrote:
On Thu, 7 Jan 2021 17:43:54 +0100 Erik Skultety <eskultet@redhat.com> wrote:
Tested with v6.10.0-283-g1948d4e61e.
1.Can define/start/destroy mdev device successfully;
2.'virsh nodedev-list' has no '--active' option, which is inconsistent with the description in the patch: # virsh nodedev-list --active error: command 'nodedev-list' doesn't support option --active
3.virsh client hang when trying to destroy a mdev device which is using by a vm, and after that all 'virsh nodev*' cmds will hang. If restarting llibvirtd after that, libvirtd will hang.
It hangs because underneath a write to the 'remove' sysfs attribute is now blocking for some reason and since we're relying on mdevctl to do it for us, hence "it hangs". I'm not trying to make an excuse, it's plain wrong. I'd love to rely on such a basic functionality, but it looks like we'll have to go with a extremely ugly workaround and try to get the list of active domains from the nodedev driver and see whether any of them has the device assigned before we try to destroy the mdev via the nodedev driver.
So, I've been trying to figure out a way to do this, but as far as I know, there's no way to get a list of active domains from within the nodedev driver, and I can't think of any better ways to handle it. Any ideas?
Correct, the nodedev driver isn't permitted to talk to any of the virt drivers.
Oh, not even via secondary connection? What makes nodedev so special, since we can open a secondary connection from e.g. the storage driver?
It is technically possible, but it should never be done, because it introduces a bi-directional dependancy between the daemons which introduces the danger of deadlocking them. None of the secondary drivers should connect to the hypervisor drivers.
Is there anything in sysfs which reports whether the device is in use ?
Nothing that I know of, the way it used to work was that you tried to write to sysfs and kernel returned a write error with "device in use" or something like that, but that has changed since :(.
Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Mon, Feb 01, 2021 at 09:48:11AM +0000, Daniel P. Berrangé wrote:
On Mon, Feb 01, 2021 at 10:45:43AM +0100, Erik Skultety wrote:
On Mon, Feb 01, 2021 at 09:42:32AM +0000, Daniel P. Berrangé wrote:
On Fri, Jan 29, 2021 at 05:34:36PM -0600, Jonathon Jongsma wrote:
On Thu, 7 Jan 2021 17:43:54 +0100 Erik Skultety <eskultet@redhat.com> wrote:
Tested with v6.10.0-283-g1948d4e61e.
1.Can define/start/destroy mdev device successfully;
2.'virsh nodedev-list' has no '--active' option, which is inconsistent with the description in the patch: # virsh nodedev-list --active error: command 'nodedev-list' doesn't support option --active
3.virsh client hang when trying to destroy a mdev device which is using by a vm, and after that all 'virsh nodev*' cmds will hang. If restarting llibvirtd after that, libvirtd will hang.
It hangs because underneath a write to the 'remove' sysfs attribute is now blocking for some reason and since we're relying on mdevctl to do it for us, hence "it hangs". I'm not trying to make an excuse, it's plain wrong. I'd love to rely on such a basic functionality, but it looks like we'll have to go with a extremely ugly workaround and try to get the list of active domains from the nodedev driver and see whether any of them has the device assigned before we try to destroy the mdev via the nodedev driver.
So, I've been trying to figure out a way to do this, but as far as I know, there's no way to get a list of active domains from within the nodedev driver, and I can't think of any better ways to handle it. Any ideas?
Correct, the nodedev driver isn't permitted to talk to any of the virt drivers.
Oh, not even via secondary connection? What makes nodedev so special, since we can open a secondary connection from e.g. the storage driver?
It is technically possible, but it should never be done, because it introduces a bi-directional dependancy between the daemons which introduces the danger of deadlocking them. None of the secondary drivers should connect to the hypervisor drivers.
Is there anything in sysfs which reports whether the device is in use ?
Nothing that I know of, the way it used to work was that you tried to write to sysfs and kernel returned a write error with "device in use" or something like that, but that has changed since :(.
Without having tried this and since mdevctl is just a Bash script, can we bypass mdevctl on destroys a little bit by constructing the path to the sysfs attribute ourselves and perform a non-blocking write of zero bytes to see if we get an error? If so, we'll skip mdevctl and report an error. If we don't, we'll invoke mdevctl to remove the device in order to remain consistent. Would that be an acceptable workaround (provided it would work)? Erik

On Mon, 1 Feb 2021 11:33:08 +0100 Erik Skultety <eskultet@redhat.com> wrote:
On Mon, Feb 01, 2021 at 09:48:11AM +0000, Daniel P. Berrangé wrote:
On Mon, Feb 01, 2021 at 10:45:43AM +0100, Erik Skultety wrote:
On Mon, Feb 01, 2021 at 09:42:32AM +0000, Daniel P. Berrangé wrote:
On Fri, Jan 29, 2021 at 05:34:36PM -0600, Jonathon Jongsma wrote:
On Thu, 7 Jan 2021 17:43:54 +0100 Erik Skultety <eskultet@redhat.com> wrote:
> Tested with v6.10.0-283-g1948d4e61e. > > 1.Can define/start/destroy mdev device successfully; > > 2.'virsh nodedev-list' has no '--active' option, which is > inconsistent with the description in the patch: > # virsh nodedev-list --active > error: command 'nodedev-list' doesn't support option > --active > > 3.virsh client hang when trying to destroy a mdev device > which is using by a vm, and after that all 'virsh nodev*' > cmds will hang. If restarting llibvirtd after that, > libvirtd will hang.
It hangs because underneath a write to the 'remove' sysfs attribute is now blocking for some reason and since we're relying on mdevctl to do it for us, hence "it hangs". I'm not trying to make an excuse, it's plain wrong. I'd love to rely on such a basic functionality, but it looks like we'll have to go with a extremely ugly workaround and try to get the list of active domains from the nodedev driver and see whether any of them has the device assigned before we try to destroy the mdev via the nodedev driver.
So, I've been trying to figure out a way to do this, but as far as I know, there's no way to get a list of active domains from within the nodedev driver, and I can't think of any better ways to handle it. Any ideas?
Correct, the nodedev driver isn't permitted to talk to any of the virt drivers.
Oh, not even via secondary connection? What makes nodedev so special, since we can open a secondary connection from e.g. the storage driver?
It is technically possible, but it should never be done, because it introduces a bi-directional dependancy between the daemons which introduces the danger of deadlocking them. None of the secondary drivers should connect to the hypervisor drivers.
Is there anything in sysfs which reports whether the device is in use ?
Nothing that I know of, the way it used to work was that you tried to write to sysfs and kernel returned a write error with "device in use" or something like that, but that has changed since :(.
Without having tried this and since mdevctl is just a Bash script, can we bypass mdevctl on destroys a little bit by constructing the path to the sysfs attribute ourselves and perform a non-blocking write of zero bytes to see if we get an error? If so, we'll skip mdevctl and report an error. If we don't, we'll invoke mdevctl to remove the device in order to remain consistent. Would that be an acceptable workaround (provided it would work)?
As far as I can tell, this doesn't work. According to my tests, attempting to write zero bytes to $(mdev_sysfs_path)/remove doesn't result in an error if the mdev is in use by a vm. It just "successfully" writes zero bytes. Adding Alex to cc in case he has an idea for a workaround here. Jonathon

On Mon, 1 Feb 2021 16:57:44 -0600 Jonathon Jongsma <jjongsma@redhat.com> wrote:
On Mon, 1 Feb 2021 11:33:08 +0100 Erik Skultety <eskultet@redhat.com> wrote:
On Mon, Feb 01, 2021 at 09:48:11AM +0000, Daniel P. Berrangé wrote:
On Mon, Feb 01, 2021 at 10:45:43AM +0100, Erik Skultety wrote:
On Mon, Feb 01, 2021 at 09:42:32AM +0000, Daniel P. Berrangé wrote:
On Fri, Jan 29, 2021 at 05:34:36PM -0600, Jonathon Jongsma wrote:
On Thu, 7 Jan 2021 17:43:54 +0100 Erik Skultety <eskultet@redhat.com> wrote:
> > Tested with v6.10.0-283-g1948d4e61e. > > > > 1.Can define/start/destroy mdev device successfully; > > > > 2.'virsh nodedev-list' has no '--active' option, which is > > inconsistent with the description in the patch: > > # virsh nodedev-list --active > > error: command 'nodedev-list' doesn't support option > > --active > > > > 3.virsh client hang when trying to destroy a mdev device > > which is using by a vm, and after that all 'virsh nodev*' > > cmds will hang. If restarting llibvirtd after that, > > libvirtd will hang. > > It hangs because underneath a write to the 'remove' sysfs > attribute is now blocking for some reason and since we're > relying on mdevctl to do it for us, hence "it hangs". I'm > not trying to make an excuse, it's plain wrong. I'd love to > rely on such a basic functionality, but it looks like we'll > have to go with a extremely ugly workaround and try to get > the list of active domains from the nodedev driver and see > whether any of them has the device assigned before we try > to destroy the mdev via the nodedev driver.
So, I've been trying to figure out a way to do this, but as far as I know, there's no way to get a list of active domains from within the nodedev driver, and I can't think of any better ways to handle it. Any ideas?
Correct, the nodedev driver isn't permitted to talk to any of the virt drivers.
Oh, not even via secondary connection? What makes nodedev so special, since we can open a secondary connection from e.g. the storage driver?
It is technically possible, but it should never be done, because it introduces a bi-directional dependancy between the daemons which introduces the danger of deadlocking them. None of the secondary drivers should connect to the hypervisor drivers.
Is there anything in sysfs which reports whether the device is in use ?
Nothing that I know of, the way it used to work was that you tried to write to sysfs and kernel returned a write error with "device in use" or something like that, but that has changed since :(.
Without having tried this and since mdevctl is just a Bash script, can we bypass mdevctl on destroys a little bit by constructing the path to the sysfs attribute ourselves and perform a non-blocking write of zero bytes to see if we get an error? If so, we'll skip mdevctl and report an error. If we don't, we'll invoke mdevctl to remove the device in order to remain consistent. Would that be an acceptable workaround (provided it would work)?
As far as I can tell, this doesn't work. According to my tests, attempting to write zero bytes to $(mdev_sysfs_path)/remove doesn't result in an error if the mdev is in use by a vm. It just "successfully" writes zero bytes. Adding Alex to cc in case he has an idea for a workaround here.
[Cc +Connie] I'm not really sure why mdevs are unique here. When we write to remove, the first step is to release the device from the driver, so it's really the same as an unbind for a vfio-pci device. PCI drivers cannot return an error, an unbind is handled not as a request, but a directive, so when the device is in use we block until the unbind can complete. With vfio-pci (and added upstream to the mdev core - depending on vendor support), the driver remove callback triggers a virtual interrupt to the user asking to cooperatively return the device (triggering a hot-unplug in QEMU). Has this really worked so well in vfio-pci that we've forgotten that an unbind can block there too or are we better about tracking something with PCI devices vs mdevs? On idea for a solution would be that vfio only allows a single open of a group at a time, so if libvirt were to open the group it could know that it's unused. If you can manage to close the group once you've already triggered the remove/unbind, then I'd think the completion of the write would be deterministic. If the group is in use elsewhere, the open should get back an -EBUSY and you probably ought to be careful about removing/unbinding it anyway. It might be possible to implement this in mdevctl too, ie. redirect /dev/null to group file and fork, fork the echo 1 > remove, kill the redirect, return a device in use error if the initial redirect fails. Thanks, Alex

On Mon, Feb 01, 2021 at 04:38:56PM -0700, Alex Williamson wrote:
On Mon, 1 Feb 2021 16:57:44 -0600 Jonathon Jongsma <jjongsma@redhat.com> wrote:
On Mon, 1 Feb 2021 11:33:08 +0100 Erik Skultety <eskultet@redhat.com> wrote:
On Mon, Feb 01, 2021 at 09:48:11AM +0000, Daniel P. Berrangé wrote:
On Mon, Feb 01, 2021 at 10:45:43AM +0100, Erik Skultety wrote:
On Mon, Feb 01, 2021 at 09:42:32AM +0000, Daniel P. Berrangé wrote:
On Fri, Jan 29, 2021 at 05:34:36PM -0600, Jonathon Jongsma wrote: > On Thu, 7 Jan 2021 17:43:54 +0100 > Erik Skultety <eskultet@redhat.com> wrote: > > > > Tested with v6.10.0-283-g1948d4e61e. > > > > > > 1.Can define/start/destroy mdev device successfully; > > > > > > 2.'virsh nodedev-list' has no '--active' option, which is > > > inconsistent with the description in the patch: > > > # virsh nodedev-list --active > > > error: command 'nodedev-list' doesn't support option > > > --active > > > > > > 3.virsh client hang when trying to destroy a mdev device > > > which is using by a vm, and after that all 'virsh nodev*' > > > cmds will hang. If restarting llibvirtd after that, > > > libvirtd will hang. > > > > It hangs because underneath a write to the 'remove' sysfs > > attribute is now blocking for some reason and since we're > > relying on mdevctl to do it for us, hence "it hangs". I'm > > not trying to make an excuse, it's plain wrong. I'd love to > > rely on such a basic functionality, but it looks like we'll > > have to go with a extremely ugly workaround and try to get > > the list of active domains from the nodedev driver and see > > whether any of them has the device assigned before we try > > to destroy the mdev via the nodedev driver. > > So, I've been trying to figure out a way to do this, but as > far as I know, there's no way to get a list of active domains > from within the nodedev driver, and I can't think of any > better ways to handle it. Any ideas?
Correct, the nodedev driver isn't permitted to talk to any of the virt drivers.
Oh, not even via secondary connection? What makes nodedev so special, since we can open a secondary connection from e.g. the storage driver?
It is technically possible, but it should never be done, because it introduces a bi-directional dependancy between the daemons which introduces the danger of deadlocking them. None of the secondary drivers should connect to the hypervisor drivers.
Is there anything in sysfs which reports whether the device is in use ?
Nothing that I know of, the way it used to work was that you tried to write to sysfs and kernel returned a write error with "device in use" or something like that, but that has changed since :(.
Without having tried this and since mdevctl is just a Bash script, can we bypass mdevctl on destroys a little bit by constructing the path to the sysfs attribute ourselves and perform a non-blocking write of zero bytes to see if we get an error? If so, we'll skip mdevctl and report an error. If we don't, we'll invoke mdevctl to remove the device in order to remain consistent. Would that be an acceptable workaround (provided it would work)?
As far as I can tell, this doesn't work. According to my tests, attempting to write zero bytes to $(mdev_sysfs_path)/remove doesn't result in an error if the mdev is in use by a vm. It just "successfully" writes zero bytes. Adding Alex to cc in case he has an idea for a workaround here.
[Cc +Connie]
I'm not really sure why mdevs are unique here. When we write to remove, the first step is to release the device from the driver, so it's really the same as an unbind for a vfio-pci device. PCI drivers cannot return an error, an unbind is handled not as a request, but a directive, so when the device is in use we block until the unbind can complete. With vfio-pci (and added upstream to the mdev core - depending on vendor support), the driver remove callback triggers a virtual interrupt to the user asking to cooperatively return the device (triggering a hot-unplug in QEMU). Has this really worked so well in vfio-pci that we've forgotten that an unbind can block there too or are we better about tracking something with PCI devices vs mdevs?
Does any of the current vendor guest drivers for mdev support unplug? While I'm not trying to argue that unpluging a vfio-pci cannot block, it just works seamlessly in majority of cases nowadays, but I guess we were in the same situation with PCI assignment in the past? The whole point here is IMO about a massive inconvenience for a library consumer to be blocked on an operation and not knowing why, whereas when you return an instant error saying why the operation cannot be completed right now that opens the door for a necessary adjustment in their usage of the library.
On idea for a solution would be that vfio only allows a single open of a group at a time, so if libvirt were to open the group it could know that it's unused. If you can manage to close the group once you've already triggered the remove/unbind, then I'd think the completion of the write would be deterministic. If the group is in use elsewhere, the open should get back an -EBUSY and you probably ought to be careful
Honestly, ^this seems like a fairly straightforward workaround to me. Erik
about removing/unbinding it anyway. It might be possible to implement this in mdevctl too, ie. redirect /dev/null to group file and fork, fork the echo 1 > remove, kill the redirect, return a device in use error if the initial redirect fails. Thanks,
Alex

On Tue, 2 Feb 2021 08:28:52 +0100 Erik Skultety <eskultet@redhat.com> wrote:
On Mon, Feb 01, 2021 at 04:38:56PM -0700, Alex Williamson wrote:
On Mon, 1 Feb 2021 16:57:44 -0600 Jonathon Jongsma <jjongsma@redhat.com> wrote:
On Mon, 1 Feb 2021 11:33:08 +0100 Erik Skultety <eskultet@redhat.com> wrote:
On Mon, Feb 01, 2021 at 09:48:11AM +0000, Daniel P. Berrangé wrote:
On Mon, Feb 01, 2021 at 10:45:43AM +0100, Erik Skultety wrote:
On Mon, Feb 01, 2021 at 09:42:32AM +0000, Daniel P. Berrangé wrote: > On Fri, Jan 29, 2021 at 05:34:36PM -0600, Jonathon Jongsma > wrote: > > On Thu, 7 Jan 2021 17:43:54 +0100 > > Erik Skultety <eskultet@redhat.com> wrote: > > > > > > Tested with v6.10.0-283-g1948d4e61e. > > > > > > > > 1.Can define/start/destroy mdev device successfully; > > > > > > > > 2.'virsh nodedev-list' has no '--active' option, > > > > which is inconsistent with the description in the > > > > patch: # virsh nodedev-list --active > > > > error: command 'nodedev-list' doesn't support option > > > > --active > > > > > > > > 3.virsh client hang when trying to destroy a mdev > > > > device which is using by a vm, and after that all > > > > 'virsh nodev*' cmds will hang. If restarting > > > > llibvirtd after that, libvirtd will hang. > > > > > > It hangs because underneath a write to the 'remove' > > > sysfs attribute is now blocking for some reason and > > > since we're relying on mdevctl to do it for us, hence > > > "it hangs". I'm not trying to make an excuse, it's > > > plain wrong. I'd love to rely on such a basic > > > functionality, but it looks like we'll have to go > > > with a extremely ugly workaround and try to get the > > > list of active domains from the nodedev driver and > > > see whether any of them has the device assigned > > > before we try to destroy the mdev via the nodedev > > > driver. > > > > So, I've been trying to figure out a way to do this, > > but as far as I know, there's no way to get a list of > > active domains from within the nodedev driver, and I > > can't think of any better ways to handle it. Any ideas? > > > > Correct, the nodedev driver isn't permitted to talk to > any of the virt drivers.
Oh, not even via secondary connection? What makes nodedev so special, since we can open a secondary connection from e.g. the storage driver?
It is technically possible, but it should never be done, because it introduces a bi-directional dependancy between the daemons which introduces the danger of deadlocking them. None of the secondary drivers should connect to the hypervisor drivers.
> Is there anything in sysfs which reports whether the > device is in use ?
Nothing that I know of, the way it used to work was that you tried to write to sysfs and kernel returned a write error with "device in use" or something like that, but that has changed since :(.
Without having tried this and since mdevctl is just a Bash script, can we bypass mdevctl on destroys a little bit by constructing the path to the sysfs attribute ourselves and perform a non-blocking write of zero bytes to see if we get an error? If so, we'll skip mdevctl and report an error. If we don't, we'll invoke mdevctl to remove the device in order to remain consistent. Would that be an acceptable workaround (provided it would work)?
As far as I can tell, this doesn't work. According to my tests, attempting to write zero bytes to $(mdev_sysfs_path)/remove doesn't result in an error if the mdev is in use by a vm. It just "successfully" writes zero bytes. Adding Alex to cc in case he has an idea for a workaround here.
[Cc +Connie]
I'm not really sure why mdevs are unique here. When we write to remove, the first step is to release the device from the driver, so it's really the same as an unbind for a vfio-pci device. PCI drivers cannot return an error, an unbind is handled not as a request, but a directive, so when the device is in use we block until the unbind can complete. With vfio-pci (and added upstream to the mdev core - depending on vendor support), the driver remove callback triggers a virtual interrupt to the user asking to cooperatively return the device (triggering a hot-unplug in QEMU). Has this really worked so well in vfio-pci that we've forgotten that an unbind can block there too or are we better about tracking something with PCI devices vs mdevs?
Does any of the current vendor guest drivers for mdev support unplug? While I'm not trying to argue that unpluging a vfio-pci cannot block, it just works seamlessly in majority of cases nowadays, but I guess we were in the same situation with PCI assignment in the past?
The whole point here is IMO about a massive inconvenience for a library consumer to be blocked on an operation and not knowing why, whereas when you return an instant error saying why the operation cannot be completed right now that opens the door for a necessary adjustment in their usage of the library.
On idea for a solution would be that vfio only allows a single open of a group at a time, so if libvirt were to open the group it could know that it's unused. If you can manage to close the group once you've already triggered the remove/unbind, then I'd think the completion of the write would be deterministic. If the group is in use elsewhere, the open should get back an -EBUSY and you probably ought to be careful
Honestly, ^this seems like a fairly straightforward workaround to me.
Erik
Yes, thanks. This seems pretty clean and simple to to implement, and I can't really think of any significant downsides. Jonathon
about removing/unbinding it anyway. It might be possible to implement this in mdevctl too, ie. redirect /dev/null to group file and fork, fork the echo 1 > remove, kill the redirect, return a device in use error if the initial redirect fails. Thanks,
Alex

On Tue, 2 Feb 2021 08:28:52 +0100 Erik Skultety <eskultet@redhat.com> wrote:
On Mon, Feb 01, 2021 at 04:38:56PM -0700, Alex Williamson wrote:
On Mon, 1 Feb 2021 16:57:44 -0600 Jonathon Jongsma <jjongsma@redhat.com> wrote:
On Mon, 1 Feb 2021 11:33:08 +0100 Erik Skultety <eskultet@redhat.com> wrote:
On Mon, Feb 01, 2021 at 09:48:11AM +0000, Daniel P. Berrangé wrote:
On Mon, Feb 01, 2021 at 10:45:43AM +0100, Erik Skultety wrote:
On Mon, Feb 01, 2021 at 09:42:32AM +0000, Daniel P. Berrangé wrote: > On Fri, Jan 29, 2021 at 05:34:36PM -0600, Jonathon Jongsma > wrote: > > On Thu, 7 Jan 2021 17:43:54 +0100 > > Erik Skultety <eskultet@redhat.com> wrote: > > > > > > Tested with v6.10.0-283-g1948d4e61e. > > > > > > > > 1.Can define/start/destroy mdev device successfully; > > > > > > > > 2.'virsh nodedev-list' has no '--active' option, which is > > > > inconsistent with the description in the patch: > > > > # virsh nodedev-list --active > > > > error: command 'nodedev-list' doesn't support option > > > > --active > > > > > > > > 3.virsh client hang when trying to destroy a mdev device > > > > which is using by a vm, and after that all 'virsh nodev*' > > > > cmds will hang. If restarting llibvirtd after that, > > > > libvirtd will hang. > > > > > > It hangs because underneath a write to the 'remove' sysfs > > > attribute is now blocking for some reason and since we're > > > relying on mdevctl to do it for us, hence "it hangs". I'm > > > not trying to make an excuse, it's plain wrong. I'd love to > > > rely on such a basic functionality, but it looks like we'll > > > have to go with a extremely ugly workaround and try to get > > > the list of active domains from the nodedev driver and see > > > whether any of them has the device assigned before we try > > > to destroy the mdev via the nodedev driver. > > > > So, I've been trying to figure out a way to do this, but as > > far as I know, there's no way to get a list of active domains > > from within the nodedev driver, and I can't think of any > > better ways to handle it. Any ideas? > > Correct, the nodedev driver isn't permitted to talk to any of > the virt drivers.
Oh, not even via secondary connection? What makes nodedev so special, since we can open a secondary connection from e.g. the storage driver?
It is technically possible, but it should never be done, because it introduces a bi-directional dependancy between the daemons which introduces the danger of deadlocking them. None of the secondary drivers should connect to the hypervisor drivers.
> Is there anything in sysfs which reports whether the device is > in use ?
Nothing that I know of, the way it used to work was that you tried to write to sysfs and kernel returned a write error with "device in use" or something like that, but that has changed since :(.
Without having tried this and since mdevctl is just a Bash script, can we bypass mdevctl on destroys a little bit by constructing the path to the sysfs attribute ourselves and perform a non-blocking write of zero bytes to see if we get an error? If so, we'll skip mdevctl and report an error. If we don't, we'll invoke mdevctl to remove the device in order to remain consistent. Would that be an acceptable workaround (provided it would work)?
As far as I can tell, this doesn't work. According to my tests, attempting to write zero bytes to $(mdev_sysfs_path)/remove doesn't result in an error if the mdev is in use by a vm. It just "successfully" writes zero bytes. Adding Alex to cc in case he has an idea for a workaround here.
[Cc +Connie]
I'm not really sure why mdevs are unique here. When we write to remove, the first step is to release the device from the driver, so it's really the same as an unbind for a vfio-pci device. PCI drivers cannot return an error, an unbind is handled not as a request, but a directive, so when the device is in use we block until the unbind can complete. With vfio-pci (and added upstream to the mdev core - depending on vendor support), the driver remove callback triggers a virtual interrupt to the user asking to cooperatively return the device (triggering a hot-unplug in QEMU). Has this really worked so well in vfio-pci that we've forgotten that an unbind can block there too or are we better about tracking something with PCI devices vs mdevs?
Does any of the current vendor guest drivers for mdev support unplug? While I'm not trying to argue that unpluging a vfio-pci cannot block, it just works seamlessly in majority of cases nowadays, but I guess we were in the same situation with PCI assignment in the past?
I think this will only work for ccw right now (when using QEMU from current git). Channel devices always support unplug (it's not a request on the guest side, it's a notification.)
The whole point here is IMO about a massive inconvenience for a library consumer to be blocked on an operation and not knowing why, whereas when you return an instant error saying why the operation cannot be completed right now that opens the door for a necessary adjustment in their usage of the library.
On idea for a solution would be that vfio only allows a single open of a group at a time, so if libvirt were to open the group it could know that it's unused. If you can manage to close the group once you've already triggered the remove/unbind, then I'd think the completion of the write would be deterministic. If the group is in use elsewhere, the open should get back an -EBUSY and you probably ought to be careful
Honestly, ^this seems like a fairly straightforward workaround to me.
Erik
about removing/unbinding it anyway. It might be possible to implement this in mdevctl too, ie. redirect /dev/null to group file and fork, fork the echo 1 > remove, kill the redirect, return a device in use error if the initial redirect fails. Thanks,
Alex
Hacking something like that into mdevctl might be a good idea.

On Thu, Dec 24, 2020 at 08:14:24AM -0600, Jonathon Jongsma wrote:
This patch series follows the previously-merged series which added support for transient mediated devices. This series expands mdev support to include persistent device definitions. Again, it relies on mdevctl as the backend.
It follows the common libvirt pattern of APIs by adding the following new APIs for node devices: - virNodeDeviceDefineXML() - defines a persistent device - virNodeDeviceUndefine() - undefines a persistent device - virNodeDeviceCreate() - starts a previously-defined device
It also adds virsh commands mapping to these new APIs: nodedev-define, nodedev-undefine, and nodedev-start.
Since we rely on mdevctl for the definition of ediated devices, we need a way to stay up-to-date with devices that are defined by mdevctl (outside of libvirt). The method for staying up-to-date is currently a little bit crude due to the fact that mdevctl does not emit any events when new devices are added or removed. As a workaround, we create a file monitor for the mdevctl config directory and re-query mdevctl when we detect changes within that directory. In the future, mdevctl may introduce a more elegant solution.
changes in v3: - streamlined tests -- removed some unnecessary duplication - split out patch to factor out node device name generation function - split nodeDeviceParseMdevctlChildDevice() into a separate function - added follow-up patch to remove space-padded alignment in header - refactored the mdevctl update handling significantly: - no longer a separate persistent thread that gets signaled by a timer - now piggybacks onto the existing udev thread and signals the thread in t= he same way that the udev event does. - Daniel suggested spawning a throw-away thread to handle mdevctl updates, but that introduces the complexity of possibly serializing multiple throw-away threads (e.g. if we get an 'created' event followed immediate= ly by a 'deleted' event, two threads may be spawned and we'd need to ensure they are properly ordered) - added virNodeDeviceObjListForEach() and virNodeDeviceObjListRemoveLocked() to simplify removing devices that are removed from mdevctl. - coding style fixes - NOTE: per Erik's request, I experimented with changing the way that mdevctl commands were generated and tested (e.g. introducing something like virMdevctlGetCommand(def, MDEVCTL_COMMAND_<SUBCOMMAND>, ...)), but it was too invasive and awkward and didn't seem worthwhile
Thanks for giving ^this a try :) I think we're converging nicely with the only remaining bit to agree on being the mdevctl monitoring thread. I think we can target 7.1.0 with this series. Regards, Erik

On Thu, Jan 07, 2021 at 05:50:10PM +0100, Erik Skultety wrote:
On Thu, Dec 24, 2020 at 08:14:24AM -0600, Jonathon Jongsma wrote:
This patch series follows the previously-merged series which added support for transient mediated devices. This series expands mdev support to include persistent device definitions. Again, it relies on mdevctl as the backend.
It follows the common libvirt pattern of APIs by adding the following new APIs for node devices: - virNodeDeviceDefineXML() - defines a persistent device - virNodeDeviceUndefine() - undefines a persistent device - virNodeDeviceCreate() - starts a previously-defined device
It also adds virsh commands mapping to these new APIs: nodedev-define, nodedev-undefine, and nodedev-start.
Since we rely on mdevctl for the definition of ediated devices, we need a way to stay up-to-date with devices that are defined by mdevctl (outside of libvirt). The method for staying up-to-date is currently a little bit crude due to the fact that mdevctl does not emit any events when new devices are added or removed. As a workaround, we create a file monitor for the mdevctl config directory and re-query mdevctl when we detect changes within that directory. In the future, mdevctl may introduce a more elegant solution.
changes in v3: - streamlined tests -- removed some unnecessary duplication - split out patch to factor out node device name generation function - split nodeDeviceParseMdevctlChildDevice() into a separate function - added follow-up patch to remove space-padded alignment in header - refactored the mdevctl update handling significantly: - no longer a separate persistent thread that gets signaled by a timer - now piggybacks onto the existing udev thread and signals the thread in t= he same way that the udev event does. - Daniel suggested spawning a throw-away thread to handle mdevctl updates, but that introduces the complexity of possibly serializing multiple throw-away threads (e.g. if we get an 'created' event followed immediate= ly by a 'deleted' event, two threads may be spawned and we'd need to ensure they are properly ordered) - added virNodeDeviceObjListForEach() and virNodeDeviceObjListRemoveLocked() to simplify removing devices that are removed from mdevctl. - coding style fixes - NOTE: per Erik's request, I experimented with changing the way that mdevctl commands were generated and tested (e.g. introducing something like virMdevctlGetCommand(def, MDEVCTL_COMMAND_<SUBCOMMAND>, ...)), but it was too invasive and awkward and didn't seem worthwhile
Thanks for giving ^this a try :)
I think we're converging nicely with the only remaining bit to agree on being the mdevctl monitoring thread. I think we can target 7.1.0 with this series.
The lifecycle events handling in this series is totally broken and needs addressing before we can consider merging. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
participants (7)
-
Alex Williamson
-
Cornelia Huck
-
Daniel P. Berrangé
-
Erik Skultety
-
Han Han
-
Jonathon Jongsma
-
Yan Fu