On 6/28/23 12:03 AM, Jonathon Jongsma wrote:
On 6/23/23 5:43 AM, Boris Fiuczynski wrote:
> Update the optional mdev attributes on the new created nodedev object as
> they otherwise would not get set until the next mdevctl update cycle.
>
> Resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=2143158
> Signed-off-by: Boris Fiuczynski <fiuczy(a)linux.ibm.com>
> ---
> src/node_device/node_device_driver.c | 18 ++++++++++++++++--
> 1 file changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/src/node_device/node_device_driver.c
> b/src/node_device/node_device_driver.c
> index a2d0600560..5134d246f3 100644
> --- a/src/node_device/node_device_driver.c
> +++ b/src/node_device/node_device_driver.c
> @@ -847,6 +847,9 @@ static virNodeDevicePtr
> nodeDeviceCreateXMLMdev(virConnectPtr conn,
> virNodeDeviceDef *def)
> {
> + virNodeDeviceObj *obj;
> + virNodeDeviceDef *new_def;
> + virNodeDevicePtr device;
> g_autofree char *uuid = NULL;
> if (!def->parent) {
> @@ -864,8 +867,19 @@ nodeDeviceCreateXMLMdev(virConnectPtr conn,
> def->caps->data.mdev.uuid = g_steal_pointer(&uuid);
> }
> - return nodeDeviceFindNewMediatedDevice(conn,
> def->caps->data.mdev.uuid,
> -
> def->caps->data.mdev.parent_addr);
> + device = nodeDeviceFindNewMediatedDevice(conn,
> def->caps->data.mdev.uuid,
> +
> def->caps->data.mdev.parent_addr);
So, this function waits up to 60s for the given mediated device to show
up in the list of devices known to the driver (which is stored in
driver->devs and gets populated by handlers that respond to udev or
mdevctl events)...
> + /* check on def for attributes and try update */
> + if (def->caps->data.mdev.nattributes > 0) {
> + /* ignore failures as mdevctl updates will recover later */
> + if (!(obj = nodeDeviceObjFindByName(device->name)))
> + return device;
If the device was found and the device has mdev attributes, you then
query the exact same list for a device with the name of the device you
just found...
> + new_def = virNodeDeviceObjGetDef(obj);
> + nodeDeviceDefCopyFromMdevctl(new_def, def);
If the device was found in the list (which it should be, because the
nodeDeviceFindNewMediatedDevice() function had already found it in the
list of devices), then we simply copy the user-supplied device
definition into the definition we received from the nodedev driver's
device list.
Please let me know which procedure to use if the node device object
which is missing the attributes is created by the udev ADD event which
the method nodeDeviceFindNewMediatedDevice is waiting for to happen.
This is just papering over the issue. By copying the user-supplied
definition into the system-queried definition, we're just assuming that
the device was created properly with the requested attributes rather
than verifying that it was created correctly.
I expect virMdevctlCreate to fail if attributes are not properly set on
the transient mdev created by mdevctl. If that happens this code is
never run anyway.
And speaking about verification:
mdevctl only returns attributes if a callout script is present
supporting the mdev type and has a method for event "get" and action
"attributes" implemented.
Therefore no "reliable" verification is possible! The only reliable
source of information is the outcome of virMdevctlCreate.
The root issue seems to be that when a transient mediated device is
created, our udev event handler runs and adds the device to the device
list, but the mdevctl handler is never run to query additional
mdev-related information about the device. mdevctl is only re-queried
when the config files for defined devices are changed or when a new
physical mdev parent device is added or removed.
Actually that was the first place I looked at.
There is already an mdevctl update triggered in method udevAddOneDeive
if the device is an mdev parent device. This happens usually on either a
vfio* module load or a device being bound to a vfio* device driver.
Introducing an update on every mdev device added would create an high
impact especially as the udevAddOneDeive is used in multiple contexts,
1) add and change udev event
2) move usev event
3) processing the each entry in a udev process device list, which is
usually run when the node device daemon initializes. In this case a
device unrelated/unconditional mdevctl update is triggered once already
(nodeStateInitializedEnumerate).
In cases 1) and 2) the context of what triggered the add udev event is
unknown and we are only missing the update if a transient mdev device
with attributes is being created. Therefore I choose the "shortcut" of
added these where the context is known. Please also note that the
callout script restriction (mentioned above) also applies here.
Jonathon
--
Mit freundlichen Grüßen/Kind regards
Boris Fiuczynski
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Gregor Pillen
Geschäftsführung: David Faller
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294