On Tue, Jun 20, 2017 at 17:03:32 +0200, Erik Skultety wrote:
If we find ourselves in the situation that the 'add' uevent
has been
fired earlier than the sysfs tree for a device, we'd better wait for the
attributes to be ready rather than discarding the device from our device
list forever.
You'd have to wait for an infinite amount of time to guarantee this.
This patch just makes it slightly more probable that libvirt will find
the device.
Resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=1463285
Signed-off-by: Erik Skultety <eskultet(a)redhat.com>
---
src/node_device/node_device_udev.c | 48 +++++++++++++++++++++++++++++++++++++-
1 file changed, 47 insertions(+), 1 deletion(-)
diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c
index 174124a1b..4f9ca0c67 100644
--- a/src/node_device/node_device_udev.c
+++ b/src/node_device/node_device_udev.c
@@ -60,6 +60,43 @@ struct _udevPrivate {
};
+/**
+ * udevWaitForAttrs:
+ * @sys_path: node device base path
+ * @...: attributes to wait for, last attribute must be NULL
+ *
+ * Takes a list of attributes to wait for, waits until all of them are
+ * available, unless the max number of tries (10) has been reached.
+ *
+ * Returns 0 if all attributes became available, -1 on error.
Since this function waits in order to see whether the individual
components are available I don't really see a point in having the
varargs here. A simple helper that only waits for 1 file (basically
formats the path and waits) would be way better.
+ */
+static int
+udevWaitForAttrs(const char *sys_path, ...)
+{
+ int ret = -1;
+ const char *attr = NULL;
+ char *attr_path = NULL;
+ va_list args;
+
+ va_start(args, sys_path);
+ while ((attr = va_arg(args, char *))) {
+ if (virAsprintf(&attr_path, "%s/%s", sys_path, attr) < 0)
+ goto cleanup;
+
+ if (virFileWaitForAccess(attr_path, 100, 10) < 0)
So this waits up to 1 second per file in rather long increments (100
ms) which I don't think is really desired.
The only prior art here which I think is somewhat relevant is the
waiting code for netdevs, where a 1 ms timeout with 100 retries is used.
Also note that this will delay the event loop since the function is
called by udevEventHandleCallback which is registered in the event loop.
This is definitely unaceptable. NACK to this approach
+ goto cleanup;
+
+ VIR_FREE(attr_path);
+ }
+
+ ret = 0;
+ cleanup:
+ va_end(args);
+ VIR_FREE(attr_path);
+ return ret;
+}
+
+
static bool
udevHasDeviceProperty(struct udev_device *dev,
const char *key)
@@ -1113,12 +1150,21 @@ udevProcessMediatedDevice(struct udev_device *dev,
{
int ret = -1;
const char *uuidstr = NULL;
+ const char *devpath = udev_device_get_syspath(dev);
int iommugrp = -1;
char *linkpath = NULL;
char *canonicalpath = NULL;
virNodeDevCapMdevPtr data = &def->caps->data.mdev;
- if (virAsprintf(&linkpath, "%s/mdev_type",
udev_device_get_syspath(dev)) < 0)
+ /* Because of a kernel uevent race, we might get the 'add' event prior to
+ * the sysfs tree being ready, so any attempt to access any sysfs attribute
+ * would result in ENOENT and us dropping the device, so let's work around
+ * it by waiting for the attributes to become available.
+ */
+ if (udevWaitForAttrs(devpath, "mdev_type", "iommu_group", NULL)
< 0)
+ goto cleanup;
This call formats the same string ...
+
+ if (virAsprintf(&linkpath, "%s/mdev_type", devpath) < 0)
goto cleanup;
.. as this and then throws it away, just to be reformatted in the same
way.