[PATCH 0/6] nodedev: Follow up fixes

Yesterday I've posted patches that allowed me to start the daemon. But today I tried listing some nodedevs. It uncovered some dormant bugs we had (e.g. 3/6 and 5/6) and one new (6/6). Michal Prívozník (6): nodedev: Rename nodeDeviceWaitInit() nodedev: Wait for device initialization in nodeDeviceCreate() nodedev: Signal initCond with driver locked nodedev: Introduce nodeDeviceInitComplete() nodedev: Mark device initialization complete even in case of an error nodedev: Don't fail device enumeration if MDEVCTL is missing src/node_device/node_device_driver.c | 47 +++++++++++++++++++++------- src/node_device/node_device_driver.h | 2 ++ src/node_device/node_device_udev.c | 7 ++--- 3 files changed, 40 insertions(+), 16 deletions(-) -- 2.26.3

The consensus is to put the verb last. Therefore, the new name is nodeDeviceInitWait(). This allows us to introduce new function (done later in a separate commit) that will "complete" the device initialization. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/node_device/node_device_driver.c | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index c11b8d89ab..4678a0fc01 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -158,7 +158,7 @@ nodeDeviceUnlock(void) static int -nodeDeviceWaitInit(void) +nodeDeviceInitWait(void) { nodeDeviceLock(); while (!driver->initialized) { @@ -183,7 +183,7 @@ nodeNumOfDevices(virConnectPtr conn, virCheckFlags(0, -1); - if (nodeDeviceWaitInit() < 0) + if (nodeDeviceInitWait() < 0) return -1; return virNodeDeviceObjListNumOfDevices(driver->devs, conn, cap, @@ -203,7 +203,7 @@ nodeListDevices(virConnectPtr conn, virCheckFlags(0, -1); - if (nodeDeviceWaitInit() < 0) + if (nodeDeviceInitWait() < 0) return -1; return virNodeDeviceObjListGetNames(driver->devs, conn, @@ -222,7 +222,7 @@ nodeConnectListAllNodeDevices(virConnectPtr conn, if (virConnectListAllNodeDevicesEnsureACL(conn) < 0) return -1; - if (nodeDeviceWaitInit() < 0) + if (nodeDeviceInitWait() < 0) return -1; return virNodeDeviceObjListExport(conn, driver->devs, devices, @@ -254,7 +254,7 @@ nodeDeviceLookupByName(virConnectPtr conn, virNodeDeviceDefPtr def; virNodeDevicePtr device = NULL; - if (nodeDeviceWaitInit() < 0) + if (nodeDeviceInitWait() < 0) return NULL; if (!(obj = nodeDeviceObjFindByName(name))) @@ -285,7 +285,7 @@ nodeDeviceLookupSCSIHostByWWN(virConnectPtr conn, virCheckFlags(0, NULL); - if (nodeDeviceWaitInit() < 0) + if (nodeDeviceInitWait() < 0) return NULL; if (!(obj = virNodeDeviceObjListFindSCSIHostByWWNs(driver->devs, @@ -844,7 +844,7 @@ nodeDeviceCreateXML(virConnectPtr conn, virCheckFlags(0, NULL); - if (nodeDeviceWaitInit() < 0) + if (nodeDeviceInitWait() < 0) return NULL; virt_type = virConnectGetType(conn); @@ -1144,7 +1144,7 @@ nodeDeviceDestroy(virNodeDevicePtr device) g_autofree char *wwpn = NULL; unsigned int parent_host; - if (nodeDeviceWaitInit() < 0) + if (nodeDeviceInitWait() < 0) return -1; if (!(obj = nodeDeviceObjFindByName(device->name))) @@ -1288,7 +1288,7 @@ nodeDeviceDefineXML(virConnect *conn, virCheckFlags(0, NULL); - if (nodeDeviceWaitInit() < 0) + if (nodeDeviceInitWait() < 0) return NULL; virt_type = virConnectGetType(conn); @@ -1350,7 +1350,7 @@ nodeDeviceUndefine(virNodeDevice *device, virCheckFlags(0, -1); - if (nodeDeviceWaitInit() < 0) + if (nodeDeviceInitWait() < 0) return -1; if (!(obj = nodeDeviceObjFindByName(device->name))) @@ -1446,7 +1446,7 @@ nodeConnectNodeDeviceEventRegisterAny(virConnectPtr conn, if (virConnectNodeDeviceEventRegisterAnyEnsureACL(conn) < 0) return -1; - if (nodeDeviceWaitInit() < 0) + if (nodeDeviceInitWait() < 0) return -1; if (virNodeDeviceEventStateRegisterID(conn, driver->nodeDeviceEventState, @@ -1465,7 +1465,7 @@ nodeConnectNodeDeviceEventDeregisterAny(virConnectPtr conn, if (virConnectNodeDeviceEventDeregisterAnyEnsureACL(conn) < 0) return -1; - if (nodeDeviceWaitInit() < 0) + if (nodeDeviceInitWait() < 0) return -1; if (virObjectEventStateDeregisterID(conn, -- 2.26.3

On Tue, Apr 13, 2021 at 12:01:52PM +0200, Michal Privoznik wrote:
The consensus is to put the verb last. Therefore, the new name is nodeDeviceInitWait(). This allows us to introduce new function (done later in a separate commit) that will "complete" the device initialization.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- Reviewed-by: Erik Skultety <eskultet@redhat.com>

Although I have not experienced this in real life, there is a possible race condition when creating new device. If the nodedev driver is still enumerating devices (in a separate thread) and virNodeDeviceCreate() is called then it can lead to spurious results because the device enumeration thread is removing devices from or adding them to the internal list of devices (among with their states). Therefore, wait for things to settle down before proceeding with nodeDeviceCreate(). Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/node_device/node_device_driver.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index 4678a0fc01..bc8a758c1c 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -1399,6 +1399,9 @@ nodeDeviceCreate(virNodeDevice *device, virCheckFlags(0, -1); + if (nodeDeviceInitWait() < 0) + return -1; + if (!(obj = nodeDeviceObjFindByName(device->name))) return -1; -- 2.26.3

On Tue, Apr 13, 2021 at 12:01:53PM +0200, Michal Privoznik wrote:
Although I have not experienced this in real life, there is a possible race condition when creating new device. If the nodedev driver is still enumerating devices (in a separate thread) and virNodeDeviceCreate() is called then it can lead to spurious results because the device enumeration thread is removing devices from or adding them to the internal list of devices (among with their states).
Therefore, wait for things to settle down before proceeding with nodeDeviceCreate().
Hmm, correct. Looking at the source - nodeDeviceGetXMLDesc, nodeDeviceGetParent and nodeDeviceNumOfCaps are the only exceptions as far as public API mappings go. Would you mind extending this patch? Erik

On 4/13/21 4:14 PM, Erik Skultety wrote:
On Tue, Apr 13, 2021 at 12:01:53PM +0200, Michal Privoznik wrote:
Although I have not experienced this in real life, there is a possible race condition when creating new device. If the nodedev driver is still enumerating devices (in a separate thread) and virNodeDeviceCreate() is called then it can lead to spurious results because the device enumeration thread is removing devices from or adding them to the internal list of devices (among with their states).
Therefore, wait for things to settle down before proceeding with nodeDeviceCreate().
Hmm, correct. Looking at the source - nodeDeviceGetXMLDesc, nodeDeviceGetParent and nodeDeviceNumOfCaps are the only exceptions as far as public API mappings go. Would you mind extending this patch?
Good catch. In fact there's a fourth one - nodeDeviceListCaps. Is it okay to fix locally or do you want me to send v2? Michal

This is more academic dispute than a real bug, but this is taken from pthread_cond_broadcast(3p) man: The pthread_cond_broadcast() or pthread_cond_signal() functions may be called by a thread whether or not it currently owns the mutex that threads calling pthread_cond_wait() or pthread_cond_timedwait() have associated with the condition variable during their waits; however, if predictable scheduling behavior is required, then that mutex shall be locked by the thread calling pthread_cond_broadcast() or pthread_cond_signal(). Therefore, broadcast the initCond while the nodedev driver is still locked. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/node_device/node_device_udev.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index 44f96f5ff9..20a13211a0 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -1986,8 +1986,8 @@ nodeStateInitializeEnumerate(void *opaque) nodeDeviceLock(); driver->initialized = true; - nodeDeviceUnlock(); virCondBroadcast(&driver->initCond); + nodeDeviceUnlock(); return; -- 2.26.3

On Tue, Apr 13, 2021 at 12:01:54PM +0200, Michal Privoznik wrote:
This is more academic dispute than a real bug, but this is taken from pthread_cond_broadcast(3p) man:
The pthread_cond_broadcast() or pthread_cond_signal() functions may be called by a thread whether or not it currently owns the mutex that threads calling pthread_cond_wait() or pthread_cond_timedwait() have associated with the condition variable during their waits; however, if predictable scheduling behavior is required, then that mutex shall be locked by the thread calling pthread_cond_broadcast() or pthread_cond_signal().
Therefore, broadcast the initCond while the nodedev driver is still locked.
It is consistent with what we do elsewhere. Reviewed-by: Erik Skultety <eskultet@redhat.com>

When the device enumeration thread finishes it sets the driver->initialized boolean and signals condition to wake up other threads that are waiting for the initialization to complete. Move this code into a separate function so that it can be re-used from other places too. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/node_device/node_device_driver.c | 10 ++++++++++ src/node_device/node_device_driver.h | 2 ++ src/node_device/node_device_udev.c | 5 +---- 3 files changed, 13 insertions(+), 4 deletions(-) diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index bc8a758c1c..2bb83c19f2 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -157,6 +157,16 @@ nodeDeviceUnlock(void) } +void +nodeDeviceInitComplete(void) +{ + nodeDeviceLock(); + driver->initialized = true; + virCondBroadcast(&driver->initCond); + nodeDeviceUnlock(); +} + + static int nodeDeviceInitWait(void) { diff --git a/src/node_device/node_device_driver.h b/src/node_device/node_device_driver.h index 8a935ffed6..7160f551db 100644 --- a/src/node_device/node_device_driver.h +++ b/src/node_device/node_device_driver.h @@ -39,6 +39,8 @@ nodeDeviceLock(void); void nodeDeviceUnlock(void); +void nodeDeviceInitComplete(void); + extern virNodeDeviceDriverStatePtr driver; int diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index 20a13211a0..68547c6986 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -1984,10 +1984,7 @@ nodeStateInitializeEnumerate(void *opaque) if (nodeDeviceUpdateMediatedDevices() != 0) goto error; - nodeDeviceLock(); - driver->initialized = true; - virCondBroadcast(&driver->initCond); - nodeDeviceUnlock(); + nodeDeviceInitComplete(); return; -- 2.26.3

On Tue, Apr 13, 2021 at 12:01:55PM +0200, Michal Privoznik wrote:
When the device enumeration thread finishes it sets the driver->initialized boolean and signals condition to wake up other threads that are waiting for the initialization to complete. Move this code into a separate function so that it can be re-used from other places too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- ....
diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index 20a13211a0..68547c6986 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -1984,10 +1984,7 @@ nodeStateInitializeEnumerate(void *opaque) if (nodeDeviceUpdateMediatedDevices() != 0) goto error;
- nodeDeviceLock(); - driver->initialized = true; - virCondBroadcast(&driver->initCond); - nodeDeviceUnlock(); + nodeDeviceInitComplete();
well, it will be re-used in the same function after patch 5. I suggest going with the old style - introducing a fallthrough cleanup label and adding a 'goto cleanup' to the 'error' section. Patch 5 will not be needed in that case. Erik

On 4/13/21 4:14 PM, Erik Skultety wrote:
On Tue, Apr 13, 2021 at 12:01:55PM +0200, Michal Privoznik wrote:
When the device enumeration thread finishes it sets the driver->initialized boolean and signals condition to wake up other threads that are waiting for the initialization to complete. Move this code into a separate function so that it can be re-used from other places too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- ....
diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index 20a13211a0..68547c6986 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -1984,10 +1984,7 @@ nodeStateInitializeEnumerate(void *opaque) if (nodeDeviceUpdateMediatedDevices() != 0) goto error;
- nodeDeviceLock(); - driver->initialized = true; - virCondBroadcast(&driver->initCond); - nodeDeviceUnlock(); + nodeDeviceInitComplete();
well, it will be re-used in the same function after patch 5. I suggest going with the old style - introducing a fallthrough cleanup label and adding a 'goto cleanup' to the 'error' section. Patch 5 will not be needed in that case.
Fair enough. Michal

To speed up nodedev driver initialization, the device enumeration is done in a separate thread. Once finished, the thread sets a boolean variable that allows public APIs to be called (instead of waiting for the thread to finish). However, if there's an error in the device enumeration thread then the control jumps over at the 'error' label and the boolean is never set. This means, that any virNodeDev*() API is stuck forever. Mark the initialization as complete (the thread is quitting anyway) and let the APIs proceed. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/node_device/node_device_udev.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index 68547c6986..1c5b80ea58 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -1995,6 +1995,8 @@ nodeStateInitializeEnumerate(void *opaque) priv->threadQuit = true; virCondSignal(&priv->threadCond); virObjectUnlock(priv); + + nodeDeviceInitComplete(); } -- 2.26.3

After all devices were enumerated, the enumeration thread call nodeDeviceUpdateMediatedDevices() to refresh the state of mediated devices. This means that 'mdevctl' will be executed. But it may be missing on some systems (e.g. mine) in which case we should just skip the update of mdevs instead of failing whole device enumeration. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/node_device/node_device_driver.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index 2bb83c19f2..fab830c596 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -42,9 +42,12 @@ #include "virnetdev.h" #include "virutil.h" #include "vircommand.h" +#include "virlog.h" #define VIR_FROM_THIS VIR_FROM_NODEDEV +VIR_LOG_INIT("node_device.node_device_driver"); + virNodeDeviceDriverStatePtr driver; virDrvOpenStatus @@ -1605,6 +1608,13 @@ nodeDeviceUpdateMediatedDevices(void) virMdevctlForEachData data = { 0, }; size_t i; + /* The code below assumes MDEVCTL to exist. Well, if it doesn't then exit + * early. */ + if (!virFileExists(MDEVCTL)) { + VIR_DEBUG(MDEVCTL " does not exist. Skipping update of mediated devices."); + return 0; + } + if ((data.ndefs = virMdevctlListDefined(&defs, &errmsg)) < 0) { virReportError(VIR_ERR_INTERNAL_ERROR, _("failed to query mdevs from mdevctl: %s"), errmsg); -- 2.26.3

On Tue, Apr 13, 2021 at 12:01:57PM +0200, Michal Privoznik wrote:
After all devices were enumerated, the enumeration thread call nodeDeviceUpdateMediatedDevices() to refresh the state of mediated devices. This means that 'mdevctl' will be executed. But it may be missing on some systems (e.g. mine) in which case we should just skip the update of mdevs instead of failing whole device enumeration.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/node_device/node_device_driver.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index 2bb83c19f2..fab830c596 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -42,9 +42,12 @@ #include "virnetdev.h" #include "virutil.h" #include "vircommand.h" +#include "virlog.h"
#define VIR_FROM_THIS VIR_FROM_NODEDEV
+VIR_LOG_INIT("node_device.node_device_driver"); + virNodeDeviceDriverStatePtr driver;
virDrvOpenStatus @@ -1605,6 +1608,13 @@ nodeDeviceUpdateMediatedDevices(void) virMdevctlForEachData data = { 0, }; size_t i;
+ /* The code below assumes MDEVCTL to exist. Well, if it doesn't then exit + * early. */
This commentary can be dropped as the check is self explanatory.
+ if (!virFileExists(MDEVCTL)) {
I don't think ^this is correct, when you don't have mdevctl installed, MDEVCTL == "mdevctl" which won't exist even after you install the program, so update won't work properly until you reconfigure meson and rebuild. You need to do virFindFileInPath so that everything starts transparently working. I'll give you an example where this can get annoying. I just tested this on a machine where I had mdevctl installed previously, so I removed it, but I had some mdevs defined. Because mdevctl is now missing, udev reports some mdev devices, but that's it. After I re-installed mdevctl back I expected libvirtd to at least be able to define a device, but I got an error from mdevctl that that device already is defined, but libvirtd doesn't know it's persistent because /etc/mdevctl.d was never scanned, so I suddenly see devices as transient and have to manually remove the mdevctl definitions. Erik
+ VIR_DEBUG(MDEVCTL " does not exist. Skipping update of mediated devices."); + return 0;
s/does not exist/not found/
+ } + if ((data.ndefs = virMdevctlListDefined(&defs, &errmsg)) < 0) { virReportError(VIR_ERR_INTERNAL_ERROR, _("failed to query mdevs from mdevctl: %s"), errmsg); -- 2.26.3

On 4/13/21 4:58 PM, Erik Skultety wrote:
On Tue, Apr 13, 2021 at 12:01:57PM +0200, Michal Privoznik wrote:
After all devices were enumerated, the enumeration thread call nodeDeviceUpdateMediatedDevices() to refresh the state of mediated devices. This means that 'mdevctl' will be executed. But it may be missing on some systems (e.g. mine) in which case we should just skip the update of mdevs instead of failing whole device enumeration.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/node_device/node_device_driver.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c index 2bb83c19f2..fab830c596 100644 --- a/src/node_device/node_device_driver.c +++ b/src/node_device/node_device_driver.c @@ -42,9 +42,12 @@ #include "virnetdev.h" #include "virutil.h" #include "vircommand.h" +#include "virlog.h"
#define VIR_FROM_THIS VIR_FROM_NODEDEV
+VIR_LOG_INIT("node_device.node_device_driver"); + virNodeDeviceDriverStatePtr driver;
virDrvOpenStatus @@ -1605,6 +1608,13 @@ nodeDeviceUpdateMediatedDevices(void) virMdevctlForEachData data = { 0, }; size_t i;
+ /* The code below assumes MDEVCTL to exist. Well, if it doesn't then exit + * early. */
This commentary can be dropped as the check is self explanatory.
+ if (!virFileExists(MDEVCTL)) {
I don't think ^this is correct, when you don't have mdevctl installed, MDEVCTL == "mdevctl" which won't exist even after you install the program, so update won't work properly until you reconfigure meson and rebuild. You need to do virFindFileInPath so that everything starts transparently working.
Okay.
I'll give you an example where this can get annoying. I just tested this on a machine where I had mdevctl installed previously, so I removed it, but I had some mdevs defined. Because mdevctl is now missing, udev reports some mdev devices, but that's it. After I re-installed mdevctl back I expected libvirtd to at least be able to define a device, but I got an error from mdevctl that that device already is defined, but libvirtd doesn't know it's persistent because /etc/mdevctl.d was never scanned, so I suddenly see devices as transient and have to manually remove the mdevctl definitions.
Fair enough, but is this even supposed to work? I mean, if you install/uninstall a tool without libvirtd restart then you shouldn't expect things to work. We use binaries, because some tools don't provide libraries. If mdevctl was a library then no such thing as install/uninstall without libvirtd would be possible. Either it would be still mapped into libvirtd (even after you've removed it from disk), or it won't be loaded after you've installed it. I'll switch over to virFindFileInPath(), though. Michal

On Tue, Apr 13, 2021 at 05:33:49PM +0200, Michal Privoznik wrote:
On 4/13/21 4:58 PM, Erik Skultety wrote:
After all devices were enumerated, the enumeration thread call nodeDeviceUpdateMediatedDevices() to refresh the state of mediated devices. This means that 'mdevctl' will be executed. But it may be missing on some systems (e.g. mine) in which case we should just skip the update of mdevs instead of failing whole device enumeration.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- ... I'll give you an example where this can get annoying. I just tested this on a machine where I had mdevctl installed previously, so I removed it, but I had some mdevs defined. Because mdevctl is now missing, udev reports some mdev devices, but that's it. After I re-installed mdevctl back I expected libvirtd to at least be able to define a device, but I got an error from mdevctl that
On Tue, Apr 13, 2021 at 12:01:57PM +0200, Michal Privoznik wrote: that device already is defined, but libvirtd doesn't know it's persistent because /etc/mdevctl.d was never scanned, so I suddenly see devices as transient and have to manually remove the mdevctl definitions.
Fair enough, but is this even supposed to work? I mean, if you install/uninstall a tool without libvirtd restart then you shouldn't expect things to work. We use binaries, because some tools don't provide libraries. If mdevctl was a library then no such thing as install/uninstall without libvirtd would be possible. Either it would be still mapped into libvirtd (even after you've removed it from disk), or it won't be loaded after you've installed it.
Well, I guess that I disagree with one particular part of your answer - "if you install/uninstall a tool without libvirtd restart then you shouldn't expect things to work." Actually, when we declare some tools as runtime dependencies, then yes, I'd actually expect libvirt to have checks in place for the respective binaries and stop erroring out when it manages to find those after having it installed or conversely start erroring out when the tool disappears. And for the record, ^this case was broken even after libvirtd restart.
I'll switch over to virFindFileInPath(), though.
Thanks. Erik
participants (2)
-
Erik Skultety
-
Michal Privoznik