On 09/18/2017 12:34 PM, Erik Skultety wrote:
The event loop may get scheduled earlier than the udev event handler
thread which means that it would keep invoking the handler callback with
"new" events, while in fact it's most likely still the same event which
the handler thread hasn't managed to remove from the socket queue yet.
This is due to unwanted increments of the number of events queuing on the
socket and causes the handler thread to spam the logs with an error about
libudev returning NULL (errno == EAGAIN).
Signed-off-by: Erik Skultety <eskultet(a)redhat.com>
---
src/node_device/node_device_udev.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c
index 96760d1e4..70e15ffb8 100644
--- a/src/node_device/node_device_udev.c
+++ b/src/node_device/node_device_udev.c
@@ -1696,6 +1696,7 @@ udevEventCheckMonitorFD(struct udev_monitor *udev_monitor,
static void
udevEventHandleThread(void *opaque)
{
+ udevPrivate *priv = NULL;
udevEventThreadDataPtr privateData = opaque;
struct udev_device *device = NULL;
struct udev_monitor *udev_monitor = NULL;
@@ -1715,6 +1716,7 @@ udevEventHandleThread(void *opaque)
virMutexUnlock(&privateData->lock);
nodeDeviceLock();
+ priv = driver->privateData;
udev_monitor = DRV_STATE_UDEV_MONITOR(driver);
if (!udevEventCheckMonitorFD(udev_monitor, privateData->monitor_fd)) {
@@ -1725,6 +1727,9 @@ udevEventHandleThread(void *opaque)
device = udev_monitor_receive_device(udev_monitor);
nodeDeviceUnlock();
+ /* Re-enable polling for new events on the @udev_monitor */
+ virEventUpdateHandle(priv->watch, VIR_EVENT_HANDLE_READABLE);
+
I think this should only be done when privateData->nevents == 0? If we
have multiple events to read, then calling virEventPollUpdateHandle,
(eventually) for every pass through the loop seems like a bit of
overkill especially if udevEventHandleCallback turns right around and
disables it again.
Also fortunately there isn't more than one udev thread sending the
events since you access the priv->watch without the driver lock...
Conversely perhaps we only disable if events > 1... Then again, how does
one get to 2 events queued if we're disabling as soon as we increment
nevents? Wouldn't this totally obviate the need for nevents?
I would think it would be overkill to disable/enable for just 1 event.
If multiple are queued, then yeah we're starting to get behind... Of
course that could effect the re-enable when events == 1 (or some
MAGIC_CONSTANT_NUMBER_THRESHOLD_TO_DISABLE_POLLING)
IIRC from the previous trip down this rabbit hole (I did briefly go back
and read the comments) that what you're trying to avoid is the following
message spamming the error logs because of a scheduling mishap... Thus
perhaps the following message could only be displayed if nevents > 1 and
nothing was found knowing that we have this "timing problem" between
udev, this thread, and the fd polling scanner of when a device should be
"found"... and then reset nevents == 0 with the appropriate lock.
Curious on your thoughts before an R-B/ACK though.
John
if (!device) {
virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
_("udev_monitor_receive_device returned NULL"));
@@ -1750,10 +1755,12 @@ udevEventHandleCallback(int watch ATTRIBUTE_UNUSED,
int events ATTRIBUTE_UNUSED,
void *opaque)
{
+ udevPrivate *priv = NULL;
struct udev_monitor *udev_monitor = NULL;
udevEventThreadDataPtr threadData = opaque;
nodeDeviceLock();
+ priv = driver->privateData;
udev_monitor = DRV_STATE_UDEV_MONITOR(driver);
if (!udevEventCheckMonitorFD(udev_monitor, fd)) {
@@ -1771,6 +1778,16 @@ udevEventHandleCallback(int watch ATTRIBUTE_UNUSED,
threadData->nevents++;
virCondSignal(&threadData->threadCond);
virMutexUnlock(&threadData->lock);
+
+ /* Due to scheduling, the eventloop might poll the udev monitor before the
+ * handler thread actually removes the data from the socket, thus causing it
+ * to spam errors about data not being ready yet, because
+ * udevEventHandleCallback would be invoked multiple times incrementing the
+ * counter of events queuing for a single event.
+ * Therefore we need to disable polling on the fd until the thread actually
+ * removes the data from the socket.
+ */
+ virEventUpdateHandle(priv->watch, 0);
}