On 11/2/18 3:48 AM, Erik Skultety wrote:
On Thu, Nov 01, 2018 at 01:48:39PM -0400, John Ferlan wrote:
> Commit cdbe1332 neglected to document the API. So let's add some
> details about the algorithm and why it was used to help future
> readers understand the issues encountered. Based largely off a
> description provided by Erik Skultety <eskultet(a)redhat.com>.
Oh, I missed ^this last sentence. Since you're blaming the commit already and
the description based on a mailing list thread will not be part of the log, you
should IMHO drop it.
>
> NB: Management of the processing udev device notification is a
> delicate balance between the udev process, the scheduler, and when
> exactly the data from/for the socket is received. The balance is
> particularly important for environments when multiple devices are
> added into the system more or less simultaneously such as is done
> for mdev.
"or SRIOV"
Note: I can't really remember what I used for reproducer, mdevs or a SRIOV
card.
> In these cases scheduler blocking on the udev recv() call
I don't think scheduler is blocking anywhere, it's rather old libudev blocking
on the recv call ;)
> is more noticeable. It's expected that future devices will follow
> similar algorithms. Even though the algorithm does present some
> challenges for older OS's (such as Centos 6), trying to rewrite
> the algorithm to fit both models would be more complex and involve
> pulling the monitor object out of the private data lockable object
> and would need to be guarded by a separate lock. Devising such an
> algorithm to work around issues with older OS's at the expense
> of more modern OS algorithms in newer event processing code may
> result in unexpected issues, so the choice is to encourage use
> of newer OS's with newer udev event processing code.
>
> Signed-off-by: John Ferlan <jferlan(a)redhat.com>
> ---
>
> v1:
https://www.redhat.com/archives/libvir-list/2018-October/msg01053.html
>
> Changes are from code review with some minor tweaks/editing as I
> went along. Mistakes are my own!
>
> src/node_device/node_device_udev.c | 28 ++++++++++++++++++++++++++++
> 1 file changed, 28 insertions(+)
>
> diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c
> index 22897591de..f2c2299d4d 100644
> --- a/src/node_device/node_device_udev.c
> +++ b/src/node_device/node_device_udev.c
> @@ -1591,6 +1591,26 @@ udevEventMonitorSanityCheck(udevEventDataPtr priv,
> }
>
>
> +/**
> + * udevEventHandleThread
> + * @opaque: unused
> + *
> + * Thread to handle the udevEventHandleCallback processing when udev
> + * tells us there's a device change for us (add, modify, delete, etc).
> + *
> + * Once notified there is data to be processed, the actual @device
> + * data retrieval by libudev may be delayed due to how threads are
> + * scheduled. In fact, the event loop could be scheduled earlier than
> + * the handler thread, thus potentially emitting the very same event
> + * the handler thread is currently trying to process, simply because
> + * the data hadn't been retrieved from the socket.
> + *
...
> + * NB: Usage of event based socket algorithm causes some issues with
> + * older platforms such as CentOS 6 in which the data socket is opened
^Sounds a bit too generic, as an event based algorithm is not forced to open a
socket without O_NONBLOCK - just put something like "older platforms' libudev
opens sockets without the NONBLOCK flag which might cause issues with event
based algorithm" or something alike in there.
* NB: Some older distros, such as CentOS 6, libudev opens sockets
* without the NONBLOCK flag which might cause issues with event
* based algorithm. Although the issue can be mitigated by resetting
* priv->dataReady for each event found; however, the scheduler issues
* would still come into play.
Should I drop the "Although the issue..." as well? IDC... mainly trying
to avoid the "trap" of patches looking to fix older distros...
Tks -
John
otherwise looks okay, I don't think a v3 is necessary:
Reviewed-by: Erik Skultety <eskultet(a)redhat.com>
> + * without the NONBLOCK bit set. That can be avoided by moving the reset
> + * priv->dataReady to just after the udev_monitor_receive_device; however,
> + * scheduler issues would still come into play.
> + */
> static void
> udevEventHandleThread(void *opaque ATTRIBUTE_UNUSED)
> {
> @@ -1637,6 +1657,9 @@ udevEventHandleThread(void *opaque ATTRIBUTE_UNUSED)
> return;
> }
>
> + /* Trying to move the reset of the @priv->dataReady flag to
> + * after the udev_monitor_receive_device wouldn't help much
> + * due to event mgmt and scheduler timing. */
> virObjectLock(priv);
> priv->dataReady = false;
> virObjectUnlock(priv);
> @@ -1646,6 +1669,11 @@ udevEventHandleThread(void *opaque ATTRIBUTE_UNUSED)
>
> udevHandleOneDevice(device);
> udev_device_unref(device);
> +
> + /* Instead of waiting for the next event after processing @device
> + * data, let's keep reading from the udev monitor and only wait
> + * for the next event once either a EAGAIN or a EWOULDBLOCK error
> + * is encountered. */
> }
> }
>
> --
> 2.17.2
>
> --
> libvir-list mailing list
> libvir-list(a)redhat.com
>
https://www.redhat.com/mailman/listinfo/libvir-list