On 11/2/18 3:48 AM, Erik Skultety wrote:
> On Thu, Nov 01, 2018 at 01:48:39PM -0400, John Ferlan wrote:
>> Commit cdbe1332 neglected to document the API. So let's add some
>> details about the algorithm and why it was used to help future
>> readers understand the issues encountered. Based largely off a
>> description provided by Erik Skultety <eskultet(a)redhat.com>.
>
> Oh, I missed ^this last sentence. Since you're blaming the commit already and
> the description based on a mailing list thread will not be part of the log, you
> should IMHO drop it.
>
>>
>> NB: Management of the processing udev device notification is a
>> delicate balance between the udev process, the scheduler, and when
>> exactly the data from/for the socket is received. The balance is
>> particularly important for environments when multiple devices are
>> added into the system more or less simultaneously such as is done
>> for mdev.
>
> "or SRIOV"
>
> Note: I can't really remember what I used for reproducer, mdevs or a SRIOV
> card.
>
>> In these cases scheduler blocking on the udev recv() call
>
> I don't think scheduler is blocking anywhere, it's rather old libudev
blocking
> on the recv call ;)
>
>> is more noticeable. It's expected that future devices will follow
>> similar algorithms. Even though the algorithm does present some
>> challenges for older OS's (such as Centos 6), trying to rewrite
>> the algorithm to fit both models would be more complex and involve
>> pulling the monitor object out of the private data lockable object
>> and would need to be guarded by a separate lock. Devising such an
>> algorithm to work around issues with older OS's at the expense
>> of more modern OS algorithms in newer event processing code may
>> result in unexpected issues, so the choice is to encourage use
>> of newer OS's with newer udev event processing code.
>>
>> Signed-off-by: John Ferlan <jferlan(a)redhat.com>
>> ---
>>
>> v1:
https://www.redhat.com/archives/libvir-list/2018-October/msg01053.html
>>
>> Changes are from code review with some minor tweaks/editing as I
>> went along. Mistakes are my own!
>>
>> src/node_device/node_device_udev.c | 28 ++++++++++++++++++++++++++++
>> 1 file changed, 28 insertions(+)
>>
>> diff --git a/src/node_device/node_device_udev.c
b/src/node_device/node_device_udev.c
>> index 22897591de..f2c2299d4d 100644
>> --- a/src/node_device/node_device_udev.c
>> +++ b/src/node_device/node_device_udev.c
>> @@ -1591,6 +1591,26 @@ udevEventMonitorSanityCheck(udevEventDataPtr priv,
>> }
>>
>>
>> +/**
>> + * udevEventHandleThread
>> + * @opaque: unused
>> + *
>> + * Thread to handle the udevEventHandleCallback processing when udev
>> + * tells us there's a device change for us (add, modify, delete, etc).
>> + *
>> + * Once notified there is data to be processed, the actual @device
>> + * data retrieval by libudev may be delayed due to how threads are
>> + * scheduled. In fact, the event loop could be scheduled earlier than
>> + * the handler thread, thus potentially emitting the very same event
>> + * the handler thread is currently trying to process, simply because
>> + * the data hadn't been retrieved from the socket.
>> + *
>
> ...
>
>> + * NB: Usage of event based socket algorithm causes some issues with
>> + * older platforms such as CentOS 6 in which the data socket is opened
>
> ^Sounds a bit too generic, as an event based algorithm is not forced to open a
> socket without O_NONBLOCK - just put something like "older platforms'
libudev
> opens sockets without the NONBLOCK flag which might cause issues with event
> based algorithm" or something alike in there.
>
* NB: Some older distros, such as CentOS 6, libudev opens sockets
* without the NONBLOCK flag which might cause issues with event
* based algorithm. Although the issue can be mitigated by resetting
* priv->dataReady for each event found; however, the scheduler issues
* would still come into play.