On Tue, Feb 12, 2019 at 09:46 PM +0100, John Ferlan <jferlan(a)redhat.com> wrote:
On 2/7/19 11:08 AM, Marc Hartmayer wrote:
> Commit "nodedev: Move device enumumeration out of nodeStateInitialize"
> (9f0ae0b18e3e620) has moved the heavy task of device enumeration into
> a separate thread. The problem with this commit is that there is a
> functionality change in the cleanup when udevEnumerateDevices
> fails. Before commit 9f0ae0b18e3e620, the entire driver was cleaned up
> by calling nodeStateCleanup (defined in node_device_udev.c) which
> resulted in libvirtd stopping with the error message
> 'daemonRunStateInit:800 : Driver state initialization failed'. With
> the commit 9f0ae0b18e3e620 there is only a signal to the udev thread
> that it must stop. This means that for example the watch handle isn't
> removed from the main loop and this can result in the main loop
> consuming 100% CPU time as soon as a new udev event occurs.
>
> This patch proposes a simple solution to the described problem. In
> case the udev thread stops the watch handle is removed from the main
> loop.
>
> Fixes: 9f0ae0b18e3e620 ("nodedev: Move device enumumeration out of
nodeStateInitialize")
> Signed-off-by: Marc Hartmayer <mhartmay(a)linux.ibm.com>
> ---
>
> Note: I'm not sure whether we should stop libvirtd (as it would have
> been done before) or if this patch is already sufficient.
>
> ---
> src/node_device/node_device_udev.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
What you have seems reasonable - although I wonder if it would be better
to remove the event handle in the error of nodeStateInitializeEnumerate.
I think the assumption made was that by setting threadQuit that the next
event have some sort of failure through udevEventMonitorSanityCheck
which removes the EventHandle too. Although I cannot be sure it's been
too long to remember for sure ;-)
Furthermore, should nodeStateCleanup be altered to check for priv->watch
== -1 and thus not worry about the code to set threadQuit, signal, and
joining the thread.
A simple check for `priv->threadQuit` is probably even more useful since
it’s not safe that the watch callback is set.
I’ll send a v2 with the changes you suggested.
John
BTW: I'm subscribed to the list, no need to CC...
> diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c
> index b1e5f00067e8..299f55260129 100644
> --- a/src/node_device/node_device_udev.c
> +++ b/src/node_device/node_device_udev.c
> @@ -1628,6 +1628,13 @@ udevEventHandleThread(void *opaque ATTRIBUTE_UNUSED)
> }
>
> if (priv->threadQuit) {
> + if (priv->watch != -1) {
> + /* Since the udev thread getting stopped remove the
> + * watch handle from the main loop */
> + virEventRemoveHandle(priv->watch);
> + priv->watch = -1;
> + }
> +
> virObjectUnlock(priv);
> return;
> }
>
--
Kind regards / Beste Grüße
Marc Hartmayer
IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Matthias Hartmann
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294