On Tue, Feb 12, 2019 at 09:46 PM +0100, John Ferlan <jferlan(a)redhat.com> wrote:
On 2/7/19 11:08 AM, Marc Hartmayer wrote:
> Commit "nodedev: Move device enumumeration out of nodeStateInitialize"
> (9f0ae0b18e3e620) has moved the heavy task of device enumeration into
> a separate thread. The problem with this commit is that there is a
> functionality change in the cleanup when udevEnumerateDevices
> fails. Before commit 9f0ae0b18e3e620, the entire driver was cleaned up
> by calling nodeStateCleanup (defined in node_device_udev.c) which
> resulted in libvirtd stopping with the error message
> 'daemonRunStateInit:800 : Driver state initialization failed'. With
> the commit 9f0ae0b18e3e620 there is only a signal to the udev thread
> that it must stop. This means that for example the watch handle isn't
> removed from the main loop and this can result in the main loop
> consuming 100% CPU time as soon as a new udev event occurs.
>
> This patch proposes a simple solution to the described problem. In
> case the udev thread stops the watch handle is removed from the main
> loop.
>
> Fixes: 9f0ae0b18e3e620 ("nodedev: Move device enumumeration out of
nodeStateInitialize")
> Signed-off-by: Marc Hartmayer <mhartmay(a)linux.ibm.com>
> ---
>
> Note: I'm not sure whether we should stop libvirtd (as it would have
> been done before) or if this patch is already sufficient.
>
> ---
> src/node_device/node_device_udev.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
[…snip…]
Furthermore, should nodeStateCleanup be altered to check for
priv->watch
== -1 and thus not worry about the code to set threadQuit, signal, and
joining the thread.
Hmm, I looked at the code again and it seems like your suggested change
could be a small performance improvement but it makes the code not more
readable. The current code is not wrong/problematic because changing
`priv->threadQuit` to true changes nothing. virCondSignal doesn’t block
and virThreadJoin/pthread_join returns immediately if the passed thread
has already terminated
(
http://man7.org/linux/man-pages/man3/pthread_join.3.html).
The join would also still be needed with your proposed change since
otherwise there would be a (possible) race condition between
nodeStateCleanup() and setting `priv->threadQuit`.
John
BTW: I'm subscribed to the list, no need to CC...
> diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c
> index b1e5f00067e8..299f55260129 100644
> --- a/src/node_device/node_device_udev.c
> +++ b/src/node_device/node_device_udev.c
> @@ -1628,6 +1628,13 @@ udevEventHandleThread(void *opaque ATTRIBUTE_UNUSED)
> }
>
> if (priv->threadQuit) {
> + if (priv->watch != -1) {
> + /* Since the udev thread getting stopped remove the
> + * watch handle from the main loop */
> + virEventRemoveHandle(priv->watch);
> + priv->watch = -1;
> + }
> +
> virObjectUnlock(priv);
> return;
> }
>
--
Kind regards / Beste Grüße
Marc Hartmayer
IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Matthias Hartmann
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294