Investigating this further, it does seem that libvirtd is stuck in a
busy loop in its event loop.
It repeatedly reads from and writes to fd 10:
read(10, "\2\0\0\0\0\0\0\0", 16) = 8
write(10, "\1\0\0\0\0\0\0\0", 8) = 8
where fd 10 is:
10u a_inode 0,14 0 12812 [eventfd]
Any ideas on how to find the root cause of this?
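
One way to put numbers on the loop would be to save the strace output of the daemon and count syscalls per file descriptor. The following is only a minimal sketch: the helper name `count_fd_syscalls` is made up for illustration, and it assumes the lines look like the read()/write() calls above (optionally with a pid prefix from `strace -f`).

```python
import re
from collections import Counter

# Matches the start of an strace line such as: read(10, "...", 16) = 8
# An optional "[pid N] " or "N " prefix (as printed with -f) is skipped.
SYSCALL_RE = re.compile(
    r'^(?:\[pid\s+\d+\]\s+|\d+\s+)?(?P<name>\w+)\((?P<fd>\d+)[,)<]'
)

def count_fd_syscalls(lines):
    """Count (syscall name, fd) pairs in strace output lines.

    Hypothetical helper; only calls whose first argument is a plain
    file descriptor number are counted.
    """
    counts = Counter()
    for line in lines:
        m = SYSCALL_RE.match(line.strip())
        if m:
            counts[(m.group("name"), int(m.group("fd")))] += 1
    return counts

# The two calls quoted above, with the read repeated to stand in for a loop.
sample = [
    r'read(10, "\2\0\0\0\0\0\0\0", 16) = 8',
    r'write(10, "\1\0\0\0\0\0\0\0", 8) = 8',
    r'read(10, "\2\0\0\0\0\0\0\0", 16) = 8',
]

for (name, fd), n in count_fd_syscalls(sample).most_common():
    print(f"{name}(fd={fd}): {n}")
```

Run over a few seconds of real strace output, the most frequent (syscall, fd) pairs should show whether fd 10 really dominates or whether other descriptors are involved as well.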
-------- Forwarded message --------
Subject: Re: Libvirt slow after a couple of months uptime
Date: Fri, 11 Nov 2022 08:02:19 +0100
From: André Malm <admin(a)sheepa.org>
To: Peter Krempa <pkrempa(a)redhat.com>
Cc: libvirt-users(a)redhat.com
Hello,
Thanks for the reply. I've now run into a slow server again, and the
debug log prints around 500 of these lines per second:
2022-11-11 06:55:52.948+0000: 1470: debug : virEventGLibHandleDispatch:113 : Dispatch handler data=0x7f336402ac00 watch=7 fd=24 events=1 opaque=(nil)
2022-11-11 06:55:52.948+0000: 1470: info : virEventGLibHandleDispatch:116 : EVENT_GLIB_DISPATCH_HANDLE: watch=7 events=1 cb=0x7f33a824d610 opaque=(nil)
2022-11-11 06:55:52.948+0000: 1470: debug : virEventRunDefaultImpl:341 : running default event implementation
What might be causing this?
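
To see which watch and fd account for the dispatches, the log could be tallied per second. This is a rough sketch only: it assumes every entry is a single line in the format shown above, and `dispatch_rate` is a hypothetical helper name, not part of libvirt.

```python
import re
from collections import Counter

# timestamp: pid: level : function:line : message
LINE_RE = re.compile(
    r'^(?P<sec>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+[+-]\d{4}: '
    r'\d+: \w+ : (?P<func>\w+):\d+ : (?P<msg>.*)$'
)
WATCH_RE = re.compile(r'\bwatch=(\d+)')
FD_RE = re.compile(r'\bfd=(\d+)')

def dispatch_rate(lines):
    """Count virEventGLibHandleDispatch lines per (second, watch, fd).

    Only lines carrying both watch= and fd= are counted, so the
    accompanying EVENT_GLIB_DISPATCH_HANDLE info line is not counted twice.
    """
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        if not m or m.group("func") != "virEventGLibHandleDispatch":
            continue
        watch = WATCH_RE.search(m.group("msg"))
        fd = FD_RE.search(m.group("msg"))
        if watch and fd:
            key = (m.group("sec"), int(watch.group(1)), int(fd.group(1)))
            counts[key] += 1
    return counts

# The three log lines quoted above, each joined onto a single line.
sample = [
    "2022-11-11 06:55:52.948+0000: 1470: debug : virEventGLibHandleDispatch:113 : "
    "Dispatch handler data=0x7f336402ac00 watch=7 fd=24 events=1 opaque=(nil)",
    "2022-11-11 06:55:52.948+0000: 1470: info : virEventGLibHandleDispatch:116 : "
    "EVENT_GLIB_DISPATCH_HANDLE: watch=7 events=1 cb=0x7f33a824d610 opaque=(nil)",
    "2022-11-11 06:55:52.948+0000: 1470: debug : virEventRunDefaultImpl:341 : "
    "running default event implementation",
]

for (sec, watch, fd), n in dispatch_rate(sample).most_common(10):
    print(f"{sec} watch={watch} fd={fd}: {n} dispatches")
```

If a single (watch, fd) pair makes up nearly all of the roughly 500 lines per second, that fd can then be looked up in the daemon's open file list.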
On 2022-09-19 at 13:22, Peter Krempa wrote:
On Fri, Sep 16, 2022 at 19:41:28 +0200, André Malm wrote:
> Hello,
>
> I have some issues with libvirtd getting slow over time.
>
> After a fresh reboot (or systemctl restart libvirtd) virsh list /
> virt-install is fast, as expected, but after a couple of months
> uptime they
> both take a significantly longer time.
>
> Virsh list takes around 3 seconds (from 0.04s on a fresh reboot) and
> virt-install takes over a minute (from around a second).
>
> Running strace on virsh list it seems to get stuck in a loop on this:
> poll([{fd=5<socket:[173169773]>, events=POLLOUT},
> {fd=6<anon_inode:[eventfd]>, events=POLLIN}], 2, -1) = 2 ([{fd=5,
> revents=POLLOUT}, {fd=6, revents=POLLIN}])
Unfortunately this bit doesn't help much. virsh is simply a client
which does RPC over a UNIX socket to the libvirtd/virtqemud daemon,
depending on your host configuration.
This means that what you straced was simply an event loop waiting for the
communication with the server. In fact there's a whole thread simply for
polling and dispatching the calls so it's expected that it's always
stuck in a poll().
> While restarting libvirtd fixes it
So it looks like the problem isn't in virsh at all. In that case
stracing virsh won't help, as it's a completely different process
from the daemon.
> a restart takes around 1 minute where
> ebtables rules etc are recreated and it does interrupt the service. What
> could cause this? How would I troubleshoot this?
The best way to at least get an idea where the problem might be would be
to collect debug logs of the libvirt daemon (libvirtd/virtqemud based on
how your host is configured).
To enable debug logs you can use the following guide, which also
explains how to figure out which daemon is in use and also outlines how
to set it without restarting the daemon. Make sure to read the
appropriate chapters:
https://www.libvirt.org/kbase/debuglogs.html
The log contains timestamps so we'll be able to see what bogs down the
runtime once it's in the 'slow' period.
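
As a rough illustration of using those timestamps, the sketch below lists places where consecutive log lines are more than a threshold apart. It assumes the timestamp format seen in the debug log excerpt in this thread; `slow_gaps` and the 0.5 second threshold are arbitrary choices for the example, and because lines from different threads interleave, a gap is only a hint rather than a measurement of one call.

```python
from datetime import datetime

def parse_ts(line):
    """Return the leading timestamp of a libvirt log line, or None."""
    stamp = line.split(": ", 1)[0]
    try:
        return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f%z")
    except ValueError:
        return None

def slow_gaps(lines, threshold=0.5):
    """Yield (gap_seconds, line_before_gap) for gaps above threshold.

    Hypothetical helper; lines without a parsable timestamp are skipped.
    """
    prev_ts, prev_line = None, None
    for line in lines:
        ts = parse_ts(line)
        if ts is None:
            continue
        if prev_ts is not None:
            gap = (ts - prev_ts).total_seconds()
            if gap > threshold:
                yield gap, prev_line
        prev_ts, prev_line = ts, line

# First line is taken from the thread; the second is made up for illustration.
sample = [
    "2022-11-11 06:55:52.948+0000: 1470: debug : virEventRunDefaultImpl:341 : "
    "running default event implementation",
    "2022-11-11 06:55:55.000+0000: 1470: debug : virEventRunDefaultImpl:341 : "
    "running default event implementation",
]

for gap, line in slow_gaps(sample):
    print(f"{gap:.3f}s after: {line}")
```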