On 3/1/21 12:47 PM, Peter Krempa wrote:
On Mon, Mar 01, 2021 at 15:30:58 +0000, Thanos Makatos wrote:
> I'm trying to use QEMU master with libvirt 4.5 and QEMU seems to be hanging
> when I try to start a guest.
>
> My environment is a modified CentOS 7.9 installation using libvirt 4.5.0. When
> I use a modified version of QEMU 2.12 (reasonably close to the stock CentOS
> version) everything works fine. When I try to use a fairly recent version of
> QEMU (e.g. v5.2.0-729-g89ff714f4b), the guest fails to start and I see these
> processes:
>
> qemu 118657 1.6 0.0 0 0 ? Z 14:50 0:00 [qemu-kvm] <defunct>
> qemu 118664 0.0 0.0 207340 3560 ? Ssl 14:50 0:00 /usr/libexec/qemu-kvm -S -no-user-config -nodefaults -nographic -machine none,accel=kvm:tcg -qmp unix:/var/lib/libvirt/qemu/capabilities.monitor.sock,server,nowait -pidfile /var/lib/libvirt/qemu/capabilities.pidfile -daemonize
> qemu 118666 0.0 0.0 275008 13916 ? Sl 14:50 0:00 /usr/libexec/qemu-kvm -S -no-user-config -nodefaults -nographic -machine none,accel=kvm:tcg -qmp unix:/var/lib/libvirt/qemu/capabilities.monitor.sock,server,nowait -pidfile /var/lib/libvirt/qemu/capabilities.pidfile -daemonize
These are possibly leftovers from libvirt's capability probing. I
recall that we had some issue with capability detection qemu instances
getting stuck but I don't remember the details any more.
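
To tell the probing instances apart from each other, it can help to filter
the process list mechanically. What follows is only a minimal Python sketch,
not anything libvirt ships: it assumes `ps aux`-style lines without a header,
matches on the capabilities.monitor.sock argument visible in the listing
above, and takes the PID recorded in the -pidfile file as a parameter. The
names parse_ps_aux, stray_probes and SAMPLE are made up for illustration, and
the PID passed in the usage lines is an assumption, not something the report
states.

```python
from typing import List, NamedTuple

# Marker copied from the command lines in the report. Treating it as the
# signature of a capability-probing QEMU is an assumption of this sketch.
PROBE_MARKER = "capabilities.monitor.sock"

PROBE_CMD = (
    "/usr/libexec/qemu-kvm -S -no-user-config -nodefaults -nographic "
    "-machine none,accel=kvm:tcg "
    "-qmp unix:/var/lib/libvirt/qemu/capabilities.monitor.sock,server,nowait "
    "-pidfile /var/lib/libvirt/qemu/capabilities.pidfile -daemonize"
)

# The three processes from the report, one per line.
SAMPLE = "\n".join([
    "qemu 118657 1.6 0.0 0 0 ? Z 14:50 0:00 [qemu-kvm] <defunct>",
    "qemu 118664 0.0 0.0 207340 3560 ? Ssl 14:50 0:00 " + PROBE_CMD,
    "qemu 118666 0.0 0.0 275008 13916 ? Sl 14:50 0:00 " + PROBE_CMD,
])


class Proc(NamedTuple):
    pid: int
    stat: str
    command: str


def parse_ps_aux(text: str) -> List[Proc]:
    """Parse `ps aux`-style lines into Proc tuples, skipping anything else."""
    procs = []
    for line in text.splitlines():
        # USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # header or malformed line
        procs.append(Proc(int(fields[1]), fields[7], fields[10]))
    return procs


def stray_probes(procs: List[Proc], pidfile_pid: int) -> List[int]:
    """Return PIDs of live probe processes that the PID file does not name."""
    return [
        p.pid
        for p in procs
        if PROBE_MARKER in p.command
        and not p.stat.startswith("Z")  # zombies are already dead
        and p.pid != pidfile_pid
    ]


if __name__ == "__main__":
    # 118666 is an illustrative guess at the PID stored in the PID file.
    print(stray_probes(parse_ps_aux(SAMPLE), pidfile_pid=118666))
```

On a live system the input would come from something like `ps aux` and the
PID from reading /var/lib/libvirt/qemu/capabilities.pidfile. Whether killing
the reported PIDs is safe is a separate question that this sketch does not
answer.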
Not sure if this is related to the reported problem, but upstream QEMU,
when compiled with a trace backend, was having problems daemonizing. I
experienced a similar issue at the start of the year when running
upstream Libvirt with upstream QEMU.
I posted a patch fixing it in QEMU [1] but it hasn't been pushed as of
today. I am not sure if the problem was fixed in another way or if the
patch I posted ended up left behind.
Thanos, if you're compiling QEMU with a trace backend (e.g. with
--enable-trace-backend= in ../configure) it might be worth compiling it
without this option. There is a chance you're hitting the same problem I
mentioned above.
[1] https://lists.gnu.org/archive/html/qemu-devel/2021-01/msg00541.html
Thanks,
DHB
> /var/lib/libvirt/qemu/capabilities.pidfile contains the PID of the 2nd QEMU
> process and by experimenting I found that by killing the QEMU process NOT in
> the PID file the same thing happens again (maybe once more or twice) and then
> the guest boots fine.
>
> Does libvirt have some specific QEMU dependency? Is there some compatibility
> matrix I might have missed? I understand this may not be a supported
> configuration given that I'm not using vanilla libvirt/QEMU, however I'd
> appreciate some pointers so I can further debug this. I've tried debugging
> this in case there's some obvious error but didn't find anything
> interesting.
>
> Any ideas?
In general, libvirt can't guarantee support for future qemu versions, as
qemu keeps evolving and has recently also been deprecating and removing
superseded configuration approaches, so you must cross-check with qemu's
support policy.
For debugging, debug logs may be helpful. If a real VM were getting
stuck, the VM log file would help too, but in your case the capability
processes don't have one.