
On Mon, Mar 01, 2021 at 15:30:58 +0000, Thanos Makatos wrote:
I'm trying to use QEMU master with libvirt 4.5 and QEMU seems to be hanging when I try to start a guest.
My environment is a modified CentOS 7.9 installation using libvirt 4.5.0. When I use a modified version of QEMU 2.12 (reasonably close to the stock CentOS version) everything works fine. When I try to use a fairly recent version of QEMU (e.g. v5.2.0-729-g89ff714f4b), I see the following processes:
qemu 118657 1.6 0.0 0 0 ? Z 14:50 0:00 [qemu-kvm] <defunct>
qemu 118664 0.0 0.0 207340 3560 ? Ssl 14:50 0:00 /usr/libexec/qemu-kvm -S -no-user-config -nodefaults -nographic -machine none,accel=kvm:tcg -qmp unix:/var/lib/libvirt/qemu/capabilities.monitor.sock,server,nowait -pidfile /var/lib/libvirt/qemu/capabilities.pidfile -daemonize
qemu 118666 0.0 0.0 275008 13916 ? Sl 14:50 0:00 /usr/libexec/qemu-kvm -S -no-user-config -nodefaults -nographic -machine none,accel=kvm:tcg -qmp unix:/var/lib/libvirt/qemu/capabilities.monitor.sock,server,nowait -pidfile /var/lib/libvirt/qemu/capabilities.pidfile -daemonize
These are possibly leftovers from libvirt's capability probing. I recall that we had some issue with capability-detection qemu instances getting stuck, but I don't remember the details any more.
/var/lib/libvirt/qemu/capabilities.pidfile contains the PID of the 2nd QEMU process. By experimenting I found that if I kill the QEMU process NOT in the PID file, the same thing happens again (maybe once or twice more) and then the guest boots fine.
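To make that check repeatable, here is a minimal sketch that lists the capability-probing processes whose PID is not the one recorded in the pidfile. It assumes a Linux /proc filesystem and identifies probing instances by the monitor socket name from the command lines above; the function names are made up for illustration and are not part of libvirt. Note that a defunct process has an empty command line, so it will not be matched.

```python
import os

# Path and marker are taken from the qemu-kvm command lines shown above.
PIDFILE = "/var/lib/libvirt/qemu/capabilities.pidfile"
MARKER = "capabilities.monitor.sock"


def read_pidfile(path):
    """Return the PID stored in the pidfile, or None if missing or unparsable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def list_processes(proc="/proc"):
    """Yield (pid, cmdline) for every process readable under /proc."""
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "cmdline"), "rb") as f:
                raw = f.read()
        except OSError:
            continue  # process exited in the meantime or is not readable
        # Arguments in /proc/<pid>/cmdline are NUL-separated.
        yield int(entry), raw.replace(b"\0", b" ").decode(errors="replace").strip()


def stray_probes(processes, recorded_pid, marker=MARKER):
    """Return PIDs of probing processes other than the one in the pidfile."""
    return [pid for pid, cmd in processes if marker in cmd and pid != recorded_pid]


if __name__ == "__main__":
    recorded = read_pidfile(PIDFILE)
    print("PID in pidfile:", recorded)
    for pid in stray_probes(list_processes(), recorded):
        print("probing process not in pidfile:", pid)
```

Running it as root (or as the qemu user) while the guest start is hanging should show which instance is the one not tracked by the pidfile.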
Does libvirt have some specific QEMU dependency? Is there some compatibility matrix I might have missed? I understand this may not be a supported configuration given that I'm not using vanilla libvirt/QEMU, however I'd appreciate some pointers so I can further debug this. I've tried debugging this in case there's some obvious error but didn't find anything interesting.
Any ideas?
In general, libvirt can't guarantee support of future qemu versions, as qemu is evolving and has recently also been deprecating and removing superseded configuration approaches, so you must cross-check with qemu's support policy. For debugging, libvirtd debug logs may be helpful. If a real VM gets stuck, the VM log file is useful as well, but in your case the capability-probing processes don't have one.
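As a starting point for the debug logs, something like the following in /etc/libvirt/libvirtd.conf should work, followed by a restart of libvirtd. This is a sketch that assumes your modified libvirt 4.5.0 still behaves like upstream in this respect; the filter string is the one upstream suggests for libvirt 4.4.0 and newer, and the output path is just a common choice.

```
# Log everything at debug level except a few very noisy modules.
log_filters="3:remote 4:event 3:util.json 3:rpc 1:*"
# Write the debug output to a file instead of the journal/syslog.
log_outputs="1:file:/var/log/libvirt/libvirtd.log"
```

The resulting log should show the QMP conversation with the capability-probing qemu instances, which is where I'd expect the hang to be visible.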