On 04/05/2018 12:17 PM, Jiri Denemark wrote:
On Thu, Apr 05, 2018 at 12:00:44 -0600, Chris Friesen wrote:
> I'm investigating something weird with libvirt 1.2.17 and qemu 2.3.0.
>
> I'm using the python bindings, and I seem to have a case where
> libvirtmod.virDomainCreateWithFlags() hung rather than returned. Then, about
> 15min later a subsequent call to libvirtmod.virDomainDestroy() from a different
> eventlet within the same process seems to have "unblocked" the original creation
> call, which raised an exception and an error code of
> libvirt.VIR_ERR_INTERNAL_ERROR. The virDomainDestroy() call came back with an
> error of "Requested operation is not valid: domain is not running".
>
> The corresponding qemu logs show the guest starting up and then a bit over 15min
> later there is a "shutting down" log. At shutdown time the libvirtd log shows
> "qemuMonitorIORead:609 : Unable to read from monitor: Connection reset by peer".
Looks like qemu is hung and is not responding to commands libvirt sends
to the QEMU's monitor socket. And since this happens while libvirt is in
the process of starting up the domain (it sends several commands to QEMU
before it starts the virtual CPU and considers the domain running), you
see a hanging virDomainCreateWithFlags API.
Seems plausible. The libvirt qemuDomainDestroyFlags() code seems to kill the
qemu process first before emitting the "domain is not running" error, so that
would fit with the logs.
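
For illustration, here is a minimal sketch of that sequence in Python. It uses stand-in objects only and makes no libvirt calls: FakeDomain, its create() and destroy() methods, and start_with_watchdog() are all hypothetical names, and the RuntimeError merely stands in for the internal error seen on the real create call. It models a start call that blocks until something else tears the guest down, plus a watchdog that issues the destroy after a timeout instead of waiting 15 minutes for another caller to do it.

```python
import threading


class FakeDomain:
    """Hypothetical stand-in for a guest whose startup never completes."""

    def __init__(self):
        self._killed = threading.Event()

    def create(self):
        # Blocks like the hung create call, until destroy() is invoked.
        self._killed.wait()
        # Stand-in for the error raised once the monitor connection drops.
        raise RuntimeError("internal error: monitor connection closed")

    def destroy(self):
        # Stand-in for killing the emulator process.
        self._killed.set()


def start_with_watchdog(start, destroy, timeout):
    """Run start() in a worker thread; call destroy() if it exceeds timeout.

    Returns (timed_out, error), where error is the exception raised by
    start(), or None if it returned normally or never finished.
    """
    result = {}

    def worker():
        try:
            result["value"] = start()
        except Exception as exc:  # report the failure to the caller
            result["error"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout)
    timed_out = thread.is_alive()
    if timed_out:
        destroy()
        # Give the blocked call a bounded time to unwind after the destroy.
        thread.join(timeout)
    return timed_out, result.get("error")


dom = FakeDomain()
timed_out, error = start_with_watchdog(dom.create, dom.destroy, timeout=0.1)
print(timed_out, repr(error))
```

With real bindings the blocking call would have to run in a native thread rather than an eventlet greenthread for a watchdog like this to get a chance to run; that part is an assumption about the calling application, not something the logs show.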
Of course now I have an unexplained qemu hang, which isn't much better. :)
Chris