* Ryan Harper <ryanh(a)us.ibm.com> [2011-05-03 16:57]:
I've encountered an interesting scenario:
1. define a guest via virsh define <xml>
2. start this guest via virsh
3. one of the disk elements is a multipath device that is currently
misconfigured such that any io to the device hangs the calling process
4. libvirt times out when attempting to communicate via the monitor to
the guest (btw, this timeout isn't configurable AFAICT)
5. returns an error from create indicating that we failed to create the VM
At this point:
1) libvirt reports that the VM is stopped (and this is true, the qemu
process has never been issued the 'cont' command and thus won't ever
execute guest code)
2) the qemu process for this VM is still running (just blocked on IO)
3) if the process ever becomes unblocked, the QEMU process may be
functional again, but it won't be started and it won't be terminated,
since libvirt isn't tracking it any more; meanwhile it keeps consuming
the resources that were allocated at start up.
How can we clean up from this failure scenario? Would it make sense for
libvirt to send a SIGTERM to the qemu process if it failed to create? In the
above scenario, this would allow us to reap the process if it ever
became unblocked.
Looks like I completely missed
src/qemu/qemu_process.c:qemuProcessStop() which does indeed send SIGTERM
and SIGKILL.
This should be sufficient to clean up in the above case.
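
For anyone following along, here is a minimal sketch of the general
"SIGTERM first, then SIGKILL" pattern described above. It is not
libvirt's code: stop_process() and grace_seconds are hypothetical names
chosen for illustration, and the grace period is an arbitrary
assumption. Note that a process stuck in uninterruptible I/O generally
cannot act on either signal until the I/O completes, so the final wait
would only return once the process became unblocked.

```python
import signal
import subprocess
import sys


def stop_process(proc, grace_seconds=3.0):
    """Ask a child process to exit, escalating if it does not.

    Hypothetical helper: sends SIGTERM, waits up to grace_seconds,
    then sends SIGKILL and reaps the child. Returns the child's
    return code (on POSIX, the negative number of the fatal signal).
    """
    proc.terminate()  # SIGTERM on POSIX
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX
        # Blocks until the kernel can actually deliver the signal
        # and the child can be reaped.
        return proc.wait()


if __name__ == "__main__":
    # Stand-in for a stuck guest process: a child that just sleeps.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    print("return code:", stop_process(child))
```

On POSIX the return code distinguishes the two cases: the negative of
signal.SIGTERM if the polite request was enough, the negative of
signal.SIGKILL if escalation was needed.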
--
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
ryanh(a)us.ibm.com