[libvirt-users] Finding cause when "virsh list" hangs
Hi,

Did something dumb - had two VM hosts with DRBD mirroring of VMs on the same UPS, which failed and crashed them both. While I've got VMs running now on both, "virsh list" and "virsh start" and so on are just hanging. I'm not seeing it log anything in these instances - just hanging.

Both systems are Ubuntu 10.10, one with the stock libvirtd 0.8.3 and one with 0.9.12 compiled from source. They each brought up their autostart VMs okay. But I've got no working virsh shell on either now. The 0.9.12 host was, to complicate this report, okay with virsh initially, or "virsh list" anyway. But it hung with a "virsh start," and now "virsh list" fails too.

So ... what should I look for to have been left where by the crash that's making virsh hang?

Is there any way to get virsh to provide debugging info in a coffee-addled-friendly way? I've of course Googled "virsh list" hanging, but without finding anything that seems to directly apply to my case, although it's been seen before.

Thanks,
Whit
Additional note: On 0.8.3 virsh itself hangs. On 0.9.12 I can get a virsh shell, but "list" there (or "virsh list" from bash) hangs. In both cases there's nothing displayed in the hang at all. I've found reports of "virsh list" hanging mid-list from certain instances of possibly misconfigured VMs. But there's not even the header of the listing making it to screen here.

Whit

On Sun, Aug 19, 2012 at 02:19:21PM -0400, Whit Blauvelt wrote:
> Both systems are Ubuntu 10.10, one with the stock libvirtd 0.8.3 and one with 0.9.12 compiled from source. They each brought up their autostart VMs okay. But I've got no working virsh shell on either now. The 0.9.12 host was, to complicate this report, okay with virsh initially, or "virsh list" anyway. But it hung with a "virsh start," and now "virsh list" fails too.
On 08/19/2012 12:19 PM, Whit Blauvelt wrote:
> Hi,
>
> Did something dumb - had two VM hosts with DRBD mirroring of VMs on the same UPS, which failed and crashed them both. While I've got VMs running now on both, "virsh list" and "virsh start" and so on are just hanging. I'm not seeing it log anything in these instances - just hanging.
>
> Both systems are Ubuntu 10.10, one with the stock libvirtd 0.8.3 and one with 0.9.12 compiled from source. They each brought up their autostart VMs okay. But I've got no working virsh shell on either now. The 0.9.12 host was, to complicate this report, okay with virsh initially, or "virsh list" anyway. But it hung with a "virsh start," and now "virsh list" fails too.
Libvirt 0.8.3 didn't have any priority commands - a hang in one command (such as a failure to communicate with the qemu monitor due to confusion on running a domain twice) could starve all other commands. But we fixed that in the meantime, and libvirt 0.9.12 is supposed to be able to run 'virsh list' without any delay due to a hung low-priority command.
> So ... what should I look for to have been left where by the crash that's making virsh hang?
Do you have debugging symbols handy? At this point, a gdb backtrace would be the best place to look for clues.
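For anyone following along, one way to collect the backtrace Eric asks for is to attach gdb to the running daemon in batch mode. This is only a sketch: dump_backtraces, the GDB override variable and the default output path are made-up names for illustration, and it assumes gdb is installed and the hung process is libvirtd.

```shell
# Sketch only: dump_backtraces is a hypothetical helper, not part of libvirt.
# GDB may be overridden (e.g. for a dry run); it defaults to the real gdb.
dump_backtraces() {
    pid="$1"
    out="${2:-/tmp/backtrace-$pid.txt}"
    # -batch makes gdb exit after running its commands; "thread apply all bt"
    # prints a backtrace for every thread of the attached process.
    "${GDB:-gdb}" -p "$pid" -batch -ex "thread apply all bt" > "$out" 2>&1
    echo "$out"
}

# Example (as root, ideally with libvirt debugging symbols installed):
#   dump_backtraces "$(pidof libvirtd)"
```

The same helper can be pointed at the pid of a hung virsh client to see where the client side is waiting.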
> Is there any way to get virsh to provide debugging info in a coffee-addled-friendly way? I've of course Googled "virsh list" hanging, but without finding anything that seems to directly apply to my case, although it's been seen before.
I'm not sure I have the best suggestions today, but maybe someone else can also chime in. Good luck.

--
Eric Blake eblake@redhat.com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
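On the question of getting debugging output from virsh itself: the libvirt client library honours the LIBVIRT_DEBUG environment variable, and with it set to 1 it writes debug messages to stderr. A rough sketch of wrapping that with a time limit, so a hung command returns control to the shell, could look like this. virsh_debug, VIRSH, VIRSH_TIMEOUT and VIRSH_DEBUG_LOG are hypothetical names chosen for the example, and it assumes the coreutils timeout command is available.

```shell
# Sketch only: virsh_debug and its VIRSH* variables are hypothetical names.
virsh_debug() {
    log="${VIRSH_DEBUG_LOG:-/tmp/virsh-debug.log}"
    # LIBVIRT_DEBUG=1 asks the libvirt client library for debug-level
    # messages on stderr, which are captured in $log.
    # timeout kills the command if it has not finished in time.
    LIBVIRT_DEBUG=1 timeout "${VIRSH_TIMEOUT:-30}" "${VIRSH:-virsh}" "$@" 2> "$log"
    status=$?
    # timeout exits with status 124 when the time limit was hit.
    [ "$status" -eq 124 ] && echo "virsh $* timed out; see $log" >&2
    return "$status"
}

# Example:
#   virsh_debug list --all
```

The last lines of the log before the timeout should at least show which call the client was waiting on.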
Eric,

Thanks for taking the time to respond. Your explanation about the stuck queue makes sense.

The system with the more recent libvirt, I realized on closer inspection, was still using the original kvm. Once I switched the kvm symlink to /usr/local/bin/qemu-system-x86_64 and restarted libvirtd it became happy. I'd just made the dumb assumption that the default builds of both, letting them go in their default /usr/local locations, would work together automatically, what with /usr/local/bin being first in the path.

Not too hard a thing to adjust. But I wonder if in most cases where libvirt is being installed from source the object is to use it with a packaged version of kvm-qemu, or with a kvm-qemu also installed from source. If the latter, would it make more sense for it to invoke /usr/local/bin/qemu-system-x86_64 as its default rather than /usr/bin/kvm? Or would the trick - providing that libvirt isn't specifying the full path but just invoking "kvm" - be for the kvm-qemu source build to put a kvm symlink in /usr/local/bin by default, for libvirt to find it first, before the /usr/bin version?

Quibbles, I know. But with this area evolving so fast, the distros often fall behind. Having source install be easy can be a good thing.

Best,
Whit

On Tue, Aug 21, 2012 at 03:59:12PM -0600, Eric Blake wrote:
> On 08/19/2012 12:19 PM, Whit Blauvelt wrote:
>> Hi,
>>
>> Did something dumb - had two VM hosts with DRBD mirroring of VMs on the same UPS, which failed and crashed them both. While I've got VMs running now on both, "virsh list" and "virsh start" and so on are just hanging. I'm not seeing it log anything in these instances - just hanging.
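For readers who hit the same mismatch Whit describes, a quick check of what the kvm name actually resolves to may save some time. The sketch below is an illustration only: resolve_emulator is a made-up helper, and /usr/bin/kvm as the location of the symlink is an assumption taken from the path mentioned in the message, so adjust it to wherever kvm lives on the host.

```shell
# Sketch only: resolve_emulator is a hypothetical helper.
resolve_emulator() {
    path="$1"
    if [ -L "$path" ]; then
        # Symlink: show the final target after following all links.
        printf '%s -> %s\n' "$path" "$(readlink -f "$path")"
    elif [ -x "$path" ]; then
        printf '%s (regular executable)\n' "$path"
    else
        printf '%s (missing)\n' "$path"
        return 1
    fi
}

# Example (paths are assumptions based on the message above):
#   resolve_emulator /usr/bin/kvm
#   command -v kvm qemu-system-x86_64
# Repointing the symlink, as root, would then be something like:
#   ln -sfn /usr/local/bin/qemu-system-x86_64 /usr/bin/kvm
# followed by a restart of libvirtd.
```

Note that a domain's XML can also name its emulator binary by full path in the <emulator> element, so it is worth checking there as well before changing symlinks.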
participants (2): Eric Blake, Whit Blauvelt