On 03/29/2011 04:17 AM, Wen Congyang wrote:
When I edit the domain's config file like this:
=====================
<disk type='file' device='disk'>
<driver name='qemu' type='qcow2'/>
<source file='/var/lib/libvirt/images/test3.img'/>
<target dev='sdb' bus='scsi'/>
<address type='drive' controller='0' bus='0' unit='10'/>
</disk>
=====================
Note, the unit is wrong, but libvirt does not check it.
When I start the vm with the wrong config file, libvirtd blocks because
qemu quit unexpectedly. This bug does not happen every time; it has only happened
once on my box.
So I tried using gdb and adding sleep() to trigger this bug. I have posted two patches
to fix 2 bugs, but there is still another bug, and I have no good way to fix it.
Did you find a way to work around this yet? The typical solution is to
temporarily add another reference to an object that you intend to still
use after regaining locks, so that even if the qemu process dies in the
meantime, it does not free the last reference and therefore does not
delete the object in its cleanup code.
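As a rough illustration of that pattern (not libvirt's actual code), here is a
minimal single-threaded sketch. Monitor, monitor_ref(), monitor_unref(),
on_eof() and use_monitor_across_unlock() are all hypothetical names made up
for this example; the reference count is a plain int rather than an atomic, and
the lock release/re-acquire is only marked by comments.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a monitor object; not libvirt's real type. */
typedef struct {
    int refs;   /* reference count (plain int; real code needs locking or atomics) */
    int fd;     /* pretend file descriptor */
} Monitor;

static int monitors_freed = 0;  /* lets callers observe when the free happens */

static Monitor *monitor_new(int fd)
{
    Monitor *mon = malloc(sizeof(*mon));
    if (!mon)
        return NULL;
    mon->refs = 1;              /* the owner's reference */
    mon->fd = fd;
    return mon;
}

static void monitor_ref(Monitor *mon)
{
    mon->refs++;
}

/* Drops one reference and frees the object when the last one goes away.
 * Returns the number of references remaining. */
static int monitor_unref(Monitor *mon)
{
    int remaining = --mon->refs;
    if (remaining == 0) {
        monitors_freed++;
        free(mon);
    }
    return remaining;
}

/* Simulates the cleanup that runs when qemu dies: the owner clears its
 * pointer and drops its reference. */
static void on_eof(Monitor **owner_slot)
{
    Monitor *mon = *owner_slot;
    *owner_slot = NULL;
    monitor_unref(mon);
}

/* Uses the monitor across a window in which the cleanup runs.
 * Returns the fd read through our own reference. */
static int use_monitor_across_unlock(Monitor **owner_slot)
{
    Monitor *mon = *owner_slot;
    int freed_before = monitors_freed;
    int fd;

    monitor_ref(mon);           /* extra reference taken before dropping locks */
    /* ... locks would be released here ... */
    on_eof(owner_slot);         /* qemu dies while we are unlocked */
    /* ... locks would be re-acquired here ... */

    assert(monitors_freed == freed_before);  /* cleanup did not free it */
    fd = mon->fd;               /* still valid thanks to our reference */
    monitor_unref(mon);         /* drop ours; this is now the last reference */
    return fd;
}
```

Note that the caller keeps using its own local pointer, not the owner's slot,
since the cleanup code is still free to clear the latter.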
Steps to reproduce this bug:
1. use gdb to attach to libvirtd, and set a breakpoint in the function
qemuConnectMonitor()
2. start a vm
3. let libvirtd run until qemuMonitorSetCapabilities() returns.
4. kill the qemu process
5. step into qemuDomainObjExitMonitorWithDriver(), and set debug to 1
Now, qemuDomainObjExitMonitorWithDriver() will sleep 100s to make sure
qemuProcessHandleMonitorEOF() is done before qemuDomainObjExitMonitorWithDriver()
returns.
priv->mon will be null after qemuDomainObjExitMonitorWithDriver() returns,
so we must not use it. Unfortunately we still use it, which causes
libvirtd to crash.
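The failure mode in the steps above can be sketched in a few lines. This is
only a simplified model under stated assumptions: Monitor, DomainPriv,
simulate_eof(), exit_monitor() and connect_monitor() are hypothetical stand-ins,
not the real libvirt structures or functions, and the qemu_died flag replaces
the actual race. It shows a caller re-checking the pointer after leaving the
monitor instead of dereferencing it blindly; that guard is one possible
mitigation, not necessarily the right fix.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not libvirt's real structures. */
typedef struct { int capabilities_set; } Monitor;
typedef struct { Monitor *mon; } DomainPriv;

/* Simulates the EOF handler running while the caller waits to re-lock:
 * it clears the domain's monitor pointer. */
static void simulate_eof(DomainPriv *priv)
{
    priv->mon = NULL;
}

/* Simulates leaving the monitor; qemu_died says whether the EOF handler
 * got to run before this function returned. */
static void exit_monitor(DomainPriv *priv, int qemu_died)
{
    if (qemu_died)
        simulate_eof(priv);
}

/* Returns 0 on success, -1 if the monitor vanished in the meantime. */
static int connect_monitor(DomainPriv *priv, int qemu_died)
{
    assert(priv->mon != NULL);          /* valid on entry */
    priv->mon->capabilities_set = 1;    /* work done inside the monitor */
    exit_monitor(priv, qemu_died);

    if (priv->mon == NULL)              /* re-check before any further use */
        return -1;
    return priv->mon->capabilities_set ? 0 : -1;
}
```

Without the NULL check, the second dereference of priv->mon is exactly the
access that crashes when qemu has died.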
Sounds like qemuConnectMonitor needs an extra reference around priv->mon
for the duration of the connect attempt, so that
qemuProcessHandleMonitorEOF will not free the monitor.
--
Eric Blake eblake(a)redhat.com +1-801-349-2682
Libvirt virtualization library
http://libvirt.org