At 04/12/2011 06:40 PM, Daniel P. Berrange wrote:
On Tue, Apr 12, 2011 at 06:33:21PM +0800, Hu Tao wrote:
> On Tue, Apr 12, 2011 at 06:22:12PM +0800, Wen Congyang wrote:
>> At 04/12/2011 01:51 PM, Hu Tao wrote:
>>> Sorry, I accidentally deleted the message body.
>>>
>>> I changed the code like this:
>>>
>>> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
>>> index c2a1f9a..8aad0b3 100644
>>> --- a/src/qemu/qemu_domain.c
>>> +++ b/src/qemu/qemu_domain.c
>>> @@ -514,7 +514,10 @@ int qemuDomainObjBeginJobWithDriver(struct qemud_driver *driver,
>>> else
>>> virReportSystemError(errno,
>>> "%s", _("cannot acquire job mutex"));
>>> + virDomainObjUnlock(obj);
>>> qemuDriverLock(driver);
>>> + virDomainObjLock(obj);
>>
>> We lock qemu_driver and vm in the following order:
>> 1. lock qemu_driver
>> 2. lock vm
>>
>> We try to lock qemu_driver while holding the vm's lock.
>> Oops, this will deadlock libvirtd. (I can reproduce this bug with your
>> script.)
>>
>>> + virDomainObjUnref(obj);
>>
>> We have already unref'd it, so do not unref it again.
>
> I didn't see that there is already a call to virDomainObjUnref. The patch
> should be:
>
> From 712883d0151222a276f777f473d96aa23ad5d9d6 Mon Sep 17 00:00:00 2001
> From: Hu Tao <hutao(a)cn.fujitsu.com>
> Date: Tue, 12 Apr 2011 18:29:27 +0800
> Subject: [PATCH] qemu: fix a dead-lock problem
>
> In qemuDomainObjBeginJobWithDriver, when virCondWaitUntil times out,
> the function calls qemuDriverLock while the virDomainObj is still
> locked, which can deadlock. Fix this by unlocking the virDomainObj
> before taking the driver lock and relocking it afterwards.
> ---
> src/qemu/qemu_domain.c | 6 ++++--
> 1 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
> index c2a1f9a..a947b4e 100644
> --- a/src/qemu/qemu_domain.c
> +++ b/src/qemu/qemu_domain.c
> @@ -506,15 +506,17 @@ int qemuDomainObjBeginJobWithDriver(struct qemud_driver *driver,
>
> while (priv->jobActive) {
> if (virCondWaitUntil(&priv->jobCond, &obj->lock, then) < 0) {
> - /* Safe to ignore value since ref count was incremented above */
> - ignore_value(virDomainObjUnref(obj));
> if (errno == ETIMEDOUT)
> qemuReportError(VIR_ERR_OPERATION_TIMEOUT,
> "%s", _("cannot acquire state change lock"));
> else
> virReportSystemError(errno,
> "%s", _("cannot acquire job mutex"));
> + virDomainObjUnlock(obj);
> qemuDriverLock(driver);
> + virDomainObjLock(obj);
> + /* Safe to ignore value since ref count was incremented above */
> + ignore_value(virDomainObjUnref(obj));
> return -1;
> }
> }
> --
ACK
Thanks. Applied.
Daniel