On 08/07/2012 07:34 PM, Daniel P. Berrange wrote:
On Tue, Aug 07, 2012 at 03:18:38PM +0800, Alex Jia wrote:
> * src/qemu/qemu_domain.c (qemuDomainObjExitAgentInternal): fix crashing
> libvirtd due to derefing a NULL pointer.
>
> For details, please see bug:
> RHBZ:
https://bugzilla.redhat.com/show_bug.cgi?id=845966
>
> Signed-off-by: Alex Jia<ajia(a)redhat.com>
> ---
> src/qemu/qemu_domain.c | 10 ++++++----
> 1 files changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
> index 86f0265..8667b6c 100644
> --- a/src/qemu/qemu_domain.c
> +++ b/src/qemu/qemu_domain.c
> @@ -1136,12 +1136,14 @@ qemuDomainObjExitAgentInternal(struct qemud_driver *driver,
> virDomainObjPtr obj)
> {
> qemuDomainObjPrivatePtr priv = obj->privateData;
> - int refs;
> + int refs = -1;
>
> - refs = qemuAgentUnref(priv->agent);
> + if (priv->agent) {
> + refs = qemuAgentUnref(priv->agent);
>
> - if (refs> 0)
> - qemuAgentUnlock(priv->agent);
> + if (refs> 0)
> + qemuAgentUnlock(priv->agent);
> + }
>
> if (driver_locked)
> qemuDriverLock(driver);
I'm not convinced this is the right fix. The whole point of the Enter/ExitAgent
methods is to hold an extra reference on priv->agent, so that it is *not*
deleted while a agent command is run.
What is setting priv->agent to NULL while the command is still active ?
In fact, the command 'guest-suspend-disk' is freed by virJSONValueFree()
in qemuAgentSuspend() after the command is successfully sent via
'qemuAgentCommand()':
(gdb) s
qemuDomainPMSuspendForDuration (dom=<value optimized out>, target=1,
duration=<value optimized out>, flags=<value optimized out>) at
qemu/qemu_driver.c:13123
13123 qemuDomainObjExitAgent(driver, vm);
(gdb) p *vm
$68 = {object = {magic = 3405643788, refs = 4, klass = 0x7f4ce815a9b0},
lock = {lock = {__data = {__lock = 1, __count = 0, __owner = 20285,
__nusers = 1, __kind = 0, __spins = 0, __list = {__prev = 0x0,
__next = 0x0}}, __size =
"\001\000\000\000\000\000\000\000=O\000\000\001", '\000' <repeats 26
times>, __align = 1}}, pid = 20379, *state = {state = 4, reason = 0}*,
autostart = 0, persistent = 1,
updated = 0, def = 0x7f4ce815e500, newDef = 0x7f4ce8069b80, snapshots
= {objs = 0x7f4ce815e240, metaroot = {def = 0x0, parent = 0x0, sibling =
0x0, nchildren = 0, first_child = 0x0}},
current_snapshot = 0x0, hasManagedSave = false, privateData =
0x7f4ce8154660, privateDataFreeFunc = 0x7f4cefada190
<qemuDomainObjPrivateFree>, taint = 4}
(gdb) s
13122 ret = qemuAgentSuspend(priv->agent, target);
(gdb) p *priv
$70 = {job = {cond = {cond = {__data = {__lock = 0, __futex = 0,
__total_seq = 0, __wakeup_seq = 0, __woken_seq = 0, __mutex = 0x0,
__nwaiters = 0, __broadcast_seq = 0}, __size = '\000' <repeats 47 times>,
__align = 0}}, active = QEMU_JOB_MODIFY, owner = 20286,
asyncCond = {cond = {__data = {__lock = 0, __futex = 0, __total_seq = 0,
__wakeup_seq = 0, __woken_seq = 0, __mutex = 0x0, __nwaiters = 0,
__broadcast_seq = 0}, __size = '\000' <repeats 47 times>,
__align = 0}}, asyncJob = QEMU_ASYNC_JOB_NONE, asyncOwner = 0, phase =
0, mask = 0, start = 0, dump_memory_only = false, info = {
type = 0, timeElapsed = 0, timeRemaining = 0, dataTotal = 0,
dataProcessed = 0, dataRemaining = 0, memTotal = 0, memProcessed = 0,
memRemaining = 0, fileTotal = 0, fileProcessed = 0,
fileRemaining = 0}}, mon = 0x7f4ce80a3570, monConfig = 0x0,
monJSON = 1, monError = false, monStart = 0, *agent = 0x0*, agentError =
false, agentStart = 1344402957193, gotShutdown = true,
beingDestroyed = false, pidfile = 0x7f4ce816ba90
"/var/run/libvirt/qemu/myRHEL6.pid", nvcpupids = 1, vcpupids =
0x7f4ce8146be0, pciaddrs = 0x7f4ce8171b30, persistentAddrs = 1, qemuCaps
= 0x7f4ce80b4030,
lockState = 0x0, fakeReboot = false, jobs_queued = 1, migMaxBandwidth
= 32, origname = 0x0, cons = 0x7f4ce8165ce0, cleanupCallbacks = 0x0,
ncleanupCallbacks = 0, ncleanupCallbacks_max = 0}
(gdb) p priv->agent
$71 =*(qemuAgentPtr) 0x0*
(gdb) s
1138 refs = qemuAgentUnref(priv->agent);
(gdb) s
qemuAgentUnref (mon=0x0) at qemu/qemu_agent.c:168
(gdb) s
170 VIR_DEBUG("%d", mon->refs);
(gdb) s
168 {
(gdb) s
169 mon->refs--;
(gdb) s
Program received signal SIGSEGV, Segmentation fault.
qemuAgentUnref (*mon=0x0*) at qemu/qemu_agent.c:169
169 *mon->refs--*;
(gdb) s
virNetServerFatalSignal (sig=11, siginfo=0x7f4cf748f630,
context=0x7f4cf748f500) at rpc/virnetserver.c:296
In addition, the old qemuAgentUnref(mon) hasn't judge whether its
parameter is NULL then will deref a NULL pointer, I should simply fix it
in qemuAgentUnref(), for example, if 'mon' is NULL then directly return.
Fortunately, your commit b57ee09 potentially fix this issue via using
virObjectUnref() instead of qemuAgentUnref(), if the parameter
'priv->agent' is NULL then the virObjectUnref(priv->agent) will directly
return false:
bool virObjectUnref(void *anyobj)
{
virObjectPtr obj = anyobj;
if (!obj)
return false;
......
}
Regards,
Alex
Daniel