[libvirt] [PATCH 0/2] Avoid issue when VM quits while attempting connection to the guest agent.

Peter Krempa (2):
  DO NOT APPLY UPSTREAM: Reproducer
  qemu: Avoid operations on NULL monitor if VM fails early

 src/qemu/qemu_process.c | 36 +++++++++++++++++++++++++++---------
 1 file changed, 27 insertions(+), 9 deletions(-)

-- 
1.8.5.2

Apply this patch and run a patched libvirt daemon. Then start a VM and kill
its process after a few seconds:

 # virsh start VM & sleep 3; killall -9 qemu-kvm

The upstream version doesn't crash, but the virsh command returns a cryptic
error message:

 error: invalid argument: monitor must not be NULL

0.10.2 and similar downstream versions segfault:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffed467700 (LWP 29256)]
__pthread_mutex_lock (mutex=0x10) at pthread_mutex_lock.c:50
50        unsigned int type = PTHREAD_MUTEX_TYPE (mutex);
Missing separate debuginfos, use: debuginfo-install VirtualBox-4.2-4.2.22_91556_el6-1.x86_64
(gdb) bt
#0  __pthread_mutex_lock (mutex=0x10) at pthread_mutex_lock.c:50
#1  0x00007fffe891585e in qemuDomainObjEnterMonitorInternal (driver=0x7fffe00858f0,
    driver_locked=true, obj=0x7fffe01689c0, asyncJob=<value optimized out>)
    at qemu/qemu_domain.c:1008
#2  0x00007fffe892662a in qemuProcessDetectVcpuPIDs (driver=0x7fffe00858f0,
    vm=0x7fffe01689c0) at qemu/qemu_process.c:1831
#3  0x00007fffe892c417 in qemuProcessStart (conn=0x7fffdc000ae0, driver=0x7fffe00858f0,
    vm=0x7fffe01689c0, migrateFrom=0x0, stdin_fd=-1, stdin_path=0x0, snapshot=0x0,
    vmop=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=1) at qemu/qemu_process.c:4022
#4  0x00007fffe8973f3e in qemuDomainObjStart (conn=0x7fffdc000ae0, driver=0x7fffe00858f0,
    vm=0x7fffe01689c0, flags=<value optimized out>) at qemu/qemu_driver.c:6025
---
 src/qemu/qemu_process.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 9331744..a27eded 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -243,6 +243,8 @@ qemuConnectAgent(virQEMUDriverPtr driver, virDomainObjPtr vm)
                           config,
                           &agentCallbacks);
 
+    sleep(10);
+
     virObjectLock(vm);
     priv->agentStart = 0;
 
-- 
1.8.5.2
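To make the window the reproducer widens easier to see, here is a rough sketch of the surrounding flow in qemuConnectAgent; the unlock call and the exact ordering are assumptions inferred from the hunk context above, not lines taken verbatim from the tree:

  /* The domain lock is dropped around the blocking agent connection
   * attempt.  If the QEMU process is killed inside this window, its death
   * is processed while we are unlocked and the domain's monitor object is
   * freed before the lock below is re-acquired. */
  virObjectUnlock(vm);                       /* "the locks are down" */
  agent = qemuAgentOpen(vm, config, &agentCallbacks);
  sleep(10);                                 /* reproducer: widen the window */
  virObjectLock(vm);                         /* the VM may be gone by now */
  priv->agentStart = 0;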

https://bugzilla.redhat.com/show_bug.cgi?id=1047659

If a VM dies very early during an attempted connection to the guest agent,
while the locks are down, the domain monitor object will be freed. The object
is then accessed later, as any failure during guest agent startup isn't
considered fatal.

In the current upstream version this doesn't lead to a crash, as virObjectLock,
called when entering the monitor in qemuProcessDetectVcpuPIDs, checks the
pointer before attempting to dereference (lock) it. The NULL pointer is then
caught in the monitor helper code.

Before the introduction of virObjectLockable - observed on 0.10.2 - the
pointer is locked directly via virMutexLock, leading to a crash.

To avoid this problem we need to differentiate between the guest agent not
being present and the VM quitting while the locks were down. The fix
reorganizes the code in qemuConnectAgent to add the check and then adds
special handling to the callers.
---
 src/qemu/qemu_process.c | 34 +++++++++++++++++++++++++---------
 1 file changed, 25 insertions(+), 9 deletions(-)

diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index a27eded..cf23ff3 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -248,6 +248,17 @@ qemuConnectAgent(virQEMUDriverPtr driver, virDomainObjPtr vm)
     virObjectLock(vm);
     priv->agentStart = 0;
 
+    if (agent == NULL)
+        virObjectUnref(vm);
+
+    if (!virDomainObjIsActive(vm)) {
+        qemuAgentClose(agent);
+        virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                       _("guest crashed while connecting to the guest agent"));
+        ret = -2;
+        goto cleanup;
+    }
+
     if (virSecurityManagerClearSocketLabel(driver->securityManager,
                                            vm->def) < 0) {
         VIR_ERROR(_("Failed to clear security context for agent for %s"),
@@ -255,13 +266,7 @@ qemuConnectAgent(virQEMUDriverPtr driver, virDomainObjPtr vm)
         goto cleanup;
     }
 
-    if (agent == NULL)
-        virObjectUnref(vm);
-
-    if (!virDomainObjIsActive(vm)) {
-        qemuAgentClose(agent);
-        goto cleanup;
-    }
     priv->agent = agent;
 
     if (priv->agent == NULL) {
@@ -3120,6 +3125,7 @@ qemuProcessReconnect(void *opaque)
     int reason;
     virQEMUDriverConfigPtr cfg;
     size_t i;
+    int ret;
 
     memcpy(&oldjob, &data->oldjob, sizeof(oldjob));
 
@@ -3144,7 +3150,10 @@ qemuProcessReconnect(void *opaque)
         goto error;
 
     /* Failure to connect to agent shouldn't be fatal */
-    if (qemuConnectAgent(driver, obj) < 0) {
+    if ((ret = qemuConnectAgent(driver, obj)) < 0) {
+        if (ret == -2)
+            goto error;
+
         VIR_WARN("Cannot connect to QEMU guest agent for %s",
                  obj->def->name);
         virResetLastError();
@@ -4018,7 +4027,10 @@ int qemuProcessStart(virConnectPtr conn,
         goto cleanup;
 
     /* Failure to connect to agent shouldn't be fatal */
-    if (qemuConnectAgent(driver, vm) < 0) {
+    if ((ret = qemuConnectAgent(driver, vm)) < 0) {
+        if (ret == -2)
+            goto cleanup;
+
         VIR_WARN("Cannot connect to QEMU guest agent for %s",
                  vm->def->name);
         virResetLastError();
@@ -4478,6 +4490,7 @@ int qemuProcessAttach(virConnectPtr conn ATTRIBUTE_UNUSED,
     virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver);
     virCapsPtr caps = NULL;
     bool active = false;
+    int ret;
 
     VIR_DEBUG("Beginning VM attach process");
 
@@ -4592,7 +4605,10 @@ int qemuProcessAttach(virConnectPtr conn ATTRIBUTE_UNUSED,
         goto error;
 
     /* Failure to connect to agent shouldn't be fatal */
-    if (qemuConnectAgent(driver, vm) < 0) {
+    if ((ret = qemuConnectAgent(driver, vm)) < 0) {
+        if (ret == -2)
+            goto error;
+
         VIR_WARN("Cannot connect to QEMU guest agent for %s",
                  vm->def->name);
         virResetLastError();
-- 
1.8.5.2
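For readers skimming the hunks: the return-value contract this patch gives qemuConnectAgent, and the matching caller pattern, condensed here from the diff above into a single block with explanatory comments (this is the qemuProcessStart variant; the reconnect and attach callers jump to their error label instead):

  /* qemuConnectAgent() after this patch:
   *    0  success
   *   -1  agent setup failed, but the domain is still alive -> warn only
   *   -2  the domain died while the locks were down         -> fatal    */
  if ((ret = qemuConnectAgent(driver, vm)) < 0) {
      if (ret == -2)
          goto cleanup;                      /* abort the startup path */

      /* Failure to connect to agent shouldn't be fatal */
      VIR_WARN("Cannot connect to QEMU guest agent for %s", vm->def->name);
      virResetLastError();
  }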

On 14.01.2014 19:31, Peter Krempa wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=1047659
If a VM dies very early during an attempted connection to the guest agent, while the locks are down, the domain monitor object will be freed. The object is then accessed later, as any failure during guest agent startup isn't considered fatal.
In the current upstream version this doesn't lead to a crash, as virObjectLock, called when entering the monitor in qemuProcessDetectVcpuPIDs, checks the pointer before attempting to dereference (lock) it. The NULL pointer is then caught in the monitor helper code.
Before the introduction of virObjectLockable - observed on 0.10.2 - the pointer is locked directly via virMutexLock, leading to a crash.
To avoid this problem we need to differentiate between the guest agent not being present and the VM quitting while the locks were down. The fix reorganizes the code in qemuConnectAgent to add the check and then adds special handling to the callers.
---
 src/qemu/qemu_process.c | 34 +++++++++++++++++++++++++---------
 1 file changed, 25 insertions(+), 9 deletions(-)
ACK and safe for 1.2.1.
Michal

On 01/15/14 17:45, Michal Privoznik wrote:
On 14.01.2014 19:31, Peter Krempa wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=1047659
If a VM dies very early during an attempted connection to the guest agent, while the locks are down, the domain monitor object will be freed. The object is then accessed later, as any failure during guest agent startup isn't considered fatal.
In the current upstream version this doesn't lead to a crash, as virObjectLock, called when entering the monitor in qemuProcessDetectVcpuPIDs, checks the pointer before attempting to dereference (lock) it. The NULL pointer is then caught in the monitor helper code.
Before the introduction of virObjectLockable - observed on 0.10.2 - the pointer is locked directly via virMutexLock, leading to a crash.
To avoid this problem we need to differentiate between the guest agent not being present and the VM quitting while the locks were down. The fix reorganizes the code in qemuConnectAgent to add the check and then adds special handling to the callers.
---
 src/qemu/qemu_process.c | 34 +++++++++++++++++++++++++---------
 1 file changed, 25 insertions(+), 9 deletions(-)
ACK and safe for 1.2.1.
Michal
Pushed; Thanks.
Peter