There were multiple race conditions that could lead to segmentation
faults. The first precondition is that qemuProcessLaunch fails
shortly after starting the new QEMU process. The second precondition
is that the new QEMU process dies - or, to be more precise, that the
QEMU monitor is closed irregularly. If both happen during
qemuProcessStart (starting a domain) there are race windows between
the thread running the event loop (T1) and the thread that is
starting the domain (T2).
First segmentation fault scenario:
If qemuProcessLaunch fails during qemuProcessStart the code branches
to the 'stop' path where 'qemuMonitorSetDomainLog(priv->mon, NULL,
NULL, NULL)' sets the log function of the monitor to NULL (done in
T2). In the meantime the event loop of T1 wakes up with an EOF
event for the QEMU monitor because the QEMU process has died. The
crash occurs if T1 has checked 'mon->logFunc != NULL' in qemuMonitorIO
just before T2 sets logFunc to NULL. In that case T1 calls
mon->logFunc, which is now NULL, and this leads to the segmentation
fault.
Solution:
Require the monitor lock for setting the log function.
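The idea can be sketched in isolation as follows. This is only a
minimal model using plain pthreads, not libvirt code: 'struct monitor',
monitor_set_log() and monitor_io_eof() are hypothetical stand-ins for
the monitor object, qemuMonitorSetDomainLog and the EOF branch of
qemuMonitorIO. The point is that the NULL check and the call on the T1
side, and the assignment on the T2 side, all happen under the same
lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the monitor object; not libvirt's struct. */
typedef void (*log_func)(void *opaque);

struct monitor {
    pthread_mutex_t lock;
    log_func logFunc;
    void *logOpaque;
};

static void monitor_init(struct monitor *mon)
{
    assert(mon != NULL);
    pthread_mutex_init(&mon->lock, NULL);
    mon->logFunc = NULL;
    mon->logOpaque = NULL;
}

/* T2 side: the callback is only changed while holding the lock. */
static void monitor_set_log(struct monitor *mon, log_func func, void *opaque)
{
    pthread_mutex_lock(&mon->lock);
    mon->logFunc = func;
    mon->logOpaque = opaque;
    pthread_mutex_unlock(&mon->lock);
}

/* T1 side: the NULL check and the call form one critical section, so
 * the setter cannot clear logFunc between the check and the call.
 * Returns 1 if the callback was invoked, 0 otherwise. */
static int monitor_io_eof(struct monitor *mon)
{
    int called = 0;

    pthread_mutex_lock(&mon->lock);
    if (mon->logFunc != NULL) {
        mon->logFunc(mon->logOpaque);
        called = 1;
    }
    pthread_mutex_unlock(&mon->lock);
    return called;
}

/* Example callback: counts how often it was invoked. */
static void count_call(void *opaque)
{
    ++*(int *)opaque;
}
```

With this arrangement a concurrent monitor_set_log(mon, NULL, NULL)
either completes before the check (the callback is skipped) or waits
until the callback has returned.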
Backtrace:
0 0x0000000000000000 in ?? ()
1 0x000003ffe9e45316 in qemuMonitorIO (watch=<optimized out>,
fd=<optimized out>, events=<optimized out>, opaque=0x3ffe08aa860) at
../../src/qemu/qemu_monitor.c:727
2 0x000003fffda2e1a4 in virEventPollDispatchHandles (nfds=<optimized
out>, fds=0x2aa000fd980) at ../../src/util/vireventpoll.c:508
3 0x000003fffda2e398 in virEventPollRunOnce () at
../../src/util/vireventpoll.c:657
4 0x000003fffda2ca10 in virEventRunDefaultImpl () at
../../src/util/virevent.c:314
5 0x000003fffdba9366 in virNetDaemonRun (dmn=0x2aa000cc550) at
../../src/rpc/virnetdaemon.c:818
6 0x000002aa00024668 in main (argc=<optimized out>, argv=<optimized
out>) at ../../daemon/libvirtd.c:1541
Second segmentation fault scenario:
If qemuProcessLaunch fails it unrefs the log context, and the
subsequent call to qemuMonitorSetDomainLog(priv->mon, NULL, NULL,
NULL) invokes qemuDomainLogContextFree(). qemuDomainLogContextFree()
invokes virNetClientClose() to close the client and, once
virNetClientClose() returns, cleans everything up (including the
unref of _virLogManager.client). If T1 now tries to report 'qemu
unexpectedly closed the monitor', libvirtd crashes because the client
has already been freed.
Solution:
As the critical section in qemuMonitorIO is protected with the monitor
lock we can use the same solution as proposed for the first
segmentation fault.
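A rough sketch of the resulting locked/unlocked split, again with plain
pthreads and hypothetical names (monitor_set_log_locked(),
monitor_set_log(), monitor_close()). It assumes that the close path
already holds the monitor lock when it resets the callback, and that
the setter stores all three values after running the old destroy
callback. Because the old context is destroyed while the lock is held,
a reader that takes the same lock never sees a freed context.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef void (*report_func)(void *opaque);
typedef void (*free_func)(void *opaque);

/* Hypothetical model of the monitor; not libvirt's real struct. */
struct monitor {
    pthread_mutex_t lock;   /* default mutex, must not be re-locked */
    report_func logFunc;
    void *logOpaque;
    free_func logDestroy;
};

/* Caller must already hold mon->lock. */
static void monitor_set_log_locked(struct monitor *mon, report_func func,
                                   void *opaque, free_func destroy)
{
    /* Release the previous context before replacing it. */
    if (mon->logDestroy && mon->logOpaque)
        mon->logDestroy(mon->logOpaque);

    mon->logFunc = func;
    mon->logOpaque = opaque;
    mon->logDestroy = destroy;
}

/* Caller must NOT hold mon->lock; this wrapper takes it. */
static void monitor_set_log(struct monitor *mon, report_func func,
                            void *opaque, free_func destroy)
{
    pthread_mutex_lock(&mon->lock);
    monitor_set_log_locked(mon, func, opaque, destroy);
    pthread_mutex_unlock(&mon->lock);
}

/* Assumed close path: it holds the lock for its own teardown, so it
 * has to use the _locked variant to avoid locking twice. */
static void monitor_close(struct monitor *mon)
{
    pthread_mutex_lock(&mon->lock);
    monitor_set_log_locked(mon, NULL, NULL, NULL);
    /* ... further teardown would go here ... */
    pthread_mutex_unlock(&mon->lock);
}

/* Example callbacks for trying the sketch out. */
static void report_noop(void *opaque)
{
    (void)opaque;
}

static void count_destroy(void *opaque)
{
    ++*(int *)opaque;
}
```

Calling the locking wrapper from a path that already holds a default
(non-recursive) mutex would deadlock or be undefined, which is why the
sketch keeps two entry points.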
Backtrace:
0 virClassIsDerivedFrom (klass=0x3100979797979797,
parent=0x2aa000d92f0) at ../../src/util/virobject.c:169
1 0x000003fffda659e6 in virObjectIsClass (anyobj=<optimized out>,
klass=<optimized out>) at ../../src/util/virobject.c:365
2 0x000003fffda65a24 in virObjectLock (anyobj=0x3ffe08c1db0) at
../../src/util/virobject.c:317
3 0x000003fffdba4688 in
virNetClientIOEventLoop (client=client@entry=0x3ffe08c1db0,
thiscall=thiscall@entry=0x2aa000fbfa0) at
../../src/rpc/virnetclient.c:1668
4 0x000003fffdba4b4c in
virNetClientIO (client=client@entry=0x3ffe08c1db0,
thiscall=0x2aa000fbfa0) at ../../src/rpc/virnetclient.c:1944
5 0x000003fffdba4d42 in
virNetClientSendInternal (client=client@entry=0x3ffe08c1db0,
msg=msg@entry=0x2aa000cc710, expectReply=expectReply@entry=true,
nonBlock=nonBlock@entry=false) at ../../src/rpc/virnetclient.c:2116
6 0x000003fffdba6268 in
virNetClientSendWithReply (client=0x3ffe08c1db0, msg=0x2aa000cc710) at
../../src/rpc/virnetclient.c:2144
7 0x000003fffdba6e8e in virNetClientProgramCall (prog=0x3ffe08c1120,
client=<optimized out>, serial=<optimized out>, proc=<optimized out>,
noutfds=<optimized out>, outfds=0x0, ninfds=0x0, infds=0x0,
args_filter=0x3fffdb64440
<xdr_virLogManagerProtocolDomainReadLogFileArgs>, args=0x3ffffffe010,
ret_filter=0x3fffdb644c0
<xdr_virLogManagerProtocolDomainReadLogFileRet>, ret=0x3ffffffe008) at
../../src/rpc/virnetclientprogram.c:329
8 0x000003fffdb64042 in
virLogManagerDomainReadLogFile (mgr=<optimized out>, path=<optimized
out>, inode=<optimized out>, offset=<optimized out>, maxlen=<optimized
out>, flags=0) at ../../src/logging/log_manager.c:272
9 0x000003ffe9e0315c in qemuDomainLogContextRead (ctxt=0x3ffe08c2980,
msg=0x3ffffffe1c0) at ../../src/qemu/qemu_domain.c:4422
10 0x000003ffe9e280a8 in qemuProcessReadLog (logCtxt=<optimized out>,
msg=msg@entry=0x3ffffffe288) at ../../src/qemu/qemu_process.c:1800
11 0x000003ffe9e28206 in qemuProcessReportLogError (logCtxt=<optimized
out>, msgprefix=0x3ffe9ec276a "qemu unexpectedly closed the monitor")
at ../../src/qemu/qemu_process.c:1836
12 0x000003ffe9e28306 in
qemuProcessMonitorReportLogError (mon=mon@entry=0x3ffe085cf10,
msg=<optimized out>, opaque=<optimized out>) at
../../src/qemu/qemu_process.c:1856
13 0x000003ffe9e452b6 in qemuMonitorIO (watch=<optimized out>,
fd=<optimized out>, events=<optimized out>, opaque=0x3ffe085cf10) at
../../src/qemu/qemu_monitor.c:726
14 0x000003fffda2e1a4 in virEventPollDispatchHandles (nfds=<optimized
out>, fds=0x2aa000fd980) at ../../src/util/vireventpoll.c:508
15 0x000003fffda2e398 in virEventPollRunOnce () at
../../src/util/vireventpoll.c:657
16 0x000003fffda2ca10 in virEventRunDefaultImpl () at
../../src/util/virevent.c:314
17 0x000003fffdba9366 in virNetDaemonRun (dmn=0x2aa000cc550) at
../../src/rpc/virnetdaemon.c:818
18 0x000002aa00024668 in main (argc=<optimized out>, argv=<optimized
out>) at ../../daemon/libvirtd.c:1541
Other code paths where the same problem could occur are fixed as well
(qemuMigrationFinish, qemuProcessStart, and
qemuDomainSaveImageStartVM).
Signed-off-by: Marc Hartmayer <mhartmay(a)linux.vnet.ibm.com>
Reported-by: Sascha Silbe <silbe(a)linux.vnet.ibm.com>
---
src/qemu/qemu_monitor.c | 44 ++++++++++++++++++++++++++++++++++----------
src/qemu/qemu_monitor.h | 4 ++++
2 files changed, 38 insertions(+), 10 deletions(-)
diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c
index a4fa6ec..b41aaed 100644
--- a/src/qemu/qemu_monitor.c
+++ b/src/qemu/qemu_monitor.c
@@ -963,7 +963,7 @@ qemuMonitorClose(qemuMonitorPtr mon)
PROBE(QEMU_MONITOR_CLOSE,
"mon=%p refs=%d", mon, mon->parent.parent.u.s.refs);
- qemuMonitorSetDomainLog(mon, NULL, NULL, NULL);
+ qemuMonitorSetDomainLogLocked(mon, NULL, NULL, NULL);
if (mon->fd >= 0) {
qemuMonitorUnregister(mon);
@@ -4035,20 +4035,21 @@ qemuMonitorGetDeviceAliases(qemuMonitorPtr mon,
/**
- * qemuMonitorSetDomainLog:
- * Set the file descriptor of the open VM log file to report potential
- * early startup errors of qemu.
- *
- * @mon: Monitor object to set the log file reading on
+ * qemuMonitorSetDomainLogLocked:
+ * @mon: Locked monitor object to set the log file reading on
* @func: the callback to report errors
* @opaque: data to pass to @func
* @destroy: optional callback to free @opaque
+ *
+ * Set the file descriptor of the open VM log file to report potential
+ * early startup errors of qemu. This function requires @mon to be
+ * locked already!
*/
void
-qemuMonitorSetDomainLog(qemuMonitorPtr mon,
- qemuMonitorReportDomainLogError func,
- void *opaque,
- virFreeCallback destroy)
+qemuMonitorSetDomainLogLocked(qemuMonitorPtr mon,
+ qemuMonitorReportDomainLogError func,
+ void *opaque,
+ virFreeCallback destroy)
{
if (mon->logDestroy && mon->logOpaque)
mon->logDestroy(mon->logOpaque);
@@ -4060,6 +4061,29 @@ qemuMonitorSetDomainLog(qemuMonitorPtr mon,
/**
+ * qemuMonitorSetDomainLog:
+ * @mon: Unlocked monitor object to set the log file reading on
+ * @func: the callback to report errors
+ * @opaque: data to pass to @func
+ * @destroy: optional callback to free @opaque
+ *
+ * Set the file descriptor of the open VM log file to report potential
+ * early startup errors of qemu. This function requires @mon to be
+ * unlocked.
+ */
+void
+qemuMonitorSetDomainLog(qemuMonitorPtr mon,
+ qemuMonitorReportDomainLogError func,
+ void *opaque,
+ virFreeCallback destroy)
+{
+ virObjectLock(mon);
+ qemuMonitorSetDomainLogLocked(mon, func, opaque, destroy);
+ virObjectUnlock(mon);
+}
+
+
+/**
* qemuMonitorJSONGetGuestCPU:
* @mon: Pointer to the monitor
* @arch: arch of the guest
diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h
index 91ab905..2e42d16 100644
--- a/src/qemu/qemu_monitor.h
+++ b/src/qemu/qemu_monitor.h
@@ -1067,6 +1067,10 @@ int qemuMonitorGetDeviceAliases(qemuMonitorPtr mon,
typedef void (*qemuMonitorReportDomainLogError)(qemuMonitorPtr mon,
const char *msg,
void *opaque);
+void qemuMonitorSetDomainLogLocked(qemuMonitorPtr mon,
+ qemuMonitorReportDomainLogError func,
+ void *opaque,
+ virFreeCallback destroy);
void qemuMonitorSetDomainLog(qemuMonitorPtr mon,
qemuMonitorReportDomainLogError func,
void *opaque,
--
2.5.5