On Sat, May 18, 2019 at 18:24:36 +0800, Wang King wrote:
From: Dingzhong Weng <wengdingzhong(a)huawei.com>
libvirtd starts qemuProcessReconnect threads at startup when there are
live VMs on the host. Because these threads are a side job that libvirtd
runs only at startup, and only for existing VMs, they are not managed
the way the worker threads and the event pool are.
If libvirtd receives SIGTERM or a similar request right after it starts,
the qemuProcessReconnect threads might still be running while libvirtd
cleans up and frees qemu_driver and its components. In this short
window, a qemuProcessReconnect thread might see a NULL
qemu_driver->xmlopt.format, skip that callback, and save only partial VM
status to disk. As a result, the VM may fail to recover because
information is missing.
This patch takes a reference on qemu_driver->xmlopt for the lifetime of
qemuProcessReconnect so that libvirtd does not release it until the
qemuProcessReconnect threads finish using it. This ensures that all VM
status information can be formatted in virDomainSaveStatus and so stays
intact.
No VM is killed by this patch.
Signed-off-by: Dingzhong Weng <wengdingzhong(a)huawei.com>
---
src/qemu/qemu_process.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 90466771cd..36b9b5fd03 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -8000,6 +8000,7 @@ qemuProcessReconnect(void *opaque)
struct qemuProcessReconnectData *data = opaque;
virQEMUDriverPtr driver = data->driver;
virDomainObjPtr obj = data->obj;
+ virDomainXMLOptionPtr xmlopt = NULL;
qemuDomainObjPrivatePtr priv;
qemuDomainJobObj oldjob;
int state;
@@ -8023,6 +8024,9 @@ qemuProcessReconnect(void *opaque)
cfg = virQEMUDriverGetConfig(driver);
priv = obj->privateData;
+ /* need xmlopt later to save status, do not free */
+ xmlopt = virObjectRef(driver->xmlopt);
So I presume the problem is that qemuStateCleanup is called before this
function finishes, so this function then accesses invalid memory.
This patch will not fix the problem entirely, because the access to
xmlopt here (and everywhere else) is not atomic. This means that if
qemuStateCleanup is called before the above line, you'll try to
reference a pointer which was already freed.
Also, even if qemuStateCleanup sets the pointer to NULL, your patch does
not check for that.
To fully fix this I think we need an accessor similar to
virQEMUDriverGetConfig which will access the xmlopt object.
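To illustrate what I mean, here is a minimal standalone sketch of such an accessor. It is not libvirt code: Driver, XmlOpt, driver_get_xmlopt and driver_cleanup are hypothetical stand-ins for virQEMUDriver, virDomainXMLOption, the proposed accessor and qemuStateCleanup, and the reference count is a plain C11 atomic rather than virObject. The point is only that the pointer is read and referenced while the driver lock is held, and that callers get NULL once cleanup has run.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for virDomainXMLOption: a ref-counted object. */
typedef struct {
    atomic_int refs;
} XmlOpt;

/* Hypothetical stand-in for virQEMUDriver. */
typedef struct {
    pthread_mutex_t lock;
    XmlOpt *xmlopt;   /* owned reference; set to NULL by cleanup */
} Driver;

static XmlOpt *xmlopt_new(void)
{
    XmlOpt *x = malloc(sizeof(*x));
    if (x)
        atomic_init(&x->refs, 1);
    return x;
}

/* Drop one reference; free the object when the last one goes away. */
static void xmlopt_unref(XmlOpt *x)
{
    if (!x)
        return;
    int old = atomic_fetch_sub(&x->refs, 1);
    assert(old > 0);
    if (old == 1)
        free(x);
}

/* Accessor: read the pointer and take a reference under the driver
 * lock, so cleanup cannot free the object in between. Returns NULL
 * if cleanup already ran; the caller must check for that and must
 * call xmlopt_unref() when done. */
static XmlOpt *driver_get_xmlopt(Driver *d)
{
    XmlOpt *x;

    pthread_mutex_lock(&d->lock);
    x = d->xmlopt;
    if (x)
        atomic_fetch_add(&x->refs, 1);
    pthread_mutex_unlock(&d->lock);
    return x;
}

/* Cleanup: detach the pointer under the same lock, then drop the
 * driver's own reference. Threads that already hold a reference
 * keep the object alive until they unref it. */
static void driver_cleanup(Driver *d)
{
    XmlOpt *x;

    pthread_mutex_lock(&d->lock);
    x = d->xmlopt;
    d->xmlopt = NULL;
    pthread_mutex_unlock(&d->lock);
    xmlopt_unref(x);
}
```

With something like this, the reconnect thread would call the accessor instead of touching driver->xmlopt directly, bail out if it gets NULL, and unref when it has finished saving status.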