[libvirt] [PATCH] qemu: fix libvirtd crash in migration after vm shutdown

[PATCH] qemu: fix libvirtd crash in migration after vm shutdown If we shutdown a guest, then migrate it without the arg XML, libvirtd will get crashed. The reason is that: 1 during shutdown callback, qemuProcessStop() , it points vm->def to vm->newDef 2 during migration, it frees persistentDef, which points to vm->newDef when the arg XML is NULL. However, because vm->newDef is now vm->def, what we IN FACT freed is vm->def. 3 it will refer to vm->def after step2, thus invalid read/write causes libvirtd crash We needn't to free persistentDef if persist_xml is NULL, because no extra def was alloced if persistent_xml is NULL. --- src/qemu/qemu_migration.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 6a683f7..3636c93 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -4915,7 +4915,7 @@ qemuMigrationRun(virQEMUDriverPtr driver, VIR_WARN("Unable to encode migration cookie"); } - if (persistDef != vm->newDef) + if (persist_xml && persistDef) virDomainDefFree(persistDef); qemuMigrationCookieFree(mig); -- 1.9.5.msysgit.1

something weird with your email client and adding extra <LF>'s everywhere... Plaintext is your friend as much as 'git send-email'... On 08/01/2016 10:20 PM, weifuqiang wrote:
[PATCH] qemu: fix libvirtd crash in migration after vm shutdown
If we shutdown a guest, then migrate it without the arg XML, libvirtd will get crashed.
Well it would be "nice" to see the crash trace(s). You give some context below, but it skips a few steps... I think understanding how reproducible something is would be nice to know as well as whether there's some edge (or error) condition that causes what you saw. We're engineers, we can handle and want the details especially for crashes.
The reason is that:
1 during shutdown callback, qemuProcessStop() , it points vm->def to vm->newDef
Since changed to be in virDomainObjRemoveTransientDef Hmmm... is this traces of some error path or edge condition? What causes the shutdown? From inside the guest?
2 during migration, it frees persistentDef, which points to vm->newDef when the arg XML is NULL.
However, because vm->newDef is now vm->def, what we IN FACT freed is vm->def.
3 it will refer to vm->def after step2, thus invalid read/write causes libvirtd crash
We needn't to free persistentDef if persist_xml is NULL, because no extra def was alloced if persistent_xml is NULL.
I know there's been recent changes in migration - I wonder if those fix what you've seen here... Perhaps not, but I never rule that out either.
---
src/qemu/qemu_migration.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
index 6a683f7..3636c93 100644
--- a/src/qemu/qemu_migration.c
+++ b/src/qemu/qemu_migration.c
@@ -4915,7 +4915,7 @@ qemuMigrationRun(virQEMUDriverPtr driver,
VIR_WARN("Unable to encode migration cookie");
}
- if (persistDef != vm->newDef)
+ if (persist_xml && persistDef)
So, looking at the code, 1. "persistDef" is only ever set if "flags & VIR_MIGRATE_PERSIST_DEST", so we have to have taken that path 2. If we have "persist_xml", then we "know" that qemuMigrationPrepareDef was called. So I suppose this could be a solution. Alternatively, I would think if (persistDef != vm->newDef && persistDef != vm->def) virDomainDefFree(persistDef); would work as well; however, is this the right solution? It's the what has caused vm->newDef to become vm->def and whether that should be happening while we're migrating. So I wonder should/could the shutdown be blocked until the migration is complete or are we perhaps lacking a lock or extra check somewhere to sure we don't hit some edge condition. Alternatively, is there some nuance with virDomainObjRemoveTransientDef that needs adjustment vis-a-vis domain->persistent (I get brain cramps going through that code). In any case, perhaps this analysis prompts others to chime in... John
virDomainDefFree(persistDef);
qemuMigrationCookieFree(mig);
--
1.9.5.msysgit.1
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

On Tue, Aug 02, 2016 at 02:20:51 +0000, weifuqiang wrote:
[PATCH] qemu: fix libvirtd crash in migration after vm shutdown
If we shutdown a guest, then migrate it without the arg XML, libvirtd will get crashed.
The reason is that:
1 during shutdown callback, qemuProcessStop() , it points vm->def to vm->newDef
2 during migration, it frees persistentDef, which points to vm->newDef when the arg XML is NULL.
However, because vm->newDef is now vm->def, what we IN FACT freed is vm->def.
3 it will refer to vm->def after step2, thus invalid read/write causes libvirtd crash
We needn't to free persistentDef if persist_xml is NULL, because no extra def was alloced if persistent_xml is NULL.
--- src/qemu/qemu_migration.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 6a683f7..3636c93 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -4915,7 +4915,7 @@ qemuMigrationRun(virQEMUDriverPtr driver, VIR_WARN("Unable to encode migration cookie"); } - if (persistDef != vm->newDef) + if (persist_xml && persistDef) virDomainDefFree(persistDef); qemuMigrationCookieFree(mig);
If persist_xml != NULL then persistDef is it's parsed version (or NULL) and we need to free it. Otherwise it's just a copy of another pointer. In other words, your patch is correct (although checking for persist_xml is enough). An alternative solution would be to make sure we can always free persistDef by making a copy of vm->newDef rather than copying just the pointer and free it unconditionally. Anyway, your patch is good enough. ACK. I'll push it later. Jirka

On Thu, Sep 22, 2016 at 11:56:38 +0200, Jiri Denemark wrote:
On Tue, Aug 02, 2016 at 02:20:51 +0000, weifuqiang wrote:
[PATCH] qemu: fix libvirtd crash in migration after vm shutdown
If we shutdown a guest, then migrate it without the arg XML, libvirtd will get crashed.
The reason is that:
1 during shutdown callback, qemuProcessStop() , it points vm->def to vm->newDef
2 during migration, it frees persistentDef, which points to vm->newDef when the arg XML is NULL.
However, because vm->newDef is now vm->def, what we IN FACT freed is vm->def.
3 it will refer to vm->def after step2, thus invalid read/write causes libvirtd crash
We needn't to free persistentDef if persist_xml is NULL, because no extra def was alloced if persistent_xml is NULL.
--- src/qemu/qemu_migration.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 6a683f7..3636c93 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -4915,7 +4915,7 @@ qemuMigrationRun(virQEMUDriverPtr driver, VIR_WARN("Unable to encode migration cookie"); } - if (persistDef != vm->newDef) + if (persist_xml && persistDef) virDomainDefFree(persistDef); qemuMigrationCookieFree(mig);
If persist_xml != NULL then persistDef is it's parsed version (or NULL) and we need to free it. Otherwise it's just a copy of another pointer. In other words, your patch is correct (although checking for persist_xml is enough).
An alternative solution would be to make sure we can always free persistDef by making a copy of vm->newDef rather than copying just the pointer and free it unconditionally.
Anyway, your patch is good enough. ACK.
I'll push it later.
Oh, the patch is corrupted and thus it doesn't apply. And it's hard to see what your real name is. Is it Wei Fuqiang or something else? We need proper author's name for each commit. Please, when sending a new version of this patch, don't copy&paste the patch into your mail client. Use git send-email whenever possible, or attach the patch generated by git format-patch to the email. Jirka
participants (3)
-
Jiri Denemark
-
John Ferlan
-
weifuqiang