
On 03/21/2012 02:41 AM, Jiri Denemark wrote:
On Tue, Mar 20, 2012 at 15:56:39 -0600, Eric Blake wrote:
On 03/19/2012 10:18 AM, Jiri Denemark wrote:
The destination daemon should not rely on the client or source daemon (depending on the type of migration) to call Finish when migration fails, because the client may crash before it can do so. The domain prepared for incoming migration is set to be destroyed (and the migration job cleaned up) when the connection with the client closes, but this is not enough. If the associated qemu process crashes after the Prepare step and the domain is cleaned up before the connection gets closed, autodestroy is not called for the domain and the migration job remains set. If the domain is defined on the destination host (i.e., it is not completely removed once destroyed), we keep the job set forever.

To fix this, we register a cleanup callback which is responsible for cleaning up the migration-in job when a domain dies anywhere between the Prepare and Finish steps. Note that we can't blindly clean any job when spotting EOF on the monitor, since normally an API is running at that time.
---
 src/qemu/qemu_domain.c    |    2 --
 src/qemu/qemu_domain.h    |    2 ++
 src/qemu/qemu_migration.c |   22 ++++++++++++++++++++++
 3 files changed, 24 insertions(+), 2 deletions(-)
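The cleanup-callback idea in the commit message can be illustrated with a small, self-contained sketch. This is not the code from the patch: every name here (`struct domain`, `enum job`, `migration_prepare`, `domain_process_died`, and so on) is hypothetical and chosen only for illustration. It assumes a single job slot per domain and a single callback slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical job kinds; these are NOT libvirt's actual enums. */
enum job { JOB_NONE, JOB_MIGRATION_IN, JOB_MODIFY };

struct domain;
typedef void (*cleanup_cb)(struct domain *dom);

/* Hypothetical, heavily simplified domain object. */
struct domain {
    bool active;        /* is the emulator process running? */
    bool persistent;    /* defined on this host, so the object outlives destroy */
    enum job job;       /* job currently held on the domain */
    cleanup_cb cleanup; /* invoked once if the process dies */
};

/* Callback: drop only the incoming-migration job, never any other job,
 * because another API call may legitimately own a different job. */
static void migration_in_cleanup(struct domain *dom)
{
    if (dom->job == JOB_MIGRATION_IN)
        dom->job = JOB_NONE;
}

/* Prepare step: take the migration-in job and register the callback. */
static void migration_prepare(struct domain *dom)
{
    assert(dom != NULL);
    dom->active = true;
    dom->job = JOB_MIGRATION_IN;
    dom->cleanup = migration_in_cleanup;
}

/* Finish step (normal path): release the job and unregister the callback. */
static void migration_finish(struct domain *dom)
{
    assert(dom != NULL);
    dom->job = JOB_NONE;
    dom->cleanup = NULL;
}

/* Called when the emulator process dies (for example, EOF on its monitor).
 * It does not touch the job directly; it only runs a registered callback. */
static void domain_process_died(struct domain *dom)
{
    assert(dom != NULL);
    dom->active = false;
    if (dom->cleanup != NULL) {
        cleanup_cb cb = dom->cleanup;
        dom->cleanup = NULL; /* run at most once */
        cb(dom);
    }
}
```

In this sketch, a persistent domain whose process dies between `migration_prepare()` and `migration_finish()` ends up with no job set, while a domain holding some other job and no registered callback is left alone when its process dies.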
I'm restating my understanding of the bug, to make sure I understand why your patch helps:
ACK. Thanks.
Should I clarify the commit message a bit before pushing?
Dunno that it is needed. If you think a better wording would help, then go for it, but I was able to understand the patch and form my own rephrasing using just your original wording, so I'm okay if you push as-is.

-- 
Eric Blake   eblake@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org