[adding libvirt list]
On 09/17/2014 09:04 AM, Stefan Hajnoczi wrote:
On Wed, Sep 17, 2014 at 10:25 AM, Paolo Bonzini
<pbonzini(a)redhat.com> wrote:
>
> On 17/09/2014 11:06, Stefan Hajnoczi wrote:
>> I think the fundamental problem here is that the mirror block job
>> on the source host does not synchronize with live migration.
>>
>> Remember that the mirror block job iterates over the dirty bitmap
>> whenever it feels like it.
>>
>> There is no guarantee that the mirror block job has quiesced before
>> migration handover takes place, right?
>
> Libvirt does that. Migration is started only once storage mirroring
> is out of the bulk phase, and the handover looks like:
>
> 1) migration completes
>
> 2) because the source VM is stopped, the disk has quiesced on the source
But the mirror block job might still be writing out dirty blocks.
> 3) libvirt sends block-job-complete
No, it sends block-job-cancel after the source QEMU's migration has
completed. See the qemuMigrationCancelDriveMirror() call in
src/qemu/qemu_migration.c:qemuMigrationRun().
> 4) libvirt receives BLOCK_JOB_COMPLETED. The disk has now quiesced on
> the destination as well.
I don't see where this happens in the libvirt source code. Libvirt
doesn't care about block job events for drive-mirror during migration.
And that's why there could still be I/O going on (since
block-job-cancel is asynchronous).
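Because block-job-cancel only *requests* cancellation, the source cannot be treated as quiesced until QEMU emits a terminating event for the job. A minimal sketch of the check that is needed before resuming the destination (the event names and payload shape follow QEMU's QMP events; the helper itself is hypothetical, not libvirt code):

```python
# block-job-cancel is asynchronous: the mirror may still be flushing
# dirty blocks after the command returns. QEMU signals that the job is
# really gone with a BLOCK_JOB_CANCELLED (or BLOCK_JOB_COMPLETED) event.
TERMINAL_EVENTS = {"BLOCK_JOB_CANCELLED", "BLOCK_JOB_COMPLETED"}

def mirror_job_gone(events, device):
    """Return True once a terminating block-job event for `device`
    has been seen in the parsed QMP event stream."""
    for ev in events:
        if (ev.get("event") in TERMINAL_EVENTS
                and ev.get("data", {}).get("device") == device):
            return True
    return False
```

Only once this returns True for the drive-mirror device is it safe to issue `cont` on the destination; acting on the return of block-job-cancel alone leaves exactly the window described above.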
> 5) the VM is started on the destination
>
> 6) the NBD server is stopped on the destination and the source VM is quit.
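The six-step handover can be written down as an ordering constraint, which makes the disputed point explicit: the mirror job must have terminated before the destination resumes. A sketch with hypothetical step labels (only `block-job-cancel`, `BLOCK_JOB_CANCELLED`, and `nbd-server-stop` correspond to real QMP names; this is not libvirt code):

```python
# The handover sequence discussed above, as ordered steps.
HANDOVER = [
    "migration-completes",    # 1) RAM migration finishes, source CPU stops
    "source-disk-quiesces",   # 2) guest writes stop on the source
    "block-job-cancel",       # 3) libvirt cancels the drive-mirror job (QMP)
    "BLOCK_JOB_CANCELLED",    # 4) QEMU confirms the mirror has drained
    "cont-on-destination",    # 5) resume the guest on the destination
    "nbd-server-stop",        # 6) tear down NBD server, quit source QEMU
]

def safe_ordering(steps):
    """The destination may resume only after the mirror job has
    terminated; otherwise the destination's NBD server can still be
    receiving writes while qcow2_invalidate_cache runs."""
    return steps.index("BLOCK_JOB_CANCELLED") < steps.index("cont-on-destination")
```

The bug under discussion is equivalent to an ordering where step 5 happens before step 4.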
>
> It is actually a feature that storage migration is completed
> asynchronously with respect to RAM migration. The problem is that
> qcow2_invalidate_cache happens between (3) and (5), and it doesn't
> like the concurrent I/O received by the NBD server.
I agree that qcow2_invalidate_cache() (and any other invalidate cache
implementations) need to allow concurrent I/O requests.
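One way an invalidate path can be made safe against concurrent requests is to drain in-flight I/O first and block new submissions while the invalidation runs. A generic sketch of that drain pattern (an illustration only, not QEMU's qcow2 code):

```python
import threading

class Drainable:
    """Track in-flight requests; run a critical operation only after
    they have drained, holding out new requests meanwhile."""

    def __init__(self):
        self._cond = threading.Condition()
        self._in_flight = 0

    def begin_io(self):
        with self._cond:
            self._in_flight += 1

    def end_io(self):
        with self._cond:
            self._in_flight -= 1
            if self._in_flight == 0:
                self._cond.notify_all()

    def drain_then(self, fn):
        """Wait until no request is in flight, then run fn (e.g. a
        cache invalidation) while the lock excludes new begin_io()."""
        with self._cond:
            while self._in_flight:
                self._cond.wait()
            return fn()
```

Whether QEMU drains or makes the invalidation itself tolerant of concurrent requests is a design choice; the point is that the invalidate must not race against I/O arriving via the NBD server.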
Either I'm misreading the libvirt code or libvirt is not actually
ensuring that the block job on the source has cancelled/completed
before the guest is resumed on the destination. So I think there is
still a bug, maybe Eric can verify this?
You may indeed be correct that libvirt is not waiting long enough for
the block job to be gone on the source before resuming on the
destination. I didn't write that particular code, so I'm cc'ing the
libvirt list, but I can try to take a look at it, since it's related
to code I've recently touched while getting libvirt to support active
layer block commit.
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library
http://libvirt.org