[libvirt-users] Adjust disk image migration (NBD)

Hi all,
As I am doing some tests with qemu, I realized that the way it does 'migrate -i tcp:DEST:444' is not the same as 'libvirt migrate --copy-storage-inc'. Basically qemu uses the same stream as RAM migration and libvirt takes advantage of NBD transfer.
With virsh migrate-setspeed I observed that one can only control the transfer throughput of RAM, but not disk synchronization. At least this is what I can see in bmon when doing a migration with incremental copy.
The question is: am I missing something, or is it not implemented?
Thank you guys!

On 14.02.2014 10:40, Joaquim Barrera wrote:
Hi all,
As I am doing some tests with qemu, I realized that the way it does 'migrate -i tcp:DEST:444' is not the same as 'libvirt migrate --copy-storage-inc'. Basically qemu uses the same stream as RAM migration and libvirt takes advantage of NBD transfer.
With virsh migrate-setspeed I observed that one can only control the transfer throughput of RAM, but not disk synchronization. At least this is what I can see in bmon when doing a migration with incremental copy.
The question is: am I missing something, or is it not implemented?
Thank you guys!
I think this is actually a qemu bug. Libvirt passes the correct values:
2014-02-14 10:52:08.010+0000: 27701: debug : qemuMonitorIOWrite:504 : QEMU_MONITOR_IO_WRITE: mon=0x7f06cc00ea20 buf={"execute":"drive-mirror","arguments":{"device":"drive-virtio-disk0","target":"nbd:masina:49153:exportname=drive-virtio-disk0","speed":1048576,"sync":"full","mode":"existing"},"id":"libvirt-15"}
...
2014-02-14 10:53:51.169+0000: 27701: debug : qemuMonitorIOWrite:504 : QEMU_MONITOR_IO_WRITE: mon=0x7f06cc00ea20 buf={"execute":"migrate_set_speed","arguments":{"value":1048576},"id":"libvirt-221"}
2014-02-14 10:53:51.204+0000: 27701: debug : qemuMonitorIOWrite:504 : QEMU_MONITOR_IO_WRITE: mon=0x7f06cc00ea20 buf={"execute":"migrate","arguments":{"detach":true,"blk":false,"inc":false,"uri":"fd:migrate"},"id":"libvirt-223"}
However, I observe the same thing you do: disk migration is not shaped, while internal state is.
Michal

On Fri, Feb 14, 2014 at 11:58:56AM +0100, Michal Privoznik wrote:
On 14.02.2014 10:40, Joaquim Barrera wrote:
Hi all,
As I am doing some tests with qemu, I realized that the way it does 'migrate -i tcp:DEST:444' is not the same as 'libvirt migrate --copy-storage-inc'. Basically qemu uses the same stream as RAM migration and libvirt takes advantage of NBD transfer.
With virsh migrate-setspeed I observed that one can only control the transfer throughput of RAM, but not disk synchronization. At least this is what I can see in bmon when doing a migration with incremental copy.
The question is: am I missing something, or is it not implemented?
Thank you guys!
I think this is actually a qemu bug. Libvirt passes the correct values:
2014-02-14 10:52:08.010+0000: 27701: debug : qemuMonitorIOWrite:504 : QEMU_MONITOR_IO_WRITE: mon=0x7f06cc00ea20 buf={"execute":"drive-mirror","arguments":{"device":"drive-virtio-disk0","target":"nbd:masina:49153:exportname=drive-virtio-disk0","speed":1048576,"sync":"full","mode":"existing"},"id":"libvirt-15"}
...
2014-02-14 10:53:51.169+0000: 27701: debug : qemuMonitorIOWrite:504 : QEMU_MONITOR_IO_WRITE: mon=0x7f06cc00ea20 buf={"execute":"migrate_set_speed","arguments":{"value":1048576},"id":"libvirt-221"}
2014-02-14 10:53:51.204+0000: 27701: debug : qemuMonitorIOWrite:504 : QEMU_MONITOR_IO_WRITE: mon=0x7f06cc00ea20 buf={"execute":"migrate","arguments":{"detach":true,"blk":false,"inc":false,"uri":"fd:migrate"},"id":"libvirt-223"}
However, I observe the same thing you do: disk migration is not shaped, while internal state is.
Thanks for raising this.
I noticed that mirror_run() does not throttle the first loop where it populates the dirty bitmap using bdrv_is_allocated_above().
The main copy loop does take the speed limit into account but perhaps that's broken too.
Paolo, Jeff: Any ideas?
Stefan
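For context, ratelimit_calculate_delay() is the slice-based throttle used by QEMU's block jobs: the caller charges it for work it is about to dispatch, and it returns how long to sleep so that the work dispatched in the current time slice stays under the configured quota. What follows is a self-contained sketch of that idea; the struct layout and the clock helper are illustrative assumptions, not the actual include/qemu/ratelimit.h code.

#include <stdint.h>
#include <time.h>

/* Illustrative slice-based rate limiter; the field names are assumptions,
 * not QEMU's real RateLimit definition. */
typedef struct {
    int64_t  slice_end_ns;  /* monotonic time at which the current slice ends */
    int64_t  slice_ns;      /* slice length in nanoseconds, e.g. 100 ms       */
    uint64_t slice_quota;   /* units (e.g. sectors) allowed per slice         */
    uint64_t dispatched;    /* units already charged in the current slice     */
} RateLimit;

static int64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Charge 'n' units of work and return how many nanoseconds the caller
 * should sleep before dispatching more (0 while under quota). */
static int64_t ratelimit_calculate_delay(RateLimit *limit, uint64_t n)
{
    int64_t now = now_ns();
    if (now >= limit->slice_end_ns) {
        /* A new slice begins: reset the accounting. */
        limit->slice_end_ns = now + limit->slice_ns;
        limit->dispatched = 0;
    }
    limit->dispatched += n;
    if (limit->dispatched <= limit->slice_quota) {
        return 0;
    }
    /* Over quota for this slice: wait until the slice expires. */
    return limit->slice_end_ns - now;
}

The key property is that the limiter only knows about the work it is charged for: if a caller charges a fixed sectors_per_chunk while actually writing much more, the effective throughput is unbounded, which matches the unshaped disk transfer observed in this thread.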

Thanks for raising this.
I noticed that mirror_run() does not throttle the first loop where it populates the dirty bitmap using bdrv_is_allocated_above().
This is on purpose. Does it cause a noticeable stall in the guest?
The main copy loop does take the speed limit into account but perhaps that's broken too.
Yeah, it looks broken. Each iteration of the loop can write much more than sectors_per_chunk sectors, but here:
if (s->common.speed) {
    delay_ns = ratelimit_calculate_delay(&s->limit, sectors_per_chunk);
} else {
    delay_ns = 0;
}
the second argument is fixed. :/
Paolo
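A minimal sketch of the direction a fix could take, assuming the copy step can report how much it actually submitted; nb_sectors_issued is a hypothetical variable for illustration, not something present in mirror.c:

if (s->common.speed) {
    /* Hypothetical fix sketch: charge the limiter for the sectors actually
     * issued this iteration, not a fixed per-chunk constant, so large
     * iterations accrue a proportionally longer delay. */
    delay_ns = ratelimit_calculate_delay(&s->limit, nb_sectors_issued);
} else {
    delay_ns = 0;
}

With accounting proportional to the real transfer size, the observed disk-mirror throughput would converge on the configured speed instead of running unshaped.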

On 24/02/14 23:26, Paolo Bonzini wrote:
Thanks for raising this.
I noticed that mirror_run() does not throttle the first loop where it populates the dirty bitmap using bdrv_is_allocated_above(). This is on purpose. Does it cause a noticeable stall in the guest?
The main copy loop does take the speed limit into account but perhaps that's broken too. Yeah, it looks broken. Each iteration of the loop can write much more than sectors_per_chunk sectors, but here:
if (s->common.speed) {
    delay_ns = ratelimit_calculate_delay(&s->limit, sectors_per_chunk);
} else {
    delay_ns = 0;
}
the second argument is fixed. :/
Paolo
Thanks for the answer. Something is still not clear to me. Are we looking at a bug (that is, something that could be fixed), or is this behaviour somehow expected for some reason?
In all the tests I am doing, I always get the same throughput chart: unlimited bandwidth when synchronizing the disk, and a smooth bandwidth limit when migrating RAM.
Joaquim

On 28/02/2014 11:41, Joaquim Barrera wrote:
Thanks for the answer. Something is still not clear to me. Are we looking at a bug (that is, something that could be fixed), or is this behaviour somehow expected for some reason? In all the tests I am doing, I always get the same throughput chart: unlimited bandwidth when synchronizing the disk, and a smooth bandwidth limit when migrating RAM.
Joaquim
Yes, it's a bug that we can fix. Paolo

On 28/02/14 11:43, Paolo Bonzini wrote:
On 28/02/2014 11:41, Joaquim Barrera wrote:
Thanks for the answer. Something is still not clear to me. Are we looking at a bug (that is, something that could be fixed), or is this behaviour somehow expected for some reason? In all the tests I am doing, I always get the same throughput chart: unlimited bandwidth when synchronizing the disk, and a smooth bandwidth limit when migrating RAM.
Joaquim
Yes, it's a bug that we can fix.
Paolo
Hi Paolo and all. Can you tell me how to "start" the process of bug fixing? Am I supposed to report it somewhere, or did you just take note? Thanks a lot.

On Tue, Mar 11, 2014 at 06:13:18PM +0100, Joaquim Barrera wrote:
On 28/02/14 11:43, Paolo Bonzini wrote:
On 28/02/2014 11:41, Joaquim Barrera wrote:
Thanks for the answer. Something is still not clear to me. Are we looking at a bug (that is, something that could be fixed), or is this behaviour somehow expected for some reason? In all the tests I am doing, I always get the same throughput chart: unlimited bandwidth when synchronizing the disk, and a smooth bandwidth limit when migrating RAM.
Joaquim
Yes, it's a bug that we can fix.
Hi Paolo and all. Can you tell me how to "start" the process of bug fixing? Am I supposed to report it somewhere, or did you just take note?
Hi Joaquim,
Thanks for reporting this. We're in the bug-fixing phase of the QEMU 2.0 release cycle, so your reminder is perfect timing.
I'll take a look at this issue today.
Stefan
participants (4)
- Joaquim Barrera
- Michal Privoznik
- Paolo Bonzini
- Stefan Hajnoczi