[libvirt] virsh domjobinfo during storage migration

Hi all,

Running virsh domjobinfo on my CentOS 6 systems during a migration with --copy-storage-all, the output used to look like this:

Job type: Unbounded
Time elapsed: 1830632 ms
Data processed: 37.212 GiB
Data remaining: 1.025 GiB
Data total: 16.016 GiB
Memory processed: 37.212 GiB
Memory remaining: 1.025 GiB
Memory total: 16.016 GiB
Memory bandwidth: 100.018 MiB/s
Constant pages: 618279
Normal pages: 9734623
Normal data: 37.135 GiB
Expected downtime: 1118 ms
Setup time: 61 ms

But when I run the same command on CentOS 7, libvirt-1.2.17-13 and qemu-kvm-1.5.3-105, I just get this:

$ sudo virsh domjobinfo 58358fec-c35c-4a7a-a4dd-18ec2e1327bf
Job type: Unbounded
Time elapsed: 209616 ms

I tried to debug this myself, but I'm a bit stuck. I'd expected libvirt to send query-migrate commands to qemu every 50 ms or so. To verify I attached gdb and put a breakpoint at qemuMonitorJSONGetMigrationStatus, but the breakpoint only hits after the complete disk is mirrored.

Is this a regression, or am I doing something wrong?

Kind regards,
Ruben Kerkhof
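For anyone scripting around this difference, here is a minimal sketch of how the two outputs above could be told apart. It assumes virsh prints one "Key: value" pair per line; the helper names parse_domjobinfo and has_transfer_stats are made up for illustration and are not part of virsh or libvirt.

```python
from typing import Dict


def parse_domjobinfo(text: str) -> Dict[str, str]:
    """Parse 'Key: value' lines (assumed virsh domjobinfo layout) into a dict."""
    info: Dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        # Skip lines without a colon or without a value after it.
        if sep and value.strip():
            # Collapse the column padding inside the value.
            info[key.strip()] = " ".join(value.split())
    return info


def has_transfer_stats(info: Dict[str, str]) -> bool:
    """True if any 'Data ...' or 'Memory ...' field is present."""
    return any(key.startswith(("Data ", "Memory ")) for key in info)


if __name__ == "__main__":
    sample = "Job type:         Unbounded\nTime elapsed:     209616       ms\n"
    parsed = parse_domjobinfo(sample)
    print(parsed, has_transfer_stats(parsed))
```

With the short CentOS 7 output the second helper returns False, which is the situation described in this message.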

On Thu, Nov 24, 2016 at 22:54:10 +0100, Ruben Kerkhof wrote:
> Hi all,
>
> Running virsh domjobinfo on my CentOS 6 systems during a migration with --copy-storage-all, the output used to look like this:
>
> Job type: Unbounded
> Time elapsed: 1830632 ms
> Data processed: 37.212 GiB
> Data remaining: 1.025 GiB
> Data total: 16.016 GiB
> Memory processed: 37.212 GiB
> Memory remaining: 1.025 GiB
> Memory total: 16.016 GiB
> Memory bandwidth: 100.018 MiB/s
> Constant pages: 618279
> Normal pages: 9734623
> Normal data: 37.135 GiB
> Expected downtime: 1118 ms
> Setup time: 61 ms
>
> But when I run the same command on CentOS 7, libvirt-1.2.17-13 and qemu-kvm-1.5.3-105, I just get this:
>
> $ sudo virsh domjobinfo 58358fec-c35c-4a7a-a4dd-18ec2e1327bf
> Job type: Unbounded
> Time elapsed: 209616 ms
>
> I tried to debug this myself, but I'm a bit stuck. I'd expected libvirt to send query-migrate commands to qemu every 50 ms or so. To verify I attached gdb and put a breakpoint at qemuMonitorJSONGetMigrationStatus, but the breakpoint only hits after the complete disk is mirrored.
That's expected: QEMU finally implemented a migration event, which allowed us to stop polling every 50 ms. Libvirt will only call query-migrate at the end of a migration or when virsh domjobinfo is called.

The reason why you don't see anything interesting in the output is NBD storage migration. With a new enough QEMU, libvirt first migrates storage using QEMU's integrated NBD server, and only then is the "migrate" QMP command called. With old QEMU, storage migration was done by the "migrate" command itself. Thus calling query-migrate while storage is being migrated (i.e., before migrate was called) does not return anything useful.

I guess it should be possible to also check the progress of all running block jobs so that we can report statistics about ongoing storage migration even when NBD is used.

Jirka
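The block job idea above could be sketched roughly as follows. This is only an illustration of the aggregation step, not how libvirt implements or would implement it. It assumes each disk's copy job exposes a progress cursor and an end value, as the 'cur' and 'end' fields of virDomainGetBlockJobInfo do (in libvirt-python, virDomain.blockJobInfo() is expected to return them in a dict). The function name and the result keys are hypothetical.

```python
from typing import Dict, Tuple


def summarize_block_jobs(jobs: Dict[str, Tuple[int, int]]) -> Dict[str, float]:
    """Combine per-disk (cur, end) progress pairs into one summary.

    `jobs` maps a disk target such as "vda" to its (cur, end) values.
    The values are summed as-is, so this assumes all jobs report them
    in the same unit.
    """
    processed = sum(cur for cur, _ in jobs.values())
    total = sum(end for _, end in jobs.values())
    remaining = max(total - processed, 0)
    # Avoid dividing by zero when no job is running or nothing is reported yet.
    percent = (100.0 * processed / total) if total else 0.0
    return {
        "processed": processed,
        "remaining": remaining,
        "total": total,
        "percent": percent,
    }


if __name__ == "__main__":
    # Hypothetical values; in practice they might be collected with
    # something like dom.blockJobInfo("vda", 0) for every disk being copied.
    print(summarize_block_jobs({"vda": (50, 100), "vdb": (25, 100)}))
```

A summary like this would cover the storage copy phase only; once the "migrate" command has started, query-migrate reports the memory statistics as before.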

Hi Jiri,

On Fri, Nov 25, 2016 at 9:26 AM, Jiri Denemark <jdenemar@redhat.com> wrote:
> On Thu, Nov 24, 2016 at 22:54:10 +0100, Ruben Kerkhof wrote:
>> Hi all,
>>
>> Running virsh domjobinfo on my CentOS 6 systems during a migration with --copy-storage-all, the output used to look like this:
>>
>> Job type: Unbounded
>> Time elapsed: 1830632 ms
>> Data processed: 37.212 GiB
>> Data remaining: 1.025 GiB
>> Data total: 16.016 GiB
>> Memory processed: 37.212 GiB
>> Memory remaining: 1.025 GiB
>> Memory total: 16.016 GiB
>> Memory bandwidth: 100.018 MiB/s
>> Constant pages: 618279
>> Normal pages: 9734623
>> Normal data: 37.135 GiB
>> Expected downtime: 1118 ms
>> Setup time: 61 ms
>>
>> But when I run the same command on CentOS 7, libvirt-1.2.17-13 and qemu-kvm-1.5.3-105, I just get this:
>>
>> $ sudo virsh domjobinfo 58358fec-c35c-4a7a-a4dd-18ec2e1327bf
>> Job type: Unbounded
>> Time elapsed: 209616 ms
>>
>> I tried to debug this myself, but I'm a bit stuck. I'd expected libvirt to send query-migrate commands to qemu every 50 ms or so. To verify I attached gdb and put a breakpoint at qemuMonitorJSONGetMigrationStatus, but the breakpoint only hits after the complete disk is mirrored.
>
> That's expected since QEMU finally implemented a migration event, which allowed us to stop polling every 50 ms. Libvirt will only call query-migrate at the end of a migration or when virsh domjobinfo is called.
>
> The reason why you don't see anything interesting in the output is NBD storage migration. With new enough QEMU libvirt will first migrate storage using QEMU's integrated NBD server and then "migrate" QMP command will be called. With old QEMU storage migration was done by the "migrate" command itself. Thus calling query-migrate while storage is migrated (i.e., before migrate was called) does not provide anything.

That makes sense, thanks for the explanation.

> I guess it should be possible to also check the progress of all running block jobs so that we can report statistics about ongoing storage migration even when NBD is used.

That would be great, since I won't be able to upgrade to a newer version of QEMU yet. How do you want to handle this? Would you like me to open an issue in Bugzilla?

> Jirka

Kind regards,
Ruben Kerkhof

On Fri, Nov 25, 2016 at 11:38:36 +0100, Ruben Kerkhof wrote:
> Hi Jiri,
>
> On Fri, Nov 25, 2016 at 9:26 AM, Jiri Denemark <jdenemar@redhat.com> wrote:
>> I guess it should be possible to also check the progress of all running block jobs so that we can report statistics about ongoing storage migration even when NBD is used.
>
> That would be great, since I won't be able to upgrade to a newer version of QEMU yet. How do you want to handle this, would you like me to open an issue in bugzilla?

Yes please. Unless you want to provide the patches :-)

Jirka

On Fri, Nov 25, 2016 at 11:53 AM, Jiri Denemark <jdenemar@redhat.com> wrote:
>> How do you want to handle this, would you like me to open an issue in bugzilla?
>
> Yes please. Unless you want to provide the patches :-)

I'd love to, but it is a bit outside of my area of expertise. I opened https://bugzilla.redhat.com/1398599

> Jirka

Thanks Jirka!