On Wed, Jun 08, 2022 at 17:32:57 +0800, Han Han wrote:
Hi developers,
Recently I have been researching migration with non-shared disks (flags
VIR_MIGRATE_NON_SHARED_DISK and VIR_MIGRATE_NON_SHARED_INC).
As we know, the non-shared disk migration could have block jobs to copy the
disk image from the src host to the dst host. So here are my questions for
non-shared disk migration:
q1. For the API virDomainMigrate3 with the bandwidth param, could it set
the bandwidth of block jobs?
q2. For the API virDomainMigrateSetMaxSpeed, could it set the bandwidth of
block jobs?
q3. For the domain job abort API virDomainAbortJob, could it stop the block
job of non-shared disk migration?
q4. For the block job bandwidth API virDomainBlockJobSetSpeed, could it set
the block job of non-shared disk migration?
q5. For the block job abort API virDomainBlockJobAbort, could it stop the
block job of non-shared disk migration?
Here are my test results with libvirt-8.4.0-1.el9.x86_64 and
qemu-kvm-7.0.0-4.el9.x86_64:
q1: The bandwidth limit of virDomainMigrate3 takes effect on the block job:
➜ ~ virsh migrate OVMF qemu+ssh://root@hhan-rhel9--1/system --live --p2p
--tls --tls-destination hhan-rhel9--1 --copy-storage-all --disks-uri
tcp://hhan-rhel9--1:49156 --bandwidth 2
➜ ~ virsh blockjob OVMF vda
Block Copy: [ 0 %] Bandwidth limit: 2097152 bytes/s (2.000 MiB/s)
This is expected and desired.
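The limit reported by "virsh blockjob" is consistent with --bandwidth being interpreted in MiB/s. A minimal sketch of that arithmetic in Python; the helper name is made up for illustration and is not part of libvirt:

```python
MIB = 1024 * 1024  # bytes per MiB


def mib_per_s_to_bytes_per_s(mib_per_s: int) -> int:
    """Convert a bandwidth given in MiB/s to bytes/s."""
    return mib_per_s * MIB


# "--bandwidth 2" corresponds to the 2097152 bytes/s shown by "virsh blockjob".
print(mib_per_s_to_bytes_per_s(2))  # 2097152
```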
q2: The virDomainMigrateSetMaxSpeed doesn't change the bandwidth of
block jobs.
➜ ~ virsh migrate-setspeed OVMF 8
➜ ~ virsh blockjob OVMF vda
Block Copy: [ 9 %] Bandwidth limit: 2097152 bytes/s (2.000 MiB/s)
This is a bug, though. Since we want the global migration speed to
cover disks too, setting the migration speed should also apply to the
disk migration streams.
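To check whether a later migrate-setspeed reached the block job, one can compare the limit printed by "virsh blockjob" with the requested speed. A rough Python sketch, assuming the output format shown in the transcript above; the function names are hypothetical helpers, not libvirt API:

```python
import re

# Matches e.g. "Block Copy: [ 9 %] Bandwidth limit: 2097152 bytes/s (2.000 MiB/s)"
_LIMIT_RE = re.compile(r"Bandwidth limit:\s*(\d+)\s*bytes/s")


def blockjob_limit_bytes(output: str):
    """Return the block job bandwidth limit in bytes/s, or None if absent."""
    match = _LIMIT_RE.search(output)
    return int(match.group(1)) if match else None


def limit_matches(output: str, expected_mib_per_s: int) -> bool:
    """True if the reported limit equals the expected speed given in MiB/s."""
    limit = blockjob_limit_bytes(output)
    return limit is not None and limit == expected_mib_per_s * 1024 * 1024


# Output captured after "virsh migrate-setspeed OVMF 8" in the test above.
after_setspeed = "Block Copy: [ 9 %] Bandwidth limit: 2097152 bytes/s (2.000 MiB/s)"
print(limit_matches(after_setspeed, 8))  # False: the new speed was not applied
print(limit_matches(after_setspeed, 2))  # True: the original limit is still in place
```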
q3: The virDomainAbortJob can stop a block job of non-shared disk
migration:
➜ ~ virsh migrate OVMF qemu+ssh://root@hhan-rhel9--1/system --live --p2p
--tls --tls-destination hhan-rhel9--1 --copy-storage-all --disks-uri
tcp://hhan-rhel9--1:49156 --bandwidth 2
Then start a virsh event on another terminal:
➜ ~ virsh event --loop --all
Abort the domain job:
➜ ~ virsh domjobabort OVMF
The error "error: operation aborted: migration out: canceled by client"
appears at the terminal of "virsh migrate"
The terminal of "virsh event" shows that the block job has failed:
event 'block-job' for domain 'OVMF': Block Copy for
/var/lib/libvirt/images/OVMF.qcow2 failed
event 'block-job-2' for domain 'OVMF': Block Copy for vda failed
This is again expected. The blockjobs are started by the migration, so
when you cancel the migration we also need to cancel the blockjobs.
q4: The block job bandwidth of non-shared disk migration cannot be set by
virDomainBlockJobSetSpeed:
➜ ~ virsh blockjob OVMF vda --bandwidth 10
error: Timed out during operation: cannot acquire state change lock (held
by monitor=remoteDispatchDomainMigratePerform3Params)
This is okay, but we could take it as a feature request to allow tuning
of the individual blockjobs.
q5: The block job of non-shared disk migration cannot be aborted by
virDomainBlockJobAbort:
➜ ~ virsh blockjob OVMF vda --abort
error: Timed out during operation: cannot acquire state change lock (held
by monitor=remoteDispatchDomainMigratePerform3Params)
This is expected. Same as above, we don't want to allow users to
control this. In contrast to 'q4', I'd refuse an RFE to allow cancelling
of individual jobs.
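In both q4 and q5 the call fails because the migration holds the domain's state change lock. A small Python sketch for recognising that condition in captured virsh error text; it assumes the message format shown in the two transcripts above, and the helper name is hypothetical:

```python
import re

# The message may be wrapped by the terminal or mail client, hence \s+.
_LOCK_RE = re.compile(
    r"cannot acquire state change lock\s*\(held\s+by\s+(\w+)=(\w+)\)"
)


def lock_holder(error_text: str):
    """Return (kind, holder) if the error reports a held state change lock."""
    match = _LOCK_RE.search(error_text)
    return (match.group(1), match.group(2)) if match else None


msg = (
    "error: Timed out during operation: cannot acquire state change lock "
    "(held by monitor=remoteDispatchDomainMigratePerform3Params)"
)
print(lock_holder(msg))
```

A non-None result naming a migration API as the holder indicates that the block job belongs to an ongoing migration and is not controllable through the blockjob commands.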
Are the results above expected?
Here are my personal thoughts:
For the bandwidth in q1 and q2, they are commented as migration bandwidth(
https://gitlab.com/libvirt/libvirt/-/blob/master/include/libvirt/libvirt-...
,
https://gitlab.com/libvirt/libvirt/-/blob/master/src/libvirt-domain.c#L9696
), but one works for block jobs while the other doesn't. So we should make
the comments state clearly whether they refer to the bandwidth of the VM
migration alone or of the migration including its block jobs. What's more,
we could add a flag to virDomainMigrateMaxSpeedFlags to support setting the
bandwidth of the block jobs in a migration.
For q4 and q5, if we are not going to support changing the block jobs of
non-shared disk migration through the blockjob APIs, we should note that in
the migration doc or the block job doc, to explain the difference between
this type of block job and the others.