
I think current master has lost the behavior introduced by the following patch: https://github.com/libvirt/libvirt/commit/e8f263e0d006390c3764aaa07093b2d174...

On 2021/9/21 0:52, wangjie (P) wrote:
Steps to reproduce the bug:
1. Call migrateToURI3.
2. Kill libvirtd once the migration enters the memory migration phase, then restart libvirtd.
3. Call migrateToURI3 again. Every later attempt fails with the error message "Requested operation is not valid: domain has active block job".
I found the reason that triggers the bug, as follows:
1. The qemuBlockJobData is not persisted across a libvirtd restart, so the job returned by qemuBlockJobDiskGetJob is always NULL, and qemuMigrationSrcNBDCopyCancel is never called.
2. Call trace: qemuProcessReconnect -> qemuProcessRecoverJob -> qemuProcessRecoverMigrationOut -> qemuMigrationSrcCancel
3. The code is as follows:

qemuMigrationSrcCancel(virQEMUDriver *driver,
                       virDomainObj *vm)
{
    ... ...
    for (i = 0; i < vm->def->ndisks; i++) {
        virDomainDiskDef *disk = vm->def->disks[i];
        qemuDomainDiskPrivate *diskPriv = QEMU_DOMAIN_DISK_PRIVATE(disk);
        qemuBlockJobData *job;

        if (!(job = qemuBlockJobDiskGetJob(disk)) || //the job is always NULL !!!
            !qemuBlockJobIsRunning(job))
            diskPriv->migrating = false;

        if (diskPriv->migrating) {
            qemuBlockJobSyncBegin(job);
            storage = true;
        }

        virObjectUnref(job);
    }
    ... ...
    if (storage &&
        qemuMigrationSrcNBDCopyCancel(driver, vm, true,
                                      QEMU_ASYNC_JOB_NONE, NULL) < 0)
        return -1;
    ... ...
}
4. I think current master has lost the behavior introduced by the following patch: http://10.175.124.40/cgit/cgit.cgi/code.huawei.com/libvirt.git/commit/?id=e8...
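
To make the failure path easier to follow, here is a minimal standalone sketch of the loop logic described above. It is not libvirt code: FakeJob, FakeDisk and storage_cancel_needed are hypothetical stand-ins that I made up for illustration, and the sketch assumes that a NULL job pointer models qemuBlockJobDiskGetJob finding nothing after the restart.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for qemuBlockJobData (not the real libvirt type). */
typedef struct {
    bool running;
} FakeJob;

/* Hypothetical stand-in for a disk plus its private data. */
typedef struct {
    bool migrating;   /* mirrors diskPriv->migrating */
    FakeJob *job;     /* NULL models a job lookup that finds nothing */
} FakeDisk;

/*
 * Mirrors the shape of the loop quoted from qemuMigrationSrcCancel.
 * Returns true when the storage (NBD copy) cancel path would be taken.
 */
static bool
storage_cancel_needed(FakeDisk *disks, size_t ndisks)
{
    bool storage = false;
    size_t i;

    assert(disks != NULL || ndisks == 0);

    for (i = 0; i < ndisks; i++) {
        FakeJob *job = disks[i].job;

        /* No tracked job, or job not running: the disk is no longer
         * treated as migrating. */
        if (!job || !job->running)
            disks[i].migrating = false;

        if (disks[i].migrating)
            storage = true;
    }

    return storage;
}
```

With a tracked, running job the function returns true, so the cancel path would run. With job == NULL, which is the situation after libvirtd restarts, migrating is cleared and the function returns false, so in the real code the cancel step would be skipped while the block job is presumably still active in QEMU.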
Can you give some suggestions on how to fix it?