
On 06/10/2015 09:42 AM, Jiri Denemark wrote:
QEMU will soon get support for migration events (patches are available on qemu-devel), which will finally allow us to get rid of polling query-migrate every 50ms. However, we first need to be able to wait for all events related to migration (migration status changes, block job events, async abort requests) at once. This series prepares the infrastructure and uses it to switch all polling loops in migration code to pthread_cond_wait.
https://bugzilla.redhat.com/show_bug.cgi?id=1212077
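
For readers not familiar with the pattern: the idea is to replace the 50ms polling loop with a blocking wait on a per-domain condition variable that event handlers signal. A minimal sketch of that pattern in plain POSIX C follows; the struct and function names here are hypothetical stand-ins, not the series' actual code (the series hangs the condition off virDomainObj instead):

    #include <pthread.h>
    #include <stdbool.h>

    /* Hypothetical per-domain state; the real series adds a condition
     * variable to virDomainObj. */
    struct domain {
        pthread_mutex_t lock;
        pthread_cond_t cond;      /* signaled on any migration-related event */
        int migration_status;     /* updated by event handlers */
    };

    /* Event handler: record the new status and wake all waiters. */
    void domain_signal_event(struct domain *dom, int status)
    {
        pthread_mutex_lock(&dom->lock);
        dom->migration_status = status;
        pthread_cond_broadcast(&dom->cond);
        pthread_mutex_unlock(&dom->lock);
    }

    /* Migration job: instead of polling query-migrate every 50ms, sleep
     * on the condition until an event handler signals a change. */
    int domain_wait_for_completion(struct domain *dom, bool (*done)(int))
    {
        int rc = 0;
        pthread_mutex_lock(&dom->lock);
        while (!done(dom->migration_status)) {
            if (pthread_cond_wait(&dom->cond, &dom->lock) != 0) {
                rc = -1;
                break;
            }
        }
        pthread_mutex_unlock(&dom->lock);
        return rc;
    }

Because a single condition serves status changes, block job events, and abort requests alike, one wait loop can react to all of them at once.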
Version 3 (see individual patches for details):
- most of the series has been ACKed in v2
- "qemu: Use domain condition for synchronous block jobs" was split in
  3 patches for easier review
- minor changes requested in v2 review

Version 2 (see individual patches for details):
- rewritten using per-domain condition variable
- enhanced to fully support the migration events
Jiri Denemark (24):
  conf: Introduce per-domain condition variable
  qemu: Introduce qemuBlockJobUpdate
  qemu: Properly report failed migration
  qemu: Use domain condition for synchronous block jobs
  qemu: Cancel storage migration in parallel
  qemu: Abort migration early if disk mirror failed
  qemu: Don't mess with disk->mirrorState
  Pass domain object to private data formatter/parser
  qemu: Make qemuMigrationCancelDriveMirror usable without async job
  qemu: Refactor qemuMonitorBlockJobInfo
  qemu: Cancel disk mirrors after libvirtd restart
  qemu: Use domain condition for asyncAbort
  qemu_monitor: Wire up SPICE_MIGRATE_COMPLETED event
  qemu: Do not poll for spice migration status
  qemu: Refactor qemuDomainGetJob{Info,Stats}
  qemu: Refactor qemuMigrationUpdateJobStatus
  qemu: Don't pass redundant job name around
  qemu: Refactor qemuMigrationWaitForCompletion
  qemu_monitor: Wire up MIGRATION event
  qemuDomainGetJobStatsInternal: Support migration events
  qemu: Update migration state according to MIGRATION event
  qemu: Wait for migration events on domain condition
  qemu: cancel drive mirrors when p2p connection breaks
  DO NOT APPLY: qemu: Work around weird migration status changes
 po/POTFILES.in               |   1 -
 src/conf/domain_conf.c       |  51 ++-
 src/conf/domain_conf.h       |  12 +-
 src/libvirt_private.syms     |   6 +
 src/libxl/libxl_domain.c     |  10 +-
 src/lxc/lxc_domain.c         |  12 +-
 src/qemu/qemu_blockjob.c     | 185 +++--------
 src/qemu/qemu_blockjob.h     |  15 +-
 src/qemu/qemu_capabilities.c |   3 +
 src/qemu/qemu_capabilities.h |   1 +
 src/qemu/qemu_domain.c       |  78 +++--
 src/qemu/qemu_domain.h       |   7 +-
 src/qemu/qemu_driver.c       | 201 +++++++-----
 src/qemu/qemu_migration.c    | 763 +++++++++++++++++++++++++++++--------------
 src/qemu/qemu_migration.h    |   8 +
 src/qemu/qemu_monitor.c      |  73 ++++-
 src/qemu/qemu_monitor.h      |  33 +-
 src/qemu/qemu_monitor_json.c | 152 ++++-----
 src/qemu/qemu_monitor_json.h |   7 +-
 src/qemu/qemu_process.c      |  92 +++++-
 tests/qemumonitorjsontest.c  |  40 ---
 21 files changed, 1057 insertions(+), 693 deletions(-)
Just ran this through my Coverity checker - only one issue, a
RESOURCE_LEAK in qemuMigrationRun:

4235             if (qemuMigrationCheckJobStatus(driver, vm,
4236                                             QEMU_ASYNC_JOB_MIGRATION_OUT) < 0)
(4) Event if_end: End of if statement
4237                 goto cancel;
4238
(5) Event open_fn: Returning handle opened by "accept".
(6) Event var_assign: Assigning: "fd" = handle returned from
    "accept(spec->dest.unix_socket.sock, __SOCKADDR_ARG({ .__sockaddr__ = NULL}), NULL)".
(7) Event cond_false: Condition "(fd = accept(spec->dest.unix_socket.sock,
    __SOCKADDR_ARG({ .__sockaddr__ = NULL}), NULL)) < 0", taking false branch
Also see events: [leaked_handle]
4239             while ((fd = accept(spec->dest.unix_socket.sock, NULL, NULL)) < 0) {
4240                 if (errno == EAGAIN || errno == EINTR)
4241                     continue;
...
4252     rc = qemuMigrationWaitForCompletion(driver, vm,
4253                                         QEMU_ASYNC_JOB_MIGRATION_OUT,
4254                                         dconn, abort_on_error, !!mig->nbd);
(13) Event cond_true: Condition "rc == -2", taking true branch
4255     if (rc == -2)
(14) Event goto: Jumping to label "cancel"
4256         goto cancel;
4257     else if (rc == -1)
...
4288
(28) Event cond_false: Condition "spec->fwdType != MIGRATION_FWD_DIRECT",
    taking false branch
4289     if (spec->fwdType != MIGRATION_FWD_DIRECT) {
4290         if (iothread && qemuMigrationStopTunnel(iothread, ret < 0) < 0)
4291             ret = -1;
4292         VIR_FORCE_CLOSE(fd);
(29) Event if_end: End of if statement
4293     }
4294
...
4322     }
4323
(38) Event leaked_handle: Handle variable "fd" going out of scope leaks the handle.
Also see events: [open_fn][var_assign]
4324     return ret;
4325
4326 exit_monitor:
4327     ignore_value(qemuDomainObjExitMonitor(driver, vm));
4328     goto cleanup;
4329
(15) Event label: Reached label "cancel"
4330 cancel:
4331     orig_err = virSaveLastError();
4332
(16) Event cond_true: Condition "virDomainObjIsActive(vm)", taking true branch
...
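
In other words, "fd" is opened by accept() but only closed inside the
fwdType != MIGRATION_FWD_DIRECT branch, so with direct forwarding the
function can return with the handle still open. One conventional shape
of a fix is to close the descriptor unconditionally on the common exit
path; a self-contained sketch of that pattern (hypothetical helper, not
the actual libvirt patch, which would use VIR_FORCE_CLOSE):

    #include <errno.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int serve_one_client(int listen_fd)
    {
        int ret = -1;
        int fd = -1;

        while ((fd = accept(listen_fd, NULL, NULL)) < 0) {
            if (errno == EAGAIN || errno == EINTR)
                continue;
            goto cleanup;       /* fd is still -1 here, close is skipped */
        }

        /* ... do the work; any failure jumps to cleanup ... */
        ret = 0;

     cleanup:
        /* Closing here covers every exit path, including error gotos
         * that would otherwise bypass a branch-local close. */
        if (fd >= 0)
            close(fd);
        return ret;
    }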