[libvirt] [PATCH v3 00/24] Add support for migration events

QEMU will soon (patches are available on qemu-devel) get support for migration events, which will finally allow us to get rid of polling query-migrate every 50ms. However, we first need to be able to wait for all events related to migration (migration status changes, block job events, async abort requests) at once. This series prepares the infrastructure and uses it to switch all polling loops in migration code to pthread_cond_wait.

https://bugzilla.redhat.com/show_bug.cgi?id=1212077

Version 3 (see individual patches for details):
- most of the series has been ACKed in v2
- "qemu: Use domain condition for synchronous block jobs" was split in 3 patches for easier review
- minor changes requested in v2 review

Version 2 (see individual patches for details):
- rewritten using per-domain condition variable
- enhanced to fully support the migration events

Jiri Denemark (24):
  conf: Introduce per-domain condition variable
  qemu: Introduce qemuBlockJobUpdate
  qemu: Properly report failed migration
  qemu: Use domain condition for synchronous block jobs
  qemu: Cancel storage migration in parallel
  qemu: Abort migration early if disk mirror failed
  qemu: Don't mess with disk->mirrorState
  Pass domain object to private data formatter/parser
  qemu: Make qemuMigrationCancelDriveMirror usable without async job
  qemu: Refactor qemuMonitorBlockJobInfo
  qemu: Cancel disk mirrors after libvirtd restart
  qemu: Use domain condition for asyncAbort
  qemu_monitor: Wire up SPICE_MIGRATE_COMPLETED event
  qemu: Do not poll for spice migration status
  qemu: Refactor qemuDomainGetJob{Info,Stats}
  qemu: Refactor qemuMigrationUpdateJobStatus
  qemu: Don't pass redundant job name around
  qemu: Refactor qemuMigrationWaitForCompletion
  qemu_monitor: Wire up MIGRATION event
  qemuDomainGetJobStatsInternal: Support migration events
  qemu: Update migration state according to MIGRATION event
  qemu: Wait for migration events on domain condition
  qemu: cancel drive mirrors when p2p connection breaks
  DO NOT APPLY: qemu: Work around weird migration status changes

 po/POTFILES.in               |   1 -
 src/conf/domain_conf.c       |  51 ++-
 src/conf/domain_conf.h       |  12 +-
 src/libvirt_private.syms     |   6 +
 src/libxl/libxl_domain.c     |  10 +-
 src/lxc/lxc_domain.c         |  12 +-
 src/qemu/qemu_blockjob.c     | 185 +++--------
 src/qemu/qemu_blockjob.h     |  15 +-
 src/qemu/qemu_capabilities.c |   3 +
 src/qemu/qemu_capabilities.h |   1 +
 src/qemu/qemu_domain.c       |  78 +++--
 src/qemu/qemu_domain.h       |   7 +-
 src/qemu/qemu_driver.c       | 201 +++++-----
 src/qemu/qemu_migration.c    | 763 +++++++++++++++++++++++++++++--------------
 src/qemu/qemu_migration.h    |   8 +
 src/qemu/qemu_monitor.c      |  73 ++++-
 src/qemu/qemu_monitor.h      |  33 +-
 src/qemu/qemu_monitor_json.c | 152 ++++-----
 src/qemu/qemu_monitor_json.h |   7 +-
 src/qemu/qemu_process.c      |  92 +++++-
 tests/qemumonitorjsontest.c  |  40 ---
 21 files changed, 1057 insertions(+), 693 deletions(-)

--
2.4.3

Complex jobs, such as migration, need to monitor several events at once, which is impossible when each of the event uses its own condition variable. This patch adds a single condition variable to each domain object. This variable can be used instead of the other event specific conditions. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- Notes: ACKed in version 2 Version 3: - rebased (context conflict in libvirt_private.syms) Version 2: - new patch which replaces thread queues and conditions (patch 1 and 2 in version 1) src/conf/domain_conf.c | 47 +++++++++++++++++++++++++++++++++++++++++++++++ src/conf/domain_conf.h | 6 ++++++ src/libvirt_private.syms | 4 ++++ 3 files changed, 57 insertions(+) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 36de844..433183f 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -2509,6 +2509,7 @@ static void virDomainObjDispose(void *obj) virDomainObjPtr dom = obj; VIR_DEBUG("obj=%p", dom); + virCondDestroy(&dom->cond); virDomainDefFree(dom->def); virDomainDefFree(dom->newDef); @@ -2529,6 +2530,12 @@ virDomainObjNew(virDomainXMLOptionPtr xmlopt) if (!(domain = virObjectLockableNew(virDomainObjClass))) return NULL; + if (virCondInit(&domain->cond) < 0) { + virReportSystemError(errno, "%s", + _("failed to initialize domain condition")); + goto error; + } + if (xmlopt->privateData.alloc) { if (!(domain->privateData = (xmlopt->privateData.alloc)())) goto error; @@ -2651,6 +2658,46 @@ virDomainObjEndAPI(virDomainObjPtr *vm) } +void +virDomainObjSignal(virDomainObjPtr vm) +{ + virCondSignal(&vm->cond); +} + + +void +virDomainObjBroadcast(virDomainObjPtr vm) +{ + virCondBroadcast(&vm->cond); +} + + +int +virDomainObjWait(virDomainObjPtr vm) +{ + if (virCondWait(&vm->cond, &vm->parent.lock) < 0) { + virReportSystemError(errno, "%s", + _("failed to wait for domain condition")); + return -1; + } + return 0; +} + + +int +virDomainObjWaitUntil(virDomainObjPtr vm, + unsigned long long whenms) +{ + if (virCondWaitUntil(&vm->cond, &vm->parent.lock, whenms) < 0 && + errno != ETIMEDOUT) { + virReportSystemError(errno, "%s", + _("failed to wait for domain condition")); + return -1; + } + return 0; +} + + /* * * If flags & VIR_DOMAIN_OBJ_LIST_ADD_CHECK_LIVE then diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index ba17a8d..ac29ce5 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -2318,6 +2318,7 @@ typedef struct _virDomainObj virDomainObj; typedef virDomainObj *virDomainObjPtr; struct _virDomainObj { virObjectLockable parent; + virCond cond; pid_t pid; virDomainStateReason state; @@ -2437,6 +2438,11 @@ void virDomainObjEndAPI(virDomainObjPtr *vm); bool virDomainObjTaint(virDomainObjPtr obj, virDomainTaintFlags taint); +void virDomainObjSignal(virDomainObjPtr vm); +void virDomainObjBroadcast(virDomainObjPtr vm); +int virDomainObjWait(virDomainObjPtr vm); +int virDomainObjWaitUntil(virDomainObjPtr vm, + unsigned long long whenms); int virDomainDefCheckUnsupportedMemoryHotplug(virDomainDefPtr def); int virDomainDeviceDefCheckUnsupportedMemoryDevice(virDomainDeviceDefPtr dev); diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 55a5e19..62a4b4c 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -380,6 +380,7 @@ virDomainNetTypeToString; virDomainNostateReasonTypeFromString; virDomainNostateReasonTypeToString; virDomainObjAssignDef; +virDomainObjBroadcast; virDomainObjCopyPersistentDef; virDomainObjEndAPI; virDomainObjFormat; @@ -408,8 +409,11 @@ virDomainObjParseNode; 
virDomainObjSetDefTransient; virDomainObjSetMetadata; virDomainObjSetState; +virDomainObjSignal; virDomainObjTaint; virDomainObjUpdateModificationImpact; +virDomainObjWait; +virDomainObjWaitUntil; virDomainOSTypeFromString; virDomainOSTypeToString; virDomainParseMemory; -- 2.4.3
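For readers skimming the series, the usage pattern this condition enables looks roughly like the following condensed sketch (based on how later patches in the series use it; somethingPending() and recordEvent() are hypothetical stand-ins, error handling trimmed):

    /* Waiter: check a predicate in a loop with the domain lock held;
     * virDomainObjWait() atomically drops vm->parent.lock while sleeping. */
    virObjectLock(vm);
    while (somethingPending(vm)) {          /* hypothetical predicate */
        if (virDomainObjWait(vm) < 0)
            goto error;
    }
    virObjectUnlock(vm);

    /* Event handler: record the event and wake up all waiters. */
    virObjectLock(vm);
    recordEvent(vm, event);                 /* hypothetical helper */
    virDomainObjBroadcast(vm);
    virObjectUnlock(vm);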

The wrapper is useful for calling qemuBlockJobEventProcess with the event details stored in disk's privateData, which is the most likely usage of qemuBlockJobEventProcess. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- Notes: ACKed in version 1 and 2 Version 3: - better flow in qemuBlockJobUpdate - document qemuBlockJobUpdate Version 2: - no changes ACKed in version 1 and 2 Version 3: - better flow in qemuBlockJobUpdate Version 2: - no changes src/libvirt_private.syms | 2 ++ src/qemu/qemu_blockjob.c | 47 +++++++++++++++++++++++++++++++++++++++-------- src/qemu/qemu_blockjob.h | 3 +++ 3 files changed, 44 insertions(+), 8 deletions(-) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 62a4b4c..0c9fa06 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -265,6 +265,8 @@ virDomainDiskInsert; virDomainDiskInsertPreAlloced; virDomainDiskIoTypeFromString; virDomainDiskIoTypeToString; +virDomainDiskMirrorStateTypeFromString; +virDomainDiskMirrorStateTypeToString; virDomainDiskPathByName; virDomainDiskRemove; virDomainDiskRemoveByName; diff --git a/src/qemu/qemu_blockjob.c b/src/qemu/qemu_blockjob.c index 098a43a..eb05cef 100644 --- a/src/qemu/qemu_blockjob.c +++ b/src/qemu/qemu_blockjob.c @@ -38,6 +38,37 @@ VIR_LOG_INIT("qemu.qemu_blockjob"); + +/** + * qemuBlockJobUpdate: + * @driver: qemu driver + * @vm: domain + * @disk: domain disk + * + * Update disk's mirror state in response to a block job event stored in + * blockJobStatus by qemuProcessHandleBlockJob event handler. + * + * Returns the block job event processed or -1 if there was no pending event. + */ +int +qemuBlockJobUpdate(virQEMUDriverPtr driver, + virDomainObjPtr vm, + virDomainDiskDefPtr disk) +{ + qemuDomainDiskPrivatePtr diskPriv = QEMU_DOMAIN_DISK_PRIVATE(disk); + int status = diskPriv->blockJobStatus; + + if (status != -1) { + qemuBlockJobEventProcess(driver, vm, disk, + diskPriv->blockJobType, + diskPriv->blockJobStatus); + diskPriv->blockJobStatus = -1; + } + + return status; +} + + /** * qemuBlockJobEventProcess: * @driver: qemu driver @@ -49,8 +80,6 @@ VIR_LOG_INIT("qemu.qemu_blockjob"); * Update disk's mirror state in response to a block job event * from QEMU. For mirror state's that must survive libvirt * restart, also update the domain's status XML. - * - * Returns 0 on success, -1 otherwise. */ void qemuBlockJobEventProcess(virQEMUDriverPtr driver, @@ -67,6 +96,12 @@ qemuBlockJobEventProcess(virQEMUDriverPtr driver, bool save = false; qemuDomainDiskPrivatePtr diskPriv = QEMU_DOMAIN_DISK_PRIVATE(disk); + VIR_DEBUG("disk=%s, mirrorState=%s, type=%d, status=%d", + disk->dst, + NULLSTR(virDomainDiskMirrorStateTypeToString(disk->mirrorState)), + type, + status); + /* Have to generate two variants of the event for old vs. 
new * client callbacks */ if (type == VIR_DOMAIN_BLOCK_JOB_TYPE_COMMIT && @@ -218,9 +253,7 @@ qemuBlockJobSyncEnd(virQEMUDriverPtr driver, if (diskPriv->blockJobSync && diskPriv->blockJobStatus != -1) { if (ret_status) *ret_status = diskPriv->blockJobStatus; - qemuBlockJobEventProcess(driver, vm, disk, - diskPriv->blockJobType, - diskPriv->blockJobStatus); + qemuBlockJobUpdate(driver, vm, disk); diskPriv->blockJobStatus = -1; } diskPriv->blockJobSync = false; @@ -300,9 +333,7 @@ qemuBlockJobSyncWaitWithTimeout(virQEMUDriverPtr driver, if (ret_status) *ret_status = diskPriv->blockJobStatus; - qemuBlockJobEventProcess(driver, vm, disk, - diskPriv->blockJobType, - diskPriv->blockJobStatus); + qemuBlockJobUpdate(driver, vm, disk); diskPriv->blockJobStatus = -1; return 0; diff --git a/src/qemu/qemu_blockjob.h b/src/qemu/qemu_blockjob.h index ba372a2..81e893e 100644 --- a/src/qemu/qemu_blockjob.h +++ b/src/qemu/qemu_blockjob.h @@ -25,6 +25,9 @@ # include "internal.h" # include "qemu_conf.h" +int qemuBlockJobUpdate(virQEMUDriverPtr driver, + virDomainObjPtr vm, + virDomainDiskDefPtr disk); void qemuBlockJobEventProcess(virQEMUDriverPtr driver, virDomainObjPtr vm, virDomainDiskDefPtr disk, -- 2.4.3

On Wed, Jun 10, 2015 at 15:42:36 +0200, Jiri Denemark wrote:
The wrapper is useful for calling qemuBlockJobEventProcess with the event details stored in disk's privateData, which is the most likely usage of qemuBlockJobEventProcess.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com> ---
Notes: ACKed in version 1 and 2
Version 3: - better flow in qemuBlockJobUpdate - document qemuBlockJobUpdate
Version 2: - no changes
 src/libvirt_private.syms |  2 ++
 src/qemu/qemu_blockjob.c | 47 +++++++++++++++++++++++++++++++++++++++--------
 src/qemu/qemu_blockjob.h |  3 +++
 3 files changed, 44 insertions(+), 8 deletions(-)
...
diff --git a/src/qemu/qemu_blockjob.c b/src/qemu/qemu_blockjob.c
index 098a43a..eb05cef 100644
--- a/src/qemu/qemu_blockjob.c
+++ b/src/qemu/qemu_blockjob.c
@@ -38,6 +38,37 @@
VIR_LOG_INIT("qemu.qemu_blockjob");
+
+/**
+ * qemuBlockJobUpdate:
+ * @driver: qemu driver
+ * @vm: domain
+ * @disk: domain disk
+ *
+ * Update disk's mirror state in response to a block job event stored in
+ * blockJobStatus by qemuProcessHandleBlockJob event handler.
+ *
+ * Returns the block job event processed or -1 if there was no pending event.
+ */
+int
+qemuBlockJobUpdate(virQEMUDriverPtr driver,
+                   virDomainObjPtr vm,
+                   virDomainDiskDefPtr disk)
+{
+    qemuDomainDiskPrivatePtr diskPriv = QEMU_DOMAIN_DISK_PRIVATE(disk);
+    int status = diskPriv->blockJobStatus;
+
+    if (status != -1) {
+        qemuBlockJobEventProcess(driver, vm, disk,
+                                 diskPriv->blockJobType,
+                                 diskPriv->blockJobStatus);
+        diskPriv->blockJobStatus = -1;
+    }
+
+    return status;
+}
Looks much better!
+
+
 /**
  * qemuBlockJobEventProcess:
  * @driver: qemu driver
ACK, Peter

Because we are polling we may detect some errors after we asked QEMU for migration status even though they occurred before. If this happens and QEMU reports migration completed successfully, we would happily report the migration succeeded even though we should have cancelled it because of the other error. In practise it is not a big issue now but it will become a much bigger issue once the check for storage migration status is moved inside the loop in qemuMigrationWaitForCompletion. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- Notes: ACKed in version 2 Version 3: - no change Version 2: - already ACKed in version 1 :-) - really do what commit message describes src/qemu/qemu_migration.c | 48 +++++++++++++++++++++++------------------------ 1 file changed, 23 insertions(+), 25 deletions(-) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 70400f3..8d01468 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -2442,6 +2442,7 @@ qemuMigrationWaitForCompletion(virQEMUDriverPtr driver, qemuDomainJobInfoPtr jobInfo = priv->job.current; const char *job; int pauseReason; + int ret = -1; switch (priv->job.asyncJob) { case QEMU_ASYNC_JOB_MIGRATION_OUT: @@ -2459,12 +2460,12 @@ qemuMigrationWaitForCompletion(virQEMUDriverPtr driver, jobInfo->type = VIR_DOMAIN_JOB_UNBOUNDED; - while (1) { + while (jobInfo->type == VIR_DOMAIN_JOB_UNBOUNDED) { /* Poll every 50ms for progress & to allow cancellation */ struct timespec ts = { .tv_sec = 0, .tv_nsec = 50 * 1000 * 1000ull }; if (qemuMigrationUpdateJobStatus(driver, vm, job, asyncJob) == -1) - break; + goto error; /* cancel migration if disk I/O error is emitted while migrating */ if (abort_on_error && @@ -2472,40 +2473,37 @@ qemuMigrationWaitForCompletion(virQEMUDriverPtr driver, pauseReason == VIR_DOMAIN_PAUSED_IOERROR) { virReportError(VIR_ERR_OPERATION_FAILED, _("%s: %s"), job, _("failed due to I/O error")); - break; + goto error; } if (dconn && virConnectIsAlive(dconn) <= 0) { virReportError(VIR_ERR_OPERATION_FAILED, "%s", _("Lost connection to destination host")); - break; + goto error; } - if (jobInfo->type != VIR_DOMAIN_JOB_UNBOUNDED) - break; - - virObjectUnlock(vm); - - nanosleep(&ts, NULL); - - virObjectLock(vm); + if (jobInfo->type == VIR_DOMAIN_JOB_UNBOUNDED) { + virObjectUnlock(vm); + nanosleep(&ts, NULL); + virObjectLock(vm); + } } - if (jobInfo->type == VIR_DOMAIN_JOB_COMPLETED) { - qemuDomainJobInfoUpdateDowntime(jobInfo); - VIR_FREE(priv->job.completed); - if (VIR_ALLOC(priv->job.completed) == 0) - *priv->job.completed = *jobInfo; - return 0; - } else if (jobInfo->type == VIR_DOMAIN_JOB_UNBOUNDED) { - /* The migration was aborted by us rather than QEMU itself so let's - * update the job type and notify the caller to send migrate_cancel. - */ + qemuDomainJobInfoUpdateDowntime(jobInfo); + VIR_FREE(priv->job.completed); + if (VIR_ALLOC(priv->job.completed) == 0) + *priv->job.completed = *jobInfo; + return 0; + + error: + /* Check if the migration was aborted by us rather than QEMU itself. */ + if (jobInfo->type == VIR_DOMAIN_JOB_UNBOUNDED || + jobInfo->type == VIR_DOMAIN_JOB_COMPLETED) { + if (jobInfo->type == VIR_DOMAIN_JOB_UNBOUNDED) + ret = -2; jobInfo->type = VIR_DOMAIN_JOB_FAILED; - return -2; - } else { - return -1; } + return ret; } -- 2.4.3
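The return-value contract this patch establishes, condensed from how qemuMigrationRun consumes the result elsewhere in the series (a sketch, not a complete caller):

    rc = qemuMigrationWaitForCompletion(driver, vm,
                                        QEMU_ASYNC_JOB_MIGRATION_OUT,
                                        dconn, abort_on_error);
    if (rc == -2) {
        /* we aborted the migration ourselves; QEMU still thinks it is
         * running, so the caller has to send migrate_cancel */
        goto cancel;
    } else if (rc == -1) {
        /* migration failed on the QEMU side */
        goto cleanup;
    }
    /* rc == 0: migration completed, priv->job.completed has been filled in */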

By switching block jobs to use domain conditions, we can drop some pretty complicated code in NBD storage migration. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- Notes: Version 3: - split into 3 patches Version 2: - slightly modified to use domain conditions po/POTFILES.in | 1 - src/qemu/qemu_blockjob.c | 137 +++------------------------------------------- src/qemu/qemu_blockjob.h | 12 +--- src/qemu/qemu_domain.c | 17 +----- src/qemu/qemu_domain.h | 1 - src/qemu/qemu_driver.c | 24 ++++---- src/qemu/qemu_migration.c | 112 +++++++++++++++++-------------------- src/qemu/qemu_process.c | 13 ++--- 8 files changed, 76 insertions(+), 241 deletions(-) diff --git a/po/POTFILES.in b/po/POTFILES.in index bb0f6e1..dd06ab3 100644 --- a/po/POTFILES.in +++ b/po/POTFILES.in @@ -112,7 +112,6 @@ src/parallels/parallels_utils.h src/parallels/parallels_storage.c src/phyp/phyp_driver.c src/qemu/qemu_agent.c -src/qemu/qemu_blockjob.c src/qemu/qemu_capabilities.c src/qemu/qemu_cgroup.c src/qemu/qemu_command.c diff --git a/src/qemu/qemu_blockjob.c b/src/qemu/qemu_blockjob.c index eb05cef..3aa6118 100644 --- a/src/qemu/qemu_blockjob.c +++ b/src/qemu/qemu_blockjob.c @@ -214,19 +214,17 @@ qemuBlockJobEventProcess(virQEMUDriverPtr driver, * * During a synchronous block job, a block job event for @disk * will not be processed asynchronously. Instead, it will be - * processed only when qemuBlockJobSyncWait* or - * qemuBlockJobSyncEnd is called. + * processed only when qemuBlockJobUpdate or qemuBlockJobSyncEnd + * is called. */ void qemuBlockJobSyncBegin(virDomainDiskDefPtr disk) { qemuDomainDiskPrivatePtr diskPriv = QEMU_DOMAIN_DISK_PRIVATE(disk); - if (diskPriv->blockJobSync) - VIR_WARN("Disk %s already has synchronous block job", - disk->dst); - + VIR_DEBUG("disk=%s", disk->dst); diskPriv->blockJobSync = true; + diskPriv->blockJobStatus = -1; } @@ -235,135 +233,16 @@ qemuBlockJobSyncBegin(virDomainDiskDefPtr disk) * @driver: qemu driver * @vm: domain * @disk: domain disk - * @ret_status: pointer to virConnectDomainEventBlockJobStatus * * End a synchronous block job for @disk. Any pending block job event - * for the disk is processed, and its status is recorded in the - * virConnectDomainEventBlockJobStatus field pointed to by - * @ret_status. + * for the disk is processed. */ void qemuBlockJobSyncEnd(virQEMUDriverPtr driver, virDomainObjPtr vm, - virDomainDiskDefPtr disk, - virConnectDomainEventBlockJobStatus *ret_status) + virDomainDiskDefPtr disk) { - qemuDomainDiskPrivatePtr diskPriv = QEMU_DOMAIN_DISK_PRIVATE(disk); - - if (diskPriv->blockJobSync && diskPriv->blockJobStatus != -1) { - if (ret_status) - *ret_status = diskPriv->blockJobStatus; - qemuBlockJobUpdate(driver, vm, disk); - diskPriv->blockJobStatus = -1; - } - diskPriv->blockJobSync = false; -} - - -/** - * qemuBlockJobSyncWaitWithTimeout: - * @driver: qemu driver - * @vm: domain - * @disk: domain disk - * @timeout: timeout in milliseconds - * @ret_status: pointer to virConnectDomainEventBlockJobStatus - * - * Wait up to @timeout milliseconds for a block job event for @disk. - * If an event is received it is processed, and its status is recorded - * in the virConnectDomainEventBlockJobStatus field pointed to by - * @ret_status. - * - * If @timeout is not 0, @vm will be unlocked while waiting for the event. - * - * Returns 0 if an event was received or the timeout expired, - * -1 otherwise. 
- */ -int -qemuBlockJobSyncWaitWithTimeout(virQEMUDriverPtr driver, - virDomainObjPtr vm, - virDomainDiskDefPtr disk, - unsigned long long timeout, - virConnectDomainEventBlockJobStatus *ret_status) -{ - qemuDomainDiskPrivatePtr diskPriv = QEMU_DOMAIN_DISK_PRIVATE(disk); - - if (!diskPriv->blockJobSync) { - virReportError(VIR_ERR_INTERNAL_ERROR, "%s", - _("No current synchronous block job")); - return -1; - } - - while (diskPriv->blockJobSync && diskPriv->blockJobStatus == -1) { - int r; - - if (!virDomainObjIsActive(vm)) { - virReportError(VIR_ERR_INTERNAL_ERROR, "%s", - _("guest unexpectedly quit")); - diskPriv->blockJobSync = false; - return -1; - } - - if (timeout == (unsigned long long)-1) { - r = virCondWait(&diskPriv->blockJobSyncCond, &vm->parent.lock); - } else if (timeout) { - unsigned long long now; - if (virTimeMillisNow(&now) < 0) { - virReportSystemError(errno, "%s", - _("Unable to get current time")); - return -1; - } - r = virCondWaitUntil(&diskPriv->blockJobSyncCond, - &vm->parent.lock, - now + timeout); - if (r < 0 && errno == ETIMEDOUT) - return 0; - } else { - errno = ETIMEDOUT; - return 0; - } - - if (r < 0) { - diskPriv->blockJobSync = false; - virReportSystemError(errno, "%s", - _("Unable to wait on block job sync " - "condition")); - return -1; - } - } - - if (ret_status) - *ret_status = diskPriv->blockJobStatus; + VIR_DEBUG("disk=%s", disk->dst); qemuBlockJobUpdate(driver, vm, disk); - diskPriv->blockJobStatus = -1; - - return 0; -} - - -/** - * qemuBlockJobSyncWait: - * @driver: qemu driver - * @vm: domain - * @disk: domain disk - * @ret_status: pointer to virConnectDomainEventBlockJobStatus - * - * Wait for a block job event for @disk. If an event is received it - * is processed, and its status is recorded in the - * virConnectDomainEventBlockJobStatus field pointed to by - * @ret_status. - * - * @vm will be unlocked while waiting for the event. - * - * Returns 0 if an event was received, - * -1 otherwise. 
- */ -int -qemuBlockJobSyncWait(virQEMUDriverPtr driver, - virDomainObjPtr vm, - virDomainDiskDefPtr disk, - virConnectDomainEventBlockJobStatus *ret_status) -{ - return qemuBlockJobSyncWaitWithTimeout(driver, vm, disk, - (unsigned long long)-1, - ret_status); + QEMU_DOMAIN_DISK_PRIVATE(disk)->blockJobSync = false; } diff --git a/src/qemu/qemu_blockjob.h b/src/qemu/qemu_blockjob.h index 81e893e..775ce95 100644 --- a/src/qemu/qemu_blockjob.h +++ b/src/qemu/qemu_blockjob.h @@ -37,16 +37,6 @@ void qemuBlockJobEventProcess(virQEMUDriverPtr driver, void qemuBlockJobSyncBegin(virDomainDiskDefPtr disk); void qemuBlockJobSyncEnd(virQEMUDriverPtr driver, virDomainObjPtr vm, - virDomainDiskDefPtr disk, - virConnectDomainEventBlockJobStatus *ret_status); -int qemuBlockJobSyncWaitWithTimeout(virQEMUDriverPtr driver, - virDomainObjPtr vm, - virDomainDiskDefPtr disk, - unsigned long long timeout, - virConnectDomainEventBlockJobStatus *ret_status); -int qemuBlockJobSyncWait(virQEMUDriverPtr driver, - virDomainObjPtr vm, - virDomainDiskDefPtr disk, - virConnectDomainEventBlockJobStatus *ret_status); + virDomainDiskDefPtr disk); #endif /* __QEMU_BLOCKJOB_H__ */ diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 0682390..0b5ebe1 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -413,7 +413,6 @@ qemuDomainJobInfoToParams(qemuDomainJobInfoPtr jobInfo, static virClassPtr qemuDomainDiskPrivateClass; -static void qemuDomainDiskPrivateDispose(void *obj); static int qemuDomainDiskPrivateOnceInit(void) @@ -421,7 +420,7 @@ qemuDomainDiskPrivateOnceInit(void) qemuDomainDiskPrivateClass = virClassNew(virClassForObject(), "qemuDomainDiskPrivate", sizeof(qemuDomainDiskPrivate), - qemuDomainDiskPrivateDispose); + NULL); if (!qemuDomainDiskPrivateClass) return -1; else @@ -441,23 +440,9 @@ qemuDomainDiskPrivateNew(void) if (!(priv = virObjectNew(qemuDomainDiskPrivateClass))) return NULL; - if (virCondInit(&priv->blockJobSyncCond) < 0) { - virReportSystemError(errno, "%s", _("Failed to initialize condition")); - virObjectUnref(priv); - return NULL; - } - return (virObjectPtr) priv; } -static void -qemuDomainDiskPrivateDispose(void *obj) -{ - qemuDomainDiskPrivatePtr priv = obj; - - virCondDestroy(&priv->blockJobSyncCond); -} - static void * qemuDomainObjPrivateAlloc(void) diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 053607f..9003c9b 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -214,7 +214,6 @@ struct _qemuDomainDiskPrivate { bool blockjob; /* for some synchronous block jobs, we need to notify the owner */ - virCond blockJobSyncCond; int blockJobType; /* type of the block job from the event */ int blockJobStatus; /* status of the finished block job */ bool blockJobSync; /* the block job needs synchronized termination */ diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 34e5581..0214e6b 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -16450,10 +16450,8 @@ qemuDomainBlockJobAbort(virDomainPtr dom, goto endjob; } - if (modern && !async) { - /* prepare state for event delivery */ + if (modern && !async) qemuBlockJobSyncBegin(disk); - } if (pivot) { if ((ret = qemuDomainBlockPivot(driver, vm, device, disk)) < 0) @@ -16501,21 +16499,21 @@ qemuDomainBlockJobAbort(virDomainPtr dom, VIR_DOMAIN_BLOCK_JOB_TYPE_PULL, VIR_DOMAIN_BLOCK_JOB_CANCELED); } else { - virConnectDomainEventBlockJobStatus status = -1; - if (qemuBlockJobSyncWait(driver, vm, disk, &status) < 0) { - ret = -1; - } else if (status == 
VIR_DOMAIN_BLOCK_JOB_FAILED) { - virReportError(VIR_ERR_OPERATION_FAILED, - _("failed to terminate block job on disk '%s'"), - disk->dst); - ret = -1; + qemuDomainDiskPrivatePtr diskPriv = QEMU_DOMAIN_DISK_PRIVATE(disk); + qemuBlockJobUpdate(driver, vm, disk); + while (diskPriv->blockjob) { + if (virDomainObjWait(vm) < 0) { + ret = -1; + goto endjob; + } + qemuBlockJobUpdate(driver, vm, disk); } } } endjob: - if (disk && QEMU_DOMAIN_DISK_PRIVATE(disk)->blockJobSync) - qemuBlockJobSyncEnd(driver, vm, disk, NULL); + if (disk) + qemuBlockJobSyncEnd(driver, vm, disk); qemuDomainObjEndJob(driver, vm); cleanup: diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 8d01468..83d6c22 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -1720,7 +1720,7 @@ qemuMigrationStopNBDServer(virQEMUDriverPtr driver, /** - * qemuMigrationCheckDriveMirror: + * qemuMigrationDriveMirrorReady: * @driver: qemu driver * @vm: domain * @@ -1733,37 +1733,39 @@ qemuMigrationStopNBDServer(virQEMUDriverPtr driver, * -1 on error. */ static int -qemuMigrationCheckDriveMirror(virQEMUDriverPtr driver, +qemuMigrationDriveMirrorReady(virQEMUDriverPtr driver, virDomainObjPtr vm) { size_t i; - int ret = 1; + size_t notReady = 0; + int status; for (i = 0; i < vm->def->ndisks; i++) { virDomainDiskDefPtr disk = vm->def->disks[i]; qemuDomainDiskPrivatePtr diskPriv = QEMU_DOMAIN_DISK_PRIVATE(disk); - if (!diskPriv->migrating || !diskPriv->blockJobSync) + if (!diskPriv->migrating) continue; - /* process any pending event */ - if (qemuBlockJobSyncWaitWithTimeout(driver, vm, disk, - 0ull, NULL) < 0) - return -1; - - switch (disk->mirrorState) { - case VIR_DOMAIN_DISK_MIRROR_STATE_NONE: - ret = 0; - break; - case VIR_DOMAIN_DISK_MIRROR_STATE_ABORT: + status = qemuBlockJobUpdate(driver, vm, disk); + if (status == VIR_DOMAIN_BLOCK_JOB_FAILED) { virReportError(VIR_ERR_OPERATION_FAILED, _("migration of disk %s failed"), disk->dst); return -1; } + + if (disk->mirrorState != VIR_DOMAIN_DISK_MIRROR_STATE_READY) + notReady++; } - return ret; + if (notReady) { + VIR_DEBUG("Waiting for %zu disk mirrors to get ready", notReady); + return 0; + } else { + VIR_DEBUG("All disk mirrors are ready"); + return 1; + } } @@ -1823,18 +1825,17 @@ qemuMigrationCancelOneDriveMirror(virQEMUDriverPtr driver, /* Mirror may become ready before cancellation takes * effect; loop if we get that event first */ - do { - ret = qemuBlockJobSyncWait(driver, vm, disk, &status); - if (ret < 0) { - VIR_WARN("Unable to wait for block job on %s to cancel", - diskAlias); + while (1) { + status = qemuBlockJobUpdate(driver, vm, disk); + if (status != -1 && status != VIR_DOMAIN_BLOCK_JOB_READY) + break; + if ((ret = virDomainObjWait(vm)) < 0) goto endjob; - } - } while (status == VIR_DOMAIN_BLOCK_JOB_READY); + } } endjob: - qemuBlockJobSyncEnd(driver, vm, disk, NULL); + qemuBlockJobSyncEnd(driver, vm, disk); if (disk->mirrorState == VIR_DOMAIN_DISK_MIRROR_STATE_ABORT) disk->mirrorState = VIR_DOMAIN_DISK_MIRROR_STATE_NONE; @@ -1924,6 +1925,9 @@ qemuMigrationDriveMirror(virQEMUDriverPtr driver, char *nbd_dest = NULL; char *hoststr = NULL; unsigned int mirror_flags = VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT; + int rv; + + VIR_DEBUG("Starting drive mirrors for domain %s", vm->def->name); /* steal NBD port and thus prevent its propagation back to destination */ port = mig->nbd->port; @@ -1950,60 +1954,46 @@ qemuMigrationDriveMirror(virQEMUDriverPtr driver, !virDomainDiskGetSource(disk)) continue; - VIR_FREE(diskAlias); - VIR_FREE(nbd_dest); if 
((virAsprintf(&diskAlias, "%s%s", QEMU_DRIVE_HOST_PREFIX, disk->info.alias) < 0) || (virAsprintf(&nbd_dest, "nbd:%s:%d:exportname=%s", hoststr, port, diskAlias) < 0)) goto cleanup; + if (qemuDomainObjEnterMonitorAsync(driver, vm, + QEMU_ASYNC_JOB_MIGRATION_OUT) < 0) + goto cleanup; + qemuBlockJobSyncBegin(disk); - - if (qemuDomainObjEnterMonitorAsync(driver, vm, - QEMU_ASYNC_JOB_MIGRATION_OUT) < 0) { - qemuBlockJobSyncEnd(driver, vm, disk, NULL); - goto cleanup; - } - mon_ret = qemuMonitorDriveMirror(priv->mon, diskAlias, nbd_dest, NULL, speed, 0, 0, mirror_flags); + VIR_FREE(diskAlias); + VIR_FREE(nbd_dest); if (qemuDomainObjExitMonitor(driver, vm) < 0 || mon_ret < 0) { - qemuBlockJobSyncEnd(driver, vm, disk, NULL); + qemuBlockJobSyncEnd(driver, vm, disk); goto cleanup; } diskPriv->migrating = true; } - /* Wait for each disk to become ready in turn, but check the status - * for *all* mirrors to determine if any have aborted. */ - for (i = 0; i < vm->def->ndisks; i++) { - virDomainDiskDefPtr disk = vm->def->disks[i]; - qemuDomainDiskPrivatePtr diskPriv = QEMU_DOMAIN_DISK_PRIVATE(disk); - - if (!diskPriv->migrating) - continue; - - while (disk->mirrorState != VIR_DOMAIN_DISK_MIRROR_STATE_READY) { - /* The following check should be race free as long as the variable - * is set only with domain object locked. And here we have the - * domain object locked too. */ - if (priv->job.asyncAbort) { - priv->job.current->type = VIR_DOMAIN_JOB_CANCELLED; - virReportError(VIR_ERR_OPERATION_ABORTED, _("%s: %s"), - qemuDomainAsyncJobTypeToString(priv->job.asyncJob), - _("canceled by client")); - goto cleanup; - } - - if (qemuBlockJobSyncWaitWithTimeout(driver, vm, disk, - 500ull, NULL) < 0) - goto cleanup; - - if (qemuMigrationCheckDriveMirror(driver, vm) < 0) - goto cleanup; + while ((rv = qemuMigrationDriveMirrorReady(driver, vm)) != 1) { + unsigned long long now; + + if (rv < 0) + goto cleanup; + + if (priv->job.asyncAbort) { + priv->job.current->type = VIR_DOMAIN_JOB_CANCELLED; + virReportError(VIR_ERR_OPERATION_ABORTED, _("%s: %s"), + qemuDomainAsyncJobTypeToString(priv->job.asyncJob), + _("canceled by client")); + goto cleanup; } + + if (virTimeMillisNow(&now) < 0 || + virDomainObjWaitUntil(vm, now + 500) < 0) + goto cleanup; } /* Okay, all disks are ready. 
Modify migrate_flags */ @@ -4092,7 +4082,7 @@ qemuMigrationRun(virQEMUDriverPtr driver, /* Confirm state of drive mirrors */ if (mig->nbd) { - if (qemuMigrationCheckDriveMirror(driver, vm) != 1) { + if (qemuMigrationDriveMirrorReady(driver, vm) != 1) { ret = -1; goto cancel; } diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 64ee049..3c9d4bc 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -1001,10 +1001,10 @@ qemuProcessHandleBlockJob(qemuMonitorPtr mon ATTRIBUTE_UNUSED, diskPriv = QEMU_DOMAIN_DISK_PRIVATE(disk); if (diskPriv->blockJobSync) { + /* We have a SYNC API waiting for this event, dispatch it back */ diskPriv->blockJobType = type; diskPriv->blockJobStatus = status; - /* We have an SYNC API waiting for this event, dispatch it back */ - virCondSignal(&diskPriv->blockJobSyncCond); + virDomainObjSignal(vm); } else { /* there is no waiting SYNC API, dispatch the update to a thread */ if (VIR_ALLOC(processEvent) < 0) @@ -5055,13 +5055,8 @@ void qemuProcessStop(virQEMUDriverPtr driver, if (virAtomicIntDecAndTest(&driver->nactive) && driver->inhibitCallback) driver->inhibitCallback(false, driver->inhibitOpaque); - /* Wake up anything waiting on synchronous block jobs */ - for (i = 0; i < vm->def->ndisks; i++) { - qemuDomainDiskPrivatePtr diskPriv = - QEMU_DOMAIN_DISK_PRIVATE(vm->def->disks[i]); - if (diskPriv->blockJobSync && diskPriv->blockJobStatus == -1) - virCondSignal(&diskPriv->blockJobSyncCond); - } + /* Wake up anything waiting on domain condition */ + virDomainObjBroadcast(vm); if ((logfile = qemuDomainCreateLog(driver, vm, true)) < 0) { /* To not break the normal domain shutdown process, skip the -- 2.4.3
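The wait pattern that replaces qemuBlockJobSyncWait/qemuBlockJobSyncWaitWithTimeout, condensed from the qemuDomainBlockJobAbort hunk above (error handling trimmed):

    qemuBlockJobSyncBegin(disk);

    /* ... issue the block job or cancel request via the monitor ... */

    qemuBlockJobUpdate(driver, vm, disk);
    while (QEMU_DOMAIN_DISK_PRIVATE(disk)->blockjob) {
        if (virDomainObjWait(vm) < 0)
            goto error;
        qemuBlockJobUpdate(driver, vm, disk);
    }

    qemuBlockJobSyncEnd(driver, vm, disk);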

On Wed, Jun 10, 2015 at 15:42:38 +0200, Jiri Denemark wrote:
By switching block jobs to use domain conditions, we can drop some pretty complicated code in NBD storage migration.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com> ---
Notes: Version 3: - split into 3 patches
Version 2: - slightly modified to use domain conditions
 po/POTFILES.in            |   1 -
 src/qemu/qemu_blockjob.c  | 137 +++-------------------------------------------
 src/qemu/qemu_blockjob.h  |  12 +---
 src/qemu/qemu_domain.c    |  17 +-----
 src/qemu/qemu_domain.h    |   1 -
 src/qemu/qemu_driver.c    |  24 ++++----
 src/qemu/qemu_migration.c | 112 +++++++++++++++++--------------------
 src/qemu/qemu_process.c   |  13 ++---
 8 files changed, 76 insertions(+), 241 deletions(-)
ACK, Peter

Instead of cancelling disk mirrors sequentially, let's just call block-job-cancel for all migrating disks and then wait until all disappear. In case we cancel disk mirrors at the end of successful migration we also need to check all block jobs completed successfully. Otherwise we have to abort the migration. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- Notes: Version 3: - new patch (separated from "qemu: Use domain condition for synchronous block jobs") - get rid of bool *failed parameter in qemuMigrationDriveMirrorCancelled src/qemu/qemu_migration.c | 196 +++++++++++++++++++++++++++++++--------------- 1 file changed, 134 insertions(+), 62 deletions(-) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 83d6c22..11504eb 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -1769,76 +1769,122 @@ qemuMigrationDriveMirrorReady(virQEMUDriverPtr driver, } -/** - * qemuMigrationCancelOneDriveMirror: - * @driver: qemu driver - * @vm: domain +/* + * If @check is true, the function will report an error and return a different + * code in case a block job fails. This way we can properly abort migration in + * case some block jobs failed once all memory has already been transferred. * - * Cancel all drive-mirrors started by qemuMigrationDriveMirror. - * Any pending block job events for the mirrored disks will be - * processed. - * - * Returns 0 on success, -1 otherwise. + * Returns 1 if all mirrors are gone, + * 0 if some mirrors are still active, + * -1 some mirrors failed but some are still active, + * -2 all mirrors are gone but some of them failed. */ static int -qemuMigrationCancelOneDriveMirror(virQEMUDriverPtr driver, +qemuMigrationDriveMirrorCancelled(virQEMUDriverPtr driver, virDomainObjPtr vm, - virDomainDiskDefPtr disk) + bool check) { - qemuDomainObjPrivatePtr priv = vm->privateData; - char *diskAlias = NULL; - int ret = -1; + size_t i; + size_t active = 0; + int status; + bool failed = false; - /* No need to cancel if mirror already aborted */ - if (disk->mirrorState == VIR_DOMAIN_DISK_MIRROR_STATE_ABORT) { - ret = 0; - } else { - virConnectDomainEventBlockJobStatus status = -1; + for (i = 0; i < vm->def->ndisks; i++) { + virDomainDiskDefPtr disk = vm->def->disks[i]; + qemuDomainDiskPrivatePtr diskPriv = QEMU_DOMAIN_DISK_PRIVATE(disk); - if (virAsprintf(&diskAlias, "%s%s", - QEMU_DRIVE_HOST_PREFIX, disk->info.alias) < 0) - goto cleanup; + if (!diskPriv->migrating) + continue; - if (qemuDomainObjEnterMonitorAsync(driver, vm, - QEMU_ASYNC_JOB_MIGRATION_OUT) < 0) - goto endjob; - ret = qemuMonitorBlockJobCancel(priv->mon, diskAlias, true); - if (qemuDomainObjExitMonitor(driver, vm) < 0) - goto endjob; - - if (ret < 0) { - virDomainBlockJobInfo info; - - /* block-job-cancel can fail if QEMU simultaneously - * aborted the job; probe for it again to detect this */ - if (qemuMonitorBlockJobInfo(priv->mon, diskAlias, - &info, NULL) == 0) { - ret = 0; - } else { + status = qemuBlockJobUpdate(driver, vm, disk); + switch (status) { + case VIR_DOMAIN_BLOCK_JOB_FAILED: + if (check) { virReportError(VIR_ERR_OPERATION_FAILED, - _("could not cancel migration of disk %s"), + _("migration of disk %s failed"), disk->dst); + failed = true; } + /* fallthrough */ + case VIR_DOMAIN_BLOCK_JOB_CANCELED: + case VIR_DOMAIN_BLOCK_JOB_COMPLETED: + qemuBlockJobSyncEnd(driver, vm, disk); + diskPriv->migrating = false; + break; - goto endjob; + default: + active++; } + } - /* Mirror may become ready before cancellation takes - * effect; loop if we get that event first 
*/ - while (1) { - status = qemuBlockJobUpdate(driver, vm, disk); - if (status != -1 && status != VIR_DOMAIN_BLOCK_JOB_READY) - break; - if ((ret = virDomainObjWait(vm)) < 0) - goto endjob; + if (failed) { + if (active) { + VIR_DEBUG("Some disk mirrors failed; still waiting for %zu " + "disk mirrors to finish", active); + return -1; + } else { + VIR_DEBUG("All disk mirrors are gone; some of them failed"); + return -2; + } + } else { + if (active) { + VIR_DEBUG("Waiting for %zu disk mirrors to finish", active); + return 0; + } else { + VIR_DEBUG("All disk mirrors are gone"); + return 1; } } +} + + +/* + * Returns 0 on success, + * 1 when job is already completed or it failed and failNoJob is false, + * -1 on error or when job failed and failNoJob is true. + */ +static int +qemuMigrationCancelOneDriveMirror(virQEMUDriverPtr driver, + virDomainObjPtr vm, + virDomainDiskDefPtr disk, + bool failNoJob) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + char *diskAlias = NULL; + int ret = -1; + int status; + int rv; + + status = qemuBlockJobUpdate(driver, vm, disk); + switch (status) { + case VIR_DOMAIN_BLOCK_JOB_FAILED: + case VIR_DOMAIN_BLOCK_JOB_CANCELED: + if (failNoJob) { + virReportError(VIR_ERR_OPERATION_FAILED, + _("migration of disk %s failed"), + disk->dst); + return -1; + } + return 1; + + case VIR_DOMAIN_BLOCK_JOB_COMPLETED: + return 1; + } + + if (virAsprintf(&diskAlias, "%s%s", + QEMU_DRIVE_HOST_PREFIX, disk->info.alias) < 0) + return -1; + + if (qemuDomainObjEnterMonitorAsync(driver, vm, + QEMU_ASYNC_JOB_MIGRATION_OUT) < 0) + goto cleanup; + + rv = qemuMonitorBlockJobCancel(priv->mon, diskAlias, true); - endjob: - qemuBlockJobSyncEnd(driver, vm, disk); + if (qemuDomainObjExitMonitor(driver, vm) < 0 || rv < 0) + goto cleanup; - if (disk->mirrorState == VIR_DOMAIN_DISK_MIRROR_STATE_ABORT) - disk->mirrorState = VIR_DOMAIN_DISK_MIRROR_STATE_NONE; + ret = 0; cleanup: VIR_FREE(diskAlias); @@ -1850,6 +1896,7 @@ qemuMigrationCancelOneDriveMirror(virQEMUDriverPtr driver, * qemuMigrationCancelDriveMirror: * @driver: qemu driver * @vm: domain + * @check: if true report an error when some of the mirrors fails * * Cancel all drive-mirrors started by qemuMigrationDriveMirror. * Any pending block job events for the affected disks will be @@ -1859,28 +1906,53 @@ qemuMigrationCancelOneDriveMirror(virQEMUDriverPtr driver, */ static int qemuMigrationCancelDriveMirror(virQEMUDriverPtr driver, - virDomainObjPtr vm) + virDomainObjPtr vm, + bool check) { virErrorPtr err = NULL; - int ret = 0; + int ret = -1; size_t i; + int rv; + bool failed = false; + + VIR_DEBUG("Cancelling drive mirrors for domain %s", vm->def->name); for (i = 0; i < vm->def->ndisks; i++) { virDomainDiskDefPtr disk = vm->def->disks[i]; qemuDomainDiskPrivatePtr diskPriv = QEMU_DOMAIN_DISK_PRIVATE(disk); - if (!diskPriv->migrating || !diskPriv->blockJobSync) + if (!diskPriv->migrating) continue; - if (qemuMigrationCancelOneDriveMirror(driver, vm, disk) < 0) { - ret = -1; - if (!err) - err = virSaveLastError(); + rv = qemuMigrationCancelOneDriveMirror(driver, vm, disk, check); + if (rv != 0) { + if (rv < 0) { + if (!err) + err = virSaveLastError(); + failed = true; + } + qemuBlockJobSyncEnd(driver, vm, disk); + diskPriv->migrating = false; + } + } + + while ((rv = qemuMigrationDriveMirrorCancelled(driver, vm, check)) != 1) { + if (rv < 0) { + failed = true; + if (rv == -2) + break; } - diskPriv->migrating = false; + if (failed && !err) + err = virSaveLastError(); + + if (virDomainObjWait(vm) < 0) + goto cleanup; } + ret = failed ? 
-1 : 0; + + cleanup: if (err) { virSetError(err); virFreeError(err); @@ -3532,7 +3604,7 @@ qemuMigrationConfirmPhase(virQEMUDriverPtr driver, virErrorPtr orig_err = virSaveLastError(); /* cancel any outstanding NBD jobs */ - qemuMigrationCancelDriveMirror(driver, vm); + qemuMigrationCancelDriveMirror(driver, vm, false); virSetError(orig_err); virFreeError(orig_err); @@ -4111,7 +4183,7 @@ qemuMigrationRun(virQEMUDriverPtr driver, /* cancel any outstanding NBD jobs */ if (mig && mig->nbd) { - if (qemuMigrationCancelDriveMirror(driver, vm) < 0) + if (qemuMigrationCancelDriveMirror(driver, vm, ret == 0) < 0) ret = -1; } -- 2.4.3
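A condensed view of the cancel-all-then-wait control flow introduced above (per-disk bookkeeping and error propagation omitted; see the full hunks for the details):

    /* 1) issue block-job-cancel for each disk
     *    (the real code only does this for disks with migrating == true) */
    for (i = 0; i < vm->def->ndisks; i++)
        qemuMigrationCancelOneDriveMirror(driver, vm, vm->def->disks[i], check);

    /* 2) wait for the mirrors to disappear; the helper returns
     *       1  all mirrors gone,
     *       0  some still active,
     *      -1  some failed, others still active,
     *      -2  all gone, but some of them failed */
    while ((rv = qemuMigrationDriveMirrorCancelled(driver, vm, check)) != 1) {
        if (rv == -2)
            break;              /* nothing left to wait for */
        if (virDomainObjWait(vm) < 0)
            goto cleanup;
    }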

On Wed, Jun 10, 2015 at 15:42:39 +0200, Jiri Denemark wrote:
Instead of cancelling disk mirrors sequentially, let's just call block-job-cancel for all migrating disks and then wait until all disappear.
In case we cancel disk mirrors at the end of successful migration we also need to check all block jobs completed successfully. Otherwise we have to abort the migration.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com> ---
ACK, Peter

Abort migration as soon as we detect that some of the disk mirrors failed. There's no sense in trying to finish memory migration first. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- Notes: Version 3: - new patch (separated from "qemu: Use domain condition for synchronous block jobs") src/qemu/qemu_migration.c | 20 +++++++++----------- 1 file changed, 9 insertions(+), 11 deletions(-) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 11504eb..b11407e 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -2498,7 +2498,8 @@ qemuMigrationWaitForCompletion(virQEMUDriverPtr driver, virDomainObjPtr vm, qemuDomainAsyncJob asyncJob, virConnectPtr dconn, - bool abort_on_error) + bool abort_on_error, + bool storage) { qemuDomainObjPrivatePtr priv = vm->privateData; qemuDomainJobInfoPtr jobInfo = priv->job.current; @@ -2529,6 +2530,10 @@ qemuMigrationWaitForCompletion(virQEMUDriverPtr driver, if (qemuMigrationUpdateJobStatus(driver, vm, job, asyncJob) == -1) goto error; + if (storage && + qemuMigrationDriveMirrorReady(driver, vm) < 0) + break; + /* cancel migration if disk I/O error is emitted while migrating */ if (abort_on_error && virDomainObjGetState(vm, &pauseReason) == VIR_DOMAIN_PAUSED && @@ -4146,20 +4151,12 @@ qemuMigrationRun(virQEMUDriverPtr driver, rc = qemuMigrationWaitForCompletion(driver, vm, QEMU_ASYNC_JOB_MIGRATION_OUT, - dconn, abort_on_error); + dconn, abort_on_error, !!mig->nbd); if (rc == -2) goto cancel; else if (rc == -1) goto cleanup; - /* Confirm state of drive mirrors */ - if (mig->nbd) { - if (qemuMigrationDriveMirrorReady(driver, vm) != 1) { - ret = -1; - goto cancel; - } - } - /* When migration completed, QEMU will have paused the * CPUs for us, but unless we're using the JSON monitor * we won't have been notified of this, so might still @@ -5637,7 +5634,8 @@ qemuMigrationToFile(virQEMUDriverPtr driver, virDomainObjPtr vm, if (rc < 0) goto cleanup; - rc = qemuMigrationWaitForCompletion(driver, vm, asyncJob, NULL, false); + rc = qemuMigrationWaitForCompletion(driver, vm, asyncJob, + NULL, false, false); if (rc < 0) { if (rc == -2) { -- 2.4.3

On Wed, Jun 10, 2015 at 15:42:40 +0200, Jiri Denemark wrote:
Abort migration as soon as we detect that some of the disk mirrors failed. There's no sense in trying to finish memory migration first.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com> ---
Notes: Version 3: - new patch (separated from "qemu: Use domain condition for synchronous block jobs")
 src/qemu/qemu_migration.c | 20 +++++++++-----------
 1 file changed, 9 insertions(+), 11 deletions(-)
ACK, Peter

This patch reverts commit 76c61cdca20c106960af033e5d0f5da70177af0f. VIR_DOMAIN_DISK_MIRROR_STATE_ABORT says we asked for a block job to be aborted rather than saying it was aborted. Let's just use VIR_DOMAIN_DISK_MIRROR_STATE_NONE consistently whenever a block job finishes since no caller depends on VIR_DOMAIN_DISK_MIRROR_STATE_ABORT (anymore) to check whether a block job failed or it was cancelled. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- Notes: ACKed in version 2 Version 2, 3: - no changes src/qemu/qemu_blockjob.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/src/qemu/qemu_blockjob.c b/src/qemu/qemu_blockjob.c index 3aa6118..8849850 100644 --- a/src/qemu/qemu_blockjob.c +++ b/src/qemu/qemu_blockjob.c @@ -174,8 +174,7 @@ qemuBlockJobEventProcess(virQEMUDriverPtr driver, case VIR_DOMAIN_BLOCK_JOB_CANCELED: virStorageSourceFree(disk->mirror); disk->mirror = NULL; - disk->mirrorState = status == VIR_DOMAIN_BLOCK_JOB_FAILED ? - VIR_DOMAIN_DISK_MIRROR_STATE_ABORT : VIR_DOMAIN_DISK_MIRROR_STATE_NONE; + disk->mirrorState = VIR_DOMAIN_DISK_MIRROR_STATE_NONE; disk->mirrorJob = VIR_DOMAIN_BLOCK_JOB_TYPE_UNKNOWN; save = true; diskPriv->blockjob = false; -- 2.4.3

So that they can format private data (e.g., disk private data) stored elsewhere in the domain object. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- Notes: ACKed in version 2 Version 2, 3: - no changes src/conf/domain_conf.c | 4 ++-- src/conf/domain_conf.h | 6 ++++-- src/libxl/libxl_domain.c | 10 ++++++---- src/lxc/lxc_domain.c | 12 ++++++++---- src/qemu/qemu_domain.c | 10 ++++++---- 5 files changed, 26 insertions(+), 16 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 433183f..350640f 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -15974,7 +15974,7 @@ virDomainObjParseXML(xmlDocPtr xml, VIR_FREE(nodes); if (xmlopt->privateData.parse && - ((xmlopt->privateData.parse)(ctxt, obj->privateData)) < 0) + xmlopt->privateData.parse(ctxt, obj) < 0) goto error; return obj; @@ -21863,7 +21863,7 @@ virDomainObjFormat(virDomainXMLOptionPtr xmlopt, } if (xmlopt->privateData.format && - ((xmlopt->privateData.format)(&buf, obj->privateData)) < 0) + xmlopt->privateData.format(&buf, obj) < 0) goto error; if (virDomainDefFormatInternal(obj->def, flags, &buf) < 0) diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index ac29ce5..44ecd4a 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -2357,8 +2357,10 @@ typedef virDomainXMLOption *virDomainXMLOptionPtr; typedef void *(*virDomainXMLPrivateDataAllocFunc)(void); typedef void (*virDomainXMLPrivateDataFreeFunc)(void *); typedef virObjectPtr (*virDomainXMLPrivateDataNewFunc)(void); -typedef int (*virDomainXMLPrivateDataFormatFunc)(virBufferPtr, void *); -typedef int (*virDomainXMLPrivateDataParseFunc)(xmlXPathContextPtr, void *); +typedef int (*virDomainXMLPrivateDataFormatFunc)(virBufferPtr, + virDomainObjPtr); +typedef int (*virDomainXMLPrivateDataParseFunc)(xmlXPathContextPtr, + virDomainObjPtr); /* Called once after everything else has been parsed, for adjusting * overall domain defaults. 
*/ diff --git a/src/libxl/libxl_domain.c b/src/libxl/libxl_domain.c index 0652270..ebd7964 100644 --- a/src/libxl/libxl_domain.c +++ b/src/libxl/libxl_domain.c @@ -223,9 +223,10 @@ libxlDomainObjPrivateFree(void *data) } static int -libxlDomainObjPrivateXMLParse(xmlXPathContextPtr ctxt, void *data) +libxlDomainObjPrivateXMLParse(xmlXPathContextPtr ctxt, + virDomainObjPtr vm) { - libxlDomainObjPrivatePtr priv = data; + libxlDomainObjPrivatePtr priv = vm->privateData; priv->lockState = virXPathString("string(./lockstate)", ctxt); @@ -233,9 +234,10 @@ libxlDomainObjPrivateXMLParse(xmlXPathContextPtr ctxt, void *data) } static int -libxlDomainObjPrivateXMLFormat(virBufferPtr buf, void *data) +libxlDomainObjPrivateXMLFormat(virBufferPtr buf, + virDomainObjPtr vm) { - libxlDomainObjPrivatePtr priv = data; + libxlDomainObjPrivatePtr priv = vm->privateData; if (priv->lockState) virBufferAsprintf(buf, "<lockstate>%s</lockstate>\n", priv->lockState); diff --git a/src/lxc/lxc_domain.c b/src/lxc/lxc_domain.c index c2180cb..70606f3 100644 --- a/src/lxc/lxc_domain.c +++ b/src/lxc/lxc_domain.c @@ -51,9 +51,11 @@ static void virLXCDomainObjPrivateFree(void *data) } -static int virLXCDomainObjPrivateXMLFormat(virBufferPtr buf, void *data) +static int +virLXCDomainObjPrivateXMLFormat(virBufferPtr buf, + virDomainObjPtr vm) { - virLXCDomainObjPrivatePtr priv = data; + virLXCDomainObjPrivatePtr priv = vm->privateData; virBufferAsprintf(buf, "<init pid='%llu'/>\n", (unsigned long long)priv->initpid); @@ -61,9 +63,11 @@ static int virLXCDomainObjPrivateXMLFormat(virBufferPtr buf, void *data) return 0; } -static int virLXCDomainObjPrivateXMLParse(xmlXPathContextPtr ctxt, void *data) +static int +virLXCDomainObjPrivateXMLParse(xmlXPathContextPtr ctxt, + virDomainObjPtr vm) { - virLXCDomainObjPrivatePtr priv = data; + virLXCDomainObjPrivatePtr priv = vm->privateData; unsigned long long thepid; if (virXPathULongLong("string(./init[1]/@pid)", ctxt, &thepid) < 0) { diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 0b5ebe1..24ff020 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -511,9 +511,10 @@ qemuDomainObjPrivateFree(void *data) static int -qemuDomainObjPrivateXMLFormat(virBufferPtr buf, void *data) +qemuDomainObjPrivateXMLFormat(virBufferPtr buf, + virDomainObjPtr vm) { - qemuDomainObjPrivatePtr priv = data; + qemuDomainObjPrivatePtr priv = vm->privateData; const char *monitorpath; qemuDomainJob job; @@ -600,9 +601,10 @@ qemuDomainObjPrivateXMLFormat(virBufferPtr buf, void *data) } static int -qemuDomainObjPrivateXMLParse(xmlXPathContextPtr ctxt, void *data) +qemuDomainObjPrivateXMLParse(xmlXPathContextPtr ctxt, + virDomainObjPtr vm) { - qemuDomainObjPrivatePtr priv = data; + qemuDomainObjPrivatePtr priv = vm->privateData; char *monitorpath; char *tmp; int n; -- 2.4.3
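A sketch of what the new signature allows; the example* names below are hypothetical and only illustrate that a formatter can now reach state stored outside the domain-level privateData blob (e.g. per-disk private data):

static int
exampleDomainObjPrivateXMLFormat(virBufferPtr buf,
                                 virDomainObjPtr vm)
{
    exampleDomainObjPrivatePtr priv = vm->privateData;   /* hypothetical type */
    size_t i;

    if (priv->lockState)
        virBufferAsprintf(buf, "<lockstate>%s</lockstate>\n", priv->lockState);

    /* with the whole domain object available, per-disk state can be
     * emitted too, not just the driver's domain-level private data */
    for (i = 0; i < vm->def->ndisks; i++)
        virBufferAsprintf(buf, "<diskPrivate dst='%s'/>\n",
                          vm->def->disks[i]->dst);

    return 0;
}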

We don't have an async job when reconnecting to existing domains after libvirtd restart. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- Notes: ACKed in version 2 Version 2, 3: - no changes src/qemu/qemu_migration.c | 18 +++++++++++------- 1 file changed, 11 insertions(+), 7 deletions(-) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index b11407e..065a0f9 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -1847,7 +1847,8 @@ static int qemuMigrationCancelOneDriveMirror(virQEMUDriverPtr driver, virDomainObjPtr vm, virDomainDiskDefPtr disk, - bool failNoJob) + bool failNoJob, + qemuDomainAsyncJob asyncJob) { qemuDomainObjPrivatePtr priv = vm->privateData; char *diskAlias = NULL; @@ -1875,8 +1876,7 @@ qemuMigrationCancelOneDriveMirror(virQEMUDriverPtr driver, QEMU_DRIVE_HOST_PREFIX, disk->info.alias) < 0) return -1; - if (qemuDomainObjEnterMonitorAsync(driver, vm, - QEMU_ASYNC_JOB_MIGRATION_OUT) < 0) + if (qemuDomainObjEnterMonitorAsync(driver, vm, asyncJob) < 0) goto cleanup; rv = qemuMonitorBlockJobCancel(priv->mon, diskAlias, true); @@ -1907,7 +1907,8 @@ qemuMigrationCancelOneDriveMirror(virQEMUDriverPtr driver, static int qemuMigrationCancelDriveMirror(virQEMUDriverPtr driver, virDomainObjPtr vm, - bool check) + bool check, + qemuDomainAsyncJob asyncJob) { virErrorPtr err = NULL; int ret = -1; @@ -1924,7 +1925,8 @@ qemuMigrationCancelDriveMirror(virQEMUDriverPtr driver, if (!diskPriv->migrating) continue; - rv = qemuMigrationCancelOneDriveMirror(driver, vm, disk, check); + rv = qemuMigrationCancelOneDriveMirror(driver, vm, disk, + check, asyncJob); if (rv != 0) { if (rv < 0) { if (!err) @@ -3609,7 +3611,8 @@ qemuMigrationConfirmPhase(virQEMUDriverPtr driver, virErrorPtr orig_err = virSaveLastError(); /* cancel any outstanding NBD jobs */ - qemuMigrationCancelDriveMirror(driver, vm, false); + qemuMigrationCancelDriveMirror(driver, vm, false, + QEMU_ASYNC_JOB_MIGRATION_OUT); virSetError(orig_err); virFreeError(orig_err); @@ -4180,7 +4183,8 @@ qemuMigrationRun(virQEMUDriverPtr driver, /* cancel any outstanding NBD jobs */ if (mig && mig->nbd) { - if (qemuMigrationCancelDriveMirror(driver, vm, ret == 0) < 0) + if (qemuMigrationCancelDriveMirror(driver, vm, ret == 0, + QEMU_ASYNC_JOB_MIGRATION_OUT) < 0) ret = -1; } -- 2.4.3

"query-block-jobs" QMP command returns all running block jobs at once, while qemuMonitorBlockJobInfo would only report one. This is not very nice in case we need to check several block jobs. This patch refactors the monitor code to always parse all block jobs and store them in a hash. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- Notes: ACKed in version 2 Version 3: - no change Version 2: - new patch src/qemu/qemu_driver.c | 45 +++++++++++++++------------- src/qemu/qemu_monitor.c | 37 ++++++++++++++++++----- src/qemu/qemu_monitor.h | 17 ++++++++--- src/qemu/qemu_monitor_json.c | 70 ++++++++++++++++++++++---------------------- src/qemu/qemu_monitor_json.h | 7 ++--- 5 files changed, 104 insertions(+), 72 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 0214e6b..1556a9e 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -16178,7 +16178,7 @@ qemuDomainBlockPivot(virQEMUDriverPtr driver, { int ret = -1, rc; qemuDomainObjPrivatePtr priv = vm->privateData; - virDomainBlockJobInfo info; + qemuMonitorBlockJobInfo info; virStorageSourcePtr oldsrc = NULL; virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver); @@ -16192,7 +16192,7 @@ qemuDomainBlockPivot(virQEMUDriverPtr driver, /* Probe the status, if needed. */ if (!disk->mirrorState) { qemuDomainObjEnterMonitor(driver, vm); - rc = qemuMonitorBlockJobInfo(priv->mon, device, &info, NULL); + rc = qemuMonitorGetBlockJobInfo(priv->mon, disk->info.alias, &info); if (qemuDomainObjExitMonitor(driver, vm) < 0) goto cleanup; if (rc < 0) @@ -16525,16 +16525,16 @@ qemuDomainBlockJobAbort(virDomainPtr dom, static int -qemuDomainGetBlockJobInfo(virDomainPtr dom, const char *path, - virDomainBlockJobInfoPtr info, unsigned int flags) +qemuDomainGetBlockJobInfo(virDomainPtr dom, + const char *path, + virDomainBlockJobInfoPtr info, + unsigned int flags) { virQEMUDriverPtr driver = dom->conn->privateData; virDomainObjPtr vm; - char *device = NULL; - int idx; virDomainDiskDefPtr disk; int ret = -1; - unsigned long long bandwidth; + qemuMonitorBlockJobInfo rawInfo; virCheckFlags(VIR_DOMAIN_BLOCK_JOB_INFO_BANDWIDTH_BYTES, -1); @@ -16557,31 +16557,34 @@ qemuDomainGetBlockJobInfo(virDomainPtr dom, const char *path, if (qemuDomainSupportsBlockJobs(vm, NULL) < 0) goto endjob; - if (!(device = qemuDiskPathToAlias(vm, path, &idx))) + if (!(disk = virDomainDiskByName(vm->def, path, true))) goto endjob; - disk = vm->def->disks[idx]; qemuDomainObjEnterMonitor(driver, vm); - ret = qemuMonitorBlockJobInfo(qemuDomainGetMonitor(vm), device, info, - &bandwidth); + ret = qemuMonitorGetBlockJobInfo(qemuDomainGetMonitor(vm), + disk->info.alias, &rawInfo); if (qemuDomainObjExitMonitor(driver, vm) < 0) ret = -1; if (ret < 0) goto endjob; + info->cur = rawInfo.cur; + info->end = rawInfo.end; + + info->type = rawInfo.type; if (info->type == VIR_DOMAIN_BLOCK_JOB_TYPE_COMMIT && disk->mirrorJob == VIR_DOMAIN_BLOCK_JOB_TYPE_ACTIVE_COMMIT) info->type = disk->mirrorJob; - if (bandwidth) { - if (!(flags & VIR_DOMAIN_BLOCK_JOB_INFO_BANDWIDTH_BYTES)) - bandwidth = VIR_DIV_UP(bandwidth, 1024 * 1024); - info->bandwidth = bandwidth; - if (info->bandwidth != bandwidth) { - virReportError(VIR_ERR_OVERFLOW, - _("bandwidth %llu cannot be represented in result"), - bandwidth); - goto endjob; - } + + if (rawInfo.bandwidth && + !(flags & VIR_DOMAIN_BLOCK_JOB_INFO_BANDWIDTH_BYTES)) + rawInfo.bandwidth = VIR_DIV_UP(rawInfo.bandwidth, 1024 * 1024); + info->bandwidth = rawInfo.bandwidth; + if (info->bandwidth != rawInfo.bandwidth) { + 
virReportError(VIR_ERR_OVERFLOW, + _("bandwidth %llu cannot be represented in result"), + rawInfo.bandwidth); + goto endjob; } /* Snoop block copy operations, so future cancel operations can diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c index 33600f0..ae7ef28 100644 --- a/src/qemu/qemu_monitor.c +++ b/src/qemu/qemu_monitor.c @@ -3207,17 +3207,40 @@ qemuMonitorBlockJobSetSpeed(qemuMonitorPtr mon, } +virHashTablePtr +qemuMonitorGetAllBlockJobInfo(qemuMonitorPtr mon) +{ + QEMU_CHECK_MONITOR_JSON_NULL(mon); + return qemuMonitorJSONGetAllBlockJobInfo(mon); +} + + +/** + * qemuMonitorGetBlockJobInfo: + * Parse Block Job information, and populate info for the named device. + * Return 1 if info available, 0 if device has no block job, and -1 on error. + */ int -qemuMonitorBlockJobInfo(qemuMonitorPtr mon, - const char *device, - virDomainBlockJobInfoPtr info, - unsigned long long *bandwidth) +qemuMonitorGetBlockJobInfo(qemuMonitorPtr mon, + const char *alias, + qemuMonitorBlockJobInfoPtr info) { - VIR_DEBUG("device=%s, info=%p, bandwidth=%p", device, info, bandwidth); + virHashTablePtr all; + qemuMonitorBlockJobInfoPtr data; + int ret = 0; - QEMU_CHECK_MONITOR_JSON(mon); + VIR_DEBUG("alias=%s, info=%p", alias, info); - return qemuMonitorJSONBlockJobInfo(mon, device, info, bandwidth); + if (!(all = qemuMonitorGetAllBlockJobInfo(mon))) + return -1; + + if ((data = virHashLookup(all, alias))) { + *info = *data; + ret = 1; + } + + virHashFree(all); + return ret; } diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h index d30b514..96f47e8 100644 --- a/src/qemu/qemu_monitor.h +++ b/src/qemu/qemu_monitor.h @@ -774,10 +774,19 @@ int qemuMonitorBlockJobSetSpeed(qemuMonitorPtr mon, unsigned long long bandwidth, bool modern); -int qemuMonitorBlockJobInfo(qemuMonitorPtr mon, - const char *device, - virDomainBlockJobInfoPtr info, - unsigned long long *bandwidth) +typedef struct _qemuMonitorBlockJobInfo qemuMonitorBlockJobInfo; +typedef qemuMonitorBlockJobInfo *qemuMonitorBlockJobInfoPtr; +struct _qemuMonitorBlockJobInfo { + int type; /* virDomainBlockJobType */ + unsigned long long bandwidth; /* in bytes/s */ + virDomainBlockJobCursor cur; + virDomainBlockJobCursor end; +}; + +virHashTablePtr qemuMonitorGetAllBlockJobInfo(qemuMonitorPtr mon); +int qemuMonitorGetBlockJobInfo(qemuMonitorPtr mon, + const char *device, + qemuMonitorBlockJobInfoPtr info) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3); int qemuMonitorOpenGraphics(qemuMonitorPtr mon, diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c index 13c57d2..5b227cd 100644 --- a/src/qemu/qemu_monitor_json.c +++ b/src/qemu/qemu_monitor_json.c @@ -4124,29 +4124,30 @@ int qemuMonitorJSONScreendump(qemuMonitorPtr mon, return ret; } -/* Returns -1 on error, 0 if not the right device, 1 if info was - * populated. However, rather than populate info->bandwidth (which - * might overflow on 32-bit machines), bandwidth is tracked optionally - * on the side. 
*/ + static int -qemuMonitorJSONGetBlockJobInfoOne(virJSONValuePtr entry, - const char *device, - virDomainBlockJobInfoPtr info, - unsigned long long *bandwidth) +qemuMonitorJSONParseBlockJobInfo(virHashTablePtr blockJobs, + virJSONValuePtr entry) { - const char *this_dev; + qemuMonitorBlockJobInfoPtr info = NULL; + const char *device; const char *type; - if ((this_dev = virJSONValueObjectGetString(entry, "device")) == NULL) { + if (!(device = virJSONValueObjectGetString(entry, "device"))) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("entry was missing 'device'")); return -1; } - if (!STREQ(this_dev, device)) - return 0; + if (STRPREFIX(device, QEMU_DRIVE_HOST_PREFIX)) + device += strlen(QEMU_DRIVE_HOST_PREFIX); - type = virJSONValueObjectGetString(entry, "type"); - if (!type) { + if (VIR_ALLOC(info) < 0 || + virHashAddEntry(blockJobs, device, info) < 0) { + VIR_FREE(info); + return -1; + } + + if (!(type = virJSONValueObjectGetString(entry, "type"))) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("entry was missing 'type'")); return -1; @@ -4160,8 +4161,7 @@ qemuMonitorJSONGetBlockJobInfoOne(virJSONValuePtr entry, else info->type = VIR_DOMAIN_BLOCK_JOB_TYPE_UNKNOWN; - if (bandwidth && - virJSONValueObjectGetNumberUlong(entry, "speed", bandwidth) < 0) { + if (virJSONValueObjectGetNumberUlong(entry, "speed", &info->bandwidth) < 0) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("entry was missing 'speed'")); return -1; @@ -4178,30 +4178,23 @@ qemuMonitorJSONGetBlockJobInfoOne(virJSONValuePtr entry, _("entry was missing 'len'")); return -1; } - return 1; + + return 0; } -/** - * qemuMonitorJSONBlockJobInfo: - * Parse Block Job information, and populate info for the named device. - * Return 1 if info available, 0 if device has no block job, and -1 on error. 
- */ -int -qemuMonitorJSONBlockJobInfo(qemuMonitorPtr mon, - const char *device, - virDomainBlockJobInfoPtr info, - unsigned long long *bandwidth) +virHashTablePtr +qemuMonitorJSONGetAllBlockJobInfo(qemuMonitorPtr mon) { virJSONValuePtr cmd = NULL; virJSONValuePtr reply = NULL; virJSONValuePtr data; int nr_results; size_t i; - int ret = -1; + virHashTablePtr blockJobs = NULL; cmd = qemuMonitorJSONMakeCommand("query-block-jobs", NULL); if (!cmd) - return -1; + return NULL; if (qemuMonitorJSONCommand(mon, cmd, &reply) < 0) goto cleanup; @@ -4223,22 +4216,29 @@ qemuMonitorJSONBlockJobInfo(qemuMonitorPtr mon, goto cleanup; } - for (i = ret = 0; i < nr_results && ret == 0; i++) { + if (!(blockJobs = virHashCreate(nr_results, virHashValueFree))) + goto cleanup; + + for (i = 0; i < nr_results; i++) { virJSONValuePtr entry = virJSONValueArrayGet(data, i); if (!entry) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("missing array element")); - ret = -1; - goto cleanup; + goto error; } - ret = qemuMonitorJSONGetBlockJobInfoOne(entry, device, info, - bandwidth); + if (qemuMonitorJSONParseBlockJobInfo(blockJobs, entry) < 0) + goto error; } cleanup: virJSONValueFree(cmd); virJSONValueFree(reply); - return ret; + return blockJobs; + + error: + virHashFree(blockJobs); + blockJobs = NULL; + goto cleanup; } diff --git a/src/qemu/qemu_monitor_json.h b/src/qemu/qemu_monitor_json.h index fb77930..c0ee4ce 100644 --- a/src/qemu/qemu_monitor_json.h +++ b/src/qemu/qemu_monitor_json.h @@ -316,11 +316,8 @@ int qemuMonitorJSONBlockJobSetSpeed(qemuMonitorPtr mon, bool modern) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2); -int qemuMonitorJSONBlockJobInfo(qemuMonitorPtr mon, - const char *device, - virDomainBlockJobInfoPtr info, - unsigned long long *bandwidth) - ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3); +virHashTablePtr qemuMonitorJSONGetAllBlockJobInfo(qemuMonitorPtr mon) + ATTRIBUTE_NONNULL(1); int qemuMonitorJSONSetLink(qemuMonitorPtr mon, const char *name, -- 2.4.3
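
The hash returned by qemuMonitorGetAllBlockJobInfo is keyed by device alias (with the "drive-" prefix already stripped), so a caller that needs to look at several block jobs can query the monitor once and do the per-disk lookups locally. A rough sketch, reusing the names used elsewhere in this series and omitting error handling:

    virHashTablePtr jobs;
    size_t i;

    /* the monitor must have been entered via qemuDomainObjEnterMonitor() */
    if (!(jobs = qemuMonitorGetAllBlockJobInfo(priv->mon)))
        return -1;

    for (i = 0; i < vm->def->ndisks; i++) {
        virDomainDiskDefPtr disk = vm->def->disks[i];
        qemuMonitorBlockJobInfoPtr job;

        if ((job = virHashLookup(jobs, disk->info.alias)))
            VIR_DEBUG("disk %s: job type %d, %llu of %llu",
                      disk->dst, job->type, job->cur, job->end);
    }

    virHashFree(jobs);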

When libvirtd is restarted during migration, we properly cancel the ongoing migration (unless it managed to almost finished before the restart). But if we were also migrating storage using NBD, we would completely forget about the running disk mirrors. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- Notes: ACKed in version 2 Version 3: - no change Version 2: - make use of qemuMonitorGetAllBlockJobInfo introduced by the previous patch - undo qemuBlockJobSyncBegin in case of error src/qemu/qemu_domain.c | 45 ++++++++++++++++++++++++- src/qemu/qemu_migration.c | 85 +++++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_migration.h | 3 ++ src/qemu/qemu_process.c | 8 +---- 4 files changed, 133 insertions(+), 8 deletions(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 24ff020..68b6a95 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -578,7 +578,27 @@ qemuDomainObjPrivateXMLFormat(virBufferPtr buf, qemuDomainAsyncJobPhaseToString( priv->job.asyncJob, priv->job.phase)); } - virBufferAddLit(buf, "/>\n"); + if (priv->job.asyncJob != QEMU_ASYNC_JOB_MIGRATION_OUT) { + virBufferAddLit(buf, "/>\n"); + } else { + size_t i; + virDomainDiskDefPtr disk; + qemuDomainDiskPrivatePtr diskPriv; + + virBufferAddLit(buf, ">\n"); + virBufferAdjustIndent(buf, 2); + + for (i = 0; i < vm->def->ndisks; i++) { + disk = vm->def->disks[i]; + diskPriv = QEMU_DOMAIN_DISK_PRIVATE(disk); + virBufferAsprintf(buf, "<disk dev='%s' migrating='%s'/>\n", + disk->dst, + diskPriv->migrating ? "yes" : "no"); + } + + virBufferAdjustIndent(buf, -2); + virBufferAddLit(buf, "</job>\n"); + } } priv->job.active = job; @@ -736,6 +756,29 @@ qemuDomainObjPrivateXMLParse(xmlXPathContextPtr ctxt, } } + if ((n = virXPathNodeSet("./job[1]/disk[@migrating='yes']", + ctxt, &nodes)) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("failed to parse list of disks marked for migration")); + goto error; + } + if (n > 0) { + if (priv->job.asyncJob != QEMU_ASYNC_JOB_MIGRATION_OUT) { + VIR_WARN("Found disks marked for migration but we were not " + "migrating"); + n = 0; + } + for (i = 0; i < n; i++) { + char *dst = virXMLPropString(nodes[i], "dev"); + virDomainDiskDefPtr disk; + + if (dst && (disk = virDomainDiskByName(vm->def, dst, false))) + QEMU_DOMAIN_DISK_PRIVATE(disk)->migrating = true; + VIR_FREE(dst); + } + } + VIR_FREE(nodes); + priv->fakeReboot = virXPathBoolean("boolean(./fakereboot)", ctxt) == 1; if ((n = virXPathNodeSet("./devices/device", ctxt, &nodes)) < 0) { diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 065a0f9..e0de09e 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -2000,6 +2000,7 @@ qemuMigrationDriveMirror(virQEMUDriverPtr driver, char *hoststr = NULL; unsigned int mirror_flags = VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT; int rv; + virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver); VIR_DEBUG("Starting drive mirrors for domain %s", vm->def->name); @@ -2049,6 +2050,11 @@ qemuMigrationDriveMirror(virQEMUDriverPtr driver, goto cleanup; } diskPriv->migrating = true; + + if (virDomainSaveStatus(driver->xmlopt, cfg->stateDir, vm) < 0) { + VIR_WARN("Failed to save status on vm %s", vm->def->name); + goto cleanup; + } } while ((rv = qemuMigrationDriveMirrorReady(driver, vm)) != 1) { @@ -2076,6 +2082,7 @@ qemuMigrationDriveMirror(virQEMUDriverPtr driver, ret = 0; cleanup: + virObjectUnref(cfg); VIR_FREE(diskAlias); VIR_FREE(nbd_dest); VIR_FREE(hoststr); @@ -5698,6 +5705,84 @@ qemuMigrationToFile(virQEMUDriverPtr driver, 
virDomainObjPtr vm, return ret; } + +int +qemuMigrationCancel(virQEMUDriverPtr driver, + virDomainObjPtr vm) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + virHashTablePtr blockJobs = NULL; + bool storage = false; + size_t i; + int ret = -1; + + VIR_DEBUG("Canceling unfinished outgoing migration of domain %s", + vm->def->name); + + for (i = 0; i < vm->def->ndisks; i++) { + virDomainDiskDefPtr disk = vm->def->disks[i]; + if (QEMU_DOMAIN_DISK_PRIVATE(disk)->migrating) { + qemuBlockJobSyncBegin(disk); + storage = true; + } + } + + qemuDomainObjEnterMonitor(driver, vm); + + ignore_value(qemuMonitorMigrateCancel(priv->mon)); + if (storage) + blockJobs = qemuMonitorGetAllBlockJobInfo(priv->mon); + + if (qemuDomainObjExitMonitor(driver, vm) < 0 || (storage && !blockJobs)) + goto endsyncjob; + + if (!storage) { + ret = 0; + goto cleanup; + } + + for (i = 0; i < vm->def->ndisks; i++) { + virDomainDiskDefPtr disk = vm->def->disks[i]; + qemuDomainDiskPrivatePtr diskPriv = QEMU_DOMAIN_DISK_PRIVATE(disk); + + if (!diskPriv->migrating) + continue; + + if (virHashLookup(blockJobs, disk->info.alias)) { + VIR_DEBUG("Drive mirror on disk %s is still running", disk->dst); + } else { + VIR_DEBUG("Drive mirror on disk %s is gone", disk->dst); + qemuBlockJobSyncEnd(driver, vm, disk); + diskPriv->migrating = false; + } + } + + if (qemuMigrationCancelDriveMirror(driver, vm, false, + QEMU_ASYNC_JOB_NONE) < 0) + goto endsyncjob; + + ret = 0; + + cleanup: + virHashFree(blockJobs); + return ret; + + endsyncjob: + if (storage) { + for (i = 0; i < vm->def->ndisks; i++) { + virDomainDiskDefPtr disk = vm->def->disks[i]; + qemuDomainDiskPrivatePtr diskPriv = QEMU_DOMAIN_DISK_PRIVATE(disk); + + if (diskPriv->migrating) { + qemuBlockJobSyncEnd(driver, vm, disk); + diskPriv->migrating = false; + } + } + } + goto cleanup; +} + + int qemuMigrationJobStart(virQEMUDriverPtr driver, virDomainObjPtr vm, diff --git a/src/qemu/qemu_migration.h b/src/qemu/qemu_migration.h index 1726455..e47bde5 100644 --- a/src/qemu/qemu_migration.h +++ b/src/qemu/qemu_migration.h @@ -177,4 +177,7 @@ int qemuMigrationToFile(virQEMUDriverPtr driver, virDomainObjPtr vm, ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(5) ATTRIBUTE_RETURN_CHECK; +int qemuMigrationCancel(virQEMUDriverPtr driver, + virDomainObjPtr vm); + #endif /* __QEMU_MIGRATION_H__ */ diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 3c9d4bc..5be0002 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -3354,8 +3354,6 @@ qemuProcessRecoverMigration(virQEMUDriverPtr driver, virDomainState state, int reason) { - qemuDomainObjPrivatePtr priv = vm->privateData; - if (job == QEMU_ASYNC_JOB_MIGRATION_IN) { switch (phase) { case QEMU_MIGRATION_PHASE_NONE: @@ -3409,11 +3407,7 @@ qemuProcessRecoverMigration(virQEMUDriverPtr driver, case QEMU_MIGRATION_PHASE_PERFORM3: /* migration is still in progress, let's cancel it and resume the * domain */ - VIR_DEBUG("Canceling unfinished outgoing migration of domain %s", - vm->def->name); - qemuDomainObjEnterMonitor(driver, vm); - ignore_value(qemuMonitorMigrateCancel(priv->mon)); - if (qemuDomainObjExitMonitor(driver, vm) < 0) + if (qemuMigrationCancel(driver, vm) < 0) return -1; /* resume the domain but only if it was paused as a result of * migration */ -- 2.4.3
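
With this change the domain status XML written by qemuDomainObjPrivateXMLFormat records which disks have an active mirror, so the information survives a libvirtd restart. The <job> element would look roughly like the following; the attribute values are purely illustrative:

    <job type='none' async='migration out' phase='perform3'>
      <disk dev='vda' migrating='yes'/>
      <disk dev='vdb' migrating='no'/>
    </job>

On restart, qemuDomainObjPrivateXMLParse picks up the disks marked migrating='yes' and qemuMigrationCancel can then clean up the corresponding drive mirrors.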

To avoid polling for asyncAbort flag changes. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- Notes: ACKed in version 2 Version 3: - no change Version 2: - rewritten using domain condition src/qemu/qemu_domain.c | 5 +++-- src/qemu/qemu_domain.h | 2 +- src/qemu/qemu_migration.c | 11 ++++------- 3 files changed, 8 insertions(+), 10 deletions(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 68b6a95..25fa8d3 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -169,7 +169,7 @@ qemuDomainObjResetAsyncJob(qemuDomainObjPrivatePtr priv) job->phase = 0; job->mask = QEMU_JOB_DEFAULT_MASK; job->dump_memory_only = false; - job->asyncAbort = false; + job->abortJob = false; VIR_FREE(job->current); } @@ -1652,7 +1652,8 @@ qemuDomainObjAbortAsyncJob(virDomainObjPtr obj) qemuDomainAsyncJobTypeToString(priv->job.asyncJob), obj, obj->def->name); - priv->job.asyncAbort = true; + priv->job.abortJob = true; + virDomainObjBroadcast(obj); } /* diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 9003c9b..a3c5015 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -135,7 +135,7 @@ struct qemuDomainJobObj { bool dump_memory_only; /* use dump-guest-memory to do dump */ qemuDomainJobInfoPtr current; /* async job progress data */ qemuDomainJobInfoPtr completed; /* statistics data of a recently completed job */ - bool asyncAbort; /* abort of async job requested */ + bool abortJob; /* abort of the job requested */ }; typedef void (*qemuDomainCleanupCallback)(virQEMUDriverPtr driver, diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index e0de09e..d82a5ba 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -2058,12 +2058,10 @@ qemuMigrationDriveMirror(virQEMUDriverPtr driver, } while ((rv = qemuMigrationDriveMirrorReady(driver, vm)) != 1) { - unsigned long long now; - if (rv < 0) goto cleanup; - if (priv->job.asyncAbort) { + if (priv->job.abortJob) { priv->job.current->type = VIR_DOMAIN_JOB_CANCELLED; virReportError(VIR_ERR_OPERATION_ABORTED, _("%s: %s"), qemuDomainAsyncJobTypeToString(priv->job.asyncJob), @@ -2071,8 +2069,7 @@ qemuMigrationDriveMirror(virQEMUDriverPtr driver, goto cleanup; } - if (virTimeMillisNow(&now) < 0 || - virDomainObjWaitUntil(vm, now + 500) < 0) + if (virDomainObjWait(vm) < 0) goto cleanup; } @@ -4069,10 +4066,10 @@ qemuMigrationRun(virQEMUDriverPtr driver, QEMU_ASYNC_JOB_MIGRATION_OUT) < 0) goto cleanup; - if (priv->job.asyncAbort) { + if (priv->job.abortJob) { /* explicitly do this *after* we entered the monitor, * as this is a critical section so we are guaranteed - * priv->job.asyncAbort will not change */ + * priv->job.abortJob will not change */ ignore_value(qemuDomainObjExitMonitor(driver, vm)); priv->job.current->type = VIR_DOMAIN_JOB_CANCELLED; virReportError(VIR_ERR_OPERATION_ABORTED, _("%s: %s"), -- 2.4.3
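
The resulting pattern is the usual condition-variable handshake on the per-domain condition: the thread requesting the abort sets the flag and wakes any waiters, and the waiting thread re-checks the flag after every wake-up while holding the domain object lock. A simplified sketch (the labels and the "done" predicate are placeholders, not code from the patch):

    /* aborting thread (qemuDomainObjAbortAsyncJob) */
    priv->job.abortJob = true;
    virDomainObjBroadcast(vm);

    /* waiting thread, domain object locked */
    while (!done) {
        if (priv->job.abortJob)
            goto abort;                   /* abort requested by another thread */
        if (virDomainObjWait(vm) < 0)     /* drops the lock while sleeping */
            goto error;
    }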

Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- Notes: ACKed in version 2 Version 3: - no change Version 2: - new patch src/qemu/qemu_monitor.c | 12 ++++++++++++ src/qemu/qemu_monitor.h | 6 ++++++ src/qemu/qemu_monitor_json.c | 10 ++++++++++ 3 files changed, 28 insertions(+) diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c index ae7ef28..b7de846 100644 --- a/src/qemu/qemu_monitor.c +++ b/src/qemu/qemu_monitor.c @@ -1479,6 +1479,18 @@ qemuMonitorEmitSerialChange(qemuMonitorPtr mon, int +qemuMonitorEmitSpiceMigrated(qemuMonitorPtr mon) +{ + int ret = -1; + VIR_DEBUG("mon=%p", mon); + + QEMU_MONITOR_CALLBACK(mon, ret, domainSpiceMigrated, mon->vm); + + return ret; +} + + +int qemuMonitorSetCapabilities(qemuMonitorPtr mon) { QEMU_CHECK_MONITOR(mon); diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h index 96f47e8..a29c505 100644 --- a/src/qemu/qemu_monitor.h +++ b/src/qemu/qemu_monitor.h @@ -182,6 +182,10 @@ typedef int (*qemuMonitorDomainSerialChangeCallback)(qemuMonitorPtr mon, bool connected, void *opaque); +typedef int (*qemuMonitorDomainSpiceMigratedCallback)(qemuMonitorPtr mon, + virDomainObjPtr vm, + void *opaque); + typedef struct _qemuMonitorCallbacks qemuMonitorCallbacks; typedef qemuMonitorCallbacks *qemuMonitorCallbacksPtr; struct _qemuMonitorCallbacks { @@ -209,6 +213,7 @@ struct _qemuMonitorCallbacks { qemuMonitorDomainDeviceDeletedCallback domainDeviceDeleted; qemuMonitorDomainNicRxFilterChangedCallback domainNicRxFilterChanged; qemuMonitorDomainSerialChangeCallback domainSerialChange; + qemuMonitorDomainSpiceMigratedCallback domainSpiceMigrated; }; char *qemuMonitorEscapeArg(const char *in); @@ -307,6 +312,7 @@ int qemuMonitorEmitNicRxFilterChanged(qemuMonitorPtr mon, int qemuMonitorEmitSerialChange(qemuMonitorPtr mon, const char *devAlias, bool connected); +int qemuMonitorEmitSpiceMigrated(qemuMonitorPtr mon); int qemuMonitorStartCPUs(qemuMonitorPtr mon, virConnectPtr conn); diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c index 5b227cd..04ae339 100644 --- a/src/qemu/qemu_monitor_json.c +++ b/src/qemu/qemu_monitor_json.c @@ -83,6 +83,7 @@ static void qemuMonitorJSONHandleGuestPanic(qemuMonitorPtr mon, virJSONValuePtr static void qemuMonitorJSONHandleDeviceDeleted(qemuMonitorPtr mon, virJSONValuePtr data); static void qemuMonitorJSONHandleNicRxFilterChanged(qemuMonitorPtr mon, virJSONValuePtr data); static void qemuMonitorJSONHandleSerialChange(qemuMonitorPtr mon, virJSONValuePtr data); +static void qemuMonitorJSONHandleSpiceMigrated(qemuMonitorPtr mon, virJSONValuePtr data); typedef struct { const char *type; @@ -107,6 +108,7 @@ static qemuEventHandler eventHandlers[] = { { "SPICE_CONNECTED", qemuMonitorJSONHandleSPICEConnect, }, { "SPICE_DISCONNECTED", qemuMonitorJSONHandleSPICEDisconnect, }, { "SPICE_INITIALIZED", qemuMonitorJSONHandleSPICEInitialize, }, + { "SPICE_MIGRATE_COMPLETED", qemuMonitorJSONHandleSpiceMigrated, }, { "STOP", qemuMonitorJSONHandleStop, }, { "SUSPEND", qemuMonitorJSONHandlePMSuspend, }, { "SUSPEND_DISK", qemuMonitorJSONHandlePMSuspendDisk, }, @@ -914,6 +916,14 @@ qemuMonitorJSONHandleSerialChange(qemuMonitorPtr mon, } +static void +qemuMonitorJSONHandleSpiceMigrated(qemuMonitorPtr mon, + virJSONValuePtr data ATTRIBUTE_UNUSED) +{ + qemuMonitorEmitSpiceMigrated(mon); +} + + int qemuMonitorJSONHumanCommandWithFd(qemuMonitorPtr mon, const char *cmd_str, -- 2.4.3

QEMU_CAPS_SEAMLESS_MIGRATION capability says QEMU supports SPICE_MIGRATE_COMPLETED event. Thus we can just drop all code which polls query-spice and replace it with waiting for the event. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- Notes: ACKed in version 2 Version 3: - no change Version 2: - new patch src/qemu/qemu_domain.c | 1 + src/qemu/qemu_domain.h | 1 + src/qemu/qemu_migration.c | 38 ++++++++++------------------------ src/qemu/qemu_monitor.c | 10 --------- src/qemu/qemu_monitor.h | 2 -- src/qemu/qemu_monitor_json.c | 49 -------------------------------------------- src/qemu/qemu_process.c | 28 +++++++++++++++++++++++++ tests/qemumonitorjsontest.c | 40 ------------------------------------ 8 files changed, 41 insertions(+), 128 deletions(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 25fa8d3..c9bdf6b 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -170,6 +170,7 @@ qemuDomainObjResetAsyncJob(qemuDomainObjPrivatePtr priv) job->mask = QEMU_JOB_DEFAULT_MASK; job->dump_memory_only = false; job->abortJob = false; + job->spiceMigrated = false; VIR_FREE(job->current); } diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index a3c5015..54e1e7b 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -136,6 +136,7 @@ struct qemuDomainJobObj { qemuDomainJobInfoPtr current; /* async job progress data */ qemuDomainJobInfoPtr completed; /* statistics data of a recently completed job */ bool abortJob; /* abort of the job requested */ + bool spiceMigrated; /* spice migration completed */ }; typedef void (*qemuDomainCleanupCallback)(virQEMUDriverPtr driver, diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index d82a5ba..d5a9dea 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -2390,45 +2390,29 @@ qemuMigrationSetPinAll(virQEMUDriverPtr driver, } static int -qemuMigrationWaitForSpice(virQEMUDriverPtr driver, - virDomainObjPtr vm) +qemuMigrationWaitForSpice(virDomainObjPtr vm) { qemuDomainObjPrivatePtr priv = vm->privateData; bool wait_for_spice = false; - bool spice_migrated = false; size_t i = 0; - int rc; - if (virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_SEAMLESS_MIGRATION)) { - for (i = 0; i < vm->def->ngraphics; i++) { - if (vm->def->graphics[i]->type == VIR_DOMAIN_GRAPHICS_TYPE_SPICE) { - wait_for_spice = true; - break; - } + if (!virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_SEAMLESS_MIGRATION)) + return 0; + + for (i = 0; i < vm->def->ngraphics; i++) { + if (vm->def->graphics[i]->type == VIR_DOMAIN_GRAPHICS_TYPE_SPICE) { + wait_for_spice = true; + break; } } if (!wait_for_spice) return 0; - while (!spice_migrated) { - /* Poll every 50ms for progress & to allow cancellation */ - struct timespec ts = { .tv_sec = 0, .tv_nsec = 50 * 1000 * 1000ull }; - - if (qemuDomainObjEnterMonitorAsync(driver, vm, - QEMU_ASYNC_JOB_MIGRATION_OUT) < 0) + while (!priv->job.spiceMigrated && !priv->job.abortJob) { + if (virDomainObjWait(vm) < 0) return -1; - - rc = qemuMonitorGetSpiceMigrationStatus(priv->mon, &spice_migrated); - if (qemuDomainObjExitMonitor(driver, vm) < 0) - return -1; - if (rc < 0) - return -1; - virObjectUnlock(vm); - nanosleep(&ts, NULL); - virObjectLock(vm); } - return 0; } @@ -3602,7 +3586,7 @@ qemuMigrationConfirmPhase(virQEMUDriverPtr driver, if (retcode == 0) { /* If guest uses SPICE and supports seamless migration we have to hold * up domain shutdown until SPICE server transfers its data */ - qemuMigrationWaitForSpice(driver, vm); + qemuMigrationWaitForSpice(vm); 
qemuProcessStop(driver, vm, VIR_DOMAIN_SHUTOFF_MIGRATED, VIR_QEMU_PROCESS_STOP_MIGRATED); diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c index b7de846..94b0007 100644 --- a/src/qemu/qemu_monitor.c +++ b/src/qemu/qemu_monitor.c @@ -2117,16 +2117,6 @@ qemuMonitorGetMigrationStatus(qemuMonitorPtr mon, int -qemuMonitorGetSpiceMigrationStatus(qemuMonitorPtr mon, - bool *spice_migrated) -{ - QEMU_CHECK_MONITOR_JSON(mon); - - return qemuMonitorJSONGetSpiceMigrationStatus(mon, spice_migrated); -} - - -int qemuMonitorMigrateToFd(qemuMonitorPtr mon, unsigned int flags, int fd) diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h index a29c505..1afc344 100644 --- a/src/qemu/qemu_monitor.h +++ b/src/qemu/qemu_monitor.h @@ -498,8 +498,6 @@ struct _qemuMonitorMigrationStatus { int qemuMonitorGetMigrationStatus(qemuMonitorPtr mon, qemuMonitorMigrationStatusPtr status); -int qemuMonitorGetSpiceMigrationStatus(qemuMonitorPtr mon, - bool *spice_migrated); typedef enum { QEMU_MONITOR_MIGRATION_CAPS_XBZRLE, diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c index 04ae339..0ba549e 100644 --- a/src/qemu/qemu_monitor_json.c +++ b/src/qemu/qemu_monitor_json.c @@ -2684,55 +2684,6 @@ int qemuMonitorJSONGetMigrationStatus(qemuMonitorPtr mon, } -static int -qemuMonitorJSONSpiceGetMigrationStatusReply(virJSONValuePtr reply, - bool *spice_migrated) -{ - virJSONValuePtr ret; - - if (!(ret = virJSONValueObjectGet(reply, "return"))) { - virReportError(VIR_ERR_INTERNAL_ERROR, "%s", - _("query-spice reply was missing return data")); - return -1; - } - - if (virJSONValueObjectGetBoolean(ret, "migrated", spice_migrated) < 0) { - /* Deliberately don't report error here as we are - * probably dealing with older qemu which doesn't - * report this yet. Pretend spice is migrated. 
*/ - *spice_migrated = true; - } - - return 0; -} - - -int qemuMonitorJSONGetSpiceMigrationStatus(qemuMonitorPtr mon, - bool *spice_migrated) -{ - int ret; - virJSONValuePtr cmd = qemuMonitorJSONMakeCommand("query-spice", - NULL); - virJSONValuePtr reply = NULL; - - if (!cmd) - return -1; - - ret = qemuMonitorJSONCommand(mon, cmd, &reply); - - if (ret == 0) - ret = qemuMonitorJSONCheckError(cmd, reply); - - if (ret == 0) - ret = qemuMonitorJSONSpiceGetMigrationStatusReply(reply, - spice_migrated); - - virJSONValueFree(cmd); - virJSONValueFree(reply); - return ret; -} - - int qemuMonitorJSONMigrate(qemuMonitorPtr mon, unsigned int flags, const char *uri) diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 5be0002..ba84182 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -1481,6 +1481,33 @@ qemuProcessHandleSerialChanged(qemuMonitorPtr mon ATTRIBUTE_UNUSED, } +static int +qemuProcessHandleSpiceMigrated(qemuMonitorPtr mon ATTRIBUTE_UNUSED, + virDomainObjPtr vm, + void *opaque ATTRIBUTE_UNUSED) +{ + qemuDomainObjPrivatePtr priv; + + virObjectLock(vm); + + VIR_DEBUG("Spice migration completed for domain %p %s", + vm, vm->def->name); + + priv = vm->privateData; + if (priv->job.asyncJob != QEMU_ASYNC_JOB_MIGRATION_OUT) { + VIR_DEBUG("got SPICE_MIGRATE_COMPLETED event without a migration job"); + goto cleanup; + } + + priv->job.spiceMigrated = true; + virDomainObjSignal(vm); + + cleanup: + virObjectUnlock(vm); + return 0; +} + + static qemuMonitorCallbacks monitorCallbacks = { .eofNotify = qemuProcessHandleMonitorEOF, .errorNotify = qemuProcessHandleMonitorError, @@ -1504,6 +1531,7 @@ static qemuMonitorCallbacks monitorCallbacks = { .domainDeviceDeleted = qemuProcessHandleDeviceDeleted, .domainNicRxFilterChanged = qemuProcessHandleNicRxFilterChanged, .domainSerialChange = qemuProcessHandleSerialChanged, + .domainSpiceMigrated = qemuProcessHandleSpiceMigrated, }; static int diff --git a/tests/qemumonitorjsontest.c b/tests/qemumonitorjsontest.c index 0f82fd8..0623275 100644 --- a/tests/qemumonitorjsontest.c +++ b/tests/qemumonitorjsontest.c @@ -1706,45 +1706,6 @@ testQemuMonitorJSONqemuMonitorJSONGetMigrationStatus(const void *data) } static int -testQemuMonitorJSONqemuMonitorJSONGetSpiceMigrationStatus(const void *data) -{ - virDomainXMLOptionPtr xmlopt = (virDomainXMLOptionPtr)data; - qemuMonitorTestPtr test = qemuMonitorTestNewSimple(true, xmlopt); - int ret = -1; - bool spiceMigrated; - - if (!test) - return -1; - - if (qemuMonitorTestAddItem(test, "query-spice", - "{" - " \"return\": {" - " \"migrated\": true," - " \"enabled\": false," - " \"mouse-mode\": \"client\"" - " }," - " \"id\": \"libvirt-14\"" - "}") < 0) - goto cleanup; - - if (qemuMonitorJSONGetSpiceMigrationStatus(qemuMonitorTestGetMonitor(test), - &spiceMigrated) < 0) - goto cleanup; - - if (!spiceMigrated) { - virReportError(VIR_ERR_INTERNAL_ERROR, - "Invalid spice migration status: %d, expecting 1", - spiceMigrated); - goto cleanup; - } - - ret = 0; - cleanup: - qemuMonitorTestFree(test); - return ret; -} - -static int testHashEqualChardevInfo(const void *value1, const void *value2) { const qemuMonitorChardevInfo *info1 = value1; @@ -2400,7 +2361,6 @@ mymain(void) DO_TEST(qemuMonitorJSONGetBlockStatsInfo); DO_TEST(qemuMonitorJSONGetMigrationCacheSize); DO_TEST(qemuMonitorJSONGetMigrationStatus); - DO_TEST(qemuMonitorJSONGetSpiceMigrationStatus); DO_TEST(qemuMonitorJSONGetChardevInfo); DO_TEST(qemuMonitorJSONSetBlockIoThrottle); DO_TEST(qemuMonitorJSONGetTargetArch); -- 2.4.3

Move common parts of qemuDomainGetJobInfo and qemuDomainGetJobStats into a separate API (qemuDomainGetJobStatsInternal). Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- Notes: ACKed in version 2 Version 3: - no change Version 2: - new patch src/qemu/qemu_driver.c | 113 ++++++++++++++++++++++++++----------------------- 1 file changed, 61 insertions(+), 52 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 1556a9e..17c8c85 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -13090,42 +13090,72 @@ qemuConnectBaselineCPU(virConnectPtr conn ATTRIBUTE_UNUSED, } -static int qemuDomainGetJobInfo(virDomainPtr dom, - virDomainJobInfoPtr info) +static int +qemuDomainGetJobStatsInternal(virQEMUDriverPtr driver ATTRIBUTE_UNUSED, + virDomainObjPtr vm, + bool completed, + qemuDomainJobInfoPtr jobInfo) { + qemuDomainObjPrivatePtr priv = vm->privateData; + qemuDomainJobInfoPtr info; + int ret = -1; + + if (!completed && + !virDomainObjIsActive(vm)) { + virReportError(VIR_ERR_OPERATION_INVALID, "%s", + _("domain is not running")); + goto cleanup; + } + + if (completed) + info = priv->job.completed; + else + info = priv->job.current; + + if (!info) { + jobInfo->type = VIR_DOMAIN_JOB_NONE; + ret = 0; + goto cleanup; + } + *jobInfo = *info; + + if (jobInfo->type == VIR_DOMAIN_JOB_BOUNDED || + jobInfo->type == VIR_DOMAIN_JOB_UNBOUNDED) + ret = qemuDomainJobInfoUpdateTime(jobInfo); + else + ret = 0; + + cleanup: + return ret; +} + + +static int +qemuDomainGetJobInfo(virDomainPtr dom, + virDomainJobInfoPtr info) +{ + virQEMUDriverPtr driver = dom->conn->privateData; + qemuDomainJobInfo jobInfo; virDomainObjPtr vm; int ret = -1; - qemuDomainObjPrivatePtr priv; if (!(vm = qemuDomObjFromDomain(dom))) goto cleanup; - priv = vm->privateData; - if (virDomainGetJobInfoEnsureACL(dom->conn, vm->def) < 0) goto cleanup; - if (virDomainObjIsActive(vm)) { - if (priv->job.current) { - /* Refresh elapsed time again just to ensure it - * is fully updated. 
This is primarily for benefit - * of incoming migration which we don't currently - * monitor actively in the background thread - */ - if (qemuDomainJobInfoUpdateTime(priv->job.current) < 0 || - qemuDomainJobInfoToInfo(priv->job.current, info) < 0) - goto cleanup; - } else { - memset(info, 0, sizeof(*info)); - info->type = VIR_DOMAIN_JOB_NONE; - } - } else { - virReportError(VIR_ERR_OPERATION_INVALID, - "%s", _("domain is not running")); + if (qemuDomainGetJobStatsInternal(driver, vm, false, &jobInfo) < 0) + goto cleanup; + + if (jobInfo.type == VIR_DOMAIN_JOB_NONE) { + memset(info, 0, sizeof(*info)); + info->type = VIR_DOMAIN_JOB_NONE; + ret = 0; goto cleanup; } - ret = 0; + ret = qemuDomainJobInfoToInfo(&jobInfo, info); cleanup: virDomainObjEndAPI(&vm); @@ -13140,9 +13170,11 @@ qemuDomainGetJobStats(virDomainPtr dom, int *nparams, unsigned int flags) { + virQEMUDriverPtr driver = dom->conn->privateData; virDomainObjPtr vm; qemuDomainObjPrivatePtr priv; - qemuDomainJobInfoPtr jobInfo; + qemuDomainJobInfo jobInfo; + bool completed = !!(flags & VIR_DOMAIN_JOB_STATS_COMPLETED); int ret = -1; virCheckFlags(VIR_DOMAIN_JOB_STATS_COMPLETED, -1); @@ -13150,24 +13182,14 @@ qemuDomainGetJobStats(virDomainPtr dom, if (!(vm = qemuDomObjFromDomain(dom))) goto cleanup; - priv = vm->privateData; - if (virDomainGetJobStatsEnsureACL(dom->conn, vm->def) < 0) goto cleanup; - if (!(flags & VIR_DOMAIN_JOB_STATS_COMPLETED) && - !virDomainObjIsActive(vm)) { - virReportError(VIR_ERR_OPERATION_INVALID, - "%s", _("domain is not running")); + priv = vm->privateData; + if (qemuDomainGetJobStatsInternal(driver, vm, completed, &jobInfo) < 0) goto cleanup; - } - if (flags & VIR_DOMAIN_JOB_STATS_COMPLETED) - jobInfo = priv->job.completed; - else - jobInfo = priv->job.current; - - if (!jobInfo) { + if (jobInfo.type == VIR_DOMAIN_JOB_NONE) { *type = VIR_DOMAIN_JOB_NONE; *params = NULL; *nparams = 0; @@ -13175,24 +13197,11 @@ qemuDomainGetJobStats(virDomainPtr dom, goto cleanup; } - /* Refresh elapsed time again just to ensure it - * is fully updated. This is primarily for benefit - * of incoming migration which we don't currently - * monitor actively in the background thread - */ - if ((jobInfo->type == VIR_DOMAIN_JOB_BOUNDED || - jobInfo->type == VIR_DOMAIN_JOB_UNBOUNDED) && - qemuDomainJobInfoUpdateTime(jobInfo) < 0) - goto cleanup; + ret = qemuDomainJobInfoToParams(&jobInfo, type, params, nparams); - if (qemuDomainJobInfoToParams(jobInfo, type, params, nparams) < 0) - goto cleanup; - - if (flags & VIR_DOMAIN_JOB_STATS_COMPLETED) + if (completed && ret == 0) VIR_FREE(priv->job.completed); - ret = 0; - cleanup: virDomainObjEndAPI(&vm); return ret; -- 2.4.3
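
Nothing changes for clients; this is purely an internal refactoring of the two driver entry points. For reference, an application keeps fetching the statistics through the public API, e.g. (illustrative only, given a virDomainPtr dom, error handling omitted):

    virTypedParameterPtr params = NULL;
    int nparams = 0;
    int type;
    unsigned long long elapsed = 0;

    if (virDomainGetJobStats(dom, &type, &params, &nparams,
                             VIR_DOMAIN_JOB_STATS_COMPLETED) == 0) {
        virTypedParamsGetULLong(params, nparams,
                                VIR_DOMAIN_JOB_TIME_ELAPSED, &elapsed);
        virTypedParamsFree(params, nparams);
    }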

Once we start waiting for migration events instead of polling query-migrate, priv->job.current will not be regularly updated anymore because we will get the current status directly from the events. Thus virDomainGetJob{Info,Stats} will have to query QEMU, but they can't just blindly update priv->job.current structure. This patch introduces qemuMigrationFetchJobStatus which just fills in a caller supplied structure and makes qemuMigrationUpdateJobStatus a tiny wrapper around it. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- Notes: ACKed in version 2 Version 3: - no change Version 2: - new patch src/qemu/qemu_migration.c | 133 ++++++++++++++++++++++++++++++---------------- src/qemu/qemu_migration.h | 5 ++ 2 files changed, 93 insertions(+), 45 deletions(-) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index d5a9dea..7259c57 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -2416,67 +2416,110 @@ qemuMigrationWaitForSpice(virDomainObjPtr vm) return 0; } -static int -qemuMigrationUpdateJobStatus(virQEMUDriverPtr driver, - virDomainObjPtr vm, - const char *job, - qemuDomainAsyncJob asyncJob) + +static void +qemuMigrationUpdateJobType(qemuDomainJobInfoPtr jobInfo) { - qemuDomainObjPrivatePtr priv = vm->privateData; - qemuMonitorMigrationStatus status; - qemuDomainJobInfoPtr jobInfo; - int ret; - - memset(&status, 0, sizeof(status)); - - ret = qemuDomainObjEnterMonitorAsync(driver, vm, asyncJob); - if (ret < 0) { - /* Guest already exited or waiting for the job timed out; nothing - * further to update. */ - return ret; - } - ret = qemuMonitorGetMigrationStatus(priv->mon, &status); - - if (qemuDomainObjExitMonitor(driver, vm) < 0) - return -1; - - if (ret < 0 || - qemuDomainJobInfoUpdateTime(priv->job.current) < 0) - return -1; - - ret = -1; - jobInfo = priv->job.current; - switch (status.status) { + switch (jobInfo->status.status) { case QEMU_MONITOR_MIGRATION_STATUS_COMPLETED: jobInfo->type = VIR_DOMAIN_JOB_COMPLETED; - /* fall through */ - case QEMU_MONITOR_MIGRATION_STATUS_SETUP: - case QEMU_MONITOR_MIGRATION_STATUS_ACTIVE: - case QEMU_MONITOR_MIGRATION_STATUS_CANCELLING: - ret = 0; break; case QEMU_MONITOR_MIGRATION_STATUS_INACTIVE: jobInfo->type = VIR_DOMAIN_JOB_NONE; - virReportError(VIR_ERR_OPERATION_FAILED, - _("%s: %s"), job, _("is not active")); break; case QEMU_MONITOR_MIGRATION_STATUS_ERROR: jobInfo->type = VIR_DOMAIN_JOB_FAILED; - virReportError(VIR_ERR_OPERATION_FAILED, - _("%s: %s"), job, _("unexpectedly failed")); break; case QEMU_MONITOR_MIGRATION_STATUS_CANCELLED: jobInfo->type = VIR_DOMAIN_JOB_CANCELLED; + break; + + case QEMU_MONITOR_MIGRATION_STATUS_SETUP: + case QEMU_MONITOR_MIGRATION_STATUS_ACTIVE: + case QEMU_MONITOR_MIGRATION_STATUS_CANCELLING: + break; + } +} + + +int +qemuMigrationFetchJobStatus(virQEMUDriverPtr driver, + virDomainObjPtr vm, + qemuDomainAsyncJob asyncJob, + qemuDomainJobInfoPtr jobInfo) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + int rv; + + if (qemuDomainObjEnterMonitorAsync(driver, vm, asyncJob) < 0) + return -1; + + memset(&jobInfo->status, 0, sizeof(jobInfo->status)); + rv = qemuMonitorGetMigrationStatus(priv->mon, &jobInfo->status); + + if (qemuDomainObjExitMonitor(driver, vm) < 0 || rv < 0) + return -1; + + qemuMigrationUpdateJobType(jobInfo); + return qemuDomainJobInfoUpdateTime(jobInfo); +} + + +static int +qemuMigrationUpdateJobStatus(virQEMUDriverPtr driver, + virDomainObjPtr vm, + qemuDomainAsyncJob asyncJob) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + 
qemuDomainJobInfoPtr jobInfo = priv->job.current; + qemuDomainJobInfo newInfo = *jobInfo; + + if (qemuMigrationFetchJobStatus(driver, vm, asyncJob, &newInfo) < 0) + return -1; + + *jobInfo = newInfo; + return 0; +} + + +static int +qemuMigrationCheckJobStatus(virQEMUDriverPtr driver, + virDomainObjPtr vm, + const char *job, + qemuDomainAsyncJob asyncJob) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + qemuDomainJobInfoPtr jobInfo = priv->job.current; + + if (qemuMigrationUpdateJobStatus(driver, vm, asyncJob) < 0) + return -1; + + switch (jobInfo->type) { + case VIR_DOMAIN_JOB_NONE: + virReportError(VIR_ERR_OPERATION_FAILED, + _("%s: %s"), job, _("is not active")); + return -1; + + case VIR_DOMAIN_JOB_FAILED: + virReportError(VIR_ERR_OPERATION_FAILED, + _("%s: %s"), job, _("unexpectedly failed")); + return -1; + + case VIR_DOMAIN_JOB_CANCELLED: virReportError(VIR_ERR_OPERATION_ABORTED, _("%s: %s"), job, _("canceled by client")); + return -1; + + case VIR_DOMAIN_JOB_BOUNDED: + case VIR_DOMAIN_JOB_UNBOUNDED: + case VIR_DOMAIN_JOB_COMPLETED: + case VIR_DOMAIN_JOB_LAST: break; } - jobInfo->status = status; - - return ret; + return 0; } @@ -2517,7 +2560,7 @@ qemuMigrationWaitForCompletion(virQEMUDriverPtr driver, /* Poll every 50ms for progress & to allow cancellation */ struct timespec ts = { .tv_sec = 0, .tv_nsec = 50 * 1000 * 1000ull }; - if (qemuMigrationUpdateJobStatus(driver, vm, job, asyncJob) == -1) + if (qemuMigrationCheckJobStatus(driver, vm, job, asyncJob) < 0) goto error; if (storage && @@ -4123,8 +4166,8 @@ qemuMigrationRun(virQEMUDriverPtr driver, * rather failed later on. Check its status before waiting for a * connection from qemu which may never be initiated. */ - if (qemuMigrationUpdateJobStatus(driver, vm, _("migration job"), - QEMU_ASYNC_JOB_MIGRATION_OUT) < 0) + if (qemuMigrationCheckJobStatus(driver, vm, _("migration job"), + QEMU_ASYNC_JOB_MIGRATION_OUT) < 0) goto cancel; while ((fd = accept(spec->dest.unix_socket.sock, NULL, NULL)) < 0) { diff --git a/src/qemu/qemu_migration.h b/src/qemu/qemu_migration.h index e47bde5..65dfdec 100644 --- a/src/qemu/qemu_migration.h +++ b/src/qemu/qemu_migration.h @@ -180,4 +180,9 @@ int qemuMigrationToFile(virQEMUDriverPtr driver, virDomainObjPtr vm, int qemuMigrationCancel(virQEMUDriverPtr driver, virDomainObjPtr vm); +int qemuMigrationFetchJobStatus(virQEMUDriverPtr driver, + virDomainObjPtr vm, + qemuDomainAsyncJob asyncJob, + qemuDomainJobInfoPtr jobInfo); + #endif /* __QEMU_MIGRATION_H__ */ -- 2.4.3
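
The point of the split is that callers which must not modify the live priv->job.current structure can fetch fresh statistics into a local copy, roughly:

    /* with an appropriate job held; see the GetJobStats patch later in the series */
    qemuDomainJobInfo info = *priv->job.current;    /* work on a copy */

    if (qemuMigrationFetchJobStatus(driver, vm, QEMU_ASYNC_JOB_NONE, &info) < 0)
        return -1;
    /* priv->job.current is untouched; "info" now holds up-to-date data */

This is exactly what the later virDomainGetJob{Info,Stats} changes in this series rely on.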

Instead of passing current job name to several functions which already know what the current job is we can generate the name where we actually need to use it. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- Notes: ACKed in version 2 Version 3: - no change Version 2: - new patch src/qemu/qemu_migration.c | 54 ++++++++++++++++++++++++----------------------- 1 file changed, 28 insertions(+), 26 deletions(-) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 7259c57..77a76a7 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -2468,6 +2468,24 @@ qemuMigrationFetchJobStatus(virQEMUDriverPtr driver, } +static const char * +qemuMigrationJobName(virDomainObjPtr vm) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + + switch (priv->job.asyncJob) { + case QEMU_ASYNC_JOB_MIGRATION_OUT: + return _("migration job"); + case QEMU_ASYNC_JOB_SAVE: + return _("domain save job"); + case QEMU_ASYNC_JOB_DUMP: + return _("domain core dump job"); + default: + return _("job"); + } +} + + static int qemuMigrationUpdateJobStatus(virQEMUDriverPtr driver, virDomainObjPtr vm, @@ -2488,7 +2506,6 @@ qemuMigrationUpdateJobStatus(virQEMUDriverPtr driver, static int qemuMigrationCheckJobStatus(virQEMUDriverPtr driver, virDomainObjPtr vm, - const char *job, qemuDomainAsyncJob asyncJob) { qemuDomainObjPrivatePtr priv = vm->privateData; @@ -2499,18 +2516,18 @@ qemuMigrationCheckJobStatus(virQEMUDriverPtr driver, switch (jobInfo->type) { case VIR_DOMAIN_JOB_NONE: - virReportError(VIR_ERR_OPERATION_FAILED, - _("%s: %s"), job, _("is not active")); + virReportError(VIR_ERR_OPERATION_FAILED, _("%s: %s"), + qemuMigrationJobName(vm), _("is not active")); return -1; case VIR_DOMAIN_JOB_FAILED: - virReportError(VIR_ERR_OPERATION_FAILED, - _("%s: %s"), job, _("unexpectedly failed")); + virReportError(VIR_ERR_OPERATION_FAILED, _("%s: %s"), + qemuMigrationJobName(vm), _("unexpectedly failed")); return -1; case VIR_DOMAIN_JOB_CANCELLED: - virReportError(VIR_ERR_OPERATION_ABORTED, - _("%s: %s"), job, _("canceled by client")); + virReportError(VIR_ERR_OPERATION_ABORTED, _("%s: %s"), + qemuMigrationJobName(vm), _("canceled by client")); return -1; case VIR_DOMAIN_JOB_BOUNDED: @@ -2536,31 +2553,16 @@ qemuMigrationWaitForCompletion(virQEMUDriverPtr driver, { qemuDomainObjPrivatePtr priv = vm->privateData; qemuDomainJobInfoPtr jobInfo = priv->job.current; - const char *job; int pauseReason; int ret = -1; - switch (priv->job.asyncJob) { - case QEMU_ASYNC_JOB_MIGRATION_OUT: - job = _("migration job"); - break; - case QEMU_ASYNC_JOB_SAVE: - job = _("domain save job"); - break; - case QEMU_ASYNC_JOB_DUMP: - job = _("domain core dump job"); - break; - default: - job = _("job"); - } - jobInfo->type = VIR_DOMAIN_JOB_UNBOUNDED; while (jobInfo->type == VIR_DOMAIN_JOB_UNBOUNDED) { /* Poll every 50ms for progress & to allow cancellation */ struct timespec ts = { .tv_sec = 0, .tv_nsec = 50 * 1000 * 1000ull }; - if (qemuMigrationCheckJobStatus(driver, vm, job, asyncJob) < 0) + if (qemuMigrationCheckJobStatus(driver, vm, asyncJob) < 0) goto error; if (storage && @@ -2571,8 +2573,8 @@ qemuMigrationWaitForCompletion(virQEMUDriverPtr driver, if (abort_on_error && virDomainObjGetState(vm, &pauseReason) == VIR_DOMAIN_PAUSED && pauseReason == VIR_DOMAIN_PAUSED_IOERROR) { - virReportError(VIR_ERR_OPERATION_FAILED, - _("%s: %s"), job, _("failed due to I/O error")); + virReportError(VIR_ERR_OPERATION_FAILED, _("%s: %s"), + qemuMigrationJobName(vm), _("failed due to I/O error")); goto error; } @@ -4166,7 +4168,7 
@@ qemuMigrationRun(virQEMUDriverPtr driver, * rather failed later on. Check its status before waiting for a * connection from qemu which may never be initiated. */ - if (qemuMigrationCheckJobStatus(driver, vm, _("migration job"), + if (qemuMigrationCheckJobStatus(driver, vm, QEMU_ASYNC_JOB_MIGRATION_OUT) < 0) goto cancel; -- 2.4.3

Checking status of all part of migration and aborting it when something failed is a complex thing which makes the waiting loop hard to read. This patch moves all the checks into a separate function similarly to what was done for drive mirror loops. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- Notes: ACKed in version 2 Version 3: - no change Version 2: - new patch src/qemu/qemu_migration.c | 106 +++++++++++++++++++++++++++++----------------- 1 file changed, 66 insertions(+), 40 deletions(-) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 77a76a7..d9f1a59 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -2540,6 +2540,63 @@ qemuMigrationCheckJobStatus(virQEMUDriverPtr driver, } +/** + * Returns 1 if migration completed successfully, + * 0 if the domain is still being migrated, + * -1 migration failed, + * -2 something else failed, we need to cancel migration. + */ +static int +qemuMigrationCompleted(virQEMUDriverPtr driver, + virDomainObjPtr vm, + qemuDomainAsyncJob asyncJob, + virConnectPtr dconn, + bool abort_on_error, + bool storage) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + qemuDomainJobInfoPtr jobInfo = priv->job.current; + int pauseReason; + + if (qemuMigrationCheckJobStatus(driver, vm, asyncJob) < 0) + goto error; + + if (storage && qemuMigrationDriveMirrorReady(driver, vm) < 0) + goto error; + + if (abort_on_error && + virDomainObjGetState(vm, &pauseReason) == VIR_DOMAIN_PAUSED && + pauseReason == VIR_DOMAIN_PAUSED_IOERROR) { + virReportError(VIR_ERR_OPERATION_FAILED, _("%s: %s"), + qemuMigrationJobName(vm), _("failed due to I/O error")); + goto error; + } + + if (dconn && virConnectIsAlive(dconn) <= 0) { + virReportError(VIR_ERR_OPERATION_FAILED, "%s", + _("Lost connection to destination host")); + goto error; + } + + if (jobInfo->type == VIR_DOMAIN_JOB_COMPLETED) + return 1; + else + return 0; + + error: + if (jobInfo->type == VIR_DOMAIN_JOB_UNBOUNDED) { + /* The migration was aborted by us rather than QEMU itself. */ + jobInfo->type = VIR_DOMAIN_JOB_FAILED; + return -2; + } else if (jobInfo->type == VIR_DOMAIN_JOB_COMPLETED) { + jobInfo->type = VIR_DOMAIN_JOB_FAILED; + return -1; + } else { + return -1; + } +} + + /* Returns 0 on success, -2 when migration needs to be cancelled, or -1 when * QEMU reports failed migration. 
*/ @@ -2553,59 +2610,28 @@ qemuMigrationWaitForCompletion(virQEMUDriverPtr driver, { qemuDomainObjPrivatePtr priv = vm->privateData; qemuDomainJobInfoPtr jobInfo = priv->job.current; - int pauseReason; - int ret = -1; + int rv; jobInfo->type = VIR_DOMAIN_JOB_UNBOUNDED; - - while (jobInfo->type == VIR_DOMAIN_JOB_UNBOUNDED) { + while ((rv = qemuMigrationCompleted(driver, vm, asyncJob, dconn, + abort_on_error, storage)) != 1) { /* Poll every 50ms for progress & to allow cancellation */ struct timespec ts = { .tv_sec = 0, .tv_nsec = 50 * 1000 * 1000ull }; - if (qemuMigrationCheckJobStatus(driver, vm, asyncJob) < 0) - goto error; + if (rv < 0) + return rv; - if (storage && - qemuMigrationDriveMirrorReady(driver, vm) < 0) - break; - - /* cancel migration if disk I/O error is emitted while migrating */ - if (abort_on_error && - virDomainObjGetState(vm, &pauseReason) == VIR_DOMAIN_PAUSED && - pauseReason == VIR_DOMAIN_PAUSED_IOERROR) { - virReportError(VIR_ERR_OPERATION_FAILED, _("%s: %s"), - qemuMigrationJobName(vm), _("failed due to I/O error")); - goto error; - } - - if (dconn && virConnectIsAlive(dconn) <= 0) { - virReportError(VIR_ERR_OPERATION_FAILED, "%s", - _("Lost connection to destination host")); - goto error; - } - - if (jobInfo->type == VIR_DOMAIN_JOB_UNBOUNDED) { - virObjectUnlock(vm); - nanosleep(&ts, NULL); - virObjectLock(vm); - } + virObjectUnlock(vm); + nanosleep(&ts, NULL); + virObjectLock(vm); } qemuDomainJobInfoUpdateDowntime(jobInfo); VIR_FREE(priv->job.completed); if (VIR_ALLOC(priv->job.completed) == 0) *priv->job.completed = *jobInfo; - return 0; - error: - /* Check if the migration was aborted by us rather than QEMU itself. */ - if (jobInfo->type == VIR_DOMAIN_JOB_UNBOUNDED || - jobInfo->type == VIR_DOMAIN_JOB_COMPLETED) { - if (jobInfo->type == VIR_DOMAIN_JOB_UNBOUNDED) - ret = -2; - jobInfo->type = VIR_DOMAIN_JOB_FAILED; - } - return ret; + return 0; } -- 2.4.3

Thanks to Juan's work QEMU finally emits an event whenever migration state changes. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- Notes: The MIGRATION event is not supported by QEMU yet, this patch is based on the current patches available on qemu-devel mailing list. However, there were no objections to the design of the event, which makes it unlikely to change. Anyway this will have to wait until the patches are applied to QEMU. ACKed in version 2 Version 3: - rebased (context conflict in qemu_capabilities.[ch]) Version 2: - new patch src/qemu/qemu_capabilities.c | 3 +++ src/qemu/qemu_capabilities.h | 1 + src/qemu/qemu_monitor.c | 14 ++++++++++++++ src/qemu/qemu_monitor.h | 8 ++++++++ src/qemu/qemu_monitor_json.c | 23 +++++++++++++++++++++++ 5 files changed, 49 insertions(+) diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index ca7a7c2..a80763a 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -285,6 +285,8 @@ VIR_ENUM_IMPL(virQEMUCaps, QEMU_CAPS_LAST, "dea-key-wrap", "pci-serial", "aarch64-off", + + "migration-event", /* 190 */ ); @@ -1499,6 +1501,7 @@ struct virQEMUCapsStringFlags virQEMUCapsEvents[] = { { "BALLOON_CHANGE", QEMU_CAPS_BALLOON_EVENT }, { "SPICE_MIGRATE_COMPLETED", QEMU_CAPS_SEAMLESS_MIGRATION }, { "DEVICE_DELETED", QEMU_CAPS_DEVICE_DEL_EVENT }, + { "MIGRATION", QEMU_CAPS_MIGRATION_EVENT }, }; struct virQEMUCapsStringFlags virQEMUCapsObjectTypes[] = { diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h index b5a7770..34cd078 100644 --- a/src/qemu/qemu_capabilities.h +++ b/src/qemu/qemu_capabilities.h @@ -229,6 +229,7 @@ typedef enum { QEMU_CAPS_DEA_KEY_WRAP = 187, /* -machine dea_key_wrap */ QEMU_CAPS_DEVICE_PCI_SERIAL = 188, /* -device pci-serial */ QEMU_CAPS_CPU_AARCH64_OFF = 189, /* -cpu ...,aarch64=off */ + QEMU_CAPS_MIGRATION_EVENT = 190, /* MIGRATION event */ QEMU_CAPS_LAST, /* this must always be the last item */ } virQEMUCapsFlags; diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c index 94b0007..7f71a0e 100644 --- a/src/qemu/qemu_monitor.c +++ b/src/qemu/qemu_monitor.c @@ -1491,6 +1491,20 @@ qemuMonitorEmitSpiceMigrated(qemuMonitorPtr mon) int +qemuMonitorEmitMigrationStatus(qemuMonitorPtr mon, + int status) +{ + int ret = -1; + VIR_DEBUG("mon=%p, status=%s", + mon, NULLSTR(qemuMonitorMigrationStatusTypeToString(status))); + + QEMU_MONITOR_CALLBACK(mon, ret, domainMigrationStatus, mon->vm, status); + + return ret; +} + + +int qemuMonitorSetCapabilities(qemuMonitorPtr mon) { QEMU_CHECK_MONITOR(mon); diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h index 1afc344..7018045 100644 --- a/src/qemu/qemu_monitor.h +++ b/src/qemu/qemu_monitor.h @@ -186,6 +186,11 @@ typedef int (*qemuMonitorDomainSpiceMigratedCallback)(qemuMonitorPtr mon, virDomainObjPtr vm, void *opaque); +typedef int (*qemuMonitorDomainMigrationStatusCallback)(qemuMonitorPtr mon, + virDomainObjPtr vm, + int status, + void *opaque); + typedef struct _qemuMonitorCallbacks qemuMonitorCallbacks; typedef qemuMonitorCallbacks *qemuMonitorCallbacksPtr; struct _qemuMonitorCallbacks { @@ -214,6 +219,7 @@ struct _qemuMonitorCallbacks { qemuMonitorDomainNicRxFilterChangedCallback domainNicRxFilterChanged; qemuMonitorDomainSerialChangeCallback domainSerialChange; qemuMonitorDomainSpiceMigratedCallback domainSpiceMigrated; + qemuMonitorDomainMigrationStatusCallback domainMigrationStatus; }; char *qemuMonitorEscapeArg(const char *in); @@ -313,6 +319,8 @@ int qemuMonitorEmitSerialChange(qemuMonitorPtr mon, const 
char *devAlias, bool connected); int qemuMonitorEmitSpiceMigrated(qemuMonitorPtr mon); +int qemuMonitorEmitMigrationStatus(qemuMonitorPtr mon, + int status); int qemuMonitorStartCPUs(qemuMonitorPtr mon, virConnectPtr conn); diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c index 0ba549e..1f070cf 100644 --- a/src/qemu/qemu_monitor_json.c +++ b/src/qemu/qemu_monitor_json.c @@ -84,6 +84,7 @@ static void qemuMonitorJSONHandleDeviceDeleted(qemuMonitorPtr mon, virJSONValueP static void qemuMonitorJSONHandleNicRxFilterChanged(qemuMonitorPtr mon, virJSONValuePtr data); static void qemuMonitorJSONHandleSerialChange(qemuMonitorPtr mon, virJSONValuePtr data); static void qemuMonitorJSONHandleSpiceMigrated(qemuMonitorPtr mon, virJSONValuePtr data); +static void qemuMonitorJSONHandleMigrationStatus(qemuMonitorPtr mon, virJSONValuePtr data); typedef struct { const char *type; @@ -99,6 +100,7 @@ static qemuEventHandler eventHandlers[] = { { "DEVICE_DELETED", qemuMonitorJSONHandleDeviceDeleted, }, { "DEVICE_TRAY_MOVED", qemuMonitorJSONHandleTrayChange, }, { "GUEST_PANICKED", qemuMonitorJSONHandleGuestPanic, }, + { "MIGRATION", qemuMonitorJSONHandleMigrationStatus, }, { "NIC_RX_FILTER_CHANGED", qemuMonitorJSONHandleNicRxFilterChanged, }, { "POWERDOWN", qemuMonitorJSONHandlePowerdown, }, { "RESET", qemuMonitorJSONHandleReset, }, @@ -924,6 +926,27 @@ qemuMonitorJSONHandleSpiceMigrated(qemuMonitorPtr mon, } +static void +qemuMonitorJSONHandleMigrationStatus(qemuMonitorPtr mon, + virJSONValuePtr data) +{ + const char *str; + int status; + + if (!(str = virJSONValueObjectGetString(data, "status"))) { + VIR_WARN("missing status in migration event"); + return; + } + + if ((status = qemuMonitorMigrationStatusTypeFromString(str)) == -1) { + VIR_WARN("unknown status '%s' in migration event", str); + return; + } + + qemuMonitorEmitMigrationStatus(mon, status); +} + + int qemuMonitorJSONHumanCommandWithFd(qemuMonitorPtr mon, const char *cmd_str, -- 2.4.3
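
For context, the MIGRATION event as proposed on qemu-devel carries only the new migration status, something along the lines of {"event": "MIGRATION", "data": {"status": "completed"}} (the exact payload may still change, as the notes above say), which is why the handler only extracts data.status and maps it with qemuMonitorMigrationStatusTypeFromString().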

On Wed, Jun 10, 2015 at 15:42:53 +0200, Jiri Denemark wrote:
> Thanks to Juan's work QEMU finally emits an event whenever migration state changes.
> Signed-off-by: Jiri Denemark <jdenemar@redhat.com> ---
> Notes: The MIGRATION event is not supported by QEMU yet, this patch is based on the current patches available on qemu-devel mailing list. However, there were no objections to the design of the event, which makes it unlikely to change. Anyway this will have to wait until the patches are applied to QEMU.
> ACKed in version 2
> Version 3: - rebased (context conflict in qemu_capabilities.[ch])
> Version 2: - new patch
> src/qemu/qemu_capabilities.c | 3 +++ src/qemu/qemu_capabilities.h | 1 + src/qemu/qemu_monitor.c | 14 ++++++++++++++ src/qemu/qemu_monitor.h | 8 ++++++++ src/qemu/qemu_monitor_json.c | 23 +++++++++++++++++++++++ 5 files changed, 49 insertions(+)
> ...
> diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h index b5a7770..34cd078 100644 --- a/src/qemu/qemu_capabilities.h +++ b/src/qemu/qemu_capabilities.h @@ -229,6 +229,7 @@ typedef enum { QEMU_CAPS_DEA_KEY_WRAP = 187, /* -machine dea_key_wrap */ QEMU_CAPS_DEVICE_PCI_SERIAL = 188, /* -device pci-serial */ QEMU_CAPS_CPU_AARCH64_OFF = 189, /* -cpu ...,aarch64=off */ + QEMU_CAPS_MIGRATION_EVENT = 190, /* MIGRATION event */
The alignment of the equals sign is off here.
> QEMU_CAPS_LAST, /* this must always be the last item */ } virQEMUCapsFlags;
ACK with that fixed. Peter

When QEMU supports migration events the qemuDomainJobInfo structure will no longer be updated with migration statistics. We have to enter a job and explicitly ask QEMU every time virDomainGetJob{Info,Stats} is called. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- Notes: ACKed in version 2 Version 3: - moved before "qemu: Update migration state according to MIGRATION event" Version 2: - new patch src/qemu/qemu_driver.c | 27 +++++++++++++++++++++++---- 1 file changed, 23 insertions(+), 4 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 17c8c85..7c5d685 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -13091,15 +13091,27 @@ qemuConnectBaselineCPU(virConnectPtr conn ATTRIBUTE_UNUSED, static int -qemuDomainGetJobStatsInternal(virQEMUDriverPtr driver ATTRIBUTE_UNUSED, +qemuDomainGetJobStatsInternal(virQEMUDriverPtr driver, virDomainObjPtr vm, bool completed, qemuDomainJobInfoPtr jobInfo) { qemuDomainObjPrivatePtr priv = vm->privateData; qemuDomainJobInfoPtr info; + bool fetch = virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_MIGRATION_EVENT); int ret = -1; + if (completed) + fetch = false; + + /* Do not ask QEMU if migration is not even running yet */ + if (!priv->job.current || !priv->job.current->status.status) + fetch = false; + + if (fetch && + qemuDomainObjBeginJob(driver, vm, QEMU_JOB_QUERY) < 0) + return -1; + if (!completed && !virDomainObjIsActive(vm)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", @@ -13120,12 +13132,19 @@ qemuDomainGetJobStatsInternal(virQEMUDriverPtr driver ATTRIBUTE_UNUSED, *jobInfo = *info; if (jobInfo->type == VIR_DOMAIN_JOB_BOUNDED || - jobInfo->type == VIR_DOMAIN_JOB_UNBOUNDED) - ret = qemuDomainJobInfoUpdateTime(jobInfo); - else + jobInfo->type == VIR_DOMAIN_JOB_UNBOUNDED) { + if (fetch) + ret = qemuMigrationFetchJobStatus(driver, vm, QEMU_ASYNC_JOB_NONE, + jobInfo); + else + ret = qemuDomainJobInfoUpdateTime(jobInfo); + } else { ret = 0; + } cleanup: + if (fetch) + qemuDomainObjEndJob(driver, vm); return ret; } -- 2.4.3

We don't need to call query-migrate every 50ms when we get the current migration state via MIGRATION event. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- Notes: ACKed in vesrion 2 Version 3: - no change Version 2: - new patch src/qemu/qemu_migration.c | 14 ++++++++++++-- src/qemu/qemu_process.c | 31 +++++++++++++++++++++++++++++++ 2 files changed, 43 insertions(+), 2 deletions(-) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index d9f1a59..c3c2cac 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -2511,7 +2511,11 @@ qemuMigrationCheckJobStatus(virQEMUDriverPtr driver, qemuDomainObjPrivatePtr priv = vm->privateData; qemuDomainJobInfoPtr jobInfo = priv->job.current; - if (qemuMigrationUpdateJobStatus(driver, vm, asyncJob) < 0) + bool events = virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_MIGRATION_EVENT); + + if (events) + qemuMigrationUpdateJobType(jobInfo); + else if (qemuMigrationUpdateJobStatus(driver, vm, asyncJob) < 0) return -1; switch (jobInfo->type) { @@ -2530,9 +2534,15 @@ qemuMigrationCheckJobStatus(virQEMUDriverPtr driver, qemuMigrationJobName(vm), _("canceled by client")); return -1; + case VIR_DOMAIN_JOB_COMPLETED: + /* Fetch statistics of a completed migration */ + if (events && + qemuMigrationUpdateJobStatus(driver, vm, asyncJob) < 0) + return -1; + break; + case VIR_DOMAIN_JOB_BOUNDED: case VIR_DOMAIN_JOB_UNBOUNDED: - case VIR_DOMAIN_JOB_COMPLETED: case VIR_DOMAIN_JOB_LAST: break; } diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index ba84182..e703cbd 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -1508,6 +1508,36 @@ qemuProcessHandleSpiceMigrated(qemuMonitorPtr mon ATTRIBUTE_UNUSED, } +static int +qemuProcessHandleMigrationStatus(qemuMonitorPtr mon ATTRIBUTE_UNUSED, + virDomainObjPtr vm, + int status, + void *opaque ATTRIBUTE_UNUSED) +{ + qemuDomainObjPrivatePtr priv; + + virObjectLock(vm); + + VIR_DEBUG("Migration of domain %p %s changed state to %s", + vm, vm->def->name, + qemuMonitorMigrationStatusTypeToString(status)); + + priv = vm->privateData; + if (priv->job.asyncJob != QEMU_ASYNC_JOB_MIGRATION_OUT && + priv->job.asyncJob != QEMU_ASYNC_JOB_MIGRATION_IN) { + VIR_DEBUG("got MIGRATION event without a migration job"); + goto cleanup; + } + + priv->job.current->status.status = status; + virDomainObjSignal(vm); + + cleanup: + virObjectUnlock(vm); + return 0; +} + + static qemuMonitorCallbacks monitorCallbacks = { .eofNotify = qemuProcessHandleMonitorEOF, .errorNotify = qemuProcessHandleMonitorError, @@ -1532,6 +1562,7 @@ static qemuMonitorCallbacks monitorCallbacks = { .domainNicRxFilterChanged = qemuProcessHandleNicRxFilterChanged, .domainSerialChange = qemuProcessHandleSerialChanged, .domainSpiceMigrated = qemuProcessHandleSpiceMigrated, + .domainMigrationStatus = qemuProcessHandleMigrationStatus, }; static int -- 2.4.3

Since we already support the MIGRATION event, we just need to make sure the domain condition is signalled whenever a p2p connection drops or the domain is paused due to IO error and we can avoid waking up every 50 ms to check whether something happened. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- Notes: ACKed in version 2 Version 3: - no change Version 2: - new patch src/qemu/qemu_domain.h | 3 +++ src/qemu/qemu_migration.c | 45 +++++++++++++++++++++++++++++++++++++++------ src/qemu/qemu_process.c | 3 +++ 3 files changed, 45 insertions(+), 6 deletions(-) diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 54e1e7b..86bd604 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -199,6 +199,9 @@ struct _qemuDomainObjPrivate { /* Bitmaps below hold data from the auto NUMA feature */ virBitmapPtr autoNodeset; virBitmapPtr autoCpuset; + + bool signalIOError; /* true if the domain condition should be signalled on + I/O error */ }; # define QEMU_DOMAIN_DISK_PRIVATE(disk) \ diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index c3c2cac..60c75f3 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -2620,20 +2620,28 @@ qemuMigrationWaitForCompletion(virQEMUDriverPtr driver, { qemuDomainObjPrivatePtr priv = vm->privateData; qemuDomainJobInfoPtr jobInfo = priv->job.current; + bool events = virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_MIGRATION_EVENT); int rv; jobInfo->type = VIR_DOMAIN_JOB_UNBOUNDED; while ((rv = qemuMigrationCompleted(driver, vm, asyncJob, dconn, abort_on_error, storage)) != 1) { - /* Poll every 50ms for progress & to allow cancellation */ - struct timespec ts = { .tv_sec = 0, .tv_nsec = 50 * 1000 * 1000ull }; - if (rv < 0) return rv; - virObjectUnlock(vm); - nanosleep(&ts, NULL); - virObjectLock(vm); + if (events) { + if (virDomainObjWait(vm) < 0) { + jobInfo->type = VIR_DOMAIN_JOB_FAILED; + return -2; + } + } else { + /* Poll every 50ms for progress & to allow cancellation */ + struct timespec ts = { .tv_sec = 0, .tv_nsec = 50 * 1000 * 1000ull }; + + virObjectUnlock(vm); + nanosleep(&ts, NULL); + virObjectLock(vm); + } } qemuDomainJobInfoUpdateDowntime(jobInfo); @@ -4050,6 +4058,7 @@ qemuMigrationRun(virQEMUDriverPtr driver, virErrorPtr orig_err = NULL; unsigned int cookieFlags = 0; bool abort_on_error = !!(flags & VIR_MIGRATE_ABORT_ON_ERROR); + bool events = virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_MIGRATION_EVENT); int rc; VIR_DEBUG("driver=%p, vm=%p, cookiein=%s, cookieinlen=%d, " @@ -4079,6 +4088,9 @@ qemuMigrationRun(virQEMUDriverPtr driver, return -1; } + if (events) + priv->signalIOError = abort_on_error; + mig = qemuMigrationEatCookie(driver, vm, cookiein, cookieinlen, cookieFlags | QEMU_MIGRATION_COOKIE_GRAPHICS); if (!mig) @@ -4284,6 +4296,9 @@ qemuMigrationRun(virQEMUDriverPtr driver, qemuMigrationCookieFree(mig); + if (events) + priv->signalIOError = false; + if (orig_err) { virSetError(orig_err); virFreeError(orig_err); @@ -4908,6 +4923,18 @@ doPeer2PeerMigrate3(virQEMUDriverPtr driver, } +static void +qemuMigrationConnectionClosed(virConnectPtr conn, + int reason, + void *opaque) +{ + virDomainObjPtr vm = opaque; + + VIR_DEBUG("conn=%p, reason=%d, vm=%s", conn, reason, vm->def->name); + virDomainObjSignal(vm); +} + + static int virConnectCredType[] = { VIR_CRED_AUTHNAME, VIR_CRED_PASSPHRASE, @@ -4981,6 +5008,11 @@ static int doPeer2PeerMigrate(virQEMUDriverPtr driver, cfg->keepAliveCount) < 0) goto cleanup; + if (virConnectRegisterCloseCallback(dconn, qemuMigrationConnectionClosed, + vm, NULL) < 
0) { + goto cleanup; + } + qemuDomainObjEnterRemote(vm); p2p = VIR_DRV_SUPPORTS_FEATURE(dconn->driver, dconn, VIR_DRV_FEATURE_MIGRATION_P2P); @@ -5045,6 +5077,7 @@ static int doPeer2PeerMigrate(virQEMUDriverPtr driver, cleanup: orig_err = virSaveLastError(); qemuDomainObjEnterRemote(vm); + virConnectUnregisterCloseCallback(dconn, qemuMigrationConnectionClosed); virObjectUnref(dconn); qemuDomainObjExitRemote(vm); if (orig_err) { diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index e703cbd..93c0844 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -952,6 +952,9 @@ qemuProcessHandleIOError(qemuMonitorPtr mon ATTRIBUTE_UNUSED, qemuDomainObjPrivatePtr priv = vm->privateData; VIR_DEBUG("Transitioned guest %s to paused state due to IO error", vm->def->name); + if (priv->signalIOError) + virDomainObjSignal(vm); + virDomainObjSetState(vm, VIR_DOMAIN_PAUSED, VIR_DOMAIN_PAUSED_IOERROR); lifecycleEvent = virDomainEventLifecycleNewFromObj(vm, VIR_DOMAIN_EVENT_SUSPENDED, -- 2.4.3
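The building block used here is the per-domain condition introduced by the first patch of the series: an event handler (or the connection close callback registered above) updates state and signals the condition, while the migration job thread sleeps on it instead of polling every 50 ms. A minimal sketch of that pattern, where exampleHandleEvent, exampleWaitLoop and the done flag are illustrative stand-ins for the real handlers and the qemuMigrationCompleted checks; both functions assume the domain object lock is held:

    /* monitor/event thread: update state, then wake up the waiter */
    static void
    exampleHandleEvent(virDomainObjPtr vm, bool *done)
    {
        *done = true;             /* record whatever the event reported */
        virDomainObjSignal(vm);   /* wake a thread waiting on the domain condition */
    }

    /* migration job thread: sleep until the handler signals progress */
    static int
    exampleWaitLoop(virDomainObjPtr vm, const bool *done)
    {
        while (!*done) {          /* *done is set by the event handler while we sleep */
            /* drops the domain lock while sleeping and reacquires it on
             * wake-up; on failure the error is already reported */
            if (virDomainObjWait(vm) < 0)
                return -1;
        }
        return 0;
    }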

When a connection to the destination host during a p2p migration drops, we know we will have to cancel the migration; it doesn't make sense to waste resources by trying to finish the migration. We already do so after sending "migrate" command to QEMU and we should do it while waiting for drive mirrors to become ready too. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- Notes: ACKed in version 2 Version 3: - rebased on top of modified qemuMigrationDriveMirrorCancelled Version 2: - new patch src/qemu/qemu_migration.c | 29 +++++++++++++++++++++++------ 1 file changed, 23 insertions(+), 6 deletions(-) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 60c75f3..c53d2ea 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -1908,7 +1908,8 @@ static int qemuMigrationCancelDriveMirror(virQEMUDriverPtr driver, virDomainObjPtr vm, bool check, - qemuDomainAsyncJob asyncJob) + qemuDomainAsyncJob asyncJob, + virConnectPtr dconn) { virErrorPtr err = NULL; int ret = -1; @@ -1939,6 +1940,13 @@ qemuMigrationCancelDriveMirror(virQEMUDriverPtr driver, } while ((rv = qemuMigrationDriveMirrorCancelled(driver, vm, check)) != 1) { + if (check && !failed && + dconn && virConnectIsAlive(dconn) <= 0) { + virReportError(VIR_ERR_OPERATION_FAILED, "%s", + _("Lost connection to destination host")); + failed = true; + } + if (rv < 0) { failed = true; if (rv == -2) @@ -1989,7 +1997,8 @@ qemuMigrationDriveMirror(virQEMUDriverPtr driver, qemuMigrationCookiePtr mig, const char *host, unsigned long speed, - unsigned int *migrate_flags) + unsigned int *migrate_flags, + virConnectPtr dconn) { qemuDomainObjPrivatePtr priv = vm->privateData; int ret = -1; @@ -2069,6 +2078,12 @@ qemuMigrationDriveMirror(virQEMUDriverPtr driver, goto cleanup; } + if (dconn && virConnectIsAlive(dconn) <= 0) { + virReportError(VIR_ERR_OPERATION_FAILED, "%s", + _("Lost connection to destination host")); + goto cleanup; + } + if (virDomainObjWait(vm) < 0) goto cleanup; } @@ -3689,7 +3704,7 @@ qemuMigrationConfirmPhase(virQEMUDriverPtr driver, /* cancel any outstanding NBD jobs */ qemuMigrationCancelDriveMirror(driver, vm, false, - QEMU_ASYNC_JOB_MIGRATION_OUT); + QEMU_ASYNC_JOB_MIGRATION_OUT, NULL); virSetError(orig_err); virFreeError(orig_err); @@ -4106,7 +4121,8 @@ qemuMigrationRun(virQEMUDriverPtr driver, if (qemuMigrationDriveMirror(driver, vm, mig, spec->dest.host.name, migrate_speed, - &migrate_flags) < 0) { + &migrate_flags, + dconn) < 0) { goto cleanup; } } else { @@ -4265,7 +4281,8 @@ qemuMigrationRun(virQEMUDriverPtr driver, /* cancel any outstanding NBD jobs */ if (mig && mig->nbd) { if (qemuMigrationCancelDriveMirror(driver, vm, ret == 0, - QEMU_ASYNC_JOB_MIGRATION_OUT) < 0) + QEMU_ASYNC_JOB_MIGRATION_OUT, + dconn) < 0) ret = -1; } @@ -5853,7 +5870,7 @@ qemuMigrationCancel(virQEMUDriverPtr driver, } if (qemuMigrationCancelDriveMirror(driver, vm, false, - QEMU_ASYNC_JOB_NONE) < 0) + QEMU_ASYNC_JOB_NONE, NULL) < 0) goto endsyncjob; ret = 0; -- 2.4.3
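Pulled out of the flattened diff above, the added liveness check has this shape. A sketch only: mirrorsReady is a hypothetical stand-in for the real qemuMigrationDriveMirrorCancelled/ready checks, and dconn may be NULL when there is no destination connection to watch (e.g. for non-p2p migration):

    static int
    exampleWaitForMirrors(virDomainObjPtr vm, virConnectPtr dconn,
                          const bool *mirrorsReady)
    {
        while (!*mirrorsReady) {   /* updated by block job event handlers while we sleep */
            /* give up early once the destination connection is dead instead
             * of waiting for the drive mirrors to converge */
            if (dconn && virConnectIsAlive(dconn) <= 0) {
                virReportError(VIR_ERR_OPERATION_FAILED, "%s",
                               _("Lost connection to destination host"));
                return -1;
            }
            if (virDomainObjWait(vm) < 0)
                return -1;
        }
        return 0;
    }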

When cancelling migration we can see the following conversation on QMP monitor: {"execute":"migrate_cancel","id":"libvirt-33"} {"timestamp": {"seconds": 1432899178, "microseconds": 844907}, "event": "MIGRATION", "data": {"status": "cancelling"}} {"return": {}, "id": "libvirt-33"} {"timestamp": {"seconds": 1432899178, "microseconds": 845625}, "event": "MIGRATION", "data": {"status": "failed"}} {"timestamp": {"seconds": 1432899178, "microseconds": 846432}, "event": "MIGRATION", "data": {"status": "cancelled"}} That is, migration status first changes to "failed" just to change to the correct "cancelled" state in a few moments later. However, this is enough to let libvirt report migration failed for unknown reason instead of having been cancelled by a user. This should really be fixed in QEMU but I'm not sure how easy it is. However, it's pretty easy for us to work around it. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- Notes: Version 3: - mark as "DO NOT APPLY" -- Juan will fix the bug in QEMU in the next version of his migration event series Version 2: - new patch src/qemu/qemu_process.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 93c0844..dd43657 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -1518,6 +1518,7 @@ qemuProcessHandleMigrationStatus(qemuMonitorPtr mon ATTRIBUTE_UNUSED, void *opaque ATTRIBUTE_UNUSED) { qemuDomainObjPrivatePtr priv; + qemuDomainJobInfoPtr jobInfo; virObjectLock(vm); @@ -1532,7 +1533,15 @@ qemuProcessHandleMigrationStatus(qemuMonitorPtr mon ATTRIBUTE_UNUSED, goto cleanup; } - priv->job.current->status.status = status; + jobInfo = priv->job.current; + if (status == QEMU_MONITOR_MIGRATION_STATUS_ERROR && + jobInfo->status.status == QEMU_MONITOR_MIGRATION_STATUS_CANCELLING) { + VIR_DEBUG("State changed from \"cancelling\" to \"failed\"; setting " + "current state to \"cancelled\""); + status = QEMU_MONITOR_MIGRATION_STATUS_CANCELLED; + } + + jobInfo->status.status = status; virDomainObjSignal(vm); cleanup: -- 2.4.3
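Isolated from the flattened hunk above, the workaround amounts to remapping the spurious state before it is stored, and only when a cancel is known to be in flight (enum names as used in the diff):

    /* QEMU may briefly report "failed" while a migration is being
     * cancelled; store "cancelled" instead so the job is not reported
     * as having failed for an unknown reason */
    if (status == QEMU_MONITOR_MIGRATION_STATUS_ERROR &&
        jobInfo->status.status == QEMU_MONITOR_MIGRATION_STATUS_CANCELLING)
        status = QEMU_MONITOR_MIGRATION_STATUS_CANCELLED;

    jobInfo->status.status = status;
    virDomainObjSignal(vm);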

On 06/10/2015 09:42 AM, Jiri Denemark wrote:
QEMU will soon (patches are available on qemu-devel) get support for migration events which will finally allow us to get rid of polling query-migrate every 50ms. However, we first need to be able to wait for all events related to migration (migration status changes, block job events, async abort requests) at once. This series prepares the infrastructure and uses it to switch all polling loops in migration code to pthread_cond_wait.
https://bugzilla.redhat.com/show_bug.cgi?id=1212077
Version 3 (see individual patches for details): - most of the series has been ACKed in v2 - "qemu: Use domain condition for synchronous block jobs" was split in 3 patches for easier review - minor changes requested in v2 review
Version 2 (see individual patches for details): - rewritten using per-domain condition variable - enhanced to fully support the migration events
Jiri Denemark (24): conf: Introduce per-domain condition variable qemu: Introduce qemuBlockJobUpdate qemu: Properly report failed migration qemu: Use domain condition for synchronous block jobs qemu: Cancel storage migration in parallel qemu: Abort migration early if disk mirror failed qemu: Don't mess with disk->mirrorState Pass domain object to private data formatter/parser qemu: Make qemuMigrationCancelDriveMirror usable without async job qemu: Refactor qemuMonitorBlockJobInfo qemu: Cancel disk mirrors after libvirtd restart qemu: Use domain condition for asyncAbort qemu_monitor: Wire up SPICE_MIGRATE_COMPLETED event qemu: Do not poll for spice migration status qemu: Refactor qemuDomainGetJob{Info,Stats} qemu: Refactor qemuMigrationUpdateJobStatus qemu: Don't pass redundant job name around qemu: Refactor qemuMigrationWaitForCompletion qemu_monitor: Wire up MIGRATION event qemuDomainGetJobStatsInternal: Support migration events qemu: Update migration state according to MIGRATION event qemu: Wait for migration events on domain condition qemu: cancel drive mirrors when p2p connection breaks DO NOT APPLY: qemu: Work around weird migration status changes
po/POTFILES.in | 1 - src/conf/domain_conf.c | 51 ++- src/conf/domain_conf.h | 12 +- src/libvirt_private.syms | 6 + src/libxl/libxl_domain.c | 10 +- src/lxc/lxc_domain.c | 12 +- src/qemu/qemu_blockjob.c | 185 +++-------- src/qemu/qemu_blockjob.h | 15 +- src/qemu/qemu_capabilities.c | 3 + src/qemu/qemu_capabilities.h | 1 + src/qemu/qemu_domain.c | 78 +++-- src/qemu/qemu_domain.h | 7 +- src/qemu/qemu_driver.c | 201 +++++++----- src/qemu/qemu_migration.c | 763 +++++++++++++++++++++++++++++-------------- src/qemu/qemu_migration.h | 8 + src/qemu/qemu_monitor.c | 73 ++++- src/qemu/qemu_monitor.h | 33 +- src/qemu/qemu_monitor_json.c | 152 ++++----- src/qemu/qemu_monitor_json.h | 7 +- src/qemu/qemu_process.c | 92 +++++- tests/qemumonitorjsontest.c | 40 --- 21 files changed, 1057 insertions(+), 693 deletions(-)
Just ran this through my Coverity checker - only one issue RESOURCE_LEAK in qemuMigrationRun: 4235 if (qemuMigrationCheckJobStatus(driver, vm, 4236 QEMU_ASYNC_JOB_MIGRATION_OUT) < 0) (4) Event if_end: End of if statement 4237 goto cancel; 4238 (5) Event open_fn: Returning handle opened by "accept". (6) Event var_assign: Assigning: "fd" = handle returned from "accept(spec->dest.unix_socket.sock, __SOCKADDR_ARG({ .__sockaddr__ = NULL}), NULL)". (7) Event cond_false: Condition "(fd = accept(spec->dest.unix_socket.sock, __SOCKADDR_ARG({ .__sockaddr__ = NULL}), NULL)) < 0", taking false branch Also see events: [leaked_handle] 4239 while ((fd = accept(spec->dest.unix_socket.sock, NULL, NULL)) < 0) { 4240 if (errno == EAGAIN || errno == EINTR) 4241 continue; ... 4252 rc = qemuMigrationWaitForCompletion(driver, vm, 4253 QEMU_ASYNC_JOB_MIGRATION_OUT, 4254 dconn, abort_on_error, !!mig->nbd); (13) Event cond_true: Condition "rc == -2", taking true branch 4255 if (rc == -2) (14) Event goto: Jumping to label "cancel" 4256 goto cancel; 4257 else if (rc == -1) ... 4288 (28) Event cond_false: Condition "spec->fwdType != MIGRATION_FWD_DIRECT", taking false branch 4289 if (spec->fwdType != MIGRATION_FWD_DIRECT) { 4290 if (iothread && qemuMigrationStopTunnel(iothread, ret < 0) < 0) 4291 ret = -1; 4292 VIR_FORCE_CLOSE(fd); (29) Event if_end: End of if statement 4293 } 4294 ... 4322 } 4323 (38) Event leaked_handle: Handle variable "fd" going out of scope leaks the handle. Also see events: [open_fn][var_assign] 4324 return ret; 4325 4326 exit_monitor: 4327 ignore_value(qemuDomainObjExitMonitor(driver, vm)); 4328 goto cleanup; 4329 (15) Event label: Reached label "cancel" 4330 cancel: 4331 orig_err = virSaveLastError(); 4332 (16) Event cond_true: Condition "virDomainObjIsActive(vm)", taking true branch ...

On Wed, Jun 10, 2015 at 10:27:11 -0400, John Ferlan wrote:
Just ran this through my Coverity checker - only one issue
RESOURCE_LEAK in qemuMigrationRun:
4235 if (qemuMigrationCheckJobStatus(driver, vm, 4236 QEMU_ASYNC_JOB_MIGRATION_OUT) < 0)
(4) Event if_end: End of if statement
4237 goto cancel; 4238
(5) Event open_fn: Returning handle opened by "accept". (6) Event var_assign: Assigning: "fd" = handle returned from "accept(spec->dest.unix_socket.sock, __SOCKADDR_ARG({ .__sockaddr__ = NULL}), NULL)". (7) Event cond_false: Condition "(fd = accept(spec->dest.unix_socket.sock, __SOCKADDR_ARG({ .__sockaddr__ = NULL}), NULL)) < 0", taking false branch Also see events: [leaked_handle]
4239 while ((fd = accept(spec->dest.unix_socket.sock, NULL, NULL)) < 0) { 4240 if (errno == EAGAIN || errno == EINTR) 4241 continue;
Hmm, what an old and unused (except for some ancient QEMU versions) code path :-) However, this code is only executed if spec->destType == MIGRATION_DEST_UNIX, which only happens in the tunnelled migration path, which also sets spec.fwdType = MIGRATION_FWD_STREAM. ...
(28) Event cond_false: Condition "spec->fwdType != MIGRATION_FWD_DIRECT", taking false branch
4289 if (spec->fwdType != MIGRATION_FWD_DIRECT) { 4290 if (iothread && qemuMigrationStopTunnel(iothread, ret < 0) < 0) 4291 ret = -1; 4292 VIR_FORCE_CLOSE(fd);
Which means "spec->fwdType != MIGRATION_FWD_DIRECT" will be true and fd will be correctly closed. Jirka

On 06/10/2015 11:06 AM, Jiri Denemark wrote:
Hmm, what an old and unused (except for some ancient QEMU versions) code path :-) However, this code is only executed if spec->destType == MIGRATION_DEST_UNIX, which only happens in the tunnelled migration path, which also sets spec.fwdType = MIGRATION_FWD_STREAM.
Placing "sa_assert(spec->fwdType == MIGRATION_FWD_STREAM);" above the while loop makes Coverity happy. John

On Wed, Jun 10, 2015 at 11:16:29 -0400, John Ferlan wrote:
Placing "sa_assert(spec->fwdType == MIGRATION_FWD_STREAM);" above the while loop makes Coverity happy.
Feel free to push the sa_assert; it's completely unrelated to this series and has been there for ages. Jirka
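For readers following the Coverity exchange, the suggested hint would sit just before the accept() loop quoted in the trace. A sketch with the surrounding control flow abbreviated; sa_assert is a hint only static analyzers see, and the MIGRATION_DEST_UNIX guard reflects the condition Jiri describes above:

    if (spec->destType == MIGRATION_DEST_UNIX) {
        /* Only tunnelled migration uses the unix-socket destination, and
         * it always forwards via a stream, so the fd accepted below is
         * closed in the non-direct cleanup path; tell Coverity so. */
        sa_assert(spec->fwdType == MIGRATION_FWD_STREAM);

        while ((fd = accept(spec->dest.unix_socket.sock, NULL, NULL)) < 0) {
            if (errno == EAGAIN || errno == EINTR)
                continue;
            break;  /* the real code reports the error and aborts here */
        }
    }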

On Wed, Jun 10, 2015 at 15:42:34 +0200, Jiri Denemark wrote:
Jiri Denemark (24): conf: Introduce per-domain condition variable qemu: Introduce qemuBlockJobUpdate qemu: Properly report failed migration qemu: Use domain condition for synchronous block jobs qemu: Cancel storage migration in parallel qemu: Abort migration early if disk mirror failed qemu: Don't mess with disk->mirrorState Pass domain object to private data formatter/parser qemu: Make qemuMigrationCancelDriveMirror usable without async job qemu: Refactor qemuMonitorBlockJobInfo qemu: Cancel disk mirrors after libvirtd restart qemu: Use domain condition for asyncAbort qemu_monitor: Wire up SPICE_MIGRATE_COMPLETED event qemu: Do not poll for spice migration status qemu: Refactor qemuDomainGetJob{Info,Stats} qemu: Refactor qemuMigrationUpdateJobStatus qemu: Don't pass redundant job name around qemu: Refactor qemuMigrationWaitForCompletion qemu_monitor: Wire up MIGRATION event qemuDomainGetJobStatsInternal: Support migration events qemu: Update migration state according to MIGRATION event qemu: Wait for migration events on domain condition qemu: cancel drive mirrors when p2p connection breaks
ACK to the above ones once qemu accepts the event stuff. Peter
DO NOT APPLY: qemu: Work around weird migration status changes

On Wed, Jun 10, 2015 at 17:13:12 +0200, Peter Krempa wrote:
On Wed, Jun 10, 2015 at 15:42:34 +0200, Jiri Denemark wrote:
Jiri Denemark (24): conf: Introduce per-domain condition variable qemu: Introduce qemuBlockJobUpdate qemu: Properly report failed migration qemu: Use domain condition for synchronous block jobs qemu: Cancel storage migration in parallel qemu: Abort migration early if disk mirror failed qemu: Don't mess with disk->mirrorState Pass domain object to private data formatter/parser qemu: Make qemuMigrationCancelDriveMirror usable without async job qemu: Refactor qemuMonitorBlockJobInfo qemu: Cancel disk mirrors after libvirtd restart qemu: Use domain condition for asyncAbort qemu_monitor: Wire up SPICE_MIGRATE_COMPLETED event qemu: Do not poll for spice migration status qemu: Refactor qemuDomainGetJob{Info,Stats} qemu: Refactor qemuMigrationUpdateJobStatus qemu: Don't pass redundant job name around qemu: Refactor qemuMigrationWaitForCompletion qemu_monitor: Wire up MIGRATION event qemuDomainGetJobStatsInternal: Support migration events qemu: Update migration state according to MIGRATION event qemu: Wait for migration events on domain condition qemu: cancel drive mirrors when p2p connection breaks
ACK to the above ones once qemu accepts the event stuff.
Thanks, I pushed all patches which did not rely on the new MIGRATION event, in other words, all except the following ones: qemu_monitor: Wire up MIGRATION event qemuDomainGetJobStatsInternal: Support migration events qemu: Update migration state according to MIGRATION event qemu: Wait for migration events on domain condition DO NOT APPLY: qemu: Work around weird migration status changes Jirka