[libvirt] [PATCH v2 00/12] qemu: migration: show disks stats for nbd migration

diff from v1: ============ 1. patch "qemu: clean out unused migrate to unix" is dropped as it is already pushed. 2. a lot of refactoring patches added, namely all except the last patch. 3. fetching mirroring stats is done separately from getting migration status. Generally speaking refactorings patches removes the function to fetch migrations status altogether. Current migration stats will show something like [1] when in the process of mirroring of non shared disks. This gives very little info on the migration progress. Likewise completed stats miss disks mirroring info. This patch provides disks stats in the said phase like in [2] so user can now understand what's going on. However data stats miss memory stats, so data total and remaining will change when memory migration starts. AFAIU disks stats were available before the nbd based migration becomes the default. So this patch returns disks stats back at some level. [1] Job type: Unbounded Time elapsed: 4964 ms [2] Job type: Unbounded Time elapsed: 4964 ms Data processed: 146.000 MiB Data remaining: 854.000 MiB Data total: 1000.000 MiB File processed: 146.000 MiB File remaining: 854.000 MiB File total: 1000.000 MiB Nikolay Shirokovskiy (12): qemu: qemuDomainJobInfoToParams drop unused code qemu: introduce qemu domain job status qemu: introduce QEMU_DOMAIN_JOB_STATUS_POSTCOPY qemu: drop QEMU_MIGRATION_COMPLETED_UPDATE_STATS qemu: drop excessive zero-out in qemuMigrationFetchJobStatus qemu: drop fetch and update status functions qemu: simplify getting completed job stats qemu: drop unused code in qemuDomainGetJobStatsInternal qemu: drop fetch flag in qemuDomainGetJobStatsInternal qemu: split getting stats for migration and others qemu: introduce QEMU_DOMAIN_JOB_STATUS_PREPARE qemu: show disks stats for nbd migration docs/news.html.in | 4 + src/qemu/qemu_domain.c | 32 ++++++-- src/qemu/qemu_domain.h | 13 +++- src/qemu/qemu_driver.c | 100 ++++++++++++++---------- src/qemu/qemu_migration.c | 195 ++++++++++++++++++++++++++-------------------- src/qemu/qemu_migration.h | 8 +- src/qemu/qemu_process.c | 6 +- 7 files changed, 214 insertions(+), 144 deletions(-) -- 1.8.3.1

qemu driver does not have VIR_DOMAIN_JOB_BOUNDED jobs. --- src/qemu/qemu_domain.c | 6 ------ 1 file changed, 6 deletions(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index acfa695..acc27d0 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -432,12 +432,6 @@ qemuDomainJobInfoToParams(qemuDomainJobInfoPtr jobInfo, jobInfo->timeElapsed - jobInfo->timeDelta) < 0) goto error; - if (jobInfo->type == VIR_DOMAIN_JOB_BOUNDED && - virTypedParamsAddULLong(&par, &npar, &maxpar, - VIR_DOMAIN_JOB_TIME_REMAINING, - jobInfo->timeRemaining) < 0) - goto error; - if (stats->downtime_set && virTypedParamsAddULLong(&par, &npar, &maxpar, VIR_DOMAIN_JOB_DOWNTIME, -- 1.8.3.1

On Wed, Dec 28, 2016 at 17:39:10 +0300, Nikolay Shirokovskiy wrote:
qemu driver does not have VIR_DOMAIN_JOB_BOUNDED jobs. --- src/qemu/qemu_domain.c | 6 ------ 1 file changed, 6 deletions(-)
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index acfa695..acc27d0 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -432,12 +432,6 @@ qemuDomainJobInfoToParams(qemuDomainJobInfoPtr jobInfo, jobInfo->timeElapsed - jobInfo->timeDelta) < 0) goto error;
- if (jobInfo->type == VIR_DOMAIN_JOB_BOUNDED && - virTypedParamsAddULLong(&par, &npar, &maxpar, - VIR_DOMAIN_JOB_TIME_REMAINING, - jobInfo->timeRemaining) < 0) - goto error; - if (stats->downtime_set && virTypedParamsAddULLong(&par, &npar, &maxpar, VIR_DOMAIN_JOB_DOWNTIME,
The function is supposed to process all fields in qemuDomainJobInfo and we should keep it so. If we want to drop this code, we should just drop qemuDomainJobInfo.timeRemaining completely. Jirka

On 16.02.2017 16:07, Jiri Denemark wrote:
On Wed, Dec 28, 2016 at 17:39:10 +0300, Nikolay Shirokovskiy wrote:
qemu driver does not have VIR_DOMAIN_JOB_BOUNDED jobs. --- src/qemu/qemu_domain.c | 6 ------ 1 file changed, 6 deletions(-)
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index acfa695..acc27d0 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -432,12 +432,6 @@ qemuDomainJobInfoToParams(qemuDomainJobInfoPtr jobInfo, jobInfo->timeElapsed - jobInfo->timeDelta) < 0) goto error;
- if (jobInfo->type == VIR_DOMAIN_JOB_BOUNDED && - virTypedParamsAddULLong(&par, &npar, &maxpar, - VIR_DOMAIN_JOB_TIME_REMAINING, - jobInfo->timeRemaining) < 0) - goto error; - if (stats->downtime_set && virTypedParamsAddULLong(&par, &npar, &maxpar, VIR_DOMAIN_JOB_DOWNTIME,
The function is supposed to process all fields in qemuDomainJobInfo and we should keep it so. If we want to drop this code, we should just drop qemuDomainJobInfo.timeRemaining completely.
Ok. Looks like we can drop timeRemaining from cookies as well as its parsing result never checked. Nikolay

This patch simply switches code from using VIR_DOMAIN_JOB_* to introduced QEMU_DOMAIN_JOB_STATUS_*. Later this gives us freedom to introduce states for postcopy and mirroring phases. --- src/qemu/qemu_domain.c | 24 ++++++++++++++++++++-- src/qemu/qemu_domain.h | 11 +++++++++- src/qemu/qemu_driver.c | 11 +++++----- src/qemu/qemu_migration.c | 52 +++++++++++++++++++++++------------------------ src/qemu/qemu_process.c | 2 +- 5 files changed, 63 insertions(+), 37 deletions(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index acc27d0..3582151 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -386,11 +386,31 @@ qemuDomainJobInfoUpdateDowntime(qemuDomainJobInfoPtr jobInfo) return 0; } +static virDomainJobType +qemuDomainJobStatusToType(qemuDomainJobStatus status) +{ + switch (status) { + case QEMU_DOMAIN_JOB_STATUS_NONE: + return VIR_DOMAIN_JOB_NONE; + case QEMU_DOMAIN_JOB_STATUS_ACTIVE: + return VIR_DOMAIN_JOB_UNBOUNDED; + case QEMU_DOMAIN_JOB_STATUS_COMPLETED: + return VIR_DOMAIN_JOB_COMPLETED; + case QEMU_DOMAIN_JOB_STATUS_FAILED: + return VIR_DOMAIN_JOB_FAILED; + case QEMU_DOMAIN_JOB_STATUS_CANCELED: + return VIR_DOMAIN_JOB_CANCELLED; + } + + /* should not reach here */ + return VIR_DOMAIN_JOB_NONE; +} + int qemuDomainJobInfoToInfo(qemuDomainJobInfoPtr jobInfo, virDomainJobInfoPtr info) { - info->type = jobInfo->type; + info->type = qemuDomainJobStatusToType(jobInfo->status); info->timeElapsed = jobInfo->timeElapsed; info->timeRemaining = jobInfo->timeRemaining; @@ -546,7 +566,7 @@ qemuDomainJobInfoToParams(qemuDomainJobInfoPtr jobInfo, stats->cpu_throttle_percentage) < 0) goto error; - *type = jobInfo->type; + *type = qemuDomainJobStatusToType(jobInfo->status); *params = par; *nparams = npar; return 0; diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index b2db45e..f257dc1 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -98,10 +98,19 @@ typedef enum { } qemuDomainAsyncJob; VIR_ENUM_DECL(qemuDomainAsyncJob) +typedef enum { + QEMU_DOMAIN_JOB_STATUS_NONE = 0, + QEMU_DOMAIN_JOB_STATUS_ACTIVE, + QEMU_DOMAIN_JOB_STATUS_COMPLETED, + QEMU_DOMAIN_JOB_STATUS_FAILED, + QEMU_DOMAIN_JOB_STATUS_CANCELED, +} qemuDomainJobStatus; + typedef struct _qemuDomainJobInfo qemuDomainJobInfo; typedef qemuDomainJobInfo *qemuDomainJobInfoPtr; struct _qemuDomainJobInfo { - virDomainJobType type; + qemuDomainJobStatus status; + unsigned long long started; /* When the async job started */ unsigned long long stopped; /* When the domain's CPUs were stopped */ unsigned long long sent; /* When the source sent status info to the diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index b359e77..143d401 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -3169,7 +3169,7 @@ qemuDomainSaveInternal(virQEMUDriverPtr driver, virDomainPtr dom, goto endjob; } - priv->job.current->type = VIR_DOMAIN_JOB_UNBOUNDED; + priv->job.current->status = QEMU_DOMAIN_JOB_STATUS_ACTIVE; /* Pause */ if (virDomainObjGetState(vm, NULL) == VIR_DOMAIN_RUNNING) { @@ -13056,14 +13056,13 @@ qemuDomainGetJobStatsInternal(virQEMUDriverPtr driver, info = priv->job.current; if (!info) { - jobInfo->type = VIR_DOMAIN_JOB_NONE; + jobInfo->status = QEMU_DOMAIN_JOB_STATUS_NONE; ret = 0; goto cleanup; } *jobInfo = *info; - if (jobInfo->type == VIR_DOMAIN_JOB_BOUNDED || - jobInfo->type == VIR_DOMAIN_JOB_UNBOUNDED) { + if (jobInfo->status == QEMU_DOMAIN_JOB_STATUS_ACTIVE) { if (fetch) ret = qemuMigrationFetchJobStatus(driver, vm, QEMU_ASYNC_JOB_NONE, jobInfo); @@ -13098,7 +13097,7 @@ qemuDomainGetJobInfo(virDomainPtr dom, if (qemuDomainGetJobStatsInternal(driver, vm, false, &jobInfo) < 0) goto cleanup; - if (jobInfo.type == VIR_DOMAIN_JOB_NONE) { + if (jobInfo.status == QEMU_DOMAIN_JOB_STATUS_NONE) { memset(info, 0, sizeof(*info)); info->type = VIR_DOMAIN_JOB_NONE; ret = 0; @@ -13139,7 +13138,7 @@ qemuDomainGetJobStats(virDomainPtr dom, if (qemuDomainGetJobStatsInternal(driver, vm, completed, &jobInfo) < 0) goto cleanup; - if (jobInfo.type == VIR_DOMAIN_JOB_NONE) { + if (jobInfo.status == QEMU_DOMAIN_JOB_STATUS_NONE) { *type = VIR_DOMAIN_JOB_NONE; *params = NULL; *nparams = 0; diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 0f4a6cf..c5184b2 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -1096,7 +1096,7 @@ qemuMigrationCookieStatisticsXMLParse(xmlXPathContextPtr ctxt) goto cleanup; stats = &jobInfo->stats; - jobInfo->type = VIR_DOMAIN_JOB_COMPLETED; + jobInfo->status = QEMU_DOMAIN_JOB_STATUS_COMPLETED; virXPathULongLong("string(./started[1])", ctxt, &jobInfo->started); virXPathULongLong("string(./stopped[1])", ctxt, &jobInfo->stopped); @@ -2196,7 +2196,7 @@ qemuMigrationDriveMirror(virQEMUDriverPtr driver, goto cleanup; if (priv->job.abortJob) { - priv->job.current->type = VIR_DOMAIN_JOB_CANCELLED; + priv->job.current->status = QEMU_DOMAIN_JOB_STATUS_CANCELED; virReportError(VIR_ERR_OPERATION_ABORTED, _("%s: %s"), qemuDomainAsyncJobTypeToString(priv->job.asyncJob), _("canceled by client")); @@ -2555,19 +2555,19 @@ qemuMigrationUpdateJobType(qemuDomainJobInfoPtr jobInfo) { switch ((qemuMonitorMigrationStatus) jobInfo->stats.status) { case QEMU_MONITOR_MIGRATION_STATUS_COMPLETED: - jobInfo->type = VIR_DOMAIN_JOB_COMPLETED; + jobInfo->status = QEMU_DOMAIN_JOB_STATUS_COMPLETED; break; case QEMU_MONITOR_MIGRATION_STATUS_INACTIVE: - jobInfo->type = VIR_DOMAIN_JOB_NONE; + jobInfo->status = QEMU_DOMAIN_JOB_STATUS_NONE; break; case QEMU_MONITOR_MIGRATION_STATUS_ERROR: - jobInfo->type = VIR_DOMAIN_JOB_FAILED; + jobInfo->status = QEMU_DOMAIN_JOB_STATUS_FAILED; break; case QEMU_MONITOR_MIGRATION_STATUS_CANCELLED: - jobInfo->type = VIR_DOMAIN_JOB_CANCELLED; + jobInfo->status = QEMU_DOMAIN_JOB_STATUS_CANCELED; break; case QEMU_MONITOR_MIGRATION_STATUS_SETUP: @@ -2654,32 +2654,30 @@ qemuMigrationCheckJobStatus(virQEMUDriverPtr driver, else if (qemuMigrationUpdateJobStatus(driver, vm, asyncJob) < 0) return -1; - switch (jobInfo->type) { - case VIR_DOMAIN_JOB_NONE: + switch (jobInfo->status) { + case QEMU_DOMAIN_JOB_STATUS_NONE: virReportError(VIR_ERR_OPERATION_FAILED, _("%s: %s"), qemuMigrationJobName(vm), _("is not active")); return -1; - case VIR_DOMAIN_JOB_FAILED: + case QEMU_DOMAIN_JOB_STATUS_FAILED: virReportError(VIR_ERR_OPERATION_FAILED, _("%s: %s"), qemuMigrationJobName(vm), _("unexpectedly failed")); return -1; - case VIR_DOMAIN_JOB_CANCELLED: + case QEMU_DOMAIN_JOB_STATUS_CANCELED: virReportError(VIR_ERR_OPERATION_ABORTED, _("%s: %s"), qemuMigrationJobName(vm), _("canceled by client")); return -1; - case VIR_DOMAIN_JOB_COMPLETED: + case QEMU_DOMAIN_JOB_STATUS_COMPLETED: /* Fetch statistics of a completed migration */ if (events && updateJobStats && qemuMigrationUpdateJobStatus(driver, vm, asyncJob) < 0) return -1; break; - case VIR_DOMAIN_JOB_BOUNDED: - case VIR_DOMAIN_JOB_UNBOUNDED: - case VIR_DOMAIN_JOB_LAST: + case QEMU_DOMAIN_JOB_STATUS_ACTIVE: break; } return 0; @@ -2737,7 +2735,7 @@ qemuMigrationCompleted(virQEMUDriverPtr driver, * will continue waiting until the migrate state changes to completed. */ if (flags & QEMU_MIGRATION_COMPLETED_POSTCOPY && - jobInfo->type == VIR_DOMAIN_JOB_UNBOUNDED && + jobInfo->status == QEMU_DOMAIN_JOB_STATUS_ACTIVE && jobInfo->stats.status == QEMU_MONITOR_MIGRATION_STATUS_POSTCOPY) { VIR_DEBUG("Migration switched to post-copy"); if (updateStats && @@ -2746,18 +2744,18 @@ qemuMigrationCompleted(virQEMUDriverPtr driver, return 1; } - if (jobInfo->type == VIR_DOMAIN_JOB_COMPLETED) + if (jobInfo->status == QEMU_DOMAIN_JOB_STATUS_COMPLETED) return 1; else return 0; error: - if (jobInfo->type == VIR_DOMAIN_JOB_UNBOUNDED) { + if (jobInfo->status == QEMU_DOMAIN_JOB_STATUS_ACTIVE) { /* The migration was aborted by us rather than QEMU itself. */ - jobInfo->type = VIR_DOMAIN_JOB_FAILED; + jobInfo->status = QEMU_DOMAIN_JOB_STATUS_FAILED; return -2; - } else if (jobInfo->type == VIR_DOMAIN_JOB_COMPLETED) { - jobInfo->type = VIR_DOMAIN_JOB_FAILED; + } else if (jobInfo->status == QEMU_DOMAIN_JOB_STATUS_COMPLETED) { + jobInfo->status = QEMU_DOMAIN_JOB_STATUS_FAILED; return -1; } else { return -1; @@ -2782,7 +2780,7 @@ qemuMigrationWaitForCompletion(virQEMUDriverPtr driver, flags |= QEMU_MIGRATION_COMPLETED_UPDATE_STATS; - jobInfo->type = VIR_DOMAIN_JOB_UNBOUNDED; + jobInfo->status = QEMU_DOMAIN_JOB_STATUS_ACTIVE; while ((rv = qemuMigrationCompleted(driver, vm, asyncJob, dconn, flags)) != 1) { if (rv < 0) @@ -2790,7 +2788,7 @@ qemuMigrationWaitForCompletion(virQEMUDriverPtr driver, if (events) { if (virDomainObjWait(vm) < 0) { - jobInfo->type = VIR_DOMAIN_JOB_FAILED; + jobInfo->status = QEMU_DOMAIN_JOB_STATUS_FAILED; return -2; } } else { @@ -4772,7 +4770,7 @@ qemuMigrationRun(virQEMUDriverPtr driver, * as this is a critical section so we are guaranteed * priv->job.abortJob will not change */ ignore_value(qemuDomainObjExitMonitor(driver, vm)); - priv->job.current->type = VIR_DOMAIN_JOB_CANCELLED; + priv->job.current->status = QEMU_DOMAIN_JOB_STATUS_CANCELED; virReportError(VIR_ERR_OPERATION_ABORTED, _("%s: %s"), qemuDomainAsyncJobTypeToString(priv->job.asyncJob), _("canceled by client")); @@ -4902,8 +4900,8 @@ qemuMigrationRun(virQEMUDriverPtr driver, ignore_value(virTimeMillisNow(&priv->job.completed->sent)); } - if (priv->job.current->type == VIR_DOMAIN_JOB_UNBOUNDED && !inPostCopy) - priv->job.current->type = VIR_DOMAIN_JOB_FAILED; + if (priv->job.current->status == QEMU_DOMAIN_JOB_STATUS_ACTIVE && !inPostCopy) + priv->job.current->status = QEMU_DOMAIN_JOB_STATUS_FAILED; cookieFlags |= QEMU_MIGRATION_COOKIE_NETWORK | QEMU_MIGRATION_COOKIE_STATS; @@ -4946,7 +4944,7 @@ qemuMigrationRun(virQEMUDriverPtr driver, goto cleanup; cancelPostCopy: - priv->job.current->type = VIR_DOMAIN_JOB_FAILED; + priv->job.current->status = QEMU_DOMAIN_JOB_STATUS_FAILED; if (inPostCopy) goto cancel; else @@ -6654,7 +6652,7 @@ qemuMigrationJobStart(virQEMUDriverPtr driver, JOB_MASK(QEMU_JOB_MIGRATION_OP))); } - priv->job.current->type = VIR_DOMAIN_JOB_UNBOUNDED; + priv->job.current->status = QEMU_DOMAIN_JOB_STATUS_ACTIVE; return 0; } diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index afe3cac..3ac364c 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -4057,7 +4057,7 @@ qemuProcessBeginJob(virQEMUDriverPtr driver, return -1; qemuDomainObjSetAsyncJobMask(vm, QEMU_JOB_NONE); - priv->job.current->type = VIR_DOMAIN_JOB_UNBOUNDED; + priv->job.current->status = QEMU_DOMAIN_JOB_STATUS_ACTIVE; return 0; } -- 1.8.3.1

On Wed, Dec 28, 2016 at 17:39:11 +0300, Nikolay Shirokovskiy wrote:
This patch simply switches code from using VIR_DOMAIN_JOB_* to introduced QEMU_DOMAIN_JOB_STATUS_*. Later this gives us freedom to introduce states for postcopy and mirroring phases. --- src/qemu/qemu_domain.c | 24 ++++++++++++++++++++-- src/qemu/qemu_domain.h | 11 +++++++++- src/qemu/qemu_driver.c | 11 +++++----- src/qemu/qemu_migration.c | 52 +++++++++++++++++++++++------------------------ src/qemu/qemu_process.c | 2 +- 5 files changed, 63 insertions(+), 37 deletions(-)
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index acc27d0..3582151 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -386,11 +386,31 @@ qemuDomainJobInfoUpdateDowntime(qemuDomainJobInfoPtr jobInfo) return 0; }
+static virDomainJobType +qemuDomainJobStatusToType(qemuDomainJobStatus status) +{ + switch (status) { + case QEMU_DOMAIN_JOB_STATUS_NONE: + return VIR_DOMAIN_JOB_NONE; + case QEMU_DOMAIN_JOB_STATUS_ACTIVE: + return VIR_DOMAIN_JOB_UNBOUNDED; + case QEMU_DOMAIN_JOB_STATUS_COMPLETED: + return VIR_DOMAIN_JOB_COMPLETED; + case QEMU_DOMAIN_JOB_STATUS_FAILED: + return VIR_DOMAIN_JOB_FAILED; + case QEMU_DOMAIN_JOB_STATUS_CANCELED: + return VIR_DOMAIN_JOB_CANCELLED; + }
Please, put separate cases by an empty line. It's pretty hard to read such a big block of uppercase latters.
+ + /* should not reach here */ + return VIR_DOMAIN_JOB_NONE;
I'm afraid this could be reported as unreachable code by some picky static analyzers, so how about: switch (status) { case QEMU_DOMAIN_JOB_STATUS_NONE: break; ... } return VIR_DOMAIN_JOB_NONE;
+} + ... diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 0f4a6cf..c5184b2 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c ... @@ -2555,19 +2555,19 @@ qemuMigrationUpdateJobType(qemuDomainJobInfoPtr jobInfo)
I think it would make sense to rename this function as qemuMigrationUpdateJobStatus in a follow-up patch. ACK with the cosmetic changes. Jirka

Current code consults job.current->stats.status to check for postcopy state. First it is more correct to check for both job.current->status and job.current->stats.status.code because on some paths on failures we change only the former. Second if qemu supports migration events then stats can change unexpectedly. Let's introduce QEMU_DOMAIN_JOB_STATUS_POSTCOPY state for job.current->status. This patch removes all state checking usage of stats except for qemuDomainGetJobStatsInternal. This place will be handled separately. --- src/qemu/qemu_domain.c | 1 + src/qemu/qemu_domain.h | 1 + src/qemu/qemu_driver.c | 5 +++-- src/qemu/qemu_migration.c | 18 +++++++++++------- src/qemu/qemu_process.c | 4 ++-- 5 files changed, 18 insertions(+), 11 deletions(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 3582151..952a933 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -393,6 +393,7 @@ qemuDomainJobStatusToType(qemuDomainJobStatus status) case QEMU_DOMAIN_JOB_STATUS_NONE: return VIR_DOMAIN_JOB_NONE; case QEMU_DOMAIN_JOB_STATUS_ACTIVE: + case QEMU_DOMAIN_JOB_STATUS_POSTCOPY: return VIR_DOMAIN_JOB_UNBOUNDED; case QEMU_DOMAIN_JOB_STATUS_COMPLETED: return VIR_DOMAIN_JOB_COMPLETED; diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index f257dc1..273145d 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -102,6 +102,7 @@ typedef enum { QEMU_DOMAIN_JOB_STATUS_NONE = 0, QEMU_DOMAIN_JOB_STATUS_ACTIVE, QEMU_DOMAIN_JOB_STATUS_COMPLETED, + QEMU_DOMAIN_JOB_STATUS_POSTCOPY, QEMU_DOMAIN_JOB_STATUS_FAILED, QEMU_DOMAIN_JOB_STATUS_CANCELED, } qemuDomainJobStatus; diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 143d401..d1a64d7 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -13062,7 +13062,8 @@ qemuDomainGetJobStatsInternal(virQEMUDriverPtr driver, } *jobInfo = *info; - if (jobInfo->status == QEMU_DOMAIN_JOB_STATUS_ACTIVE) { + if (jobInfo->status == QEMU_DOMAIN_JOB_STATUS_ACTIVE || + jobInfo->status == QEMU_DOMAIN_JOB_STATUS_POSTCOPY) { if (fetch) ret = qemuMigrationFetchJobStatus(driver, vm, QEMU_ASYNC_JOB_NONE, jobInfo); @@ -13196,7 +13197,7 @@ static int qemuDomainAbortJob(virDomainPtr dom) } if (priv->job.asyncJob == QEMU_ASYNC_JOB_MIGRATION_OUT && - (priv->job.current->stats.status == QEMU_MONITOR_MIGRATION_STATUS_POSTCOPY || + (priv->job.current->status == QEMU_DOMAIN_JOB_STATUS_POSTCOPY || (virDomainObjGetState(vm, &reason) == VIR_DOMAIN_PAUSED && reason == VIR_DOMAIN_PAUSED_POSTCOPY))) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index c5184b2..64e5b91 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -2558,6 +2558,10 @@ qemuMigrationUpdateJobType(qemuDomainJobInfoPtr jobInfo) jobInfo->status = QEMU_DOMAIN_JOB_STATUS_COMPLETED; break; + case QEMU_MONITOR_MIGRATION_STATUS_POSTCOPY: + jobInfo->status = QEMU_DOMAIN_JOB_STATUS_POSTCOPY; + break; + case QEMU_MONITOR_MIGRATION_STATUS_INACTIVE: jobInfo->status = QEMU_DOMAIN_JOB_STATUS_NONE; break; @@ -2572,7 +2576,6 @@ qemuMigrationUpdateJobType(qemuDomainJobInfoPtr jobInfo) case QEMU_MONITOR_MIGRATION_STATUS_SETUP: case QEMU_MONITOR_MIGRATION_STATUS_ACTIVE: - case QEMU_MONITOR_MIGRATION_STATUS_POSTCOPY: case QEMU_MONITOR_MIGRATION_STATUS_CANCELLING: case QEMU_MONITOR_MIGRATION_STATUS_LAST: break; @@ -2678,6 +2681,7 @@ qemuMigrationCheckJobStatus(virQEMUDriverPtr driver, break; case QEMU_DOMAIN_JOB_STATUS_ACTIVE: + case QEMU_DOMAIN_JOB_STATUS_POSTCOPY: break; } return 0; @@ -2735,8 +2739,7 @@ qemuMigrationCompleted(virQEMUDriverPtr driver, * will continue waiting until the migrate state changes to completed. */ if (flags & QEMU_MIGRATION_COMPLETED_POSTCOPY && - jobInfo->status == QEMU_DOMAIN_JOB_STATUS_ACTIVE && - jobInfo->stats.status == QEMU_MONITOR_MIGRATION_STATUS_POSTCOPY) { + jobInfo->status == QEMU_DOMAIN_JOB_STATUS_POSTCOPY) { VIR_DEBUG("Migration switched to post-copy"); if (updateStats && qemuMigrationUpdateJobStatus(driver, vm, asyncJob) < 0) @@ -2750,7 +2753,8 @@ qemuMigrationCompleted(virQEMUDriverPtr driver, return 0; error: - if (jobInfo->status == QEMU_DOMAIN_JOB_STATUS_ACTIVE) { + if (jobInfo->status == QEMU_DOMAIN_JOB_STATUS_ACTIVE || + jobInfo->status == QEMU_DOMAIN_JOB_STATUS_POSTCOPY) { /* The migration was aborted by us rather than QEMU itself. */ jobInfo->status = QEMU_DOMAIN_JOB_STATUS_FAILED; return -2; @@ -4852,7 +4856,7 @@ qemuMigrationRun(virQEMUDriverPtr driver, else if (rc == -1) goto cleanup; - if (priv->job.current->stats.status == QEMU_MONITOR_MIGRATION_STATUS_POSTCOPY) + if (priv->job.current->status == QEMU_DOMAIN_JOB_STATUS_POSTCOPY) inPostCopy = true; /* When migration completed, QEMU will have paused the CPUs for us. @@ -4900,7 +4904,7 @@ qemuMigrationRun(virQEMUDriverPtr driver, ignore_value(virTimeMillisNow(&priv->job.completed->sent)); } - if (priv->job.current->status == QEMU_DOMAIN_JOB_STATUS_ACTIVE && !inPostCopy) + if (priv->job.current->status == QEMU_DOMAIN_JOB_STATUS_ACTIVE) priv->job.current->status = QEMU_DOMAIN_JOB_STATUS_FAILED; cookieFlags |= QEMU_MIGRATION_COOKIE_NETWORK | @@ -6264,7 +6268,7 @@ qemuMigrationFinish(virQEMUDriverPtr driver, goto endjob; } - if (priv->job.current->stats.status == QEMU_MONITOR_MIGRATION_STATUS_POSTCOPY) + if (priv->job.current->status == QEMU_DOMAIN_JOB_STATUS_POSTCOPY) inPostCopy = true; if (!(flags & VIR_MIGRATE_PAUSED)) { diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 3ac364c..fd34b23 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -703,8 +703,8 @@ qemuProcessHandleStop(qemuMonitorPtr mon ATTRIBUTE_UNUSED, } if (priv->job.asyncJob == QEMU_ASYNC_JOB_MIGRATION_OUT) { - if (priv->job.current->stats.status == - QEMU_MONITOR_MIGRATION_STATUS_POSTCOPY) { + if (priv->job.current->status == + QEMU_DOMAIN_JOB_STATUS_POSTCOPY) { reason = VIR_DOMAIN_PAUSED_POSTCOPY; detail = VIR_DOMAIN_EVENT_SUSPENDED_POSTCOPY; } else { -- 1.8.3.1

On Wed, Dec 28, 2016 at 17:39:12 +0300, Nikolay Shirokovskiy wrote:
Current code consults job.current->stats.status to check for postcopy state. First it is more correct to check for both job.current->status and job.current->stats.status.code because on some paths on failures we change only the former. Second if qemu supports migration events then stats can change unexpectedly.
Let's introduce QEMU_DOMAIN_JOB_STATUS_POSTCOPY state for job.current->status.
This patch removes all state checking usage of stats except for qemuDomainGetJobStatsInternal. This place will be handled separately. --- src/qemu/qemu_domain.c | 1 + src/qemu/qemu_domain.h | 1 + src/qemu/qemu_driver.c | 5 +++-- src/qemu/qemu_migration.c | 18 +++++++++++------- src/qemu/qemu_process.c | 4 ++-- 5 files changed, 18 insertions(+), 11 deletions(-)
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 3582151..952a933 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -393,6 +393,7 @@ qemuDomainJobStatusToType(qemuDomainJobStatus status) case QEMU_DOMAIN_JOB_STATUS_NONE: return VIR_DOMAIN_JOB_NONE; case QEMU_DOMAIN_JOB_STATUS_ACTIVE: + case QEMU_DOMAIN_JOB_STATUS_POSTCOPY: return VIR_DOMAIN_JOB_UNBOUNDED; case QEMU_DOMAIN_JOB_STATUS_COMPLETED: return VIR_DOMAIN_JOB_COMPLETED; diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index f257dc1..273145d 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -102,6 +102,7 @@ typedef enum { QEMU_DOMAIN_JOB_STATUS_NONE = 0, QEMU_DOMAIN_JOB_STATUS_ACTIVE, QEMU_DOMAIN_JOB_STATUS_COMPLETED, + QEMU_DOMAIN_JOB_STATUS_POSTCOPY,
Adding this item above QEMU_DOMAIN_JOB_STATUS_COMPLETED would make more sense.
QEMU_DOMAIN_JOB_STATUS_FAILED, QEMU_DOMAIN_JOB_STATUS_CANCELED, } qemuDomainJobStatus;
...
diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index c5184b2..64e5b91 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -2558,6 +2558,10 @@ qemuMigrationUpdateJobType(qemuDomainJobInfoPtr jobInfo) jobInfo->status = QEMU_DOMAIN_JOB_STATUS_COMPLETED; break;
+ case QEMU_MONITOR_MIGRATION_STATUS_POSTCOPY: + jobInfo->status = QEMU_DOMAIN_JOB_STATUS_POSTCOPY; + break; +
This case would also better fit above the COMPLETED one.
case QEMU_MONITOR_MIGRATION_STATUS_INACTIVE: jobInfo->status = QEMU_DOMAIN_JOB_STATUS_NONE; break;
Nice cleanup. ACK once enum items are reordered. Jirka

This way we get stats only in one place. The former code waits for complete/postcopy status basically and don't need to mess with stats. The patch drops raising an error on stats updates failure. This does not make much sense anyway. --- src/qemu/qemu_migration.c | 24 +++++++----------------- 1 file changed, 7 insertions(+), 17 deletions(-) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 64e5b91..05ff2a0 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -2644,8 +2644,7 @@ qemuMigrationUpdateJobStatus(virQEMUDriverPtr driver, static int qemuMigrationCheckJobStatus(virQEMUDriverPtr driver, virDomainObjPtr vm, - qemuDomainAsyncJob asyncJob, - bool updateJobStats) + qemuDomainAsyncJob asyncJob) { qemuDomainObjPrivatePtr priv = vm->privateData; qemuDomainJobInfoPtr jobInfo = priv->job.current; @@ -2674,12 +2673,6 @@ qemuMigrationCheckJobStatus(virQEMUDriverPtr driver, return -1; case QEMU_DOMAIN_JOB_STATUS_COMPLETED: - /* Fetch statistics of a completed migration */ - if (events && updateJobStats && - qemuMigrationUpdateJobStatus(driver, vm, asyncJob) < 0) - return -1; - break; - case QEMU_DOMAIN_JOB_STATUS_ACTIVE: case QEMU_DOMAIN_JOB_STATUS_POSTCOPY: break; @@ -2691,10 +2684,10 @@ qemuMigrationCheckJobStatus(virQEMUDriverPtr driver, enum qemuMigrationCompletedFlags { QEMU_MIGRATION_COMPLETED_ABORT_ON_ERROR = (1 << 0), QEMU_MIGRATION_COMPLETED_CHECK_STORAGE = (1 << 1), - QEMU_MIGRATION_COMPLETED_UPDATE_STATS = (1 << 2), - QEMU_MIGRATION_COMPLETED_POSTCOPY = (1 << 3), + QEMU_MIGRATION_COMPLETED_POSTCOPY = (1 << 2), }; + /** * Returns 1 if migration completed successfully, * 0 if the domain is still being migrated, @@ -2711,9 +2704,8 @@ qemuMigrationCompleted(virQEMUDriverPtr driver, qemuDomainObjPrivatePtr priv = vm->privateData; qemuDomainJobInfoPtr jobInfo = priv->job.current; int pauseReason; - bool updateStats = !!(flags & QEMU_MIGRATION_COMPLETED_UPDATE_STATS); - if (qemuMigrationCheckJobStatus(driver, vm, asyncJob, updateStats) < 0) + if (qemuMigrationCheckJobStatus(driver, vm, asyncJob) < 0) goto error; if (flags & QEMU_MIGRATION_COMPLETED_CHECK_STORAGE && @@ -2741,9 +2733,6 @@ qemuMigrationCompleted(virQEMUDriverPtr driver, if (flags & QEMU_MIGRATION_COMPLETED_POSTCOPY && jobInfo->status == QEMU_DOMAIN_JOB_STATUS_POSTCOPY) { VIR_DEBUG("Migration switched to post-copy"); - if (updateStats && - qemuMigrationUpdateJobStatus(driver, vm, asyncJob) < 0) - goto error; return 1; } @@ -2782,8 +2771,6 @@ qemuMigrationWaitForCompletion(virQEMUDriverPtr driver, bool events = virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_MIGRATION_EVENT); int rv; - flags |= QEMU_MIGRATION_COMPLETED_UPDATE_STATS; - jobInfo->status = QEMU_DOMAIN_JOB_STATUS_ACTIVE; while ((rv = qemuMigrationCompleted(driver, vm, asyncJob, dconn, flags)) != 1) { @@ -2805,6 +2792,9 @@ qemuMigrationWaitForCompletion(virQEMUDriverPtr driver, } } + if (events) + ignore_value(qemuMigrationUpdateJobStatus(driver, vm, asyncJob)); + qemuDomainJobInfoUpdateDowntime(jobInfo); VIR_FREE(priv->job.completed); if (VIR_ALLOC(priv->job.completed) == 0) -- 1.8.3.1

On Wed, Dec 28, 2016 at 17:39:13 +0300, Nikolay Shirokovskiy wrote:
This way we get stats only in one place. The former code waits for complete/postcopy status basically and don't need to mess with stats.
The patch drops raising an error on stats updates failure. This does not make much sense anyway. --- src/qemu/qemu_migration.c | 24 +++++++----------------- 1 file changed, 7 insertions(+), 17 deletions(-)
ACK Jirka

qemuMonitorGetMigrationStats will do it for us anyway. --- src/qemu/qemu_migration.c | 1 - 1 file changed, 1 deletion(-) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 05ff2a0..6ddc4bd 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -2595,7 +2595,6 @@ qemuMigrationFetchJobStatus(virQEMUDriverPtr driver, if (qemuDomainObjEnterMonitorAsync(driver, vm, asyncJob) < 0) return -1; - memset(&jobInfo->stats, 0, sizeof(jobInfo->stats)); rv = qemuMonitorGetMigrationStats(priv->mon, &jobInfo->stats); if (qemuDomainObjExitMonitor(driver, vm) < 0 || rv < 0) -- 1.8.3.1

On Wed, Dec 28, 2016 at 17:39:14 +0300, Nikolay Shirokovskiy wrote:
qemuMonitorGetMigrationStats will do it for us anyway. --- src/qemu/qemu_migration.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 05ff2a0..6ddc4bd 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -2595,7 +2595,6 @@ qemuMigrationFetchJobStatus(virQEMUDriverPtr driver, if (qemuDomainObjEnterMonitorAsync(driver, vm, asyncJob) < 0) return -1;
- memset(&jobInfo->stats, 0, sizeof(jobInfo->stats)); rv = qemuMonitorGetMigrationStats(priv->mon, &jobInfo->stats);
if (qemuDomainObjExitMonitor(driver, vm) < 0 || rv < 0)
ACK Jirka

qemuMigrationFetchJobStatus is rather inconvinient. If we poll stats for status only then we don't need to update elapsed time. Next on some paths we use this function to get stats only we don't what to unexpectedly update job status. So the common part is only enter/exit which is too little to have a distinct function. Let's open code getting stats and update elapsed time and status where appropriate. The patch also drops qemuMigrationUpdateJobStatus. It's value is only in keeping job status on failures. Now we have option not to update status when we want to. --- src/qemu/qemu_driver.c | 27 ++++++++++++----- src/qemu/qemu_migration.c | 74 ++++++++++++++++------------------------------- src/qemu/qemu_migration.h | 5 ---- 3 files changed, 44 insertions(+), 62 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index d1a64d7..d3da18a 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -13062,17 +13062,28 @@ qemuDomainGetJobStatsInternal(virQEMUDriverPtr driver, } *jobInfo = *info; - if (jobInfo->status == QEMU_DOMAIN_JOB_STATUS_ACTIVE || - jobInfo->status == QEMU_DOMAIN_JOB_STATUS_POSTCOPY) { - if (fetch) - ret = qemuMigrationFetchJobStatus(driver, vm, QEMU_ASYNC_JOB_NONE, - jobInfo); - else - ret = qemuDomainJobInfoUpdateTime(jobInfo); - } else { + if (jobInfo->status != QEMU_DOMAIN_JOB_STATUS_ACTIVE && + jobInfo->status != QEMU_DOMAIN_JOB_STATUS_POSTCOPY) { ret = 0; + goto cleanup; } + if (fetch) { + int rv; + + qemuDomainObjEnterMonitor(driver, vm); + + rv = qemuMonitorGetMigrationStats(priv->mon, &jobInfo->stats); + + if (qemuDomainObjExitMonitor(driver, vm) < 0 || rv < 0) + goto cleanup; + } + + if (qemuDomainJobInfoUpdateTime(jobInfo) < 0) + goto cleanup; + + ret = 0; + cleanup: if (fetch) qemuDomainObjEndJob(driver, vm); diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 6ddc4bd..5be79df 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -2583,28 +2583,6 @@ qemuMigrationUpdateJobType(qemuDomainJobInfoPtr jobInfo) } -int -qemuMigrationFetchJobStatus(virQEMUDriverPtr driver, - virDomainObjPtr vm, - qemuDomainAsyncJob asyncJob, - qemuDomainJobInfoPtr jobInfo) -{ - qemuDomainObjPrivatePtr priv = vm->privateData; - int rv; - - if (qemuDomainObjEnterMonitorAsync(driver, vm, asyncJob) < 0) - return -1; - - rv = qemuMonitorGetMigrationStats(priv->mon, &jobInfo->stats); - - if (qemuDomainObjExitMonitor(driver, vm) < 0 || rv < 0) - return -1; - - qemuMigrationUpdateJobType(jobInfo); - return qemuDomainJobInfoUpdateTime(jobInfo); -} - - static const char * qemuMigrationJobName(virDomainObjPtr vm) { @@ -2624,23 +2602,6 @@ qemuMigrationJobName(virDomainObjPtr vm) static int -qemuMigrationUpdateJobStatus(virQEMUDriverPtr driver, - virDomainObjPtr vm, - qemuDomainAsyncJob asyncJob) -{ - qemuDomainObjPrivatePtr priv = vm->privateData; - qemuDomainJobInfoPtr jobInfo = priv->job.current; - qemuDomainJobInfo newInfo = *jobInfo; - - if (qemuMigrationFetchJobStatus(driver, vm, asyncJob, &newInfo) < 0) - return -1; - - *jobInfo = newInfo; - return 0; -} - - -static int qemuMigrationCheckJobStatus(virQEMUDriverPtr driver, virDomainObjPtr vm, qemuDomainAsyncJob asyncJob) @@ -2650,10 +2611,18 @@ qemuMigrationCheckJobStatus(virQEMUDriverPtr driver, bool events = virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_MIGRATION_EVENT); - if (events) - qemuMigrationUpdateJobType(jobInfo); - else if (qemuMigrationUpdateJobStatus(driver, vm, asyncJob) < 0) - return -1; + if (!events) { + int rv; + + if (qemuDomainObjEnterMonitorAsync(driver, vm, asyncJob) < 0) + return -1; + + rv = qemuMonitorGetMigrationStats(priv->mon, &jobInfo->stats); + + if (qemuDomainObjExitMonitor(driver, vm) < 0 || rv < 0) + return -1; + } + qemuMigrationUpdateJobType(jobInfo); switch (jobInfo->status) { case QEMU_DOMAIN_JOB_STATUS_NONE: @@ -2791,9 +2760,13 @@ qemuMigrationWaitForCompletion(virQEMUDriverPtr driver, } } - if (events) - ignore_value(qemuMigrationUpdateJobStatus(driver, vm, asyncJob)); + if (events && qemuDomainObjEnterMonitorAsync(driver, vm, asyncJob) == 0) { + ignore_value(qemuMonitorGetMigrationStats(priv->mon, &jobInfo->stats)); + ignore_value(qemuDomainObjExitMonitor(driver, vm)); + } + + qemuDomainJobInfoUpdateTime(jobInfo); qemuDomainJobInfoUpdateDowntime(jobInfo); VIR_FREE(priv->job.completed); if (VIR_ALLOC(priv->job.completed) == 0) @@ -4229,10 +4202,13 @@ qemuMigrationConfirmPhase(virQEMUDriverPtr driver, */ if (virDomainObjGetState(vm, &reason) == VIR_DOMAIN_PAUSED && reason == VIR_DOMAIN_PAUSED_POSTCOPY && - qemuMigrationFetchJobStatus(driver, vm, - QEMU_ASYNC_JOB_MIGRATION_OUT, - jobInfo) < 0) - VIR_WARN("Could not refresh migration statistics"); + qemuDomainObjEnterMonitorAsync(driver, vm, + QEMU_ASYNC_JOB_MIGRATION_OUT) == 0) { + + ignore_value(qemuMonitorGetMigrationStats(priv->mon, &jobInfo->stats)); + + ignore_value(qemuDomainObjExitMonitor(driver, vm)); + } qemuDomainJobInfoUpdateTime(jobInfo); jobInfo->timeDeltaSet = mig->jobInfo->timeDeltaSet; diff --git a/src/qemu/qemu_migration.h b/src/qemu/qemu_migration.h index 14c6178..9293859 100644 --- a/src/qemu/qemu_migration.h +++ b/src/qemu/qemu_migration.h @@ -245,11 +245,6 @@ int qemuMigrationToFile(virQEMUDriverPtr driver, int qemuMigrationCancel(virQEMUDriverPtr driver, virDomainObjPtr vm); -int qemuMigrationFetchJobStatus(virQEMUDriverPtr driver, - virDomainObjPtr vm, - qemuDomainAsyncJob asyncJob, - qemuDomainJobInfoPtr jobInfo); - int qemuMigrationErrorInit(virQEMUDriverPtr driver); void qemuMigrationErrorSave(virQEMUDriverPtr driver, const char *name, -- 1.8.3.1

On Wed, Dec 28, 2016 at 17:39:15 +0300, Nikolay Shirokovskiy wrote:
qemuMigrationFetchJobStatus is rather inconvinient. If we poll stats for status only then we don't need to update elapsed time. Next on some paths we use this function to get stats only we don't what to unexpectedly update job status. So the common part is only enter/exit which is too little to have a distinct function.
Let's open code getting stats and update elapsed time and status where appropriate.
The patch also drops qemuMigrationUpdateJobStatus. It's value is only in keeping job status on failures. Now we have option not to update status when we want to. --- src/qemu/qemu_driver.c | 27 ++++++++++++----- src/qemu/qemu_migration.c | 74 ++++++++++++++++------------------------------- src/qemu/qemu_migration.h | 5 ---- 3 files changed, 44 insertions(+), 62 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index d1a64d7..d3da18a 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -13062,17 +13062,28 @@ qemuDomainGetJobStatsInternal(virQEMUDriverPtr driver, } *jobInfo = *info;
- if (jobInfo->status == QEMU_DOMAIN_JOB_STATUS_ACTIVE || - jobInfo->status == QEMU_DOMAIN_JOB_STATUS_POSTCOPY) { - if (fetch) - ret = qemuMigrationFetchJobStatus(driver, vm, QEMU_ASYNC_JOB_NONE, - jobInfo); - else - ret = qemuDomainJobInfoUpdateTime(jobInfo); - } else { + if (jobInfo->status != QEMU_DOMAIN_JOB_STATUS_ACTIVE && + jobInfo->status != QEMU_DOMAIN_JOB_STATUS_POSTCOPY) { ret = 0; + goto cleanup; }
+ if (fetch) { + int rv; + + qemuDomainObjEnterMonitor(driver, vm); + + rv = qemuMonitorGetMigrationStats(priv->mon, &jobInfo->stats); + + if (qemuDomainObjExitMonitor(driver, vm) < 0 || rv < 0) + goto cleanup; + } + + if (qemuDomainJobInfoUpdateTime(jobInfo) < 0) + goto cleanup; + + ret = 0; + cleanup: if (fetch) qemuDomainObjEndJob(driver, vm); diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 6ddc4bd..5be79df 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -2583,28 +2583,6 @@ qemuMigrationUpdateJobType(qemuDomainJobInfoPtr jobInfo) }
-int -qemuMigrationFetchJobStatus(virQEMUDriverPtr driver, - virDomainObjPtr vm, - qemuDomainAsyncJob asyncJob, - qemuDomainJobInfoPtr jobInfo) -{ - qemuDomainObjPrivatePtr priv = vm->privateData; - int rv; - - if (qemuDomainObjEnterMonitorAsync(driver, vm, asyncJob) < 0) - return -1; - - rv = qemuMonitorGetMigrationStats(priv->mon, &jobInfo->stats); - - if (qemuDomainObjExitMonitor(driver, vm) < 0 || rv < 0) - return -1; - - qemuMigrationUpdateJobType(jobInfo); - return qemuDomainJobInfoUpdateTime(jobInfo);
Dropping the two lines above could make sense, but there's no reason to drop the function completely and copy its contents to three places.
-} - - static const char * qemuMigrationJobName(virDomainObjPtr vm) { @@ -2624,23 +2602,6 @@ qemuMigrationJobName(virDomainObjPtr vm)
static int -qemuMigrationUpdateJobStatus(virQEMUDriverPtr driver, - virDomainObjPtr vm, - qemuDomainAsyncJob asyncJob) -{ - qemuDomainObjPrivatePtr priv = vm->privateData; - qemuDomainJobInfoPtr jobInfo = priv->job.current; - qemuDomainJobInfo newInfo = *jobInfo; - - if (qemuMigrationFetchJobStatus(driver, vm, asyncJob, &newInfo) < 0) - return -1; - - *jobInfo = newInfo; - return 0; -}
The goal of this function is to avoid touching priv->job.current in case of failure. Since qemuMonitorGetMigrationStats memsets qemuDomainJobInfo, we would lose its original contents, which is definitely not something we want. I'm not against refactoring this code, but we need to keep the current behavior of qemuMigrationUpdateJobStatus in some way. NACK to this version. Jirka

On Thu, Feb 16, 2017 at 16:15:02 +0100, Jiri Denemark wrote:
On Wed, Dec 28, 2016 at 17:39:15 +0300, Nikolay Shirokovskiy wrote:
qemuMigrationFetchJobStatus is rather inconvinient. If we poll stats for status only then we don't need to update elapsed time. Next on some paths we use this function to get stats only we don't what to unexpectedly update job status. So the common part is only enter/exit which is too little to have a distinct function.
Let's open code getting stats and update elapsed time and status where appropriate.
The patch also drops qemuMigrationUpdateJobStatus. It's value is only in keeping job status on failures. Now we have option not to update status when we want to. ... @@ -2624,23 +2602,6 @@ qemuMigrationJobName(virDomainObjPtr vm)
static int -qemuMigrationUpdateJobStatus(virQEMUDriverPtr driver, - virDomainObjPtr vm, - qemuDomainAsyncJob asyncJob) -{ - qemuDomainObjPrivatePtr priv = vm->privateData; - qemuDomainJobInfoPtr jobInfo = priv->job.current; - qemuDomainJobInfo newInfo = *jobInfo; - - if (qemuMigrationFetchJobStatus(driver, vm, asyncJob, &newInfo) < 0) - return -1; - - *jobInfo = newInfo; - return 0; -}
The goal of this function is to avoid touching priv->job.current in case of failure. Since qemuMonitorGetMigrationStats memsets qemuDomainJobInfo, we would lose its original contents, which is definitely not something we want.
And I forgot to mention that the code should not touch priv->job.current between EnterMonitorAsync/ExitMonitor. Jirka

On 17.02.2017 11:23, Jiri Denemark wrote:
On Thu, Feb 16, 2017 at 16:15:02 +0100, Jiri Denemark wrote:
On Wed, Dec 28, 2016 at 17:39:15 +0300, Nikolay Shirokovskiy wrote:
qemuMigrationFetchJobStatus is rather inconvinient. If we poll stats for status only then we don't need to update elapsed time. Next on some paths we use this function to get stats only we don't what to unexpectedly update job status. So the common part is only enter/exit which is too little to have a distinct function.
Let's open code getting stats and update elapsed time and status where appropriate.
The patch also drops qemuMigrationUpdateJobStatus. It's value is only in keeping job status on failures. Now we have option not to update status when we want to. ... @@ -2624,23 +2602,6 @@ qemuMigrationJobName(virDomainObjPtr vm)
static int -qemuMigrationUpdateJobStatus(virQEMUDriverPtr driver, - virDomainObjPtr vm, - qemuDomainAsyncJob asyncJob) -{ - qemuDomainObjPrivatePtr priv = vm->privateData; - qemuDomainJobInfoPtr jobInfo = priv->job.current; - qemuDomainJobInfo newInfo = *jobInfo; - - if (qemuMigrationFetchJobStatus(driver, vm, asyncJob, &newInfo) < 0) - return -1; - - *jobInfo = newInfo; - return 0; -}
The goal of this function is to avoid touching priv->job.current in case of failure. Since qemuMonitorGetMigrationStats memsets qemuDomainJobInfo, we would lose its original contents, which is definitely not something we want.
And I forgot to mention that the code should not touch priv->job.current between EnterMonitorAsync/ExitMonitor.
Ooh, i missed that. Using domain object without domain lock... Nikolay

--- src/qemu/qemu_driver.c | 26 ++++++++++++-------------- 1 file changed, 12 insertions(+), 14 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index d3da18a..913f8f3 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -13019,12 +13019,18 @@ qemuDomainGetJobStatsInternal(virQEMUDriverPtr driver, qemuDomainJobInfoPtr jobInfo) { qemuDomainObjPrivatePtr priv = vm->privateData; - qemuDomainJobInfoPtr info; bool fetch = virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_MIGRATION_EVENT); int ret = -1; - if (completed) - fetch = false; + if (completed) { + if (priv->job.current || !priv->job.completed) { + jobInfo->status = QEMU_DOMAIN_JOB_STATUS_NONE; + return 0; + } + + *jobInfo = *priv->job.completed; + return 0; + } /* Do not ask QEMU if migration is not even running yet */ if (!priv->job.current || !priv->job.current->stats.status) @@ -13041,26 +13047,18 @@ qemuDomainGetJobStatsInternal(virQEMUDriverPtr driver, return -1; } - if (!completed && - !virDomainObjIsActive(vm)) { + if (!virDomainObjIsActive(vm)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("domain is not running")); goto cleanup; } - if (completed && priv->job.current) - info = NULL; - else if (completed) - info = priv->job.completed; - else - info = priv->job.current; - - if (!info) { + if (!priv->job.current) { jobInfo->status = QEMU_DOMAIN_JOB_STATUS_NONE; ret = 0; goto cleanup; } - *jobInfo = *info; + *jobInfo = *priv->job.current; if (jobInfo->status != QEMU_DOMAIN_JOB_STATUS_ACTIVE && jobInfo->status != QEMU_DOMAIN_JOB_STATUS_POSTCOPY) { -- 1.8.3.1

On Wed, Dec 28, 2016 at 17:39:16 +0300, Nikolay Shirokovskiy wrote:
--- src/qemu/qemu_driver.c | 26 ++++++++++++-------------- 1 file changed, 12 insertions(+), 14 deletions(-)
ACK Jirka

Destination migration never fetch migration stats thus due to the check above fetch flag can not be set. --- src/qemu/qemu_driver.c | 6 ------ 1 file changed, 6 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 913f8f3..84db59d 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -13037,12 +13037,6 @@ qemuDomainGetJobStatsInternal(virQEMUDriverPtr driver, fetch = false; if (fetch) { - if (priv->job.asyncJob == QEMU_ASYNC_JOB_MIGRATION_IN) { - virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s", - _("migration statistics are available only on " - "the source host")); - return -1; - } if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_QUERY) < 0) return -1; } -- 1.8.3.1

On Wed, Dec 28, 2016 at 17:39:17 +0300, Nikolay Shirokovskiy wrote:
Destination migration never fetch migration stats thus due
This becomes true only after patch 10/12.
to the check above fetch flag can not be set. --- src/qemu/qemu_driver.c | 6 ------ 1 file changed, 6 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 913f8f3..84db59d 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -13037,12 +13037,6 @@ qemuDomainGetJobStatsInternal(virQEMUDriverPtr driver, fetch = false;
if (fetch) { - if (priv->job.asyncJob == QEMU_ASYNC_JOB_MIGRATION_IN) { - virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s", - _("migration statistics are available only on " - "the source host")); - return -1; - }
With a combined knowledge of this patch and 10/12 I think it's better to report this error on the destination host rather then successfully returning statistics with everything but elapsed time equal to zero. Jirka

On 16.02.2017 18:45, Jiri Denemark wrote:
On Wed, Dec 28, 2016 at 17:39:17 +0300, Nikolay Shirokovskiy wrote:
Destination migration never fetch migration stats thus due
This becomes true only after patch 10/12.
to the check above fetch flag can not be set. --- src/qemu/qemu_driver.c | 6 ------ 1 file changed, 6 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 913f8f3..84db59d 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -13037,12 +13037,6 @@ qemuDomainGetJobStatsInternal(virQEMUDriverPtr driver, fetch = false;
if (fetch) { - if (priv->job.asyncJob == QEMU_ASYNC_JOB_MIGRATION_IN) { - virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s", - _("migration statistics are available only on " - "the source host")); - return -1; - }
With a combined knowledge of this patch and 10/12 I think it's better to report this error on the destination host rather then successfully returning statistics with everything but elapsed time equal to zero.
Looks like this place is contradictionary. On dest side it depends on stats.status whether we fail or return elapsed time. So you suggest always fail? Won't it break anyone? Probably it depends upon how often one will receive fails/time during migration i guess. Nikolay

On Fri, Feb 17, 2017 at 17:06:17 +0300, Nikolay Shirokovskiy wrote:
On 16.02.2017 18:45, Jiri Denemark wrote:
On Wed, Dec 28, 2016 at 17:39:17 +0300, Nikolay Shirokovskiy wrote:
Destination migration never fetch migration stats thus due
This becomes true only after patch 10/12.
to the check above fetch flag can not be set. --- src/qemu/qemu_driver.c | 6 ------ 1 file changed, 6 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 913f8f3..84db59d 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -13037,12 +13037,6 @@ qemuDomainGetJobStatsInternal(virQEMUDriverPtr driver, fetch = false;
if (fetch) { - if (priv->job.asyncJob == QEMU_ASYNC_JOB_MIGRATION_IN) { - virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s", - _("migration statistics are available only on " - "the source host")); - return -1; - }
With a combined knowledge of this patch and 10/12 I think it's better to report this error on the destination host rather then successfully returning statistics with everything but elapsed time equal to zero.
Looks like this place is contradictionary. On dest side it depends on stats.status whether we fail or return elapsed time. So you suggest always fail?
Yes.
Won't it break anyone? Probably it depends upon how often one will receive fails/time during migration i guess.
I think it should be OK, because it would only return elapsed time for a very short time between starting QEMU process on the destination host and actually starting the migration. Well, the time could be longer if storage migration is involved, but there is still a significant part of migration for which getting the statistics on the destination host fails. So anyone doing that should already be able to cope with the failure :-) Jirka

Basically this optimization skips acquiring jobs condition in case no job is running. But as we are going to add mirroring statistics it is simplier to drop this optimization. --- src/qemu/qemu_driver.c | 17 +++++------------ 1 file changed, 5 insertions(+), 12 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 84db59d..21e3f9c 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -13019,7 +13019,6 @@ qemuDomainGetJobStatsInternal(virQEMUDriverPtr driver, qemuDomainJobInfoPtr jobInfo) { qemuDomainObjPrivatePtr priv = vm->privateData; - bool fetch = virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_MIGRATION_EVENT); int ret = -1; if (completed) { @@ -13032,14 +13031,8 @@ qemuDomainGetJobStatsInternal(virQEMUDriverPtr driver, return 0; } - /* Do not ask QEMU if migration is not even running yet */ - if (!priv->job.current || !priv->job.current->stats.status) - fetch = false; - - if (fetch) { - if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_QUERY) < 0) - return -1; - } + if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_QUERY) < 0) + return -1; if (!virDomainObjIsActive(vm)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", @@ -13060,7 +13053,8 @@ qemuDomainGetJobStatsInternal(virQEMUDriverPtr driver, goto cleanup; } - if (fetch) { + if (virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_MIGRATION_EVENT) && + priv->job.current->stats.status) { int rv; qemuDomainObjEnterMonitor(driver, vm); @@ -13077,8 +13071,7 @@ qemuDomainGetJobStatsInternal(virQEMUDriverPtr driver, ret = 0; cleanup: - if (fetch) - qemuDomainObjEndJob(driver, vm); + qemuDomainObjEndJob(driver, vm); return ret; } -- 1.8.3.1

On Wed, Dec 28, 2016 at 17:39:18 +0300, Nikolay Shirokovskiy wrote:
Basically this optimization skips acquiring jobs condition in case no job is running. But as we are going to add mirroring statistics it is simplier to drop this optimization.
This was actually an optimization for old QEMU without support for migration events. I agree we can drop it, but the patch will need to be modified because of changes to the previous patches. Jirka

All domain jobs other than source migration have only one state - active. Only elapsed time is available for such jobs so let's make it more explicit. Also if in future there will be more stats for such jobs we don't want to mess them with migration stats code. --- src/qemu/qemu_driver.c | 58 ++++++++++++++++++++++++++++++++------------------ 1 file changed, 37 insertions(+), 21 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 21e3f9c..acaad8f 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -13013,6 +13013,34 @@ qemuConnectBaselineCPU(virConnectPtr conn ATTRIBUTE_UNUSED, static int +qemuDomainGetMigrationJobStats(virQEMUDriverPtr driver, + virDomainObjPtr vm, + qemuDomainJobInfoPtr jobInfo) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + + if (jobInfo->status != QEMU_DOMAIN_JOB_STATUS_ACTIVE && + jobInfo->status != QEMU_DOMAIN_JOB_STATUS_POSTCOPY) + return 0; + + if (virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_MIGRATION_EVENT) && + priv->job.current->stats.status) { + int rv; + + qemuDomainObjEnterMonitor(driver, vm); + rv = qemuMonitorGetMigrationStats(priv->mon, &jobInfo->stats); + if (qemuDomainObjExitMonitor(driver, vm) < 0 || rv < 0) + return -1; + } + + if (qemuDomainJobInfoUpdateTime(jobInfo) < 0) + return -1; + + return 0; +} + + +static int qemuDomainGetJobStatsInternal(virQEMUDriverPtr driver, virDomainObjPtr vm, bool completed, @@ -13047,29 +13075,17 @@ qemuDomainGetJobStatsInternal(virQEMUDriverPtr driver, } *jobInfo = *priv->job.current; - if (jobInfo->status != QEMU_DOMAIN_JOB_STATUS_ACTIVE && - jobInfo->status != QEMU_DOMAIN_JOB_STATUS_POSTCOPY) { - ret = 0; - goto cleanup; + if (priv->job.asyncJob == QEMU_ASYNC_JOB_MIGRATION_OUT || + priv->job.asyncJob == QEMU_ASYNC_JOB_SAVE) { + ret = qemuDomainGetMigrationJobStats(driver, vm, jobInfo); + } else { + if (jobInfo->status == QEMU_DOMAIN_JOB_STATUS_ACTIVE) { + ret = qemuDomainJobInfoUpdateTime(jobInfo); + } else { + ret = 0; + } } - if (virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_MIGRATION_EVENT) && - priv->job.current->stats.status) { - int rv; - - qemuDomainObjEnterMonitor(driver, vm); - - rv = qemuMonitorGetMigrationStats(priv->mon, &jobInfo->stats); - - if (qemuDomainObjExitMonitor(driver, vm) < 0 || rv < 0) - goto cleanup; - } - - if (qemuDomainJobInfoUpdateTime(jobInfo) < 0) - goto cleanup; - - ret = 0; - cleanup: qemuDomainObjEndJob(driver, vm); return ret; -- 1.8.3.1

On Wed, Dec 28, 2016 at 17:39:19 +0300, Nikolay Shirokovskiy wrote:
All domain jobs other than source migration have only one state - active. Only elapsed time is available for such jobs so let's make it more explicit. Also if in future there will be more stats for such jobs we don't want to mess them with migration stats code. --- src/qemu/qemu_driver.c | 58 ++++++++++++++++++++++++++++++++------------------ 1 file changed, 37 insertions(+), 21 deletions(-)
NACK, see my reply to 8/12. Jirka

On 16.02.2017 18:47, Jiri Denemark wrote:
On Wed, Dec 28, 2016 at 17:39:19 +0300, Nikolay Shirokovskiy wrote:
All domain jobs other than source migration have only one state - active. Only elapsed time is available for such jobs so let's make it more explicit. Also if in future there will be more stats for such jobs we don't want to mess them with migration stats code. --- src/qemu/qemu_driver.c | 58 ++++++++++++++++++++++++++++++++------------------ 1 file changed, 37 insertions(+), 21 deletions(-)
NACK, see my reply to 8/12.
Can you please clarify? It is not that related to 8 patch as I see it. It basically just creates separate function for getting migration stats. Nikolay

On Fri, Feb 17, 2017 at 17:16:54 +0300, Nikolay Shirokovskiy wrote:
On 16.02.2017 18:47, Jiri Denemark wrote:
On Wed, Dec 28, 2016 at 17:39:19 +0300, Nikolay Shirokovskiy wrote:
All domain jobs other than source migration have only one state - active. Only elapsed time is available for such jobs so let's make it more explicit. Also if in future there will be more stats for such jobs we don't want to mess them with migration stats code. --- src/qemu/qemu_driver.c | 58 ++++++++++++++++++++++++++++++++------------------ 1 file changed, 37 insertions(+), 21 deletions(-)
NACK, see my reply to 8/12.
Can you please clarify? It is not that related to 8 patch as I see it. It basically just creates separate function for getting migration stats.
But it also changes when the code to get the stats is run. Previously it was always run for active migration (and doing so would report an error on the destination host) while after this patch the code is specifically run only on the source host. This patch doesn't change anything if 8/12 is unchanged, but the change doesn't make a lot of sense without 8/12. Also depending on the changes done in 6/12, there may be no need separate the code at all. Jirka

This patch removes the last place where we consult priv->job.current->stats.status for migration state, namely in qemuDomainGetMigrationJobStats. --- src/qemu/qemu_domain.c | 1 + src/qemu/qemu_domain.h | 1 + src/qemu/qemu_driver.c | 5 +++-- src/qemu/qemu_migration.c | 7 +++++-- 4 files changed, 10 insertions(+), 4 deletions(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 952a933..74ceab9 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -392,6 +392,7 @@ qemuDomainJobStatusToType(qemuDomainJobStatus status) switch (status) { case QEMU_DOMAIN_JOB_STATUS_NONE: return VIR_DOMAIN_JOB_NONE; + case QEMU_DOMAIN_JOB_STATUS_PREPARE: case QEMU_DOMAIN_JOB_STATUS_ACTIVE: case QEMU_DOMAIN_JOB_STATUS_POSTCOPY: return VIR_DOMAIN_JOB_UNBOUNDED; diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 273145d..d001ae6 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -100,6 +100,7 @@ VIR_ENUM_DECL(qemuDomainAsyncJob) typedef enum { QEMU_DOMAIN_JOB_STATUS_NONE = 0, + QEMU_DOMAIN_JOB_STATUS_PREPARE, QEMU_DOMAIN_JOB_STATUS_ACTIVE, QEMU_DOMAIN_JOB_STATUS_COMPLETED, QEMU_DOMAIN_JOB_STATUS_POSTCOPY, diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index acaad8f..0d2f9de 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -13019,12 +13019,13 @@ qemuDomainGetMigrationJobStats(virQEMUDriverPtr driver, { qemuDomainObjPrivatePtr priv = vm->privateData; - if (jobInfo->status != QEMU_DOMAIN_JOB_STATUS_ACTIVE && + if (jobInfo->status != QEMU_DOMAIN_JOB_STATUS_PREPARE && + jobInfo->status != QEMU_DOMAIN_JOB_STATUS_ACTIVE && jobInfo->status != QEMU_DOMAIN_JOB_STATUS_POSTCOPY) return 0; if (virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_MIGRATION_EVENT) && - priv->job.current->stats.status) { + jobInfo->status != QEMU_DOMAIN_JOB_STATUS_PREPARE) { int rv; qemuDomainObjEnterMonitor(driver, vm); diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 5be79df..b26b8ad 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -2641,6 +2641,7 @@ qemuMigrationCheckJobStatus(virQEMUDriverPtr driver, return -1; case QEMU_DOMAIN_JOB_STATUS_COMPLETED: + case QEMU_DOMAIN_JOB_STATUS_PREPARE: case QEMU_DOMAIN_JOB_STATUS_ACTIVE: case QEMU_DOMAIN_JOB_STATUS_POSTCOPY: break; @@ -4869,7 +4870,8 @@ qemuMigrationRun(virQEMUDriverPtr driver, ignore_value(virTimeMillisNow(&priv->job.completed->sent)); } - if (priv->job.current->status == QEMU_DOMAIN_JOB_STATUS_ACTIVE) + if (priv->job.current->status == QEMU_DOMAIN_JOB_STATUS_PREPARE || + priv->job.current->status == QEMU_DOMAIN_JOB_STATUS_ACTIVE) priv->job.current->status = QEMU_DOMAIN_JOB_STATUS_FAILED; cookieFlags |= QEMU_MIGRATION_COOKIE_NETWORK | @@ -6615,13 +6617,14 @@ qemuMigrationJobStart(virQEMUDriverPtr driver, if (job == QEMU_ASYNC_JOB_MIGRATION_IN) { qemuDomainObjSetAsyncJobMask(vm, QEMU_JOB_NONE); + priv->job.current->status = QEMU_DOMAIN_JOB_STATUS_ACTIVE; } else { qemuDomainObjSetAsyncJobMask(vm, (QEMU_JOB_DEFAULT_MASK | JOB_MASK(QEMU_JOB_SUSPEND) | JOB_MASK(QEMU_JOB_MIGRATION_OP))); + priv->job.current->status = QEMU_DOMAIN_JOB_STATUS_PREPARE; } - priv->job.current->status = QEMU_DOMAIN_JOB_STATUS_ACTIVE; return 0; } -- 1.8.3.1

On Wed, Dec 28, 2016 at 17:39:20 +0300, Nikolay Shirokovskiy wrote:
This patch removes the last place where we consult priv->job.current->stats.status for migration state, namely in qemuDomainGetMigrationJobStats. --- src/qemu/qemu_domain.c | 1 + src/qemu/qemu_domain.h | 1 + src/qemu/qemu_driver.c | 5 +++-- src/qemu/qemu_migration.c | 7 +++++-- 4 files changed, 10 insertions(+), 4 deletions(-)
The patch looks OK. Jirka

There is no disks stats when migrating with VIR_MIGRATE_NON_SHARED_* for qemu that supports nbd. The reason is that disks are copied via disk mirroring and not in the scope of migration job itself. Below a couble of issues of the implementation are described. 'total' field is set from 'end' field of block job info for the sake of simplicity. This is true only when there is no guest disk activity during migration. If there is an activity then 'end' will grow while 'total' is an estimation that should stay constant. I guess this can be fixed by setting 'total' to disk 'capacity'. There is also known possible corner case issue with this implementation. There is a chance that client asking for stats at the process of the mirroring stopping on successfull migration will see only part of mirroring disks and thus will receive inconsisent partial info. --- docs/news.html.in | 4 ++++ src/qemu/qemu_driver.c | 3 ++- src/qemu/qemu_migration.c | 55 +++++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_migration.h | 5 +++++ 4 files changed, 66 insertions(+), 1 deletion(-) diff --git a/docs/news.html.in b/docs/news.html.in index 22611db..8036cf8 100644 --- a/docs/news.html.in +++ b/docs/news.html.in @@ -42,6 +42,10 @@ cpu cycles, stalled backend cpu cycles, and ref cpu cycles by applications running on the platform </li> + <li>qemu: show disks stats for nbd disks migration<br/> + Show disks stats in migrations stats in disks copy phase + of migration with non-shared disks. + </li> </ul> </li> <li><strong>Bug fixes</strong> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 0d2f9de..1881466 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -13037,7 +13037,8 @@ qemuDomainGetMigrationJobStats(virQEMUDriverPtr driver, if (qemuDomainJobInfoUpdateTime(jobInfo) < 0) return -1; - return 0; + return qemuMigrationFetchMirrorStats(driver, vm, QEMU_ASYNC_JOB_NONE, + &jobInfo->stats); } diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index b26b8ad..006c264 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -2582,6 +2582,59 @@ qemuMigrationUpdateJobType(qemuDomainJobInfoPtr jobInfo) } } +int +qemuMigrationFetchMirrorStats(virQEMUDriverPtr driver, + virDomainObjPtr vm, + qemuDomainAsyncJob asyncJob, + qemuMonitorMigrationStatsPtr stats) +{ + size_t i; + qemuDomainObjPrivatePtr priv = vm->privateData; + bool nbd = false; + virHashTablePtr blockinfo = NULL; + + for (i = 0; i < vm->def->ndisks; i++) { + virDomainDiskDefPtr disk = vm->def->disks[i]; + if (QEMU_DOMAIN_DISK_PRIVATE(disk)->migrating) { + nbd = true; + break; + } + } + + if (!nbd) + return 0; + + if (qemuDomainObjEnterMonitorAsync(driver, vm, asyncJob) < 0) + return -1; + + blockinfo = qemuMonitorGetAllBlockJobInfo(priv->mon); + + if (qemuDomainObjExitMonitor(driver, vm) < 0 || !blockinfo) + return -1; + + stats->disk_transferred = 0; + stats->disk_total = 0; + stats->disk_remaining = 0; + + for (i = 0; i < vm->def->ndisks; i++) { + virDomainDiskDefPtr disk = vm->def->disks[i]; + qemuDomainDiskPrivatePtr diskPriv = QEMU_DOMAIN_DISK_PRIVATE(disk); + qemuMonitorBlockJobInfoPtr data; + + if (!diskPriv->migrating) + continue; + + if (!(data = virHashLookup(blockinfo, disk->info.alias))) + continue; + + stats->disk_transferred += data->cur; + stats->disk_total += data->end; + stats->disk_remaining += data->end - data->cur; + } + + virHashFree(blockinfo); + return 0; +} static const char * qemuMigrationJobName(virDomainObjPtr vm) @@ -2768,6 +2821,8 @@ qemuMigrationWaitForCompletion(virQEMUDriverPtr driver, } qemuDomainJobInfoUpdateTime(jobInfo); + qemuMigrationFetchMirrorStats(driver, vm, asyncJob, &jobInfo->stats); + qemuDomainJobInfoUpdateDowntime(jobInfo); VIR_FREE(priv->job.completed); if (VIR_ALLOC(priv->job.completed) == 0) diff --git a/src/qemu/qemu_migration.h b/src/qemu/qemu_migration.h index 9293859..5f60a9b 100644 --- a/src/qemu/qemu_migration.h +++ b/src/qemu/qemu_migration.h @@ -245,6 +245,11 @@ int qemuMigrationToFile(virQEMUDriverPtr driver, int qemuMigrationCancel(virQEMUDriverPtr driver, virDomainObjPtr vm); +int qemuMigrationFetchMirrorStats(virQEMUDriverPtr driver, + virDomainObjPtr vm, + qemuDomainAsyncJob asyncJob, + qemuMonitorMigrationStatsPtr stats); + int qemuMigrationErrorInit(virQEMUDriverPtr driver); void qemuMigrationErrorSave(virQEMUDriverPtr driver, const char *name, -- 1.8.3.1

On Wed, Dec 28, 2016 at 17:39:21 +0300, Nikolay Shirokovskiy wrote:
There is no disks stats when migrating with VIR_MIGRATE_NON_SHARED_* for qemu that supports nbd. The reason is that disks are copied via disk mirroring and not in the scope of migration job itself. Below a couble of issues of the implementation are described.
'total' field is set from 'end' field of block job info for the sake of simplicity. This is true only when there is no guest disk activity during migration. If there is an activity then 'end' will grow
This is exactly how memory migration works too.
while 'total' is an estimation that should stay constant.
Nope. That's the reason why migration is an unbounded job, we never know what the total is and it is designed to change anytime during migration as we realize more data needs to be transferred.
I guess this can be fixed by setting 'total' to disk 'capacity'.
Nothing to fix really.
There is also known possible corner case issue with this implementation. There is a chance that client asking for stats at the process of the mirroring stopping on successfull migration will see only part of mirroring disks and thus will receive inconsisent partial info. --- docs/news.html.in | 4 ++++ src/qemu/qemu_driver.c | 3 ++- src/qemu/qemu_migration.c | 55 +++++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_migration.h | 5 +++++ 4 files changed, 66 insertions(+), 1 deletion(-)
diff --git a/docs/news.html.in b/docs/news.html.in index 22611db..8036cf8 100644 --- a/docs/news.html.in +++ b/docs/news.html.in @@ -42,6 +42,10 @@ cpu cycles, stalled backend cpu cycles, and ref cpu cycles by applications running on the platform </li> + <li>qemu: show disks stats for nbd disks migration<br/> + Show disks stats in migrations stats in disks copy phase + of migration with non-shared disks. + </li> </ul> </li> <li><strong>Bug fixes</strong>
Since it took me so long to review your patches (I apologize for that), we started using news.xml instead of news.html.in. And it should be modified in a separate patch to avoid conflicts during backports. Otherwise the patch looks good and it's definitely cleaner than v1. However, I think you should also update disk migration statistics while stopping the mirror jobs so that the stats of a completed migration contain accurate data. Jirka

On 17.02.2017 13:17, Jiri Denemark wrote:
On Wed, Dec 28, 2016 at 17:39:21 +0300, Nikolay Shirokovskiy wrote:
There is no disks stats when migrating with VIR_MIGRATE_NON_SHARED_* for qemu that supports nbd. The reason is that disks are copied via disk mirroring and not in the scope of migration job itself. Below a couble of issues of the implementation are described.
'total' field is set from 'end' field of block job info for the sake of simplicity. This is true only when there is no guest disk activity during migration. If there is an activity then 'end' will grow
This is exactly how memory migration works too.
while 'total' is an estimation that should stay constant.
Nope. That's the reason why migration is an unbounded job, we never know what the total is and it is designed to change anytime during migration as we realize more data needs to be transferred.
I guess this can be fixed by setting 'total' to disk 'capacity'.
Nothing to fix really.
There is also known possible corner case issue with this implementation. There is a chance that client asking for stats at the process of the mirroring stopping on successfull migration will see only part of mirroring disks and thus will receive inconsisent partial info. --- docs/news.html.in | 4 ++++ src/qemu/qemu_driver.c | 3 ++- src/qemu/qemu_migration.c | 55 +++++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_migration.h | 5 +++++ 4 files changed, 66 insertions(+), 1 deletion(-)
diff --git a/docs/news.html.in b/docs/news.html.in index 22611db..8036cf8 100644 --- a/docs/news.html.in +++ b/docs/news.html.in @@ -42,6 +42,10 @@ cpu cycles, stalled backend cpu cycles, and ref cpu cycles by applications running on the platform </li> + <li>qemu: show disks stats for nbd disks migration<br/> + Show disks stats in migrations stats in disks copy phase + of migration with non-shared disks. + </li> </ul> </li> <li><strong>Bug fixes</strong>
Since it took me so long to review your patches (I apologize for that), we started using news.xml instead of news.html.in. And it should be modified in a separate patch to avoid conflicts during backports.
Otherwise the patch looks good and it's definitely cleaner than v1. However, I think you should also update disk migration statistics while stopping the mirror jobs so that the stats of a completed migration contain accurate data.
Thanks for review! I need to refresh these patches in memory a bit to get some of your comments and ask questions while the patches are fresh in you memory too. Nikolay

On 17.02.2017 13:17, Jiri Denemark wrote:
On Wed, Dec 28, 2016 at 17:39:21 +0300, Nikolay Shirokovskiy wrote:
There is no disks stats when migrating with VIR_MIGRATE_NON_SHARED_* for qemu that supports nbd. The reason is that disks are copied via disk mirroring and not in the scope of migration job itself. Below a couble of issues of the implementation are described.
'total' field is set from 'end' field of block job info for the sake of simplicity. This is true only when there is no guest disk activity during migration. If there is an activity then 'end' will grow
This is exactly how memory migration works too.
while 'total' is an estimation that should stay constant.
Nope. That's the reason why migration is an unbounded job, we never know what the total is and it is designed to change anytime during migration as we realize more data needs to be transferred.
I guess this can be fixed by setting 'total' to disk 'capacity'.
Nothing to fix really.
Then I don't understand next passage from virDomainJobInfo definition. * For VIR_DOMAIN_JOB_UNBOUNDED, dataTotal may be less * than the final sum of dataProcessed + dataRemaining * in the event that the hypervisor has to repeat some * data, such as due to dirtied pages during migration. I thought total is initial estimation which never changed and processed and remaining are updated as we go. IIRC this is the case for memory migration.
There is also known possible corner case issue with this implementation. There is a chance that client asking for stats at the process of the mirroring stopping on successfull migration will see only part of mirroring disks and thus will receive inconsisent partial info. --- docs/news.html.in | 4 ++++ src/qemu/qemu_driver.c | 3 ++- src/qemu/qemu_migration.c | 55 +++++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_migration.h | 5 +++++ 4 files changed, 66 insertions(+), 1 deletion(-)
diff --git a/docs/news.html.in b/docs/news.html.in index 22611db..8036cf8 100644 --- a/docs/news.html.in +++ b/docs/news.html.in @@ -42,6 +42,10 @@ cpu cycles, stalled backend cpu cycles, and ref cpu cycles by applications running on the platform </li> + <li>qemu: show disks stats for nbd disks migration<br/> + Show disks stats in migrations stats in disks copy phase + of migration with non-shared disks. + </li> </ul> </li> <li><strong>Bug fixes</strong>
Since it took me so long to review your patches (I apologize for that), we started using news.xml instead of news.html.in. And it should be modified in a separate patch to avoid conflicts during backports.
Otherwise the patch looks good and it's definitely cleaner than v1. However, I think you should also update disk migration statistics while stopping the mirror jobs so that the stats of a completed migration contain accurate data.
You mean after migration is completed there can be some disks transfers yet like buffered data? Ok. Nikolay

On Fri, Feb 17, 2017 at 17:30:27 +0300, Nikolay Shirokovskiy wrote:
On 17.02.2017 13:17, Jiri Denemark wrote:
On Wed, Dec 28, 2016 at 17:39:21 +0300, Nikolay Shirokovskiy wrote:
There is no disks stats when migrating with VIR_MIGRATE_NON_SHARED_* for qemu that supports nbd. The reason is that disks are copied via disk mirroring and not in the scope of migration job itself. Below a couble of issues of the implementation are described.
'total' field is set from 'end' field of block job info for the sake of simplicity. This is true only when there is no guest disk activity during migration. If there is an activity then 'end' will grow
This is exactly how memory migration works too.
while 'total' is an estimation that should stay constant.
Nope. That's the reason why migration is an unbounded job, we never know what the total is and it is designed to change anytime during migration as we realize more data needs to be transferred.
I guess this can be fixed by setting 'total' to disk 'capacity'.
Nothing to fix really.
Then I don't understand next passage from virDomainJobInfo definition.
* For VIR_DOMAIN_JOB_UNBOUNDED, dataTotal may be less * than the final sum of dataProcessed + dataRemaining * in the event that the hypervisor has to repeat some * data, such as due to dirtied pages during migration.
I thought total is initial estimation which never changed and processed and remaining are updated as we go. IIRC this is the case for memory migration.
Hmm. I believe total has always been updated and when watching a live migration, one would see a progress (dataProcess / dataTotal * 100) which looks like 0% ... 100% ... 70% ... 100% ... 90% ... 100% Either I'm totally wrong or the documentation does not describe reality. There's definitely something to be fixed here :-) I'll try to dig into it.
Otherwise the patch looks good and it's definitely cleaner than v1. However, I think you should also update disk migration statistics while stopping the mirror jobs so that the stats of a completed migration contain accurate data.
You mean after migration is completed there can be some disks transfers yet like buffered data? Ok.
Exactly, when memory migration completes we need to wait for all mirror jobs to finish to make sure all disk blocks changed on the source are transferred to the destination. And we need to count these last moment transfers too. Jirka

ping On 28.12.2016 17:39, Nikolay Shirokovskiy wrote:
diff from v1: ============ 1. patch "qemu: clean out unused migrate to unix" is dropped as it is already pushed. 2. a lot of refactoring patches added, namely all except the last patch. 3. fetching mirroring stats is done separately from getting migration status. Generally speaking refactorings patches removes the function to fetch migrations status altogether.
Current migration stats will show something like [1] when in the process of mirroring of non shared disks. This gives very little info on the migration progress. Likewise completed stats miss disks mirroring info.
This patch provides disks stats in the said phase like in [2] so user can now understand what's going on. However data stats miss memory stats, so data total and remaining will change when memory migration starts.
AFAIU disks stats were available before the nbd based migration becomes the default. So this patch returns disks stats back at some level.
[1] Job type: Unbounded Time elapsed: 4964 ms
[2] Job type: Unbounded Time elapsed: 4964 ms Data processed: 146.000 MiB Data remaining: 854.000 MiB Data total: 1000.000 MiB File processed: 146.000 MiB File remaining: 854.000 MiB File total: 1000.000 MiB
Nikolay Shirokovskiy (12): qemu: qemuDomainJobInfoToParams drop unused code qemu: introduce qemu domain job status qemu: introduce QEMU_DOMAIN_JOB_STATUS_POSTCOPY qemu: drop QEMU_MIGRATION_COMPLETED_UPDATE_STATS qemu: drop excessive zero-out in qemuMigrationFetchJobStatus qemu: drop fetch and update status functions qemu: simplify getting completed job stats qemu: drop unused code in qemuDomainGetJobStatsInternal qemu: drop fetch flag in qemuDomainGetJobStatsInternal qemu: split getting stats for migration and others qemu: introduce QEMU_DOMAIN_JOB_STATUS_PREPARE qemu: show disks stats for nbd migration
docs/news.html.in | 4 + src/qemu/qemu_domain.c | 32 ++++++-- src/qemu/qemu_domain.h | 13 +++- src/qemu/qemu_driver.c | 100 ++++++++++++++---------- src/qemu/qemu_migration.c | 195 ++++++++++++++++++++++++++-------------------- src/qemu/qemu_migration.h | 8 +- src/qemu/qemu_process.c | 6 +- 7 files changed, 214 insertions(+), 144 deletions(-)
participants (2)
-
Jiri Denemark
-
Nikolay Shirokovskiy