On Fri, Sep 05, 2014 at 14:41:21 -0400, John Ferlan wrote:
On 09/01/2014 11:05 AM, Jiri Denemark wrote:
> Job statistics data were tracked in several structures and variables.
> Let's make a new qemuDomainJobInfo structure which can be used as a
> single source of statistics data as a preparation for storing data about
> a completed job.
>
> Signed-off-by: Jiri Denemark <jdenemar(a)redhat.com>
> ---
> src/qemu/qemu_domain.c | 157 ++++++++++++++++++++++++++++++++++++++++++++--
> src/qemu/qemu_domain.h | 24 ++++++-
> src/qemu/qemu_driver.c | 119 ++++-------------------------------
> src/qemu/qemu_migration.c | 72 ++++++++-------------
> 4 files changed, 213 insertions(+), 159 deletions(-)
>
Curious why you decided to use "Ptr" instead of an inline struct for
current & completed?
It's just easier to check whether completed is filled in or not and
current has the same type for consistency.
When I saw an alloc in the middle of a structure and then a few frees
along the way, I began to wonder and attempted to research the various
paths to make sure all accesses knew/checked that the impending deref
would be OK - I marked those I found... I think one or two may need a
double check since I assume you know the overall code paths better.
Basically, current is allocated iff there is an async job running so
any code that knows an async job exists may access current without
having to worry about it too much.
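Roughly this shape (an abridged sketch; apart from the current and
completed pointers the exact fields are only illustrative here):

typedef struct _qemuDomainJobInfo qemuDomainJobInfo;
typedef qemuDomainJobInfo *qemuDomainJobInfoPtr;

struct _qemuDomainJobInfo {
    virDomainJobType type;
    unsigned long long started;         /* when the async job started */
    unsigned long long timeElapsed;     /* computed on demand */
    qemuMonitorMigrationStatus status;  /* raw statistics from QEMU */
};

/* in qemuDomainJobObj */
qemuDomainJobInfoPtr current;    /* non-NULL iff an async job is running */
qemuDomainJobInfoPtr completed;  /* non-NULL once some async job finished */

So "do we have statistics?" is just a NULL check on the pointer, and
both members share the same type rather than needing an embedded
struct plus a separate validity flag.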
I see there's a path from qemuProcessReconnect() - that's
the libvirtd restart path, right? So when the VIR_ALLOC() is done for
current/completed - whose address space is that? Wouldn't that be
address space from the "old" libvirtd?
I'm not quite sure what you are talking about here :-) However,
qemuProcessReconnect calls qemuProcessRecoverJob and that one does not
care about any statistics.
> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
> index e9506e0..2c709e9 100644
> --- a/src/qemu/qemu_domain.c
> +++ b/src/qemu/qemu_domain.c
> @@ -163,11 +163,9 @@ qemuDomainObjResetAsyncJob(qemuDomainObjPrivatePtr priv)
> job->asyncOwner = 0;
> job->phase = 0;
> job->mask = DEFAULT_JOB_MASK;
> - job->start = 0;
> job->dump_memory_only = false;
> job->asyncAbort = false;
> - memset(&job->status, 0, sizeof(job->status));
> - memset(&job->info, 0, sizeof(job->info));
> + VIR_FREE(job->current);
This was one of those free paths where there were multiple callers and I
wasn't 100% sure whether other accesses needed to be checked w/r/t this
occurring before an unchecked access in some other path...
Once an async job goes away, so does the current info about it.
> }
>
> void
...
> @@ -211,6 +210,150 @@ qemuDomainTrackJob(qemuDomainJob job)
> }
>
>
> +int
> +qemuDomainJobInfoUpdateTime(qemuDomainJobInfoPtr jobInfo)
> +{
> + unsigned long long now;
> +
> + if (!jobInfo->started)
> + return 0;
any chance jobInfo == NULL? I think the only path where the caller
doesn't check for current (or completed) is qemuMigrationUpdateJobStatus
Shouldn't be. And this is definitely a good candidate for
ATTRIBUTE_NONNULL. qemuMigrationUpdateJobStatus is always called within
an async job so current is guaranteed to be non-NULL.
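For reference, the rest of the function is basically just the
elapsed-time computation (an abridged sketch, not a verbatim quote of
the patch), which is why a NULL jobInfo would crash on the very first
dereference and ATTRIBUTE_NONNULL would merely document the contract:

int
qemuDomainJobInfoUpdateTime(qemuDomainJobInfoPtr jobInfo)
{
    unsigned long long now;

    if (!jobInfo->started)
        return 0;

    if (virTimeMillisNow(&now) < 0)
        return -1;

    jobInfo->timeElapsed = now - jobInfo->started;
    return 0;
}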
...
> + if (status->ram_duplicate_set) {
> + if (virTypedParamsAddULLong(&par, &npar, &maxpar,
> + VIR_DOMAIN_JOB_MEMORY_CONSTANT,
> + status->ram_duplicate) < 0 ||
> + virTypedParamsAddULLong(&par, &npar, &maxpar,
> + VIR_DOMAIN_JOB_MEMORY_NORMAL,
> + status->ram_normal) < 0 ||
> + virTypedParamsAddULLong(&par, &npar, &maxpar,
> + VIR_DOMAIN_JOB_MEMORY_NORMAL_BYTES,
> + status->ram_normal_bytes) < 0)
I have a note in patch 2 about ram_normal and ram_normal_bytes vis-a-vis
qemu_monitor_json.c not checking status on their fetch and my latest
coverity run complaining about it.
Not sure what you're talking about, but I'll check in patch 2 :-)
...
> @@ -1071,7 +1214,7 @@ qemuDomainObjBeginJobInternal(virQEMUDriverPtr driver,
> unsigned long long then;
> bool nested = job == QEMU_JOB_ASYNC_NESTED;
> virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver);
> - int ret;
> + int ret = -1;
>
> VIR_DEBUG("Starting %s: %s (async=%s vm=%p name=%s)",
> job == QEMU_JOB_ASYNC ? "async job" : "job",
Just a note: after this there's a
priv->jobs_queued++;
which can be decremented at cleanup; however, if virTimeMillisNow()
fails, then there's a return -1 *and* a virObjectUnref(obj) before the
virObjectRef(obj)...
This is unrelated to this patch, but you are right that jobs_queued++
should be done after virTimeMillisNow(). The rest is correct because it's
virObjectUnref(cfg) that is called before return -1 (note obj vs. cfg).
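Something like this ordering should do (just a sketch of the suggested
change, not part of this patch):

    if (virTimeMillisNow(&now) < 0) {
        virObjectUnref(cfg);
        return -1;
    }

    priv->jobs_queued++;
    virObjectRef(obj);
    then = now + QEMU_JOB_WAIT_TIME;

That way an early return -1 can no longer leave jobs_queued inflated.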
> @@ -1127,9 +1270,11 @@ qemuDomainObjBeginJobInternal(virQEMUDriverPtr driver,
> qemuDomainAsyncJobTypeToString(asyncJob),
> obj, obj->def->name);
> qemuDomainObjResetAsyncJob(priv);
> + if (VIR_ALLOC(priv->job.current) < 0)
> + goto cleanup;
If this fails - what are the chances some other path will check/access
current without checking for it being NULL? Perhaps low, and there are
going to be other problems anyway in a low-memory situation...
If this fails, the caller will not get an async job and thus shouldn't
do anything; especially, it shouldn't touch job.current.
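In other words, callers follow the usual pattern (an illustrative
sketch only; the save job is just an example):

    qemuDomainObjPrivatePtr priv = vm->privateData;

    if (qemuDomainObjBeginAsyncJob(driver, vm, QEMU_ASYNC_JOB_SAVE) < 0)
        goto cleanup;               /* job.current is never dereferenced */

    /* the async job is held from here on, so current is non-NULL */
    priv->job.current->type = VIR_DOMAIN_JOB_UNBOUNDED;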
> priv->job.asyncJob = asyncJob;
> priv->job.asyncOwner = virThreadSelfID();
> - priv->job.start = now;
> + priv->job.current->started = now;
> }
>
> if (qemuDomainTrackJob(job))
...
> diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
> index 8736889..119590e 100644
> --- a/src/qemu/qemu_domain.h
> +++ b/src/qemu/qemu_domain.h
...
> @@ -378,4 +388,12 @@ bool qemuDomainDefCheckABIStability(virQEMUDriverPtr driver,
> bool qemuDomainAgentAvailable(qemuDomainObjPrivatePtr priv,
> bool reportError);
>
> +int qemuDomainJobInfoUpdateTime(qemuDomainJobInfoPtr jobInfo);
> +int qemuDomainJobInfoToInfo(qemuDomainJobInfoPtr jobInfo,
> + virDomainJobInfoPtr info);
> +int qemuDomainJobInfoToParams(qemuDomainJobInfoPtr jobInfo,
> + int *type,
> + virTypedParameterPtr *params,
> + int *nparams);
> +
Should any of these new defs have/use ATTRIBUTE_NONNULL() for their
params?
Yeah, all of them I guess.
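Roughly like this in qemu_domain.h, I suppose (a sketch of the
annotations, following the usual ATTRIBUTE_NONNULL style):

int qemuDomainJobInfoUpdateTime(qemuDomainJobInfoPtr jobInfo)
    ATTRIBUTE_NONNULL(1);
int qemuDomainJobInfoToInfo(qemuDomainJobInfoPtr jobInfo,
                            virDomainJobInfoPtr info)
    ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2);
int qemuDomainJobInfoToParams(qemuDomainJobInfoPtr jobInfo,
                              int *type,
                              virTypedParameterPtr *params,
                              int *nparams)
    ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2)
    ATTRIBUTE_NONNULL(3) ATTRIBUTE_NONNULL(4);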
> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> index 239a300..265315d 100644
> --- a/src/qemu/qemu_driver.c
> +++ b/src/qemu/qemu_driver.c
...
> @@ -3110,8 +3108,7 @@ qemuDomainSaveInternal(virQEMUDriverPtr driver, virDomainPtr dom,
> goto endjob;
> }
>
> - memset(&priv->job.info, 0, sizeof(priv->job.info));
> - priv->job.info.type = VIR_DOMAIN_JOB_UNBOUNDED;
> + priv->job.current->type = VIR_DOMAIN_JOB_UNBOUNDED;
Assumes current-> exists, but occurs after BeginAsyncJob() is
successful, so it seems safe.
Exactly.
>
> /* Pause */
> if (virDomainObjGetState(vm, NULL) == VIR_DOMAIN_RUNNING) {
...
> diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
> index 9cfb77e..64adab4 100644
> --- a/src/qemu/qemu_migration.c
> +++ b/src/qemu/qemu_migration.c
...
> @@ -1738,62 +1739,40 @@ qemuMigrationUpdateJobStatus(virQEMUDriverPtr driver,
>
> qemuDomainObjExitMonitor(driver, vm);
>
> - priv->job.status = status;
> -
> - if (ret < 0 || virTimeMillisNow(&priv->job.info.timeElapsed) < 0)
> + if (ret < 0 ||
> + qemuDomainJobInfoUpdateTime(priv->job.current) < 0)
Here (and below) it's assumed job.current is valid.
qemuMigrationUpdateJobStatus is only called with an async job started.
...
> @@ -1803,11 +1782,14 @@ qemuMigrationUpdateJobStatus(virQEMUDriverPtr driver,
> * QEMU reports failed migration.
> */
> static int
> -qemuMigrationWaitForCompletion(virQEMUDriverPtr driver, virDomainObjPtr vm,
> +qemuMigrationWaitForCompletion(virQEMUDriverPtr driver,
> + virDomainObjPtr vm,
> qemuDomainAsyncJob asyncJob,
> - virConnectPtr dconn, bool abort_on_error)
> + virConnectPtr dconn,
> + bool abort_on_error)
> {
> qemuDomainObjPrivatePtr priv = vm->privateData;
> + qemuDomainJobInfoPtr jobInfo = priv->job.current;
Hopefully nothing has cleared or could clear this before this point...
qemuMigrationWaitForCompletion is also only called with an async job set.
...
Jirka