[libvirt] [PATCH v3 00/11] qemu: replace nested job with interruptible async job state

Hello, everyone. This is the successor to [1] "[PATCH v2 RFC 0/4] qemu: replace nested job with interruptible async job state". I tried to expand on the series rationale, to split the patches better and to add comments.

ABSTRACT

Let's use an interruptible async job state instead of nested jobs to control concurrent execution of async and regular jobs.

RATIONALE

I'm not quite happy with the current implementation of concurrent execution of async and regular jobs because it is not straightforward.

What is the current implementation of job concurrency? While an async job is running, any compatible regular job is allowed to run concurrently. Practically this means that as soon as the async job drops the domain lock, another RPC thread can run in, find the domain in the domain list and start a regular job. This is what happens when, for example, migration drops the lock before waiting for the migration completion event or before sending a request to the destination side.

But if we are going to send a command to the qemu monitor and accordingly drop the domain lock, this concurrency is not desired, because the qemu monitor cannot handle simultaneous commands at the moment. (This is probably not the only reason. I guess we don't want to think about what can happen if a regular job runs concurrently with, say, migration at an arbitrary point. The ability to run concurrently during disk mirroring or state transfer, which take up most of the migration time, looks like enough.) Thus nested jobs were introduced: if an async job wants to send a command to the monitor, it starts a special nested job beforehand, effectively blocking concurrent regular jobs.

To me this sounds like a hacky way to express the fact that an async job allows concurrency only in limited places. Let's call these places interruptible. So I suggest introducing an interruptible state for async jobs (a short sketch of the calling pattern follows at the end of this letter). Before we are going to wait for migration completion, for example, we enter this state, allowing regular jobs to run concurrently. Later, when the migration completion condition is met, we wait for any concurrent regular job to finish before proceeding. This final step is required by the very definition of the approach - when the async job is not interruptible, no regular jobs can run concurrently. In practice this protects us from sending a qemu monitor command from the async job while a concurrent regular job has not yet finished its qemu monitor communication.

It sounds like this final wait can have consequences for migration: what if finishing the concurrent job takes too much time? In that case we get increased migration downtime. Well, we get this increase with the current approach too, because right after the migration completion condition is met we fetch migration stats and wait in the enter-monitor function anyway until the concurrent job is finished. This also demonstrates that the current approach isolates the async job and a concurrent regular job only loosely. It is possible that a regular job starts, then the async job resumes and continues its execution until it needs the qemu monitor, and only then does the regular job finish its execution. I guess it will be easier to reason about concurrency if, once a concurrent regular job has started, the async job cannot continue until the regular job is finished. Check patch 4 for a demonstration that strict job isolation is good.

TESTS

To test concurrency I ran a migration of a non-empty domain with a shared disk about 100 times, with concurrent domstats requests for this domain in a tight loop. Looks good.

MISSING PARTS

- Enter/exit interruptible state at the end/beginning of migration phases.
  Until this is done we have a bit of a degradation, because the domain
  won't respond, for example, to queries between migration phases.
- Update THREADS.txt

I'm going to implement the missing parts as soon as the RFC has overall approval.

TODO

1. Drop the driver argument from calls to qemuDomainObjEnterMonitor.
2. Replace qemuDomainObjEnterMonitorAsync with qemuDomainObjEnterMonitor.
3. Make qemuDomainObjExitMonitor void and drop its driver argument.

[1] https://www.redhat.com/archives/libvir-list/2017-May/msg00018.html

Nikolay Shirokovskiy (11):
  qemu: introduce routines for interruptible state
  qemu: check interruptible state instead of taking nested job
  qemu: enter interruptible state where appropriate
  qemu: remove nested job usage from qemuProcessStop
  qemu: remove unused qemuDomainObjBeginNestedJob
  qemu: remove QEMU_JOB_ASYNC_NESTED
  qemu: remove liveness check from qemuDomainObjExitMonitor
  qemu: drop qemuDomainObjEnterMonitorInternal
  qemu: drop qemuDomainObjExitMonitorInternal
  conf: cleanup: drop virDomainObjWait
  qemu: rename qemuDomainNestedJobAllowed

 src/conf/domain_conf.c    |  19 ----
 src/conf/domain_conf.h    |   1 -
 src/libvirt_private.syms  |   1 -
 src/qemu/qemu_domain.c    | 265 ++++++++++++++++++++++++++++++----------------
 src/qemu/qemu_domain.h    |  17 ++-
 src/qemu/qemu_driver.c    |   2 +-
 src/qemu/qemu_migration.c |  18 ++--
 src/qemu/qemu_process.c   |  19 +---
 8 files changed, 195 insertions(+), 147 deletions(-)

--
1.8.3.1
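[Editorial note] To make the intended calling pattern concrete before diving into the patches, here is a minimal sketch. It is not part of the series; it only uses the helpers introduced in patch 1, and exampleAsyncJobStep is a hypothetical name:

static void
exampleAsyncJobStep(virDomainObjPtr vm)
{
    /* open the window: compatible regular jobs may now run */
    qemuDomainObjEnterInterruptible(vm);

    /* block on the domain condition, e.g. waiting for a migration
     * event; the domain lock is dropped inside virCondWait() */
    ignore_value(virCondWait(&vm->cond, &vm->parent.lock));

    /* close the window: this waits until any concurrent regular job
     * has finished, so the monitor is free for our next command */
    qemuDomainObjExitInterruptible(vm);
}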

[libvirt] [PATCH v3 01/11] qemu: introduce routines for interruptible state

An async job that wants to allow concurrent execution of regular jobs should call qemuDomainObjEnterInterruptible, and later qemuDomainObjExitInterruptible when it wants to disable concurrent execution of regular jobs again.

These functions can also be called from non concurrent regular jobs, in which case they are noops. (A non concurrent regular job is a regular job started while no async job is running.) We need to handle this case because some high level routines can be called from both contexts - an async job and a non concurrent regular job. One example is qemuConnectAgent, which can be called during domain startup (async job) or domain reconnect (non concurrent regular job).

The enter/exit functions detect some error conditions but do not propagate them to the callers, because these conditions depend solely on qemu driver consistency and not on outer conditions. (By driver consistency I mean that the domain mutex and domain condition must be initialized, and that enter/exit must not be called without any job or from a concurrent regular job.) In other words, propagating an error cannot help handle any external situation that is not under our control. If by mistake the functions are used improperly, dropping an error message into the log should be enough, as such self-inconsistency should be fixed eventually anyway.

qemuDomainObjWait and qemuDomainObjSleep are typical use cases of the interruptible state - wait on the domain condition / sleep with the interruptible state set. (A usage sketch follows the diff below.)
---
 src/qemu/qemu_domain.c | 144 +++++++++++++++++++++++++++++++++++++++++++++++++
 src/qemu/qemu_domain.h |  12 +++++
 2 files changed, 156 insertions(+)

diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 61d2833..f6a403c 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -333,6 +333,7 @@ qemuDomainObjResetAsyncJob(qemuDomainObjPrivatePtr priv)
     job->spiceMigration = false;
     job->spiceMigrated = false;
     job->postcopyEnabled = false;
+    job->asyncInterruptible = false;
     VIR_FREE(job->current);
 }

@@ -4697,6 +4698,149 @@ qemuDomainObjEnterMonitorAsync(virQEMUDriverPtr driver,
     return qemuDomainObjEnterMonitorInternal(driver, obj, asyncJob);
 }

+/*
+ * @obj must be locked before calling. Must be used within the context of an
+ * async job or a non concurrent regular job.
+ *
+ * Enters interruptible state for the async job. While the async job is in
+ * this state another thread can run a concurrent regular job.
+ */
+void
+qemuDomainObjEnterInterruptible(virDomainObjPtr obj)
+{
+    qemuDomainObjPrivatePtr priv = obj->privateData;
+    struct qemuDomainJobObj *job = &priv->job;
+
+    if (job->asyncJob) {
+        /*
+         * Entering interruptible state from a concurrent regular job is a
+         * design flaw. Don't return an error in this case to try to recover,
+         * but drop an error message to help detect this situation.
+         */
+        if (job->active) {
+            VIR_ERROR(_("Attempt to enter interruptible state from a "
+                        "concurrent regular job"));
+            return;
+        }
+    } else {
+        /*
+         * Entering interruptible state without any job at all is a design
+         * flaw. Don't return an error in this case to try to recover, but
+         * drop an error message to help detect this situation.
+         */
+        if (!job->active)
+            VIR_ERROR(_("Attempt to enter interruptible state without any job"));
+
+        /* In case of a non concurrent regular job we don't need to do
+         * anything, and this situation can easily be detected when exiting
+         * the interruptible state, as no job can run concurrently to a non
+         * concurrent regular job. */
+        return;
+    }
+
+    job->asyncInterruptible = true;
+    VIR_DEBUG("Async job enters interruptible state. (obj=%p name=%s, async=%s)",
+              obj, obj->def->name,
+              qemuDomainAsyncJobTypeToString(job->asyncJob));
+    virCondBroadcast(&priv->job.asyncCond);
+}
+
+
+/*
+ * @obj must be locked before calling. Must be used within the context of an
+ * async job or a non concurrent regular job. Must be called if
+ * qemuDomainObjEnterInterruptible was called before.
+ *
+ * Exits the async job interruptible state, so after this function returns
+ * no concurrent regular jobs are allowed to run simultaneously. The function
+ * waits until any concurrent regular job started after the async job entered
+ * interruptible state has finished.
+ */
+void
+qemuDomainObjExitInterruptible(virDomainObjPtr obj)
+{
+    qemuDomainObjPrivatePtr priv = obj->privateData;
+    struct qemuDomainJobObj *job = &priv->job;
+
+    /* Just a noop for the case of a non concurrent regular job. See
+     * also the entering function. */
+    if (!job->asyncJob && job->active)
+        return;
+
+    job->asyncInterruptible = false;
+    VIR_DEBUG("Async job exits interruptible state. "
+              "(obj=%p name=%s, async=%s)",
+              obj, obj->def->name,
+              qemuDomainAsyncJobTypeToString(job->asyncJob));
+
+    /*
+     * Wait until no concurrent regular job is running. There are no real
+     * conditions under which the wait can fail, thus do not return an error
+     * if it does, but log one just for safety.
+     */
+    while (job->active) {
+        if (virCondWait(&priv->job.cond, &obj->parent.lock) < 0) {
+            char buf[1024];
+
+            VIR_ERROR(_("failed to wait for job condition: %s"),
+                      virStrerror(errno, buf, sizeof(buf)));
+            return;
+        }
+    }
+}
+
+
+/*
+ * @obj must be locked before calling. Must be used within the context of an
+ * async job or a non concurrent regular job.
+ *
+ * Wait on the @obj condition. The async job is in interruptible state during
+ * the wait, so concurrent regular jobs are allowed to run.
+ */
+int
+qemuDomainObjWait(virDomainObjPtr obj)
+{
+    qemuDomainObjEnterInterruptible(obj);
+
+    if (virCondWait(&obj->cond, &obj->parent.lock) < 0) {
+        virReportSystemError(errno, "%s",
+                             _("failed to wait for domain condition"));
+        qemuDomainObjExitInterruptible(obj);
+        return -1;
+    }
+
+    qemuDomainObjExitInterruptible(obj);
+
+    if (!virDomainObjIsActive(obj)) {
+        virReportError(VIR_ERR_OPERATION_FAILED, "%s",
+                       _("domain is not running"));
+        return -1;
+    }
+
+    return 0;
+}
+
+
+/*
+ * @obj must be locked before calling. Must be used within the context of an
+ * async job or a non concurrent regular job.
+ *
+ * Sleep with the @obj lock dropped. The async job is in interruptible state
+ * during the sleep, so concurrent regular jobs are allowed to run.
+ */
+void
+qemuDomainObjSleep(virDomainObjPtr obj, unsigned long nsec)
+{
+    struct timespec ts = { .tv_sec = 0, .tv_nsec = nsec };
+
+    qemuDomainObjEnterInterruptible(obj);
+
+    virObjectUnlock(obj);
+    nanosleep(&ts, NULL);
+    virObjectLock(obj);
+
+    qemuDomainObjExitInterruptible(obj);
+}
+
 /*
  * obj must be locked before calling
diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
index e021da5..25a6823 100644
--- a/src/qemu/qemu_domain.h
+++ b/src/qemu/qemu_domain.h
@@ -151,6 +151,8 @@ struct qemuDomainJobObj {
     virCond asyncCond;                  /* Use to coordinate with async jobs */
     qemuDomainAsyncJob asyncJob;        /* Currently active async job */
+    bool asyncInterruptible;            /* Regular jobs compatible with current
+                                           async job are allowed to run */
     unsigned long long asyncOwner;      /* Thread which set current async job */
     const char *asyncOwnerAPI;          /* The API which owns the async job */
     unsigned long long asyncStarted;    /* When the current async job started */
@@ -527,6 +529,16 @@ int qemuDomainObjEnterMonitorAsync(virQEMUDriverPtr driver,
                                    qemuDomainAsyncJob asyncJob)
     ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_RETURN_CHECK;

+void qemuDomainObjEnterInterruptible(virDomainObjPtr obj)
+    ATTRIBUTE_NONNULL(1);
+void qemuDomainObjExitInterruptible(virDomainObjPtr obj)
+    ATTRIBUTE_NONNULL(1);
+
+int qemuDomainObjWait(virDomainObjPtr obj)
+    ATTRIBUTE_NONNULL(1) ATTRIBUTE_RETURN_CHECK;
+
+void qemuDomainObjSleep(virDomainObjPtr obj, unsigned long nsec)
+    ATTRIBUTE_NONNULL(1);

 qemuAgentPtr qemuDomainObjEnterAgent(virDomainObjPtr obj)
     ATTRIBUTE_NONNULL(1);
--
1.8.3.1
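[Editorial note] A typical caller of the new qemuDomainObjWait helper looks like this sketch, modeled on the qemuDomainBlockJobAbort hunk in patch 3 (vm, diskPriv and the endjob label come from the surrounding function there):

    /* wait until the block job signals completion; compatible regular
     * jobs may run while we sleep, and the helper fails if the domain
     * dies in the meantime */
    while (diskPriv->blockjob) {
        if (qemuDomainObjWait(vm) < 0)
            goto endjob;
    }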

[libvirt] [PATCH v3 02/11] qemu: check interruptible state instead of taking nested job

First, don't begin a nested job when entering the monitor in the context of an async job. Second, check for the interruptible state when trying to run a concurrent regular job. The third step - entering/exiting the interruptible state where appropriate - is a distinct patch. (An equivalent restatement of the new admission rule follows the diff.)
---
 src/qemu/qemu_domain.c | 41 +++++++++++++----------------------------
 1 file changed, 13 insertions(+), 28 deletions(-)

diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index f6a403c..21b8c8d 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -4302,7 +4302,8 @@ qemuDomainObjReleaseAsyncJob(virDomainObjPtr obj)
 static bool
 qemuDomainNestedJobAllowed(qemuDomainObjPrivatePtr priv, qemuDomainJob job)
 {
-    return !priv->job.asyncJob || (priv->job.mask & JOB_MASK(job)) != 0;
+    return !priv->job.asyncJob ||
+           (priv->job.asyncInterruptible && (priv->job.mask & JOB_MASK(job)) != 0);
 }

 bool
@@ -4326,7 +4327,6 @@ qemuDomainObjBeginJobInternal(virQEMUDriverPtr driver,
     qemuDomainObjPrivatePtr priv = obj->privateData;
     unsigned long long now;
     unsigned long long then;
-    bool nested = job == QEMU_JOB_ASYNC_NESTED;
     bool async = job == QEMU_JOB_ASYNC;
     virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver);
     const char *blocker = NULL;
@@ -4359,7 +4359,7 @@ qemuDomainObjBeginJobInternal(virQEMUDriverPtr driver,
         goto error;
     }

-    while (!nested && !qemuDomainNestedJobAllowed(priv, job)) {
+    while (!qemuDomainNestedJobAllowed(priv, job)) {
         VIR_DEBUG("Waiting for async job (vm=%p name=%s)", obj, obj->def->name);
         if (virCondWaitUntil(&priv->job.asyncCond, &obj->parent.lock, then) < 0)
             goto error;
@@ -4373,7 +4373,7 @@ qemuDomainObjBeginJobInternal(virQEMUDriverPtr driver,

     /* No job is active but a new async job could have been started while obj
      * was unlocked, so we need to recheck it. */
-    if (!nested && !qemuDomainNestedJobAllowed(priv, job))
+    if (!qemuDomainNestedJobAllowed(priv, job))
         goto retry;

     qemuDomainObjResetJob(priv);
@@ -4429,7 +4429,7 @@ qemuDomainObjBeginJobInternal(virQEMUDriverPtr driver,
                  priv->job.asyncOwner, NULLSTR(priv->job.asyncOwnerAPI),
                  duration / 1000, asyncDuration / 1000);

-        if (nested || qemuDomainNestedJobAllowed(priv, job))
+        if (qemuDomainNestedJobAllowed(priv, job))
             blocker = priv->job.ownerAPI;
         else
             blocker = priv->job.asyncOwnerAPI;
@@ -4549,7 +4549,10 @@ qemuDomainObjEndJob(virQEMUDriverPtr driver, virDomainObjPtr obj)
     qemuDomainObjResetJob(priv);
     if (qemuDomainTrackJob(job))
         qemuDomainObjSaveJob(driver, obj);
-    virCondSignal(&priv->job.cond);
+    /* In case an async job runs concurrently and waits for completion
+     * of a regular job we need to pass control to the async job, hence
+     * the need to use broadcast. */
+    virCondBroadcast(&priv->job.cond);
 }

 void
@@ -4586,32 +4589,17 @@ qemuDomainObjAbortAsyncJob(virDomainObjPtr obj)
  *
  * To be called immediately before any QEMU monitor API call
  * Must have already either called qemuDomainObjBeginJob() and checked
- * that the VM is still active; may not be used for nested async jobs.
+ * that the VM is still active.
  *
  * To be followed with qemuDomainObjExitMonitor() once complete
  */
 static int
-qemuDomainObjEnterMonitorInternal(virQEMUDriverPtr driver,
+qemuDomainObjEnterMonitorInternal(virQEMUDriverPtr driver ATTRIBUTE_UNUSED,
                                   virDomainObjPtr obj,
-                                  qemuDomainAsyncJob asyncJob)
+                                  qemuDomainAsyncJob asyncJob ATTRIBUTE_UNUSED)
 {
     qemuDomainObjPrivatePtr priv = obj->privateData;

-    if (asyncJob != QEMU_ASYNC_JOB_NONE) {
-        int ret;
-        if ((ret = qemuDomainObjBeginNestedJob(driver, obj, asyncJob)) < 0)
-            return ret;
-        if (!virDomainObjIsActive(obj)) {
-            virReportError(VIR_ERR_OPERATION_FAILED, "%s",
-                           _("domain is no longer running"));
-            qemuDomainObjEndJob(driver, obj);
-            return -1;
-        }
-    } else if (priv->job.asyncOwner == virThreadSelfID()) {
-        VIR_WARN("This thread seems to be the async job owner; entering"
-                 " monitor without asking for a nested job is dangerous");
-    }
-
     VIR_DEBUG("Entering monitor (mon=%p vm=%p name=%s)",
               priv->mon, obj, obj->def->name);
     virObjectLock(priv->mon);
@@ -4623,7 +4611,7 @@ qemuDomainObjEnterMonitorInternal(virQEMUDriverPtr driver,
 }

 static void ATTRIBUTE_NONNULL(1)
-qemuDomainObjExitMonitorInternal(virQEMUDriverPtr driver,
+qemuDomainObjExitMonitorInternal(virQEMUDriverPtr driver ATTRIBUTE_UNUSED,
                                  virDomainObjPtr obj)
 {
     qemuDomainObjPrivatePtr priv = obj->privateData;
@@ -4641,9 +4629,6 @@ qemuDomainObjExitMonitorInternal(virQEMUDriverPtr driver,
     priv->monStart = 0;
     if (!hasRefs)
         priv->mon = NULL;
-
-    if (priv->job.active == QEMU_JOB_ASYNC_NESTED)
-        qemuDomainObjEndJob(driver, obj);
 }

 void qemuDomainObjEnterMonitor(virQEMUDriverPtr driver,
--
1.8.3.1
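[Editorial note] Restated as straight-line code, the admission rule the qemuDomainNestedJobAllowed hunk above implements reads as follows (an equivalent sketch with a hypothetical name, not a proposed change):

    static bool
    exampleJobMayStart(qemuDomainObjPrivatePtr priv, qemuDomainJob job)
    {
        /* no async job running: any regular job may start */
        if (!priv->job.asyncJob)
            return true;

        /* async job running: a regular job may start only while the
         * async job is in interruptible state AND the job type is
         * allowed by the async job's mask */
        return priv->job.asyncInterruptible &&
               (priv->job.mask & JOB_MASK(job)) != 0;
    }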

[libvirt] [PATCH v3 03/11] qemu: enter interruptible state where appropriate

These are just the places where an async job drops the vm lock other than for the purpose of sending a command to the qemu process. If I failed to identify one of these places it is not a big problem - the vm will sometimes be unresponsive to queries etc., and someone who cares will report the issue and we will fix it ))

NB: there are some known places where enter/exit is not yet added, as noted in the cover letter. These are the end/beginning of migration phases. (The common pattern applied by all of these hunks is sketched after the diff.)
---
 src/qemu/qemu_domain.c    |  2 ++
 src/qemu/qemu_driver.c    |  2 +-
 src/qemu/qemu_migration.c | 18 +++++++-----------
 src/qemu/qemu_process.c   |  2 ++
 4 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 21b8c8d..f8602b7 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -4872,6 +4872,7 @@ void qemuDomainObjEnterRemote(virDomainObjPtr obj)
 {
     VIR_DEBUG("Entering remote (vm=%p name=%s)",
               obj, obj->def->name);
+    qemuDomainObjEnterInterruptible(obj);
     virObjectUnlock(obj);
 }

@@ -4880,6 +4881,7 @@ void qemuDomainObjExitRemote(virDomainObjPtr obj)
     virObjectLock(obj);
     VIR_DEBUG("Exited remote (vm=%p name=%s)",
               obj, obj->def->name);
+    qemuDomainObjExitInterruptible(obj);
 }

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 6132bc4..019ec33 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -16768,7 +16768,7 @@ qemuDomainBlockJobAbort(virDomainPtr dom,
         qemuDomainDiskPrivatePtr diskPriv = QEMU_DOMAIN_DISK_PRIVATE(disk);
         qemuBlockJobUpdate(driver, vm, QEMU_ASYNC_JOB_NONE, disk);
         while (diskPriv->blockjob) {
-            if (virDomainObjWait(vm) < 0) {
+            if (qemuDomainObjWait(vm) < 0) {
                 ret = -1;
                 goto endjob;
             }
diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
index 69eb231..6082a5f 100644
--- a/src/qemu/qemu_migration.c
+++ b/src/qemu/qemu_migration.c
@@ -840,7 +840,7 @@ qemuMigrationCancelDriveMirror(virQEMUDriverPtr driver,
         if (failed && !err)
             err = virSaveLastError();

-        if (virDomainObjWait(vm) < 0)
+        if (qemuDomainObjWait(vm) < 0)
             goto cleanup;
     }

@@ -979,7 +979,7 @@ qemuMigrationDriveMirror(virQEMUDriverPtr driver,
             goto cleanup;
         }

-        if (virDomainObjWait(vm) < 0)
+        if (qemuDomainObjWait(vm) < 0)
             goto cleanup;
     }

@@ -1335,7 +1335,7 @@ qemuMigrationWaitForSpice(virDomainObjPtr vm)

     VIR_DEBUG("Waiting for SPICE to finish migration");
     while (!priv->job.spiceMigrated && !priv->job.abortJob) {
-        if (virDomainObjWait(vm) < 0)
+        if (qemuDomainObjWait(vm) < 0)
             return -1;
     }
     return 0;
@@ -1603,17 +1603,13 @@ qemuMigrationWaitForCompletion(virQEMUDriverPtr driver,
             return rv;

         if (events) {
-            if (virDomainObjWait(vm) < 0) {
+            if (qemuDomainObjWait(vm) < 0) {
                 jobInfo->status = QEMU_DOMAIN_JOB_STATUS_FAILED;
                 return -2;
             }
         } else {
             /* Poll every 50ms for progress & to allow cancellation */
-            struct timespec ts = { .tv_sec = 0, .tv_nsec = 50 * 1000 * 1000ull };
-
-            virObjectUnlock(vm);
-            nanosleep(&ts, NULL);
-            virObjectLock(vm);
+            qemuDomainObjSleep(vm, 50 * 1000 * 1000ul);
         }
     }

@@ -1654,7 +1650,7 @@ qemuMigrationWaitForDestCompletion(virQEMUDriverPtr driver,

     while ((rv = qemuMigrationCompleted(driver, vm, asyncJob,
                                         NULL, flags)) != 1) {
-        if (rv < 0 || virDomainObjWait(vm) < 0)
+        if (rv < 0 || qemuDomainObjWait(vm) < 0)
             return -1;
     }

@@ -3910,7 +3906,7 @@ qemuMigrationRun(virQEMUDriverPtr driver,
         if (priv->monJSON) {
             while (virDomainObjGetState(vm, NULL) == VIR_DOMAIN_RUNNING) {
                 priv->signalStop = true;
-                rc = virDomainObjWait(vm);
+                rc = qemuDomainObjWait(vm);
                 priv->signalStop = false;
                 if (rc < 0)
                     goto error;
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 18dd3aa..24b7ba1 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -229,6 +229,7 @@ qemuConnectAgent(virQEMUDriverPtr driver, virDomainObjPtr vm)
      * deleted while the agent is active */
     virObjectRef(vm);

+    qemuDomainObjEnterInterruptible(vm);
     virObjectUnlock(vm);

     agent = qemuAgentOpen(vm,
@@ -236,6 +237,7 @@ qemuConnectAgent(virQEMUDriverPtr driver, virDomainObjPtr vm)
                           &agentCallbacks);

     virObjectLock(vm);
+    qemuDomainObjExitInterruptible(vm);

     if (agent == NULL)
         virObjectUnref(vm);
--
1.8.3.1
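[Editorial note] Schematically, every hunk above applies the same bracketing pattern around code that runs without the domain lock (exampleLongRunningCall is a hypothetical placeholder for qemuAgentOpen(), a remote connection, etc.):

    qemuDomainObjEnterInterruptible(vm);  /* queries may run from here on */
    virObjectUnlock(vm);

    exampleLongRunningCall();             /* runs without the domain lock */

    virObjectLock(vm);
    qemuDomainObjExitInterruptible(vm);   /* waits for any concurrent
                                             regular job to finish */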

[libvirt] [PATCH v3 04/11] qemu: remove nested job usage from qemuProcessStop

Commit 81f50cb92 added taking a nested job when stopping a domain, to prevent leaking the monitor object on a race between an aborting job and the domain stop caused by that abort. One of the causes of this problem is that the async job and the concurrent regular job were not fully isolated, so the async job could continue to run while the concurrent job was not finished yet. Now this is not possible, and we can safely remove taking the nested job in qemuProcessStop, as stopping is always done within the context of some job.
---
 src/qemu/qemu_process.c | 16 +---------------
 1 file changed, 1 insertion(+), 15 deletions(-)

diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 24b7ba1..7eaeeaa 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -6296,19 +6296,9 @@ void qemuProcessStop(virQEMUDriverPtr driver,
      * reporting so we don't squash a legit error. */
     orig_err = virSaveLastError();

-    if (asyncJob != QEMU_ASYNC_JOB_NONE) {
-        if (qemuDomainObjBeginNestedJob(driver, vm, asyncJob) < 0)
-            goto cleanup;
-    } else if (priv->job.asyncJob != QEMU_ASYNC_JOB_NONE &&
-               priv->job.asyncOwner == virThreadSelfID() &&
-               priv->job.active != QEMU_JOB_ASYNC_NESTED) {
-        VIR_WARN("qemuProcessStop called without a nested job (async=%s)",
-                 qemuDomainAsyncJobTypeToString(asyncJob));
-    }
-
     if (!virDomainObjIsActive(vm)) {
         VIR_DEBUG("VM '%s' not active", vm->def->name);
-        goto endjob;
+        goto cleanup;
     }

     qemuProcessBuildDestroyMemoryPaths(driver, vm, NULL, false);
@@ -6565,10 +6555,6 @@ void qemuProcessStop(virQEMUDriverPtr driver,

     virDomainObjRemoveTransientDef(vm);

- endjob:
-    if (asyncJob != QEMU_ASYNC_JOB_NONE)
-        qemuDomainObjEndJob(driver, vm);
-
  cleanup:
     if (orig_err) {
         virSetError(orig_err);
--
1.8.3.1

[libvirt] [PATCH v3 05/11] qemu: remove unused qemuDomainObjBeginNestedJob
---
 src/qemu/qemu_domain.c | 24 ------------------------
 src/qemu/qemu_domain.h |  4 ----
 2 files changed, 28 deletions(-)

diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index f8602b7..edf204e 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -4503,30 +4503,6 @@ int qemuDomainObjBeginAsyncJob(virQEMUDriverPtr driver,
     return 0;
 }

-int
-qemuDomainObjBeginNestedJob(virQEMUDriverPtr driver,
-                            virDomainObjPtr obj,
-                            qemuDomainAsyncJob asyncJob)
-{
-    qemuDomainObjPrivatePtr priv = obj->privateData;
-
-    if (asyncJob != priv->job.asyncJob) {
-        virReportError(VIR_ERR_INTERNAL_ERROR,
-                       _("unexpected async job %d"), asyncJob);
-        return -1;
-    }
-
-    if (priv->job.asyncOwner != virThreadSelfID()) {
-        VIR_WARN("This thread doesn't seem to be the async job owner: %llu",
-                 priv->job.asyncOwner);
-    }
-
-    return qemuDomainObjBeginJobInternal(driver, obj,
-                                         QEMU_JOB_ASYNC_NESTED,
-                                         QEMU_ASYNC_JOB_NONE);
-}
-
-
 /*
  * obj must be locked and have a reference before calling
  *
diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
index 25a6823..1729761 100644
--- a/src/qemu/qemu_domain.h
+++ b/src/qemu/qemu_domain.h
@@ -494,10 +494,6 @@ int qemuDomainObjBeginAsyncJob(virQEMUDriverPtr driver,
                                qemuDomainAsyncJob asyncJob,
                                virDomainJobOperation operation)
     ATTRIBUTE_RETURN_CHECK;
-int qemuDomainObjBeginNestedJob(virQEMUDriverPtr driver,
-                                virDomainObjPtr obj,
-                                qemuDomainAsyncJob asyncJob)
-    ATTRIBUTE_RETURN_CHECK;
 void qemuDomainObjEndJob(virQEMUDriverPtr driver,
                          virDomainObjPtr obj);
--
1.8.3.1

[libvirt] [PATCH v3 06/11] qemu: remove QEMU_JOB_ASYNC_NESTED

We can safely remove this value, in particular because it could never be saved in the status file: while a nested job is started, no other thread can execute a concurrent regular job, which in turn is what could save the status file.
---
 src/qemu/qemu_domain.c  | 3 ---
 src/qemu/qemu_domain.h  | 1 -
 src/qemu/qemu_process.c | 1 -
 3 files changed, 5 deletions(-)

diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index edf204e..782bd43 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -89,7 +89,6 @@ VIR_ENUM_IMPL(qemuDomainJob, QEMU_JOB_LAST,
               "abort",
               "migration operation",
               "none",   /* async job is never stored in job.active */
-              "async nested",
 );

@@ -4277,8 +4276,6 @@ qemuDomainObjDiscardAsyncJob(virQEMUDriverPtr driver, virDomainObjPtr obj)
 {
     qemuDomainObjPrivatePtr priv = obj->privateData;

-    if (priv->job.active == QEMU_JOB_ASYNC_NESTED)
-        qemuDomainObjResetJob(priv);
     qemuDomainObjResetAsyncJob(priv);
     qemuDomainObjSaveJob(driver, obj);
 }
diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
index 1729761..146a8f2 100644
--- a/src/qemu/qemu_domain.h
+++ b/src/qemu/qemu_domain.h
@@ -75,7 +75,6 @@ typedef enum {
     /* The following two items must always be the last items before JOB_LAST */
     QEMU_JOB_ASYNC,         /* Asynchronous job */
-    QEMU_JOB_ASYNC_NESTED,  /* Normal job within an async job */

     QEMU_JOB_LAST
 } qemuDomainJob;
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 7eaeeaa..a33fa69 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -3201,7 +3201,6 @@ qemuProcessRecoverJob(virQEMUDriverPtr driver,
     case QEMU_JOB_MIGRATION_OP:
     case QEMU_JOB_ABORT:
     case QEMU_JOB_ASYNC:
-    case QEMU_JOB_ASYNC_NESTED:
         /* async job was already handled above */
     case QEMU_JOB_NONE:
     case QEMU_JOB_LAST:
--
1.8.3.1

[libvirt] [PATCH v3 07/11] qemu: remove liveness check from qemuDomainObjExitMonitor

This check is now useless. We enter/exit the monitor in the context of some job. A domain can become inactive only in the context of another job which calls qemuProcessStop. But we no longer allow concurrency anywhere while calling qemu commands, thus the domain cannot become inactive in between.
---
 src/qemu/qemu_domain.c | 11 -----------
 1 file changed, 11 deletions(-)

diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 782bd43..3747b25 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -4615,22 +4615,11 @@ void qemuDomainObjEnterMonitor(virQEMUDriverPtr driver,
  *
  * Should be paired with an earlier qemuDomainObjEnterMonitor() call
  *
- * Returns -1 if the domain is no longer alive after exiting the monitor.
- * In that case, the caller should be careful when using obj's data,
- * e.g. the live definition in vm->def has been freed by qemuProcessStop
- * and replaced by the persistent definition, so pointers stolen
- * from the live definition could no longer be valid.
  */
 int qemuDomainObjExitMonitor(virQEMUDriverPtr driver,
                              virDomainObjPtr obj)
 {
     qemuDomainObjExitMonitorInternal(driver, obj);
-    if (!virDomainObjIsActive(obj)) {
-        if (!virGetLastError())
-            virReportError(VIR_ERR_OPERATION_FAILED, "%s",
-                           _("domain is no longer running"));
-        return -1;
-    }
     return 0;
 }
--
1.8.3.1

[libvirt] [PATCH v3 08/11] qemu: drop qemuDomainObjEnterMonitorInternal
---
 src/qemu/qemu_domain.c | 21 ++++++---------------
 1 file changed, 6 insertions(+), 15 deletions(-)

diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 3747b25..56606e0 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -4566,10 +4566,9 @@ qemuDomainObjAbortAsyncJob(virDomainObjPtr obj)
  *
  * To be followed with qemuDomainObjExitMonitor() once complete
  */
-static int
-qemuDomainObjEnterMonitorInternal(virQEMUDriverPtr driver ATTRIBUTE_UNUSED,
-                                  virDomainObjPtr obj,
-                                  qemuDomainAsyncJob asyncJob ATTRIBUTE_UNUSED)
+void
+qemuDomainObjEnterMonitor(virQEMUDriverPtr driver ATTRIBUTE_UNUSED,
+                          virDomainObjPtr obj)
 {
     qemuDomainObjPrivatePtr priv = obj->privateData;

@@ -4579,8 +4578,6 @@ qemuDomainObjEnterMonitorInternal(virQEMUDriverPtr driver ATTRIBUTE_UNUSED,
     virObjectRef(priv->mon);
     ignore_value(virTimeMillisNow(&priv->monStart));
     virObjectUnlock(obj);
-
-    return 0;
 }

 static void ATTRIBUTE_NONNULL(1)
@@ -4604,13 +4601,6 @@ qemuDomainObjExitMonitorInternal(virQEMUDriverPtr driver ATTRIBUTE_UNUSED,
     priv->monStart = 0;
     if (!hasRefs)
         priv->mon = NULL;
 }

-void qemuDomainObjEnterMonitor(virQEMUDriverPtr driver,
-                               virDomainObjPtr obj)
-{
-    ignore_value(qemuDomainObjEnterMonitorInternal(driver, obj,
-                                                   QEMU_ASYNC_JOB_NONE));
-}
-
 /* obj must NOT be locked before calling
  *
  * Should be paired with an earlier qemuDomainObjEnterMonitor() call
@@ -4640,9 +4630,10 @@ int qemuDomainObjExitMonitor(virQEMUDriverPtr driver,
 int
 qemuDomainObjEnterMonitorAsync(virQEMUDriverPtr driver,
                                virDomainObjPtr obj,
-                               qemuDomainAsyncJob asyncJob)
+                               qemuDomainAsyncJob asyncJob ATTRIBUTE_UNUSED)
 {
-    return qemuDomainObjEnterMonitorInternal(driver, obj, asyncJob);
+    qemuDomainObjEnterMonitor(driver, obj);
+    return 0;
 }

 /*
--
1.8.3.1

[libvirt] [PATCH v3 09/11] qemu: drop qemuDomainObjExitMonitorInternal
---
 src/qemu/qemu_domain.c | 21 ++++++++------------
 1 file changed, 8 insertions(+), 13 deletions(-)

diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 56606e0..1cddab9 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -4580,9 +4580,13 @@ qemuDomainObjEnterMonitor(virQEMUDriverPtr driver ATTRIBUTE_UNUSED,
     virObjectUnlock(obj);
 }

-static void ATTRIBUTE_NONNULL(1)
-qemuDomainObjExitMonitorInternal(virQEMUDriverPtr driver ATTRIBUTE_UNUSED,
-                                 virDomainObjPtr obj)
+/* obj must NOT be locked before calling
+ *
+ * Should be paired with an earlier qemuDomainObjEnterMonitor() call
+ */
+int
+qemuDomainObjExitMonitor(virQEMUDriverPtr driver ATTRIBUTE_UNUSED,
+                         virDomainObjPtr obj)
 {
     qemuDomainObjPrivatePtr priv = obj->privateData;
     bool hasRefs;
@@ -4599,20 +4603,11 @@ qemuDomainObjExitMonitorInternal(virQEMUDriverPtr driver ATTRIBUTE_UNUSED,
     priv->monStart = 0;
     if (!hasRefs)
         priv->mon = NULL;
-}

-/* obj must NOT be locked before calling
- *
- * Should be paired with an earlier qemuDomainObjEnterMonitor() call
- *
- */
-int qemuDomainObjExitMonitor(virQEMUDriverPtr driver,
-                             virDomainObjPtr obj)
-{
-    qemuDomainObjExitMonitorInternal(driver, obj);
     return 0;
 }

+
 /*
  * obj must be locked before calling
  *
--
1.8.3.1

[libvirt] [PATCH v3 10/11] conf: cleanup: drop virDomainObjWait
---
 src/conf/domain_conf.c   | 19 -------------------
 src/conf/domain_conf.h   |  1 -
 src/libvirt_private.syms |  1 -
 3 files changed, 21 deletions(-)

diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 7dfd7b5..dad0520 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -3192,25 +3192,6 @@ virDomainObjBroadcast(virDomainObjPtr vm)
 }

-int
-virDomainObjWait(virDomainObjPtr vm)
-{
-    if (virCondWait(&vm->cond, &vm->parent.lock) < 0) {
-        virReportSystemError(errno, "%s",
-                             _("failed to wait for domain condition"));
-        return -1;
-    }
-
-    if (!virDomainObjIsActive(vm)) {
-        virReportError(VIR_ERR_OPERATION_FAILED, "%s",
-                       _("domain is not running"));
-        return -1;
-    }
-
-    return 0;
-}
-
-
 /**
  * Waits for domain condition to be triggered for a specific period of time.
  *
diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h
index e3f060b..e6140ab 100644
--- a/src/conf/domain_conf.h
+++ b/src/conf/domain_conf.h
@@ -2685,7 +2685,6 @@ bool virDomainObjTaint(virDomainObjPtr obj,
                        virDomainTaintFlags taint);

 void virDomainObjBroadcast(virDomainObjPtr vm);
-int virDomainObjWait(virDomainObjPtr vm);
 int virDomainObjWaitUntil(virDomainObjPtr vm,
                           unsigned long long whenms);

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index 6164446..e7320dc 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -470,7 +470,6 @@ virDomainObjSetMetadata;
 virDomainObjSetState;
 virDomainObjTaint;
 virDomainObjUpdateModificationImpact;
-virDomainObjWait;
 virDomainObjWaitUntil;
 virDomainOSTypeFromString;
 virDomainOSTypeToString;
--
1.8.3.1

[libvirt] [PATCH v3 11/11] qemu: rename qemuDomainNestedJobAllowed
---
 src/qemu/qemu_domain.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 1cddab9..b0f7e56 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -4297,7 +4297,7 @@ qemuDomainObjReleaseAsyncJob(virDomainObjPtr obj)
 }

 static bool
-qemuDomainNestedJobAllowed(qemuDomainObjPrivatePtr priv, qemuDomainJob job)
+qemuDomainConcurrentJobAllowed(qemuDomainObjPrivatePtr priv, qemuDomainJob job)
 {
     return !priv->job.asyncJob ||
            (priv->job.asyncInterruptible && (priv->job.mask & JOB_MASK(job)) != 0);
@@ -4306,7 +4306,7 @@ qemuDomainNestedJobAllowed(qemuDomainObjPrivatePtr priv, qemuDomainJob job)
 bool
 qemuDomainJobAllowed(qemuDomainObjPrivatePtr priv, qemuDomainJob job)
 {
-    return !priv->job.active && qemuDomainNestedJobAllowed(priv, job);
+    return !priv->job.active && qemuDomainConcurrentJobAllowed(priv, job);
 }

 /* Give up waiting for mutex after 30 seconds */
@@ -4356,7 +4356,7 @@ qemuDomainObjBeginJobInternal(virQEMUDriverPtr driver,
         goto error;
     }

-    while (!qemuDomainNestedJobAllowed(priv, job)) {
+    while (!qemuDomainConcurrentJobAllowed(priv, job)) {
         VIR_DEBUG("Waiting for async job (vm=%p name=%s)", obj, obj->def->name);
         if (virCondWaitUntil(&priv->job.asyncCond, &obj->parent.lock, then) < 0)
             goto error;
@@ -4370,7 +4370,7 @@ qemuDomainObjBeginJobInternal(virQEMUDriverPtr driver,

     /* No job is active but a new async job could have been started while obj
      * was unlocked, so we need to recheck it. */
-    if (!qemuDomainNestedJobAllowed(priv, job))
+    if (!qemuDomainConcurrentJobAllowed(priv, job))
         goto retry;

     qemuDomainObjResetJob(priv);
@@ -4426,7 +4426,7 @@ qemuDomainObjBeginJobInternal(virQEMUDriverPtr driver,
                  priv->job.asyncOwner, NULLSTR(priv->job.asyncOwnerAPI),
                  duration / 1000, asyncDuration / 1000);

-        if (qemuDomainNestedJobAllowed(priv, job))
+        if (qemuDomainConcurrentJobAllowed(priv, job))
             blocker = priv->job.ownerAPI;
         else
             blocker = priv->job.asyncOwnerAPI;
--
1.8.3.1