[PATCH 0/2] qemu: migration corner case fix and cleanup

Nikolay Shirokovskiy (2): qemu: fix qemuMigrationSrcCleanup to use qemuMigrationJobFinish qemu: don't needlessly unset close callback during perform phase src/qemu/qemu_migration.c | 14 +++----------- 1 file changed, 3 insertions(+), 11 deletions(-) -- 1.8.3.1

qemuMigrationSrcCleanup uses qemuDomainObjDiscardAsyncJob currently. But discard does not reduce jobs_queued counter so it leaks. Also discard does not notify other threads that job condition is available. Discard does reset nested job but nested job is not possible in this conditions. Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com> --- src/qemu/qemu_migration.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 142faa2..301475a 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -2026,7 +2026,6 @@ qemuMigrationSrcCleanup(virDomainObjPtr vm, switch ((qemuMigrationJobPhase) priv->job.phase) { case QEMU_MIGRATION_PHASE_BEGIN3: /* just forget we were about to migrate */ - qemuDomainObjDiscardAsyncJob(driver, vm); break; case QEMU_MIGRATION_PHASE_PERFORM3_DONE: @@ -2036,7 +2035,6 @@ qemuMigrationSrcCleanup(virDomainObjPtr vm, qemuMigrationParamsReset(driver, vm, QEMU_ASYNC_JOB_MIGRATION_OUT, jobPriv->migParams, priv->job.apiFlags); /* clear the job and let higher levels decide what to do */ - qemuDomainObjDiscardAsyncJob(driver, vm); break; case QEMU_MIGRATION_PHASE_PERFORM3: @@ -2053,8 +2051,9 @@ qemuMigrationSrcCleanup(virDomainObjPtr vm, case QEMU_MIGRATION_PHASE_NONE: case QEMU_MIGRATION_PHASE_LAST: /* unreachable */ - ; + return; } + qemuMigrationJobFinish(driver, vm); } -- 1.8.3.1

On Tue, Aug 18, 2020 at 13:26:35 +0300, Nikolay Shirokovskiy wrote:
qemuMigrationSrcCleanup uses qemuDomainObjDiscardAsyncJob currently. But discard does not reduce jobs_queued counter so it leaks. Also discard does not notify other threads that job condition is available. Discard does reset nested job but nested job is not possible in this conditions.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com> --- src/qemu/qemu_migration.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 142faa2..301475a 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -2026,7 +2026,6 @@ qemuMigrationSrcCleanup(virDomainObjPtr vm, switch ((qemuMigrationJobPhase) priv->job.phase) { case QEMU_MIGRATION_PHASE_BEGIN3: /* just forget we were about to migrate */ - qemuDomainObjDiscardAsyncJob(driver, vm); break;
case QEMU_MIGRATION_PHASE_PERFORM3_DONE: @@ -2036,7 +2035,6 @@ qemuMigrationSrcCleanup(virDomainObjPtr vm, qemuMigrationParamsReset(driver, vm, QEMU_ASYNC_JOB_MIGRATION_OUT, jobPriv->migParams, priv->job.apiFlags); /* clear the job and let higher levels decide what to do */ - qemuDomainObjDiscardAsyncJob(driver, vm); break;
case QEMU_MIGRATION_PHASE_PERFORM3: @@ -2053,8 +2051,9 @@ qemuMigrationSrcCleanup(virDomainObjPtr vm, case QEMU_MIGRATION_PHASE_NONE: case QEMU_MIGRATION_PHASE_LAST: /* unreachable */ - ; + return; } + qemuMigrationJobFinish(driver, vm); }
I was thinking whether we could somehow merge the two APIs, but apparently we need both as some callers should not do the actions in which these two APIs differ. Anyway, this patch is good, but I'd prefer the qemuMigrationJobFinish call to stay in each switch branch to make the code executed for each phase immediately visible. And it's just one line so I don't see a reason for moving it. With the suggested change: Reviewed-by: Jiri Denemark <jdenemar@redhat.com>

During API call connection is referenced and close callback is called when connection is disposed. Thus during API call close callback cannot be triggered and we don't need to unset callback on API call duration. This code was added in [1] and commit does not explain this part of the patch. [1] 1fdc53c3: qemu: Avoid dangling migration-out job when client dies Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com> --- src/qemu/qemu_migration.c | 9 +-------- 1 file changed, 1 insertion(+), 8 deletions(-) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 301475a..7615dab 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -4743,7 +4743,6 @@ qemuMigrationSrcPerformJob(virQEMUDriverPtr driver, */ static int qemuMigrationSrcPerformPhase(virQEMUDriverPtr driver, - virConnectPtr conn, virDomainObjPtr vm, const char *persist_xml, const char *uri, @@ -4772,8 +4771,6 @@ qemuMigrationSrcPerformPhase(virQEMUDriverPtr driver, } qemuMigrationJobStartPhase(driver, vm, QEMU_MIGRATION_PHASE_PERFORM3); - virCloseCallbacksUnset(driver->closeCallbacks, vm, - qemuMigrationSrcCleanup); ret = qemuMigrationSrcPerformNative(driver, vm, persist_xml, uri, cookiein, cookieinlen, cookieout, cookieoutlen, @@ -4787,10 +4784,6 @@ qemuMigrationSrcPerformPhase(virQEMUDriverPtr driver, qemuMigrationJobSetPhase(driver, vm, QEMU_MIGRATION_PHASE_PERFORM3_DONE); - if (virCloseCallbacksSet(driver->closeCallbacks, vm, conn, - qemuMigrationSrcCleanup) < 0) - goto endjob; - endjob: if (ret < 0) { qemuMigrationParamsReset(driver, vm, QEMU_ASYNC_JOB_MIGRATION_OUT, @@ -4862,7 +4855,7 @@ qemuMigrationSrcPerform(virQEMUDriverPtr driver, } if (v3proto) { - return qemuMigrationSrcPerformPhase(driver, conn, vm, persist_xml, uri, + return qemuMigrationSrcPerformPhase(driver, vm, persist_xml, uri, graphicsuri, nmigrate_disks, migrate_disks, migParams, -- 1.8.3.1

On Tue, Aug 18, 2020 at 13:26:36 +0300, Nikolay Shirokovskiy wrote:
During API call connection is referenced and close callback is called when connection is disposed. Thus during API call close callback cannot be triggered and we don't need to unset callback on API call duration.
This code was added in [1] and commit does not explain this part of the patch.
[1] 1fdc53c3: qemu: Avoid dangling migration-out job when client dies
I suspect this was not the case 8 years ago when close callbacks were introduced. In any case, I don't think calling Unset in the begining of qemuMigrationSrcPerformPhase and Set at the end could cause any harm. The idea is to keep the callback set only when there's no running API. So while it may not be needed, I don't think there's a real need to change it. Jirka

On 06.11.2020 17:03, Jiri Denemark wrote:
On Tue, Aug 18, 2020 at 13:26:36 +0300, Nikolay Shirokovskiy wrote:
During API call connection is referenced and close callback is called when connection is disposed. Thus during API call close callback cannot be triggered and we don't need to unset callback on API call duration.
This code was added in [1] and commit does not explain this part of the patch.
[1] 1fdc53c3: qemu: Avoid dangling migration-out job when client dies
I suspect this was not the case 8 years ago when close callbacks were introduced. In any case, I don't think calling Unset in the begining of qemuMigrationSrcPerformPhase and Set at the end could cause any harm. The idea is to keep the callback set only when there's no running API. So while it may not be needed, I don't think there's a real need to change it.
Jirka
Ok, thank you for review! First patch pushed with suggested changes. Nikolay

Polite ping On 18.08.2020 13:26, Nikolay Shirokovskiy wrote:
Nikolay Shirokovskiy (2): qemu: fix qemuMigrationSrcCleanup to use qemuMigrationJobFinish qemu: don't needlessly unset close callback during perform phase
src/qemu/qemu_migration.c | 14 +++----------- 1 file changed, 3 insertions(+), 11 deletions(-)

On 8/18/20 12:26 PM, Nikolay Shirokovskiy wrote:
Nikolay Shirokovskiy (2): qemu: fix qemuMigrationSrcCleanup to use qemuMigrationJobFinish qemu: don't needlessly unset close callback during perform phase
src/qemu/qemu_migration.c | 14 +++----------- 1 file changed, 3 insertions(+), 11 deletions(-)
Patches look good to me, but since it was Jirka who added the call you're removing in 2/2 I'll let him share opinion too. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Michal

There is issue with second patch. Probably this unset/set was too mysterious and I miss one point - we do need to call unset on error path as in case of non p2p migration confirm step will not be called. So this need to be squashed in: diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 01c702e..6213025 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -4958,6 +4958,8 @@ qemuMigrationSrcPerformPhase(virQEMUDriverPtr driver, qemuMigrationParamsReset(driver, vm, QEMU_ASYNC_JOB_MIGRATION_OUT, jobPriv->migParams, priv->job.apiFlags); qemuMigrationJobFinish(driver, vm); + virCloseCallbacksUnset(driver->closeCallbacks, vm, + qemuMigrationSrcCleanup); } else { qemuMigrationJobContinue(vm); } On 14.09.2020 18:41, Michal Privoznik wrote:
On 8/18/20 12:26 PM, Nikolay Shirokovskiy wrote:
Nikolay Shirokovskiy (2): qemu: fix qemuMigrationSrcCleanup to use qemuMigrationJobFinish qemu: don't needlessly unset close callback during perform phase
src/qemu/qemu_migration.c | 14 +++----------- 1 file changed, 3 insertions(+), 11 deletions(-)
Patches look good to me, but since it was Jirka who added the call you're removing in 2/2 I'll let him share opinion too.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Michal

Polite ping On 14.09.2020 19:10, Nikolay Shirokovskiy wrote:
There is issue with second patch. Probably this unset/set was too mysterious and I miss one point - we do need to call unset on error path as in case of non p2p migration confirm step will not be called.
So this need to be squashed in:
diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 01c702e..6213025 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -4958,6 +4958,8 @@ qemuMigrationSrcPerformPhase(virQEMUDriverPtr driver, qemuMigrationParamsReset(driver, vm, QEMU_ASYNC_JOB_MIGRATION_OUT, jobPriv->migParams, priv->job.apiFlags); qemuMigrationJobFinish(driver, vm); + virCloseCallbacksUnset(driver->closeCallbacks, vm, + qemuMigrationSrcCleanup); } else { qemuMigrationJobContinue(vm); }
On 14.09.2020 18:41, Michal Privoznik wrote:
On 8/18/20 12:26 PM, Nikolay Shirokovskiy wrote:
Nikolay Shirokovskiy (2): qemu: fix qemuMigrationSrcCleanup to use qemuMigrationJobFinish qemu: don't needlessly unset close callback during perform phase
src/qemu/qemu_migration.c | 14 +++----------- 1 file changed, 3 insertions(+), 11 deletions(-)
Patches look good to me, but since it was Jirka who added the call you're removing in 2/2 I'll let him share opinion too.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Michal
participants (3)
-
Jiri Denemark
-
Michal Privoznik
-
Nikolay Shirokovskiy