[libvirt] [PATCH REBASE 0/5] qemu: fix domain object wait to handle monitor errors

Main patch is 4th, others are misc. Nikolay Shirokovskiy (5): qemu: erase synchronous block job cancel mentions in comments qemu: monitor: set error flag even in OOM conditions utils: export virCopyError qemu: fix domain object wait to handle monitor errors qemu: fix races in beingDestroyed usage src/conf/domain_conf.c | 43 --------------------------------------- src/conf/domain_conf.h | 3 --- src/libvirt_private.syms | 3 +-- src/qemu/qemu_domain.c | 45 +++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_domain.h | 7 +++++-- src/qemu/qemu_driver.c | 27 +++++++++++++++---------- src/qemu/qemu_hotplug.c | 4 ++-- src/qemu/qemu_migration.c | 12 +++++------ src/qemu/qemu_monitor.c | 5 +++++ src/qemu/qemu_process.c | 51 +++++++++++++++++++++++++++++++---------------- src/util/virerror.c | 12 ++++++++--- src/util/virerror.h | 1 + 12 files changed, 124 insertions(+), 89 deletions(-) -- 1.8.3.1

Commit [1] dropped support for synchronous block job cancel. This patch erases remnants from comments. [1] commit 2350d101 "qemu: Remove support for legacy block jobs" Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com> --- src/qemu/qemu_driver.c | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 4e06c9c..0dd6032 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -16914,13 +16914,12 @@ qemuDomainBlockJobAbort(virDomainPtr dom, if (save) ignore_value(virDomainSaveStatus(driver->xmlopt, cfg->stateDir, vm, driver->caps)); - /* With synchronous block cancel, we must synthesize an event, and - * we silently ignore the ABORT_ASYNC flag. With asynchronous - * block cancel, the event will come from qemu and will update the - * XML as appropriate, but without the ABORT_ASYNC flag, we must - * block to guarantee synchronous operation. We do the waiting - * while still holding the VM job, to prevent newly scheduled - * block jobs from confusing us. */ + /* + * With asynchronous block cancel, the event will come from qemu and will + * update the XML as appropriate, but without the ABORT_ASYNC flag, we must + * block to guarantee synchronous operation. We do the waiting while still + * holding the VM job, to prevent newly scheduled block jobs from confusing + * us. */ if (!async) { qemuDomainDiskPrivatePtr diskPriv = QEMU_DOMAIN_DISK_PRIVATE(disk); qemuBlockJobUpdate(driver, vm, QEMU_ASYNC_JOB_NONE, disk, NULL); -- 1.8.3.1

On 04/18/2018 10:44 AM, Nikolay Shirokovskiy wrote:
Commit [1] dropped support for synchronous block job cancel. This patch erases remnants from comments.
[1] commit 2350d101 "qemu: Remove support for legacy block jobs"
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com> --- src/qemu/qemu_driver.c | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 4e06c9c..0dd6032 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -16914,13 +16914,12 @@ qemuDomainBlockJobAbort(virDomainPtr dom, if (save) ignore_value(virDomainSaveStatus(driver->xmlopt, cfg->stateDir, vm, driver->caps));
- /* With synchronous block cancel, we must synthesize an event, and - * we silently ignore the ABORT_ASYNC flag. With asynchronous - * block cancel, the event will come from qemu and will update the - * XML as appropriate, but without the ABORT_ASYNC flag, we must - * block to guarantee synchronous operation. We do the waiting - * while still holding the VM job, to prevent newly scheduled - * block jobs from confusing us. */ + /* + * With asynchronous block cancel, the event will come from qemu and will
/* With... IOW: don't have a separate line for the open comment...
+ * update the XML as appropriate, but without the ABORT_ASYNC flag, we must + * block to guarantee synchronous operation. We do the waiting while still + * holding the VM job, to prevent newly scheduled block jobs from confusing + * us. */
This works, but the lead-in doesn't mean as much without the "With synchronous..." as the alternative... So perhaps even simpler: Wait for the QEMU event to update the the blockjob with the domain lock to prevent newly scheduled block jobs from confusing us. The event will update the XML as appropriate. Without the ABORT_ASYNC flag, we must block to guarantee synchronous completion. With at least the comment entry fixup... feel free to use my wording as well (as long as it makes sense to you).. Reviewed-by: John Ferlan <jferlan@redhat.com> but please wait for 4.4.0 to open before pushing. John
if (!async) { qemuDomainDiskPrivatePtr diskPriv = QEMU_DOMAIN_DISK_PRIVATE(disk); qemuBlockJobUpdate(driver, vm, QEMU_ASYNC_JOB_NONE, disk, NULL);

On 01.05.2018 01:03, John Ferlan wrote:
On 04/18/2018 10:44 AM, Nikolay Shirokovskiy wrote:
Commit [1] dropped support for synchronous block job cancel. This patch erases remnants from comments.
[1] commit 2350d101 "qemu: Remove support for legacy block jobs"
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com> --- src/qemu/qemu_driver.c | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 4e06c9c..0dd6032 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -16914,13 +16914,12 @@ qemuDomainBlockJobAbort(virDomainPtr dom, if (save) ignore_value(virDomainSaveStatus(driver->xmlopt, cfg->stateDir, vm, driver->caps));
- /* With synchronous block cancel, we must synthesize an event, and - * we silently ignore the ABORT_ASYNC flag. With asynchronous - * block cancel, the event will come from qemu and will update the - * XML as appropriate, but without the ABORT_ASYNC flag, we must - * block to guarantee synchronous operation. We do the waiting - * while still holding the VM job, to prevent newly scheduled - * block jobs from confusing us. */ + /* + * With asynchronous block cancel, the event will come from qemu and will
/* With...
IOW: don't have a separate line for the open comment...
+ * update the XML as appropriate, but without the ABORT_ASYNC flag, we must + * block to guarantee synchronous operation. We do the waiting while still + * holding the VM job, to prevent newly scheduled block jobs from confusing + * us. */
This works, but the lead-in doesn't mean as much without the "With synchronous..." as the alternative... So perhaps even simpler:
Wait for the QEMU event to update the the blockjob with the domain lock to prevent newly scheduled block jobs from confusing us. The event will update the XML as appropriate. Without the ABORT_ASYNC flag, we must block to guarantee synchronous completion.
With at least the comment entry fixup... feel free to use my wording as well (as long as it makes sense to you)..
I can not use your version unchanged too... It starts saying that we wait but we wait only if ABORT_ASYNC is not set. I'll reword the proposed version too to eliminate the mention that we use qemu's async block job cancel API.
Reviewed-by: John Ferlan <jferlan@redhat.com>
but please wait for 4.4.0 to open before pushing.
John
if (!async) { qemuDomainDiskPrivatePtr diskPriv = QEMU_DOMAIN_DISK_PRIVATE(disk); qemuBlockJobUpdate(driver, vm, QEMU_ASYNC_JOB_NONE, disk, NULL);

pushed slightly reworded On 03.05.2018 10:01, Nikolay Shirokovskiy wrote:
On 01.05.2018 01:03, John Ferlan wrote:
On 04/18/2018 10:44 AM, Nikolay Shirokovskiy wrote:
Commit [1] dropped support for synchronous block job cancel. This patch erases remnants from comments.
[1] commit 2350d101 "qemu: Remove support for legacy block jobs"
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com> --- src/qemu/qemu_driver.c | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 4e06c9c..0dd6032 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -16914,13 +16914,12 @@ qemuDomainBlockJobAbort(virDomainPtr dom, if (save) ignore_value(virDomainSaveStatus(driver->xmlopt, cfg->stateDir, vm, driver->caps));
- /* With synchronous block cancel, we must synthesize an event, and - * we silently ignore the ABORT_ASYNC flag. With asynchronous - * block cancel, the event will come from qemu and will update the - * XML as appropriate, but without the ABORT_ASYNC flag, we must - * block to guarantee synchronous operation. We do the waiting - * while still holding the VM job, to prevent newly scheduled - * block jobs from confusing us. */ + /* + * With asynchronous block cancel, the event will come from qemu and will
/* With...
IOW: don't have a separate line for the open comment...
+ * update the XML as appropriate, but without the ABORT_ASYNC flag, we must + * block to guarantee synchronous operation. We do the waiting while still + * holding the VM job, to prevent newly scheduled block jobs from confusing + * us. */
This works, but the lead-in doesn't mean as much without the "With synchronous..." as the alternative... So perhaps even simpler:
Wait for the QEMU event to update the the blockjob with the domain lock to prevent newly scheduled block jobs from confusing us. The event will update the XML as appropriate. Without the ABORT_ASYNC flag, we must block to guarantee synchronous completion.
With at least the comment entry fixup... feel free to use my wording as well (as long as it makes sense to you)..
I can not use your version unchanged too... It starts saying that we wait but we wait only if ABORT_ASYNC is not set. I'll reword the proposed version too to eliminate the mention that we use qemu's async block job cancel API.
Reviewed-by: John Ferlan <jferlan@redhat.com>
but please wait for 4.4.0 to open before pushing.
John
if (!async) { qemuDomainDiskPrivatePtr diskPriv = QEMU_DOMAIN_DISK_PRIVATE(disk); qemuBlockJobUpdate(driver, vm, QEMU_ASYNC_JOB_NONE, disk, NULL);
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

lastError.code is used as flag in qemu monitor. Unfortunately due to temporary OOM conditions (very unlikely through as thread local error is allocated on first use) we can fail to set this flag in case of monitor eofs/errors. This can cause hangs. Let's make sure flag is always set in case eofs/errors. Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com> --- src/qemu/qemu_monitor.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c index f642d9a..906a320 100644 --- a/src/qemu/qemu_monitor.c +++ b/src/qemu/qemu_monitor.c @@ -757,6 +757,11 @@ qemuMonitorIO(int watch, int fd, int events, void *opaque) virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("Error while processing monitor IO")); virCopyLastError(&mon->lastError); + + /* set error code if due to OOM conditions we fail to set it before */ + if (mon->lastError.code == VIR_ERR_OK) + mon->lastError.code = VIR_ERR_INTERNAL_ERROR; + virResetLastError(); } -- 1.8.3.1

On 04/18/2018 10:44 AM, Nikolay Shirokovskiy wrote:
lastError.code is used as flag in qemu monitor. Unfortunately due to temporary OOM conditions (very unlikely through as thread local error is allocated on first use) we can fail to set this flag in case of monitor eofs/errors. This can cause hangs.
Let's make sure flag is always set in case eofs/errors.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com> --- src/qemu/qemu_monitor.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c index f642d9a..906a320 100644 --- a/src/qemu/qemu_monitor.c +++ b/src/qemu/qemu_monitor.c @@ -757,6 +757,11 @@ qemuMonitorIO(int watch, int fd, int events, void *opaque) virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("Error while processing monitor IO")); virCopyLastError(&mon->lastError); + + /* set error code if due to OOM conditions we fail to set it before */ + if (mon->lastError.code == VIR_ERR_OK)
So this can only happen if virCopyLastError fails, right? And code == 0 because of memset(..., 0, ...). Wouldn't you only be working around the issue for one of the callers? and only setting *.code and not *.domain? BTW: I feel we've been this way in rather recent history... See commit 'e711a391' - I recall we had varying opinions on that one. Anyway, assuming CopyLastError is the issue, then perhaps the right place to fix this is there and I would think at least the following would need to be set in the error path: to->code = VIR_ERR_NO_MEMORY; to->domain = VIR_FROM_THREAD; We cannot create a to->message... Of course, how can we be "sure" that the failure is because the VIR_ALLOC_QUIET in virLastErrorObject caused the issue other than changing it's return values or passing a "bool &memfail" that's only set when the VIR_ALLOC_QUIET fails allowing the caller to decide what to do. It's local, so change is limited. Thus virGetLastErrorMessage and virCopyLastError could make use of the returned, while other callers could just pass "&dummy". In the long run, it's a somewhat interesting corner case problem and we may have some varying opinions over the "best way" to resolve. Suffice to say, libvirt probably isn't going to "survive" much longer, but unless others read this response and have varying opinions and/or we come to a resolution - can this be posted/consider on it's own merits? Tks - John
+ mon->lastError.code = VIR_ERR_INTERNAL_ERROR; + virResetLastError(); }

On 01.05.2018 01:03, John Ferlan wrote:
On 04/18/2018 10:44 AM, Nikolay Shirokovskiy wrote:
lastError.code is used as flag in qemu monitor. Unfortunately due to temporary OOM conditions (very unlikely through as thread local error is allocated on first use) we can fail to set this flag in case of monitor eofs/errors. This can cause hangs.
Let's make sure flag is always set in case eofs/errors.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com> --- src/qemu/qemu_monitor.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c index f642d9a..906a320 100644 --- a/src/qemu/qemu_monitor.c +++ b/src/qemu/qemu_monitor.c @@ -757,6 +757,11 @@ qemuMonitorIO(int watch, int fd, int events, void *opaque) virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("Error while processing monitor IO")); virCopyLastError(&mon->lastError); + + /* set error code if due to OOM conditions we fail to set it before */ + if (mon->lastError.code == VIR_ERR_OK)
So this can only happen if virCopyLastError fails, right? And code == 0
Nope, virCopyLastError fail to set code field only in one case - if last error is NULL which can be only if in next lines even reporting fails: virErrorPtr err = virGetLastError(); if (!err) virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("Error while processing monitor IO"));
because of memset(..., 0, ...). Wouldn't you only be working around the issue for one of the callers? and only setting *.code and not *.domain?
We use mon->lastError for 2 purposes: - as flag, and it is critical that we set this flag appropriately. We use only code field for this purpuse - as error message that client can retrieve in case if API call returns error, it is no critical if we can not provide error message, moreover we can fail to pass error message in RPC for example due to temporary OOM conditions I even think to remove flag usage from mon->lastError and introduce bool flag but this looks ugly like 2 fields to keep error.
BTW: I feel we've been this way in rather recent history... See commit 'e711a391' - I recall we had varying opinions on that one.
Yes, but this fix does not relate to usage mon->lastError as flag, rather then to second usage in the above distinction.
Anyway, assuming CopyLastError is the issue, then perhaps the right place to fix this is there and I would think at least the following would need to be set in the error path:
to->code = VIR_ERR_NO_MEMORY; to->domain = VIR_FROM_THREAD;
It can be misleading. For example if we forget to set last error and then try to copy it.
We cannot create a to->message...
Of course, how can we be "sure" that the failure is because the VIR_ALLOC_QUIET in virLastErrorObject caused the issue other than changing it's return values or passing a "bool &memfail" that's only set when the VIR_ALLOC_QUIET fails allowing the caller to decide what to do. It's local, so change is limited. Thus virGetLastErrorMessage and virCopyLastError could make use of the returned, while other callers could just pass "&dummy".
In the long run, it's a somewhat interesting corner case problem and we may have some varying opinions over the "best way" to resolve. Suffice to say, libvirt probably isn't going to "survive" much longer, but unless others read this response and have varying opinions and/or we come to a resolution - can this be posted/consider on it's own merits?> Tks -
John
May be it would be just more simple to not to use mon->lastError as flag in monitor.
+ mon->lastError.code = VIR_ERR_INTERNAL_ERROR; + virResetLastError(); }

On 05/03/2018 03:35 AM, Nikolay Shirokovskiy wrote:
On 01.05.2018 01:03, John Ferlan wrote:
On 04/18/2018 10:44 AM, Nikolay Shirokovskiy wrote:
lastError.code is used as flag in qemu monitor. Unfortunately due to temporary OOM conditions (very unlikely through as thread local error is allocated on first use) we can fail to set this flag in case of monitor eofs/errors. This can cause hangs.
Let's make sure flag is always set in case eofs/errors.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com> --- src/qemu/qemu_monitor.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c index f642d9a..906a320 100644 --- a/src/qemu/qemu_monitor.c +++ b/src/qemu/qemu_monitor.c @@ -757,6 +757,11 @@ qemuMonitorIO(int watch, int fd, int events, void *opaque) virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("Error while processing monitor IO")); virCopyLastError(&mon->lastError); + + /* set error code if due to OOM conditions we fail to set it before */ + if (mon->lastError.code == VIR_ERR_OK)
So this can only happen if virCopyLastError fails, right? And code == 0
Nope, virCopyLastError fail to set code field only in one case - if last error is NULL which can be only if in next lines even reporting fails:
virErrorPtr err = virGetLastError(); if (!err) virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("Error while processing monitor IO"));
I read the code as if pthread_getspecific in virThreadLocalGet returns NULL, then virLastErrorObject will try to VIR_ALLOC_QUIET and return NULL if it fails. If that succeeds, then virThreadLocalSet is called to pthread_setspecific and if it fails, then @err (from VIR_ALLOC_QUIET) is VIR_FREE'd and a NULL is returned. So NULL could be returned if VIR_ALLOC_QUIET or virThreadLocalSet fail.
because of memset(..., 0, ...). Wouldn't you only be working around the issue for one of the callers? and only setting *.code and not *.domain?
We use mon->lastError for 2 purposes:
- as flag, and it is critical that we set this flag appropriately. We use only code field for this purpuse
Understood, but still ->domain would be 0; however, if you set it to something (such as VIR_FROM_THREAD), then perhaps that too could be used as a marker of sorts (e.g. code == value, thread == value, and message == NULL means something fairly specific).
- as error message that client can retrieve in case if API call returns error, it is no critical if we can not provide error message, moreover we can fail to pass error message in RPC for example due to temporary OOM conditions
I even think to remove flag usage from mon->lastError and introduce bool flag but this looks ugly like 2 fields to keep error.
In any case, perhaps you misinterpreted my remark. Only adjusting this one caller fixes one condition, but if there are possibly more then changing virCopyLastError instead of here would allow those others to also have the adjustment...
BTW: I feel we've been this way in rather recent history... See commit 'e711a391' - I recall we had varying opinions on that one.
Yes, but this fix does not relate to usage mon->lastError as flag, rather then to second usage in the above distinction.
Anyway, assuming CopyLastError is the issue, then perhaps the right place to fix this is there and I would think at least the following would need to be set in the error path:
to->code = VIR_ERR_NO_MEMORY; to->domain = VIR_FROM_THREAD;
It can be misleading. For example if we forget to set last error and then try to copy it.
We cannot create a to->message...
Of course, how can we be "sure" that the failure is because the VIR_ALLOC_QUIET in virLastErrorObject caused the issue other than changing it's return values or passing a "bool &memfail" that's only set when the VIR_ALLOC_QUIET fails allowing the caller to decide what to do. It's local, so change is limited. Thus virGetLastErrorMessage and virCopyLastError could make use of the returned, while other callers could just pass "&dummy".
In the long run, it's a somewhat interesting corner case problem and we may have some varying opinions over the "best way" to resolve. Suffice to say, libvirt probably isn't going to "survive" much longer, but unless others read this response and have varying opinions and/or we come to a resolution - can this be posted/consider on it's own merits?> Tks -
John
May be it would be just more simple to not to use mon->lastError as flag in monitor.
I think it's still possible. You've found two conditions where usage was problematic (low memory and failure to set thread value). So the patch isn't without merit. John
+ mon->lastError.code = VIR_ERR_INTERNAL_ERROR; + virResetLastError(); }

Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com> --- src/libvirt_private.syms | 1 + src/util/virerror.c | 12 +++++++++--- src/util/virerror.h | 1 + 3 files changed, 11 insertions(+), 3 deletions(-) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index b31f599..6bbbf77 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1702,6 +1702,7 @@ ebtablesRemoveForwardAllowIn; # util/virerror.h +virCopyError; virDispatchError; virErrorCopyNew; virErrorInitialize; diff --git a/src/util/virerror.c b/src/util/virerror.c index c000b00..2ff8a3e 100644 --- a/src/util/virerror.c +++ b/src/util/virerror.c @@ -188,10 +188,16 @@ virErrorGenericFailure(virErrorPtr err) } -/* - * Internal helper to perform a deep copy of an error +/** + * virCopyError: + * @from: error to copy from + * @to: error to copy to + * + * Copy error fields from @from to @to. + * + * Returns 0 on success, -1 on failure. */ -static int +int virCopyError(virErrorPtr from, virErrorPtr to) { diff --git a/src/util/virerror.h b/src/util/virerror.h index 760bfa4..90ac619 100644 --- a/src/util/virerror.h +++ b/src/util/virerror.h @@ -192,6 +192,7 @@ void virReportOOMErrorFull(int domcode, int virSetError(virErrorPtr newerr); virErrorPtr virErrorCopyNew(virErrorPtr err); +int virCopyError(virErrorPtr from, virErrorPtr to); void virDispatchError(virConnectPtr conn); const char *virStrerror(int theerrno, char *errBuf, size_t errBufLen); -- 1.8.3.1

On 04/18/2018 10:44 AM, Nikolay Shirokovskiy wrote:
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com> --- src/libvirt_private.syms | 1 + src/util/virerror.c | 12 +++++++++--- src/util/virerror.h | 1 + 3 files changed, 11 insertions(+), 3 deletions(-)
NACK, you should be using virErrorCopyNew John
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index b31f599..6bbbf77 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1702,6 +1702,7 @@ ebtablesRemoveForwardAllowIn;
# util/virerror.h +virCopyError; virDispatchError; virErrorCopyNew; virErrorInitialize; diff --git a/src/util/virerror.c b/src/util/virerror.c index c000b00..2ff8a3e 100644 --- a/src/util/virerror.c +++ b/src/util/virerror.c @@ -188,10 +188,16 @@ virErrorGenericFailure(virErrorPtr err) }
-/* - * Internal helper to perform a deep copy of an error +/** + * virCopyError: + * @from: error to copy from + * @to: error to copy to + * + * Copy error fields from @from to @to. + * + * Returns 0 on success, -1 on failure. */ -static int +int virCopyError(virErrorPtr from, virErrorPtr to) { diff --git a/src/util/virerror.h b/src/util/virerror.h index 760bfa4..90ac619 100644 --- a/src/util/virerror.h +++ b/src/util/virerror.h @@ -192,6 +192,7 @@ void virReportOOMErrorFull(int domcode,
int virSetError(virErrorPtr newerr); virErrorPtr virErrorCopyNew(virErrorPtr err); +int virCopyError(virErrorPtr from, virErrorPtr to); void virDispatchError(virConnectPtr conn); const char *virStrerror(int theerrno, char *errBuf, size_t errBufLen);

On 01.05.2018 01:03, John Ferlan wrote:
On 04/18/2018 10:44 AM, Nikolay Shirokovskiy wrote:
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com> --- src/libvirt_private.syms | 1 + src/util/virerror.c | 12 +++++++++--- src/util/virerror.h | 1 + 3 files changed, 11 insertions(+), 3 deletions(-)
NACK, you should be using virErrorCopyNew
John
I introduced monError in qemuDomainObjPrivatePtr in next patch and it is not a pointer so I need this function. I did not make monError pointer for the same reason it is not pointer in monitor object - I use monError both as flag and as error message.
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index b31f599..6bbbf77 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1702,6 +1702,7 @@ ebtablesRemoveForwardAllowIn;
# util/virerror.h +virCopyError; virDispatchError; virErrorCopyNew; virErrorInitialize; diff --git a/src/util/virerror.c b/src/util/virerror.c index c000b00..2ff8a3e 100644 --- a/src/util/virerror.c +++ b/src/util/virerror.c @@ -188,10 +188,16 @@ virErrorGenericFailure(virErrorPtr err) }
-/* - * Internal helper to perform a deep copy of an error +/** + * virCopyError: + * @from: error to copy from + * @to: error to copy to + * + * Copy error fields from @from to @to. + * + * Returns 0 on success, -1 on failure. */ -static int +int virCopyError(virErrorPtr from, virErrorPtr to) { diff --git a/src/util/virerror.h b/src/util/virerror.h index 760bfa4..90ac619 100644 --- a/src/util/virerror.h +++ b/src/util/virerror.h @@ -192,6 +192,7 @@ void virReportOOMErrorFull(int domcode,
int virSetError(virErrorPtr newerr); virErrorPtr virErrorCopyNew(virErrorPtr err); +int virCopyError(virErrorPtr from, virErrorPtr to); void virDispatchError(virConnectPtr conn); const char *virStrerror(int theerrno, char *errBuf, size_t errBufLen);

On 05/03/2018 03:39 AM, Nikolay Shirokovskiy wrote:
On 01.05.2018 01:03, John Ferlan wrote:
On 04/18/2018 10:44 AM, Nikolay Shirokovskiy wrote:
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com> --- src/libvirt_private.syms | 1 + src/util/virerror.c | 12 +++++++++--- src/util/virerror.h | 1 + 3 files changed, 11 insertions(+), 3 deletions(-)
NACK, you should be using virErrorCopyNew
John
I introduced monError in qemuDomainObjPrivatePtr in next patch and it is not a pointer so I need this function. I did not make monError pointer for the same reason it is not pointer in monitor object - I use monError both as flag and as error message.
I saw what you did - the fact is virCopyError is coded as private to the module. Just making it global because you have a use for it is I believe incorrect. Ironically in the next patch you have: + virErrorPtr err = qemuMonitorLastError(mon); + + virCopyError(err, &priv->monError); The first call isn't guaranteed to set @err and all you're essentially doing is wanting to copy from mon->lastError to priv->monError or perhaps similar to virCopyLastError. Maybe some new API in virerror.c should handle that. John
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index b31f599..6bbbf77 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1702,6 +1702,7 @@ ebtablesRemoveForwardAllowIn;
# util/virerror.h +virCopyError; virDispatchError; virErrorCopyNew; virErrorInitialize; diff --git a/src/util/virerror.c b/src/util/virerror.c index c000b00..2ff8a3e 100644 --- a/src/util/virerror.c +++ b/src/util/virerror.c @@ -188,10 +188,16 @@ virErrorGenericFailure(virErrorPtr err) }
-/* - * Internal helper to perform a deep copy of an error +/** + * virCopyError: + * @from: error to copy from + * @to: error to copy to + * + * Copy error fields from @from to @to. + * + * Returns 0 on success, -1 on failure. */ -static int +int virCopyError(virErrorPtr from, virErrorPtr to) { diff --git a/src/util/virerror.h b/src/util/virerror.h index 760bfa4..90ac619 100644 --- a/src/util/virerror.h +++ b/src/util/virerror.h @@ -192,6 +192,7 @@ void virReportOOMErrorFull(int domcode,
int virSetError(virErrorPtr newerr); virErrorPtr virErrorCopyNew(virErrorPtr err); +int virCopyError(virErrorPtr from, virErrorPtr to); void virDispatchError(virConnectPtr conn); const char *virStrerror(int theerrno, char *errBuf, size_t errBufLen);

On 04.05.2018 16:58, John Ferlan wrote:
On 05/03/2018 03:39 AM, Nikolay Shirokovskiy wrote:
On 01.05.2018 01:03, John Ferlan wrote:
On 04/18/2018 10:44 AM, Nikolay Shirokovskiy wrote:
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com> --- src/libvirt_private.syms | 1 + src/util/virerror.c | 12 +++++++++--- src/util/virerror.h | 1 + 3 files changed, 11 insertions(+), 3 deletions(-)
NACK, you should be using virErrorCopyNew
John
I introduced monError in qemuDomainObjPrivatePtr in next patch and it is not a pointer so I need this function. I did not make monError pointer for the same reason it is not pointer in monitor object - I use monError both as flag and as error message.
I saw what you did - the fact is virCopyError is coded as private to the module. Just making it global because you have a use for it is I believe incorrect.
But why?
Ironically in the next patch you have:
+ virErrorPtr err = qemuMonitorLastError(mon); + + virCopyError(err, &priv->monError);
The first call isn't guaranteed to set @err and all you're essentially doing is wanting to copy from mon->lastError to priv->monError or perhaps similar to virCopyLastError.
It is ok that error can turn from non-NULL to NULL on copy due to OOM conditions or whaever. It is just as in previous patch. We use priv->monError for 2 purpuses. And thus qemuProcessNotifyMonitorError sets priv->monError.code explicitly if it still set to VIR_ERR_OK after copy.
Maybe some new API in virerror.c should handle that.
Not sure we need it at this point. But may be I miss something, please share your vision in more detail then. Nikolay
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index b31f599..6bbbf77 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1702,6 +1702,7 @@ ebtablesRemoveForwardAllowIn;
# util/virerror.h +virCopyError; virDispatchError; virErrorCopyNew; virErrorInitialize; diff --git a/src/util/virerror.c b/src/util/virerror.c index c000b00..2ff8a3e 100644 --- a/src/util/virerror.c +++ b/src/util/virerror.c @@ -188,10 +188,16 @@ virErrorGenericFailure(virErrorPtr err) }
-/* - * Internal helper to perform a deep copy of an error +/** + * virCopyError: + * @from: error to copy from + * @to: error to copy to + * + * Copy error fields from @from to @to. + * + * Returns 0 on success, -1 on failure. */ -static int +int virCopyError(virErrorPtr from, virErrorPtr to) { diff --git a/src/util/virerror.h b/src/util/virerror.h index 760bfa4..90ac619 100644 --- a/src/util/virerror.h +++ b/src/util/virerror.h @@ -192,6 +192,7 @@ void virReportOOMErrorFull(int domcode,
int virSetError(virErrorPtr newerr); virErrorPtr virErrorCopyNew(virErrorPtr err); +int virCopyError(virErrorPtr from, virErrorPtr to); void virDispatchError(virConnectPtr conn); const char *virStrerror(int theerrno, char *errBuf, size_t errBufLen);

On 05/07/2018 04:33 AM, Nikolay Shirokovskiy wrote:
On 04.05.2018 16:58, John Ferlan wrote:
On 05/03/2018 03:39 AM, Nikolay Shirokovskiy wrote:
On 01.05.2018 01:03, John Ferlan wrote:
On 04/18/2018 10:44 AM, Nikolay Shirokovskiy wrote:
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com> --- src/libvirt_private.syms | 1 + src/util/virerror.c | 12 +++++++++--- src/util/virerror.h | 1 + 3 files changed, 11 insertions(+), 3 deletions(-)
NACK, you should be using virErrorCopyNew
John
I introduced monError in qemuDomainObjPrivatePtr in next patch and it is not a pointer so I need this function. I did not make monError pointer for the same reason it is not pointer in monitor object - I use monError both as flag and as error message.
I saw what you did - the fact is virCopyError is coded as private to the module. Just making it global because you have a use for it is I believe incorrect.
But why?
Because virErrorCopyNew is designated to "externally" use virCopyError.
Ironically in the next patch you have:
+ virErrorPtr err = qemuMonitorLastError(mon); + + virCopyError(err, &priv->monError);
The first call isn't guaranteed to set @err and all you're essentially doing is wanting to copy from mon->lastError to priv->monError or perhaps similar to virCopyLastError.
It is ok that error can turn from non-NULL to NULL on copy due to OOM conditions or whaever. It is just as in previous patch. We use priv->monError for 2 purpuses. And thus qemuProcessNotifyMonitorError sets priv->monError.code explicitly if it still set to VIR_ERR_OK after copy.
It's "possible" from the above code that @priv->monError would have nothing filled in based on virCopyError logic, so how is that better than what's being done now? That's why I figured that changing the innards of virCopyLastError would be more beneficial in the long run.
Maybe some new API in virerror.c should handle that.
Not sure we need it at this point. But may be I miss something, please share your vision in more detail then.
It's not my patch - I'll review whatever comes next. I've provided suggestions and comments. John

Block job abort operation can not handle properly qemu crashes when waiting for abort/pivot completion. Deadlock scenario is next: - qemuDomainBlockJobAbort waits for pivot/abort completion - qemu crashes, then qemuProcessBeginStopJob broadcasts for VM condition and then waits for job condition (taken by qemuDomainBlockJobAbort) - qemuDomainBlockJobAbort awakes but nothing really changed, VM is still active (vm->def->id != -1) so thread starts waiting for completion again. Now two threads are in deadlock. First let's add some condition besides domain active status so that waiting thread can check it when awakes. Second let's move signalling to the place where condition is set, namely monitor eof/error handlers. Having signalling in qemuProcessBeginStopJob is not useful. The patch copies monitor error to domain state because at time waiting thread awakes there can be no monitor and it is useful to report monitor error to user. The patch has a drawback that on destroying a domain when waiting for event from monitor we get not very convinient error message like 'EOF from monitor' from waiting API. On the other hand if qemu crashes this is more useful then 'domain is not running'. The first case will be addressed in another patch. The patch also fixes other places where we wait for event from qemu. Namely handling device removal and tray ejection. Other places which used virDomainObjWait (dump, migration, save) were good because of async jobs which allow concurrent destroy job. Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com> --- src/conf/domain_conf.c | 43 ------------------------------------------- src/conf/domain_conf.h | 3 --- src/libvirt_private.syms | 2 -- src/qemu/qemu_domain.c | 45 +++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_domain.h | 5 ++++- src/qemu/qemu_driver.c | 6 +++--- src/qemu/qemu_hotplug.c | 4 ++-- src/qemu/qemu_migration.c | 12 ++++++------ src/qemu/qemu_process.c | 27 ++++++++++++++++++++++----- 9 files changed, 82 insertions(+), 65 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 5f1af91..e046008 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -3249,49 +3249,6 @@ virDomainObjBroadcast(virDomainObjPtr vm) } -int -virDomainObjWait(virDomainObjPtr vm) -{ - if (virCondWait(&vm->cond, &vm->parent.lock) < 0) { - virReportSystemError(errno, "%s", - _("failed to wait for domain condition")); - return -1; - } - - if (!virDomainObjIsActive(vm)) { - virReportError(VIR_ERR_OPERATION_FAILED, "%s", - _("domain is not running")); - return -1; - } - - return 0; -} - - -/** - * Waits for domain condition to be triggered for a specific period of time. - * - * Returns: - * -1 in case of error - * 0 on success - * 1 on timeout - */ -int -virDomainObjWaitUntil(virDomainObjPtr vm, - unsigned long long whenms) -{ - if (virCondWaitUntil(&vm->cond, &vm->parent.lock, whenms) < 0) { - if (errno != ETIMEDOUT) { - virReportSystemError(errno, "%s", - _("failed to wait for domain condition")); - return -1; - } - return 1; - } - return 0; -} - - /* * Mark the current VM config as transient. Ensures transient hotplug * operations do not persist past shutdown. diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 122a051..b4d5bdb 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -2747,9 +2747,6 @@ bool virDomainObjTaint(virDomainObjPtr obj, virDomainTaintFlags taint); void virDomainObjBroadcast(virDomainObjPtr vm); -int virDomainObjWait(virDomainObjPtr vm); -int virDomainObjWaitUntil(virDomainObjPtr vm, - unsigned long long whenms); void virDomainPanicDefFree(virDomainPanicDefPtr panic); void virDomainResourceDefFree(virDomainResourceDefPtr resource); diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 6bbbf77..ed5498e 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -493,8 +493,6 @@ virDomainObjSetMetadata; virDomainObjSetState; virDomainObjTaint; virDomainObjUpdateModificationImpact; -virDomainObjWait; -virDomainObjWaitUntil; virDomainOSTypeFromString; virDomainOSTypeToString; virDomainParseMemory; diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 672f08b..1f40ff1 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -1909,6 +1909,7 @@ qemuDomainObjPrivateFree(void *data) qemuDomainObjFreeJob(priv); VIR_FREE(priv->lockState); VIR_FREE(priv->origname); + virResetError(&priv->monError); virChrdevFree(priv->devs); @@ -11881,3 +11882,47 @@ qemuProcessEventFree(struct qemuProcessEvent *event) } VIR_FREE(event); } + + +/** + * Waits for domain condition to be triggered for a specific period of time. + * if @until is 0 then waits indefinetely. + * + * Returns: + * -1 on error + * 0 on success + * 1 on timeout + */ +int +qemuDomainObjWait(virDomainObjPtr vm, unsigned long long until) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + int rc; + + if (until) + rc = virCondWaitUntil(&vm->cond, &vm->parent.lock, until); + else + rc = virCondWait(&vm->cond, &vm->parent.lock); + + if (rc < 0) { + if (until && errno == ETIMEDOUT) + return 1; + + virReportSystemError(errno, "%s", + _("failed to wait for domain condition")); + return -1; + } + + if (!virDomainObjIsActive(vm)) { + virReportError(VIR_ERR_OPERATION_FAILED, "%s", + _("domain is not running")); + return -1; + } + + if (priv->monError.code != VIR_ERR_OK) { + virSetError(&priv->monError); + return -1; + } + + return 0; +} diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 40d9a6f..494ed35 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -262,7 +262,7 @@ struct _qemuDomainObjPrivate { qemuMonitorPtr mon; virDomainChrSourceDefPtr monConfig; bool monJSON; - bool monError; + virError monError; unsigned long long monStart; qemuAgentPtr agent; @@ -994,4 +994,7 @@ qemuDomainPrepareDiskSource(virDomainDiskDefPtr disk, qemuDomainObjPrivatePtr priv, virQEMUDriverConfigPtr cfg); +int +qemuDomainObjWait(virDomainObjPtr vm, unsigned long long until); + #endif /* __QEMU_DOMAIN_H__ */ diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 0dd6032..03969d8 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -2727,7 +2727,7 @@ qemuDomainGetControlInfo(virDomainPtr dom, memset(info, 0, sizeof(*info)); - if (priv->monError) { + if (priv->monError.code != VIR_ERR_OK) { info->state = VIR_DOMAIN_CONTROL_ERROR; info->details = VIR_DOMAIN_CONTROL_ERROR_REASON_MONITOR; } else if (priv->job.active) { @@ -3726,7 +3726,7 @@ qemuDumpWaitForCompletion(virDomainObjPtr vm) VIR_DEBUG("Waiting for dump completion"); while (!priv->job.dumpCompleted && !priv->job.abortJob) { - if (virDomainObjWait(vm) < 0) + if (qemuDomainObjWait(vm, 0) < 0) return -1; } @@ -16924,7 +16924,7 @@ qemuDomainBlockJobAbort(virDomainPtr dom, qemuDomainDiskPrivatePtr diskPriv = QEMU_DOMAIN_DISK_PRIVATE(disk); qemuBlockJobUpdate(driver, vm, QEMU_ASYNC_JOB_NONE, disk, NULL); while (diskPriv->blockjob) { - if (virDomainObjWait(vm) < 0) { + if (qemuDomainObjWait(vm, 0) < 0) { ret = -1; goto endjob; } diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index 8d3191f..527ce3f 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -206,7 +206,7 @@ qemuHotplugWaitForTrayEject(virQEMUDriverPtr driver, return -1; while (disk->tray_status != VIR_DOMAIN_DISK_TRAY_OPEN) { - if ((rc = virDomainObjWaitUntil(vm, now + CHANGE_MEDIA_TIMEOUT)) < 0) + if ((rc = qemuDomainObjWait(vm, now + CHANGE_MEDIA_TIMEOUT)) < 0) return -1; if (rc > 0) { @@ -4602,7 +4602,7 @@ qemuDomainWaitForDeviceRemoval(virDomainObjPtr vm) until += qemuDomainRemoveDeviceWaitTime; while (priv->unplug.alias) { - if ((rc = virDomainObjWaitUntil(vm, until)) == 1) + if ((rc = qemuDomainObjWait(vm, until)) == 1) return 0; if (rc < 0) { diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 88b8253..82e1c7f 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -738,7 +738,7 @@ qemuMigrationSrcCancelDriveMirror(virQEMUDriverPtr driver, if (failed && !err) err = virSaveLastError(); - if (virDomainObjWait(vm) < 0) + if (qemuDomainObjWait(vm, 0) < 0) goto cleanup; } @@ -877,7 +877,7 @@ qemuMigrationSrcDriveMirror(virQEMUDriverPtr driver, goto cleanup; } - if (virDomainObjWait(vm) < 0) + if (qemuDomainObjWait(vm, 0) < 0) goto cleanup; } @@ -1181,7 +1181,7 @@ qemuMigrationSrcWaitForSpice(virDomainObjPtr vm) VIR_DEBUG("Waiting for SPICE to finish migration"); while (!priv->job.spiceMigrated && !priv->job.abortJob) { - if (virDomainObjWait(vm) < 0) + if (qemuDomainObjWait(vm, 0) < 0) return -1; } return 0; @@ -1460,7 +1460,7 @@ qemuMigrationSrcWaitForCompletion(virQEMUDriverPtr driver, return rv; if (events) { - if (virDomainObjWait(vm) < 0) { + if (qemuDomainObjWait(vm, 0) < 0) { jobInfo->status = QEMU_DOMAIN_JOB_STATUS_FAILED; return -2; } @@ -1513,7 +1513,7 @@ qemuMigrationDstWaitForCompletion(virQEMUDriverPtr driver, while ((rv = qemuMigrationAnyCompleted(driver, vm, asyncJob, NULL, flags)) != 1) { - if (rv < 0 || virDomainObjWait(vm) < 0) + if (rv < 0 || qemuDomainObjWait(vm, 0) < 0) return -1; } @@ -3464,7 +3464,7 @@ qemuMigrationSrcRun(virQEMUDriverPtr driver, if (priv->monJSON) { while (virDomainObjGetState(vm, NULL) == VIR_DOMAIN_RUNNING) { priv->signalStop = true; - rc = virDomainObjWait(vm); + rc = qemuDomainObjWait(vm, 0); priv->signalStop = false; if (rc < 0) goto error; diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 6a5262a..d76809e 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -268,6 +268,23 @@ qemuConnectAgent(virQEMUDriverPtr driver, virDomainObjPtr vm) return 0; } +static void +qemuProcessNotifyMonitorError(virDomainObjPtr vm, + qemuMonitorPtr mon) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + virErrorPtr err = qemuMonitorLastError(mon); + + virCopyError(err, &priv->monError); + + /* set error code if due to OOM conditions we fail to set it before */ + if (priv->monError.code == VIR_ERR_OK) + priv->monError.code = VIR_ERR_INTERNAL_ERROR; + + /* Wake up anything waiting for events from monitor */ + virDomainObjBroadcast(vm); + virFreeError(err); +} /* * This is a callback registered with a qemuMonitorPtr instance, @@ -286,6 +303,8 @@ qemuProcessHandleMonitorEOF(qemuMonitorPtr mon, virObjectLock(vm); + qemuProcessNotifyMonitorError(vm, mon); + VIR_DEBUG("Received EOF on %p '%s'", vm, vm->def->name); priv = vm->privateData; @@ -338,7 +357,8 @@ qemuProcessHandleMonitorError(qemuMonitorPtr mon ATTRIBUTE_UNUSED, virObjectLock(vm); - ((qemuDomainObjPrivatePtr) vm->privateData)->monError = true; + qemuProcessNotifyMonitorError(vm, mon); + event = virDomainEventControlErrorNewFromObj(vm); qemuDomainEventQueue(driver, event); @@ -5727,7 +5747,7 @@ qemuProcessPrepareDomain(virQEMUDriverPtr driver, goto cleanup; priv->monJSON = true; - priv->monError = false; + virResetError(&priv->monError); priv->monStart = 0; priv->gotShutdown = false; @@ -6483,9 +6503,6 @@ qemuProcessBeginStopJob(virQEMUDriverPtr driver, if (qemuProcessKill(vm, killFlags) < 0) goto cleanup; - /* Wake up anything waiting on domain condition */ - virDomainObjBroadcast(vm); - if (qemuDomainObjBeginJob(driver, vm, job) < 0) goto cleanup; -- 1.8.3.1

On 04/18/2018 10:44 AM, Nikolay Shirokovskiy wrote:
Block job abort operation can not handle properly qemu crashes when waiting for abort/pivot completion. Deadlock scenario is next:
- qemuDomainBlockJobAbort waits for pivot/abort completion - qemu crashes, then qemuProcessBeginStopJob broadcasts for VM condition and then waits for job condition (taken by qemuDomainBlockJobAbort) - qemuDomainBlockJobAbort awakes but nothing really changed, VM is still active (vm->def->id != -1) so thread starts waiting for completion again. Now two threads are in deadlock.
First let's add some condition besides domain active status so that waiting thread can check it when awakes. Second let's move signalling to the place where condition is set, namely monitor eof/error handlers. Having signalling in qemuProcessBeginStopJob is not useful.
The patch copies monitor error to domain state because at time waiting thread awakes there can be no monitor and it is useful to report monitor error to user.
The patch has a drawback that on destroying a domain when waiting for event from monitor we get not very convinient error message like
convenient
'EOF from monitor' from waiting API. On the other hand if qemu crashes this is more useful then 'domain is not running'. The first case will be addressed in another patch.
The patch also fixes other places where we wait for event from qemu. Namely handling device removal and tray ejection. Other places which used virDomainObjWait (dump, migration, save) were good because of async jobs which allow concurrent destroy job.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com> --- src/conf/domain_conf.c | 43 ------------------------------------------- src/conf/domain_conf.h | 3 --- src/libvirt_private.syms | 2 -- src/qemu/qemu_domain.c | 45 +++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_domain.h | 5 ++++- src/qemu/qemu_driver.c | 6 +++--- src/qemu/qemu_hotplug.c | 4 ++-- src/qemu/qemu_migration.c | 12 ++++++------ src/qemu/qemu_process.c | 27 ++++++++++++++++++++++----- 9 files changed, 82 insertions(+), 65 deletions(-)
This no longer git am -3 applies and based on previous patch reviews, I think perhaps this needs to be redone. I don't believe moving/renaming virDomainObjWait is good/right, but I'm sure there would be other opinions on that. Yes, QEMU is currently the only consumer... If it were to move, it should move to virdomainobjlist since that's where innards of virDomainObjPtr are managed. The fact that we look at ->parent.lock, well, <sigh>... Anyway, you're attempting to special case something and perhaps you just need to create a qemuDomainObjWait that would call the timeout version of the virDomainObjWait and be able to handle whether some other error occurred. Wouldn't the LastError before the current wait return NULL (as in no error), then during the timeout period if LastError is something couldn't that indicate the failure you're looking for. I didn't spend a lot of time thinking about alternatives and how the code should change, but when you mention the patch has a drawback - that immediately raises concern. John
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 5f1af91..e046008 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -3249,49 +3249,6 @@ virDomainObjBroadcast(virDomainObjPtr vm) }
-int -virDomainObjWait(virDomainObjPtr vm) -{ - if (virCondWait(&vm->cond, &vm->parent.lock) < 0) { - virReportSystemError(errno, "%s", - _("failed to wait for domain condition")); - return -1; - } - - if (!virDomainObjIsActive(vm)) { - virReportError(VIR_ERR_OPERATION_FAILED, "%s", - _("domain is not running")); - return -1; - } - - return 0; -} - - -/** - * Waits for domain condition to be triggered for a specific period of time. - * - * Returns: - * -1 in case of error - * 0 on success - * 1 on timeout - */ -int -virDomainObjWaitUntil(virDomainObjPtr vm, - unsigned long long whenms) -{ - if (virCondWaitUntil(&vm->cond, &vm->parent.lock, whenms) < 0) { - if (errno != ETIMEDOUT) { - virReportSystemError(errno, "%s", - _("failed to wait for domain condition")); - return -1; - } - return 1; - } - return 0; -} - - /* * Mark the current VM config as transient. Ensures transient hotplug * operations do not persist past shutdown. diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 122a051..b4d5bdb 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -2747,9 +2747,6 @@ bool virDomainObjTaint(virDomainObjPtr obj, virDomainTaintFlags taint);
void virDomainObjBroadcast(virDomainObjPtr vm); -int virDomainObjWait(virDomainObjPtr vm); -int virDomainObjWaitUntil(virDomainObjPtr vm, - unsigned long long whenms);
void virDomainPanicDefFree(virDomainPanicDefPtr panic); void virDomainResourceDefFree(virDomainResourceDefPtr resource); diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 6bbbf77..ed5498e 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -493,8 +493,6 @@ virDomainObjSetMetadata; virDomainObjSetState; virDomainObjTaint; virDomainObjUpdateModificationImpact; -virDomainObjWait; -virDomainObjWaitUntil; virDomainOSTypeFromString; virDomainOSTypeToString; virDomainParseMemory; diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 672f08b..1f40ff1 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -1909,6 +1909,7 @@ qemuDomainObjPrivateFree(void *data) qemuDomainObjFreeJob(priv); VIR_FREE(priv->lockState); VIR_FREE(priv->origname); + virResetError(&priv->monError);
virChrdevFree(priv->devs);
@@ -11881,3 +11882,47 @@ qemuProcessEventFree(struct qemuProcessEvent *event) } VIR_FREE(event); } + + +/** + * Waits for domain condition to be triggered for a specific period of time. + * if @until is 0 then waits indefinetely. + * + * Returns: + * -1 on error + * 0 on success + * 1 on timeout + */ +int +qemuDomainObjWait(virDomainObjPtr vm, unsigned long long until) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + int rc; + + if (until) + rc = virCondWaitUntil(&vm->cond, &vm->parent.lock, until); + else + rc = virCondWait(&vm->cond, &vm->parent.lock); + + if (rc < 0) { + if (until && errno == ETIMEDOUT) + return 1; + + virReportSystemError(errno, "%s", + _("failed to wait for domain condition")); + return -1; + } + + if (!virDomainObjIsActive(vm)) { + virReportError(VIR_ERR_OPERATION_FAILED, "%s", + _("domain is not running")); + return -1; + } + + if (priv->monError.code != VIR_ERR_OK) { + virSetError(&priv->monError); + return -1; + } + + return 0; +} diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 40d9a6f..494ed35 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -262,7 +262,7 @@ struct _qemuDomainObjPrivate { qemuMonitorPtr mon; virDomainChrSourceDefPtr monConfig; bool monJSON; - bool monError; + virError monError; unsigned long long monStart;
qemuAgentPtr agent; @@ -994,4 +994,7 @@ qemuDomainPrepareDiskSource(virDomainDiskDefPtr disk, qemuDomainObjPrivatePtr priv, virQEMUDriverConfigPtr cfg);
+int +qemuDomainObjWait(virDomainObjPtr vm, unsigned long long until); + #endif /* __QEMU_DOMAIN_H__ */ diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 0dd6032..03969d8 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -2727,7 +2727,7 @@ qemuDomainGetControlInfo(virDomainPtr dom,
memset(info, 0, sizeof(*info));
- if (priv->monError) { + if (priv->monError.code != VIR_ERR_OK) { info->state = VIR_DOMAIN_CONTROL_ERROR; info->details = VIR_DOMAIN_CONTROL_ERROR_REASON_MONITOR; } else if (priv->job.active) { @@ -3726,7 +3726,7 @@ qemuDumpWaitForCompletion(virDomainObjPtr vm)
VIR_DEBUG("Waiting for dump completion"); while (!priv->job.dumpCompleted && !priv->job.abortJob) { - if (virDomainObjWait(vm) < 0) + if (qemuDomainObjWait(vm, 0) < 0) return -1; }
@@ -16924,7 +16924,7 @@ qemuDomainBlockJobAbort(virDomainPtr dom, qemuDomainDiskPrivatePtr diskPriv = QEMU_DOMAIN_DISK_PRIVATE(disk); qemuBlockJobUpdate(driver, vm, QEMU_ASYNC_JOB_NONE, disk, NULL); while (diskPriv->blockjob) { - if (virDomainObjWait(vm) < 0) { + if (qemuDomainObjWait(vm, 0) < 0) { ret = -1; goto endjob; } diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index 8d3191f..527ce3f 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -206,7 +206,7 @@ qemuHotplugWaitForTrayEject(virQEMUDriverPtr driver, return -1;
while (disk->tray_status != VIR_DOMAIN_DISK_TRAY_OPEN) { - if ((rc = virDomainObjWaitUntil(vm, now + CHANGE_MEDIA_TIMEOUT)) < 0) + if ((rc = qemuDomainObjWait(vm, now + CHANGE_MEDIA_TIMEOUT)) < 0) return -1;
if (rc > 0) { @@ -4602,7 +4602,7 @@ qemuDomainWaitForDeviceRemoval(virDomainObjPtr vm) until += qemuDomainRemoveDeviceWaitTime;
while (priv->unplug.alias) { - if ((rc = virDomainObjWaitUntil(vm, until)) == 1) + if ((rc = qemuDomainObjWait(vm, until)) == 1) return 0;
if (rc < 0) { diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 88b8253..82e1c7f 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -738,7 +738,7 @@ qemuMigrationSrcCancelDriveMirror(virQEMUDriverPtr driver, if (failed && !err) err = virSaveLastError();
- if (virDomainObjWait(vm) < 0) + if (qemuDomainObjWait(vm, 0) < 0) goto cleanup; }
@@ -877,7 +877,7 @@ qemuMigrationSrcDriveMirror(virQEMUDriverPtr driver, goto cleanup; }
- if (virDomainObjWait(vm) < 0) + if (qemuDomainObjWait(vm, 0) < 0) goto cleanup; }
@@ -1181,7 +1181,7 @@ qemuMigrationSrcWaitForSpice(virDomainObjPtr vm)
VIR_DEBUG("Waiting for SPICE to finish migration"); while (!priv->job.spiceMigrated && !priv->job.abortJob) { - if (virDomainObjWait(vm) < 0) + if (qemuDomainObjWait(vm, 0) < 0) return -1; } return 0; @@ -1460,7 +1460,7 @@ qemuMigrationSrcWaitForCompletion(virQEMUDriverPtr driver, return rv;
if (events) { - if (virDomainObjWait(vm) < 0) { + if (qemuDomainObjWait(vm, 0) < 0) { jobInfo->status = QEMU_DOMAIN_JOB_STATUS_FAILED; return -2; } @@ -1513,7 +1513,7 @@ qemuMigrationDstWaitForCompletion(virQEMUDriverPtr driver,
while ((rv = qemuMigrationAnyCompleted(driver, vm, asyncJob, NULL, flags)) != 1) { - if (rv < 0 || virDomainObjWait(vm) < 0) + if (rv < 0 || qemuDomainObjWait(vm, 0) < 0) return -1; }
@@ -3464,7 +3464,7 @@ qemuMigrationSrcRun(virQEMUDriverPtr driver, if (priv->monJSON) { while (virDomainObjGetState(vm, NULL) == VIR_DOMAIN_RUNNING) { priv->signalStop = true; - rc = virDomainObjWait(vm); + rc = qemuDomainObjWait(vm, 0); priv->signalStop = false; if (rc < 0) goto error; diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 6a5262a..d76809e 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -268,6 +268,23 @@ qemuConnectAgent(virQEMUDriverPtr driver, virDomainObjPtr vm) return 0; }
+static void +qemuProcessNotifyMonitorError(virDomainObjPtr vm, + qemuMonitorPtr mon) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + virErrorPtr err = qemuMonitorLastError(mon); + + virCopyError(err, &priv->monError); + + /* set error code if due to OOM conditions we fail to set it before */ + if (priv->monError.code == VIR_ERR_OK) + priv->monError.code = VIR_ERR_INTERNAL_ERROR; + + /* Wake up anything waiting for events from monitor */ + virDomainObjBroadcast(vm); + virFreeError(err); +}
/* * This is a callback registered with a qemuMonitorPtr instance, @@ -286,6 +303,8 @@ qemuProcessHandleMonitorEOF(qemuMonitorPtr mon,
virObjectLock(vm);
+ qemuProcessNotifyMonitorError(vm, mon); + VIR_DEBUG("Received EOF on %p '%s'", vm, vm->def->name);
priv = vm->privateData; @@ -338,7 +357,8 @@ qemuProcessHandleMonitorError(qemuMonitorPtr mon ATTRIBUTE_UNUSED,
virObjectLock(vm);
- ((qemuDomainObjPrivatePtr) vm->privateData)->monError = true; + qemuProcessNotifyMonitorError(vm, mon); + event = virDomainEventControlErrorNewFromObj(vm); qemuDomainEventQueue(driver, event);
@@ -5727,7 +5747,7 @@ qemuProcessPrepareDomain(virQEMUDriverPtr driver, goto cleanup;
priv->monJSON = true; - priv->monError = false; + virResetError(&priv->monError); priv->monStart = 0; priv->gotShutdown = false;
@@ -6483,9 +6503,6 @@ qemuProcessBeginStopJob(virQEMUDriverPtr driver, if (qemuProcessKill(vm, killFlags) < 0) goto cleanup;
- /* Wake up anything waiting on domain condition */ - virDomainObjBroadcast(vm); - if (qemuDomainObjBeginJob(driver, vm, job) < 0) goto cleanup;

On 01.05.2018 01:03, John Ferlan wrote:
On 04/18/2018 10:44 AM, Nikolay Shirokovskiy wrote:
Block job abort operation can not handle properly qemu crashes when waiting for abort/pivot completion. Deadlock scenario is next:
- qemuDomainBlockJobAbort waits for pivot/abort completion - qemu crashes, then qemuProcessBeginStopJob broadcasts for VM condition and then waits for job condition (taken by qemuDomainBlockJobAbort) - qemuDomainBlockJobAbort awakes but nothing really changed, VM is still active (vm->def->id != -1) so thread starts waiting for completion again. Now two threads are in deadlock.
First let's add some condition besides domain active status so that waiting thread can check it when awakes. Second let's move signalling to the place where condition is set, namely monitor eof/error handlers. Having signalling in qemuProcessBeginStopJob is not useful.
The patch copies monitor error to domain state because at time waiting thread awakes there can be no monitor and it is useful to report monitor error to user.
The patch has a drawback that on destroying a domain when waiting for event from monitor we get not very convinient error message like
convenient
'EOF from monitor' from waiting API. On the other hand if qemu crashes this is more useful then 'domain is not running'. The first case will be addressed in another patch.
The patch also fixes other places where we wait for event from qemu. Namely handling device removal and tray ejection. Other places which used virDomainObjWait (dump, migration, save) were good because of async jobs which allow concurrent destroy job.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com> --- src/conf/domain_conf.c | 43 ------------------------------------------- src/conf/domain_conf.h | 3 --- src/libvirt_private.syms | 2 -- src/qemu/qemu_domain.c | 45 +++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_domain.h | 5 ++++- src/qemu/qemu_driver.c | 6 +++--- src/qemu/qemu_hotplug.c | 4 ++-- src/qemu/qemu_migration.c | 12 ++++++------ src/qemu/qemu_process.c | 27 ++++++++++++++++++++++----- 9 files changed, 82 insertions(+), 65 deletions(-)
This no longer git am -3 applies and based on previous patch reviews, I think perhaps this needs to be redone.
I'll resend as soon as we come to agreement on series. Now I'm not convinced to change much (except for using distinct flag to indicate error as mentioned in 2nd patch discussion, then I don't need 3d patch). See comments below.
I don't believe moving/renaming virDomainObjWait is good/right, but I'm sure there would be other opinions on that. Yes, QEMU is currently the only consumer... If it were to move, it should move to virdomainobjlist since that's where innards of virDomainObjPtr are managed. The fact that we look at ->parent.lock, well, <sigh>...
I would not move the function without reason. It gets new name qemuDomainObjWait becase it use qemu specific data, namely monError.
Anyway, you're attempting to special case something and perhaps you just need to create a qemuDomainObjWait that would call the timeout version of the virDomainObjWait and be able to handle whether some other error occurred. Wouldn't the LastError before the current wait return NULL (as in no error), then during the timeout period if LastError is something couldn't that indicate the failure you're looking for.
I introduced qemuDomainObjWait not to put virDomainObjWait and virDomainObjWaitUntil in the first place but to check monError. Then rather then keeping too functions it's look more simple to use only one and branch on given timeout.
I didn't spend a lot of time thinking about alternatives and how the code should change, but when you mention the patch has a drawback - that immediately raises concern.
But ... I addressed this issue in next patch as I wrote.
John
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 5f1af91..e046008 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -3249,49 +3249,6 @@ virDomainObjBroadcast(virDomainObjPtr vm) }
-int -virDomainObjWait(virDomainObjPtr vm) -{ - if (virCondWait(&vm->cond, &vm->parent.lock) < 0) { - virReportSystemError(errno, "%s", - _("failed to wait for domain condition")); - return -1; - } - - if (!virDomainObjIsActive(vm)) { - virReportError(VIR_ERR_OPERATION_FAILED, "%s", - _("domain is not running")); - return -1; - } - - return 0; -} - - -/** - * Waits for domain condition to be triggered for a specific period of time. - * - * Returns: - * -1 in case of error - * 0 on success - * 1 on timeout - */ -int -virDomainObjWaitUntil(virDomainObjPtr vm, - unsigned long long whenms) -{ - if (virCondWaitUntil(&vm->cond, &vm->parent.lock, whenms) < 0) { - if (errno != ETIMEDOUT) { - virReportSystemError(errno, "%s", - _("failed to wait for domain condition")); - return -1; - } - return 1; - } - return 0; -} - - /* * Mark the current VM config as transient. Ensures transient hotplug * operations do not persist past shutdown. diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 122a051..b4d5bdb 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -2747,9 +2747,6 @@ bool virDomainObjTaint(virDomainObjPtr obj, virDomainTaintFlags taint);
void virDomainObjBroadcast(virDomainObjPtr vm); -int virDomainObjWait(virDomainObjPtr vm); -int virDomainObjWaitUntil(virDomainObjPtr vm, - unsigned long long whenms);
void virDomainPanicDefFree(virDomainPanicDefPtr panic); void virDomainResourceDefFree(virDomainResourceDefPtr resource); diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 6bbbf77..ed5498e 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -493,8 +493,6 @@ virDomainObjSetMetadata; virDomainObjSetState; virDomainObjTaint; virDomainObjUpdateModificationImpact; -virDomainObjWait; -virDomainObjWaitUntil; virDomainOSTypeFromString; virDomainOSTypeToString; virDomainParseMemory; diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 672f08b..1f40ff1 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -1909,6 +1909,7 @@ qemuDomainObjPrivateFree(void *data) qemuDomainObjFreeJob(priv); VIR_FREE(priv->lockState); VIR_FREE(priv->origname); + virResetError(&priv->monError);
virChrdevFree(priv->devs);
@@ -11881,3 +11882,47 @@ qemuProcessEventFree(struct qemuProcessEvent *event) } VIR_FREE(event); } + + +/** + * Waits for domain condition to be triggered for a specific period of time. + * if @until is 0 then waits indefinetely. + * + * Returns: + * -1 on error + * 0 on success + * 1 on timeout + */ +int +qemuDomainObjWait(virDomainObjPtr vm, unsigned long long until) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + int rc; + + if (until) + rc = virCondWaitUntil(&vm->cond, &vm->parent.lock, until); + else + rc = virCondWait(&vm->cond, &vm->parent.lock); + + if (rc < 0) { + if (until && errno == ETIMEDOUT) + return 1; + + virReportSystemError(errno, "%s", + _("failed to wait for domain condition")); + return -1; + } + + if (!virDomainObjIsActive(vm)) { + virReportError(VIR_ERR_OPERATION_FAILED, "%s", + _("domain is not running")); + return -1; + } + + if (priv->monError.code != VIR_ERR_OK) { + virSetError(&priv->monError); + return -1; + } + + return 0; +} diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 40d9a6f..494ed35 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -262,7 +262,7 @@ struct _qemuDomainObjPrivate { qemuMonitorPtr mon; virDomainChrSourceDefPtr monConfig; bool monJSON; - bool monError; + virError monError; unsigned long long monStart;
qemuAgentPtr agent; @@ -994,4 +994,7 @@ qemuDomainPrepareDiskSource(virDomainDiskDefPtr disk, qemuDomainObjPrivatePtr priv, virQEMUDriverConfigPtr cfg);
+int +qemuDomainObjWait(virDomainObjPtr vm, unsigned long long until); + #endif /* __QEMU_DOMAIN_H__ */ diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 0dd6032..03969d8 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -2727,7 +2727,7 @@ qemuDomainGetControlInfo(virDomainPtr dom,
memset(info, 0, sizeof(*info));
- if (priv->monError) { + if (priv->monError.code != VIR_ERR_OK) { info->state = VIR_DOMAIN_CONTROL_ERROR; info->details = VIR_DOMAIN_CONTROL_ERROR_REASON_MONITOR; } else if (priv->job.active) { @@ -3726,7 +3726,7 @@ qemuDumpWaitForCompletion(virDomainObjPtr vm)
VIR_DEBUG("Waiting for dump completion"); while (!priv->job.dumpCompleted && !priv->job.abortJob) { - if (virDomainObjWait(vm) < 0) + if (qemuDomainObjWait(vm, 0) < 0) return -1; }
@@ -16924,7 +16924,7 @@ qemuDomainBlockJobAbort(virDomainPtr dom, qemuDomainDiskPrivatePtr diskPriv = QEMU_DOMAIN_DISK_PRIVATE(disk); qemuBlockJobUpdate(driver, vm, QEMU_ASYNC_JOB_NONE, disk, NULL); while (diskPriv->blockjob) { - if (virDomainObjWait(vm) < 0) { + if (qemuDomainObjWait(vm, 0) < 0) { ret = -1; goto endjob; } diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index 8d3191f..527ce3f 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -206,7 +206,7 @@ qemuHotplugWaitForTrayEject(virQEMUDriverPtr driver, return -1;
while (disk->tray_status != VIR_DOMAIN_DISK_TRAY_OPEN) { - if ((rc = virDomainObjWaitUntil(vm, now + CHANGE_MEDIA_TIMEOUT)) < 0) + if ((rc = qemuDomainObjWait(vm, now + CHANGE_MEDIA_TIMEOUT)) < 0) return -1;
if (rc > 0) { @@ -4602,7 +4602,7 @@ qemuDomainWaitForDeviceRemoval(virDomainObjPtr vm) until += qemuDomainRemoveDeviceWaitTime;
while (priv->unplug.alias) { - if ((rc = virDomainObjWaitUntil(vm, until)) == 1) + if ((rc = qemuDomainObjWait(vm, until)) == 1) return 0;
if (rc < 0) { diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 88b8253..82e1c7f 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -738,7 +738,7 @@ qemuMigrationSrcCancelDriveMirror(virQEMUDriverPtr driver, if (failed && !err) err = virSaveLastError();
- if (virDomainObjWait(vm) < 0) + if (qemuDomainObjWait(vm, 0) < 0) goto cleanup; }
@@ -877,7 +877,7 @@ qemuMigrationSrcDriveMirror(virQEMUDriverPtr driver, goto cleanup; }
- if (virDomainObjWait(vm) < 0) + if (qemuDomainObjWait(vm, 0) < 0) goto cleanup; }
@@ -1181,7 +1181,7 @@ qemuMigrationSrcWaitForSpice(virDomainObjPtr vm)
VIR_DEBUG("Waiting for SPICE to finish migration"); while (!priv->job.spiceMigrated && !priv->job.abortJob) { - if (virDomainObjWait(vm) < 0) + if (qemuDomainObjWait(vm, 0) < 0) return -1; } return 0; @@ -1460,7 +1460,7 @@ qemuMigrationSrcWaitForCompletion(virQEMUDriverPtr driver, return rv;
if (events) { - if (virDomainObjWait(vm) < 0) { + if (qemuDomainObjWait(vm, 0) < 0) { jobInfo->status = QEMU_DOMAIN_JOB_STATUS_FAILED; return -2; } @@ -1513,7 +1513,7 @@ qemuMigrationDstWaitForCompletion(virQEMUDriverPtr driver,
while ((rv = qemuMigrationAnyCompleted(driver, vm, asyncJob, NULL, flags)) != 1) { - if (rv < 0 || virDomainObjWait(vm) < 0) + if (rv < 0 || qemuDomainObjWait(vm, 0) < 0) return -1; }
@@ -3464,7 +3464,7 @@ qemuMigrationSrcRun(virQEMUDriverPtr driver, if (priv->monJSON) { while (virDomainObjGetState(vm, NULL) == VIR_DOMAIN_RUNNING) { priv->signalStop = true; - rc = virDomainObjWait(vm); + rc = qemuDomainObjWait(vm, 0); priv->signalStop = false; if (rc < 0) goto error; diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 6a5262a..d76809e 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -268,6 +268,23 @@ qemuConnectAgent(virQEMUDriverPtr driver, virDomainObjPtr vm) return 0; }
+static void +qemuProcessNotifyMonitorError(virDomainObjPtr vm, + qemuMonitorPtr mon) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + virErrorPtr err = qemuMonitorLastError(mon); + + virCopyError(err, &priv->monError); + + /* set error code if due to OOM conditions we fail to set it before */ + if (priv->monError.code == VIR_ERR_OK) + priv->monError.code = VIR_ERR_INTERNAL_ERROR; + + /* Wake up anything waiting for events from monitor */ + virDomainObjBroadcast(vm); + virFreeError(err); +}
/* * This is a callback registered with a qemuMonitorPtr instance, @@ -286,6 +303,8 @@ qemuProcessHandleMonitorEOF(qemuMonitorPtr mon,
virObjectLock(vm);
+ qemuProcessNotifyMonitorError(vm, mon); + VIR_DEBUG("Received EOF on %p '%s'", vm, vm->def->name);
priv = vm->privateData; @@ -338,7 +357,8 @@ qemuProcessHandleMonitorError(qemuMonitorPtr mon ATTRIBUTE_UNUSED,
virObjectLock(vm);
- ((qemuDomainObjPrivatePtr) vm->privateData)->monError = true; + qemuProcessNotifyMonitorError(vm, mon); + event = virDomainEventControlErrorNewFromObj(vm); qemuDomainEventQueue(driver, event);
@@ -5727,7 +5747,7 @@ qemuProcessPrepareDomain(virQEMUDriverPtr driver, goto cleanup;
priv->monJSON = true; - priv->monError = false; + virResetError(&priv->monError); priv->monStart = 0; priv->gotShutdown = false;
@@ -6483,9 +6503,6 @@ qemuProcessBeginStopJob(virQEMUDriverPtr driver, if (qemuProcessKill(vm, killFlags) < 0) goto cleanup;
- /* Wake up anything waiting on domain condition */ - virDomainObjBroadcast(vm); - if (qemuDomainObjBeginJob(driver, vm, job) < 0) goto cleanup;

On 05/03/2018 03:54 AM, Nikolay Shirokovskiy wrote:
On 01.05.2018 01:03, John Ferlan wrote:
On 04/18/2018 10:44 AM, Nikolay Shirokovskiy wrote:
Block job abort operation can not handle properly qemu crashes when waiting for abort/pivot completion. Deadlock scenario is next:
- qemuDomainBlockJobAbort waits for pivot/abort completion - qemu crashes, then qemuProcessBeginStopJob broadcasts for VM condition and then waits for job condition (taken by qemuDomainBlockJobAbort) - qemuDomainBlockJobAbort awakes but nothing really changed, VM is still active (vm->def->id != -1) so thread starts waiting for completion again. Now two threads are in deadlock.
First let's add some condition besides domain active status so that waiting thread can check it when awakes. Second let's move signalling to the place where condition is set, namely monitor eof/error handlers. Having signalling in qemuProcessBeginStopJob is not useful.
The patch copies monitor error to domain state because at time waiting thread awakes there can be no monitor and it is useful to report monitor error to user.
The patch has a drawback that on destroying a domain when waiting for event from monitor we get not very convinient error message like
convenient
'EOF from monitor' from waiting API. On the other hand if qemu crashes this is more useful then 'domain is not running'. The first case will be addressed in another patch.
The patch also fixes other places where we wait for event from qemu. Namely handling device removal and tray ejection. Other places which used virDomainObjWait (dump, migration, save) were good because of async jobs which allow concurrent destroy job.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com> --- src/conf/domain_conf.c | 43 ------------------------------------------- src/conf/domain_conf.h | 3 --- src/libvirt_private.syms | 2 -- src/qemu/qemu_domain.c | 45 +++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_domain.h | 5 ++++- src/qemu/qemu_driver.c | 6 +++--- src/qemu/qemu_hotplug.c | 4 ++-- src/qemu/qemu_migration.c | 12 ++++++------ src/qemu/qemu_process.c | 27 ++++++++++++++++++++++----- 9 files changed, 82 insertions(+), 65 deletions(-)
This no longer git am -3 applies and based on previous patch reviews, I think perhaps this needs to be redone.
I'll resend as soon as we come to agreement on series. Now I'm not convinced to change much (except for using distinct flag to indicate error as mentioned in 2nd patch discussion, then I don't need 3d patch).
See comments below.
I don't believe moving/renaming virDomainObjWait is good/right, but I'm sure there would be other opinions on that. Yes, QEMU is currently the only consumer... If it were to move, it should move to virdomainobjlist since that's where innards of virDomainObjPtr are managed. The fact that we look at ->parent.lock, well, <sigh>...
I would not move the function without reason. It gets new name qemuDomainObjWait becase it use qemu specific data, namely monError.
Today only qemu uses this generic virDomainObjWait which is generic without the need to have/use qemu specific things. Other domain consumers could use it if they chose. I'm not in favor of moving, renaming, repurposing for a very specific issue for what is a generically used function. Maybe that's just me - so you could try to get a different reviewer if you want.
Anyway, you're attempting to special case something and perhaps you just need to create a qemuDomainObjWait that would call the timeout version of the virDomainObjWait and be able to handle whether some other error occurred. Wouldn't the LastError before the current wait return NULL (as in no error), then during the timeout period if LastError is something couldn't that indicate the failure you're looking for.
I introduced qemuDomainObjWait not to put virDomainObjWait and virDomainObjWaitUntil in the first place but to check monError. Then rather then keeping too functions it's look more simple to use only one and branch on given timeout.
I would think if the goal was to have specific code for a specific purpose for specific functions, then introduce the qemuDomainObjWait, but have it call virDomainObjWait[Until] based on the need you have. Which in this case appears to fiddle with monError in some way. Again - that's just my opinion John
I didn't spend a lot of time thinking about alternatives and how the code should change, but when you mention the patch has a drawback - that immediately raises concern.
But ... I addressed this issue in next patch as I wrote.
[...]

On 04.05.2018 17:52, John Ferlan wrote:
On 05/03/2018 03:54 AM, Nikolay Shirokovskiy wrote:
On 01.05.2018 01:03, John Ferlan wrote:
On 04/18/2018 10:44 AM, Nikolay Shirokovskiy wrote:
Block job abort operation can not handle properly qemu crashes when waiting for abort/pivot completion. Deadlock scenario is next:
- qemuDomainBlockJobAbort waits for pivot/abort completion - qemu crashes, then qemuProcessBeginStopJob broadcasts for VM condition and then waits for job condition (taken by qemuDomainBlockJobAbort) - qemuDomainBlockJobAbort awakes but nothing really changed, VM is still active (vm->def->id != -1) so thread starts waiting for completion again. Now two threads are in deadlock.
First let's add some condition besides domain active status so that waiting thread can check it when awakes. Second let's move signalling to the place where condition is set, namely monitor eof/error handlers. Having signalling in qemuProcessBeginStopJob is not useful.
The patch copies monitor error to domain state because at time waiting thread awakes there can be no monitor and it is useful to report monitor error to user.
The patch has a drawback that on destroying a domain when waiting for event from monitor we get not very convinient error message like
convenient
'EOF from monitor' from waiting API. On the other hand if qemu crashes this is more useful then 'domain is not running'. The first case will be addressed in another patch.
The patch also fixes other places where we wait for event from qemu. Namely handling device removal and tray ejection. Other places which used virDomainObjWait (dump, migration, save) were good because of async jobs which allow concurrent destroy job.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com> --- src/conf/domain_conf.c | 43 ------------------------------------------- src/conf/domain_conf.h | 3 --- src/libvirt_private.syms | 2 -- src/qemu/qemu_domain.c | 45 +++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_domain.h | 5 ++++- src/qemu/qemu_driver.c | 6 +++--- src/qemu/qemu_hotplug.c | 4 ++-- src/qemu/qemu_migration.c | 12 ++++++------ src/qemu/qemu_process.c | 27 ++++++++++++++++++++++----- 9 files changed, 82 insertions(+), 65 deletions(-)
This no longer git am -3 applies and based on previous patch reviews, I think perhaps this needs to be redone.
I'll resend as soon as we come to agreement on series. Now I'm not convinced to change much (except for using distinct flag to indicate error as mentioned in 2nd patch discussion, then I don't need 3d patch).
See comments below.
I don't believe moving/renaming virDomainObjWait is good/right, but I'm sure there would be other opinions on that. Yes, QEMU is currently the only consumer... If it were to move, it should move to virdomainobjlist since that's where innards of virDomainObjPtr are managed. The fact that we look at ->parent.lock, well, <sigh>...
I would not move the function without reason. It gets new name qemuDomainObjWait becase it use qemu specific data, namely monError.
Today only qemu uses this generic virDomainObjWait which is generic without the need to have/use qemu specific things. Other domain consumers could use it if they chose.
I'm not in favor of moving, renaming, repurposing for a very specific issue for what is a generically used function. Maybe that's just me - so you could try to get a different reviewer if you want.
Sorry.
Anyway, you're attempting to special case something and perhaps you just need to create a qemuDomainObjWait that would call the timeout version of the virDomainObjWait and be able to handle whether some other error occurred. Wouldn't the LastError before the current wait return NULL (as in no error), then during the timeout period if LastError is something couldn't that indicate the failure you're looking for.
I introduced qemuDomainObjWait not to put virDomainObjWait and virDomainObjWaitUntil in the first place but to check monError. Then rather then keeping too functions it's look more simple to use only one and branch on given timeout.
I would think if the goal was to have specific code for a specific purpose for specific functions, then introduce the qemuDomainObjWait, but have it call virDomainObjWait[Until] based on the need you have. Which in this case appears to fiddle with monError in some way.
Again - that's just my opinion
John
Ok we can do that way. I just think that as virDomainObjWait/virDomainObjWaitUntil will not be used anywhere in code after the patch then sooner or later somebody will clean them up. Nikolay
I didn't spend a lot of time thinking about alternatives and how the code should change, but when you mention the patch has a drawback - that immediately raises concern.
But ... I addressed this issue in next patch as I wrote.
[...]

Clearing beingDestroyed right after acquiring job condition is racy. It is not known when EOF will be delivired. Let's keep this flag set. This makes possible to make a clear distinction between monitor errors/eofs and domain being destroyed in qemuDomainObjWait. Also let's move setting destroyed flag out of qemuProcessBeginStopJob as the function is called from eof handler too. Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com> --- src/qemu/qemu_domain.c | 4 ++-- src/qemu/qemu_domain.h | 2 +- src/qemu/qemu_driver.c | 8 +++++++- src/qemu/qemu_process.c | 24 ++++++++++++------------ 4 files changed, 22 insertions(+), 16 deletions(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 1f40ff1..431901c 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -11913,9 +11913,9 @@ qemuDomainObjWait(virDomainObjPtr vm, unsigned long long until) return -1; } - if (!virDomainObjIsActive(vm)) { + if (priv->destroyed) { virReportError(VIR_ERR_OPERATION_FAILED, "%s", - _("domain is not running")); + _("domain is destroyed")); return -1; } diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 494ed35..69112ea 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -269,7 +269,7 @@ struct _qemuDomainObjPrivate { bool agentError; bool gotShutdown; - bool beingDestroyed; + bool destroyed; char *pidfile; virDomainPCIAddressSetPtr pciaddrs; diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 03969d8..4356c0d 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -2227,7 +2227,13 @@ qemuDomainDestroyFlags(virDomainPtr dom, state = virDomainObjGetState(vm, &reason); starting = (state == VIR_DOMAIN_PAUSED && reason == VIR_DOMAIN_PAUSED_STARTING_UP && - !priv->beingDestroyed); + !priv->destroyed); + + /* We need to prevent monitor EOF callback from doing our work (and + * sending misleading events) while the vm is unlocked inside + * BeginJob/ProcessKill API + */ + priv->destroyed = true; if (qemuProcessBeginStopJob(driver, vm, QEMU_JOB_DESTROY, !(flags & VIR_DOMAIN_DESTROY_GRACEFUL)) < 0) diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index d76809e..689fc8b 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -143,8 +143,8 @@ qemuProcessHandleAgentEOF(qemuAgentPtr agent, goto unlock; } - if (priv->beingDestroyed) { - VIR_DEBUG("Domain is being destroyed, agent EOF is expected"); + if (priv->destroyed) { + VIR_DEBUG("Domain is destroyed, agent EOF is expected"); goto unlock; } @@ -286,6 +286,7 @@ qemuProcessNotifyMonitorError(virDomainObjPtr vm, virFreeError(err); } + /* * This is a callback registered with a qemuMonitorPtr instance, * and to be invoked when the monitor console hits an end of file @@ -308,8 +309,8 @@ qemuProcessHandleMonitorEOF(qemuMonitorPtr mon, VIR_DEBUG("Received EOF on %p '%s'", vm, vm->def->name); priv = vm->privateData; - if (priv->beingDestroyed) { - VIR_DEBUG("Domain is being destroyed, EOF is expected"); + if (priv->destroyed) { + VIR_DEBUG("Domain is destroyed, EOF is expected"); goto cleanup; } @@ -5750,6 +5751,7 @@ qemuProcessPrepareDomain(virQEMUDriverPtr driver, virResetError(&priv->monError); priv->monStart = 0; priv->gotShutdown = false; + priv->destroyed = false; VIR_DEBUG("Updating guest CPU definition"); if (qemuProcessUpdateGuestCPU(vm->def, priv->qemuCaps, caps, flags) < 0) @@ -6490,16 +6492,9 @@ qemuProcessBeginStopJob(virQEMUDriverPtr driver, qemuDomainJob job, bool forceKill) { - qemuDomainObjPrivatePtr priv = vm->privateData; unsigned int killFlags = forceKill ? VIR_QEMU_PROCESS_KILL_FORCE : 0; int ret = -1; - /* We need to prevent monitor EOF callback from doing our work (and - * sending misleading events) while the vm is unlocked inside - * BeginJob/ProcessKill API - */ - priv->beingDestroyed = true; - if (qemuProcessKill(vm, killFlags) < 0) goto cleanup; @@ -6509,7 +6504,6 @@ qemuProcessBeginStopJob(virQEMUDriverPtr driver, ret = 0; cleanup: - priv->beingDestroyed = false; return ret; } @@ -7088,6 +7082,12 @@ qemuProcessAutoDestroy(virDomainObjPtr dom, VIR_DEBUG("Killing domain"); + /* We need to prevent monitor EOF callback from doing our work (and + * sending misleading events) while the vm is unlocked inside + * BeginJob/ProcessKill API + */ + priv->destroyed = true; + if (qemuProcessBeginStopJob(driver, dom, QEMU_JOB_DESTROY, true) < 0) return; -- 1.8.3.1

On 04/18/2018 10:44 AM, Nikolay Shirokovskiy wrote:
Clearing beingDestroyed right after acquiring job condition is racy. It is not known when EOF will be delivired. Let's keep this flag
delivered
set. This makes possible to make a clear distinction between monitor errors/eofs and domain being destroyed in qemuDomainObjWait.
Also let's move setting destroyed flag out of qemuProcessBeginStopJob as the function is called from eof handler too.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com> --- src/qemu/qemu_domain.c | 4 ++-- src/qemu/qemu_domain.h | 2 +- src/qemu/qemu_driver.c | 8 +++++++- src/qemu/qemu_process.c | 24 ++++++++++++------------ 4 files changed, 22 insertions(+), 16 deletions(-)
This one didn't git am -3 apply as well, but I see you're changing DomainObjWait so that makes sense as to why. I do recall looking at this code at one time, but I cannot recall my exact conclusion given how qemuDomainObjBeginJobInternal allows the QEMU_JOB_DESTROY to be run, but then waits for whatever is taking place now to finish.
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 1f40ff1..431901c 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -11913,9 +11913,9 @@ qemuDomainObjWait(virDomainObjPtr vm, unsigned long long until) return -1; }
- if (!virDomainObjIsActive(vm)) { + if (priv->destroyed) { virReportError(VIR_ERR_OPERATION_FAILED, "%s", - _("domain is not running")); + _("domain is destroyed")); return -1; }
diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 494ed35..69112ea 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -269,7 +269,7 @@ struct _qemuDomainObjPrivate { bool agentError;
bool gotShutdown; - bool beingDestroyed; + bool destroyed; char *pidfile;
virDomainPCIAddressSetPtr pciaddrs; diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 03969d8..4356c0d 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -2227,7 +2227,13 @@ qemuDomainDestroyFlags(virDomainPtr dom, state = virDomainObjGetState(vm, &reason); starting = (state == VIR_DOMAIN_PAUSED && reason == VIR_DOMAIN_PAUSED_STARTING_UP && - !priv->beingDestroyed); + !priv->destroyed); + + /* We need to prevent monitor EOF callback from doing our work (and + * sending misleading events) while the vm is unlocked inside + * BeginJob/ProcessKill API + */ + priv->destroyed = true;
would this be the right place anyway? especially since you don't clear it in error paths after setting it. Once the job starts and we either goto cleanup or endjob on failure - how does a future call distinguish? Not sure this works conceptually. At least with the current model if DestroyFlags finally gets the job, it checks the domain active state, and goes to endjob after printing a message. So if while waiting, EOF occurred, there's no qemuProcessStop Perhaps the only thing I recall wondering is whether we clear beingDestroyed too soon... If we're successful, we're still destroying but not until destroy is successful should the flag then be cleared while still holding the job. John
if (qemuProcessBeginStopJob(driver, vm, QEMU_JOB_DESTROY, !(flags & VIR_DOMAIN_DESTROY_GRACEFUL)) < 0) diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index d76809e..689fc8b 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -143,8 +143,8 @@ qemuProcessHandleAgentEOF(qemuAgentPtr agent, goto unlock; }
- if (priv->beingDestroyed) { - VIR_DEBUG("Domain is being destroyed, agent EOF is expected"); + if (priv->destroyed) { + VIR_DEBUG("Domain is destroyed, agent EOF is expected"); goto unlock; }
@@ -286,6 +286,7 @@ qemuProcessNotifyMonitorError(virDomainObjPtr vm, virFreeError(err); }
+ /* * This is a callback registered with a qemuMonitorPtr instance, * and to be invoked when the monitor console hits an end of file @@ -308,8 +309,8 @@ qemuProcessHandleMonitorEOF(qemuMonitorPtr mon, VIR_DEBUG("Received EOF on %p '%s'", vm, vm->def->name);
priv = vm->privateData; - if (priv->beingDestroyed) { - VIR_DEBUG("Domain is being destroyed, EOF is expected"); + if (priv->destroyed) { + VIR_DEBUG("Domain is destroyed, EOF is expected"); goto cleanup; }
@@ -5750,6 +5751,7 @@ qemuProcessPrepareDomain(virQEMUDriverPtr driver, virResetError(&priv->monError); priv->monStart = 0; priv->gotShutdown = false; + priv->destroyed = false;
VIR_DEBUG("Updating guest CPU definition"); if (qemuProcessUpdateGuestCPU(vm->def, priv->qemuCaps, caps, flags) < 0) @@ -6490,16 +6492,9 @@ qemuProcessBeginStopJob(virQEMUDriverPtr driver, qemuDomainJob job, bool forceKill) { - qemuDomainObjPrivatePtr priv = vm->privateData; unsigned int killFlags = forceKill ? VIR_QEMU_PROCESS_KILL_FORCE : 0; int ret = -1;
- /* We need to prevent monitor EOF callback from doing our work (and - * sending misleading events) while the vm is unlocked inside - * BeginJob/ProcessKill API - */ - priv->beingDestroyed = true; - if (qemuProcessKill(vm, killFlags) < 0) goto cleanup;
@@ -6509,7 +6504,6 @@ qemuProcessBeginStopJob(virQEMUDriverPtr driver, ret = 0;
cleanup: - priv->beingDestroyed = false; return ret; }
@@ -7088,6 +7082,12 @@ qemuProcessAutoDestroy(virDomainObjPtr dom,
VIR_DEBUG("Killing domain");
+ /* We need to prevent monitor EOF callback from doing our work (and + * sending misleading events) while the vm is unlocked inside + * BeginJob/ProcessKill API + */ + priv->destroyed = true; + if (qemuProcessBeginStopJob(driver, dom, QEMU_JOB_DESTROY, true) < 0) return;

On 01.05.2018 01:03, John Ferlan wrote:
On 04/18/2018 10:44 AM, Nikolay Shirokovskiy wrote:
Clearing beingDestroyed right after acquiring job condition is racy. It is not known when EOF will be delivired. Let's keep this flag
delivered
set. This makes possible to make a clear distinction between monitor errors/eofs and domain being destroyed in qemuDomainObjWait.
Also let's move setting destroyed flag out of qemuProcessBeginStopJob as the function is called from eof handler too.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com> --- src/qemu/qemu_domain.c | 4 ++-- src/qemu/qemu_domain.h | 2 +- src/qemu/qemu_driver.c | 8 +++++++- src/qemu/qemu_process.c | 24 ++++++++++++------------ 4 files changed, 22 insertions(+), 16 deletions(-)
This one didn't git am -3 apply as well, but I see you're changing DomainObjWait so that makes sense as to why.
I do recall looking at this code at one time, but I cannot recall my exact conclusion given how qemuDomainObjBeginJobInternal allows the QEMU_JOB_DESTROY to be run, but then waits for whatever is taking place now to finish.
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 1f40ff1..431901c 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -11913,9 +11913,9 @@ qemuDomainObjWait(virDomainObjPtr vm, unsigned long long until) return -1; }
- if (!virDomainObjIsActive(vm)) { + if (priv->destroyed) { virReportError(VIR_ERR_OPERATION_FAILED, "%s", - _("domain is not running")); + _("domain is destroyed")); return -1; }
diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 494ed35..69112ea 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -269,7 +269,7 @@ struct _qemuDomainObjPrivate { bool agentError;
bool gotShutdown; - bool beingDestroyed; + bool destroyed; char *pidfile;
virDomainPCIAddressSetPtr pciaddrs; diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 03969d8..4356c0d 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -2227,7 +2227,13 @@ qemuDomainDestroyFlags(virDomainPtr dom, state = virDomainObjGetState(vm, &reason); starting = (state == VIR_DOMAIN_PAUSED && reason == VIR_DOMAIN_PAUSED_STARTING_UP && - !priv->beingDestroyed); + !priv->destroyed); + + /* We need to prevent monitor EOF callback from doing our work (and + * sending misleading events) while the vm is unlocked inside + * BeginJob/ProcessKill API + */ + priv->destroyed = true;
would this be the right place anyway? especially since you don't clear it in error paths after setting it. Once the job starts and we either goto cleanup or endjob on failure - how does a future call distinguish?
We send SIGTERM/SIGKILL right away after this flag is set so even if this API fails eventually keeping destroyed flag set looks good because we send signal to qemu and good chances are qemu will die because of that. I guess it will be nicer then to move setting the flag to qemuProcessBeginStopJob function to keep setting the flag and killing domain together. Anyway we can clear the flag on failure too.
Not sure this works conceptually. At least with the current model if DestroyFlags finally gets the job, it checks the domain active state, and goes to endjob after printing a message. So if while waiting, EOF occurred, there's no qemuProcessStop
Not true. After sending signal to qemu to terminate EOF will occur but handler will return right away because of beingDestroyed/destroyed flag check and then after destroyFlags gets the job it will call qemuProcessStop. This is not the place I'm addressing with the patch. It is rather if destroyFlags is called when we are migrating for example then the interrupted API call can report 'domain is not active'/'some monitor error' rather then much nicer 'domain is destroyed' without this patch. See the scenario below.
Perhaps the only thing I recall wondering is whether we clear beingDestroyed too soon... If we're successful, we're still destroying but not until destroy is successful should the flag then be cleared while still holding the job.
It does not matter if we clear the flag at the begin or the end of time we holding the job because during the job nobody else who needs the job too can progress. I propose to prolong setting the flag after destroy job is finished (and thus rename flag too). Consider next scenario: - migrate is called and wait for migrationg complete event from qemu - we call destroy and as destroy can run concurrently with migration destroy gets the job and executes domain stop - migrate awaiks on monitor EOF, beingDestroyed is cleared back at this point. We can check for domain is active and give 'domain is not active' error but 'domain is destroyed' is nicer thus we need 'destroyed' flag Nikolay
if (qemuProcessBeginStopJob(driver, vm, QEMU_JOB_DESTROY, !(flags & VIR_DOMAIN_DESTROY_GRACEFUL)) < 0) diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index d76809e..689fc8b 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -143,8 +143,8 @@ qemuProcessHandleAgentEOF(qemuAgentPtr agent, goto unlock; }
- if (priv->beingDestroyed) { - VIR_DEBUG("Domain is being destroyed, agent EOF is expected"); + if (priv->destroyed) { + VIR_DEBUG("Domain is destroyed, agent EOF is expected"); goto unlock; }
@@ -286,6 +286,7 @@ qemuProcessNotifyMonitorError(virDomainObjPtr vm, virFreeError(err); }
+ /* * This is a callback registered with a qemuMonitorPtr instance, * and to be invoked when the monitor console hits an end of file @@ -308,8 +309,8 @@ qemuProcessHandleMonitorEOF(qemuMonitorPtr mon, VIR_DEBUG("Received EOF on %p '%s'", vm, vm->def->name);
priv = vm->privateData; - if (priv->beingDestroyed) { - VIR_DEBUG("Domain is being destroyed, EOF is expected"); + if (priv->destroyed) { + VIR_DEBUG("Domain is destroyed, EOF is expected"); goto cleanup; }
@@ -5750,6 +5751,7 @@ qemuProcessPrepareDomain(virQEMUDriverPtr driver, virResetError(&priv->monError); priv->monStart = 0; priv->gotShutdown = false; + priv->destroyed = false;
VIR_DEBUG("Updating guest CPU definition"); if (qemuProcessUpdateGuestCPU(vm->def, priv->qemuCaps, caps, flags) < 0) @@ -6490,16 +6492,9 @@ qemuProcessBeginStopJob(virQEMUDriverPtr driver, qemuDomainJob job, bool forceKill) { - qemuDomainObjPrivatePtr priv = vm->privateData; unsigned int killFlags = forceKill ? VIR_QEMU_PROCESS_KILL_FORCE : 0; int ret = -1;
- /* We need to prevent monitor EOF callback from doing our work (and - * sending misleading events) while the vm is unlocked inside - * BeginJob/ProcessKill API - */ - priv->beingDestroyed = true; - if (qemuProcessKill(vm, killFlags) < 0) goto cleanup;
@@ -6509,7 +6504,6 @@ qemuProcessBeginStopJob(virQEMUDriverPtr driver, ret = 0;
cleanup: - priv->beingDestroyed = false; return ret; }
@@ -7088,6 +7082,12 @@ qemuProcessAutoDestroy(virDomainObjPtr dom,
VIR_DEBUG("Killing domain");
+ /* We need to prevent monitor EOF callback from doing our work (and + * sending misleading events) while the vm is unlocked inside + * BeginJob/ProcessKill API + */ + priv->destroyed = true; + if (qemuProcessBeginStopJob(driver, dom, QEMU_JOB_DESTROY, true) < 0) return;

On 05/03/2018 05:26 AM, Nikolay Shirokovskiy wrote:
On 01.05.2018 01:03, John Ferlan wrote:
On 04/18/2018 10:44 AM, Nikolay Shirokovskiy wrote:
Clearing beingDestroyed right after acquiring job condition is racy. It is not known when EOF will be delivired. Let's keep this flag
delivered
set. This makes possible to make a clear distinction between monitor errors/eofs and domain being destroyed in qemuDomainObjWait.
Also let's move setting destroyed flag out of qemuProcessBeginStopJob as the function is called from eof handler too.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com> --- src/qemu/qemu_domain.c | 4 ++-- src/qemu/qemu_domain.h | 2 +- src/qemu/qemu_driver.c | 8 +++++++- src/qemu/qemu_process.c | 24 ++++++++++++------------ 4 files changed, 22 insertions(+), 16 deletions(-)
This one didn't git am -3 apply as well, but I see you're changing DomainObjWait so that makes sense as to why.
I do recall looking at this code at one time, but I cannot recall my exact conclusion given how qemuDomainObjBeginJobInternal allows the QEMU_JOB_DESTROY to be run, but then waits for whatever is taking place now to finish.
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 1f40ff1..431901c 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -11913,9 +11913,9 @@ qemuDomainObjWait(virDomainObjPtr vm, unsigned long long until) return -1; }
- if (!virDomainObjIsActive(vm)) { + if (priv->destroyed) { virReportError(VIR_ERR_OPERATION_FAILED, "%s", - _("domain is not running")); + _("domain is destroyed")); return -1; }
diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 494ed35..69112ea 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -269,7 +269,7 @@ struct _qemuDomainObjPrivate { bool agentError;
bool gotShutdown; - bool beingDestroyed; + bool destroyed; char *pidfile;
virDomainPCIAddressSetPtr pciaddrs; diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 03969d8..4356c0d 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -2227,7 +2227,13 @@ qemuDomainDestroyFlags(virDomainPtr dom, state = virDomainObjGetState(vm, &reason); starting = (state == VIR_DOMAIN_PAUSED && reason == VIR_DOMAIN_PAUSED_STARTING_UP && - !priv->beingDestroyed); + !priv->destroyed); + + /* We need to prevent monitor EOF callback from doing our work (and + * sending misleading events) while the vm is unlocked inside + * BeginJob/ProcessKill API + */ + priv->destroyed = true;
would this be the right place anyway? especially since you don't clear it in error paths after setting it. Once the job starts and we either goto cleanup or endjob on failure - how does a future call distinguish?
We send SIGTERM/SIGKILL right away after this flag is set so even if this API fails eventually keeping destroyed flag set looks good because we send signal to qemu and good chances are qemu will die because of that.
I guess it will be nicer then to move setting the flag to qemuProcessBeginStopJob function to keep setting the flag and killing domain together.
Anyway we can clear the flag on failure too.
Not sure this works conceptually. At least with the current model if DestroyFlags finally gets the job, it checks the domain active state, and goes to endjob after printing a message. So if while waiting, EOF occurred, there's no qemuProcessStop
Not true. After sending signal to qemu to terminate EOF will occur but handler will return right away because of beingDestroyed/destroyed flag check and then after destroyFlags gets the job it will call qemuProcessStop.
This is not the place I'm addressing with the patch. It is rather if destroyFlags is called when we are migrating for example then the interrupted API call can report 'domain is not active'/'some monitor error' rather then much nicer 'domain is destroyed' without this patch. See the scenario below.
Perhaps the only thing I recall wondering is whether we clear beingDestroyed too soon... If we're successful, we're still destroying but not until destroy is successful should the flag then be cleared while still holding the job.
It does not matter if we clear the flag at the begin or the end of time we holding the job because during the job nobody else who needs the job too can progress.
I propose to prolong setting the flag after destroy job is finished (and thus rename flag too). Consider next scenario:
- migrate is called and wait for migrationg complete event from qemu - we call destroy and as destroy can run concurrently with migration destroy gets the job and executes domain stop - migrate awaiks on monitor EOF, beingDestroyed is cleared back at this point. We can check for domain is active and give 'domain is not active' error but 'domain is destroyed' is nicer thus we need 'destroyed' flag
Nikolay
Honestly - hoping things will be clearer for the round of this. John [...]

ping On 18.04.2018 17:44, Nikolay Shirokovskiy wrote:
Main patch is 4th, others are misc.
Nikolay Shirokovskiy (5): qemu: erase synchronous block job cancel mentions in comments qemu: monitor: set error flag even in OOM conditions utils: export virCopyError qemu: fix domain object wait to handle monitor errors qemu: fix races in beingDestroyed usage
src/conf/domain_conf.c | 43 --------------------------------------- src/conf/domain_conf.h | 3 --- src/libvirt_private.syms | 3 +-- src/qemu/qemu_domain.c | 45 +++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_domain.h | 7 +++++-- src/qemu/qemu_driver.c | 27 +++++++++++++++---------- src/qemu/qemu_hotplug.c | 4 ++-- src/qemu/qemu_migration.c | 12 +++++------ src/qemu/qemu_monitor.c | 5 +++++ src/qemu/qemu_process.c | 51 +++++++++++++++++++++++++++++++---------------- src/util/virerror.c | 12 ++++++++--- src/util/virerror.h | 1 + 12 files changed, 124 insertions(+), 89 deletions(-)
participants (2)
-
John Ferlan
-
Nikolay Shirokovskiy