[libvirt] [PATCH 0/6] Add internal APIs for disabling error reporting

In testing migration, I was hitting an error in the Perform step, but this was being returned as 'Unknown Error' via virsh. The reason is that even a failed Perform will call MigrateFinish to do cleanup, but Finish will always return an NULL == error in that case, overwriting the original message. I've added internal APIs which allow temporarily disabling error reporting: any ReportError calls will log the error message, but will not overwrite the previous error and will not trigger the error callback. These APIs are then used in a few places where they are needed. Thanks, Cole Cole Robinson (6): virterror: Make a thread local error state structure virterror: Add methods to temporarily disable error reporting. qemu: migrate: Disable errors in MigrateFinish if MigratePerform failed. Disable errors if Perform failed at the libvirt.c level. qemu: Disable errors in qemudShutdownVMDaemon virterror: Log reported errors when only logging output is stderr src/libvirt.c | 5 ++ src/qemu/qemu_driver.c | 13 ++++ src/util/logging.c | 3 +- src/util/virterror.c | 143 ++++++++++++++++++++++++++++++++--------- src/util/virterror_internal.h | 2 + 5 files changed, 135 insertions(+), 31 deletions(-)

Move virLastError into that structure. This should be a no op refactor. Signed-off-by: Cole Robinson <crobinso@redhat.com> --- src/util/virterror.c | 52 +++++++++++++++++++++++++++++++++++-------------- 1 files changed, 37 insertions(+), 15 deletions(-) diff --git a/src/util/virterror.c b/src/util/virterror.c index c8e8623..7b7543b 100644 --- a/src/util/virterror.c +++ b/src/util/virterror.c @@ -22,7 +22,12 @@ #include "threads.h" #include "util.h" -virThreadLocal virLastErr; +virThreadLocal virThreadErrorStateIdx; + +typedef struct virThreadErrorState { + virErrorPtr lastError; +} virThreadErrorState; +typedef virThreadErrorState *virThreadErrorStatePtr; virErrorFunc virErrorHandler = NULL; /* global error handler */ void *virUserData = NULL; /* associated data */ @@ -182,13 +187,18 @@ static const char *virErrorDomainName(virErrorDomain domain) { * release the error object stored in the thread local */ static void -virLastErrFreeData(void *data) +virThreadErrorStateFree(void *data) { - virErrorPtr err = data; - if (!err) + virThreadErrorStatePtr state = data; + if (!state) return; - virResetError(err); - VIR_FREE(err); + + if (state->lastError) { + virResetError(state->lastError); + VIR_FREE(state->lastError); + } + + VIR_FREE(state); } @@ -202,7 +212,8 @@ virLastErrFreeData(void *data) int virErrorInitialize(void) { - return virThreadLocalInit(&virLastErr, virLastErrFreeData); + return virThreadLocalInit(&virThreadErrorStateIdx, + virThreadErrorStateFree); } @@ -253,19 +264,30 @@ virCopyError(virErrorPtr from, return ret; } -static virErrorPtr -virLastErrorObject(void) +static virThreadErrorStatePtr +virThreadErrorStateObject(void) { - virErrorPtr err; - err = virThreadLocalGet(&virLastErr); - if (!err) { - if (VIR_ALLOC(err) < 0) + virThreadErrorStatePtr state; + state = virThreadLocalGet(&virThreadErrorStateIdx); + if (!state) { + if (VIR_ALLOC(state) < 0) return NULL; - virThreadLocalSet(&virLastErr, err); + if (VIR_ALLOC(state->lastError) < 0) + return NULL; + virThreadLocalSet(&virThreadErrorStateIdx, state); } - return err; + return state; } +static virErrorPtr +virLastErrorObject(void) +{ + virThreadErrorStatePtr state = virThreadErrorStateObject(); + if (!state) + return NULL; + + return state->lastError; +} /** * virGetLastError: -- 1.6.5.1

If error reporting is disabled, RaiseFullError will log the passed message, but it will not wipe out the previous error and won't trigger the error callback. Signed-off-by: Cole Robinson <crobinso@redhat.com> --- src/util/virterror.c | 82 ++++++++++++++++++++++++++++++++++------- src/util/virterror_internal.h | 2 + 2 files changed, 70 insertions(+), 14 deletions(-) diff --git a/src/util/virterror.c b/src/util/virterror.c index 7b7543b..58b3086 100644 --- a/src/util/virterror.c +++ b/src/util/virterror.c @@ -26,6 +26,7 @@ virThreadLocal virThreadErrorStateIdx; typedef struct virThreadErrorState { virErrorPtr lastError; + unsigned int enableErrors; } virThreadErrorState; typedef virThreadErrorState *virThreadErrorStatePtr; @@ -274,6 +275,8 @@ virThreadErrorStateObject(void) return NULL; if (VIR_ALLOC(state->lastError) < 0) return NULL; + + state->enableErrors = 1; virThreadLocalSet(&virThreadErrorStateIdx, state); } return state; @@ -289,6 +292,49 @@ virLastErrorObject(void) return state->lastError; } +static unsigned int +virGetEnableErrors(void) +{ + virThreadErrorStatePtr state = virThreadErrorStateObject(); + if (!state) + return 1; + + return state->enableErrors; +} + +static void +virSetEnableErrors(unsigned int val) +{ + virThreadErrorStatePtr state = virThreadErrorStateObject(); + if (!state) + return; + + state->enableErrors = val ? 1 : 0; +} + +/* + * Enable error reporting and error callback launching (on by default) + */ +void +virEnableErrors(void) +{ + virSetEnableErrors(1); +} + +/* + * Disable error reporting and error callback launching. Calls to + * ReportFullError will log the message, but the previous error will remain + * intact. + * + * This is useful when we want to ensure that some clean up path will + * not stomp all over the original error report. + */ +void +virDisableErrors(void) +{ + virSetEnableErrors(0); +} + /** * virGetLastError: * @@ -669,21 +715,10 @@ virRaiseErrorFull(virConnectPtr conn, if (!to) return; /* Hit OOM allocating thread error object, sod all we can do now */ - virResetError(to); - - if (code == VIR_ERR_OK) + /* If code is OK, clear the previous error and return. */ + if (code == VIR_ERR_OK) { + virResetError(to); return; - - /* - * try to find the best place to save and report the error - */ - if (conn != NULL) { - virMutexLock(&conn->lock); - if (conn->handler != NULL) { - handler = conn->handler; - userData = conn->userData; - } - virMutexUnlock(&conn->lock); } /* @@ -702,6 +737,25 @@ virRaiseErrorFull(virConnectPtr conn, virLogMessage(virErrorDomainName(domain), virErrorLevelPriority(level), funcname, linenr, 1, "%s", str); + /* Error reporting has been deliberately disabled, don't wipe out the + * previous error, just log what was passed and return. */ + if (!virGetEnableErrors()) { + VIR_FREE(str); + return; + } + + /* + * try to find the best place to save and report the error + */ + if (conn != NULL) { + virMutexLock(&conn->lock); + if (conn->handler != NULL) { + handler = conn->handler; + userData = conn->userData; + } + virMutexUnlock(&conn->lock); + } + /* * Save the information about the error */ diff --git a/src/util/virterror_internal.h b/src/util/virterror_internal.h index da89de7..2f0e327 100644 --- a/src/util/virterror_internal.h +++ b/src/util/virterror_internal.h @@ -89,6 +89,8 @@ void virReportOOMErrorFull(virConnectPtr conn, virReportOOMErrorFull((conn), VIR_FROM_THIS, \ __FILE__, __FUNCTION__, __LINE__) +void virEnableErrors(void); +void virDisableErrors(void); void virSetGlobalError(void); void virSetConnError(virConnectPtr conn); -- 1.6.5.1

This way we won't squash the original error report in MigratePerform as done for P2P and Tunneled migration. Signed-off-by: Cole Robinson <crobinso@redhat.com> --- src/qemu/qemu_driver.c | 7 +++++++ 1 files changed, 7 insertions(+), 0 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 5463951..91fdd3b 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -6753,6 +6753,12 @@ qemudDomainMigrateFinish2 (virConnectPtr dconn, virDomainEventPtr event = NULL; int newVM = 1; + /* Migration failed. Disable error reporting so we don't squash the + * original error report */ + if (retcode != 0) { + virDisableErrors(); + } + qemuDriverLock(driver); vm = virDomainFindByName(&driver->domains, dname); if (!vm) { @@ -6823,6 +6829,7 @@ qemudDomainMigrateFinish2 (virConnectPtr dconn, } cleanup: + virEnableErrors(); if (vm) virDomainObjUnlock(vm); if (event) -- 1.6.5.1

Signed-off-by: Cole Robinson <crobinso@redhat.com> --- src/libvirt.c | 5 +++++ 1 files changed, 5 insertions(+), 0 deletions(-) diff --git a/src/libvirt.c b/src/libvirt.c index 2c50790..2bb24e1 100644 --- a/src/libvirt.c +++ b/src/libvirt.c @@ -3032,6 +3032,10 @@ virDomainMigrateVersion2 (virDomainPtr domain, ret = domain->conn->driver->domainMigratePerform (domain, cookie, cookielen, uri, flags, dname, bandwidth); + /* Perform failed. Disable error reporting to 'Finish' doesn't squash */ + if (ret < 0) + virDisableErrors(); + /* In version 2 of the migration protocol, we pass the * status code from the sender to the destination host, * so it can do any cleanup if the migration failed. @@ -3041,6 +3045,7 @@ virDomainMigrateVersion2 (virDomainPtr domain, (dconn, dname, cookie, cookielen, uri, flags, ret); done: + virEnableErrors(); VIR_FREE (uri_out); VIR_FREE (cookie); return ddomain; -- 1.6.5.1

ShutdownVMDaemon is regularly used in cleanup paths, so let's make sure it doesn't squash any original errors. Signed-off-by: Cole Robinson <crobinso@redhat.com> --- src/qemu/qemu_driver.c | 6 ++++++ 1 files changed, 6 insertions(+), 0 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 91fdd3b..281c02d 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -2174,6 +2174,10 @@ static void qemudShutdownVMDaemon(virConnectPtr conn, VIR_DEBUG(_("Shutting down VM '%s'\n"), vm->def->name); + /* This method is routinely used in clean up paths. Disable error + * reporting so we don't squash a legit error. */ + virDisableErrors(); + if (driver->macFilter) { int i; virDomainDefPtr def = vm->def; @@ -2256,6 +2260,8 @@ retry: vm->def->id = -1; vm->newDef = NULL; } + + virEnableErrors(); } -- 1.6.5.1

This shouldn't be necessary, as the ideal case is only a single error message is raised before a routine exits, and it will be printed by the default handler. However, libvirt bugs can cause several errors to be raised over one another, and non default error handlers (like virsh's) will hide these. And now with the ability to temporarily disable error raising, we want to be sure that these interim messages are logged in some way. Signed-off-by: Cole Robinson <crobinso@redhat.com> --- src/util/logging.c | 3 ++- src/util/virterror.c | 9 ++++++++- 2 files changed, 10 insertions(+), 2 deletions(-) diff --git a/src/util/logging.c b/src/util/logging.c index 4899de6..d2c355c 100644 --- a/src/util/logging.c +++ b/src/util/logging.c @@ -504,7 +504,8 @@ cleanup: * @priority: the priority level * @funcname: the function emitting the (debug) message * @linenr: line where the message was emitted - * @flags: extra flags, 1 if coming from the error handler + * @flags: extra flags, 1 if the message shouldn't be logged to stderr + * when no logging handlers are set up. * @fmt: the string format * @...: the arguments * diff --git a/src/util/virterror.c b/src/util/virterror.c index 58b3086..f4c8766 100644 --- a/src/util/virterror.c +++ b/src/util/virterror.c @@ -705,6 +705,7 @@ virRaiseErrorFull(virConnectPtr conn, void *userData = virUserData; virErrorFunc handler = virErrorHandler; char *str; + int logging_flags; /* * All errors are recorded in thread local storage @@ -730,12 +731,18 @@ virRaiseErrorFull(virConnectPtr conn, VIR_GET_VAR_STR(fmt, str); } + /* Only print message to stderr if log level == DEBUG */ + if (virLogGetDefaultPriority() == VIR_LOG_DEBUG) + logging_flags = 0; + else + logging_flags = 1; + /* * Hook up the error or warning to the logging facility * XXXX should we include filename as 'category' instead of domain name ? */ virLogMessage(virErrorDomainName(domain), virErrorLevelPriority(level), - funcname, linenr, 1, "%s", str); + funcname, linenr, logging_flags, "%s", str); /* Error reporting has been deliberately disabled, don't wipe out the * previous error, just log what was passed and return. */ -- 1.6.5.1

On Thu, Nov 05, 2009 at 01:17:41PM -0500, Cole Robinson wrote:
In testing migration, I was hitting an error in the Perform step, but this was being returned as 'Unknown Error' via virsh. The reason is that even a failed Perform will call MigrateFinish to do cleanup, but Finish will always return an NULL == error in that case, overwriting the original message.
I've added internal APIs which allow temporarily disabling error reporting: any ReportError calls will log the error message, but will not overwrite the previous error and will not trigger the error callback. These APIs are then used in a few places where they are needed.
I don't particularly like this as a concept. Could we do this the other way around, whereby we call virSaveLastError() before the bit of code which might clobber the error we want, and then add a new method to set it back afterwards. Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

On 11/09/2009 07:57 AM, Daniel P. Berrange wrote:
On Thu, Nov 05, 2009 at 01:17:41PM -0500, Cole Robinson wrote:
In testing migration, I was hitting an error in the Perform step, but this was being returned as 'Unknown Error' via virsh. The reason is that even a failed Perform will call MigrateFinish to do cleanup, but Finish will always return an NULL == error in that case, overwriting the original message.
I've added internal APIs which allow temporarily disabling error reporting: any ReportError calls will log the error message, but will not overwrite the previous error and will not trigger the error callback. These APIs are then used in a few places where they are needed.
I don't particularly like this as a concept. Could we do this the other way around, whereby we call virSaveLastError() before the bit of code which might clobber the error we want, and then add a new method to set it back afterwards.
Daniel
Tried that originally. Problem is it doesn't easily work for virsh. If a libvirt call errors, virsh doesn't print the last error from GetLastError, it prints the last error that triggered the error callback. So for the end result to be reached, we would need to fire error callback twice for the same msg: once when the msg is raised, again when we reraise it after it was squashed. That would muddle up debug output at least. Maybe virsh just needs to be fixed. - Cole

On Mon, Nov 09, 2009 at 08:29:04AM -0500, Cole Robinson wrote:
On 11/09/2009 07:57 AM, Daniel P. Berrange wrote:
On Thu, Nov 05, 2009 at 01:17:41PM -0500, Cole Robinson wrote:
In testing migration, I was hitting an error in the Perform step, but this was being returned as 'Unknown Error' via virsh. The reason is that even a failed Perform will call MigrateFinish to do cleanup, but Finish will always return an NULL == error in that case, overwriting the original message.
I've added internal APIs which allow temporarily disabling error reporting: any ReportError calls will log the error message, but will not overwrite the previous error and will not trigger the error callback. These APIs are then used in a few places where they are needed.
I don't particularly like this as a concept. Could we do this the other way around, whereby we call virSaveLastError() before the bit of code which might clobber the error we want, and then add a new method to set it back afterwards.
Daniel
Tried that originally. Problem is it doesn't easily work for virsh. If a libvirt call errors, virsh doesn't print the last error from GetLastError, it prints the last error that triggered the error callback. So for the end result to be reached, we would need to fire error callback twice for the same msg: once when the msg is raised, again when we reraise it after it was squashed. That would muddle up debug output at least.
That's actually a flaw in the way we trigger the error callback. In the current code we fire the callback the momnent virRaiseError is called internally. This is bad because internal code may reset the error if it doesn't want it propagated. It is also bad because the callpath may be holding active mutex locks which can then result in deadlock if the error callback impl from the app calls back into libvirt. We need the triggering of the error callback to be the absolute last thing done in the API call, which means it really needs to be triggered explicitly from the entry points in src/libvirt.c. I started work on this a while back, but then got side-tracked so never finished it http://gitorious.org/~berrange/libvirt/staging/commits/error-callbacks Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
participants (2)
-
Cole Robinson
-
Daniel P. Berrange