
On 23.10.2012 16:51, Eric Blake wrote:
On 10/23/2012 03:39 AM, Michal Privoznik wrote:
Currently, we use iohelper when saving/restoring a domain. However, if there's some kind of error (like I/O) it is not propagated to libvirt. Since it is not qemu who is doing the actual write() it will not get error. The iohelper does. Therefore we should check for iohelper errors as it makes libvirt more user friendly. --- src/libvirt_private.syms | 1 + src/qemu/qemu_driver.c | 7 ++++- src/util/virfile.c | 47 ++++++++++++++++++++++++++++++++++++++++++++++ src/util/virfile.h | 2 + 4 files changed, 55 insertions(+), 2 deletions(-)
I agree that catching iohelper errors is needed, but I'm worried that you might introduce deadlock with this approach (then again, it may just be theoretical):
+++ b/src/qemu/qemu_driver.c @@ -2945,6 +2945,7 @@ endjob: if (rc < 0) VIR_WARN("Unable to resume guest CPUs after save failure"); } + virFileWrapperFdCatchError(wrapperFd); } if (qemuDomainObjEndAsyncJob(driver, vm) == 0) vm = NULL; @@ -3282,9 +3283,11 @@ doCoreDump(struct qemud_driver *driver,
cleanup: VIR_FORCE_CLOSE(fd); - virFileWrapperFdFree(wrapperFd); - if (ret != 0) + if (ret != 0) { + virFileWrapperFdCatchError(wrapperFd); unlink(path);
You aren't doing any reading of the stderr file descriptor until you get to the cleanup label. But suppose that iohelper encounters situations that cause it to write more than PIPE_BUF bytes to stderr, but while still continuing to run. Then, because no one is reading the other end of the pipe, iohelper blocks on writes to stderr instead of proceeding on with the rest of its operations, and thus never gets to a point where it will exit in order to kick libvirtd over to the cleanup label to start reading the stderr pipe in the first place.
+++ b/src/util/virfile.c @@ -135,6 +135,7 @@ virFileDirectFdFlag(void) * read-write is not supported, just a single direction. */ struct _virFileWrapperFd { virCommandPtr cmd; /* Child iohelper process to do the I/O. */ + int err_fd; /* FD to read stderr of @cmd */ };
#ifndef WIN32 @@ -229,6 +230,14 @@ virFileWrapperFdNew(int *fd, const char *name, unsigned int flags) virCommandAddArg(ret->cmd, "0"); }
+ /* In order to catch iohelper stderr, we must: + * - pass a FD to virCommand (-1 to auto-allocate one) + * - change iohelper's env so virLog functions print to stderr + */ + ret->err_fd = -1; + virCommandSetErrorFD(ret->cmd, &ret->err_fd); + virCommandAddEnvPair(ret->cmd, "LIBVIRT_LOG_OUTPUTS", "1:stderr");
Would it work to use virCommandSetErrorBuffer() instead, or does that rely on using virCommandRun() to properly populate the buffer via poll(), where we are stuck using virCommandRunAsync() instead? Or, since we are already using virCommandRunAsync(), does that mean we need to figure out how to incorporate a poll() to favorably handle stderr in addition to everything else that iohelper is already handling?
Yeah, I've tried te virCommandSetErrorBuffer() way firstly. But it requires the poll() which is wrapped in virCommandProcessIO(). However, it is static so it would needed to be dig out and run in a separate thread. But we can run virCommandRun directly then. So I think I can incorporate our event loop. Just add a callback once ret->err_fd is initialized.
Or am I worrying too much, and if iohelper hits any issue where it would be logging to stderr in the first place, it will be exiting before filling up the stderr pipe and thus never deadlocking?
Well, with current implementation of iohelper there is only one place where it doesn't die right after reporting an error: src/util/iohelper.c:160 I don't know how possible is to hit that.
At any rate, depending on how that question is answered, this code does indeed look useful.
Okay, since we would have to keep this limitation in mind when changing iohelper (and honestly, I'll forget immediately after the push) I think we should make this safer and read the error pipe in the event loop. I'll update the patch and resend. Michal