Re: [libvirt] [PATCH] qemu: Report errors from iohelper

24 Oct 2012


      On 23.10.2012 16:51, Eric Blake wrote:
...
On 10/23/2012 03:39 AM, Michal Privoznik wrote:
...
Currently, we use iohelper when saving/restoring a domain.
However, if there's some kind of error (like I/O) it is not
propagated to libvirt. Since it is not qemu who is doing
the actual write() it will not get error. The iohelper does.
Therefore we should check for iohelper errors as it makes
libvirt more user friendly.
---
 src/libvirt_private.syms |    1 +
 src/qemu/qemu_driver.c   |    7 ++++-
 src/util/virfile.c       |   47 ++++++++++++++++++++++++++++++++++++++++++++++
 src/util/virfile.h       |    2 +
 4 files changed, 55 insertions(+), 2 deletions(-)
I agree that catching iohelper errors is needed, but I'm worried that
you might introduce deadlock with this approach (then again, it may just
be theoretical):
...
+++ b/src/qemu/qemu_driver.c
@@ -2945,6 +2945,7 @@ endjob:
                 if (rc < 0)
                     VIR_WARN("Unable to resume guest CPUs after save failure");
             }
+            virFileWrapperFdCatchError(wrapperFd);
         }
         if (qemuDomainObjEndAsyncJob(driver, vm) == 0)
             vm = NULL;
@@ -3282,9 +3283,11 @@ doCoreDump(struct qemud_driver *driver,
cleanup:
     VIR_FORCE_CLOSE(fd);
-    virFileWrapperFdFree(wrapperFd);
-    if (ret != 0)
+    if (ret != 0) {
+        virFileWrapperFdCatchError(wrapperFd);
         unlink(path);
You aren't doing any reading of the stderr file descriptor until you get
to the cleanup label.  But suppose that iohelper encounters situations
that cause it to write more than PIPE_BUF bytes to stderr, but while
still continuing to run.  Then, because no one is reading the other end
of the pipe, iohelper blocks on writes to stderr instead of proceeding
on with the rest of its operations, and thus never gets to a point where
it will exit in order to kick libvirtd over to the cleanup label to
start reading the stderr pipe in the first place.
...
+++ b/src/util/virfile.c
@@ -135,6 +135,7 @@ virFileDirectFdFlag(void)
  * read-write is not supported, just a single direction.  */
 struct _virFileWrapperFd {
     virCommandPtr cmd; /* Child iohelper process to do the I/O.  */
+    int err_fd; /* FD to read stderr of @cmd */
 };
#ifndef WIN32
@@ -229,6 +230,14 @@ virFileWrapperFdNew(int *fd, const char *name, unsigned int flags)
         virCommandAddArg(ret->cmd, "0");
     }
+    /* In order to catch iohelper stderr, we must:
+     * - pass a FD to virCommand (-1 to auto-allocate one)
+     * - change iohelper's env so virLog functions print to stderr
+     */
+    ret->err_fd = -1;
+    virCommandSetErrorFD(ret->cmd, &ret->err_fd);
+    virCommandAddEnvPair(ret->cmd, "LIBVIRT_LOG_OUTPUTS", "1:stderr");
Would it work to use virCommandSetErrorBuffer() instead, or does that
rely on using virCommandRun() to properly populate the buffer via
poll(), where we are stuck using virCommandRunAsync() instead?  Or,
since we are already using virCommandRunAsync(), does that mean we need
to figure out how to incorporate a poll() to favorably handle stderr in
addition to everything else that iohelper is already  handling?
Yeah, I've tried te virCommandSetErrorBuffer() way firstly. But it
requires the poll() which is wrapped in virCommandProcessIO(). However,
it is static so it would needed to be dig out and run in a separate
thread. But we can run virCommandRun directly then. So I think I can
incorporate our event loop. Just add a callback once ret->err_fd is
initialized.
...
Or am I worrying too much, and if iohelper hits any issue where it would
be logging to stderr in the first place, it will be exiting before
filling up the stderr pipe and thus never deadlocking?
Well, with current implementation of iohelper there is only one place
where it doesn't die right after reporting an error:
src/util/iohelper.c:160

I don't know how possible is to hit that.
...
At any rate, depending on how that question is answered, this code does
indeed look useful.
Okay, since we would have to keep this limitation in mind when changing
iohelper (and honestly, I'll forget immediately after the push) I think
we should make this safer and read the error pipe in the event loop.

I'll update the patch and resend.

Michal