[libvirt] [PATCH v2 0/3] qemu: Report better error on dump/migrate failure

Changes from [v1]: * Use VIR_FREE() followed by VIR_ALLOC_N() instead of manually setting the last (and only) byte of the array returned by VIR_REALLOC_N() to zero. [v1] https://www.redhat.com/archives/libvir-list/2019-February/msg00156.html Andrea Bolognani (3): vircommand: Ensure buffers are NULL-terminated virfile: Report error in virFileWrapperFdFree() iohelper: Don't include newlines in error messages src/util/iohelper.c | 2 +- src/util/vircommand.c | 6 ++++-- src/util/virfile.c | 6 +++++- 3 files changed, 10 insertions(+), 4 deletions(-) -- 2.20.1

The memory allocated by VIR_REALLOC_N() is uninitialized, which means it's not possible to figure out whether any output was produced at all after the fact. Since we don't care about the previous contents of buffers, if any, use VIR_FREE() followed by VIR_ALLOC_N() instead. Signed-off-by: Andrea Bolognani <abologna@redhat.com> --- src/util/vircommand.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/src/util/vircommand.c b/src/util/vircommand.c index d965068369..3d533c68a6 100644 --- a/src/util/vircommand.c +++ b/src/util/vircommand.c @@ -2055,12 +2055,14 @@ virCommandProcessIO(virCommandPtr cmd) * results accumulated over a prior run of the same command. */ if (cmd->outbuf) { outfd = cmd->outfd; - if (VIR_REALLOC_N(*cmd->outbuf, 1) < 0) + VIR_FREE(*cmd->outbuf); + if (VIR_ALLOC_N(*cmd->outbuf, 1) < 0) ret = -1; } if (cmd->errbuf) { errfd = cmd->errfd; - if (VIR_REALLOC_N(*cmd->errbuf, 1) < 0) + VIR_FREE(*cmd->errbuf); + if (VIR_ALLOC_N(*cmd->errbuf, 1) < 0) ret = -1; } if (ret == -1) -- 2.20.1

On 2/13/19 7:04 AM, Andrea Bolognani wrote:
The memory allocated by VIR_REALLOC_N() is uninitialized, which means it's not possible to figure out whether any output was produced at all after the fact.
Since we don't care about the previous contents of buffers, if any, use VIR_FREE() followed by VIR_ALLOC_N() instead.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> --- src/util/vircommand.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
Reviewed-by: John Ferlan <jferlan@redhat.com> John

On Wed, Feb 13, 2019 at 01:04:06PM +0100, Andrea Bolognani wrote:
The memory allocated by VIR_REALLOC_N() is uninitialized, which means it's not possible to figure out whether any output was produced at all after the fact.
Since we don't care about the previous contents of buffers, if any, use VIR_FREE() followed by VIR_ALLOC_N() instead.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> --- src/util/vircommand.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

Logging the error is fine and all, but getting the information to the user directly is even better. https://bugzilla.redhat.com/show_bug.cgi?id=1578741 Signed-off-by: Andrea Bolognani <abologna@redhat.com> --- src/util/virfile.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/src/util/virfile.c b/src/util/virfile.c index 271bf5e796..30cad87df9 100644 --- a/src/util/virfile.c +++ b/src/util/virfile.c @@ -351,8 +351,12 @@ virFileWrapperFdFree(virFileWrapperFdPtr wfd) if (!wfd) return; + /* If the command used to process IO has produced errors, it's fair + * to assume those will be more relevant to the user than whatever + * eg. QEMU can figure out on its own, so it's okay if we end up + * discarding an existing error */ if (wfd->err_msg && *wfd->err_msg) - VIR_WARN("iohelper reports: %s", wfd->err_msg); + virReportError(VIR_ERR_OPERATION_FAILED, "%s", wfd->err_msg); virCommandAbort(wfd->cmd); -- 2.20.1

On 2/13/19 7:04 AM, Andrea Bolognani wrote:
Logging the error is fine and all, but getting the information to the user directly is even better.
https://bugzilla.redhat.com/show_bug.cgi?id=1578741 Signed-off-by: Andrea Bolognani <abologna@redhat.com> --- src/util/virfile.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
Shall I assume this works because qemuProcessHandleDumpCompleted will move this message now as opposed to some other message related to EPIPE as noted in the bz response? I think it's good to fill in pieces of the commit message at least so if someone else ends up in this same code they have a few hints where to start. My other throughts would be to note commit 1f25194ad and 34e8f63a3 as the genesis of at least printing the message "somewhere"... Then commit b0c3e931 at least made sure the message got printed in the event that the *FdClose never occurred. Just saying "is fine and all" hand waves a bit too much for me compared to what you got to figure out here ;-). Reviewed-by: John Ferlan <jferlan@redhat.com> John
diff --git a/src/util/virfile.c b/src/util/virfile.c index 271bf5e796..30cad87df9 100644 --- a/src/util/virfile.c +++ b/src/util/virfile.c @@ -351,8 +351,12 @@ virFileWrapperFdFree(virFileWrapperFdPtr wfd) if (!wfd) return;
+ /* If the command used to process IO has produced errors, it's fair + * to assume those will be more relevant to the user than whatever + * eg. QEMU can figure out on its own, so it's okay if we end up + * discarding an existing error */ if (wfd->err_msg && *wfd->err_msg) - VIR_WARN("iohelper reports: %s", wfd->err_msg); + virReportError(VIR_ERR_OPERATION_FAILED, "%s", wfd->err_msg);
virCommandAbort(wfd->cmd);

On Mon, 2019-02-18 at 11:04 -0500, John Ferlan wrote:
On 2/13/19 7:04 AM, Andrea Bolognani wrote:
Logging the error is fine and all, but getting the information to the user directly is even better.
https://bugzilla.redhat.com/show_bug.cgi?id=1578741 Signed-off-by: Andrea Bolognani <abologna@redhat.com> --- src/util/virfile.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
Shall I assume this works because qemuProcessHandleDumpCompleted will move this message now as opposed to some other message related to EPIPE as noted in the bz response?
libvirt_iohelper is used internally by the virFileWrapperFd APIs; more specifically, in the QEMU driver we have the doCoreDump() and qemuDomainSaveMemory() helper functions as users, and those in turn end up being called by the implementation of several driver APIs. By calling virReportError() if libvirt_iohelper has failed, we overwrite whatever generic error message QEMU might have raised with the more useful one generated by the helper program. I'm not sure how qemuProcessHandleDumpCompleted() fits into the picture.
I think it's good to fill in pieces of the commit message at least so if someone else ends up in this same code they have a few hints where to start.
Yeah, I guess the commit message is a bit terse. Would adding something along the lines of the first two paragraphs above work, in your opinion?
My other throughts would be to note commit 1f25194ad and 34e8f63a3 as the genesis of at least printing the message "somewhere"... Then commit b0c3e931 at least made sure the message got printed in the event that the *FdClose never occurred. Just saying "is fine and all" hand waves a bit too much for me compared to what you got to figure out here ;-).
Honestly, I didn't bother digging into the history of the functionality and simply fixed what was broken without really wondering how it ended up like that in the first place :) -- Andrea Bolognani / Red Hat / Virtualization

On 2/19/19 8:27 AM, Andrea Bolognani wrote:
On Mon, 2019-02-18 at 11:04 -0500, John Ferlan wrote:
On 2/13/19 7:04 AM, Andrea Bolognani wrote:
Logging the error is fine and all, but getting the information to the user directly is even better.
https://bugzilla.redhat.com/show_bug.cgi?id=1578741 Signed-off-by: Andrea Bolognani <abologna@redhat.com> --- src/util/virfile.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
Shall I assume this works because qemuProcessHandleDumpCompleted will move this message now as opposed to some other message related to EPIPE as noted in the bz response?
libvirt_iohelper is used internally by the virFileWrapperFd APIs; more specifically, in the QEMU driver we have the doCoreDump() and qemuDomainSaveMemory() helper functions as users, and those in turn end up being called by the implementation of several driver APIs.
By calling virReportError() if libvirt_iohelper has failed, we overwrite whatever generic error message QEMU might have raised with the more useful one generated by the helper program.
I'm not sure how qemuProcessHandleDumpCompleted() fits into the picture.
My recollection is/was event based mechanism and the HandleDumpCompleted was what ended up being where the message came through whether success or failure. Could be wrong - I perhaps had more in short term recall when I added some processing for memory dump event processing.
I think it's good to fill in pieces of the commit message at least so if someone else ends up in this same code they have a few hints where to start.
Yeah, I guess the commit message is a bit terse. Would adding something along the lines of the first two paragraphs above work, in your opinion?
Sure that's fine. I understand it doesn't mean someone will look at the commit message, but for those that do it can help.
My other throughts would be to note commit 1f25194ad and 34e8f63a3 as the genesis of at least printing the message "somewhere"... Then commit b0c3e931 at least made sure the message got printed in the event that the *FdClose never occurred. Just saying "is fine and all" hand waves a bit too much for me compared to what you got to figure out here ;-).
Honestly, I didn't bother digging into the history of the functionality and simply fixed what was broken without really wondering how it ended up like that in the first place :)
Thankfully gitk makes it easy for me to trace through the history. I know everyone has their favorite blame tools. John

On Wed, Feb 13, 2019 at 01:04:07PM +0100, Andrea Bolognani wrote:
Logging the error is fine and all, but getting the information to the user directly is even better.
https://bugzilla.redhat.com/show_bug.cgi?id=1578741 Signed-off-by: Andrea Bolognani <abologna@redhat.com> --- src/util/virfile.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/src/util/virfile.c b/src/util/virfile.c index 271bf5e796..30cad87df9 100644 --- a/src/util/virfile.c +++ b/src/util/virfile.c @@ -351,8 +351,12 @@ virFileWrapperFdFree(virFileWrapperFdPtr wfd) if (!wfd) return;
+ /* If the command used to process IO has produced errors, it's fair + * to assume those will be more relevant to the user than whatever + * eg. QEMU can figure out on its own, so it's okay if we end up + * discarding an existing error */ if (wfd->err_msg && *wfd->err_msg) - VIR_WARN("iohelper reports: %s", wfd->err_msg); + virReportError(VIR_ERR_OPERATION_FAILED, "%s", wfd->err_msg);
I don't think this is a good place to report errors. We normally write xxxxFree() functions such that they always succeed & don't report errors. Essentially they should just be cleaning up memory and other allocated resources. This is especially true now that we rely on GCC cleanup functions which automagically call virFileWrapperFdFree when the variable goes out of scope. IOW, if we need to report an error from the io helper, then it needs to be done earlier, pehrpas in virFileWrapperFdClose ? Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Tue, 2019-02-19 at 13:58 +0000, Daniel P. Berrangé wrote:
On Wed, Feb 13, 2019 at 01:04:07PM +0100, Andrea Bolognani wrote: [...]
@@ -351,8 +351,12 @@ virFileWrapperFdFree(virFileWrapperFdPtr wfd) if (!wfd) return;
+ /* If the command used to process IO has produced errors, it's fair + * to assume those will be more relevant to the user than whatever + * eg. QEMU can figure out on its own, so it's okay if we end up + * discarding an existing error */ if (wfd->err_msg && *wfd->err_msg) - VIR_WARN("iohelper reports: %s", wfd->err_msg); + virReportError(VIR_ERR_OPERATION_FAILED, "%s", wfd->err_msg);
I don't think this is a good place to report errors. We normally write xxxxFree() functions such that they always succeed & don't report errors. Essentially they should just be cleaning up memory and other allocated resources. This is especially true now that we rely on GCC cleanup functions which automagically call virFileWrapperFdFree when the variable goes out of scope.
IOW, if we need to report an error from the io helper, then it needs to be done earlier, pehrpas in virFileWrapperFdClose ?
As John noted, that's what the original implementation looked like but commit b0c3e931 moved the VIR_WARN() call from Close() to Free() to avoid the situation where jumping out from a function early results in the former not being called. That said, I agree with you that in general we don't want our Free() function to do anything more than release the allocated memory. To get there, though, it looks like some reworking is needed... I'll try to make it work. -- Andrea Bolognani / Red Hat / Virtualization

On Tue, Feb 19, 2019 at 04:42:51PM +0100, Andrea Bolognani wrote:
On Tue, 2019-02-19 at 13:58 +0000, Daniel P. Berrangé wrote:
On Wed, Feb 13, 2019 at 01:04:07PM +0100, Andrea Bolognani wrote: [...]
@@ -351,8 +351,12 @@ virFileWrapperFdFree(virFileWrapperFdPtr wfd) if (!wfd) return;
+ /* If the command used to process IO has produced errors, it's fair + * to assume those will be more relevant to the user than whatever + * eg. QEMU can figure out on its own, so it's okay if we end up + * discarding an existing error */ if (wfd->err_msg && *wfd->err_msg) - VIR_WARN("iohelper reports: %s", wfd->err_msg); + virReportError(VIR_ERR_OPERATION_FAILED, "%s", wfd->err_msg);
I don't think this is a good place to report errors. We normally write xxxxFree() functions such that they always succeed & don't report errors. Essentially they should just be cleaning up memory and other allocated resources. This is especially true now that we rely on GCC cleanup functions which automagically call virFileWrapperFdFree when the variable goes out of scope.
IOW, if we need to report an error from the io helper, then it needs to be done earlier, pehrpas in virFileWrapperFdClose ?
As John noted, that's what the original implementation looked like but commit b0c3e931 moved the VIR_WARN() call from Close() to Free() to avoid the situation where jumping out from a function early results in the former not being called.
I think we should really just revert that commit and make sure the callers will invoke Close() in all paths & then make Close use virReportError instead of WARN. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Tue, 2019-02-19 at 15:49 +0000, Daniel P. Berrangé wrote:
On Tue, Feb 19, 2019 at 04:42:51PM +0100, Andrea Bolognani wrote:
On Tue, 2019-02-19 at 13:58 +0000, Daniel P. Berrangé wrote:
IOW, if we need to report an error from the io helper, then it needs to be done earlier, pehrpas in virFileWrapperFdClose ?
As John noted, that's what the original implementation looked like but commit b0c3e931 moved the VIR_WARN() call from Close() to Free() to avoid the situation where jumping out from a function early results in the former not being called.
I think we should really just revert that commit and make sure the callers will invoke Close() in all paths & then make Close use virReportError instead of WARN.
Yeah, that's basically what I'm gonna try to do in v3 :) -- Andrea Bolognani / Red Hat / Virtualization

The newline was pretty arbitrary, and we're better off without it. Signed-off-by: Andrea Bolognani <abologna@redhat.com> --- src/util/iohelper.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/util/iohelper.c b/src/util/iohelper.c index 1ff4a7b314..aed7ef3184 100644 --- a/src/util/iohelper.c +++ b/src/util/iohelper.c @@ -236,7 +236,7 @@ main(int argc, char **argv) return 0; error: - fprintf(stderr, _("%s: failure with %s\n: %s"), + fprintf(stderr, _("%s: failure with %s: %s"), program_name, path, virGetLastErrorMessage()); exit(EXIT_FAILURE); } -- 2.20.1

On 2/13/19 7:04 AM, Andrea Bolognani wrote:
The newline was pretty arbitrary, and we're better off without it.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> --- src/util/iohelper.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
I'm mostly ambivalent about this one; however, since virGetLastErrorMessage could return a string without a "\n", then perhaps it's best to keep the \n since it really doesn't hurt. John
diff --git a/src/util/iohelper.c b/src/util/iohelper.c index 1ff4a7b314..aed7ef3184 100644 --- a/src/util/iohelper.c +++ b/src/util/iohelper.c @@ -236,7 +236,7 @@ main(int argc, char **argv) return 0;
error: - fprintf(stderr, _("%s: failure with %s\n: %s"), + fprintf(stderr, _("%s: failure with %s: %s"), program_name, path, virGetLastErrorMessage()); exit(EXIT_FAILURE); }

On Mon, 2019-02-18 at 11:05 -0500, John Ferlan wrote:
On 2/13/19 7:04 AM, Andrea Bolognani wrote:
The newline was pretty arbitrary, and we're better off without it.
I'm mostly ambivalent about this one; however, since virGetLastErrorMessage could return a string without a "\n", then perhaps it's best to keep the \n since it really doesn't hurt.
Without this patch, the user will end up seeing $ sudo virsh dump guest /small/guest.dump --memory-only error: Failed to core dump domain guest to /small/guest.dump error: operation failed: /usr/libexec/libvirt_iohelper: failure with /small/guest.dump : Unable to write /small/guest.dump: No space left on device instead of the more reasonable $ sudo virsh dump guest /small/guest.dump --memory-only error: Failed to core dump domain guest to /small/guest.dump error: operation failed: /usr/libexec/libvirt_iohelper: failure with /small/guest.dump: Unable to write /small/guest.dump: No space left on device Now, neither is optimal and the way libvirt_iohelper formats its error messages should be tweaked further, but at least in the latter case the error is not split randomly with the second line starting with a colon, which is an improvement in my book. I was also unable to find other examples of messages passed to virReportError(), which is what will ultimately happen to this output, containing newlines. To be fair, I have not really spent a lot of time looking for them either :) -- Andrea Bolognani / Red Hat / Virtualization

On 2/19/19 8:44 AM, Andrea Bolognani wrote:
On Mon, 2019-02-18 at 11:05 -0500, John Ferlan wrote:
On 2/13/19 7:04 AM, Andrea Bolognani wrote:
The newline was pretty arbitrary, and we're better off without it.
I'm mostly ambivalent about this one; however, since virGetLastErrorMessage could return a string without a "\n", then perhaps it's best to keep the \n since it really doesn't hurt.
Without this patch, the user will end up seeing
$ sudo virsh dump guest /small/guest.dump --memory-only error: Failed to core dump domain guest to /small/guest.dump error: operation failed: /usr/libexec/libvirt_iohelper: failure with /small/guest.dump : Unable to write /small/guest.dump: No space left on device
instead of the more reasonable
$ sudo virsh dump guest /small/guest.dump --memory-only error: Failed to core dump domain guest to /small/guest.dump error: operation failed: /usr/libexec/libvirt_iohelper: failure with /small/guest.dump: Unable to write /small/guest.dump: No space left on device
Now, neither is optimal and the way libvirt_iohelper formats its error messages should be tweaked further, but at least in the latter case the error is not split randomly with the second line starting with a colon, which is an improvement in my book.
I was also unable to find other examples of messages passed to virReportError(), which is what will ultimately happen to this output, containing newlines. To be fair, I have not really spent a lot of time looking for them either :)
I didn't dig too much either, but I was perhaps overthinking the other end of the message. I went straight to virGetLastErrorMessage and noted it "could" return "no error" or "unknown error" without the '\n' at the end which IIUC would be generated for virReportError (or err->message) values. Anyway, see commit b29e08db... I think a case could be made in the commit message ;-) that prior to that commit the '\n' was (properly) at the end, but with that commit message the '\n' was perhaps put in the wrong place. So maybe this changes to move the '\n' to after the formatted message. Of course that means perhaps an extra blank line for most outputs. In the long run, it's a blank line adjustment, so consider this Reviewed-by: John Ferlan <jferlan@redhat.com> for me too (I see Dan responded in the mean time). John

On Tue, 2019-02-19 at 09:00 -0500, John Ferlan wrote:
Anyway, see commit b29e08db... I think a case could be made in the commit message ;-) that prior to that commit the '\n' was (properly) at the end, but with that commit message the '\n' was perhaps put in the wrong place.
Yeah, looking at that commit it's pretty obvious that the \n ended up in the middle of the string rather than at the end of it by mistake.
So maybe this changes to move the '\n' to after the formatted message. Of course that means perhaps an extra blank line for most outputs.
Since libvirt_iohelper is really only intended to be called from libvirt itself, and whatever it outputs will ultimately end up as an argument to either VIR_WARN() or virReportError(), I think it makes complete sense for error messages to *not* end with a newline. In fact, all but two strings in the program contain \n after this change. I'll post a separate patch that gets rid of those too. -- Andrea Bolognani / Red Hat / Virtualization

On Mon, Feb 18, 2019 at 11:05:50AM -0500, John Ferlan wrote:
On 2/13/19 7:04 AM, Andrea Bolognani wrote:
The newline was pretty arbitrary, and we're better off without it.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> --- src/util/iohelper.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
I'm mostly ambivalent about this one; however, since virGetLastErrorMessage could return a string without a "\n", then perhaps it's best to keep the \n since it really doesn't hurt.
Whether or not virGetLastErrorMessage has a trailing "\n" in the returned string is tangential to what this patch is changing. The newline being removed here is the one printed after the filename "path" parameter, not the error message parameter.
diff --git a/src/util/iohelper.c b/src/util/iohelper.c index 1ff4a7b314..aed7ef3184 100644 --- a/src/util/iohelper.c +++ b/src/util/iohelper.c @@ -236,7 +236,7 @@ main(int argc, char **argv) return 0;
error: - fprintf(stderr, _("%s: failure with %s\n: %s"), + fprintf(stderr, _("%s: failure with %s: %s"), program_name, path, virGetLastErrorMessage()); exit(EXIT_FAILURE);
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
participants (3)
-
Andrea Bolognani
-
Daniel P. Berrangé
-
John Ferlan