Re: [libvirt PATCH v3 09/18] qemu: add functions to start and stop nbdkit

14 Dec 2022

      On 12/9/22 9:21 AM, Peter Krempa wrote:
...
On Thu, Oct 20, 2022 at 16:59:00 -0500, Jonathon Jongsma wrote:
...
Add some helper functions to build a virCommand object and run the
nbdkit process for a given virStorageSource.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
---
  src/qemu/qemu_nbdkit.c | 251 +++++++++++++++++++++++++++++++++++++++++
  src/qemu/qemu_nbdkit.h |  10 ++
  2 files changed, 261 insertions(+)
[...]
...
+static virCommand *
+qemuNbdkitProcessBuildCommand(qemuNbdkitProcess *proc)
+{
+    g_autoptr(virCommand) cmd = virCommandNewArgList(proc->caps->path,
+                                                     "--exit-with-parent",
The documentation for this option states:
If the parent process exits, we exit.  This can be used to avoid
            complicated cleanup or orphaned nbdkit processes.  There are some
            important caveats with this, see "EXIT WITH PARENT" in
            nbdkit-captive(1).
But this doesn't really make much sense in our case. We very
specifically daemonize the process so it becomes parent of init, thus
there's no point waiting for the parent.
...
+                                                     "--unix",
+                                                     proc->socketfile,
+                                                     "--foreground",
+                                                     //"--selinux-label",
+                                                     //selinux_label,
+                                                     NULL);
[...]
...
+qemuNbdkitProcessStart(qemuNbdkitProcess *proc,
+                       virDomainObj *vm,
+                       virQEMUDriver *driver)
+{
+    g_autoptr(virCommand) cmd = NULL;
+    int rc;
+    int exitstatus = 0;
+    int cmdret = 0;
+    VIR_AUTOCLOSE errfd = -1;
+    virTimeBackOffVar timebackoff;
+    bool socketCreated = false;
+
+    if (!(cmd = qemuNbdkitProcessBuildCommand(proc)))
+        return -1;
+
+    VIR_DEBUG("starting nbdkit process for %s", proc->source->nodestorage);
+    virCommandSetErrorFD(cmd, &errfd);
+    virCommandSetPidFile(cmd, proc->pidfile);
+
+    if (qemuExtDeviceLogCommand(driver, vm, cmd, "nbdkit") < 0)
+        goto error;
+
+    if (qemuSecurityCommandRun(driver, vm, cmd, proc->user, proc->group, &exitstatus, &cmdret) < 0)
+        goto error;
+
+    if (cmdret < 0 || exitstatus != 0) {
+        virReportError(VIR_ERR_INTERNAL_ERROR,
+                       _("Could not start 'nbdkit'. exitstatus: %d"), exitstatus);
+        goto error;
+    }
+
+    if ((rc = virPidFileReadPath(proc->pidfile, &proc->pid)) < 0) {
+        virReportSystemError(-rc,
+                             _("Failed to read pidfile %s"),
+                             proc->pidfile);
+        goto error;
+    }
+
+    if (virTimeBackOffStart(&timebackoff, 1, 1000) < 0)
+        goto error;
+
+    while (virTimeBackOffWait(&timebackoff)) {
+        if ((socketCreated = virFileExists(proc->socketfile)))
+            break;
+
+        if (virProcessKill(proc->pid, 0) == 0)
+            continue;
+
+        goto error;
+    }
+
+    if (!socketCreated) {
+        virReportError(VIR_ERR_OPERATION_TIMEOUT, "%s",
+                       _("nbdkit socket did not show up"));
+        goto error;
+    }
+
+    return 0;
+
+ error:
+    if (errfd > 0) {
+        g_autofree char *errbuf = g_new0(char, 1024);
+        if (read(errfd, errbuf, 1024) > 0)
+            virReportError(VIR_ERR_OPERATION_FAILED,
+                           _("nbdkit failed to start and reported: %s"), errbuf);
This error reporting doesn't seem to work. I tried starting a VM and got
an error from qemu that the NBD socket is not available:
error: internal error: process exited while connecting to monitor: 2022-12-09T15:12:25.558050Z qemu-system-x86_64: -blockdev {"driver":"nbd","server":{"type":"unix","path":"/var/lib/libvirt/qemu/domain-11-cd/nbdkit-libvirt-1-storage.socket"},"node-name":"libvirt-1-storage","auto-read-only":true,"discard":"unmap"}: Requested export not available
So I enabled logging from nbdkit into logging into the system journal
and got:
Dec 09 16:07:55 speedmetal nbdkit[2115190]: curl[1]: problem doing HEAD request to fetch size of URL [http://HOST:80/name]: Unsupported protocol: Protocol "https" not supported or disabled in libcurl
Which is true, the tested host forces https, but libvirt didn't report
it at all.
I think you'll need to convert this to properly capture the stdout/err
via virtlogd as we do with other external process helpers.
Adding Rich to cc on this subthread as well.

Peter and I were discussing this on IRC and were wondering if nbdkit 
provides (or could provide) something that could help with error 
reporting here. The issue is that if you configure the the network 
source incorrectly, nbdkit will start up fine, and won't report an error 
until qemu connects to the unix socket and tries to access the nbd 
export. At that point, we report the unhelpful error Peter mentions 
above ("Requested export not available") rather than something helpful 
(e.g. "Couldn't resolve host name: site-that-doesnt-exist.com").

One possibility that we discussed was that libvirt could link against 
libnbd and try to open the socket ourselves before attempting to pass it 
to qemu. This would theoretically trigger an error from nbdkit and allow 
us to report that error before we even get to the point of starting 
qemu. That would at least help catch misconfigurations. But it seems 
unfortunate to add a new dependency just for that. If we could simply 
tell nbdkit to try to access the remote host and fail early rather than 
waiting for a nbd client, that would be nicer from my point of view.

Any thoughts? Based on my past experience, I wouldn't be surprised if 
you've already though of this and there's a plugin that already provides 
this functionality ;)

Jonathon