[libvirt] [PATCH 0/5] Support embedding the QEMU driver in client apps directly

This patch series proposes a new way to build a "libvirt QEMU shim" which supports more use cases than the previous approach from last year, as well as being vastly simpler to implement. Specifically - We can use this to embed the QEMU driver in unit tests allowing the full range of odriver functionality to be exercised during *unit* testing without interfering with the host OS libvirtd. - Apps like libguestfs can embed the QEMU driver to allow them to spawn VMs that are immediate children of their app, thus inheriting process context, as well as hiding these "service VM" from the main libvirtd. This would avoid a common irritation where libguestfs VMs suddenly appear & disappear in virt-manager's VM list. - Apps like kubevirt could use this to eliminate the need to use the RPC layer for libvirt. It can directly embed the QEMU driver giving it more direct control over how it is run & again inheriting process state. This is useful for kubevirt's one VM per container model where Kubernetes itself provides the aggregated view, thus making libvirtd's aggregated view redundant - The previous shim was very much one shim == one QEMU process. This new embed approach supports that of course, but there's also the option to run multiple VMs if desired. - Using a virtual root directory for QEMU driver state meams we do not need to solve the hard refactoring problem of getting the main libvirtd to detect the VMs launched via the shim. They can safely live independantly. - Thanks to the driver refactoring work it is still possible for the embedded drivers to talk to main libvirtd to use the secondary drivers for storage/network/etc if needed. Ideally we would allow those secondary drivers to be embedded too but that's not implemented here. Most important would be the secrets driver. Other drivers are much less important. Daniel P. Berrangé (5): access: report an error if no access manager is present libvirt: pass a root directory path into drivers qemu: honour the root parameter during driver initialization libvirt: support an "embed" URI path selector for QEMU driver qemu: introduce a new style libvirt_qemu_shim program src/access/viraccessmanager.c | 5 ++ src/driver-state.h | 1 + src/interface/interface_backend_netcf.c | 1 + src/interface/interface_backend_udev.c | 1 + src/libvirt.c | 41 ++++++++++++++- src/libvirt_internal.h | 1 + src/libxl/libxl_driver.c | 1 + src/lxc/lxc_driver.c | 2 + src/network/bridge_driver.c | 1 + src/node_device/node_device_hal.c | 1 + src/node_device/node_device_udev.c | 1 + src/nwfilter/nwfilter_driver.c | 1 + src/qemu/Makefile.inc.am | 7 +++ src/qemu/qemu_conf.c | 41 +++++++++------ src/qemu/qemu_conf.h | 3 +- src/qemu/qemu_driver.c | 9 ++-- src/qemu/qemu_shim.c | 70 +++++++++++++++++++++++++ src/remote/remote_daemon.c | 1 + src/remote/remote_driver.c | 8 +++ src/secret/secret_driver.c | 1 + src/storage/storage_driver.c | 1 + src/vz/vz_driver.c | 1 + 22 files changed, 179 insertions(+), 20 deletions(-) create mode 100644 src/qemu/qemu_shim.c -- 2.21.0

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> --- src/access/viraccessmanager.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/src/access/viraccessmanager.c b/src/access/viraccessmanager.c index f5d62604cf..26a16c6fb3 100644 --- a/src/access/viraccessmanager.c +++ b/src/access/viraccessmanager.c @@ -65,6 +65,11 @@ VIR_ONCE_GLOBAL_INIT(virAccessManager); virAccessManagerPtr virAccessManagerGetDefault(void) { + if (virAccessManagerDefault == NULL) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("No access manager registered")); + return NULL; + } return virObjectRef(virAccessManagerDefault); } -- 2.21.0

Allow drivers to be run against a different virtual root directory. The intent here is to allow the virt drivers to be run directly embedded in an arbitrary process without interfering with libvirtd. This can be useful for doing testing of the virt drivers in "make check" without interfering with the user's own libvirtd instances. It can also be used for an application like libguestfs which doesn't want its VMs to show up in the main libvirtd VM list. XXX make drivers reject root != "" until they are ported to support it. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> --- src/driver-state.h | 1 + src/interface/interface_backend_netcf.c | 1 + src/interface/interface_backend_udev.c | 1 + src/libvirt.c | 2 ++ src/libvirt_internal.h | 1 + src/libxl/libxl_driver.c | 1 + src/lxc/lxc_driver.c | 2 ++ src/network/bridge_driver.c | 1 + src/node_device/node_device_hal.c | 1 + src/node_device/node_device_udev.c | 1 + src/nwfilter/nwfilter_driver.c | 1 + src/qemu/qemu_driver.c | 1 + src/remote/remote_daemon.c | 1 + src/remote/remote_driver.c | 1 + src/secret/secret_driver.c | 1 + src/storage/storage_driver.c | 1 + src/vz/vz_driver.c | 1 + 17 files changed, 19 insertions(+) diff --git a/src/driver-state.h b/src/driver-state.h index a8595662af..68bbf75e06 100644 --- a/src/driver-state.h +++ b/src/driver-state.h @@ -27,6 +27,7 @@ typedef int (*virDrvStateInitialize)(bool privileged, + const char *root, virStateInhibitCallback callback, void *opaque); diff --git a/src/interface/interface_backend_netcf.c b/src/interface/interface_backend_netcf.c index cf8eb9488d..e9a0fee606 100644 --- a/src/interface/interface_backend_netcf.c +++ b/src/interface/interface_backend_netcf.c @@ -76,6 +76,7 @@ virNetcfDriverStateDispose(void *obj) static int netcfStateInitialize(bool privileged, + const char *root ATTRIBUTE_UNUSED, virStateInhibitCallback callback ATTRIBUTE_UNUSED, void *opaque ATTRIBUTE_UNUSED) { diff --git a/src/interface/interface_backend_udev.c b/src/interface/interface_backend_udev.c index 1373356246..7c315c315a 100644 --- a/src/interface/interface_backend_udev.c +++ b/src/interface/interface_backend_udev.c @@ -1159,6 +1159,7 @@ udevInterfaceIsActive(virInterfacePtr ifinfo) static int udevStateInitialize(bool privileged, + const char *root ATTRIBUTE_UNUSED, virStateInhibitCallback callback ATTRIBUTE_UNUSED, void *opaque ATTRIBUTE_UNUSED) { diff --git a/src/libvirt.c b/src/libvirt.c index 7e665b6cba..677f1cef5f 100644 --- a/src/libvirt.c +++ b/src/libvirt.c @@ -638,6 +638,7 @@ virRegisterStateDriver(virStateDriverPtr driver) */ int virStateInitialize(bool privileged, + const char *root, virStateInhibitCallback callback, void *opaque) { @@ -651,6 +652,7 @@ virStateInitialize(bool privileged, VIR_DEBUG("Running global init for %s state driver", virStateDriverTab[i]->name); if (virStateDriverTab[i]->stateInitialize(privileged, + root, callback, opaque) < 0) { VIR_ERROR(_("Initialization of %s state driver failed: %s"), diff --git a/src/libvirt_internal.h b/src/libvirt_internal.h index 8be1257814..c3964d4c38 100644 --- a/src/libvirt_internal.h +++ b/src/libvirt_internal.h @@ -31,6 +31,7 @@ typedef void (*virStateInhibitCallback)(bool inhibit, void *opaque); int virStateInitialize(bool privileged, + const char *root, virStateInhibitCallback inhibit, void *opaque); int virStateCleanup(void); diff --git a/src/libxl/libxl_driver.c b/src/libxl/libxl_driver.c index 2b9c6f1866..59e0d7229d 100644 --- a/src/libxl/libxl_driver.c +++ b/src/libxl/libxl_driver.c @@ -633,6 +633,7 @@ libxlAddDom0(libxlDriverPrivatePtr driver) static int libxlStateInitialize(bool privileged, + const char *root ATTRIBUTE_UNUSED, virStateInhibitCallback callback ATTRIBUTE_UNUSED, void *opaque ATTRIBUTE_UNUSED) { diff --git a/src/lxc/lxc_driver.c b/src/lxc/lxc_driver.c index 9db2a02dee..6d0d3a8ace 100644 --- a/src/lxc/lxc_driver.c +++ b/src/lxc/lxc_driver.c @@ -85,6 +85,7 @@ VIR_LOG_INIT("lxc.lxc_driver"); static int lxcStateInitialize(bool privileged, + const char *root ATTRIBUTE_UNUSED, virStateInhibitCallback callback, void *opaque); static int lxcStateCleanup(void); @@ -1536,6 +1537,7 @@ lxcSecurityInit(virLXCDriverConfigPtr cfg) static int lxcStateInitialize(bool privileged, + const char *root ATTRIBUTE_UNUSED, virStateInhibitCallback callback ATTRIBUTE_UNUSED, void *opaque ATTRIBUTE_UNUSED) { diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index 26f7f80418..ca9deaab4b 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -588,6 +588,7 @@ firewalld_dbus_filter_bridge(DBusConnection *connection ATTRIBUTE_UNUSED, */ static int networkStateInitialize(bool privileged, + const char *root ATTRIBUTE_UNUSED, virStateInhibitCallback callback ATTRIBUTE_UNUSED, void *opaque ATTRIBUTE_UNUSED) { diff --git a/src/node_device/node_device_hal.c b/src/node_device/node_device_hal.c index d1eb6c7851..7b11d4d7f7 100644 --- a/src/node_device/node_device_hal.c +++ b/src/node_device/node_device_hal.c @@ -588,6 +588,7 @@ device_prop_modified(LibHalContext *ctx ATTRIBUTE_UNUSED, static int nodeStateInitialize(bool privileged ATTRIBUTE_UNUSED, + const char *root ATTRIBUTE_UNUSED, virStateInhibitCallback callback ATTRIBUTE_UNUSED, void *opaque ATTRIBUTE_UNUSED) { diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index 5df2fd72f3..2d772d85c3 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -1800,6 +1800,7 @@ udevPCITranslateInit(bool privileged ATTRIBUTE_UNUSED) static int nodeStateInitialize(bool privileged, + const char *root ATTRIBUTE_UNUSED, virStateInhibitCallback callback ATTRIBUTE_UNUSED, void *opaque ATTRIBUTE_UNUSED) { diff --git a/src/nwfilter/nwfilter_driver.c b/src/nwfilter/nwfilter_driver.c index fdfc6f48fa..58adcdbeb1 100644 --- a/src/nwfilter/nwfilter_driver.c +++ b/src/nwfilter/nwfilter_driver.c @@ -176,6 +176,7 @@ virNWFilterTriggerRebuildImpl(void *opaque) */ static int nwfilterStateInitialize(bool privileged, + const char *root ATTRIBUTE_UNUSED, virStateInhibitCallback callback ATTRIBUTE_UNUSED, void *opaque ATTRIBUTE_UNUSED) { diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 0a425b82e5..659d7b89e1 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -574,6 +574,7 @@ qemuDomainFindMaxID(virDomainObjPtr vm, */ static int qemuStateInitialize(bool privileged, + const char *root ATTRIBUTE_UNUSED, virStateInhibitCallback callback, void *opaque) { diff --git a/src/remote/remote_daemon.c b/src/remote/remote_daemon.c index c3782971f1..2e47263bb9 100644 --- a/src/remote/remote_daemon.c +++ b/src/remote/remote_daemon.c @@ -796,6 +796,7 @@ static void daemonRunStateInit(void *opaque) * we're ready, since it can take a long time and this will * seriously delay OS bootup process */ if (virStateInitialize(virNetDaemonIsPrivileged(dmn), + "", daemonInhibitCallback, dmn) < 0) { VIR_ERROR(_("Driver state initialization failed")); diff --git a/src/remote/remote_driver.c b/src/remote/remote_driver.c index 5c4dd41227..244e384607 100644 --- a/src/remote/remote_driver.c +++ b/src/remote/remote_driver.c @@ -184,6 +184,7 @@ static int remoteSplitURIScheme(virURIPtr uri, static int remoteStateInitialize(bool privileged ATTRIBUTE_UNUSED, + const char *root ATTRIBUTE_UNUSED, virStateInhibitCallback callback ATTRIBUTE_UNUSED, void *opaque ATTRIBUTE_UNUSED) { diff --git a/src/secret/secret_driver.c b/src/secret/secret_driver.c index ac85f5d195..1b4826e597 100644 --- a/src/secret/secret_driver.c +++ b/src/secret/secret_driver.c @@ -444,6 +444,7 @@ secretStateCleanup(void) static int secretStateInitialize(bool privileged, + const char *root ATTRIBUTE_UNUSED, virStateInhibitCallback callback ATTRIBUTE_UNUSED, void *opaque ATTRIBUTE_UNUSED) { diff --git a/src/storage/storage_driver.c b/src/storage/storage_driver.c index 496d51b1e0..a13eabf134 100644 --- a/src/storage/storage_driver.c +++ b/src/storage/storage_driver.c @@ -247,6 +247,7 @@ storageDriverAutostart(void) */ static int storageStateInitialize(bool privileged, + const char *root ATTRIBUTE_UNUSED, virStateInhibitCallback callback ATTRIBUTE_UNUSED, void *opaque ATTRIBUTE_UNUSED) { diff --git a/src/vz/vz_driver.c b/src/vz/vz_driver.c index dfd49e7cc7..92a9cfeaac 100644 --- a/src/vz/vz_driver.c +++ b/src/vz/vz_driver.c @@ -4096,6 +4096,7 @@ vzStateCleanup(void) static int vzStateInitialize(bool privileged ATTRIBUTE_UNUSED, + const char *root ATTRIBUTE_UNUSED, virStateInhibitCallback callback ATTRIBUTE_UNUSED, void *opaque ATTRIBUTE_UNUSED) { -- 2.21.0

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> --- src/qemu/qemu_conf.c | 41 ++++++++++++++++++++++++++--------------- src/qemu/qemu_conf.h | 3 ++- src/qemu/qemu_driver.c | 2 +- 3 files changed, 29 insertions(+), 17 deletions(-) diff --git a/src/qemu/qemu_conf.c b/src/qemu/qemu_conf.c index daea11dacb..35ecb19b29 100644 --- a/src/qemu/qemu_conf.c +++ b/src/qemu/qemu_conf.c @@ -123,7 +123,8 @@ void qemuDomainCmdlineDefFree(qemuDomainCmdlineDefPtr def) #endif -virQEMUDriverConfigPtr virQEMUDriverConfigNew(bool privileged) +virQEMUDriverConfigPtr virQEMUDriverConfigNew(bool privileged, + const char *root) { virQEMUDriverConfigPtr cfg; @@ -150,30 +151,31 @@ virQEMUDriverConfigPtr virQEMUDriverConfigNew(bool privileged) if (privileged) { if (virAsprintf(&cfg->logDir, - "%s/log/libvirt/qemu", LOCALSTATEDIR) < 0) + "%s%s/log/libvirt/qemu", root, LOCALSTATEDIR) < 0) goto error; if (virAsprintf(&cfg->swtpmLogDir, - "%s/log/swtpm/libvirt/qemu", LOCALSTATEDIR) < 0) + "%s%s/log/swtpm/libvirt/qemu", root, LOCALSTATEDIR) < 0) goto error; - if (VIR_STRDUP(cfg->configBaseDir, SYSCONFDIR "/libvirt") < 0) + if (virAsprintf(&cfg->configBaseDir, + "%s%s/libvirt", root, SYSCONFDIR) < 0) goto error; if (virAsprintf(&cfg->stateDir, - "%s/run/libvirt/qemu", LOCALSTATEDIR) < 0) + "%s%s/run/libvirt/qemu", root, LOCALSTATEDIR) < 0) goto error; if (virAsprintf(&cfg->swtpmStateDir, - "%s/run/libvirt/qemu/swtpm", LOCALSTATEDIR) < 0) + "%s%s/run/libvirt/qemu/swtpm", root, LOCALSTATEDIR) < 0) goto error; if (virAsprintf(&cfg->cacheDir, - "%s/cache/libvirt/qemu", LOCALSTATEDIR) < 0) + "%s%s/cache/libvirt/qemu", root, LOCALSTATEDIR) < 0) goto error; if (virAsprintf(&cfg->libDir, - "%s/lib/libvirt/qemu", LOCALSTATEDIR) < 0) + "%s%s/lib/libvirt/qemu", root, LOCALSTATEDIR) < 0) goto error; if (virAsprintf(&cfg->saveDir, "%s/save", cfg->libDir) < 0) goto error; @@ -188,8 +190,8 @@ virQEMUDriverConfigPtr virQEMUDriverConfigNew(bool privileged) goto error; if (virAsprintf(&cfg->memoryBackingDir, "%s/ram", cfg->libDir) < 0) goto error; - if (virAsprintf(&cfg->swtpmStorageDir, "%s/lib/libvirt/swtpm", - LOCALSTATEDIR) < 0) + if (virAsprintf(&cfg->swtpmStorageDir, "%s%s/lib/libvirt/swtpm", + root, LOCALSTATEDIR) < 0) goto error; if (!virDoesUserExist("tss") || virGetUserID("tss", &cfg->swtpm_user) < 0) @@ -200,22 +202,23 @@ virQEMUDriverConfigPtr virQEMUDriverConfigNew(bool privileged) } else { char *rundir; char *cachedir; + char *cfgdir; cachedir = virGetUserCacheDirectory(); if (!cachedir) goto error; if (virAsprintf(&cfg->logDir, - "%s/qemu/log", cachedir) < 0) { + "%s%s/qemu/log", root, cachedir) < 0) { VIR_FREE(cachedir); goto error; } if (virAsprintf(&cfg->swtpmLogDir, - "%s/qemu/log", cachedir) < 0) { + "%s%s/qemu/log", root, cachedir) < 0) { VIR_FREE(cachedir); goto error; } - if (virAsprintf(&cfg->cacheDir, "%s/qemu/cache", cachedir) < 0) { + if (virAsprintf(&cfg->cacheDir, "%s%s/qemu/cache", root, cachedir) < 0) { VIR_FREE(cachedir); goto error; } @@ -224,7 +227,7 @@ virQEMUDriverConfigPtr virQEMUDriverConfigNew(bool privileged) rundir = virGetUserRuntimeDirectory(); if (!rundir) goto error; - if (virAsprintf(&cfg->stateDir, "%s/qemu/run", rundir) < 0) { + if (virAsprintf(&cfg->stateDir, "%s%s/qemu/run", root, rundir) < 0) { VIR_FREE(rundir); goto error; } @@ -233,9 +236,16 @@ virQEMUDriverConfigPtr virQEMUDriverConfigNew(bool privileged) if (virAsprintf(&cfg->swtpmStateDir, "%s/swtpm", cfg->stateDir) < 0) goto error; - if (!(cfg->configBaseDir = virGetUserConfigDirectory())) + cfgdir = virGetUserConfigDirectory(); + if (!cfgdir) goto error; + if (virAsprintf(&cfg->configBaseDir, "%s%s", root, cfgdir) < 0) { + VIR_FREE(cfgdir); + goto error; + } + VIR_FREE(cfgdir); + if (virAsprintf(&cfg->libDir, "%s/qemu/lib", cfg->configBaseDir) < 0) goto error; if (virAsprintf(&cfg->saveDir, "%s/qemu/save", cfg->configBaseDir) < 0) @@ -267,6 +277,7 @@ virQEMUDriverConfigPtr virQEMUDriverConfigNew(bool privileged) * This will then be used as a fallback if the service specific * directory doesn't exist (although we don't check if this exists). */ + // XXX root ? if (VIR_STRDUP(cfg->defaultTLSx509certdir, SYSCONFDIR "/pki/qemu") < 0) goto error; diff --git a/src/qemu/qemu_conf.h b/src/qemu/qemu_conf.h index 983e74a3cf..f93b09c4fe 100644 --- a/src/qemu/qemu_conf.h +++ b/src/qemu/qemu_conf.h @@ -312,7 +312,8 @@ struct _qemuDomainCmdlineDef { void qemuDomainCmdlineDefFree(qemuDomainCmdlineDefPtr def); -virQEMUDriverConfigPtr virQEMUDriverConfigNew(bool privileged); +virQEMUDriverConfigPtr virQEMUDriverConfigNew(bool privileged, + const char *root); int virQEMUDriverConfigLoadFile(virQEMUDriverConfigPtr cfg, const char *filename, diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 659d7b89e1..019f6b2bf3 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -613,7 +613,7 @@ qemuStateInitialize(bool privileged, if (privileged) qemu_driver->hostsysinfo = virSysinfoRead(); - if (!(qemu_driver->config = cfg = virQEMUDriverConfigNew(privileged))) + if (!(qemu_driver->config = cfg = virQEMUDriverConfigNew(privileged, root))) goto error; if (virAsprintf(&driverConf, "%s/qemu.conf", cfg->configBaseDir) < 0) -- 2.21.0

The driver URI: "qemu:///embed?root=/some/path" enables a new way to use the QEMU driver by embedding it directly in the calling process. To use this the process must have a thread running the libvirt event loop. This URI will then cause libvirt to dynamically load the QEMU driver module and call its global initialization function. The application can now make normal libvirt API calls which are all serviced in-process with no RPC layer involved. To avoid conflicting with the existing libvirtd daemon it is important to specify an alternative root directory. All the paths the QEMU driver normal creates will be placed under this root. For example using the path /home/berrange/tmp/embed, as an unprivileged process cause the creation of the following directories /home/berrange/tmp/embed/home/berrange/.config/libvirt/qemu/lib /home/berrange/tmp/embed/home/berrange/.config/libvirt/qemu/snapshot /home/berrange/tmp/embed/home/berrange/.config/libvirt/qemu/save /home/berrange/tmp/embed/home/berrange/.config/libvirt/qemu/channel/target /home/berrange/tmp/embed/home/berrange/.config/libvirt/qemu/ram/libvirt /home/berrange/tmp/embed/home/berrange/.config/libvirt/qemu/ram/libvirt/qemu /home/berrange/tmp/embed/home/berrange/.config/libvirt/qemu/nvram /home/berrange/tmp/embed/home/berrange/.config/libvirt/qemu/dump /home/berrange/tmp/embed/home/berrange/.cache/libvirt/qemu/cache/capabilities /home/berrange/tmp/embed/home/berrange/.cache/libvirt/qemu/log /home/berrange/tmp/embed/run/user/501/libvirt/qemu/run The application is responsible for purging everything underneath this root directory when no longer required. Note that QEMU is still daemonized right now and so the application embedding the QEMU driver can quit & restart and still have its VMs present, just as libvirtd would. A future patch will hook up VIR_DOMAIN_CREATE_AUTO_DESTROY flag such that the VM are not daemonized and still retain the parent/child relationship forcing them to die the the application exits. Thanks to previous refactoring, it is still possible to use functionality that calls out to other secondary drivers. For example it can open the network, secret or storage drivers. The network driver stuff is not fully functional though until the pending refactoring is done to introduce the virNetworkPort object concept. A key thing to note is that for this to work, the application that links to libvirt *MUST* be built with -Wl,--export-dynamic to ensure that symbols from libvirt.so are exported & thus available to the dynamically loaded QEMU driver module. If libvirt.so itself was dynamically loaded then RTLD_GLOBAL must be passed to dlopen(). Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> --- src/libvirt.c | 39 +++++++++++++++++++++++++++++++++++++- src/qemu/qemu_driver.c | 6 ++++-- src/remote/remote_driver.c | 7 +++++++ 3 files changed, 49 insertions(+), 3 deletions(-) diff --git a/src/libvirt.c b/src/libvirt.c index 677f1cef5f..e34fd5c96d 100644 --- a/src/libvirt.c +++ b/src/libvirt.c @@ -87,6 +87,7 @@ #ifdef WITH_BHYVE # include "bhyve/bhyve_driver.h" #endif +#include "access/viraccessmanager.h" #define VIR_FROM_THIS VIR_FROM_NONE @@ -841,6 +842,7 @@ virConnectOpenInternal(const char *name, virConnectPtr ret; virConfPtr conf = NULL; char *uristr = NULL; + bool embed = false; ret = virGetConnect(); if (ret == NULL) @@ -939,6 +941,36 @@ virConnectOpenInternal(const char *name, ret->uri) < 0) { goto failed; } + + if (STREQ(ret->uri->scheme, "qemu") && + STREQ(ret->uri->path, "/embed")) { + const char *root = NULL; + for (i = 0; i < ret->uri->paramsCount; i++) { + virURIParamPtr var = &ret->uri->params[i]; + if (STREQ(var->name, "root")) + root = var->value; + } + + if (!root) { + virReportError(VIR_ERR_INVALID_ARG, "%s", + _("root parameter required for embedded driver")); + goto failed; + } + + if (virDriverLoadModule("qemu", "qemuRegister", false) < 0) + goto failed; + + if (virAccessManagerGetDefault() == NULL) { + virAccessManagerPtr acl; + acl = virAccessManagerNew("none"); + virAccessManagerSetDefault(acl); + } + + if (virStateInitialize(geteuid() == 0, root, NULL, NULL) < 0) + goto failed; + + embed = true; + } } else { VIR_DEBUG("no name, allowing driver auto-select"); } @@ -1011,7 +1043,12 @@ virConnectOpenInternal(const char *name, continue; } } else { - VIR_DEBUG("Matching any URI scheme for '%s'", ret->uri ? ret->uri->scheme : ""); + if (embed) { + VIR_DEBUG("Skipping wildcard for embedded URI"); + continue; + } else { + VIR_DEBUG("Matching any URI scheme for '%s'", ret->uri ? ret->uri->scheme : ""); + } } /* before starting the new connection, check if the driver only works diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 019f6b2bf3..cdd483a338 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -1105,14 +1105,16 @@ static virDrvOpenStatus qemuConnectOpen(virConnectPtr conn, if (virQEMUDriverIsPrivileged(qemu_driver)) { if (STRNEQ(conn->uri->path, "/system") && - STRNEQ(conn->uri->path, "/session")) { + STRNEQ(conn->uri->path, "/session") && + STRNEQ(conn->uri->path, "/embed")) { virReportError(VIR_ERR_INTERNAL_ERROR, _("unexpected QEMU URI path '%s', try qemu:///system"), conn->uri->path); return VIR_DRV_OPEN_ERROR; } } else { - if (STRNEQ(conn->uri->path, "/session")) { + if (STRNEQ(conn->uri->path, "/session") && + STRNEQ(conn->uri->path, "/embed")) { virReportError(VIR_ERR_INTERNAL_ERROR, _("unexpected QEMU URI path '%s', try qemu:///session"), conn->uri->path); diff --git a/src/remote/remote_driver.c b/src/remote/remote_driver.c index 244e384607..976f311060 100644 --- a/src/remote/remote_driver.c +++ b/src/remote/remote_driver.c @@ -1343,6 +1343,13 @@ remoteConnectOpen(virConnectPtr conn, if (flags & VIR_CONNECT_RO) rflags |= VIR_DRV_OPEN_REMOTE_RO; + if (conn->uri && + conn->uri->path && + STREQ(conn->uri->path, "/embed")) { + ret = VIR_DRV_OPEN_DECLINED; + goto cleanup; + } + /* * If no servername is given, and no +XXX * transport is listed, or transport is unix, -- 2.21.0

The previous "QEMU shim" proof of concept was taking an approach of only caring about initial spawning of the QEMU process. It was then registered with the libvirtd daemon who took over management of it. The intent was that later libvirtd would be refactored so that the shim retained control over the QEMU monitor and libvirt just forwarded APIs to each shim as needed. This forwarding of APIs would require quite alot of significant refactoring of libvirtd to achieve. This impl thus takes a quite different approach, explicitly deciding to keep the VMs completely separate from those seen & managed by libvirtd. Instead it uses the new "qemu:///embed" URI scheme to embed the entire QEMU driver in the shim, running with a custom root directory. Once the driver is initialization, the shim starts a VM and then waits to shutdown automatically when QEMU shuts down, or should kill QEMU if it is terminated itself. This is pending the AUTO_DESTROY improvements mentioned in the previous patch. Note this program does not expose any way to manage the QEMU process, since there's no RPC interface enabled. It merely starts the VM and cleans up at the end. It is probably quite rare this would be used as is, though potentially users might find it useful scripting throwaway VMs during automation that they wish to keep hidden from the rest of libvirtd instances. Mostly though it serves as example code for how to use the new "qemu:///embed" URI syntax. A more advanced application can copy this code into their own codebase, and thus have the full range of QEMU driver APIs available. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> --- src/qemu/Makefile.inc.am | 7 ++++ src/qemu/qemu_shim.c | 70 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 77 insertions(+) create mode 100644 src/qemu/qemu_shim.c diff --git a/src/qemu/Makefile.inc.am b/src/qemu/Makefile.inc.am index fd32a90d56..461704c79d 100644 --- a/src/qemu/Makefile.inc.am +++ b/src/qemu/Makefile.inc.am @@ -166,3 +166,10 @@ EXTRA_DIST += \ qemu/THREADS.txt \ libvirt_qemu_probes.d \ $(NULL) + +libexec_PROGRAMS += libvirt_qemu_shim + +libvirt_qemu_shim_SOURCES = qemu/qemu_shim.c + +libvirt_qemu_shim_LDADD = libvirt.la +libvirt_qemu_shim_LDFLAGS = -Wl,--export-dynamic diff --git a/src/qemu/qemu_shim.c b/src/qemu/qemu_shim.c new file mode 100644 index 0000000000..df4385d956 --- /dev/null +++ b/src/qemu/qemu_shim.c @@ -0,0 +1,70 @@ + +#include "config.h" + +#include <stdio.h> +#include <libvirt/libvirt.h> +#include <libvirt/virterror.h> +#include <pthread.h> +#include <stdbool.h> + +#include "virfile.h" +#include "virstring.h" + +#define VIR_FROM_THIS VIR_FROM_QEMU + +static void *eventLoop(void *opaque) +{ + bool *quit = opaque; + while (!*quit) + virEventRunDefaultImpl(); + return NULL; +} + +int main(int argc, char **argv) +{ + pthread_t eventLoopThread; + bool quit = false; + virConnectPtr conn; + virDomainPtr dom; + char *xml = NULL; + char *uri; + + if (argc != 3) { + fprintf(stderr, "syntax: %s ROOT XML\n", argv[0]); + return 1; + } + if (virFileReadAll(argv[2], 102400, &xml) < 0) { + fprintf(stderr, "cannot read %s: %s\n", argv[2], virGetLastErrorMessage()); + return 1; + } + + virFileActivateDirOverride(argv[0]); + virEventRegisterDefaultImpl(); + + pthread_create(&eventLoopThread, NULL, eventLoop, &quit); + + if (virAsprintf(&uri, "qemu:///embed?root=%s", argv[1]) < 0) { + return 1; + } + conn = virConnectOpen(uri); + if (!conn) { + fprintf(stderr, "cannot open QEMU: %s\n", virGetLastErrorMessage()); + return 1; + } + + dom = virDomainCreateXML(conn, xml, 0); + if (!dom) { + fprintf(stderr, "cannot start VM: %s\n", virGetLastErrorMessage()); + virConnectClose(conn); + return 1; + } + + fprintf(stderr, "Running for 10 seconds\n"); + sleep(10); + + virDomainDestroy(dom); + + virConnectClose(conn); + + return 0; +} -- 2.21.0

On Fri, May 17, 2019 at 13:24:52 +0100, Daniel Berrange wrote:
This patch series proposes a new way to build a "libvirt QEMU shim" which supports more use cases than the previous approach from last year, as well as being vastly simpler to implement.
Few thoughts: I like this approach in general. We were bound to lose the full host view of VMs started by libvirt shim sooner or later anyways as people would want to stuff the libvirtd into a container. Few things we should consider prior to making this a feature: 1) We should introduce a hard limit that the 'root' directory must not be created by a library version greater than we are running right when connecting. This is to prevent downgrades and thus all the problems with them right upfront. 2) The library needs to make sure that the event loop is registered in this case and ideally also make sure that it has run so that we avoid problems with stuck instances. 3) The capability caching is not leveraged when creating a new instance of the 'root'. 4) Internally we'll need to relax few checks in the migration region as igrating to a different "root" will certainly make sense now. 5) It would be cool to have a way to virConnectDomainEventRegisterAny prior to actually opening the connection. This would allow e.g. get the events for block jobs which terminated while the app was away for some reason so that it does not have to poll. 6) We need to write a doc setting the expectations how to use this in the APP. (Clarify threading and that users can't avoid domain jobs etc. [1]) 7) We should get of rid of the 'autostart' feature for 'embed' type connections. 8) Some kind of label depending on 'root' should be used in logs as things may log into the same place and that would be a pain to debug.
Specifically
- We can use this to embed the QEMU driver in unit tests allowing the full range of odriver functionality to be exercised during *unit* testing without interfering with the host OS libvirtd.
This implies ... [2] [...]
- Using a virtual root directory for QEMU driver state meams we do not need to solve the hard refactoring problem of getting the main libvirtd to detect the VMs launched via the shim. They can safely live independantly.
[1] Or that removing the directory may e.g. get rid of some state which may be needed like snapshot metadata etc.
- Thanks to the driver refactoring work it is still possible for the embedded drivers to talk to main libvirtd to use the secondary drivers for storage/network/etc if needed. Ideally we would allow those secondary drivers to be embedded too but that's not implemented here. Most important would be the secrets driver. Other drivers are much less important.
[1] ... that this is ideally supported as well. It'll be surprising if your tests spawn VMs which will infiltrate your 'default' network for example. We might want to also have a way how to specify this during connection.
Daniel P. Berrangé (5): access: report an error if no access manager is present libvirt: pass a root directory path into drivers qemu: honour the root parameter during driver initialization libvirt: support an "embed" URI path selector for QEMU driver qemu: introduce a new style libvirt_qemu_shim program
Also this new program since it's advertised as 'copy this into your code' should avoid using virAsprintf for constructing the URI.

On Fri, May 17, 2019 at 15:22:12 +0200, Peter Krempa wrote:
On Fri, May 17, 2019 at 13:24:52 +0100, Daniel Berrange wrote:
This patch series proposes a new way to build a "libvirt QEMU shim" which supports more use cases than the previous approach from last year, as well as being vastly simpler to implement.
Few thoughts:
two more: [...] 9) Users may have different ideas regarding values in qemu.conf (such as default_tls_x509_cert_dir) thus we probably need a way to provide that one as well separately. 10) users almost certainly will want to use a separate instance of the secret driver as there's no other way to define secrets for disks and sharing the instance seems wrong

On Fri, May 17, 2019 at 06:45:08PM +0200, Peter Krempa wrote:
On Fri, May 17, 2019 at 15:22:12 +0200, Peter Krempa wrote:
On Fri, May 17, 2019 at 13:24:52 +0100, Daniel Berrange wrote:
This patch series proposes a new way to build a "libvirt QEMU shim" which supports more use cases than the previous approach from last year, as well as being vastly simpler to implement.
Few thoughts:
two more:
[...]
9) Users may have different ideas regarding values in qemu.conf (such as default_tls_x509_cert_dir) thus we probably need a way to provide that one as well separately.
I believe my patch should already deal with that as I prefix all the dirs that the QEMU driver knows about.
10) users almost certainly will want to use a separate instance of the secret driver as there's no other way to define secrets for disks and sharing the instance seems wrong
Yep. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Fri, May 17, 2019 at 17:49:31 +0100, Daniel Berrange wrote:
On Fri, May 17, 2019 at 06:45:08PM +0200, Peter Krempa wrote:
On Fri, May 17, 2019 at 15:22:12 +0200, Peter Krempa wrote:
On Fri, May 17, 2019 at 13:24:52 +0100, Daniel Berrange wrote:
This patch series proposes a new way to build a "libvirt QEMU shim" which supports more use cases than the previous approach from last year, as well as being vastly simpler to implement.
Few thoughts:
two more:
[...]
9) Users may have different ideas regarding values in qemu.conf (such as default_tls_x509_cert_dir) thus we probably need a way to provide that one as well separately.
I believe my patch should already deal with that as I prefix all the dirs that the QEMU driver knows about.
Hmm, I'm not sure whether just prefixing every path used internally with the passed prefix will be a good idea. Ideally we should keep the directory contents opaque to the user so they don't ever try to mess with the files in the directory. This disqualifies the use of this for passing in the qemu config file. This also reminds me that we need some locking for the directory so that there aren't two processes openning the same prefix and messing up the files internally. It will even not work in some cases because the attempt to reopen the monitor sockets would possibly fail. If not we've got even more of a problem. Also this would mean that a prefix of '/' would be equal to the system instance handled by libvirtd so we need also interlocking with libvirt if we don't change the layout. One disadvantage of this idea is that I can see users that will actually use this to replace libvirt's RPC with something else. This will be caused by the fact that there can be only one process handling each prefix as extending qemu driver to allow concurrent access would be a nightmare.

On Mon, May 20, 2019 at 10:53:30AM +0200, Peter Krempa wrote:
On Fri, May 17, 2019 at 17:49:31 +0100, Daniel Berrange wrote:
On Fri, May 17, 2019 at 06:45:08PM +0200, Peter Krempa wrote:
On Fri, May 17, 2019 at 15:22:12 +0200, Peter Krempa wrote:
On Fri, May 17, 2019 at 13:24:52 +0100, Daniel Berrange wrote:
This patch series proposes a new way to build a "libvirt QEMU shim" which supports more use cases than the previous approach from last year, as well as being vastly simpler to implement.
Few thoughts:
two more:
[...]
9) Users may have different ideas regarding values in qemu.conf (such as default_tls_x509_cert_dir) thus we probably need a way to provide that one as well separately.
I believe my patch should already deal with that as I prefix all the dirs that the QEMU driver knows about.
Hmm, I'm not sure whether just prefixing every path used internally with the passed prefix will be a good idea.
Ideally we should keep the directory contents opaque to the user so they don't ever try to mess with the files in the directory. This disqualifies the use of this for passing in the qemu config file.
I don't think the sub-dir is really much different from the normal libvirt locations in that respect. We already recommend apps/admins to not touch /etc/libvirt/qemu files, nor the /run/libvirt/qemu files, nor the /var/lib/libvirt/qemu files. At the same time though there are some files which are reasonable for apps/admins to look at. UNIX domain sockets for virtio-serial devices in the local state sub-dir. Per VM log files, though we really should create a virStream API for accessing logs, and so on. We don't really document this very well even today. If we improve our docs in this area, I don't think apps need to treat the virtial prefix dirs any different from how we tell apps to treat libvirt's files under /
This also reminds me that we need some locking for the directory so that there aren't two processes openning the same prefix and messing up the files internally. It will even not work in some cases because the attempt to reopen the monitor sockets would possibly fail. If not we've got even more of a problem.
QEMU should reject multiple attempts to open monitor socket, or rather the second attempt will just never succeeed/hang, since QEMU won't call accept() until the first connection goes away. This 1-1 model is an intentional design limitation of chardevs in QEMU. We should definitely look at doing locking though. Currently we implicitly have the libvirtd daemon's own pidfile as the mutual exclusion mechanism. We'll need to have a separate lockfile independantly of that for the QEMU driver. Probably just $prefix/var/run/libvirt/qemu/driver.lock is good enough
Also this would mean that a prefix of '/' would be equal to the system instance handled by libvirtd so we need also interlocking with libvirt if we don't change the layout.
Yes.
One disadvantage of this idea is that I can see users that will actually use this to replace libvirt's RPC with something else. This will be caused by the fact that there can be only one process handling each prefix as extending qemu driver to allow concurrent access would be a nightmare.
Yes, but in many ways we already have this situation. libvirt-dbus is providing a new RPC mechanism to talk to libvirt. It happens to then delegate to libvirt's own RPC, but from an app developer POV that is invisible. Large scale mgmt apps like OpenStack/oVirt/KubeVirt have in some sense replaced libvirt RPC at the cross-host level - they only use libvirt RPC within scope of a single host, so most of the time there's only a single client of libvirtd - VDSM, Nova, or Kubevirt's virt handler. Of course there is a critical distinction. Even if there's normally only a single client of libvirtd, other clients are not prevented from running if needed. So you can still run "virsh list" on a host running openstack/oVirt and get "normal" results. KubeVirt has changed since they run one-libvirtd instance per VM so any aggregated APIs like "virsh list" are largely useless, only showing 1 VM. The original shim concept was intended to try to fix that behaviour by allowing the shim to register its existance back with the central libvirtd. The feedback from kubevirt devs though is that even if we had that ability in the shim, they'd likely not use it. If we consider just the single VM oriented APIs though, the inability to use "virsh" commands with an embedded driver instance could be a major pain point for debugging runtime problems, depending on the app. For short lived VMs used for embedded purposes such as with libguestfs I think the limitation would be tolerable. For long lived VMs for the inability to use virsh to debug would be pretty troubling, unless the app embedding the QEMU driver exposed a comparable set of features for collecting debug info. You'd also be unable to use QMP directly unless a second monitor was added to every VM, since QEMU limits you to one active connection per monitor chardev. This debuggability issue is probably the biggest downside to this embedded driver approach. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On 5/17/19 2:24 PM, Daniel P. Berrangé wrote:
This patch series proposes a new way to build a "libvirt QEMU shim" which supports more use cases than the previous approach from last year, as well as being vastly simpler to implement.
Specifically
- We can use this to embed the QEMU driver in unit tests allowing the full range of odriver functionality to be exercised during *unit* testing without interfering with the host OS libvirtd.
- Apps like libguestfs can embed the QEMU driver to allow them to spawn VMs that are immediate children of their app, thus inheriting process context, as well as hiding these "service VM" from the main libvirtd.
This would avoid a common irritation where libguestfs VMs suddenly appear & disappear in virt-manager's VM list.
- Apps like kubevirt could use this to eliminate the need to use the RPC layer for libvirt. It can directly embed the QEMU driver giving it more direct control over how it is run & again inheriting process state. This is useful for kubevirt's one VM per container model where Kubernetes itself provides the aggregated view, thus making libvirtd's aggregated view redundant
- The previous shim was very much one shim == one QEMU process. This new embed approach supports that of course, but there's also the option to run multiple VMs if desired.
- Using a virtual root directory for QEMU driver state meams we do not need to solve the hard refactoring problem of getting the main libvirtd to detect the VMs launched via the shim. They can safely live independantly.
- Thanks to the driver refactoring work it is still possible for the embedded drivers to talk to main libvirtd to use the secondary drivers for storage/network/etc if needed. Ideally we would allow those secondary drivers to be embedded too but that's not implemented here. Most important would be the secrets driver. Other drivers are much less important.
Daniel P. Berrangé (5): access: report an error if no access manager is present libvirt: pass a root directory path into drivers qemu: honour the root parameter during driver initialization libvirt: support an "embed" URI path selector for QEMU driver qemu: introduce a new style libvirt_qemu_shim program
src/access/viraccessmanager.c | 5 ++ src/driver-state.h | 1 + src/interface/interface_backend_netcf.c | 1 + src/interface/interface_backend_udev.c | 1 + src/libvirt.c | 41 ++++++++++++++- src/libvirt_internal.h | 1 + src/libxl/libxl_driver.c | 1 + src/lxc/lxc_driver.c | 2 + src/network/bridge_driver.c | 1 + src/node_device/node_device_hal.c | 1 + src/node_device/node_device_udev.c | 1 + src/nwfilter/nwfilter_driver.c | 1 + src/qemu/Makefile.inc.am | 7 +++ src/qemu/qemu_conf.c | 41 +++++++++------ src/qemu/qemu_conf.h | 3 +- src/qemu/qemu_driver.c | 9 ++-- src/qemu/qemu_shim.c | 70 +++++++++++++++++++++++++ src/remote/remote_daemon.c | 1 + src/remote/remote_driver.c | 8 +++ src/secret/secret_driver.c | 1 + src/storage/storage_driver.c | 1 + src/vz/vz_driver.c | 1 + 22 files changed, 179 insertions(+), 20 deletions(-) create mode 100644 src/qemu/qemu_shim.c
Codewise, we'll need a more clean approach, but since this is just a PoC lets not care about that for now. I like this, I wonder if there are some hidden traps, but if we make this a preview only for some time then we should be able to catch them soon enough. IIUC this is orthogonal to splitting libvirtd into multiple daemons, i.e. containers that are target of this feature already have way of setting up networking so they will not rely on our network driver. However, we might need to give users a helping hand in identifying what domain XMLs will call other drivers (daemons). I mean, I've just tried to start a domain I have with this shim you added in 5/5 and found that it tries to conenct to storage driver and network driver (yes, I have a disk type='pool' and interface type='network'). Michal
participants (3)
-
Daniel P. Berrangé
-
Michal Privoznik
-
Peter Krempa