[libvirt] [PATCH 0/3] qemu: QMP Capability Probing Fixes

QMP Capability probing will fail if the QEMU process cannot create the monitor socket file in /var/lib/libvirt/qemu. This is the case if the configured QEMU user is not root, but QEMU is run under root to perfom the probing. The suggested solution is to run QEMU as qemu user for probing as well. As it happens, this developed into a mini-series: it was necessary to let libvirt handle the pid file as this is stored in root-owned directory /var/run/libvirt/qemu. This prompted a race condition opening a socket. Last but not least caps->version was not filled with QMP probing. Viktor Mihajlovski (3): qemu: Wait for monitor socket even without pid qemu: Fix QMP Capabability Probing Failure qemu: Add QEMU version computation to QMP probing src/qemu/qemu_capabilities.c | 89 ++++++++++++++++++++++++++++++++++---------- src/qemu/qemu_capabilities.h | 7 +++- src/qemu/qemu_driver.c | 4 +- src/qemu/qemu_monitor.c | 2 +- 4 files changed, 78 insertions(+), 24 deletions(-) -- 1.7.12.4

If qemuMonitorOpenUnix is called without a related pid, i.e. for QMP probing, a connect failure can happen as the result of a race. Without a pid there is no retry and thus we give up to early. This changes the code to retry if no pid is supplied. Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com> --- src/qemu/qemu_monitor.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c index cbde2b1..fe8424f 100644 --- a/src/qemu/qemu_monitor.c +++ b/src/qemu/qemu_monitor.c @@ -283,7 +283,7 @@ qemuMonitorOpenUnix(const char *monitor, pid_t cpid) break; if ((errno == ENOENT || errno == ECONNREFUSED) && - cpid && virProcessKill(cpid, 0) == 0) { + (!cpid || virProcessKill(cpid, 0) == 0)) { /* ENOENT : Socket may not have shown up yet * ECONNREFUSED : Leftover socket hasn't been removed yet */ continue; -- 1.7.12.4

On 11/26/12 15:17, Viktor Mihajlovski wrote:
If qemuMonitorOpenUnix is called without a related pid, i.e. for QMP probing, a connect failure can happen as the result of a race. Without a pid there is no retry and thus we give up to early. This changes the code to retry if no pid is supplied.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com> --- src/qemu/qemu_monitor.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
ACK. I will push this patch later today. Peter

On Mon, Nov 26, 2012 at 03:17:12PM +0100, Viktor Mihajlovski wrote:
If qemuMonitorOpenUnix is called without a related pid, i.e. for QMP probing, a connect failure can happen as the result of a race. Without a pid there is no retry and thus we give up to early. This changes the code to retry if no pid is supplied.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com> --- src/qemu/qemu_monitor.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c index cbde2b1..fe8424f 100644 --- a/src/qemu/qemu_monitor.c +++ b/src/qemu/qemu_monitor.c @@ -283,7 +283,7 @@ qemuMonitorOpenUnix(const char *monitor, pid_t cpid) break;
if ((errno == ENOENT || errno == ECONNREFUSED) && - cpid && virProcessKill(cpid, 0) == 0) { + (!cpid || virProcessKill(cpid, 0) == 0)) { /* ENOENT : Socket may not have shown up yet * ECONNREFUSED : Leftover socket hasn't been removed yet */ continue; --
ACK Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

QMP Capability probing will fail if QEMU cannot bind to the QMP monitor socket in the qemu_driver->libDir directory. That's because the child process is stripped of all capabilities and this directory is chown'ed to the configured QEMU user/group (normally qemu:qemu) by the QEMU driver. To prevent this from happening, the driver startup will now pass the QEMU uid and gid down to the capability probing code. All capability probing invocations of QEMU will be run with the configured QEMU uid instead of libvirtd's. Furter, the pid file handling is moved to libvirt, as QEMU cannot write to the qemu_driver->runDir (root:root). This also means that the libvirt daemonizing must be used. Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com> --- src/qemu/qemu_capabilities.c | 88 ++++++++++++++++++++++++++++++++++---------- src/qemu/qemu_capabilities.h | 7 +++- src/qemu/qemu_driver.c | 4 +- 3 files changed, 76 insertions(+), 23 deletions(-) diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index 6ce2638..a76d108 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -223,6 +223,8 @@ struct _qemuCapsCache { virHashTablePtr binaries; char *libDir; char *runDir; + uid_t runUid; + gid_t runGid; }; @@ -241,9 +243,37 @@ static int qemuCapsOnceInit(void) VIR_ONCE_GLOBAL_INIT(qemuCaps) +struct _qemuCapsHookData { + uid_t runUid; + gid_t runGid; +}; +typedef struct _qemuCapsHookData qemuCapsHookData; +typedef qemuCapsHookData *qemuCapsHookDataPtr; + +static int qemuCapsHook(void * data) +{ + int ret; + qemuCapsHookDataPtr hookData = data; + + if (!hookData) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("QEMU uid:gid not specified by caller")); + ret = -1; + goto cleanup; + } + + VIR_DEBUG("Switch QEMU uid:gid to %d:%d", + hookData->runUid, hookData->runGid); + ret = virSetUIDGID(hookData->runUid, hookData->runGid); + +cleanup: + return ret; +} + static virCommandPtr qemuCapsProbeCommand(const char *qemu, - qemuCapsPtr caps) + qemuCapsPtr caps, + qemuCapsHookDataPtr hookData) { virCommandPtr cmd = virCommandNew(qemu); @@ -256,6 +286,7 @@ qemuCapsProbeCommand(const char *qemu, virCommandAddEnvPassCommon(cmd); virCommandClearCaps(cmd); + virCommandSetPreExecHook(cmd, qemuCapsHook, hookData); return cmd; } @@ -342,7 +373,7 @@ no_memory: } static int -qemuCapsProbeMachineTypes(qemuCapsPtr caps) +qemuCapsProbeMachineTypes(qemuCapsPtr caps, qemuCapsHookDataPtr hookData) { char *output; int ret = -1; @@ -359,7 +390,7 @@ qemuCapsProbeMachineTypes(qemuCapsPtr caps) return -1; } - cmd = qemuCapsProbeCommand(caps->binary, caps); + cmd = qemuCapsProbeCommand(caps->binary, caps, hookData); virCommandAddArgList(cmd, "-M", "?", NULL); virCommandSetOutputBuffer(cmd, &output); @@ -498,7 +529,7 @@ cleanup: } static int -qemuCapsProbeCPUModels(qemuCapsPtr caps) +qemuCapsProbeCPUModels(qemuCapsPtr caps, qemuCapsHookDataPtr hookData) { char *output = NULL; int ret = -1; @@ -516,7 +547,7 @@ qemuCapsProbeCPUModels(qemuCapsPtr caps) return 0; } - cmd = qemuCapsProbeCommand(caps->binary, caps); + cmd = qemuCapsProbeCommand(caps->binary, caps, hookData); virCommandAddArgList(cmd, "-cpu", "?", NULL); virCommandSetOutputBuffer(cmd, &output); @@ -1530,7 +1561,8 @@ qemuCapsParseDeviceStr(qemuCapsPtr caps, const char *str) static int qemuCapsExtractDeviceStr(const char *qemu, - qemuCapsPtr caps) + qemuCapsPtr caps, + qemuCapsHookDataPtr hookData) { char *output = NULL; virCommandPtr cmd; @@ -1544,7 +1576,7 @@ qemuCapsExtractDeviceStr(const char *qemu, * understand '-device name,?', and always exits with status 1 for * the simpler '-device ?', so this function is really only useful * if -help includes "device driver,?". */ - cmd = qemuCapsProbeCommand(qemu, caps); + cmd = qemuCapsProbeCommand(qemu, caps, hookData); virCommandAddArgList(cmd, "-device", "?", "-device", "pci-assign,?", @@ -2094,7 +2126,7 @@ int qemuCapsProbeQMP(qemuCapsPtr caps, #define QEMU_SYSTEM_PREFIX "qemu-system-" static int -qemuCapsInitHelp(qemuCapsPtr caps) +qemuCapsInitHelp(qemuCapsPtr caps, uid_t runUid, gid_t runGid) { virCommandPtr cmd = NULL; unsigned int is_kvm; @@ -2102,6 +2134,7 @@ qemuCapsInitHelp(qemuCapsPtr caps) int ret = -1; const char *tmp; struct utsname ut; + qemuCapsHookData hookData; VIR_DEBUG("caps=%p", caps); @@ -2123,7 +2156,9 @@ qemuCapsInitHelp(qemuCapsPtr caps) goto cleanup; } - cmd = qemuCapsProbeCommand(caps->binary, NULL); + hookData.runUid = runUid; + hookData.runGid = runGid; + cmd = qemuCapsProbeCommand(caps->binary, NULL, &hookData); virCommandAddArgList(cmd, "-help", NULL); virCommandSetOutputBuffer(cmd, &help); @@ -2153,13 +2188,13 @@ qemuCapsInitHelp(qemuCapsPtr caps) * understands the 0.13.0+ notion of "-device driver,". */ if (qemuCapsGet(caps, QEMU_CAPS_DEVICE) && strstr(help, "-device driver,?") && - qemuCapsExtractDeviceStr(caps->binary, caps) < 0) + qemuCapsExtractDeviceStr(caps->binary, caps, &hookData) < 0) goto cleanup; - if (qemuCapsProbeCPUModels(caps) < 0) + if (qemuCapsProbeCPUModels(caps, &hookData) < 0) goto cleanup; - if (qemuCapsProbeMachineTypes(caps) < 0) + if (qemuCapsProbeMachineTypes(caps, &hookData) < 0) goto cleanup; ret = 0; @@ -2242,7 +2277,9 @@ qemuCapsInitQMPBasic(qemuCapsPtr caps) static int qemuCapsInitQMP(qemuCapsPtr caps, const char *libDir, - const char *runDir) + const char *runDir, + uid_t runUid, + gid_t runGid) { int ret = -1; virCommandPtr cmd = NULL; @@ -2254,6 +2291,7 @@ qemuCapsInitQMP(qemuCapsPtr caps, char *monarg = NULL; char *monpath = NULL; char *pidfile = NULL; + qemuCapsHookData hookData; /* the ".sock" sufix is important to avoid a possible clash with a qemu * domain called "capabilities" @@ -2289,11 +2327,14 @@ qemuCapsInitQMP(qemuCapsPtr caps, "-nographic", "-M", "none", "-qmp", monarg, - "-pidfile", pidfile, - "-daemonize", NULL); virCommandAddEnvPassCommon(cmd); virCommandClearCaps(cmd); + hookData.runUid = runUid; + hookData.runGid = runGid; + virCommandSetPreExecHook(cmd, qemuCapsHook, &hookData); + virCommandSetPidFile(cmd, pidfile); + virCommandDaemonize(cmd); if (virCommandRun(cmd, &status) < 0) goto cleanup; @@ -2410,7 +2451,9 @@ cleanup: qemuCapsPtr qemuCapsNewForBinary(const char *binary, const char *libDir, - const char *runDir) + const char *runDir, + uid_t runUid, + gid_t runGid) { qemuCapsPtr caps = qemuCapsNew(); struct stat sb; @@ -2438,11 +2481,11 @@ qemuCapsPtr qemuCapsNewForBinary(const char *binary, goto error; } - if ((rv = qemuCapsInitQMP(caps, libDir, runDir)) < 0) + if ((rv = qemuCapsInitQMP(caps, libDir, runDir, runUid, runGid)) < 0) goto error; if (!caps->usedQMP && - qemuCapsInitHelp(caps) < 0) + qemuCapsInitHelp(caps, runUid, runGid) < 0) goto error; return caps; @@ -2478,7 +2521,8 @@ qemuCapsHashDataFree(void *payload, const void *key ATTRIBUTE_UNUSED) qemuCapsCachePtr -qemuCapsCacheNew(const char *libDir, const char *runDir) +qemuCapsCacheNew(const char *libDir, const char *runDir, + uid_t runUid, gid_t runGid) { qemuCapsCachePtr cache; @@ -2502,6 +2546,9 @@ qemuCapsCacheNew(const char *libDir, const char *runDir) goto error; } + cache->runUid = runUid; + cache->runGid = runGid; + return cache; error: @@ -2526,7 +2573,8 @@ qemuCapsCacheLookup(qemuCapsCachePtr cache, const char *binary) if (!ret) { VIR_DEBUG("Creating capabilities for %s", binary); - ret = qemuCapsNewForBinary(binary, cache->libDir, cache->runDir); + ret = qemuCapsNewForBinary(binary, cache->libDir, cache->runDir, + cache->runUid, cache->runGid); if (ret) { VIR_DEBUG("Caching capabilities %p for %s", ret, binary); diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h index 751b3ec..f420c43 100644 --- a/src/qemu/qemu_capabilities.h +++ b/src/qemu/qemu_capabilities.h @@ -169,7 +169,9 @@ qemuCapsPtr qemuCapsNew(void); qemuCapsPtr qemuCapsNewCopy(qemuCapsPtr caps); qemuCapsPtr qemuCapsNewForBinary(const char *binary, const char *libDir, - const char *runDir); + const char *runDir, + uid_t runUid, + gid_t runGid); int qemuCapsProbeQMP(qemuCapsPtr caps, qemuMonitorPtr mon); @@ -207,7 +209,8 @@ int qemuCapsGetMachineTypesCaps(qemuCapsPtr caps, bool qemuCapsIsValid(qemuCapsPtr caps); -qemuCapsCachePtr qemuCapsCacheNew(const char *libDir, const char *runDir); +qemuCapsCachePtr qemuCapsCacheNew(const char *libDir, const char *runDir, + uid_t uid, gid_t gid); qemuCapsPtr qemuCapsCacheLookup(qemuCapsCachePtr cache, const char *binary); qemuCapsPtr qemuCapsCacheLookupCopy(qemuCapsCachePtr cache, const char *binary); void qemuCapsCacheFree(qemuCapsCachePtr cache); diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index d4cafcc..7e97110 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -787,7 +787,9 @@ qemudStartup(int privileged) { goto error; qemu_driver->capsCache = qemuCapsCacheNew(qemu_driver->libDir, - qemu_driver->stateDir); + qemu_driver->stateDir, + qemu_driver->user, + qemu_driver->group); if (!qemu_driver->capsCache) goto error; -- 1.7.12.4

On Mon, Nov 26, 2012 at 03:17:13PM +0100, Viktor Mihajlovski wrote:
QMP Capability probing will fail if QEMU cannot bind to the QMP monitor socket in the qemu_driver->libDir directory. That's because the child process is stripped of all capabilities and this directory is chown'ed to the configured QEMU user/group (normally qemu:qemu) by the QEMU driver.
To prevent this from happening, the driver startup will now pass the QEMU uid and gid down to the capability probing code. All capability probing invocations of QEMU will be run with the configured QEMU uid instead of libvirtd's.
Furter, the pid file handling is moved to libvirt, as QEMU cannot write to the qemu_driver->runDir (root:root). This also means that the libvirt daemonizing must be used.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com> --- src/qemu/qemu_capabilities.c | 88 ++++++++++++++++++++++++++++++++++---------- src/qemu/qemu_capabilities.h | 7 +++- src/qemu/qemu_driver.c | 4 +- 3 files changed, 76 insertions(+), 23 deletions(-)
ACK, A nice side effect of this is that the traditional non-QMP based QEMU probing also now runs unprivileged Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

With QMP capability probing, the version was not set. virsh version returns: ... Cannot extract running QEMU hypervisor version This is fixed by computing caps->version from QMP major, minor, micro values. Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com> --- src/qemu/qemu_capabilities.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index a76d108..d92947f 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -2377,6 +2377,7 @@ qemuCapsInitQMP(qemuCapsPtr caps, goto cleanup; } + caps->version = major * 1000000 + minor * 1000 + micro; caps->usedQMP = true; qemuCapsInitQMPBasic(caps); -- 1.7.12.4

On Mon, Nov 26, 2012 at 03:17:14PM +0100, Viktor Mihajlovski wrote:
With QMP capability probing, the version was not set. virsh version returns: ... Cannot extract running QEMU hypervisor version
This is fixed by computing caps->version from QMP major, minor, micro values.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com> --- src/qemu/qemu_capabilities.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index a76d108..d92947f 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -2377,6 +2377,7 @@ qemuCapsInitQMP(qemuCapsPtr caps, goto cleanup; }
+ caps->version = major * 1000000 + minor * 1000 + micro; caps->usedQMP = true;
qemuCapsInitQMPBasic(caps);
ACK Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On Mon, Nov 26, 2012 at 03:17:11PM +0100, Viktor Mihajlovski wrote:
QMP Capability probing will fail if the QEMU process cannot create the monitor socket file in /var/lib/libvirt/qemu. This is the case if the configured QEMU user is not root, but QEMU is run under root to perfom the probing. The suggested solution is to run QEMU as qemu user for probing as well.
As it happens, this developed into a mini-series: it was necessary to let libvirt handle the pid file as this is stored in root-owned directory /var/run/libvirt/qemu. This prompted a race condition opening a socket. Last but not least caps->version was not filled with QMP probing.
Viktor Mihajlovski (3): qemu: Wait for monitor socket even without pid qemu: Fix QMP Capabability Probing Failure qemu: Add QEMU version computation to QMP probing
I pushed this series now. Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 11/28/2012 03:58 PM, Daniel P. Berrange wrote:
Viktor Mihajlovski (3): qemu: Wait for monitor socket even without pid qemu: Fix QMP Capabability Probing Failure qemu: Add QEMU version computation to QMP probing
I pushed this series now.
Daniel
Thx -- Mit freundlichen Grüßen/Kind Regards Viktor Mihajlovski IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Martin Jetter Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294
participants (3)
-
Daniel P. Berrange
-
Peter Krempa
-
Viktor Mihajlovski