(and also properly clear capabilities bounding set for child
processes)
This replaces the earlier version of the same patches:
https://www.redhat.com/archives/libvir-list/2013-February/msg00446.html
Many of these patches were already ACKed in the first version. If I
haven't modified them (other than rebasing) I note that in the
subject. Other patches have been modified based on Dan and Eric's
reviews of V1.
Also, since I first posted the series, we've had a talk with some
kernel people who are amenable to modify the semantics of
CAP_COMPROMISE_KERNEL such that it's only checked on open(), not on
read or write. If this is done, we will (thankfully!) no longer need
to set CAP_COMPROMISE_KERNEL for the qemu process. However, the first
14 patches in this series are still useful. Two big changes are: 1) it
allows setting CAP_SYS_RAWIO when needed for generic scsi command
passthrough, and 2) the bounding set of child processes will now be
properly cleared (it previously wasn't, even though the code looked
like it *should* have been doing so).
Here is the blurb from V1's intro:
There are a bunch of patches here, but each is small and
single-purpose to make reviewing easier (and also so that any
potential regressions can be easily bisected).
The original purpose of the patches was to permit setting
CAP_COMPROMISE_KERNEL for non-root qemu processes, since Fedora 18 now
requires that in order for generic PCI passthrough to work (the
alternative was to always run qemu as root). Although we may not
actually want to do that part (if we can convince kernel people to
implement CAP_COMPROMISE_KERNEL such that it's only required when
*opening* the necessary sysfs file (done by libvirt), rather than for
every read/write (done by qemu), then we will not need
CAP_COMPROMISE_KERNEL for qemu), but that is just a couple lines in
the final patch, and the rest of the series is still useful, as it
make dropping/keeping caps truly work for non-root child processes -
this has never before been the case. (for example, CAP_SYS_RAWIO is
needed for generic scsi passthrough to work, and until now the only
way to have that was to run *all* qemus as root).
A bit higher level description of what I've done with all the patches:
1) remove the programmable "hook" from virExecWithHook(), since that
function was only called from one place, and always with the same hook
function. Rename virExecWithHook() to virExec(), and replace the call
to that hook with inline code.
2) give virCommand an API to set the intended uid/gid of the command
that's going to be run, and use that instead of a "user hook" where
appropriate (in the process completely eliminating two hook
functions).
3) Also add an API to virCommand to do the final "set the process
label" step for selinux/apparmor.
4) Add a new API to the security driver (and use it from qemu) called
virSecurityManagerSetChildProcessLabel() which a) is called prior to
virCommandRun() rather than from a command "hook" function, b) takes a
virCommand, and c) rather than immediately performing the operation
(as virSecurityManagerSetProcessLabel() did), merely stores the
necessary information in the virCommand so that virExec can perform
the operation (setting selinux label, setuid/gid, etc)
5) make a new function combining the setting of uid/gid and
maintaining of capabilities, because that is the only way you can set
uid!=0 and still maintain capabilities. Use this in virExec()
6) *Finally* set the CAP_COMPROMISE_KERNEL capability unconditionally
for all qemu processes. (If we really do have to do this, we may want
to consider making it a qemu.conf setting).
Laine Stump (15):
util: eliminate generic hook from virExecWithHook
util: eliminate extra args from virExec
util: refactor virCommandHook into virExec and
virCommandHandshakeChild
util: add virCommandSetUID and virCommandSetGID
util: make virSetUIDGID a NOP only when uid or gid is -1
qemu: replace exec hook with virCommandSetUID/GID in qemuCaps*
qemu: replace exec hook with virCommandSetUID/GID in storage_backend
build: define SECDRIVER_LIBS in Makefile.am
util: add security label setting to virCommand
security: add new virSecurityManagerSetChildProcessLabel API
qemu: let virCommand set child process security labels/uid/gid
util: drop capabilities immediately after changing uid/gid of child
util: virSetUIDGIDWithCaps - change uid while keeping caps
util: maintain caps when running command with uid != 0
qemu: set CAP_COMPROMISE_KERNEL so that pci passthrough works
src/Makefile.am | 36 ++-
src/libvirt_private.syms | 6 +
src/qemu/qemu_capabilities.c | 64 ++---
src/qemu/qemu_process.c | 23 +-
src/security/security_apparmor.c | 42 +++-
src/security/security_dac.c | 24 +-
src/security/security_driver.h | 6 +-
src/security/security_manager.c | 13 +-
src/security/security_manager.h | 6 +-
src/security/security_nop.c | 10 +-
src/security/security_selinux.c | 32 +++
src/security/security_stack.c | 20 +-
src/storage/storage_backend.c | 28 +--
src/util/vircommand.c | 523 ++++++++++++++++++++-------------------
src/util/vircommand.h | 12 +-
src/util/virutil.c | 115 ++++++++-
src/util/virutil.h | 1 +
17 files changed, 595 insertions(+), 366 deletions(-)
--
1.8.1