[libvirt] [PATCH 0/5] Override the permissions on /dev/sev when probing

The problem with /dev/sev's default permissions (0600 root:root) is that we
can't make them more permissive at the moment, otherwise we'd weaken the
security of SEV and potentially open the door for a DoS attack. Therefore,
the alternative approach is to set the CAP_DAC_OVERRIDE capability for the
probing QEMU process (and *only* when probing) so that libvirt truly works
with SEV. As a necessary side job, this series also makes /dev/sev available
only to machines that need it, thus reducing the possible attack surface
even more.

Erik Skultety (5):
  qemu: conf: Remove /dev/sev from the default cgroup device acl list
  qemu: cgroup: Expose /dev/sev/ only to domains that require SEV
  qemu: domain: Add /dev/sev into the domain mount namespace selectively
  security: dac: Relabel /dev/sev in the namespace
  qemu: caps: Use CAP_DAC_OVERRIDE for probing to avoid permission issues

 docs/drvqemu.html.in               |  2 +-
 src/qemu/qemu.conf                 |  2 +-
 src/qemu/qemu_capabilities.c       | 11 +++++++
 src/qemu/qemu_cgroup.c             | 21 +++++++++++-
 src/qemu/qemu_domain.c             | 24 ++++++++++++++
 src/qemu/test_libvirtd_qemu.aug.in |  1 -
 src/security/security_dac.c        | 51 ++++++++++++++++++++++++++++++
 src/util/virutil.c                 | 31 ++++++++++++++++--
 8 files changed, 137 insertions(+), 6 deletions(-)

-- 
2.20.1
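To illustrate why the 0600 root:root mode gets in the way of probing, here is a minimal sketch of the classic owner/group/other check and of how CAP_DAC_OVERRIDE sidesteps it. The helper name exampleDacAllowsRW and the uid/gid value 107 used in the note below are illustrative assumptions, not libvirt code, and the model deliberately ignores ACLs and security modules.

```c
#include <stdbool.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper (not part of libvirt): models the traditional DAC
 * decision for opening a file read-write. */
bool
exampleDacAllowsRW(mode_t mode, uid_t owner, gid_t group,
                   uid_t uid, gid_t gid, bool hasDacOverride)
{
    /* CAP_DAC_OVERRIDE bypasses file read/write permission checks */
    if (hasDacOverride)
        return true;

    /* Exactly one permission class applies: owner, then group, then other */
    if (uid == owner)
        return (mode & S_IRUSR) && (mode & S_IWUSR);
    if (gid == group)
        return (mode & S_IRGRP) && (mode & S_IWGRP);
    return (mode & S_IROTH) && (mode & S_IWOTH);
}
```

With mode 0600 owned by root, a caller such as exampleDacAllowsRW(0600, 0, 0, 107, 107, false) is refused, while the same call with the last argument set to true is allowed. That is the position the probing QEMU process is in once it has dropped to the unprivileged user.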

We should not give domains access to something they don't necessarily need
by default. Remove it from the qemu driver docs too.

Signed-off-by: Erik Skultety <eskultet@redhat.com>
---
 docs/drvqemu.html.in               | 2 +-
 src/qemu/qemu.conf                 | 2 +-
 src/qemu/qemu_cgroup.c             | 2 +-
 src/qemu/test_libvirtd_qemu.aug.in | 1 -
 4 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/docs/drvqemu.html.in b/docs/drvqemu.html.in
index bf60a9144b..5ad956740f 100644
--- a/docs/drvqemu.html.in
+++ b/docs/drvqemu.html.in
@@ -396,7 +396,7 @@ chmod o+x /path/to/directory
 /dev/null, /dev/full, /dev/zero,
 /dev/random, /dev/urandom,
 /dev/ptmx, /dev/kvm, /dev/kqemu,
-/dev/rtc, /dev/hpet, /dev/sev
+/dev/rtc, /dev/hpet
 </pre>
 
 <p>
diff --git a/src/qemu/qemu.conf b/src/qemu/qemu.conf
index c1f1201134..7820e72dd8 100644
--- a/src/qemu/qemu.conf
+++ b/src/qemu/qemu.conf
@@ -490,7 +490,7 @@
 #    "/dev/null", "/dev/full", "/dev/zero",
 #    "/dev/random", "/dev/urandom",
 #    "/dev/ptmx", "/dev/kvm", "/dev/kqemu",
-#    "/dev/rtc","/dev/hpet", "/dev/sev"
+#    "/dev/rtc","/dev/hpet"
 #]
 #
 # RDMA migration requires the following extra files to be added to the list:
diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c
index 9ceecb884e..7b7cd4258b 100644
--- a/src/qemu/qemu_cgroup.c
+++ b/src/qemu/qemu_cgroup.c
@@ -46,7 +46,7 @@ const char *const defaultDeviceACL[] = {
     "/dev/null", "/dev/full", "/dev/zero",
     "/dev/random", "/dev/urandom",
     "/dev/ptmx", "/dev/kvm", "/dev/kqemu",
-    "/dev/rtc", "/dev/hpet", "/dev/sev",
+    "/dev/rtc", "/dev/hpet",
     NULL,
 };
 #define DEVICE_PTY_MAJOR 136
diff --git a/src/qemu/test_libvirtd_qemu.aug.in b/src/qemu/test_libvirtd_qemu.aug.in
index 4235464530..51a7ad5892 100644
--- a/src/qemu/test_libvirtd_qemu.aug.in
+++ b/src/qemu/test_libvirtd_qemu.aug.in
@@ -63,7 +63,6 @@ module Test_libvirtd_qemu =
     { "8" = "/dev/kqemu" }
     { "9" = "/dev/rtc" }
     { "10" = "/dev/hpet" }
-    { "11" = "/dev/sev" }
 }
 { "save_image_format" = "raw" }
 { "dump_image_format" = "raw" }
-- 
2.20.1

On Thu, Jan 31, 2019 at 04:26:14PM +0100, Erik Skultety wrote:
> We should not give domains access to something they don't necessarily need
> by default. Remove it from the qemu driver docs too.
>
> Signed-off-by: Erik Skultety <eskultet@redhat.com>
> ---
>  docs/drvqemu.html.in               | 2 +-
>  src/qemu/qemu.conf                 | 2 +-
>  src/qemu/qemu_cgroup.c             | 2 +-
>  src/qemu/test_libvirtd_qemu.aug.in | 1 -
>  4 files changed, 3 insertions(+), 4 deletions(-)
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

SEV has a limit on number of concurrent guests. From security POV we should
only expose resources (any resources for that matter) to domains that truly
need them.

Signed-off-by: Erik Skultety <eskultet@redhat.com>
---
 src/qemu/qemu_cgroup.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c
index 7b7cd4258b..e88cb8c45f 100644
--- a/src/qemu/qemu_cgroup.c
+++ b/src/qemu/qemu_cgroup.c
@@ -691,6 +691,22 @@ qemuTeardownChardevCgroup(virDomainObjPtr vm,
 }
 
 
+static int
+qemuSetupSEVCgroup(virDomainObjPtr vm)
+{
+    qemuDomainObjPrivatePtr priv = vm->privateData;
+    int ret;
+
+    if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_DEVICES))
+        return 0;
+
+    ret = virCgroupAllowDevicePath(priv->cgroup, "/dev/sev",
+                                   VIR_CGROUP_DEVICE_RW, false);
+    virDomainAuditCgroupPath(vm, priv->cgroup, "allow", "/dev/sev",
+                             "rw", ret);
+    return ret;
+}
+
 static int
 qemuSetupDevicesCgroup(virDomainObjPtr vm)
 {
@@ -798,6 +814,9 @@ qemuSetupDevicesCgroup(virDomainObjPtr vm)
             goto cleanup;
     }
 
+    if (vm->def->sev && qemuSetupSEVCgroup(vm) < 0)
+        goto cleanup;
+
     ret = 0;
  cleanup:
     virObjectUnref(cfg);
-- 
2.20.1
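The combined effect of dropping /dev/sev from the default ACL and allowing it per domain can be sketched as a plain membership test. This is only a model under stated assumptions: exampleDefaultACL is a shortened, illustrative copy of the default list and exampleDomainMayAccess is a hypothetical helper; the actual patch goes through virCgroupAllowDevicePath as the diff shows.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative subset of the default device ACL once /dev/sev is removed */
static const char *const exampleDefaultACL[] = {
    "/dev/null", "/dev/zero", "/dev/kvm", NULL,
};

/* Hypothetical helper: would a domain be allowed to access @path? */
bool
exampleDomainMayAccess(const char *path, bool domainUsesSEV)
{
    size_t i;

    /* Devices on the default list are available to every domain */
    for (i = 0; exampleDefaultACL[i]; i++) {
        if (strcmp(path, exampleDefaultACL[i]) == 0)
            return true;
    }

    /* /dev/sev is allowed only for domains configured with SEV */
    return domainUsesSEV && strcmp(path, "/dev/sev") == 0;
}
```

A domain without launch security therefore never sees /dev/sev in its device cgroup, which matters given the limit on concurrent SEV guests mentioned in the commit message.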

On Thu, Jan 31, 2019 at 04:26:15PM +0100, Erik Skultety wrote:
> SEV has a limit on number of concurrent guests. From security POV we should
> only expose resources (any resources for that matter) to domains that truly
> need them.
>
> Signed-off-by: Erik Skultety <eskultet@redhat.com>
> ---
>  src/qemu/qemu_cgroup.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>

Regards,
Daniel

Instead of exposing /dev/sev to every domain, do it selectively.

Signed-off-by: Erik Skultety <eskultet@redhat.com>
---
 src/qemu/qemu_domain.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 5bfe4fe14e..f02c45535a 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -116,6 +116,7 @@ VIR_ENUM_IMPL(qemuDomainNamespace, QEMU_DOMAIN_NS_LAST,
 #define DEVPREFIX "/dev/"
 #define DEV_VFIO "/dev/vfio/vfio"
 #define DEVICE_MAPPER_CONTROL_PATH "/dev/mapper/control"
+#define DEV_SEV "/dev/sev"
 
 
 struct _qemuDomainLogContext {
@@ -12200,6 +12201,26 @@ qemuDomainSetupLoader(virQEMUDriverConfigPtr cfg ATTRIBUTE_UNUSED,
 }
 
 
+static int
+qemuDomainSetupLaunchSecurity(virQEMUDriverConfigPtr cfg ATTRIBUTE_UNUSED,
+                              virDomainObjPtr vm,
+                              const struct qemuDomainCreateDeviceData *data)
+{
+    virDomainSEVDefPtr sev = vm->def->sev;
+
+    if (!sev || sev->sectype != VIR_DOMAIN_LAUNCH_SECURITY_SEV)
+        return 0;
+
+    VIR_DEBUG("Setting up launch security");
+
+    if (qemuDomainCreateDevice(DEV_SEV, data, false) < 0)
+        return -1;
+
+    VIR_DEBUG("Set up launch security");
+    return 0;
+}
+
+
 int
 qemuDomainBuildNamespace(virQEMUDriverConfigPtr cfg,
                          virSecurityManagerPtr mgr,
@@ -12271,6 +12292,9 @@ qemuDomainBuildNamespace(virQEMUDriverConfigPtr cfg,
     if (qemuDomainSetupLoader(cfg, vm, &data) < 0)
         goto cleanup;
 
+    if (qemuDomainSetupLaunchSecurity(cfg, vm, &data) < 0)
+        goto cleanup;
+
     /* Save some mount points because we want to share them with the host */
     for (i = 0; i < ndevMountsPath; i++) {
         struct stat sb;
-- 
2.20.1

On Thu, Jan 31, 2019 at 04:26:16PM +0100, Erik Skultety wrote:
> Instead of exposing /dev/sev to every domain, do it selectively.
>
> Signed-off-by: Erik Skultety <eskultet@redhat.com>
> ---
>  src/qemu/qemu_domain.c | 24 ++++++++++++++++++++++++
>  1 file changed, 24 insertions(+)
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>

Regards,
Daniel

The default permissions (0600 root:root) are of no use to the qemu process
so we need to change the owner to qemu iff running with namespaces.

Signed-off-by: Erik Skultety <eskultet@redhat.com>
---
 src/security/security_dac.c | 51 +++++++++++++++++++++++++++++++++++++
 1 file changed, 51 insertions(+)

diff --git a/src/security/security_dac.c b/src/security/security_dac.c
index 9f73114631..6f8ca8cd54 100644
--- a/src/security/security_dac.c
+++ b/src/security/security_dac.c
@@ -48,6 +48,7 @@
 VIR_LOG_INIT("security.security_dac");
 
 #define SECURITY_DAC_NAME "dac"
+#define DEV_SEV "/dev/sev"
 
 typedef struct _virSecurityDACData virSecurityDACData;
 typedef virSecurityDACData *virSecurityDACDataPtr;
@@ -1676,6 +1677,16 @@ virSecurityDACRestoreMemoryLabel(virSecurityManagerPtr mgr,
 }
 
 
+static int
+virSecurityDACRestoreSEVLabel(virSecurityManagerPtr mgr ATTRIBUTE_UNUSED,
+                              virDomainDefPtr def ATTRIBUTE_UNUSED)
+{
+    /* we only label /dev/sev when running with namespaces, so we don't need to
+     * restore anything */
+    return 0;
+}
+
+
 static int
 virSecurityDACRestoreAllLabel(virSecurityManagerPtr mgr,
                               virDomainDefPtr def,
@@ -1746,6 +1757,11 @@ virSecurityDACRestoreAllLabel(virSecurityManagerPtr mgr,
             rc = -1;
     }
 
+    if (def->sev) {
+        if (virSecurityDACRestoreSEVLabel(mgr, def) < 0)
+            rc = -1;
+    }
+
     if (def->os.loader && def->os.loader->nvram &&
         virSecurityDACRestoreFileLabel(mgr, def->os.loader->nvram) < 0)
         rc = -1;
@@ -1819,6 +1835,36 @@ virSecurityDACSetMemoryLabel(virSecurityManagerPtr mgr,
 }
 
 
+static int
+virSecurityDACSetSEVLabel(virSecurityManagerPtr mgr,
+                          virDomainDefPtr def)
+{
+    virSecurityDACDataPtr priv = virSecurityManagerGetPrivateData(mgr);
+    virSecurityLabelDefPtr seclabel;
+    uid_t user;
+    gid_t group;
+
+    /* Skip chowning /dev/sev if namespaces are disabled as we'd significantly
+     * increase the chance of a DOS attack on SEV
+     */
+    if (!priv->mountNamespace)
+        return 0;
+
+    seclabel = virDomainDefGetSecurityLabelDef(def, SECURITY_DAC_NAME);
+    if (seclabel && !seclabel->relabel)
+        return 0;
+
+    if (virSecurityDACGetIds(seclabel, priv, &user, &group, NULL, NULL) < 0)
+        return -1;
+
+    if (virSecurityDACSetOwnership(mgr, NULL, DEV_SEV,
+                                   user, group, false) < 0)
+        return -1;
+
+    return 0;
+}
+
+
 static int
 virSecurityDACSetAllLabel(virSecurityManagerPtr mgr,
                           virDomainDefPtr def,
@@ -1888,6 +1934,11 @@ virSecurityDACSetAllLabel(virSecurityManagerPtr mgr,
             return -1;
     }
 
+    if (def->sev) {
+        if (virSecurityDACSetSEVLabel(mgr, def) < 0)
+            return -1;
+    }
+
     if (virSecurityDACGetImageIds(secdef, priv, &user, &group))
         return -1;
 
-- 
2.20.1
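The early returns in virSecurityDACSetSEVLabel boil down to a small decision table, sketched here in isolation. The function exampleShouldChownSEV and its three boolean parameters are hypothetical stand-ins for priv->mountNamespace, the presence of a DAC seclabel, and seclabel->relabel; the sketch leaves out the id lookup and the chown itself.

```c
#include <stdbool.h>

/* Hypothetical helper mirroring the skip conditions in the patch:
 * returns true only when /dev/sev would actually be chowned. */
bool
exampleShouldChownSEV(bool mountNamespace, bool hasSeclabel, bool relabel)
{
    /* Without a private mount namespace the chown would hit the host's
     * /dev/sev, which the patch avoids */
    if (!mountNamespace)
        return false;

    /* An explicit DAC seclabel with relabelling disabled is respected */
    if (hasSeclabel && !relabel)
        return false;

    return true;
}
```

Because the ownership change only ever happens inside the domain's namespace, nothing has to be undone on shutdown, which is why the matching restore function in the patch is a no-op.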

On Thu, Jan 31, 2019 at 04:26:17PM +0100, Erik Skultety wrote:
> The default permissions (0600 root:root) are of no use to the qemu process
> so we need to change the owner to qemu iff running with namespaces.
>
> Signed-off-by: Erik Skultety <eskultet@redhat.com>
> ---
>  src/security/security_dac.c | 51 +++++++++++++++++++++++++++++++++++++
>  1 file changed, 51 insertions(+)
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>

Regards,
Daniel

This is mainly about /dev/sev and its default permissions 0600. Of course,
rule of 'tinfoil' would be that we can't trust anything, but the probing
code in QEMU is considered safe from security's perspective + we can't
create a udev rule for this at the moment, because ioctls and filesystem
permissions are cross checked in kernel and therefore a user with read
permissions could issue a 'privileged' operation on SEV which is currently
only limited to root.

Signed-off-by: Erik Skultety <eskultet@redhat.com>
---
 src/qemu/qemu_capabilities.c | 11 +++++++++++
 src/util/virutil.c           | 31 +++++++++++++++++++++++++++++--
 2 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index 5cf4b617c6..2e84c965e8 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -53,6 +53,10 @@
 #include <stdarg.h>
 #include <sys/utsname.h>
 
+#if WITH_CAPNG
+# include <cap-ng.h>
+#endif
+
 #define VIR_FROM_THIS VIR_FROM_QEMU
 
 VIR_LOG_INIT("qemu.qemu_capabilities");
@@ -4515,6 +4519,13 @@ virQEMUCapsInitQMPCommandRun(virQEMUCapsInitQMPCommandPtr cmd,
                                     NULL);
     virCommandAddEnvPassCommon(cmd->cmd);
     virCommandClearCaps(cmd->cmd);
+
+#if WITH_CAPNG
+    /* QEMU might run into permission issues, e.g. /dev/sev (0600), override
+     * them just for the purpose of probing */
+    virCommandAllowCap(cmd->cmd, CAP_DAC_OVERRIDE);
+#endif
+
     virCommandSetGID(cmd->cmd, cmd->runGid);
     virCommandSetUID(cmd->cmd, cmd->runUid);
 
diff --git a/src/util/virutil.c b/src/util/virutil.c
index 5251b66454..02de92061c 100644
--- a/src/util/virutil.c
+++ b/src/util/virutil.c
@@ -1502,8 +1502,10 @@ virSetUIDGIDWithCaps(uid_t uid, gid_t gid, gid_t *groups, int ngroups,
 {
     size_t i;
     int capng_ret, ret = -1;
-    bool need_setgid = false, need_setuid = false;
+    bool need_setgid = false;
+    bool need_setuid = false;
     bool need_setpcap = false;
+    const char *capstr = NULL;
 
     /* First drop all caps (unless the requested uid is "unchanged" or
      * root and clearExistingCaps wasn't requested), then add back
@@ -1512,14 +1514,18 @@ virSetUIDGIDWithCaps(uid_t uid, gid_t gid, gid_t *groups, int ngroups,
      */
 
     if (clearExistingCaps || (uid != (uid_t)-1 && uid != 0))
-       capng_clear(CAPNG_SELECT_BOTH);
+        capng_clear(CAPNG_SELECT_BOTH);
 
     for (i = 0; i <= CAP_LAST_CAP; i++) {
+        capstr = capng_capability_to_name(i);
+
         if (capBits & (1ULL << i)) {
             capng_update(CAPNG_ADD,
                          CAPNG_EFFECTIVE|CAPNG_INHERITABLE|
                          CAPNG_PERMITTED|CAPNG_BOUNDING_SET,
                          i);
+
+            VIR_DEBUG("Added '%s' to child capabilities' set", capstr);
         }
     }
 
@@ -1579,6 +1585,27 @@ virSetUIDGIDWithCaps(uid_t uid, gid_t gid, gid_t *groups, int ngroups,
         goto cleanup;
     }
 
+# ifdef PR_CAP_AMBIENT
+    /* we couldn't do this in the loop earlier above, because the capabilities
+     * were not applied yet, since in order to add a capability into the AMBIENT
+     * set, it has to be present in both the PERMITTED and INHERITABLE sets
+     * (capabilities(7))
+     */
+    for (i = 0; i <= CAP_LAST_CAP; i++) {
+        capstr = capng_capability_to_name(i);
+
+        if (capBits & (1ULL << i)) {
+            if (prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, i, 0, 0) < 0) {
+                virReportSystemError(errno,
+                                     _("prctl failed to enable '%s' in the "
+                                       "AMBIENT set"),
+                                     capstr);
+                goto cleanup;
+            }
+        }
+    }
+# endif
+
     /* Set bounding set while we have CAP_SETPCAP. Unfortunately we cannot
      * do this if we failed to get the capability above, so ignore the
      * return value.
-- 
2.20.1
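The ordering constraint described in the PR_CAP_AMBIENT comment can be modelled without touching real process state. The sketch below is an assumption-laden toy: struct exampleCapSets, exampleApplyCaps and exampleAmbientRaise are hypothetical names, and EXAMPLE_CAP_DAC_OVERRIDE is defined locally with the Linux value of CAP_DAC_OVERRIDE (1) so the block does not depend on kernel headers. It only mirrors the rule from capabilities(7) that a capability must already be in both the permitted and inheritable sets before it can be raised into the ambient set.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for CAP_DAC_OVERRIDE, whose value on Linux is 1 */
#define EXAMPLE_CAP_DAC_OVERRIDE 1

/* Toy model of three of a process's capability sets, one bit per capability */
struct exampleCapSets {
    uint64_t permitted;
    uint64_t inheritable;
    uint64_t ambient;
};

/* Models applying the requested caps to the permitted/inheritable sets */
void
exampleApplyCaps(struct exampleCapSets *sets, uint64_t capBits)
{
    sets->permitted |= capBits;
    sets->inheritable |= capBits;
}

/* Models PR_CAP_AMBIENT_RAISE: fails unless the capability is already
 * both permitted and inheritable (the kernel reports EPERM in that case) */
bool
exampleAmbientRaise(struct exampleCapSets *sets, unsigned int cap)
{
    uint64_t bit;

    if (cap >= 64)
        return false;

    bit = 1ULL << cap;
    if (!(sets->permitted & bit) || !(sets->inheritable & bit))
        return false;

    sets->ambient |= bit;
    return true;
}
```

In this model, raising the capability before exampleApplyCaps has run fails, and the same call afterwards succeeds, which corresponds to the second loop in the patch running only after the capabilities have been applied.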

On Thu, Jan 31, 2019 at 04:26:18PM +0100, Erik Skultety wrote:
> This is mainly about /dev/sev and its default permissions 0600. Of course,
> rule of 'tinfoil' would be that we can't trust anything, but the probing
> code in QEMU is considered safe from security's perspective + we can't
> create a udev rule for this at the moment, because ioctls and filesystem
> permissions are cross checked in kernel and therefore a user with read
> permissions could issue a 'privileged' operation on SEV which is currently
> only limited to root.
>
> Signed-off-by: Erik Skultety <eskultet@redhat.com>
> ---
>  src/qemu/qemu_capabilities.c | 11 +++++++++++
>  src/util/virutil.c           | 31 +++++++++++++++++++++++++++++--
>  2 files changed, 40 insertions(+), 2 deletions(-)

[...]

> @@ -1579,6 +1585,27 @@ virSetUIDGIDWithCaps(uid_t uid, gid_t gid, gid_t *groups, int ngroups,
>          goto cleanup;
>      }
>
> +# ifdef PR_CAP_AMBIENT
> +    /* we couldn't do this in the loop earlier above, because the capabilities
> +     * were not applied yet, since in order to add a capability into the AMBIENT
> +     * set, it has to be present in both the PERMITTED and INHERITABLE sets
> +     * (capabilities(7))
> +     */
> +    for (i = 0; i <= CAP_LAST_CAP; i++) {
> +        capstr = capng_capability_to_name(i);
> +
> +        if (capBits & (1ULL << i)) {
> +            if (prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, i, 0, 0) < 0) {
> +                virReportSystemError(errno,
> +                                     _("prctl failed to enable '%s' in the "
> +                                       "AMBIENT set"),
> +                                     capstr);
> +                goto cleanup;
> +            }
> +        }
> +    }
> +# endif
This is set a bit earlier than I set it in my PoC patch, but I'll assume it
still works given the comment you added.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>

Regards,
Daniel

On Fri, Feb 01, 2019 at 10:31:52AM +0000, Daniel P. Berrangé wrote:
> This is set a bit earlier than I set it in my PoC patch, but I'll assume it
> still works given the comment you added.

I was trying to understand whether there was a particular reason why you
added it to the ambient set later, so my first lame attempt was to merge the
2 'for' loops into 1, since they were identical apart from the prctl syscall,
but that led to an error. So I investigated and found the restriction I
mentioned in the comment, so I moved it after the caps were first applied and
it did work: (trial-error)+-research method™

> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>

Thanks,
Erik

On Fri, Feb 01, 2019 at 12:33:19PM +0100, Erik Skultety wrote:
> > This is set a bit earlier than I set it in my PoC patch, but I'll assume
> > it still works given the comment you added.
>
> I was trying to understand whether there was a particular reason why you
> added it to the ambient set later, so my first lame attempt was to merge
> the 2 'for' loops into 1, since they were identical apart from the prctl
> syscall, but that led to an error. So I investigated and found the
> restriction I mentioned in the comment, so I moved it after the caps were
> first applied and it did work: (trial-error)+-research method™

I added it really early at first and it didn't work, so I then put it right
at the end. I didn't bother to try it in the middle as I was lazy :-)

Regards,
Daniel