[libvirt] [PATCH 0/3] Keeping / Dropping capabilities in lxc containers

Hi all,

I had a request from some users to allow keeping the mknod capability in containers even though that may be a security threat for the container and host. After discussing it with Dan on IRC, here is a patch series that adds a capabilities XML element in the features section of the domain configuration. It also allows dropping capabilities that are normally kept.

Along with this commit come one for the conversion of the lxc.cap.drop entry of an LXC configuration to domain XML, and one to extend the documentation.

There is one thing I'm not sure how to do best: I had to list all capabilities in an enum for the XML config, and I had to map those to the kernel CAP_* defines. Any ideas for improvement are welcome ;)

Cédric Bosdonnat (3):
  lxc: allow to keep or drop capabilities
  lxc domain from xml: convert lxc.cap.drop
  lxc: update doc to mention features/capabilities/* domain configuration

 docs/drvlxc.html.in                                |  27 +++
 docs/schemas/domaincommon.rng                      | 196 +++++++++++++++++++++
 src/conf/domain_conf.c                             |  93 +++++++++-
 src/conf/domain_conf.h                             |  47 +++++
 src/libvirt_private.syms                           |   1 +
 src/lxc/lxc_cgroup.c                               |   5 +
 src/lxc/lxc_container.c                            |  90 ++++++++--
 src/lxc/lxc_native.c                               |  27 +++
 tests/domainschemadata/domain-caps-features.xml    |  28 +++
 tests/lxcconf2xmldata/lxcconf2xml-blkiotune.xml    |  39 ++++
 tests/lxcconf2xmldata/lxcconf2xml-cpusettune.xml   |  39 ++++
 tests/lxcconf2xmldata/lxcconf2xml-cputune.xml      |  39 ++++
 tests/lxcconf2xmldata/lxcconf2xml-idmap.xml        |  39 ++++
 .../lxcconf2xmldata/lxcconf2xml-macvlannetwork.xml |  41 +++++
 tests/lxcconf2xmldata/lxcconf2xml-memtune.xml      |  39 ++++
 tests/lxcconf2xmldata/lxcconf2xml-nonenetwork.xml  |  41 +++++
 tests/lxcconf2xmldata/lxcconf2xml-nonetwork.xml    |  39 ++++
 tests/lxcconf2xmldata/lxcconf2xml-physnetwork.xml  |  41 +++++
 tests/lxcconf2xmldata/lxcconf2xml-simple.xml       |  41 +++++
 tests/lxcconf2xmldata/lxcconf2xml-vlannetwork.xml  |  41 +++++
 20 files changed, 935 insertions(+), 18 deletions(-)
 create mode 100644 tests/domainschemadata/domain-caps-features.xml

-- 
1.8.4.5
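
On the open question in the cover letter (keeping the XML enum and the kernel CAP_* defines in step), one possible approach is a single table expanded several times with an "X-macro". The following is only a sketch under stated assumptions, not what the series does: every `demo*` / `DEMO_*` name is hypothetical, only four capabilities are listed to keep it short, and it assumes a Linux host where `<linux/capability.h>` is available.

```c
#include <string.h>
#include <linux/capability.h>

/* Hypothetical single source of truth: (enum suffix, XML name, kernel define).
 * A real table would list every capability; this is a four-entry subset. */
#define DEMO_CAPS(X) \
    X(AUDIT_CONTROL, "audit_control", CAP_AUDIT_CONTROL) \
    X(CHOWN,         "chown",         CAP_CHOWN)         \
    X(MKNOD,         "mknod",         CAP_MKNOD)         \
    X(SYS_CHROOT,    "sys_chroot",    CAP_SYS_CHROOT)

/* Expansion 1: the enum used to index per-capability state. */
typedef enum {
#define X(id, name, cap) DEMO_CAPS_FEATURE_##id,
    DEMO_CAPS(X)
#undef X
    DEMO_CAPS_FEATURE_LAST
} demoCapsFeature;

/* Expansion 2: the XML element names, in the same order as the enum. */
static const char *const demoCapsNames[] = {
#define X(id, name, cap) name,
    DEMO_CAPS(X)
#undef X
};

/* Expansion 3: the kernel capability numbers, again in enum order. */
static const int demoCapsKernel[] = {
#define X(id, name, cap) cap,
    DEMO_CAPS(X)
#undef X
};

/* Returns the enum value for an XML element name, or -1 if unknown. */
int demoCapsFeatureFromString(const char *name)
{
    int i;

    for (i = 0; i < DEMO_CAPS_FEATURE_LAST; i++) {
        if (strcmp(demoCapsNames[i], name) == 0)
            return i;
    }
    return -1;
}

/* Returns the kernel CAP_* number for an enum value, or -1 if out of range. */
int demoCapsFeatureToKernel(int feature)
{
    if (feature < 0 || feature >= DEMO_CAPS_FEATURE_LAST)
        return -1;
    return demoCapsKernel[feature];
}
```

With this layout, adding a capability means adding one line to the table, so the enum, the name list and the CAP_* mapping cannot drift apart by accident.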

Added <capabilities> in the <features> section of LXC domains configuration. This section can contain elements named after the capabilities like: <mknod state="on"/>, keep CAP_MKNOD capability <sys_chroot state="off"/> drop CAP_SYS_CHROOT capability Users can restrict or give more capabilities than the default using this mechanism. --- docs/schemas/domaincommon.rng | 196 ++++++++++++++++++++++++ src/conf/domain_conf.c | 93 ++++++++++- src/conf/domain_conf.h | 47 ++++++ src/libvirt_private.syms | 1 + src/lxc/lxc_cgroup.c | 5 + src/lxc/lxc_container.c | 90 +++++++++-- tests/domainschemadata/domain-caps-features.xml | 28 ++++ 7 files changed, 442 insertions(+), 18 deletions(-) create mode 100644 tests/domainschemadata/domain-caps-features.xml diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index 6cc922c..297d0ae 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -3744,6 +3744,9 @@ <empty/> </element> </optional> + <optional> + <ref name="capabilities"/> + </optional> </interleave> </element> </optional> @@ -4303,6 +4306,199 @@ </element> </define> + <!-- Optional capabilities features --> + <define name="capabilities"> + <element name="capabilities"> + <interleave> + <optional> + <element name="audit_control"> + <ref name="featurestate"/> + </element> + </optional> + <optional> + <element name="audit_write"> + <ref name="featurestate"/> + </element> + </optional> + <optional> + <element name="block_suspend"> + <ref name="featurestate"/> + </element> + </optional> + <optional> + <element name="chown"> + <ref name="featurestate"/> + </element> + </optional> + <optional> + <element name="dac_override"> + <ref name="featurestate"/> + </element> + </optional> + <optional> + <element name="dac_read_search"> + <ref name="featurestate"/> + </element> + </optional> + <optional> + <element name="fowner"> + <ref name="featurestate"/> + </element> + </optional> + <optional> + <element name="fsetid"> + <ref 
name="featurestate"/> + </element> + </optional> + <optional> + <element name="ipc_lock"> + <ref name="featurestate"/> + </element> + </optional> + <optional> + <element name="ipc_owner"> + <ref name="featurestate"/> + </element> + </optional> + <optional> + <element name="kill"> + <ref name="featurestate"/> + </element> + </optional> + <optional> + <element name="lease"> + <ref name="featurestate"/> + </element> + </optional> + <optional> + <element name="linux_immutable"> + <ref name="featurestate"/> + </element> + </optional> + <optional> + <element name="mac_admin"> + <ref name="featurestate"/> + </element> + </optional> + <optional> + <element name="mac_override"> + <ref name="featurestate"/> + </element> + </optional> + <optional> + <element name="mknod"> + <ref name="featurestate"/> + </element> + </optional> + <optional> + <element name="net_admin"> + <ref name="featurestate"/> + </element> + </optional> + <optional> + <element name="net_bind_service"> + <ref name="featurestate"/> + </element> + </optional> + <optional> + <element name="net_broadcast"> + <ref name="featurestate"/> + </element> + </optional> + <optional> + <element name="net_raw"> + <ref name="featurestate"/> + </element> + </optional> + <optional> + <element name="setgid"> + <ref name="featurestate"/> + </element> + </optional> + <optional> + <element name="setfcap"> + <ref name="featurestate"/> + </element> + </optional> + <optional> + <element name="setpcap"> + <ref name="featurestate"/> + </element> + </optional> + <optional> + <element name="setuid"> + <ref name="featurestate"/> + </element> + </optional> + <optional> + <element name="sys_admin"> + <ref name="featurestate"/> + </element> + </optional> + <optional> + <element name="sys_boot"> + <ref name="featurestate"/> + </element> + </optional> + <optional> + <element name="sys_chroot"> + <ref name="featurestate"/> + </element> + </optional> + <optional> + <element name="sys_module"> + <ref name="featurestate"/> + </element> + 
</optional> + <optional> + <element name="sys_nice"> + <ref name="featurestate"/> + </element> + </optional> + <optional> + <element name="sys_pacct"> + <ref name="featurestate"/> + </element> + </optional> + <optional> + <element name="sys_ptrace"> + <ref name="featurestate"/> + </element> + </optional> + <optional> + <element name="sys_rawio"> + <ref name="featurestate"/> + </element> + </optional> + <optional> + <element name="sys_resource"> + <ref name="featurestate"/> + </element> + </optional> + <optional> + <element name="sys_time"> + <ref name="featurestate"/> + </element> + </optional> + <optional> + <element name="sys_tty_config"> + <ref name="featurestate"/> + </element> + </optional> + <optional> + <element name="syslog"> + <ref name="featurestate"/> + </element> + </optional> + <optional> + <element name="wake_alarm"> + <ref name="featurestate"/> + </element> + </optional> + </interleave> + </element> + </define> + <define name="featurestate"> <attribute name="state"> <choice> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index ff2d447..5de4bd8 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -147,7 +147,8 @@ VIR_ENUM_IMPL(virDomainFeature, VIR_DOMAIN_FEATURE_LAST, "viridian", "privnet", "hyperv", - "pvspinlock") + "pvspinlock", + "capabilities") VIR_ENUM_IMPL(virDomainFeatureState, VIR_DOMAIN_FEATURE_STATE_LAST, "default", @@ -159,6 +160,45 @@ VIR_ENUM_IMPL(virDomainHyperv, VIR_DOMAIN_HYPERV_LAST, "vapic", "spinlocks") +VIR_ENUM_IMPL(virDomainCapsFeature, VIR_DOMAIN_CAPS_FEATURE_LAST, + "audit_control", + "audit_write", + "block_suspend", + "chown", + "dac_override", + "dac_read_search", + "fowner", + "fsetid", + "ipc_lock", + "ipc_owner", + "kill", + "lease", + "linux_immutable", + "mac_admin", + "mac_override", + "mknod", + "net_admin", + "net_bind_service", + "net_broadcast", + "net_raw", + "setgid", + "setfcap", + "setpcap", + "setuid", + "sys_admin", + "sys_boot", + "sys_chroot", + "sys_module", + "sys_nice", + 
"sys_pacct", + "sys_ptrace", + "sys_rawio", + "sys_resource", + "sys_time", + "sys_tty_config", + "syslog", + "wake_alarm") + VIR_ENUM_IMPL(virDomainLifecycle, VIR_DOMAIN_LIFECYCLE_LAST, "destroy", "restart", @@ -11874,6 +11914,7 @@ virDomainDefParseXML(xmlDocPtr xml, case VIR_DOMAIN_FEATURE_VIRIDIAN: case VIR_DOMAIN_FEATURE_PRIVNET: case VIR_DOMAIN_FEATURE_HYPERV: + case VIR_DOMAIN_FEATURE_CAPABILITIES: def->features[val] = VIR_DOMAIN_FEATURE_STATE_ON; break; @@ -11985,6 +12026,39 @@ virDomainDefParseXML(xmlDocPtr xml, ctxt->node = node; } + if (def->features[VIR_DOMAIN_FEATURE_CAPABILITIES] == VIR_DOMAIN_FEATURE_STATE_ON) { + if ((n = virXPathNodeSet("./features/capabilities/*", ctxt, &nodes)) < 0) + goto error; + + for (i = 0; i < n; i++) { + int val = virDomainCapsFeatureTypeFromString((const char *)nodes[i]->name); + if (val < 0) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("unexpected capability feature '%s'"), nodes[i]->name); + goto error; + } + + if (val >= 0 && val < VIR_DOMAIN_CAPS_FEATURE_LAST) { + node = ctxt->node; + ctxt->node = nodes[i]; + + if ((tmp = virXPathString("string(./@state)", ctxt))) { + if ((def->caps_features[val] = virDomainFeatureStateTypeFromString(tmp)) == -1) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("unknown state attribute '%s' of feature capability '%s'"), + tmp, virDomainFeatureTypeToString(val)); + goto error; + } + VIR_FREE(tmp); + } else { + def->caps_features[val] = VIR_DOMAIN_FEATURE_STATE_ON; + } + ctxt->node = node; + } + } + VIR_FREE(nodes); + } + if (virDomainEventActionParseXML(ctxt, "on_reboot", "string(./on_reboot[1])", &def->onReboot, @@ -17694,6 +17768,23 @@ virDomainDefFormatInternal(virDomainDefPtr def, virBufferAddLit(buf, "</hyperv>\n"); break; + case VIR_DOMAIN_FEATURE_CAPABILITIES: + if (def->features[i] != VIR_DOMAIN_FEATURE_STATE_ON) + break; + + virBufferAddLit(buf, "<capabilities>\n"); + virBufferAdjustIndent(buf, 2); + for (j = 0; j < VIR_DOMAIN_CAPS_FEATURE_LAST; j++) { + if 
(def->caps_features[j] != VIR_DOMAIN_FEATURE_STATE_DEFAULT) + virBufferAsprintf(buf, "<%s state='%s'/>\n", + virDomainCapsFeatureTypeToString(j), + virDomainFeatureStateTypeToString( + def->caps_features[j])); + } + virBufferAdjustIndent(buf, -2); + virBufferAddLit(buf, "</capabilities>\n"); + break; + case VIR_DOMAIN_FEATURE_LAST: break; } diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index a6ac95a..70044d6 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -1525,6 +1525,7 @@ typedef enum { VIR_DOMAIN_FEATURE_PRIVNET, VIR_DOMAIN_FEATURE_HYPERV, VIR_DOMAIN_FEATURE_PVSPINLOCK, + VIR_DOMAIN_FEATURE_CAPABILITIES, VIR_DOMAIN_FEATURE_LAST } virDomainFeature; @@ -1545,6 +1546,48 @@ typedef enum { VIR_DOMAIN_HYPERV_LAST } virDomainHyperv; +/* The capabilities are ordered alphabetically to help check for new ones */ +typedef enum { + VIR_DOMAIN_CAPS_FEATURE_AUDIT_CONTROL = 0, + VIR_DOMAIN_CAPS_FEATURE_AUDIT_WRITE, + VIR_DOMAIN_CAPS_FEATURE_BLOCK_SUSPEND, + VIR_DOMAIN_CAPS_FEATURE_CHOWN, + VIR_DOMAIN_CAPS_FEATURE_DAC_OVERRIDE, + VIR_DOMAIN_CAPS_FEATURE_DAC_READ_SEARCH, + VIR_DOMAIN_CAPS_FEATURE_FOWNER, + VIR_DOMAIN_CAPS_FEATURE_FSETID, + VIR_DOMAIN_CAPS_FEATURE_IPC_LOCK, + VIR_DOMAIN_CAPS_FEATURE_IPC_OWNER, + VIR_DOMAIN_CAPS_FEATURE_KILL, + VIR_DOMAIN_CAPS_FEATURE_LEASE, + VIR_DOMAIN_CAPS_FEATURE_LINUX_IMMUTABLE, + VIR_DOMAIN_CAPS_FEATURE_MAC_ADMIN, + VIR_DOMAIN_CAPS_FEATURE_MAC_OVERRIDE, + VIR_DOMAIN_CAPS_FEATURE_MKNOD, + VIR_DOMAIN_CAPS_FEATURE_NET_ADMIN, + VIR_DOMAIN_CAPS_FEATURE_NET_BIND_SERVICE, + VIR_DOMAIN_CAPS_FEATURE_NET_BROADCAST, + VIR_DOMAIN_CAPS_FEATURE_NET_RAW, + VIR_DOMAIN_CAPS_FEATURE_SETGID, + VIR_DOMAIN_CAPS_FEATURE_SETFCAP, + VIR_DOMAIN_CAPS_FEATURE_SETPCAP, + VIR_DOMAIN_CAPS_FEATURE_SETUID, + VIR_DOMAIN_CAPS_FEATURE_SYS_ADMIN, + VIR_DOMAIN_CAPS_FEATURE_SYS_BOOT, + VIR_DOMAIN_CAPS_FEATURE_SYS_CHROOT, + VIR_DOMAIN_CAPS_FEATURE_SYS_MODULE, + VIR_DOMAIN_CAPS_FEATURE_SYS_NICE, + VIR_DOMAIN_CAPS_FEATURE_SYS_PACCT, + 
VIR_DOMAIN_CAPS_FEATURE_SYS_PTRACE, + VIR_DOMAIN_CAPS_FEATURE_SYS_RAWIO, + VIR_DOMAIN_CAPS_FEATURE_SYS_RESOURCE, + VIR_DOMAIN_CAPS_FEATURE_SYS_TIME, + VIR_DOMAIN_CAPS_FEATURE_SYS_TTY_CONFIG, + VIR_DOMAIN_CAPS_FEATURE_SYSLOG, + VIR_DOMAIN_CAPS_FEATURE_WAKE_ALARM, + VIR_DOMAIN_CAPS_FEATURE_LAST +} virDomainCapsFeature; + typedef enum { VIR_DOMAIN_LIFECYCLE_DESTROY, VIR_DOMAIN_LIFECYCLE_RESTART, @@ -1914,6 +1957,9 @@ struct _virDomainDef { int hyperv_features[VIR_DOMAIN_HYPERV_LAST]; unsigned int hyperv_spinlocks; + /* This options are of type virDomainFeatureState: ON = keep, OFF = drop */ + int caps_features[VIR_DOMAIN_CAPS_FEATURE_LAST]; + virDomainClockDef clock; size_t ngraphics; @@ -2534,6 +2580,7 @@ VIR_ENUM_DECL(virDomainBoot) VIR_ENUM_DECL(virDomainBootMenu) VIR_ENUM_DECL(virDomainFeature) VIR_ENUM_DECL(virDomainFeatureState) +VIR_ENUM_DECL(virDomainCapsFeature) VIR_ENUM_DECL(virDomainLifecycle) VIR_ENUM_DECL(virDomainLifecycleCrash) VIR_ENUM_DECL(virDomainPMState) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 122c572..a411766 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -133,6 +133,7 @@ virDomainBlockedReasonTypeFromString; virDomainBlockedReasonTypeToString; virDomainBootMenuTypeFromString; virDomainBootMenuTypeToString; +virDomainCapsFeatureTypeToString; virDomainChrConsoleTargetTypeFromString; virDomainChrConsoleTargetTypeToString; virDomainChrDefForeach; diff --git a/src/lxc/lxc_cgroup.c b/src/lxc/lxc_cgroup.c index 8dfdc60..71a0d61 100644 --- a/src/lxc/lxc_cgroup.c +++ b/src/lxc/lxc_cgroup.c @@ -357,6 +357,11 @@ static int virLXCCgroupSetupDeviceACL(virDomainDefPtr def, {'c', LXC_DEV_MAJ_FUSE, LXC_DEV_MIN_FUSE}, {0, 0, 0}}; + /* No white list if CAP_MKNOD has to be kept */ + int capMknod = def->caps_features[VIR_DOMAIN_CAPS_FEATURE_MKNOD]; + if (capMknod == VIR_DOMAIN_FEATURE_STATE_ON) + return 0; + if (virCgroupDenyAllDevices(cgroup) < 0) goto cleanup; diff --git a/src/lxc/lxc_container.c 
b/src/lxc/lxc_container.c index fd8ab16..de65ac8 100644 --- a/src/lxc/lxc_container.c +++ b/src/lxc/lxc_container.c @@ -1732,25 +1732,80 @@ static int lxcContainerResolveSymlinks(virDomainDefPtr vmDef) * host system, since they are not currently "containerized" */ #if WITH_CAPNG -static int lxcContainerDropCapabilities(bool keepReboot) +static int lxcContainerDropCapabilities(virDomainDefPtr def, + bool keepReboot) { - int ret; + int ret, i; + + /* Maps virDomainCapsFeature to CAPS_* */ + static unsigned int capsMapping[] = {CAP_AUDIT_CONTROL, + CAP_AUDIT_WRITE, + CAP_BLOCK_SUSPEND, + CAP_CHOWN, + CAP_DAC_OVERRIDE, + CAP_DAC_READ_SEARCH, + CAP_FOWNER, + CAP_FSETID, + CAP_IPC_LOCK, + CAP_IPC_OWNER, + CAP_KILL, + CAP_LEASE, + CAP_LINUX_IMMUTABLE, + CAP_MAC_ADMIN, + CAP_MAC_OVERRIDE, + CAP_MKNOD, + CAP_NET_ADMIN, + CAP_NET_BIND_SERVICE, + CAP_NET_BROADCAST, + CAP_NET_RAW, + CAP_SETGID, + CAP_SETFCAP, + CAP_SETPCAP, + CAP_SETUID, + CAP_SYS_ADMIN, + CAP_SYS_BOOT, + CAP_SYS_CHROOT, + CAP_SYS_MODULE, + CAP_SYS_NICE, + CAP_SYS_PACCT, + CAP_SYS_PTRACE, + CAP_SYS_RAWIO, + CAP_SYS_RESOURCE, + CAP_SYS_TIME, + CAP_SYS_TTY_CONFIG, + CAP_SYSLOG, + CAP_WAKE_ALARM}; capng_get_caps_process(); - if ((ret = capng_updatev(CAPNG_DROP, - CAPNG_EFFECTIVE | CAPNG_PERMITTED | - CAPNG_INHERITABLE | CAPNG_BOUNDING_SET, - CAP_SYS_MODULE, /* No kernel module loading */ - CAP_SYS_TIME, /* No changing the clock */ - CAP_MKNOD, /* No creating device nodes */ - CAP_AUDIT_CONTROL, /* No messing with auditing status */ - CAP_MAC_ADMIN, /* No messing with LSM config */ - keepReboot ? 
-1 : CAP_SYS_BOOT, /* No use of reboot */ - -1)) < 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Failed to remove capabilities: %d"), ret); - return -1; + for (i = 0; i < VIR_DOMAIN_CAPS_FEATURE_LAST; i++) { + bool toDrop = false; + int state = def->caps_features[i]; + + switch ((virDomainCapsFeature) i) { + case VIR_DOMAIN_CAPS_FEATURE_SYS_BOOT: /* No use of reboot */ + toDrop = !keepReboot && (state != VIR_DOMAIN_FEATURE_STATE_ON); + break; + case VIR_DOMAIN_CAPS_FEATURE_SYS_MODULE: /* No kernel module loading */ + case VIR_DOMAIN_CAPS_FEATURE_SYS_TIME: /* No changing the clock */ + case VIR_DOMAIN_CAPS_FEATURE_MKNOD: /* No creating device nodes */ + case VIR_DOMAIN_CAPS_FEATURE_AUDIT_CONTROL: /* No messing with auditing status */ + case VIR_DOMAIN_CAPS_FEATURE_MAC_ADMIN: /* No messing with LSM config */ + toDrop = (state != VIR_DOMAIN_FEATURE_STATE_ON); + break; + default: /* User specified capabilities to drop */ + toDrop = (state == VIR_DOMAIN_FEATURE_STATE_OFF); + } + + if (toDrop && (ret = capng_update(CAPNG_DROP, + CAPNG_EFFECTIVE | CAPNG_PERMITTED | + CAPNG_INHERITABLE | CAPNG_BOUNDING_SET, + capsMapping[i])) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Failed to remove capability %s: %d"), + virDomainCapsFeatureTypeToString(i), ret); + return -1; + } } if ((ret = capng_apply(CAPNG_SELECT_BOTH)) < 0) { @@ -1768,7 +1823,8 @@ static int lxcContainerDropCapabilities(bool keepReboot) return 0; } #else -static int lxcContainerDropCapabilities(bool keepReboot ATTRIBUTE_UNUSED) +static int lxcContainerDropCapabilities(virDomainDefPtr def ATTRIBUTE_UNUSED, + bool keepReboot ATTRIBUTE_UNUSED) { VIR_WARN("libcap-ng support not compiled in, unable to clear capabilities"); return 0; @@ -1874,7 +1930,7 @@ static int lxcContainerChild(void *data) } /* drop a set of root capabilities */ - if (lxcContainerDropCapabilities(!!hasReboot) < 0) + if (lxcContainerDropCapabilities(vmDef, !!hasReboot) < 0) goto cleanup; if (lxcContainerSendContinue(argv->handshakefd) 
< 0) { diff --git a/tests/domainschemadata/domain-caps-features.xml b/tests/domainschemadata/domain-caps-features.xml new file mode 100644 index 0000000..c62c767 --- /dev/null +++ b/tests/domainschemadata/domain-caps-features.xml @@ -0,0 +1,28 @@ +<domain type='lxc'> + <name>demo</name> + <uuid>8369f1ac-7e46-e869-4ca5-759d51478066</uuid> + <os> + <type>exe</type> + <init>/sh</init> + </os> + <features> + <capabilities> + <mknod state="on"/> + </capabilities> + </features> + <resource> + <partition>/virtualmachines</partition> + </resource> + <memory unit='KiB'>500000</memory> + <devices> + <filesystem type='mount'> + <source dir='/root/container'/> + <target dir='/'/> + </filesystem> + <filesystem type='mount'> + <source dir='/home'/> + <target dir='/home'/> + </filesystem> + <console type='pty'/> + </devices> +</domain> -- 1.8.4.5

On Thu, Jun 12, 2014 at 08:48:25AM +0200, Cédric Bosdonnat wrote:
Added <capabilities> in the <features> section of LXC domains configuration. This section can contain elements named after the capabilities like:
  <mknod state="on"/>, keep CAP_MKNOD capability
  <sys_chroot state="off"/> drop CAP_SYS_CHROOT capability

Users can restrict or give more capabilities than the default using this mechanism.
---
 docs/schemas/domaincommon.rng                   | 196 ++++++++++++++++++++++++
 src/conf/domain_conf.c                          |  93 ++++++++++-
 src/conf/domain_conf.h                          |  47 ++++++
 src/libvirt_private.syms                        |   1 +
 src/lxc/lxc_cgroup.c                            |   5 +
 src/lxc/lxc_container.c                         |  90 +++++++++--
 tests/domainschemadata/domain-caps-features.xml |  28 ++++
 7 files changed, 442 insertions(+), 18 deletions(-)
 create mode 100644 tests/domainschemadata/domain-caps-features.xml
diff --git a/src/lxc/lxc_cgroup.c b/src/lxc/lxc_cgroup.c
index 8dfdc60..71a0d61 100644
--- a/src/lxc/lxc_cgroup.c
+++ b/src/lxc/lxc_cgroup.c
@@ -357,6 +357,11 @@ static int virLXCCgroupSetupDeviceACL(virDomainDefPtr def,
         {'c', LXC_DEV_MAJ_FUSE, LXC_DEV_MIN_FUSE},
         {0, 0, 0}};
 
+    /* No white list if CAP_MKNOD has to be kept */
+    int capMknod = def->caps_features[VIR_DOMAIN_CAPS_FEATURE_MKNOD];
+    if (capMknod == VIR_DOMAIN_FEATURE_STATE_ON)
+        return 0;
+
     if (virCgroupDenyAllDevices(cgroup) < 0)
         goto cleanup;
So wrt device nodes we have two layers of defence

 - Blocking CAP_MKNOD - this means the user can only have access to device nodes that are present in the /dev that we populate.

 - CGroups ACL blocking mknod, read and write for everything except the device nodes that are implied by (or explicitly requested in) the XML

I can see the value in allowing CAP_MKNOD because if we have granted the user '/dev/foo' access, there's no harm in letting them "mknod /dev/foo" too. If we have granted CAP_MKNOD though I'm not convinced that this implies we should allow access to any possible device ever. So at most I think we should allow 'mknod' in the cgroups, but still keep read+write blocked. We definitely shouldn't allow read+write just because they requested CAP_MKNOD. We already let users request access to arbitrary devices in the XML config to deal with the latter.
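
To make the "allow mknod but keep read+write blocked" suggestion concrete: in the cgroup v1 devices controller, each entry written to devices.allow or devices.deny has the form "type major:minor perms", where perms is a subset of "rwm" and "m" means mknod. The helper below is a minimal sketch that only formats such an entry; `demoFormatDeviceRule` is a hypothetical name, not a libvirt function, and writing the string into the cgroup is left out.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: format one cgroup v1 devices.allow/devices.deny entry.
 *   type   'a' (all), 'b' (block) or 'c' (char)
 *   major  device major number, or a negative value for the "*" wildcard
 *   minor  device minor number, or a negative value for the "*" wildcard
 *   perms  subset of "rwm" (read, write, mknod)
 * Returns 0 on success, -1 if the buffer is too small. */
int demoFormatDeviceRule(char *buf, size_t buflen, char type,
                         int major, int minor, const char *perms)
{
    char majstr[16];
    char minstr[16];
    int n;

    if (major < 0)
        snprintf(majstr, sizeof(majstr), "*");
    else
        snprintf(majstr, sizeof(majstr), "%d", major);

    if (minor < 0)
        snprintf(minstr, sizeof(minstr), "*");
    else
        snprintf(minstr, sizeof(minstr), "%d", minor);

    n = snprintf(buf, buflen, "%c %s:%s %s", type, majstr, minstr, perms);
    if (n < 0 || (size_t)n >= buflen)
        return -1;
    return 0;
}
```

Under this reading of the review comment, a container that keeps CAP_MKNOD would get entries such as "c *:* m" and "b *:* m" allowed (mknod only), while "rw" would still be granted only to the devices listed in the XML.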
-    if ((ret = capng_updatev(CAPNG_DROP,
-                             CAPNG_EFFECTIVE | CAPNG_PERMITTED |
-                             CAPNG_INHERITABLE | CAPNG_BOUNDING_SET,
-                             CAP_SYS_MODULE, /* No kernel module loading */
-                             CAP_SYS_TIME, /* No changing the clock */
-                             CAP_MKNOD, /* No creating device nodes */
-                             CAP_AUDIT_CONTROL, /* No messing with auditing status */
-                             CAP_MAC_ADMIN, /* No messing with LSM config */
-                             keepReboot ? -1 : CAP_SYS_BOOT, /* No use of reboot */
-                             -1)) < 0) {
-        virReportError(VIR_ERR_INTERNAL_ERROR,
-                       _("Failed to remove capabilities: %d"), ret);
-        return -1;
+    for (i = 0; i < VIR_DOMAIN_CAPS_FEATURE_LAST; i++) {
+        bool toDrop = false;
+        int state = def->caps_features[i];
+
+        switch ((virDomainCapsFeature) i) {
+        case VIR_DOMAIN_CAPS_FEATURE_SYS_BOOT: /* No use of reboot */
+            toDrop = !keepReboot && (state != VIR_DOMAIN_FEATURE_STATE_ON);
+            break;
+        case VIR_DOMAIN_CAPS_FEATURE_SYS_MODULE: /* No kernel module loading */
+        case VIR_DOMAIN_CAPS_FEATURE_SYS_TIME: /* No changing the clock */
+        case VIR_DOMAIN_CAPS_FEATURE_MKNOD: /* No creating device nodes */
+        case VIR_DOMAIN_CAPS_FEATURE_AUDIT_CONTROL: /* No messing with auditing status */
+        case VIR_DOMAIN_CAPS_FEATURE_MAC_ADMIN: /* No messing with LSM config */
+            toDrop = (state != VIR_DOMAIN_FEATURE_STATE_ON);
+            break;
+        default: /* User specified capabilities to drop */
+            toDrop = (state == VIR_DOMAIN_FEATURE_STATE_OFF);
+        }
+
+        if (toDrop && (ret = capng_update(CAPNG_DROP,
+                                          CAPNG_EFFECTIVE | CAPNG_PERMITTED |
+                                          CAPNG_INHERITABLE | CAPNG_BOUNDING_SET,
+                                          capsMapping[i])) < 0) {
+            virReportError(VIR_ERR_INTERNAL_ERROR,
+                           _("Failed to remove capability %s: %d"),
+                           virDomainCapsFeatureTypeToString(i), ret);
+            return -1;
+        }
So what I don't like about the current code is that the set of 5 capabilities we drop is essentially arbitrary and has no inherent value.

 - If user namespaces are not active, then the container is insecure even if we block those 5 caps, so they add no security benefit.

 - If user namespaces are active, then the container is secure even if we allow these 5 caps, so again they add no security benefit.

If this were day 1, I'd just allow all possible capabilities by default but that would be a semantic change - particularly changing from block CAP_MKNOD to allowing CAP_MKNOD would cause a regression in any containers using systemd.

My concern, however, is that a user who wants to run a container with absolutely everything blocked has to list them all in the XML, and the kernel adds new capabilities over time. So they might have a config which is secure today where they've listed everything to drop, then the kernel adds a new capability and they become potentially less secure since their existing config doesn't know about the new capability name.

Looking at the XML you've proposed below:
+<domain type='lxc'>
+  <name>demo</name>
+  <uuid>8369f1ac-7e46-e869-4ca5-759d51478066</uuid>
+  <os>
+    <type>exe</type>
+    <init>/sh</init>
+  </os>
+  <features>
+    <capabilities>
+      <mknod state="on"/>
+    </capabilities>
+  </features>
I think what's missing here is some notion of the initial policy. I think the <capabilities> element could usefully grow a new 'policy' attribute, eg

  <features>
    <capabilities policy="default|deny|allow">
      <mknod state="on"/>
    </capabilities>
  </features>

Now this would mean:

 - default == the current historical hypervisor specific default behaviour, ie what the libvirt LXC driver does today. Can allow/deny any caps to change this default

 - deny == all CAPS blocked by default. Must whitelist any caps to be allowed in container

 - allow == all CAPS allowed by default. Must explicitly list any caps to be blocked from container

And the default policy would be 'default' for sake of back compat.

The final thing that concerns me is that this list of caps names is pretty much Linux specific, eg Solaris caps will potentially have different names. I don't see a good way to deal with that other than to say if we ever need to support this on Solaris, we'll extend the list of caps names to be the union of all names used by Linux and Solaris, and only process the caps relevant to each OS.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|
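
The 'policy' attribute proposed in the reply above could combine with the per-capability state roughly as follows. This is a hedged sketch of one possible reading, not code from the series: the `demo*` types and function are hypothetical, and it assumes that an explicit state="on"/"off" on a capability always overrides the policy, which the thread does not spell out.

```c
#include <stdbool.h>

/* Hypothetical mirror of the proposed policy="default|deny|allow" attribute. */
typedef enum {
    DEMO_CAPS_POLICY_DEFAULT, /* historical driver behaviour */
    DEMO_CAPS_POLICY_DENY,    /* everything dropped unless state="on" */
    DEMO_CAPS_POLICY_ALLOW    /* everything kept unless state="off" */
} demoCapsPolicy;

/* Hypothetical per-capability state: unset, keep, or drop. */
typedef enum {
    DEMO_STATE_DEFAULT,
    DEMO_STATE_ON,
    DEMO_STATE_OFF
} demoFeatureState;

/* Decide whether one capability should be dropped.
 * droppedByDefault says whether the driver's historical default drops
 * this capability (e.g. true for CAP_MKNOD in the LXC driver today). */
bool demoCapShouldDrop(demoCapsPolicy policy,
                       demoFeatureState state,
                       bool droppedByDefault)
{
    /* Assumption: an explicit per-capability setting always wins. */
    if (state == DEMO_STATE_ON)
        return false;
    if (state == DEMO_STATE_OFF)
        return true;

    /* No explicit setting: fall back to the policy. */
    switch (policy) {
    case DEMO_CAPS_POLICY_DENY:
        return true;
    case DEMO_CAPS_POLICY_ALLOW:
        return false;
    case DEMO_CAPS_POLICY_DEFAULT:
        break;
    }
    return droppedByDefault;
}
```

With policy="deny", a capability the kernel adds later would be dropped without any change to an existing config, which is the concern raised in the review.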

--- src/lxc/lxc_native.c | 27 ++++++++++++++ tests/lxcconf2xmldata/lxcconf2xml-blkiotune.xml | 39 ++++++++++++++++++++ tests/lxcconf2xmldata/lxcconf2xml-cpusettune.xml | 39 ++++++++++++++++++++ tests/lxcconf2xmldata/lxcconf2xml-cputune.xml | 39 ++++++++++++++++++++ tests/lxcconf2xmldata/lxcconf2xml-idmap.xml | 39 ++++++++++++++++++++ .../lxcconf2xmldata/lxcconf2xml-macvlannetwork.xml | 41 ++++++++++++++++++++++ tests/lxcconf2xmldata/lxcconf2xml-memtune.xml | 39 ++++++++++++++++++++ tests/lxcconf2xmldata/lxcconf2xml-nonenetwork.xml | 41 ++++++++++++++++++++++ tests/lxcconf2xmldata/lxcconf2xml-nonetwork.xml | 39 ++++++++++++++++++++ tests/lxcconf2xmldata/lxcconf2xml-physnetwork.xml | 41 ++++++++++++++++++++++ tests/lxcconf2xmldata/lxcconf2xml-simple.xml | 41 ++++++++++++++++++++++ tests/lxcconf2xmldata/lxcconf2xml-vlannetwork.xml | 41 ++++++++++++++++++++++ 12 files changed, 466 insertions(+) diff --git a/src/lxc/lxc_native.c b/src/lxc/lxc_native.c index f4c4556..9cb3bce 100644 --- a/src/lxc/lxc_native.c +++ b/src/lxc/lxc_native.c @@ -838,6 +838,30 @@ lxcSetBlkioTune(virDomainDefPtr def, virConfPtr properties) return 0; } +static void +lxcSetCapDrop(virDomainDefPtr def, virConfPtr properties) +{ + virConfValuePtr value; + char **toDrop = NULL; + const char *capString; + int i; + + if ((value = virConfGetValue(properties, "lxc.cap.drop")) && value->str) + toDrop = virStringSplit(value->str, " ", 0); + + for (i = 0; i < VIR_DOMAIN_CAPS_FEATURE_LAST; i++) { + capString = virDomainCapsFeatureTypeToString(i); + if (toDrop != NULL && virStringArrayHasString(toDrop, capString)) + def->caps_features[i] = VIR_DOMAIN_FEATURE_STATE_OFF; + else + def->caps_features[i] = VIR_DOMAIN_FEATURE_STATE_ON; + } + + def->features[VIR_DOMAIN_FEATURE_CAPABILITIES] = VIR_DOMAIN_FEATURE_STATE_ON; + + virStringFreeList(toDrop); +} + virDomainDefPtr lxcParseConfigString(const char *config) { @@ -935,6 +959,9 @@ lxcParseConfigString(const char *config) if (lxcSetBlkioTune(vmdef, properties) < 0) 
goto error; + /* lxc.cap.drop */ + lxcSetCapDrop(vmdef, properties); + goto cleanup; error: diff --git a/tests/lxcconf2xmldata/lxcconf2xml-blkiotune.xml b/tests/lxcconf2xmldata/lxcconf2xml-blkiotune.xml index 36b8e52..34a3830 100644 --- a/tests/lxcconf2xmldata/lxcconf2xml-blkiotune.xml +++ b/tests/lxcconf2xmldata/lxcconf2xml-blkiotune.xml @@ -25,6 +25,45 @@ </os> <features> <privnet/> + <capabilities> + <audit_control state='on'/> + <audit_write state='on'/> + <block_suspend state='on'/> + <chown state='on'/> + <dac_override state='on'/> + <dac_read_search state='on'/> + <fowner state='on'/> + <fsetid state='on'/> + <ipc_lock state='on'/> + <ipc_owner state='on'/> + <kill state='on'/> + <lease state='on'/> + <linux_immutable state='on'/> + <mac_admin state='on'/> + <mac_override state='on'/> + <mknod state='on'/> + <net_admin state='on'/> + <net_bind_service state='on'/> + <net_broadcast state='on'/> + <net_raw state='on'/> + <setgid state='on'/> + <setfcap state='on'/> + <setpcap state='on'/> + <setuid state='on'/> + <sys_admin state='on'/> + <sys_boot state='on'/> + <sys_chroot state='on'/> + <sys_module state='on'/> + <sys_nice state='on'/> + <sys_pacct state='on'/> + <sys_ptrace state='on'/> + <sys_rawio state='on'/> + <sys_resource state='on'/> + <sys_time state='on'/> + <sys_tty_config state='on'/> + <syslog state='on'/> + <wake_alarm state='on'/> + </capabilities> </features> <clock offset='utc'/> <on_poweroff>destroy</on_poweroff> diff --git a/tests/lxcconf2xmldata/lxcconf2xml-cpusettune.xml b/tests/lxcconf2xmldata/lxcconf2xml-cpusettune.xml index 932ab61..400498c 100644 --- a/tests/lxcconf2xmldata/lxcconf2xml-cpusettune.xml +++ b/tests/lxcconf2xmldata/lxcconf2xml-cpusettune.xml @@ -13,6 +13,45 @@ </os> <features> <privnet/> + <capabilities> + <audit_control state='on'/> + <audit_write state='on'/> + <block_suspend state='on'/> + <chown state='on'/> + <dac_override state='on'/> + <dac_read_search state='on'/> + <fowner state='on'/> + <fsetid state='on'/> + 
<ipc_lock state='on'/> + <ipc_owner state='on'/> + <kill state='on'/> + <lease state='on'/> + <linux_immutable state='on'/> + <mac_admin state='on'/> + <mac_override state='on'/> + <mknod state='on'/> + <net_admin state='on'/> + <net_bind_service state='on'/> + <net_broadcast state='on'/> + <net_raw state='on'/> + <setgid state='on'/> + <setfcap state='on'/> + <setpcap state='on'/> + <setuid state='on'/> + <sys_admin state='on'/> + <sys_boot state='on'/> + <sys_chroot state='on'/> + <sys_module state='on'/> + <sys_nice state='on'/> + <sys_pacct state='on'/> + <sys_ptrace state='on'/> + <sys_rawio state='on'/> + <sys_resource state='on'/> + <sys_time state='on'/> + <sys_tty_config state='on'/> + <syslog state='on'/> + <wake_alarm state='on'/> + </capabilities> </features> <clock offset='utc'/> <on_poweroff>destroy</on_poweroff> diff --git a/tests/lxcconf2xmldata/lxcconf2xml-cputune.xml b/tests/lxcconf2xmldata/lxcconf2xml-cputune.xml index 1bab1c6..fccd6f1 100644 --- a/tests/lxcconf2xmldata/lxcconf2xml-cputune.xml +++ b/tests/lxcconf2xmldata/lxcconf2xml-cputune.xml @@ -15,6 +15,45 @@ </os> <features> <privnet/> + <capabilities> + <audit_control state='on'/> + <audit_write state='on'/> + <block_suspend state='on'/> + <chown state='on'/> + <dac_override state='on'/> + <dac_read_search state='on'/> + <fowner state='on'/> + <fsetid state='on'/> + <ipc_lock state='on'/> + <ipc_owner state='on'/> + <kill state='on'/> + <lease state='on'/> + <linux_immutable state='on'/> + <mac_admin state='on'/> + <mac_override state='on'/> + <mknod state='on'/> + <net_admin state='on'/> + <net_bind_service state='on'/> + <net_broadcast state='on'/> + <net_raw state='on'/> + <setgid state='on'/> + <setfcap state='on'/> + <setpcap state='on'/> + <setuid state='on'/> + <sys_admin state='on'/> + <sys_boot state='on'/> + <sys_chroot state='on'/> + <sys_module state='on'/> + <sys_nice state='on'/> + <sys_pacct state='on'/> + <sys_ptrace state='on'/> + <sys_rawio state='on'/> + <sys_resource 
state='on'/> + <sys_time state='on'/> + <sys_tty_config state='on'/> + <syslog state='on'/> + <wake_alarm state='on'/> + </capabilities> </features> <clock offset='utc'/> <on_poweroff>destroy</on_poweroff> diff --git a/tests/lxcconf2xmldata/lxcconf2xml-idmap.xml b/tests/lxcconf2xmldata/lxcconf2xml-idmap.xml index 050ccd6..a6154b5 100644 --- a/tests/lxcconf2xmldata/lxcconf2xml-idmap.xml +++ b/tests/lxcconf2xmldata/lxcconf2xml-idmap.xml @@ -14,6 +14,45 @@ </idmap> <features> <privnet/> + <capabilities> + <audit_control state='on'/> + <audit_write state='on'/> + <block_suspend state='on'/> + <chown state='on'/> + <dac_override state='on'/> + <dac_read_search state='on'/> + <fowner state='on'/> + <fsetid state='on'/> + <ipc_lock state='on'/> + <ipc_owner state='on'/> + <kill state='on'/> + <lease state='on'/> + <linux_immutable state='on'/> + <mac_admin state='on'/> + <mac_override state='on'/> + <mknod state='on'/> + <net_admin state='on'/> + <net_bind_service state='on'/> + <net_broadcast state='on'/> + <net_raw state='on'/> + <setgid state='on'/> + <setfcap state='on'/> + <setpcap state='on'/> + <setuid state='on'/> + <sys_admin state='on'/> + <sys_boot state='on'/> + <sys_chroot state='on'/> + <sys_module state='on'/> + <sys_nice state='on'/> + <sys_pacct state='on'/> + <sys_ptrace state='on'/> + <sys_rawio state='on'/> + <sys_resource state='on'/> + <sys_time state='on'/> + <sys_tty_config state='on'/> + <syslog state='on'/> + <wake_alarm state='on'/> + </capabilities> </features> <clock offset='utc'/> <on_poweroff>destroy</on_poweroff> diff --git a/tests/lxcconf2xmldata/lxcconf2xml-macvlannetwork.xml b/tests/lxcconf2xmldata/lxcconf2xml-macvlannetwork.xml index 996c0f7..1111bf9 100644 --- a/tests/lxcconf2xmldata/lxcconf2xml-macvlannetwork.xml +++ b/tests/lxcconf2xmldata/lxcconf2xml-macvlannetwork.xml @@ -8,6 +8,47 @@ <type>exe</type> <init>/sbin/init</init> </os> + <features> + <capabilities> + <audit_control state='on'/> + <audit_write state='on'/> + 
<block_suspend state='on'/> + <chown state='on'/> + <dac_override state='on'/> + <dac_read_search state='on'/> + <fowner state='on'/> + <fsetid state='on'/> + <ipc_lock state='on'/> + <ipc_owner state='on'/> + <kill state='on'/> + <lease state='on'/> + <linux_immutable state='on'/> + <mac_admin state='on'/> + <mac_override state='on'/> + <mknod state='on'/> + <net_admin state='on'/> + <net_bind_service state='on'/> + <net_broadcast state='on'/> + <net_raw state='on'/> + <setgid state='on'/> + <setfcap state='on'/> + <setpcap state='on'/> + <setuid state='on'/> + <sys_admin state='on'/> + <sys_boot state='on'/> + <sys_chroot state='on'/> + <sys_module state='on'/> + <sys_nice state='on'/> + <sys_pacct state='on'/> + <sys_ptrace state='on'/> + <sys_rawio state='on'/> + <sys_resource state='on'/> + <sys_time state='on'/> + <sys_tty_config state='on'/> + <syslog state='on'/> + <wake_alarm state='on'/> + </capabilities> + </features> <clock offset='utc'/> <on_poweroff>destroy</on_poweroff> <on_reboot>restart</on_reboot> diff --git a/tests/lxcconf2xmldata/lxcconf2xml-memtune.xml b/tests/lxcconf2xmldata/lxcconf2xml-memtune.xml index b7c919e..a735786 100644 --- a/tests/lxcconf2xmldata/lxcconf2xml-memtune.xml +++ b/tests/lxcconf2xmldata/lxcconf2xml-memtune.xml @@ -15,6 +15,45 @@ </os> <features> <privnet/> + <capabilities> + <audit_control state='on'/> + <audit_write state='on'/> + <block_suspend state='on'/> + <chown state='on'/> + <dac_override state='on'/> + <dac_read_search state='on'/> + <fowner state='on'/> + <fsetid state='on'/> + <ipc_lock state='on'/> + <ipc_owner state='on'/> + <kill state='on'/> + <lease state='on'/> + <linux_immutable state='on'/> + <mac_admin state='on'/> + <mac_override state='on'/> + <mknod state='on'/> + <net_admin state='on'/> + <net_bind_service state='on'/> + <net_broadcast state='on'/> + <net_raw state='on'/> + <setgid state='on'/> + <setfcap state='on'/> + <setpcap state='on'/> + <setuid state='on'/> + <sys_admin state='on'/> + 
<sys_boot state='on'/> + <sys_chroot state='on'/> + <sys_module state='on'/> + <sys_nice state='on'/> + <sys_pacct state='on'/> + <sys_ptrace state='on'/> + <sys_rawio state='on'/> + <sys_resource state='on'/> + <sys_time state='on'/> + <sys_tty_config state='on'/> + <syslog state='on'/> + <wake_alarm state='on'/> + </capabilities> </features> <clock offset='utc'/> <on_poweroff>destroy</on_poweroff> diff --git a/tests/lxcconf2xmldata/lxcconf2xml-nonenetwork.xml b/tests/lxcconf2xmldata/lxcconf2xml-nonenetwork.xml index 6d9e16d..cdb0861 100644 --- a/tests/lxcconf2xmldata/lxcconf2xml-nonenetwork.xml +++ b/tests/lxcconf2xmldata/lxcconf2xml-nonenetwork.xml @@ -8,6 +8,47 @@ <type>exe</type> <init>/sbin/init</init> </os> + <features> + <capabilities> + <audit_control state='on'/> + <audit_write state='on'/> + <block_suspend state='on'/> + <chown state='on'/> + <dac_override state='on'/> + <dac_read_search state='on'/> + <fowner state='on'/> + <fsetid state='on'/> + <ipc_lock state='on'/> + <ipc_owner state='on'/> + <kill state='on'/> + <lease state='on'/> + <linux_immutable state='on'/> + <mac_admin state='on'/> + <mac_override state='on'/> + <mknod state='on'/> + <net_admin state='on'/> + <net_bind_service state='on'/> + <net_broadcast state='on'/> + <net_raw state='on'/> + <setgid state='on'/> + <setfcap state='on'/> + <setpcap state='on'/> + <setuid state='on'/> + <sys_admin state='on'/> + <sys_boot state='on'/> + <sys_chroot state='on'/> + <sys_module state='on'/> + <sys_nice state='on'/> + <sys_pacct state='on'/> + <sys_ptrace state='on'/> + <sys_rawio state='on'/> + <sys_resource state='on'/> + <sys_time state='on'/> + <sys_tty_config state='on'/> + <syslog state='on'/> + <wake_alarm state='on'/> + </capabilities> + </features> <clock offset='utc'/> <on_poweroff>destroy</on_poweroff> <on_reboot>restart</on_reboot> diff --git a/tests/lxcconf2xmldata/lxcconf2xml-nonetwork.xml b/tests/lxcconf2xmldata/lxcconf2xml-nonetwork.xml index 101324a..ea45fc6 100644 --- 
a/tests/lxcconf2xmldata/lxcconf2xml-nonetwork.xml +++ b/tests/lxcconf2xmldata/lxcconf2xml-nonetwork.xml @@ -10,6 +10,45 @@ </os> <features> <privnet/> + <capabilities> + <audit_control state='on'/> + <audit_write state='on'/> + <block_suspend state='on'/> + <chown state='on'/> + <dac_override state='on'/> + <dac_read_search state='on'/> + <fowner state='on'/> + <fsetid state='on'/> + <ipc_lock state='on'/> + <ipc_owner state='on'/> + <kill state='on'/> + <lease state='on'/> + <linux_immutable state='on'/> + <mac_admin state='on'/> + <mac_override state='on'/> + <mknod state='on'/> + <net_admin state='on'/> + <net_bind_service state='on'/> + <net_broadcast state='on'/> + <net_raw state='on'/> + <setgid state='on'/> + <setfcap state='on'/> + <setpcap state='on'/> + <setuid state='on'/> + <sys_admin state='on'/> + <sys_boot state='on'/> + <sys_chroot state='on'/> + <sys_module state='on'/> + <sys_nice state='on'/> + <sys_pacct state='on'/> + <sys_ptrace state='on'/> + <sys_rawio state='on'/> + <sys_resource state='on'/> + <sys_time state='on'/> + <sys_tty_config state='on'/> + <syslog state='on'/> + <wake_alarm state='on'/> + </capabilities> </features> <clock offset='utc'/> <on_poweroff>destroy</on_poweroff> diff --git a/tests/lxcconf2xmldata/lxcconf2xml-physnetwork.xml b/tests/lxcconf2xmldata/lxcconf2xml-physnetwork.xml index 5fe1b03..15ccd72 100644 --- a/tests/lxcconf2xmldata/lxcconf2xml-physnetwork.xml +++ b/tests/lxcconf2xmldata/lxcconf2xml-physnetwork.xml @@ -8,6 +8,47 @@ <type>exe</type> <init>/sbin/init</init> </os> + <features> + <capabilities> + <audit_control state='on'/> + <audit_write state='on'/> + <block_suspend state='on'/> + <chown state='on'/> + <dac_override state='on'/> + <dac_read_search state='on'/> + <fowner state='on'/> + <fsetid state='on'/> + <ipc_lock state='on'/> + <ipc_owner state='on'/> + <kill state='on'/> + <lease state='on'/> + <linux_immutable state='on'/> + <mac_admin state='on'/> + <mac_override state='on'/> + <mknod state='on'/> + 
<net_admin state='on'/> + <net_bind_service state='on'/> + <net_broadcast state='on'/> + <net_raw state='on'/> + <setgid state='on'/> + <setfcap state='on'/> + <setpcap state='on'/> + <setuid state='on'/> + <sys_admin state='on'/> + <sys_boot state='on'/> + <sys_chroot state='on'/> + <sys_module state='on'/> + <sys_nice state='on'/> + <sys_pacct state='on'/> + <sys_ptrace state='on'/> + <sys_rawio state='on'/> + <sys_resource state='on'/> + <sys_time state='on'/> + <sys_tty_config state='on'/> + <syslog state='on'/> + <wake_alarm state='on'/> + </capabilities> + </features> <clock offset='utc'/> <on_poweroff>destroy</on_poweroff> <on_reboot>restart</on_reboot> diff --git a/tests/lxcconf2xmldata/lxcconf2xml-simple.xml b/tests/lxcconf2xmldata/lxcconf2xml-simple.xml index b3c3659..5892072 100644 --- a/tests/lxcconf2xmldata/lxcconf2xml-simple.xml +++ b/tests/lxcconf2xmldata/lxcconf2xml-simple.xml @@ -8,6 +8,47 @@ <type arch='i686'>exe</type> <init>/sbin/init</init> </os> + <features> + <capabilities> + <audit_control state='on'/> + <audit_write state='on'/> + <block_suspend state='on'/> + <chown state='on'/> + <dac_override state='on'/> + <dac_read_search state='on'/> + <fowner state='on'/> + <fsetid state='on'/> + <ipc_lock state='on'/> + <ipc_owner state='on'/> + <kill state='on'/> + <lease state='on'/> + <linux_immutable state='on'/> + <mac_admin state='off'/> + <mac_override state='off'/> + <mknod state='off'/> + <net_admin state='on'/> + <net_bind_service state='on'/> + <net_broadcast state='on'/> + <net_raw state='on'/> + <setgid state='on'/> + <setfcap state='on'/> + <setpcap state='on'/> + <setuid state='on'/> + <sys_admin state='on'/> + <sys_boot state='on'/> + <sys_chroot state='on'/> + <sys_module state='off'/> + <sys_nice state='on'/> + <sys_pacct state='on'/> + <sys_ptrace state='on'/> + <sys_rawio state='on'/> + <sys_resource state='on'/> + <sys_time state='on'/> + <sys_tty_config state='on'/> + <syslog state='on'/> + <wake_alarm state='on'/> + 
</capabilities> + </features> <clock offset='utc'/> <on_poweroff>destroy</on_poweroff> <on_reboot>restart</on_reboot> diff --git a/tests/lxcconf2xmldata/lxcconf2xml-vlannetwork.xml b/tests/lxcconf2xmldata/lxcconf2xml-vlannetwork.xml index 45348ed..88da048 100644 --- a/tests/lxcconf2xmldata/lxcconf2xml-vlannetwork.xml +++ b/tests/lxcconf2xmldata/lxcconf2xml-vlannetwork.xml @@ -8,6 +8,47 @@ <type>exe</type> <init>/sbin/init</init> </os> + <features> + <capabilities> + <audit_control state='on'/> + <audit_write state='on'/> + <block_suspend state='on'/> + <chown state='on'/> + <dac_override state='on'/> + <dac_read_search state='on'/> + <fowner state='on'/> + <fsetid state='on'/> + <ipc_lock state='on'/> + <ipc_owner state='on'/> + <kill state='on'/> + <lease state='on'/> + <linux_immutable state='on'/> + <mac_admin state='on'/> + <mac_override state='on'/> + <mknod state='on'/> + <net_admin state='on'/> + <net_bind_service state='on'/> + <net_broadcast state='on'/> + <net_raw state='on'/> + <setgid state='on'/> + <setfcap state='on'/> + <setpcap state='on'/> + <setuid state='on'/> + <sys_admin state='on'/> + <sys_boot state='on'/> + <sys_chroot state='on'/> + <sys_module state='on'/> + <sys_nice state='on'/> + <sys_pacct state='on'/> + <sys_ptrace state='on'/> + <sys_rawio state='on'/> + <sys_resource state='on'/> + <sys_time state='on'/> + <sys_tty_config state='on'/> + <syslog state='on'/> + <wake_alarm state='on'/> + </capabilities> + </features> <clock offset='utc'/> <on_poweroff>destroy</on_poweroff> <on_reboot>restart</on_reboot> -- 1.8.4.5
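To make the expected output above easier to read, here is a minimal sketch of the conversion idea, assuming `lxc.cap.drop` holds a whitespace-separated list of capability names and that every capability not listed stays `on`. It is not the code from `src/lxc/lxc_native.c`; the names `cap_names`, `cap_is_dropped`, `count_dropped` and `print_capabilities` are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Element names, in the order they appear in the expected XML files. */
const char *const cap_names[] = {
    "audit_control", "audit_write", "block_suspend", "chown",
    "dac_override", "dac_read_search", "fowner", "fsetid",
    "ipc_lock", "ipc_owner", "kill", "lease", "linux_immutable",
    "mac_admin", "mac_override", "mknod", "net_admin",
    "net_bind_service", "net_broadcast", "net_raw", "setgid",
    "setfcap", "setpcap", "setuid", "sys_admin", "sys_boot",
    "sys_chroot", "sys_module", "sys_nice", "sys_pacct",
    "sys_ptrace", "sys_rawio", "sys_resource", "sys_time",
    "sys_tty_config", "syslog", "wake_alarm",
};
const size_t n_cap_names = sizeof(cap_names) / sizeof(cap_names[0]);

/* Return true if `name` is a whole token of the whitespace-separated
 * `drop` list (assumed format of an lxc.cap.drop value). */
bool cap_is_dropped(const char *drop, const char *name)
{
    size_t len = strlen(name);
    const char *p = drop;

    while (*p) {
        while (*p == ' ' || *p == '\t')
            p++;
        size_t tok = strcspn(p, " \t");
        if (tok == len && strncmp(p, name, len) == 0)
            return true;
        p += tok;
    }
    return false;
}

/* Count how many known capabilities the drop list switches off. */
size_t count_dropped(const char *drop)
{
    size_t n = 0;

    for (size_t i = 0; i < n_cap_names; i++)
        if (cap_is_dropped(drop, cap_names[i]))
            n++;
    return n;
}

/* Print a <capabilities> element: 'off' for dropped names, 'on' otherwise. */
void print_capabilities(FILE *out, const char *drop)
{
    fprintf(out, "<capabilities>\n");
    for (size_t i = 0; i < n_cap_names; i++)
        fprintf(out, "  <%s state='%s'/>\n", cap_names[i],
                cap_is_dropped(drop, cap_names[i]) ? "off" : "on");
    fprintf(out, "</capabilities>\n");
}
```

Calling `print_capabilities(stdout, "mac_admin mac_override mknod sys_module")` with that illustrative value would mark the same four elements `off` that appear in lxcconf2xml-simple.xml, and passing an empty string would give the all-`on` block seen in the other test files.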

---
 docs/drvlxc.html.in | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/docs/drvlxc.html.in b/docs/drvlxc.html.in
index fc4bc20..4a634c5 100644
--- a/docs/drvlxc.html.in
+++ b/docs/drvlxc.html.in
@@ -540,6 +540,33 @@ debootstrap, whatever) under /opt/vm-1-root:
 </domain>
 </pre>
+<h2><a name="capabilities">Altering the available capabilities</a></h2>
+
+<p>
+By default the libvirt LXC driver drops some capabilities among which CAP_MKNOD.
+However <span class="since">since 1.2.6</span> libvirt can be told to keep or
+drop some capabilities using a domain configuration like the following:
+</p>
+<pre>
+...
+<features>
+  <capabilities>
+    <mknod state='on'/>
+    <sys_chroot state='off'/>
+  </capabilities>
+</features>
+...
+</pre>
+<p>
+The capabilities children elements are named after the capabilities as defined in
+<code>man 7 capabilities</code>. An <code>off</code> state tells libvirt to drop the
+capability, while an <code>on</code> state will force to keep the capability even though
+this one is dropped by default.
+</p>
+<p>
+Note that allowing capabilities that are normally dropped by default can seriously
+affect the security of the container and the host.
+</p>
 <h2><a name="usage">Container usage / management</a></h2>
-- 
1.8.4.5
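The keep/drop semantics described in the documentation hunk can be summarised in a small sketch, assuming each capability has a built-in default and an optional per-domain override. The enum `cap_state` and the function `cap_is_finally_dropped` are hypothetical names made up for this illustration; they are not taken from `src/lxc/lxc_container.c`.

```c
#include <stdbool.h>

/* Hypothetical tri-state for one child of <capabilities>. */
enum cap_state {
    CAP_STATE_DEFAULT, /* element absent: use the driver default */
    CAP_STATE_ON,      /* state='on': keep the capability */
    CAP_STATE_OFF      /* state='off': drop the capability */
};

/* Decide whether a capability ends up dropped in the container.
 * `dropped_by_default` is the driver's built-in policy for it. */
bool cap_is_finally_dropped(bool dropped_by_default, enum cap_state state)
{
    switch (state) {
    case CAP_STATE_ON:
        return false;             /* kept even if dropped by default */
    case CAP_STATE_OFF:
        return true;              /* dropped even if kept by default */
    case CAP_STATE_DEFAULT:
    default:
        return dropped_by_default;
    }
}
```

With the documented example, `mknod` (dropped by default) with `state='on'` yields `false`, so it is kept, while `sys_chroot` with `state='off'` yields `true` whatever its default is. Keeping the default case separate means a domain without a `<capabilities>` element behaves as before.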
participants (2): Cédric Bosdonnat, Daniel P. Berrange