[libvirt] [PATCH v4 0/3] Improve handling of multipath devices

The saga continues. v3: https://www.redhat.com/archives/libvir-list/2018-April/msg00083.html diff to v3: - hopefully all Peter's review comments are worked in now. Michal Privoznik (3): util: Introduce virDevMapperGetTargets qemu_cgroup: Handle device mapper targets properly news: Document device mapper fix docs/news.xml | 12 +++ libvirt.spec.in | 2 + src/libvirt_private.syms | 4 + src/qemu/qemu_cgroup.c | 46 ++++++++++- src/util/Makefile.inc.am | 2 + src/util/virdevmapper.c | 199 +++++++++++++++++++++++++++++++++++++++++++++++ src/util/virdevmapper.h | 31 ++++++++ 7 files changed, 293 insertions(+), 3 deletions(-) create mode 100644 src/util/virdevmapper.c create mode 100644 src/util/virdevmapper.h -- 2.16.1

This helper fetches dependencies for given device mapper target. At the same time, we need to provide a dummy log function because by default libdevmapper prints out error messages to stderr which we need to suppress. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/libvirt_private.syms | 4 + src/util/Makefile.inc.am | 2 + src/util/virdevmapper.c | 199 +++++++++++++++++++++++++++++++++++++++++++++++ src/util/virdevmapper.h | 31 ++++++++ 4 files changed, 236 insertions(+) create mode 100644 src/util/virdevmapper.c create mode 100644 src/util/virdevmapper.h diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index f6897915ce..341c29bd63 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1670,6 +1670,10 @@ virDBusMessageUnref; virDBusSetSharedBus; +# util/virdevmapper.h +virDevMapperGetTargets; + + # util/virdnsmasq.h dnsmasqAddDhcpHost; dnsmasqAddHost; diff --git a/src/util/Makefile.inc.am b/src/util/Makefile.inc.am index a3c3b711fd..9624fb687c 100644 --- a/src/util/Makefile.inc.am +++ b/src/util/Makefile.inc.am @@ -35,6 +35,8 @@ UTIL_SOURCES = \ util/virdbus.c \ util/virdbus.h \ util/virdbuspriv.h \ + util/virdevmapper.c \ + util/virdevmapper.h \ util/virdnsmasq.c \ util/virdnsmasq.h \ util/virebtables.c \ diff --git a/src/util/virdevmapper.c b/src/util/virdevmapper.c new file mode 100644 index 0000000000..d2c25af003 --- /dev/null +++ b/src/util/virdevmapper.c @@ -0,0 +1,199 @@ +/* + * virdevmapper.c: Functions for handling device mapper + * + * Copyright (C) 2018 Red Hat, Inc. + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library. If not, see + * <http://www.gnu.org/licenses/>. + * + * Authors: + * Michal Privoznik <mprivozn@redhat.com> + */ + +#include <config.h> + +#ifdef MAJOR_IN_MKDEV +# include <sys/mkdev.h> +#elif MAJOR_IN_SYSMACROS +# include <sys/sysmacros.h> +#endif + +#ifdef WITH_DEVMAPPER +# include <libdevmapper.h> +#endif + +#include "virdevmapper.h" +#include "internal.h" +#include "virthread.h" +#include "viralloc.h" +#include "virstring.h" + +#ifdef WITH_DEVMAPPER +static void +virDevMapperDummyLogger(int level ATTRIBUTE_UNUSED, + const char *file ATTRIBUTE_UNUSED, + int line ATTRIBUTE_UNUSED, + int dm_errno ATTRIBUTE_UNUSED, + const char *fmt ATTRIBUTE_UNUSED, + ...) +{ + return; +} + +static int +virDevMapperOnceInit(void) +{ + /* Ideally, we would not need this. But libdevmapper prints + * error messages to stderr by default. Sad but true. */ + dm_log_with_errno_init(virDevMapperDummyLogger); + return 0; +} + + +VIR_ONCE_GLOBAL_INIT(virDevMapper) + + +static int +virDevMapperGetTargetsImpl(const char *path, + char ***devPaths_ret, + unsigned int ttl) +{ + struct dm_task *dmt = NULL; + struct dm_deps *deps; + struct dm_info info; + char **devPaths = NULL; + char **recursiveDevPaths = NULL; + size_t i; + int ret = -1; + + *devPaths_ret = NULL; + + if (virDevMapperInitialize() < 0) + return ret; + + if (ttl == 0) { + errno = ELOOP; + return ret; + } + + if (!(dmt = dm_task_create(DM_DEVICE_DEPS))) + return ret; + + if (!dm_task_set_name(dmt, path)) { + if (errno == ENOENT) { + /* It's okay, @path is not managed by devmapper => + * not a devmapper device. */ + ret = 0; + } + goto cleanup; + } + + dm_task_no_open_count(dmt); + + if (!dm_task_run(dmt)) + goto cleanup; + + if (!dm_task_get_info(dmt, &info)) + goto cleanup; + + if (!info.exists) { + ret = 0; + goto cleanup; + } + + if (!(deps = dm_task_get_deps(dmt))) + goto cleanup; + + if (VIR_ALLOC_N_QUIET(devPaths, deps->count + 1) < 0) + goto cleanup; + + for (i = 0; i < deps->count; i++) { + if (virAsprintfQuiet(&devPaths[i], "/dev/block/%u:%u", + major(deps->device[i]), + minor(deps->device[i])) < 0) + goto cleanup; + } + + recursiveDevPaths = NULL; + for (i = 0; i < deps->count; i++) { + char **tmpPaths; + + if (virDevMapperGetTargetsImpl(devPaths[i], &tmpPaths, ttl - 1) < 0) + goto cleanup; + + if (tmpPaths && + virStringListMerge(&recursiveDevPaths, &tmpPaths) < 0) { + virStringListFree(tmpPaths); + goto cleanup; + } + } + + if (virStringListMerge(&devPaths, &recursiveDevPaths) < 0) + goto cleanup; + + VIR_STEAL_PTR(*devPaths_ret, devPaths); + ret = 0; + cleanup: + virStringListFree(recursiveDevPaths); + virStringListFree(devPaths); + dm_task_destroy(dmt); + return ret; +} + + +/** + * virDevMapperGetTargets: + * @path: devmapper target + * @devPaths: returned string list of devices + * + * For given @path figure out its targets, and store them in + * @devPaths array. Note, @devPaths is a string list so it's NULL + * terminated. + * + * If @path is not a devmapper device, @devPaths is set to NULL and + * success is returned. + * + * If @path consists of yet another devmapper targets these are + * consulted recursively. + * + * If we don't have permissions to talk to kernel, -1 is returned + * and errno is set to EBADF. + * + * Returns 0 on success, + * -1 otherwise (with errno set, no libvirt error is + * reported) + */ +int +virDevMapperGetTargets(const char *path, + char ***devPaths) +{ + const unsigned int ttl = 32; + + /* Arbitrary limit on recursion level. A devmapper target can + * consist of devices or yet another targets. If that's the + * case, we have to stop recursion somewhere. */ + + return virDevMapperGetTargetsImpl(path, devPaths, ttl); +} + +#else /* ! WITH_DEVMAPPER */ + +int +virDevMapperGetTargets(const char *path ATTRIBUTE_UNUSED, + char ***devPaths ATTRIBUTE_UNUSED) +{ + errno = ENOSYS; + return -1; +} +#endif /* ! WITH_DEVMAPPER */ diff --git a/src/util/virdevmapper.h b/src/util/virdevmapper.h new file mode 100644 index 0000000000..34d6655e77 --- /dev/null +++ b/src/util/virdevmapper.h @@ -0,0 +1,31 @@ +/* + * virdevmapper.h: Functions for handling device mapper + * + * Copyright (C) 2018 Red Hat, Inc. + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library. If not, see + * <http://www.gnu.org/licenses/>. + * + * Authors: + * Michal Privoznik <mprivozn@redhat.com> + */ + +#ifndef __VIR_DEVMAPPER_H__ +# define __VIR_DEVMAPPER_H__ + +int +virDevMapperGetTargets(const char *path, + char ***devPaths); + +#endif /* __VIR_DEVMAPPER_H__ */ -- 2.16.1

On Thu, Apr 05, 2018 at 10:09:39 +0200, Michal Privoznik wrote:
This helper fetches dependencies for given device mapper target.
At the same time, we need to provide a dummy log function because by default libdevmapper prints out error messages to stderr which we need to suppress.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/libvirt_private.syms | 4 + src/util/Makefile.inc.am | 2 + src/util/virdevmapper.c | 199 +++++++++++++++++++++++++++++++++++++++++++++++ src/util/virdevmapper.h | 31 ++++++++ 4 files changed, 236 insertions(+) create mode 100644 src/util/virdevmapper.c create mode 100644 src/util/virdevmapper.h
ACK

https://bugzilla.redhat.com/show_bug.cgi?id=1557769 Problem with device mapper targets is that there can be several other devices 'hidden' behind them. For instance, /dev/dm-1 can consist of /dev/sda, /dev/sdb and /dev/sdc. Therefore, when setting up devices CGroup and namespaces we have to take this into account. This bug was exposed after Linux kernel was fixed. Initially, kernel used different functions for getting block device in open() and ioctl(). While CGroup permissions were checked in the former case, due to a bug in kernel they were not checked in the latter case. This changed with the upstream commit of 519049afead4f7c3e6446028c41e99fde958cc04 (v4.16-rc5~11^2~4). Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- libvirt.spec.in | 2 ++ src/qemu/qemu_cgroup.c | 46 +++++++++++++++++++++++++++++++++++++++++++--- 2 files changed, 45 insertions(+), 3 deletions(-) diff --git a/libvirt.spec.in b/libvirt.spec.in index 97143c68ae..7dd63c0762 100644 --- a/libvirt.spec.in +++ b/libvirt.spec.in @@ -801,6 +801,8 @@ Requires: gzip Requires: bzip2 Requires: lzop Requires: xz +# For mpath devices +Requires: device-mapper %if 0%{?fedora} || 0%{?rhel} > 7 Requires: systemd-container %endif diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c index b604edb31c..d88eb7881f 100644 --- a/src/qemu/qemu_cgroup.c +++ b/src/qemu/qemu_cgroup.c @@ -37,6 +37,7 @@ #include "virtypedparam.h" #include "virnuma.h" #include "virsystemd.h" +#include "virdevmapper.h" #define VIR_FROM_THIS VIR_FROM_QEMU @@ -60,7 +61,10 @@ qemuSetupImagePathCgroup(virDomainObjPtr vm, { qemuDomainObjPrivatePtr priv = vm->privateData; int perms = VIR_CGROUP_DEVICE_READ; - int ret; + char **targetPaths = NULL; + size_t i; + int rv; + int ret = -1; if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_DEVICES)) return 0; @@ -71,12 +75,41 @@ qemuSetupImagePathCgroup(virDomainObjPtr vm, VIR_DEBUG("Allow path %s, perms: %s", path, virCgroupGetDevicePermsString(perms)); - ret = virCgroupAllowDevicePath(priv->cgroup, path, perms, true); + rv = virCgroupAllowDevicePath(priv->cgroup, path, perms, true); virDomainAuditCgroupPath(vm, priv->cgroup, "allow", path, virCgroupGetDevicePermsString(perms), - ret); + rv); + if (rv < 0) + goto cleanup; + if (rv > 0) { + /* @path is neither character device nor block device. */ + ret = 0; + goto cleanup; + } + + if (virDevMapperGetTargets(path, &targetPaths) < 0 && + errno != ENOSYS && errno != EBADF) { + virReportSystemError(errno, + _("Unable to get devmapper targets for %s"), + path); + goto cleanup; + } + + for (i = 0; targetPaths && targetPaths[i]; i++) { + rv = virCgroupAllowDevicePath(priv->cgroup, targetPaths[i], perms, false); + + virDomainAuditCgroupPath(vm, priv->cgroup, "allow", targetPaths[i], + virCgroupGetDevicePermsString(perms), + rv); + if (rv < 0) + goto cleanup; + } + + ret = 0; + cleanup: + virStringListFree(targetPaths); return ret; } @@ -131,6 +164,13 @@ qemuTeardownImageCgroup(virDomainObjPtr vm, virDomainAuditCgroupPath(vm, priv->cgroup, "deny", src->path, virCgroupGetDevicePermsString(perms), ret); + /* If you're looking for a counter part to + * qemuSetupImagePathCgroup you're at the right place. + * However, we can't just blindly deny all the device mapper + * targets of src->path because they might still be used by + * another disk in domain. Just like we are not removing + * disks from namespace. */ + return ret; } -- 2.16.1

On Thu, Apr 05, 2018 at 10:09:40 +0200, Michal Privoznik wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=1557769
Problem with device mapper targets is that there can be several other devices 'hidden' behind them. For instance, /dev/dm-1 can consist of /dev/sda, /dev/sdb and /dev/sdc. Therefore, when setting up devices CGroup and namespaces we have to take this into account.
This bug was exposed after Linux kernel was fixed. Initially, kernel used different functions for getting block device in open() and ioctl(). While CGroup permissions were checked in the former case, due to a bug in kernel they were not checked in the latter case. This changed with the upstream commit of 519049afead4f7c3e6446028c41e99fde958cc04 (v4.16-rc5~11^2~4).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- libvirt.spec.in | 2 ++ src/qemu/qemu_cgroup.c | 46 +++++++++++++++++++++++++++++++++++++++++++--- 2 files changed, 45 insertions(+), 3 deletions(-)
diff --git a/libvirt.spec.in b/libvirt.spec.in index 97143c68ae..7dd63c0762 100644 --- a/libvirt.spec.in +++ b/libvirt.spec.in @@ -801,6 +801,8 @@ Requires: gzip Requires: bzip2 Requires: lzop Requires: xz +# For mpath devices +Requires: device-mapper
AFAIK if you link with a library, RPM picks the dependency up automatically, so this should not be required. It's required only for packages which provide binaries we use. Also the comment would not be entirely true, since it's required for all device mapper devices.
%if 0%{?fedora} || 0%{?rhel} > 7 Requires: systemd-container %endif
ACK to the rest of the patch.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- docs/news.xml | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/docs/news.xml b/docs/news.xml index 87f52e83ef..8521204a35 100644 --- a/docs/news.xml +++ b/docs/news.xml @@ -49,6 +49,18 @@ </change> </section> <section title="Bug fixes"> + <change> + <summary> + Improve handling of device mapper targets + </summary> + <description> + When starting a domain with a disk backed by a device + mapper volume libvirt also needs to allow the storage + backing the device mapper in CGroups. In the past + kernel did not care, but starting from 4.16 CGroups are + consulted on each access to the device mapper target. + </description> + </change> </section> </release> <release version="v4.2.0" date="2018-04-01"> -- 2.16.1

On 04/05/2018 12:51 PM, Peter Krempa wrote:
On Thu, Apr 05, 2018 at 10:09:41 +0200, Michal Privoznik wrote:
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- docs/news.xml | 12 ++++++++++++ 1 file changed, 12 insertions(+)
ACK
I've pushed these. Thanks, Michal
participants (2)
-
Michal Privoznik
-
Peter Krempa