[libvirt] [PATCHv5 00/13] outgoing fd: migration and virFileOpenAs

This addresses the comments raised during v4: https://www.redhat.com/archives/libvir-list/2011-March/msg00421.html More comments in individual patches. It could still use a bit more testing with root-squash NFS, and I'm also hitting a problem where if I run daemon/libvirtd myself, I get a SELinux error: error: unable to set security context 'system_u:object_r:svirt_image_t:s0:c80,c237' on fd 23: Permission denied but if I run the system service libvirtd or SELinux permissive, things work. Somehow, the attempt to set the fd SELinux label on a pipe is not working when libvirt is started as an unconfined process (that is, the fd has label unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023) but when started as a daemon, SELinux is happy to allow the transition. I suspect that this is a bug in SELinux, since my understanding is that it should always be possible to go from unconfined to something more restrictive, but we already proved that SELinux fd labelling is relatively unused and untested back when we first added it in commit 34a19dda. If possible, I'd like to get this in before the 0.9.0 freeze, and we can fix any fallout from testing during the freeze week. Eric Blake (13): util: allow clearing cloexec bit qemu: fix restoring a compressed save image qemu: allow simple domain save to use fd: protocol util: use SCM_RIGHTS in virFileOperation when needed qemu: simplify domain save fd handling storage: simplify fd handling util: rename virFileOperation to virFileOpenAs util: adjust indentation in previous patch qemu, storage: improve type safety qemu: use common API for reading difficult files qemu: consolidate migration to file code qemu: skip granting access during fd migration qemu: support fd: migration with compression src/libvirt_private.syms | 3 +- src/qemu/qemu_command.c | 16 ++ src/qemu/qemu_driver.c | 500 +++++++++-------------------------------- src/qemu/qemu_migration.c | 139 ++++++++++++ src/qemu/qemu_migration.h | 8 + src/storage/storage_backend.c | 78 ++++--- src/util/util.c | 176 ++++++++++----- src/util/util.h | 16 +- tests/qemuxml2argvtest.c | 2 +- 9 files changed, 448 insertions(+), 490 deletions(-) -- 1.7.4

* src/util/util.h (virSetInherit): New prototype. * src/util/util.c (virSetCloseExec): Move guts... (virSetInherit): ...to new function, and allow clearing. * src/libvirt_private.syms (util.h): Export it. --- v5: new patch src/libvirt_private.syms | 1 + src/util/util.c | 14 +++++++++++--- src/util/util.h | 1 + 3 files changed, 13 insertions(+), 3 deletions(-) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 5f58970..93504e5 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -933,6 +933,7 @@ virRun; virRunWithHook; virSetBlocking; virSetCloseExec; +virSetInherit; virSetNonBlock; virSetUIDGID; virSkipSpaces; diff --git a/src/util/util.c b/src/util/util.c index 4301b00..7c7da22 100644 --- a/src/util/util.c +++ b/src/util/util.c @@ -273,13 +273,21 @@ int virSetNonBlock(int fd) { } +int virSetCloseExec(int fd) +{ + return virSetInherit(fd, false); +} + #ifndef WIN32 -int virSetCloseExec(int fd) { +int virSetInherit(int fd, bool inherit) { int flags; if ((flags = fcntl(fd, F_GETFD)) < 0) return -1; - flags |= FD_CLOEXEC; + if (inherit) + flags &= ~FD_CLOEXEC; + else + flags |= FD_CLOEXEC; if ((fcntl(fd, F_SETFD, flags)) < 0) return -1; return 0; @@ -931,7 +939,7 @@ virRunWithHook(const char *const*argv, #else /* WIN32 */ -int virSetCloseExec(int fd ATTRIBUTE_UNUSED) +int virSetInherit(int fd ATTRIBUTE_UNUSED, bool inherit ATTRIBUTE_UNUSED) { return -1; } diff --git a/src/util/util.h b/src/util/util.h index c313023..0f11f8f 100644 --- a/src/util/util.h +++ b/src/util/util.h @@ -51,6 +51,7 @@ enum { int virSetBlocking(int fd, bool blocking) ATTRIBUTE_RETURN_CHECK; int virSetNonBlock(int fd) ATTRIBUTE_RETURN_CHECK; +int virSetInherit(int fd, bool inherit) ATTRIBUTE_RETURN_CHECK; int virSetCloseExec(int fd) ATTRIBUTE_RETURN_CHECK; /* This will execute in the context of the first child -- 1.7.4

Latent bug introduced in commit 2d6a581960 (Aug 2009), but not exposed until commit 1859939a (Jan 2011). Basically, when virExec creates a pipe, it always marks libvirt's side as cloexec. If libvirt then wants to hand that pipe to another child process, things work great if the fd is dup2()'d onto stdin or stdout (as with stdin: or exec: migration), but if the pipe is instead used as-is (such as with fd: migration) then qemu sees EBADF because the fd was closed at exec(). This is a minimal fix for the problem at hand; it is slightly racy, but no more racy than the rest of libvirt fd handling, including the case of uncompressed save images. A more invasive fix, but ultimately safer at avoiding leaking unintended fds, would be to _always and atomically_ open all fds as cloexec in libvirt (thanks to primitives like open(O_CLOEXEC), pipe2(), accept4(), ...), then teach virExec to clear that bit for all fds explicitly marked to be handed to the child only after forking. * src/qemu/qemu_command.c (qemuBuildCommandLine): Clear cloexec flag. * tests/qemuxml2argvtest.c (testCompareXMLToArgvFiles): Tweak test. --- v5: new patch src/qemu/qemu_command.c | 16 ++++++++++++++++ tests/qemuxml2argvtest.c | 2 +- 2 files changed, 17 insertions(+), 1 deletions(-) diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index c9b9850..3138943 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -4329,6 +4329,14 @@ qemuBuildCommandLine(virConnectPtr conn, } else if (STREQ(migrateFrom, "stdio")) { if (qemuCapsGet(qemuCaps, QEMU_CAPS_MIGRATE_QEMU_FD)) { virCommandAddArgFormat(cmd, "fd:%d", migrateFd); + /* migrateFd might be cloexec, but qemu must inherit + * it if vmop indicates qemu will be executed */ + if (vmop != VIR_VM_OP_NO_OP && + virSetInherit(migrateFd, true) < 0) { + qemuReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Failed to clear cloexec flag")); + goto error; + } virCommandPreserveFD(cmd, migrateFd); } else if (qemuCapsGet(qemuCaps, QEMU_CAPS_MIGRATE_QEMU_EXEC)) { virCommandAddArg(cmd, "exec:cat"); @@ -4358,6 +4366,14 @@ qemuBuildCommandLine(virConnectPtr conn, goto error; } virCommandAddArg(cmd, migrateFrom); + /* migrateFd might be cloexec, but qemu must inherit + * it if vmop indicates qemu will be executed */ + if (vmop != VIR_VM_OP_NO_OP && + virSetInherit(migrateFd, true) < 0) { + qemuReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Failed to clear cloexec flag")); + goto error; + } virCommandPreserveFD(cmd, migrateFd); } else if (STRPREFIX(migrateFrom, "unix")) { if (!qemuCapsGet(qemuCaps, QEMU_CAPS_MIGRATE_QEMU_UNIX)) { diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c index c1329fa..02de8de 100644 --- a/tests/qemuxml2argvtest.c +++ b/tests/qemuxml2argvtest.c @@ -122,7 +122,7 @@ static int testCompareXMLToArgvFiles(const char *xml, if (!(cmd = qemuBuildCommandLine(conn, &driver, vmdef, &monitor_chr, false, extraFlags, migrateFrom, migrateFd, NULL, - VIR_VM_OP_CREATE))) + VIR_VM_OP_NO_OP))) goto fail; if (!!virGetLastError() != expectError) { -- 1.7.4

This allows direct saves (no compression, no root-squash NFS) to use the more efficient fd: migration, which in turn avoids a race where qemu exec: migration can sometimes fail because qemu does a generic waitpid() that conflicts with the pclose() used by exec:. Further patches will solve compression and root-squash NFS. * src/qemu/qemu_driver.c (qemudDomainSaveFlag): Use new function when there is no compression. --- v5: adjust earlier in the series Note that this opens the fd a second time, but gracefully falls back to the older exec: migration for things like root-squash NFS. src/qemu/qemu_driver.c | 26 +++++++++++++++++++++++--- 1 files changed, 23 insertions(+), 3 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index e0e86d3..8aa72f8 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -1823,6 +1823,7 @@ static int qemudDomainSaveFlag(struct qemud_driver *driver, virDomainPtr dom, int is_reg = 0; unsigned long long offset; virCgroupPtr cgroup = NULL; + virBitmapPtr qemuCaps = NULL; memset(&header, 0, sizeof(header)); memcpy(header.magic, QEMUD_SAVE_MAGIC, sizeof(header.magic)); @@ -1853,6 +1854,11 @@ static int qemudDomainSaveFlag(struct qemud_driver *driver, virDomainPtr dom, } } + if (qemuCapsExtractVersionInfo(vm->def->emulator, vm->def->os.arch, + NULL, + &qemuCaps) < 0) + goto endjob; + /* Get XML for the domain */ xml = virDomainDefFormat(vm->def, VIR_DOMAIN_XML_SECURE); if (!xml) { @@ -2013,10 +2019,23 @@ static int qemudDomainSaveFlag(struct qemud_driver *driver, virDomainPtr dom, if (header.compressed == QEMUD_SAVE_FORMAT_RAW) { const char *args[] = { "cat", NULL }; + /* XXX gross - why don't we reuse the fd already opened earlier */ + int fd = -1; + + if (qemuCapsGet(qemuCaps, QEMU_CAPS_MIGRATE_QEMU_FD) && + priv->monConfig->type == VIR_DOMAIN_CHR_TYPE_UNIX) + fd = open(path, O_WRONLY); qemuDomainObjEnterMonitorWithDriver(driver, vm); - rc = qemuMonitorMigrateToFile(priv->mon, - QEMU_MONITOR_MIGRATE_BACKGROUND, - args, path, offset); + if (fd >= 0 && lseek(fd, offset, SEEK_SET) == offset) { + rc = qemuMonitorMigrateToFd(priv->mon, + QEMU_MONITOR_MIGRATE_BACKGROUND, + fd); + } else { + rc = qemuMonitorMigrateToFile(priv->mon, + QEMU_MONITOR_MIGRATE_BACKGROUND, + args, path, offset); + } + VIR_FORCE_CLOSE(fd); qemuDomainObjExitMonitorWithDriver(driver, vm); } else { const char *prog = qemudSaveCompressionTypeToString(header.compressed); @@ -2099,6 +2118,7 @@ endjob: } cleanup: + qemuCapsFree(qemuCaps); VIR_FREE(xml); if (ret != 0 && is_reg) unlink(path); -- 1.7.4

Currently, the hook function in virFileOperation is extremely limited: it must be async-signal-safe, and cannot modify any memory in the parent process. It is much handier to return a valid fd and operate on it in the parent than to deal with hook restrictions. * src/util/util.h (VIR_FILE_OP_RETURN_FD): New flag. * src/util/util.c (virFileOperationNoFork, virFileOperation): Honor new flag. --- v5: incorporate comments, such as closing fds sooner and switching to SOCK_STREAM to avoid hangs src/util/util.c | 138 ++++++++++++++++++++++++++++++++++++++++++++++++------ src/util/util.h | 4 +- 2 files changed, 125 insertions(+), 17 deletions(-) diff --git a/src/util/util.c b/src/util/util.c index 7c7da22..a0a331d 100644 --- a/src/util/util.c +++ b/src/util/util.c @@ -1428,14 +1428,15 @@ static int virFileOperationNoFork(const char *path, int openflags, mode_t mode, if ((hook) && ((ret = hook(fd, hookdata)) != 0)) { goto error; } + if (flags & VIR_FILE_OP_RETURN_FD) + return fd; if (VIR_CLOSE(fd) < 0) { ret = -errno; virReportSystemError(errno, _("failed to close new file '%s'"), path); - fd = -1; goto error; } - fd = -1; + error: VIR_FORCE_CLOSE(fd); return ret; @@ -1480,7 +1481,32 @@ error: return ret; } -/* return -errno on failure, or 0 on success */ +/** + * virFileOperation: + * @path: file to open or create + * @openflags: flags to pass to open + * @mode: mode to use on creation or when forcing permissions + * @uid: uid that should own file on creation + * @gid: gid that should own file + * @hook: callback to run once file is open; see below for restrictions + * @hookdata: opaque data for @hook + * @flags: bit-wise or of VIR_FILE_OP_* flags + * + * Open @path, and execute an optional callback @hook on the open file + * description. @hook must return 0 on success, or -errno on failure. + * If @flags includes VIR_FILE_OP_AS_UID, then open the file while the + * effective user id is @uid (by using a child process); this allows + * one to bypass root-squashing NFS issues. If @flags includes + * VIR_FILE_OP_FORCE_PERMS, then ensure that @path has those + * permissions before the callback, even if the file already existed + * with different permissions. If @flags includes + * VIR_FILE_OP_RETURN_FD, then use SCM_RIGHTS to guarantee that any + * @hook is run in the parent, and the return value (if non-negative) + * is the file descriptor, left open. Otherwise, @hook might be run + * in a child process, so it must be async-signal-safe (ie. no + * malloc() or anything else that depends on modifying locks held in + * the parent), no file descriptor remains open, and 0 is returned on + * success. Return -errno on failure. */ int virFileOperation(const char *path, int openflags, mode_t mode, uid_t uid, gid_t gid, virFileOperationHook hook, void *hookdata, @@ -1488,7 +1514,14 @@ int virFileOperation(const char *path, int openflags, mode_t mode, struct stat st; pid_t pid; int waitret, status, ret = 0; - int fd; + int fd = -1; + int pair[2] = { -1, -1 }; + struct msghdr msg; + struct cmsghdr *cmsg; + char buf[CMSG_SPACE(sizeof(fd))]; + char dummy = 0; + struct iovec iov; + int forkRet; if ((!(flags & VIR_FILE_OP_AS_UID)) || (getuid() != 0) @@ -1502,7 +1535,29 @@ int virFileOperation(const char *path, int openflags, mode_t mode, * following dance avoids problems caused by root-squashing * NFS servers. */ - int forkRet = virFork(&pid); + if (flags & VIR_FILE_OP_RETURN_FD) { + if (socketpair(AF_UNIX, SOCK_STREAM, 0, pair) < 0) { + ret = -errno; + virReportSystemError(errno, + _("failed to create socket needed for '%s'"), + path); + return ret; + } + + memset(&msg, 0, sizeof(msg)); + iov.iov_base = &dummy; + iov.iov_len = 1; + msg.msg_iov = &iov; + msg.msg_iovlen = 1; + msg.msg_control = buf; + msg.msg_controllen = sizeof(buf); + cmsg = CMSG_FIRSTHDR(&msg); + cmsg->cmsg_level = SOL_SOCKET; + cmsg->cmsg_type = SCM_RIGHTS; + cmsg->cmsg_len = CMSG_LEN(sizeof(fd)); + } + + forkRet = virFork(&pid); if (pid < 0) { ret = -errno; @@ -1510,6 +1565,31 @@ int virFileOperation(const char *path, int openflags, mode_t mode, } if (pid) { /* parent */ + if (flags & VIR_FILE_OP_RETURN_FD) { + VIR_FORCE_CLOSE(pair[1]); + + do { + ret = recvmsg(pair[0], &msg, 0); + } while (ret < 0 && errno == EINTR); + + if (ret < 0) { + ret = -errno; + VIR_FORCE_CLOSE(pair[0]); + while ((waitret = waitpid(pid, NULL, 0) == -1) + && (errno == EINTR)); + goto parenterror; + } + VIR_FORCE_CLOSE(pair[0]); + + /* See if fd was transferred. */ + cmsg = CMSG_FIRSTHDR(&msg); + if (cmsg && cmsg->cmsg_len == CMSG_LEN(sizeof(fd)) && + cmsg->cmsg_level == SOL_SOCKET && + cmsg->cmsg_type == SCM_RIGHTS) { + memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd)); + } + } + /* wait for child to complete, and retrieve its exit code */ while ((waitret = waitpid(pid, &status, 0) == -1) && (errno == EINTR)); @@ -1520,12 +1600,20 @@ int virFileOperation(const char *path, int openflags, mode_t mode, path); goto parenterror; } - if (!WIFEXITED(status) || (ret = -WEXITSTATUS(status)) == -EACCES) { + if (!WIFEXITED(status) || (ret = -WEXITSTATUS(status)) == -EACCES || + ((flags & VIR_FILE_OP_RETURN_FD) && fd == -1)) { /* fall back to the simpler method, which works better in * some cases */ return virFileOperationNoFork(path, openflags, mode, uid, gid, hook, hookdata, flags); } + if (!ret && (flags & VIR_FILE_OP_RETURN_FD)) { + if ((hook) && ((ret = hook(fd, hookdata)) != 0)) { + VIR_FORCE_CLOSE(fd); + goto parenterror; + } + ret = fd; + } parenterror: return ret; } @@ -1535,6 +1623,7 @@ parenterror: if (forkRet < 0) { /* error encountered and logged in virFork() after the fork. */ + ret = -errno; goto childerror; } @@ -1584,15 +1673,32 @@ parenterror: path, mode); goto childerror; } - if ((hook) && ((ret = hook(fd, hookdata)) != 0)) { - goto childerror; - } - if (VIR_CLOSE(fd) < 0) { - ret = -errno; - virReportSystemError(errno, _("child failed to close new file '%s'"), - path); - goto childerror; + if (flags & VIR_FILE_OP_RETURN_FD) { + VIR_FORCE_CLOSE(pair[0]); + memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd)); + + do { + ret = sendmsg(pair[1], &msg, 0); + } while (ret < 0 && errno == EINTR); + + if (ret < 0) { + ret = -errno; + goto childerror; + } + ret = 0; + } else { + if ((hook) && ((ret = hook(fd, hookdata)) != 0)) + goto childerror; + if (VIR_CLOSE(fd) < 0) { + ret = -errno; + virReportSystemError(errno, + _("child failed to close new file '%s'"), + path); + goto childerror; + } } + + ret = 0; childerror: /* ret tracks -errno on failure, but exit value must be positive. * If the child exits with EACCES, then the parent tries again. */ @@ -1719,7 +1825,7 @@ int virFileOperation(const char *path ATTRIBUTE_UNUSED, virUtilError(VIR_ERR_INTERNAL_ERROR, "%s", _("virFileOperation is not implemented for WIN32")); - return -1; + return -ENOSYS; } int virDirCreate(const char *path ATTRIBUTE_UNUSED, @@ -1731,7 +1837,7 @@ int virDirCreate(const char *path ATTRIBUTE_UNUSED, virUtilError(VIR_ERR_INTERNAL_ERROR, "%s", _("virDirCreate is not implemented for WIN32")); - return -1; + return -ENOSYS; } #endif /* WIN32 */ diff --git a/src/util/util.h b/src/util/util.h index 0f11f8f..b1ca871 100644 --- a/src/util/util.h +++ b/src/util/util.h @@ -130,12 +130,14 @@ enum { VIR_FILE_OP_NONE = 0, VIR_FILE_OP_AS_UID = (1 << 0), VIR_FILE_OP_FORCE_PERMS = (1 << 1), + VIR_FILE_OP_RETURN_FD = (1 << 2), }; typedef int (*virFileOperationHook)(int fd, void *data); int virFileOperation(const char *path, int openflags, mode_t mode, uid_t uid, gid_t gid, virFileOperationHook hook, void *hookdata, - unsigned int flags) ATTRIBUTE_RETURN_CHECK; + unsigned int flags) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_RETURN_CHECK; enum { VIR_DIR_CREATE_NONE = 0, -- 1.7.4

This makes root-squash NFS saves more efficient. * src/qemu/qemu_driver.c (qemudDomainSaveFlag): Use new virFileOperation flag to open fd only once. --- v5: no real changes src/qemu/qemu_driver.c | 75 +++++++++++++++++++++++++----------------------- 1 files changed, 39 insertions(+), 36 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 8aa72f8..7a6e4ff 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -1824,6 +1824,7 @@ static int qemudDomainSaveFlag(struct qemud_driver *driver, virDomainPtr dom, unsigned long long offset; virCgroupPtr cgroup = NULL; virBitmapPtr qemuCaps = NULL; + int fd = -1; memset(&header, 0, sizeof(header)); memcpy(header.magic, QEMUD_SAVE_MAGIC, sizeof(header.magic)); @@ -1902,45 +1903,32 @@ static int qemudDomainSaveFlag(struct qemud_driver *driver, virDomainPtr dom, header.xml_len += pad; } - /* Setup hook data needed by virFileOperation hook function */ - hdata.dom = dom; - hdata.path = path; - hdata.xml = xml; - hdata.header = &header; - - /* Write header to file, followed by XML */ + /* Obtain the file handle. */ /* First try creating the file as root */ if (!is_reg) { - int fd = open(path, O_WRONLY | O_TRUNC); + fd = open(path, O_WRONLY | O_TRUNC); if (fd < 0) { virReportSystemError(errno, _("unable to open %s"), path); goto endjob; } - if (qemudDomainSaveFileOpHook(fd, &hdata) < 0) { - VIR_FORCE_CLOSE(fd); - goto endjob; - } - if (VIR_CLOSE(fd) < 0) { - virReportSystemError(errno, _("unable to close %s"), path); - goto endjob; - } } else { - if ((rc = virFileOperation(path, O_CREAT|O_TRUNC|O_WRONLY, - S_IRUSR|S_IWUSR, - getuid(), getgid(), - qemudDomainSaveFileOpHook, &hdata, - 0)) < 0) { + if ((fd = virFileOperation(path, O_CREAT|O_TRUNC|O_WRONLY, + S_IRUSR|S_IWUSR, + getuid(), getgid(), + NULL, NULL, + VIR_FILE_OP_RETURN_FD)) < 0) { /* If we failed as root, and the error was permission-denied (EACCES or EPERM), assume it's on a network-connected share where root access is restricted (eg, root-squashed NFS). If the qemu user (driver->user) is non-root, just set a flag to bypass security driver shenanigans, and retry the operation after doing setuid to qemu user */ - + rc = fd; if (((rc != -EACCES) && (rc != -EPERM)) || driver->user == getuid()) { - virReportSystemError(-rc, _("Failed to create domain save file '%s'"), + virReportSystemError(-rc, + _("Failed to create domain save file '%s'"), path); goto endjob; } @@ -1965,7 +1953,7 @@ static int qemudDomainSaveFlag(struct qemud_driver *driver, virDomainPtr dom, default: /* local file - log the error returned by virFileOperation */ virReportSystemError(-rc, - _("Failed to create domain save file '%s'"), + _("Failed to create domain save file '%s'"), path); goto endjob; break; @@ -1974,13 +1962,15 @@ static int qemudDomainSaveFlag(struct qemud_driver *driver, virDomainPtr dom, /* Retry creating the file as driver->user */ - if ((rc = virFileOperation(path, O_CREAT|O_TRUNC|O_WRONLY, + if ((fd = virFileOperation(path, O_CREAT|O_TRUNC|O_WRONLY, S_IRUSR|S_IWUSR|S_IRGRP|S_IWGRP, driver->user, driver->group, - qemudDomainSaveFileOpHook, &hdata, - VIR_FILE_OP_AS_UID)) < 0) { - virReportSystemError(-rc, _("Error from child process creating '%s'"), - path); + NULL, NULL, + (VIR_FILE_OP_AS_UID | + VIR_FILE_OP_RETURN_FD))) < 0) { + virReportSystemError(-fd, + _("Error from child process creating '%s'"), + path); goto endjob; } @@ -1992,6 +1982,18 @@ static int qemudDomainSaveFlag(struct qemud_driver *driver, virDomainPtr dom, } } + /* Write header to file, followed by XML */ + hdata.dom = dom; + hdata.path = path; + hdata.xml = xml; + hdata.header = &header; + + if (qemudDomainSaveFileOpHook(fd, &hdata) < 0) { + VIR_FORCE_CLOSE(fd); + goto endjob; + } + + /* Allow qemu to access file */ if (!is_reg && qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES)) { @@ -2019,14 +2021,10 @@ static int qemudDomainSaveFlag(struct qemud_driver *driver, virDomainPtr dom, if (header.compressed == QEMUD_SAVE_FORMAT_RAW) { const char *args[] = { "cat", NULL }; - /* XXX gross - why don't we reuse the fd already opened earlier */ - int fd = -1; - if (qemuCapsGet(qemuCaps, QEMU_CAPS_MIGRATE_QEMU_FD) && - priv->monConfig->type == VIR_DOMAIN_CHR_TYPE_UNIX) - fd = open(path, O_WRONLY); qemuDomainObjEnterMonitorWithDriver(driver, vm); - if (fd >= 0 && lseek(fd, offset, SEEK_SET) == offset) { + if (qemuCapsGet(qemuCaps, QEMU_CAPS_MIGRATE_QEMU_FD) && + priv->monConfig->type == VIR_DOMAIN_CHR_TYPE_UNIX) { rc = qemuMonitorMigrateToFd(priv->mon, QEMU_MONITOR_MIGRATE_BACKGROUND, fd); @@ -2035,7 +2033,6 @@ static int qemudDomainSaveFlag(struct qemud_driver *driver, virDomainPtr dom, QEMU_MONITOR_MIGRATE_BACKGROUND, args, path, offset); } - VIR_FORCE_CLOSE(fd); qemuDomainObjExitMonitorWithDriver(driver, vm); } else { const char *prog = qemudSaveCompressionTypeToString(header.compressed); @@ -2054,6 +2051,11 @@ static int qemudDomainSaveFlag(struct qemud_driver *driver, virDomainPtr dom, if (rc < 0) goto endjob; + if (VIR_CLOSE(fd) < 0) { + virReportSystemError(errno, _("unable to close %s"), path); + goto endjob; + } + rc = qemuMigrationWaitForCompletion(driver, vm); if (rc < 0) @@ -2119,6 +2121,7 @@ endjob: cleanup: qemuCapsFree(qemuCaps); + VIR_FORCE_CLOSE(fd); VIR_FREE(xml); if (ret != 0 && is_reg) unlink(path); -- 1.7.4

* src/storage/storage_backend.c (virStorageBackendCreateRaw): Use new virFileOperation flag. --- v5: no real changes src/storage/storage_backend.c | 46 +++++++++++++++++++++++++--------------- 1 files changed, 29 insertions(+), 17 deletions(-) diff --git a/src/storage/storage_backend.c b/src/storage/storage_backend.c index c6c16c8..b0a512c 100644 --- a/src/storage/storage_backend.c +++ b/src/storage/storage_backend.c @@ -370,11 +370,16 @@ virStorageBackendCreateRaw(virConnectPtr conn ATTRIBUTE_UNUSED, virStoragePoolObjPtr pool, virStorageVolDefPtr vol, virStorageVolDefPtr inputvol, - unsigned int flags ATTRIBUTE_UNUSED) + unsigned int flags) { int ret = -1; - int createstat; + int fd = -1; struct createRawFileOpHookData hdata = { vol, inputvol }; + uid_t uid; + gid_t gid; + int operation_flags; + + virCheckFlags(0, -1); if (vol->target.encryption != NULL) { virStorageReportError(VIR_ERR_NO_SUPPORT, @@ -383,24 +388,31 @@ virStorageBackendCreateRaw(virConnectPtr conn ATTRIBUTE_UNUSED, goto cleanup; } - uid_t uid = (vol->target.perms.uid == -1) ? getuid() : vol->target.perms.uid; - gid_t gid = (vol->target.perms.gid == -1) ? getgid() : vol->target.perms.gid; + uid = (vol->target.perms.uid == -1) ? getuid() : vol->target.perms.uid; + gid = (vol->target.perms.gid == -1) ? getgid() : vol->target.perms.gid; + operation_flags = VIR_FILE_OP_FORCE_PERMS | VIR_FILE_OP_RETURN_FD; + if (pool->def->type == VIR_STORAGE_POOL_NETFS) + operation_flags |= VIR_FILE_OP_AS_UID; - if ((createstat = virFileOperation(vol->target.path, - O_RDWR | O_CREAT | O_EXCL, - vol->target.perms.mode, uid, gid, - createRawFileOpHook, &hdata, - VIR_FILE_OP_FORCE_PERMS | - (pool->def->type == VIR_STORAGE_POOL_NETFS - ? VIR_FILE_OP_AS_UID : 0))) < 0) { - virReportSystemError(-createstat, - _("cannot create path '%s'"), - vol->target.path); - goto cleanup; + if ((fd = virFileOperation(vol->target.path, + O_RDWR | O_CREAT | O_EXCL, + vol->target.perms.mode, uid, gid, + NULL, NULL, operation_flags)) < 0) { + virReportSystemError(-fd, + _("cannot create path '%s'"), + vol->target.path); + goto cleanup; + } + + if ((ret = createRawFileOpHook(fd, &hdata)) < 0) { + virReportSystemError(-fd, + _("cannot create path '%s'"), + vol->target.path); + ret = -1; } - ret = 0; cleanup: + VIR_FORCE_CLOSE(fd); return ret; } -- 1.7.4

This patch intentionally doesn't change indentation, in order to make it easier to review the real changes. * src/util/util.h (VIR_FILE_OP_RETURN_FD, virFileOperationHook): Delete. (virFileOperation): Rename... (virFileOpenAs): ...and reduce parameters. * src/util/util.c (virFileOperationNoFork, virFileOperation): Rename and simplify. * src/qemu/qemu_driver.c (qemudDomainSaveFlag): Adjust caller. * src/storage/storage_backend.c (virStorageBackendCreateRaw): Likewise. * src/libvirt_private.syms: Reflect rename. --- v5: rebase to changes in patch 4/13 src/libvirt_private.syms | 2 +- src/qemu/qemu_driver.c | 20 +++----- src/storage/storage_backend.c | 12 ++-- src/util/util.c | 105 +++++++++++++--------------------------- src/util/util.h | 15 ++---- 5 files changed, 55 insertions(+), 99 deletions(-) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 93504e5..fb95858 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -898,8 +898,8 @@ virFileIsExecutable; virFileLinkPointsTo; virFileMakePath; virFileMatchesNameSuffix; +virFileOpenAs; virFileOpenTty; -virFileOperation; virFilePid; virFileReadAll; virFileReadLimFD; diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 7a6e4ff..8ee2a29 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -1913,11 +1913,9 @@ static int qemudDomainSaveFlag(struct qemud_driver *driver, virDomainPtr dom, goto endjob; } } else { - if ((fd = virFileOperation(path, O_CREAT|O_TRUNC|O_WRONLY, - S_IRUSR|S_IWUSR, - getuid(), getgid(), - NULL, NULL, - VIR_FILE_OP_RETURN_FD)) < 0) { + if ((fd = virFileOpenAs(path, O_CREAT|O_TRUNC|O_WRONLY, + S_IRUSR|S_IWUSR, + getuid(), getgid(), 0)) < 0) { /* If we failed as root, and the error was permission-denied (EACCES or EPERM), assume it's on a network-connected share where root access is restricted (eg, root-squashed NFS). If the @@ -1951,7 +1949,7 @@ static int qemudDomainSaveFlag(struct qemud_driver *driver, virDomainPtr dom, case 0: default: - /* local file - log the error returned by virFileOperation */ + /* local file - log the error returned by virFileOpenAs */ virReportSystemError(-rc, _("Failed to create domain save file '%s'"), path); @@ -1962,12 +1960,10 @@ static int qemudDomainSaveFlag(struct qemud_driver *driver, virDomainPtr dom, /* Retry creating the file as driver->user */ - if ((fd = virFileOperation(path, O_CREAT|O_TRUNC|O_WRONLY, - S_IRUSR|S_IWUSR|S_IRGRP|S_IWGRP, - driver->user, driver->group, - NULL, NULL, - (VIR_FILE_OP_AS_UID | - VIR_FILE_OP_RETURN_FD))) < 0) { + if ((fd = virFileOpenAs(path, O_CREAT|O_TRUNC|O_WRONLY, + S_IRUSR|S_IWUSR|S_IRGRP|S_IWGRP, + driver->user, driver->group, + VIR_FILE_OPEN_AS_UID)) < 0) { virReportSystemError(-fd, _("Error from child process creating '%s'"), path); diff --git a/src/storage/storage_backend.c b/src/storage/storage_backend.c index b0a512c..7015475 100644 --- a/src/storage/storage_backend.c +++ b/src/storage/storage_backend.c @@ -390,14 +390,14 @@ virStorageBackendCreateRaw(virConnectPtr conn ATTRIBUTE_UNUSED, uid = (vol->target.perms.uid == -1) ? getuid() : vol->target.perms.uid; gid = (vol->target.perms.gid == -1) ? getgid() : vol->target.perms.gid; - operation_flags = VIR_FILE_OP_FORCE_PERMS | VIR_FILE_OP_RETURN_FD; + operation_flags = VIR_FILE_OPEN_FORCE_PERMS; if (pool->def->type == VIR_STORAGE_POOL_NETFS) - operation_flags |= VIR_FILE_OP_AS_UID; + operation_flags |= VIR_FILE_OPEN_AS_UID; - if ((fd = virFileOperation(vol->target.path, - O_RDWR | O_CREAT | O_EXCL, - vol->target.perms.mode, uid, gid, - NULL, NULL, operation_flags)) < 0) { + if ((fd = virFileOpenAs(vol->target.path, + O_RDWR | O_CREAT | O_EXCL, + vol->target.perms.mode, uid, gid, + operation_flags)) < 0) { virReportSystemError(-fd, _("cannot create path '%s'"), vol->target.path); diff --git a/src/util/util.c b/src/util/util.c index a0a331d..49a8bb6 100644 --- a/src/util/util.c +++ b/src/util/util.c @@ -1391,10 +1391,10 @@ virFileIsExecutable(const char *file) #ifndef WIN32 /* return -errno on failure, or 0 on success */ -static int virFileOperationNoFork(const char *path, int openflags, mode_t mode, - uid_t uid, gid_t gid, - virFileOperationHook hook, void *hookdata, - unsigned int flags) { +static int +virFileOpenAsNoFork(const char *path, int openflags, mode_t mode, + uid_t uid, gid_t gid, unsigned int flags) +{ int fd = -1; int ret = 0; struct stat st; @@ -1417,7 +1417,7 @@ static int virFileOperationNoFork(const char *path, int openflags, mode_t mode, path, (unsigned int) uid, (unsigned int) gid); goto error; } - if ((flags & VIR_FILE_OP_FORCE_PERMS) + if ((flags & VIR_FILE_OPEN_FORCE_PERMS) && (fchmod(fd, mode) < 0)) { ret = -errno; virReportSystemError(errno, @@ -1425,17 +1425,7 @@ static int virFileOperationNoFork(const char *path, int openflags, mode_t mode, path, mode); goto error; } - if ((hook) && ((ret = hook(fd, hookdata)) != 0)) { - goto error; - } - if (flags & VIR_FILE_OP_RETURN_FD) - return fd; - if (VIR_CLOSE(fd) < 0) { - ret = -errno; - virReportSystemError(errno, _("failed to close new file '%s'"), - path); - goto error; - } + return fd; error: VIR_FORCE_CLOSE(fd); @@ -1482,35 +1472,27 @@ error: } /** - * virFileOperation: + * virFileOpenAs: * @path: file to open or create * @openflags: flags to pass to open * @mode: mode to use on creation or when forcing permissions * @uid: uid that should own file on creation * @gid: gid that should own file - * @hook: callback to run once file is open; see below for restrictions - * @hookdata: opaque data for @hook - * @flags: bit-wise or of VIR_FILE_OP_* flags + * @flags: bit-wise or of VIR_FILE_OPEN_* flags * * Open @path, and execute an optional callback @hook on the open file * description. @hook must return 0 on success, or -errno on failure. - * If @flags includes VIR_FILE_OP_AS_UID, then open the file while the + * If @flags includes VIR_FILE_OPEN_AS_UID, then open the file while the * effective user id is @uid (by using a child process); this allows * one to bypass root-squashing NFS issues. If @flags includes - * VIR_FILE_OP_FORCE_PERMS, then ensure that @path has those + * VIR_FILE_OPEN_FORCE_PERMS, then ensure that @path has those * permissions before the callback, even if the file already existed - * with different permissions. If @flags includes - * VIR_FILE_OP_RETURN_FD, then use SCM_RIGHTS to guarantee that any - * @hook is run in the parent, and the return value (if non-negative) - * is the file descriptor, left open. Otherwise, @hook might be run - * in a child process, so it must be async-signal-safe (ie. no - * malloc() or anything else that depends on modifying locks held in - * the parent), no file descriptor remains open, and 0 is returned on - * success. Return -errno on failure. */ -int virFileOperation(const char *path, int openflags, mode_t mode, - uid_t uid, gid_t gid, - virFileOperationHook hook, void *hookdata, - unsigned int flags) { + * with different permissions. The return value (if non-negative) + * is the file descriptor, left open. Return -errno on failure. */ +int +virFileOpenAs(const char *path, int openflags, mode_t mode, + uid_t uid, gid_t gid, unsigned int flags) +{ struct stat st; pid_t pid; int waitret, status, ret = 0; @@ -1523,11 +1505,10 @@ int virFileOperation(const char *path, int openflags, mode_t mode, struct iovec iov; int forkRet; - if ((!(flags & VIR_FILE_OP_AS_UID)) + if ((!(flags & VIR_FILE_OPEN_AS_UID)) || (getuid() != 0) || ((uid == 0) && (gid == 0))) { - return virFileOperationNoFork(path, openflags, mode, uid, gid, - hook, hookdata, flags); + return virFileOpenAsNoFork(path, openflags, mode, uid, gid, flags); } /* parent is running as root, but caller requested that the @@ -1535,7 +1516,7 @@ int virFileOperation(const char *path, int openflags, mode_t mode, * following dance avoids problems caused by root-squashing * NFS servers. */ - if (flags & VIR_FILE_OP_RETURN_FD) { + { if (socketpair(AF_UNIX, SOCK_STREAM, 0, pair) < 0) { ret = -errno; virReportSystemError(errno, @@ -1565,7 +1546,7 @@ int virFileOperation(const char *path, int openflags, mode_t mode, } if (pid) { /* parent */ - if (flags & VIR_FILE_OP_RETURN_FD) { + { VIR_FORCE_CLOSE(pair[1]); do { @@ -1601,19 +1582,13 @@ int virFileOperation(const char *path, int openflags, mode_t mode, goto parenterror; } if (!WIFEXITED(status) || (ret = -WEXITSTATUS(status)) == -EACCES || - ((flags & VIR_FILE_OP_RETURN_FD) && fd == -1)) { + fd == -1) { /* fall back to the simpler method, which works better in * some cases */ - return virFileOperationNoFork(path, openflags, mode, uid, gid, - hook, hookdata, flags); + return virFileOpenAsNoFork(path, openflags, mode, uid, gid, flags); } - if (!ret && (flags & VIR_FILE_OP_RETURN_FD)) { - if ((hook) && ((ret = hook(fd, hookdata)) != 0)) { - VIR_FORCE_CLOSE(fd); - goto parenterror; - } + if (!ret) ret = fd; - } parenterror: return ret; } @@ -1626,6 +1601,7 @@ parenterror: ret = -errno; goto childerror; } + VIR_FORCE_CLOSE(pair[0]); /* set desired uid/gid, then attempt to create the file */ @@ -1665,7 +1641,7 @@ parenterror: path, (unsigned int) uid, (unsigned int) gid); goto childerror; } - if ((flags & VIR_FILE_OP_FORCE_PERMS) + if ((flags & VIR_FILE_OPEN_FORCE_PERMS) && (fchmod(fd, mode) < 0)) { ret = -errno; virReportSystemError(errno, @@ -1673,8 +1649,7 @@ parenterror: path, mode); goto childerror; } - if (flags & VIR_FILE_OP_RETURN_FD) { - VIR_FORCE_CLOSE(pair[0]); + { memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd)); do { @@ -1685,17 +1660,6 @@ parenterror: ret = -errno; goto childerror; } - ret = 0; - } else { - if ((hook) && ((ret = hook(fd, hookdata)) != 0)) - goto childerror; - if (VIR_CLOSE(fd) < 0) { - ret = -errno; - virReportSystemError(errno, - _("child failed to close new file '%s'"), - path); - goto childerror; - } } ret = 0; @@ -1704,6 +1668,7 @@ childerror: * If the child exits with EACCES, then the parent tries again. */ /* XXX This makes assumptions about errno being < 255, which is * not true on Hurd. */ + VIR_FORCE_CLOSE(pair[1]); ret = -ret; if ((ret & 0xff) != ret) { VIR_WARN("unable to pass desired return value %d", ret); @@ -1813,17 +1778,15 @@ childerror: #else /* WIN32 */ /* return -errno on failure, or 0 on success */ -int virFileOperation(const char *path ATTRIBUTE_UNUSED, - int openflags ATTRIBUTE_UNUSED, - mode_t mode ATTRIBUTE_UNUSED, - uid_t uid ATTRIBUTE_UNUSED, - gid_t gid ATTRIBUTE_UNUSED, - virFileOperationHook hook ATTRIBUTE_UNUSED, - void *hookdata ATTRIBUTE_UNUSED, - unsigned int flags ATTRIBUTE_UNUSED) +int virFileOpenAs(const char *path ATTRIBUTE_UNUSED, + int openflags ATTRIBUTE_UNUSED, + mode_t mode ATTRIBUTE_UNUSED, + uid_t uid ATTRIBUTE_UNUSED, + gid_t gid ATTRIBUTE_UNUSED, + unsigned int flags ATTRIBUTE_UNUSED) { virUtilError(VIR_ERR_INTERNAL_ERROR, - "%s", _("virFileOperation is not implemented for WIN32")); + "%s", _("virFileOpenAs is not implemented for WIN32")); return -ENOSYS; } diff --git a/src/util/util.h b/src/util/util.h index b1ca871..7b7722a 100644 --- a/src/util/util.h +++ b/src/util/util.h @@ -127,16 +127,13 @@ bool virFileIsExecutable(const char *file) ATTRIBUTE_NONNULL(1); char *virFileSanitizePath(const char *path); enum { - VIR_FILE_OP_NONE = 0, - VIR_FILE_OP_AS_UID = (1 << 0), - VIR_FILE_OP_FORCE_PERMS = (1 << 1), - VIR_FILE_OP_RETURN_FD = (1 << 2), + VIR_FILE_OPEN_NONE = 0, + VIR_FILE_OPEN_AS_UID = (1 << 0), + VIR_FILE_OPEN_FORCE_PERMS = (1 << 1), }; -typedef int (*virFileOperationHook)(int fd, void *data); -int virFileOperation(const char *path, int openflags, mode_t mode, - uid_t uid, gid_t gid, - virFileOperationHook hook, void *hookdata, - unsigned int flags) +int virFileOpenAs(const char *path, int openflags, mode_t mode, + uid_t uid, gid_t gid, + unsigned int flags) ATTRIBUTE_NONNULL(1) ATTRIBUTE_RETURN_CHECK; enum { -- 1.7.4

Separating the indentation from the real patch made review easier. * src/util/util.c (virFileOpenAs): Whitespace changes. --- v5: rebase to changes in 4/13 and 7/13 src/util/util.c | 97 +++++++++++++++++++++++++----------------------------- 1 files changed, 45 insertions(+), 52 deletions(-) diff --git a/src/util/util.c b/src/util/util.c index 49a8bb6..035036b 100644 --- a/src/util/util.c +++ b/src/util/util.c @@ -1439,8 +1439,7 @@ static int virDirCreateNoFork(const char *path, mode_t mode, uid_t uid, gid_t gi struct stat st; if ((mkdir(path, mode) < 0) - && !((errno == EEXIST) && (flags & VIR_DIR_CREATE_ALLOW_EXIST))) - { + && !((errno == EEXIST) && (flags & VIR_DIR_CREATE_ALLOW_EXIST))) { ret = -errno; virReportSystemError(errno, _("failed to create directory '%s'"), path); @@ -1516,28 +1515,26 @@ virFileOpenAs(const char *path, int openflags, mode_t mode, * following dance avoids problems caused by root-squashing * NFS servers. */ - { - if (socketpair(AF_UNIX, SOCK_STREAM, 0, pair) < 0) { - ret = -errno; - virReportSystemError(errno, - _("failed to create socket needed for '%s'"), - path); - return ret; - } - - memset(&msg, 0, sizeof(msg)); - iov.iov_base = &dummy; - iov.iov_len = 1; - msg.msg_iov = &iov; - msg.msg_iovlen = 1; - msg.msg_control = buf; - msg.msg_controllen = sizeof(buf); - cmsg = CMSG_FIRSTHDR(&msg); - cmsg->cmsg_level = SOL_SOCKET; - cmsg->cmsg_type = SCM_RIGHTS; - cmsg->cmsg_len = CMSG_LEN(sizeof(fd)); + if (socketpair(AF_UNIX, SOCK_STREAM, 0, pair) < 0) { + ret = -errno; + virReportSystemError(errno, + _("failed to create socket needed for '%s'"), + path); + return ret; } + memset(&msg, 0, sizeof(msg)); + iov.iov_base = &dummy; + iov.iov_len = 1; + msg.msg_iov = &iov; + msg.msg_iovlen = 1; + msg.msg_control = buf; + msg.msg_controllen = sizeof(buf); + cmsg = CMSG_FIRSTHDR(&msg); + cmsg->cmsg_level = SOL_SOCKET; + cmsg->cmsg_type = SCM_RIGHTS; + cmsg->cmsg_len = CMSG_LEN(sizeof(fd)); + forkRet = virFork(&pid); if (pid < 0) { @@ -1546,29 +1543,27 @@ virFileOpenAs(const char *path, int openflags, mode_t mode, } if (pid) { /* parent */ - { - VIR_FORCE_CLOSE(pair[1]); - - do { - ret = recvmsg(pair[0], &msg, 0); - } while (ret < 0 && errno == EINTR); - - if (ret < 0) { - ret = -errno; - VIR_FORCE_CLOSE(pair[0]); - while ((waitret = waitpid(pid, NULL, 0) == -1) - && (errno == EINTR)); - goto parenterror; - } + VIR_FORCE_CLOSE(pair[1]); + + do { + ret = recvmsg(pair[0], &msg, 0); + } while (ret < 0 && errno == EINTR); + + if (ret < 0) { + ret = -errno; VIR_FORCE_CLOSE(pair[0]); + while ((waitret = waitpid(pid, NULL, 0) == -1) + && (errno == EINTR)); + goto parenterror; + } + VIR_FORCE_CLOSE(pair[0]); - /* See if fd was transferred. */ - cmsg = CMSG_FIRSTHDR(&msg); - if (cmsg && cmsg->cmsg_len == CMSG_LEN(sizeof(fd)) && - cmsg->cmsg_level == SOL_SOCKET && - cmsg->cmsg_type == SCM_RIGHTS) { - memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd)); - } + /* See if fd was transferred. */ + cmsg = CMSG_FIRSTHDR(&msg); + if (cmsg && cmsg->cmsg_len == CMSG_LEN(sizeof(fd)) && + cmsg->cmsg_level == SOL_SOCKET && + cmsg->cmsg_type == SCM_RIGHTS) { + memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd)); } /* wait for child to complete, and retrieve its exit code */ @@ -1649,17 +1644,15 @@ parenterror: path, mode); goto childerror; } - { - memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd)); + memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd)); - do { - ret = sendmsg(pair[1], &msg, 0); - } while (ret < 0 && errno == EINTR); + do { + ret = sendmsg(pair[1], &msg, 0); + } while (ret < 0 && errno == EINTR); - if (ret < 0) { - ret = -errno; - goto childerror; - } + if (ret < 0) { + ret = -errno; + goto childerror; } ret = 0; -- 1.7.4

* src/storage/storage_backend.c (createRawFileOpHook): Change signature. (struct createRawFileOpHookData): Delete unused struct. (virStorageBackendCreateRaw): Adjust caller. * src/qemu/qemu_driver.c (struct fileOpHookData): Delete unused struct. (qemudDomainSaveFileOpHook): Rename... (qemuDomainSaveFileOpHook): ...and change signature. (qemudDomainSaveFlag): Adjust caller. --- v5: no real changes src/qemu/qemu_driver.c | 29 +++++++++-------------------- src/storage/storage_backend.c | 34 ++++++++++++++-------------------- 2 files changed, 23 insertions(+), 40 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 8ee2a29..6ecfbd6 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -1774,30 +1774,25 @@ struct qemud_save_header { int unused[15]; }; -struct fileOpHookData { - virDomainPtr dom; - const char *path; - char *xml; - struct qemud_save_header *header; -}; - /* return -errno on failure, or 0 on success */ -static int qemudDomainSaveFileOpHook(int fd, void *data) { - struct fileOpHookData *hdata = data; +static int +qemuDomainSaveHeader(int fd, const char *path, char *xml, + struct qemud_save_header *header) +{ int ret = 0; - if (safewrite(fd, hdata->header, sizeof(*hdata->header)) != sizeof(*hdata->header)) { + if (safewrite(fd, header, sizeof(*header)) != sizeof(*header)) { ret = -errno; qemuReportError(VIR_ERR_OPERATION_FAILED, _("failed to write header to domain save file '%s'"), - hdata->path); + path); goto endjob; } - if (safewrite(fd, hdata->xml, hdata->header->xml_len) != hdata->header->xml_len) { + if (safewrite(fd, xml, header->xml_len) != header->xml_len) { ret = -errno; qemuReportError(VIR_ERR_OPERATION_FAILED, - _("failed to write xml to '%s'"), hdata->path); + _("failed to write xml to '%s'"), path); goto endjob; } endjob: @@ -1813,7 +1808,6 @@ static int qemudDomainSaveFlag(struct qemud_driver *driver, virDomainPtr dom, { char *xml = NULL; struct qemud_save_header header; - struct fileOpHookData hdata; int bypassSecurityDriver = 0; int ret = -1; int rc; @@ -1979,12 +1973,7 @@ static int qemudDomainSaveFlag(struct qemud_driver *driver, virDomainPtr dom, } /* Write header to file, followed by XML */ - hdata.dom = dom; - hdata.path = path; - hdata.xml = xml; - hdata.header = &header; - - if (qemudDomainSaveFileOpHook(fd, &hdata) < 0) { + if (qemuDomainSaveHeader(fd, path, xml, &header) < 0) { VIR_FORCE_CLOSE(fd); goto endjob; } diff --git a/src/storage/storage_backend.c b/src/storage/storage_backend.c index 7015475..9b7bcc4 100644 --- a/src/storage/storage_backend.c +++ b/src/storage/storage_backend.c @@ -292,31 +292,27 @@ cleanup: return ret; } -struct createRawFileOpHookData { - virStorageVolDefPtr vol; - virStorageVolDefPtr inputvol; -}; - -static int createRawFileOpHook(int fd, void *data) { - struct createRawFileOpHookData *hdata = data; +static int +createRawFile(int fd, virStorageVolDefPtr vol, + virStorageVolDefPtr inputvol) +{ int ret = 0; unsigned long long remain; /* Seek to the final size, so the capacity is available upfront * for progress reporting */ - if (ftruncate(fd, hdata->vol->capacity) < 0) { + if (ftruncate(fd, vol->capacity) < 0) { ret = -errno; virReportSystemError(errno, _("cannot extend file '%s'"), - hdata->vol->target.path); + vol->target.path); goto cleanup; } - remain = hdata->vol->allocation; + remain = vol->allocation; - if (hdata->inputvol) { - ret = virStorageBackendCopyToFD(hdata->vol, hdata->inputvol, - fd, &remain, 1); + if (inputvol) { + ret = virStorageBackendCopyToFD(vol, inputvol, fd, &remain, 1); if (ret < 0) { goto cleanup; } @@ -334,11 +330,10 @@ static int createRawFileOpHook(int fd, void *data) { if (bytes > remain) bytes = remain; - if (safezero(fd, 0, hdata->vol->allocation - remain, - bytes) != 0) { + if (safezero(fd, 0, vol->allocation - remain, bytes) != 0) { ret = -errno; virReportSystemError(errno, _("cannot fill file '%s'"), - hdata->vol->target.path); + vol->target.path); goto cleanup; } remain -= bytes; @@ -347,7 +342,7 @@ static int createRawFileOpHook(int fd, void *data) { if (safezero(fd, 0, 0, remain) != 0) { ret = -errno; virReportSystemError(errno, _("cannot fill file '%s'"), - hdata->vol->target.path); + vol->target.path); goto cleanup; } } @@ -357,7 +352,7 @@ static int createRawFileOpHook(int fd, void *data) { if (fsync(fd) < 0) { ret = -errno; virReportSystemError(errno, _("cannot sync data to file '%s'"), - hdata->vol->target.path); + vol->target.path); goto cleanup; } @@ -374,7 +369,6 @@ virStorageBackendCreateRaw(virConnectPtr conn ATTRIBUTE_UNUSED, { int ret = -1; int fd = -1; - struct createRawFileOpHookData hdata = { vol, inputvol }; uid_t uid; gid_t gid; int operation_flags; @@ -404,7 +398,7 @@ virStorageBackendCreateRaw(virConnectPtr conn ATTRIBUTE_UNUSED, goto cleanup; } - if ((ret = createRawFileOpHook(fd, &hdata)) < 0) { + if ((ret = createRawFile(fd, vol, inputvol)) < 0) { virReportSystemError(-fd, _("cannot create path '%s'"), vol->target.path); -- 1.7.4

Direct access to an open file is so much simpler than passing everything through a pipe! * src/qemu/qemu_driver.c (qemudOpenAsUID) (qemudDomainSaveImageClose): Delete. (qemudDomainSaveImageOpen): Rename... (qemuDomainSaveImageOpen): ...and drop read_pid argument. Use virFileOpenAs instead of qemudOpenAsUID. (qemudDomainSaveImageStartVM, qemudDomainRestore) (qemudDomainObjRestore): Rename... (qemuDomainSaveImageStartVM, qemuDomainRestore) (qemDomainObjRestore): ...and simplify accordingly. (qemudDomainObjStart, qemuDriver): Update callers. --- v5: no major changes src/qemu/qemu_driver.c | 265 ++++++++--------------------------------------- 1 files changed, 45 insertions(+), 220 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 6ecfbd6..2c852c0 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -3048,183 +3048,32 @@ cleanup: return ret; } -/* qemudOpenAsUID() - pipe/fork/setuid/open a file, and return the - pipe fd to caller, so that it can read from the file. Also return - the pid of the child process, so the caller can wait for it to exit - after it's finished reading (to avoid a zombie, if nothing - else). */ - -static int -qemudOpenAsUID(const char *path, uid_t uid, gid_t gid, pid_t *child_pid) -{ - int pipefd[2]; - int fd = -1; - - *child_pid = -1; - - if (pipe(pipefd) < 0) { - virReportSystemError(errno, - _("failed to create pipe to read '%s'"), - path); - pipefd[0] = pipefd[1] = -1; - goto parent_cleanup; - } - - int forkRet = virFork(child_pid); - - if (*child_pid < 0) { - virReportSystemError(errno, - _("failed to fork child to read '%s'"), - path); - goto parent_cleanup; - } - - if (*child_pid > 0) { - - /* parent */ - - /* parent doesn't need the write side of the pipe */ - VIR_FORCE_CLOSE(pipefd[1]); - - if (forkRet < 0) { - virReportSystemError(errno, - _("failed in parent after forking child to read '%s'"), - path); - goto parent_cleanup; - } - /* caller gets the read side of the pipe */ - fd = pipefd[0]; - pipefd[0] = -1; -parent_cleanup: - VIR_FORCE_CLOSE(pipefd[0]); - VIR_FORCE_CLOSE(pipefd[1]); - if ((fd < 0) && (*child_pid > 0)) { - /* a child process was started and subsequently an error - occurred in the parent, so we need to wait for it to - exit, but its status is inconsequential. */ - while ((waitpid(*child_pid, NULL, 0) == -1) - && (errno == EINTR)) { - /* empty */ - } - *child_pid = -1; - } - return fd; - } - - /* child */ - - /* setuid to the qemu user, then open the file, read it, - and stuff it into the pipe for the parent process to - read */ - int exit_code; - char *buf = NULL; - size_t bufsize = 1024 * 1024; - int bytesread; - - /* child doesn't need the read side of the pipe */ - VIR_FORCE_CLOSE(pipefd[0]); - - if (forkRet < 0) { - exit_code = errno; - virReportSystemError(errno, - _("failed in child after forking to read '%s'"), - path); - goto child_cleanup; - } - - if (virSetUIDGID(uid, gid) < 0) { - exit_code = errno; - goto child_cleanup; - } - - if ((fd = open(path, O_RDONLY)) < 0) { - exit_code = errno; - virReportSystemError(errno, - _("cannot open '%s' as uid %d"), - path, uid); - goto child_cleanup; - } - - if (VIR_ALLOC_N(buf, bufsize) < 0) { - exit_code = ENOMEM; - virReportOOMError(); - goto child_cleanup; - } - - /* read from fd and write to pipefd[1] until EOF */ - do { - if ((bytesread = saferead(fd, buf, bufsize)) < 0) { - exit_code = errno; - virReportSystemError(errno, - _("child failed reading from '%s'"), - path); - goto child_cleanup; - } - if (safewrite(pipefd[1], buf, bytesread) != bytesread) { - exit_code = errno; - virReportSystemError(errno, "%s", - _("child failed writing to pipe")); - goto child_cleanup; - } - } while (bytesread > 0); - exit_code = 0; - -child_cleanup: - VIR_FREE(buf); - VIR_FORCE_CLOSE(fd); - VIR_FORCE_CLOSE(pipefd[1]); - _exit(exit_code); -} - -static int qemudDomainSaveImageClose(int fd, pid_t read_pid, int *status) -{ - int ret = 0; - - if (VIR_CLOSE(fd) < 0) { - virReportSystemError(errno, "%s", - _("cannot close file")); - } - - if (read_pid != -1) { - /* reap the process that read the file */ - while ((ret = waitpid(read_pid, status, 0)) == -1 - && errno == EINTR) { - /* empty */ - } - } else if (status) { - *status = 0; - } - - return ret; -} - -static int ATTRIBUTE_NONNULL(3) ATTRIBUTE_NONNULL(4) ATTRIBUTE_NONNULL(5) -qemudDomainSaveImageOpen(struct qemud_driver *driver, - const char *path, - virDomainDefPtr *ret_def, - struct qemud_save_header *ret_header, - pid_t *ret_read_pid) +static int ATTRIBUTE_NONNULL(3) ATTRIBUTE_NONNULL(4) +qemuDomainSaveImageOpen(struct qemud_driver *driver, + const char *path, + virDomainDefPtr *ret_def, + struct qemud_save_header *ret_header) { int fd; - pid_t read_pid = -1; struct qemud_save_header header; char *xml = NULL; virDomainDefPtr def = NULL; - if ((fd = open(path, O_RDONLY)) < 0) { - if ((driver->user == 0) || (getuid() != 0)) { + if ((fd = virFileOpenAs(path, O_RDONLY, 0, getuid(), getgid(), 0)) < 0) { + if ((fd != -EACCES && fd != -EPERM) || + driver->user == getuid()) { qemuReportError(VIR_ERR_OPERATION_FAILED, "%s", _("cannot read domain image")); goto error; } /* Opening as root failed, but qemu runs as a different user - that might have better luck. Create a pipe, then fork a - child process to run as the qemu user, which will hopefully - have the necessary authority to read the file. */ - if ((fd = qemudOpenAsUID(path, - driver->user, driver->group, &read_pid)) < 0) { - /* error already reported */ + * that might have better luck. */ + if ((fd = virFileOpenAs(path, O_RDONLY, 0, + driver->user, driver->group, + VIR_FILE_OPEN_AS_UID)) < 0) { + qemuReportError(VIR_ERR_OPERATION_FAILED, + "%s", _("cannot read domain image")); goto error; } } @@ -3274,34 +3123,30 @@ qemudDomainSaveImageOpen(struct qemud_driver *driver, *ret_def = def; *ret_header = header; - *ret_read_pid = read_pid; return fd; error: virDomainDefFree(def); VIR_FREE(xml); - qemudDomainSaveImageClose(fd, read_pid, NULL); + VIR_FORCE_CLOSE(fd); return -1; } -static int ATTRIBUTE_NONNULL(6) -qemudDomainSaveImageStartVM(virConnectPtr conn, - struct qemud_driver *driver, - virDomainObjPtr vm, - int *fd, - pid_t *read_pid, - const struct qemud_save_header *header, - const char *path) +static int ATTRIBUTE_NONNULL(4) ATTRIBUTE_NONNULL(5) ATTRIBUTE_NONNULL(6) +qemuDomainSaveImageStartVM(virConnectPtr conn, + struct qemud_driver *driver, + virDomainObjPtr vm, + int *fd, + const struct qemud_save_header *header, + const char *path) { int ret = -1; virDomainEventPtr event; int intermediatefd = -1; pid_t intermediate_pid = -1; int childstat; - int wait_ret; - int status; if (header->version == 2) { const char *intermediate_argv[3] = { NULL, "-dc", NULL }; @@ -3334,8 +3179,9 @@ qemudDomainSaveImageStartVM(virConnectPtr conn, if (intermediate_pid != -1) { if (ret < 0) { - /* if there was an error setting up qemu, the intermediate process will - * wait forever to write to stdout, so we must manually kill it. + /* if there was an error setting up qemu, the intermediate + * process will wait forever to write to stdout, so we + * must manually kill it. */ VIR_FORCE_CLOSE(intermediatefd); VIR_FORCE_CLOSE(*fd); @@ -3350,30 +3196,10 @@ qemudDomainSaveImageStartVM(virConnectPtr conn, } VIR_FORCE_CLOSE(intermediatefd); - wait_ret = qemudDomainSaveImageClose(*fd, *read_pid, &status); - *fd = -1; - if (*read_pid != -1) { - if (wait_ret == -1) { - virReportSystemError(errno, - _("failed to wait for process reading '%s'"), - path); - ret = -1; - } else if (!WIFEXITED(status)) { - qemuReportError(VIR_ERR_OPERATION_FAILED, - _("child process exited abnormally reading '%s'"), - path); - ret = -1; - } else { - int exit_status = WEXITSTATUS(status); - if (exit_status != 0) { - virReportSystemError(exit_status, - _("child process returned error reading '%s'"), - path); - ret = -1; - } - } + if (VIR_CLOSE(*fd) < 0) { + virReportSystemError(errno, _("cannot close file: %s"), path); + ret = -1; } - *read_pid = -1; if (ret < 0) { qemuAuditDomainStart(vm, "restored", false); @@ -3412,19 +3238,20 @@ out: return ret; } -static int qemudDomainRestore(virConnectPtr conn, - const char *path) { +static int +qemuDomainRestore(virConnectPtr conn, + const char *path) +{ struct qemud_driver *driver = conn->privateData; virDomainDefPtr def = NULL; virDomainObjPtr vm = NULL; int fd = -1; - pid_t read_pid = -1; int ret = -1; struct qemud_save_header header; qemuDriverLock(driver); - fd = qemudDomainSaveImageOpen(driver, path, &def, &header, &read_pid); + fd = qemuDomainSaveImageOpen(driver, path, &def, &header); if (fd < 0) goto cleanup; @@ -3442,8 +3269,7 @@ static int qemudDomainRestore(virConnectPtr conn, if (qemuDomainObjBeginJobWithDriver(driver, vm) < 0) goto cleanup; - ret = qemudDomainSaveImageStartVM(conn, driver, vm, &fd, - &read_pid, &header, path); + ret = qemuDomainSaveImageStartVM(conn, driver, vm, &fd, &header, path); if (qemuDomainObjEndJob(vm) == 0) vm = NULL; @@ -3454,25 +3280,25 @@ static int qemudDomainRestore(virConnectPtr conn, cleanup: virDomainDefFree(def); - qemudDomainSaveImageClose(fd, read_pid, NULL); + VIR_FORCE_CLOSE(fd); if (vm) virDomainObjUnlock(vm); qemuDriverUnlock(driver); return ret; } -static int qemudDomainObjRestore(virConnectPtr conn, - struct qemud_driver *driver, - virDomainObjPtr vm, - const char *path) +static int +qemuDomainObjRestore(virConnectPtr conn, + struct qemud_driver *driver, + virDomainObjPtr vm, + const char *path) { virDomainDefPtr def = NULL; int fd = -1; - pid_t read_pid = -1; int ret = -1; struct qemud_save_header header; - fd = qemudDomainSaveImageOpen(driver, path, &def, &header, &read_pid); + fd = qemuDomainSaveImageOpen(driver, path, &def, &header); if (fd < 0) goto cleanup; @@ -3493,12 +3319,11 @@ static int qemudDomainObjRestore(virConnectPtr conn, virDomainObjAssignDef(vm, def, true); def = NULL; - ret = qemudDomainSaveImageStartVM(conn, driver, vm, &fd, - &read_pid, &header, path); + ret = qemuDomainSaveImageStartVM(conn, driver, vm, &fd, &header, path); cleanup: virDomainDefFree(def); - qemudDomainSaveImageClose(fd, read_pid, NULL); + VIR_FORCE_CLOSE(fd); return ret; } @@ -3709,7 +3534,7 @@ static int qemudDomainObjStart(virConnectPtr conn, */ managed_save = qemuDomainManagedSavePath(driver, vm); if ((managed_save) && (virFileExists(managed_save))) { - ret = qemudDomainObjRestore(conn, driver, vm, managed_save); + ret = qemuDomainObjRestore(conn, driver, vm, managed_save); if (unlink(managed_save) < 0) { VIR_WARN("Failed to remove the managed state %s", managed_save); @@ -7126,7 +6951,7 @@ static virDriver qemuDriver = { qemuDomainGetBlkioParameters, /* domainGetBlkioParameters */ qemudDomainGetInfo, /* domainGetInfo */ qemudDomainSave, /* domainSave */ - qemudDomainRestore, /* domainRestore */ + qemuDomainRestore, /* domainRestore */ qemudDomainCoreDump, /* domainCoreDump */ qemudDomainSetVcpus, /* domainSetVcpus */ qemudDomainSetVcpusFlags, /* domainSetVcpusFlags */ -- 1.7.4

This points out that core dumps (still) don't work for root-squash NFS, since the fd is not opened correctly. This patch should not introduce any functionality change, it is just a refactoring to avoid duplicated code. * src/qemu/qemu_migration.h (qemuMigrationToFile): New prototype. * src/qemu/qemu_migration.c (qemuMigrationToFile): New function. * src/qemu/qemu_driver.c (qemudDomainSaveFlag, doCoreDump): Use it. --- v5: move qemuMigrationToFile to qemu_migration, which required adding qemuCompressProgramName src/qemu/qemu_driver.c | 165 +++++++-------------------------------------- src/qemu/qemu_migration.c | 98 +++++++++++++++++++++++++++ src/qemu/qemu_migration.h | 8 ++ 3 files changed, 131 insertions(+), 140 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 2c852c0..182b0fd 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -1799,6 +1799,15 @@ endjob: return ret; } +/* Given a enum qemud_save_formats compression level, return the name + * of the program to run, or NULL if no program is needed. */ +static const char * +qemuCompressProgramName(int compress) +{ + return (compress == QEMUD_SAVE_FORMAT_RAW ? NULL : + qemudSaveCompressionTypeToString(compress)); +} + /* This internal function expects the driver lock to already be held on * entry and the vm must be active. */ @@ -1808,15 +1817,14 @@ static int qemudDomainSaveFlag(struct qemud_driver *driver, virDomainPtr dom, { char *xml = NULL; struct qemud_save_header header; - int bypassSecurityDriver = 0; + bool bypassSecurityDriver = false; int ret = -1; int rc; virDomainEventPtr event = NULL; qemuDomainObjPrivatePtr priv; struct stat sb; - int is_reg = 0; + bool is_reg = false; unsigned long long offset; - virCgroupPtr cgroup = NULL; virBitmapPtr qemuCaps = NULL; int fd = -1; @@ -1871,9 +1879,9 @@ static int qemudDomainSaveFlag(struct qemud_driver *driver, virDomainPtr dom, * that with NFS we can't actually stat() the file. * The subsequent codepaths will still raise an error * if a truely fatal problem is hit */ - is_reg = 1; + is_reg = true; } else { - is_reg = S_ISREG(sb.st_mode); + is_reg = !!S_ISREG(sb.st_mode); } offset = sizeof(header) + header.xml_len; @@ -1968,7 +1976,7 @@ static int qemudDomainSaveFlag(struct qemud_driver *driver, virDomainPtr dom, is NFS, we assume it's a root-squashing NFS share, and that the security driver stuff would have failed anyway */ - bypassSecurityDriver = 1; + bypassSecurityDriver = true; } } @@ -1978,87 +1986,14 @@ static int qemudDomainSaveFlag(struct qemud_driver *driver, virDomainPtr dom, goto endjob; } - /* Allow qemu to access file */ - - if (!is_reg && - qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES)) { - if (virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0) != 0) { - qemuReportError(VIR_ERR_INTERNAL_ERROR, - _("Unable to find cgroup for %s"), - vm->def->name); - goto endjob; - } - rc = virCgroupAllowDevicePath(cgroup, path, - VIR_CGROUP_DEVICE_RW); - qemuAuditCgroupPath(vm, cgroup, "allow", path, "rw", rc); - if (rc < 0) { - virReportSystemError(-rc, - _("Unable to allow device %s for %s"), - path, vm->def->name); - goto endjob; - } - } - - if ((!bypassSecurityDriver) && - virSecurityManagerSetSavedStateLabel(driver->securityManager, - vm, path) < 0) - goto endjob; - - if (header.compressed == QEMUD_SAVE_FORMAT_RAW) { - const char *args[] = { "cat", NULL }; - - qemuDomainObjEnterMonitorWithDriver(driver, vm); - if (qemuCapsGet(qemuCaps, QEMU_CAPS_MIGRATE_QEMU_FD) && - priv->monConfig->type == VIR_DOMAIN_CHR_TYPE_UNIX) { - rc = qemuMonitorMigrateToFd(priv->mon, - QEMU_MONITOR_MIGRATE_BACKGROUND, - fd); - } else { - rc = qemuMonitorMigrateToFile(priv->mon, - QEMU_MONITOR_MIGRATE_BACKGROUND, - args, path, offset); - } - qemuDomainObjExitMonitorWithDriver(driver, vm); - } else { - const char *prog = qemudSaveCompressionTypeToString(header.compressed); - const char *args[] = { - prog, - "-c", - NULL - }; - qemuDomainObjEnterMonitorWithDriver(driver, vm); - rc = qemuMonitorMigrateToFile(priv->mon, - QEMU_MONITOR_MIGRATE_BACKGROUND, - args, path, offset); - qemuDomainObjExitMonitorWithDriver(driver, vm); - } - - if (rc < 0) - goto endjob; - + /* Perform the migration */ + if (qemuMigrationToFile(driver, vm, qemuCaps, fd, offset, path, + qemuCompressProgramName(compressed), + is_reg, bypassSecurityDriver) < 0) + goto cleanup; if (VIR_CLOSE(fd) < 0) { virReportSystemError(errno, _("unable to close %s"), path); - goto endjob; - } - - rc = qemuMigrationWaitForCompletion(driver, vm); - - if (rc < 0) - goto endjob; - - if ((!bypassSecurityDriver) && - virSecurityManagerRestoreSavedStateLabel(driver->securityManager, - vm, path) < 0) - VIR_WARN("failed to restore save state label on %s", path); - bypassSecurityDriver = true; - - if (cgroup != NULL) { - rc = virCgroupDenyDevicePath(cgroup, path, - VIR_CGROUP_DEVICE_RWM); - qemuAuditCgroupPath(vm, cgroup, "deny", path, "rwm", rc); - if (rc < 0) - VIR_WARN("Unable to deny device %s for %s %d", - path, vm->def->name, rc); + goto cleanup; } ret = 0; @@ -2084,22 +2019,7 @@ endjob: if (rc < 0) VIR_WARN0("Unable to resume guest CPUs after save failure"); } - - if (cgroup != NULL) { - rc = virCgroupDenyDevicePath(cgroup, path, - VIR_CGROUP_DEVICE_RWM); - qemuAuditCgroupPath(vm, cgroup, "deny", path, "rwm", rc); - if (rc < 0) - VIR_WARN("Unable to deny device %s for %s: %d", - path, vm->def->name, rc); - } - - if ((!bypassSecurityDriver) && - virSecurityManagerRestoreSavedStateLabel(driver->securityManager, - vm, path) < 0) - VIR_WARN("failed to restore save state label on %s", path); } - if (qemuDomainObjEndJob(vm) == 0) vm = NULL; } @@ -2112,7 +2032,6 @@ cleanup: unlink(path); if (event) qemuDomainEventQueue(driver, event); - virCgroupFree(&cgroup); return ret; } @@ -2316,9 +2235,6 @@ static int doCoreDump(struct qemud_driver *driver, { int fd = -1; int ret = -1; - qemuDomainObjPrivatePtr priv; - - priv = vm->privateData; /* Create an empty file with appropriate ownership. */ if ((fd = open(path, O_CREAT|O_TRUNC|O_WRONLY, S_IRUSR|S_IWUSR)) < 0) { @@ -2327,6 +2243,10 @@ static int doCoreDump(struct qemud_driver *driver, goto cleanup; } + if (qemuMigrationToFile(driver, vm, NULL, fd, 0, path, + qemuCompressProgramName(compress), true, false) < 0) + goto cleanup; + if (VIR_CLOSE(fd) < 0) { virReportSystemError(errno, _("unable to save file %s"), @@ -2334,42 +2254,7 @@ static int doCoreDump(struct qemud_driver *driver, goto cleanup; } - if (virSecurityManagerSetSavedStateLabel(driver->securityManager, - vm, path) < 0) - goto cleanup; - - qemuDomainObjEnterMonitorWithDriver(driver, vm); - if (compress == QEMUD_SAVE_FORMAT_RAW) { - const char *args[] = { - "cat", - NULL, - }; - ret = qemuMonitorMigrateToFile(priv->mon, - QEMU_MONITOR_MIGRATE_BACKGROUND, - args, path, 0); - } else { - const char *prog = qemudSaveCompressionTypeToString(compress); - const char *args[] = { - prog, - "-c", - NULL, - }; - ret = qemuMonitorMigrateToFile(priv->mon, - QEMU_MONITOR_MIGRATE_BACKGROUND, - args, path, 0); - } - qemuDomainObjExitMonitorWithDriver(driver, vm); - if (ret < 0) - goto cleanup; - - ret = qemuMigrationWaitForCompletion(driver, vm); - - if (ret < 0) - goto cleanup; - - if (virSecurityManagerRestoreSavedStateLabel(driver->securityManager, - vm, path) < 0) - goto cleanup; + ret = 0; cleanup: if (ret != 0) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 826e1bf..592ffe2 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -29,6 +29,7 @@ #include "qemu_process.h" #include "qemu_capabilities.h" #include "qemu_audit.h" +#include "qemu_cgroup.h" #include "logging.h" #include "virterror_internal.h" @@ -1283,3 +1284,100 @@ cleanup: qemuDomainEventQueue(driver, event); return dom; } + +/* Helper function called while driver lock is held and vm is active. */ +int +qemuMigrationToFile(struct qemud_driver *driver, virDomainObjPtr vm, + virBitmapPtr qemuCaps, + int fd, off_t offset, const char *path, + const char *compressor, + bool is_reg, bool bypassSecurityDriver) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + virCgroupPtr cgroup = NULL; + int ret = -1; + int rc; + bool restoreLabel = false; + + if (!is_reg && + qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES)) { + if (virCgroupForDomain(driver->cgroup, vm->def->name, + &cgroup, 0) != 0) { + qemuReportError(VIR_ERR_INTERNAL_ERROR, + _("Unable to find cgroup for %s"), + vm->def->name); + goto cleanup; + } + rc = virCgroupAllowDevicePath(cgroup, path, + VIR_CGROUP_DEVICE_RW); + qemuAuditCgroupPath(vm, cgroup, "allow", path, "rw", rc); + if (rc < 0) { + virReportSystemError(-rc, + _("Unable to allow device %s for %s"), + path, vm->def->name); + goto cleanup; + } + } + + if ((!bypassSecurityDriver) && + virSecurityManagerSetSavedStateLabel(driver->securityManager, + vm, path) < 0) + goto cleanup; + restoreLabel = true; + + if (!compressor) { + const char *args[] = { "cat", NULL }; + + qemuDomainObjEnterMonitorWithDriver(driver, vm); + if (qemuCaps && qemuCapsGet(qemuCaps, QEMU_CAPS_MIGRATE_QEMU_FD) && + priv->monConfig->type == VIR_DOMAIN_CHR_TYPE_UNIX) { + rc = qemuMonitorMigrateToFd(priv->mon, + QEMU_MONITOR_MIGRATE_BACKGROUND, + fd); + } else { + rc = qemuMonitorMigrateToFile(priv->mon, + QEMU_MONITOR_MIGRATE_BACKGROUND, + args, path, offset); + } + qemuDomainObjExitMonitorWithDriver(driver, vm); + } else { + const char *prog = compressor; + const char *args[] = { + prog, + "-c", + NULL + }; + qemuDomainObjEnterMonitorWithDriver(driver, vm); + rc = qemuMonitorMigrateToFile(priv->mon, + QEMU_MONITOR_MIGRATE_BACKGROUND, + args, path, offset); + qemuDomainObjExitMonitorWithDriver(driver, vm); + } + + if (rc < 0) + goto cleanup; + + rc = qemuMigrationWaitForCompletion(driver, vm); + + if (rc < 0) + goto cleanup; + + ret = 0; + +cleanup: + if (restoreLabel && (!bypassSecurityDriver) && + virSecurityManagerRestoreSavedStateLabel(driver->securityManager, + vm, path) < 0) + VIR_WARN("failed to restore save state label on %s", path); + + if (cgroup != NULL) { + rc = virCgroupDenyDevicePath(cgroup, path, + VIR_CGROUP_DEVICE_RWM); + qemuAuditCgroupPath(vm, cgroup, "deny", path, "rwm", rc); + if (rc < 0) + VIR_WARN("Unable to deny device %s for %s %d", + path, vm->def->name, rc); + virCgroupFree(&cgroup); + } + return ret; +} diff --git a/src/qemu/qemu_migration.h b/src/qemu/qemu_migration.h index 3cac617..db16ce4 100644 --- a/src/qemu/qemu_migration.h +++ b/src/qemu/qemu_migration.h @@ -60,4 +60,12 @@ virDomainPtr qemuMigrationFinish(struct qemud_driver *driver, int retcode); +int qemuMigrationToFile(struct qemud_driver *driver, virDomainObjPtr vm, + virBitmapPtr qemuCaps, + int fd, off_t offset, const char *path, + const char *compressor, + bool is_reg, bool bypassSecurityDriver) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(6) + ATTRIBUTE_RETURN_CHECK; + #endif /* __QEMU_MIGRATION_H__ */ -- 1.7.4

Regression introduced in commit 6034ddd55. * src/qemu/qemu_driver.c (qemudDomainSaveFlag): Jump to correct label. --- Aargh - I introduced a regression that hangs libvirtd due to an inadvertent change in label names (copied and pasted from doCoreDump, where the correct label was indeed named cleanup).
- if (rc < 0) - goto endjob; - + /* Perform the migration */ + if (qemuMigrationToFile(driver, vm, qemuCaps, fd, offset, path, + qemuCompressProgramName(compressed), + is_reg, bypassSecurityDriver) < 0) + goto cleanup;
By jumping to the wrong label, a failed save attempt never released the job condition, locking out all future operations on that domain. Pushing this under the trivial rule. src/qemu/qemu_driver.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 182b0fd..e97c1d9 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -1990,10 +1990,10 @@ static int qemudDomainSaveFlag(struct qemud_driver *driver, virDomainPtr dom, if (qemuMigrationToFile(driver, vm, qemuCaps, fd, offset, path, qemuCompressProgramName(compressed), is_reg, bypassSecurityDriver) < 0) - goto cleanup; + goto endjob; if (VIR_CLOSE(fd) < 0) { virReportSystemError(errno, _("unable to close %s"), path); - goto cleanup; + goto endjob; } ret = 0; -- 1.7.4

SELinux labeling and cgroup ACLs aren't required if we hand a pre-opened fd to qemu. All the more reason to love fd: migration. * src/qemu/qemu_migration.c (qemuMigrationToFile): Skip steps that are irrelevant in fd migration. --- v5: rebase to changes in 11/13, src/qemu/qemu_migration.c | 63 ++++++++++++++++++++++++++------------------ 1 files changed, 37 insertions(+), 26 deletions(-) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 592ffe2..2803de6 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -1299,36 +1299,49 @@ qemuMigrationToFile(struct qemud_driver *driver, virDomainObjPtr vm, int rc; bool restoreLabel = false; - if (!is_reg && - qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES)) { - if (virCgroupForDomain(driver->cgroup, vm->def->name, - &cgroup, 0) != 0) { - qemuReportError(VIR_ERR_INTERNAL_ERROR, - _("Unable to find cgroup for %s"), - vm->def->name); - goto cleanup; + if (qemuCaps && qemuCapsGet(qemuCaps, QEMU_CAPS_MIGRATE_QEMU_FD) && + !compressor) { + /* All right! We can use fd migration, which means that qemu + * doesn't have to open() the file, so we don't have to futz + * around with granting access or revoking it later. */ + is_reg = true; + bypassSecurityDriver = true; + } else { + /* Phooey - we have to fall back on exec migration, where qemu + * has to popen() the file by name. We might also stumble on + * a race present in some qemu versions where it does a wait() + * that botches pclose. */ + if (!is_reg && + qemuCgroupControllerActive(driver, + VIR_CGROUP_CONTROLLER_DEVICES)) { + if (virCgroupForDomain(driver->cgroup, vm->def->name, + &cgroup, 0) != 0) { + qemuReportError(VIR_ERR_INTERNAL_ERROR, + _("Unable to find cgroup for %s"), + vm->def->name); + goto cleanup; + } + rc = virCgroupAllowDevicePath(cgroup, path, + VIR_CGROUP_DEVICE_RW); + qemuAuditCgroupPath(vm, cgroup, "allow", path, "rw", rc); + if (rc < 0) { + virReportSystemError(-rc, + _("Unable to allow device %s for %s"), + path, vm->def->name); + goto cleanup; + } } - rc = virCgroupAllowDevicePath(cgroup, path, - VIR_CGROUP_DEVICE_RW); - qemuAuditCgroupPath(vm, cgroup, "allow", path, "rw", rc); - if (rc < 0) { - virReportSystemError(-rc, - _("Unable to allow device %s for %s"), - path, vm->def->name); + if ((!bypassSecurityDriver) && + virSecurityManagerSetSavedStateLabel(driver->securityManager, + vm, path) < 0) goto cleanup; - } + restoreLabel = true; } - if ((!bypassSecurityDriver) && - virSecurityManagerSetSavedStateLabel(driver->securityManager, - vm, path) < 0) - goto cleanup; - restoreLabel = true; - + qemuDomainObjEnterMonitorWithDriver(driver, vm); if (!compressor) { const char *args[] = { "cat", NULL }; - qemuDomainObjEnterMonitorWithDriver(driver, vm); if (qemuCaps && qemuCapsGet(qemuCaps, QEMU_CAPS_MIGRATE_QEMU_FD) && priv->monConfig->type == VIR_DOMAIN_CHR_TYPE_UNIX) { rc = qemuMonitorMigrateToFd(priv->mon, @@ -1339,7 +1352,6 @@ qemuMigrationToFile(struct qemud_driver *driver, virDomainObjPtr vm, QEMU_MONITOR_MIGRATE_BACKGROUND, args, path, offset); } - qemuDomainObjExitMonitorWithDriver(driver, vm); } else { const char *prog = compressor; const char *args[] = { @@ -1347,12 +1359,11 @@ qemuMigrationToFile(struct qemud_driver *driver, virDomainObjPtr vm, "-c", NULL }; - qemuDomainObjEnterMonitorWithDriver(driver, vm); rc = qemuMonitorMigrateToFile(priv->mon, QEMU_MONITOR_MIGRATE_BACKGROUND, args, path, offset); - qemuDomainObjExitMonitorWithDriver(driver, vm); } + qemuDomainObjExitMonitorWithDriver(driver, vm); if (rc < 0) goto cleanup; -- 1.7.4

Spawn the compressor ourselves, instead of requiring the shell. * src/qemu/qemu_migration.c (qemuMigrationToFile): Spawn compression helper process when needed. --- v5: rebase to changes in 11/13, add a cloexec, rely on recent virCommand changes to actually clean up the intermediate compressor process, actually test this time It's a minor shame that virCommandSetOutputFD can create an output pipe and return the fd, but virCommandSetInputFD cannot do likewise (this patch has to create its own pipe). On the other hand, by creating our own pipe, this patch didn't fall foul of the cloexec bug that had to first be fixed on the restore path in 2/13. src/qemu/qemu_migration.c | 38 ++++++++++++++++++++++++++++++++++---- 1 files changed, 34 insertions(+), 4 deletions(-) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 2803de6..98b9d01 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -1298,9 +1298,11 @@ qemuMigrationToFile(struct qemud_driver *driver, virDomainObjPtr vm, int ret = -1; int rc; bool restoreLabel = false; + virCommandPtr cmd = NULL; + int pipeFD[2] = { -1, -1 }; if (qemuCaps && qemuCapsGet(qemuCaps, QEMU_CAPS_MIGRATE_QEMU_FD) && - !compressor) { + (!compressor || pipe(pipeFD) == 0)) { /* All right! We can use fd migration, which means that qemu * doesn't have to open() the file, so we don't have to futz * around with granting access or revoking it later. */ @@ -1359,9 +1361,31 @@ qemuMigrationToFile(struct qemud_driver *driver, virDomainObjPtr vm, "-c", NULL }; - rc = qemuMonitorMigrateToFile(priv->mon, - QEMU_MONITOR_MIGRATE_BACKGROUND, - args, path, offset); + if (pipeFD[0] != -1) { + cmd = virCommandNewArgs(args); + virCommandSetInputFD(cmd, pipeFD[0]); + virCommandSetOutputFD(cmd, &fd); + if (virSetCloseExec(pipeFD[1]) < 0) { + virReportSystemError(errno, "%s", + _("Unable to set cloexec flag")); + qemuDomainObjExitMonitorWithDriver(driver, vm); + goto cleanup; + } + if (virCommandRunAsync(cmd, NULL) < 0) { + qemuDomainObjExitMonitorWithDriver(driver, vm); + goto cleanup; + } + rc = qemuMonitorMigrateToFd(priv->mon, + QEMU_MONITOR_MIGRATE_BACKGROUND, + pipeFD[1]); + if (VIR_CLOSE(pipeFD[0]) < 0 || + VIR_CLOSE(pipeFD[1]) < 0) + VIR_WARN0("failed to close intermediate pipe"); + } else { + rc = qemuMonitorMigrateToFile(priv->mon, + QEMU_MONITOR_MIGRATE_BACKGROUND, + args, path, offset); + } } qemuDomainObjExitMonitorWithDriver(driver, vm); @@ -1373,9 +1397,15 @@ qemuMigrationToFile(struct qemud_driver *driver, virDomainObjPtr vm, if (rc < 0) goto cleanup; + if (cmd && virCommandWait(cmd, NULL) < 0) + goto cleanup; + ret = 0; cleanup: + VIR_FORCE_CLOSE(pipeFD[0]); + VIR_FORCE_CLOSE(pipeFD[1]); + virCommandFree(cmd); if (restoreLabel && (!bypassSecurityDriver) && virSecurityManagerRestoreSavedStateLabel(driver->securityManager, vm, path) < 0) -- 1.7.4

On Sat, Mar 26, 2011 at 06:52:29AM -0600, Eric Blake wrote:
This addresses the comments raised during v4: https://www.redhat.com/archives/libvir-list/2011-March/msg00421.html More comments in individual patches.
It could still use a bit more testing with root-squash NFS, and I'm also hitting a problem where if I run daemon/libvirtd myself, I get a SELinux error:
error: unable to set security context 'system_u:object_r:svirt_image_t:s0:c80,c237' on fd 23: Permission denied
but if I run the system service libvirtd or SELinux permissive, things work. Somehow, the attempt to set the fd SELinux label on a pipe is not working when libvirt is started as an unconfined process (that is, the fd has label unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023) but when started as a daemon, SELinux is happy to allow the transition. I suspect that this is a bug in SELinux, since my understanding is that it should always be possible to go from unconfined to something more restrictive, but we already proved that SELinux fd labelling is relatively unused and untested back when we first added it in commit 34a19dda.
If possible, I'd like to get this in before the 0.9.0 freeze, and we can fix any fallout from testing during the freeze week.
Okay, go ahead, 5 iterations is a lot already, and we will clean things up as they go later. Reviewing giant patch series ain't fun for anybody (wild guess on my part :-) , and reviewing the fixes is preferable now, ACK Daniel -- Daniel Veillard | libxml Gnome XML XSLT toolkit http://xmlsoft.org/ daniel@veillard.com | Rpmfind RPM search engine http://rpmfind.net/ http://veillard.com/ | virtualization library http://libvirt.org/

On 03/28/2011 07:01 AM, Daniel Veillard wrote:
On Sat, Mar 26, 2011 at 06:52:29AM -0600, Eric Blake wrote:
This addresses the comments raised during v4: https://www.redhat.com/archives/libvir-list/2011-March/msg00421.html More comments in individual patches.
It could still use a bit more testing with root-squash NFS, and I'm also hitting a problem where if I run daemon/libvirtd myself, I get a SELinux error:
error: unable to set security context 'system_u:object_r:svirt_image_t:s0:c80,c237' on fd 23: Permission denied
but if I run the system service libvirtd or SELinux permissive, things work. Somehow, the attempt to set the fd SELinux label on a pipe is not working when libvirt is started as an unconfined process (that is, the fd has label unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023) but when started as a daemon, SELinux is happy to allow the transition. I suspect that this is a bug in SELinux, since my understanding is that it should always be possible to go from unconfined to something more restrictive, but we already proved that SELinux fd labelling is relatively unused and untested back when we first added it in commit 34a19dda.
If possible, I'd like to get this in before the 0.9.0 freeze, and we can fix any fallout from testing during the freeze week.
Okay, go ahead, 5 iterations is a lot already, and we will clean things up as they go later. Reviewing giant patch series ain't fun for anybody (wild guess on my part :-) , and reviewing the fixes is preferable now,
ACK
Thanks. Series pushed, and I'm now trying to track down why I get that SELinux failure when run from an unconfined shell but not when run as a system service. -- Eric Blake eblake@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org
participants (2)
-
Daniel Veillard
-
Eric Blake