[libvirt] [PATCH v3 00/16]

This is an update of

  https://www.redhat.com/archives/libvir-list/2013-April/msg00352.html

Currently libvirt creates a cgroups hierarchy at

  $LOCATION-OF-LIBVIRTD/libvirt/{qemu,lxc}/$GUEST-NAME

eg

  /sys/fs/cgroup
  ├── blkio
  │   └── libvirt
  │       ├── lxc
  │       │   └── busy
  │       └── qemu
  │           └── vm1
  ├── cpu,cpuacct
  │   ├── libvirt
  │   │   ├── lxc
  │   │   │   └── busy
  │   │   └── qemu
  │   │       └── vm1
  │   │           ├── emulator
  │   │           └── vcpu0
  │   └── system
  │       ├── abrtd.service
  │       ....snip....
  │       └── upower.service
  ├── cpuset
  │   └── libvirt
  │       ├── lxc
  │       │   └── busy
  │       └── qemu
  │           └── vm1
  │               ├── emulator
  │               └── vcpu0
  ├── devices
  │   └── libvirt
  │       ├── lxc
  │       │   └── busy
  │       └── qemu
  │           └── vm1
  ├── freezer
  │   └── libvirt
  │       ├── lxc
  │       │   └── busy
  │       └── qemu
  │           └── vm1
  ├── memory
  │   └── libvirt
  │       ├── lxc
  │       │   └── busy
  │       └── qemu
  │           └── vm1
  ├── net_cls
  ├── perf_event

This series changes it so that libvirt creates cgroups at

  /system/$VMNAME.{qemu,lxc}.libvirt

and allows configuration of the "resource partition" (ie the
"/system" bit) via the XML. So we get a layout like this:

  /sys/fs/cgroup
  ├── blkio
  │   └── system
  │       ├── demo.lxc.libvirt
  │       └── vm1.qemu.libvirt
  ├── cpu,cpuacct
  │   └── system
  │       ├── abrtd.service
  │       ....snip....
  │       ├── demo.lxc.libvirt
  │       ....snip....
  │       └── vm1.qemu.libvirt
  │           ├── emulator
  │           └── vcpu0
  ├── cpuset
  │   └── system
  │       ├── demo.lxc.libvirt
  │       └── vm1.qemu.libvirt
  │           ├── emulator
  │           └── vcpu0
  ├── devices
  │   └── system
  │       ├── demo.lxc.libvirt
  │       └── vm1.qemu.libvirt
  ├── freezer
  │   └── system
  │       ├── demo.lxc.libvirt
  │       └── vm1.qemu.libvirt
  ├── memory
  │   └── system
  │       ├── demo.lxc.libvirt
  │       └── vm1.qemu.libvirt
  ├── net_cls
  ├── perf_event

Flattening out the libvirt created hierarchy has serious performance
wins, due to poor kernel scalability with deep hierarchies. It also
makes it easier to configure system wide policy for resource usage
across system services and virtual machines / containers, since they
all live at the top level in common resource partitions.

Changes since v2:

 - Merge previously ACKed patches
 - Incorporate Gao Feng's changes to LXC cgroup mount setup
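
To make the new naming scheme concrete, here is a minimal sketch of how
a path of the form /system/$VMNAME.{qemu,lxc}.libvirt could be composed.
The function demo_cgroup_path() and its parameters are hypothetical and
exist only for illustration; this is not the code the series adds to
libvirt, and it assumes the partition is passed in as a plain string
such as "/system".

```c
#include <stdio.h>

/*
 * Hypothetical helper (NOT libvirt API): build the per-guest cgroup
 * path "<partition>/<name>.<driver>.libvirt" into @buf.
 *
 * Returns 0 on success, -1 if the result does not fit in @buflen
 * bytes or formatting fails.
 */
static int
demo_cgroup_path(char *buf, size_t buflen,
                 const char *partition,  /* e.g. "/system" */
                 const char *name,       /* guest name, e.g. "vm1" */
                 const char *driver)     /* "qemu" or "lxc" */
{
    int n = snprintf(buf, buflen, "%s/%s.%s.libvirt",
                     partition, name, driver);

    /* snprintf reports the length it wanted; treat truncation as failure */
    if (n < 0 || (size_t)n >= buflen)
        return -1;
    return 0;
}
```

With partition "/system", name "vm1" and driver "qemu" this yields
/system/vm1.qemu.libvirt, matching the second tree above once the
per-controller mount point (eg /sys/fs/cgroup/memory) is prepended.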

From: "Daniel P. Berrange" <berrange@redhat.com>

Introduce a method virFileDeleteTree for recursively deleting
an entire directory tree

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
---
 src/libvirt_private.syms |  1 +
 src/util/virfile.c       | 78 ++++++++++++++++++++++++++++++++++++++++++++++++
 src/util/virfile.h       |  2 ++
 3 files changed, 81 insertions(+)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index 449696d..af13e50 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -1259,6 +1259,7 @@ virEventPollUpdateTimeout;
 
 # util/virfile.h
 virFileClose;
+virFileDeleteTree;
 virFileDirectFdFlag;
 virFileFclose;
 virFileFdopen;
diff --git a/src/util/virfile.c b/src/util/virfile.c
index 4a9fa81..4d338e1 100644
--- a/src/util/virfile.c
+++ b/src/util/virfile.c
@@ -644,3 +644,81 @@ int virFileLoopDeviceAssociate(const char *file,
 }
 
 #endif /* __linux__ */
+
+
+/**
+ * virFileDeleteTree:
+ *
+ * Recursively deletes all files / directories
+ * starting from the directory @dir. Does not
+ * follow symlinks
+ */
+int virFileDeleteTree(const char *dir)
+{
+    DIR *dh = opendir(dir);
+    struct dirent *de;
+    char *filepath = NULL;
+    int ret = -1;
+
+    if (!dh) {
+        virReportSystemError(errno, _("Cannot open dir '%s'"),
+                             dir);
+        return -1;
+    }
+
+    errno = 0;
+    while ((de = readdir(dh)) != NULL) {
+        struct stat sb;
+
+        if (STREQ(de->d_name, ".") ||
+            STREQ(de->d_name, ".."))
+            continue;
+
+        if (virAsprintf(&filepath, "%s/%s",
+                        dir, de->d_name) < 0) {
+            virReportOOMError();
+            goto cleanup;
+        }
+
+        if (lstat(filepath, &sb) < 0) {
+            virReportSystemError(errno, _("Cannot access '%s'"),
+                                 filepath);
+            goto cleanup;
+        }
+
+        if (S_ISDIR(sb.st_mode)) {
+            if (virFileDeleteTree(filepath) < 0)
+                goto cleanup;
+        } else {
+            if (unlink(filepath) < 0 && errno != ENOENT) {
+                virReportSystemError(errno,
+                                     _("Cannot delete file '%s'"),
+                                     filepath);
+                goto cleanup;
+            }
+        }
+
+        VIR_FREE(filepath);
+        errno = 0;
+    }
+
+    if (errno) {
+        virReportSystemError(errno, _("Cannot read dir '%s'"),
+                             dir);
+        goto cleanup;
+    }
+
+    if (rmdir(dir) < 0 && errno != ENOENT) {
+        virReportSystemError(errno,
+                             _("Cannot delete directory '%s'"),
+                             dir);
+        goto cleanup;
+    }
+
+    ret = 0;
+
+cleanup:
+    VIR_FREE(filepath);
+    closedir(dh);
+    return ret;
+}
diff --git a/src/util/virfile.h b/src/util/virfile.h
index c885b73..5f0dd2b 100644
--- a/src/util/virfile.h
+++ b/src/util/virfile.h
@@ -108,4 +108,6 @@ int virFileUpdatePerm(const char *path,
 int virFileLoopDeviceAssociate(const char *file,
                                char **dev);
 
+int virFileDeleteTree(const char *dir);
+
 #endif /* __VIR_FILES_H */
-- 
1.8.1.4

On 10.04.2013 12:08, Daniel P. Berrange wrote:
> From: "Daniel P. Berrange" <berrange@redhat.com>
>
> Introduce a method virFileDeleteTree for recursively deleting
> an entire directory tree
>
> Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
> ---
>  src/libvirt_private.syms |  1 +
>  src/util/virfile.c       | 78 ++++++++++++++++++++++++++++++++++++++++++++++++
>  src/util/virfile.h       |  2 ++
>  3 files changed, 81 insertions(+)

ACK

Michal

On 04/10/2013 04:08 AM, Daniel P. Berrange wrote:
> From: "Daniel P. Berrange" <berrange@redhat.com>
>
> Introduce a method virFileDeleteTree for recursively deleting
> an entire directory tree
>
> Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
> ---

> +
> +/**
> + * virFileDeleteTree:
> + *
> + * Recursively deletes all files / directories
> + * starting from the directory @dir. Does not
> + * follow symlinks

I would add a comment mentioning that this function is not efficient
to use on arbitrarily large or deep hierarchies. But I agree with
Michal's ACK.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

On Wed, Apr 10, 2013 at 11:25:10AM -0600, Eric Blake wrote:
> On 04/10/2013 04:08 AM, Daniel P. Berrange wrote:
> > From: "Daniel P. Berrange" <berrange@redhat.com>
> >
> > Introduce a method virFileDeleteTree for recursively deleting
> > an entire directory tree
> >
> > Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
> > ---
>
> > +
> > +/**
> > + * virFileDeleteTree:
> > + *
> > + * Recursively deletes all files / directories
> > + * starting from the directory @dir. Does not
> > + * follow symlinks
>
> I would add a comment mentioning that this function is not efficient
> to use on arbitrarily large or deep hierarchies. But I agree with
> Michal's ACK.

I added:

@@ -652,6 +652,11 @@ int virFileLoopDeviceAssociate(const char *file,
  * Recursively deletes all files / directories
  * starting from the directory @dir. Does not
  * follow symlinks
+ *
+ * NB the algorithm is not efficient, and is subject to
+ * race conditions which can be exploited by malicious
+ * code. It should not be used in any scenarios where
+ * performance is important, or security is critical.
  */

Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|
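
For readers wondering what a less race-prone variant might look like,
the following is a rough sketch and not part of this series. It assumes
a POSIX.1-2008 system and resolves every entry relative to an open
directory file descriptor (openat / fstatat / unlinkat) instead of
re-walking a path string, so a directory swapped for a symlink between
the stat and the descent is refused by O_NOFOLLOW rather than followed.
The name demo_delete_tree_at() is hypothetical, error reporting is
reduced to a -1 return, and it makes no claim to be efficient on large
trees either.

```c
#ifndef _POSIX_C_SOURCE
# define _POSIX_C_SOURCE 200809L
#endif
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (NOT libvirt API): recursively delete the
 * directory @name, looked up relative to the open directory
 * @parentfd (pass AT_FDCWD to resolve against the current directory).
 * Symlinks are removed, never followed. Returns 0 or -1.
 */
static int
demo_delete_tree_at(int parentfd, const char *name)
{
    /* O_NOFOLLOW | O_DIRECTORY: fail if @name is (now) a symlink */
    int fd = openat(parentfd, name,
                    O_RDONLY | O_DIRECTORY | O_NOFOLLOW | O_CLOEXEC);
    DIR *dh;
    struct dirent *de;
    int ret = -1;

    if (fd < 0)
        return -1;

    if (!(dh = fdopendir(fd))) {
        close(fd);
        return -1;
    }

    for (;;) {
        struct stat sb;

        errno = 0;
        if (!(de = readdir(dh))) {
            if (errno)
                goto cleanup;   /* read error */
            break;              /* end of directory */
        }

        if (strcmp(de->d_name, ".") == 0 ||
            strcmp(de->d_name, "..") == 0)
            continue;

        /* lstat() equivalent, relative to the directory we hold open */
        if (fstatat(fd, de->d_name, &sb, AT_SYMLINK_NOFOLLOW) < 0)
            goto cleanup;

        if (S_ISDIR(sb.st_mode)) {
            if (demo_delete_tree_at(fd, de->d_name) < 0)
                goto cleanup;
        } else if (unlinkat(fd, de->d_name, 0) < 0 && errno != ENOENT) {
            goto cleanup;
        }
    }

    ret = 0;

cleanup:
    closedir(dh);               /* also closes fd */
    if (ret == 0 &&
        unlinkat(parentfd, name, AT_REMOVEDIR) < 0 && errno != ENOENT)
        ret = -1;
    return ret;
}
```

A caller would invoke it as demo_delete_tree_at(AT_FDCWD, "/some/dir").
Note that it holds one file descriptor per directory level while it
recurses, so very deep trees can still run into the open-file limit.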

From: "Daniel P. Berrange" <berrange@redhat.com>

Instead of calling virCgroupForDomain every time we need
the virCgroupPtr instance, just do it once at VM startup
and cache a reference to the object in qemuDomainObjPrivatePtr
until shutdown of the VM. Removing the virCgroupPtr from
the QEMU driver state also means we don't have stale mount
info, if someone mounts the cgroups filesystem after libvirtd
has been started

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
---
 src/qemu/qemu_cgroup.c    | 283 +++++++++++++++------------------
 src/qemu/qemu_cgroup.h    |  22 +--
 src/qemu/qemu_conf.h      |   4 -
 src/qemu/qemu_domain.c    |   1 +
 src/qemu/qemu_domain.h    |   3 +
 src/qemu/qemu_driver.c    | 397 +++++++++++++++-------------------------------
 src/qemu/qemu_hotplug.c   |  53 +------
 src/qemu/qemu_migration.c |  25 +--
 src/qemu/qemu_process.c   |  13 +-
 9 files changed, 291 insertions(+), 510 deletions(-)

diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c
index 5aa9416..019aa2e 100644
--- a/src/qemu/qemu_cgroup.c
+++ b/src/qemu/qemu_cgroup.c
@@ -45,26 +45,21 @@ static const char *const defaultDeviceACL[] = {
 #define DEVICE_PTY_MAJOR 136
 #define DEVICE_SND_MAJOR 116
 
-bool qemuCgroupControllerActive(virQEMUDriverPtr driver,
-                                int controller)
-{
-    return virCgroupHasController(driver->cgroup, controller);
-}
-
 static int
 qemuSetupDiskPathAllow(virDomainDiskDefPtr disk,
                        const char *path,
                        size_t depth ATTRIBUTE_UNUSED,
                        void *opaque)
 {
-    qemuCgroupData *data = opaque;
+    virDomainObjPtr vm = opaque;
+    qemuDomainObjPrivatePtr priv = vm->privateData;
     int rc;
 
     VIR_DEBUG("Process path %s for disk", path);
-    rc = virCgroupAllowDevicePath(data->cgroup, path,
+    rc = virCgroupAllowDevicePath(priv->cgroup, path,
                                   (disk->readonly ? VIR_CGROUP_DEVICE_READ
                                    : VIR_CGROUP_DEVICE_RW));
-    virDomainAuditCgroupPath(data->vm, data->cgroup, "allow", path,
+    virDomainAuditCgroupPath(vm, priv->cgroup, "allow", path,
                              disk->readonly ? "r" : "rw", rc);
     if (rc < 0) {
         if (rc == -EACCES) { /* Get this for root squash NFS */
@@ -81,14 +76,18 @@ qemuSetupDiskPathAllow(virDomainDiskDefPtr disk,
 
 
 int qemuSetupDiskCgroup(virDomainObjPtr vm,
-                        virCgroupPtr cgroup,
                         virDomainDiskDefPtr disk)
 {
-    qemuCgroupData data = { vm, cgroup };
+    qemuDomainObjPrivatePtr priv = vm->privateData;
+
+    if (!virCgroupHasController(priv->cgroup,
+                                VIR_CGROUP_CONTROLLER_DEVICES))
+        return 0;
+
     return virDomainDiskDefForeachPath(disk,
                                        true,
                                        qemuSetupDiskPathAllow,
-                                       &data);
+                                       vm);
 }
 
 
@@ -98,13 +97,14 @@ qemuTeardownDiskPathDeny(virDomainDiskDefPtr disk ATTRIBUTE_UNUSED,
                          size_t depth ATTRIBUTE_UNUSED,
                          void *opaque)
 {
-    qemuCgroupData *data = opaque;
+    virDomainObjPtr vm = opaque;
+    qemuDomainObjPrivatePtr priv = vm->privateData;
     int rc;
 
     VIR_DEBUG("Process path %s for disk", path);
-    rc = virCgroupDenyDevicePath(data->cgroup, path,
+    rc = virCgroupDenyDevicePath(priv->cgroup, path,
                                  VIR_CGROUP_DEVICE_RWM);
-    virDomainAuditCgroupPath(data->vm, data->cgroup, "deny", path, "rwm", rc);
+    virDomainAuditCgroupPath(vm, priv->cgroup, "deny", path, "rwm", rc);
     if (rc < 0) {
         if (rc == -EACCES) { /* Get this for root squash NFS */
             VIR_DEBUG("Ignoring EACCES for %s", path);
@@ -120,14 +120,18 @@ qemuTeardownDiskPathDeny(virDomainDiskDefPtr disk ATTRIBUTE_UNUSED,
 
 
 int qemuTeardownDiskCgroup(virDomainObjPtr vm,
-                           virCgroupPtr cgroup,
                            virDomainDiskDefPtr disk)
 {
-    qemuCgroupData data = { vm, cgroup };
+    qemuDomainObjPrivatePtr priv = vm->privateData;
+
+    if (!virCgroupHasController(priv->cgroup,
+                                VIR_CGROUP_CONTROLLER_DEVICES))
+        return 0;
+
     return virDomainDiskDefForeachPath(disk,
                                        true,
                                        qemuTeardownDiskPathDeny,
-                                       &data);
+                                       vm);
 }
 
 
@@ -136,7 +140,8 @@ qemuSetupChardevCgroup(virDomainDefPtr def,
                        virDomainChrDefPtr dev,
                        void *opaque)
 {
-    qemuCgroupData *data = opaque;
+    virDomainObjPtr vm = opaque;
+    qemuDomainObjPrivatePtr priv = vm->privateData;
     int rc;
 
     if (dev->source.type != VIR_DOMAIN_CHR_TYPE_DEV)
@@ -144,9 +149,9 @@ qemuSetupChardevCgroup(virDomainDefPtr def,
 
 
     VIR_DEBUG("Process path '%s' for disk", dev->source.data.file.path);
-    rc = virCgroupAllowDevicePath(data->cgroup, dev->source.data.file.path,
+    rc = virCgroupAllowDevicePath(priv->cgroup, dev->source.data.file.path,
                                   VIR_CGROUP_DEVICE_RW);
-    virDomainAuditCgroupPath(data->vm, data->cgroup, "allow",
+    virDomainAuditCgroupPath(vm, priv->cgroup, "allow",
                              dev->source.data.file.path, "rw", rc);
     if (rc < 0) {
         virReportSystemError(-rc,
@@ -163,13 +168,14 @@ int qemuSetupHostUsbDeviceCgroup(virUSBDevicePtr dev ATTRIBUTE_UNUSED,
                                  const char *path,
                                  void *opaque)
 {
-    qemuCgroupData *data = opaque;
+    virDomainObjPtr vm = opaque;
+    qemuDomainObjPrivatePtr priv = vm->privateData;
     int rc;
 
     VIR_DEBUG("Process path '%s' for USB device", path);
-    rc = virCgroupAllowDevicePath(data->cgroup, path,
+    rc = virCgroupAllowDevicePath(priv->cgroup, path,
                                   VIR_CGROUP_DEVICE_RW);
-    virDomainAuditCgroupPath(data->vm, data->cgroup, "allow", path, "rw", rc);
+    virDomainAuditCgroupPath(vm, priv->cgroup, "allow", path, "rw", rc);
     if (rc < 0) {
         virReportSystemError(-rc,
                              _("Unable to allow device %s"),
@@ -180,34 +186,73 @@ int qemuSetupHostUsbDeviceCgroup(virUSBDevicePtr dev ATTRIBUTE_UNUSED,
     return 0;
 }
 
+
+int qemuInitCgroup(virQEMUDriverPtr driver,
+                   virDomainObjPtr vm)
+{
+    int rc;
+    qemuDomainObjPrivatePtr priv = vm->privateData;
+    virCgroupPtr driverGroup = NULL;
+    virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver);
+
+    virCgroupFree(&priv->cgroup);
+
+    rc = virCgroupForDriver("qemu", &driverGroup,
+                            cfg->privileged, true,
+                            cfg->cgroupControllers);
+    if (rc != 0) {
+        if (rc == -ENXIO ||
+            rc == -EPERM ||
+            rc == -EACCES) { /* No cgroups mounts == success */
+            VIR_DEBUG("No cgroups present/configured/accessible, ignoring error");
+            goto done;
+        }
+
+        virReportSystemError(-rc,
+                             _("Unable to create cgroup for %s"),
+                             vm->def->name);
+        goto cleanup;
+    }
+
+    rc = virCgroupForDomain(driverGroup, vm->def->name, &priv->cgroup, 1);
+    if (rc != 0) {
+        virReportSystemError(-rc,
+                             _("Unable to create cgroup for %s"),
+                             vm->def->name);
+        goto cleanup;
+    }
+
+done:
+    rc = 0;
+cleanup:
+    virCgroupFree(&driverGroup);
+    virObjectUnref(cfg);
+    return rc;
+}
+
+
 int qemuSetupCgroup(virQEMUDriverPtr driver,
                     virDomainObjPtr vm,
                     virBitmapPtr nodemask)
 {
-    virCgroupPtr cgroup = NULL;
-    int rc;
+    int rc = -1;
     unsigned int i;
     virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver);
+    qemuDomainObjPrivatePtr priv = vm->privateData;
     const char *const *deviceACL =
         cfg->cgroupDeviceACL ?
         (const char *const *)cfg->cgroupDeviceACL :
         defaultDeviceACL;
 
-    if (driver->cgroup == NULL)
-        goto done; /* Not supported, so claim success */
+    if (qemuInitCgroup(driver, vm) < 0)
+        return -1;
 
-    rc = virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 1);
-    if (rc != 0) {
-        virReportSystemError(-rc,
-                             _("Unable to create cgroup for %s"),
-                             vm->def->name);
-        goto cleanup;
-    }
+    if (!priv->cgroup)
+        goto done;
 
-    if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES)) {
-        qemuCgroupData data = { vm, cgroup };
-        rc = virCgroupDenyAllDevices(cgroup);
-        virDomainAuditCgroup(vm, cgroup, "deny", "all", rc == 0);
+    if (virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_DEVICES)) {
+        rc = virCgroupDenyAllDevices(priv->cgroup);
+        virDomainAuditCgroup(vm, priv->cgroup, "deny", "all", rc == 0);
         if (rc != 0) {
             if (rc == -EPERM) {
                 VIR_WARN("Group devices ACL is not accessible, disabling whitelisting");
@@ -220,13 +265,13 @@ int qemuSetupCgroup(virQEMUDriverPtr driver,
         }
 
         for (i = 0; i < vm->def->ndisks ; i++) {
-            if (qemuSetupDiskCgroup(vm, cgroup, vm->def->disks[i]) < 0)
+            if (qemuSetupDiskCgroup(vm,vm->def->disks[i]) < 0)
                 goto cleanup;
         }
 
-        rc = virCgroupAllowDeviceMajor(cgroup, 'c', DEVICE_PTY_MAJOR,
+        rc = virCgroupAllowDeviceMajor(priv->cgroup, 'c', DEVICE_PTY_MAJOR,
                                        VIR_CGROUP_DEVICE_RW);
-        virDomainAuditCgroupMajor(vm, cgroup, "allow", DEVICE_PTY_MAJOR,
+        virDomainAuditCgroupMajor(vm, priv->cgroup, "allow", DEVICE_PTY_MAJOR,
                                   "pty", "rw", rc == 0);
         if (rc != 0) {
             virReportSystemError(-rc, "%s",
@@ -239,9 +284,9 @@ int qemuSetupCgroup(virQEMUDriverPtr driver,
              ((vm->def->graphics[0]->type == VIR_DOMAIN_GRAPHICS_TYPE_VNC &&
                cfg->vncAllowHostAudio) ||
               (vm->def->graphics[0]->type == VIR_DOMAIN_GRAPHICS_TYPE_SDL)))) {
-            rc = virCgroupAllowDeviceMajor(cgroup, 'c', DEVICE_SND_MAJOR,
+            rc = virCgroupAllowDeviceMajor(priv->cgroup, 'c', DEVICE_SND_MAJOR,
                                            VIR_CGROUP_DEVICE_RW);
-            virDomainAuditCgroupMajor(vm, cgroup, "allow", DEVICE_SND_MAJOR,
+            virDomainAuditCgroupMajor(vm, priv->cgroup, "allow", DEVICE_SND_MAJOR,
                                       "sound", "rw", rc == 0);
             if (rc != 0) {
                 virReportSystemError(-rc, "%s",
@@ -257,9 +302,9 @@ int qemuSetupCgroup(virQEMUDriverPtr driver,
                 continue;
             }
 
-            rc = virCgroupAllowDevicePath(cgroup, deviceACL[i],
+            rc = virCgroupAllowDevicePath(priv->cgroup, deviceACL[i],
                                           VIR_CGROUP_DEVICE_RW);
-            virDomainAuditCgroupPath(vm, cgroup, "allow", deviceACL[i], "rw", rc);
+            virDomainAuditCgroupPath(vm, priv->cgroup, "allow", deviceACL[i], "rw", rc);
             if (rc < 0 &&
                 rc != -ENOENT) {
                 virReportSystemError(-rc,
@@ -272,7 +317,7 @@ int qemuSetupCgroup(virQEMUDriverPtr driver,
         if (virDomainChrDefForeach(vm->def,
                                    true,
                                    qemuSetupChardevCgroup,
-                                   &data) < 0)
+                                   vm) < 0)
             goto cleanup;
 
         for (i = 0; i < vm->def->nhostdevs; i++) {
@@ -292,7 +337,7 @@ int qemuSetupCgroup(virQEMUDriverPtr driver,
                 goto cleanup;
 
             if (virUSBDeviceFileIterate(usb, qemuSetupHostUsbDeviceCgroup,
-                                        &data) < 0) {
+                                        vm) < 0) {
                 virUSBDeviceFree(usb);
                 goto cleanup;
             }
@@ -301,8 +346,8 @@ int qemuSetupCgroup(virQEMUDriverPtr driver,
     }
 
     if (vm->def->blkio.weight != 0) {
-        if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_BLKIO)) {
-            rc = virCgroupSetBlkioWeight(cgroup, vm->def->blkio.weight);
+        if (virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_BLKIO)) {
+            rc = virCgroupSetBlkioWeight(priv->cgroup, vm->def->blkio.weight);
             if (rc != 0) {
                 virReportSystemError(-rc,
                                      _("Unable to set io weight for domain %s"),
@@ -317,12 +362,12 @@ int qemuSetupCgroup(virQEMUDriverPtr driver,
     }
 
     if (vm->def->blkio.ndevices) {
-        if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_BLKIO)) {
+        if (virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_BLKIO)) {
             for (i = 0; i < vm->def->blkio.ndevices; i++) {
                 virBlkioDeviceWeightPtr dw = &vm->def->blkio.devices[i];
                 if (!dw->weight)
                     continue;
-                rc = virCgroupSetBlkioDeviceWeight(cgroup, dw->path,
+                rc = virCgroupSetBlkioDeviceWeight(priv->cgroup, dw->path,
                                                    dw->weight);
                 if (rc != 0) {
                     virReportSystemError(-rc,
@@ -339,7 +384,7 @@ int qemuSetupCgroup(virQEMUDriverPtr driver,
         }
     }
 
-    if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_MEMORY)) {
+    if (virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_MEMORY)) {
         unsigned long long hard_limit = vm->def->mem.hard_limit;
 
         if (!hard_limit) {
@@ -357,7 +402,7 @@ int qemuSetupCgroup(virQEMUDriverPtr driver,
             hard_limit += vm->def->ndisks * 32768;
         }
 
-        rc = virCgroupSetMemoryHardLimit(cgroup, hard_limit);
+        rc = virCgroupSetMemoryHardLimit(priv->cgroup, hard_limit);
         if (rc != 0) {
             virReportSystemError(-rc,
                                  _("Unable to set memory hard limit for domain %s"),
@@ -365,7 +410,7 @@ int qemuSetupCgroup(virQEMUDriverPtr driver,
             goto cleanup;
         }
         if (vm->def->mem.soft_limit != 0) {
-            rc = virCgroupSetMemorySoftLimit(cgroup, vm->def->mem.soft_limit);
+            rc = virCgroupSetMemorySoftLimit(priv->cgroup, vm->def->mem.soft_limit);
             if (rc != 0) {
                 virReportSystemError(-rc,
                                      _("Unable to set memory soft limit for domain %s"),
@@ -375,7 +420,7 @@ int qemuSetupCgroup(virQEMUDriverPtr driver,
         }
 
         if (vm->def->mem.swap_hard_limit != 0) {
-            rc = virCgroupSetMemSwapHardLimit(cgroup, vm->def->mem.swap_hard_limit);
+            rc = virCgroupSetMemSwapHardLimit(priv->cgroup, vm->def->mem.swap_hard_limit);
             if (rc != 0) {
                 virReportSystemError(-rc,
                                      _("Unable to set swap hard limit for domain %s"),
@@ -393,8 +438,8 @@ int qemuSetupCgroup(virQEMUDriverPtr driver,
     }
 
     if (vm->def->cputune.shares != 0) {
-        if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPU)) {
-            rc = virCgroupSetCpuShares(cgroup, vm->def->cputune.shares);
+        if (virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU)) {
+            rc = virCgroupSetCpuShares(priv->cgroup, vm->def->cputune.shares);
             if (rc != 0) {
                 virReportSystemError(-rc,
                                      _("Unable to set io cpu shares for domain %s"),
@@ -411,7 +456,7 @@ int qemuSetupCgroup(virQEMUDriverPtr driver,
          (vm->def->numatune.memory.placement_mode ==
           VIR_NUMA_TUNE_MEM_PLACEMENT_MODE_AUTO)) &&
         vm->def->numatune.memory.mode == VIR_DOMAIN_NUMATUNE_MEM_STRICT &&
-        qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPUSET)) {
+        virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) {
         char *mask = NULL;
         if (vm->def->numatune.memory.placement_mode ==
             VIR_NUMA_TUNE_MEM_PLACEMENT_MODE_AUTO)
@@ -424,7 +469,7 @@ int qemuSetupCgroup(virQEMUDriverPtr driver,
             goto cleanup;
         }
 
-        rc = virCgroupSetCpusetMems(cgroup, mask);
+        rc = virCgroupSetCpusetMems(priv->cgroup, mask);
         VIR_FREE(mask);
         if (rc != 0) {
             virReportSystemError(-rc,
@@ -433,18 +478,12 @@ int qemuSetupCgroup(virQEMUDriverPtr driver,
             goto cleanup;
         }
     }
-done:
-    virObjectUnref(cfg);
-    virCgroupFree(&cgroup);
-    return 0;
 
+done:
+    rc = 0;
 cleanup:
     virObjectUnref(cfg);
-    if (cgroup) {
-        virCgroupRemove(cgroup);
-        virCgroupFree(&cgroup);
-    }
-    return -1;
+    return rc == 0 ? 0 : -1;
 }
 
 int qemuSetupCgroupVcpuBW(virCgroupPtr cgroup, unsigned long long period,
@@ -538,9 +577,8 @@ cleanup:
     return rc;
 }
 
-int qemuSetupCgroupForVcpu(virQEMUDriverPtr driver, virDomainObjPtr vm)
+int qemuSetupCgroupForVcpu(virDomainObjPtr vm)
 {
-    virCgroupPtr cgroup = NULL;
     virCgroupPtr cgroup_vcpu = NULL;
     qemuDomainObjPrivatePtr priv = vm->privateData;
     virDomainDefPtr def = vm->def;
@@ -550,8 +588,7 @@ int qemuSetupCgroupForVcpu(virQEMUDriverPtr driver, virDomainObjPtr vm)
     long long quota = vm->def->cputune.quota;
 
     if ((period || quota) &&
-        (!driver->cgroup ||
-         !qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPU))) {
+        !virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU)) {
         virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
                        _("cgroup cpu is required for scheduler tuning"));
         return -1;
@@ -561,28 +598,19 @@ int qemuSetupCgroupForVcpu(virQEMUDriverPtr driver, virDomainObjPtr vm)
      * with virProcessInfoSetAffinity, thus the lack of cgroups is not fatal
      * here.
      */
-    if (driver->cgroup == NULL)
+    if (priv->cgroup == NULL)
         return 0;
 
-    rc = virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0);
-    if (rc != 0) {
-        virReportSystemError(-rc,
-                             _("Unable to find cgroup for %s"),
-                             vm->def->name);
-        goto cleanup;
-    }
-
     if (priv->nvcpupids == 0 || priv->vcpupids[0] == vm->pid) {
         /* If we don't know VCPU<->PID mapping or all vcpu runs in the same
          * thread, we cannot control each vcpu.
          */
         VIR_WARN("Unable to get vcpus' pids.");
-        virCgroupFree(&cgroup);
         return 0;
     }
 
     for (i = 0; i < priv->nvcpupids; i++) {
-        rc = virCgroupForVcpu(cgroup, i, &cgroup_vcpu, 1);
+        rc = virCgroupForVcpu(priv->cgroup, i, &cgroup_vcpu, 1);
         if (rc < 0) {
             virReportSystemError(-rc,
                                  _("Unable to create vcpu cgroup for %s(vcpu:"
@@ -606,7 +634,7 @@ int qemuSetupCgroupForVcpu(virQEMUDriverPtr driver, virDomainObjPtr vm)
         }
 
         /* Set vcpupin in cgroup if vcpupin xml is provided */
-        if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPUSET)) {
+        if (virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) {
             /* find the right CPU to pin, otherwise
              * qemuSetupCgroupVcpuPin will fail. */
             for (j = 0; j < def->cputune.nvcpupin; j++) {
@@ -626,7 +654,6 @@ int qemuSetupCgroupForVcpu(virQEMUDriverPtr driver, virDomainObjPtr vm)
         virCgroupFree(&cgroup_vcpu);
     }
 
-    virCgroupFree(&cgroup);
     return 0;
 
 cleanup:
@@ -635,11 +662,6 @@ cleanup:
         virCgroupFree(&cgroup_vcpu);
     }
 
-    if (cgroup) {
-        virCgroupRemove(cgroup);
-        virCgroupFree(&cgroup);
-    }
-
     return -1;
 }
 
@@ -649,33 +671,24 @@ int qemuSetupCgroupForEmulator(virQEMUDriverPtr driver,
 {
     virBitmapPtr cpumask = NULL;
     virBitmapPtr cpumap = NULL;
-    virCgroupPtr cgroup = NULL;
     virCgroupPtr cgroup_emulator = NULL;
     virDomainDefPtr def = vm->def;
+    qemuDomainObjPrivatePtr priv = vm->privateData;
     unsigned long long period = vm->def->cputune.emulator_period;
     long long quota = vm->def->cputune.emulator_quota;
     int rc;
 
     if ((period || quota) &&
-        (!driver->cgroup ||
-         !qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPU))) {
+        !virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU)) {
         virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
                        _("cgroup cpu is required for scheduler tuning"));
         return -1;
     }
 
-    if (driver->cgroup == NULL)
+    if (priv->cgroup == NULL)
         return 0; /* Not supported, so claim success */
 
-    rc = virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0);
-    if (rc != 0) {
-        virReportSystemError(-rc,
-                             _("Unable to find cgroup for %s"),
-                             vm->def->name);
-        goto cleanup;
-    }
-
-    rc = virCgroupForEmulator(cgroup, &cgroup_emulator, 1);
+    rc = virCgroupForEmulator(priv->cgroup, &cgroup_emulator, 1);
     if (rc < 0) {
         virReportSystemError(-rc,
                              _("Unable to create emulator cgroup for %s"),
@@ -683,7 +696,7 @@ int qemuSetupCgroupForEmulator(virQEMUDriverPtr driver,
         goto cleanup;
     }
 
-    rc = virCgroupMoveTask(cgroup, cgroup_emulator);
+    rc = virCgroupMoveTask(priv->cgroup, cgroup_emulator);
     if (rc < 0) {
         virReportSystemError(-rc,
                              _("Unable to move tasks from domain cgroup to "
@@ -703,7 +716,7 @@ int qemuSetupCgroupForEmulator(virQEMUDriverPtr driver,
     }
 
     if (cpumask) {
-        if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPUSET)) {
+        if (virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) {
             rc = qemuSetupCgroupEmulatorPin(cgroup_emulator, cpumask);
             if (rc < 0)
                 goto cleanup;
@@ -712,7 +725,7 @@ int qemuSetupCgroupForEmulator(virQEMUDriverPtr driver,
     }
 
     if (period || quota) {
-        if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPU)) {
+        if (virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU)) {
             if ((rc = qemuSetupCgroupVcpuBW(cgroup_emulator, period,
                                             quota)) < 0)
                 goto cleanup;
@@ -720,7 +733,6 @@ int qemuSetupCgroupForEmulator(virQEMUDriverPtr driver,
     }
 
     virCgroupFree(&cgroup_emulator);
-    virCgroupFree(&cgroup);
     virBitmapFree(cpumap);
     return 0;
 
@@ -732,67 +744,34 @@ cleanup:
         virCgroupFree(&cgroup_emulator);
     }
 
-    if (cgroup) {
-        virCgroupRemove(cgroup);
-        virCgroupFree(&cgroup);
-    }
-
     return rc;
 }
 
-int qemuRemoveCgroup(virQEMUDriverPtr driver,
-                     virDomainObjPtr vm,
-                     int quiet)
+int qemuRemoveCgroup(virDomainObjPtr vm)
 {
-    virCgroupPtr cgroup;
-    int rc;
+    qemuDomainObjPrivatePtr priv = vm->privateData;
 
-    if (driver->cgroup == NULL)
+    if (priv->cgroup == NULL)
         return 0; /* Not supported, so claim success */
 
-    rc = virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0);
-    if (rc != 0) {
-        if (!quiet)
-            virReportError(VIR_ERR_INTERNAL_ERROR,
-                           _("Unable to find cgroup for %s"),
-                           vm->def->name);
-        return rc;
-    }
-
-    rc = virCgroupRemove(cgroup);
-    virCgroupFree(&cgroup);
-    return rc;
+    return virCgroupRemove(priv->cgroup);
 }
 
-int qemuAddToCgroup(virQEMUDriverPtr driver,
-                    virDomainDefPtr def)
+int qemuAddToCgroup(virDomainObjPtr vm)
 {
-    virCgroupPtr cgroup = NULL;
-    int ret = -1;
+    qemuDomainObjPrivatePtr priv = vm->privateData;
     int rc;
 
-    if (driver->cgroup == NULL)
+    if (priv->cgroup == NULL)
         return 0; /* Not supported, so claim success */
 
-    rc = virCgroupForDomain(driver->cgroup, def->name, &cgroup, 0);
-    if (rc != 0) {
-        virReportSystemError(-rc,
-                             _("unable to find cgroup for domain %s"),
-                             def->name);
-        goto cleanup;
-    }
-
-    rc = virCgroupAddTask(cgroup, getpid());
+    rc = virCgroupAddTask(priv->cgroup, getpid());
     if (rc != 0) {
         virReportSystemError(-rc,
                              _("unable to add domain %s task %d to cgroup"),
-                             def->name, getpid());
-        goto cleanup;
+                             vm->def->name, getpid());
+        return -1;
     }
 
-    ret = 0;
-
-cleanup:
-    virCgroupFree(&cgroup);
-    return ret;
+    return 0;
 }
diff --git a/src/qemu/qemu_cgroup.h b/src/qemu/qemu_cgroup.h
index a677d07..6cbfebc 100644
--- a/src/qemu/qemu_cgroup.h
+++ b/src/qemu/qemu_cgroup.h
@@ -25,26 +25,19 @@
 # define __QEMU_CGROUP_H__
 
 # include "virusb.h"
+# include "vircgroup.h"
 # include "domain_conf.h"
 # include "qemu_conf.h"
 
-struct _qemuCgroupData {
-    virDomainObjPtr vm;
-    virCgroupPtr cgroup;
-};
-typedef struct _qemuCgroupData qemuCgroupData;
-
-bool qemuCgroupControllerActive(virQEMUDriverPtr driver,
-                                int controller);
 int qemuSetupDiskCgroup(virDomainObjPtr vm,
-                        virCgroupPtr cgroup,
                         virDomainDiskDefPtr disk);
 int qemuTeardownDiskCgroup(virDomainObjPtr vm,
-                           virCgroupPtr cgroup,
                            virDomainDiskDefPtr disk);
 int qemuSetupHostUsbDeviceCgroup(virUSBDevicePtr dev,
                                  const char *path,
                                  void *opaque);
+int qemuInitCgroup(virQEMUDriverPtr driver,
+                   virDomainObjPtr vm);
 int qemuSetupCgroup(virQEMUDriverPtr driver,
                     virDomainObjPtr vm,
                     virBitmapPtr nodemask);
@@ -56,14 +49,11 @@ int qemuSetupCgroupVcpuPin(virCgroupPtr cgroup,
                            int nvcpupin,
                            int vcpuid);
 int qemuSetupCgroupEmulatorPin(virCgroupPtr cgroup, virBitmapPtr cpumask);
-int qemuSetupCgroupForVcpu(virQEMUDriverPtr driver, virDomainObjPtr vm);
+int qemuSetupCgroupForVcpu(virDomainObjPtr vm);
 int qemuSetupCgroupForEmulator(virQEMUDriverPtr driver,
                                virDomainObjPtr vm,
                                virBitmapPtr nodemask);
-int qemuRemoveCgroup(virQEMUDriverPtr driver,
-                     virDomainObjPtr vm,
-                     int quiet);
-int qemuAddToCgroup(virQEMUDriverPtr driver,
-                    virDomainDefPtr def);
+int qemuRemoveCgroup(virDomainObjPtr vm);
+int qemuAddToCgroup(virDomainObjPtr vm);
 
 #endif /* __QEMU_CGROUP_H__ */
diff --git a/src/qemu/qemu_conf.h b/src/qemu/qemu_conf.h
index 9ec993f..bac9bf7 100644
--- a/src/qemu/qemu_conf.h
+++ b/src/qemu/qemu_conf.h
@@ -34,7 +34,6 @@
 # include "domain_event.h"
 # include "virthread.h"
 # include "security/security_manager.h"
-# include "vircgroup.h"
 # include "virpci.h"
 # include "virusb.h"
 # include "cpu_conf.h"
@@ -164,9 +163,6 @@ struct _virQEMUDriver {
     /* Atomic increment only */
     int nextvmid;
 
-    /* Immutable pointer. Immutable object */
-    virCgroupPtr cgroup;
-
     /* Atomic inc/dec only */
     unsigned int nactive;
 
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index de9b8c7..a7aabdf 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -235,6 +235,7 @@ qemuDomainObjPrivateFree(void *data)
 
     virObjectUnref(priv->qemuCaps);
 
+    virCgroupFree(&priv->cgroup);
     qemuDomainPCIAddressSetFree(priv->pciaddrs);
     qemuDomainCCWAddressSetFree(priv->ccwaddrs);
     virDomainChrSourceDefFree(priv->monConfig);
diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
index 089ced0..da04377 100644
--- a/src/qemu/qemu_domain.h
+++ b/src/qemu/qemu_domain.h
@@ -25,6 +25,7 @@
 # define __QEMU_DOMAIN_H__
 
 # include "virthread.h"
+# include "vircgroup.h"
 # include "domain_conf.h"
 # include "snapshot_conf.h"
 # include "qemu_monitor.h"
@@ -165,6 +166,8 @@ struct _qemuDomainObjPrivate {
     qemuDomainCleanupCallback *cleanupCallbacks;
     size_t ncleanupCallbacks;
     size_t ncleanupCallbacks_max;
+
+    virCgroupPtr cgroup;
 };
 
 struct qemuDomainWatchdogEvent
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 2c0d7d1..420ae39 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -551,7 +551,6 @@ qemuStartup(bool privileged,
             void *opaque)
 {
     char *driverConf = NULL;
-    int rc;
     virConnectPtr conn = NULL;
     char ebuf[1024];
     char *membase = NULL;
@@ -628,13 +627,6 @@ qemuStartup(bool privileged,
         goto error;
     }
 
-    rc = virCgroupForDriver("qemu", &qemu_driver->cgroup, privileged, 1,
-                            cfg->cgroupControllers);
-    if (rc < 0) {
-        VIR_INFO("Unable to create cgroup for driver: %s",
-                 virStrerror(-rc, ebuf, sizeof(ebuf)));
-    }
-
     qemu_driver->qemuImgBinary = virFindFileInPath("kvm-img");
     if (!qemu_driver->qemuImgBinary)
         qemu_driver->qemuImgBinary = virFindFileInPath("qemu-img");
@@ -976,8 +968,6 @@ qemuShutdown(void) {
     /* Free domain callback list */
     virDomainEventStateFree(qemu_driver->domainEventState);
 
-    virCgroupFree(&qemu_driver->cgroup);
-
     virLockManagerPluginUnref(qemu_driver->lockManager);
 
     virMutexDestroy(&qemu_driver->lock);
@@ -3540,9 +3530,7 @@ static int qemuDomainHotplugVcpus(virQEMUDriverPtr driver,
     int vcpus = oldvcpus;
     pid_t *cpupids = NULL;
     int ncpupids;
-    virCgroupPtr cgroup = NULL;
     virCgroupPtr cgroup_vcpu = NULL;
-    bool cgroup_available = false;
 
     qemuDomainObjEnterMonitor(driver, vm);
 
@@ -3605,15 +3593,12 @@ static int qemuDomainHotplugVcpus(virQEMUDriverPtr driver,
         goto cleanup;
     }
 
-    cgroup_available = (virCgroupForDomain(driver->cgroup, vm->def->name,
-                                           &cgroup, 0) == 0);
-
     if (nvcpus > oldvcpus) {
         for (i = oldvcpus; i < nvcpus; i++) {
-            if (cgroup_available) {
+            if (priv->cgroup) {
                 int rv = -1;
                 /* Create cgroup for the onlined vcpu */
-                rv = virCgroupForVcpu(cgroup, i, &cgroup_vcpu, 1);
+                rv = virCgroupForVcpu(priv->cgroup, i, &cgroup_vcpu, 1);
                 if (rv < 0) {
                     virReportSystemError(-rv,
                                          _("Unable to create vcpu cgroup for %s(vcpu:"
@@ -3656,7 +3641,7 @@ static int qemuDomainHotplugVcpus(virQEMUDriverPtr driver,
                 vcpupin->vcpuid = i;
                 vm->def->cputune.vcpupin[vm->def->cputune.nvcpupin++] = vcpupin;
 
-                if (cgroup_available) {
+                if (cgroup_vcpu) {
                     if (qemuSetupCgroupVcpuPin(cgroup_vcpu,
                                                vm->def->cputune.vcpupin,
                                                vm->def->cputune.nvcpupin, i) < 0) {
@@ -3684,10 +3669,10 @@ static int qemuDomainHotplugVcpus(virQEMUDriverPtr driver,
         for (i = oldvcpus - 1; i >= nvcpus; i--) {
             virDomainVcpuPinDefPtr vcpupin = NULL;
 
-            if (cgroup_available) {
+            if (priv->cgroup) {
                 int rv = -1;
 
-                rv = virCgroupForVcpu(cgroup, i, &cgroup_vcpu, 0);
+                rv = virCgroupForVcpu(priv->cgroup, i, &cgroup_vcpu, 0);
                 if (rv < 0) {
                     virReportSystemError(-rv,
                                          _("Unable to access vcpu cgroup for %s(vcpu:"
@@ -3718,8 +3703,6 @@ cleanup:
     vm->def->vcpus = vcpus;
     VIR_FREE(cpupids);
     virDomainAuditVcpu(vm, oldvcpus, nvcpus, "update", rc == 1);
-    if (cgroup)
-        virCgroupFree(&cgroup);
     if (cgroup_vcpu)
         virCgroupFree(&cgroup_vcpu);
     return ret;
@@ -3834,7 +3817,6 @@ qemuDomainPinVcpuFlags(virDomainPtr dom,
     virQEMUDriverPtr driver = dom->conn->privateData;
     virDomainObjPtr vm;
     virDomainDefPtr persistentDef = NULL;
-    virCgroupPtr cgroup_dom = NULL;
     virCgroupPtr cgroup_vcpu = NULL;
     int ret = -1;
     qemuDomainObjPrivatePtr priv;
@@ -3916,9 +3898,8 @@ qemuDomainPinVcpuFlags(virDomainPtr dom,
         }
 
         /* Configure the corresponding cpuset cgroup before set affinity. */
-        if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPUSET)) {
-            if (virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup_dom, 0) == 0 &&
-                virCgroupForVcpu(cgroup_dom, vcpu, &cgroup_vcpu, 0) == 0 &&
+        if (virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) {
+            if (virCgroupForVcpu(priv->cgroup, vcpu, &cgroup_vcpu, 0) == 0 &&
                 qemuSetupCgroupVcpuPin(cgroup_vcpu, newVcpuPin, newVcpuPinNum, vcpu) < 0) {
                 virReportError(VIR_ERR_OPERATION_INVALID,
                                _("failed to set cpuset.cpus in cgroup"
@@ -3995,8 +3976,6 @@ qemuDomainPinVcpuFlags(virDomainPtr dom,
 cleanup:
     if (cgroup_vcpu)
         virCgroupFree(&cgroup_vcpu);
-    if (cgroup_dom)
-        virCgroupFree(&cgroup_dom);
     if (vm)
         virObjectUnlock(vm);
     virBitmapFree(pcpumap);
@@ -4107,7 +4086,6 @@ qemuDomainPinEmulator(virDomainPtr dom,
 {
     virQEMUDriverPtr driver = dom->conn->privateData;
     virDomainObjPtr vm;
-    virCgroupPtr cgroup_dom = NULL;
     virCgroupPtr cgroup_emulator = NULL;
     pid_t pid;
     virDomainDefPtr persistentDef = NULL;
@@ -4177,22 +4155,19 @@ qemuDomainPinEmulator(virDomainPtr dom,
                 goto cleanup;
             }
 
-            if (qemuCgroupControllerActive(driver,
-                                           VIR_CGROUP_CONTROLLER_CPUSET)) {
+            if (virCgroupHasController(priv->cgroup,
+                                       VIR_CGROUP_CONTROLLER_CPUSET)) {
                 /*
                  * Configure the corresponding cpuset cgroup.
                  * If no cgroup for domain or hypervisor exists, do nothing.
                  */
-                if (virCgroupForDomain(driver->cgroup, vm->def->name,
-                                       &cgroup_dom, 0) == 0) {
-                    if (virCgroupForEmulator(cgroup_dom, &cgroup_emulator, 0) == 0) {
-                        if (qemuSetupCgroupEmulatorPin(cgroup_emulator,
-                                                       newVcpuPin[0]->cpumask) < 0) {
-                            virReportError(VIR_ERR_OPERATION_INVALID, "%s",
-                                           _("failed to set cpuset.cpus in cgroup"
-                                             " for emulator threads"));
-                            goto cleanup;
-                        }
+                if (virCgroupForEmulator(priv->cgroup, &cgroup_emulator, 0) == 0) {
+                    if (qemuSetupCgroupEmulatorPin(cgroup_emulator,
+                                                   newVcpuPin[0]->cpumask) < 0) {
+                        virReportError(VIR_ERR_OPERATION_INVALID, "%s",
+                                       _("failed to set cpuset.cpus in cgroup"
+                                         " for emulator threads"));
+                        goto cleanup;
                     }
                 }
             } else {
@@ -4256,8 +4231,6 @@ qemuDomainPinEmulator(virDomainPtr dom,
 cleanup:
     if (cgroup_emulator)
         virCgroupFree(&cgroup_emulator);
-    if (cgroup_dom)
-        virCgroupFree(&cgroup_dom);
     virBitmapFree(pcpumap);
     virObjectUnref(caps);
     if (vm)
@@ -5750,16 +5723,8 @@ qemuDomainAttachDeviceDiskLive(virConnectPtr conn,
     if (qemuDomainDetermineDiskChain(driver, disk, false) < 0)
         goto end;
 
-    if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES)) {
-        if (virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0)) {
-            virReportError(VIR_ERR_INTERNAL_ERROR,
-                           _("Unable to find cgroup for %s"),
-                           vm->def->name);
-            goto end;
-        }
-        if (qemuSetupDiskCgroup(vm, cgroup, disk) < 0)
-            goto end;
-    }
+    if (qemuSetupDiskCgroup(vm, disk) < 0)
+        goto end;
 
     switch (disk->device) {
     case VIR_DOMAIN_DISK_DEVICE_CDROM:
@@ -5826,7 +5791,7 @@ qemuDomainAttachDeviceDiskLive(virConnectPtr conn,
     }
 
     if (ret != 0 && cgroup) {
-        if (qemuTeardownDiskCgroup(vm, cgroup, disk) < 0)
+        if (qemuTeardownDiskCgroup(vm, disk) < 0)
             VIR_WARN("Failed to teardown cgroup for disk path %s",
                      NULLSTR(disk->src));
     }
@@ -5834,8 +5799,6 @@ qemuDomainAttachDeviceDiskLive(virConnectPtr conn,
 end:
     if (ret != 0)
         ignore_value(qemuRemoveSharedDisk(driver, disk, vm->def->name));
-    if (cgroup)
-        virCgroupFree(&cgroup);
     virObjectUnref(caps);
virDomainDeviceDefFree(dev_copy); return ret; @@ -6019,7 +5982,6 @@ qemuDomainChangeDiskMediaLive(virConnectPtr conn, virDomainDiskDefPtr disk = dev->data.disk; virDomainDiskDefPtr orig_disk = NULL; virDomainDiskDefPtr tmp = NULL; - virCgroupPtr cgroup = NULL; virDomainDeviceDefPtr dev_copy = NULL; virCapsPtr caps = NULL; int ret = -1; @@ -6030,17 +5992,8 @@ qemuDomainChangeDiskMediaLive(virConnectPtr conn, if (qemuDomainDetermineDiskChain(driver, disk, false) < 0) goto end; - if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES)) { - if (virCgroupForDomain(driver->cgroup, - vm->def->name, &cgroup, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unable to find cgroup for %s"), - vm->def->name); - goto end; - } - if (qemuSetupDiskCgroup(vm, cgroup, disk) < 0) - goto end; - } + if (qemuSetupDiskCgroup(vm, disk) < 0) + goto end; switch (disk->device) { case VIR_DOMAIN_DISK_DEVICE_CDROM: @@ -6092,14 +6045,12 @@ qemuDomainChangeDiskMediaLive(virConnectPtr conn, break; } - if (ret != 0 && cgroup) { - if (qemuTeardownDiskCgroup(vm, cgroup, disk) < 0) - VIR_WARN("Failed to teardown cgroup for disk path %s", - NULLSTR(disk->src)); - } + if (ret != 0 && + qemuTeardownDiskCgroup(vm, disk) < 0) + VIR_WARN("Failed to teardown cgroup for disk path %s", + NULLSTR(disk->src)); + end: - if (cgroup) - virCgroupFree(&cgroup); virObjectUnref(caps); virDomainDeviceDefFree(dev_copy); return ret; @@ -6735,15 +6686,25 @@ static char *qemuGetSchedulerType(virDomainPtr dom, virQEMUDriverPtr driver = dom->conn->privateData; char *ret = NULL; int rc; + virDomainObjPtr vm = NULL; + qemuDomainObjPrivatePtr priv; + + vm = virDomainObjListFindByUUID(driver->domains, dom->uuid); + if (vm == NULL) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("No such domain %s"), dom->uuid); + goto cleanup; + } + priv = vm->privateData; - if (!qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPU)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU)) { 
virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("cgroup CPU controller is not mounted")); goto cleanup; } if (nparams) { - rc = qemuGetCpuBWStatus(driver->cgroup); + rc = qemuGetCpuBWStatus(priv->cgroup); if (rc < 0) goto cleanup; else if (rc == 0) @@ -6757,6 +6718,8 @@ static char *qemuGetSchedulerType(virDomainPtr dom, virReportOOMError(); cleanup: + if (vm) + virObjectUnlock(vm); return ret; } @@ -6896,12 +6859,12 @@ qemuDomainSetBlkioParameters(virDomainPtr dom, { virQEMUDriverPtr driver = dom->conn->privateData; int i; - virCgroupPtr group = NULL; virDomainObjPtr vm = NULL; virDomainDefPtr persistentDef = NULL; int ret = -1; virQEMUDriverConfigPtr cfg = NULL; virCapsPtr caps = NULL; + qemuDomainObjPrivatePtr priv; virCheckFlags(VIR_DOMAIN_AFFECT_LIVE | VIR_DOMAIN_AFFECT_CONFIG, -1); @@ -6919,6 +6882,7 @@ qemuDomainSetBlkioParameters(virDomainPtr dom, _("No such domain %s"), dom->uuid); goto cleanup; } + priv = vm->privateData; cfg = virQEMUDriverGetConfig(driver); if (!(caps = virQEMUDriverGetCapabilities(driver, false))) goto cleanup; @@ -6928,18 +6892,11 @@ qemuDomainSetBlkioParameters(virDomainPtr dom, goto cleanup; if (flags & VIR_DOMAIN_AFFECT_LIVE) { - if (!qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_BLKIO)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_BLKIO)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("blkio cgroup isn't mounted")); goto cleanup; } - - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), - vm->def->name); - goto cleanup; - } } ret = 0; @@ -6956,7 +6913,7 @@ qemuDomainSetBlkioParameters(virDomainPtr dom, continue; } - rc = virCgroupSetBlkioWeight(group, params[i].value.ui); + rc = virCgroupSetBlkioWeight(priv->cgroup, params[i].value.ui); if (rc != 0) { virReportSystemError(-rc, "%s", _("unable to set blkio weight tunable")); @@ -6974,7 +6931,7 @@ qemuDomainSetBlkioParameters(virDomainPtr 
dom, continue; } for (j = 0; j < ndevices; j++) { - rc = virCgroupSetBlkioDeviceWeight(group, + rc = virCgroupSetBlkioDeviceWeight(priv->cgroup, devices[j].path, devices[j].weight); if (rc < 0) { @@ -7037,7 +6994,6 @@ qemuDomainSetBlkioParameters(virDomainPtr dom, } cleanup: - virCgroupFree(&group); if (vm) virObjectUnlock(vm); virObjectUnref(caps); @@ -7053,13 +7009,13 @@ qemuDomainGetBlkioParameters(virDomainPtr dom, { virQEMUDriverPtr driver = dom->conn->privateData; int i, j; - virCgroupPtr group = NULL; virDomainObjPtr vm = NULL; virDomainDefPtr persistentDef = NULL; unsigned int val; int ret = -1; int rc; virCapsPtr caps = NULL; + qemuDomainObjPrivatePtr priv; virCheckFlags(VIR_DOMAIN_AFFECT_LIVE | VIR_DOMAIN_AFFECT_CONFIG | @@ -7077,6 +7033,7 @@ qemuDomainGetBlkioParameters(virDomainPtr dom, _("No such domain %s"), dom->uuid); goto cleanup; } + priv = vm->privateData; if (!(caps = virQEMUDriverGetCapabilities(driver, false))) goto cleanup; @@ -7093,17 +7050,11 @@ qemuDomainGetBlkioParameters(virDomainPtr dom, goto cleanup; if (flags & VIR_DOMAIN_AFFECT_LIVE) { - if (!qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_BLKIO)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_BLKIO)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("blkio cgroup isn't mounted")); goto cleanup; } - - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), vm->def->name); - goto cleanup; - } } if (flags & VIR_DOMAIN_AFFECT_LIVE) { @@ -7113,7 +7064,7 @@ qemuDomainGetBlkioParameters(virDomainPtr dom, switch (i) { case 0: /* fill blkio weight here */ - rc = virCgroupGetBlkioWeight(group, &val); + rc = virCgroupGetBlkioWeight(priv->cgroup, &val); if (rc != 0) { virReportSystemError(-rc, "%s", _("unable to get blkio weight")); @@ -7226,8 +7177,6 @@ qemuDomainGetBlkioParameters(virDomainPtr dom, ret = 0; cleanup: - if (group) - virCgroupFree(&group); if (vm) 
virObjectUnlock(vm); virObjectUnref(caps); @@ -7242,7 +7191,6 @@ qemuDomainSetMemoryParameters(virDomainPtr dom, { virQEMUDriverPtr driver = dom->conn->privateData; virDomainDefPtr persistentDef = NULL; - virCgroupPtr group = NULL; virDomainObjPtr vm = NULL; unsigned long long swap_hard_limit; unsigned long long memory_hard_limit; @@ -7254,6 +7202,7 @@ qemuDomainSetMemoryParameters(virDomainPtr dom, int ret = -1; int rc; virCapsPtr caps = NULL; + qemuDomainObjPrivatePtr priv; virCheckFlags(VIR_DOMAIN_AFFECT_LIVE | VIR_DOMAIN_AFFECT_CONFIG, -1); @@ -7272,6 +7221,7 @@ qemuDomainSetMemoryParameters(virDomainPtr dom, if (!(vm = qemuDomObjFromDomain(dom))) return -1; + priv = vm->privateData; cfg = virQEMUDriverGetConfig(driver); if (!(caps = virQEMUDriverGetCapabilities(driver, false))) @@ -7282,17 +7232,11 @@ qemuDomainSetMemoryParameters(virDomainPtr dom, goto cleanup; if (flags & VIR_DOMAIN_AFFECT_LIVE) { - if (!qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_MEMORY)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_MEMORY)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("cgroup memory controller is not mounted")); goto cleanup; } - - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), vm->def->name); - goto cleanup; - } } #define VIR_GET_LIMIT_PARAMETER(PARAM, VALUE) \ @@ -7320,7 +7264,7 @@ qemuDomainSetMemoryParameters(virDomainPtr dom, if (set_swap_hard_limit) { if (flags & VIR_DOMAIN_AFFECT_LIVE) { - if ((rc = virCgroupSetMemSwapHardLimit(group, swap_hard_limit)) < 0) { + if ((rc = virCgroupSetMemSwapHardLimit(priv->cgroup, swap_hard_limit)) < 0) { virReportSystemError(-rc, "%s", _("unable to set memory swap_hard_limit tunable")); goto cleanup; @@ -7334,7 +7278,7 @@ qemuDomainSetMemoryParameters(virDomainPtr dom, if (set_memory_hard_limit) { if (flags & VIR_DOMAIN_AFFECT_LIVE) { - if ((rc = virCgroupSetMemoryHardLimit(group, 
memory_hard_limit)) < 0) { + if ((rc = virCgroupSetMemoryHardLimit(priv->cgroup, memory_hard_limit)) < 0) { virReportSystemError(-rc, "%s", _("unable to set memory hard_limit tunable")); goto cleanup; @@ -7348,7 +7292,7 @@ qemuDomainSetMemoryParameters(virDomainPtr dom, if (set_memory_soft_limit) { if (flags & VIR_DOMAIN_AFFECT_LIVE) { - if ((rc = virCgroupSetMemorySoftLimit(group, memory_soft_limit)) < 0) { + if ((rc = virCgroupSetMemorySoftLimit(priv->cgroup, memory_soft_limit)) < 0) { virReportSystemError(-rc, "%s", _("unable to set memory soft_limit tunable")); goto cleanup; @@ -7367,7 +7311,6 @@ qemuDomainSetMemoryParameters(virDomainPtr dom, ret = 0; cleanup: - virCgroupFree(&group); virObjectUnlock(vm); virObjectUnref(caps); virObjectUnref(cfg); @@ -7382,12 +7325,12 @@ qemuDomainGetMemoryParameters(virDomainPtr dom, { virQEMUDriverPtr driver = dom->conn->privateData; int i; - virCgroupPtr group = NULL; virDomainObjPtr vm = NULL; virDomainDefPtr persistentDef = NULL; int ret = -1; int rc; virCapsPtr caps = NULL; + qemuDomainObjPrivatePtr priv; virCheckFlags(VIR_DOMAIN_AFFECT_LIVE | VIR_DOMAIN_AFFECT_CONFIG | @@ -7404,6 +7347,7 @@ qemuDomainGetMemoryParameters(virDomainPtr dom, goto cleanup; } + priv = vm->privateData; if (!(caps = virQEMUDriverGetCapabilities(driver, false))) goto cleanup; @@ -7412,17 +7356,11 @@ qemuDomainGetMemoryParameters(virDomainPtr dom, goto cleanup; if (flags & VIR_DOMAIN_AFFECT_LIVE) { - if (!qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_MEMORY)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_MEMORY)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("cgroup memory controller is not mounted")); goto cleanup; } - - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), vm->def->name); - goto cleanup; - } } if ((*nparams) == 0) { @@ -7473,12 +7411,9 @@ qemuDomainGetMemoryParameters(virDomainPtr dom, 
virTypedParameterPtr param = &params[i]; unsigned long long val = 0; - /* Coverity does not realize that if we get here, group is set. */ - sa_assert(group); - switch (i) { case 0: /* fill memory hard limit here */ - rc = virCgroupGetMemoryHardLimit(group, &val); + rc = virCgroupGetMemoryHardLimit(priv->cgroup, &val); if (rc != 0) { virReportSystemError(-rc, "%s", _("unable to get memory hard limit")); @@ -7491,7 +7426,7 @@ qemuDomainGetMemoryParameters(virDomainPtr dom, break; case 1: /* fill memory soft limit here */ - rc = virCgroupGetMemorySoftLimit(group, &val); + rc = virCgroupGetMemorySoftLimit(priv->cgroup, &val); if (rc != 0) { virReportSystemError(-rc, "%s", _("unable to get memory soft limit")); @@ -7504,7 +7439,7 @@ qemuDomainGetMemoryParameters(virDomainPtr dom, break; case 2: /* fill swap hard limit here */ - rc = virCgroupGetMemSwapHardLimit(group, &val); + rc = virCgroupGetMemSwapHardLimit(priv->cgroup, &val); if (rc != 0) { virReportSystemError(-rc, "%s", _("unable to get swap hard limit")); @@ -7528,8 +7463,6 @@ out: ret = 0; cleanup: - if (group) - virCgroupFree(&group); if (vm) virObjectUnlock(vm); virObjectUnref(caps); @@ -7545,11 +7478,11 @@ qemuDomainSetNumaParameters(virDomainPtr dom, virQEMUDriverPtr driver = dom->conn->privateData; int i; virDomainDefPtr persistentDef = NULL; - virCgroupPtr group = NULL; virDomainObjPtr vm = NULL; int ret = -1; virQEMUDriverConfigPtr cfg = NULL; virCapsPtr caps = NULL; + qemuDomainObjPrivatePtr priv; virCheckFlags(VIR_DOMAIN_AFFECT_LIVE | VIR_DOMAIN_AFFECT_CONFIG, -1); @@ -7568,6 +7501,7 @@ qemuDomainSetNumaParameters(virDomainPtr dom, _("No such domain %s"), dom->uuid); goto cleanup; } + priv = vm->privateData; cfg = virQEMUDriverGetConfig(driver); if (!(caps = virQEMUDriverGetCapabilities(driver, false))) @@ -7578,18 +7512,11 @@ qemuDomainSetNumaParameters(virDomainPtr dom, goto cleanup; if (flags & VIR_DOMAIN_AFFECT_LIVE) { - if (!qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPUSET)) { + if
(!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("cgroup cpuset controller is not mounted")); goto cleanup; } - - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), - vm->def->name); - goto cleanup; - } } ret = 0; @@ -7642,7 +7569,7 @@ qemuDomainSetNumaParameters(virDomainPtr dom, continue; } - if ((rc = virCgroupSetCpusetMems(group, nodeset_str) != 0)) { + if ((rc = virCgroupSetCpusetMems(priv->cgroup, nodeset_str) != 0)) { virReportSystemError(-rc, "%s", _("unable to set numa tunable")); virBitmapFree(nodeset); @@ -7682,7 +7609,6 @@ qemuDomainSetNumaParameters(virDomainPtr dom, } cleanup: - virCgroupFree(&group); if (vm) virObjectUnlock(vm); virObjectUnref(caps); @@ -7698,13 +7624,13 @@ qemuDomainGetNumaParameters(virDomainPtr dom, { virQEMUDriverPtr driver = dom->conn->privateData; int i; - virCgroupPtr group = NULL; virDomainObjPtr vm = NULL; virDomainDefPtr persistentDef = NULL; char *nodeset = NULL; int ret = -1; int rc; virCapsPtr caps = NULL; + qemuDomainObjPrivatePtr priv; virCheckFlags(VIR_DOMAIN_AFFECT_LIVE | VIR_DOMAIN_AFFECT_CONFIG | @@ -7722,6 +7648,7 @@ qemuDomainGetNumaParameters(virDomainPtr dom, _("No such domain %s"), dom->uuid); goto cleanup; } + priv = vm->privateData; if (!(caps = virQEMUDriverGetCapabilities(driver, false))) goto cleanup; @@ -7737,18 +7664,11 @@ qemuDomainGetNumaParameters(virDomainPtr dom, } if (flags & VIR_DOMAIN_AFFECT_LIVE) { - if (!qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_MEMORY)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_MEMORY)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("cgroup memory controller is not mounted")); goto cleanup; } - - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), - 
vm->def->name); - goto cleanup; - } } for (i = 0; i < QEMU_NB_NUMA_PARAM && i < *nparams; i++) { @@ -7771,7 +7691,7 @@ qemuDomainGetNumaParameters(virDomainPtr dom, if (!nodeset) nodeset = strdup(""); } else { - rc = virCgroupGetCpusetMems(group, &nodeset); + rc = virCgroupGetCpusetMems(priv->cgroup, &nodeset); if (rc != 0) { virReportSystemError(-rc, "%s", _("unable to get numa nodeset")); @@ -7798,7 +7718,6 @@ qemuDomainGetNumaParameters(virDomainPtr dom, cleanup: VIR_FREE(nodeset); - virCgroupFree(&group); if (vm) virObjectUnlock(vm); virObjectUnref(caps); @@ -7906,6 +7825,7 @@ qemuSetSchedulerParametersFlags(virDomainPtr dom, int rc; virQEMUDriverConfigPtr cfg = NULL; virCapsPtr caps = NULL; + qemuDomainObjPrivatePtr priv; virCheckFlags(VIR_DOMAIN_AFFECT_LIVE | VIR_DOMAIN_AFFECT_CONFIG, -1); @@ -7931,6 +7851,7 @@ qemuSetSchedulerParametersFlags(virDomainPtr dom, goto cleanup; } + priv = vm->privateData; cfg = virQEMUDriverGetConfig(driver); if (!(caps = virQEMUDriverGetCapabilities(driver, false))) @@ -7948,17 +7869,11 @@ qemuSetSchedulerParametersFlags(virDomainPtr dom, } if (flags & VIR_DOMAIN_AFFECT_LIVE) { - if (!qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPU)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("cgroup CPU controller is not mounted")); goto cleanup; } - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), - vm->def->name); - goto cleanup; - } } for (i = 0; i < nparams; i++) { @@ -7968,7 +7883,7 @@ qemuSetSchedulerParametersFlags(virDomainPtr dom, if (STREQ(param->field, VIR_DOMAIN_SCHEDULER_CPU_SHARES)) { if (flags & VIR_DOMAIN_AFFECT_LIVE) { - if ((rc = virCgroupSetCpuShares(group, value_ul))) { + if ((rc = virCgroupSetCpuShares(priv->cgroup, value_ul))) { virReportSystemError(-rc, "%s", _("unable to set cpu shares tunable")); goto cleanup; @@ -8054,7 
+7969,6 @@ qemuSetSchedulerParametersFlags(virDomainPtr dom, cleanup: virDomainDefFree(vmdef); - virCgroupFree(&group); if (vm) virObjectUnlock(vm); virObjectUnref(caps); @@ -8098,7 +8012,7 @@ qemuGetVcpuBWLive(virCgroupPtr cgroup, unsigned long long *period, } static int -qemuGetVcpusBWLive(virDomainObjPtr vm, virCgroupPtr cgroup, +qemuGetVcpusBWLive(virDomainObjPtr vm, unsigned long long *period, long long *quota) { virCgroupPtr cgroup_vcpu = NULL; @@ -8109,7 +8023,7 @@ qemuGetVcpusBWLive(virDomainObjPtr vm, virCgroupPtr cgroup, priv = vm->privateData; if (priv->nvcpupids == 0 || priv->vcpupids[0] == vm->pid) { /* We do not create sub dir for each vcpu */ - rc = qemuGetVcpuBWLive(cgroup, period, quota); + rc = qemuGetVcpuBWLive(priv->cgroup, period, quota); if (rc < 0) goto cleanup; @@ -8119,7 +8033,7 @@ qemuGetVcpusBWLive(virDomainObjPtr vm, virCgroupPtr cgroup, } /* get period and quota for vcpu0 */ - rc = virCgroupForVcpu(cgroup, 0, &cgroup_vcpu, 0); + rc = virCgroupForVcpu(priv->cgroup, 0, &cgroup_vcpu, 0); if (!cgroup_vcpu) { virReportSystemError(-rc, _("Unable to find vcpu cgroup for %s(vcpu: 0)"), @@ -8183,7 +8097,6 @@ qemuGetSchedulerParametersFlags(virDomainPtr dom, unsigned int flags) { virQEMUDriverPtr driver = dom->conn->privateData; - virCgroupPtr group = NULL; virDomainObjPtr vm = NULL; unsigned long long shares; unsigned long long period; @@ -8196,6 +8109,7 @@ qemuGetSchedulerParametersFlags(virDomainPtr dom, int saved_nparams = 0; virDomainDefPtr persistentDef; virCapsPtr caps = NULL; + qemuDomainObjPrivatePtr priv; virCheckFlags(VIR_DOMAIN_AFFECT_LIVE | VIR_DOMAIN_AFFECT_CONFIG | @@ -8204,13 +8118,6 @@ qemuGetSchedulerParametersFlags(virDomainPtr dom, /* We don't return strings, and thus trivially support this flag. 
*/ flags &= ~VIR_TYPED_PARAM_STRING_OKAY; - if (*nparams > 1) { - rc = qemuGetCpuBWStatus(driver->cgroup); - if (rc < 0) - goto cleanup; - cpu_bw_status = !!rc; - } - vm = virDomainObjListFindByUUID(driver->domains, dom->uuid); if (vm == NULL) { @@ -8219,6 +8126,15 @@ qemuGetSchedulerParametersFlags(virDomainPtr dom, goto cleanup; } + priv = vm->privateData; + + if (*nparams > 1) { + rc = qemuGetCpuBWStatus(priv->cgroup); + if (rc < 0) + goto cleanup; + cpu_bw_status = !!rc; + } + if (!(caps = virQEMUDriverGetCapabilities(driver, false))) goto cleanup; @@ -8237,19 +8153,13 @@ qemuGetSchedulerParametersFlags(virDomainPtr dom, goto out; } - if (!qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPU)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("cgroup CPU controller is not mounted")); goto cleanup; } - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), vm->def->name); - goto cleanup; - } - - rc = virCgroupGetCpuShares(group, &shares); + rc = virCgroupGetCpuShares(priv->cgroup, &shares); if (rc != 0) { virReportSystemError(-rc, "%s", _("unable to get cpu shares tunable")); @@ -8257,13 +8167,13 @@ qemuGetSchedulerParametersFlags(virDomainPtr dom, } if (*nparams > 1 && cpu_bw_status) { - rc = qemuGetVcpusBWLive(vm, group, &period, &quota); + rc = qemuGetVcpusBWLive(vm, &period, &quota); if (rc != 0) goto cleanup; } if (*nparams > 3 && cpu_bw_status) { - rc = qemuGetEmulatorBandwidthLive(vm, group, &emulator_period, + rc = qemuGetEmulatorBandwidthLive(vm, priv->cgroup, &emulator_period, &emulator_quota); if (rc != 0) goto cleanup; @@ -8316,7 +8226,6 @@ out: ret = 0; cleanup: - virCgroupFree(&group); if (vm) virObjectUnlock(vm); virObjectUnref(caps); @@ -8712,7 +8621,6 @@ qemuDomainSetInterfaceParameters(virDomainPtr dom, { virQEMUDriverPtr driver = dom->conn->privateData; int i; -
virCgroupPtr group = NULL; virDomainObjPtr vm = NULL; virDomainDefPtr persistentDef = NULL; int ret = -1; @@ -8876,7 +8784,6 @@ qemuDomainSetInterfaceParameters(virDomainPtr dom, cleanup: virNetDevBandwidthFree(bandwidth); virNetDevBandwidthFree(newBandwidth); - virCgroupFree(&group); if (vm) virObjectUnlock(vm); virObjectUnref(caps); @@ -8893,7 +8800,6 @@ qemuDomainGetInterfaceParameters(virDomainPtr dom, { virQEMUDriverPtr driver = dom->conn->privateData; int i; - virCgroupPtr group = NULL; virDomainObjPtr vm = NULL; virDomainDefPtr def = NULL; virDomainDefPtr persistentDef = NULL; @@ -9000,8 +8906,6 @@ qemuDomainGetInterfaceParameters(virDomainPtr dom, ret = 0; cleanup: - if (group) - virCgroupFree(&group); if (vm) virObjectUnlock(vm); virObjectUnref(caps); @@ -10607,7 +10511,6 @@ typedef enum { static int qemuDomainPrepareDiskChainElement(virQEMUDriverPtr driver, virDomainObjPtr vm, - virCgroupPtr cgroup, virDomainDiskDefPtr disk, const char *file, qemuDomainDiskChainMode mode) @@ -10631,13 +10534,13 @@ qemuDomainPrepareDiskChainElement(virQEMUDriverPtr driver, if (virSecurityManagerRestoreImageLabel(driver->securityManager, vm->def, disk) < 0) VIR_WARN("Unable to restore security label on %s", disk->src); - if (cgroup && qemuTeardownDiskCgroup(vm, cgroup, disk) < 0) + if (qemuTeardownDiskCgroup(vm, disk) < 0) VIR_WARN("Failed to teardown cgroup for disk path %s", disk->src); if (virDomainLockDiskDetach(driver->lockManager, vm, disk) < 0) VIR_WARN("Unable to release lock on %s", disk->src); } else if (virDomainLockDiskAttach(driver->lockManager, cfg->uri, vm, disk) < 0 || - (cgroup && qemuSetupDiskCgroup(vm, cgroup, disk) < 0) || + qemuSetupDiskCgroup(vm, disk) < 0 || virSecurityManagerSetImageLabel(driver->securityManager, vm->def, disk) < 0) { goto cleanup; @@ -11073,7 +10976,6 @@ cleanup: static int qemuDomainSnapshotCreateSingleDiskActive(virQEMUDriverPtr driver, virDomainObjPtr vm, - virCgroupPtr cgroup, virDomainSnapshotDiskDefPtr snap, 
virDomainDiskDefPtr disk, virDomainDiskDefPtr persistDisk, @@ -11123,9 +11025,9 @@ qemuDomainSnapshotCreateSingleDiskActive(virQEMUDriverPtr driver, virStorageFileFreeMetadata(disk->backingChain); disk->backingChain = NULL; - if (qemuDomainPrepareDiskChainElement(driver, vm, cgroup, disk, source, + if (qemuDomainPrepareDiskChainElement(driver, vm, disk, source, VIR_DISK_CHAIN_READ_WRITE) < 0) { - qemuDomainPrepareDiskChainElement(driver, vm, cgroup, disk, source, + qemuDomainPrepareDiskChainElement(driver, vm, disk, source, VIR_DISK_CHAIN_NO_ACCESS); goto cleanup; } @@ -11167,7 +11069,6 @@ cleanup: static void qemuDomainSnapshotUndoSingleDiskActive(virQEMUDriverPtr driver, virDomainObjPtr vm, - virCgroupPtr cgroup, virDomainDiskDefPtr origdisk, virDomainDiskDefPtr disk, virDomainDiskDefPtr persistDisk, @@ -11184,7 +11085,7 @@ qemuDomainSnapshotUndoSingleDiskActive(virQEMUDriverPtr driver, goto cleanup; } - qemuDomainPrepareDiskChainElement(driver, vm, cgroup, disk, origdisk->src, + qemuDomainPrepareDiskChainElement(driver, vm, disk, origdisk->src, VIR_DISK_CHAIN_NO_ACCESS); if (need_unlink && stat(disk->src, &st) == 0 && S_ISREG(st.st_mode) && unlink(disk->src) < 0) @@ -11221,7 +11122,6 @@ qemuDomainSnapshotCreateDiskActive(virQEMUDriverPtr driver, int i; bool persist = false; bool reuse = (flags & VIR_DOMAIN_SNAPSHOT_CREATE_REUSE_EXT) != 0; - virCgroupPtr cgroup = NULL; virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver); if (!virDomainObjIsActive(vm)) { @@ -11230,15 +11130,6 @@ qemuDomainSnapshotCreateDiskActive(virQEMUDriverPtr driver, goto cleanup; } - if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES) && - virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0)) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unable to find cgroup for %s"), - vm->def->name); - goto cleanup; - } - /* 'cgroup' is still NULL if cgroups are disabled. 
*/ - if (virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_TRANSACTION)) { if (!(actions = virJSONValueNewArray())) { virReportOOMError(); @@ -11274,7 +11165,7 @@ qemuDomainSnapshotCreateDiskActive(virQEMUDriverPtr driver, } } - ret = qemuDomainSnapshotCreateSingleDiskActive(driver, vm, cgroup, + ret = qemuDomainSnapshotCreateSingleDiskActive(driver, vm, &snap->def->disks[i], vm->def->disks[i], persistDisk, actions, @@ -11303,7 +11194,7 @@ qemuDomainSnapshotCreateDiskActive(virQEMUDriverPtr driver, persistDisk = vm->newDef->disks[indx]; } - qemuDomainSnapshotUndoSingleDiskActive(driver, vm, cgroup, + qemuDomainSnapshotUndoSingleDiskActive(driver, vm, snap->def->dom->disks[i], vm->def->disks[i], persistDisk, @@ -11314,7 +11205,6 @@ qemuDomainSnapshotCreateDiskActive(virQEMUDriverPtr driver, qemuDomainObjExitMonitor(driver, vm); cleanup: - virCgroupFree(&cgroup); if (ret == 0 || !virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_TRANSACTION)) { if (virDomainSaveStatus(driver->xmlopt, cfg->stateDir, vm) < 0 || @@ -13064,7 +12954,6 @@ qemuDomainBlockPivot(virConnectPtr conn, virDomainBlockJobInfo info; const char *format = virStorageFileFormatTypeToString(disk->mirrorFormat); bool resume = false; - virCgroupPtr cgroup = NULL; char *oldsrc = NULL; int oldformat; virStorageFileMetadataPtr oldchain = NULL; @@ -13124,14 +13013,6 @@ qemuDomainBlockPivot(virConnectPtr conn, * label the entire chain. This action is safe even if the * backing chain has already been labeled; but only necessary when * we know for sure that there is a backing chain. 
*/ - if (disk->mirrorFormat && disk->mirrorFormat != VIR_STORAGE_FILE_RAW && - qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES) && - virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0) < 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unable to find cgroup for %s"), - vm->def->name); - goto cleanup; - } oldsrc = disk->src; oldformat = disk->format; oldchain = disk->backingChain; @@ -13147,7 +13028,7 @@ qemuDomainBlockPivot(virConnectPtr conn, if (disk->mirrorFormat && disk->mirrorFormat != VIR_STORAGE_FILE_RAW && (virDomainLockDiskAttach(driver->lockManager, cfg->uri, vm, disk) < 0 || - (cgroup && qemuSetupDiskCgroup(vm, cgroup, disk) < 0) || + qemuSetupDiskCgroup(vm, disk) < 0 || virSecurityManagerSetImageLabel(driver->securityManager, vm->def, disk) < 0)) { disk->src = oldsrc; @@ -13191,8 +13072,6 @@ qemuDomainBlockPivot(virConnectPtr conn, disk->mirroring = false; cleanup: - if (cgroup) - virCgroupFree(&cgroup); if (resume && virDomainObjIsActive(vm) && qemuProcessStartCPUs(driver, vm, conn, VIR_DOMAIN_RUNNING_UNPAUSED, @@ -13420,7 +13299,6 @@ qemuDomainBlockCopy(virDomainPtr dom, const char *path, struct stat st; bool need_unlink = false; char *mirror = NULL; - virCgroupPtr cgroup = NULL; virQEMUDriverConfigPtr cfg = NULL; /* Preliminaries: find the disk we are editing, sanity checks */ @@ -13436,13 +13314,6 @@ qemuDomainBlockCopy(virDomainPtr dom, const char *path, _("domain is not running")); goto cleanup; } - if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES) && - virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0) < 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unable to find cgroup for %s"), - vm->def->name); - goto cleanup; - } device = qemuDiskPathToAlias(vm, path, &idx); if (!device) { @@ -13544,9 +13415,9 @@ qemuDomainBlockCopy(virDomainPtr dom, const char *path, goto endjob; } - if (qemuDomainPrepareDiskChainElement(driver, vm, cgroup, disk, dest, + if 
(qemuDomainPrepareDiskChainElement(driver, vm, disk, dest, VIR_DISK_CHAIN_READ_WRITE) < 0) { - qemuDomainPrepareDiskChainElement(driver, vm, cgroup, disk, dest, + qemuDomainPrepareDiskChainElement(driver, vm, disk, dest, VIR_DISK_CHAIN_NO_ACCESS); goto endjob; } @@ -13558,7 +13429,7 @@ qemuDomainBlockCopy(virDomainPtr dom, const char *path, virDomainAuditDisk(vm, NULL, dest, "mirror", ret >= 0); qemuDomainObjExitMonitor(driver, vm); if (ret < 0) { - qemuDomainPrepareDiskChainElement(driver, vm, cgroup, disk, dest, + qemuDomainPrepareDiskChainElement(driver, vm, disk, dest, VIR_DISK_CHAIN_NO_ACCESS); goto endjob; } @@ -13580,8 +13451,6 @@ endjob: } cleanup: - if (cgroup) - virCgroupFree(&cgroup); VIR_FREE(device); if (vm) virObjectUnlock(vm); @@ -13637,7 +13506,6 @@ qemuDomainBlockCommit(virDomainPtr dom, const char *path, const char *base, virStorageFileMetadataPtr top_meta = NULL; const char *top_parent = NULL; const char *base_canon = NULL; - virCgroupPtr cgroup = NULL; bool clean_access = false; virCheckFlags(VIR_DOMAIN_BLOCK_COMMIT_SHALLOW, -1); @@ -13721,18 +13589,11 @@ qemuDomainBlockCommit(virDomainPtr dom, const char *path, const char *base, * revoke access to files removed from the chain, when the commit * operation succeeds, but doing that requires tracking the * operation in XML across libvirtd restarts. 
*/ - if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES) && - virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0) < 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unable to find cgroup for %s"), - vm->def->name); - goto endjob; - } clean_access = true; - if (qemuDomainPrepareDiskChainElement(driver, vm, cgroup, disk, base_canon, + if (qemuDomainPrepareDiskChainElement(driver, vm, disk, base_canon, VIR_DISK_CHAIN_READ_WRITE) < 0 || (top_parent && top_parent != disk->src && - qemuDomainPrepareDiskChainElement(driver, vm, cgroup, disk, + qemuDomainPrepareDiskChainElement(driver, vm, disk, top_parent, VIR_DISK_CHAIN_READ_WRITE) < 0)) goto endjob; @@ -13746,15 +13607,13 @@ qemuDomainBlockCommit(virDomainPtr dom, const char *path, const char *base, endjob: if (ret < 0 && clean_access) { /* Revert access to read-only, if possible. */ - qemuDomainPrepareDiskChainElement(driver, vm, cgroup, disk, base_canon, + qemuDomainPrepareDiskChainElement(driver, vm, disk, base_canon, VIR_DISK_CHAIN_READ_ONLY); if (top_parent && top_parent != disk->src) - qemuDomainPrepareDiskChainElement(driver, vm, cgroup, disk, + qemuDomainPrepareDiskChainElement(driver, vm, disk, top_parent, VIR_DISK_CHAIN_READ_ONLY); } - if (cgroup) - virCgroupFree(&cgroup); if (qemuDomainObjEndJob(driver, vm) == 0) { vm = NULL; goto cleanup; @@ -14398,17 +14257,18 @@ cleanup: /* qemuDomainGetCPUStats() with start_cpu == -1 */ static int -qemuDomainGetTotalcpuStats(virCgroupPtr group, +qemuDomainGetTotalcpuStats(virDomainObjPtr vm, virTypedParameterPtr params, int nparams) { unsigned long long cpu_time; int ret; + qemuDomainObjPrivatePtr priv = vm->privateData; if (nparams == 0) /* return supported number of params */ return QEMU_NB_TOTAL_CPU_STAT_PARAM; /* entry 0 is cputime */ - ret = virCgroupGetCpuacctUsage(group, &cpu_time); + ret = virCgroupGetCpuacctUsage(priv->cgroup, &cpu_time); if (ret < 0) { virReportSystemError(-ret, "%s", _("unable to get cpu account")); return -1; @@ 
-14422,7 +14282,7 @@ qemuDomainGetTotalcpuStats(virCgroupPtr group, unsigned long long user; unsigned long long sys; - ret = virCgroupGetCpuacctStat(group, &user, &sys); + ret = virCgroupGetCpuacctStat(priv->cgroup, &user, &sys); if (ret < 0) { virReportSystemError(-ret, "%s", _("unable to get cpu account")); return -1; @@ -14460,22 +14320,22 @@ qemuDomainGetTotalcpuStats(virCgroupPtr group, * s3 = t03 + t13 */ static int -getSumVcpuPercpuStats(virCgroupPtr group, - unsigned int nvcpu, +getSumVcpuPercpuStats(virDomainObjPtr vm, unsigned long long *sum_cpu_time, unsigned int num) { int ret = -1; int i; char *buf = NULL; + qemuDomainObjPrivatePtr priv = vm->privateData; virCgroupPtr group_vcpu = NULL; - for (i = 0; i < nvcpu; i++) { + for (i = 0; i < priv->nvcpupids; i++) { char *pos; unsigned long long tmp; int j; - if (virCgroupForVcpu(group, i, &group_vcpu, 0) < 0) { + if (virCgroupForVcpu(priv->cgroup, i, &group_vcpu, 0) < 0) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("error accessing cgroup cpuacct for vcpu")); goto cleanup; @@ -14507,7 +14367,6 @@ cleanup: static int qemuDomainGetPercpuStats(virDomainObjPtr vm, - virCgroupPtr group, virTypedParameterPtr params, unsigned int nparams, int start_cpu, @@ -14547,7 +14406,7 @@ qemuDomainGetPercpuStats(virDomainObjPtr vm, } /* we get percpu cputime accounting info. 
*/ - if (virCgroupGetCpuacctPercpuUsage(group, &buf)) + if (virCgroupGetCpuacctPercpuUsage(priv->cgroup, &buf)) goto cleanup; pos = buf; memset(params, 0, nparams * ncpus); @@ -14587,7 +14446,7 @@ qemuDomainGetPercpuStats(virDomainObjPtr vm, virReportOOMError(); goto cleanup; } - if (getSumVcpuPercpuStats(group, priv->nvcpupids, sum_cpu_time, n) < 0) + if (getSumVcpuPercpuStats(vm, sum_cpu_time, n) < 0) goto cleanup; sum_cpu_pos = sum_cpu_time; @@ -14613,17 +14472,17 @@ cleanup: static int qemuDomainGetCPUStats(virDomainPtr domain, - virTypedParameterPtr params, - unsigned int nparams, - int start_cpu, - unsigned int ncpus, - unsigned int flags) + virTypedParameterPtr params, + unsigned int nparams, + int start_cpu, + unsigned int ncpus, + unsigned int flags) { virQEMUDriverPtr driver = domain->conn->privateData; - virCgroupPtr group = NULL; virDomainObjPtr vm = NULL; int ret = -1; bool isActive; + qemuDomainObjPrivatePtr priv; virCheckFlags(VIR_TYPED_PARAM_STRING_OKAY, -1); @@ -14633,6 +14492,7 @@ qemuDomainGetCPUStats(virDomainPtr domain, _("No such domain %s"), domain->uuid); goto cleanup; } + priv = vm->privateData; isActive = virDomainObjIsActive(vm); if (!isActive) { @@ -14641,25 +14501,18 @@ qemuDomainGetCPUStats(virDomainPtr domain, goto cleanup; } - if (!qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPUACCT)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUACCT)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("cgroup CPUACCT controller is not mounted")); goto cleanup; } - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), vm->def->name); - goto cleanup; - } - if (start_cpu == -1) - ret = qemuDomainGetTotalcpuStats(group, params, nparams); + ret = qemuDomainGetTotalcpuStats(vm, params, nparams); else - ret = qemuDomainGetPercpuStats(vm, group, params, nparams, + ret = qemuDomainGetPercpuStats(vm, params, nparams, 
start_cpu, ncpus); cleanup: - virCgroupFree(&group); if (vm) virObjectUnlock(vm); return ret; diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index b978b97..a6c75cb 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -1136,27 +1136,16 @@ int qemuDomainAttachHostUsbDevice(virQEMUDriverPtr driver, goto error; } - if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES)) { - virCgroupPtr cgroup = NULL; + if (virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_DEVICES)) { virUSBDevicePtr usb; - qemuCgroupData data; - - if (virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unable to find cgroup for %s"), - vm->def->name); - goto error; - } if ((usb = virUSBDeviceNew(hostdev->source.subsys.u.usb.bus, hostdev->source.subsys.u.usb.device, NULL)) == NULL) goto error; - data.vm = vm; - data.cgroup = cgroup; if (virUSBDeviceFileIterate(usb, qemuSetupHostUsbDeviceCgroup, - &data) < 0) { + vm) < 0) { virUSBDeviceFree(usb); goto error; } @@ -2032,7 +2021,6 @@ int qemuDomainDetachVirtioDiskDevice(virQEMUDriverPtr driver, int i, ret = -1; virDomainDiskDefPtr detach = NULL; qemuDomainObjPrivatePtr priv = vm->privateData; - virCgroupPtr cgroup = NULL; char *drivestr = NULL; i = qemuFindDisk(vm->def, dev->data.disk->dst); @@ -2052,15 +2040,6 @@ int qemuDomainDetachVirtioDiskDevice(virQEMUDriverPtr driver, goto cleanup; } - if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES)) { - if (virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unable to find cgroup for %s"), - vm->def->name); - goto cleanup; - } - } - if (STREQLEN(vm->def->os.machine, "s390-ccw", 8) && virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_VIRTIO_CCW)) { if (!virDomainDeviceAddressIsValid(&detach->info, @@ -2130,11 +2109,9 @@ int qemuDomainDetachVirtioDiskDevice(virQEMUDriverPtr driver, vm->def, dev->data.disk) < 0) 
VIR_WARN("Unable to restore security label on %s", dev->data.disk->src); - if (cgroup != NULL) { - if (qemuTeardownDiskCgroup(vm, cgroup, dev->data.disk) < 0) - VIR_WARN("Failed to teardown cgroup for disk path %s", - NULLSTR(dev->data.disk->src)); - } + if (qemuTeardownDiskCgroup(vm, dev->data.disk) < 0) + VIR_WARN("Failed to teardown cgroup for disk path %s", + NULLSTR(dev->data.disk->src)); if (virDomainLockDiskDetach(driver->lockManager, vm, dev->data.disk) < 0) VIR_WARN("Unable to release lock on %s", dev->data.disk->src); @@ -2142,7 +2119,6 @@ int qemuDomainDetachVirtioDiskDevice(virQEMUDriverPtr driver, ret = 0; cleanup: - virCgroupFree(&cgroup); VIR_FREE(drivestr); return ret; } @@ -2154,7 +2130,6 @@ int qemuDomainDetachDiskDevice(virQEMUDriverPtr driver, int i, ret = -1; virDomainDiskDefPtr detach = NULL; qemuDomainObjPrivatePtr priv = vm->privateData; - virCgroupPtr cgroup = NULL; char *drivestr = NULL; i = qemuFindDisk(vm->def, dev->data.disk->dst); @@ -2181,15 +2156,6 @@ int qemuDomainDetachDiskDevice(virQEMUDriverPtr driver, goto cleanup; } - if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES)) { - if (virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unable to find cgroup for %s"), - vm->def->name); - goto cleanup; - } - } - /* build the actual drive id string as the disk->info.alias doesn't * contain the QEMU_DRIVE_HOST_PREFIX that is passed to qemu */ if (virAsprintf(&drivestr, "%s%s", @@ -2222,11 +2188,9 @@ int qemuDomainDetachDiskDevice(virQEMUDriverPtr driver, vm->def, dev->data.disk) < 0) VIR_WARN("Unable to restore security label on %s", dev->data.disk->src); - if (cgroup != NULL) { - if (qemuTeardownDiskCgroup(vm, cgroup, dev->data.disk) < 0) - VIR_WARN("Failed to teardown cgroup for disk path %s", - NULLSTR(dev->data.disk->src)); - } + if (qemuTeardownDiskCgroup(vm, dev->data.disk) < 0) + VIR_WARN("Failed to teardown cgroup for disk path %s", + 
NULLSTR(dev->data.disk->src)); if (virDomainLockDiskDetach(driver->lockManager, vm, dev->data.disk) < 0) VIR_WARN("Unable to release lock on disk %s", dev->data.disk->src); @@ -2235,7 +2199,6 @@ int qemuDomainDetachDiskDevice(virQEMUDriverPtr driver, cleanup: VIR_FREE(drivestr); - virCgroupFree(&cgroup); return ret; } diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 41ad768..ed5e841 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -4176,7 +4176,6 @@ qemuMigrationToFile(virQEMUDriverPtr driver, virDomainObjPtr vm, enum qemuDomainAsyncJob asyncJob) { qemuDomainObjPrivatePtr priv = vm->privateData; - virCgroupPtr cgroup = NULL; int ret = -1; int rc; bool restoreLabel = false; @@ -4210,21 +4209,13 @@ qemuMigrationToFile(virQEMUDriverPtr driver, virDomainObjPtr vm, * given cgroup ACL permission. We might also stumble on * a race present in some qemu versions where it does a wait() * that botches pclose. */ - if (qemuCgroupControllerActive(driver, - VIR_CGROUP_CONTROLLER_DEVICES)) { - if (virCgroupForDomain(driver->cgroup, vm->def->name, - &cgroup, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unable to find cgroup for %s"), - vm->def->name); - goto cleanup; - } - rc = virCgroupAllowDevicePath(cgroup, path, + if (virCgroupHasController(priv->cgroup, + VIR_CGROUP_CONTROLLER_DEVICES)) { + rc = virCgroupAllowDevicePath(priv->cgroup, path, VIR_CGROUP_DEVICE_RW); - virDomainAuditCgroupPath(vm, cgroup, "allow", path, "rw", rc); + virDomainAuditCgroupPath(vm, priv->cgroup, "allow", path, "rw", rc); if (rc == 1) { /* path was not a device, no further need for cgroup */ - virCgroupFree(&cgroup); } else if (rc < 0) { virReportSystemError(-rc, _("Unable to allow device %s for %s"), @@ -4325,14 +4316,14 @@ cleanup: vm->def, path) < 0) VIR_WARN("failed to restore save state label on %s", path); - if (cgroup != NULL) { - rc = virCgroupDenyDevicePath(cgroup, path, + if (virCgroupHasController(priv->cgroup, + 
VIR_CGROUP_CONTROLLER_DEVICES)) { + rc = virCgroupDenyDevicePath(priv->cgroup, path, VIR_CGROUP_DEVICE_RWM); - virDomainAuditCgroupPath(vm, cgroup, "deny", path, "rwm", rc); + virDomainAuditCgroupPath(vm, priv->cgroup, "deny", path, "rwm", rc); if (rc < 0) VIR_WARN("Unable to deny device %s for %s %d", path, vm->def->name, rc); - virCgroupFree(&cgroup); } return ret; } diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 39c49ce..da47b43 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -1395,6 +1395,7 @@ qemuProcessReadLogOutput(virDomainObjPtr vm, /* Filter out debug messages from intermediate libvirt process */ while ((eol = strchr(filter_next, '\n'))) { *eol = '\0'; + VIR_ERROR("<<<<<<<<<<<<%s>>>>>>>>>>", filter_next); if (virLogProbablyLogMessage(filter_next)) { memmove(filter_next, eol + 1, got - (eol - buf)); got -= eol + 1 - filter_next; @@ -2529,7 +2530,7 @@ static int qemuProcessHook(void *data) * memory allocation is on the correct NUMA node */ VIR_DEBUG("Moving process to cgroup"); - if (qemuAddToCgroup(h->driver, h->vm->def) < 0) + if (qemuAddToCgroup(h->vm) < 0) goto cleanup; /* This must be done after cgroup placement to avoid resetting CPU @@ -3004,6 +3005,9 @@ qemuProcessReconnect(void *opaque) if (qemuUpdateActiveUsbHostdevs(driver, obj->def) < 0) goto error; + if (qemuInitCgroup(driver, obj) < 0) + goto error; + /* XXX: Need to change as long as lock is introduced for * qemu_driver->sharedDisks. 
*/ @@ -3384,7 +3388,7 @@ int qemuProcessStart(virConnectPtr conn, /* Ensure no historical cgroup for this VM is lying around bogus * settings */ VIR_DEBUG("Ensuring no historical cgroup is lying around"); - qemuRemoveCgroup(driver, vm, 1); + qemuRemoveCgroup(vm); for (i = 0 ; i < vm->def->ngraphics; ++i) { virDomainGraphicsDefPtr graphics = vm->def->graphics[i]; @@ -3750,7 +3754,7 @@ int qemuProcessStart(virConnectPtr conn, goto cleanup; VIR_DEBUG("Setting cgroup for each VCPU (if required)"); - if (qemuSetupCgroupForVcpu(driver, vm) < 0) + if (qemuSetupCgroupForVcpu(vm) < 0) goto cleanup; VIR_DEBUG("Setting cgroup for emulator (if required)"); @@ -4085,7 +4089,7 @@ void qemuProcessStop(virQEMUDriverPtr driver, } retry: - if ((ret = qemuRemoveCgroup(driver, vm, 0)) < 0) { + if ((ret = qemuRemoveCgroup(vm)) < 0) { if (ret == -EBUSY && (retries++ < 5)) { usleep(200*1000); goto retry; @@ -4093,6 +4097,7 @@ retry: VIR_WARN("Failed to remove cgroup for %s", vm->def->name); } + virCgroupFree(&priv->cgroup); qemuProcessRemoveDomainStatus(driver, vm); -- 1.8.1.4

On 10.04.2013 12:08, Daniel P. Berrange wrote:
From: "Daniel P. Berrange" <berrange@redhat.com>
Instead of calling virCgroupForDomain every time we need the virCgroupPtr instance, just do it once at VM startup and cache a reference to the object in qemuDomainObjPrivatePtr until shutdown of the VM. Removing the virCgroupPtr from the QEMU driver state also means we don't have stale mount info if someone mounts the cgroups filesystem after libvirtd has been started.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com> --- src/qemu/qemu_cgroup.c | 283 +++++++++++++++------------------ src/qemu/qemu_cgroup.h | 22 +-- src/qemu/qemu_conf.h | 4 - src/qemu/qemu_domain.c | 1 + src/qemu/qemu_domain.h | 3 + src/qemu/qemu_driver.c | 397 +++++++++++++++------------------------------- src/qemu/qemu_hotplug.c | 53 +------ src/qemu/qemu_migration.c | 25 +-- src/qemu/qemu_process.c | 13 +- 9 files changed, 291 insertions(+), 510 deletions(-)
diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c index 5aa9416..019aa2e 100644 --- a/src/qemu/qemu_cgroup.c +++ b/src/qemu/qemu_cgroup.c
@@ -220,13 +265,13 @@ int qemuSetupCgroup(virQEMUDriverPtr driver, }
for (i = 0; i < vm->def->ndisks ; i++) { - if (qemuSetupDiskCgroup(vm, cgroup, vm->def->disks[i]) < 0) + if (qemuSetupDiskCgroup(vm,vm->def->disks[i]) < 0)
s/,/, /
goto cleanup; }
- rc = virCgroupAllowDeviceMajor(cgroup, 'c', DEVICE_PTY_MAJOR, + rc = virCgroupAllowDeviceMajor(priv->cgroup, 'c', DEVICE_PTY_MAJOR, VIR_CGROUP_DEVICE_RW); - virDomainAuditCgroupMajor(vm, cgroup, "allow", DEVICE_PTY_MAJOR, + virDomainAuditCgroupMajor(vm, priv->cgroup, "allow", DEVICE_PTY_MAJOR, "pty", "rw", rc == 0); if (rc != 0) { virReportSystemError(-rc, "%s",
ACK Michal
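The caching approach described in the commit message (look the cgroup up once at VM startup, keep the handle in the per-domain private data, free it at shutdown) can be illustrated with a minimal sketch. Every name below (demo_cgroup, demo_domain, demo_cgroup_lookup and so on) is hypothetical and chosen for illustration only; none of it is libvirt's actual API, and the lookup is a stand-in that merely formats a path.

```c
/* Hypothetical sketch only: these types and functions are NOT libvirt's API. */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for a cgroup handle. */
typedef struct demo_cgroup {
    char path[128];
} demo_cgroup;

/* Stand-in for the per-domain private data that caches the handle. */
typedef struct demo_domain {
    const char *name;
    demo_cgroup *cgroup;   /* NULL while the domain is not running */
} demo_domain;

/* Counts how often the (assumed expensive) lookup is performed. */
static int demo_lookup_count = 0;

/* Pretend lookup: allocates a handle and records that a lookup happened. */
static demo_cgroup *demo_cgroup_lookup(const char *name)
{
    demo_cgroup *cg = calloc(1, sizeof(*cg));
    if (!cg)
        return NULL;
    snprintf(cg->path, sizeof(cg->path), "/system/%s.qemu.libvirt", name);
    demo_lookup_count++;
    return cg;
}

/* Startup: perform the lookup exactly once and cache the result. */
static int demo_domain_start(demo_domain *dom)
{
    assert(dom->cgroup == NULL);   /* must not already be running */
    dom->cgroup = demo_cgroup_lookup(dom->name);
    return dom->cgroup ? 0 : -1;
}

/* Any later operation reuses the cached handle; no new lookup, no free. */
static const char *demo_domain_cgroup_path(const demo_domain *dom)
{
    return dom->cgroup ? dom->cgroup->path : NULL;
}

/* Shutdown: the only place where the cached handle is released. */
static void demo_domain_stop(demo_domain *dom)
{
    free(dom->cgroup);
    dom->cgroup = NULL;
}
```

With this shape, callers no longer carry a local handle that has to be freed on every exit path, which mirrors why the diff removes the per-function virCgroupFree calls from the cleanup labels.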

From: "Daniel P. Berrange" <berrange@redhat.com> Instead of calling virCgroupForDomain every time we need the virCgroupPtr instance, just do it once at VM startup and cache a reference to the object in virLXCDomainObjPrivatePtr until shutdown of the VM. Removing the virCgroupPtr from the LXC driver state also means we don't have stale mount info if someone mounts the cgroups filesystem after libvirtd has been started. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> --- src/lxc/lxc_cgroup.c | 17 ++- src/lxc/lxc_cgroup.h | 2 + src/lxc/lxc_conf.h | 2 +- src/lxc/lxc_controller.c | 2 +- src/lxc/lxc_domain.c | 2 + src/lxc/lxc_domain.h | 3 + src/lxc/lxc_driver.c | 354 +++++++++++++---------------------------------- src/lxc/lxc_process.c | 39 +++--- 8 files changed, 144 insertions(+), 277 deletions(-) diff --git a/src/lxc/lxc_cgroup.c b/src/lxc/lxc_cgroup.c index 33641f8..1bad9ec 100644 --- a/src/lxc/lxc_cgroup.c +++ b/src/lxc/lxc_cgroup.c @@ -527,7 +527,6 @@ virCgroupPtr virLXCCgroupCreate(virDomainDefPtr def) { virCgroupPtr driver = NULL; virCgroupPtr cgroup = NULL; - int ret = -1; int rc; rc = virCgroupForDriver("lxc", &driver, 1, 0, -1); @@ -545,6 +544,21 @@ virCgroupPtr virLXCCgroupCreate(virDomainDefPtr def) goto cleanup; } +cleanup: + virCgroupFree(&driver); + return cgroup; +} + + +virCgroupPtr virLXCCgroupJoin(virDomainDefPtr def) +{ + virCgroupPtr cgroup = NULL; + int ret = -1; + int rc; + + if (!(cgroup = virLXCCgroupCreate(def))) + return NULL; + rc = virCgroupAddTask(cgroup, getpid()); if (rc != 0) { virReportSystemError(-rc, @@ -556,7 +570,6 @@ virCgroupPtr virLXCCgroupCreate(virDomainDefPtr def) ret = 0; cleanup: - virCgroupFree(&driver); if (ret < 0) { virCgroupFree(&cgroup); return NULL; diff --git a/src/lxc/lxc_cgroup.h b/src/lxc/lxc_cgroup.h index 942e0fc..25a427c 100644 --- a/src/lxc/lxc_cgroup.h +++ b/src/lxc/lxc_cgroup.h @@ -22,11 +22,13 @@ #ifndef __VIR_LXC_CGROUP_H__ # define __VIR_LXC_CGROUP_H__ +# include "vircgroup.h" # include 
"domain_conf.h" # include "lxc_fuse.h" # include "virusb.h" virCgroupPtr virLXCCgroupCreate(virDomainDefPtr def); +virCgroupPtr virLXCCgroupJoin(virDomainDefPtr def); int virLXCCgroupSetup(virDomainDefPtr def, virCgroupPtr cgroup, virBitmapPtr nodemask); diff --git a/src/lxc/lxc_conf.h b/src/lxc/lxc_conf.h index f7aaff0..4332fb9 100644 --- a/src/lxc/lxc_conf.h +++ b/src/lxc/lxc_conf.h @@ -32,9 +32,9 @@ # include "domain_event.h" # include "capabilities.h" # include "virthread.h" -# include "vircgroup.h" # include "security/security_manager.h" # include "configmake.h" +# include "vircgroup.h" # include "virsysinfo.h" # include "virusb.h" diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c index f2462ef..f46f813 100644 --- a/src/lxc/lxc_controller.c +++ b/src/lxc/lxc_controller.c @@ -1434,7 +1434,7 @@ virLXCControllerRun(virLXCControllerPtr ctrl) if (virLXCControllerSetupPrivateNS() < 0) goto cleanup; - if (!(cgroup = virLXCCgroupCreate(ctrl->def))) + if (!(cgroup = virLXCCgroupJoin(ctrl->def))) goto cleanup; if (virLXCControllerSetupLoopDevices(ctrl) < 0) diff --git a/src/lxc/lxc_domain.c b/src/lxc/lxc_domain.c index e7a85ca..31f6e84 100644 --- a/src/lxc/lxc_domain.c +++ b/src/lxc/lxc_domain.c @@ -43,6 +43,8 @@ static void virLXCDomainObjPrivateFree(void *data) { virLXCDomainObjPrivatePtr priv = data; + virCgroupFree(&priv->cgroup); + VIR_FREE(priv); } diff --git a/src/lxc/lxc_domain.h b/src/lxc/lxc_domain.h index 12753aa..751aece 100644 --- a/src/lxc/lxc_domain.h +++ b/src/lxc/lxc_domain.h @@ -23,6 +23,7 @@ #ifndef __LXC_DOMAIN_H__ # define __LXC_DOMAIN_H__ +# include "vircgroup.h" # include "lxc_conf.h" # include "lxc_monitor.h" @@ -36,6 +37,8 @@ struct _virLXCDomainObjPrivate { bool wantReboot; pid_t initpid; + + virCgroupPtr cgroup; }; extern virDomainXMLPrivateDataCallbacks virLXCDriverPrivateDataCallbacks; diff --git a/src/lxc/lxc_driver.c b/src/lxc/lxc_driver.c index 2b47f18..ba0fa0a 100644 --- a/src/lxc/lxc_driver.c +++ b/src/lxc/lxc_driver.c @@ 
-525,8 +525,8 @@ static int lxcDomainGetInfo(virDomainPtr dom, { virLXCDriverPtr driver = dom->conn->privateData; virDomainObjPtr vm; - virCgroupPtr cgroup = NULL; int ret = -1, rc; + virLXCDomainObjPrivatePtr priv; lxcDriverLock(driver); vm = virDomainObjListFindByUUID(driver->domains, dom->uuid); @@ -539,24 +539,20 @@ static int lxcDomainGetInfo(virDomainPtr dom, goto cleanup; } + priv = vm->privateData; + info->state = virDomainObjGetState(vm, NULL); - if (!virDomainObjIsActive(vm) || driver->cgroup == NULL) { + if (!virDomainObjIsActive(vm)) { info->cpuTime = 0; info->memory = vm->def->mem.cur_balloon; } else { - if (virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unable to get cgroup for %s"), vm->def->name); - goto cleanup; - } - - if (virCgroupGetCpuacctUsage(cgroup, &(info->cpuTime)) < 0) { + if (virCgroupGetCpuacctUsage(priv->cgroup, &(info->cpuTime)) < 0) { virReportError(VIR_ERR_OPERATION_FAILED, "%s", _("Cannot read cputime for domain")); goto cleanup; } - if ((rc = virCgroupGetMemoryUsage(cgroup, &(info->memory))) < 0) { + if ((rc = virCgroupGetMemoryUsage(priv->cgroup, &(info->memory))) < 0) { virReportError(VIR_ERR_OPERATION_FAILED, "%s", _("Cannot read memory usage for domain")); if (rc == -ENOENT) { @@ -574,8 +570,6 @@ static int lxcDomainGetInfo(virDomainPtr dom, cleanup: lxcDriverUnlock(driver); - if (cgroup) - virCgroupFree(&cgroup); if (vm) virObjectUnlock(vm); return ret; @@ -706,8 +700,8 @@ cleanup: static int lxcDomainSetMemory(virDomainPtr dom, unsigned long newmem) { virLXCDriverPtr driver = dom->conn->privateData; virDomainObjPtr vm; - virCgroupPtr cgroup = NULL; int ret = -1; + virLXCDomainObjPrivatePtr priv; lxcDriverLock(driver); vm = virDomainObjListFindByUUID(driver->domains, dom->uuid); @@ -719,6 +713,7 @@ static int lxcDomainSetMemory(virDomainPtr dom, unsigned long newmem) { _("No domain with matching uuid '%s'"), uuidstr); goto cleanup; } + priv = 
vm->privateData; if (newmem > vm->def->mem.max_balloon) { virReportError(VIR_ERR_INVALID_ARG, @@ -732,19 +727,7 @@ static int lxcDomainSetMemory(virDomainPtr dom, unsigned long newmem) { goto cleanup; } - if (driver->cgroup == NULL) { - virReportError(VIR_ERR_OPERATION_INVALID, - "%s", _("cgroups must be configured on the host")); - goto cleanup; - } - - if (virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unable to get cgroup for %s"), vm->def->name); - goto cleanup; - } - - if (virCgroupSetMemory(cgroup, newmem) < 0) { + if (virCgroupSetMemory(priv->cgroup, newmem) < 0) { virReportError(VIR_ERR_OPERATION_FAILED, "%s", _("Failed to set memory for domain")); goto cleanup; @@ -755,8 +738,6 @@ static int lxcDomainSetMemory(virDomainPtr dom, unsigned long newmem) { cleanup: if (vm) virObjectUnlock(vm); - if (cgroup) - virCgroupFree(&cgroup); return ret; } @@ -768,10 +749,10 @@ lxcDomainSetMemoryParameters(virDomainPtr dom, { virLXCDriverPtr driver = dom->conn->privateData; int i; - virCgroupPtr cgroup = NULL; virDomainObjPtr vm = NULL; int ret = -1; int rc; + virLXCDomainObjPrivatePtr priv; virCheckFlags(0, -1); if (virTypedParameterArrayValidate(params, nparams, @@ -794,33 +775,28 @@ lxcDomainSetMemoryParameters(virDomainPtr dom, _("No domain with matching uuid '%s'"), uuidstr); goto cleanup; } - - if (virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), vm->def->name); - goto cleanup; - } + priv = vm->privateData; ret = 0; for (i = 0; i < nparams; i++) { virTypedParameterPtr param = &params[i]; if (STREQ(param->field, VIR_DOMAIN_MEMORY_HARD_LIMIT)) { - rc = virCgroupSetMemoryHardLimit(cgroup, params[i].value.ul); + rc = virCgroupSetMemoryHardLimit(priv->cgroup, params[i].value.ul); if (rc != 0) { virReportSystemError(-rc, "%s", _("unable to set memory hard_limit tunable")); ret = -1; } } else if 
(STREQ(param->field, VIR_DOMAIN_MEMORY_SOFT_LIMIT)) { - rc = virCgroupSetMemorySoftLimit(cgroup, params[i].value.ul); + rc = virCgroupSetMemorySoftLimit(priv->cgroup, params[i].value.ul); if (rc != 0) { virReportSystemError(-rc, "%s", _("unable to set memory soft_limit tunable")); ret = -1; } } else if (STREQ(param->field, VIR_DOMAIN_MEMORY_SWAP_HARD_LIMIT)) { - rc = virCgroupSetMemSwapHardLimit(cgroup, params[i].value.ul); + rc = virCgroupSetMemSwapHardLimit(priv->cgroup, params[i].value.ul); if (rc != 0) { virReportSystemError(-rc, "%s", _("unable to set swap_hard_limit tunable")); @@ -830,8 +806,6 @@ lxcDomainSetMemoryParameters(virDomainPtr dom, } cleanup: - if (cgroup) - virCgroupFree(&cgroup); if (vm) virObjectUnlock(vm); lxcDriverUnlock(driver); @@ -846,11 +820,11 @@ lxcDomainGetMemoryParameters(virDomainPtr dom, { virLXCDriverPtr driver = dom->conn->privateData; int i; - virCgroupPtr cgroup = NULL; virDomainObjPtr vm = NULL; unsigned long long val; int ret = -1; int rc; + virLXCDomainObjPrivatePtr priv; virCheckFlags(0, -1); @@ -864,6 +838,7 @@ lxcDomainGetMemoryParameters(virDomainPtr dom, _("No domain with matching uuid '%s'"), uuidstr); goto cleanup; } + priv = vm->privateData; if ((*nparams) == 0) { /* Current number of memory parameters supported by cgroups */ @@ -872,19 +847,13 @@ lxcDomainGetMemoryParameters(virDomainPtr dom, goto cleanup; } - if (virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unable to get cgroup for %s"), vm->def->name); - goto cleanup; - } - for (i = 0; i < LXC_NB_MEM_PARAM && i < *nparams; i++) { virTypedParameterPtr param = &params[i]; val = 0; switch (i) { case 0: /* fill memory hard limit here */ - rc = virCgroupGetMemoryHardLimit(cgroup, &val); + rc = virCgroupGetMemoryHardLimit(priv->cgroup, &val); if (rc != 0) { virReportSystemError(-rc, "%s", _("unable to get memory hard limit")); @@ -895,7 +864,7 @@ lxcDomainGetMemoryParameters(virDomainPtr dom, goto 
cleanup; break; case 1: /* fill memory soft limit here */ - rc = virCgroupGetMemorySoftLimit(cgroup, &val); + rc = virCgroupGetMemorySoftLimit(priv->cgroup, &val); if (rc != 0) { virReportSystemError(-rc, "%s", _("unable to get memory soft limit")); @@ -906,7 +875,7 @@ lxcDomainGetMemoryParameters(virDomainPtr dom, goto cleanup; break; case 2: /* fill swap hard limit here */ - rc = virCgroupGetMemSwapHardLimit(cgroup, &val); + rc = virCgroupGetMemSwapHardLimit(priv->cgroup, &val); if (rc != 0) { virReportSystemError(-rc, "%s", _("unable to get swap hard limit")); @@ -930,8 +899,6 @@ lxcDomainGetMemoryParameters(virDomainPtr dom, ret = 0; cleanup: - if (cgroup) - virCgroupFree(&cgroup); if (vm) virObjectUnlock(vm); lxcDriverUnlock(driver); @@ -1414,7 +1381,6 @@ static int lxcStartup(bool privileged, void *opaque ATTRIBUTE_UNUSED) { char *ld; - int rc; /* Valgrind gets very annoyed when we clone containers, so * disable LXC when under valgrind @@ -1459,16 +1425,6 @@ static int lxcStartup(bool privileged, lxc_driver->log_libvirtd = 0; /* by default log to container logfile */ lxc_driver->have_netns = lxcCheckNetNsSupport(); - rc = virCgroupForDriver("lxc", &lxc_driver->cgroup, privileged, 1, -1); - if (rc < 0) { - char buf[1024] ATTRIBUTE_UNUSED; - VIR_DEBUG("Unable to create cgroup for LXC driver: %s", - virStrerror(-rc, buf, sizeof(buf))); - /* Don't abort startup. 
We will explicitly report to - * the user when they try to start a VM - */ - } - /* Call function to load lxc driver configuration information */ if (lxcLoadDriverConfig(lxc_driver) < 0) goto cleanup; @@ -1639,30 +1595,32 @@ cleanup: } -static bool lxcCgroupControllerActive(virLXCDriverPtr driver, - int controller) -{ - return virCgroupHasController(driver->cgroup, controller); -} - - - -static char *lxcGetSchedulerType(virDomainPtr domain, +static char *lxcGetSchedulerType(virDomainPtr dom, int *nparams) { - virLXCDriverPtr driver = domain->conn->privateData; + virLXCDriverPtr driver = dom->conn->privateData; char *ret = NULL; int rc; + virDomainObjPtr vm; + virLXCDomainObjPrivatePtr priv; lxcDriverLock(driver); - if (!lxcCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPU)) { + vm = virDomainObjListFindByUUID(driver->domains, dom->uuid); + if (vm == NULL) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("No such domain %s"), dom->uuid); + goto cleanup; + } + priv = vm->privateData; + + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("cgroup CPU controller is not mounted")); goto cleanup; } if (nparams) { - rc = lxcGetCpuBWStatus(driver->cgroup); + rc = lxcGetCpuBWStatus(priv->cgroup); if (rc < 0) goto cleanup; else if (rc == 0) @@ -1676,6 +1634,8 @@ static char *lxcGetSchedulerType(virDomainPtr domain, virReportOOMError(); cleanup: + if (vm) + virObjectUnlock(vm); lxcDriverUnlock(driver); return ret; } @@ -1762,11 +1722,11 @@ lxcSetSchedulerParametersFlags(virDomainPtr dom, { virLXCDriverPtr driver = dom->conn->privateData; int i; - virCgroupPtr group = NULL; virDomainObjPtr vm = NULL; virDomainDefPtr vmdef = NULL; int ret = -1; int rc; + virLXCDomainObjPrivatePtr priv; virCheckFlags(VIR_DOMAIN_AFFECT_LIVE | VIR_DOMAIN_AFFECT_CONFIG, -1); @@ -1789,6 +1749,7 @@ lxcSetSchedulerParametersFlags(virDomainPtr dom, _("No such domain %s"), dom->uuid); goto cleanup; } + priv = vm->privateData; if 
(virDomainLiveConfigHelperMethod(driver->caps, driver->xmlopt, vm, &flags, &vmdef) < 0) @@ -1802,17 +1763,11 @@ lxcSetSchedulerParametersFlags(virDomainPtr dom, } if (flags & VIR_DOMAIN_AFFECT_LIVE) { - if (!lxcCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPU)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("cgroup CPU controller is not mounted")); goto cleanup; } - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), - vm->def->name); - goto cleanup; - } } for (i = 0; i < nparams; i++) { @@ -1820,7 +1775,7 @@ lxcSetSchedulerParametersFlags(virDomainPtr dom, if (STREQ(param->field, VIR_DOMAIN_SCHEDULER_CPU_SHARES)) { if (flags & VIR_DOMAIN_AFFECT_LIVE) { - rc = virCgroupSetCpuShares(group, params[i].value.ul); + rc = virCgroupSetCpuShares(priv->cgroup, params[i].value.ul); if (rc != 0) { virReportSystemError(-rc, "%s", _("unable to set cpu shares tunable")); @@ -1835,7 +1790,7 @@ lxcSetSchedulerParametersFlags(virDomainPtr dom, } } else if (STREQ(param->field, VIR_DOMAIN_SCHEDULER_VCPU_PERIOD)) { if (flags & VIR_DOMAIN_AFFECT_LIVE) { - rc = lxcSetVcpuBWLive(group, params[i].value.ul, 0); + rc = lxcSetVcpuBWLive(priv->cgroup, params[i].value.ul, 0); if (rc != 0) goto cleanup; @@ -1848,7 +1803,7 @@ lxcSetSchedulerParametersFlags(virDomainPtr dom, } } else if (STREQ(param->field, VIR_DOMAIN_SCHEDULER_VCPU_QUOTA)) { if (flags & VIR_DOMAIN_AFFECT_LIVE) { - rc = lxcSetVcpuBWLive(group, 0, params[i].value.l); + rc = lxcSetVcpuBWLive(priv->cgroup, 0, params[i].value.l); if (rc != 0) goto cleanup; @@ -1879,7 +1834,6 @@ lxcSetSchedulerParametersFlags(virDomainPtr dom, cleanup: virDomainDefFree(vmdef); - virCgroupFree(&group); if (vm) virObjectUnlock(vm); lxcDriverUnlock(driver); @@ -1901,7 +1855,6 @@ lxcGetSchedulerParametersFlags(virDomainPtr dom, unsigned int flags) { virLXCDriverPtr 
driver = dom->conn->privateData; - virCgroupPtr group = NULL; virDomainObjPtr vm = NULL; virDomainDefPtr persistentDef; unsigned long long shares = 0; @@ -1911,19 +1864,13 @@ lxcGetSchedulerParametersFlags(virDomainPtr dom, int rc; bool cpu_bw_status = false; int saved_nparams = 0; + virLXCDomainObjPrivatePtr priv; virCheckFlags(VIR_DOMAIN_AFFECT_LIVE | VIR_DOMAIN_AFFECT_CONFIG, -1); lxcDriverLock(driver); - if (*nparams > 1) { - rc = lxcGetCpuBWStatus(driver->cgroup); - if (rc < 0) - goto cleanup; - cpu_bw_status = !!rc; - } - vm = virDomainObjListFindByUUID(driver->domains, dom->uuid); if (vm == NULL) { @@ -1931,6 +1878,14 @@ lxcGetSchedulerParametersFlags(virDomainPtr dom, _("No such domain %s"), dom->uuid); goto cleanup; } + priv = vm->privateData; + + if (*nparams > 1) { + rc = lxcGetCpuBWStatus(priv->cgroup); + if (rc < 0) + goto cleanup; + cpu_bw_status = !!rc; + } if (virDomainLiveConfigHelperMethod(driver->caps, driver->xmlopt, vm, &flags, &persistentDef) < 0) @@ -1945,19 +1900,13 @@ lxcGetSchedulerParametersFlags(virDomainPtr dom, goto out; } - if (!lxcCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPU)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("cgroup CPU controller is not mounted")); goto cleanup; } - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), vm->def->name); - goto cleanup; - } - - rc = virCgroupGetCpuShares(group, &shares); + rc = virCgroupGetCpuShares(priv->cgroup, &shares); if (rc != 0) { virReportSystemError(-rc, "%s", _("unable to get cpu shares tunable")); @@ -1965,7 +1914,7 @@ lxcGetSchedulerParametersFlags(virDomainPtr dom, } if (*nparams > 1 && cpu_bw_status) { - rc = lxcGetVcpuBWLive(group, &period, &quota); + rc = lxcGetVcpuBWLive(priv->cgroup, &period, &quota); if (rc != 0) goto cleanup; } @@ -1998,7 +1947,6 @@ out: ret = 0; cleanup: - 
virCgroupFree(&group); if (vm) virObjectUnlock(vm); lxcDriverUnlock(driver); @@ -2022,10 +1970,10 @@ lxcDomainSetBlkioParameters(virDomainPtr dom, { virLXCDriverPtr driver = dom->conn->privateData; int i; - virCgroupPtr group = NULL; virDomainObjPtr vm = NULL; virDomainDefPtr persistentDef = NULL; int ret = -1; + virLXCDomainObjPrivatePtr priv; virCheckFlags(VIR_DOMAIN_AFFECT_LIVE | VIR_DOMAIN_AFFECT_CONFIG, -1); @@ -2044,24 +1992,19 @@ lxcDomainSetBlkioParameters(virDomainPtr dom, _("No such domain %s"), dom->uuid); goto cleanup; } + priv = vm->privateData; if (virDomainLiveConfigHelperMethod(driver->caps, driver->xmlopt, vm, &flags, &persistentDef) < 0) goto cleanup; if (flags & VIR_DOMAIN_AFFECT_LIVE) { - if (!lxcCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_BLKIO)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_BLKIO)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("blkio cgroup isn't mounted")); goto cleanup; } - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), vm->def->name); - goto cleanup; - } - for (i = 0; i < nparams; i++) { virTypedParameterPtr param = &params[i]; @@ -2074,7 +2017,7 @@ lxcDomainSetBlkioParameters(virDomainPtr dom, goto cleanup; } - rc = virCgroupSetBlkioWeight(group, params[i].value.ui); + rc = virCgroupSetBlkioWeight(priv->cgroup, params[i].value.ui); if (rc != 0) { virReportSystemError(-rc, "%s", _("unable to set blkio weight tunable")); @@ -2107,7 +2050,6 @@ lxcDomainSetBlkioParameters(virDomainPtr dom, ret = 0; cleanup: - virCgroupFree(&group); if (vm) virObjectUnlock(vm); lxcDriverUnlock(driver); @@ -2124,12 +2066,12 @@ lxcDomainGetBlkioParameters(virDomainPtr dom, { virLXCDriverPtr driver = dom->conn->privateData; int i; - virCgroupPtr group = NULL; virDomainObjPtr vm = NULL; virDomainDefPtr persistentDef = NULL; unsigned int val; int ret = -1; int rc; + virLXCDomainObjPrivatePtr priv; 
virCheckFlags(VIR_DOMAIN_AFFECT_LIVE | VIR_DOMAIN_AFFECT_CONFIG, -1); @@ -2142,6 +2084,7 @@ lxcDomainGetBlkioParameters(virDomainPtr dom, _("No such domain %s"), dom->uuid); goto cleanup; } + priv = vm->privateData; if ((*nparams) == 0) { /* Current number of blkio parameters supported by cgroups */ @@ -2155,25 +2098,19 @@ lxcDomainGetBlkioParameters(virDomainPtr dom, goto cleanup; if (flags & VIR_DOMAIN_AFFECT_LIVE) { - if (!lxcCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_BLKIO)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_BLKIO)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("blkio cgroup isn't mounted")); goto cleanup; } - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), vm->def->name); - goto cleanup; - } - for (i = 0; i < *nparams && i < LXC_NB_BLKIO_PARAM; i++) { virTypedParameterPtr param = &params[i]; val = 0; switch (i) { case 0: /* fill blkio weight here */ - rc = virCgroupGetBlkioWeight(group, &val); + rc = virCgroupGetBlkioWeight(priv->cgroup, &val); if (rc != 0) { virReportSystemError(-rc, "%s", _("unable to get blkio weight")); @@ -2215,8 +2152,6 @@ lxcDomainGetBlkioParameters(virDomainPtr dom, ret = 0; cleanup: - if (group) - virCgroupFree(&group); if (vm) virObjectUnlock(vm); lxcDriverUnlock(driver); @@ -2386,7 +2321,7 @@ cleanup: return ret; } -static int lxcFreezeContainer(virLXCDriverPtr driver, virDomainObjPtr vm) +static int lxcFreezeContainer(virDomainObjPtr vm) { int timeout = 1000; /* In milliseconds */ int check_interval = 1; /* In milliseconds */ @@ -2394,13 +2329,7 @@ static int lxcFreezeContainer(virLXCDriverPtr driver, virDomainObjPtr vm) int waited_time = 0; int ret = -1; char *state = NULL; - virCgroupPtr cgroup = NULL; - - if (!(driver->cgroup && - virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0) == 0)) - return -1; - - /* From here on, we know that cgroup != NULL. 
*/ + virLXCDomainObjPrivatePtr priv = vm->privateData; while (waited_time < timeout) { int r; @@ -2411,7 +2340,7 @@ static int lxcFreezeContainer(virLXCDriverPtr driver, virDomainObjPtr vm) * to "FROZEN". * (see linux-2.6/Documentation/cgroups/freezer-subsystem.txt) */ - r = virCgroupSetFreezerState(cgroup, "FROZEN"); + r = virCgroupSetFreezerState(priv->cgroup, "FROZEN"); /* * Returning EBUSY explicitly indicates that the group is @@ -2438,7 +2367,7 @@ static int lxcFreezeContainer(virLXCDriverPtr driver, virDomainObjPtr vm) */ usleep(check_interval * 1000); - r = virCgroupGetFreezerState(cgroup, &state); + r = virCgroupGetFreezerState(priv->cgroup, &state); if (r < 0) { VIR_DEBUG("Reading freezer.state failed with errno: %d", r); @@ -2470,11 +2399,10 @@ error: * activate the group again and return an error. * This is likely to fall the group back again gracefully. */ - virCgroupSetFreezerState(cgroup, "THAWED"); + virCgroupSetFreezerState(priv->cgroup, "THAWED"); ret = -1; cleanup: - virCgroupFree(&cgroup); VIR_FREE(state); return ret; } @@ -2504,7 +2432,7 @@ static int lxcDomainSuspend(virDomainPtr dom) } if (virDomainObjGetState(vm, NULL) != VIR_DOMAIN_PAUSED) { - if (lxcFreezeContainer(driver, vm) < 0) { + if (lxcFreezeContainer(vm) < 0) { virReportError(VIR_ERR_OPERATION_FAILED, "%s", _("Suspend operation failed")); goto cleanup; @@ -2529,27 +2457,13 @@ cleanup: return ret; } -static int lxcUnfreezeContainer(virLXCDriverPtr driver, virDomainObjPtr vm) -{ - int ret; - virCgroupPtr cgroup = NULL; - - if (!(driver->cgroup && - virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0) == 0)) - return -1; - - ret = virCgroupSetFreezerState(cgroup, "THAWED"); - - virCgroupFree(&cgroup); - return ret; -} - static int lxcDomainResume(virDomainPtr dom) { virLXCDriverPtr driver = dom->conn->privateData; virDomainObjPtr vm; virDomainEventPtr event = NULL; int ret = -1; + virLXCDomainObjPrivatePtr priv; lxcDriverLock(driver); vm = 
virDomainObjListFindByUUID(driver->domains, dom->uuid); @@ -2562,6 +2476,8 @@ static int lxcDomainResume(virDomainPtr dom) goto cleanup; } + priv = vm->privateData; + if (!virDomainObjIsActive(vm)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("Domain is not running")); @@ -2569,7 +2485,7 @@ static int lxcDomainResume(virDomainPtr dom) } if (virDomainObjGetState(vm, NULL) == VIR_DOMAIN_PAUSED) { - if (lxcUnfreezeContainer(driver, vm) < 0) { + if (virCgroupSetFreezerState(priv->cgroup, "THAWED") < 0) { virReportError(VIR_ERR_OPERATION_FAILED, "%s", _("Resume operation failed")); goto cleanup; @@ -3112,7 +3028,6 @@ lxcDomainAttachDeviceDiskLive(virLXCDriverPtr driver, { virLXCDomainObjPrivatePtr priv = vm->privateData; virDomainDiskDefPtr def = dev->data.disk; - virCgroupPtr group = NULL; int ret = -1; char *dst = NULL; struct stat sb; @@ -3197,19 +3112,13 @@ lxcDomainAttachDeviceDiskLive(virLXCDriverPtr driver, vm->def, def) < 0) goto cleanup; - if (!lxcCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_DEVICES)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("devices cgroup isn't mounted")); goto cleanup; } - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), vm->def->name); - goto cleanup; - } - - if (virCgroupAllowDevicePath(group, def->src, + if (virCgroupAllowDevicePath(priv->cgroup, def->src, (def->readonly ? 
VIR_CGROUP_DEVICE_READ : VIR_CGROUP_DEVICE_RW) | @@ -3227,8 +3136,6 @@ lxcDomainAttachDeviceDiskLive(virLXCDriverPtr driver, cleanup: def->src = tmpsrc; virDomainAuditDisk(vm, NULL, def->src, "attach", ret == 0); - if (group) - virCgroupFree(&group); if (dst && created && ret < 0) unlink(dst); return ret; @@ -3382,7 +3289,6 @@ lxcDomainAttachDeviceHostdevSubsysUSBLive(virLXCDriverPtr driver, mode_t mode; bool created = false; virUSBDevicePtr usb = NULL; - virCgroupPtr group = NULL; if (virDomainHostdevFind(vm->def, def, NULL) >= 0) { virReportError(VIR_ERR_OPERATION_FAILED, "%s", @@ -3417,18 +3323,12 @@ lxcDomainAttachDeviceHostdevSubsysUSBLive(virLXCDriverPtr driver, goto cleanup; } - if (!lxcCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_DEVICES)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("devices cgroup isn't mounted")); goto cleanup; } - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), vm->def->name); - goto cleanup; - } - if (!(usb = virUSBDeviceNew(def->source.subsys.u.usb.bus, def->source.subsys.u.usb.device, vroot))) goto cleanup; @@ -3469,8 +3369,8 @@ lxcDomainAttachDeviceHostdevSubsysUSBLive(virLXCDriverPtr driver, goto cleanup; if (virUSBDeviceFileIterate(usb, - virLXCSetupHostUsbDeviceCgroup, - &group) < 0) + virLXCSetupHostUsbDeviceCgroup, + &priv->cgroup) < 0) goto cleanup; ret = 0; @@ -3481,7 +3381,6 @@ cleanup: unlink(dstfile); virUSBDeviceFree(usb); - virCgroupFree(&group); VIR_FREE(src); VIR_FREE(dstfile); VIR_FREE(dstdir); @@ -3497,7 +3396,6 @@ lxcDomainAttachDeviceHostdevStorageLive(virLXCDriverPtr driver, { virLXCDomainObjPrivatePtr priv = vm->privateData; virDomainHostdevDefPtr def = dev->data.hostdev; - virCgroupPtr group = NULL; int ret = -1; char *dst = NULL; char *vroot = NULL; @@ -3566,19 +3464,13 @@ 
lxcDomainAttachDeviceHostdevStorageLive(virLXCDriverPtr driver, vm->def, def, vroot) < 0) goto cleanup; - if (!lxcCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_DEVICES)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("devices cgroup isn't mounted")); goto cleanup; } - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), vm->def->name); - goto cleanup; - } - - if (virCgroupAllowDevicePath(group, def->source.caps.u.storage.block, + if (virCgroupAllowDevicePath(priv->cgroup, def->source.caps.u.storage.block, VIR_CGROUP_DEVICE_RW | VIR_CGROUP_DEVICE_MKNOD) != 0) { virReportError(VIR_ERR_INTERNAL_ERROR, @@ -3593,8 +3485,6 @@ lxcDomainAttachDeviceHostdevStorageLive(virLXCDriverPtr driver, cleanup: virDomainAuditHostdev(vm, def, "attach", ret == 0); - if (group) - virCgroupFree(&group); if (dst && created && ret < 0) unlink(dst); VIR_FREE(dst); @@ -3610,7 +3500,6 @@ lxcDomainAttachDeviceHostdevMiscLive(virLXCDriverPtr driver, { virLXCDomainObjPrivatePtr priv = vm->privateData; virDomainHostdevDefPtr def = dev->data.hostdev; - virCgroupPtr group = NULL; int ret = -1; char *dst = NULL; char *vroot = NULL; @@ -3679,19 +3568,13 @@ lxcDomainAttachDeviceHostdevMiscLive(virLXCDriverPtr driver, vm->def, def, vroot) < 0) goto cleanup; - if (!lxcCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_DEVICES)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("devices cgroup isn't mounted")); goto cleanup; } - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), vm->def->name); - goto cleanup; - } - - if (virCgroupAllowDevicePath(group, def->source.caps.u.misc.chardev, + if (virCgroupAllowDevicePath(priv->cgroup, 
def->source.caps.u.misc.chardev, VIR_CGROUP_DEVICE_RW | VIR_CGROUP_DEVICE_MKNOD) != 0) { virReportError(VIR_ERR_INTERNAL_ERROR, @@ -3706,8 +3589,6 @@ lxcDomainAttachDeviceHostdevMiscLive(virLXCDriverPtr driver, cleanup: virDomainAuditHostdev(vm, def, "attach", ret == 0); - if (group) - virCgroupFree(&group); if (dst && created && ret < 0) unlink(dst); VIR_FREE(dst); @@ -3824,13 +3705,11 @@ lxcDomainAttachDeviceLive(virConnectPtr conn, static int -lxcDomainDetachDeviceDiskLive(virLXCDriverPtr driver, - virDomainObjPtr vm, +lxcDomainDetachDeviceDiskLive(virDomainObjPtr vm, virDomainDeviceDefPtr dev) { virLXCDomainObjPrivatePtr priv = vm->privateData; virDomainDiskDefPtr def = NULL; - virCgroupPtr group = NULL; int i, ret = -1; char *dst = NULL; @@ -3856,18 +3735,12 @@ lxcDomainDetachDeviceDiskLive(virLXCDriverPtr driver, goto cleanup; } - if (!lxcCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_DEVICES)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("devices cgroup isn't mounted")); goto cleanup; } - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), vm->def->name); - goto cleanup; - } - VIR_DEBUG("Unlinking %s (backed by %s)", dst, def->src); if (unlink(dst) < 0 && errno != ENOENT) { virDomainAuditDisk(vm, def->src, NULL, "detach", false); @@ -3877,7 +3750,7 @@ lxcDomainDetachDeviceDiskLive(virLXCDriverPtr driver, } virDomainAuditDisk(vm, def->src, NULL, "detach", true); - if (virCgroupDenyDevicePath(group, def->src, VIR_CGROUP_DEVICE_RWM) != 0) + if (virCgroupDenyDevicePath(priv->cgroup, def->src, VIR_CGROUP_DEVICE_RWM) != 0) VIR_WARN("cannot deny device %s for domain %s", def->src, vm->def->name); @@ -3888,8 +3761,6 @@ lxcDomainDetachDeviceDiskLive(virLXCDriverPtr driver, cleanup: VIR_FREE(dst); - if (group) - virCgroupFree(&group); return ret; } @@ -3967,7 +3838,6 @@ 
lxcDomainDetachDeviceHostdevUSBLive(virLXCDriverPtr driver, { virLXCDomainObjPrivatePtr priv = vm->privateData; virDomainHostdevDefPtr def = NULL; - virCgroupPtr group = NULL; int idx, ret = -1; char *dst = NULL; char *vroot; @@ -3995,18 +3865,12 @@ lxcDomainDetachDeviceHostdevUSBLive(virLXCDriverPtr driver, goto cleanup; } - if (!lxcCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_DEVICES)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("devices cgroup isn't mounted")); goto cleanup; } - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), vm->def->name); - goto cleanup; - } - if (!(usb = virUSBDeviceNew(def->source.subsys.u.usb.bus, def->source.subsys.u.usb.device, vroot))) goto cleanup; @@ -4021,8 +3885,8 @@ lxcDomainDetachDeviceHostdevUSBLive(virLXCDriverPtr driver, virDomainAuditHostdev(vm, def, "detach", true); if (virUSBDeviceFileIterate(usb, - virLXCTeardownHostUsbDeviceCgroup, - &group) < 0) + virLXCTeardownHostUsbDeviceCgroup, + &priv->cgroup) < 0) VIR_WARN("cannot deny device %s for domain %s", dst, vm->def->name); @@ -4036,19 +3900,16 @@ lxcDomainDetachDeviceHostdevUSBLive(virLXCDriverPtr driver, cleanup: virUSBDeviceFree(usb); VIR_FREE(dst); - virCgroupFree(&group); return ret; } static int -lxcDomainDetachDeviceHostdevStorageLive(virLXCDriverPtr driver, - virDomainObjPtr vm, +lxcDomainDetachDeviceHostdevStorageLive(virDomainObjPtr vm, virDomainDeviceDefPtr dev) { virLXCDomainObjPrivatePtr priv = vm->privateData; virDomainHostdevDefPtr def = NULL; - virCgroupPtr group = NULL; int i, ret = -1; char *dst = NULL; @@ -4074,18 +3935,12 @@ lxcDomainDetachDeviceHostdevStorageLive(virLXCDriverPtr driver, goto cleanup; } - if (!lxcCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_DEVICES)) { 
virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("devices cgroup isn't mounted")); goto cleanup; } - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), vm->def->name); - goto cleanup; - } - VIR_DEBUG("Unlinking %s", dst); if (unlink(dst) < 0 && errno != ENOENT) { virDomainAuditHostdev(vm, def, "detach", false); @@ -4095,7 +3950,7 @@ lxcDomainDetachDeviceHostdevStorageLive(virLXCDriverPtr driver, } virDomainAuditHostdev(vm, def, "detach", true); - if (virCgroupDenyDevicePath(group, def->source.caps.u.storage.block, VIR_CGROUP_DEVICE_RWM) != 0) + if (virCgroupDenyDevicePath(priv->cgroup, def->source.caps.u.storage.block, VIR_CGROUP_DEVICE_RWM) != 0) VIR_WARN("cannot deny device %s for domain %s", def->source.caps.u.storage.block, vm->def->name); @@ -4106,20 +3961,16 @@ lxcDomainDetachDeviceHostdevStorageLive(virLXCDriverPtr driver, cleanup: VIR_FREE(dst); - if (group) - virCgroupFree(&group); return ret; } static int -lxcDomainDetachDeviceHostdevMiscLive(virLXCDriverPtr driver, - virDomainObjPtr vm, +lxcDomainDetachDeviceHostdevMiscLive(virDomainObjPtr vm, virDomainDeviceDefPtr dev) { virLXCDomainObjPrivatePtr priv = vm->privateData; virDomainHostdevDefPtr def = NULL; - virCgroupPtr group = NULL; int i, ret = -1; char *dst = NULL; @@ -4145,18 +3996,12 @@ lxcDomainDetachDeviceHostdevMiscLive(virLXCDriverPtr driver, goto cleanup; } - if (!lxcCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_DEVICES)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("devices cgroup isn't mounted")); goto cleanup; } - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), vm->def->name); - goto cleanup; - } - VIR_DEBUG("Unlinking %s", dst); if (unlink(dst) < 0 && errno != ENOENT) { 
virDomainAuditHostdev(vm, def, "detach", false); @@ -4166,7 +4011,7 @@ lxcDomainDetachDeviceHostdevMiscLive(virLXCDriverPtr driver, } virDomainAuditHostdev(vm, def, "detach", true); - if (virCgroupDenyDevicePath(group, def->source.caps.u.misc.chardev, VIR_CGROUP_DEVICE_RWM) != 0) + if (virCgroupDenyDevicePath(priv->cgroup, def->source.caps.u.misc.chardev, VIR_CGROUP_DEVICE_RWM) != 0) VIR_WARN("cannot deny device %s for domain %s", def->source.caps.u.misc.chardev, vm->def->name); @@ -4177,8 +4022,6 @@ lxcDomainDetachDeviceHostdevMiscLive(virLXCDriverPtr driver, cleanup: VIR_FREE(dst); - if (group) - virCgroupFree(&group); return ret; } @@ -4202,16 +4045,15 @@ lxcDomainDetachDeviceHostdevSubsysLive(virLXCDriverPtr driver, static int -lxcDomainDetachDeviceHostdevCapsLive(virLXCDriverPtr driver, - virDomainObjPtr vm, - virDomainDeviceDefPtr dev) +lxcDomainDetachDeviceHostdevCapsLive(virDomainObjPtr vm, + virDomainDeviceDefPtr dev) { switch (dev->data.hostdev->source.caps.type) { case VIR_DOMAIN_HOSTDEV_CAPS_TYPE_STORAGE: - return lxcDomainDetachDeviceHostdevStorageLive(driver, vm, dev); + return lxcDomainDetachDeviceHostdevStorageLive(vm, dev); case VIR_DOMAIN_HOSTDEV_CAPS_TYPE_MISC: - return lxcDomainDetachDeviceHostdevMiscLive(driver, vm, dev); + return lxcDomainDetachDeviceHostdevMiscLive(vm, dev); default: virReportError(VIR_ERR_CONFIG_UNSUPPORTED, @@ -4240,7 +4082,7 @@ lxcDomainDetachDeviceHostdevLive(virLXCDriverPtr driver, return lxcDomainDetachDeviceHostdevSubsysLive(driver, vm, dev); case VIR_DOMAIN_HOSTDEV_MODE_CAPABILITIES: - return lxcDomainDetachDeviceHostdevCapsLive(driver, vm, dev); + return lxcDomainDetachDeviceHostdevCapsLive(vm, dev); default: virReportError(VIR_ERR_CONFIG_UNSUPPORTED, @@ -4260,7 +4102,7 @@ lxcDomainDetachDeviceLive(virLXCDriverPtr driver, switch (dev->type) { case VIR_DOMAIN_DEVICE_DISK: - ret = lxcDomainDetachDeviceDiskLive(driver, vm, dev); + ret = lxcDomainDetachDeviceDiskLive(vm, dev); break; case VIR_DOMAIN_DEVICE_NET: diff 
--git a/src/lxc/lxc_process.c b/src/lxc/lxc_process.c index b51c880..1bbffa3 100644 --- a/src/lxc/lxc_process.c +++ b/src/lxc/lxc_process.c @@ -29,6 +29,7 @@ #include "lxc_process.h" #include "lxc_domain.h" #include "lxc_container.h" +#include "lxc_cgroup.h" #include "lxc_fuse.h" #include "datatypes.h" #include "virfile.h" @@ -219,7 +220,6 @@ static void virLXCProcessCleanup(virLXCDriverPtr driver, virDomainObjPtr vm, virDomainShutoffReason reason) { - virCgroupPtr cgroup; int i; virLXCDomainObjPrivatePtr priv = vm->privateData; virNetDevVPortProfilePtr vport = NULL; @@ -277,10 +277,9 @@ static void virLXCProcessCleanup(virLXCDriverPtr driver, virDomainConfVMNWFilterTeardown(vm); - if (driver->cgroup && - virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0) == 0) { - virCgroupRemove(cgroup); - virCgroupFree(&cgroup); + if (priv->cgroup) { + virCgroupRemove(priv->cgroup); + virCgroupFree(&priv->cgroup); } /* now that we know it's stopped call the hook if present */ @@ -742,8 +741,8 @@ int virLXCProcessStop(virLXCDriverPtr driver, virDomainObjPtr vm, virDomainShutoffReason reason) { - virCgroupPtr group = NULL; int rc; + virLXCDomainObjPrivatePtr priv; VIR_DEBUG("Stopping VM name=%s pid=%d reason=%d", vm->def->name, (int)vm->pid, (int)reason); @@ -752,6 +751,8 @@ int virLXCProcessStop(virLXCDriverPtr driver, return 0; } + priv = vm->privateData; + if (vm->pid <= 0) { virReportError(VIR_ERR_INTERNAL_ERROR, _("Invalid PID %d for container"), vm->pid); @@ -769,8 +770,8 @@ int virLXCProcessStop(virLXCDriverPtr driver, VIR_FREE(vm->def->seclabels[0]->imagelabel); } - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) == 0) { - rc = virCgroupKillPainfully(group); + if (priv->cgroup) { + rc = virCgroupKillPainfully(priv->cgroup); if (rc < 0) { virReportSystemError(-rc, "%s", _("Failed to kill container PIDs")); @@ -794,7 +795,6 @@ int virLXCProcessStop(virLXCDriverPtr driver, rc = 0; cleanup: - virCgroupFree(&group); return rc; } @@ -1047,26 +1047,28 
@@ int virLXCProcessStart(virConnectPtr conn, virLXCDomainObjPrivatePtr priv = vm->privateData; virErrorPtr err = NULL; - if (!lxc_driver->cgroup) { - virReportError(VIR_ERR_INTERNAL_ERROR, "%s", - _("The 'cpuacct', 'devices' & 'memory' cgroups controllers must be mounted")); + virCgroupFree(&priv->cgroup); + + if (!(priv->cgroup = virLXCCgroupCreate(vm->def))) return -1; - } - if (!virCgroupHasController(lxc_driver->cgroup, - VIR_CGROUP_CONTROLLER_CPUACCT)) { + if (!virCgroupHasController(priv->cgroup, + VIR_CGROUP_CONTROLLER_CPUACCT)) { + virCgroupFree(&priv->cgroup); virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("Unable to find 'cpuacct' cgroups controller mount")); return -1; } - if (!virCgroupHasController(lxc_driver->cgroup, + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_DEVICES)) { + virCgroupFree(&priv->cgroup); virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("Unable to find 'devices' cgroups controller mount")); return -1; } - if (!virCgroupHasController(lxc_driver->cgroup, + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_MEMORY)) { + virCgroupFree(&priv->cgroup); virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("Unable to find 'memory' cgroups controller mount")); return -1; @@ -1462,6 +1464,9 @@ virLXCProcessReconnectDomain(virDomainObjPtr vm, if (!(priv->monitor = virLXCProcessConnectMonitor(driver, vm))) goto error; + if (!(priv->cgroup = virLXCCgroupCreate(vm->def))) + goto error; + if (virLXCUpdateActiveUsbHostdevs(driver, vm->def) < 0) goto error; -- 1.8.1.4

On 10.04.2013 12:08, Daniel P. Berrange wrote:
From: "Daniel P. Berrange" <berrange@redhat.com>
Instead of calling virCgroupForDomain every time we need the virCgroupPtr instance, just do it once at VM startup and cache a reference to the object in virLXCDomainObjPrivatePtr until shutdown of the VM. Removing the virCgroupPtr from the LXC driver state also means we don't have stale mount info if someone mounts the cgroups filesystem after libvirtd has been started.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com> --- src/lxc/lxc_cgroup.c | 17 ++- src/lxc/lxc_cgroup.h | 2 + src/lxc/lxc_conf.h | 2 +- src/lxc/lxc_controller.c | 2 +- src/lxc/lxc_domain.c | 2 + src/lxc/lxc_domain.h | 3 + src/lxc/lxc_driver.c | 354 +++++++++++++---------------------------------- src/lxc/lxc_process.c | 39 +++--- 8 files changed, 144 insertions(+), 277 deletions(-)
ACK This is the last one I have reviewed today. I will resume tomorrow. Michal
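The caching pattern described in the commit message above can be sketched roughly as follows. This is only an illustration under stated assumptions: the demoCgroup and demoDomainPriv types and every demo* function are hypothetical stand-ins invented for this sketch, not libvirt's virCgroup API, and the "/demo/" path prefix is made up.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for virCgroupPtr and the per-domain private data. */
typedef struct demoCgroup {
    char *path;
} demoCgroup;

typedef struct demoDomainPriv {
    demoCgroup *cgroup;   /* cached from startup until shutdown */
} demoDomainPriv;

/* Counts how often the (notionally expensive) lookup really happens. */
static int demoLookups = 0;

static demoCgroup *demoCgroupNew(const char *name)
{
    demoCgroup *g = malloc(sizeof(*g));
    size_t len;

    if (!g)
        return NULL;
    len = strlen("/demo/") + strlen(name) + 1;
    if (!(g->path = malloc(len))) {
        free(g);
        return NULL;
    }
    snprintf(g->path, len, "/demo/%s", name);
    demoLookups++;
    return g;
}

static void demoCgroupFree(demoCgroup **g)
{
    if (!g || !*g)
        return;
    free((*g)->path);
    free(*g);
    *g = NULL;
}

/* Startup: drop any stale handle, then look the group up exactly once. */
static int demoDomainStart(demoDomainPriv *priv, const char *name)
{
    demoCgroupFree(&priv->cgroup);
    if (!(priv->cgroup = demoCgroupNew(name)))
        return -1;
    return 0;
}

/* Any later API call reuses the cached handle instead of a fresh lookup. */
static const char *demoDomainCgroupPath(const demoDomainPriv *priv)
{
    return priv->cgroup ? priv->cgroup->path : NULL;
}

/* Shutdown: release the cached handle. */
static void demoDomainStop(demoDomainPriv *priv)
{
    demoCgroupFree(&priv->cgroup);
}
```

The point of the design is that the per-call "look up, use, free" sequence disappears from every API entry point; only start and stop touch the handle's lifetime.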

From: "Daniel P. Berrange" <berrange@redhat.com> The definitions of the cgroup structs are kept in vircgroup.c since they are intended to be private from users of the API. To enable effective testing, however, they need to be accessible. To address the latter issue, without compromising the former, this introduces a new vircgrouppriv.h file to hold the struct definitions. To prevent other files including this private header, it requires that __VIR_CGROUP_ALLOW_INCLUDE_PRIV_H__ be defined before inclusion. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> --- src/Makefile.am | 2 +- src/util/vircgroup.c | 17 +++-------------- src/util/vircgrouppriv.h | 46 ++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 50 insertions(+), 15 deletions(-) create mode 100644 src/util/vircgrouppriv.h diff --git a/src/Makefile.am b/src/Makefile.am index fc6b846..e2e9e37 100644 --- a/src/Makefile.am +++ b/src/Makefile.am @@ -69,7 +69,7 @@ UTIL_SOURCES = \ util/virauthconfig.c util/virauthconfig.h \ util/virbitmap.c util/virbitmap.h \ util/virbuffer.c util/virbuffer.h \ - util/vircgroup.c util/vircgroup.h \ + util/vircgroup.c util/vircgroup.h util/vircgrouppriv.h \ util/vircommand.c util/vircommand.h \ util/virconf.c util/virconf.h \ util/virdbus.c util/virdbus.h \ diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c index dc2b431..dfa3c8a 100644 --- a/src/util/vircgroup.c +++ b/src/util/vircgroup.c @@ -37,10 +37,11 @@ #include <libgen.h> #include <dirent.h> -#include "internal.h" +#define __VIR_CGROUP_ALLOW_INCLUDE_PRIV_H__ +#include "vircgrouppriv.h" + #include "virutil.h" #include "viralloc.h" -#include "vircgroup.h" #include "virlog.h" #include "virfile.h" #include "virhash.h" @@ -52,18 +53,6 @@ VIR_ENUM_IMPL(virCgroupController, VIR_CGROUP_CONTROLLER_LAST, "cpu", "cpuacct", "cpuset", "memory", "devices", "freezer", "blkio"); -struct virCgroupController { - int type; - char *mountPoint; - char *placement; -}; - -struct virCgroup { - char *path; - - struct 
virCgroupController controllers[VIR_CGROUP_CONTROLLER_LAST]; -}; - typedef enum { VIR_CGROUP_NONE = 0, /* create subdir under each cgroup if possible. */ VIR_CGROUP_MEM_HIERACHY = 1 << 0, /* call virCgroupSetMemoryUseHierarchy diff --git a/src/util/vircgrouppriv.h b/src/util/vircgrouppriv.h new file mode 100644 index 0000000..cc8cc0b --- /dev/null +++ b/src/util/vircgrouppriv.h @@ -0,0 +1,46 @@ +/* + * vircgrouppriv.h: methods for managing control cgroups + * + * Copyright (C) 2011-2013 Red Hat, Inc. + * Copyright IBM Corp. 2008 + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library. If not, see + * <http://www.gnu.org/licenses/>. + * + * Authors: + * Dan Smith <danms@us.ibm.com> + */ + +#ifndef __VIR_CGROUP_ALLOW_INCLUDE_PRIV_H__ +# error "vircgrouppriv.h may only be included by vircgroup.c or its test suite" +#endif + +#ifndef __VIR_CGROUP_PRIV_H__ +# define __VIR_CGROUP_PRIV_H__ + +# include "vircgroup.h" + +struct virCgroupController { + int type; + char *mountPoint; + char *placement; +}; + +struct virCgroup { + char *path; + + struct virCgroupController controllers[VIR_CGROUP_CONTROLLER_LAST]; +}; + +#endif /* __VIR_CGROUP_PRIV_H__ */ -- 1.8.1.4

On 04/10/2013 04:08 AM, Daniel P. Berrange wrote:
From: "Daniel P. Berrange" <berrange@redhat.com>
The definitions of the cgroup structs are kept in vircgroup.c since they are intended to be private from users of the API. To enable effective testing, however, they need to be accessible. To address the latter issue, without compromising the former, this introduces a new vircgrouppriv.h file to hold the struct definitions.
To prevent other files including this private header, it requires that __VIR_CGROUP_ALLOW_INCLUDE_PRIV_H__ be defined before inclusion.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com> --- src/Makefile.am | 2 +- src/util/vircgroup.c | 17 +++-------------- src/util/vircgrouppriv.h | 46 ++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 50 insertions(+), 15 deletions(-) create mode 100644 src/util/vircgrouppriv.h
ACK. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
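For readers unfamiliar with the opt-in guard the patch relies on, here is a rough single-file sketch. The header body is inlined so the example compiles on its own; the DEMO_* macros, the demoController and demoGroup structs, and demoGroupHasMount() are hypothetical names invented for this illustration and are not part of libvirt.

```c
#include <stddef.h>

/* The includer opts in explicitly. Delete this line and the #error
 * below stops the build, which is the point of the guard. */
#define DEMO_ALLOW_INCLUDE_PRIV_H

/* ---- begin: what would live in a private header, e.g. demopriv.h ---- */
#ifndef DEMO_ALLOW_INCLUDE_PRIV_H
# error "demopriv.h may only be included by demo.c or its test suite"
#endif

#ifndef DEMO_PRIV_H
# define DEMO_PRIV_H

struct demoController {
    int type;
    const char *mountPoint;
};

struct demoGroup {
    const char *path;
    struct demoController controllers[2];
};

#endif /* DEMO_PRIV_H */
/* ---- end of the private header ---- */

/* Code that opted in (the implementation file or a test) can now look
 * inside the otherwise opaque struct. */
static int demoGroupHasMount(const struct demoGroup *g, int idx)
{
    return g->controllers[idx].mountPoint != NULL;
}
```

The guard does not enforce privacy in any strict sense, since any file can define the macro; it only makes an accidental include fail loudly and a deliberate one easy to spot in review.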

From: "Daniel P. Berrange" <berrange@redhat.com> Rename all the virCgroupForXXX methods to use the form virCgroupNewXXX since they are all constructors. Also make sure the output parameter is the last one in the list, and annotate all pointers as non-null. Fix up all callers, and make sure they use true/false not 0/1 for the boolean parameters Signed-off-by: Daniel P. Berrange <berrange@redhat.com> --- src/libvirt_private.syms | 10 +++--- src/lxc/lxc_cgroup.c | 6 ++-- src/qemu/qemu_cgroup.c | 14 ++++---- src/qemu/qemu_driver.c | 18 +++++------ src/util/vircgroup.c | 84 ++++++++++++++++++++++-------------------------- src/util/vircgroup.h | 33 +++++++++++-------- 6 files changed, 82 insertions(+), 83 deletions(-) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index af13e50..b0a4b5a 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1096,11 +1096,6 @@ virCgroupDenyAllDevices; virCgroupDenyDevice; virCgroupDenyDeviceMajor; virCgroupDenyDevicePath; -virCgroupForDomain; -virCgroupForDriver; -virCgroupForEmulator; -virCgroupForSelf; -virCgroupForVcpu; virCgroupFree; virCgroupGetBlkioWeight; virCgroupGetCpuacctPercpuUsage; @@ -1122,6 +1117,11 @@ virCgroupKill; virCgroupKillPainfully; virCgroupKillRecursive; virCgroupMoveTask; +virCgroupNewDomain; +virCgroupNewDriver; +virCgroupNewEmulator; +virCgroupNewSelf; +virCgroupNewVcpu; virCgroupPathOfController; virCgroupRemove; virCgroupRemoveRecursively; diff --git a/src/lxc/lxc_cgroup.c b/src/lxc/lxc_cgroup.c index 1bad9ec..7d1432b 100644 --- a/src/lxc/lxc_cgroup.c +++ b/src/lxc/lxc_cgroup.c @@ -293,7 +293,7 @@ int virLXCCgroupGetMeminfo(virLXCMeminfoPtr meminfo) int ret; virCgroupPtr cgroup; - ret = virCgroupForSelf(&cgroup); + ret = virCgroupNewSelf(&cgroup); if (ret < 0) { virReportSystemError(-ret, "%s", _("Unable to get cgroup for container")); @@ -529,14 +529,14 @@ virCgroupPtr virLXCCgroupCreate(virDomainDefPtr def) virCgroupPtr cgroup = NULL; int rc; - rc = 
virCgroupForDriver("lxc", &driver, 1, 0, -1); + rc = virCgroupNewDriver("lxc", true, false, -1, &driver); if (rc != 0) { virReportSystemError(-rc, "%s", _("Unable to get cgroup for driver")); goto cleanup; } - rc = virCgroupForDomain(driver, def->name, &cgroup, 1); + rc = virCgroupNewDomain(driver, def->name, true, &cgroup); if (rc != 0) { virReportSystemError(-rc, _("Unable to create cgroup for domain %s"), diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c index 019aa2e..cb53acb 100644 --- a/src/qemu/qemu_cgroup.c +++ b/src/qemu/qemu_cgroup.c @@ -197,9 +197,11 @@ int qemuInitCgroup(virQEMUDriverPtr driver, virCgroupFree(&priv->cgroup); - rc = virCgroupForDriver("qemu", &driverGroup, - cfg->privileged, true, - cfg->cgroupControllers); + rc = virCgroupNewDriver("qemu", + cfg->privileged, + true, + cfg->cgroupControllers, + &driverGroup); if (rc != 0) { if (rc == -ENXIO || rc == -EPERM || @@ -214,7 +216,7 @@ int qemuInitCgroup(virQEMUDriverPtr driver, goto cleanup; } - rc = virCgroupForDomain(driverGroup, vm->def->name, &priv->cgroup, 1); + rc = virCgroupNewDomain(driverGroup, vm->def->name, true, &priv->cgroup); if (rc != 0) { virReportSystemError(-rc, _("Unable to create cgroup for %s"), @@ -610,7 +612,7 @@ int qemuSetupCgroupForVcpu(virDomainObjPtr vm) } for (i = 0; i < priv->nvcpupids; i++) { - rc = virCgroupForVcpu(priv->cgroup, i, &cgroup_vcpu, 1); + rc = virCgroupNewVcpu(priv->cgroup, i, true, &cgroup_vcpu); if (rc < 0) { virReportSystemError(-rc, _("Unable to create vcpu cgroup for %s(vcpu:" @@ -688,7 +690,7 @@ int qemuSetupCgroupForEmulator(virQEMUDriverPtr driver, if (priv->cgroup == NULL) return 0; /* Not supported, so claim success */ - rc = virCgroupForEmulator(priv->cgroup, &cgroup_emulator, 1); + rc = virCgroupNewEmulator(priv->cgroup, true, &cgroup_emulator); if (rc < 0) { virReportSystemError(-rc, _("Unable to create emulator cgroup for %s"), diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 420ae39..b8c859f 100644 --- 
a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -3598,7 +3598,7 @@ static int qemuDomainHotplugVcpus(virQEMUDriverPtr driver, if (priv->cgroup) { int rv = -1; /* Create cgroup for the onlined vcpu */ - rv = virCgroupForVcpu(priv->cgroup, i, &cgroup_vcpu, 1); + rv = virCgroupNewVcpu(priv->cgroup, i, true, &cgroup_vcpu); if (rv < 0) { virReportSystemError(-rv, _("Unable to create vcpu cgroup for %s(vcpu:" @@ -3672,7 +3672,7 @@ static int qemuDomainHotplugVcpus(virQEMUDriverPtr driver, if (priv->cgroup) { int rv = -1; - rv = virCgroupForVcpu(priv->cgroup, i, &cgroup_vcpu, 0); + rv = virCgroupNewVcpu(priv->cgroup, i, false, &cgroup_vcpu); if (rv < 0) { virReportSystemError(-rv, _("Unable to access vcpu cgroup for %s(vcpu:" @@ -3899,7 +3899,7 @@ qemuDomainPinVcpuFlags(virDomainPtr dom, /* Configure the corresponding cpuset cgroup before set affinity. */ if (virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) { - if (virCgroupForVcpu(priv->cgroup, vcpu, &cgroup_vcpu, 0) == 0 && + if (virCgroupNewVcpu(priv->cgroup, vcpu, false, &cgroup_vcpu) == 0 && qemuSetupCgroupVcpuPin(cgroup_vcpu, newVcpuPin, newVcpuPinNum, vcpu) < 0) { virReportError(VIR_ERR_OPERATION_INVALID, _("failed to set cpuset.cpus in cgroup" @@ -4161,7 +4161,7 @@ qemuDomainPinEmulator(virDomainPtr dom, * Configure the corresponding cpuset cgroup. * If no cgroup for domain or hypervisor exists, do nothing. 
*/ - if (virCgroupForEmulator(priv->cgroup, &cgroup_emulator, 0) == 0) { + if (virCgroupNewEmulator(priv->cgroup, false, &cgroup_emulator) == 0) { if (qemuSetupCgroupEmulatorPin(cgroup_emulator, newVcpuPin[0]->cpumask) < 0) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", @@ -7742,7 +7742,7 @@ qemuSetVcpusBWLive(virDomainObjPtr vm, virCgroupPtr cgroup, */ if (priv->nvcpupids != 0 && priv->vcpupids[0] != vm->pid) { for (i = 0; i < priv->nvcpupids; i++) { - rc = virCgroupForVcpu(cgroup, i, &cgroup_vcpu, 0); + rc = virCgroupNewVcpu(cgroup, i, false, &cgroup_vcpu); if (rc < 0) { virReportSystemError(-rc, _("Unable to find vcpu cgroup for %s(vcpu:" @@ -7780,7 +7780,7 @@ qemuSetEmulatorBandwidthLive(virDomainObjPtr vm, virCgroupPtr cgroup, return 0; } - rc = virCgroupForEmulator(cgroup, &cgroup_emulator, 0); + rc = virCgroupNewEmulator(cgroup, false, &cgroup_emulator); if (rc < 0) { virReportSystemError(-rc, _("Unable to find emulator cgroup for %s"), @@ -8033,7 +8033,7 @@ qemuGetVcpusBWLive(virDomainObjPtr vm, } /* get period and quota for vcpu0 */ - rc = virCgroupForVcpu(priv->cgroup, 0, &cgroup_vcpu, 0); + rc = virCgroupNewVcpu(priv->cgroup, 0, false, &cgroup_vcpu); if (!cgroup_vcpu) { virReportSystemError(-rc, _("Unable to find vcpu cgroup for %s(vcpu: 0)"), @@ -8071,7 +8071,7 @@ qemuGetEmulatorBandwidthLive(virDomainObjPtr vm, virCgroupPtr cgroup, } /* get period and quota for emulator */ - rc = virCgroupForEmulator(cgroup, &cgroup_emulator, 0); + rc = virCgroupNewEmulator(cgroup, false, &cgroup_emulator); if (!cgroup_emulator) { virReportSystemError(-rc, _("Unable to find emulator cgroup for %s"), @@ -14335,7 +14335,7 @@ getSumVcpuPercpuStats(virDomainObjPtr vm, unsigned long long tmp; int j; - if (virCgroupForVcpu(priv->cgroup, i, &group_vcpu, 0) < 0) { + if (virCgroupNewVcpu(priv->cgroup, i, false, &group_vcpu) < 0) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("error accessing cgroup cpuacct for vcpu")); goto cleanup; diff --git a/src/util/vircgroup.c 
b/src/util/vircgroup.c index dfa3c8a..2f52c92 100644 --- a/src/util/vircgroup.c +++ b/src/util/vircgroup.c @@ -927,7 +927,7 @@ cleanup: } /** - * virCgroupForDriver: + * virCgroupNewDriver: * * @name: name of this driver (e.g., xen, qemu, lxc) * @group: Pointer to returned virCgroupPtr @@ -935,11 +935,11 @@ cleanup: * Returns 0 on success */ #if defined HAVE_MNTENT_H && defined HAVE_GETMNTENT_R -int virCgroupForDriver(const char *name, - virCgroupPtr *group, +int virCgroupNewDriver(const char *name, bool privileged, bool create, - int controllers) + int controllers, + virCgroupPtr *group) { int rc; char *path = NULL; @@ -970,10 +970,11 @@ out: return rc; } #else -int virCgroupForDriver(const char *name ATTRIBUTE_UNUSED, - virCgroupPtr *group ATTRIBUTE_UNUSED, +int virCgroupNewDriver(const char *name ATTRIBUTE_UNUSED, bool privileged ATTRIBUTE_UNUSED, - bool create ATTRIBUTE_UNUSED) + bool create ATTRIBUTE_UNUSED, + int controllers ATTRIBUTE_UNUSED, + virCgroupPtr *group ATTRIBUTE_UNUSED) { /* Claim no support */ return -ENXIO; @@ -981,7 +982,7 @@ int virCgroupForDriver(const char *name ATTRIBUTE_UNUSED, #endif /** -* virCgroupForSelf: +* virCgroupNewSelf: * * @group: Pointer to returned virCgroupPtr * @@ -991,19 +992,19 @@ int virCgroupForDriver(const char *name ATTRIBUTE_UNUSED, * Returns 0 on success */ #if defined HAVE_MNTENT_H && defined HAVE_GETMNTENT_R -int virCgroupForSelf(virCgroupPtr *group) +int virCgroupNewSelf(virCgroupPtr *group) { return virCgroupNew("/", -1, group); } #else -int virCgroupForSelf(virCgroupPtr *group ATTRIBUTE_UNUSED) +int virCgroupNewSelf(virCgroupPtr *group ATTRIBUTE_UNUSED) { return -ENXIO; } #endif /** - * virCgroupForDomain: + * virCgroupNewDomain: * * @driver: group for driver owning the domain * @name: name of the domain @@ -1012,17 +1013,14 @@ int virCgroupForSelf(virCgroupPtr *group ATTRIBUTE_UNUSED) * Returns 0 on success */ #if defined HAVE_MNTENT_H && defined HAVE_GETMNTENT_R -int virCgroupForDomain(virCgroupPtr driver, 
+int virCgroupNewDomain(virCgroupPtr driver, const char *name, - virCgroupPtr *group, - bool create) + bool create, + virCgroupPtr *group) { int rc; char *path; - if (driver == NULL) - return -EINVAL; - if (virAsprintf(&path, "%s/%s", driver->path, name) < 0) return -ENOMEM; @@ -1048,38 +1046,35 @@ int virCgroupForDomain(virCgroupPtr driver, return rc; } #else -int virCgroupForDomain(virCgroupPtr driver ATTRIBUTE_UNUSED, +int virCgroupNewDomain(virCgroupPtr driver ATTRIBUTE_UNUSED, const char *name ATTRIBUTE_UNUSED, - virCgroupPtr *group ATTRIBUTE_UNUSED, - bool create ATTRIBUTE_UNUSED) + bool create ATTRIBUTE_UNUSED, + virCgroupPtr *group ATTRIBUTE_UNUSED) { return -ENXIO; } #endif /** - * virCgroupForVcpu: + * virCgroupNewVcpu: * - * @driver: group for the domain + * @domain: group for the domain * @vcpuid: id of the vcpu * @group: Pointer to returned virCgroupPtr * * Returns 0 on success */ #if defined HAVE_MNTENT_H && defined HAVE_GETMNTENT_R -int virCgroupForVcpu(virCgroupPtr driver, +int virCgroupNewVcpu(virCgroupPtr domain, int vcpuid, - virCgroupPtr *group, - bool create) + bool create, + virCgroupPtr *group) { int rc; char *path; int controllers; - if (driver == NULL) - return -EINVAL; - - if (virAsprintf(&path, "%s/vcpu%d", driver->path, vcpuid) < 0) + if (virAsprintf(&path, "%s/vcpu%d", domain->path, vcpuid) < 0) return -ENOMEM; controllers = ((1 << VIR_CGROUP_CONTROLLER_CPU) | @@ -1090,7 +1085,7 @@ int virCgroupForVcpu(virCgroupPtr driver, VIR_FREE(path); if (rc == 0) { - rc = virCgroupMakeGroup(driver, *group, create, VIR_CGROUP_NONE); + rc = virCgroupMakeGroup(domain, *group, create, VIR_CGROUP_NONE); if (rc != 0) virCgroupFree(group); } @@ -1098,36 +1093,33 @@ int virCgroupForVcpu(virCgroupPtr driver, return rc; } #else -int virCgroupForVcpu(virCgroupPtr driver ATTRIBUTE_UNUSED, +int virCgroupNewVcpu(virCgroupPtr domain ATTRIBUTE_UNUSED, int vcpuid ATTRIBUTE_UNUSED, - virCgroupPtr *group ATTRIBUTE_UNUSED, - bool create ATTRIBUTE_UNUSED) + bool create 
ATTRIBUTE_UNUSED, + virCgroupPtr *group ATTRIBUTE_UNUSED) { return -ENXIO; } #endif /** - * virCgroupForEmulator: + * virCgroupNewEmulator: * - * @driver: group for the domain + * @domain: group for the domain * @group: Pointer to returned virCgroupPtr * * Returns: 0 on success or -errno on failure */ #if defined HAVE_MNTENT_H && defined HAVE_GETMNTENT_R -int virCgroupForEmulator(virCgroupPtr driver, - virCgroupPtr *group, - bool create) +int virCgroupNewEmulator(virCgroupPtr domain, + bool create, + virCgroupPtr *group) { int rc; char *path; int controllers; - if (driver == NULL) - return -EINVAL; - - if (virAsprintf(&path, "%s/emulator", driver->path) < 0) + if (virAsprintf(&path, "%s/emulator", domain->path) < 0) return -ENOMEM; controllers = ((1 << VIR_CGROUP_CONTROLLER_CPU) | @@ -1138,7 +1130,7 @@ int virCgroupForEmulator(virCgroupPtr driver, VIR_FREE(path); if (rc == 0) { - rc = virCgroupMakeGroup(driver, *group, create, VIR_CGROUP_NONE); + rc = virCgroupMakeGroup(domain, *group, create, VIR_CGROUP_NONE); if (rc != 0) virCgroupFree(group); } @@ -1146,9 +1138,9 @@ int virCgroupForEmulator(virCgroupPtr driver, return rc; } #else -int virCgroupForEmulator(virCgroupPtr driver ATTRIBUTE_UNUSED, - virCgroupPtr *group ATTRIBUTE_UNUSED, - bool create ATTRIBUTE_UNUSED) +int virCgroupNewEmulator(virCgroupPtr domain ATTRIBUTE_UNUSED, + bool create ATTRIBUTE_UNUSED, + virCgroupPtr *group ATTRIBUTE_UNUSED) { return -ENXIO; } diff --git a/src/util/vircgroup.h b/src/util/vircgroup.h index 4c1134d..91143e2 100644 --- a/src/util/vircgroup.h +++ b/src/util/vircgroup.h @@ -44,27 +44,32 @@ enum { VIR_ENUM_DECL(virCgroupController); -int virCgroupForDriver(const char *name, - virCgroupPtr *group, +int virCgroupNewDriver(const char *name, bool privileged, bool create, - int controllers); + int controllers, + virCgroupPtr *group) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(5); -int virCgroupForSelf(virCgroupPtr *group); +int virCgroupNewSelf(virCgroupPtr *group) + 
ATTRIBUTE_NONNULL(1); -int virCgroupForDomain(virCgroupPtr driver, +int virCgroupNewDomain(virCgroupPtr driver, const char *name, - virCgroupPtr *group, - bool create); + bool create, + virCgroupPtr *group) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(4); -int virCgroupForVcpu(virCgroupPtr driver, +int virCgroupNewVcpu(virCgroupPtr domain, int vcpuid, - virCgroupPtr *group, - bool create); - -int virCgroupForEmulator(virCgroupPtr driver, - virCgroupPtr *group, - bool create); + bool create, + virCgroupPtr *group) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(4); + +int virCgroupNewEmulator(virCgroupPtr domain, + bool create, + virCgroupPtr *group) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(3); int virCgroupPathOfController(virCgroupPtr group, int controller, -- 1.8.1.4
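As a rough illustration of the non-null annotation and the true/false convention this patch applies, here is a minimal self-contained sketch. DEMO_NONNULL, struct demo_handle and demo_handle_new are hypothetical names invented for this example; they are not libvirt's ATTRIBUTE_NONNULL macro or any libvirt API. The sketch assumes a GCC-compatible compiler for the attribute and falls back to a no-op elsewhere.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a non-null annotation macro.
 * On GCC-compatible compilers it marks argument number m as
 * "must not be NULL"; elsewhere it expands to nothing. */
#if defined(__GNUC__)
# define DEMO_NONNULL(m) __attribute__((__nonnull__(m)))
#else
# define DEMO_NONNULL(m)
#endif

/* Hypothetical handle type, used only for this illustration. */
struct demo_handle {
    int id;
    bool created;
};

/* The annotation goes on the declaration; the output parameter is last. */
int
demo_handle_new(const char *name, bool create, struct demo_handle *out)
    DEMO_NONNULL(1) DEMO_NONNULL(3);

int
demo_handle_new(const char *name, bool create, struct demo_handle *out)
{
    /* No run-time NULL checks here: the annotation documents that
     * callers must pass valid pointers for 'name' and 'out'. */
    out->id = (int)name[0];
    out->created = create;
    return 0;
}
```

A caller would then write demo_handle_new("qemu", true, &handle) rather than passing 1 for the boolean argument.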

On 04/10/2013 04:08 AM, Daniel P. Berrange wrote:
From: "Daniel P. Berrange" <berrange@redhat.com>
Rename all the virCgroupForXXX methods to use the form virCgroupNewXXX since they are all constructors. Also make sure the output parameter is the last one in the list, and annotate all pointers as non-null. Fix up all callers, and make sure they use true/false not 0/1 for the boolean parameters
Signed-off-by: Daniel P. Berrange <berrange@redhat.com> --- src/libvirt_private.syms | 10 +++--- src/lxc/lxc_cgroup.c | 6 ++-- src/qemu/qemu_cgroup.c | 14 ++++---- src/qemu/qemu_driver.c | 18 +++++------ src/util/vircgroup.c | 84 ++++++++++++++++++++++-------------------------- src/util/vircgroup.h | 33 +++++++++++-------- 6 files changed, 82 insertions(+), 83 deletions(-)
@@ -935,11 +935,11 @@ cleanup: * Returns 0 on success */ #if defined HAVE_MNTENT_H && defined HAVE_GETMNTENT_R -int virCgroupForDriver(const char *name, - virCgroupPtr *group, +int virCgroupNewDriver(const char *name,
Is it worth formatting like
int
virCgroupNewDriver(...
as long as you are touching that?
/** - * virCgroupForVcpu: + * virCgroupNewVcpu: * - * @driver: group for the domain + * @domain: group for the domain * @vcpuid: id of the vcpu * @group: Pointer to returned virCgroupPtr
This documents 3 arguments,
* * Returns 0 on success */ #if defined HAVE_MNTENT_H && defined HAVE_GETMNTENT_R -int virCgroupForVcpu(virCgroupPtr driver, +int virCgroupNewVcpu(virCgroupPtr domain, int vcpuid, - virCgroupPtr *group, - bool create) + bool create, + virCgroupPtr *group)
even though there are 4 here. Missing @create.
/** - * virCgroupForEmulator: + * virCgroupNewEmulator: * - * @driver: group for the domain + * @domain: group for the domain * @group: Pointer to returned virCgroupPtr * * Returns: 0 on success or -errno on failure */ #if defined HAVE_MNTENT_H && defined HAVE_GETMNTENT_R -int virCgroupForEmulator(virCgroupPtr driver, - virCgroupPtr *group, - bool create)
Again, missing @create docs.
ACK, whether or not you fix up my minor findings now or in a later patch.
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
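To make the review points above concrete, here is a minimal sketch of a constructor whose doc comment covers all four parameters, including @create, with the return type on its own line and the output parameter last. struct demo_group and demo_group_new_vcpu are hypothetical names for this illustration only; the body is a toy and does not reflect how virCgroupNewVcpu is actually implemented.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for virCgroupPtr; not libvirt's real type. */
struct demo_group {
    char path[128];
};

/**
 * demo_group_new_vcpu:
 *
 * @domain: group for the domain (must not be NULL)
 * @vcpuid: id of the vcpu
 * @create: true to create the group, false to only look it up
 * @group: pointer to the returned group (must not be NULL)
 *
 * Returns 0 on success or -errno on failure
 */
int
demo_group_new_vcpu(const struct demo_group *domain,
                    int vcpuid,
                    bool create,
                    struct demo_group **group)
{
    struct demo_group *g;
    int n;

    assert(domain != NULL && group != NULL);
    *group = NULL;

    /* This toy has no pre-existing groups, so a lookup always fails. */
    if (!create)
        return -ENOENT;

    if (!(g = malloc(sizeof(*g))))
        return -ENOMEM;

    /* Build "<domain path>/vcpu<N>", rejecting truncated results. */
    n = snprintf(g->path, sizeof(g->path), "%s/vcpu%d", domain->path, vcpuid);
    if (n < 0 || (size_t)n >= sizeof(g->path)) {
        free(g);
        return -ENAMETOOLONG;
    }

    *group = g;
    return 0;
}
```

Keeping the doc comment in the same order as the signature makes a missing entry such as @create easier to spot during review.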

On Wed, Apr 10, 2013 at 08:56:03PM -0600, Eric Blake wrote:
On 04/10/2013 04:08 AM, Daniel P. Berrange wrote:
From: "Daniel P. Berrange" <berrange@redhat.com>
Rename all the virCgroupForXXX methods to use the form virCgroupNewXXX since they are all constructors. Also make sure the output parameter is the last one in the list, and annotate all pointers as non-null. Fix up all callers, and make sure they use true/false not 0/1 for the boolean parameters
Signed-off-by: Daniel P. Berrange <berrange@redhat.com> --- src/libvirt_private.syms | 10 +++--- src/lxc/lxc_cgroup.c | 6 ++-- src/qemu/qemu_cgroup.c | 14 ++++---- src/qemu/qemu_driver.c | 18 +++++------ src/util/vircgroup.c | 84 ++++++++++++++++++++++-------------------------- src/util/vircgroup.h | 33 +++++++++++-------- 6 files changed, 82 insertions(+), 83 deletions(-)
@@ -935,11 +935,11 @@ cleanup: * Returns 0 on success */ #if defined HAVE_MNTENT_H && defined HAVE_GETMNTENT_R -int virCgroupForDriver(const char *name, - virCgroupPtr *group, +int virCgroupNewDriver(const char *name,
Is it worth formatting like
int
virCgroupNewDriver(...
as long as you are touching that?
Hmm, there are a huge number of functions in the file which would want updating if we did that. I'm starting to think that if we want to enforce this style, we need to try and get a 'syntax-check' script to validate this, otherwise it is a fairly tough battle to win.
ACK, whether or not you fix up my minor findings now or in a later patch.
Fixing the missing docs for the @create args before pushing.
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
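The test-suite patch that follows in this thread redirects any path under /not/really/sys/fs/cgroup/ into a scratch directory before calling the real libc function. The core of that idea, the prefix rewrite, can be sketched on its own as below. This is only an illustration under stated assumptions: demo_redirect_path and demo_fopen are hypothetical names, the scratch directory is passed as an argument rather than read from an environment variable, and the LD_PRELOAD / dlsym(RTLD_NEXT, ...) plumbing that the real mock uses to find the original functions is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Fake mount prefix, matching the one used by the mock in the patch. */
#define DEMO_PREFIX "/not/really/sys/fs/cgroup/"

/*
 * Hypothetical helper: if 'path' starts with DEMO_PREFIX, store a newly
 * allocated "<fakedir>/<rest of path>" in *newpath; otherwise store NULL.
 * Returns 0 on success or -ENOMEM on allocation failure.
 */
int
demo_redirect_path(const char *path, const char *fakedir, char **newpath)
{
    size_t plen = strlen(DEMO_PREFIX);
    size_t len;

    assert(path != NULL && fakedir != NULL && newpath != NULL);
    *newpath = NULL;

    if (strncmp(path, DEMO_PREFIX, plen) != 0)
        return 0; /* not a cgroup path, leave it alone */

    /* fakedir + '/' + remainder + terminating NUL */
    len = strlen(fakedir) + 1 + strlen(path + plen) + 1;
    if (!(*newpath = malloc(len)))
        return -ENOMEM;

    snprintf(*newpath, len, "%s/%s", fakedir, path + plen);
    return 0;
}

/* Hypothetical wrapper showing how an interposed function would use it. */
FILE *
demo_fopen(const char *path, const char *mode, const char *fakedir)
{
    char *newpath = NULL;
    FILE *fp;

    if (demo_redirect_path(path, fakedir, &newpath) < 0) {
        errno = ENOMEM;
        return NULL;
    }
    fp = fopen(newpath ? newpath : path, mode);
    free(newpath);
    return fp;
}
```

Using a prefix that does not exist on a real system, as the patch does, means any call that was not redirected fails visibly instead of silently touching the host's cgroups.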

From: "Daniel P. Berrange" <berrange@redhat.com> Some aspects of the cgroups setup / detection code are quite subtle and easy to break. It would greatly benefit from unit testing, but this is difficult because the test suite won't have privileges to play around with cgroups. The solution is to use monkey patching via LD_PRELOAD to override the fopen, open, mkdir, access functions to redirect access of cgroups files to some magic stubs in the test suite. Using this we provide custom content for the /proc/cgroup and /proc/self/mounts files which report a fixed cgroup setup. We then override open/mkdir/access so that access to the cgroups filesystem gets redirected into files in a temporary directory tree in the test suite build dir. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> --- .gitignore | 1 + cfg.mk | 11 +- tests/Makefile.am | 15 +- tests/vircgroupmock.c | 453 ++++++++++++++++++++++++++++++++++++++++++++++++++ tests/vircgrouptest.c | 249 +++++++++++++++++++++++++++ 5 files changed, 723 insertions(+), 6 deletions(-) create mode 100644 tests/vircgroupmock.c create mode 100644 tests/vircgrouptest.c diff --git a/.gitignore b/.gitignore index 068cac8..5e50b52 100644 --- a/.gitignore +++ b/.gitignore @@ -178,6 +178,7 @@ /tests/virauthconfigtest /tests/virbitmaptest /tests/virbuftest +/tests/vircgrouptest /tests/virdrivermoduletest /tests/virendiantest /tests/virhashtest diff --git a/cfg.mk b/cfg.mk index 394521e..e60c4e3 100644 --- a/cfg.mk +++ b/cfg.mk @@ -788,15 +788,16 @@ $(srcdir)/src/remote/remote_client_bodies.h: $(srcdir)/src/remote/remote_protoco exclude_file_name_regexp--sc_avoid_strcase = ^tools/virsh\.h$$ _src1=libvirt|fdstream|qemu/qemu_monitor|util/(vircommand|virutil)|xen/xend_internal|rpc/virnetsocket|lxc/lxc_controller|locking/lock_daemon +_test1=shunloadtest|virnettlscontexttest|vircgroupmock exclude_file_name_regexp--sc_avoid_write = \ - ^(src/($(_src1))|daemon/libvirtd|tools/console|tests/(shunload|virnettlscontext)test)\.c$$ + 
^(src/($(_src1))|daemon/libvirtd|tools/console|tests/($(_test1)))\.c$$ exclude_file_name_regexp--sc_bindtextdomain = ^(tests|examples)/ exclude_file_name_regexp--sc_copyright_address = \ ^COPYING\.LIB$$ -exclude_file_name_regexp--sc_flags_usage = ^(docs/|src/util/virnetdevtap\.c$$) +exclude_file_name_regexp--sc_flags_usage = ^(docs/|src/util/virnetdevtap\.c$$|tests/vircgroupmock\.c$$) exclude_file_name_regexp--sc_libvirt_unmarked_diagnostics = \ ^(src/rpc/gendispatch\.pl$$|tests/) @@ -812,10 +813,10 @@ exclude_file_name_regexp--sc_prohibit_always_true_header_tests = \ ^python/(libvirt-(lxc-|qemu-)?override|typewrappers)\.c$$ exclude_file_name_regexp--sc_prohibit_asprintf = \ - ^(bootstrap.conf$$|src/util/virutil\.c$$|examples/domain-events/events-c/event-test\.c$$) + ^(bootstrap.conf$$|src/util/virutil\.c$$|examples/domain-events/events-c/event-test\.c$$|tests/vircgroupmock\.c$$) exclude_file_name_regexp--sc_prohibit_close = \ - (\.p[yl]$$|^docs/|^(src/util/virfile\.c|src/libvirt\.c)$$) + (\.p[yl]$$|^docs/|^(src/util/virfile\.c|src/libvirt\.c|tests/vircgroupmock\.c)$$) exclude_file_name_regexp--sc_prohibit_empty_lines_at_EOF = \ (^tests/(qemuhelp|nodeinfo)data/|\.(gif|ico|png|diff)$$) @@ -836,7 +837,7 @@ exclude_file_name_regexp--sc_prohibit_nonreentrant = \ ^((po|tests)/|docs/.*(py|html\.in)|run.in$$) exclude_file_name_regexp--sc_prohibit_raw_allocation = \ - ^(docs/hacking\.html\.in)|(src/util/viralloc\.[ch]|examples/.*|tests/securityselinuxhelper.c)$$ + ^(docs/hacking\.html\.in)|(src/util/viralloc\.[ch]|examples/.*|tests/securityselinuxhelper\.c|tests/vircgroupmock\.c)$$ exclude_file_name_regexp--sc_prohibit_readlink = \ ^src/(util/virutil|lxc/lxc_container)\.c$$ diff --git a/tests/Makefile.am b/tests/Makefile.am index 888968d..2011049 100644 --- a/tests/Makefile.am +++ b/tests/Makefile.am @@ -97,7 +97,9 @@ test_programs = virshtest sockettest \ utiltest shunloadtest \ virtimetest viruritest virkeyfiletest \ virauthconfigtest \ - virbitmaptest virendiantest \ + 
virbitmaptest \ + vircgrouptest \ + virendiantest \ viridentitytest \ virkeycodetest \ virlockspacetest \ @@ -247,6 +249,7 @@ EXTRA_DIST += $(test_scripts) test_libraries = libshunload.la \ libvirportallocatormock.la \ + vircgroupmock.la \ $(NULL) if WITH_QEMU test_libraries += libqemumonitortestutils.la @@ -592,6 +595,16 @@ libvirportallocatormock_la_CFLAGS = $(AM_CFLAGS) -DMOCK_HELPER=1 libvirportallocatormock_la_LDFLAGS = -module -avoid-version \ -rpath /evil/libtool/hack/to/force/shared/lib/creation +vircgrouptest_SOURCES = \ + vircgrouptest.c testutils.h testutils.c +vircgrouptest_LDADD = $(LDADDS) + +vircgroupmock_la_SOURCES = \ + vircgroupmock.c +vircgroupmock_la_CFLAGS = $(AM_CFLAGS) -DMOCK_HELPER=1 +vircgroupmock_la_LDFLAGS = -module -avoid-version \ + -rpath /evil/libtool/hack/to/force/shared/lib/creation + viruritest_SOURCES = \ viruritest.c testutils.h testutils.c diff --git a/tests/vircgroupmock.c b/tests/vircgroupmock.c new file mode 100644 index 0000000..e50f7e0 --- /dev/null +++ b/tests/vircgroupmock.c @@ -0,0 +1,453 @@ +/* + * Copyright (C) 2013 Red Hat, Inc. + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library. If not, see + * <http://www.gnu.org/licenses/>. + * + * Author: Daniel P. 
Berrange <berrange@redhat.com> + */ + +#include <config.h> + +#include "internal.h" + +#include <stdio.h> +#include <dlfcn.h> +#include <stdlib.h> +#include <unistd.h> +#include <fcntl.h> +#include <sys/stat.h> + +static int (*realopen)(const char *path, int flags, ...); +static FILE *(*realfopen)(const char *path, const char *mode); +static int (*realaccess)(const char *path, int mode); +static int (*realmkdir)(const char *path, mode_t mode); +static char *fakesysfsdir; + + +#define SYSFS_PREFIX "/not/really/sys/fs/cgroup/" + +/* + * The plan: + * + * We fake out /proc/mounts, so make it look as is cgroups + * are mounted on /not/really/sys/fs/cgroup. We don't + * use /sys/fs/cgroup, because we want to make it easy to + * detect places where we've not mocked enough syscalls. + * + * In any open/acces/mkdir calls we look at path and if + * it starts with /not/really/sys/fs/cgroup, we rewrite + * the path to point at a temporary directory referred + * to by LIBVIRT_FAKE_SYSFS_DIR env variable that is + * set by the main test suite + * + * In mkdir() calls, we simulate the cgroups behaviour + * whereby creating the directory auto-creates a bunch + * of files beneath it + */ + +/* + * Intentionally missing the 'devices' mount. 
+ * Co-mounting cpu & cpuacct controllers + * An anonymous controller for systemd + */ +const char *mounts = + "rootfs / rootfs rw 0 0\n" + "tmpfs /run tmpfs rw,seclabel,nosuid,nodev,mode=755 0 0\n" + "tmpfs /not/really/sys/fs/cgroup tmpfs rw,seclabel,nosuid,nodev,noexec,mode=755 0 0\n" + "cgroup /not/really/sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd 0 0\n" + "cgroup /not/really/sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0\n" + "cgroup /not/really/sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpuacct,cpu 0 0\n" + "cgroup /not/really/sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0\n" + "cgroup /not/really/sys/fs/cgroup/blkio cgroup rw,nosuid,nodev,noexec,relatime,blkio 0 0\n" + "cgroup /not/really/sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0\n" + "/dev/sda1 /boot ext4 rw,seclabel,relatime,data=ordered 0 0\n" + "tmpfs /tmp tmpfs rw,seclabel,relatime,size=1024000k 0 0\n"; + +const char *cgroups = + "115:memory:/\n" + "8:blkio:/\n" + "6:freezer:/\n" + "3:cpuacct,cpu:/system\n" + "2:cpuset:/\n" + "1:name=systemd:/user/berrange/123\n"; + +static int make_file(const char *path, + const char *name, + const char *value) +{ + int fd = -1; + int ret = -1; + char *filepath = NULL; + + if (asprintf(&filepath, "%s/%s", path, name) < 0) + return -1; + + if ((fd = open(filepath, O_CREAT|O_WRONLY, 0600)) < 0) + goto cleanup; + + if (write(fd, value, strlen(value)) != strlen(value)) + goto cleanup; + + ret = 0; +cleanup: + if (fd != -1 &&close(fd) < 0) + ret = -1; + free(filepath); + + return ret; +} + +static int make_controller(const char *path, mode_t mode) +{ + int ret = -1; + const char *controller; + + if (!STRPREFIX(path, fakesysfsdir)) { + errno = EINVAL; + return -1; + } + controller = path + strlen(fakesysfsdir) + 1; + + if (STREQ(controller, "cpu")) { + if (symlink("cpu,cpuacct", path) < 0) + 
return -1; + return -0; + } + if (STREQ(controller, "cpuacct")) { + if (symlink("cpu,cpuacct", path) < 0) + return -1; + return 0; + } + + if (realmkdir(path, mode) < 0) + goto cleanup; + +#define MAKE_FILE(name, value) \ + do { \ + if (make_file(path, name, value) < 0) \ + goto cleanup; \ + } while (0) + + if (STRPREFIX(controller, "cpu,cpuacct")) { + MAKE_FILE("cpu.cfs_period_us", "100000\n"); + MAKE_FILE("cpu.cfs_quota_us", "-1\n"); + MAKE_FILE("cpu.rt_period_us", "1000000\n"); + MAKE_FILE("cpu.rt_runtime_us", "950000\n"); + MAKE_FILE("cpu.shares", "1024\n"); + MAKE_FILE("cpu.stat", + "nr_periods 0\n" + "nr_throttled 0\n" + "throttled_time 0\n"); + MAKE_FILE("cpuacct.stat", + "user 216687025\n" + "system 43421396\n"); + MAKE_FILE("cpuacct.usage", "2787788855799582\n"); + MAKE_FILE("cpuacct.usage_per_cpu", "1413142688153030 1374646168910542\n"); + } else if (STRPREFIX(controller, "cpuset")) { + MAKE_FILE("cpuset.cpu_exclusive", "1\n"); + if (STREQ(controller, "cpuset")) + MAKE_FILE("cpuset.cpus", "0-1"); + else + MAKE_FILE("cpuset.cpus", ""); /* Values don't inherit */ + MAKE_FILE("cpuset.mem_exclusive", "1\n"); + MAKE_FILE("cpuset.mem_hardwall", "0\n"); + MAKE_FILE("cpuset.memory_migrate", "0\n"); + MAKE_FILE("cpuset.memory_pressure", "0\n"); + MAKE_FILE("cpuset.memory_pressure_enabled", "0\n"); + MAKE_FILE("cpuset.memory_spread_page", "0\n"); + MAKE_FILE("cpuset.memory_spread_slab", "0\n"); + if (STREQ(controller, "cpuset")) + MAKE_FILE("cpuset.mems", "0"); + else + MAKE_FILE("cpuset.mems", ""); /* Values don't inherit */ + MAKE_FILE("cpuset.sched_load_balance", "1\n"); + MAKE_FILE("cpuset.sched_relax_domain_level", "-1\n"); + } else if (STRPREFIX(controller, "memory")) { + MAKE_FILE("memory.failcnt", "0\n"); + MAKE_FILE("memory.force_empty", ""); /* Write only */ + MAKE_FILE("memory.kmem.tcp.failcnt", "0\n"); + MAKE_FILE("memory.kmem.tcp.limit_in_bytes", "9223372036854775807\n"); + MAKE_FILE("memory.kmem.tcp.max_usage_in_bytes", "0\n"); + 
MAKE_FILE("memory.kmem.tcp.usage_in_bytes", "16384\n"); + MAKE_FILE("memory.limit_in_bytes", "9223372036854775807\n"); + MAKE_FILE("memory.max_usage_in_bytes", "0\n"); + MAKE_FILE("memory.memsw.failcnt", ""); /* Not supported */ + MAKE_FILE("memory.memsw.limit_in_bytes", ""); /* Not supported */ + MAKE_FILE("memory.memsw.max_usage_in_bytes", ""); /* Not supported */ + MAKE_FILE("memory.memsw.usage_in_bytes", ""); /* Not supported */ + MAKE_FILE("memory.move_charge_at_immigrate", "0\n"); + MAKE_FILE("memory.numa_stat", + "total=367664 N0=367664\n" + "file=314764 N0=314764\n" + "anon=51999 N0=51999\n" + "unevictable=901 N0=901\n"); + MAKE_FILE("memory.oom_control", + "oom_kill_disable 0\n" + "under_oom 0\n"); + MAKE_FILE("memory.soft_limit_in_bytes", "9223372036854775807\n"); + MAKE_FILE("memory.stat", + "cache 1336619008\n" + "rss 97792000\n" + "mapped_file 42090496\n" + "pgpgin 13022605027\n" + "pgpgout 13023820533\n" + "pgfault 54429417056\n" + "pgmajfault 315715\n" + "inactive_anon 145887232\n" + "active_anon 67100672\n" + "inactive_file 627400704\n" + "active_file 661872640\n" + "unevictable 3690496\n" + "hierarchical_memory_limit 9223372036854775807\n" + "total_cache 1336635392\n" + "total_rss 118689792\n" + "total_mapped_file 42106880\n" + "total_pgpgin 13022606816\n" + "total_pgpgout 13023820793\n" + "total_pgfault 54429422313\n" + "total_pgmajfault 315715\n" + "total_inactive_anon 145891328\n" + "total_active_anon 88010752\n" + "total_inactive_file 627400704\n" + "total_active_file 661872640\n" + "total_unevictable 3690496\n" + "recent_rotated_anon 112807028\n" + "recent_rotated_file 2547948\n" + "recent_scanned_anon 113796164\n" + "recent_scanned_file 8199863\n"); + MAKE_FILE("memory.swappiness", "60\n"); + MAKE_FILE("memory.usage_in_bytes", "1455321088\n"); + MAKE_FILE("memory.use_hierarchy", "0\n"); + } else if (STRPREFIX(controller, "freezer")) { + MAKE_FILE("freezer.state", "THAWED"); + } else if (STRPREFIX(controller, "blkio")) { + 
MAKE_FILE("blkio.io_merged", + "8:0 Read 1100949\n" + "8:0 Write 2248076\n" + "8:0 Sync 63063\n" + "8:0 Async 3285962\n" + "8:0 Total 3349025\n"); + MAKE_FILE("blkio.io_queued", + "8:0 Read 0\n" + "8:0 Write 0\n" + "8:0 Sync 0\n" + "8:0 Async 0\n" + "8:0 Total 0\n"); + MAKE_FILE("blkio.io_service_bytes", + "8:0 Read 59542078464\n" + "8:0 Write 397369182208\n" + "8:0 Sync 234080922624\n" + "8:0 Async 222830338048\n" + "8:0 Total 456911260672\n"); + MAKE_FILE("blkio.io_serviced", + "8:0 Read 3402504\n" + "8:0 Write 14966516\n" + "8:0 Sync 12064031\n" + "8:0 Async 6304989\n" + "8:0 Total 18369020\n"); + MAKE_FILE("blkio.io_service_time", + "8:0 Read 10747537542349\n" + "8:0 Write 9200028590575\n" + "8:0 Sync 6449319855381\n" + "8:0 Async 13498246277543\n" + "8:0 Total 19947566132924\n"); + MAKE_FILE("blkio.io_wait_time", + "8:0 Read 14687514824889\n" + "8:0 Write 357748452187691\n" + "8:0 Sync 55296974349413\n" + "8:0 Async 317138992663167\n" + "8:0 Total 372435967012580\n"); + MAKE_FILE("blkio.reset_stats", ""); /* Write only */ + MAKE_FILE("blkio.sectors", "8:0 892404806\n"); + MAKE_FILE("blkio.throttle.io_service_bytes", + "8:0 Read 59542107136\n" + "8:0 Write 411440480256\n" + "8:0 Sync 248486822912\n" + "8:0 Async 222495764480\n" + "8:0 Total 470982587392\n"); + MAKE_FILE("blkio.throttle.io_serviced", + "8:0 Read 4832583\n" + "8:0 Write 36641903\n" + "8:0 Sync 30723171\n" + "8:0 Async 10751315\n" + "8:0 Total 41474486\n"); + MAKE_FILE("blkio.throttle.read_bps_device", ""); + MAKE_FILE("blkio.throttle.read_iops_device", ""); + MAKE_FILE("blkio.throttle.write_bps_device", ""); + MAKE_FILE("blkio.throttle.write_iops_device", ""); + MAKE_FILE("blkio.time", "8:0 61019089\n"); + MAKE_FILE("blkio.weight", "1000\n"); + MAKE_FILE("blkio.weight_device", ""); + + } else { + errno = EINVAL; + goto cleanup; + } + + ret = 0; +cleanup: + return ret; +} + +static void init_syms(void) +{ + if (realfopen) + return; + +#define LOAD_SYM(name) \ + do { \ + if (!(real ## name = 
dlsym(RTLD_NEXT, #name))) { \ + fprintf(stderr, "Cannot find real '%s' symbol\n", #name); \ + abort(); \ + } \ + } while (0) + + LOAD_SYM(fopen); + LOAD_SYM(access); + LOAD_SYM(mkdir); + LOAD_SYM(open); +} + +static void init_sysfs(void) +{ + if (fakesysfsdir) + return; + + if (!(fakesysfsdir = getenv("LIBVIRT_FAKE_SYSFS_DIR"))) { + fprintf(stderr, "Missing LIBVIRT_FAKE_SYSFS_DIR env variable\n"); + abort(); + } + +#define MAKE_CONTROLLER(subpath) \ + do { \ + char *path; \ + if (asprintf(&path,"%s/%s", fakesysfsdir, subpath) < 0) \ + abort(); \ + if (make_controller(path, 0755) < 0) { \ + fprintf(stderr, "Cannot initialize %s\n", path); \ + abort(); \ + } \ + } while (0) + + MAKE_CONTROLLER("cpu"); + MAKE_CONTROLLER("cpuacct"); + MAKE_CONTROLLER("cpu,cpuacct"); + MAKE_CONTROLLER("cpu,cpuacct/system"); + MAKE_CONTROLLER("cpuset"); + MAKE_CONTROLLER("blkio"); + MAKE_CONTROLLER("memory"); + MAKE_CONTROLLER("freezer"); +} + + +FILE *fopen(const char *path, const char *mode) +{ + init_syms(); + + if (STREQ(path, "/proc/mounts")) { + if (STREQ(mode, "r")) { + return fmemopen((void *)mounts, strlen(mounts), mode); + } else { + errno = EACCES; + return NULL; + } + } + if (STREQ(path, "/proc/self/cgroup")) { + if (STREQ(mode, "r")) { + return fmemopen((void *)cgroups, strlen(cgroups), mode); + } else { + errno = EACCES; + return NULL; + } + } + + return realfopen(path, mode); +} + +int access(const char *path, int mode) +{ + int ret; + + init_syms(); + + if (STRPREFIX(path, SYSFS_PREFIX)) { + init_sysfs(); + char *newpath; + if (asprintf(&newpath, "%s/%s", + fakesysfsdir, + path + strlen(SYSFS_PREFIX)) < 0) { + errno = ENOMEM; + return -1; + } + ret = realaccess(newpath, mode); + free(newpath); + } else { + ret = realaccess(path, mode); + } + return ret; +} + +int mkdir(const char *path, mode_t mode) +{ + int ret; + + init_syms(); + + if (STRPREFIX(path, SYSFS_PREFIX)) { + init_sysfs(); + char *newpath; + if (asprintf(&newpath, "%s/%s", + fakesysfsdir, + path + 
strlen(SYSFS_PREFIX)) < 0) { + errno = ENOMEM; + return -1; + } + ret = make_controller(newpath, mode); + free(newpath); + } else { + ret = realmkdir(path, mode); + } + return ret; +} + +int open(const char *path, int flags, ...) +{ + int ret; + char *newpath = NULL; + + init_syms(); + + if (STRPREFIX(path, SYSFS_PREFIX)) { + init_sysfs(); + if (asprintf(&newpath, "%s/%s", + fakesysfsdir, + path + strlen(SYSFS_PREFIX)) < 0) { + errno = ENOMEM; + return -1; + } + } + if (flags & O_CREAT) { + va_list ap; + mode_t mode; + va_start(ap, flags); + mode = va_arg(ap, mode_t); + va_end(ap); + ret = realopen(newpath ? newpath : path, flags, mode); + } else { + ret = realopen(newpath ? newpath : path, flags); + } + free(newpath); + return ret; +} diff --git a/tests/vircgrouptest.c b/tests/vircgrouptest.c new file mode 100644 index 0000000..a68aa88 --- /dev/null +++ b/tests/vircgrouptest.c @@ -0,0 +1,249 @@ +/* + * Copyright (C) 2013 Red Hat, Inc. + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library. If not, see + * <http://www.gnu.org/licenses/>. + * + * Author: Daniel P. 
Berrange <berrange@redhat.com> + */ + +#include <config.h> + +/* This part defines the actual test cases */ +#include <stdlib.h> + +#define __VIR_CGROUP_ALLOW_INCLUDE_PRIV_H__ +#include "vircgrouppriv.h" +#include "testutils.h" +#include "virutil.h" +#include "virerror.h" +#include "virlog.h" +#include "virfile.h" + +#define VIR_FROM_THIS VIR_FROM_NONE + +static int validateCgroup(virCgroupPtr cgroup, + const char *expectPath, + const char **expectMountPoint, + const char **expectPlacement) +{ + int i; + + if (STRNEQ(cgroup->path, expectPath)) { + fprintf(stderr, "Wrong path '%s', expected '%s'\n", + cgroup->path, expectPath); + return -1; + } + + for (i = 0 ; i < VIR_CGROUP_CONTROLLER_LAST ; i++) { + if (STRNEQ_NULLABLE(expectMountPoint[i], + cgroup->controllers[i].mountPoint)) { + fprintf(stderr, "Wrong mount '%s', expected '%s' for '%s'\n", + cgroup->controllers[i].mountPoint, + expectMountPoint[i], + virCgroupControllerTypeToString(i)); + return -1; + } + if (STRNEQ_NULLABLE(expectPlacement[i], + cgroup->controllers[i].placement)) { + fprintf(stderr, "Wrong placement '%s', expected '%s' for '%s'\n", + cgroup->controllers[i].placement, + expectPlacement[i], + virCgroupControllerTypeToString(i)); + return -1; + } + } + + return 0; +} + +const char *mountsSmall[VIR_CGROUP_CONTROLLER_LAST] = { + [VIR_CGROUP_CONTROLLER_CPU] = "/not/really/sys/fs/cgroup/cpu,cpuacct", + [VIR_CGROUP_CONTROLLER_CPUACCT] = "/not/really/sys/fs/cgroup/cpu,cpuacct", + [VIR_CGROUP_CONTROLLER_CPUSET] = NULL, + [VIR_CGROUP_CONTROLLER_MEMORY] = "/not/really/sys/fs/cgroup/memory", + [VIR_CGROUP_CONTROLLER_DEVICES] = NULL, + [VIR_CGROUP_CONTROLLER_FREEZER] = NULL, + [VIR_CGROUP_CONTROLLER_BLKIO] = NULL, +}; +const char *mountsFull[VIR_CGROUP_CONTROLLER_LAST] = { + [VIR_CGROUP_CONTROLLER_CPU] = "/not/really/sys/fs/cgroup/cpu,cpuacct", + [VIR_CGROUP_CONTROLLER_CPUACCT] = "/not/really/sys/fs/cgroup/cpu,cpuacct", + [VIR_CGROUP_CONTROLLER_CPUSET] = "/not/really/sys/fs/cgroup/cpuset", + 
[VIR_CGROUP_CONTROLLER_MEMORY] = "/not/really/sys/fs/cgroup/memory", + [VIR_CGROUP_CONTROLLER_DEVICES] = NULL, + [VIR_CGROUP_CONTROLLER_FREEZER] = "/not/really/sys/fs/cgroup/freezer", + [VIR_CGROUP_CONTROLLER_BLKIO] = "/not/really/sys/fs/cgroup/blkio", +}; + +static int testCgroupNewForSelf(const void *args ATTRIBUTE_UNUSED) +{ + virCgroupPtr cgroup = NULL; + int ret = -1; + const char *placement[VIR_CGROUP_CONTROLLER_LAST] = { + [VIR_CGROUP_CONTROLLER_CPU] = "/system", + [VIR_CGROUP_CONTROLLER_CPUACCT] = "/system", + [VIR_CGROUP_CONTROLLER_CPUSET] = "", + [VIR_CGROUP_CONTROLLER_MEMORY] = "", + [VIR_CGROUP_CONTROLLER_DEVICES] = NULL, + [VIR_CGROUP_CONTROLLER_FREEZER] = "", + [VIR_CGROUP_CONTROLLER_BLKIO] = "", + }; + + if (virCgroupNewSelf(&cgroup) < 0) { + fprintf(stderr, "Cannot create cgroup for self\n"); + goto cleanup; + } + + ret = validateCgroup(cgroup, "/", mountsFull, placement); + +cleanup: + virCgroupFree(&cgroup); + return ret; +} + + +static int testCgroupNewForDriver(const void *args ATTRIBUTE_UNUSED) +{ + virCgroupPtr cgroup = NULL; + int ret = -1; + int rv; + const char *placement[VIR_CGROUP_CONTROLLER_LAST] = { + [VIR_CGROUP_CONTROLLER_CPU] = "/system", + [VIR_CGROUP_CONTROLLER_CPUACCT] = "/system", + [VIR_CGROUP_CONTROLLER_CPUSET] = "", + [VIR_CGROUP_CONTROLLER_MEMORY] = "", + [VIR_CGROUP_CONTROLLER_DEVICES] = NULL, + [VIR_CGROUP_CONTROLLER_FREEZER] = "", + [VIR_CGROUP_CONTROLLER_BLKIO] = "", + }; + + if ((rv = virCgroupNewDriver("lxc", true, false, -1, &cgroup)) != -ENOENT) { + fprintf(stderr, "Unexpected found LXC cgroup: %d\n", -rv); + goto cleanup; + } + + /* Asking for impossible combination since CPU is co-mounted */ + if ((rv = virCgroupNewDriver("lxc", true, true, + (1 << VIR_CGROUP_CONTROLLER_CPU), + &cgroup)) != -EINVAL) { + fprintf(stderr, "Should not have created LXC cgroup: %d\n", -rv); + goto cleanup; + } + + /* Asking for impossible combination since devices is not mounted */ + if ((rv = virCgroupNewDriver("lxc", true, true, + (1 << 
VIR_CGROUP_CONTROLLER_DEVICES), + &cgroup)) != -ENOENT) { + fprintf(stderr, "Should not have created LXC cgroup: %d\n", -rv); + goto cleanup; + } + + /* Asking for small combination since devices is not mounted */ + if ((rv = virCgroupNewDriver("lxc", true, true, + (1 << VIR_CGROUP_CONTROLLER_CPU) | + (1 << VIR_CGROUP_CONTROLLER_CPUACCT) | + (1 << VIR_CGROUP_CONTROLLER_MEMORY), + &cgroup)) != 0) { + fprintf(stderr, "Cannot create LXC cgroup: %d\n", -rv); + goto cleanup; + } + ret = validateCgroup(cgroup, "/libvirt/lxc", mountsSmall, placement); + virCgroupFree(&cgroup); + + if ((rv = virCgroupNewDriver("lxc", true, true, -1, &cgroup)) != 0) { + fprintf(stderr, "Cannot create LXC cgroup: %d\n", -rv); + goto cleanup; + } + ret = validateCgroup(cgroup, "/libvirt/lxc", mountsFull, placement); + +cleanup: + virCgroupFree(&cgroup); + return ret; +} + + +static int testCgroupNewForDomain(const void *args ATTRIBUTE_UNUSED) +{ + virCgroupPtr drivercgroup = NULL; + virCgroupPtr domaincgroup = NULL; + int ret = -1; + int rv; + const char *placement[VIR_CGROUP_CONTROLLER_LAST] = { + [VIR_CGROUP_CONTROLLER_CPU] = "/system", + [VIR_CGROUP_CONTROLLER_CPUACCT] = "/system", + [VIR_CGROUP_CONTROLLER_CPUSET] = "", + [VIR_CGROUP_CONTROLLER_MEMORY] = "", + [VIR_CGROUP_CONTROLLER_DEVICES] = NULL, + [VIR_CGROUP_CONTROLLER_FREEZER] = "", + [VIR_CGROUP_CONTROLLER_BLKIO] = "", + }; + + if ((rv = virCgroupNewDriver("lxc", true, false, -1, &drivercgroup)) != 0) { + fprintf(stderr, "Cannot find LXC cgroup: %d\n", -rv); + goto cleanup; + } + + if ((rv = virCgroupNewDomain(drivercgroup, "wibble", true, &domaincgroup)) != 0) { + fprintf(stderr, "Cannot create LXC cgroup: %d\n", -rv); + goto cleanup; + } + + ret = validateCgroup(domaincgroup, "/libvirt/lxc/wibble", mountsFull, placement); + +cleanup: + virCgroupFree(&drivercgroup); + virCgroupFree(&domaincgroup); + return ret; +} + + +#define FAKESYSFSDIRTEMPLATE abs_builddir "/fakesysfsdir-XXXXXX" + + + +static int +mymain(void) +{ + int ret = 0; 
+ char *fakesysfsdir; + + if (!(fakesysfsdir = strdup(FAKESYSFSDIRTEMPLATE))) { + fprintf(stderr, "Out of memory\n"); + abort(); + } + + if (!mkdtemp(fakesysfsdir)) { + fprintf(stderr, "Cannot create fakesysfsdir"); + abort(); + } + + setenv("LIBVIRT_FAKE_SYSFS_DIR", fakesysfsdir, 1); + + if (virtTestRun("New cgroup for self", 1, testCgroupNewForSelf, NULL) < 0) + ret = -1; + + if (virtTestRun("New cgroup for driver", 1, testCgroupNewForDriver, NULL) < 0) + ret = -1; + + if (virtTestRun("New cgroup for domain", 1, testCgroupNewForDomain, NULL) < 0) + ret = -1; + + if (getenv("LIBVIRT_SKIP_CLEANUP") == NULL) + virFileDeleteTree(fakesysfsdir); + + return ret==0 ? EXIT_SUCCESS : EXIT_FAILURE; +} + +VIRT_TEST_MAIN_PRELOAD(mymain, abs_builddir "/.libs/libvircgroupmock.so") -- 1.8.1.4

On 04/10/2013 04:08 AM, Daniel P. Berrange wrote:
From: "Daniel P. Berrange" <berrange@redhat.com>
Some aspects of the cgroups setup / detection code are quite subtle and easy to break. It would greatly benefit from unit testing, but this is difficult because the test suite won't have privileges to play around with cgroups. The solution is to use monkey patching via LD_PRELOAD to override the fopen, open, mkdir, access functions to redirect access of cgroups files to some magic stubs in the test suite.
Using this we provide custom content for the /proc/self/cgroup and /proc/mounts files which report a fixed cgroup setup. We then override open/mkdir/access so that access to the cgroups filesystem gets redirected into files in a temporary directory tree in the test suite build dir.
Do you also need to override openat/mkdirat/faccessat, in case we (or even libc on our behalf) ever use the newer *at syscalls?
Wow, this looks complicated, so I'll have to defer my review to sometime earlier in my day when I'm thinking straight. But the premise is useful, and a passing 'make check' even on a system with no cgroups mounted is a pretty good indication of whether you got it right.
+/*
+ * The plan:
+ *
+ * We fake out /proc/mounts, so make it look as if cgroups
+ * are mounted on /not/really/sys/fs/cgroup. We don't
+ * use /sys/fs/cgroup, because we want to make it easy to
+ * detect places where we've not mocked enough syscalls.
and so that the testsuite will run and pass even on systems without cgroups mounted.
+ *
+ * In any open/access/mkdir calls we look at path and if
+ * it starts with /not/really/sys/fs/cgroup, we rewrite
+ * the path to point at a temporary directory referred
+ * to by LIBVIRT_FAKE_SYSFS_DIR env variable that is
+ * set by the main test suite
+ *
+ * In mkdir() calls, we simulate the cgroups behaviour
+ * whereby creating the directory auto-creates a bunch
+ * of files beneath it
+ */
Good luck!
+mymain(void) +{ + int ret = 0; + char *fakesysfsdir; + + if (!(fakesysfsdir = strdup(FAKESYSFSDIRTEMPLATE))) { + fprintf(stderr, "Out of memory\n"); + abort(); + } + + if (!mkdtemp(fakesysfsdir)) {
Does this compile on mingw, or do you need to modify bootstrap.conf to pull in the mkdtemp gnulib module? [Then again, it won't compile on mingw in the first place, since the Makefile.am limits it to platforms with LD_PRELOAD support]
+ fprintf(stderr, "Cannot create fakesysfsdir"); + abort(); + } + + setenv("LIBVIRT_FAKE_SYSFS_DIR", fakesysfsdir, 1); + + if (virtTestRun("New cgroup for self", 1, testCgroupNewForSelf, NULL) < 0) + ret = -1; + + if (virtTestRun("New cgroup for driver", 1, testCgroupNewForDriver, NULL) < 0) + ret = -1; + + if (virtTestRun("New cgroup for domain", 1, testCgroupNewForDomain, NULL) < 0) + ret = -1; + + if (getenv("LIBVIRT_SKIP_CLEANUP") == NULL) + virFileDeleteTree(fakesysfsdir); + + return ret==0 ? EXIT_SUCCESS : EXIT_FAILURE;
Spaces around ==
+} + +VIRT_TEST_MAIN_PRELOAD(mymain, abs_builddir "/.libs/libvircgroupmock.so")
-- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On Wed, Apr 10, 2013 at 09:04:33PM -0600, Eric Blake wrote:
On 04/10/2013 04:08 AM, Daniel P. Berrange wrote:
From: "Daniel P. Berrange" <berrange@redhat.com>
Some aspects of the cgroups setup / detection code are quite subtle and easy to break. It would greatly benefit from unit testing, but this is difficult because the test suite won't have privileges to play around with cgroups. The solution is to use monkey patching via LD_PRELOAD to override the fopen, open, mkdir, access functions to redirect access of cgroups files to some magic stubs in the test suite.
Using this we provide custom content for the /proc/self/cgroup and /proc/mounts files which report a fixed cgroup setup. We then override open/mkdir/access so that access to the cgroups filesystem gets redirected into files in a temporary directory tree in the test suite build dir.
Do you also need to override openat/mkdirat/faccessat, in case we (or even libc on our behalf) ever use the newer *at syscalls?
Well, if glibc did try to do such magic, there'd probably be more that needed overriding besides the *at() functions. So we should just wait until that day, rather than second guessing it, I think.
Wow, this looks complicated, so I'll have to defer my review to sometime earlier in my day when I'm thinking straight. But the premise is useful, and a passing 'make check' even on a system with no cgroups mounted is a pretty good indication of whether you got it right.
+/*
+ * The plan:
+ *
+ * We fake out /proc/mounts, so make it look as if cgroups
+ * are mounted on /not/really/sys/fs/cgroup. We don't
+ * use /sys/fs/cgroup, because we want to make it easy to
+ * detect places where we've not mocked enough syscalls.
and so that the testsuite will run and pass even on systems without cgroups mounted.
Correct.
+mymain(void) +{ + int ret = 0; + char *fakesysfsdir; + + if (!(fakesysfsdir = strdup(FAKESYSFSDIRTEMPLATE))) { + fprintf(stderr, "Out of memory\n"); + abort(); + } + + if (!mkdtemp(fakesysfsdir)) {
Does this compile on mingw, or do you need to modify bootstrap.conf to pull in the mkdtemp gnulib module? [Then again, it won't compile on mingw in the first place, since the Makefile.am limits it to platforms with LD_PRELOAD support]
Hmm, I've not tested with win32 build. I think we actually ought to make sure this is run on Linux only, since it doesn't make sense on BSD either. Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 04/10/2013 04:08 AM, Daniel P. Berrange wrote:
From: "Daniel P. Berrange" <berrange@redhat.com>
Some aspects of the cgroups setup / detection code are quite subtle and easy to break. It would greatly benefit from unit testing, but this is difficult because the test suite won't have privileges to play around with cgroups. The solution is to use monkey patching via LD_PRELOAD to override the fopen, open, mkdir, access functions to redirect access of cgroups files to some magic stubs in the test suite.
Using this we provide custom content for the /proc/self/cgroup and /proc/mounts files which report a fixed cgroup setup. We then override open/mkdir/access so that access to the cgroups filesystem gets redirected into files in a temporary directory tree in the test suite build dir.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
---
 .gitignore            |   1 +
 cfg.mk                |  11 +-
 tests/Makefile.am     |  15 +-
 tests/vircgroupmock.c | 453 ++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/vircgrouptest.c | 249 +++++++++++++++++++++++++++
 5 files changed, 723 insertions(+), 6 deletions(-)
 create mode 100644 tests/vircgroupmock.c
 create mode 100644 tests/vircgrouptest.c
Now that I'm reading this earlier in the day, I can give the full review that I promised...
exclude_file_name_regexp--sc_prohibit_asprintf = \ - ^(bootstrap.conf$$|src/util/virutil\.c$$|examples/domain-events/events-c/event-test\.c$$) + ^(bootstrap.conf$$|src/util/virutil\.c$$|examples/domain-events/events-c/event-test\.c$$|tests/vircgroupmock\.c$$)
raw asprintf - yep, all the more reason to make sure this test compiles only on Linux.
+++ b/tests/Makefile.am @@ -97,7 +97,9 @@ test_programs = virshtest sockettest \ utiltest shunloadtest \ virtimetest viruritest virkeyfiletest \ virauthconfigtest \ - virbitmaptest virendiantest \ + virbitmaptest \ + vircgrouptest \ + virendiantest \ viridentitytest \ virkeycodetest \ virlockspacetest \ @@ -247,6 +249,7 @@ EXTRA_DIST += $(test_scripts)
test_libraries = libshunload.la \ libvirportallocatormock.la \ + vircgroupmock.la \ $(NULL)
shunload.c is guarded by #ifdef linux; you should do the same for vircgroupmock.c.
+vircgrouptest_SOURCES = \ + vircgrouptest.c testutils.h testutils.c +vircgrouptest_LDADD = $(LDADDS) + +vircgroupmock_la_SOURCES = \ + vircgroupmock.c +vircgroupmock_la_CFLAGS = $(AM_CFLAGS) -DMOCK_HELPER=1
Why do you need -DMOCK_HELPER=1? Too much copy and paste from virportallocatortest.c, which crammed both the mock library and the test into the same file rather than your approach here of using two files?
+++ b/tests/vircgroupmock.c
+#include <config.h> + +#include "internal.h" + +#include <stdio.h> +#include <dlfcn.h> +#include <stdlib.h> +#include <unistd.h> +#include <fcntl.h> +#include <sys/stat.h> + +static int (*realopen)(const char *path, int flags, ...);
Don't you need <stdarg.h> to make it possible to implement your realopen()? [1]
+/* + * Intentionally missing the 'devices' mount. + * Co-mounting cpu & cpuacct controllers + * An anonymous controller for systemd
Should give good coverage of the various virCgroup actions.
+ */ +const char *mounts = + "rootfs / rootfs rw 0 0\n" + "tmpfs /run tmpfs rw,seclabel,nosuid,nodev,mode=755 0 0\n" + "tmpfs /not/really/sys/fs/cgroup tmpfs rw,seclabel,nosuid,nodev,noexec,mode=755 0 0\n" + "cgroup /not/really/sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd 0 0\n" + "cgroup /not/really/sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0\n" + "cgroup /not/really/sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpuacct,cpu 0 0\n" + "cgroup /not/really/sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0\n" + "cgroup /not/really/sys/fs/cgroup/blkio cgroup rw,nosuid,nodev,noexec,relatime,blkio 0 0\n" + "cgroup /not/really/sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0\n" + "/dev/sda1 /boot ext4 rw,seclabel,relatime,data=ordered 0 0\n" + "tmpfs /tmp tmpfs rw,seclabel,relatime,size=1024000k 0 0\n";
Resembles a real system; so far so good.
+ +const char *cgroups = + "115:memory:/\n" + "8:blkio:/\n" + "6:freezer:/\n" + "3:cpuacct,cpu:/system\n" + "2:cpuset:/\n" + "1:name=systemd:/user/berrange/123\n";
No wonder it resembles a real system :) I'm sure you populated this based on your own system, then tweaked it into the test.
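For readers following along, each line of the fake /proc/self/cgroup content above has the form id:controllers:path. The following is only a minimal sketch of splitting such a line with strchr(), in the spirit of what the cgroup detection code does; the helper names parse_cgroup_line() and controller_listed() are hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Split one "/proc/self/cgroup"-style line of the form
 *   <id>:<controllers>:<path>\n
 * in place. On success *controllers and *path point into 'line'
 * and 0 is returned; a line without two ':' separators yields -1.
 * (Hypothetical helper, not part of the patch.)
 */
int parse_cgroup_line(char *line, char **controllers, char **path)
{
    char *first;
    char *second;
    char *nl;

    assert(line && controllers && path);

    first = strchr(line, ':');
    second = first ? strchr(first + 1, ':') : NULL;
    if (!first || !second)
        return -1;

    if ((nl = strchr(second, '\n')))
        *nl = '\0';             /* drop the trailing newline */

    *second = '\0';             /* terminate the controller list */
    *controllers = first + 1;
    *path = second + 1;
    return 0;
}

/* Return 1 if 'name' is one of the comma-separated entries in 'list'. */
int controller_listed(const char *list, const char *name)
{
    size_t namelen;

    assert(list && name);
    namelen = strlen(name);

    while (list) {
        const char *next = strchr(list, ',');
        size_t len = next ? (size_t)(next - list) : strlen(list);

        if (len == namelen && strncmp(list, name, len) == 0)
            return 1;
        list = next ? next + 1 : NULL;
    }
    return 0;
}
```

With the co-mounted entry "3:cpuacct,cpu:/system" from the mock data, both "cpu" and "cpuacct" are found in the controller list and share the "/system" path, which is why the tests expect the same placement for those two controllers.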
+ +static int make_file(const char *path, + const char *name, + const char *value)
Indentation is off.
+{ + int fd = -1; + int ret = -1; + char *filepath = NULL; + + if (asprintf(&filepath, "%s/%s", path, name) < 0) + return -1; + + if ((fd = open(filepath, O_CREAT|O_WRONLY, 0600)) < 0)
Is it okay if this calls our open() instead of realopen()?
+ goto cleanup; + + if (write(fd, value, strlen(value)) != strlen(value))
Can't embed any NUL in your fake files, but that's probably okay.
+ goto cleanup; + + ret = 0; +cleanup: + if (fd != -1 &&close(fd) < 0)
Spacing.
+ ret = -1; + free(filepath); + + return ret; +} + +static int make_controller(const char *path, mode_t mode) +{ + int ret = -1; + const char *controller; + + if (!STRPREFIX(path, fakesysfsdir)) { + errno = EINVAL; + return -1; + } + controller = path + strlen(fakesysfsdir) + 1; + + if (STREQ(controller, "cpu")) { + if (symlink("cpu,cpuacct", path) < 0) + return -1; + return -0;
-0 is cute. Couldn't this just be:
return symlink("cpu,cpuacct", path);
+ } + if (STREQ(controller, "cpuacct")) { + if (symlink("cpu,cpuacct", path) < 0) + return -1; + return 0; + } + + if (realmkdir(path, mode) < 0) + goto cleanup; + +#define MAKE_FILE(name, value) \ + do { \ + if (make_file(path, name, value) < 0) \ + goto cleanup; \ + } while (0) + + if (STRPREFIX(controller, "cpu,cpuacct")) { + MAKE_FILE("cpu.cfs_period_us", "100000\n");
Seems useful; again, the set of files and their contents was probably copied right off your system. Even if newer kernels later add more files, this test will still reasonably reflect at least one kernel version that libvirt must deal with. ...
+ MAKE_FILE("blkio.weight_device", ""); + + } else { + errno = EINVAL; + goto cleanup; + } + + ret = 0; +cleanup: + return ret; +} + +static void init_syms(void) +{ + if (realfopen) + return; + +#define LOAD_SYM(name) \ + do { \ + if (!(real ## name = dlsym(RTLD_NEXT, #name))) { \ + fprintf(stderr, "Cannot find real '%s' symbol\n", #name); \ + abort(); \ + } \ + } while (0) + + LOAD_SYM(fopen); + LOAD_SYM(access); + LOAD_SYM(mkdir); + LOAD_SYM(open);
Looks correct for the stubs we are overriding.
+} + +static void init_sysfs(void) +{ + if (fakesysfsdir) + return; + + if (!(fakesysfsdir = getenv("LIBVIRT_FAKE_SYSFS_DIR"))) { + fprintf(stderr, "Missing LIBVIRT_FAKE_SYSFS_DIR env variable\n"); + abort(); + } + +#define MAKE_CONTROLLER(subpath) \ + do { \ + char *path; \ + if (asprintf(&path,"%s/%s", fakesysfsdir, subpath) < 0) \ + abort(); \ + if (make_controller(path, 0755) < 0) { \ + fprintf(stderr, "Cannot initialize %s\n", path); \ + abort(); \ + } \
Odd alignment of \ (half aligned, half not)
+ } while (0) + + MAKE_CONTROLLER("cpu"); + MAKE_CONTROLLER("cpuacct"); + MAKE_CONTROLLER("cpu,cpuacct"); + MAKE_CONTROLLER("cpu,cpuacct/system"); + MAKE_CONTROLLER("cpuset"); + MAKE_CONTROLLER("blkio"); + MAKE_CONTROLLER("memory"); + MAKE_CONTROLLER("freezer"); +} + + +FILE *fopen(const char *path, const char *mode) +{ + init_syms(); + + if (STREQ(path, "/proc/mounts")) { + if (STREQ(mode, "r")) { + return fmemopen((void *)mounts, strlen(mounts), mode);
fmemopen() - fun stuff. Glad you're fixing this up to be Linux only :) Is the (void*) cast really needed? Ah yes, to cast away const.
+ } else { + errno = EACCES; + return NULL; + } + } + if (STREQ(path, "/proc/self/cgroup")) { + if (STREQ(mode, "r")) { + return fmemopen((void *)cgroups, strlen(cgroups), mode); + } else { + errno = EACCES; + return NULL; + } + }
And these just happen to be the two files we fopen in libvirt.
+ + return realfopen(path, mode); +} + +int access(const char *path, int mode) +{ + int ret; + + init_syms(); + + if (STRPREFIX(path, SYSFS_PREFIX)) { + init_sysfs(); + char *newpath; + if (asprintf(&newpath, "%s/%s", + fakesysfsdir, + path + strlen(SYSFS_PREFIX)) < 0) { + errno = ENOMEM; + return -1; + } + ret = realaccess(newpath, mode); + free(newpath); + } else { + ret = realaccess(path, mode); + } + return ret; +} + +int mkdir(const char *path, mode_t mode) +{ + int ret; + + init_syms(); + + if (STRPREFIX(path, SYSFS_PREFIX)) { + init_sysfs(); + char *newpath; + if (asprintf(&newpath, "%s/%s", + fakesysfsdir, + path + strlen(SYSFS_PREFIX)) < 0) { + errno = ENOMEM; + return -1; + } + ret = make_controller(newpath, mode); + free(newpath); + } else { + ret = realmkdir(path, mode); + } + return ret; +}
Looks fine.
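The prefix rewrite that the access() and mkdir() stubs above both perform can be isolated into one helper. This is a minimal sketch only: redirect_path() and its parameter names are hypothetical, it uses malloc()/snprintf() instead of the asprintf() call in the patch, and the test below assumes a prefix value with a trailing slash, which is a guess since the SYSFS_PREFIX definition is not shown in this part of the thread.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a newly allocated path. If 'path' starts with 'prefix', the
 * prefix is replaced by 'fakedir' followed by '/'; otherwise the path
 * is copied unchanged. Returns NULL on allocation failure; the caller
 * frees the result.
 * (Hypothetical helper and parameter names, not taken from the patch.)
 */
char *redirect_path(const char *path, const char *prefix, const char *fakedir)
{
    size_t prefixlen;
    size_t len;
    char *out;

    assert(path && prefix && fakedir);
    prefixlen = strlen(prefix);

    if (strncmp(path, prefix, prefixlen) != 0) {
        /* Not under the fake mount point: hand back a plain copy */
        len = strlen(path) + 1;
        if ((out = malloc(len)))
            memcpy(out, path, len);
        return out;
    }

    /* fakedir + '/' + remainder of path + NUL */
    len = strlen(fakedir) + 1 + strlen(path + prefixlen) + 1;
    if ((out = malloc(len)))
        snprintf(out, len, "%s/%s", fakedir, path + prefixlen);
    return out;
}
```

Always returning an allocated string lets each stub call the real function on the result and free it unconditionally, at the cost of one extra copy for paths that are not redirected.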
+ +int open(const char *path, int flags, ...) +{ + int ret; + char *newpath = NULL; + + init_syms(); + + if (STRPREFIX(path, SYSFS_PREFIX)) { + init_sysfs(); + if (asprintf(&newpath, "%s/%s", + fakesysfsdir, + path + strlen(SYSFS_PREFIX)) < 0) { + errno = ENOMEM; + return -1; + } + } + if (flags & O_CREAT) { + va_list ap;
[1] Yep, you DO need to fix the includes to use <stdarg.h>.
+ mode_t mode; + va_start(ap, flags); + mode = va_arg(ap, mode_t); + va_end(ap); + ret = realopen(newpath ? newpath : path, flags, mode); + } else { + ret = realopen(newpath ? newpath : path, flags); + }
You know, wouldn't it just work to write the simpler:

int open(const char *path, int flags, mode_t mode)
{
    ...
    ret = realopen(newpath ? newpath : path, flags, mode);
    ...
}

without any regard to the presence or absence of O_CREAT? That is, are there any platforms that support Linux but where var-arg passing would do the wrong thing if our LD_PRELOAD is specified as a three-argument function instead of a var-arg function, whether or not the calling app passed 2 or 3 args?
+ free(newpath); + return ret; +} diff --git a/tests/vircgrouptest.c b/tests/vircgrouptest.c new file mode 100644 index 0000000..a68aa88 --- /dev/null +++ b/tests/vircgrouptest.c @@ -0,0 +1,249 @@
...
+ +static int validateCgroup(virCgroupPtr cgroup, + const char *expectPath, + const char **expectMountPoint, + const char **expectPlacement) +{ + int i; + + if (STRNEQ(cgroup->path, expectPath)) { + fprintf(stderr, "Wrong path '%s', expected '%s'\n", + cgroup->path, expectPath); + return -1; + }
+ +const char *mountsSmall[VIR_CGROUP_CONTROLLER_LAST] = { + [VIR_CGROUP_CONTROLLER_CPU] = "/not/really/sys/fs/cgroup/cpu,cpuacct",
Is it worth spelling this: SYSFS_PREFIX "/cgroup/cpu,cpuacct"
+ +#define FAKESYSFSDIRTEMPLATE abs_builddir "/fakesysfsdir-XXXXXX" + + + +static int +mymain(void) +{ + int ret = 0; + char *fakesysfsdir; + + if (!(fakesysfsdir = strdup(FAKESYSFSDIRTEMPLATE))) { + fprintf(stderr, "Out of memory\n"); + abort(); + } + + if (!mkdtemp(fakesysfsdir)) { + fprintf(stderr, "Cannot create fakesysfsdir"); + abort(); + } + + setenv("LIBVIRT_FAKE_SYSFS_DIR", fakesysfsdir, 1); + + if (virtTestRun("New cgroup for self", 1, testCgroupNewForSelf, NULL) < 0) + ret = -1; + + if (virtTestRun("New cgroup for driver", 1, testCgroupNewForDriver, NULL) < 0) + ret = -1; + + if (virtTestRun("New cgroup for domain", 1, testCgroupNewForDomain, NULL) < 0) + ret = -1; + + if (getenv("LIBVIRT_SKIP_CLEANUP") == NULL) + virFileDeleteTree(fakesysfsdir);
Worth calling VIR_FREE(fakesysfsdir) here, to keep valgrind happy? (That is, supposing that valgrind can even deal with our mock LD_PRELOAD tests)
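To illustrate the setup and cleanup being discussed, here is a minimal sketch of creating the temporary directory with mkdtemp(), exporting it through the environment, and leaving the caller with a pointer it can free. The helper name make_fake_dir() is hypothetical, and plain free() stands in for libvirt's VIR_FREE(); the patch itself does this inline in mymain().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Create a unique directory under 'base' and export its name in the
 * environment variable 'envname'. Returns the malloc'd directory path,
 * or NULL on error. The caller removes the directory and frees the
 * string when done.
 * (Hypothetical helper, not part of the patch.)
 */
char *make_fake_dir(const char *base, const char *envname)
{
    static const char suffix[] = "/fakesysfsdir-XXXXXX";
    size_t len;
    char *dir;

    assert(base && envname);
    len = strlen(base) + sizeof(suffix);    /* sizeof includes the NUL */

    if (!(dir = malloc(len)))
        return NULL;
    snprintf(dir, len, "%s%s", base, suffix);

    /* mkdtemp() rewrites the XXXXXX in place and creates the directory */
    if (!mkdtemp(dir)) {
        free(dir);
        return NULL;
    }

    /* setenv() copies the value, so 'dir' can be freed independently */
    if (setenv(envname, dir, 1) < 0) {
        rmdir(dir);
        free(dir);
        return NULL;
    }
    return dir;
}
```

Because setenv() keeps its own copy of the value, freeing the returned string after the tests have run does not invalidate what the preloaded mock reads via getenv().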
+ + return ret==0 ? EXIT_SUCCESS : EXIT_FAILURE; +} + +VIRT_TEST_MAIN_PRELOAD(mymain, abs_builddir "/.libs/libvircgroupmock.so")
I know that you have access to a mingw cross-compilation environment - ACK if you fix the issues I mentioned above and check that things still compile on mingw. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On Fri, Apr 12, 2013 at 03:24:04PM -0600, Eric Blake wrote:
On 04/10/2013 04:08 AM, Daniel P. Berrange wrote:
+++ b/tests/Makefile.am @@ -97,7 +97,9 @@ test_programs = virshtest sockettest \ utiltest shunloadtest \ virtimetest viruritest virkeyfiletest \ virauthconfigtest \ - virbitmaptest virendiantest \ + virbitmaptest \ + vircgrouptest \ + virendiantest \ viridentitytest \ virkeycodetest \ virlockspacetest \ @@ -247,6 +249,7 @@ EXTRA_DIST += $(test_scripts)
test_libraries = libshunload.la \ libvirportallocatormock.la \ + vircgroupmock.la \ $(NULL)
shunload.c is guarded by #ifdef linux; you should do the same for vircgroupmock.c.
Yep, put #ifdef __linux__ around the test source files
+vircgrouptest_SOURCES = \ + vircgrouptest.c testutils.h testutils.c +vircgrouptest_LDADD = $(LDADDS) + +vircgroupmock_la_SOURCES = \ + vircgroupmock.c +vircgroupmock_la_CFLAGS = $(AM_CFLAGS) -DMOCK_HELPER=1
Why do you need -DMOCK_HELPER=1? Too much copy and paste from virportallocatortest.c, which crammed both the mock library and the test into the same file rather than your approach here of using two files?
Yes, I previously had it all in one file, but split it when the helper got too big
+#include <config.h> + +#include "internal.h" + +#include <stdio.h> +#include <dlfcn.h> +#include <stdlib.h> +#include <unistd.h> +#include <fcntl.h> +#include <sys/stat.h> + +static int (*realopen)(const char *path, int flags, ...);
Don't you need <stdarg.h> to make it possible to implement your realopen()? [1]
Yes, though I guess something must have pulled it in indirectly. Will add the explicit include though.
+{ + int fd = -1; + int ret = -1; + char *filepath = NULL; + + if (asprintf(&filepath, "%s/%s", path, name) < 0) + return -1; + + if ((fd = open(filepath, O_CREAT|O_WRONLY, 0600)) < 0)
Is it okay if this calls our open() instead of realopen()?
We're lucky enough to be safe, but I switched it to use realopen()
+ if (STREQ(controller, "cpu")) { + if (symlink("cpu,cpuacct", path) < 0) + return -1; + return -0;
-0 is cute. Couldn't this just be:
return symlink("cpu,cpuacct", path);
Yes, I've made that change
+ +int open(const char *path, int flags, ...) +{ + int ret; + char *newpath = NULL; + + init_syms(); + + if (STRPREFIX(path, SYSFS_PREFIX)) { + init_sysfs(); + if (asprintf(&newpath, "%s/%s", + fakesysfsdir, + path + strlen(SYSFS_PREFIX)) < 0) { + errno = ENOMEM; + return -1; + } + } + if (flags & O_CREAT) { + va_list ap;
[1] Yep, you DO need to fix the includes to use <stdarg.h>.
+ mode_t mode; + va_start(ap, flags); + mode = va_arg(ap, mode_t); + va_end(ap); + ret = realopen(newpath ? newpath : path, flags, mode); + } else { + ret = realopen(newpath ? newpath : path, flags); + }
You know, wouldn't it just work to write the simpler:
int open(const char *path, int flags, mode_t mode)
The problem is that the decl on open() has to match the decl used in <fcntl.h>. Since that uses '...' we must too.
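As a standalone illustration of that constraint, the sketch below keeps the same variadic signature as the declaration in <fcntl.h> and, mirroring the patch, only reads the third argument when O_CREAT is set. The name open_forward() is hypothetical (a real LD_PRELOAD stub has to be called open() and would forward to the dlsym()-resolved function rather than calling open() directly), and reading the argument as mode_t assumes mode_t is not narrower than int, as is the case on Linux with glibc.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdarg.h>
#include <sys/types.h>

/*
 * Forward to open(2) using the same variadic signature that <fcntl.h>
 * declares. The optional mode argument is fetched only when O_CREAT
 * is present in 'flags', as the stub in the patch does.
 * (Hypothetical wrapper name; assumes mode_t is at least as wide as int.)
 */
int open_forward(const char *path, int flags, ...)
{
    assert(path);

    if (flags & O_CREAT) {
        va_list ap;
        mode_t mode;

        va_start(ap, flags);
        mode = va_arg(ap, mode_t);
        va_end(ap);
        return open(path, flags, mode);
    }

    /* No mode was passed, so do not touch the variadic arguments */
    return open(path, flags);
}
```

Callers use it exactly like open(): with two arguments for plain opens, and with a mode as the third argument when creating a file.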
diff --git a/tests/vircgrouptest.c b/tests/vircgrouptest.c new file mode 100644 index 0000000..a68aa88 --- /dev/null +++ b/tests/vircgrouptest.c @@ -0,0 +1,249 @@
...
+ +static int validateCgroup(virCgroupPtr cgroup, + const char *expectPath, + const char **expectMountPoint, + const char **expectPlacement) +{ + int i; + + if (STRNEQ(cgroup->path, expectPath)) { + fprintf(stderr, "Wrong path '%s', expected '%s'\n", + cgroup->path, expectPath); + return -1; + }
+ +const char *mountsSmall[VIR_CGROUP_CONTROLLER_LAST] = { + [VIR_CGROUP_CONTROLLER_CPU] = "/not/really/sys/fs/cgroup/cpu,cpuacct",
Is it worth spelling this: SYSFS_PREFIX "/cgroup/cpu,cpuacct"
Well SYSFS_PREFIX only exists in the other source file, not this one :-)
+ + return ret==0 ? EXIT_SUCCESS : EXIT_FAILURE; +} + +VIRT_TEST_MAIN_PRELOAD(mymain, abs_builddir "/.libs/libvircgroupmock.so")
I know that you have access to a mingw cross-compilation environment - ACK if you fix the issues I mentioned above and check that things still compile on mingw.
Wow, we have neglected mingw recently. I've found a tonne of pre-existing problems :-( Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

From: "Daniel P. Berrange" <berrange@redhat.com>
Currently the virCgroupPtr struct contains 3 pieces of information
 - path - path of the cgroup, relative to current process' cgroup placement
 - placement - current process' placement in each controller
 - mounts - mount point of each controller
When reading/writing cgroup settings, the path & placement strings are combined to form the file path. This approach only works if we assume all cgroups will be relative to the current process' cgroup placement.
To allow support for managing cgroups at any place in the hierarchy a change is needed. The 'placement' data should reflect the absolute path to the cgroup, and the 'path' value should no longer be used to form the paths to the cgroup attribute files.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
---
 src/util/vircgroup.c  | 222 +++++++++++++++++++++++++++++++++++--------------
 tests/vircgrouptest.c |  53 +++++++-----
 2 files changed, 188 insertions(+), 87 deletions(-)
diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c index 2f52c92..c336806 100644 --- a/src/util/vircgroup.c +++ b/src/util/vircgroup.c @@ -101,6 +101,23 @@ bool virCgroupHasController(virCgroupPtr cgroup, int controller) } #if defined HAVE_MNTENT_H && defined HAVE_GETMNTENT_R +static int virCgroupCopyMounts(virCgroupPtr group, + virCgroupPtr parent) +{ + int i; + for (i = 0 ; i < VIR_CGROUP_CONTROLLER_LAST ; i++) { + if (!parent->controllers[i].mountPoint) + continue; + + group->controllers[i].mountPoint = + strdup(parent->controllers[i].mountPoint); + + if (!group->controllers[i].mountPoint) + return -ENOMEM; + } + return 0; +} + /* * Process /proc/mounts figuring out what controllers are * mounted and where @@ -158,12 +175,61 @@ no_memory: } +static int virCgroupCopyPlacement(virCgroupPtr group, + const char *path, + virCgroupPtr parent) +{ + int i; + for (i = 0 ; i < VIR_CGROUP_CONTROLLER_LAST ; i++) { + if (!group->controllers[i].mountPoint) + continue; + + if (path[0] == '/') { + if 
(!(group->controllers[i].placement = strdup(path))) + return -ENOMEM; + } else { + /* + * parent=="/" + path="" => "/" + * parent=="/libvirt.service" + path="" => "/libvirt.service" + * parent=="/libvirt.service" + path="foo" => "/libvirt.service/foo" + */ + if (virAsprintf(&group->controllers[i].placement, + "%s%s%s", + parent->controllers[i].placement, + (STREQ(parent->controllers[i].placement, "/") || + STREQ(path, "") ? "" : "/"), + path) < 0) + return -ENOMEM; + } + } + + return 0; +} + + /* + * @group: the group to process + * @path: the relative path to append, not starting with '/' + * * Process /proc/self/cgroup figuring out what cgroup * sub-path the current process is assigned to. ie not - * necessarily in the root + * necessarily in the root. The contents of this file + * looks like + * + * 9:perf_event:/ + * 8:blkio:/ + * 7:net_cls:/ + * 6:freezer:/ + * 5:devices:/ + * 4:memory:/ + * 3:cpuacct,cpu:/ + * 2:cpuset:/ + * 1:name=systemd:/user/berrange/2 + * + * It then appends @path to each detected path. */ -static int virCgroupDetectPlacement(virCgroupPtr group) +static int virCgroupDetectPlacement(virCgroupPtr group, + const char *path) { int i; FILE *mapping = NULL; @@ -177,18 +243,18 @@ static int virCgroupDetectPlacement(virCgroupPtr group) while (fgets(line, sizeof(line), mapping) != NULL) { char *controllers = strchr(line, ':'); - char *path = controllers ? strchr(controllers+1, ':') : NULL; - char *nl = path ? strchr(path, '\n') : NULL; + char *selfpath = controllers ? strchr(controllers + 1, ':') : NULL; + char *nl = selfpath ? 
strchr(selfpath, '\n') : NULL; - if (!controllers || !path) + if (!controllers || !selfpath) continue; if (nl) *nl = '\0'; - *path = '\0'; + *selfpath = '\0'; controllers++; - path++; + selfpath++; for (i = 0 ; i < VIR_CGROUP_CONTROLLER_LAST ; i++) { const char *typestr = virCgroupControllerTypeToString(i); @@ -198,14 +264,25 @@ static int virCgroupDetectPlacement(virCgroupPtr group) char *next = strchr(tmp, ','); int len; if (next) { - len = next-tmp; + len = next - tmp; next++; } else { len = strlen(tmp); } - if (typelen == len && STREQLEN(typestr, tmp, len) && - !(group->controllers[i].placement = strdup(STREQ(path, "/") ? "" : path))) - goto no_memory; + + /* + * selfpath=="/" + path="" -> "/" + * selfpath=="/libvirt.service" + path="" -> "/libvirt.service" + * selfpath=="/libvirt.service" + path="foo" -> "/libvirt.service/foo" + */ + if (typelen == len && STREQLEN(typestr, tmp, len)) { + if (virAsprintf(&group->controllers[i].placement, + "%s%s%s", selfpath, + (STREQ(selfpath, "/") || + STREQ(path, "") ? "" : "/"), + path) < 0) + goto no_memory; + } tmp = next; } @@ -223,13 +300,20 @@ no_memory: } static int virCgroupDetect(virCgroupPtr group, - int controllers) + int controllers, + const char *path, + virCgroupPtr parent) { int rc; int i; int j; + VIR_DEBUG("group=%p controllers=%d path=%s parent=%p", + group, controllers, path, parent); - rc = virCgroupDetectMounts(group); + if (parent) + rc = virCgroupCopyMounts(group, parent); + else + rc = virCgroupDetectMounts(group); if (rc < 0) { VIR_ERROR(_("Failed to detect mounts for %s"), group->path); return rc; @@ -238,9 +322,10 @@ static int virCgroupDetect(virCgroupPtr group, if (controllers >= 0) { VIR_DEBUG("Validating controllers %d", controllers); for (i = 0 ; i < VIR_CGROUP_CONTROLLER_LAST ; i++) { - VIR_DEBUG("Controller '%s' wanted=%s", + VIR_DEBUG("Controller '%s' wanted=%s, mount='%s'", virCgroupControllerTypeToString(i), - (1 << i) & controllers ? "yes" : "no"); + (1 << i) & controllers ? 
"yes" : "no", + NULLSTR(group->controllers[i].mountPoint)); if (((1 << i) & controllers)) { /* Ensure requested controller is present */ if (!group->controllers[i].mountPoint) { @@ -282,10 +367,15 @@ static int virCgroupDetect(virCgroupPtr group, } /* Check that at least 1 controller is available */ - if (!controllers) + if (!controllers) { + VIR_DEBUG("No controllers set"); return -ENXIO; + } - rc = virCgroupDetectPlacement(group); + if (parent || path[0] == '/') + rc = virCgroupCopyPlacement(group, path, parent); + else + rc = virCgroupDetectPlacement(group, path); if (rc == 0) { /* Check that for every mounted controller, we found our placement */ @@ -339,10 +429,9 @@ int virCgroupPathOfController(virCgroupPtr group, if (group->controllers[controller].placement == NULL) return -ENOENT; - if (virAsprintf(path, "%s%s%s/%s", + if (virAsprintf(path, "%s%s/%s", group->controllers[controller].mountPoint, group->controllers[controller].placement, - STREQ(group->path, "/") ? "" : group->path, key ? key : "") == -1) return -ENOMEM; @@ -634,14 +723,31 @@ static int virCgroupMakeGroup(virCgroupPtr parent, } +/** + * virCgroupNew: + * @path: path for the new group + * @parent: parent group, or NULL + * @controllers: bitmask of controllers to activate + * + * Create a new cgroup storing it in @group. + * + * If @path starts with a '/' it is treated as an + * absolute path, and @parent is ignored. Otherwise + * it is treated as being relative to @parent. If + * @parent is NULL, then the placement of the current + * process is used. 
+ * + */ static int virCgroupNew(const char *path, + virCgroupPtr parent, int controllers, virCgroupPtr *group) { int rc = 0; char *typpath = NULL; - VIR_DEBUG("path=%s controllers=%d", path, controllers); + VIR_DEBUG("parent=%p path=%s controllers=%d", + parent, path, controllers); *group = NULL; if (VIR_ALLOC((*group)) != 0) { @@ -649,12 +755,22 @@ static int virCgroupNew(const char *path, goto err; } - if (!((*group)->path = strdup(path))) { - rc = -ENOMEM; - goto err; + if (path[0] == '/' || !parent) { + if (!((*group)->path = strdup(path))) { + rc = -ENOMEM; + goto err; + } + } else { + if (virAsprintf(&(*group)->path, "%s%s%s", + parent->path, + STREQ(parent->path, "") ? "" : "/", + path) < 0) { + rc = -ENOMEM; + goto err; + } } - rc = virCgroupDetect(*group, controllers); + rc = virCgroupDetect(*group, controllers, path, parent); if (rc < 0) goto err; @@ -673,15 +789,16 @@ static int virCgroupAppRoot(bool privileged, bool create, int controllers) { - virCgroupPtr rootgrp = NULL; + virCgroupPtr selfgrp = NULL; int rc; - rc = virCgroupNew("/", controllers, &rootgrp); + rc = virCgroupNewSelf(&selfgrp); + if (rc != 0) return rc; if (privileged) { - rc = virCgroupNew("/libvirt", controllers, group); + rc = virCgroupNew("libvirt", selfgrp, controllers, group); } else { char *rootname; char *username; @@ -690,23 +807,23 @@ static int virCgroupAppRoot(bool privileged, rc = -ENOMEM; goto cleanup; } - rc = virAsprintf(&rootname, "/libvirt-%s", username); + rc = virAsprintf(&rootname, "libvirt-%s", username); VIR_FREE(username); if (rc < 0) { rc = -ENOMEM; goto cleanup; } - rc = virCgroupNew(rootname, controllers, group); + rc = virCgroupNew(rootname, selfgrp, controllers, group); VIR_FREE(rootname); } if (rc != 0) goto cleanup; - rc = virCgroupMakeGroup(rootgrp, *group, create, VIR_CGROUP_NONE); + rc = virCgroupMakeGroup(selfgrp, *group, create, VIR_CGROUP_NONE); cleanup: - virCgroupFree(&rootgrp); + virCgroupFree(&selfgrp); return rc; } #endif @@ -942,7 +1059,6 @@ 
int virCgroupNewDriver(const char *name, virCgroupPtr *group) { int rc; - char *path = NULL; virCgroupPtr rootgrp = NULL; rc = virCgroupAppRoot(privileged, &rootgrp, @@ -950,20 +1066,12 @@ int virCgroupNewDriver(const char *name, if (rc != 0) goto out; - if (virAsprintf(&path, "%s/%s", rootgrp->path, name) < 0) { - rc = -ENOMEM; - goto out; - } - - rc = virCgroupNew(path, controllers, group); - VIR_FREE(path); - + rc = virCgroupNew(name, rootgrp, -1, group); if (rc == 0) { rc = virCgroupMakeGroup(rootgrp, *group, create, VIR_CGROUP_NONE); if (rc != 0) virCgroupFree(group); } - out: virCgroupFree(&rootgrp); @@ -994,7 +1102,7 @@ int virCgroupNewDriver(const char *name ATTRIBUTE_UNUSED, #if defined HAVE_MNTENT_H && defined HAVE_GETMNTENT_R int virCgroupNewSelf(virCgroupPtr *group) { - return virCgroupNew("/", -1, group); + return virCgroupNew("", NULL, -1, group); } #else int virCgroupNewSelf(virCgroupPtr *group ATTRIBUTE_UNUSED) @@ -1019,13 +1127,8 @@ int virCgroupNewDomain(virCgroupPtr driver, virCgroupPtr *group) { int rc; - char *path; - if (virAsprintf(&path, "%s/%s", driver->path, name) < 0) - return -ENOMEM; - - rc = virCgroupNew(path, -1, group); - VIR_FREE(path); + rc = virCgroupNew(name, driver, -1, group); if (rc == 0) { /* @@ -1071,18 +1174,18 @@ int virCgroupNewVcpu(virCgroupPtr domain, virCgroupPtr *group) { int rc; - char *path; + char *name; int controllers; - if (virAsprintf(&path, "%s/vcpu%d", domain->path, vcpuid) < 0) + if (virAsprintf(&name, "vcpu%d", vcpuid) < 0) return -ENOMEM; controllers = ((1 << VIR_CGROUP_CONTROLLER_CPU) | (1 << VIR_CGROUP_CONTROLLER_CPUACCT) | (1 << VIR_CGROUP_CONTROLLER_CPUSET)); - rc = virCgroupNew(path, controllers, group); - VIR_FREE(path); + rc = virCgroupNew(name, domain, controllers, group); + VIR_FREE(name); if (rc == 0) { rc = virCgroupMakeGroup(domain, *group, create, VIR_CGROUP_NONE); @@ -1116,18 +1219,13 @@ int virCgroupNewEmulator(virCgroupPtr domain, virCgroupPtr *group) { int rc; - char *path; int 
controllers; - if (virAsprintf(&path, "%s/emulator", domain->path) < 0) - return -ENOMEM; - controllers = ((1 << VIR_CGROUP_CONTROLLER_CPU) | (1 << VIR_CGROUP_CONTROLLER_CPUACCT) | (1 << VIR_CGROUP_CONTROLLER_CPUSET)); - rc = virCgroupNew(path, controllers, group); - VIR_FREE(path); + rc = virCgroupNew("emulator", domain, controllers, group); if (rc == 0) { rc = virCgroupMakeGroup(domain, *group, create, VIR_CGROUP_NONE); @@ -2015,8 +2113,6 @@ static int virCgroupKillRecursiveInternal(virCgroupPtr group, int signum, virHas } while ((ent = readdir(dp))) { - char *subpath; - if (STREQ(ent->d_name, ".")) continue; if (STREQ(ent->d_name, "..")) @@ -2025,12 +2121,8 @@ static int virCgroupKillRecursiveInternal(virCgroupPtr group, int signum, virHas continue; VIR_DEBUG("Process subdir %s", ent->d_name); - if (virAsprintf(&subpath, "%s/%s", group->path, ent->d_name) < 0) { - rc = -ENOMEM; - goto cleanup; - } - if ((rc = virCgroupNew(subpath, -1, &subgroup)) != 0) + if ((rc = virCgroupNew(ent->d_name, group, -1, &subgroup)) != 0) goto cleanup; if ((rc = virCgroupKillRecursiveInternal(subgroup, signum, pids, true)) < 0) diff --git a/tests/vircgrouptest.c b/tests/vircgrouptest.c index a68aa88..3f35f2e 100644 --- a/tests/vircgrouptest.c +++ b/tests/vircgrouptest.c @@ -94,11 +94,11 @@ static int testCgroupNewForSelf(const void *args ATTRIBUTE_UNUSED) const char *placement[VIR_CGROUP_CONTROLLER_LAST] = { [VIR_CGROUP_CONTROLLER_CPU] = "/system", [VIR_CGROUP_CONTROLLER_CPUACCT] = "/system", - [VIR_CGROUP_CONTROLLER_CPUSET] = "", - [VIR_CGROUP_CONTROLLER_MEMORY] = "", + [VIR_CGROUP_CONTROLLER_CPUSET] = "/", + [VIR_CGROUP_CONTROLLER_MEMORY] = "/", [VIR_CGROUP_CONTROLLER_DEVICES] = NULL, - [VIR_CGROUP_CONTROLLER_FREEZER] = "", - [VIR_CGROUP_CONTROLLER_BLKIO] = "", + [VIR_CGROUP_CONTROLLER_FREEZER] = "/", + [VIR_CGROUP_CONTROLLER_BLKIO] = "/", }; if (virCgroupNewSelf(&cgroup) < 0) { @@ -106,7 +106,7 @@ static int testCgroupNewForSelf(const void *args ATTRIBUTE_UNUSED) goto cleanup; } 
- ret = validateCgroup(cgroup, "/", mountsFull, placement); + ret = validateCgroup(cgroup, "", mountsFull, placement); cleanup: virCgroupFree(&cgroup); @@ -119,14 +119,23 @@ static int testCgroupNewForDriver(const void *args ATTRIBUTE_UNUSED) virCgroupPtr cgroup = NULL; int ret = -1; int rv; - const char *placement[VIR_CGROUP_CONTROLLER_LAST] = { - [VIR_CGROUP_CONTROLLER_CPU] = "/system", - [VIR_CGROUP_CONTROLLER_CPUACCT] = "/system", - [VIR_CGROUP_CONTROLLER_CPUSET] = "", - [VIR_CGROUP_CONTROLLER_MEMORY] = "", + const char *placementSmall[VIR_CGROUP_CONTROLLER_LAST] = { + [VIR_CGROUP_CONTROLLER_CPU] = "/system/libvirt/lxc", + [VIR_CGROUP_CONTROLLER_CPUACCT] = "/system/libvirt/lxc", + [VIR_CGROUP_CONTROLLER_CPUSET] = NULL, + [VIR_CGROUP_CONTROLLER_MEMORY] = "/libvirt/lxc", [VIR_CGROUP_CONTROLLER_DEVICES] = NULL, - [VIR_CGROUP_CONTROLLER_FREEZER] = "", - [VIR_CGROUP_CONTROLLER_BLKIO] = "", + [VIR_CGROUP_CONTROLLER_FREEZER] = NULL, + [VIR_CGROUP_CONTROLLER_BLKIO] = NULL, + }; + const char *placementFull[VIR_CGROUP_CONTROLLER_LAST] = { + [VIR_CGROUP_CONTROLLER_CPU] = "/system/libvirt/lxc", + [VIR_CGROUP_CONTROLLER_CPUACCT] = "/system/libvirt/lxc", + [VIR_CGROUP_CONTROLLER_CPUSET] = "/libvirt/lxc", + [VIR_CGROUP_CONTROLLER_MEMORY] = "/libvirt/lxc", + [VIR_CGROUP_CONTROLLER_DEVICES] = NULL, + [VIR_CGROUP_CONTROLLER_FREEZER] = "/libvirt/lxc", + [VIR_CGROUP_CONTROLLER_BLKIO] = "/libvirt/lxc", }; if ((rv = virCgroupNewDriver("lxc", true, false, -1, &cgroup)) != -ENOENT) { @@ -159,14 +168,14 @@ static int testCgroupNewForDriver(const void *args ATTRIBUTE_UNUSED) fprintf(stderr, "Cannot create LXC cgroup: %d\n", -rv); goto cleanup; } - ret = validateCgroup(cgroup, "/libvirt/lxc", mountsSmall, placement); + ret = validateCgroup(cgroup, "libvirt/lxc", mountsSmall, placementSmall); virCgroupFree(&cgroup); if ((rv = virCgroupNewDriver("lxc", true, true, -1, &cgroup)) != 0) { fprintf(stderr, "Cannot create LXC cgroup: %d\n", -rv); goto cleanup; } - ret = validateCgroup(cgroup, 
"/libvirt/lxc", mountsFull, placement); + ret = validateCgroup(cgroup, "libvirt/lxc", mountsFull, placementFull); cleanup: virCgroupFree(&cgroup); @@ -181,13 +190,13 @@ static int testCgroupNewForDomain(const void *args ATTRIBUTE_UNUSED) int ret = -1; int rv; const char *placement[VIR_CGROUP_CONTROLLER_LAST] = { - [VIR_CGROUP_CONTROLLER_CPU] = "/system", - [VIR_CGROUP_CONTROLLER_CPUACCT] = "/system", - [VIR_CGROUP_CONTROLLER_CPUSET] = "", - [VIR_CGROUP_CONTROLLER_MEMORY] = "", + [VIR_CGROUP_CONTROLLER_CPU] = "/system/libvirt/lxc/wibble", + [VIR_CGROUP_CONTROLLER_CPUACCT] = "/system/libvirt/lxc/wibble", + [VIR_CGROUP_CONTROLLER_CPUSET] = "/libvirt/lxc/wibble", + [VIR_CGROUP_CONTROLLER_MEMORY] = "/libvirt/lxc/wibble", [VIR_CGROUP_CONTROLLER_DEVICES] = NULL, - [VIR_CGROUP_CONTROLLER_FREEZER] = "", - [VIR_CGROUP_CONTROLLER_BLKIO] = "", + [VIR_CGROUP_CONTROLLER_FREEZER] = "/libvirt/lxc/wibble", + [VIR_CGROUP_CONTROLLER_BLKIO] = "/libvirt/lxc/wibble", }; if ((rv = virCgroupNewDriver("lxc", true, false, -1, &drivercgroup)) != 0) { @@ -200,7 +209,7 @@ static int testCgroupNewForDomain(const void *args ATTRIBUTE_UNUSED) goto cleanup; } - ret = validateCgroup(domaincgroup, "/libvirt/lxc/wibble", mountsFull, placement); + ret = validateCgroup(domaincgroup, "libvirt/lxc/wibble", mountsFull, placement); cleanup: virCgroupFree(&drivercgroup); @@ -246,4 +255,4 @@ mymain(void) return ret==0 ? EXIT_SUCCESS : EXIT_FAILURE; } -VIRT_TEST_MAIN_PRELOAD(mymain, abs_builddir "/.libs/libvircgroupmock.so") +VIRT_TEST_MAIN_PRELOAD(mymain, abs_builddir "/.libs/vircgroupmock.so") -- 1.8.1.4

On 10.04.2013 12:08, Daniel P. Berrange wrote:
From: "Daniel P. Berrange" <berrange@redhat.com>
Currently the virCgroupPtr struct contains 3 pieces of information
- path - path of the cgroup, relative to current process' cgroup placement
- placement - current process' placement in each controller
- mounts - mount point of each controller
When reading/writing cgroup settings, the path & placement strings are combined to form the file path. This approach only works if we assume all cgroups will be relative to the current process' cgroup placement.
To support managing cgroups at any place in the hierarchy, a change is needed. The 'placement' data should reflect the absolute path to the cgroup, and the 'path' value should no longer be used to form the paths to the cgroup attribute files.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
---
 src/util/vircgroup.c  | 222 +++++++++++++++++++++++++++++++++++---------------
 tests/vircgrouptest.c |  53 +++++++-----
 2 files changed, 188 insertions(+), 87 deletions(-)
diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c index 2f52c92..c336806 100644 --- a/src/util/vircgroup.c +++ b/src/util/vircgroup.c @@ -101,6 +101,23 @@ bool virCgroupHasController(virCgroupPtr cgroup, int controller) }
#if defined HAVE_MNTENT_H && defined HAVE_GETMNTENT_R +static int virCgroupCopyMounts(virCgroupPtr group, + virCgroupPtr parent) +{ + int i; + for (i = 0 ; i < VIR_CGROUP_CONTROLLER_LAST ; i++) { + if (!parent->controllers[i].mountPoint) + continue; + + group->controllers[i].mountPoint = + strdup(parent->controllers[i].mountPoint); + + if (!group->controllers[i].mountPoint) + return -ENOMEM; + } + return 0; +} + /* * Process /proc/mounts figuring out what controllers are * mounted and where @@ -158,12 +175,61 @@ no_memory: }
+static int virCgroupCopyPlacement(virCgroupPtr group,
+                                  const char *path,
+                                  virCgroupPtr parent)
+{
+    int i;
+    for (i = 0 ; i < VIR_CGROUP_CONTROLLER_LAST ; i++) {
+        if (!group->controllers[i].mountPoint)
+            continue;
+
+        if (path[0] == '/') {
+            if (!(group->controllers[i].placement = strdup(path)))
+                return -ENOMEM;
+        } else {
+            /*
+             * parent=="/" + path="" => "/"
+             * parent=="/libvirt.service" + path="" => "/libvirt.service"
+             * parent=="/libvirt.service" + path="foo" => "/libvirt.service/foo"
+             */
s/path=/path==/
+            if (virAsprintf(&group->controllers[i].placement,
+                            "%s%s%s",
+                            parent->controllers[i].placement,
+                            (STREQ(parent->controllers[i].placement, "/") ||
+                             STREQ(path, "") ? "" : "/"),
+                            path) < 0)
No, please no. This is too big for my small brain. And it's easy to make a mistake here, as you just did. The closing parenthesis should be just before the ternary operator. In fact, both parentheses can be left out.
+                return -ENOMEM;
+        }
+    }
+
+    return 0;
+}
+
+
 /*
Insert function name here.
+ * @group: the group to process + * @path: the relative path to append, not starting with '/' + * * Process /proc/self/cgroup figuring out what cgroup * sub-path the current process is assigned to. ie not - * necessarily in the root + * necessarily in the root. The contents of this file + * looks like + * + * 9:perf_event:/ + * 8:blkio:/ + * 7:net_cls:/ + * 6:freezer:/ + * 5:devices:/ + * 4:memory:/ + * 3:cpuacct,cpu:/ + * 2:cpuset:/ + * 1:name=systemd:/user/berrange/2 + * + * It then appends @path to each detected path. */ -static int virCgroupDetectPlacement(virCgroupPtr group) +static int virCgroupDetectPlacement(virCgroupPtr group, + const char *path) { int i; FILE *mapping = NULL; @@ -177,18 +243,18 @@ static int virCgroupDetectPlacement(virCgroupPtr group)
while (fgets(line, sizeof(line), mapping) != NULL) { char *controllers = strchr(line, ':'); - char *path = controllers ? strchr(controllers+1, ':') : NULL; - char *nl = path ? strchr(path, '\n') : NULL; + char *selfpath = controllers ? strchr(controllers + 1, ':') : NULL; + char *nl = selfpath ? strchr(selfpath, '\n') : NULL;
- if (!controllers || !path) + if (!controllers || !selfpath) continue;
if (nl) *nl = '\0';
- *path = '\0'; + *selfpath = '\0'; controllers++; - path++; + selfpath++;
for (i = 0 ; i < VIR_CGROUP_CONTROLLER_LAST ; i++) { const char *typestr = virCgroupControllerTypeToString(i); @@ -198,14 +264,25 @@ static int virCgroupDetectPlacement(virCgroupPtr group) char *next = strchr(tmp, ','); int len; if (next) { - len = next-tmp; + len = next - tmp; next++; } else { len = strlen(tmp); } - if (typelen == len && STREQLEN(typestr, tmp, len) && - !(group->controllers[i].placement = strdup(STREQ(path, "/") ? "" : path))) - goto no_memory; + + /* + * selfpath=="/" + path="" -> "/" + * selfpath=="/libvirt.service" + path="" -> "/libvirt.service" + * selfpath=="/libvirt.service" + path="foo" -> "/libvirt.service/foo" + */ + if (typelen == len && STREQLEN(typestr, tmp, len)) { + if (virAsprintf(&group->controllers[i].placement, + "%s%s%s", selfpath, + (STREQ(selfpath, "/") || + STREQ(path, "") ? "" : "/"),
same applies here
+ path) < 0) + goto no_memory; + }
tmp = next; }
diff --git a/tests/vircgrouptest.c b/tests/vircgrouptest.c index a68aa88..3f35f2e 100644 --- a/tests/vircgrouptest.c +++ b/tests/vircgrouptest.c
@@ -246,4 +255,4 @@ mymain(void) return ret==0 ? EXIT_SUCCESS : EXIT_FAILURE; }
-VIRT_TEST_MAIN_PRELOAD(mymain, abs_builddir "/.libs/libvircgroupmock.so")
+VIRT_TEST_MAIN_PRELOAD(mymain, abs_builddir "/.libs/vircgroupmock.so")
This seems a bit unrelated. It's fixing a pre-existing bug, so it should go into a separate patch.

ACK to the changes if you address the nits I've pointed out, but please split this patch into two.

Michal
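The join rule spelled out in the patch comment (parent placement plus relative path) can be sketched in isolation. This is a minimal illustration only: `join_placement` is a hypothetical helper, not a libvirt function, and it uses plain `malloc`/`snprintf` where the patch uses `virAsprintf`.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of libvirt): join a parent placement
 * and a relative path using the rule from the patch comment.
 * Returns a malloc'd string the caller must free, or NULL on OOM.
 */
static char *join_placement(const char *parent, const char *path)
{
    /* No separator when the parent is the root or the path is empty. */
    const char *sep =
        (strcmp(parent, "/") == 0 || strcmp(path, "") == 0) ? "" : "/";
    size_t len = strlen(parent) + strlen(sep) + strlen(path) + 1;
    char *out = malloc(len);

    if (!out)
        return NULL;
    snprintf(out, len, "%s%s%s", parent, sep, path);
    return out;
}
```

With this rule, `join_placement("/libvirt.service", "foo")` yields "/libvirt.service/foo", while an empty path leaves the parent placement unchanged.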

On Thu, Apr 11, 2013 at 12:02:05PM +0200, Michal Privoznik wrote:
On 10.04.2013 12:08, Daniel P. Berrange wrote:
From: "Daniel P. Berrange" <berrange@redhat.com>
Currently the virCgroupPtr struct contains 3 pieces of information
- path - path of the cgroup, relative to current process' cgroup placement
- placement - current process' placement in each controller
- mounts - mount point of each controller
When reading/writing cgroup settings, the path & placement strings are combined to form the file path. This approach only works if we assume all cgroups will be relative to the current process' cgroup placement.
To support managing cgroups at any place in the hierarchy, a change is needed. The 'placement' data should reflect the absolute path to the cgroup, and the 'path' value should no longer be used to form the paths to the cgroup attribute files.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
---
 src/util/vircgroup.c  | 222 +++++++++++++++++++++++++++++++++++---------------
 tests/vircgrouptest.c |  53 +++++++-----
 2 files changed, 188 insertions(+), 87 deletions(-)
diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c index 2f52c92..c336806 100644 --- a/src/util/vircgroup.c +++ b/src/util/vircgroup.c @@ -101,6 +101,23 @@ bool virCgroupHasController(virCgroupPtr cgroup, int controller) }
#if defined HAVE_MNTENT_H && defined HAVE_GETMNTENT_R +static int virCgroupCopyMounts(virCgroupPtr group, + virCgroupPtr parent) +{ + int i; + for (i = 0 ; i < VIR_CGROUP_CONTROLLER_LAST ; i++) { + if (!parent->controllers[i].mountPoint) + continue; + + group->controllers[i].mountPoint = + strdup(parent->controllers[i].mountPoint); + + if (!group->controllers[i].mountPoint) + return -ENOMEM; + } + return 0; +} + /* * Process /proc/mounts figuring out what controllers are * mounted and where @@ -158,12 +175,61 @@ no_memory: }
+static int virCgroupCopyPlacement(virCgroupPtr group,
+                                  const char *path,
+                                  virCgroupPtr parent)
+{
+    int i;
+    for (i = 0 ; i < VIR_CGROUP_CONTROLLER_LAST ; i++) {
+        if (!group->controllers[i].mountPoint)
+            continue;
+
+        if (path[0] == '/') {
+            if (!(group->controllers[i].placement = strdup(path)))
+                return -ENOMEM;
+        } else {
+            /*
+             * parent=="/" + path="" => "/"
+             * parent=="/libvirt.service" + path="" => "/libvirt.service"
+             * parent=="/libvirt.service" + path="foo" => "/libvirt.service/foo"
+             */
s/path=/path==/
+            if (virAsprintf(&group->controllers[i].placement,
+                            "%s%s%s",
+                            parent->controllers[i].placement,
+                            (STREQ(parent->controllers[i].placement, "/") ||
+                             STREQ(path, "") ? "" : "/"),
+                            path) < 0)
No, please no. This is too big for my small brain. And it's easy to make a mistake here, as you just did. The closing parenthesis should be just before the ternary operator. In fact, both parentheses can be left out.
It isn't a mistake. I used the () purely for style reasons: they make the second line indent, so that it is obviously a continuation. The () have no functional effect, since || has higher precedence than ?:
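The precedence point can be checked with a small standalone sketch. The two functions below are hypothetical stand-ins (plain `strcmp` instead of libvirt's `STREQ` macro): one keeps the condition as written in the patch, the other parenthesizes the `||` explicitly. Because `||` binds tighter than `?:`, both should return the same separator for every input.

```c
#include <string.h>

/* Condition written as in the patch: only the outer parentheses. */
static const char *sep_as_written(const char *parent, const char *path)
{
    return (strcmp(parent, "/") == 0 ||
            strcmp(path, "") == 0 ? "" : "/");
}

/* Same condition with the || grouped explicitly before the ?: operator. */
static const char *sep_explicit(const char *parent, const char *path)
{
    return (strcmp(parent, "/") == 0 || strcmp(path, "") == 0) ? "" : "/";
}
```

The explicit form is arguably easier to read at a glance, which is the substance of the review comment, but the two compile to the same logic.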
diff --git a/tests/vircgrouptest.c b/tests/vircgrouptest.c index a68aa88..3f35f2e 100644 --- a/tests/vircgrouptest.c +++ b/tests/vircgrouptest.c
@@ -246,4 +255,4 @@ mymain(void) return ret==0 ? EXIT_SUCCESS : EXIT_FAILURE; }
-VIRT_TEST_MAIN_PRELOAD(mymain, abs_builddir "/.libs/libvircgroupmock.so")
+VIRT_TEST_MAIN_PRELOAD(mymain, abs_builddir "/.libs/vircgroupmock.so")
This seems a bit unrelated. It's fixing a pre-existing bug, so it should go into a separate patch.
Yep, that's a mistake.

Daniel

From: "Daniel P. Berrange" <berrange@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> --- src/util/vircgroup.c | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c index c336806..d3c43a2 100644 --- a/src/util/vircgroup.c +++ b/src/util/vircgroup.c @@ -662,12 +662,18 @@ static int virCgroupMakeGroup(virCgroupPtr parent, char *path = NULL; /* Skip over controllers that aren't mounted */ - if (!group->controllers[i].mountPoint) + if (!group->controllers[i].mountPoint) { + VIR_DEBUG("Skipping unmounted controller %s", + virCgroupControllerTypeToString(i)); continue; + } rc = virCgroupPathOfController(group, i, "", &path); - if (rc < 0) + if (rc < 0) { + VIR_DEBUG("Failed to find path of controller %s", + virCgroupControllerTypeToString(i)); return rc; + } /* As of Feb 2011, clang can't see that the above function * call did not modify group. */ sa_assert(group->controllers[i].mountPoint); @@ -681,11 +687,14 @@ static int virCgroupMakeGroup(virCgroupPtr parent, * other controllers even though they are available. So * treat blkio as unmounted if mkdir fails. */ if (i == VIR_CGROUP_CONTROLLER_BLKIO) { + VIR_DEBUG("Ignoring mkdir failure with blkio controller. 
Kernel probably too old"); rc = 0; VIR_FREE(group->controllers[i].mountPoint); VIR_FREE(path); continue; } else { + VIR_DEBUG("Failed to create controller %s for group", + virCgroupControllerTypeToString(i)); rc = -errno; VIR_FREE(path); break; @@ -719,6 +728,7 @@ static int virCgroupMakeGroup(virCgroupPtr parent, VIR_FREE(path); } + VIR_DEBUG("Done making controllers for group"); return rc; } @@ -903,6 +913,7 @@ int virCgroupRemove(virCgroupPtr group) int i; char *grppath = NULL; + VIR_DEBUG("Removing cgroup %s", group->path); for (i = 0 ; i < VIR_CGROUP_CONTROLLER_LAST ; i++) { /* Skip over controllers not mounted */ if (!group->controllers[i].mountPoint) @@ -918,6 +929,7 @@ int virCgroupRemove(virCgroupPtr group) rc = virCgroupRemoveRecursively(grppath); VIR_FREE(grppath); } + VIR_DEBUG("Done removing cgroup %s", group->path); return rc; } -- 1.8.1.4

On 10.04.2013 12:08, Daniel P. Berrange wrote:
From: "Daniel P. Berrange" <berrange@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
---
 src/util/vircgroup.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)
ACK

Michal

From: "Daniel P. Berrange" <berrange@redhat.com> Currently if virCgroupMakeGroup fails, we can get in a situation where some controllers have been setup, but others not. Ensure we call virCgroupRemove to remove what we've done upon failure Signed-off-by: Daniel P. Berrange <berrange@redhat.com> --- src/util/vircgroup.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c index d3c43a2..bcc61a8 100644 --- a/src/util/vircgroup.c +++ b/src/util/vircgroup.c @@ -1081,8 +1081,10 @@ int virCgroupNewDriver(const char *name, rc = virCgroupNew(name, rootgrp, -1, group); if (rc == 0) { rc = virCgroupMakeGroup(rootgrp, *group, create, VIR_CGROUP_NONE); - if (rc != 0) + if (rc != 0) { + virCgroupRemove(*group); virCgroupFree(group); + } } out: virCgroupFree(&rootgrp); @@ -1154,8 +1156,10 @@ int virCgroupNewDomain(virCgroupPtr driver, * cumulative usage that we don't need. */ rc = virCgroupMakeGroup(driver, *group, create, VIR_CGROUP_MEM_HIERACHY); - if (rc != 0) + if (rc != 0) { + virCgroupRemove(*group); virCgroupFree(group); + } } return rc; @@ -1201,8 +1205,10 @@ int virCgroupNewVcpu(virCgroupPtr domain, if (rc == 0) { rc = virCgroupMakeGroup(domain, *group, create, VIR_CGROUP_NONE); - if (rc != 0) + if (rc != 0) { + virCgroupRemove(*group); virCgroupFree(group); + } } return rc; @@ -1241,8 +1247,10 @@ int virCgroupNewEmulator(virCgroupPtr domain, if (rc == 0) { rc = virCgroupMakeGroup(domain, *group, create, VIR_CGROUP_NONE); - if (rc != 0) + if (rc != 0) { + virCgroupRemove(*group); virCgroupFree(group); + } } return rc; -- 1.8.1.4

On 10.04.2013 12:08, Daniel P. Berrange wrote:
From: "Daniel P. Berrange" <berrange@redhat.com>
Currently, if virCgroupMakeGroup fails, we can get into a situation where some controllers have been set up, but others have not. Ensure we call virCgroupRemove to remove what we've done upon failure.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
---
 src/util/vircgroup.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c index d3c43a2..bcc61a8 100644 --- a/src/util/vircgroup.c +++ b/src/util/vircgroup.c @@ -1081,8 +1081,10 @@ int virCgroupNewDriver(const char *name, rc = virCgroupNew(name, rootgrp, -1, group); if (rc == 0) { rc = virCgroupMakeGroup(rootgrp, *group, create, VIR_CGROUP_NONE); - if (rc != 0) + if (rc != 0) { + virCgroupRemove(*group); virCgroupFree(group); + } } out: virCgroupFree(&rootgrp); @@ -1154,8 +1156,10 @@ int virCgroupNewDomain(virCgroupPtr driver, * cumulative usage that we don't need. */ rc = virCgroupMakeGroup(driver, *group, create, VIR_CGROUP_MEM_HIERACHY); - if (rc != 0) + if (rc != 0) { + virCgroupRemove(*group); virCgroupFree(group); + } }
return rc; @@ -1201,8 +1205,10 @@ int virCgroupNewVcpu(virCgroupPtr domain,
     if (rc == 0) {
         rc = virCgroupMakeGroup(domain, *group, create, VIR_CGROUP_NONE);
-        if (rc != 0)
+        if (rc != 0) {
+            virCgroupRemove(*group);
             virCgroupFree(group);
+        }
     }
return rc; @@ -1241,8 +1247,10 @@ int virCgroupNewEmulator(virCgroupPtr domain,
     if (rc == 0) {
         rc = virCgroupMakeGroup(domain, *group, create, VIR_CGROUP_NONE);
-        if (rc != 0)
+        if (rc != 0) {
+            virCgroupRemove(*group);
             virCgroupFree(group);
+        }
     }
return rc;
Funny, shouldn't we make virCgroupMakeGroup() clean up on failure? But I can live with this version as well.

Michal
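The alternative raised here, having the "make" step roll back its own partial work, might look roughly like the sketch below. Everything in it is hypothetical: `demo_group`, `demo_make` and `demo_remove` are simplified stand-ins, not the libvirt functions, and the `fail_at` parameter only simulates a mkdir failure for one controller.

```c
#include <stdbool.h>
#include <stddef.h>

#define N_CONTROLLERS 4

/* Hypothetical stand-in for a cgroup: one "created" flag per controller. */
struct demo_group {
    bool created[N_CONTROLLERS];
};

/* Undo whatever has been set up so far. */
static void demo_remove(struct demo_group *g)
{
    for (size_t i = 0; i < N_CONTROLLERS; i++)
        g->created[i] = false;
}

/*
 * Set up every controller. fail_at simulates a failure at that index
 * (pass -1 for no failure). On failure the function rolls back its own
 * partial work, so callers need no cleanup of their own.
 * Returns 0 on success, -1 on failure.
 */
static int demo_make(struct demo_group *g, int fail_at)
{
    for (int i = 0; i < N_CONTROLLERS; i++) {
        if (i == fail_at) {
            demo_remove(g);
            return -1;
        }
        g->created[i] = true;
    }
    return 0;
}
```

The trade-off is that cleanup lives in one place instead of being repeated in each caller, which the patch as posted does four times.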

From: "Daniel P. Berrange" <berrange@redhat.com> A resource partition is an absolute cgroup path, ignoring the current process placement. Expose a virCgroupNewPartition API for constructing such cgroups Signed-off-by: Daniel P. Berrange <berrange@redhat.com> --- src/libvirt_private.syms | 4 +- src/lxc/lxc_cgroup.c | 2 +- src/qemu/qemu_cgroup.c | 2 +- src/util/vircgroup.c | 146 ++++++++++++++++++++++++++++++++++++++--- src/util/vircgroup.h | 20 ++++-- tests/vircgrouptest.c | 166 ++++++++++++++++++++++++++++++++++++++++++++++- 6 files changed, 321 insertions(+), 19 deletions(-) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index b0a4b5a..52c3bcb 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1117,9 +1117,11 @@ virCgroupKill; virCgroupKillPainfully; virCgroupKillRecursive; virCgroupMoveTask; -virCgroupNewDomain; +virCgroupNewDomainDriver; +virCgroupNewDomainPartition; virCgroupNewDriver; virCgroupNewEmulator; +virCgroupNewPartition; virCgroupNewSelf; virCgroupNewVcpu; virCgroupPathOfController; diff --git a/src/lxc/lxc_cgroup.c b/src/lxc/lxc_cgroup.c index 7d1432b..72940bd 100644 --- a/src/lxc/lxc_cgroup.c +++ b/src/lxc/lxc_cgroup.c @@ -536,7 +536,7 @@ virCgroupPtr virLXCCgroupCreate(virDomainDefPtr def) goto cleanup; } - rc = virCgroupNewDomain(driver, def->name, true, &cgroup); + rc = virCgroupNewDomainDriver(driver, def->name, true, &cgroup); if (rc != 0) { virReportSystemError(-rc, _("Unable to create cgroup for domain %s"), diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c index cb53acb..cb0faa1 100644 --- a/src/qemu/qemu_cgroup.c +++ b/src/qemu/qemu_cgroup.c @@ -216,7 +216,7 @@ int qemuInitCgroup(virQEMUDriverPtr driver, goto cleanup; } - rc = virCgroupNewDomain(driverGroup, vm->def->name, true, &priv->cgroup); + rc = virCgroupNewDomainDriver(driverGroup, vm->def->name, true, &priv->cgroup); if (rc != 0) { virReportSystemError(-rc, _("Unable to create cgroup for %s"), diff --git a/src/util/vircgroup.c 
b/src/util/vircgroup.c index bcc61a8..40e0fe6 100644 --- a/src/util/vircgroup.c +++ b/src/util/vircgroup.c @@ -1055,6 +1055,76 @@ cleanup: return rc; } + +#if defined HAVE_MNTENT_H && defined HAVE_GETMNTENT_R +/** + * virCgroupNewPartition: + * @path: path for the partition + * @create: true to create the cgroup tree + * @controllers: mask of controllers to create + * + * Creates a new cgroup to represent the resource + * partition path identified by @name. + * + * Returns 0 on success, -errno on failure + */ +int virCgroupNewPartition(const char *path, + bool create, + int controllers, + virCgroupPtr *group) +{ + int rc; + char *parentPath = NULL; + virCgroupPtr parent = NULL; + VIR_DEBUG("path=%s create=%d controllers=%x", + path, create, controllers); + + if (path[0] != '/') + return -EINVAL; + + rc = virCgroupNew(path, NULL, controllers, group); + if (rc != 0) + goto cleanup; + + if (STRNEQ(path, "/")) { + char *tmp; + if (!(parentPath = strdup(path))) + return -ENOMEM; + + tmp = strrchr(parentPath, '/'); + tmp++; + *tmp = '\0'; + + rc = virCgroupNew(parentPath, NULL, controllers, &parent); + if (rc != 0) + goto cleanup; + + rc = virCgroupMakeGroup(parent, *group, create, VIR_CGROUP_NONE); + if (rc != 0) { + virCgroupRemove(*group); + virCgroupFree(group); + goto cleanup; + } + } + +cleanup: + virCgroupFree(&parent); + VIR_FREE(parentPath); + return rc; +} +#else +int virCgroupNewPartition(const char *path ATTRIBUTE_UNUSED, + const char *driver ATTRIBUTE_UNUSED, + const char *name ATTRIBUTE_UNUSED, + bool create ATTRIBUTE_UNUSED, + int controllers ATTRIBUTE_UNUSED, + virCgroupPtr *group ATTRIBUTE_UNUSED) +{ + /* Claim no support */ + return -ENXIO; +} +#endif + /** * virCgroupNewDriver: * @@ -1126,7 +1196,7 @@ int virCgroupNewSelf(virCgroupPtr *group ATTRIBUTE_UNUSED) #endif /** - * virCgroupNewDomain: + * virCgroupNewDomainDriver: * * @driver: group for driver owning the domain * @name: name of the domain @@ -1135,10 +1205,10 @@ int 
virCgroupNewSelf(virCgroupPtr *group ATTRIBUTE_UNUSED) * Returns 0 on success */ #if defined HAVE_MNTENT_H && defined HAVE_GETMNTENT_R -int virCgroupNewDomain(virCgroupPtr driver, - const char *name, - bool create, - virCgroupPtr *group) +int virCgroupNewDomainDriver(virCgroupPtr driver, + const char *name, + bool create, + virCgroupPtr *group) { int rc; @@ -1165,10 +1235,68 @@ int virCgroupNewDomain(virCgroupPtr driver, return rc; } #else -int virCgroupNewDomain(virCgroupPtr driver ATTRIBUTE_UNUSED, - const char *name ATTRIBUTE_UNUSED, - bool create ATTRIBUTE_UNUSED, - virCgroupPtr *group ATTRIBUTE_UNUSED) +int virCgroupNewDomainDriver(virCgroupPtr driver ATTRIBUTE_UNUSED, + const char *name ATTRIBUTE_UNUSED, + bool create ATTRIBUTE_UNUSED, + virCgroupPtr *group ATTRIBUTE_UNUSED) +{ + return -ENXIO; +} +#endif + +/** + * virCgroupNewDomainPartition: + * + * @partition: partition holding the domain + * @driver: name of the driver + * @name: name of the domain + * @group: Pointer to returned virCgroupPtr + * + * Returns 0 on success + */ +#if defined HAVE_MNTENT_H && defined HAVE_GETMNTENT_R +int virCgroupNewDomainPartition(virCgroupPtr partition, + const char *driver, + const char *name, + bool create, + virCgroupPtr *group) +{ + int rc; + char *dirname = NULL; + + if (virAsprintf(&dirname, "%s.%s.libvirt", + name, driver) < 0) + return -ENOMEM; + + rc = virCgroupNew(dirname, partition, -1, group); + + if (rc == 0) { + /* + * Create a cgroup with memory.use_hierarchy enabled to + * surely account memory usage of lxc with ns subsystem + * enabled. (To be exact, memory and ns subsystems are + * enabled at the same time.) + * + * The reason why doing it here, not a upper group, say + * a group for driver, is to avoid overhead to track + * cumulative usage that we don't need. 
+ */ + rc = virCgroupMakeGroup(partition, *group, create, VIR_CGROUP_MEM_HIERACHY); + if (rc != 0) { + virCgroupRemove(*group); + virCgroupFree(group); + } + } + + VIR_FREE(dirname); + return rc; +} +#else +int virCgroupNewDomainPartition(virCgroupPtr partition ATTRIBUTE_UNUSED, + const char *driver ATTRIBUTE_UNUSED, + const char *name ATTRIBUTE_UNUSED, + bool create ATTRIBUTE_UNUSED, + virCgroupPtr *group ATTRIBUTE_UNUSED) { return -ENXIO; } diff --git a/src/util/vircgroup.h b/src/util/vircgroup.h index 91143e2..33f86a6 100644 --- a/src/util/vircgroup.h +++ b/src/util/vircgroup.h @@ -44,6 +44,12 @@ enum { VIR_ENUM_DECL(virCgroupController); +int virCgroupNewPartition(const char *path, + bool create, + int controllers, + virCgroupPtr *group) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(4); + int virCgroupNewDriver(const char *name, bool privileged, bool create, @@ -54,10 +60,16 @@ int virCgroupNewDriver(const char *name, int virCgroupNewSelf(virCgroupPtr *group) ATTRIBUTE_NONNULL(1); -int virCgroupNewDomain(virCgroupPtr driver, - const char *name, - bool create, - virCgroupPtr *group) +int virCgroupNewDomainDriver(virCgroupPtr driver, + const char *name, + bool create, + virCgroupPtr *group) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(4); +int virCgroupNewDomainPartition(virCgroupPtr partition, + const char *driver, + const char *name, + bool create, + virCgroupPtr *group) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(4); int virCgroupNewVcpu(virCgroupPtr domain, diff --git a/tests/vircgrouptest.c b/tests/vircgrouptest.c index 3f35f2e..a806368 100644 --- a/tests/vircgrouptest.c +++ b/tests/vircgrouptest.c @@ -183,7 +183,7 @@ cleanup: } -static int testCgroupNewForDomain(const void *args ATTRIBUTE_UNUSED) +static int testCgroupNewForDriverDomain(const void *args ATTRIBUTE_UNUSED) { virCgroupPtr drivercgroup = NULL; virCgroupPtr domaincgroup = NULL; @@ -204,7 +204,7 @@ static int testCgroupNewForDomain(const void *args ATTRIBUTE_UNUSED) 
goto cleanup; } - if ((rv = virCgroupNewDomain(drivercgroup, "wibble", true, &domaincgroup)) != 0) { + if ((rv = virCgroupNewDomainDriver(drivercgroup, "wibble", true, &domaincgroup)) != 0) { fprintf(stderr, "Cannot create LXC cgroup: %d\n", -rv); goto cleanup; } @@ -218,6 +218,156 @@ cleanup: } +static int testCgroupNewForPartition(const void *args ATTRIBUTE_UNUSED) +{ + virCgroupPtr cgroup = NULL; + int ret = -1; + int rv; + const char *placementSmall[VIR_CGROUP_CONTROLLER_LAST] = { + [VIR_CGROUP_CONTROLLER_CPU] = "/virtualmachines", + [VIR_CGROUP_CONTROLLER_CPUACCT] = "/virtualmachines", + [VIR_CGROUP_CONTROLLER_CPUSET] = NULL, + [VIR_CGROUP_CONTROLLER_MEMORY] = "/virtualmachines", + [VIR_CGROUP_CONTROLLER_DEVICES] = NULL, + [VIR_CGROUP_CONTROLLER_FREEZER] = NULL, + [VIR_CGROUP_CONTROLLER_BLKIO] = NULL, + }; + const char *placementFull[VIR_CGROUP_CONTROLLER_LAST] = { + [VIR_CGROUP_CONTROLLER_CPU] = "/virtualmachines", + [VIR_CGROUP_CONTROLLER_CPUACCT] = "/virtualmachines", + [VIR_CGROUP_CONTROLLER_CPUSET] = "/virtualmachines", + [VIR_CGROUP_CONTROLLER_MEMORY] = "/virtualmachines", + [VIR_CGROUP_CONTROLLER_DEVICES] = NULL, + [VIR_CGROUP_CONTROLLER_FREEZER] = "/virtualmachines", + [VIR_CGROUP_CONTROLLER_BLKIO] = "/virtualmachines", + }; + + if ((rv = virCgroupNewPartition("/virtualmachines", false, -1, &cgroup)) != -ENOENT) { + fprintf(stderr, "Unexpected found /virtualmachines cgroup: %d\n", -rv); + goto cleanup; + } + + /* Asking for impossible combination since CPU is co-mounted */ + if ((rv = virCgroupNewPartition("/virtualmachines", true, + (1 << VIR_CGROUP_CONTROLLER_CPU), + &cgroup)) != -EINVAL) { + fprintf(stderr, "Should not have created /virtualmachines cgroup: %d\n", -rv); + goto cleanup; + } + + /* Asking for impossible combination since devices is not mounted */ + if ((rv = virCgroupNewPartition("/virtualmachines", true, + (1 << VIR_CGROUP_CONTROLLER_DEVICES), + &cgroup)) != -ENOENT) { + fprintf(stderr, "Should not have created /virtualmachines 
cgroup: %d\n", -rv); + goto cleanup; + } + + /* Asking for small combination since devices is not mounted */ + if ((rv = virCgroupNewPartition("/virtualmachines", true, + (1 << VIR_CGROUP_CONTROLLER_CPU) | + (1 << VIR_CGROUP_CONTROLLER_CPUACCT) | + (1 << VIR_CGROUP_CONTROLLER_MEMORY), + &cgroup)) != 0) { + fprintf(stderr, "Cannot create /virtualmachines cgroup: %d\n", -rv); + goto cleanup; + } + ret = validateCgroup(cgroup, "/virtualmachines", mountsSmall, placementSmall); + virCgroupFree(&cgroup); + + if ((rv = virCgroupNewPartition("/virtualmachines", true, -1, &cgroup)) != 0) { + fprintf(stderr, "Cannot create /virtualmachines cgroup: %d\n", -rv); + goto cleanup; + } + ret = validateCgroup(cgroup, "/virtualmachines", mountsFull, placementFull); + +cleanup: + virCgroupFree(&cgroup); + return ret; +} + + +static int testCgroupNewForPartitionNested(const void *args ATTRIBUTE_UNUSED) +{ + virCgroupPtr cgroup = NULL; + int ret = -1; + int rv; + const char *placementFull[VIR_CGROUP_CONTROLLER_LAST] = { + [VIR_CGROUP_CONTROLLER_CPU] = "/users/berrange", + [VIR_CGROUP_CONTROLLER_CPUACCT] = "/users/berrange", + [VIR_CGROUP_CONTROLLER_CPUSET] = "/users/berrange", + [VIR_CGROUP_CONTROLLER_MEMORY] = "/users/berrange", + [VIR_CGROUP_CONTROLLER_DEVICES] = NULL, + [VIR_CGROUP_CONTROLLER_FREEZER] = "/users/berrange", + [VIR_CGROUP_CONTROLLER_BLKIO] = "/users/berrange", + }; + + if ((rv = virCgroupNewPartition("/users/berrange", false, -1, &cgroup)) != -ENOENT) { + fprintf(stderr, "Unexpected found /users/berrange cgroup: %d\n", -rv); + goto cleanup; + } + + /* Should not work, since we require /users to be pre-created */ + if ((rv = virCgroupNewPartition("/users/berrange", true, -1, &cgroup)) != -ENOENT) { + fprintf(stderr, "Unexpected created /users/berrange cgroup: %d\n", -rv); + goto cleanup; + } + + if ((rv = virCgroupNewPartition("/users", true, -1, &cgroup)) != 0) { + fprintf(stderr, "Failed to create /users cgroup: %d\n", -rv); + goto cleanup; + } + + /* Should now work 
*/ + if ((rv = virCgroupNewPartition("/users/berrange", true, -1, &cgroup)) != 0) { + fprintf(stderr, "Failed to create /users/berrange cgroup: %d\n", -rv); + goto cleanup; + } + + ret = validateCgroup(cgroup, "/users/berrange", mountsFull, placementFull); + +cleanup: + virCgroupFree(&cgroup); + return ret; +} + + + +static int testCgroupNewForPartitionDomain(const void *args ATTRIBUTE_UNUSED) +{ + virCgroupPtr partitioncgroup = NULL; + virCgroupPtr domaincgroup = NULL; + int ret = -1; + int rv; + const char *placement[VIR_CGROUP_CONTROLLER_LAST] = { + [VIR_CGROUP_CONTROLLER_CPU] = "/production/foo.lxc.libvirt", + [VIR_CGROUP_CONTROLLER_CPUACCT] = "/production/foo.lxc.libvirt", + [VIR_CGROUP_CONTROLLER_CPUSET] = "/production/foo.lxc.libvirt", + [VIR_CGROUP_CONTROLLER_MEMORY] = "/production/foo.lxc.libvirt", + [VIR_CGROUP_CONTROLLER_DEVICES] = NULL, + [VIR_CGROUP_CONTROLLER_FREEZER] = "/production/foo.lxc.libvirt", + [VIR_CGROUP_CONTROLLER_BLKIO] = "/production/foo.lxc.libvirt", + }; + + if ((rv = virCgroupNewPartition("/production", true, -1, &partitioncgroup)) != 0) { + fprintf(stderr, "Failed to create /production cgroup: %d\n", -rv); + goto cleanup; + } + + if ((rv = virCgroupNewDomainPartition(partitioncgroup, "lxc", "foo", true, &domaincgroup)) != 0) { + fprintf(stderr, "Cannot create LXC cgroup: %d\n", -rv); + goto cleanup; + } + + ret = validateCgroup(domaincgroup, "/production/foo.lxc.libvirt", mountsFull, placement); + +cleanup: + virCgroupFree(&partitioncgroup); + virCgroupFree(&domaincgroup); + return ret; +} + + #define FAKESYSFSDIRTEMPLATE abs_builddir "/fakesysfsdir-XXXXXX" @@ -246,9 +396,19 @@ mymain(void) if (virtTestRun("New cgroup for driver", 1, testCgroupNewForDriver, NULL) < 0) ret = -1; - if (virtTestRun("New cgroup for domain", 1, testCgroupNewForDomain, NULL) < 0) + if (virtTestRun("New cgroup for domain driver", 1, testCgroupNewForDriverDomain, NULL) < 0) + ret = -1; + + if (virtTestRun("New cgroup for partition", 1, 
testCgroupNewForPartition, NULL) < 0) + ret = -1; + + if (virtTestRun("New cgroup for partition nested", 1, testCgroupNewForPartitionNested, NULL) < 0) ret = -1; + if (virtTestRun("New cgroup for domain partition", 1, testCgroupNewForPartitionDomain, NULL) < 0) + ret = -1; + + if (getenv("LIBVIRT_SKIP_CLEANUP") == NULL) virFileDeleteTree(fakesysfsdir); -- 1.8.1.4
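For readers following the parent lookup in virCgroupNewPartition, here is a minimal standalone sketch of the strrchr()-based truncation the patch performs. The helper name partition_parent_path is hypothetical and not part of libvirt; it only assumes what the patch shows, namely that the path is absolute and that the trailing '/' is kept.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not a libvirt API: returns a newly allocated copy of
 * the parent of an absolute partition path, keeping the trailing '/' the
 * same way the "tmp++; *tmp = '\0';" sequence in the patch does.
 * Returns NULL if the path is not absolute or allocation fails. */
static char *partition_parent_path(const char *path)
{
    char *parent;
    char *slash;

    if (path == NULL || path[0] != '/')
        return NULL;

    parent = malloc(strlen(path) + 1);
    if (parent == NULL)
        return NULL;
    strcpy(parent, path);

    /* An absolute path always contains at least one '/', so strrchr()
     * cannot return NULL at this point. */
    slash = strrchr(parent, '/');
    assert(slash != NULL);
    slash[1] = '\0';    /* drop everything after the last '/' */

    return parent;
}
```

Under these assumptions "/users/berrange" yields "/users/" and "/virtualmachines" yields "/", which matches the nested-partition test requiring /users to exist before /users/berrange.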

On 10.04.2013 12:08, Daniel P. Berrange wrote:
From: "Daniel P. Berrange" <berrange@redhat.com>
A resource partition is an absolute cgroup path, ignoring the current process placement. Expose a virCgroupNewPartition API for constructing such cgroups
Signed-off-by: Daniel P. Berrange <berrange@redhat.com> --- src/libvirt_private.syms | 4 +- src/lxc/lxc_cgroup.c | 2 +- src/qemu/qemu_cgroup.c | 2 +- src/util/vircgroup.c | 146 ++++++++++++++++++++++++++++++++++++++--- src/util/vircgroup.h | 20 ++++-- tests/vircgrouptest.c | 166 ++++++++++++++++++++++++++++++++++++++++++++++- 6 files changed, 321 insertions(+), 19 deletions(-)
diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c index bcc61a8..40e0fe6 100644 --- a/src/util/vircgroup.c +++ b/src/util/vircgroup.c @@ -1055,6 +1055,76 @@ cleanup: return rc; }
+ +#if defined HAVE_MNTENT_H && defined HAVE_GETMNTENT_R +/** + * virCgroupNewPartition: + * @path: path for the partition + * @create: true to create the cgroup tree + * @controllers: mask of controllers to create + * + * Creates a new cgroup to represent the resource + * partition path identified by @name. + * + * Returns 0 on success, -errno on failure + */ +int virCgroupNewPartition(const char *path, + bool create, + int controllers, + virCgroupPtr *group) +{ + int rc; + char *parentPath = NULL; + virCgroupPtr parent = NULL; + VIR_DEBUG("path=%s create=%d controllers=%x", + path, create, controllers); + + if (path[0] != '/') + return -EINVAL; + + rc = virCgroupNew(path, NULL, controllers, group); + if (rc != 0) + goto cleanup; + + if (STRNEQ(path, "/")) { + char *tmp; + if (!(parentPath = strdup(path))) + return -ENOMEM;
You've just leaked @group.
+ + tmp = strrchr(parentPath, '/'); + tmp++; + *tmp = '\0'; + + rc = virCgroupNew(parentPath, NULL, controllers, &parent); + if (rc != 0) + goto cleanup;
And here as well.
+ + rc = virCgroupMakeGroup(parent, *group, create, VIR_CGROUP_NONE); + if (rc != 0) { + virCgroupRemove(*group); + virCgroupFree(group); + goto cleanup; + } + } + +cleanup: + virCgroupFree(&parent); + VIR_FREE(parentPath); + return rc; +}
ACK if those leaks are fixed. Michal

On Thu, Apr 11, 2013 at 12:02:30PM +0200, Michal Privoznik wrote:
On 10.04.2013 12:08, Daniel P. Berrange wrote:
From: "Daniel P. Berrange" <berrange@redhat.com>
A resource partition is an absolute cgroup path, ignoring the current process placement. Expose a virCgroupNewPartition API for constructing such cgroups
Signed-off-by: Daniel P. Berrange <berrange@redhat.com> --- src/libvirt_private.syms | 4 +- src/lxc/lxc_cgroup.c | 2 +- src/qemu/qemu_cgroup.c | 2 +- src/util/vircgroup.c | 146 ++++++++++++++++++++++++++++++++++++++--- src/util/vircgroup.h | 20 ++++-- tests/vircgrouptest.c | 166 ++++++++++++++++++++++++++++++++++++++++++++++- 6 files changed, 321 insertions(+), 19 deletions(-)
diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c index bcc61a8..40e0fe6 100644 --- a/src/util/vircgroup.c +++ b/src/util/vircgroup.c @@ -1055,6 +1055,76 @@ cleanup: return rc; }
+ +#if defined HAVE_MNTENT_H && defined HAVE_GETMNTENT_R +/** + * virCgroupNewPartition: + * @path: path for the partition + * @create: true to create the cgroup tree + * @controllers: mask of controllers to create + * + * Creates a new cgroup to represent the resource + * partition path identified by @name. + * + * Returns 0 on success, -errno on failure + */ +int virCgroupNewPartition(const char *path, + bool create, + int controllers, + virCgroupPtr *group) +{ + int rc; + char *parentPath = NULL; + virCgroupPtr parent = NULL; + VIR_DEBUG("path=%s create=%d controllers=%x", + path, create, controllers); + + if (path[0] != '/') + return -EINVAL; + + rc = virCgroupNew(path, NULL, controllers, group); + if (rc != 0) + goto cleanup; + + if (STRNEQ(path, "/")) { + char *tmp; + if (!(parentPath = strdup(path))) + return -ENOMEM;
You've just leaked @group.
+ + tmp = strrchr(parentPath, '/'); + tmp++; + *tmp = '\0'; + + rc = virCgroupNew(parentPath, NULL, controllers, &parent); + if (rc != 0) + goto cleanup;
And here as well.
+ + rc = virCgroupMakeGroup(parent, *group, create, VIR_CGROUP_NONE); + if (rc != 0) { + virCgroupRemove(*group); + virCgroupFree(group); + goto cleanup; + } + } + +cleanup: + virCgroupFree(&parent); + VIR_FREE(parentPath); + return rc; +}
ACK if those leaks are fixed.
Adding in

@@ -1089,8 +1089,10 @@ int virCgroupNewPartition(const char *path,
 
     if (STRNEQ(path, "/")) {
         char *tmp;
-        if (!(parentPath = strdup(path)))
-            return -ENOMEM;
+        if (!(parentPath = strdup(path))) {
+            rc = -ENOMEM;
+            goto cleanup;
+        }
 
         tmp = strrchr(parentPath, '/');
         tmp++;
@@ -1103,12 +1105,13 @@ int virCgroupNewPartition(const char *path,
         rc = virCgroupMakeGroup(parent, *group, create, VIR_CGROUP_NONE);
         if (rc != 0) {
             virCgroupRemove(*group);
-            virCgroupFree(group);
             goto cleanup;
         }
     }
 
 cleanup:
+    if (rc != 0)
+        virCgroupFree(group);
     virCgroupFree(&parent);
     VIR_FREE(parentPath);
     return rc;

Daniel
-- 
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
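The shape of that fix (every failure after the first allocation goes through one cleanup label, which frees the output object only when rc is non-zero) can be sketched in isolation. Everything below is a stand-in written for illustration: struct demo_group, demo_group_new and the fail_step parameter are assumptions, not libvirt code.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for virCgroupPtr; purely illustrative. */
struct demo_group {
    char *path;
};

/* Frees *group and resets it to NULL; safe to call on NULL. */
static void demo_group_free(struct demo_group **group)
{
    if (group == NULL || *group == NULL)
        return;
    free((*group)->path);
    free(*group);
    *group = NULL;
}

/* Allocates a group for path; returns 0 or -ENOMEM. */
static int demo_group_new(const char *path, struct demo_group **group)
{
    struct demo_group *g = calloc(1, sizeof(*g));

    if (g == NULL)
        return -ENOMEM;
    g->path = malloc(strlen(path) + 1);
    if (g->path == NULL) {
        free(g);
        return -ENOMEM;
    }
    strcpy(g->path, path);
    *group = g;
    return 0;
}

/* fail_step != 0 simulates an error after *group has been allocated. */
static int demo_partition_new(const char *path, int fail_step,
                              struct demo_group **group)
{
    int rc;
    struct demo_group *parent = NULL;

    *group = NULL;

    if (path[0] != '/')
        return -EINVAL;     /* nothing allocated yet, direct return is safe */

    rc = demo_group_new(path, group);
    if (rc != 0)
        goto cleanup;

    if (fail_step) {
        rc = -ENOENT;       /* late failure: must not return directly */
        goto cleanup;
    }

    rc = demo_group_new("/", &parent);

cleanup:
    if (rc != 0)
        demo_group_free(group);     /* caller never sees a half-built group */
    demo_group_free(&parent);
    return rc;
}
```

With this layout a caller can rely on *group being NULL whenever the return value is non-zero, which is the property the review asked for.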

From: "Daniel P. Berrange" <berrange@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> --- docs/formatdomain.html.in | 26 ++++++++++ docs/schemas/domaincommon.rng | 12 +++++ src/conf/domain_conf.c | 78 ++++++++++++++++++++++++++++ src/conf/domain_conf.h | 7 +++ tests/domainschemadata/domain-lxc-simple.xml | 3 ++ 5 files changed, 126 insertions(+) diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index d400e35..5551187 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -716,6 +716,32 @@ </dl> + <h3><a name="resPartition">Resource partitioning</a></h3> + + <p> + Hypervisors may allow for virtual machines to be placed into + resource partitions, potentially with nesting of said partitions. + The <code>resource</code> element groups together configuration + related to resource partitioning. It currently supports a child + element <code>partition</code> whose content defines the path + of the resource partition in which to place the domain. If no + partition is listed, then the domain will be placed in a default + partition. + </p> +<pre> + ... + <resource> + <partition>/virtualmachines/production</partition> + </resource> + ... +</pre> + + <p> + Resource partitions are currently supported by the QEMU and + LXC drivers, which map partition paths onto cgroups directories, + in all mounted controllers. 
<span class="since">Since 1.0.5</span> + </p> + <h3><a name="elementsCPU">CPU model and topology</a></h3> <p> diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index 2c31f76..77d020d 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -537,6 +537,10 @@ <optional> <ref name="numatune"/> </optional> + + <optional> + <ref name="respartition"/> + </optional> </interleave> </define> @@ -680,6 +684,14 @@ </element> </define> + <define name="respartition"> + <element name="resource"> + <element name="partition"> + <ref name="absFilePath"/> + </element> + </element> + </define> + <define name="clock"> <optional> <element name="clock"> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index e00a532..ae1dfd3 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -1797,6 +1797,18 @@ virDomainVcpuPinDefArrayFree(virDomainVcpuPinDefPtr *def, VIR_FREE(def); } + +void +virDomainResourceDefFree(virDomainResourceDefPtr resource) +{ + if (!resource) + return; + + VIR_FREE(resource->partition); + VIR_FREE(resource); +} + + void virDomainDefFree(virDomainDefPtr def) { unsigned int i; @@ -1804,6 +1816,8 @@ void virDomainDefFree(virDomainDefPtr def) if (!def) return; + virDomainResourceDefFree(def->resource); + /* hostdevs must be freed before nets (or any future "intelligent * hostdevs") because the pointer to the hostdev is really * pointing into the middle of the higher level device's object, @@ -9685,6 +9699,37 @@ cleanup: } +static virDomainResourceDefPtr +virDomainResourceDefParse(xmlNodePtr node, + xmlXPathContextPtr ctxt) +{ + virDomainResourceDefPtr def = NULL; + xmlNodePtr tmp = ctxt->node; + + ctxt->node = node; + + if (VIR_ALLOC(def) < 0) { + virReportOOMError(); + goto error; + } + + /* Find out what type of virtualization to use */ + if (!(def->partition = virXPathString("string(./partition)", ctxt))) { + virReportError(VIR_ERR_INTERNAL_ERROR, + "%s", _("missing resource partition attribute")); +
goto error; + } + + ctxt->node = tmp; + return def; + +error: + ctxt->node = tmp; + virDomainResourceDefFree(def); + return NULL; +} + + static virDomainDefPtr virDomainDefParseXML(xmlDocPtr xml, xmlNodePtr root, @@ -10255,6 +10300,25 @@ virDomainDefParseXML(xmlDocPtr xml, } VIR_FREE(nodes); + /* Extract numatune if exists. */ + if ((n = virXPathNodeSet("./resource", ctxt, &nodes)) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + "%s", _("cannot extract resource nodes")); + goto error; + } + + if (n > 1) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("only one resource element is supported")); + VIR_FREE(nodes); + goto error; + } + + if (n && + !(def->resource = virDomainResourceDefParse(nodes[0], ctxt))) + goto error; + VIR_FREE(nodes); + if ((n = virXPathNodeSet("./features/*", ctxt, &nodes)) < 0) goto error; @@ -14870,6 +14934,17 @@ virDomainIsAllVcpupinInherited(virDomainDefPtr def) } } + +static void +virDomainResourceDefFormat(virBufferPtr buf, + virDomainResourceDefPtr def) +{ + virBufferAddLit(buf, " <resource>\n"); + virBufferEscapeString(buf, " <partition>%s</partition>\n", def->partition); + virBufferAddLit(buf, " </resource>\n"); +} + + #define DUMPXML_FLAGS \ (VIR_DOMAIN_XML_SECURE | \ VIR_DOMAIN_XML_INACTIVE | \ @@ -15138,6 +15213,9 @@ virDomainDefFormatInternal(virDomainDefPtr def, virBufferAddLit(buf, " </numatune>\n"); } + if (def->resource) + virDomainResourceDefFormat(buf, def->resource); + if (def->sysinfo) virDomainSysinfoDefFormat(buf, def->sysinfo); diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 08b8e48..e396e85 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -1751,6 +1751,11 @@ struct _virDomainRNGDef { void virBlkioDeviceWeightArrayClear(virBlkioDeviceWeightPtr deviceWeights, int ndevices); +typedef struct _virDomainResourceDef virDomainResourceDef; +typedef virDomainResourceDef *virDomainResourceDefPtr; +struct _virDomainResourceDef { + char *partition; +}; /* * Guest VM main configuration @@ 
-1802,6 +1807,7 @@ struct _virDomainDef { } cputune; virNumaTuneDef numatune; + virDomainResourceDefPtr resource; /* These 3 are based on virDomainLifeCycleAction enum flags */ int onReboot; @@ -2018,6 +2024,7 @@ virDomainObjPtr virDomainObjListFindByName(const virDomainObjListPtr doms, bool virDomainObjTaint(virDomainObjPtr obj, enum virDomainTaintFlags taint); +void virDomainResourceDefFree(virDomainResourceDefPtr resource); void virDomainGraphicsDefFree(virDomainGraphicsDefPtr def); void virDomainInputDefFree(virDomainInputDefPtr def); void virDomainDiskDefFree(virDomainDiskDefPtr def); diff --git a/tests/domainschemadata/domain-lxc-simple.xml b/tests/domainschemadata/domain-lxc-simple.xml index e61434f..56a0117 100644 --- a/tests/domainschemadata/domain-lxc-simple.xml +++ b/tests/domainschemadata/domain-lxc-simple.xml @@ -5,6 +5,9 @@ <type>exe</type> <init>/sh</init> </os> + <resource> + <partition>/virtualmachines</partition> + </resource> <memory unit='KiB'>500000</memory> <devices> <filesystem type='mount'> -- 1.8.1.4
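As an aside on virDomainResourceDefFormat: the partition string is written through an escaping call, so a path containing XML metacharacters still yields well-formed output. A rough standalone sketch of that idea follows. The helpers append_raw, append_escaped and format_resource are hypothetical, use a fixed caller-supplied buffer instead of virBuffer, and the indentation of the emitted element is an assumption since the whitespace in the patch above was collapsed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Appends src to the NUL-terminated string in dst; -1 if it does not fit. */
static int append_raw(char *dst, size_t dstlen, const char *src)
{
    size_t used;
    size_t len;

    assert(dst != NULL && src != NULL);
    used = strlen(dst);
    len = strlen(src);
    if (used + len + 1 > dstlen)
        return -1;
    memcpy(dst + used, src, len + 1);
    return 0;
}

/* Appends src with the five predefined XML entities escaped. */
static int append_escaped(char *dst, size_t dstlen, const char *src)
{
    for (; *src != '\0'; src++) {
        char single[2];
        const char *rep;

        single[0] = *src;
        single[1] = '\0';

        switch (*src) {
        case '&':  rep = "&amp;";  break;
        case '<':  rep = "&lt;";   break;
        case '>':  rep = "&gt;";   break;
        case '"':  rep = "&quot;"; break;
        case '\'': rep = "&apos;"; break;
        default:   rep = single;   break;
        }
        if (append_raw(dst, dstlen, rep) < 0)
            return -1;
    }
    return 0;
}

/* Writes a <resource> element for partition into buf; 0 on success. */
static int format_resource(char *buf, size_t buflen, const char *partition)
{
    if (buflen == 0)
        return -1;
    buf[0] = '\0';
    if (append_raw(buf, buflen, "  <resource>\n    <partition>") < 0 ||
        append_escaped(buf, buflen, partition) < 0 ||
        append_raw(buf, buflen, "</partition>\n  </resource>\n") < 0)
        return -1;
    return 0;
}
```

This is only meant to show why the format side goes through an escaping helper rather than a plain printf; the real code relies on libvirt's buffer API for growth and error tracking.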

On 10.04.2013 12:08, Daniel P. Berrange wrote:
From: "Daniel P. Berrange" <berrange@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com> --- docs/formatdomain.html.in | 26 ++++++++++ docs/schemas/domaincommon.rng | 12 +++++ src/conf/domain_conf.c | 78 ++++++++++++++++++++++++++++ src/conf/domain_conf.h | 7 +++ tests/domainschemadata/domain-lxc-simple.xml | 3 ++ 5 files changed, 126 insertions(+)
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index d400e35..5551187 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -716,6 +716,32 @@ </dl>
+ <h3><a name="resPartition">Resource partitioning</a></h3> + + <p> + Hypervisors may allow for virtual machines to be placed into + resource partitions, potentially with nesting of said partitions. + The <code>resource</code> element groups together configuration + related to resource partitioning. It currently supports a child + element <code>partition</code> whose content defines the path + of the resource partition in which to place the domain. If no + partition is listed, then the domain will be placed in a default + partition. + </p>
We should mention here the fact you are stating in the next patch: the partition path has to exist, and it's the admin's responsibility to pre-create it.
+<pre> + ... + <resource> + <partition>/virtualmachines/production</partition> + </resource> + ... +</pre> + + <p> + Resource partitions are currently supported by the QEMU and + LXC drivers, which map partition paths onto cgroups directories, + in all mounted controllers. <span class="since">Since 1.0.5</span> + </p> + <h3><a name="elementsCPU">CPU model and topology</a></h3>
<p>
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index e00a532..ae1dfd3 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -10255,6 +10300,25 @@ virDomainDefParseXML(xmlDocPtr xml, } VIR_FREE(nodes);
+ /* Extract numatune if exists. */ + if ((n = virXPathNodeSet("./resource", ctxt, &nodes)) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + "%s", _("cannot extract resource nodes")); + goto error; + } + + if (n > 1) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("only one resource element is supported")); + VIR_FREE(nodes); + goto error; + } + + if (n && + !(def->resource = virDomainResourceDefParse(nodes[0], ctxt))) + goto error; + VIR_FREE(nodes); +
Even though there is no real leak here, it seems a bit odd to VIR_FREE(nodes) in the 2nd 'if' statement, but not in this one. For consistency we should drop the fly-away VIR_FREE().
if ((n = virXPathNodeSet("./features/*", ctxt, &nodes)) < 0) goto error;
@@ -14870,6 +14934,17 @@ virDomainIsAllVcpupinInherited(virDomainDefPtr def) } }
ACK Michal

From: "Daniel P. Berrange" <berrange@redhat.com>

Historically QEMU/LXC guests have been placed in a cgroup layout that is

   $LOCATION-OF-LIBVIRTD/libvirt/{qemu,lxc}/$VMNAME

This is bad for a number of reasons

 - The cgroup hierarchy gets very deep, which seriously impacts kernel performance due to cgroups scalability limitations.
 - It is hard to set up cgroup policies which apply across services and virtual machines, since all VMs are underneath the libvirtd service.

To address this the default cgroup location is changed to be

   /system/$VMNAME.{lxc,qemu}.libvirt

This puts virtual machines at the same level in the hierarchy as system services, allowing consistent policy to be set up across all of them.

This also honours the new resource partition location from the XML configuration, for example

   <resource>
     <partition>/virtualmachines/production</partition>
   </resource>

will result in the VM being placed at

   /virtualmachines/production/$VMNAME.{lxc,qemu}.libvirt

NB, with the exception of the default, /system, path which is intended to always exist, libvirt will not attempt to auto-create the partitions in the XML. It is the responsibility of the admin/app to configure the partitions. Later libvirt APIs will provide a way to do this.

Signed-off-by: Daniel P.
Berrange <berrange@redhat.com> --- src/lxc/lxc_cgroup.c | 91 +++++++++++++++++++++++++++++++------- src/lxc/lxc_cgroup.h | 2 +- src/lxc/lxc_process.c | 4 +- src/qemu/qemu_cgroup.c | 114 +++++++++++++++++++++++++++++++++++++----------- src/qemu/qemu_cgroup.h | 3 +- src/qemu/qemu_process.c | 2 +- 6 files changed, 169 insertions(+), 47 deletions(-) diff --git a/src/lxc/lxc_cgroup.c b/src/lxc/lxc_cgroup.c index 72940bd..8f19057 100644 --- a/src/lxc/lxc_cgroup.c +++ b/src/lxc/lxc_cgroup.c @@ -523,29 +523,88 @@ cleanup: } -virCgroupPtr virLXCCgroupCreate(virDomainDefPtr def) +virCgroupPtr virLXCCgroupCreate(virDomainDefPtr def, bool startup) { - virCgroupPtr driver = NULL; - virCgroupPtr cgroup = NULL; int rc; + virCgroupPtr parent = NULL; + virCgroupPtr cgroup = NULL; - rc = virCgroupNewDriver("lxc", true, false, -1, &driver); - if (rc != 0) { - virReportSystemError(-rc, "%s", - _("Unable to get cgroup for driver")); - goto cleanup; + if (!def->resource && startup) { + virDomainResourceDefPtr res; + + if (VIR_ALLOC(res) < 0) { + virReportOOMError(); + goto cleanup; + } + + if (!(res->partition = strdup("/system"))) { + virReportOOMError(); + VIR_FREE(res); + goto cleanup; + } + + def->resource = res; } - rc = virCgroupNewDomainDriver(driver, def->name, true, &cgroup); - if (rc != 0) { - virReportSystemError(-rc, - _("Unable to create cgroup for domain %s"), - def->name); - goto cleanup; + if (def->resource && + def->resource->partition) { + if (def->resource->partition[0] != '/') { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Resource partition '%s' must start with '/'"), + def->resource->partition); + goto cleanup; + } + /* We only auto-create the default partition. 
In other + * cases we expec the sysadmin/app to have done so */ + rc = virCgroupNewPartition(def->resource->partition, + STREQ(def->resource->partition, "/system"), + -1, + &parent); + if (rc != 0) { + virReportSystemError(-rc, + _("Unable to initialize %s cgroup"), + def->resource->partition); + goto cleanup; + } + + rc = virCgroupNewDomainPartition(parent, + "lxc", + def->name, + true, + &cgroup); + if (rc != 0) { + virReportSystemError(-rc, + _("Unable to create cgroup for %s"), + def->name); + goto cleanup; + } + } else { + rc = virCgroupNewDriver("lxc", + true, + true, + -1, + &parent); + if (rc != 0) { + virReportSystemError(-rc, + _("Unable to create cgroup for %s"), + def->name); + goto cleanup; + } + + rc = virCgroupNewDomainDriver(parent, + def->name, + true, + &cgroup); + if (rc != 0) { + virReportSystemError(-rc, + _("Unable to create cgroup for %s"), + def->name); + goto cleanup; + } } cleanup: - virCgroupFree(&driver); + virCgroupFree(&parent); return cgroup; } @@ -556,7 +615,7 @@ virCgroupPtr virLXCCgroupJoin(virDomainDefPtr def) int ret = -1; int rc; - if (!(cgroup = virLXCCgroupCreate(def))) + if (!(cgroup = virLXCCgroupCreate(def, true))) return NULL; rc = virCgroupAddTask(cgroup, getpid()); diff --git a/src/lxc/lxc_cgroup.h b/src/lxc/lxc_cgroup.h index 25a427c..f040de2 100644 --- a/src/lxc/lxc_cgroup.h +++ b/src/lxc/lxc_cgroup.h @@ -27,7 +27,7 @@ # include "lxc_fuse.h" # include "virusb.h" -virCgroupPtr virLXCCgroupCreate(virDomainDefPtr def); +virCgroupPtr virLXCCgroupCreate(virDomainDefPtr def, bool startup); virCgroupPtr virLXCCgroupJoin(virDomainDefPtr def); int virLXCCgroupSetup(virDomainDefPtr def, virCgroupPtr cgroup, diff --git a/src/lxc/lxc_process.c b/src/lxc/lxc_process.c index 1bbffa3..ab07a1e 100644 --- a/src/lxc/lxc_process.c +++ b/src/lxc/lxc_process.c @@ -1049,7 +1049,7 @@ int virLXCProcessStart(virConnectPtr conn, virCgroupFree(&priv->cgroup); - if (!(priv->cgroup = virLXCCgroupCreate(vm->def))) + if (!(priv->cgroup = 
virLXCCgroupCreate(vm->def, true))) return -1; if (!virCgroupHasController(priv->cgroup, @@ -1464,7 +1464,7 @@ virLXCProcessReconnectDomain(virDomainObjPtr vm, if (!(priv->monitor = virLXCProcessConnectMonitor(driver, vm))) goto error; - if (!(priv->cgroup = virLXCCgroupCreate(vm->def))) + if (!(priv->cgroup = virLXCCgroupCreate(vm->def, false))) goto error; if (virLXCUpdateActiveUsbHostdevs(driver, vm->def) < 0) diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c index cb0faa1..db9aafe 100644 --- a/src/qemu/qemu_cgroup.c +++ b/src/qemu/qemu_cgroup.c @@ -188,46 +188,108 @@ int qemuSetupHostUsbDeviceCgroup(virUSBDevicePtr dev ATTRIBUTE_UNUSED, int qemuInitCgroup(virQEMUDriverPtr driver, - virDomainObjPtr vm) + virDomainObjPtr vm, + bool startup) { - int rc; + int rc = -1; qemuDomainObjPrivatePtr priv = vm->privateData; - virCgroupPtr driverGroup = NULL; + virCgroupPtr parent = NULL; virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver); virCgroupFree(&priv->cgroup); - rc = virCgroupNewDriver("qemu", - cfg->privileged, - true, - cfg->cgroupControllers, - &driverGroup); - if (rc != 0) { - if (rc == -ENXIO || - rc == -EPERM || - rc == -EACCES) { /* No cgroups mounts == success */ - VIR_DEBUG("No cgroups present/configured/accessible, ignoring error"); - goto done; + if (!vm->def->resource && startup) { + virDomainResourceDefPtr res; + + if (VIR_ALLOC(res) < 0) { + virReportOOMError(); + goto cleanup; } - virReportSystemError(-rc, - _("Unable to create cgroup for %s"), - vm->def->name); - goto cleanup; + if (!(res->partition = strdup("/system"))) { + virReportOOMError(); + VIR_FREE(res); + goto cleanup; + } + + vm->def->resource = res; } - rc = virCgroupNewDomainDriver(driverGroup, vm->def->name, true, &priv->cgroup); - if (rc != 0) { - virReportSystemError(-rc, - _("Unable to create cgroup for %s"), - vm->def->name); - goto cleanup; + if (vm->def->resource && + vm->def->resource->partition) { + if (vm->def->resource->partition[0] != '/') { + 
virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Resource partition '%s' must start with '/'"), + vm->def->resource->partition); + goto cleanup; + } + /* We only auto-create the default partition. In other + * cases we expec the sysadmin/app to have done so */ + rc = virCgroupNewPartition(vm->def->resource->partition, + STREQ(vm->def->resource->partition, "/system"), + cfg->cgroupControllers, + &parent); + if (rc != 0) { + if (rc == -ENXIO || + rc == -EPERM || + rc == -EACCES) { /* No cgroups mounts == success */ + VIR_DEBUG("No cgroups present/configured/accessible, ignoring error"); + goto done; + } + + virReportSystemError(-rc, + _("Unable to initialize %s cgroup"), + vm->def->resource->partition); + goto cleanup; + } + + rc = virCgroupNewDomainPartition(parent, + "qemu", + vm->def->name, + true, + &priv->cgroup); + if (rc != 0) { + virReportSystemError(-rc, + _("Unable to create cgroup for %s"), + vm->def->name); + goto cleanup; + } + } else { + rc = virCgroupNewDriver("qemu", + cfg->privileged, + true, + cfg->cgroupControllers, + &parent); + if (rc != 0) { + if (rc == -ENXIO || + rc == -EPERM || + rc == -EACCES) { /* No cgroups mounts == success */ + VIR_DEBUG("No cgroups present/configured/accessible, ignoring error"); + goto done; + } + + virReportSystemError(-rc, + _("Unable to create cgroup for %s"), + vm->def->name); + goto cleanup; + } + + rc = virCgroupNewDomainDriver(parent, + vm->def->name, + true, + &priv->cgroup); + if (rc != 0) { + virReportSystemError(-rc, + _("Unable to create cgroup for %s"), + vm->def->name); + goto cleanup; + } } done: rc = 0; cleanup: - virCgroupFree(&driverGroup); + virCgroupFree(&parent); virObjectUnref(cfg); return rc; } @@ -246,7 +308,7 @@ int qemuSetupCgroup(virQEMUDriverPtr driver, (const char *const *)cfg->cgroupDeviceACL : defaultDeviceACL; - if (qemuInitCgroup(driver, vm) < 0) + if (qemuInitCgroup(driver, vm, true) < 0) return -1; if (!priv->cgroup) diff --git a/src/qemu/qemu_cgroup.h b/src/qemu/qemu_cgroup.h index 
6cbfebc..e63f443 100644 --- a/src/qemu/qemu_cgroup.h +++ b/src/qemu/qemu_cgroup.h @@ -37,7 +37,8 @@ int qemuSetupHostUsbDeviceCgroup(virUSBDevicePtr dev, const char *path, void *opaque); int qemuInitCgroup(virQEMUDriverPtr driver, - virDomainObjPtr vm); + virDomainObjPtr vm, + bool startup); int qemuSetupCgroup(virQEMUDriverPtr driver, virDomainObjPtr vm, virBitmapPtr nodemask); diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index da47b43..ce9f501 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -3005,7 +3005,7 @@ qemuProcessReconnect(void *opaque) if (qemuUpdateActiveUsbHostdevs(driver, obj->def) < 0) goto error; - if (qemuInitCgroup(driver, obj) < 0) + if (qemuInitCgroup(driver, obj, false) < 0) goto error; /* XXX: Need to change as long as lock is introduced for -- 1.8.1.4

On 10.04.2013 12:08, Daniel P. Berrange wrote:
From: "Daniel P. Berrange" <berrange@redhat.com>
Historically QEMU/LXC guests have been placed in a cgroup layout that is
$LOCATION-OF-LIBVIRTD/libvirt/{qemu,lxc}/$VMNAME
This is bad for a number of reasons:
- The cgroup hierarchy gets very deep, which seriously impacts kernel performance due to cgroups scalability limitations.
- It is hard to set up cgroup policies which apply across services and virtual machines, since all VMs are underneath the libvirtd service.
To address this the default cgroup location is changed to be
/system/$VMNAME.{lxc,qemu}.libvirt
This puts virtual machines at the same level in the hierarchy as system services, allowing consistent policy to be set up across all of them.
This also honours the new resource partition location from the XML configuration, for example
<resource>
  <partition>/virtualmachines/production</partition>
</resource>
will result in the VM being placed at
/virtualmachines/production/$VMNAME.{lxc,qemu}.libvirt
NB, with the exception of the default /system path, which is intended to always exist, libvirt will not attempt to auto-create the partitions in the XML. It is the responsibility of the admin/app to configure the partitions. Later libvirt APIs will provide a way to do this.
This NB part should be duplicated in the docs from the previous patch.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com> --- src/lxc/lxc_cgroup.c | 91 +++++++++++++++++++++++++++++++------- src/lxc/lxc_cgroup.h | 2 +- src/lxc/lxc_process.c | 4 +- src/qemu/qemu_cgroup.c | 114 +++++++++++++++++++++++++++++++++++++----------- src/qemu/qemu_cgroup.h | 3 +- src/qemu/qemu_process.c | 2 +- 6 files changed, 169 insertions(+), 47 deletions(-)
ACK Michal
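
The naming rule described in the commit message above can be sketched in a few lines of C. This is only an illustration: format_domain_cgroup() and partition_is_autocreated() are hypothetical names, not functions from the patch. It assumes, as the patch does, that a partition must start with '/' and that only the default "/system" partition is auto-created.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not libvirt API): build
 * "<partition>/<name>.<driver>.libvirt" into buf.
 * Returns 0 on success or a negative errno value. */
int format_domain_cgroup(const char *partition, const char *name,
                         const char *driver, char *buf, size_t buflen)
{
    int n;

    /* The patch rejects partitions that do not start with '/' */
    if (!partition || partition[0] != '/')
        return -EINVAL;

    n = snprintf(buf, buflen, "%s/%s.%s.libvirt", partition, name, driver);
    if (n < 0 || (size_t)n >= buflen)
        return -ENAMETOOLONG;

    return 0;
}

/* Hypothetical helper: only the default partition is created on demand;
 * any other partition is expected to exist already. */
int partition_is_autocreated(const char *partition)
{
    return strcmp(partition, "/system") == 0;
}
```

With partition "/virtualmachines/production", name "vm1" and driver "qemu", the helper yields /virtualmachines/production/vm1.qemu.libvirt, and partition_is_autocreated() reports that the admin/app has to create that partition beforehand.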

From: "Daniel P. Berrange" <berrange@redhat.com> The virCgroupNewDriver method had a 'bool privileged' param. If a false value was ever passed in, it would simply not work, since non-root users don't have any privileges to create new cgroups. Just delete this broken code entirely and make the QEMU driver skip cgroup setup in non-privileged mode Signed-off-by: Daniel P. Berrange <berrange@redhat.com> --- src/lxc/lxc_cgroup.c | 1 - src/qemu/qemu_cgroup.c | 4 +++- src/util/vircgroup.c | 27 +++------------------------ src/util/vircgroup.h | 1 - tests/vircgrouptest.c | 12 ++++++------ 5 files changed, 12 insertions(+), 33 deletions(-) diff --git a/src/lxc/lxc_cgroup.c b/src/lxc/lxc_cgroup.c index 8f19057..0a43b61 100644 --- a/src/lxc/lxc_cgroup.c +++ b/src/lxc/lxc_cgroup.c @@ -581,7 +581,6 @@ virCgroupPtr virLXCCgroupCreate(virDomainDefPtr def, bool startup) } else { rc = virCgroupNewDriver("lxc", true, - true, -1, &parent); if (rc != 0) { diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c index db9aafe..a6c8638 100644 --- a/src/qemu/qemu_cgroup.c +++ b/src/qemu/qemu_cgroup.c @@ -196,6 +196,9 @@ int qemuInitCgroup(virQEMUDriverPtr driver, virCgroupPtr parent = NULL; virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver); + if (!cfg->privileged) + goto done; + virCgroupFree(&priv->cgroup); if (!vm->def->resource && startup) { @@ -256,7 +259,6 @@ int qemuInitCgroup(virQEMUDriverPtr driver, } } else { rc = virCgroupNewDriver("qemu", - cfg->privileged, true, cfg->cgroupControllers, &parent); diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c index 40e0fe6..6202614 100644 --- a/src/util/vircgroup.c +++ b/src/util/vircgroup.c @@ -794,8 +794,7 @@ err: return rc; } -static int virCgroupAppRoot(bool privileged, - virCgroupPtr *group, +static int virCgroupAppRoot(virCgroupPtr *group, bool create, int controllers) { @@ -807,26 +806,7 @@ static int virCgroupAppRoot(bool privileged, if (rc != 0) return rc; - if (privileged) { - rc = virCgroupNew("libvirt", 
selfgrp, controllers, group); - } else { - char *rootname; - char *username; - username = virGetUserName(getuid()); - if (!username) { - rc = -ENOMEM; - goto cleanup; - } - rc = virAsprintf(&rootname, "libvirt-%s", username); - VIR_FREE(username); - if (rc < 0) { - rc = -ENOMEM; - goto cleanup; - } - - rc = virCgroupNew(rootname, selfgrp, controllers, group); - VIR_FREE(rootname); - } + rc = virCgroupNew("libvirt", selfgrp, controllers, group); if (rc != 0) goto cleanup; @@ -1135,7 +1115,6 @@ int virCgroupNewPartition(const char *path ATTRIBUTE_UNUSED, */ #if defined HAVE_MNTENT_H && defined HAVE_GETMNTENT_R int virCgroupNewDriver(const char *name, - bool privileged, bool create, int controllers, virCgroupPtr *group) @@ -1143,7 +1122,7 @@ int virCgroupNewDriver(const char *name, int rc; virCgroupPtr rootgrp = NULL; - rc = virCgroupAppRoot(privileged, &rootgrp, + rc = virCgroupAppRoot(&rootgrp, create, controllers); if (rc != 0) goto out; diff --git a/src/util/vircgroup.h b/src/util/vircgroup.h index 33f86a6..936e09b 100644 --- a/src/util/vircgroup.h +++ b/src/util/vircgroup.h @@ -51,7 +51,6 @@ int virCgroupNewPartition(const char *path, ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(4); int virCgroupNewDriver(const char *name, - bool privileged, bool create, int controllers, virCgroupPtr *group) diff --git a/tests/vircgrouptest.c b/tests/vircgrouptest.c index a806368..4f76a06 100644 --- a/tests/vircgrouptest.c +++ b/tests/vircgrouptest.c @@ -138,13 +138,13 @@ static int testCgroupNewForDriver(const void *args ATTRIBUTE_UNUSED) [VIR_CGROUP_CONTROLLER_BLKIO] = "/libvirt/lxc", }; - if ((rv = virCgroupNewDriver("lxc", true, false, -1, &cgroup)) != -ENOENT) { + if ((rv = virCgroupNewDriver("lxc", false, -1, &cgroup)) != -ENOENT) { fprintf(stderr, "Unexpected found LXC cgroup: %d\n", -rv); goto cleanup; } /* Asking for impossible combination since CPU is co-mounted */ - if ((rv = virCgroupNewDriver("lxc", true, true, + if ((rv = virCgroupNewDriver("lxc", true, (1 << 
VIR_CGROUP_CONTROLLER_CPU), &cgroup)) != -EINVAL) { fprintf(stderr, "Should not have created LXC cgroup: %d\n", -rv); @@ -152,7 +152,7 @@ static int testCgroupNewForDriver(const void *args ATTRIBUTE_UNUSED) } /* Asking for impossible combination since devices is not mounted */ - if ((rv = virCgroupNewDriver("lxc", true, true, + if ((rv = virCgroupNewDriver("lxc", true, (1 << VIR_CGROUP_CONTROLLER_DEVICES), &cgroup)) != -ENOENT) { fprintf(stderr, "Should not have created LXC cgroup: %d\n", -rv); @@ -160,7 +160,7 @@ static int testCgroupNewForDriver(const void *args ATTRIBUTE_UNUSED) } /* Asking for small combination since devices is not mounted */ - if ((rv = virCgroupNewDriver("lxc", true, true, + if ((rv = virCgroupNewDriver("lxc", true, (1 << VIR_CGROUP_CONTROLLER_CPU) | (1 << VIR_CGROUP_CONTROLLER_CPUACCT) | (1 << VIR_CGROUP_CONTROLLER_MEMORY), @@ -171,7 +171,7 @@ static int testCgroupNewForDriver(const void *args ATTRIBUTE_UNUSED) ret = validateCgroup(cgroup, "libvirt/lxc", mountsSmall, placementSmall); virCgroupFree(&cgroup); - if ((rv = virCgroupNewDriver("lxc", true, true, -1, &cgroup)) != 0) { + if ((rv = virCgroupNewDriver("lxc", true, -1, &cgroup)) != 0) { fprintf(stderr, "Cannot create LXC cgroup: %d\n", -rv); goto cleanup; } @@ -199,7 +199,7 @@ static int testCgroupNewForDriverDomain(const void *args ATTRIBUTE_UNUSED) [VIR_CGROUP_CONTROLLER_BLKIO] = "/libvirt/lxc/wibble", }; - if ((rv = virCgroupNewDriver("lxc", true, false, -1, &drivercgroup)) != 0) { + if ((rv = virCgroupNewDriver("lxc", false, -1, &drivercgroup)) != 0) { fprintf(stderr, "Cannot find LXC cgroup: %d\n", -rv); goto cleanup; } -- 1.8.1.4

On 10.04.2013 12:08, Daniel P. Berrange wrote:
From: "Daniel P. Berrange" <berrange@redhat.com>
The virCgroupNewDriver method had a 'bool privileged' param. If a false value was ever passed in, it would simply not work, since non-root users don't have any privileges to create new cgroups. Just delete this broken code entirely and make the QEMU driver skip cgroup setup in non-privileged mode.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com> --- src/lxc/lxc_cgroup.c | 1 - src/qemu/qemu_cgroup.c | 4 +++- src/util/vircgroup.c | 27 +++------------------------ src/util/vircgroup.h | 1 - tests/vircgrouptest.c | 12 ++++++------ 5 files changed, 12 insertions(+), 33 deletions(-)
ACK Michal

From: "Daniel P. Berrange" <berrange@redhat.com> If a cgroup controller is co-mounted with another, eg /sys/fs/cgroup/cpu,cpuacct Then it is a requirement that there exist symlinks at /sys/fs/cgroup/cpu /sys/fs/cgroup/cpuacct pointing to the real mount point. Add support to virCgroupPtr to detect and track these symlinks Signed-off-by: Daniel P. Berrange <berrange@redhat.com> --- src/util/vircgroup.c | 56 ++++++++++++++++++++++++++++++++++++++++++---- src/util/vircgrouppriv.h | 5 +++++ tests/vircgroupmock.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++ tests/vircgrouptest.c | 36 +++++++++++++++++++++++------- 4 files changed, 143 insertions(+), 12 deletions(-) diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c index 6202614..14af16e 100644 --- a/src/util/vircgroup.c +++ b/src/util/vircgroup.c @@ -75,6 +75,7 @@ void virCgroupFree(virCgroupPtr *group) for (i = 0 ; i < VIR_CGROUP_CONTROLLER_LAST ; i++) { VIR_FREE((*group)->controllers[i].mountPoint); + VIR_FREE((*group)->controllers[i].linkPoint); VIR_FREE((*group)->controllers[i].placement); } @@ -114,6 +115,14 @@ static int virCgroupCopyMounts(virCgroupPtr group, if (!group->controllers[i].mountPoint) return -ENOMEM; + + if (parent->controllers[i].linkPoint) { + group->controllers[i].linkPoint = + strdup(parent->controllers[i].linkPoint); + + if (!group->controllers[i].linkPoint) + return -ENOMEM; + } } return 0; } @@ -157,9 +166,46 @@ static int virCgroupDetectMounts(virCgroupPtr group) * first entry only */ if (typelen == len && STREQLEN(typestr, tmp, len) && - !group->controllers[i].mountPoint && - !(group->controllers[i].mountPoint = strdup(entry.mnt_dir))) - goto no_memory; + !group->controllers[i].mountPoint) { + char *linksrc; + struct stat sb; + char *tmp2; + + if (!(group->controllers[i].mountPoint = strdup(entry.mnt_dir))) + goto no_memory; + + tmp2 = strrchr(entry.mnt_dir, '/'); + if (!tmp2) { + errno = EINVAL; + goto error; + } + *tmp2 = '\0'; + /* If it is a co-mount it has a filename like 
"cpu,cpuacct" + * and we must identify the symlink path */ + if (strchr(tmp2 + 1, ',')) { + if (virAsprintf(&linksrc, "%s/%s", + entry.mnt_dir, typestr) < 0) + goto no_memory; + *tmp2 = '/'; + + if (lstat(linksrc, &sb) < 0) { + if (errno == ENOENT) { + VIR_WARN("Controller %s co-mounted at %s is missing symlink at %s", + typestr, entry.mnt_dir, linksrc); + VIR_FREE(linksrc); + } else { + goto error; + } + } else { + if (!S_ISLNK(sb.st_mode)) { + VIR_WARN("Expecting a symlink at %s for controller %s", + linksrc, typestr); + } else { + group->controllers[i].linkPoint = linksrc; + } + } + } + } tmp = next; } } @@ -170,8 +216,10 @@ static int virCgroupDetectMounts(virCgroupPtr group) return 0; no_memory: + errno = ENOENT; +error: VIR_FORCE_FCLOSE(mounts); - return -ENOMEM; + return -errno; } diff --git a/src/util/vircgrouppriv.h b/src/util/vircgrouppriv.h index cc8cc0b..582be79 100644 --- a/src/util/vircgrouppriv.h +++ b/src/util/vircgrouppriv.h @@ -34,6 +34,11 @@ struct virCgroupController { int type; char *mountPoint; + /* If mountPoint holds several controllers co-mounted, + * then linkPoint is path of the symlink to the mountPoint + * for just the one controller + */ + char *linkPoint; char *placement; }; diff --git a/tests/vircgroupmock.c b/tests/vircgroupmock.c index e50f7e0..32f074b 100644 --- a/tests/vircgroupmock.c +++ b/tests/vircgroupmock.c @@ -32,6 +32,8 @@ static int (*realopen)(const char *path, int flags, ...); static FILE *(*realfopen)(const char *path, const char *mode); static int (*realaccess)(const char *path, int mode); +static int (*reallstat)(const char *path, struct stat *sb); +static int (*real__lxstat)(int ver, const char *path, struct stat *sb); static int (*realmkdir)(const char *path, mode_t mode); static char *fakesysfsdir; @@ -314,8 +316,18 @@ static void init_syms(void) } \ } while (0) +#define LOAD_SYM_ALT(name1, name2) \ + do { \ + if (!(real ## name1 = dlsym(RTLD_NEXT, #name1)) && \ + !(real ## name2 = dlsym(RTLD_NEXT, #name2))) { \ + 
fprintf(stderr, "Cannot find real '%s' or '%s' symbol\n", #name1, #name2); \ + abort(); \ + } \ + } while (0) + LOAD_SYM(fopen); LOAD_SYM(access); + LOAD_SYM_ALT(lstat, __lxstat); LOAD_SYM(mkdir); LOAD_SYM(open); } @@ -399,6 +411,52 @@ int access(const char *path, int mode) return ret; } +int __lxstat(int ver, const char *path, struct stat *sb) +{ + int ret; + + init_syms(); + + if (STRPREFIX(path, SYSFS_PREFIX)) { + init_sysfs(); + char *newpath; + if (asprintf(&newpath, "%s/%s", + fakesysfsdir, + path + strlen(SYSFS_PREFIX)) < 0) { + errno = ENOMEM; + return -1; + } + ret = real__lxstat(ver, newpath, sb); + free(newpath); + } else { + ret = real__lxstat(ver, path, sb); + } + return ret; +} + +int lstat(const char *path, struct stat *sb) +{ + int ret; + + init_syms(); + + if (STRPREFIX(path, SYSFS_PREFIX)) { + init_sysfs(); + char *newpath; + if (asprintf(&newpath, "%s/%s", + fakesysfsdir, + path + strlen(SYSFS_PREFIX)) < 0) { + errno = ENOMEM; + return -1; + } + ret = reallstat(newpath, sb); + free(newpath); + } else { + ret = reallstat(path, sb); + } + return ret; +} + int mkdir(const char *path, mode_t mode) { int ret; diff --git a/tests/vircgrouptest.c b/tests/vircgrouptest.c index 4f76a06..4b8ca61 100644 --- a/tests/vircgrouptest.c +++ b/tests/vircgrouptest.c @@ -36,6 +36,7 @@ static int validateCgroup(virCgroupPtr cgroup, const char *expectPath, const char **expectMountPoint, + const char **expectLinkPoint, const char **expectPlacement) { int i; @@ -55,6 +56,14 @@ static int validateCgroup(virCgroupPtr cgroup, virCgroupControllerTypeToString(i)); return -1; } + if (STRNEQ_NULLABLE(expectLinkPoint[i], + cgroup->controllers[i].linkPoint)) { + fprintf(stderr, "Wrong link '%s', expected '%s' for '%s'\n", + cgroup->controllers[i].linkPoint, + expectLinkPoint[i], + virCgroupControllerTypeToString(i)); + return -1; + } if (STRNEQ_NULLABLE(expectPlacement[i], cgroup->controllers[i].placement)) { fprintf(stderr, "Wrong placement '%s', expected '%s' for '%s'\n", @@ 
-87,6 +96,17 @@ const char *mountsFull[VIR_CGROUP_CONTROLLER_LAST] = { [VIR_CGROUP_CONTROLLER_BLKIO] = "/not/really/sys/fs/cgroup/blkio", }; +const char *links[VIR_CGROUP_CONTROLLER_LAST] = { + [VIR_CGROUP_CONTROLLER_CPU] = "/not/really/sys/fs/cgroup/cpu", + [VIR_CGROUP_CONTROLLER_CPUACCT] = "/not/really/sys/fs/cgroup/cpuacct", + [VIR_CGROUP_CONTROLLER_CPUSET] = NULL, + [VIR_CGROUP_CONTROLLER_MEMORY] = NULL, + [VIR_CGROUP_CONTROLLER_DEVICES] = NULL, + [VIR_CGROUP_CONTROLLER_FREEZER] = NULL, + [VIR_CGROUP_CONTROLLER_BLKIO] = NULL, +}; + + static int testCgroupNewForSelf(const void *args ATTRIBUTE_UNUSED) { virCgroupPtr cgroup = NULL; @@ -106,7 +126,7 @@ static int testCgroupNewForSelf(const void *args ATTRIBUTE_UNUSED) goto cleanup; } - ret = validateCgroup(cgroup, "", mountsFull, placement); + ret = validateCgroup(cgroup, "", mountsFull, links, placement); cleanup: virCgroupFree(&cgroup); @@ -168,14 +188,14 @@ static int testCgroupNewForDriver(const void *args ATTRIBUTE_UNUSED) fprintf(stderr, "Cannot create LXC cgroup: %d\n", -rv); goto cleanup; } - ret = validateCgroup(cgroup, "libvirt/lxc", mountsSmall, placementSmall); + ret = validateCgroup(cgroup, "libvirt/lxc", mountsSmall, links, placementSmall); virCgroupFree(&cgroup); if ((rv = virCgroupNewDriver("lxc", true, -1, &cgroup)) != 0) { fprintf(stderr, "Cannot create LXC cgroup: %d\n", -rv); goto cleanup; } - ret = validateCgroup(cgroup, "libvirt/lxc", mountsFull, placementFull); + ret = validateCgroup(cgroup, "libvirt/lxc", mountsFull, links, placementFull); cleanup: virCgroupFree(&cgroup); @@ -209,7 +229,7 @@ static int testCgroupNewForDriverDomain(const void *args ATTRIBUTE_UNUSED) goto cleanup; } - ret = validateCgroup(domaincgroup, "libvirt/lxc/wibble", mountsFull, placement); + ret = validateCgroup(domaincgroup, "libvirt/lxc/wibble", mountsFull, links, placement); cleanup: virCgroupFree(&drivercgroup); @@ -272,14 +292,14 @@ static int testCgroupNewForPartition(const void *args ATTRIBUTE_UNUSED) 
fprintf(stderr, "Cannot create /virtualmachines cgroup: %d\n", -rv); goto cleanup; } - ret = validateCgroup(cgroup, "/virtualmachines", mountsSmall, placementSmall); + ret = validateCgroup(cgroup, "/virtualmachines", mountsSmall, links, placementSmall); virCgroupFree(&cgroup); if ((rv = virCgroupNewPartition("/virtualmachines", true, -1, &cgroup)) != 0) { fprintf(stderr, "Cannot create /virtualmachines cgroup: %d\n", -rv); goto cleanup; } - ret = validateCgroup(cgroup, "/virtualmachines", mountsFull, placementFull); + ret = validateCgroup(cgroup, "/virtualmachines", mountsFull, links, placementFull); cleanup: virCgroupFree(&cgroup); @@ -324,7 +344,7 @@ static int testCgroupNewForPartitionNested(const void *args ATTRIBUTE_UNUSED) goto cleanup; } - ret = validateCgroup(cgroup, "/users/berrange", mountsFull, placementFull); + ret = validateCgroup(cgroup, "/users/berrange", mountsFull, links, placementFull); cleanup: virCgroupFree(&cgroup); @@ -359,7 +379,7 @@ static int testCgroupNewForPartitionDomain(const void *args ATTRIBUTE_UNUSED) goto cleanup; } - ret = validateCgroup(domaincgroup, "/production/foo.lxc.libvirt", mountsFull, placement); + ret = validateCgroup(domaincgroup, "/production/foo.lxc.libvirt", mountsFull, links, placement); cleanup: virCgroupFree(&partitioncgroup); -- 1.8.1.4

On 10.04.2013 12:08, Daniel P. Berrange wrote:
From: "Daniel P. Berrange" <berrange@redhat.com>
If a cgroup controller is co-mounted with another, eg
/sys/fs/cgroup/cpu,cpuacct
Then it is a requirement that there exist symlinks at
/sys/fs/cgroup/cpu
/sys/fs/cgroup/cpuacct
pointing to the real mount point. Add support to virCgroupPtr to detect and track these symlinks
Signed-off-by: Daniel P. Berrange <berrange@redhat.com> --- src/util/vircgroup.c | 56 ++++++++++++++++++++++++++++++++++++++++++---- src/util/vircgrouppriv.h | 5 +++++ tests/vircgroupmock.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++ tests/vircgrouptest.c | 36 +++++++++++++++++++++++------- 4 files changed, 143 insertions(+), 12 deletions(-)
diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c index 6202614..14af16e 100644 --- a/src/util/vircgroup.c +++ b/src/util/vircgroup.c
@@ -157,9 +166,46 @@ static int virCgroupDetectMounts(virCgroupPtr group)
              * first entry only */
             if (typelen == len && STREQLEN(typestr, tmp, len) &&
-                !group->controllers[i].mountPoint &&
-                !(group->controllers[i].mountPoint = strdup(entry.mnt_dir)))
-                goto no_memory;
+                !group->controllers[i].mountPoint) {
+                char *linksrc;
+                struct stat sb;
+                char *tmp2;
+
+                if (!(group->controllers[i].mountPoint = strdup(entry.mnt_dir)))
+                    goto no_memory;
+
+                tmp2 = strrchr(entry.mnt_dir, '/');
+                if (!tmp2) {
+                    errno = EINVAL;
+                    goto error;
+                }
+                *tmp2 = '\0';
+                /* If it is a co-mount it has a filename like "cpu,cpuacct"
+                 * and we must identify the symlink path */
+                if (strchr(tmp2 + 1, ',')) {
+                    if (virAsprintf(&linksrc, "%s/%s",
+                                    entry.mnt_dir, typestr) < 0)
+                        goto no_memory;
+                    *tmp2 = '/';
+
+                    if (lstat(linksrc, &sb) < 0) {
+                        if (errno == ENOENT) {
+                            VIR_WARN("Controller %s co-mounted at %s is missing symlink at %s",
+                                     typestr, entry.mnt_dir, linksrc);
+                            VIR_FREE(linksrc);
+                        } else {
+                            goto error;
+                        }
+                    } else {
+                        if (!S_ISLNK(sb.st_mode)) {
+                            VIR_WARN("Expecting a symlink at %s for controller %s",
+                                     linksrc, typestr);
+                        } else {
+                            group->controllers[i].linkPoint = linksrc;
+                        }
+                    }
+                }
+            }
             tmp = next;
         }
     }
@@ -170,8 +216,10 @@ static int virCgroupDetectMounts(virCgroupPtr group)
     return 0;

 no_memory:
+    errno = ENOENT;
Any reason for not returning ENOMEM here? I don't see any.
+error:
     VIR_FORCE_FCLOSE(mounts);
-    return -ENOMEM;
+    return -errno;
 }
ACK if errno fixed. Michal

From: "Daniel P. Berrange" <berrange@redhat.com> Add a virCgroupIsolateMount method which looks at where the current process is place in the cgroups (eg /system/demo.lxc.libvirt) and then remounts the cgroups such that this sub-directory becomes the root directory from the current process' POV. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> --- configure.ac | 2 +- include/libvirt/virterror.h | 1 + src/libvirt_private.syms | 1 + src/util/vircgroup.c | 127 ++++++++++++++++++++++++++++++++++++++++++++ src/util/vircgroup.h | 4 ++ src/util/virerror.c | 1 + 6 files changed, 135 insertions(+), 1 deletion(-) diff --git a/configure.ac b/configure.ac index 11b332f..529d979 100644 --- a/configure.ac +++ b/configure.ac @@ -208,7 +208,7 @@ dnl Availability of various common headers (non-fatal if missing). AC_CHECK_HEADERS([pwd.h paths.h regex.h sys/un.h \ sys/poll.h syslog.h mntent.h net/ethernet.h linux/magic.h \ sys/un.h sys/syscall.h netinet/tcp.h ifaddrs.h libtasn1.h \ - sys/ucred.h]) + sys/ucred.h sys/mount.h]) dnl Check whether endian provides handy macros. 
AC_CHECK_DECLS([htole64], [], [], [[#include <endian.h>]]) diff --git a/include/libvirt/virterror.h b/include/libvirt/virterror.h index 4cd9256..3864a31 100644 --- a/include/libvirt/virterror.h +++ b/include/libvirt/virterror.h @@ -116,6 +116,7 @@ typedef enum { VIR_FROM_LOCKSPACE = 51, /* Error from lockspace */ VIR_FROM_INITCTL = 52, /* Error from initctl device communication */ VIR_FROM_IDENTITY = 53, /* Error from identity code */ + VIR_FROM_CGROUP = 54, /* Error from cgroups */ # ifdef VIR_ENUM_SENTINELS VIR_ERR_DOMAIN_LAST diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 52c3bcb..8014ea1 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1113,6 +1113,7 @@ virCgroupGetMemoryUsage; virCgroupGetMemSwapHardLimit; virCgroupGetMemSwapUsage; virCgroupHasController; +virCgroupIsolateMount; virCgroupKill; virCgroupKillPainfully; virCgroupKillRecursive; diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c index 14af16e..e8abc70 100644 --- a/src/util/vircgroup.c +++ b/src/util/vircgroup.c @@ -27,6 +27,9 @@ #if defined HAVE_MNTENT_H && defined HAVE_GETMNTENT_R # include <mntent.h> #endif +#if defined HAVE_SYS_MOUNT_H +# include <sys/mount.h> +#endif #include <fcntl.h> #include <string.h> #include <errno.h> @@ -42,6 +45,7 @@ #include "virutil.h" #include "viralloc.h" +#include "virerror.h" #include "virlog.h" #include "virfile.h" #include "virhash.h" @@ -49,6 +53,8 @@ #define CGROUP_MAX_VAL 512 +#define VIR_FROM_THIS VIR_FROM_CGROUP + VIR_ENUM_IMPL(virCgroupController, VIR_CGROUP_CONTROLLER_LAST, "cpu", "cpuacct", "cpuset", "memory", "devices", "freezer", "blkio"); @@ -2382,3 +2388,124 @@ int virCgroupKillPainfully(virCgroupPtr group ATTRIBUTE_UNUSED) return -ENOSYS; } #endif /* HAVE_KILL, HAVE_MNTENT_H, HAVE_GETMNTENT_R */ + +static char *virCgroupIdentifyRoot(virCgroupPtr group) +{ + char *ret = NULL; + size_t i; + + for (i = 0 ; i < VIR_CGROUP_CONTROLLER_LAST ; i++) { + char *tmp; + if (!group->controllers[i].mountPoint) 
+ continue; + if (!(tmp = strrchr(group->controllers[i].mountPoint, '/'))) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Could not find directory separator in %s"), + group->controllers[i].mountPoint); + return NULL; + } + + tmp[0] = '\0'; + ret = strdup(group->controllers[i].mountPoint); + tmp[0] = '/'; + if (!ret) { + virReportOOMError(); + return NULL; + } + return ret; + } + + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Could not find any mounted controllers")); + return NULL; +} + + +int virCgroupIsolateMount(virCgroupPtr group, const char *oldroot, + const char *mountopts) +{ + int ret = -1; + size_t i; + char *opts = NULL; + char *root = NULL; + + if (!(root = virCgroupIdentifyRoot(group))) + return -1; + + VIR_DEBUG("Mounting cgroups at '%s'", root); + + if (virFileMakePath(root) < 0) { + virReportSystemError(errno, + _("Unable to create directory %s"), + root); + goto cleanup; + } + + if (virAsprintf(&opts, + "mode=755,size=65536%s", mountopts) < 0) { + virReportOOMError(); + goto cleanup; + } + + if (mount("tmpfs", root, "tmpfs", MS_NOSUID|MS_NODEV|MS_NOEXEC, opts) < 0) { + virReportSystemError(errno, + _("Failed to mount %s on %s type %s"), + "tmpfs", root, "tmpfs"); + goto cleanup; + } + + for (i = 0 ; i < VIR_CGROUP_CONTROLLER_LAST ; i++) { + if (!group->controllers[i].mountPoint) + continue; + + if (!virFileExists(group->controllers[i].mountPoint)) { + char *src; + if (virAsprintf(&src, "%s%s%s", + oldroot, + group->controllers[i].mountPoint, + group->controllers[i].placement) < 0) { + virReportOOMError(); + goto cleanup; + } + + VIR_DEBUG("Create mount point '%s'", group->controllers[i].mountPoint); + if (virFileMakePath(group->controllers[i].mountPoint) < 0) { + virReportSystemError(errno, + _("Unable to create directory %s"), + group->controllers[i].mountPoint); + VIR_FREE(src); + goto cleanup; + } + + if (mount(src, group->controllers[i].mountPoint, NULL, MS_BIND, NULL) < 0) { + virReportSystemError(errno, + _("Failed to bind cgroup '%s' 
on '%s'"), + src, group->controllers[i].mountPoint); + VIR_FREE(src); + goto cleanup; + } + + VIR_FREE(src); + } + + if (group->controllers[i].linkPoint) { + VIR_DEBUG("Link mount point '%s' to '%s'", + group->controllers[i].mountPoint, + group->controllers[i].linkPoint); + if (symlink(group->controllers[i].mountPoint, + group->controllers[i].linkPoint) < 0) { + virReportSystemError(errno, + _("Unable to symlink directory %s to %s"), + group->controllers[i].mountPoint, + group->controllers[i].linkPoint); + return -1; + } + } + } + ret = 0; + +cleanup: + VIR_FREE(root); + VIR_FREE(opts); + return ret; +} diff --git a/src/util/vircgroup.h b/src/util/vircgroup.h index 936e09b..61e6f91 100644 --- a/src/util/vircgroup.h +++ b/src/util/vircgroup.h @@ -183,4 +183,8 @@ int virCgroupKill(virCgroupPtr group, int signum); int virCgroupKillRecursive(virCgroupPtr group, int signum); int virCgroupKillPainfully(virCgroupPtr group); +int virCgroupIsolateMount(virCgroupPtr group, + const char *oldroot, + const char *mountopts); + #endif /* __VIR_CGROUP_H__ */ diff --git a/src/util/virerror.c b/src/util/virerror.c index c30642a..8a329a9 100644 --- a/src/util/virerror.c +++ b/src/util/virerror.c @@ -119,6 +119,7 @@ VIR_ENUM_IMPL(virErrorDomain, VIR_ERR_DOMAIN_LAST, "Lock Space", "Init control", "Identity", + "Cgroup", ) -- 1.8.1.4

On 10.04.2013 12:08, Daniel P. Berrange wrote:
From: "Daniel P. Berrange" <berrange@redhat.com>
Add a virCgroupIsolateMount method which looks at where the current process is placed in the cgroups (eg /system/demo.lxc.libvirt) and then remounts the cgroups such that this sub-directory becomes the root directory from the current process' POV.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com> --- configure.ac | 2 +- include/libvirt/virterror.h | 1 + src/libvirt_private.syms | 1 + src/util/vircgroup.c | 127 ++++++++++++++++++++++++++++++++++++++++++++ src/util/vircgroup.h | 4 ++ src/util/virerror.c | 1 + 6 files changed, 135 insertions(+), 1 deletion(-)
ACK Michal
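
The two path computations the new method relies on can be illustrated without touching any mounts. The helper names below are hypothetical. The sketch only mirrors the string handling in the patch: the tmpfs root is the controller mount point minus its last component, and the bind-mount source is oldroot, mount point and placement concatenated.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: the directory that holds all controller mounts,
 * i.e. the mount point with its last path component removed.
 * Returns a malloc'd string, or NULL with errno set. */
char *cgroup_root_from_mount(const char *mount_point)
{
    const char *slash = strrchr(mount_point, '/');
    size_t len;
    char *root;

    if (!slash) {
        errno = EINVAL;
        return NULL;
    }

    len = (size_t)(slash - mount_point);
    if (!(root = malloc(len + 1))) {
        errno = ENOMEM;
        return NULL;
    }
    memcpy(root, mount_point, len);
    root[len] = '\0';
    return root;
}

/* Hypothetical helper: source path for the bind mount, seen from inside the
 * new root where the old root is still reachable under oldroot.
 * Returns a malloc'd string, or NULL with errno set. */
char *cgroup_bind_source(const char *oldroot, const char *mount_point,
                         const char *placement)
{
    size_t len = strlen(oldroot) + strlen(mount_point) + strlen(placement) + 1;
    char *src = malloc(len);

    if (!src) {
        errno = ENOMEM;
        return NULL;
    }
    snprintf(src, len, "%s%s%s", oldroot, mount_point, placement);
    return src;
}
```

For a container placed at /system/demo.lxc.libvirt with the memory controller at /sys/fs/cgroup/memory and an oldroot of "/.oldroot/", the bind source becomes /.oldroot//sys/fs/cgroup/memory/system/demo.lxc.libvirt. The doubled slash comes from the plain concatenation and is harmless in a POSIX path.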

On 04/10/2013 06:08 PM, Daniel P. Berrange wrote:
From: "Daniel P. Berrange" <berrange@redhat.com>
Add a virCgroupIsolateMount method which looks at where the current process is placed in the cgroups (eg /system/demo.lxc.libvirt) and then remounts the cgroups such that this sub-directory becomes the root directory from the current process' POV.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com> ---
I noticed that this patch doesn't mount the subsystems that libvirt doesn't use, such as net_cls. Is this what we expected? I would prefer to mount these subsystems too.

On Fri, Apr 12, 2013 at 10:53:24AM +0800, Gao feng wrote:
On 04/10/2013 06:08 PM, Daniel P. Berrange wrote:
From: "Daniel P. Berrange" <berrange@redhat.com>
Add a virCgroupIsolateMount method which looks at where the current process is placed in the cgroups (eg /system/demo.lxc.libvirt) and then remounts the cgroups such that this sub-directory becomes the root directory from the current process' POV.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com> ---
I noticed that this patch doesn't mount the subsystems that libvirt doesn't use, such as net_cls. Is this what we expected?
I would prefer to mount these subsystems too.
Hmm, yes, we should be mounting all of them.

Daniel

From: "Daniel P. Berrange" <berrange@redhat.com>

The LXC driver currently has code to detect cgroups mounts
and then re-mount them inside the new root filesystem.
Replace this fragile code with a call to virCgroupIsolateMount.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
---
 src/lxc/lxc_container.c | 237 ++----------------------------------------------
 1 file changed, 8 insertions(+), 229 deletions(-)

diff --git a/src/lxc/lxc_container.c b/src/lxc/lxc_container.c
index ab27a92..b0367e4 100644
--- a/src/lxc/lxc_container.c
+++ b/src/lxc/lxc_container.c
@@ -1735,227 +1735,6 @@ static int lxcContainerSetupAllHostdevs(virDomainDefPtr vmDef,
 }
 
 
-struct lxcContainerCGroup {
-    const char *dir;
-    const char *linkDest;
-};
-
-
-static void lxcContainerCGroupFree(struct lxcContainerCGroup *mounts,
-                                   size_t nmounts)
-{
-    size_t i;
-
-    if (!mounts)
-        return;
-
-    for (i = 0 ; i < nmounts ; i++) {
-        VIR_FREE(mounts[i].dir);
-        VIR_FREE(mounts[i].linkDest);
-    }
-    VIR_FREE(mounts);
-}
-
-
-static int lxcContainerIdentifyCGroups(struct lxcContainerCGroup **mountsret,
-                                       size_t *nmountsret,
-                                       char **root)
-{
-    FILE *procmnt = NULL;
-    struct mntent mntent;
-    struct dirent *dent;
-    char mntbuf[1024];
-    int ret = -1;
-    struct lxcContainerCGroup *mounts = NULL;
-    size_t nmounts = 0;
-    DIR *dh = NULL;
-    char *path = NULL;
-
-    *mountsret = NULL;
-    *nmountsret = 0;
-    *root = NULL;
-
-    VIR_DEBUG("Finding cgroups mount points of type cgroup");
-
-    if (!(procmnt = setmntent("/proc/mounts", "r"))) {
-        virReportSystemError(errno, "%s",
-                             _("Failed to read /proc/mounts"));
-        return -1;
-    }
-
-    while (getmntent_r(procmnt, &mntent, mntbuf, sizeof(mntbuf)) != NULL) {
-        VIR_DEBUG("Got %s", mntent.mnt_dir);
-        if (STRNEQ(mntent.mnt_type, "cgroup"))
-            continue;
-
-        if (!*root) {
-            char *tmp;
-            if (!(*root = strdup(mntent.mnt_dir))) {
-                virReportOOMError();
-                goto cleanup;
-            }
-            tmp = strrchr(*root, '/');
-            *tmp = '\0';
-        } else if (!STRPREFIX(mntent.mnt_dir, *root)) {
-            virReportError(VIR_ERR_INTERNAL_ERROR,
-                           _("Cgroup %s is not mounted under %s"),
-                           mntent.mnt_dir, *root);
-            goto cleanup;
-        }
-
-        /* Skip named mounts with no controller since they're
-         * for application use only ie systemd */
-        if (strstr(mntent.mnt_opts, "name="))
-            continue;
-
-        if (VIR_EXPAND_N(mounts, nmounts, 1) < 0) {
-            virReportOOMError();
-            goto cleanup;
-        }
-        if (!(mounts[nmounts-1].dir = strdup(mntent.mnt_dir))) {
-            virReportOOMError();
-            goto cleanup;
-        }
-        VIR_DEBUG("Grabbed '%s'", mntent.mnt_dir);
-    }
-
-    if (!*root) {
-        VIR_DEBUG("No mounted cgroups found");
-        ret = 0;
-        goto cleanup;
-    }
-
-    VIR_DEBUG("Checking for symlinks in %s", *root);
-    if (!(dh = opendir(*root))) {
-        virReportSystemError(errno,
-                             _("Unable to read directory %s"),
-                             *root);
-        goto cleanup;
-    }
-
-    while ((dent = readdir(dh)) != NULL) {
-        ssize_t rv;
-        /* The cgroups links are just relative to the local
-         * dir so we don't need a large buf */
-        char linkbuf[100];
-
-        if (dent->d_name[0] == '.')
-            continue;
-
-        VIR_DEBUG("Checking entry %s", dent->d_name);
-        if (virAsprintf(&path, "%s/%s", *root, dent->d_name) < 0) {
-            virReportOOMError();
-            goto cleanup;
-        }
-
-        if ((rv = readlink(path, linkbuf, sizeof(linkbuf)-1)) < 0) {
-            if (errno != EINVAL) {
-                virReportSystemError(errno,
-                                     _("Unable to resolve link %s"),
-                                     path);
-                VIR_FREE(path);
-                goto cleanup;
-            }
-            /* Ok not a link */
-            VIR_FREE(path);
-        } else {
-            linkbuf[rv] = '\0';
-            VIR_DEBUG("Got a link %s to %s", path, linkbuf);
-            if (VIR_EXPAND_N(mounts, nmounts, 1) < 0) {
-                virReportOOMError();
-                goto cleanup;
-            }
-            if (!(mounts[nmounts-1].linkDest = strdup(linkbuf))) {
-                virReportOOMError();
-                goto cleanup;
-            }
-            mounts[nmounts-1].dir = path;
-            path = NULL;
-        }
-    }
-
-    *mountsret = mounts;
-    *nmountsret = nmounts;
-    ret = 0;
-
-cleanup:
-    closedir(dh);
-    endmntent(procmnt);
-    VIR_FREE(path);
-
-    if (ret < 0) {
-        lxcContainerCGroupFree(mounts, nmounts);
-        VIR_FREE(*root);
-    }
-    return ret;
-}
-
-
-static int lxcContainerMountCGroups(struct lxcContainerCGroup *mounts,
-                                    size_t nmounts,
-                                    const char *root,
-                                    char *sec_mount_options)
-{
-    size_t i;
-    char *opts = NULL;
-
-    VIR_DEBUG("Mounting cgroups at '%s'", root);
-
-    if (virFileMakePath(root) < 0) {
-        virReportSystemError(errno,
-                             _("Unable to create directory %s"),
-                             root);
-        return -1;
-    }
-
-    if (virAsprintf(&opts,
-                    "mode=755,size=65536%s", sec_mount_options) < 0) {
-        virReportOOMError();
-        return -1;
-    }
-
-    if (mount("tmpfs", root, "tmpfs", MS_NOSUID|MS_NODEV|MS_NOEXEC, opts) < 0) {
-        VIR_FREE(opts);
-        virReportSystemError(errno,
-                             _("Failed to mount %s on %s type %s"),
-                             "tmpfs", root, "tmpfs");
-        return -1;
-    }
-    VIR_FREE(opts);
-
-    for (i = 0 ; i < nmounts ; i++) {
-        if (mounts[i].linkDest) {
-            VIR_DEBUG("Link mount point '%s' to '%s'",
-                      mounts[i].dir, mounts[i].linkDest);
-            if (symlink(mounts[i].linkDest, mounts[i].dir) < 0) {
-                virReportSystemError(errno,
-                                     _("Unable to symlink directory %s to %s"),
-                                     mounts[i].dir, mounts[i].linkDest);
-                return -1;
-            }
-        } else {
-            VIR_DEBUG("Create mount point '%s'", mounts[i].dir);
-            if (virFileMakePath(mounts[i].dir) < 0) {
-                virReportSystemError(errno,
-                                     _("Unable to create directory %s"),
-                                     mounts[i].dir);
-                return -1;
-            }
-
-            if (mount("cgroup", mounts[i].dir, "cgroup",
-                      0, mounts[i].dir + strlen(root) + 1) < 0) {
-                virReportSystemError(errno,
-                                     _("Failed to mount cgroup on '%s'"),
-                                     mounts[i].dir);
-                return -1;
-            }
-        }
-    }
-
-    return 0;
-}
-
-
 /* Got a FS mapped to /, we're going the pivot_root
  * approach to do a better-chroot-than-chroot
  * this is based on this thread http://lkml.org/lkml/2008/3/5/29
@@ -1966,10 +1745,9 @@ static int lxcContainerSetupPivotRoot(virDomainDefPtr vmDef,
                                       size_t nttyPaths,
                                       virSecurityManagerPtr securityDriver)
 {
-    struct lxcContainerCGroup *mounts = NULL;
-    size_t nmounts = 0;
+    virCgroupPtr cgroup = NULL;
+    int rc;
     int ret = -1;
-    char *cgroupRoot = NULL;
     char *sec_mount_options;
 
     if (!(sec_mount_options = virSecurityManagerGetMountOptions(securityDriver, vmDef)))
@@ -1977,8 +1755,11 @@ static int lxcContainerSetupPivotRoot(virDomainDefPtr vmDef,
 
     /* Before pivoting we need to identify any
      * cgroups controllers that are mounted */
-    if (lxcContainerIdentifyCGroups(&mounts, &nmounts, &cgroupRoot) < 0)
+    if ((rc = virCgroupNewSelf(&cgroup)) != 0) {
+        virReportSystemError(-rc, "%s",
+                             _("Cannot identify cgroup placement"));
         goto cleanup;
+    }
 
     /* Ensure the root filesystem is mounted */
     if (lxcContainerPrepareRoot(vmDef, root) < 0)
@@ -2017,8 +1798,7 @@ static int lxcContainerSetupPivotRoot(virDomainDefPtr vmDef,
 
     /* Now we can re-mount the cgroups controllers in the
      * same configuration as before */
-    if (lxcContainerMountCGroups(mounts, nmounts,
-                                 cgroupRoot, sec_mount_options) < 0)
+    if (virCgroupIsolateMount(cgroup, "/.oldroot/", sec_mount_options) < 0)
         goto cleanup;
 
     /* Mounts /dev/pts */
@@ -2048,8 +1828,7 @@ static int lxcContainerSetupPivotRoot(virDomainDefPtr vmDef,
     ret = 0;
 
 cleanup:
-    lxcContainerCGroupFree(mounts, nmounts);
-    VIR_FREE(cgroupRoot);
+    virCgroupFree(&cgroup);
     VIR_FREE(sec_mount_options);
     return ret;
 }
-- 
1.8.1.4
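
For readers unfamiliar with the detection step this patch removes, here is a minimal standalone sketch of the same idea: walk a mounts table with getmntent_r, keep entries of type "cgroup", and skip named hierarchies such as systemd's. The function name find_cgroup_mounts and its parameters are hypothetical and not part of libvirt. The sketch leaves out the common-root check, the symlink discovery and the re-mount step that the removed code also performed.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <mntent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not libvirt API): collect the mount points of all
 * cgroup controller mounts listed in mounts_file (normally /proc/mounts).
 * Stores up to max_dirs heap-allocated strings in dirs; the caller frees
 * them. Returns the number of entries stored, or -1 on error. */
static int find_cgroup_mounts(const char *mounts_file,
                              char **dirs, size_t max_dirs)
{
    FILE *fp;
    struct mntent ent;
    char buf[1024];
    size_t n = 0;

    assert(mounts_file != NULL && dirs != NULL);

    if (!(fp = setmntent(mounts_file, "r")))
        return -1;

    while (n < max_dirs &&
           getmntent_r(fp, &ent, buf, sizeof(buf)) != NULL) {
        /* Only cgroup filesystems are of interest */
        if (strcmp(ent.mnt_type, "cgroup") != 0)
            continue;

        /* Skip named mounts with no controller (e.g. name=systemd) */
        if (strstr(ent.mnt_opts, "name="))
            continue;

        if (!(dirs[n] = strdup(ent.mnt_dir))) {
            /* Out of memory: release what was collected so far */
            while (n > 0)
                free(dirs[--n]);
            endmntent(fp);
            return -1;
        }
        n++;
    }

    endmntent(fp);
    return (int)n;
}
```

Calling find_cgroup_mounts("/proc/mounts", dirs, 16) on a host with cgroup v1 controllers mounted would be expected to fill dirs with paths such as /sys/fs/cgroup/cpuset. The exact set depends on the host configuration.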

On 10.04.2013 12:08, Daniel P. Berrange wrote:
From: "Daniel P. Berrange" <berrange@redhat.com>
The LXC driver currently has code to detect cgroups mounts and then re-mount them inside the new root filesystem. Replace this fragile code with a call to virCgroupIsolateMount.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com> --- src/lxc/lxc_container.c | 237 ++---------------------------------------------- 1 file changed, 8 insertions(+), 229 deletions(-)
ACK if you squash this in:

diff --git a/src/lxc/lxc_container.c b/src/lxc/lxc_container.c
index b0367e4..ac0f69c 100644
--- a/src/lxc/lxc_container.c
+++ b/src/lxc/lxc_container.c
@@ -35,7 +35,6 @@
 #include <sys/stat.h>
 #include <unistd.h>
 #include <mntent.h>
-#include <dirent.h>
 #include <sys/reboot.h>
 #include <linux/reboot.h>
 
Michal

On 2013/04/10 18:08, Daniel P. Berrange wrote:
This is an update of
https://www.redhat.com/archives/libvir-list/2013-April/msg00352.html
Currently libvirt creates a cgroups hierarchy at
$LOCATION-OF-LIBVIRTD/libvirt/{qemu,lxc}/$GUEST-NAME
eg
/sys/fs/cgroup
├── blkio
│   └── libvirt
│       ├── lxc
│       │   └── busy
│       └── qemu
│           └── vm1
├── cpu,cpuacct
│   ├── libvirt
│   │   ├── lxc
│   │   │   └── busy
│   │   └── qemu
│   │       └── vm1
│   │           ├── emulator
│   │           └── vcpu0
│   └── system
│       ├── abrtd.service
│       ....snip....
│       └── upower.service
├── cpuset
│   └── libvirt
│       ├── lxc
│       │   └── busy
│       └── qemu
│           └── vm1
│               ├── emulator
│               └── vcpu0
├── devices
│   └── libvirt
│       ├── lxc
│       │   └── busy
│       └── qemu
│           └── vm1
├── freezer
│   └── libvirt
│       ├── lxc
│       │   └── busy
│       └── qemu
│           └── vm1
├── memory
│   └── libvirt
│       ├── lxc
│       │   └── busy
│       └── qemu
│           └── vm1
├── net_cls
├── perf_event
This series changes it so that libvirt creates cgroups at
/system/$VMNAME.{qemu,lxc}.libvirt
and allows configuration of the "resource partition" (ie the "/system" bit) via the XML. So we get a layout like this:
/sys/fs/cgroup
├── blkio
│   └── system
│       ├── demo.lxc.libvirt
│       └── vm1.qemu.libvirt
├── cpu,cpuacct
│   └── system
│       ├── abrtd.service
│       ....snip....
│       ├── demo.lxc.libvirt
│       ....snip....
│       └── vm1.qemu.libvirt
│           ├── emulator
│           └── vcpu0
├── cpuset
│   └── system
│       ├── demo.lxc.libvirt
│       └── vm1.qemu.libvirt
│           ├── emulator
│           └── vcpu0
├── devices
│   └── system
│       ├── demo.lxc.libvirt
│       └── vm1.qemu.libvirt
├── freezer
│   └── system
│       ├── demo.lxc.libvirt
│       └── vm1.qemu.libvirt
├── memory
│   └── system
│       ├── demo.lxc.libvirt
│       └── vm1.qemu.libvirt
├── net_cls
├── perf_event
Flattening out the libvirt-created hierarchy has serious performance wins, due to poor kernel scalability with deep hierarchies. It also makes it easier to configure system-wide policy for resource usage across system services and virtual machines / containers, since they all live at the top level in common resource partitions.
Changes since v2:
- Merge previously ACKed patches
- Incorporate Gao Feng's changes to LXC cgroup mount setup
I will review this patchset and do some simple testing this week. Thanks for your great work :)
participants (4): Daniel P. Berrange, Eric Blake, Gao feng, Michal Privoznik