[libvirt] [PATCH 00/18] Re-arrange the way cgroups are setup

This is a greatly expanded version of a previous series I posted

  https://www.redhat.com/archives/libvir-list/2013-March/msg01373.html

Currently libvirt creates a cgroups hierarchy at

  $LOCATION-OF-LIBVIRTD/libvirt/{qemu,lxc}/$GUEST-NAME

e.g.

  /sys/fs/cgroup
  ├── blkio
  │   └── libvirt
  │       ├── lxc
  │       │   └── busy
  │       └── qemu
  │           └── vm1
  ├── cpu,cpuacct
  │   ├── libvirt
  │   │   ├── lxc
  │   │   │   └── busy
  │   │   └── qemu
  │   │       └── vm1
  │   │           ├── emulator
  │   │           └── vcpu0
  │   └── system
  │       ├── abrtd.service
  │       ....snip....
  │       └── upower.service
  ├── cpuset
  │   └── libvirt
  │       ├── lxc
  │       │   └── busy
  │       └── qemu
  │           └── vm1
  │               ├── emulator
  │               └── vcpu0
  ├── devices
  │   └── libvirt
  │       ├── lxc
  │       │   └── busy
  │       └── qemu
  │           └── vm1
  ├── freezer
  │   └── libvirt
  │       ├── lxc
  │       │   └── busy
  │       └── qemu
  │           └── vm1
  ├── memory
  │   └── libvirt
  │       ├── lxc
  │       │   └── busy
  │       └── qemu
  │           └── vm1
  ├── net_cls
  ├── perf_event

This series changes it so that libvirt creates cgroups at

  /system/$VMNAME.{qemu,lxc}.libvirt

and allows configuration of the "resource partition" (i.e. the
"/system" bit) via the XML. So we get a layout like this:

  /sys/fs/cgroup
  ├── blkio
  │   └── system
  │       ├── demo.lxc.libvirt
  │       └── vm1.qemu.libvirt
  ├── cpu,cpuacct
  │   └── system
  │       ├── abrtd.service
  │       ....snip....
  │       ├── demo.lxc.libvirt
  │       ....snip....
  │       └── vm1.qemu.libvirt
  │           ├── emulator
  │           └── vcpu0
  ├── cpuset
  │   └── system
  │       ├── demo.lxc.libvirt
  │       └── vm1.qemu.libvirt
  │           ├── emulator
  │           └── vcpu0
  ├── devices
  │   └── system
  │       ├── demo.lxc.libvirt
  │       └── vm1.qemu.libvirt
  ├── freezer
  │   └── system
  │       ├── demo.lxc.libvirt
  │       └── vm1.qemu.libvirt
  ├── memory
  │   └── system
  │       ├── demo.lxc.libvirt
  │       └── vm1.qemu.libvirt
  ├── net_cls
  ├── perf_event

Flattening out the libvirt created hierarchy has serious performance
wins, due to poor kernel scalability with deep hierarchies. It also
makes it easier to configure system wide policy for resource usage
across system services and virtual machines / containers, since they
all live at the top level in common resource partitions.
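
To make the new naming scheme concrete, here is a minimal sketch of how a per-guest cgroup path could be composed from a resource partition, a guest name and a driver name. The helper `format_guest_cgroup` and its parameters are hypothetical names chosen for illustration; they are not functions from this series.

```c
#include <stdio.h>

/* Hypothetical helper (not part of the series): compose
 * "<partition>/<name>.<driver>.libvirt" into buf.
 * Returns 0 on success, -1 if the result does not fit. */
static int format_guest_cgroup(char *buf, size_t buflen,
                               const char *partition,
                               const char *name,
                               const char *driver)
{
    int n = snprintf(buf, buflen, "%s/%s.%s.libvirt",
                     partition, name, driver);

    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

With partition "/system", name "vm1" and driver "qemu" this yields /system/vm1.qemu.libvirt, which is the directory created under each controller mount in the second layout above.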

From: "Daniel P. Berrange" <berrange@redhat.com> Split the "resource" define out into multiple smaller defines, one for each type of resource tuning parameter. This makes the schema a bit clearer to read Signed-off-by: Daniel P. Berrange <berrange@redhat.com> --- docs/schemas/domaincommon.rng | 271 ++++++++++++++++++++++-------------------- 1 file changed, 144 insertions(+), 127 deletions(-) diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index 454ebdb..63ba7d1 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -498,62 +498,6 @@ </element> </optional> - <!-- The Blkio cgroup related tunables would go in the blkiotune --> - <optional> - <element name="blkiotune"> - <interleave> - <!-- I/O weight the VM can use --> - <optional> - <element name="weight"> - <ref name="weight"/> - </element> - </optional> - <zeroOrMore> - <element name="device"> - <interleave> - <element name="path"> - <ref name="absFilePath"/> - </element> - <element name="weight"> - <ref name="weight"/> - </element> - </interleave> - </element> - </zeroOrMore> - </interleave> - </element> - </optional> - - <!-- All the memory/swap related tunables would go in the memtune --> - <optional> - <element name="memtune"> - <!-- Maximum memory the VM can use --> - <optional> - <element name="hard_limit"> - <ref name='scaledInteger'/> - </element> - </optional> - <!-- Minimum memory ascertained for the VM during contention --> - <optional> - <element name="soft_limit"> - <ref name='scaledInteger'/> - </element> - </optional> - <!-- Minimum amount of memory required to start the VM --> - <optional> - <element name="min_guarantee"> - <ref name='scaledInteger'/> - </element> - </optional> - <!-- Maximum swap area the VM can use --> - <optional> - <element name="swap_hard_limit"> - <ref name='scaledInteger'/> - </element> - </optional> - </element> - </optional> - <optional> <element name="vcpu"> <optional> @@ -578,91 +522,164 @@ </element> </optional> - 
<!-- All the cpu related tunables would go in the cputune --> <optional> - <element name="cputune"> - <optional> - <element name="shares"> - <ref name="cpushares"/> - </element> - </optional> - <optional> - <element name="period"> - <ref name="cpuperiod"/> - </element> - </optional> - <optional> - <element name="quota"> - <ref name="cpuquota"/> - </element> - </optional> - <optional> - <element name="emulator_period"> - <ref name="cpuperiod"/> - </element> - </optional> - <optional> - <element name="emulator_quota"> - <ref name="cpuquota"/> - </element> - </optional> - <zeroOrMore> - <element name="vcpupin"> - <attribute name="vcpu"> - <ref name="vcpuid"/> - </attribute> - <attribute name="cpuset"> - <ref name="cpuset"/> - </attribute> - </element> - </zeroOrMore> - <optional> - <element name="emulatorpin"> - <attribute name="cpuset"> - <ref name="cpuset"/> - </attribute> - </element> - </optional> + <ref name="blkiotune"/> + </optional> + + <optional> + <ref name="memtune"/> + </optional> + + <optional> + <ref name="cputune"/> + </optional> + + <optional> + <ref name="numatune"/> + </optional> + </interleave> + </define> + + <!-- The Blkio cgroup related tunables would go in the blkiotune --> + <define name="blkiotune"> + <element name="blkiotune"> + <interleave> + <!-- I/O weight the VM can use --> + <optional> + <element name="weight"> + <ref name="weight"/> + </element> + </optional> + <zeroOrMore> + <element name="device"> + <interleave> + <element name="path"> + <ref name="absFilePath"/> + </element> + <element name="weight"> + <ref name="weight"/> + </element> + </interleave> + </element> + </zeroOrMore> + </interleave> + </element> + </define> + + <!-- All the memory/swap related tunables would go in the memtune --> + <define name="memtune"> + <element name="memtune"> + <!-- Maximum memory the VM can use --> + <optional> + <element name="hard_limit"> + <ref name='scaledInteger'/> + </element> + </optional> + <!-- Minimum memory ascertained for the VM during 
contention --> + <optional> + <element name="soft_limit"> + <ref name='scaledInteger'/> + </element> + </optional> + <!-- Minimum amount of memory required to start the VM --> + <optional> + <element name="min_guarantee"> + <ref name='scaledInteger'/> + </element> + </optional> + <!-- Maximum swap area the VM can use --> + <optional> + <element name="swap_hard_limit"> + <ref name='scaledInteger'/> + </element> + </optional> + </element> + </define> + + <!-- All the cpu related tunables would go in the cputune --> + <define name="cputune"> + <element name="cputune"> + <optional> + <element name="shares"> + <ref name="cpushares"/> + </element> + </optional> + <optional> + <element name="period"> + <ref name="cpuperiod"/> + </element> + </optional> + <optional> + <element name="quota"> + <ref name="cpuquota"/> + </element> + </optional> + <optional> + <element name="emulator_period"> + <ref name="cpuperiod"/> + </element> + </optional> + <optional> + <element name="emulator_quota"> + <ref name="cpuquota"/> + </element> + </optional> + <zeroOrMore> + <element name="vcpupin"> + <attribute name="vcpu"> + <ref name="vcpuid"/> + </attribute> + <attribute name="cpuset"> + <ref name="cpuset"/> + </attribute> + </element> + </zeroOrMore> + <optional> + <element name="emulatorpin"> + <attribute name="cpuset"> + <ref name="cpuset"/> + </attribute> </element> </optional> + </element> + </define> - <!-- All the NUMA related tunables would go in the numatune --> + <!-- All the NUMA related tunables would go in the numatune --> + <define name="numatune"> + <element name="numatune"> <optional> - <element name="numatune"> + <element name="memory"> <optional> - <element name="memory"> + <attribute name="mode"> + <choice> + <value>strict</value> + <value>preferred</value> + <value>interleave</value> + </choice> + </attribute> + </optional> + <choice> + <group> <optional> - <attribute name="mode"> - <choice> - <value>strict</value> - <value>preferred</value> - <value>interleave</value> 
- </choice> + <attribute name='placement'> + <value>static</value> </attribute> </optional> - <choice> - <group> - <optional> - <attribute name='placement'> - <value>static</value> - </attribute> - </optional> - <optional> - <attribute name='nodeset'> - <ref name='cpuset'/> - </attribute> - </optional> - </group> - <attribute name='placement'> - <value>auto</value> + <optional> + <attribute name='nodeset'> + <ref name='cpuset'/> </attribute> + </optional> + </group> + <attribute name='placement'> + <value>auto</value> + </attribute> </choice> - </element> - </optional> </element> </optional> - </interleave> + </element> </define> + <define name="clock"> <optional> <element name="clock"> -- 1.8.1.4

On 04/04/2013 07:40 AM, Daniel P. Berrange wrote:
From: "Daniel P. Berrange" <berrange@redhat.com>
Split the "resource" define out into multiple smaller defines, one for each type of resource tuning parameter. This makes the schema a bit clearer to read
Signed-off-by: Daniel P. Berrange <berrange@redhat.com> --- docs/schemas/domaincommon.rng | 271 ++++++++++++++++++++++-------------------- 1 file changed, 144 insertions(+), 127 deletions(-)
ACK. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

From: "Daniel P. Berrange" <berrange@redhat.com>

The linker will ignore LD_PRELOAD libraries which do not exist,
just printing a warning message. This is not helpful for the
test suite which will be utterly fubar without the preload
library present. Add an explicit test for existance of the
library to protect against this

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
---
 tests/testutils.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/tests/testutils.h b/tests/testutils.h
index 546c9ae..3647487 100644
--- a/tests/testutils.h
+++ b/tests/testutils.h
@@ -75,6 +75,10 @@ int virtTestMain(int argc,
         const char *preload = getenv("LD_PRELOAD");                     \
         if (preload == NULL || strstr(preload, lib) == NULL) {          \
             char *newenv;                                               \
+            if (!virFileIsExecutable(lib)) {                            \
+                perror(lib);                                            \
+                return EXIT_FAILURE;                                    \
+            }                                                           \
             if (virAsprintf(&newenv, "%s%s%s", preload ? preload : "",  \
                             preload ? ":" : "", lib) < 0) {             \
                 perror("virAsprintf");                                  \
-- 
1.8.1.4
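
For readers outside the libvirt tree, the check added by this patch can be sketched in standalone C. This is only an approximation under stated assumptions: `preload_add` is a hypothetical name, plain access(X_OK) stands in for virFileIsExecutable() (which may check more than this), and malloc()/snprintf() stand in for virAsprintf().

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: append lib to LD_PRELOAD, but refuse to do so
 * when the library is missing, instead of letting the dynamic linker
 * silently skip it. Returns 0 on success, -1 on failure. */
static int preload_add(const char *lib)
{
    const char *preload = getenv("LD_PRELOAD");
    char *newenv;
    size_t len;
    int rc;

    if (preload != NULL && strstr(preload, lib) != NULL)
        return 0;                       /* already listed */

    if (access(lib, X_OK) < 0) {        /* stand-in for virFileIsExecutable() */
        perror(lib);
        return -1;
    }

    /* old value + ':' (only if there was an old value) + lib + NUL */
    len = (preload ? strlen(preload) + 1 : 0) + strlen(lib) + 1;
    if ((newenv = malloc(len)) == NULL)
        return -1;
    snprintf(newenv, len, "%s%s%s",
             preload ? preload : "", preload ? ":" : "", lib);

    rc = setenv("LD_PRELOAD", newenv, 1);
    free(newenv);
    return rc;
}
```

Failing early matters because a test that runs without its mock library may exercise the real system calls rather than the mocked ones.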

On 04/04/2013 07:40 AM, Daniel P. Berrange wrote:
From: "Daniel P. Berrange" <berrange@redhat.com>
The linker will ignore LD_PRELOAD libraries which do not exist, just printing a warning message. This is not helpful for the test suite which will be utterly fubar without the preload library present. Add an explicit test for existance
s/existance/existence/
of the library to protect against this
Signed-off-by: Daniel P. Berrange <berrange@redhat.com> --- tests/testutils.h | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/tests/testutils.h b/tests/testutils.h index 546c9ae..3647487 100644 --- a/tests/testutils.h +++ b/tests/testutils.h @@ -75,6 +75,10 @@ int virtTestMain(int argc, const char *preload = getenv("LD_PRELOAD"); \ if (preload == NULL || strstr(preload, lib) == NULL) { \ char *newenv; \ + if (!virFileIsExecutable(lib)) { \ + perror(lib); \ + return EXIT_FAILURE; \ + } \
ACK. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

From: "Daniel P. Berrange" <berrange@redhat.com>

Introduce a method virFileDeleteTree for recursively deleting
an entire directory tree

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
---
 src/libvirt_private.syms |  1 +
 src/util/virfile.c       | 78 ++++++++++++++++++++++++++++++++++++++++++++++++
 src/util/virfile.h       |  2 ++
 3 files changed, 81 insertions(+)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index 96eea0a..e297850 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -1258,6 +1258,7 @@ virEventPollUpdateTimeout;
 
 # util/virfile.h
 virFileClose;
+virFileDeleteTree;
 virFileDirectFdFlag;
 virFileFclose;
 virFileFdopen;
diff --git a/src/util/virfile.c b/src/util/virfile.c
index 4a9fa81..4d338e1 100644
--- a/src/util/virfile.c
+++ b/src/util/virfile.c
@@ -644,3 +644,81 @@ int virFileLoopDeviceAssociate(const char *file,
 }
 
 #endif /* __linux__ */
+
+
+/**
+ * virFileDeleteTree:
+ *
+ * Recursively deletes all files / directories
+ * starting from the directory @dir. Does not
+ * follow symlinks
+ */
+int virFileDeleteTree(const char *dir)
+{
+    DIR *dh = opendir(dir);
+    struct dirent *de;
+    char *filepath = NULL;
+    int ret = -1;
+
+    if (!dh) {
+        virReportSystemError(errno, _("Cannot open dir '%s'"),
+                             dir);
+        return -1;
+    }
+
+    errno = 0;
+    while ((de = readdir(dh)) != NULL) {
+        struct stat sb;
+
+        if (STREQ(de->d_name, ".") ||
+            STREQ(de->d_name, ".."))
+            continue;
+
+        if (virAsprintf(&filepath, "%s/%s",
+                        dir, de->d_name) < 0) {
+            virReportOOMError();
+            goto cleanup;
+        }
+
+        if (lstat(filepath, &sb) < 0) {
+            virReportSystemError(errno, _("Cannot access '%s'"),
+                                 filepath);
+            goto cleanup;
+        }
+
+        if (S_ISDIR(sb.st_mode)) {
+            if (virFileDeleteTree(filepath) < 0)
+                goto cleanup;
+        } else {
+            if (unlink(filepath) < 0 && errno != ENOENT) {
+                virReportSystemError(errno,
+                                     _("Cannot delete file '%s'"),
+                                     filepath);
+                goto cleanup;
+            }
+        }
+
+        VIR_FREE(filepath);
+        errno = 0;
+    }
+
+    if (errno) {
+        virReportSystemError(errno, _("Cannot read dir '%s'"),
+                             dir);
+        goto cleanup;
+    }
+
+    if (rmdir(dir) < 0 && errno != ENOENT) {
+        virReportSystemError(errno,
+                             _("Cannot delete directory '%s'"),
+                             dir);
+        goto cleanup;
+    }
+
+    ret = 0;
+
+cleanup:
+    VIR_FREE(filepath);
+    closedir(dh);
+    return ret;
+}
diff --git a/src/util/virfile.h b/src/util/virfile.h
index c885b73..5f0dd2b 100644
--- a/src/util/virfile.h
+++ b/src/util/virfile.h
@@ -108,4 +108,6 @@ int virFileUpdatePerm(const char *path,
 int virFileLoopDeviceAssociate(const char *file,
                                char **dev);
 
+int virFileDeleteTree(const char *dir);
+
 #endif /* __VIR_FILES_H */
-- 
1.8.1.4

On 04/04/2013 07:40 AM, Daniel P. Berrange wrote:
From: "Daniel P. Berrange" <berrange@redhat.com>
Introduce a method virFileDeleteTree for recursively deleting an entire directory tree
Signed-off-by: Daniel P. Berrange <berrange@redhat.com> --- src/libvirt_private.syms | 1 + src/util/virfile.c | 78 ++++++++++++++++++++++++++++++++++++++++++++++++ src/util/virfile.h | 2 ++ 3 files changed, 81 insertions(+)
Don't we already have something like this? /me goes and looks... yep, very similar to virCgroupRemoveRecursively - hopefully a later patch drops that function to use this instead. I like the idea of a generalized interface.
+ * virFileDeleteTree: + * + * Recursively deletes all files / directories + * starting from the directory @dir. Does not + * follow symlinks + */ +int virFileDeleteTree(const char *dir) +{ + DIR *dh = opendir(dir); + struct dirent *de; + char *filepath = NULL; + int ret = -1; + + if (!dh) { + virReportSystemError(errno, _("Cannot open dir '%s'"), + dir); + return -1; + } + + errno = 0; + while ((de = readdir(dh)) != NULL) { + struct stat sb; + + if (STREQ(de->d_name, ".") || + STREQ(de->d_name, "..")) + continue; + + if (virAsprintf(&filepath, "%s/%s", + dir, de->d_name) < 0) { + virReportOOMError(); + goto cleanup; + }
We should use gnulib's LGPL unlinkat. On capable kernels, it avoids O(n^2) behavior that is inherent in computing filenames in a deep hierarchy. On less capable kernels, it still makes this code simpler to write (no virAsprintf needed here).
+ + if (lstat(filepath, &sb) < 0) { + virReportSystemError(errno, _("Cannot access '%s'"), + filepath); + goto cleanup; + }
Potentially wasteful on systems like Linux that have d_type. If d_type exists, and is not DT_UNKNOWN, then the difference between DT_DIR and other types can save some system call efforts. Potential TOCTTOU race. POSIX allows unlink("dir") to succeed (although most platforms reject it, either always [Linux], or based on capabilities [Solaris, which also has code to give up that capability, and where gnulib also exposes that]). If we ever manage to unlink() a directory, because we lost the TOCTTOU race, then we have done bad things to the file system. But you are guaranteed that rmdir() on a non-directory will gracefully fail; so you can minimize the race window by attempting to treat _every_ name as a directory first, then gracefully fall back to unlink() if the opendir() fails with ENOTDIR, without ever having to waste time lstat()ing things. Hmm, except that you don't want to follow symlinks, but opendir() follows them by default; so you would have to use open(O_DIRECTORY)/fdopendir() instead of raw opendir().
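
The approach described in the two review comments above (fd-relative deletion, treating every name as a directory first, and never following symlinks) could look roughly like the sketch below. It is not code from this thread: `delete_tree_at` is a hypothetical name, error reporting is reduced to errno, and it assumes a platform that provides openat(), fdopendir() and unlinkat() natively, which is exactly the portability question raised in the follow-ups.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical sketch: delete "name", resolved relative to the open
 * directory parentfd. Returns 0 on success, -1 with errno set. */
static int delete_tree_at(int parentfd, const char *name)
{
    struct dirent *de;
    DIR *dh;
    int ret = -1;
    int fd;

    /* Try every name as a directory first. A symlink or any other
     * non-directory makes openat() fail (ENOTDIR or ELOOP, depending
     * on the platform), and we fall back to unlinkat() with no lstat(). */
    fd = openat(parentfd, name, O_RDONLY | O_DIRECTORY | O_NOFOLLOW);
    if (fd < 0) {
        if (errno != ENOTDIR && errno != ELOOP)
            return errno == ENOENT ? 0 : -1;
        if (unlinkat(parentfd, name, 0) < 0 && errno != ENOENT)
            return -1;
        return 0;
    }

    if ((dh = fdopendir(fd)) == NULL) {
        close(fd);
        return -1;
    }

    errno = 0;
    while ((de = readdir(dh)) != NULL) {
        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;
        /* Recurse relative to the open directory: no path strings are
         * built, so the cost does not grow with the depth of the tree. */
        if (delete_tree_at(dirfd(dh), de->d_name) < 0)
            goto cleanup;
        errno = 0;
    }
    if (errno != 0)
        goto cleanup;

    /* AT_REMOVEDIR makes unlinkat() behave like rmdir(). */
    if (unlinkat(parentfd, name, AT_REMOVEDIR) < 0 && errno != ENOENT)
        goto cleanup;
    ret = 0;

cleanup:
    closedir(dh);       /* also closes fd */
    return ret;
}
```

A caller starting from a plain path would pass AT_FDCWD as parentfd. The sketch does not address the restrictive-permissions question raised later in the review.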
+ + if (S_ISDIR(sb.st_mode)) { + if (virFileDeleteTree(filepath) < 0) + goto cleanup; + } else { + if (unlink(filepath) < 0 && errno != ENOENT) { + virReportSystemError(errno, + _("Cannot delete file '%s'"), + filepath); + goto cleanup;
What happens on files that have restrictive permissions? Do we need to worry about chmod()ing files (or containing directories) to give ourselves enough access so that we can then turn around and unlink() files regardless of restrictive permissions?
+ } + } + + VIR_FREE(filepath); + errno = 0; + } + + if (errno) { + virReportSystemError(errno, _("Cannot read dir '%s'"), + dir); + goto cleanup; + } + + if (rmdir(dir) < 0 && errno != ENOENT) { + virReportSystemError(errno, + _("Cannot delete directory '%s'"), + dir); + goto cleanup; + }
What you have works, even if it is inherently quadratic compared to using unlinkat(). What sort of trees do we envision deleting? Do we need to start worrying about performance, using lessons learned from GNU coreutils? Or are the trees small enough, with properties of never being too-restrictive in file mode, and where the trees we are deleting are unlikely to be hit by a malicious user exploiting a TOCTTOU race, that we can just stick with this implementation as-is?
+ + ret = 0; + +cleanup: + VIR_FREE(filepath); + closedir(dh); + return ret; +} diff --git a/src/util/virfile.h b/src/util/virfile.h index c885b73..5f0dd2b 100644 --- a/src/util/virfile.h +++ b/src/util/virfile.h @@ -108,4 +108,6 @@ int virFileUpdatePerm(const char *path, int virFileLoopDeviceAssociate(const char *file, char **dev);
+int virFileDeleteTree(const char *dir); + #endif /* __VIR_FILES_H */
Very weak ACK, depending on what you answer to my commentary. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On Thu, Apr 04, 2013 at 11:56:29AM -0600, Eric Blake wrote:
On 04/04/2013 07:40 AM, Daniel P. Berrange wrote:
From: "Daniel P. Berrange" <berrange@redhat.com>
Introduce a method virFileDeleteTree for recursively deleting an entire directory tree
Signed-off-by: Daniel P. Berrange <berrange@redhat.com> --- src/libvirt_private.syms | 1 + src/util/virfile.c | 78 ++++++++++++++++++++++++++++++++++++++++++++++++ src/util/virfile.h | 2 ++ 3 files changed, 81 insertions(+)
Don't we already have something like this?
/me goes and looks...
yep, very similar to virCgroupRemoveRecursively - hopefully a later patch drops that function to use this instead. I like the idea of a generalized interface.
NB virCgroupRemoveRecursively is a little special - with the cgroups filesystem, you never have to actually delete any of the files in the directories. You delete the directory & every file in it magically goes away too (assuming there are no sub-directories or tasks present).
+ errno = 0; + while ((de = readdir(dh)) != NULL) { + struct stat sb; + + if (STREQ(de->d_name, ".") || + STREQ(de->d_name, "..")) + continue; + + if (virAsprintf(&filepath, "%s/%s", + dir, de->d_name) < 0) { + virReportOOMError(); + goto cleanup; + }
We should use gnulib's LGPL unlinkat. On capable kernels, it avoids O(n^2) behavior that is inherent in computing filenames in a deep hierarchy. On less capable kernels, it still makes this code simpler to write (no virAsprintf needed here).
Yep, I wondered about unlinkat(), then wondered about portability, forgetting that gnulib might help.
+ + if (lstat(filepath, &sb) < 0) { + virReportSystemError(errno, _("Cannot access '%s'"), + filepath); + goto cleanup; + }
Potentially wasteful on systems like Linux that have d_type. If d_type exists, and is not DT_UNKNOWN, then the difference between DT_DIR and other types can save some system call efforts.
Potential TOCTTOU race. POSIX allows unlink("dir") to succeed (although most platforms reject it, either always [Linux], or based on capabilities [Solaris, which also has code to give up that capability, and where gnulib also exposes that]). If we ever manage to unlink() a directory, because we lost the TOCTTOU race, then we have done bad things to the file system.
But you are guaranteed that rmdir() on a non-directory will gracefully fail; so you can minimize the race window by attempting to treat _every_ name as a directory first, then gracefully fall back to unlink() if the opendir() fails with ENOTDIR, without ever having to waste time lstat()ing things. Hmm, except that you don't want to follow symlinks, but opendir() follows them by default; so you would have to use open(O_DIRECTORY)/fdopendir() instead of raw opendir().
+ + if (S_ISDIR(sb.st_mode)) { + if (virFileDeleteTree(filepath) < 0) + goto cleanup; + } else { + if (unlink(filepath) < 0 && errno != ENOENT) { + virReportSystemError(errno, + _("Cannot delete file '%s'"), + filepath); + goto cleanup;
What happens on files that have restrictive permissions? Do we need to worry about chmod()ing files (or containing directories) to give ourselves enough access so that we can then turn around and unlink() files regardless of restrictive permissions?
+ } + } + + VIR_FREE(filepath); + errno = 0; + } + + if (errno) { + virReportSystemError(errno, _("Cannot read dir '%s'"), + dir); + goto cleanup; + } + + if (rmdir(dir) < 0 && errno != ENOENT) { + virReportSystemError(errno, + _("Cannot delete directory '%s'"), + dir); + goto cleanup; + }
What you have works, even if it is inherently quadratic compared to using unlinkat(). What sort of trees do we envision deleting? Do we need to start worrying about performance, using lessons learned from GNU coreutils? Or are the trees small enough, with properties of never being too-restrictive in file mode, and where the trees we are deleting are unlikely to be hit by a malicious user exploiting a TOCTTOU race, that we can just stick with this implementation as-is?
+ + ret = 0; + +cleanup: + VIR_FREE(filepath); + closedir(dh); + return ret; +} diff --git a/src/util/virfile.h b/src/util/virfile.h index c885b73..5f0dd2b 100644 --- a/src/util/virfile.h +++ b/src/util/virfile.h @@ -108,4 +108,6 @@ int virFileUpdatePerm(const char *path, int virFileLoopDeviceAssociate(const char *file, char **dev);
+int virFileDeleteTree(const char *dir); + #endif /* __VIR_FILES_H */
Very weak ACK, depending on what you answer to my commentary.
I guess the thing I should point out here is that I wasn't writing this to deal with a hostile environment. This function is actually only used by the testsuite I add in a later patch. It won't actually be used by libvirtd / libvirt.so. So I wasn't too bothered about perfection, just something good enough for the immediate need.

I guess the problem is that once we have this function, someone else is bound to use it elsewhere, whereupon my laziness might cause a problem. I'll make a bunch of the changes you suggest & re-post for further discussion.

Daniel
-- 
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On Thu, Apr 04, 2013 at 11:56:29AM -0600, Eric Blake wrote:
On 04/04/2013 07:40 AM, Daniel P. Berrange wrote:
From: "Daniel P. Berrange" <berrange@redhat.com>
Introduce a method virFileDeleteTree for recursively deleting an entire directory tree
Signed-off-by: Daniel P. Berrange <berrange@redhat.com> --- src/libvirt_private.syms | 1 + src/util/virfile.c | 78 ++++++++++++++++++++++++++++++++++++++++++++++++ src/util/virfile.h | 2 ++ 3 files changed, 81 insertions(+)
Don't we already have something like this?
/me goes and looks...
yep, very similar to virCgroupRemoveRecursively - hopefully a later patch drops that function to use this instead. I like the idea of a generalized interface.
+ * virFileDeleteTree: + * + * Recursively deletes all files / directories + * starting from the directory @dir. Does not + * follow symlinks + */ +int virFileDeleteTree(const char *dir) +{ + DIR *dh = opendir(dir); + struct dirent *de; + char *filepath = NULL; + int ret = -1; + + if (!dh) { + virReportSystemError(errno, _("Cannot open dir '%s'"), + dir); + return -1; + } + + errno = 0; + while ((de = readdir(dh)) != NULL) { + struct stat sb; + + if (STREQ(de->d_name, ".") || + STREQ(de->d_name, "..")) + continue; + + if (virAsprintf(&filepath, "%s/%s", + dir, de->d_name) < 0) { + virReportOOMError(); + goto cleanup; + }
We should use gnulib's LGPL unlinkat. On capable kernels, it avoids O(n^2) behavior that is inherent in computing filenames in a deep hierarchy. On less capable kernels, it still makes this code simpler to write (no virAsprintf needed here).
I looked into unlinkat() but that gnulib module is GPL only.
+ + if (lstat(filepath, &sb) < 0) { + virReportSystemError(errno, _("Cannot access '%s'"), + filepath); + goto cleanup; + }
Potentially wasteful on systems like Linux that have d_type. If d_type exists, and is not DT_UNKNOWN, then the difference between DT_DIR and other types can save some system call efforts.
Potential TOCTTOU race. POSIX allows unlink("dir") to succeed (although most platforms reject it, either always [Linux], or based on capabilities [Solaris, which also has code to give up that capability, and where gnulib also exposes that]). If we ever manage to unlink() a directory, because we lost the TOCTTOU race, then we have done bad things to the file system.
But you are guaranteed that rmdir() on a non-directory will gracefully fail; so you can minimize the race window by attempting to treat _every_ name as a directory first, then gracefully fall back to unlink() if the opendir() fails with ENOTDIR, without ever having to waste time lstat()ing things. Hmm, except that you don't want to follow symlinks, but opendir() follows them by default; so you would have to use open(O_DIRECTORY)/fdopendir() instead of raw opendir().
I also looked at fdopendir() but we'd want gnulib for portability of that, and again it is GPL only.
Very weak ACK, depending on what you answer to my commentary.
I would like to make some of the enhancements you suggest, but doing so would require we implement a bunch of portability code since all the things you suggest are Linux specific and the gnulib modules are GPL only. So I think in the immediate term, we'll just have to go with what I have here.

Daniel
-- 
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 04/08/2013 08:04 AM, Daniel P. Berrange wrote:
We should use gnulib's LGPL unlinkat. On capable kernels, it avoids O(n^2) behavior that is inherent in computing filenames in a deep hierarchy. On less capable kernels, it still makes this code simpler to write (no virAsprintf needed here).
I looked into unlinkat() but that gnulib module is GPL only.
Oh right. The gnulib wrappers are nice on Linux (even with older kernels, they fall back to /proc/self/fd usage for avoidance of quadratic behavior and no thread-safety issues); but on other platforms (hello mingw and older BSD), it uses chdir() under the hood, which is not thread-safe since the current working directory is global state. Also, the gnulib wrappers currently can call exit() on some extreme corner cases (primarily related to traversing through searchable but unreadable directories), and library code should not call exit(), so the modules were left GPL to avoid using them in a library until someone is motivated enough to clean up the code to be a bit more robust.
I also looked at fdopendir() but we'd want gnulib for portability of that, and again it is GPL only.
I can make the argument for relaxing its license, if we really want it. Given that it is an essential function, and provided by libc (where glibc is LGPLv2+), I have typically had success with requests like that in the past; but it's not worth fighting the battle unless we know we want to use it.
Very weak ACK, depending on what you answer to my commentary.
I would like to make some of the enhancements you suggest, but doing so would require we implement a bunch of portability code since all the things you suggest are Linux specific and the gnulib modules are GPL only. So I think in the immediate term, we'll just have to go with what I have here.
Yeah, that's a pretty convincing argument to use what you have here. But at least add a comment at the function declaration that we know the behavior is not optimal and should therefore not be used on deep hierarchies (where the quadratic behavior would definitely be noticeable). -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

From: "Daniel P. Berrange" <berrange@redhat.com>

The virCgroupGetAppRoot is not clear in its meaning. Change
to virCgroupForSelf to highlight that this returns the
cgroup config for the caller's process

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
---
 src/libvirt_private.syms | 2 +-
 src/lxc/lxc_cgroup.c     | 2 +-
 src/util/vircgroup.c     | 9 ++++++---
 src/util/vircgroup.h     | 2 +-
 4 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index e297850..b9c656e 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -1098,9 +1098,9 @@ virCgroupDenyDevicePath;
 virCgroupForDomain;
 virCgroupForDriver;
 virCgroupForEmulator;
+virCgroupForSelf;
 virCgroupForVcpu;
 virCgroupFree;
-virCgroupGetAppRoot;
 virCgroupGetBlkioWeight;
 virCgroupGetCpuacctPercpuUsage;
 virCgroupGetCpuacctStat;
diff --git a/src/lxc/lxc_cgroup.c b/src/lxc/lxc_cgroup.c
index df468da..33c305a 100644
--- a/src/lxc/lxc_cgroup.c
+++ b/src/lxc/lxc_cgroup.c
@@ -293,7 +293,7 @@ int virLXCCgroupGetMeminfo(virLXCMeminfoPtr meminfo)
     int ret;
     virCgroupPtr cgroup;
 
-    ret = virCgroupGetAppRoot(&cgroup);
+    ret = virCgroupForSelf(&cgroup);
     if (ret < 0) {
         virReportSystemError(-ret, "%s",
                              _("Unable to get cgroup for container"));
diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c
index 6998f13..266cecb 100644
--- a/src/util/vircgroup.c
+++ b/src/util/vircgroup.c
@@ -967,19 +967,22 @@ int virCgroupForDriver(const char *name ATTRIBUTE_UNUSED,
 #endif
 
 /**
-* virCgroupGetAppRoot:
+* virCgroupForSelf:
 *
 * @group: Pointer to returned virCgroupPtr
 *
+* Obtain a cgroup representing the config of the
+* current process
+*
 * Returns 0 on success
 */
 #if defined HAVE_MNTENT_H && defined HAVE_GETMNTENT_R
-int virCgroupGetAppRoot(virCgroupPtr *group)
+int virCgroupForSelf(virCgroupPtr *group)
 {
     return virCgroupNew("/", group);
 }
 #else
-int virCgroupGetAppRoot(virCgroupPtr *group ATTRIBUTE_UNUSED)
+int virCgroupForSelf(virCgroupPtr *group ATTRIBUTE_UNUSED)
 {
     return -ENXIO;
 }
diff --git a/src/util/vircgroup.h b/src/util/vircgroup.h
index ea42fa2..45a2006 100644
--- a/src/util/vircgroup.h
+++ b/src/util/vircgroup.h
@@ -49,7 +49,7 @@ int virCgroupForDriver(const char *name,
                        bool privileged,
                        bool create);
 
-int virCgroupGetAppRoot(virCgroupPtr *group);
+int virCgroupForSelf(virCgroupPtr *group);
 
 int virCgroupForDomain(virCgroupPtr driver,
                        const char *name,
-- 
1.8.1.4

On 04/04/2013 07:40 AM, Daniel P. Berrange wrote:
> From: "Daniel P. Berrange" <berrange@redhat.com>
>
> The virCgroupGetAppRoot is not clear in its meaning. Change to
> virCgroupForSelf to highlight that this returns the cgroup config
> for the caller's process
>
> Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
> ---
>  src/libvirt_private.syms | 2 +-
>  src/lxc/lxc_cgroup.c     | 2 +-
>  src/util/vircgroup.c     | 9 ++++++---
>  src/util/vircgroup.h     | 2 +-
>  4 files changed, 9 insertions(+), 6 deletions(-)

ACK.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

From: "Daniel P. Berrange" <berrange@redhat.com> Currently when getting an instance of virCgroupPtr we will create the path in all cgroup controllers. Only at the virt driver layer are we attempting to filter controllers. This is bad because the mere act of creating the dirs in the controllers can have a functional impact on the kernel, particularly for performance. Update the virCgroupForDriver() method to accept a bitmask of controllers to use. Only create dirs in the controllers that are requested. When creating cgroups for domains, respect the active controller list from the parent cgroup Signed-off-by: Daniel P. Berrange <berrange@redhat.com> --- src/lxc/lxc_cgroup.c | 2 +- src/lxc/lxc_driver.c | 2 +- src/qemu/qemu_cgroup.c | 27 +++------ src/qemu/qemu_conf.c | 16 +---- src/qemu/qemu_driver.c | 3 +- src/util/vircgroup.c | 162 +++++++++++++++++++++++++++++-------------------- src/util/vircgroup.h | 6 +- 7 files changed, 113 insertions(+), 105 deletions(-) diff --git a/src/lxc/lxc_cgroup.c b/src/lxc/lxc_cgroup.c index 33c305a..33641f8 100644 --- a/src/lxc/lxc_cgroup.c +++ b/src/lxc/lxc_cgroup.c @@ -530,7 +530,7 @@ virCgroupPtr virLXCCgroupCreate(virDomainDefPtr def) int ret = -1; int rc; - rc = virCgroupForDriver("lxc", &driver, 1, 0); + rc = virCgroupForDriver("lxc", &driver, 1, 0, -1); if (rc != 0) { virReportSystemError(-rc, "%s", _("Unable to get cgroup for driver")); diff --git a/src/lxc/lxc_driver.c b/src/lxc/lxc_driver.c index 654ab99..9c6f858 100644 --- a/src/lxc/lxc_driver.c +++ b/src/lxc/lxc_driver.c @@ -1460,7 +1460,7 @@ static int lxcStartup(bool privileged, lxc_driver->log_libvirtd = 0; /* by default log to container logfile */ lxc_driver->have_netns = lxcCheckNetNsSupport(); - rc = virCgroupForDriver("lxc", &lxc_driver->cgroup, privileged, 1); + rc = virCgroupForDriver("lxc", &lxc_driver->cgroup, privileged, 1, -1); if (rc < 0) { char buf[1024] ATTRIBUTE_UNUSED; VIR_DEBUG("Unable to create cgroup for LXC driver: %s", diff --git 
a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c index c9b4ca2..2cdc2b7 100644 --- a/src/qemu/qemu_cgroup.c +++ b/src/qemu/qemu_cgroup.c @@ -57,8 +57,6 @@ bool qemuCgroupControllerActive(virQEMUDriverPtr driver, goto cleanup; if (!virCgroupMounted(driver->cgroup, controller)) goto cleanup; - if (cfg->cgroupControllers & (1 << controller)) - ret = true; cleanup: virObjectUnref(cfg); @@ -668,7 +666,7 @@ int qemuSetupCgroupForEmulator(virQEMUDriverPtr driver, virDomainDefPtr def = vm->def; unsigned long long period = vm->def->cputune.emulator_period; long long quota = vm->def->cputune.emulator_quota; - int rc, i; + int rc; if ((period || quota) && (!driver->cgroup || @@ -697,22 +695,13 @@ int qemuSetupCgroupForEmulator(virQEMUDriverPtr driver, goto cleanup; } - for (i = 0; i < VIR_CGROUP_CONTROLLER_LAST; i++) { - if (i != VIR_CGROUP_CONTROLLER_CPU && - i != VIR_CGROUP_CONTROLLER_CPUACCT && - i != VIR_CGROUP_CONTROLLER_CPUSET) - continue; - - if (!qemuCgroupControllerActive(driver, i)) - continue; - rc = virCgroupMoveTask(cgroup, cgroup_emulator, i); - if (rc < 0) { - virReportSystemError(-rc, - _("Unable to move tasks from domain cgroup to " - "emulator cgroup in controller %d for %s"), - i, vm->def->name); - goto cleanup; - } + rc = virCgroupMoveTask(cgroup, cgroup_emulator); + if (rc < 0) { + virReportSystemError(-rc, + _("Unable to move tasks from domain cgroup to " + "emulator cgroup for %s"), + vm->def->name); + goto cleanup; } if (def->placement_mode == VIR_DOMAIN_CPU_PLACEMENT_MODE_AUTO) { diff --git a/src/qemu/qemu_conf.c b/src/qemu/qemu_conf.c index c2e2e10..5a3dde0 100644 --- a/src/qemu/qemu_conf.c +++ b/src/qemu/qemu_conf.c @@ -134,14 +134,7 @@ virQEMUDriverConfigPtr virQEMUDriverConfigNew(bool privileged) } cfg->dynamicOwnership = privileged; - cfg->cgroupControllers = - (1 << VIR_CGROUP_CONTROLLER_CPU) | - (1 << VIR_CGROUP_CONTROLLER_DEVICES) | - (1 << VIR_CGROUP_CONTROLLER_MEMORY) | - (1 << VIR_CGROUP_CONTROLLER_BLKIO) | - (1 << 
VIR_CGROUP_CONTROLLER_CPUSET) | - (1 << VIR_CGROUP_CONTROLLER_CPUACCT); - + cfg->cgroupControllers = -1; /* -1 == auto-detect */ if (privileged) { if (virAsprintf(&cfg->logDir, @@ -454,6 +447,7 @@ int virQEMUDriverConfigLoadFile(virQEMUDriverConfigPtr cfg, p = virConfGetValue(conf, "cgroup_controllers"); CHECK_TYPE("cgroup_controllers", VIR_CONF_LIST); if (p) { + cfg->cgroupControllers = 0; virConfValuePtr pp; for (i = 0, pp = p->list; pp; ++i, pp = pp->next) { int ctl; @@ -472,12 +466,6 @@ int virQEMUDriverConfigLoadFile(virQEMUDriverConfigPtr cfg, cfg->cgroupControllers |= (1 << ctl); } } - for (i = 0 ; i < VIR_CGROUP_CONTROLLER_LAST ; i++) { - if (cfg->cgroupControllers & (1 << i)) { - VIR_INFO("Configured cgroup controller '%s'", - virCgroupControllerTypeToString(i)); - } - } p = virConfGetValue(conf, "cgroup_device_acl"); CHECK_TYPE("cgroup_device_acl", VIR_CONF_LIST); diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 552a81b..2809a77 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -628,7 +628,8 @@ qemuStartup(bool privileged, goto error; } - rc = virCgroupForDriver("qemu", &qemu_driver->cgroup, privileged, 1); + rc = virCgroupForDriver("qemu", &qemu_driver->cgroup, privileged, 1, + cfg->cgroupControllers); if (rc < 0) { VIR_INFO("Unable to create cgroup for driver: %s", virStrerror(-rc, ebuf, sizeof(ebuf))); diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c index 266cecb..085421e 100644 --- a/src/util/vircgroup.c +++ b/src/util/vircgroup.c @@ -70,8 +70,6 @@ typedef enum { * before creating subcgroups and * attaching tasks */ - VIR_CGROUP_VCPU = 1 << 1, /* create subdir only under the cgroup cpu, - * cpuacct and cpuset if possible. 
*/ } virCgroupFlags; /** @@ -230,11 +228,12 @@ no_memory: } -static int virCgroupDetect(virCgroupPtr group) +static int virCgroupDetect(virCgroupPtr group, + int controllers) { - int any = 0; int rc; int i; + int j; rc = virCgroupDetectMounts(group); if (rc < 0) { @@ -242,14 +241,55 @@ static int virCgroupDetect(virCgroupPtr group) return rc; } - /* Check that at least 1 controller is available */ - for (i = 0 ; i < VIR_CGROUP_CONTROLLER_LAST ; i++) { - if (group->controllers[i].mountPoint != NULL) - any = 1; + if (controllers >= 0) { + VIR_DEBUG("Validating controllers %d", controllers); + for (i = 0 ; i < VIR_CGROUP_CONTROLLER_LAST ; i++) { + VIR_DEBUG("Controller '%s' wanted=%s", + virCgroupControllerTypeToString(i), + (1 << i) & controllers ? "yes" : "no"); + if (((1 << i) & controllers)) { + /* Ensure requested controller is present */ + if (!group->controllers[i].mountPoint) { + VIR_DEBUG("Requested controlled '%s' not mounted", + virCgroupControllerTypeToString(i)); + return -ENOENT; + } + } else { + /* Check whether a request to disable a controller + * clashes with co-mounting of controllers */ + for (j = 0 ; j < VIR_CGROUP_CONTROLLER_LAST ; j++) { + if (j == i) + continue; + if (!((1 << j) & controllers)) + continue; + + if (STREQ_NULLABLE(group->controllers[i].mountPoint, + group->controllers[j].mountPoint)) { + VIR_DEBUG("Controller '%s' is not wanted, but '%s' is co-mounted", + virCgroupControllerTypeToString(i), + virCgroupControllerTypeToString(j)); + return -EINVAL; + } + } + VIR_FREE(group->controllers[i].mountPoint); + } + } + } else { + VIR_DEBUG("Auto-detecting controllers"); + controllers = 0; + for (i = 0 ; i < VIR_CGROUP_CONTROLLER_LAST ; i++) { + VIR_DEBUG("Controller '%s' present=%s", + virCgroupControllerTypeToString(i), + group->controllers[i].mountPoint ? 
"yes" : "no"); + if (group->controllers[i].mountPoint == NULL) + continue; + controllers |= (1 << i); + } } - if (!any) - return -ENXIO; + /* Check that at least 1 controller is available */ + if (!controllers) + return -ENXIO; rc = virCgroupDetectPlacement(group); @@ -542,16 +582,6 @@ static int virCgroupMakeGroup(virCgroupPtr parent, if (!group->controllers[i].mountPoint) continue; - /* We need to control cpu bandwidth for each vcpu now */ - if ((flags & VIR_CGROUP_VCPU) && - (i != VIR_CGROUP_CONTROLLER_CPU && - i != VIR_CGROUP_CONTROLLER_CPUACCT && - i != VIR_CGROUP_CONTROLLER_CPUSET)) { - /* treat it as unmounted and we can use virCgroupAddTask */ - VIR_FREE(group->controllers[i].mountPoint); - continue; - } - rc = virCgroupPathOfController(group, i, "", &path); if (rc < 0) return rc; @@ -611,12 +641,13 @@ static int virCgroupMakeGroup(virCgroupPtr parent, static int virCgroupNew(const char *path, + int controllers, virCgroupPtr *group) { int rc = 0; char *typpath = NULL; - VIR_DEBUG("New group %s", path); + VIR_DEBUG("path=%s controllers=%d", path, controllers); *group = NULL; if (VIR_ALLOC((*group)) != 0) { @@ -629,7 +660,7 @@ static int virCgroupNew(const char *path, goto err; } - rc = virCgroupDetect(*group); + rc = virCgroupDetect(*group, controllers); if (rc < 0) goto err; @@ -645,17 +676,18 @@ err: static int virCgroupAppRoot(bool privileged, virCgroupPtr *group, - bool create) + bool create, + int controllers) { virCgroupPtr rootgrp = NULL; int rc; - rc = virCgroupNew("/", &rootgrp); + rc = virCgroupNew("/", controllers, &rootgrp); if (rc != 0) return rc; if (privileged) { - rc = virCgroupNew("/libvirt", group); + rc = virCgroupNew("/libvirt", controllers, group); } else { char *rootname; char *username; @@ -671,7 +703,7 @@ static int virCgroupAppRoot(bool privileged, goto cleanup; } - rc = virCgroupNew(rootname, group); + rc = virCgroupNew(rootname, controllers, group); VIR_FREE(rootname); } if (rc != 0) @@ -779,6 +811,7 @@ int 
virCgroupRemove(virCgroupPtr group) return rc; } + /** * virCgroupAddTask: * @@ -872,45 +905,30 @@ cleanup: * * Returns: 0 on success or -errno on failure */ -int virCgroupMoveTask(virCgroupPtr src_group, virCgroupPtr dest_group, - int controller) +int virCgroupMoveTask(virCgroupPtr src_group, virCgroupPtr dest_group) { - int rc = 0, err = 0; + int rc = 0; char *content = NULL; + int i; - if (controller < VIR_CGROUP_CONTROLLER_CPU || - controller > VIR_CGROUP_CONTROLLER_BLKIO) - return -EINVAL; - - if (!src_group->controllers[controller].mountPoint || - !dest_group->controllers[controller].mountPoint) { - return -EINVAL; - } - - rc = virCgroupGetValueStr(src_group, controller, "tasks", &content); - if (rc != 0) - return rc; + for (i = 0 ; i < VIR_CGROUP_CONTROLLER_LAST ; i++) { + if (!src_group->controllers[i].mountPoint || + !dest_group->controllers[i].mountPoint) + continue; - rc = virCgroupAddTaskStrController(dest_group, content, controller); - if (rc != 0) - goto cleanup; + rc = virCgroupGetValueStr(src_group, i, "tasks", &content); + if (rc != 0) + return rc; - VIR_FREE(content); + rc = virCgroupAddTaskStrController(dest_group, content, i); + if (rc != 0) + goto cleanup; - return 0; + VIR_FREE(content); + } cleanup: - /* - * We don't need to recover dest_cgroup because cgroup will make sure - * that one task only resides in one cgroup of the same controller. 
- */ - err = virCgroupAddTaskStrController(src_group, content, controller); - if (err != 0) - VIR_ERROR(_("Cannot recover cgroup %s from %s"), - src_group->controllers[controller].mountPoint, - dest_group->controllers[controller].mountPoint); VIR_FREE(content); - return rc; } @@ -926,13 +944,15 @@ cleanup: int virCgroupForDriver(const char *name, virCgroupPtr *group, bool privileged, - bool create) + bool create, + int controllers) { int rc; char *path = NULL; virCgroupPtr rootgrp = NULL; - rc = virCgroupAppRoot(privileged, &rootgrp, create); + rc = virCgroupAppRoot(privileged, &rootgrp, + create, controllers); if (rc != 0) goto out; @@ -941,7 +961,7 @@ int virCgroupForDriver(const char *name, goto out; } - rc = virCgroupNew(path, group); + rc = virCgroupNew(path, controllers, group); VIR_FREE(path); if (rc == 0) { @@ -979,7 +999,7 @@ int virCgroupForDriver(const char *name ATTRIBUTE_UNUSED, #if defined HAVE_MNTENT_H && defined HAVE_GETMNTENT_R int virCgroupForSelf(virCgroupPtr *group) { - return virCgroupNew("/", group); + return virCgroupNew("/", -1, group); } #else int virCgroupForSelf(virCgroupPtr *group ATTRIBUTE_UNUSED) @@ -1012,7 +1032,7 @@ int virCgroupForDomain(virCgroupPtr driver, if (virAsprintf(&path, "%s/%s", driver->path, name) < 0) return -ENOMEM; - rc = virCgroupNew(path, group); + rc = virCgroupNew(path, -1, group); VIR_FREE(path); if (rc == 0) { @@ -1060,6 +1080,7 @@ int virCgroupForVcpu(virCgroupPtr driver, { int rc; char *path; + int controllers; if (driver == NULL) return -EINVAL; @@ -1067,11 +1088,15 @@ int virCgroupForVcpu(virCgroupPtr driver, if (virAsprintf(&path, "%s/vcpu%d", driver->path, vcpuid) < 0) return -ENOMEM; - rc = virCgroupNew(path, group); + controllers = ((1 << VIR_CGROUP_CONTROLLER_CPU) | + (1 << VIR_CGROUP_CONTROLLER_CPUACCT) | + (1 << VIR_CGROUP_CONTROLLER_CPUSET)); + + rc = virCgroupNew(path, controllers, group); VIR_FREE(path); if (rc == 0) { - rc = virCgroupMakeGroup(driver, *group, create, VIR_CGROUP_VCPU); + rc = 
virCgroupMakeGroup(driver, *group, create, VIR_CGROUP_NONE); if (rc != 0) virCgroupFree(group); } @@ -1103,6 +1128,7 @@ int virCgroupForEmulator(virCgroupPtr driver, { int rc; char *path; + int controllers; if (driver == NULL) return -EINVAL; @@ -1110,11 +1136,15 @@ int virCgroupForEmulator(virCgroupPtr driver, if (virAsprintf(&path, "%s/emulator", driver->path) < 0) return -ENOMEM; - rc = virCgroupNew(path, group); + controllers = ((1 << VIR_CGROUP_CONTROLLER_CPU) | + (1 << VIR_CGROUP_CONTROLLER_CPUACCT) | + (1 << VIR_CGROUP_CONTROLLER_CPUSET)); + + rc = virCgroupNew(path, controllers, group); VIR_FREE(path); if (rc == 0) { - rc = virCgroupMakeGroup(driver, *group, create, VIR_CGROUP_VCPU); + rc = virCgroupMakeGroup(driver, *group, create, VIR_CGROUP_NONE); if (rc != 0) virCgroupFree(group); } @@ -2014,7 +2044,7 @@ static int virCgroupKillRecursiveInternal(virCgroupPtr group, int signum, virHas goto cleanup; } - if ((rc = virCgroupNew(subpath, &subgroup)) != 0) + if ((rc = virCgroupNew(subpath, -1, &subgroup)) != 0) goto cleanup; if ((rc = virCgroupKillRecursiveInternal(subgroup, signum, pids, true)) < 0) diff --git a/src/util/vircgroup.h b/src/util/vircgroup.h index 45a2006..725d2d0 100644 --- a/src/util/vircgroup.h +++ b/src/util/vircgroup.h @@ -47,7 +47,8 @@ VIR_ENUM_DECL(virCgroupController); int virCgroupForDriver(const char *name, virCgroupPtr *group, bool privileged, - bool create); + bool create, + int controllers); int virCgroupForSelf(virCgroupPtr *group); @@ -77,8 +78,7 @@ int virCgroupAddTaskController(virCgroupPtr group, int controller); int virCgroupMoveTask(virCgroupPtr src_group, - virCgroupPtr dest_group, - int controller); + virCgroupPtr dest_group); int virCgroupSetBlkioWeight(virCgroupPtr group, unsigned int weight); int virCgroupGetBlkioWeight(virCgroupPtr group, unsigned int *weight); -- 1.8.1.4
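
The controller validation this patch adds to virCgroupDetect() can be sketched in isolation as follows. This is only an illustration under stated assumptions: the CTRL_* ids, the plain array of mount points and the name resolve_controllers() are hypothetical stand-ins, not libvirt's types or API. A negative request auto-detects, a requested controller must be mounted, and an unwanted controller must not share a mount point with a wanted one.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical controller ids; the real enum lives in vircgroup.h. */
enum { CTRL_CPU, CTRL_CPUACCT, CTRL_CPUSET, CTRL_MEMORY, CTRL_LAST };

/*
 * Resolve a requested controller mask against the detected mounts.
 * mounts[i] is the mount point of controller i, or NULL if unmounted.
 * Returns the effective (non-zero) mask, or a negative errno value.
 */
static int resolve_controllers(const char *const mounts[CTRL_LAST], int wanted)
{
    int i, j;

    assert(mounts != NULL);

    if (wanted < 0) {
        /* Auto-detect: use every controller that is mounted. */
        wanted = 0;
        for (i = 0; i < CTRL_LAST; i++)
            if (mounts[i])
                wanted |= 1 << i;
    } else {
        for (i = 0; i < CTRL_LAST; i++) {
            if (wanted & (1 << i)) {
                if (!mounts[i])
                    return -ENOENT;     /* requested but not mounted */
                continue;
            }
            if (!mounts[i])
                continue;
            /* An unwanted controller must not be co-mounted with a
             * wanted one, since the directory would be shared. */
            for (j = 0; j < CTRL_LAST; j++) {
                if (j != i && (wanted & (1 << j)) && mounts[j] &&
                    strcmp(mounts[i], mounts[j]) == 0)
                    return -EINVAL;
            }
        }
    }

    /* At least one controller must remain. */
    return wanted ? wanted : -ENXIO;
}
```

With cpu and cpuacct co-mounted, asking for cpu alone is rejected with -EINVAL, while asking for both (or passing -1) succeeds.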

On 04/04/2013 07:40 AM, Daniel P. Berrange wrote:
> From: "Daniel P. Berrange" <berrange@redhat.com>
>
> Currently when getting an instance of virCgroupPtr we will
> create the path in all cgroup controllers. Only at the virt
> driver layer are we attempting to filter controllers. This
> is bad because the mere act of creating the dirs in the
> controllers can have a functional impact on the kernel,
> particularly for performance.
>
> Update the virCgroupForDriver() method to accept a bitmask
> of controllers to use. Only create dirs in the controllers
> that are requested. When creating cgroups for domains,
> respect the active controller list from the parent cgroup
>
> Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
> ---

> +++ b/src/lxc/lxc_cgroup.c
> @@ -530,7 +530,7 @@ virCgroupPtr virLXCCgroupCreate(virDomainDefPtr def)
>      int ret = -1;
>      int rc;
>
> -    rc = virCgroupForDriver("lxc", &driver, 1, 0);
> +    rc = virCgroupForDriver("lxc", &driver, 1, 0, -1);

Do we want -1 to always mean all possible groups, or should we define a
symbolic constant which contains the mask of the maximum groups and
leave remaining bits 0? Then again, this is never passed over the wire,
so it is not an RPC compatibility concern, so I guess it doesn't matter.

> +++ b/src/qemu/qemu_conf.c
> @@ -134,14 +134,7 @@ virQEMUDriverConfigPtr virQEMUDriverConfigNew(bool privileged)
>      }
>
>      cfg->dynamicOwnership = privileged;
> -    cfg->cgroupControllers =
> -        (1 << VIR_CGROUP_CONTROLLER_CPU) |
> -        (1 << VIR_CGROUP_CONTROLLER_DEVICES) |
> -        (1 << VIR_CGROUP_CONTROLLER_MEMORY) |
> -        (1 << VIR_CGROUP_CONTROLLER_BLKIO) |
> -        (1 << VIR_CGROUP_CONTROLLER_CPUSET) |
> -        (1 << VIR_CGROUP_CONTROLLER_CPUACCT);
> -
> +    cfg->cgroupControllers = -1; /* -1 == auto-detect */

Oh, -1 isn't _all_ controllers, but auto-detect. Now it makes more
sense. ACK.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
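
The difference between "auto-detect" and "all controllers" raised in this review can be made concrete with a small sketch. The names CTRL_AUTODETECT, CTRL_MASK_ALL and controller_in_use() are hypothetical, introduced only for illustration; the patch itself passes a bare -1 and defines no such constants.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the VIR_CGROUP_CONTROLLER_* enum. */
enum {
    CTRL_CPU,
    CTRL_CPUACCT,
    CTRL_CPUSET,
    CTRL_MEMORY,
    CTRL_DEVICES,
    CTRL_FREEZER,
    CTRL_BLKIO,
    CTRL_LAST
};

/* Hypothetical names; the patch uses the literal -1 for auto-detect. */
#define CTRL_AUTODETECT (-1)
#define CTRL_MASK_ALL   ((1 << CTRL_LAST) - 1)

/*
 * Decide whether controller 'id' ends up in use, given the caller's
 * request and whether the controller is mounted on this host.
 * Returns 1 if used, 0 if skipped, -1 if the request cannot be met.
 */
static int controller_in_use(int requested, int id, bool mounted)
{
    assert(id >= 0 && id < CTRL_LAST);

    /* Must be tested first: as a bit pattern, -1 has every bit set
     * and would otherwise look like "all controllers". */
    if (requested < 0)
        return mounted ? 1 : 0;     /* auto-detect: follow the host */
    if (!(requested & (1 << id)))
        return 0;                   /* explicitly left out */
    return mounted ? 1 : -1;        /* explicitly wanted: must be mounted */
}
```

On a host without blkio mounted, CTRL_AUTODETECT quietly skips that controller, whereas CTRL_MASK_ALL turns its absence into an error.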

From: "Daniel P. Berrange" <berrange@redhat.com>

The virCgroupMounted method is badly named, since a controller can
be mounted, but disabled in the current object. Rename the method
to be virCgroupHasController. Also make it tolerant to a NULL
virCgroupPtr and out-of-range controller index, to avoid duplication
of these checks in all callers

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
---
 src/libvirt_private.syms |  2 +-
 src/lxc/lxc_driver.c     | 12 +-----------
 src/lxc/lxc_process.c    | 10 +++++-----
 src/qemu/qemu_cgroup.c   | 14 +-------------
 src/util/vircgroup.c     | 13 +++++++++----
 src/util/vircgroup.h     |  2 +-
 6 files changed, 18 insertions(+), 35 deletions(-)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index b9c656e..4db0734 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -1116,10 +1116,10 @@ virCgroupGetMemorySoftLimit;
 virCgroupGetMemoryUsage;
 virCgroupGetMemSwapHardLimit;
 virCgroupGetMemSwapUsage;
+virCgroupHasController;
 virCgroupKill;
 virCgroupKillPainfully;
 virCgroupKillRecursive;
-virCgroupMounted;
 virCgroupMoveTask;
 virCgroupPathOfController;
 virCgroupRemove;
diff --git a/src/lxc/lxc_driver.c b/src/lxc/lxc_driver.c
index 9c6f858..ea056c8 100644
--- a/src/lxc/lxc_driver.c
+++ b/src/lxc/lxc_driver.c
@@ -1641,17 +1641,7 @@ cleanup:
 static bool lxcCgroupControllerActive(virLXCDriverPtr driver,
                                       int controller)
 {
-    if (driver->cgroup == NULL)
-        return false;
-    if (controller < 0 || controller >= VIR_CGROUP_CONTROLLER_LAST)
-        return false;
-    if (!virCgroupMounted(driver->cgroup, controller))
-        return false;
-#if 0
-    if (driver->cgroupControllers & (1 << controller))
-        return true;
-#endif
-    return true;
+    return virCgroupHasController(driver->cgroup, controller);
 }
 
 
diff --git a/src/lxc/lxc_process.c b/src/lxc/lxc_process.c
index f2f66e4..f311f63 100644
--- a/src/lxc/lxc_process.c
+++ b/src/lxc/lxc_process.c
@@ -1053,20 +1053,20 @@ int virLXCProcessStart(virConnectPtr conn,
         return -1;
     }
 
-    if (!virCgroupMounted(lxc_driver->cgroup,
+    if (!virCgroupHasController(lxc_driver->cgroup,
                           VIR_CGROUP_CONTROLLER_CPUACCT)) {
         virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
                        _("Unable to find 'cpuacct' cgroups controller mount"));
         return -1;
     }
-    if (!virCgroupMounted(lxc_driver->cgroup,
-                          VIR_CGROUP_CONTROLLER_DEVICES)) {
+    if (!virCgroupHasController(lxc_driver->cgroup,
+                                VIR_CGROUP_CONTROLLER_DEVICES)) {
         virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
                        _("Unable to find 'devices' cgroups controller mount"));
         return -1;
     }
-    if (!virCgroupMounted(lxc_driver->cgroup,
-                          VIR_CGROUP_CONTROLLER_MEMORY)) {
+    if (!virCgroupHasController(lxc_driver->cgroup,
+                                VIR_CGROUP_CONTROLLER_MEMORY)) {
         virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
                        _("Unable to find 'memory' cgroups controller mount"));
         return -1;
diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c
index 2cdc2b7..5aa9416 100644
--- a/src/qemu/qemu_cgroup.c
+++ b/src/qemu/qemu_cgroup.c
@@ -48,19 +48,7 @@ static const char *const defaultDeviceACL[] = {
 bool qemuCgroupControllerActive(virQEMUDriverPtr driver,
                                 int controller)
 {
-    virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver);
-    bool ret = false;
-
-    if (driver->cgroup == NULL)
-        goto cleanup;
-    if (controller < 0 || controller >= VIR_CGROUP_CONTROLLER_LAST)
-        goto cleanup;
-    if (!virCgroupMounted(driver->cgroup, controller))
-        goto cleanup;
-
-cleanup:
-    virObjectUnref(cfg);
-    return ret;
+    return virCgroupHasController(driver->cgroup, controller);
 }
 
 static int
diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c
index 085421e..dc2b431 100644
--- a/src/util/vircgroup.c
+++ b/src/util/vircgroup.c
@@ -94,15 +94,20 @@ void virCgroupFree(virCgroupPtr *group)
 }
 
 /**
- * virCgroupMounted: query whether a cgroup subsystem is mounted or not
+ * virCgroupHasController: query whether a cgroup controller is present
  *
- * @cgroup: The group structure to be queried
+ * @cgroup: The group structure to be queried, or NULL
  * @controller: cgroup subsystem id
  *
- * Returns true if a cgroup is subsystem is mounted.
+ * Returns true if a cgroup controller is mounted and is associated
+ * with this cgroup object.
  */
-bool virCgroupMounted(virCgroupPtr cgroup, int controller)
+bool virCgroupHasController(virCgroupPtr cgroup, int controller)
 {
+    if (!cgroup)
+        return false;
+    if (controller < 0 || controller >= VIR_CGROUP_CONTROLLER_LAST)
+        return false;
     return cgroup->controllers[controller].mountPoint != NULL;
 }
 
diff --git a/src/util/vircgroup.h b/src/util/vircgroup.h
index 725d2d0..4c1134d 100644
--- a/src/util/vircgroup.h
+++ b/src/util/vircgroup.h
@@ -161,7 +161,7 @@ int virCgroupRemoveRecursively(char *grppath);
 int virCgroupRemove(virCgroupPtr group);
 
 void virCgroupFree(virCgroupPtr *group);
-bool virCgroupMounted(virCgroupPtr cgroup, int controller);
+bool virCgroupHasController(virCgroupPtr cgroup, int controller);
 
 int virCgroupKill(virCgroupPtr group, int signum);
 int virCgroupKillRecursive(virCgroupPtr group, int signum);
-- 
1.8.1.4

On 04/04/2013 03:40 PM, Daniel P. Berrange wrote:
> From: "Daniel P. Berrange" <berrange@redhat.com>
>
> The virCgroupMounted method is badly named, since a controller can
> be mounted, but disabled in the current object. Rename the method
> to be virCgroupHasController. Also make it tolerant to a NULL
> virCgroupPtr and out-of-range controller index, to avoid duplication
> of these checks in all callers
>
> Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
> ---
>  src/libvirt_private.syms |  2 +-
>  src/lxc/lxc_driver.c     | 12 +-----------
>  src/lxc/lxc_process.c    | 10 +++++-----
>  src/qemu/qemu_cgroup.c   | 14 +-------------
>  src/util/vircgroup.c     | 13 +++++++++----
>  src/util/vircgroup.h     |  2 +-
>  6 files changed, 18 insertions(+), 35 deletions(-)

ACK

Jan
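
A standalone sketch of why moving the NULL and range checks into the query function simplifies callers. The struct layout, the CTRL_* ids and the lower-case function names below are simplified, hypothetical stand-ins for libvirt's virCgroupPtr and virCgroupHasController(), written only to show the pattern.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced controller list. */
enum { CTRL_CPU, CTRL_MEMORY, CTRL_LAST };

/* Hypothetical stand-ins for the cgroup object. */
struct controller {
    const char *mount_point;    /* NULL when not usable in this object */
};

struct cgroup {
    struct controller controllers[CTRL_LAST];
};

/* Tolerates a NULL group and an out-of-range id, so callers need
 * no guards of their own. */
static bool has_controller(const struct cgroup *cg, int id)
{
    if (!cg)
        return false;
    if (id < 0 || id >= CTRL_LAST)
        return false;
    return cg->controllers[id].mount_point != NULL;
}

/* Example caller: no pre-checks required before querying. */
static void self_check(void)
{
    struct cgroup cg = { { { "/sys/fs/cgroup/cpu" }, { NULL } } };

    assert(has_controller(&cg, CTRL_CPU));
    assert(!has_controller(&cg, CTRL_MEMORY));  /* not present here */
    assert(!has_controller(&cg, CTRL_LAST));    /* out of range */
    assert(!has_controller(NULL, CTRL_CPU));    /* no group at all */
}
```

Each driver-level wrapper can then shrink to a single call, as the lxc and qemu hunks in the patch do.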

From: "Daniel P. Berrange" <berrange@redhat.com>

Instead of calling virCgroupForDomain every time we need
the virCgroupPtr instance, just do it once at VM startup
and cache a reference to the object in qemuDomainObjPrivatePtr
until shutdown of the VM. Removing the virCgroupPtr from
the QEMU driver state also means we don't have stale mount
info, if someone mounts the cgroups filesystem after libvirtd
has been started

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
---
 src/qemu/qemu_cgroup.c    | 283 +++++++++++++++------------------
 src/qemu/qemu_cgroup.h    |  22 +--
 src/qemu/qemu_conf.h      |   4 -
 src/qemu/qemu_domain.c    |   1 +
 src/qemu/qemu_domain.h    |   3 +
 src/qemu/qemu_driver.c    | 397 +++++++++++++++-------------------------------
 src/qemu/qemu_hotplug.c   |  53 +------
 src/qemu/qemu_migration.c |  25 +--
 src/qemu/qemu_process.c   |  13 +-
 9 files changed, 291 insertions(+), 510 deletions(-)

diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c
index 5aa9416..019aa2e 100644
--- a/src/qemu/qemu_cgroup.c
+++ b/src/qemu/qemu_cgroup.c
@@ -45,26 +45,21 @@ static const char *const defaultDeviceACL[] = {
 #define DEVICE_PTY_MAJOR 136
 #define DEVICE_SND_MAJOR 116
 
-bool qemuCgroupControllerActive(virQEMUDriverPtr driver,
-                                int controller)
-{
-    return virCgroupHasController(driver->cgroup, controller);
-}
-
 static int
 qemuSetupDiskPathAllow(virDomainDiskDefPtr disk,
                        const char *path,
                        size_t depth ATTRIBUTE_UNUSED,
                        void *opaque)
 {
-    qemuCgroupData *data = opaque;
+    virDomainObjPtr vm = opaque;
+    qemuDomainObjPrivatePtr priv = vm->privateData;
     int rc;
 
     VIR_DEBUG("Process path %s for disk", path);
-    rc = virCgroupAllowDevicePath(data->cgroup, path,
+    rc = virCgroupAllowDevicePath(priv->cgroup, path,
                                   (disk->readonly ? VIR_CGROUP_DEVICE_READ
                                    : VIR_CGROUP_DEVICE_RW));
-    virDomainAuditCgroupPath(data->vm, data->cgroup, "allow", path,
+    virDomainAuditCgroupPath(vm, priv->cgroup, "allow", path,
                              disk->readonly ?
"r" : "rw", rc); if (rc < 0) { if (rc == -EACCES) { /* Get this for root squash NFS */ @@ -81,14 +76,18 @@ qemuSetupDiskPathAllow(virDomainDiskDefPtr disk, int qemuSetupDiskCgroup(virDomainObjPtr vm, - virCgroupPtr cgroup, virDomainDiskDefPtr disk) { - qemuCgroupData data = { vm, cgroup }; + qemuDomainObjPrivatePtr priv = vm->privateData; + + if (!virCgroupHasController(priv->cgroup, + VIR_CGROUP_CONTROLLER_DEVICES)) + return 0; + return virDomainDiskDefForeachPath(disk, true, qemuSetupDiskPathAllow, - &data); + vm); } @@ -98,13 +97,14 @@ qemuTeardownDiskPathDeny(virDomainDiskDefPtr disk ATTRIBUTE_UNUSED, size_t depth ATTRIBUTE_UNUSED, void *opaque) { - qemuCgroupData *data = opaque; + virDomainObjPtr vm = opaque; + qemuDomainObjPrivatePtr priv = vm->privateData; int rc; VIR_DEBUG("Process path %s for disk", path); - rc = virCgroupDenyDevicePath(data->cgroup, path, + rc = virCgroupDenyDevicePath(priv->cgroup, path, VIR_CGROUP_DEVICE_RWM); - virDomainAuditCgroupPath(data->vm, data->cgroup, "deny", path, "rwm", rc); + virDomainAuditCgroupPath(vm, priv->cgroup, "deny", path, "rwm", rc); if (rc < 0) { if (rc == -EACCES) { /* Get this for root squash NFS */ VIR_DEBUG("Ignoring EACCES for %s", path); @@ -120,14 +120,18 @@ qemuTeardownDiskPathDeny(virDomainDiskDefPtr disk ATTRIBUTE_UNUSED, int qemuTeardownDiskCgroup(virDomainObjPtr vm, - virCgroupPtr cgroup, virDomainDiskDefPtr disk) { - qemuCgroupData data = { vm, cgroup }; + qemuDomainObjPrivatePtr priv = vm->privateData; + + if (!virCgroupHasController(priv->cgroup, + VIR_CGROUP_CONTROLLER_DEVICES)) + return 0; + return virDomainDiskDefForeachPath(disk, true, qemuTeardownDiskPathDeny, - &data); + vm); } @@ -136,7 +140,8 @@ qemuSetupChardevCgroup(virDomainDefPtr def, virDomainChrDefPtr dev, void *opaque) { - qemuCgroupData *data = opaque; + virDomainObjPtr vm = opaque; + qemuDomainObjPrivatePtr priv = vm->privateData; int rc; if (dev->source.type != VIR_DOMAIN_CHR_TYPE_DEV) @@ -144,9 +149,9 @@ 
qemuSetupChardevCgroup(virDomainDefPtr def, VIR_DEBUG("Process path '%s' for disk", dev->source.data.file.path); - rc = virCgroupAllowDevicePath(data->cgroup, dev->source.data.file.path, + rc = virCgroupAllowDevicePath(priv->cgroup, dev->source.data.file.path, VIR_CGROUP_DEVICE_RW); - virDomainAuditCgroupPath(data->vm, data->cgroup, "allow", + virDomainAuditCgroupPath(vm, priv->cgroup, "allow", dev->source.data.file.path, "rw", rc); if (rc < 0) { virReportSystemError(-rc, @@ -163,13 +168,14 @@ int qemuSetupHostUsbDeviceCgroup(virUSBDevicePtr dev ATTRIBUTE_UNUSED, const char *path, void *opaque) { - qemuCgroupData *data = opaque; + virDomainObjPtr vm = opaque; + qemuDomainObjPrivatePtr priv = vm->privateData; int rc; VIR_DEBUG("Process path '%s' for USB device", path); - rc = virCgroupAllowDevicePath(data->cgroup, path, + rc = virCgroupAllowDevicePath(priv->cgroup, path, VIR_CGROUP_DEVICE_RW); - virDomainAuditCgroupPath(data->vm, data->cgroup, "allow", path, "rw", rc); + virDomainAuditCgroupPath(vm, priv->cgroup, "allow", path, "rw", rc); if (rc < 0) { virReportSystemError(-rc, _("Unable to allow device %s"), @@ -180,34 +186,73 @@ int qemuSetupHostUsbDeviceCgroup(virUSBDevicePtr dev ATTRIBUTE_UNUSED, return 0; } + +int qemuInitCgroup(virQEMUDriverPtr driver, + virDomainObjPtr vm) +{ + int rc; + qemuDomainObjPrivatePtr priv = vm->privateData; + virCgroupPtr driverGroup = NULL; + virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver); + + virCgroupFree(&priv->cgroup); + + rc = virCgroupForDriver("qemu", &driverGroup, + cfg->privileged, true, + cfg->cgroupControllers); + if (rc != 0) { + if (rc == -ENXIO || + rc == -EPERM || + rc == -EACCES) { /* No cgroups mounts == success */ + VIR_DEBUG("No cgroups present/configured/accessible, ignoring error"); + goto done; + } + + virReportSystemError(-rc, + _("Unable to create cgroup for %s"), + vm->def->name); + goto cleanup; + } + + rc = virCgroupForDomain(driverGroup, vm->def->name, &priv->cgroup, 1); + if (rc != 0) { + 
virReportSystemError(-rc, + _("Unable to create cgroup for %s"), + vm->def->name); + goto cleanup; + } + +done: + rc = 0; +cleanup: + virCgroupFree(&driverGroup); + virObjectUnref(cfg); + return rc; +} + + int qemuSetupCgroup(virQEMUDriverPtr driver, virDomainObjPtr vm, virBitmapPtr nodemask) { - virCgroupPtr cgroup = NULL; - int rc; + int rc = -1; unsigned int i; virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver); + qemuDomainObjPrivatePtr priv = vm->privateData; const char *const *deviceACL = cfg->cgroupDeviceACL ? (const char *const *)cfg->cgroupDeviceACL : defaultDeviceACL; - if (driver->cgroup == NULL) - goto done; /* Not supported, so claim success */ + if (qemuInitCgroup(driver, vm) < 0) + return -1; - rc = virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 1); - if (rc != 0) { - virReportSystemError(-rc, - _("Unable to create cgroup for %s"), - vm->def->name); - goto cleanup; - } + if (!priv->cgroup) + goto done; - if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES)) { - qemuCgroupData data = { vm, cgroup }; - rc = virCgroupDenyAllDevices(cgroup); - virDomainAuditCgroup(vm, cgroup, "deny", "all", rc == 0); + if (virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_DEVICES)) { + rc = virCgroupDenyAllDevices(priv->cgroup); + virDomainAuditCgroup(vm, priv->cgroup, "deny", "all", rc == 0); if (rc != 0) { if (rc == -EPERM) { VIR_WARN("Group devices ACL is not accessible, disabling whitelisting"); @@ -220,13 +265,13 @@ int qemuSetupCgroup(virQEMUDriverPtr driver, } for (i = 0; i < vm->def->ndisks ; i++) { - if (qemuSetupDiskCgroup(vm, cgroup, vm->def->disks[i]) < 0) + if (qemuSetupDiskCgroup(vm,vm->def->disks[i]) < 0) goto cleanup; } - rc = virCgroupAllowDeviceMajor(cgroup, 'c', DEVICE_PTY_MAJOR, + rc = virCgroupAllowDeviceMajor(priv->cgroup, 'c', DEVICE_PTY_MAJOR, VIR_CGROUP_DEVICE_RW); - virDomainAuditCgroupMajor(vm, cgroup, "allow", DEVICE_PTY_MAJOR, + virDomainAuditCgroupMajor(vm, priv->cgroup, "allow", 
DEVICE_PTY_MAJOR, "pty", "rw", rc == 0); if (rc != 0) { virReportSystemError(-rc, "%s", @@ -239,9 +284,9 @@ int qemuSetupCgroup(virQEMUDriverPtr driver, ((vm->def->graphics[0]->type == VIR_DOMAIN_GRAPHICS_TYPE_VNC && cfg->vncAllowHostAudio) || (vm->def->graphics[0]->type == VIR_DOMAIN_GRAPHICS_TYPE_SDL)))) { - rc = virCgroupAllowDeviceMajor(cgroup, 'c', DEVICE_SND_MAJOR, + rc = virCgroupAllowDeviceMajor(priv->cgroup, 'c', DEVICE_SND_MAJOR, VIR_CGROUP_DEVICE_RW); - virDomainAuditCgroupMajor(vm, cgroup, "allow", DEVICE_SND_MAJOR, + virDomainAuditCgroupMajor(vm, priv->cgroup, "allow", DEVICE_SND_MAJOR, "sound", "rw", rc == 0); if (rc != 0) { virReportSystemError(-rc, "%s", @@ -257,9 +302,9 @@ int qemuSetupCgroup(virQEMUDriverPtr driver, continue; } - rc = virCgroupAllowDevicePath(cgroup, deviceACL[i], + rc = virCgroupAllowDevicePath(priv->cgroup, deviceACL[i], VIR_CGROUP_DEVICE_RW); - virDomainAuditCgroupPath(vm, cgroup, "allow", deviceACL[i], "rw", rc); + virDomainAuditCgroupPath(vm, priv->cgroup, "allow", deviceACL[i], "rw", rc); if (rc < 0 && rc != -ENOENT) { virReportSystemError(-rc, @@ -272,7 +317,7 @@ int qemuSetupCgroup(virQEMUDriverPtr driver, if (virDomainChrDefForeach(vm->def, true, qemuSetupChardevCgroup, - &data) < 0) + vm) < 0) goto cleanup; for (i = 0; i < vm->def->nhostdevs; i++) { @@ -292,7 +337,7 @@ int qemuSetupCgroup(virQEMUDriverPtr driver, goto cleanup; if (virUSBDeviceFileIterate(usb, qemuSetupHostUsbDeviceCgroup, - &data) < 0) { + vm) < 0) { virUSBDeviceFree(usb); goto cleanup; } @@ -301,8 +346,8 @@ int qemuSetupCgroup(virQEMUDriverPtr driver, } if (vm->def->blkio.weight != 0) { - if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_BLKIO)) { - rc = virCgroupSetBlkioWeight(cgroup, vm->def->blkio.weight); + if (virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_BLKIO)) { + rc = virCgroupSetBlkioWeight(priv->cgroup, vm->def->blkio.weight); if (rc != 0) { virReportSystemError(-rc, _("Unable to set io weight for domain %s"), @@ 
-317,12 +362,12 @@ int qemuSetupCgroup(virQEMUDriverPtr driver, } if (vm->def->blkio.ndevices) { - if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_BLKIO)) { + if (virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_BLKIO)) { for (i = 0; i < vm->def->blkio.ndevices; i++) { virBlkioDeviceWeightPtr dw = &vm->def->blkio.devices[i]; if (!dw->weight) continue; - rc = virCgroupSetBlkioDeviceWeight(cgroup, dw->path, + rc = virCgroupSetBlkioDeviceWeight(priv->cgroup, dw->path, dw->weight); if (rc != 0) { virReportSystemError(-rc, @@ -339,7 +384,7 @@ int qemuSetupCgroup(virQEMUDriverPtr driver, } } - if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_MEMORY)) { + if (virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_MEMORY)) { unsigned long long hard_limit = vm->def->mem.hard_limit; if (!hard_limit) { @@ -357,7 +402,7 @@ int qemuSetupCgroup(virQEMUDriverPtr driver, hard_limit += vm->def->ndisks * 32768; } - rc = virCgroupSetMemoryHardLimit(cgroup, hard_limit); + rc = virCgroupSetMemoryHardLimit(priv->cgroup, hard_limit); if (rc != 0) { virReportSystemError(-rc, _("Unable to set memory hard limit for domain %s"), @@ -365,7 +410,7 @@ int qemuSetupCgroup(virQEMUDriverPtr driver, goto cleanup; } if (vm->def->mem.soft_limit != 0) { - rc = virCgroupSetMemorySoftLimit(cgroup, vm->def->mem.soft_limit); + rc = virCgroupSetMemorySoftLimit(priv->cgroup, vm->def->mem.soft_limit); if (rc != 0) { virReportSystemError(-rc, _("Unable to set memory soft limit for domain %s"), @@ -375,7 +420,7 @@ int qemuSetupCgroup(virQEMUDriverPtr driver, } if (vm->def->mem.swap_hard_limit != 0) { - rc = virCgroupSetMemSwapHardLimit(cgroup, vm->def->mem.swap_hard_limit); + rc = virCgroupSetMemSwapHardLimit(priv->cgroup, vm->def->mem.swap_hard_limit); if (rc != 0) { virReportSystemError(-rc, _("Unable to set swap hard limit for domain %s"), @@ -393,8 +438,8 @@ int qemuSetupCgroup(virQEMUDriverPtr driver, } if (vm->def->cputune.shares != 0) { - if 
(qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPU)) { - rc = virCgroupSetCpuShares(cgroup, vm->def->cputune.shares); + if (virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU)) { + rc = virCgroupSetCpuShares(priv->cgroup, vm->def->cputune.shares); if (rc != 0) { virReportSystemError(-rc, _("Unable to set io cpu shares for domain %s"), @@ -411,7 +456,7 @@ int qemuSetupCgroup(virQEMUDriverPtr driver, (vm->def->numatune.memory.placement_mode == VIR_NUMA_TUNE_MEM_PLACEMENT_MODE_AUTO)) && vm->def->numatune.memory.mode == VIR_DOMAIN_NUMATUNE_MEM_STRICT && - qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPUSET)) { + virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) { char *mask = NULL; if (vm->def->numatune.memory.placement_mode == VIR_NUMA_TUNE_MEM_PLACEMENT_MODE_AUTO) @@ -424,7 +469,7 @@ int qemuSetupCgroup(virQEMUDriverPtr driver, goto cleanup; } - rc = virCgroupSetCpusetMems(cgroup, mask); + rc = virCgroupSetCpusetMems(priv->cgroup, mask); VIR_FREE(mask); if (rc != 0) { virReportSystemError(-rc, @@ -433,18 +478,12 @@ int qemuSetupCgroup(virQEMUDriverPtr driver, goto cleanup; } } -done: - virObjectUnref(cfg); - virCgroupFree(&cgroup); - return 0; +done: + rc = 0; cleanup: virObjectUnref(cfg); - if (cgroup) { - virCgroupRemove(cgroup); - virCgroupFree(&cgroup); - } - return -1; + return rc == 0 ? 
0 : -1; } int qemuSetupCgroupVcpuBW(virCgroupPtr cgroup, unsigned long long period, @@ -538,9 +577,8 @@ cleanup: return rc; } -int qemuSetupCgroupForVcpu(virQEMUDriverPtr driver, virDomainObjPtr vm) +int qemuSetupCgroupForVcpu(virDomainObjPtr vm) { - virCgroupPtr cgroup = NULL; virCgroupPtr cgroup_vcpu = NULL; qemuDomainObjPrivatePtr priv = vm->privateData; virDomainDefPtr def = vm->def; @@ -550,8 +588,7 @@ int qemuSetupCgroupForVcpu(virQEMUDriverPtr driver, virDomainObjPtr vm) long long quota = vm->def->cputune.quota; if ((period || quota) && - (!driver->cgroup || - !qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPU))) { + !virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU)) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("cgroup cpu is required for scheduler tuning")); return -1; @@ -561,28 +598,19 @@ int qemuSetupCgroupForVcpu(virQEMUDriverPtr driver, virDomainObjPtr vm) * with virProcessInfoSetAffinity, thus the lack of cgroups is not fatal * here. */ - if (driver->cgroup == NULL) + if (priv->cgroup == NULL) return 0; - rc = virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0); - if (rc != 0) { - virReportSystemError(-rc, - _("Unable to find cgroup for %s"), - vm->def->name); - goto cleanup; - } - if (priv->nvcpupids == 0 || priv->vcpupids[0] == vm->pid) { /* If we don't know VCPU<->PID mapping or all vcpu runs in the same * thread, we cannot control each vcpu. 
*/ VIR_WARN("Unable to get vcpus' pids."); - virCgroupFree(&cgroup); return 0; } for (i = 0; i < priv->nvcpupids; i++) { - rc = virCgroupForVcpu(cgroup, i, &cgroup_vcpu, 1); + rc = virCgroupForVcpu(priv->cgroup, i, &cgroup_vcpu, 1); if (rc < 0) { virReportSystemError(-rc, _("Unable to create vcpu cgroup for %s(vcpu:" @@ -606,7 +634,7 @@ int qemuSetupCgroupForVcpu(virQEMUDriverPtr driver, virDomainObjPtr vm) } /* Set vcpupin in cgroup if vcpupin xml is provided */ - if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPUSET)) { + if (virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) { /* find the right CPU to pin, otherwise * qemuSetupCgroupVcpuPin will fail. */ for (j = 0; j < def->cputune.nvcpupin; j++) { @@ -626,7 +654,6 @@ int qemuSetupCgroupForVcpu(virQEMUDriverPtr driver, virDomainObjPtr vm) virCgroupFree(&cgroup_vcpu); } - virCgroupFree(&cgroup); return 0; cleanup: @@ -635,11 +662,6 @@ cleanup: virCgroupFree(&cgroup_vcpu); } - if (cgroup) { - virCgroupRemove(cgroup); - virCgroupFree(&cgroup); - } - return -1; } @@ -649,33 +671,24 @@ int qemuSetupCgroupForEmulator(virQEMUDriverPtr driver, { virBitmapPtr cpumask = NULL; virBitmapPtr cpumap = NULL; - virCgroupPtr cgroup = NULL; virCgroupPtr cgroup_emulator = NULL; virDomainDefPtr def = vm->def; + qemuDomainObjPrivatePtr priv = vm->privateData; unsigned long long period = vm->def->cputune.emulator_period; long long quota = vm->def->cputune.emulator_quota; int rc; if ((period || quota) && - (!driver->cgroup || - !qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPU))) { + !virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU)) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("cgroup cpu is required for scheduler tuning")); return -1; } - if (driver->cgroup == NULL) + if (priv->cgroup == NULL) return 0; /* Not supported, so claim success */ - rc = virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0); - if (rc != 0) { - virReportSystemError(-rc, - _("Unable 
to find cgroup for %s"), - vm->def->name); - goto cleanup; - } - - rc = virCgroupForEmulator(cgroup, &cgroup_emulator, 1); + rc = virCgroupForEmulator(priv->cgroup, &cgroup_emulator, 1); if (rc < 0) { virReportSystemError(-rc, _("Unable to create emulator cgroup for %s"), @@ -683,7 +696,7 @@ int qemuSetupCgroupForEmulator(virQEMUDriverPtr driver, goto cleanup; } - rc = virCgroupMoveTask(cgroup, cgroup_emulator); + rc = virCgroupMoveTask(priv->cgroup, cgroup_emulator); if (rc < 0) { virReportSystemError(-rc, _("Unable to move tasks from domain cgroup to " @@ -703,7 +716,7 @@ int qemuSetupCgroupForEmulator(virQEMUDriverPtr driver, } if (cpumask) { - if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPUSET)) { + if (virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) { rc = qemuSetupCgroupEmulatorPin(cgroup_emulator, cpumask); if (rc < 0) goto cleanup; @@ -712,7 +725,7 @@ int qemuSetupCgroupForEmulator(virQEMUDriverPtr driver, } if (period || quota) { - if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPU)) { + if (virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU)) { if ((rc = qemuSetupCgroupVcpuBW(cgroup_emulator, period, quota)) < 0) goto cleanup; @@ -720,7 +733,6 @@ int qemuSetupCgroupForEmulator(virQEMUDriverPtr driver, } virCgroupFree(&cgroup_emulator); - virCgroupFree(&cgroup); virBitmapFree(cpumap); return 0; @@ -732,67 +744,34 @@ cleanup: virCgroupFree(&cgroup_emulator); } - if (cgroup) { - virCgroupRemove(cgroup); - virCgroupFree(&cgroup); - } - return rc; } -int qemuRemoveCgroup(virQEMUDriverPtr driver, - virDomainObjPtr vm, - int quiet) +int qemuRemoveCgroup(virDomainObjPtr vm) { - virCgroupPtr cgroup; - int rc; + qemuDomainObjPrivatePtr priv = vm->privateData; - if (driver->cgroup == NULL) + if (priv->cgroup == NULL) return 0; /* Not supported, so claim success */ - rc = virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0); - if (rc != 0) { - if (!quiet) - virReportError(VIR_ERR_INTERNAL_ERROR, 
- _("Unable to find cgroup for %s"), - vm->def->name); - return rc; - } - - rc = virCgroupRemove(cgroup); - virCgroupFree(&cgroup); - return rc; + return virCgroupRemove(priv->cgroup); } -int qemuAddToCgroup(virQEMUDriverPtr driver, - virDomainDefPtr def) +int qemuAddToCgroup(virDomainObjPtr vm) { - virCgroupPtr cgroup = NULL; - int ret = -1; + qemuDomainObjPrivatePtr priv = vm->privateData; int rc; - if (driver->cgroup == NULL) + if (priv->cgroup == NULL) return 0; /* Not supported, so claim success */ - rc = virCgroupForDomain(driver->cgroup, def->name, &cgroup, 0); - if (rc != 0) { - virReportSystemError(-rc, - _("unable to find cgroup for domain %s"), - def->name); - goto cleanup; - } - - rc = virCgroupAddTask(cgroup, getpid()); + rc = virCgroupAddTask(priv->cgroup, getpid()); if (rc != 0) { virReportSystemError(-rc, _("unable to add domain %s task %d to cgroup"), - def->name, getpid()); - goto cleanup; + vm->def->name, getpid()); + return -1; } - ret = 0; - -cleanup: - virCgroupFree(&cgroup); - return ret; + return 0; } diff --git a/src/qemu/qemu_cgroup.h b/src/qemu/qemu_cgroup.h index a677d07..6cbfebc 100644 --- a/src/qemu/qemu_cgroup.h +++ b/src/qemu/qemu_cgroup.h @@ -25,26 +25,19 @@ # define __QEMU_CGROUP_H__ # include "virusb.h" +# include "vircgroup.h" # include "domain_conf.h" # include "qemu_conf.h" -struct _qemuCgroupData { - virDomainObjPtr vm; - virCgroupPtr cgroup; -}; -typedef struct _qemuCgroupData qemuCgroupData; - -bool qemuCgroupControllerActive(virQEMUDriverPtr driver, - int controller); int qemuSetupDiskCgroup(virDomainObjPtr vm, - virCgroupPtr cgroup, virDomainDiskDefPtr disk); int qemuTeardownDiskCgroup(virDomainObjPtr vm, - virCgroupPtr cgroup, virDomainDiskDefPtr disk); int qemuSetupHostUsbDeviceCgroup(virUSBDevicePtr dev, const char *path, void *opaque); +int qemuInitCgroup(virQEMUDriverPtr driver, + virDomainObjPtr vm); int qemuSetupCgroup(virQEMUDriverPtr driver, virDomainObjPtr vm, virBitmapPtr nodemask); @@ -56,14 +49,11 @@ int 
qemuSetupCgroupVcpuPin(virCgroupPtr cgroup, int nvcpupin, int vcpuid); int qemuSetupCgroupEmulatorPin(virCgroupPtr cgroup, virBitmapPtr cpumask); -int qemuSetupCgroupForVcpu(virQEMUDriverPtr driver, virDomainObjPtr vm); +int qemuSetupCgroupForVcpu(virDomainObjPtr vm); int qemuSetupCgroupForEmulator(virQEMUDriverPtr driver, virDomainObjPtr vm, virBitmapPtr nodemask); -int qemuRemoveCgroup(virQEMUDriverPtr driver, - virDomainObjPtr vm, - int quiet); -int qemuAddToCgroup(virQEMUDriverPtr driver, - virDomainDefPtr def); +int qemuRemoveCgroup(virDomainObjPtr vm); +int qemuAddToCgroup(virDomainObjPtr vm); #endif /* __QEMU_CGROUP_H__ */ diff --git a/src/qemu/qemu_conf.h b/src/qemu/qemu_conf.h index c5ddaad..21ddd38 100644 --- a/src/qemu/qemu_conf.h +++ b/src/qemu/qemu_conf.h @@ -34,7 +34,6 @@ # include "domain_event.h" # include "virthread.h" # include "security/security_manager.h" -# include "vircgroup.h" # include "virpci.h" # include "virusb.h" # include "cpu_conf.h" @@ -164,9 +163,6 @@ struct _virQEMUDriver { /* Atomic increment only */ int nextvmid; - /* Immutable pointer. 
Immutable object */ - virCgroupPtr cgroup; - /* Atomic inc/dec only */ unsigned int nactive; diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index c79b05d..6e2966f 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -235,6 +235,7 @@ qemuDomainObjPrivateFree(void *data) virObjectUnref(priv->qemuCaps); + virCgroupFree(&priv->cgroup); qemuDomainPCIAddressSetFree(priv->pciaddrs); qemuDomainCCWAddressSetFree(priv->ccwaddrs); virDomainChrSourceDefFree(priv->monConfig); diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 26d5859..e68f2e0 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -25,6 +25,7 @@ # define __QEMU_DOMAIN_H__ # include "virthread.h" +# include "vircgroup.h" # include "domain_conf.h" # include "snapshot_conf.h" # include "qemu_monitor.h" @@ -165,6 +166,8 @@ struct _qemuDomainObjPrivate { qemuDomainCleanupCallback *cleanupCallbacks; size_t ncleanupCallbacks; size_t ncleanupCallbacks_max; + + virCgroupPtr cgroup; }; struct qemuDomainWatchdogEvent diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 2809a77..ab6b74d 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -551,7 +551,6 @@ qemuStartup(bool privileged, void *opaque) { char *driverConf = NULL; - int rc; virConnectPtr conn = NULL; char ebuf[1024]; char *membase = NULL; @@ -628,13 +627,6 @@ qemuStartup(bool privileged, goto error; } - rc = virCgroupForDriver("qemu", &qemu_driver->cgroup, privileged, 1, - cfg->cgroupControllers); - if (rc < 0) { - VIR_INFO("Unable to create cgroup for driver: %s", - virStrerror(-rc, ebuf, sizeof(ebuf))); - } - qemu_driver->qemuImgBinary = virFindFileInPath("kvm-img"); if (!qemu_driver->qemuImgBinary) qemu_driver->qemuImgBinary = virFindFileInPath("qemu-img"); @@ -977,8 +969,6 @@ qemuShutdown(void) { /* Free domain callback list */ virDomainEventStateFree(qemu_driver->domainEventState); - virCgroupFree(&qemu_driver->cgroup); - virLockManagerPluginUnref(qemu_driver->lockManager); 
virMutexDestroy(&qemu_driver->lock); @@ -3542,9 +3532,7 @@ static int qemuDomainHotplugVcpus(virQEMUDriverPtr driver, int vcpus = oldvcpus; pid_t *cpupids = NULL; int ncpupids; - virCgroupPtr cgroup = NULL; virCgroupPtr cgroup_vcpu = NULL; - bool cgroup_available = false; qemuDomainObjEnterMonitor(driver, vm); @@ -3607,15 +3595,12 @@ static int qemuDomainHotplugVcpus(virQEMUDriverPtr driver, goto cleanup; } - cgroup_available = (virCgroupForDomain(driver->cgroup, vm->def->name, - &cgroup, 0) == 0); - if (nvcpus > oldvcpus) { for (i = oldvcpus; i < nvcpus; i++) { - if (cgroup_available) { + if (priv->cgroup) { int rv = -1; /* Create cgroup for the onlined vcpu */ - rv = virCgroupForVcpu(cgroup, i, &cgroup_vcpu, 1); + rv = virCgroupForVcpu(priv->cgroup, i, &cgroup_vcpu, 1); if (rv < 0) { virReportSystemError(-rv, _("Unable to create vcpu cgroup for %s(vcpu:" @@ -3658,7 +3643,7 @@ static int qemuDomainHotplugVcpus(virQEMUDriverPtr driver, vcpupin->vcpuid = i; vm->def->cputune.vcpupin[vm->def->cputune.nvcpupin++] = vcpupin; - if (cgroup_available) { + if (cgroup_vcpu) { if (qemuSetupCgroupVcpuPin(cgroup_vcpu, vm->def->cputune.vcpupin, vm->def->cputune.nvcpupin, i) < 0) { @@ -3686,10 +3671,10 @@ static int qemuDomainHotplugVcpus(virQEMUDriverPtr driver, for (i = oldvcpus - 1; i >= nvcpus; i--) { virDomainVcpuPinDefPtr vcpupin = NULL; - if (cgroup_available) { + if (priv->cgroup) { int rv = -1; - rv = virCgroupForVcpu(cgroup, i, &cgroup_vcpu, 0); + rv = virCgroupForVcpu(priv->cgroup, i, &cgroup_vcpu, 0); if (rv < 0) { virReportSystemError(-rv, _("Unable to access vcpu cgroup for %s(vcpu:" @@ -3720,8 +3705,6 @@ cleanup: vm->def->vcpus = vcpus; VIR_FREE(cpupids); virDomainAuditVcpu(vm, oldvcpus, nvcpus, "update", rc == 1); - if (cgroup) - virCgroupFree(&cgroup); if (cgroup_vcpu) virCgroupFree(&cgroup_vcpu); return ret; @@ -3854,7 +3837,6 @@ qemuDomainPinVcpuFlags(virDomainPtr dom, virQEMUDriverPtr driver = dom->conn->privateData; virDomainObjPtr vm; virDomainDefPtr 
persistentDef = NULL; - virCgroupPtr cgroup_dom = NULL; virCgroupPtr cgroup_vcpu = NULL; int ret = -1; qemuDomainObjPrivatePtr priv; @@ -3930,9 +3912,8 @@ qemuDomainPinVcpuFlags(virDomainPtr dom, } /* Configure the corresponding cpuset cgroup before set affinity. */ - if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPUSET)) { - if (virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup_dom, 0) == 0 && - virCgroupForVcpu(cgroup_dom, vcpu, &cgroup_vcpu, 0) == 0 && + if (virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) { + if (virCgroupForVcpu(priv->cgroup, vcpu, &cgroup_vcpu, 0) == 0 && qemuSetupCgroupVcpuPin(cgroup_vcpu, newVcpuPin, newVcpuPinNum, vcpu) < 0) { virReportError(VIR_ERR_OPERATION_INVALID, _("failed to set cpuset.cpus in cgroup" @@ -4009,8 +3990,6 @@ qemuDomainPinVcpuFlags(virDomainPtr dom, cleanup: if (cgroup_vcpu) virCgroupFree(&cgroup_vcpu); - if (cgroup_dom) - virCgroupFree(&cgroup_dom); if (vm) virObjectUnlock(vm); virBitmapFree(pcpumap); @@ -4121,7 +4100,6 @@ qemuDomainPinEmulator(virDomainPtr dom, { virQEMUDriverPtr driver = dom->conn->privateData; virDomainObjPtr vm; - virCgroupPtr cgroup_dom = NULL; virCgroupPtr cgroup_emulator = NULL; pid_t pid; virDomainDefPtr persistentDef = NULL; @@ -4185,22 +4163,19 @@ qemuDomainPinEmulator(virDomainPtr dom, goto cleanup; } - if (qemuCgroupControllerActive(driver, - VIR_CGROUP_CONTROLLER_CPUSET)) { + if (virCgroupHasController(priv->cgroup, + VIR_CGROUP_CONTROLLER_CPUSET)) { /* * Configure the corresponding cpuset cgroup. * If no cgroup for domain or hypervisor exists, do nothing. 
*/ - if (virCgroupForDomain(driver->cgroup, vm->def->name, - &cgroup_dom, 0) == 0) { - if (virCgroupForEmulator(cgroup_dom, &cgroup_emulator, 0) == 0) { - if (qemuSetupCgroupEmulatorPin(cgroup_emulator, - newVcpuPin[0]->cpumask) < 0) { - virReportError(VIR_ERR_OPERATION_INVALID, "%s", - _("failed to set cpuset.cpus in cgroup" - " for emulator threads")); - goto cleanup; - } + if (virCgroupForEmulator(priv->cgroup, &cgroup_emulator, 0) == 0) { + if (qemuSetupCgroupEmulatorPin(cgroup_emulator, + newVcpuPin[0]->cpumask) < 0) { + virReportError(VIR_ERR_OPERATION_INVALID, "%s", + _("failed to set cpuset.cpus in cgroup" + " for emulator threads")); + goto cleanup; } } } else { @@ -4264,8 +4239,6 @@ qemuDomainPinEmulator(virDomainPtr dom, cleanup: if (cgroup_emulator) virCgroupFree(&cgroup_emulator); - if (cgroup_dom) - virCgroupFree(&cgroup_dom); virBitmapFree(pcpumap); virObjectUnref(caps); if (vm) @@ -5758,16 +5731,8 @@ qemuDomainAttachDeviceDiskLive(virConnectPtr conn, if (qemuDomainDetermineDiskChain(driver, disk, false) < 0) goto end; - if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES)) { - if (virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0)) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unable to find cgroup for %s"), - vm->def->name); - goto end; - } - if (qemuSetupDiskCgroup(vm, cgroup, disk) < 0) - goto end; - } + if (qemuSetupDiskCgroup(vm, disk) < 0) + goto end; switch (disk->device) { case VIR_DOMAIN_DISK_DEVICE_CDROM: @@ -5833,7 +5798,7 @@ qemuDomainAttachDeviceDiskLive(virConnectPtr conn, } if (ret != 0 && cgroup) { - if (qemuTeardownDiskCgroup(vm, cgroup, disk) < 0) + if (qemuTeardownDiskCgroup(vm, disk) < 0) VIR_WARN("Failed to teardown cgroup for disk path %s", NULLSTR(disk->src)); } @@ -5841,8 +5806,6 @@ qemuDomainAttachDeviceDiskLive(virConnectPtr conn, end: if (ret != 0) ignore_value(qemuRemoveSharedDisk(driver, disk, vm->def->name)); - if (cgroup) - virCgroupFree(&cgroup); virObjectUnref(caps); 
virDomainDeviceDefFree(dev_copy); return ret; @@ -6025,7 +5988,6 @@ qemuDomainChangeDiskMediaLive(virDomainObjPtr vm, virDomainDiskDefPtr disk = dev->data.disk; virDomainDiskDefPtr orig_disk = NULL; virDomainDiskDefPtr tmp = NULL; - virCgroupPtr cgroup = NULL; virDomainDeviceDefPtr dev_copy = NULL; virCapsPtr caps = NULL; int ret = -1; @@ -6033,17 +5995,8 @@ qemuDomainChangeDiskMediaLive(virDomainObjPtr vm, if (qemuDomainDetermineDiskChain(driver, disk, false) < 0) goto end; - if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES)) { - if (virCgroupForDomain(driver->cgroup, - vm->def->name, &cgroup, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unable to find cgroup for %s"), - vm->def->name); - goto end; - } - if (qemuSetupDiskCgroup(vm, cgroup, disk) < 0) - goto end; - } + if (qemuSetupDiskCgroup(vm, disk) < 0) + goto end; switch (disk->device) { case VIR_DOMAIN_DISK_DEVICE_CDROM: @@ -6094,14 +6047,12 @@ qemuDomainChangeDiskMediaLive(virDomainObjPtr vm, break; } - if (ret != 0 && cgroup) { - if (qemuTeardownDiskCgroup(vm, cgroup, disk) < 0) - VIR_WARN("Failed to teardown cgroup for disk path %s", - NULLSTR(disk->src)); - } + if (ret != 0 && + qemuTeardownDiskCgroup(vm, disk) < 0) + VIR_WARN("Failed to teardown cgroup for disk path %s", + NULLSTR(disk->src)); + end: - if (cgroup) - virCgroupFree(&cgroup); virObjectUnref(caps); virDomainDeviceDefFree(dev_copy); return ret; @@ -6735,15 +6686,25 @@ static char *qemuGetSchedulerType(virDomainPtr dom, virQEMUDriverPtr driver = dom->conn->privateData; char *ret = NULL; int rc; + virDomainObjPtr vm = NULL; + qemuDomainObjPrivatePtr priv; + + vm = virDomainObjListFindByUUID(driver->domains, dom->uuid); + if (vm == NULL) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("No such domain %s"), dom->uuid); + goto cleanup; + } + priv = vm->privateData; - if (!qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPU)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU)) { 
virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("cgroup CPU controller is not mounted")); goto cleanup; } if (nparams) { - rc = qemuGetCpuBWStatus(driver->cgroup); + rc = qemuGetCpuBWStatus(priv->cgroup); if (rc < 0) goto cleanup; else if (rc == 0) @@ -6757,6 +6718,8 @@ static char *qemuGetSchedulerType(virDomainPtr dom, virReportOOMError(); cleanup: + if (vm) + virObjectUnlock(vm); return ret; } @@ -6896,12 +6859,12 @@ qemuDomainSetBlkioParameters(virDomainPtr dom, { virQEMUDriverPtr driver = dom->conn->privateData; int i; - virCgroupPtr group = NULL; virDomainObjPtr vm = NULL; virDomainDefPtr persistentDef = NULL; int ret = -1; virQEMUDriverConfigPtr cfg = NULL; virCapsPtr caps = NULL; + qemuDomainObjPrivatePtr priv; virCheckFlags(VIR_DOMAIN_AFFECT_LIVE | VIR_DOMAIN_AFFECT_CONFIG, -1); @@ -6919,6 +6882,7 @@ qemuDomainSetBlkioParameters(virDomainPtr dom, _("No such domain %s"), dom->uuid); goto cleanup; } + priv = vm->privateData; cfg = virQEMUDriverGetConfig(driver); if (!(caps = virQEMUDriverGetCapabilities(driver, false))) goto cleanup; @@ -6928,18 +6892,11 @@ qemuDomainSetBlkioParameters(virDomainPtr dom, goto cleanup; if (flags & VIR_DOMAIN_AFFECT_LIVE) { - if (!qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_BLKIO)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_BLKIO)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("blkio cgroup isn't mounted")); goto cleanup; } - - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), - vm->def->name); - goto cleanup; - } } ret = 0; @@ -6956,7 +6913,7 @@ qemuDomainSetBlkioParameters(virDomainPtr dom, continue; } - rc = virCgroupSetBlkioWeight(group, params[i].value.ui); + rc = virCgroupSetBlkioWeight(priv->cgroup, params[i].value.ui); if (rc != 0) { virReportSystemError(-rc, "%s", _("unable to set blkio weight tunable")); @@ -6974,7 +6931,7 @@ qemuDomainSetBlkioParameters(virDomainPtr 
dom, continue; } for (j = 0; j < ndevices; j++) { - rc = virCgroupSetBlkioDeviceWeight(group, + rc = virCgroupSetBlkioDeviceWeight(priv->cgroup, devices[j].path, devices[j].weight); if (rc < 0) { @@ -7037,7 +6994,6 @@ qemuDomainSetBlkioParameters(virDomainPtr dom, } cleanup: - virCgroupFree(&group); if (vm) virObjectUnlock(vm); virObjectUnref(caps); @@ -7053,13 +7009,13 @@ qemuDomainGetBlkioParameters(virDomainPtr dom, { virQEMUDriverPtr driver = dom->conn->privateData; int i, j; - virCgroupPtr group = NULL; virDomainObjPtr vm = NULL; virDomainDefPtr persistentDef = NULL; unsigned int val; int ret = -1; int rc; virCapsPtr caps = NULL; + qemuDomainObjPrivatePtr priv; virCheckFlags(VIR_DOMAIN_AFFECT_LIVE | VIR_DOMAIN_AFFECT_CONFIG | @@ -7077,6 +7033,7 @@ qemuDomainGetBlkioParameters(virDomainPtr dom, _("No such domain %s"), dom->uuid); goto cleanup; } + priv = vm->privateData; if (!(caps = virQEMUDriverGetCapabilities(driver, false))) goto cleanup; @@ -7093,17 +7050,11 @@ qemuDomainGetBlkioParameters(virDomainPtr dom, goto cleanup; if (flags & VIR_DOMAIN_AFFECT_LIVE) { - if (!qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_BLKIO)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_BLKIO)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("blkio cgroup isn't mounted")); goto cleanup; } - - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), vm->def->name); - goto cleanup; - } } if (flags & VIR_DOMAIN_AFFECT_LIVE) { @@ -7113,7 +7064,7 @@ qemuDomainGetBlkioParameters(virDomainPtr dom, switch (i) { case 0: /* fill blkio weight here */ - rc = virCgroupGetBlkioWeight(group, &val); + rc = virCgroupGetBlkioWeight(priv->cgroup, &val); if (rc != 0) { virReportSystemError(-rc, "%s", _("unable to get blkio weight")); @@ -7226,8 +7177,6 @@ qemuDomainGetBlkioParameters(virDomainPtr dom, ret = 0; cleanup: - if (group) - virCgroupFree(&group); if (vm) 
virObjectUnlock(vm); virObjectUnref(caps); @@ -7242,7 +7191,6 @@ qemuDomainSetMemoryParameters(virDomainPtr dom, { virQEMUDriverPtr driver = dom->conn->privateData; virDomainDefPtr persistentDef = NULL; - virCgroupPtr group = NULL; virDomainObjPtr vm = NULL; unsigned long long swap_hard_limit; unsigned long long memory_hard_limit; @@ -7254,6 +7202,7 @@ qemuDomainSetMemoryParameters(virDomainPtr dom, int ret = -1; int rc; virCapsPtr caps = NULL; + qemuDomainObjPrivatePtr priv; virCheckFlags(VIR_DOMAIN_AFFECT_LIVE | VIR_DOMAIN_AFFECT_CONFIG, -1); @@ -7272,6 +7221,7 @@ qemuDomainSetMemoryParameters(virDomainPtr dom, if (!(vm = qemuDomObjFromDomain(dom))) return -1; + priv = vm->privateData; cfg = virQEMUDriverGetConfig(driver); if (!(caps = virQEMUDriverGetCapabilities(driver, false))) @@ -7282,17 +7232,11 @@ qemuDomainSetMemoryParameters(virDomainPtr dom, goto cleanup; if (flags & VIR_DOMAIN_AFFECT_LIVE) { - if (!qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_MEMORY)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_MEMORY)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("cgroup memory controller is not mounted")); goto cleanup; } - - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), vm->def->name); - goto cleanup; - } } #define VIR_GET_LIMIT_PARAMETER(PARAM, VALUE) \ @@ -7320,7 +7264,7 @@ qemuDomainSetMemoryParameters(virDomainPtr dom, if (set_swap_hard_limit) { if (flags & VIR_DOMAIN_AFFECT_LIVE) { - if ((rc = virCgroupSetMemSwapHardLimit(group, swap_hard_limit)) < 0) { + if ((rc = virCgroupSetMemSwapHardLimit(priv->cgroup, swap_hard_limit)) < 0) { virReportSystemError(-rc, "%s", _("unable to set memory swap_hard_limit tunable")); goto cleanup; @@ -7334,7 +7278,7 @@ qemuDomainSetMemoryParameters(virDomainPtr dom, if (set_memory_hard_limit) { if (flags & VIR_DOMAIN_AFFECT_LIVE) { - if ((rc = virCgroupSetMemoryHardLimit(group, 
memory_hard_limit)) < 0) { + if ((rc = virCgroupSetMemoryHardLimit(priv->cgroup, memory_hard_limit)) < 0) { virReportSystemError(-rc, "%s", _("unable to set memory hard_limit tunable")); goto cleanup; @@ -7348,7 +7292,7 @@ qemuDomainSetMemoryParameters(virDomainPtr dom, if (set_memory_soft_limit) { if (flags & VIR_DOMAIN_AFFECT_LIVE) { - if ((rc = virCgroupSetMemorySoftLimit(group, memory_soft_limit)) < 0) { + if ((rc = virCgroupSetMemorySoftLimit(priv->cgroup, memory_soft_limit)) < 0) { virReportSystemError(-rc, "%s", _("unable to set memory soft_limit tunable")); goto cleanup; @@ -7367,7 +7311,6 @@ qemuDomainSetMemoryParameters(virDomainPtr dom, ret = 0; cleanup: - virCgroupFree(&group); virObjectUnlock(vm); virObjectUnref(caps); virObjectUnref(cfg); @@ -7382,12 +7325,12 @@ qemuDomainGetMemoryParameters(virDomainPtr dom, { virQEMUDriverPtr driver = dom->conn->privateData; int i; - virCgroupPtr group = NULL; virDomainObjPtr vm = NULL; virDomainDefPtr persistentDef = NULL; int ret = -1; int rc; virCapsPtr caps = NULL; + qemuDomainObjPrivatePtr priv; virCheckFlags(VIR_DOMAIN_AFFECT_LIVE | VIR_DOMAIN_AFFECT_CONFIG | @@ -7404,6 +7347,7 @@ qemuDomainGetMemoryParameters(virDomainPtr dom, goto cleanup; } + priv = vm->privateData; if (!(caps = virQEMUDriverGetCapabilities(driver, false))) goto cleanup; @@ -7412,17 +7356,11 @@ qemuDomainGetMemoryParameters(virDomainPtr dom, goto cleanup; if (flags & VIR_DOMAIN_AFFECT_LIVE) { - if (!qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_MEMORY)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_MEMORY)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("cgroup memory controller is not mounted")); goto cleanup; } - - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), vm->def->name); - goto cleanup; - } } if ((*nparams) == 0) { @@ -7473,12 +7411,9 @@ qemuDomainGetMemoryParameters(virDomainPtr dom, 
virTypedParameterPtr param = &params[i]; unsigned long long val = 0; - /* Coverity does not realize that if we get here, group is set. */ - sa_assert(group); - switch (i) { case 0: /* fill memory hard limit here */ - rc = virCgroupGetMemoryHardLimit(group, &val); + rc = virCgroupGetMemoryHardLimit(priv->cgroup, &val); if (rc != 0) { virReportSystemError(-rc, "%s", _("unable to get memory hard limit")); @@ -7491,7 +7426,7 @@ qemuDomainGetMemoryParameters(virDomainPtr dom, break; case 1: /* fill memory soft limit here */ - rc = virCgroupGetMemorySoftLimit(group, &val); + rc = virCgroupGetMemorySoftLimit(priv->cgroup, &val); if (rc != 0) { virReportSystemError(-rc, "%s", _("unable to get memory soft limit")); @@ -7504,7 +7439,7 @@ qemuDomainGetMemoryParameters(virDomainPtr dom, break; case 2: /* fill swap hard limit here */ - rc = virCgroupGetMemSwapHardLimit(group, &val); + rc = virCgroupGetMemSwapHardLimit(priv->cgroup, &val); if (rc != 0) { virReportSystemError(-rc, "%s", _("unable to get swap hard limit")); @@ -7528,8 +7463,6 @@ out: ret = 0; cleanup: - if (group) - virCgroupFree(&group); if (vm) virObjectUnlock(vm); virObjectUnref(caps); @@ -7545,11 +7478,11 @@ qemuDomainSetNumaParameters(virDomainPtr dom, virQEMUDriverPtr driver = dom->conn->privateData; int i; virDomainDefPtr persistentDef = NULL; - virCgroupPtr group = NULL; virDomainObjPtr vm = NULL; int ret = -1; virQEMUDriverConfigPtr cfg = NULL; virCapsPtr caps = NULL; + qemuDomainObjPrivatePtr priv; virCheckFlags(VIR_DOMAIN_AFFECT_LIVE | VIR_DOMAIN_AFFECT_CONFIG, -1); @@ -7568,6 +7501,7 @@ qemuDomainSetNumaParameters(virDomainPtr dom, _("No such domain %s"), dom->uuid); goto cleanup; } + priv = vm->privateData; cfg = virQEMUDriverGetConfig(driver); if (!(caps = virQEMUDriverGetCapabilities(driver, false))) @@ -7578,18 +7512,11 @@ qemuDomainSetNumaParameters(virDomainPtr dom, goto cleanup; if (flags & VIR_DOMAIN_AFFECT_LIVE) { - if (!qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPUSET)) { + if
(!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("cgroup cpuset controller is not mounted")); goto cleanup; } - - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), - vm->def->name); - goto cleanup; - } } ret = 0; @@ -7642,7 +7569,7 @@ qemuDomainSetNumaParameters(virDomainPtr dom, continue; } - if ((rc = virCgroupSetCpusetMems(group, nodeset_str) != 0)) { + if ((rc = virCgroupSetCpusetMems(priv->cgroup, nodeset_str) != 0)) { virReportSystemError(-rc, "%s", _("unable to set numa tunable")); virBitmapFree(nodeset); @@ -7682,7 +7609,6 @@ qemuDomainSetNumaParameters(virDomainPtr dom, } cleanup: - virCgroupFree(&group); if (vm) virObjectUnlock(vm); virObjectUnref(caps); @@ -7698,13 +7624,13 @@ qemuDomainGetNumaParameters(virDomainPtr dom, { virQEMUDriverPtr driver = dom->conn->privateData; int i; - virCgroupPtr group = NULL; virDomainObjPtr vm = NULL; virDomainDefPtr persistentDef = NULL; char *nodeset = NULL; int ret = -1; int rc; virCapsPtr caps = NULL; + qemuDomainObjPrivatePtr priv; virCheckFlags(VIR_DOMAIN_AFFECT_LIVE | VIR_DOMAIN_AFFECT_CONFIG | @@ -7722,6 +7648,7 @@ qemuDomainGetNumaParameters(virDomainPtr dom, _("No such domain %s"), dom->uuid); goto cleanup; } + priv = vm->privateData; if (!(caps = virQEMUDriverGetCapabilities(driver, false))) goto cleanup; @@ -7737,18 +7664,11 @@ qemuDomainGetNumaParameters(virDomainPtr dom, } if (flags & VIR_DOMAIN_AFFECT_LIVE) { - if (!qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_MEMORY)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_MEMORY)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("cgroup memory controller is not mounted")); goto cleanup; } - - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), - 
vm->def->name); - goto cleanup; - } } for (i = 0; i < QEMU_NB_NUMA_PARAM && i < *nparams; i++) { @@ -7771,7 +7691,7 @@ qemuDomainGetNumaParameters(virDomainPtr dom, if (!nodeset) nodeset = strdup(""); } else { - rc = virCgroupGetCpusetMems(group, &nodeset); + rc = virCgroupGetCpusetMems(priv->cgroup, &nodeset); if (rc != 0) { virReportSystemError(-rc, "%s", _("unable to get numa nodeset")); @@ -7798,7 +7718,6 @@ qemuDomainGetNumaParameters(virDomainPtr dom, cleanup: VIR_FREE(nodeset); - virCgroupFree(&group); if (vm) virObjectUnlock(vm); virObjectUnref(caps); @@ -7906,6 +7825,7 @@ qemuSetSchedulerParametersFlags(virDomainPtr dom, int rc; virQEMUDriverConfigPtr cfg = NULL; virCapsPtr caps = NULL; + qemuDomainObjPrivatePtr priv; virCheckFlags(VIR_DOMAIN_AFFECT_LIVE | VIR_DOMAIN_AFFECT_CONFIG, -1); @@ -7931,6 +7851,7 @@ qemuSetSchedulerParametersFlags(virDomainPtr dom, goto cleanup; } + priv = vm->privateData; cfg = virQEMUDriverGetConfig(driver); if (!(caps = virQEMUDriverGetCapabilities(driver, false))) @@ -7948,17 +7869,11 @@ qemuSetSchedulerParametersFlags(virDomainPtr dom, } if (flags & VIR_DOMAIN_AFFECT_LIVE) { - if (!qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPU)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("cgroup CPU controller is not mounted")); goto cleanup; } - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), - vm->def->name); - goto cleanup; - } } for (i = 0; i < nparams; i++) { @@ -7968,7 +7883,7 @@ qemuSetSchedulerParametersFlags(virDomainPtr dom, if (STREQ(param->field, VIR_DOMAIN_SCHEDULER_CPU_SHARES)) { if (flags & VIR_DOMAIN_AFFECT_LIVE) { - if ((rc = virCgroupSetCpuShares(group, value_ul))) { + if ((rc = virCgroupSetCpuShares(priv->cgroup, value_ul))) { virReportSystemError(-rc, "%s", _("unable to set cpu shares tunable")); goto cleanup; @@ -8054,7 
+7969,6 @@ qemuSetSchedulerParametersFlags(virDomainPtr dom, cleanup: virDomainDefFree(vmdef); - virCgroupFree(&group); if (vm) virObjectUnlock(vm); virObjectUnref(caps); @@ -8098,7 +8012,7 @@ qemuGetVcpuBWLive(virCgroupPtr cgroup, unsigned long long *period, } static int -qemuGetVcpusBWLive(virDomainObjPtr vm, virCgroupPtr cgroup, +qemuGetVcpusBWLive(virDomainObjPtr vm, unsigned long long *period, long long *quota) { virCgroupPtr cgroup_vcpu = NULL; @@ -8109,7 +8023,7 @@ qemuGetVcpusBWLive(virDomainObjPtr vm, virCgroupPtr cgroup, priv = vm->privateData; if (priv->nvcpupids == 0 || priv->vcpupids[0] == vm->pid) { /* We do not create sub dir for each vcpu */ - rc = qemuGetVcpuBWLive(cgroup, period, quota); + rc = qemuGetVcpuBWLive(priv->cgroup, period, quota); if (rc < 0) goto cleanup; @@ -8119,7 +8033,7 @@ qemuGetVcpusBWLive(virDomainObjPtr vm, virCgroupPtr cgroup, } /* get period and quota for vcpu0 */ - rc = virCgroupForVcpu(cgroup, 0, &cgroup_vcpu, 0); + rc = virCgroupForVcpu(priv->cgroup, 0, &cgroup_vcpu, 0); if (!cgroup_vcpu) { virReportSystemError(-rc, _("Unable to find vcpu cgroup for %s(vcpu: 0)"), @@ -8183,7 +8097,6 @@ qemuGetSchedulerParametersFlags(virDomainPtr dom, unsigned int flags) { virQEMUDriverPtr driver = dom->conn->privateData; - virCgroupPtr group = NULL; virDomainObjPtr vm = NULL; unsigned long long shares; unsigned long long period; @@ -8196,6 +8109,7 @@ qemuGetSchedulerParametersFlags(virDomainPtr dom, int saved_nparams = 0; virDomainDefPtr persistentDef; virCapsPtr caps = NULL; + qemuDomainObjPrivatePtr priv; virCheckFlags(VIR_DOMAIN_AFFECT_LIVE | VIR_DOMAIN_AFFECT_CONFIG | @@ -8204,13 +8118,6 @@ qemuGetSchedulerParametersFlags(virDomainPtr dom, /* We don't return strings, and thus trivially support this flag. 
*/ flags &= ~VIR_TYPED_PARAM_STRING_OKAY; - if (*nparams > 1) { - rc = qemuGetCpuBWStatus(driver->cgroup); - if (rc < 0) - goto cleanup; - cpu_bw_status = !!rc; - } - vm = virDomainObjListFindByUUID(driver->domains, dom->uuid); if (vm == NULL) { @@ -8219,6 +8126,15 @@ qemuGetSchedulerParametersFlags(virDomainPtr dom, goto cleanup; } + priv = vm->privateData; + + if (*nparams > 1) { + rc = qemuGetCpuBWStatus(priv->cgroup); + if (rc < 0) + goto cleanup; + cpu_bw_status = !!rc; + } + if (!(caps = virQEMUDriverGetCapabilities(driver, false))) goto cleanup; @@ -8237,19 +8153,13 @@ qemuGetSchedulerParametersFlags(virDomainPtr dom, goto out; } - if (!qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPU)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("cgroup CPU controller is not mounted")); goto cleanup; } - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), vm->def->name); - goto cleanup; - } - - rc = virCgroupGetCpuShares(group, &shares); + rc = virCgroupGetCpuShares(priv->cgroup, &shares); if (rc != 0) { virReportSystemError(-rc, "%s", _("unable to get cpu shares tunable")); @@ -8257,13 +8167,13 @@ qemuGetSchedulerParametersFlags(virDomainPtr dom, } if (*nparams > 1 && cpu_bw_status) { - rc = qemuGetVcpusBWLive(vm, group, &period, &quota); + rc = qemuGetVcpusBWLive(vm, &period, &quota); if (rc != 0) goto cleanup; } if (*nparams > 3 && cpu_bw_status) { - rc = qemuGetEmulatorBandwidthLive(vm, group, &emulator_period, + rc = qemuGetEmulatorBandwidthLive(vm, priv->cgroup, &emulator_period, &emulator_quota); if (rc != 0) goto cleanup; @@ -8316,7 +8226,6 @@ out: ret = 0; cleanup: - virCgroupFree(&group); if (vm) virObjectUnlock(vm); virObjectUnref(caps); @@ -8712,7 +8621,6 @@ qemuDomainSetInterfaceParameters(virDomainPtr dom, { virQEMUDriverPtr driver = dom->conn->privateData; int i; - 
virCgroupPtr group = NULL; virDomainObjPtr vm = NULL; virDomainDefPtr persistentDef = NULL; int ret = -1; @@ -8876,7 +8784,6 @@ qemuDomainSetInterfaceParameters(virDomainPtr dom, cleanup: virNetDevBandwidthFree(bandwidth); virNetDevBandwidthFree(newBandwidth); - virCgroupFree(&group); if (vm) virObjectUnlock(vm); virObjectUnref(caps); @@ -8893,7 +8800,6 @@ qemuDomainGetInterfaceParameters(virDomainPtr dom, { virQEMUDriverPtr driver = dom->conn->privateData; int i; - virCgroupPtr group = NULL; virDomainObjPtr vm = NULL; virDomainDefPtr def = NULL; virDomainDefPtr persistentDef = NULL; @@ -9000,8 +8906,6 @@ qemuDomainGetInterfaceParameters(virDomainPtr dom, ret = 0; cleanup: - if (group) - virCgroupFree(&group); if (vm) virObjectUnlock(vm); virObjectUnref(caps); @@ -10607,7 +10511,6 @@ typedef enum { static int qemuDomainPrepareDiskChainElement(virQEMUDriverPtr driver, virDomainObjPtr vm, - virCgroupPtr cgroup, virDomainDiskDefPtr disk, const char *file, qemuDomainDiskChainMode mode) @@ -10631,13 +10534,13 @@ qemuDomainPrepareDiskChainElement(virQEMUDriverPtr driver, if (virSecurityManagerRestoreImageLabel(driver->securityManager, vm->def, disk) < 0) VIR_WARN("Unable to restore security label on %s", disk->src); - if (cgroup && qemuTeardownDiskCgroup(vm, cgroup, disk) < 0) + if (qemuTeardownDiskCgroup(vm, disk) < 0) VIR_WARN("Failed to teardown cgroup for disk path %s", disk->src); if (virDomainLockDiskDetach(driver->lockManager, vm, disk) < 0) VIR_WARN("Unable to release lock on %s", disk->src); } else if (virDomainLockDiskAttach(driver->lockManager, cfg->uri, vm, disk) < 0 || - (cgroup && qemuSetupDiskCgroup(vm, cgroup, disk) < 0) || + qemuSetupDiskCgroup(vm, disk) < 0 || virSecurityManagerSetImageLabel(driver->securityManager, vm->def, disk) < 0) { goto cleanup; @@ -11073,7 +10976,6 @@ cleanup: static int qemuDomainSnapshotCreateSingleDiskActive(virQEMUDriverPtr driver, virDomainObjPtr vm, - virCgroupPtr cgroup, virDomainSnapshotDiskDefPtr snap, 
virDomainDiskDefPtr disk, virDomainDiskDefPtr persistDisk, @@ -11123,9 +11025,9 @@ qemuDomainSnapshotCreateSingleDiskActive(virQEMUDriverPtr driver, virStorageFileFreeMetadata(disk->backingChain); disk->backingChain = NULL; - if (qemuDomainPrepareDiskChainElement(driver, vm, cgroup, disk, source, + if (qemuDomainPrepareDiskChainElement(driver, vm, disk, source, VIR_DISK_CHAIN_READ_WRITE) < 0) { - qemuDomainPrepareDiskChainElement(driver, vm, cgroup, disk, source, + qemuDomainPrepareDiskChainElement(driver, vm, disk, source, VIR_DISK_CHAIN_NO_ACCESS); goto cleanup; } @@ -11167,7 +11069,6 @@ cleanup: static void qemuDomainSnapshotUndoSingleDiskActive(virQEMUDriverPtr driver, virDomainObjPtr vm, - virCgroupPtr cgroup, virDomainDiskDefPtr origdisk, virDomainDiskDefPtr disk, virDomainDiskDefPtr persistDisk, @@ -11184,7 +11085,7 @@ qemuDomainSnapshotUndoSingleDiskActive(virQEMUDriverPtr driver, goto cleanup; } - qemuDomainPrepareDiskChainElement(driver, vm, cgroup, disk, origdisk->src, + qemuDomainPrepareDiskChainElement(driver, vm, disk, origdisk->src, VIR_DISK_CHAIN_NO_ACCESS); if (need_unlink && stat(disk->src, &st) == 0 && S_ISREG(st.st_mode) && unlink(disk->src) < 0) @@ -11221,7 +11122,6 @@ qemuDomainSnapshotCreateDiskActive(virQEMUDriverPtr driver, int i; bool persist = false; bool reuse = (flags & VIR_DOMAIN_SNAPSHOT_CREATE_REUSE_EXT) != 0; - virCgroupPtr cgroup = NULL; virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver); if (!virDomainObjIsActive(vm)) { @@ -11230,15 +11130,6 @@ qemuDomainSnapshotCreateDiskActive(virQEMUDriverPtr driver, goto cleanup; } - if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES) && - virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0)) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unable to find cgroup for %s"), - vm->def->name); - goto cleanup; - } - /* 'cgroup' is still NULL if cgroups are disabled. 
*/ - if (virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_TRANSACTION)) { if (!(actions = virJSONValueNewArray())) { virReportOOMError(); @@ -11274,7 +11165,7 @@ qemuDomainSnapshotCreateDiskActive(virQEMUDriverPtr driver, } } - ret = qemuDomainSnapshotCreateSingleDiskActive(driver, vm, cgroup, + ret = qemuDomainSnapshotCreateSingleDiskActive(driver, vm, &snap->def->disks[i], vm->def->disks[i], persistDisk, actions, @@ -11303,7 +11194,7 @@ qemuDomainSnapshotCreateDiskActive(virQEMUDriverPtr driver, persistDisk = vm->newDef->disks[indx]; } - qemuDomainSnapshotUndoSingleDiskActive(driver, vm, cgroup, + qemuDomainSnapshotUndoSingleDiskActive(driver, vm, snap->def->dom->disks[i], vm->def->disks[i], persistDisk, @@ -11314,7 +11205,6 @@ qemuDomainSnapshotCreateDiskActive(virQEMUDriverPtr driver, qemuDomainObjExitMonitor(driver, vm); cleanup: - virCgroupFree(&cgroup); if (ret == 0 || !virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_TRANSACTION)) { if (virDomainSaveStatus(driver->xmlconf, cfg->stateDir, vm) < 0 || @@ -13065,7 +12955,6 @@ qemuDomainBlockPivot(virConnectPtr conn, virDomainBlockJobInfo info; const char *format = virStorageFileFormatTypeToString(disk->mirrorFormat); bool resume = false; - virCgroupPtr cgroup = NULL; char *oldsrc = NULL; int oldformat; virStorageFileMetadataPtr oldchain = NULL; @@ -13125,14 +13014,6 @@ qemuDomainBlockPivot(virConnectPtr conn, * label the entire chain. This action is safe even if the * backing chain has already been labeled; but only necessary when * we know for sure that there is a backing chain. 
*/ - if (disk->mirrorFormat && disk->mirrorFormat != VIR_STORAGE_FILE_RAW && - qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES) && - virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0) < 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unable to find cgroup for %s"), - vm->def->name); - goto cleanup; - } oldsrc = disk->src; oldformat = disk->format; oldchain = disk->backingChain; @@ -13148,7 +13029,7 @@ qemuDomainBlockPivot(virConnectPtr conn, if (disk->mirrorFormat && disk->mirrorFormat != VIR_STORAGE_FILE_RAW && (virDomainLockDiskAttach(driver->lockManager, cfg->uri, vm, disk) < 0 || - (cgroup && qemuSetupDiskCgroup(vm, cgroup, disk) < 0) || + qemuSetupDiskCgroup(vm, disk) < 0 || virSecurityManagerSetImageLabel(driver->securityManager, vm->def, disk) < 0)) { disk->src = oldsrc; @@ -13192,8 +13073,6 @@ qemuDomainBlockPivot(virConnectPtr conn, disk->mirroring = false; cleanup: - if (cgroup) - virCgroupFree(&cgroup); if (resume && virDomainObjIsActive(vm) && qemuProcessStartCPUs(driver, vm, conn, VIR_DOMAIN_RUNNING_UNPAUSED, @@ -13421,7 +13300,6 @@ qemuDomainBlockCopy(virDomainPtr dom, const char *path, struct stat st; bool need_unlink = false; char *mirror = NULL; - virCgroupPtr cgroup = NULL; virQEMUDriverConfigPtr cfg = NULL; /* Preliminaries: find the disk we are editing, sanity checks */ @@ -13437,13 +13315,6 @@ qemuDomainBlockCopy(virDomainPtr dom, const char *path, _("domain is not running")); goto cleanup; } - if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES) && - virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0) < 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unable to find cgroup for %s"), - vm->def->name); - goto cleanup; - } device = qemuDiskPathToAlias(vm, path, &idx); if (!device) { @@ -13545,9 +13416,9 @@ qemuDomainBlockCopy(virDomainPtr dom, const char *path, goto endjob; } - if (qemuDomainPrepareDiskChainElement(driver, vm, cgroup, disk, dest, + if 
(qemuDomainPrepareDiskChainElement(driver, vm, disk, dest, VIR_DISK_CHAIN_READ_WRITE) < 0) { - qemuDomainPrepareDiskChainElement(driver, vm, cgroup, disk, dest, + qemuDomainPrepareDiskChainElement(driver, vm, disk, dest, VIR_DISK_CHAIN_NO_ACCESS); goto endjob; } @@ -13559,7 +13430,7 @@ qemuDomainBlockCopy(virDomainPtr dom, const char *path, virDomainAuditDisk(vm, NULL, dest, "mirror", ret >= 0); qemuDomainObjExitMonitor(driver, vm); if (ret < 0) { - qemuDomainPrepareDiskChainElement(driver, vm, cgroup, disk, dest, + qemuDomainPrepareDiskChainElement(driver, vm, disk, dest, VIR_DISK_CHAIN_NO_ACCESS); goto endjob; } @@ -13581,8 +13452,6 @@ endjob: } cleanup: - if (cgroup) - virCgroupFree(&cgroup); VIR_FREE(device); if (vm) virObjectUnlock(vm); @@ -13638,7 +13507,6 @@ qemuDomainBlockCommit(virDomainPtr dom, const char *path, const char *base, virStorageFileMetadataPtr top_meta = NULL; const char *top_parent = NULL; const char *base_canon = NULL; - virCgroupPtr cgroup = NULL; bool clean_access = false; virCheckFlags(VIR_DOMAIN_BLOCK_COMMIT_SHALLOW, -1); @@ -13722,18 +13590,11 @@ qemuDomainBlockCommit(virDomainPtr dom, const char *path, const char *base, * revoke access to files removed from the chain, when the commit * operation succeeds, but doing that requires tracking the * operation in XML across libvirtd restarts. 
*/ - if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES) && - virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0) < 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unable to find cgroup for %s"), - vm->def->name); - goto endjob; - } clean_access = true; - if (qemuDomainPrepareDiskChainElement(driver, vm, cgroup, disk, base_canon, + if (qemuDomainPrepareDiskChainElement(driver, vm, disk, base_canon, VIR_DISK_CHAIN_READ_WRITE) < 0 || (top_parent && top_parent != disk->src && - qemuDomainPrepareDiskChainElement(driver, vm, cgroup, disk, + qemuDomainPrepareDiskChainElement(driver, vm, disk, top_parent, VIR_DISK_CHAIN_READ_WRITE) < 0)) goto endjob; @@ -13747,15 +13608,13 @@ qemuDomainBlockCommit(virDomainPtr dom, const char *path, const char *base, endjob: if (ret < 0 && clean_access) { /* Revert access to read-only, if possible. */ - qemuDomainPrepareDiskChainElement(driver, vm, cgroup, disk, base_canon, + qemuDomainPrepareDiskChainElement(driver, vm, disk, base_canon, VIR_DISK_CHAIN_READ_ONLY); if (top_parent && top_parent != disk->src) - qemuDomainPrepareDiskChainElement(driver, vm, cgroup, disk, + qemuDomainPrepareDiskChainElement(driver, vm, disk, top_parent, VIR_DISK_CHAIN_READ_ONLY); } - if (cgroup) - virCgroupFree(&cgroup); if (qemuDomainObjEndJob(driver, vm) == 0) { vm = NULL; goto cleanup; @@ -14399,17 +14258,18 @@ cleanup: /* qemuDomainGetCPUStats() with start_cpu == -1 */ static int -qemuDomainGetTotalcpuStats(virCgroupPtr group, +qemuDomainGetTotalcpuStats(virDomainObjPtr vm, virTypedParameterPtr params, int nparams) { unsigned long long cpu_time; int ret; + qemuDomainObjPrivatePtr priv = vm->privateData; if (nparams == 0) /* return supported number of params */ return QEMU_NB_TOTAL_CPU_STAT_PARAM; /* entry 0 is cputime */ - ret = virCgroupGetCpuacctUsage(group, &cpu_time); + ret = virCgroupGetCpuacctUsage(priv->cgroup, &cpu_time); if (ret < 0) { virReportSystemError(-ret, "%s", _("unable to get cpu account")); return -1; @@ 
-14423,7 +14283,7 @@ qemuDomainGetTotalcpuStats(virCgroupPtr group, unsigned long long user; unsigned long long sys; - ret = virCgroupGetCpuacctStat(group, &user, &sys); + ret = virCgroupGetCpuacctStat(priv->cgroup, &user, &sys); if (ret < 0) { virReportSystemError(-ret, "%s", _("unable to get cpu account")); return -1; @@ -14461,22 +14321,22 @@ qemuDomainGetTotalcpuStats(virCgroupPtr group, * s3 = t03 + t13 */ static int -getSumVcpuPercpuStats(virCgroupPtr group, - unsigned int nvcpu, +getSumVcpuPercpuStats(virDomainObjPtr vm, unsigned long long *sum_cpu_time, unsigned int num) { int ret = -1; int i; char *buf = NULL; + qemuDomainObjPrivatePtr priv = vm->privateData; virCgroupPtr group_vcpu = NULL; - for (i = 0; i < nvcpu; i++) { + for (i = 0; i < priv->nvcpupids; i++) { char *pos; unsigned long long tmp; int j; - if (virCgroupForVcpu(group, i, &group_vcpu, 0) < 0) { + if (virCgroupForVcpu(priv->cgroup, i, &group_vcpu, 0) < 0) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("error accessing cgroup cpuacct for vcpu")); goto cleanup; @@ -14508,7 +14368,6 @@ cleanup: static int qemuDomainGetPercpuStats(virDomainObjPtr vm, - virCgroupPtr group, virTypedParameterPtr params, unsigned int nparams, int start_cpu, @@ -14548,7 +14407,7 @@ qemuDomainGetPercpuStats(virDomainObjPtr vm, } /* we get percpu cputime accounting info. 
*/ - if (virCgroupGetCpuacctPercpuUsage(group, &buf)) + if (virCgroupGetCpuacctPercpuUsage(priv->cgroup, &buf)) goto cleanup; pos = buf; memset(params, 0, nparams * ncpus); @@ -14588,7 +14447,7 @@ qemuDomainGetPercpuStats(virDomainObjPtr vm, virReportOOMError(); goto cleanup; } - if (getSumVcpuPercpuStats(group, priv->nvcpupids, sum_cpu_time, n) < 0) + if (getSumVcpuPercpuStats(vm, sum_cpu_time, n) < 0) goto cleanup; sum_cpu_pos = sum_cpu_time; @@ -14614,17 +14473,17 @@ cleanup: static int qemuDomainGetCPUStats(virDomainPtr domain, - virTypedParameterPtr params, - unsigned int nparams, - int start_cpu, - unsigned int ncpus, - unsigned int flags) + virTypedParameterPtr params, + unsigned int nparams, + int start_cpu, + unsigned int ncpus, + unsigned int flags) { virQEMUDriverPtr driver = domain->conn->privateData; - virCgroupPtr group = NULL; virDomainObjPtr vm = NULL; int ret = -1; bool isActive; + qemuDomainObjPrivatePtr priv; virCheckFlags(VIR_TYPED_PARAM_STRING_OKAY, -1); @@ -14634,6 +14493,7 @@ qemuDomainGetCPUStats(virDomainPtr domain, _("No such domain %s"), domain->uuid); goto cleanup; } + priv = vm->privateData; isActive = virDomainObjIsActive(vm); if (!isActive) { @@ -14642,25 +14502,18 @@ qemuDomainGetCPUStats(virDomainPtr domain, goto cleanup; } - if (!qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPUACCT)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUACCT)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("cgroup CPUACCT controller is not mounted")); goto cleanup; } - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), vm->def->name); - goto cleanup; - } - if (start_cpu == -1) - ret = qemuDomainGetTotalcpuStats(group, params, nparams); + ret = qemuDomainGetTotalcpuStats(vm, params, nparams); else - ret = qemuDomainGetPercpuStats(vm, group, params, nparams, + ret = qemuDomainGetPercpuStats(vm, params, nparams, 
start_cpu, ncpus); cleanup: - virCgroupFree(&group); if (vm) virObjectUnlock(vm); return ret; diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index b978b97..a6c75cb 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -1136,27 +1136,16 @@ int qemuDomainAttachHostUsbDevice(virQEMUDriverPtr driver, goto error; } - if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES)) { - virCgroupPtr cgroup = NULL; + if (virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_DEVICES)) { virUSBDevicePtr usb; - qemuCgroupData data; - - if (virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unable to find cgroup for %s"), - vm->def->name); - goto error; - } if ((usb = virUSBDeviceNew(hostdev->source.subsys.u.usb.bus, hostdev->source.subsys.u.usb.device, NULL)) == NULL) goto error; - data.vm = vm; - data.cgroup = cgroup; if (virUSBDeviceFileIterate(usb, qemuSetupHostUsbDeviceCgroup, - &data) < 0) { + vm) < 0) { virUSBDeviceFree(usb); goto error; } @@ -2032,7 +2021,6 @@ int qemuDomainDetachVirtioDiskDevice(virQEMUDriverPtr driver, int i, ret = -1; virDomainDiskDefPtr detach = NULL; qemuDomainObjPrivatePtr priv = vm->privateData; - virCgroupPtr cgroup = NULL; char *drivestr = NULL; i = qemuFindDisk(vm->def, dev->data.disk->dst); @@ -2052,15 +2040,6 @@ int qemuDomainDetachVirtioDiskDevice(virQEMUDriverPtr driver, goto cleanup; } - if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES)) { - if (virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unable to find cgroup for %s"), - vm->def->name); - goto cleanup; - } - } - if (STREQLEN(vm->def->os.machine, "s390-ccw", 8) && virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_VIRTIO_CCW)) { if (!virDomainDeviceAddressIsValid(&detach->info, @@ -2130,11 +2109,9 @@ int qemuDomainDetachVirtioDiskDevice(virQEMUDriverPtr driver, vm->def, dev->data.disk) < 0) 
VIR_WARN("Unable to restore security label on %s", dev->data.disk->src); - if (cgroup != NULL) { - if (qemuTeardownDiskCgroup(vm, cgroup, dev->data.disk) < 0) - VIR_WARN("Failed to teardown cgroup for disk path %s", - NULLSTR(dev->data.disk->src)); - } + if (qemuTeardownDiskCgroup(vm, dev->data.disk) < 0) + VIR_WARN("Failed to teardown cgroup for disk path %s", + NULLSTR(dev->data.disk->src)); if (virDomainLockDiskDetach(driver->lockManager, vm, dev->data.disk) < 0) VIR_WARN("Unable to release lock on %s", dev->data.disk->src); @@ -2142,7 +2119,6 @@ int qemuDomainDetachVirtioDiskDevice(virQEMUDriverPtr driver, ret = 0; cleanup: - virCgroupFree(&cgroup); VIR_FREE(drivestr); return ret; } @@ -2154,7 +2130,6 @@ int qemuDomainDetachDiskDevice(virQEMUDriverPtr driver, int i, ret = -1; virDomainDiskDefPtr detach = NULL; qemuDomainObjPrivatePtr priv = vm->privateData; - virCgroupPtr cgroup = NULL; char *drivestr = NULL; i = qemuFindDisk(vm->def, dev->data.disk->dst); @@ -2181,15 +2156,6 @@ int qemuDomainDetachDiskDevice(virQEMUDriverPtr driver, goto cleanup; } - if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES)) { - if (virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unable to find cgroup for %s"), - vm->def->name); - goto cleanup; - } - } - /* build the actual drive id string as the disk->info.alias doesn't * contain the QEMU_DRIVE_HOST_PREFIX that is passed to qemu */ if (virAsprintf(&drivestr, "%s%s", @@ -2222,11 +2188,9 @@ int qemuDomainDetachDiskDevice(virQEMUDriverPtr driver, vm->def, dev->data.disk) < 0) VIR_WARN("Unable to restore security label on %s", dev->data.disk->src); - if (cgroup != NULL) { - if (qemuTeardownDiskCgroup(vm, cgroup, dev->data.disk) < 0) - VIR_WARN("Failed to teardown cgroup for disk path %s", - NULLSTR(dev->data.disk->src)); - } + if (qemuTeardownDiskCgroup(vm, dev->data.disk) < 0) + VIR_WARN("Failed to teardown cgroup for disk path %s", + 
NULLSTR(dev->data.disk->src)); if (virDomainLockDiskDetach(driver->lockManager, vm, dev->data.disk) < 0) VIR_WARN("Unable to release lock on disk %s", dev->data.disk->src); @@ -2235,7 +2199,6 @@ int qemuDomainDetachDiskDevice(virQEMUDriverPtr driver, cleanup: VIR_FREE(drivestr); - virCgroupFree(&cgroup); return ret; } diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 3f74add..1b8719e 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -4177,7 +4177,6 @@ qemuMigrationToFile(virQEMUDriverPtr driver, virDomainObjPtr vm, enum qemuDomainAsyncJob asyncJob) { qemuDomainObjPrivatePtr priv = vm->privateData; - virCgroupPtr cgroup = NULL; int ret = -1; int rc; bool restoreLabel = false; @@ -4211,21 +4210,13 @@ qemuMigrationToFile(virQEMUDriverPtr driver, virDomainObjPtr vm, * given cgroup ACL permission. We might also stumble on * a race present in some qemu versions where it does a wait() * that botches pclose. */ - if (qemuCgroupControllerActive(driver, - VIR_CGROUP_CONTROLLER_DEVICES)) { - if (virCgroupForDomain(driver->cgroup, vm->def->name, - &cgroup, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unable to find cgroup for %s"), - vm->def->name); - goto cleanup; - } - rc = virCgroupAllowDevicePath(cgroup, path, + if (virCgroupHasController(priv->cgroup, + VIR_CGROUP_CONTROLLER_DEVICES)) { + rc = virCgroupAllowDevicePath(priv->cgroup, path, VIR_CGROUP_DEVICE_RW); - virDomainAuditCgroupPath(vm, cgroup, "allow", path, "rw", rc); + virDomainAuditCgroupPath(vm, priv->cgroup, "allow", path, "rw", rc); if (rc == 1) { /* path was not a device, no further need for cgroup */ - virCgroupFree(&cgroup); } else if (rc < 0) { virReportSystemError(-rc, _("Unable to allow device %s for %s"), @@ -4326,14 +4317,14 @@ cleanup: vm->def, path) < 0) VIR_WARN("failed to restore save state label on %s", path); - if (cgroup != NULL) { - rc = virCgroupDenyDevicePath(cgroup, path, + if (virCgroupHasController(priv->cgroup, + 
VIR_CGROUP_CONTROLLER_DEVICES)) { + rc = virCgroupDenyDevicePath(priv->cgroup, path, VIR_CGROUP_DEVICE_RWM); - virDomainAuditCgroupPath(vm, cgroup, "deny", path, "rwm", rc); + virDomainAuditCgroupPath(vm, priv->cgroup, "deny", path, "rwm", rc); if (rc < 0) VIR_WARN("Unable to deny device %s for %s %d", path, vm->def->name, rc); - virCgroupFree(&cgroup); } return ret; } diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 8c4bfb7..a86e62c 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -1395,6 +1395,7 @@ qemuProcessReadLogOutput(virDomainObjPtr vm, /* Filter out debug messages from intermediate libvirt process */ while ((eol = strchr(filter_next, '\n'))) { *eol = '\0'; + VIR_ERROR("<<<<<<<<<<<<%s>>>>>>>>>>", filter_next); if (virLogProbablyLogMessage(filter_next)) { memmove(filter_next, eol + 1, got - (eol - buf)); got -= eol + 1 - filter_next; @@ -2529,7 +2530,7 @@ static int qemuProcessHook(void *data) * memory allocation is on the correct NUMA node */ VIR_DEBUG("Moving process to cgroup"); - if (qemuAddToCgroup(h->driver, h->vm->def) < 0) + if (qemuAddToCgroup(h->vm) < 0) goto cleanup; /* This must be done after cgroup placement to avoid resetting CPU @@ -3004,6 +3005,9 @@ qemuProcessReconnect(void *opaque) if (qemuUpdateActiveUsbHostdevs(driver, obj->def) < 0) goto error; + if (qemuInitCgroup(driver, obj) < 0) + goto error; + /* XXX: Need to change as long as lock is introduced for * qemu_driver->sharedDisks. 
*/ @@ -3379,7 +3383,7 @@ int qemuProcessStart(virConnectPtr conn, /* Ensure no historical cgroup for this VM is lying around bogus * settings */ VIR_DEBUG("Ensuring no historical cgroup is lying around"); - qemuRemoveCgroup(driver, vm, 1); + qemuRemoveCgroup(vm); for (i = 0 ; i < vm->def->ngraphics; ++i) { virDomainGraphicsDefPtr graphics = vm->def->graphics[i]; @@ -3740,7 +3744,7 @@ int qemuProcessStart(virConnectPtr conn, goto cleanup; VIR_DEBUG("Setting cgroup for each VCPU (if required)"); - if (qemuSetupCgroupForVcpu(driver, vm) < 0) + if (qemuSetupCgroupForVcpu(vm) < 0) goto cleanup; VIR_DEBUG("Setting cgroup for emulator (if required)"); @@ -4075,7 +4079,7 @@ void qemuProcessStop(virQEMUDriverPtr driver, } retry: - if ((ret = qemuRemoveCgroup(driver, vm, 0)) < 0) { + if ((ret = qemuRemoveCgroup(vm)) < 0) { if (ret == -EBUSY && (retries++ < 5)) { usleep(200*1000); goto retry; @@ -4083,6 +4087,7 @@ retry: VIR_WARN("Failed to remove cgroup for %s", vm->def->name); } + virCgroupFree(&priv->cgroup); qemuProcessRemoveDomainStatus(driver, vm); -- 1.8.1.4
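The pattern the QEMU patch above applies (resolve the cgroup once when the guest starts, keep it in the per-domain private data, use it from every API entry point, free it at shutdown) can be sketched in isolation. This is only an illustration under stated assumptions, not libvirt code: demo_cgroup, demo_domain and the demo_* functions are hypothetical stand-ins for virCgroupPtr, the domain private data and the driver entry points, and the path format simply mirrors the /system/$VMNAME.qemu.libvirt layout described in the cover letter.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a cgroup handle; not libvirt's virCgroupPtr. */
typedef struct demo_cgroup {
    char path[128];
} demo_cgroup;

/* Hypothetical stand-in for the per-domain private data. */
typedef struct demo_domain {
    const char *name;
    demo_cgroup *cgroup;   /* cached handle; NULL while the domain is inactive */
} demo_domain;

/* Counts how often the handle is resolved, to show it happens only once. */
static int demo_lookups;

/* Resolve the handle once, at domain startup, and cache it. */
static int demo_domain_start(demo_domain *dom)
{
    demo_cgroup *cg = calloc(1, sizeof(*cg));

    if (!cg)
        return -1;
    snprintf(cg->path, sizeof(cg->path), "/system/%s.qemu.libvirt", dom->name);
    demo_lookups++;
    dom->cgroup = cg;
    return 0;
}

/* An API entry point: it uses the cached handle and never looks it up again. */
static int demo_domain_get_path(const demo_domain *dom, char *buf, size_t len)
{
    if (!dom->cgroup)
        return -1;         /* domain not running, so there is no cgroup */
    snprintf(buf, len, "%s", dom->cgroup->path);
    return 0;
}

/* Release the cached handle at shutdown. */
static void demo_domain_stop(demo_domain *dom)
{
    free(dom->cgroup);
    dom->cgroup = NULL;
}
```

A caller would invoke demo_domain_start() once, call demo_domain_get_path() as often as needed, and finish with demo_domain_stop(). The entry points then need neither a per-call lookup nor a matching free in their cleanup paths, which is the bulk of what the diff removes.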

From: "Daniel P. Berrange" <berrange@redhat.com> Instead of calling virCgroupForDomain every time we need the virCgroupPtr instance, just do it once at VM startup and cache a reference to the object in virLXCDomainObjPrivatePtr until shutdown of the VM. Removing the virCgroupPtr from the LXC driver state also means we don't have stale mount info if someone mounts the cgroups filesystem after libvirtd has been started. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> --- src/lxc/lxc_cgroup.c | 17 ++- src/lxc/lxc_cgroup.h | 2 + src/lxc/lxc_conf.h | 3 - src/lxc/lxc_controller.c | 2 +- src/lxc/lxc_domain.c | 2 + src/lxc/lxc_domain.h | 3 + src/lxc/lxc_driver.c | 354 +++++++++++++---------------------------------- src/lxc/lxc_process.c | 39 +++--- 8 files changed, 143 insertions(+), 279 deletions(-) diff --git a/src/lxc/lxc_cgroup.c b/src/lxc/lxc_cgroup.c index 33641f8..1bad9ec 100644 --- a/src/lxc/lxc_cgroup.c +++ b/src/lxc/lxc_cgroup.c @@ -527,7 +527,6 @@ virCgroupPtr virLXCCgroupCreate(virDomainDefPtr def) { virCgroupPtr driver = NULL; virCgroupPtr cgroup = NULL; - int ret = -1; int rc; rc = virCgroupForDriver("lxc", &driver, 1, 0, -1); @@ -545,6 +544,21 @@ virCgroupPtr virLXCCgroupCreate(virDomainDefPtr def) goto cleanup; } +cleanup: + virCgroupFree(&driver); + return cgroup; +} + + +virCgroupPtr virLXCCgroupJoin(virDomainDefPtr def) +{ + virCgroupPtr cgroup = NULL; + int ret = -1; + int rc; + + if (!(cgroup = virLXCCgroupCreate(def))) + return NULL; + rc = virCgroupAddTask(cgroup, getpid()); if (rc != 0) { virReportSystemError(-rc, @@ -556,7 +570,6 @@ virCgroupPtr virLXCCgroupCreate(virDomainDefPtr def) ret = 0; cleanup: - virCgroupFree(&driver); if (ret < 0) { virCgroupFree(&cgroup); return NULL; diff --git a/src/lxc/lxc_cgroup.h b/src/lxc/lxc_cgroup.h index 942e0fc..25a427c 100644 --- a/src/lxc/lxc_cgroup.h +++ b/src/lxc/lxc_cgroup.h @@ -22,11 +22,13 @@ #ifndef __VIR_LXC_CGROUP_H__ # define __VIR_LXC_CGROUP_H__ +# include "vircgroup.h" # include 
"domain_conf.h" # include "lxc_fuse.h" # include "virusb.h" virCgroupPtr virLXCCgroupCreate(virDomainDefPtr def); +virCgroupPtr virLXCCgroupJoin(virDomainDefPtr def); int virLXCCgroupSetup(virDomainDefPtr def, virCgroupPtr cgroup, virBitmapPtr nodemask); diff --git a/src/lxc/lxc_conf.h b/src/lxc/lxc_conf.h index b46dc32..dbe13a5 100644 --- a/src/lxc/lxc_conf.h +++ b/src/lxc/lxc_conf.h @@ -32,7 +32,6 @@ # include "domain_event.h" # include "capabilities.h" # include "virthread.h" -# include "vircgroup.h" # include "security/security_manager.h" # include "configmake.h" # include "virusb.h" @@ -53,8 +52,6 @@ struct _virLXCDriver { virCapsPtr caps; virDomainXMLConfPtr xmlconf; - virCgroupPtr cgroup; - size_t nactive; virStateInhibitCallback inhibitCallback; void *inhibitOpaque; diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c index cede445..866a2d8 100644 --- a/src/lxc/lxc_controller.c +++ b/src/lxc/lxc_controller.c @@ -1418,7 +1418,7 @@ virLXCControllerRun(virLXCControllerPtr ctrl) if (virLXCControllerSetupPrivateNS() < 0) goto cleanup; - if (!(cgroup = virLXCCgroupCreate(ctrl->def))) + if (!(cgroup = virLXCCgroupJoin(ctrl->def))) goto cleanup; if (virLXCControllerSetupLoopDevices(ctrl) < 0) diff --git a/src/lxc/lxc_domain.c b/src/lxc/lxc_domain.c index 08cf8f6..1364e8e 100644 --- a/src/lxc/lxc_domain.c +++ b/src/lxc/lxc_domain.c @@ -43,6 +43,8 @@ static void virLXCDomainObjPrivateFree(void *data) { virLXCDomainObjPrivatePtr priv = data; + virCgroupFree(&priv->cgroup); + VIR_FREE(priv); } diff --git a/src/lxc/lxc_domain.h b/src/lxc/lxc_domain.h index 007ea84..1bc8ce5 100644 --- a/src/lxc/lxc_domain.h +++ b/src/lxc/lxc_domain.h @@ -23,6 +23,7 @@ #ifndef __LXC_DOMAIN_H__ # define __LXC_DOMAIN_H__ +# include "vircgroup.h" # include "lxc_conf.h" # include "lxc_monitor.h" @@ -36,6 +37,8 @@ struct _virLXCDomainObjPrivate { bool wantReboot; pid_t initpid; + + virCgroupPtr cgroup; }; extern virDomainXMLPrivateDataCallbacks virLXCDriverPrivateDataCallbacks; 
diff --git a/src/lxc/lxc_driver.c b/src/lxc/lxc_driver.c index ea056c8..5bab7bc 100644 --- a/src/lxc/lxc_driver.c +++ b/src/lxc/lxc_driver.c @@ -527,8 +527,8 @@ static int lxcDomainGetInfo(virDomainPtr dom, { virLXCDriverPtr driver = dom->conn->privateData; virDomainObjPtr vm; - virCgroupPtr cgroup = NULL; int ret = -1, rc; + virLXCDomainObjPrivatePtr priv; lxcDriverLock(driver); vm = virDomainObjListFindByUUID(driver->domains, dom->uuid); @@ -541,24 +541,20 @@ static int lxcDomainGetInfo(virDomainPtr dom, goto cleanup; } + priv = vm->privateData; + info->state = virDomainObjGetState(vm, NULL); - if (!virDomainObjIsActive(vm) || driver->cgroup == NULL) { + if (!virDomainObjIsActive(vm)) { info->cpuTime = 0; info->memory = vm->def->mem.cur_balloon; } else { - if (virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unable to get cgroup for %s"), vm->def->name); - goto cleanup; - } - - if (virCgroupGetCpuacctUsage(cgroup, &(info->cpuTime)) < 0) { + if (virCgroupGetCpuacctUsage(priv->cgroup, &(info->cpuTime)) < 0) { virReportError(VIR_ERR_OPERATION_FAILED, "%s", _("Cannot read cputime for domain")); goto cleanup; } - if ((rc = virCgroupGetMemoryUsage(cgroup, &(info->memory))) < 0) { + if ((rc = virCgroupGetMemoryUsage(priv->cgroup, &(info->memory))) < 0) { virReportError(VIR_ERR_OPERATION_FAILED, "%s", _("Cannot read memory usage for domain")); if (rc == -ENOENT) { @@ -576,8 +572,6 @@ static int lxcDomainGetInfo(virDomainPtr dom, cleanup: lxcDriverUnlock(driver); - if (cgroup) - virCgroupFree(&cgroup); if (vm) virObjectUnlock(vm); return ret; @@ -708,8 +702,8 @@ cleanup: static int lxcDomainSetMemory(virDomainPtr dom, unsigned long newmem) { virLXCDriverPtr driver = dom->conn->privateData; virDomainObjPtr vm; - virCgroupPtr cgroup = NULL; int ret = -1; + virLXCDomainObjPrivatePtr priv; lxcDriverLock(driver); vm = virDomainObjListFindByUUID(driver->domains, dom->uuid); @@ -721,6 +715,7 @@ static int 
lxcDomainSetMemory(virDomainPtr dom, unsigned long newmem) { _("No domain with matching uuid '%s'"), uuidstr); goto cleanup; } + priv = vm->privateData; if (newmem > vm->def->mem.max_balloon) { virReportError(VIR_ERR_INVALID_ARG, @@ -734,19 +729,7 @@ static int lxcDomainSetMemory(virDomainPtr dom, unsigned long newmem) { goto cleanup; } - if (driver->cgroup == NULL) { - virReportError(VIR_ERR_OPERATION_INVALID, - "%s", _("cgroups must be configured on the host")); - goto cleanup; - } - - if (virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unable to get cgroup for %s"), vm->def->name); - goto cleanup; - } - - if (virCgroupSetMemory(cgroup, newmem) < 0) { + if (virCgroupSetMemory(priv->cgroup, newmem) < 0) { virReportError(VIR_ERR_OPERATION_FAILED, "%s", _("Failed to set memory for domain")); goto cleanup; @@ -757,8 +740,6 @@ static int lxcDomainSetMemory(virDomainPtr dom, unsigned long newmem) { cleanup: if (vm) virObjectUnlock(vm); - if (cgroup) - virCgroupFree(&cgroup); return ret; } @@ -770,10 +751,10 @@ lxcDomainSetMemoryParameters(virDomainPtr dom, { virLXCDriverPtr driver = dom->conn->privateData; int i; - virCgroupPtr cgroup = NULL; virDomainObjPtr vm = NULL; int ret = -1; int rc; + virLXCDomainObjPrivatePtr priv; virCheckFlags(0, -1); if (virTypedParameterArrayValidate(params, nparams, @@ -796,33 +777,28 @@ lxcDomainSetMemoryParameters(virDomainPtr dom, _("No domain with matching uuid '%s'"), uuidstr); goto cleanup; } - - if (virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), vm->def->name); - goto cleanup; - } + priv = vm->privateData; ret = 0; for (i = 0; i < nparams; i++) { virTypedParameterPtr param = &params[i]; if (STREQ(param->field, VIR_DOMAIN_MEMORY_HARD_LIMIT)) { - rc = virCgroupSetMemoryHardLimit(cgroup, params[i].value.ul); + rc = virCgroupSetMemoryHardLimit(priv->cgroup, 
params[i].value.ul); if (rc != 0) { virReportSystemError(-rc, "%s", _("unable to set memory hard_limit tunable")); ret = -1; } } else if (STREQ(param->field, VIR_DOMAIN_MEMORY_SOFT_LIMIT)) { - rc = virCgroupSetMemorySoftLimit(cgroup, params[i].value.ul); + rc = virCgroupSetMemorySoftLimit(priv->cgroup, params[i].value.ul); if (rc != 0) { virReportSystemError(-rc, "%s", _("unable to set memory soft_limit tunable")); ret = -1; } } else if (STREQ(param->field, VIR_DOMAIN_MEMORY_SWAP_HARD_LIMIT)) { - rc = virCgroupSetMemSwapHardLimit(cgroup, params[i].value.ul); + rc = virCgroupSetMemSwapHardLimit(priv->cgroup, params[i].value.ul); if (rc != 0) { virReportSystemError(-rc, "%s", _("unable to set swap_hard_limit tunable")); @@ -832,8 +808,6 @@ lxcDomainSetMemoryParameters(virDomainPtr dom, } cleanup: - if (cgroup) - virCgroupFree(&cgroup); if (vm) virObjectUnlock(vm); lxcDriverUnlock(driver); @@ -848,11 +822,11 @@ lxcDomainGetMemoryParameters(virDomainPtr dom, { virLXCDriverPtr driver = dom->conn->privateData; int i; - virCgroupPtr cgroup = NULL; virDomainObjPtr vm = NULL; unsigned long long val; int ret = -1; int rc; + virLXCDomainObjPrivatePtr priv; virCheckFlags(0, -1); @@ -866,6 +840,7 @@ lxcDomainGetMemoryParameters(virDomainPtr dom, _("No domain with matching uuid '%s'"), uuidstr); goto cleanup; } + priv = vm->privateData; if ((*nparams) == 0) { /* Current number of memory parameters supported by cgroups */ @@ -874,19 +849,13 @@ lxcDomainGetMemoryParameters(virDomainPtr dom, goto cleanup; } - if (virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unable to get cgroup for %s"), vm->def->name); - goto cleanup; - } - for (i = 0; i < LXC_NB_MEM_PARAM && i < *nparams; i++) { virTypedParameterPtr param = &params[i]; val = 0; switch (i) { case 0: /* fill memory hard limit here */ - rc = virCgroupGetMemoryHardLimit(cgroup, &val); + rc = virCgroupGetMemoryHardLimit(priv->cgroup, &val); if (rc != 0) { 
virReportSystemError(-rc, "%s", _("unable to get memory hard limit")); @@ -897,7 +866,7 @@ lxcDomainGetMemoryParameters(virDomainPtr dom, goto cleanup; break; case 1: /* fill memory soft limit here */ - rc = virCgroupGetMemorySoftLimit(cgroup, &val); + rc = virCgroupGetMemorySoftLimit(priv->cgroup, &val); if (rc != 0) { virReportSystemError(-rc, "%s", _("unable to get memory soft limit")); @@ -908,7 +877,7 @@ lxcDomainGetMemoryParameters(virDomainPtr dom, goto cleanup; break; case 2: /* fill swap hard limit here */ - rc = virCgroupGetMemSwapHardLimit(cgroup, &val); + rc = virCgroupGetMemSwapHardLimit(priv->cgroup, &val); if (rc != 0) { virReportSystemError(-rc, "%s", _("unable to get swap hard limit")); @@ -932,8 +901,6 @@ lxcDomainGetMemoryParameters(virDomainPtr dom, ret = 0; cleanup: - if (cgroup) - virCgroupFree(&cgroup); if (vm) virObjectUnlock(vm); lxcDriverUnlock(driver); @@ -1417,7 +1384,6 @@ static int lxcStartup(bool privileged, void *opaque ATTRIBUTE_UNUSED) { char *ld; - int rc; /* Valgrind gets very annoyed when we clone containers, so * disable LXC when under valgrind @@ -1460,16 +1426,6 @@ static int lxcStartup(bool privileged, lxc_driver->log_libvirtd = 0; /* by default log to container logfile */ lxc_driver->have_netns = lxcCheckNetNsSupport(); - rc = virCgroupForDriver("lxc", &lxc_driver->cgroup, privileged, 1, -1); - if (rc < 0) { - char buf[1024] ATTRIBUTE_UNUSED; - VIR_DEBUG("Unable to create cgroup for LXC driver: %s", - virStrerror(-rc, buf, sizeof(buf))); - /* Don't abort startup. 
We will explicitly report to - * the user when they try to start a VM - */ - } - /* Call function to load lxc driver configuration information */ if (lxcLoadDriverConfig(lxc_driver) < 0) goto cleanup; @@ -1638,30 +1594,32 @@ cleanup: } -static bool lxcCgroupControllerActive(virLXCDriverPtr driver, - int controller) -{ - return virCgroupHasController(driver->cgroup, controller); -} - - - -static char *lxcGetSchedulerType(virDomainPtr domain, +static char *lxcGetSchedulerType(virDomainPtr dom, int *nparams) { - virLXCDriverPtr driver = domain->conn->privateData; + virLXCDriverPtr driver = dom->conn->privateData; char *ret = NULL; int rc; + virDomainObjPtr vm; + virLXCDomainObjPrivatePtr priv; lxcDriverLock(driver); - if (!lxcCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPU)) { + vm = virDomainObjListFindByUUID(driver->domains, dom->uuid); + if (vm == NULL) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("No such domain %s"), dom->uuid); + goto cleanup; + } + priv = vm->privateData; + + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("cgroup CPU controller is not mounted")); goto cleanup; } if (nparams) { - rc = lxcGetCpuBWStatus(driver->cgroup); + rc = lxcGetCpuBWStatus(priv->cgroup); if (rc < 0) goto cleanup; else if (rc == 0) @@ -1675,6 +1633,8 @@ static char *lxcGetSchedulerType(virDomainPtr domain, virReportOOMError(); cleanup: + if (vm) + virObjectUnlock(vm); lxcDriverUnlock(driver); return ret; } @@ -1761,11 +1721,11 @@ lxcSetSchedulerParametersFlags(virDomainPtr dom, { virLXCDriverPtr driver = dom->conn->privateData; int i; - virCgroupPtr group = NULL; virDomainObjPtr vm = NULL; virDomainDefPtr vmdef = NULL; int ret = -1; int rc; + virLXCDomainObjPrivatePtr priv; virCheckFlags(VIR_DOMAIN_AFFECT_LIVE | VIR_DOMAIN_AFFECT_CONFIG, -1); @@ -1788,6 +1748,7 @@ lxcSetSchedulerParametersFlags(virDomainPtr dom, _("No such domain %s"), dom->uuid); goto cleanup; } + priv = vm->privateData; if 
(virDomainLiveConfigHelperMethod(driver->caps, driver->xmlconf, vm, &flags, &vmdef) < 0) @@ -1801,17 +1762,11 @@ lxcSetSchedulerParametersFlags(virDomainPtr dom, } if (flags & VIR_DOMAIN_AFFECT_LIVE) { - if (!lxcCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPU)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("cgroup CPU controller is not mounted")); goto cleanup; } - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), - vm->def->name); - goto cleanup; - } } for (i = 0; i < nparams; i++) { @@ -1819,7 +1774,7 @@ lxcSetSchedulerParametersFlags(virDomainPtr dom, if (STREQ(param->field, VIR_DOMAIN_SCHEDULER_CPU_SHARES)) { if (flags & VIR_DOMAIN_AFFECT_LIVE) { - rc = virCgroupSetCpuShares(group, params[i].value.ul); + rc = virCgroupSetCpuShares(priv->cgroup, params[i].value.ul); if (rc != 0) { virReportSystemError(-rc, "%s", _("unable to set cpu shares tunable")); @@ -1834,7 +1789,7 @@ lxcSetSchedulerParametersFlags(virDomainPtr dom, } } else if (STREQ(param->field, VIR_DOMAIN_SCHEDULER_VCPU_PERIOD)) { if (flags & VIR_DOMAIN_AFFECT_LIVE) { - rc = lxcSetVcpuBWLive(group, params[i].value.ul, 0); + rc = lxcSetVcpuBWLive(priv->cgroup, params[i].value.ul, 0); if (rc != 0) goto cleanup; @@ -1847,7 +1802,7 @@ lxcSetSchedulerParametersFlags(virDomainPtr dom, } } else if (STREQ(param->field, VIR_DOMAIN_SCHEDULER_VCPU_QUOTA)) { if (flags & VIR_DOMAIN_AFFECT_LIVE) { - rc = lxcSetVcpuBWLive(group, 0, params[i].value.l); + rc = lxcSetVcpuBWLive(priv->cgroup, 0, params[i].value.l); if (rc != 0) goto cleanup; @@ -1878,7 +1833,6 @@ lxcSetSchedulerParametersFlags(virDomainPtr dom, cleanup: virDomainDefFree(vmdef); - virCgroupFree(&group); if (vm) virObjectUnlock(vm); lxcDriverUnlock(driver); @@ -1900,7 +1854,6 @@ lxcGetSchedulerParametersFlags(virDomainPtr dom, unsigned int flags) { virLXCDriverPtr 
driver = dom->conn->privateData; - virCgroupPtr group = NULL; virDomainObjPtr vm = NULL; virDomainDefPtr persistentDef; unsigned long long shares = 0; @@ -1910,19 +1863,13 @@ lxcGetSchedulerParametersFlags(virDomainPtr dom, int rc; bool cpu_bw_status = false; int saved_nparams = 0; + virLXCDomainObjPrivatePtr priv; virCheckFlags(VIR_DOMAIN_AFFECT_LIVE | VIR_DOMAIN_AFFECT_CONFIG, -1); lxcDriverLock(driver); - if (*nparams > 1) { - rc = lxcGetCpuBWStatus(driver->cgroup); - if (rc < 0) - goto cleanup; - cpu_bw_status = !!rc; - } - vm = virDomainObjListFindByUUID(driver->domains, dom->uuid); if (vm == NULL) { @@ -1930,6 +1877,14 @@ lxcGetSchedulerParametersFlags(virDomainPtr dom, _("No such domain %s"), dom->uuid); goto cleanup; } + priv = vm->privateData; + + if (*nparams > 1) { + rc = lxcGetCpuBWStatus(priv->cgroup); + if (rc < 0) + goto cleanup; + cpu_bw_status = !!rc; + } if (virDomainLiveConfigHelperMethod(driver->caps, driver->xmlconf, vm, &flags, &persistentDef) < 0) @@ -1944,19 +1899,13 @@ lxcGetSchedulerParametersFlags(virDomainPtr dom, goto out; } - if (!lxcCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPU)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("cgroup CPU controller is not mounted")); goto cleanup; } - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), vm->def->name); - goto cleanup; - } - - rc = virCgroupGetCpuShares(group, &shares); + rc = virCgroupGetCpuShares(priv->cgroup, &shares); if (rc != 0) { virReportSystemError(-rc, "%s", _("unable to get cpu shares tunable")); @@ -1964,7 +1913,7 @@ lxcGetSchedulerParametersFlags(virDomainPtr dom, } if (*nparams > 1 && cpu_bw_status) { - rc = lxcGetVcpuBWLive(group, &period, &quota); + rc = lxcGetVcpuBWLive(priv->cgroup, &period, &quota); if (rc != 0) goto cleanup; } @@ -1997,7 +1946,6 @@ out: ret = 0; cleanup: - 
virCgroupFree(&group); if (vm) virObjectUnlock(vm); lxcDriverUnlock(driver); @@ -2021,10 +1969,10 @@ lxcDomainSetBlkioParameters(virDomainPtr dom, { virLXCDriverPtr driver = dom->conn->privateData; int i; - virCgroupPtr group = NULL; virDomainObjPtr vm = NULL; virDomainDefPtr persistentDef = NULL; int ret = -1; + virLXCDomainObjPrivatePtr priv; virCheckFlags(VIR_DOMAIN_AFFECT_LIVE | VIR_DOMAIN_AFFECT_CONFIG, -1); @@ -2043,24 +1991,19 @@ lxcDomainSetBlkioParameters(virDomainPtr dom, _("No such domain %s"), dom->uuid); goto cleanup; } + priv = vm->privateData; if (virDomainLiveConfigHelperMethod(driver->caps, driver->xmlconf, vm, &flags, &persistentDef) < 0) goto cleanup; if (flags & VIR_DOMAIN_AFFECT_LIVE) { - if (!lxcCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_BLKIO)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_BLKIO)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("blkio cgroup isn't mounted")); goto cleanup; } - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), vm->def->name); - goto cleanup; - } - for (i = 0; i < nparams; i++) { virTypedParameterPtr param = &params[i]; @@ -2073,7 +2016,7 @@ lxcDomainSetBlkioParameters(virDomainPtr dom, goto cleanup; } - rc = virCgroupSetBlkioWeight(group, params[i].value.ui); + rc = virCgroupSetBlkioWeight(priv->cgroup, params[i].value.ui); if (rc != 0) { virReportSystemError(-rc, "%s", _("unable to set blkio weight tunable")); @@ -2106,7 +2049,6 @@ lxcDomainSetBlkioParameters(virDomainPtr dom, ret = 0; cleanup: - virCgroupFree(&group); if (vm) virObjectUnlock(vm); lxcDriverUnlock(driver); @@ -2123,12 +2065,12 @@ lxcDomainGetBlkioParameters(virDomainPtr dom, { virLXCDriverPtr driver = dom->conn->privateData; int i; - virCgroupPtr group = NULL; virDomainObjPtr vm = NULL; virDomainDefPtr persistentDef = NULL; unsigned int val; int ret = -1; int rc; + virLXCDomainObjPrivatePtr priv; 
virCheckFlags(VIR_DOMAIN_AFFECT_LIVE | VIR_DOMAIN_AFFECT_CONFIG, -1); @@ -2141,6 +2083,7 @@ lxcDomainGetBlkioParameters(virDomainPtr dom, _("No such domain %s"), dom->uuid); goto cleanup; } + priv = vm->privateData; if ((*nparams) == 0) { /* Current number of blkio parameters supported by cgroups */ @@ -2154,25 +2097,19 @@ lxcDomainGetBlkioParameters(virDomainPtr dom, goto cleanup; if (flags & VIR_DOMAIN_AFFECT_LIVE) { - if (!lxcCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_BLKIO)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_BLKIO)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("blkio cgroup isn't mounted")); goto cleanup; } - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), vm->def->name); - goto cleanup; - } - for (i = 0; i < *nparams && i < LXC_NB_BLKIO_PARAM; i++) { virTypedParameterPtr param = &params[i]; val = 0; switch (i) { case 0: /* fill blkio weight here */ - rc = virCgroupGetBlkioWeight(group, &val); + rc = virCgroupGetBlkioWeight(priv->cgroup, &val); if (rc != 0) { virReportSystemError(-rc, "%s", _("unable to get blkio weight")); @@ -2214,8 +2151,6 @@ lxcDomainGetBlkioParameters(virDomainPtr dom, ret = 0; cleanup: - if (group) - virCgroupFree(&group); if (vm) virObjectUnlock(vm); lxcDriverUnlock(driver); @@ -2385,7 +2320,7 @@ cleanup: return ret; } -static int lxcFreezeContainer(virLXCDriverPtr driver, virDomainObjPtr vm) +static int lxcFreezeContainer(virDomainObjPtr vm) { int timeout = 1000; /* In milliseconds */ int check_interval = 1; /* In milliseconds */ @@ -2393,13 +2328,7 @@ static int lxcFreezeContainer(virLXCDriverPtr driver, virDomainObjPtr vm) int waited_time = 0; int ret = -1; char *state = NULL; - virCgroupPtr cgroup = NULL; - - if (!(driver->cgroup && - virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0) == 0)) - return -1; - - /* From here on, we know that cgroup != NULL. 
*/ + virLXCDomainObjPrivatePtr priv = vm->privateData; while (waited_time < timeout) { int r; @@ -2410,7 +2339,7 @@ static int lxcFreezeContainer(virLXCDriverPtr driver, virDomainObjPtr vm) * to "FROZEN". * (see linux-2.6/Documentation/cgroups/freezer-subsystem.txt) */ - r = virCgroupSetFreezerState(cgroup, "FROZEN"); + r = virCgroupSetFreezerState(priv->cgroup, "FROZEN"); /* * Returning EBUSY explicitly indicates that the group is @@ -2437,7 +2366,7 @@ static int lxcFreezeContainer(virLXCDriverPtr driver, virDomainObjPtr vm) */ usleep(check_interval * 1000); - r = virCgroupGetFreezerState(cgroup, &state); + r = virCgroupGetFreezerState(priv->cgroup, &state); if (r < 0) { VIR_DEBUG("Reading freezer.state failed with errno: %d", r); @@ -2469,11 +2398,10 @@ error: * activate the group again and return an error. * This is likely to fall the group back again gracefully. */ - virCgroupSetFreezerState(cgroup, "THAWED"); + virCgroupSetFreezerState(priv->cgroup, "THAWED"); ret = -1; cleanup: - virCgroupFree(&cgroup); VIR_FREE(state); return ret; } @@ -2503,7 +2431,7 @@ static int lxcDomainSuspend(virDomainPtr dom) } if (virDomainObjGetState(vm, NULL) != VIR_DOMAIN_PAUSED) { - if (lxcFreezeContainer(driver, vm) < 0) { + if (lxcFreezeContainer(vm) < 0) { virReportError(VIR_ERR_OPERATION_FAILED, "%s", _("Suspend operation failed")); goto cleanup; @@ -2528,27 +2456,13 @@ cleanup: return ret; } -static int lxcUnfreezeContainer(virLXCDriverPtr driver, virDomainObjPtr vm) -{ - int ret; - virCgroupPtr cgroup = NULL; - - if (!(driver->cgroup && - virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0) == 0)) - return -1; - - ret = virCgroupSetFreezerState(cgroup, "THAWED"); - - virCgroupFree(&cgroup); - return ret; -} - static int lxcDomainResume(virDomainPtr dom) { virLXCDriverPtr driver = dom->conn->privateData; virDomainObjPtr vm; virDomainEventPtr event = NULL; int ret = -1; + virLXCDomainObjPrivatePtr priv; lxcDriverLock(driver); vm = 
virDomainObjListFindByUUID(driver->domains, dom->uuid); @@ -2561,6 +2475,8 @@ static int lxcDomainResume(virDomainPtr dom) goto cleanup; } + priv = vm->privateData; + if (!virDomainObjIsActive(vm)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("Domain is not running")); @@ -2568,7 +2484,7 @@ static int lxcDomainResume(virDomainPtr dom) } if (virDomainObjGetState(vm, NULL) == VIR_DOMAIN_PAUSED) { - if (lxcUnfreezeContainer(driver, vm) < 0) { + if (virCgroupSetFreezerState(priv->cgroup, "THAWED") < 0) { virReportError(VIR_ERR_OPERATION_FAILED, "%s", _("Resume operation failed")); goto cleanup; @@ -3111,7 +3027,6 @@ lxcDomainAttachDeviceDiskLive(virLXCDriverPtr driver, { virLXCDomainObjPrivatePtr priv = vm->privateData; virDomainDiskDefPtr def = dev->data.disk; - virCgroupPtr group = NULL; int ret = -1; char *dst = NULL; struct stat sb; @@ -3196,19 +3111,13 @@ lxcDomainAttachDeviceDiskLive(virLXCDriverPtr driver, vm->def, def) < 0) goto cleanup; - if (!lxcCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_DEVICES)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("devices cgroup isn't mounted")); goto cleanup; } - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), vm->def->name); - goto cleanup; - } - - if (virCgroupAllowDevicePath(group, def->src, + if (virCgroupAllowDevicePath(priv->cgroup, def->src, (def->readonly ? 
VIR_CGROUP_DEVICE_READ : VIR_CGROUP_DEVICE_RW) | @@ -3226,8 +3135,6 @@ lxcDomainAttachDeviceDiskLive(virLXCDriverPtr driver, cleanup: def->src = tmpsrc; virDomainAuditDisk(vm, NULL, def->src, "attach", ret == 0); - if (group) - virCgroupFree(&group); if (dst && created && ret < 0) unlink(dst); return ret; @@ -3381,7 +3288,6 @@ lxcDomainAttachDeviceHostdevSubsysUSBLive(virLXCDriverPtr driver, mode_t mode; bool created = false; virUSBDevicePtr usb = NULL; - virCgroupPtr group = NULL; if (virDomainHostdevFind(vm->def, def, NULL) >= 0) { virReportError(VIR_ERR_OPERATION_FAILED, "%s", @@ -3416,18 +3322,12 @@ lxcDomainAttachDeviceHostdevSubsysUSBLive(virLXCDriverPtr driver, goto cleanup; } - if (!lxcCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_DEVICES)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("devices cgroup isn't mounted")); goto cleanup; } - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), vm->def->name); - goto cleanup; - } - if (!(usb = virUSBDeviceNew(def->source.subsys.u.usb.bus, def->source.subsys.u.usb.device, vroot))) goto cleanup; @@ -3468,8 +3368,8 @@ lxcDomainAttachDeviceHostdevSubsysUSBLive(virLXCDriverPtr driver, goto cleanup; if (virUSBDeviceFileIterate(usb, - virLXCSetupHostUsbDeviceCgroup, - &group) < 0) + virLXCSetupHostUsbDeviceCgroup, + &priv->cgroup) < 0) goto cleanup; ret = 0; @@ -3480,7 +3380,6 @@ cleanup: unlink(dstfile); virUSBDeviceFree(usb); - virCgroupFree(&group); VIR_FREE(src); VIR_FREE(dstfile); VIR_FREE(dstdir); @@ -3496,7 +3395,6 @@ lxcDomainAttachDeviceHostdevStorageLive(virLXCDriverPtr driver, { virLXCDomainObjPrivatePtr priv = vm->privateData; virDomainHostdevDefPtr def = dev->data.hostdev; - virCgroupPtr group = NULL; int ret = -1; char *dst = NULL; char *vroot = NULL; @@ -3565,19 +3463,13 @@ 
lxcDomainAttachDeviceHostdevStorageLive(virLXCDriverPtr driver, vm->def, def, vroot) < 0) goto cleanup; - if (!lxcCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_DEVICES)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("devices cgroup isn't mounted")); goto cleanup; } - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), vm->def->name); - goto cleanup; - } - - if (virCgroupAllowDevicePath(group, def->source.caps.u.storage.block, + if (virCgroupAllowDevicePath(priv->cgroup, def->source.caps.u.storage.block, VIR_CGROUP_DEVICE_RW | VIR_CGROUP_DEVICE_MKNOD) != 0) { virReportError(VIR_ERR_INTERNAL_ERROR, @@ -3592,8 +3484,6 @@ lxcDomainAttachDeviceHostdevStorageLive(virLXCDriverPtr driver, cleanup: virDomainAuditHostdev(vm, def, "attach", ret == 0); - if (group) - virCgroupFree(&group); if (dst && created && ret < 0) unlink(dst); VIR_FREE(dst); @@ -3609,7 +3499,6 @@ lxcDomainAttachDeviceHostdevMiscLive(virLXCDriverPtr driver, { virLXCDomainObjPrivatePtr priv = vm->privateData; virDomainHostdevDefPtr def = dev->data.hostdev; - virCgroupPtr group = NULL; int ret = -1; char *dst = NULL; char *vroot = NULL; @@ -3678,19 +3567,13 @@ lxcDomainAttachDeviceHostdevMiscLive(virLXCDriverPtr driver, vm->def, def, vroot) < 0) goto cleanup; - if (!lxcCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_DEVICES)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("devices cgroup isn't mounted")); goto cleanup; } - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), vm->def->name); - goto cleanup; - } - - if (virCgroupAllowDevicePath(group, def->source.caps.u.misc.chardev, + if (virCgroupAllowDevicePath(priv->cgroup, 
def->source.caps.u.misc.chardev, VIR_CGROUP_DEVICE_RW | VIR_CGROUP_DEVICE_MKNOD) != 0) { virReportError(VIR_ERR_INTERNAL_ERROR, @@ -3705,8 +3588,6 @@ lxcDomainAttachDeviceHostdevMiscLive(virLXCDriverPtr driver, cleanup: virDomainAuditHostdev(vm, def, "attach", ret == 0); - if (group) - virCgroupFree(&group); if (dst && created && ret < 0) unlink(dst); VIR_FREE(dst); @@ -3823,13 +3704,11 @@ lxcDomainAttachDeviceLive(virConnectPtr conn, static int -lxcDomainDetachDeviceDiskLive(virLXCDriverPtr driver, - virDomainObjPtr vm, +lxcDomainDetachDeviceDiskLive(virDomainObjPtr vm, virDomainDeviceDefPtr dev) { virLXCDomainObjPrivatePtr priv = vm->privateData; virDomainDiskDefPtr def = NULL; - virCgroupPtr group = NULL; int i, ret = -1; char *dst = NULL; @@ -3855,18 +3734,12 @@ lxcDomainDetachDeviceDiskLive(virLXCDriverPtr driver, goto cleanup; } - if (!lxcCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_DEVICES)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("devices cgroup isn't mounted")); goto cleanup; } - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), vm->def->name); - goto cleanup; - } - VIR_DEBUG("Unlinking %s (backed by %s)", dst, def->src); if (unlink(dst) < 0 && errno != ENOENT) { virDomainAuditDisk(vm, def->src, NULL, "detach", false); @@ -3876,7 +3749,7 @@ lxcDomainDetachDeviceDiskLive(virLXCDriverPtr driver, } virDomainAuditDisk(vm, def->src, NULL, "detach", true); - if (virCgroupDenyDevicePath(group, def->src, VIR_CGROUP_DEVICE_RWM) != 0) + if (virCgroupDenyDevicePath(priv->cgroup, def->src, VIR_CGROUP_DEVICE_RWM) != 0) VIR_WARN("cannot deny device %s for domain %s", def->src, vm->def->name); @@ -3887,8 +3760,6 @@ lxcDomainDetachDeviceDiskLive(virLXCDriverPtr driver, cleanup: VIR_FREE(dst); - if (group) - virCgroupFree(&group); return ret; } @@ -3966,7 +3837,6 @@ 
lxcDomainDetachDeviceHostdevUSBLive(virLXCDriverPtr driver, { virLXCDomainObjPrivatePtr priv = vm->privateData; virDomainHostdevDefPtr def = NULL; - virCgroupPtr group = NULL; int idx, ret = -1; char *dst = NULL; char *vroot; @@ -3994,18 +3864,12 @@ lxcDomainDetachDeviceHostdevUSBLive(virLXCDriverPtr driver, goto cleanup; } - if (!lxcCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_DEVICES)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("devices cgroup isn't mounted")); goto cleanup; } - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), vm->def->name); - goto cleanup; - } - if (!(usb = virUSBDeviceNew(def->source.subsys.u.usb.bus, def->source.subsys.u.usb.device, vroot))) goto cleanup; @@ -4020,8 +3884,8 @@ lxcDomainDetachDeviceHostdevUSBLive(virLXCDriverPtr driver, virDomainAuditHostdev(vm, def, "detach", true); if (virUSBDeviceFileIterate(usb, - virLXCTeardownHostUsbDeviceCgroup, - &group) < 0) + virLXCTeardownHostUsbDeviceCgroup, + &priv->cgroup) < 0) VIR_WARN("cannot deny device %s for domain %s", dst, vm->def->name); @@ -4035,19 +3899,16 @@ lxcDomainDetachDeviceHostdevUSBLive(virLXCDriverPtr driver, cleanup: virUSBDeviceFree(usb); VIR_FREE(dst); - virCgroupFree(&group); return ret; } static int -lxcDomainDetachDeviceHostdevStorageLive(virLXCDriverPtr driver, - virDomainObjPtr vm, +lxcDomainDetachDeviceHostdevStorageLive(virDomainObjPtr vm, virDomainDeviceDefPtr dev) { virLXCDomainObjPrivatePtr priv = vm->privateData; virDomainHostdevDefPtr def = NULL; - virCgroupPtr group = NULL; int i, ret = -1; char *dst = NULL; @@ -4073,18 +3934,12 @@ lxcDomainDetachDeviceHostdevStorageLive(virLXCDriverPtr driver, goto cleanup; } - if (!lxcCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_DEVICES)) { 
virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("devices cgroup isn't mounted")); goto cleanup; } - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), vm->def->name); - goto cleanup; - } - VIR_DEBUG("Unlinking %s", dst); if (unlink(dst) < 0 && errno != ENOENT) { virDomainAuditHostdev(vm, def, "detach", false); @@ -4094,7 +3949,7 @@ lxcDomainDetachDeviceHostdevStorageLive(virLXCDriverPtr driver, } virDomainAuditHostdev(vm, def, "detach", true); - if (virCgroupDenyDevicePath(group, def->source.caps.u.storage.block, VIR_CGROUP_DEVICE_RWM) != 0) + if (virCgroupDenyDevicePath(priv->cgroup, def->source.caps.u.storage.block, VIR_CGROUP_DEVICE_RWM) != 0) VIR_WARN("cannot deny device %s for domain %s", def->source.caps.u.storage.block, vm->def->name); @@ -4105,20 +3960,16 @@ lxcDomainDetachDeviceHostdevStorageLive(virLXCDriverPtr driver, cleanup: VIR_FREE(dst); - if (group) - virCgroupFree(&group); return ret; } static int -lxcDomainDetachDeviceHostdevMiscLive(virLXCDriverPtr driver, - virDomainObjPtr vm, +lxcDomainDetachDeviceHostdevMiscLive(virDomainObjPtr vm, virDomainDeviceDefPtr dev) { virLXCDomainObjPrivatePtr priv = vm->privateData; virDomainHostdevDefPtr def = NULL; - virCgroupPtr group = NULL; int i, ret = -1; char *dst = NULL; @@ -4144,18 +3995,12 @@ lxcDomainDetachDeviceHostdevMiscLive(virLXCDriverPtr driver, goto cleanup; } - if (!lxcCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES)) { + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_DEVICES)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("devices cgroup isn't mounted")); goto cleanup; } - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot find cgroup for domain %s"), vm->def->name); - goto cleanup; - } - VIR_DEBUG("Unlinking %s", dst); if (unlink(dst) < 0 && errno != ENOENT) { 
virDomainAuditHostdev(vm, def, "detach", false); @@ -4165,7 +4010,7 @@ lxcDomainDetachDeviceHostdevMiscLive(virLXCDriverPtr driver, } virDomainAuditHostdev(vm, def, "detach", true); - if (virCgroupDenyDevicePath(group, def->source.caps.u.misc.chardev, VIR_CGROUP_DEVICE_RWM) != 0) + if (virCgroupDenyDevicePath(priv->cgroup, def->source.caps.u.misc.chardev, VIR_CGROUP_DEVICE_RWM) != 0) VIR_WARN("cannot deny device %s for domain %s", def->source.caps.u.misc.chardev, vm->def->name); @@ -4176,8 +4021,6 @@ lxcDomainDetachDeviceHostdevMiscLive(virLXCDriverPtr driver, cleanup: VIR_FREE(dst); - if (group) - virCgroupFree(&group); return ret; } @@ -4201,16 +4044,15 @@ lxcDomainDetachDeviceHostdevSubsysLive(virLXCDriverPtr driver, static int -lxcDomainDetachDeviceHostdevCapsLive(virLXCDriverPtr driver, - virDomainObjPtr vm, - virDomainDeviceDefPtr dev) +lxcDomainDetachDeviceHostdevCapsLive(virDomainObjPtr vm, + virDomainDeviceDefPtr dev) { switch (dev->data.hostdev->source.caps.type) { case VIR_DOMAIN_HOSTDEV_CAPS_TYPE_STORAGE: - return lxcDomainDetachDeviceHostdevStorageLive(driver, vm, dev); + return lxcDomainDetachDeviceHostdevStorageLive(vm, dev); case VIR_DOMAIN_HOSTDEV_CAPS_TYPE_MISC: - return lxcDomainDetachDeviceHostdevMiscLive(driver, vm, dev); + return lxcDomainDetachDeviceHostdevMiscLive(vm, dev); default: virReportError(VIR_ERR_CONFIG_UNSUPPORTED, @@ -4239,7 +4081,7 @@ lxcDomainDetachDeviceHostdevLive(virLXCDriverPtr driver, return lxcDomainDetachDeviceHostdevSubsysLive(driver, vm, dev); case VIR_DOMAIN_HOSTDEV_MODE_CAPABILITIES: - return lxcDomainDetachDeviceHostdevCapsLive(driver, vm, dev); + return lxcDomainDetachDeviceHostdevCapsLive(vm, dev); default: virReportError(VIR_ERR_CONFIG_UNSUPPORTED, @@ -4259,7 +4101,7 @@ lxcDomainDetachDeviceLive(virLXCDriverPtr driver, switch (dev->type) { case VIR_DOMAIN_DEVICE_DISK: - ret = lxcDomainDetachDeviceDiskLive(driver, vm, dev); + ret = lxcDomainDetachDeviceDiskLive(vm, dev); break; case VIR_DOMAIN_DEVICE_NET: diff 
--git a/src/lxc/lxc_process.c b/src/lxc/lxc_process.c index f311f63..193dd9a 100644 --- a/src/lxc/lxc_process.c +++ b/src/lxc/lxc_process.c @@ -29,6 +29,7 @@ #include "lxc_process.h" #include "lxc_domain.h" #include "lxc_container.h" +#include "lxc_cgroup.h" #include "lxc_fuse.h" #include "datatypes.h" #include "virfile.h" @@ -219,7 +220,6 @@ static void virLXCProcessCleanup(virLXCDriverPtr driver, virDomainObjPtr vm, virDomainShutoffReason reason) { - virCgroupPtr cgroup; int i; virLXCDomainObjPrivatePtr priv = vm->privateData; virNetDevVPortProfilePtr vport = NULL; @@ -277,10 +277,9 @@ static void virLXCProcessCleanup(virLXCDriverPtr driver, virDomainConfVMNWFilterTeardown(vm); - if (driver->cgroup && - virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0) == 0) { - virCgroupRemove(cgroup); - virCgroupFree(&cgroup); + if (priv->cgroup) { + virCgroupRemove(priv->cgroup); + virCgroupFree(&priv->cgroup); } /* now that we know it's stopped call the hook if present */ @@ -742,8 +741,8 @@ int virLXCProcessStop(virLXCDriverPtr driver, virDomainObjPtr vm, virDomainShutoffReason reason) { - virCgroupPtr group = NULL; int rc; + virLXCDomainObjPrivatePtr priv; VIR_DEBUG("Stopping VM name=%s pid=%d reason=%d", vm->def->name, (int)vm->pid, (int)reason); @@ -752,6 +751,8 @@ int virLXCProcessStop(virLXCDriverPtr driver, return 0; } + priv = vm->privateData; + if (vm->pid <= 0) { virReportError(VIR_ERR_INTERNAL_ERROR, _("Invalid PID %d for container"), vm->pid); @@ -769,8 +770,8 @@ int virLXCProcessStop(virLXCDriverPtr driver, VIR_FREE(vm->def->seclabels[0]->imagelabel); } - if (virCgroupForDomain(driver->cgroup, vm->def->name, &group, 0) == 0) { - rc = virCgroupKillPainfully(group); + if (priv->cgroup) { + rc = virCgroupKillPainfully(priv->cgroup); if (rc < 0) { virReportSystemError(-rc, "%s", _("Failed to kill container PIDs")); @@ -794,7 +795,6 @@ int virLXCProcessStop(virLXCDriverPtr driver, rc = 0; cleanup: - virCgroupFree(&group); return rc; } @@ -1047,26 +1047,28 
@@ int virLXCProcessStart(virConnectPtr conn, virLXCDomainObjPrivatePtr priv = vm->privateData; virErrorPtr err = NULL; - if (!lxc_driver->cgroup) { - virReportError(VIR_ERR_INTERNAL_ERROR, "%s", - _("The 'cpuacct', 'devices' & 'memory' cgroups controllers must be mounted")); + virCgroupFree(&priv->cgroup); + + if (!(priv->cgroup = virLXCCgroupCreate(vm->def))) return -1; - } - if (!virCgroupHasController(lxc_driver->cgroup, - VIR_CGROUP_CONTROLLER_CPUACCT)) { + if (!virCgroupHasController(priv->cgroup, + VIR_CGROUP_CONTROLLER_CPUACCT)) { + virCgroupFree(&priv->cgroup); virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("Unable to find 'cpuacct' cgroups controller mount")); return -1; } - if (!virCgroupHasController(lxc_driver->cgroup, + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_DEVICES)) { + virCgroupFree(&priv->cgroup); virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("Unable to find 'devices' cgroups controller mount")); return -1; } - if (!virCgroupHasController(lxc_driver->cgroup, + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_MEMORY)) { + virCgroupFree(&priv->cgroup); virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("Unable to find 'memory' cgroups controller mount")); return -1; @@ -1462,6 +1464,9 @@ virLXCProcessReconnectDomain(virDomainObjPtr vm, if (!(priv->monitor = virLXCProcessConnectMonitor(driver, vm))) goto error; + if (!(priv->cgroup = virLXCCgroupCreate(vm->def))) + goto error; + if (virLXCUpdateActiveUsbHostdevs(driver, vm->def) < 0) goto error; -- 1.8.1.4
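For readers skimming the diff, the pattern it moves the LXC driver towards can be sketched in isolation: the cgroup handle is created once when the container starts, kept in the domain's private data, reused by later operations, and released once at cleanup. This is only an illustrative sketch. The demoCgroup and demoDomainPriv types, the demo* functions and the "/system/" prefix are hypothetical stand-ins, not libvirt's virCgroup API.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for virCgroupPtr and the LXC private data. */
struct demoCgroup {
    char *path;
};

struct demoDomainPriv {
    struct demoCgroup *cgroup;   /* NULL while the domain is not running */
};

/* Free the handle and reset the caller's pointer, so a repeat call is a no-op. */
void demoCgroupFree(struct demoCgroup **group)
{
    if (!group || !*group)
        return;
    free((*group)->path);
    free(*group);
    *group = NULL;
}

/* Create the handle once at start-up; any stale handle is dropped first. */
int demoDomainStart(struct demoDomainPriv *priv, const char *name)
{
    static const char prefix[] = "/system/";   /* illustrative location only */
    struct demoCgroup *group;

    demoCgroupFree(&priv->cgroup);

    if (!(group = calloc(1, sizeof(*group))))
        return -1;
    /* sizeof(prefix) already counts the trailing NUL byte. */
    if (!(group->path = malloc(sizeof(prefix) + strlen(name)))) {
        free(group);
        return -1;
    }
    strcpy(group->path, prefix);
    strcat(group->path, name);

    priv->cgroup = group;
    return 0;
}

/* Later operations reuse the cached handle instead of looking it up again. */
const char *demoDomainCgroupPath(const struct demoDomainPriv *priv)
{
    return priv->cgroup ? priv->cgroup->path : NULL;
}

/* Tear-down releases the handle exactly once. */
void demoDomainCleanup(struct demoDomainPriv *priv)
{
    demoCgroupFree(&priv->cgroup);
}
```

Because the free helper takes a pointer to the handle and resets it to NULL, callers such as the cleanup path do not need a separate "if (group)" check before freeing, which matches the checks the diff deletes.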

From: "Daniel P. Berrange" <berrange@redhat.com> The definitions of the cgroup structs are kept in vircgroup.c since they are intended to be private from users of the API. To enable effective testing, however, they need to be accessible. To address the latter issue without compromising the former, this introduces a new vircgrouppriv.h file to hold the struct definitions. To prevent other files from including this private header, it requires that __VIR_CGROUP_ALLOW_INCLUDE_PRIV_H__ be defined before inclusion. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> --- src/Makefile.am | 2 +- src/util/vircgroup.c | 17 +++-------------- src/util/vircgrouppriv.h | 46 ++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 50 insertions(+), 15 deletions(-) create mode 100644 src/util/vircgrouppriv.h diff --git a/src/Makefile.am b/src/Makefile.am index 78b4ab6..412adae 100644 --- a/src/Makefile.am +++ b/src/Makefile.am @@ -69,7 +69,7 @@ UTIL_SOURCES = \ util/virauthconfig.c util/virauthconfig.h \ util/virbitmap.c util/virbitmap.h \ util/virbuffer.c util/virbuffer.h \ - util/vircgroup.c util/vircgroup.h \ + util/vircgroup.c util/vircgroup.h util/vircgrouppriv.h \ util/vircommand.c util/vircommand.h \ util/virconf.c util/virconf.h \ util/virdbus.c util/virdbus.h \ diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c index dc2b431..dfa3c8a 100644 --- a/src/util/vircgroup.c +++ b/src/util/vircgroup.c @@ -37,10 +37,11 @@ #include <libgen.h> #include <dirent.h> -#include "internal.h" +#define __VIR_CGROUP_ALLOW_INCLUDE_PRIV_H__ +#include "vircgrouppriv.h" + #include "virutil.h" #include "viralloc.h" -#include "vircgroup.h" #include "virlog.h" #include "virfile.h" #include "virhash.h" @@ -52,18 +53,6 @@ VIR_ENUM_IMPL(virCgroupController, VIR_CGROUP_CONTROLLER_LAST, "cpu", "cpuacct", "cpuset", "memory", "devices", "freezer", "blkio"); -struct virCgroupController { - int type; - char *mountPoint; - char *placement; -}; - -struct virCgroup { - char *path; - - struct
virCgroupController controllers[VIR_CGROUP_CONTROLLER_LAST]; -}; - typedef enum { VIR_CGROUP_NONE = 0, /* create subdir under each cgroup if possible. */ VIR_CGROUP_MEM_HIERACHY = 1 << 0, /* call virCgroupSetMemoryUseHierarchy diff --git a/src/util/vircgrouppriv.h b/src/util/vircgrouppriv.h new file mode 100644 index 0000000..cc8cc0b --- /dev/null +++ b/src/util/vircgrouppriv.h @@ -0,0 +1,46 @@ +/* + * vircgrouppriv.h: methods for managing control cgroups + * + * Copyright (C) 2011-2013 Red Hat, Inc. + * Copyright IBM Corp. 2008 + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library. If not, see + * <http://www.gnu.org/licenses/>. + * + * Authors: + * Dan Smith <danms@us.ibm.com> + */ + +#ifndef __VIR_CGROUP_ALLOW_INCLUDE_PRIV_H__ +# error "vircgrouppriv.h may only be included by vircgroup.c or its test suite" +#endif + +#ifndef __VIR_CGROUP_PRIV_H__ +# define __VIR_CGROUP_PRIV_H__ + +# include "vircgroup.h" + +struct virCgroupController { + int type; + char *mountPoint; + char *placement; +}; + +struct virCgroup { + char *path; + + struct virCgroupController controllers[VIR_CGROUP_CONTROLLER_LAST]; +}; + +#endif /* __VIR_CGROUP_PRIV_H__ */ -- 1.8.1.4
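The opt-in guard used by vircgrouppriv.h can be tried out in a single translation unit. The sketch below inlines what would normally be two files, with comments marking the boundary. The DEMO_* macros, the demoController struct and the file names in the comments are hypothetical and chosen only for illustration.

```c
#include <stddef.h>

/* ---- would live in the including .c file (or its test suite) ---- */
#define DEMO_ALLOW_INCLUDE_PRIV_H

/* ---- would live in a hypothetical demopriv.h ---- */
#ifndef DEMO_ALLOW_INCLUDE_PRIV_H
# error "demopriv.h may only be included by demo.c or its test suite"
#endif

#ifndef DEMO_PRIV_H
# define DEMO_PRIV_H

/* Struct layout that ordinary API users are not meant to see. */
struct demoController {
    int type;
    const char *mountPoint;
};

#endif /* DEMO_PRIV_H */

/* ---- back in the including file: the fields are now reachable ---- */
const char *demoControllerMountPoint(const struct demoController *ctrl)
{
    return ctrl ? ctrl->mountPoint : NULL;
}
```

Deleting the first #define makes the compiler stop at the #error line, which is the whole point: a file has to opt in explicitly, so an accidental include of the private header is caught at build time rather than in review.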

From: "Daniel P. Berrange" <berrange@redhat.com> Rename all the virCgroupForXXX methods to use the form virCgroupNewXXX since they are all constructors. Also make sure the output parameter is the last one in the list, and annotate all pointers as non-null. Fix up all callers, and make sure they use true/false not 0/1 for the boolean parameters Signed-off-by: Daniel P. Berrange <berrange@redhat.com> --- src/libvirt_private.syms | 10 +++--- src/lxc/lxc_cgroup.c | 6 ++-- src/qemu/qemu_cgroup.c | 14 ++++---- src/qemu/qemu_driver.c | 18 +++++------ src/util/vircgroup.c | 84 ++++++++++++++++++++++-------------------------- src/util/vircgroup.h | 33 +++++++++++-------- 6 files changed, 82 insertions(+), 83 deletions(-) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 4db0734..1fea9a2 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1095,11 +1095,6 @@ virCgroupDenyAllDevices; virCgroupDenyDevice; virCgroupDenyDeviceMajor; virCgroupDenyDevicePath; -virCgroupForDomain; -virCgroupForDriver; -virCgroupForEmulator; -virCgroupForSelf; -virCgroupForVcpu; virCgroupFree; virCgroupGetBlkioWeight; virCgroupGetCpuacctPercpuUsage; @@ -1121,6 +1116,11 @@ virCgroupKill; virCgroupKillPainfully; virCgroupKillRecursive; virCgroupMoveTask; +virCgroupNewDomain; +virCgroupNewDriver; +virCgroupNewEmulator; +virCgroupNewSelf; +virCgroupNewVcpu; virCgroupPathOfController; virCgroupRemove; virCgroupRemoveRecursively; diff --git a/src/lxc/lxc_cgroup.c b/src/lxc/lxc_cgroup.c index 1bad9ec..7d1432b 100644 --- a/src/lxc/lxc_cgroup.c +++ b/src/lxc/lxc_cgroup.c @@ -293,7 +293,7 @@ int virLXCCgroupGetMeminfo(virLXCMeminfoPtr meminfo) int ret; virCgroupPtr cgroup; - ret = virCgroupForSelf(&cgroup); + ret = virCgroupNewSelf(&cgroup); if (ret < 0) { virReportSystemError(-ret, "%s", _("Unable to get cgroup for container")); @@ -529,14 +529,14 @@ virCgroupPtr virLXCCgroupCreate(virDomainDefPtr def) virCgroupPtr cgroup = NULL; int rc; - rc = 
virCgroupForDriver("lxc", &driver, 1, 0, -1); + rc = virCgroupNewDriver("lxc", true, false, -1, &driver); if (rc != 0) { virReportSystemError(-rc, "%s", _("Unable to get cgroup for driver")); goto cleanup; } - rc = virCgroupForDomain(driver, def->name, &cgroup, 1); + rc = virCgroupNewDomain(driver, def->name, true, &cgroup); if (rc != 0) { virReportSystemError(-rc, _("Unable to create cgroup for domain %s"), diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c index 019aa2e..cb53acb 100644 --- a/src/qemu/qemu_cgroup.c +++ b/src/qemu/qemu_cgroup.c @@ -197,9 +197,11 @@ int qemuInitCgroup(virQEMUDriverPtr driver, virCgroupFree(&priv->cgroup); - rc = virCgroupForDriver("qemu", &driverGroup, - cfg->privileged, true, - cfg->cgroupControllers); + rc = virCgroupNewDriver("qemu", + cfg->privileged, + true, + cfg->cgroupControllers, + &driverGroup); if (rc != 0) { if (rc == -ENXIO || rc == -EPERM || @@ -214,7 +216,7 @@ int qemuInitCgroup(virQEMUDriverPtr driver, goto cleanup; } - rc = virCgroupForDomain(driverGroup, vm->def->name, &priv->cgroup, 1); + rc = virCgroupNewDomain(driverGroup, vm->def->name, true, &priv->cgroup); if (rc != 0) { virReportSystemError(-rc, _("Unable to create cgroup for %s"), @@ -610,7 +612,7 @@ int qemuSetupCgroupForVcpu(virDomainObjPtr vm) } for (i = 0; i < priv->nvcpupids; i++) { - rc = virCgroupForVcpu(priv->cgroup, i, &cgroup_vcpu, 1); + rc = virCgroupNewVcpu(priv->cgroup, i, true, &cgroup_vcpu); if (rc < 0) { virReportSystemError(-rc, _("Unable to create vcpu cgroup for %s(vcpu:" @@ -688,7 +690,7 @@ int qemuSetupCgroupForEmulator(virQEMUDriverPtr driver, if (priv->cgroup == NULL) return 0; /* Not supported, so claim success */ - rc = virCgroupForEmulator(priv->cgroup, &cgroup_emulator, 1); + rc = virCgroupNewEmulator(priv->cgroup, true, &cgroup_emulator); if (rc < 0) { virReportSystemError(-rc, _("Unable to create emulator cgroup for %s"), diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index ab6b74d..d9124e0 100644 --- 
a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -3600,7 +3600,7 @@ static int qemuDomainHotplugVcpus(virQEMUDriverPtr driver, if (priv->cgroup) { int rv = -1; /* Create cgroup for the onlined vcpu */ - rv = virCgroupForVcpu(priv->cgroup, i, &cgroup_vcpu, 1); + rv = virCgroupNewVcpu(priv->cgroup, i, true, &cgroup_vcpu); if (rv < 0) { virReportSystemError(-rv, _("Unable to create vcpu cgroup for %s(vcpu:" @@ -3674,7 +3674,7 @@ static int qemuDomainHotplugVcpus(virQEMUDriverPtr driver, if (priv->cgroup) { int rv = -1; - rv = virCgroupForVcpu(priv->cgroup, i, &cgroup_vcpu, 0); + rv = virCgroupNewVcpu(priv->cgroup, i, false, &cgroup_vcpu); if (rv < 0) { virReportSystemError(-rv, _("Unable to access vcpu cgroup for %s(vcpu:" @@ -3913,7 +3913,7 @@ qemuDomainPinVcpuFlags(virDomainPtr dom, /* Configure the corresponding cpuset cgroup before set affinity. */ if (virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) { - if (virCgroupForVcpu(priv->cgroup, vcpu, &cgroup_vcpu, 0) == 0 && + if (virCgroupNewVcpu(priv->cgroup, vcpu, false, &cgroup_vcpu) == 0 && qemuSetupCgroupVcpuPin(cgroup_vcpu, newVcpuPin, newVcpuPinNum, vcpu) < 0) { virReportError(VIR_ERR_OPERATION_INVALID, _("failed to set cpuset.cpus in cgroup" @@ -4169,7 +4169,7 @@ qemuDomainPinEmulator(virDomainPtr dom, * Configure the corresponding cpuset cgroup. * If no cgroup for domain or hypervisor exists, do nothing. 
*/ - if (virCgroupForEmulator(priv->cgroup, &cgroup_emulator, 0) == 0) { + if (virCgroupNewEmulator(priv->cgroup, false, &cgroup_emulator) == 0) { if (qemuSetupCgroupEmulatorPin(cgroup_emulator, newVcpuPin[0]->cpumask) < 0) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", @@ -7742,7 +7742,7 @@ qemuSetVcpusBWLive(virDomainObjPtr vm, virCgroupPtr cgroup, */ if (priv->nvcpupids != 0 && priv->vcpupids[0] != vm->pid) { for (i = 0; i < priv->nvcpupids; i++) { - rc = virCgroupForVcpu(cgroup, i, &cgroup_vcpu, 0); + rc = virCgroupNewVcpu(cgroup, i, false, &cgroup_vcpu); if (rc < 0) { virReportSystemError(-rc, _("Unable to find vcpu cgroup for %s(vcpu:" @@ -7780,7 +7780,7 @@ qemuSetEmulatorBandwidthLive(virDomainObjPtr vm, virCgroupPtr cgroup, return 0; } - rc = virCgroupForEmulator(cgroup, &cgroup_emulator, 0); + rc = virCgroupNewEmulator(cgroup, false, &cgroup_emulator); if (rc < 0) { virReportSystemError(-rc, _("Unable to find emulator cgroup for %s"), @@ -8033,7 +8033,7 @@ qemuGetVcpusBWLive(virDomainObjPtr vm, } /* get period and quota for vcpu0 */ - rc = virCgroupForVcpu(priv->cgroup, 0, &cgroup_vcpu, 0); + rc = virCgroupNewVcpu(priv->cgroup, 0, false, &cgroup_vcpu); if (!cgroup_vcpu) { virReportSystemError(-rc, _("Unable to find vcpu cgroup for %s(vcpu: 0)"), @@ -8071,7 +8071,7 @@ qemuGetEmulatorBandwidthLive(virDomainObjPtr vm, virCgroupPtr cgroup, } /* get period and quota for emulator */ - rc = virCgroupForEmulator(cgroup, &cgroup_emulator, 0); + rc = virCgroupNewEmulator(cgroup, false, &cgroup_emulator); if (!cgroup_emulator) { virReportSystemError(-rc, _("Unable to find emulator cgroup for %s"), @@ -14336,7 +14336,7 @@ getSumVcpuPercpuStats(virDomainObjPtr vm, unsigned long long tmp; int j; - if (virCgroupForVcpu(priv->cgroup, i, &group_vcpu, 0) < 0) { + if (virCgroupNewVcpu(priv->cgroup, i, false, &group_vcpu) < 0) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("error accessing cgroup cpuacct for vcpu")); goto cleanup; diff --git a/src/util/vircgroup.c 
b/src/util/vircgroup.c index dfa3c8a..2f52c92 100644 --- a/src/util/vircgroup.c +++ b/src/util/vircgroup.c @@ -927,7 +927,7 @@ cleanup: } /** - * virCgroupForDriver: + * virCgroupNewDriver: * * @name: name of this driver (e.g., xen, qemu, lxc) * @group: Pointer to returned virCgroupPtr @@ -935,11 +935,11 @@ cleanup: * Returns 0 on success */ #if defined HAVE_MNTENT_H && defined HAVE_GETMNTENT_R -int virCgroupForDriver(const char *name, - virCgroupPtr *group, +int virCgroupNewDriver(const char *name, bool privileged, bool create, - int controllers) + int controllers, + virCgroupPtr *group) { int rc; char *path = NULL; @@ -970,10 +970,11 @@ out: return rc; } #else -int virCgroupForDriver(const char *name ATTRIBUTE_UNUSED, - virCgroupPtr *group ATTRIBUTE_UNUSED, +int virCgroupNewDriver(const char *name ATTRIBUTE_UNUSED, bool privileged ATTRIBUTE_UNUSED, - bool create ATTRIBUTE_UNUSED) + bool create ATTRIBUTE_UNUSED, + int controllers ATTRIBUTE_UNUSED, + virCgroupPtr *group ATTRIBUTE_UNUSED) { /* Claim no support */ return -ENXIO; @@ -981,7 +982,7 @@ int virCgroupForDriver(const char *name ATTRIBUTE_UNUSED, #endif /** -* virCgroupForSelf: +* virCgroupNewSelf: * * @group: Pointer to returned virCgroupPtr * @@ -991,19 +992,19 @@ int virCgroupForDriver(const char *name ATTRIBUTE_UNUSED, * Returns 0 on success */ #if defined HAVE_MNTENT_H && defined HAVE_GETMNTENT_R -int virCgroupForSelf(virCgroupPtr *group) +int virCgroupNewSelf(virCgroupPtr *group) { return virCgroupNew("/", -1, group); } #else -int virCgroupForSelf(virCgroupPtr *group ATTRIBUTE_UNUSED) +int virCgroupNewSelf(virCgroupPtr *group ATTRIBUTE_UNUSED) { return -ENXIO; } #endif /** - * virCgroupForDomain: + * virCgroupNewDomain: * * @driver: group for driver owning the domain * @name: name of the domain @@ -1012,17 +1013,14 @@ int virCgroupForSelf(virCgroupPtr *group ATTRIBUTE_UNUSED) * Returns 0 on success */ #if defined HAVE_MNTENT_H && defined HAVE_GETMNTENT_R -int virCgroupForDomain(virCgroupPtr driver, 
+int virCgroupNewDomain(virCgroupPtr driver, const char *name, - virCgroupPtr *group, - bool create) + bool create, + virCgroupPtr *group) { int rc; char *path; - if (driver == NULL) - return -EINVAL; - if (virAsprintf(&path, "%s/%s", driver->path, name) < 0) return -ENOMEM; @@ -1048,38 +1046,35 @@ int virCgroupForDomain(virCgroupPtr driver, return rc; } #else -int virCgroupForDomain(virCgroupPtr driver ATTRIBUTE_UNUSED, +int virCgroupNewDomain(virCgroupPtr driver ATTRIBUTE_UNUSED, const char *name ATTRIBUTE_UNUSED, - virCgroupPtr *group ATTRIBUTE_UNUSED, - bool create ATTRIBUTE_UNUSED) + bool create ATTRIBUTE_UNUSED, + virCgroupPtr *group ATTRIBUTE_UNUSED) { return -ENXIO; } #endif /** - * virCgroupForVcpu: + * virCgroupNewVcpu: * - * @driver: group for the domain + * @domain: group for the domain * @vcpuid: id of the vcpu * @group: Pointer to returned virCgroupPtr * * Returns 0 on success */ #if defined HAVE_MNTENT_H && defined HAVE_GETMNTENT_R -int virCgroupForVcpu(virCgroupPtr driver, +int virCgroupNewVcpu(virCgroupPtr domain, int vcpuid, - virCgroupPtr *group, - bool create) + bool create, + virCgroupPtr *group) { int rc; char *path; int controllers; - if (driver == NULL) - return -EINVAL; - - if (virAsprintf(&path, "%s/vcpu%d", driver->path, vcpuid) < 0) + if (virAsprintf(&path, "%s/vcpu%d", domain->path, vcpuid) < 0) return -ENOMEM; controllers = ((1 << VIR_CGROUP_CONTROLLER_CPU) | @@ -1090,7 +1085,7 @@ int virCgroupForVcpu(virCgroupPtr driver, VIR_FREE(path); if (rc == 0) { - rc = virCgroupMakeGroup(driver, *group, create, VIR_CGROUP_NONE); + rc = virCgroupMakeGroup(domain, *group, create, VIR_CGROUP_NONE); if (rc != 0) virCgroupFree(group); } @@ -1098,36 +1093,33 @@ int virCgroupForVcpu(virCgroupPtr driver, return rc; } #else -int virCgroupForVcpu(virCgroupPtr driver ATTRIBUTE_UNUSED, +int virCgroupNewVcpu(virCgroupPtr domain ATTRIBUTE_UNUSED, int vcpuid ATTRIBUTE_UNUSED, - virCgroupPtr *group ATTRIBUTE_UNUSED, - bool create ATTRIBUTE_UNUSED) + bool create 
ATTRIBUTE_UNUSED, + virCgroupPtr *group ATTRIBUTE_UNUSED) { return -ENXIO; } #endif /** - * virCgroupForEmulator: + * virCgroupNewEmulator: * - * @driver: group for the domain + * @domain: group for the domain * @group: Pointer to returned virCgroupPtr * * Returns: 0 on success or -errno on failure */ #if defined HAVE_MNTENT_H && defined HAVE_GETMNTENT_R -int virCgroupForEmulator(virCgroupPtr driver, - virCgroupPtr *group, - bool create) +int virCgroupNewEmulator(virCgroupPtr domain, + bool create, + virCgroupPtr *group) { int rc; char *path; int controllers; - if (driver == NULL) - return -EINVAL; - - if (virAsprintf(&path, "%s/emulator", driver->path) < 0) + if (virAsprintf(&path, "%s/emulator", domain->path) < 0) return -ENOMEM; controllers = ((1 << VIR_CGROUP_CONTROLLER_CPU) | @@ -1138,7 +1130,7 @@ int virCgroupForEmulator(virCgroupPtr driver, VIR_FREE(path); if (rc == 0) { - rc = virCgroupMakeGroup(driver, *group, create, VIR_CGROUP_NONE); + rc = virCgroupMakeGroup(domain, *group, create, VIR_CGROUP_NONE); if (rc != 0) virCgroupFree(group); } @@ -1146,9 +1138,9 @@ int virCgroupForEmulator(virCgroupPtr driver, return rc; } #else -int virCgroupForEmulator(virCgroupPtr driver ATTRIBUTE_UNUSED, - virCgroupPtr *group ATTRIBUTE_UNUSED, - bool create ATTRIBUTE_UNUSED) +int virCgroupNewEmulator(virCgroupPtr domain ATTRIBUTE_UNUSED, + bool create ATTRIBUTE_UNUSED, + virCgroupPtr *group ATTRIBUTE_UNUSED) { return -ENXIO; } diff --git a/src/util/vircgroup.h b/src/util/vircgroup.h index 4c1134d..91143e2 100644 --- a/src/util/vircgroup.h +++ b/src/util/vircgroup.h @@ -44,27 +44,32 @@ enum { VIR_ENUM_DECL(virCgroupController); -int virCgroupForDriver(const char *name, - virCgroupPtr *group, +int virCgroupNewDriver(const char *name, bool privileged, bool create, - int controllers); + int controllers, + virCgroupPtr *group) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(5); -int virCgroupForSelf(virCgroupPtr *group); +int virCgroupNewSelf(virCgroupPtr *group) + 
ATTRIBUTE_NONNULL(1); -int virCgroupForDomain(virCgroupPtr driver, +int virCgroupNewDomain(virCgroupPtr driver, const char *name, - virCgroupPtr *group, - bool create); + bool create, + virCgroupPtr *group) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(4); -int virCgroupForVcpu(virCgroupPtr driver, +int virCgroupNewVcpu(virCgroupPtr domain, int vcpuid, - virCgroupPtr *group, - bool create); - -int virCgroupForEmulator(virCgroupPtr driver, - virCgroupPtr *group, - bool create); + bool create, + virCgroupPtr *group) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(4); + +int virCgroupNewEmulator(virCgroupPtr domain, + bool create, + virCgroupPtr *group) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(3); int virCgroupPathOfController(virCgroupPtr group, int controller, -- 1.8.1.4
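The calling convention this patch settles on (constructor-style name, output parameter last, pointer arguments annotated as non-null, bool for flags, 0 or -errno as the result) can be sketched with a small stand-alone example. Everything named demo* or DEMO_* below is hypothetical. DEMO_NONNULL is assumed to play the role of ATTRIBUTE_NONNULL and expands to the GCC/Clang nonnull attribute where that is available.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumed equivalent of ATTRIBUTE_NONNULL; a no-op on other compilers. */
#if defined(__GNUC__)
# define DEMO_NONNULL(n) __attribute__((__nonnull__(n)))
#else
# define DEMO_NONNULL(n)
#endif

struct demoGroup {
    char *path;
};

/* Inputs first, flags next, output parameter last. */
int demoGroupNewChild(const struct demoGroup *parent,
                      const char *name,
                      bool create,
                      struct demoGroup **group)
    DEMO_NONNULL(1) DEMO_NONNULL(2) DEMO_NONNULL(4);

void demoGroupFree(struct demoGroup **group);

/* Returns 0 on success or -errno on failure; *group is NULL on failure. */
int demoGroupNewChild(const struct demoGroup *parent,
                      const char *name,
                      bool create,
                      struct demoGroup **group)
{
    /* parent path + '/' + name + trailing NUL */
    size_t len = strlen(parent->path) + 1 + strlen(name) + 1;
    struct demoGroup *child;

    *group = NULL;
    (void)create;   /* a real implementation would create directories here */

    if (!(child = malloc(sizeof(*child))))
        return -ENOMEM;
    if (!(child->path = malloc(len))) {
        free(child);
        return -ENOMEM;
    }
    snprintf(child->path, len, "%s/%s", parent->path, name);

    *group = child;
    return 0;
}

/* Free the group and reset the caller's pointer. */
void demoGroupFree(struct demoGroup **group)
{
    if (!group || !*group)
        return;
    free((*group)->path);
    free(*group);
    *group = NULL;
}
```

With the annotation in place the function body no longer needs a run-time "if (parent == NULL) return -EINVAL;" check, mirroring the checks this patch removes, and passing a literal NULL for an annotated argument can be flagged by the compiler when its non-null warnings are enabled.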

From: "Daniel P. Berrange" <berrange@redhat.com> Some aspects of the cgroups setup / detection code are quite subtle and easy to break. It would greatly benefit from unit testing, but this is difficult because the test suite won't have privileges to play around with cgroups. The solution is to use monkey patching via LD_PRELOAD to override the fopen, open, mkdir, access functions to redirect access of cgroups files to some magic stubs in the test suite. Using this we provide custom content for the /proc/cgroup and /proc/self/mounts files which report a fixed cgroup setup. We then override open/mkdir/access so that access to the cgroups filesystem gets redirected into files in a temporary directory tree in the test suite build dir. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> --- .gitignore | 1 + cfg.mk | 11 +- tests/Makefile.am | 15 +- tests/vircgroupmock.c | 453 ++++++++++++++++++++++++++++++++++++++++++++++++++ tests/vircgrouptest.c | 249 +++++++++++++++++++++++++++ 5 files changed, 723 insertions(+), 6 deletions(-) create mode 100644 tests/vircgroupmock.c create mode 100644 tests/vircgrouptest.c diff --git a/.gitignore b/.gitignore index 4b44820..7478e1e 100644 --- a/.gitignore +++ b/.gitignore @@ -178,6 +178,7 @@ /tests/virauthconfigtest /tests/virbitmaptest /tests/virbuftest +/tests/vircgrouptest /tests/virdrivermoduletest /tests/virendiantest /tests/virhashtest diff --git a/cfg.mk b/cfg.mk index 394521e..e60c4e3 100644 --- a/cfg.mk +++ b/cfg.mk @@ -788,15 +788,16 @@ $(srcdir)/src/remote/remote_client_bodies.h: $(srcdir)/src/remote/remote_protoco exclude_file_name_regexp--sc_avoid_strcase = ^tools/virsh\.h$$ _src1=libvirt|fdstream|qemu/qemu_monitor|util/(vircommand|virutil)|xen/xend_internal|rpc/virnetsocket|lxc/lxc_controller|locking/lock_daemon +_test1=shunloadtest|virnettlscontexttest|vircgroupmock exclude_file_name_regexp--sc_avoid_write = \ - ^(src/($(_src1))|daemon/libvirtd|tools/console|tests/(shunload|virnettlscontext)test)\.c$$ + 
^(src/($(_src1))|daemon/libvirtd|tools/console|tests/($(_test1)))\.c$$ exclude_file_name_regexp--sc_bindtextdomain = ^(tests|examples)/ exclude_file_name_regexp--sc_copyright_address = \ ^COPYING\.LIB$$ -exclude_file_name_regexp--sc_flags_usage = ^(docs/|src/util/virnetdevtap\.c$$) +exclude_file_name_regexp--sc_flags_usage = ^(docs/|src/util/virnetdevtap\.c$$|tests/vircgroupmock\.c$$) exclude_file_name_regexp--sc_libvirt_unmarked_diagnostics = \ ^(src/rpc/gendispatch\.pl$$|tests/) @@ -812,10 +813,10 @@ exclude_file_name_regexp--sc_prohibit_always_true_header_tests = \ ^python/(libvirt-(lxc-|qemu-)?override|typewrappers)\.c$$ exclude_file_name_regexp--sc_prohibit_asprintf = \ - ^(bootstrap.conf$$|src/util/virutil\.c$$|examples/domain-events/events-c/event-test\.c$$) + ^(bootstrap.conf$$|src/util/virutil\.c$$|examples/domain-events/events-c/event-test\.c$$|tests/vircgroupmock\.c$$) exclude_file_name_regexp--sc_prohibit_close = \ - (\.p[yl]$$|^docs/|^(src/util/virfile\.c|src/libvirt\.c)$$) + (\.p[yl]$$|^docs/|^(src/util/virfile\.c|src/libvirt\.c|tests/vircgroupmock\.c)$$) exclude_file_name_regexp--sc_prohibit_empty_lines_at_EOF = \ (^tests/(qemuhelp|nodeinfo)data/|\.(gif|ico|png|diff)$$) @@ -836,7 +837,7 @@ exclude_file_name_regexp--sc_prohibit_nonreentrant = \ ^((po|tests)/|docs/.*(py|html\.in)|run.in$$) exclude_file_name_regexp--sc_prohibit_raw_allocation = \ - ^(docs/hacking\.html\.in)|(src/util/viralloc\.[ch]|examples/.*|tests/securityselinuxhelper.c)$$ + ^(docs/hacking\.html\.in)|(src/util/viralloc\.[ch]|examples/.*|tests/securityselinuxhelper\.c|tests/vircgroupmock\.c)$$ exclude_file_name_regexp--sc_prohibit_readlink = \ ^src/(util/virutil|lxc/lxc_container)\.c$$ diff --git a/tests/Makefile.am b/tests/Makefile.am index 3abd698..898d6ed 100644 --- a/tests/Makefile.am +++ b/tests/Makefile.am @@ -97,7 +97,9 @@ test_programs = virshtest sockettest \ utiltest shunloadtest \ virtimetest viruritest virkeyfiletest \ virauthconfigtest \ - virbitmaptest virendiantest \ + 
virbitmaptest \ + vircgrouptest \ + virendiantest \ viridentitytest \ virlockspacetest \ virstringtest \ @@ -246,6 +248,7 @@ EXTRA_DIST += $(test_scripts) test_libraries = libshunload.la \ libvirportallocatormock.la \ + vircgroupmock.la \ $(NULL) if WITH_QEMU test_libraries += libqemumonitortestutils.la @@ -587,6 +590,16 @@ libvirportallocatormock_la_CFLAGS = $(AM_CFLAGS) -DMOCK_HELPER=1 libvirportallocatormock_la_LDFLAGS = -module -avoid-version \ -rpath /evil/libtool/hack/to/force/shared/lib/creation +vircgrouptest_SOURCES = \ + vircgrouptest.c testutils.h testutils.c +vircgrouptest_LDADD = $(LDADDS) + +vircgroupmock_la_SOURCES = \ + vircgroupmock.c +vircgroupmock_la_CFLAGS = $(AM_CFLAGS) -DMOCK_HELPER=1 +vircgroupmock_la_LDFLAGS = -module -avoid-version \ + -rpath /evil/libtool/hack/to/force/shared/lib/creation + viruritest_SOURCES = \ viruritest.c testutils.h testutils.c diff --git a/tests/vircgroupmock.c b/tests/vircgroupmock.c new file mode 100644 index 0000000..e50f7e0 --- /dev/null +++ b/tests/vircgroupmock.c @@ -0,0 +1,453 @@ +/* + * Copyright (C) 2013 Red Hat, Inc. + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library. If not, see + * <http://www.gnu.org/licenses/>. + * + * Author: Daniel P. 
Berrange <berrange@redhat.com> + */ + +#include <config.h> + +#include "internal.h" + +#include <stdio.h> +#include <dlfcn.h> +#include <stdlib.h> +#include <unistd.h> +#include <fcntl.h> +#include <sys/stat.h> + +static int (*realopen)(const char *path, int flags, ...); +static FILE *(*realfopen)(const char *path, const char *mode); +static int (*realaccess)(const char *path, int mode); +static int (*realmkdir)(const char *path, mode_t mode); +static char *fakesysfsdir; + + +#define SYSFS_PREFIX "/not/really/sys/fs/cgroup/" + +/* + * The plan: + * + * We fake out /proc/mounts, to make it look as if cgroups + * are mounted on /not/really/sys/fs/cgroup. We don't + * use /sys/fs/cgroup, because we want to make it easy to + * detect places where we've not mocked enough syscalls. + * + * In any open/access/mkdir calls we look at path and if + * it starts with /not/really/sys/fs/cgroup, we rewrite + * the path to point at a temporary directory referred + * to by LIBVIRT_FAKE_SYSFS_DIR env variable that is + * set by the main test suite + * + * In mkdir() calls, we simulate the cgroups behaviour + * whereby creating the directory auto-creates a bunch + * of files beneath it + */ + +/* + * Intentionally missing the 'devices' mount.
+ * Co-mounting cpu & cpuacct controllers + * An anonymous controller for systemd + */ +const char *mounts = + "rootfs / rootfs rw 0 0\n" + "tmpfs /run tmpfs rw,seclabel,nosuid,nodev,mode=755 0 0\n" + "tmpfs /not/really/sys/fs/cgroup tmpfs rw,seclabel,nosuid,nodev,noexec,mode=755 0 0\n" + "cgroup /not/really/sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd 0 0\n" + "cgroup /not/really/sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0\n" + "cgroup /not/really/sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpuacct,cpu 0 0\n" + "cgroup /not/really/sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0\n" + "cgroup /not/really/sys/fs/cgroup/blkio cgroup rw,nosuid,nodev,noexec,relatime,blkio 0 0\n" + "cgroup /not/really/sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0\n" + "/dev/sda1 /boot ext4 rw,seclabel,relatime,data=ordered 0 0\n" + "tmpfs /tmp tmpfs rw,seclabel,relatime,size=1024000k 0 0\n"; + +const char *cgroups = + "115:memory:/\n" + "8:blkio:/\n" + "6:freezer:/\n" + "3:cpuacct,cpu:/system\n" + "2:cpuset:/\n" + "1:name=systemd:/user/berrange/123\n"; + +static int make_file(const char *path, + const char *name, + const char *value) +{ + int fd = -1; + int ret = -1; + char *filepath = NULL; + + if (asprintf(&filepath, "%s/%s", path, name) < 0) + return -1; + + if ((fd = open(filepath, O_CREAT|O_WRONLY, 0600)) < 0) + goto cleanup; + + if (write(fd, value, strlen(value)) != strlen(value)) + goto cleanup; + + ret = 0; +cleanup: + if (fd != -1 &&close(fd) < 0) + ret = -1; + free(filepath); + + return ret; +} + +static int make_controller(const char *path, mode_t mode) +{ + int ret = -1; + const char *controller; + + if (!STRPREFIX(path, fakesysfsdir)) { + errno = EINVAL; + return -1; + } + controller = path + strlen(fakesysfsdir) + 1; + + if (STREQ(controller, "cpu")) { + if (symlink("cpu,cpuacct", path) < 0) + 
return -1; + return -0; + } + if (STREQ(controller, "cpuacct")) { + if (symlink("cpu,cpuacct", path) < 0) + return -1; + return 0; + } + + if (realmkdir(path, mode) < 0) + goto cleanup; + +#define MAKE_FILE(name, value) \ + do { \ + if (make_file(path, name, value) < 0) \ + goto cleanup; \ + } while (0) + + if (STRPREFIX(controller, "cpu,cpuacct")) { + MAKE_FILE("cpu.cfs_period_us", "100000\n"); + MAKE_FILE("cpu.cfs_quota_us", "-1\n"); + MAKE_FILE("cpu.rt_period_us", "1000000\n"); + MAKE_FILE("cpu.rt_runtime_us", "950000\n"); + MAKE_FILE("cpu.shares", "1024\n"); + MAKE_FILE("cpu.stat", + "nr_periods 0\n" + "nr_throttled 0\n" + "throttled_time 0\n"); + MAKE_FILE("cpuacct.stat", + "user 216687025\n" + "system 43421396\n"); + MAKE_FILE("cpuacct.usage", "2787788855799582\n"); + MAKE_FILE("cpuacct.usage_per_cpu", "1413142688153030 1374646168910542\n"); + } else if (STRPREFIX(controller, "cpuset")) { + MAKE_FILE("cpuset.cpu_exclusive", "1\n"); + if (STREQ(controller, "cpuset")) + MAKE_FILE("cpuset.cpus", "0-1"); + else + MAKE_FILE("cpuset.cpus", ""); /* Values don't inherit */ + MAKE_FILE("cpuset.mem_exclusive", "1\n"); + MAKE_FILE("cpuset.mem_hardwall", "0\n"); + MAKE_FILE("cpuset.memory_migrate", "0\n"); + MAKE_FILE("cpuset.memory_pressure", "0\n"); + MAKE_FILE("cpuset.memory_pressure_enabled", "0\n"); + MAKE_FILE("cpuset.memory_spread_page", "0\n"); + MAKE_FILE("cpuset.memory_spread_slab", "0\n"); + if (STREQ(controller, "cpuset")) + MAKE_FILE("cpuset.mems", "0"); + else + MAKE_FILE("cpuset.mems", ""); /* Values don't inherit */ + MAKE_FILE("cpuset.sched_load_balance", "1\n"); + MAKE_FILE("cpuset.sched_relax_domain_level", "-1\n"); + } else if (STRPREFIX(controller, "memory")) { + MAKE_FILE("memory.failcnt", "0\n"); + MAKE_FILE("memory.force_empty", ""); /* Write only */ + MAKE_FILE("memory.kmem.tcp.failcnt", "0\n"); + MAKE_FILE("memory.kmem.tcp.limit_in_bytes", "9223372036854775807\n"); + MAKE_FILE("memory.kmem.tcp.max_usage_in_bytes", "0\n"); + 
MAKE_FILE("memory.kmem.tcp.usage_in_bytes", "16384\n"); + MAKE_FILE("memory.limit_in_bytes", "9223372036854775807\n"); + MAKE_FILE("memory.max_usage_in_bytes", "0\n"); + MAKE_FILE("memory.memsw.failcnt", ""); /* Not supported */ + MAKE_FILE("memory.memsw.limit_in_bytes", ""); /* Not supported */ + MAKE_FILE("memory.memsw.max_usage_in_bytes", ""); /* Not supported */ + MAKE_FILE("memory.memsw.usage_in_bytes", ""); /* Not supported */ + MAKE_FILE("memory.move_charge_at_immigrate", "0\n"); + MAKE_FILE("memory.numa_stat", + "total=367664 N0=367664\n" + "file=314764 N0=314764\n" + "anon=51999 N0=51999\n" + "unevictable=901 N0=901\n"); + MAKE_FILE("memory.oom_control", + "oom_kill_disable 0\n" + "under_oom 0\n"); + MAKE_FILE("memory.soft_limit_in_bytes", "9223372036854775807\n"); + MAKE_FILE("memory.stat", + "cache 1336619008\n" + "rss 97792000\n" + "mapped_file 42090496\n" + "pgpgin 13022605027\n" + "pgpgout 13023820533\n" + "pgfault 54429417056\n" + "pgmajfault 315715\n" + "inactive_anon 145887232\n" + "active_anon 67100672\n" + "inactive_file 627400704\n" + "active_file 661872640\n" + "unevictable 3690496\n" + "hierarchical_memory_limit 9223372036854775807\n" + "total_cache 1336635392\n" + "total_rss 118689792\n" + "total_mapped_file 42106880\n" + "total_pgpgin 13022606816\n" + "total_pgpgout 13023820793\n" + "total_pgfault 54429422313\n" + "total_pgmajfault 315715\n" + "total_inactive_anon 145891328\n" + "total_active_anon 88010752\n" + "total_inactive_file 627400704\n" + "total_active_file 661872640\n" + "total_unevictable 3690496\n" + "recent_rotated_anon 112807028\n" + "recent_rotated_file 2547948\n" + "recent_scanned_anon 113796164\n" + "recent_scanned_file 8199863\n"); + MAKE_FILE("memory.swappiness", "60\n"); + MAKE_FILE("memory.usage_in_bytes", "1455321088\n"); + MAKE_FILE("memory.use_hierarchy", "0\n"); + } else if (STRPREFIX(controller, "freezer")) { + MAKE_FILE("freezer.state", "THAWED"); + } else if (STRPREFIX(controller, "blkio")) { + 
MAKE_FILE("blkio.io_merged", + "8:0 Read 1100949\n" + "8:0 Write 2248076\n" + "8:0 Sync 63063\n" + "8:0 Async 3285962\n" + "8:0 Total 3349025\n"); + MAKE_FILE("blkio.io_queued", + "8:0 Read 0\n" + "8:0 Write 0\n" + "8:0 Sync 0\n" + "8:0 Async 0\n" + "8:0 Total 0\n"); + MAKE_FILE("blkio.io_service_bytes", + "8:0 Read 59542078464\n" + "8:0 Write 397369182208\n" + "8:0 Sync 234080922624\n" + "8:0 Async 222830338048\n" + "8:0 Total 456911260672\n"); + MAKE_FILE("blkio.io_serviced", + "8:0 Read 3402504\n" + "8:0 Write 14966516\n" + "8:0 Sync 12064031\n" + "8:0 Async 6304989\n" + "8:0 Total 18369020\n"); + MAKE_FILE("blkio.io_service_time", + "8:0 Read 10747537542349\n" + "8:0 Write 9200028590575\n" + "8:0 Sync 6449319855381\n" + "8:0 Async 13498246277543\n" + "8:0 Total 19947566132924\n"); + MAKE_FILE("blkio.io_wait_time", + "8:0 Read 14687514824889\n" + "8:0 Write 357748452187691\n" + "8:0 Sync 55296974349413\n" + "8:0 Async 317138992663167\n" + "8:0 Total 372435967012580\n"); + MAKE_FILE("blkio.reset_stats", ""); /* Write only */ + MAKE_FILE("blkio.sectors", "8:0 892404806\n"); + MAKE_FILE("blkio.throttle.io_service_bytes", + "8:0 Read 59542107136\n" + "8:0 Write 411440480256\n" + "8:0 Sync 248486822912\n" + "8:0 Async 222495764480\n" + "8:0 Total 470982587392\n"); + MAKE_FILE("blkio.throttle.io_serviced", + "8:0 Read 4832583\n" + "8:0 Write 36641903\n" + "8:0 Sync 30723171\n" + "8:0 Async 10751315\n" + "8:0 Total 41474486\n"); + MAKE_FILE("blkio.throttle.read_bps_device", ""); + MAKE_FILE("blkio.throttle.read_iops_device", ""); + MAKE_FILE("blkio.throttle.write_bps_device", ""); + MAKE_FILE("blkio.throttle.write_iops_device", ""); + MAKE_FILE("blkio.time", "8:0 61019089\n"); + MAKE_FILE("blkio.weight", "1000\n"); + MAKE_FILE("blkio.weight_device", ""); + + } else { + errno = EINVAL; + goto cleanup; + } + + ret = 0; +cleanup: + return ret; +} + +static void init_syms(void) +{ + if (realfopen) + return; + +#define LOAD_SYM(name) \ + do { \ + if (!(real ## name = 
dlsym(RTLD_NEXT, #name))) { \ + fprintf(stderr, "Cannot find real '%s' symbol\n", #name); \ + abort(); \ + } \ + } while (0) + + LOAD_SYM(fopen); + LOAD_SYM(access); + LOAD_SYM(mkdir); + LOAD_SYM(open); +} + +static void init_sysfs(void) +{ + if (fakesysfsdir) + return; + + if (!(fakesysfsdir = getenv("LIBVIRT_FAKE_SYSFS_DIR"))) { + fprintf(stderr, "Missing LIBVIRT_FAKE_SYSFS_DIR env variable\n"); + abort(); + } + +#define MAKE_CONTROLLER(subpath) \ + do { \ + char *path; \ + if (asprintf(&path,"%s/%s", fakesysfsdir, subpath) < 0) \ + abort(); \ + if (make_controller(path, 0755) < 0) { \ + fprintf(stderr, "Cannot initialize %s\n", path); \ + abort(); \ + } \ + } while (0) + + MAKE_CONTROLLER("cpu"); + MAKE_CONTROLLER("cpuacct"); + MAKE_CONTROLLER("cpu,cpuacct"); + MAKE_CONTROLLER("cpu,cpuacct/system"); + MAKE_CONTROLLER("cpuset"); + MAKE_CONTROLLER("blkio"); + MAKE_CONTROLLER("memory"); + MAKE_CONTROLLER("freezer"); +} + + +FILE *fopen(const char *path, const char *mode) +{ + init_syms(); + + if (STREQ(path, "/proc/mounts")) { + if (STREQ(mode, "r")) { + return fmemopen((void *)mounts, strlen(mounts), mode); + } else { + errno = EACCES; + return NULL; + } + } + if (STREQ(path, "/proc/self/cgroup")) { + if (STREQ(mode, "r")) { + return fmemopen((void *)cgroups, strlen(cgroups), mode); + } else { + errno = EACCES; + return NULL; + } + } + + return realfopen(path, mode); +} + +int access(const char *path, int mode) +{ + int ret; + + init_syms(); + + if (STRPREFIX(path, SYSFS_PREFIX)) { + init_sysfs(); + char *newpath; + if (asprintf(&newpath, "%s/%s", + fakesysfsdir, + path + strlen(SYSFS_PREFIX)) < 0) { + errno = ENOMEM; + return -1; + } + ret = realaccess(newpath, mode); + free(newpath); + } else { + ret = realaccess(path, mode); + } + return ret; +} + +int mkdir(const char *path, mode_t mode) +{ + int ret; + + init_syms(); + + if (STRPREFIX(path, SYSFS_PREFIX)) { + init_sysfs(); + char *newpath; + if (asprintf(&newpath, "%s/%s", + fakesysfsdir, + path + 
strlen(SYSFS_PREFIX)) < 0) { + errno = ENOMEM; + return -1; + } + ret = make_controller(newpath, mode); + free(newpath); + } else { + ret = realmkdir(path, mode); + } + return ret; +} + +int open(const char *path, int flags, ...) +{ + int ret; + char *newpath = NULL; + + init_syms(); + + if (STRPREFIX(path, SYSFS_PREFIX)) { + init_sysfs(); + if (asprintf(&newpath, "%s/%s", + fakesysfsdir, + path + strlen(SYSFS_PREFIX)) < 0) { + errno = ENOMEM; + return -1; + } + } + if (flags & O_CREAT) { + va_list ap; + mode_t mode; + va_start(ap, flags); + mode = va_arg(ap, mode_t); + va_end(ap); + ret = realopen(newpath ? newpath : path, flags, mode); + } else { + ret = realopen(newpath ? newpath : path, flags); + } + free(newpath); + return ret; +} diff --git a/tests/vircgrouptest.c b/tests/vircgrouptest.c new file mode 100644 index 0000000..a68aa88 --- /dev/null +++ b/tests/vircgrouptest.c @@ -0,0 +1,249 @@ +/* + * Copyright (C) 2013 Red Hat, Inc. + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library. If not, see + * <http://www.gnu.org/licenses/>. + * + * Author: Daniel P. 
Berrange <berrange@redhat.com> + */ + +#include <config.h> + +/* This part defines the actual test cases */ +#include <stdlib.h> + +#define __VIR_CGROUP_ALLOW_INCLUDE_PRIV_H__ +#include "vircgrouppriv.h" +#include "testutils.h" +#include "virutil.h" +#include "virerror.h" +#include "virlog.h" +#include "virfile.h" + +#define VIR_FROM_THIS VIR_FROM_NONE + +static int validateCgroup(virCgroupPtr cgroup, + const char *expectPath, + const char **expectMountPoint, + const char **expectPlacement) +{ + int i; + + if (STRNEQ(cgroup->path, expectPath)) { + fprintf(stderr, "Wrong path '%s', expected '%s'\n", + cgroup->path, expectPath); + return -1; + } + + for (i = 0 ; i < VIR_CGROUP_CONTROLLER_LAST ; i++) { + if (STRNEQ_NULLABLE(expectMountPoint[i], + cgroup->controllers[i].mountPoint)) { + fprintf(stderr, "Wrong mount '%s', expected '%s' for '%s'\n", + cgroup->controllers[i].mountPoint, + expectMountPoint[i], + virCgroupControllerTypeToString(i)); + return -1; + } + if (STRNEQ_NULLABLE(expectPlacement[i], + cgroup->controllers[i].placement)) { + fprintf(stderr, "Wrong placement '%s', expected '%s' for '%s'\n", + cgroup->controllers[i].placement, + expectPlacement[i], + virCgroupControllerTypeToString(i)); + return -1; + } + } + + return 0; +} + +const char *mountsSmall[VIR_CGROUP_CONTROLLER_LAST] = { + [VIR_CGROUP_CONTROLLER_CPU] = "/not/really/sys/fs/cgroup/cpu,cpuacct", + [VIR_CGROUP_CONTROLLER_CPUACCT] = "/not/really/sys/fs/cgroup/cpu,cpuacct", + [VIR_CGROUP_CONTROLLER_CPUSET] = NULL, + [VIR_CGROUP_CONTROLLER_MEMORY] = "/not/really/sys/fs/cgroup/memory", + [VIR_CGROUP_CONTROLLER_DEVICES] = NULL, + [VIR_CGROUP_CONTROLLER_FREEZER] = NULL, + [VIR_CGROUP_CONTROLLER_BLKIO] = NULL, +}; +const char *mountsFull[VIR_CGROUP_CONTROLLER_LAST] = { + [VIR_CGROUP_CONTROLLER_CPU] = "/not/really/sys/fs/cgroup/cpu,cpuacct", + [VIR_CGROUP_CONTROLLER_CPUACCT] = "/not/really/sys/fs/cgroup/cpu,cpuacct", + [VIR_CGROUP_CONTROLLER_CPUSET] = "/not/really/sys/fs/cgroup/cpuset", + 
[VIR_CGROUP_CONTROLLER_MEMORY] = "/not/really/sys/fs/cgroup/memory", + [VIR_CGROUP_CONTROLLER_DEVICES] = NULL, + [VIR_CGROUP_CONTROLLER_FREEZER] = "/not/really/sys/fs/cgroup/freezer", + [VIR_CGROUP_CONTROLLER_BLKIO] = "/not/really/sys/fs/cgroup/blkio", +}; + +static int testCgroupNewForSelf(const void *args ATTRIBUTE_UNUSED) +{ + virCgroupPtr cgroup = NULL; + int ret = -1; + const char *placement[VIR_CGROUP_CONTROLLER_LAST] = { + [VIR_CGROUP_CONTROLLER_CPU] = "/system", + [VIR_CGROUP_CONTROLLER_CPUACCT] = "/system", + [VIR_CGROUP_CONTROLLER_CPUSET] = "", + [VIR_CGROUP_CONTROLLER_MEMORY] = "", + [VIR_CGROUP_CONTROLLER_DEVICES] = NULL, + [VIR_CGROUP_CONTROLLER_FREEZER] = "", + [VIR_CGROUP_CONTROLLER_BLKIO] = "", + }; + + if (virCgroupNewSelf(&cgroup) < 0) { + fprintf(stderr, "Cannot create cgroup for self\n"); + goto cleanup; + } + + ret = validateCgroup(cgroup, "/", mountsFull, placement); + +cleanup: + virCgroupFree(&cgroup); + return ret; +} + + +static int testCgroupNewForDriver(const void *args ATTRIBUTE_UNUSED) +{ + virCgroupPtr cgroup = NULL; + int ret = -1; + int rv; + const char *placement[VIR_CGROUP_CONTROLLER_LAST] = { + [VIR_CGROUP_CONTROLLER_CPU] = "/system", + [VIR_CGROUP_CONTROLLER_CPUACCT] = "/system", + [VIR_CGROUP_CONTROLLER_CPUSET] = "", + [VIR_CGROUP_CONTROLLER_MEMORY] = "", + [VIR_CGROUP_CONTROLLER_DEVICES] = NULL, + [VIR_CGROUP_CONTROLLER_FREEZER] = "", + [VIR_CGROUP_CONTROLLER_BLKIO] = "", + }; + + if ((rv = virCgroupNewDriver("lxc", true, false, -1, &cgroup)) != -ENOENT) { + fprintf(stderr, "Unexpected found LXC cgroup: %d\n", -rv); + goto cleanup; + } + + /* Asking for impossible combination since CPU is co-mounted */ + if ((rv = virCgroupNewDriver("lxc", true, true, + (1 << VIR_CGROUP_CONTROLLER_CPU), + &cgroup)) != -EINVAL) { + fprintf(stderr, "Should not have created LXC cgroup: %d\n", -rv); + goto cleanup; + } + + /* Asking for impossible combination since devices is not mounted */ + if ((rv = virCgroupNewDriver("lxc", true, true, + (1 << 
VIR_CGROUP_CONTROLLER_DEVICES), + &cgroup)) != -ENOENT) { + fprintf(stderr, "Should not have created LXC cgroup: %d\n", -rv); + goto cleanup; + } + + /* Asking for small combination since devices is not mounted */ + if ((rv = virCgroupNewDriver("lxc", true, true, + (1 << VIR_CGROUP_CONTROLLER_CPU) | + (1 << VIR_CGROUP_CONTROLLER_CPUACCT) | + (1 << VIR_CGROUP_CONTROLLER_MEMORY), + &cgroup)) != 0) { + fprintf(stderr, "Cannot create LXC cgroup: %d\n", -rv); + goto cleanup; + } + ret = validateCgroup(cgroup, "/libvirt/lxc", mountsSmall, placement); + virCgroupFree(&cgroup); + + if ((rv = virCgroupNewDriver("lxc", true, true, -1, &cgroup)) != 0) { + fprintf(stderr, "Cannot create LXC cgroup: %d\n", -rv); + goto cleanup; + } + ret = validateCgroup(cgroup, "/libvirt/lxc", mountsFull, placement); + +cleanup: + virCgroupFree(&cgroup); + return ret; +} + + +static int testCgroupNewForDomain(const void *args ATTRIBUTE_UNUSED) +{ + virCgroupPtr drivercgroup = NULL; + virCgroupPtr domaincgroup = NULL; + int ret = -1; + int rv; + const char *placement[VIR_CGROUP_CONTROLLER_LAST] = { + [VIR_CGROUP_CONTROLLER_CPU] = "/system", + [VIR_CGROUP_CONTROLLER_CPUACCT] = "/system", + [VIR_CGROUP_CONTROLLER_CPUSET] = "", + [VIR_CGROUP_CONTROLLER_MEMORY] = "", + [VIR_CGROUP_CONTROLLER_DEVICES] = NULL, + [VIR_CGROUP_CONTROLLER_FREEZER] = "", + [VIR_CGROUP_CONTROLLER_BLKIO] = "", + }; + + if ((rv = virCgroupNewDriver("lxc", true, false, -1, &drivercgroup)) != 0) { + fprintf(stderr, "Cannot find LXC cgroup: %d\n", -rv); + goto cleanup; + } + + if ((rv = virCgroupNewDomain(drivercgroup, "wibble", true, &domaincgroup)) != 0) { + fprintf(stderr, "Cannot create LXC cgroup: %d\n", -rv); + goto cleanup; + } + + ret = validateCgroup(domaincgroup, "/libvirt/lxc/wibble", mountsFull, placement); + +cleanup: + virCgroupFree(&drivercgroup); + virCgroupFree(&domaincgroup); + return ret; +} + + +#define FAKESYSFSDIRTEMPLATE abs_builddir "/fakesysfsdir-XXXXXX" + + + +static int +mymain(void) +{ + int ret = 0; 
+ char *fakesysfsdir; + + if (!(fakesysfsdir = strdup(FAKESYSFSDIRTEMPLATE))) { + fprintf(stderr, "Out of memory\n"); + abort(); + } + + if (!mkdtemp(fakesysfsdir)) { + fprintf(stderr, "Cannot create fakesysfsdir"); + abort(); + } + + setenv("LIBVIRT_FAKE_SYSFS_DIR", fakesysfsdir, 1); + + if (virtTestRun("New cgroup for self", 1, testCgroupNewForSelf, NULL) < 0) + ret = -1; + + if (virtTestRun("New cgroup for driver", 1, testCgroupNewForDriver, NULL) < 0) + ret = -1; + + if (virtTestRun("New cgroup for domain", 1, testCgroupNewForDomain, NULL) < 0) + ret = -1; + + if (getenv("LIBVIRT_SKIP_CLEANUP") == NULL) + virFileDeleteTree(fakesysfsdir); + + return ret==0 ? EXIT_SUCCESS : EXIT_FAILURE; +} + +VIRT_TEST_MAIN_PRELOAD(mymain, abs_builddir "/.libs/libvircgroupmock.so") -- 1.8.1.4
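
A note for reviewers on the path redirection used by the mock: the access(), mkdir() and open() wrappers above all rewrite a path that starts with SYSFS_PREFIX so that it lives under the fake sysfs directory, and only then call the real function. The following is a minimal stand-alone sketch of that rewrite, not code from the patch. The helper name redirect_path and its three-argument form are hypothetical. Unlike the wrappers, it also insists that the prefix ends on a path component boundary and tolerates a trailing slash on the prefix.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not in the patch): map 'path' from under 'prefix'
 * to under 'fakedir'. Returns a malloc'd string that the caller frees,
 * or NULL if 'path' is not under 'prefix' or allocation fails. */
static char *redirect_path(const char *path,
                           const char *prefix,
                           const char *fakedir)
{
    size_t plen;
    size_t len;
    const char *rest;
    char *newpath;

    assert(path && prefix && fakedir);

    /* Ignore trailing slashes on the prefix, but keep a lone "/" */
    plen = strlen(prefix);
    while (plen > 1 && prefix[plen - 1] == '/')
        plen--;
    if (plen == 0)
        return NULL;

    if (strncmp(path, prefix, plen) != 0)
        return NULL;

    /* Only match on a component boundary: "/a/bX" is not under "/a/b" */
    if (prefix[plen - 1] != '/' &&
        path[plen] != '\0' && path[plen] != '/')
        return NULL;

    /* Skip the separator(s) between the prefix and the remainder */
    rest = path + plen;
    while (*rest == '/')
        rest++;

    len = strlen(fakedir) + 1 + strlen(rest) + 1;
    if (!(newpath = malloc(len)))
        return NULL;
    snprintf(newpath, len, "%s/%s", fakedir, rest);
    return newpath;
}
```

A wrapper would call this first and fall back to the untouched path when it returns NULL, which is the same shape as the "newpath ? newpath : path" logic in the open() wrapper.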

From: "Daniel P. Berrange" <berrange@redhat.com>

Currently the virCgroupPtr struct contains 3 pieces of information

 - path - path of the cgroup, relative to current process' cgroup placement
 - placement - current process' placement in each controller
 - mounts - mount point of each controller

When reading/writing cgroup settings, the path & placement strings are combined to form the file path. This approach only works if we assume all cgroups will be relative to the current process' cgroup placement.

To allow support for managing cgroups at any place in the hierarchy a change is needed. The 'placement' data should reflect the absolute path to the cgroup, and the 'path' value should no longer be used to form the paths to the cgroup attribute files.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
--- src/util/vircgroup.c | 222 +++++++++++++++++++++++++++++++++++--------------- tests/vircgrouptest.c | 53 +++++++----- 2 files changed, 188 insertions(+), 87 deletions(-) diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c index 2f52c92..c336806 100644 --- a/src/util/vircgroup.c +++ b/src/util/vircgroup.c @@ -101,6 +101,23 @@ bool virCgroupHasController(virCgroupPtr cgroup, int controller) } #if defined HAVE_MNTENT_H && defined HAVE_GETMNTENT_R +static int virCgroupCopyMounts(virCgroupPtr group, + virCgroupPtr parent) +{ + int i; + for (i = 0 ; i < VIR_CGROUP_CONTROLLER_LAST ; i++) { + if (!parent->controllers[i].mountPoint) + continue; + + group->controllers[i].mountPoint = + strdup(parent->controllers[i].mountPoint); + + if (!group->controllers[i].mountPoint) + return -ENOMEM; + } + return 0; +} + /* * Process /proc/mounts figuring out what controllers are * mounted and where @@ -158,12 +175,61 @@ no_memory: } +static int virCgroupCopyPlacement(virCgroupPtr group, + const char *path, + virCgroupPtr parent) +{ + int i; + for (i = 0 ; i < VIR_CGROUP_CONTROLLER_LAST ; i++) { + if (!group->controllers[i].mountPoint) + continue; + + if (path[0] == '/') { + if
(!(group->controllers[i].placement = strdup(path))) + return -ENOMEM; + } else { + /* + * parent=="/" + path="" => "/" + * parent=="/libvirt.service" + path="" => "/libvirt.service" + * parent=="/libvirt.service" + path="foo" => "/libvirt.service/foo" + */ + if (virAsprintf(&group->controllers[i].placement, + "%s%s%s", + parent->controllers[i].placement, + (STREQ(parent->controllers[i].placement, "/") || + STREQ(path, "") ? "" : "/"), + path) < 0) + return -ENOMEM; + } + } + + return 0; +} + + /* + * @group: the group to process + * @path: the relative path to append, not starting with '/' + * * Process /proc/self/cgroup figuring out what cgroup * sub-path the current process is assigned to. ie not - * necessarily in the root + * necessarily in the root. The contents of this file + * looks like + * + * 9:perf_event:/ + * 8:blkio:/ + * 7:net_cls:/ + * 6:freezer:/ + * 5:devices:/ + * 4:memory:/ + * 3:cpuacct,cpu:/ + * 2:cpuset:/ + * 1:name=systemd:/user/berrange/2 + * + * It then appends @path to each detected path. */ -static int virCgroupDetectPlacement(virCgroupPtr group) +static int virCgroupDetectPlacement(virCgroupPtr group, + const char *path) { int i; FILE *mapping = NULL; @@ -177,18 +243,18 @@ static int virCgroupDetectPlacement(virCgroupPtr group) while (fgets(line, sizeof(line), mapping) != NULL) { char *controllers = strchr(line, ':'); - char *path = controllers ? strchr(controllers+1, ':') : NULL; - char *nl = path ? strchr(path, '\n') : NULL; + char *selfpath = controllers ? strchr(controllers + 1, ':') : NULL; + char *nl = selfpath ? 
strchr(selfpath, '\n') : NULL; - if (!controllers || !path) + if (!controllers || !selfpath) continue; if (nl) *nl = '\0'; - *path = '\0'; + *selfpath = '\0'; controllers++; - path++; + selfpath++; for (i = 0 ; i < VIR_CGROUP_CONTROLLER_LAST ; i++) { const char *typestr = virCgroupControllerTypeToString(i); @@ -198,14 +264,25 @@ static int virCgroupDetectPlacement(virCgroupPtr group) char *next = strchr(tmp, ','); int len; if (next) { - len = next-tmp; + len = next - tmp; next++; } else { len = strlen(tmp); } - if (typelen == len && STREQLEN(typestr, tmp, len) && - !(group->controllers[i].placement = strdup(STREQ(path, "/") ? "" : path))) - goto no_memory; + + /* + * selfpath=="/" + path="" -> "/" + * selfpath=="/libvirt.service" + path="" -> "/libvirt.service" + * selfpath=="/libvirt.service" + path="foo" -> "/libvirt.service/foo" + */ + if (typelen == len && STREQLEN(typestr, tmp, len)) { + if (virAsprintf(&group->controllers[i].placement, + "%s%s%s", selfpath, + (STREQ(selfpath, "/") || + STREQ(path, "") ? "" : "/"), + path) < 0) + goto no_memory; + } tmp = next; } @@ -223,13 +300,20 @@ no_memory: } static int virCgroupDetect(virCgroupPtr group, - int controllers) + int controllers, + const char *path, + virCgroupPtr parent) { int rc; int i; int j; + VIR_DEBUG("group=%p controllers=%d path=%s parent=%p", + group, controllers, path, parent); - rc = virCgroupDetectMounts(group); + if (parent) + rc = virCgroupCopyMounts(group, parent); + else + rc = virCgroupDetectMounts(group); if (rc < 0) { VIR_ERROR(_("Failed to detect mounts for %s"), group->path); return rc; @@ -238,9 +322,10 @@ static int virCgroupDetect(virCgroupPtr group, if (controllers >= 0) { VIR_DEBUG("Validating controllers %d", controllers); for (i = 0 ; i < VIR_CGROUP_CONTROLLER_LAST ; i++) { - VIR_DEBUG("Controller '%s' wanted=%s", + VIR_DEBUG("Controller '%s' wanted=%s, mount='%s'", virCgroupControllerTypeToString(i), - (1 << i) & controllers ? "yes" : "no"); + (1 << i) & controllers ? 
"yes" : "no", + NULLSTR(group->controllers[i].mountPoint)); if (((1 << i) & controllers)) { /* Ensure requested controller is present */ if (!group->controllers[i].mountPoint) { @@ -282,10 +367,15 @@ static int virCgroupDetect(virCgroupPtr group, } /* Check that at least 1 controller is available */ - if (!controllers) + if (!controllers) { + VIR_DEBUG("No controllers set"); return -ENXIO; + } - rc = virCgroupDetectPlacement(group); + if (parent || path[0] == '/') + rc = virCgroupCopyPlacement(group, path, parent); + else + rc = virCgroupDetectPlacement(group, path); if (rc == 0) { /* Check that for every mounted controller, we found our placement */ @@ -339,10 +429,9 @@ int virCgroupPathOfController(virCgroupPtr group, if (group->controllers[controller].placement == NULL) return -ENOENT; - if (virAsprintf(path, "%s%s%s/%s", + if (virAsprintf(path, "%s%s/%s", group->controllers[controller].mountPoint, group->controllers[controller].placement, - STREQ(group->path, "/") ? "" : group->path, key ? key : "") == -1) return -ENOMEM; @@ -634,14 +723,31 @@ static int virCgroupMakeGroup(virCgroupPtr parent, } +/** + * virCgroupNew: + * @path: path for the new group + * @parent: parent group, or NULL + * @controllers: bitmask of controllers to activate + * + * Create a new cgroup storing it in @group. + * + * If @path starts with a '/' it is treated as an + * absolute path, and @parent is ignored. Otherwise + * it is treated as being relative to @parent. If + * @parent is NULL, then the placement of the current + * process is used. 
+ * + */ static int virCgroupNew(const char *path, + virCgroupPtr parent, int controllers, virCgroupPtr *group) { int rc = 0; char *typpath = NULL; - VIR_DEBUG("path=%s controllers=%d", path, controllers); + VIR_DEBUG("parent=%p path=%s controllers=%d", + parent, path, controllers); *group = NULL; if (VIR_ALLOC((*group)) != 0) { @@ -649,12 +755,22 @@ static int virCgroupNew(const char *path, goto err; } - if (!((*group)->path = strdup(path))) { - rc = -ENOMEM; - goto err; + if (path[0] == '/' || !parent) { + if (!((*group)->path = strdup(path))) { + rc = -ENOMEM; + goto err; + } + } else { + if (virAsprintf(&(*group)->path, "%s%s%s", + parent->path, + STREQ(parent->path, "") ? "" : "/", + path) < 0) { + rc = -ENOMEM; + goto err; + } } - rc = virCgroupDetect(*group, controllers); + rc = virCgroupDetect(*group, controllers, path, parent); if (rc < 0) goto err; @@ -673,15 +789,16 @@ static int virCgroupAppRoot(bool privileged, bool create, int controllers) { - virCgroupPtr rootgrp = NULL; + virCgroupPtr selfgrp = NULL; int rc; - rc = virCgroupNew("/", controllers, &rootgrp); + rc = virCgroupNewSelf(&selfgrp); + if (rc != 0) return rc; if (privileged) { - rc = virCgroupNew("/libvirt", controllers, group); + rc = virCgroupNew("libvirt", selfgrp, controllers, group); } else { char *rootname; char *username; @@ -690,23 +807,23 @@ static int virCgroupAppRoot(bool privileged, rc = -ENOMEM; goto cleanup; } - rc = virAsprintf(&rootname, "/libvirt-%s", username); + rc = virAsprintf(&rootname, "libvirt-%s", username); VIR_FREE(username); if (rc < 0) { rc = -ENOMEM; goto cleanup; } - rc = virCgroupNew(rootname, controllers, group); + rc = virCgroupNew(rootname, selfgrp, controllers, group); VIR_FREE(rootname); } if (rc != 0) goto cleanup; - rc = virCgroupMakeGroup(rootgrp, *group, create, VIR_CGROUP_NONE); + rc = virCgroupMakeGroup(selfgrp, *group, create, VIR_CGROUP_NONE); cleanup: - virCgroupFree(&rootgrp); + virCgroupFree(&selfgrp); return rc; } #endif @@ -942,7 +1059,6 @@ 
int virCgroupNewDriver(const char *name, virCgroupPtr *group) { int rc; - char *path = NULL; virCgroupPtr rootgrp = NULL; rc = virCgroupAppRoot(privileged, &rootgrp, @@ -950,20 +1066,12 @@ int virCgroupNewDriver(const char *name, if (rc != 0) goto out; - if (virAsprintf(&path, "%s/%s", rootgrp->path, name) < 0) { - rc = -ENOMEM; - goto out; - } - - rc = virCgroupNew(path, controllers, group); - VIR_FREE(path); - + rc = virCgroupNew(name, rootgrp, -1, group); if (rc == 0) { rc = virCgroupMakeGroup(rootgrp, *group, create, VIR_CGROUP_NONE); if (rc != 0) virCgroupFree(group); } - out: virCgroupFree(&rootgrp); @@ -994,7 +1102,7 @@ int virCgroupNewDriver(const char *name ATTRIBUTE_UNUSED, #if defined HAVE_MNTENT_H && defined HAVE_GETMNTENT_R int virCgroupNewSelf(virCgroupPtr *group) { - return virCgroupNew("/", -1, group); + return virCgroupNew("", NULL, -1, group); } #else int virCgroupNewSelf(virCgroupPtr *group ATTRIBUTE_UNUSED) @@ -1019,13 +1127,8 @@ int virCgroupNewDomain(virCgroupPtr driver, virCgroupPtr *group) { int rc; - char *path; - if (virAsprintf(&path, "%s/%s", driver->path, name) < 0) - return -ENOMEM; - - rc = virCgroupNew(path, -1, group); - VIR_FREE(path); + rc = virCgroupNew(name, driver, -1, group); if (rc == 0) { /* @@ -1071,18 +1174,18 @@ int virCgroupNewVcpu(virCgroupPtr domain, virCgroupPtr *group) { int rc; - char *path; + char *name; int controllers; - if (virAsprintf(&path, "%s/vcpu%d", domain->path, vcpuid) < 0) + if (virAsprintf(&name, "vcpu%d", vcpuid) < 0) return -ENOMEM; controllers = ((1 << VIR_CGROUP_CONTROLLER_CPU) | (1 << VIR_CGROUP_CONTROLLER_CPUACCT) | (1 << VIR_CGROUP_CONTROLLER_CPUSET)); - rc = virCgroupNew(path, controllers, group); - VIR_FREE(path); + rc = virCgroupNew(name, domain, controllers, group); + VIR_FREE(name); if (rc == 0) { rc = virCgroupMakeGroup(domain, *group, create, VIR_CGROUP_NONE); @@ -1116,18 +1219,13 @@ int virCgroupNewEmulator(virCgroupPtr domain, virCgroupPtr *group) { int rc; - char *path; int 
controllers; - if (virAsprintf(&path, "%s/emulator", domain->path) < 0) - return -ENOMEM; - controllers = ((1 << VIR_CGROUP_CONTROLLER_CPU) | (1 << VIR_CGROUP_CONTROLLER_CPUACCT) | (1 << VIR_CGROUP_CONTROLLER_CPUSET)); - rc = virCgroupNew(path, controllers, group); - VIR_FREE(path); + rc = virCgroupNew("emulator", domain, controllers, group); if (rc == 0) { rc = virCgroupMakeGroup(domain, *group, create, VIR_CGROUP_NONE); @@ -2015,8 +2113,6 @@ static int virCgroupKillRecursiveInternal(virCgroupPtr group, int signum, virHas } while ((ent = readdir(dp))) { - char *subpath; - if (STREQ(ent->d_name, ".")) continue; if (STREQ(ent->d_name, "..")) @@ -2025,12 +2121,8 @@ static int virCgroupKillRecursiveInternal(virCgroupPtr group, int signum, virHas continue; VIR_DEBUG("Process subdir %s", ent->d_name); - if (virAsprintf(&subpath, "%s/%s", group->path, ent->d_name) < 0) { - rc = -ENOMEM; - goto cleanup; - } - if ((rc = virCgroupNew(subpath, -1, &subgroup)) != 0) + if ((rc = virCgroupNew(ent->d_name, group, -1, &subgroup)) != 0) goto cleanup; if ((rc = virCgroupKillRecursiveInternal(subgroup, signum, pids, true)) < 0) diff --git a/tests/vircgrouptest.c b/tests/vircgrouptest.c index a68aa88..3f35f2e 100644 --- a/tests/vircgrouptest.c +++ b/tests/vircgrouptest.c @@ -94,11 +94,11 @@ static int testCgroupNewForSelf(const void *args ATTRIBUTE_UNUSED) const char *placement[VIR_CGROUP_CONTROLLER_LAST] = { [VIR_CGROUP_CONTROLLER_CPU] = "/system", [VIR_CGROUP_CONTROLLER_CPUACCT] = "/system", - [VIR_CGROUP_CONTROLLER_CPUSET] = "", - [VIR_CGROUP_CONTROLLER_MEMORY] = "", + [VIR_CGROUP_CONTROLLER_CPUSET] = "/", + [VIR_CGROUP_CONTROLLER_MEMORY] = "/", [VIR_CGROUP_CONTROLLER_DEVICES] = NULL, - [VIR_CGROUP_CONTROLLER_FREEZER] = "", - [VIR_CGROUP_CONTROLLER_BLKIO] = "", + [VIR_CGROUP_CONTROLLER_FREEZER] = "/", + [VIR_CGROUP_CONTROLLER_BLKIO] = "/", }; if (virCgroupNewSelf(&cgroup) < 0) { @@ -106,7 +106,7 @@ static int testCgroupNewForSelf(const void *args ATTRIBUTE_UNUSED) goto cleanup; } 
- ret = validateCgroup(cgroup, "/", mountsFull, placement); + ret = validateCgroup(cgroup, "", mountsFull, placement); cleanup: virCgroupFree(&cgroup); @@ -119,14 +119,23 @@ static int testCgroupNewForDriver(const void *args ATTRIBUTE_UNUSED) virCgroupPtr cgroup = NULL; int ret = -1; int rv; - const char *placement[VIR_CGROUP_CONTROLLER_LAST] = { - [VIR_CGROUP_CONTROLLER_CPU] = "/system", - [VIR_CGROUP_CONTROLLER_CPUACCT] = "/system", - [VIR_CGROUP_CONTROLLER_CPUSET] = "", - [VIR_CGROUP_CONTROLLER_MEMORY] = "", + const char *placementSmall[VIR_CGROUP_CONTROLLER_LAST] = { + [VIR_CGROUP_CONTROLLER_CPU] = "/system/libvirt/lxc", + [VIR_CGROUP_CONTROLLER_CPUACCT] = "/system/libvirt/lxc", + [VIR_CGROUP_CONTROLLER_CPUSET] = NULL, + [VIR_CGROUP_CONTROLLER_MEMORY] = "/libvirt/lxc", [VIR_CGROUP_CONTROLLER_DEVICES] = NULL, - [VIR_CGROUP_CONTROLLER_FREEZER] = "", - [VIR_CGROUP_CONTROLLER_BLKIO] = "", + [VIR_CGROUP_CONTROLLER_FREEZER] = NULL, + [VIR_CGROUP_CONTROLLER_BLKIO] = NULL, + }; + const char *placementFull[VIR_CGROUP_CONTROLLER_LAST] = { + [VIR_CGROUP_CONTROLLER_CPU] = "/system/libvirt/lxc", + [VIR_CGROUP_CONTROLLER_CPUACCT] = "/system/libvirt/lxc", + [VIR_CGROUP_CONTROLLER_CPUSET] = "/libvirt/lxc", + [VIR_CGROUP_CONTROLLER_MEMORY] = "/libvirt/lxc", + [VIR_CGROUP_CONTROLLER_DEVICES] = NULL, + [VIR_CGROUP_CONTROLLER_FREEZER] = "/libvirt/lxc", + [VIR_CGROUP_CONTROLLER_BLKIO] = "/libvirt/lxc", }; if ((rv = virCgroupNewDriver("lxc", true, false, -1, &cgroup)) != -ENOENT) { @@ -159,14 +168,14 @@ static int testCgroupNewForDriver(const void *args ATTRIBUTE_UNUSED) fprintf(stderr, "Cannot create LXC cgroup: %d\n", -rv); goto cleanup; } - ret = validateCgroup(cgroup, "/libvirt/lxc", mountsSmall, placement); + ret = validateCgroup(cgroup, "libvirt/lxc", mountsSmall, placementSmall); virCgroupFree(&cgroup); if ((rv = virCgroupNewDriver("lxc", true, true, -1, &cgroup)) != 0) { fprintf(stderr, "Cannot create LXC cgroup: %d\n", -rv); goto cleanup; } - ret = validateCgroup(cgroup, 
"/libvirt/lxc", mountsFull, placement); + ret = validateCgroup(cgroup, "libvirt/lxc", mountsFull, placementFull); cleanup: virCgroupFree(&cgroup); @@ -181,13 +190,13 @@ static int testCgroupNewForDomain(const void *args ATTRIBUTE_UNUSED) int ret = -1; int rv; const char *placement[VIR_CGROUP_CONTROLLER_LAST] = { - [VIR_CGROUP_CONTROLLER_CPU] = "/system", - [VIR_CGROUP_CONTROLLER_CPUACCT] = "/system", - [VIR_CGROUP_CONTROLLER_CPUSET] = "", - [VIR_CGROUP_CONTROLLER_MEMORY] = "", + [VIR_CGROUP_CONTROLLER_CPU] = "/system/libvirt/lxc/wibble", + [VIR_CGROUP_CONTROLLER_CPUACCT] = "/system/libvirt/lxc/wibble", + [VIR_CGROUP_CONTROLLER_CPUSET] = "/libvirt/lxc/wibble", + [VIR_CGROUP_CONTROLLER_MEMORY] = "/libvirt/lxc/wibble", [VIR_CGROUP_CONTROLLER_DEVICES] = NULL, - [VIR_CGROUP_CONTROLLER_FREEZER] = "", - [VIR_CGROUP_CONTROLLER_BLKIO] = "", + [VIR_CGROUP_CONTROLLER_FREEZER] = "/libvirt/lxc/wibble", + [VIR_CGROUP_CONTROLLER_BLKIO] = "/libvirt/lxc/wibble", }; if ((rv = virCgroupNewDriver("lxc", true, false, -1, &drivercgroup)) != 0) { @@ -200,7 +209,7 @@ static int testCgroupNewForDomain(const void *args ATTRIBUTE_UNUSED) goto cleanup; } - ret = validateCgroup(domaincgroup, "/libvirt/lxc/wibble", mountsFull, placement); + ret = validateCgroup(domaincgroup, "libvirt/lxc/wibble", mountsFull, placement); cleanup: virCgroupFree(&drivercgroup); @@ -246,4 +255,4 @@ mymain(void) return ret==0 ? EXIT_SUCCESS : EXIT_FAILURE; } -VIRT_TEST_MAIN_PRELOAD(mymain, abs_builddir "/.libs/libvircgroupmock.so") +VIRT_TEST_MAIN_PRELOAD(mymain, abs_builddir "/.libs/vircgroupmock.so") -- 1.8.1.4

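To make the joining rule in the patch above easier to review, here is a minimal stand-alone sketch of how a placement is combined with a relative or absolute path, and how the resulting placement is then combined with the mount point and an attribute name (the "%s%s/%s" format now used in virCgroupPathOfController). The function names concat4, join_placement and controller_file are hypothetical and use plain malloc rather than libvirt's virAsprintf. This is an illustration of the rule as described in the code comments, not the libvirt implementation.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: concatenate four strings into a malloc'd buffer */
static char *concat4(const char *a, const char *b,
                     const char *c, const char *d)
{
    size_t len = strlen(a) + strlen(b) + strlen(c) + strlen(d) + 1;
    char *out = malloc(len);

    if (!out)
        return NULL;
    snprintf(out, len, "%s%s%s%s", a, b, c, d);
    return out;
}

/*
 * base=="/"                + path==""    => "/"
 * base=="/libvirt.service" + path==""    => "/libvirt.service"
 * base=="/libvirt.service" + path=="foo" => "/libvirt.service/foo"
 * any base                 + path=="/x"  => "/x" (absolute path wins)
 */
static char *join_placement(const char *base, const char *path)
{
    assert(base && path);

    if (path[0] == '/')
        return concat4(path, "", "", "");

    /* No separator if base is the root or there is nothing to append */
    return concat4(base,
                   (strcmp(base, "/") == 0 || path[0] == '\0') ? "" : "/",
                   path, "");
}

/* Mount point + absolute placement + "/" + attribute file name */
static char *controller_file(const char *mount,
                             const char *placement,
                             const char *key)
{
    assert(mount && placement);
    return concat4(mount, placement, "/", key ? key : "");
}
```

With the mock mounts from the test suite, a domain group for "wibble" placed at "/libvirt/lxc/wibble" in the memory controller would resolve memory.limit_in_bytes to /not/really/sys/fs/cgroup/memory/libvirt/lxc/wibble/memory.limit_in_bytes, without consulting the 'path' field at all.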
From: "Daniel P. Berrange" <berrange@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> --- src/util/vircgroup.c | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c index c336806..d3c43a2 100644 --- a/src/util/vircgroup.c +++ b/src/util/vircgroup.c @@ -662,12 +662,18 @@ static int virCgroupMakeGroup(virCgroupPtr parent, char *path = NULL; /* Skip over controllers that aren't mounted */ - if (!group->controllers[i].mountPoint) + if (!group->controllers[i].mountPoint) { + VIR_DEBUG("Skipping unmounted controller %s", + virCgroupControllerTypeToString(i)); continue; + } rc = virCgroupPathOfController(group, i, "", &path); - if (rc < 0) + if (rc < 0) { + VIR_DEBUG("Failed to find path of controller %s", + virCgroupControllerTypeToString(i)); return rc; + } /* As of Feb 2011, clang can't see that the above function * call did not modify group. */ sa_assert(group->controllers[i].mountPoint); @@ -681,11 +687,14 @@ static int virCgroupMakeGroup(virCgroupPtr parent, * other controllers even though they are available. So * treat blkio as unmounted if mkdir fails. */ if (i == VIR_CGROUP_CONTROLLER_BLKIO) { + VIR_DEBUG("Ignoring mkdir failure with blkio controller. 
Kernel probably too old"); rc = 0; VIR_FREE(group->controllers[i].mountPoint); VIR_FREE(path); continue; } else { + VIR_DEBUG("Failed to create controller %s for group", + virCgroupControllerTypeToString(i)); rc = -errno; VIR_FREE(path); break; @@ -719,6 +728,7 @@ static int virCgroupMakeGroup(virCgroupPtr parent, VIR_FREE(path); } + VIR_DEBUG("Done making controllers for group"); return rc; } @@ -903,6 +913,7 @@ int virCgroupRemove(virCgroupPtr group) int i; char *grppath = NULL; + VIR_DEBUG("Removing cgroup %s", group->path); for (i = 0 ; i < VIR_CGROUP_CONTROLLER_LAST ; i++) { /* Skip over controllers not mounted */ if (!group->controllers[i].mountPoint) @@ -918,6 +929,7 @@ int virCgroupRemove(virCgroupPtr group) rc = virCgroupRemoveRecursively(grppath); VIR_FREE(grppath); } + VIR_DEBUG("Done removing cgroup %s", group->path); return rc; } -- 1.8.1.4

From: "Daniel P. Berrange" <berrange@redhat.com>

Currently, if virCgroupMakeGroup fails, we can end up in a situation
where some controllers have been set up but others have not. Ensure
we call virCgroupRemove to remove whatever was created before the
failure.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
---
 src/util/vircgroup.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c
index d3c43a2..bcc61a8 100644
--- a/src/util/vircgroup.c
+++ b/src/util/vircgroup.c
@@ -1081,8 +1081,10 @@ int virCgroupNewDriver(const char *name,
     rc = virCgroupNew(name, rootgrp, -1, group);
     if (rc == 0) {
         rc = virCgroupMakeGroup(rootgrp, *group, create, VIR_CGROUP_NONE);
-        if (rc != 0)
+        if (rc != 0) {
+            virCgroupRemove(*group);
             virCgroupFree(group);
+        }
     }
 out:
     virCgroupFree(&rootgrp);
@@ -1154,8 +1156,10 @@ int virCgroupNewDomain(virCgroupPtr driver,
          * cumulative usage that we don't need.
          */
         rc = virCgroupMakeGroup(driver, *group, create, VIR_CGROUP_MEM_HIERACHY);
-        if (rc != 0)
+        if (rc != 0) {
+            virCgroupRemove(*group);
             virCgroupFree(group);
+        }
     }
 
     return rc;
@@ -1201,8 +1205,10 @@ int virCgroupNewVcpu(virCgroupPtr domain,
 
     if (rc == 0) {
         rc = virCgroupMakeGroup(domain, *group, create, VIR_CGROUP_NONE);
-        if (rc != 0)
+        if (rc != 0) {
+            virCgroupRemove(*group);
             virCgroupFree(group);
+        }
     }
 
     return rc;
@@ -1241,8 +1247,10 @@ int virCgroupNewEmulator(virCgroupPtr domain,
 
     if (rc == 0) {
         rc = virCgroupMakeGroup(domain, *group, create, VIR_CGROUP_NONE);
-        if (rc != 0)
+        if (rc != 0) {
+            virCgroupRemove(*group);
             virCgroupFree(group);
+        }
     }
 
     return rc;
--
1.8.1.4
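
The rollback idea in this patch can be looked at in isolation. The sketch below is not libvirt code: make_group_dirs() and remove_group_dirs() are hypothetical helpers that stand in for virCgroupMakeGroup() and virCgroupRemove(), and plain directories stand in for per-controller cgroup mounts. It only illustrates the shape of the error path, namely that a failure part-way through creation removes whatever was already created before the error is returned.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for virCgroupRemove(): best-effort removal of
 * "<root>/<name>" under every root. Errors are deliberately ignored,
 * because some of the directories may never have been created. */
static void remove_group_dirs(const char *const *roots, size_t nroots,
                              const char *name)
{
    char path[4096];

    for (size_t i = 0; i < nroots; i++) {
        snprintf(path, sizeof(path), "%s/%s", roots[i], name);
        rmdir(path);
    }
}

/* Hypothetical stand-in for virCgroupMakeGroup() plus the new error
 * path: returns 0 on success or -errno on failure, and on failure
 * removes whatever was created before the error occurred. */
static int make_group_dirs(const char *const *roots, size_t nroots,
                           const char *name)
{
    char path[4096];

    assert(roots != NULL && name != NULL);

    for (size_t i = 0; i < nroots; i++) {
        snprintf(path, sizeof(path), "%s/%s", roots[i], name);
        if (mkdir(path, 0755) < 0) {
            int rc = -errno; /* save errno before rmdir() can clobber it */
            remove_group_dirs(roots, nroots, name);
            return rc;
        }
    }
    return 0;
}
```

A caller sees either a fully created group or an error with nothing left behind, which is the property the four call sites in the patch gain.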

From: "Daniel P. Berrange" <berrange@redhat.com> A resource partition is an absolute cgroup path, ignoring the current process placement. Expose a virCgroupNewPartition API for constructing such cgroups Signed-off-by: Daniel P. Berrange <berrange@redhat.com> --- src/libvirt_private.syms | 4 +- src/lxc/lxc_cgroup.c | 2 +- src/qemu/qemu_cgroup.c | 2 +- src/util/vircgroup.c | 146 ++++++++++++++++++++++++++++++++++++++--- src/util/vircgroup.h | 20 ++++-- tests/vircgrouptest.c | 166 ++++++++++++++++++++++++++++++++++++++++++++++- 6 files changed, 321 insertions(+), 19 deletions(-) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 1fea9a2..cba3f77 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1116,9 +1116,11 @@ virCgroupKill; virCgroupKillPainfully; virCgroupKillRecursive; virCgroupMoveTask; -virCgroupNewDomain; +virCgroupNewDomainDriver; +virCgroupNewDomainPartition; virCgroupNewDriver; virCgroupNewEmulator; +virCgroupNewPartition; virCgroupNewSelf; virCgroupNewVcpu; virCgroupPathOfController; diff --git a/src/lxc/lxc_cgroup.c b/src/lxc/lxc_cgroup.c index 7d1432b..72940bd 100644 --- a/src/lxc/lxc_cgroup.c +++ b/src/lxc/lxc_cgroup.c @@ -536,7 +536,7 @@ virCgroupPtr virLXCCgroupCreate(virDomainDefPtr def) goto cleanup; } - rc = virCgroupNewDomain(driver, def->name, true, &cgroup); + rc = virCgroupNewDomainDriver(driver, def->name, true, &cgroup); if (rc != 0) { virReportSystemError(-rc, _("Unable to create cgroup for domain %s"), diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c index cb53acb..cb0faa1 100644 --- a/src/qemu/qemu_cgroup.c +++ b/src/qemu/qemu_cgroup.c @@ -216,7 +216,7 @@ int qemuInitCgroup(virQEMUDriverPtr driver, goto cleanup; } - rc = virCgroupNewDomain(driverGroup, vm->def->name, true, &priv->cgroup); + rc = virCgroupNewDomainDriver(driverGroup, vm->def->name, true, &priv->cgroup); if (rc != 0) { virReportSystemError(-rc, _("Unable to create cgroup for %s"), diff --git a/src/util/vircgroup.c 
b/src/util/vircgroup.c index bcc61a8..40e0fe6 100644 --- a/src/util/vircgroup.c +++ b/src/util/vircgroup.c @@ -1055,6 +1055,76 @@ cleanup: return rc; } + +#if defined HAVE_MNTENT_H && defined HAVE_GETMNTENT_R +/** + * virCgroupNewPartition: + * @path: path for the partition + * @create: true to create the cgroup tree + * @controllers: mask of controllers to create + * + * Creates a new cgroup to represent the resource + * partition path identified by @name. + * + * Returns 0 on success, -errno on failure + */ +int virCgroupNewPartition(const char *path, + bool create, + int controllers, + virCgroupPtr *group) +{ + int rc; + char *parentPath = NULL; + virCgroupPtr parent = NULL; + VIR_DEBUG("path=%s create=%d controllers=%x", + path, create, controllers); + + if (path[0] != '/') + return -EINVAL; + + rc = virCgroupNew(path, NULL, controllers, group); + if (rc != 0) + goto cleanup; + + if (STRNEQ(path, "/")) { + char *tmp; + if (!(parentPath = strdup(path))) + return -ENOMEM; + + tmp = strrchr(parentPath, '/'); + tmp++; + *tmp = '\0'; + + rc = virCgroupNew(parentPath, NULL, controllers, &parent); + if (rc != 0) + goto cleanup; + + rc = virCgroupMakeGroup(parent, *group, create, VIR_CGROUP_NONE); + if (rc != 0) { + virCgroupRemove(*group); + virCgroupFree(group); + goto cleanup; + } + } + +cleanup: + virCgroupFree(&parent); + VIR_FREE(parentPath); + return rc; +} +#else +int virCgroupNewPartition(const char *path ATTRIBUTE_UNUSED, + const char *driver ATTRIBUTE_UNUSED, + const char *name ATTRIBUTE_UNUSED, + bool create ATTRIBUTE_UNUSED, + int controllers ATTRIBUTE_UNUSED, + virCgroupPtr *group ATTRIBUTE_UNUSED) +{ + /* Claim no support */ + return -ENXIO; +} +#endif + /** * virCgroupNewDriver: * @@ -1126,7 +1196,7 @@ int virCgroupNewSelf(virCgroupPtr *group ATTRIBUTE_UNUSED) #endif /** - * virCgroupNewDomain: + * virCgroupNewDomainDriver: * * @driver: group for driver owning the domain * @name: name of the domain @@ -1135,10 +1205,10 @@ int 
virCgroupNewSelf(virCgroupPtr *group ATTRIBUTE_UNUSED) * Returns 0 on success */ #if defined HAVE_MNTENT_H && defined HAVE_GETMNTENT_R -int virCgroupNewDomain(virCgroupPtr driver, - const char *name, - bool create, - virCgroupPtr *group) +int virCgroupNewDomainDriver(virCgroupPtr driver, + const char *name, + bool create, + virCgroupPtr *group) { int rc; @@ -1165,10 +1235,68 @@ int virCgroupNewDomain(virCgroupPtr driver, return rc; } #else -int virCgroupNewDomain(virCgroupPtr driver ATTRIBUTE_UNUSED, - const char *name ATTRIBUTE_UNUSED, - bool create ATTRIBUTE_UNUSED, - virCgroupPtr *group ATTRIBUTE_UNUSED) +int virCgroupNewDomainDriver(virCgroupPtr driver ATTRIBUTE_UNUSED, + const char *name ATTRIBUTE_UNUSED, + bool create ATTRIBUTE_UNUSED, + virCgroupPtr *group ATTRIBUTE_UNUSED) +{ + return -ENXIO; +} +#endif + +/** + * virCgroupNewDomainPartition: + * + * @partition: partition holding the domain + * @driver: name of the driver + * @name: name of the domain + * @group: Pointer to returned virCgroupPtr + * + * Returns 0 on success + */ +#if defined HAVE_MNTENT_H && defined HAVE_GETMNTENT_R +int virCgroupNewDomainPartition(virCgroupPtr partition, + const char *driver, + const char *name, + bool create, + virCgroupPtr *group) +{ + int rc; + char *dirname = NULL; + + if (virAsprintf(&dirname, "%s.%s.libvirt", + name, driver) < 0) + return -ENOMEM; + + rc = virCgroupNew(dirname, partition, -1, group); + + if (rc == 0) { + /* + * Create a cgroup with memory.use_hierarchy enabled to + * surely account memory usage of lxc with ns subsystem + * enabled. (To be exact, memory and ns subsystems are + * enabled at the same time.) + * + * The reason why doing it here, not a upper group, say + * a group for driver, is to avoid overhead to track + * cumulative usage that we don't need. 
+ */ + rc = virCgroupMakeGroup(partition, *group, create, VIR_CGROUP_MEM_HIERACHY); + if (rc != 0) { + virCgroupRemove(*group); + virCgroupFree(group); + } + } + + VIR_FREE(dirname); + return rc; +} +#else +int virCgroupNewDomainPartition(virCgroupPtr partition ATTRIBUTE_UNUSED, + const char *driver ATTRIBUTE_UNUSED, + const char *name ATTRIBUTE_UNUSED, + bool create ATTRIBUTE_UNUSED, + virCgroupPtr *group ATTRIBUTE_UNUSED) { return -ENXIO; } diff --git a/src/util/vircgroup.h b/src/util/vircgroup.h index 91143e2..33f86a6 100644 --- a/src/util/vircgroup.h +++ b/src/util/vircgroup.h @@ -44,6 +44,12 @@ enum { VIR_ENUM_DECL(virCgroupController); +int virCgroupNewPartition(const char *path, + bool create, + int controllers, + virCgroupPtr *group) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(4); + int virCgroupNewDriver(const char *name, bool privileged, bool create, @@ -54,10 +60,16 @@ int virCgroupNewDriver(const char *name, int virCgroupNewSelf(virCgroupPtr *group) ATTRIBUTE_NONNULL(1); -int virCgroupNewDomain(virCgroupPtr driver, - const char *name, - bool create, - virCgroupPtr *group) +int virCgroupNewDomainDriver(virCgroupPtr driver, + const char *name, + bool create, + virCgroupPtr *group) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(4); +int virCgroupNewDomainPartition(virCgroupPtr partition, + const char *driver, + const char *name, + bool create, + virCgroupPtr *group) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(4); int virCgroupNewVcpu(virCgroupPtr domain, diff --git a/tests/vircgrouptest.c b/tests/vircgrouptest.c index 3f35f2e..a806368 100644 --- a/tests/vircgrouptest.c +++ b/tests/vircgrouptest.c @@ -183,7 +183,7 @@ cleanup: } -static int testCgroupNewForDomain(const void *args ATTRIBUTE_UNUSED) +static int testCgroupNewForDriverDomain(const void *args ATTRIBUTE_UNUSED) { virCgroupPtr drivercgroup = NULL; virCgroupPtr domaincgroup = NULL; @@ -204,7 +204,7 @@ static int testCgroupNewForDomain(const void *args ATTRIBUTE_UNUSED) 
goto cleanup; } - if ((rv = virCgroupNewDomain(drivercgroup, "wibble", true, &domaincgroup)) != 0) { + if ((rv = virCgroupNewDomainDriver(drivercgroup, "wibble", true, &domaincgroup)) != 0) { fprintf(stderr, "Cannot create LXC cgroup: %d\n", -rv); goto cleanup; } @@ -218,6 +218,156 @@ cleanup: } +static int testCgroupNewForPartition(const void *args ATTRIBUTE_UNUSED) +{ + virCgroupPtr cgroup = NULL; + int ret = -1; + int rv; + const char *placementSmall[VIR_CGROUP_CONTROLLER_LAST] = { + [VIR_CGROUP_CONTROLLER_CPU] = "/virtualmachines", + [VIR_CGROUP_CONTROLLER_CPUACCT] = "/virtualmachines", + [VIR_CGROUP_CONTROLLER_CPUSET] = NULL, + [VIR_CGROUP_CONTROLLER_MEMORY] = "/virtualmachines", + [VIR_CGROUP_CONTROLLER_DEVICES] = NULL, + [VIR_CGROUP_CONTROLLER_FREEZER] = NULL, + [VIR_CGROUP_CONTROLLER_BLKIO] = NULL, + }; + const char *placementFull[VIR_CGROUP_CONTROLLER_LAST] = { + [VIR_CGROUP_CONTROLLER_CPU] = "/virtualmachines", + [VIR_CGROUP_CONTROLLER_CPUACCT] = "/virtualmachines", + [VIR_CGROUP_CONTROLLER_CPUSET] = "/virtualmachines", + [VIR_CGROUP_CONTROLLER_MEMORY] = "/virtualmachines", + [VIR_CGROUP_CONTROLLER_DEVICES] = NULL, + [VIR_CGROUP_CONTROLLER_FREEZER] = "/virtualmachines", + [VIR_CGROUP_CONTROLLER_BLKIO] = "/virtualmachines", + }; + + if ((rv = virCgroupNewPartition("/virtualmachines", false, -1, &cgroup)) != -ENOENT) { + fprintf(stderr, "Unexpected found /virtualmachines cgroup: %d\n", -rv); + goto cleanup; + } + + /* Asking for impossible combination since CPU is co-mounted */ + if ((rv = virCgroupNewPartition("/virtualmachines", true, + (1 << VIR_CGROUP_CONTROLLER_CPU), + &cgroup)) != -EINVAL) { + fprintf(stderr, "Should not have created /virtualmachines cgroup: %d\n", -rv); + goto cleanup; + } + + /* Asking for impossible combination since devices is not mounted */ + if ((rv = virCgroupNewPartition("/virtualmachines", true, + (1 << VIR_CGROUP_CONTROLLER_DEVICES), + &cgroup)) != -ENOENT) { + fprintf(stderr, "Should not have created /virtualmachines 
cgroup: %d\n", -rv); + goto cleanup; + } + + /* Asking for small combination since devices is not mounted */ + if ((rv = virCgroupNewPartition("/virtualmachines", true, + (1 << VIR_CGROUP_CONTROLLER_CPU) | + (1 << VIR_CGROUP_CONTROLLER_CPUACCT) | + (1 << VIR_CGROUP_CONTROLLER_MEMORY), + &cgroup)) != 0) { + fprintf(stderr, "Cannot create /virtualmachines cgroup: %d\n", -rv); + goto cleanup; + } + ret = validateCgroup(cgroup, "/virtualmachines", mountsSmall, placementSmall); + virCgroupFree(&cgroup); + + if ((rv = virCgroupNewPartition("/virtualmachines", true, -1, &cgroup)) != 0) { + fprintf(stderr, "Cannot create /virtualmachines cgroup: %d\n", -rv); + goto cleanup; + } + ret = validateCgroup(cgroup, "/virtualmachines", mountsFull, placementFull); + +cleanup: + virCgroupFree(&cgroup); + return ret; +} + + +static int testCgroupNewForPartitionNested(const void *args ATTRIBUTE_UNUSED) +{ + virCgroupPtr cgroup = NULL; + int ret = -1; + int rv; + const char *placementFull[VIR_CGROUP_CONTROLLER_LAST] = { + [VIR_CGROUP_CONTROLLER_CPU] = "/users/berrange", + [VIR_CGROUP_CONTROLLER_CPUACCT] = "/users/berrange", + [VIR_CGROUP_CONTROLLER_CPUSET] = "/users/berrange", + [VIR_CGROUP_CONTROLLER_MEMORY] = "/users/berrange", + [VIR_CGROUP_CONTROLLER_DEVICES] = NULL, + [VIR_CGROUP_CONTROLLER_FREEZER] = "/users/berrange", + [VIR_CGROUP_CONTROLLER_BLKIO] = "/users/berrange", + }; + + if ((rv = virCgroupNewPartition("/users/berrange", false, -1, &cgroup)) != -ENOENT) { + fprintf(stderr, "Unexpected found /users/berrange cgroup: %d\n", -rv); + goto cleanup; + } + + /* Should not work, since we require /users to be pre-created */ + if ((rv = virCgroupNewPartition("/users/berrange", true, -1, &cgroup)) != -ENOENT) { + fprintf(stderr, "Unexpected created /users/berrange cgroup: %d\n", -rv); + goto cleanup; + } + + if ((rv = virCgroupNewPartition("/users", true, -1, &cgroup)) != 0) { + fprintf(stderr, "Failed to create /users cgroup: %d\n", -rv); + goto cleanup; + } + + /* Should now work 
*/ + if ((rv = virCgroupNewPartition("/users/berrange", true, -1, &cgroup)) != 0) { + fprintf(stderr, "Failed to create /users/berrange cgroup: %d\n", -rv); + goto cleanup; + } + + ret = validateCgroup(cgroup, "/users/berrange", mountsFull, placementFull); + +cleanup: + virCgroupFree(&cgroup); + return ret; +} + + + +static int testCgroupNewForPartitionDomain(const void *args ATTRIBUTE_UNUSED) +{ + virCgroupPtr partitioncgroup = NULL; + virCgroupPtr domaincgroup = NULL; + int ret = -1; + int rv; + const char *placement[VIR_CGROUP_CONTROLLER_LAST] = { + [VIR_CGROUP_CONTROLLER_CPU] = "/production/foo.lxc.libvirt", + [VIR_CGROUP_CONTROLLER_CPUACCT] = "/production/foo.lxc.libvirt", + [VIR_CGROUP_CONTROLLER_CPUSET] = "/production/foo.lxc.libvirt", + [VIR_CGROUP_CONTROLLER_MEMORY] = "/production/foo.lxc.libvirt", + [VIR_CGROUP_CONTROLLER_DEVICES] = NULL, + [VIR_CGROUP_CONTROLLER_FREEZER] = "/production/foo.lxc.libvirt", + [VIR_CGROUP_CONTROLLER_BLKIO] = "/production/foo.lxc.libvirt", + }; + + if ((rv = virCgroupNewPartition("/production", true, -1, &partitioncgroup)) != 0) { + fprintf(stderr, "Failed to create /production cgroup: %d\n", -rv); + goto cleanup; + } + + if ((rv = virCgroupNewDomainPartition(partitioncgroup, "lxc", "foo", true, &domaincgroup)) != 0) { + fprintf(stderr, "Cannot create LXC cgroup: %d\n", -rv); + goto cleanup; + } + + ret = validateCgroup(domaincgroup, "/production/foo.lxc.libvirt", mountsFull, placement); + +cleanup: + virCgroupFree(&partitioncgroup); + virCgroupFree(&domaincgroup); + return ret; +} + + #define FAKESYSFSDIRTEMPLATE abs_builddir "/fakesysfsdir-XXXXXX" @@ -246,9 +396,19 @@ mymain(void) if (virtTestRun("New cgroup for driver", 1, testCgroupNewForDriver, NULL) < 0) ret = -1; - if (virtTestRun("New cgroup for domain", 1, testCgroupNewForDomain, NULL) < 0) + if (virtTestRun("New cgroup for domain driver", 1, testCgroupNewForDriverDomain, NULL) < 0) + ret = -1; + + if (virtTestRun("New cgroup for partition", 1, 
testCgroupNewForPartition, NULL) < 0) + ret = -1; + + if (virtTestRun("New cgroup for partition nested", 1, testCgroupNewForPartitionNested, NULL) < 0) ret = -1; + if (virtTestRun("New cgroup for domain partition", 1, testCgroupNewForPartitionDomain, NULL) < 0) + ret = -1; + + if (getenv("LIBVIRT_SKIP_CLEANUP") == NULL) virFileDeleteTree(fakesysfsdir); -- 1.8.1.4
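
The nested-partition test above relies on virCgroupNewPartition deriving the parent of a partition path and requiring that parent to exist already, which is why "/users/berrange" fails with -ENOENT until "/users" has been created. A minimal sketch of the parent derivation follows. partition_parent() is a hypothetical name, not a libvirt function; it mirrors the strrchr()-based truncation in the patch, so the result keeps its trailing '/'. Unlike the patch, which skips parent handling for "/", this sketch simply returns "/" for that input.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated copy of the parent of
 * an absolute partition path, including the trailing '/', or NULL if
 * the path is not absolute or allocation fails. Caller frees. */
static char *partition_parent(const char *path)
{
    size_t len;
    char *parent;
    char *slash;

    if (path == NULL || path[0] != '/')
        return NULL;

    len = strlen(path);
    parent = malloc(len + 1);
    if (parent == NULL)
        return NULL;
    memcpy(parent, path, len + 1);

    /* Cut everything after the last '/', keeping the '/' itself. */
    slash = strrchr(parent, '/');
    assert(slash != NULL); /* guaranteed, since parent[0] == '/' */
    slash[1] = '\0';
    return parent;
}
```

With this rule, "/users/berrange" yields "/users/" and "/virtualmachines" yields "/", so a top-level partition only needs the controller root to exist.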

From: "Daniel P. Berrange" <berrange@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> --- docs/formatdomain.html.in | 26 ++++++++++ docs/schemas/domaincommon.rng | 12 +++++ src/conf/domain_conf.c | 78 ++++++++++++++++++++++++++++ src/conf/domain_conf.h | 7 +++ tests/domainschemadata/domain-lxc-simple.xml | 3 ++ 5 files changed, 126 insertions(+) diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index cf382e8..3e7ab65 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -716,6 +716,32 @@ </dl> + <h3><a name="resPartition">Resource partitioning</a></h3> + + <p> + Hypervisors may allow for virtual machines to be placed into + resource partitions, potentially with nesting of said partitions. + The <code>resource</code> element groups together configuration + related to resource partitioning. It currently supports a child + element <code>partition</code> whose content defines the path + of the resource partition in which to place the domain. If no + partition is listed, then the domain will be placed in a default + partition. + </p> +<pre> + ... + <resource> + <partition>/virtualmachines/production</partition> + </resource> + ... +</pre> + + <p> + Resource partitions are currently supported by the QEMU and + LXC drivers, which map partition paths onto cgroups directories, + in all mounted controllers. 
<span class="since">Since 1.0.5</span> + </p> + <h3><a name="elementsCPU">CPU model and topology</a></h3> <p> diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index 63ba7d1..296f8f9 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -537,6 +537,10 @@ <optional> <ref name="numatune"/> </optional> + + <optional> + <ref name="respartition"/> + </optional> </interleave> </define> @@ -680,6 +684,14 @@ </element> </define> + <define name="respartition"> + <element name="resource"> + <element name="partition"> + <ref name="absFilePath"/> + </element> + </element> + </define> + <define name="clock"> <optional> <element name="clock"> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index cc26f21..d44bb5d 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -1748,6 +1748,18 @@ virDomainVcpuPinDefArrayFree(virDomainVcpuPinDefPtr *def, VIR_FREE(def); } + +void +virDomainResourceDefFree(virDomainResourceDefPtr resource) +{ + if (!resource) + return; + + VIR_FREE(resource->partition); + VIR_FREE(resource); +} + + void virDomainDefFree(virDomainDefPtr def) { unsigned int i; @@ -1755,6 +1767,8 @@ void virDomainDefFree(virDomainDefPtr def) if (!def) return; + virDomainResourceDefFree(def->resource); + /* hostdevs must be freed before nets (or any future "intelligent * hostdevs") because the pointer to the hostdev is really * pointing into the middle of the higher level device's object, @@ -9378,6 +9392,37 @@ cleanup: } +static virDomainResourceDefPtr +virDomainResourceDefParse(xmlNodePtr node, + xmlXPathContextPtr ctxt) +{ + virDomainResourceDefPtr def = NULL; + xmlNodePtr tmp = ctxt->node; + + ctxt->node = node; + + if (VIR_ALLOC(def) < 0) { + virReportOOMError(); + goto error; + } + + /* Find out what type of virtualization to use */ + if (!(def->partition = virXPathString("string(./partition)", ctxt))) { + virReportError(VIR_ERR_INTERNAL_ERROR, + "%s", _("missing resource partition attribute")); +
goto error; + } + + ctxt->node = tmp; + return def; + +error: + ctxt->node = tmp; + virDomainResourceDefFree(def); + return NULL; +} + + static virDomainDefPtr virDomainDefParseXML(virCapsPtr caps, virDomainXMLConfPtr xmlconf, @@ -9948,6 +9993,25 @@ virDomainDefParseXML(virCapsPtr caps, } VIR_FREE(nodes); + /* Extract numatune if exists. */ + if ((n = virXPathNodeSet("./resource", ctxt, &nodes)) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + "%s", _("cannot extract resource nodes")); + goto error; + } + + if (n > 1) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("only one resource element is supported")); + VIR_FREE(nodes); + goto error; + } + + if (n && + !(def->resource = virDomainResourceDefParse(nodes[0], ctxt))) + goto error; + VIR_FREE(nodes); + if ((n = virXPathNodeSet("./features/*", ctxt, &nodes)) < 0) goto error; @@ -14605,6 +14669,17 @@ virDomainIsAllVcpupinInherited(virDomainDefPtr def) } } + +static void +virDomainResourceDefFormat(virBufferPtr buf, + virDomainResourceDefPtr def) +{ + virBufferAddLit(buf, " <resource>\n"); + virBufferEscapeString(buf, " <partition>%s</partition>\n", def->partition); + virBufferAddLit(buf, " </resource>\n"); +} + + #define DUMPXML_FLAGS \ (VIR_DOMAIN_XML_SECURE | \ VIR_DOMAIN_XML_INACTIVE | \ @@ -14873,6 +14948,9 @@ virDomainDefFormatInternal(virDomainDefPtr def, virBufferAddLit(buf, " </numatune>\n"); } + if (def->resource) + virDomainResourceDefFormat(buf, def->resource); + if (def->sysinfo) virDomainSysinfoDefFormat(buf, def->sysinfo); diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index edddf25..b05cd34 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -1735,6 +1735,11 @@ struct _virDomainRNGDef { void virBlkioDeviceWeightArrayClear(virBlkioDeviceWeightPtr deviceWeights, int ndevices); +typedef struct _virDomainResourceDef virDomainResourceDef; +typedef virDomainResourceDef *virDomainResourceDefPtr; +struct _virDomainResourceDef { + char *partition; +}; /* * Guest VM main 
configuration @@ -1786,6 +1791,7 @@ struct _virDomainDef { } cputune; virNumaTuneDef numatune; + virDomainResourceDefPtr resource; /* These 3 are based on virDomainLifeCycleAction enum flags */ int onReboot; @@ -1976,6 +1982,7 @@ virDomainObjPtr virDomainObjListFindByName(const virDomainObjListPtr doms, bool virDomainObjTaint(virDomainObjPtr obj, enum virDomainTaintFlags taint); +void virDomainResourceDefFree(virDomainResourceDefPtr resource); void virDomainGraphicsDefFree(virDomainGraphicsDefPtr def); void virDomainInputDefFree(virDomainInputDefPtr def); void virDomainDiskDefFree(virDomainDiskDefPtr def); diff --git a/tests/domainschemadata/domain-lxc-simple.xml b/tests/domainschemadata/domain-lxc-simple.xml index e61434f..56a0117 100644 --- a/tests/domainschemadata/domain-lxc-simple.xml +++ b/tests/domainschemadata/domain-lxc-simple.xml @@ -5,6 +5,9 @@ <type>exe</type> <init>/sh</init> </os> + <resource> + <partition>/virtualmachines</partition> + </resource> <memory unit='KiB'>500000</memory> <devices> <filesystem type='mount'> -- 1.8.1.4

From: "Daniel P. Berrange" <berrange@redhat.com>

Historically QEMU/LXC guests have been placed in a cgroup layout that is

   $LOCATION-OF-LIBVIRTD/libvirt/{qemu,lxc}/$VMNAME

This is bad for a number of reasons:

 - The cgroup hierarchy gets very deep, which seriously impacts
   kernel performance due to cgroups scalability limitations.

 - It is hard to set up cgroup policies which apply across services
   and virtual machines, since all VMs are underneath the libvirtd
   service.

To address this, the default cgroup location is changed to be

   /system/$VMNAME.{lxc,qemu}.libvirt

This puts virtual machines at the same level in the hierarchy as
system services, allowing consistent policy to be set up across all
of them.

This also honours the new resource partition location from the XML
configuration. For example

   <resource>
     <partition>/virtualmachines/production</partition>
   </resource>

will result in the VM being placed at

   /virtualmachines/production/$VMNAME.{lxc,qemu}.libvirt

NB, with the exception of the default path, /system, which is
intended to always exist, libvirt will not attempt to auto-create
the partitions named in the XML. It is the responsibility of the
admin/app to configure the partitions. Later libvirt APIs will
provide a way to do this.

Signed-off-by: Daniel P.
Berrange <berrange@redhat.com> --- src/lxc/lxc_cgroup.c | 91 +++++++++++++++++++++++++++++++------- src/lxc/lxc_cgroup.h | 2 +- src/lxc/lxc_process.c | 4 +- src/qemu/qemu_cgroup.c | 114 +++++++++++++++++++++++++++++++++++++----------- src/qemu/qemu_cgroup.h | 3 +- src/qemu/qemu_process.c | 2 +- 6 files changed, 169 insertions(+), 47 deletions(-) diff --git a/src/lxc/lxc_cgroup.c b/src/lxc/lxc_cgroup.c index 72940bd..8f19057 100644 --- a/src/lxc/lxc_cgroup.c +++ b/src/lxc/lxc_cgroup.c @@ -523,29 +523,88 @@ cleanup: } -virCgroupPtr virLXCCgroupCreate(virDomainDefPtr def) +virCgroupPtr virLXCCgroupCreate(virDomainDefPtr def, bool startup) { - virCgroupPtr driver = NULL; - virCgroupPtr cgroup = NULL; int rc; + virCgroupPtr parent = NULL; + virCgroupPtr cgroup = NULL; - rc = virCgroupNewDriver("lxc", true, false, -1, &driver); - if (rc != 0) { - virReportSystemError(-rc, "%s", - _("Unable to get cgroup for driver")); - goto cleanup; + if (!def->resource && startup) { + virDomainResourceDefPtr res; + + if (VIR_ALLOC(res) < 0) { + virReportOOMError(); + goto cleanup; + } + + if (!(res->partition = strdup("/system"))) { + virReportOOMError(); + VIR_FREE(res); + goto cleanup; + } + + def->resource = res; } - rc = virCgroupNewDomainDriver(driver, def->name, true, &cgroup); - if (rc != 0) { - virReportSystemError(-rc, - _("Unable to create cgroup for domain %s"), - def->name); - goto cleanup; + if (def->resource && + def->resource->partition) { + if (def->resource->partition[0] != '/') { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Resource partition '%s' must start with '/'"), + def->resource->partition); + goto cleanup; + } + /* We only auto-create the default partition. 
In other + * cases we expect the sysadmin/app to have done so */ + rc = virCgroupNewPartition(def->resource->partition, + STREQ(def->resource->partition, "/system"), + -1, + &parent); + if (rc != 0) { + virReportSystemError(-rc, + _("Unable to initialize %s cgroup"), + def->resource->partition); + goto cleanup; + } + + rc = virCgroupNewDomainPartition(parent, + "lxc", + def->name, + true, + &cgroup); + if (rc != 0) { + virReportSystemError(-rc, + _("Unable to create cgroup for %s"), + def->name); + goto cleanup; + } + } else { + rc = virCgroupNewDriver("lxc", + true, + true, + -1, + &parent); + if (rc != 0) { + virReportSystemError(-rc, + _("Unable to create cgroup for %s"), + def->name); + goto cleanup; + } + + rc = virCgroupNewDomainDriver(parent, + def->name, + true, + &cgroup); + if (rc != 0) { + virReportSystemError(-rc, + _("Unable to create cgroup for %s"), + def->name); + goto cleanup; + } } cleanup: - virCgroupFree(&driver); + virCgroupFree(&parent); return cgroup; } @@ -556,7 +615,7 @@ virCgroupPtr virLXCCgroupJoin(virDomainDefPtr def) int ret = -1; int rc; - if (!(cgroup = virLXCCgroupCreate(def))) + if (!(cgroup = virLXCCgroupCreate(def, true))) return NULL; rc = virCgroupAddTask(cgroup, getpid()); diff --git a/src/lxc/lxc_cgroup.h b/src/lxc/lxc_cgroup.h index 25a427c..f040de2 100644 --- a/src/lxc/lxc_cgroup.h +++ b/src/lxc/lxc_cgroup.h @@ -27,7 +27,7 @@ # include "lxc_fuse.h" # include "virusb.h" -virCgroupPtr virLXCCgroupCreate(virDomainDefPtr def); +virCgroupPtr virLXCCgroupCreate(virDomainDefPtr def, bool startup); virCgroupPtr virLXCCgroupJoin(virDomainDefPtr def); int virLXCCgroupSetup(virDomainDefPtr def, virCgroupPtr cgroup, diff --git a/src/lxc/lxc_process.c b/src/lxc/lxc_process.c index 193dd9a..9f42354 100644 --- a/src/lxc/lxc_process.c +++ b/src/lxc/lxc_process.c @@ -1049,7 +1049,7 @@ int virLXCProcessStart(virConnectPtr conn, virCgroupFree(&priv->cgroup); - if (!(priv->cgroup = virLXCCgroupCreate(vm->def))) + if (!(priv->cgroup =
virLXCCgroupCreate(vm->def, true))) return -1; if (!virCgroupHasController(priv->cgroup, @@ -1464,7 +1464,7 @@ virLXCProcessReconnectDomain(virDomainObjPtr vm, if (!(priv->monitor = virLXCProcessConnectMonitor(driver, vm))) goto error; - if (!(priv->cgroup = virLXCCgroupCreate(vm->def))) + if (!(priv->cgroup = virLXCCgroupCreate(vm->def, false))) goto error; if (virLXCUpdateActiveUsbHostdevs(driver, vm->def) < 0) diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c index cb0faa1..db9aafe 100644 --- a/src/qemu/qemu_cgroup.c +++ b/src/qemu/qemu_cgroup.c @@ -188,46 +188,108 @@ int qemuSetupHostUsbDeviceCgroup(virUSBDevicePtr dev ATTRIBUTE_UNUSED, int qemuInitCgroup(virQEMUDriverPtr driver, - virDomainObjPtr vm) + virDomainObjPtr vm, + bool startup) { - int rc; + int rc = -1; qemuDomainObjPrivatePtr priv = vm->privateData; - virCgroupPtr driverGroup = NULL; + virCgroupPtr parent = NULL; virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver); virCgroupFree(&priv->cgroup); - rc = virCgroupNewDriver("qemu", - cfg->privileged, - true, - cfg->cgroupControllers, - &driverGroup); - if (rc != 0) { - if (rc == -ENXIO || - rc == -EPERM || - rc == -EACCES) { /* No cgroups mounts == success */ - VIR_DEBUG("No cgroups present/configured/accessible, ignoring error"); - goto done; + if (!vm->def->resource && startup) { + virDomainResourceDefPtr res; + + if (VIR_ALLOC(res) < 0) { + virReportOOMError(); + goto cleanup; } - virReportSystemError(-rc, - _("Unable to create cgroup for %s"), - vm->def->name); - goto cleanup; + if (!(res->partition = strdup("/system"))) { + virReportOOMError(); + VIR_FREE(res); + goto cleanup; + } + + vm->def->resource = res; } - rc = virCgroupNewDomainDriver(driverGroup, vm->def->name, true, &priv->cgroup); - if (rc != 0) { - virReportSystemError(-rc, - _("Unable to create cgroup for %s"), - vm->def->name); - goto cleanup; + if (vm->def->resource && + vm->def->resource->partition) { + if (vm->def->resource->partition[0] != '/') { + 
virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Resource partition '%s' must start with '/'"), + vm->def->resource->partition); + goto cleanup; + } + /* We only auto-create the default partition. In other + * cases we expect the sysadmin/app to have done so */ + rc = virCgroupNewPartition(vm->def->resource->partition, + STREQ(vm->def->resource->partition, "/system"), + cfg->cgroupControllers, + &parent); + if (rc != 0) { + if (rc == -ENXIO || + rc == -EPERM || + rc == -EACCES) { /* No cgroups mounts == success */ + VIR_DEBUG("No cgroups present/configured/accessible, ignoring error"); + goto done; + } + + virReportSystemError(-rc, + _("Unable to initialize %s cgroup"), + vm->def->resource->partition); + goto cleanup; + } + + rc = virCgroupNewDomainPartition(parent, + "qemu", + vm->def->name, + true, + &priv->cgroup); + if (rc != 0) { + virReportSystemError(-rc, + _("Unable to create cgroup for %s"), + vm->def->name); + goto cleanup; + } + } else { + rc = virCgroupNewDriver("qemu", + cfg->privileged, + true, + cfg->cgroupControllers, + &parent); + if (rc != 0) { + if (rc == -ENXIO || + rc == -EPERM || + rc == -EACCES) { /* No cgroups mounts == success */ + VIR_DEBUG("No cgroups present/configured/accessible, ignoring error"); + goto done; + } + + virReportSystemError(-rc, + _("Unable to create cgroup for %s"), + vm->def->name); + goto cleanup; + } + + rc = virCgroupNewDomainDriver(parent, + vm->def->name, + true, + &priv->cgroup); + if (rc != 0) { + virReportSystemError(-rc, + _("Unable to create cgroup for %s"), + vm->def->name); + goto cleanup; + } } done: rc = 0; cleanup: - virCgroupFree(&driverGroup); + virCgroupFree(&parent); virObjectUnref(cfg); return rc; } @@ -246,7 +308,7 @@ int qemuSetupCgroup(virQEMUDriverPtr driver, (const char *const *)cfg->cgroupDeviceACL : defaultDeviceACL; - if (qemuInitCgroup(driver, vm) < 0) + if (qemuInitCgroup(driver, vm, true) < 0) return -1; if (!priv->cgroup) diff --git a/src/qemu/qemu_cgroup.h b/src/qemu/qemu_cgroup.h index
6cbfebc..e63f443 100644 --- a/src/qemu/qemu_cgroup.h +++ b/src/qemu/qemu_cgroup.h @@ -37,7 +37,8 @@ int qemuSetupHostUsbDeviceCgroup(virUSBDevicePtr dev, const char *path, void *opaque); int qemuInitCgroup(virQEMUDriverPtr driver, - virDomainObjPtr vm); + virDomainObjPtr vm, + bool startup); int qemuSetupCgroup(virQEMUDriverPtr driver, virDomainObjPtr vm, virBitmapPtr nodemask); diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index a86e62c..a7f0563 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -3005,7 +3005,7 @@ qemuProcessReconnect(void *opaque) if (qemuUpdateActiveUsbHostdevs(driver, obj->def) < 0) goto error; - if (qemuInitCgroup(driver, obj) < 0) + if (qemuInitCgroup(driver, obj, false) < 0) goto error; /* XXX: Need to change as long as lock is introduced for -- 1.8.1.4
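
The placement rules introduced by this patch can be summarised as a few pure functions. This is a sketch under stated assumptions rather than the driver code: effective_partition(), partition_autocreate() and domain_cgroup_path() are hypothetical names, and the resulting path is relative to each controller mount point. It encodes what the commit message and diff describe: default to "/system" only at startup when no partition is configured, auto-create only that default, reject relative partitions, and name the group $VMNAME.$DRIVER.libvirt inside the partition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define DEFAULT_PARTITION "/system"

/* Hypothetical helper: which partition applies to a domain. A NULL
 * result means "no partition", i.e. the caller falls back to the
 * legacy per-driver layout, as happens on reconnect (startup == false)
 * for a domain without a configured partition. */
static const char *effective_partition(const char *configured, bool startup)
{
    if (configured != NULL)
        return configured;
    return startup ? DEFAULT_PARTITION : NULL;
}

/* Hypothetical helper: only the default partition is auto-created;
 * any other partition must be set up by the admin or application. */
static bool partition_autocreate(const char *partition)
{
    return strcmp(partition, DEFAULT_PARTITION) == 0;
}

/* Hypothetical helper: build "<partition>/<name>.<driver>.libvirt".
 * Returns 0 on success, -1 for a relative partition or if the result
 * does not fit in buf. */
static int domain_cgroup_path(char *buf, size_t cap, const char *partition,
                              const char *driver, const char *name)
{
    const char *sep;
    int n;

    assert(buf != NULL && partition != NULL && driver != NULL && name != NULL);

    if (partition[0] != '/')
        return -1;

    /* Avoid a doubled slash when the partition is the root itself. */
    sep = strcmp(partition, "/") == 0 ? "" : "/";
    n = snprintf(buf, cap, "%s%s%s.%s.libvirt", partition, sep, name, driver);
    return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}
```

Keeping the default-selection step separate from path construction reflects the startup flag in the patch: a guest started under the old layout keeps its existing cgroup when libvirtd reconnects to it, instead of being silently re-homed under /system.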

From: "Daniel P. Berrange" <berrange@redhat.com>

The virCgroupNewDriver method had a 'bool privileged' param.
If a false value was ever passed in, it would simply not
work, since non-root users don't have any privileges to create
new cgroups. Just delete this broken code entirely and make
the QEMU driver skip cgroup setup in non-privileged mode

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
---
 src/lxc/lxc_cgroup.c   |  1 -
 src/qemu/qemu_cgroup.c |  4 +++-
 src/util/vircgroup.c   | 27 +++------------------------
 src/util/vircgroup.h   |  1 -
 tests/vircgrouptest.c  | 12 ++++++------
 5 files changed, 12 insertions(+), 33 deletions(-)

diff --git a/src/lxc/lxc_cgroup.c b/src/lxc/lxc_cgroup.c
index 8f19057..0a43b61 100644
--- a/src/lxc/lxc_cgroup.c
+++ b/src/lxc/lxc_cgroup.c
@@ -581,7 +581,6 @@ virCgroupPtr virLXCCgroupCreate(virDomainDefPtr def, bool startup)
     } else {
         rc = virCgroupNewDriver("lxc",
                                 true,
-                                true,
                                 -1,
                                 &parent);
         if (rc != 0) {
diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c
index db9aafe..a6c8638 100644
--- a/src/qemu/qemu_cgroup.c
+++ b/src/qemu/qemu_cgroup.c
@@ -196,6 +196,9 @@ int qemuInitCgroup(virQEMUDriverPtr driver,
     virCgroupPtr parent = NULL;
     virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver);
 
+    if (!cfg->privileged)
+        goto done;
+
     virCgroupFree(&priv->cgroup);
 
     if (!vm->def->resource && startup) {
@@ -256,7 +259,6 @@ int qemuInitCgroup(virQEMUDriverPtr driver,
         }
     } else {
         rc = virCgroupNewDriver("qemu",
-                                cfg->privileged,
                                 true,
                                 cfg->cgroupControllers,
                                 &parent);
diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c
index 40e0fe6..6202614 100644
--- a/src/util/vircgroup.c
+++ b/src/util/vircgroup.c
@@ -794,8 +794,7 @@ err:
     return rc;
 }
 
-static int virCgroupAppRoot(bool privileged,
-                            virCgroupPtr *group,
+static int virCgroupAppRoot(virCgroupPtr *group,
                             bool create,
                             int controllers)
 {
@@ -807,26 +806,7 @@ static int virCgroupAppRoot(bool privileged,
     if (rc != 0)
         return rc;
 
-    if (privileged) {
-        rc = virCgroupNew("libvirt", selfgrp, controllers, group);
-    } else {
-        char *rootname;
-        char *username;
-        username = virGetUserName(getuid());
-        if (!username) {
-            rc = -ENOMEM;
-            goto cleanup;
-        }
-        rc = virAsprintf(&rootname, "libvirt-%s", username);
-        VIR_FREE(username);
-        if (rc < 0) {
-            rc = -ENOMEM;
-            goto cleanup;
-        }
-
-        rc = virCgroupNew(rootname, selfgrp, controllers, group);
-        VIR_FREE(rootname);
-    }
+    rc = virCgroupNew("libvirt", selfgrp, controllers, group);
     if (rc != 0)
         goto cleanup;
 
@@ -1135,7 +1115,6 @@ int virCgroupNewPartition(const char *path ATTRIBUTE_UNUSED,
  */
 #if defined HAVE_MNTENT_H && defined HAVE_GETMNTENT_R
 int virCgroupNewDriver(const char *name,
-                       bool privileged,
                        bool create,
                        int controllers,
                        virCgroupPtr *group)
@@ -1143,7 +1122,7 @@ int virCgroupNewDriver(const char *name,
     int rc;
     virCgroupPtr rootgrp = NULL;
 
-    rc = virCgroupAppRoot(privileged, &rootgrp,
+    rc = virCgroupAppRoot(&rootgrp,
                           create, controllers);
     if (rc != 0)
         goto out;
diff --git a/src/util/vircgroup.h b/src/util/vircgroup.h
index 33f86a6..936e09b 100644
--- a/src/util/vircgroup.h
+++ b/src/util/vircgroup.h
@@ -51,7 +51,6 @@ int virCgroupNewPartition(const char *path,
     ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(4);
 
 int virCgroupNewDriver(const char *name,
-                       bool privileged,
                        bool create,
                        int controllers,
                        virCgroupPtr *group)
diff --git a/tests/vircgrouptest.c b/tests/vircgrouptest.c
index a806368..4f76a06 100644
--- a/tests/vircgrouptest.c
+++ b/tests/vircgrouptest.c
@@ -138,13 +138,13 @@ static int testCgroupNewForDriver(const void *args ATTRIBUTE_UNUSED)
         [VIR_CGROUP_CONTROLLER_BLKIO] = "/libvirt/lxc",
     };
 
-    if ((rv = virCgroupNewDriver("lxc", true, false, -1, &cgroup)) != -ENOENT) {
+    if ((rv = virCgroupNewDriver("lxc", false, -1, &cgroup)) != -ENOENT) {
         fprintf(stderr, "Unexpected found LXC cgroup: %d\n", -rv);
         goto cleanup;
     }
 
     /* Asking for impossible combination since CPU is co-mounted */
-    if ((rv = virCgroupNewDriver("lxc", true, true,
+    if ((rv = virCgroupNewDriver("lxc", true,
                                  (1 << VIR_CGROUP_CONTROLLER_CPU),
                                  &cgroup)) != -EINVAL) {
         fprintf(stderr, "Should not have created LXC cgroup: %d\n", -rv);
@@ -152,7 +152,7 @@ static int testCgroupNewForDriver(const void *args ATTRIBUTE_UNUSED)
     }
 
     /* Asking for impossible combination since devices is not mounted */
-    if ((rv = virCgroupNewDriver("lxc", true, true,
+    if ((rv = virCgroupNewDriver("lxc", true,
                                  (1 << VIR_CGROUP_CONTROLLER_DEVICES),
                                  &cgroup)) != -ENOENT) {
         fprintf(stderr, "Should not have created LXC cgroup: %d\n", -rv);
@@ -160,7 +160,7 @@ static int testCgroupNewForDriver(const void *args ATTRIBUTE_UNUSED)
     }
 
     /* Asking for small combination since devices is not mounted */
-    if ((rv = virCgroupNewDriver("lxc", true, true,
+    if ((rv = virCgroupNewDriver("lxc", true,
                                  (1 << VIR_CGROUP_CONTROLLER_CPU) |
                                  (1 << VIR_CGROUP_CONTROLLER_CPUACCT) |
                                  (1 << VIR_CGROUP_CONTROLLER_MEMORY),
@@ -171,7 +171,7 @@ static int testCgroupNewForDriver(const void *args ATTRIBUTE_UNUSED)
     ret = validateCgroup(cgroup, "libvirt/lxc", mountsSmall, placementSmall);
     virCgroupFree(&cgroup);
 
-    if ((rv = virCgroupNewDriver("lxc", true, true, -1, &cgroup)) != 0) {
+    if ((rv = virCgroupNewDriver("lxc", true, -1, &cgroup)) != 0) {
         fprintf(stderr, "Cannot create LXC cgroup: %d\n", -rv);
         goto cleanup;
     }
@@ -199,7 +199,7 @@ static int testCgroupNewForDriverDomain(const void *args ATTRIBUTE_UNUSED)
         [VIR_CGROUP_CONTROLLER_BLKIO] = "/libvirt/lxc/wibble",
     };
 
-    if ((rv = virCgroupNewDriver("lxc", true, false, -1, &drivercgroup)) != 0) {
+    if ((rv = virCgroupNewDriver("lxc", false, -1, &drivercgroup)) != 0) {
         fprintf(stderr, "Cannot find LXC cgroup: %d\n", -rv);
         goto cleanup;
     }
-- 
1.8.1.4

On 04/04/2013 03:40 PM, Daniel P. Berrange wrote:
> /sys/fs/cgroup
> ├── cpu,cpuacct
> │   ├── libvirt
> │   │   ├── lxc
> │   │   │   └── busy
> │   │   └── qemu
> │   │       └── vm1
> │   │           ├── emulator
> │   │           └── vcpu0
It's somewhat off-topic, but if you do a rework you might also consider a conceptual change wrt $domain/emulator and $domain/vcpu* ...

Just today I was confronted with a race in qemuSetupCgroupForEmulator/virCgroupMoveTask on a highly utilized system. The problem is that if a QEMU thread terminates during the move from $domain/tasks to $domain/emulator/tasks, the virCgroupAddTaskController call will fail, resulting in a failure to start the domain. Another possible issue is that if new QEMU threads are spawned after the virCgroupGetValueStr call, they will not be moved.

So, since the threads in $domain/tasks are 'hypervisor' threads anyway, shouldn't we get rid of the emulator directory altogether?

-- 
Kind Regards
Viktor Mihajlovski

IBM Deutschland Research & Development GmbH
Chairman of the Supervisory Board: Martina Köderitz
Management: Dirk Wittkopp
Registered office: Böblingen
Registration court: Amtsgericht Stuttgart, HRB 243294

On Thu, Apr 04, 2013 at 06:38:29PM +0200, Viktor Mihajlovski wrote:
> On 04/04/2013 03:40 PM, Daniel P. Berrange wrote:
>> /sys/fs/cgroup
>> ├── cpu,cpuacct
>> │   ├── libvirt
>> │   │   ├── lxc
>> │   │   │   └── busy
>> │   │   └── qemu
>> │   │       └── vm1
>> │   │           ├── emulator
>> │   │           └── vcpu0
>
> It's somewhat off-topic, but if you do a rework you might also consider a conceptual change wrt $domain/emulator and $domain/vcpu* ...
>
> Just today I was confronted with a race in qemuSetupCgroupForEmulator/virCgroupMoveTask on a highly utilized system. The problem is that if a QEMU thread terminates during the move from $domain/tasks to $domain/emulator/tasks, the virCgroupAddTaskController call will fail, resulting in a failure to start the domain. Another possible issue is that if new QEMU threads are spawned after the virCgroupGetValueStr call, they will not be moved.
This seems easy enough to fix. Instead of starting in $domain/ and moving them to $domain/emulator, we should just start QEMU in the $domain/emulator directory right away.
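
A minimal sketch of that idea, assuming cgroup v1 semantics where writing a PID to a group's `tasks` file places the task in that group. The child joins the target directory between fork() and exec(), so threads created later start out in the same group and nothing has to be moved afterwards. The names `cg_join_self` and `cg_spawn` are hypothetical helpers for illustration, not libvirt functions.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Write the calling process's PID into <cgroup_dir>/tasks.
 * Returns 0 on success, -1 on failure. (Hypothetical helper.) */
int cg_join_self(const char *cgroup_dir)
{
    char path[4096];
    int n;
    int ok;
    FILE *fp;

    assert(cgroup_dir != NULL);

    n = snprintf(path, sizeof(path), "%s/tasks", cgroup_dir);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;

    fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    ok = fprintf(fp, "%ld\n", (long)getpid()) > 0;
    if (fclose(fp) != 0)
        ok = 0;

    return ok ? 0 : -1;
}

/* Fork, place the child in cgroup_dir, then exec argv[0].
 * Returns the child's PID in the parent, or -1 if fork() failed.
 * (Hypothetical helper.) */
pid_t cg_spawn(const char *cgroup_dir, char *const argv[])
{
    pid_t pid = fork();

    if (pid == 0) {
        /* Child: join the group before any other thread exists. */
        if (cg_join_self(cgroup_dir) < 0)
            _exit(EXIT_FAILURE);
        execvp(argv[0], argv);
        _exit(127); /* exec failed */
    }
    return pid;
}
```

With this ordering there is no window in which a thread can exit or appear while tasks are being moved, because no move takes place.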
> So, since the threads in $domain/tasks are 'hypervisor' threads anyway, shouldn't we get rid of the emulator directory altogether?
The problem is that any controls you apply at the $domain/ level will also affect the $domain/vcpuNN levels. To get controls that are guaranteed to only affect the emulator threads, you need them to be in a directory that is parallel to the vcpuN directories, not a parent of them.
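
To illustrate the point, here is a simplified model, not kernel or libvirt code: it assumes a controller whose effective limit for a group is the tightest limit set on that group or any of its ancestors. The `cg_node` type and the `cg_effective_limit` and `cg_demo` functions are hypothetical names used only for this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define CG_NO_LIMIT (-1L)

/* Hypothetical model of one cgroup directory. */
struct cg_node {
    const char *name;
    const struct cg_node *parent;
    long limit;                 /* CG_NO_LIMIT when nothing is set */
};

/* Effective limit: the smallest limit found walking up to the root. */
long cg_effective_limit(const struct cg_node *node)
{
    long eff = CG_NO_LIMIT;

    for (; node != NULL; node = node->parent) {
        if (node->limit != CG_NO_LIMIT &&
            (eff == CG_NO_LIMIT || node->limit < eff))
            eff = node->limit;
    }
    return eff;
}

void cg_demo(void)
{
    struct cg_node vm1      = { "vm1",      NULL, CG_NO_LIMIT };
    struct cg_node emulator = { "emulator", &vm1, 50 };
    struct cg_node vcpu0    = { "vcpu0",    &vm1, CG_NO_LIMIT };

    /* A limit on the sibling "emulator" group leaves vcpu0 untouched. */
    assert(cg_effective_limit(&emulator) == 50);
    assert(cg_effective_limit(&vcpu0) == CG_NO_LIMIT);

    /* The same limit placed on the parent also constrains vcpu0. */
    vm1.limit = 50;
    assert(cg_effective_limit(&vcpu0) == 50);
}
```

In this model a limit set on vm1 always reaches vcpu0, while a limit set on emulator never does, which is why the emulator threads need their own sibling directory.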
-- 
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 04/04/2013 06:41 PM, Daniel P. Berrange wrote:
> The problem is that any controls you apply at the $domain/ level, will also affect the $domain/vcpuNN levels. To get controls that are guaranteed to only affect the emulator threads, you need them to be in a directory that is parallel to the vcpuN directories, not a parent of them.
Probably not for vcpuN (since we set them explicitly), but maybe for a future third kind of processes/threads, so I agree.

-- 
Kind Regards
Viktor Mihajlovski
participants (4): Daniel P. Berrange, Eric Blake, Ján Tomko, Viktor Mihajlovski