[libvirt] [PATCH 0/2] Fix cgroupv2 issue on Fedora 31

There's been a bunch of reports of VM startup failures on Fedora 31 and elsewhere: https://bugzilla.redhat.com/show_bug.cgi?id=1751120 The reproducer I found is running a mock build, which adds a name=systemd cgroup to /proc/self/cgroup, which messes up libvirt's logic. The first patch has a bit more detail. I'm not positive this is entirely correct... maybe we should be doing something with that cgroup in some way? I'm kinda ignorant here. I also filed a systemd bug incase that behavior is unintentional. We probably want to change libvirt regardless though https://bugzilla.redhat.com/show_bug.cgi?id=1756143 Cole Robinson (2): vircgroupv2: Fix VM startup when legacy cgroups are defined vircgroup: Add some VIR_DEBUG statements src/util/vircgroup.c | 3 ++- src/util/vircgroupv2.c | 15 ++++++++++++++- 2 files changed, 16 insertions(+), 2 deletions(-) -- 2.23.0

On Fedora 31, starting a 'mock' build alters /proc/$pid/cgroup, probably due to usage of systemd-nspawn. Before: $ cat /proc/self/cgroup 0::/user.slice/user-1000.slice/... After: $ cat /proc/self/cgroup 1:name=systemd:/ 0::/user.slice/user-1000.slice/... The cgroupv2 code mishandles that first line in the second case, which causes VM startup to fail with: Unable to read from '/sys/fs/cgroup/machine/cgroup.controllers': No such file or directory The kernel docs[1] say that the cgroupv2 path will always start with '0::', which in the code here controllers="". Only set the v2 placement path when we see that cgroup file entry. [1] https://www.kernel.org/doc/html/v5.3/admin-guide/cgroup-v2.html#processes https://bugzilla.redhat.com/show_bug.cgi?id=1751120 Signed-off-by: Cole Robinson <crobinso@redhat.com> --- src/util/vircgroupv2.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 0663c67190..0cb20e4896 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -194,12 +194,16 @@ virCgroupV2DetectMounts(virCgroupPtr group, static int virCgroupV2DetectPlacement(virCgroupPtr group, const char *path, - const char *controllers ATTRIBUTE_UNUSED, + const char *controllers, const char *selfpath) { if (group->unified.placement) return 0; + /* controllers="" indicates the cgroupv2 controller path */ + if (STRNEQ_NULLABLE(controllers, "")) + return 0; + /* * selfpath == "/" + path="" -> "/" * selfpath == "/libvirt.service" + path == "" -> "/libvirt.service" -- 2.23.0

On Thu, Sep 26, 2019 at 05:18:40PM -0400, Cole Robinson wrote:
On Fedora 31, starting a 'mock' build alters /proc/$pid/cgroup, probably due to usage of systemd-nspawn.
Before: $ cat /proc/self/cgroup 0::/user.slice/user-1000.slice/...
After: $ cat /proc/self/cgroup 1:name=systemd:/ 0::/user.slice/user-1000.slice/...
The cgroupv2 code mishandles that first line in the second case, which causes VM startup to fail with: Unable to read from '/sys/fs/cgroup/machine/cgroup.controllers': No such file or directory
The kernel docs[1] say that the cgroupv2 path will always start with '0::', which in the code here controllers="". Only set the v2 placement path when we see that cgroup file entry.
[1] https://www.kernel.org/doc/html/v5.3/admin-guide/cgroup-v2.html#processes
https://bugzilla.redhat.com/show_bug.cgi?id=1751120
Signed-off-by: Cole Robinson <crobinso@redhat.com> --- src/util/vircgroupv2.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 0663c67190..0cb20e4896 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -194,12 +194,16 @@ virCgroupV2DetectMounts(virCgroupPtr group, static int virCgroupV2DetectPlacement(virCgroupPtr group, const char *path, - const char *controllers ATTRIBUTE_UNUSED, + const char *controllers, const char *selfpath) { if (group->unified.placement) return 0;
+ /* controllers="" indicates the cgroupv2 controller path */ + if (STRNEQ_NULLABLE(controllers, "")) + return 0;
Fixes the issue reported by several users and looks reasonable. I would use /* controllers == "" ...*/ to mach the following comment and there is no need to use STRNEQ_NULLABLE as the caller already checks if !controllers. Reviewed-by: Pavel Hrdina <phrdina@redhat.com>

These helped with debugging https://bugzilla.redhat.com/show_bug.cgi?id=1612383 Signed-off-by: Cole Robinson <crobinso@redhat.com> --- src/util/vircgroup.c | 3 ++- src/util/vircgroupv2.c | 9 +++++++++ 2 files changed, 11 insertions(+), 1 deletion(-) diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c index 825f62a97b..4f9d80666d 100644 --- a/src/util/vircgroup.c +++ b/src/util/vircgroup.c @@ -1157,7 +1157,8 @@ virCgroupNewMachineSystemd(const char *name, virCgroupFree(&init); if (!path || STREQ(path, "/") || path[0] != '/') { - VIR_DEBUG("Systemd didn't setup its controller"); + VIR_DEBUG("Systemd didn't setup its controller, path=%s", + NULLSTR(path)); return -2; } diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 0cb20e4896..f3f83c1e95 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -155,10 +155,14 @@ virCgroupV2CopyPlacement(virCgroupPtr group, const char *path, virCgroupPtr parent) { + VIR_DEBUG("group=%p path=%s parent=%p", group, path, parent); + if (path[0] == '/') { if (VIR_STRDUP(group->unified.placement, path) < 0) return -1; } else { + VIR_DEBUG("parent->unified.placement=%s", parent->unified.placement); + /* * parent == "/" + path="" => "/" * parent == "/libvirt.service" + path == "" => "/libvirt.service" @@ -172,6 +176,7 @@ virCgroupV2CopyPlacement(virCgroupPtr group, return -1; } + VIR_DEBUG("set group->unified.placement=%s", group->unified.placement); return 0; } @@ -200,6 +205,9 @@ virCgroupV2DetectPlacement(virCgroupPtr group, if (group->unified.placement) return 0; + VIR_DEBUG("group=%p path=%s controllers=%s selfpath=%s", + group, path, controllers, selfpath); + /* controllers="" indicates the cgroupv2 controller path */ if (STRNEQ_NULLABLE(controllers, "")) return 0; @@ -216,6 +224,7 @@ virCgroupV2DetectPlacement(virCgroupPtr group, path) < 0) return -1; + VIR_DEBUG("set group->unified.placement=%s", group->unified.placement); return 0; } -- 2.23.0

On Thu, Sep 26, 2019 at 05:18:41PM -0400, Cole Robinson wrote:
These helped with debugging https://bugzilla.redhat.com/show_bug.cgi?id=1612383
Signed-off-by: Cole Robinson <crobinso@redhat.com> --- src/util/vircgroup.c | 3 ++- src/util/vircgroupv2.c | 9 +++++++++ 2 files changed, 11 insertions(+), 1 deletion(-)
diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c index 825f62a97b..4f9d80666d 100644 --- a/src/util/vircgroup.c +++ b/src/util/vircgroup.c @@ -1157,7 +1157,8 @@ virCgroupNewMachineSystemd(const char *name, virCgroupFree(&init);
if (!path || STREQ(path, "/") || path[0] != '/') { - VIR_DEBUG("Systemd didn't setup its controller"); + VIR_DEBUG("Systemd didn't setup its controller, path=%s", + NULLSTR(path)); return -2; }
diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 0cb20e4896..f3f83c1e95 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -155,10 +155,14 @@ virCgroupV2CopyPlacement(virCgroupPtr group, const char *path, virCgroupPtr parent) { + VIR_DEBUG("group=%p path=%s parent=%p", group, path, parent); + if (path[0] == '/') { if (VIR_STRDUP(group->unified.placement, path) < 0) return -1; } else { + VIR_DEBUG("parent->unified.placement=%s", parent->unified.placement); +
This one seems a bit redundant as the parent->unified.placement will be part of the group->unified.placement together with path.
/* * parent == "/" + path="" => "/" * parent == "/libvirt.service" + path == "" => "/libvirt.service" @@ -172,6 +176,7 @@ virCgroupV2CopyPlacement(virCgroupPtr group, return -1; }
+ VIR_DEBUG("set group->unified.placement=%s", group->unified.placement); return 0; }
@@ -200,6 +205,9 @@ virCgroupV2DetectPlacement(virCgroupPtr group, if (group->unified.placement) return 0;
+ VIR_DEBUG("group=%p path=%s controllers=%s selfpath=%s", + group, path, controllers, selfpath); + /* controllers="" indicates the cgroupv2 controller path */ if (STRNEQ_NULLABLE(controllers, "")) return 0; @@ -216,6 +224,7 @@ virCgroupV2DetectPlacement(virCgroupPtr group, path) < 0) return -1;
+ VIR_DEBUG("set group->unified.placement=%s", group->unified.placement); return 0;
When I was writing the code I had a lot of debug messages around the code but before posting the patches I removed a lot of it in order to not spam debug logs with everything. Finding the sweet spot is difficult, the function entry debug logs are definitely good and can help a lot to track the code workflow, but I'm not so sure about the debug logs at the end of functions to log what was configured. I'll leave it up to you whether you want to keep it or not. Reviewed-by: Pavel Hrdina <phrdina@redhat.com>

On 9/27/19 7:03 AM, Pavel Hrdina wrote:
On Thu, Sep 26, 2019 at 05:18:41PM -0400, Cole Robinson wrote:
These helped with debugging https://bugzilla.redhat.com/show_bug.cgi?id=1612383
Signed-off-by: Cole Robinson <crobinso@redhat.com> --- src/util/vircgroup.c | 3 ++- src/util/vircgroupv2.c | 9 +++++++++ 2 files changed, 11 insertions(+), 1 deletion(-)
diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c index 825f62a97b..4f9d80666d 100644 --- a/src/util/vircgroup.c +++ b/src/util/vircgroup.c @@ -1157,7 +1157,8 @@ virCgroupNewMachineSystemd(const char *name, virCgroupFree(&init);
if (!path || STREQ(path, "/") || path[0] != '/') { - VIR_DEBUG("Systemd didn't setup its controller"); + VIR_DEBUG("Systemd didn't setup its controller, path=%s", + NULLSTR(path)); return -2; }
diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 0cb20e4896..f3f83c1e95 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -155,10 +155,14 @@ virCgroupV2CopyPlacement(virCgroupPtr group, const char *path, virCgroupPtr parent) { + VIR_DEBUG("group=%p path=%s parent=%p", group, path, parent); + if (path[0] == '/') { if (VIR_STRDUP(group->unified.placement, path) < 0) return -1; } else { + VIR_DEBUG("parent->unified.placement=%s", parent->unified.placement); +
This one seems a bit redundant as the parent->unified.placement will be part of the group->unified.placement together with path.
/* * parent == "/" + path="" => "/" * parent == "/libvirt.service" + path == "" => "/libvirt.service" @@ -172,6 +176,7 @@ virCgroupV2CopyPlacement(virCgroupPtr group, return -1; }
+ VIR_DEBUG("set group->unified.placement=%s", group->unified.placement); return 0; }
@@ -200,6 +205,9 @@ virCgroupV2DetectPlacement(virCgroupPtr group, if (group->unified.placement) return 0;
+ VIR_DEBUG("group=%p path=%s controllers=%s selfpath=%s", + group, path, controllers, selfpath); + /* controllers="" indicates the cgroupv2 controller path */ if (STRNEQ_NULLABLE(controllers, "")) return 0; @@ -216,6 +224,7 @@ virCgroupV2DetectPlacement(virCgroupPtr group, path) < 0) return -1;
+ VIR_DEBUG("set group->unified.placement=%s", group->unified.placement); return 0;
When I was writing the code I had a lot of debug messages around the code but before posting the patches I removed a lot of it in order to not spam debug logs with everything. Finding the sweet spot is difficult, the function entry debug logs are definitely good and can help a lot to track the code workflow, but I'm not so sure about the debug logs at the end of functions to log what was configured. I'll leave it up to you whether you want to keep it or not.
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
I fixed your suggestions and dropped the debug statements at the end of functions, I think the entry ones are sufficient Thanks, Cole - Cole
participants (2)
-
Cole Robinson
-
Pavel Hrdina