On Wed, May 22, 2019 at 05:16:38PM -0600, Jim Fehlig wrote:
Hi All,
I recently received an internal bug report of a VM "crashing" after hitting
thread limits. It seems an assert fired in pthread_create within the VM
when hitting the limit enforced by the pids controller on the host:
Apr 28 07:45:46 lpcomp02007 kernel: cgroup: fork rejected by pids controller
in /machine.slice/machine-qemu\x2d90028\x2dinstance\x2d0000634b.scope
The user has TasksMax set to infinity in machine.slice, but apparently that
setting is not inherited by child scopes, where the limit appears to be
hardcoded to 16384:
https://github.com/systemd/systemd/blob/51aba17b88617515e037e8985d3a4ea87...
The TasksMax property can be set when creating the machine, as is done in
the attached proof-of-concept patch. The question is whether this should be
a tunable. My initial thought on seeing the report was that TasksMax could
be calculated based on the number of vcpus, iothreads, emulator threads,
etc., but it appears that could be quite tricky. The following mail thread
describes the basic scenario encountered by my user:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-March/008174.html
As you can see, many rbd images attached to a VM can result in an awful lot
of threads; 300 images could result in 720K threads! We could punt and set
the limit to infinity, but it exists for a reason: fork-bomb prevention. A
potential compromise between a hardcoded value and a per-VM tunable is a
driver tunable in qemu.conf. If a per-VM tunable is preferred, suggestions
on where to place it and what to call it would be much appreciated :-).
Yeah, RBD is problematic as you can't predict how many threads it will
use.
We currently have a "max_processes" setting in qemu.conf for the base
process ulimit. This applies to the user as a whole though, not the
cgroup.
On Fedora we don't seem to have any "tasks_max" cgroup setting or TasksMax
systemd setting, at least when running with cgroups v1, so we can't set that
unconditionally.
I'd be inclined to have a new qemu.conf setting "max_tasks". If this is
set to 0, then we should just set TasksMax to infinity, otherwise honour
the setting.
From 0583ee3b26b2ee43efe8d25226eceb8547400d97 Mon Sep 17 00:00:00 2001
From: Jim Fehlig <jfehlig(a)suse.com>
Date: Wed, 22 May 2019 17:12:14 -0600
Subject: [PATCH] systemd: set TasksMax when calling CreateMachine
An example of how to set TasksMax when creating a scope for a machine.
Signed-off-by: Jim Fehlig <jfehlig(a)suse.com>
---
src/util/virsystemd.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/src/util/virsystemd.c b/src/util/virsystemd.c
index 3f03e3bd63..6177447bdb 100644
--- a/src/util/virsystemd.c
+++ b/src/util/virsystemd.c
@@ -341,10 +341,11 @@ int virSystemdCreateMachine(const char *name,
                               (unsigned int)pidleader,
                               NULLSTR_EMPTY(rootdir),
                               nnicindexes, nicindexes,
-                              3,
+                              4,
                               "Slice", "s", slicename,
                               "After", "as", 1, "libvirtd.service",
-                              "Before", "as", 1, "virt-guest-shutdown.target") < 0)
+                              "Before", "as", 1, "virt-guest-shutdown.target",
+                              "TasksMax", "t", UINT64_C(32768)) < 0)
         goto cleanup;

     if (error.level == VIR_ERR_ERROR) {
@@ -382,10 +383,11 @@ int virSystemdCreateMachine(const char *name,
                               iscontainer ? "container" : "vm",
                               (unsigned int)pidleader,
                               NULLSTR_EMPTY(rootdir),
-                              3,
+                              4,
                               "Slice", "s", slicename,
                               "After", "as", 1, "libvirtd.service",
-                              "Before", "as", 1, "virt-guest-shutdown.target") < 0)
+                              "Before", "as", 1, "virt-guest-shutdown.target",
+                              "TasksMax", "t", UINT64_C(32768)) < 0)
             goto cleanup;
     }
--
2.21.0
Regards,
Daniel