
Hi All,

I recently received an internal bug report of a VM "crashing" due to hitting thread limits. It seems there was an assert in pthread_create within the VM when it hit the limit enforced by the pids controller on the host:

  Apr 28 07:45:46 lpcomp02007 kernel: cgroup: fork rejected by pids controller in /machine.slice/machine-qemu\x2d90028\x2dinstance\x2d0000634b.scope

The user has TasksMax set to infinity in machine.slice, but apparently that is not inherited by child scopes, whose limit appears to be hardcoded to 16384:

  https://github.com/systemd/systemd/blob/51aba17b88617515e037e8985d3a4ea871ac...

The TasksMax property can be set when creating the machine, as is done in the attached proof of concept patch. The question is whether this should be a tunable.

My initial thought when seeing the report was that TasksMax could be calculated from the number of vcpus, iothreads, emulator threads, etc., but it appears that could be quite tricky. The following mail thread describes the basic scenario encountered by my user:

  http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-March/008174.html

As you can see, many rbd images attached to a VM can result in an awful lot of threads. 300 images could result in 720K threads!

We could punt and set the limit to infinity, but it exists for a reason: fork bomb prevention. A potential compromise between a hardcoded value and a per-VM tunable is a driver tunable in qemu.conf. If a per-VM tunable is preferred, suggestions on where to place it and what to call it would be much appreciated :-).

Regards,
Jim
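
For a rough sense of scale, the figures quoted above can be put side by side. This is only a back-of-the-envelope sketch: it assumes the thread count grows linearly with the number of attached rbd images, which is a simplification (the real count depends on the Ceph client configuration), and it ignores vcpu, iothread, and emulator threads. The function and constant names are illustrative only.

```python
# Figures quoted in this mail; linear scaling per image is an assumption.
TASKS_MAX_DEFAULT = 16384   # apparent hardcoded limit on the machine scope
EXAMPLE_IMAGES = 300        # rbd images in the quoted example
EXAMPLE_THREADS = 720_000   # threads quoted for that many images


def threads_per_image(images, threads):
    """Average threads per image, assuming linear scaling."""
    return threads // images


def max_images_under_limit(limit, per_image):
    """How many images fit before the task limit is exceeded."""
    return limit // per_image


per_image = threads_per_image(EXAMPLE_IMAGES, EXAMPLE_THREADS)
fit = max_images_under_limit(TASKS_MAX_DEFAULT, per_image)
print(f"~{per_image} threads per image; "
      f"about {fit} images fit under TasksMax={TASKS_MAX_DEFAULT}")
```

Under these assumptions only a handful of images fit under the current limit, which is why a value derived purely from vcpus and iothreads would likely fall short for this kind of workload.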