On Fri, Jul 09, 2021 at 11:25:37AM -0500, Mike Christie wrote:
Hi,
The goal of this email is to figure out how we want to track/limit the
number of kernel threads created by vhost devices.
Background:
-----------
For vhost-scsi, we've hit an issue where the single vhost worker thread can't
handle all the IO being sent from multiple queues. IOPs is stuck at around
500K. To fix this, we did this patchset:
https://lore.kernel.org/linux-scsi/20210525180600.6349-1-michael.christie...
which allows userspace to create N threads and map them to a dev's virtqueues.
With this we can get around 1.4M IOPs.
Problem:
--------
While those patches were being reviewed, a concern about tracking all these
new possible threads was raised here:
https://lore.kernel.org/linux-scsi/YL45CfpHyzSEcAJv@stefanha-x1.localdomain/
To save you some time, the question is what does other kernel code using the
kthread API do to track the number of kernel threads created on behalf of
a userspace thread. The answer is they don't do anything so we will have to
add that code.
I started to do that here:
https://lkml.org/lkml/2021/6/23/1233
where those patches would charge/check the vhost device owner's RLIMIT_NPROC
value. But the question of whether we really want to do this has come up,
which is why I'm bugging lists like libvirt now.
Question/Solution:
------------------
I'm bugging everyone so we can figure out:
Do we need to specifically track the number of kernel threads being made
for the vhost kernel use case by the RLIMIT_NPROC limit?
Or is it ok to limit the number of devices with the RLIMIT_NOFILE limit,
where each device then has a limit on the number of threads it can create?
Do we want to add an interface where an unprivileged userspace process
can create large numbers of kthreads? The number is indirectly bounded
by RLIMIT_NOFILE * num_virtqueues, but there is no practical way to
use that rlimit since num_virtqueues varies across vhost devices and
RLIMIT_NOFILE might need to have a specific value to control file
descriptors.
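
To make the indirect bound concrete, here is a rough userspace sketch of the
arithmetic. It is not code from the patches: max_vhost_kthreads() and the
virtqueue counts are made-up names and values for illustration, and it assumes
the worst case of one worker thread per virtqueue on every open device fd.

```python
import resource


def max_vhost_kthreads(nofile_limit, num_virtqueues):
    """Worst-case worker thread count, assuming every fd allowed by the
    limit is a vhost device with one thread per virtqueue (assumption)."""
    if nofile_limit == resource.RLIM_INFINITY:
        return None  # no finite bound at all
    return nofile_limit * num_virtqueues


# Read the soft RLIMIT_NOFILE of the current process.
soft_nofile, _hard_nofile = resource.getrlimit(resource.RLIMIT_NOFILE)

# Hypothetical virtqueue counts; real devices differ widely.
for name, nvq in (("small device", 2), ("large device", 128)):
    print(name, max_vhost_kthreads(soft_nofile, nvq))
```

The same RLIMIT_NOFILE value gives bounds that differ by the ratio of the
virtqueue counts, so it cannot be tuned to cap threads without also changing
how many file descriptors the process may open.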
io_uring worker threads are limited by RLIMIT_NPROC. I think it makes
sense in vhost too where the device instance is owned by a specific
userspace process and can be accounted against that process' rlimit.
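
As a toy model of what "accounted against that process' rlimit" could mean
(this is only a sketch; OwnerThreadAccount and its methods are hypothetical
and do not reflect how the kernel, io_uring, or the linked patches implement
the check), thread creation is refused once the owner's count would exceed an
RLIMIT_NPROC-like limit:

```python
class OwnerThreadAccount:
    """Toy per-owner counter checked against an RLIMIT_NPROC-like limit."""

    def __init__(self, nproc_limit, in_use=0):
        self.nproc_limit = nproc_limit  # limit of the owning process
        self.in_use = in_use            # tasks already charged to the owner

    def charge(self, count=1):
        """Charge `count` new threads; refuse if the limit would be exceeded."""
        if self.in_use + count > self.nproc_limit:
            return False
        self.in_use += count
        return True

    def uncharge(self, count=1):
        """Release `count` threads when workers exit."""
        self.in_use = max(0, self.in_use - count)


# Owner already has 2 tasks and a limit of 4: the third new worker is refused.
owner = OwnerThreadAccount(nproc_limit=4, in_use=2)
results = [owner.charge() for _ in range(3)]
print(results)  # [True, True, False]
```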
I don't have a specific use case other than that I think vhost should be
safe and well-behaved.
Stefan