Hi,
The goal of this email is to figure out how we want to track/limit the
number of kernel threads created by vhost devices.
Background:
-----------
For vhost-scsi, we've hit an issue where the single vhost worker thread can't
handle all the IO being sent from multiple queues. IOPs is stuck at around
500K. To fix this, we did this patchset:
https://lore.kernel.org/linux-scsi/20210525180600.6349-1-michael.christie...
which allows userspace to create N threads and map them to a dev's virtqueues.
With this we can get around 1.4M IOPs.
Problem:
--------
While those patches were being reviewed, a concern about tracking all these
new possible threads was raised here:
https://lore.kernel.org/linux-scsi/YL45CfpHyzSEcAJv@stefanha-x1.localdomain/
To save you some time, the question is what other kernel code using the
kthread API does to track the number of kernel threads created on behalf of
a userspace thread. The answer is that it doesn't do anything, so we will have
to add that code.
I started to do that here:
https://lkml.org/lkml/2021/6/23/1233
where those patches would charge/check the vhost device owner's RLIMIT_NPROC
value. But the question of whether we really want to do this has come up, which
is why I'm bugging lists like libvirt now.
Question/Solution:
------------------
I'm bugging everyone so we can figure out:
Do we need to specifically track the number of kernel threads being made
for the vhost kernel use case against the RLIMIT_NPROC limit?
Or is it ok to limit the number of devices with the RLIMIT_NOFILE limit,
where each device then has a limit on the number of threads it can create?
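
To make the second option a bit more concrete, here is a rough userspace
sketch in Python of the worst-case bound it would imply. It is only an
illustration: MAX_THREADS_PER_DEV is a made-up number and worst_case_threads()
is a hypothetical helper, neither comes from the patches. It also assumes the
worst case where every fd the process may open is a vhost device.

```python
import resource

# Hypothetical per-device thread cap. The real value (if any) would be
# decided by the vhost patches; 8 is only a placeholder for illustration.
MAX_THREADS_PER_DEV = 8


def worst_case_threads(nofile_limit, per_dev_limit):
    """Upper bound on vhost kernel threads if every fd were a vhost device.

    Returns None when the fd limit is unlimited, since no bound exists then.
    """
    if nofile_limit == resource.RLIM_INFINITY:
        return None
    return nofile_limit * per_dev_limit


if __name__ == "__main__":
    # Soft limits of the calling process (Linux-specific RLIMIT_NPROC).
    nproc_soft, _ = resource.getrlimit(resource.RLIMIT_NPROC)
    nofile_soft, _ = resource.getrlimit(resource.RLIMIT_NOFILE)

    bound = worst_case_threads(nofile_soft, MAX_THREADS_PER_DEV)
    print("RLIMIT_NPROC soft limit: ", nproc_soft)
    print("RLIMIT_NOFILE soft limit:", nofile_soft)
    print("Worst-case vhost threads under option 2:", bound)
```

The point of the comparison is that with only RLIMIT_NOFILE in play, the
bound on kernel threads is the fd limit times the per-device cap, which can
end up well above what RLIMIT_NPROC would have allowed for the same owner.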