On Mon, Mar 08, 2021 at 02:11:56PM +0100, Andrea Bolognani wrote:
On Mon, 2021-03-08 at 10:54 +0000, Daniel P. Berrangé wrote:
> On Fri, Mar 05, 2021 at 08:13:47PM +0100, Andrea Bolognani wrote:
> > This feature has been requested by KubeVirt developers and will make
> > it possible for them to make some VFIO-related features, such as
> > migration and hotplug, work correctly.
> >
> >
https://bugzilla.redhat.com/show_bug.cgi?id=1916346
> >
> > The first part of the series, especially the first 9 patches, is
> > preparation work: it addresses a few annoying issues with our APIs
> > that deal with process limits, and makes them all nice, consistent
> > and easy to reason about while moving policy code from the generic
> > code to the QEMU driver where it belongs.
>
> IIUC, the code was already supposed to attempt to set the limits
> and gracefully carry on if it was unable todo so. Can you outline
> the actual problem being solved, as wading through the bug comments
> to understand the precise detail of the problem is not very clear.
Not being able to set the memory locking limit shouldn't be a soft
failure, because that might cause QEMU to start up apparently fine
and then error out later on during the lifetime of the VM, which is
much worse than not starting up at all. We're actually doing this
correctly for the most part, which is why hotplug requests for VFIO
devices in KubeVirt result in errors under certain conditions.
Yep, I agree we shoudln't have it be a mere warning. IIUC, we would
try to set the limit, and if we fail, then we query the current limit
to see if its satisfactory and report a fatal error if not.
The reason why VFIO device assignment is currently not completely
broken in KubeVirt is that, when the QEMU process is initially
started, we set the memory locking limit after fork(), so we can do
that using setrlimit() which doesn't require additional capabilities,
but in the hotplug scenario libvirtd needs to change the limits of a
different process: in that case we are forced to use prlimit(), which
fails due to the lack of CAP_SYS_RESOURCE.
Since you added code to parse existing limits from /proc, I'm wondering
if we can just do without the config option. Simply try to use prlimit
and if it fails, query existing limits to determine if we sould treat
the prlimit as fatal or ignore it. Overall I'd prefer libvirt to
"just work" out of the box rather than requiring people to know about
setting a "make-vfio-hotplug-work=yes" flag in the config file.
Regards,
Daniel
--
|:
https://berrange.com -o-
https://www.flickr.com/photos/dberrange :|
|:
https://libvirt.org -o-
https://fstop138.berrange.com :|
|:
https://entangle-photo.org -o-
https://www.instagram.com/dberrange :|