On Mon, 2017-03-13 at 14:24 -0400, Luiz Capitulino wrote:
> NB if we did enforce $RAM + $LARGE_NUMBER, then I'd suggest we did
> set a default hard_limit universally once more, not only set a mlock
> limit when using <locked/>. This would at least ensure we see consistent
> (bad) behaviour rather than have edge cases that only appeared when
> <locked/> was present.
Setting <hard_limit> limits not just the amount of memory
the QEMU process is allowed to lock, but also the amount of
memory it's allowed to allocate at all (through cgroups).
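
Just for reference, this is the element we're talking about (the
value here is made up):

  <memtune>
    <!-- with cgroup v1 this becomes the memory.limit_in_bytes of
         the guest's cgroup; if I'm reading the code right, when
         <memoryBacking><locked/> is used it's also what we pick
         as the memory locking limit for the QEMU process -->
    <hard_limit unit='KiB'>6291456</hard_limit>
  </memtune>
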
Is it fair to assume that QEMU will never *allocate* more
than 2 GiB (or whatever $LARGE_NUMBER we end up picking) in
addition to what's needed to fit the guest memory? I would
certainly hope so, but maybe there are use cases that require
more than that.
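
Just to make the math concrete, assuming we picked 2 GiB as
$LARGE_NUMBER, a 4 GiB guest would end up with

  <memory unit='KiB'>4194304</memory>             <!-- 4 GiB -->
  <memtune>
    <hard_limit unit='KiB'>6291456</hard_limit>   <!-- 4 GiB + 2 GiB -->
  </memtune>

and QEMU would run into that limit as soon as its overall
allocations went past 6 GiB.
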
Were you thinking about adding <hard_limit> automatically to
all guests where it's not already present, or just setting
the limits silently? The latter sounds like it would be very
opaque to users, so I'm guessing the former.
If we added <hard_limit> to the XML automatically, we would
have the problem of keeping it updated with changes to the
amount of guest memory... Or maybe we could expect users to
remove the <hard_limit> element, to get it regenerated, every
time they change the amount of guest memory?
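
Otherwise we'd end up with configurations like this one (numbers
made up) after the user bumps the guest memory, with the
automatically generated limit still reflecting the old size:

  <memory unit='KiB'>8388608</memory>             <!-- now 8 GiB -->
  <memtune>
    <hard_limit unit='KiB'>6291456</hard_limit>   <!-- still 4 + 2 GiB -->
  </memtune>
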
One nice side-effect of doing this unconditionally is that
we could get rid of some of the special-casing we are doing
for VFIO-using guests, especially saving and restoring the
memory locking limit when host devices are assigned and
unassigned.
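
(For context, that's the dance we currently perform whenever a
device along the lines of

  <hostdev mode='subsystem' type='pci' managed='yes'>
    <source>
      <address domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
    </source>
  </hostdev>

is attached to or detached from a running guest: raise the memory
locking limit on attach, restore the saved value on detach. The
PCI address above is of course made up. With an unconditional
<hard_limit> none of that juggling would be needed.)
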
> Makes me even more nervous, but I agree with your reasoning.
> 
> Btw, do we have a volunteer to do this work? Andrea?

Since I've already spent a significant amount of time
researching the issue, and wrote the patch that caused it
to rear its ugly head in the first place, I guess it makes
sense for me to be volunteered to fix it ;)
--
Andrea Bolognani / Red Hat / Virtualization