[CC'ing qemu-devel list]
On 09.08.2013 15:17, Daniel P. Berrange wrote:
On Fri, Aug 09, 2013 at 07:13:58AM -0600, Eric Blake wrote:
> On 08/09/2013 06:56 AM, Michal Privoznik wrote:
>> This function tries to guess the correct limit for the maximum memory
>> usage by qemu for a given domain. This can never be guessed
>> correctly, not to mention all the pain and sleepless nights this
>> code has caused. Once somebody discovers an algorithm to solve the
>> Halting Problem, we can compute the limit algorithmically. But
>> until then, this code should never see the light of a release
>> again.
>> ---
>> src/qemu/qemu_cgroup.c | 3 +--
>> src/qemu/qemu_command.c | 2 +-
>> src/qemu/qemu_domain.c | 49 -------------------------------------------------
>> src/qemu/qemu_domain.h | 2 --
>> src/qemu/qemu_hotplug.c | 2 +-
>> 5 files changed, 3 insertions(+), 55 deletions(-)
>
> ACK. Users who put an explicit limit in their XML take on the risk
> of guessing correctly themselves; all other users should not be forced
> to suffer from a bad guess on our part killing their domain.
If we don't understand how to calculate a default limit that works,
how are users with even less knowledge than us supposed to calculate
an explicit limit of their own?
This limit was designed so that hosts are not vulnerable to a DoS
attack from a compromised QEMU, so removing it arguably introduces
a security weakness in our default deployment.
I think I'd like to see some feedback / agreement from QEMU developers
that this problem really can't be solved, before we remove it.
Daniel
In libvirt I've introduced a heuristic to guess the maximum memory
limit for a given VM definition. The rationale was "it's better to be
safe by default" and not to let a leaking qemu trash the host. The
heuristic is only used if the user has not configured a limit
themselves. However, over time the number of users reporting OOM kills
due to my heuristic has grown. Finally, I've had enough of this problem,
so I've made a patch [1] that removes this 'functionality' completely
(I'd say it's a bug after all). In the patch you can see the heuristic
we've converged on. But Dan has a point. If libvirt & qemu developers
aren't able to come up with a proper heuristic, how can an ordinary user
(who doesn't have any knowledge of the code) do so? So before I apply my
patch, I want to ask you guys what you think about it.
Michal
[1] https://www.redhat.com/archives/libvir-list/2013-August/msg00437.html