On Fri, Aug 09, 2013 at 10:58:55AM -0500, Anthony Liguori wrote:
Michal Privoznik <mprivozn(a)redhat.com> writes:
> [CC'ing qemu-devel list]
> On 09.08.2013 15:17, Daniel P. Berrange wrote:
>> On Fri, Aug 09, 2013 at 07:13:58AM -0600, Eric Blake wrote:
>>> On 08/09/2013 06:56 AM, Michal Privoznik wrote:
>>>> This function tries to guess the correct limit for the maximal
>>>> memory usage by qemu for a given domain. This can never be guessed
>>>> correctly, not to mention all the pains and sleepless nights this
>>>> code has caused. Once somebody discovers an algorithm to solve the
>>>> Halting Problem, we can compute the limit algorithmically. But
>>>> until then, this code should never see the light of a release
>>>> again.
>>>> ---
>>>> src/qemu/qemu_cgroup.c | 3 +--
>>>> src/qemu/qemu_command.c | 2 +-
>>>> src/qemu/qemu_domain.c | 49 -------------------------------------------------
>>>> src/qemu/qemu_domain.h | 2 --
>>>> src/qemu/qemu_hotplug.c | 2 +-
>>>> 5 files changed, 3 insertions(+), 55 deletions(-)
>>>
>>> ACK. Users that put an explicit limit in their XML take on the risk
>>> of guessing correctly themselves; all other users should not be forced
>>> to suffer from a bad guess on our part killing their domain.
>>
>> If we don't understand how to calculate a default limit that works,
>> how are users with even less knowledge than us supposed to calculate
>> an explicit limit of their own?
>>
>> This limit was designed so that hosts are not vulnerable to a DoS
>> attack from a compromised QEMU, so removing it arguably introduces
>> a security weakness in our default deployment.
>>
>> I think I'd like to see some feedback / agreement from QEMU developers
>> that this problem really can't be solved, before we remove it.
>>
>> Daniel
>>
>
> In libvirt I've introduced a heuristic to guess the maximum memory
> limit for a given VM definition. The rationale was "it's better to be
> safe by default" and not to let a leaking qemu trash the host. The
> heuristic is only used if the user has not configured any limit himself.
> However, over time the number of users reporting OOM kills due to my
> heuristic has grown. Finally, I've had enough of this problem, so I've
> made a patch [1] that removes this 'functionality' completely (I'd say
> it's a bug after all). In the patch you can see the heuristic we've
> converged on. But Dan has a point. If libvirt & qemu developers aren't
> able to come up with a proper heuristic, how can an ordinary user (who
> doesn't have any knowledge of the code) do so? So before I apply my
> patch, I want to ask you guys what you think about it.
Even if we had an algorithm for calculating memory overhead (we don't),
glibc will still introduce uncertainty since malloc(size) doesn't
translate to allocating size bytes from the kernel. When you throw in
fragmentation too, it becomes extremely hard to predict.
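
As a rough illustration of the malloc(size) point above, the sketch below asks glibc how many bytes it actually made usable for a given request. It is not taken from QEMU or libvirt; usable_bytes_for() is a hypothetical helper name, and malloc_usable_size() is a glibc extension, so this assumes a glibc-based system.

```c
#include <malloc.h>   /* malloc_usable_size() is a glibc extension */
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: return how many bytes the allocator made usable
 * for a request of `requested` bytes, or 0 if the allocation failed.
 * The result only covers the usable part of the chunk; allocator
 * bookkeeping and arena growth are not visible through this call.
 */
static size_t usable_bytes_for(size_t requested)
{
    void *p = malloc(requested);
    if (p == NULL)
        return 0;

    size_t usable = malloc_usable_size(p);
    free(p);
    return usable;
}
```

The value returned is at least the requested size and is often larger, because the allocator rounds requests up to its own chunk sizes. Summing the sizes passed to malloc() therefore gives only a lower bound on what the process takes from the kernel, before fragmentation is even considered.
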
The only practical way of doing this would be to have QEMU gracefully
handle malloc() == NULL so that you could set a limit and degrade
gracefully. We don't, though, so setting a limit is likely to get you
into trouble.
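
To make "gracefully handle malloc() == NULL" concrete, here is a minimal sketch of what degrading on allocation failure could look like in C. It is purely illustrative and is not how QEMU behaves (as noted above, QEMU does not do this). The names alloc_fn, alloc_degrading() and capped_malloc() are hypothetical; capped_malloc() only stands in for an allocator that starts failing once some limit is reached.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator hook, so the failure path can be exercised. */
typedef void *(*alloc_fn)(size_t);

/*
 * Try to allocate `want` bytes. On failure, halve the request until an
 * allocation succeeds or the size would drop below `min_size`.
 * On success, store the size obtained in *got and return the buffer.
 * On failure, set *got to 0 and return NULL so the caller can back off.
 */
static void *alloc_degrading(alloc_fn alloc, size_t want,
                             size_t min_size, size_t *got)
{
    if (min_size == 0)
        min_size = 1;   /* guarantees the loop terminates */

    for (size_t size = want; size >= min_size; size /= 2) {
        void *buf = alloc(size);
        if (buf != NULL) {
            *got = size;
            return buf;
        }
    }

    *got = 0;
    return NULL;
}

/* Illustrative stand-in for an allocator running under a memory limit:
 * any request above 64 KiB fails. */
static void *capped_malloc(size_t size)
{
    const size_t cap = 64 * 1024;
    return size > cap ? NULL : malloc(size);
}
```

With capped_malloc(), a 1 MiB request with a 4 KiB minimum is halved four times and ends up with a 64 KiB buffer, while the same request with a 128 KiB minimum returns NULL. The hard part in a real program is that every allocation site needs such a fallback path.
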
So you're saying there's no way we can define a reasonable limit
on a QEMU guest to prevent a compromised QEMU from exhausting all
host memory? It rather sucks if that's the position we're in :-(
Daniel
--
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|