On Thu, 2015-11-05 at 16:27 +0100, Peter Krempa wrote:
> On Wed, Nov 04, 2015 at 17:16:53 -0700, Alex Williamson wrote:
> > On Wed, 2015-11-04 at 16:54 +0100, Peter Krempa wrote:
> > > On Wed, Nov 04, 2015 at 08:43:34 -0700, Alex Williamson wrote:
> > > > On Wed, 2015-11-04 at 16:14 +0100, Peter Krempa wrote:
[...]
> > > Additionally if users wish to impose a limit on this they still might
> > > want to use the <hard_limit> setting.
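
For reference, the knob Peter means here is the <memtune> element in the
domain XML; the value below is just an example:

    <memtune>
      <hard_limit unit='KiB'>4194304</hard_limit>
    </memtune>
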
> >
> > What's wrong with the current algorithm?
> The wrong thing is that it doesn't work all the time. See below...
> > The 1G fudge factor certainly isn't ideal, but the 2nd bug you
> > reference above is clearly a result of more memory being added to the
> > VM without the locked memory limit being adjusted to account for it.
> > That's just an implementation oversight.
> Indeed, that is a separate bug, and if we figure out how to make this
> work all the time I'll fix that separately.
> > I'm not sure what's going on in the first bug, but why does using
> > hard_limit to override the locked limit to something smaller than we
> > think it should be set to automatically solve the problem? Is it not
> > getting set as we expect on power? Do we simply need to set the limit
> > using max memory rather than current memory? It seems like there's a
> Setting it to max memory won't actually fix the first referenced bug.
> The bug can be reproduced even if you don't use max memory at all on
> power pc (I didn't manage to update the BZ with this information yet,
> though). Using max memory for that would basically add yet another
> workaround for setting the mlock size large enough.
>
> The bug happens if you set up a guest with 1GiB of ram and pass an AMD
> FirePro 2270 graphics card into it. Libvirt sets the memory limit to
> 1+1GiB and starts qemu. qemu then aborts as the VFIO code cannot lock
> the memory.
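
For reference, my understanding of the computation libvirt does today is
roughly the following (a simplified sketch, not the actual code):

    /* sketch: limit = current guest memory + 1GiB fudge factor,
     * intended to cover device MMIO, IOMMU tables and other overhead */
    unsigned long long
    mlock_limit_bytes(unsigned long long guest_mem_kib)
    {
        return (guest_mem_kib + 1024 * 1024) * 1024ULL;
    }

So a 1GiB guest gets a 2GiB ceiling, which apparently isn't enough for
the VFIO mappings there.
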
This seems like a power bug; on x86, mmap'd memory, such as the MMIO
space of an assigned device, doesn't count against locked memory limits.
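
To be clear about where the abort comes from: the kernel's type1 vfio
iommu backend refuses to pin pages beyond the limit; from memory the
accounting looks roughly like this (paraphrased, not an exact quote of
the kernel source):

    /* sketch of the RLIMIT_MEMLOCK check when vfio pins guest pages */
    locked = current->mm->locked_vm + npages;
    lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
    if (locked > lock_limit && !capable(CAP_IPC_LOCK))
        return -ENOMEM;   /* qemu fails the DMA map and exits */

So whatever limit libvirt sets really is a hard ceiling on what VFIO can
pin.
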
> This does not happen on larger guests though (2GiB of ram or more),
> which leads to the suspicion that the limit doesn't take into account
> some kind of overhead. The original comment in the code hinted that
> this is just a guess, and it has proved unreliable, so we shouldn't
> special case such configurations.
Power has a different IOMMU model than x86; there may be some lower
bound at which trying to approximate it using the x86 limits doesn't
work.
> Setting it to max memory + 1G would, compared to the current state,
> actually work around the second mentioned bug.
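
Right, and fixing that second bug should then just be a matter of sizing
the limit against the largest memory size the guest can ever reach;
something like the following, with an illustrative variable for the
hotplug ceiling (in KiB, as libvirt tracks memory):

    /* account for the hotplug ceiling, not just the current size */
    limit_bytes = (max_memory_kib + 1024 * 1024) * 1024ULL;

That still leaves the power problem unexplained, of course.
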
> > whole lot of things we could do that are better than allowing the VM
> > unlimited locked memory. Thanks,
> I'm happy to set something large enough, but the value we set must be
> the absolute upper bound of anything that might be necessary so that it
> will 'just work'. We decided to be nice enough to the users to set the
> limit to something that works, so we shouldn't special case any
> configuration.
The power devs will need to speak to what their locked memory
requirements are, and maybe we can come up with a combined algorithm that
works well enough for both, or maybe libvirt needs to apply different
algorithms based on the machine type.
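
To make that concrete, I could imagine something along these lines (a
pure sketch with invented helper names; mlock_limit_bytes is the helper
sketched above, and the ppc64 formula is exactly what we'd need the
power folks to define):

    /* dispatch the mlock limit calculation on guest architecture */
    unsigned long long
    mlock_limit_for_def(virDomainDefPtr def)
    {
        if (ARCH_IS_PPC64(def->os.arch))
            return ppc64_mlock_limit_bytes(def);  /* formula TBD */
        /* x86 and friends: guest memory + 1GiB fudge factor */
        return mlock_limit_bytes(current_mem_kib(def));
    }
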
The x86 limit has been working well for us, so I see no reason to
abandon it simply because we tried to apply it to a different platform
and it didn't work. Thanks,
Alex