On Fri, Nov 06, 2015 at 13:39:06 -0500, John Ferlan wrote:
On 11/06/2015 10:47 AM, Peter Krempa wrote:
> If mlock is required, either due to the use of VFIO hostdevs or because
> it's explicitly enabled, the locking limit needs to be tweaked prior to
> adding a new memory module or after removing one. Add a helper to
> determine when this is necessary and reuse it both on hotplug and
> hotunplug.
>
> Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1273491
Not sure I'd get from the commit message why this change would
necessarily fix the problem unless of course I've been reading along
with the other patches proposed.
As I understand it, the tweak assumes that when attaching memory, if the
domain has or requires memory to be locked, then we'll adjust that value
by the amount we're attaching. Likewise, if we detach memory, then we
have to reduce the value.
Exactly. Since we do the automagic limit calculation on startup, users
expect this to work flawlessly with memory hotplug too.
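For illustration, the unplug side does the mirror-image adjustment.
Roughly (just a sketch; the exact placement on the removal path may
differ from the actual hunk):

    /* after the memory module has been removed from vm->def, re-derive
     * the locking limit from the new definition and apply it */
    if (qemuDomainRequiresMlock(vm->def))
        virProcessSetMaxMemLock(vm->pid,
                                qemuDomainGetMlockLimitBytes(vm->def));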
> ---
> src/qemu/qemu_domain.c | 27 +++++++++++++++++++++++++++
> src/qemu/qemu_domain.h | 1 +
> src/qemu/qemu_hotplug.c | 15 +++++++++++++++
> 3 files changed, 43 insertions(+)
>
> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
> index 8441d7a..6f82081 100644
> --- a/src/qemu/qemu_domain.c
> +++ b/src/qemu/qemu_domain.c
> @@ -3633,3 +3633,30 @@ qemuDomainGetMlockLimitBytes(virDomainDefPtr def)
>
> return memKB << 10;
> }
> +
> +
> +/**
> + * @def: domain definition
> + *
> + * Returns ture if the locked memory limit needs to be set or updated due to
s/ture/true
> + * configuration or passthrough devices.
> + * */
> +bool
> +qemuDomainRequiresMlock(virDomainDefPtr def)
> +{
> + size_t i;
> +
> + if (def->mem.locked)
> + return true;
> +
> + for (i = 0; i < def->nhostdevs; i++) {
> + virDomainHostdevDefPtr dev = def->hostdevs[i];
> +
> + if (dev->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS &&
> + dev->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI &&
> + dev->source.subsys.u.pci.backend == VIR_DOMAIN_HOSTDEV_PCI_BACKEND_VFIO)
> + return true;
> + }
Seeing this made me wonder - can there be more than one? Not related to
this patch, but if so wouldn't it affect the 1G adjustment...
There actually was code doing exactly that as part of
qemuDomainMemoryLimit. Commit 16bcb3b61675a88bf, which reverted it,
actually introduced quite a few regressions in other code. Some of them
were fixed later on, but one of them is visible in this hunk.
If you specify <memoryBacking><locked/></memoryBacking> in the XML, then
the same code as for VFIO is used to calculate the memory size, which is
incorrect. In that case the hard limit setting should be enforced, but
since we are already doing it wrong, this might really break existing
configs.
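To be concrete, what I'd consider correct behaviour (just a sketch;
whether def->mem.hard_limit and virDomainDefGetMemoryActual are the
exact names I'd have to double-check):

    unsigned long long
    qemuDomainGetMlockLimitBytes(virDomainDefPtr def)
    {
        unsigned long long memKB;

        /* an explicitly configured hard limit should be what we enforce,
         * rather than falling through to the VFIO heuristic below */
        if (def->mem.hard_limit)
            return def->mem.hard_limit << 10;

        /* VFIO heuristic: guest memory plus 1 GiB of headroom */
        memKB = virDomainDefGetMemoryActual(def) + 1048576;

        return memKB << 10;
    }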
> +
> + return false;
> +}
> diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
> index e34370b..4be998c 100644
> --- a/src/qemu/qemu_domain.h
> +++ b/src/qemu/qemu_domain.h
> @@ -483,5 +483,6 @@ int qemuDomainUpdateCurrentMemorySize(virQEMUDriverPtr driver,
> virDomainObjPtr vm);
>
> unsigned long long qemuDomainGetMlockLimitBytes(virDomainDefPtr def);
> +bool qemuDomainRequiresMlock(virDomainDefPtr def);
>
> #endif /* __QEMU_DOMAIN_H__ */
> diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c
> index e7fc036..d4cacc0 100644
> --- a/src/qemu/qemu_hotplug.c
> +++ b/src/qemu/qemu_hotplug.c
> @@ -1768,6 +1768,7 @@ qemuDomainAttachMemory(virQEMUDriverPtr driver,
> virJSONValuePtr props = NULL;
> virObjectEventPtr event;
> bool fix_balloon = false;
> + bool mlock = false;
> int id;
> int ret = -1;
>
> @@ -1802,6 +1803,11 @@ qemuDomainAttachMemory(virQEMUDriverPtr driver,
> goto cleanup;
> }
>
> + mlock = qemuDomainRequiresMlock(vm->def);
> +
> + if (mlock)
> + virProcessSetMaxMemLock(vm->pid, qemuDomainGetMlockLimitBytes(vm->def));
> +
This is somewhat of an existing issue, but virProcessSetMaxMemLock does
return a status and this code doesn't check it (nor do other callers).
If the call failed or wasn't supported, then I assume whatever follows
won't be very happy either, right?
Actually, you are totally right. In this case we really need to check
the return value. And the VFIO hotplug case needs to be fixed too: qemu
crashes/exits if it can't lock the memory. We need to reject the hotplug
if we can't increase the limit.
Other adjustments in this code could use "ignore_value"; however,
realize that as soon as you add a check, callers in other places will
need to be checked too to keep Coverity happy.
You know my disregard for Coverity's non-bug whining. We can definitely
add an ATTRIBUTE_RETURN_CHECK though, to be sure to fix all cases.
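Something along these lines for the next version, then (sketch only; I
believe virProcessSetMaxMemLock reports its own error on failure, so
simply failing the attach should be enough, but I'd have to double-check
the exact prototype):

    /* in qemuDomainAttachMemory(): refuse the hotplug if we can't raise
     * the limit, instead of letting qemu crash later */
    if (qemuDomainRequiresMlock(vm->def) &&
        virProcessSetMaxMemLock(vm->pid,
                                qemuDomainGetMlockLimitBytes(vm->def)) < 0)
        goto cleanup;

    /* and on the prototype, so the other callers get flagged too: */
    int virProcessSetMaxMemLock(pid_t pid, unsigned long long bytes)
        ATTRIBUTE_RETURN_CHECK;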
It seems to me the changes would resolve the problem, although I still
wonder about the case where we've defined a hard limit, as opposed to the
case where we set the limit because of VFIO.
That is, if I set a hard limit of 4G, then attaching more than the hard
limit allows would seem counterintuitive to why there's a limit set. It
This is totally a user problem. Setting a hard limit in a way that
doesn't accommodate the VM's legitimate needs isn't our problem.
would seem in that case someone should be forced to increase the hard
limit. I guess it all goes back to assuming whether attaching memory
should increase the limit. I certainly understand the other case where
In the case when a hard limit was specified, we can't just increase it.
That would go against the semantics of the hard limit setting.
Also, if qemu crashes in such a case, it's expected, precisely because
of the hard limit.
One thing we should do, though, is document how this behaves in
conjunction with that.
An additional optimization could be to not re-set the memory limit on
memory hotplug when it comes from the hard limit setting, but that'd
overcomplicate things. I'll think about that though before posting the
next version.
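Roughly what I have in mind (again just a sketch; whether checking
def->mem.hard_limit is the right condition is exactly the part I want to
think through):

    /* if an explicit <hard_limit> governs the locking limit, it does not
     * change when a memory module is plugged, so re-setting it could be
     * skipped */
    if (qemuDomainRequiresMlock(vm->def) && !vm->def->mem.hard_limit)
        virProcessSetMaxMemLock(vm->pid,
                                qemuDomainGetMlockLimitBytes(vm->def));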
because there was a specific card we had to lock memory for, so we must
update the limit.
Peter