On Wed, Nov 04, 2015 at 16:14:29 +0100, Peter Krempa wrote:
For VFIO passthrough the guest memory including the device memory to
be
a verb is missing here --------------------------------------------^ :-)
resident in memory. Previously we used a magic constant of 1GiB that
we
added to the current memory size of the VM to accomodate all this. It is
possible though that the device that is passed through exposes more than
that. To avoid guessing wrong again in the future set the memory lock
limit to unlimited at the point when VFIO will be used.
To avoid any confusion in the future, I'd explicitly mention that if
there is a need to limit the memory which a QEMU process can lock,
memory hard_limit in domain XML needs to be set.
This problem is similar to the issue where we tried to infer the
hard
limit for a VM according to it's configuration. This proved to be really
problematic so we moved this burden to the user.
Resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=1273480
Additionally this patch also fixes a similar bug, where the mlock limit
was not increased prior to a memory hotplug operation.
https://bugzilla.redhat.com/show_bug.cgi?id=1273491
---
src/qemu/qemu_command.c | 17 +++++------------
src/qemu/qemu_hotplug.c | 20 +++++---------------
src/util/virprocess.c | 3 +++
3 files changed, 13 insertions(+), 27 deletions(-)
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index 8824541..183aa85 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -11448,20 +11448,13 @@ qemuBuildCommandLine(virConnectPtr conn,
}
if (mlock) {
- unsigned long long memKB;
-
- /* VFIO requires all of the guest's memory to be
- * locked resident, plus some amount for IO
- * space. Alex Williamson suggested adding 1GiB for IO
- * space just to be safe (some finer tuning might be
- * nice, though).
- */
+ /* VFIO requires all of the guest's memory and the IO space of the
+ * device to be locked resident. Since we can't really know what the
+ * users decide to plug in we need to set the limit to unlimited. */
if (virMemoryLimitIsSet(def->mem.hard_limit))
- memKB = def->mem.hard_limit;
+ virCommandSetMaxMemLock(cmd, def->mem.hard_limit << 10);
else
- memKB = virDomainDefGetMemoryActual(def) + 1024 * 1024;
-
- virCommandSetMaxMemLock(cmd, memKB * 1024);
+ virCommandSetMaxMemLock(cmd, ULLONG_MAX);
}
if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_MSG_TIMESTAMP) &&
diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c
index 8f2fda9..1cf61ac 100644
--- a/src/qemu/qemu_hotplug.c
+++ b/src/qemu/qemu_hotplug.c
@@ -1254,7 +1254,6 @@ qemuDomainAttachHostPCIDevice(virQEMUDriverPtr driver,
bool teardowncgroup = false;
bool teardownlabel = false;
int backend;
- unsigned long long memKB;
virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver);
unsigned int flags = 0;
@@ -1279,20 +1278,11 @@ qemuDomainAttachHostPCIDevice(virQEMUDriverPtr driver,
goto error;
}
- /* VFIO requires all of the guest's memory to be locked
- * resident (plus an additional 1GiB to cover IO space). During
- * hotplug, the guest's memory may already be locked, but it
- * doesn't hurt to "change" the limit to the same value.
- * NB: the domain's memory tuning parameters are stored as
- * Kibibytes, but virProcessSetMaxMemLock expects the value in
- * bytes.
- */
- if (virMemoryLimitIsSet(vm->def->mem.hard_limit))
- memKB = vm->def->mem.hard_limit;
- else
- memKB = virDomainDefGetMemoryActual(vm->def) + (1024 * 1024);
-
- virProcessSetMaxMemLock(vm->pid, memKB * 1024);
+ /* VFIO requires all of the guest's memory and the IO space of the
+ * device to be locked resident. Since we can't really know what the
+ * users decide to plug in we need to set the limit to unlimited. */
+ if (!virMemoryLimitIsSet(vm->def->mem.hard_limit))
+ virProcessSetMaxMemLock(vm->pid, ULLONG_MAX);
I think this needs to set the limit in the same way qemuBuildCommandLine
does it. If there was no need to set the limit (no VFIO device, not
explicitly requested, no RDMA migration) in qemuBuildCommandLine we have
to set it here, otherwise hotplugging the first VFIO device into such
domain would kill the domain.
break;
default:
diff --git a/src/util/virprocess.c b/src/util/virprocess.c
index 103c114..92195df 100644
--- a/src/util/virprocess.c
+++ b/src/util/virprocess.c
@@ -749,6 +749,9 @@ virProcessSetMaxMemLock(pid_t pid, unsigned long long bytes)
if (bytes == 0)
return 0;
+ if (bytes == ULLONG_MAX)
+ bytes = RLIM_INFINITY;
+
rlim.rlim_cur = rlim.rlim_max = bytes;
if (pid == 0) {
if (setrlimit(RLIMIT_MEMLOCK, &rlim) < 0) {
Jirka