On Mon, Nov 14, 2022 at 04:35:35PM +0100, Michal Privoznik wrote:
This is similar to one of previous commits which generated
thread-context object for memory devices at cmd line generation
phase. This one does the same for hotplug case, except it's doing
so iff QEMU sandboxing is turned off. The reason is that once
sandboxing is turned on, the __NR_sched_setaffinity syscall is
filtered by libseccomp and thus QEMU is unable to instantiate the
thread-context object.
I'm not sure this is enough checking for possible errors and I am not
sure we actually need this; it looks like it might cause more problems
than benefits. I can't think of a reason anyone would hotplug so much
memory during runtime that it would benefit from the thread-context,
although I admit I might very well be wrong on that. Anyway it can
easily fail if the emulator is pinned to anything but a superset of
host-nodes set for this particular memory device. There might be more
restrictions from mgmt apps filtering some syscalls or whatnot. And
there is an issue described below as well.
Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com>
---
src/qemu/qemu_hotplug.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c
index da92ced2f4..5c49da87ba 100644
--- a/src/qemu/qemu_hotplug.c
+++ b/src/qemu/qemu_hotplug.c
@@ -2240,11 +2240,13 @@ qemuDomainAttachMemory(virQEMUDriver *driver,
g_autoptr(virJSONValue) devprops = NULL;
g_autofree char *objalias = NULL;
bool objAdded = false;
+ bool tcObjAdded = false;
bool releaseaddr = false;
bool teardownlabel = false;
bool teardowncgroup = false;
bool teardowndevice = false;
g_autoptr(virJSONValue) props = NULL;
+ g_autoptr(virJSONValue) tcProps = NULL;
virObjectEvent *event;
int id;
int ret = -1;
@@ -2273,6 +2275,11 @@ qemuDomainAttachMemory(virQEMUDriver *driver,
priv, vm->def, mem, true, false) < 0)
goto cleanup;
+ /* In case sandbox was turned on, thread-context won't work. */
+ if (cfg->seccompSandbox == 0 &&
+ qemuBuildThreadContextProps(&tcProps, &props, priv) < 0)
+ goto cleanup;
+
This function adds the alias to the list of tc aliases...
if (qemuProcessBuildDestroyMemoryPaths(driver, vm, mem, true)
< 0)
goto cleanup;
@@ -2294,6 +2301,12 @@ qemuDomainAttachMemory(virQEMUDriver *driver,
goto removedef;
qemuDomainObjEnterMonitor(vm);
+ if (tcProps) {
+ if (qemuMonitorAddObject(priv->mon, &tcProps, NULL) < 0)
+ goto exit_monitor;
+ tcObjAdded = true;
If this (or anything above) fails the tcObjAdded is false (because it
was not added, kinda makes sense. The hotplug fails, but if the user
tries to hotplug again and it does work, then...
+ }
+
if (qemuMonitorAddObject(priv->mon, &props, NULL) < 0)
goto exit_monitor;
objAdded = true;
@@ -2301,6 +2314,12 @@ qemuDomainAttachMemory(virQEMUDriver *driver,
if (qemuMonitorAddDeviceProps(priv->mon, &devprops) < 0)
goto exit_monitor;
+ if (tcObjAdded) {
+ if (qemuProcessDeleteThreadContext(vm) < 0)
+ goto exit_monitor;
This will fail because there is one leftover tc alias from the first
hotplug attempt, even though this second hotplug was completely
successful.
Either remove it from the list if anything is unsuccessful
(i.e. (tcProps && !tcObjAdded), add it to the list only when the object
is added, or ignore the return value even in this call (or inside the
qemuProcessDeleteThreadContext() function), so that we don't have corner
cases of weird errors.
+ tcObjAdded = false;
+ }
+
qemuDomainObjExitMonitor(vm);
event = virDomainEventDeviceAddedNewFromObj(vm, objalias);
@@ -2339,6 +2358,8 @@ qemuDomainAttachMemory(virQEMUDriver *driver,
virErrorPreserveLast(&orig_err);
if (objAdded)
ignore_value(qemuMonitorDelObject(priv->mon, objalias, false));
+ if (tcObjAdded)
+ ignore_value(qemuProcessDeleteThreadContext(vm));
qemuDomainObjExitMonitor(vm);
if (objAdded && mem)
@@ -4380,6 +4401,7 @@ qemuDomainRemoveMemoryDevice(virQEMUDriver *driver,
qemuDomainObjEnterMonitor(vm);
rc = qemuMonitorDelObject(priv->mon, backendAlias, true);
+ /* XXX remove TC object */
Leftover? Or did you mean to call DeleteThreadContext here too? It
should be safe to remove all the leftover ones here, or if not, just
removing "tc-{backendAlias}" here should do.
qemuDomainObjExitMonitor(vm);
virDomainAuditMemory(vm, oldmem, newmem, "update", rc == 0);
--
2.37.4