[libvirt PATCH 0/2] Fix the way libvirt cleans up the list of active mdevs for a shutoff domain

See patch 2 for explanation. Erik Skultety (2): hostdev: Update mdev pointer reference after checking device type hostdev: mdev: Lookup mdevs by sysfs path rather than mdev struct src/hypervisor/virhostdev.c | 12 +++++------- src/util/virmdev.c | 16 ++++++++-------- src/util/virmdev.h | 4 ++-- 3 files changed, 15 insertions(+), 17 deletions(-) -- 2.29.2

We set the pointer to some garbage packed structure data without knowing whether it actually we were handling the type of device we expected to be handling. On its own, this was harmless, because we'd never use the pointer as we'd skip the device if it were not the expected type. However, it's better to make the logic even more explicit - we first check the device and only when we're sure we have the expected type we then update the pointer shortcut. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/hypervisor/virhostdev.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/hypervisor/virhostdev.c b/src/hypervisor/virhostdev.c index 120187b07a..aa3fc8738f 100644 --- a/src/hypervisor/virhostdev.c +++ b/src/hypervisor/virhostdev.c @@ -1980,11 +1980,11 @@ virHostdevReAttachMediatedDevices(virHostdevManagerPtr mgr, virDomainHostdevSubsysMediatedDevPtr mdevsrc; virDomainHostdevDefPtr hostdev = hostdevs[i]; - mdevsrc = &hostdev->source.subsys.u.mdev; - if (!virHostdevIsMdevDevice(hostdev)) continue; + mdevsrc = &hostdev->source.subsys.u.mdev; + if (!(mdev = virMediatedDeviceNew(mdevsrc->uuidstr, mdevsrc->model))) continue; -- 2.29.2

On a Thursday in 2021, Erik Skultety wrote:
We set the pointer to some garbage packed structure data without knowing whether it actually we were handling the type of device we expected to be handling. On its own, this was harmless, because we'd never use the pointer as we'd skip the device if it were not the expected type. However, it's better to make the logic even more explicit - we first check the device and only when we're sure we have the expected type we then update the pointer shortcut.
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/hypervisor/virhostdev.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

The lookup didn't do anything apart from comparing the sysfs paths anyway since that's what makes each mdev unique. The most ridiculous usage of the old logic was in virHostdevReAttachMediatedDevices where in order to drop an mdev hostdev from the list of active devices we first had to create a new mdev and use it in the lookup call. Why couldn't we have used the hostdev directly? Because the hostdev and mdev structures are incompatible. The way mdevs are currently removed is via a write to a specific sysfs attribute. If you do it while the machine which has the mdev assigned is running, the write call may block (this should return a write error instead and it is likely a bug in the vendor driver) until the device is no longer bound to vfio which is when the QEMU process exits. The interesting part here comes afterwards when we're cleaning up and call virHostdevReAttachMediatedDevices. The domain doesn't exist anymore, so the list of active hostdevs needs to be updated and the respective hostdevs removed from the list, but remember we had to create an mdev object in the memory in order to find it in the list first which will fail because the write to sysfs had already removed the mdev instance from the host system. And so the next time you try to start the same domain you'll get: "Requested operation is not valid: mediated device <path> is in use by driver QEMU, domain <name>" Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/hypervisor/virhostdev.c | 10 ++++------ src/util/virmdev.c | 16 ++++++++-------- src/util/virmdev.h | 4 ++-- 3 files changed, 14 insertions(+), 16 deletions(-) diff --git a/src/hypervisor/virhostdev.c b/src/hypervisor/virhostdev.c index aa3fc8738f..ea04382d0d 100644 --- a/src/hypervisor/virhostdev.c +++ b/src/hypervisor/virhostdev.c @@ -1975,7 +1975,7 @@ virHostdevReAttachMediatedDevices(virHostdevManagerPtr mgr, virObjectLock(mgr->activeMediatedHostdevs); for (i = 0; i < nhostdevs; i++) { - g_autoptr(virMediatedDevice) mdev = NULL; + g_autofree char * sysfspath = NULL; virMediatedDevicePtr tmp; virDomainHostdevSubsysMediatedDevPtr mdevsrc; virDomainHostdevDefPtr hostdev = hostdevs[i]; @@ -1984,14 +1984,12 @@ virHostdevReAttachMediatedDevices(virHostdevManagerPtr mgr, continue; mdevsrc = &hostdev->source.subsys.u.mdev; - - if (!(mdev = virMediatedDeviceNew(mdevsrc->uuidstr, - mdevsrc->model))) - continue; + sysfspath = virMediatedDeviceGetSysfsPath(mdevsrc->uuidstr); /* Remove from the list only mdevs assigned to @drv_name/@dom_name */ - tmp = virMediatedDeviceListFind(mgr->activeMediatedHostdevs, mdev); + tmp = virMediatedDeviceListFind(mgr->activeMediatedHostdevs, + sysfspath); /* skip inactive devices */ if (!tmp) diff --git a/src/util/virmdev.c b/src/util/virmdev.c index b6df353d56..fc27e9e45d 100644 --- a/src/util/virmdev.c +++ b/src/util/virmdev.c @@ -308,7 +308,7 @@ int virMediatedDeviceListAdd(virMediatedDeviceListPtr list, virMediatedDevicePtr *dev) { - if (virMediatedDeviceListFind(list, *dev)) { + if (virMediatedDeviceListFind(list, (*dev)->path)) { virReportError(VIR_ERR_INTERNAL_ERROR, _("device %s is already in use"), (*dev)->path); return -1; @@ -354,7 +354,7 @@ virMediatedDevicePtr virMediatedDeviceListSteal(virMediatedDeviceListPtr list, virMediatedDevicePtr dev) { - int idx = virMediatedDeviceListFindIndex(list, dev); + int idx = virMediatedDeviceListFindIndex(list, dev->path); return virMediatedDeviceListStealIndex(list, idx); } @@ -370,13 +370,13 @@ virMediatedDeviceListDel(virMediatedDeviceListPtr list, int virMediatedDeviceListFindIndex(virMediatedDeviceListPtr list, - virMediatedDevicePtr dev) + const char *sysfspath) { size_t i; for (i = 0; i < list->count; i++) { - virMediatedDevicePtr other = list->devs[i]; - if (STREQ(other->path, dev->path)) + virMediatedDevicePtr dev = list->devs[i]; + if (STREQ(sysfspath, dev->path)) return i; } return -1; @@ -385,11 +385,11 @@ virMediatedDeviceListFindIndex(virMediatedDeviceListPtr list, virMediatedDevicePtr virMediatedDeviceListFind(virMediatedDeviceListPtr list, - virMediatedDevicePtr dev) + const char *sysfspath) { int idx; - if ((idx = virMediatedDeviceListFindIndex(list, dev)) >= 0) + if ((idx = virMediatedDeviceListFindIndex(list, sysfspath)) >= 0) return list->devs[idx]; else return NULL; @@ -403,7 +403,7 @@ virMediatedDeviceIsUsed(virMediatedDevicePtr dev, const char *drvname, *domname; virMediatedDevicePtr tmp = NULL; - if ((tmp = virMediatedDeviceListFind(list, dev))) { + if ((tmp = virMediatedDeviceListFind(list, dev->path))) { virMediatedDeviceGetUsedBy(tmp, &drvname, &domname); virReportError(VIR_ERR_OPERATION_INVALID, _("mediated device %s is in use by " diff --git a/src/util/virmdev.h b/src/util/virmdev.h index b6563a94fc..e96e5bb8c6 100644 --- a/src/util/virmdev.h +++ b/src/util/virmdev.h @@ -130,11 +130,11 @@ virMediatedDeviceListDel(virMediatedDeviceListPtr list, virMediatedDevicePtr virMediatedDeviceListFind(virMediatedDeviceListPtr list, - virMediatedDevicePtr dev); + const char *sysfspath); int virMediatedDeviceListFindIndex(virMediatedDeviceListPtr list, - virMediatedDevicePtr dev); + const char *sysfspath); int virMediatedDeviceListMarkDevices(virMediatedDeviceListPtr dst, -- 2.29.2

On a Thursday in 2021, Erik Skultety wrote:
The lookup didn't do anything apart from comparing the sysfs paths anyway since that's what makes each mdev unique. The most ridiculous usage of the old logic was in virHostdevReAttachMediatedDevices where in order to drop an mdev hostdev from the list of active devices we first had to create a new mdev and use it in the lookup call. Why couldn't we have used the hostdev directly? Because the hostdev and mdev structures are incompatible.
The way mdevs are currently removed is via a write to a specific sysfs attribute. If you do it while the machine which has the mdev assigned is running, the write call may block (this should return a write error instead and it is likely a bug in the vendor driver) until the device is no longer bound to vfio which is when the QEMU process exits.
The interesting part here comes afterwards when we're cleaning up and call virHostdevReAttachMediatedDevices. The domain doesn't exist anymore, so the list of active hostdevs needs to be updated and the respective hostdevs removed from the list, but remember we had to create an mdev object in the memory in order to find it in the list first which will fail because the write to sysfs had already removed the mdev instance from the host system. And so the next time you try to start the same domain you'll get:
"Requested operation is not valid: mediated device <path> is in use by driver QEMU, domain <name>"
Is there a public bug/issue you could link here?
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/hypervisor/virhostdev.c | 10 ++++------ src/util/virmdev.c | 16 ++++++++-------- src/util/virmdev.h | 4 ++-- 3 files changed, 14 insertions(+), 16 deletions(-)
diff --git a/src/hypervisor/virhostdev.c b/src/hypervisor/virhostdev.c index aa3fc8738f..ea04382d0d 100644 --- a/src/hypervisor/virhostdev.c +++ b/src/hypervisor/virhostdev.c @@ -1975,7 +1975,7 @@ virHostdevReAttachMediatedDevices(virHostdevManagerPtr mgr,
virObjectLock(mgr->activeMediatedHostdevs); for (i = 0; i < nhostdevs; i++) { - g_autoptr(virMediatedDevice) mdev = NULL; + g_autofree char * sysfspath = NULL;
extra space
virMediatedDevicePtr tmp; virDomainHostdevSubsysMediatedDevPtr mdevsrc; virDomainHostdevDefPtr hostdev = hostdevs[i];
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

On Thu, Jan 07, 2021 at 09:32:43PM +0100, Ján Tomko wrote:
On a Thursday in 2021, Erik Skultety wrote:
The lookup didn't do anything apart from comparing the sysfs paths anyway since that's what makes each mdev unique. The most ridiculous usage of the old logic was in virHostdevReAttachMediatedDevices where in order to drop an mdev hostdev from the list of active devices we first had to create a new mdev and use it in the lookup call. Why couldn't we have used the hostdev directly? Because the hostdev and mdev structures are incompatible.
The way mdevs are currently removed is via a write to a specific sysfs attribute. If you do it while the machine which has the mdev assigned is running, the write call may block (this should return a write error instead and it is likely a bug in the vendor driver) until the device
I'll reword ^this bit slightly as it turned out that this behavior conforms to the kernel device model (which it previously didn't) and a device "remove" apparently cannot return an error (I have this via a conversation on a private list, so I can't link it as a source of truth).
is no longer bound to vfio which is when the QEMU process exits.
The interesting part here comes afterwards when we're cleaning up and call virHostdevReAttachMediatedDevices. The domain doesn't exist anymore, so the list of active hostdevs needs to be updated and the respective hostdevs removed from the list, but remember we had to create an mdev object in the memory in order to find it in the list first which will fail because the write to sysfs had already removed the mdev instance from the host system. And so the next time you try to start the same domain you'll get:
"Requested operation is not valid: mediated device <path> is in use by driver QEMU, domain <name>"
Is there a public bug/issue you could link here?
I just created one (gitlab issue) for bookkeeping and I'll link it here.
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/hypervisor/virhostdev.c | 10 ++++------ src/util/virmdev.c | 16 ++++++++-------- src/util/virmdev.h | 4 ++-- 3 files changed, 14 insertions(+), 16 deletions(-)
diff --git a/src/hypervisor/virhostdev.c b/src/hypervisor/virhostdev.c index aa3fc8738f..ea04382d0d 100644 --- a/src/hypervisor/virhostdev.c +++ b/src/hypervisor/virhostdev.c @@ -1975,7 +1975,7 @@ virHostdevReAttachMediatedDevices(virHostdevManagerPtr mgr,
virObjectLock(mgr->activeMediatedHostdevs); for (i = 0; i < nhostdevs; i++) { - g_autoptr(virMediatedDevice) mdev = NULL; + g_autofree char * sysfspath = NULL;
extra space
virMediatedDevicePtr tmp; virDomainHostdevSubsysMediatedDevPtr mdevsrc; virDomainHostdevDefPtr hostdev = hostdevs[i];
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Thanks. Erik
participants (2)
-
Erik Skultety
-
Ján Tomko