[PATCH 00/20] handle AWOL SR-IOV VF hostdevs during domain lifetime

Hi, This series is an attempt to fix https://gitlab.com/libvirt/libvirt/-/issues/72. The root issue is that, for obvious reasons, virPCIDeviceNew() requires the device to exist in the host (i.e. the config file must be present in sysfs). The unexpected absence of the device (e.g. removing the VFs from the host while a domain was using them) will make virPCIDeviceNew() return -1 in several code paths, and the effects in Libvirt can be bothersome. Patch 20 will explain it in greater detail. Changing virPCIDeviceNew() internals to not fail in this scenario would be a herculean refactor of innocent code that doesn't have anything to do with the problem. My approach here is to check if the device exists in the sysfs in key places, keeping the default virPCIDeviceNew() behavior intact. I mentioned this is patch 20 but to emphasize: this is NOT an attempt to implement surprise hostdev hotunplug support in Libvirt. Removing SR-IOVs VFs used by any domain is still not cool and should be avoided. But after this series, Libvirt can at least fall on its feet if this situation happens. The series was tested using a Mellanox MT28800 card in a Power 9 server. Daniel Henrique Barboza (20): virpci, domain_audit: use virPCIDeviceAddressAsString() qemu, lxc: move NodeDeviceGetPCIInfo() function to domain_driver.c domain_driver.c: use PCI address with virDomainDriverNodeDeviceGetPCIInfo() virpci.c: simplify virPCIDeviceNew() signature virpci: introduce virPCIDeviceExists() virhostdev.c: virHostdevGetPCIHostDevice() now reports missing device security_selinux.c: modernize set/restore hostdev subsys label functions security_dac.c: modernize hostdev label set/restore functions dac, selinux: skip setting/restoring label for absent PCI devices libvirt-nodedev.c: remove return value from virNodeDeviceFree() qemu_driver.c: modernize qemuNodeDeviceReAttach() libxl_driver.c: modernize libxlNodeDeviceReAttach() virsh-domain.c: modernize cmdDetachDevice() virhostdev.c: add virHostdevIsPCIDevice() helper qemu_cgroup.c: skip absent PCI devices in qemuTeardownHostdevCgroup() virpci.c: use virPCIDeviceAddressPtr in virPCIDeviceListFindIndex() virpci.c: use virPCIDeviceAddressPtr in virPCIDeviceListFind() virpci.c: use virPCIDeviceAddressPtr in virPCIDeviceListSteal() virpci.c: use virPCIDeviceAddressPtr in virPCIDeviceListDel() virhostdev.c: remove missing PCI devs from hostdev manager include/libvirt/libvirt-nodedev.h | 2 +- src/conf/domain_audit.c | 6 +- src/datatypes.h | 2 + src/hypervisor/domain_driver.c | 30 ++++++++++ src/hypervisor/domain_driver.h | 4 ++ src/hypervisor/virhostdev.c | 88 ++++++++++++++++++++++------ src/hypervisor/virhostdev.h | 2 + src/libvirt-nodedev.c | 15 +++-- src/libvirt_private.syms | 3 + src/libxl/libxl_driver.c | 87 ++++++++-------------------- src/node_device/node_device_udev.c | 11 ++-- src/qemu/qemu_cgroup.c | 10 ++++ src/qemu/qemu_domain_address.c | 5 +- src/qemu/qemu_driver.c | 81 +++++++------------------- src/security/security_apparmor.c | 3 +- src/security/security_dac.c | 61 ++++++++------------ src/security/security_selinux.c | 66 +++++++++------------ src/security/virt-aa-helper.c | 6 +- src/util/virnetdev.c | 3 +- src/util/virnvme.c | 5 +- src/util/virpci.c | 93 ++++++++++++++---------------- src/util/virpci.h | 14 ++--- tests/virhostdevtest.c | 3 +- tests/virpcitest.c | 35 ++++++++--- tools/virsh-domain.c | 18 ++---- 25 files changed, 325 insertions(+), 328 deletions(-) -- 2.26.2

There is no need to open code the PCI address string format when we have a function that does exactly that. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- src/conf/domain_audit.c | 6 +----- src/util/virpci.c | 6 ++---- 2 files changed, 3 insertions(+), 9 deletions(-) diff --git a/src/conf/domain_audit.c b/src/conf/domain_audit.c index 8bc6633af4..5fa65a8078 100644 --- a/src/conf/domain_audit.c +++ b/src/conf/domain_audit.c @@ -361,11 +361,7 @@ virDomainAuditHostdev(virDomainObjPtr vm, virDomainHostdevDefPtr hostdev, case VIR_DOMAIN_HOSTDEV_MODE_SUBSYS: switch ((virDomainHostdevSubsysType) hostdev->source.subsys.type) { case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: - address = g_strdup_printf(VIR_PCI_DEVICE_ADDRESS_FMT, - pcisrc->addr.domain, - pcisrc->addr.bus, - pcisrc->addr.slot, - pcisrc->addr.function); + address = virPCIDeviceAddressAsString(&pcisrc->addr); break; case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB: address = g_strdup_printf("%.3d.%.3d", usbsrc->bus, usbsrc->device); diff --git a/src/util/virpci.c b/src/util/virpci.c index 9bfc743fbd..3d9d583622 100644 --- a/src/util/virpci.c +++ b/src/util/virpci.c @@ -1471,8 +1471,7 @@ virPCIDeviceNew(unsigned int domain, dev->address.slot = slot; dev->address.function = function; - dev->name = g_strdup_printf(VIR_PCI_DEVICE_ADDRESS_FMT, domain, bus, slot, - function); + dev->name = virPCIDeviceAddressAsString(&dev->address); dev->path = g_strdup_printf(PCI_SYSFS "devices/%s/config", dev->name); @@ -1998,8 +1997,7 @@ virPCIDeviceAddressGetIOMMUGroupNum(virPCIDeviceAddressPtr addr) g_autofree char *groupNumStr = NULL; unsigned int groupNum; - devName = g_strdup_printf(VIR_PCI_DEVICE_ADDRESS_FMT, addr->domain, addr->bus, - addr->slot, addr->function); + devName = virPCIDeviceAddressAsString(addr); devPath = virPCIFile(devName, "iommu_group"); -- 2.26.2

libxlNodeDeviceGetPCIInfo() and qemuNodeDeviceGetPCIInfo() are equal. Let's move the logic to a new virDomainDriverNodeDeviceGetPCIInfo() info to be used by libxl_driver.c and qemu_driver.c. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- src/hypervisor/domain_driver.c | 33 +++++++++++++++++++++++++++++ src/hypervisor/domain_driver.h | 7 +++++++ src/libvirt_private.syms | 1 + src/libxl/libxl_driver.c | 38 ++++------------------------------ src/qemu/qemu_driver.c | 37 +++------------------------------ 5 files changed, 48 insertions(+), 68 deletions(-) diff --git a/src/hypervisor/domain_driver.c b/src/hypervisor/domain_driver.c index 8dc5870a61..68dbf10ac6 100644 --- a/src/hypervisor/domain_driver.c +++ b/src/hypervisor/domain_driver.c @@ -21,6 +21,7 @@ #include <config.h> #include "domain_driver.h" +#include "node_device_conf.h" #include "viralloc.h" #include "virstring.h" #include "vircrypto.h" @@ -336,3 +337,35 @@ virDomainDriverSetupPersistentDefBlkioParams(virDomainDefPtr persistentDef, return ret; } + + +int +virDomainDriverNodeDeviceGetPCIInfo(virNodeDeviceDefPtr def, + unsigned *domain, + unsigned *bus, + unsigned *slot, + unsigned *function) +{ + virNodeDevCapsDefPtr cap; + + cap = def->caps; + while (cap) { + if (cap->data.type == VIR_NODE_DEV_CAP_PCI_DEV) { + *domain = cap->data.pci_dev.domain; + *bus = cap->data.pci_dev.bus; + *slot = cap->data.pci_dev.slot; + *function = cap->data.pci_dev.function; + break; + } + + cap = cap->next; + } + + if (!cap) { + virReportError(VIR_ERR_INVALID_ARG, + _("device %s is not a PCI device"), def->name); + return -1; + } + + return 0; +} diff --git a/src/hypervisor/domain_driver.h b/src/hypervisor/domain_driver.h index b66ae2d421..2bb053d559 100644 --- a/src/hypervisor/domain_driver.h +++ b/src/hypervisor/domain_driver.h @@ -21,6 +21,7 @@ #pragma once #include "domain_conf.h" +#include "node_device_conf.h" char * virDomainDriverGenerateRootHash(const char *drivername, @@ -45,3 +46,9 @@ int virDomainDriverParseBlkioDeviceStr(char *blkioDeviceStr, const char *type, int virDomainDriverSetupPersistentDefBlkioParams(virDomainDefPtr persistentDef, virTypedParameterPtr params, int nparams); + +int virDomainDriverNodeDeviceGetPCIInfo(virNodeDeviceDefPtr def, + unsigned *domain, + unsigned *bus, + unsigned *slot, + unsigned *function); diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 583fc5800e..fe4ee25f3a 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1445,6 +1445,7 @@ virDomainCgroupSetupMemtune; virDomainDriverGenerateMachineName; virDomainDriverGenerateRootHash; virDomainDriverMergeBlkioDevice; +virDomainDriverNodeDeviceGetPCIInfo; virDomainDriverParseBlkioDeviceStr; virDomainDriverSetupPersistentDefBlkioParams; diff --git a/src/libxl/libxl_driver.c b/src/libxl/libxl_driver.c index d58af1a869..ac2564a563 100644 --- a/src/libxl/libxl_driver.c +++ b/src/libxl/libxl_driver.c @@ -56,6 +56,7 @@ #include "cpu/cpu.h" #include "virutil.h" #include "domain_validate.h" +#include "domain_driver.h" #define VIR_FROM_THIS VIR_FROM_LIBXL @@ -5779,37 +5780,6 @@ libxlConnectSupportsFeature(virConnectPtr conn, int feature) } } -static int -libxlNodeDeviceGetPCIInfo(virNodeDeviceDefPtr def, - unsigned *domain, - unsigned *bus, - unsigned *slot, - unsigned *function) -{ - virNodeDevCapsDefPtr cap; - - cap = def->caps; - while (cap) { - if (cap->data.type == VIR_NODE_DEV_CAP_PCI_DEV) { - *domain = cap->data.pci_dev.domain; - *bus = cap->data.pci_dev.bus; - *slot = cap->data.pci_dev.slot; - *function = cap->data.pci_dev.function; - break; - } - - cap = cap->next; - } - - if (!cap) { - virReportError(VIR_ERR_INVALID_ARG, - _("device %s is not a PCI device"), def->name); - return -1; - } - - return 0; -} - static int libxlNodeDeviceDetachFlags(virNodeDevicePtr dev, const char *driverName, @@ -5851,7 +5821,7 @@ libxlNodeDeviceDetachFlags(virNodeDevicePtr dev, if (virNodeDeviceDetachFlagsEnsureACL(dev->conn, def) < 0) goto cleanup; - if (libxlNodeDeviceGetPCIInfo(def, &domain, &bus, &slot, &function) < 0) + if (virDomainDriverNodeDeviceGetPCIInfo(def, &domain, &bus, &slot, &function) < 0) goto cleanup; pci = virPCIDeviceNew(domain, bus, slot, function); @@ -5922,7 +5892,7 @@ libxlNodeDeviceReAttach(virNodeDevicePtr dev) if (virNodeDeviceReAttachEnsureACL(dev->conn, def) < 0) goto cleanup; - if (libxlNodeDeviceGetPCIInfo(def, &domain, &bus, &slot, &function) < 0) + if (virDomainDriverNodeDeviceGetPCIInfo(def, &domain, &bus, &slot, &function) < 0) goto cleanup; pci = virPCIDeviceNew(domain, bus, slot, function); @@ -5980,7 +5950,7 @@ libxlNodeDeviceReset(virNodeDevicePtr dev) if (virNodeDeviceResetEnsureACL(dev->conn, def) < 0) goto cleanup; - if (libxlNodeDeviceGetPCIInfo(def, &domain, &bus, &slot, &function) < 0) + if (virDomainDriverNodeDeviceGetPCIInfo(def, &domain, &bus, &slot, &function) < 0) goto cleanup; pci = virPCIDeviceNew(domain, bus, slot, function); diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 62b0852c33..6c09d95164 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -11951,37 +11951,6 @@ qemuDomainMigrateConfirm3Params(virDomainPtr domain, } -static int -qemuNodeDeviceGetPCIInfo(virNodeDeviceDefPtr def, - unsigned *domain, - unsigned *bus, - unsigned *slot, - unsigned *function) -{ - virNodeDevCapsDefPtr cap; - - cap = def->caps; - while (cap) { - if (cap->data.type == VIR_NODE_DEV_CAP_PCI_DEV) { - *domain = cap->data.pci_dev.domain; - *bus = cap->data.pci_dev.bus; - *slot = cap->data.pci_dev.slot; - *function = cap->data.pci_dev.function; - break; - } - - cap = cap->next; - } - - if (!cap) { - virReportError(VIR_ERR_INVALID_ARG, - _("device %s is not a PCI device"), def->name); - return -1; - } - - return 0; -} - static int qemuNodeDeviceDetachFlags(virNodeDevicePtr dev, const char *driverName, @@ -12024,7 +11993,7 @@ qemuNodeDeviceDetachFlags(virNodeDevicePtr dev, if (virNodeDeviceDetachFlagsEnsureACL(dev->conn, def) < 0) goto cleanup; - if (qemuNodeDeviceGetPCIInfo(def, &domain, &bus, &slot, &function) < 0) + if (virDomainDriverNodeDeviceGetPCIInfo(def, &domain, &bus, &slot, &function) < 0) goto cleanup; pci = virPCIDeviceNew(domain, bus, slot, function); @@ -12105,7 +12074,7 @@ qemuNodeDeviceReAttach(virNodeDevicePtr dev) if (virNodeDeviceReAttachEnsureACL(dev->conn, def) < 0) goto cleanup; - if (qemuNodeDeviceGetPCIInfo(def, &domain, &bus, &slot, &function) < 0) + if (virDomainDriverNodeDeviceGetPCIInfo(def, &domain, &bus, &slot, &function) < 0) goto cleanup; pci = virPCIDeviceNew(domain, bus, slot, function); @@ -12159,7 +12128,7 @@ qemuNodeDeviceReset(virNodeDevicePtr dev) if (virNodeDeviceResetEnsureACL(dev->conn, def) < 0) goto cleanup; - if (qemuNodeDeviceGetPCIInfo(def, &domain, &bus, &slot, &function) < 0) + if (virDomainDriverNodeDeviceGetPCIInfo(def, &domain, &bus, &slot, &function) < 0) goto cleanup; pci = virPCIDeviceNew(domain, bus, slot, function); -- 2.26.2

Instead of receiving 4 uints in order and write domain/bus/slot/function, receive a virPCIDeviceAddressPtr instead and write into it. This change will allow us to simplify the API for virPCIDeviceNew() in the next patch. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- src/hypervisor/domain_driver.c | 13 +++++-------- src/hypervisor/domain_driver.h | 5 +---- src/libxl/libxl_driver.c | 18 +++++++++--------- src/qemu/qemu_driver.c | 18 +++++++++--------- 4 files changed, 24 insertions(+), 30 deletions(-) diff --git a/src/hypervisor/domain_driver.c b/src/hypervisor/domain_driver.c index 68dbf10ac6..cdb53bcf3e 100644 --- a/src/hypervisor/domain_driver.c +++ b/src/hypervisor/domain_driver.c @@ -341,20 +341,17 @@ virDomainDriverSetupPersistentDefBlkioParams(virDomainDefPtr persistentDef, int virDomainDriverNodeDeviceGetPCIInfo(virNodeDeviceDefPtr def, - unsigned *domain, - unsigned *bus, - unsigned *slot, - unsigned *function) + virPCIDeviceAddressPtr devAddr) { virNodeDevCapsDefPtr cap; cap = def->caps; while (cap) { if (cap->data.type == VIR_NODE_DEV_CAP_PCI_DEV) { - *domain = cap->data.pci_dev.domain; - *bus = cap->data.pci_dev.bus; - *slot = cap->data.pci_dev.slot; - *function = cap->data.pci_dev.function; + devAddr->domain = cap->data.pci_dev.domain; + devAddr->bus = cap->data.pci_dev.bus; + devAddr->slot = cap->data.pci_dev.slot; + devAddr->function = cap->data.pci_dev.function; break; } diff --git a/src/hypervisor/domain_driver.h b/src/hypervisor/domain_driver.h index 2bb053d559..86b92d0284 100644 --- a/src/hypervisor/domain_driver.h +++ b/src/hypervisor/domain_driver.h @@ -48,7 +48,4 @@ int virDomainDriverSetupPersistentDefBlkioParams(virDomainDefPtr persistentDef, int nparams); int virDomainDriverNodeDeviceGetPCIInfo(virNodeDeviceDefPtr def, - unsigned *domain, - unsigned *bus, - unsigned *slot, - unsigned *function); + virPCIDeviceAddressPtr devAddr); diff --git a/src/libxl/libxl_driver.c b/src/libxl/libxl_driver.c index ac2564a563..691cecb1e9 100644 --- a/src/libxl/libxl_driver.c +++ b/src/libxl/libxl_driver.c @@ -5786,7 +5786,7 @@ libxlNodeDeviceDetachFlags(virNodeDevicePtr dev, unsigned int flags) { virPCIDevicePtr pci = NULL; - unsigned domain = 0, bus = 0, slot = 0, function = 0; + virPCIDeviceAddress devAddr; int ret = -1; virNodeDeviceDefPtr def = NULL; char *xml = NULL; @@ -5821,10 +5821,10 @@ libxlNodeDeviceDetachFlags(virNodeDevicePtr dev, if (virNodeDeviceDetachFlagsEnsureACL(dev->conn, def) < 0) goto cleanup; - if (virDomainDriverNodeDeviceGetPCIInfo(def, &domain, &bus, &slot, &function) < 0) + if (virDomainDriverNodeDeviceGetPCIInfo(def, &devAddr) < 0) goto cleanup; - pci = virPCIDeviceNew(domain, bus, slot, function); + pci = virPCIDeviceNew(devAddr.domain, devAddr.bus, devAddr.slot, devAddr.function); if (!pci) goto cleanup; @@ -5859,7 +5859,7 @@ static int libxlNodeDeviceReAttach(virNodeDevicePtr dev) { virPCIDevicePtr pci = NULL; - unsigned domain = 0, bus = 0, slot = 0, function = 0; + virPCIDeviceAddress devAddr; int ret = -1; virNodeDeviceDefPtr def = NULL; char *xml = NULL; @@ -5892,10 +5892,10 @@ libxlNodeDeviceReAttach(virNodeDevicePtr dev) if (virNodeDeviceReAttachEnsureACL(dev->conn, def) < 0) goto cleanup; - if (virDomainDriverNodeDeviceGetPCIInfo(def, &domain, &bus, &slot, &function) < 0) + if (virDomainDriverNodeDeviceGetPCIInfo(def, &devAddr) < 0) goto cleanup; - pci = virPCIDeviceNew(domain, bus, slot, function); + pci = virPCIDeviceNew(devAddr.domain, devAddr.bus, devAddr.slot, devAddr.function); if (!pci) goto cleanup; @@ -5917,7 +5917,7 @@ static int libxlNodeDeviceReset(virNodeDevicePtr dev) { virPCIDevicePtr pci = NULL; - unsigned domain = 0, bus = 0, slot = 0, function = 0; + virPCIDeviceAddress devAddr; int ret = -1; virNodeDeviceDefPtr def = NULL; char *xml = NULL; @@ -5950,10 +5950,10 @@ libxlNodeDeviceReset(virNodeDevicePtr dev) if (virNodeDeviceResetEnsureACL(dev->conn, def) < 0) goto cleanup; - if (virDomainDriverNodeDeviceGetPCIInfo(def, &domain, &bus, &slot, &function) < 0) + if (virDomainDriverNodeDeviceGetPCIInfo(def, &devAddr) < 0) goto cleanup; - pci = virPCIDeviceNew(domain, bus, slot, function); + pci = virPCIDeviceNew(devAddr.domain, devAddr.bus, devAddr.slot, devAddr.function); if (!pci) goto cleanup; diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 6c09d95164..8c59a5e589 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -11958,7 +11958,7 @@ qemuNodeDeviceDetachFlags(virNodeDevicePtr dev, { virQEMUDriverPtr driver = dev->conn->privateData; virPCIDevicePtr pci = NULL; - unsigned domain = 0, bus = 0, slot = 0, function = 0; + virPCIDeviceAddress devAddr; int ret = -1; virNodeDeviceDefPtr def = NULL; g_autofree char *xml = NULL; @@ -11993,10 +11993,10 @@ qemuNodeDeviceDetachFlags(virNodeDevicePtr dev, if (virNodeDeviceDetachFlagsEnsureACL(dev->conn, def) < 0) goto cleanup; - if (virDomainDriverNodeDeviceGetPCIInfo(def, &domain, &bus, &slot, &function) < 0) + if (virDomainDriverNodeDeviceGetPCIInfo(def, &devAddr) < 0) goto cleanup; - pci = virPCIDeviceNew(domain, bus, slot, function); + pci = virPCIDeviceNew(devAddr.domain, devAddr.bus, devAddr.slot, devAddr.function); if (!pci) goto cleanup; @@ -12042,7 +12042,7 @@ qemuNodeDeviceReAttach(virNodeDevicePtr dev) { virQEMUDriverPtr driver = dev->conn->privateData; virPCIDevicePtr pci = NULL; - unsigned domain = 0, bus = 0, slot = 0, function = 0; + virPCIDeviceAddress devAddr; int ret = -1; virNodeDeviceDefPtr def = NULL; g_autofree char *xml = NULL; @@ -12074,10 +12074,10 @@ qemuNodeDeviceReAttach(virNodeDevicePtr dev) if (virNodeDeviceReAttachEnsureACL(dev->conn, def) < 0) goto cleanup; - if (virDomainDriverNodeDeviceGetPCIInfo(def, &domain, &bus, &slot, &function) < 0) + if (virDomainDriverNodeDeviceGetPCIInfo(def, &devAddr) < 0) goto cleanup; - pci = virPCIDeviceNew(domain, bus, slot, function); + pci = virPCIDeviceNew(devAddr.domain, devAddr.bus, devAddr.slot, devAddr.function); if (!pci) goto cleanup; @@ -12096,7 +12096,7 @@ qemuNodeDeviceReset(virNodeDevicePtr dev) { virQEMUDriverPtr driver = dev->conn->privateData; virPCIDevicePtr pci; - unsigned domain = 0, bus = 0, slot = 0, function = 0; + virPCIDeviceAddress devAddr; int ret = -1; virNodeDeviceDefPtr def = NULL; g_autofree char *xml = NULL; @@ -12128,10 +12128,10 @@ qemuNodeDeviceReset(virNodeDevicePtr dev) if (virNodeDeviceResetEnsureACL(dev->conn, def) < 0) goto cleanup; - if (virDomainDriverNodeDeviceGetPCIInfo(def, &domain, &bus, &slot, &function) < 0) + if (virDomainDriverNodeDeviceGetPCIInfo(def, &devAddr) < 0) goto cleanup; - pci = virPCIDeviceNew(domain, bus, slot, function); + pci = virPCIDeviceNew(devAddr.domain, devAddr.bus, devAddr.slot, devAddr.function); if (!pci) goto cleanup; -- 2.26.2

The current virPCIDeviceNew() signature, receiving 4 uints in sequence (domain, bus, slot, function), is not neat. We already have a way to represent a PCI address in virPCIDeviceAddress that is used in the code. Aside from the test files, most of virPCIDeviceNew() callers have access to a virPCIDeviceAddress reference, but then we need to retrieve the 4 required uints (addr.domain, addr.bus, addr.slot, addr.function) to satisfy virPCIDeviceNew(). The result is that we have extra verbosity/boilerplate to retrieve an information that is already available in virPCIDeviceAddress. A better way is presented by virNVMEDeviceNew(), where the caller just supplies a virPCIDeviceAddress pointer and the function handles the details internally. This patch changes virPCIDeviceNew() to receive a virPCIDeviceAddress pointer instead of 4 uints. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- src/hypervisor/virhostdev.c | 3 +-- src/libxl/libxl_driver.c | 6 ++--- src/node_device/node_device_udev.c | 11 +++++--- src/qemu/qemu_domain_address.c | 5 +--- src/qemu/qemu_driver.c | 6 ++--- src/security/security_apparmor.c | 3 +-- src/security/security_dac.c | 6 ++--- src/security/security_selinux.c | 6 ++--- src/security/virt-aa-helper.c | 6 +---- src/util/virnetdev.c | 3 +-- src/util/virnvme.c | 5 +--- src/util/virpci.c | 40 +++++++++--------------------- src/util/virpci.h | 5 +--- tests/virhostdevtest.c | 3 ++- tests/virpcitest.c | 35 +++++++++++++++++++------- 15 files changed, 64 insertions(+), 79 deletions(-) diff --git a/src/hypervisor/virhostdev.c b/src/hypervisor/virhostdev.c index 120187b07a..7338ac0700 100644 --- a/src/hypervisor/virhostdev.c +++ b/src/hypervisor/virhostdev.c @@ -235,8 +235,7 @@ virHostdevGetPCIHostDevice(const virDomainHostdevDef *hostdev, hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI) return 0; - actual = virPCIDeviceNew(pcisrc->addr.domain, pcisrc->addr.bus, - pcisrc->addr.slot, pcisrc->addr.function); + actual = virPCIDeviceNew(&pcisrc->addr); if (!actual) return -1; diff --git a/src/libxl/libxl_driver.c b/src/libxl/libxl_driver.c index 691cecb1e9..9b93fd77d2 100644 --- a/src/libxl/libxl_driver.c +++ b/src/libxl/libxl_driver.c @@ -5824,7 +5824,7 @@ libxlNodeDeviceDetachFlags(virNodeDevicePtr dev, if (virDomainDriverNodeDeviceGetPCIInfo(def, &devAddr) < 0) goto cleanup; - pci = virPCIDeviceNew(devAddr.domain, devAddr.bus, devAddr.slot, devAddr.function); + pci = virPCIDeviceNew(&devAddr); if (!pci) goto cleanup; @@ -5895,7 +5895,7 @@ libxlNodeDeviceReAttach(virNodeDevicePtr dev) if (virDomainDriverNodeDeviceGetPCIInfo(def, &devAddr) < 0) goto cleanup; - pci = virPCIDeviceNew(devAddr.domain, devAddr.bus, devAddr.slot, devAddr.function); + pci = virPCIDeviceNew(&devAddr); if (!pci) goto cleanup; @@ -5953,7 +5953,7 @@ libxlNodeDeviceReset(virNodeDevicePtr dev) if (virDomainDriverNodeDeviceGetPCIInfo(def, &devAddr) < 0) goto cleanup; - pci = virPCIDeviceNew(devAddr.domain, devAddr.bus, devAddr.slot, devAddr.function); + pci = virPCIDeviceNew(&devAddr); if (!pci) goto cleanup; diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index 55a2731681..fceb135aa5 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -367,6 +367,7 @@ udevProcessPCI(struct udev_device *device, virNodeDevCapPCIDevPtr pci_dev = &def->caps->data.pci_dev; virPCIEDeviceInfoPtr pci_express = NULL; virPCIDevicePtr pciDev = NULL; + virPCIDeviceAddress devAddr; int ret = -1; char *p; bool privileged; @@ -416,10 +417,12 @@ udevProcessPCI(struct udev_device *device, if (virNodeDeviceGetPCIDynamicCaps(def->sysfs_path, pci_dev) < 0) goto cleanup; - if (!(pciDev = virPCIDeviceNew(pci_dev->domain, - pci_dev->bus, - pci_dev->slot, - pci_dev->function))) + devAddr.domain = pci_dev->domain; + devAddr.bus = pci_dev->bus; + devAddr.slot = pci_dev->slot; + devAddr.function = pci_dev->function; + + if (!(pciDev = virPCIDeviceNew(&devAddr))) goto cleanup; /* We need to be root to read PCI device configs */ diff --git a/src/qemu/qemu_domain_address.c b/src/qemu/qemu_domain_address.c index f0ba318cc8..ae4ad6677c 100644 --- a/src/qemu/qemu_domain_address.c +++ b/src/qemu/qemu_domain_address.c @@ -857,10 +857,7 @@ qemuDomainDeviceCalculatePCIConnectFlags(virDomainDeviceDefPtr dev, return 0; } - if (!(pciDev = virPCIDeviceNew(hostAddr->domain, - hostAddr->bus, - hostAddr->slot, - hostAddr->function))) { + if (!(pciDev = virPCIDeviceNew(hostAddr))) { /* libvirt should be able to perform all the * operations in virPCIDeviceNew() even if it's * running unprivileged, so if this fails, the device diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 8c59a5e589..c450501c4d 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -11996,7 +11996,7 @@ qemuNodeDeviceDetachFlags(virNodeDevicePtr dev, if (virDomainDriverNodeDeviceGetPCIInfo(def, &devAddr) < 0) goto cleanup; - pci = virPCIDeviceNew(devAddr.domain, devAddr.bus, devAddr.slot, devAddr.function); + pci = virPCIDeviceNew(&devAddr); if (!pci) goto cleanup; @@ -12077,7 +12077,7 @@ qemuNodeDeviceReAttach(virNodeDevicePtr dev) if (virDomainDriverNodeDeviceGetPCIInfo(def, &devAddr) < 0) goto cleanup; - pci = virPCIDeviceNew(devAddr.domain, devAddr.bus, devAddr.slot, devAddr.function); + pci = virPCIDeviceNew(&devAddr); if (!pci) goto cleanup; @@ -12131,7 +12131,7 @@ qemuNodeDeviceReset(virNodeDevicePtr dev) if (virDomainDriverNodeDeviceGetPCIInfo(def, &devAddr) < 0) goto cleanup; - pci = virPCIDeviceNew(devAddr.domain, devAddr.bus, devAddr.slot, devAddr.function); + pci = virPCIDeviceNew(&devAddr); if (!pci) goto cleanup; diff --git a/src/security/security_apparmor.c b/src/security/security_apparmor.c index 1b035cce2f..b564866678 100644 --- a/src/security/security_apparmor.c +++ b/src/security/security_apparmor.c @@ -876,8 +876,7 @@ AppArmorSetSecurityHostdevLabel(virSecurityManagerPtr mgr, case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: { virPCIDevicePtr pci = - virPCIDeviceNew(pcisrc->addr.domain, pcisrc->addr.bus, - pcisrc->addr.slot, pcisrc->addr.function); + virPCIDeviceNew(&pcisrc->addr); if (!pci) goto done; diff --git a/src/security/security_dac.c b/src/security/security_dac.c index 4f4a0a069e..b5e56feaa8 100644 --- a/src/security/security_dac.c +++ b/src/security/security_dac.c @@ -1268,8 +1268,7 @@ virSecurityDACSetHostdevLabel(virSecurityManagerPtr mgr, case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: { virPCIDevicePtr pci = - virPCIDeviceNew(pcisrc->addr.domain, pcisrc->addr.bus, - pcisrc->addr.slot, pcisrc->addr.function); + virPCIDeviceNew(&pcisrc->addr); if (!pci) return -1; @@ -1437,8 +1436,7 @@ virSecurityDACRestoreHostdevLabel(virSecurityManagerPtr mgr, case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: { virPCIDevicePtr pci = - virPCIDeviceNew(pcisrc->addr.domain, pcisrc->addr.bus, - pcisrc->addr.slot, pcisrc->addr.function); + virPCIDeviceNew(&pcisrc->addr); if (!pci) return -1; diff --git a/src/security/security_selinux.c b/src/security/security_selinux.c index e9cd95916e..c43d326314 100644 --- a/src/security/security_selinux.c +++ b/src/security/security_selinux.c @@ -2104,8 +2104,7 @@ virSecuritySELinuxSetHostdevSubsysLabel(virSecurityManagerPtr mgr, case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: { virPCIDevicePtr pci = - virPCIDeviceNew(pcisrc->addr.domain, pcisrc->addr.bus, - pcisrc->addr.slot, pcisrc->addr.function); + virPCIDeviceNew(&pcisrc->addr); if (!pci) return -1; @@ -2344,8 +2343,7 @@ virSecuritySELinuxRestoreHostdevSubsysLabel(virSecurityManagerPtr mgr, case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: { virPCIDevicePtr pci = - virPCIDeviceNew(pcisrc->addr.domain, pcisrc->addr.bus, - pcisrc->addr.slot, pcisrc->addr.function); + virPCIDeviceNew(&pcisrc->addr); if (!pci) return -1; diff --git a/src/security/virt-aa-helper.c b/src/security/virt-aa-helper.c index 5a6f4a5f7d..5efdb61e59 100644 --- a/src/security/virt-aa-helper.c +++ b/src/security/virt-aa-helper.c @@ -1104,11 +1104,7 @@ get_files(vahControl * ctl) } case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: { - virPCIDevicePtr pci = virPCIDeviceNew( - dev->source.subsys.u.pci.addr.domain, - dev->source.subsys.u.pci.addr.bus, - dev->source.subsys.u.pci.addr.slot, - dev->source.subsys.u.pci.addr.function); + virPCIDevicePtr pci = virPCIDeviceNew(&dev->source.subsys.u.pci.addr); virDomainHostdevSubsysPCIBackendType backend = dev->source.subsys.u.pci.backend; if (backend == VIR_DOMAIN_HOSTDEV_PCI_BACKEND_VFIO || diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index a73e5f72f1..acb3ec960c 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -1141,8 +1141,7 @@ virNetDevGetPCIDevice(const char *devName) if (!vfPCIAddr) return NULL; - return virPCIDeviceNew(vfPCIAddr->domain, vfPCIAddr->bus, - vfPCIAddr->slot, vfPCIAddr->function); + return virPCIDeviceNew(vfPCIAddr); } # endif diff --git a/src/util/virnvme.c b/src/util/virnvme.c index b8179aa431..66b73cd1d1 100644 --- a/src/util/virnvme.c +++ b/src/util/virnvme.c @@ -290,10 +290,7 @@ virNVMeDeviceCreatePCIDevice(const virNVMeDevice *nvme) { g_autoptr(virPCIDevice) pci = NULL; - if (!(pci = virPCIDeviceNew(nvme->address.domain, - nvme->address.bus, - nvme->address.slot, - nvme->address.function))) + if (!(pci = virPCIDeviceNew(&nvme->address))) return NULL; /* NVMe devices must be bound to vfio */ diff --git a/src/util/virpci.c b/src/util/virpci.c index 3d9d583622..545b2650a0 100644 --- a/src/util/virpci.c +++ b/src/util/virpci.c @@ -475,24 +475,24 @@ virPCIDeviceIterDevices(virPCIDeviceIterPredicate predicate, return -1; while ((ret = virDirRead(dir, &entry, PCI_SYSFS "devices")) > 0) { - unsigned int domain, bus, slot, function; g_autoptr(virPCIDevice) check = NULL; + virPCIDeviceAddress devAddr; char *tmp; /* expected format: <domain>:<bus>:<slot>.<function> */ if (/* domain */ - virStrToLong_ui(entry->d_name, &tmp, 16, &domain) < 0 || *tmp != ':' || + virStrToLong_ui(entry->d_name, &tmp, 16, &devAddr.domain) < 0 || *tmp != ':' || /* bus */ - virStrToLong_ui(tmp + 1, &tmp, 16, &bus) < 0 || *tmp != ':' || + virStrToLong_ui(tmp + 1, &tmp, 16, &devAddr.bus) < 0 || *tmp != ':' || /* slot */ - virStrToLong_ui(tmp + 1, &tmp, 16, &slot) < 0 || *tmp != '.' || + virStrToLong_ui(tmp + 1, &tmp, 16, &devAddr.slot) < 0 || *tmp != '.' || /* function */ - virStrToLong_ui(tmp + 1, NULL, 16, &function) < 0) { + virStrToLong_ui(tmp + 1, NULL, 16, &devAddr.function) < 0) { VIR_WARN("Unusual entry in " PCI_SYSFS "devices: %s", entry->d_name); continue; } - check = virPCIDeviceNew(domain, bus, slot, function); + check = virPCIDeviceNew(&devAddr); if (!check) { ret = -1; break; @@ -767,10 +767,7 @@ virPCIDeviceIsParent(virPCIDevicePtr dev, virPCIDevicePtr check, void *data) */ if (dev->address.bus > secondary && dev->address.bus <= subordinate) { if (*best == NULL) { - *best = virPCIDeviceNew(check->address.domain, - check->address.bus, - check->address.slot, - check->address.function); + *best = virPCIDeviceNew(&check->address); if (*best == NULL) { ret = -1; goto cleanup; @@ -790,10 +787,7 @@ virPCIDeviceIsParent(virPCIDevicePtr dev, virPCIDevicePtr check, void *data) if (secondary > best_secondary) { virPCIDeviceFree(*best); - *best = virPCIDeviceNew(check->address.domain, - check->address.bus, - check->address.slot, - check->address.function); + *best = virPCIDeviceNew(&check->address); if (*best == NULL) { ret = -1; goto cleanup; @@ -1455,10 +1449,7 @@ virPCIDeviceAddressAsString(const virPCIDeviceAddress *addr) } virPCIDevicePtr -virPCIDeviceNew(unsigned int domain, - unsigned int bus, - unsigned int slot, - unsigned int function) +virPCIDeviceNew(const virPCIDeviceAddress *address) { g_autoptr(virPCIDevice) dev = NULL; g_autofree char *vendor = NULL; @@ -1466,10 +1457,7 @@ virPCIDeviceNew(unsigned int domain, dev = g_new0(virPCIDevice, 1); - dev->address.domain = domain; - dev->address.bus = bus; - dev->address.slot = slot; - dev->address.function = function; + virPCIDeviceAddressCopy(&dev->address, address); dev->name = virPCIDeviceAddressAsString(&dev->address); @@ -1896,8 +1884,7 @@ virPCIDeviceGetIOMMUGroupAddOne(virPCIDeviceAddressPtr newDevAddr, void *opaque) virPCIDeviceListPtr groupList = opaque; g_autoptr(virPCIDevice) newDev = NULL; - if (!(newDev = virPCIDeviceNew(newDevAddr->domain, newDevAddr->bus, - newDevAddr->slot, newDevAddr->function))) + if (!(newDev = virPCIDeviceNew(newDevAddr))) return -1; if (virPCIDeviceListAdd(groupList, newDev) < 0) @@ -2028,10 +2015,7 @@ virPCIDeviceAddressGetIOMMUGroupDev(const virPCIDeviceAddress *devAddr) { g_autoptr(virPCIDevice) pci = NULL; - if (!(pci = virPCIDeviceNew(devAddr->domain, - devAddr->bus, - devAddr->slot, - devAddr->function))) + if (!(pci = virPCIDeviceNew(devAddr))) return NULL; return virPCIDeviceGetIOMMUGroupDev(pci); diff --git a/src/util/virpci.h b/src/util/virpci.h index 43828b0a8a..d4451848c1 100644 --- a/src/util/virpci.h +++ b/src/util/virpci.h @@ -113,10 +113,7 @@ struct _virPCIEDeviceInfo { virPCIELink *link_sta; /* Actually negotiated capabilities */ }; -virPCIDevicePtr virPCIDeviceNew(unsigned int domain, - unsigned int bus, - unsigned int slot, - unsigned int function); +virPCIDevicePtr virPCIDeviceNew(const virPCIDeviceAddress *address); virPCIDevicePtr virPCIDeviceCopy(virPCIDevicePtr dev); void virPCIDeviceFree(virPCIDevicePtr dev); const char *virPCIDeviceGetName(virPCIDevicePtr dev); diff --git a/tests/virhostdevtest.c b/tests/virhostdevtest.c index 385db0849a..40c14a5281 100644 --- a/tests/virhostdevtest.c +++ b/tests/virhostdevtest.c @@ -138,7 +138,8 @@ myInit(void) } for (i = 0; i < nhostdevs; i++) { - if (!(dev[i] = virPCIDeviceNew(0, 0, i + 1, 0))) + virDomainHostdevSubsys subsys = hostdevs[i]->source.subsys; + if (!(dev[i] = virPCIDeviceNew(&subsys.u.pci.addr))) goto cleanup; virPCIDeviceSetStubDriver(dev[i], VIR_PCI_STUB_DRIVER_VFIO); diff --git a/tests/virpcitest.c b/tests/virpcitest.c index 6f064a3f85..6a4bd5518d 100644 --- a/tests/virpcitest.c +++ b/tests/virpcitest.c @@ -60,8 +60,9 @@ testVirPCIDeviceNew(const void *opaque G_GNUC_UNUSED) int ret = -1; virPCIDevicePtr dev; const char *devName; + virPCIDeviceAddress devAddr = {.domain = 0, .bus = 0, .slot = 0, .function = 0}; - if (!(dev = virPCIDeviceNew(0, 0, 0, 0))) + if (!(dev = virPCIDeviceNew(&devAddr))) goto cleanup; devName = virPCIDeviceGetName(dev); @@ -103,7 +104,9 @@ testVirPCIDeviceDetach(const void *opaque G_GNUC_UNUSED) CHECK_LIST_COUNT(inactiveDevs, 0); for (i = 0; i < nDev; i++) { - if (!(dev[i] = virPCIDeviceNew(0, 0, i + 1, 0))) + virPCIDeviceAddress devAddr = {.domain = 0, .bus = 0, + .slot = i + 1, .function = 0}; + if (!(dev[i] = virPCIDeviceNew(&devAddr))) goto cleanup; virPCIDeviceSetStubDriver(dev[i], VIR_PCI_STUB_DRIVER_VFIO); @@ -144,7 +147,9 @@ testVirPCIDeviceReset(const void *opaque G_GNUC_UNUSED) CHECK_LIST_COUNT(inactiveDevs, 0); for (i = 0; i < nDev; i++) { - if (!(dev[i] = virPCIDeviceNew(0, 0, i + 1, 0))) + virPCIDeviceAddress devAddr = {.domain = 0, .bus = 0, + .slot = i + 1, .function = 0}; + if (!(dev[i] = virPCIDeviceNew(&devAddr))) goto cleanup; virPCIDeviceSetStubDriver(dev[i], VIR_PCI_STUB_DRIVER_VFIO); @@ -176,7 +181,9 @@ testVirPCIDeviceReattach(const void *opaque G_GNUC_UNUSED) goto cleanup; for (i = 0; i < nDev; i++) { - if (!(dev[i] = virPCIDeviceNew(0, 0, i + 1, 0))) + virPCIDeviceAddress devAddr = {.domain = 0, .bus = 0, + .slot = i + 1, .function = 0}; + if (!(dev[i] = virPCIDeviceNew(&devAddr))) goto cleanup; if (virPCIDeviceListAdd(inactiveDevs, dev[i]) < 0) { @@ -222,8 +229,10 @@ testVirPCIDeviceIsAssignable(const void *opaque) const struct testPCIDevData *data = opaque; int ret = -1; virPCIDevicePtr dev; + virPCIDeviceAddress devAddr = {.domain = data->domain, .bus = data->bus, + .slot = data->slot, .function = data->function}; - if (!(dev = virPCIDeviceNew(data->domain, data->bus, data->slot, data->function))) + if (!(dev = virPCIDeviceNew(&devAddr))) return -1; if (virPCIDeviceIsAssignable(dev, true)) @@ -239,8 +248,10 @@ testVirPCIDeviceDetachSingle(const void *opaque) const struct testPCIDevData *data = opaque; int ret = -1; virPCIDevicePtr dev; + virPCIDeviceAddress devAddr = {.domain = data->domain, .bus = data->bus, + .slot = data->slot, .function = data->function}; - dev = virPCIDeviceNew(data->domain, data->bus, data->slot, data->function); + dev = virPCIDeviceNew(&devAddr); if (!dev) goto cleanup; @@ -261,8 +272,10 @@ testVirPCIDeviceReattachSingle(const void *opaque) const struct testPCIDevData *data = opaque; int ret = -1; virPCIDevicePtr dev; + virPCIDeviceAddress devAddr = {.domain = data->domain, .bus = data->bus, + .slot = data->slot, .function = data->function}; - dev = virPCIDeviceNew(data->domain, data->bus, data->slot, data->function); + dev = virPCIDeviceNew(&devAddr); if (!dev) goto cleanup; @@ -285,8 +298,10 @@ testVirPCIDeviceCheckDriverTest(const void *opaque) const struct testPCIDevData *data = opaque; int ret = -1; virPCIDevicePtr dev; + virPCIDeviceAddress devAddr = {.domain = data->domain, .bus = data->bus, + .slot = data->slot, .function = data->function}; - dev = virPCIDeviceNew(data->domain, data->bus, data->slot, data->function); + dev = virPCIDeviceNew(&devAddr); if (!dev) goto cleanup; @@ -305,8 +320,10 @@ testVirPCIDeviceUnbind(const void *opaque) const struct testPCIDevData *data = opaque; int ret = -1; virPCIDevicePtr dev; + virPCIDeviceAddress devAddr = {.domain = data->domain, .bus = data->bus, + .slot = data->slot, .function = data->function}; - dev = virPCIDeviceNew(data->domain, data->bus, data->slot, data->function); + dev = virPCIDeviceNew(&devAddr); if (!dev) goto cleanup; -- 2.26.2

We're going to add logic to handle the case where a previously existing PCI device does not longer exist in the host. The logic was copied from virPCIDeviceNew(), which verifies if a PCI device exists in the host, returning NULL and throwing an error if it doesn't. The NULL is used for other errors as well (product/vendor id read errors, dev id overflow), meaning that we can't re-use virPCIDeviceNew() for the purpose of detecting if the device exists. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- src/libvirt_private.syms | 1 + src/util/virpci.c | 10 ++++++++++ src/util/virpci.h | 1 + 3 files changed, 12 insertions(+) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index fe4ee25f3a..81ce142635 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2817,6 +2817,7 @@ virPCIDeviceAddressIsValid; virPCIDeviceAddressParse; virPCIDeviceCopy; virPCIDeviceDetach; +virPCIDeviceExists; virPCIDeviceFileIterate; virPCIDeviceFree; virPCIDeviceGetAddress; diff --git a/src/util/virpci.c b/src/util/virpci.c index 545b2650a0..1a542b18b6 100644 --- a/src/util/virpci.c +++ b/src/util/virpci.c @@ -1448,6 +1448,16 @@ virPCIDeviceAddressAsString(const virPCIDeviceAddress *addr) return str; } +bool +virPCIDeviceExists(const virPCIDeviceAddress *addr) +{ + g_autofree char *devName = virPCIDeviceAddressAsString(addr); + g_autofree char *devPath = g_strdup_printf(PCI_SYSFS "devices/%s/config", + devName); + + return virFileExists(devPath); +} + virPCIDevicePtr virPCIDeviceNew(const virPCIDeviceAddress *address) { diff --git a/src/util/virpci.h b/src/util/virpci.h index d4451848c1..a9c597a428 100644 --- a/src/util/virpci.h +++ b/src/util/virpci.h @@ -201,6 +201,7 @@ int virPCIDeviceAddressGetIOMMUGroupAddresses(virPCIDeviceAddressPtr devAddr, size_t *nIommuGroupDevices); int virPCIDeviceAddressGetIOMMUGroupNum(virPCIDeviceAddressPtr addr); char *virPCIDeviceAddressGetIOMMUGroupDev(const virPCIDeviceAddress *devAddr); +bool virPCIDeviceExists(const virPCIDeviceAddress *addr); char *virPCIDeviceGetIOMMUGroupDev(virPCIDevicePtr dev); int virPCIDeviceIsAssignable(virPCIDevicePtr dev, -- 2.26.2

Gitlab issue #72 [1] reports that removing SR-IOVs VFs before removing the devices from the running domains can have strange consequences. QEMU might be able to hotunplug the device inside the guest, but Libvirt will not be aware of that, and then the guest is now inconsistent with the domain definition. There's also the possibility of the VFs removal not succeeding while the domain is running but then, as soon as the domain is shutdown, all the VFs are removed. Libvirt can't handle the removal of the PCI devices while trying to reattach the hostdevs, and the Libvirt daemon can be left in an inconsistent state (see [2]). This patch starts to address the issue related in Gitlab #72, most notably the issue described in [2]. When shutting down a domain with SR-IOV hostdevs that got missing, virHostdevReAttachPCIDevices() is failing the whole process and failing to reattach all the PCI devices, including the ones that aren't related to the VFs that went missing. Let's make it more resilient with host changes by changing virHostdevGetPCIHostDevice() to return an exclusive error code '-2' for this case. virHostdevGetPCIHostDeviceList() can then tell when virHostdevGetPCIHostDevice() failed to find the PCI device of a hostdev and continue to make the list of PCI devices. virHostdevReAttachPCIDevices() will now be able to proceed reattaching all other valid PCI devices, at least. The 'ghost hostdevs' will be handled later on. [1] https://gitlab.com/libvirt/libvirt/-/issues/72 [2] https://gitlab.com/libvirt/libvirt/-/issues/72#note_459032148 Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- src/hypervisor/virhostdev.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/src/hypervisor/virhostdev.c b/src/hypervisor/virhostdev.c index 7338ac0700..18373deb41 100644 --- a/src/hypervisor/virhostdev.c +++ b/src/hypervisor/virhostdev.c @@ -220,7 +220,8 @@ virHostdevManagerGetDefault(void) * is returned. * * Returns: 0 on success (@pci might be NULL though), - * -1 otherwise (with error reported). + * -1 otherwise (with error reported), + * -2 PCI device not found. @pci will be NULL */ static int virHostdevGetPCIHostDevice(const virDomainHostdevDef *hostdev, @@ -235,6 +236,9 @@ virHostdevGetPCIHostDevice(const virDomainHostdevDef *hostdev, hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI) return 0; + if (!virPCIDeviceExists(&pcisrc->addr)) + return -2; + actual = virPCIDeviceNew(&pcisrc->addr); if (!actual) @@ -270,7 +274,7 @@ virHostdevGetPCIHostDeviceList(virDomainHostdevDefPtr *hostdevs, int nhostdevs) virDomainHostdevDefPtr hostdev = hostdevs[i]; g_autoptr(virPCIDevice) pci = NULL; - if (virHostdevGetPCIHostDevice(hostdev, &pci) < 0) + if (virHostdevGetPCIHostDevice(hostdev, &pci) == -1) return NULL; if (!pci) -- 2.26.2

Use g_auto* cleanup to avoid free() calls. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- src/security/security_selinux.c | 54 ++++++++++----------------------- 1 file changed, 16 insertions(+), 38 deletions(-) diff --git a/src/security/security_selinux.c b/src/security/security_selinux.c index c43d326314..bf53932ccc 100644 --- a/src/security/security_selinux.c +++ b/src/security/security_selinux.c @@ -2086,7 +2086,7 @@ virSecuritySELinuxSetHostdevSubsysLabel(virSecurityManagerPtr mgr, switch ((virDomainHostdevSubsysType)dev->source.subsys.type) { case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB: { - virUSBDevicePtr usb; + g_autoptr(virUSBDevice) usb = NULL; if (dev->missing) return 0; @@ -2098,39 +2098,34 @@ virSecuritySELinuxSetHostdevSubsysLabel(virSecurityManagerPtr mgr, return -1; ret = virUSBDeviceFileIterate(usb, virSecuritySELinuxSetUSBLabel, &data); - virUSBDeviceFree(usb); break; } case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: { - virPCIDevicePtr pci = - virPCIDeviceNew(&pcisrc->addr); + g_autoptr(virPCIDevice) pci = virPCIDeviceNew(&pcisrc->addr); if (!pci) return -1; if (pcisrc->backend == VIR_DOMAIN_HOSTDEV_PCI_BACKEND_VFIO) { - char *vfioGroupDev = virPCIDeviceGetIOMMUGroupDev(pci); + g_autofree char *vfioGroupDev = virPCIDeviceGetIOMMUGroupDev(pci); - if (!vfioGroupDev) { - virPCIDeviceFree(pci); + if (!vfioGroupDev) return -1; - } + ret = virSecuritySELinuxSetHostdevLabelHelper(vfioGroupDev, false, &data); - VIR_FREE(vfioGroupDev); } else { ret = virPCIDeviceFileIterate(pci, virSecuritySELinuxSetPCILabel, &data); } - virPCIDeviceFree(pci); break; } case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI: { virDomainHostdevSubsysSCSIHostPtr scsihostsrc = &scsisrc->u.host; - virSCSIDevicePtr scsi = + g_autoptr(virSCSIDevice) scsi = virSCSIDeviceNew(NULL, scsihostsrc->adapter, scsihostsrc->bus, scsihostsrc->target, scsihostsrc->unit, @@ -2142,13 +2137,11 @@ virSecuritySELinuxSetHostdevSubsysLabel(virSecurityManagerPtr mgr, ret = virSCSIDeviceFileIterate(scsi, virSecuritySELinuxSetSCSILabel, &data); - virSCSIDeviceFree(scsi); - break; } case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST: { - virSCSIVHostDevicePtr host = virSCSIVHostDeviceNew(hostsrc->wwpn); + g_autoptr(virSCSIVHostDevice) host = virSCSIVHostDeviceNew(hostsrc->wwpn); if (!host) return -1; @@ -2156,19 +2149,16 @@ virSecuritySELinuxSetHostdevSubsysLabel(virSecurityManagerPtr mgr, ret = virSCSIVHostDeviceFileIterate(host, virSecuritySELinuxSetHostLabel, &data); - virSCSIVHostDeviceFree(host); break; } case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: { - char *vfiodev = NULL; + g_autofree char *vfiodev = NULL; if (!(vfiodev = virMediatedDeviceGetIOMMUGroupDev(mdevsrc->uuidstr))) return ret; ret = virSecuritySELinuxSetHostdevLabelHelper(vfiodev, true, &data); - - VIR_FREE(vfiodev); break; } @@ -2324,7 +2314,7 @@ virSecuritySELinuxRestoreHostdevSubsysLabel(virSecurityManagerPtr mgr, switch ((virDomainHostdevSubsysType)dev->source.subsys.type) { case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB: { - virUSBDevicePtr usb; + g_autoptr(virUSBDevice) usb = NULL; if (dev->missing) return 0; @@ -2336,37 +2326,31 @@ virSecuritySELinuxRestoreHostdevSubsysLabel(virSecurityManagerPtr mgr, return -1; ret = virUSBDeviceFileIterate(usb, virSecuritySELinuxRestoreUSBLabel, mgr); - virUSBDeviceFree(usb); - break; } case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: { - virPCIDevicePtr pci = - virPCIDeviceNew(&pcisrc->addr); + g_autoptr(virPCIDevice) pci = virPCIDeviceNew(&pcisrc->addr); if (!pci) return -1; if (pcisrc->backend == VIR_DOMAIN_HOSTDEV_PCI_BACKEND_VFIO) { - char *vfioGroupDev = virPCIDeviceGetIOMMUGroupDev(pci); + g_autofree char *vfioGroupDev = virPCIDeviceGetIOMMUGroupDev(pci); - if (!vfioGroupDev) { - virPCIDeviceFree(pci); + if (!vfioGroupDev) return -1; - } + ret = virSecuritySELinuxRestoreFileLabel(mgr, vfioGroupDev, false); - VIR_FREE(vfioGroupDev); } else { ret = virPCIDeviceFileIterate(pci, virSecuritySELinuxRestorePCILabel, mgr); } - virPCIDeviceFree(pci); break; } case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI: { virDomainHostdevSubsysSCSIHostPtr scsihostsrc = &scsisrc->u.host; - virSCSIDevicePtr scsi = + g_autoptr(virSCSIDevice) scsi = virSCSIDeviceNew(NULL, scsihostsrc->adapter, scsihostsrc->bus, scsihostsrc->target, scsihostsrc->unit, @@ -2376,13 +2360,11 @@ virSecuritySELinuxRestoreHostdevSubsysLabel(virSecurityManagerPtr mgr, return -1; ret = virSCSIDeviceFileIterate(scsi, virSecuritySELinuxRestoreSCSILabel, mgr); - virSCSIDeviceFree(scsi); - break; } case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST: { - virSCSIVHostDevicePtr host = virSCSIVHostDeviceNew(hostsrc->wwpn); + g_autoptr(virSCSIVHostDevice) host = virSCSIVHostDeviceNew(hostsrc->wwpn); if (!host) return -1; @@ -2390,20 +2372,16 @@ virSecuritySELinuxRestoreHostdevSubsysLabel(virSecurityManagerPtr mgr, ret = virSCSIVHostDeviceFileIterate(host, virSecuritySELinuxRestoreHostLabel, mgr); - virSCSIVHostDeviceFree(host); - break; } case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: { - char *vfiodev = NULL; + g_autofree char *vfiodev = NULL; if (!(vfiodev = virMediatedDeviceGetIOMMUGroupDev(mdevsrc->uuidstr))) return -1; ret = virSecuritySELinuxRestoreFileLabel(mgr, vfiodev, true); - - VIR_FREE(vfiodev); break; } -- 2.26.2

Use g_auto* cleanup to avoid free() calls. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- src/security/security_dac.c | 49 +++++++++++-------------------------- 1 file changed, 14 insertions(+), 35 deletions(-) diff --git a/src/security/security_dac.c b/src/security/security_dac.c index b5e56feaa8..0085982bb1 100644 --- a/src/security/security_dac.c +++ b/src/security/security_dac.c @@ -1251,7 +1251,7 @@ virSecurityDACSetHostdevLabel(virSecurityManagerPtr mgr, switch ((virDomainHostdevSubsysType)dev->source.subsys.type) { case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB: { - virUSBDevicePtr usb; + g_autoptr(virUSBDevice) usb = NULL; if (dev->missing) return 0; @@ -1262,41 +1262,35 @@ virSecurityDACSetHostdevLabel(virSecurityManagerPtr mgr, ret = virUSBDeviceFileIterate(usb, virSecurityDACSetUSBLabel, &cbdata); - virUSBDeviceFree(usb); break; } case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: { - virPCIDevicePtr pci = - virPCIDeviceNew(&pcisrc->addr); + g_autoptr(virPCIDevice) pci = virPCIDeviceNew(&pcisrc->addr); if (!pci) return -1; if (pcisrc->backend == VIR_DOMAIN_HOSTDEV_PCI_BACKEND_VFIO) { - char *vfioGroupDev = virPCIDeviceGetIOMMUGroupDev(pci); + g_autofree char *vfioGroupDev = virPCIDeviceGetIOMMUGroupDev(pci); - if (!vfioGroupDev) { - virPCIDeviceFree(pci); + if (!vfioGroupDev) return -1; - } + ret = virSecurityDACSetHostdevLabelHelper(vfioGroupDev, false, &cbdata); - VIR_FREE(vfioGroupDev); } else { ret = virPCIDeviceFileIterate(pci, virSecurityDACSetPCILabel, &cbdata); } - - virPCIDeviceFree(pci); break; } case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI: { virDomainHostdevSubsysSCSIHostPtr scsihostsrc = &scsisrc->u.host; - virSCSIDevicePtr scsi = + g_autoptr(virSCSIDevice) scsi = virSCSIDeviceNew(NULL, scsihostsrc->adapter, scsihostsrc->bus, scsihostsrc->target, scsihostsrc->unit, @@ -1308,13 +1302,11 @@ virSecurityDACSetHostdevLabel(virSecurityManagerPtr mgr, ret = virSCSIDeviceFileIterate(scsi, virSecurityDACSetSCSILabel, &cbdata); - virSCSIDeviceFree(scsi); - break; } case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST: { - virSCSIVHostDevicePtr host = virSCSIVHostDeviceNew(hostsrc->wwpn); + g_autoptr(virSCSIVHostDevice) host = virSCSIVHostDeviceNew(hostsrc->wwpn); if (!host) return -1; @@ -1322,19 +1314,16 @@ virSecurityDACSetHostdevLabel(virSecurityManagerPtr mgr, ret = virSCSIVHostDeviceFileIterate(host, virSecurityDACSetHostLabel, &cbdata); - virSCSIVHostDeviceFree(host); break; } case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: { - char *vfiodev = NULL; + g_autofree char *vfiodev = NULL; if (!(vfiodev = virMediatedDeviceGetIOMMUGroupDev(mdevsrc->uuidstr))) return -1; ret = virSecurityDACSetHostdevLabelHelper(vfiodev, true, &cbdata); - - VIR_FREE(vfiodev); break; } @@ -1420,7 +1409,7 @@ virSecurityDACRestoreHostdevLabel(virSecurityManagerPtr mgr, switch ((virDomainHostdevSubsysType)dev->source.subsys.type) { case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB: { - virUSBDevicePtr usb; + g_autoptr(virUSBDevice) usb = NULL; if (dev->missing) return 0; @@ -1429,20 +1418,17 @@ virSecurityDACRestoreHostdevLabel(virSecurityManagerPtr mgr, return -1; ret = virUSBDeviceFileIterate(usb, virSecurityDACRestoreUSBLabel, mgr); - virUSBDeviceFree(usb); - break; } case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: { - virPCIDevicePtr pci = - virPCIDeviceNew(&pcisrc->addr); + g_autoptr(virPCIDevice) pci = virPCIDeviceNew(&pcisrc->addr); if (!pci) return -1; if (pcisrc->backend == VIR_DOMAIN_HOSTDEV_PCI_BACKEND_VFIO) { - char *vfioGroupDev = virPCIDeviceGetIOMMUGroupDev(pci); + g_autofree char *vfioGroupDev = virPCIDeviceGetIOMMUGroupDev(pci); if (!vfioGroupDev) { virPCIDeviceFree(pci); @@ -1450,17 +1436,15 @@ virSecurityDACRestoreHostdevLabel(virSecurityManagerPtr mgr, } ret = virSecurityDACRestoreFileLabelInternal(mgr, NULL, vfioGroupDev, false); - VIR_FREE(vfioGroupDev); } else { ret = virPCIDeviceFileIterate(pci, virSecurityDACRestorePCILabel, mgr); } - virPCIDeviceFree(pci); break; } case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI: { virDomainHostdevSubsysSCSIHostPtr scsihostsrc = &scsisrc->u.host; - virSCSIDevicePtr scsi = + g_autoptr(virSCSIDevice) scsi = virSCSIDeviceNew(NULL, scsihostsrc->adapter, scsihostsrc->bus, scsihostsrc->target, scsihostsrc->unit, @@ -1470,13 +1454,11 @@ virSecurityDACRestoreHostdevLabel(virSecurityManagerPtr mgr, return -1; ret = virSCSIDeviceFileIterate(scsi, virSecurityDACRestoreSCSILabel, mgr); - virSCSIDeviceFree(scsi); - break; } case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST: { - virSCSIVHostDevicePtr host = virSCSIVHostDeviceNew(hostsrc->wwpn); + g_autoptr(virSCSIVHostDevice) host = virSCSIVHostDeviceNew(hostsrc->wwpn); if (!host) return -1; @@ -1484,19 +1466,16 @@ virSecurityDACRestoreHostdevLabel(virSecurityManagerPtr mgr, ret = virSCSIVHostDeviceFileIterate(host, virSecurityDACRestoreHostLabel, mgr); - virSCSIVHostDeviceFree(host); - break; } case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: { - char *vfiodev = NULL; + g_autofree char *vfiodev = NULL; if (!(vfiodev = virMediatedDeviceGetIOMMUGroupDev(mdevsrc->uuidstr))) return -1; ret = virSecurityDACRestoreFileLabel(mgr, vfiodev); - VIR_FREE(vfiodev); break; } -- 2.26.2

If the underlying PCI device of a hostdev does not exist in the host (e.g. a SR-IOV VF that was removed while the domain was running), skip security label handling for it. This will avoid errors that happens during qemuProcessStop() time, where a VF that was being used by the domain is not present anymore. The restore label functions of both DAC and SELinux drivers will trigger errors in virPCIDeviceNew(). Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- src/security/security_dac.c | 14 ++++++++++++-- src/security/security_selinux.c | 14 ++++++++++++-- 2 files changed, 24 insertions(+), 4 deletions(-) diff --git a/src/security/security_dac.c b/src/security/security_dac.c index 0085982bb1..a2528aeb2d 100644 --- a/src/security/security_dac.c +++ b/src/security/security_dac.c @@ -1266,7 +1266,12 @@ virSecurityDACSetHostdevLabel(virSecurityManagerPtr mgr, } case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: { - g_autoptr(virPCIDevice) pci = virPCIDeviceNew(&pcisrc->addr); + g_autoptr(virPCIDevice) pci = NULL; + + if (!virPCIDeviceExists(&pcisrc->addr)) + break; + + pci = virPCIDeviceNew(&pcisrc->addr); if (!pci) return -1; @@ -1422,7 +1427,12 @@ virSecurityDACRestoreHostdevLabel(virSecurityManagerPtr mgr, } case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: { - g_autoptr(virPCIDevice) pci = virPCIDeviceNew(&pcisrc->addr); + g_autoptr(virPCIDevice) pci = NULL; + + if (!virPCIDeviceExists(&pcisrc->addr)) + break; + + pci = virPCIDeviceNew(&pcisrc->addr); if (!pci) return -1; diff --git a/src/security/security_selinux.c b/src/security/security_selinux.c index bf53932ccc..cf4984d8b1 100644 --- a/src/security/security_selinux.c +++ b/src/security/security_selinux.c @@ -2102,7 +2102,12 @@ virSecuritySELinuxSetHostdevSubsysLabel(virSecurityManagerPtr mgr, } case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: { - g_autoptr(virPCIDevice) pci = virPCIDeviceNew(&pcisrc->addr); + g_autoptr(virPCIDevice) pci = NULL; + + if (!virPCIDeviceExists(&pcisrc->addr)) + break; + + pci = virPCIDeviceNew(&pcisrc->addr); if (!pci) return -1; @@ -2330,7 +2335,12 @@ virSecuritySELinuxRestoreHostdevSubsysLabel(virSecurityManagerPtr mgr, } case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: { - g_autoptr(virPCIDevice) pci = virPCIDeviceNew(&pcisrc->addr); + g_autoptr(virPCIDevice) pci = NULL; + + if (!virPCIDeviceExists(&pcisrc->addr)) + break; + + pci = virPCIDeviceNew(&pcisrc->addr); if (!pci) return -1; -- 2.26.2

The function returns -1 on error, but no caller is actually checking the return value. Making it 'void' makes more sense with its current use. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- include/libvirt/libvirt-nodedev.h | 2 +- src/libvirt-nodedev.c | 15 +++++++++++---- 2 files changed, 12 insertions(+), 5 deletions(-) diff --git a/include/libvirt/libvirt-nodedev.h b/include/libvirt/libvirt-nodedev.h index eab8abf6ab..5634980a75 100644 --- a/include/libvirt/libvirt-nodedev.h +++ b/include/libvirt/libvirt-nodedev.h @@ -114,7 +114,7 @@ char * virNodeDeviceGetXMLDesc (virNodeDevicePtr dev, unsigned int flags); int virNodeDeviceRef (virNodeDevicePtr dev); -int virNodeDeviceFree (virNodeDevicePtr dev); +void virNodeDeviceFree (virNodeDevicePtr dev); int virNodeDeviceDettach (virNodeDevicePtr dev); int virNodeDeviceDetachFlags(virNodeDevicePtr dev, diff --git a/src/libvirt-nodedev.c b/src/libvirt-nodedev.c index eb8c735a8c..fcca40f47b 100644 --- a/src/libvirt-nodedev.c +++ b/src/libvirt-nodedev.c @@ -445,19 +445,26 @@ virNodeDeviceListCaps(virNodeDevicePtr dev, * Drops a reference to the node device, freeing it if * this was the last reference. * - * Returns the 0 for success, -1 for error. + * Throws a VIR_ERR_INVALID_NODE_DEVICE error if @dev is + * not a valid node device. Does nothing if @dev is + * NULL. */ -int +void virNodeDeviceFree(virNodeDevicePtr dev) { + if (!dev) + return; + VIR_DEBUG("dev=%p, conn=%p", dev, dev ? dev->conn : NULL); virResetLastError(); - virCheckNodeDeviceReturn(dev, -1); + virCheckNodeDeviceGoto(dev, invalid_device); virObjectUnref(dev); - return 0; + + invalid_device: + return; } -- 2.26.2

Add virObjectUnref an autoptr cleanup func for virNodeDevice, then remove all unref and free calls from qemuNodeDeviceReAttach(). Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- src/datatypes.h | 2 ++ src/qemu/qemu_driver.c | 32 ++++++++++++-------------------- 2 files changed, 14 insertions(+), 20 deletions(-) diff --git a/src/datatypes.h b/src/datatypes.h index ade3779e43..7a88aba0df 100644 --- a/src/datatypes.h +++ b/src/datatypes.h @@ -707,6 +707,8 @@ struct _virNodeDevice { char *parentName; /* parent device name */ }; +G_DEFINE_AUTOPTR_CLEANUP_FUNC(virNodeDevice, virObjectUnref); + /** * _virSecret: * diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index c450501c4d..7b8350dff3 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -12041,17 +12041,16 @@ static int qemuNodeDeviceReAttach(virNodeDevicePtr dev) { virQEMUDriverPtr driver = dev->conn->privateData; - virPCIDevicePtr pci = NULL; + g_autoptr(virPCIDevice) pci = NULL; virPCIDeviceAddress devAddr; - int ret = -1; - virNodeDeviceDefPtr def = NULL; + g_autoptr(virNodeDeviceDef) def = NULL; g_autofree char *xml = NULL; virHostdevManagerPtr hostdev_mgr = driver->hostdevMgr; - virConnectPtr nodeconn = NULL; - virNodeDevicePtr nodedev = NULL; + g_autoptr(virConnect) nodeconn = NULL; + g_autoptr(virNodeDevice) nodedev = NULL; if (!(nodeconn = virGetConnectNodeDev())) - goto cleanup; + return -1; /* 'dev' is associated with the QEMU virConnectPtr, * so for split daemons, we need to get a copy that @@ -12059,36 +12058,29 @@ qemuNodeDeviceReAttach(virNodeDevicePtr dev) */ if (!(nodedev = virNodeDeviceLookupByName( nodeconn, virNodeDeviceGetName(dev)))) - goto cleanup; + return -1; xml = virNodeDeviceGetXMLDesc(nodedev, 0); if (!xml) - goto cleanup; + return -1; def = virNodeDeviceDefParseString(xml, EXISTING_DEVICE, NULL); if (!def) - goto cleanup; + return -1; /* ACL check must happen against original 'dev', * not the new 'nodedev' we acquired */ if (virNodeDeviceReAttachEnsureACL(dev->conn, def) < 0) - goto cleanup; + return -1; if (virDomainDriverNodeDeviceGetPCIInfo(def, &devAddr) < 0) - goto cleanup; + return -1; pci = virPCIDeviceNew(&devAddr); if (!pci) - goto cleanup; - - ret = virHostdevPCINodeDeviceReAttach(hostdev_mgr, pci); + return -1; - virPCIDeviceFree(pci); - cleanup: - virNodeDeviceDefFree(def); - virObjectUnref(nodedev); - virObjectUnref(nodeconn); - return ret; + return virHostdevPCINodeDeviceReAttach(hostdev_mgr, pci); } static int -- 2.26.2

Use g_auto* wherever we can and remove the 'cleanup' label. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- src/libxl/libxl_driver.c | 37 ++++++++++++++----------------------- 1 file changed, 14 insertions(+), 23 deletions(-) diff --git a/src/libxl/libxl_driver.c b/src/libxl/libxl_driver.c index 9b93fd77d2..b83e35f67d 100644 --- a/src/libxl/libxl_driver.c +++ b/src/libxl/libxl_driver.c @@ -5858,18 +5858,17 @@ libxlNodeDeviceDettach(virNodeDevicePtr dev) static int libxlNodeDeviceReAttach(virNodeDevicePtr dev) { - virPCIDevicePtr pci = NULL; + g_autoptr(virPCIDevice) pci = NULL; virPCIDeviceAddress devAddr; - int ret = -1; - virNodeDeviceDefPtr def = NULL; - char *xml = NULL; + g_autoptr(virNodeDeviceDef) def = NULL; + g_autofree char *xml = NULL; libxlDriverPrivatePtr driver = dev->conn->privateData; virHostdevManagerPtr hostdev_mgr = driver->hostdevMgr; - virConnectPtr nodeconn = NULL; - virNodeDevicePtr nodedev = NULL; + g_autoptr(virConnect) nodeconn = NULL; + g_autoptr(virNodeDevice) nodedev = NULL; if (!(nodeconn = virGetConnectNodeDev())) - goto cleanup; + return -1; /* 'dev' is associated with the QEMU virConnectPtr, * so for split daemons, we need to get a copy that @@ -5877,40 +5876,32 @@ libxlNodeDeviceReAttach(virNodeDevicePtr dev) */ if (!(nodedev = virNodeDeviceLookupByName( nodeconn, virNodeDeviceGetName(dev)))) - goto cleanup; + return -1; xml = virNodeDeviceGetXMLDesc(nodedev, 0); if (!xml) - goto cleanup; + return -1; def = virNodeDeviceDefParseString(xml, EXISTING_DEVICE, NULL); if (!def) - goto cleanup; + return -1; /* ACL check must happen against original 'dev', * not the new 'nodedev' we acquired */ if (virNodeDeviceReAttachEnsureACL(dev->conn, def) < 0) - goto cleanup; + return -1; if (virDomainDriverNodeDeviceGetPCIInfo(def, &devAddr) < 0) - goto cleanup; + return -1; pci = virPCIDeviceNew(&devAddr); if (!pci) - goto cleanup; + return -1; if (virHostdevPCINodeDeviceReAttach(hostdev_mgr, pci) < 0) - goto cleanup; - - ret = 0; + return -1; - cleanup: - virPCIDeviceFree(pci); - virNodeDeviceDefFree(def); - virObjectUnref(nodedev); - virObjectUnref(nodeconn); - VIR_FREE(xml); - return ret; + return 0; } static int -- 2.26.2

Use g_auto* pointers to avoid the need of a cleanup label. The type of the pointer 'virDomainPtr dom' was changed to its alias 'virshDomainPtr' to allow the use of g_autoptr(). Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- tools/virsh-domain.c | 18 ++++++------------ 1 file changed, 6 insertions(+), 12 deletions(-) diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c index 1ef9b8d606..a8a0f1d3cc 100644 --- a/tools/virsh-domain.c +++ b/tools/virsh-domain.c @@ -11953,11 +11953,10 @@ static const vshCmdOptDef opts_detach_device[] = { static bool cmdDetachDevice(vshControl *ctl, const vshCmd *cmd) { - virDomainPtr dom = NULL; + g_autoptr(virshDomain) dom = NULL; const char *from = NULL; - char *buffer = NULL; + g_autofree char *buffer = NULL; int ret; - bool funcRet = false; bool current = vshCommandOptBool(cmd, "current"); bool config = vshCommandOptBool(cmd, "config"); bool live = vshCommandOptBool(cmd, "live"); @@ -11982,11 +11981,11 @@ cmdDetachDevice(vshControl *ctl, const vshCmd *cmd) flags |= VIR_DOMAIN_AFFECT_LIVE; if (vshCommandOptStringReq(ctl, cmd, "file", &from) < 0) - goto cleanup; + return false; if (virFileReadAll(from, VSH_MAX_XML_FILE, &buffer) < 0) { vshReportError(ctl); - goto cleanup; + return false; } if (flags != 0 || current) @@ -11996,16 +11995,11 @@ cmdDetachDevice(vshControl *ctl, const vshCmd *cmd) if (ret < 0) { vshError(ctl, _("Failed to detach device from %s"), from); - goto cleanup; + return false; } vshPrintExtra(ctl, "%s", _("Device detached successfully\n")); - funcRet = true; - - cleanup: - VIR_FREE(buffer); - virshDomainFree(dom); - return funcRet; + return true; } -- 2.26.2

Add a helper to quickly determine if a hostdev is a PCI device, instead of doing a tedius 'if' check with hostdev mode and subsys type. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- src/hypervisor/virhostdev.c | 12 +++++++++--- src/hypervisor/virhostdev.h | 2 ++ src/libvirt_private.syms | 1 + 3 files changed, 12 insertions(+), 3 deletions(-) diff --git a/src/hypervisor/virhostdev.c b/src/hypervisor/virhostdev.c index 18373deb41..16a21f240b 100644 --- a/src/hypervisor/virhostdev.c +++ b/src/hypervisor/virhostdev.c @@ -347,12 +347,18 @@ virHostdevNetDevice(virDomainHostdevDefPtr hostdev, } +bool +virHostdevIsPCIDevice(const virDomainHostdevDef *hostdev) +{ + return hostdev->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS && + hostdev->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI; +} + + static bool virHostdevIsPCINetDevice(const virDomainHostdevDef *hostdev) { - return hostdev->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS && - hostdev->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && - hostdev->parentnet != NULL; + return virHostdevIsPCIDevice(hostdev) && hostdev->parentnet != NULL; } diff --git a/src/hypervisor/virhostdev.h b/src/hypervisor/virhostdev.h index 811bda40ed..b9407cd837 100644 --- a/src/hypervisor/virhostdev.h +++ b/src/hypervisor/virhostdev.h @@ -235,3 +235,5 @@ virHostdevUpdateActiveNVMeDevices(virHostdevManagerPtr hostdev_mgr, const char *dom_name, virDomainDiskDefPtr *disks, size_t ndisks); + +bool virHostdevIsPCIDevice(const virDomainHostdevDef *hostdev); diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 81ce142635..ac72a384ed 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1461,6 +1461,7 @@ virCloseCallbacksUnset; # hypervisor/virhostdev.h virHostdevFindUSBDevice; +virHostdevIsPCIDevice; virHostdevManagerGetDefault; virHostdevPCINodeDeviceDetach; virHostdevPCINodeDeviceReAttach; -- 2.26.2

There is no need to bother with cgroup tearing down for absent PCI devices, given that their entries in the sysfs are already gone. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- src/qemu/qemu_cgroup.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c index f7146a71c9..050df21d87 100644 --- a/src/qemu/qemu_cgroup.c +++ b/src/qemu/qemu_cgroup.c @@ -467,6 +467,16 @@ qemuTeardownHostdevCgroup(virDomainObjPtr vm, if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_DEVICES)) return 0; + /* Skip tearing down Cgroup for hostdevs that represents absent + * PCI devices, e.g. SR-IOV virtual functions that were removed from + * the host while the domain was still running. */ + if (virHostdevIsPCIDevice(dev)) { + const virDomainHostdevSubsysPCI *pcisrc = &dev->source.subsys.u.pci; + + if (!virPCIDeviceExists(&pcisrc->addr)) + return 0; + } + if (qemuDomainGetHostdevPath(dev, &path, NULL) < 0) return -1; -- 2.26.2

We're going to need a way to remove a PCI Device from a list without having a valid virPCIDevicePtr, because the device is missing from the host. This means that virPCIDevicesListDel() must operate with a PCI Device address instead. Turns out that virPCIDevicesListDel() and its related functions only use the virPCIDeviceAddressPtr of the virPCIDevicePtr, so this change is simple to do and will not cause hassle in all other callers. Let's start adapting virPCIDeviceListFindIndex() and crawl our way up to virPCIDevicesListDel(). Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- src/util/virpci.c | 15 ++++++++------- src/util/virpci.h | 2 +- 2 files changed, 9 insertions(+), 8 deletions(-) diff --git a/src/util/virpci.c b/src/util/virpci.c index 1a542b18b6..122056dbe0 100644 --- a/src/util/virpci.c +++ b/src/util/virpci.c @@ -1745,7 +1745,7 @@ virPCIDevicePtr virPCIDeviceListSteal(virPCIDeviceListPtr list, virPCIDevicePtr dev) { - return virPCIDeviceListStealIndex(list, virPCIDeviceListFindIndex(list, dev)); + return virPCIDeviceListStealIndex(list, virPCIDeviceListFindIndex(list, &dev->address)); } void @@ -1756,16 +1756,17 @@ virPCIDeviceListDel(virPCIDeviceListPtr list, } int -virPCIDeviceListFindIndex(virPCIDeviceListPtr list, virPCIDevicePtr dev) +virPCIDeviceListFindIndex(virPCIDeviceListPtr list, + virPCIDeviceAddressPtr devAddr) { size_t i; for (i = 0; i < list->count; i++) { virPCIDevicePtr other = list->devs[i]; - if (other->address.domain == dev->address.domain && - other->address.bus == dev->address.bus && - other->address.slot == dev->address.slot && - other->address.function == dev->address.function) + if (other->address.domain == devAddr->domain && + other->address.bus == devAddr->bus && + other->address.slot == devAddr->slot && + other->address.function == devAddr->function) return i; } return -1; @@ -1798,7 +1799,7 @@ virPCIDeviceListFind(virPCIDeviceListPtr list, virPCIDevicePtr dev) { int idx; - if ((idx = virPCIDeviceListFindIndex(list, dev)) >= 0) + if ((idx = virPCIDeviceListFindIndex(list, &dev->address)) >= 0) return list->devs[idx]; else return NULL; diff --git a/src/util/virpci.h b/src/util/virpci.h index a9c597a428..8c6776da21 100644 --- a/src/util/virpci.h +++ b/src/util/virpci.h @@ -175,7 +175,7 @@ virPCIDeviceListFindByIDs(virPCIDeviceListPtr list, unsigned int slot, unsigned int function); int virPCIDeviceListFindIndex(virPCIDeviceListPtr list, - virPCIDevicePtr dev); + virPCIDeviceAddressPtr devAddr); /* * Callback that will be invoked once for each file -- 2.26.2

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- src/hypervisor/virhostdev.c | 12 ++++++++---- src/util/virpci.c | 16 ++++++++-------- src/util/virpci.h | 2 +- 3 files changed, 17 insertions(+), 13 deletions(-) diff --git a/src/hypervisor/virhostdev.c b/src/hypervisor/virhostdev.c index 16a21f240b..e7ef615f60 100644 --- a/src/hypervisor/virhostdev.c +++ b/src/hypervisor/virhostdev.c @@ -657,7 +657,8 @@ virHostdevReattachAllPCIDevices(virHostdevManagerPtr mgr, /* We need to look up the actual device because that's what * virPCIDeviceReattach() expects as its argument */ - if (!(actual = virPCIDeviceListFind(mgr->inactivePCIHostdevs, pci))) + if (!(actual = virPCIDeviceListFind(mgr->inactivePCIHostdevs, + virPCIDeviceGetAddress(pci)))) continue; if (virPCIDeviceGetManaged(actual)) { @@ -777,7 +778,8 @@ virHostdevPreparePCIDevicesImpl(virHostdevManagerPtr mgr, /* Unmanaged devices should already have been marked as * inactive: if that's the case, we can simply move on */ - if (virPCIDeviceListFind(mgr->inactivePCIHostdevs, pci)) { + if (virPCIDeviceListFind(mgr->inactivePCIHostdevs, + virPCIDeviceGetAddress(pci))) { VIR_DEBUG("Not detaching unmanaged PCI device %s", virPCIDeviceGetName(pci)); continue; @@ -860,7 +862,8 @@ virHostdevPreparePCIDevicesImpl(virHostdevManagerPtr mgr, * there because 'pci' only contain address information and will * be released at the end of the function */ pci = virPCIDeviceListGet(pcidevs, i); - actual = virPCIDeviceListFind(mgr->activePCIHostdevs, pci); + actual = virPCIDeviceListFind(mgr->activePCIHostdevs, + virPCIDeviceGetAddress(pci)); VIR_DEBUG("Setting driver and domain information for PCI device %s", virPCIDeviceGetName(pci)); @@ -992,7 +995,8 @@ virHostdevReAttachPCIDevicesImpl(virHostdevManagerPtr mgr, * information such as by which domain and driver it is used. As a * side effect, by looking it up we can also tell whether it was * really active in the first place */ - actual = virPCIDeviceListFind(mgr->activePCIHostdevs, pci); + actual = virPCIDeviceListFind(mgr->activePCIHostdevs, + virPCIDeviceGetAddress(pci)); if (actual) { const char *actual_drvname; const char *actual_domname; diff --git a/src/util/virpci.c b/src/util/virpci.c index 122056dbe0..938ffeaaed 100644 --- a/src/util/virpci.c +++ b/src/util/virpci.c @@ -705,7 +705,7 @@ virPCIDeviceSharesBusWithActive(virPCIDevicePtr dev, virPCIDevicePtr check, void return 0; /* same bus, but inactive, i.e. about to be assigned to guest */ - if (inactiveDevs && virPCIDeviceListFind(inactiveDevs, check)) + if (inactiveDevs && virPCIDeviceListFind(inactiveDevs, &check->address)) return 0; return 1; @@ -1022,7 +1022,7 @@ virPCIDeviceReset(virPCIDevicePtr dev, return -1; } - if (activeDevs && virPCIDeviceListFind(activeDevs, dev)) { + if (activeDevs && virPCIDeviceListFind(activeDevs, &dev->address)) { virReportError(VIR_ERR_INTERNAL_ERROR, _("Not resetting active device %s"), dev->name); return -1; @@ -1294,7 +1294,7 @@ virPCIDeviceDetach(virPCIDevicePtr dev, if (virPCIProbeStubDriver(dev->stubDriver) < 0) return -1; - if (activeDevs && virPCIDeviceListFind(activeDevs, dev)) { + if (activeDevs && virPCIDeviceListFind(activeDevs, &dev->address)) { virReportError(VIR_ERR_INTERNAL_ERROR, _("Not detaching active device %s"), dev->name); return -1; @@ -1306,7 +1306,7 @@ virPCIDeviceDetach(virPCIDevicePtr dev, /* Add *a copy of* the dev into list inactiveDevs, if * it's not already there. */ - if (inactiveDevs && !virPCIDeviceListFind(inactiveDevs, dev)) { + if (inactiveDevs && !virPCIDeviceListFind(inactiveDevs, &dev->address)) { VIR_DEBUG("Adding PCI device %s to inactive list", dev->name); if (virPCIDeviceListAddCopy(inactiveDevs, dev) < 0) return -1; @@ -1324,7 +1324,7 @@ virPCIDeviceReattach(virPCIDevicePtr dev, virPCIDeviceListPtr activeDevs, virPCIDeviceListPtr inactiveDevs) { - if (activeDevs && virPCIDeviceListFind(activeDevs, dev)) { + if (activeDevs && virPCIDeviceListFind(activeDevs, &dev->address)) { virReportError(VIR_ERR_INTERNAL_ERROR, _("Not reattaching active device %s"), dev->name); return -1; @@ -1684,7 +1684,7 @@ int virPCIDeviceListAdd(virPCIDeviceListPtr list, virPCIDevicePtr dev) { - if (virPCIDeviceListFind(list, dev)) { + if (virPCIDeviceListFind(list, &dev->address)) { virReportError(VIR_ERR_INTERNAL_ERROR, _("Device %s is already in use"), dev->name); return -1; @@ -1795,11 +1795,11 @@ virPCIDeviceListFindByIDs(virPCIDeviceListPtr list, virPCIDevicePtr -virPCIDeviceListFind(virPCIDeviceListPtr list, virPCIDevicePtr dev) +virPCIDeviceListFind(virPCIDeviceListPtr list, virPCIDeviceAddressPtr devAddr) { int idx; - if ((idx = virPCIDeviceListFindIndex(list, &dev->address)) >= 0) + if ((idx = virPCIDeviceListFindIndex(list, devAddr)) >= 0) return list->devs[idx]; else return NULL; diff --git a/src/util/virpci.h b/src/util/virpci.h index 8c6776da21..628a293972 100644 --- a/src/util/virpci.h +++ b/src/util/virpci.h @@ -167,7 +167,7 @@ virPCIDevicePtr virPCIDeviceListStealIndex(virPCIDeviceListPtr list, void virPCIDeviceListDel(virPCIDeviceListPtr list, virPCIDevicePtr dev); virPCIDevicePtr virPCIDeviceListFind(virPCIDeviceListPtr list, - virPCIDevicePtr dev); + virPCIDeviceAddressPtr devAddr); virPCIDevicePtr virPCIDeviceListFindByIDs(virPCIDeviceListPtr list, unsigned int domain, -- 2.26.2

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- src/hypervisor/virhostdev.c | 8 +++++--- src/util/virpci.c | 6 +++--- src/util/virpci.h | 2 +- 3 files changed, 9 insertions(+), 7 deletions(-) diff --git a/src/hypervisor/virhostdev.c b/src/hypervisor/virhostdev.c index e7ef615f60..4539c55e09 100644 --- a/src/hypervisor/virhostdev.c +++ b/src/hypervisor/virhostdev.c @@ -846,7 +846,7 @@ virHostdevPreparePCIDevicesImpl(virHostdevManagerPtr mgr, VIR_DEBUG("Removing PCI device %s from inactive list", virPCIDeviceGetName(pci)); - actual = virPCIDeviceListSteal(mgr->inactivePCIHostdevs, pci); + actual = virPCIDeviceListSteal(mgr->inactivePCIHostdevs, virPCIDeviceGetAddress(pci)); VIR_DEBUG("Adding PCI device %s to active list", virPCIDeviceGetName(pci)); @@ -918,7 +918,8 @@ virHostdevPreparePCIDevicesImpl(virHostdevManagerPtr mgr, VIR_DEBUG("Removing PCI device %s from active list", virPCIDeviceGetName(pci)); - if (!(actual = virPCIDeviceListSteal(mgr->activePCIHostdevs, pci))) + if (!(actual = virPCIDeviceListSteal(mgr->activePCIHostdevs, + virPCIDeviceGetAddress(pci)))) continue; VIR_DEBUG("Adding PCI device %s to inactive list", @@ -1022,7 +1023,8 @@ virHostdevReAttachPCIDevicesImpl(virHostdevManagerPtr mgr, VIR_DEBUG("Removing PCI device %s from active list", virPCIDeviceGetName(pci)); - actual = virPCIDeviceListSteal(mgr->activePCIHostdevs, pci); + actual = virPCIDeviceListSteal(mgr->activePCIHostdevs, + virPCIDeviceGetAddress(pci)); VIR_DEBUG("Adding PCI device %s to inactive list", virPCIDeviceGetName(pci)); diff --git a/src/util/virpci.c b/src/util/virpci.c index 938ffeaaed..8973c7c754 100644 --- a/src/util/virpci.c +++ b/src/util/virpci.c @@ -1743,16 +1743,16 @@ virPCIDeviceListStealIndex(virPCIDeviceListPtr list, virPCIDevicePtr virPCIDeviceListSteal(virPCIDeviceListPtr list, - virPCIDevicePtr dev) + virPCIDeviceAddressPtr devAddr) { - return virPCIDeviceListStealIndex(list, virPCIDeviceListFindIndex(list, &dev->address)); + return virPCIDeviceListStealIndex(list, virPCIDeviceListFindIndex(list, devAddr)); } void virPCIDeviceListDel(virPCIDeviceListPtr list, virPCIDevicePtr dev) { - virPCIDeviceFree(virPCIDeviceListSteal(list, dev)); + virPCIDeviceFree(virPCIDeviceListSteal(list, &dev->address)); } int diff --git a/src/util/virpci.h b/src/util/virpci.h index 628a293972..e3458a75fa 100644 --- a/src/util/virpci.h +++ b/src/util/virpci.h @@ -161,7 +161,7 @@ virPCIDevicePtr virPCIDeviceListGet(virPCIDeviceListPtr list, int idx); size_t virPCIDeviceListCount(virPCIDeviceListPtr list); virPCIDevicePtr virPCIDeviceListSteal(virPCIDeviceListPtr list, - virPCIDevicePtr dev); + virPCIDeviceAddressPtr devAddr); virPCIDevicePtr virPCIDeviceListStealIndex(virPCIDeviceListPtr list, int idx); void virPCIDeviceListDel(virPCIDeviceListPtr list, -- 2.26.2

This change will allow us to remove PCI devices from a list without the need of a PCI Device object, which will be need in the next patch. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- src/hypervisor/virhostdev.c | 7 ++++--- src/util/virpci.c | 6 +++--- src/util/virpci.h | 2 +- 3 files changed, 8 insertions(+), 7 deletions(-) diff --git a/src/hypervisor/virhostdev.c b/src/hypervisor/virhostdev.c index 4539c55e09..0510eb020c 100644 --- a/src/hypervisor/virhostdev.c +++ b/src/hypervisor/virhostdev.c @@ -1005,11 +1005,11 @@ virHostdevReAttachPCIDevicesImpl(virHostdevManagerPtr mgr, if (STRNEQ_NULLABLE(drv_name, actual_drvname) || STRNEQ_NULLABLE(dom_name, actual_domname)) { - virPCIDeviceListDel(pcidevs, pci); + virPCIDeviceListDel(pcidevs, virPCIDeviceGetAddress(pci)); continue; } } else { - virPCIDeviceListDel(pcidevs, pci); + virPCIDeviceListDel(pcidevs, virPCIDeviceGetAddress(pci)); continue; } @@ -2524,7 +2524,8 @@ virHostdevUpdateActiveNVMeDevices(virHostdevManagerPtr hostdev_mgr, while (lastGoodPCIIdx >= 0) { virPCIDevicePtr actual = virPCIDeviceListGet(pciDevices, i); - virPCIDeviceListDel(hostdev_mgr->activePCIHostdevs, actual); + virPCIDeviceListDel(hostdev_mgr->activePCIHostdevs, + virPCIDeviceGetAddress(actual)); lastGoodPCIIdx--; } diff --git a/src/util/virpci.c b/src/util/virpci.c index 8973c7c754..cbf49357c6 100644 --- a/src/util/virpci.c +++ b/src/util/virpci.c @@ -1336,7 +1336,7 @@ virPCIDeviceReattach(virPCIDevicePtr dev, /* Steal the dev from list inactiveDevs */ if (inactiveDevs) { VIR_DEBUG("Removing PCI device %s from inactive list", dev->name); - virPCIDeviceListDel(inactiveDevs, dev); + virPCIDeviceListDel(inactiveDevs, &dev->address); } return 0; @@ -1750,9 +1750,9 @@ virPCIDeviceListSteal(virPCIDeviceListPtr list, void virPCIDeviceListDel(virPCIDeviceListPtr list, - virPCIDevicePtr dev) + virPCIDeviceAddressPtr devAddr) { - virPCIDeviceFree(virPCIDeviceListSteal(list, &dev->address)); + virPCIDeviceFree(virPCIDeviceListSteal(list, devAddr)); } int diff --git a/src/util/virpci.h b/src/util/virpci.h index e3458a75fa..eb71eb4451 100644 --- a/src/util/virpci.h +++ b/src/util/virpci.h @@ -165,7 +165,7 @@ virPCIDevicePtr virPCIDeviceListSteal(virPCIDeviceListPtr list, virPCIDevicePtr virPCIDeviceListStealIndex(virPCIDeviceListPtr list, int idx); void virPCIDeviceListDel(virPCIDeviceListPtr list, - virPCIDevicePtr dev); + virPCIDeviceAddressPtr devAddr); virPCIDevicePtr virPCIDeviceListFind(virPCIDeviceListPtr list, virPCIDeviceAddressPtr devAddr); virPCIDevicePtr -- 2.26.2

virHostdevReAttachPCIDevices() is called when we want to re-attach a list of hostdevs back to the host, either on the shutdown path or via a 'virsh detach-device' call. This function always count on the existence of the device in the host to work, but this can lead to problems. For example, a SR-IOV device can be removed via an admin "echo 0 > /sys/bus/pci/devices/<addr>/sriov_numvfs", making the kernel fire up and eventfd_signal() to the process, asking for the process to release the device. The result might vary depending on the device driver and OS/arch, but two possible outcomes are: 1) the hypervisor driver will detach the device from the VM, issuing a delete event to Libvirt. This can be observed in QEMU; 2) the 'echo 0 > ...' will hang waiting for the device to be unplugged. This means that the VM process failed/refused to release the hostdev back to the host, and the hostdev will be detached during VM shutdown. Today we don't behave well for both cases. We'll fail to remove the PCI device reference from mgr->activePCIHostdevs and mgr->inactivePCIHostdevs because we rely on the existence of the PCI device conf file in the sysfs. Attempting to re-utilize the same device (assuming it is now present back in the host) can result in an error like this: $ ./run tools/virsh start vm1-sriov --console error: Failed to start domain vm1-sriov error: Requested operation is not valid: PCI device 0000:01:00.2 is in use by driver QEMU, domain vm1-sriov For (1), a VM destroy/start cycle is needed to re-use the VF in the guest. For (2), the effect is more nefarious, requiring a Libvirtd daemon restart to use the VF again in any guest. We can make it a bit better by checking, during virHostdevReAttachPCIDevices(), if there is any missing PCI device that will be left behind in activePCIHostdevs and inactivePCIHostdevs lists. Remove any missing device found from both lists, unconditionally, matching the current state of the host. This change affects the code path in (1) (processDeviceDeletedEvent into qemuDomainRemoveDevice, all the way back to qemuHostdevReAttachPCIDevices) and also in (b) (qemuProcessStop into qemuHostdevReAttachDomainDevices). NB: this patch does **NOT** intend to implement support for surprise hotunplug of PCI devices in Libvirt. Although this can now be achieved if the hypervisor and the PCI driver copes with it, our goal is to mitigate what it is still considered an user oopsie. For all supported purposes, the admin must remove the SR-IOV VFs from all running domains before removing the VFs from the host. Resolves: https://gitlab.com/libvirt/libvirt/-/issues/72 Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- src/hypervisor/virhostdev.c | 38 +++++++++++++++++++++++++++++++++++++ 1 file changed, 38 insertions(+) diff --git a/src/hypervisor/virhostdev.c b/src/hypervisor/virhostdev.c index 0510eb020c..50fe29f142 100644 --- a/src/hypervisor/virhostdev.c +++ b/src/hypervisor/virhostdev.c @@ -1077,6 +1077,40 @@ virHostdevReAttachPCIDevicesImpl(virHostdevManagerPtr mgr, } +static void +virHostdevDeleteMissingPCIDevices(virHostdevManagerPtr mgr, + virDomainHostdevDefPtr *hostdevs, + int nhostdevs) +{ + size_t i; + + if (nhostdevs == 0) + return; + + virObjectLock(mgr->activePCIHostdevs); + virObjectLock(mgr->inactivePCIHostdevs); + + for (i = 0; i < nhostdevs; i++) { + virDomainHostdevDef *hostdev = hostdevs[i]; + virDomainHostdevSubsysPCI *pcisrc = &hostdev->source.subsys.u.pci; + g_autoptr(virPCIDevice) pci = NULL; + + if (virHostdevGetPCIHostDevice(hostdev, &pci) != -2) + continue; + + /* The PCI device from 'hostdev' does not exist in the host + * anymore. Delete it from both active and inactive lists to + * reflect the current host state. + */ + virPCIDeviceListDel(mgr->activePCIHostdevs, &pcisrc->addr); + virPCIDeviceListDel(mgr->inactivePCIHostdevs, &pcisrc->addr); + } + + virObjectUnlock(mgr->activePCIHostdevs); + virObjectUnlock(mgr->inactivePCIHostdevs); +} + + /* @oldStateDir: * For upgrade purpose: see virHostdevRestoreNetConfig */ @@ -1102,6 +1136,10 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr mgr, virHostdevReAttachPCIDevicesImpl(mgr, drv_name, dom_name, pcidevs, hostdevs, nhostdevs, oldStateDir); + + /* Handle the case where PCI devices from the host went missing + * during the domain lifetime */ + virHostdevDeleteMissingPCIDevices(mgr, hostdevs, nhostdevs); } -- 2.26.2

Ping On 1/4/21 9:54 AM, Daniel Henrique Barboza wrote:
Hi,
This series is an attempt to fix https://gitlab.com/libvirt/libvirt/-/issues/72.
The root issue is that, for obvious reasons, virPCIDeviceNew() requires the device to exist in the host (i.e. the config file must be present in sysfs). The unexpected absence of the device (e.g. removing the VFs from the host while a domain was using them) will make virPCIDeviceNew() return -1 in several code paths, and the effects in Libvirt can be bothersome. Patch 20 will explain it in greater detail.
Changing virPCIDeviceNew() internals to not fail in this scenario would be a herculean refactor of innocent code that doesn't have anything to do with the problem. My approach here is to check if the device exists in the sysfs in key places, keeping the default virPCIDeviceNew() behavior intact.
I mentioned this is patch 20 but to emphasize: this is NOT an attempt to implement surprise hostdev hotunplug support in Libvirt. Removing SR-IOVs VFs used by any domain is still not cool and should be avoided. But after this series, Libvirt can at least fall on its feet if this situation happens.
The series was tested using a Mellanox MT28800 card in a Power 9 server.
Daniel Henrique Barboza (20): virpci, domain_audit: use virPCIDeviceAddressAsString() qemu, lxc: move NodeDeviceGetPCIInfo() function to domain_driver.c domain_driver.c: use PCI address with virDomainDriverNodeDeviceGetPCIInfo() virpci.c: simplify virPCIDeviceNew() signature virpci: introduce virPCIDeviceExists() virhostdev.c: virHostdevGetPCIHostDevice() now reports missing device security_selinux.c: modernize set/restore hostdev subsys label functions security_dac.c: modernize hostdev label set/restore functions dac, selinux: skip setting/restoring label for absent PCI devices libvirt-nodedev.c: remove return value from virNodeDeviceFree() qemu_driver.c: modernize qemuNodeDeviceReAttach() libxl_driver.c: modernize libxlNodeDeviceReAttach() virsh-domain.c: modernize cmdDetachDevice() virhostdev.c: add virHostdevIsPCIDevice() helper qemu_cgroup.c: skip absent PCI devices in qemuTeardownHostdevCgroup() virpci.c: use virPCIDeviceAddressPtr in virPCIDeviceListFindIndex() virpci.c: use virPCIDeviceAddressPtr in virPCIDeviceListFind() virpci.c: use virPCIDeviceAddressPtr in virPCIDeviceListSteal() virpci.c: use virPCIDeviceAddressPtr in virPCIDeviceListDel() virhostdev.c: remove missing PCI devs from hostdev manager
include/libvirt/libvirt-nodedev.h | 2 +- src/conf/domain_audit.c | 6 +- src/datatypes.h | 2 + src/hypervisor/domain_driver.c | 30 ++++++++++ src/hypervisor/domain_driver.h | 4 ++ src/hypervisor/virhostdev.c | 88 ++++++++++++++++++++++------ src/hypervisor/virhostdev.h | 2 + src/libvirt-nodedev.c | 15 +++-- src/libvirt_private.syms | 3 + src/libxl/libxl_driver.c | 87 ++++++++-------------------- src/node_device/node_device_udev.c | 11 ++-- src/qemu/qemu_cgroup.c | 10 ++++ src/qemu/qemu_domain_address.c | 5 +- src/qemu/qemu_driver.c | 81 +++++++------------------- src/security/security_apparmor.c | 3 +- src/security/security_dac.c | 61 ++++++++------------ src/security/security_selinux.c | 66 +++++++++------------ src/security/virt-aa-helper.c | 6 +- src/util/virnetdev.c | 3 +- src/util/virnvme.c | 5 +- src/util/virpci.c | 93 ++++++++++++++---------------- src/util/virpci.h | 14 ++--- tests/virhostdevtest.c | 3 +- tests/virpcitest.c | 35 ++++++++--- tools/virsh-domain.c | 18 ++---- 25 files changed, 325 insertions(+), 328 deletions(-)
participants (1)
-
Daniel Henrique Barboza