[PATCH] util: Allow for PCI root buses not numbered "0"
From: ehanoc via Devel <devel@lists.libvirt.org> virPCIDeviceReset() and virPCIDeviceIsBehindSwitchLackingACS() both used a hardcoded check for bus != 0 to determine if a device is attached directly to a "root bus". This breaks on systems with more than one root bus, where at least one of the buses is necessarily not 0! (for example Intel Arrow Lake based systems where the CPU's root complex is bus 0x00 and the PCH root complex is bus 0x80). Update both functions to use virPCIDeviceIsOnRootBus(), which detects root bus attachment via the canonicalized sysfs device link in /sys/devices/*, making it work correctly for all root buses, not just those numbered 0. Discussion of the issue here: https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/thread/NE62X... Resolves: https://github.com/QubesOS/qubes-issues/issues/10393 Signed-off-by: Bruno Martins <ehanoc@protonmail.com> Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Laine Stump <laine@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> --- Changes from V1: * Editorial changes to comments in code and commit log message * log an error when virPCIDeviceIsOnRootBus() fails * use g_path_get_basename() rather than strrchr(blah, '/') * rename a couple variables I haven't been able to test these changes because I only have QEMU/KVM, and it doesn't call any of this code. Please test and add your Reviewed-by (of my changes) and Tested-by: and I'll push it. src/util/virpci.c | 106 +++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 100 insertions(+), 6 deletions(-) diff --git a/src/util/virpci.c b/src/util/virpci.c index d43fa1ef54..642717d23c 100644 --- a/src/util/virpci.c +++ b/src/util/virpci.c @@ -1080,6 +1080,87 @@ virPCIDeviceInit(virPCIDevice *dev, int cfgfd) return 0; } +/* + * Check if a PCI device is directly attached to a root bus (no parent bridge). + * + * A PCIe topology can have multiple root buses (a "root" bus is any + * PCI bus that doesn't have yet another higher level PCI bus as its + * parent), so by definition we can't detect a root bus by just + * looking for bus 0. Instead, we can detect if a bus is attached to a + * root bus by examining the *canonical* path of the device's + * directory in sysfs: a device is connected directly to a root bus if + * its canonicalized sysfs path is a direct child of a + * /sys/devices/pciXXXX:YY directory. For example, this device: + * + * /sys/bus/pci/devices/0000:00:14.0 (path in "flat" sysfs pci namespace) + * /sys/devices/pci0000:80/0000:80:14.0). (canonicalized version of above) + + * is connnected to a root bus. Conversely, if the given device's + * canonial path has yet another PCI device as its parent in the + * sys/device's hierarchy, then it *isn't* attached to a root bus, + * e.g.: + * + * /sys/bus/pci/devices/0000:04:00.1 (link in "flat" namespace) + * /sys/devices/pci0000:00/0000:00:1c.4/0000:04:00.0 (canonicalized) + * + * Returns 1 if the given device is directly attached to a root bus, + * 0 if not, -1 on error. + */ +static int +virPCIDeviceIsOnRootBus(virPCIDevice *dev) +{ + unsigned int domain, bus; + g_autofree char *devPath = NULL; + g_autofree char *canonicalPath = NULL; + g_autofree char *parentPath = NULL; + g_autofree char *parentName = NULL; + + /* Get the sysfs symlink path for this device */ + devPath = g_strdup_printf(PCI_SYSFS "devices/%s", dev->name); + + /* Resolve the symlink to get the real path in /sys/devices/... */ + if (virFileResolveLink(devPath, &canonicalPath) < 0) { + virReportSystemError(errno, _("failed to resolve links in PCI device %1$s sysfs path '%2$s'"), + dev->name, devPath); + return -1; + } + + /* get just the final component of the parent directory */ + parentPath = g_path_get_dirname(canonicalPath); + parentName = g_path_get_basename(parentPath); + + /* + * Detect root bus by parsing the parent directory name with sscanf. + * + * PCI root buses in /sys/devices follow the naming convention: + * pci<domain>:<bus> + * + * <domain>: number (0000-ffff, hex) + * <bus>: number within the domain (00-ff, hex) + * + * This format is defined in: + * https://www.kernel.org/doc/html/latest/PCI/sysfs-pci.html + * + * Examples from Intel Arrow Lake systems: + * - pci0000:00 (CPU root bus) + * - pci0000:80 (PCH root bus) + * + * We aren't concerned with the actual values parsed by sscanf, + * only whether or not the parse was successful - if soo, then we + * return 1 indicating this device is attached to a root bus, + * otherwise 0. + */ + if (sscanf(parentName, "pci%x:%x", &domain, &bus) == 2) { + VIR_DEBUG("%s %s: device is on root bus (parent=%s)", + dev->id, dev->name, parentName); + return 1; + } + + VIR_DEBUG("%s %s: device is not on root bus (parent=%s)", + dev->id, dev->name, parentName); + return 0; +} + int virPCIDeviceReset(virPCIDevice *dev, virPCIDeviceList *activeDevs, @@ -1145,9 +1226,18 @@ virPCIDeviceReset(virPCIDevice *dev, if (dev->has_pm_reset) ret = virPCIDeviceTryPowerManagementReset(dev, fd); - /* Bus reset is not an option with the root bus */ - if (ret < 0 && dev->address.bus != 0) - ret = virPCIDeviceTrySecondaryBusReset(dev, fd, inactiveDevs); + /* If power management reset isn't available (or if it failed), + * and the bus we're attached to isn't a root bus, try a secondary + * bus reset. (This is not an option if we are attached directly + * to a root bus). + */ + if (ret < 0) { + int onRootBus = virPCIDeviceIsOnRootBus(dev); + if (onRootBus < 0) + goto cleanup; + if (!onRootBus) + ret = virPCIDeviceTrySecondaryBusReset(dev, fd, inactiveDevs); + } if (ret < 0) { virErrorPtr err = virGetLastError(); @@ -2531,15 +2621,19 @@ static int virPCIDeviceIsBehindSwitchLackingACS(virPCIDevice *dev) { g_autoptr(virPCIDevice) parent = NULL; + int onRootBus; if (virPCIDeviceGetParent(dev, &parent) < 0) return -1; if (!parent) { - /* if we have no parent, and this is the root bus, ACS doesn't come - * into play since devices on the root bus can't P2P without going + /* if we have no parent, and this is a root bus, ACS doesn't come + * into play since devices on root buses can't P2P without going * through the root IOMMU. */ - if (dev->address.bus == 0) { + onRootBus = virPCIDeviceIsOnRootBus(dev); + if (onRootBus < 0) + return -1; + if (onRootBus) { return 0; } else { virReportError(VIR_ERR_INTERNAL_ERROR, -- 2.53.0
On Thu, Apr 23, 2026 at 16:43:05 -0400, Laine Stump via Devel wrote:
From: ehanoc via Devel <devel@lists.libvirt.org>
^^^^ the from line got mangled due to the maling list
Signed-off-by: Bruno Martins <ehanoc@protonmail.com
I suggest you fix it by adopting this also as the author as perceived by git. Also '>' is missing at the end.
On 4/24/26 12:41 AM, Peter Krempa via Devel wrote:
On Thu, Apr 23, 2026 at 16:43:05 -0400, Laine Stump via Devel wrote:
From: ehanoc via Devel <devel@lists.libvirt.org>
^^^^ the from line got mangled due to the maling list
Oops. My eyes only scanned as far as seeing that there was a "From: "and shut off after that :-/
Signed-off-by: Bruno Martins <ehanoc@protonmail.com
I suggest you fix it by adopting this also as the author as perceived by git. Also '>' is missing at the end.
Now that part is strange. The '>' is there in the copy of the patch email I received.
On Fri, Apr 24, 2026 at 06:15:23 -0400, Laine Stump wrote:
On 4/24/26 12:41 AM, Peter Krempa via Devel wrote:
On Thu, Apr 23, 2026 at 16:43:05 -0400, Laine Stump via Devel wrote:
From: ehanoc via Devel <devel@lists.libvirt.org>
^^^^ the from line got mangled due to the maling list
Oops. My eyes only scanned as far as seeing that there was a "From: "and shut off after that :-/
Signed-off-by: Bruno Martins <ehanoc@protonmail.com
I suggest you fix it by adopting this also as the author as perceived by git. Also '>' is missing at the end.
Now that part is strange. The '>' is there in the copy of the patch email I received.
I think I messed up when creating space for this hunk. Since it's ..
... the same as a reply character I didn't see it misplaced.
Laine Stump wrote:
From: ehanoc via Devel <devel@lists.libvirt.org>
virPCIDeviceReset() and virPCIDeviceIsBehindSwitchLackingACS() both used a hardcoded check for bus != 0 to determine if a device is attached directly to a "root bus". This breaks on systems with more than one root bus, where at least one of the buses is necessarily not 0! (for example Intel Arrow Lake based systems where the CPU's root complex is bus 0x00 and the PCH root complex is bus 0x80).
Update both functions to use virPCIDeviceIsOnRootBus(), which detects root bus attachment via the canonicalized sysfs device link in /sys/devices/*, making it work correctly for all root buses, not just those numbered 0.
Discussion of the issue here:
https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/thread/NE62X...
Resolves: https://github.com/QubesOS/qubes-issues/issues/10393
Signed-off-by: Bruno Martins <ehanoc@protonmail.com> Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Laine Stump <laine@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> --- Changes from V1:
* Editorial changes to comments in code and commit log message * log an error when virPCIDeviceIsOnRootBus() fails * use g_path_get_basename() rather than strrchr(blah, '/') * rename a couple variables
I haven't been able to test these changes because I only have QEMU/KVM, and it doesn't call any of this code. Please test and add your Reviewed-by (of my changes) and Tested-by: and I'll push it.
src/util/virpci.c | 106 +++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 100 insertions(+), 6 deletions(-)
diff --git a/src/util/virpci.c b/src/util/virpci.c index d43fa1ef54..642717d23c 100644 --- a/src/util/virpci.c +++ b/src/util/virpci.c @@ -1080,6 +1080,87 @@ virPCIDeviceInit(virPCIDevice *dev, int cfgfd) return 0; }
+/* + * Check if a PCI device is directly attached to a root bus (no parent bridge). + * + * A PCIe topology can have multiple root buses (a "root" bus is any + * PCI bus that doesn't have yet another higher level PCI bus as its + * parent), so by definition we can't detect a root bus by just + * looking for bus 0. Instead, we can detect if a bus is attached to a + * root bus by examining the *canonical* path of the device's + * directory in sysfs: a device is connected directly to a root bus if + * its canonicalized sysfs path is a direct child of a + * /sys/devices/pciXXXX:YY directory. For example, this device: + * + * /sys/bus/pci/devices/0000:00:14.0 (path in "flat" sysfs pci namespace) + * /sys/devices/pci0000:80/0000:80:14.0). (canonicalized version of above) + + * is connnected to a root bus. Conversely, if the given device's + * canonial path has yet another PCI device as its parent in the + * sys/device's hierarchy, then it *isn't* attached to a root bus, + * e.g.: + * + * /sys/bus/pci/devices/0000:04:00.1 (link in "flat" namespace) + * /sys/devices/pci0000:00/0000:00:1c.4/0000:04:00.0 (canonicalized) + * + * Returns 1 if the given device is directly attached to a root bus, + * 0 if not, -1 on error. + */ +static int +virPCIDeviceIsOnRootBus(virPCIDevice *dev) +{ + unsigned int domain, bus; + g_autofree char *devPath = NULL; + g_autofree char *canonicalPath = NULL; + g_autofree char *parentPath = NULL; + g_autofree char *parentName = NULL; + + /* Get the sysfs symlink path for this device */ + devPath = g_strdup_printf(PCI_SYSFS "devices/%s", dev->name); + + /* Resolve the symlink to get the real path in /sys/devices/... */ + if (virFileResolveLink(devPath, &canonicalPath) < 0) { + virReportSystemError(errno, _("failed to resolve links in PCI device %1$s sysfs path '%2$s'"), + dev->name, devPath); + return -1; + } + + /* get just the final component of the parent directory */ + parentPath = g_path_get_dirname(canonicalPath); + parentName = g_path_get_basename(parentPath); + + /* + * Detect root bus by parsing the parent directory name with sscanf. + * + * PCI root buses in /sys/devices follow the naming convention: + * pci<domain>:<bus> + * + * <domain>: number (0000-ffff, hex) + * <bus>: number within the domain (00-ff, hex) + * + * This format is defined in: + * https://www.kernel.org/doc/html/latest/PCI/sysfs-pci.html + * + * Examples from Intel Arrow Lake systems: + * - pci0000:00 (CPU root bus) + * - pci0000:80 (PCH root bus) + * + * We aren't concerned with the actual values parsed by sscanf, + * only whether or not the parse was successful - if soo, then we + * return 1 indicating this device is attached to a root bus, + * otherwise 0. + */ + if (sscanf(parentName, "pci%x:%x", &domain, &bus) == 2) { + VIR_DEBUG("%s %s: device is on root bus (parent=%s)", + dev->id, dev->name, parentName); + return 1; + } + + VIR_DEBUG("%s %s: device is not on root bus (parent=%s)", + dev->id, dev->name, parentName); + return 0; +} + int virPCIDeviceReset(virPCIDevice *dev, virPCIDeviceList *activeDevs, @@ -1145,9 +1226,18 @@ virPCIDeviceReset(virPCIDevice *dev, if (dev->has_pm_reset) ret = virPCIDeviceTryPowerManagementReset(dev, fd);
- /* Bus reset is not an option with the root bus */ - if (ret < 0 && dev->address.bus != 0) - ret = virPCIDeviceTrySecondaryBusReset(dev, fd, inactiveDevs); + /* If power management reset isn't available (or if it failed), + * and the bus we're attached to isn't a root bus, try a secondary + * bus reset. (This is not an option if we are attached directly + * to a root bus). + */ + if (ret < 0) { + int onRootBus = virPCIDeviceIsOnRootBus(dev); + if (onRootBus < 0) + goto cleanup; + if (!onRootBus) + ret = virPCIDeviceTrySecondaryBusReset(dev, fd, inactiveDevs); + }
if (ret < 0) { virErrorPtr err = virGetLastError(); @@ -2531,15 +2621,19 @@ static int virPCIDeviceIsBehindSwitchLackingACS(virPCIDevice *dev) { g_autoptr(virPCIDevice) parent = NULL; + int onRootBus;
if (virPCIDeviceGetParent(dev, &parent) < 0) return -1; if (!parent) { - /* if we have no parent, and this is the root bus, ACS doesn't come - * into play since devices on the root bus can't P2P without going + /* if we have no parent, and this is a root bus, ACS doesn't come + * into play since devices on root buses can't P2P without going * through the root IOMMU. */ - if (dev->address.bus == 0) { + onRootBus = virPCIDeviceIsOnRootBus(dev); + if (onRootBus < 0) + return -1; + if (onRootBus) { return 0; } else { virReportError(VIR_ERR_INTERNAL_ERROR,
Reviewed-by: Bruno Martins <ehanoc@protonmail.com> Tested-by: Bruno Martins <ehanoc@protonmail.com>
Applied this in my dom0 (QubesOS) and tested. It's working for me. You had flag some potential improvements for more verbosity in the run-time non-debug error handling. Let me know if you still would like those. -- Best
On 4/24/26 7:36 AM, Bruno Martins via Devel wrote:
Applied this in my dom0 (QubesOS) and tested. It's working for me.
You had flag some potential improvements for more verbosity in the run-time non-debug error handling. Let me know if you still would like those.
If you mean checking the return from g_path_get_dirname(), I realized after sending my message that we don't check the return from that anywhere else in the code, and that it can't fail - it will always return something non-null (if nothing else it will return "."), so the proper change there was just to remove the check. And in the callers of the new function, as long as virPCIDeviceIsOnRootBus() is logging an error in all cases when it returns failure, we don't need to (and shouldn't) log anything in the caller (since it's already been logged). Is there something else I've missed?
-- Best
If you mean checking the return from g_path_get_dirname(), I realized after sending my message that we don't check the return from that anywhere else in the code, and that it can't fail - it will always return something non-null (if nothing else it will return "."), so the proper change there was just to remove the check. And in the callers of the new function, as long as virPCIDeviceIsOnRootBus() is logging an error in all cases when it returns failure, we don't need to (and shouldn't) log anything in the caller (since it's already been logged). Is there something else I've missed? I think that makes sense. As you said, the real critical path is trying to resolve the symlink to an actual device and that seems covered.
On 4/24/26 8:17 AM, Bruno Martins via Devel wrote:
If you mean checking the return from g_path_get_dirname(), I realized after sending my message that we don't check the return from that anywhere else in the code, and that it can't fail - it will always return something non-null (if nothing else it will return "."), so the proper change there was just to remove the check. And in the callers of the new function, as long as virPCIDeviceIsOnRootBus() is logging an error in all cases when it returns failure, we don't need to (and shouldn't) log anything in the caller (since it's already been logged). Is there something else I've missed? I think that makes sense. As you said, the real critical path is trying to resolve the symlink to an actual device and that seems covered.
Okay, I've pushed it now, so it will be in the upcoming release. Thanks!
I think that makes sense. As you said, the real critical path is trying to resolve the symlink to an actual device and that seems covered.
participants (4)
-
Bruno Martins -
Laine Stump -
Laine Stump -
Peter Krempa