[libvirt] [PATCH v2] util: disallow assigning or resetting a pci-bridge device

It is destructive to attempt reset on a pci-bridge, the host can crash. The bridges won't contain any guest data and neither they can be passed through using vfio/stub. So, no point in allowing a reset on them. Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com> --- src/util/virhostdev.c | 10 ++++++++++ src/util/virpci.c | 11 +++++++++++ 2 files changed, 21 insertions(+) diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index 0673afb..16b96f3 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -532,6 +532,16 @@ virHostdevPreparePCIDevices(virHostdevManagerPtr mgr, bool strict_acs_check = !!(flags & VIR_HOSTDEV_STRICT_ACS_CHECK); bool usesVFIO = (virPCIDeviceGetStubDriver(pci) == VIR_PCI_STUB_DRIVER_VFIO); struct virHostdevIsPCINodeDeviceUsedData data = { mgr, dom_name, usesVFIO }; + int hdrType = -1; + + if (virPCIGetHeaderType(pci, &hdrType) < 0) + goto cleanup; + + if (hdrType == VIR_PCI_HEADER_PCI_BRIDGE) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("PCI bridge devices" + " cannot be assigned to guests")); + goto cleanup; + } if (!usesVFIO && !virPCIDeviceIsAssignable(pci, strict_acs_check)) { virReportError(VIR_ERR_OPERATION_INVALID, diff --git a/src/util/virpci.c b/src/util/virpci.c index 0601f49..f205abf 100644 --- a/src/util/virpci.c +++ b/src/util/virpci.c @@ -933,6 +933,17 @@ virPCIDeviceReset(virPCIDevicePtr dev, char *drvName = NULL; int ret = -1; int fd = -1; + int hdrType = -1; + + if (virPCIGetHeaderType(dev, &hdrType) < 0) + return -1; + + if (hdrType != VIR_PCI_HEADER_ENDPOINT) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Invalid attempt to reset non-endpoint PCI device %s." + " Only PCI endpoint devices can be reset"), dev->name); + return -1; + } if (activeDevs && virPCIDeviceListFind(activeDevs, dev)) { virReportError(VIR_ERR_INTERNAL_ERROR,

On Wed, Jan 18, 2017 at 07:29:29PM +0530, Shivaprasad G Bhat wrote:
It is destructive to attempt reset on a pci-bridge, the host can crash. The bridges won't contain any guest data and neither they can be passed through using vfio/stub. So, no point in allowing a reset on them.
Wasn't resetting non-endpoint the only way in some cases when you needed to passthrough 2 endpoint devices when none of them could be reset? I might be very easily wrong, though.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com> --- src/util/virhostdev.c | 10 ++++++++++ src/util/virpci.c | 11 +++++++++++ 2 files changed, 21 insertions(+)
diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index 0673afb..16b96f3 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -532,6 +532,16 @@ virHostdevPreparePCIDevices(virHostdevManagerPtr mgr, bool strict_acs_check = !!(flags & VIR_HOSTDEV_STRICT_ACS_CHECK); bool usesVFIO = (virPCIDeviceGetStubDriver(pci) == VIR_PCI_STUB_DRIVER_VFIO); struct virHostdevIsPCINodeDeviceUsedData data = { mgr, dom_name, usesVFIO }; + int hdrType = -1; + + if (virPCIGetHeaderType(pci, &hdrType) < 0) + goto cleanup; + + if (hdrType == VIR_PCI_HEADER_PCI_BRIDGE) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("PCI bridge devices" + " cannot be assigned to guests")); + goto cleanup; + }
if (!usesVFIO && !virPCIDeviceIsAssignable(pci, strict_acs_check)) { virReportError(VIR_ERR_OPERATION_INVALID, diff --git a/src/util/virpci.c b/src/util/virpci.c index 0601f49..f205abf 100644 --- a/src/util/virpci.c +++ b/src/util/virpci.c @@ -933,6 +933,17 @@ virPCIDeviceReset(virPCIDevicePtr dev, char *drvName = NULL; int ret = -1; int fd = -1; + int hdrType = -1; + + if (virPCIGetHeaderType(dev, &hdrType) < 0) + return -1; + + if (hdrType != VIR_PCI_HEADER_ENDPOINT) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Invalid attempt to reset non-endpoint PCI device %s." + " Only PCI endpoint devices can be reset"), dev->name); + return -1; + }
if (activeDevs && virPCIDeviceListFind(activeDevs, dev)) { virReportError(VIR_ERR_INTERNAL_ERROR,
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

On 01/18/2017 11:10 AM, Martin Kletzander wrote:
On Wed, Jan 18, 2017 at 07:29:29PM +0530, Shivaprasad G Bhat wrote:
It is destructive to attempt reset on a pci-bridge, the host can crash. The bridges won't contain any guest data and neither they can be passed through using vfio/stub. So, no point in allowing a reset on them.
Wasn't resetting non-endpoint the only way in some cases when you needed to passthrough 2 endpoint devices when none of them could be reset? I might be very easily wrong, though.
What you're thinking of is the code in virPCIDeviceReset() that attempts to do a "secondary bus reset" on the parent of an endpoint device if the reset of the endpoint device itself fails. This is never initiated directly by the user (or the top level of the libvirt API) though, but only done internally when attempts to reset an endpoint device fail (libvirt then tries doing a secondary bus reset of the endpoint's parent device). Of course *all* of the PCI reset stuff in libvirt is a moot point if you're using vfio anyway - vfio itself will handle any device resets that need to be done at the appropriate time. That's why virPCIDeviceReset() returns success almost immediately if the device is bound to vfio-pci. Since the only time the device would *not* be bound to vfio-pci would be 1) if a domain was using legacy KVM device assignment (which is deprecated, and I think may even be completely removed from new kernels) or 2) if a user manually requested a device reset with libvirt's virNodeDeviceReset() API (which realistically speaking nobody should ever need to do), there's really not much chance of this making an actual difference. (This makes me wonder where Shivaprasad encountered this in the real world...)
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com> --- src/util/virhostdev.c | 10 ++++++++++ src/util/virpci.c | 11 +++++++++++ 2 files changed, 21 insertions(+)
diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index 0673afb..16b96f3 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -532,6 +532,16 @@ virHostdevPreparePCIDevices(virHostdevManagerPtr mgr, bool strict_acs_check = !!(flags & VIR_HOSTDEV_STRICT_ACS_CHECK); bool usesVFIO = (virPCIDeviceGetStubDriver(pci) == VIR_PCI_STUB_DRIVER_VFIO); struct virHostdevIsPCINodeDeviceUsedData data = { mgr, dom_name, usesVFIO }; + int hdrType = -1; + + if (virPCIGetHeaderType(pci, &hdrType) < 0) + goto cleanup; + + if (hdrType == VIR_PCI_HEADER_PCI_BRIDGE) {
I verified with the VFIO author/maintainer (Alex Williamson) that vfio will assign only endpoint devices - cardbus bridge devices and PCI bridge devices are both not allowed. So I think that this check should also be for "hdrType != VIR_PCI_HEADER_ENDPOINT". Also, it's kind of nit-picking, but I think this chunk should be in a separate patch from the other. One is preventing attempts to assign a device that isn't an endpoint, and the otehr is preventing attemps to reset a device that isn't an endpoint.
+ virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("PCI bridge devices" + " cannot be assigned to guests"));
When a string is split into multiple lines, the final character of each line's substring must be a space (this is in the contribution guidelines and is enforced by make syntax-check)
+ goto cleanup; + }
if (!usesVFIO && !virPCIDeviceIsAssignable(pci, strict_acs_check)) { virReportError(VIR_ERR_OPERATION_INVALID, diff --git a/src/util/virpci.c b/src/util/virpci.c index 0601f49..f205abf 100644 --- a/src/util/virpci.c +++ b/src/util/virpci.c @@ -933,6 +933,17 @@ virPCIDeviceReset(virPCIDevicePtr dev, char *drvName = NULL; int ret = -1; int fd = -1; + int hdrType = -1; + + if (virPCIGetHeaderType(dev, &hdrType) < 0) + return -1; + + if (hdrType != VIR_PCI_HEADER_ENDPOINT) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Invalid attempt to reset non-endpoint PCI device %s." + " Only PCI endpoint devices can be reset"), dev->name);
Same problem. make syntax-check will catch it. If you split this in two, change the first check to be for != endpoint rather than == bridge, and fix the strings, then these can be pushed.

Hi Laine, On 01/21/2017 01:50 AM, Laine Stump wrote:
On 01/18/2017 11:10 AM, Martin Kletzander wrote:
On Wed, Jan 18, 2017 at 07:29:29PM +0530, Shivaprasad G Bhat wrote:
It is destructive to attempt reset on a pci-bridge, the host can crash. The bridges won't contain any guest data and neither they can be passed through using vfio/stub. So, no point in allowing a reset on them.
Wasn't resetting non-endpoint the only way in some cases when you needed to passthrough 2 endpoint devices when none of them could be reset? I might be very easily wrong, though.
What you're thinking of is the code in virPCIDeviceReset() that attempts to do a "secondary bus reset" on the parent of an endpoint device if the reset of the endpoint device itself fails. This is never initiated directly by the user (or the top level of the libvirt API) though, but only done internally when attempts to reset an endpoint device fail (libvirt then tries doing a secondary bus reset of the endpoint's parent device).
Of course *all* of the PCI reset stuff in libvirt is a moot point if you're using vfio anyway - vfio itself will handle any device resets that need to be done at the appropriate time. That's why virPCIDeviceReset() returns success almost immediately if the device is bound to vfio-pci. Since the only time the device would *not* be bound to vfio-pci would be 1) if a domain was using legacy KVM device assignment (which is deprecated, and I think may even be completely removed from new kernels) or 2) if a user manually requested a device reset with libvirt's virNodeDeviceReset() API (which realistically speaking nobody should ever need to do), there's really not much chance of this making an actual difference. (This makes me wonder where Shivaprasad encountered this in the real world...)
I encountered this with virsh nodedev-reset on a pci-bridge.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com> --- src/util/virhostdev.c | 10 ++++++++++ src/util/virpci.c | 11 +++++++++++ 2 files changed, 21 insertions(+)
diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index 0673afb..16b96f3 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -532,6 +532,16 @@ virHostdevPreparePCIDevices(virHostdevManagerPtr mgr, bool strict_acs_check = !!(flags & VIR_HOSTDEV_STRICT_ACS_CHECK); bool usesVFIO = (virPCIDeviceGetStubDriver(pci) == VIR_PCI_STUB_DRIVER_VFIO); struct virHostdevIsPCINodeDeviceUsedData data = { mgr, dom_name, usesVFIO }; + int hdrType = -1; + + if (virPCIGetHeaderType(pci, &hdrType) < 0) + goto cleanup; + + if (hdrType == VIR_PCI_HEADER_PCI_BRIDGE) {
I verified with the VFIO author/maintainer (Alex Williamson) that vfio will assign only endpoint devices - cardbus bridge devices and PCI bridge devices are both not allowed. So I think that this check should also be for "hdrType != VIR_PCI_HEADER_ENDPOINT".
Also, it's kind of nit-picking, but I think this chunk should be in a separate patch from the other. One is preventing attempts to assign a device that isn't an endpoint, and the otehr is preventing attemps to reset a device that isn't an endpoint.
+ virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("PCI bridge devices" + " cannot be assigned to guests"));
When a string is split into multiple lines, the final character of each line's substring must be a space (this is in the contribution guidelines and is enforced by make syntax-check)
I don't see a make syntax-check failure on my system though. Moved the space to the previous line.
+ goto cleanup; + }
if (!usesVFIO && !virPCIDeviceIsAssignable(pci, strict_acs_check)) { virReportError(VIR_ERR_OPERATION_INVALID, diff --git a/src/util/virpci.c b/src/util/virpci.c index 0601f49..f205abf 100644 --- a/src/util/virpci.c +++ b/src/util/virpci.c @@ -933,6 +933,17 @@ virPCIDeviceReset(virPCIDevicePtr dev, char *drvName = NULL; int ret = -1; int fd = -1; + int hdrType = -1; + + if (virPCIGetHeaderType(dev, &hdrType) < 0) + return -1; + + if (hdrType != VIR_PCI_HEADER_ENDPOINT) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Invalid attempt to reset non-endpoint PCI device %s." + " Only PCI endpoint devices can be reset"), dev->name);
Same problem. make syntax-check will catch it.
If you split this in two, change the first check to be for != endpoint rather than == bridge, and fix the strings, then these can be pushed.
Posted the v2 as suggested. Thanks, Shivaprasad
participants (3)
-
Laine Stump
-
Martin Kletzander
-
Shivaprasad G Bhat