[libvirt] question about PCI new_id sysfs interface

After updating the dom0 kernel on one of my Xen test hosts, I noticed problems with PCI hostdev management. E.g # virsh nodedev-detach pci_0000_07_10_1 error: Failed to detach device pci_0000_07_10_1 error: Failed to add PCI device ID '8086 1520' to pciback: File exists It turns out there was a small interface change to new_id with the following commit to 3.16 kernel https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/pci/pci-driver.c?h=v3.16&id=8895d3bcb8ba960b1b83f95d772b641352ea8e51 which now causes xen_pciback to fail writes of "vendorid productid" to new_id. e.g. # echo "8086 1520" > /sys/bus/pci/drivers/pciback/new_id -bash: echo: write error: File exists Interestingly, vfio doesn't encounter the same error # echo "8086 1520" > /sys/bus/pci/drivers/vfio-pci/new_id # echo $? 0 vfio-pci has: static struct pci_driver vfio_pci_driver = { .name = "vfio-pci", .id_table = NULL, /* only dynamic ids */ while xen-pciback has: static const struct pci_device_id pcistub_ids[] = { { .vendor = PCI_ANY_ID, .device = PCI_ANY_ID, .subvendor = PCI_ANY_ID, .subdevice = PCI_ANY_ID, }, {0,}, }; static struct pci_driver xen_pcibk_pci_driver = { .name = "pciback", .id_table = pcistub_ids, So any vendor/device pair will match for xen-pciback, while none will match for vfio-pci. But after reading that commit and the associated thread, it is not clear to me how to best fix this. Options are 1. set .id_table to NULL for xen-pciback 2. drop using the new_id interface from libvirt 3. pass more values (subvendor, subdevice, class, etc) to the new_id interface I'm not sure what problems, if any, options 1 and 2 might cause. Option 2 seems the best approach since new_id seems to be a rather unsafe interface. Thanks for your opinions! Regards, Jim

Adding Alex & Bandan, since they signed off the kernel patch which broke things. On Tue, Jun 28, 2016 at 11:39:59AM -0600, Jim Fehlig wrote:
After updating the dom0 kernel on one of my Xen test hosts, I noticed problems with PCI hostdev management. E.g
# virsh nodedev-detach pci_0000_07_10_1 error: Failed to detach device pci_0000_07_10_1 error: Failed to add PCI device ID '8086 1520' to pciback: File exists
It turns out there was a small interface change to new_id with the following commit to 3.16 kernel
which now causes xen_pciback to fail writes of "vendorid productid" to new_id. e.g.
# echo "8086 1520" > /sys/bus/pci/drivers/pciback/new_id -bash: echo: write error: File exists
Interestingly, vfio doesn't encounter the same error
# echo "8086 1520" > /sys/bus/pci/drivers/vfio-pci/new_id # echo $? 0
vfio-pci has: static struct pci_driver vfio_pci_driver = { .name = "vfio-pci", .id_table = NULL, /* only dynamic ids */
pci-stub also has that setup with NULL id_table.
while xen-pciback has: static const struct pci_device_id pcistub_ids[] = { { .vendor = PCI_ANY_ID, .device = PCI_ANY_ID, .subvendor = PCI_ANY_ID, .subdevice = PCI_ANY_ID, }, {0,}, }; static struct pci_driver xen_pcibk_pci_driver = { .name = "pciback", .id_table = pcistub_ids,
So any vendor/device pair will match for xen-pciback, while none will match for vfio-pci.
But after reading that commit and the associated thread, it is not clear to me how to best fix this. Options are
1. set .id_table to NULL for xen-pciback 2. drop using the new_id interface from libvirt 3. pass more values (subvendor, subdevice, class, etc) to the new_id interface
I'm not sure what problems, if any, options 1 and 2 might cause. Option 2 seems the best approach since new_id seems to be a rather unsafe interface.
I'm thinking either pci-back should be made to work more like vfio, or the kernel patch should be reverted or fixed to take account of the way pci-back works. Whichever way, I don't consider this a libvirt problem to solve. As Linus' always says - the kernel must never break existing userspace app usage. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

"Daniel P. Berrange" <berrange@redhat.com> writes:
Adding Alex & Bandan, since they signed off the kernel patch which broke things.
On Tue, Jun 28, 2016 at 11:39:59AM -0600, Jim Fehlig wrote:
After updating the dom0 kernel on one of my Xen test hosts, I noticed problems with PCI hostdev management. E.g
# virsh nodedev-detach pci_0000_07_10_1 error: Failed to detach device pci_0000_07_10_1 error: Failed to add PCI device ID '8086 1520' to pciback: File exists
It turns out there was a small interface change to new_id with the following commit to 3.16 kernel
which now causes xen_pciback to fail writes of "vendorid productid" to new_id. e.g.
# echo "8086 1520" > /sys/bus/pci/drivers/pciback/new_id -bash: echo: write error: File exists
Interestingly, vfio doesn't encounter the same error
# echo "8086 1520" > /sys/bus/pci/drivers/vfio-pci/new_id # echo $? 0
vfio-pci has: static struct pci_driver vfio_pci_driver = { .name = "vfio-pci", .id_table = NULL, /* only dynamic ids */
pci-stub also has that setup with NULL id_table.
while xen-pciback has: static const struct pci_device_id pcistub_ids[] = { { .vendor = PCI_ANY_ID, .device = PCI_ANY_ID, .subvendor = PCI_ANY_ID, .subdevice = PCI_ANY_ID, }, {0,}, }; static struct pci_driver xen_pcibk_pci_driver = { .name = "pciback", .id_table = pcistub_ids,
So any vendor/device pair will match for xen-pciback, while none will match for vfio-pci.
But after reading that commit and the associated thread, it is not clear to me how to best fix this. Options are
1. set .id_table to NULL for xen-pciback 2. drop using the new_id interface from libvirt 3. pass more values (subvendor, subdevice, class, etc) to the new_id interface
Atleast 3. is what I had in mind if someone deliberately wants to add a matching entry.
I'm not sure what problems, if any, options 1 and 2 might cause. Option 2 seems the best approach since new_id seems to be a rather unsafe interface.
It is indeed an unsafe interface. I am not sure what else we can do to not make it easy to cause a driver induced crash and not break existing scripts.
I'm thinking either pci-back should be made to work more like vfio, or the kernel patch should be reverted or fixed to take account of the way pci-back works.
Whichever way, I don't consider this a libvirt problem to solve. As Linus' always says - the kernel must never break existing userspace
Agreed, but in this specific case, the usage is unsafe since unknown indexes are potentially being passed to the driver operations. It should always have been 3. to begin with. Bandan
app usage.
Regards, Daniel

On Tue, Jun 28, 2016 at 02:30:47PM -0400, Bandan Das wrote:
"Daniel P. Berrange" <berrange@redhat.com> writes:
Adding Alex & Bandan, since they signed off the kernel patch which I'm thinking either pci-back should be made to work more like vfio, or the kernel patch should be reverted or fixed to take account of the way pci-back works.
Whichever way, I don't consider this a libvirt problem to solve. As Linus' always says - the kernel must never break existing userspace
Agreed, but in this specific case, the usage is unsafe since unknown indexes are potentially being passed to the driver operations. It should always have been 3. to begin with.
Whether the userspace usage is good or not is irrelevant - this kernel change has broken existing userspace apps and that is not acceptable and must be fixed. I'm fine with suggestions to change future libvirt to work in a better way, but we need to fix the regressions seen by *current* libvirt releases Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On Wed, 29 Jun 2016 09:47:55 +0100 "Daniel P. Berrange" <berrange@redhat.com> wrote:
On Tue, Jun 28, 2016 at 02:30:47PM -0400, Bandan Das wrote:
"Daniel P. Berrange" <berrange@redhat.com> writes:
Adding Alex & Bandan, since they signed off the kernel patch which I'm thinking either pci-back should be made to work more like vfio, or the kernel patch should be reverted or fixed to take account of the way pci-back works.
Whichever way, I don't consider this a libvirt problem to solve. As Linus' always says - the kernel must never break existing userspace
Agreed, but in this specific case, the usage is unsafe since unknown indexes are potentially being passed to the driver operations. It should always have been 3. to begin with.
Whether the userspace usage is good or not is irrelevant - this kernel change has broken existing userspace apps and that is not acceptable and must be fixed.
I'm fine with suggestions to change future libvirt to work in a better way, but we need to fix the regressions seen by *current* libvirt releases
I don't think this is a reasonable demand. For one, the change was made 2yrs ago and nobody noticed until now, I don't think there are stable kernel releases to cover all those kernels. There must be some sort of statute of limitations. Second, the fix was arguably for security, the previous interface allowed the user to casually override driver data in ways known to cause driver segfaults. Match that with the racy nature of the new_id interface and a system becomes susceptible to exploit. Finally, the kernel is returning -EEXIST, indicating to the user that the entry already exists in the device ID table, libvirt is choosing to handle that as a failure rather than acknowledge that the entry exists, there's no need to add or remove it. So there's clearly a user bug at play here as well. If there's a fix to be made, I think it would be to make xen-pciback use a fully dynamic device ID table like the other meta drivers rather than pre-populate a PCI_ANY_ID table. Thanks, Alex

On Wed, Jun 29, 2016 at 08:00:22AM -0600, Alex Williamson wrote:
On Wed, 29 Jun 2016 09:47:55 +0100 "Daniel P. Berrange" <berrange@redhat.com> wrote:
On Tue, Jun 28, 2016 at 02:30:47PM -0400, Bandan Das wrote:
"Daniel P. Berrange" <berrange@redhat.com> writes:
Adding Alex & Bandan, since they signed off the kernel patch which I'm thinking either pci-back should be made to work more like vfio, or the kernel patch should be reverted or fixed to take account of the way pci-back works.
Whichever way, I don't consider this a libvirt problem to solve. As Linus' always says - the kernel must never break existing userspace
Agreed, but in this specific case, the usage is unsafe since unknown indexes are potentially being passed to the driver operations. It should always have been 3. to begin with.
Whether the userspace usage is good or not is irrelevant - this kernel change has broken existing userspace apps and that is not acceptable and must be fixed.
I'm fine with suggestions to change future libvirt to work in a better way, but we need to fix the regressions seen by *current* libvirt releases
I don't think this is a reasonable demand. For one, the change was made 2yrs ago and nobody noticed until now, I don't think there are stable kernel releases to cover all those kernels. There must be some sort of statute of limitations.
Oh sorry, I totally missed the date on that. I was thinking this was a recent kernel regression. I agree that if 2+ years have passed, this ship has sailed. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 06/28/2016 01:39 PM, Jim Fehlig wrote:
After updating the dom0 kernel on one of my Xen test hosts, I noticed problems with PCI hostdev management. E.g
# virsh nodedev-detach pci_0000_07_10_1 error: Failed to detach device pci_0000_07_10_1 error: Failed to add PCI device ID '8086 1520' to pciback: File exists
It turns out there was a small interface change to new_id with the following commit to 3.16 kernel
which now causes xen_pciback to fail writes of "vendorid productid" to new_id. e.g.
# echo "8086 1520" > /sys/bus/pci/drivers/pciback/new_id -bash: echo: write error: File exists
Interestingly, vfio doesn't encounter the same error
# echo "8086 1520" > /sys/bus/pci/drivers/vfio-pci/new_id # echo $? 0
vfio-pci has: static struct pci_driver vfio_pci_driver = { .name = "vfio-pci", .id_table = NULL, /* only dynamic ids */
while xen-pciback has: static const struct pci_device_id pcistub_ids[] = { { .vendor = PCI_ANY_ID, .device = PCI_ANY_ID, .subvendor = PCI_ANY_ID, .subdevice = PCI_ANY_ID, }, {0,}, }; static struct pci_driver xen_pcibk_pci_driver = { .name = "pciback", .id_table = pcistub_ids,
So any vendor/device pair will match for xen-pciback, while none will match for vfio-pci.
But after reading that commit and the associated thread, it is not clear to me how to best fix this. Options are
1. set .id_table to NULL for xen-pciback 2. drop using the new_id interface from libvirt 3. pass more values (subvendor, subdevice, class, etc) to the new_id interface
I'm not sure what problems, if any, options 1 and 2 might cause. Option 2 seems the best approach since new_id seems to be a rather unsafe interface.
Regardless of your current problem (as Dan says in his reply, this is kernel breakage and should be fixed)... "Unsafe" was *one* of the words that came to my mind when I first saw the new_id interface. These days there is a sysfs interface called driver_override that seems much more thoughtfully designed - you just write the name of the desired driver to /sys/devices/[rest of path to device]/driver_override. I didn't check if this is the version of the patch that was pushed upstream, but the commit log message does give a nice synopsis of its use: https://www.redhat.com/archives/libvir-list/2014-April/msg00382.html It would be nice to completely get rid of new_id in libvirt, but driver_override doesn't exist in 2.6 kernels, so we have to keep it around for compatibility with RHEL6/CentOS6. In the meantime, I wouldn't complain at all if someone added support for driver_override that would fallback to new_id if the driver_override node wasn't found. (A nice side effect would be that your problem would be solved even when the kernel wasn't fixed - driver_override is present at least as far back as kernel 3.10, and you say your problem doesn't occur until 3.16).
Thanks for your opinions!
Regards, Jim
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

On Tue, 28 Jun 2016 14:39:22 -0400 Laine Stump <laine@laine.org> wrote:
On 06/28/2016 01:39 PM, Jim Fehlig wrote:
After updating the dom0 kernel on one of my Xen test hosts, I noticed problems with PCI hostdev management. E.g
# virsh nodedev-detach pci_0000_07_10_1 error: Failed to detach device pci_0000_07_10_1 error: Failed to add PCI device ID '8086 1520' to pciback: File exists
It turns out there was a small interface change to new_id with the following commit to 3.16 kernel
which now causes xen_pciback to fail writes of "vendorid productid" to new_id. e.g.
# echo "8086 1520" > /sys/bus/pci/drivers/pciback/new_id -bash: echo: write error: File exists
Interestingly, vfio doesn't encounter the same error
# echo "8086 1520" > /sys/bus/pci/drivers/vfio-pci/new_id # echo $? 0
vfio-pci has: static struct pci_driver vfio_pci_driver = { .name = "vfio-pci", .id_table = NULL, /* only dynamic ids */
while xen-pciback has: static const struct pci_device_id pcistub_ids[] = { { .vendor = PCI_ANY_ID, .device = PCI_ANY_ID, .subvendor = PCI_ANY_ID, .subdevice = PCI_ANY_ID, }, {0,}, }; static struct pci_driver xen_pcibk_pci_driver = { .name = "pciback", .id_table = pcistub_ids,
So any vendor/device pair will match for xen-pciback, while none will match for vfio-pci.
But after reading that commit and the associated thread, it is not clear to me how to best fix this. Options are
1. set .id_table to NULL for xen-pciback 2. drop using the new_id interface from libvirt 3. pass more values (subvendor, subdevice, class, etc) to the new_id interface
I'm not sure what problems, if any, options 1 and 2 might cause. Option 2 seems the best approach since new_id seems to be a rather unsafe interface.
Regardless of your current problem (as Dan says in his reply, this is kernel breakage and should be fixed)...
"Unsafe" was *one* of the words that came to my mind when I first saw the new_id interface. These days there is a sysfs interface called driver_override that seems much more thoughtfully designed - you just write the name of the desired driver to /sys/devices/[rest of path to device]/driver_override. I didn't check if this is the version of the patch that was pushed upstream, but the commit log message does give a nice synopsis of its use:
https://www.redhat.com/archives/libvir-list/2014-April/msg00382.html
It would be nice to completely get rid of new_id in libvirt, but driver_override doesn't exist in 2.6 kernels, so we have to keep it around for compatibility with RHEL6/CentOS6. In the meantime, I wouldn't complain at all if someone added support for driver_override that would fallback to new_id if the driver_override node wasn't found. (A nice side effect would be that your problem would be solved even when the kernel wasn't fixed - driver_override is present at least as far back as kernel 3.10, and you say your problem doesn't occur until 3.16).
+1 The new_id interface has several issues and for meta drivers like vfio-pci, pci-stub, and xen-pciback it should probably be considered deprecated. driver_override is the preferred interface. In addition to the fix made by the referenced commit, which resolved some really difficult to debug issues, new_id is racy. Any time we add and remove an ID to a driver there's a window where any device matching that specification can bind to the driver. driver_override avoids these sorts of issues. There are also bus types which do not support dynamic IDs, but have added support for driver_override, ex. vfio-platform. Migrating to driver_override makes that support easier. xen-pciback seems like it was always used in a questionable way if the driver already binds to PCI_ANY_ID, but userspace persists in trying to add a specific ID anyway. -EEXIST seems like an actual correct response rather than silently building up overlapping and conflicting dynamic ID entries. Also it looks to me like both the new_id behavior and driver_override were both introduced in 3.16, the driver_override commit is: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/pci/pci-driver.c?h=v3.16&id=782a985d7af26db39e86070d28f987cad21313c0 Thanks, Alex

On 06/28/2016 12:39 PM, Laine Stump wrote:
On 06/28/2016 01:39 PM, Jim Fehlig wrote:
After updating the dom0 kernel on one of my Xen test hosts, I noticed problems with PCI hostdev management. E.g
# virsh nodedev-detach pci_0000_07_10_1 error: Failed to detach device pci_0000_07_10_1 error: Failed to add PCI device ID '8086 1520' to pciback: File exists
It turns out there was a small interface change to new_id with the following commit to 3.16 kernel
which now causes xen_pciback to fail writes of "vendorid productid" to new_id. e.g.
# echo "8086 1520" > /sys/bus/pci/drivers/pciback/new_id -bash: echo: write error: File exists
Interestingly, vfio doesn't encounter the same error
# echo "8086 1520" > /sys/bus/pci/drivers/vfio-pci/new_id # echo $? 0
vfio-pci has: static struct pci_driver vfio_pci_driver = { .name = "vfio-pci", .id_table = NULL, /* only dynamic ids */
while xen-pciback has: static const struct pci_device_id pcistub_ids[] = { { .vendor = PCI_ANY_ID, .device = PCI_ANY_ID, .subvendor = PCI_ANY_ID, .subdevice = PCI_ANY_ID, }, {0,}, }; static struct pci_driver xen_pcibk_pci_driver = { .name = "pciback", .id_table = pcistub_ids,
So any vendor/device pair will match for xen-pciback, while none will match for vfio-pci.
But after reading that commit and the associated thread, it is not clear to me how to best fix this. Options are
1. set .id_table to NULL for xen-pciback 2. drop using the new_id interface from libvirt 3. pass more values (subvendor, subdevice, class, etc) to the new_id interface
I'm not sure what problems, if any, options 1 and 2 might cause. Option 2 seems the best approach since new_id seems to be a rather unsafe interface.
Regardless of your current problem (as Dan says in his reply, this is kernel breakage and should be fixed)...
"Unsafe" was *one* of the words that came to my mind when I first saw the new_id interface. These days there is a sysfs interface called driver_override that seems much more thoughtfully designed - you just write the name of the desired driver to /sys/devices/[rest of path to device]/driver_override. I didn't check if this is the version of the patch that was pushed upstream, but the commit log message does give a nice synopsis of its use:
https://www.redhat.com/archives/libvir-list/2014-April/msg00382.html
It would be nice to completely get rid of new_id in libvirt, but driver_override doesn't exist in 2.6 kernels, so we have to keep it around for compatibility with RHEL6/CentOS6. In the meantime, I wouldn't complain at all if someone added support for driver_override that would fallback to new_id if the driver_override node wasn't found. (A nice side effect would be that your problem would be solved even when the kernel wasn't fixed - driver_override is present at least as far back as kernel 3.10, and you say your problem doesn't occur until 3.16).
Sorry for the delay. I got sidetracked for a while but finally got around to adding support for driver_override, falling back to the existing new_id approach https://www.redhat.com/archives/libvir-list/2016-July/msg00370.html Regards, Jim
participants (5)
-
Alex Williamson
-
Bandan Das
-
Daniel P. Berrange
-
Jim Fehlig
-
Laine Stump