On 06/28/2016 01:39 PM, Jim Fehlig wrote:
> After updating the dom0 kernel on one of my Xen test hosts, I noticed
> problems with PCI hostdev management. E.g
>
> # virsh nodedev-detach pci_0000_07_10_1
> error: Failed to detach device pci_0000_07_10_1
> error: Failed to add PCI device ID '8086 1520' to pciback: File exists
>
> It turns out there was a small interface change to new_id with the following
> commit to 3.16 kernel
>
>
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/dr...
>
>
> which now causes xen_pciback to fail writes of "vendorid productid" to
> new_id. e.g.
>
> # echo "8086 1520" > /sys/bus/pci/drivers/pciback/new_id
> -bash: echo: write error: File exists
>
> Interestingly, vfio doesn't encounter the same error
>
> # echo "8086 1520" > /sys/bus/pci/drivers/vfio-pci/new_id
> # echo $?
> 0
>
> vfio-pci has:
> static struct pci_driver vfio_pci_driver = {
> .name = "vfio-pci",
> .id_table = NULL, /* only dynamic ids */
>
> while xen-pciback has:
> static const struct pci_device_id pcistub_ids[] = {
> {
> .vendor = PCI_ANY_ID,
> .device = PCI_ANY_ID,
> .subvendor = PCI_ANY_ID,
> .subdevice = PCI_ANY_ID,
> },
> {0,},
> };
> static struct pci_driver xen_pcibk_pci_driver = {
> .name = "pciback",
> .id_table = pcistub_ids,
>
> So any vendor/device pair will match for xen-pciback, while none will match
> for vfio-pci.
>
> But after reading that commit and the associated thread, it is not clear to
> me how to best fix this. Options are
>
> 1. set .id_table to NULL for xen-pciback
> 2. drop using the new_id interface from libvirt
> 3. pass more values (subvendor, subdevice, class, etc) to the new_id interface
>
> I'm not sure what problems, if any, options 1 and 2 might cause. Option 2
> seems the best approach since new_id seems to be a rather unsafe interface.
Regardless of your current problem (as Dan says in his reply, this is kernel
breakage and should be fixed)...
"Unsafe" was *one* of the words that came to my mind when I first saw the
new_id interface. These days there is a sysfs interface called driver_override
that seems much more thoughtfully designed - you just write the name of the
desired driver to /sys/devices/[rest of path to device]/driver_override. I
didn't check if this is the version of the patch that was pushed upstream, but
the commit log message does give a nice synopsis of its use:
https://www.redhat.com/archives/libvir-list/2014-April/msg00382.html
It would be nice to completely get rid of new_id in libvirt, but
driver_override doesn't exist in 2.6 kernels, so we have to keep it around for
compatibility with RHEL6/CentOS6. In the meantime, I wouldn't complain at all
if someone added support for driver_override that would fallback to new_id if
the driver_override node wasn't found. (A nice side effect would be that your
problem would be solved even when the kernel wasn't fixed - driver_override is
present at least as far back as kernel 3.10, and you say your problem doesn't
occur until 3.16).
Sorry for the delay. I got sidetracked for a while but finally got around to
adding support for driver_override, falling back to the existing new_id approach