Persistent Failed to attach device following transient Failed to read product/vendor ID

I'm attempting to solve the issue reported here:
https://gitlab.com/libvirt/libvirt/-/issues/345

Hoping the originator of virHostdevDeleteMissingPCIDevices() will be able to comment, as I am still trying to reproduce the issue with additional debug in place.

diff --git a/src/hypervisor/virhostdev.c b/src/hypervisor/virhostdev.c
index c0ce867596..d43354963e 100644
--- a/src/hypervisor/virhostdev.c
+++ b/src/hypervisor/virhostdev.c
@@ -1101,11 +1101,11 @@ virHostdevReAttachPCIDevices(virHostdevManager *mgr,
         VIR_ERROR(_("Failed to allocate PCI device list: %s"),
                   virGetLastErrorMessage());
         virResetLastError();
-        return;
     }
-
-    virHostdevReAttachPCIDevicesImpl(mgr, drv_name, dom_name, pcidevs,
-                                     hostdevs, nhostdevs);
+    else {
+        virHostdevReAttachPCIDevicesImpl(mgr, drv_name, dom_name, pcidevs,
+                                         hostdevs, nhostdevs);
+    }

     /* Handle the case where PCI devices from the host went missing
      * during the domain lifetime */
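
For readers following the thread, the control-flow change in the patch can be sketched in isolation. This is a minimal standalone model, not libvirt code: the struct, the function names, and the counters are all hypothetical. It assumes that the code following the "Handle the case where PCI devices from the host went missing" comment is the cleanup that the early return used to skip.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the hostdev manager state; not a libvirt type. */
struct sketch_state {
    int reattached;      /* how many times the re-attach step ran */
    int missing_cleaned; /* how many times the missing-device cleanup ran */
};

/* Pre-patch shape: a failed PCI device list allocation returns early,
 * so the missing-device cleanup at the end is never reached. */
void reattach_with_early_return(struct sketch_state *st, bool list_ok)
{
    if (!list_ok)
        return;            /* error logged and reset, then bail out */

    st->reattached++;      /* models the re-attach implementation call */
    st->missing_cleaned++; /* models the missing-device handling */
}

/* Post-patch shape: the re-attach step is skipped on failure, but
 * control still falls through to the missing-device cleanup. */
void reattach_with_else(struct sketch_state *st, bool list_ok)
{
    if (!list_ok) {
        /* error logged and reset; no return here */
    } else {
        st->reattached++;
    }

    st->missing_cleaned++; /* runs on both paths */
}
```

With list_ok set to false, the first variant leaves both counters untouched, while the second still performs the cleanup step. That difference is the only behavior the sketch is meant to show.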

On 7/11/22 20:14, Pighin, Anthony (Nokia - CA/Ottawa) wrote:
> I'm attempting to solve the issue reported here:
> https://gitlab.com/libvirt/libvirt/-/issues/345
>
> Hoping the originator of virHostdevDeleteMissingPCIDevices() will be able to comment, as I am still trying to reproduce the issue with additional debug in place.
>
> diff --git a/src/hypervisor/virhostdev.c b/src/hypervisor/virhostdev.c
> index c0ce867596..d43354963e 100644
> --- a/src/hypervisor/virhostdev.c
> +++ b/src/hypervisor/virhostdev.c
> @@ -1101,11 +1101,11 @@ virHostdevReAttachPCIDevices(virHostdevManager *mgr,
>          VIR_ERROR(_("Failed to allocate PCI device list: %s"),
>                    virGetLastErrorMessage());
>          virResetLastError();
> -        return;
>      }
> -
> -    virHostdevReAttachPCIDevicesImpl(mgr, drv_name, dom_name, pcidevs,
> -                                     hostdevs, nhostdevs);
> +    else {
> +        virHostdevReAttachPCIDevicesImpl(mgr, drv_name, dom_name, pcidevs,
> +                                         hostdevs, nhostdevs);
> +    }
>
>      /* Handle the case where PCI devices from the host went missing
>       * during the domain lifetime */

Yeah, this looks like a correct fix, but I'm trying to understand the original problem more. In the GitLab issue you mention 'link bounce' - do you mean PCIe link?

Michal

Correct, PCIe link bounce/flap. The attached PCIe device entered a failed state where it was repeatedly resetting, and therefore the link itself was going up and down.

-----Original Message-----
From: Michal Prívozník <mprivozn@redhat.com>
Sent: Wednesday, July 20, 2022 11:07 AM
To: Pighin, Anthony (Nokia - CA/Ottawa) <anthony.pighin@nokia.com>; libvir-list@redhat.com
Subject: Re: Persistent Failed to attach device following transient Failed to read product/vendor ID

Yeah, this looks like a correct fix, but I'm trying to understand the original problem more. In the GitLab issue you mention 'link bounce' - do you mean PCIe link?

Michal