[libvirt] [PATCH v2 0/3] misc virhostdevs cleanups

changes in v2: - changed the parameter order in the function calls - gave up on moving virPCIDeviceSetX(pci, true) calls to virPCIDeviceReattach(). The attributes being set changes the behavior of virPCIDeviceReattach() in a more complex way than I expected. I still believe a simplification can be done there, but it became out of scope for a more simplistic cleanup such as this one. These are cleanups that I made together with an attempt to enable parcial PCI Multifunction assignment with managed=true. That work will be scrapped after discussions with Laine in [1], but these cleanups kind of make sense on their own, so here they are. [1] https://www.redhat.com/archives/libvir-list/2019-July/msg01175.html *** BLURB HERE *** Daniel Henrique Barboza (3): virhostdev: introduce virHostdevResetAllPCIDevices virhostdev: remove virHostdevReattachPCIDevice virhostdev: introduce virHostdevReattachAllPCIDevices src/util/virhostdev.c | 148 +++++++++++++++++------------------------- src/util/virpci.c | 14 ++++ 2 files changed, 75 insertions(+), 87 deletions(-) -- 2.21.0

This code that executes virPCIDeviceReset in all virPCIDevicePtr objects of a given virPCIDeviceListPtr list is replicated twice in the code. Putting it in a helper function helps with readability. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- src/util/virhostdev.c | 54 +++++++++++++++++++++++-------------------- 1 file changed, 29 insertions(+), 25 deletions(-) diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index a3647a6cf4..4dd24a8f65 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -613,6 +613,32 @@ virHostdevRestoreNetConfig(virDomainHostdevDefPtr hostdev, } } +static int +virHostdevResetAllPCIDevices(virHostdevManagerPtr mgr, + virPCIDeviceListPtr pcidevs) +{ + int ret = 0; + size_t i; + + for (i = 0; i < virPCIDeviceListCount(pcidevs); i++) { + virPCIDevicePtr pci = virPCIDeviceListGet(pcidevs, i); + + /* We can avoid looking up the actual device here, because performing + * a PCI reset on a device doesn't require any information other than + * the address, which 'pci' already contains */ + VIR_DEBUG("Resetting PCI device %s", virPCIDeviceGetName(pci)); + if (virPCIDeviceReset(pci, mgr->activePCIHostdevs, + mgr->inactivePCIHostdevs) < 0) { + VIR_ERROR(_("Failed to reset PCI device: %s"), + virGetLastErrorMessage()); + virResetLastError(); + ret = -1; + } + } + + return ret; +} + int virHostdevPreparePCIDevices(virHostdevManagerPtr mgr, const char *drv_name, @@ -765,17 +791,8 @@ virHostdevPreparePCIDevices(virHostdevManagerPtr mgr, /* Step 3: Now that all the PCI hostdevs have been detached, we * can safely reset them */ - for (i = 0; i < virPCIDeviceListCount(pcidevs); i++) { - virPCIDevicePtr pci = virPCIDeviceListGet(pcidevs, i); - - /* We can avoid looking up the actual device here, because performing - * a PCI reset on a device doesn't require any information other than - * the address, which 'pci' already contains */ - VIR_DEBUG("Resetting PCI device %s", virPCIDeviceGetName(pci)); - if (virPCIDeviceReset(pci, mgr->activePCIHostdevs, - mgr->inactivePCIHostdevs) < 0) - goto reattachdevs; - } + if (virHostdevResetAllPCIDevices(mgr, pcidevs) < 0) + goto reattachdevs; /* Step 4: For SRIOV network devices, Now that we have detached the * the network device, set the new netdev config */ @@ -1046,20 +1063,7 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr mgr, } /* Step 4: perform a PCI Reset on all devices */ - for (i = 0; i < virPCIDeviceListCount(pcidevs); i++) { - virPCIDevicePtr pci = virPCIDeviceListGet(pcidevs, i); - - /* We can avoid looking up the actual device here, because performing - * a PCI reset on a device doesn't require any information other than - * the address, which 'pci' already contains */ - VIR_DEBUG("Resetting PCI device %s", virPCIDeviceGetName(pci)); - if (virPCIDeviceReset(pci, mgr->activePCIHostdevs, - mgr->inactivePCIHostdevs) < 0) { - VIR_ERROR(_("Failed to reset PCI device: %s"), - virGetLastErrorMessage()); - virResetLastError(); - } - } + virHostdevResetAllPCIDevices(mgr, pcidevs); /* Step 5: Reattach managed devices to their host drivers; unmanaged * devices don't need to be processed further */ -- 2.21.0

On 7/23/19 7:35 PM, Daniel Henrique Barboza wrote:
This code that executes virPCIDeviceReset in all virPCIDevicePtr objects of a given virPCIDeviceListPtr list is replicated twice in the code. Putting it in a helper function helps with readability.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- src/util/virhostdev.c | 54 +++++++++++++++++++++++-------------------- 1 file changed, 29 insertions(+), 25 deletions(-)
diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index a3647a6cf4..4dd24a8f65 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -613,6 +613,32 @@ virHostdevRestoreNetConfig(virDomainHostdevDefPtr hostdev, } }
+static int +virHostdevResetAllPCIDevices(virHostdevManagerPtr mgr, + virPCIDeviceListPtr pcidevs) +{ + int ret = 0; + size_t i; + + for (i = 0; i < virPCIDeviceListCount(pcidevs); i++) { + virPCIDevicePtr pci = virPCIDeviceListGet(pcidevs, i); + + /* We can avoid looking up the actual device here, because performing + * a PCI reset on a device doesn't require any information other than + * the address, which 'pci' already contains */ + VIR_DEBUG("Resetting PCI device %s", virPCIDeviceGetName(pci)); + if (virPCIDeviceReset(pci, mgr->activePCIHostdevs, + mgr->inactivePCIHostdevs) < 0) { + VIR_ERROR(_("Failed to reset PCI device: %s"), + virGetLastErrorMessage()); + virResetLastError();
I don't think this is a good idea. VIR_ERROR() logs the error message, but then virResetLastError() clears the thread local error object and thus if a PCI fails to reset the caller is left with 'uknown error' and sysadmin has to go to logs to find out what's happening. In fact, I think all those virResetLastError() calls should be removed from virHostdevReAttachPCIDevices(). I'll fix it here and in 3/3 too.
+ ret = -1; + } + } + + return ret; +} +
Michal

On 7/31/19 9:53 AM, Michal Privoznik wrote:
On 7/23/19 7:35 PM, Daniel Henrique Barboza wrote:
This code that executes virPCIDeviceReset in all virPCIDevicePtr objects of a given virPCIDeviceListPtr list is replicated twice in the code. Putting it in a helper function helps with readability.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- src/util/virhostdev.c | 54 +++++++++++++++++++++++-------------------- 1 file changed, 29 insertions(+), 25 deletions(-)
diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index a3647a6cf4..4dd24a8f65 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -613,6 +613,32 @@ virHostdevRestoreNetConfig(virDomainHostdevDefPtr hostdev, } } +static int +virHostdevResetAllPCIDevices(virHostdevManagerPtr mgr, + virPCIDeviceListPtr pcidevs) +{ + int ret = 0; + size_t i; + + for (i = 0; i < virPCIDeviceListCount(pcidevs); i++) { + virPCIDevicePtr pci = virPCIDeviceListGet(pcidevs, i); + + /* We can avoid looking up the actual device here, because performing + * a PCI reset on a device doesn't require any information other than + * the address, which 'pci' already contains */ + VIR_DEBUG("Resetting PCI device %s", virPCIDeviceGetName(pci)); + if (virPCIDeviceReset(pci, mgr->activePCIHostdevs, + mgr->inactivePCIHostdevs) < 0) { + VIR_ERROR(_("Failed to reset PCI device: %s"), + virGetLastErrorMessage()); + virResetLastError();
I don't think this is a good idea. VIR_ERROR() logs the error message, but then virResetLastError() clears the thread local error object and thus if a PCI fails to reset the caller is left with 'uknown error' and sysadmin has to go to logs to find out what's happening. In fact, I think all those virResetLastError() calls should be removed from virHostdevReAttachPCIDevices(). I'll fix it here and in 3/3 too.
Thanks for explaining and fixing it. I blindly moved the code to this new function (here and in 3/3) without giving too much thought about whether the existing error handling makes sense. DHB
+ ret = -1; + } + } + + return ret; +} +
Michal

virHostdevReattachPCIDevice() is a static that simply does a wait loop with virPCIDeviceWaitForCleanup() before calling virPCIDeviceReattach(). This loop traces back to commit d1e5676c0d, aiming to solve a race condition between Libvirt returning the device back to the host and QEMU trying to access it in the meantime, which resulted in QEMU exiting on error and killing the guest. This happens because device_del is asynchronous, returning OK even if the guest didn't release the device. Commit 01abc8a1b8 moved this code to qemu_hostdev.c, 82e8dd4cf8 added the pci-stub conditional for the loop, 899b261127 moved the code to virhostdev.c where it stood until now. The intent of this wait loop is still valid: device_del is still not bullet proof into preventing the conditions that commit d1e5676c0d aimed to fix, especially when considering all the architectures we must support. However, this loop is executed only in virHostdevReattachPCIDevice(), leaving every other virPCIDeviceReattach() call prone to that error. Let's move the wait loop code to virPCIDeviceReattach(). This will: - make every reattach call safe from this race condition with the pci-stub; - allow for a bit of code cleanup (virHostdevReattachPCIDevice() can be erased, and virHostdevReAttachPCIDevices() can use virPCIDeviceReattach() directly); - make it easier to understand the overall reattach mechanisms in Libvirt, without the risk of a newcomer wondering why reattach is done slightly different in some instances. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- src/util/virhostdev.c | 40 ++++++++++------------------------------ src/util/virpci.c | 14 ++++++++++++++ 2 files changed, 24 insertions(+), 30 deletions(-) diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index 4dd24a8f65..31d075a11a 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -927,33 +927,6 @@ virHostdevPreparePCIDevices(virHostdevManagerPtr mgr, return ret; } -/* - * Pre-condition: inactivePCIHostdevs & activePCIHostdevs - * are locked - */ -static void -virHostdevReattachPCIDevice(virHostdevManagerPtr mgr, - virPCIDevicePtr actual) -{ - /* Wait for device cleanup if it is qemu/kvm */ - if (virPCIDeviceGetStubDriver(actual) == VIR_PCI_STUB_DRIVER_KVM) { - int retries = 100; - while (virPCIDeviceWaitForCleanup(actual, "kvm_assigned_device") - && retries) { - usleep(100*1000); - retries--; - } - } - - VIR_DEBUG("Reattaching PCI device %s", virPCIDeviceGetName(actual)); - if (virPCIDeviceReattach(actual, mgr->activePCIHostdevs, - mgr->inactivePCIHostdevs) < 0) { - VIR_ERROR(_("Failed to re-attach PCI device: %s"), - virGetLastErrorMessage()); - virResetLastError(); - } -} - /* @oldStateDir: * For upgrade purpose: see virHostdevRestoreNetConfig */ @@ -1072,12 +1045,19 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr mgr, virPCIDevicePtr actual; /* We need to look up the actual device because that's what - * virHostdevReattachPCIDevice() expects as its argument */ + * virPCIDeviceReattach() expects as its argument */ if (!(actual = virPCIDeviceListFind(mgr->inactivePCIHostdevs, pci))) continue; - if (virPCIDeviceGetManaged(actual)) - virHostdevReattachPCIDevice(mgr, actual); + if (virPCIDeviceGetManaged(actual)) { + if (virPCIDeviceReattach(actual, + mgr->activePCIHostdevs, + mgr->inactivePCIHostdevs) < 0) { + VIR_ERROR(_("Failed to re-attach PCI device: %s"), + virGetLastErrorMessage()); + virResetLastError(); + } + } else VIR_DEBUG("Not reattaching unmanaged PCI device %s", virPCIDeviceGetName(actual)); diff --git a/src/util/virpci.c b/src/util/virpci.c index 6dc0a2711c..05232888c3 100644 --- a/src/util/virpci.c +++ b/src/util/virpci.c @@ -1509,6 +1509,10 @@ virPCIDeviceDetach(virPCIDevicePtr dev, return 0; } +/* + * Pre-condition: inactivePCIHostdevs & activePCIHostdevs + * are locked + */ int virPCIDeviceReattach(virPCIDevicePtr dev, virPCIDeviceListPtr activeDevs, @@ -1520,6 +1524,16 @@ virPCIDeviceReattach(virPCIDevicePtr dev, return -1; } + /* Wait for device cleanup if it is qemu/kvm */ + if (virPCIDeviceGetStubDriver(dev) == VIR_PCI_STUB_DRIVER_KVM) { + int retries = 100; + while (virPCIDeviceWaitForCleanup(dev, "kvm_assigned_device") + && retries) { + usleep(100*1000); + retries--; + } + } + if (virPCIDeviceUnbindFromStub(dev) < 0) return -1; -- 2.21.0

This code that executes virPCIDeviceReattach in all virPCIDevicePtr objects of a given virPCIDeviceListPtr list is replicated twice in the code. Putting it in a helper function helps with readability. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- src/util/virhostdev.c | 74 +++++++++++++++++++------------------------ 1 file changed, 32 insertions(+), 42 deletions(-) diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index 31d075a11a..d2474aa140 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -639,6 +639,36 @@ virHostdevResetAllPCIDevices(virHostdevManagerPtr mgr, return ret; } +static void +virHostdevReattachAllPCIDevices(virHostdevManagerPtr mgr, + virPCIDeviceListPtr pcidevs) +{ + size_t i; + + for (i = 0; i < virPCIDeviceListCount(pcidevs); i++) { + virPCIDevicePtr pci = virPCIDeviceListGet(pcidevs, i); + virPCIDevicePtr actual; + + /* We need to look up the actual device because that's what + * virPCIDeviceReattach() expects as its argument */ + if (!(actual = virPCIDeviceListFind(mgr->inactivePCIHostdevs, pci))) + continue; + + if (virPCIDeviceGetManaged(actual)) { + if (virPCIDeviceReattach(actual, + mgr->activePCIHostdevs, + mgr->inactivePCIHostdevs) < 0) { + VIR_ERROR(_("Failed to re-attach PCI device: %s"), + virGetLastErrorMessage()); + virResetLastError(); + } + } + else + VIR_DEBUG("Not reattaching unmanaged PCI device %s", + virPCIDeviceGetName(actual)); + } +} + int virHostdevPreparePCIDevices(virHostdevManagerPtr mgr, const char *drv_name, @@ -899,26 +929,7 @@ virHostdevPreparePCIDevices(virHostdevManagerPtr mgr, } reattachdevs: - for (i = 0; i < virPCIDeviceListCount(pcidevs); i++) { - virPCIDevicePtr pci = virPCIDeviceListGet(pcidevs, i); - virPCIDevicePtr actual; - - /* We need to look up the actual device because that's what - * virPCIDeviceReattach() expects as its argument */ - if (!(actual = virPCIDeviceListFind(mgr->inactivePCIHostdevs, pci))) - continue; - - if (virPCIDeviceGetManaged(actual)) { - VIR_DEBUG("Reattaching managed PCI device %s", - virPCIDeviceGetName(pci)); - ignore_value(virPCIDeviceReattach(actual, - mgr->activePCIHostdevs, - mgr->inactivePCIHostdevs)); - } else { - VIR_DEBUG("Not reattaching unmanaged PCI device %s", - virPCIDeviceGetName(pci)); - } - } + virHostdevReattachAllPCIDevices(mgr, pcidevs); cleanup: virObjectUnlock(mgr->activePCIHostdevs); @@ -1040,28 +1051,7 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr mgr, /* Step 5: Reattach managed devices to their host drivers; unmanaged * devices don't need to be processed further */ - for (i = 0; i < virPCIDeviceListCount(pcidevs); i++) { - virPCIDevicePtr pci = virPCIDeviceListGet(pcidevs, i); - virPCIDevicePtr actual; - - /* We need to look up the actual device because that's what - * virPCIDeviceReattach() expects as its argument */ - if (!(actual = virPCIDeviceListFind(mgr->inactivePCIHostdevs, pci))) - continue; - - if (virPCIDeviceGetManaged(actual)) { - if (virPCIDeviceReattach(actual, - mgr->activePCIHostdevs, - mgr->inactivePCIHostdevs) < 0) { - VIR_ERROR(_("Failed to re-attach PCI device: %s"), - virGetLastErrorMessage()); - virResetLastError(); - } - } - else - VIR_DEBUG("Not reattaching unmanaged PCI device %s", - virPCIDeviceGetName(actual)); - } + virHostdevReattachAllPCIDevices(mgr, pcidevs); virObjectUnlock(mgr->activePCIHostdevs); virObjectUnlock(mgr->inactivePCIHostdevs); -- 2.21.0

On 7/23/19 7:35 PM, Daniel Henrique Barboza wrote:
changes in v2: - changed the parameter order in the function calls - gave up on moving virPCIDeviceSetX(pci, true) calls to virPCIDeviceReattach(). The attributes being set changes the behavior of virPCIDeviceReattach() in a more complex way than I expected. I still believe a simplification can be done there, but it became out of scope for a more simplistic cleanup such as this one.
These are cleanups that I made together with an attempt to enable parcial PCI Multifunction assignment with managed=true.
That work will be scrapped after discussions with Laine in [1], but these cleanups kind of make sense on their own, so here they are.
[1] https://www.redhat.com/archives/libvir-list/2019-July/msg01175.html
*** BLURB HERE ***
Daniel Henrique Barboza (3): virhostdev: introduce virHostdevResetAllPCIDevices virhostdev: remove virHostdevReattachPCIDevice virhostdev: introduce virHostdevReattachAllPCIDevices
src/util/virhostdev.c | 148 +++++++++++++++++------------------------- src/util/virpci.c | 14 ++++ 2 files changed, 75 insertions(+), 87 deletions(-)
Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Michal
participants (2)
-
Daniel Henrique Barboza
-
Michal Privoznik