[libvirt] [PATCH 0/2] hostdev: Small usability improvements

Patch 2/2 is basically https://www.redhat.com/archives/libvir-list/2016-March/msg00230.html and patch 1/2 is an implementation of the idea outlined in https://www.redhat.com/archives/libvir-list/2016-March/msg00729.html to address the flaw discovered by John. Spun off from the main series because it can be merged on its own, and I still need to rework the remaining patches a bit before posting v2 of that. Cheers. Andrea Bolognani (2): hostdev: Detect untracked inactive devices hostdev: Stop early if unmanaged devices have not been detached src/util/virhostdev.c | 53 ++++++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 50 insertions(+), 3 deletions(-) -- 2.5.0

Unmanaged devices are attached to guests in two steps: first, the device is detached from the host and marked as inactive; subsequently, it is marked as active and attached to the guest. If the daemon is restarted between these two operations, we lose track of the inactive device. Steps 5 and 6 of virHostdevPreparePCIDevices() already subtly take care of this situation, but some planned changes will make it so that's no longer the case. Plus, explicit is always better than implicit. --- src/util/virhostdev.c | 47 ++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 44 insertions(+), 3 deletions(-) diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index e0d6465..7204bd7 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -560,7 +560,8 @@ virHostdevPreparePCIDevices(virHostdevManagerPtr mgr, } } - /* Step 2: detach managed devices (i.e. bind to appropriate stub driver) */ + /* Step 2: detach managed devices and make sure unmanaged devices + * have already been taken care of */ for (i = 0; i < virPCIDeviceListCount(pcidevs); i++) { virPCIDevicePtr pci = virPCIDeviceListGet(pcidevs, i); @@ -577,8 +578,48 @@ virHostdevPreparePCIDevices(virHostdevManagerPtr mgr, mgr->inactivePCIHostdevs) < 0) goto reattachdevs; } else { - VIR_DEBUG("Not detaching unmanaged PCI device %s", - virPCIDeviceGetName(pci)); + char *driverPath; + char *driverName; + int stub; + + /* Unmanaged devices should already have been marked as + * inactive: if that's the case, we can simply move on */ + if (virPCIDeviceListFind(mgr->inactivePCIHostdevs, pci)) { + VIR_DEBUG("Not detaching unmanaged PCI device %s", + virPCIDeviceGetName(pci)); + continue; + } + + /* If that's not the case, though, it might be because the + * daemon has been restarted, causing us to lose track of the + * device. Try and recover by marking the device as inactive + * if it happens to be bound to a known stub driver. + * + * FIXME Get rid of this once a proper way to keep track of + * information about active / inactive device across + * daemon restarts has been implemented */ + + if (virPCIDeviceGetDriverPathAndName(pci, + &driverPath, &driverName) < 0) + goto reattachdevs; + + stub = virPCIStubDriverTypeFromString(driverName); + + VIR_FREE(driverPath); + VIR_FREE(driverName); + + if (stub >= 0 && + stub != VIR_PCI_STUB_DRIVER_NONE) { + + /* The device is bound to a known stub driver: store this + * information and add a copy to the inactive list */ + virPCIDeviceSetStubDriver(pci, stub); + + VIR_DEBUG("Adding PCI device %s to inactive list", + virPCIDeviceGetName(pci)); + if (virPCIDeviceListAddCopy(mgr->inactivePCIHostdevs, pci) < 0) + goto reattachdevs; + } } } -- 2.5.0

On 03/17/2016 01:24 PM, Andrea Bolognani wrote:
Unmanaged devices are attached to guests in two steps: first, the device is detached from the host and marked as inactive; subsequently, it is marked as active and attached to the guest.
If the daemon is restarted between these two operations, we lose track of the inactive device.
Steps 5 and 6 of virHostdevPreparePCIDevices() already subtly take care of this situation, but some planned changes will make it so that's no longer the case. Plus, explicit is always better than implicit. --- src/util/virhostdev.c | 47 ++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 44 insertions(+), 3 deletions(-)
diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index e0d6465..7204bd7 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -560,7 +560,8 @@ virHostdevPreparePCIDevices(virHostdevManagerPtr mgr, } }
- /* Step 2: detach managed devices (i.e. bind to appropriate stub driver) */ + /* Step 2: detach managed devices and make sure unmanaged devices + * have already been taken care of */ for (i = 0; i < virPCIDeviceListCount(pcidevs); i++) { virPCIDevicePtr pci = virPCIDeviceListGet(pcidevs, i);
@@ -577,8 +578,48 @@ virHostdevPreparePCIDevices(virHostdevManagerPtr mgr, mgr->inactivePCIHostdevs) < 0) goto reattachdevs; } else { - VIR_DEBUG("Not detaching unmanaged PCI device %s", - virPCIDeviceGetName(pci)); + char *driverPath; + char *driverName; + int stub;
stub = -1;
+ + /* Unmanaged devices should already have been marked as + * inactive: if that's the case, we can simply move on */ + if (virPCIDeviceListFind(mgr->inactivePCIHostdevs, pci)) { + VIR_DEBUG("Not detaching unmanaged PCI device %s", + virPCIDeviceGetName(pci)); + continue; + } + + /* If that's not the case, though, it might be because the + * daemon has been restarted, causing us to lose track of the + * device. Try and recover by marking the device as inactive + * if it happens to be bound to a known stub driver. + *
Just to make it clearer (for me) - this is because using the node device API virHostdevPCINodeDeviceDetach it's possible that somewhere between virPCIDeviceBindToStub and the virPCIDeviceListAddCopy we had a daemon restart causing us to lose track of the device, so this code tries to detect that the BindToStub was done and complete the task. Of course it's also possible that BindToStub was "in process" when we failed, but I'm not sure *how* to detect that... I suppose though all that matters is we had a completed BindToStub.
+ * FIXME Get rid of this once a proper way to keep track of + * information about active / inactive device across + * daemon restarts has been implemented */ + + if (virPCIDeviceGetDriverPathAndName(pci, + &driverPath, &driverName) < 0) + goto reattachdevs;
I note that virPCIDeviceUnbindFromStub can return 0, but driverName == NULL. If driverName == NULL, then it declares it's not bound to a known stub. However, for our purposes, it more than likely means the BindToStub may not have completed successfully. But still we only *care* that it did.
+ + stub = virPCIStubDriverTypeFromString(driverName);
So, I think this becomes, if (driverName) stub = virPCIStubDriverTypeFromString(driverName);
+ + VIR_FREE(driverPath); + VIR_FREE(driverName); + + if (stub >= 0 && + stub != VIR_PCI_STUB_DRIVER_NONE) {
Strange check since VIR_PCI_STUB_DRIVER_NONE = 0 if (stub >= 0 && stub != 0) a change to seems to be what you're after if (stub > VIR_PCI_STUB_DRIVER_NONE && stub < VIR_PCI_STUB_DRIVER_LAST)
+ + /* The device is bound to a known stub driver: store this + * information and add a copy to the inactive list */ + virPCIDeviceSetStubDriver(pci, stub); + + VIR_DEBUG("Adding PCI device %s to inactive list", + virPCIDeviceGetName(pci)); + if (virPCIDeviceListAddCopy(mgr->inactivePCIHostdevs, pci) < 0) + goto reattachdevs; + }
I think perhaps patch 2 should be merged into here - I was kind of wondering what the "else" condition would be... Until I looked at patch 2. ACK with the adjustments - including the merge; otherwise, we fall into Step 3 still. John
} }

On Tue, 2016-03-22 at 07:45 -0400, John Ferlan wrote:
On 03/17/2016 01:24 PM, Andrea Bolognani wrote:
Unmanaged devices are attached to guests in two steps: first, the device is detached from the host and marked as inactive; subsequently, it is marked as active and attached to the guest. If the daemon is restarted between these two operations, we lose track of the inactive device. Steps 5 and 6 of virHostdevPreparePCIDevices() already subtly take care of this situation, but some planned changes will make it so that's no longer the case. Plus, explicit is always better than implicit. --- src/util/virhostdev.c | 47 ++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 44 insertions(+), 3 deletions(-) diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index e0d6465..7204bd7 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -560,7 +560,8 @@ virHostdevPreparePCIDevices(virHostdevManagerPtr mgr, } } - /* Step 2: detach managed devices (i.e. bind to appropriate stub driver) */ + /* Step 2: detach managed devices and make sure unmanaged devices + * have already been taken care of */ for (i = 0; i < virPCIDeviceListCount(pcidevs); i++) { virPCIDevicePtr pci = virPCIDeviceListGet(pcidevs, i); @@ -577,8 +578,48 @@ virHostdevPreparePCIDevices(virHostdevManagerPtr mgr, mgr->inactivePCIHostdevs) < 0) goto reattachdevs; } else { - VIR_DEBUG("Not detaching unmanaged PCI device %s", - virPCIDeviceGetName(pci)); + char *driverPath; + char *driverName; + int stub; stub = -1;
May not be needed, see below.
+ + /* Unmanaged devices should already have been marked as + * inactive: if that's the case, we can simply move on */ + if (virPCIDeviceListFind(mgr->inactivePCIHostdevs, pci)) { + VIR_DEBUG("Not detaching unmanaged PCI device %s", + virPCIDeviceGetName(pci)); + continue; + } + + /* If that's not the case, though, it might be because the + * daemon has been restarted, causing us to lose track of the + * device. Try and recover by marking the device as inactive + * if it happens to be bound to a known stub driver. + * Just to make it clearer (for me) - this is because using the node device API virHostdevPCINodeDeviceDetach it's possible that somewhere between virPCIDeviceBindToStub and the virPCIDeviceListAddCopy we had a daemon restart causing us to lose track of the device, so this code tries to detect that the BindToStub was done and complete the task. Of course it's also possible that BindToStub was "in process" when we failed, but I'm not sure *how* to detect that... I suppose though all that matters is we had a completed BindToStub.
Mh, not really what I had in mind :) What this code is meant to take care of is the case when virHostdevPCINodeDeviceDetach() completes successfully, marking the device as inactive, but the daemon is restarted before the device can be attached to a guest. That causes us to lose all information in the inactive list. The current code copes with this because step 5 adds the device to the active list (even if it was never in the inactive list) and step 6 removes the device from the inactive list (even if it was never in the inactive list), but that's extremely opaque and most importantly will stop working once we start moving device objects from one list to the other instead of copying them around. The new code deals with the situation explicitly by adding devices to the inactive list if they look like they belong there - namely, they are bound to a stub driver. As noted in the comment right below, this is sub-optimal and a follow-up patch should take care of storing both the active and inactive list to disk properly so that daemon restarts are no longer a problem. That's for another day, though :)
+ * FIXME Get rid of this once a proper way to keep track of + * information about active / inactive device across + * daemon restarts has been implemented */ + + if (virPCIDeviceGetDriverPathAndName(pci, + &driverPath, &driverName) < 0) + goto reattachdevs; I note that virPCIDeviceUnbindFromStub can return 0, but driverName == NULL. If driverName == NULL, then it declares it's not bound to a known stub. However, for our purposes, it more than likely means the BindToStub may not have completed successfully. But still we only *care* that it did. + + stub = virPCIStubDriverTypeFromString(driverName); So, I think this becomes, if (driverName) stub = virPCIStubDriverTypeFromString(driverName);
virPCIStubDriverTypeFromString() is implemented using virEnumFromString(), which handles NULL gracefully and simply returns -1. That's why I think initializing stub above and adding the check on driverName are not required changes, but I'm not opposed to adding them if you think they make the code clearer.
+ + VIR_FREE(driverPath); + VIR_FREE(driverName); + + if (stub >= 0 && + stub != VIR_PCI_STUB_DRIVER_NONE) { Strange check since VIR_PCI_STUB_DRIVER_NONE = 0 if (stub >= 0 && stub != 0) a change to seems to be what you're after if (stub > VIR_PCI_STUB_DRIVER_NONE && stub < VIR_PCI_STUB_DRIVER_LAST)
The idea here is that 'stub >= 0' makes sure that virPCIStubDriverTypeFromString() didn't fail, and the second check excludes a value that, while part of the virPCIStub enumeration, we want to reject in this case. But I like your version better, so consider it changed :)
+ /* The device is bound to a known stub driver: store this + * information and add a copy to the inactive list */ + virPCIDeviceSetStubDriver(pci, stub); + + VIR_DEBUG("Adding PCI device %s to inactive list", + virPCIDeviceGetName(pci)); + if (virPCIDeviceListAddCopy(mgr->inactivePCIHostdevs, pci) < 0) + goto reattachdevs; + } I think perhaps patch 2 should be merged into here - I was kind of wondering what the "else" condition would be... Until I looked at patch 2. ACK with the adjustments - including the merge; otherwise, we fall into Step 3 still.
Patch 2 actually deals with a separate issue, so I'm not sure squashing them together is warranted. If the 'else' branch is missing here, we'll just keep on doing what we're already doing: try to pass a device to the guest, and have QEMU spew out a kinda cryptic error message when it realizes it can't work the VFIO magic on it. Cheers. -- Andrea Bolognani Software Engineer - Virtualization Team

On 03/22/2016 11:43 AM, Andrea Bolognani wrote:
On Tue, 2016-03-22 at 07:45 -0400, John Ferlan wrote:
On 03/17/2016 01:24 PM, Andrea Bolognani wrote:
Unmanaged devices are attached to guests in two steps: first, the device is detached from the host and marked as inactive; subsequently, it is marked as active and attached to the guest.
If the daemon is restarted between these two operations, we lose track of the inactive device.
Steps 5 and 6 of virHostdevPreparePCIDevices() already subtly take care of this situation, but some planned changes will make it so that's no longer the case. Plus, explicit is always better than implicit. --- src/util/virhostdev.c | 47 ++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 44 insertions(+), 3 deletions(-)
diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index e0d6465..7204bd7 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -560,7 +560,8 @@ virHostdevPreparePCIDevices(virHostdevManagerPtr mgr, } }
- /* Step 2: detach managed devices (i.e. bind to appropriate stub driver) */ + /* Step 2: detach managed devices and make sure unmanaged devices + * have already been taken care of */ for (i = 0; i < virPCIDeviceListCount(pcidevs); i++) { virPCIDevicePtr pci = virPCIDeviceListGet(pcidevs, i);
@@ -577,8 +578,48 @@ virHostdevPreparePCIDevices(virHostdevManagerPtr mgr, mgr->inactivePCIHostdevs) < 0) goto reattachdevs; } else { - VIR_DEBUG("Not detaching unmanaged PCI device %s", - virPCIDeviceGetName(pci)); + char *driverPath; + char *driverName; + int stub;
stub = -1;
May not be needed, see below.
+ + /* Unmanaged devices should already have been marked as + * inactive: if that's the case, we can simply move on */ + if (virPCIDeviceListFind(mgr->inactivePCIHostdevs, pci)) { + VIR_DEBUG("Not detaching unmanaged PCI device %s", + virPCIDeviceGetName(pci)); + continue; + } + + /* If that's not the case, though, it might be because the + * daemon has been restarted, causing us to lose track of the + * device. Try and recover by marking the device as inactive + * if it happens to be bound to a known stub driver. + *
Just to make it clearer (for me) - this is because using the node device API virHostdevPCINodeDeviceDetach it's possible that somewhere between virPCIDeviceBindToStub and the virPCIDeviceListAddCopy we had a daemon restart causing us to lose track of the device, so this code tries to detect that the BindToStub was done and complete the task.
Of course it's also possible that BindToStub was "in process" when we failed, but I'm not sure *how* to detect that... I suppose though all that matters is we had a completed BindToStub.
Mh, not really what I had in mind :)
What this code is meant to take care of is the case when virHostdevPCINodeDeviceDetach() completes successfully, marking the device as inactive, but the daemon is restarted before the device can be attached to a guest. That causes us to lose all information in the inactive list.
Oh right ... that too... it's a tangled web. At least we're not dealing with detach as well ;-) I can only imagine how much you look forward to backporting any of these changes.
The current code copes with this because step 5 adds the device to the active list (even if it was never in the inactive list) and step 6 removes the device from the inactive list (even if it was never in the inactive list), but that's extremely opaque and most importantly will stop working once we start moving device objects from one list to the other instead of copying them around.
The new code deals with the situation explicitly by adding devices to the inactive list if they look like they belong there - namely, they are bound to a stub driver.
As noted in the comment right below, this is sub-optimal and a follow-up patch should take care of storing both the active and inactive list to disk properly so that daemon restarts are no longer a problem. That's for another day, though :)
+ * FIXME Get rid of this once a proper way to keep track of + * information about active / inactive device across + * daemon restarts has been implemented */ + + if (virPCIDeviceGetDriverPathAndName(pci, + &driverPath, &driverName) < 0) + goto reattachdevs;
I note that virPCIDeviceUnbindFromStub can return 0, but driverName == NULL. If driverName == NULL, then it declares it's not bound to a known stub. However, for our purposes, it more than likely means the BindToStub may not have completed successfully. But still we only *care* that it did.
+ + stub = virPCIStubDriverTypeFromString(driverName);
So, I think this becomes,
if (driverName) stub = virPCIStubDriverTypeFromString(driverName);
virPCIStubDriverTypeFromString() is implemented using virEnumFromString(), which handles NULL gracefully and simply returns -1. That's why I think initializing stub above and adding the check on driverName are not required changes, but I'm not opposed to adding them if you think they make the code clearer.
No need...
+ + VIR_FREE(driverPath); + VIR_FREE(driverName); + + if (stub >= 0 && + stub != VIR_PCI_STUB_DRIVER_NONE) {
Strange check since VIR_PCI_STUB_DRIVER_NONE = 0
if (stub >= 0 && stub != 0)
a change to seems to be what you're after
if (stub > VIR_PCI_STUB_DRIVER_NONE && stub < VIR_PCI_STUB_DRIVER_LAST)
The idea here is that 'stub >= 0' makes sure that virPCIStubDriverTypeFromString() didn't fail, and the second check excludes a value that, while part of the virPCIStub enumeration, we want to reject in this case.
But I like your version better, so consider it changed :)
Since you're doing this...
+ /* The device is bound to a known stub driver: store this + * information and add a copy to the inactive list */ + virPCIDeviceSetStubDriver(pci, stub); + + VIR_DEBUG("Adding PCI device %s to inactive list", + virPCIDeviceGetName(pci)); + if (virPCIDeviceListAddCopy(mgr->inactivePCIHostdevs, pci) < 0) + goto reattachdevs; + }
I think perhaps patch 2 should be merged into here - I was kind of wondering what the "else" condition would be... Until I looked at patch 2.
ACK with the adjustments - including the merge; otherwise, we fall into Step 3 still.
Patch 2 actually deals with a separate issue, so I'm not sure squashing them together is warranted.
If the 'else' branch is missing here, we'll just keep on doing what we're already doing: try to pass a device to the guest, and have QEMU spew out a kinda cryptic error message when it realizes it can't work the VFIO magic on it.
OK - 13 of one, bakers dozen of another ;-) Only would be an issue for someone that falls into the git bisect game. ACK (to both) John

On Tue, 2016-03-22 at 14:44 -0400, John Ferlan wrote:
I can only imagine how much you look forward to backporting any of these changes.
Luckily this part of libvirt doesn't change very often (one has to wonder why ;) so hopefully backporting won't be *too* painful.
ACK (to both)
Now pushed. Thanks! -- Andrea Bolognani Software Engineer - Virtualization Team

Unmanaged devices, as the name suggests, are not detached automatically from the host by libvirt before being attached to a guest: it's the user's responsability to detach them manually beforehand. If that preliminary step has not been performed, the attach operation can't complete successfully. Instead of relying on the lower layers to error out with cryptic messages such as error: Failed to attach device from /tmp/hostdev.xml error: Path '/dev/vfio/12' is not accessible: No such file or directory prevent the situation altogether and provide the user with a more useful error message. --- src/util/virhostdev.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index 7204bd7..572aec0 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -619,6 +619,12 @@ virHostdevPreparePCIDevices(virHostdevManagerPtr mgr, virPCIDeviceGetName(pci)); if (virPCIDeviceListAddCopy(mgr->inactivePCIHostdevs, pci) < 0) goto reattachdevs; + } else { + virReportError(VIR_ERR_OPERATION_INVALID, + _("Unmanaged PCI device %s must be manually " + "detached from the host"), + virPCIDeviceGetName(pci)); + goto reattachdevs; } } } -- 2.5.0

On 03/17/2016 01:24 PM, Andrea Bolognani wrote:
Unmanaged devices, as the name suggests, are not detached automatically from the host by libvirt before being attached to a guest: it's the user's responsability to detach them manually beforehand. If that preliminary step has not been performed, the attach operation can't complete successfully.
Instead of relying on the lower layers to error out with cryptic messages such as
error: Failed to attach device from /tmp/hostdev.xml error: Path '/dev/vfio/12' is not accessible: No such file or directory
prevent the situation altogether and provide the user with a more useful error message. --- src/util/virhostdev.c | 6 ++++++ 1 file changed, 6 insertions(+)
Like noted in 1/2 - I think this needs to be merged into the previous. Including this commit message. ACK with that John
participants (2)
-
Andrea Bolognani
-
John Ferlan