于 2011年10月15日 02:53, Eric Blake 写道:
On 10/12/2011 10:05 PM, Osier Yang wrote:
> When failing on starting a domain, it tries to reattach all the PCI
> devices defined in the domain conf, regardless of whether the devices
> are still used by other domain. This will cause the devices to be
> deleted
> from the list qemu_driver->activePciHostdevs, thus the devices will be
> thought as usable even if it's not true. And following commands
> nodedev-{reattach,reset} will be successful.
>
> How to reproduce:
> 1) Define two domains with same PCI device defined in the confs.
> 2) # virsh start domain1
> 3) # virsh start domain2
> 4) # virsh nodedev-reattach $pci_device
This time around, I actually spent time reproducing the bug scenario
on hardware rather than just analyzing by inspection (it took me a
while to figure out why the same machine that used to let me do pci
passthrough was no longer working, until I realized that my hardware
is old enough to be insecure, and I've upgraded software since my last
passthrough test, so the CVE fix from
https://bugzilla.redhat.com/show_bug.cgi?id=715555 has to be
intentionally bypassed for me to get passthrough again). With that
testing, I proved that this patch prevents that problem. But, as
written, your patch caused the dreaded "An error occurred, but the
cause is unknown".
>
> You will see the device will be reattached to host successfully.
> As pciDeviceReattach just check if the device is still used by
> other domain via checking if the device is in list
> driver->activePciHostdevs,
> however, the device is deleted from the list by step 2).
>
> This patch is to prohibit the bug by:
> 1) Prohibit a domain starting or device attachment right at
> preparation period (qemuPrepareHostdevPCIDevices) if the
> device is in list driver->activePciHostdevs, which means
> it's used by other domain.
>
> 2) Introduces a new field for struct _pciDevice, (const char
> *used_by),
> it will be set as the domain name at preparation period,
> (qemuPrepareHostdevPCIDevices). Thus we can prohibit deleting
> the device from driver->activePciHostdevs if it's still used by
> other domain when stopping the domain process.
>
> @@ -111,7 +112,7 @@ int qemuPrepareHostdevPCIDevices(struct
> qemud_driver *driver,
> if (!(pcidevs = qemuGetPciHostDeviceList(hostdevs, nhostdevs)))
> return -1;
>
> - /* We have to use 4 loops here. *All* devices must
> + /* We have to use 6 loops here. *All* devices must
> * be detached before we reset any of them, because
> * in some cases you have to reset the whole PCI,
> * which impacts all devices on it. Also, all devices
I added further loop labels.
> + /* The device is in use by other active domain if
> + * the dev is in list driver->activePciHostdevs.
> + */
> + if (!pciDeviceIsAssignable(dev, !driver->relaxedACS) ||
> + pciDeviceListFind(driver->activePciHostdevs, dev))
> + goto cleanup;
> + }
Here's where the bad message comes in - you jump to cleanup without
ever emitting an error message. Not to mention now that we have the
new used_by field, the message could actually be informative :) With
my changes below, I get the much nicer:
Error starting domain: Requested operation is not valid: PCI device
0000:0a:0a.0 is in use by domain dom
Oh, right, thanks for the fixing, :)
> @@ -277,6 +298,18 @@ void qemuDomainReAttachHostdevDevices(struct
> qemud_driver *driver,
>
> for (i = 0; i< pciDeviceListCount(pcidevs); i++) {
> pciDevice *dev = pciDeviceListGet(pcidevs, i);
> + pciDevice *activeDev = NULL;
> + const char *used_by = NULL;
> +
> + /* Never delete the dev from list driver->activePciHostdevs
> + * if it's used by other domain.
> + */
> + activeDev = pciDeviceListFind(driver->activePciHostdevs, dev);
> + if (activeDev&&
> + (used_by = pciDeviceGetUsedBy(activeDev))&&
> + STRNEQ(name, used_by))
> + continue;
>
> used_by is kept as it might be NULL.
>
In that case, use STRNEQ_NULLABLE. But you are right - across
libvirtd restarts, we lose track of which domain owns the device - so
I did indeed reproduce a case of NULL. Perhaps food for a later patch
(when reloading domains, cycle through all hostdevs in use by active
domains to repopulate the used_by fields). But not a show-stopper to
getting this bug fix in now.
Yes, I didn't realize this, we need further patch.
Well, I did find one other problem with a libvirtd restart - as soon
as used_by is NULL, then stopping the domain that was actually using
the device no longer frees that device out of the driver->active list
(since we now only free if used_by matches) - so maybe it's important
to fix libvirtd reload repopulating used_by sooner rather than later.
> @@ -1312,6 +1313,7 @@ pciGetDevice(unsigned domain,
> dev->bus = bus;
> dev->slot = slot;
> dev->function = function;
> + dev->used_by = NULL;
Pointless, since VIR_ALLOC 0-initializes.
>
> if (snprintf(dev->name, sizeof(dev->name),
"%.4x:%.2x:%.2x.%.1x",
> dev->domain, dev->bus, dev->slot,
> @@ -1374,6 +1376,7 @@ pciFreeDevice(pciDevice *dev)
> VIR_DEBUG("%s %s: freeing", dev->id, dev->name);
> pciCloseConfig(dev);
> VIR_FREE(dev->path);
> + dev->used_by = NULL;
> VIR_FREE(dev);
Pointless, since dev is freed in the next statement.
ACK with this squashed in, so I went ahead and pushed.
diff --git i/src/libvirt_private.syms w/src/libvirt_private.syms
index 4673332..1e03e33 100644
--- i/src/libvirt_private.syms
+++ w/src/libvirt_private.syms
@@ -881,6 +881,7 @@ virNWFilterHashTableRemoveEntry;
pciDettachDevice;
pciDeviceFileIterate;
pciDeviceGetManaged;
+pciDeviceGetName;
pciDeviceGetUsedBy;
pciDeviceIsAssignable;
pciDeviceIsVirtualFunction;
diff --git i/src/qemu/qemu_hostdev.c w/src/qemu/qemu_hostdev.c
index f354ea8..c82c512 100644
--- i/src/qemu/qemu_hostdev.c
+++ w/src/qemu/qemu_hostdev.c
@@ -1,7 +1,7 @@
/*
* qemu_hostdev.c: QEMU hostdev management
*
- * Copyright (C) 2006-2007, 2009-2010 Red Hat, Inc.
+ * Copyright (C) 2006-2007, 2009-2011 Red Hat, Inc.
* Copyright (C) 2006 Daniel P. Berrange
*
* This library is free software; you can redistribute it and/or
@@ -119,21 +119,40 @@ int qemuPrepareHostdevPCIDevices(struct
qemud_driver *driver,
* must be reset before being marked as active.
*/
- /* XXX validate that non-managed device isn't in use, eg
+ /* Loop 1: validate that non-managed device isn't in use, eg
* by checking that device is either un-bound, or bound
* to pci-stub.ko
*/
for (i = 0; i < pciDeviceListCount(pcidevs); i++) {
pciDevice *dev = pciDeviceListGet(pcidevs, i);
+ pciDevice *other;
+
+ if (!pciDeviceIsAssignable(dev, !driver->relaxedACS)) {
+ qemuReportError(VIR_ERR_OPERATION_INVALID,
+ _("PCI device %s is not assignable"),
+ pciDeviceGetName(dev));
+ goto cleanup;
+ }
/* The device is in use by other active domain if
* the dev is in list driver->activePciHostdevs.
*/
- if (!pciDeviceIsAssignable(dev, !driver->relaxedACS) ||
- pciDeviceListFind(driver->activePciHostdevs, dev))
+ if ((other = pciDeviceListFind(driver->activePciHostdevs,
dev))) {
+ const char *other_name = pciDeviceGetUsedBy(other);
+
+ if (other_name)
+ qemuReportError(VIR_ERR_OPERATION_INVALID,
+ _("PCI device %s is in use by domain
%s"),
+ pciDeviceGetName(dev), other_name);
+ else
+ qemuReportError(VIR_ERR_OPERATION_INVALID,
+ _("PCI device %s is already in use"),
+ pciDeviceGetName(dev));
goto cleanup;
+ }
}
+ /* Loop 2: detach managed devices */
for (i = 0; i < pciDeviceListCount(pcidevs); i++) {
pciDevice *dev = pciDeviceListGet(pcidevs, i);
if (pciDeviceGetManaged(dev) &&
@@ -141,15 +160,15 @@ int qemuPrepareHostdevPCIDevices(struct
qemud_driver *driver,
goto reattachdevs;
}
- /* Now that all the PCI hostdevs have be dettached, we can safely
- * reset them */
+ /* Loop 3: Now that all the PCI hostdevs have been detached, we
+ * can safely reset them */
for (i = 0; i < pciDeviceListCount(pcidevs); i++) {
pciDevice *dev = pciDeviceListGet(pcidevs, i);
if (pciResetDevice(dev, driver->activePciHostdevs, pcidevs) < 0)
goto reattachdevs;
}
- /* Now mark all the devices as active */
+ /* Loop 4: Now mark all the devices as active */
for (i = 0; i < pciDeviceListCount(pcidevs); i++) {
pciDevice *dev = pciDeviceListGet(pcidevs, i);
if (pciDeviceListAdd(driver->activePciHostdevs, dev) < 0) {
@@ -158,8 +177,8 @@ int qemuPrepareHostdevPCIDevices(struct
qemud_driver *driver,
}
}
- /* Now set the used_by_domain of the device in
driver->activePciHostdevs
- * as domain name.
+ /* Loop 5: Now set the used_by_domain of the device in
+ * driver->activePciHostdevs as domain name.
*/
for (i = 0; i < pciDeviceListCount(pcidevs); i++) {
pciDevice *dev, *activeDev;
@@ -170,7 +189,7 @@ int qemuPrepareHostdevPCIDevices(struct
qemud_driver *driver,
pciDeviceSetUsedBy(activeDev, name);
}
- /* Now steal all the devices from pcidevs */
+ /* Loop 6: Now steal all the devices from pcidevs */
while (pciDeviceListCount(pcidevs) > 0) {
pciDevice *dev = pciDeviceListGet(pcidevs, 0);
pciDeviceListSteal(pcidevs, dev);
@@ -299,15 +318,13 @@ void qemuDomainReAttachHostdevDevices(struct
qemud_driver *driver,
for (i = 0; i < pciDeviceListCount(pcidevs); i++) {
pciDevice *dev = pciDeviceListGet(pcidevs, i);
pciDevice *activeDev = NULL;
- const char *used_by = NULL;
/* Never delete the dev from list driver->activePciHostdevs
- * if it's used by other domain.
+ * if it's used by other domain.
*/
activeDev = pciDeviceListFind(driver->activePciHostdevs, dev);
if (activeDev &&
- (used_by = pciDeviceGetUsedBy(activeDev)) &&
- STRNEQ(name, used_by))
+ STRNEQ_NULLABLE(name, pciDeviceGetUsedBy(activeDev)))
continue;
pciDeviceListDel(driver->activePciHostdevs, dev);
diff --git i/src/util/pci.c w/src/util/pci.c
index 888784a..2bbb90c 100644
--- i/src/util/pci.c
+++ w/src/util/pci.c
@@ -1313,7 +1313,6 @@ pciGetDevice(unsigned domain,
dev->bus = bus;
dev->slot = slot;
dev->function = function;
- dev->used_by = NULL;
if (snprintf(dev->name, sizeof(dev->name), "%.4x:%.2x:%.2x.%.1x",
dev->domain, dev->bus, dev->slot,
@@ -1376,10 +1375,15 @@ pciFreeDevice(pciDevice *dev)
VIR_DEBUG("%s %s: freeing", dev->id, dev->name);
pciCloseConfig(dev);
VIR_FREE(dev->path);
- dev->used_by = NULL;
VIR_FREE(dev);
}
+const char *
+pciDeviceGetName(pciDevice *dev)
+{
+ return dev->name;
+}
+
void pciDeviceSetManaged(pciDevice *dev, unsigned managed)
{
dev->managed = !!managed;
diff --git i/src/util/pci.h w/src/util/pci.h
index 640c6af..ab29c0b 100644
--- i/src/util/pci.h
+++ w/src/util/pci.h
@@ -1,5 +1,5 @@
/*
- * Copyright (C) 2009 Red Hat, Inc.
+ * Copyright (C) 2009, 2011 Red Hat, Inc.
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
@@ -39,6 +39,7 @@ pciDevice *pciGetDevice (unsigned domain,
unsigned slot,
unsigned function);
void pciFreeDevice (pciDevice *dev);
+const char *pciDeviceGetName (pciDevice *dev);
int pciDettachDevice (pciDevice *dev, pciDeviceList
*activeDevs);
int pciReAttachDevice (pciDevice *dev, pciDeviceList
*activeDevs);
int pciResetDevice (pciDevice *dev,