[libvirt] [PATCH] qemu: Fix PCI address allocation

When attaching a PCI device which doesn't explicitly set its PCI address, libvirt allocates the address automatically. The problem is that when checking which PCI address is unused, we only check for those with slot number higher than the highest slot number ever used. Thus attaching/detaching such device several times in a row (31 is the theoretical limit, less then 30 tries are enough in practise) makes any further device attachment fail. Furthermore, attaching a device with predefined PCI address to 0:0:31 immediately forbids attachment of any PCI device without explicit address. This patch changes the logic so that we always check all PCI addresses before we say there is no PCI address available. --- src/qemu/qemu_conf.c | 14 ++++---------- 1 files changed, 4 insertions(+), 10 deletions(-) diff --git a/src/qemu/qemu_conf.c b/src/qemu/qemu_conf.c index 57bc02f..eaebcc1 100644 --- a/src/qemu/qemu_conf.c +++ b/src/qemu/qemu_conf.c @@ -2055,8 +2055,6 @@ qemuAssignDeviceAliases(virDomainDefPtr def, unsigned long long qemuCmdFlags) #define QEMU_PCI_ADDRESS_LAST_SLOT 31 struct _qemuDomainPCIAddressSet { virHashTablePtr used; - int nextslot; - /* XXX add domain, bus later when QEMU allows > 1 */ }; @@ -2148,9 +2146,6 @@ int qemuDomainPCIAddressReserveAddr(qemuDomainPCIAddressSetPtr addrs, return -1; } - if (dev->addr.pci.slot > addrs->nextslot) - addrs->nextslot = dev->addr.pci.slot + 1; - return 0; } @@ -2217,7 +2212,7 @@ int qemuDomainPCIAddressSetNextAddr(qemuDomainPCIAddressSetPtr addrs, { int i; - for (i = addrs->nextslot ; i <= QEMU_PCI_ADDRESS_LAST_SLOT ; i++) { + for (i = 0 ; i <= QEMU_PCI_ADDRESS_LAST_SLOT ; i++) { virDomainDeviceInfo maybe; char *addr; @@ -2228,13 +2223,14 @@ int qemuDomainPCIAddressSetNextAddr(qemuDomainPCIAddressSetPtr addrs, addr = qemuPCIAddressAsString(&maybe); - VIR_DEBUG("Allocating PCI addr %s", addr); - if (virHashLookup(addrs->used, addr)) { + VIR_DEBUG("PCI addr %s already in use", addr); VIR_FREE(addr); continue; } + VIR_DEBUG("Allocating PCI addr %s", addr); + if (virHashAddEntry(addrs->used, addr, addr) < 0) { VIR_FREE(addr); return -1; @@ -2245,8 +2241,6 @@ int qemuDomainPCIAddressSetNextAddr(qemuDomainPCIAddressSetPtr addrs, dev->addr.pci.bus = 0; dev->addr.pci.slot = i; - addrs->nextslot = i + 1; - return 0; } -- 1.7.2

On 07/30/10 - 04:56:47PM, Jiri Denemark wrote:
When attaching a PCI device which doesn't explicitly set its PCI address, libvirt allocates the address automatically. The problem is that when checking which PCI address is unused, we only check for those with slot number higher than the highest slot number ever used.
Thus attaching/detaching such device several times in a row (31 is the theoretical limit, less then 30 tries are enough in practise) makes any further device attachment fail. Furthermore, attaching a device with predefined PCI address to 0:0:31 immediately forbids attachment of any PCI device without explicit address.
This patch changes the logic so that we always check all PCI addresses before we say there is no PCI address available.
Yes, makes perfect sense. Most of the patch is adding VIR_DEBUG() lines and removing the "nextslot"; the real change is this line:
@@ -2217,7 +2212,7 @@ int qemuDomainPCIAddressSetNextAddr(qemuDomainPCIAddressSetPtr addrs, { int i;
- for (i = addrs->nextslot ; i <= QEMU_PCI_ADDRESS_LAST_SLOT ; i++) { + for (i = 0 ; i <= QEMU_PCI_ADDRESS_LAST_SLOT ; i++) { virDomainDeviceInfo maybe; char *addr;
ACK -- Chris Lalancette

On 07/30/2010 08:56 AM, Jiri Denemark wrote:
When attaching a PCI device which doesn't explicitly set its PCI address, libvirt allocates the address automatically. The problem is that when checking which PCI address is unused, we only check for those with slot number higher than the highest slot number ever used.
Thus attaching/detaching such device several times in a row (31 is the theoretical limit, less then 30 tries are enough in practise) makes any further device attachment fail. Furthermore, attaching a device with predefined PCI address to 0:0:31 immediately forbids attachment of any PCI device without explicit address.
This patch changes the logic so that we always check all PCI addresses before we say there is no PCI address available. --- src/qemu/qemu_conf.c | 14 ++++---------- 1 files changed, 4 insertions(+), 10 deletions(-)
diff --git a/src/qemu/qemu_conf.c b/src/qemu/qemu_conf.c index 57bc02f..eaebcc1 100644 --- a/src/qemu/qemu_conf.c +++ b/src/qemu/qemu_conf.c @@ -2055,8 +2055,6 @@ qemuAssignDeviceAliases(virDomainDefPtr def, unsigned long long qemuCmdFlags) #define QEMU_PCI_ADDRESS_LAST_SLOT 31 struct _qemuDomainPCIAddressSet { virHashTablePtr used; - int nextslot; - /* XXX add domain, bus later when QEMU allows > 1 */
- for (i = addrs->nextslot ; i <= QEMU_PCI_ADDRESS_LAST_SLOT ; i++) { + for (i = 0 ; i <= QEMU_PCI_ADDRESS_LAST_SLOT ; i++) {
Don't we still want to start from the last slot, and wrap around the search only when we reach the end, rather than always starting from 0? I'm thinking particularly about compatibility with older qemu, where always starting from 0 risks interleaving new assignments among the pre-assigned slots, and may end up allocating slots in a way that throws off Windows. Or am I worried about a non-issue? -- Eric Blake eblake@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org

When attaching a PCI device which doesn't explicitly set its PCI address, libvirt allocates the address automatically. The problem is that when checking which PCI address is unused, we only check for those with slot number higher than the highest slot number ever used.
Thus attaching/detaching such device several times in a row (31 is the theoretical limit, less then 30 tries are enough in practise) makes any further device attachment fail. Furthermore, attaching a device with predefined PCI address to 0:0:31 immediately forbids attachment of any PCI device without explicit address.
This patch changes the logic so that we always check all PCI addresses before we say there is no PCI address available.
- for (i = addrs->nextslot ; i <= QEMU_PCI_ADDRESS_LAST_SLOT ; i++) { + for (i = 0 ; i <= QEMU_PCI_ADDRESS_LAST_SLOT ; i++) {
Don't we still want to start from the last slot, and wrap around the search only when we reach the end, rather than always starting from 0? I'm thinking particularly about compatibility with older qemu, where always starting from 0 risks interleaving new assignments among the pre-assigned slots, and may end up allocating slots in a way that throws off Windows. Or am I worried about a non-issue?
TBH I don't know. If qemu can get upset about it, we probably can't change the logic at all. Or at the best we can have it variable depending on qemu version. So if someone knows better if we can safely change this or not, that would be great. (too bad Daniel is on vacation) Anyway, going from the nextslot and wrapping wouldn't really help with this situation. It would only postpone the possible compatibility issues until nextslot wraps around. Jirka

On 08/02/2010 07:31 AM, Jiri Denemark wrote:
Don't we still want to start from the last slot, and wrap around the search only when we reach the end, rather than always starting from 0? I'm thinking particularly about compatibility with older qemu, where always starting from 0 risks interleaving new assignments among the pre-assigned slots, and may end up allocating slots in a way that throws off Windows. Or am I worried about a non-issue?
TBH I don't know. If qemu can get upset about it, we probably can't change the logic at all. Or at the best we can have it variable depending on qemu version. So if someone knows better if we can safely change this or not, that would be great. (too bad Daniel is on vacation)
Anyway, going from the nextslot and wrapping wouldn't really help with this situation. It would only postpone the possible compatibility issues until nextslot wraps around.
Well, my understanding is that older qemu didn't allow specifying specific slots, so they only assigned slots in sequential order. The moment you request a particular slot, you are already depending on newer qemu. But we also just checked in several patches recently to pre-assign several slots, so that we match the same default assignment as older qemu. If those pre-assignments contain gaps, but we start from 0 on every search, then we will fill in those gaps, where the old code used to put additional slots beyond the last pre-assigned gap. And since Windows is sensitive to the case where pci devices have changed slots from the previous boot, you can see where my question was coming from. Also remember that with those older qemu, since you can't request a specific slot, you would have already had a problem with consecutively populating all available slots if you ever get to the end where wraparound would even be considered. It's only with newer qemu, where you can request a specific slot but in so doing create a huge gap, where wraparound proves useful. -- Eric Blake eblake@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org

From: Jiri Denemark <jdenemar@redhat.com> When attaching a PCI device which doesn't explicitly set its PCI address, libvirt allocates the address automatically. The problem is that when checking which PCI address is unused, we only check for those with slot number higher than the highest slot number ever used. Thus attaching/detaching such device several times in a row (31 is the theoretical limit, less then 30 tries are enough in practise) makes any further device attachment fail. Furthermore, attaching a device with predefined PCI address to 0:0:31 immediately forbids attachment of any PCI device without explicit address. This patch changes the logic so that we always check all PCI addresses before we say there is no PCI address available. Signed-off-by: Eric Blake <eblake@redhat.com> --- Modifications from v1: revert back to remembering the last slot reserved, but allow wraparound to not be limited by the end. In this way, slots are still assigned in the same order as before the patch, rather than filling in the gaps closest to 0 and risking making windows guests mad. src/qemu/qemu_conf.c | 19 ++++++++++++++----- 1 files changed, 14 insertions(+), 5 deletions(-) diff --git a/src/qemu/qemu_conf.c b/src/qemu/qemu_conf.c index 57bc02f..5397997 100644 --- a/src/qemu/qemu_conf.c +++ b/src/qemu/qemu_conf.c @@ -2056,7 +2056,6 @@ qemuAssignDeviceAliases(virDomainDefPtr def, unsigned long long qemuCmdFlags) struct _qemuDomainPCIAddressSet { virHashTablePtr used; int nextslot; - /* XXX add domain, bus later when QEMU allows > 1 */ }; @@ -2148,8 +2147,11 @@ int qemuDomainPCIAddressReserveAddr(qemuDomainPCIAddressSetPtr addrs, return -1; } - if (dev->addr.pci.slot > addrs->nextslot) + if (dev->addr.pci.slot > addrs->nextslot) { addrs->nextslot = dev->addr.pci.slot + 1; + if (QEMU_PCI_ADDRESS_LAST_SLOT < addrs->nextslot) + addrs->nextslot = 0; + } return 0; } @@ -2216,11 +2218,15 @@ int qemuDomainPCIAddressSetNextAddr(qemuDomainPCIAddressSetPtr addrs, virDomainDeviceInfoPtr dev) { int i; + int iteration; - for (i = addrs->nextslot ; i <= QEMU_PCI_ADDRESS_LAST_SLOT ; i++) { + for (i = addrs->nextslot, iteration = 0; + iteration <= QEMU_PCI_ADDRESS_LAST_SLOT; i++, iteration++) { virDomainDeviceInfo maybe; char *addr; + if (QEMU_PCI_ADDRESS_LAST_SLOT < i) + i = 0; memset(&maybe, 0, sizeof(maybe)); maybe.addr.pci.domain = 0; maybe.addr.pci.bus = 0; @@ -2228,13 +2234,14 @@ int qemuDomainPCIAddressSetNextAddr(qemuDomainPCIAddressSetPtr addrs, addr = qemuPCIAddressAsString(&maybe); - VIR_DEBUG("Allocating PCI addr %s", addr); - if (virHashLookup(addrs->used, addr)) { + VIR_DEBUG("PCI addr %s already in use", addr); VIR_FREE(addr); continue; } + VIR_DEBUG("Allocating PCI addr %s", addr); + if (virHashAddEntry(addrs->used, addr, addr) < 0) { VIR_FREE(addr); return -1; @@ -2246,6 +2253,8 @@ int qemuDomainPCIAddressSetNextAddr(qemuDomainPCIAddressSetPtr addrs, dev->addr.pci.slot = i; addrs->nextslot = i + 1; + if (QEMU_PCI_ADDRESS_LAST_SLOT < addrs->nextslot) + addrs->nextslot = 0; return 0; } -- 1.7.2

On Tue, Aug 03, 2010 at 03:17:01PM -0600, Eric Blake wrote:
From: Jiri Denemark <jdenemar@redhat.com>
When attaching a PCI device which doesn't explicitly set its PCI address, libvirt allocates the address automatically. The problem is that when checking which PCI address is unused, we only check for those with slot number higher than the highest slot number ever used.
Thus attaching/detaching such device several times in a row (31 is the theoretical limit, less then 30 tries are enough in practise) makes any further device attachment fail. Furthermore, attaching a device with predefined PCI address to 0:0:31 immediately forbids attachment of any PCI device without explicit address.
This patch changes the logic so that we always check all PCI addresses before we say there is no PCI address available.
Signed-off-by: Eric Blake <eblake@redhat.com> ---
Modifications from v1: revert back to remembering the last slot reserved, but allow wraparound to not be limited by the end. In this way, slots are still assigned in the same order as before the patch, rather than filling in the gaps closest to 0 and risking making windows guests mad.
ACK, that looks fine, I will push it once as part of my 0.8.3 release within an hour or so ! Daniel -- Daniel Veillard | libxml Gnome XML XSLT toolkit http://xmlsoft.org/ daniel@veillard.com | Rpmfind RPM search engine http://rpmfind.net/ http://veillard.com/ | virtualization library http://libvirt.org/
participants (4)
-
Chris Lalancette
-
Daniel Veillard
-
Eric Blake
-
Jiri Denemark