[libvirt] [PATCH 0/2] fix 2 bugs in virHostdevReAttachPCIDevices

Fix 1: Fix index error in loop after remove an element from the array 'virPCIDeviceList' is actually an array. Removing one element makes the rest of the element move. Use while loop, increase index only when not do 'virPCIDeviceListDel(pcidevs, dev)' Fix 2: In such a case: 1. Domain A and B xml contain the same SRIOV net hostdev(<interface type='hostdev' /> with same pci address). 2. virsh start A (Successfully, and configure the SRIOV net with custom mac) 3. virsh start B (Fail because of the hostdev used by domain A or other reason.) In step 3, 'virHostdevNetConfigRestore' is called for the hostdev which is still used by domain A. It makes the mac/vlan of the SRIOV net change. Code Change in this fix: 1. As the pci used by other domain have been removed from 'pcidevs' in previous loop, we only restore the nic config for the hostdev still in 'pcidevs'(used by this domain) 2. wrap a function 'virHostdevIsPCINetDevice', which detect whether the hostdev is a pci net device or not. 3. update the comments to make it more clear Huanle Han (2): hostdev: Fix index error in loop after remove an element hostdev: fix net config restore error src/util/virhostdev.c | 56 ++++++++++++++++++++++++++++++++++++--------------- 1 file changed, 40 insertions(+), 16 deletions(-) -- 1.9.1

'virPCIDeviceList' is actually an array. Removing one element makes the rest of the element move. Use while loop, increase index only when not virPCIDeviceListDel(pcidevs, dev) Signed-off-by: Huanle Han <hanxueluo@gmail.com> --- src/util/virhostdev.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index 23365a3..83f567d 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -785,7 +785,8 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr hostdev_mgr, * them and reset all the devices before re-attach. * Attach mac and port profile parameters to devices */ - for (i = 0; i < virPCIDeviceListCount(pcidevs); i++) { + i = 0; + while (i < virPCIDeviceListCount(pcidevs)) { virPCIDevicePtr dev = virPCIDeviceListGet(pcidevs, i); virPCIDevicePtr activeDev = NULL; @@ -806,6 +807,7 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr hostdev_mgr, } virPCIDeviceListDel(hostdev_mgr->activePCIHostdevs, dev); + i++; } /* At this point, any device that had been used by the guest is in -- 1.9.1

On Thu, Mar 26, 2015 at 10:23:46PM +0800, Huanle Han wrote:
'virPCIDeviceList' is actually an array. Removing one element makes the rest of the element move.
Use while loop, increase index only when not virPCIDeviceListDel(pcidevs, dev)
Signed-off-by: Huanle Han <hanxueluo@gmail.com> --- src/util/virhostdev.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index 23365a3..83f567d 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -785,7 +785,8 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr hostdev_mgr, * them and reset all the devices before re-attach. * Attach mac and port profile parameters to devices */ - for (i = 0; i < virPCIDeviceListCount(pcidevs); i++) { + i = 0; + while (i < virPCIDeviceListCount(pcidevs)) { virPCIDevicePtr dev = virPCIDeviceListGet(pcidevs, i); virPCIDevicePtr activeDev = NULL;
@@ -806,6 +807,7 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr hostdev_mgr, }
virPCIDeviceListDel(hostdev_mgr->activePCIHostdevs, dev); + i++;
'i--;' before that 'continue;' that's not in the index would be shorter, but I guess this is more readable. Or maybe: for (i = virPCIDeviceListCount(pcidevs); i > 0; i--) { size_t dev_index = i - 1; would just deal with this. Anyway, this works just as it is, ACK if you don't want to change it.
}
/* At this point, any device that had been used by the guest is in -- 1.9.1
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

On Tue, Mar 31, 2015 at 10:34:57AM +0200, Martin Kletzander wrote:
On Thu, Mar 26, 2015 at 10:23:46PM +0800, Huanle Han wrote:
'virPCIDeviceList' is actually an array. Removing one element makes the rest of the element move.
Use while loop, increase index only when not virPCIDeviceListDel(pcidevs, dev)
Signed-off-by: Huanle Han <hanxueluo@gmail.com> --- src/util/virhostdev.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index 23365a3..83f567d 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -785,7 +785,8 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr hostdev_mgr, * them and reset all the devices before re-attach. * Attach mac and port profile parameters to devices */ - for (i = 0; i < virPCIDeviceListCount(pcidevs); i++) { + i = 0; + while (i < virPCIDeviceListCount(pcidevs)) { virPCIDevicePtr dev = virPCIDeviceListGet(pcidevs, i); virPCIDevicePtr activeDev = NULL;
@@ -806,6 +807,7 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr hostdev_mgr, }
virPCIDeviceListDel(hostdev_mgr->activePCIHostdevs, dev); + i++;
'i--;' before that 'continue;' that's not in the index would be shorter, but I guess this is more readable. Or maybe:
for (i = virPCIDeviceListCount(pcidevs); i > 0; i--) { size_t dev_index = i - 1;
would just deal with this.
Anyway, this works just as it is, ACK if you don't want to change it.
Pushed now as it's after release.

Fix for such a case: 1. Domain A and B xml contain the same SRIOV net hostdev(<interface type='hostdev' /> with same pci address). 2. virsh start A (Successfully, and configure the SRIOV net with custom mac) 3. virsh start B (Fail because of the hostdev used by domain A or other reason.) In step 3, 'virHostdevNetConfigRestore' is called for the hostdev which is still used by domain A. It makes the mac/vlan of the SRIOV net change. Code Change in this fix: 1. As the pci used by other domain have been removed from 'pcidevs' in previous loop, we only restore the nic config for the hostdev still in 'pcidevs'(used by this domain) 2. wrap a function 'virHostdevIsPCINetDevice', which detect whether the hostdev is a pci net device or not. 3. update the comments to make it more clear Signed-off-by: Huanle Han <hanxueluo@gmail.com> --- src/util/virhostdev.c | 52 ++++++++++++++++++++++++++++++++++++--------------- 1 file changed, 37 insertions(+), 15 deletions(-) diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index 83f567d..b04bae2 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -350,6 +350,14 @@ virHostdevNetDevice(virDomainHostdevDefPtr hostdev, char **linkdev, return ret; } +static int +virHostdevIsPCINetDevice(virDomainHostdevDefPtr hostdev) +{ + return hostdev->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS && + hostdev->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && + hostdev->parent.type == VIR_DOMAIN_DEVICE_NET && + hostdev->parent.data.net; +} static int virHostdevNetConfigVirtPortProfile(const char *linkdev, int vf, @@ -481,10 +489,7 @@ virHostdevNetConfigRestore(virDomainHostdevDefPtr hostdev, /* This is only needed for PCI devices that have been defined * using <interface type='hostdev'>. For all others, it is a NOP. */ - if (hostdev->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS || - hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI || - hostdev->parent.type != VIR_DOMAIN_DEVICE_NET || - !hostdev->parent.data.net) + if (!virHostdevIsPCINetDevice(hostdev)) return 0; isvf = virHostdevIsVirtualFunction(hostdev); @@ -781,19 +786,20 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr hostdev_mgr, goto cleanup; } - /* Again 4 loops; mark all devices as inactive before reset + /* Here are 4 loops; mark all devices as inactive before reset * them and reset all the devices before re-attach. * Attach mac and port profile parameters to devices */ + + /* Loop 1: delete the copy of the dev from pcidevs if it's used by + * other domain. Or delete it from activePCIHostDevs if it had + * been used by this domain. + */ i = 0; while (i < virPCIDeviceListCount(pcidevs)) { virPCIDevicePtr dev = virPCIDeviceListGet(pcidevs, i); virPCIDevicePtr activeDev = NULL; - /* delete the copy of the dev from pcidevs if it's used by - * other domain. Or delete it from activePCIHostDevs if it had - * been used by this domain. - */ activeDev = virPCIDeviceListFind(hostdev_mgr->activePCIHostdevs, dev); if (activeDev) { const char *usedby_drvname; @@ -814,14 +820,27 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr hostdev_mgr, * pcidevs, but has been removed from activePCIHostdevs. */ - /* - * For SRIOV net host devices, unset mac and port profile before - * reset and reattach device + /* Loop 2: before reset and reattach device, + * unset mac and port profile for SRIOV net host devices used by this domain */ - for (i = 0; i < nhostdevs; i++) - virHostdevNetConfigRestore(hostdevs[i], hostdev_mgr->stateDir, - oldStateDir); + for (i = 0; i < nhostdevs; i++) { + virDomainHostdevDefPtr hostdev = hostdevs[i]; + if (virHostdevIsPCINetDevice(hostdev)) { + virDomainHostdevSubsysPCIPtr pcisrc = &hostdev->source.subsys.u.pci; + virPCIDevicePtr dev = NULL; + dev = virPCIDeviceNew(pcisrc->addr.domain, pcisrc->addr.bus, + pcisrc->addr.slot, pcisrc->addr.function); + + if (virPCIDeviceListFind(pcidevs, dev)) { + virHostdevNetConfigRestore(hostdev, hostdev_mgr->stateDir, + oldStateDir); + } + virPCIDeviceFree(dev); + } + } + + /* Loop 3: reset pci device used by this domain */ for (i = 0; i < virPCIDeviceListCount(pcidevs); i++) { virPCIDevicePtr dev = virPCIDeviceListGet(pcidevs, i); @@ -834,6 +853,9 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr hostdev_mgr, } } + /* Loop 4: reattach pci devices used by this domain + * and steal all the devices from pcidevs + */ while (virPCIDeviceListCount(pcidevs) > 0) { virPCIDevicePtr dev = virPCIDeviceListStealIndex(pcidevs, 0); virHostdevReattachPCIDevice(dev, hostdev_mgr); -- 1.9.1

On Thu, Mar 26, 2015 at 10:23:47PM +0800, Huanle Han wrote:
Fix for such a case: 1. Domain A and B xml contain the same SRIOV net hostdev(<interface type='hostdev' /> with same pci address). 2. virsh start A (Successfully, and configure the SRIOV net with custom mac) 3. virsh start B (Fail because of the hostdev used by domain A or other reason.) In step 3, 'virHostdevNetConfigRestore' is called for the hostdev which is still used by domain A. It makes the mac/vlan of the SRIOV net change.
Makes perfect sense, thanks for finding that out.
Code Change in this fix: 1. As the pci used by other domain have been removed from 'pcidevs' in previous loop, we only restore the nic config for the hostdev still in 'pcidevs'(used by this domain) 2. wrap a function 'virHostdevIsPCINetDevice', which detect whether the hostdev is a pci net device or not. 3. update the comments to make it more clear
Signed-off-by: Huanle Han <hanxueluo@gmail.com> --- src/util/virhostdev.c | 52 ++++++++++++++++++++++++++++++++++++--------------- 1 file changed, 37 insertions(+), 15 deletions(-)
diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index 83f567d..b04bae2 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -350,6 +350,14 @@ virHostdevNetDevice(virDomainHostdevDefPtr hostdev, char **linkdev, return ret; }
+static int +virHostdevIsPCINetDevice(virDomainHostdevDefPtr hostdev) +{ + return hostdev->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS && + hostdev->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && + hostdev->parent.type == VIR_DOMAIN_DEVICE_NET && + hostdev->parent.data.net; +}
static int virHostdevNetConfigVirtPortProfile(const char *linkdev, int vf, @@ -481,10 +489,7 @@ virHostdevNetConfigRestore(virDomainHostdevDefPtr hostdev, /* This is only needed for PCI devices that have been defined * using <interface type='hostdev'>. For all others, it is a NOP. */ - if (hostdev->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS || - hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI || - hostdev->parent.type != VIR_DOMAIN_DEVICE_NET || - !hostdev->parent.data.net) + if (!virHostdevIsPCINetDevice(hostdev)) return 0;
isvf = virHostdevIsVirtualFunction(hostdev); @@ -781,19 +786,20 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr hostdev_mgr, goto cleanup; }
- /* Again 4 loops; mark all devices as inactive before reset + /* Here are 4 loops; mark all devices as inactive before reset * them and reset all the devices before re-attach. * Attach mac and port profile parameters to devices */ + + /* Loop 1: delete the copy of the dev from pcidevs if it's used by + * other domain. Or delete it from activePCIHostDevs if it had + * been used by this domain. + */ i = 0; while (i < virPCIDeviceListCount(pcidevs)) { virPCIDevicePtr dev = virPCIDeviceListGet(pcidevs, i); virPCIDevicePtr activeDev = NULL;
- /* delete the copy of the dev from pcidevs if it's used by - * other domain. Or delete it from activePCIHostDevs if it had - * been used by this domain. - */ activeDev = virPCIDeviceListFind(hostdev_mgr->activePCIHostdevs, dev); if (activeDev) { const char *usedby_drvname; @@ -814,14 +820,27 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr hostdev_mgr, * pcidevs, but has been removed from activePCIHostdevs. */
- /* - * For SRIOV net host devices, unset mac and port profile before - * reset and reattach device + /* Loop 2: before reset and reattach device, + * unset mac and port profile for SRIOV net host devices used by this domain */
The comments were formatted as a sentence, it would be nice to keep it that way.
- for (i = 0; i < nhostdevs; i++) - virHostdevNetConfigRestore(hostdevs[i], hostdev_mgr->stateDir, - oldStateDir); + for (i = 0; i < nhostdevs; i++) { + virDomainHostdevDefPtr hostdev = hostdevs[i];
+ if (virHostdevIsPCINetDevice(hostdev)) { + virDomainHostdevSubsysPCIPtr pcisrc = &hostdev->source.subsys.u.pci; + virPCIDevicePtr dev = NULL; + dev = virPCIDeviceNew(pcisrc->addr.domain, pcisrc->addr.bus, + pcisrc->addr.slot, pcisrc->addr.function); +
The daemon will crash if this function returns NULL. There should be check for that, but it shouldn't be fatal, so we can clean up as much as possible (see Loop 3 for example on how to handle that).
+ if (virPCIDeviceListFind(pcidevs, dev)) { + virHostdevNetConfigRestore(hostdev, hostdev_mgr->stateDir, + oldStateDir); + } + virPCIDeviceFree(dev); + } + } + + /* Loop 3: reset pci device used by this domain */ for (i = 0; i < virPCIDeviceListCount(pcidevs); i++) { virPCIDevicePtr dev = virPCIDeviceListGet(pcidevs, i);
@@ -834,6 +853,9 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr hostdev_mgr, } }
+ /* Loop 4: reattach pci devices used by this domain + * and steal all the devices from pcidevs + */ while (virPCIDeviceListCount(pcidevs) > 0) { virPCIDevicePtr dev = virPCIDeviceListStealIndex(pcidevs, 0); virHostdevReattachPCIDevice(dev, hostdev_mgr); -- 1.9.1
Other than that, the patch looks fine. Martin

On 2015年03月31日 17:46, Martin Kletzander wrote: On Thu, Mar 26, 2015 at 10:23:47PM +0800, Huanle Han wrote: Fix for such a case: 1. Domain A and B xml contain the same SRIOV net hostdev(<interface type='hostdev' /> with same pci address). 2. virsh start A (Successfully, and configure the SRIOV net with custom mac) 3. virsh start B (Fail because of the hostdev used by domain A or other reason.) In step 3, 'virHostdevNetConfigRestore' is called for the hostdev which is still used by domain A. It makes the mac/vlan of the SRIOV net change. Makes perfect sense, thanks for finding that out. Code Change in this fix: 1. As the pci used by other domain have been removed from 'pcidevs' in previous loop, we only restore the nic config for the hostdev still in 'pcidevs'(used by this domain) 2. wrap a function 'virHostdevIsPCINetDevice', which detect whether the hostdev is a pci net device or not. 3. update the comments to make it more clear Signed-off-by: Huanle Han <hanxueluo@gmail.com> <hanxueluo@gmail.com> --- src/util/virhostdev.c | 52 ++++++++++++++++++++++++++++++++++++--------------- 1 file changed, 37 insertions(+), 15 deletions(-) diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index 83f567d..b04bae2 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -350,6 +350,14 @@ virHostdevNetDevice(virDomainHostdevDefPtr hostdev, char **linkdev, return ret; } +static int +virHostdevIsPCINetDevice(virDomainHostdevDefPtr hostdev) +{ + return hostdev->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS && + hostdev->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && + hostdev->parent.type == VIR_DOMAIN_DEVICE_NET && + hostdev->parent.data.net; +} static int virHostdevNetConfigVirtPortProfile(const char *linkdev, int vf, @@ -481,10 +489,7 @@ virHostdevNetConfigRestore(virDomainHostdevDefPtr hostdev, /* This is only needed for PCI devices that have been defined * using <interface type='hostdev'>. For all others, it is a NOP. */ - if (hostdev->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS || - hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI || - hostdev->parent.type != VIR_DOMAIN_DEVICE_NET || - !hostdev->parent.data.net) + if (!virHostdevIsPCINetDevice(hostdev)) return 0; isvf = virHostdevIsVirtualFunction(hostdev); @@ -781,19 +786,20 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr hostdev_mgr, goto cleanup; } - /* Again 4 loops; mark all devices as inactive before reset + /* Here are 4 loops; mark all devices as inactive before reset * them and reset all the devices before re-attach. * Attach mac and port profile parameters to devices */ + + /* Loop 1: delete the copy of the dev from pcidevs if it's used by + * other domain. Or delete it from activePCIHostDevs if it had + * been used by this domain. + */ i = 0; while (i < virPCIDeviceListCount(pcidevs)) { virPCIDevicePtr dev = virPCIDeviceListGet(pcidevs, i); virPCIDevicePtr activeDev = NULL; - /* delete the copy of the dev from pcidevs if it's used by - * other domain. Or delete it from activePCIHostDevs if it had - * been used by this domain. - */ activeDev = virPCIDeviceListFind(hostdev_mgr->activePCIHostdevs, dev); if (activeDev) { const char *usedby_drvname; @@ -814,14 +820,27 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr hostdev_mgr, * pcidevs, but has been removed from activePCIHostdevs. */ - /* - * For SRIOV net host devices, unset mac and port profile before - * reset and reattach device + /* Loop 2: before reset and reattach device, + * unset mac and port profile for SRIOV net host devices used by this domain */ The comments were formatted as a sentence, it would be nice to keep it that way. Is comment as below OK? /* * Loop 2: For SRIOV net host devices used by this domain, unset mac * and port profile before reset and reattach device */ - for (i = 0; i < nhostdevs; i++) - virHostdevNetConfigRestore(hostdevs[i], hostdev_mgr->stateDir, - oldStateDir); + for (i = 0; i < nhostdevs; i++) { + virDomainHostdevDefPtr hostdev = hostdevs[i]; + if (virHostdevIsPCINetDevice(hostdev)) { + virDomainHostdevSubsysPCIPtr pcisrc = &hostdev->source.subsys.u.pci; + virPCIDevicePtr dev = NULL; + dev = virPCIDeviceNew(pcisrc->addr.domain, pcisrc->addr.bus, + pcisrc->addr.slot, pcisrc->addr.function); + The daemon will crash if this function returns NULL. There should be check for that, but it shouldn't be fatal, so we can clean up as much as possible (see Loop 3 for example on how to handle that). Is this OK? if (dev) { if (virPCIDeviceListFind(pcidevs, dev)) { virHostdevNetConfigRestore(hostdev, hostdev_mgr->stateDir, oldStateDir); } } else { virErrorPtr err = virGetLastError(); VIR_ERROR(_("Failed to new PCI device: %s"), err ? err->message : _("unknown error")); virResetError(err); } virPCIDeviceFree(dev); + if (virPCIDeviceListFind(pcidevs, dev)) { + virHostdevNetConfigRestore(hostdev, hostdev_mgr->stateDir, + oldStateDir); + } + virPCIDeviceFree(dev); + } + } + + /* Loop 3: reset pci device used by this domain */ for (i = 0; i < virPCIDeviceListCount(pcidevs); i++) { virPCIDevicePtr dev = virPCIDeviceListGet(pcidevs, i); @@ -834,6 +853,9 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr hostdev_mgr, } } + /* Loop 4: reattach pci devices used by this domain + * and steal all the devices from pcidevs + */ while (virPCIDeviceListCount(pcidevs) > 0) { virPCIDevicePtr dev = virPCIDeviceListStealIndex(pcidevs, 0); virHostdevReattachPCIDevice(dev, hostdev_mgr); -- 1.9.1 Other than that, the patch looks fine. Martin

On 04/01/2015 12:15 PM, Huanle Han wrote:
On 2015年03月31日 17:46, Martin Kletzander wrote:
On Thu, Mar 26, 2015 at 10:23:47PM +0800, Huanle Han wrote:
Fix for such a case: 1. Domain A and B xml contain the same SRIOV net hostdev(<interface type='hostdev' /> with same pci address). 2. virsh start A (Successfully, and configure the SRIOV net with custom mac) 3. virsh start B (Fail because of the hostdev used by domain A or other reason.) In step 3, 'virHostdevNetConfigRestore' is called for the hostdev which is still used by domain A. It makes the mac/vlan of the SRIOV net change.
Makes perfect sense, thanks for finding that out.
Someone else encountered this last month too, and posted a patch: https://www.redhat.com/archives/libvir-list/2015-February/msg00394.html The patch was NACKed because I didn't think the fix was correct (although it did work) https://www.redhat.com/archives/libvir-list/2015-February/msg00976.html I don't have the time to do a close analysis of this patch right now, but from the comments it sounds like it addresses the concern that I had with the previous patch (that it wouldn't be messing with any devices that were currently not in use by any domain, and hadn't yet been reached in the setup part). I wanted to point that out now so that whoever looks at the next version of this patch checks for that.
Code Change in this fix: 1. As the pci used by other domain have been removed from 'pcidevs' in previous loop, we only restore the nic config for the hostdev still in 'pcidevs'(used by this domain) 2. wrap a function 'virHostdevIsPCINetDevice', which detect whether the hostdev is a pci net device or not. 3. update the comments to make it more clear
Signed-off-by: Huanle Han <hanxueluo@gmail.com> <mailto:hanxueluo@gmail.com> --- src/util/virhostdev.c | 52 ++++++++++++++++++++++++++++++++++++--------------- 1 file changed, 37 insertions(+), 15 deletions(-)
diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index 83f567d..b04bae2 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -350,6 +350,14 @@ virHostdevNetDevice(virDomainHostdevDefPtr hostdev, char **linkdev, return ret; }
+static int +virHostdevIsPCINetDevice(virDomainHostdevDefPtr hostdev) +{ + return hostdev->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS && + hostdev->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && + hostdev->parent.type == VIR_DOMAIN_DEVICE_NET && + hostdev->parent.data.net <http://parent.data.net>; +}
static int virHostdevNetConfigVirtPortProfile(const char *linkdev, int vf, @@ -481,10 +489,7 @@ virHostdevNetConfigRestore(virDomainHostdevDefPtr hostdev, /* This is only needed for PCI devices that have been defined * using <interface type='hostdev'>. For all others, it is a NOP. */ - if (hostdev->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS || - hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI || - hostdev->parent.type != VIR_DOMAIN_DEVICE_NET || - !hostdev->parent.data.net <http://parent.data.net>) + if (!virHostdevIsPCINetDevice(hostdev)) return 0;
isvf = virHostdevIsVirtualFunction(hostdev); @@ -781,19 +786,20 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr hostdev_mgr, goto cleanup; }
- /* Again 4 loops; mark all devices as inactive before reset + /* Here are 4 loops; mark all devices as inactive before reset * them and reset all the devices before re-attach. * Attach mac and port profile parameters to devices */ + + /* Loop 1: delete the copy of the dev from pcidevs if it's used by + * other domain. Or delete it from activePCIHostDevs if it had + * been used by this domain. + */ i = 0; while (i < virPCIDeviceListCount(pcidevs)) { virPCIDevicePtr dev = virPCIDeviceListGet(pcidevs, i); virPCIDevicePtr activeDev = NULL;
- /* delete the copy of the dev from pcidevs if it's used by - * other domain. Or delete it from activePCIHostDevs if it had - * been used by this domain. - */ activeDev = virPCIDeviceListFind(hostdev_mgr->activePCIHostdevs, dev); if (activeDev) { const char *usedby_drvname; @@ -814,14 +820,27 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr hostdev_mgr, * pcidevs, but has been removed from activePCIHostdevs. */
- /* - * For SRIOV net host devices, unset mac and port profile before - * reset and reattach device + /* Loop 2: before reset and reattach device, + * unset mac and port profile for SRIOV net host devices used by this domain */
The comments were formatted as a sentence, it would be nice to keep it that way.
Is comment as below OK? /* * Loop 2: For SRIOV net host devices used by this domain, unset mac * and port profile before reset and reattach device */
- for (i = 0; i < nhostdevs; i++) - virHostdevNetConfigRestore(hostdevs[i], hostdev_mgr->stateDir, - oldStateDir); + for (i = 0; i < nhostdevs; i++) { + virDomainHostdevDefPtr hostdev = hostdevs[i];
+ if (virHostdevIsPCINetDevice(hostdev)) { + virDomainHostdevSubsysPCIPtr pcisrc = &hostdev->source.subsys.u.pci; + virPCIDevicePtr dev = NULL; + dev = virPCIDeviceNew(pcisrc->addr.domain, pcisrc->addr.bus, + pcisrc->addr.slot, pcisrc->addr.function); +
The daemon will crash if this function returns NULL. There should be check for that, but it shouldn't be fatal, so we can clean up as much as possible (see Loop 3 for example on how to handle that). Is this OK? if (dev) { if (virPCIDeviceListFind(pcidevs, dev)) { virHostdevNetConfigRestore(hostdev, hostdev_mgr->stateDir, oldStateDir); } } else { virErrorPtr err = virGetLastError(); VIR_ERROR(_("Failed to new PCI device: %s"), err ? err->message : _("unknown error")); virResetError(err); } virPCIDeviceFree(dev);
+ if (virPCIDeviceListFind(pcidevs, dev)) { + virHostdevNetConfigRestore(hostdev, hostdev_mgr->stateDir, + oldStateDir); + } + virPCIDeviceFree(dev); + } + } + + /* Loop 3: reset pci device used by this domain */ for (i = 0; i < virPCIDeviceListCount(pcidevs); i++) { virPCIDevicePtr dev = virPCIDeviceListGet(pcidevs, i);
@@ -834,6 +853,9 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr hostdev_mgr, } }
+ /* Loop 4: reattach pci devices used by this domain + * and steal all the devices from pcidevs + */ while (virPCIDeviceListCount(pcidevs) > 0) { virPCIDevicePtr dev = virPCIDeviceListStealIndex(pcidevs, 0); virHostdevReattachPCIDevice(dev, hostdev_mgr); -- 1.9.1
Other than that, the patch looks fine.
Martin
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

On Wed, Apr 01, 2015 at 01:35:36PM -0400, Laine Stump wrote:
On 04/01/2015 12:15 PM, Huanle Han wrote:
On 2015年03月31日 17:46, Martin Kletzander wrote:
On Thu, Mar 26, 2015 at 10:23:47PM +0800, Huanle Han wrote:
Fix for such a case: 1. Domain A and B xml contain the same SRIOV net hostdev(<interface type='hostdev' /> with same pci address). 2. virsh start A (Successfully, and configure the SRIOV net with custom mac) 3. virsh start B (Fail because of the hostdev used by domain A or other reason.) In step 3, 'virHostdevNetConfigRestore' is called for the hostdev which is still used by domain A. It makes the mac/vlan of the SRIOV net change.
Makes perfect sense, thanks for finding that out.
Someone else encountered this last month too, and posted a patch:
https://www.redhat.com/archives/libvir-list/2015-February/msg00394.html
The patch was NACKed because I didn't think the fix was correct (although it did work)
https://www.redhat.com/archives/libvir-list/2015-February/msg00976.html
I don't have the time to do a close analysis of this patch right now, but from the comments it sounds like it addresses the concern that I had with the previous patch (that it wouldn't be messing with any devices that were currently not in use by any domain, and hadn't yet been reached in the setup part). I wanted to point that out now so that whoever looks at the next version of this patch checks for that.
I must have missed that. It looks like this series does not have the issue of the one you've mentioned.
Code Change in this fix: 1. As the pci used by other domain have been removed from 'pcidevs' in previous loop, we only restore the nic config for the hostdev still in 'pcidevs'(used by this domain) 2. wrap a function 'virHostdevIsPCINetDevice', which detect whether the hostdev is a pci net device or not. 3. update the comments to make it more clear
Signed-off-by: Huanle Han <hanxueluo@gmail.com> <mailto:hanxueluo@gmail.com> --- src/util/virhostdev.c | 52 ++++++++++++++++++++++++++++++++++++--------------- 1 file changed, 37 insertions(+), 15 deletions(-)
diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index 83f567d..b04bae2 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -350,6 +350,14 @@ virHostdevNetDevice(virDomainHostdevDefPtr hostdev, char **linkdev, return ret; }
+static int +virHostdevIsPCINetDevice(virDomainHostdevDefPtr hostdev) +{ + return hostdev->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS && + hostdev->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && + hostdev->parent.type == VIR_DOMAIN_DEVICE_NET && + hostdev->parent.data.net <http://parent.data.net>; +}
static int virHostdevNetConfigVirtPortProfile(const char *linkdev, int vf, @@ -481,10 +489,7 @@ virHostdevNetConfigRestore(virDomainHostdevDefPtr hostdev, /* This is only needed for PCI devices that have been defined * using <interface type='hostdev'>. For all others, it is a NOP. */ - if (hostdev->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS || - hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI || - hostdev->parent.type != VIR_DOMAIN_DEVICE_NET || - !hostdev->parent.data.net <http://parent.data.net>) + if (!virHostdevIsPCINetDevice(hostdev)) return 0;
isvf = virHostdevIsVirtualFunction(hostdev); @@ -781,19 +786,20 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr hostdev_mgr, goto cleanup; }
- /* Again 4 loops; mark all devices as inactive before reset + /* Here are 4 loops; mark all devices as inactive before reset * them and reset all the devices before re-attach. * Attach mac and port profile parameters to devices */ + + /* Loop 1: delete the copy of the dev from pcidevs if it's used by + * other domain. Or delete it from activePCIHostDevs if it had + * been used by this domain. + */ i = 0; while (i < virPCIDeviceListCount(pcidevs)) { virPCIDevicePtr dev = virPCIDeviceListGet(pcidevs, i); virPCIDevicePtr activeDev = NULL;
- /* delete the copy of the dev from pcidevs if it's used by - * other domain. Or delete it from activePCIHostDevs if it had - * been used by this domain. - */ activeDev = virPCIDeviceListFind(hostdev_mgr->activePCIHostdevs, dev); if (activeDev) { const char *usedby_drvname; @@ -814,14 +820,27 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr hostdev_mgr, * pcidevs, but has been removed from activePCIHostdevs. */
- /* - * For SRIOV net host devices, unset mac and port profile before - * reset and reattach device + /* Loop 2: before reset and reattach device, + * unset mac and port profile for SRIOV net host devices used by this domain */
The comments were formatted as a sentence, it would be nice to keep it that way.
Is comment as below OK? /* * Loop 2: For SRIOV net host devices used by this domain, unset mac * and port profile before reset and reattach device */
s/reset.*/resetting and reattaching the device./
- for (i = 0; i < nhostdevs; i++) - virHostdevNetConfigRestore(hostdevs[i], hostdev_mgr->stateDir, - oldStateDir); + for (i = 0; i < nhostdevs; i++) { + virDomainHostdevDefPtr hostdev = hostdevs[i];
+ if (virHostdevIsPCINetDevice(hostdev)) { + virDomainHostdevSubsysPCIPtr pcisrc = &hostdev->source.subsys.u.pci; + virPCIDevicePtr dev = NULL; + dev = virPCIDeviceNew(pcisrc->addr.domain, pcisrc->addr.bus, + pcisrc->addr.slot, pcisrc->addr.function); +
The daemon will crash if this function returns NULL. There should be check for that, but it shouldn't be fatal, so we can clean up as much as possible (see Loop 3 for example on how to handle that). Is this OK? if (dev) { if (virPCIDeviceListFind(pcidevs, dev)) { virHostdevNetConfigRestore(hostdev, hostdev_mgr->stateDir, oldStateDir); } } else { virErrorPtr err = virGetLastError(); VIR_ERROR(_("Failed to new PCI device: %s"), err ? err->message : _("unknown error")); virResetError(err); } virPCIDeviceFree(dev);
Looks good to me.
+ if (virPCIDeviceListFind(pcidevs, dev)) { + virHostdevNetConfigRestore(hostdev, hostdev_mgr->stateDir, + oldStateDir); + } + virPCIDeviceFree(dev); + } + } + + /* Loop 3: reset pci device used by this domain */ for (i = 0; i < virPCIDeviceListCount(pcidevs); i++) { virPCIDevicePtr dev = virPCIDeviceListGet(pcidevs, i);
@@ -834,6 +853,9 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr hostdev_mgr, } }
+ /* Loop 4: reattach pci devices used by this domain + * and steal all the devices from pcidevs + */ while (virPCIDeviceListCount(pcidevs) > 0) { virPCIDevicePtr dev = virPCIDeviceListStealIndex(pcidevs, 0); virHostdevReattachPCIDevice(dev, hostdev_mgr); -- 1.9.1
Other than that, the patch looks fine.
Martin
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

On 04/01/2015 12:15 PM, Huanle Han wrote:
On 2015年03月31日 17:46, Martin Kletzander wrote:
On Thu, Mar 26, 2015 at 10:23:47PM +0800, Huanle Han wrote:
Fix for such a case: 1. Domain A and B xml contain the same SRIOV net hostdev(<interface type='hostdev' /> with same pci address). 2. virsh start A (Successfully, and configure the SRIOV net with custom mac) 3. virsh start B (Fail because of the hostdev used by domain A or other reason.) In step 3, 'virHostdevNetConfigRestore' is called for the hostdev which is still used by domain A. It makes the mac/vlan of the SRIOV net change.
Makes perfect sense, thanks for finding that out.
Someone else encountered this last month too, and posted a patch:
https://www.redhat.com/archives/libvir-list/2015-February/msg00394.html
The patch was NACKed because I didn't think the fix was correct (although it did work)
https://www.redhat.com/archives/libvir-list/2015-February/msg00976.html
I don't have the time to do a close analysis of this patch right now, but from the comments it sounds like it addresses the concern that I had with the previous patch (that it wouldn't be messing with any devices that were currently not in use by any domain, and hadn't yet been reached in the setup part). I wanted to point that out now so that whoever looks at the next version of this patch checks for that.
active virPCIDevice has member variable 'used_by_drvname' and 'used_by_domname' to mark its user. According to ’virHostdevPreparePCIDevices‘, only the pci is successfully prepared, will
2015-04-02 1:35 GMT+08:00 Laine Stump <laine@laine.org>: the member 'used_by_XXX' be assigned. The domain PCI state when reattach: 1. If qemuProcessStart() fails before pci setup(virHostdevPreparePCIDevices()), the pci will either be not active, or be active but used by others. 2. if qemuProcessStart() fails during pci setup , virHostdevPreparePCIDevices rollbacks the change. The result will be the same with 1. 3. if qemuProcessStart() fails after pci setup, the pci will be active and used by this domain. In fucntion virHostdevReAttachPCIDevices, 1. pcidevs is from virHostdevGetActivePCIHostDeviceList(). That is say, pci in pcidevs is active ( used/prepared by any domains). 2. In loop 1, the pci used by other domains is removed from the pcidevs. After loop, all pcis in pcidevs are used/prepared by this domain. 3. The following reattach operation is on these 'pcidevs', not the pci used by nobody or other domains. So I don't think the problem you said exists. This solution is OK. Am I right?
Code Change in this fix: 1. As the pci used by other domain have been removed from 'pcidevs' in previous loop, we only restore the nic config for the hostdev still in 'pcidevs'(used by this domain) 2. wrap a function 'virHostdevIsPCINetDevice', which detect whether the hostdev is a pci net device or not. 3. update the comments to make it more clear
Signed-off-by: Huanle Han <hanxueluo@gmail.com> <hanxueluo@gmail.com> --- src/util/virhostdev.c | 52 ++++++++++++++++++++++++++++++++++++--------------- 1 file changed, 37 insertions(+), 15 deletions(-)
diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index 83f567d..b04bae2 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -350,6 +350,14 @@ virHostdevNetDevice(virDomainHostdevDefPtr hostdev, char **linkdev, return ret; }
+static int +virHostdevIsPCINetDevice(virDomainHostdevDefPtr hostdev) +{ + return hostdev->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS && + hostdev->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && + hostdev->parent.type == VIR_DOMAIN_DEVICE_NET && + hostdev->parent.data.net; +}
static int virHostdevNetConfigVirtPortProfile(const char *linkdev, int vf, @@ -481,10 +489,7 @@ virHostdevNetConfigRestore(virDomainHostdevDefPtr hostdev, /* This is only needed for PCI devices that have been defined * using <interface type='hostdev'>. For all others, it is a NOP. */ - if (hostdev->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS || - hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI || - hostdev->parent.type != VIR_DOMAIN_DEVICE_NET || - !hostdev->parent.data.net) + if (!virHostdevIsPCINetDevice(hostdev)) return 0;
isvf = virHostdevIsVirtualFunction(hostdev); @@ -781,19 +786,20 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr hostdev_mgr, goto cleanup; }
- /* Again 4 loops; mark all devices as inactive before reset + /* Here are 4 loops; mark all devices as inactive before reset * them and reset all the devices before re-attach. * Attach mac and port profile parameters to devices */ + + /* Loop 1: delete the copy of the dev from pcidevs if it's used by + * other domain. Or delete it from activePCIHostDevs if it had + * been used by this domain. + */ i = 0; while (i < virPCIDeviceListCount(pcidevs)) { virPCIDevicePtr dev = virPCIDeviceListGet(pcidevs, i); virPCIDevicePtr activeDev = NULL;
- /* delete the copy of the dev from pcidevs if it's used by - * other domain. Or delete it from activePCIHostDevs if it had - * been used by this domain. - */ activeDev = virPCIDeviceListFind(hostdev_mgr->activePCIHostdevs, dev); if (activeDev) { const char *usedby_drvname; @@ -814,14 +820,27 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr hostdev_mgr, * pcidevs, but has been removed from activePCIHostdevs. */
- /* - * For SRIOV net host devices, unset mac and port profile before - * reset and reattach device + /* Loop 2: before reset and reattach device, + * unset mac and port profile for SRIOV net host devices used by this domain */
The comments were formatted as a sentence, it would be nice to keep it that way.
Is comment as below OK? /* * Loop 2: For SRIOV net host devices used by this domain, unset mac * and port profile before reset and reattach device */
- for (i = 0; i < nhostdevs; i++) - virHostdevNetConfigRestore(hostdevs[i], hostdev_mgr->stateDir, - oldStateDir); + for (i = 0; i < nhostdevs; i++) { + virDomainHostdevDefPtr hostdev = hostdevs[i];
+ if (virHostdevIsPCINetDevice(hostdev)) { + virDomainHostdevSubsysPCIPtr pcisrc = &hostdev->source.subsys.u.pci; + virPCIDevicePtr dev = NULL; + dev = virPCIDeviceNew(pcisrc->addr.domain, pcisrc->addr.bus, + pcisrc->addr.slot, pcisrc->addr.function); +
The daemon will crash if this function returns NULL. There should be check for that, but it shouldn't be fatal, so we can clean up as much as possible (see Loop 3 for example on how to handle that).
Is this OK? if (dev) { if (virPCIDeviceListFind(pcidevs, dev)) { virHostdevNetConfigRestore(hostdev, hostdev_mgr->stateDir, oldStateDir); } } else { virErrorPtr err = virGetLastError(); VIR_ERROR(_("Failed to new PCI device: %s"), err ? err->message : _("unknown error")); virResetError(err); } virPCIDeviceFree(dev);
+ if (virPCIDeviceListFind(pcidevs, dev)) { + virHostdevNetConfigRestore(hostdev, hostdev_mgr->stateDir, + oldStateDir); + } + virPCIDeviceFree(dev); + } + } + + /* Loop 3: reset pci device used by this domain */ for (i = 0; i < virPCIDeviceListCount(pcidevs); i++) { virPCIDevicePtr dev = virPCIDeviceListGet(pcidevs, i);
@@ -834,6 +853,9 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr hostdev_mgr, } }
+ /* Loop 4: reattach pci devices used by this domain + * and steal all the devices from pcidevs + */ while (virPCIDeviceListCount(pcidevs) > 0) { virPCIDevicePtr dev = virPCIDeviceListStealIndex(pcidevs, 0); virHostdevReattachPCIDevice(dev, hostdev_mgr); -- 1.9.1
Other than that, the patch looks fine.
Martin
-- libvir-list mailing listlibvir-list@redhat.comhttps://www.redhat.com/mailman/listinfo/libvir-list
participants (3)
-
Huanle Han
-
Laine Stump
-
Martin Kletzander