[libvirt] [PATCH v3 0/4] util: Fixing libvirt errors on cavium/thunder-nicvf

Following up on my offer to take the baton[*], here's a v3 of this series that should address the provided feedback provided for Radoslaw's v2. ThunderX is Cavium SoC. This platform contain SRIOV NIC. Unlike other commonly known network devices it does not have VF functionality duplicated in its PF. PF is purely management device (on HW level). This creates several problems with existing libvirt code as in many places libvirt assumes that each VF netdev has PF netdev assigned. This patch series trying to address issues which can be easily fixed. (mostly bug fixes found while working on full featured solution) First patch in series is most important as it allows to unblock the netdev detection and use <hostdev> on this platform.i <interface type="hostdev" still does not work as it requires bigger changes both on netdev driver and in libvirt itself. More details about those issues can be found at: https://bugs.linaro.org/show_bug.cgi?id=3778 https://bugs.launchpad.net/charm-nova-compute/+bug/1771662 v3: - Reinstated error path in virNetDevGetPhysicalFunction() - Add missing free of *pfname in error path - Use proper < 0 comparisons when checking for errors v2: - error reporting taken out of virNetDevGetPhysicalFunction() and moved to calling function virNetDevGetVirtualFunctionInfo() - curly braces removed from single line - net-> model check removed as STREQ_NULLABLE() follows [*] https://www.redhat.com/archives/libvir-list/2019-January/msg00201.html Radoslaw Biernacki (4): util: fixing wrong assumption that PF has to have netdev assigned util: Code simplification util: Fix for NULL dereference util: Fixing invalid error checking from virPCIGetNetname() src/qemu/qemu_domain_address.c | 13 ++++---- src/util/virhostdev.c | 2 +- src/util/virnetdev.c | 55 ++++++++++++---------------------- 3 files changed, 25 insertions(+), 45 deletions(-) -- 2.20.1

From: Radoslaw Biernacki <radoslaw.biernacki@linaro.org> libvirt wrongly assumes that VF netdev has to have the netdev assigned to PF. There is no such requirement in SRIOV standard. This patch change the virNetDevSwitchdevFeature() function to deal with SRIOV devices which does not have netdev on PF. Also corrects one comment about PF netdev assumption. One example of such devices is ThunderX VNIC. By applying this change, VF device is used for virNetlinkCommand() as it is the only netdev assigned to VNIC. Signed-off-by: Radoslaw Biernacki <radoslaw.biernacki@linaro.org> [ dannf: Reinstated error path in virNetDevGetPhysicalFunction() as suggested by Michal Privoznik ] Signed-off-by: dann frazier <dann.frazier@canonical.com> --- src/util/virnetdev.c | 19 +++++++++++++------ 1 file changed, 13 insertions(+), 6 deletions(-) diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index 2111b3ada9..82823c0dfc 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -1355,9 +1355,8 @@ virNetDevGetPhysicalFunction(const char *ifname, char **pfname) } if (!*pfname) { - /* this shouldn't be possible. A VF can't exist unless its - * PF device is bound to a network driver - */ + /* The SRIOV standard does not require VF netdevs to have the + netdev assigned to a PF */ virReportError(VIR_ERR_INTERNAL_ERROR, _("The PF device for VF %s has no network device name"), ifname); @@ -1449,8 +1448,12 @@ virNetDevGetVirtualFunctionInfo(const char *vfname, char **pfname, *pfname = NULL; - if (virNetDevGetPhysicalFunction(vfname, pfname) < 0) + if (virNetDevGetPhysicalFunction(vfname, pfname) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Cannot get PF netdev name for VF %s"), + vfname); return ret; + } if (virNetDevSysfsFile(&pf_sysfs_path, *pfname, "device") < 0) goto cleanup; @@ -3178,8 +3181,12 @@ virNetDevSwitchdevFeature(const char *ifname, if ((is_vf = virNetDevIsVirtualFunction(ifname)) < 0) return ret; - if (is_vf == 1 && virNetDevGetPhysicalFunction(ifname, &pfname) < 0) - goto cleanup; + if (is_vf == 1) { + /* ignore error if PF does noto have netdev assigned + * in that case pfname == NULL */ + if (virNetDevGetPhysicalFunction(ifname, &pfname) < 0) + virResetLastError(); + } pci_device_ptr = pfname ? virNetDevGetPCIDevice(pfname) : virNetDevGetPCIDevice(ifname); -- 2.20.1

On 1/22/19 8:26 PM, dann frazier wrote:
From: Radoslaw Biernacki <radoslaw.biernacki@linaro.org>
libvirt wrongly assumes that VF netdev has to have the netdev assigned to PF. There is no such requirement in SRIOV standard. This patch change the virNetDevSwitchdevFeature() function to deal with SRIOV devices which does not have netdev on PF. Also corrects one comment about PF netdev assumption.
One example of such devices is ThunderX VNIC. By applying this change, VF device is used for virNetlinkCommand() as it is the only netdev assigned to VNIC.
Signed-off-by: Radoslaw Biernacki <radoslaw.biernacki@linaro.org> [ dannf: Reinstated error path in virNetDevGetPhysicalFunction() as suggested by Michal Privoznik ] Signed-off-by: dann frazier <dann.frazier@canonical.com> --- src/util/virnetdev.c | 19 +++++++++++++------ 1 file changed, 13 insertions(+), 6 deletions(-)
diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index 2111b3ada9..82823c0dfc 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -1355,9 +1355,8 @@ virNetDevGetPhysicalFunction(const char *ifname, char **pfname) }
if (!*pfname) { - /* this shouldn't be possible. A VF can't exist unless its - * PF device is bound to a network driver - */ + /* The SRIOV standard does not require VF netdevs to have the + netdev assigned to a PF */ virReportError(VIR_ERR_INTERNAL_ERROR, _("The PF device for VF %s has no network device name"), ifname); @@ -1449,8 +1448,12 @@ virNetDevGetVirtualFunctionInfo(const char *vfname, char **pfname,
*pfname = NULL;
- if (virNetDevGetPhysicalFunction(vfname, pfname) < 0) + if (virNetDevGetPhysicalFunction(vfname, pfname) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Cannot get PF netdev name for VF %s"), + vfname); return ret;
This is not needed. It overwrites (probably more accurate) error reported by virNetDevGetPhysicalFunction(). ACK with that removed. Michal

From: Radoslaw Biernacki <radoslaw.biernacki@linaro.org> Removing redundant sections of the code Signed-off-by: Radoslaw Biernacki <radoslaw.biernacki@linaro.org> [ dannf: Add missing free of *pfname in error path and use proper < 0 comparison when checking return for errors, both suggested by Michal Privoznik ] Signed-off-by: dann frazier <dann.frazier@canonical.com> --- src/util/virnetdev.c | 36 ++++++------------------------------ 1 file changed, 6 insertions(+), 30 deletions(-) diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index 82823c0dfc..5571a10865 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -1443,33 +1443,21 @@ int virNetDevGetVirtualFunctionInfo(const char *vfname, char **pfname, int *vf) { - char *pf_sysfs_path = NULL, *vf_sysfs_path = NULL; - int ret = -1; - + int ret; *pfname = NULL; if (virNetDevGetPhysicalFunction(vfname, pfname) < 0) { virReportError(VIR_ERR_INTERNAL_ERROR, _("Cannot get PF netdev name for VF %s"), vfname); - return ret; + return -1; } - if (virNetDevSysfsFile(&pf_sysfs_path, *pfname, "device") < 0) - goto cleanup; - - if (virNetDevSysfsFile(&vf_sysfs_path, vfname, "device") < 0) - goto cleanup; + ret = virNetDevGetVirtualFunctionIndex(*pfname, vfname, vf); - ret = virPCIGetVirtualFunctionIndex(pf_sysfs_path, vf_sysfs_path, vf); - - cleanup: if (ret < 0) VIR_FREE(*pfname); - VIR_FREE(vf_sysfs_path); - VIR_FREE(pf_sysfs_path); - return ret; } @@ -1865,13 +1853,9 @@ virNetDevSaveNetConfig(const char *linkdev, int vf, * it to PF + VFname */ - if (virNetDevGetPhysicalFunction(linkdev, &pfDevOrig) < 0) + if (virNetDevGetVirtualFunctionInfo(linkdev, &pfDevOrig, &vf) < 0) goto cleanup; - pfDevName = pfDevOrig; - - if (virNetDevGetVirtualFunctionIndex(pfDevName, linkdev, &vf) < 0) - goto cleanup; } if (pfDevName) { @@ -2023,13 +2007,9 @@ virNetDevReadNetConfig(const char *linkdev, int vf, * it to PF + VFname */ - if (virNetDevGetPhysicalFunction(linkdev, &pfDevOrig) < 0) + if (virNetDevGetVirtualFunctionInfo(linkdev, &pfDevOrig, &vf) < 0) goto cleanup; - pfDevName = pfDevOrig; - - if (virNetDevGetVirtualFunctionIndex(pfDevName, linkdev, &vf) < 0) - goto cleanup; } /* if there is a PF, it's now in pfDevName, and linkdev is either @@ -2228,13 +2208,9 @@ virNetDevSetNetConfig(const char *linkdev, int vf, * it to PF + VFname */ - if (virNetDevGetPhysicalFunction(linkdev, &pfDevOrig) < 0) + if (virNetDevGetVirtualFunctionInfo(linkdev, &pfDevOrig, &vf)) goto cleanup; - pfDevName = pfDevOrig; - - if (virNetDevGetVirtualFunctionIndex(pfDevName, linkdev, &vf) < 0) - goto cleanup; } -- 2.20.1

On 1/22/19 8:26 PM, dann frazier wrote:
From: Radoslaw Biernacki <radoslaw.biernacki@linaro.org>
Removing redundant sections of the code
Signed-off-by: Radoslaw Biernacki <radoslaw.biernacki@linaro.org> [ dannf: Add missing free of *pfname in error path and use proper < 0 comparison when checking return for errors, both suggested by Michal Privoznik ] Signed-off-by: dann frazier <dann.frazier@canonical.com> --- src/util/virnetdev.c | 36 ++++++------------------------------ 1 file changed, 6 insertions(+), 30 deletions(-)
diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index 82823c0dfc..5571a10865 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -1443,33 +1443,21 @@ int virNetDevGetVirtualFunctionInfo(const char *vfname, char **pfname, int *vf) { - char *pf_sysfs_path = NULL, *vf_sysfs_path = NULL; - int ret = -1; - + int ret; *pfname = NULL;
if (virNetDevGetPhysicalFunction(vfname, pfname) < 0) { virReportError(VIR_ERR_INTERNAL_ERROR, _("Cannot get PF netdev name for VF %s"), vfname); - return ret; + return -1; }
- if (virNetDevSysfsFile(&pf_sysfs_path, *pfname, "device") < 0) - goto cleanup; - - if (virNetDevSysfsFile(&vf_sysfs_path, vfname, "device") < 0) - goto cleanup; + ret = virNetDevGetVirtualFunctionIndex(*pfname, vfname, vf);
- ret = virPCIGetVirtualFunctionIndex(pf_sysfs_path, vf_sysfs_path, vf); - - cleanup: if (ret < 0) VIR_FREE(*pfname);
- VIR_FREE(vf_sysfs_path); - VIR_FREE(pf_sysfs_path); - return ret;
Nitpick, for better future extensibility I prefer initializing @ret, keeping cleanup label and using goto cleanup. ACK Michal

From: Radoslaw Biernacki <radoslaw.biernacki@linaro.org> The device xml parser code does not set "model" while parsing <interface type='hostdev'> <source> <address type='pci' domain='0x0002' bus='0x01' slot='0x00' function='0x2'/> </source> </interface> virDomainDefPtr def->nets[i]->model can be NULL while latter compares strings with STREQ instead of STREQ_NULLABLE. Fixes: ac47e4a6225 (qemu: replace "def->nets[i]" with "net" and "def->sounds[i]" with "sound") Fixes: c7fc151eec7 (qemu: assign virtio devices to PCIe slot when appropriate) Signed-off-by: Radoslaw Biernacki <radoslaw.biernacki@linaro.org> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: dann frazier <dann.frazier@canonical.com> --- src/qemu/qemu_domain_address.c | 13 +++++-------- 1 file changed, 5 insertions(+), 8 deletions(-) diff --git a/src/qemu/qemu_domain_address.c b/src/qemu/qemu_domain_address.c index bd6c4031e0..9e0b50d41e 100644 --- a/src/qemu/qemu_domain_address.c +++ b/src/qemu/qemu_domain_address.c @@ -230,10 +230,8 @@ qemuDomainAssignSpaprVIOAddresses(virDomainDefPtr def) for (i = 0; i < def->nnets; i++) { virDomainNetDefPtr net = def->nets[i]; - if (net->model && - STREQ(net->model, "spapr-vlan")) { + if (STREQ_NULLABLE(net->model, "spapr-vlan")) net->info.type = VIR_DOMAIN_DEVICE_ADDRESS_TYPE_SPAPRVIO; - } if (qemuDomainAssignSpaprVIOAddress(def, &net->info, VIO_ADDR_NET) < 0) goto cleanup; @@ -322,8 +320,7 @@ qemuDomainPrimeVirtioDeviceAddresses(virDomainDefPtr def, for (i = 0; i < def->nnets; i++) { virDomainNetDefPtr net = def->nets[i]; - if (net->model && - STREQ(net->model, "virtio") && + if (STREQ_NULLABLE(net->model, "virtio") && net->info.type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE) { net->info.type = type; } @@ -691,14 +688,14 @@ qemuDomainDeviceCalculatePCIConnectFlags(virDomainDeviceDefPtr dev, * addresses for other hostdev devices. */ if (net->type == VIR_DOMAIN_NET_TYPE_HOSTDEV || - STREQ(net->model, "usb-net")) { + STREQ_NULLABLE(net->model, "usb-net")) { return 0; } - if (STREQ(net->model, "virtio")) + if (STREQ_NULLABLE(net->model, "virtio")) return virtioFlags; - if (STREQ(net->model, "e1000e")) + if (STREQ_NULLABLE(net->model, "e1000e")) return pcieFlags; return pciFlags; -- 2.20.1

From: Radoslaw Biernacki <radoslaw.biernacki@linaro.org> linkdev is In/Out function parameter as second order reference pointer so requires first order dereference for checking NULLs which can be a result from virPCIGetNetName() Fixes: d6ee56d7237 (util: change virPCIGetNetName() to not return error if device has no net name) Signed-off-by: Radoslaw Biernacki <radoslaw.biernacki@linaro.org> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: dann frazier <dann.frazier@canonical.com> --- src/util/virhostdev.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index 41d4e8d936..880ca083d4 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -314,7 +314,7 @@ virHostdevNetDevice(virDomainHostdevDefPtr hostdev, if (virPCIGetNetName(sysfs_path, 0, NULL, linkdev) < 0) return -1; - if (!linkdev) { + if (!(*linkdev)) { virReportError(VIR_ERR_INTERNAL_ERROR, _("The device at %s has no network device name"), sysfs_path); -- 2.20.1

On 1/22/19 8:26 PM, dann frazier wrote:
Following up on my offer to take the baton[*], here's a v3 of this series that should address the provided feedback provided for Radoslaw's v2.
ThunderX is Cavium SoC. This platform contain SRIOV NIC. Unlike other commonly known network devices it does not have VF functionality duplicated in its PF. PF is purely management device (on HW level).
This creates several problems with existing libvirt code as in many places libvirt assumes that each VF netdev has PF netdev assigned.
This patch series trying to address issues which can be easily fixed. (mostly bug fixes found while working on full featured solution)
First patch in series is most important as it allows to unblock the netdev detection and use <hostdev> on this platform.i <interface type="hostdev" still does not work as it requires bigger changes both on netdev driver and in libvirt itself. More details about those issues can be found at: https://bugs.linaro.org/show_bug.cgi?id=3778 https://bugs.launchpad.net/charm-nova-compute/+bug/1771662
v3: - Reinstated error path in virNetDevGetPhysicalFunction() - Add missing free of *pfname in error path - Use proper < 0 comparisons when checking for errors
v2: - error reporting taken out of virNetDevGetPhysicalFunction() and moved to calling function virNetDevGetVirtualFunctionInfo() - curly braces removed from single line - net-> model check removed as STREQ_NULLABLE() follows
[*] https://www.redhat.com/archives/libvir-list/2019-January/msg00201.html
Radoslaw Biernacki (4): util: fixing wrong assumption that PF has to have netdev assigned util: Code simplification util: Fix for NULL dereference util: Fixing invalid error checking from virPCIGetNetname()
src/qemu/qemu_domain_address.c | 13 ++++---- src/util/virhostdev.c | 2 +- src/util/virnetdev.c | 55 ++++++++++++---------------------- 3 files changed, 25 insertions(+), 45 deletions(-)
I've fixed all the small issues I've raised, ACKed the whole thing and pushed. Congratulations Radoslaw on your first libvirt contribution and thank you Dann for picking this up. Michal

Thank you very much Dann for finishing that. And sorry for lack of response for so long. I somehow missed your last email and feeling bad that you had to ping me so long for this. Thank you Michal for accepting this. On Wed, 23 Jan 2019 at 10:29, Michal Privoznik <mprivozn@redhat.com> wrote:
Following up on my offer to take the baton[*], here's a v3 of this series that should address the provided feedback provided for Radoslaw's v2.
ThunderX is Cavium SoC. This platform contain SRIOV NIC. Unlike other commonly known network devices it does not have VF functionality duplicated in its PF. PF is purely management device (on HW level).
This creates several problems with existing libvirt code as in many
On 1/22/19 8:26 PM, dann frazier wrote: places
libvirt assumes that each VF netdev has PF netdev assigned.
This patch series trying to address issues which can be easily fixed. (mostly bug fixes found while working on full featured solution)
First patch in series is most important as it allows to unblock the netdev detection and use <hostdev> on this platform.i <interface type="hostdev" still does not work as it requires bigger changes both on netdev driver and in libvirt itself. More details about those issues can be found at: https://bugs.linaro.org/show_bug.cgi?id=3778 https://bugs.launchpad.net/charm-nova-compute/+bug/1771662
v3: - Reinstated error path in virNetDevGetPhysicalFunction() - Add missing free of *pfname in error path - Use proper < 0 comparisons when checking for errors
v2: - error reporting taken out of virNetDevGetPhysicalFunction() and moved to calling function virNetDevGetVirtualFunctionInfo() - curly braces removed from single line - net-> model check removed as STREQ_NULLABLE() follows
[*] https://www.redhat.com/archives/libvir-list/2019-January/msg00201.html
Radoslaw Biernacki (4): util: fixing wrong assumption that PF has to have netdev assigned util: Code simplification util: Fix for NULL dereference util: Fixing invalid error checking from virPCIGetNetname()
src/qemu/qemu_domain_address.c | 13 ++++---- src/util/virhostdev.c | 2 +- src/util/virnetdev.c | 55 ++++++++++++---------------------- 3 files changed, 25 insertions(+), 45 deletions(-)
I've fixed all the small issues I've raised, ACKed the whole thing and pushed.
Congratulations Radoslaw on your first libvirt contribution and thank you Dann for picking this up.
Michal

Not a problem, and thanks for doing all the heavy lifting! -dann On Wed, Jan 23, 2019 at 12:11 PM Radoslaw Biernacki <radoslaw.biernacki@linaro.org> wrote:
Thank you very much Dann for finishing that. And sorry for lack of response for so long. I somehow missed your last email and feeling bad that you had to ping me so long for this.
Thank you Michal for accepting this.
On Wed, 23 Jan 2019 at 10:29, Michal Privoznik <mprivozn@redhat.com> wrote:
On 1/22/19 8:26 PM, dann frazier wrote:
Following up on my offer to take the baton[*], here's a v3 of this series that should address the provided feedback provided for Radoslaw's v2.
ThunderX is Cavium SoC. This platform contain SRIOV NIC. Unlike other commonly known network devices it does not have VF functionality duplicated in its PF. PF is purely management device (on HW level).
This creates several problems with existing libvirt code as in many places libvirt assumes that each VF netdev has PF netdev assigned.
This patch series trying to address issues which can be easily fixed. (mostly bug fixes found while working on full featured solution)
First patch in series is most important as it allows to unblock the netdev detection and use <hostdev> on this platform.i <interface type="hostdev" still does not work as it requires bigger changes both on netdev driver and in libvirt itself. More details about those issues can be found at: https://bugs.linaro.org/show_bug.cgi?id=3778 https://bugs.launchpad.net/charm-nova-compute/+bug/1771662
v3: - Reinstated error path in virNetDevGetPhysicalFunction() - Add missing free of *pfname in error path - Use proper < 0 comparisons when checking for errors
v2: - error reporting taken out of virNetDevGetPhysicalFunction() and moved to calling function virNetDevGetVirtualFunctionInfo() - curly braces removed from single line - net-> model check removed as STREQ_NULLABLE() follows
[*] https://www.redhat.com/archives/libvir-list/2019-January/msg00201.html
Radoslaw Biernacki (4): util: fixing wrong assumption that PF has to have netdev assigned util: Code simplification util: Fix for NULL dereference util: Fixing invalid error checking from virPCIGetNetname()
src/qemu/qemu_domain_address.c | 13 ++++---- src/util/virhostdev.c | 2 +- src/util/virnetdev.c | 55 ++++++++++++---------------------- 3 files changed, 25 insertions(+), 45 deletions(-)
I've fixed all the small issues I've raised, ACKed the whole thing and pushed.
Congratulations Radoslaw on your first libvirt contribution and thank you Dann for picking this up.
Michal
participants (3)
-
dann frazier
-
Michal Privoznik
-
Radoslaw Biernacki