[libvirt] [PATCH 0/5] logging fixes

These are all related to excessive, misleading, or missing info in logs when trying to debug problems with SR-IOV network devices. Patch 2 does change the logging to eliminate an error message when no error has occurred (or prevent overwriting a prior error if a DISASSOCIATE is happening as part of the cleanup after said prior error), but there is a change to behavior in patch 2 that could have unintended bad consequences, which is why I've Cc'ed Christian at Cisco and Stefan and Shivaprasad at IBM, in hopes that they (or someone they can contact at their respective organizations) can look at the change and report back if it will cause a problem. The change in question (again, in 2/5) is that we would previously always return a status of 0 (PORT_VDP_RESPONSE_SUCCESS) from virNetDevVPortProfileGetStatus if instanceId was NULL; that is *always* the case for both ASSOCIATE and DISASSOCIATE for 802.1Qbg, and is true for all DISASSOCIATE commands for 802.1Qbh. With the change in Patch 2/5, we now will now set status to the actual IFLA_PORT_RESPONSE from the response message, which seems to be correct behavior, but could have bad side effects if there is a previously undiscovered bug at the other end of the communication. Laine Stump (5): util: report the MAC address that couldn't be set util: don't log error in virNetDevVPortProfileGetStatus if instanceId is NULL util: improve error reporting in virNetDevVPortProfileGetStatus util: reduce debug log in virPCIGetVirtualFunctions() docs: update to properly reflect meaning of fields in log filter daemon/libvirtd.conf | 14 ++++++--- docs/logging.html.in | 15 ++++++---- src/util/virnetdev.c | 23 +++++++++----- src/util/virnetdevvportprofile.c | 65 ++++++++++++++++++++++++++++++++-------- src/util/virpci.c | 37 ++++++----------------- src/util/virpci.h | 4 +-- 6 files changed, 98 insertions(+), 60 deletions(-) -- 2.5.0

I noticed in a log file that we had failed to set a MAC address. The log said which interface we were trying to set, but didn't give the offending MAC address, which could have been useful in determining the source of the problem. This patch modifies all three places in the code that set MAC addresses to report the failed MAC as well as interface. --- src/util/virnetdev.c | 23 ++++++++++++++++------- 1 file changed, 16 insertions(+), 7 deletions(-) diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index 6f3e681..ea95552 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -256,9 +256,11 @@ int virNetDevSetMAC(const char *ifname, virMacAddrGetRaw(macaddr, (unsigned char *)ifr.ifr_hwaddr.sa_data); if (ioctl(fd, SIOCSIFHWADDR, &ifr) < 0) { + char macstr[VIR_MAC_STRING_BUFLEN]; + virReportSystemError(errno, - _("Cannot set interface MAC on '%s'"), - ifname); + _("Cannot set interface MAC to %s on '%s'"), + virMacAddrFormat(macaddr, macstr), ifname); goto cleanup; } @@ -291,8 +293,8 @@ int virNetDevSetMAC(const char *ifname, if (ioctl(s, SIOCSIFLLADDR, &ifr) < 0) { virReportSystemError(errno, - _("Cannot set interface MAC on '%s'"), - ifname); + _("Cannot set interface MAC to %s on '%s'"), + mac + 1, ifname); goto cleanup; } @@ -2270,10 +2272,17 @@ virNetDevSetVfConfig(const char *ifname, int ifindex, int vf, goto malformed_resp; if (err->error) { + char macstr[VIR_MAC_STRING_BUFLEN]; + virReportSystemError(-err->error, - _("error during set %s of ifindex %d"), - (macaddr ? (vlanid >= 0 ? "mac/vlan" : "mac") : "vlanid"), - ifindex); + _("Cannot set interface MAC/vlanid to %s/%d " + "for ifname %s ifindex %d vf %d"), + (macaddr + ? virMacAddrFormat(macaddr, macstr) + : "(unchanged)"), + vlanid, + ifname ? ifname : "(unspecified)", + ifindex, vf); goto cleanup; } break; -- 2.5.0

virNetDevVPortProfileGetStatus() would log the following error: Could not find netlink response with expected parameters anytime a port profile DISASSOCIATE operation was done for 802.1Qbh, even though the disassociate had been successfully completely. Then, due to the fortunate coincidence of status having been initialized to 0 and then not changed when the "failure" was encountered, it would still return a status of 0 (PORT_VDP_RESPONSE_SUCCESS), so the caller would assume a successful operation. This would result in a spurious log message though, and would fill in LastErrorMessage, so that the API would return that error if it happened during cleanup from some other error. That, in turn, would lead to an incorrect supposition that the response to the port profile disassociate was the cause of the failure. During debugging, I noticed that the VF in question usually had *no uuid* associated with it (big surprise), so the solution is *not* to send the previous instanceId down. This patch fixes virNetDevVPortProfileGetStatus() to only check the VF's uuid in the status if it was given an instanceId to check against. Otherwise it only checks that the particular VF is present (it will be). This does cause a difference in behavior, so I want confirmation from Cisco and IBM that this behavior change is desired (or at least not *un*desired) - rather than returning with status unchanged (and thus always 0, aka PORT_VDP_RESPONSE_SUCCESS) when instanceId is NULL, it will actually get the IFLA_PORT_RESPONSE. This could lead to revelation of error conditions we were previously ignoring. Or not. Only experimentation will tell. Note that for 802.1Qbg instanceId is *always* NULL, and for 802.1Qbh, it is NULL for all DISASSOCIATE commands. --- src/util/virnetdevvportprofile.c | 26 +++++++++++++++++--------- 1 file changed, 17 insertions(+), 9 deletions(-) diff --git a/src/util/virnetdevvportprofile.c b/src/util/virnetdevvportprofile.c index d0d4552..427c67b 100644 --- a/src/util/virnetdevvportprofile.c +++ b/src/util/virnetdevvportprofile.c @@ -1,5 +1,5 @@ /* - * Copyright (C) 2009-2014 Red Hat, Inc. + * Copyright (C) 2009-2015 Red Hat, Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -544,14 +544,22 @@ virNetDevVPortProfileGetStatus(struct nlattr **tb, int32_t vf, goto cleanup; } - if (instanceId && - tb_port[IFLA_PORT_INSTANCE_UUID] && - !memcmp(instanceId, - (unsigned char *) - RTA_DATA(tb_port[IFLA_PORT_INSTANCE_UUID]), - VIR_UUID_BUFLEN) && - tb_port[IFLA_PORT_VF] && - vf == *(uint32_t *)RTA_DATA(tb_port[IFLA_PORT_VF])) { + /* This ensures that the given VF is present in the + * IFLA_VF_PORTS list, and that its uuid matches the + * instanceId (in case we've associated it). If no + * instanceId is sent from the caller, that means we've + * disassociated it from this instanceId, and the uuid + * will either be unset (if still not associated with + * anything) or will be set to a new and different uuid + */ + if ((tb_port[IFLA_PORT_VF] && + vf == *(uint32_t *)RTA_DATA(tb_port[IFLA_PORT_VF])) && + (!instanceId || + (tb_port[IFLA_PORT_INSTANCE_UUID] && + !memcmp(instanceId, + (unsigned char *) + RTA_DATA(tb_port[IFLA_PORT_INSTANCE_UUID]), + VIR_UUID_BUFLEN)))) { found = true; break; } -- 2.5.0

Laine Stump <laine@laine.org> wrote on 12/21/2015 12:17:54 PM:
This does cause a difference in behavior, so I want confirmation from Cisco and IBM that this behavior change is desired (or at least not *un*desired) - rather than returning with status unchanged (and thus always 0, aka PORT_VDP_RESPONSE_SUCCESS) when instanceId is NULL, it will actually get the IFLA_PORT_RESPONSE. This could lead to revelation of error conditions we were previously ignoring. Or not. Only experimentation will tell. Note that for 802.1Qbg instanceId is *always* NULL, and for 802.1Qbh, it is NULL for all DISASSOCIATE commands.
I have forwarded this request to others who have the hardware to test with, though this may take some time to get an answer back... Stefan

On 21.12.2015 18:17, Laine Stump wrote:
virNetDevVPortProfileGetStatus() would log the following error:
Could not find netlink response with expected parameters
anytime a port profile DISASSOCIATE operation was done for 802.1Qbh, even though the disassociate had been successfully completely. Then, due to the fortunate coincidence of status having been initialized to 0 and then not changed when the "failure" was encountered, it would still return a status of 0 (PORT_VDP_RESPONSE_SUCCESS), so the caller would assume a successful operation.
This would result in a spurious log message though, and would fill in LastErrorMessage, so that the API would return that error if it happened during cleanup from some other error. That, in turn, would lead to an incorrect supposition that the response to the port profile disassociate was the cause of the failure.
During debugging, I noticed that the VF in question usually had *no uuid* associated with it (big surprise), so the solution is *not* to send the previous instanceId down.
This patch fixes virNetDevVPortProfileGetStatus() to only check the VF's uuid in the status if it was given an instanceId to check against. Otherwise it only checks that the particular VF is present (it will be).
This does cause a difference in behavior, so I want confirmation from Cisco and IBM that this behavior change is desired (or at least not *un*desired) - rather than returning with status unchanged (and thus always 0, aka PORT_VDP_RESPONSE_SUCCESS) when instanceId is NULL, it will actually get the IFLA_PORT_RESPONSE. This could lead to revelation of error conditions we were previously ignoring. Or not. Only experimentation will tell. Note that for 802.1Qbg instanceId is *always* NULL, and for 802.1Qbh, it is NULL for all DISASSOCIATE commands.
--- src/util/virnetdevvportprofile.c | 26 +++++++++++++++++--------- 1 file changed, 17 insertions(+), 9 deletions(-)
diff --git a/src/util/virnetdevvportprofile.c b/src/util/virnetdevvportprofile.c index d0d4552..427c67b 100644 --- a/src/util/virnetdevvportprofile.c +++ b/src/util/virnetdevvportprofile.c @@ -1,5 +1,5 @@ /* - * Copyright (C) 2009-2014 Red Hat, Inc. + * Copyright (C) 2009-2015 Red Hat, Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -544,14 +544,22 @@ virNetDevVPortProfileGetStatus(struct nlattr **tb, int32_t vf, goto cleanup; }
- if (instanceId && - tb_port[IFLA_PORT_INSTANCE_UUID] && - !memcmp(instanceId, - (unsigned char *) - RTA_DATA(tb_port[IFLA_PORT_INSTANCE_UUID]), - VIR_UUID_BUFLEN) && - tb_port[IFLA_PORT_VF] && - vf == *(uint32_t *)RTA_DATA(tb_port[IFLA_PORT_VF])) { + /* This ensures that the given VF is present in the + * IFLA_VF_PORTS list, and that its uuid matches the + * instanceId (in case we've associated it). If no + * instanceId is sent from the caller, that means we've + * disassociated it from this instanceId, and the uuid + * will either be unset (if still not associated with + * anything) or will be set to a new and different uuid + */ + if ((tb_port[IFLA_PORT_VF] && + vf == *(uint32_t *)RTA_DATA(tb_port[IFLA_PORT_VF])) && + (!instanceId || + (tb_port[IFLA_PORT_INSTANCE_UUID] && + !memcmp(instanceId, + (unsigned char *) + RTA_DATA(tb_port[IFLA_PORT_INSTANCE_UUID]), + VIR_UUID_BUFLEN)))) { found = true; break; }
I must admit that even though I understand C, I have no idea whether this is correct or not. I mean, I've never played with such hardware so I'll let Stefan review this one. Michal

-----Original Message----- From: libvir-list-bounces@redhat.com [mailto:libvir-list-bounces@redhat.com] On Behalf Of Michal Privoznik Sent: Monday, December 21, 2015 11:56 PM To: Laine Stump; libvir-list@redhat.com Cc: stefanb@us.ibm.com Subject: Re: [libvirt] [PATCH 2/5] util: don't log error in virNetDevVPortProfileGetStatus if instanceId is NULL
On 21.12.2015 18:17, Laine Stump wrote:
virNetDevVPortProfileGetStatus() would log the following error:
Could not find netlink response with expected parameters
anytime a port profile DISASSOCIATE operation was done for 802.1Qbh, even though the disassociate had been successfully completely. Then, due to the fortunate coincidence of status having been initialized to 0 and then not changed when the "failure" was encountered, it would still return a status of 0 (PORT_VDP_RESPONSE_SUCCESS), so the caller would assume a successful operation.
This would result in a spurious log message though, and would fill in LastErrorMessage, so that the API would return that error if it happened during cleanup from some other error. That, in turn, would lead to an incorrect supposition that the response to the port profile disassociate was the cause of the failure.
During debugging, I noticed that the VF in question usually had *no uuid* associated with it (big surprise), so the solution is *not* to send the previous instanceId down.
This patch fixes virNetDevVPortProfileGetStatus() to only check the VF's uuid in the status if it was given an instanceId to check against. Otherwise it only checks that the particular VF is present (it will be).
This does cause a difference in behavior, so I want confirmation from Cisco and IBM that this behavior change is desired (or at least not *un*desired) - rather than returning with status unchanged (and thus always 0, aka PORT_VDP_RESPONSE_SUCCESS) when instanceId is NULL, it will actually get the IFLA_PORT_RESPONSE. This could lead to revelation of error conditions we were previously ignoring. Or not. Only experimentation will tell. Note that for 802.1Qbg instanceId is *always* NULL, and for 802.1Qbh, it is NULL for all DISASSOCIATE commands.
--- src/util/virnetdevvportprofile.c | 26 +++++++++++++++++--------- 1 file changed, 17 insertions(+), 9 deletions(-)
diff --git a/src/util/virnetdevvportprofile.c b/src/util/virnetdevvportprofile.c index d0d4552..427c67b 100644 --- a/src/util/virnetdevvportprofile.c +++ b/src/util/virnetdevvportprofile.c @@ -1,5 +1,5 @@ /* - * Copyright (C) 2009-2014 Red Hat, Inc. + * Copyright (C) 2009-2015 Red Hat, Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -544,14 +544,22 @@ virNetDevVPortProfileGetStatus(struct nlattr **tb, int32_t vf, goto cleanup; }
- if (instanceId && - tb_port[IFLA_PORT_INSTANCE_UUID] && - !memcmp(instanceId, - (unsigned char *) - RTA_DATA(tb_port[IFLA_PORT_INSTANCE_UUID]), - VIR_UUID_BUFLEN) && - tb_port[IFLA_PORT_VF] && - vf == *(uint32_t *)RTA_DATA(tb_port[IFLA_PORT_VF])) { + /* This ensures that the given VF is present in the + * IFLA_VF_PORTS list, and that its uuid matches the + * instanceId (in case we've associated it). If no + * instanceId is sent from the caller, that means we've + * disassociated it from this instanceId, and the uuid + * will either be unset (if still not associated with + * anything) or will be set to a new and different uuid + */ + if ((tb_port[IFLA_PORT_VF] && + vf == *(uint32_t *)RTA_DATA(tb_port[IFLA_PORT_VF])) && + (!instanceId || + (tb_port[IFLA_PORT_INSTANCE_UUID] && + !memcmp(instanceId, + (unsigned char *) + RTA_DATA(tb_port[IFLA_PORT_INSTANCE_UUID]), + VIR_UUID_BUFLEN)))) { found = true; break; }
I must admit that even though I understand C, I have no idea whether this is correct or not. I mean, I've never played with such hardware so I'll let Stefan review this one.
Michal
ACK I have not tested the fix above, but both the analysis and the fix seem correct to me. Thanks /Chris

On 01/11/2016 04:50 PM, Christian Benvenuti (benve) wrote:
-----Original Message----- From: libvir-list-bounces@redhat.com [mailto:libvir-list-bounces@redhat.com] On Behalf Of Michal Privoznik Sent: Monday, December 21, 2015 11:56 PM To: Laine Stump; libvir-list@redhat.com Cc: stefanb@us.ibm.com Subject: Re: [libvirt] [PATCH 2/5] util: don't log error in virNetDevVPortProfileGetStatus if instanceId is NULL
On 21.12.2015 18:17, Laine Stump wrote:
virNetDevVPortProfileGetStatus() would log the following error:
Could not find netlink response with expected parameters
anytime a port profile DISASSOCIATE operation was done for 802.1Qbh, even though the disassociate had been successfully completely. Then, due to the fortunate coincidence of status having been initialized to 0 and then not changed when the "failure" was encountered, it would still return a status of 0 (PORT_VDP_RESPONSE_SUCCESS), so the caller would assume a successful operation.
This would result in a spurious log message though, and would fill in LastErrorMessage, so that the API would return that error if it happened during cleanup from some other error. That, in turn, would lead to an incorrect supposition that the response to the port profile disassociate was the cause of the failure.
During debugging, I noticed that the VF in question usually had *no uuid* associated with it (big surprise), so the solution is *not* to send the previous instanceId down.
This patch fixes virNetDevVPortProfileGetStatus() to only check the VF's uuid in the status if it was given an instanceId to check against. Otherwise it only checks that the particular VF is present (it will be).
This does cause a difference in behavior, so I want confirmation from Cisco and IBM that this behavior change is desired (or at least not *un*desired) - rather than returning with status unchanged (and thus always 0, aka PORT_VDP_RESPONSE_SUCCESS) when instanceId is NULL, it will actually get the IFLA_PORT_RESPONSE. This could lead to revelation of error conditions we were previously ignoring. Or not. Only experimentation will tell. Note that for 802.1Qbg instanceId is *always* NULL, and for 802.1Qbh, it is NULL for all DISASSOCIATE commands.
--- src/util/virnetdevvportprofile.c | 26 +++++++++++++++++--------- 1 file changed, 17 insertions(+), 9 deletions(-)
diff --git a/src/util/virnetdevvportprofile.c b/src/util/virnetdevvportprofile.c index d0d4552..427c67b 100644 --- a/src/util/virnetdevvportprofile.c +++ b/src/util/virnetdevvportprofile.c @@ -1,5 +1,5 @@ /* - * Copyright (C) 2009-2014 Red Hat, Inc. + * Copyright (C) 2009-2015 Red Hat, Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -544,14 +544,22 @@ virNetDevVPortProfileGetStatus(struct nlattr **tb, int32_t vf, goto cleanup; }
- if (instanceId && - tb_port[IFLA_PORT_INSTANCE_UUID] && - !memcmp(instanceId, - (unsigned char *) - RTA_DATA(tb_port[IFLA_PORT_INSTANCE_UUID]), - VIR_UUID_BUFLEN) && - tb_port[IFLA_PORT_VF] && - vf == *(uint32_t *)RTA_DATA(tb_port[IFLA_PORT_VF])) { + /* This ensures that the given VF is present in the + * IFLA_VF_PORTS list, and that its uuid matches the + * instanceId (in case we've associated it). If no + * instanceId is sent from the caller, that means we've + * disassociated it from this instanceId, and the uuid + * will either be unset (if still not associated with + * anything) or will be set to a new and different uuid + */ + if ((tb_port[IFLA_PORT_VF] && + vf == *(uint32_t *)RTA_DATA(tb_port[IFLA_PORT_VF])) && + (!instanceId || + (tb_port[IFLA_PORT_INSTANCE_UUID] && + !memcmp(instanceId, + (unsigned char *) + RTA_DATA(tb_port[IFLA_PORT_INSTANCE_UUID]), + VIR_UUID_BUFLEN)))) { found = true; break; }
I must admit that even though I understand C, I have no idea whether this is correct or not. I mean, I've never played with such hardware so I'll let Stefan review this one.
Michal ACK
I have not tested the fix above, but both the analysis and the fix seem correct to me.
Thanks, Christian! A coworker was able to test it on a system running RHEL6 (which is based on a 2.6.32 kernel) and Cisco VMFEX/enic with 802.1Qbh port profiles, and the associate/disassociate commands worked properly for both starting/stopping domains as well as migration. This leads me to think it's safe, so I pushed it.

The previous error message just indicated that the desired response couldn't be found, this patch tells what was desired, as well as listing out the entire table that had been in the netlink response, to give some kind of idea why it failed. --- src/util/virnetdevvportprofile.c | 39 +++++++++++++++++++++++++++++++++++---- 1 file changed, 35 insertions(+), 4 deletions(-) diff --git a/src/util/virnetdevvportprofile.c b/src/util/virnetdevvportprofile.c index 427c67b..f4b876f 100644 --- a/src/util/virnetdevvportprofile.c +++ b/src/util/virnetdevvportprofile.c @@ -566,10 +566,41 @@ virNetDevVPortProfileGetStatus(struct nlattr **tb, int32_t vf, } if (!found) { - virReportError(VIR_ERR_INTERNAL_ERROR, "%s", - _("Could not find netlink response with " - "expected parameters")); - goto cleanup; + char instanceIdStr[VIR_UUID_STRING_BUFLEN] = "(none)"; + + if (instanceId) + virUUIDFormat(instanceId, instanceIdStr); + + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Could not find vf/instanceId %u/%s " + " in netlink response"), + vf, instanceIdStr); + + /* go through all the entries again. This seems tedious, + * but experience has shown the resulting log to be + * very useful. + */ + VIR_WARN("IFLA_VF_PORTS entries that were returned:"); + nla_for_each_nested(tb_vf_ports, tb[IFLA_VF_PORTS], rem) { + char uuidstr[VIR_UUID_STRING_BUFLEN] = "(none)"; + + if (nla_parse_nested(tb_port, IFLA_PORT_MAX, tb_vf_ports, + ifla_port_policy)) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("error parsing IFLA_VF_PORT " + "during error reporting")); + goto cleanup; + } + if (tb_port[IFLA_PORT_INSTANCE_UUID]) { + virUUIDFormat((unsigned char *) + RTA_DATA(tb_port[IFLA_PORT_INSTANCE_UUID]), + uuidstr); + } + VIR_WARN(" vf: %d uuid: %s", + tb_port[IFLA_PORT_VF] ? + *(uint32_t *)RTA_DATA(tb_port[IFLA_PORT_VF]) : -1, + uuidstr); + } goto cleanup; } } else { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", -- 2.5.0

On 21.12.2015 18:17, Laine Stump wrote:
The previous error message just indicated that the desired response couldn't be found, this patch tells what was desired, as well as listing out the entire table that had been in the netlink response, to give some kind of idea why it failed. --- src/util/virnetdevvportprofile.c | 39 +++++++++++++++++++++++++++++++++++---- 1 file changed, 35 insertions(+), 4 deletions(-)
diff --git a/src/util/virnetdevvportprofile.c b/src/util/virnetdevvportprofile.c index 427c67b..f4b876f 100644 --- a/src/util/virnetdevvportprofile.c +++ b/src/util/virnetdevvportprofile.c @@ -566,10 +566,41 @@ virNetDevVPortProfileGetStatus(struct nlattr **tb, int32_t vf, }
if (!found) { - virReportError(VIR_ERR_INTERNAL_ERROR, "%s", - _("Could not find netlink response with " - "expected parameters")); - goto cleanup; + char instanceIdStr[VIR_UUID_STRING_BUFLEN] = "(none)"; + + if (instanceId) + virUUIDFormat(instanceId, instanceIdStr); + + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Could not find vf/instanceId %u/%s " + " in netlink response"), + vf, instanceIdStr); + + /* go through all the entries again. This seems tedious, + * but experience has shown the resulting log to be + * very useful. + */ + VIR_WARN("IFLA_VF_PORTS entries that were returned:"); + nla_for_each_nested(tb_vf_ports, tb[IFLA_VF_PORTS], rem) { + char uuidstr[VIR_UUID_STRING_BUFLEN] = "(none)"; + + if (nla_parse_nested(tb_port, IFLA_PORT_MAX, tb_vf_ports, + ifla_port_policy)) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("error parsing IFLA_VF_PORT " + "during error reporting"));
error when printing error :)
+ goto cleanup; + } + if (tb_port[IFLA_PORT_INSTANCE_UUID]) { + virUUIDFormat((unsigned char *) + RTA_DATA(tb_port[IFLA_PORT_INSTANCE_UUID]), + uuidstr); + } + VIR_WARN(" vf: %d uuid: %s", + tb_port[IFLA_PORT_VF] ? + *(uint32_t *)RTA_DATA(tb_port[IFLA_PORT_VF]) : -1, + uuidstr); + } goto cleanup; } } else { virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
ACK Michal

Due to debug logs like this: virPCIGetDeviceAddressFromSysfsLink:2432 : Attempting to resolve device path from device link '/sys/class/net/eth1/device/virtfn6' logStrToLong_ui:2369 : Converted '0000:07:00.7' to unsigned int 0 logStrToLong_ui:2369 : Converted '07:00.7' to unsigned int 7 logStrToLong_ui:2369 : Converted '00.7' to unsigned int 0 logStrToLong_ui:2369 : Converted '7' to unsigned int 7 virPCIGetDeviceAddressFromSysfs:1947 : virPCIDeviceAddress 0000:07:00.7 virPCIGetVirtualFunctions:2554 : Found virtual function 7 printed *once for each SR-IOV Virtual Function* of a Physical Function each time libvirt retrieved the list of VFs (so if the system has 128 VFs, there would be 900 lines of log for each call), the debug logs on any system with a large number of VFs was dominated by "information" that was possibly useful for debugging when the code was being written, but is now useless for debugging of any problem on a running system, and only serves to obscure the real useful information. This overkill has no place in production code, so this patch removes it. --- src/util/virpci.c | 37 +++++++++---------------------------- src/util/virpci.h | 4 ++-- 2 files changed, 11 insertions(+), 30 deletions(-) diff --git a/src/util/virpci.c b/src/util/virpci.c index 6f0cb8c..d7d2b92 100644 --- a/src/util/virpci.c +++ b/src/util/virpci.c @@ -2363,12 +2363,8 @@ logStrToLong_ui(char const *s, int ret = 0; ret = virStrToLong_ui(s, end_ptr, base, result); - if (ret != 0) { + if (ret != 0) VIR_ERROR(_("Failed to convert '%s' to unsigned int"), s); - } else { - VIR_DEBUG("Converted '%s' to unsigned int %u", s, *result); - } - return ret; } @@ -2429,11 +2425,8 @@ virPCIGetDeviceAddressFromSysfsLink(const char *device_link, char errbuf[64]; int ret = -1; - VIR_DEBUG("Attempting to resolve device path from device link '%s'", - device_link); - if (!virFileExists(device_link)) { - VIR_DEBUG("sysfs_path '%s' does not exist", device_link); + VIR_DEBUG("'%s' does not exist", device_link); return ret; } @@ -2458,14 +2451,7 @@ virPCIGetDeviceAddressFromSysfsLink(const char *device_link, goto out; } - VIR_DEBUG("virPCIDeviceAddress %.4x:%.2x:%.2x.%.1x", - (*bdf)->domain, - (*bdf)->bus, - (*bdf)->slot, - (*bdf)->function); - ret = 0; - out: VIR_FREE(device_path); @@ -2477,22 +2463,20 @@ virPCIGetDeviceAddressFromSysfsLink(const char *device_link, */ int virPCIGetPhysicalFunction(const char *vf_sysfs_path, - virPCIDeviceAddressPtr *physical_function) + virPCIDeviceAddressPtr *pf) { int ret = -1; char *device_link = NULL; - VIR_DEBUG("Attempting to get SR IOV physical function for device " - "with sysfs path '%s'", vf_sysfs_path); - if (virBuildPath(&device_link, vf_sysfs_path, "physfn") == -1) { virReportOOMError(); return ret; - } else { - ret = virPCIGetDeviceAddressFromSysfsLink(device_link, - physical_function); } + if ((ret = virPCIGetDeviceAddressFromSysfsLink(device_link, pf)) >= 0) { + VIR_DEBUG("PF for VF device '%s': %.4x:%.2x:%.2x.%.1x", vf_sysfs_path, + (*pf)->domain, (*pf)->bus, (*pf)->slot, (*pf)->function); + } VIR_FREE(device_link); return ret; @@ -2514,9 +2498,6 @@ virPCIGetVirtualFunctions(const char *sysfs_path, virPCIDeviceAddress *config_addr = NULL; char *totalvfs_file = NULL, *totalvfs_str = NULL; - VIR_DEBUG("Attempting to get SR IOV virtual functions for device" - "with sysfs path '%s'", sysfs_path); - *virtual_functions = NULL; *num_virtual_functions = 0; *max_virtual_functions = 0; @@ -2551,13 +2532,13 @@ virPCIGetVirtualFunctions(const char *sysfs_path, goto error; } - VIR_DEBUG("Found virtual function %zu", *num_virtual_functions); if (VIR_APPEND_ELEMENT(*virtual_functions, *num_virtual_functions, config_addr) < 0) goto error; VIR_FREE(device_link); } while (1); + VIR_DEBUG("Found %zu virtual functions for %s", *num_virtual_functions, sysfs_path); ret = 0; cleanup: VIR_FREE(device_link); @@ -2742,7 +2723,7 @@ static const char *unsupported = N_("not supported on non-linux platforms"); int virPCIGetPhysicalFunction(const char *vf_sysfs_path ATTRIBUTE_UNUSED, - virPCIDeviceAddressPtr *physical_function ATTRIBUTE_UNUSED) + virPCIDeviceAddressPtr *pf ATTRIBUTE_UNUSED) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _(unsupported)); return -1; diff --git a/src/util/virpci.h b/src/util/virpci.h index f3d5676..8930133 100644 --- a/src/util/virpci.h +++ b/src/util/virpci.h @@ -166,8 +166,8 @@ int virPCIDeviceIsAssignable(virPCIDevicePtr dev, int strict_acs_check); int virPCIDeviceWaitForCleanup(virPCIDevicePtr dev, const char *matcher); -int virPCIGetPhysicalFunction(const char *sysfs_path, - virPCIDeviceAddressPtr *phys_fn); +int virPCIGetPhysicalFunction(const char *vf_sysfs_path, + virPCIDeviceAddressPtr *pf); int virPCIGetVirtualFunctions(const char *sysfs_path, virPCIDeviceAddressPtr **virtual_functions, -- 2.5.0

The documentation (and comment in libvirtd.conf) says that the text in a log filter is compared to the "source file name", and gives the example of "util/json", but this is not correct (at least not since commit 2835c1e, possibly earlier). It is instead compared to the string given in the VIR_LOG_INIT() macro invocation at the top of each source file, which is always "similar to but not the same as" the source file name (in the example above, the proper name is "util.json", while the file name is "util/virjson.c"). This patch corrects the misstatement in both the documentation and in libvirtd.conf. --- daemon/libvirtd.conf | 14 ++++++++++---- docs/logging.html.in | 15 +++++++++------ 2 files changed, 19 insertions(+), 10 deletions(-) diff --git a/daemon/libvirtd.conf b/daemon/libvirtd.conf index 514e6e4..5485f98 100644 --- a/daemon/libvirtd.conf +++ b/daemon/libvirtd.conf @@ -344,10 +344,16 @@ # The format for a filter is one of: # x:name # x:+name -# where name is a string which is matched against source file name, -# e.g., "remote", "qemu", or "util/json", the optional "+" prefix -# tells libvirt to log stack trace for each message matching name, -# and x is the minimal level where matching messages should be logged: + +# where name is a string which is matched against the category +# given in the VIR_LOG_INIT() at the top of each libvirt source +# file, e.g., "remote", "qemu", or "util.json" (the name in the +# filter can be a substring of the full category name, in order +# to match multiple similar categories), the optional "+" prefix +# tells libvirt to log stack trace for each message matching +# name, and x is the minimal level where matching messages should +# be logged: + # 1: DEBUG # 2: INFO # 3: WARNING diff --git a/docs/logging.html.in b/docs/logging.html.in index d8d8f1e..b6dc91c 100644 --- a/docs/logging.html.in +++ b/docs/logging.html.in @@ -104,12 +104,15 @@ <pre> x:name (log message only) x:+name (log message + stack trace)</pre> - <p>where <code>name</code> is a string which is matched against source - file name, e.g., <code>remote</code>, <code>qemu</code>, or - <code>util/json</code>, the optional <code>+</code> prefix tells libvirt - to log stack trace for each message matching <code>name</code>, and - <code>x</code> is the minimal level where matching messages should - be logged:</p> + <p>where <code>name</code> is a string which is matched against + the category given in the VIR_LOG_INIT() at the top of each + libvirt source file, e.g., <code>remote</code>, <code>qemu</code>, + or <code>util.json</code> (the name in the filter can be a + substring of the full category name, in order to match multiple + similar categories), the optional <code>+</code> prefix tells + libvirt to log stack trace for each message + matching <code>name</code>, and <code>x</code> is the minimal + level where matching messages should be logged:</p> <ul> <li>1: DEBUG</li> <li>2: INFO</li> -- 2.5.0

On 21.12.2015 18:17, Laine Stump wrote:
These are all related to excessive, misleading, or missing info in logs when trying to debug problems with SR-IOV network devices.
Patch 2 does change the logging to eliminate an error message when no error has occurred (or prevent overwriting a prior error if a DISASSOCIATE is happening as part of the cleanup after said prior error), but there is a change to behavior in patch 2 that could have unintended bad consequences, which is why I've Cc'ed Christian at Cisco and Stefan and Shivaprasad at IBM, in hopes that they (or someone they can contact at their respective organizations) can look at the change and report back if it will cause a problem. The change in question (again, in 2/5) is that we would previously always return a status of 0 (PORT_VDP_RESPONSE_SUCCESS) from virNetDevVPortProfileGetStatus if instanceId was NULL; that is *always* the case for both ASSOCIATE and DISASSOCIATE for 802.1Qbg, and is true for all DISASSOCIATE commands for 802.1Qbh. With the change in Patch 2/5, we now will now set status to the actual IFLA_PORT_RESPONSE from the response message, which seems to be correct behavior, but could have bad side effects if there is a previously undiscovered bug at the other end of the communication.
Laine Stump (5): util: report the MAC address that couldn't be set util: don't log error in virNetDevVPortProfileGetStatus if instanceId is NULL util: improve error reporting in virNetDevVPortProfileGetStatus util: reduce debug log in virPCIGetVirtualFunctions() docs: update to properly reflect meaning of fields in log filter
daemon/libvirtd.conf | 14 ++++++--- docs/logging.html.in | 15 ++++++---- src/util/virnetdev.c | 23 +++++++++----- src/util/virnetdevvportprofile.c | 65 ++++++++++++++++++++++++++++++++-------- src/util/virpci.c | 37 ++++++----------------- src/util/virpci.h | 4 +-- 6 files changed, 98 insertions(+), 60 deletions(-)
ACK series modulo 2/5 where I don't feel confident enough. Michal
participants (4)
-
Christian Benvenuti (benve)
-
Laine Stump
-
Michal Privoznik
-
Stefan Berger