On 01/11/2016 04:50 PM, Christian Benvenuti (benve) wrote:
> -----Original Message-----
> From: libvir-list-bounces(a)redhat.com [mailto:libvir-list-bounces@redhat.com] On
> Behalf Of Michal Privoznik
> Sent: Monday, December 21, 2015 11:56 PM
> To: Laine Stump; libvir-list(a)redhat.com
> Cc: stefanb(a)us.ibm.com
> Subject: Re: [libvirt] [PATCH 2/5] util: don't log error in
> virNetDevVPortProfileGetStatus if instanceId is NULL
>
> On 21.12.2015 18:17, Laine Stump wrote:
>> virNetDevVPortProfileGetStatus() would log the following error:
>>
>> Could not find netlink response with expected parameters
>>
>> anytime a port profile DISASSOCIATE operation was done for 802.1Qbh,
>> even though the disassociate had been successfully completely. Then,
>> due to the fortunate coincidence of status having been initialized to
>> 0 and then not changed when the "failure" was encountered, it would
>> still return a status of 0 (PORT_VDP_RESPONSE_SUCCESS), so the caller
>> would assume a successful operation.
>>
>> This would result in a spurious log message though, and would fill in
>> LastErrorMessage, so that the API would return that error if it
>> happened during cleanup from some other error. That, in turn, would
>> lead to an incorrect supposition that the response to the port profile
>> disassociate was the cause of the failure.
>>
>> During debugging, I noticed that the VF in question usually had *no
>> uuid* associated with it (big surprise), so the solution is *not* to
>> send the previous instanceId down.
>>
>> This patch fixes virNetDevVPortProfileGetStatus() to only check the
>> VF's uuid in the status if it was given an instanceId to check
>> against. Otherwise it only checks that the particular VF is present
>> (it will be).
>>
>> This does cause a difference in behavior, so I want confirmation from
>> Cisco and IBM that this behavior change is desired (or at least not
>> *un*desired) - rather than returning with status unchanged (and thus
>> always 0, aka PORT_VDP_RESPONSE_SUCCESS) when instanceId is NULL, it
>> will actually get the IFLA_PORT_RESPONSE. This could lead to
>> revelation of error conditions we were previously ignoring. Or not.
>> Only experimentation will tell. Note that for 802.1Qbg instanceId is
>> *always* NULL, and for 802.1Qbh, it is NULL for all DISASSOCIATE
>> commands.
>>
>> --- src/util/virnetdevvportprofile.c | 26
>> +++++++++++++++++--------- 1 file changed, 17 insertions(+), 9
>> deletions(-)
>>
>> diff --git a/src/util/virnetdevvportprofile.c
>> b/src/util/virnetdevvportprofile.c
>> index d0d4552..427c67b 100644
>> --- a/src/util/virnetdevvportprofile.c
>> +++ b/src/util/virnetdevvportprofile.c
>> @@ -1,5 +1,5 @@
>> /*
>> - * Copyright (C) 2009-2014 Red Hat, Inc.
>> + * Copyright (C) 2009-2015 Red Hat, Inc.
>> *
>> * This library is free software; you can redistribute it and/or
>> * modify it under the terms of the GNU Lesser General Public @@
>> -544,14 +544,22 @@ virNetDevVPortProfileGetStatus(struct nlattr **tb, int32_t
> vf,
>> goto cleanup;
>> }
>>
>> - if (instanceId &&
>> - tb_port[IFLA_PORT_INSTANCE_UUID] &&
>> - !memcmp(instanceId,
>> - (unsigned char *)
>> -
> RTA_DATA(tb_port[IFLA_PORT_INSTANCE_UUID]),
>> - VIR_UUID_BUFLEN) &&
>> - tb_port[IFLA_PORT_VF] &&
>> - vf == *(uint32_t *)RTA_DATA(tb_port[IFLA_PORT_VF])) {
>> + /* This ensures that the given VF is present in the
>> + * IFLA_VF_PORTS list, and that its uuid matches the
>> + * instanceId (in case we've associated it). If no
>> + * instanceId is sent from the caller, that means we've
>> + * disassociated it from this instanceId, and the uuid
>> + * will either be unset (if still not associated with
>> + * anything) or will be set to a new and different uuid
>> + */
>> + if ((tb_port[IFLA_PORT_VF] &&
>> + vf == *(uint32_t *)RTA_DATA(tb_port[IFLA_PORT_VF]))
&&
>> + (!instanceId ||
>> + (tb_port[IFLA_PORT_INSTANCE_UUID] &&
>> + !memcmp(instanceId,
>> + (unsigned char *)
>> + RTA_DATA(tb_port[IFLA_PORT_INSTANCE_UUID]),
>> + VIR_UUID_BUFLEN)))) {
>> found = true;
>> break;
>> }
>>
> I must admit that even though I understand C, I have no idea whether this is
> correct or not. I mean, I've never played with such hardware so I'll let
Stefan
> review this one.
>
> Michal
ACK
I have not tested the fix above, but both the analysis and the fix
seem correct to me.
Thanks, Christian! A coworker was able to test it on a system running
RHEL6 (which is based on a 2.6.32 kernel) and Cisco VMFEX/enic with
802.1Qbh port profiles, and the associate/disassociate commands worked
properly for both starting/stopping domains as well as migration. This
leads me to think it's safe, so I pushed it.