Re: [libvirt] [Fwd: [PATCH v9] add 802.1Qbh and 802.1Qbg handling]

28 May 2010


On Thu, 2010-05-27 at 17:43 +0200, Arnd Bergmann wrote:
...
On Thursday 27 May 2010, Stefan Berger wrote:
...
On Thu, 2010-05-27 at 15:37 +0200, Arnd Bergmann wrote:
...
...
I'm even more confused now. Why should the response be different
from the response we get from the kernel? What's different on the
sending side other than the PID?
Also, what is the RTMGRP_LINK argument used for? I thought we don't
need multicast any more because we don't target kernel and lldpad
in the same message but only one of them.
Fact is that if I set RTMGRP_LINK to 0 on the libvirt side only, the
dummy server doesn't get a message. If I set it to 0 on both sides, the
dummy server also doesn't get a message. I think it's necessary for
user-space-to-user-space communication, but I am also only learning
about these types of sockets as I go.
AFAICT, RTMGRP_LINK makes it a multicast message, which means that
anyone can receive it, while a unicast message will only be sent
to the task that has a matching pid.
Right. So, yes, if I read the pid of lldpad from /var/run/lldpad.pid
then I can direct the netlink message and nlComm() becomes common code
for talking to the kernel or another userspace process.
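
To illustrate the unicast addressing described above, here is a minimal sketch. The helper names read_pid_file() and fill_unicast_dest() are hypothetical and are not taken from the patch. It also assumes that lldpad binds its netlink socket using its process ID as the netlink port ID, which the thread does not confirm.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Hypothetical helper: read a PID from a pid file such as
 * /var/run/lldpad.pid. Returns 0 on success, -1 on failure. */
static int read_pid_file(const char *path, pid_t *pid)
{
    FILE *fp = fopen(path, "r");
    long val = 0;
    int ok;

    if (!fp)
        return -1;
    ok = (fscanf(fp, "%ld", &val) == 1 && val > 0);
    fclose(fp);
    if (!ok)
        return -1;
    *pid = (pid_t)val;
    return 0;
}

/* Hypothetical helper: fill a unicast netlink destination address.
 * nl_pid == 0 addresses the kernel; a non-zero value addresses the
 * user-space socket bound with that port ID. nl_groups stays 0, so
 * no multicast group such as RTMGRP_LINK is involved. */
static void fill_unicast_dest(struct sockaddr_nl *dst, pid_t pid)
{
    memset(dst, 0, sizeof(*dst));
    dst->nl_family = AF_NETLINK;
    dst->nl_pid = (unsigned int)pid;
    dst->nl_groups = 0;
}
```

With this split, the same send path could pass pid 0 for the kernel or the value read from the pid file for lldpad; the resulting address would be handed to sendto() or sendmsg() on the netlink socket.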
...
...
...
...
You mean that's not defined in the (pre-)standard?
Yes, the Qbg wire protocol has no need for this, because the
messages are only sent after the state has changed. We never
see a VDP message with an incomplete status in there, so there
is no need to specify it in Qbg, but we need something in the netlink
protocol.
Yes, I would suggest to mimic 802.1Qbh in this case.
Like what? The first 256 result numbers in the netlink protocol
are meant to be the same as the ones in the wire protocol, the
next 256 are for the Qbh protocol.
I did not look carefully or differentiate between the error codes of
802.1Qbg and 802.1Qbh, but took them as shared between both technologies.
...
We could of course define yet another range for the inprogress
result and possibly future extensions, starting at 512 if you
insist on requiring the IFLA_PORT_RESPONSE.
No, I'll fake it for now with the INPROGRESS value of 802.1Qbh.
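
As a rough illustration of the numbering discussed above (0..255 for the 802.1Qbg wire codes, 256..511 for 802.1Qbh, and 512 upward as the range proposed for future extensions), a classifier could look like the sketch below. The enum and function names are hypothetical, and the 16-bit width of the response value is an assumption.

```c
#include <stdint.h>

/* Hypothetical names; the ranges follow the discussion in this thread. */
enum response_family {
    RESPONSE_FAMILY_QBG,        /* 0..255: same values as the Qbg wire protocol */
    RESPONSE_FAMILY_QBH,        /* 256..511: Qbh port-profile responses */
    RESPONSE_FAMILY_EXTENSION   /* 512 and up: proposed extension range */
};

/* Map a response code to the range it falls into. */
static enum response_family classify_response(uint16_t code)
{
    if (code < 256)
        return RESPONSE_FAMILY_QBG;
    if (code < 512)
        return RESPONSE_FAMILY_QBH;
    return RESPONSE_FAMILY_EXTENSION;
}
```

Faking the in-progress state with the Qbh value means a Qbg caller has to accept a code from the second range, which is why the ranges are kept separate here rather than merged.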
...
I originally wanted to defer the response until we hear back
from the switch, but that may take a really long time
(over a minute with all the VDP timeouts). That would be
cleaner IMHO, but you may not want to wait that long in libvirt.
802.1Qbh is polling for max. 10 seconds. So we need to tune this
to an even higher value for 802.1Qbg?
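
One way to keep the polling budget tunable per technology is to pass the number of tries and the interval as parameters. The following is only a sketch: all names are hypothetical, and the status callback stands in for whatever RTM_GETLINK query the real code performs.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

enum port_status { STATUS_INPROGRESS, STATUS_SUCCESS, STATUS_ERROR };
enum poll_result { POLL_DONE, POLL_FAILED, POLL_TIMEOUT };

/* Callback that queries the current status once. */
typedef enum port_status (*status_fn)(void *opaque);

/* Query up to max_tries times, sleeping interval_ms between attempts. */
static enum poll_result poll_port_status(status_fn query, void *opaque,
                                         unsigned int max_tries,
                                         long interval_ms)
{
    struct timespec delay;
    unsigned int i;

    delay.tv_sec = interval_ms / 1000;
    delay.tv_nsec = (interval_ms % 1000) * 1000000L;

    for (i = 0; i < max_tries; i++) {
        switch (query(opaque)) {
        case STATUS_SUCCESS:
            return POLL_DONE;
        case STATUS_ERROR:
            return POLL_FAILED;
        case STATUS_INPROGRESS:
            break;
        }
        if (i + 1 < max_tries)
            nanosleep(&delay, NULL);   /* no sleep after the final try */
    }
    return POLL_TIMEOUT;
}

/* Stand-in for a real status query: reports success on the third call.
 * opaque points to an int call counter. */
static enum port_status succeed_after_three(void *opaque)
{
    int *calls = opaque;

    return ++(*calls) >= 3 ? STATUS_SUCCESS : STATUS_INPROGRESS;
}
```

A caller could then use, say, 80 tries at 125 ms for a 10 second budget and a larger count where the VDP timeouts make a longer wait necessary; those particular numbers are illustrative only.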
...
...
...
...
...
what we should return there? Should we possibly just leave out
IFLA_PORT_RESPONSE in order to signal INPROGRESS, as in not clear yet?
We want to be able to signal failure in case the switch setup failed so
that we don't have a malfunctioning network interface where the user
then doesn't know what to debug. In case of failure we simply wouldn't
start the VM or hotplug the interface and return an indication that
something went wrong during the negotiation with the switch.
So leaving out IFLA_PORT_RESPONSE would work for that, right?
We need to poll for the current status. At least in 802.1Qbh case using
an RTM_GETLINK type of message. Don't know how we could leave out
IFLA_PORT_RESPONSE...
The response can simply contain all attributes of the request
but not contain an IFLA_PORT_RESPONSE attribute inside of the IFLA_VF_PORT.
Yes, understood now.
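
A sketch of how a receiver might treat a missing IFLA_PORT_RESPONSE as "still in progress" when walking the attributes nested inside IFLA_VF_PORT. The function name is hypothetical, the buffer is assumed to be the payload of the nested attribute, and the response value is assumed to be 16 bits wide in host byte order.

```c
#include <stdint.h>
#include <string.h>
#include <linux/netlink.h>
#include <linux/if_link.h>

/* Walk a flat run of netlink attributes.
 * Returns  1 and stores the value if IFLA_PORT_RESPONSE is present,
 *          0 if it is absent (interpreted here as "in progress"),
 *         -1 if the buffer is malformed. */
static int find_port_response(const void *buf, size_t len, uint16_t *code)
{
    const unsigned char *p = buf;

    while (len >= (size_t)NLA_HDRLEN) {
        struct nlattr hdr;
        size_t step;

        memcpy(&hdr, p, sizeof(hdr));
        if (hdr.nla_len < (size_t)NLA_HDRLEN || hdr.nla_len > len)
            return -1;

        if ((hdr.nla_type & NLA_TYPE_MASK) == IFLA_PORT_RESPONSE) {
            if (hdr.nla_len < (size_t)NLA_HDRLEN + sizeof(*code))
                return -1;
            memcpy(code, p + NLA_HDRLEN, sizeof(*code));
            return 1;
        }

        /* Attributes are padded to a 4-byte boundary. */
        step = NLA_ALIGN(hdr.nla_len);
        if (step > len)
            break;
        p += step;
        len -= step;
    }
    return 0;
}
```

The three-way return value keeps "no answer yet" distinct from both a definite result and a parse error, so a poller can keep waiting only in the first case.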
...
...
...
attribute for Qbg, which is probably the best approximation we can get.
I think we should just mandate this for the request, and let lldpad
decide on the fake vf number, which it returns to getPortProfileStatus.
Just be a bit more clear about the exact parameter to pass.
I'd suggest doing a request with
vfinfo_attr[IFLA_VF_MAC] = { .vf = -1u, .mac = OUR_MAC };
vfinfo_attr[IFLA_VF_VLAN] = { .vf = -1u, .vlan = OUR_VLAN };
vf_port_attr[IFLA_PORT_VF] = NULL; /* don't send this at all */
vf_port_attr[IFLA_PORT_RESPONSE] = NULL; /* also don't send this */
vf_port_attr[IFLA_PORT_*] = WHATEVER_WE_NEED;
and waiting for the receiver to choose a free VF and ack this message with
I think the typical ACK in a netlink message on an RTM_SETLINK is just
an error code indicating whether the message was successfully processed
or not. The subsequent RTM_GETLINK would then return what you show here.
...
vfinfo_attr[IFLA_VF_MAC] = { .vf = CHOSEN_VF, .mac = OUR_MAC };
vfinfo_attr[IFLA_VF_VLAN] = { .vf = CHOSEN_VF, .vlan = OUR_VLAN };
vf_port_attr[IFLA_PORT_VF] = { CHOSEN_VF }; /* to identify the vf_port_attr */
vf_port_attr[IFLA_PORT_RESPONSE] = NULL; /* no response yet */
vf_port_attr[IFLA_PORT_RESPONSE] = PORT_RESPONSE_INPROGRESS; /* alternative, if you prefer */
vf_port_attr[IFLA_PORT_*] = WHATEVER_WE_REQUESTED;
I don't care which of the two responses we put in there, we just need to
make a decision here since we forgot to define one before.
I don't do much with the returned vf so for now I don't read it.
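
For reference, the plain netlink acknowledgement of an RTM_SETLINK mentioned earlier arrives as an NLMSG_ERROR message whose error field is 0 on success or a negative errno on failure. Below is a minimal sketch of checking it; the function name is hypothetical, and the buffer is assumed to hold one complete message as received from the socket.

```c
#include <errno.h>
#include <string.h>
#include <linux/netlink.h>

/* Parse a netlink ACK.
 * Returns 0 and stores the peer's status in *error (0 = success,
 * negative errno = failure) if the message is a well-formed
 * NLMSG_ERROR; returns -1 otherwise. */
static int parse_netlink_ack(const void *buf, size_t len, int *error)
{
    struct nlmsghdr hdr;
    struct nlmsgerr err;

    if (len < sizeof(hdr))
        return -1;
    memcpy(&hdr, buf, sizeof(hdr));

    if (hdr.nlmsg_type != NLMSG_ERROR)
        return -1;
    if (hdr.nlmsg_len < NLMSG_LENGTH(sizeof(err)) || hdr.nlmsg_len > len)
        return -1;

    memcpy(&err, (const unsigned char *)buf + NLMSG_HDRLEN, sizeof(err));
    *error = err.error;
    return 0;
}
```

An ACK with error 0 only says the request was accepted; the outcome of the switch negotiation still has to come from a later RTM_GETLINK.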
...
When waiting for the request to complete, we can then check for the
specific vf returned from lldpad (or the kernel, if we ever need to go there).
I have a v11 that hopefully does most of the above correctly now. I will
send it to you and Scott privately first before generating more noise on
the list.

    Stefan
...
Arnd