
On Sat, May 22, 2010 at 12:17:05PM -0700, Scott Feldman wrote:
On 5/22/10 11:34 AM, "Dave Allan" <dallan@redhat.com> wrote:
On Sat, May 22, 2010 at 11:14:20AM -0400, Stefan Berger wrote:
On Fri, 2010-05-21 at 23:35 -0700, Scott Feldman wrote:
On 5/21/10 6:50 AM, "Stefan Berger" <stefanb@linux.vnet.ibm.com> wrote:
This patch may get 802.1Qbh devices working. I am adding some code to poll for the status of an 802.1Qbh device and loop for a while until the status indicates success. This part for sure needs more work and testing...
I think we can drop this patch 3/3. For bh, we don't want to poll for status because it may take a while before a status other than in-progress is indicated. Link UP on the eth is the async notification of status=success.
The idea was to find out whether the association actually worked and, if not, either fail the start of the VM or skip hotplugging the interface. If we don't do that, the user may end up with a VM that has no connectivity (depending on how the switch handles an un-associated VM) and start debugging all kinds of things... Really, I would like to know if something went wrong. How long would we have to wait for the status to change? How does a switch handle traffic from a VM if the association failed? At least for 802.1Qbg we were going to get failure notification.
I tend to agree that we should try to get some indication of whether the association succeeded or failed. Is the time that we would have to poll bounded by anything, or is it reasonably short?
It's difficult to put an upper bound on how long to poll. In most cases, status would be available in a reasonably short period of time, but the upper bound depends on activity external to the host.
That makes sense. The timeout should be a configurable value. What do you think is a reasonable default?
Mostly I'm concerned about the failure case: how would the user know that something has gone wrong, and where would information to debug the problem appear?
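To make the bounded poll concrete, here is a minimal sketch; qbh_get_port_status() is a hypothetical stand-in for whatever netlink query the real patch would issue, and the 10 s default and 250 ms poll interval are illustrative values, not ones taken from the patch:

/* Minimal sketch of the bounded status poll under discussion.
 * qbh_get_port_status() is a hypothetical stand-in for the real
 * status query; timeout and interval values are illustrative. */
#include <stdio.h>
#include <time.h>

enum port_status { STATUS_IN_PROGRESS, STATUS_SUCCESS, STATUS_FAILURE };

/* Hypothetical: query the current association status of the port. */
static enum port_status qbh_get_port_status(const char *ifname)
{
    (void)ifname;
    return STATUS_SUCCESS; /* stubbed for the sketch */
}

/* Poll until the switch reports success/failure or the (configurable)
 * timeout expires.  Returns 0 on success, -1 on failure or timeout. */
static int qbh_wait_for_association(const char *ifname, unsigned timeout_ms)
{
    const unsigned interval_ms = 250;
    unsigned waited = 0;
    struct timespec ts = { 0, interval_ms * 1000000L };

    for (;;) {
        switch (qbh_get_port_status(ifname)) {
        case STATUS_SUCCESS:
            return 0;
        case STATUS_FAILURE:
            fprintf(stderr, "%s: association failed\n", ifname);
            return -1;
        case STATUS_IN_PROGRESS:
            break;
        }
        if (waited >= timeout_ms) {
            fprintf(stderr, "%s: association still in progress after %u ms\n",
                    ifname, timeout_ms);
            return -1; /* caller decides: fail VM start or skip hotplug */
        }
        nanosleep(&ts, NULL);
        waited += interval_ms;
    }
}

int main(void)
{
    /* 10 s is an illustrative default; the point above is that it
     * should be configurable rather than hard-coded. */
    return qbh_wait_for_association("eth0", 10000) ? 1 : 0;
}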
Think of it as equivalent to waiting for link UP after plugging a physical cable into a physical switch port. In some cases negotiation of the link may take on the order of seconds; it depends on the physical media, of course. A user can check for link UP using ethtool or the ip cmd. Similarly, a user can check for association status using the ip cmd, once we extend it to know about virtual ports (patch for ip cmd coming soon).
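For reference, the same link state that ethtool and ip report can also be read programmatically on Linux from sysfs; a small sketch, assuming an interface named eth0 (note that reading 'carrier' fails with EINVAL while the interface is administratively down):

/* Sketch: read the link (carrier) state that ethtool/ip report,
 * via /sys/class/net/<dev>/carrier on a Linux host. */
#include <stdio.h>

/* Returns 1 if link is up, 0 if down, -1 on error. */
static int link_is_up(const char *ifname)
{
    char path[128];
    char buf[4] = "";
    FILE *f;

    snprintf(path, sizeof(path), "/sys/class/net/%s/carrier", ifname);
    f = fopen(path, "r");
    if (!f)
        return -1;
    if (!fgets(buf, sizeof(buf), f)) {
        fclose(f); /* EINVAL here if the interface is admin down */
        return -1;
    }
    fclose(f);
    return buf[0] == '1';
}

int main(void)
{
    int up = link_is_up("eth0");
    if (up < 0)
        perror("carrier");
    else
        printf("eth0 link is %s\n", up ? "UP" : "DOWN");
    return 0;
}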
That's the way I was thinking about it as well. The difference I see between an actual physical cable and what we're doing here is that if you're in the data center and you plug in a cable, you're focused on whether the link comes up. Here, the actor is likely to be an automated process, and users will simply be presented with a VM with no or incorrect connectivity, and they will have no idea what happened. It's just not supportable to provide them with no indication of what failed or why.

Dave