On Sat, May 22, 2010 at 12:17:05PM -0700, Scott Feldman wrote:
> On 5/22/10 11:34 AM, "Dave Allan" <dallan@redhat.com> wrote:
>> On Sat, May 22, 2010 at 11:14:20AM -0400, Stefan Berger wrote:
>>> On Fri, 2010-05-21 at 23:35 -0700, Scott Feldman wrote:
>>>> On 5/21/10 6:50 AM, "Stefan Berger" <stefanb@linux.vnet.ibm.com> wrote:
>>>>
>>>>> This patch may get 802.1Qbh devices working. I am adding some code to
>>>>> poll for the status of an 802.1Qbh device and loop for a while until
>>>>> the status indicates success. This part for sure needs more work and
>>>>> testing...
>>>>
>>>> I think we can drop this patch 3/3. For bh, we don't want to poll for
>>>> status because it may take a while before a status other than
>>>> in-progress is indicated. Link UP on the eth is the async notification
>>>> of status=success.
>>>
>>> The idea was to find out whether the association actually worked and,
>>> if not, either fail the start of the VM or not hotplug the interface.
>>> If we don't do that, the user may end up having a VM that has no
>>> connectivity (depending on how the switch handles an un-associated VM)
>>> and start debugging all kinds of things... Really, I would like to know
>>> if something went wrong. How long would we have to wait for the status
>>> to change? How does a switch handle traffic from a VM if the
>>> association failed? At least for 802.1Qbg we were going to get failure
>>> notification.
>>
>> I tend to agree that we should try to get some indication of whether
>> the associate succeeded or failed. Is the time that we would have to
>> poll bounded by anything, or is it reasonably short?
> It's difficult to put an upper bound on how long to poll. In most cases,
> status would be available in a reasonably short period of time, but the
> upper bound depends on activity external to the host.

That makes sense. The timeout should be a configurable value. What
do you think is a reasonable default?
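
To make the discussion concrete, here's a rough sketch of the kind of
bounded poll I have in mind. The names, the status enum, and the 10
second default are made up for illustration; the real status query
would go through whatever netlink interface the driver exposes:

#include <unistd.h>

#define ASSOC_POLL_INTERVAL_USEC (250 * 1000)   /* 250 ms between queries */
#define ASSOC_TIMEOUT_SEC_DEFAULT 10            /* user-overridable default */

/* Hypothetical status values; the real ones come from the driver. */
enum assoc_status { ASSOC_IN_PROGRESS, ASSOC_SUCCESS, ASSOC_FAILURE };

/* Placeholder for the actual netlink query of port-profile status. */
extern enum assoc_status query_assoc_status(const char *ifname);

/* Poll until the association leaves in-progress or the timeout expires.
 * Returns 0 on success, -1 on failure or timeout, so the caller can
 * refuse to start the VM (or back out the hotplug) and report why. */
static int
wait_for_association(const char *ifname, unsigned int timeout_sec)
{
    unsigned long elapsed_usec = 0;

    while (elapsed_usec < timeout_sec * 1000000UL) {
        switch (query_assoc_status(ifname)) {
        case ASSOC_SUCCESS:
            return 0;
        case ASSOC_FAILURE:
            return -1;      /* switch rejected the association */
        case ASSOC_IN_PROGRESS:
            break;          /* keep waiting */
        }
        usleep(ASSOC_POLL_INTERVAL_USEC);
        elapsed_usec += ASSOC_POLL_INTERVAL_USEC;
    }
    return -1;              /* timed out, status still in-progress */
}
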
>> Mostly I'm concerned about the failure case: how would the user know
>> that something has gone wrong, and where would information to debug
>> the problem appear?
> Think of it as equivalent to waiting to get link UP after plugging a
> physical cable into a physical switch port. In some cases negotiation of
> the link may take on the order of seconds, depending on the physical
> media, of course. A user can check for link UP using ethtool or the ip
> cmd. Similarly, a user can check for association status using the ip
> cmd, once we extend ip cmd to know about virtual ports (patch for ip cmd
> coming soon).

That's the way I was thinking about it as well. The difference I see
between an actual physical cable and what we're doing here is that if
you're in the data center and you plug in a cable, you're focused on
whether the link comes up. Here, the actor is likely to be an
automated process, and users will simply be presented with a VM with
no or incorrect connectivity, and they will have no idea what
happened. It's just not supportable to provide them with no
indication of what failed or why.
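
For what it's worth, the link UP check is easy enough for a management
tool to do programmatically as well; here's a minimal sketch using the
ETHTOOL_GLINK ioctl (the same query that ethtool's "Link detected"
output is based on):

#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>
#include <linux/ethtool.h>
#include <linux/sockios.h>

/* Returns 1 if the link is up, 0 if it is down, -1 on error. */
static int
link_is_up(const char *ifname)
{
    struct ethtool_value edata = { .cmd = ETHTOOL_GLINK };
    struct ifreq ifr;
    int fd, ret;

    memset(&ifr, 0, sizeof(ifr));
    strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);
    ifr.ifr_data = (char *)&edata;

    if ((fd = socket(AF_INET, SOCK_DGRAM, 0)) < 0)
        return -1;
    ret = ioctl(fd, SIOCETHTOOL, &ifr);
    close(fd);

    return ret < 0 ? -1 : (edata.data ? 1 : 0);
}

Something equivalent for association status is exactly what I'd like
the ip cmd extension (and libvirt) to be able to report.
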
Dave