On 06/05/2014 06:39 PM, Michal Privoznik wrote:
+
+ /* Workaround broken kernel API. If the link is unplugged then
+ * depending on the NIC driver, link speed can be reported as -1.
+ * However, the value is printed out as unsigned integer instead of
+ * signed one. Terrifying but true. */
+ *speed = (int) tmp_speed == -1 ? 0 : tmp_speed;
It appears the problem isn't quite so simple.
On my system (Fedora 20, with a built-in realtek NIC, an Intel
82576-based SRIOV NIC, two bonds, a bridge device, a macvtap device, and
several taps, when I look at /sys/class/net/$ifname/speed, I get the
following results:
(yes, I realized later on that the support you put into libvirt only
reports link state for ethernets. Still it may make sense to report one
or both items for some other types).
* ANY device when ifconfiged down - attempts to read $ifname/speed
return EINVAL.
In addition, when *UP*:
* lo/speed - EINVAL (note that lo self-identifies as type='ethernet' in
netcf for some reason)
* bridge - EINVAL
(operstate seems to always be "down" for bridges, even if the bridge
ifflags show UP|RUNNING.)
* tap device - "10"
* macvtap (physdev is Realtek)
- physdev connected - "1000"
- physdev disconnected - "10" (same thing reported by physdev)
(about operstate - for a macvtap device, if the physdev is disconnected,
the macvtap operstate is "lowerlayerdown", *not* "down")
* bond - "4294967295" (if the physdevs (2 VFs) are disconnected)
* bond - "1000" (2 physdevs connected)
(interesting note - when I looked at the speed of bond0 (tying together
two sriov VFs each with speed = 1000 using mode="802.3ad"), it showed
"1000". At least one time when I ifconfiged both of the VFs down and
back up, bond0/speed became "2000". I later could not reproduce this
behavior.)
(operstate: bonds with mode="activebackup" always show status as
"down"
and speed as "4294967295", but a bond with mode="802.3ad" properly
reflects the physdev state)
* RealTek ethernet
- connected - "1000"
- disconnected - "10" (operstate does show "down")
* sriov-PF (Intel 82576)
- up/disconnected - "65535"
- up/connected - "1000"
* sriov-VF (Intel 82576)
- up/disconnected - "4294967295"
- up/connected - "1000"
So:
1) we need to allow for an EINVAL error, and report the speed as "0" in
that case rather than logging an error.
2) just checking for ((unsigned int)-1) isn't good enough - notice that
at least one driver (igb) returns 65535 when disconnected, and another
returns "10" when disconnected. I think if the operstate is "down",
then
speed should unconditionally be 0.
3) we should probably report what we can for other interface types. In
particular, the status for bond interfaces is useful in at least some
modes of operation, and macvtap interfaces seem to reliably reflect the
link state of their physdev. And maybe we should even detect bridges and
report "unknown" or "indeterminate" or something as the state rather
than "down".
In my current working patch for netcf, I've accounted for 1, 2, and most
of 3, don't know if it's better to leave the operstate of bridges alone
or not. (likewise, at one point I was changing "lowerlayerdown" to
"down", but then decided to leave that alone as well).
Also, there is a larger problem with the XML that I just realized while
making the netcf patch - you are only adding the <link> element to the
toplevel document, so in the case of nested interfaces (e.g. two
ethernets that are together in a bond, and the bond is attached to a
bridge) you will only have it for the master (in the example, that would
be the bridge, but of course you don't add that information for bridge
or bond devices), but it really should be at all levels. For example:
<interface name="br0" type="bridge">
<link state="down" speed="0"/>
<bridge>
<interface name="bond0" type="bond">
<link state="up" speed="2000"/>
<bond>
<interface name="p13p2_5" type="ethernet">
<link state="up" speed="1000"/>
<mac address="ea:8b:8d:18:67:fe"/>
</interface>
<interface name="p13p2_6" type="ethernet">
<link state="up" speed="1000"/>
<mac address="ea:8b:8d:18:67:fe"/>
</interface>
</bond>
</interface>
</bridge>
</interface>
(or, if you're against reporting anything for bridge devices, then
remove that toplevel <link> element)
I've made the netcf patch work this way, but libvirt strips out all but
the toplevel. (note that in the case above, the interface xml for the
ethernets is not available separate from the xml of the bridge - it is
considered to be a single interface.)