On 05/30/2014 04:42 PM, Michal Privoznik wrote:
On 30.05.2014 14:35, Daniel P. Berrange wrote:
> On Fri, May 30, 2014 at 01:41:08PM +0200, Michal Privoznik wrote:
>> On 30.05.2014 10:56, Daniel P. Berrange wrote:
>>> On Thu, May 29, 2014 at 10:32:40AM +0200, Michal Privoznik wrote:
>>>> Currently it is not possible to determine the speed of an interface
>>>> and whether a link is actually detected from the API. Orchestrating
>>>> platforms want to be able to determine when the link has failed and,
>>>> where multiple speeds may be available, which one the interface is
>>>> actually connected at. This commit introduces an extension to our
>>>> interface XML (without an implementation in the interface driver backends):
>>>>
>>>>   <interface type='ethernet' name='eth0'>
>>>>     <start mode='none'/>
>>>>     <mac address='aa:bb:cc:dd:ee:ff'/>
>>>>     <link speed='1000' state='up'/>
>>>>     <mtu size='1492'/>
>>>>     ...
>>>>   </interface>
>>>>
>>>> Where @speed is the negotiated link speed in Mbits per second, and
>>>> @state is the current NIC state (can be one of the following: "unknown",
>>>> "notpresent", "down", "lowerlayerdown", "testing", "dormant", "up").
>>>
>>> This is fine for the <interface> objects, but it is limited
>>> in usefulness for SRIOV use cases. The <interface> objects only
>>> exist for interfaces which are configured for the host. With
>>> SRIOV passthrough some of the interfaces we're interested in
>>> are not going to be configured - they're just bare devices
>>> waiting to be given to a guest.
>>
>> I hear what you're saying, but unless a PCI device is given an interface
>> name I am afraid we can't do anything. For instance, if you have a NIC
>> but detach it from the driver (echo ${PCI_ADDR} >
>> /sys/bus/pci/drivers/<driver>/unbind), the kernel still sees the PCI
>> device (it's shown in lspci output, for instance), but it's not
>> configured anymore - the kernel doesn't know the device's link state,
>> hence it's not aware of the NIC's link speed, etc. So tools like
>> ethtool, ip and ifconfig won't show the device.
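
(For what it's worth, that distinction is directly visible in sysfs: while the
NIC is bound to its driver, /sys/bus/pci/devices/<addr>/net/ contains the
name(s) of its netdev(s); after the unbind above that directory is gone, so
there is no interface left for ethtool & friends to query. A small
illustrative sketch, using a made-up PCI address:)

/* Illustrative sketch only: check whether a PCI NIC currently has a
 * kernel netdev. The PCI address below is made up for the example. */
#include <dirent.h>
#include <stdio.h>

int main(void)
{
    const char *path = "/sys/bus/pci/devices/0000:01:00.0/net";
    DIR *dir = opendir(path);
    struct dirent *ent;

    if (!dir) {
        /* unbound from its driver (or not a NIC): no netdev to query */
        printf("no netdev for this PCI device\n");
        return 0;
    }
    while ((ent = readdir(dir)) != NULL) {
        if (ent->d_name[0] == '.')
            continue;
        printf("netdev: %s\n", ent->d_name);   /* e.g. "eth0" */
    }
    closedir(dir);
    return 0;
}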
>
> IIUC, we have three classes of device
>
> 1 Devices not bound - no NIC visible in the host OS
>
> 2 Devices bound but not configured. NIC visible in host OS, but no
> /etc/sysconfig/networking/ifcfg-XXX file
>
> 3 Devices bound and configured. NIC visible in host OS, and has a
> /etc/sysconfig/networking/ifcfg-XXX file
>
> The <interface> configs only let you deal with NIC devices in class 3.
>
> The <nodedev> XML / APIs let you see NIC devices in class 2 + 3.
>
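
(To make the split concrete, here is a minimal sketch against the public API,
with simplified error handling; compile with -lvirt.
virConnectListAllInterfaces() only returns what the interface driver knows
about, i.e. class 3 with netcf, or classes 2+3 with the udev backend mentioned
below, while virConnectListAllNodeDevices() with the "net" capability flag
covers classes 2+3 regardless of configuration:)

/* Illustrative sketch only, error handling trimmed down. */
#include <stdio.h>
#include <stdlib.h>
#include <libvirt/libvirt.h>

int main(void)
{
    virConnectPtr conn = virConnectOpenReadOnly(NULL);
    virInterfacePtr *ifaces = NULL;
    virNodeDevicePtr *devs = NULL;
    int nifaces, ndevs, i;

    if (!conn)
        return EXIT_FAILURE;

    /* Interfaces the interface driver knows about */
    nifaces = virConnectListAllInterfaces(conn, &ifaces, 0);
    for (i = 0; i < nifaces; i++) {
        printf("interface: %s\n", virInterfaceGetName(ifaces[i]));
        virInterfaceFree(ifaces[i]);
    }
    free(ifaces);

    /* NICs visible to the host, whether configured or not */
    ndevs = virConnectListAllNodeDevices(conn, &devs,
                                         VIR_CONNECT_LIST_NODE_DEVICES_CAP_NET);
    for (i = 0; i < ndevs; i++) {
        printf("nodedev: %s\n", virNodeDeviceGetName(devs[i]));
        virNodeDeviceFree(devs[i]);
    }
    free(devs);

    virConnectClose(conn);
    return EXIT_SUCCESS;
}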
Right. The netcf world. Okay, makes sense now. In the udev backend we have
the 2nd and 3rd classes merged into one.

Unfortunately yes. The udev backend was added for distros that didn't
support netcf, and doesn't exactly fit the original semantics of the
virInterface API. In particular, virConnect*Interfaces() lists the class
2 devices, which was not intended to be the case. I guess that's our
fault for adding the ability to report device status into the netcf API
at all - if I recall correctly, it wasn't in the initial requirements of
virInterface (which were only to provide a way to *configure* host
interfaces and report on their configuration), but "somebody" (I forget
who; I could have been one of the guilty parties) requested that it also
provide interface status, and we thought "sounds useful, what harm could
it possibly do?". Then once it became an essential part of
virt-manager's guest network config (in order to get a list of host
interfaces the guest could connect to), distros without netcf felt the
pain of missing this feature that was present in other distros, and saw
the udev backend as a way to sidestep the problems of making a netcf
port. It has definitely been useful, but has kind of messed up the
clarity of the virInterface APIs.