On Thu, May 10, 2018 at 2:07 PM, Laine Stump <laine(a)redhat.com> wrote:
> On 05/10/2018 02:53 PM, Ihar Hrachyshka wrote:
>> Hi,
>>
>> In kubevirt, we discovered [1] that whenever e1000 is used for a vNIC,
>> the link on the interface becomes ready several seconds after 'ifup' is
>> executed
> What is your definition of "becomes ready"? Are you looking at the
> output of "ip link show" in the guest? Or are you watching "brctl
> showstp" for the bridge device on the host? Or something else?
I was watching the guest dmesg for the following messages:
[ 4.773275] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 6.769235] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
[ 6.771408] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
For e1000, there are 2 seconds between those messages; for virtio,
it's near instant. Interestingly, the delay only shows up on the very
first ifup; when I bring the interface up a second time after the
guest has booted, it's instant.
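For context, the only difference between the two runs is the <model>
element of the libvirt interface definition. A minimal sketch, with
the interface type and bridge name as placeholders:

  <interface type='bridge'>
    <source bridge='br0'/>
    <model type='virtio'/>    <!-- 'e1000' in the slow case -->
  </interface>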
>> which for some buggy images like cirros may slow down the boot
>> process by up to 1 minute [2]. If we switch from e1000 to virtio, the
>> link is brought up and ready almost immediately.
>>
>> For the record, I am using the following versions:
>> - L0 kernel: 4.16.5-200.fc27.x86_64 #1 SMP
>> - libvirt: 3.7.0-4.fc27
>> - guest kernel: 4.4.0-28-generic #47-Ubuntu
>>
>> Is there something specific about e1000 that makes it initialize the
>> link too slowly on the libvirt or guest side?
> There isn't anything libvirt could do that would cause the link to
> come up (IFF_UP) any faster or slower, so if there is an issue it's
> elsewhere. Since switching to the virtio device eliminates the
> problem, my guess would be that it's something about the
> implementation of the emulated device in qemu that is causing a delay
> in the e1000 driver in the guest. That's just a guess though.
>>
>> [1] https://github.com/kubevirt/kubevirt/issues/936
>> [2] https://bugs.launchpad.net/cirros/+bug/1768955
> (I discount the idea of the STP delay timer having an effect, as
> suggested in one of the github comments that points to my explanation
> of STP in a libvirt bugzilla record, because that would cause the same
> problem for both e1000 and virtio.)
Yes, it's not STP; I also tried explicitly setting all bridge timers
to 0, with no result. I also ran "tcpdump -i any" inside the container
that hosts the VM VIF, and there was no relevant traffic on the tap
device.
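For reference, the bridge knobs I touched were roughly along these
lines, with br0 standing in for the actual bridge device:

  brctl stp br0 off     # disable STP on the bridge
  brctl setfd br0 0     # zero the forward delay timer
  brctl showstp br0     # confirm the timers are now zero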
> I hesitate to suggest this, because the rtl8139 code in qemu is
> considered less well maintained and lower-performing than e1000, but
> have you tried setting that model to see how it behaves? You may be
> forced to make that the default when virtio isn't available.
Indeed, rtl8139 is near instant too:
[ 4.156872] 8139cp 0000:07:01.0 eth0: link up, 100Mbps, full-duplex, lpa 0x05E1
[ 4.177520] 8139cp 0000:07:01.0 eth0: link up, 100Mbps, full-duplex, lpa 0x05E1
Thanks for the tip, we will consider it too (and thanks also for the
background info about the driver support state).
> Another thought - I guess the virtio driver in Cirros is always
> available? Perhaps kubevirt could use libosinfo to auto-decide what
> device to use for networking based on OS.
This, or we can introduce explicit tags for the NIC model / guest type to use.
Thanks a lot for the reply,
Ihar