On 07/11/2013 03:18 PM, Richard Weinberger wrote:
> On 10.07.2013 11:42, Gao feng wrote:
>> On 07/10/2013 03:23 PM, Richard Weinberger wrote:
>>> On 10.07.2013 09:03, Gao feng wrote:
>>>> On 07/10/2013 02:00 PM, Richard Weinberger wrote:
>>>>
>>>>>>
>>>>>> Yes, actually libvirt did bring up the veth devices, that's why only veth2 & veth5 are down.
>>>>>
>>>>> Where does libvirt bring up the devices? The debug log does not contain any "ip link set dev XXX up" commands.
>>>>> Also, in src/util/virnetdevveth.c I'm unable to find such an ip command.
>>>>>
>>>>
>>>> virLXCProcessSetupInterfaceBridged calls virNetDevSetOnline.
>>>
>>> Ahhhh, it's using an ioctl().
>>>
>>>>>> I need to know why these two devices are down. I believe they were up; your bridge and default-net
>>>>>> look good. So please show me your kernel messages (dmesg); maybe they can give us some useful information.
>>>>>
>>>>> This time veth4 and 5 are down.
>>>>>
>>>>> ---cut---
>>>>
>>>>> [ 44.158209] IPv6: ADDRCONF(NETDEV_UP): veth4: link is not ready
>>>>> [ 44.473317] IPv6: ADDRCONF(NETDEV_CHANGE): veth4: link becomes ready
>>>>> [ 44.473400] virbr0: topology change detected, propagating
>>>>> [ 44.473407] virbr0: port 5(veth4) entered forwarding state
>>>>> [ 44.473423] virbr0: port 5(veth4) entered forwarding state
>>>>
>>>> veth4 was up here
>>>>
>>>>> [ 44.566186] device veth5 entered promiscuous mode
>>>>> [ 44.571234] IPv6: ADDRCONF(NETDEV_UP): veth5: link is not ready
>>>>> [ 44.571243] virbr0: topology change detected, propagating
>>>>> [ 44.571250] virbr0: port 6(veth5) entered forwarding state
>>>>> [ 44.571261] virbr0: port 6(veth5) entered forwarding state
>>>>> [ 44.902308] IPv6: ADDRCONF(NETDEV_CHANGE): veth5: link becomes ready
>>>>> [ 45.000580] virbr0: port 5(veth4) entered disabled state
>>>>
>>>> and then it went down.
>>>>
>>>>> [ 45.348548] virbr0: port 6(veth5) entered disabled state
>>>>
>>>> So something disables veth4 and veth5.
>>>> I don't know in which case these two devices would be disabled.
>>>>
>>>> I still can't reproduce this problem in my test bed :(
>>>> I need more information to analyse why these two devices are being disabled.
>>>>
>>>> So, can you run a kernel with the debug patch below?
>>>>
>>>> diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c
>>>> index d45e760..aed319b 100644
>>>> --- a/net/bridge/br_stp_if.c
>>>> +++ b/net/bridge/br_stp_if.c
>>>> @@ -103,7 +103,7 @@ void br_stp_disable_port(struct net_bridge_port *p)
>>>>          p->state = BR_STATE_DISABLED;
>>>>          p->topology_change_ack = 0;
>>>>          p->config_pending = 0;
>>>> -
>>>> +        dump_stack();
>>>>          br_log_state(p);
>>>>          br_ifinfo_notify(RTM_NEWLINK, p);
>>>>
>>>> diff --git a/net/core/dev.c b/net/core/dev.c
>>>> index faebb39..9b1617b 100644
>>>> --- a/net/core/dev.c
>>>> +++ b/net/core/dev.c
>>>> @@ -1368,6 +1368,7 @@ static int dev_close_many(struct list_head *head)
>>>>
>>>>          list_for_each_entry(dev, head, unreg_list) {
>>>>                  rtmsg_ifinfo(RTM_NEWLINK, dev, IFF_UP|IFF_RUNNING);
>>>> +                dump_stack();
>>>>                  call_netdevice_notifiers(NETDEV_DOWN, dev);
>>>>          }
>>>>
>>>> @@ -4729,8 +4730,10 @@ void __dev_notify_flags(struct net_device *dev, unsigned int old_flags)
>>>>          if (changes & IFF_UP) {
>>>>                  if (dev->flags & IFF_UP)
>>>>                          call_netdevice_notifiers(NETDEV_UP, dev);
>>>> -                else
>>>> +                else {
>>>> +                        dump_stack();
>>>>                          call_netdevice_notifiers(NETDEV_DOWN, dev);
>>>> +                }
>>>>          }
>>>>
>>>> if (dev->flags & IFF_UP &&
>>>>
>>>>
>>>> Thanks!
>>>>
>>>
>>> There you go:
>>>
>>
>> Thank you very much.
>>
>>> [ 129.084408] CPU: 1 PID: 4473 Comm: ip Not tainted 3.10.0+ #20
>>> [ 129.084412] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
>>> [ 129.084415] ffff88003760d000 ffff88003ce7f798 ffffffff8172b2a6 ffff88003ce7f7b8
>>> [ 129.084419] ffffffff8154be04 ffff88003760d000 0000000000001103 ffff88003ce7f7e8
>>> [ 129.084422] ffffffff8154be60 0000000000000010 ffff88003760d000 ffff88003ce7f918
>>> [ 129.084426] Call Trace:
>>
>>> [ 129.084821] virbr0: port 6(veth5) entered disabled state
>>>
>>
>> I can now confirm that it's the ip command that disables the veth device,
>> but I still don't know who calls ip and why.
>>
>> I searched the libvirt code; there is no code calling "ip link set xxx down".
>>
>> It's so strange...
>>
>> One suggestion: modify the code of ip (git://git.kernel.org/pub/scm/linux/kernel/git/shemminger/iproute2.git)
>> to read /proc/<getppid>/comm in order to trace which command calls ip.
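
A rough sketch of that idea, in case it helps someone else debugging the same thing. The helper names read_comm and log_parent are hypothetical and are not part of iproute2. The sketch assumes a Linux /proc and that a call to log_parent() would be added near the start of ip's main().

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy the command name of process `pid` from
 * /proc/<pid>/comm into buf. Returns 0 on success, -1 on failure. */
int read_comm(pid_t pid, char *buf, size_t len)
{
    char path[64];
    FILE *f;

    if (len == 0)
        return -1;

    snprintf(path, sizeof(path), "/proc/%ld/comm", (long)pid);
    f = fopen(path, "r");
    if (!f)
        return -1;

    if (!fgets(buf, (int)len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);

    buf[strcspn(buf, "\n")] = '\0';  /* strip the trailing newline */
    return 0;
}

/* Hypothetical helper: report which process invoked the current one. */
void log_parent(FILE *out)
{
    char comm[64];
    pid_t ppid = getppid();

    if (read_comm(ppid, comm, sizeof(comm)) == 0)
        fprintf(out, "called by pid %ld (%s)\n", (long)ppid, comm);
    else
        fprintf(out, "called by pid %ld (comm unavailable)\n", (long)ppid);
}
```

This only shows the direct parent. To see the whole chain, one would have to walk up through the PPid field of /proc/<pid>/status, or simply dump the process tree from a wrapper script.
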
>
> This morning I installed a wrapper around ip that shows me the process tree whenever "ip link ... down" is used.
> The log showed this:
>
> 769 ? Ss 0:00 /usr/lib/systemd/systemd-udevd
> 17759 ? S 0:00 \_ /usr/lib/systemd/systemd-udevd
> 17764 ? S 0:00 \_ /usr/lib/systemd/systemd-udevd
> 17772 ? S 0:00 \_ /usr/lib/systemd/systemd-udevd
> 19477 ? S 0:00 | \_ /bin/bash /sbin/ifdown veth5 -o hotplug
> 19910 ? S 0:00 | \_ /sbin/ip link set dev veth5 down
>
> Now I have the urge to use a "Kantholz". ;-)
>
hmmm...
it's systemd... I have no idea now... :(
TBH it is not systemd's fault.
openSUSE's /usr/lib/udev/rules.d/77-network.rules did not whitelist veth* devices.
Therefore systemd-udevd called ifup/ifdown and other hotplug magic.
Thanks,
//richard