Re: frequent network collapse possibly due to bridging

Monday, 24 January 2022

On 1/24/22 4:35 AM, Martin Kletzander wrote:
...
 On Fri, Jan 21, 2022 at 08:42:58AM -0600, Hakan E. Duran wrote:
> Hi,
>
> I would like some help to troubleshoot the problem I have been having
> lately with my VM host, which contains 5 VMs, one of which is for
> pi-hole, unbound services. It has been a relatively common occurrence in
> the last few weeks for me to find that the host machine has lost its
> network when I get back home from work. Restoring the VM/VMs does not fix
> the problem; the host needs to be restarted for a fix. Otherwise there
> is loss of both name resolution and the internet connection; I
> cannot ping even IPs such as 8.8.8.8. Since I use the pi-hole VM as the
> DNS server for my LAN, this means that my whole LAN gets disconnected from
> the internet until the host machine is rebooted. The host machine has a
> little complicated network setup: the two gigabit connections are bonded
> and bridged to the VMs; however this set up has been serving me so well
> for several years now. The problem, on the other hand, appeared a few
> weeks ago. This doesn't happen every day but often enough to be annoying
> and disruptive for my family.
>

 Always good to check what has changed those weeks ago, but I understand
 it is difficult to find out what you were updating and where.

> My question is, how can I troubleshoot this problem and figure out
> whether it is truly due to network bridging somehow collapsing or not? I
> tried to find some log files but all I could find were the
> /var/log/libvirt/qemu/$VM files, and the particular log file for the
> pi-hole VM reported the following lines; however, I am not sure if they are
> associated with a real crash or just due to shutting down and restarting
> the host (please excuse the word-wrapping):
>
> char device redirected to /dev/pts/2 (label charserial0)
> qxl_send_events: spice-server bug: guest stopped, ignoring
> 2022-01-20T23:41:17.012445Z qemu-system-x86_64: terminating on signal 15 from pid 1 (/sbin/init)

 That is probably from restarting the host, since the process got
 SIGTERM'd by init.  Maybe it was restarted at a bad time and there is
 some inconsistency on the disk?  Using something like libvirt-guests,
 which can manage your machines when rebooting, would be a good idea.

> 2022-01-20 23:41:17.716+0000: shutting down, reason=crashed
> 2022-01-20 23:42:46.059+0000: starting up libvirt version: 7.10.0, qemu version: 6.2.0, kernel: 5.10.89-1-MANJARO, hostname: -redacted-
>
> Please excuse my ignorance but is there a way to restart the
> networking without rebooting the host machine? This will not solve my

 You can do:

 virsh net-destroy <network_name>
 virsh net-start <network_name>

 but depending on what the network looks like, how it is set up etc. you
 might need to restart some of the VMs or manually plug them in.

The connection between any guest tap device and a host bridge device 
will be broken by virsh net-destroy, and not restored by virsh net-start 
(because the network driver has no good way of notifying the QEMU driver 
that it has restarted a network). This is something that's been on my 
"list of annoying things I should fix some day" for a very long time, 
but I've never been motivated enough to figure out a clean solution.

In the meantime, if you destroy/start a network, you can get all the 
guest tap devices reconnected by restarting libvirtd:

    systemctl restart libvirtd.service

or if you're using split daemons:

    systemctl restart virtqemud.service
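
Put together, the whole sequence can be sketched as a small shell
function. This is only a sketch under a few assumptions: the function
name bounce_network and the RUN dry-run variable are made up for
illustration, and the network name you pass in is whatever
"virsh net-list" reports on your host.

```shell
#!/bin/sh
# Sketch: bounce a libvirt network, then restart the daemon so the
# QEMU driver reconnects the guest tap devices to the bridge.
# RUN is a hypothetical dry-run hook: set RUN=echo to only print commands.
RUN=${RUN:-}

bounce_network() {
    net=$1                          # libvirt network name (placeholder)
    daemon=${2:-libvirtd.service}   # pass virtqemud.service for split daemons
    $RUN virsh net-destroy "$net" || return 1
    $RUN virsh net-start "$net" || return 1
    # The daemon restart is what triggers the tap reconnection.
    $RUN systemctl restart "$daemon"
}
```

For example, "RUN=echo; bounce_network default" prints the three
commands without running them, and "bounce_network default
virtqemud.service" would be the split-daemon variant.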

One of the things the QEMU driver does when it's initializing is to 
check where each guest tap device *should* be connected, compare that to 
where it *is* connected, and if those don't match then fix it.
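
If you want to do a similar check by hand, something along these lines
might help. It is not what libvirt runs internally, just a rough
illustration of the same "expected vs. actual" comparison using the
"master" symlink that the kernel exposes in sysfs for a device attached
to a bridge. The function names and the SYSFS_NET override are
hypothetical, and vnet0/br0 in the usage note are placeholder names.

```shell
#!/bin/sh
# SYSFS_NET is a hypothetical override for testing; it defaults to the
# real sysfs location.
SYSFS_NET=${SYSFS_NET:-/sys/class/net}

# Print the bridge a device is attached to, or nothing if unattached.
current_bridge() {
    link="$SYSFS_NET/$1/master"
    if [ -L "$link" ]; then
        basename "$(readlink "$link")"
    fi
}

# Compare the expected bridge with the actual one for a single tap device.
check_tap() {
    tap=$1
    expected=$2
    actual=$(current_bridge "$tap")
    if [ "$actual" = "$expected" ]; then
        echo "$tap: ok ($expected)"
    else
        echo "$tap: expected $expected, found ${actual:-none}"
        return 1
    fi
}
```

The tap device names for a guest can be read from the Interface column
of "virsh domiflist <domain>"; for a bridge-type interface the Source
column should be the bridge name. You would then run, for example,
"check_tap vnet0 br0".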
