On 02/21/2017 12:07 PM, Yunchih Chen wrote:
> On 02/20/2017 09:10 PM, Michal Privoznik wrote:
> > On 17.02.2017 17:18, Yunchih Chen wrote:
> > > `virsh list` hangs on my server that hosts a bunch of VMs.
> > > This might be due to the Debian upgrade I did on Feb 15, which upgraded
> > > `libvirt` from 2.4.0-1 to 3.0.0-2.
> > > I have tried restarting libvirtd a few times, without luck.
> > >
> > > Attached below are some relevant logs; let me know if you need more
> > > for debugging.
> > > Thanks for your help!!
> > >
> > > root@vm-host:~# uname -a
> > > Linux vm-host 4.6.0-1-amd64 #1 SMP Debian 4.6.4-1 (2016-07-18) x86_64 GNU/Linux
> > >
> > > root@vm-host:~# apt-cache policy libvirt-daemon
> > > libvirt-daemon:
> > > Installed: 3.0.0-2
> > > Candidate: 3.0.0-2
> > > Version table:
> > > *** 3.0.0-2 500
> > > 500 http://debian.csie.ntu.edu.tw/debian testing/main amd64 Packages
> > > 100 /var/lib/dpkg/status
> > >
> > > root@vm-host:~# strace -o /tmp/trace -e trace=network,file,poll virsh list # hangs forever .....
> > > ^C
> > > root@vm-host:~# tail -10 /tmp/trace
> > > access("/etc/libvirt/libvirt.conf", F_OK) = 0
> > > open("/etc/libvirt/libvirt.conf", O_RDONLY) = 5
> > > access("/proc/vz", F_OK) = -1 ENOENT (No such file or directory)
> > > socket(AF_UNIX, SOCK_STREAM, 0) = 5
> > > connect(5, {sa_family=AF_UNIX, sun_path="/var/run/libvirt/libvirt-sock"}, 110) = 0
> > > getsockname(5, {sa_family=AF_UNIX}, [128->2]) = 0
> > > poll([{fd=5, events=POLLOUT}, {fd=6, events=POLLIN}], 2, -1) = 1 ([{fd=5, revents=POLLOUT}])
> > > poll([{fd=5, events=POLLIN}, {fd=6, events=POLLIN}], 2, -1) = ? ERESTART_RESTARTBLOCK (Interrupted by signal)
> > > --- SIGINT {si_signo=SIGINT, si_code=SI_KERNEL} ---
> > > +++ killed by SIGINT +++
> > >
> > > root@vm-host:~# lsof /var/run/libvirt/libvirt-sock # hangs too ...
> > This is very suspicious. Looks like the daemon is in some weird state
> > and hence virsh is unable to get list of domains.
> >
> > # ps axf | grep libvirtd
> > # gdb -p $(pgrep libvirtd)
> > (gdb) t a a bt
> >
> > if you could run those commands and share the output that might shed
> > more light.
> >
> > Michal
>
> Unfortunately, gdb also hangs when attaching to libvirtd ....
>
> root@vm-host:~# gdb -q -p $(pgrep libvirtd)
> Attaching to process 9556
> [New LWP 9557]
> [New LWP 9558]
> [New LWP 9559]
> [New LWP 9560]
> [New LWP 9561]
> [New LWP 9562]
> [New LWP 9563]
> [New LWP 9564]
> [New LWP 9565]
> [New LWP 9566]
> [New LWP 9567]
> [New LWP 9568]
> [New LWP 9569]
> [New LWP 9570]
> [New LWP 9571]
> [New LWP 9572]
> ^C^C^C^C^C^C^C^C^C^C^C
>
> gdb must then be killed with SIGKILL, and libvirtd dies along with it.
>
> Here[1] is the output of the following command:
>
> strace -o /tmp/gdb-full-trace.txt -s 1024 -f gdb -q -p $(pgrep libvirtd)
>
> Thanks for your help : )
>
> [1] https://www.csie.ntu.edu.tw/~yunchih/s/gdb-full-trace.txt
>
Any update on this? What other debug info could I provide?
I just enabled libvirtd's debug option, and here[1] is the output.
[1] https://www.csie.ntu.edu.tw/~yunchih/s/libvirtd-debug.log
So that shows many lines like:
2017-02-24 15:20:47.667+0000: 15887: debug : virStorageFileGetMetadataInternal:939 : path=/home/xxxx/install-virt.sh, buf=0x7f1d3810e580, len=1237, meta->format=-1
2017-02-24 15:20:47.667+0000: 15887: debug : virStorageFileProbeFormatFromBuf:808 : path=/home/xxxx/install-virt.sh, buf=0x7f1d3810e580, buflen=1237
2017-02-24 15:20:47.667+0000: 15887: debug : virStorageFileProbeFormatFromBuf:845 : format=1
2017-02-24 15:20:47.667+0000: 15887: debug : virFileClose:108 : Closed fd 19
It looks like you have configured a libvirt storage pool against your
home directory, which is a pretty unusual thing to do. That shouldn't
break things, though it will make the storage pool list hundreds of
irrelevant files that aren't VM disk images.
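
A rough way to see which directories the storage code is walking is to
tally the probed paths in the debug log by directory. This is only a
sketch: the function name probed_dirs is made up for illustration, and it
assumes a local copy of the log in which each entry is a single line with
a "path=...," field, as in the excerpt above.

```shell
# Hypothetical helper: count distinct probed files per directory in a
# libvirtd debug log. Assumes one log entry per line containing
# "virStorageFile..." and a "path=<file>," field.
probed_dirs() {
    grep 'virStorageFile' "$1" \
        | grep -o 'path=[^,]*' \
        | sort -u \
        | sed -e 's|^path=||' -e 's|/[^/]*$||' \
        | sort | uniq -c | sort -rn
}
```

Running something like "probed_dirs libvirtd-debug.log | head" should put
the directory with the most probed files at the top; a home directory
showing up there would be consistent with a pool pointed at it.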
Regards,
Daniel