On 5/15/22 17:48, Digimer wrote:
Hi all,
I've got a series of programs that monitor various things on a CentOS
Stream 8 VM host. All of these scripts work when called directly.
However, when I have a parent program that calls all the little programs
in series, I found that some virsh calls hang.
Initially, there were two scripts that were hanging repeatedly. One
called 'virsh net-list --all --name', so I changed it to check for
configs in '/etc/libvirt/qemu/networks/', and that script started
working. The other script though calls 'virsh list --all', and that
can't be easily swapped out, so I really need to find the source of
these hangs.
Whenever the hang happens, about 30~45 seconds later, I see
'libvirtd[1643714]: Cannot recv data: Connection reset by peer'.
I think the issue is striking other scripts that run, but this
scenario is happening predictably and consistently right now.
I thought it might be a concurrent connect limit or a problem with how
many times virsh is called by a script, so I wrote a test script that
kept calling 'virsh list --all' each second, but it made close to 100
calls without hanging, far more than all the calls in my scripts
combined, so I don't think that's it.
Any advice/guidance would be very much appreciated!
I wonder whether specifying the connection URI explicitly would help. I
don't know anything about your app, but if it clears some env
vars (LIBVIRT_DEFAULT_URI, VIRSH_DEFAULT_CONNECT_URI) or runs under a
different user than the one you're testing virsh under, then virsh will
try to start a session daemon. If you want it to connect to the system
URI, specify that explicitly:
virsh -c qemu:///system list ...
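
If the parent program runs the checks through a shell, one way to make sure
every call uses the same URI is a small wrapper like the sketch below. This
is only an illustration: virsh_system, LIBVIRT_URI and VIRSH_BIN are names
made up for the example (VIRSH_BIN exists only so a stub can stand in for
virsh when testing), and I'm assuming the checks can source a common file.

```shell
#!/bin/sh
# Hypothetical helper, not part of libvirt: pin the connection URI so that
# a cleared environment or a different user cannot change where virsh
# connects.
LIBVIRT_URI="${LIBVIRT_URI:-qemu:///system}"

virsh_system() {
    # VIRSH_BIN is an assumed override used only for testing with a stub.
    "${VIRSH_BIN:-virsh}" -c "$LIBVIRT_URI" "$@"
}

# Example calls, left commented out so sourcing this file has no side effects:
# virsh_system uri          # shows which URI virsh actually connected to
# virsh_system list --all
```

Running 'virsh uri' (without -c) from inside the parent program and again
from your own shell should also show quickly whether the two are talking to
the same daemon.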
Michal