Problem calling 'virsh' in a script

On 5/15/22 11:48 AM, Digimer wrote:
Hi all,
I've got a series of programs that monitor various things on a CentOS Stream 8 VM host. All of these scripts work when called directly. However, when I have a parent program that calls all the little programs in series, I found that some virsh calls hang.
Is your script being called from a libvirt "hook" script? (https://libvirt.org/hooks.html) If so, that won't work: a libvirt hook script is called from within libvirt, and can't call back into libvirt. Other than that, is there anything different about the context the script is being run from vs. the context you're directly running virsh from?
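One way to compare the two contexts is to print the same few facts from an interactive shell and from inside the script as the parent program runs it, then diff the results. The following is only a sketch: the function name is made up for illustration, and the two variables shown are the ones virsh and libvirt consult for a default URI.

```shell
#!/bin/sh
# Hypothetical helper: print the parts of the execution context that most
# often differ between an interactive shell and a script run by a parent.
dump_virsh_context() {
    # Numeric user ID the script is really running as.
    echo "uid=$(id -u)"
    # Default-URI variables; "<unset>" means the variable is not set or empty.
    echo "LIBVIRT_DEFAULT_URI=${LIBVIRT_DEFAULT_URI:-<unset>}"
    echo "VIRSH_DEFAULT_CONNECT_URI=${VIRSH_DEFAULT_CONNECT_URI:-<unset>}"
}
```

Calling dump_virsh_context once by hand and once from the monitoring script (redirected to a log file) should make any difference in user or environment visible. Running 'virsh uri' in both places would additionally show which URI virsh actually picks.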
Initially, there were two scripts that were hanging repeatedly. One called 'virsh net-list --all --name', so I changed it to check for configs in '/etc/libvirt/qemu/networks/', and that script started working. The other script though calls 'virsh list --all', and that can't be easily swapped out, so I really need to find the source of these hangs.
Whenever the hang happens, about 30~45 seconds later, I see 'libvirtd[1643714]: Cannot recv data: Connection reset by peer'.
I think the issue is striking other scripts that run, but this scenario is happening predictably and consistently right now.
I thought it might be a concurrent connect limit or a problem with how many times virsh is called by a script, so I wrote a test script that kept calling 'virsh list --all' each second, but it made close to 100 calls without hanging, far more than all the calls in my scripts combined, so I don't think that's it.
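The test script itself was not posted. A repeat test of this kind could look roughly like the sketch below, which is an assumption rather than the original: the function name and its arguments are invented, and each call is wrapped in coreutils 'timeout' so that a hang is counted instead of blocking the loop.

```shell
#!/bin/sh
# Hypothetical repeat test (not the original script).
# Usage: repeat_check COUNT INTERVAL_SECS TIMEOUT_SECS COMMAND [ARGS...]
repeat_check() {
    count=$1; interval=$2; limit=$3
    shift 3
    i=1; hung=0; failed=0
    while [ "$i" -le "$count" ]; do
        rc=0
        # timeout(1) exits with status 124 when it had to kill the command.
        timeout "$limit" "$@" >/dev/null 2>&1 || rc=$?
        if [ "$rc" -eq 124 ]; then
            echo "call $i: no answer within ${limit}s" >&2
            hung=$((hung + 1))
        elif [ "$rc" -ne 0 ]; then
            echo "call $i: exit status $rc" >&2
            failed=$((failed + 1))
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    # One-line summary on stdout; per-call details went to stderr.
    echo "hung=$hung failed=$failed"
}
```

For example, 'repeat_check 100 1 20 virsh list --all' would make 100 calls one second apart and report how many exceeded 20 seconds. The 20-second limit is arbitrary.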
Any advice/guidance would be very much appreciated!
--
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould

On 5/15/22 17:48, Digimer wrote:
Hi all,
I've got a series of programs that monitor various things on a CentOS Stream 8 VM host. All of these scripts work when called directly. However, when I have a parent program that calls all the little programs in series, I found that some virsh calls hang.
Initially, there were two scripts that were hanging repeatedly. One called 'virsh net-list --all --name', so I changed it to check for configs in '/etc/libvirt/qemu/networks/', and that script started working. The other script though calls 'virsh list --all', and that can't be easily swapped out, so I really need to find the source of these hangs.
Whenever the hang happens, about 30~45 seconds later, I see 'libvirtd[1643714]: Cannot recv data: Connection reset by peer'.
I think the issue is striking other scripts that run, but this scenario is happening predictably and consistently right now.
I thought it might be a concurrent connect limit or a problem with how many times virsh is called by a script, so I wrote a test script that kept calling 'virsh list --all' each second, but it made close to 100 calls without hanging, far more than all the calls in my scripts combined, so I don't think that's it.
Any advice/guidance would be very much appreciated!
I wonder whether specifying the connection URI explicitly would help. I don't know anything about your app, but if it perhaps clears some env vars (LIBVIRT_DEFAULT_URI, VIRSH_DEFAULT_CONNECT_URI) or runs under a different user than the one you're testing virsh under, then virsh will try to start a session daemon. If you want it to connect to the system URI, specify that explicitly:

  virsh -c qemu:///system list ...

Michal
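In a script, that advice could be applied once in a small wrapper so that every call passes the URI, whatever the environment or user happens to be. This is a minimal sketch, not something taken from the scripts discussed here: the names VIRSH_CMD, VIRSH_URI and virsh_system are invented for illustration.

```shell
#!/bin/sh
# Hypothetical settings; both names are made up for this sketch.
VIRSH_CMD=${VIRSH_CMD:-virsh}           # binary to run
VIRSH_URI=${VIRSH_URI:-qemu:///system}  # always the system daemon by default

# Run virsh with an explicit connection URI instead of relying on
# LIBVIRT_DEFAULT_URI / VIRSH_DEFAULT_CONNECT_URI or the calling user.
virsh_system() {
    "$VIRSH_CMD" -c "$VIRSH_URI" "$@"
}
```

The monitoring scripts would then call 'virsh_system list --all' in place of 'virsh list --all'. Exporting LIBVIRT_DEFAULT_URI=qemu:///system in the parent program may have a similar effect, provided nothing clears it before the child scripts run.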

participants (3)
- Digimer
- Laine Stump
- Michal Prívozník