On 2022-05-15 12:07, Laine Stump wrote:
On 5/15/22 11:48 AM, Digimer wrote:
Hi all,

   I've got a series of programs that monitor various things on a CentOS Stream 8 VM host. All of these scripts work when called directly. However, when I have a parent program that calls all the little programs in series, I found that some virsh calls hang.

Is your script being called from a libvirt "hook" script? (https://libvirt.org/hooks.html) If so, that won't work - a libvirt hook script is called from within libvirt, and can't call back into libvirt.

Other than that, is there anything different about the context the script is being run from vs. the context you're directly running virsh from?

It's a perl script making a shell (system) call. So it's basically:

open (my $fh, "/usr/bin/virsh list --all |") or die "Failed to run virsh: $!";
while (<$fh>)    # read the command's output one line at a time
{
    chomp;
    # Do things
}
close $fh;

  There's about 15 programs that are sitting in a given directory. When the parent program runs, it looks at the scripts in the directory and runs them (again as simple shell calls), one after the other. This is where things fail. I'm happy to provide more detail or add debugging if you'd like.
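
One way to compare the two contexts (run directly vs. run from the parent program) is to print a few details from each and diff the results. This is only a rough sketch: dump_context is a made-up helper name, and the fields it prints are guesses at what might differ, not a known cause of the hang.

```shell
# Hypothetical helper: print a few details of the calling context, one per line.
# The choice of fields is an assumption about what could differ between runs.
dump_context() {
    echo "uid=$(id -u)"
    # Is stdin attached to a terminal? A directly run script usually has one.
    if [ -t 0 ]; then
        echo "stdin_tty=yes"
    else
        echo "stdin_tty=no"
    fi
    # libvirt clients read this variable to pick a default connection URI.
    echo "LIBVIRT_DEFAULT_URI=${LIBVIRT_DEFAULT_URI:-unset}"
    echo "PATH=$PATH"
}
```

Calling dump_context once from an interactive shell and once from a script started by the parent program, then comparing the two outputs, would show whether any of these differ.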




   Initially, there were two scripts that were hanging repeatedly. One called 'virsh net-list --all --name', so I changed it to check for configs in '/etc/libvirt/qemu/networks/', and that script started working. The other script, though, calls 'virsh list --all', and that can't be easily swapped out, so I really need to find the source of these hangs.

   Whenever the hang happens, about 30~45 seconds later, I see 'libvirtd[1643714]: Cannot recv data: Connection reset by peer'.

   I think the issue is striking other scripts that run, but this scenario is happening predictably and consistently right now.

   I thought it might be a concurrent connection limit or a problem with how many times virsh is called by a script, so I wrote a test script that kept calling 'virsh list --all' each second, but it made close to 100 calls without hanging, far more than all the calls in my scripts combined, so I don't think that's it.
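
A test loop along those lines could look roughly like the sketch below. It is not the actual test script: probe_cmd is a made-up helper name, the time limit is arbitrary, and it assumes the timeout command from GNU coreutils is installed. Wrapping each call in a time limit means a hang gets reported instead of stalling the loop.

```shell
# Hypothetical helper: run a command under a time limit and report the outcome.
# Usage: probe_cmd SECONDS COMMAND [ARGS...]
probe_cmd() {
    local limit="$1"
    shift
    local rc=0
    # timeout(1) exits with status 124 when the time limit is reached.
    timeout "$limit" "$@" >/dev/null 2>&1 || rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "HUNG after ${limit}s: $*"
    else
        echo "exit $rc: $*"
    fi
    return "$rc"
}

# Example loop (left commented out); the 20-second limit and 100 iterations are arbitrary:
#   i=0; while [ "$i" -lt 100 ]; do probe_cmd 20 /usr/bin/virsh list --all; sleep 1; i=$((i+1)); done
```

Running the same loop from inside the parent program, rather than from an interactive shell, would show whether the hang depends on where virsh is called from.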

Any advice/guidance would be very much appreciated!

-- 
Digimer