On Wed, Nov 07, 2018 at 14:59:59 +0100, Michal Privoznik wrote:
On 11/07/2018 01:46 PM, Nikola Ciprich wrote:
> Hi fellow libvirt users,
>
> I'd like to ask whether somebody has dealt with a problem similar
> to the one we're hitting. Some libvirt VM operations (e.g.
> fs freeze) are prone to hanging for a long time when the guest
> agent is in a bad state. My question is whether it's possible
> to set a timeout for such operations, or whether we have to handle
> it ourselves, e.g. with a separate thread and timers? We're using
> the Python libvirt bindings.
>
> I'll appreciate any advice
We explicitly chose not to have any timeouts because no one can know how
big the timeout should be: neither libvirt nor the mgmt application. What
I am saying is that whatever timeout of X seconds you set, fs freeze might
still exceed it. And given that Murphy's law holds, the freeze will
finish right after the timeout is reported. The problem with this is that
the domain is then in a different state than libvirt thinks.
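
To illustrate the race described above, here is a minimal sketch in Python. It does not use libvirt at all: FakeGuest and freeze_with_timeout() are hypothetical names made up for this example, and the sleep merely stands in for a slow guest agent. The caller gives up after the timeout, but the operation still completes afterwards, so the caller's view and the real state diverge.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


class FakeGuest:
    """Hypothetical stand-in for a guest; this is not a libvirt object."""

    def __init__(self, freeze_duration):
        self.freeze_duration = freeze_duration
        self.frozen = False

    def fs_freeze(self):
        # Simulate a slow guest agent, then actually freeze.
        time.sleep(self.freeze_duration)
        self.frozen = True


def freeze_with_timeout(guest, timeout):
    """Return (what the caller believes, the still-running future)."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(guest.fs_freeze)
    try:
        future.result(timeout=timeout)
        believed_frozen = True
    except FutureTimeout:
        # The caller gives up here, but the worker thread keeps running.
        believed_frozen = False
    pool.shutdown(wait=False)
    return believed_frozen, future


guest = FakeGuest(freeze_duration=0.3)
believed_frozen, future = freeze_with_timeout(guest, timeout=0.1)
future.result()  # wait for the "late" freeze, for demonstration only
print("caller believes frozen:", believed_frozen)
print("guest actually frozen:", guest.frozen)
```

In this toy setup the caller ends up believing the freeze failed while the guest is in fact frozen, which is exactly the mismatch a timeout would introduce.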
But specifically for qemu guest agent related issues, there is
virDomainQemuAgentCommand() through which you can send 'guest-ping' to
check that the agent is responsive. If the ping fails, don't issue the fs
freeze API; if it succeeds, go ahead.
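
With the Python bindings, the ping-then-freeze idea could look roughly like the sketch below. It assumes the libvirt_qemu module from libvirt-python is installed and that `dom` is a virDomain object you already looked up. The helper names agent_is_responsive() and freeze_if_agent_alive(), the 5 second ping timeout, and the `agent_command` parameter (only there so the helpers can be exercised without a real guest) are my own assumptions, not part of libvirt. Note that this only narrows the window: the agent can still hang between the ping and the freeze.

```python
import json


def agent_is_responsive(dom, timeout=5, agent_command=None):
    """Send 'guest-ping' to the agent; return True only on a valid reply."""
    if agent_command is None:
        # Imported lazily so the module loads without libvirt installed.
        import libvirt_qemu
        agent_command = libvirt_qemu.qemuAgentCommand
    ping = json.dumps({"execute": "guest-ping"})
    try:
        # Arguments: domain, JSON command, timeout in seconds, flags.
        reply = agent_command(dom, ping, timeout, 0)
    except Exception:
        # libvirt raises libvirtError here; caught broadly in this sketch.
        return False
    try:
        return "return" in json.loads(reply)
    except (TypeError, ValueError):
        return False


def freeze_if_agent_alive(dom, timeout=5, agent_command=None):
    """Freeze guest filesystems only if the agent answered the ping.

    Returns the number of frozen filesystems, or None if the ping failed.
    """
    if not agent_is_responsive(dom, timeout, agent_command):
        return None
    # This call is synchronous and has no timeout of its own.
    return dom.fsFreeze()
```

A caller would then treat a None result as "agent not reachable, skip the snapshot or retry later", and must still be prepared for fsFreeze() itself to block.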
Well, internally libvirt actually pings the guest agent prior to issuing
the command, but once the command has been issued, the call is synchronous.
If the calls weren't synchronous, it would be impossible to figure out what
the actual state is without an elaborate event-based infrastructure.
Libvirt's APIs are specifically designed to be synchronous (except those
that are not ... obviously - mostly device hotplug and block jobs).