On 24.06.2011 07:19, Daniel Veillard wrote:
On Wed, Jun 22, 2011 at 11:26:27AM -0600, Eric Blake wrote:
> On 06/22/2011 11:05 AM, Jiri Denemark wrote:
>> On Wed, Jun 22, 2011 at 16:47:18 +0100, Daniel P. Berrange wrote:
>>> If the QEMU process has been stopped (kill -STOP/gdb), or the
>>> QEMU process has live-locked itself, then we will never get a
>>> reply from the monitor. We should not wait forever in this
>>> case, but instead timeout after a reasonable amount of time.
>>>
>>> NB if the host has high CPU load, or a single monitor command
>>> intentionally takes a long time, then this will cause bogus
>>> failures. In the case of high CPU load, arguably the guest
>>> should have been migrated elsewhere, since you can't effectively
>>> manage guests on a host if QEMU is taking > 30 seconds to reply
>>> to simple commands. Since we use background migration, there
>>> should not be any commands which take significant time to
>>> execute anymore.
>>
>> The thing I'm most concerned about is that it is far too easy to get into
>> such situations, especially since the disk cache subsystem in the Linux
>> kernel is not the best thing in the world. While I agree that running
>> guests on a loaded host is not very clever and guests should rather be
>> migrated elsewhere, such a situation doesn't have to be intentional. In
>> other words, in case of a malfunction of some kind (some processes going
>> crazy, network disruptions, ...), QEMU may take longer than the timeout
>> to respond, and we will penalize an innocent QEMU process because we
>> won't be able to control it anymore even after the issues get fixed.
>
> Is there any way to measure time spent by the child process, rather than
> just relying on wall-time elapsed? That is, when libvirt hits 30
> seconds of wall time in waiting for a monitor, can it then check whether
> the child process has accumulated any execution time (likely hung) vs.
> no execution time (likely a starved system situation), and only give up
> in the former case?
Well, a STOP'ed child process won't accumulate any execution time,
so you won't be able to discriminate based on this alone, but I think
we should be able to poke Linux to see if the process is in D state,
for example. And if we do mark the guest as non-responding, then being
able to provide useful error information upon the associated API
failure, like
"Failed to contact domain: process stopped"
"Failed to contact domain: blocked on I/O"
"Failed to contact domain: process looping"
would be a really good thing. That probing and reporting can be done
as a separate step, though.
Daniel
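
[For illustration, a rough, untested sketch of the kind of probe Eric
and Daniel describe above: read /proc/<pid>/stat to get both the
one-letter process state and the accumulated CPU time. This is not
libvirt code; the field layout follows proc(5), and probe_process and
its callers are made up for the example.]

/* On success fills in '*state' ('R', 'S', 'D', 'T', ...) and
 * '*cputicks' (utime + stime, in clock ticks per
 * sysconf(_SC_CLK_TCK)) and returns 0; returns -1 on error. */
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

static int
probe_process(pid_t pid, char *state, unsigned long long *cputicks)
{
    char path[64], buf[4096];
    unsigned long utime, stime;
    char *p;
    FILE *fp;

    snprintf(path, sizeof(path), "/proc/%d/stat", (int)pid);
    if (!(fp = fopen(path, "r")))
        return -1;
    if (!fgets(buf, sizeof(buf), fp)) {
        fclose(fp);
        return -1;
    }
    fclose(fp);

    /* comm (field 2) may contain spaces, so skip past its closing ')' */
    if (!(p = strrchr(buf, ')')) || strlen(p) < 3)
        return -1;
    p += 2;               /* now pointing at the state field (field 3) */
    *state = *p;

    /* fields 4-13 are skipped; utime and stime are fields 14 and 15 */
    if (sscanf(p, "%*c %*d %*d %*d %*d %*d %*u %*u %*u %*u %*u %lu %lu",
               &utime, &stime) != 2)
        return -1;

    *cputicks = (unsigned long long)utime + stime;
    return 0;
}

[A caller that has already waited out the timeout could then
distinguish Daniel's three cases: 'T' maps to "process stopped", 'D'
to "blocked on I/O", and CPU time that kept advancing while the
monitor stayed silent suggests "process looping". No CPU time and no
suspicious state would point at a starved host, as Eric suggests.]
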
To me this looks like solving the halting problem. That means that for
some cases we might be able to tell that qemu will not answer anymore,
but for others we will not.
I agree that if qemu (and thus a libvirt API call) does not return in
~30 seconds, users get anxious, but it would be nice if we could at
least send destroy to unresponsive domains.
Michal
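
[A minimal client-side sketch of the operation Michal mentions, using
the public libvirt C API; the connection URI and domain name are
placeholders and error handling is trimmed. This is the call that
should ideally keep working even when the domain's monitor is stuck.]

#include <stdio.h>
#include <libvirt/libvirt.h>

int
main(void)
{
    virConnectPtr conn = virConnectOpen("qemu:///system");
    virDomainPtr dom;

    if (!conn)
        return 1;

    if ((dom = virDomainLookupByName(conn, "guest1"))) {
        /* Forcibly terminate the domain; with an unresponsive
         * monitor this is the operation users still need. */
        if (virDomainDestroy(dom) < 0)
            fprintf(stderr, "destroy failed -- monitor stuck?\n");
        virDomainFree(dom);
    }

    virConnectClose(conn);
    return 0;
}
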