On Wed, Aug 24, 2011 at 11:58:29PM +0800, Guannan Ren wrote:
On 08/18/2011 05:55 AM, Dave Allan wrote:
>So, after your patches which have greatly improved the console
>behavior, I find that I'm back to this hang, which by its nature I
>can't reproduce with virsh console, as it only appears when I've
>shutdown and started a domain several times within the same
>connection. The hang is 100% reproducible. Per our IRC conversation,
>I'm attaching the RPC logs, as well as the python code for reference
>and a backtrace of the python process at the time that it was hung.
>
>Dave
>
I can reproduce the problem, so I did some research on it.
According to the libvirtd log, it hangs because when the domain
boots up for the second time, libvirtd sends a message to the python
script due to the lifecycle_callback setting, while setting the
socket fd of the client to "mode=0", which means it is neither
readable nor writable on the libvirtd side.
So when the python script gets the lifecycle event and tries to call
virDomainGetState() as part of opening the console, it hangs after
sending the message to libvirtd and never gets the response.
The problem is that before decrementing 'client->nrequests' we check
what the message type/status is. The check is incorrect for streams,
because it fails to account for the fact that some stream
errors may be asynchronous and thus untracked. This in turn caused
the 'nrequests' variable to go negative.
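
To illustrate the accounting problem described above, here is a toy
model in Python. It is not the libvirtd code: the Message and Client
classes, the 'tracked' flag, and both check functions are hypothetical
names made up for this sketch. It only shows how a decrement decided
by message type/status can fire for an asynchronous stream error that
was never counted, while a per-message flag keeps the counter balanced.

```python
from dataclasses import dataclass

# Toy message types and statuses (hypothetical stand-ins, not libvirt constants).
REPLY, STREAM = "reply", "stream"
OK, ERROR, CONTINUE = "ok", "error", "continue"


@dataclass
class Message:
    type: str
    status: str
    # Hypothetical flag: True only if this message answers a request
    # that was counted in nrequests when it arrived.
    tracked: bool


class Client:
    def __init__(self):
        self.nrequests = 0

    def receive_request(self):
        # Every incoming request is counted once.
        self.nrequests += 1

    def sent_by_type(self, msg):
        # Decide from type/status alone: any final stream message is
        # assumed to correspond to a counted request.
        if msg.type == REPLY or (msg.type == STREAM and msg.status != CONTINUE):
            self.nrequests -= 1

    def sent_by_tracking(self, msg):
        # Decide from the per-message flag instead.
        if msg.tracked:
            self.nrequests -= 1


def run(use_tracking):
    client = Client()
    finish = client.sent_by_tracking if use_tracking else client.sent_by_type

    # One ordinary request/reply pair: counted, then released.
    client.receive_request()
    finish(Message(REPLY, OK, tracked=True))

    # An asynchronous stream error: sent without any counted request.
    finish(Message(STREAM, ERROR, tracked=False))
    return client.nrequests


if __name__ == "__main__":
    print("type/status check:", run(use_tracking=False))
    print("tracked flag:     ", run(use_tracking=True))
```

In this model the type/status check leaves the counter below zero
after the untracked stream error, whereas the flag-based check leaves
it at zero.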
A fix which worked for me is here:
https://www.redhat.com/archives/libvir-list/2011-August/msg01518.html
Daniel
--
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|