On 01/19/2012 08:56 AM, Luiz Capitulino wrote:
Long ago, commit 625a5be added the guest-provided memory statistics to
the query-balloon command. Unfortunately, it also introduced a severe
bug: query-balloon would hang if the guest didn't respond. This, in turn,
would also cause a hang in libvirt.
Because of that, we decided to disable the guest memory stats feature
(commit 11724ff).
As we decided to let commands implement ad-hoc async mechanisms until we
get a proper way to do it, I decided to try to re-enable that feature.
My idea is to have a command and an event. The command gets the process
started by sending a request to the guest and returns. Later, when the guest
makes the memory stats info available, it's sent to the client by means
of a QMP event (please, take a look at patch 05/05 for full details).
I'm not sure if that approach is good for libvirt though, so it would be
very helpful to get their input (Eric, I'm CC'ing you here, but feel free
to route this to someone else).
[I went ahead and cc'd the libvirt list]
Yes, libvirt can live with this approach. And having this in parallel
to a qemu-ga verb is nice, since, as it was pointed out, this would
allow interaction with guests that have a balloon device but not a guest
agent.
You may want to read this thread [1] for thoughts on the impact of
extending another existing blocking command into one that starts an
asynchronous operation and completes when an event is raised; libvirt
can expose both a blocking and an asynchronous implementation to the
user on top of a qemu model that is purely asynchronous.
[1] https://www.redhat.com/archives/libvir-list/2012-January/msg00562.html
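
To make the "blocking on top of asynchronous" idea concrete, here is a
minimal sketch of how a client could wrap a start command plus a later
event into a call that blocks with a timeout. Everything named here
(BalloonStatsWaiter, send_start, on_event) is hypothetical and only
illustrates the pattern; it is not libvirt's or qemu's actual code.

```python
import threading


class BalloonStatsWaiter:
    """Hypothetical helper: turns 'start command + later event' into a
    bounded blocking call. Not taken from libvirt or qemu."""

    def __init__(self, send_start):
        # send_start is an assumed callable that issues the start command.
        self._send_start = send_start
        self._lock = threading.Lock()
        self._arrived = threading.Event()
        self._stats = None

    def on_event(self, stats):
        # Called by whatever dispatches monitor events when stats arrive.
        with self._lock:
            self._stats = stats
        self._arrived.set()

    def query_blocking(self, timeout):
        # Forget any earlier event, send the request, wait a bounded time.
        self._arrived.clear()
        self._send_start()
        if not self._arrived.wait(timeout):
            return None  # guest never answered; the caller gives up
        with self._lock:
            return self._stats
```

An asynchronous API would expose on_event directly, while a blocking API
would call query_blocking; both sit on the same start-plus-event model.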
Thinking aloud - do we need a means to poll the state of the
balloon-stat query? On the one hand, if libvirtd issues the start
command, then gets stopped, then the event occurs, then libvirtd is
restarted, then libvirt won't know that the event was missed. On the
other hand, since this involves guest interaction, libvirt already has
to assume that the guest may be malicious and refuse to report stats
and/or report invalid stats, so libvirt would already have to be
prepared to give up if no event has arrived in a fixed amount of time,
and that also means that restarting libvirtd can just ignore any balloon
query that was in flight before the restart.
So I guess I'm okay with just a start and an event, with no poll of the
last-known guest response. But it does mean that qemu has to gracefully
handle libvirt making two start requests in a row without any
intervening events, and conversely that libvirt has to be prepared for
an event that happens even when libvirt doesn't remember triggering the
start command.
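
The bookkeeping this implies on the client side can be sketched as
follows. This is only an illustration under assumptions: the class name,
the fixed timeout, and the "expected"/"unsolicited" labels are all
hypothetical, and a real client might choose to forward a repeated start
request rather than suppress it.

```python
import time


class StatsRequestTracker:
    """Hypothetical client-side state for one outstanding stats request."""

    def __init__(self, timeout, clock=time.monotonic):
        self._timeout = timeout
        self._clock = clock      # injectable so the logic can be tested
        self._deadline = None    # None means no request is in flight

    def start(self):
        """Return True if a new start command should be sent."""
        now = self._clock()
        if self._deadline is not None and now < self._deadline:
            return False  # a request is still pending; do not send another
        self._deadline = now + self._timeout
        return True

    def on_event(self, stats):
        """Classify an incoming event; never assume it was requested."""
        now = self._clock()
        solicited = self._deadline is not None and now < self._deadline
        self._deadline = None
        return "expected" if solicited else "unsolicited"
```

A freshly restarted client starts with no deadline, so an event caused
by a request from before the restart is simply classified as
unsolicited and can be ignored or used opportunistically.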
Another interesting point is that there's another way of doing this, and
it's using qemu-ga instead. That is, qemu-ga could read that information
from /proc and return it. This is easier & simpler, as it doesn't involve
guest communication. We also could return a lot more information if needed.
The only disadvantage I can see is the dependency on qemu-ga...
Most likely, we would want to teach libvirt to use both methods, and
give the choice to the user on which approach to use when the guest
supports both.
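
For the guest-agent route, reading the statistics inside a Linux guest
amounts to parsing /proc/meminfo. The sketch below shows that parsing
step only; parse_meminfo is a hypothetical helper, not an existing
qemu-ga command, and it assumes the usual "Key:   value kB" layout.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values.

    Values are returned as written in the file (kB for most fields,
    plain counts for fields such as HugePages_Total).
    """
    stats = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if not sep or not fields:
            continue  # skip blank or malformed lines
        try:
            stats[key.strip()] = int(fields[0])
        except ValueError:
            continue  # skip values that are not plain integers
    return stats
```

Inside a Linux guest this could be fed with the contents of
/proc/meminfo, for example parse_meminfo(open("/proc/meminfo").read()).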
--
Eric Blake eblake(a)redhat.com +1-919-301-3266
Libvirt virtualization library
http://libvirt.org