On 01/30/2012 08:44 AM, Luiz Capitulino wrote:
On Mon, 30 Jan 2012 07:54:56 -0600
Anthony Liguori<anthony(a)codemonkey.ws> wrote:
> On 01/30/2012 06:57 AM, Luiz Capitulino wrote:
>> On Thu, 26 Jan 2012 16:57:01 -0600
>> Anthony Liguori<anthony(a)codemonkey.ws> wrote:
>>
>>> On 01/26/2012 01:35 PM, Luiz Capitulino wrote:
>>>> On Thu, 26 Jan 2012 08:18:03 -0700
>>>> Eric Blake<eblake(a)redhat.com> wrote:
>>>>
>>>>> [adding qemu-devel]
>>>>>
>>>>> On 01/26/2012 07:46 AM, Daniel P. Berrange wrote:
>>>>>>> One thing, that you'll probably notice is this
>>>>>>> 'set-support-level' command. Basically, it tells GA
what qemu version
>>>>>>> is it running on. Ideally, this should be done as soon as
>>>>>>> GA starts up. However, that cannot be determined from
outside
>>>>>>> world as GA doesn't emit any events yet.
>>>>>>> Ideally^2 this command should be left out as it should be
qemu
>>>>>>> who tells its own agent this kind of information.
>>>>>>> Anyway, I was going to call this command in
qemuProcess{Startup,
>>>>>>> Reconnect,Attach}, but it won't work. We need to un-pause
guest CPUs
>>>>>>> so guest can boot and start GA, but that implies returning
from qemuProcess*.
>>>>>>>
>>>>>>> So I am setting this just before 'guest-suspend'
command, as
>>>>>>> there is one more thing about GA. It is unable to remember
anything
>>>>>>> upon its restart (GA process). Which has BTW show flaw
>>>>>>> in our current code with FS freeze& thaw. If we freeze
guest
>>>>>>> FS, and somebody restart GA, the simple FS Thaw will not
succeed as
>>>>>>> GA thinks FS are not frozen. But that's a different cup
of tea.
>>>>>>>
>>>>>>> Because of what written above, we need to call set-level
>>>>>>> on every suspend.
>>>>>>
>>>>>>
>>>>>> IMHO all this says that the 'set-level' command is a
conceptually
>>>>>> unfixably broken design& should be killed in QEMU before
it turns
>>>>>> into an even bigger mess.
>>>>
>>>> Can you elaborate on this? Michal and I talked on irc about making the
>>>> compatibility level persistent, would that help?
>>>>
>>>>>> Once we're in a situation where we need to call
'set-level' prior
>>>>>> to every single invocation, you might as well just allow the
QEMU
>>>>>> version number to be passed in directly as an arg to the command
>>>>>> you are running directly thus avoiding this horrificness.
>>>>>
>>>>> Qemu folks, would you care to chime in on this?
>>>>>
>>>>> Exactly how is the set-level command supposed to work? As I
understand
>>>>> it, the goal is that if the guest has qemu-ga 1.1 installed, but is
>>>>> being run by qemu 1.0, then we want to ensure that any guest agent
>>>>> command supported by qemu-ga 1.1 but requiring features of qemu not
>>>>> present in qemu 1.0 will be properly rejected.
>>>>
>>>> Not exactly, the default support of qemu-ga is qemu 1.0. This means that
by
>>>> default qemu-ga will only support qemu 1.0 even when running on qemu 2.0.
This
>>>> way the set-support-level command allows you to specify that qemu 2.0
features
>>>> are supported.
>>>
>>> Version numbers are meaningless. What happens when a bunch of features get
>>> backported by RHEL such that qemu-ga 1.0 ends up being a frankenstein version
of
>>> 2.0?
>>>
>>> The feature negotiation mechanism we have in QMP is the existence of a
command.
>>> If we're in a position where we're trying to disable part of a
command, it
>>> simply means that we should have multiple commands such that we can just
remove
>>> the disabled part entirely.
>>
>> You may have a point that we shouldn't be using the version number for that,
>> but just switching to multiple commands doesn't solve the fundamental
problem.
>>
>> The fundamental problem is that, S3 in current (and old) qemu has two known
bugs:
>>
>> 1. The screen is left black after S3 (it's a bug in seabios)
>> 2. QEMU resumes the guest immediately (Gerd posted patches to address this)
>>
>> We're going to address both issues in 1.1. However, if qemu-ga is installed
in
>> an old qemu and S3 is used, the bugs will be triggered.
>
> It's a management tool problem.
>
> Before a management tool issues a command, it should query the existence of the
> command to determine whether this version of QEMU has that capability. If the
> tool needs to use two commands, it should query the existence of both of them.
>
> In this case, the management tool needs a qemu-ga command *and* a QEMU command
> (to resume from suspend) so it should query both of them.
>
> Obviously, we wouldn't have a resume-from-suspend command in QEMU unless it S3
> worked in QEMU as expected.
That's right, it's a coincidence, but I don't see why this wouldn't
work.
I think we should do the following then:
1. Drop the set-support-level command
2. Split the guest-suspend command into guest-suspend-ram, guest-suspend-hybrid,
guest-suspend-disk
3. Libvirt should query for _QEMU_'s 'wakeup' command before issuing
the guest-suspend-ram command
Michal, Michael, do you agree?
Yup yup, looks sound to me.
> Alternatively, if there really was no reason to have a resume-from-suspend
> command, this would be the point where we would add a capabilities command
> adding the "working-s3" capability.
>
> But with capabilities, this is a direct QEMU->management tool interaction, not a
> proxy through the guest agent.
>
> We shouldn't trust the guest agent and we certainly don't want to rely on
the
> guest agent to avoid sending an improper command to QEMU! That would be a
> security issue.
Fair enough, although that makes me wonder if the planned feature of pulling
qemu-ga's commands into the QMP namespace won't have similar security issues.
>
>>
>> We need a way for qemu-ga to query qemu about the existence of a working S3
>> support. The set-support-level solves that.
>
> qemu-ga is not an entry point for QEMU features. It's strictly a mechanism to
> ask the guest to do something. If we need to interact with QEMU directly to
> query a capability and/or presence of a command, then we should talk to QEMU
> directly.
>
> To put it another way, a management tool MUST deal with the fact that when
> issuing the suspend-to-ram command, a guest may ignore it or attempt to do
> something malicious.
>
> Regards,
>
> Anthony Liguori
>
>
>> Another option would be to disable (or enable) S3 by default in qemu-ga, and let
>> the admin enable (or disable it) according to S3 support being fixed in qemu.
>>
>