On Fri, 06 Jan 2012 09:08:19 -0600
Anthony Liguori <anthony(a)codemonkey.ws> wrote:
On 01/06/2012 06:45 AM, Luiz Capitulino wrote:
> On Fri, 6 Jan 2012 11:06:12 +0000
> Stefan Hajnoczi<stefanha(a)gmail.com> wrote:
>
>> Proper async support - if you mean the ability to have multiple QMP
>> commands pending at a time - is harder than just fixing QEMU. Clients
>> also need to start taking advantage of it. Clients that do not will
>> be unable to continue when a QMP command takes a long time to
>> complete.
>
> They can be fixed if we offer proper async support. Today they can't.
>
>> I think avoiding long-running QMP commands is a good idea. We have
>> events which can be used to signal completion. It's easy to implement
>> and does not require clients to change the way they think about QMP
>> commands.
>
> I agree in principle, but in practice we risk having different subsystems
> and different commands introducing their own async support which is going
> to make our API (which is already far from perfect) impossible to use,
> not to mention the maintainability hell that will arise from it.
I absolutely agree with you but practically speaking, we don't have generic
async support today.
It's been my experience that holding up patch series for generic infrastructure
that doesn't exist yet 1) causes unnecessary angst in contributors and 2) puts pressure on
the infrastructure to get something in fast vs. get something in that's good.
And honestly, it's (2) that I worry the most about. I don't want us to rush
async support because we're eager to get block streaming merged. This is why
I'm not holding any new devices back while we get QOM merged even if it creates
more work for me and introduces new compatibility problems to solve.
I agree.
We also need to look at this interface as a public interface whether we
technically committed to it or not. The fact is, an important user is relying
upon it, so that makes it a supported interface. Even though I absolutely hate it,
this is why we haven't changed the help output even after all of these years.
Not breaking users should be one of our highest priorities.
One thing I don't understand: how is libvirt relying on it if it doesn't
exist in qemu.git yet?
Now we could change this command to make it a better QMP interface and we could
do that in a compatible fashion.
However, since I think we'll get proper async support really soon and that will
involve fundamentally changing this command (along with a bunch of other
commands), I don't think there's a lot of value in making cosmetic changes right
now. If we're going to break backwards compatibility, I'd rather do it once
than twice.
It goes beyond cosmetic changes. For example, will we allow other async
block commands to use this interface? And if we're doing this for block,
why not accept something similar for other subsystems if someone happens to
submit it?
Let me take a non-cosmetic change request example. The BLOCK_JOB_COMPLETED
event has an 'error' field. However, it's impossible to know which error
happened because the 'error' field contains only the human error description.
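
To make the 'error' field complaint concrete, here is a minimal sketch in
Python of what a client sees. Only the event name and the 'error' key come
from the discussion; the device name, the message text, and the assumption
that a missing 'error' key means success are illustrative guesses, not the
actual wire format.

```python
import json

# Hypothetical payload: only "BLOCK_JOB_COMPLETED" and the 'error' key are
# taken from the thread. The device name and message text are made up.
RAW_EVENT = json.dumps({
    "event": "BLOCK_JOB_COMPLETED",
    "data": {"device": "disk0", "error": "Input/output error"},
})


def classify_completion(raw):
    """Return 'ok', or the human-readable message if the job failed.

    Assumes (for illustration) that a missing 'error' key means success.
    """
    event = json.loads(raw)
    data = event.get("data", {})
    if "error" not in data:
        return "ok"
    # There is nothing stable to branch on here: the string is written for
    # humans, so matching on its text would be brittle. All the client can
    # do is pass it along.
    return data["error"]


if __name__ == "__main__":
    print(classify_completion(RAW_EVENT))
```

A client that wants to react differently to, say, a full disk versus a
missing backing file has no machine-readable value to test in this layout.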
Another problem: the event is called BLOCK_JOB_COMPLETED, but it's tied
to the streaming API. If we allow other commands to use it, they will likely
have to add fields there, making the event worse than it already is.
There's more, because I skipped this review in v3 as I jumped to the
"proper async support" discussion...
What I'd suggest is that we take the command in as-is and we mark it:
Since: 1.1
Deprecated: 1.2
See Also: TBD
The idea being that we'll introduce new generic async commands in 1.2 and
deprecate this command. We can figure out the removal schedule then too. Since
this command hasn't been around all that long, we can probably have a short
removal schedule.
That makes its inclusion even discussable :) A few (very honest) questions:
1. Is it really worth it to have the command for one or two releases?
2. Will we allow other block commands to use this async API?
3. Are we going to accept other ad-hoc async APIs until we have a
proper one?
We should also mark the other pseudo-async commands this way too FWIW.
Regards,
Anthony Liguori
> Note that I'm not exactly advocating for heavyweight async support, I just
> want to avoid having to keep messing with this area.
>
> Maybe, we could go real simple by having a standard event for
> asynchronous commands, say ASYNC_CMD_FINISHED or something and that event
> would contain only the command id and whether the command succeeded or
> failed. The APIs for cancelling and querying would have to be provided
> by the command itself.
>
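
A minimal client-side sketch of the standard-event idea in Python, under
stated assumptions: ASYNC_CMD_FINISHED is the placeholder name suggested
above, while the 'id' and 'success' data keys, the PendingCommands class,
and the command name are all hypothetical. It only illustrates matching a
completion event to a pending command by id.

```python
class PendingCommands:
    """Track issued commands by id and match them to completion events."""

    def __init__(self):
        self._pending = {}  # command id -> command name
        self._next_id = 0

    def issue(self, name, arguments=None):
        """Build the command dict to send and remember it as pending."""
        self._next_id += 1
        cmd_id = f"cmd-{self._next_id}"
        self._pending[cmd_id] = name
        msg = {"execute": name, "id": cmd_id}
        if arguments:
            msg["arguments"] = arguments
        return msg

    def handle_event(self, event):
        """Return (command name, succeeded) for a matching completion event.

        Returns None for unrelated events and for unknown command ids.
        """
        if event.get("event") != "ASYNC_CMD_FINISHED":
            return None
        data = event.get("data", {})
        name = self._pending.pop(data.get("id"), None)
        if name is None:
            return None
        # 'success' is a hypothetical key: the proposal only says the event
        # carries the command id and whether the command succeeded.
        return (name, bool(data.get("success")))


if __name__ == "__main__":
    tracker = PendingCommands()
    sent = tracker.issue("example-long-command", {"device": "disk0"})
    done = {"event": "ASYNC_CMD_FINISHED",
            "data": {"id": sent["id"], "success": True}}
    print(tracker.handle_event(done))
```

Cancelling and querying are deliberately left out, since the proposal says
those would be provided by each command itself.
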
> I can start a new thread to discuss async support. I haven't done it yet
> because I don't have a concrete proposal and I also suspect that people are
> tired of discussing this over and over again.
>
>> Today I doubt many QMP clients have implemented multiple pending
>> commands, although the wire protocol allows it.
>
> That's true, but adding the id field in the command dict was silly, as
> we don't support multiple pending commands.
>
>>>> With respect to libvirt relying on interfaces before they exist in QEMU, we need
>>>> to be a bit flexible here. We want to get better at co-development to help make
>>>> libvirt support QEMU features at the bleeding edge.
>>>>
>>>> Forcing libvirt to wait until a feature is fully baked in QEMU will ensure
>>>> there's always a feature gap in libvirt which is in none of our best interests.
>>>
>>> We can ask them to wait at least until the API is merged. Most good review
>>> and potential problems will only come when the patches are worked on and
>>> reviewed on the list.
>>
>> The API was agreed on between QEMU and libvirt developers on the mailing
>> lists - you were included in that process. Back in August I sent
>> patches which you saw ("[0/4] Image Streaming API").
>>
>> I know the API is not what we'd design today when it comes to the
>> cosmetics. We'd want to name things differently, use the Unsupported
>> event which was introduced in the meantime, and maybe make the job
>> completion concept generic.
>>
>> QMP and QAPI have evolved in the time that this feature has been
>> reimplemented. I have tried to keep up with QMP but the API itself is
>> from August. We can't keep redrawing the lines.
>>
>> In summary:
>> * The API was designed and agreed several months ago.
>> * You saw it back then and I've kept you up-to-date along the way.
>> * It predates current QAPI conventions.
>> * Merging it poses no problem, changing the API breaks existing libvirt.
>
> It does pose problems. The name changes I've proposed are not minor
> things, it's about conforming to the protocol which is quite important.
> Duplicating errors is something that just doesn't make sense either.
>
> And most importantly, you're adding async support to the block layer. This
> means that we'll have two different async APIs when we add one to QMP,
> or worse other subsystems will be motivated to have their own async APIs
> too.
>
>> I do feel bad that the code has been out of qemu.git for so long and I
>> certainly won't attempt this again in the future. But I really think
>> the pros and cons say we should accept it as an August 2011 API just
>> like many of the other HMP/QMP commands we carry.
>
> I disagree. This should be reviewed and changed as any other submission.
>
>> Regarding being more flexible about working together with libvirt, I
>> do think it's important to work on APIs together. This avoids us
>> developing something purely from the QEMU internal perspective which
>> turns out to be unconsumable by our biggest QMP user :).
>
> We do work together with them. I've never ignored their opinion and I'm
> probably the most strongly opinionated when it comes to compatibility.
>
> I just can't see how accepting something that is now rotted is going
> to help either of us.
>