On Wed, Dec 07, 2011 at 04:01:58PM -0700, Eric Blake wrote:
On 12/07/2011 03:35 PM, Adam Litke wrote:
> Stefan's qemu tree has a block_job_cancel command that always acts
> asynchronously. In order to provide the synchronous behavior in libvirt (when
> flags is 0), I need to wait for the block job to go away. I see two options:
>
> 1) Use the event:
> To implement this I would create an internal event callback function and
> register it to receive the block job events. When the cancel event comes in, I
> can exit the qemu job context. One problem I see with this is that events are
> only available when using the json monitor mode.
I like this idea. We have internally handled events before, and limited
it to just JSON if that made life easier: for example, virDomainReboot
on qemu is rejected if you only have the HMP monitor, and if you have
the JSON monitor, the implementation internally handles the event to
change the domain state.
Can we reliably detect whether qemu is new enough to provide the event,
and if qemu was older and did not provide the event, do we reliably know
that abort was blocking in that case?
I think we can say that qemu will operate in one of two modes:
a) Blocking abort AND event is not emitted
b) Non-blocking abort AND event is emitted
The difficulty is in detecting which case the current qemu supports. I don't
believe there is a way to query qemu for a list of currently-supported events.
Therefore, we'd have to use version numbers. If we do this, how do we avoid
breaking users of qemu git versions that fall between official qemu releases?
It's okay to make things work that used to fail, but it is a
regression
to make blocking job cancel fail where it used to work, so rejecting
blocking job cancel with HMP monitor is not a good idea. If we can
guarantee that all qemu new enough to have async cancel also support the
event, while older qemu was always blocking, and that we can always use
the JSON monitor to newer qemu, then we're set - merely ensure that we
use only the JSON monitor and the event. But if we can't make the
guarantees, and insist on supporting newer qemu via HMP monitor, then we
may need a hybrid combination of options 1 and 2, or for less code
maintenance, just option 2.
Is there a deprecation plan for HMP with newer qemu versions? I really hate the
idea of needing two implementations for this: one polling and one event-based.
> 2) Poll the qemu monitor
> To do it this way, I would write a function that repeatedly calls
> virDomainGetBlockJobInfo() against the disk in question. Once the job
> disappears from the list I can return with confidence that the job is gone.
> This is obviously sub-optimal because I need to poll and sleep.
We've done this before, for both HMP and JSON - see
qemuMigrationWaitForCompletion. I agree that an event is nicer than
polling, but we may be locked into this.
>
> 3) Ask Stefan to provide a synchronous mode for the qemu monitor command
> This one is the nicest from a libvirt perspective, but I doubt qemu wants to add
> such an interface given that it basically has broken semantics as far as qemu is
> concerned.
Or even:
4) Ask Stefan to make the HMP monitor command synchronous, but only
expose the JSON command as asynchronous. After all, it is only HMP
where we can't wait for an event.
Stefan, how 'bout it?
>
> If this is all too nasty, we could probably just change the behavior of
> blockJobAbort and make it always synchronous with a 'cancelled' event.
No, we can't change the behavior without breaking back-compat of
existing clients of job cancellation.
--
Adam Litke <agl(a)us.ibm.com>
IBM Linux Technology Center