
On Fri, Jan 13, 2012 at 8:51 PM, Adam Litke <agl@us.ibm.com> wrote:
QEMU has changed the semantics of the "block_job_cancel" API. Originally, the operation was synchronous (i.e., upon command completion, the operation was guaranteed to be completely stopped). With the new semantics, a "block_job_cancel" merely requests that the operation be cancelled, and an event is triggered once the cancellation request has been honored.
To adopt the new semantics while preserving compatibility, I propose the following updates to the virDomainBlockJob API:
A new block job event type VIR_DOMAIN_BLOCK_JOB_CANCELLED will be recognized by libvirt. Regardless of the flags used with virDomainBlockJobAbort, this event will be raised whenever it is received from qemu. This event indicates that a block job has been successfully cancelled.
A new extension flag VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC will be added to the virDomainBlockJobAbort API. When the flag is set, this function will operate asynchronously (i.e., it can return before the job has actually been cancelled). When the API is used in this mode, it is the responsibility of the caller to wait for a VIR_DOMAIN_BLOCK_JOB_CANCELLED event or poll via the virDomainGetBlockJobInfo API to check the cancellation status.
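For illustration only (not part of the patch): a minimal Python sketch of the caller-side flow in asynchronous mode, using the polling option rather than the event. FakeDomain, block_job_abort, block_job_info, and ABORT_ASYNC are hypothetical stand-ins for a domain object, virDomainBlockJobAbort, virDomainGetBlockJobInfo, and the proposed flag; they are not the libvirt bindings.

```python
import time

ABORT_ASYNC = 1  # stand-in for the proposed VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC flag


class FakeDomain:
    """Hypothetical stand-in for a domain; not the libvirt binding."""

    def __init__(self, polls_until_cancelled):
        self._delay = polls_until_cancelled
        self._remaining = None  # None: job running, no cancel requested yet

    def block_job_abort(self, disk, flags=0):
        # Async mode: record the request and return at once.
        # (The fake ignores flags and always behaves asynchronously.)
        self._remaining = self._delay

    def block_job_info(self, disk):
        # Returns a dict while the job exists, None once it is gone.
        if self._remaining is None:
            return {"disk": disk}
        if self._remaining > 0:
            self._remaining -= 1
            return {"disk": disk}
        return None


def cancel_and_wait(dom, disk, interval=0.0, max_polls=100):
    """Request cancellation, then poll until the job disappears.

    Returns the number of polls that were needed.
    """
    dom.block_job_abort(disk, ABORT_ASYNC)
    for polls in range(1, max_polls + 1):
        if dom.block_job_info(disk) is None:
            return polls
        time.sleep(interval)
    raise TimeoutError("block job still present after %d polls" % max_polls)
```

A caller that prefers events would register for the cancellation event instead of looping; the polling variant is shown only because it needs no event loop.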
Without the VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC flag, libvirt will internally poll using qemu's "query-block-jobs" API and will not return until the job has actually been cancelled. API users are advised that this wait is unbounded, and further interaction with the domain during this period may block.
This patch implements the new event type, the API flag, and the polling. The main outstanding issue is whether we should bound the amount of time we will wait for cancellation and return an error.
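To make the open question concrete, here is a hedged Python sketch of the synchronous wait with an optional bound. It is not the code in the patch: wait_for_cancel, CancelTimeout, and the job_is_active callback (standing in for a "query-block-jobs" check) are names assumed for illustration.

```python
import time


class CancelTimeout(Exception):
    """Raised when the job is still present once the bound expires."""


def wait_for_cancel(job_is_active, timeout=None, interval=0.05,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll job_is_active() until it returns False.

    timeout=None waits without bound (the behaviour described above);
    a number of seconds turns the wait into a bounded one that raises
    CancelTimeout when the deadline passes.
    """
    deadline = None if timeout is None else clock() + timeout
    while job_is_active():
        if deadline is not None and clock() >= deadline:
            raise CancelTimeout("block job not cancelled within %s s" % timeout)
        sleep(interval)
```

With timeout=None the caller can block indefinitely, which is the concern raised above; with a bound, the caller gets an error but the cancellation request may still complete later, so the error would have to be documented as "not yet cancelled" rather than "cancel failed".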
Comments on this proposal?
Hi,
What's the latest thinking on this issue? I'm happy to help come up with an acceptable patch. I found Adam's initial approach plus suggested cleanups fine. Would that be accepted?
Stefan