On 10/06/2016 06:34 AM, Peter Krempa wrote:
> Currently libvirt block APIs (& consequently higher-level applications
> like Nova which use these APIs) rely on polling for job completion via
libvirt is _not_ polling the data. Libvirt relies on the events from
qemu which are also exposed as libvirt events.
Libvirt is not the one deciding when to issue the pivot command, Nova
is. Right now, Nova is polling (rather than waiting for events), and
its polling is solely conditional on cur==end rather than on the XML
addition of ready='true'.
We expose the state of the copy job in the XML and forward the READY
event from qemu to the users.
I was not aware of that when I was chatting on IRC yesterday; that's
useful to know, because virDomainGetBlockJobInfo() is NOT exposing that
information at the moment.
The documentation suggests that block jobs should listen to the events
and act accordingly only after receiving the event.
Yes, but the documentation ALSO states that waiting for cur==end is
SUPPOSED to work. And it doesn't.
> libvirt finds cur==end AND sends a pivot request, all in the window
> before QEMU would have sent "ready": true field [emitted as part of the
This is not true. Libvirt checks that the mirror is actually ready. It's
done by the commit you've mentioned above.
In other words, Nova sees cur==end, and requests the pivot, but libvirt
is rejecting Nova's request because 'ready' is not true yet; and Nova
then gives up rather than trying again.
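
A minimal sketch of the "try again" behaviour, in Python since Nova is
written in Python. The three callables (get_progress, is_ready, pivot) are
hypothetical stand-ins for however the caller queries cur/end, checks the
ready flag, and requests the pivot; none of them is a libvirt API name.

```python
import time


def pivot_when_ready(get_progress, is_ready, pivot, interval=0.5, max_tries=60):
    """Poll until the job is both fully copied and flagged ready, then pivot.

    get_progress() -> (cur, end), is_ready() -> bool and pivot() are
    hypothetical callables supplied by the caller; none is a real libvirt name.
    Returns True once the pivot was requested, False if the tries ran out.
    """
    for _ in range(max_tries):
        cur, end = get_progress()
        # cur == end alone is not sufficient: the ready flag must also be set.
        if end > 0 and cur == end and is_ready():
            pivot()
            return True
        time.sleep(interval)
    return False
```

Waiting for the block job event would avoid the sleep entirely; the loop
above only illustrates retrying instead of giving up after one failed check.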
> QMP `query-block-jobs` command's response, indicating that the job has
> actually completed], however the pivot request fails because it requires
> "ready": true.
The problem is that you are polling the block job info which correctly
reports that all data was properly copied and you are inferring the
block job state from that data.
But the problem here is that qemu is NOT accurately reporting data - it
is reporting cur==end with the promise that they are only equal if the
job is stable, WHEN THE JOB IS NOT STABLE.
I'm against deliberately reporting false data in the block info
structure.
We are NOT falsifying any information, any more than we are falsifying
information by changing cur/end to 0/1 when ready:false and qemu
reported 0/0. (see commit 988218ca).
The application should register handlers for the block job events and
act only if it receives such an event. Additionally you may want to check
that the state is correct in the XML. Unfortunately, the current block job
information structure can't be extended.
Yes, changing Nova to use event handlers is a good idea. But I'm ALSO
in favor of fixing libvirt to work around the qemu bug, by intentionally
munging the output to state cur<end (even if qemu reported cur==end) if
qemu reports ready:false.
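
As a rough illustration of that munging (a sketch only, not libvirt's actual
code; munge_progress is a hypothetical name), the rule could look like this
in Python. It also covers the 0/0 -> 0/1 case mentioned above.

```python
def munge_progress(cur, end, ready):
    """Return (cur, end) adjusted so cur == end is only reported when ready.

    Hypothetical helper illustrating the proposal; not taken from libvirt.
    """
    if ready or cur < end:
        # Either the job is ready, or the numbers already show cur < end.
        return cur, end
    # Not ready, yet qemu reported cur >= end (including 0/0):
    # keep end at least 1 and report cur just below it.
    end = max(end, 1)
    return end - 1, end
```

With this rule a caller that polls only on cur==end never sees equality
while ready is false.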
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library
http://libvirt.org