On 10/06/2016 10:25 AM, Eric Blake wrote:
On 10/06/2016 06:34 AM, Peter Krempa wrote:
>> Currently libvirt block APIs (& consequently higher-level applications
>> like Nova which use these APIs) rely on polling for job completion via
>
> libvirt is _not_ polling the data. Libvirt relies on the events from
> qemu which are also exposed as libvirt events.
Libvirt is not the one deciding when to issue the pivot command, Nova
is. Right now, Nova is polling (rather than waiting for events), and
its polling is solely conditional on cur==end rather than on the XML
addition of ready='true'.
>
> We expose the state of the copy job in the XML and forward the READY
> event from qemu to the users.
I was not aware of that when I was chatting on IRC yesterday; that's
useful to know, because virDomainGetBlockJobInfo() is NOT exposing that
information at the moment.
> The documentation suggests that block jobs should listen to the events
> and act accordingly only after receiving the event.
Yes, but the documentation ALSO states that waiting for cur==end is
SUPPOSED to work. And it doesn't.
>> ~~~~~~~~~~~~~~~~~~~~~
>>
>> libvirt finds cur==end AND sends a pivot request, all in the window
>> before QEMU would have sent "ready": true field [emitted as part of the
>
> This is not true. Libvirt checks that the mirror is actually ready. It's
> done by the commit you've mentioned above.
In other words, Nova sees cur==end, and requests the pivot, but libvirt
is rejecting Nova's request because 'ready' is not true yet; and Nova
then gives up rather than trying again.
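
To illustrate the "trying again" part, here is a minimal sketch of a bounded retry loop on the caller's side. It is not Nova's code. `pivot_with_retry` and `request_pivot` are hypothetical names: `request_pivot` stands in for whatever wrapper the application puts around virDomainBlockJobAbort() with the pivot flag, and is assumed to return True on success and False when the request is rejected because the job is not ready yet.

```python
import time


def pivot_with_retry(request_pivot, attempts=5, delay=0.5, sleep=time.sleep):
    """Call request_pivot() until it succeeds or attempts run out.

    request_pivot is a hypothetical callable (an assumption for this
    sketch): it returns True if the pivot was accepted and False if it
    was rejected because the mirror is not ready yet.
    """
    for attempt in range(attempts):
        if request_pivot():
            return True
        # Do not sleep after the final failed attempt.
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Waiting for the READY event remains the better design; a retry loop like this only narrows the window for a caller that still polls.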
>
>> QMP `query-block-jobs` command's response, indicating that the job has
>> actually completed], however the pivot request fails because it requires
>> "ready": true.
>
> The problem is that you are polling the block job info which correctly
> reports that all data was properly copied and you are inferring the
> block job state from that data.
But the problem here is that qemu is NOT accurately reporting data - it
is reporting cur==end with the promise that they are only equal if the
job is stable, WHEN THE JOB IS NOT STABLE.
Do we really promise that in QEMU? Since jobs have existed since
before the ready field, I guess we do...
>
> I'm against deliberately reporting false data in the block info
> structure.
We are NOT falsifying any information, any more than we are falsifying
information by changing cur/end to 0/1 when ready:false and qemu
reported 0/0. (see commit 988218ca).
>
> The application should register handlers for the block job events and
> act only if it receives such event. Additionally you may want to check
> that the state is correct in the XML. The current block job information
> structure can't be extended unfortunately.
Yes, changing Nova to use event handlers is a good idea. But I'm ALSO
in favor of fixing libvirt to work around the qemu bug, by intentionally
munging the output to state cur<end (even if qemu reported cur==end) if
qemu reports ready:false.
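
As a rough sketch of the munging proposed above (not the actual libvirt patch, which would be C), the rule could look like the following. `BlockJobInfo` and `munge_block_job_info` are hypothetical names for illustration, and reporting `end - 1` is just one assumed way to make cur<end.

```python
from typing import NamedTuple


class BlockJobInfo(NamedTuple):
    cur: int
    end: int


def munge_block_job_info(cur: int, end: int, ready: bool) -> BlockJobInfo:
    """Return the cur/end pair to report to the caller.

    Assumption for this sketch: cur, end and ready are the values qemu
    reported for the job.
    """
    if ready:
        # Job is stable; pass qemu's numbers through unchanged.
        return BlockJobInfo(cur, end)
    if end == 0:
        # qemu reported 0/0 while not ready: report 0/1 instead
        # (the existing behaviour attributed to commit 988218ca).
        return BlockJobInfo(0, 1)
    if cur >= end:
        # qemu claims cur==end but the job is not ready:
        # report one short of the end so callers keep waiting.
        return BlockJobInfo(end - 1, end)
    return BlockJobInfo(cur, end)
```

With this rule a caller that polls only for cur==end would never see equality until ready is true, so the polling documented as SUPPOSED to work would behave as described.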