On Thu, Oct 06, 2016 at 09:25:26AM -0500, Eric Blake wrote:
On 10/06/2016 06:34 AM, Peter Krempa wrote:
[...]
> We expose the state of the copy job in the XML and forward the READY
> event from qemu to the users.
I was not aware of that when I was chatting on IRC yesterday; that's
useful to know, because virDomainGetBlockJobInfo() is NOT exposing that
information at the moment.
That is what this RFC was asking to consider -- whether an API [I think
it has to be a new one] should report it.
> The documentation suggests that block jobs should listen to the events
> and act accordingly only after receiving the event.
Yes, but the documentation ALSO states that waiting for cur==end is
SUPPOSED to work. And it doesn't.
Yes.
>> libvirt finds cur==end AND sends a pivot request, all in the window
>> before QEMU would have sent "ready": true field [emitted as part of the
>
> This is not true. Libvirt checks that the mirror is actually ready. It's
> done by the commit you've mentioned above.
In other words, Nova sees cur==end, and requests the pivot, but libvirt
is rejecting Nova's request because 'ready' is not true yet; and Nova
then gives up rather than trying again.
Indeed ^ (I made this correction in my other response.)
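To illustrate the retry that Nova currently lacks, here is a minimal
Python sketch. Note that `try_pivot` is a hypothetical callable standing
in for whatever issues the pivot request (it is not a libvirt API); I am
assuming it returns True once libvirt accepts the pivot and False while
the job is not yet ready. The attempt count and delay are arbitrary.

```python
import time


def pivot_with_retry(try_pivot, attempts=5, delay=0.5, sleep=time.sleep):
    """Call try_pivot() until it succeeds or the attempts run out.

    try_pivot is a hypothetical callable (an assumption, not a real
    libvirt call) that returns True when the pivot was accepted and
    False when it was rejected because the job is not ready yet.
    """
    for attempt in range(attempts):
        if try_pivot():
            return True
        # Wait before the next attempt, but not after the final one.
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

This only papers over the race; acting on the READY event remains the
cleaner approach.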
>> QMP `query-block-jobs` command's response, indicating that the job has
>> actually completed], however the pivot request fails because it requires
>> "ready": true.
>
> The problem is that you are polling the block job info which correctly
> reports that all data was properly copied and you are inferring the
> block job state from that data.
But the problem here is that qemu is NOT accurately reporting data - it
is reporting cur==end with the promise that they are only equal if the
job is stable, WHEN THE JOB IS NOT STABLE.
That's precisely the source of the confusion for Nova here.
> I'm against deliberately reporting false data in the block info
> structure.
We are NOT falsifying any information, any more than we are falsifying
information by changing cur/end to 0/1 when ready:false and qemu
reported 0/0. (see commit 988218ca).
Indeed, it seems inconsistent to allow it in one case (like the above
commit ID) to adjust (& _not_ falsify, as you accurately point out)
libvirt reporting, but not the other case (cur==end, "ready": false case
when cur != 0).
> The application should register handlers for the block job events and
> act only if it receives such event. Additionally you may want to check
> that the state is correct in the XML. The current block job information
> structure can't be extended unfortunately.
Yes, changing Nova to use event handlers is a good idea. But I'm ALSO
in favor of fixing libvirt to work around the qemu bug, by intentionally
munging the output to state cur<end (even if qemu reported cur==end) if
qemu reports ready:false.
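A rough sketch of the adjustment being discussed, in Python for brevity
(this is not libvirt's actual code). The function name is made up, and
reporting `end - 1` for the not-ready case is my own assumption; any
value below `end` would serve the same purpose.

```python
def adjust_progress(cur, end, ready):
    """Return the (cur, end) pair to report to the application.

    Hypothetical helper, not libvirt code: it keeps cur < end for as
    long as qemu says the job is not ready.
    """
    if ready:
        # The job is stable; pass qemu's numbers through unchanged.
        return cur, end
    if end == 0:
        # qemu reported 0/0 with ready:false; report 0/1 instead
        # (the case the thread attributes to commit 988218ca).
        return 0, 1
    if cur >= end:
        # qemu reported cur==end while ready:false; report cur < end.
        # Using end - 1 is an assumption made for this sketch.
        return end - 1, end
    return cur, end
```

With this, a caller polling for cur==end sees (99, 100) for a 100/100
job that is not yet ready, and (100, 100) only once ready is true.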
Given the above, I've re-opened the bug here:
https://bugzilla.redhat.com/show_bug.cgi?id=1382165#c3
--
/kashyap