On 09/27/2016 05:06 AM, Nikolay Shirokovskiy wrote:
On 26.09.2016 23:07, John Ferlan wrote:
>
>
> On 09/07/2016 05:24 AM, Nikolay Shirokovskiy wrote:
>> Special error code helps gracefully handle race conditions on
>> blockjob cancelling. Consider for example pull backup. We want
>> to stop it and in the process we cancel blockjob for some of the
>> exported disks. But at the time this command reaches qemu blockjob
>> fail for some reason and cancel result and result of stopping
>> will be an error with quite unexpecte message - no such job (sort of).
>> Instead we can detect this race and treat as correct cancelling
>> and wait until fail event will arrive. Then we can report corect
>> error message with some details from qemu probably.
>>
>> Users of domainBlockJobAbort can be affected as -2 is new possible
>> return code. Not sure if I should stick to the original behaviour in
>> this case.
>> ---
>> src/qemu/qemu_monitor.c | 5 +++++
>> src/qemu/qemu_monitor_json.c | 11 +++++++----
>> 2 files changed, 12 insertions(+), 4 deletions(-)
>>
>
> I haven't looked forward to patch 3/3 yet, but perhaps it'd be better in
> the "caller" that cares to :
>
> if (qemu*() < 0) {
> virErrorPtr err = virGetLastError();
> if (err && err->code == VIR_ERR_OPERATION_INVALID) {
> /* Do something specific */
> }
> }
>
> Returning -2 alters virDomainBlockJobAbort expected return values and
> well that's probably not a good idea since we don't know if some caller
> has said (if virDomainBlockJobAbort() == -1) instead of <= -1
>
It is common place in libvirt to have special error codes instead of 0/-1
only (qemuMonitorJSONCheckCPUx86) and it feels bulky to have checks that
involve virGetLastError. It looks like other places that use similar
construction do not have other choice because of rpc call or smth.
True - my concern was more that virDomainBlockJobAbort now has a new
possible error value of -2 that's not described. Still just adding to
the documentation that it can now return -1 or -2 probably isn't the
best idea especially if what you're trying to do is resolve/handle an
issue for a specific case within qemu code.
Other API's within the bowels of qemu (monitor, monitor_json, etc)
returning -2 usually is somehow handled by the caller, but before
returning out into what would be API land would return what the API has
documented ({0, -1}, {true, false}, {NULL, address}, {count, 0, -1}, etc.)
Changing an external API return value is a risky proposition - works for
some instances fails for others... I've seen code that does this:
if (public_api(arg1, arg2) == -1) {
print some error message
return failure
}
even though it's been strongly suggested to do something like
if (public_api(arg1, arg2) < 0) {
get specific error (usually via errno) and perhaps
make a decision based on that... but fall through to
print some error message
return failure
}
I think when function spawns error when it comes to error code
value is does not take much care as they are not really semantic
(at least such common error as invalid operation)
so one day the caller can catch 'operation invalid' for different
reason from different place on the stack.
As to virDomainBlockJobAbort it is easy to patch)
I think I should also remove reporting error from qemuMonitorJSONBlockJobError
in case of DeviceNotActive however it is used from several places even
like setting speed. It is legacy of commit b976165c when
a lot of qemu commands share common error checking code. I think I'd
better copy this code to qemuMonitorJSONBlockJobCancel and change it there
instead of common place. What do you think?
I think you need to think about all the callers of
qemuMonitorJSONBlockJobError... I almost hate to use the word fragile
to describe things, but there is a lot of complexity involved with
respect to all the consumers. I just recall hearing about someone doing
a lot of muttering while implementing the code. The snapshot code in
particular (which you seem to have copied to a degree) has been the
source of some fairly "exotic" bug reports.
I personally don't have a lot of insight into the totality of the
blockdev* code. I'd need to spend some time to really understand things
better and provide a more clear/concise answer...
John