On 31.10.2017 15:57, Peter Krempa wrote:
> On Tue, Oct 31, 2017 at 13:48:01 +0100, Peter Krempa wrote:
>> On Tue, Oct 31, 2017 at 15:06:36 +0300, Nikolay Shirokovskiy wrote:
>>> If block job is completed with error qemu additionally provides error
message.
>>> This patch introduces new event VIR_DOMAIN_EVENT_ID_BLOCK_JOB_3 to pass
error
>>> message to client.
>>>
>>> ---
>>>
>>> The patch is applied on top of [1] patch series (not yet pushed though).
Looks
>>> like this patch consists of too much boilerplate code only to pass extra
string
>>> parameter for blockjob event but looks like there is no other way now.
>>> Especially ugly looking is adding event3 in src/qemu/qemu_blockjob.c.
>>
>> Yes, the boilerplate is horrible. Complaining won't help though, you
>> need to send patches.
>>
>>>
>>> Actually there is old RFC for providing error message for blockjob event
[2].
>>> Hope this one gain more attention.
>>>
>>> [1]
https://www.redhat.com/archives/libvir-list/2017-October/msg01292.html
>>> [2]
https://www.redhat.com/archives/libvir-list/2016-December/msg00093.html
>>>
>>> daemon/remote.c | 47 ++++++++++++++++++++++++++++++++
>>> examples/object-events/event-test.c | 21 +++++++++++++++
>>> include/libvirt/libvirt-domain.h | 23 ++++++++++++++++
>>> src/conf/domain_event.c | 54
++++++++++++++++++++++++++++++++-----
>>> src/conf/domain_event.h | 13 +++++++++
>>> src/libvirt_private.syms | 2 ++
>>> src/qemu/qemu_blockjob.c | 9 +++++--
>>> src/qemu/qemu_blockjob.h | 3 ++-
>>> src/qemu/qemu_domain.h | 1 +
>>> src/qemu/qemu_driver.c | 10 ++++---
>>> src/qemu/qemu_process.c | 12 ++++++---
>>> src/remote/remote_driver.c | 34 +++++++++++++++++++++++
>>> src/remote/remote_protocol.x | 17 +++++++++++-
>>> src/remote_protocol-structs | 9 +++++++
>>> tools/virsh-domain.c | 25 +++++++++++++++++
>>> 15 files changed, 264 insertions(+), 16 deletions(-)
>>
>> [...]
>>
>>> diff --git a/include/libvirt/libvirt-domain.h
b/include/libvirt/libvirt-domain.h
>>> index 4048acf..4f942da 100644
>>> --- a/include/libvirt/libvirt-domain.h
>>> +++ b/include/libvirt/libvirt-domain.h
>>> @@ -4363,6 +4363,28 @@ typedef void
(*virConnectDomainEventBlockThresholdCallback)(virConnectPtr conn,
>>> unsigned long
long excess,
>>> void *opaque);
>>>
>>> +
>>> +/**
>>> + * virConnectDomainEventBlockJob3Callback:
>>> + * @conn: connection object
>>> + * @dom: domain on which the event occurred
>>> + * @disk: name associated with the affected disk (filename or target
>>> + * device, depending on how the callback was registered)
>>
>> This is copypasted from others, but should be fixed here. We should not
>> allow the same mistake we did for the '1' event where we pass the path.
>> This event should only report the target name (along with possibly the
>> layer via the square bracket syntax 'vda[3]').
>>
>>> + * @type: type of block job (virDomainBlockJobType)
>>> + * @status: status of the operation (virConnectDomainEventBlockJobStatus)
>>
>> I'm not quite sure whether I'm a fan of adding a third event that
>> reports success. I think we are fine with two. This one should probably
>> fire only on error conditions and thus the status field will be
>> unnecessary.
>>
>>> + * @error: error string
>>
>> This needs more docs. Especially stating that the error string is
>> free-form and should NOT be ever used for inferring the error cause
>> since it's bound to change, and is hypervisor specific. Libvirt will not
>> guarantee that they will not change.
>>
>>> + * @opaque: application specified data
>>
>> In general. The usefullnes of this will be limited for human consumption
>> since we can't guarantee that the errors will not change.
>>
>> Otherwise we'd probably report the error as an enum value rather than a
>> string since it's easier to use for higer level APPs.
>>
>> If human consumption is what you shoot for, I'm okay with this as long
>> as it's made clear in the docs.
>
> One more note on this:
> For future use we actually should do a error-only event which will also
> have an error-type enum value which currently will have only one
> possible value and that's a free-form error string. In the future we can
> then assign some codes in some cases.
>
I was thinking to add block job error event instead of extending generic
block job event. My concern was how client will use it. So client wants
to print error message somewhere. It receives completed event but len and
offset are different so it is a error. Client skips this event and awaits
The events don't report 'len' and 'offset' and querying is inherently
racy.
If the block job will fail it will not receive the "completed event"
but rather VIR_DOMAIN_EVENT_ID_BLOCK_JOB_2 (or '1') with the status
field having VIR_DOMAIN_BLOCK_JOB_FAILED. If the APP registered the job
error event, it can ignore this one and report failure once the error
event is delivered.
In case when the registration failed (e.g.) talking to old libvirtd, it
can fail when it gets the first event without reporting an error.