On Fri, Nov 25, 2016 at 6:04 PM, Michal Privoznik <mprivozn@redhat.com> wrote:
On 25.11.2016 17:54, Roman Mohr wrote:
> On Fri, Nov 25, 2016 at 4:34 PM, Michal Privoznik <mprivozn@redhat.com>
> wrote:
>
>> On 25.11.2016 14:38, Roman Mohr wrote:
>>> Hi,
>>>
>>> I recently started to use the libvirt domain events. With them I increase
>>> the responsiveness of my VM state watchers.
>>> In general it works pretty well. I just listen to the events and do a
>>> periodic resync to cope with missed events.
>>>
>>> While watching the events I ran into a few interesting situations I wanted
>>> to share. The points 1-3 describe some minor issues or irregularities.
>>> Point 4 is about the fact that domain and state updates are not versioned
>>> which makes it very hard to stay in sync with libvirt when using events.
>>>
>>> My libvirt version is 1.2.18.4.
>>
>> This might be the root cause. I'm unable to see some of the scenarios
>> you're seeing. Have you tried the latest release (or even git HEAD) to
>> check whether all the scenarios you are describing still stand?
>>
>
> Definitely better with latest HEAD but still it does not look completely
> right.
>
>>
>>>
>>> 1) Event order seems to be weird on startup:
>>>
>>> When listening for VM lifecycle events I get this order:
>>>
>>> {"event_type": "Started", "timestamp": "2016-11-25T11:59:53.209326Z",
>>> "reason": "Booted", "domain_name": "generic", "domain_id":
>>> "8ff7047b-fb46-44ff-a4c6-7c20c73ab86e"}
>>> {"event_type": "Defined", "timestamp": "2016-11-25T11:59:53.435530Z",
>>> "reason": "Added", "domain_name": "generic", "domain_id":
>>> "8ff7047b-fb46-44ff-a4c6-7c20c73ab86e"}
>>>
>>> It is strange that a VM already boots before it is defined. Is this the
>>> intended order?
>>
>> I don't see this order, so this is probably fixed upstream.
>>
>
> On latest master a normal creation emits these events:
>
> event 'lifecycle' for domain testvm: Resumed Unpaused
> event 'lifecycle' for domain testvm: Started Booted
>
> The Resumed event looks wrong. Also, I no longer get any Defined/Undefined
> events. Maybe they were removed?

Yes, they were removed.

Nice
 
The Resumed event actually comes from qemu,
because libvirt starts qemu in paused mode so that we can do some setup
(e.g. place vcpu threads into cgroups). Only after that can we resume the
guest CPUs and in fact let the guest start. Once this is done we
deliberately emit the Started event.

I would expect an event like this:

event 'lifecycle' for domain testvm: Suspended Bootstrapping

before the other two events. That would remove the ambiguity from the Resumed event.


>
>
>>
>>>
>>> 2) Defining a VM with VIR_DOMAIN_START_PAUSED gives me this event order
>>
>> I don't think you can define a domain with that flag. What's the actual
>> action?
>>
>
> That is the flag for the API; with virsh, `--paused` does the same.

Ah, that's for virsh create/start not virsh define. Anyway, this is no
longer the case with upstream, is it?


Right
 
>
>
>>
>>>
>>> {"event_type": "Defined", "timestamp": "2016-11-25T12:02:44.037817Z",
>>> "reason": "Added", "domain_name": "core_node", "domain_id":
>>> "b9906489-6d5b-40f8-a742-ca71b2b84277"}
>>> {"event_type": "Resumed", "timestamp": "2016-11-25T12:02:44.813104Z",
>>> "reason": "Unpaused", "domain_name": "core_node", "domain_id":
>>> "b9906489-6d5b-40f8-a742-ca71b2b84277"}
>>> {"event_type": "Started", "timestamp": "2016-11-25T12:02:44.813733Z",
>>> "reason": "Booted", "domain_name": "core_node", "domain_id":
>>> "b9906489-6d5b-40f8-a742-ca71b2b84277"}
>>
>>
>> Interesting, so here the "defined" event is delivered before the "started"
>> event. Also, where is the "suspended" event?
>>
>
>
> With latest master the situation looks better. Now I see
>
> event 'lifecycle' for domain testvm: Started Booted
> event 'lifecycle' for domain testvm: Suspended Paused

Again, both of these are deliberately emitted by libvirt and in fact I
think they reflect what is happening.


Why is no

event 'lifecycle' for domain testvm: Resumed Unpaused

event emitted in this case?

I would expect

event 'lifecycle' for domain testvm: Resumed Unpaused
event 'lifecycle' for domain testvm: Started Booted
event 'lifecycle' for domain testvm: Suspended Paused


So the situation on master is much better, but because of the Resumed/Unpaused event I still have the feeling that the simplest yet most powerful use case, watching for CREATE, UPDATE and DELETE, is very hard: you can't know whether Resumed/Unpaused indicates a CREATE or an UPDATE.
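
To make the ambiguity concrete, here is a minimal sketch of the kind of watcher-side mapping I have in mind. It is not libvirt code: `classify`, the `known` set and the CREATE/UPDATE/DELETE labels are hypothetical names for illustration, and treating a `Stopped` event as DELETE is an assumption that only fits transient domains. Events are plain (domain name, event name) pairs as printed in the logs above.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical watcher-side vocabulary, not libvirt API names.
CREATE, UPDATE, DELETE = "CREATE", "UPDATE", "DELETE"


def classify(events: Iterable[Tuple[str, str]],
             known: Set[str]) -> List[Tuple[str, str]]:
    """Map (domain, lifecycle event) pairs to CREATE/UPDATE/DELETE.

    `known` holds the domains the watcher already tracks and is
    updated in place.
    """
    actions = []
    for domain, event in events:
        if event == "Stopped":
            # Assumption: the domain is transient, so it is gone now.
            known.discard(domain)
            actions.append((domain, DELETE))
        elif domain in known:
            actions.append((domain, UPDATE))
        else:
            # First event seen for this domain, e.g. the Resumed that
            # precedes Started on a normal boot.
            known.add(domain)
            actions.append((domain, CREATE))
    return actions


# A normal boot as seen on master: Resumed, then Started.
boot = [("testvm", "Resumed"), ("testvm", "Started")]
print(classify(boot, set()))
# [('testvm', 'CREATE'), ('testvm', 'UPDATE')]

# The same Resumed event when the domain is already tracked.
print(classify([("testvm", "Resumed")], {"testvm"}))
# [('testvm', 'UPDATE')]
```

The event itself does not say which case applies. The answer depends entirely on what the watcher happened to know beforehand, which is fragile when events can be missed and the state is only corrected by a periodic resync.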

What do you think? 


Michal