On Fri, Nov 25, 2016 at 4:34 PM, Michal Privoznik <mprivozn@redhat.com> wrote:

On 25.11.2016 14:38, Roman Mohr wrote:

[...]

>
> 4) The libvirt domain description is not versioned
>
> I would expect that every time I update a domainxml (update from third
> party entity), or an event is generated (update from libvirt), that the
> resource version of a Domain is increased and that I get this resource
> version when I do a xmldump or when I get an event. Without this there is
> afaik no way to stay in sync with libvirt, even if you do regular polling
> of all domains. The main issue here is that I can never know if events in
> the queue arrived before my latest domain resync or after it.
>
> Also note that this is not about delivery guarantees of events. It is just
> about having a consistent view of a VM and the individual event. If I have
> resource versions, I can decide if an event is still interesting for me or
> not, which is exactly what I need to solve the syncing problem above.
> When I do a complete relisting of all domains to sync, I know which version
> I got and I can then see on every event if it is newer or older.
>
> If the domain xml, the VM state, and the resource version were sent to
> the client alongside the event, it would be even better. Then,
> whenever there is a new event for a VM in the queue, I can be sure that
> this domainxml I see is the one which triggered the event. This xml is then
> a complete representation for this revision number.

I recall some people asking for this. Basically, they were worried that
somebody from outside could manipulate their XMLs without them knowing.
Frankly, I don't recall what our answer to that was.

Having a version number in live XML makes sense. However, it makes less
sense for config XML - there would be no way to start with version
#0 once I've edited the file.

I think it would be very beneficial to have it on the config file too. Think of the resource version as opaque data which libvirt can use to check whether a domain xml update carries the same resource version that libvirt currently holds.

So if you want to be sure that you are updating the domain xml from the latest state, you pass in the resource version of your cached domain xml view. If the version is still the same inside of libvirt, libvirt updates the domain xml and increases the resource version. If it has changed in the meantime, libvirt rejects the update and the client can re-fetch the latest state and try again. For the classic update mode, the client simply does not pass in a resource version and libvirt updates the domain xml as it always has. This is pretty much the same principle as described in [1].
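
To make the proposed semantics concrete, here is a minimal sketch in Python. It is not libvirt code: libvirt has no such resource version today, and DomainStore, ConflictError and the resource_version parameter are all hypothetical names used only to illustrate the idea.

```python
class ConflictError(Exception):
    """Raised when the caller's resource version is stale (hypothetical)."""


class DomainStore:
    """Hypothetical in-memory stand-in for libvirt's domain definitions."""

    def __init__(self):
        # Maps domain name -> (resource_version, xml).
        self._domains = {}

    def get(self, name):
        """Return the (resource_version, xml) pair for a domain."""
        return self._domains[name]

    def update(self, name, xml, resource_version=None):
        """Store xml and return the new resource version.

        If resource_version is given, it must match the stored version,
        otherwise the update is rejected. If it is omitted, the update is
        unconditional (the "classic" mode).
        """
        current_version, _ = self._domains.get(name, (0, None))
        if resource_version is not None and resource_version != current_version:
            raise ConflictError(
                f"{name}: expected version {resource_version}, "
                f"found {current_version}"
            )
        new_version = current_version + 1
        self._domains[name] = (new_version, xml)
        return new_version
```

A client that gets ConflictError would call get() again, reapply its change to the fresh xml and retry with the new version.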

What is the rationale for this?

I am mostly operating on cached views of libvirt's data in combination with events. If, on listing resources and on events, I get a domain xml with a resource version and the Domain state, I have a full snapshot of the Domain, which I can put into a cache or queue. Then syncing with libvirt based on events and an initial listing is possible. Otherwise I can never be sure whether my view of libvirt is out of sync.
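
As a rough illustration of such a cache, assuming the resource version can be compared as an increasing integer (an assumption of this sketch, the version could just as well be opaque to clients), something like the following would be enough. DomainSnapshot and DomainCache are hypothetical names, not libvirt types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainSnapshot:
    """Hypothetical full snapshot delivered by a listing or an event."""
    name: str
    resource_version: int
    state: str
    xml: str


class DomainCache:
    """Keeps the newest known snapshot for each domain."""

    def __init__(self):
        self._snapshots = {}

    def apply(self, snapshot):
        """Store the snapshot unless an equal or newer one is cached.

        Returns True if the cache changed, False if the snapshot was stale.
        """
        cached = self._snapshots.get(snapshot.name)
        if cached is not None and cached.resource_version >= snapshot.resource_version:
            return False
        self._snapshots[snapshot.name] = snapshot
        return True

    def get(self, name):
        """Return the cached snapshot, or None if the domain is unknown."""
        return self._snapshots.get(name)
```

Both the initial listing and every later event go through apply(), so an event that was queued before the relisting is simply dropped instead of overwriting newer data.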

When I then process an event I can process it based on the consistent snapshot view of the Domain and update the domain xml. If something has changed in the meantime, the update of the domain xml will fail and I can recheck and retry. Even better: In most cases the event does not need retries, because a newer event is already in the queue with the new Domain view which caused the update to fail.
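
A sketch of that processing loop, again with hypothetical names: events are (name, resource_version) pairs, and try_update is a caller-supplied function that returns False when the versioned update is rejected. Nothing here is an existing libvirt API.

```python
from collections import deque


def drain(events, try_update):
    """Process queued events and return the domains that need a refetch.

    events: iterable of (name, resource_version) pairs, oldest first.
    try_update: callable(name, resource_version) -> bool, False on conflict.
    """
    queue = deque(events)
    needs_refetch = []
    while queue:
        name, version = queue.popleft()
        if try_update(name, version):
            continue
        # The update was rejected. If a newer event for the same domain is
        # already queued, it carries the newer view, so no retry is needed.
        superseded = any(n == name and v > version for n, v in queue)
        if not superseded:
            needs_refetch.append(name)
    return needs_refetch
```

Only domains whose rejected event is the last one in the queue need an explicit recheck.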

Finally, it allows consistent incremental updates of the Domain state and description, which can be sent to third parties without periodically refetching all resources.

Roman

[1] https://github.com/kubernetes/kubernetes/blob/master/docs/devel/api-conventions.md

Michal