On 08/19/14 16:43, Daniel P. Berrange wrote:
On Tue, Aug 19, 2014 at 03:14:19PM +0200, Peter Krempa wrote:
> I'd like to propose a (hopefully) fairly future-proof API to retrieve
> various statistics for domains.
>
> The motivation is that management layers that use libvirt usually poll
> libvirt for statistics using various split up APIs we currently provide.
> To get all the necessary stuff, the mgmt app needs to issue Ndomains *
> Napis calls and cope with the various returned formats. The APIs I'm
> wanting to introduce here will:
>
> 1) Return data in a format that we can expand in the future and is
> hierarchical. For starters I'll use XML, with possible expansion to
> something like JSON if it will be favourable for a consumer (switchable
> by a flag)
I'm not particularly a fan of using XML for this. As a guiding principle
we've used structures when we needed APIs for which efficiency is
important, and XML for APIs where efficiency is irrelevant. Even with
the ability to bulk list data from many VMs at once, I'd think that
efficiency is still a very important property of the API. Consider if
we have 1000 virtual machines on a host - the XML document is going to
get very large and frankly most XML parsers are terribly slow. It would
not surprise me if it took several seconds or more to parse an XML doc
that this proposed API would return, which I don't think is really
viable. JSON might be slightly better in this respect, but not by
as much as we might imagine.
So I'd rather think we need to figure out a way to map this into some
kind of struct, where reading any single statistic could be done in
approx constant time, or at least time that is independent of the
number of VMs.
Much as I dislike the virTypedParameter struct, it might actually be
a reasonable fit here. Perhaps a struct
struct virDomainRecord {
    virDomainPtr dom;
    size_t nparams;
    virTypedParameterPtr params; /* array of nparams entries */
};
and the API returns an array of virDomainRecord elements.
For the keys in the parameters we could use dot separated
components. eg
state=running
cpu.online=8
cpu.0.state=running
cpu.0.time=10001231231
cpu.1.state=running
cpu.1.time=10001231231
...
cpu.7.state=running
cpu.7.time=10001231231
This can be fairly efficiently accessed without any parsing involved.
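To illustrate the access pattern described above, here is a minimal sketch in C. It does not use the real libvirt types: `struct stat_param`, `find_param` and `get_vcpu_time` are hypothetical stand-ins (the real virTypedParameter carries a type tag and a union of values), and every value is assumed to be an unsigned integer. It only shows that a caller can build the key it wants and compare strings, without tokenizing anything.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a typed parameter: a key plus one integer value. */
struct stat_param {
    const char *field;
    unsigned long long value;
};

/* Linear scan for an exact key match; returns NULL if the key is absent. */
static const struct stat_param *
find_param(const struct stat_param *params, size_t nparams, const char *key)
{
    for (size_t i = 0; i < nparams; i++) {
        if (strcmp(params[i].field, key) == 0)
            return &params[i];
    }
    return NULL;
}

/* Build "cpu.<n>.time" and look it up; returns 0 on success, -1 if missing. */
static int
get_vcpu_time(const struct stat_param *params, size_t nparams,
              unsigned int vcpu, unsigned long long *time_out)
{
    char key[64];
    const struct stat_param *p;

    assert(time_out != NULL);
    snprintf(key, sizeof(key), "cpu.%u.time", vcpu);
    p = find_param(params, nparams, key);
    if (!p)
        return -1;
    *time_out = p->value;
    return 0;
}
```

Note that each lookup in this sketch is still a scan over one domain's parameters, so its cost grows with the number of statistics per domain, not with the number of domains.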
Well, while this does reduce the amount of transferred data (by something
more than half, due to the missing end tags), it still requires a fair
amount of post-processing (parsing) of the output. Potential users of
this API will need to split the elements by dots and re-create the
hierarchy as it would be in XML. And this introduces a second
dimension of complexity (n^2), as you need to go through the string
identifiers for each returned typed parameter.
For C applications this API will be mostly unusable, while Python clients
are at least able to get arrays back.
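As a rough picture of the post-processing meant here, this is a minimal C sketch of the splitting step a client would need before it could rebuild the hierarchy. The helper `split_key` and the `MAX_PART_LEN` limit are hypothetical, made up for this illustration, and are not part of any libvirt API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PART_LEN 32

/* Split a dot-separated key such as "cpu.0.time" into its components.
 * Returns the number of components stored, or -1 if the key has more than
 * max_parts components or a component does not fit in MAX_PART_LEN bytes. */
static int
split_key(const char *key, char parts[][MAX_PART_LEN], size_t max_parts)
{
    size_t count = 0;

    assert(key != NULL);
    for (;;) {
        const char *dot = strchr(key, '.');
        size_t len = dot ? (size_t)(dot - key) : strlen(key);

        if (count == max_parts || len >= MAX_PART_LEN)
            return -1;
        memcpy(parts[count], key, len);
        parts[count][len] = '\0';
        count++;

        if (!dot)
            break;
        key = dot + 1;
    }
    return (int)count;
}
```

A client that wants the nested view would run something like this over every returned parameter and then group the results by component, which is the extra work being objected to.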
The app would have to do an O(n) scan over the array to record the
mapping of UUID -> array indexes. Once that's done any single stat can be
accessed
in O(1) time. I don't think anything XML or JSON based could even come
close to this kind of efficiency of access, not to mention that it avoids
the need for apps to write XML parsers which will simplify their life no
end.
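The scan-then-lookup idea above could be sketched in C roughly as follows. This is only an illustration under stated assumptions: `struct uuid_index` and its helpers are hypothetical names, the UUIDs are assumed to be unique and to have been copied by the caller into one flat array of 16-byte entries (for example with virDomainGetUUID on each record's domain), and the table is a plain open-addressing hash, so the lookup is O(1) in the expected case rather than strictly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define UUID_LEN 16   /* raw UUID size in bytes */

/* Hypothetical lookup table: open addressing with linear probing.
 * slots[i] holds a record index + 1, or 0 when the slot is empty. */
struct uuid_index {
    const unsigned char *uuids; /* nrecords * UUID_LEN bytes, owned by caller */
    size_t *slots;
    size_t nslots;              /* power of two, at least 2 * nrecords */
};

/* FNV-1a style hash over the 16 UUID bytes. */
static size_t
uuid_hash(const unsigned char *uuid)
{
    size_t h = 2166136261u;

    for (size_t i = 0; i < UUID_LEN; i++) {
        h ^= uuid[i];
        h *= 16777619u;
    }
    return h;
}

/* One O(n) pass over the records. Returns 0 on success, -1 if out of memory. */
static int
uuid_index_build(struct uuid_index *idx,
                 const unsigned char *uuids, size_t nrecords)
{
    size_t nslots = 4;

    assert(idx != NULL);
    while (nslots < 2 * nrecords)
        nslots *= 2;

    idx->slots = calloc(nslots, sizeof(*idx->slots));
    if (!idx->slots)
        return -1;
    idx->uuids = uuids;
    idx->nslots = nslots;

    for (size_t i = 0; i < nrecords; i++) {
        size_t pos = uuid_hash(uuids + i * UUID_LEN) & (nslots - 1);

        while (idx->slots[pos] != 0)        /* probe to the next free slot */
            pos = (pos + 1) & (nslots - 1);
        idx->slots[pos] = i + 1;
    }
    return 0;
}

/* Returns the record index for a UUID, or -1 if the UUID is unknown. */
static long
uuid_index_find(const struct uuid_index *idx, const unsigned char *uuid)
{
    size_t pos = uuid_hash(uuid) & (idx->nslots - 1);

    while (idx->slots[pos] != 0) {
        size_t rec = idx->slots[pos] - 1;

        if (memcmp(idx->uuids + rec * UUID_LEN, uuid, UUID_LEN) == 0)
            return (long)rec;
        pos = (pos + 1) & (idx->nslots - 1);
    }
    return -1;
}

static void
uuid_index_free(struct uuid_index *idx)
{
    free(idx->slots);
    idx->slots = NULL;
    idx->nslots = 0;
}
```

The index returned by `uuid_index_find` would then be used to pick the matching record out of the array the API returned.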
Peter