[Libvir] Add a timestamp to virDomainInfo ?

The primary use of the virDomainInfo data is in calculating resource utilization over time. The CPU runtime data, however, is represented as a counter, so calculating utilization requires some transformation: subtract the CPU runtime value at time X from the value at time Y, and divide the result by Y-X. Obviously this requires us to know the times X and Y. Currently the virDomainInfo structure does not contain this information, so apps have to call something like gettimeofday() before virDomainGetInfo() to record the timestamp of each virDomainInfo record.

The problem is that this is a pretty poor approximation - particularly if libvirt is talking via XenD, where a non-trivial period of time can pass between the call to gettimeofday() and the moment the virDomainInfo values are actually sampled.

Thus I would like to suggest that the virDomainInfo structure be expanded with an extra data field of type 'struct timeval' to record a high resolution timestamp. Currently this wouldn't help accuracy all that much, since libvirt would still be filling it in much the same way as the app does. It does, however, open the door to having Xen itself provide an accurate timestamp back to libvirt, which would in turn benefit apps without requiring further code changes.

Regards, Dan.

--
|=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=|
|=- Perl modules: http://search.cpan.org/~danberr/ -=|
|=- Projects: http://freshmeat.net/~danielpb/ -=|
|=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|
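For concreteness, here is a minimal sketch (in C, error handling trimmed) of the kind of sampling an application has to do today. The helper name and the fixed sleep interval are made up for illustration; the gettimeofday() calls are the approximate timestamps being discussed.

#include <stdio.h>
#include <unistd.h>
#include <sys/time.h>
#include <libvirt/libvirt.h>

/* Hypothetical helper: sample the per-domain CPU counter twice and derive
 * the percentage of one physical CPU used over the interval. */
static double cpu_utilization(virDomainPtr dom, unsigned int interval_us)
{
    virDomainInfo a, b;
    struct timeval ta, tb;

    gettimeofday(&ta, NULL);              /* time X - only approximately when the counter is read */
    if (virDomainGetInfo(dom, &a) < 0)
        return -1.0;

    usleep(interval_us);

    gettimeofday(&tb, NULL);              /* time Y - same approximation */
    if (virDomainGetInfo(dom, &b) < 0)
        return -1.0;

    /* cpuTime is a counter in nanoseconds; the wall-clock delta is in microseconds.
     * The result can exceed 100% for multi-vCPU domains; divide by nrVirtCpu to normalize. */
    double cpu_us  = (double)(b.cpuTime - a.cpuTime) / 1000.0;
    double wall_us = (tb.tv_sec - ta.tv_sec) * 1e6 + (tb.tv_usec - ta.tv_usec);

    return 100.0 * cpu_us / wall_us;
}

A 'struct timeval' filled in at the moment the counter is actually read would take the place of the two gettimeofday() calls above.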

On Thu, Apr 13, 2006 at 06:49:33PM +0100, Daniel P. Berrange wrote:
> The primary use of the virDomainInfo data is in calculating resource utilization over time. The CPU runtime data, however, is represented as a counter, so calculating utilization requires some transformation: subtract the CPU runtime value at time X from the value at time Y, and divide the result by Y-X. Obviously this requires us to know the times X and Y. Currently the virDomainInfo structure does not contain this information, so apps have to call something like gettimeofday() before virDomainGetInfo() to record the timestamp of each virDomainInfo record.
> The problem is that this is a pretty poor approximation - particularly if libvirt is talking via XenD, where a non-trivial period of time can pass between the call to gettimeofday() and the moment the virDomainInfo values are actually sampled.

Whether libvirt does it or the application does it generates exactly the same uncertainty. You won't really get any added precision...

> Thus I would like to suggest that the virDomainInfo structure be expanded with an extra data field of type 'struct timeval' to record a high resolution timestamp. Currently this wouldn't help accuracy all that much, since libvirt would still be filling it in much the same way as the app does. It does, however, open the door to having Xen itself provide an accurate timestamp back to libvirt, which would in turn benefit apps without requiring further code changes.
In practice this forces libvirt to make one extra syscall for every sample, even if the application is not interested in such a timestamp. I find this a bit dubious, honestly. It would make some sense only if the timestamp were actually added at the protocol level; best is to ask on xen-devel, and if there is positive feedback then okay, why not. But to me this just won't work in the general case: if you extract the information from the XenStore this means multiple RPCs to get the information back - which one should be used for the timestamp?

In practice, what kind of deviation are you seeing? Are you afraid of network-induced delays? Please explain with some concrete actual numbers, because I'm not convinced myself, and it would also help convince people on xen-devel that this is actually needed. Basically, if doing the RPC(s) takes a millisecond, then you're in trouble anyway because that means 1ms of active CPU time, and the more you sample to get accuracy, the more artefacts you generate. Adding the timestamp makes sense only if the time to process is large and you want accuracy, i.e. sampling often but with high latency, and in that case the simple fact of sampling often kills any precision you may hope to get. So my gut feeling is that this is not useful in practice, but doing it increases the cost of the operation even in the fast case or when it is not needed. Data would help convince me :-)

Daniel

--
Daniel Veillard | Red Hat http://redhat.com/
veillard@redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/

On Thu, Apr 13, 2006 at 04:44:40PM -0400, Daniel Veillard wrote:
> On Thu, Apr 13, 2006 at 06:49:33PM +0100, Daniel P. Berrange wrote:
> > The primary use of the virDomainInfo data is in calculating resource utilization over time. The CPU runtime data, however, is represented as a counter, so calculating utilization requires some transformation: subtract the CPU runtime value at time X from the value at time Y, and divide the result by Y-X. Obviously this requires us to know the times X and Y. Currently the virDomainInfo structure does not contain this information, so apps have to call something like gettimeofday() before virDomainGetInfo() to record the timestamp of each virDomainInfo record.
> > The problem is that this is a pretty poor approximation - particularly if libvirt is talking via XenD, where a non-trivial period of time can pass between the call to gettimeofday() and the moment the virDomainInfo values are actually sampled.
> Whether libvirt does it or the application does it generates exactly the same uncertainty. You won't really get any added precision...
> > Thus I would like to suggest that the virDomainInfo structure be expanded with an extra data field of type 'struct timeval' to record a high resolution timestamp. Currently this wouldn't help accuracy all that much, since libvirt would still be filling it in much the same way as the app does. It does, however, open the door to having Xen itself provide an accurate timestamp back to libvirt, which would in turn benefit apps without requiring further code changes.
> In practice this forces libvirt to make one extra syscall for every sample, even if the application is not interested in such a timestamp. I find this a bit dubious, honestly. It would make some sense only if the timestamp were actually added at the protocol level; best is to ask on xen-devel, and if there is positive feedback then okay, why not. But to me this just won't work in the general case: if you extract the information from the XenStore this means multiple RPCs to get the information back - which one should be used for the timestamp? In practice, what kind of deviation are you seeing? Are you afraid of network-induced delays? Please explain with some concrete actual numbers, because I'm not convinced myself, and it would also help convince people on xen-devel that this is actually needed. Basically, if doing the RPC(s) takes a millisecond, then you're in trouble anyway because that means 1ms of active CPU time, and the more you sample to get accuracy, the more artefacts you generate. Adding the timestamp makes sense only if the time to process is large and you want accuracy, i.e. sampling often but with high latency, and in that case the simple fact of sampling often kills any precision you may hope to get. So my gut feeling is that this is not useful in practice, but doing it increases the cost of the operation even in the fast case or when it is not needed. Data would help convince me :-)
I've not really got any formal data on it at this time - it was just a random afternoon thought. I'll see if there's any useful way to get some data on the effects.

Dan.

On Thu, Apr 13, 2006 at 10:37:17PM +0100, Daniel P. Berrange wrote:
> > Basically, if doing the RPC(s) takes a millisecond, then you're in trouble anyway because that means 1ms of active CPU time, and the more you sample to get accuracy, the more artefacts you generate. Adding the timestamp makes sense only if the time to process is large and you want accuracy, i.e. sampling often but with high latency, and in that case the simple fact of sampling often kills any precision you may hope to get. So my gut feeling is that this is not useful in practice, but doing it increases the cost of the operation even in the fast case or when it is not needed. Data would help convince me :-)
> I've not really got any formal data on it at this time - it was just a random afternoon thought. I'll see if there's any useful way to get some data on the effects.
If running as root locally with Xen, getting the data is a simple hypercall; I would expect that to be nearly as fast as a gettimeofday(), and the timestamp won't increase precision. In the case of a non-root local process making an HTTP request to xend, the time spent could potentially be quite large - actually not bounded at all due to potential I/O - and the quality of the data extracted will then be poor because of the time of acquisition; would that be worth it? The last corner case is remote monitoring, where the time spent is most likely due to the network round trip, which in general is approximated by taking the midpoint between emission and reception; the time to do the 2 gettimeofday() calls is probably negligible. So in those 3 kinds of extreme scenario it's a bit unclear how adding the timestamp to the data would really help, except maybe as a convenience to the user layer.

Actually, getting some data about the cost of doing the call as root through the hypervisor versus the xend HTTP RPC would be an interesting datapoint in itself. I initially wanted to hack virsh to extract statistics about this but never took the time to do it :-)

Daniel
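As an aside, the midpoint approximation mentioned for the remote case would look something like this (a hypothetical helper written for illustration, not existing libvirt code):

#include <sys/time.h>

/* Estimate when the remote end actually sampled the counters: assume it
 * happened halfway through the request/response round trip. */
static struct timeval estimate_sample_time(struct timeval sent, struct timeval received)
{
    long long us_sent = (long long)sent.tv_sec * 1000000 + sent.tv_usec;
    long long us_recv = (long long)received.tv_sec * 1000000 + received.tv_usec;
    long long us_mid  = us_sent + (us_recv - us_sent) / 2;

    struct timeval mid;
    mid.tv_sec  = us_mid / 1000000;
    mid.tv_usec = us_mid % 1000000;
    return mid;
}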

On Thu, Apr 13, 2006 at 05:57:13PM -0400, Daniel Veillard wrote:
> On Thu, Apr 13, 2006 at 10:37:17PM +0100, Daniel P. Berrange wrote:
> > I've not really got any formal data on it at this time - it was just a random afternoon thought. I'll see if there's any useful way to get some data on the effects.
> If running as root locally with Xen, getting the data is a simple hypercall; I would expect that to be nearly as fast as a gettimeofday(), and the timestamp won't increase precision. In the case of a non-root local process making an HTTP request to xend, the time spent could potentially be quite large - actually not bounded at all due to potential I/O - and the quality of the data extracted will then be poor because of the time of acquisition; would that be worth it? The last corner case is remote monitoring, where the time spent is most likely due to the network round trip, which in general is approximated by taking the midpoint between emission and reception; the time to do the 2 gettimeofday() calls is probably negligible. So in those 3 kinds of extreme scenario it's a bit unclear how adding the timestamp to the data would really help, except maybe as a convenience to the user layer. Actually, getting some data about the cost of doing the call as root through the hypervisor versus the xend HTTP RPC would be an interesting datapoint in itself. I initially wanted to hack virsh to extract statistics about this but never took the time to do it :-)
So I wrote a crude micro-benchmark just to analyse the cost of calling virDomainGetInfo under different circumstances. Basically the loop does 10,000 calls to the method and reports the min, max and average time. Like I said, the test is crude, but the results give a picture consistent with what I'm seeing in practice (i.e. the applet consumes 5-10% CPU just updating domain stats once a second). All times are milliseconds in the following results.

1. Running the test as root, so virDomainGetInfo does a hypercall:

   Total: 239.397094726562  Avg: 0.0239397094726563  Min: 0.021484375  Max: 0.548974609375

2. Running the test unprivileged, so calls go via XenD/XenStoreD:

   Total: 71546.1286621094  Avg: 7.15461286621094  Min: 6.1657958984375  Max: 45.3959228515625

So, as is to be expected, the XenD/XenStoreD approach has significantly higher overhead than direct HV calls. The question is whether a x350 overhead for unprivileged users is acceptable, or whether it can be improved to just one order of magnitude worse.

As a proof of concept, I wrote a daemon which exposes the APIs from libvirt as a DBus service, then adapted the test case to call the DBus service rather than libvirt directly.

3. Running the DBus service as root, so libvirt can make HV calls:

   Total: 11280.2186035156  Avg: 1.12802186035156  Min: 1.0397216796875  Max: 6.5512939453125

So this basic DBus service (written in Perl BTW) has approx x50 overhead compared to HV calls, significantly better than the existing HTTP/SExpr RPC method. It'll be interesting to see how the new XML-RPC method compares in performance.

Getting back to the original point of my first mail: while there is definitely a difference between calls via the HV and those via XenD/XenStore, even the worst case is only 45 milliseconds. With the applet taking measurements once per second it looks like CPU utilization calculations will be accurate enough, so there is no pressing need to add a timestamp to virDomainInfo.

Regards, Dan.
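The benchmark itself wasn't posted; for readers who want to reproduce the numbers, a rough sketch of such a loop in C against the libvirt API might look like the following (the domain ID and iteration count are arbitrary, and error handling is omitted):

#include <stdio.h>
#include <sys/time.h>
#include <libvirt/libvirt.h>

int main(void)
{
    /* Open read-only as an unprivileged user; as root, virConnectOpen(NULL)
     * lets libvirt use direct hypervisor calls instead. */
    virConnectPtr conn = virConnectOpenReadOnly(NULL);
    virDomainPtr dom = virDomainLookupByID(conn, 1);   /* arbitrary test domain */
    virDomainInfo info;
    struct timeval t0, t1;
    double total = 0, min = 1e9, max = 0;
    int i, iters = 10000;

    for (i = 0; i < iters; i++) {
        gettimeofday(&t0, NULL);
        virDomainGetInfo(dom, &info);
        gettimeofday(&t1, NULL);

        double ms = (t1.tv_sec - t0.tv_sec) * 1000.0 +
                    (t1.tv_usec - t0.tv_usec) / 1000.0;
        total += ms;
        if (ms < min) min = ms;
        if (ms > max) max = ms;
    }

    printf("Total: %g Avg: %g Min: %g Max: %g\n", total, total / iters, min, max);

    virDomainFree(dom);
    virConnectClose(conn);
    return 0;
}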

On Tue, Apr 18, 2006 at 11:32:12PM +0100, Daniel P. Berrange wrote:
> [ data snipped ]

Cool, thanks for the data. This confirms my expectations too.
> So, as is to be expected, the XenD/XenStoreD approach has significantly higher overhead than direct HV calls. The question is whether a x350 overhead for unprivileged users is acceptable, or whether it can be improved to just one order of magnitude worse.

With HTTP and no pipelining of requests it's going to be hard to improve, but I'm afraid most of the time was spent in Python code interpretation on the xend side. Did you run top at some point to check?
> As a proof of concept, I wrote a daemon which exposes the APIs from libvirt as a DBus service, then adapted the test case to call the DBus service rather than libvirt directly.
>
> 3. Running the DBus service as root, so libvirt can make HV calls:
>
>    Total: 11280.2186035156  Avg: 1.12802186035156  Min: 1.0397216796875  Max: 6.5512939453125
>
> So this basic DBus service (written in Perl BTW) has approx x50 overhead compared to HV calls, significantly better than the existing HTTP/SExpr RPC method. It'll be interesting to see how the new XML-RPC method compares in performance.
Well, it doesn't exist yet, at least on the client side. There are certainly a number of optimizations doable; hard to tell without a first full round trip to test.
> Getting back to the original point of my first mail: while there is definitely a difference between calls via the HV and those via XenD/XenStore, even the worst case is only 45 milliseconds. With the applet taking measurements once per second it looks like CPU utilization calculations will be accurate enough, so there is no pressing need to add a timestamp to virDomainInfo.
Okay. Still, this raises the interesting point of virDomainInfo's size and future ABI compatibility. I wanted to avoid returning memory allocated by the library, so the client allocates that structure, most likely on the stack, and we really can't change its size in the future. We could try to make it future proof by adding extra padding, but then what would be returned would be libvirt version dependent, which is not a good thing either. The last possibility is to add version info to virDomainInfo itself, but that's messy too: what happens if the client code forgets to initialize the value? There is a risk of a random crash.

So, all options considered, it seems virDomainInfo should be defined once and for all, so if we foresee more information being needed it should be added ASAP (note that network device information should really be made part of a different probe call IMHO). There is also a cost tradeoff: if the information is pulled from an RPC it's best to minimize the round trips and extract as much as possible in one call, while if using an HV call the danger is not being able to complete the virDomainInfo and requiring some more costly operations. In a nutshell, I just hope the current set of data in virDomainInfo is sufficient for applications and we won't need to extend it.

Daniel
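For reference, this is the structure under discussion, roughly as it appears in <libvirt/libvirt.h> (field comments paraphrased). Because the client allocates it itself, typically on the stack, and passes a pointer to virDomainGetInfo(), its size is effectively frozen into the ABI:

/* The fixed-size record filled in by virDomainGetInfo(). */
typedef struct _virDomainInfo virDomainInfo;

struct _virDomainInfo {
    unsigned char state;        /* the running state of the domain */
    unsigned long maxMem;       /* the maximum memory allowed, in KBytes */
    unsigned long memory;       /* the memory used by the domain, in KBytes */
    unsigned short nrVirtCpu;   /* the number of virtual CPUs */
    unsigned long long cpuTime; /* the CPU time used, in nanoseconds */
};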

On Wed, Apr 19, 2006 at 04:47:56AM -0400, Daniel Veillard wrote:
> On Tue, Apr 18, 2006 at 11:32:12PM +0100, Daniel P. Berrange wrote:
> > So, as is to be expected, the XenD/XenStoreD approach has significantly higher overhead than direct HV calls. The question is whether a x350 overhead for unprivileged users is acceptable, or whether it can be improved to just one order of magnitude worse.
> With HTTP and no pipelining of requests it's going to be hard to improve, but I'm afraid most of the time was spent in Python code interpretation on the xend side. Did you run top at some point to check?
Yes, the CPU was highly loaded by both xenstore & xend processes.
> > So this basic DBus service (written in Perl BTW) has approx x50 overhead compared to HV calls, significantly better than the existing HTTP/SExpr RPC method. It'll be interesting to see how the new XML-RPC method compares in performance.
> Well, it doesn't exist yet, at least on the client side. There are certainly a number of optimizations doable; hard to tell without a first full round trip to test.
> So, all options considered, it seems virDomainInfo should be defined once and for all, so if we foresee more information being needed it should be added ASAP (note that network device information should really be made part of a different probe call IMHO).
That makes sense - since we can have an arbitrary number of network / disk adapters registered to the machine, it wouldn't really be feasible to reserve space for them in a statically sized structure.

Dan.