[Libvir] libvirt and accessing remote systems

This is a follow-on to this thread: https://www.redhat.com/archives/libvir-list/2007-January/thread.html#00064 but I think it deserves a thread of its own for discussion.

Background:

Dan drew this diagram proposing a way to include remote access to systems from within libvirt: http://people.redhat.com/berrange/libvirt/libvirt-arch-remote-2.png

libvirt would continue, as now, to provide direct hypervisor calls, direct access to xend and so on. But in addition, a new backend would be written ("remote") which could talk to a remote daemon ("libvirtd") using some sort of RPC mechanism.

Position:

I gave this architecture some thought over the weekend, and I like it for the following reasons (some not very technical):

* Authentication and encryption is handled entirely within the libvirt / libvirtd library, allowing us to use whatever RPC mechanism we like on top of a selection of transports of our choosing (eg. GnuTLS, ssh, unencrypted TCP sockets, ...)

* We don't need to modify xend at all, and additionally we won't need to modify future flavour-of-the-month virtual machine monitors. I have a particular issue with xend (written in Python) because in my own tests I've seen my Python XMLRPC/SSL server actually segfault. It doesn't inspire me that this Python solution is adding anything more than apparent security.

* The architecture is very flexible: it allows virt-manager to run as root or as non-root, according to customer wishes. virt-manager can make direct HV calls, or everything can be remoted, and it's easy to explain to the user about the performance vs management trade-offs.

* It's relatively easy to implement. Note that libvirtd is just a thin server layer linked to its own copy of libvirt.

* Another proposal was to make all libvirt calls remote (http://people.redhat.com/berrange/libvirt/libvirt-arch-remote-3.png) but I don't think this is a going concern because (1) it requires a daemon always be run, which is another installation problem and another chance for sysadmins to give up, and (2) the perception will be that this is slow, whether or not that is actually true.

Now some concerns:

* libvirtd will likely need to be run as root, so another root daemon written in C listening on a public port. (On the other hand, xend listening on a public port also isn't too desirable, even with authentication.)

* If Xen upstream in the meantime come up with a secure remote access method then potentially this means clients could have to choose between the two, or run two services (libvirtd + Xen/remote).

* There are issues with versioning the remote API. Do we allow different versions of libvirt/libvirtd to talk to each other? Do we provide backwards compatibility when we move to a new API?

* Do we allow more than one client to talk to a libvirtd daemon (no | multiple readers, one writer | multiple readers & writers)?

* What's the right level to make a remote API? Should we batch calls up together?

RPC mechanism:

I've been investigating RPC mechanisms and there seem to be two reasonable possibilities, SunRPC and XMLRPC. (Both would need to run over some sort of secure connection, so there is a layer below both.) My analysis of those is here: http://et.redhat.com/~rjones/secure_rpc/

Rich.
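To make the proposed split concrete, here is a purely hypothetical sketch (not libvirt code; every name is invented for illustration): the public call stays a thin wrapper over a table of backend function pointers, so a new "remote" entry can forward to libvirtd without touching the callers or the existing local backends.

    /* Purely illustrative sketch (not libvirt code): the public call is
     * a thin wrapper over a table of backend function pointers, so a
     * "remote" entry can forward to libvirtd without changing callers. */
    #include <stdio.h>

    typedef struct {
        const char *name;
        int (*domain_get_info)(int domid, unsigned long *mem_kb);
    } driver;

    /* existing local backend: direct hypercall / xend (stubbed here) */
    static int direct_get_info(int domid, unsigned long *mem_kb)
    {
        (void)domid;
        *mem_kb = 262144;        /* pretend we asked the hypervisor */
        return 0;
    }

    /* proposed backend: marshal the call, ship it to libvirtd over
     * GnuTLS / ssh / TCP, and unpack the reply (stubbed here) */
    static int remote_get_info(int domid, unsigned long *mem_kb)
    {
        (void)domid;
        *mem_kb = 262144;
        return 0;
    }

    static const driver drivers[] = {
        { "direct", direct_get_info },
        { "remote", remote_get_info },
    };

    int main(void)
    {
        unsigned long mem;
        for (int i = 0; i < 2; i++) {
            drivers[i].domain_get_info(1, &mem);
            printf("%s backend: dom 1 has %lu kB\n", drivers[i].name, mem);
        }
        return 0;
    }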

On Wed, Jan 24, 2007 at 02:17:31PM +0000, Richard W.M. Jones wrote:
This is a follow on to this thread: https://www.redhat.com/archives/libvir-list/2007-January/thread.html#00064 but I think it deserves a thread of its own for discussion.
Background:
Dan drew this diagram proposing a way to include remote access to systems from within libvirt:
http://people.redhat.com/berrange/libvirt/libvirt-arch-remote-2.png
libvirt would continue as now to provide direct hypervisor calls, direct access to xend and so on. But in addition, a new backend would be written ("remote") which could talk to a remote daemon ("libvirtd") using some sort of RPC mechanism.
Position:
I gave this architecture some thought over the weekend, and I like it for the following reasons (some not very technical):
* Authentication and encryption is handled entirely within the libvirt / libvirtd library, allowing us to use whatever RPC mechanism we like on top of a selection of transports of our choosing (eg. GnuTLS, ssh, unencrypted TCP sockets, ...)
Yes, having a single consistent wire encryption & user auth system across all virt backends makes for a very nice end user / admin story.
* We don't need to modify xend at all, and additionally we won't need to modify future flavour-of-the-month virtual machine monitors.
I have a particular issue with xend (written in Python) because in my own tests I've seen my Python XMLRPC/SSL server actually segfault. It doesn't inspire me that this Python solution is adding anything more than apparent security.
Did I mention XenD is slow? If we can get remote management of Xen bypassing XenD, just like we do for the local case, we'll be much better off.
* The architecture is very flexible: It allows virt-manager to run as root or as non-root, according to customer wishes. virt-manager can make direct HV calls, or everything can be remoted, and it's easy to explain to the user about the performance vs management trade-offs.
* It's relatively easy to implement. Note that libvirtd is just a thin server layer linked to its own copy of libvirt.
* Another proposal was to make all libvirt calls remote (http://people.redhat.com/berrange/libvirt/libvirt-arch-remote-3.png) but I don't think this is a going concern because (1) it requires a daemon always be run, which is another installation problem and another chance for sysadmins to give up, and (2) the perception will be that this is slow, whether or not that is actually true.
I'd never compared performance of direct hypercalls vs libvirt_proxy before, so I did a little test. The most commonly called method in virt-manager is virDomainGetInfo for fetching current status of a running domain - we call that once a second per guest.

So I wrote a simple program in C which calls virDomainGetInfo 100,000 times for 3 active guest VMs. I ran the test under a couple of different libvirt backends. The results were:

1. As root, direct hypercalls -> 1.4 seconds
2. As non-root, hypercalls via libvirt_proxy -> 9 seconds
3. As non-root, via XenD -> 45 minutes [1]

So although it is 10x slower than hypercalls, the libvirt_proxy is actually pretty damn fast - 9 seconds for 300,000 calls.

There are many reasons the XenD path is slow. Each operation makes a new HTTP request. It spawns a new thread per request. It talks to XenStore for every request, which has very high I/O overhead. It uses the old SEXPR protocol, which requests far more info than we actually need. It is written in Python.

Now I'm sure we can improve performance somewhat by switching to the new XML-RPC API and getting persistent connections running, but I doubt it'll ever be as fast as libvirt_proxy, let alone hypercalls. So as mentioned above, I'd like to take XenD out of the loop for remote management just like we do for the local case with libvirt_proxy, but with full authenticated read+write access.
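The thread doesn't include Dan's actual test program, but a minimal sketch of that kind of benchmark, using the standard libvirt C API (virConnectOpenReadOnly, virConnectListDomains, virDomainLookupByID, virDomainGetInfo), looks roughly like this:

    /* Rough shape of the benchmark described above (not Dan's actual
     * program): look up the running guests once, then hammer
     * virDomainGetInfo in a loop. */
    #include <stdio.h>
    #include <libvirt/libvirt.h>

    #define GUESTS     3
    #define ITERATIONS 100000

    int main(void)
    {
        virConnectPtr conn = virConnectOpenReadOnly(NULL);
        if (!conn) {
            fprintf(stderr, "failed to connect to the hypervisor\n");
            return 1;
        }

        int ids[GUESTS];
        int n = virConnectListDomains(conn, ids, GUESTS);

        virDomainPtr doms[GUESTS];
        for (int i = 0; i < n; i++)
            doms[i] = virDomainLookupByID(conn, ids[i]);

        virDomainInfo info;
        for (int iter = 0; iter < ITERATIONS; iter++)
            for (int i = 0; i < n; i++)
                if (doms[i])
                    virDomainGetInfo(doms[i], &info);

        for (int i = 0; i < n; i++)
            if (doms[i])
                virDomainFree(doms[i]);
        virConnectClose(conn);
        return 0;
    }

Compile with something like `gcc test.c -lvirt` and run it under `time`, once as root and once as non-root, to reproduce the comparison.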
Now some concerns:
* libvirtd will likely need to be run as root, so another root daemon written in C listening on a public port. (On the other hand, xend listening on a public port also isn't too desirable, even with authentication).
For Xen we have no choice but to have something running as root, since hypercalls need to mlock() memory and access the Xen device node. So both options are unpleasant, but we have to choose one, and I can't say XenD is the obvious winner - particularly given the tendency of Python SSL code to segfault. We can, however, make sure that libvirtd is written to allow the full suite of modern protection mechanisms to be applied - SELinux, ExecShield, TLS, FORTIFY_SOURCE, etc.
* If Xen upstream in the meantime come up with a secure remote access method then potentially this means clients could have to choose between the two, or run two services (libvirtd + Xen/remote).
* There are issues with versioning the remote API. Do we allow different versions of libvirt/libvirtd to talk to each other? Do we provide backwards compatibility when we move to a new API?
We can simply apply the same rules as we do for the public API: no changes to existing calls, only additions are allowed. A simple protocol version number can allow the client & server to negotiate the mutually supported feature set.
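A minimal sketch of that kind of negotiation, assuming a hypothetical hello packet carrying each side's highest supported protocol version (this is not the actual proxy or libvirtd wire format):

    /* Hypothetical version handshake: each side advertises the highest
     * protocol version it speaks, both then use the minimum of the two,
     * and calls added in later versions are simply never used against
     * an older peer. */
    #include <stdio.h>

    struct hello_pkt {
        unsigned short version;   /* highest protocol version supported */
    };

    static unsigned short negotiate(unsigned short mine, unsigned short theirs)
    {
        return mine < theirs ? mine : theirs;
    }

    int main(void)
    {
        struct hello_pkt client = { 3 }, server = { 2 };
        printf("speaking protocol version %d\n",
               (int)negotiate(client.version, server.version));
        return 0;
    }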
* Do we allow more than one client to talk to a libvirtd daemon (no | multiple readers one writer | multiple readers & writers).
The latter - since we have things modifying domains via XenD, or the HV indirectly updating domain state, we de facto have multiple writers already. From my work in the qemu daemon, I didn't encounter any major problems with allowing multiple writers - by using a poll()-based single-threaded event loop approach, I avoided any nasty multi-thread problems associated with multiple connections. Provided each request can be completed in a short amount of time, there should be no need to go fully threaded.
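A minimal sketch of the poll()-based single-threaded pattern described above, with one listening Unix socket and one slot per client; the socket path and the request handling are placeholders, and error handling is pared down:

    /* Minimal single-threaded poll() loop: one listening socket plus one
     * slot per connected client.  Each request is handled to completion
     * before the next poll(), so no locking is needed. */
    #include <poll.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/socket.h>
    #include <sys/un.h>

    #define MAX_CLIENTS 64

    static void serve(int listen_fd)
    {
        struct pollfd fds[MAX_CLIENTS + 1];
        int nfds = 1;

        fds[0].fd = listen_fd;
        fds[0].events = POLLIN;

        for (;;) {
            if (poll(fds, nfds, -1) < 0)
                continue;                      /* e.g. EINTR */

            if ((fds[0].revents & POLLIN) && nfds < MAX_CLIENTS + 1) {
                int c = accept(listen_fd, NULL, NULL);
                if (c >= 0) {
                    fds[nfds].fd = c;
                    fds[nfds].events = POLLIN;
                    fds[nfds].revents = 0;     /* not polled yet */
                    nfds++;
                }
            }

            for (int i = 1; i < nfds; i++) {
                if (!(fds[i].revents & POLLIN))
                    continue;
                char buf[512];
                ssize_t len = read(fds[i].fd, buf, sizeof buf);
                if (len <= 0) {                /* client went away */
                    close(fds[i].fd);
                    fds[i] = fds[--nfds];      /* compact the array */
                    i--;
                } else {
                    /* decode one request, run the local libvirt call,
                     * write the reply back - short-lived work, so the
                     * loop stays responsive without threads */
                }
            }
        }
    }

    int main(void)
    {
        int fd = socket(AF_UNIX, SOCK_STREAM, 0);
        struct sockaddr_un sa = { .sun_family = AF_UNIX };

        strncpy(sa.sun_path, "/tmp/demo-libvirtd.sock", sizeof sa.sun_path - 1);
        unlink(sa.sun_path);
        if (fd < 0 || bind(fd, (struct sockaddr *)&sa, sizeof sa) < 0 ||
            listen(fd, 8) < 0) {
            perror("socket/bind/listen");
            return 1;
        }
        serve(fd);
        return 0;
    }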
* What's the right level to make a remote API? Should we batch calls up together?
We may already be constrained by the client side API - all calls in the libvirt public API are synchronous, so from a single client thread there is nothing available to batch.
I've been investigating RPC mechanisms and there seem to be two reasonable possibilities, SunRPC and XMLRPC. (Both would need to run over some sort of secure connection, so there is a layer below both). My analysis of those is here:
SunRPC would handle our current APIs fine. We've talked every now & then about providing asynchronous callbacks into the API - eg, so the client can be notified of VM state changes without having to poll the virDomainGetInfo API every second. The RPC wire protocol certainly supports that, but it's not clear the C APIs do. The XDR wire formatting rules are very nicely defined - another option is to use XDR as the wire encoding for our existing prototype implementation in the qemud.

For XML-RPC I'd like to do a proof of concept of the virDomainGetInfo method implementation to see how much overhead it adds. Hopefully it would be acceptable, although I'm sure it's a fair bit more than XDR / SunRPC. We would need persistent connections for XML-RPC to be viable, particularly with TLS enabled. Since XML-RPC doesn't really impose any firm C API, I imagine we could get asynchronous notifications from the server working without much trouble.

Dan.

[1] It didn't actually finish after 45 seconds. I just got bored of waiting.
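A small sketch of using XDR purely as a wire encoding, with the standard <rpc/xdr.h> memory-stream routines; the reply structure and its fields are made up for illustration and are not the libvirt_proxy or qemud format:

    /* Encode/decode a made-up domain-info reply with XDR memory streams.
     * Demonstrates XDR as a wire encoding only; no SunRPC server/client
     * machinery is involved. */
    #include <stdio.h>
    #include <rpc/xdr.h>

    struct dom_info_reply {
        u_int state;
        u_int max_mem_kb;
        u_int nr_cpus;
    };

    static bool_t xdr_dom_info_reply(XDR *xdrs, struct dom_info_reply *r)
    {
        return xdr_u_int(xdrs, &r->state) &&
               xdr_u_int(xdrs, &r->max_mem_kb) &&
               xdr_u_int(xdrs, &r->nr_cpus);
    }

    int main(void)
    {
        char wire[64];
        XDR enc, dec;
        struct dom_info_reply out = { 1, 262144, 2 }, in;

        xdrmem_create(&enc, wire, sizeof wire, XDR_ENCODE);
        if (!xdr_dom_info_reply(&enc, &out))
            return 1;
        unsigned len = xdr_getpos(&enc);        /* bytes actually used */

        xdrmem_create(&dec, wire, len, XDR_DECODE);
        if (!xdr_dom_info_reply(&dec, &in))
            return 1;

        printf("state=%u maxmem=%u kB vcpus=%u (%u bytes on the wire)\n",
               in.state, in.max_mem_kb, in.nr_cpus, len);
        return 0;
    }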

On Wed, Jan 24, 2007 at 11:48:47PM +0000, Daniel P. Berrange wrote:
On Wed, Jan 24, 2007 at 02:17:31PM +0000, Richard W.M. Jones wrote:
* Another proposal was to make all libvirt calls remote (http://people.redhat.com/berrange/libvirt/libvirt-arch-remote-3.png) but I don't think this is a going concern because (1) it requires a daemon always be run, which is another installation problem and another chance for sysadmins to give up, and (2) the perception will be that this is slow, whether or not that is actually true.
I'd never compared performance of direct hypercalls vs libvirt_proxy before, so I did a little test. The most commonly called method in virt-manager is virDomainGetInfo for fetching current status of a running domain - we call that once a second per guest.
So I wrote a simple program in C which calls virDomainGetInfo 100,000 times for 3 active guest VMs. I ran the test under a couple of different libvirt backends. The results were:
1. As root, direct hypercalls -> 1.4 seconds
2. As non-root, hypercalls via libvirt_proxy -> 9 seconds
3. As non-root, via XenD -> 45 minutes [1]
So although it is 10x slower than hypercalls, the libvirt_proxy is actually pretty damn fast - 9 seconds for 300,000 calls.
Interesting figures - I had expected the proxy inter-process communication to slow things down more. I guess it works well because the scheduling follows the message passing exactly, so there is little latency in the RPC. Was that on a uniprocessor machine?
There are many reasons the XenD path is slow. Each operation makes a new HTTP request. It spawns a new thread per request. It talks to XenStore for every request which has very high I/O overhead. It uses the old SEXPR protocol which requests far more info than we actually need. It is written in Python. Now I'm sure we can improve performance somewhat by switching to the new XML-RPC api, and getting persistent connections running, but I doubt it'll ever be as fast as libvirt_proxy let alone hypercalls. So as mentioned above, I'd like to take XenD out of the loop for remote management just like we do for the local case with libvirt_proxy, but with full authenticated read+write access.
I love XML, but I doubt switching to XML-RPC will speed things up. Well, maybe if the parser is written in C, but parsing an XML instance still has a cost, and I doubt you will get anywhere close to the 300,000/s of a proxy-like RPC - that would mean 600,000 XML instance parses per second of pure overhead, and that's really not realistic.

My only concern with an ad-hoc protocol like the proxy one is that it would make it harder to build a client side in, say, Java (since we don't have bindings, and that would be a relatively nice way to do it, as mixing C and Java always raises some resistance/problems). Though I really don't think this is a blocker.
* If Xen upstream in the meantime come up with a secure remote access method then potentially this means clients could have to choose between the two, or run two services (libvirtd + Xen/remote).
* There are issues with versioning the remote API. Do we allow different versions of libvirt/libvirtd to talk to each other? Do we provide backwards compatibility when we move to a new API?
We can simply apply the same rules as we do for public API. No changes to existing calls, only additions are allowed. A simple protocol version number can allow the client & server to negotiate the mutually supported feature set.
Agreed, actually we already do that with the protocol to the proxy:

    ....
    struct _virProxyPacket {
        unsigned short version;        /* version of the proxy protocol */
        unsigned short command;        /* command number, a virProxyCommand */
    ....

If we stick to adding only, that should work in general - c.f. the long-term cohabitation of HTTP 1.0 and HTTP 1.1.
* Do we allow more than one client to talk to a libvirtd daemon (no | multiple readers one writer | multiple readers & writers).
The latter - since we have things modifying domains via XenD, or the HV indirectly updating domain state, we de facto have multiple writers already. From my work in the qemu daemon, I didn't encounter any major problems with allowing multiple writers - by using a poll()-based single-threaded event loop approach, I avoided any nasty multi-thread problems associated with multiple connections. Provided each request can be completed in a short amount of time, there should be no need to go fully threaded.
And if needed we can revamp the design internally later without breaking any compatibility, since we operate in strictly synchronous RPC mode at the connection level (true from the C API down to the wire).
* What's the right level to make a remote API? Should we batch calls up together?
We may already be constrained by the client side API - all calls in the libvirt public API are synchronous, so from a single client thread there is nothing available to batch.
Yup, let's avoid async support; that has killed so many protocols that I'm really cautious about it. If that means we need to add higher-level APIs, why not - but this really should come as a user-induced evolution, i.e. from field reports :-)
I've been investigating RPC mechanisms and there seem to be two reasonable possibilities, SunRPC and XMLRPC. (Both would need to run over some sort of secure connection, so there is a layer below both). My analysis of those is here:
SunRPC would handle our current APIs fine. We've talked every now & then about providing asynchronous callbacks into the API - eg, so the client can be notified of VM state changes without having to poll the virDomainGetInfo API every second. The RPC wire protocol certainly supports that, but it's not clear the C APIs do.
Callbacks are hairy. Somehow I would prefer to allow piggybacking extra payload on an RPC return rather than having the server initiate one; this simplifies both the client and server code, and also integration with the client event loop (please, no threads!)
The XDR wire formatting rules are very nicely defined - another option is to use XDR as the wire encoding for our existing prototype implementation in the qemud.
For XML-RPC I'd like to do a proof of concept of the virDomainGetInfo method implementation to see how much overhead it adds. Hopefully it would be acceptable, although I'm sure it's a fair bit more than XDR / SunRPC. We would need persistent connections for XML-RPC to be viable, particularly with TLS enabled. Since XML-RPC doesn't really impose any firm C API, I imagine we could get asynchronous notifications from the server working without much trouble.
I'm a bit afraid of the XML-RPC overhead. Parsing 10,000 small instances per second with libxml2 is doable with a bit of tuning, but that's still more than an order of magnitude slower than the ad-hoc protocol, and that's before the marshalling/demarshalling of the values to and from strings. For SunRPC, well, if you do some testing that will be fantastic.
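To get a feel for the parsing cost being estimated here, a loop like the following (parsing one fabricated XML-RPC-style response repeatedly with libxml2's xmlReadMemory) is easy to measure with `time`; the document text is invented for illustration:

    /* Parse a small, fabricated XML-RPC-ish response N times with
     * libxml2 to get a rough per-parse cost; time it externally. */
    #include <stdio.h>
    #include <string.h>
    #include <libxml/parser.h>

    static const char *doc =
        "<?xml version=\"1.0\"?>"
        "<methodResponse><params><param><value><struct>"
        "<member><name>state</name><value><int>1</int></value></member>"
        "<member><name>maxMem</name><value><int>262144</int></value></member>"
        "</struct></value></param></params></methodResponse>";

    int main(void)
    {
        int len = (int)strlen(doc);
        for (int i = 0; i < 10000; i++) {
            xmlDocPtr d = xmlReadMemory(doc, len, "reply.xml", NULL,
                                        XML_PARSE_NONET);
            if (!d)
                return 1;
            /* a real client would now walk the tree and pull values out */
            xmlFreeDoc(d);
        }
        xmlCleanupParser();
        return 0;
    }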
[1] It didn't actually finish after 45 seconds. I just got bored of waiting.
s/seconds/minutes/ I guess - and you checked the CPU was at 100%, not a bad deadlock somewhere, right ;-) ?

Daniel

On Thu, 2007-01-25 at 04:56 -0500, Daniel Veillard wrote:
I'm a bit afraid of the XML-RPC overhead. Parsing 10,000 small instances per second with libxml2 is doable with a bit of tuning, but that's still more than an order of magnitude slower than the ad-hoc protocol, and that's before the marshalling/demarshalling of the values to and from strings. For SunRPC, well, if you do some testing that will be fantastic.
So, you're making a guess at 100us of overhead for the XML processing? That's an extra 30s over Dan's test. That really wouldn't concern me in the slightest ... if a DomainGetInfo() call is in the sub-millisecond range then it's never going to be a bottleneck, IMHO. (Especially since the reason Dan is talking about XML-RPC is so that we can have notification of state changes instead of polling for them.)

Cheers,
Mark.

On Thu, 2007-01-25 at 10:50 +0000, Mark McLoughlin wrote:
If a DomainGetInfo() call is in the sub-millisecond range then it's never going to be a bottleneck IMHO.
Hmm, that might sound like I'm talking out my ass ...

Consider an app that displays the UUID of all guests in a list. It calls GetUUID() for each guest. With 100 guests you get figures like:

+ Direct HV calls - 500us
+ Proxy - 3ms
+ XML-RPC proxy - 100ms
+ Proxy over network - 10s
+ XML-RPC proxy over network - 10.1s

So, the app seems snappy even with this number of guests until you run it over the network. At this point we optimise by adding a ListDomainUUIDs() call, and this speeds up the local case too. At no point should we worry about the XML-RPC.

Cheers,
Mark.

On Thu, Jan 25, 2007 at 11:03:32AM +0000, Mark McLoughlin wrote:
On Thu, 2007-01-25 at 10:50 +0000, Mark McLoughlin wrote:
If a DomainGetInfo() call is in the sub-millisecond range then it's never going to be a bottleneck IMHO.
Hmm, that might sound like I'm talking out my ass ...
Consider an app that displays the UUID of all guests in a list. It calls GetUUID() for each guest. With 100 guests you get figures like:
+ Direct HV calls - 500us
+ Proxy - 3ms
+ XML-RPC proxy - 100ms
+ Proxy over network - 10s
+ XML-RPC proxy over network - 10.1s
So, the app seems snappy even with this number of guests until you run it over the network. At this point we optimise by adding a ListDomainUUIDs() call and this speeds up the local case too. At no point should we worry about the XML-RPC.
When we consider libvirt over a network, we're not really talking across the whole world - really a LAN/WAN in a data-center management role. I would not expect the network roundtrip ping time to be more than 20ms, perhaps even as low as 5ms or less. So the 10s figure is a bit of an overestimate, I think.

In addition, ignoring the roundtrip time for the client, that is not inconsiderable CPU overhead on the server - monitoring 100 guests once per second uses 10% CPU time with XML-RPC compared to 0.3% with the proxy. That's not something to be discounted, because there may well be several apps talking to libvirt on a machine - eg the systems management tool, a monitoring daemon, and a VM policy daemon.

Regards,
Dan.
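Reading those percentages backwards (my reconstruction - the per-call costs are implied rather than stated): the proxy figure follows from the measured 9 s for 300,000 calls, and the XML-RPC figure assumes roughly 1 ms of server-side work per call (parsing plus TLS plus marshalling):

    \[ \text{proxy: } 100\ \tfrac{\text{calls}}{\text{s}} \times \tfrac{9\,\text{s}}{300\,000\ \text{calls}} \approx 100 \times 30\,\mu\text{s} = 3\,\text{ms/s} \approx 0.3\%\ \text{CPU} \]
    \[ \text{XML-RPC: } 100\ \tfrac{\text{calls}}{\text{s}} \times \sim 1\,\text{ms} = 100\,\text{ms/s} \approx 10\%\ \text{CPU} \]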

On Thu, Jan 25, 2007 at 04:56:23AM -0500, Daniel Veillard wrote:
On Wed, Jan 24, 2007 at 11:48:47PM +0000, Daniel P. Berrange wrote:
On Wed, Jan 24, 2007 at 02:17:31PM +0000, Richard W.M. Jones wrote:
* Another proposal was to make all libvirt calls remote (http://people.redhat.com/berrange/libvirt/libvirt-arch-remote-3.png) but I don't think this is a going concern because (1) it requires a daemon always be run, which is another installation problem and another chance for sysadmins to give up, and (2) the perception will be that this is slow, whether or not that is actually true.
I'd never compared performance of direct hypercalls vs libvirt_proxy before, so I did a little test. The most commonly called method in virt-manager is virDomainGetInfo for fetching current status of a running domain - we call that once a second per guest.
So I wrote a simple program in C which calls virDomainGetInfo 100,000 times for 3 active guest VMs. I ran the test under a couple of different libvirt backends. The results were:
1. As root, direct hypercalls -> 1.4 seconds
2. As non-root, hypercalls via libvirt_proxy -> 9 seconds
3. As non-root, via XenD -> 45 minutes [1]
So although it is 10x slower than hypercalls, the libvirt_proxy is actually pretty damn fast - 9 seconds for 300,000 calls.
Interesting figures - I had expected the proxy inter-process communication to slow things down more. I guess it works well because the scheduling follows the message passing exactly, so there is little latency in the RPC. Was that on a uniprocessor machine?
It was a dual-core machine, so there wasn't as much process contention as you'd get on a uniprocessor.
[1] It didn't actually finish after 45 seconds. I just got bored of waiting.
s/seconds/minutes/ I guess, and you checked CPU was at 100% not a bad deadlock somewhere, right ;-) ?
Of course I meant minutes here :-) At least 60% of the CPU time was the usual problem of XenStoreD doing stupid amounts of I/O.

Dan.

On Thu, Jan 25, 2007 at 04:56:23AM -0500, Daniel Veillard wrote:
On Wed, Jan 24, 2007 at 11:48:47PM +0000, Daniel P. Berrange wrote:
There are many reasons the XenD path is slow. Each operation makes a new HTTP request. It spawns a new thread per request. It talks to XenStore for every request which has very high I/O overhead. It uses the old SEXPR protocol which requests far more info than we actually need. It is written in Python. Now I'm sure we can improve performance somewhat by switching to the new XML-RPC api, and getting persistent connections running, but I doubt it'll ever be as fast as libvirt_proxy let alone hypercalls. So as mentioned above, I'd like to take XenD out of the loop for remote management just like we do for the local case with libvirt_proxy, but with full authenticated read+write access.
I love XML, but I doubt switching to XML-RPC will speed things up. Well, maybe if the parser is written in C, but parsing an XML instance still has a cost, and I doubt you will get anywhere close to the 300,000/s of a proxy-like RPC - that would mean 600,000 XML instance parses per second of pure overhead, and that's really not realistic.
Actually, I should have clarified that - the reason I suggested switching to XML-RPC might be faster is not because XML is fast to parse! The current SEXPR protocol basically requires us to fetch the entire VM description each time, even though we only want the VM status info - building this VM description has quite significant overhead in XenD. So switching to XML-RPC would let us fetch only the info we actually need, which ought to remove a significant chunk of XenD CPU time.
My only concern with an ad-hoc protocol like the proxy one is that it would make it harder to build a client side in, say, Java (since we don't have bindings, and that would be a relatively nice way to do it, as mixing C and Java always raises some resistance/problems). Though I really don't think this is a blocker.
Not as much as you might think - in my side job working on DBus I've noticed that in the past couple of months people have provided 100% pure C# and Java libraries to speak the raw DBus protocol without very much effort - in many ways it actually simplified their code. So provided we /document/ any protocol and use reasonably portable data types, I don't think Java clients would be all that difficult.
I've been investigating RPC mechanisms and there seem to be two reasonable possibilities, SunRPC and XMLRPC. (Both would need to run over some sort of secure connection, so there is a layer below both). My analysis of those is here:
SunRPC would handle our current APIs fine. We've talked every now & then about providing asynchronous callbacks into the API - eg, so the client can be notified of VM state changes without having to poll the virDomainGetInfo API every second. The RPC wire protocol certainly supports that, but it's not clear the C APIs do.
Callbacks are hairy. Somehow I would prefer to allow piggybacking extra payload on an RPC return rather than having the server initiate one; this simplifies both the client and server code, and also integration with the client event loop (please, no threads!)
I would expect that if we wanted to add callbacks, we'd either provide a method to get a libvirt file descriptor, which the client could then add to their app's own event loop; or we'd let the client provide a set of functions which libvirt could use for (un)registering its file descriptors with an event loop as needed. I certainly wasn't thinking about threads :-) The latter is what DBus does, the former is what SunRPC does (albeit for a different reason, not async callbacks).

Dan.
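A minimal client-side sketch of the first option (exposing a file descriptor for the app's own event loop); virConnectGetFD() and virConnectDispatchEvent() are invented names for illustration, not existing libvirt functions:

    /* Hypothetical client-side use of a "give me the connection's fd"
     * style API: the application owns the event loop, libvirt just says
     * which descriptor to watch and is called back when it is readable.
     * virConnectGetFD()/virConnectDispatchEvent() are invented names. */
    #include <poll.h>
    #include <libvirt/libvirt.h>

    int virConnectGetFD(virConnectPtr conn);          /* hypothetical */
    int virConnectDispatchEvent(virConnectPtr conn);  /* hypothetical */

    void run_event_loop(virConnectPtr conn)
    {
        struct pollfd pfd = {
            .fd = virConnectGetFD(conn),
            .events = POLLIN,
        };

        for (;;) {
            if (poll(&pfd, 1, -1) <= 0)
                continue;
            if (pfd.revents & POLLIN) {
                /* reads the pending wire message and fires any
                 * registered "domain changed state" callbacks from
                 * this thread - no extra threads involved */
                virConnectDispatchEvent(conn);
            }
        }
    }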

Hi, On Wed, 2007-01-24 at 23:48 +0000, Daniel P. Berrange wrote:
On Wed, Jan 24, 2007 at 02:17:31PM +0000, Richard W.M. Jones wrote:
* Another proposal was to make all libvirt calls remote (http://people.redhat.com/berrange/libvirt/libvirt-arch-remote-3.png) but I don't think this is a going concern because (1) it requires a daemon always be run, which is another installation problem and another chance for sysadmins to give up, and (2) the perception will be that this is slow, whether or not that is actually true.
Note, I don't think this was just being proposed as a way to re-architect things for the remote access stuff. I suggested it as a way to ensure that, if we aggregated hypervisor types under the one connection, multiple guests of the same name couldn't be created. I.e. you'd need this if the daemon was more than just a proxy and performed any management tasks itself; if the daemon is *just* a proxy, I don't think it makes sense.
I'd never compared performance of direct hypercalls vs libvirt_proxy before, so I did a little test. The most commonly called method in virt-manager is virDomainGetInfo for fetching current status of a running domain - we call that once a second per guest.
So I wrote a simple program in C which calls virDomainGetInfo 100,000 times for 3 active guest VMs. I ran the test under a couple of different libvirt backends. The results were:
1. As root, direct hypercalls -> 1.4 seconds
2. As non-root, hypercalls via libvirt_proxy -> 9 seconds
3. As non-root, via XenD -> 45 minutes [1]
So although it is 10x slower than hypercalls, the libvirt_proxy is actually pretty damn fast - 9 seconds for 300,000 calls.
That's interesting, because it shows that the overhead of a roundtrip on a unix domain socket plus a context switch is negligible (i.e. 25us).

I'd caution against optimising for this benchmark, though. An application author should not consider virDomainGetInfo() to be so cheap that even the extra 9ms caused by XenD should cause a problem for their application[1]. Consider factoring a 100ms network roundtrip into the call: you'd be talking about 8hr 20min just for the 300,000 roundtrips.

But anyway, I'd agree with the conclusions - using a daemon for the local case is not a problem from a performance perspective, and avoiding XenD where we can gives a nice win.
[1] It didn't actually finish after 45 seconds. I just got bored of waiting.
Oh, look at this ... I only saw this now :-) So, XenD is *at least* 9ms per call ...

Cheers,
Mark.

[1] This reminds me of when there was an almost fanatical effort to make in-process CORBA calls with ORBit really, really fast. It was pointless, though, because app authors could not rely on a CORBA call being fast, since it could just as easily be out-of-process.

Some highlights from today's investigation into secure RPCs (a topic which I want to bring to a close as soon as possible):

* TI-RPC was suggested as an alternative to SunRPC. The available library has licensing problems which may preclude any integration into libvirt.

* SunRPC over IPv6: tested and working.

* Either xmlrpc-c or Curl (which it uses for HTTP connections) cannot manage to do persistent connections, which makes XML-RPC very slow. I haven't yet got to the bottom of why this isn't working, since Curl certainly ought to be able to do keepalives.

* Some performance numbers, which you can take with a large grain of salt, are available here: http://et.redhat.com/~rjones/secure_rpc/#performance

The other issues are written up in the above document for your browsing pleasure.

Rich.
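For reference, plain libcurl does reuse connections when the same easy handle is used for consecutive requests, which is what makes the xmlrpc-c behaviour surprising. A minimal sketch (the URL and request body are placeholders, not a real libvirtd endpoint):

    /* Reusing one curl easy handle across requests lets libcurl keep the
     * underlying HTTP connection alive between calls.  The URL and body
     * here are placeholders only. */
    #include <stdio.h>
    #include <curl/curl.h>

    int main(void)
    {
        curl_global_init(CURL_GLOBAL_ALL);
        CURL *h = curl_easy_init();
        if (!h)
            return 1;

        curl_easy_setopt(h, CURLOPT_URL, "http://localhost:8000/RPC2");
        curl_easy_setopt(h, CURLOPT_POSTFIELDS,
                         "<?xml version=\"1.0\"?><methodCall>"
                         "<methodName>example.ping</methodName>"
                         "<params/></methodCall>");

        for (int i = 0; i < 3; i++) {
            /* same handle each time: libcurl will try to reuse the
             * previous TCP (and TLS) connection rather than reconnect */
            CURLcode rc = curl_easy_perform(h);
            if (rc != CURLE_OK)
                fprintf(stderr, "request %d: %s\n", i, curl_easy_strerror(rc));
        }

        curl_easy_cleanup(h);
        curl_global_cleanup();
        return 0;
    }

Running it with CURLOPT_VERBOSE set shows whether libcurl re-used the existing connection or had to reconnect for each request.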

On Thu, Jan 25, 2007 at 04:01:50PM +0000, Richard W.M. Jones wrote:
Some highlights from today's investigation into secure RPCs (a topic which I want to bring to a close as soon as possible).
* TI-RPC was suggested as an alternative to SunRPC. The available library has licensing problems which may preclude any integration into libvirt.
* SunRPC over IPv6: tested and working
* Either xmlrpc-c or Curl (which it uses for HTTP connections) cannot manage to do persistent connections, which makes XML-RPC very slow. I haven't yet got to the bottom of why this isn't working, since Curl certainly ought to be able to do keepalives.
I would not rely on the 'XML-RPC' library if we go the XML-RPC way. I know libxml2 well enough that I should be able to make sure we keep a single connection, and that the XML parsing is implemented in a proper and efficient fashion.
* Some performance numbers which you can take with a large grain of salt are available here:
http://et.redhat.com/~rjones/secure_rpc/#performance
The other issues are written up in the above document for your browsing pleasure.
Cool, thanks,

Daniel
Participants (4): Daniel P. Berrange, Daniel Veillard, Mark McLoughlin, Richard W.M. Jones