On Fri, May 11, 2007 at 11:25:00PM +0100, Daniel P. Berrange wrote:
On Wed, May 02, 2007 at 07:04:44PM +0100, Richard W.M. Jones wrote:
> Below is the plan for delivering the remote patch for review in stages.
> More details in each email. I only expect to get through the
> first two emails today.
I've been testing this out for real today:
- IPv6 works correctly
- Once I generated the client & server certs, the TLS stuff was working
  pretty much without issue. Though we could do with printing a clearer
  error message when the user typos a cert/key path name, as the current
  output is a little obscure (see the sketch after this list).
- I've been testing with the QEMU driver and hit a few problems with
  the fact that qemu_internal.c will grab URIs containing hostnames
  in the virConnectOpen call. So qemu was getting the connection before
  the remote driver had a chance. It should be simple to patch qemu_internal
  to ignore URIs with a hostname set (a second sketch below shows the idea).
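To illustrate the kind of cert/key check I mean (a sketch only; the helper
name and message wording are made up), the client/server could stat() each
configured path before handing it to the TLS library, so a typo'd path
produces an obvious message instead of an obscure handshake failure:

    #include <stdio.h>
    #include <string.h>
    #include <errno.h>
    #include <sys/stat.h>

    /* Hypothetical helper: report a missing cert/key file clearly,
     * naming both the role of the file and the exact path tried. */
    static int checkCertFile(const char *role, const char *path)
    {
        struct stat st;
        if (stat(path, &st) < 0) {
            fprintf(stderr, "Cannot access %s '%s': %s\n",
                    role, path, strerror(errno));
            return -1;
        }
        return 0;
    }

e.g. call checkCertFile("CA certificate", ca_path) and friends for each
configured path before initializing the TLS session.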
This last point was in turn killing virt-manager; with a hack workaround
it seems virt-manager more or less works. Well, with the obvious exception of
places where virt-manager uses local state outside the libvirt APIs, but
that's another story :-)
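For the record, the kind of guard I have in mind for qemu_internal (a sketch
only; the exact open-function signature and return conventions in the tree
may differ) is to parse the URI and report whether it carries a hostname, so
the open function can decline and let the remote driver claim it:

    #include <libxml/uri.h>

    /* Sketch: return 1 if the URI names a remote host (so the local
     * qemu driver should decline it), 0 if it is local, -1 on a
     * malformed URI. */
    static int qemuURIHasHostname(const char *name)
    {
        xmlURIPtr uri = xmlParseURI(name);
        int has_server;

        if (uri == NULL)
            return -1;

        has_server = (uri->server != NULL);
        xmlFreeURI(uri);
        return has_server;
    }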
I've just spent /hours/ trying to work out why

    import libvirt

    c = libvirt.openReadOnly(uri)
    for i in range(100000):
        d = c.listDefinedDomains()
        for dom in d:
            f = c.lookupByName(dom)
was at least 100x slower with

    qemu+tcp://localhost/system

than with

    qemu:///system
Neither the server nor the client showed *any* CPU time. The box was
completely idle. No sleep states anywhere in the code either. After much
confusion I finally realized that we were being hit by Nagle's algorithm.
Since each RPC operation requires < 100 bytes to be read & written, every
write was being queued up for some tens of ms before being sent out. This
is pointless: since the RPC ops are synchronous, we'd never fill a big
enough packet to cause the data to be sent before the Nagle timeout.
When I do

    /* needs <netinet/tcp.h> for TCP_NODELAY */
    int no_delay = 1;
    setsockopt(sock, IPPROTO_TCP, TCP_NODELAY,
               (void *)&no_delay, sizeof(no_delay));
on both the client & server end of every socket, performance using
qemu+tcp://localhost/system was basically identical to qemu:///system.
Now, if going across a LAN/WAN, the delay caused by Nagle will be a
smaller proportion of the RPC call time, due to the extra round trip time
on the real network. It is still wasteful to leave it enabled though,
because it inserts arbitrary delays, and due to the synchronous call-reply
nature of our RPC we'll never queue enough data to fill a packet. So
I say disable Nagle all the time.
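Concretely, both ends could run every freshly connected/accepted TCP socket
through a small helper like this (hypothetical name; Nagle only applies to
TCP, so UNIX domain sockets would simply skip the call):

    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <netinet/tcp.h>

    /* Hypothetical helper: disable Nagle on a connected TCP socket
     * so small synchronous RPC messages go out immediately.
     * Returns 0 on success, -1 on error (errno set by setsockopt). */
    static int remoteDisableNagle(int sock)
    {
        int no_delay = 1;
        return setsockopt(sock, IPPROTO_TCP, TCP_NODELAY,
                          (void *)&no_delay, sizeof(no_delay));
    }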
Dan.
--
|=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=|
|=- Perl modules: http://search.cpan.org/~danberr/ -=|
|=- Projects: http://freshmeat.net/~danielpb/ -=|
|=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|