The kernel images that I want to snoop in virt-mem are around 16 MB in
size. In the qemu / KVM case, these images have to travel over the
remote connection. Because of limits on the maximum message size,
they have to travel currently in 64 KB chunks, and it turns out that
this is slow. Apparently the dominating factors are how long it takes
to issue the 'memsave' command in the qemu monitor (there is some big
constant overhead), and extra network round trips.
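(For a 16 MB image that is roughly 16 MB / 64 KB = 256 separate
'memsave' commands and round trips per snapshot, so that constant
overhead is paid about 256 times.)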
The current remote message size is intentionally limited to 256 KB
(fully serialized, including all XDR headers and overhead), so the
most we could practically send in a single message at the moment is
128 KB if we stick to powers of two, or ~255 KB if we don't.
The reason we limit it is to avoid denial of service attacks, where a
rogue client or server sends excessively large messages and causes the
peer to allocate lots of memory [e.g. if we didn't have any limit, then
you could send a message which was several GB in size and cause
problems at the other end, because the message is slurped in before it
is fully parsed].
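Just to make the shape of that check concrete, it boils down to
something like the following when the message length is read off the
wire (a minimal sketch; the constant and function names here are
stand-ins, not the actual code):

  #include <stdint.h>
  #include <stdlib.h>
  #include <errno.h>

  #define MESSAGE_MAX (256 * 1024)     /* stand-in for the 256 KB limit */

  /* Refuse a message before allocating a buffer for it, so a rogue
   * peer cannot make us allocate several GB simply by claiming a huge
   * length in the message header.
   */
  static char *
  alloc_message (uint32_t claimed_len)
  {
      if (claimed_len > MESSAGE_MAX) {
          errno = EMSGSIZE;
          return NULL;                 /* caller drops the connection */
      }
      return malloc (claimed_len);
  }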
There is a second problem with reading the kernel in small chunks,
namely that it allows the virtual machine to make a lot of progress
between chunks, so we don't get anything near an 'instantaneous'
snapshot (getting the
kernel in a single chunk doesn't necessarily guarantee this either,
but it's better).
As an experiment, I tried increasing the maximum message size to 32 MB, so
that I could send the whole kernel in one go.
Unfortunately just increasing the limit doesn't work for two reasons,
one prosaic and one very weird:
(1) The current code likes to keep message buffers on the stack, and
because Linux limits the stack to something artificially small, this
fails. Increasing the stack ulimit is a short-term fix while
testing. In the long term we could rewrite any code which does
this to use heap buffers instead.
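The rewrite itself is mechanical; roughly the change below, where
MESSAGE_MAX and the function name are made up for illustration:

  #include <stdlib.h>

  #define MESSAGE_MAX (32 * 1024 * 1024)    /* the experimental 32 MB limit */

  int
  process_message (void)
  {
      /* char buffer[MESSAGE_MAX];  -- this is what blows the stack ulimit */
      char *buffer = malloc (MESSAGE_MAX);  /* heap allocation instead */
      if (buffer == NULL)
          return -1;                        /* out of memory */

      /* ... read the message into buffer and handle it ... */

      free (buffer);
      return 0;
  }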
(2) There is some really odd problem with our use of recv(2) which
causes messages > 64 KB to fail. I have no idea what is really
happening, but the sequence of events seems to be this:
  server                                client

  write(sock,buf,len) = len-k
                                        recv(sock,buf,len) = len-k
  write(sock,buf+len-k,k) = k
                                        recv(sock,buf,k) = 0  [NOT k]
At this point the client assumes that the server has unexpectedly
closed the connection and fails. I have stared at this for a while,
but I've got no idea at all what's going on.
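For reference, the client side is doing the usual 'read exactly len
bytes' loop, something like the sketch below (not the actual code),
which is why a zero return from recv(2) can only be interpreted as
the peer having closed the connection:

  #include <sys/types.h>
  #include <sys/socket.h>

  /* Read exactly len bytes from sock into buf.  Returns 0 on success,
   * -1 on error or if the peer closes the connection early.
   */
  static int
  recv_full (int sock, char *buf, size_t len)
  {
      size_t got = 0;

      while (got < len) {
          ssize_t r = recv (sock, buf + got, len - got, 0);
          if (r < 0)
              return -1;    /* error */
          if (r == 0)
              return -1;    /* peer closed the connection - the failure above */
          got += r;
      }
      return 0;
  }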
A test program is attached. You'll need a 32-bit KVM guest.
Rich.
--
Richard Jones, Emerging Technologies, Red Hat
http://et.redhat.com/~rjones
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine. Supports Linux and Windows.
http://et.redhat.com/~rjones/virt-df/