On Thu, Apr 28, 2016 at 12:04:47PM +0200, Michal Privoznik wrote:
So, after couple of sleepless nights and headaches I'm proud to
announce that finally got this working.
What?
Our regular streams that are can be used to transfer disk images
for domains are unaware of any sparseness. Therefore they have
two limitations:
a) transferring big but sparse image can take ages as all the
holes (interpreted by kernel as '\0') have to go through our
event loop.
b) resulting volume is not sparse even if the source was.
How?
I went by verified approach that linux kernel has. One way to
look at our streams is just like read() and write() with a
different names: virStreamRecv() and virStreamSend(). They even
have the same prototype (if 'int fd' is substituted with
'virStreamPtr'). Now, holes in files are created and detected via
third API: lseek(). Therefore I'm introducing new virStreamSkip()
API that mimics the missing primitive. Now, because our streams
do not necessarily have to work over files (they are for generic
data transfer), I had to let users register a callback that is
called whenever the other side calls virStreamSkip().
So now that we have all three primitives, we can focus on making
life easier for our users. Nobody is actually using bare
virStreamSend() and virStreamRecv() rather than our wrappers:
virStreamSendAll() and virStreamRecvAll(). With my approach
described above just virStreamSendAll() needs to be adjusted so
that it's 'sparse file' aware. The virStreamRecvAll() will only
get the data to write (just like it is now) with skip callback
called automatically whenever needed. In order for
virStreamSendAll() to skip holes I'm introducing yet another
callback: virStreamInDataFunc(). This callback will help us to
create a map of a file: before each virStreamSend() it checks
whether we are in a data section or a hole and calls
virStreamSend() or virStreamSkip() respectively.
Do not worry - it will all become clear once you see the code.
Now question is - how will users enable this feature? I mean, we
have take into account that we might be talking to an older
daemon that does not know how to skip a hole. Or vice versa -
older client.
The solution I came up with is to introduce flags to APIs where
sparse streams make sense. I guess it makes sense for volume
upload and download, but almost certainly makes no sense for
virDomainOpenConsole().
Code?
>From users POV they just need to pass correct argument to
'vol-upload' or 'vol-download' virsh commands. One layer down, on
programming level they need to merely:
st = virStreamNew(conn, 0);
virStreamRegisterSkip(st, skipFunc, &fd);
virStorageVolDownload(st, ...);
virStreamRecvAll(st, sinkFunc, &fd);
where:
int skipFunc(virStreamPtr st,
unsigned long long offset, void *opaque)
{
int *fd = opaque;
return lseek(*fd, offset, SEEK_CUR) == (off_t) -1 ? -1 : 0;
}
And for uploading it's slightly more verbose - see patch 24.
While it looks ok-ish in the context of the RecvAll function,
because you have skipFunc + recvFunc both being invoked
asynchronously, this design feels a quite odd when used
in combination with the plain virStreamRecv().
Just for a minute lets consider
st = virStreamNew(conn, 0);
virStreamRegisterSkip(st, skipFunc, &fd);
virStorageVolDownload(st, ...);
while (1) {
char buf[4096];
virStreamRecv(st, buf, sizeof(buf));
..do something with buf..
}
I think it is quite unpleasant to have the skipFunc be called
out of band to the code dealing with the Recv.
I think it is preferrable that we have an explicit synchronous
API for consuming the hole.
The same really applies from POV of the upload side - I think
we should be able to synchronously push the hole in, rather
than having the asynchronous InData callback.
IIUC, the reason you've chosen this async callback approach
is because the virStream{Send,Recv,SendAll,RecvAll} methods
are not extensible to cope with holes.
I'm thinking though that we'd get a more attractive public API
by creating hole friendly versions of virStream{Send,Recv,etc}
instead of going the extra callback route.
The download side could we made to work if we had a flag for
virStreamRecv that instructed it to only read data up to the
next hole, and defined an explicit error code to use to indicate
that we've hit a hole.
This would be used thus:
while (1) {
char buf[4096];
int ret = virStreamRecvFlags(st, buf, len, VIR_STREAM_STOP_AT_HOLE);
if (ret == -1) {
virErrorPtr err = virGetLastError();
if (err.code == VIR_ERR_STREAM_HOLE) {
ret = virStreamHoleSize(st, buf);
...seek ret bytes in target...
} else {
return -1;
}
} else {
...write buf to target...
}
}
To make this work with virStreamRecvAll, we would need to add
a virStreamSparseRecvAll() which had two callbacks eg
typedef int (*virStreamSinkHoleFunc)(virStreamPtr st, unsigned long long len):
virStreamSparseRecvAll(virStreamPtr stream,
virStreamSinkFunc handler,
virStreamSinkHoleFunc holeHandler,
void *opaque) {
while (1) {
char buf[4096];
int ret = virStreamRecvFlags(st, buf, len, VIR_STREAM_STOP_AT_HOLE);
if (ret == -1) {
virErrorPtr err = virGetLastError();
if (err.code == VIR_ERR_STREAM_HOLE) {
ret = virStreamHoleSize(st, buf);
holeHandler(st, ret);
} else {
return -1;
}
} else {
handler(st, buf, ret);
}
}
}
Now considering the upload side of things, the virStreamSend
function doesn't actually need cahnges, though it could do
with a flags parameter for best practice. We just need the
virStreamSkip() function you already add.
while (1) {
char buf[4096];
if (..in hole...) {
..get hole size...
virStreamSkip(st, len);
} else {
...read N bytes...
virStreamSend(st, buf, len);
}
}
The SendAll method is again more complicated as the callback we pass
into it is insufficient to figure out if we have holes. We could
add a virStreamSparseSendAll() which had two callbacks again:
typedef int (*virStreamSourceHoleFunc)(holeHandler);
This returns the length of the current hole, or 0 if not at
a hole.
virStreamSparseSendAll(virStreamPtr stream,
virStreamSourceFunc handler,
virStreamSourceHoleFunc holeHandler,
void *opaque)
while (1) {
char buf[4096];
int ret = holeHandler(st);
if (ret > 0) {
virStreamSkip(st, ret);
} else {
ret = handler(st, buf, sizeof(buf);
virStreamSend(st, buf, ret);
}
}
}
So we would avoid the virStreamInData and virStreamRegisterInData
and virStreamRegisterSkip and virStreamSkipCallback methods
Regards,
Daniel
--
|:
http://berrange.com -o-
http://www.flickr.com/photos/dberrange/ :|
|:
http://libvirt.org -o-
http://virt-manager.org :|
|:
http://autobuild.org -o-
http://search.cpan.org/~danberr/ :|
|:
http://entangle-photo.org -o-
http://live.gnome.org/gtk-vnc :|