On 07.12.2015 14:51, Daniel P. Berrange wrote:
On Mon, Dec 07, 2015 at 02:46:59PM +0100, Michal Privoznik wrote:
> Dear list,
>
> I'd like to hear your opinion on the following bug:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1282859
>
> Long story short, imagine the following scenario:
>
> 1. Create a 4GB file full of zeroes
> 2. virsh vol-download it
>
> What happens is that all 4GB are transferred byte after byte
> through our RPC system. Not only does this put needless pressure on our
> event loop, it's suboptimal for network and other resources too.
>
> I'd like to explore our options here keeping in mind that the original
> volume might have been sparse and we ought to keep it sparse on the
> destination too.
>
> In the bug, the reporter (Matthew Booth) suggests introducing a new type
> of RPC message that will let us keep our APIs unchanged. The source will
> scan the file for windows of zeroes bigger than some value. When one is
> found, a message of the new type is passed to the client without the
> need to copy those zeroes. Yes, this is very similar to RLE.
>
> If we are going that way, should we enable users to put a compression
> program in between read()/write() and our RPC? Well, should we let users
> choose which compression program we put there? Because there are
> better compression algorithms than RLE.

It only looks like compression if you're solely looking at the network
data transfer. A key feature of sparse support is that we preserve
the sparseness on both sides.

I.e., if I have a sparse raw file locally, and vol-upload it, it should
remain a sparse file on the server. Likewise vol-downloading a sparse
file should let me create a sparse file locally. For this reason the
RPC program must explicitly represent data holes, and not merely
consider them a type of compression algorithm, as that would not let
us preserve the holes on both ends of the stream.

Right. But how could we apply both our RLE algorithm and an external
program on the same stream? Should we multiplex and send holes to the
other side as they are and run the rest through the external compression
program? Otherwise I don't see how we could preserve sparseness.

Michal
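
To make the multiplexing question concrete, here is a rough Python sketch of
one way it might work. It is not libvirt code and not a proposed wire format.
Everything in it is an assumption for illustration: the ("hole", length) and
("data", bytes) message tuples, the HOLE_MIN threshold, the function names,
and the use of zlib as a stand-in for "an external compression program".
Holes travel as explicit messages, data runs go through one shared
compression stream, and the receiver seeks over holes instead of writing
zeroes.

```python
import os
import re
import zlib

HOLE_MIN = 4096  # hypothetical threshold: shorter zero runs are sent as data


def split_segments(data, hole_min=HOLE_MIN):
    """Yield ("data", bytes) and ("hole", length) segments covering data in order."""
    zero_run = re.compile(rb"\x00{%d,}" % hole_min)
    pos = 0
    for match in zero_run.finditer(data):
        if match.start() > pos:
            yield ("data", data[pos:match.start()])
        yield ("hole", match.end() - match.start())
        pos = match.end()
    if pos < len(data):
        yield ("data", data[pos:])


def encode_stream(data, hole_min=HOLE_MIN):
    """Sender side: holes stay explicit, everything else is compressed."""
    comp = zlib.compressobj()
    messages = []
    for kind, payload in split_segments(data, hole_min):
        if kind == "hole":
            messages.append(("hole", payload))
        else:
            # Z_SYNC_FLUSH makes all data before the next hole decodable
            # before the receiver has to act on that hole.
            chunk = comp.compress(payload) + comp.flush(zlib.Z_SYNC_FLUSH)
            messages.append(("data", chunk))
    return messages


def decode_to_file(messages, path):
    """Receiver side: write data, seek over holes so the file can stay sparse."""
    decomp = zlib.decompressobj()
    with open(path, "wb") as out:
        for kind, payload in messages:
            if kind == "hole":
                out.seek(payload, os.SEEK_CUR)
            else:
                out.write(decomp.decompress(payload))
        # Extend the file if the stream ended with a hole.
        out.truncate(out.tell())
```

Whether the skipped ranges really end up as holes on disk depends on the
platform and the filesystem; the sketch only guarantees that the file contents
match. The point is that the hole messages never pass through the compressor,
so the receiver still sees where they are, which a pure compression layer
would hide.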