
On Tue, Dec 08, 2015 at 10:13:31AM +0100, Michal Privoznik wrote:
On 07.12.2015 20:25, Vasiliy Tolstov wrote:
On 7 Dec 2015 at 18:13, "Daniel P. Berrange" <berrange@redhat.com> wrote:
On Mon, Dec 07, 2015 at 04:04:40PM +0100, Michal Privoznik wrote:
On 07.12.2015 14:51, Daniel P. Berrange wrote:
On Mon, Dec 07, 2015 at 02:46:59PM +0100, Michal Privoznik wrote:
Dear list,
I'd like to hear your opinion on the following bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1282859
Long story short, imagine the following scenario:
1. Create a 4GB file full of zeroes.
2. virsh vol-download it.
What happens is that all those 4GB are transferred byte after byte through our RPC system. Not only does this put needless pressure on our event loop, it's suboptimal for the network and other resources too.
I'd like to explore our options here keeping in mind that the original volume might have been sparse and we ought to keep it sparse on the destination too.
In the bug the reporter (Matthew Booth) suggests introducing a new type of RPC message that will let us keep our APIs unchanged. The source will scan the file for windows of zeroes bigger than some value. When one is found, the new type of message is passed to the client without the need to copy those zeroes. Yes, this is very similar to RLE.
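To make the scanning idea concrete, here is a minimal sketch of splitting a buffer into data and hole segments. The function name `split_zero_windows`, the `min_hole` threshold and the `("data", bytes)` / `("hole", length)` tuples are all hypothetical and chosen for illustration only; this is not the libvirt RPC format.

```python
def split_zero_windows(buf, min_hole=4096):
    """Yield ("data", bytes) and ("hole", length) segments for buf.

    Runs of zero bytes shorter than min_hole stay inside a data segment.
    The segment representation is an assumption made for this sketch.
    """
    n = len(buf)
    data_start = 0  # start of the data segment not yet emitted
    pos = 0
    while pos < n:
        pos = buf.find(b"\0", pos)  # jump to the next zero byte
        if pos < 0:
            break
        end = pos
        while end < n and buf[end] == 0:  # measure the run of zeroes
            end += 1
        if end - pos >= min_hole:
            if pos > data_start:
                yield ("data", buf[data_start:pos])
            yield ("hole", end - pos)  # only the length travels, not the zeroes
            data_start = end
        pos = end
    if data_start < n:
        yield ("data", buf[data_start:])
```

For example, with `min_hole=4`, the buffer `b"ab" + b"\0" * 8 + b"cd"` would be split into a data segment, a hole of length 8, and another data segment.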
If we are going that way, should we enable users to put a compression program in between read()/write() and our RPC? Well, should we let users choose which compression program we put there? Because there are better compression algorithms than RLE.
It only looks like compression if you're solely looking at the network data transfer. A key feature of sparse support is that we preserve the sparseness on both sides.
i.e., if I have a sparse raw file locally, and vol-upload it, it should remain a sparse file on the server. Likewise vol-downloading a sparse file should let me create a sparse file locally. For this reason the RPC program must explicitly represent data holes, and not merely consider them a type of compression algorithm, as that would not let us preserve the holes on both ends of the stream.
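As a rough illustration of why holes need to be explicit, the receiving side could recreate them by seeking instead of writing zeroes. This is only a sketch: `write_segments` and the `("data", bytes)` / `("hole", length)` segment tuples are hypothetical names, not an existing libvirt interface, and whether the skipped range really ends up unallocated depends on the filesystem.

```python
import os


def write_segments(path, segments):
    """Write an iterable of ("data", bytes) / ("hole", length) to path.

    Holes are skipped with a relative seek, so no zero bytes are written
    for them. The segment format is an assumption made for this sketch.
    """
    with open(path, "wb") as f:
        for kind, payload in segments:
            if kind == "hole":
                f.seek(payload, os.SEEK_CUR)  # skip forward, leaving a gap
            else:
                f.write(payload)
        # Fix the final size, so a trailing hole is not lost.
        f.truncate(f.tell())
```

Reading the file back returns zeroes for the skipped ranges, so the content is identical to a fully written copy; only the on-disk allocation differs.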
Right. But how could we apply both our RLE algorithm and an external program on the same stream? Should we multiplex and send holes to the other side as they are and run the rest through the external compression program? Otherwise I don't see how we could preserve sparseness.
I think we should just focus on sending holes in the RPC protocol right now, and not try to do compression at the same time, as we need to be able to represent holes in the protocol regardless of whether compression is present.
Some time ago I already asked about this, proposing a compress flag for vol upload and download (I haven't had time to complete it). For my use case the best way is to be able to create a compressed stream that goes to libvirt. That way we effectively solve the sparse file problem and can also transfer less data; all my tests with lz4 compression show a benefit of at least about 20% compared to the original volume size.
Right. And as Dan pointed out, these two approaches are orthogonal to each other. Compressing a stream of data to reduce size is a nice feature to have, preserving sparseness of a file is something different though (although the way I'm intending to implement it will reduce data sent through virStream too).
One thing that I am still wondering about is sparseness detection. Finding a window full of zeroes in a file does not necessarily mean that those come from a read() over a segment that's not on disk. We surely can have a raw file that is sparse and also contains a window full of zeroes. But I guess it's okay if we sparsify (if that's even a verb) the file even more on volDownload or volUpload.
There is an ioctl you can use to detect actual holes in recentish Linux. You only need to fall back to scanning for zeros if that is not available.
Regards,
Daniel
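One hole-detection mechanism that does exist on Linux is lseek() with SEEK_DATA and SEEK_HOLE; whether that is the interface meant above is an assumption on my part. A minimal sketch of walking a file's extents this way follows, where `iter_extents` is a hypothetical helper name. On filesystems without hole support the whole file is simply reported as data, and where `os.SEEK_DATA` is missing altogether one would fall back to scanning for zeroes.

```python
import errno
import os


def iter_extents(path):
    """Yield ("data" | "hole", offset, length) tuples covering the file.

    Relies on lseek() with SEEK_DATA / SEEK_HOLE as exposed by the os module.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        pos = 0
        while pos < size:
            try:
                data = os.lseek(fd, pos, os.SEEK_DATA)
            except OSError as e:
                if e.errno != errno.ENXIO:
                    raise
                # ENXIO: no data after pos, the rest of the file is a hole.
                yield ("hole", pos, size - pos)
                return
            if data > pos:
                yield ("hole", pos, data - pos)
            # The end of the file counts as an implicit hole.
            hole = os.lseek(fd, data, os.SEEK_HOLE)
            yield ("data", data, hole - data)
            pos = hole
    finally:
        os.close(fd)
```

The extents are contiguous and sum to the file size, so a sender could iterate over them and read only the ranges reported as data.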