On Mon, Sep 30, 2013 at 01:21:18AM +0300, Oskari Saarenmaa wrote:
> On Fri, Sep 27, 2013 at 03:19:06PM +0100, Daniel P. Berrange wrote:
> > On Fri, Sep 27, 2013 at 05:02:53PM +0300, Oskari Saarenmaa wrote:
> > > Btrfs provides a copy-on-write clone ioctl so let's try to use it instead
> > > of copying files block by block. The ioctl is executed unconditionally if
> > > it's available and we fall back to block copying if it fails, similarly to
> > > cp --reflink=auto.
> >
> > Currently the virStorageVolCreateXMLFrom method does a full allocation
> > of storage when cloning volumes. This means applications can rely on
> > the image having enough space when clone completes and won't get ENOSPC
> > in the VM. AFAICT, this change to do copy-on-write changes the API to do
> > thin provisioning of the storage during clone, so any future write on
> > either the new or old volume may generate ENOSPC when btrfs finally copies
> > the sector. I don't think this is a good thing. I think applications
> > should have to explicitly request copy-on-write behaviour for the clone
> > so they know the implications.
>
> That's a good point. However, it looks like this change would only change
> the behavior for the old volumes; new volumes are always created sparsely
> and they may already get ENOSPC on write if they contained zero blocks. This
> should probably be fixed by calling fallocate instead of lseek when noticing
> empty blocks (safezero should probably be used instead, but it's currently
> rather unsafe if posix_fallocate isn't available.)
>
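To illustrate the fallocate suggestion above, here is a minimal C sketch of a
block copy loop that reserves space for all-zero blocks instead of seeking past
them. The helper names and COPY_BLOCK_SIZE are hypothetical and this is not
libvirt's actual code. It assumes a freshly created, empty destination file,
and it simply writes the zeros out when posix_fallocate() fails.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

enum { COPY_BLOCK_SIZE = 4096 };   /* hypothetical block size */

/* Returns 1 if the buffer holds only zero bytes, 0 otherwise. */
static int block_is_zero(const char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (buf[i] != 0)
            return 0;
    return 1;
}

/* Writes the whole buffer, retrying on short writes and EINTR. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t w = write(fd, buf, len);
        if (w < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += w;
        len -= (size_t)w;
    }
    return 0;
}

/* Hypothetical helper: copies src_fd to dst_fd block by block. Zero blocks
 * are reserved with posix_fallocate() rather than skipped with a bare
 * lseek(), so the destination does not end up with holes. Assumes dst_fd
 * refers to a new, empty file. Returns 0 on success, -1 on error. */
static int copy_preallocating_zero_blocks(int src_fd, int dst_fd)
{
    char buf[COPY_BLOCK_SIZE];
    off_t offset = lseek(dst_fd, 0, SEEK_CUR);

    if (offset < 0)
        return -1;

    for (;;) {
        ssize_t n = read(src_fd, buf, sizeof(buf));
        if (n == 0)
            return 0;
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }

        if (block_is_zero(buf, (size_t)n) &&
            posix_fallocate(dst_fd, offset, n) == 0) {
            /* Space is reserved and reads back as zeros; move past it. */
            if (lseek(dst_fd, offset + n, SEEK_SET) < 0)
                return -1;
        } else if (write_all(dst_fd, buf, (size_t)n) < 0) {
            /* Data block, or reservation failed: write the bytes out. */
            return -1;
        }
        offset += n;
    }
}
```

Whether the reserved range is really backed by storage still depends on the
filesystem, so this only narrows the ENOSPC window on filesystems that honour
the reservation.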
> I was wondering if we could reuse the allocation and capacity fields to
> decide whether or not to try to do a cow-clone (or sparse allocation of the
> cloned bits)? Currently a cloned volume's allocation is always set to at
> least the original volume's capacity and the original client-requested
> allocation value is not passed on to the code doing the cloning, but we
> could pass it on and allow copy-on-write clones if allocation is set to zero
> (no space is guaranteed to be available for writing) and also change sparse
> cloning to only happen if allocation is lower than capacity.

I think just having a VIR_STORAGE_VOL_CLONE_COPY_ON_WRITE flag for the
API would suffice.
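
For illustration, an opt-in clone of this kind could be sketched roughly as
follows. Only the flag name comes from this thread; its value, clone_volume(),
copy_blocks() and the result enum are hypothetical and not libvirt's API. The
sketch uses FICLONE, which on current Linux kernels is the generic name for
the same ioctl number as BTRFS_IOC_CLONE, and it assumes that a failed clone
falls back to a full copy, as cp --reflink=auto does.

```c
#include <errno.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/fs.h>               /* FICLONE on newer kernel headers */

#ifndef FICLONE
#define FICLONE _IOW(0x94, 9, int)  /* same number as BTRFS_IOC_CLONE */
#endif

/* Hypothetical flag value; only the name is taken from the discussion. */
enum { VIR_STORAGE_VOL_CLONE_COPY_ON_WRITE = 1 << 0 };

enum clone_result {
    CLONE_FAILED = -1,
    CLONE_COPIED = 0,     /* full block-by-block copy */
    CLONE_REFLINKED = 1   /* copy-on-write clone, blocks are shared */
};

/* Plain block-by-block copy, used by default and as the fallback. */
static int copy_blocks(int src_fd, int dst_fd)
{
    char buf[4096];

    for (;;) {
        ssize_t n = read(src_fd, buf, sizeof(buf));
        if (n == 0)
            return 0;
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        for (ssize_t done = 0; done < n; ) {
            ssize_t w = write(dst_fd, buf + done, (size_t)(n - done));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            done += w;
        }
    }
}

/* Tries a copy-on-write clone only when the caller asked for it. */
static enum clone_result clone_volume(int src_fd, int dst_fd,
                                      unsigned int flags)
{
    if (flags & VIR_STORAGE_VOL_CLONE_COPY_ON_WRITE) {
        if (ioctl(dst_fd, FICLONE, src_fd) == 0)
            return CLONE_REFLINKED;
        /* Clone unsupported or refused: fall through to a full copy. */
    }
    return copy_blocks(src_fd, dst_fd) == 0 ? CLONE_COPIED : CLONE_FAILED;
}
```

Callers that pass no flag keep the full-copy behaviour, so thin provisioning
only happens on explicit request.
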
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|