[libvirt] [PATCH 0/2] try to use btrfs COW copy

If VIR_STORAGE_VOL_CREATE_REFLINK is specified, try to use btrfs COW copy; If btrfs COW copy is not supported, fall back to normal copy. Chen Hanxiao (2): introduce btrfsCloneFile for COW copy storage: try to perform btrfs clone if possible include/libvirt/libvirt-storage.h | 1 + src/storage/storage_backend.c | 65 ++++++++++++++++++++++++++++++++++----- src/storage/storage_backend_fs.c | 8 +++-- src/storage/storage_driver.c | 4 ++- 4 files changed, 68 insertions(+), 10 deletions(-) -- 1.9.3

Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com> --- src/storage/storage_backend.c | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+) diff --git a/src/storage/storage_backend.c b/src/storage/storage_backend.c index b990a82..5ce3566 100644 --- a/src/storage/storage_backend.c +++ b/src/storage/storage_backend.c @@ -156,6 +156,32 @@ enum { #define READ_BLOCK_SIZE_DEFAULT (1024 * 1024) #define WRITE_BLOCK_SIZE_DEFAULT (4 * 1024) +/* + * Perform the O(1) btrfs clone operation, if possible. + * Upon success, return 0. Otherwise, return -1 and set errno. + */ +#if !defined(BTRFS_IOCTL_MAGIC) +# define BTRFS_IOCTL_MAGIC 0x94 +#endif + +#define BTRFS_IOC_CLONE _IOW (BTRFS_IOCTL_MAGIC, 9, int) + +#if defined(__linux__) +static inline int +btrfsCloneFile(int dest_fd, int src_fd) +{ + return ioctl(dest_fd, BTRFS_IOC_CLONE, src_fd); +} +#else +static inline int +btrfsCloneFile(int dest_fd ATTRIBUTE_UNUSED, + int src_fd ATTRIBUTE_UNUSED) +{ + errno = ENOTSUP; + return -1; +} +#endif + static int ATTRIBUTE_NONNULL(2) virStorageBackendCopyToFD(virStorageVolDefPtr vol, virStorageVolDefPtr inputvol, -- 1.9.3

On 01/13/2015 09:18 AM, Chen Hanxiao wrote:
Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com> --- src/storage/storage_backend.c | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+)
diff --git a/src/storage/storage_backend.c b/src/storage/storage_backend.c index b990a82..5ce3566 100644 --- a/src/storage/storage_backend.c +++ b/src/storage/storage_backend.c @@ -156,6 +156,32 @@ enum { #define READ_BLOCK_SIZE_DEFAULT (1024 * 1024) #define WRITE_BLOCK_SIZE_DEFAULT (4 * 1024)
+/* + * Perform the O(1) btrfs clone operation, if possible. + * Upon success, return 0. Otherwise, return -1 and set errno. + */ +#if !defined(BTRFS_IOCTL_MAGIC) +# define BTRFS_IOCTL_MAGIC 0x94 +#endif + +#define BTRFS_IOC_CLONE _IOW (BTRFS_IOCTL_MAGIC, 9, int) +
Instead of redefining the constants, can we check for btrfs.h at configure time and wrap the function in #ifdef BTRFS_IOC_CLONE ? Jan

-----Original Message----- From: Ján Tomko [mailto:jtomko@redhat.com] Sent: Friday, January 16, 2015 10:21 PM To: Chen, Hanxiao/陈 晗霄; libvir-list@redhat.com Subject: Re: [libvirt] [PATCH 1/2] Introduce btrfsCloneFile for btrfs COW copy
On 01/13/2015 09:18 AM, Chen Hanxiao wrote:
Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com> --- src/storage/storage_backend.c | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+)
diff --git a/src/storage/storage_backend.c b/src/storage/storage_backend.c index b990a82..5ce3566 100644 --- a/src/storage/storage_backend.c +++ b/src/storage/storage_backend.c @@ -156,6 +156,32 @@ enum { #define READ_BLOCK_SIZE_DEFAULT (1024 * 1024) #define WRITE_BLOCK_SIZE_DEFAULT (4 * 1024)
+/* + * Perform the O(1) btrfs clone operation, if possible. + * Upon success, return 0. Otherwise, return -1 and set errno. + */ +#if !defined(BTRFS_IOCTL_MAGIC) +# define BTRFS_IOCTL_MAGIC 0x94 +#endif + +#define BTRFS_IOC_CLONE _IOW (BTRFS_IOCTL_MAGIC, 9, int) +
Instead of redefining the constants, can we check for btrfs.h at configure time and wrap the function in #ifdef BTRFS_IOC_CLONE ?
We need to install btrfs-progs-devel for BTRFS_IOC_CLONE in <btrfs/ioctl.h>. So it should be easy to just define magic number and issue a ioctl, than add a dependency on btrfs-progs-devel. Do we need to add a LIBVIRT_CHECK_PKG for btrfs-progs-devel? Thanks, - Chen

-----Original Message----- From: libvir-list-bounces@redhat.com [mailto:libvir-list-bounces@redhat.com] On Behalf Of Chen, Hanxiao Sent: Monday, January 19, 2015 2:26 PM To: Ján Tomko; libvir-list@redhat.com Subject: Re: [libvirt] [PATCH 1/2] Introduce btrfsCloneFile for btrfs COW copy
-----Original Message----- From: Ján Tomko [mailto:jtomko@redhat.com] Sent: Friday, January 16, 2015 10:21 PM To: Chen, Hanxiao/陈 晗霄; libvir-list@redhat.com [snip]
+#define BTRFS_IOC_CLONE _IOW (BTRFS_IOCTL_MAGIC, 9, int) +
Instead of redefining the constants, can we check for btrfs.h at configure time and wrap the function in #ifdef BTRFS_IOC_CLONE ?
We need to install btrfs-progs-devel for BTRFS_IOC_CLONE in <btrfs/ioctl.h>. So it should be easy to just define magic number and issue a ioctl, than add a dependency on btrfs-progs-devel.
Do we need to add a LIBVIRT_CHECK_PKG for btrfs-progs-devel?
Hi Jan, Any comments? Thanks, - Chen

On 01/21/2015 08:06 AM, Chen, Hanxiao wrote:
-----Original Message----- From: libvir-list-bounces@redhat.com [mailto:libvir-list-bounces@redhat.com] On Behalf Of Chen, Hanxiao Sent: Monday, January 19, 2015 2:26 PM To: Ján Tomko; libvir-list@redhat.com Subject: Re: [libvirt] [PATCH 1/2] Introduce btrfsCloneFile for btrfs COW copy
-----Original Message----- From: Ján Tomko [mailto:jtomko@redhat.com] Sent: Friday, January 16, 2015 10:21 PM To: Chen, Hanxiao/陈 晗霄; libvir-list@redhat.com [snip]
+#define BTRFS_IOC_CLONE _IOW (BTRFS_IOCTL_MAGIC, 9, int) +
Instead of redefining the constants, can we check for btrfs.h at configure time and wrap the function in #ifdef BTRFS_IOC_CLONE ?
We need to install btrfs-progs-devel for BTRFS_IOC_CLONE in <btrfs/ioctl.h>. So it should be easy to just define magic number and issue a ioctl, than add a dependency on btrfs-progs-devel.
Do we need to add a LIBVIRT_CHECK_PKG for btrfs-progs-devel?
I can see BTRFS_IOC_CLONE in linux/btrfs.h, which is part of kernel-headers on RHEL 7. Anyway, we should be checking for the header file via AC_CHECK_HEADER (preferably only when we're building with the storage driver that will use it), not a distro-specific package name. Jan

We don't take advantage of clone of btrfs. So a)try to do a btrfs lightweight copy b)fall back to a standard copy if COW copy not supported. Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com> --- include/libvirt/libvirt-storage.h | 1 + src/storage/storage_backend.c | 39 ++++++++++++++++++++++++++++++++------- src/storage/storage_backend_fs.c | 8 ++++++-- src/storage/storage_driver.c | 4 +++- 4 files changed, 42 insertions(+), 10 deletions(-) diff --git a/include/libvirt/libvirt-storage.h b/include/libvirt/libvirt-storage.h index 1f3087b..4ae42e1 100644 --- a/include/libvirt/libvirt-storage.h +++ b/include/libvirt/libvirt-storage.h @@ -306,6 +306,7 @@ const char* virStorageVolGetKey (virStorageVolPtr vol); typedef enum { VIR_STORAGE_VOL_CREATE_PREALLOC_METADATA = 1 << 0, + VIR_STORAGE_VOL_CREATE_REFLINK = 1 << 1, } virStorageVolCreateFlags; virStorageVolPtr virStorageVolCreateXML (virStoragePoolPtr pool, diff --git a/src/storage/storage_backend.c b/src/storage/storage_backend.c index 5ce3566..838398e 100644 --- a/src/storage/storage_backend.c +++ b/src/storage/storage_backend.c @@ -187,7 +187,8 @@ virStorageBackendCopyToFD(virStorageVolDefPtr vol, virStorageVolDefPtr inputvol, int fd, unsigned long long *total, - bool want_sparse) + bool want_sparse, + bool reflink_copy) { int inputfd = -1; int amtread = -1; @@ -226,6 +227,15 @@ virStorageBackendCopyToFD(virStorageVolDefPtr vol, goto cleanup; } + if (reflink_copy) { + if (btrfsCloneFile(fd, inputfd) == -1) { + VIR_DEBUG("btrfs clone not supported, try another way."); + } else { + VIR_DEBUG("btrfs clone findished."); + goto cleanup; + } + } + while (amtread != 0) { int amtleft; @@ -306,8 +316,11 @@ virStorageBackendCreateBlockFrom(virConnectPtr conn ATTRIBUTE_UNUSED, struct stat st; gid_t gid; uid_t uid; + bool reflink_copy = false; - virCheckFlags(VIR_STORAGE_VOL_CREATE_PREALLOC_METADATA, -1); + virCheckFlags(VIR_STORAGE_VOL_CREATE_PREALLOC_METADATA | + VIR_STORAGE_VOL_CREATE_REFLINK, + -1); if (flags & VIR_STORAGE_VOL_CREATE_PREALLOC_METADATA) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", @@ -316,6 +329,9 @@ virStorageBackendCreateBlockFrom(virConnectPtr conn ATTRIBUTE_UNUSED, goto cleanup; } + if (flags & VIR_STORAGE_VOL_CREATE_REFLINK) + reflink_copy = true; + if ((fd = open(vol->target.path, O_RDWR)) < 0) { virReportSystemError(errno, _("cannot create path '%s'"), @@ -327,7 +343,7 @@ virStorageBackendCreateBlockFrom(virConnectPtr conn ATTRIBUTE_UNUSED, if (inputvol) { int res = virStorageBackendCopyToFD(vol, inputvol, - fd, &remain, false); + fd, &remain, false, reflink_copy); if (res < 0) goto cleanup; } @@ -372,7 +388,8 @@ virStorageBackendCreateBlockFrom(virConnectPtr conn ATTRIBUTE_UNUSED, static int createRawFile(int fd, virStorageVolDefPtr vol, - virStorageVolDefPtr inputvol) + virStorageVolDefPtr inputvol, + bool reflink_copy) { bool need_alloc = true; int ret = 0; @@ -419,7 +436,8 @@ createRawFile(int fd, virStorageVolDefPtr vol, bool want_sparse = !need_alloc || (vol->target.allocation < inputvol->target.capacity); - ret = virStorageBackendCopyToFD(vol, inputvol, fd, &remain, want_sparse); + ret = virStorageBackendCopyToFD(vol, inputvol, fd, &remain, + want_sparse, reflink_copy); if (ret < 0) goto cleanup; } @@ -454,8 +472,11 @@ virStorageBackendCreateRaw(virConnectPtr conn ATTRIBUTE_UNUSED, int ret = -1; int fd = -1; int operation_flags; + bool reflink_copy = false; - virCheckFlags(VIR_STORAGE_VOL_CREATE_PREALLOC_METADATA, -1); + virCheckFlags(VIR_STORAGE_VOL_CREATE_PREALLOC_METADATA | + VIR_STORAGE_VOL_CREATE_REFLINK, + -1); if (flags & VIR_STORAGE_VOL_CREATE_PREALLOC_METADATA) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", @@ -464,6 +485,10 @@ virStorageBackendCreateRaw(virConnectPtr conn ATTRIBUTE_UNUSED, goto cleanup; } + if (flags & VIR_STORAGE_VOL_CREATE_REFLINK) + reflink_copy = true; + + if (vol->target.encryption != NULL) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("storage pool does not support encrypted volumes")); @@ -506,7 +531,7 @@ virStorageBackendCreateRaw(virConnectPtr conn ATTRIBUTE_UNUSED, #endif } - if ((ret = createRawFile(fd, vol, inputvol)) < 0) + if ((ret = createRawFile(fd, vol, inputvol, reflink_copy)) < 0) /* createRawFile already reported the exact error. */ ret = -1; diff --git a/src/storage/storage_backend_fs.c b/src/storage/storage_backend_fs.c index 34f2153..cf30aab 100644 --- a/src/storage/storage_backend_fs.c +++ b/src/storage/storage_backend_fs.c @@ -1105,7 +1105,9 @@ virStorageBackendFileSystemVolBuild(virConnectPtr conn, virStorageVolDefPtr vol, unsigned int flags) { - virCheckFlags(VIR_STORAGE_VOL_CREATE_PREALLOC_METADATA, -1); + virCheckFlags(VIR_STORAGE_VOL_CREATE_PREALLOC_METADATA | + VIR_STORAGE_VOL_CREATE_REFLINK, + -1); return _virStorageBackendFileSystemVolBuild(conn, pool, vol, NULL, flags); } @@ -1120,7 +1122,9 @@ virStorageBackendFileSystemVolBuildFrom(virConnectPtr conn, virStorageVolDefPtr inputvol, unsigned int flags) { - virCheckFlags(VIR_STORAGE_VOL_CREATE_PREALLOC_METADATA, -1); + virCheckFlags(VIR_STORAGE_VOL_CREATE_PREALLOC_METADATA | + VIR_STORAGE_VOL_CREATE_REFLINK, + -1); return _virStorageBackendFileSystemVolBuild(conn, pool, vol, inputvol, flags); } diff --git a/src/storage/storage_driver.c b/src/storage/storage_driver.c index 66dc994..2c7af48 100644 --- a/src/storage/storage_driver.c +++ b/src/storage/storage_driver.c @@ -1775,7 +1775,9 @@ storageVolCreateXMLFrom(virStoragePoolPtr obj, unsigned long long allocation; int buildret; - virCheckFlags(VIR_STORAGE_VOL_CREATE_PREALLOC_METADATA, NULL); + virCheckFlags(VIR_STORAGE_VOL_CREATE_PREALLOC_METADATA | + VIR_STORAGE_VOL_CREATE_REFLINK, + NULL); storageDriverLock(); pool = virStoragePoolObjFindByUUID(&driver->pools, obj->uuid); -- 1.9.3

On 01/13/2015 09:18 AM, Chen Hanxiao wrote:
We don't take advantage of clone of btrfs.
So a)try to do a btrfs lightweight copy
b)fall back to a standard copy if COW copy not supported.
Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com> --- include/libvirt/libvirt-storage.h | 1 + src/storage/storage_backend.c | 39 ++++++++++++++++++++++++++++++++------- src/storage/storage_backend_fs.c | 8 ++++++-- src/storage/storage_driver.c | 4 +++- 4 files changed, 42 insertions(+), 10 deletions(-)
diff --git a/include/libvirt/libvirt-storage.h b/include/libvirt/libvirt-storage.h index 1f3087b..4ae42e1 100644 --- a/include/libvirt/libvirt-storage.h +++ b/include/libvirt/libvirt-storage.h @@ -306,6 +306,7 @@ const char* virStorageVolGetKey (virStorageVolPtr vol);
typedef enum { VIR_STORAGE_VOL_CREATE_PREALLOC_METADATA = 1 << 0, + VIR_STORAGE_VOL_CREATE_REFLINK = 1 << 1,
There should be some human-readable commentary after the flag, that will be shown on our API reference page: http://libvirt.org/html/libvirt-libvirt-storage.html#virStorageVolCreateFlag...
} virStorageVolCreateFlags;
virStorageVolPtr virStorageVolCreateXML (virStoragePoolPtr pool, diff --git a/src/storage/storage_backend.c b/src/storage/storage_backend.c index 5ce3566..838398e 100644 --- a/src/storage/storage_backend.c +++ b/src/storage/storage_backend.c @@ -187,7 +187,8 @@ virStorageBackendCopyToFD(virStorageVolDefPtr vol, virStorageVolDefPtr inputvol, int fd, unsigned long long *total, - bool want_sparse) + bool want_sparse, + bool reflink_copy) { int inputfd = -1; int amtread = -1; @@ -226,6 +227,15 @@ virStorageBackendCopyToFD(virStorageVolDefPtr vol, goto cleanup; }
+ if (reflink_copy) { + if (btrfsCloneFile(fd, inputfd) == -1) { + VIR_DEBUG("btrfs clone not supported, try another way.");
If reflink copy is unsupported, I think we should just error out, as a full copy won't be as fast / occupy as little space. Also, this would fall back even on other errors. The rest looks good to me. Jan

-----Original Message----- From: Ján Tomko [mailto:jtomko@redhat.com] Sent: Friday, January 16, 2015 10:22 PM To: Chen, Hanxiao/陈 晗霄; libvir-list@redhat.com Subject: Re: [libvirt] [PATCH 2/2] storage: try to perform btrfs COW copy if possible
On 01/13/2015 09:18 AM, Chen Hanxiao wrote:
We don't take advantage of clone of btrfs.
So a)try to do a btrfs lightweight copy
b)fall back to a standard copy if COW copy not supported.
Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com> --- include/libvirt/libvirt-storage.h | 1 + src/storage/storage_backend.c | 39 ++++++++++++++++++++++++++++++++------- src/storage/storage_backend_fs.c | 8 ++++++-- src/storage/storage_driver.c | 4 +++- 4 files changed, 42 insertions(+), 10 deletions(-)
diff --git a/include/libvirt/libvirt-storage.h b/include/libvirt/libvirt-storage.h index 1f3087b..4ae42e1 100644 --- a/include/libvirt/libvirt-storage.h +++ b/include/libvirt/libvirt-storage.h @@ -306,6 +306,7 @@ const char* virStorageVolGetKey (virStorageVolPtr vol);
typedef enum { VIR_STORAGE_VOL_CREATE_PREALLOC_METADATA = 1 << 0, + VIR_STORAGE_VOL_CREATE_REFLINK = 1 << 1,
There should be some human-readable commentary after the flag, that will be shown on our API reference page: http://libvirt.org/html/libvirt-libvirt-storage.html#virStorageVolCreateFlag...
Will do.
} virStorageVolCreateFlags;
virStorageVolPtr virStorageVolCreateXML (virStoragePoolPtr pool, diff --git a/src/storage/storage_backend.c b/src/storage/storage_backend.c index 5ce3566..838398e 100644 --- a/src/storage/storage_backend.c +++ b/src/storage/storage_backend.c @@ -187,7 +187,8 @@ virStorageBackendCopyToFD(virStorageVolDefPtr vol, virStorageVolDefPtr inputvol, int fd, unsigned long long *total, - bool want_sparse) + bool want_sparse, + bool reflink_copy) { int inputfd = -1; int amtread = -1; @@ -226,6 +227,15 @@ virStorageBackendCopyToFD(virStorageVolDefPtr vol, goto cleanup; }
+ if (reflink_copy) { + if (btrfsCloneFile(fd, inputfd) == -1) { + VIR_DEBUG("btrfs clone not supported, try another way.");
If reflink copy is unsupported, I think we should just error out, as a full copy won't be as fast / occupy as little space.
Also, this would fall back even on other errors.
That's a better solution, will be changed in v2. Thanks, - Chen
The rest looks good to me.
Jan

-----Original Message----- From: libvir-list-bounces@redhat.com [mailto:libvir-list-bounces@redhat.com] On Behalf Of Chen Hanxiao Sent: Tuesday, January 13, 2015 4:18 PM To: libvir-list@redhat.com Subject: [libvirt] [PATCH 0/2] try to use btrfs COW copy
If VIR_STORAGE_VOL_CREATE_REFLINK is specified, try to use btrfs COW copy; If btrfs COW copy is not supported, fall back to normal copy.
Chen Hanxiao (2): introduce btrfsCloneFile for COW copy storage: try to perform btrfs clone if possible
include/libvirt/libvirt-storage.h | 1 + src/storage/storage_backend.c | 65 ++++++++++++++++++++++++++++++++++----- src/storage/storage_backend_fs.c | 8 +++-- src/storage/storage_driver.c | 4 ++- 4 files changed, 68 insertions(+), 10 deletions(-)
ping.
participants (3)
-
Chen Hanxiao
-
Chen, Hanxiao
-
Ján Tomko