2014/1/15 Michal Privoznik <mprivozn(a)redhat.com>
On 24.12.2013 09:56, Chunyan Liu wrote:
> Btrfs has terrible performance when hosting VM images, even more when
the guest
> in those VM are also using btrfs as file system. One way to mitigate
this bad
> performance is to turn off COW attributes on VM files (since having copy
on
> write for this kind of data is not useful).
>
> According to 'chattr' manpage, NOCOW could be set to new or empty file
only on
> btrfs, so this patch tries to add nocow feature option in volume xml and
handle
> it in vol-create, so that users could have a chance to set NOCOW to a new
> volume if that happens on a btrfs like file system.
>
> Signed-off-by: Chunyan Liu <cyliu(a)suse.com>
>
> ---
> This is a revised version to:
>
http://www.redhat.com/archives/libvir-list/2013-December/msg00303.html
>
> Changes:
> * fix Daniel's comments
>
> ---
> docs/formatstorage.html.in | 12 +++++---
> docs/schemas/storagefilefeatures.rng | 3 ++
> src/conf/storage_conf.c | 9 ++++--
> src/storage/storage_backend.c | 4 +-
> src/storage/storage_backend_fs.c | 48
++++++++++++++++++++++++++++++++++
> src/util/virstoragefile.c | 1 +
> src/util/virstoragefile.h | 1 +
> 7 files changed, 69 insertions(+), 9 deletions(-)
Can you add a testcase that (at least) checks the RNG scheme? The more
your test tests the better.
Michal, thanks for your comment. Sorry for late response. I'm on vacation
these
days.
Sure, I can add some testcase for that.
> diff --git a/docs/formatstorage.html.in b/docs/formatstorage.html.in
> index a089a31..3de1a2b 100644
> --- a/docs/formatstorage.html.in
> +++ b/docs/formatstorage.html.in
> @@ -385,6 +385,7 @@
> <compat>1.1</compat>
> <features>
> <lazy_refcounts/>
> + <nocow/>
> </features>
> </target></pre>
>
> @@ -423,11 +424,14 @@
> <span class="since">Since 1.1.0</span>
> </dd>
> <dt><code>features</code></dt>
> - <dd>Format-specific features. Only used for
<code>qcow2</code>
now.
> - Valid sub-elements are:
> - <ul>
> + <dd>Format-specific features. Valid sub-elements are:
> + <ul>
> <li><code><lazy_refcounts/></code> -
allow delayed
reference
> - counter updates. <span class="since">Since
1.1.0</span></li>
> + counter updates. Only used for <code>qcow2</code> now.
> + <span class="since">Since 1.1.0</span></li>
> + <li><code><nocow/></code> - turn off
copy-on-write.
Only valid
> + to volume on <code>btrfs</code>, can improve performance.
> + <span class="since">Since 1.2.2</span></li>
> </ul>
> </dd>
> </dl>
> diff --git a/docs/schemas/storagefilefeatures.rng
b/docs/schemas/storagefilefeatures.rng
> index 424b4e2..0cf3513 100644
> --- a/docs/schemas/storagefilefeatures.rng
> +++ b/docs/schemas/storagefilefeatures.rng
> @@ -17,6 +17,9 @@
> <element name='lazy_refcounts'>
> <empty/>
> </element>
> + <element name='nocow'>
> + <empty/>
> + </element>
> </optional>
> </interleave>
> </element>
> diff --git a/src/conf/storage_conf.c b/src/conf/storage_conf.c
> index 22e38c1..b6409a6 100644
> --- a/src/conf/storage_conf.c
> +++ b/src/conf/storage_conf.c
> @@ -1398,9 +1398,6 @@ virStorageVolDefParseXML(virStoragePoolDefPtr pool,
> if ((n = virXPathNodeSet("./target/features/*", ctxt,
&nodes))
< 0)
> goto error;
>
> - if (!ret->target.compat && VIR_STRDUP(ret->target.compat,
"1.1") < 0)
> - goto error;
> -
> if (!(ret->target.features =
virBitmapNew(VIR_STORAGE_FILE_FEATURE_LAST)))
> goto error;
>
> @@ -1412,6 +1409,12 @@ virStorageVolDefParseXML(virStoragePoolDefPtr
pool,
> (const char*)nodes[i]->name);
> goto error;
> }
> +
> + if (f == VIR_STORAGE_FILE_FEATURE_LAZY_REFCOUNTS) {
> + if (!ret->target.compat &&
VIR_STRDUP(ret->target.compat, "1.1") < 0)
> + goto error;
> + }
> +
> ignore_value(virBitmapSetBit(ret->target.features, f));
> }
> VIR_FREE(nodes);
> diff --git a/src/storage/storage_backend.c
b/src/storage/storage_backend.c
> index b08d646..b4ab866 100644
> --- a/src/storage/storage_backend.c
> +++ b/src/storage/storage_backend.c
> @@ -423,7 +423,7 @@ virStorageBackendCreateRaw(virConnectPtr conn
ATTRIBUTE_UNUSED,
> operation_flags |= VIR_FILE_OPEN_FORK;
>
> if ((fd = virFileOpenAs(vol->target.path,
> - O_RDWR | O_CREAT | O_EXCL,
> + O_RDWR | O_CREAT,
I don't think this is safe. I see why are you doing this [2], but what
if user hadn't specified nocow feature? Then we might overwrite an
existing file. And we want to do that.
I'm aware of the overwrite issue. But in fact, if there is an existing
volume
with the same name, it will be checked out in earlier stage.
> vol->target.perms.mode,
> vol->target.perms.uid,
> vol->target.perms.gid,
> @@ -729,7 +729,7 @@ virStorageBackendCreateQemuImgOpts(char **opts,
> break;
>
> /* coverity[dead_error_begin] */
> - case VIR_STORAGE_FILE_FEATURE_LAST:
> + default:
No, please no. The whole intent of having the enum enumerated explicitly
is: whenever a new item is added to a enum compiler will find all the
places that needs adjustment.
> ;
> }
> virBufferAsprintf(&buf, "%s,",
> diff --git a/src/storage/storage_backend_fs.c
b/src/storage/storage_backend_fs.c
> index 95783be..dc26f6d 100644
> --- a/src/storage/storage_backend_fs.c
> +++ b/src/storage/storage_backend_fs.c
> @@ -51,6 +51,13 @@
> #include "virfile.h"
> #include "virlog.h"
> #include "virstring.h"
> +#ifdef __linux__
> +# include <sys/ioctl.h>
> +# include <linux/fs.h>
> +#ifndef FS_NOCOW_FL
> +#define FS_NOCOW_FL 0x00800000 /* Do not cow file */
> +#endif
> +#endif
>
> #define VIR_FROM_THIS VIR_FROM_STORAGE
>
> @@ -1061,6 +1068,7 @@ _virStorageBackendFileSystemVolBuild(virConnectPtr
conn,
> {
> virStorageBackendBuildVolFrom create_func;
> int tool_type;
> + bool nocow;
This should rather be initialized to false ...[1]
>
> if (inputvol) {
> if (vol->target.encryption != NULL) {
> @@ -1090,6 +1098,46 @@
_virStorageBackendFileSystemVolBuild(virConnectPtr conn,
> return -1;
> }
>
> + if (vol->target.features)
> + ignore_value(virBitmapGetBit(vol->target.features,
> + VIR_STORAGE_FILE_FEATURE_NOCOW,
&nocow));
> + if (nocow) {
[1] .. otherwise we might use an uninitialized value here.
> +#ifdef __linux__
Okay, btrfs is currently supported only on linux.
> + /* create an empty file and set nocow flag.
> + * This could optimize performance on file system like btrfs.
> + */
> + int attr, fd;
> + int operation_flags = VIR_FILE_OPEN_FORCE_MODE |
VIR_FILE_OPEN_FORCE_OWNER;
> + if (pool->def->type == VIR_STORAGE_POOL_NETFS)
> + operation_flags |= VIR_FILE_OPEN_FORK;
> +
> + if ((fd = virFileOpenAs(vol->target.path,
> + O_RDWR | O_CREAT | O_EXCL | O_LARGEFILE,
> + vol->target.perms.mode,
> + vol->target.perms.uid,
> + vol->target.perms.gid,
> + operation_flags)) < 0) {
> + virReportSystemError(-fd,
> + _("Failed to create file
'%s'"),
> + vol->target.path);
> + return -1;
> + }
2: as you say, btrfs allows setting NOCOW flag only on an empty file
(I'm taking your word for it, haven't checked myself). However, with
this approach we might get into troubles when 'nocow' wasn't specified
in the XML. What would be more appropriate is to pass the NOCOW flag
down the path and do ioctl() just after the virFileOpenAs in
virStorageBackendCreateRaw().
There are three create functions according to the vol format:
virStorageBackendCreateRaw, virStorageBackendCreateQemuImg and
virStorageBackendCreateQcowCreate. To pass down NOCOW flag, we need to
handle it
in all three places. Last two will call 'qemu-img' and 'qcow-create'
commands to create
the image, NOCOW option is not supported yet in those commands.
BTW, why need qcow-create, but not using 'qemu-img create -f qcow'.
> +
> + /* This is an optimisation. The FS_IOC_SETFLAGS ioctl return
value will
> + * be ignored since any failure of this operation should not
block the
> + * left work.
> + */
> + if (ioctl(fd, FS_IOC_GETFLAGS, &attr) == 0) {
> + attr |= FS_NOCOW_FL;
> + ioctl(fd, FS_IOC_SETFLAGS, &attr);
> + }
> +
> + VIR_FORCE_CLOSE(fd);
> +#endif
I think this #endif is premature. Should we:
#else
virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, _("nocow is not
supported on this platform");
#endif
instead? BTW:
> + if (virBitmapClearBit(vol->target.features,
VIR_STORAGE_FILE_FEATURE_NOCOW) < 0)
> + return -1;
> + }
> +
> if (create_func(conn, pool, vol, inputvol, flags) < 0)
> return -1;
> return 0;
> diff --git a/src/util/virstoragefile.c b/src/util/virstoragefile.c
> index e45236f..6e34e4c 100644
> --- a/src/util/virstoragefile.c
> +++ b/src/util/virstoragefile.c
> @@ -62,6 +62,7 @@ VIR_ENUM_IMPL(virStorageFileFormat,
> VIR_ENUM_IMPL(virStorageFileFeature,
> VIR_STORAGE_FILE_FEATURE_LAST,
> "lazy_refcounts",
> + "nocow",
> )
>
> enum lv_endian {
> diff --git a/src/util/virstoragefile.h b/src/util/virstoragefile.h
> index 7bd2fe0..f45743a 100644
> --- a/src/util/virstoragefile.h
> +++ b/src/util/virstoragefile.h
> @@ -62,6 +62,7 @@ VIR_ENUM_DECL(virStorageFileFormat);
>
> enum virStorageFileFeature {
> VIR_STORAGE_FILE_FEATURE_LAZY_REFCOUNTS = 0,
> + VIR_STORAGE_FILE_FEATURE_NOCOW = 1,
This = 1 shouldn't be needed, but doesn't hurt either.
>
> VIR_STORAGE_FILE_FEATURE_LAST
> };
>
Looking forward to v3. Which reminds me, can you please note the version
of the patch in the subject line?
Michal