[libvirt] [PATCHv4 0/8] glusterfs storage pool

v3: https://www.redhat.com/archives/libvir-list/2013-November/msg00348.html
Depends on: https://www.redhat.com/archives/libvir-list/2013-November/msg00955.html

Changes since then, addressing review feedback:
- rebase to other improvements in the meantime
- new patches 4-7
- pool changed to require <name>volume</name> to have no slash, with
  subdirectory within a volume selected by <dir path=.../>, which must
  begin with a slash
- documentation improved to match actual testing
- directories and symlinks are handled
- volume owner and timestamps are handled
- volume xml tests added, with several bugs in earlier version fixed
  along the way
- compared gluster pool with a netfs pool to ensure both can see the
  same level of detail from the same gluster storage

If you think it will help review, ask me to provide an interdiff from
v3 (although I have not done it yet).

Eric Blake (8):
  storage: initial support for linking with libgfapi
  storage: document gluster pool
  storage: implement rudimentary glusterfs pool refresh
  storage: add network-dir as new storage volume type
  storage: improve directory support in gluster pool
  storage: improve allocation stats reported on gluster files
  storage: improve handling of symlinks in gluster
  storage: probe qcow2 volumes in gluster pool

 configure.ac                                     |  21 ++
 docs/formatstorage.html.in                       |  15 +-
 docs/schemas/storagepool.rng                     |  26 +-
 docs/storage.html.in                             |  91 +++++-
 include/libvirt/libvirt.h.in                     |   2 +
 libvirt.spec.in                                  |  15 +
 m4/virt-gluster.m4                               |  28 ++
 po/POTFILES.in                                   |   1 +
 src/Makefile.am                                  |  10 +
 src/conf/storage_conf.c                          |  28 +-
 src/conf/storage_conf.h                          |   3 +-
 src/qemu/qemu_command.c                          |   6 +-
 src/qemu/qemu_conf.c                             |   4 +-
 src/storage/storage_backend.c                    |  14 +-
 src/storage/storage_backend.h                    |   6 +-
 src/storage/storage_backend_fs.c                 |   5 +-
 src/storage/storage_backend_gluster.c            | 381 +++++++++++++++++++++++
 src/storage/storage_backend_gluster.h            |  29 ++
 tests/storagepoolxml2xmlin/pool-gluster-sub.xml  |   9 +
 tests/storagepoolxml2xmlin/pool-gluster.xml      |   8 +
 tests/storagepoolxml2xmlout/pool-gluster-sub.xml |  12 +
 tests/storagepoolxml2xmlout/pool-gluster.xml     |  12 +
 tests/storagepoolxml2xmltest.c                   |   2 +
 tests/storagevolxml2xmlin/vol-gluster-dir.xml    |  13 +
 tests/storagevolxml2xmlout/vol-gluster-dir.xml   |  18 ++
 tests/storagevolxml2xmltest.c                    |   1 +
 tools/virsh-volume.c                             |   5 +-
 27 files changed, 740 insertions(+), 25 deletions(-)
 create mode 100644 m4/virt-gluster.m4
 create mode 100644 src/storage/storage_backend_gluster.c
 create mode 100644 src/storage/storage_backend_gluster.h
 create mode 100644 tests/storagepoolxml2xmlin/pool-gluster-sub.xml
 create mode 100644 tests/storagepoolxml2xmlin/pool-gluster.xml
 create mode 100644 tests/storagepoolxml2xmlout/pool-gluster-sub.xml
 create mode 100644 tests/storagepoolxml2xmlout/pool-gluster.xml
 create mode 100644 tests/storagevolxml2xmlin/vol-gluster-dir.xml
 create mode 100644 tests/storagevolxml2xmlout/vol-gluster-dir.xml

-- 
1.8.3.1

We support gluster volumes in domain XML, so we also ought to support
them as a storage pool.  Besides, a future patch will want to take
advantage of libgfapi to handle the case of a gluster device holding
qcow2 rather than raw storage, and for that to work, we need a storage
backend that can read gluster storage volume contents.  This sets up
the framework.

Note that the new pool is named 'gluster' to match a
<disk type='network'><source protocol='gluster'> image source already
supported in a <domain>; it does NOT match the
<pool type='netfs'><source><target type='glusterfs'>, since that uses
a FUSE mount to a local file name rather than a network name.

This and subsequent patches have been tested against glusterfs 3.4.1
(available on Fedora 19); there are likely bugs in older versions that
may prevent decent use of gfapi, so this patch enforces the minimum
version tested.  A future patch may lower the minimum.  On the other
hand, I hit at least two bugs in 3.4.1 that will be fixed in
3.5/3.4.2, where it might be worth raising the minimum: glfs_readdir
is nicer to use than glfs_readdir_r [1], and glfs_fini should only
return failure on an actual failure [2].

[1] http://lists.gnu.org/archive/html/gluster-devel/2013-10/msg00085.html
[2] http://lists.gnu.org/archive/html/gluster-devel/2013-10/msg00086.html

* configure.ac (WITH_STORAGE_GLUSTER): New conditional.
* m4/virt-gluster.m4: New file.
* libvirt.spec.in (BuildRequires): Support gluster in spec file.
* src/conf/storage_conf.h (VIR_STORAGE_POOL_GLUSTER): New pool type.
* src/conf/storage_conf.c (poolTypeInfo): Treat similar to sheepdog
and rbd.
(virStoragePoolDefFormat): Don't output target for gluster.
* src/storage/storage_backend_gluster.h: New file.
* src/storage/storage_backend_gluster.c: Likewise.
* po/POTFILES.in: Add new file.
* src/storage/storage_backend.c (backends): Register new type.
* src/Makefile.am (STORAGE_DRIVER_GLUSTER_SOURCES): Build new files.
* src/storage/storage_backend.h (_virStorageBackend): Document
assumption.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 configure.ac                          | 21 ++++++++++++++++
 libvirt.spec.in                       | 15 ++++++++++++
 m4/virt-gluster.m4                    | 28 +++++++++++++++++++++
 po/POTFILES.in                        |  1 +
 src/Makefile.am                       | 10 ++++++++
 src/conf/storage_conf.c               | 26 +++++++++++++++++---
 src/conf/storage_conf.h               |  3 ++-
 src/storage/storage_backend.c         |  6 +++++
 src/storage/storage_backend.h         |  6 +++--
 src/storage/storage_backend_gluster.c | 46 +++++++++++++++++++++++++++++++++++
 src/storage/storage_backend_gluster.h | 29 ++++++++++++++++++++++
 11 files changed, 184 insertions(+), 7 deletions(-)
 create mode 100644 m4/virt-gluster.m4
 create mode 100644 src/storage/storage_backend_gluster.c
 create mode 100644 src/storage/storage_backend_gluster.h

diff --git a/configure.ac b/configure.ac
index 044cf37..934c87e 100644
--- a/configure.ac
+++ b/configure.ac
@@ -228,6 +228,7 @@ LIBVIRT_CHECK_CAPNG
 LIBVIRT_CHECK_CURL
 LIBVIRT_CHECK_DBUS
 LIBVIRT_CHECK_FUSE
+LIBVIRT_CHECK_GLUSTER
 LIBVIRT_CHECK_HAL
 LIBVIRT_CHECK_NETCF
 LIBVIRT_CHECK_NUMACTL
@@ -1643,6 +1644,10 @@ AC_ARG_WITH([storage-sheepdog],
   [AS_HELP_STRING([--with-storage-sheepdog],
     [with Sheepdog backend for the storage driver @<:@default=check@:>@])],
   [],[with_storage_sheepdog=check])
+AC_ARG_WITH([storage-gluster],
+  [AS_HELP_STRING([--with-storage-gluster],
+    [with Gluster backend for the storage driver @<:@default=check@:>@])],
+  [],[with_storage_gluster=check])

 if test "$with_libvirtd" = "no"; then
   with_storage_dir=no
@@ -1654,6 +1659,7 @@ if test "$with_libvirtd" = "no"; then
   with_storage_disk=no
   with_storage_rbd=no
   with_storage_sheepdog=no
+  with_storage_gluster=no
 fi
 if test "$with_storage_dir" = "yes" ; then
   AC_DEFINE_UNQUOTED([WITH_STORAGE_DIR], 1, [whether directory backend for storage driver is enabled])
@@ -1855,6 +1861,19 @@ fi
 AM_CONDITIONAL([WITH_STORAGE_SHEEPDOG], [test "$with_storage_sheepdog" = "yes"])

+LIBGLUSTER_LIBS=
+if test "$with_storage_gluster" = "check"; then
+  with_storage_gluster=$with_glusterfs
+fi
+if test "$with_storage_gluster" = "yes"; then
+  if test "$with_glusterfs" = no; then
+    AC_MSG_ERROR([Need glusterfs (libgfapi) for gluster storage driver])
+  fi
+  AC_DEFINE_UNQUOTED([WITH_STORAGE_GLUSTER], [1],
+    [whether Gluster backend for storage driver is enabled])
+fi
+AM_CONDITIONAL([WITH_STORAGE_GLUSTER], [test "$with_storage_gluster" = "yes"])
+
 LIBPARTED_CFLAGS=
 LIBPARTED_LIBS=
@@ -2582,6 +2601,7 @@ AC_MSG_NOTICE([  mpath: $with_storage_mpath])
 AC_MSG_NOTICE([   Disk: $with_storage_disk])
 AC_MSG_NOTICE([    RBD: $with_storage_rbd])
 AC_MSG_NOTICE([Sheepdog: $with_storage_sheepdog])
+AC_MSG_NOTICE([ Gluster: $with_storage_gluster])
 AC_MSG_NOTICE([])
 AC_MSG_NOTICE([Security Drivers])
 AC_MSG_NOTICE([])
@@ -2607,6 +2627,7 @@ LIBVIRT_RESULT_CAPNG
 LIBVIRT_RESULT_CURL
 LIBVIRT_RESULT_DBUS
 LIBVIRT_RESULT_FUSE
+LIBVIRT_RESULT_GLUSTER
 LIBVIRT_RESULT_HAL
 LIBVIRT_RESULT_NETCF
 LIBVIRT_RESULT_NUMACTL
diff --git a/libvirt.spec.in b/libvirt.spec.in
index 5bc53a0..760148c 100644
--- a/libvirt.spec.in
+++ b/libvirt.spec.in
@@ -98,6 +98,11 @@
 %else
     %define with_storage_sheepdog 0
 %endif
+%if 0%{?fedora} >= 19
+    %define with_storage_gluster 0%{!?_without_storage_gluster:%{server_drivers}}
+%else
+    %define with_storage_gluster 0
+%endif
 %define with_numactl 0%{!?_without_numactl:%{server_drivers}}
 %define with_selinux 0%{!?_without_selinux:%{server_drivers}}
@@ -280,6 +285,7 @@
     %define with_storage_mpath 0
     %define with_storage_rbd 0
     %define with_storage_sheepdog 0
+    %define with_storage_gluster 0
     %define with_storage_disk 0
 %endif
@@ -553,6 +559,10 @@ BuildRequires: device-mapper-devel
 BuildRequires: ceph-devel
 %endif
 %endif
+%if %{with_storage_gluster}
+BuildRequires: glusterfs-api-devel >= 3.4.1
+BuildRequires: glusterfs-devel >= 3.4.1
+%endif
 %if %{with_numactl}
 # For QEMU/LXC numa info
 BuildRequires: numactl-devel
@@ -1255,6 +1265,10 @@ driver
     %define _without_storage_sheepdog --without-storage-sheepdog
 %endif
+%if ! %{with_storage_gluster}
+    %define _without_storage_gluster --without-storage-gluster
+%endif
+
 %if ! %{with_numactl}
     %define _without_numactl --without-numactl
 %endif
@@ -1376,6 +1390,7 @@ driver
            %{?_without_storage_mpath} \
            %{?_without_storage_rbd} \
            %{?_without_storage_sheepdog} \
+           %{?_without_storage_gluster} \
            %{?_without_numactl} \
            %{?_without_numad} \
            %{?_without_capng} \
diff --git a/m4/virt-gluster.m4 b/m4/virt-gluster.m4
new file mode 100644
index 0000000..5a4a263
--- /dev/null
+++ b/m4/virt-gluster.m4
@@ -0,0 +1,28 @@
+dnl The gluster libgfapi.so library
+dnl
+dnl Copyright (C) 2013 Red Hat, Inc.
+dnl
+dnl This library is free software; you can redistribute it and/or
+dnl modify it under the terms of the GNU Lesser General Public
+dnl License as published by the Free Software Foundation; either
+dnl version 2.1 of the License, or (at your option) any later version.
+dnl
+dnl This library is distributed in the hope that it will be useful,
+dnl but WITHOUT ANY WARRANTY; without even the implied warranty of
+dnl MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+dnl Lesser General Public License for more details.
+dnl
+dnl You should have received a copy of the GNU Lesser General Public
+dnl License along with this library.  If not, see
+dnl <http://www.gnu.org/licenses/>.
+dnl
+
+dnl Currently tested against Fedora 19 with glusterfs 3.4.1; earlier
+dnl versions may be possible but only with further testing
+AC_DEFUN([LIBVIRT_CHECK_GLUSTER],[
+  LIBVIRT_CHECK_PKG([GLUSTERFS], [glusterfs-api], [3.4.1])
+])
+
+AC_DEFUN([LIBVIRT_RESULT_GLUSTER],[
+  LIBVIRT_RESULT_LIB([GLUSTERFS])
+])
diff --git a/po/POTFILES.in b/po/POTFILES.in
index 15afdec..0835b52 100644
--- a/po/POTFILES.in
+++ b/po/POTFILES.in
@@ -133,6 +133,7 @@ src/storage/parthelper.c
 src/storage/storage_backend.c
 src/storage/storage_backend_disk.c
 src/storage/storage_backend_fs.c
+src/storage/storage_backend_gluster.c
 src/storage/storage_backend_iscsi.c
 src/storage/storage_backend_logical.c
 src/storage/storage_backend_mpath.c
diff --git a/src/Makefile.am b/src/Makefile.am
index ad39b74..e9fd14d 100644
--- a/src/Makefile.am
+++ b/src/Makefile.am
@@ -819,6 +819,9 @@ STORAGE_DRIVER_RBD_SOURCES = \
 STORAGE_DRIVER_SHEEPDOG_SOURCES = \
 	storage/storage_backend_sheepdog.h storage/storage_backend_sheepdog.c

+STORAGE_DRIVER_GLUSTER_SOURCES = \
+	storage/storage_backend_gluster.h storage/storage_backend_gluster.c
+
 STORAGE_HELPER_DISK_SOURCES = \
 	storage/parthelper.c
@@ -1434,6 +1437,12 @@ if WITH_STORAGE_SHEEPDOG
 libvirt_driver_storage_impl_la_SOURCES += $(STORAGE_DRIVER_SHEEPDOG_SOURCES)
 endif WITH_STORAGE_SHEEPDOG

+if WITH_STORAGE_GLUSTER
+libvirt_driver_storage_impl_la_SOURCES += $(STORAGE_DRIVER_GLUSTER_SOURCES)
+libvirt_driver_storage_impl_la_CFLAGS += $(GLUSTERFS_CFLAGS)
+libvirt_driver_storage_impl_la_LIBADD += $(GLUSTERFS_LIBS)
+endif WITH_STORAGE_GLUSTER
+
 if WITH_NODE_DEVICES
 # Needed to keep automake quiet about conditionals
 if WITH_DRIVER_MODULES
@@ -1634,6 +1643,7 @@ EXTRA_DIST += \
 	$(STORAGE_DRIVER_DISK_SOURCES) \
 	$(STORAGE_DRIVER_RBD_SOURCES) \
 	$(STORAGE_DRIVER_SHEEPDOG_SOURCES) \
+	$(STORAGE_DRIVER_GLUSTER_SOURCES) \
 	$(NODE_DEVICE_DRIVER_SOURCES) \
 	$(NODE_DEVICE_DRIVER_HAL_SOURCES) \
 	$(NODE_DEVICE_DRIVER_UDEV_SOURCES) \
diff --git a/src/conf/storage_conf.c b/src/conf/storage_conf.c
index 0d2932b..add2ae1 100644
--- a/src/conf/storage_conf.c
+++ b/src/conf/storage_conf.c
@@ -59,7 +59,7 @@ VIR_ENUM_IMPL(virStoragePool,
               VIR_STORAGE_POOL_LAST,
               "dir", "fs", "netfs",
               "logical", "disk", "iscsi",
-              "scsi", "mpath", "rbd", "sheepdog")
+              "scsi", "mpath", "rbd", "sheepdog", "gluster")

 VIR_ENUM_IMPL(virStoragePoolFormatFileSystem,
               VIR_STORAGE_POOL_FS_LAST,
@@ -248,6 +248,19 @@ static virStoragePoolTypeInfo poolTypeInfo[] = {
          .formatToString = virStoragePoolFormatDiskTypeToString,
      }
     },
+    {.poolType = VIR_STORAGE_POOL_GLUSTER,
+     .poolOptions = {
+         .flags = (VIR_STORAGE_POOL_SOURCE_HOST |
+                   VIR_STORAGE_POOL_SOURCE_NETWORK |
+                   VIR_STORAGE_POOL_SOURCE_NAME |
+                   VIR_STORAGE_POOL_SOURCE_DIR),
+     },
+     .volOptions = {
+         .defaultFormat = VIR_STORAGE_FILE_RAW,
+         .formatToString = virStorageFileFormatTypeToString,
+         .formatFromString = virStorageVolumeFormatFromString,
+     }
+    },
     {.poolType = VIR_STORAGE_POOL_MPATH,
      .volOptions = {
          .formatToString = virStoragePoolFormatDiskTypeToString,
@@ -652,6 +665,10 @@ virStoragePoolDefParseSource(xmlXPathContextPtr ctxt,
     }

     source->dir = virXPathString("string(./dir/@path)", ctxt);
+    /* In gluster, a missing dir defaults to "/" */
+    if (!source->dir && pool_type == VIR_STORAGE_POOL_GLUSTER &&
+        VIR_STRDUP(source->dir, "/") < 0)
+        goto cleanup;

     if ((adapter_type = virXPathString("string(./adapter/@type)", ctxt))) {
         if ((source->adapter.type =
@@ -1196,10 +1213,11 @@ virStoragePoolDefFormat(virStoragePoolDefPtr def)
     if (virStoragePoolSourceFormat(&buf, options, &def->source) < 0)
         goto cleanup;

-    /* RBD and Sheepdog devices are not local block devs nor files, so it
-     * doesn't have a target */
+    /* RBD, Sheepdog, and Gluster devices are not local block devs nor
+     * files, so they don't have a target */
     if (def->type != VIR_STORAGE_POOL_RBD &&
-        def->type != VIR_STORAGE_POOL_SHEEPDOG) {
+        def->type != VIR_STORAGE_POOL_SHEEPDOG &&
+        def->type != VIR_STORAGE_POOL_GLUSTER) {
         virBufferAddLit(&buf, "  <target>\n");
         virBufferEscapeString(&buf, "    <path>%s</path>\n",
                               def->target.path);
diff --git a/src/conf/storage_conf.h b/src/conf/storage_conf.h
index c4dd403..f8a7eec 100644
--- a/src/conf/storage_conf.h
+++ b/src/conf/storage_conf.h
@@ -1,7 +1,7 @@
 /*
  * storage_conf.h: config handling for storage driver
  *
- * Copyright (C) 2006-2008, 2010-2012 Red Hat, Inc.
+ * Copyright (C) 2006-2008, 2010-2013 Red Hat, Inc.
  * Copyright (C) 2006-2008 Daniel P. Berrange
  *
  * This library is free software; you can redistribute it and/or
@@ -129,6 +129,7 @@ enum virStoragePoolType {
     VIR_STORAGE_POOL_MPATH,    /* Multipath devices */
     VIR_STORAGE_POOL_RBD,      /* RADOS Block Device */
     VIR_STORAGE_POOL_SHEEPDOG, /* Sheepdog device */
+    VIR_STORAGE_POOL_GLUSTER,  /* Gluster device */

     VIR_STORAGE_POOL_LAST,
 };
diff --git a/src/storage/storage_backend.c b/src/storage/storage_backend.c
index 1e3292b..c94e89a 100644
--- a/src/storage/storage_backend.c
+++ b/src/storage/storage_backend.c
@@ -80,6 +80,9 @@
 #if WITH_STORAGE_SHEEPDOG
 # include "storage_backend_sheepdog.h"
 #endif
+#if WITH_STORAGE_GLUSTER
+# include "storage_backend_gluster.h"
+#endif

 #define VIR_FROM_THIS VIR_FROM_STORAGE
@@ -112,6 +115,9 @@ static virStorageBackendPtr backends[] = {
 #if WITH_STORAGE_SHEEPDOG
     &virStorageBackendSheepdog,
 #endif
+#if WITH_STORAGE_GLUSTER
+    &virStorageBackendGluster,
+#endif
     NULL
 };
diff --git a/src/storage/storage_backend.h b/src/storage/storage_backend.h
index d8d3097..9e07dd8 100644
--- a/src/storage/storage_backend.h
+++ b/src/storage/storage_backend.h
@@ -1,7 +1,7 @@
 /*
  * storage_backend.h: internal storage driver backend contract
  *
- * Copyright (C) 2007-2010, 2012 Red Hat, Inc.
+ * Copyright (C) 2007-2010, 2012-2013 Red Hat, Inc.
  * Copyright (C) 2007-2008 Daniel P. Berrange
  *
  * This library is free software; you can redistribute it and/or
@@ -70,6 +70,8 @@ virStorageBackendFSImageToolTypeToFunc(int tool_type);
 typedef struct _virStorageBackend virStorageBackend;
 typedef virStorageBackend *virStorageBackendPtr;

+/* Callbacks are optional unless documented otherwise; but adding more
+ * callbacks provides better pool support. */
 struct _virStorageBackend {
     int type;

@@ -77,7 +79,7 @@ struct _virStorageBackend {
     virStorageBackendCheckPool checkPool;
     virStorageBackendStartPool startPool;
     virStorageBackendBuildPool buildPool;
-    virStorageBackendRefreshPool refreshPool;
+    virStorageBackendRefreshPool refreshPool; /* Must be non-NULL */
     virStorageBackendStopPool stopPool;
     virStorageBackendDeletePool deletePool;
diff --git a/src/storage/storage_backend_gluster.c b/src/storage/storage_backend_gluster.c
new file mode 100644
index 0000000..2863c73
--- /dev/null
+++ b/src/storage/storage_backend_gluster.c
@@ -0,0 +1,46 @@
+/*
+ * storage_backend_gluster.c: storage backend for Gluster handling
+ *
+ * Copyright (C) 2013 Red Hat, Inc.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library.  If not, see
+ * <http://www.gnu.org/licenses/>.
+ *
+ */
+
+#include <config.h>
+
+#include <glusterfs/api/glfs.h>
+
+#include "virerror.h"
+#include "storage_backend_gluster.h"
+#include "storage_conf.h"
+
+#define VIR_FROM_THIS VIR_FROM_STORAGE
+
+
+static int
+virStorageBackendGlusterRefreshPool(virConnectPtr conn ATTRIBUTE_UNUSED,
+                                    virStoragePoolObjPtr pool ATTRIBUTE_UNUSED)
+{
+    virReportError(VIR_ERR_NO_SUPPORT, "%s",
+                   _("gluster pool type not fully supported yet"));
+    return -1;
+}
+
+virStorageBackend virStorageBackendGluster = {
+    .type = VIR_STORAGE_POOL_GLUSTER,
+
+    .refreshPool = virStorageBackendGlusterRefreshPool,
+};
diff --git a/src/storage/storage_backend_gluster.h b/src/storage/storage_backend_gluster.h
new file mode 100644
index 0000000..b21bda7
--- /dev/null
+++ b/src/storage/storage_backend_gluster.h
@@ -0,0 +1,29 @@
+/*
+ * storage_backend_gluster.h: storage backend for Gluster handling
+ *
+ * Copyright (C) 2013 Red Hat, Inc.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library.  If not, see
+ * <http://www.gnu.org/licenses/>.
+ *
+ */
+
+#ifndef __VIR_STORAGE_BACKEND_GLUSTER_H__
+# define __VIR_STORAGE_BACKEND_GLUSTER_H__
+
+# include "storage_backend.h"
+
+extern virStorageBackend virStorageBackendGluster;
+
+#endif /* __VIR_STORAGE_BACKEND_GLUSTER_H__ */
-- 
1.8.3.1

On Fri, Nov 22, 2013 at 08:20:23PM -0700, Eric Blake wrote:
> We support gluster volumes in domain XML, so we also ought to support
> them as a storage pool.  Besides, a future patch will want to take
> advantage of libgfapi to handle the case of a gluster device holding
> qcow2 rather than raw storage, and for that to work, we need a
> storage backend that can read gluster storage volume contents.  This
> sets up the framework.
> 
> Note that the new pool is named 'gluster' to match a
> <disk type='network'><source protocol='gluster'> image source already
> supported in a <domain>; it does NOT match the
> <pool type='netfs'><source><target type='glusterfs'>, since that uses
> a FUSE mount to a local file name rather than a network name.
> 
> This and subsequent patches have been tested against glusterfs 3.4.1
> (available on Fedora 19); there are likely bugs in older versions
> that may prevent decent use of gfapi, so this patch enforces the
> minimum version tested.  A future patch may lower the minimum.  On
> the other hand, I hit at least two bugs in 3.4.1 that will be fixed
> in 3.5/3.4.2, where it might be worth raising the minimum:
> glfs_readdir is nicer to use than glfs_readdir_r [1], and glfs_fini
> should only return failure on an actual failure [2].
> 
> [1] http://lists.gnu.org/archive/html/gluster-devel/2013-10/msg00085.html
> [2] http://lists.gnu.org/archive/html/gluster-devel/2013-10/msg00086.html
> 
> * configure.ac (WITH_STORAGE_GLUSTER): New conditional.
> * m4/virt-gluster.m4: New file.
> * libvirt.spec.in (BuildRequires): Support gluster in spec file.
> * src/conf/storage_conf.h (VIR_STORAGE_POOL_GLUSTER): New pool type.
> * src/conf/storage_conf.c (poolTypeInfo): Treat similar to sheepdog
> and rbd.
> (virStoragePoolDefFormat): Don't output target for gluster.
> * src/storage/storage_backend_gluster.h: New file.
> * src/storage/storage_backend_gluster.c: Likewise.
> * po/POTFILES.in: Add new file.
> * src/storage/storage_backend.c (backends): Register new type.
> * src/Makefile.am (STORAGE_DRIVER_GLUSTER_SOURCES): Build new files.
> * src/storage/storage_backend.h (_virStorageBackend): Document
> assumption.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
>  configure.ac                          | 21 ++++++++++++++++
>  libvirt.spec.in                       | 15 ++++++++++++
>  m4/virt-gluster.m4                    | 28 +++++++++++++++++++++
>  po/POTFILES.in                        |  1 +
>  src/Makefile.am                       | 10 ++++++++
>  src/conf/storage_conf.c               | 26 +++++++++++++++++---
>  src/conf/storage_conf.h               |  3 ++-
>  src/storage/storage_backend.c         |  6 +++++
>  src/storage/storage_backend.h         |  6 +++--
>  src/storage/storage_backend_gluster.c | 46 +++++++++++++++++++++++++++++++++++
>  src/storage/storage_backend_gluster.h | 29 ++++++++++++++++++++++
>  11 files changed, 184 insertions(+), 7 deletions(-)
>  create mode 100644 m4/virt-gluster.m4
>  create mode 100644 src/storage/storage_backend_gluster.c
>  create mode 100644 src/storage/storage_backend_gluster.h
ACK

Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

On 11/25/2013 08:39 AM, Daniel P. Berrange wrote:
> On Fri, Nov 22, 2013 at 08:20:23PM -0700, Eric Blake wrote:
>> We support gluster volumes in domain XML, so we also ought to
>> support them as a storage pool.  Besides, a future patch will want
>> to take advantage of libgfapi to handle the case of a gluster device
>> holding qcow2 rather than raw storage, and for that to work, we need
>> a storage backend that can read gluster storage volume contents.
>> This sets up the framework.
> ACK
Pushed.  We're committed now - 'gluster' storage pool will be part of
1.2.0, and I'll be trying hard to hammer on it to squash bugs during
our code freeze.  If there's any slight naming tweaks to XML, now's
the time to decide it before it gets baked in to the release :)

[I'm testing that each patch of the series still works after all the
rebasing to latest, but the overall series will hopefully be upstream
alongside 1/8 within the next couple hours]

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
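For anyone who wants to hammer on the new pool type during the freeze, a minimal definition following the schema from patch 2/8 is enough to get started; the pool name, host, and gluster volume name below are placeholders:

```xml
<pool type='gluster'>
  <name>scratch</name>
  <source>
    <host name='localhost'/>
    <name>volname</name>
    <!-- optional: restrict the pool to a subdirectory of the volume -->
    <dir path='/'/>
  </source>
</pool>
```

Defining this with virsh pool-define and then pool-refresh exercises the rudimentary refresh path added in 3/8.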

Add support for a new <pool type='gluster'>, similar to RBD and
Sheepdog.  Terminology wise, a gluster volume forms a libvirt storage
pool; within the gluster volume, individual files are treated as
libvirt storage volumes.

* docs/schemas/storagepool.rng (poolgluster): New pool type.
* docs/formatstorage.html.in: Document gluster.
* docs/storage.html.in: Likewise, and contrast it with netfs.
* tests/storagepoolxml2xmlin/pool-gluster.xml: New test.
* tests/storagepoolxml2xmlout/pool-gluster.xml: Likewise.
* tests/storagepoolxml2xmltest.c (mymain): Likewise.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 docs/formatstorage.html.in                       | 15 ++--
 docs/schemas/storagepool.rng                     | 26 ++++++-
 docs/storage.html.in                             | 91 +++++++++++++++++++++-
 tests/storagepoolxml2xmlin/pool-gluster-sub.xml  |  9 +++
 tests/storagepoolxml2xmlin/pool-gluster.xml      |  8 +++
 tests/storagepoolxml2xmlout/pool-gluster-sub.xml | 12 ++++
 tests/storagepoolxml2xmlout/pool-gluster.xml     | 12 ++++
 tests/storagepoolxml2xmltest.c                   |  2 +
 8 files changed, 168 insertions(+), 7 deletions(-)
 create mode 100644 tests/storagepoolxml2xmlin/pool-gluster-sub.xml
 create mode 100644 tests/storagepoolxml2xmlin/pool-gluster.xml
 create mode 100644 tests/storagepoolxml2xmlout/pool-gluster-sub.xml
 create mode 100644 tests/storagepoolxml2xmlout/pool-gluster.xml

diff --git a/docs/formatstorage.html.in b/docs/formatstorage.html.in
index 5f277b4..a7f3b57 100644
--- a/docs/formatstorage.html.in
+++ b/docs/formatstorage.html.in
@@ -21,8 +21,10 @@
         <code>iscsi</code>, <code>logical</code>, <code>scsi</code>
         (all <span class="since">since 0.4.1</span>), <code>mpath</code>
         (<span class="since">since 0.7.1</span>), <code>rbd</code>
-        (<span class="since">since 0.9.13</span>), or <code>sheepdog</code>
-        (<span class="since">since 0.10.0</span>). This corresponds to the
+        (<span class="since">since 0.9.13</span>), <code>sheepdog</code>
+        (<span class="since">since 0.10.0</span>),
+        or <code>gluster</code> (<span class="since">since
+        1.1.5</span>). This corresponds to the
         storage backend drivers listed further along in this document.
       </p>
       <h3><a name="StoragePoolFirst">General metadata</a></h3>
@@ -101,7 +103,9 @@
         path to the block device node. <span class="since">Since 0.4.1</span></dd>
       <dt><code>dir</code></dt>
       <dd>Provides the source for pools backed by directories (pool
-        type <code>dir</code>). May
+        type <code>dir</code>), or optionally to select a subdirectory
+        within a pool that resembles a filesystem (pool
+        type <code>gluster</code>). May
         only occur once. Contains a single attribute <code>path</code>
         which is the fully qualified path to the backing directory.
         <span class="since">Since 0.4.1</span></dd>
@@ -129,7 +133,7 @@
       <dt><code>host</code></dt>
       <dd>Provides the source for pools backed by storage from a
         remote server (pool types <code>netfs</code>, <code>iscsi</code>,
-        <code>rbd</code>, <code>sheepdog</code>). Will be
+        <code>rbd</code>, <code>sheepdog</code>, <code>gluster</code>). Will be
         used in combination with a <code>directory</code>
         or <code>device</code> element. Contains an attribute <code>name</code>
         which is the hostname or IP address of the server. May optionally
@@ -160,7 +164,8 @@
       <dt><code>name</code></dt>
       <dd>Provides the source for pools backed by storage from a
         named element (pool types <code>logical</code>, <code>rbd</code>,
-        <code>sheepdog</code>). Contains a string identifier.
+        <code>sheepdog</code>, <code>gluster</code>). Contains a
+        string identifier.
         <span class="since">Since 0.4.5</span></dd>
       <dt><code>format</code></dt>
       <dd>Provides information about the format of the pool (pool
diff --git a/docs/schemas/storagepool.rng b/docs/schemas/storagepool.rng
index 61fa7a3..8d7a94d 100644
--- a/docs/schemas/storagepool.rng
+++ b/docs/schemas/storagepool.rng
@@ -21,6 +21,7 @@
         <ref name='poolmpath'/>
         <ref name='poolrbd'/>
         <ref name='poolsheepdog'/>
+        <ref name='poolgluster'/>
       </choice>
     </element>
   </define>
@@ -145,6 +146,17 @@
     </interleave>
   </define>

+  <define name='poolgluster'>
+    <attribute name='type'>
+      <value>gluster</value>
+    </attribute>
+    <interleave>
+      <ref name='commonmetadata'/>
+      <ref name='sizing'/>
+      <ref name='sourcegluster'/>
+    </interleave>
+  </define>
+
   <define name='sourceinfovendor'>
     <interleave>
       <optional>
@@ -309,7 +321,7 @@
   <define name='sourceinfodir'>
     <element name='dir'>
       <attribute name='path'>
-        <ref name='absFilePath'/>
+        <ref name='absDirPath'/>
       </attribute>
       <empty/>
     </element>
@@ -554,6 +566,18 @@
     </element>
   </define>

+  <define name='sourcegluster'>
+    <element name='source'>
+      <interleave>
+        <ref name='sourceinfohost'/>
+        <ref name='sourceinfoname'/>
+        <optional>
+          <ref name='sourceinfodir'/>
+        </optional>
+      </interleave>
+    </element>
+  </define>
+
   <define name='IscsiQualifiedName'>
     <data type='string'>
       <param name="pattern">iqn\.[0-9]{4}-(0[1-9]|1[0-2])\.[a-zA-Z0-9\.\-]+(:.+)?</param>
diff --git a/docs/storage.html.in b/docs/storage.html.in
index 0464565..9465b83 100644
--- a/docs/storage.html.in
+++ b/docs/storage.html.in
@@ -114,6 +114,9 @@
       <li>
         <a href="#StorageBackendSheepdog">Sheepdog backend</a>
       </li>
+      <li>
+        <a href="#StorageBackendGluster">Gluster backend</a>
+      </li>
     </ul>

     <h2><a name="StorageBackendDir">Directory pool</a></h2>
@@ -280,7 +283,9 @@
         <code>glusterfs</code> - use the glusterfs FUSE file system.
         For now, the <code>dir</code> specified as the source can only
         be a gluster volume name, as gluster does not provide a way to
-        directly mount subdirectories within a volume.
+        directly mount subdirectories within a volume.  (To bypass the
+        file system completely, see
+        the <a href="#StorageBackendGluster">gluster</a> pool.)
       </li>
       <li>
         <code>cifs</code> - use the SMB (samba) or CIFS file system
@@ -652,5 +657,89 @@
       The Sheepdog pool does not use the volume format type element.
     </p>

+    <h2><a name="StorageBackendGluster">Gluster pools</a></h2>
+    <p>
+      This provides a pool based on native Gluster access.  Gluster is
+      a distributed file system that can be exposed to the user via
+      FUSE, NFS or SMB (see the <a href="#StorageBackendNetfs">netfs</a>
+      pool for that usage); but for minimal overhead, the ideal access
+      is via native access (only possible for QEMU/KVM compiled with
+      libgfapi support).
+
+      The cluster and storage volume must already be running, and it
+      is recommended that the volume be configured with <code>gluster
+      volume set $volname storage.owner-uid=$uid</code>
+      and <code>gluster volume set $volname
+      storage.owner-gid=$gid</code> for the uid and gid that qemu will
+      be run as.  It may also be necessary to
+      set <code>rpc-auth-allow-insecure on</code> for the glusterd
+      service, as well as <code>gluster set $volname
+      server.allow-insecure on</code>, to allow access to the gluster
+      volume.
+
+      <span class="since">Since 1.1.5</span>
+    </p>
+
+    <h3>Example pool input</h3>
+    <p>A gluster volume corresponds to a libvirt storage pool.  If a
+      gluster volume could be mounted as <code>mount -t glusterfs
+      localhost:/volname /some/path</code>, then the following example
+      will describe the same pool without having to create a local
+      mount point.  Remember that with gluster, the mount point can be
+      through any machine in the cluster, and gluster will
+      automatically pick the ideal transport to the actual bricks
+      backing the gluster volume, even if on a different host than the
+      one named in the <code>host</code> designation.
+      The <code>&lt;name&gt;</code> element is always the volume name
+      (no slash).  The pool source also supports an
+      optional <code>&lt;dir&gt;</code> element with
+      a <code>path</code> attribute that lists the absolute name of a
+      subdirectory relative to the gluster volume to use instead of
+      the top-level directory of the volume.</p>
+    <pre>
+      &lt;pool type="gluster"&gt;
+        &lt;name&gt;myglusterpool&lt;/name&gt;
+        &lt;source&gt;
+          &lt;name&gt;volname&lt;/name&gt;
+          &lt;host name='localhost'/&gt;
+          &lt;dir path='/'/&gt;
+        &lt;/source&gt;
+      &lt;/pool&gt;</pre>
+
+    <h3>Example volume output</h3>
+    <p>Libvirt storage volumes associated with a gluster pool
+      correspond to the files that can be found when mounting the
+      gluster volume.  The <code>name</code> is the path relative to
+      the effective mount specified for the pool; and
+      the <code>key</code> is a path including the gluster volume
+      name and any subdirectory specified by the pool.</p>
+    <pre>
+       &lt;volume&gt;
+         &lt;name&gt;myfile&lt;/name&gt;
+         &lt;key&gt;volname/myfile&lt;/key&gt;
+         &lt;source&gt;
+         &lt;/source&gt;
+         &lt;capacity unit='bytes'&gt;53687091200&lt;/capacity&gt;
+         &lt;allocation unit='bytes'&gt;53687091200&lt;/allocation&gt;
+       &lt;/volume&gt;</pre>
+
+    <h3>Example disk attachment</h3>
+    <p>Files within a gluster volume can be attached to Qemu guests.
+      Information about attaching a Gluster image to a
+      guest can be found
+      at the <a href="formatdomain.html#elementsDisks">format domain</a>
+      page.</p>
+
+    <h3>Valid pool format types</h3>
+    <p>
+      The Gluster pool does not use the pool format type element.
+    </p>
+
+    <h3>Valid volume format types</h3>
+    <p>
+      The valid volume types are the same as for the <code>directory</code>
+      pool type.
+    </p>
+
   </body>
 </html>
diff --git a/tests/storagepoolxml2xmlin/pool-gluster-sub.xml b/tests/storagepoolxml2xmlin/pool-gluster-sub.xml
new file mode 100644
index 0000000..ba4458f
--- /dev/null
+++ b/tests/storagepoolxml2xmlin/pool-gluster-sub.xml
@@ -0,0 +1,9 @@
+<pool type='gluster'>
+  <source>
+    <name>volume</name>
+    <dir path='/sub'/>
+    <host name='localhost'/>
+  </source>
+  <name>mygluster</name>
+  <uuid>65fcba04-5b13-bd93-cff3-52ce48e11ad8</uuid>
+</pool>
diff --git a/tests/storagepoolxml2xmlin/pool-gluster.xml b/tests/storagepoolxml2xmlin/pool-gluster.xml
new file mode 100644
index 0000000..ae9401f
--- /dev/null
+++ b/tests/storagepoolxml2xmlin/pool-gluster.xml
@@ -0,0 +1,8 @@
+<pool type='gluster'>
+  <source>
+    <name>volume</name>
+    <host name='localhost'/>
+  </source>
+  <name>mygluster</name>
+  <uuid>65fcba04-5b13-bd93-cff3-52ce48e11ad8</uuid>
+</pool>
diff --git a/tests/storagepoolxml2xmlout/pool-gluster-sub.xml b/tests/storagepoolxml2xmlout/pool-gluster-sub.xml
new file mode 100644
index 0000000..df7d719
--- /dev/null
+++ b/tests/storagepoolxml2xmlout/pool-gluster-sub.xml
@@ -0,0 +1,12 @@
+<pool type='gluster'>
+  <name>mygluster</name>
+  <uuid>65fcba04-5b13-bd93-cff3-52ce48e11ad8</uuid>
+  <capacity unit='bytes'>0</capacity>
+  <allocation unit='bytes'>0</allocation>
+  <available unit='bytes'>0</available>
+  <source>
+    <host name='localhost'/>
+    <dir path='/sub'/>
+    <name>volume</name>
+  </source>
+</pool>
diff --git a/tests/storagepoolxml2xmlout/pool-gluster.xml b/tests/storagepoolxml2xmlout/pool-gluster.xml
new file mode 100644
index 0000000..a7fa506
--- /dev/null
+++ b/tests/storagepoolxml2xmlout/pool-gluster.xml
@@ -0,0 +1,12 @@
+<pool type='gluster'>
+  <name>mygluster</name>
+  <uuid>65fcba04-5b13-bd93-cff3-52ce48e11ad8</uuid>
+  <capacity unit='bytes'>0</capacity>
+  <allocation unit='bytes'>0</allocation>
+  <available unit='bytes'>0</available>
+  <source>
+    <host name='localhost'/>
+    <dir path='/'/>
+    <name>volume</name>
+  </source>
+</pool>
diff --git 
a/tests/storagepoolxml2xmltest.c b/tests/storagepoolxml2xmltest.c index a81cf99..039d515 100644 --- a/tests/storagepoolxml2xmltest.c +++ b/tests/storagepoolxml2xmltest.c @@ -102,6 +102,8 @@ mymain(void) DO_TEST("pool-iscsi-multiiqn"); DO_TEST("pool-iscsi-vendor-product"); DO_TEST("pool-sheepdog"); + DO_TEST("pool-gluster"); + DO_TEST("pool-gluster-sub"); return ret==0 ? EXIT_SUCCESS : EXIT_FAILURE; } -- 1.8.3.1

On Fri, Nov 22, 2013 at 08:20:24PM -0700, Eric Blake wrote:
Add support for a new &lt;pool type='gluster'&gt;, similar to RBD and Sheepdog. Terminology-wise, a gluster volume forms a libvirt storage pool; within the gluster volume, individual files are treated as libvirt storage volumes.
* docs/schemas/storagepool.rng (poolgluster): New pool type. * docs/formatstorage.html.in: Document gluster. * docs/storage.html.in: Likewise, and contrast it with netfs. * tests/storagepoolxml2xmlin/pool-gluster.xml: New test. * tests/storagepoolxml2xmlout/pool-gluster.xml: Likewise. * tests/storagepoolxml2xmltest.c (mymain): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com> --- docs/formatstorage.html.in | 15 ++-- docs/schemas/storagepool.rng | 26 ++++++- docs/storage.html.in | 91 +++++++++++++++++++++++- tests/storagepoolxml2xmlin/pool-gluster-sub.xml | 9 +++ tests/storagepoolxml2xmlin/pool-gluster.xml | 8 +++ tests/storagepoolxml2xmlout/pool-gluster-sub.xml | 12 ++++ tests/storagepoolxml2xmlout/pool-gluster.xml | 12 ++++ tests/storagepoolxml2xmltest.c | 2 + 8 files changed, 168 insertions(+), 7 deletions(-) create mode 100644 tests/storagepoolxml2xmlin/pool-gluster-sub.xml create mode 100644 tests/storagepoolxml2xmlin/pool-gluster.xml create mode 100644 tests/storagepoolxml2xmlout/pool-gluster-sub.xml create mode 100644 tests/storagepoolxml2xmlout/pool-gluster.xml
diff --git a/docs/formatstorage.html.in b/docs/formatstorage.html.in index 5f277b4..a7f3b57 100644 --- a/docs/formatstorage.html.in +++ b/docs/formatstorage.html.in @@ -21,8 +21,10 @@ <code>iscsi</code>, <code>logical</code>, <code>scsi</code> (all <span class="since">since 0.4.1</span>), <code>mpath</code> (<span class="since">since 0.7.1</span>), <code>rbd</code> - (<span class="since">since 0.9.13</span>), or <code>sheepdog</code> - (<span class="since">since 0.10.0</span>). This corresponds to the + (<span class="since">since 0.9.13</span>), <code>sheepdog</code> + (<span class="since">since 0.10.0</span>), + or <code>gluster</code> (<span class="since">since + 1.1.5</span>). This corresponds to the
Well, 1.1.5 will become 1.2.0 actually, so just change the two occurrences
storage backend drivers listed further along in this document. </p> <h3><a name="StoragePoolFirst">General metadata</a></h3> @@ -101,7 +103,9 @@ path to the block device node. <span class="since">Since 0.4.1</span></dd> <dt><code>dir</code></dt> <dd>Provides the source for pools backed by directories (pool - type <code>dir</code>). May + type <code>dir</code>), or optionally to select a subdirectory + within a pool that resembles a filesystem (pool + type <code>gluster</code>). May only occur once. Contains a single attribute <code>path</code> which is the fully qualified path to the backing directory. <span class="since">Since 0.4.1</span></dd> @@ -129,7 +133,7 @@ <dt><code>host</code></dt> <dd>Provides the source for pools backed by storage from a remote server (pool types <code>netfs</code>, <code>iscsi</code>, - <code>rbd</code>, <code>sheepdog</code>). Will be + <code>rbd</code>, <code>sheepdog</code>, <code>gluster</code>). Will be used in combination with a <code>directory</code> or <code>device</code> element. Contains an attribute <code>name</code> which is the hostname or IP address of the server. May optionally @@ -160,7 +164,8 @@ <dt><code>name</code></dt> <dd>Provides the source for pools backed by storage from a named element (pool types <code>logical</code>, <code>rbd</code>, - <code>sheepdog</code>). Contains a string identifier. + <code>sheepdog</code>, <code>gluster</code>). Contains a + string identifier. <span class="since">Since 0.4.5</span></dd> <dt><code>format</code></dt> <dd>Provides information about the format of the pool (pool diff --git a/docs/schemas/storagepool.rng b/docs/schemas/storagepool.rng index 61fa7a3..8d7a94d 100644 --- a/docs/schemas/storagepool.rng +++ b/docs/schemas/storagepool.rng @@ -21,6 +21,7 @@ <ref name='poolmpath'/> <ref name='poolrbd'/> <ref name='poolsheepdog'/> + <ref name='poolgluster'/> </choice> </element> </define> @@ -145,6 +146,17 @@ </interleave> </define>
+ <define name='poolgluster'> + <attribute name='type'> + <value>gluster</value> + </attribute> + <interleave> + <ref name='commonmetadata'/> + <ref name='sizing'/> + <ref name='sourcegluster'/> + </interleave> + </define> + <define name='sourceinfovendor'> <interleave> <optional> @@ -309,7 +321,7 @@ <define name='sourceinfodir'> <element name='dir'> <attribute name='path'> - <ref name='absFilePath'/> + <ref name='absDirPath'/> </attribute> <empty/> </element> @@ -554,6 +566,18 @@ </element> </define>
+ <define name='sourcegluster'> + <element name='source'> + <interleave> + <ref name='sourceinfohost'/> + <ref name='sourceinfoname'/> + <optional> + <ref name='sourceinfodir'/> + </optional> + </interleave> + </element> + </define> + <define name='IscsiQualifiedName'> <data type='string'> <param name="pattern">iqn\.[0-9]{4}-(0[1-9]|1[0-2])\.[a-zA-Z0-9\.\-]+(:.+)?</param> diff --git a/docs/storage.html.in b/docs/storage.html.in index 0464565..9465b83 100644 --- a/docs/storage.html.in +++ b/docs/storage.html.in @@ -114,6 +114,9 @@ <li> <a href="#StorageBackendSheepdog">Sheepdog backend</a> </li> + <li> + <a href="#StorageBackendGluster">Gluster backend</a> + </li> </ul>
<h2><a name="StorageBackendDir">Directory pool</a></h2> @@ -280,7 +283,9 @@ <code>glusterfs</code> - use the glusterfs FUSE file system. For now, the <code>dir</code> specified as the source can only be a gluster volume name, as gluster does not provide a way to - directly mount subdirectories within a volume. + directly mount subdirectories within a volume. (To bypass the + file system completely, see + the <a href="#StorageBackendGluster">gluster</a> pool.) </li> <li> <code>cifs</code> - use the SMB (samba) or CIFS file system @@ -652,5 +657,89 @@ The Sheepdog pool does not use the volume format type element. </p>
+ <h2><a name="StorageBackendGluster">Gluster pools</a></h2> + <p> + This provides a pool based on native Gluster access. Gluster is + a distributed file system that can be exposed to the user via + FUSE, NFS or SMB (see the <a href="#StorageBackendNetfs">netfs</a> + pool for that usage); but for minimal overhead, the ideal access + is via native access (only possible for QEMU/KVM compiled with + libgfapi support). + + The cluster and storage volume must already be running, and it + is recommended that the volume be configured with <code>gluster + volume set $volname storage.owner-uid=$uid</code> + and <code>gluster volume set $volname + storage.owner-gid=$gid</code> for the uid and gid that qemu will + be run as. It may also be necessary to + set <code>rpc-auth-allow-insecure on</code> for the glusterd + service, as well as <code>gluster set $volname + server.allow-insecure on</code>, to allow access to the gluster + volume. + + <span class="since">Since 1.1.5</span>
1.2.0
+ </p> + + <h3>Example pool input</h3> + <p>A gluster volume corresponds to a libvirt storage pool. If a + gluster volume could be mounted as <code>mount -t glusterfs + localhost:/volname /some/path</code>, then the following example + will describe the same pool without having to create a local + mount point. Remember that with gluster, the mount point can be + through any machine in the cluster, and gluster will + automatically pick the ideal transport to the actual bricks + backing the gluster volume, even if on a different host than the + one named in the <code>host</code> designation. + The <code><name></code> element is always the volume name + (no slash). The pool source also supports an + optional <code><dir></code> element with + a <code>path</code> attribute that lists the absolute name of a + subdirectory relative to the gluster volume to use instead of + the top-level directory of the volume.</p> + <pre> + <pool type="gluster"> + <name>myglusterpool</name> + <source> + <name>volname</name> + <host name='localhost'/> + <dir path='/'/> + </source> + </pool></pre> + + <h3>Example volume output</h3> + <p>Libvirt storage volumes associated with a gluster pool + correspond to the files that can be found when mounting the + gluster volume. The <code>name</code> is the path relative to + the effective mount specified for the pool; and + the <code>key</code> is a path including the gluster volume + name and any subdirectory specified by the pool.</p> + <pre> + <volume> + <name>myfile</name> + <key>volname/myfile</key> + <source> + </source> + <capacity unit='bytes'>53687091200</capacity> + <allocation unit='bytes'>53687091200</allocation> + </volume></pre> + + <h3>Example disk attachment</h3> + <p>Files within a gluster volume can be attached to Qemu guests. 
+ Information about attaching a Gluster image to a + guest can be found + at the <a href="formatdomain.html#elementsDisks">format domain</a> + page.</p> + + <h3>Valid pool format types</h3> + <p> + The Gluster pool does not use the pool format type element. + </p> + + <h3>Valid volume format types</h3> + <p> + The valid volume types are the same as for the <code>directory</code> + pool type. + </p> + </body> </html> diff --git a/tests/storagepoolxml2xmlin/pool-gluster-sub.xml b/tests/storagepoolxml2xmlin/pool-gluster-sub.xml new file mode 100644 index 0000000..ba4458f --- /dev/null +++ b/tests/storagepoolxml2xmlin/pool-gluster-sub.xml @@ -0,0 +1,9 @@ +<pool type='gluster'> + <source> + <name>volume</name> + <dir path='/sub'/> + <host name='localhost'/> + </source> + <name>mygluster</name> + <uuid>65fcba04-5b13-bd93-cff3-52ce48e11ad8</uuid> +</pool> diff --git a/tests/storagepoolxml2xmlin/pool-gluster.xml b/tests/storagepoolxml2xmlin/pool-gluster.xml new file mode 100644 index 0000000..ae9401f --- /dev/null +++ b/tests/storagepoolxml2xmlin/pool-gluster.xml @@ -0,0 +1,8 @@ +<pool type='gluster'> + <source> + <name>volume</name> + <host name='localhost'/> + </source> + <name>mygluster</name> + <uuid>65fcba04-5b13-bd93-cff3-52ce48e11ad8</uuid> +</pool> diff --git a/tests/storagepoolxml2xmlout/pool-gluster-sub.xml b/tests/storagepoolxml2xmlout/pool-gluster-sub.xml new file mode 100644 index 0000000..df7d719 --- /dev/null +++ b/tests/storagepoolxml2xmlout/pool-gluster-sub.xml @@ -0,0 +1,12 @@ +<pool type='gluster'> + <name>mygluster</name> + <uuid>65fcba04-5b13-bd93-cff3-52ce48e11ad8</uuid> + <capacity unit='bytes'>0</capacity> + <allocation unit='bytes'>0</allocation> + <available unit='bytes'>0</available> + <source> + <host name='localhost'/> + <dir path='/sub'/> + <name>volume</name> + </source> +</pool> diff --git a/tests/storagepoolxml2xmlout/pool-gluster.xml b/tests/storagepoolxml2xmlout/pool-gluster.xml new file mode 100644 index 0000000..a7fa506 --- /dev/null 
+++ b/tests/storagepoolxml2xmlout/pool-gluster.xml @@ -0,0 +1,12 @@ +<pool type='gluster'> + <name>mygluster</name> + <uuid>65fcba04-5b13-bd93-cff3-52ce48e11ad8</uuid> + <capacity unit='bytes'>0</capacity> + <allocation unit='bytes'>0</allocation> + <available unit='bytes'>0</available> + <source> + <host name='localhost'/> + <dir path='/'/> + <name>volume</name> + </source> +</pool> diff --git a/tests/storagepoolxml2xmltest.c b/tests/storagepoolxml2xmltest.c index a81cf99..039d515 100644 --- a/tests/storagepoolxml2xmltest.c +++ b/tests/storagepoolxml2xmltest.c @@ -102,6 +102,8 @@ mymain(void) DO_TEST("pool-iscsi-multiiqn"); DO_TEST("pool-iscsi-vendor-product"); DO_TEST("pool-sheepdog"); + DO_TEST("pool-gluster"); + DO_TEST("pool-gluster-sub");
return ret==0 ? EXIT_SUCCESS : EXIT_FAILURE; }
That patch set got a lot of review already (though it seems to have increased slightly since v3), but I would love to get it in 1.2.0. Since Peter and Dan fully reviewed it last round, and assuming you have addressed all their comments, I suggest pushing before tomorrow's freeze :-) ACK, Daniel -- Daniel Veillard | Open Source and Standards, Red Hat veillard@redhat.com | libxml Gnome XML XSLT toolkit http://xmlsoft.org/ http://veillard.com/ | virtualization library http://libvirt.org/

On Fri, Nov 22, 2013 at 08:20:24PM -0700, Eric Blake wrote:
Add support for a new &lt;pool type='gluster'&gt;, similar to RBD and Sheepdog. Terminology-wise, a gluster volume forms a libvirt storage pool; within the gluster volume, individual files are treated as libvirt storage volumes.
* docs/schemas/storagepool.rng (poolgluster): New pool type. * docs/formatstorage.html.in: Document gluster. * docs/storage.html.in: Likewise, and contrast it with netfs. * tests/storagepoolxml2xmlin/pool-gluster.xml: New test. * tests/storagepoolxml2xmlout/pool-gluster.xml: Likewise. * tests/storagepoolxml2xmltest.c (mymain): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
ACK with the s/1.1.5/1.2.0/ change DV suggested Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 11/25/2013 08:40 AM, Daniel P. Berrange wrote:
On Fri, Nov 22, 2013 at 08:20:24PM -0700, Eric Blake wrote:
Add support for a new &lt;pool type='gluster'&gt;, similar to RBD and Sheepdog. Terminology-wise, a gluster volume forms a libvirt storage pool; within the gluster volume, individual files are treated as libvirt storage volumes.
* docs/schemas/storagepool.rng (poolgluster): New pool type. * docs/formatstorage.html.in: Document gluster. * docs/storage.html.in: Likewise, and contrast it with netfs. * tests/storagepoolxml2xmlin/pool-gluster.xml: New test. * tests/storagepoolxml2xmlout/pool-gluster.xml: Likewise. * tests/storagepoolxml2xmltest.c (mymain): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
ACK with the s/1.1.5/1.2.0/ change DV suggested
Pushed. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

Actually put gfapi to use, by allowing the creation of a gluster pool. Right now, all volumes are treated as raw and directories are skipped; further patches will allow peering into files to allow for qcow2 files and backing chains, and reporting proper volume allocation. This implementation was tested against Fedora 19's glusterfs 3.4.1; it might be made simpler by requiring a higher minimum, and/or require more hacks to work with a lower minimum. * src/storage/storage_backend_gluster.c (virStorageBackendGlusterRefreshPool): Initial implementation. (virStorageBackendGlusterOpen, virStorageBackendGlusterClose) (virStorageBackendGlusterRefreshVol): New helper functions. Signed-off-by: Eric Blake <eblake@redhat.com> --- src/storage/storage_backend_gluster.c | 249 +++++++++++++++++++++++++++++++++- 1 file changed, 244 insertions(+), 5 deletions(-) diff --git a/src/storage/storage_backend_gluster.c b/src/storage/storage_backend_gluster.c index 2863c73..120cf0f 100644 --- a/src/storage/storage_backend_gluster.c +++ b/src/storage/storage_backend_gluster.c @@ -23,20 +23,259 @@ #include <glusterfs/api/glfs.h> -#include "virerror.h" #include "storage_backend_gluster.h" #include "storage_conf.h" +#include "viralloc.h" +#include "virerror.h" +#include "virlog.h" +#include "virstoragefile.h" +#include "virstring.h" +#include "viruri.h" #define VIR_FROM_THIS VIR_FROM_STORAGE +struct _virStorageBackendGlusterState { + glfs_t *vol; + + /* Accept the same URIs as qemu's block/gluster.c: + * gluster[+transport]://[server[:port]]/vol/[dir/]image[?socket=...] 
*/ + virURI *uri; + + char *volname; /* vol from URI */ + char *dir; /* dir from URI, or "/"; always ends in '/' */ +}; + +typedef struct _virStorageBackendGlusterState virStorageBackendGlusterState; +typedef virStorageBackendGlusterState *virStorageBackendGlusterStatePtr; + +static void +virStorageBackendGlusterClose(virStorageBackendGlusterStatePtr state) +{ + if (!state) + return; + + /* Yuck - glusterfs-api-3.4.1 appears to always return -1 for + * glfs_fini, with errno containing random data, so there's no way + * to tell if it succeeded. 3.4.2 is supposed to fix this.*/ + if (state->vol && glfs_fini(state->vol) < 0) + VIR_DEBUG("shutdown of gluster volume %s failed with errno %d", + state->volname, errno); + + virURIFree(state->uri); + VIR_FREE(state->volname); + VIR_FREE(state->dir); + VIR_FREE(state); +} + +static virStorageBackendGlusterStatePtr +virStorageBackendGlusterOpen(virStoragePoolObjPtr pool) +{ + virStorageBackendGlusterStatePtr ret = NULL; + const char *name = pool->def->source.name; + const char *dir = pool->def->source.dir; + bool trailing_slash = true; + + /* Volume name must not contain '/'; optional path allows use of a + * subdirectory within the volume name. */ + if (strchr(name, '/')) { + virReportError(VIR_ERR_XML_ERROR, + _("gluster pool name '%s' must not contain /"), + name); + return NULL; + } + if (dir) { + if (*dir != '/') { + virReportError(VIR_ERR_XML_ERROR, + _("gluster pool path '%s' must start with /"), + dir); + return NULL; + } + if (strchr(dir, '\0')[-1] != '/') + trailing_slash = false; + } + + if (VIR_ALLOC(ret) < 0) + return NULL; + + if (VIR_STRDUP(ret->volname, name) < 0) + goto error; + if (virAsprintf(&ret->dir, "%s%s", dir ? dir : "/", + trailing_slash ? 
"" : "/") < 0) + goto error; + + /* FIXME: Currently hard-coded to tcp transport; XML needs to be + * extended to allow alternate transport */ + if (VIR_ALLOC(ret->uri) < 0) + goto error; + if (VIR_STRDUP(ret->uri->scheme, "gluster") < 0) + goto error; + if (VIR_STRDUP(ret->uri->server, pool->def->source.hosts[0].name) < 0) + goto error; + if (virAsprintf(&ret->uri->path, "/%s%s", ret->volname, ret->dir) < 0) + goto error; + ret->uri->port = pool->def->source.hosts[0].port; + + /* Actually connect to glfs */ + if (!(ret->vol = glfs_new(ret->volname))) { + virReportOOMError(); + goto error; + } + + if (glfs_set_volfile_server(ret->vol, "tcp", + ret->uri->server, ret->uri->port) < 0 || + glfs_init(ret->vol) < 0) { + char *uri = virURIFormat(ret->uri); + + virReportSystemError(errno, _("failed to connect to %s"), + NULLSTR(uri)); + VIR_FREE(uri); + goto error; + } + + if (glfs_chdir(ret->vol, ret->dir) < 0) { + virReportSystemError(errno, + _("failed to change to directory '%s' in '%s'"), + ret->dir, ret->volname); + goto error; + } + + return ret; + +error: + virStorageBackendGlusterClose(ret); + return NULL; +} + + +/* Populate *volptr for the given name and stat information, or leave + * it NULL if the entry should be skipped (such as "."). Return 0 on + * success, -1 on failure. */ +static int +virStorageBackendGlusterRefreshVol(virStorageBackendGlusterStatePtr state, + const char *name, + struct stat *st, + virStorageVolDefPtr *volptr) +{ + char *tmp; + int ret = -1; + virStorageVolDefPtr vol = NULL; + + *volptr = NULL; + + /* Silently skip '.' and '..'. */ + if (STREQ(name, ".") || STREQ(name, "..")) + return 0; + /* FIXME: support directories. For now, silently skip them. 
*/ + if (S_ISDIR(st->st_mode)) + return 0; + + if (VIR_ALLOC(vol) < 0) + goto cleanup; + if (VIR_STRDUP(vol->name, name) < 0) + goto cleanup; + if (virAsprintf(&vol->key, "/%s%s%s", state->volname, state->dir, + vol->name) < 0) + goto cleanup; + + vol->type = VIR_STORAGE_VOL_NETWORK; + tmp = state->uri->path; + state->uri->path = vol->key; + if (!(vol->target.path = virURIFormat(state->uri))) { + state->uri->path = tmp; + goto cleanup; + } + state->uri->path = tmp; + + /* FIXME - must open files to determine if they are non-raw */ + vol->target.format = VIR_STORAGE_FILE_RAW; + vol->capacity = vol->allocation = st->st_size; + + *volptr = vol; + vol = NULL; + ret = 0; +cleanup: + virStorageVolDefFree(vol); + return ret; +} static int virStorageBackendGlusterRefreshPool(virConnectPtr conn ATTRIBUTE_UNUSED, - virStoragePoolObjPtr pool ATTRIBUTE_UNUSED) + virStoragePoolObjPtr pool) { - virReportError(VIR_ERR_NO_SUPPORT, "%s", - _("gluster pool type not fully supported yet")); - return -1; + int ret = -1; + virStorageBackendGlusterStatePtr state = NULL; + struct { + struct dirent ent; + /* See comment below about readdir_r needing padding */ + char padding[MAX(1, 256 - (int) (sizeof(struct dirent) + - offsetof(struct dirent, d_name)))]; + } de; + struct dirent *ent; + glfs_fd_t *dir = NULL; + struct stat st; + struct statvfs sb; + + if (!(state = virStorageBackendGlusterOpen(pool))) + goto cleanup; + + /* Why oh why did glfs 3.4 decide to expose only readdir_r rather + * than readdir? 
POSIX admits that readdir_r is inherently a + * flawed design, because systems are not required to define + * NAME_MAX: http://austingroupbugs.net/view.php?id=696 + * http://womble.decadent.org.uk/readdir_r-advisory.html + * + * Fortunately, gluster appears to limit its underlying bricks to + * only use file systems such as XFS that have a NAME_MAX of 255; + * so we are currently guaranteed that if we provide 256 bytes of + * tail padding, then we should have enough space to avoid buffer + * overflow no matter whether the OS used d_name[], d_name[1], or + * d_name[256] in its 'struct dirent'. + * http://lists.gnu.org/archive/html/gluster-devel/2013-10/msg00083.html + */ + + if (!(dir = glfs_opendir(state->vol, state->dir))) { + virReportSystemError(errno, _("cannot open path '%s' in '%s'"), + state->dir, state->volname); + goto cleanup; + } + while (!(errno = glfs_readdirplus_r(dir, &st, &de.ent, &ent)) && ent) { + virStorageVolDefPtr vol; + int okay = virStorageBackendGlusterRefreshVol(state, + ent->d_name, &st, + &vol); + + if (okay < 0) + goto cleanup; + if (vol && VIR_APPEND_ELEMENT(pool->volumes.objs, pool->volumes.count, + vol) < 0) + goto cleanup; + } + if (errno) { + virReportSystemError(errno, _("failed to read directory '%s' in '%s'"), + state->dir, state->volname); + goto cleanup; + } + + if (glfs_statvfs(state->vol, state->dir, &sb) < 0) { + virReportSystemError(errno, _("cannot statvfs path '%s' in '%s'"), + state->dir, state->volname); + goto cleanup; + } + + pool->def->capacity = ((unsigned long long)sb.f_frsize * + (unsigned long long)sb.f_blocks); + pool->def->available = ((unsigned long long)sb.f_bfree * + (unsigned long long)sb.f_frsize); + pool->def->allocation = pool->def->capacity - pool->def->available; + + ret = 0; +cleanup: + if (dir) + glfs_closedir(dir); + virStorageBackendGlusterClose(state); + if (ret < 0) + virStoragePoolObjClearVols(pool); + return ret; } virStorageBackend virStorageBackendGluster = { -- 1.8.3.1

On Fri, Nov 22, 2013 at 08:20:25PM -0700, Eric Blake wrote:
Actually put gfapi to use, by allowing the creation of a gluster pool. Right now, all volumes are treated as raw and directories are skipped; further patches will allow peering into files to allow for qcow2 files and backing chains, and reporting proper volume allocation. This implementation was tested against Fedora 19's glusterfs 3.4.1; it might be made simpler by requiring a higher minimum, and/or require more hacks to work with a lower minimum.
* src/storage/storage_backend_gluster.c (virStorageBackendGlusterRefreshPool): Initial implementation. (virStorageBackendGlusterOpen, virStorageBackendGlusterClose) (virStorageBackendGlusterRefreshVol): New helper functions.
Signed-off-by: Eric Blake <eblake@redhat.com>
ACK Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 11/25/2013 08:52 AM, Daniel P. Berrange wrote:
On Fri, Nov 22, 2013 at 08:20:25PM -0700, Eric Blake wrote:
Actually put gfapi to use, by allowing the creation of a gluster pool. Right now, all volumes are treated as raw and directories are skipped; further patches will allow peering into files to allow for qcow2 files and backing chains, and reporting proper volume allocation. This implementation was tested against Fedora 19's glusterfs 3.4.1; it might be made simpler by requiring a higher minimum, and/or require more hacks to work with a lower minimum.
* src/storage/storage_backend_gluster.c (virStorageBackendGlusterRefreshPool): Initial implementation. (virStorageBackendGlusterOpen, virStorageBackendGlusterClose) (virStorageBackendGlusterRefreshVol): New helper functions.
Signed-off-by: Eric Blake <eblake@redhat.com>
ACK
Squashed this in and pushed (in reviewing 2/8, I noticed that I had documented that <key> should NOT start with a slash, but just the volume name). diff --git i/src/storage/storage_backend_gluster.c w/src/storage/storage_backend_gluster.c index 120cf0f..d57427e 100644 --- i/src/storage/storage_backend_gluster.c +++ w/src/storage/storage_backend_gluster.c @@ -41,8 +41,8 @@ struct _virStorageBackendGlusterState { * gluster[+transport]://[server[:port]]/vol/[dir/]image[?socket=...] */ virURI *uri; - char *volname; /* vol from URI */ - char *dir; /* dir from URI, or "/"; always ends in '/' */ + char *volname; /* vol from URI, no '/' */ + char *dir; /* dir from URI, or "/"; always starts and ends in '/' */ }; typedef struct _virStorageBackendGlusterState virStorageBackendGlusterState; @@ -86,7 +86,7 @@ virStorageBackendGlusterOpen(virStoragePoolObjPtr pool) if (dir) { if (*dir != '/') { virReportError(VIR_ERR_XML_ERROR, - _("gluster pool path '%s' must start with /"), + _("gluster pool path '%s' must start with /"), dir); return NULL; } @@ -173,7 +173,7 @@ virStorageBackendGlusterRefreshVol(virStorageBackendGlusterStatePtr state, goto cleanup; if (VIR_STRDUP(vol->name, name) < 0) goto cleanup; - if (virAsprintf(&vol->key, "/%s%s%s", state->volname, state->dir, + if (virAsprintf(&vol->key, "%s%s%s", state->volname, state->dir, vol->name) < 0) goto cleanup; -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
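The squashed-in fix changes the volume key from `/volname/dir/name` to `volname/dir/name`, matching the documented format. A minimal sketch of the resulting key construction (the helper name is hypothetical; libvirt uses virAsprintf, here plain snprintf), relying on the invariant from the same squash that `dir` always starts and ends in `/`:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Build a volume key as "<volname><dir><name>". Because dir always
 * starts and ends in '/', the key begins with the bare volume name
 * (no leading slash), e.g. "volname/myfile". Caller frees the result. */
static char *
gluster_vol_key(const char *volname, const char *dir, const char *name)
{
    size_t len = strlen(volname) + strlen(dir) + strlen(name) + 1;
    char *key = malloc(len);
    if (!key)
        return NULL;
    snprintf(key, len, "%s%s%s", volname, dir, name);
    return key;
}
```

With the pool's default `dir` of `"/"`, this yields exactly the `volname/myfile` key shown in the example volume output of patch 2.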

In the 'directory' and 'netfs' storage pools, a user can see both 'file' and 'dir' storage volume types, to know when they can descend into a subdirectory. But in a network-based storage pool, such as the upcoming 'gluster' pool, we use 'network' instead of 'file', and did not have any counterpart for a directory until this patch. Adding a new volume type 'network-dir' is better than reusing 'dir', because it makes it clear that the only way to access 'network' volumes within that container is through the network mounting (leaving 'dir' for something accessible in the local file system). * include/libvirt/libvirt.h.in (virStorageVolType): Expand enum. * src/qemu/qemu_command.c (qemuBuildVolumeString): Fix client. * src/qemu/qemu_conf.c (qemuTranslateDiskSourcePool): Likewise. * tools/virsh-volume.c (vshVolumeTypeToString): Likewise. * src/storage/storage_backend_fs.c (virStorageBackendFileSystemVolDelete): Likewise. Signed-off-by: Eric Blake <eblake@redhat.com> --- include/libvirt/libvirt.h.in | 2 ++ src/conf/storage_conf.c | 2 +- src/qemu/qemu_command.c | 6 ++++-- src/qemu/qemu_conf.c | 4 +++- src/storage/storage_backend_fs.c | 5 +++-- tools/virsh-volume.c | 5 ++++- 6 files changed, 17 insertions(+), 7 deletions(-) diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in index 146a59b..5e8cba6 100644 --- a/include/libvirt/libvirt.h.in +++ b/include/libvirt/libvirt.h.in @@ -2951,6 +2951,8 @@ typedef enum { VIR_STORAGE_VOL_BLOCK = 1, /* Block based volumes */ VIR_STORAGE_VOL_DIR = 2, /* Directory-passthrough based volume */ VIR_STORAGE_VOL_NETWORK = 3, /* Network volumes like RBD (RADOS Block Device) */ + VIR_STORAGE_VOL_NETWORK_DIR = 4, /* Network accessible directory that can + * contain other network volumes */ #ifdef VIR_ENUM_SENTINELS VIR_STORAGE_VOL_LAST diff --git a/src/conf/storage_conf.c b/src/conf/storage_conf.c index add2ae1..493a874 100644 --- a/src/conf/storage_conf.c +++ b/src/conf/storage_conf.c @@ -53,7 +53,7 @@ 
VIR_ENUM_IMPL(virStorageVol, VIR_STORAGE_VOL_LAST, - "file", "block", "dir", "network") + "file", "block", "dir", "network", "network-dir") VIR_ENUM_IMPL(virStoragePool, VIR_STORAGE_POOL_LAST, diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 8dc7e43..711adcb 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -3782,7 +3782,7 @@ qemuBuildVolumeString(virConnectPtr conn, { int ret = -1; - switch (disk->srcpool->voltype) { + switch ((virStorageVolType) disk->srcpool->voltype) { case VIR_STORAGE_VOL_DIR: if (!disk->readonly) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", @@ -3825,10 +3825,12 @@ qemuBuildVolumeString(virConnectPtr conn, } break; case VIR_STORAGE_VOL_NETWORK: + case VIR_STORAGE_VOL_NETWORK_DIR: + case VIR_STORAGE_VOL_LAST: /* Keep the compiler quiet, qemuTranslateDiskSourcePool already * reported the unsupported error. */ - break; + goto cleanup; } ret = 0; diff --git a/src/qemu/qemu_conf.c b/src/qemu/qemu_conf.c index 5803850..3c1e248 100644 --- a/src/qemu/qemu_conf.c +++ b/src/qemu/qemu_conf.c @@ -1332,7 +1332,7 @@ qemuTranslateDiskSourcePool(virConnectPtr conn, goto cleanup; } - switch (info.type) { + switch ((virStorageVolType) info.type) { case VIR_STORAGE_VOL_FILE: case VIR_STORAGE_VOL_DIR: if (!(def->src = virStorageVolGetPath(vol))) @@ -1377,6 +1377,8 @@ qemuTranslateDiskSourcePool(virConnectPtr conn, break; case VIR_STORAGE_VOL_NETWORK: + case VIR_STORAGE_VOL_NETWORK_DIR: + case VIR_STORAGE_VOL_LAST: virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("Using network volume as disk source is not supported")); goto cleanup; diff --git a/src/storage/storage_backend_fs.c b/src/storage/storage_backend_fs.c index 510a3d6..2af6faf 100644 --- a/src/storage/storage_backend_fs.c +++ b/src/storage/storage_backend_fs.c @@ -1137,7 +1137,7 @@ virStorageBackendFileSystemVolDelete(virConnectPtr conn ATTRIBUTE_UNUSED, { virCheckFlags(0, -1); - switch (vol->type) { + switch ((virStorageVolType) vol->type) { case 
VIR_STORAGE_VOL_FILE: if (unlink(vol->target.path) < 0) { /* Silently ignore failures where the vol has already gone away */ @@ -1159,7 +1159,8 @@ virStorageBackendFileSystemVolDelete(virConnectPtr conn ATTRIBUTE_UNUSED, break; case VIR_STORAGE_VOL_BLOCK: case VIR_STORAGE_VOL_NETWORK: - default: + case VIR_STORAGE_VOL_NETWORK_DIR: + case VIR_STORAGE_VOL_LAST: virReportError(VIR_ERR_NO_SUPPORT, _("removing block or network volumes is not supported: %s"), vol->target.path); diff --git a/tools/virsh-volume.c b/tools/virsh-volume.c index cbdb3f0..66d9922 100644 --- a/tools/virsh-volume.c +++ b/tools/virsh-volume.c @@ -947,7 +947,7 @@ out: static const char * vshVolumeTypeToString(int type) { - switch (type) { + switch ((virStorageVolType) type) { case VIR_STORAGE_VOL_FILE: return N_("file"); @@ -960,6 +960,9 @@ vshVolumeTypeToString(int type) case VIR_STORAGE_VOL_NETWORK: return N_("network"); + case VIR_STORAGE_VOL_NETWORK_DIR: + return N_("net-dir"); + case VIR_STORAGE_VOL_LAST: break; } -- 1.8.3.1
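The switch statements in this patch all cast their operand to virStorageVolType and list every enumerator with no default label; that is what makes the compiler (via -Wswitch) point at each switch a future enum addition forgets to update. A minimal standalone sketch of the idiom, with illustrative names rather than libvirt's:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative mirror of virStorageVolType; not the real libvirt enum. */
typedef enum {
    VOL_FILE,
    VOL_BLOCK,
    VOL_DIR,
    VOL_NETWORK,
    VOL_NETDIR,
    VOL_LAST   /* sentinel, like VIR_STORAGE_VOL_LAST */
} vol_type;

static const char *
vol_type_name(int type)
{
    /* Casting the int to the enum type, and omitting 'default:', lets
     * -Wswitch warn whenever an enumerator lacks a case here. */
    switch ((vol_type) type) {
    case VOL_FILE:    return "file";
    case VOL_BLOCK:   return "block";
    case VOL_DIR:     return "dir";
    case VOL_NETWORK: return "network";
    case VOL_NETDIR:  return "netdir";
    case VOL_LAST:    break;
    }
    return NULL;
}
```

Adding a new enumerator before VOL_LAST then turns every such switch into a compile-time warning until a case is added, which is the effect the patch relies on to find all clients.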

On Fri, Nov 22, 2013 at 08:20:26PM -0700, Eric Blake wrote:
In the 'directory' and 'netfs' storage pools, a user can see both 'file' and 'dir' storage volume types, to know when they can descend into a subdirectory. But in a network-based storage pool, such as the upcoming 'gluster' pool, we use 'network' instead of 'file', and did not have any counterpart for a directory until this patch. Adding a new volume type 'network-dir' is better than reusing 'dir', because it makes it clear that the only way to access 'network' volumes within that container is through the network mounting (leaving 'dir' for something accessible in the local file system).
* include/libvirt/libvirt.h.in (virStorageVolType): Expand enum. * src/qemu/qemu_command.c (qemuBuildVolumeString): Fix client. * src/qemu/qemu_conf.c (qemuTranslateDiskSourcePool): Likewise. * tools/virsh-volume.c (vshVolumeTypeToString): Likewise. * src/storage/storage_backend_fs.c (virStorageBackendFileSystemVolDelete): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com> --- include/libvirt/libvirt.h.in | 2 ++ src/conf/storage_conf.c | 2 +- src/qemu/qemu_command.c | 6 ++++-- src/qemu/qemu_conf.c | 4 +++- src/storage/storage_backend_fs.c | 5 +++-- tools/virsh-volume.c | 5 ++++- 6 files changed, 17 insertions(+), 7 deletions(-)
diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in index 146a59b..5e8cba6 100644 --- a/include/libvirt/libvirt.h.in +++ b/include/libvirt/libvirt.h.in @@ -2951,6 +2951,8 @@ typedef enum { VIR_STORAGE_VOL_BLOCK = 1, /* Block based volumes */ VIR_STORAGE_VOL_DIR = 2, /* Directory-passthrough based volume */ VIR_STORAGE_VOL_NETWORK = 3, /* Network volumes like RBD (RADOS Block Device) */ + VIR_STORAGE_VOL_NETWORK_DIR = 4, /* Network accessible directory that can + * contain other network volumes */
#ifdef VIR_ENUM_SENTINELS VIR_STORAGE_VOL_LAST diff --git a/src/conf/storage_conf.c b/src/conf/storage_conf.c index add2ae1..493a874 100644 --- a/src/conf/storage_conf.c +++ b/src/conf/storage_conf.c @@ -53,7 +53,7 @@
VIR_ENUM_IMPL(virStorageVol, VIR_STORAGE_VOL_LAST, - "file", "block", "dir", "network") + "file", "block", "dir", "network", "network-dir")
I've got to say I really don't like this naming but not got a better suggestion yet. Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On Mon, Nov 25, 2013 at 10:42:30AM +0000, Daniel P. Berrange wrote:
On Fri, Nov 22, 2013 at 08:20:26PM -0700, Eric Blake wrote:
In the 'directory' and 'netfs' storage pools, a user can see both 'file' and 'dir' storage volume types, to know when they can descend into a subdirectory. But in a network-based storage pool, such as the upcoming 'gluster' pool, we use 'network' instead of 'file', and did not have any counterpart for a directory until this patch. Adding a new volume type 'network-dir' is better than reusing 'dir', because it makes it clear that the only way to access 'network' volumes within that container is through the network mounting (leaving 'dir' for something accessible in the local file system).
* include/libvirt/libvirt.h.in (virStorageVolType): Expand enum. * src/qemu/qemu_command.c (qemuBuildVolumeString): Fix client. * src/qemu/qemu_conf.c (qemuTranslateDiskSourcePool): Likewise. * tools/virsh-volume.c (vshVolumeTypeToString): Likewise. * src/storage/storage_backend_fs.c (virStorageBackendFileSystemVolDelete): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com> --- include/libvirt/libvirt.h.in | 2 ++ src/conf/storage_conf.c | 2 +- src/qemu/qemu_command.c | 6 ++++-- src/qemu/qemu_conf.c | 4 +++- src/storage/storage_backend_fs.c | 5 +++-- tools/virsh-volume.c | 5 ++++- 6 files changed, 17 insertions(+), 7 deletions(-)
diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in index 146a59b..5e8cba6 100644 --- a/include/libvirt/libvirt.h.in +++ b/include/libvirt/libvirt.h.in @@ -2951,6 +2951,8 @@ typedef enum { VIR_STORAGE_VOL_BLOCK = 1, /* Block based volumes */ VIR_STORAGE_VOL_DIR = 2, /* Directory-passthrough based volume */ VIR_STORAGE_VOL_NETWORK = 3, /* Network volumes like RBD (RADOS Block Device) */ + VIR_STORAGE_VOL_NETWORK_DIR = 4, /* Network accessible directory that can + * contain other network volumes */
#ifdef VIR_ENUM_SENTINELS VIR_STORAGE_VOL_LAST diff --git a/src/conf/storage_conf.c b/src/conf/storage_conf.c index add2ae1..493a874 100644 --- a/src/conf/storage_conf.c +++ b/src/conf/storage_conf.c @@ -53,7 +53,7 @@
VIR_ENUM_IMPL(virStorageVol, VIR_STORAGE_VOL_LAST, - "file", "block", "dir", "network") + "file", "block", "dir", "network", "network-dir")
I've got to say I really don't like this naming but not got a better suggestion yet.
Could we at least shorten it to 'netdir'? Daniel


On 11/25/2013 08:54 AM, Daniel P. Berrange wrote:
VIR_ENUM_IMPL(virStorageVol, VIR_STORAGE_VOL_LAST, - "file", "block", "dir", "network") + "file", "block", "dir", "network", "network-dir")
I've got to say I really don't like this naming but not got a better suggestion yet.
Could we at least shorten it to 'netdir' ?
Done, and pushed (we have until 1.2.0 is actually released to change our minds with a baked-in name, but the shorter netdir works for me). diff --git i/docs/formatstorage.html.in w/docs/formatstorage.html.in index d9ceeae..a089a31 100644 --- i/docs/formatstorage.html.in +++ w/docs/formatstorage.html.in @@ -291,7 +291,7 @@ A storage volume will generally be either a file or a device node; <span class="since">since 1.2.0</span>, an optional output-only attribute <code>type</code> lists the actual type - (file, block, dir, network, or network-dir), which is also available + (file, block, dir, network, or netdir), which is also available from <code>virStorageVolGetInfo()</code>. The storage volume XML format is available <span class="since">since 0.4.1</span> </p> diff --git i/docs/schemas/storagevol.rng w/docs/schemas/storagevol.rng index 45207b4..8f07d8f 100644 --- i/docs/schemas/storagevol.rng +++ w/docs/schemas/storagevol.rng @@ -20,7 +20,7 @@ <value>block</value> <value>dir</value> <value>network</value> - <value>network-dir</value> + <value>netdir</value> </choice> </attribute> </optional> diff --git i/include/libvirt/libvirt.h.in w/include/libvirt/libvirt.h.in index 5e8cba6..5aad75c 100644 --- i/include/libvirt/libvirt.h.in +++ w/include/libvirt/libvirt.h.in @@ -2951,8 +2951,8 @@ typedef enum { VIR_STORAGE_VOL_BLOCK = 1, /* Block based volumes */ VIR_STORAGE_VOL_DIR = 2, /* Directory-passthrough based volume */ VIR_STORAGE_VOL_NETWORK = 3, /* Network volumes like RBD (RADOS Block Device) */ - VIR_STORAGE_VOL_NETWORK_DIR = 4, /* Network accessible directory that can - * contain other network volumes */ + VIR_STORAGE_VOL_NETDIR = 4, /* Network accessible directory that can + * contain other network volumes */ #ifdef VIR_ENUM_SENTINELS VIR_STORAGE_VOL_LAST diff --git i/src/conf/storage_conf.c w/src/conf/storage_conf.c index 3ea5bd7..22e38c1 100644 --- i/src/conf/storage_conf.c +++ w/src/conf/storage_conf.c @@ -53,7 +53,7 @@ VIR_ENUM_IMPL(virStorageVol, 
VIR_STORAGE_VOL_LAST, - "file", "block", "dir", "network", "network-dir") + "file", "block", "dir", "network", "netdir") VIR_ENUM_IMPL(virStoragePool, VIR_STORAGE_POOL_LAST, diff --git i/src/qemu/qemu_command.c w/src/qemu/qemu_command.c index 6d22bc8..763417f 100644 --- i/src/qemu/qemu_command.c +++ w/src/qemu/qemu_command.c @@ -3825,7 +3825,7 @@ qemuBuildVolumeString(virConnectPtr conn, } break; case VIR_STORAGE_VOL_NETWORK: - case VIR_STORAGE_VOL_NETWORK_DIR: + case VIR_STORAGE_VOL_NETDIR: case VIR_STORAGE_VOL_LAST: /* Keep the compiler quiet, qemuTranslateDiskSourcePool already * reported the unsupported error. diff --git i/src/qemu/qemu_conf.c w/src/qemu/qemu_conf.c index 3c1e248..77df370 100644 --- i/src/qemu/qemu_conf.c +++ w/src/qemu/qemu_conf.c @@ -1377,7 +1377,7 @@ qemuTranslateDiskSourcePool(virConnectPtr conn, break; case VIR_STORAGE_VOL_NETWORK: - case VIR_STORAGE_VOL_NETWORK_DIR: + case VIR_STORAGE_VOL_NETDIR: case VIR_STORAGE_VOL_LAST: virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("Using network volume as disk source is not supported")); diff --git i/src/storage/storage_backend_fs.c w/src/storage/storage_backend_fs.c index 2af6faf..11cf2df 100644 --- i/src/storage/storage_backend_fs.c +++ w/src/storage/storage_backend_fs.c @@ -1159,7 +1159,7 @@ virStorageBackendFileSystemVolDelete(virConnectPtr conn ATTRIBUTE_UNUSED, break; case VIR_STORAGE_VOL_BLOCK: case VIR_STORAGE_VOL_NETWORK: - case VIR_STORAGE_VOL_NETWORK_DIR: + case VIR_STORAGE_VOL_NETDIR: case VIR_STORAGE_VOL_LAST: virReportError(VIR_ERR_NO_SUPPORT, _("removing block or network volumes is not supported: %s"), diff --git i/tools/virsh-volume.c w/tools/virsh-volume.c index 66d9922..22b10d5 100644 --- i/tools/virsh-volume.c +++ w/tools/virsh-volume.c @@ -960,8 +960,8 @@ vshVolumeTypeToString(int type) case VIR_STORAGE_VOL_NETWORK: return N_("network"); - case VIR_STORAGE_VOL_NETWORK_DIR: - return N_("net-dir"); + case VIR_STORAGE_VOL_NETDIR: + return N_("netdir"); case VIR_STORAGE_VOL_LAST: 
break; -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

Take advantage of the previous patch's addition of 'network-dir' as a distinct volume type, to expose rather than silently skip directories embedded in a gluster pool. Also serves as an XML validation for the previous patch. * src/storage/storage_backend_gluster.c (virStorageBackendGlusterRefreshVol): Don't skip directories. * tests/storagevolxml2xmltest.c (mymain): Add test. * tests/storagevolxml2xmlin/vol-gluster-dir.xml: New file. * tests/storagevolxml2xmlout/vol-gluster-dir.xml: Likewise. --- src/storage/storage_backend_gluster.c | 14 ++++++++++---- tests/storagevolxml2xmlin/vol-gluster-dir.xml | 13 +++++++++++++ tests/storagevolxml2xmlout/vol-gluster-dir.xml | 18 ++++++++++++++++++ tests/storagevolxml2xmltest.c | 1 + 4 files changed, 42 insertions(+), 4 deletions(-) create mode 100644 tests/storagevolxml2xmlin/vol-gluster-dir.xml create mode 100644 tests/storagevolxml2xmlout/vol-gluster-dir.xml diff --git a/src/storage/storage_backend_gluster.c b/src/storage/storage_backend_gluster.c index 120cf0f..89abb57 100644 --- a/src/storage/storage_backend_gluster.c +++ b/src/storage/storage_backend_gluster.c @@ -165,9 +165,6 @@ virStorageBackendGlusterRefreshVol(virStorageBackendGlusterStatePtr state, /* Silently skip '.' and '..'. */ if (STREQ(name, ".") || STREQ(name, "..")) return 0; - /* FIXME: support directories. For now, silently skip them. 
*/ - if (S_ISDIR(st->st_mode)) - return 0; if (VIR_ALLOC(vol) < 0) goto cleanup; @@ -177,7 +174,6 @@ virStorageBackendGlusterRefreshVol(virStorageBackendGlusterStatePtr state, vol->name) < 0) goto cleanup; - vol->type = VIR_STORAGE_VOL_NETWORK; tmp = state->uri->path; state->uri->path = vol->key; if (!(vol->target.path = virURIFormat(state->uri))) { @@ -186,7 +182,17 @@ virStorageBackendGlusterRefreshVol(virStorageBackendGlusterStatePtr state, } state->uri->path = tmp; + if (S_ISDIR(st->st_mode)) { + vol->type = VIR_STORAGE_VOL_NETWORK_DIR; + vol->target.format = VIR_STORAGE_FILE_DIR; + *volptr = vol; + vol = NULL; + ret = 0; + goto cleanup; + } + /* FIXME - must open files to determine if they are non-raw */ + vol->type = VIR_STORAGE_VOL_NETWORK; vol->target.format = VIR_STORAGE_FILE_RAW; vol->capacity = vol->allocation = st->st_size; diff --git a/tests/storagevolxml2xmlin/vol-gluster-dir.xml b/tests/storagevolxml2xmlin/vol-gluster-dir.xml new file mode 100644 index 0000000..bd20a6a --- /dev/null +++ b/tests/storagevolxml2xmlin/vol-gluster-dir.xml @@ -0,0 +1,13 @@ +<volume> + <name>dir</name> + <key>/vol/dir</key> + <source> + </source> + <type>network-dir</type> + <capacity unit='bytes'>0</capacity> + <allocation unit='bytes'>0</allocation> + <target> + <format type='dir'/> + <path>gluster://example.com/vol/dir</path> + </target> +</volume> diff --git a/tests/storagevolxml2xmlout/vol-gluster-dir.xml b/tests/storagevolxml2xmlout/vol-gluster-dir.xml new file mode 100644 index 0000000..29e6d1a --- /dev/null +++ b/tests/storagevolxml2xmlout/vol-gluster-dir.xml @@ -0,0 +1,18 @@ +<volume> + <name>dir</name> + <key>/vol/dir</key> + <source> + </source> + <type>network-dir</type> + <capacity unit='bytes'>0</capacity> + <allocation unit='bytes'>0</allocation> + <target> + <path>gluster://example.com/vol/dir</path> + <format type='dir'/> + <permissions> + <mode>0600</mode> + <owner>4294967295</owner> + <group>4294967295</group> + </permissions> + </target> +</volume> diff 
--git a/tests/storagevolxml2xmltest.c b/tests/storagevolxml2xmltest.c index e1db465..fdcdea1 100644 --- a/tests/storagevolxml2xmltest.c +++ b/tests/storagevolxml2xmltest.c @@ -121,6 +121,7 @@ mymain(void) DO_TEST("pool-logical", "vol-logical"); DO_TEST("pool-logical", "vol-logical-backing"); DO_TEST("pool-sheepdog", "vol-sheepdog"); + DO_TEST("pool-gluster", "vol-gluster-dir"); return ret==0 ? EXIT_SUCCESS : EXIT_FAILURE; } -- 1.8.3.1

On Fri, Nov 22, 2013 at 08:20:27PM -0700, Eric Blake wrote:
diff --git a/tests/storagevolxml2xmlin/vol-gluster-dir.xml b/tests/storagevolxml2xmlin/vol-gluster-dir.xml new file mode 100644 index 0000000..bd20a6a --- /dev/null +++ b/tests/storagevolxml2xmlin/vol-gluster-dir.xml @@ -0,0 +1,13 @@ +<volume> + <name>dir</name> + <key>/vol/dir</key> + <source> + </source> + <type>network-dir</type>
Per my other reply I think that 'type' would be better as an attribute on the top level <volume> element. Daniel

On 11/25/2013 08:56 AM, Daniel P. Berrange wrote:
On Fri, Nov 22, 2013 at 08:20:27PM -0700, Eric Blake wrote:
diff --git a/tests/storagevolxml2xmlin/vol-gluster-dir.xml b/tests/storagevolxml2xmlin/vol-gluster-dir.xml new file mode 100644 index 0000000..bd20a6a --- /dev/null +++ b/tests/storagevolxml2xmlin/vol-gluster-dir.xml @@ -0,0 +1,13 @@ +<volume> + <name>dir</name> + <key>/vol/dir</key> + <source> + </source> + <type>network-dir</type>
Per my other reply I think that 'type' would be better as an attribute on the top level <volume> element.
Indeed, 'make check' fails after my tweaks earlier in the series unless I squash in this (now pushed): diff --git i/src/storage/storage_backend_gluster.c w/src/storage/storage_backend_gluster.c index 51dc742..3f4e9f7 100644 --- i/src/storage/storage_backend_gluster.c +++ w/src/storage/storage_backend_gluster.c @@ -183,7 +183,7 @@ virStorageBackendGlusterRefreshVol(virStorageBackendGlusterStatePtr state, state->uri->path = tmp; if (S_ISDIR(st->st_mode)) { - vol->type = VIR_STORAGE_VOL_NETWORK_DIR; + vol->type = VIR_STORAGE_VOL_NETDIR; vol->target.format = VIR_STORAGE_FILE_DIR; *volptr = vol; vol = NULL; diff --git i/tests/storagevolxml2xmlin/vol-gluster-dir.xml w/tests/storagevolxml2xmlin/vol-gluster-dir.xml index bd20a6a..208c2c2 100644 --- i/tests/storagevolxml2xmlin/vol-gluster-dir.xml +++ w/tests/storagevolxml2xmlin/vol-gluster-dir.xml @@ -1,9 +1,8 @@ -<volume> +<volume type='netdir'> <name>dir</name> - <key>/vol/dir</key> + <key>vol/dir</key> <source> </source> - <type>network-dir</type> <capacity unit='bytes'>0</capacity> <allocation unit='bytes'>0</allocation> <target> diff --git i/tests/storagevolxml2xmlout/vol-gluster-dir.xml w/tests/storagevolxml2xmlout/vol-gluster-dir.xml index 29e6d1a..f188ceb 100644 --- i/tests/storagevolxml2xmlout/vol-gluster-dir.xml +++ w/tests/storagevolxml2xmlout/vol-gluster-dir.xml @@ -1,9 +1,8 @@ -<volume> +<volume type='netdir'> <name>dir</name> - <key>/vol/dir</key> + <key>vol/dir</key> <source> </source> - <type>network-dir</type> <capacity unit='bytes'>0</capacity> <allocation unit='bytes'>0</allocation> <target> -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

We already had code for handling allocation different than capacity for sparse files; we just had to wire it up to be used when inspecting gluster images. * src/storage/storage_backend.c (virStorageBackendUpdateVolTargetInfoFD): Handle no fd. * src/storage/storage_backend_gluster.c (virStorageBackendGlusterRefreshVol): Handle sparse files. Signed-off-by: Eric Blake <eblake@redhat.com> --- src/storage/storage_backend.c | 8 ++++---- src/storage/storage_backend_gluster.c | 7 ++++++- 2 files changed, 10 insertions(+), 5 deletions(-) diff --git a/src/storage/storage_backend.c b/src/storage/storage_backend.c index c94e89a..57c1728 100644 --- a/src/storage/storage_backend.c +++ b/src/storage/storage_backend.c @@ -1296,9 +1296,9 @@ int virStorageBackendUpdateVolInfo(virStorageVolDefPtr vol, /* * virStorageBackendUpdateVolTargetInfoFD: - * @conn: connection to report errors on * @target: target definition ptr of volume to update - * @fd: fd of storage volume to update, via virStorageBackendOpenVol* + * @fd: fd of storage volume to update, via virStorageBackendOpenVol*, or -1 + * @sb: details about file (must match @fd, if that is provided) * @allocation: If not NULL, updated allocation information will be stored * @capacity: If not NULL, updated capacity info will be stored * @@ -1333,7 +1333,7 @@ virStorageBackendUpdateVolTargetInfoFD(virStorageVolTargetPtr target, if (capacity) *capacity = 0; - } else { + } else if (fd >= 0) { off_t end; /* XXX this is POSIX compliant, but doesn't work for CHAR files, * only BLOCK. 
There is a Linux specific ioctl() for getting @@ -1368,7 +1368,7 @@ virStorageBackendUpdateVolTargetInfoFD(virStorageVolTargetPtr target, #if WITH_SELINUX /* XXX: make this a security driver call */ - if (fgetfilecon_raw(fd, &filecon) == -1) { + if (fd >= 0 && fgetfilecon_raw(fd, &filecon) == -1) { if (errno != ENODATA && errno != ENOTSUP) { virReportSystemError(errno, _("cannot get file context of '%s'"), diff --git a/src/storage/storage_backend_gluster.c b/src/storage/storage_backend_gluster.c index 89abb57..a4fd4b3 100644 --- a/src/storage/storage_backend_gluster.c +++ b/src/storage/storage_backend_gluster.c @@ -168,6 +168,12 @@ virStorageBackendGlusterRefreshVol(virStorageBackendGlusterStatePtr state, if (VIR_ALLOC(vol) < 0) goto cleanup; + + if (virStorageBackendUpdateVolTargetInfoFD(&vol->target, -1, st, + &vol->allocation, + &vol->capacity) < 0) + goto cleanup; + if (VIR_STRDUP(vol->name, name) < 0) goto cleanup; if (virAsprintf(&vol->key, "/%s%s%s", state->volname, state->dir, @@ -194,7 +200,6 @@ virStorageBackendGlusterRefreshVol(virStorageBackendGlusterStatePtr state, /* FIXME - must open files to determine if they are non-raw */ vol->type = VIR_STORAGE_VOL_NETWORK; vol->target.format = VIR_STORAGE_FILE_RAW; - vol->capacity = vol->allocation = st->st_size; *volptr = vol; vol = NULL; -- 1.8.3.1
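The numbers this patch wires up come straight from the stat results: st_size is the apparent size (capacity), while st_blocks counts 512-byte units actually allocated, so a sparse file reports allocation well below capacity. A plain-POSIX sketch of the calculation virStorageBackendUpdateVolTargetInfoFD performs for regular files (the helper name is illustrative, not libvirt code):

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Derive capacity and allocation from a stat buffer the way the
 * storage backend does for regular files: st_size is the apparent
 * size, st_blocks * 512 the storage actually allocated (POSIX
 * specifies st_blocks in 512-byte units). */
static void
vol_sizes_from_stat(const struct stat *sb,
                    unsigned long long *capacity,
                    unsigned long long *allocation)
{
    *capacity = (unsigned long long) sb->st_size;
    *allocation = (unsigned long long) sb->st_blocks * 512ULL;
}
```

For a file created by ftruncate() alone, allocation comes out near zero while capacity equals the requested length, which is exactly the sparse case the patch now reports for gluster volumes.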

On Fri, Nov 22, 2013 at 08:20:28PM -0700, Eric Blake wrote:
We already had code for handling allocation different than capacity for sparse files; we just had to wire it up to be used when inspecting gluster images.
* src/storage/storage_backend.c (virStorageBackendUpdateVolTargetInfoFD): Handle no fd. * src/storage/storage_backend_gluster.c (virStorageBackendGlusterRefreshVol): Handle sparse files.
Signed-off-by: Eric Blake <eblake@redhat.com> --- src/storage/storage_backend.c | 8 ++++---- src/storage/storage_backend_gluster.c | 7 ++++++- 2 files changed, 10 insertions(+), 5 deletions(-)
ACK Daniel

On 11/25/2013 08:57 AM, Daniel P. Berrange wrote:
On Fri, Nov 22, 2013 at 08:20:28PM -0700, Eric Blake wrote:
We already had code for handling allocation different than capacity for sparse files; we just had to wire it up to be used when inspecting gluster images.
* src/storage/storage_backend.c (virStorageBackendUpdateVolTargetInfoFD): Handle no fd. * src/storage/storage_backend_gluster.c (virStorageBackendGlusterRefreshVol): Handle sparse files.
Signed-off-by: Eric Blake <eblake@redhat.com> --- src/storage/storage_backend.c | 8 ++++---- src/storage/storage_backend_gluster.c | 7 ++++++- 2 files changed, 10 insertions(+), 5 deletions(-)
ACK
Pushed. -- Eric Blake

With this patch, dangling and looping symlinks are silently ignored, while links to files and directories are treated the same as the underlying file or directory. This is the same behavior as both 'directory' and 'netfs' pools. * src/storage/storage_backend_gluster.c (virStorageBackendGlusterRefreshVol): Treat symlinks similar to directory and netfs pools. Signed-off-by: Eric Blake <eblake@redhat.com> --- src/storage/storage_backend_gluster.c | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/src/storage/storage_backend_gluster.c b/src/storage/storage_backend_gluster.c index a4fd4b3..e935583 100644 --- a/src/storage/storage_backend_gluster.c +++ b/src/storage/storage_backend_gluster.c @@ -166,6 +166,17 @@ virStorageBackendGlusterRefreshVol(virStorageBackendGlusterStatePtr state, if (STREQ(name, ".") || STREQ(name, "..")) return 0; + /* Follow symlinks; silently skip broken links and loops. */ + if (S_ISLNK(st->st_mode) && glfs_stat(state->vol, name, st) < 0) { + if (errno == ENOENT || errno == ELOOP) { + VIR_WARN("ignoring dangling symlink '%s'", name); + ret = 0; + } else { + virReportSystemError(errno, _("cannot stat '%s'"), name); + } + return ret; + } + if (VIR_ALLOC(vol) < 0) goto cleanup; -- 1.8.3.1
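For reference, glfs_stat() follows symlinks just as stat() does, so a dangling or looping link is precisely one whose lstat()-style information says S_ISLNK but whose stat() fails with ENOENT or ELOOP. The same detection in plain POSIX terms (a hypothetical helper, not libvirt code):

```c
#include <assert.h>
#include <errno.h>
#include <sys/stat.h>
#include <unistd.h>

/* Return 1 if @path is a symlink whose target cannot be resolved
 * (dangling or looping), 0 if the entry resolves, -1 on other errors.
 * Mirrors the per-entry skip logic the gluster backend now applies. */
static int
link_is_broken(const char *path)
{
    struct stat sb;

    if (lstat(path, &sb) < 0)
        return -1;                 /* the entry itself is gone */
    if (!S_ISLNK(sb.st_mode))
        return 0;                  /* not a link at all */
    if (stat(path, &sb) == 0)
        return 0;                  /* link resolves fine */
    if (errno == ENOENT || errno == ELOOP)
        return 1;                  /* dangling or loop: skip silently */
    return -1;
}
```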

On Fri, Nov 22, 2013 at 08:20:29PM -0700, Eric Blake wrote:
With this patch, dangling and looping symlinks are silently ignored, while links to files and directories are treated the same as the underlying file or directory. This is the same behavior as both 'directory' and 'netfs' pools.
* src/storage/storage_backend_gluster.c (virStorageBackendGlusterRefreshVol): Treat symlinks similar to directory and netfs pools.
Signed-off-by: Eric Blake <eblake@redhat.com> --- src/storage/storage_backend_gluster.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
ACK Daniel

On 11/25/2013 08:57 AM, Daniel P. Berrange wrote:
On Fri, Nov 22, 2013 at 08:20:29PM -0700, Eric Blake wrote:
With this patch, dangling and looping symlinks are silently ignored, while links to files and directories are treated the same as the underlying file or directory. This is the same behavior as both 'directory' and 'netfs' pools.
* src/storage/storage_backend_gluster.c (virStorageBackendGlusterRefreshVol): Treat symlinks similar to directory and netfs pools.
Signed-off-by: Eric Blake <eblake@redhat.com> --- src/storage/storage_backend_gluster.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
ACK
Since I just added 'netdir' volume types, I'm wondering if I should also add a 'badlink' volume type (covers both ENOENT and ELOOP errors). After all, in a directory pool, you cannot create a new file with a name occupied by a bad link without first deleting the bad link; and our current behavior of silently ignoring bad links is a little unfriendly when compared with actually telling the user that they have a broken symlink, and without a way to use the libvirt API to get the broken link out of the way (you can't delete something that is silently ignored). But that would be a followup patch, particularly since it affects directory (and not just gluster) pools. I've pushed this one as-is. -- Eric Blake

Putting together pieces from previous patches, it is now possible for 'virsh vol-dumpxml --pool gluster volname' to report metadata about a qcow2 file stored on gluster. The backing file is still treated as raw; to fix that, more patches are needed to make the storage backing chain analysis recursive rather than halting at a network protocol name, but that work will not need any further calls into libgfapi so much as just reusing this code, and that should be the only code outside of the storage driver that needs any help from libgfapi. Any additional use of libgfapi within libvirt should only be needed for implementing storage pool APIs such as volume creation or resizing, where backing chain analysis should be unaffected. * src/storage/storage_backend_gluster.c (virStorageBackendGlusterReadHeader): New helper function. (virStorageBackendGlusterRefreshVol): Probe non-raw files. Signed-off-by: Eric Blake <eblake@redhat.com> --- src/storage/storage_backend_gluster.c | 76 ++++++++++++++++++++++++++++++++++- 1 file changed, 75 insertions(+), 1 deletion(-) diff --git a/src/storage/storage_backend_gluster.c b/src/storage/storage_backend_gluster.c index e935583..b239880 100644 --- a/src/storage/storage_backend_gluster.c +++ b/src/storage/storage_backend_gluster.c @@ -147,6 +147,37 @@ error: } +static ssize_t +virStorageBackendGlusterReadHeader(glfs_fd_t *fd, + const char *name, + ssize_t maxlen, + char **buf) +{ + char *s; + size_t nread = 0; + + if (VIR_ALLOC_N(*buf, maxlen) < 0) + return -1; + + s = *buf; + while (maxlen) { + ssize_t r = glfs_read(fd, s, maxlen, 0); + if (r < 0 && errno == EINTR) + continue; + if (r < 0) { + VIR_FREE(*buf); + virReportSystemError(errno, _("unable to read '%s'"), name); + return r; + } + if (r == 0) + return nread; + s += r; + maxlen -= r; + nread += r; + } + return nread; +} + /* Populate *volptr for the given name and stat information, or leave * it NULL if the entry should be skipped (such as "."). 
Return 0 on * success, -1 on failure. */ @@ -159,6 +190,10 @@ virStorageBackendGlusterRefreshVol(virStorageBackendGlusterStatePtr state, char *tmp; int ret = -1; virStorageVolDefPtr vol = NULL; + glfs_fd_t *fd = NULL; + virStorageFileMetadata *meta = NULL; + char *header = NULL; + ssize_t len = VIR_STORAGE_MAX_HEADER; *volptr = NULL; @@ -208,15 +243,54 @@ virStorageBackendGlusterRefreshVol(virStorageBackendGlusterStatePtr state, goto cleanup; } - /* FIXME - must open files to determine if they are non-raw */ vol->type = VIR_STORAGE_VOL_NETWORK; vol->target.format = VIR_STORAGE_FILE_RAW; + if (!(fd = glfs_open(state->vol, name, O_RDONLY| O_NONBLOCK | O_NOCTTY))) { + /* A dangling symlink now implies a TOCTTOU race; report it. */ + virReportSystemError(errno, _("cannot open volume '%s'"), name); + goto cleanup; + } + + if ((len = virStorageBackendGlusterReadHeader(fd, name, len, &header)) < 0) + goto cleanup; + + if ((vol->target.format = virStorageFileProbeFormatFromBuf(name, + header, + len)) < 0 || + !(meta = virStorageFileGetMetadataFromBuf(name, header, len, + vol->target.format))) + goto cleanup; + + if (meta->backingStore) { + vol->backingStore.path = meta->backingStore; + meta->backingStore = NULL; + vol->backingStore.format = meta->backingStoreFormat; + if (vol->backingStore.format < 0) + vol->backingStore.format = VIR_STORAGE_FILE_RAW; + } + if (meta->capacity) + vol->capacity = meta->capacity; + if (meta->encrypted) { + if (VIR_ALLOC(vol->target.encryption) < 0) + goto cleanup; + if (vol->target.format == VIR_STORAGE_FILE_QCOW || + vol->target.format == VIR_STORAGE_FILE_QCOW2) + vol->target.encryption->format = VIR_STORAGE_ENCRYPTION_FORMAT_QCOW; + } + vol->target.features = meta->features; + meta->features = NULL; + vol->target.compat = meta->compat; + meta->compat = NULL; *volptr = vol; vol = NULL; ret = 0; cleanup: + virStorageFileFreeMetadata(meta); virStorageVolDefFree(vol); + if (fd) + glfs_close(fd); + VIR_FREE(header); return ret; } -- 1.8.3.1
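The helper's read loop is the usual "retry on EINTR, accumulate short reads, stop at EOF" pattern. The same shape with plain read() in place of glfs_read() (a sketch, not the libvirt function):

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Read up to @maxlen bytes from @fd into a freshly allocated buffer,
 * retrying on EINTR and accumulating short reads, as the gluster
 * header probe does.  Returns bytes read, or -1 (buffer freed). */
static ssize_t
read_header(int fd, size_t maxlen, char **buf)
{
    char *s;
    size_t nread = 0;

    if (!(*buf = malloc(maxlen)))
        return -1;

    s = *buf;
    while (maxlen) {
        ssize_t r = read(fd, s, maxlen);
        if (r < 0 && errno == EINTR)
            continue;              /* interrupted: just retry */
        if (r < 0) {
            free(*buf);
            *buf = NULL;
            return -1;
        }
        if (r == 0)
            return nread;          /* EOF before the buffer filled */
        s += r;                    /* advance the data pointer, not buf */
        maxlen -= r;
        nread += r;
    }
    return nread;
}
```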

On Fri, Nov 22, 2013 at 08:20:30PM -0700, Eric Blake wrote:
Putting together pieces from previous patches, it is now possible for 'virsh vol-dumpxml --pool gluster volname' to report metadata about a qcow2 file stored on gluster. The backing file is still treated as raw; to fix that, more patches are needed to make the storage backing chain analysis recursive rather than halting at a network protocol name, but that work will not need any further calls into libgfapi so much as just reusing this code, and that should be the only code outside of the storage driver that needs any help from libgfapi. Any additional use of libgfapi within libvirt should only be needed for implementing storage pool APIs such as volume creation or resizing, where backing chain analysis should be unaffected.
* src/storage/storage_backend_gluster.c (virStorageBackendGlusterReadHeader): New helper function. (virStorageBackendGlusterRefreshVol): Probe non-raw files.
Signed-off-by: Eric Blake <eblake@redhat.com> --- src/storage/storage_backend_gluster.c | 76 ++++++++++++++++++++++++++++++++++- 1 file changed, 75 insertions(+), 1 deletion(-)
diff --git a/src/storage/storage_backend_gluster.c b/src/storage/storage_backend_gluster.c index e935583..b239880 100644 --- a/src/storage/storage_backend_gluster.c +++ b/src/storage/storage_backend_gluster.c @@ -147,6 +147,37 @@ error: }
+static ssize_t +virStorageBackendGlusterReadHeader(glfs_fd_t *fd, + const char *name, + ssize_t maxlen, + char **buf) +{ + char *s; + size_t nread = 0; + + if (VIR_ALLOC_N(*buf, maxlen) < 0) + return -1; + + s = *buf; + while (maxlen) { + ssize_t r = glfs_read(fd, s, maxlen, 0); + if (r < 0 && errno == EINTR) + continue; + if (r < 0) { + VIR_FREE(*buf); + virReportSystemError(errno, _("unable to read '%s'"), name); + return r; + }
Further down you're requesting O_NONBLOCK, and here you are not handling EAGAIN explicitly. Is it desirable that we turn EAGAIN into a fatal error, or should we remove the O_NONBLOCK flag?
@@ -208,15 +243,54 @@ virStorageBackendGlusterRefreshVol(virStorageBackendGlusterStatePtr state, goto cleanup; }
- /* FIXME - must open files to determine if they are non-raw */ vol->type = VIR_STORAGE_VOL_NETWORK; vol->target.format = VIR_STORAGE_FILE_RAW; + if (!(fd = glfs_open(state->vol, name, O_RDONLY| O_NONBLOCK | O_NOCTTY))) { + /* A dangling symlink now implies a TOCTTOU race; report it. */ + virReportSystemError(errno, _("cannot open volume '%s'"), name); + goto cleanup; + }
Daniel

On 11/25/2013 08:59 AM, Daniel P. Berrange wrote:
On Fri, Nov 22, 2013 at 08:20:30PM -0700, Eric Blake wrote:
Putting together pieces from previous patches, it is now possible for 'virsh vol-dumpxml --pool gluster volname' to report metadata about a qcow2 file stored on gluster. The backing file is still treated as raw; to fix that, more patches are needed to make the storage backing chain analysis recursive rather than halting at a network protocol name, but that work will not need any further calls into libgfapi so much as just reusing this code, and that should be the only code outside of the storage driver that needs any help from libgfapi. Any additional use of libgfapi within libvirt should only be needed for implementing storage pool APIs such as volume creation or resizing, where backing chain analysis should be unaffected.
+    while (maxlen) {
+        ssize_t r = glfs_read(fd, s, maxlen, 0);
+        if (r < 0 && errno == EINTR)
+            continue;
+        if (r < 0) {
+            VIR_FREE(*buf);
+            virReportSystemError(errno, _("unable to read '%s'"), name);
+            return r;
+        }
Further down you're requesting O_NONBLOCK, and here you are not handling EAGAIN explicitly. Is it desirable that we turn EAGAIN into a fatal error, or should we remove the O_NONBLOCK flag?
Hmm. I was copying from directory pools, which also use O_NONBLOCK then call into virFileReadHeaderFD. So that code needs to be fixed (separate patch). For gluster, I think it's easiest to just drop O_NONBLOCK from the code (I just verified that attempts to use 'mkfifo' inside a gluster volume fail with EACCES); for directory pools you DO want to open O_NONBLOCK (otherwise opening a fifo for read would hang waiting for a writer), then use virSetNonBlock() after verifying the file type but before reading the header (since we already reject fifos as unable to be used as a storage volume).

Squashing in this, then pushing.

diff --git i/src/storage/storage_backend_gluster.c w/src/storage/storage_backend_gluster.c
index 71edb12..7e5ea9e 100644
--- i/src/storage/storage_backend_gluster.c
+++ w/src/storage/storage_backend_gluster.c
@@ -245,7 +245,9 @@ virStorageBackendGlusterRefreshVol(virStorageBackendGlusterStatePtr state,
     vol->type = VIR_STORAGE_VOL_NETWORK;
     vol->target.format = VIR_STORAGE_FILE_RAW;
-    if (!(fd = glfs_open(state->vol, name, O_RDONLY| O_NONBLOCK | O_NOCTTY))) {
+    /* No need to worry about O_NONBLOCK - gluster doesn't allow creation
+     * of fifos, so there's nothing it would protect us from. */
+    if (!(fd = glfs_open(state->vol, name, O_RDONLY | O_NOCTTY))) {
         /* A dangling symlink now implies a TOCTTOU race; report it. */
         virReportSystemError(errno, _("cannot open volume '%s'"), name);
         goto cleanup;

--
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

On Mon, Nov 25, 2013 at 01:45:54PM -0700, Eric Blake wrote:
On 11/25/2013 08:59 AM, Daniel P. Berrange wrote:
On Fri, Nov 22, 2013 at 08:20:30PM -0700, Eric Blake wrote:
Putting together pieces from previous patches, it is now possible for 'virsh vol-dumpxml --pool gluster volname' to report metadata about a qcow2 file stored on gluster. The backing file is still treated as raw; to fix that, more patches are needed to make the storage backing chain analysis recursive rather than halting at a network protocol name, but that work will not need any further calls into libgfapi so much as just reusing this code, and that should be the only code outside of the storage driver that needs any help from libgfapi. Any additional use of libgfapi within libvirt should only be needed for implementing storage pool APIs such as volume creation or resizing, where backing chain analysis should be unaffected.
+    while (maxlen) {
+        ssize_t r = glfs_read(fd, s, maxlen, 0);
+        if (r < 0 && errno == EINTR)
+            continue;
+        if (r < 0) {
+            VIR_FREE(*buf);
+            virReportSystemError(errno, _("unable to read '%s'"), name);
+            return r;
+        }
Further down you're requesting O_NONBLOCK, and here you are not handling EAGAIN explicitly. Is it desirable that we turn EAGAIN into a fatal error, or should we remove the O_NONBLOCK flag?
Hmm. I was copying from directory pools, which also use O_NONBLOCK then call into virFileReadHeaderFD. So that code needs to be fixed (separate patch). For gluster, I think it's easiest to just drop O_NONBLOCK from the code (I just verified that attempts to use 'mkfifo' inside a gluster volume fail with EACCES); for directory pools you DO want to open O_NONBLOCK (otherwise opening a fifo for read would hang waiting for a writer), then use virSetNonBlock() after verifying file type but before reading the header (since we already reject fifos as unable to be used as a storage volume).
Fortunately with local directory based pools, the only effect that O_NONBLOCK has is to prevent you blocking on fifos. If you use the O_NONBLOCK flag in conjunction with plain files, you'll still block waiting for I/O, to the annoyance of programmers everywhere.

Daniel

On 23/11/13 11:20, Eric Blake wrote:
v3: https://www.redhat.com/archives/libvir-list/2013-November/msg00348.html
Depends on: https://www.redhat.com/archives/libvir-list/2013-November/msg00955.html
Changes since then, addressing review feedback:
- rebase to other improvements in the meantime
- new patches 4-7
- pool changed to require <name>volume</name> to have no slash, with subdirectory within a volume selected by <dir path=.../> which must begin with slash
- documentation improved to match actual testing
- directories, symlinks are handled
- volume owner and timestamps are handled
- volume xml tests added, with several bugs in earlier version fixed along the way
- compared gluster pool with a netfs pool to ensure both can see the same level of detail from the same gluster storage
If you think it will help review, ask me to provide an interdiff from v3 (although I have not done it yet).
Eric Blake (8):
  storage: initial support for linking with libgfapi
  storage: document gluster pool
  storage: implement rudimentary glusterfs pool refresh
  storage: add network-dir as new storage volume type
  storage: improve directory support in gluster pool
  storage: improve allocation stats reported on gluster files
  storage: improve handling of symlinks in gluster
  storage: probe qcow2 volumes in gluster pool
Looked through the whole set; except for the version nit pointed out by Daniel, I didn't see any other problems. So ACK.

Btw, I need to support the gluster pool for domain config after these patches are pushed.

Regards,
Osier

On 11/25/13 08:43, Osier Yang wrote:
On 23/11/13 11:20, Eric Blake wrote:
...
Eric Blake (8):
  storage: initial support for linking with libgfapi
  storage: document gluster pool
  storage: implement rudimentary glusterfs pool refresh
  storage: add network-dir as new storage volume type
  storage: improve directory support in gluster pool
  storage: improve allocation stats reported on gluster files
  storage: improve handling of symlinks in gluster
  storage: probe qcow2 volumes in gluster pool
Looked through the whole set; except for the version nit pointed out by Daniel, I didn't see any other problems. So ACK.
/btw, I need to support the gluster pool for domain config after these patches are pushed.
Don't bother with that at this point. My patches that deal with snapshots on gluster (not yet posted) refactored the code to format pool sources, thus we'd have a ton of rebase conflicts.
Regards, Osier
Peter
participants (5):
- Daniel P. Berrange
- Daniel Veillard
- Eric Blake
- Osier Yang
- Peter Krempa