[libvirt] [PATCH 0/4 v2] Automatic lease management for sanlock

This is an update to http://www.redhat.com/archives/libvir-list/2011-June/msg00813.html

Changed in this v2 posting:
- Incorporated feedback from v1
- Removed the cron job; this should be set up manually by the admin, on only one host
- Added documentation to the website on how to deploy sanlock

Allow a 'configFile' parameter to be passed into the lock drivers to provide configuration. Wire up the QEMU driver to pass in file names '/etc/libvirt/qemu-$NAME.conf eg qemu-sanlock.conf * src/locking/lock_driver.h, src/locking/lock_driver_nop.c, src/locking/lock_driver_sanlock.c, src/locking/lock_manager.c, src/locking/lock_manager.h: Add configFile parameter * src/qemu/qemu_conf.c: Pass in configuration file path to lock driver plugins --- src/locking/lock_driver.h | 1 + src/locking/lock_driver_nop.c | 3 ++- src/locking/lock_driver_sanlock.c | 1 + src/locking/lock_manager.c | 7 +++---- src/locking/lock_manager.h | 1 + src/qemu/qemu_conf.c | 11 +++++++++-- 6 files changed, 17 insertions(+), 7 deletions(-) diff --git a/src/locking/lock_driver.h b/src/locking/lock_driver.h index 2e71113..3b90efa 100644 --- a/src/locking/lock_driver.h +++ b/src/locking/lock_driver.h @@ -121,6 +121,7 @@ struct _virLockManagerParam { * Returns -1 if the requested version/flags were inadequate */ typedef int (*virLockDriverInit)(unsigned int version, + const char *configFile, unsigned int flags); /** diff --git a/src/locking/lock_driver_nop.c b/src/locking/lock_driver_nop.c index 36a9083..43374e4 100644 --- a/src/locking/lock_driver_nop.c +++ b/src/locking/lock_driver_nop.c @@ -28,9 +28,10 @@ static int virLockManagerNopInit(unsigned int version, + const char *configFile, unsigned int flags) { - VIR_DEBUG("version=%u flags=%u", version, flags); + VIR_DEBUG("version=%u configFile=%s flags=%u", version, NULLSTR(configFile), flags); return 0; } diff --git a/src/locking/lock_driver_sanlock.c b/src/locking/lock_driver_sanlock.c index adead76..3eab23e 100644 --- a/src/locking/lock_driver_sanlock.c +++ b/src/locking/lock_driver_sanlock.c @@ -64,6 +64,7 @@ struct _virLockManagerSanlockPrivate { */ static int virLockManagerSanlockInit(unsigned int version ATTRIBUTE_UNUSED, + const char *configFile ATTRIBUTE_UNUSED, unsigned int flags) { virCheckFlags(0, -1); diff --git 
a/src/locking/lock_manager.c b/src/locking/lock_manager.c index e97c738..c28ca86 100644 --- a/src/locking/lock_manager.c +++ b/src/locking/lock_manager.c @@ -119,6 +119,7 @@ static void virLockManagerLogParams(size_t nparams, */ #if HAVE_DLFCN_H virLockManagerPluginPtr virLockManagerPluginNew(const char *name, + const char *configFile, unsigned int flags) { void *handle = NULL; @@ -162,11 +163,8 @@ virLockManagerPluginPtr virLockManagerPluginNew(const char *name, } } - if (driver->drvInit(VIR_LOCK_MANAGER_VERSION, flags) < 0) { - virLockError(VIR_ERR_INTERNAL_ERROR, "%s", - _("plugin ABI is not compatible")); + if (driver->drvInit(VIR_LOCK_MANAGER_VERSION, configFile, flags) < 0) goto cleanup; - } if (VIR_ALLOC(plugin) < 0) { virReportOOMError(); @@ -193,6 +191,7 @@ cleanup: } #else /* !HAVE_DLFCN_H */ virLockManagerPluginPtr virLockManagerPluginNew(const char *name ATTRIBUTE_UNUSED, + const char *configFile ATTRIBUTE_UNUSED, unsigned int flags ATTRIBUTE_UNUSED) { virLockError(VIR_ERR_INTERNAL_ERROR, "%s", diff --git a/src/locking/lock_manager.h b/src/locking/lock_manager.h index 13ad372..315c798 100644 --- a/src/locking/lock_manager.h +++ b/src/locking/lock_manager.h @@ -29,6 +29,7 @@ typedef struct _virLockManagerPlugin virLockManagerPlugin; typedef virLockManagerPlugin *virLockManagerPluginPtr; virLockManagerPluginPtr virLockManagerPluginNew(const char *name, + const char *configFile, unsigned int flags); void virLockManagerPluginRef(virLockManagerPluginPtr plugin); void virLockManagerPluginUnref(virLockManagerPluginPtr plugin); diff --git a/src/qemu/qemu_conf.c b/src/qemu/qemu_conf.c index 46f6976..3d8aba4 100644 --- a/src/qemu/qemu_conf.c +++ b/src/qemu/qemu_conf.c @@ -116,7 +116,7 @@ int qemudLoadDriverConfig(struct qemud_driver *driver, #endif if (!(driver->lockManager = - virLockManagerPluginNew("nop", 0))) + virLockManagerPluginNew("nop", NULL, 0))) return -1; /* Just check the file is readable before opening it, otherwise @@ -438,10 +438,17 @@ int 
qemudLoadDriverConfig(struct qemud_driver *driver, p = virConfGetValue (conf, "lock_manager"); CHECK_TYPE ("lock_manager", VIR_CONF_STRING); if (p && p->str) { + char *lockConf; virLockManagerPluginUnref(driver->lockManager); + if (virAsprintf(&lockConf, "%s/libvirt/qemu-%s.conf", SYSCONFDIR, p->str) < 0) { + virReportOOMError(); + virConfFree(conf); + return -1; + } if (!(driver->lockManager = - virLockManagerPluginNew(p->str, 0))) + virLockManagerPluginNew(p->str, lockConf, 0))) VIR_ERROR(_("Failed to load lock manager %s"), p->str); + VIR_FREE(lockConf); } virConfFree (conf); -- 1.7.4.4

On 06/24/2011 07:02 AM, Daniel P. Berrange wrote:
Allow a 'configFile' parameter to be passed into the lock drivers to provide configuration. Wire up the QEMU driver to pass in file names of the form '/etc/libvirt/qemu-$NAME.conf', e.g. qemu-sanlock.conf.
* src/locking/lock_driver.h, src/locking/lock_driver_nop.c,
  src/locking/lock_driver_sanlock.c, src/locking/lock_manager.c,
  src/locking/lock_manager.h: Add configFile parameter
* src/qemu/qemu_conf.c: Pass in configuration file path to lock
  driver plugins
---
 src/locking/lock_driver.h         |    1 +
 src/locking/lock_driver_nop.c     |    3 ++-
 src/locking/lock_driver_sanlock.c |    1 +
 src/locking/lock_manager.c        |    7 +++----
 src/locking/lock_manager.h        |    1 +
 src/qemu/qemu_conf.c              |   11 +++++++++--
 6 files changed, 17 insertions(+), 7 deletions(-)
ACK.

-- 
Eric Blake   eblake@redhat.com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org

Introduce a configuration file with a single parameter 'require_lease_for_disks', which is used to decide whether it is allowed to start a guest which has read/write disks, but without any leases. * libvirt.spec.in: Add sanlock config file and augeas lens * src/Makefile.am: Install sanlock config file and augeas lens * src/locking/libvirt_sanlock.aug: Augeas master lens * src/locking/test_libvirt_sanlock.aug: Augeas test file * src/locking/sanlock.conf: Example sanlock config * src/locking/lock_driver_sanlock.c: Wire up loading of configuration file --- libvirt.spec.in | 3 + src/Makefile.am | 25 ++++++++++- src/locking/libvirt_sanlock.aug | 31 ++++++++++++++ src/locking/lock_driver_sanlock.c | 77 ++++++++++++++++++++++++++++++++- src/locking/sanlock.conf | 6 +++ src/locking/test_libvirt_sanlock.aug | 7 +++ 6 files changed, 144 insertions(+), 5 deletions(-) create mode 100644 src/locking/libvirt_sanlock.aug create mode 100644 src/locking/sanlock.conf create mode 100644 src/locking/test_libvirt_sanlock.aug diff --git a/libvirt.spec.in b/libvirt.spec.in index 11b8591..9b22983 100644 --- a/libvirt.spec.in +++ b/libvirt.spec.in @@ -1028,7 +1028,10 @@ fi %if %{with_sanlock} %files lock-sanlock %defattr(-, root, root) +%config(noreplace) %{_sysconfdir}/libvirt/qemu-sanlock.conf %attr(0755, root, root) %{_libdir}/libvirt/lock-driver/sanlock.so +%{_datadir}/augeas/lenses/libvirt_sanlock.aug +%{_datadir}/augeas/lenses/tests/test_libvirt_sanlock.aug %endif %files client -f %{name}.lang diff --git a/src/Makefile.am b/src/Makefile.am index 52fc63c..9208883 100644 --- a/src/Makefile.am +++ b/src/Makefile.am @@ -1035,6 +1035,10 @@ if WITH_LXC $(srcdir)/lxc/test_libvirtd_lxc.aug; \ fi endif + $(AM_V_GEN)if test -x '$(AUGPARSE)'; then \ + '$(AUGPARSE)' -I $(srcdir)/locking \ + $(srcdir)/locking/test_libvirt_sanlock.aug; \ + fi # # Build our version script. 
This is composed of three parts: @@ -1182,9 +1186,26 @@ lockdriver_LTLIBRARIES = sanlock.la sanlock_la_SOURCES = $(LOCK_DRIVER_SANLOCK_SOURCES) sanlock_la_CFLAGS = $(AM_CLFAGS) sanlock_la_LDFLAGS = -module -avoid-version -sanlock_la_LIBADD = -lsanlock +sanlock_la_LIBADD = -lsanlock \ + ../gnulib/lib/libgnu.la + +augeas_DATA += locking/libvirt_sanlock.aug +augeastest_DATA += locking/test_libvirt_sanlock.aug + +EXTRA_DIST += locking/sanlock.conf \ + locking/libvirt_sanlock.aug \ + locking/test_libvirt_sanlock.aug + +$(builddir)/locking/%-sanlock.conf: $(srcdir)/locking/sanlock.conf + $(AM_V_GEN)mkdir locking ; \ + cp $< $@ + +if WITH_QEMU +conf_DATA += locking/qemu-sanlock.conf +BUILT_SOURCES += locking/qemu-sanlock.conf +endif else -EXTRA_DIST += $(LOCK_DRIVER_SANLOCK_SOURCES) +EXTRA_DIST += $(LOCK_DRIVER_SANLOCK_SOURCES) locking/sanlock.conf endif noinst_LTLIBRARIES += libvirt-net-rpc.la libvirt-net-rpc-server.la libvirt-net-rpc-client.la diff --git a/src/locking/libvirt_sanlock.aug b/src/locking/libvirt_sanlock.aug new file mode 100644 index 0000000..baa639a --- /dev/null +++ b/src/locking/libvirt_sanlock.aug @@ -0,0 +1,31 @@ +(* /etc/libvirt/qemu-sanlock.conf *) + +module Libvirt_sanlock = + autoload xfm + + let eol = del /[ \t]*\n/ "\n" + let value_sep = del /[ \t]*=[ \t]*/ " = " + let indent = del /[ \t]*/ "" + + let str_val = del /\"/ "\"" . store /[^\"]*/ . del /\"/ "\"" + let bool_val = store /0|1/ + let int_val = store /[0-9]+/ + + let str_entry (kw:string) = [ key kw . value_sep . str_val ] + let bool_entry (kw:string) = [ key kw . value_sep . bool_val ] + let int_entry (kw:string) = [ key kw . value_sep . int_val ] + + + (* Each enty in the config is one of the following three ... *) + let entry = bool_entry "require_lease_for_disks" + let comment = [ label "#comment" . del /#[ \t]*/ "# " . store /([^ \t\n][^\n]*)?/ . del /\n/ "\n" ] + let empty = [ label "#empty" . eol ] + + let record = indent . entry . 
eol + + let lns = ( record | comment | empty ) * + + let filter = incl "/etc/libvirt/qemu-sanlock.conf" + . Util.stdexcl + + let xfm = transform lns filter diff --git a/src/locking/lock_driver_sanlock.c b/src/locking/lock_driver_sanlock.c index 3eab23e..7ad54e8 100644 --- a/src/locking/lock_driver_sanlock.c +++ b/src/locking/lock_driver_sanlock.c @@ -45,9 +45,19 @@ virReportErrorHelper(VIR_FROM_THIS, code, __FILE__, \ __FUNCTION__, __LINE__, __VA_ARGS__) + +typedef struct _virLockManagerSanlockDriver virLockManagerSanlockDriver; +typedef virLockManagerSanlockDriver *virLockManagerSanlockDriverPtr; + typedef struct _virLockManagerSanlockPrivate virLockManagerSanlockPrivate; typedef virLockManagerSanlockPrivate *virLockManagerSanlockPrivatePtr; +struct _virLockManagerSanlockDriver { + bool requireLeaseForDisks; +}; + +static virLockManagerSanlockDriver *driver = NULL; + struct _virLockManagerSanlockPrivate { char vm_name[SANLK_NAME_LEN]; char vm_uuid[VIR_UUID_BUFLEN]; @@ -62,22 +72,76 @@ struct _virLockManagerSanlockPrivate { /* * sanlock plugin for the libvirt virLockManager API */ +static int virLockManagerSanlockLoadConfig(const char *configFile) +{ + virConfPtr conf; + virConfValuePtr p; + + if (access(configFile, R_OK) == -1) { + if (errno != ENOENT) { + virReportSystemError(errno, + _("Unable to access config file %s"), + configFile); + return -1; + } + return 0; + } + + if (!(conf = virConfReadFile(configFile, 0))) + return -1; + +#define CHECK_TYPE(name,typ) if (p && p->type != (typ)) { \ + virLockError(VIR_ERR_INTERNAL_ERROR, \ + "%s: %s: expected type " #typ, \ + configFile, (name)); \ + virConfFree(conf); \ + return -1; \ + } + + p = virConfGetValue(conf, "require_lease_for_disks"); + CHECK_TYPE("require_lease_for_disks", VIR_CONF_LONG); + if (p) + driver->requireLeaseForDisks = p->l; + + virConfFree(conf); + return 0; +} -static int virLockManagerSanlockInit(unsigned int version ATTRIBUTE_UNUSED, - const char *configFile ATTRIBUTE_UNUSED, + +static int 
virLockManagerSanlockInit(unsigned int version, + const char *configFile, unsigned int flags) { + VIR_DEBUG("version=%u configFile=%s flags=%u", version, NULLSTR(configFile), flags); virCheckFlags(0, -1); + + if (driver) + return 0; + + if (VIR_ALLOC(driver) < 0) { + virReportOOMError(); + return -1; + } + + driver->requireLeaseForDisks = true; + + if (virLockManagerSanlockLoadConfig(configFile) < 0) + return -1; + return 0; } static int virLockManagerSanlockDeinit(void) { + if (!driver) + return 0; + virLockError(VIR_ERR_INTERNAL_ERROR, "%s", _("Unloading sanlock plugin is forbidden")); return -1; } + static int virLockManagerSanlockNew(virLockManagerPtr lock, unsigned int type, size_t nparams, @@ -90,6 +154,12 @@ static int virLockManagerSanlockNew(virLockManagerPtr lock, virCheckFlags(0, -1); + if (!driver) { + virLockError(VIR_ERR_INTERNAL_ERROR, + _("Sanlock plugin is not initialized")); + return -1; + } + if (type != VIR_LOCK_MANAGER_OBJECT_TYPE_DOMAIN) { virLockError(VIR_ERR_INTERNAL_ERROR, _("Unsupported object type %d"), type); @@ -245,7 +315,8 @@ static int virLockManagerSanlockAcquire(virLockManagerPtr lock, VIR_LOCK_MANAGER_ACQUIRE_REGISTER_ONLY, -1); if (priv->res_count == 0 && - priv->hasRWDisks) { + priv->hasRWDisks && + driver->requireLeaseForDisks) { virLockError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("Read/write, exclusive access, disks were present, but no leases specified")); return -1; diff --git a/src/locking/sanlock.conf b/src/locking/sanlock.conf new file mode 100644 index 0000000..818b529 --- /dev/null +++ b/src/locking/sanlock.conf @@ -0,0 +1,6 @@ +# +# Flag to determine whether we allow starting of guests +# which do not have any <lease> elements defined in their +# configuration. 
+# +#require_lease_for_disks = 1 diff --git a/src/locking/test_libvirt_sanlock.aug b/src/locking/test_libvirt_sanlock.aug new file mode 100644 index 0000000..2f1f57a --- /dev/null +++ b/src/locking/test_libvirt_sanlock.aug @@ -0,0 +1,7 @@ +module Test_libvirt_sanlock = + + let conf = "require_lease_for_disks = 1 +" + + test Libvirt_sanlock.lns get conf = +{ "require_lease_for_disks" = "1" } -- 1.7.4.4

On 06/24/2011 07:02 AM, Daniel P. Berrange wrote:
Introduce a configuration file with a single parameter 'require_lease_for_disks', which decides whether a guest that has read/write disks, but no leases, is allowed to start.
* libvirt.spec.in: Add sanlock config file and augeas lens
* src/Makefile.am: Install sanlock config file and augeas lens
* src/locking/libvirt_sanlock.aug: Augeas master lens
* src/locking/test_libvirt_sanlock.aug: Augeas test file
* src/locking/sanlock.conf: Example sanlock config
* src/locking/lock_driver_sanlock.c: Wire up loading of configuration file
---
 libvirt.spec.in                      |    3 +
 src/Makefile.am                      |   25 ++++++++++-
 src/locking/libvirt_sanlock.aug      |   31 ++++++++++++++
 src/locking/lock_driver_sanlock.c    |   77 ++++++++++++++++++++++++++++++++-
 src/locking/sanlock.conf             |    6 +++
 src/locking/test_libvirt_sanlock.aug |    7 +++
 6 files changed, 144 insertions(+), 5 deletions(-)
 create mode 100644 src/locking/libvirt_sanlock.aug
 create mode 100644 src/locking/sanlock.conf
 create mode 100644 src/locking/test_libvirt_sanlock.aug
+EXTRA_DIST += locking/sanlock.conf \
+	locking/libvirt_sanlock.aug \
+	locking/test_libvirt_sanlock.aug
+
+$(builddir)/locking/%-sanlock.conf: $(srcdir)/locking/sanlock.conf
+	$(AM_V_GEN)mkdir locking ; \
+	cp $< $@
This still looks a bit complex to me; I think it might be possible to simplify by just adding $(srcdir)/locking/sanlock.conf to EXTRA_DIST rather than going through cp. But I'd rather see this commit go in now and save that for a later cleanup than miss getting this commit into the release, because it will take me some time to test my proposal with VPATH builds and 'make distcheck'.

ACK

-- 
Eric Blake   eblake@redhat.com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org

The current sanlock plugin requires a central management application to manually add <lease> elements to each guest, to protect resources that are assigned to it (e.g. writable disks). This makes the sanlock plugin useless in more ad hoc deployment environments where there is no central authority to associate disks with leases.

This patch adds a mode where the sanlock plugin automatically creates leases for each assigned read-write disk, using an MD5 checksum of the fully qualified disk path. This can work pretty well if guests use stable disk paths for block devices, e.g. /dev/disk/by-path/XXXX symlinks, or if all hosts have NFS volumes mounted in a consistent pattern.

The plugin will create one lockspace for managing disks, with the filename /var/lib/libvirt/sanlock/__LIBVIRT__DISKS__. For each VM disk, there will be another file holding a lease, e.g. /var/lib/libvirt/sanlock/5903e5d25e087e60a20fe4566fab41fd. Each VM disk lease is usually 1 MB in size. The script virt-sanlock-cleanup should be run periodically to remove unused lease files from the lockspace directory.

To make use of this capability the admin will need to do several tasks:
- Mount an NFS volume (or other shared filesystem) on /var/lib/libvirt/sanlock
- Configure 'host_id' in /etc/libvirt/qemu-sanlock.conf with a unique value for each host sharing the same NFS mount
- Enable the 'auto_disk_leases' parameter in qemu-sanlock.conf

Technically the first step can be skipped, in which case sanlock will only protect against two VMs on the same host using the same disk (or the same VM being started twice due to an error by libvirt).
* src/locking/libvirt_sanlock.aug, src/locking/sanlock.conf, src/locking/test_libvirt_sanlock.aug: Add config params for configuring auto lease setup * libvirt.spec.in: Add virt-sanlock-cleanup program, man page * tools/virt-sanlock-cleanup.in: Script to purge unused disk resource lease files --- libvirt.spec.in | 2 + src/locking/libvirt_sanlock.aug | 6 +- src/locking/lock_driver_sanlock.c | 449 ++++++++++++++++++++++++++++++--- src/locking/sanlock.conf | 60 +++++ src/locking/test_libvirt_sanlock.aug | 10 +- tools/.gitignore | 2 + tools/Makefile.am | 17 ++- tools/virt-sanlock-cleanup.in | 96 +++++++ 8 files changed, 598 insertions(+), 44 deletions(-) create mode 100644 tools/virt-sanlock-cleanup.in diff --git a/libvirt.spec.in b/libvirt.spec.in index 9b22983..c853847 100644 --- a/libvirt.spec.in +++ b/libvirt.spec.in @@ -1032,6 +1032,8 @@ fi %attr(0755, root, root) %{_libdir}/libvirt/lock-driver/sanlock.so %{_datadir}/augeas/lenses/libvirt_sanlock.aug %{_datadir}/augeas/lenses/tests/test_libvirt_sanlock.aug +%{_sbindir}/virt-sanlock-cleanup +%{_mandir}/man8/virt-sanlock-cleanup.8* %endif %files client -f %{name}.lang diff --git a/src/locking/libvirt_sanlock.aug b/src/locking/libvirt_sanlock.aug index baa639a..548181e 100644 --- a/src/locking/libvirt_sanlock.aug +++ b/src/locking/libvirt_sanlock.aug @@ -17,7 +17,11 @@ module Libvirt_sanlock = (* Each enty in the config is one of the following three ... *) - let entry = bool_entry "require_lease_for_disks" + let entry = str_entry "disk_lease_dir" + | bool_entry "auto_disk_leases" + | int_entry "max_hosts" + | int_entry "host_id" + | bool_entry "require_lease_for_disks" let comment = [ label "#comment" . del /#[ \t]*/ "# " . store /([^ \t\n][^\n]*)?/ . del /\n/ "\n" ] let empty = [ label "#empty" . 
eol ] diff --git a/src/locking/lock_driver_sanlock.c b/src/locking/lock_driver_sanlock.c index 7ad54e8..ad3a65a 100644 --- a/src/locking/lock_driver_sanlock.c +++ b/src/locking/lock_driver_sanlock.c @@ -28,9 +28,13 @@ #include <stdio.h> #include <errno.h> #include <sys/types.h> +#include <sys/stat.h> +#include <fcntl.h> #include <sanlock.h> #include <sanlock_resource.h> +#include <sanlock_direct.h> +#include <sanlock_admin.h> #include "lock_driver.h" #include "logging.h" @@ -38,6 +42,10 @@ #include "memory.h" #include "util.h" #include "files.h" +#include "md5.h" +#include "conf.h" + +#include "configmake.h" #define VIR_FROM_THIS VIR_FROM_LOCKING @@ -46,6 +54,8 @@ __FUNCTION__, __LINE__, __VA_ARGS__) +#define VIR_LOCK_MANAGER_SANLOCK_AUTO_DISK_LOCKSPACE "__LIBVIRT__DISKS__" + typedef struct _virLockManagerSanlockDriver virLockManagerSanlockDriver; typedef virLockManagerSanlockDriver *virLockManagerSanlockDriverPtr; @@ -54,6 +64,10 @@ typedef virLockManagerSanlockPrivate *virLockManagerSanlockPrivatePtr; struct _virLockManagerSanlockDriver { bool requireLeaseForDisks; + int hostID; + int maxHosts; + bool autoDiskLease; + char *autoDiskLeasePath; }; static virLockManagerSanlockDriver *driver = NULL; @@ -98,16 +112,155 @@ static int virLockManagerSanlockLoadConfig(const char *configFile) return -1; \ } + p = virConfGetValue(conf, "auto_disk_leases"); + CHECK_TYPE("auto_disk_leases", VIR_CONF_LONG); + if (p) driver->autoDiskLease = p->l; + + p = virConfGetValue(conf, "disk_lease_dir"); + CHECK_TYPE("disk_lease_dir", VIR_CONF_STRING); + if (p && p->str) { + VIR_FREE(driver->autoDiskLeasePath); + if (!(driver->autoDiskLeasePath = strdup(p->str))) { + virReportOOMError(); + virConfFree(conf); + return -1; + } + } + + p = virConfGetValue(conf, "max_hosts"); + CHECK_TYPE("max_hosts", VIR_CONF_LONG); + if (p) driver->maxHosts = p->l; + + p = virConfGetValue(conf, "host_id"); + CHECK_TYPE("host_id", VIR_CONF_LONG); + if (p) driver->hostID = p->l; + p = virConfGetValue(conf, 
"require_lease_for_disks"); CHECK_TYPE("require_lease_for_disks", VIR_CONF_LONG); if (p) driver->requireLeaseForDisks = p->l; + else + driver->requireLeaseForDisks = !driver->autoDiskLease; virConfFree(conf); return 0; } +static int virLockManagerSanlockSetupLockspace(void) +{ + int fd = -1; + struct stat st; + int rv; + struct sanlk_lockspace ls; + char *path = NULL; + + if (virAsprintf(&path, "%s/%s", + driver->autoDiskLeasePath, + VIR_LOCK_MANAGER_SANLOCK_AUTO_DISK_LOCKSPACE) < 0) { + virReportOOMError(); + goto error; + } + + memcpy(ls.name, VIR_LOCK_MANAGER_SANLOCK_AUTO_DISK_LOCKSPACE, SANLK_NAME_LEN); + ls.host_id = 0; /* Doesn't matter for initialization */ + ls.flags = 0; + memcpy(&ls.host_id_disk, path, sizeof(struct sanlk_disk)); + ls.host_id_disk.offset = 0; + + /* Stage 1: Ensure the lockspace file exists on disk, has + * space allocated for it and is initialized with lease + */ + if (stat(path, &st) < 0) { + VIR_DEBUG("Lockspace %s does not yet exist", path); + if ((fd = open(path, O_WRONLY|O_CREAT|O_EXCL, 0600)) < 0) { + if (errno != EEXIST) { + virReportSystemError(errno, + _("Unable to create lockspace %s"), + path); + goto error; + } + VIR_DEBUG("Someone else just created lockspace %s", path); + } else { + if ((rv = sanlock_direct_align(&ls.host_id_disk)) < 0) { + if (rv <= -200) + virLockError(VIR_ERR_INTERNAL_ERROR, + _("Unable to query sector size %s: error %d"), + path, rv); + else + virReportSystemError(-rv, + _("Unable to query sector size %s"), + path); + goto error_unlink; + } + + /* + * Pre allocate enough data for 1 block of leases at preferred alignment + */ + if (safezero(fd, 0, 0, rv) < 0) { + virReportSystemError(errno, + _("Unable to allocate lockspace %s"), + path); + goto error_unlink; + } + + if (VIR_CLOSE(fd) < 0) { + virReportSystemError(errno, + _("Unable to save lockspace %s"), + path); + goto error_unlink; + } + + if ((rv = sanlock_direct_init(&ls, NULL, 0, driver->maxHosts, 0)) < 0) { + if (rv <= -200) + 
virLockError(VIR_ERR_INTERNAL_ERROR, + _("Unable to initialize lockspace %s: error %d"), + path, rv); + else + virReportSystemError(-rv, + _("Unable to initialize lockspace %s"), + path); + goto error_unlink; + } + VIR_DEBUG("Lockspace %s has been initialized", path); + } + } + + ls.host_id = driver->hostID; + /* Stage 2: Try to register the lockspace with the daemon. + * If the lockspace is already registered, we should get EEXIST back + * in which case we can just carry on with life + */ + if ((rv = sanlock_add_lockspace(&ls, 0)) < 0) { + if (-rv != EEXIST) { + if (rv <= -200) + virLockError(VIR_ERR_INTERNAL_ERROR, + _("Unable to add lockspace %s: error %d"), + path, rv); + else + virReportSystemError(-rv, + _("Unable to add lockspace %s"), + path); + return -1; + } else { + VIR_DEBUG("Lockspace %s is already registered", path); + } + } else { + VIR_DEBUG("Lockspace %s has been registered", path); + } + + return 0; + +error_unlink: + if (path) + unlink(path); +error: + VIR_FORCE_CLOSE(fd); + VIR_FREE(path); + return -1; +} + +static int virLockManagerSanlockDeinit(void); static int virLockManagerSanlockInit(unsigned int version, const char *configFile, unsigned int flags) @@ -124,11 +277,32 @@ static int virLockManagerSanlockInit(unsigned int version, } driver->requireLeaseForDisks = true; + driver->hostID = 0; + driver->maxHosts = 64; + driver->autoDiskLease = false; + if (!(driver->autoDiskLeasePath = strdup(LOCALSTATEDIR "/lib/libvirt/sanlock"))) { + VIR_FREE(driver); + virReportOOMError(); + goto error; + } if (virLockManagerSanlockLoadConfig(configFile) < 0) - return -1; + goto error; + + if (driver->autoDiskLease && !driver->hostID) { + virLockError(VIR_ERR_INTERNAL_ERROR, + _("Automatic disk lease mode enabled, but no host ID is set")); + goto error; + } + + if (virLockManagerSanlockSetupLockspace() < 0) + goto error; return 0; + +error: + virLockManagerSanlockDeinit(); + return -1; } static int virLockManagerSanlockDeinit(void) @@ -136,9 +310,10 @@ static 
int virLockManagerSanlockDeinit(void) if (!driver) return 0; - virLockError(VIR_ERR_INTERNAL_ERROR, "%s", - _("Unloading sanlock plugin is forbidden")); - return -1; + VIR_FREE(driver->autoDiskLeasePath); + VIR_FREE(driver); + + return 0; } @@ -214,51 +389,50 @@ static void virLockManagerSanlockFree(virLockManagerPtr lock) lock->privateData = NULL; } -static int virLockManagerSanlockAddResource(virLockManagerPtr lock, - unsigned int type, - const char *name, - size_t nparams, - virLockManagerParamPtr params, - unsigned int flags) + +static const char hex[] = { '0', '1', '2', '3', '4', '5', '6', '7', + '8', '9', 'a', 'b', 'c', 'd', 'e', 'f' }; + +static int virLockManagerSanlockDiskLeaseName(const char *path, + char *str, + size_t strbuflen) { - virLockManagerSanlockPrivatePtr priv = lock->privateData; - struct sanlk_resource *res; + unsigned char buf[MD5_DIGEST_SIZE]; int i; - virCheckFlags(VIR_LOCK_MANAGER_RESOURCE_READONLY | - VIR_LOCK_MANAGER_RESOURCE_SHARED, -1); - - if (priv->res_count == SANLK_MAX_RESOURCES) { + if (strbuflen < ((MD5_DIGEST_SIZE * 2) + 1)) { virLockError(VIR_ERR_INTERNAL_ERROR, - _("Too many resources %d for object"), - SANLK_MAX_RESOURCES); + _("String length too small to store md5 checksum")); return -1; } - if (type == VIR_LOCK_MANAGER_RESOURCE_TYPE_DISK) { - if (!(flags & (VIR_LOCK_MANAGER_RESOURCE_SHARED | - VIR_LOCK_MANAGER_RESOURCE_READONLY))) - priv->hasRWDisks = true; - return 0; - } - - if (type != VIR_LOCK_MANAGER_RESOURCE_TYPE_LEASE) - return 0; - - if (flags & VIR_LOCK_MANAGER_RESOURCE_READONLY) { - virLockError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", - _("Readonly leases are not supported")); + if (!(md5_buffer(path, strlen(path), buf))) { + virLockError(VIR_ERR_INTERNAL_ERROR, + _("Unable to compute md5 checksum")); return -1; } - if (flags & VIR_LOCK_MANAGER_RESOURCE_SHARED) { - virLockError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", - _("Sharable leases are not supported")); - return -1; + + for (i = 0 ; i < MD5_DIGEST_SIZE ; i++) { + 
str[i*2] = hex[(buf[i] >> 4) & 0xf]; + str[(i*2)+1] = hex[buf[i] & 0xf]; } + str[(MD5_DIGEST_SIZE*2)+1] = '\0'; + return 0; +} + +static int virLockManagerSanlockAddLease(virLockManagerPtr lock, + const char *name, + size_t nparams, + virLockManagerParamPtr params) +{ + virLockManagerSanlockPrivatePtr priv = lock->privateData; + int ret = -1; + struct sanlk_resource *res = NULL; + int i; if (VIR_ALLOC_VAR(res, struct sanlk_disk, 1) < 0) { virReportOOMError(); - return -1; + goto cleanup; } res->num_disks = 1; @@ -266,7 +440,7 @@ static int virLockManagerSanlockAddResource(virLockManagerPtr lock, virLockError(VIR_ERR_INTERNAL_ERROR, _("Resource name '%s' exceeds %d characters"), name, SANLK_NAME_LEN); - goto error; + goto cleanup; } for (i = 0; i < nparams; i++) { @@ -275,7 +449,7 @@ static int virLockManagerSanlockAddResource(virLockManagerPtr lock, virLockError(VIR_ERR_INTERNAL_ERROR, _("Lease path '%s' exceeds %d characters"), params[i].value.str, SANLK_PATH_LEN); - goto error; + goto cleanup; } } else if (STREQ(params[i].key, "offset")) { res->disks[0].offset = params[i].value.ul; @@ -284,20 +458,213 @@ static int virLockManagerSanlockAddResource(virLockManagerPtr lock, virLockError(VIR_ERR_INTERNAL_ERROR, _("Resource lockspace '%s' exceeds %d characters"), params[i].value.str, SANLK_NAME_LEN); - goto error; + goto cleanup; } } } priv->res_args[priv->res_count] = res; priv->res_count++; + + ret = 0; + +cleanup: + if (ret == -1) + VIR_FREE(res); + return ret; +} + + + + +static int virLockManagerSanlockAddDisk(virLockManagerPtr lock, + const char *name, + size_t nparams, + virLockManagerParamPtr params ATTRIBUTE_UNUSED) +{ + virLockManagerSanlockPrivatePtr priv = lock->privateData; + int ret = -1; + struct sanlk_resource *res = NULL; + char *path = NULL; + + if (nparams) { + virLockError(VIR_ERR_INTERNAL_ERROR, + _("Unexpected lock parameters for disk resource")); + return -1; + } + + if (VIR_ALLOC_VAR(res, struct sanlk_disk, 1) < 0) { + virReportOOMError(); + 
goto cleanup; + } + + res->num_disks = 1; + if (virLockManagerSanlockDiskLeaseName(name, res->name, SANLK_NAME_LEN) < 0) + goto cleanup; + + if (virAsprintf(&path, "%s/%s", + driver->autoDiskLeasePath, res->name) < 0) { + virReportOOMError(); + goto cleanup; + } + if (!virStrcpy(res->disks[0].path, path, SANLK_PATH_LEN)) { + virLockError(VIR_ERR_INTERNAL_ERROR, + _("Lease path '%s' exceeds %d characters"), + path, SANLK_PATH_LEN); + goto cleanup; + } + + if (!virStrcpy(res->lockspace_name, + VIR_LOCK_MANAGER_SANLOCK_AUTO_DISK_LOCKSPACE, + SANLK_NAME_LEN)) { + virLockError(VIR_ERR_INTERNAL_ERROR, + _("Resource lockspace '%s' exceeds %d characters"), + VIR_LOCK_MANAGER_SANLOCK_AUTO_DISK_LOCKSPACE, SANLK_NAME_LEN); + goto cleanup; + } + + priv->res_args[priv->res_count] = res; + priv->res_count++; + + ret = 0; + +cleanup: + if (ret == -1) + VIR_FREE(res); + VIR_FREE(path); + return ret; +} + + +static int virLockManagerSanlockCreateLease(struct sanlk_resource *res) +{ + int fd = -1; + struct stat st; + int rv; + + if (stat(res->disks[0].path, &st) < 0) { + VIR_DEBUG("Lockspace %s does not yet exist", res->disks[0].path); + if ((fd = open(res->disks[0].path, O_WRONLY|O_CREAT|O_EXCL, 0600)) < 0) { + if (errno != EEXIST) { + virReportSystemError(errno, + _("Unable to create lockspace %s"), + res->disks[0].path); + return -1; + } + VIR_DEBUG("Someone else just created lockspace %s", res->disks[0].path); + } else { + if ((rv = sanlock_direct_align(&res->disks[0])) < 0) { + if (rv <= -200) + virLockError(VIR_ERR_INTERNAL_ERROR, + _("Unable to query sector size %s: error %d"), + res->disks[0].path, rv); + else + virReportSystemError(-rv, + _("Unable to query sector size %s"), + res->disks[0].path); + goto error_unlink; + } + + /* + * Pre allocate enough data for 1 block of leases at preferred alignment + */ + if (safezero(fd, 0, 0, rv) < 0) { + virReportSystemError(errno, + _("Unable to allocate lease %s"), + res->disks[0].path); + goto error_unlink; + } + + if 
(VIR_CLOSE(fd) < 0) { + virReportSystemError(errno, + _("Unable to save lease %s"), + res->disks[0].path); + goto error_unlink; + } + + if ((rv = sanlock_direct_init(NULL, res, driver->maxHosts, driver->maxHosts, 0)) < 0) { + if (rv <= -200) + virLockError(VIR_ERR_INTERNAL_ERROR, + _("Unable to initialize lease %s: error %d"), + res->disks[0].path, rv); + else + virReportSystemError(-rv, + _("Unable to initialize lease %s"), + res->disks[0].path); + goto error_unlink; + } + VIR_DEBUG("Lease %s has been initialized", res->disks[0].path); + } + } + return 0; -error: - VIR_FREE(res); +error_unlink: + unlink(res->disks[0].path); + VIR_FORCE_CLOSE(fd); return -1; } + +static int virLockManagerSanlockAddResource(virLockManagerPtr lock, + unsigned int type, + const char *name, + size_t nparams, + virLockManagerParamPtr params, + unsigned int flags) +{ + virLockManagerSanlockPrivatePtr priv = lock->privateData; + + virCheckFlags(VIR_LOCK_MANAGER_RESOURCE_READONLY | + VIR_LOCK_MANAGER_RESOURCE_SHARED, -1); + + if (priv->res_count == SANLK_MAX_RESOURCES) { + virLockError(VIR_ERR_INTERNAL_ERROR, + _("Too many resources %d for object"), + SANLK_MAX_RESOURCES); + return -1; + } + + if (flags & VIR_LOCK_MANAGER_RESOURCE_READONLY) { + virLockError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("Readonly leases are not supported")); + return -1; + } + if (flags & VIR_LOCK_MANAGER_RESOURCE_SHARED) { + virLockError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("Sharable leases are not supported")); + return -1; + } + + switch (type) { + case VIR_LOCK_MANAGER_RESOURCE_TYPE_DISK: + if (driver->autoDiskLease) { + if (virLockManagerSanlockAddDisk(lock, name, nparams, params) < 0) + return -1; + + if (virLockManagerSanlockCreateLease(priv->res_args[priv->res_count-1]) < 0) + return -1; + } else { + if (!(flags & (VIR_LOCK_MANAGER_RESOURCE_SHARED | + VIR_LOCK_MANAGER_RESOURCE_READONLY))) + priv->hasRWDisks = true; + /* Ignore disk resources without error */ + } + break; + + case 
VIR_LOCK_MANAGER_RESOURCE_TYPE_LEASE: + if (virLockManagerSanlockAddLease(lock, name, nparams, params) < 0) + return -1; + break; + + default: + /* Ignore other resources, without error */ + break; + } + + return 0; +} + static int virLockManagerSanlockAcquire(virLockManagerPtr lock, const char *state, unsigned int flags) diff --git a/src/locking/sanlock.conf b/src/locking/sanlock.conf index 818b529..19a8770 100644 --- a/src/locking/sanlock.conf +++ b/src/locking/sanlock.conf @@ -1,6 +1,66 @@ + +# +# The default sanlock configuration requires the management +# application to manually define <lease> elements in the +# guest configuration, typically one lease per disk. An +# alternative is to enable "auto disk lease" mode. In this +# usage, libvirt will automatically create a lockspace and +# lease for each fully qualified disk path. This works if +# you are able to ensure stable, unique disk paths across +# all hosts in a network. +# +# Uncomment this to enable automatic lease creation. +# +# NB: the 'host_id' parameter must be set if enabling this +# +#auto_disk_leases = 1 + +# +# The default location in which lockspaces are created when +# automatic lease creation is enabled. For each unique disk +# path, a file $LEASE_DIR/NNNNNNNNNNNNNN will be created +# where 'NNNNNNNNNNNNNN' is the MD5 checksum of the disk path. +# +# If this directory is on local storage, it will only protect +# against a VM being started twice on the same host, or two +# guests on the same host using the same disk path. If the +# directory is on NFS, then it can protect against concurrent +# usage across all hosts which have the share mounted. +# +# Recommendation is to just mount this default location as +# an NFS volume. Uncomment this, if you would prefer the mount +# point to be somewhere else. 
+# +#disk_lease_dir = "/var/lib/libvirt/sanlock" + +# +# When automatically creating leases for disks, each host which +# can access the filesystem mounted at 'disk_lease_dir' will need +# to have a unique host ID assigned. The 'max_hosts' parameter +# specifies an upper limit on the number of participating hosts. +# +# Each additional host requires 1 sector of disk space, usually +# 512 bytes. The default is 64, and can be reduced if you don't +# have many hosts, or increased if you have more. +# +#max_hosts = 64 + +# +# The unique ID for this host. +# +# IMPORTANT: *EVERY* host which can access the filesystem mounted +# at 'disk_lease_dir' *MUST* be given a different host ID. +# +# This parameter has no default and must be manually set if +# 'auto_disk_leases' is enabled +#host_id = 1 + # # Flag to determine whether we allow starting of guests # which do not have any <lease> elements defined in their # configuration. # +# If 'auto_disk_leases' is disabled, this setting defaults +# to enabled, otherwise it defaults to disabled. 
+# #require_lease_for_disks = 1 diff --git a/src/locking/test_libvirt_sanlock.aug b/src/locking/test_libvirt_sanlock.aug index 2f1f57a..26a0608 100644 --- a/src/locking/test_libvirt_sanlock.aug +++ b/src/locking/test_libvirt_sanlock.aug @@ -1,7 +1,15 @@ module Test_libvirt_sanlock = - let conf = "require_lease_for_disks = 1 + let conf = "auto_disk_leases = 1 +disk_lease_dir = \"/var/lib/libvirt/sanlock\" +max_hosts = 64 +host_id = 1 +require_lease_for_disks = 1 " test Libvirt_sanlock.lns get conf = +{ "auto_disk_leases" = "1" } +{ "disk_lease_dir" = "/var/lib/libvirt/sanlock" } +{ "max_hosts" = "64" } +{ "host_id" = "1" } { "require_lease_for_disks" = "1" } diff --git a/tools/.gitignore b/tools/.gitignore index 3e34ea8..0f7a25e 100644 --- a/tools/.gitignore +++ b/tools/.gitignore @@ -1,7 +1,9 @@ libvirt-guests.init virt-xml-validate virt-pki-validate +virt-sanlock-cleanup *.1 +*.8 Makefile Makefile.in virsh-net-edit.c diff --git a/tools/Makefile.am b/tools/Makefile.am index 909c4b2..ed38396 100644 --- a/tools/Makefile.am +++ b/tools/Makefile.am @@ -12,6 +12,7 @@ EXTRA_DIST = \ $(ICON_FILES) \ virt-xml-validate.in \ virt-pki-validate.in \ + virt-sanlock-cleanup.in \ virsh.pod \ libvirt-guests.init.sh \ libvirt-guests.sysconf @@ -19,8 +20,14 @@ EXTRA_DIST = \ bin_SCRIPTS = virt-xml-validate virt-pki-validate bin_PROGRAMS = virsh -dist_man1_MANS = virt-xml-validate.1 virt-pki-validate.1 virsh.1 +if HAVE_SANLOCK +sbin_SCRIPTS = virt-sanlock-cleanup +endif +dist_man1_MANS = virt-xml-validate.1 virt-pki-validate.1 virsh.1 +if HAVE_SANLOCK +dist_man8_MANS = virt-sanlock-cleanup.8 +endif virt-xml-validate: virt-xml-validate.in Makefile $(AM_V_GEN)sed -e 's,[@]SCHEMADIR@,$(pkgdatadir)/schemas,' < $< > $@ \ @@ -36,6 +43,14 @@ virt-pki-validate: virt-pki-validate.in Makefile virt-pki-validate.1: virt-pki-validate.in $(AM_V_GEN)$(POD2MAN) $< $(srcdir)/$@ +virt-sanlock-cleanup: virt-sanlock-cleanup.in Makefile + $(AM_V_GEN)sed -e 's,[@]SYSCONFDIR@,$(sysconfdir),' \ + -e 
's,[@]LOCALSTATEDIR@,$(localstatedir),' < $< > $@ \ + || (rm $@ && exit 1) && chmod +x $@ + +virt-sanlock-cleanup.8: virt-sanlock-cleanup.in + $(AM_V_GEN)$(POD2MAN) $< $(srcdir)/$@ + virsh_SOURCES = \ console.c console.h \ virsh.c diff --git a/tools/virt-sanlock-cleanup.in b/tools/virt-sanlock-cleanup.in new file mode 100644 index 0000000..98daef6 --- /dev/null +++ b/tools/virt-sanlock-cleanup.in @@ -0,0 +1,96 @@ +#!/bin/sh + +# A script to cleanup resource leases auto-created by +# the libvirt lock plugin for sanlock + +verbose=1 +if test "$1" = "-q" ; then + verbose=0 +fi + +LOCKSPACE="__LIBVIRT__DISKS__" + +LOCKDIR=`augtool print '/files@SYSCONFDIR@/libvirt/qemu-sanlock.conf/disk_lease_dir'` +if test $? != 0 || "x$LOCKDIR" = "x" ; then + LOCKDIR="@LOCALSTATEDIR@/lib/libvirt/sanlock" +fi + +notify() { + test $verbose = 1 || return + if test "x$1" = "x-n"; then + shift + printf %s "$*" + else + printf %s\\n "$*" + fi +} + +cd "$LOCKDIR" || exit 1 + +for MD5 in * +do + if test $MD5 != '*' && $MD5 != $LOCKSPACE ; then + RESOURCE="$LOCKSPACE:$MD5:$LOCKDIR/$MD5:0" + notify -n "Cleanup: $RESOURCE " + sanlock client command -r $RESOURCE -c /bin/rm -f "$LOCKDIR/$MD5" 2>/dev/null + if test $? = 0 ; then + notify "PASS" + else + notify "FAIL" + fi + fi +done + +exit 0 + +: <<=cut +=pod + +=head1 NAME + + virt-sanlock-cleanup - remove stale sanlock resource lease files + +=head1 SYNOPSIS + + virt-sanlock-cleanup + +=head1 DESCRIPTION + +This tool removes any resource lease files created by the sanlock +lock manager plugin. The resource lease files only need to exist +on disks when a guest using the resource is active. This script +reclaims the disk space used by resources which are not currently +active. + +=head1 EXIT STATUS + +Upon successful processing of leases cleanup, the exit status +will be 0 will be set. Upon fatal eror a non-zero status will +be set. 
+ +=head1 AUTHOR + +Daniel Berrange + +=head1 BUGS + +Report any bugs discovered to the libvirt community via the +mailing list C<http://libvirt.org/contact.html> or bug tracker C<http://libvirt.org/bugs.html>. +Alternatively report bugs to your software distributor / vendor. + +=head1 COPYRIGHT + +Copyright (C) 2011 Red Hat, Inc. + +=head1 LICENSE + +virt-sanlock-cleanup is distributed under the terms of the GNU GPL v2+. +This is free software; see the source for copying conditions. There +is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR +PURPOSE + +=head1 SEE ALSO + +C<virsh(1)>, online instructions C<http://libvirt.org/locksanlock.html> + +=cut -- 1.7.4.4
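Both the cleanup script above and the auto disk lease code name each lease file after the MD5 hex digest of the disk's fully qualified path. As a rough sketch of that naming scheme (not libvirt's actual C implementation, and assuming coreutils `md5sum` is available), the lease name for a given disk can be derived like this:

```shell
# Derive the auto-lease file name for a disk: the MD5 hex digest of
# the disk's fully qualified path. The path below is only an example.
disk_path=/var/lib/libvirt/images/demo.img
lease_name=$(printf %s "$disk_path" | md5sum | cut -d' ' -f1)
echo "$lease_name"   # a 32-character lowercase hex string
```

This is also the name the cleanup script iterates over in its `for MD5 in *` loop.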

On 06/24/2011 07:02 AM, Daniel P. Berrange wrote:
The current sanlock plugin requires a central management application to manually add <lease> elements to each guest, to protect resources that are assigned to it (eg writable disks). This makes the sanlock plugin useless for usage in more adhoc deployment environments where there is no
s/adhoc/ad hoc/
To make use of this capability the admin will need todo several tasks:
s/todo/to do/
+static int virLockManagerSanlockSetupLockspace(void) +{ + int fd = -1; + struct stat st; + int rv; + struct sanlk_lockspace ls; + char *path = NULL; + + if (virAsprintf(&path, "%s/%s", + driver->autoDiskLeasePath, + VIR_LOCK_MANAGER_SANLOCK_AUTO_DISK_LOCKSPACE) < 0) { + virReportOOMError(); + goto error; + } + + memcpy(ls.name, VIR_LOCK_MANAGER_SANLOCK_AUTO_DISK_LOCKSPACE, SANLK_NAME_LEN); + ls.host_id = 0; /* Doesn't matter for initialization */ + ls.flags = 0; + memcpy(&ls.host_id_disk, path, sizeof(struct sanlk_disk));
Dangerous. This could copy more bytes than strlen(path) and cross a page boundary, or it could silently truncate if strlen(path) is longer than sizeof(struct sanlk_disk). I think you need to use virStrncpy here.
-static int virLockManagerSanlockAddResource(virLockManagerPtr lock, - unsigned int type, - const char *name, - size_t nparams, - virLockManagerParamPtr params, - unsigned int flags) + +static const char hex[] = { '0', '1', '2', '3', '4', '5', '6', '7', + '8', '9', 'a', 'b', 'c', 'd', 'e', 'f' };
I would have written this: static const char hex[] = "0123456789abcdef"; but either way works.
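One small difference worth noting: in C the string-literal form also appends a trailing NUL, so `sizeof(hex)` becomes 17 rather than 16. That is harmless here since only indices 0..15 are ever read. The 16 digits themselves are easy to sanity-check; counting the bytes should report 16:

```shell
# Count the digits in the proposed string-literal table; both C
# initializers encode the same 16 bytes (the literal just adds a
# trailing NUL in C, which the code never indexes).
printf %s "0123456789abcdef" | wc -c
```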
+ +static int virLockManagerSanlockCreateLease(struct sanlk_resource *res) +{ + int fd = -1; + struct stat st; + int rv; + + if (stat(res->disks[0].path, &st) < 0) { + VIR_DEBUG("Lockspace %s does not yet exist", res->disks[0].path); + if ((fd = open(res->disks[0].path, O_WRONLY|O_CREAT|O_EXCL, 0600)) < 0) {
This feels like common copy-and-paste code; should attaching to the lockspace be factored into a common helper routine?
+# +# When automatically creating leases for disks, each host which +# can access the filesystem mounted at 'disk_lease_dir' will need +# to have a unique host ID assigned. The 'max_hosts' parameter +# specifies an upper limit on the number of participating hosts. +# +# Each additional host requires 1 sector of disk space, usually +# 512 bytes. The default is 64, and can be reduced if you don't +# have many hosts, or increased if you have more. +# +#max_hosts = 64
Weren't there some list comments about either dropping this parameter, or changing it to 2000, and that sanlock has changed to use more than 512 bytes per host now? http://www.redhat.com/archives/libvir-list/2011-June/msg01237.html
+++ b/tools/Makefile.am @@ -12,6 +12,7 @@ EXTRA_DIST = \ $(ICON_FILES) \ virt-xml-validate.in \ virt-pki-validate.in \ + virt-sanlock-cleanup.in \ virsh.pod \ libvirt-guests.init.sh \ libvirt-guests.sysconf @@ -19,8 +20,14 @@ EXTRA_DIST = \ bin_SCRIPTS = virt-xml-validate virt-pki-validate bin_PROGRAMS = virsh
-dist_man1_MANS = virt-xml-validate.1 virt-pki-validate.1 virsh.1 +if HAVE_SANLOCK +sbin_SCRIPTS = virt-sanlock-cleanup +endif
+dist_man1_MANS = virt-xml-validate.1 virt-pki-validate.1 virsh.1 +if HAVE_SANLOCK +dist_man8_MANS = virt-sanlock-cleanup.8
Ouch. Conditional distribution...
+endif
virt-xml-validate: virt-xml-validate.in Makefile $(AM_V_GEN)sed -e 's,[@]SCHEMADIR@,$(pkgdatadir)/schemas,' < $< > $@ \ @@ -36,6 +43,14 @@ virt-pki-validate: virt-pki-validate.in Makefile virt-pki-validate.1: virt-pki-validate.in $(AM_V_GEN)$(POD2MAN) $< $(srcdir)/$@
+virt-sanlock-cleanup: virt-sanlock-cleanup.in Makefile + $(AM_V_GEN)sed -e 's,[@]SYSCONFDIR@,$(sysconfdir),' \ + -e 's,[@]LOCALSTATEDIR@,$(localstatedir),' < $< > $@ \ + || (rm $@ && exit 1) && chmod +x $@ + +virt-sanlock-cleanup.8: virt-sanlock-cleanup.in + $(AM_V_GEN)$(POD2MAN) $< $(srcdir)/$@
...that depends on perl. Commit 6db98a2d4b previously fixed a bug just like this; the idea being that we have to unconditionally build virt-sanlock-cleanup.8.in using $(POD2MAN) and put that in the tarball, then conditionally build virt-sanlock-cleanup.8 using only sed, so that end users can run 'make dist' regardless of configure options and can still avoid perl. But since I wrote commit 6db98a2d4b, I'm happy to clean this up as a separate followup patch, if you'd like.
+++ b/tools/virt-sanlock-cleanup.in @@ -0,0 +1,96 @@ +#!/bin/sh + +# A script to cleanup resource leases auto-created by +# the libvirt lock plugin for sanlock + +verbose=1 +if test "$1" = "-q" ; then
if test "x$1" = "x-q" ; then

The leading x is necessary, to prevent test from going haywire if $1 happens to be something like [.
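An alternative that sidesteps `test` operand parsing entirely is a `case` statement; this is just a sketch of the idiom (the helper name is made up), not what the patch uses:

```shell
# case patterns never misparse their word as an operator, so no
# x-prefix is needed even for awkward arguments like "[" or "-q".
is_quiet() {
    case $1 in
        -q) echo 1 ;;
        *)  echo 0 ;;
    esac
}
is_quiet -q    # prints 1
is_quiet '['   # prints 0
```

The same pattern would fit the script's `-q` check without relying on the x-prefix convention.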
+=head1 EXIT STATUS + +Upon successful processing of leases cleanup, the exit status +will be 0 will be set. Upon fatal eror a non-zero status will
s/eror/error/ ACK with the spelling and memcpy fixes, and up to you whether you make the remaining suggested changes now or leave them for followups. -- Eric Blake eblake@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org

On Tue, Jun 28, 2011 at 11:02:24AM -0600, Eric Blake wrote:
+ if (virAsprintf(&path, "%s/%s", + driver->autoDiskLeasePath, + VIR_LOCK_MANAGER_SANLOCK_AUTO_DISK_LOCKSPACE) < 0) { + virReportOOMError(); + goto error; + } + + memcpy(ls.name, VIR_LOCK_MANAGER_SANLOCK_AUTO_DISK_LOCKSPACE, SANLK_NAME_LEN); + ls.host_id = 0; /* Doesn't matter for initialization */ + ls.flags = 0; + memcpy(&ls.host_id_disk, path, sizeof(struct sanlk_disk));
Dangerous. This could copy more bytes than strlen(path) and cross a page boundary, or it could silently truncate if strlen(path) is longer than sizeof(struct sanlk_disk). I think you need to use virStrncpy here.
That line is so bogus. It has been replaced with

    if (!virStrcpy(ls.host_id_disk.path, path, SANLK_NAME_LEN)) {
        virLockError(VIR_ERR_INTERNAL_ERROR,
                     _("Lockspace path '%s' exceeded %d characters"),
                     path, SANLK_NAME_LEN);
        goto error;
    }
+# When automatically creating leases for disks, each host which +# can access the filesystem mounted at 'disk_lease_dir' will need +# to have a unique host ID assigned. The 'max_hosts' parameter +# specifies an upper limit on the number of participating hosts. +# +# Each additional host requires 1 sector of disk space, usually +# 512 bytes. The default is 64, and can be reduced if you don't +# have many hosts, or increased if you have more. +# +#max_hosts = 64
Weren't there some list comments about either dropping this parameter, or changing it to 2000, and that sanlock has changed to use more than 512 bytes per host now? http://www.redhat.com/archives/libvir-list/2011-June/msg01237.html
Yes, I've killed off 'max_hosts' now.
+++ b/tools/Makefile.am @@ -12,6 +12,7 @@ EXTRA_DIST = \ $(ICON_FILES) \ virt-xml-validate.in \ virt-pki-validate.in \ + virt-sanlock-cleanup.in \ virsh.pod \ libvirt-guests.init.sh \ libvirt-guests.sysconf @@ -19,8 +20,14 @@ EXTRA_DIST = \ bin_SCRIPTS = virt-xml-validate virt-pki-validate bin_PROGRAMS = virsh
-dist_man1_MANS = virt-xml-validate.1 virt-pki-validate.1 virsh.1 +if HAVE_SANLOCK +sbin_SCRIPTS = virt-sanlock-cleanup +endif
+dist_man1_MANS = virt-xml-validate.1 virt-pki-validate.1 virsh.1 +if HAVE_SANLOCK +dist_man8_MANS = virt-sanlock-cleanup.8
Ouch. Conditional distribution...
+endif
virt-xml-validate: virt-xml-validate.in Makefile $(AM_V_GEN)sed -e 's,[@]SCHEMADIR@,$(pkgdatadir)/schemas,' < $< > $@ \ @@ -36,6 +43,14 @@ virt-pki-validate: virt-pki-validate.in Makefile virt-pki-validate.1: virt-pki-validate.in $(AM_V_GEN)$(POD2MAN) $< $(srcdir)/$@
+virt-sanlock-cleanup: virt-sanlock-cleanup.in Makefile + $(AM_V_GEN)sed -e 's,[@]SYSCONFDIR@,$(sysconfdir),' \ + -e 's,[@]LOCALSTATEDIR@,$(localstatedir),' < $< > $@ \ + || (rm $@ && exit 1) && chmod +x $@ + +virt-sanlock-cleanup.8: virt-sanlock-cleanup.in + $(AM_V_GEN)$(POD2MAN) $< $(srcdir)/$@
...that depends on perl. Commit 6db98a2d4b previously fixed a bug just like this; the idea being that we have to unconditionally build virt-sanlock-cleanup.8.in using $(POD2MAN) and put that in the tarball, then conditionally build virt-sanlock-cleanup.8 using only sed, so that end users can run 'make dist' regardless of configure options and can still avoid perl.
But since I wrote commit 6db98a2d4b, I'm happy to clean this up as a separate followup patch, if you'd like.
Ok, I'll let you tackle it
+# A script to cleanup resource leases auto-created by +# the libvirt lock plugin for sanlock + +verbose=1 +if test "$1" = "-q" ; then
if test "x$1" = "x-q" ; then
The leading x is necessary, to prevent test from going haywire if $1 happens to be something like [.
Done that Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

Add a page which documents how to configure lock managers, focusing on use of sanlock with the QEMU/KVM driver * docs/locking.html.in: Docs about lock managers * docs/sitemap.html.in: Add lock manager config to the deployment section --- docs/locking.html.in | 212 ++++++++++++++++++++++++++++++++++++++++++++++++++ docs/sitemap.html.in | 4 + 2 files changed, 216 insertions(+), 0 deletions(-) create mode 100644 docs/locking.html.in diff --git a/docs/locking.html.in b/docs/locking.html.in new file mode 100644 index 0000000..068857d --- /dev/null +++ b/docs/locking.html.in @@ -0,0 +1,212 @@ +<?xml version="1.0"?> +<html> + <body> + <h1>Virtual machine disk locking</h1> + + <ul id="toc"></ul> + + <p> + This page describes how to ensure a single disk cannot be + used by more than one running VM at a time, across any + host in a network. This is critical to avoid data corruption + of guest file systems that are not cluster aware. + </p> + + <h2><a name="plugins">Lock manager plugins</a></h2> + + <p> + libvirt includes a pluggable framework for lock managers, + which hypervisor drivers can use to ensure safety for + guest domain disks, and potentially other resources. + At this time there are only two plugin implementations, + a "no op" implementation which does absolutely nothing, + and a <a href="https://fedorahosted.org/sanlock/">sanlock</a> implementation which uses + the Disk Paxos algorithm to ensure safety. + </p> + + <h2><a name="sanlock">Sanlock daemon setup</a></h2> + + <p> + On many operating systems, the <strong>sanlock</strong> plugin + is distributed in a sub-package which needs to be installed + separately from the main libvirt RPM. On a Fedora/RHEL host + this can be done with the <code>yum</code> command + </p> + + <pre> + $ su - root + # yum install libvirt-lock-sanlock + </pre> + + <p> + The next step is to start the sanlock daemon. For maximum + safety sanlock prefers to have a connection to a watchdog + daemon. 
This will cause the entire host to be rebooted in + the event that sanlock crashes / terminates abnormally. + To start the watchdog daemon on a Fedora/RHEL host + the following commands can be run: + </p> + + <pre> + $ su - root + # chkconfig wdmd on + # service wdmd start + </pre> + + <p> + Once the watchdog is running, sanlock can be started + as follows + </p> + + <pre> + # chkconfig sanlock on + # service sanlock start + </pre> + + <p> + <em>Note:</em> if you wish to avoid the use of the + watchdog, add the following line to <code>/etc/sysconfig/sanlock</code> + before starting it + </p> + + <pre> + SANLOCKOPTS="-w 0" + </pre> + + <p> + The sanlock daemon must be started on every single host + that will be running virtual machines. So repeat these + steps as necessary. + </p> + + <h2><a name="sanlockplugin">libvirt sanlock plugin configuration</a></h2> + + <p> + Once the sanlock daemon is running, the next step is to + configure the libvirt sanlock plugin. There is a separate + configuration file for each libvirt driver that is using + sanlock. For QEMU, we will edit <code>/etc/libvirt/qemu-sanlock.conf</code>. + There is one mandatory parameter that needs to be set, + the <code>host_id</code>. This is an integer between + 1 and 2000, which must be set to a <strong>unique</strong> + value on each host running virtual machines + </p> + + <pre> + $ su - root + # augtool -s set /files/etc/libvirt/qemu-sanlock.conf/host_id 1 + </pre> + + <p> + Repeat this on every host, changing <strong>1</strong> to a + unique value for the host. + </p> + + <h2><a name="sanlockstorage">libvirt sanlock storage configuration</a></h2> + + <p> + The sanlock plugin needs to create leases in a directory + that is on a filesystem shared between all hosts running + virtual machines. Obvious choices for this include NFS, + or GFS2. 
The libvirt sanlock plugin expects its lease + directory to be at <code>/var/lib/libvirt/sanlock</code> + so update the host's <code>/etc/fstab</code> to mount + a suitable shared/cluster filesystem at that location + </p> + + <pre> + $ su - root + # echo "some.nfs.server:/export/sanlock /var/lib/libvirt/sanlock nfs hard,nointr 0 0" >> /etc/fstab + # mount /var/lib/libvirt/sanlock + </pre> + + <p> + In terms of storage requirements, if the filesystem + uses 512 byte sectors, you need to allow for <code>1MB</code> + of storage for each guest disk. So if you have a network + with 20 virtualization hosts, each running 50 virtual + machines and an average of 2 disks per guest, you will + need <code>20*50*2 == 2000 MB</code> of storage for + sanlock. + </p> + + + <p> + On one of the hosts on the network it is wise to set up + a cron job which runs the <code>virt-sanlock-cleanup</code> + script periodically. This script deletes any lease + files which are not currently in use by running virtual + machines, freeing up disk space on the shared filesystem. + Unless VM disks are very frequently created + deleted + it should be sufficient to run the cleanup once a week. + </p> + + <h2><a name="qemuconfig">QEMU/KVM driver configuration</a></h2> + + <p> + The QEMU/KVM driver is fully integrated with the lock + manager framework as of release <span>0.9.3</span>. + The out of the box configuration, however, currently + uses the <strong>nop</strong> lock manager plugin. + To get protection for disks, it is thus neccessary + to reconfigure QEMU to activate the <strong>sanlock</strong> + driver. This is achieved by editing the QEMU driver + configuration file (<code>/etc/libvirt/qemu.conf</code>) + and changing the <code>lock_manager</code> configuration + tunable. 
+ </p> + + <pre> + $ su - root + # augtool -s set /files/etc/libvirt/qemu.conf/lock_manager sanlock + # service libvirtd restart + </pre> + + <p> + If all went well, libvirtd will have talked to sanlock + and created the basic lockspace. This can be checked + by looking for the existence of the following file + </p> + + <pre> + # ls /var/lib/libvirt/sanlock/ + __LIBVIRT__DISKS__ + </pre> + + <p> + Every time you start a guest, additional lease files will appear + in this directory, one for each virtual disk. The lease + files are named based on the MD5 checksum of the fully qualified + path of the virtual disk backing file. So if the guest is given + a disk backed by <code>/var/lib/libvirt/images/demo.img</code> + expect to see a lease <code>/var/lib/libvirt/sanlock/bfa0240911bc17753e0b473688822159</code> + </p> + + <p> + It should be obvious that for locking to work correctly, every + host running virtual machines should have storage configured + in the same way. The easiest way todo this is to use the libvirt + storage pool capability to configure any NFS volumes, iSCSI targets, + or SCSI HBAs used for guest storage. Simply replicate the same + storage pool XML across every host. It is important that any + storage pools exposing block devices are configured to create + volume paths under <code>/dev/disk/by-path</code> to ensure + stable paths across hosts. 
An example iSCSI configuration + which ensures this is: + </p> + + <pre> +<pool type='iscsi'> + <name>myiscsipool</name> + <source> + <host name='192.168.254.8'/> + <device path='your-iscsi-target-iqn'/> + </source> + <target> + <path>/dev/disk/by-path</path> + </target> +</pool> + </pre> + + </body> +</html> diff --git a/docs/sitemap.html.in b/docs/sitemap.html.in index db2963e..897ee94 100644 --- a/docs/sitemap.html.in +++ b/docs/sitemap.html.in @@ -73,6 +73,10 @@ <span>Firewall and network filter configuration</span> </li> <li> + <a href="locking.html">Disk locking</a> + <span>Ensuring exclusive guest access to disks</span> + </li> + <li> <a href="hooks.html">Hooks</a> <span>Hooks for system specific management</span> </li> -- 1.7.4.4
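The storage estimate in the docs patch above (1MB per disk lease; 20 hosts, 50 guests each, 2 disks per guest) is easy to sanity-check with shell arithmetic. The 1MB-per-lease figure itself plausibly follows from the 2000 possible host IDs times one 512-byte sector each, though that derivation is my inference rather than something the docs state:

```shell
# Numbers taken from the example in docs/locking.html.in.
hosts=20 guests=50 disks=2
echo "$((hosts * guests * disks)) MB total"    # 2000 MB total
# 2000 host IDs x one 512-byte sector per host (inferred, not stated):
echo "$((2000 * 512)) bytes per lease"         # 1024000 bytes per lease
```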

On 06/24/2011 07:02 AM, Daniel P. Berrange wrote:
Add a page which documents how to configure lock managers, focusing on use of sanlock with the QEMU/KVM driver
* docs/locking.html.in: Docs about lock managers * docs/sitemap.html.in: Add lock manager config to the deployment section --- docs/locking.html.in | 212 ++++++++++++++++++++++++++++++++++++++++++++++++++ docs/sitemap.html.in | 4 + 2 files changed, 216 insertions(+), 0 deletions(-) create mode 100644 docs/locking.html.in
Docs are always nice!
+ <p> + Once the sanlock daemon is running, the next step is to + configure the libvirt sanlock plugin. There is a separate + configuration file for each libvirt driver that is using + sanlock. For QEMU, we will edit <code>/etc/libvirt/qemu-sanlock.conf</code> + There is one mandatory parameter that needs to be set, + the <code>host_id</code>. This is a integer between + 1 and 2000, which must be set to a <strong>unique</strong> + value on each host running virtual machines
How does this fit in with the max_hosts configuration value?
+ <p> + The sanlock plugin needs to create leases in a directory + that is on a filesystem shared between all hosts running + virtual machines. Obvious choices for this include NFS, + or GFS2. The libvirt sanlock plugin expects its lease
When listing two items, no comma is necessary: s/NFS,/NFS/
+ <p> + The QEMU/KVM driver is fully integrated with the lock + manager framework as of release <span>0.9.3</span>. + The out of the box configuration, however, currently + uses the <strong>nop</strong> lock manager plugin. + To get protection for disks, it is thus neccessary
s/neccessary/necessary/
+ <p> + It should be obvious that for locking to work correctly, every + host running virtual machines should have storage configured + in the same way. The easiest way todo this is to use the libvirt
s/todo/to do/ ACK with nits fixed. -- Eric Blake eblake@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org

On Tue, Jun 28, 2011 at 11:10:57AM -0600, Eric Blake wrote:
On 06/24/2011 07:02 AM, Daniel P. Berrange wrote:
Add a page which documents how to configure lock managers, focusing on use of sanlock with the QEMU/KVM driver
* docs/locking.html.in: Docs about lock managers * docs/sitemap.html.in: Add lock manager config to the deployment section --- docs/locking.html.in | 212 ++++++++++++++++++++++++++++++++++++++++++++++++++ docs/sitemap.html.in | 4 + 2 files changed, 216 insertions(+), 0 deletions(-) create mode 100644 docs/locking.html.in
Docs are always nice!
+ <p> + Once the sanlock daemon is running, the next step is to + configure the libvirt sanlock plugin. There is a separate + configuration file for each libvirt driver that is using + sanlock. For QEMU, we will edit <code>/etc/libvirt/qemu-sanlock.conf</code> + There is one mandatory parameter that needs to be set, + the <code>host_id</code>. This is a integer between + 1 and 2000, which must be set to a <strong>unique</strong> + value on each host running virtual machines
How does this fit in with the max_hosts configuration value?
max_hosts is dead now, so the range is 1->2000 as the docs say here, instead of the previous '1->max_hosts'. Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 06/24/2011 07:02 AM, Daniel P. Berrange wrote:
This is an update to
http://www.redhat.com/archives/libvir-list/2011-June/msg00813.html
Changed in this v2 posting
- Incorporated feedback from v1 - Removed the cron job. This should be manually setup by the admin on only one host - Added documentation to the website on how to deploy sanlock
This series had a first review prior to the freeze, and primarily only affects a new feature, so I think the series is safe for inclusion in 0.9.3, once I review v2 to make sure we aren't missing any obvious gotchas... -- Eric Blake eblake@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org