---
docs/internals/locking.html.in | 257 ++++++++++++++++++++++++++++++++++++++++
docs/sitemap.html.in | 4 +
2 files changed, 261 insertions(+), 0 deletions(-)
create mode 100644 docs/internals/locking.html.in
diff --git a/docs/internals/locking.html.in b/docs/internals/locking.html.in
new file mode 100644
index 0000000..3790ef0
--- /dev/null
+++ b/docs/internals/locking.html.in
@@ -0,0 +1,257 @@
+<html>
+ <body>
+ <h1>Resource Lock Manager</h1>
+
+ <ul id="toc"></ul>
+
+ <p>
+ This page describes the design of the resource lock manager
+ that is used for locking disk images, to ensure exclusive
+ access to content.
+ </p>
+
+ <h2><a name="goals">Goals</a></h2>
+
+ <p>
+ The high level goal is to prevent the same disk image being
+ used by more than one QEMU instance at a time (unless the
+ disk is marked as sharable, or readonly). The scenarios
+ to be prevented are thus:
+ </p>
+
+ <ol>
+ <li>
+ Two different guests running, both configured to point
+ at the same disk image.
+ </li>
+ <li>
+ One guest being started on two different machines
+ due to an admin mistake.
+ </li>
+ <li>
+ One guest being started more than once on a single machine
+ due to a libvirt driver bug.
+ </li>
+ </ol>
+
+ <h2><a name="requirement">Requirements</a></h2>
+
+ <p>
+ The high level goal leads to a set of requirements
+ for the lock manager design:
+ </p>
+
+ <ol>
+ <li>
+ A lock must be held on a disk whenever a QEMU process
+ has the disk open.
+ </li>
+ <li>
+ The lock scheme must allow QEMU to be configured with
+ readonly, shared write, or exclusive writable disks.
+ </li>
+ <li>
+ A lock handover must be performed during the migration
+ process where 2 QEMU processes will have the same disk
+ open concurrently.
+ </li>
+ <li>
+ The lock manager must be able to identify and kill the
+ process accessing the resource if the lock is revoked.
+ </li>
+ <li>
+ Locks can be acquired for arbitrary VM related resources,
+ as determined by the management application.
+ </li>
+ </ol>
+
+ <h2><a name="design">Design</a></h2>
+
+ <p>
+ Within a lock manager the following series of operations
+ will need to be supported.
+ </p>
+
+ <ul>
+ <li>
+ <strong>Register object</strong>
+ Register the identity of an object against which
+ locks will be acquired
+ </li>
+ <li>
+ <strong>Add resource</strong>
+ Associate a resource with an object for future
+ lock acquisition / release
+ </li>
+ <li>
+ <strong>Acquire locks</strong>
+ Acquire the locks for all resources associated
+ with the object
+ </li>
+ <li>
+ <strong>Release locks</strong>
+ Release the locks for all resources associated
+ with the object
+ </li>
+ <li>
+ <strong>Inquire locks</strong>
+ Get a representation of the state of the locks
+ for all resources associated with the object
+ </li>
+ </ul>
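+
+ <p>
+ In terms of the plugin API used in the usage pattern examples
+ later in this document, these operations roughly correspond to
+ the following call sequence. This is only an illustrative
+ sketch; the exact arguments appear in the examples below.
+ </p>
+
+ <pre>
+ mgr = virLockManagerNew(...);              /* register object */
+
+ virLockManagerAddResource(mgr, ...);       /* add resource, repeated per resource */
+
+ virLockManagerAcquire(mgr, ...);           /* acquire locks on all resources */
+
+ ...guest runs...
+
+ virLockManagerInquire(mgr, &amp;state, ...);  /* optionally snapshot the lock state */
+
+ virLockManagerRelease(mgr, &amp;state, ...);  /* release locks on all resources */
+ </pre>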
+
+ <h2><a name="impl">Plugin Implementations</a></h2>
+
+ <p>
+ Lock manager implementations are provided as LGPLv2+
+ licensed, dlopen()able library modules. The plugins
+ will be loadable from the following location:
+ </p>
+
+ <pre>
+/usr/{lib,lib64}/libvirt/lock_manager/$NAME.so
+</pre>
+
+ <p>
+ The lock manager plugin must export a single ELF
+ symbol named <code>virLockDriverImpl</code>, which is
+ a static instance of the <code>virLockDriver</code>
+ struct. The struct is defined in the header file
+ </p>
+
+ <pre>
+ #include &lt;libvirt/plugins/lock_manager.h&gt;
+ </pre>
+
+ <p>
+ All callbacks in the struct must be initialized
+ to non-NULL pointers. The semantics of each
+ callback are defined in the API docs embedded
+ in the previously mentioned header file.
+ </p>
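+
+ <p>
+ As a rough illustration, a trivial 'no op' plugin might be
+ structured as follows. The callback member names and signatures
+ shown here are purely illustrative assumptions; the
+ authoritative definitions live in the header file mentioned
+ above.
+ </p>
+
+ <pre>
+ #include &lt;libvirt/plugins/lock_manager.h&gt;
+
+ /* Illustrative no-op callbacks; names and signatures are
+  * assumptions made for the sake of the example */
+ static int nopNew(virLockManagerPtr man, unsigned int type,
+                   size_t nparams, virLockManagerParamPtr params,
+                   unsigned int flags) { return 0; }
+ static int nopAcquire(virLockManagerPtr man, const char *state,
+                       unsigned int flags) { return 0; }
+
+ /* The single exported ELF symbol: a static instance of the
+  * virLockDriver struct, with every callback set to a
+  * non-NULL pointer */
+ virLockDriver virLockDriverImpl = {
+     .drvNew = nopNew,
+     .drvAcquire = nopAcquire,
+     /* ...all remaining callbacks filled in similarly... */
+ };
+ </pre>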
+
+ <h2><a name="qemuIntegrate">QEMU Driver
integration</a></h2>
+
+ <p>
+ With the QEMU driver, the lock plugin will be set
+ in the <code>/etc/libvirt/qemu.conf</code> configuration
+ file by specifying the lock manager name.
+ </p>
+
+ <pre>
+ lockManager="sanlock"
+ </pre>
+
+ <p>
+ By default the lock manager will be a 'no op' implementation
+ for backwards compatibility.
+ </p>
+
+ <h2><a name="usagePatterns">Lock usage
patterns</a></h2>
+
+ <p>
+ The following pseudo code illustrates the common
+ patterns of operations invoked on the lock
+ manager plugin callbacks.
+ </p>
+
+ <h3><a name="usageLockAcquire">Lock
acquisition</a></h3>
+
+ <p>
+ Initial lock acquisition will be performed from the
+ process that is to own the lock. This is typically
+ the QEMU child process, in between the fork+exec
+ pairing. When adding further resources on the fly
+ to an existing object holding locks, this will be
+ done from the libvirtd process.
+ </p>
+
+ <pre>
+ virLockManagerParam params[] = {
+     { .type = VIR_LOCK_MANAGER_PARAM_TYPE_UUID,
+       .key = "uuid",
+     },
+     { .type = VIR_LOCK_MANAGER_PARAM_TYPE_STRING,
+       .key = "name",
+       .value = { .str = dom->def->name },
+     },
+     { .type = VIR_LOCK_MANAGER_PARAM_TYPE_UINT,
+       .key = "id",
+       .value = { .i = dom->def->id },
+     },
+     { .type = VIR_LOCK_MANAGER_PARAM_TYPE_UINT,
+       .key = "pid",
+       .value = { .i = dom->pid },
+     },
+ };
+
+ mgr = virLockManagerNew(lockPlugin,
+                         VIR_LOCK_MANAGER_TYPE_DOMAIN,
+                         ARRAY_CARDINALITY(params),
+                         params,
+                         0);
+
+ foreach (initial disks)
+     virLockManagerAddResource(mgr,
+                               VIR_LOCK_MANAGER_RESOURCE_TYPE_DISK,
+                               $path, 0, NULL, $flags);
+
+ if (virLockManagerAcquire(mgr, NULL, 0) &lt; 0)
+     ...abort...
+ </pre>
+
+ <h3><a name="usageLockAttach">Lock
release</a></h3>
+
+ <p>
+ The locks are all implicitly released when the process
+ that acquired them exits; however, a process may
+ voluntarily give up the locks by running:
+ </p>
+
+ <pre>
+ char *state = NULL;
+ virLockManagerParam params[] = {
+     { .type = VIR_LOCK_MANAGER_PARAM_TYPE_UUID,
+       .key = "uuid",
+     },
+     { .type = VIR_LOCK_MANAGER_PARAM_TYPE_STRING,
+       .key = "name",
+       .value = { .str = dom->def->name },
+     },
+     { .type = VIR_LOCK_MANAGER_PARAM_TYPE_UINT,
+       .key = "id",
+       .value = { .i = dom->def->id },
+     },
+     { .type = VIR_LOCK_MANAGER_PARAM_TYPE_UINT,
+       .key = "pid",
+       .value = { .i = dom->pid },
+     },
+ };
+
+ mgr = virLockManagerNew(lockPlugin,
+                         VIR_LOCK_MANAGER_TYPE_DOMAIN,
+                         ARRAY_CARDINALITY(params),
+                         params,
+                         0);
+
+ foreach (initial disks)
+     virLockManagerAddResource(mgr,
+                               VIR_LOCK_MANAGER_RESOURCE_TYPE_DISK,
+                               $path, 0, NULL, $flags);
+
+ virLockManagerRelease(mgr, &amp;state, 0);
+ </pre>
+
+ <p>
+ The returned state string can be passed to the
+ <code>virLockManagerAcquire</code> method to
+ later re-acquire the exact same locks. This
+ state transfer is commonly used when performing
+ live migration of virtual machines. By validating
+ the state, the lock manager can ensure no other
+ VM has re-acquired the same locks on a different
+ host. The state can also be obtained without
+ releasing the locks, by calling the
+ <code>virLockManagerInquire</code> method.
+ </p>
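+
+ <p>
+ As an illustration of the migration scenario, the following
+ sketch (pseudo code in the same style as the examples above,
+ with the transport of the state string left to the management
+ layer) shows how the state could be handed over between hosts.
+ </p>
+
+ <pre>
+ /* Source host: snapshot the lock state while the locks are held */
+ char *state = NULL;
+ virLockManagerInquire(mgr, &amp;state, 0);
+
+ ...transfer 'state' to the destination host with the migration data...
+
+ /* Destination host: in the new QEMU process, re-create the
+  * object and resources exactly as during initial acquisition */
+ mgr = virLockManagerNew(lockPlugin,
+                         VIR_LOCK_MANAGER_TYPE_DOMAIN,
+                         ARRAY_CARDINALITY(params),
+                         params,
+                         0);
+
+ foreach (disks)
+     virLockManagerAddResource(mgr,
+                               VIR_LOCK_MANAGER_RESOURCE_TYPE_DISK,
+                               $path, 0, NULL, $flags);
+
+ /* Passing the state string lets the plugin verify that no other
+  * host has acquired the same locks in the meantime */
+ if (virLockManagerAcquire(mgr, state, 0) &lt; 0)
+     ...abort migration...
+ </pre>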
+
+ </body>
+</html>
diff --git a/docs/sitemap.html.in b/docs/sitemap.html.in
index 5650fee..0f90ce8 100644
--- a/docs/sitemap.html.in
+++ b/docs/sitemap.html.in
@@ -284,6 +284,10 @@
<a href="internals/command.html">Spawning
commands</a>
<span>Spawning commands from libvirt driver code</span>
</li>
+ <li>
+ <a href="internals/locking.html">Lock managers</a>
+ <span>Use lock managers to protect disk content</span>
+ </li>
</ul>
</li>
<li>
--
1.7.4.4