[libvirt] [PATCH 0/2] sanlock: Forbid unsupported lock failure actions

Some lock failure actions do not make any sense in combination with sanlock driver. Let's just report an error if someone tries to use them instead of causing unexpected and possibly quite bad thing to happen. Jiri Denemark (2): sanlock: Forbid VIR_DOMAIN_LOCK_FAILURE_IGNORE sanlock: Forbid VIR_DOMAIN_LOCK_FAILURE_RESTART src/locking/lock_driver_sanlock.c | 12 +++++++++++- src/locking/sanlock_helper.c | 28 +++------------------------- 2 files changed, 14 insertions(+), 26 deletions(-) -- 1.9.1

https://bugzilla.redhat.com/show_bug.cgi?id=905280 https://bugzilla.redhat.com/show_bug.cgi?id=967493 Sanlock expects that the configured kill script either kills the PID on lock failure or removes all locks the PID owns. If none of the two options happen, sanlock will reboot the host. Although IGNORE action is supposed to ignore the request to kill the PID or remove all leases, it's certainly not designed to cause the host to be rebooted. That said, IGNORE action is incompatible with sanlock and should be forbidden by libvirt. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- src/locking/lock_driver_sanlock.c | 12 +++++++++++- src/locking/sanlock_helper.c | 6 ++---- 2 files changed, 13 insertions(+), 5 deletions(-) diff --git a/src/locking/lock_driver_sanlock.c b/src/locking/lock_driver_sanlock.c index 3e12c4f..b8c86fc 100644 --- a/src/locking/lock_driver_sanlock.c +++ b/src/locking/lock_driver_sanlock.c @@ -781,7 +781,17 @@ virLockManagerSanlockRegisterKillscript(int sock, int ret = -1; int rv; - if (action > VIR_DOMAIN_LOCK_FAILURE_IGNORE) { + switch (action) { + case VIR_DOMAIN_LOCK_FAILURE_DEFAULT: + return 0; + + case VIR_DOMAIN_LOCK_FAILURE_POWEROFF: + case VIR_DOMAIN_LOCK_FAILURE_RESTART: + case VIR_DOMAIN_LOCK_FAILURE_PAUSE: + break; + + case VIR_DOMAIN_LOCK_FAILURE_IGNORE: + case VIR_DOMAIN_LOCK_FAILURE_LAST: virReportError(VIR_ERR_CONFIG_UNSUPPORTED, _("Failure action %s is not supported by sanlock"), virDomainLockFailureTypeToString(action)); diff --git a/src/locking/sanlock_helper.c b/src/locking/sanlock_helper.c index 3e400b7..c913cc3 100644 --- a/src/locking/sanlock_helper.c +++ b/src/locking/sanlock_helper.c @@ -117,11 +117,9 @@ main(int argc, char **argv) ret = EXIT_SUCCESS; break; + case VIR_DOMAIN_LOCK_FAILURE_DEFAULT: case VIR_DOMAIN_LOCK_FAILURE_IGNORE: - ret = EXIT_SUCCESS; - break; - - default: + case VIR_DOMAIN_LOCK_FAILURE_LAST: fprintf(stderr, _("unsupported failure action: '%s'\n"), virDomainLockFailureTypeToString(action)); break; -- 1.9.1

https://bugzilla.redhat.com/show_bug.cgi?id=905282 https://bugzilla.redhat.com/show_bug.cgi?id=967494 When lock failure is detected by sanlock, our sanlock_helper kill script will try to restart (shutdown followed by start) the affected domain when RESTART action is configured for it. While shutting down kills QEMU and removes all its leases (which is what sanlock wants to happen), trying to start it again just hangs because libvirt tries reacquire the locks in the failed lock space. Hence, this action cannot be supported by sanlock driver. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- src/locking/lock_driver_sanlock.c | 2 +- src/locking/sanlock_helper.c | 22 +--------------------- 2 files changed, 2 insertions(+), 22 deletions(-) diff --git a/src/locking/lock_driver_sanlock.c b/src/locking/lock_driver_sanlock.c index b8c86fc..624691f 100644 --- a/src/locking/lock_driver_sanlock.c +++ b/src/locking/lock_driver_sanlock.c @@ -786,10 +786,10 @@ virLockManagerSanlockRegisterKillscript(int sock, return 0; case VIR_DOMAIN_LOCK_FAILURE_POWEROFF: - case VIR_DOMAIN_LOCK_FAILURE_RESTART: case VIR_DOMAIN_LOCK_FAILURE_PAUSE: break; + case VIR_DOMAIN_LOCK_FAILURE_RESTART: case VIR_DOMAIN_LOCK_FAILURE_IGNORE: case VIR_DOMAIN_LOCK_FAILURE_LAST: virReportError(VIR_ERR_CONFIG_UNSUPPORTED, diff --git a/src/locking/sanlock_helper.c b/src/locking/sanlock_helper.c index c913cc3..08102eb 100644 --- a/src/locking/sanlock_helper.c +++ b/src/locking/sanlock_helper.c @@ -91,33 +91,13 @@ main(int argc, char **argv) ret = EXIT_SUCCESS; break; - case VIR_DOMAIN_LOCK_FAILURE_RESTART: - if (virDomainIsPersistent(dom)) { - if ((virDomainDestroy(dom) == 0 || - virDomainIsActive(dom) == 0) && - virDomainCreate(dom) == 0) - ret = EXIT_SUCCESS; - } else { - xml = virDomainGetXMLDesc(dom, - VIR_DOMAIN_XML_SECURE | - VIR_DOMAIN_XML_INACTIVE); - - if (!xml || - (virDomainDestroy(dom) < 0 && - virDomainIsActive(dom) != 0)) - goto cleanup; - virDomainFree(dom); - if ((dom = virDomainCreateXML(conn, xml, 0))) - ret = EXIT_SUCCESS; - } - break; - case VIR_DOMAIN_LOCK_FAILURE_PAUSE: if (virDomainSuspend(dom) == 0) ret = EXIT_SUCCESS; break; case VIR_DOMAIN_LOCK_FAILURE_DEFAULT: + case VIR_DOMAIN_LOCK_FAILURE_RESTART: case VIR_DOMAIN_LOCK_FAILURE_IGNORE: case VIR_DOMAIN_LOCK_FAILURE_LAST: fprintf(stderr, _("unsupported failure action: '%s'\n"), -- 1.9.1

On 03/24/2014 07:50 AM, Jiri Denemark wrote:
Some lock failure actions do not make any sense in combination with sanlock driver. Let's just report an error if someone tries to use them instead of causing unexpected and possibly quite bad thing to happen.
ACK series. However, are we missing some documentation patches to match?
Jiri Denemark (2): sanlock: Forbid VIR_DOMAIN_LOCK_FAILURE_IGNORE sanlock: Forbid VIR_DOMAIN_LOCK_FAILURE_RESTART
src/locking/lock_driver_sanlock.c | 12 +++++++++++- src/locking/sanlock_helper.c | 28 +++------------------------- 2 files changed, 14 insertions(+), 26 deletions(-)
-- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On Mon, Mar 24, 2014 at 14:41:52 -0600, Eric Blake wrote:
On 03/24/2014 07:50 AM, Jiri Denemark wrote:
Some lock failure actions do not make any sense in combination with sanlock driver. Let's just report an error if someone tries to use them instead of causing unexpected and possibly quite bad thing to happen.
ACK series. However, are we missing some documentation patches to match?
I think the documentation is fine as it is: The on_lockfailure element (since 1.0.0) may be used to configure what action should be taken when a lock manager loses resource locks. The following actions are recognized by libvirt, although not all of them need to be supported by individual lock managers. ... I mean it would be nice to explicitly state what actions are supported by individual lock managers but doing it manually seems to fragile. Not to mention we don't document such things elsewhere either :-) I pushed the patches, thanks for the review. Jirka
participants (2)
-
Eric Blake
-
Jiri Denemark