[libvirt] [PATCH 0/4] libvirt-guests: improve behavior of the guests script

This patchset tweaks the libvirt guest script to enable parallel shutdown of guests and fix some bugs that appeared through time. Peter Krempa (4): libvirt-guests: Add documentation and clean up to use virsh's improved list libvirt-guests: Don't try to do a managed-save of transient guests libvirt-guests: Check if URI is reachable before launching commands libvirt-guests: Add parallel startup and shutdown of guests tools/libvirt-guests.init.sh | 261 ++++++++++++++++++++++++++++++++++++------ tools/libvirt-guests.sysconf | 10 ++- 2 files changed, 235 insertions(+), 36 deletions(-) -- 1.7.3.4

This patch adds documentation to functions defined in the libvirt-guests init script and changes use of virsh's new commands to make the script easier. --- tools/libvirt-guests.init.sh | 60 ++++++++++++++++++++++++++++-------------- 1 files changed, 40 insertions(+), 20 deletions(-) diff --git a/tools/libvirt-guests.init.sh b/tools/libvirt-guests.init.sh index 367177e..0d2fc24 100644 --- a/tools/libvirt-guests.init.sh +++ b/tools/libvirt-guests.init.sh @@ -53,6 +53,9 @@ VAR_SUBSYS_LIBVIRT_GUESTS="$localstatedir"/lock/subsys/libvirt-guests RETVAL=0 +# retval COMMAND ARGUMENTS... +# run command with arguments and convert non-zero return value to 1 and set +# the global return variable retval() { "$@" if [ $? -ne 0 ]; then @@ -63,6 +66,10 @@ retval() { fi } +# run_virsh URI ARGUMENTS... +# start virsh and let it execute ARGUMENTS on URI +# If URI is "default" virsh is called without the "-c" argument +# (using libvirt's default connection) run_virsh() { uri=$1 shift @@ -74,64 +81,67 @@ run_virsh() { fi } +# run_virsh_c URI ARGUMENTS +# Same as "run_virsh" but the "C" locale is used instead of +# the system's locale. run_virsh_c() { ( export LC_ALL=C; run_virsh "$@" ) } +# list_guests URI PERSISTENT +# List running guests on URI. +# PERSISTENT argument options: +# --persistent: list only persistent guests +# --transient: list only transient guests +# [none]: list both persistent and transient guests list_guests() { uri=$1 + persistent=$2 - list=$(run_virsh_c "$uri" list) + list=$(run_virsh_c "$uri" list --uuid $persistent) if [ $? -ne 0 ]; then RETVAL=1 return 1 fi - uuids= - for id in $(echo "$list" | awk 'NR > 2 {print $1}'); do - uuid=$(run_virsh_c "$uri" dominfo "$id" | awk '/^UUID:/{print $2}') - if [ -z "$uuid" ]; then - RETVAL=1 - return 1 - fi - uuids="$uuids $uuid" - done - - echo $uuids + echo $list } +# guest_name URI UUID +# return name of guest UUID on URI guest_name() { uri=$1 uuid=$2 - name=$(run_virsh_c "$uri" dominfo "$uuid" 2>/dev/null | \ - sed -ne 's/^Name: *//p') - [ -n "$name" ] || name=$uuid - - echo "$name" + run_virsh "$uri" domname "$uuid" 2>/dev/null } +# guest_is_on URI UUID +# check if guest UUID on URI is runnig +# Result is returned by variable "guest_running" guest_is_on() { uri=$1 uuid=$2 guest_running=false - info=$(run_virsh_c "$uri" dominfo "$uuid") + id=$(run_virsh "$uri" domid "$uuid") if [ $? -ne 0 ]; then RETVAL=1 return 1 fi - id=$(echo "$info" | awk '/^Id:/{print $2}') - [ -n "$id" ] && [ "x$id" != x- ] && guest_running=true return 0 } +# started +# Create the startup lock file started() { touch "$VAR_SUBSYS_LIBVIRT_GUESTS" } +# start +# Start or resume the guests start() { [ -f "$LISTFILE" ] || { started; return 0; } @@ -187,6 +197,9 @@ start() { started } +# suspend_guest URI GUEST +# Do a managed save on a GUEST on URI. This function returns after the guest +# was saved. suspend_guest() { uri=$1 @@ -213,6 +226,9 @@ suspend_guest() retval wait "$virsh_pid" && printf '\r%s%-12s\n' "$label" "$(gettext "done")" } +# shutdown_guest URI GUEST +# Start a ACPI shutdown of GUEST on URI. This function return after the quest +# was successfuly shutdown or the timeout defined by $SHUTDOWN_TIMEOUT expires. shutdown_guest() { uri=$1 @@ -241,6 +257,8 @@ shutdown_guest() fi } +# stop +# Shutdown or save guests on the configured uris stop() { # last stop was not followed by start [ -f "$LISTFILE" ] && return 0 @@ -304,6 +322,8 @@ stop() { rm -f "$VAR_SUBSYS_LIBVIRT_GUESTS" } +# gueststatus +# List status of guests gueststatus() { set -f for uri in $URIS; do -- 1.7.3.4

On 02/28/2012 11:00 AM, Peter Krempa wrote:
This patch adds documentation to functions defined in the libvirt-guests init script and changes use of virsh's new commands to make the script easier. --- tools/libvirt-guests.init.sh | 60 ++++++++++++++++++++++++++++-------------- 1 files changed, 40 insertions(+), 20 deletions(-)
+# guest_is_on URI UUID +# check if guest UUID on URI is runnig
s/runnig/running/
+# shutdown_guest URI GUEST +# Start a ACPI shutdown of GUEST on URI. This function return after the quest +# was successfuly shutdown or the timeout defined by $SHUTDOWN_TIMEOUT expires.
s/successfuly/successfully/ ACK with typos fixed. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

The libvirt-guests script tried to do a managed save of transient guest that failed. This patch notifies which guests are transient (and not being saved) and saves only the persistent ones. --- tools/libvirt-guests.init.sh | 37 +++++++++++++++++++++++++++++++++++-- 1 files changed, 35 insertions(+), 2 deletions(-) diff --git a/tools/libvirt-guests.init.sh b/tools/libvirt-guests.init.sh index 0d2fc24..21a7d31 100644 --- a/tools/libvirt-guests.init.sh +++ b/tools/libvirt-guests.init.sh @@ -293,13 +293,46 @@ stop() { printf %s "$(guest_name "$uri" "$uuid")" empty=false done + if "$empty"; then - gettext "no running guests."; echo + gettext "no running guests." + fi + echo + fi + + if "$suspending"; then + transient=$(list_guests "$uri" "--transient") + if [ $? -eq 0 ]; then + empty=true + for uuid in $transient; do + if "$empty"; then + eval_gettext "Not suspending transient guests on URI: \$uri: " + empty=false + else + printf ", " + fi + printf %s "$(guest_name "$uri" "$uuid")" + done + echo + # reload domain list to contain only persistent guests + list=$(list_guests "$uri" "--persistent") + if [ $? -ne 0 ]; then + eval_gettext "Failed to list persistent guests on \$uri" + echo + RETVAL=1 + return + fi else + gettext "Failed to list transient guests" echo - echo "$uri" "$list" >>"$LISTFILE" + RETVAL=1 + return fi fi + + if [ -n "$list" ]; then + echo "$uri" "$list" >>"$LISTFILE" + fi done set +f -- 1.7.3.4

On 02/28/2012 11:00 AM, Peter Krempa wrote:
The libvirt-guests script tried to do a managed save of transient guest that failed. This patch notifies which guests are transient (and not being saved) and saves only the persistent ones. --- tools/libvirt-guests.init.sh | 37 +++++++++++++++++++++++++++++++++++-- 1 files changed, 35 insertions(+), 2 deletions(-)
ACK with one fix:
+ # reload domain list to contain only persistent guests + list=$(list_guests "$uri" "--persistent") + if [ $? -ne 0 ]; then + eval_gettext "Failed to list persistent guests on \$uri" + echo + RETVAL=1 + return + fi else + gettext "Failed to list transient guests" echo - echo "$uri" "$list" >>"$LISTFILE" + RETVAL=1 + return
Before these two return statements, you need to add a 'set +f' statement;
fi fi + + if [ -n "$list" ]; then + echo "$uri" "$list" >>"$LISTFILE" + fi done set +f
since both of those early exits need to leave the function in the same state as if you exited normally. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

This patch adds a check to the libvirt-guests script to check for the URI to be alive before attempting any calls. This avoids nasty error messages and allows us to fail gracefuly and continue on other URIs configured in the script. --- tools/libvirt-guests.init.sh | 24 +++++++++++++++++++----- 1 files changed, 19 insertions(+), 5 deletions(-) diff --git a/tools/libvirt-guests.init.sh b/tools/libvirt-guests.init.sh index 21a7d31..47914e3 100644 --- a/tools/libvirt-guests.init.sh +++ b/tools/libvirt-guests.init.sh @@ -88,6 +88,20 @@ run_virsh_c() { ( export LC_ALL=C; run_virsh "$@" ) } +# test_connect URI +# check if URI is reachable +test_connect() +{ + uri=$1 + + run_virsh "$uri" connect 2>/dev/null + if [ $? -ne 0 ]; then + eval_gettext "Can't connect to \$uri. Skipping." + echo + return 1 + fi +} + # list_guests URI PERSISTENT # List running guests on URI. # PERSISTENT argument options: @@ -172,6 +186,8 @@ start() { continue fi + test_connect "$uri" || continue + eval_gettext "Resuming guests on \$uri URI..."; echo for guest in $list; do name=$(guest_name "$uri" "$guest") @@ -278,12 +294,10 @@ stop() { set -f for uri in $URIS; do set +f - eval_gettext "Running guests on \$uri URI: " - if [ "x$uri" = xdefault ] && [ ! -x "$libvirtd" ]; then - gettext "libvirtd not installed; skipping this URI."; echo - continue - fi + test_connect "$uri" || continue + + eval_gettext "Running guests on \$uri URI: " list=$(list_guests "$uri") if [ $? -eq 0 ]; then -- 1.7.3.4

On 02/28/2012 11:00 AM, Peter Krempa wrote:
This patch adds a check to the libvirt-guests script to check for the URI to be alive before attempting any calls. This avoids nasty error messages and allows us to fail gracefuly and continue on other URIs
s/gracefuly/gracefully/
configured in the script. --- tools/libvirt-guests.init.sh | 24 +++++++++++++++++++----- 1 files changed, 19 insertions(+), 5 deletions(-)
ACK. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 02/29/2012 04:02 AM, Eric Blake wrote:
On 02/28/2012 11:00 AM, Peter Krempa wrote:
This patch adds a check to the libvirt-guests script to check for the URI to be alive before attempting any calls. This avoids nasty error messages and allows us to fail gracefuly and continue on other URIs
s/gracefuly/gracefully/
configured in the script. --- tools/libvirt-guests.init.sh | 24 +++++++++++++++++++----- 1 files changed, 19 insertions(+), 5 deletions(-)
ACK.
Thanks. I pushed patches 1,2 and 3 as they make sense without the last one and will send a fixed version soon. Peter

With this patch, it's possible to shut down guests in parallel. Parallel startup was possible before, but this functionality was not documented properly. To enable parallel startup set the START_DELAY to 0. Parallel shutdown has a configurable parameter PARALLEL_SHUTDOWN that defines the number of machines being shut down in parallel. Enabling this feature changes the semantics of SHUTDOWN_TIMEOUT parameter that is applied as a cumulative timeout to shutdown all guests on a URI. --- tools/libvirt-guests.init.sh | 140 +++++++++++++++++++++++++++++++++++++++-- tools/libvirt-guests.sysconf | 10 +++- 2 files changed, 141 insertions(+), 9 deletions(-) diff --git a/tools/libvirt-guests.init.sh b/tools/libvirt-guests.init.sh index 47914e3..dd933c7 100644 --- a/tools/libvirt-guests.init.sh +++ b/tools/libvirt-guests.init.sh @@ -273,6 +273,127 @@ shutdown_guest() fi } +# shutdown_guest_async URI GUEST +# Start a ACPI shutdown of GUEST on URI. This function returns after the command +# was issued to libvirt to allow parallel shutdown. +shutdown_guest_async() +{ + uri=$1 + guest=$2 + + name=$(guest_name "$uri" "$guest") + eval_gettext "Starting shutdown on guest: \$name" + echo + retval run_virsh "$uri" shutdown "$guest" > /dev/null +} + +# guest_count GUEST_LIST +# Returns number of guests in GUEST_LIST +guest_count() +{ + set -- $1 + echo $# +} + +# guest_first GUEST GUEST GUEST... +# Returns the first guest in GUEST... +guest_first() +{ + echo $1 +} + +# guest_remove GUEST GUEST_LIST +# Remove GUEST from GUEST_LIST +guest_remove() +{ + guest=$1 + guests=$2 + + newguests= + for dom in $guests; do + if [ "$dom" != "$guest" ]; then + newguests="$newguests $dom" + fi + done + + echo "$newguests" +} + +# check_domains_shutdown URI GUESTS +# check if shutdown is complete on guests in "GUESTS" and returns only +# guests that are still shutting down +check_domains_shutdown() +{ + uri=$1 + guests=$2 + + guests_up= + for guest in $guests; do + guest_is_on "$uri" "$dom" 2>&1 > /dev/null || continue + if "$guest_running"; then + guests_up="$guests_up $guest" + fi + done + echo "$guests_up" +} + +# print_domains_shutdown URI BEFORE AFTER +# Checks for differences in the lists BEFORE and AFTER and prints +# a shutdown complete notice for guests that have finished +print_domains_shutdown() +{ + uri=$1 + before=$2 + after=$3 + + for guest in $before; do + found=false + for running in $after; do + if [ $guest = $running ]; then + found=true + break + fi + done + + if ! "$found"; then + name=$(guest_name "$uri" "$guest") + eval_gettext "Shutdown of guest \$name complete." + echo + fi + done +} + +# shutdown_guests_parallel URI GUESTS +# Shutdown guests GUESTS on machine URI in parallel +shutdown_guests_parallel() +{ + uri=$1 + guests=$2 + + on_shutdown= + timeout=$SHUTDOWN_TIMEOUT + while [ -n "$on_shutdown" ] || [ -n "$guests" ]; do + while [ -n "$guests" ] && + [ $(guest_count "$on_shutdown") -lt "$PARALLEL_SHUTDOWN" ]; do + guest=$(guest_first $guests) + guests=$(guest_remove "$guest" "$guests") + shutdown_guest_async "$uri" "$guest" + on_shutdown="$on_shutdown $guest" + done + sleep 1 + timeout=$(($timeout - 1)) + if [ $timeout -le 0 ]; then + eval_gettext "Timeout expired while shutting down domains"; echo + RETVAL=1 + return + fi + on_shutdown_prev=$on_shutdown + on_shutdown=$(check_domains_shutdown "$uri" "$on_shutdown") + print_domains_shutdown "$uri" "$on_shutdown_prev" "$on_shutdown" + done + +} + # stop # Shutdown or save guests on the configured uris stop() { @@ -357,13 +478,18 @@ stop() { eval_gettext "Shutting down guests on \$uri URI..."; echo fi - for guest in $list; do - if "$suspending"; then - suspend_guest "$uri" "$guest" - else - shutdown_guest "$uri" "$guest" - fi - done + if [ "$PARALLEL_SHUTDOWN" -gt 1 ] && + ! "$suspending"; then + shutdown_guests_parallel "$uri" "$list" + else + for guest in $list; do + if "$suspending"; then + suspend_guest "$uri" "$guest" + else + shutdown_guest "$uri" "$guest" + fi + done + fi done <"$LISTFILE" rm -f "$VAR_SUBSYS_LIBVIRT_GUESTS" diff --git a/tools/libvirt-guests.sysconf b/tools/libvirt-guests.sysconf index 9b8b64f..e16af4f 100644 --- a/tools/libvirt-guests.sysconf +++ b/tools/libvirt-guests.sysconf @@ -10,7 +10,8 @@ # libvirtd #ON_BOOT=start -# number of seconds to wait between each guest start +# number of seconds to wait between each guest start. Set to 0 to allow parallel +# startup. #START_DELAY=0 # action taken on host shutdown @@ -23,7 +24,12 @@ # value suitable for your guests. #ON_SHUTDOWN=suspend -# number of seconds we're willing to wait for a guest to shut down +# If set to non-zero, shutdown will suspend domains concurently. Number of domains +# on shutdown at any time will not exceed number set in this variable. +#PARALLEL_SHUTDOWN=0 + +# number of seconds we're willing to wait for a guest to shut down. If parallel +# shutdown is enabled, this timeout applies as a timeout for shutting down all guests. #SHUTDOWN_TIMEOUT=0 # If non-zero, try to bypass the file system cache when saving and -- 1.7.3.4

On 02/28/2012 11:00 AM, Peter Krempa wrote:
With this patch, it's possible to shut down guests in parallel. Parallel startup was possible before, but this functionality was not documented properly.
To enable parallel startup set the START_DELAY to 0.
Parallel shutdown has a configurable parameter PARALLEL_SHUTDOWN that defines the number of machines being shut down in parallel. Enabling this feature changes the semantics of SHUTDOWN_TIMEOUT parameter that is applied as a cumulative timeout to shutdown all guests on a URI. --- tools/libvirt-guests.init.sh | 140 +++++++++++++++++++++++++++++++++++++++-- tools/libvirt-guests.sysconf | 10 +++- 2 files changed, 141 insertions(+), 9 deletions(-)
Missing an initialization of PARALLEL_SHUTDOWN=0 alongside SHUTDOWN_TIMEOUT=0 before sourcing the user's values.
+ +# check_domains_shutdown URI GUESTS +# check if shutdown is complete on guests in "GUESTS" and returns only +# guests that are still shutting down +check_domains_shutdown() +{ + uri=$1 + guests=$2 + + guests_up= + for guest in $guests; do + guest_is_on "$uri" "$dom" 2>&1 > /dev/null || continue
You want '>/dev/null 2>&1' (order matters, when silencing a command and using it just for its side effects). But are we sure we want to silently ignore a guest_is_on failure, or should we emit a message stating that we are no longer able to track the shutdown status of that guest?
+print_domains_shutdown() +{ + uri=$1 + before=$2 + after=$3 + + for guest in $before; do + found=false + for running in $after; do + if [ $guest = $running ]; then + found=true + break + fi + done
slightly faster to do: for guest in $before; do found=false; case " $after " in *" $guest "*) found=true ;; esac But what you have is functionally correct.
+ + if ! "$found"; then + name=$(guest_name "$uri" "$guest") + eval_gettext "Shutdown of guest \$name complete." + echo + fi + done +} + +# shutdown_guests_parallel URI GUESTS +# Shutdown guests GUESTS on machine URI in parallel +shutdown_guests_parallel() +{ + uri=$1 + guests=$2 + + on_shutdown= + timeout=$SHUTDOWN_TIMEOUT
If SHUTDOWN_TIMEOUT is the default of 0, then this function times out right away; shouldn't that really mean that we have no maximum timeout, and wait until everything has shut down? I think you need another variable, based on whether $timeout is 0 on entry, and skip the $((timeout - 1)) calculation if that variable is true.
+ while [ -n "$on_shutdown" ] || [ -n "$guests" ]; do + while [ -n "$guests" ] && + [ $(guest_count "$on_shutdown") -lt "$PARALLEL_SHUTDOWN" ]; do + guest=$(guest_first $guests) + guests=$(guest_remove "$guest" "$guests")
Faster to avoid these two sub-shells by doing: set -- $guests guest=$1 shift guests=$* and might even get rid of the need for some helper functions. But what you have is functionally correct.
@@ -23,7 +24,12 @@ # value suitable for your guests. #ON_SHUTDOWN=suspend
-# number of seconds we're willing to wait for a guest to shut down +# If set to non-zero, shutdown will suspend domains concurently. Number of domains
s/concurently/concurrently/ I think it's probably worth a v2 (or is it v3?) for just this patch. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
participants (2)
-
Eric Blake
-
Peter Krempa