[libvirt] [PATCH v3 0/2] API to invoke S3/S4 on a host and also resume from within libvirt

(This patch is positioned to go in after the patch that exports the host power management capabilities as XML, posted in [4]) This patch adds a new API to put a host to a suspended state (either Suspend-to-RAM or Suspend-to-Disk) and setup a timed resume to get the host back online, from within libvirt. This uses the RTC wakeup mechanism to set up a timer alarm before suspending the host, so that in-band resume is facilitated by the firing of the RTC alarm, which wakes up the host. The decision to use the RTC Wakeup mechanism to resume the host from sleep was taken in [1]. An initial API was discussed in [2]. Some design ideas for the asynchronous mechanism implementation was discussed in [3]. v3: * Rebased to libvirt 0.9.7 * Added a check to see if alarmTime (suspend duration) is within an acceptable range. v2: * Added an init function which finds out if S3/S4 is supported by the host, upon the first request to suspend/hibernate. * Added synchronization/locking to ensure that only one suspend operation is active at a time. v1: http://www.redhat.com/archives/libvir-list/2011-September/msg00830.html v2: http://comments.gmane.org/gmane.comp.emulators.libvirt/46729 References: [1]. http://www.redhat.com/archives/libvir-list/2011-August/msg00327.html [2]. http://www.redhat.com/archives/libvir-list/2011-August/msg00248.html [3]. http://www.redhat.com/archives/libvir-list/2011-September/msg00438.html [4]. http://www.redhat.com/archives/libvir-list/2011-November/msg00378.html Srivatsa S. Bhat (2): Implement the asynchronous suspend and RTC wakeup Make the API public include/libvirt/libvirt.h.in | 9 ++ src/driver.h | 5 + src/libvirt.c | 48 +++++++++ src/libvirt_private.syms | 7 + src/libvirt_public.syms | 1 src/nodeinfo.c | 220 ++++++++++++++++++++++++++++++++++++++++++ src/nodeinfo.h | 14 +++ src/qemu/qemu_driver.c | 6 + src/remote/remote_driver.c | 1 src/remote/remote_protocol.x | 12 ++ src/util/threads-pthread.c | 17 +++ src/util/threads.h | 1 12 files changed, 339 insertions(+), 2 deletions(-)

Add the core functions that implement the functionality of the API. Suspend is done by using an asynchronous mechanism so that we can return the status to the caller successfully before the host gets suspended. This asynchronous operation is achieved by suspending the host in a separate thread of execution. To resume the host, an RTC alarm is set up (based on how long we want to suspend) before suspending the host. When this alarm fires, the host gets woken up. Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> --- include/libvirt/libvirt.h.in | 5 + src/libvirt_private.syms | 7 + src/nodeinfo.c | 220 ++++++++++++++++++++++++++++++++++++++++++ src/nodeinfo.h | 14 +++ src/qemu/qemu_driver.c | 5 + src/util/threads-pthread.c | 17 +++ src/util/threads.h | 1 7 files changed, 268 insertions(+), 1 deletions(-) diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in index aa320b6..25f1c9b 100644 --- a/include/libvirt/libvirt.h.in +++ b/include/libvirt/libvirt.h.in @@ -3357,6 +3357,11 @@ typedef struct _virTypedParameter virMemoryParameter; */ typedef virMemoryParameter *virMemoryParameterPtr; +typedef enum { + VIR_S3 = 1, /* Suspend-to-RAM */ + VIR_S4 = 2 /* Suspend-to-disk */ +} virSuspendState; + #ifdef __cplusplus } #endif diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 6a1562e..2fa84e0 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -844,6 +844,13 @@ nodeGetCellsFreeMemory; nodeGetFreeMemory; nodeGetInfo; nodeGetMemoryStats; +virSuspendLock; +virSuspendUnlock; +virSuspendInit; +nodeSuspendForDuration; +setNodeWakeup; +nodeSuspend; +virSuspendCleanup; # nwfilter_conf.h diff --git a/src/nodeinfo.c b/src/nodeinfo.c index 6448b79..3c67fe6 100644 --- a/src/nodeinfo.c +++ b/src/nodeinfo.c @@ -46,6 +46,9 @@ #include "count-one-bits.h" #include "intprops.h" #include "virfile.h" +#include "command.h" +#include "threads.h" +#include "datatypes.h" #define VIR_FROM_THIS VIR_FROM_NONE @@ -65,6 +68,11 @@ # define LINUX_NB_MEMORY_STATS_ALL 4 # define LINUX_NB_MEMORY_STATS_CELL 2 +/* Bitmask to hold the Power Management features supported by the host, + * such as Suspend-to-RAM (S3), Suspend-to-Disk (S4) etc. + */ +static unsigned int hostPMFeatures; + /* NB, this is not static as we need to call it from the testsuite */ int linuxNodeInfoCPUPopulate(FILE *cpuinfo, virNodeInfoPtr nodeinfo, @@ -897,3 +905,215 @@ unsigned long long nodeGetFreeMemory(virConnectPtr conn ATTRIBUTE_UNUSED) return 0; } #endif + + +static int initialized; +virMutex virSuspendMutex; + +int virSuspendLock(void) +{ + return virMutexTryLock(&virSuspendMutex); +} + +void virSuspendUnlock(void) +{ + virMutexUnlock(&virSuspendMutex); +} + +/** + * virSuspendInit: + * + * Get the low power states supported by the host, such as Suspend-to-RAM (S3) + * or Suspend-to-Disk (S4), so that a request to suspend/hibernate the host + * can be handled appropriately based on this information. + * + * Returns 0 if successful, and -1 in case of error. + */ +int virSuspendInit(void) +{ + + if (virMutexInit(&virSuspendMutex) < 0) + return -1; + + /* Get the power management capabilities supported by the host. + * Ensure that this is done only once, by using the 'initialized' + * variable. + */ + if (virGetPMCapabilities(&hostPMFeatures) < 0) { + VIR_ERROR(_("Failed to get host power management features")); + return -1; + } + + return 0; +} + + +/** + * nodeSuspendForDuration: + * @conn: pointer to the hypervisor connection + * @state: the state to which the host must be suspended to - + * VIR_HOST_PM_S3 (Suspend-to-RAM) + * or VIR_HOST_PM_S4 (Suspend-to-disk) + * @duration: the time duration in seconds, for which the host + * must be suspended + * + * Suspend the node (host machine) for the given duration of time + * in the specified state (such as S3 or S4). Resume the node + * after the time duration is complete. + * + * An RTC alarm is set appropriately to wake up the node from + * its sleep state. Then the actual node suspend is carried out + * asynchronously in another thread, after a short time delay + * so as to give enough time for this function to return status + * to its caller through the connection. + * + * Returns 0 in case the node is going to be suspended after a short + * delay, -1 if suspending the node is not supported, or if a + * previous suspend operation is still in progress. + */ +int nodeSuspendForDuration(virConnectPtr conn ATTRIBUTE_UNUSED, + int state, + unsigned long long duration) +{ + static virThread thread; + char *cmdString; + + if (!initialized) { + if(virSuspendInit() < 0) + return -1; + initialized = 1; + } + + /* + * Ensure that we are the only ones trying to suspend. + * Fail if somebody has already initiated suspend. + */ + if (!virSuspendLock()) + return -1; + + /* Check if the host supports the requested suspend state */ + switch (state) { + case VIR_S3: + if (hostPMFeatures & VIR_S3) { + cmdString = strdup("pm-suspend"); + if (cmdString == NULL) { + virReportOOMError(); + goto cleanup; + } + break; + } + goto cleanup; + case VIR_S4: + if (hostPMFeatures & VIR_S4) { + cmdString = strdup("pm-hibernate"); + if (cmdString == NULL) { + virReportOOMError(); + goto cleanup; + } + break; + } + goto cleanup; + default: + goto cleanup; + } + + /* Just set the RTC alarm. Don't suspend yet. */ + if (setNodeWakeup(duration) < 0) { + nodeReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Failed to set up the RTC alarm for node wakeup\n")); + goto cleanup; + } + + if (virThreadCreate(&thread, false, nodeSuspend, (void *)cmdString) < 0) { + nodeReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Failed to create thread to suspend the host\n")); + goto cleanup; + } + + /* virSuspendUnlock() must be called only after resume is complete, + * in the thread that did the suspend and resume. + */ + return 0; + +cleanup: + virSuspendUnlock(); + return -1; +} + +#define MAX_SUSPEND_DURATION 365*24*60*60 /* 1 year, a reasonable max? */ + +/** + * setNodeWakeup: + * @alarmTime : time in seconds from now, at which the RTC alarm has to be set. + * + * Set up the RTC alarm to the specified time. + * Return 0 on success, -1 on failure. + */ + +int setNodeWakeup(unsigned long long alarmTime) +{ + virCommandPtr setAlarmCmd; + int ret = -1; + + if (alarmTime <= SUSPEND_DELAY || alarmTime > MAX_SUSPEND_DURATION) + return -1; + + setAlarmCmd = virCommandNewArgList("rtcwake", "-m", "no", "-s", NULL); + virCommandAddArgFormat(setAlarmCmd, "%lld", alarmTime); + + if (virCommandRun(setAlarmCmd, NULL) < 0) { + nodeReportError(VIR_ERR_INTERNAL_ERROR, + "%s", _("Failed to set up the RTC alarm\n")); + goto cleanup; + } + + ret = 0; + +cleanup: + virCommandFree(setAlarmCmd); + return ret; +} + +/** + * nodeSuspend: + * @cmdString: pointer to the command string this thread has to execute. + * + * Actually perform the suspend operation by invoking the + * command. Give a short delay before executing the command + * so as to give a chance to virNodeSuspendForDuration() to + * return the status to the caller. If we don't give this delay, + * that function will not be able to return the status since the + * suspend operation would have begun and hence no data can be + * sent through the connection to the caller. + */ + +void nodeSuspend(void *cmdString) +{ + virCommandPtr suspendCmd = virCommandNew((char *)cmdString); + + VIR_FREE(cmdString); + + /* Delay for sometime so that the function nodeSuspendForDuration() + * can return the status to the caller. + */ + sleep(SUSPEND_DELAY); + if (virCommandRun(suspendCmd, NULL) < 0) { + nodeReportError(VIR_ERR_INTERNAL_ERROR, + "%s", _("Failed to suspend the node\n")); + } + + virCommandFree(suspendCmd); + + /* Now that we have resumed from suspend, release the lock we + * had acquired while suspending. + */ + virSuspendUnlock(); +} + + +void virSuspendCleanup(void) +{ + if(initialized) + virMutexDestroy(&virSuspendMutex); +} + diff --git a/src/nodeinfo.h b/src/nodeinfo.h index 4766152..186547a 100644 --- a/src/nodeinfo.h +++ b/src/nodeinfo.h @@ -46,4 +46,18 @@ int nodeGetCellsFreeMemory(virConnectPtr conn, int maxCells); unsigned long long nodeGetFreeMemory(virConnectPtr conn); +int virSuspendLock(void); +void virSuspendUnlock(void); +int virSuspendInit(void); +int nodeSuspendForDuration(virConnectPtr conn, + int state, + unsigned long long duration); + +int setNodeWakeup(unsigned long long alarmTime); + +# define SUSPEND_DELAY 10 /* in seconds */ + +void nodeSuspend(void *cmdString); +void virSuspendCleanup(void); + #endif /* __VIR_NODEINFO_H__*/ diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 118197a..b4dc582 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -778,6 +778,11 @@ qemudShutdown(void) { virSysinfoDefFree(qemu_driver->hostsysinfo); + /* Cleanup up the structures initialized for + * suspending the host. + */ + virSuspendCleanup(); + qemuProcessAutoDestroyShutdown(qemu_driver); VIR_FREE(qemu_driver->configDir); diff --git a/src/util/threads-pthread.c b/src/util/threads-pthread.c index 5b8fd5b..e0a4f71 100644 --- a/src/util/threads-pthread.c +++ b/src/util/threads-pthread.c @@ -81,10 +81,25 @@ void virMutexDestroy(virMutexPtr m) pthread_mutex_destroy(&m->lock); } -void virMutexLock(virMutexPtr m){ +void virMutexLock(virMutexPtr m) +{ pthread_mutex_lock(&m->lock); } +/** + * virMutexTryLock: + * This is same as virMutexLock() except that + * if the mutex is unavailable (already locked), + * this fails and returns an error. + * + * Returns 1 if the lock was acquired, 0 if there was + * contention or error. + */ +int virMutexTryLock(virMutexPtr m) +{ + return !pthread_mutex_trylock(&m->lock); +} + void virMutexUnlock(virMutexPtr m) { pthread_mutex_unlock(&m->lock); diff --git a/src/util/threads.h b/src/util/threads.h index b72610c..5ef8714 100644 --- a/src/util/threads.h +++ b/src/util/threads.h @@ -81,6 +81,7 @@ int virMutexInitRecursive(virMutexPtr m) ATTRIBUTE_RETURN_CHECK; void virMutexDestroy(virMutexPtr m); void virMutexLock(virMutexPtr m); +int virMutexTryLock(virMutexPtr m); void virMutexUnlock(virMutexPtr m);

On 11/09/2011 05:05 AM, Srivatsa S. Bhat wrote:
Add the core functions that implement the functionality of the API. Suspend is done by using an asynchronous mechanism so that we can return the status to the caller successfully before the host gets suspended. This asynchronous operation is achieved by suspending the host in a separate thread of execution.
To resume the host, an RTC alarm is set up (based on how long we want to suspend) before suspending the host. When this alarm fires, the host gets woken up.
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> ---
include/libvirt/libvirt.h.in | 5 + src/libvirt_private.syms | 7 + src/nodeinfo.c | 220 ++++++++++++++++++++++++++++++++++++++++++ src/nodeinfo.h | 14 +++ src/qemu/qemu_driver.c | 5 + src/util/threads-pthread.c | 17 +++ src/util/threads.h | 1
It looks weird seeing a public change (include/libvirt/libvirt.h.in) mixed in with a driver-specific change (src/qemu/qemu_driver.c). I would favor re-ordering the patches, by splitting patch 2/2 into two parts (public API, then src/remote/remote_protocol.x implementation of remote driver), then putting this patch 1/2 as the third patch. We also need a fourth patch, to tools/virsh.{pod,c} to expose the new feature through virsh.
7 files changed, 268 insertions(+), 1 deletions(-)
diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in index aa320b6..25f1c9b 100644 --- a/include/libvirt/libvirt.h.in +++ b/include/libvirt/libvirt.h.in @@ -3357,6 +3357,11 @@ typedef struct _virTypedParameter virMemoryParameter; */ typedef virMemoryParameter *virMemoryParameterPtr;
+typedef enum { + VIR_S3 = 1, /* Suspend-to-RAM */ + VIR_S4 = 2 /* Suspend-to-disk */
We tend to use trailing commas in enum decls (we require C99, after all), so that if we later add VIR_... = 3, we don't have to edit the VIR_S4 = 2, /* Suspend-to-disk */ line.
+} virSuspendState; +
This hunk should be moved alongside the half of 2/2 that deals with the public API. Also, float it up to appear earlier in the file; virMemoryParameter is in the section of deprecated names at the bottom for a reason, but virSuspendState is not deprecated, so it should appear sooner. I wonder if we should name this VIR_NODE_S3 and virNodeSuspendState, since it a different concept of 'suspend' than virDomainSuspend.
+++ b/src/libvirt_private.syms @@ -844,6 +844,13 @@ nodeGetCellsFreeMemory; nodeGetFreeMemory; nodeGetInfo; nodeGetMemoryStats; +virSuspendLock; +virSuspendUnlock; +virSuspendInit; +nodeSuspendForDuration; +setNodeWakeup; +nodeSuspend; +virSuspendCleanup;
Sort these lines.
@@ -897,3 +905,215 @@ unsigned long long nodeGetFreeMemory(virConnectPtr conn ATTRIBUTE_UNUSED) return 0; } #endif + + +static int initialized; +virMutex virSuspendMutex; + +int virSuspendLock(void) +{ + return virMutexTryLock(&virSuspendMutex); +} + +void virSuspendUnlock(void) +{ + virMutexUnlock(&virSuspendMutex); +} + +/** + * virSuspendInit: + * + * Get the low power states supported by the host, such as Suspend-to-RAM (S3) + * or Suspend-to-Disk (S4), so that a request to suspend/hibernate the host + * can be handled appropriately based on this information. + * + * Returns 0 if successful, and -1 in case of error. + */ +int virSuspendInit(void) +{ + + if (virMutexInit(&virSuspendMutex) < 0) + return -1; + + /* Get the power management capabilities supported by the host. + * Ensure that this is done only once, by using the 'initialized' + * variable.
Should this function be marked static? If it should only be called once, and 'initialized' is static, then no one outside of this file should be calling it.
+ */ + if (virGetPMCapabilities(&hostPMFeatures) < 0) { + VIR_ERROR(_("Failed to get host power management features")); + return -1; + } + + return 0; +} + + +/** + * nodeSuspendForDuration: + * @conn: pointer to the hypervisor connection + * @state: the state to which the host must be suspended to - + * VIR_HOST_PM_S3 (Suspend-to-RAM) + * or VIR_HOST_PM_S4 (Suspend-to-disk) + * @duration: the time duration in seconds, for which the host + * must be suspended + * + * Suspend the node (host machine) for the given duration of time + * in the specified state (such as S3 or S4). Resume the node + * after the time duration is complete. + * + * An RTC alarm is set appropriately to wake up the node from + * its sleep state. Then the actual node suspend is carried out + * asynchronously in another thread, after a short time delay + * so as to give enough time for this function to return status + * to its caller through the connection. + * + * Returns 0 in case the node is going to be suspended after a short + * delay, -1 if suspending the node is not supported, or if a + * previous suspend operation is still in progress.
Do we need a way to tune the delay before the suspend? I'm worried that a heavily-loaded machine will fail to send the response prior to the suspend. Maybe the best we can do is state that the response is best-effort, and should generally happen due to the short delay but might be missed if the machine suspends first. Or can we wire things into virnetserver.c and friends to guarantee that we don't attempt the suspend until after the response has been successfully delivered?
+ */ +int nodeSuspendForDuration(virConnectPtr conn ATTRIBUTE_UNUSED, + int state, + unsigned long long duration) +{ + static virThread thread; + char *cmdString; + + if (!initialized) { + if(virSuspendInit() < 0) + return -1; + initialized = 1; + } + + /* + * Ensure that we are the only ones trying to suspend. + * Fail if somebody has already initiated suspend. + */ + if (!virSuspendLock()) + return -1; + + /* Check if the host supports the requested suspend state */ + switch (state) { + case VIR_S3:
This says VIR_S3, but the docs above said VIR_HOST_PM_S3. We need to have consistent naming between the code and docs.
+ if (hostPMFeatures & VIR_S3) { + cmdString = strdup("pm-suspend"); + if (cmdString == NULL) { + virReportOOMError(); + goto cleanup; + } + break; + } + goto cleanup; + case VIR_S4: + if (hostPMFeatures & VIR_S4) { + cmdString = strdup("pm-hibernate"); + if (cmdString == NULL) { + virReportOOMError(); + goto cleanup; + } + break; + } + goto cleanup; + default: + goto cleanup; + } + + /* Just set the RTC alarm. Don't suspend yet. */ + if (setNodeWakeup(duration) < 0) { + nodeReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Failed to set up the RTC alarm for node wakeup\n")); + goto cleanup; + } + + if (virThreadCreate(&thread, false, nodeSuspend, (void *)cmdString) < 0) { + nodeReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Failed to create thread to suspend the host\n")); + goto cleanup; + } + + /* virSuspendUnlock() must be called only after resume is complete, + * in the thread that did the suspend and resume. + */ + return 0; + +cleanup: + virSuspendUnlock();
Memory leak - you don't call VIR_FREE(cmdString) on all paths.
+ return -1; +} + +#define MAX_SUSPEND_DURATION 365*24*60*60 /* 1 year, a reasonable max? */
Any time you impose an arbitrary limit (and 1 year can be deemed arbitrary), you are risking problems. There should be a real technical reason why (and if) we have to impose a limit, not an arbitrary time duration; otherwise, I would favor letting the user sleep as long as the RTC clock supports (even if the sleep amount seems ridiculous to me). This is as far as I got on my review today; I'll pick up here tomorrow. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 11/22/2011 06:18 AM, Eric Blake wrote:
On 11/09/2011 05:05 AM, Srivatsa S. Bhat wrote:
Add the core functions that implement the functionality of the API. Suspend is done by using an asynchronous mechanism so that we can return the status to the caller successfully before the host gets suspended. This asynchronous operation is achieved by suspending the host in a separate thread of execution.
To resume the host, an RTC alarm is set up (based on how long we want to suspend) before suspending the host. When this alarm fires, the host gets woken up.
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> ---
[...]
+ +#define MAX_SUSPEND_DURATION 365*24*60*60 /* 1 year, a reasonable max? */
Any time you impose an arbitrary limit (and 1 year can be deemed arbitrary), you are risking problems. There should be a real technical reason why (and if) we have to impose a limit, not an arbitrary time duration; otherwise, I would favor letting the user sleep as long as the RTC clock supports (even if the sleep amount seems ridiculous to me).
This is as far as I got on my review today; I'll pick up here tomorrow.
Hi Eric, Thanks a lot for the review comments! I'll address them in my next iteration of the patchset and post out soon. Thanks, Srivatsa S. Bhat IBM Linux Technology Center

On 11/09/2011 05:05 AM, Srivatsa S. Bhat wrote:
Add the core functions that implement the functionality of the API. Suspend is done by using an asynchronous mechanism so that we can return the status to the caller successfully before the host gets suspended. This asynchronous operation is achieved by suspending the host in a separate thread of execution.
+ +#define MAX_SUSPEND_DURATION 365*24*60*60 /* 1 year, a reasonable max? */
Resuming where I left off...
+ +/** + * setNodeWakeup: + * @alarmTime : time in seconds from now, at which the RTC alarm has to be set. + * + * Set up the RTC alarm to the specified time. + * Return 0 on success, -1 on failure. + */ + +int setNodeWakeup(unsigned long long alarmTime) +{ + virCommandPtr setAlarmCmd; + int ret = -1; + + if (alarmTime <= SUSPEND_DELAY || alarmTime > MAX_SUSPEND_DURATION)
Why is SUSPEND_DELAY in nodeinfo.h but MAX_SUSPEND_DURATION in the .c? They might as well be next to one another.
+ return -1; + + setAlarmCmd = virCommandNewArgList("rtcwake", "-m", "no", "-s", NULL); + virCommandAddArgFormat(setAlarmCmd, "%lld", alarmTime);
'man rtcwake' says that not all systems support RTC wake from S4; systems that have a functioning nvram-wakeup will succeed, but that is not all systems. Do we need to be more cautious about allowing S4 suspension if we can't prove that RTC will wake up the system from S4? On the other hand, you are using -m no to just set the wakeup time, which ought to fail if the system does not support the requested delay or the ability to wake up, so that you never actually suspend until after you know the wakeup was successfully scheduled. Hmm, does that mean the public API should provide a way to schedule the wakeup without also scheduling a suspend?
+++ b/src/util/threads-pthread.c @@ -81,10 +81,25 @@ void virMutexDestroy(virMutexPtr m) pthread_mutex_destroy(&m->lock); }
-void virMutexLock(virMutexPtr m){ +void virMutexLock(virMutexPtr m) +{ pthread_mutex_lock(&m->lock); }
+/** + * virMutexTryLock:
I'm not convinced we need this. As I understand it, your code does: thread 1: thread 2: thread 3: request suspend grab lock spawn helper sleep 10 sec return success request suspend lock not available return failure suspend resume request suspend lock not available return failure release lock But we don't need a try-lock operation, if we do: thread 1: thread 2: thread 3: request suspend grab lock request suspend mark flag to true release lock grab lock flag already true release lock return failure spawn helper sleep 10 sec return success suspend resume grab lock flag already true release lock return failure grab lock clear flag release lock That is, instead of holding the lock for the entire duration of the suspend, just hold the lock long enough to mark a volatile variable; then you no longer need a non-blocking try-lock primitive, because the lock will never starve anyone else long enough to be an issue. But if you can still convince me that we need a try-lock operation, then it should probably be added as a separate commit, alongside an implementation for mingw in the same commit (without a mingw implementation, you will cause a build failure for mingw32-libvirt).
+ * This is same as virMutexLock() except that + * if the mutex is unavailable (already locked), + * this fails and returns an error. + * + * Returns 1 if the lock was acquired, 0 if there was + * contention or error. + */ +int virMutexTryLock(virMutexPtr m) +{ + return !pthread_mutex_trylock(&m->lock);
Either return a bool (if we don't care about distinguishing why the lock failed) or keep the return as an int but make it tri-state (1 for grabbed, 0 for EBUSY, and -1 for all other failures (such as EINVAL).
+++ b/src/util/threads.h @@ -81,6 +81,7 @@ int virMutexInitRecursive(virMutexPtr m) ATTRIBUTE_RETURN_CHECK; void virMutexDestroy(virMutexPtr m);
void virMutexLock(virMutexPtr m); +int virMutexTryLock(virMutexPtr m);
And if you convince me we need this, then mark it ATTRIBUTE_RETURN_CHECK. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 11/22/2011 11:33 PM, Eric Blake wrote:
On 11/09/2011 05:05 AM, Srivatsa S. Bhat wrote:
Add the core functions that implement the functionality of the API. Suspend is done by using an asynchronous mechanism so that we can return the status to the caller successfully before the host gets suspended. This asynchronous operation is achieved by suspending the host in a separate thread of execution.
+ [...]
+ return -1; + + setAlarmCmd = virCommandNewArgList("rtcwake", "-m", "no", "-s", NULL); + virCommandAddArgFormat(setAlarmCmd, "%lld", alarmTime);
'man rtcwake' says that not all systems support RTC wake from S4; systems that have a functioning nvram-wakeup will succeed, but that is not all systems. Do we need to be more cautious about allowing S4 suspension if we can't prove that RTC will wake up the system from S4?
On the other hand, you are using -m no to just set the wakeup time, which ought to fail if the system does not support the requested delay or the ability to wake up, so that you never actually suspend until after you know the wakeup was successfully scheduled.
Hmm, does that mean the public API should provide a way to schedule the wakeup without also scheduling a suspend?
But how would that help? The aim of having this API is to suspend and resume the system.. and hence I don't see why it has to expose a functionality to only schedule a wakeup..
+++ b/src/util/threads-pthread.c @@ -81,10 +81,25 @@ void virMutexDestroy(virMutexPtr m) pthread_mutex_destroy(&m->lock); }
-void virMutexLock(virMutexPtr m){ +void virMutexLock(virMutexPtr m) +{ pthread_mutex_lock(&m->lock); }
+/** + * virMutexTryLock:
I'm not convinced we need this. As I understand it, your code does:
thread 1: thread 2: thread 3: request suspend grab lock spawn helper sleep 10 sec return success request suspend lock not available return failure suspend resume request suspend lock not available return failure release lock
But we don't need a try-lock operation, if we do:
thread 1: thread 2: thread 3: request suspend grab lock request suspend mark flag to true release lock grab lock flag already true release lock return failure spawn helper sleep 10 sec return success suspend resume grab lock flag already true release lock return failure grab lock clear flag release lock
That is, instead of holding the lock for the entire duration of the suspend, just hold the lock long enough to mark a volatile variable; then you no longer need a non-blocking try-lock primitive, because the lock will never starve anyone else long enough to be an issue.
Yes, that would work. (In fact, that was the way I first developed the code. But then I felt trylock() was a fairly popular primitive to use in such cases and hence I thought I might as well add it to libvirt). Anyways, I am fine with going with the method you suggested above. Thanks, Srivatsa S. Bhat IBM Linux Technology Center

Define the required interfaces to export the API. Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> --- include/libvirt/libvirt.h.in | 4 ++++ src/driver.h | 5 ++++ src/libvirt.c | 48 ++++++++++++++++++++++++++++++++++++++++++ src/libvirt_public.syms | 1 + src/qemu/qemu_driver.c | 1 + src/remote/remote_driver.c | 1 + src/remote/remote_protocol.x | 12 ++++++++++- 7 files changed, 71 insertions(+), 1 deletions(-) diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in index 25f1c9b..809a1fd 100644 --- a/include/libvirt/libvirt.h.in +++ b/include/libvirt/libvirt.h.in @@ -1055,6 +1055,10 @@ unsigned long long virNodeGetFreeMemory (virConnectPtr conn); int virNodeGetSecurityModel (virConnectPtr conn, virSecurityModelPtr secmodel); +int virNodeSuspendForDuration (virConnectPtr conn, + int state, + unsigned long long duration); + /* * Gather list of running domains */ diff --git a/src/driver.h b/src/driver.h index 4c14aaa..981bfae 100644 --- a/src/driver.h +++ b/src/driver.h @@ -740,6 +740,10 @@ typedef int (*virDrvDomainBlockPull)(virDomainPtr dom, const char *path, unsigned long bandwidth, unsigned int flags); +typedef int + (*virDrvNodeSuspendForDuration)(virConnectPtr conn, int state, + unsigned long long duration); + /** * _virDriver: @@ -899,6 +903,7 @@ struct _virDriver { virDrvDomainGetBlockJobInfo domainGetBlockJobInfo; virDrvDomainBlockJobSetSpeed domainBlockJobSetSpeed; virDrvDomainBlockPull domainBlockPull; + virDrvNodeSuspendForDuration nodeSuspendForDuration; }; typedef int diff --git a/src/libvirt.c b/src/libvirt.c index b0d1e01..fc4575a 100644 --- a/src/libvirt.c +++ b/src/libvirt.c @@ -6303,6 +6303,54 @@ error: } /** + * virNodeSuspendForDuration: + * @conn: pointer to the hypervisor connection + * @state: the state to which the host must be suspended to, + * such as: VIR_S3 (Suspend-to-RAM) + * VIR_S4 (Suspend-to-Disk) + * @duration: the time duration in seconds, for which the host + * has to be suspended + * + * Suspend the node (host machine) for the given duration of time + * in the specified state (such as S3 or S4). Resume the node + * after the time duration is complete. + * + * Returns 0 on success (i.e., the node will be suspended after a + * short delay), -1 on failure (the operation is not supported). + */ +int +virNodeSuspendForDuration(virConnectPtr conn, + int state, + unsigned long long duration) +{ + + VIR_DEBUG("conn=%p, state=%d, duration=%lld", conn, state, duration); + + virResetLastError(); + + if (!VIR_IS_CONNECT(conn)) { + virLibConnError(VIR_ERR_INVALID_CONN, __FUNCTION__); + virDispatchError(NULL); + return -1; + } + + if (conn->driver->nodeSuspendForDuration) { + int ret; + ret = conn->driver->nodeSuspendForDuration(conn, state, duration); + if (ret < 0) + goto error; + return ret; + } + + virLibConnError(VIR_ERR_NO_SUPPORT, __FUNCTION__); + +error: + virDispatchError(conn); + return -1; +} + + +/** * virDomainGetSchedulerType: * @domain: pointer to domain object * @nparams: pointer to number of scheduler parameters, can be NULL diff --git a/src/libvirt_public.syms b/src/libvirt_public.syms index bcefb10..fd44170 100644 --- a/src/libvirt_public.syms +++ b/src/libvirt_public.syms @@ -478,6 +478,7 @@ LIBVIRT_0.9.4 { virDomainGetBlockJobInfo; virDomainBlockJobSetSpeed; virDomainBlockPull; + virNodeSuspendForDuration; } LIBVIRT_0.9.3; LIBVIRT_0.9.5 { diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index b4dc582..f744539 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -10911,6 +10911,7 @@ static virDriver qemuDriver = { .domainGetBlockJobInfo = qemuDomainGetBlockJobInfo, /* 0.9.4 */ .domainBlockJobSetSpeed = qemuDomainBlockJobSetSpeed, /* 0.9.4 */ .domainBlockPull = qemuDomainBlockPull, /* 0.9.4 */ + .nodeSuspendForDuration = nodeSuspendForDuration, /* 0.9.7 */ }; diff --git a/src/remote/remote_driver.c b/src/remote/remote_driver.c index f3b8ad5..2f6b29a 100644 --- a/src/remote/remote_driver.c +++ b/src/remote/remote_driver.c @@ -4526,6 +4526,7 @@ static virDriver remote_driver = { .domainGetBlockJobInfo = remoteDomainGetBlockJobInfo, /* 0.9.4 */ .domainBlockJobSetSpeed = remoteDomainBlockJobSetSpeed, /* 0.9.4 */ .domainBlockPull = remoteDomainBlockPull, /* 0.9.4 */ + .nodeSuspendForDuration = remoteNodeSuspendForDuration, /* 0.9.7 */ }; static virNetworkDriver network_driver = { diff --git a/src/remote/remote_protocol.x b/src/remote/remote_protocol.x index a174af8..5c21421 100644 --- a/src/remote/remote_protocol.x +++ b/src/remote/remote_protocol.x @@ -2267,6 +2267,15 @@ struct remote_domain_open_graphics_args { unsigned int flags; }; +struct remote_node_suspend_for_duration_args { + int state; + unsigned hyper duration; +}; + +struct remote_node_suspend_for_duration_ret { + int status; +}; + /*----- Protocol. -----*/ /* Define the program number, protocol version and procedure numbers here. */ @@ -2562,7 +2571,8 @@ enum remote_procedure { REMOTE_PROC_DOMAIN_SNAPSHOT_NUM_CHILDREN = 246, /* autogen autogen priority:high */ REMOTE_PROC_DOMAIN_SNAPSHOT_LIST_CHILDREN_NAMES = 247, /* autogen autogen priority:high */ REMOTE_PROC_DOMAIN_EVENT_DISK_CHANGE = 248, /* skipgen skipgen */ - REMOTE_PROC_DOMAIN_OPEN_GRAPHICS = 249 /* skipgen skipgen */ + REMOTE_PROC_DOMAIN_OPEN_GRAPHICS = 249, /* skipgen skipgen */ + REMOTE_PROC_NODE_SUSPEND_FOR_DURATION = 250 /* autogen autogen */ /* * Notice how the entries are grouped in sets of 10 ?

On 11/09/2011 05:05 AM, Srivatsa S. Bhat wrote:
Define the required interfaces to export the API.
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> ---
include/libvirt/libvirt.h.in | 4 ++++ src/driver.h | 5 ++++ src/libvirt.c | 48 ++++++++++++++++++++++++++++++++++++++++++ src/libvirt_public.syms | 1 +
These changes are good for the first patch, in introducing the API. Plus see my comment in 1/2 about the hunk defining VIR_S3 (or whatever name we settle on) being part of this patch.
src/qemu/qemu_driver.c | 1 +
This change should be shuffled into the qemu driver implementation of the driver backend.
src/remote/remote_driver.c | 1 + src/remote/remote_protocol.x | 12 ++++++++++-
These changes belong to a second patch, implementing the RPC driver backend. And don't forget a patch to virsh.
+++ b/include/libvirt/libvirt.h.in @@ -1055,6 +1055,10 @@ unsigned long long virNodeGetFreeMemory (virConnectPtr conn); int virNodeGetSecurityModel (virConnectPtr conn, virSecurityModelPtr secmodel);
+int virNodeSuspendForDuration (virConnectPtr conn, + int state, + unsigned long long duration);
Needs an 'unsigned int flags' argument, even if we always pass 0 for now.
+++ b/src/libvirt.c @@ -6303,6 +6303,54 @@ error: }
/** + * virNodeSuspendForDuration: + * @conn: pointer to the hypervisor connection + * @state: the state to which the host must be suspended to, + * such as: VIR_S3 (Suspend-to-RAM) + * VIR_S4 (Suspend-to-Disk)
Is this a bitmask or a linear list? I guess a list makes more sense. Also, some machines support a hybrid between S3 and S4 (pm-is-supported --suspend-hybrid), which saves state to disk like S4 but then drops to S3 for faster resume as long as power is present. Perhaps a hybrid state is deserving of a third value in the enum?
+ * @duration: the time duration in seconds, for which the host + * has to be suspended + * + * Suspend the node (host machine) for the given duration of time + * in the specified state (such as S3 or S4). Resume the node + * after the time duration is complete.
We might want to be a bit more "fuzzy" on how we word this, so that we aren't making impossible promises. Something like: Attempt to suspend the node (host machine) for the given duration of time into the specified state (S3 for suspend to RAM, S4 for suspend to disk). Schedule the node's real-time-clock interrupt to resume the node after the time duration is complete.
+ * + * Returns 0 on success (i.e., the node will be suspended after a + * short delay), -1 on failure (the operation is not supported).
(the operation is not supported, or an attempted suspend is already underway).
+ */ +int +virNodeSuspendForDuration(virConnectPtr conn, + int state, + unsigned long long duration) +{ + + VIR_DEBUG("conn=%p, state=%d, duration=%lld", conn, state, duration); + + virResetLastError(); + + if (!VIR_IS_CONNECT(conn)) { + virLibConnError(VIR_ERR_INVALID_CONN, __FUNCTION__); + virDispatchError(NULL); + return -1; + } +
You must check for and reject an attempt to use this on a read-only connection.
+ if (conn->driver->nodeSuspendForDuration) { + int ret; + ret = conn->driver->nodeSuspendForDuration(conn, state, duration);
Be sure to pass a flags argument on to the driver.
+ if (ret < 0) + goto error; + return ret; + } + + virLibConnError(VIR_ERR_NO_SUPPORT, __FUNCTION__); + +error: + virDispatchError(conn); + return -1; +} + + +/** * virDomainGetSchedulerType: * @domain: pointer to domain object * @nparams: pointer to number of scheduler parameters, can be NULL diff --git a/src/libvirt_public.syms b/src/libvirt_public.syms index bcefb10..fd44170 100644 --- a/src/libvirt_public.syms +++ b/src/libvirt_public.syms @@ -478,6 +478,7 @@ LIBVIRT_0.9.4 { virDomainGetBlockJobInfo; virDomainBlockJobSetSpeed; virDomainBlockPull; + virNodeSuspendForDuration; } LIBVIRT_0.9.3;
The new API must be in a LIBVIRT_0.9.8 section at the bottom of the file; we are NOT re-doing 0.9.3.
LIBVIRT_0.9.5 { diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index b4dc582..f744539 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -10911,6 +10911,7 @@ static virDriver qemuDriver = { .domainGetBlockJobInfo = qemuDomainGetBlockJobInfo, /* 0.9.4 */ .domainBlockJobSetSpeed = qemuDomainBlockJobSetSpeed, /* 0.9.4 */ .domainBlockPull = qemuDomainBlockPull, /* 0.9.4 */ + .nodeSuspendForDuration = nodeSuspendForDuration, /* 0.9.7 */
0.9.8.
};
diff --git a/src/remote/remote_driver.c b/src/remote/remote_driver.c index f3b8ad5..2f6b29a 100644 --- a/src/remote/remote_driver.c +++ b/src/remote/remote_driver.c @@ -4526,6 +4526,7 @@ static virDriver remote_driver = { .domainGetBlockJobInfo = remoteDomainGetBlockJobInfo, /* 0.9.4 */ .domainBlockJobSetSpeed = remoteDomainBlockJobSetSpeed, /* 0.9.4 */ .domainBlockPull = remoteDomainBlockPull, /* 0.9.4 */ + .nodeSuspendForDuration = remoteNodeSuspendForDuration, /* 0.9.7 */
0.9.8.
+++ b/src/remote/remote_protocol.x @@ -2267,6 +2267,15 @@ struct remote_domain_open_graphics_args { unsigned int flags; };
+struct remote_node_suspend_for_duration_args { + int state; + unsigned hyper duration;
needs a flags member.
+}; + +struct remote_node_suspend_for_duration_ret { + int status; +};
Do we really need this? Given that we return exactly two values (0 on success, -1 on failure), I think this is like other APIs that need no _ret object (for example, remote_domain_set_blkio_parameters_args), because the default return values are already handled by the basic RPC operation. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 11/22/2011 11:56 PM, Eric Blake wrote:
On 11/09/2011 05:05 AM, Srivatsa S. Bhat wrote:
Define the required interfaces to export the API.
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> ---
include/libvirt/libvirt.h.in | 4 ++++ src/driver.h | 5 ++++ src/libvirt.c | 48 ++++++++++++++++++++++++++++++++++++++++++ src/libvirt_public.syms | 1 +
These changes are good for the first patch, in introducing the API. Plus see my comment in 1/2 about the hunk defining VIR_S3 (or whatever name we settle on) being part of this patch.
src/qemu/qemu_driver.c | 1 +
This change should be shuffled into the qemu driver implementation of the driver backend.
src/remote/remote_driver.c | 1 + src/remote/remote_protocol.x | 12 ++++++++++-
These changes belong to a second patch, implementing the RPC driver backend. And don't forget a patch to virsh.
+++ b/include/libvirt/libvirt.h.in @@ -1055,6 +1055,10 @@ unsigned long long virNodeGetFreeMemory (virConnectPtr conn); int virNodeGetSecurityModel (virConnectPtr conn, virSecurityModelPtr secmodel);
+int virNodeSuspendForDuration (virConnectPtr conn, + int state, + unsigned long long duration);
Needs an 'unsigned int flags' argument, even if we always pass 0 for now.
+++ b/src/libvirt.c @@ -6303,6 +6303,54 @@ error: }
/** + * virNodeSuspendForDuration: + * @conn: pointer to the hypervisor connection + * @state: the state to which the host must be suspended to, + * such as: VIR_S3 (Suspend-to-RAM) + * VIR_S4 (Suspend-to-Disk)
Is this a bitmask or a linear list? I guess a list makes more sense. Also, some machines support a hybrid between S3 and S4 (pm-is-supported --suspend-hybrid), which saves state to disk like S4 but then drops to S3 for faster resume as long as power is present. Perhaps a hybrid state is deserving of a third value in the enum?
That would need a corresponding change in virGetPMCapabilities etc (which was introduced in the other patch which Daniel Veillard pushed already). http://www.redhat.com/archives/libvir-list/2011-November/msg01155.html Perhaps, I'll send a companion patch to add the hybrid-suspend discovery to that, and then base the next version of this patchset on that. -- Regards, Srivatsa S. Bhat IBM Linux Technology Center

On 11/23/2011 04:10 PM, Srivatsa S. Bhat wrote:
On 11/22/2011 11:56 PM, Eric Blake wrote:
On 11/09/2011 05:05 AM, Srivatsa S. Bhat wrote:
Define the required interfaces to export the API.
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> ---
include/libvirt/libvirt.h.in | 4 ++++ src/driver.h | 5 ++++ src/libvirt.c | 48 ++++++++++++++++++++++++++++++++++++++++++ src/libvirt_public.syms | 1 +
These changes are good for the first patch, in introducing the API. Plus see my comment in 1/2 about the hunk defining VIR_S3 (or whatever name we settle on) being part of this patch.
src/qemu/qemu_driver.c | 1 +
This change should be shuffled into the qemu driver implementation of the driver backend.
src/remote/remote_driver.c | 1 + src/remote/remote_protocol.x | 12 ++++++++++-
These changes belong to a second patch, implementing the RPC driver backend. And don't forget a patch to virsh.
+++ b/include/libvirt/libvirt.h.in @@ -1055,6 +1055,10 @@ unsigned long long virNodeGetFreeMemory (virConnectPtr conn); int virNodeGetSecurityModel (virConnectPtr conn, virSecurityModelPtr secmodel);
+int virNodeSuspendForDuration (virConnectPtr conn, + int state, + unsigned long long duration);
Needs an 'unsigned int flags' argument, even if we always pass 0 for now.
+++ b/src/libvirt.c @@ -6303,6 +6303,54 @@ error: }
/** + * virNodeSuspendForDuration: + * @conn: pointer to the hypervisor connection + * @state: the state to which the host must be suspended to, + * such as: VIR_S3 (Suspend-to-RAM) + * VIR_S4 (Suspend-to-Disk)
Is this a bitmask or a linear list? I guess a list makes more sense. Also, some machines support a hybrid between S3 and S4 (pm-is-supported --suspend-hybrid), which saves state to disk like S4 but then drops to S3 for faster resume as long as power is present. Perhaps a hybrid state is deserving of a third value in the enum?
That would need a corresponding change in virGetPMCapabilities etc (which was introduced in the other patch which Daniel Veillard pushed already). http://www.redhat.com/archives/libvir-list/2011-November/msg01155.html
Perhaps, I'll send a companion patch to add the hybrid-suspend discovery to that, and then base the next version of this patchset on that.
Actually thinking about it, I don't see how that would be useful for a libvirt managed host. The hybrid suspend feature was designed keeping laptops in mind, which when suspended with the power supply turned off (and hence powered by only batteries), the laptops could lose power and power off, and hence in that case resuming the saved contents from disk would be beneficial. I doubt if anyone would be seriously interested in running virtualization software on laptops and managing them via libvirt, rather than using servers for that purpose (which have continuous power-supply). Though I have already written a patch to add the hybrid-suspend capability discovery to libvirt and export it in the XML along with S3 and S4, I would rather prefer not to send that patch if nobody is going to use it. -- Regards, Srivatsa S. Bhat IBM Linux Technology Center

On 11/23/2011 05:33 AM, Srivatsa S. Bhat wrote:
Perhaps, I'll send a companion patch to add the hybrid-suspend discovery to that, and then base the next version of this patchset on that.
Actually thinking about it, I don't see how that would be useful for a libvirt managed host. The hybrid suspend feature was designed keeping laptops in mind, which when suspended with the power supply turned off (and hence powered by only batteries), the laptops could lose power and power off, and hence in that case resuming the saved contents from disk would be beneficial.
Whether to use it is a policy decision. I have much less heartburn providing a feature no one will use, but exposing full flexibility, than I do with failing to provide a feature on a policy decision ("no one will want that") only to find out that someone really did want it.
I doubt if anyone would be seriously interested in running virtualization software on laptops and managing them via libvirt, rather than using servers for that purpose (which have continuous power-supply).
Enterprise VMs, sure. But personal VMs - I run VMs on my laptop, and imagine that I may eventually have a reason to use a hybrid suspend.
Though I have already written a patch to add the hybrid-suspend capability discovery to libvirt and export it in the XML along with S3 and S4, I would rather prefer not to send that patch if nobody is going to use it.
Please go ahead with sending the patch. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 11/23/2011 06:55 PM, Eric Blake wrote:
On 11/23/2011 05:33 AM, Srivatsa S. Bhat wrote:
Perhaps, I'll send a companion patch to add the hybrid-suspend discovery to that, and then base the next version of this patchset on that.
Actually thinking about it, I don't see how that would be useful for a libvirt managed host. The hybrid suspend feature was designed keeping laptops in mind, which when suspended with the power supply turned off (and hence powered by only batteries), the laptops could lose power and power off, and hence in that case resuming the saved contents from disk would be beneficial.
Whether to use it is a policy decision. I have much less heartburn providing a feature no one will use, but exposing full flexibility, than I do with failing to provide a feature on a policy decision ("no one will want that") only to find out that someone really did want it.
I doubt if anyone would be seriously interested in running virtualization software on laptops and managing them via libvirt, rather than using servers for that purpose (which have continuous power-supply).
Enterprise VMs, sure. But personal VMs - I run VMs on my laptop, and imagine that I may eventually have a reason to use a hybrid suspend.
Though I have already written a patch to add the hybrid-suspend capability discovery to libvirt and export it in the XML along with S3 and S4, I would rather prefer not to send that patch if nobody is going to use it.
Please go ahead with sending the patch.
OK, I will send the patch. Thanks for the clarification! Regards, Srivatsa S. Bhat IBM Linux Technology Center

On 11/09/2011 05:05 AM, Srivatsa S. Bhat wrote:
(This patch is positioned to go in after the patch that exports the host power management capabilities as XML, posted in [4])
I'm now reviewing that patch along with this series; if we need another revision, it may be easiest to merge the three patches into a single series and title it v6, so that they are together in one thread.
This patch adds a new API to put a host to a suspended state (either Suspend-to-RAM or Suspend-to-Disk) and setup a timed resume to get the host back online, from within libvirt. This uses the RTC wakeup mechanism to set up a timer alarm before suspending the host, so that in-band resume is facilitated by the firing of the RTC alarm, which wakes up the host.
I'm not as familiar with S3/S4 as I would like to be - am I correct that RTC wakeup works for S3 (where the host is maintaining minimal power and can thus react to interrupts), but not S4 (where the host has flushed completely to disk and powers off, but resumes from the state on disk on next power on)? Or am I misunderstanding these two power-saving states?
The decision to use the RTC Wakeup mechanism to resume the host from sleep was taken in [1]. An initial API was discussed in [2]. Some design ideas for the asynchronous mechanism implementation was discussed in [3].
v3: * Rebased to libvirt 0.9.7 * Added a check to see if alarmTime (suspend duration) is within an acceptable range.
v2: * Added an init function which finds out if S3/S4 is supported by the host, upon the first request to suspend/hibernate. * Added synchronization/locking to ensure that only one suspend operation is active at a time.
v1: http://www.redhat.com/archives/libvir-list/2011-September/msg00830.html v2: http://comments.gmane.org/gmane.comp.emulators.libvirt/46729
References: [1]. http://www.redhat.com/archives/libvir-list/2011-August/msg00327.html [2]. http://www.redhat.com/archives/libvir-list/2011-August/msg00248.html [3]. http://www.redhat.com/archives/libvir-list/2011-September/msg00438.html [4]. http://www.redhat.com/archives/libvir-list/2011-November/msg00378.html
-- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On Mon, Nov 21, 2011 at 05:26:55PM -0700, Eric Blake wrote:
On 11/09/2011 05:05 AM, Srivatsa S. Bhat wrote:
(This patch is positioned to go in after the patch that exports the host power management capabilities as XML, posted in [4])
I'm now reviewing that patch along with this series; if we need another revision, it may be easiest to merge the three patches into a single series and title it v6, so that they are together in one thread.
Ah, too bad, I already reviewed and pushed, that could go standalone even if it is of limited use by itself, Daniel -- Daniel Veillard | libxml Gnome XML XSLT toolkit http://xmlsoft.org/ daniel@veillard.com | Rpmfind RPM search engine http://rpmfind.net/ http://veillard.com/ | virtualization library http://libvirt.org/

On Mon, Nov 21, 2011 at 05:26:55PM -0700, Eric Blake wrote:
On 11/09/2011 05:05 AM, Srivatsa S. Bhat wrote:
(This patch is positioned to go in after the patch that exports the host power management capabilities as XML, posted in [4])
I'm now reviewing that patch along with this series; if we need another revision, it may be easiest to merge the three patches into a single series and title it v6, so that they are together in one thread.
This patch adds a new API to put a host to a suspended state (either Suspend-to-RAM or Suspend-to-Disk) and setup a timed resume to get the host back online, from within libvirt. This uses the RTC wakeup mechanism to set up a timer alarm before suspending the host, so that in-band resume is facilitated by the firing of the RTC alarm, which wakes up the host.
I'm not as familiar with S3/S4 as I would like to be - am I correct that RTC wakeup works for S3 (where the host is maintaining minimal power and can thus react to interrupts), but not S4 (where the host has flushed completely to disk and powers off, but resumes from the state on disk on next power on)? Or am I misunderstanding these two power-saving states?
Theoretically I think you're right, but in current qemu S3 does a hardware reset without reseting the memory, so kind of a sleep + immediate wakeup. S4 is emulated correctly - qemu doesn't have to do anything special.
The decision to use the RTC Wakeup mechanism to resume the host from sleep was taken in [1]. An initial API was discussed in [2]. Some design ideas for the asynchronous mechanism implementation was discussed in [3].
v3: * Rebased to libvirt 0.9.7 * Added a check to see if alarmTime (suspend duration) is within an acceptable range.
v2: * Added an init function which finds out if S3/S4 is supported by the host, upon the first request to suspend/hibernate. * Added synchronization/locking to ensure that only one suspend operation is active at a time.
v1: http://www.redhat.com/archives/libvir-list/2011-September/msg00830.html v2: http://comments.gmane.org/gmane.comp.emulators.libvirt/46729
References: [1]. http://www.redhat.com/archives/libvir-list/2011-August/msg00327.html [2]. http://www.redhat.com/archives/libvir-list/2011-August/msg00248.html [3]. http://www.redhat.com/archives/libvir-list/2011-September/msg00438.html [4]. http://www.redhat.com/archives/libvir-list/2011-November/msg00378.html
-- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

On 11/22/2011 03:48 PM, Alon Levy wrote:
On Mon, Nov 21, 2011 at 05:26:55PM -0700, Eric Blake wrote:
On 11/09/2011 05:05 AM, Srivatsa S. Bhat wrote:
(This patch is positioned to go in after the patch that exports the host power management capabilities as XML, posted in [4])
I'm now reviewing that patch along with this series; if we need another revision, it may be easiest to merge the three patches into a single series and title it v6, so that they are together in one thread.
This patch adds a new API to put a host to a suspended state (either Suspend-to-RAM or Suspend-to-Disk) and setup a timed resume to get the host back online, from within libvirt. This uses the RTC wakeup mechanism to set up a timer alarm before suspending the host, so that in-band resume is facilitated by the firing of the RTC alarm, which wakes up the host.
I'm not as familiar with S3/S4 as I would like to be - am I correct that RTC wakeup works for S3 (where the host is maintaining minimal power and can thus react to interrupts), but not S4 (where the host has flushed completely to disk and powers off, but resumes from the state on disk on next power on)? Or am I misunderstanding these two power-saving states?
Theoretically I think you're right, but in current qemu S3 does a hardware reset without reseting the memory, so kind of a sleep + immediate wakeup. S4 is emulated correctly - qemu doesn't have to do anything special.
Hi Alon, Actually this patchset talks about S3/S4 on the _host_, not the guest. So, in the host, RTC wakeup works for both S3 and S4. Thanks, Srivatsa S. Bhat IBM Linux Technology Center

On 11/22/2011 05:56 AM, Eric Blake wrote:
On 11/09/2011 05:05 AM, Srivatsa S. Bhat wrote:
(This patch is positioned to go in after the patch that exports the host power management capabilities as XML, posted in [4])
I'm now reviewing that patch along with this series; if we need another revision, it may be easiest to merge the three patches into a single series and title it v6, so that they are together in one thread.
This patch adds a new API to put a host to a suspended state (either Suspend-to-RAM or Suspend-to-Disk) and setup a timed resume to get the host back online, from within libvirt. This uses the RTC wakeup mechanism to set up a timer alarm before suspending the host, so that in-band resume is facilitated by the firing of the RTC alarm, which wakes up the host.
I'm not as familiar with S3/S4 as I would like to be - am I correct that RTC wakeup works for S3 (where the host is maintaining minimal power and can thus react to interrupts), but not S4 (where the host has flushed completely to disk and powers off, but resumes from the state on disk on next power on)? Or am I misunderstanding these two power-saving states?
Actually, RTC wakeup works for both S3 and S4. Hence this patch can use that mechanism effectively for both the system-wide sleep states. Thanks, Srivatsa S. Bhat IBM Linux Technology Center
participants (4)
-
Alon Levy
-
Daniel Veillard
-
Eric Blake
-
Srivatsa S. Bhat