[libvirt] [PATCH RFC 0/4] Allow hibernation on guests

As we've added guest agent recently, the whole world of new functionality has opened. As this patch set, which allows domains to enter S4 state. What is needed for this? Patched qemu. As this is not in qemu git, but patches are await to be pushed in very short future. They can be found here: http://lists.gnu.org/archive/html/qemu-devel/2012-01/msg03261.html Despite that, if anybody is willing to give me review if I am going the right way, I'd appreciate it. One thing, that you'll probably notice is this 'set-support-level' command. Basically, it tells GA what qemu version is it running on. Ideally, this should be done as soon as GA starts up. However, that cannot be determined from outside world as GA doesn't emit any events yet. Ideally^2 this command should be left out as it should be qemu who tells its own agent this kind of information. Anyway, I was going to call this command in qemuProcess{Startup, Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs so guest can boot and start GA, but that implies returning from qemuProcess*. So I am setting this just before 'guest-suspend' command, as there is one more thing about GA. It is unable to remember anything upon its restart (GA process). Which has BTW show flaw in our current code with FS freeze & thaw. If we freeze guest FS, and somebody restart GA, the simple FS Thaw will not succeed as GA thinks FS are not frozen. But that's a different cup of tea. Because of what written above, we need to call set-level on every suspend. Michal Privoznik (4): qemu-agent: Allow setting supported level qemu-agent: Create suspend function Create new virDrvDomainSuspendFlags API Wire up qemu agent for suspend include/libvirt/libvirt.h.in | 8 +++ src/driver.h | 4 ++ src/libvirt.c | 55 ++++++++++++++++++++++ src/libvirt_public.syms | 1 + src/qemu/qemu_agent.c | 105 ++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_agent.h | 13 +++++ src/qemu/qemu_driver.c | 57 +++++++++++++++++++--- src/qemu/qemu_process.c | 45 +++++++++++++++++- src/remote/remote_driver.c | 1 + src/remote/remote_protocol.x | 9 +++- src/remote_protocol-structs | 5 ++ tools/virsh.c | 30 ++++++++++-- 12 files changed, 317 insertions(+), 16 deletions(-) -- 1.7.3.4

Some qemu guest-agent commands require so called 'supported level' to be set. In other words, GA needs to be told which version of qemu is being used. Moreover, guest-agent can't remember this information during its restarts. But QEMU developers are trying to learn guest-agent to remember it. --- src/qemu/qemu_agent.c | 35 +++++++++++++++++++++++++++++++++++ src/qemu/qemu_agent.h | 2 ++ src/qemu/qemu_process.c | 45 ++++++++++++++++++++++++++++++++++++++++++--- 3 files changed, 79 insertions(+), 3 deletions(-) diff --git a/src/qemu/qemu_agent.c b/src/qemu/qemu_agent.c index 9df5546..536724f 100644 --- a/src/qemu/qemu_agent.c +++ b/src/qemu/qemu_agent.c @@ -106,6 +106,16 @@ struct _qemuAgent { /* If anything went wrong, this will be fed back * the next monitor msg */ virError lastError; + + /* Some guest-agent commands are available only + * after telling guest-agent what qemu version it + * deals with. However, currently guest-agent is not + * capable of remembering this information. + * Therefore we need to memorize it and set + * before such commands. + * XXX Remove this in the future + */ + unsigned int version; }; #if DEBUG_RAW_IO @@ -1184,3 +1194,28 @@ cleanup: virJSONValueFree(reply); return ret; } + +/* + * qemuAgentSetSupportedLevelCommand: + * @mon: Agent + * @version: QEMU version + * + * Let GA knows what qemu version it deals with. + * This enables/disables some functionality on + * one hand, but on the other GA should cope nicely + * after this. + * + * QEMU @version format: + * major * 1000 * 1000 + minor * 1000 + micro + * as returned by qemuCapsExtractVersionInfo(). + * + * The minimum required by QEMU is 1.0.0, which + * makes @version MUST be >= 1000*1000 + */ +void qemuAgentSetSupportedLevel(qemuAgentPtr mon, + unsigned int version) +{ + qemuAgentLock(mon); + mon->version = version; + qemuAgentUnlock(mon); +} diff --git a/src/qemu/qemu_agent.h b/src/qemu/qemu_agent.h index df59ef7..2858207 100644 --- a/src/qemu/qemu_agent.h +++ b/src/qemu/qemu_agent.h @@ -69,4 +69,6 @@ int qemuAgentShutdown(qemuAgentPtr mon, int qemuAgentFSFreeze(qemuAgentPtr mon); int qemuAgentFSThaw(qemuAgentPtr mon); +void qemuAgentSetSupportedLevel(qemuAgentPtr mon, + unsigned int version); #endif /* __QEMU_AGENT_H__ */ diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index d22020b..a0b0159 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -2825,6 +2825,7 @@ qemuProcessReconnect(void *opaque) struct qemuDomainJobObj oldjob; int state; int reason; + unsigned int version; memcpy(&oldjob, &data->oldjob, sizeof(oldjob)); @@ -2876,7 +2877,7 @@ qemuProcessReconnect(void *opaque) */ if (!priv->qemuCaps && qemuCapsExtractVersionInfo(obj->def->emulator, obj->def->os.arch, - NULL, + &version, &priv->qemuCaps) < 0) goto error; @@ -2925,6 +2926,18 @@ qemuProcessReconnect(void *opaque) if (obj->def->id >= driver->nextvmid) driver->nextvmid = obj->def->id + 1; + if (priv->agent) { + if (priv->agentError) { + qemuReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Unable to set guest-supported-level" + " due to agent error")); + goto error; + } + + VIR_DEBUG("Setting guest-supported-level to %u", version); + qemuAgentSetSupportedLevel(priv->agent, version); + } + endjob: if (qemuDomainObjEndJob(driver, obj) == 0) obj = NULL; @@ -3083,6 +3096,7 @@ int qemuProcessStart(virConnectPtr conn, virCommandPtr cmd = NULL; struct qemuProcessHookData hookData; unsigned long cur_balloon; + unsigned int version; hookData.conn = conn; hookData.vm = vm; @@ -3212,7 +3226,7 @@ int qemuProcessStart(virConnectPtr conn, qemuCapsFree(priv->qemuCaps); priv->qemuCaps = NULL; if (qemuCapsExtractVersionInfo(vm->def->emulator, vm->def->os.arch, - NULL, + &version, &priv->qemuCaps) < 0) goto cleanup; @@ -3432,6 +3446,18 @@ int qemuProcessStart(virConnectPtr conn, priv->agentError = true; } + if (priv->agent) { + if (priv->agentError) { + qemuReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Unable to set guest-supported-level" + " due to agent error")); + goto cleanup; + } + + VIR_DEBUG("Setting guest-supported-level to %u", version); + qemuAgentSetSupportedLevel(priv->agent, version); + } + VIR_DEBUG("Detecting VCPU PIDs"); if (qemuProcessDetectVcpuPIDs(driver, vm) < 0) goto cleanup; @@ -3782,6 +3808,7 @@ int qemuProcessAttach(virConnectPtr conn ATTRIBUTE_UNUSED, bool running = true; virDomainPausedReason reason; virSecurityLabelPtr seclabel = NULL; + unsigned int version; VIR_DEBUG("Beginning VM attach process"); @@ -3835,7 +3862,7 @@ int qemuProcessAttach(virConnectPtr conn ATTRIBUTE_UNUSED, priv->qemuCaps = NULL; if (qemuCapsExtractVersionInfo(vm->def->emulator, vm->def->os.arch, - NULL, + &version, &priv->qemuCaps) < 0) goto cleanup; @@ -3901,6 +3928,18 @@ int qemuProcessAttach(virConnectPtr conn ATTRIBUTE_UNUSED, priv->agentError = true; } + if (priv->agent) { + if (priv->agentError) { + qemuReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Unable to set guest-supported-level" + " due to agent error")); + goto cleanup; + } + + VIR_DEBUG("Setting guest-supported-level to %u", version); + qemuAgentSetSupportedLevel(priv->agent, version); + } + VIR_DEBUG("Detecting VCPU PIDs"); if (qemuProcessDetectVcpuPIDs(driver, vm) < 0) goto cleanup; -- 1.7.3.4

which allows guest to enter one of these modes: -hibernation: RAM is stored onto disk and guest is powered off (S4) -sleep: RAM stays powered, but CPUs are powered off (S3) -hybrid: RAM is stored onto disk but guest is not powered off --- src/qemu/qemu_agent.c | 70 +++++++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_agent.h | 11 +++++++ 2 files changed, 81 insertions(+), 0 deletions(-) diff --git a/src/qemu/qemu_agent.c b/src/qemu/qemu_agent.c index 536724f..f2cae94 100644 --- a/src/qemu/qemu_agent.c +++ b/src/qemu/qemu_agent.c @@ -1219,3 +1219,73 @@ void qemuAgentSetSupportedLevel(qemuAgentPtr mon, mon->version = version; qemuAgentUnlock(mon); } + +static int +qemuAgentSetSupportedLevelCommand(qemuAgentPtr mon) +{ + int ret = -1; + virJSONValuePtr cmd; + virJSONValuePtr reply = NULL; + unsigned int major, minor, micro; + unsigned int version = mon->version; + + if (version < 1000*1000) { + qemuReportError(VIR_ERR_INTERNAL_ERROR, + _("Invalid qemu version set: %u"), + version); + return -1; + } + + major = version / (1000 * 1000); + version %= 1000*1000; + minor = version / 1000; + version %= 1000; + micro = version; + + cmd = qemuAgentMakeCommand("guest-set-support-level", + "u:major", major, + "u:minor", minor, + "u:micro", micro, + NULL); + if (!cmd) + return -1; + + ret = qemuAgentCommand(mon, cmd, &reply); + + if (ret == 0) + ret = qemuAgentCheckError(cmd, reply); + + virJSONValueFree(cmd); + virJSONValueFree(reply); + return ret; +} + +VIR_ENUM_DECL(qemuAgentSuspendMode); + +VIR_ENUM_IMPL(qemuAgentSuspendMode, + QEMU_AGENT_SUSPEND_LAST, + "hibernate", "sleep", "hybrid"); + +int qemuAgentSuspend(qemuAgentPtr mon, + qemuAgentSuspendMode mode) +{ + int ret = -1; + virJSONValuePtr cmd; + virJSONValuePtr reply = NULL; + + cmd = qemuAgentMakeCommand("guest-suspend", + "s:mode", qemuAgentSuspendModeTypeToString(mode), + NULL); + if (!cmd) + return -1; + + /* XXX Unconditionally set supported level until + * qemu learns to keep this information */ + if ((ret = qemuAgentSetSupportedLevelCommand(mon)) == 0 && + (ret = qemuAgentCommand(mon, cmd, &reply)) == 0) + ret = qemuAgentCheckError(cmd, reply); + + virJSONValueFree(cmd); + virJSONValueFree(reply); + return ret; +} diff --git a/src/qemu/qemu_agent.h b/src/qemu/qemu_agent.h index 2858207..56e5573 100644 --- a/src/qemu/qemu_agent.h +++ b/src/qemu/qemu_agent.h @@ -71,4 +71,15 @@ int qemuAgentFSThaw(qemuAgentPtr mon); void qemuAgentSetSupportedLevel(qemuAgentPtr mon, unsigned int version); + +typedef enum { + QEMU_AGENT_SUSPEND_HIBERNATE, + QEMU_AGENT_SUSPEND_SLEEP, + QEMU_AGENT_SUSPEND_HYBRID, + + QEMU_AGENT_SUSPEND_LAST, +} qemuAgentSuspendMode; + +int qemuAgentSuspend(qemuAgentPtr mon, + qemuAgentSuspendMode mode); #endif /* __QEMU_AGENT_H__ */ -- 1.7.3.4

define these flags: VIR_DOMAIN_SUSPEND_SLEEP = 0, /* Suspend to RAM */ VIR_DOMAIN_SUSPEND_HIBERNATE = (1 << 0), /* Suspend to disk */ --- include/libvirt/libvirt.h.in | 8 ++++++ src/driver.h | 4 +++ src/libvirt.c | 55 ++++++++++++++++++++++++++++++++++++++++++ src/libvirt_public.syms | 1 + src/remote/remote_driver.c | 1 + src/remote/remote_protocol.x | 9 ++++++- src/remote_protocol-structs | 5 ++++ 7 files changed, 82 insertions(+), 1 deletions(-) diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in index e99cd00..d5ac891 100644 --- a/include/libvirt/libvirt.h.in +++ b/include/libvirt/libvirt.h.in @@ -1233,6 +1233,14 @@ int virDomainFree (virDomainPtr domain); int virDomainSuspend (virDomainPtr domain); int virDomainResume (virDomainPtr domain); +typedef enum { + VIR_DOMAIN_SUSPEND_SLEEP = 0, /* Suspend to RAM */ + VIR_DOMAIN_SUSPEND_HIBERNATE = (1 << 0), /* Suspend to disk */ +} virDomainSuspendFlagValues; + +int virDomainSuspendFlags (virDomainPtr domain, + unsigned int flags); + /* * Domain save/restore */ diff --git a/src/driver.h b/src/driver.h index df2aa60..2b878dd 100644 --- a/src/driver.h +++ b/src/driver.h @@ -118,6 +118,9 @@ typedef virDomainPtr typedef int (*virDrvDomainSuspend) (virDomainPtr domain); typedef int + (*virDrvDomainSuspendFlags) (virDomainPtr domain, + unsigned int flags); +typedef int (*virDrvDomainResume) (virDomainPtr domain); typedef int (*virDrvDomainShutdown) (virDomainPtr domain); @@ -831,6 +834,7 @@ struct _virDriver { virDrvDomainLookupByUUID domainLookupByUUID; virDrvDomainLookupByName domainLookupByName; virDrvDomainSuspend domainSuspend; + virDrvDomainSuspendFlags domainSuspendFlags; virDrvDomainResume domainResume; virDrvDomainShutdown domainShutdown; virDrvDomainShutdownFlags domainShutdownFlags; diff --git a/src/libvirt.c b/src/libvirt.c index 8be4e13..f94a5b9 100644 --- a/src/libvirt.c +++ b/src/libvirt.c @@ -2387,6 +2387,61 @@ error: } /** + * virDomainSuspendFlags: + * @domain: a domain object + * @flags: an OR'ed set of virDomainSuspendFlagValues + * + * Suspends an active domain. Dependent of @flags passed, + * the domain will enter either S3 or S4 state. + * Use VIR_DOMAIN_SUSPEND_SLEEP for S4, and + * VIR_DOMAIN_SUSPEND_HIBERNATE for S4. + * + * This function may require privileged access. + * + * Moreover, on some hypervisors, like QEMU, this may + * requires guest agent to be configured and running + * inside the domain. + * + * Returns 0 in case of success and -1 in case of failure. + */ +int +virDomainSuspendFlags(virDomainPtr domain, + unsigned int flags) +{ + virConnectPtr conn; + + VIR_DOMAIN_DEBUG(domain); + + virResetLastError(); + + if (!VIR_IS_CONNECTED_DOMAIN(domain)) { + virLibDomainError(VIR_ERR_INVALID_DOMAIN, __FUNCTION__); + virDispatchError(NULL); + return -1; + } + if (domain->conn->flags & VIR_CONNECT_RO) { + virLibDomainError(VIR_ERR_OPERATION_DENIED, __FUNCTION__); + goto error; + } + + conn = domain->conn; + + if (conn->driver->domainSuspendFlags) { + int ret; + ret = conn->driver->domainSuspendFlags(domain, flags); + if (ret < 0) + goto error; + return ret; + } + + virLibConnError(VIR_ERR_NO_SUPPORT, __FUNCTION__); + +error: + virDispatchError(domain->conn); + return -1; +} + +/** * virDomainResume: * @domain: a domain object * diff --git a/src/libvirt_public.syms b/src/libvirt_public.syms index b7f1944..4289ccf 100644 --- a/src/libvirt_public.syms +++ b/src/libvirt_public.syms @@ -520,6 +520,7 @@ LIBVIRT_0.9.10 { global: virDomainShutdownFlags; virStorageVolWipePattern; + virDomainSuspendFlags; } LIBVIRT_0.9.9; # .... define new API here using predicted next version number .... diff --git a/src/remote/remote_driver.c b/src/remote/remote_driver.c index f79f53e..ed5772f 100644 --- a/src/remote/remote_driver.c +++ b/src/remote/remote_driver.c @@ -4615,6 +4615,7 @@ static virDriver remote_driver = { .domainLookupByUUID = remoteDomainLookupByUUID, /* 0.3.0 */ .domainLookupByName = remoteDomainLookupByName, /* 0.3.0 */ .domainSuspend = remoteDomainSuspend, /* 0.3.0 */ + .domainSuspendFlags = remoteDomainSuspendFlags, /* 0.9.10 */ .domainResume = remoteDomainResume, /* 0.3.0 */ .domainShutdown = remoteDomainShutdown, /* 0.3.0 */ .domainShutdownFlags = remoteDomainShutdownFlags, /* 0.9.10 */ diff --git a/src/remote/remote_protocol.x b/src/remote/remote_protocol.x index 0f354bb..d298a03 100644 --- a/src/remote/remote_protocol.x +++ b/src/remote/remote_protocol.x @@ -732,6 +732,11 @@ struct remote_domain_suspend_args { remote_nonnull_domain dom; }; +struct remote_domain_suspend_flags_args { + remote_nonnull_domain dom; + unsigned int flags; +}; + struct remote_domain_resume_args { remote_nonnull_domain dom; }; @@ -2667,7 +2672,9 @@ enum remote_procedure { REMOTE_PROC_DOMAIN_SET_INTERFACE_PARAMETERS = 256, /* autogen autogen */ REMOTE_PROC_DOMAIN_GET_INTERFACE_PARAMETERS = 257, /* skipgen skipgen */ REMOTE_PROC_DOMAIN_SHUTDOWN_FLAGS = 258, /* autogen autogen */ - REMOTE_PROC_STORAGE_VOL_WIPE_PATTERN = 259 /* autogen autogen */ + REMOTE_PROC_STORAGE_VOL_WIPE_PATTERN = 259, /* autogen autogen */ + + REMOTE_PROC_DOMAIN_SUSPEND_FLAGS = 260 /* autogen autogen */ /* * Notice how the entries are grouped in sets of 10 ? diff --git a/src/remote_protocol-structs b/src/remote_protocol-structs index de85862..dc47038 100644 --- a/src/remote_protocol-structs +++ b/src/remote_protocol-structs @@ -440,6 +440,10 @@ struct remote_domain_lookup_by_name_ret { struct remote_domain_suspend_args { remote_nonnull_domain dom; }; +struct remote_domain_suspend_flags_args { + remote_nonnull_domain dom; + u_int flags; +}; struct remote_domain_resume_args { remote_nonnull_domain dom; }; @@ -2101,4 +2105,5 @@ enum remote_procedure { REMOTE_PROC_DOMAIN_GET_INTERFACE_PARAMETERS = 257, REMOTE_PROC_DOMAIN_SHUTDOWN_FLAGS = 258, REMOTE_PROC_STORAGE_VOL_WIPE_PATTERN = 259, + REMOTE_PROC_DOMAIN_SUSPEND_FLAGS = 260, }; -- 1.7.3.4

On 01/26/2012 07:06 AM, Michal Privoznik wrote:
define these flags:
VIR_DOMAIN_SUSPEND_SLEEP = 0, /* Suspend to RAM */ VIR_DOMAIN_SUSPEND_HIBERNATE = (1 << 0), /* Suspend to disk */
Not quite right. Our policy when fixing old APIs that forgot a flags argument is that the new API with flags 0 must be identical to the old API. Which means you need: VIR_DOMAIN_SUSPEND_PAUSE = 0, /* pause CPUs */ VIR_DOMAIN_SUSPEND_SLEEP = 1<<0, /* Suspend to RAM, power stays on */ VIR_DOMAIN_SUSPEND_HIBERNATE = 1<<1, /* Suspend to disk, power off */ and you might as well make things convenient: VIR_DOMAIN_SUSPEND_HYBRID = 3, /* Suspend to disk but leave power on */ Then, you should also provide a stub virDomainSuspendFlags for all other hypervisors that only takes flags == 0, forwarding the old virDomainSuspend to the new callback name. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 26.01.2012 15:17, Eric Blake wrote:
On 01/26/2012 07:06 AM, Michal Privoznik wrote:
define these flags:
VIR_DOMAIN_SUSPEND_SLEEP = 0, /* Suspend to RAM */ VIR_DOMAIN_SUSPEND_HIBERNATE = (1 << 0), /* Suspend to disk */
Not quite right. Our policy when fixing old APIs that forgot a flags argument is that the new API with flags 0 must be identical to the old API. Which means you need:
VIR_DOMAIN_SUSPEND_PAUSE = 0, /* pause CPUs */ VIR_DOMAIN_SUSPEND_SLEEP = 1<<0, /* Suspend to RAM, power stays on */
Honestly, in case of virtual machines I can't see the difference between these two. But okay, I'll rewrite it. Michal

On 01/26/2012 07:31 AM, Michal Privoznik wrote:
On 26.01.2012 15:17, Eric Blake wrote:
On 01/26/2012 07:06 AM, Michal Privoznik wrote:
define these flags:
VIR_DOMAIN_SUSPEND_SLEEP = 0, /* Suspend to RAM */ VIR_DOMAIN_SUSPEND_HIBERNATE = (1 << 0), /* Suspend to disk */
Not quite right. Our policy when fixing old APIs that forgot a flags argument is that the new API with flags 0 must be identical to the old API. Which means you need:
VIR_DOMAIN_SUSPEND_PAUSE = 0, /* pause CPUs */ VIR_DOMAIN_SUSPEND_SLEEP = 1<<0, /* Suspend to RAM, power stays on */
Honestly, in case of virtual machines I can't see the difference between these two. But okay, I'll rewrite it.
VIR_DOMAIN_SUSPEND_PAUSE is what we have now, when you click the pause button in virt-manager - it means that qemu stops all guest CPUs, with no input from the guest. Resuming from this state requires virDomainResume to issue a monitor command. VIR_DOMAIN_SUSPEND_SLEEP involves the guest. S3 might be a very fast op with qemu, but it is not an instant op - the guest CPU continues to run, and the guest really does flush state to RAM. Furthermore, qemu is implementing S3 so that you can re-wake the guest via input (such as a keypress into a serial device, or a mouse click), with no intervening monitor command. That said, qemu is also implementing a new monitor command to wake up a guest that is in S3 state; so we _also_ need to add a new API virDomainResumeFlags, to give use the flexibility to expose that new monitor command. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On Thu, Jan 26, 2012 at 03:06:15PM +0100, Michal Privoznik wrote:
define these flags:
VIR_DOMAIN_SUSPEND_SLEEP = 0, /* Suspend to RAM */ VIR_DOMAIN_SUSPEND_HIBERNATE = (1 << 0), /* Suspend to disk */ --- include/libvirt/libvirt.h.in | 8 ++++++ src/driver.h | 4 +++ src/libvirt.c | 55 ++++++++++++++++++++++++++++++++++++++++++ src/libvirt_public.syms | 1 + src/remote/remote_driver.c | 1 + src/remote/remote_protocol.x | 9 ++++++- src/remote_protocol-structs | 5 ++++ 7 files changed, 82 insertions(+), 1 deletions(-)
diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in index e99cd00..d5ac891 100644 --- a/include/libvirt/libvirt.h.in +++ b/include/libvirt/libvirt.h.in @@ -1233,6 +1233,14 @@ int virDomainFree (virDomainPtr domain); int virDomainSuspend (virDomainPtr domain); int virDomainResume (virDomainPtr domain);
+typedef enum { + VIR_DOMAIN_SUSPEND_SLEEP = 0, /* Suspend to RAM */ + VIR_DOMAIN_SUSPEND_HIBERNATE = (1 << 0), /* Suspend to disk */ +} virDomainSuspendFlagValues; + +int virDomainSuspendFlags (virDomainPtr domain, + unsigned int flags);
I'm not entirely comfortable about adding this facility to the virDomainSuspendAPI. IMHO the only commonality between what the current code does, vs what the guest agent does is that they have the misfortune to share the word 'suspend'. Conceptually & functionally they are completely different from each other. In particular the current API is a guest lifecycle changing API, which causes emission of events. The proposed flags don't change the guest lifecycle directly, but may indirectly cause emission of a SHUTDOWN event some time later. It is a shame that our current API is called 'Suspend' - I wish it had been called 'Pause', but we can't change that now. I've no objection to the new functionality, but I feel we need a different name for the API. Perphaps virDomainPMSuspend (PM == Power Manager) Or virDomainNodeSuspendForDuration to match the host API naming Second, do we need some more parameters here, besides just the suspend target. When we added virNodeSuspendForDuration, we had an unsigned long long duration field to allow auto-wakeup after some time. Ultimately on Linux, but virNodeSuspendForDuration and virDOmainPMSuspend end up invoking 'pm-suspend' command line tool, so I feel we ought to have the 'duration' parameter for this too. Even if QEMU agent doesn't currently support it, I imagine it could in the future. We could even use the same enum for the 'target' as with our existing API Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

This allows guests to enter S4 ACPI state aka hibernation by using guest agent. With no flag given, we keep consistent and enter only S3 state (sleep). --- src/qemu/qemu_driver.c | 57 +++++++++++++++++++++++++++++++++++++++++------ tools/virsh.c | 30 +++++++++++++++++++++--- 2 files changed, 75 insertions(+), 12 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index ab69dca..fba9aeb 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -1388,7 +1388,10 @@ cleanup: } -static int qemudDomainSuspend(virDomainPtr dom) { +static int +qemudDomainSuspendFlags(virDomainPtr dom, + unsigned int flags) +{ struct qemud_driver *driver = dom->conn->privateData; virDomainObjPtr vm; int ret = -1; @@ -1397,6 +1400,15 @@ static int qemudDomainSuspend(virDomainPtr dom) { virDomainPausedReason reason; int eventDetail; + virCheckFlags(VIR_DOMAIN_SUSPEND_SLEEP | + VIR_DOMAIN_SUSPEND_HIBERNATE, -1); + + if ((flags & VIR_DOMAIN_SUSPEND_SLEEP) && + (flags & VIR_DOMAIN_SUSPEND_HIBERNATE)) { + qemuReportError(VIR_ERR_INVALID_ARG, "%s", _("seriosly?")); + return -1; + } + qemuDriverLock(driver); vm = virDomainFindByUUID(&driver->domains, dom->uuid); @@ -1431,16 +1443,39 @@ static int qemudDomainSuspend(virDomainPtr dom) { "%s", _("domain is not running")); goto endjob; } - if (virDomainObjGetState(vm, NULL) != VIR_DOMAIN_PAUSED) { - if (qemuProcessStopCPUs(driver, vm, reason, QEMU_ASYNC_JOB_NONE) < 0) { + + if (flags & VIR_DOMAIN_SUSPEND_SLEEP) { + if (virDomainObjGetState(vm, NULL) != VIR_DOMAIN_PAUSED) { + if (qemuProcessStopCPUs(driver, vm, reason, QEMU_ASYNC_JOB_NONE) < 0) { + goto endjob; + } + event = virDomainEventNewFromObj(vm, + VIR_DOMAIN_EVENT_SUSPENDED, + eventDetail); + } + if (virDomainSaveStatus(driver->caps, driver->stateDir, vm) < 0) + goto endjob; + } else if (flags & VIR_DOMAIN_SUSPEND_HIBERNATE) { + if (priv->agentError) { + qemuReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("QEMU guest agent is not " + "available due to an error")); goto endjob; } - event = virDomainEventNewFromObj(vm, - VIR_DOMAIN_EVENT_SUSPENDED, - eventDetail); + if (!priv->agent) { + qemuReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s", + _("QEMU guest agent is not configured")); + goto endjob; + } + + qemuDomainObjEnterAgent(driver, vm); + ret = qemuAgentSuspend(priv->agent, QEMU_AGENT_SUSPEND_HIBERNATE); + qemuDomainObjExitAgent(driver, vm); + + if (ret < 0) + goto endjob; } - if (virDomainSaveStatus(driver->caps, driver->stateDir, vm) < 0) - goto endjob; + ret = 0; endjob: @@ -1457,6 +1492,11 @@ cleanup: return ret; } +static int +qemudDomainSuspend(virDomainPtr dom) +{ + return qemudDomainSuspendFlags(dom, 0); +} static int qemudDomainResume(virDomainPtr dom) { struct qemud_driver *driver = dom->conn->privateData; @@ -11760,6 +11800,7 @@ static virDriver qemuDriver = { .domainLookupByUUID = qemudDomainLookupByUUID, /* 0.2.0 */ .domainLookupByName = qemudDomainLookupByName, /* 0.2.0 */ .domainSuspend = qemudDomainSuspend, /* 0.2.0 */ + .domainSuspendFlags = qemudDomainSuspendFlags, /* 0.9.10 */ .domainResume = qemudDomainResume, /* 0.2.0 */ .domainShutdown = qemuDomainShutdown, /* 0.2.0 */ .domainShutdownFlags = qemuDomainShutdownFlags, /* 0.9.10 */ diff --git a/tools/virsh.c b/tools/virsh.c index 74655c2..99b2ca9 100644 --- a/tools/virsh.c +++ b/tools/virsh.c @@ -2142,6 +2142,7 @@ static const vshCmdInfo info_suspend[] = { static const vshCmdOptDef opts_suspend[] = { {"domain", VSH_OT_DATA, VSH_OFLAG_REQ, N_("domain name, id or uuid")}, + {"mode", VSH_OT_STRING, VSH_OFLAG_NONE, N_("state to enter: sleep|hibernate")}, {NULL, 0, 0, NULL} }; @@ -2150,7 +2151,9 @@ cmdSuspend(vshControl *ctl, const vshCmd *cmd) { virDomainPtr dom; const char *name; + const char *mode; bool ret = true; + unsigned int flags = 0; if (!vshConnectionUsability(ctl, ctl->conn)) return false; @@ -2158,12 +2161,31 @@ cmdSuspend(vshControl *ctl, const vshCmd *cmd) if (!(dom = vshCommandOptDomain(ctl, cmd, &name))) return false; - if (virDomainSuspend(dom) == 0) { + if (vshCommandOptString(cmd, "mode", &mode) < 0) { + vshError(ctl, "%s", _("Invalid type")); + return false; + } + + if (mode) { + if (STREQ(mode, "sleep")) { + flags |= VIR_DOMAIN_SUSPEND_SLEEP; + } else if (STREQ(mode, "hibernate")) { + flags |= VIR_DOMAIN_SUSPEND_HIBERNATE; + } else { + vshError(ctl, _("Unknown mode %s value, expecting 'sleep' or 'hibernate'"), mode); + return false; + } + } + + if (flags) + ret = virDomainSuspendFlags(dom, flags) == 0; + else + ret = virDomainSuspend(dom) == 0; + + if (ret) vshPrint(ctl, _("Domain %s suspended\n"), name); - } else { + else vshError(ctl, _("Failed to suspend domain %s"), name); - ret = false; - } virDomainFree(dom); return ret; -- 1.7.3.4

On Thu, Jan 26, 2012 at 03:06:12PM +0100, Michal Privoznik wrote:
As we've added guest agent recently, the whole world of new functionality has opened. As this patch set, which allows domains to enter S4 state.
What is needed for this? Patched qemu. As this is not in qemu git, but patches are await to be pushed in very short future. They can be found here:
http://lists.gnu.org/archive/html/qemu-devel/2012-01/msg03261.html
Despite that, if anybody is willing to give me review if I am going the right way, I'd appreciate it.
One thing, that you'll probably notice is this 'set-support-level' command. Basically, it tells GA what qemu version is it running on. Ideally, this should be done as soon as GA starts up. However, that cannot be determined from outside world as GA doesn't emit any events yet. Ideally^2 this command should be left out as it should be qemu who tells its own agent this kind of information. Anyway, I was going to call this command in qemuProcess{Startup, Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs so guest can boot and start GA, but that implies returning from qemuProcess*.
So I am setting this just before 'guest-suspend' command, as there is one more thing about GA. It is unable to remember anything upon its restart (GA process). Which has BTW show flaw in our current code with FS freeze & thaw. If we freeze guest FS, and somebody restart GA, the simple FS Thaw will not succeed as GA thinks FS are not frozen. But that's a different cup of tea.
Because of what written above, we need to call set-level on every suspend.
IMHO all this says that the 'set-level' command is a conceptually unfixably broken design & should be killed in QEMU before it turns into an even bigger mess. Once we're in a situation where we need to call 'set-level' prior to every single invocation, you might as well just allow the QEMU version number to be passed in directly as an arg to the command you are running directly thus avoiding this horrificness. Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

[adding qemu-devel] On 01/26/2012 07:46 AM, Daniel P. Berrange wrote:
One thing, that you'll probably notice is this 'set-support-level' command. Basically, it tells GA what qemu version is it running on. Ideally, this should be done as soon as GA starts up. However, that cannot be determined from outside world as GA doesn't emit any events yet. Ideally^2 this command should be left out as it should be qemu who tells its own agent this kind of information. Anyway, I was going to call this command in qemuProcess{Startup, Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs so guest can boot and start GA, but that implies returning from qemuProcess*.
So I am setting this just before 'guest-suspend' command, as there is one more thing about GA. It is unable to remember anything upon its restart (GA process). Which has BTW show flaw in our current code with FS freeze & thaw. If we freeze guest FS, and somebody restart GA, the simple FS Thaw will not succeed as GA thinks FS are not frozen. But that's a different cup of tea.
Because of what written above, we need to call set-level on every suspend.
IMHO all this says that the 'set-level' command is a conceptually unfixably broken design & should be killed in QEMU before it turns into an even bigger mess.
Once we're in a situation where we need to call 'set-level' prior to every single invocation, you might as well just allow the QEMU version number to be passed in directly as an arg to the command you are running directly thus avoiding this horrificness.
Qemu folks, would you care to chime in on this? Exactly how is the set-level command supposed to work? As I understand it, the goal is that if the guest has qemu-ga 1.1 installed, but is being run by qemu 1.0, then we want to ensure that any guest agent command supported by qemu-ga 1.1 but requiring features of qemu not present in qemu 1.0 will be properly rejected. But whose job is it to tell the guest agent what version of qemu is running? Based on the above conversation, it looks like the current qemu implementation does not do any handshaking on its own when the guest agent first comes alive, which means that you are forcing the work on the management app (libvirt). And this is inherently racy - if the guest is allowed to restart its qemu-ga process at will, and each restart of that guest process triggers a need to redo the handshake, then libvirt can never reliably know what version the agent is running at. I think we really do need a mode where as soon as the qemu-ga process exists, it sends an event, then the qemu side of the virtio bus responds to that event by telling the agent what version of qemu is talking to the agent, all prior to exposing any agent commands out to management apps, thus making the qemu-ga set-level command an automatic part of the handshake, and invisible to management apps. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On Thu, 26 Jan 2012 08:18:03 -0700 Eric Blake <eblake@redhat.com> wrote:
[adding qemu-devel]
On 01/26/2012 07:46 AM, Daniel P. Berrange wrote:
One thing, that you'll probably notice is this 'set-support-level' command. Basically, it tells GA what qemu version is it running on. Ideally, this should be done as soon as GA starts up. However, that cannot be determined from outside world as GA doesn't emit any events yet. Ideally^2 this command should be left out as it should be qemu who tells its own agent this kind of information. Anyway, I was going to call this command in qemuProcess{Startup, Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs so guest can boot and start GA, but that implies returning from qemuProcess*.
So I am setting this just before 'guest-suspend' command, as there is one more thing about GA. It is unable to remember anything upon its restart (GA process). Which has BTW show flaw in our current code with FS freeze & thaw. If we freeze guest FS, and somebody restart GA, the simple FS Thaw will not succeed as GA thinks FS are not frozen. But that's a different cup of tea.
Because of what written above, we need to call set-level on every suspend.
IMHO all this says that the 'set-level' command is a conceptually unfixably broken design & should be killed in QEMU before it turns into an even bigger mess.
Can you elaborate on this? Michal and I talked on irc about making the compatibility level persistent, would that help?
Once we're in a situation where we need to call 'set-level' prior to every single invocation, you might as well just allow the QEMU version number to be passed in directly as an arg to the command you are running directly thus avoiding this horrificness.
Qemu folks, would you care to chime in on this?
Exactly how is the set-level command supposed to work? As I understand it, the goal is that if the guest has qemu-ga 1.1 installed, but is being run by qemu 1.0, then we want to ensure that any guest agent command supported by qemu-ga 1.1 but requiring features of qemu not present in qemu 1.0 will be properly rejected.
Not exactly, the default support of qemu-ga is qemu 1.0. This means that by default qemu-ga will only support qemu 1.0 even when running on qemu 2.0. This way the set-support-level command allows you to specify that qemu 2.0 features are supported. Note that this is only about specific features that depend on host support, like S3 suspend which is known to be buggy in current and old qemu.
But whose job is it to tell the guest agent what version of qemu is running? Based on the above conversation, it looks like the current qemu implementation does not do any handshaking on its own when the guest agent first comes alive, which means that you are forcing the work on the management app (libvirt). And this is inherently racy - if the guest is allowed to restart its qemu-ga process at will, and each restart of that guest process triggers a need to redo the handshake, then libvirt can never reliably know what version the agent is running at.
Making the set-support-level persistent would solve it, wouldn't it?
I think we really do need a mode where as soon as the qemu-ga process exists, it sends an event, then the qemu side of the virtio bus responds to that event by telling the agent what version of qemu is talking to the agent, all prior to exposing any agent commands out to management apps, thus making the qemu-ga set-level command an automatic part of the handshake, and invisible to management apps.

On 26.01.2012 20:35, Luiz Capitulino wrote:
On Thu, 26 Jan 2012 08:18:03 -0700 Eric Blake <eblake@redhat.com> wrote:
[adding qemu-devel]
On 01/26/2012 07:46 AM, Daniel P. Berrange wrote:
One thing, that you'll probably notice is this 'set-support-level' command. Basically, it tells GA what qemu version is it running on. Ideally, this should be done as soon as GA starts up. However, that cannot be determined from outside world as GA doesn't emit any events yet. Ideally^2 this command should be left out as it should be qemu who tells its own agent this kind of information. Anyway, I was going to call this command in qemuProcess{Startup, Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs so guest can boot and start GA, but that implies returning from qemuProcess*.
So I am setting this just before 'guest-suspend' command, as there is one more thing about GA. It is unable to remember anything upon its restart (GA process). Which has BTW show flaw in our current code with FS freeze & thaw. If we freeze guest FS, and somebody restart GA, the simple FS Thaw will not succeed as GA thinks FS are not frozen. But that's a different cup of tea.
Because of what written above, we need to call set-level on every suspend.
IMHO all this says that the 'set-level' command is a conceptually unfixably broken design & should be killed in QEMU before it turns into an even bigger mess.
Can you elaborate on this? Michal and I talked on irc about making the compatibility level persistent, would that help?
Once we're in a situation where we need to call 'set-level' prior to every single invocation, you might as well just allow the QEMU version number to be passed in directly as an arg to the command you are running directly thus avoiding this horrificness.
Qemu folks, would you care to chime in on this?
Exactly how is the set-level command supposed to work? As I understand it, the goal is that if the guest has qemu-ga 1.1 installed, but is being run by qemu 1.0, then we want to ensure that any guest agent command supported by qemu-ga 1.1 but requiring features of qemu not present in qemu 1.0 will be properly rejected.
Not exactly, the default support of qemu-ga is qemu 1.0. This means that by default qemu-ga will only support qemu 1.0 even when running on qemu 2.0. This way the set-support-level command allows you to specify that qemu 2.0 features are supported.
Note that this is only about specific features that depend on host support, like S3 suspend which is known to be buggy in current and old qemu.
But whose job is it to tell the guest agent what version of qemu is running? Based on the above conversation, it looks like the current qemu implementation does not do any handshaking on its own when the guest agent first comes alive, which means that you are forcing the work on the management app (libvirt). And this is inherently racy - if the guest is allowed to restart its qemu-ga process at will, and each restart of that guest process triggers a need to redo the handshake, then libvirt can never reliably know what version the agent is running at.
Making the set-support-level persistent would solve it, wouldn't it?
Yes and no. We still need an event when GA come to live. Because if anybody tries to write something for GA which is not running (and for purpose of this scenario assume it never will), like 'set-support-level' and wait for answer (which will never come) he will be blocked indefinitely. However, if he writes it after 1st event come, everything is OK.

On Thu, 26 Jan 2012 20:41:13 +0100 Michal Privoznik <mprivozn@redhat.com> wrote:
On 26.01.2012 20:35, Luiz Capitulino wrote:
On Thu, 26 Jan 2012 08:18:03 -0700 Eric Blake <eblake@redhat.com> wrote:
[adding qemu-devel]
On 01/26/2012 07:46 AM, Daniel P. Berrange wrote:
One thing, that you'll probably notice is this 'set-support-level' command. Basically, it tells GA what qemu version is it running on. Ideally, this should be done as soon as GA starts up. However, that cannot be determined from outside world as GA doesn't emit any events yet. Ideally^2 this command should be left out as it should be qemu who tells its own agent this kind of information. Anyway, I was going to call this command in qemuProcess{Startup, Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs so guest can boot and start GA, but that implies returning from qemuProcess*.
So I am setting this just before 'guest-suspend' command, as there is one more thing about GA. It is unable to remember anything upon its restart (GA process). Which has BTW show flaw in our current code with FS freeze & thaw. If we freeze guest FS, and somebody restart GA, the simple FS Thaw will not succeed as GA thinks FS are not frozen. But that's a different cup of tea.
Because of what written above, we need to call set-level on every suspend.
IMHO all this says that the 'set-level' command is a conceptually unfixably broken design & should be killed in QEMU before it turns into an even bigger mess.
Can you elaborate on this? Michal and I talked on irc about making the compatibility level persistent, would that help?
Once we're in a situation where we need to call 'set-level' prior to every single invocation, you might as well just allow the QEMU version number to be passed in directly as an arg to the command you are running directly thus avoiding this horrificness.
Qemu folks, would you care to chime in on this?
Exactly how is the set-level command supposed to work? As I understand it, the goal is that if the guest has qemu-ga 1.1 installed, but is being run by qemu 1.0, then we want to ensure that any guest agent command supported by qemu-ga 1.1 but requiring features of qemu not present in qemu 1.0 will be properly rejected.
Not exactly, the default support of qemu-ga is qemu 1.0. This means that by default qemu-ga will only support qemu 1.0 even when running on qemu 2.0. This way the set-support-level command allows you to specify that qemu 2.0 features are supported.
Note that this is only about specific features that depend on host support, like S3 suspend which is known to be buggy in current and old qemu.
But whose job is it to tell the guest agent what version of qemu is running? Based on the above conversation, it looks like the current qemu implementation does not do any handshaking on its own when the guest agent first comes alive, which means that you are forcing the work on the management app (libvirt). And this is inherently racy - if the guest is allowed to restart its qemu-ga process at will, and each restart of that guest process triggers a need to redo the handshake, then libvirt can never reliably know what version the agent is running at.
Making the set-support-level persistent would solve it, wouldn't it?
Yes and no. We still need an event when GA come to live. Because if anybody tries to write something for GA which is not running (and for purpose of this scenario assume it never will), like 'set-support-level' and wait for answer (which will never come) he will be blocked indefinitely. However, if he writes it after 1st event come, everything is OK.
What if the event never reach libvirt? This problem is a lot more general and is not related to the set-support-level command. Maybe adding shutdown & start events can serve as good hints, but they won't fix the problem. IMHO, the best way to solve this is to issue the guest-sync command with a timeout. If you get no answer, then try again later.

On 01/26/2012 02:13 PM, Luiz Capitulino wrote:
On Thu, 26 Jan 2012 20:41:13 +0100 Michal Privoznik<mprivozn@redhat.com> wrote:
On 26.01.2012 20:35, Luiz Capitulino wrote:
On Thu, 26 Jan 2012 08:18:03 -0700 Eric Blake<eblake@redhat.com> wrote:
[adding qemu-devel]
On 01/26/2012 07:46 AM, Daniel P. Berrange wrote:
One thing, that you'll probably notice is this 'set-support-level' command. Basically, it tells GA what qemu version is it running on. Ideally, this should be done as soon as GA starts up. However, that cannot be determined from outside world as GA doesn't emit any events yet. Ideally^2 this command should be left out as it should be qemu who tells its own agent this kind of information. Anyway, I was going to call this command in qemuProcess{Startup, Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs so guest can boot and start GA, but that implies returning from qemuProcess*.
So I am setting this just before 'guest-suspend' command, as there is one more thing about GA. It is unable to remember anything upon its restart (GA process). Which has BTW show flaw in our current code with FS freeze& thaw. If we freeze guest FS, and somebody restart GA, the simple FS Thaw will not succeed as GA thinks FS are not frozen. But that's a different cup of tea.
Because of what written above, we need to call set-level on every suspend.
IMHO all this says that the 'set-level' command is a conceptually unfixably broken design& should be killed in QEMU before it turns into an even bigger mess.
Can you elaborate on this? Michal and I talked on irc about making the compatibility level persistent, would that help?
Once we're in a situation where we need to call 'set-level' prior to every single invocation, you might as well just allow the QEMU version number to be passed in directly as an arg to the command you are running directly thus avoiding this horrificness.
Qemu folks, would you care to chime in on this?
Exactly how is the set-level command supposed to work? As I understand it, the goal is that if the guest has qemu-ga 1.1 installed, but is being run by qemu 1.0, then we want to ensure that any guest agent command supported by qemu-ga 1.1 but requiring features of qemu not present in qemu 1.0 will be properly rejected.
Not exactly, the default support of qemu-ga is qemu 1.0. This means that by default qemu-ga will only support qemu 1.0 even when running on qemu 2.0. This way the set-support-level command allows you to specify that qemu 2.0 features are supported.
Note that this is only about specific features that depend on host support, like S3 suspend which is known to be buggy in current and old qemu.
But whose job is it to tell the guest agent what version of qemu is running? Based on the above conversation, it looks like the current qemu implementation does not do any handshaking on its own when the guest agent first comes alive, which means that you are forcing the work on the management app (libvirt). And this is inherently racy - if the guest is allowed to restart its qemu-ga process at will, and each restart of that guest process triggers a need to redo the handshake, then libvirt can never reliably know what version the agent is running at.
Making the set-support-level persistent would solve it, wouldn't it?
Yes and no. We still need an event when GA come to live. Because if anybody tries to write something for GA which is not running (and for purpose of this scenario assume it never will), like 'set-support-level' and wait for answer (which will never come) he will be blocked indefinitely. However, if he writes it after 1st event come, everything is OK.
What if the event never reach libvirt?
This problem is a lot more general and is not related to the set-support-level command. Maybe adding shutdown& start events can serve as good hints, but they won't fix the problem.
Yah, start up events are a good indicator to issue the guest-sync sequence (we had them at one point, and planned to re-add them for QMP integration, and since libvirt is taking on this role for now it might make sense to re-add it now), but once that sequence is issued the agent can still be manually stopped, or the guest-sync sequence itself can timeout. And there's no way to reliably send a stop indicator, maybe to capture shutdown events, but not consistently enough that we can base the protocol on it (agent might get pkill -9'd for instance, and virtio-serial doesn't currently plumb a guest-side disconnect to the chardev front-end, so you'd never know). So, the only indication you'll ever get that your "session" ended is either a timeout, or, if we add it, a start up event. In either case the response is to issue the reset sequence. The way it would need to work with resets is everytime a command times out you: 1) report the timeout error to libvirt client/management app. set guest-agent_available = 0, such that further libvirt calls that depend on it would return "not currently available", or something to that effect. 2) issue guest-sync with new unique session id 3) read a json object/response. - if you time out, goto 2 - if your response doesn't have the session id you're expecting, repeat 3) (since it may be a response to a previous guest-sync RPC that you timed out on prematurely, but you can't just wait indefinitely, since it may never arrive) 4) set guest_agent_available = 1, proceed with normal operation till the next timeout (or start up event, if we add one).
IMHO, the best way to solve this is to issue the guest-sync command with a timeout. If you get no answer, then try again later.

On 01/26/2012 01:35 PM, Luiz Capitulino wrote:
On Thu, 26 Jan 2012 08:18:03 -0700 Eric Blake<eblake@redhat.com> wrote:
[adding qemu-devel]
On 01/26/2012 07:46 AM, Daniel P. Berrange wrote:
One thing, that you'll probably notice is this 'set-support-level' command. Basically, it tells GA what qemu version is it running on. Ideally, this should be done as soon as GA starts up. However, that cannot be determined from outside world as GA doesn't emit any events yet. Ideally^2 this command should be left out as it should be qemu who tells its own agent this kind of information. Anyway, I was going to call this command in qemuProcess{Startup, Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs so guest can boot and start GA, but that implies returning from qemuProcess*.
So I am setting this just before 'guest-suspend' command, as there is one more thing about GA. It is unable to remember anything upon its restart (GA process). Which has BTW show flaw in our current code with FS freeze& thaw. If we freeze guest FS, and somebody restart GA, the simple FS Thaw will not succeed as GA thinks FS are not frozen. But that's a different cup of tea.
Because of what written above, we need to call set-level on every suspend.
IMHO all this says that the 'set-level' command is a conceptually unfixably broken design& should be killed in QEMU before it turns into an even bigger mess.
Can you elaborate on this? Michal and I talked on irc about making the compatibility level persistent, would that help?
Once we're in a situation where we need to call 'set-level' prior to every single invocation, you might as well just allow the QEMU version number to be passed in directly as an arg to the command you are running directly thus avoiding this horrificness.
Qemu folks, would you care to chime in on this?
Exactly how is the set-level command supposed to work? As I understand it, the goal is that if the guest has qemu-ga 1.1 installed, but is being run by qemu 1.0, then we want to ensure that any guest agent command supported by qemu-ga 1.1 but requiring features of qemu not present in qemu 1.0 will be properly rejected.
Not exactly, the default support of qemu-ga is qemu 1.0. This means that by default qemu-ga will only support qemu 1.0 even when running on qemu 2.0. This way the set-support-level command allows you to specify that qemu 2.0 features are supported.
Version numbers are meaningless. What happens when a bunch of features get backported by RHEL such that qemu-ga 1.0 ends up being a frankenstein version of 2.0? The feature negotiation mechanism we have in QMP is the existence of a command. If we're in a position where we're trying to disable part of a command, it simply means that we should have multiple commands such that we can just remove the disabled part entirely. Regards, Anthony Liguori

On Thu, 26 Jan 2012 16:57:01 -0600 Anthony Liguori <anthony@codemonkey.ws> wrote:
On 01/26/2012 01:35 PM, Luiz Capitulino wrote:
On Thu, 26 Jan 2012 08:18:03 -0700 Eric Blake<eblake@redhat.com> wrote:
[adding qemu-devel]
On 01/26/2012 07:46 AM, Daniel P. Berrange wrote:
One thing, that you'll probably notice is this 'set-support-level' command. Basically, it tells GA what qemu version is it running on. Ideally, this should be done as soon as GA starts up. However, that cannot be determined from outside world as GA doesn't emit any events yet. Ideally^2 this command should be left out as it should be qemu who tells its own agent this kind of information. Anyway, I was going to call this command in qemuProcess{Startup, Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs so guest can boot and start GA, but that implies returning from qemuProcess*.
So I am setting this just before 'guest-suspend' command, as there is one more thing about GA. It is unable to remember anything upon its restart (GA process). Which has BTW show flaw in our current code with FS freeze& thaw. If we freeze guest FS, and somebody restart GA, the simple FS Thaw will not succeed as GA thinks FS are not frozen. But that's a different cup of tea.
Because of what written above, we need to call set-level on every suspend.
IMHO all this says that the 'set-level' command is a conceptually unfixably broken design& should be killed in QEMU before it turns into an even bigger mess.
Can you elaborate on this? Michal and I talked on irc about making the compatibility level persistent, would that help?
Once we're in a situation where we need to call 'set-level' prior to every single invocation, you might as well just allow the QEMU version number to be passed in directly as an arg to the command you are running directly thus avoiding this horrificness.
Qemu folks, would you care to chime in on this?
Exactly how is the set-level command supposed to work? As I understand it, the goal is that if the guest has qemu-ga 1.1 installed, but is being run by qemu 1.0, then we want to ensure that any guest agent command supported by qemu-ga 1.1 but requiring features of qemu not present in qemu 1.0 will be properly rejected.
Not exactly, the default support of qemu-ga is qemu 1.0. This means that by default qemu-ga will only support qemu 1.0 even when running on qemu 2.0. This way the set-support-level command allows you to specify that qemu 2.0 features are supported.
Version numbers are meaningless. What happens when a bunch of features get backported by RHEL such that qemu-ga 1.0 ends up being a frankenstein version of 2.0?
The feature negotiation mechanism we have in QMP is the existence of a command. If we're in a position where we're trying to disable part of a command, it simply means that we should have multiple commands such that we can just remove the disabled part entirely.
You may have a point that we shouldn't be using the version number for that, but just switching to multiple commands doesn't solve the fundamental problem. The fundamental problem is that, S3 in current (and old) qemu has two known bugs: 1. The screen is left black after S3 (it's a bug in seabios) 2. QEMU resumes the guest immediately (Gerd posted patches to address this) We're going to address both issues in 1.1. However, if qemu-ga is installed in an old qemu and S3 is used, the bugs will be triggered. We need a way for qemu-ga to query qemu about the existence of a working S3 support. The set-support-level solves that. Another option would be to disable (or enable) S3 by default in qemu-ga, and let the admin enable (or disable it) according to S3 support being fixed in qemu.

On 01/30/2012 06:57 AM, Luiz Capitulino wrote:
On Thu, 26 Jan 2012 16:57:01 -0600 Anthony Liguori<anthony@codemonkey.ws> wrote:
On 01/26/2012 01:35 PM, Luiz Capitulino wrote:
On Thu, 26 Jan 2012 08:18:03 -0700 Eric Blake<eblake@redhat.com> wrote:
[adding qemu-devel]
On 01/26/2012 07:46 AM, Daniel P. Berrange wrote:
One thing, that you'll probably notice is this 'set-support-level' command. Basically, it tells GA what qemu version is it running on. Ideally, this should be done as soon as GA starts up. However, that cannot be determined from outside world as GA doesn't emit any events yet. Ideally^2 this command should be left out as it should be qemu who tells its own agent this kind of information. Anyway, I was going to call this command in qemuProcess{Startup, Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs so guest can boot and start GA, but that implies returning from qemuProcess*.
So I am setting this just before 'guest-suspend' command, as there is one more thing about GA. It is unable to remember anything upon its restart (GA process). Which has BTW show flaw in our current code with FS freeze& thaw. If we freeze guest FS, and somebody restart GA, the simple FS Thaw will not succeed as GA thinks FS are not frozen. But that's a different cup of tea.
Because of what written above, we need to call set-level on every suspend.
IMHO all this says that the 'set-level' command is a conceptually unfixably broken design& should be killed in QEMU before it turns into an even bigger mess.
Can you elaborate on this? Michal and I talked on irc about making the compatibility level persistent, would that help?
Once we're in a situation where we need to call 'set-level' prior to every single invocation, you might as well just allow the QEMU version number to be passed in directly as an arg to the command you are running directly thus avoiding this horrificness.
Qemu folks, would you care to chime in on this?
Exactly how is the set-level command supposed to work? As I understand it, the goal is that if the guest has qemu-ga 1.1 installed, but is being run by qemu 1.0, then we want to ensure that any guest agent command supported by qemu-ga 1.1 but requiring features of qemu not present in qemu 1.0 will be properly rejected.
Not exactly, the default support of qemu-ga is qemu 1.0. This means that by default qemu-ga will only support qemu 1.0 even when running on qemu 2.0. This way the set-support-level command allows you to specify that qemu 2.0 features are supported.
Version numbers are meaningless. What happens when a bunch of features get backported by RHEL such that qemu-ga 1.0 ends up being a frankenstein version of 2.0?
The feature negotiation mechanism we have in QMP is the existence of a command. If we're in a position where we're trying to disable part of a command, it simply means that we should have multiple commands such that we can just remove the disabled part entirely.
You may have a point that we shouldn't be using the version number for that, but just switching to multiple commands doesn't solve the fundamental problem.
The fundamental problem is that, S3 in current (and old) qemu has two known bugs:
1. The screen is left black after S3 (it's a bug in seabios) 2. QEMU resumes the guest immediately (Gerd posted patches to address this)
We're going to address both issues in 1.1. However, if qemu-ga is installed in an old qemu and S3 is used, the bugs will be triggered.
It's a management tool problem. Before a management tool issues a command, it should query the existence of the command to determine whether this version of QEMU has that capability. If the tool needs to use two commands, it should query the existence of both of them. In this case, the management tool needs a qemu-ga command *and* a QEMU command (to resume from suspend) so it should query both of them. Obviously, we wouldn't have a resume-from-suspend command in QEMU unless it S3 worked in QEMU as expected. Alternatively, if there really was no reason to have a resume-from-suspend command, this would be the point where we would add a capabilities command adding the "working-s3" capability. But with capabilities, this is a direct QEMU->management tool interaction, not a proxy through the guest agent. We shouldn't trust the guest agent and we certainly don't want to rely on the guest agent to avoid sending an improper command to QEMU! That would be a security issue.
We need a way for qemu-ga to query qemu about the existence of a working S3 support. The set-support-level solves that.
qemu-ga is not an entry point for QEMU features. It's strictly a mechanism to ask the guest to do something. If we need to interact with QEMU directly to query a capability and/or presence of a command, then we should talk to QEMU directly. To put it another way, a management tool MUST deal with the fact that when issuing the suspend-to-ram command, a guest may ignore it or attempt to do something malicious. Regards, Anthony Liguori
Another option would be to disable (or enable) S3 by default in qemu-ga, and let the admin enable (or disable it) according to S3 support being fixed in qemu.

On Mon, 30 Jan 2012 07:54:56 -0600 Anthony Liguori <anthony@codemonkey.ws> wrote:
On 01/30/2012 06:57 AM, Luiz Capitulino wrote:
On Thu, 26 Jan 2012 16:57:01 -0600 Anthony Liguori<anthony@codemonkey.ws> wrote:
On 01/26/2012 01:35 PM, Luiz Capitulino wrote:
On Thu, 26 Jan 2012 08:18:03 -0700 Eric Blake<eblake@redhat.com> wrote:
[adding qemu-devel]
On 01/26/2012 07:46 AM, Daniel P. Berrange wrote:
> One thing, that you'll probably notice is this > 'set-support-level' command. Basically, it tells GA what qemu version > is it running on. Ideally, this should be done as soon as > GA starts up. However, that cannot be determined from outside > world as GA doesn't emit any events yet. > Ideally^2 this command should be left out as it should be qemu > who tells its own agent this kind of information. > Anyway, I was going to call this command in qemuProcess{Startup, > Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs > so guest can boot and start GA, but that implies returning from qemuProcess*. > > So I am setting this just before 'guest-suspend' command, as > there is one more thing about GA. It is unable to remember anything > upon its restart (GA process). Which has BTW show flaw > in our current code with FS freeze& thaw. If we freeze guest > FS, and somebody restart GA, the simple FS Thaw will not succeed as > GA thinks FS are not frozen. But that's a different cup of tea. > > Because of what written above, we need to call set-level > on every suspend.
IMHO all this says that the 'set-level' command is a conceptually unfixably broken design& should be killed in QEMU before it turns into an even bigger mess.
Can you elaborate on this? Michal and I talked on irc about making the compatibility level persistent, would that help?
Once we're in a situation where we need to call 'set-level' prior to every single invocation, you might as well just allow the QEMU version number to be passed in directly as an arg to the command you are running directly thus avoiding this horrificness.
Qemu folks, would you care to chime in on this?
Exactly how is the set-level command supposed to work? As I understand it, the goal is that if the guest has qemu-ga 1.1 installed, but is being run by qemu 1.0, then we want to ensure that any guest agent command supported by qemu-ga 1.1 but requiring features of qemu not present in qemu 1.0 will be properly rejected.
Not exactly, the default support of qemu-ga is qemu 1.0. This means that by default qemu-ga will only support qemu 1.0 even when running on qemu 2.0. This way the set-support-level command allows you to specify that qemu 2.0 features are supported.
Version numbers are meaningless. What happens when a bunch of features get backported by RHEL such that qemu-ga 1.0 ends up being a frankenstein version of 2.0?
The feature negotiation mechanism we have in QMP is the existence of a command. If we're in a position where we're trying to disable part of a command, it simply means that we should have multiple commands such that we can just remove the disabled part entirely.
You may have a point that we shouldn't be using the version number for that, but just switching to multiple commands doesn't solve the fundamental problem.
The fundamental problem is that, S3 in current (and old) qemu has two known bugs:
1. The screen is left black after S3 (it's a bug in seabios) 2. QEMU resumes the guest immediately (Gerd posted patches to address this)
We're going to address both issues in 1.1. However, if qemu-ga is installed in an old qemu and S3 is used, the bugs will be triggered.
It's a management tool problem.
Before a management tool issues a command, it should query the existence of the command to determine whether this version of QEMU has that capability. If the tool needs to use two commands, it should query the existence of both of them.
In this case, the management tool needs a qemu-ga command *and* a QEMU command (to resume from suspend) so it should query both of them.
Obviously, we wouldn't have a resume-from-suspend command in QEMU unless it S3 worked in QEMU as expected.
That's right, it's a coincidence, but I don't see why this wouldn't work. I think we should do the following then: 1. Drop the set-support-level command 2. Split the guest-suspend command into guest-suspend-ram, guest-suspend-hybrid, guest-suspend-disk 3. Libvirt should query for _QEMU_'s 'wakeup' command before issuing the guest-suspend-ram command Michal, Michael, do you agree?
Alternatively, if there really was no reason to have a resume-from-suspend command, this would be the point where we would add a capabilities command adding the "working-s3" capability.
But with capabilities, this is a direct QEMU->management tool interaction, not a proxy through the guest agent.
We shouldn't trust the guest agent and we certainly don't want to rely on the guest agent to avoid sending an improper command to QEMU! That would be a security issue.
Fair enough, although that makes me wonder if the planned feature of pulling qemu-ga's commands into the QMP namespace won't have similar security issues.
We need a way for qemu-ga to query qemu about the existence of a working S3 support. The set-support-level solves that.
qemu-ga is not an entry point for QEMU features. It's strictly a mechanism to ask the guest to do something. If we need to interact with QEMU directly to query a capability and/or presence of a command, then we should talk to QEMU directly.
To put it another way, a management tool MUST deal with the fact that when issuing the suspend-to-ram command, a guest may ignore it or attempt to do something malicious.
Regards,
Anthony Liguori
Another option would be to disable (or enable) S3 by default in qemu-ga, and let the admin enable (or disable it) according to S3 support being fixed in qemu.

On 01/30/2012 08:44 AM, Luiz Capitulino wrote:
On Mon, 30 Jan 2012 07:54:56 -0600 Anthony Liguori<anthony@codemonkey.ws> wrote:
On 01/30/2012 06:57 AM, Luiz Capitulino wrote:
On Thu, 26 Jan 2012 16:57:01 -0600 Anthony Liguori<anthony@codemonkey.ws> wrote:
On 01/26/2012 01:35 PM, Luiz Capitulino wrote:
On Thu, 26 Jan 2012 08:18:03 -0700 Eric Blake<eblake@redhat.com> wrote:
[adding qemu-devel]
On 01/26/2012 07:46 AM, Daniel P. Berrange wrote: >> One thing, that you'll probably notice is this >> 'set-support-level' command. Basically, it tells GA what qemu version >> is it running on. Ideally, this should be done as soon as >> GA starts up. However, that cannot be determined from outside >> world as GA doesn't emit any events yet. >> Ideally^2 this command should be left out as it should be qemu >> who tells its own agent this kind of information. >> Anyway, I was going to call this command in qemuProcess{Startup, >> Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs >> so guest can boot and start GA, but that implies returning from qemuProcess*. >> >> So I am setting this just before 'guest-suspend' command, as >> there is one more thing about GA. It is unable to remember anything >> upon its restart (GA process). Which has BTW show flaw >> in our current code with FS freeze& thaw. If we freeze guest >> FS, and somebody restart GA, the simple FS Thaw will not succeed as >> GA thinks FS are not frozen. But that's a different cup of tea. >> >> Because of what written above, we need to call set-level >> on every suspend. > > > IMHO all this says that the 'set-level' command is a conceptually > unfixably broken design& should be killed in QEMU before it turns > into an even bigger mess.
Can you elaborate on this? Michal and I talked on irc about making the compatibility level persistent, would that help?
> Once we're in a situation where we need to call 'set-level' prior > to every single invocation, you might as well just allow the QEMU > version number to be passed in directly as an arg to the command > you are running directly thus avoiding this horrificness.
Qemu folks, would you care to chime in on this?
Exactly how is the set-level command supposed to work? As I understand it, the goal is that if the guest has qemu-ga 1.1 installed, but is being run by qemu 1.0, then we want to ensure that any guest agent command supported by qemu-ga 1.1 but requiring features of qemu not present in qemu 1.0 will be properly rejected.
Not exactly, the default support of qemu-ga is qemu 1.0. This means that by default qemu-ga will only support qemu 1.0 even when running on qemu 2.0. This way the set-support-level command allows you to specify that qemu 2.0 features are supported.
Version numbers are meaningless. What happens when a bunch of features get backported by RHEL such that qemu-ga 1.0 ends up being a frankenstein version of 2.0?
The feature negotiation mechanism we have in QMP is the existence of a command. If we're in a position where we're trying to disable part of a command, it simply means that we should have multiple commands such that we can just remove the disabled part entirely.
You may have a point that we shouldn't be using the version number for that, but just switching to multiple commands doesn't solve the fundamental problem.
The fundamental problem is that, S3 in current (and old) qemu has two known bugs:
1. The screen is left black after S3 (it's a bug in seabios) 2. QEMU resumes the guest immediately (Gerd posted patches to address this)
We're going to address both issues in 1.1. However, if qemu-ga is installed in an old qemu and S3 is used, the bugs will be triggered.
It's a management tool problem.
Before a management tool issues a command, it should query the existence of the command to determine whether this version of QEMU has that capability. If the tool needs to use two commands, it should query the existence of both of them.
In this case, the management tool needs a qemu-ga command *and* a QEMU command (to resume from suspend) so it should query both of them.
Obviously, we wouldn't have a resume-from-suspend command in QEMU unless it S3 worked in QEMU as expected.
That's right, it's a coincidence, but I don't see why this wouldn't work.
I think we should do the following then:
1. Drop the set-support-level command 2. Split the guest-suspend command into guest-suspend-ram, guest-suspend-hybrid, guest-suspend-disk 3. Libvirt should query for _QEMU_'s 'wakeup' command before issuing the guest-suspend-ram command
Michal, Michael, do you agree?
Yup yup, looks sound to me.
Alternatively, if there really was no reason to have a resume-from-suspend command, this would be the point where we would add a capabilities command adding the "working-s3" capability.
But with capabilities, this is a direct QEMU->management tool interaction, not a proxy through the guest agent.
We shouldn't trust the guest agent and we certainly don't want to rely on the guest agent to avoid sending an improper command to QEMU! That would be a security issue.
Fair enough, although that makes me wonder if the planned feature of pulling qemu-ga's commands into the QMP namespace won't have similar security issues.
We need a way for qemu-ga to query qemu about the existence of a working S3 support. The set-support-level solves that.
qemu-ga is not an entry point for QEMU features. It's strictly a mechanism to ask the guest to do something. If we need to interact with QEMU directly to query a capability and/or presence of a command, then we should talk to QEMU directly.
To put it another way, a management tool MUST deal with the fact that when issuing the suspend-to-ram command, a guest may ignore it or attempt to do something malicious.
Regards,
Anthony Liguori
Another option would be to disable (or enable) S3 by default in qemu-ga, and let the admin enable (or disable it) according to S3 support being fixed in qemu.

On 01/30/2012 07:44 AM, Luiz Capitulino wrote:
I think we should do the following then:
1. Drop the set-support-level command 2. Split the guest-suspend command into guest-suspend-ram, guest-suspend-hybrid, guest-suspend-disk 3. Libvirt should query for _QEMU_'s 'wakeup' command before issuing the guest-suspend-ram command
Michal, Michael, do you agree?
I'm not Michal, but speaking for libvirt, this definitely sounds like the way to go. Questions: Should libvirt also check for 'wakeup' before attempting guest-suspend-hybrid? With guest-suspend-disk, what happens when the suspend completes? Does this look like a normal case of the guest powering down, which qemu then passes as an event to libvirt and libvirt then ends the qemu process? That would mean that the only difference from a normal guest shutdown is that on the next guest boot the guest's disk image allows to recover state from disk rather than booting from scratch; either way, a new qemu process is created to resume the guest, and qemu is doing nothing different in how it starts the guest (it's just that the guest itself does something different based on what it stored into the disk images before shutting down). Also, I know there has been talk about a qemu-ga command to resync clocks after a resume from S3 and/or 'loadvm'; is this command fully in place yet, and is it another command that libvirt should be checking for prior to allowing any S3 attempts? -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 01/30/2012 09:58 AM, Eric Blake wrote:
On 01/30/2012 07:44 AM, Luiz Capitulino wrote:
I think we should do the following then:
1. Drop the set-support-level command 2. Split the guest-suspend command into guest-suspend-ram, guest-suspend-hybrid, guest-suspend-disk 3. Libvirt should query for _QEMU_'s 'wakeup' command before issuing the guest-suspend-ram command
Michal, Michael, do you agree?
I'm not Michal, but speaking for libvirt, this definitely sounds like the way to go.
Questions:
Should libvirt also check for 'wakeup' before attempting guest-suspend-hybrid?
With guest-suspend-disk, what happens when the suspend completes? Does this look like a normal case of the guest powering down, which qemu then passes as an event to libvirt and libvirt then ends the qemu process? That would mean that the only difference from a normal guest shutdown is that on the next guest boot the guest's disk image allows to recover state from disk rather than booting from scratch; either way, a new qemu process is created to resume the guest, and qemu is doing nothing different in how it starts the guest (it's just that the guest itself does something different based on what it stored into the disk images before shutting down).
Also, I know there has been talk about a qemu-ga command to resync clocks after a resume from S3 and/or 'loadvm'; is this command fully in place yet, and is it another command that libvirt should be checking for prior to allowing any S3 attempts?
Hi Eric, I'm not aware of a clock re-sync command being worked on.. are we maybe talking about the guest-sync command qemu-ga currently has in place to re-sync the protocol stream after a client-side timeout?

On Mon, 30 Jan 2012 11:07:10 -0600 Michael Roth <mdroth@linux.vnet.ibm.com> wrote:
On 01/30/2012 09:58 AM, Eric Blake wrote:
On 01/30/2012 07:44 AM, Luiz Capitulino wrote:
I think we should do the following then:
1. Drop the set-support-level command 2. Split the guest-suspend command into guest-suspend-ram, guest-suspend-hybrid, guest-suspend-disk 3. Libvirt should query for _QEMU_'s 'wakeup' command before issuing the guest-suspend-ram command
Michal, Michael, do you agree?
I'm not Michal, but speaking for libvirt, this definitely sounds like the way to go.
Questions:
Should libvirt also check for 'wakeup' before attempting guest-suspend-hybrid?
Yes.
With guest-suspend-disk, what happens when the suspend completes? Does this look like a normal case of the guest powering down, which qemu then passes as an event to libvirt and libvirt then ends the qemu process?
Yes.
That would mean that the only difference from a normal guest shutdown is that on the next guest boot the guest's disk image allows to recover state from disk rather than booting from scratch; either way, a new qemu process is created to resume the guest, and qemu is doing nothing different in how it starts the guest (it's just that the guest itself does something different based on what it stored into the disk images before shutting down).
Exactly.
Also, I know there has been talk about a qemu-ga command to resync clocks after a resume from S3 and/or 'loadvm'; is this command fully in place yet, and is it another command that libvirt should be checking for prior to allowing any S3 attempts?
Hi Eric,
I'm not aware of a clock re-sync command being worked on.. are we maybe talking about the guest-sync command qemu-ga currently has in place to re-sync the protocol stream after a client-side timeout?
I've heard some conversations about doing what Eric is saying, but I don't have any details either. Eric, do you know whom is assigned to work on that on the qemu side?

On 30.01.2012 15:44, Luiz Capitulino wrote:
On Mon, 30 Jan 2012 07:54:56 -0600 Anthony Liguori <anthony@codemonkey.ws> wrote:
On 01/30/2012 06:57 AM, Luiz Capitulino wrote:
On Thu, 26 Jan 2012 16:57:01 -0600 Anthony Liguori<anthony@codemonkey.ws> wrote:
On 01/26/2012 01:35 PM, Luiz Capitulino wrote:
On Thu, 26 Jan 2012 08:18:03 -0700 Eric Blake<eblake@redhat.com> wrote:
[adding qemu-devel]
On 01/26/2012 07:46 AM, Daniel P. Berrange wrote: >> One thing, that you'll probably notice is this >> 'set-support-level' command. Basically, it tells GA what qemu version >> is it running on. Ideally, this should be done as soon as >> GA starts up. However, that cannot be determined from outside >> world as GA doesn't emit any events yet. >> Ideally^2 this command should be left out as it should be qemu >> who tells its own agent this kind of information. >> Anyway, I was going to call this command in qemuProcess{Startup, >> Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs >> so guest can boot and start GA, but that implies returning from qemuProcess*. >> >> So I am setting this just before 'guest-suspend' command, as >> there is one more thing about GA. It is unable to remember anything >> upon its restart (GA process). Which has BTW show flaw >> in our current code with FS freeze& thaw. If we freeze guest >> FS, and somebody restart GA, the simple FS Thaw will not succeed as >> GA thinks FS are not frozen. But that's a different cup of tea. >> >> Because of what written above, we need to call set-level >> on every suspend. > > > IMHO all this says that the 'set-level' command is a conceptually > unfixably broken design& should be killed in QEMU before it turns > into an even bigger mess.
Can you elaborate on this? Michal and I talked on irc about making the compatibility level persistent, would that help?
> Once we're in a situation where we need to call 'set-level' prior > to every single invocation, you might as well just allow the QEMU > version number to be passed in directly as an arg to the command > you are running directly thus avoiding this horrificness.
Qemu folks, would you care to chime in on this?
Exactly how is the set-level command supposed to work? As I understand it, the goal is that if the guest has qemu-ga 1.1 installed, but is being run by qemu 1.0, then we want to ensure that any guest agent command supported by qemu-ga 1.1 but requiring features of qemu not present in qemu 1.0 will be properly rejected.
Not exactly, the default support of qemu-ga is qemu 1.0. This means that by default qemu-ga will only support qemu 1.0 even when running on qemu 2.0. This way the set-support-level command allows you to specify that qemu 2.0 features are supported.
Version numbers are meaningless. What happens when a bunch of features get backported by RHEL such that qemu-ga 1.0 ends up being a frankenstein version of 2.0?
The feature negotiation mechanism we have in QMP is the existence of a command. If we're in a position where we're trying to disable part of a command, it simply means that we should have multiple commands such that we can just remove the disabled part entirely.
You may have a point that we shouldn't be using the version number for that, but just switching to multiple commands doesn't solve the fundamental problem.
The fundamental problem is that, S3 in current (and old) qemu has two known bugs:
1. The screen is left black after S3 (it's a bug in seabios) 2. QEMU resumes the guest immediately (Gerd posted patches to address this)
We're going to address both issues in 1.1. However, if qemu-ga is installed in an old qemu and S3 is used, the bugs will be triggered.
It's a management tool problem.
Before a management tool issues a command, it should query the existence of the command to determine whether this version of QEMU has that capability. If the tool needs to use two commands, it should query the existence of both of them.
In this case, the management tool needs a qemu-ga command *and* a QEMU command (to resume from suspend) so it should query both of them.
Obviously, we wouldn't have a resume-from-suspend command in QEMU unless it S3 worked in QEMU as expected.
That's right, it's a coincidence, but I don't see why this wouldn't work.
I think we should do the following then:
1. Drop the set-support-level command 2. Split the guest-suspend command into guest-suspend-ram, guest-suspend-hybrid, guest-suspend-disk 3. Libvirt should query for _QEMU_'s 'wakeup' command before issuing the guest-suspend-ram command
Michal, Michael, do you agree?
I don't see into qemu internals, but <like> to dropping set-support-level for sure. From my POV, splitting may solve the problem. But how hard would it be to fix suspend to ram in qemu, so libvirt doesn't have to query for wakeup command? It would be nice if we have easy to use interface. Michal

On Mon, 30 Jan 2012 17:08:33 +0100 Michal Privoznik <mprivozn@redhat.com> wrote:
On 30.01.2012 15:44, Luiz Capitulino wrote:
On Mon, 30 Jan 2012 07:54:56 -0600 Anthony Liguori <anthony@codemonkey.ws> wrote:
On 01/30/2012 06:57 AM, Luiz Capitulino wrote:
On Thu, 26 Jan 2012 16:57:01 -0600 Anthony Liguori<anthony@codemonkey.ws> wrote:
On 01/26/2012 01:35 PM, Luiz Capitulino wrote:
On Thu, 26 Jan 2012 08:18:03 -0700 Eric Blake<eblake@redhat.com> wrote:
> [adding qemu-devel] > > On 01/26/2012 07:46 AM, Daniel P. Berrange wrote: >>> One thing, that you'll probably notice is this >>> 'set-support-level' command. Basically, it tells GA what qemu version >>> is it running on. Ideally, this should be done as soon as >>> GA starts up. However, that cannot be determined from outside >>> world as GA doesn't emit any events yet. >>> Ideally^2 this command should be left out as it should be qemu >>> who tells its own agent this kind of information. >>> Anyway, I was going to call this command in qemuProcess{Startup, >>> Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs >>> so guest can boot and start GA, but that implies returning from qemuProcess*. >>> >>> So I am setting this just before 'guest-suspend' command, as >>> there is one more thing about GA. It is unable to remember anything >>> upon its restart (GA process). Which has BTW show flaw >>> in our current code with FS freeze& thaw. If we freeze guest >>> FS, and somebody restart GA, the simple FS Thaw will not succeed as >>> GA thinks FS are not frozen. But that's a different cup of tea. >>> >>> Because of what written above, we need to call set-level >>> on every suspend. >> >> >> IMHO all this says that the 'set-level' command is a conceptually >> unfixably broken design& should be killed in QEMU before it turns >> into an even bigger mess.
Can you elaborate on this? Michal and I talked on irc about making the compatibility level persistent, would that help?
>> Once we're in a situation where we need to call 'set-level' prior >> to every single invocation, you might as well just allow the QEMU >> version number to be passed in directly as an arg to the command >> you are running directly thus avoiding this horrificness. > > Qemu folks, would you care to chime in on this? > > Exactly how is the set-level command supposed to work? As I understand > it, the goal is that if the guest has qemu-ga 1.1 installed, but is > being run by qemu 1.0, then we want to ensure that any guest agent > command supported by qemu-ga 1.1 but requiring features of qemu not > present in qemu 1.0 will be properly rejected.
Not exactly, the default support of qemu-ga is qemu 1.0. This means that by default qemu-ga will only support qemu 1.0 even when running on qemu 2.0. This way the set-support-level command allows you to specify that qemu 2.0 features are supported.
Version numbers are meaningless. What happens when a bunch of features get backported by RHEL such that qemu-ga 1.0 ends up being a frankenstein version of 2.0?
The feature negotiation mechanism we have in QMP is the existence of a command. If we're in a position where we're trying to disable part of a command, it simply means that we should have multiple commands such that we can just remove the disabled part entirely.
You may have a point that we shouldn't be using the version number for that, but just switching to multiple commands doesn't solve the fundamental problem.
The fundamental problem is that, S3 in current (and old) qemu has two known bugs:
1. The screen is left black after S3 (it's a bug in seabios) 2. QEMU resumes the guest immediately (Gerd posted patches to address this)
We're going to address both issues in 1.1. However, if qemu-ga is installed in an old qemu and S3 is used, the bugs will be triggered.
It's a management tool problem.
Before a management tool issues a command, it should query the existence of the command to determine whether this version of QEMU has that capability. If the tool needs to use two commands, it should query the existence of both of them.
In this case, the management tool needs a qemu-ga command *and* a QEMU command (to resume from suspend) so it should query both of them.
Obviously, we wouldn't have a resume-from-suspend command in QEMU unless it S3 worked in QEMU as expected.
That's right, it's a coincidence, but I don't see why this wouldn't work.
I think we should do the following then:
1. Drop the set-support-level command 2. Split the guest-suspend command into guest-suspend-ram, guest-suspend-hybrid, guest-suspend-disk 3. Libvirt should query for _QEMU_'s 'wakeup' command before issuing the guest-suspend-ram command
Michal, Michael, do you agree?
I don't see into qemu internals, but <like> to dropping set-support-level for sure. From my POV, splitting may solve the problem. But how hard would it be to fix suspend to ram in qemu, so libvirt doesn't have to query for wakeup command? It would be nice if we have easy to use interface.
libivrt will have to always query for the wakeup command, because that's the only way to know if the qemu libvirt is talking to supports suspend to ram. Of course that you should always query for a command before issuing it too (you can use the guest-info command for that), because qemu-ga commands may be unavailable.

On 01/30/2012 06:57 AM, Luiz Capitulino wrote:
On Thu, 26 Jan 2012 16:57:01 -0600 Anthony Liguori<anthony@codemonkey.ws> wrote:
On 01/26/2012 01:35 PM, Luiz Capitulino wrote:
On Thu, 26 Jan 2012 08:18:03 -0700 Eric Blake<eblake@redhat.com> wrote:
[adding qemu-devel]
On 01/26/2012 07:46 AM, Daniel P. Berrange wrote:
One thing, that you'll probably notice is this 'set-support-level' command. Basically, it tells GA what qemu version is it running on. Ideally, this should be done as soon as GA starts up. However, that cannot be determined from outside world as GA doesn't emit any events yet. Ideally^2 this command should be left out as it should be qemu who tells its own agent this kind of information. Anyway, I was going to call this command in qemuProcess{Startup, Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs so guest can boot and start GA, but that implies returning from qemuProcess*.
So I am setting this just before 'guest-suspend' command, as there is one more thing about GA. It is unable to remember anything upon its restart (GA process). Which has BTW show flaw in our current code with FS freeze& thaw. If we freeze guest FS, and somebody restart GA, the simple FS Thaw will not succeed as GA thinks FS are not frozen. But that's a different cup of tea.
Because of what written above, we need to call set-level on every suspend.
IMHO all this says that the 'set-level' command is a conceptually unfixably broken design& should be killed in QEMU before it turns into an even bigger mess.
Can you elaborate on this? Michal and I talked on irc about making the compatibility level persistent, would that help?
Once we're in a situation where we need to call 'set-level' prior to every single invocation, you might as well just allow the QEMU version number to be passed in directly as an arg to the command you are running directly thus avoiding this horrificness.
Qemu folks, would you care to chime in on this?
Exactly how is the set-level command supposed to work? As I understand it, the goal is that if the guest has qemu-ga 1.1 installed, but is being run by qemu 1.0, then we want to ensure that any guest agent command supported by qemu-ga 1.1 but requiring features of qemu not present in qemu 1.0 will be properly rejected.
Not exactly, the default support of qemu-ga is qemu 1.0. This means that by default qemu-ga will only support qemu 1.0 even when running on qemu 2.0. This way the set-support-level command allows you to specify that qemu 2.0 features are supported.
Version numbers are meaningless. What happens when a bunch of features get backported by RHEL such that qemu-ga 1.0 ends up being a frankenstein version of 2.0?
The feature negotiation mechanism we have in QMP is the existence of a command. If we're in a position where we're trying to disable part of a command, it simply means that we should have multiple commands such that we can just remove the disabled part entirely.
You may have a point that we shouldn't be using the version number for that, but just switching to multiple commands doesn't solve the fundamental problem.
Agreed, but the multiple commands isn't really the fix here, it's libvirt querying for the "wakeup" command that Gerd's patches add. We implemented set-version-level as a way to let management tools obliviously report something simple, like the version information it parses from QEMU already, to let the guest agent Do The Right Thing without any deep insight into what it's host requirements are (which is also why we opted for versions over specific capabilities flags). But we can't rely on version levels applying in all cases, a distro might backport s3 support for 1.0, modify the guest agent version level dependencies accordingly, and thus render the that agent incompatible with, say, RHEL. So it was a naive approach on my part, and the issues the libvirt folks noted with not knowing if a reset occurred since the last set-support-level make it even less desirable (we could persist it as you suggested, and I am working on patches to persist fsfreeze state, but it's not worth trying to fix set-support-level). And at least in this case we have an easy out: libvirt knows it can't resume a guest without the presence of a QMP command to do so, and that doesn't require any intimate knowledge of how the agent works, it's just common sense. Breaking the suspend/hibernate stuff into multiple commands avoids the need for libvirt to special case based on specific parameters to the command. We can also imagine distro-specific implementations of the agent disabling s4 due to things like not having s4 patches for virtio in place, so it makes it easier (possible) to discover those situations as well. It may be that we opt for a single command for libvirt, but at least on the agent side it should be multiple discoverable ones. Hopefully this doesn't set anyone back too much :( IMHO it's the right way to go though.
The fundamental problem is that, S3 in current (and old) qemu has two known bugs:
1. The screen is left black after S3 (it's a bug in seabios) 2. QEMU resumes the guest immediately (Gerd posted patches to address this)
We're going to address both issues in 1.1. However, if qemu-ga is installed in an old qemu and S3 is used, the bugs will be triggered.
We need a way for qemu-ga to query qemu about the existence of a working S3 support. The set-support-level solves that.
Another option would be to disable (or enable) S3 by default in qemu-ga, and let the admin enable (or disable it) according to S3 support being fixed in qemu.

On 01/26/2012 09:18 AM, Eric Blake wrote:
[adding qemu-devel]
On 01/26/2012 07:46 AM, Daniel P. Berrange wrote:
One thing, that you'll probably notice is this 'set-support-level' command. Basically, it tells GA what qemu version is it running on. Ideally, this should be done as soon as GA starts up. However, that cannot be determined from outside world as GA doesn't emit any events yet. Ideally^2 this command should be left out as it should be qemu who tells its own agent this kind of information. Anyway, I was going to call this command in qemuProcess{Startup, Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs so guest can boot and start GA, but that implies returning from qemuProcess*.
So I am setting this just before 'guest-suspend' command, as there is one more thing about GA. It is unable to remember anything upon its restart (GA process). Which has BTW show flaw in our current code with FS freeze& thaw. If we freeze guest FS, and somebody restart GA, the simple FS Thaw will not succeed as GA thinks FS are not frozen. But that's a different cup of tea.
Because of what written above, we need to call set-level on every suspend.
IMHO all this says that the 'set-level' command is a conceptually unfixably broken design& should be killed in QEMU before it turns into an even bigger mess.
Once we're in a situation where we need to call 'set-level' prior to every single invocation, you might as well just allow the QEMU version number to be passed in directly as an arg to the command you are running directly thus avoiding this horrificness.
Qemu folks, would you care to chime in on this?
Big Ack on my part. I told Mike this afternoon that I wasn't going to take the pull request with this command in it. The fundamental problem here is simple--untested code is broken code. Until we introduce a resume from suspend command (such that we can test the guest agent suspend command), we shouldn't be implementing a guest-agent suspend command. As far as I can tell, the only reason we're introducing it is because we're trying to add a multiplexed command that does suspend to ram and suspend to disk. Since it's multiplexed, it's an all-or-nothing introduction. We're then adding a side-interface to attempt to deal work around that. Let's not introduce a multiplexed command in the first place. Here's what I suggest: 1) Throw away set-level interface. 2) Introduce a suspend-to-disk command. 3) Plan to introduce a suspend-to-ram command, but I won't pull it until we have the ability to test it successfully (which means we probably need a resume-from-ram command for QMP). 4) libvirt can probe the existence of suspend-to-disk in the guest agent and act accordingly. 5) To implement virDomainSuspendToRam (or whatever it will be called), libvirt should: a) check if the guest agent command 'suspend-to-ram' exists b) check if the QMP command 'resume-from-ram' exists 6) The recommendation of (5) should be prominently documented in qapi-schema.json 7) In order for libvirt to start implementing (5), we should stub out (3) in qapi-schema-guest.json but set gen=False. That commits us to the interface without actually introducing the command. Regards, Anthony Liguori

On 01/26/2012 09:18 AM, Eric Blake wrote:
[adding qemu-devel]
On 01/26/2012 07:46 AM, Daniel P. Berrange wrote:
One thing, that you'll probably notice is this 'set-support-level' command. Basically, it tells GA what qemu version is it running on. Ideally, this should be done as soon as GA starts up. However, that cannot be determined from outside world as GA doesn't emit any events yet. Ideally^2 this command should be left out as it should be qemu who tells its own agent this kind of information. Anyway, I was going to call this command in qemuProcess{Startup, Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs so guest can boot and start GA, but that implies returning from qemuProcess*.
So I am setting this just before 'guest-suspend' command, as there is one more thing about GA. It is unable to remember anything upon its restart (GA process). Which has BTW show flaw in our current code with FS freeze& thaw. If we freeze guest FS, and somebody restart GA, the simple FS Thaw will not succeed as GA thinks FS are not frozen. But that's a different cup of tea.
We could drop the state tracking (guest-fsfreeze-status) and issue the freeze/thaw unconditionally. Talked with Anthony and going stateless in general seems to solve a lot of nastiness that might pop up with the current implementation. I'll send some patches out soon for fsfreeze, guest-file* may end up getting similar treatment in the not-too-distant-future.
participants (6)
-
Anthony Liguori
-
Daniel P. Berrange
-
Eric Blake
-
Luiz Capitulino
-
Michael Roth
-
Michal Privoznik