[libvirt] [PATCH] qemu driver: sync guest time via qemu-guest-agent when domain is resumed

Since QEMU commit "kvmclock: clock should count only if vm is running" (http://lists.gnu.org/archive/html/qemu-devel/2013-06/msg01225.html), guests have their realtime clock stopped during pause.

To correct the situation, invoke guest agent to sync time from host time.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 8bcd98e..df01244 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -2695,6 +2695,40 @@ qemuProcessPrepareMonitorChr(virQEMUDriverConfigPtr cfg,
 }
 
+
+#include <sys/time.h>
+
+static void
+qemuAgentSyncGuestTime(virDomainObjPtr vm)
+{
+    int ret;
+    struct timeval tv;
+    char *result;
+    qemuDomainObjPrivatePtr priv;
+    char buf[500];
+
+    priv = vm->privateData;
+
+    ret = gettimeofday(&tv, NULL);
+    if (ret) {
+        virReportSystemError(errno, "%s", _("gettimeofday failure"));
+        return;
+    }
+
+    memset(buf, 0, sizeof(buf));
+
+    sprintf(buf, "{ \"execute\": \"guest-set-time\","
+                 "\"arguments\":{\"time\":%lld}}\" ",
+            tv.tv_usec * 1000 + (tv.tv_sec * 1000000000LL));
+
+    qemuDomainObjEnterAgent(vm);
+    ret = qemuAgentArbitraryCommand(priv->agent, buf, &result,
+                                    VIR_DOMAIN_QEMU_AGENT_COMMAND_DEFAULT);
+    qemuDomainObjExitAgent(vm);
+    if (ret < 0)
+        VIR_FREE(result);
+}
+
 /*
  * Precondition: vm must be locked, and a job must be active.
  * This method will call {Enter,Exit}Monitor
@@ -2727,6 +2761,7 @@ qemuProcessStartCPUs(virQEMUDriverPtr driver, virDomainObjPtr vm,
 
     if (ret == 0) {
         virDomainObjSetState(vm, VIR_DOMAIN_RUNNING, reason);
+        qemuAgentSyncGuestTime(vm);
     } else {
         if (virDomainLockProcessPause(driver->lockManager, vm, &priv->lockState) < 0)
             VIR_WARN("Unable to release lease on %s", vm->def->name);

On 02/10/2014 03:10 PM, Marcelo Tosatti wrote:
Since QEMU commit "kvmclock: clock should count only if vm is running" (http://lists.gnu.org/archive/html/qemu-devel/2013-06/msg01225.html), guests have their realtime clock stopped during pause.
To correct the situation, invoke guest agent to sync time from host time.
This needs to be conditional on new API (either a flag to existing API [except that virDomainResumeFlags is not quite existing yet], or new API altogether). We cannot require an API to depend on guest interaction (which an agent command does) without an explicit flag mentioning that it is okay. We may even decide to have conditional ACL checks (allowing some users the ability to restart a domain, but not to interact with the guest agent). For that matter, when we proposed adding virDomainResumeFlags as the way to add in a flag for requesting a sync via this API, Dan suggested it would be better to add a new API altogether instead of piggybacking on top of the existing API: https://www.redhat.com/archives/libvir-list/2014-February/msg00104.html
+
+static void
+qemuAgentSyncGuestTime(virDomainObjPtr vm)
+{
+    int ret;
+    struct timeval tv;
+    char *result;
+    qemuDomainObjPrivatePtr priv;
+    char buf[500];
+
+    priv = vm->privateData;
+
+    ret = gettimeofday(&tv, NULL);
+    if (ret) {
+        virReportSystemError(errno, "%s", _("gettimeofday failure"));
+        return;
+    }
+
+    memset(buf, 0, sizeof(buf));
+
+    sprintf(buf, "{ \"execute\": \"guest-set-time\","
+                 "\"arguments\":{\"time\":%lld}}\" ",
+            tv.tv_usec * 1000 + (tv.tv_sec * 1000000000LL));
Ewww. This should be using qemuAgentMakeCommand, not open-coding a string via printf (and even if we DO stick with printf, it should use virAsprintf, not sprintf). qemuAgentGuestSync is not the best example to be copying.
+
+    qemuDomainObjEnterAgent(vm);
+    ret = qemuAgentArbitraryCommand(priv->agent, buf, &result,
+                                    VIR_DOMAIN_QEMU_AGENT_COMMAND_DEFAULT);
Again, if you use qemuAgentMakeCommand, then this is much nicer with qemuAgentCommand(). qemuAgentArbitraryCommand should be reserved only for the public API in libvirt-qemu.so.
+    qemuDomainObjExitAgent(vm);
+    if (ret < 0)
+        VIR_FREE(result);
+}
+
 /*
  * Precondition: vm must be locked, and a job must be active.
  * This method will call {Enter,Exit}Monitor
@@ -2727,6 +2761,7 @@ qemuProcessStartCPUs(virQEMUDriverPtr driver, virDomainObjPtr vm,
     if (ret == 0) {
         virDomainObjSetState(vm, VIR_DOMAIN_RUNNING, reason);
+        qemuAgentSyncGuestTime(vm);
Making this unconditional as part of starting every guest is flat out wrong. It MUST be explicit, as not all guests will have an agent installed to be able to do this, and even for guests that have an agent installed, interacting with the guest agent may be deemed too much of a security risk in some setups.

-- 
Eric Blake   eblake redhat com   +1-919-301-3266
Libvirt virtualization library http://libvirt.org

On Mon, Feb 10, 2014 at 03:42:45PM -0700, Eric Blake wrote:
On 02/10/2014 03:10 PM, Marcelo Tosatti wrote:
Since QEMU commit "kvmclock: clock should count only if vm is running" (http://lists.gnu.org/archive/html/qemu-devel/2013-06/msg01225.html), guests have their realtime clock stopped during pause.
To correct the situation, invoke guest agent to sync time from host time.
This needs to be conditional on new API (either a flag to existing API [except that virDomainResumeFlags is not quite existing yet], or new API altogether).
We cannot require an API to depend on guest interaction (which an agent command does) without an explicit flag mentioning that it is okay. We may even decide to have conditional ACL checks (allowing some users the ability to restart a domain, but not to interact with the guest agent).
For that matter, when we proposed adding virDomainResumeFlags as the way to add in a flag for requesting a sync via this API, Dan suggested it would be better to add a new API altogether instead of piggybacking on top of the existing API:
https://www.redhat.com/archives/libvir-list/2014-February/msg00104.html
"So there's interesting, and I'm not sure I entirely agree about adding flags here. It seems to me that if the QEMU agent has a "set-time" capability we'd be better off having an explicit API virDomainSetTime(...). The action of resume + set time cannot be atomic so I don't see any point in overloading the "set time" functionality into the resume API call. Just let apps call the set time method if they so desire." Apps desire the guest time to be synchronized to host time, which is the standard 99% of usecases configuration. Ok, i was thinking of adding a new element GuestSyncTimeOnResume to control this, which would default to on (as users expect their guests time to be correct at all times, without additional commands, and rightly so). It seems unreasonable to me that every API user has to keep track of every location which a guest can possibly resume execution from a paused state: this can be hidden inside libvirt. Daniel, all the logic behind the necessity to set-time after guest resume (*) can be hidden inside libvirt. (*) all instances of guest resume: upon migration continuation, upon ENOSPACE pause continuation, all of them. Don't see the need to force the knowledge and maintenance of this state on libvirt users. Are you OK with the new knob, default on, to control sync time on guest resume, then?
+
+static void
+qemuAgentSyncGuestTime(virDomainObjPtr vm)
+{
+    int ret;
+    struct timeval tv;
+    char *result;
+    qemuDomainObjPrivatePtr priv;
+    char buf[500];
+
+    priv = vm->privateData;
+
+    ret = gettimeofday(&tv, NULL);
+    if (ret) {
+        virReportSystemError(errno, "%s", _("gettimeofday failure"));
+        return;
+    }
+
+    memset(buf, 0, sizeof(buf));
+
+    sprintf(buf, "{ \"execute\": \"guest-set-time\","
+                 "\"arguments\":{\"time\":%lld}}\" ",
+            tv.tv_usec * 1000 + (tv.tv_sec * 1000000000LL));
Ewww. This should be using qemuAgentMakeCommand, not open-coding a string via printf (and even if we DO stick with printf, it should use virAsprintf, not sprintf). qemuAgentGuestSync is not the best example to be copying.
Ok, I'll use that.
+
+    qemuDomainObjEnterAgent(vm);
+    ret = qemuAgentArbitraryCommand(priv->agent, buf, &result,
+                                    VIR_DOMAIN_QEMU_AGENT_COMMAND_DEFAULT);
Again, if you use qemuAgentMakeCommand, then this is much nicer with qemuAgentCommand(). qemuAgentArbitraryCommand should be reserved only for the public API in libvirt-qemu.so.
+    qemuDomainObjExitAgent(vm);
+    if (ret < 0)
+        VIR_FREE(result);
+}
+
 /*
  * Precondition: vm must be locked, and a job must be active.
  * This method will call {Enter,Exit}Monitor
@@ -2727,6 +2761,7 @@ qemuProcessStartCPUs(virQEMUDriverPtr driver, virDomainObjPtr vm,
     if (ret == 0) {
         virDomainObjSetState(vm, VIR_DOMAIN_RUNNING, reason);
+        qemuAgentSyncGuestTime(vm);
Making this unconditional as part of starting every guest is flat out wrong. It MUST be explicit, as not all guests will have an agent installed to be able to do this, and even for guests that have an agent installed, interacting with the guest agent may be deemed too much of a security risk in some setups.
OK.

On 11.02.2014 01:04, Marcelo Tosatti wrote:
On Mon, Feb 10, 2014 at 03:42:45PM -0700, Eric Blake wrote:
On 02/10/2014 03:10 PM, Marcelo Tosatti wrote:
Since QEMU commit "kvmclock: clock should count only if vm is running" (http://lists.gnu.org/archive/html/qemu-devel/2013-06/msg01225.html), guests have their realtime clock stopped during pause.
To correct the situation, invoke guest agent to sync time from host time.
This needs to be conditional on new API (either a flag to existing API [except that virDomainResumeFlags is not quite existing yet], or new API altogether).
We cannot require an API to depend on guest interaction (which an agent command does) without an explicit flag mentioning that it is okay. We may even decide to have conditional ACL checks (allowing some users the ability to restart a domain, but not to interact with the guest agent).
For that matter, when we proposed adding virDomainResumeFlags as the way to add in a flag for requesting a sync via this API, Dan suggested it would be better to add a new API altogether instead of piggybacking on top of the existing API:
https://www.redhat.com/archives/libvir-list/2014-February/msg00104.html
"So there's interesting, and I'm not sure I entirely agree about adding flags here. It seems to me that if the QEMU agent has a "set-time" capability we'd be better off having an explicit API virDomainSetTime(...). The action of resume + set time cannot be atomic so I don't see any point in overloading the "set time" functionality into the resume API call. Just let apps call the set time method if they so desire."
Apps desire the guest time to be synchronized to host time, which is the standard 99% of usecases configuration.
Ok, i was thinking of adding a new element GuestSyncTimeOnResume to control this, which would default to on (as users expect their guests time to be correct at all times, without additional commands, and rightly so).
It seems unreasonable to me that every API user has to keep track of every location which a guest can possibly resume execution from a paused state: this can be hidden inside libvirt.
Daniel, all the logic behind the necessity to set-time after guest resume (*) can be hidden inside libvirt.
(*) all instances of guest resume: upon migration continuation, upon ENOSPACE pause continuation, all of them.
Don't see the need to force the knowledge and maintenance of this state on libvirt users.
Are you OK with the new knob, default on, to control sync time on guest resume, then?
I think we may need both approaches. I too think that resume with syncing guest time is such a common use case that it deserves to be exposed as a single operation outside libvirt. On the other hand, we certainly want to expose this as a new virDomain{Get,Set}Time() API.

BTW: I would have implemented this already, but I'm still waiting for my qemu patch to be merged upstream:

http://lists.nongnu.org/archive/html/qemu-devel/2014-01/msg04293.html

I don't really want to start implementing this feature with the need to patch qemu myself.

Michal

On Tue, Feb 11, 2014 at 11:48:58AM +0100, Michal Privoznik wrote:
On 11.02.2014 01:04, Marcelo Tosatti wrote:
On Mon, Feb 10, 2014 at 03:42:45PM -0700, Eric Blake wrote:
On 02/10/2014 03:10 PM, Marcelo Tosatti wrote:
Since QEMU commit "kvmclock: clock should count only if vm is running" (http://lists.gnu.org/archive/html/qemu-devel/2013-06/msg01225.html), guests have their realtime clock stopped during pause.
To correct the situation, invoke guest agent to sync time from host time.
This needs to be conditional on new API (either a flag to existing API [except that virDomainResumeFlags is not quite existing yet], or new API altogether).
We cannot require an API to depend on guest interaction (which an agent command does) without an explicit flag mentioning that it is okay. We may even decide to have conditional ACL checks (allowing some users the ability to restart a domain, but not to interact with the guest agent).
For that matter, when we proposed adding virDomainResumeFlags as the way to add in a flag for requesting a sync via this API, Dan suggested it would be better to add a new API altogether instead of piggybacking on top of the existing API:
https://www.redhat.com/archives/libvir-list/2014-February/msg00104.html
"So there's interesting, and I'm not sure I entirely agree about adding flags here. It seems to me that if the QEMU agent has a "set-time" capability we'd be better off having an explicit API virDomainSetTime(...). The action of resume + set time cannot be atomic so I don't see any point in overloading the "set time" functionality into the resume API call. Just let apps call the set time method if they so desire."
Apps desire the guest time to be synchronized to host time, which is the standard 99% of usecases configuration.
Ok, i was thinking of adding a new element GuestSyncTimeOnResume to control this, which would default to on (as users expect their guests time to be correct at all times, without additional commands, and rightly so).
It seems unreasonable to me that every API user has to keep track of every location which a guest can possibly resume execution from a paused state: this can be hidden inside libvirt.
Daniel, all the logic behind the necessity to set-time after guest resume (*) can be hidden inside libvirt.
(*) all instances of guest resume: upon migration continuation, upon ENOSPACE pause continuation, all of them.
^^^^ (1).
Don't see the need to force the knowledge and maintenance of this state on libvirt users.
Are you OK with the new knob, default on, to control sync time on guest resume, then?
I think we may need both approaches. I too think that resume with syncing guest time is so common use case that is deserves to be exposed as a single operation outside the libvirt.
On the other hand, we certainly want to expose this as a new virDomain{Get,Set}Time() API.
Why a new API which requires modifications to every libvirt user (read: current libvirt users are broken until patched)?

Note it requires applications to keep track of every pause/resume instance (see item 1 above).

Let us please back, with facts, the suggestion to not perform guest-agent time sync inside libvirt without app knowledge and to instead have a separate operation.
BTW: I would have implement this already, but I'm still waiting for my qemu patch to be merged upstream:
http://lists.nongnu.org/archive/html/qemu-devel/2014-01/msg04293.html
I don't really want to start implementing this feature with need to patch qemu myself.
Michal
No problem.

On 02/11/2014 02:37 PM, Marcelo Tosatti wrote:
I think we may need both approaches. I too think that resume with syncing guest time is so common use case that is deserves to be exposed as a single operation outside the libvirt.
On the other hand, we certainly want to expose this as a new virDomain{Get,Set}Time() API.
Why a new API which requires modifications to every libvirt user (read: current libvirt users are broken until patched) ?
ALL existing libvirt users are "broken", in that NO ONE has been able to automatically resync guest time yet (at least, using supported libvirt API). This is a new feature; but as a new feature that involves guest cooperation, it must be explicitly requested by any client that wants to take advantage of it. We cannot change the default and automatically turn this feature on, because it has back-compat implications to existing libvirt clients that are not expecting libvirt to automatically try and munge guest time.
Note it requires applications to keep track of every pause/resume instance (see item 1 above).
Whether we make it a new API call that all clients must call every time they want, or a single knob that says that for the given VM, libvirt should automatically attempt to resync time after any resume action, isn't as important to me as the fact that it IS something new that gets explicitly requested.

We already have <on_poweroff> and friends in the guest XML that control what the guest does on a state transition to poweroff; maybe we could add an <on_resume> element that has instructions on what actions libvirt should take when a guest transitions from paused back to running. But the point is that such an XML element is an explicit request, and only appropriate for guests where the management stack trusts interacting with the guest agent.

Maybe even both additions would be appropriate (a knob for automatic libvirt syncing as well as a new API for explicit syncing outside the events where libvirt would normally do it automatically).
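[Editor's note: as a purely hypothetical sketch (neither the <on_resume> element nor the value below ever became part of the libvirt domain XML), Eric's proposed knob might read something like:]

```xml
<domain type='kvm'>
  <name>demo</name>
  <!-- hypothetical: action to take on paused -> running transitions -->
  <on_resume>sync-time</on_resume>
</domain>
```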

On Tue, Feb 11, 2014 at 02:45:28PM -0700, Eric Blake wrote:
On 02/11/2014 02:37 PM, Marcelo Tosatti wrote:
I think we may need both approaches. I too think that resume with syncing guest time is so common use case that is deserves to be exposed as a single operation outside the libvirt.
On the other hand, we certainly want to expose this as a new virDomain{Get,Set}Time() API.
Why a new API which requires modifications to every libvirt user (read: current libvirt users are broken until patched) ?
ALL existing libvirt users are "broken", in that NO ONE has been able to automatically resync guest time yet (at least, using supported libvirt API). This is a new feature; but as a new feature that involves guest cooperation, it must be explicitly requested by any client that wants to take advantage of it. We cannot change the default and automatically turn this feature on, because it has back-compat implications to existing libvirt clients that are not expecting libvirt to automatically try and munge guest time.
What libvirt clients expect is proper operation of their guests. Proper operation, for 99% of use cases, is to have both guest and host time synchronized to UTC.
From the first message in this thread:
"Since QEMU commit "kvmclock: clock should count only if vm is running" (http://lists.gnu.org/archive/html/qemu-devel/2013-06/msg01225.html), guests have their realtime clock stopped during pause." So Linux guests did have their clocks counting during suspend/resume before, which meant that on resume the realtime clock was synchronized to UTC. However, we can't do that (the link above contains more information), but 99% of deployments still want their guests clock to be synchronized to UTC. So it is not really a new feature.
Note it requires applications to keep track of every pause/resume instance (see item 1 above).
Whether we make it a new API call that all clients must call every time they want, or a single knob that says that for the given VM, libvirt should automatically attempt to resync time after any resume action, isn't as important to me as the fact that it IS something new that gets explicitly requested. We already have <on_poweroff> and friends in the guest XML that control what the guest does on a state transition to poweroff, maybe we could add an <on_resume> element that has instructions on what actions libvirt should take when a guest transitions from paused back to running. But the point is that such an XML element is an explicit request, and only appropriate for guests where the management stack trusts interacting with the guest agent.
Sure.
Maybe even both additions would be appropriate (a knob for automatic libvirt syncing as well as a new API for explicit syncing outside the events where libvirt would normally do it automatically).
OK, what I am trying to do is to make sure you understand why libvirt users are not really interested in this as a new feature (because it is not). OK, I will improve on the knob and resubmit.

On 11.02.2014 22:45, Eric Blake wrote:
On 02/11/2014 02:37 PM, Marcelo Tosatti wrote:
I think we may need both approaches. I too think that resume with syncing guest time is so common use case that is deserves to be exposed as a single operation outside the libvirt.
On the other hand, we certainly want to expose this as a new virDomain{Get,Set}Time() API.
Why a new API which requires modifications to every libvirt user (read: current libvirt users are broken until patched) ?
ALL existing libvirt users are "broken", in that NO ONE has been able to automatically resync guest time yet (at least, using supported libvirt API). This is a new feature; but as a new feature that involves guest cooperation, it must be explicitly requested by any client that wants to take advantage of it. We cannot change the default and automatically turn this feature on, because it has back-compat implications to existing libvirt clients that are not expecting libvirt to automatically try and munge guest time.
Note it requires applications to keep track of every pause/resume instance (see item 1 above).
Whether we make it a new API call that all clients must call every time they want, or a single knob that says that for the given VM, libvirt should automatically attempt to resync time after any resume action, isn't as important to me as the fact that it IS something new that gets explicitly requested. We already have <on_poweroff> and friends in the guest XML that control what the guest does on a state transition to poweroff, maybe we could add an <on_resume> element that has instructions on what actions libvirt should take when a guest transitions from paused back to running. But the point is that such an XML element is an explicit request, and only appropriate for guests where the management stack trusts interacting with the guest agent.
Maybe even both additions would be appropriate (a knob for automatic libvirt syncing as well as a new API for explicit syncing outside the events where libvirt would normally do it automatically).
I don't think we should add the <on_resume/> element. After all, it would be the same as:

virDomainResumeFlags(dom, VIR_SYNC_TIME)

which, as Dan correctly argues, makes sensible error reporting impossible. If the domain is resumed, libvirt should execute the <on_resume/> action, which may fail.

Michal

On 02/13/2014 02:09 AM, Michal Privoznik wrote:
Maybe even both additions would be appropriate (a knob for automatic libvirt syncing as well as a new API for explicit syncing outside the events where libvirt would normally do it automatically).
I don't think we should add the <on_resume/> element. After all, it would be the same as:
virDomainResumeFlags(dom, VIR_SYNC_TIME)
which as Dan correctly argues, makes sensible error reporting impossible. If the domain is resumed, libvirt should execute the <on_resume/> action, which may fail.
Or maybe we document that the virDomainResume() API tracks just the resume state, then we add a new libvirt event that reports the result of any attempt at a sync operation. Then when you resume, and <on_resume> says to attempt an auto-sync, you can also listen for the libvirt event that tells you the status of the sync (whether it succeeded (and to what timestamp) or failed).

On 13.02.2014 14:32, Eric Blake wrote:
On 02/13/2014 02:09 AM, Michal Privoznik wrote:
Maybe even both additions would be appropriate (a knob for automatic libvirt syncing as well as a new API for explicit syncing outside the events where libvirt would normally do it automatically).
I don't think we should add the <on_resume/> element. After all, it would be the same as:
virDomainResumeFlags(dom, VIR_SYNC_TIME)
which as Dan correctly argues, makes sensible error reporting impossible. If the domain is resumed, libvirt should execute the <on_resume/> action, which may fail.
Or maybe we document that the virDomainResume() api tracks just the resume state, then we add a new libvirt event that says the result of any attempt at a sync operation. Then when you resume, and <on_resume> says to attempt an auto-sync, then you can also listen for the libvirt event that tells you the status of the sync (whether it succeeded (and to what timestamp) or failed).
I think we are going the most complicated way here. Again.

We are not in freeze, so there's still time for new APIs. But then again, prior to posting patches to the libvirt list, I'd expect qemu to be patched already (I've proposed a patch on the qemu-devel list and it got acked but not merged yet). My requirement may feel unnatural to somebody, but it wouldn't be the first time that libvirt implemented something by blindly trusting in a design/API that was just discussed on qemu-devel. Unfortunately for us, the design/API then changed, leaving libvirt hanging in the air. Once the qemu patch is merged, it serves to me as a kind of guarantee that there is wide consensus on the design/API and it will not change (of course, it only gets written in stone after the release, but I'm just fine with pushed-to-git).

Having said that, I have patches that add virDomain{Get,Set}Time ready, even though I need to polish them a bit before they are ready to send.

Michal

On Thu, Feb 13, 2014 at 03:02:38PM +0100, Michal Privoznik wrote:
On 13.02.2014 14:32, Eric Blake wrote:
On 02/13/2014 02:09 AM, Michal Privoznik wrote:
Maybe even both additions would be appropriate (a knob for automatic libvirt syncing as well as a new API for explicit syncing outside the events where libvirt would normally do it automatically).
I don't think we should add the <on_resume/> element. After all, it would be the same as:
virDomainResumeFlags(dom, VIR_SYNC_TIME)
which as Dan correctly argues, makes sensible error reporting impossible. If the domain is resumed, libvirt should execute the <on_resume/> action, which may fail.
Or maybe we document that the virDomainResume() api tracks just the resume state, then we add a new libvirt event that says the result of any attempt at a sync operation. Then when you resume, and <on_resume> says to attempt an auto-sync, then you can also listen for the libvirt event that tells you the status of the sync (whether it succeeded (and to what timestamp) or failed).
I think that leads to a pretty unpleasant usage pattern for applications. eg instead of just getting an exception from a dedicated API call in python, the app has to setup an event loop thread and wait around in the hope an event may arrive at some point after the API call. I don't see any compelling reason why we want to make such an unpleasant API out of virDomainResume, when we can just create a dedicated API that does what we need.
I think we are going the most complicated way here. Again. We are not in freeze so there's still time for new APIs. But then again, prior to posting patches to the libvirt list, I'd expect qemu to be patched already (I've proposed patch on the qemu-devel list and it got acked but not merged yet). My requirement may feel unnatural for somebody, but it wouldn't be for the first time that libvirt implemented something with blindly trusting in the design/API that was just discussed on the qemu-devel. Unfortunately for us, the design/API had changed leaving libvirt hanging in the air. Once the qemu patch is merged, it serves to me as a kind of guarantee that there is wide consensus on the design / API and it will not change (of course, it's get written in stone after the release, but I'm just fine with pushed-to-git).
Having said that, I have patches that add virDomain{Get,Set}Time ready. Even though I need to polish them a bit before they are ready to send.
Might as well post them for review when you have them ready, even if QEMU isn't merged. Worst case, we ACK them, but delay merging them in libvirt until the QEMU side is merged.

Daniel

-- 
|: http://berrange.com      -o-  http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org       -o-  http://virt-manager.org :|
|: http://autobuild.org     -o-  http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On Tue, Feb 11, 2014 at 11:48:58AM +0100, Michal Privoznik wrote:
On 11.02.2014 01:04, Marcelo Tosatti wrote:
On Mon, Feb 10, 2014 at 03:42:45PM -0700, Eric Blake wrote:
On 02/10/2014 03:10 PM, Marcelo Tosatti wrote:
Since QEMU commit "kvmclock: clock should count only if vm is running" (http://lists.gnu.org/archive/html/qemu-devel/2013-06/msg01225.html), guests have their realtime clock stopped during pause.
To correct the situation, invoke guest agent to sync time from host time.
This needs to be conditional on new API (either a flag to existing API [except that virDomainResumeFlags is not quite existing yet], or new API altogether).
We cannot require an API to depend on guest interaction (which an agent command does) without an explicit flag mentioning that it is okay. We may even decide to have conditional ACL checks (allowing some users the ability to restart a domain, but not to interact with the guest agent).
For that matter, when we proposed adding virDomainResumeFlags as the way to add in a flag for requesting a sync via this API, Dan suggested it would be better to add a new API altogether instead of piggybacking on top of the existing API:
https://www.redhat.com/archives/libvir-list/2014-February/msg00104.html
"So there's interesting, and I'm not sure I entirely agree about adding flags here. It seems to me that if the QEMU agent has a "set-time" capability we'd be better off having an explicit API virDomainSetTime(...). The action of resume + set time cannot be atomic so I don't see any point in overloading the "set time" functionality into the resume API call. Just let apps call the set time method if they so desire."
Apps desire the guest time to be synchronized to host time, which is the standard 99% of usecases configuration.
Ok, i was thinking of adding a new element GuestSyncTimeOnResume to control this, which would default to on (as users expect their guests time to be correct at all times, without additional commands, and rightly so).
It seems unreasonable to me that every API user has to keep track of every location which a guest can possibly resume execution from a paused state: this can be hidden inside libvirt.
Daniel, all the logic behind the necessity to set-time after guest resume (*) can be hidden inside libvirt.
(*) all instances of guest resume: upon migration continuation, upon ENOSPACE pause continuation, all of them.
Don't see the need to force the knowledge and maintenance of this state on libvirt users.
Are you OK with the new knob, default on, to control sync time on guest resume, then?
I think we may need both approaches. I too think that resume with syncing guest time is so common use case that is deserves to be exposed as a single operation outside the libvirt.
On the other hand, we certainly want to expose this as a new virDomain{Get,Set}Time() API.
I agree that this should be a completely separate command, not merely a flag to the Resume API. The reason is that you cannot do sensible error reporting if you overload this in one API call. i.e. consider that resuming the CPUs succeeds, but syncing time fails. If you return "success" to the caller you are lying about the result of the time sync. If you return "error" to the caller you are lying about the result of resuming the CPUs.

If there is a separate API to invoke then the caller can clearly see which operation succeeded vs failed.

Daniel

On Wed, Feb 12, 2014 at 10:22:11AM +0000, Daniel P. Berrange wrote:
I agree that this should be a completely separate command, not merely a flag to the Resume API. The reason is that you cannot do sensible error reporting if you overload this in one API call: consider that resuming the CPUs succeeds, but syncing time fails. If you return "success" to the caller you are lying about the result of the time sync. If you return "error" to the caller you are lying about the result of resuming the CPUs.
If there is a separate API to invoke then the caller can clearly see which operation succeeded vs failed.
Well then just require a new guest agent command. No need for a separate command.

Going from "resume" to "resume-and-sync" versus going from "resume" to "resume; send-guest-agent-command" is not very different, is it? In both cases the qemu guest agent channel must be operational.

I'll go write a virsh alias and virt-manager patch.

(BTW, error reporting would be in the libvirt logs: GuestAgentError with all the details of why the time sync failed.)

On Wed, Feb 12, 2014 at 06:45:28PM -0200, Marcelo Tosatti wrote:
On Wed, Feb 12, 2014 at 10:22:11AM +0000, Daniel P. Berrange wrote:
[...]
Well then just require a new guest agent command. No need for a separate command.
Going from "resume" to "resume-and-sync" versus
Going from "resume" to "resume; send-guest-agent-command"
Is not very different is it? In both cases qemu guest agent channel must be operational.
Well, doh, that fails to hide QEMU. OK then, a resume-and-sync-time command.
I'll go write a virsh alias and virt-manager patch.
(BTW, error reporting would be in the libvirt logs, GuestAgentError with all the details why time-sync failed).

On 02/12/2014 01:45 PM, Marcelo Tosatti wrote:
On Wed, Feb 12, 2014 at 10:22:11AM +0000, Daniel P. Berrange wrote:
[...]
Well then just require a new guest agent command. No need for a separate command.
Going from "resume" to "resume-and-sync" versus
Going from "resume" to "resume; send-guest-agent-command"
Is not very different is it? In both cases qemu guest agent channel must be operational.
You're confusing things. 'resume' is a command to qemu; 'sync' is a command to the guest agent. The guest agent can't respond unless the guest is running. If 'resume' fails, the guest is not running. But if 'sync' fails, the guest is necessarily in a different state than when you started. Having to report failure after the guest ran an indeterminate number of instructions is not good.

The guest agent is not in charge of resume, therefore adding a guest agent command to do resume-and-sync is not possible. And qemu is not in charge of talking to the guest; only the agent is. So adding a new qemu command to resume-and-sync is not possible either.

We really are talking about two disparate operations by two different actors, where failure of the second action cannot be rolled back to the state of the first action.
I'll go write a virsh alias and virt-manager patch.
(BTW, error reporting would be in the libvirt logs, GuestAgentError with all the details why time-sync failed).
-- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 02/12/2014 02:05 PM, Eric Blake wrote:
I'll go write a virsh alias and virt-manager patch.
A single virsh command that makes multiple API calls under the hood is just fine: on partial failure, it can make it quite clear which of the steps failed. I have no problem with creating a 'virsh resume --sync' that issues both the resume API call and the sync agent API call. But that doesn't change the fact that the libvirt.so API still needs to be separate.

On Wed, Feb 12, 2014 at 02:05:11PM -0700, Eric Blake wrote:
On 02/12/2014 01:45 PM, Marcelo Tosatti wrote:
On Wed, Feb 12, 2014 at 10:22:11AM +0000, Daniel P. Berrange wrote:
[...]
You're confusing things. 'resume' is a command to qemu. 'sync' is a command to the guest agent. The guest agent can't respond unless the guest is running. If 'resume' fails, the guest is not running. But if 'sync' fails, the guest is necessarily in a different state than when you started. Having to report failure after the guest ran an indeterminate number of instructions is not good.
The guest agent is not in charge of resume, therefore adding a guest agent command to do resume-and-sync is not possible.
And qemu is not in charge of talking to the guest, only the agent is. So adding a new qemu command to resume-and-sync is not possible.
We really are talking about two disparate operations by two different actors, and where failure of the second action cannot be rolled back to the state of the first action.
Yes, exactly: unless you can atomically succeed or fail the whole thing without side effects, they must be separate APIs. It is impossible to roll back execution of the guest after resuming it, so that's a blocking factor forcing separate APIs for this task. Doing a single monitor command in QEMU can't magically solve this, as Eric describes.

Daniel
participants (4):
- Daniel P. Berrange
- Eric Blake
- Marcelo Tosatti
- Michal Privoznik