[libvirt] [PATCHv2 libvirt] qemu: Issue rtc-reset-reinjection command after guest-set-time

https://bugzilla.redhat.com/show_bug.cgi?id=1103245

Some advice appeared on the qemu-devel list [1]. When a domain is
suspended and then resumed, the guest kernel is not aware of this. So
we've introduced the virDomainSetTime API, which resets the time within
the guest using qemu-ga. On the other hand, qemu itself tries to make
the RTC beat faster to catch up on the difference. But if we don't tell
qemu that the guest's time was reset via the other method, both
mechanisms are applied, again resulting in a wrong guest time. In order
to avoid summing both corrections, we need to tell qemu that it should
not use the RTC injection if the guest time is set via the guest agent.

1: http://www.mail-archive.com/qemu-devel@nongnu.org/msg236435.html

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
---

Notes:
    diff to v1:
    -fixed command name in subject
    -added testcase

 src/qemu/qemu_driver.c       | 10 ++++++++++
 src/qemu/qemu_monitor.c      | 33 +++++++++++++++++++++++++++++++++
 src/qemu/qemu_monitor.h      |  2 ++
 src/qemu/qemu_monitor_json.c | 21 +++++++++++++++++++++
 src/qemu/qemu_monitor_json.h |  2 ++
 tests/qemumonitorjsontest.c  |  1 +
 6 files changed, 69 insertions(+)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index b6219ba..bdfd155 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -16879,6 +16879,16 @@ qemuDomainSetTime(virDomainPtr dom,
     rv = qemuAgentSetTime(priv->agent, seconds, nseconds, rtcSync);
     qemuDomainObjExitAgent(vm);
 
+    if (!virDomainObjIsActive(vm)) {
+        virReportError(VIR_ERR_OPERATION_INVALID,
+                       "%s", _("domain is not running"));
+        goto endjob;
+    }
+
+    qemuDomainObjEnterMonitor(driver, vm);
+    rv = qemuMonitorRTCResetReinjection(priv->mon);
+    qemuDomainObjExitMonitor(driver, vm);
+
     if (rv < 0)
         goto endjob;
 
diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c
index 3d9f87b..77627bc 100644
--- a/src/qemu/qemu_monitor.c
+++ b/src/qemu/qemu_monitor.c
@@ -4037,3 +4037,36 @@ qemuMonitorGetGuestCPU(qemuMonitorPtr mon,
 
     return qemuMonitorJSONGetGuestCPU(mon, arch, data);
 }
+
+/**
+ * qemuMonitorRTCResetReinjection:
+ * @mon: Pointer to the monitor
+ *
+ * Issue rtc-reset-reinjection command.
+ * This should be used in cases where guest time is restored via
+ * guest agent so RTC injection is not needed (in fact it will
+ * confuse guest's RTC).
+ *
+ * Returns 0 on success
+ *        -1 on error.
+ */
+int
+qemuMonitorRTCResetReinjection(qemuMonitorPtr mon)
+{
+
+    VIR_DEBUG("mon=%p", mon);
+
+    if (!mon) {
+        virReportError(VIR_ERR_INVALID_ARG, "%s",
+                       _("monitor must not be NULL"));
+        return -1;
+    }
+
+    if (!mon->json) {
+        virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s",
+                       _("JSON monitor is required"));
+        return -1;
+    }
+
+    return qemuMonitorJSONRTCResetReinjection(mon);
+}
diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h
index c3695f2..4fd6f01 100644
--- a/src/qemu/qemu_monitor.h
+++ b/src/qemu/qemu_monitor.h
@@ -790,6 +790,8 @@ int qemuMonitorGetGuestCPU(qemuMonitorPtr mon,
                            virArch arch,
                            virCPUDataPtr *data);
 
+int qemuMonitorRTCResetReinjection(qemuMonitorPtr mon);
+
 /**
  * When running two dd process and using <> redirection, we need a
  * shell that will not truncate files.  These two strings serve that
diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c
index a62c02f..538110c 100644
--- a/src/qemu/qemu_monitor_json.c
+++ b/src/qemu/qemu_monitor_json.c
@@ -5851,3 +5851,24 @@ qemuMonitorJSONGetGuestCPU(qemuMonitorPtr mon,
         return -1;
     }
 }
+
+int
+qemuMonitorJSONRTCResetReinjection(qemuMonitorPtr mon)
+{
+    int ret = -1;
+    virJSONValuePtr cmd;
+    virJSONValuePtr reply = NULL;
+
+    if (!(cmd = qemuMonitorJSONMakeCommand("rtc-reset-reinjection",
+                                           NULL)))
+        return ret;
+
+    ret = qemuMonitorJSONCommand(mon, cmd, &reply);
+
+    if (ret == 0)
+        ret = qemuMonitorJSONCheckError(cmd, reply);
+
+    virJSONValueFree(cmd);
+    virJSONValueFree(reply);
+    return ret;
+}
diff --git a/src/qemu/qemu_monitor_json.h b/src/qemu/qemu_monitor_json.h
index 5f6c846..d8c9308 100644
--- a/src/qemu/qemu_monitor_json.h
+++ b/src/qemu/qemu_monitor_json.h
@@ -437,4 +437,6 @@ int qemuMonitorJSONGetDeviceAliases(qemuMonitorPtr mon,
 int qemuMonitorJSONGetGuestCPU(qemuMonitorPtr mon,
                                virArch arch,
                                virCPUDataPtr *data);
+
+int qemuMonitorJSONRTCResetReinjection(qemuMonitorPtr mon);
 #endif /* QEMU_MONITOR_JSON_H */
diff --git a/tests/qemumonitorjsontest.c b/tests/qemumonitorjsontest.c
index baee80a..e3fb4f7 100644
--- a/tests/qemumonitorjsontest.c
+++ b/tests/qemumonitorjsontest.c
@@ -2279,6 +2279,7 @@ mymain(void)
     DO_TEST_SIMPLE("inject-nmi", qemuMonitorJSONInjectNMI);
     DO_TEST_SIMPLE("system_wakeup", qemuMonitorJSONSystemWakeup);
     DO_TEST_SIMPLE("nbd-server-stop", qemuMonitorJSONNBDServerStop);
+    DO_TEST_SIMPLE("rtc-reset-reinjection", qemuMonitorJSONRTCResetReinjection);
     DO_TEST_GEN(qemuMonitorJSONSetLink);
     DO_TEST_GEN(qemuMonitorJSONBlockResize);
     DO_TEST_GEN(qemuMonitorJSONSetVNCPassword);
-- 
1.8.5.5

On 08/14/2014 02:24 AM, Michal Privoznik wrote:
+++ b/src/qemu/qemu_driver.c
@@ -16879,6 +16879,16 @@ qemuDomainSetTime(virDomainPtr dom,
     rv = qemuAgentSetTime(priv->agent, seconds, nseconds, rtcSync);
     qemuDomainObjExitAgent(vm);

+    if (!virDomainObjIsActive(vm)) {
+        virReportError(VIR_ERR_OPERATION_INVALID,
+                       "%s", _("domain is not running"));
+        goto endjob;
+    }
+
+    qemuDomainObjEnterMonitor(driver, vm);
+    rv = qemuMonitorRTCResetReinjection(priv->mon);
+    qemuDomainObjExitMonitor(driver, vm);

We have four combinations:

1. old qemu, old qga: command fails because qga doesn't support it, qemu tries to catch up time manually (might eventually match real time)
2. new qemu, old qga: command fails because qga doesn't support it, qemu tries to catch up time manually (might eventually match real time)
3. new qemu, new qga: both qga and qemu commands work, no additional catchup attempted and guest is now accurate
4. old qemu, new qga: qga succeeds, but qemu command fails, so we have overcorrected and qemu is trying to catch up time manually (overcorrected, so it cannot match real time)

I guess reporting failure in those three cases is fine, although I'm still worried about case 4. I'd feel a lot better if there were a qemu_capabilities.h bit that detects if the qemu command is present, and skip even attempting the qga command unless we ALSO know the qemu command is present (that is, use the capability check to completely avoid case 4, by turning it into the same behavior as case 1).

Weak ACK, depending on whether you agree with my desire to avoid attempting the qga command unless we also know the qemu command exists.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
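As an illustration of the four combinations above, the following is a minimal sketch of the decision logic with and without the suggested capability gate. It is not libvirt code: the `Outcome` enum and the functions `setTimeUngated` and `setTimeGated` are hypothetical names invented for this example, and the two booleans stand in for "qemu supports rtc-reset-reinjection" and "qga supports guest-set-time".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome labels for this sketch; not libvirt identifiers. */
typedef enum {
    OUTCOME_ACCURATE,      /* case 3: both commands worked */
    OUTCOME_CATCHING_UP,   /* cases 1 and 2: only qemu's own catch-up runs */
    OUTCOME_OVERCORRECTED  /* case 4: agent set the time, reinjection still runs */
} Outcome;

/* Agent command first, then the monitor command, with no capability check. */
Outcome
setTimeUngated(bool qemuHasReset, bool qgaHasSetTime)
{
    if (!qgaHasSetTime)
        return OUTCOME_CATCHING_UP;
    return qemuHasReset ? OUTCOME_ACCURATE : OUTCOME_OVERCORRECTED;
}

/* Same sequence, but the agent is not even tried unless qemu is known
 * to support the reset command, which folds case 4 into case 1. */
Outcome
setTimeGated(bool qemuHasReset, bool qgaHasSetTime)
{
    if (!qemuHasReset)
        return OUTCOME_CATCHING_UP;
    return setTimeUngated(qemuHasReset, qgaHasSetTime);
}
```

With the gate in place, the old qemu plus new qga combination reports failure up front and leaves the guest to qemu's catch-up alone, instead of applying both corrections.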

On 18.08.2014 17:28, Eric Blake wrote:
I guess reporting failure in those three cases is fine, although I'm still worried about case 4. I'd feel a lot better if there were a qemu_capabilities.h bit that detects if the qemu command is present, and skip even attempting the qga command unless we ALSO know the qemu command is present (that is, use the capability check to completely avoid case 4, by turning it into the same behavior as case 1).
Okay. Although I've just realized one (corner) case. From my understanding of rtc-reset-reinjection, it's only necessary if the guest was suspended for a while and the guest's RTC clock skewed. But what if I start a fresh new guest and just want to set its time (leaving aside for a while the reasoning why I would do that)? Is rtc-reset-reinjection necessary? I wouldn't say so. But on the other hand, libvirt doesn't know whether the RTC is synced already or not. Hence it's safer for libvirt to issue the command every single time.

In fact, there are two ways to set guest time:

a) {"execute":"guest-set-time"}
b) {"execute":"guest-set-time", "arguments":{"time":1234567890}}

While in case a) guest time is set by reading from the guest's RTC, in case b) guest time is set by calling settimeofday() and the RTC is written thereafter.

So is rtc-reset-reinjection necessary only for case a), and in case b) QEMU somehow detects the RTC write and cancels the reinjection itself?

Michal
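The two guest-set-time forms can be sketched as a small formatting helper. This is only an illustration, not how libvirt builds agent commands: `formatGuestSetTime` is a hypothetical function, and using a negative value to mean "no time argument" is a convention invented for this example. The value is passed through unchanged; as far as I recall, the qemu-ga schema documents `time` as nanoseconds since the epoch, so treat the unit as an assumption here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write a guest-set-time command into buf.
 * timeVal < 0 selects form a) (no argument, guest reads its RTC);
 * timeVal >= 0 selects form b) (explicit time argument).
 * Returns the number of characters written, or -1 if buf is too small. */
int
formatGuestSetTime(char *buf, size_t buflen, long long timeVal)
{
    int n;

    if (timeVal < 0)
        n = snprintf(buf, buflen, "{\"execute\":\"guest-set-time\"}");
    else
        n = snprintf(buf, buflen,
                     "{\"execute\":\"guest-set-time\", "
                     "\"arguments\":{\"time\":%lld}}", timeVal);

    /* snprintf reports the length it needed; treat truncation as an error. */
    if (n < 0 || (size_t)n >= buflen)
        return -1;
    return n;
}
```

Calling it with -1 yields form a), and calling it with 1234567890 yields form b) exactly as written in the message above.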

On Mon, Aug 18, 2014 at 06:29:42PM +0200, Michal Privoznik wrote:
In fact, there are two ways to set guest time:
a) {"execute":"guest-set-time"}
b) {"execute":"guest-set-time", "arguments":{"time":1234567890}}
While in the case a) guest time is set by reading from guest's RTC, in case of b) guest time is set by calling settimeofday() and RTC is written thereafter.
So is the rtc-reset-reinjection necessary only for case a) and in case b) QEMU somehow detects RTC write and cancels the reinjection itself?
Michal
rtc-reset-reinjection has been introduced because certain Windows versions will advance the guest system time (via rtc interrupt reinjection).

So if libvirt adjusts the guest system time via guest-set-time, allowing rtc interrupt reinjection to compensate for lost time, as well, will cause an incorrect guest system time.

So you should always use the guest-set-time / rtc-reset-reinjection pair.
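The double correction described here can be shown with simple arithmetic. The sketch below is a deliberately simplified model, not QEMU's actual reinjection logic: it assumes the guest lags by exactly the paused duration after resume, that the agent sets the clock exactly right, and that pending reinjection eventually advances the clock by the full paused duration. `finalGuestErrorSec` is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model: returns how far the guest clock ends up from real
 * time, in seconds (positive = ahead, negative = behind), once any
 * pending reinjection has been delivered. */
long long
finalGuestErrorSec(long long pausedSec, bool agentSetTime, bool reinjectionReset)
{
    long long error = -pausedSec;   /* guest is behind right after resume */
    long long backlog = pausedSec;  /* missed ticks qemu would reinject */

    if (agentSetTime)
        error = 0;                  /* guest-set-time corrects the clock */
    if (reinjectionReset)
        backlog = 0;                /* rtc-reset-reinjection drops the backlog */

    return error + backlog;
}
```

Under these assumptions, a guest paused for 60 seconds ends up 60 seconds ahead when guest-set-time is used without the reset, and on time when the two are used as a pair.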

On 08/19/2014 10:57 AM, Marcelo Tosatti wrote:
So you should always use the
guest-set-time rtc-reset-reinjection
pair.
But is that true both for the 'guest-set-time' no-arg case (which tells the guest to read the current RTC and update in-memory time accordingly), as well as the 'guest-set-time with time argument' case (which tells the guest to forcefully set in-memory time, then write that time back to the RTC)?

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

On Tue, Aug 19, 2014 at 11:00:26AM -0600, Eric Blake wrote:
But is that true both for the 'guest-set-time' no-arg case (which tells the guest to read the current RTC and update in-memory time accordingly), as well as the 'guest-set-time with time argument' case (which tells the guest to forcefully set in-memory time, then write that time back to the RTC)?
Yes.

On 19.08.2014 19:23, Marcelo Tosatti wrote:
Yes.
Okay then. I'm posting v3 here [1] which unconditionally calls the monitor command right after the guest agent command.

Michal

1: https://www.redhat.com/archives/libvir-list/2014-August/msg00867.html