Re: [libvirt] [PATCHv2 3/4] qemu: fix RTC_CHANGE event for <clock offset='variable' basis='utc'/>

23 May 2014

On 05/23/2014 04:19 AM, Laine Stump wrote:
...
On 05/23/2014 12:17 PM, Laine Stump wrote:
...
*However*, this discussion forced me to investigate some of the basic
assumptions that I'd been making when coming in to fix this bug. In
particular, my assumption was that the value of "adjustment" that was
set in the status would be preserved across a domain save/restore
operation, or a migration, but after talking to jdenemar and looking
at the code, I believe that this is *not* the case.

Okay, disregard this "sky is falling" outburst. I was misreading the
code and misinterpreting what jdenemar told me. The updated value of
adjustment and basis *are* properly preserved across save/restore and
migrate.

Okay, now to rephrase things to see if I understand correctly, or maybe
add more confusion to the discussion.

If I understand it, the qemu event is always outputting the 'current
adjustment to the command-line offset', and not 'the delta applied to
the most recent RTC change'; while libvirt is trying to report 'the
current delta that the guest is using in relation to UTC'.  If the
command line was specified with UTC as the basis (that is, 0
command-line offset), then the qemu offset happens to also be the
libvirt desired offset from UTC.  If the command line was specified with
any other offset, then the offset from UTC is always the sum of the
command line initial offset + the current offset reported by the event,
and libvirt is not altering the initial offset while the guest runs.
And I don't think libvirt has a way for management to query the current
offset (libvirt was only tracking the current offset of running domains
in its private xml).
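
To make the arithmetic concrete, here is a minimal sketch of the
relationship as I understand it.  The function and parameter names are
hypothetical (not libvirt or qemu identifiers), and offsets are assumed
to be in seconds:

```python
def guest_offset_from_utc(cmdline_offset: int, event_offset: int) -> int:
    """Return how far the guest RTC is from UTC, in seconds.

    cmdline_offset: initial offset qemu was started with (0 when the
                    basis is UTC).  Hypothetical name.
    event_offset:   value carried by the most recent RTC change event,
                    assumed to be the current adjustment relative to
                    cmdline_offset (not a delta since the last event).
    """
    return cmdline_offset + event_offset


# Basis UTC: the event value is already the offset from UTC.
print(guest_offset_from_utc(0, 3600))

# Non-zero starting offset: the two values have to be summed.
print(guest_offset_from_utc(-18000, 3600))
```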

For a guest that uses localtime bios, you would start the guest
initially with a non-zero command-line offset (representing the offset
of the timezone the guest thinks it is running in).  Twice a year, the
guest will change its RTC by an hour, and the GOAL is that if we stop
the guest and then restart it later, the restarted guest will resume
with bios at the same offset from UTC as what it last had (that is, the
freshly-booted guest will behave as if bios has silently advanced by the
amount of time that the guest was not running, similar to how bare-metal
hardware has a battery powering the RTC even when the box is completely
off and unplugged).  If, during the downtime, the guest has crossed a
daylight savings boundary, we don't care - the guest OS upon boot will
recognize that it is using localtime bios, and that it missed a dst
boundary while the machine was powered off, and so it will presumably do
an RTC update by an hour fairly early in its boot process - which will
be caught as an RTC update event so we once again know the new offset to
be applied the next time we boot the guest.  For this to work, a
persistent guest definition needs to record the offset that was in use
at the time the guest powered off, and the next time qemu is started,
pass _that_ offset as the command-line offset against UTC.  In libvirt's
case, we intentionally run qemu to stay alive even after the guest
quits, and send an event that the guest is no longer running; this event
could be our trigger to read the qemu offset and adjust the persistent
state to track the new offset to use on the next boot of the guest.
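
A rough sketch of that bookkeeping, under the assumption that the event
value is always relative to the offset qemu was started with.  The class
and method names are made up for illustration and do not correspond to
anything in libvirt:

```python
from dataclasses import dataclass


@dataclass
class ClockState:
    """Hypothetical per-domain clock bookkeeping (offsets in seconds)."""

    persistent_offset: int  # offset from UTC stored in the persistent config
    event_offset: int = 0   # latest adjustment reported while running

    def on_rtc_change(self, offset: int) -> None:
        # Assumption: the event reports the total adjustment relative to
        # the starting offset, so we overwrite rather than accumulate.
        self.event_offset = offset

    def on_guest_stopped(self) -> int:
        # Fold the last known adjustment into the persistent offset so
        # the next boot starts qemu with the offset the guest last used.
        self.persistent_offset += self.event_offset
        self.event_offset = 0
        return self.persistent_offset


state = ClockState(persistent_offset=3 * 3600)
state.on_rtc_change(3600)          # guest moves RTC forward one hour
next_boot = state.on_guest_stopped()
print(next_boot)                   # offset to pass on the next start
```

The same object can be reused for the next run: a later adjustment of
-3600 followed by a stop would bring the persistent offset back to three
hours.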

More interesting is migration (whether by saving to a file or by
migration between hosts).  Libvirt is able to query the current qemu
offset prior to starting the migration, but what happens if the guest
changes the RTC on the source after the time that libvirt did the query
but before the guest is paused in preparation for the destination to
start running the guest?  Is the RTC offset transmitted as part of the
migration stream, even if it is a different offset at the time the
migration finally converges than what it was when the migration started?
Furthermore, what command line offset should libvirt tell the
destination machine to use, since qemu is only tracking the RTC offset
relative to the command line, and not the offset relative to UTC?  Is
there a race here?

More concretely, suppose I start a guest with a command-line offset of 3
hours.  Then the guest does an RTC change to 4 hours (an offset of one
hour from the command line).  Then I want to do a migration - do I tell
the destination qemu to start with a 3 hour offset (identical to what
the source had) and the migration stream corrects it so that the
destination picks up with RTC already at 4 hours (and the guest sees no
discontinuity)? Or does libvirt have to query the current offset at the
time it gets ready to start the migration, and start the destination
with a command-line offset of 4 hours, because the offset is not
migrated?  If the latter, what happens if the guest changes RTC in
between when libvirt queries the source for the value to give to the
destination, vs. actually starting the migration?  It sounds like the
only sane way for migration to work is for the RTC offset to be part of
the stream, and that the destination must be started with the SAME
command line as the source; and therefore, the only time the command
line should be altered is when the guest is shut down (not for live
migration) - where libvirt must update the domain XML to track the
offset in use at the time the guest shuts down.
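
The race can be shown with a toy model.  This is only a sketch of the two
alternatives described above, not a statement about what qemu actually
does; the function name and the extra RTC change to 5 hours are invented
for the example:

```python
HOUR = 3600


def destination_offset(cmdline: int, adj_at_query: int,
                       adj_at_pause: int, offset_in_stream: bool) -> int:
    """Offset from UTC that the destination guest resumes with.

    cmdline:          offset the source qemu was started with
    adj_at_query:     adjustment when libvirt queried the source
    adj_at_pause:     adjustment when the guest was finally paused
    offset_in_stream: True if the migration stream carries the
                      adjustment (assumption under discussion)
    """
    if offset_in_stream:
        # Destination uses the SAME command line as the source and the
        # stream supplies whatever adjustment was current at pause time.
        return cmdline + adj_at_pause
    # Otherwise the destination only knows what libvirt queried earlier.
    return cmdline + adj_at_query


# Source started at 3 hours, guest moved to 4 hours before the query,
# then (hypothetically) to 5 hours before being paused.
with_stream = destination_offset(3 * HOUR, 1 * HOUR, 2 * HOUR, True)
without_stream = destination_offset(3 * HOUR, 1 * HOUR, 2 * HOUR, False)
print(with_stream // HOUR, without_stream // HOUR)
```

In the second case the destination resumes one hour behind what the
guest last set, which is the discontinuity the questions above are
worried about.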

Meanwhile, I know we also recently added the qemu-guest-agent command to
set time, with two modes: 1. tell the guest what UTC time it should be
using, and update RTC to match (which will trigger an RTC change event;
and perhaps can be used to snoop whether the guest is setting RTC to a
localtime rather than UTC time); 2. tell the guest to re-read the RTC
and adjust its local time to match (which pre-supposes that RTC is
accurate).  The idea is that after management does something where the
guest has a long downtime (either migration to file, or a long suspend),
the guest's internal notion of time did not advance (because the CPU was
not running) while the RTC did advance (because qemu is still exposing
RTC relative to the host's UTC), so the management will tell the guest
to resync time as part of resuming.
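
For reference, the two modes might look like this on the guest agent
wire.  This sketch assumes the agent command is named guest-set-time and
takes an optional "time" argument in nanoseconds since the epoch; check
the guest agent documentation before relying on either detail.  The
helper name is hypothetical:

```python
import json
from typing import Optional


def guest_set_time_command(utc_ns: Optional[int] = None) -> str:
    """Build the JSON for a guest agent set-time request.

    utc_ns given: mode 1, tell the guest which UTC time to use.
    utc_ns None:  mode 2, ask the guest to re-read the RTC.
    """
    cmd = {"execute": "guest-set-time"}
    if utc_ns is not None:
        cmd["arguments"] = {"time": utc_ns}
    return json.dumps(cmd)


# Mode 1: explicit UTC time (here, an arbitrary example timestamp).
print(guest_set_time_command(1400846400 * 10**9))

# Mode 2: no argument, the guest resyncs from its RTC.
print(guest_set_time_command())
```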

Now, what good does the RTC change event do, if libvirt really only
needs to query the current RTC offset at the time the guest shuts down?  Given
that the guest-agent command deals in UTC (when setting time), does the
management app need to know when the guest changes RTC due to daylight
savings?  Or, in the case where management tells the guest to reset its
own time by re-reading RTC, does it matter for management to know whether
that RTC is not tracking UTC?

Also, it seems like most people want their guests to run with a proper
notion of current time, even over gaps of time where the guest is not
running (true if the guest is connected to a network and expects to do
anything where being in sync with other computers is important).  But is
there ever someone that wants a guest to run as though it were seeing
all possible time values, and thus where every time the guest is paused
and then resumed, its offset from UTC is increased by the amount of
downtime that elapsed (most likely, for a standalone guest with no
emulated network connections)?  Do we need to do anything different for
a management app trying to run a guest in such a mode?

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org