Re: [libvirt] [PATCHv2 3/4] qemu: fix RTC_CHANGE event for <clock offset='variable' basis='utc'/>

23 May 2014

On 05/23/2014 06:50 AM, Marcelo Tosatti wrote:
> On Thu, May 22, 2014 at 01:33:14PM -0600, Eric Blake wrote:
>> [Adding qemu]
>>
>> On 05/22/2014 05:07 AM, Laine Stump wrote:
>>> commit e31b5cf393857 attempted to fix libvirt's
>>> VIR_DOMAIN_EVENT_ID_RTC_CHANGE, which is documentated to always
>> s/documentated/documented/
>>
>>> provide the new offset of the domain's real time clock from UTC. The
>>> problem was that, in the case that qemu is provided with an "-rtc
>>> base=x" where x is an absolute time (rather than "utc" or
>>> "localtime"), the offset sent by qemu's RTC_CHANGE event is *not* the
>>> new offset from UTC, but rather is the sum of all changes to the
>>> domain's RTC since it was started with base=x.
>>>
>>> So, despite what was said in commit e31b5cf393857, if we assume that
>>> the original value stored in "adjustment" was the offset from UTC at
>>> the time the domain was started, we can always determine the current
>>> offset from UTC by simply adding the most recent (i.e. current) offset
>>> from qemu to that original adjustment.
>> Is this true even if we miss an RTC update event from qemu?  I'm worried
>> about the following situation:
>>
>> user prepares to do a libvirtd upgrade, so libvirtd is shut down. 
> If the adjustment0 field is updated from adjustment, via a libvirtd shutdown, the current
> patch will also break, I believe. Not sure if that's possible, though.

No, adjustment0 is only set at the time a new qemu process is started.

>
>> Then the guest triggers an RTC update, so qemu sends an event, but the
>> event is lost. Then libvirtd starts again, and doesn't realize the
>> event is lost.

That case would only be a problem until the *next* time an RTC update is
sent; at that time the adjustment would be readjusted to adjustment0 +
new offset (and that new offset is the cumulative sum of all adjustments
since the domain was started).
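
To make the arithmetic concrete, here is a rough sketch in Python (not
libvirt code; the class and method names are made up for illustration, and
all values are in seconds). It assumes qemu's event carries the cumulative
offset since the domain was started, as described above:

```python
class RtcTracker:
    """Hypothetical model of the adjustment0 bookkeeping, in seconds."""

    def __init__(self, adjustment0):
        # Offset from UTC at the time the qemu process was started.
        self.adjustment0 = adjustment0
        self.adjustment = adjustment0

    def on_rtc_change(self, cumulative_offset):
        # The event value is assumed to be the sum of all RTC changes
        # since startup, so nothing is accumulated on our side.
        self.adjustment = self.adjustment0 + cumulative_offset
        return self.adjustment


tracker = RtcTracker(adjustment0=3600)
tracker.on_rtc_change(10)       # first change is seen: 3600 + 10

# A second change (cumulative 25) happens while libvirtd is down, so the
# event is lost and the stored value stays stale for now.
stale = tracker.adjustment

tracker.on_rtc_change(40)       # next event carries the full total again
```

The stored value is wrong only between the lost event and the next one;
because each event is added to adjustment0 rather than to the previous
adjustment, the error does not carry forward.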

> Yes, but that case is also true for any other QMP asynchronous event,
> and therefore should be handled generically I suppose (QMP channel data
> should be maintained across libvirtd shutdown). Luiz?
>
>> Do we need more help from qemu, such as a new field to an existing QMP
>> command (or a new QMP command) that lists the cumulative offset that
>> qemu is using, where we call that query command any time after an RTC
>> update event or after a libvirtd restart? I'm wondering if this is more
>> a bug in qemu for not providing the right information rather than
>> libvirt's responsibility to work around it.  If the only way to keep
>> accurate information is to sum the values we get from events, we are at
>> risk of a lost event getting us messed up.
> Good point, unsure whether it's specific to this command, though. Could
> add a new QMP command to query the RTC offset, yes.
>
>>> This patch accomplishes that by storing the initial adjustment in the
>>> domain's status as "adjustment0". Each time a new RTC_CHANGE event is
>>> received from qemu, we simply add adjustment0 to the value sent by
>>> qemu, store that as the new adjustment, and forward that value on to
>>> any event handler.
>>>
>>> This patch (*not* e31b5cf393857, which should be reverted prior to
>>> applying this patch) fixes:
>>>
>>> https://bugzilla.redhat.com/show_bug.cgi?id=964177
>>>
>>> (for the case where basis='utc'. It does not fix basis='localtime')
>>> ---
>>>
>>> Changes from V1: remove all attempts to fix basis='localtime' in favor
>>> of fixing it in a simpler and better manner in a separate patch.
>> I'd also appreciate it if the qemu developers can chime in on what is
>> supposed to happen for localtime guests.
>>
>> There are at least four combinations:
>>
>> host running on UTC bios time, guest running on UTC time (in my opinion,
>> the only sane setting, but we're talking about reality not sanity)
> 1) Why does Windows keep your BIOS clock on local time?
> http://blogs.msdn.com/b/oldnewthing/archive/2004/09/02/224672.aspx
>
> 2) Windows with RTC UTC
> https://wiki.archlinux.org/index.php/Time#UTC_in_Windows
>
>> host running on UTC, guest running on localtime (perhaps the guest is
>> windows, and we know that windows prefers to run on localtime)
> It's the default, that's all. See 1) above.
>
>> host running on localtime bios (perhaps because it is dual-boot with
>> windows, and windows prefers bios in localtime), guest running on UTC time
> Can't see why hosts using localtime or UTC is relevant. Assume host is
> synchronized to UTC via NTP (so you can use UTC or convert to localtime
> if desired).

Correct - the host's use of its own RTC is irrelevant. The only
potential difference is basis='utc' vs. basis='localtime', and anyone
using basis='localtime' more or less deserves the confusion it causes
:-P (That really is a "tongue in cheek" emoticon - I'm only half serious).

*However*, this discussion forced me to investigate some of the basic
assumptions that I'd been making when coming in to fix this bug. In
particular, my assumption was that the value of "adjustment" that was
set in the status would be preserved across a domain save/restore
operation, or a migration, but after talking to jdenemar and looking at
the code, I believe that this is *not* the case. As I understand it, one
of the driving factors behind having offset='variable' was that
changes to the RTC would be properly maintained across save/restore and
migrations, and thus I ASSumed that the setting *was* being maintained,
but that the math was wrong. As far as I can tell, though, only the
*inactive* XML is saved in a save image or sent in a migration, while
the modified adjustment is only in the transient/status XML; the result
is that the modified adjustment (and modified basis per patch 4/4) are
lost any time there is a save/restore or migration.

So while these patches are correcting the math, I guess any claims that
I make in the commit logs about them fixing problems with migration or
save/restore are ill-informed, and should be removed (and we should
file/track a separate bug about that issue).

>> host running on localtime, guest running on localtime
>>
>> But it gets even more complicated.  The host localtime need not be
>> consistent with the guest localtime.  That is, I could be a cloud
>> provider with servers on the east coast, and renting out processor time
>> to a client on the west coast that wants their guest tied to west coast
>> localtime.  And that's assuming that both host and guest switch in and
>> out of daylight savings at the same time, which falls apart when you
>> cross political boundaries.  Then there's the fun of migration (what if
>> my server farm is spread across multiple timezones - does migration take
>> into account the difference in localtime between source and destination
>> servers).

I'm pretty sure that Daniel's suggestion (which I've implemented in
patch 4) removes all of the problems *that don't involve migration or
save/restore* to the extent that they can be removed. (The patches *do*
set the adjustment/basis in the status such that if they *were* preserved
across migrate or save/restore, behavior would then be correct.) There
is still a problem if you don't know which timezone the initial host
will be in, but want your guest to have its RTC set to the localtime of
a particular timezone. I suppose for that we would need to expand
"basis='utc|localtime'" to allow specifying a particular timezone, then
learn the time of that timezone when qemu is started. That's beyond the
scope of this "bugfix" patch series though.

>>
>> I can _totally_ understand the desire to run a GUEST in such a way that
>> the guest thinks it has a bios stored in localtime (and when the guest
>> updates the RTC twice a year to account for daylight savings, it changes
>> what offset we track about the guest).  But I think it is INSANITY to
>> ever try and run a host on a localtime system (daylight savings changes
>> in the host are just asking for problems to the guests) - so even if the
>> host is tied to localtime bios, it is still probably wiser for qemu to
>> base its offsets to UTC no matter what.  If the commandline allows a
>> specification of a localtime offset, I think it should be used ONLY for
>> a one-time up-front conversion into a corresponding UTC offset, and then
>> execute qemu in relation to utc thereafter (therefore, migration is
>> always done in terms of utc, without regards for whether source and
>> destination have a different localtime).

Yep :-)
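
For reference, a rough sketch of that one-time conversion, in Python for
brevity. This is an illustration of the idea only, not the code in patch 4;
the function name and the host_utc_offset parameter are hypothetical, and
all values are in seconds:

```python
def normalize_to_utc(basis, adjustment, host_utc_offset):
    """Fold a localtime basis into a UTC-relative adjustment, once.

    host_utc_offset is the starting host's local offset from UTC at the
    moment the guest is started (an assumed input for this sketch).
    """
    if basis == "localtime":
        # After this point the guest is tracked relative to UTC only.
        return "utc", adjustment + host_utc_offset
    if basis == "utc":
        return "utc", adjustment
    raise ValueError("unknown basis: %r" % (basis,))


# Guest asked for localtime + 120s on a host that is 8 hours behind UTC.
basis, adjustment = normalize_to_utc("localtime", 120, -8 * 3600)
```

Once the result is recorded as basis 'utc', a destination host's own
localtime no longer enters the calculation.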

> Guest side:
> * Emulated RTC CMOS clock is initialized to either UTC or localtime when the 
> guest initializes.
> * Guest reads RTC CMOS clock on boot, or if explicitly requested to
> during runtime, and transfers that value to its system time.
>
> * Migration maintains the emulated RTC CMOS clock value, but a
> subsequent VM restart will use the destination host's localtime,
> in case -rtc base=localtime.
>
> So using 
>
> -rtc base= "localtime_r(time())" (this is a date) on VM creation
>
> and then maintaining that base, along with guest RTC writes, across VM
> restarts, seems much saner than ever using -rtc base=localtime on a cluster
> with different timezones.
>
>

Again, yep :-)
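
As a rough illustration of passing an absolute date instead of
base=localtime, here is a Python sketch that builds the value from the
current UTC time plus a stored UTC-relative adjustment. The function name
is hypothetical, and the date layout assumes the YYYY-MM-DDTHH:MM:SS form
that qemu's -rtc base= option accepts:

```python
from datetime import datetime, timedelta, timezone


def rtc_base_arg(now_utc, adjustment):
    """Return a 'base=<date>' string for the guest RTC.

    now_utc must be a timezone-aware datetime in UTC; adjustment is the
    guest's stored offset from UTC in seconds (assumed inputs).
    """
    guest_rtc = now_utc + timedelta(seconds=adjustment)
    return "base=" + guest_rtc.strftime("%Y-%m-%dT%H:%M:%S")


# Fixed example time so the result does not depend on the host's timezone.
created = datetime(2014, 5, 23, 18, 0, 0, tzinfo=timezone.utc)
arg = rtc_base_arg(created, -7 * 3600)
```

Because the input is always UTC plus a stored adjustment, recomputing the
string on a later restart gives the same guest-visible time regardless of
which host's timezone the guest lands in.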