On Fri, Sep 21, 2012 at 11:30:31PM +0300, Dor Laor wrote:
On 09/21/2012 05:51 AM, Marcelo Tosatti wrote:
>On Fri, Sep 21, 2012 at 12:02:46AM +0300, Dor Laor wrote:
>>On 09/12/2012 06:39 PM, Marcelo Tosatti wrote:
>>>
>>>
>>>HW TSC scaling is a feature of AMD processors that allows a
>>>multiplier to be specified to the TSC frequency exposed to the guest.
>>>
>>>KVM also contains provision to trap TSC ("KVM: Infrastructure for
>>>software and hardware based TSC rate scaling" cc578287e3224d0da)
>>>or advance TSC frequency.
>>>
>>>This is useful when migrating to a host with different frequency and
>>>the guest is possibly using direct RDTSC instructions for purposes
>>>other than measuring cycles (that is, it previously calculated
>>>cycles-per-second, and uses that information which is stale after
>>>migration).
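
To make the stale-calibration problem concrete, here is a small
arithmetic sketch in Python. The frequencies and the helper name
apparent_seconds are made up for illustration; nothing here is QEMU
or KVM code.

```python
def apparent_seconds(tsc_delta, calibrated_khz):
    """Elapsed time a guest derives from a TSC delta using its own calibration."""
    return tsc_delta / (calibrated_khz * 1000)

# Hypothetical frequencies: the guest calibrated on a 2.0 GHz source host
# and was then migrated to a 3.0 GHz destination without any TSC scaling.
source_khz = 2_000_000
dest_khz = 3_000_000

# One real second on the destination advances the TSC by dest_khz * 1000 cycles.
tsc_delta = dest_khz * 1000

# The guest still divides by its stale, source-side calibration.
measured = apparent_seconds(tsc_delta, source_khz)
print(measured)  # 1.5
```

Under these assumed numbers the guest believes 1.5 seconds passed for
every real second, which is the kind of error that keeping the guest
TSC at the source frequency avoids.
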
>>>
>>>"qemu-x86: Set tsc_khz in kvm when supported" (e7429073ed1a76518)
>>>added support for tsc_khz= option in QEMU.
>>>
>>>I am proposing the following changes so that management applications
>>>can work with this:
>>>
>>>1) New option for tsc_khz, which is tsc_khz=host (QEMU command line
>>>option). Host means that QEMU is responsible for retrieving the
>>>TSC frequency of the host processor and using it.
>>>Management application does not have to deal with the burden.
>>>
>>>2) New subsection with tsc_khz value. Destination host should consult
>>>supported features of running kernel and fail if feature is unsupported.
>>>
>>>
>>>It is not necessary to use this tsc_khz setting with modern guests
>>>using paravirtual clocks, or when it's known that applications make
>>>proper use of the time interface provided by operating systems.
>>>
>>>On the other hand, legacy applications or setups which require no
>>>modification and correct operation while virtualized and make
>>>use of RDTSC might need this.
>>>
>>>Therefore it appears that this "tsc_khz=auto" option can be
>>>specified
>>>only if the user specifies so (it can be a per-guest flag hidden
>>>in the management configuration/manual).
>>>
>>>Sending this email to gather suggestions (or objections)
>>>to this interface.
>>
>>I'm not sure I understand the exact difference between the offers.
>>We can define these 3 options:
>>
>>1. Qemu/kvm won't make use of tsc scaling feature at all.
>>2. tsc scaling is used and we take the value either from the host or
>> from the live migration data that overrides the latter for incoming.
>> As you've said, it should be passed through a sub section.
>>3. Manual setting of the value (uncommon).
>>
>>Is there another option worth considering?
>>The question is what should be the default. IMHO #2 is more
>>appropriate to serve as a default since we do expect tsc to change
>>between hosts.
>
>Option 1. is more appropriate to serve as a default given that
>modern guests make use of paravirt, as you have observed.
but you also observed that legacy applications that use rdtsc (even
over a pv kernel) will still be affected by the physical tsc
frequency. Since I'm not aware of a downside to using scaling, I'd
rather pick option #2 as the default.

The downside is that, if your destination host does not support tsc
scaling, two possibilities arise:

1) destination tsc frequency > source tsc frequency: TSC trap
2) destination tsc frequency < source tsc frequency: TSC catchup

TSC trapping is not wanted, because it is slow.
This is the downside.

Note that Intel does not support tsc scaling.

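
The two cases above can be written as a small decision function. This
is only a sketch of the reasoning in this mail, not the KVM
implementation: the function name tsc_mode, the hw_scaling flag, and
the returned labels are all hypothetical.

```python
def tsc_mode(source_khz, dest_khz, hw_scaling):
    """Pick how a destination host could keep the guest TSC at source_khz.

    All names and labels are illustrative, not KVM or QEMU identifiers.
    """
    if dest_khz == source_khz:
        return "native"        # frequencies match, nothing to do
    if hw_scaling:
        return "hw-scaling"    # hardware multiplier handles the difference
    if dest_khz > source_khz:
        return "trap"          # host TSC runs too fast: trap RDTSC (slow)
    return "catchup"           # host TSC runs too slow: advance the guest TSC

# Hypothetical migration between a 2.0 GHz and a 3.0 GHz host.
print(tsc_mode(2_000_000, 3_000_000, hw_scaling=False))  # trap
print(tsc_mode(3_000_000, 2_000_000, hw_scaling=False))  # catchup
```

The "trap" result is the case to worry about: it is the one that
makes scaling-by-default costly on hosts without hardware support.
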
>That is, tsc scaling is only required if the guest does direct RDTSC
>on the expectation that the value won't change.
>
>>Cheers,
>>Dor