On Wed, Jan 04, 2017 at 05:59:17PM -0200, Eduardo Habkost wrote:
On Wed, Jan 04, 2017 at 11:39:16AM -0200, Eduardo Habkost wrote:
> On Wed, Jan 04, 2017 at 09:56:56AM -0200, Marcelo Tosatti wrote:
> > On Tue, Dec 27, 2016 at 05:21:20PM -0200, Eduardo Habkost wrote:
> > > Instead of blocking migration on the source when invtsc is
> > > enabled, rely on the migration destination to ensure there's no
> > > TSC frequency mismatch.
> > >
> > > We can't allow migration unconditionally because we don't know if
> > > the destination is a QEMU version that is really going to ensure
> > > there's no TSC frequency mismatch. To ensure we are migrating to
> > > a destination that won't ignore SET_TSC_KHZ errors, allow invtsc
> > > migration only on pc-*-2.9 and newer.
> > >
> > > Signed-off-by: Eduardo Habkost <ehabkost(a)redhat.com>
[...]
> > > @@ -2655,12 +2656,14 @@ int kvm_arch_put_registers(CPUState *cpu, int
level)
> > > }
> > >
> > > if (level == KVM_PUT_FULL_STATE) {
> > > - /* We don't check for kvm_arch_set_tsc_khz() errors here,
> > > - * because TSC frequency mismatch shouldn't abort migration,
> > > - * unless the user explicitly asked for a more strict TSC
> > > - * setting (e.g. using an explicit "tsc-freq" option).
> > > + /* Migration TSC frequency mismatch is fatal only if we are
> > > + * actually reporting Invariant TSC to the guest.
> > > */
> > > - kvm_arch_set_tsc_khz(cpu);
> > > + ret = kvm_arch_set_tsc_khz(cpu);
> > > + if ((x86_cpu->env.features[FEAT_8000_0007_EDX] &
CPUID_APM_INVTSC) &&
> > > + ret < 0) {
> > > + return ret;
> > > + }
> > > }
> >
> > Will the guest continue in the source in this case?
> >
> > I think this is past the point where migration has been declared
> > successful.
> >
> > Otherwise looks good.
>
> Good point. I will make additional tests and see if there's some
> other place where the kvm_arch_set_tsc_khz() call can be moved
> to.
So, if we solve this and do something on (for example) post_load,
we still have a problem: device state is migrated after RAM. This
means QEMU will check for TSC scaling and abort migration very
late.
We could solve that by manually registering a SaveVMHandler that
will send the TSC frequency on save_live_setup, so migration is
aborted earlier.
But: this sounds like just a complex hack to work around the real
problems:
1) TSC frequency is guest-visible, and anything that affects
guest ABI should depend on the VM configuration, not on host
capabilities;
Well not really: the TSC frequency where the guest starts
is the frequency the guest software expects.
So it does depend on host capabilities.
2) Setting TSC frequency depending on the host will make
migratability unpredictable for management software: the same
VM config could be migratable to host A when started on host
B, but not migratable to host A when started on host C.
Well, just check the frequency.
I suggest we allow migration with invtsc if and only if
tsc-frequency is set explicitly by management software. In other
words, apply only patches 1/4 and 2/4 from this series. After
that, we will need libvirt support for configuring tsc-frequency.
I don't like that for the following reasons:
* It moves low level complexity from QEMU/KVM to libvirt
(libvirt has to call KVM_GET_TSC_KHZ from a vcpu thread).
* It makes it difficult to run QEMU manually (i use QEMU
manually all the time).
* It requires patching libvirt.
In my experience things work better when the functionality is
not split across libvirt/qemu.
Can't this be fixed in QEMU? Just check that destination host supports
TSC scaling before migration (or that KVM_GET_TSC_KHZ return value
matches on source and destination).