On Fri, Jul 10, 2020 at 07:48:26 -0400, Mark Mielke wrote:
On Fri, Jul 10, 2020 at 7:14 AM Jiri Denemark <jdenemar(a)redhat.com> wrote:
> On Sun, Jul 05, 2020 at 12:45:55 -0400, Mark Mielke wrote:
> > With 6.4.0, live migration was working fine with Qemu 5.0. After trying out
> > 6.5.0, migration broke with the following error:
> >
> > libvirt.libvirtError: internal error: unable to execute QEMU command
> > 'migrate': State blocked by non-migratable CPU device (invtsc flag)
>
> Could you please describe the reproducer steps? For example, was the
> domain you're trying to migrate already running when you upgrade libvirt
> or is it freshly started by the new libvirt?
>
The original case was:
1) Machine X running libvirt 6.4.0 + qemu 5.0
2) Machine Y running libvirt 6.5.0 + qemu 5.0
3) Live migration from X to Y works. Guest appears fine.
4) Upgrade Machine X from libvirt 6.4.0 to 6.5.0 and reboot.
5) Live migration from Y to X fails with the message shown.

Oh I see, so I guess the bad default is chosen during the incoming
migration to machine Y. I'll try to reproduce myself to see what's going
on.

In each case, live migration was done with OpenStack Train directing
libvirt + qemu.
> And it would be helpful to see the <cpu> element as shown by virsh
> dumpxml before you try to start the domain as well as the QEMU command
> line libvirt used to start the domain (in
> /var/log/libvirt/qemu/$VM.log).
>
The <cpu> element looks like this:
  <cpu mode='host-passthrough' check='none'>
    <topology sockets='1' dies='1' cores='4' threads='2'/>
  </cpu>
The QEMU command line is very long, and includes details I would avoid
publishing publicly unless you need them. The "-cpu" portion is just:
-cpu host
The QEMU command line itself is generated from libvirt, which is directed
by OpenStack Train.

These are from machine X before step 3, right? Can you also share the
same from machine Y before step 5?

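
For comparing the two hosts, here is a minimal sketch (not from this thread) that pulls the relevant attributes out of `virsh dumpxml` output. It assumes the `migratable` attribute on `<cpu>`, which, as far as I understand, libvirt 6.5.0 introduced for host-passthrough; the function name `cpu_summary` and the sample XML are made up for illustration.

```python
import xml.etree.ElementTree as ET


def cpu_summary(domain_xml: str) -> dict:
    """Return the mode, check and migratable attributes of the <cpu> element.

    Accepts either a full <domain> document or a bare <cpu> element.
    """
    root = ET.fromstring(domain_xml.strip())
    cpu = root if root.tag == "cpu" else root.find("cpu")
    if cpu is None:
        raise ValueError("no <cpu> element found")
    return {
        "mode": cpu.get("mode"),
        "check": cpu.get("check"),
        # None means the attribute is absent, so libvirt picks a default.
        "migratable": cpu.get("migratable"),
    }


# Hypothetical sample modelled on the <cpu> element quoted above.
SAMPLE = """
<domain type='kvm'>
  <cpu mode='host-passthrough' check='none'>
    <topology sockets='1' dies='1' cores='4' threads='2'/>
  </cpu>
</domain>
"""

if __name__ == "__main__":
    print(cpu_summary(SAMPLE))
```

Running this against the dumpxml output from both machine X and machine Y would show whether `migratable` is explicit on one side and absent on the other.
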
I wasn't sure what QEMU_CAPS_CPU_MIGRATABLE represents. I initially
suspected what you are saying, but since it apparently did not work the way
I expected, I then presumed it does not work the way I expected. :-)
Is QEMU_CAPS_CPU_MIGRATABLE only from the <cpu> element? If so, doesn't
this mean that it is not explicitly listed for host-passthrough, and this
means the check is not detecting whether it is enabled or not properly?

QEMU_CAPS_CPU_MIGRATABLE comes from QEMU capability probing.
Specifically, the capability is enabled when a given QEMU binary reports
a 'migratable' property for the CPU object. And the capability detection
tests show we should be detecting this capability properly:

tests/qemucapabilitiesdata $ git grep cpu.migratable
caps_2.12.0.x86_64.xml: <flag name='cpu.migratable'/>
caps_3.0.0.x86_64.xml: <flag name='cpu.migratable'/>
caps_3.1.0.x86_64.xml: <flag name='cpu.migratable'/>
caps_4.0.0.x86_64.xml: <flag name='cpu.migratable'/>
caps_4.1.0.x86_64.xml: <flag name='cpu.migratable'/>
caps_4.2.0.x86_64.xml: <flag name='cpu.migratable'/>
caps_5.0.0.x86_64.xml: <flag name='cpu.migratable'/>
caps_5.1.0.x86_64.xml: <flag name='cpu.migratable'/>
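
The same check as the `git grep` above can be sketched in Python for a single capabilities file. This is only an illustration: the helper name `has_flag` is hypothetical, and the sample document is a heavily abbreviated stand-in that assumes the `<flag name='...'/>` layout visible in the grep output.

```python
import xml.etree.ElementTree as ET


def has_flag(caps_xml: str, flag: str = "cpu.migratable") -> bool:
    """Return True if the capabilities XML contains <flag name=FLAG/>."""
    root = ET.fromstring(caps_xml.strip())
    # Look at every <flag> element anywhere below the root.
    return any(el.get("name") == flag for el in root.iter("flag"))


# Abbreviated, made-up sample; real caps files contain many more elements.
SAMPLE_CAPS = """
<qemuCaps>
  <flag name='kvm'/>
  <flag name='cpu.migratable'/>
</qemuCaps>
"""

if __name__ == "__main__":
    print(has_flag(SAMPLE_CAPS))
```
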
I think it can go either way. There is also convention over configuration
as a competing principle. However, I also prefer explicit. Just, it needs
to be correct, otherwise explicit can be very bad, as it seems in my case. :-)

Of course, the explicit default must match the implicit one.

Jirka