
On Mon, Jul 13, 2020 at 14:04:25 +0200, Jiri Denemark wrote:
On Sat, Jul 11, 2020 at 13:44:19 -0400, Mark Mielke wrote:
On Sat, Jul 11, 2020 at 6:04 AM Mark Mielke <mark.mielke@gmail.com> wrote:
On Fri, Jul 10, 2020 at 7:48 AM Mark Mielke <mark.mielke@gmail.com> wrote:
On Fri, Jul 10, 2020 at 7:14 AM Jiri Denemark <jdenemar@redhat.com> wrote:
The implementation seems to be doing exactly what the commit message
says. The migratable=off default should be used only when QEMU does not
support -cpu host,migratable=on|off, that is only when QEMU is very old. Every non-ancient version of libvirt should have the QEMU_CAPS_CPU_MIGRATABLE set and thus this code should choose migratable=on default.
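The default selection described above can be sketched in a few lines. This is only an illustration in Python, not libvirt's C implementation; the function name `default_migratable` and the use of a plain set of capability names are assumptions made for the example.

```python
# Capability name as used in the thread; everything else here is hypothetical.
CPU_MIGRATABLE = "QEMU_CAPS_CPU_MIGRATABLE"


def default_migratable(qemu_caps):
    """Pick the default for the migratable attribute of a host-passthrough CPU.

    If QEMU is known to support -cpu host,migratable=on|off, default to "on".
    Only a QEMU without that capability falls back to "off".
    """
    return "on" if CPU_MIGRATABLE in qemu_caps else "off"


print(default_migratable({CPU_MIGRATABLE}))  # capability present
print(default_migratable(set()))             # capability absent
```

The point of the sketch is that the result depends entirely on whether the capability flag is present in the set that gets consulted, so a stale set gives the "very old QEMU" answer even with a recent QEMU.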
QEMU_CAPS_CPU_MIGRATABLE only from the <cpu> element? If so, doesn't this mean that it is not explicitly listed for host-passthrough, and this means the check is not detecting whether it is enabled or not properly?
Trying to understand what is going on more - I see "migratable" seems to be ok when launching a new machine, but the failure scenario was live migration from 6.4.0 to 6.5.0.
Is this because the QEMU_CAPS_CPU_MIGRATABLE was not filled in for 6.4.0, and live migration grabs the capabilities from the source, where the absence of this capability makes it presume an older Qemu in the above code?
Sorry all - I am having trouble reproducing now. The expected use cases are now working.
Is it possible that the "migratable" flag might have been missing on some of the instances, although migration worked fine, and despite having used Qemu 4.2 and Qemu 5.0?
When an updated libvirtd which knows about this new capability starts, it would reprobe all QEMU capabilities (lazily, i.e., once they are needed). However, if there is a running domain, libvirt will use the cached capabilities probed when the domain was started. I suspect migrating such a domain could be a problem. I'll try to reproduce locally.
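A toy model of the situation suspected here, assuming the behaviour is as described: the daemon reprobes lazily, while a running domain keeps the capability set probed when it was started. The classes `Daemon` and `Domain` and their methods are hypothetical and exist only for this sketch; they do not mirror libvirt's data structures.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

CPU_MIGRATABLE = "QEMU_CAPS_CPU_MIGRATABLE"  # capability name from the thread


@dataclass
class Domain:
    name: str
    cached_caps: FrozenSet[str]  # capabilities probed when the domain started


@dataclass
class Daemon:
    probeable_caps: FrozenSet[str]  # what this daemon version knows how to probe
    _probed: Optional[FrozenSet[str]] = None

    def qemu_caps(self):
        # Lazy reprobe: capabilities are only probed once they are needed.
        if self._probed is None:
            self._probed = self.probeable_caps
        return self._probed

    def start(self, name):
        # A new domain records the capabilities current at start time.
        return Domain(name, self.qemu_caps())

    def caps_for(self, domain):
        # A running domain is handled with its cached capabilities.
        return domain.cached_caps


old_daemon = Daemon(frozenset())                  # does not know the capability
domain = old_daemon.start("demo")
new_daemon = Daemon(frozenset({CPU_MIGRATABLE}))  # updated daemon

print(CPU_MIGRATABLE in new_daemon.qemu_caps())       # freshly probed
print(CPU_MIGRATABLE in new_daemon.caps_for(domain))  # cached with the domain
```

In this model the two lookups disagree for a domain started before the upgrade, which is the mismatch that could matter during migration.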
OK, I did not reproduce the failure, because migratable=off doesn't enable anything more than migratable=on (likely because the L1 VM in my nested environment does not have any non-migratable features enabled). But I was able to reproduce the issue itself, and the migration could clearly fail if migratable=off enabled some non-migratable features.

The reproducer is actually easy and one doesn't even need to migrate to see libvirt did something wrong:

1. run libvirtd older than 6.5.0
2. start a domain with host-passthrough CPU (QEMU would default to migratable=on)
3. upgrade libvirt to 6.5.0 and restart libvirtd
4. virsh dumpxml $DOMAIN_STARTED_IN_STEP_2

Now you would see

    <cpu mode='host-passthrough' check='none' migratable='off'/>

which differs from the default used by QEMU. Migrating such a domain would succeed anyway, because it was actually started with migratable='on'. But when such a domain is migrated to libvirt 6.5.0, we would honor the migratable attribute and start QEMU with -cpu host,migratable=off, which could cause failures when trying to migrate this domain again.

The problem is exactly where I was afraid it could be. When libvirtd starts, it reads the QEMU capabilities probed by older libvirt (QEMU_CAPS_CPU_MIGRATABLE would be off) and wrongly updates the XML of the running domain.

I'll prepare a patch to fix this.

Jirka
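For anyone wanting to check step 4 of the reproducer on several domains, the attribute can be read from the dumped XML with a short script. This is a minimal sketch using only Python's standard library. The helper name `cpu_migratable` is made up for the example, and the sample document is a trimmed, hypothetical domain XML containing the <cpu> element quoted in the reproducer.

```python
import xml.etree.ElementTree as ET


def cpu_migratable(domain_xml):
    """Return the migratable attribute of the domain's <cpu> element.

    Returns None when there is no <cpu> element or the attribute is not set.
    """
    root = ET.fromstring(domain_xml)
    cpu = root if root.tag == "cpu" else root.find("cpu")
    return None if cpu is None else cpu.get("migratable")


# Trimmed, hypothetical stand-in for the output of virsh dumpxml.
sample = """<domain type='kvm'>
  <name>demo</name>
  <cpu mode='host-passthrough' check='none' migratable='off'/>
</domain>"""

print(cpu_migratable(sample))  # prints: off
```

A value of off for a domain that was started before the upgrade would match the wrongly updated XML described in the reproducer.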