Question about migrating a VM with a host-passthrough CPU

Hi there,

Sorry to bother you, but could I ask a question about live migration of a virtual machine configured with a host-passthrough CPU? It confuses me a lot.

I have two hosts with different CPUs, and the CPU feature set of the old host is a subset of the new one's (according to virsh cpu-compare). If the VM was first started on the old host, is it safe to migrate it back and forth between the two hosts while it keeps running?

I've tested this case in my environment: the migration succeeds, and the CPU family/model/stepping and features reported by lscpu in the guest OS stay the same.

--
Best Regards,
Guoyi Tu
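For illustration, the subset relationship described in the question can also be checked by hand from the "flags" line of /proc/cpuinfo on each host. This is only a minimal sketch and not a replacement for virsh cpu-compare: the helper names parse_flags and missing_on_destination and the sample flag lists are hypothetical, and a matching feature set on its own does not show that migration is safe.

```python
def parse_flags(cpuinfo_text):
    """Return the set of CPU flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_on_destination(source_flags, destination_flags):
    """Flags present on the source host that the destination host lacks."""
    return sorted(source_flags - destination_flags)


# Hypothetical, heavily shortened samples; real flag lists are much longer.
OLD_HOST = "model name\t: Old CPU\nflags\t\t: fpu sse sse2 avx\n"
NEW_HOST = "model name\t: New CPU\nflags\t\t: fpu sse sse2 avx avx2 bmi2\n"

old_flags = parse_flags(OLD_HOST)
new_flags = parse_flags(NEW_HOST)

print("old -> new, missing on destination:", missing_on_destination(old_flags, new_flags))
print("new -> old, missing on destination:", missing_on_destination(new_flags, old_flags))
```

With these sample values nothing is missing in the old-to-new direction, while the new-to-old direction lacks avx2 and bmi2, which is the direction that matters when migrating back and forth.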

On Wed, Feb 24, 2021 at 09:42:37PM +0800, Guoyi Tu wrote:
> Hi there,
>
> Sorry to bother you, but could I ask a question about live migration of a virtual machine configured with a host-passthrough CPU? It confuses me a lot.
>
> I have two hosts with different CPUs, and the CPU feature set of the old host is a subset of the new one's (according to virsh cpu-compare). If the VM was first started on the old host, is it safe to migrate it back and forth between the two hosts while it keeps running?

The CPUID features are the biggest compatibility issue and the one most likely to cause problems in general, but I believe there can be others. For example, if performance counters are exposed to the guest, these are likely to vary between different CPU models. Even different BIOS settings can change the performance counters exposed by a CPU (HT vs no-HT). There might also be problems with CPU TSC frequency differences between the hosts.

> I've tested this case in my environment: the migration succeeds, and the CPU family/model/stepping and features reported by lscpu in the guest OS stay the same.

Right, the issue is that the migration appears to succeed from QEMU's POV, but the guest may nonetheless see incompatibilities which result in a crash or incorrect behaviour an arbitrary amount of time later. It is really hard to guarantee that behaviour is going to be correct in all scenarios.

Personally I would consider the use of host-passthrough to be unsupportable in a production environment, unless the CPUs are all homogeneous.

Regards,
Daniel
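As a rough aid for auditing which guests are affected, the CPU mode of a domain can be read from its libvirt XML (for example, the output of virsh dumpxml). The sketch below assumes the XML is already available as a string; cpu_mode is a hypothetical helper and the sample XML is trimmed down to the relevant element.

```python
import xml.etree.ElementTree as ET


def cpu_mode(domain_xml):
    """Return the 'mode' attribute of the domain's <cpu> element, or None if absent."""
    root = ET.fromstring(domain_xml)
    cpu = root.find("cpu")
    return cpu.get("mode") if cpu is not None else None


# Hypothetical, trimmed domain XML; a real dump contains many more elements.
SAMPLE_XML = """
<domain type='kvm'>
  <name>demo</name>
  <cpu mode='host-passthrough'/>
</domain>
"""

mode = cpu_mode(SAMPLE_XML)
if mode == "host-passthrough":
    print("host-passthrough in use: only migrate between homogeneous hosts")
else:
    print("cpu mode:", mode)
```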

Thanks for your reply.

I checked the QEMU code: the destination QEMU instance actually exposes the CPU features of the new host, and if the cpuid instruction is executed in the guest OS after migration, the family/model/stepping is the same as the new host's.

On Thu, 2021-02-25 at 14:10 +0000, Daniel P. Berrangé wrote:
> On Wed, Feb 24, 2021 at 09:42:37PM +0800, Guoyi Tu wrote:
> [...]
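To compare what the guest sees before and after a migration, the family/model/stepping can be decoded from the EAX value returned by CPUID leaf 1. The sketch below follows the decoding rule documented by Intel (extended model bits apply when the base family is 6 or 15); decode_cpuid_signature is a hypothetical helper, and the two EAX values are illustrative samples rather than values taken from the hosts in this thread.

```python
def decode_cpuid_signature(eax):
    """Decode (family, model, stepping) from the EAX value of CPUID leaf 1."""
    stepping = eax & 0xF
    base_model = (eax >> 4) & 0xF
    base_family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    # The extended family is only added when the base family field is 0xF.
    family = base_family + ext_family if base_family == 0xF else base_family

    # Assumption: Intel's rule, extended model applies for base family 6 or 15.
    if base_family in (0x6, 0xF):
        model = (ext_model << 4) | base_model
    else:
        model = base_model
    return family, model, stepping


# Illustrative signatures as a guest might read them before and after migration.
before = decode_cpuid_signature(0x000306F2)
after = decode_cpuid_signature(0x00050654)

print("before migration (family, model, stepping):", before)
print("after migration  (family, model, stepping):", after)
if before != after:
    print("CPU signature changed under the running guest")
```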

[Catching up with libvirt-users list after a long while ... hence the delay in my response.]

On Fri, Feb 26, 2021 at 03:58:21PM +0800, Guoyi Tu wrote:
> Thanks for your reply.
>
> I checked the QEMU code: the destination QEMU instance actually exposes the CPU features of the new host, and if the cpuid instruction is executed in the guest OS after migration, the family/model/stepping is the same as the new host's.

Just to reinforce Daniel's point from earlier in the thread: 'host-passthrough' and live migration is tricky, so the safest suggestion is to go with identical physical CPUs on the source and destination hosts, along with identical kernel _and_ CPU microcode versions (see Daniel's point about performance counters; and also thanks to Meltdown/Spectre). Otherwise live migration is very likely to fail.

> On Thu, 2021-02-25 at 14:10 +0000, Daniel P. Berrangé wrote:
> > On Wed, Feb 24, 2021 at 09:42:37PM +0800, Guoyi Tu wrote:
> > [...]

-- /kashyap
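The advice above amounts to a pre-migration check that both hosts report the same CPU model, microcode revision and kernel release. A minimal sketch follows, assuming the facts are collected from /proc/cpuinfo and uname -r on each host; the helper names host_facts and mismatches and all sample values are hypothetical, and passing such a check does not by itself guarantee a safe migration.

```python
def host_facts(cpuinfo_text, kernel_release):
    """Collect CPU model name, microcode revision and kernel release for one host."""
    facts = {"kernel": kernel_release}
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        # Keep only the first occurrence; /proc/cpuinfo repeats fields per CPU.
        if sep and key in ("model name", "microcode") and key not in facts:
            facts[key] = value.strip()
    return facts


def mismatches(source, destination):
    """Return the sorted names of facts that differ between the two hosts."""
    keys = sorted(set(source) | set(destination))
    return [k for k in keys if source.get(k) != destination.get(k)]


# Hypothetical sample data standing in for two real hosts.
SRC_CPUINFO = "model name\t: Example CPU A\nmicrocode\t: 0x49\n"
DST_CPUINFO = "model name\t: Example CPU A\nmicrocode\t: 0x4d\n"

src = host_facts(SRC_CPUINFO, "5.10.0-example")
dst = host_facts(DST_CPUINFO, "5.10.0-example")

diff = mismatches(src, dst)
if diff:
    print("hosts differ in:", ", ".join(diff))
else:
    print("hosts match on CPU model, microcode and kernel")
```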
Participants (3): Daniel P. Berrangé, Guoyi Tu, Kashyap Chamarthy