Thanks a lot for the reply,

> This sounds like a valid use case (not sure it is that useful), but we
> need to be careful. But we should make sure implementing this does not
> break anything. This means, we need to do different things depending on
> the type of the CPU definition we are asked to compare. Guest CPU
> definitions should keep the old behaviour (return IDENTICAL) and host
> CPU definitions can be compared to the host CPU. But when doing so,
> don't get too influenced by x86 code, which I believe is way too
> complicated for ARM. Specifically you don't need to create armCompute
> and mess with guestData and other stuff there as all you want to do is
> compare the two CPU definitions. In x86 the same function is used for
> several things, but that's not the case for ARM.

For this, what I have in mind is that we check for `VIR_CPU_MODE_HOST_PASSTHROUGH`,
and if that is the mode, we compare CPU vendors and models and let only identical
definitions pass, similar to the implementation at
https://gitlab.com/libvirt/libvirt/-/blob/master/src/cpu/cpu_ppc64.c#L505
The reason is that when a VM is in host-passthrough mode, its CPU XML reflects the
original host CPU definition, so we are actually comparing the source host and
destination host CPU definitions. I think this fits your idea too. In all other
cases we just return IDENTICAL, as we do now.
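
To make the idea concrete, here is a rough standalone sketch of the check I have in mind. It is not libvirt code: every `sketch*` type, enum and function below is a hypothetical stand-in I made up for illustration (the real structures are in libvirt's CPU configuration code and look different), and treating two missing vendors as equal is my own assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for libvirt's internal types, for illustration only. */
typedef enum {
    SKETCH_CPU_MODE_CUSTOM,
    SKETCH_CPU_MODE_HOST_MODEL,
    SKETCH_CPU_MODE_HOST_PASSTHROUGH,
} sketchCPUMode;

typedef enum {
    SKETCH_CPU_COMPARE_INCOMPATIBLE,
    SKETCH_CPU_COMPARE_IDENTICAL,
} sketchCPUCompareResult;

typedef struct {
    sketchCPUMode mode;
    const char *vendor; /* may be NULL when the definition has no vendor */
    const char *model;  /* may be NULL when the definition has no model */
} sketchCPUDef;

/* NULL-safe string equality: two NULLs are equal (an assumption),
 * NULL never equals a non-NULL string. */
static int
sketchStrEq(const char *a, const char *b)
{
    if (a == NULL || b == NULL)
        return a == b;
    return strcmp(a, b) == 0;
}

/* Compare the definition we were asked about against the host CPU.
 * Only host-passthrough definitions are actually checked; everything
 * else keeps the old behaviour and is reported as identical. */
static sketchCPUCompareResult
sketchArmCompare(const sketchCPUDef *host, const sketchCPUDef *cpu)
{
    assert(host != NULL && cpu != NULL);

    if (cpu->mode != SKETCH_CPU_MODE_HOST_PASSTHROUGH)
        return SKETCH_CPU_COMPARE_IDENTICAL;

    if (!sketchStrEq(host->vendor, cpu->vendor) ||
        !sketchStrEq(host->model, cpu->model))
        return SKETCH_CPU_COMPARE_INCOMPATIBLE;

    return SKETCH_CPU_COMPARE_IDENTICAL;
}
```

Note that there is nothing like the x86 compute step here, only a direct comparison of the two definitions, which I think matches what you suggested.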

What do you think?

Thanks a lot.

Zhenyu Zheng


On Fri, Sep 4, 2020 at 8:00 PM Jiri Denemark <jdenemar@redhat.com> wrote:
On Thu, Sep 03, 2020 at 19:50:13 +0800, Zhenyu Zheng wrote:
> Hi,
>
> Thanks a lot for the review and feedback. As for host-passthrough cases, I
> have a somewhat different understanding. If I understand correctly, what
> you mean is that if a VM uses host-passthrough, it can migrate to any other
> host, since it asks for host-passthrough. From my point of view, in real
> deployments there are a few different kinds of ARM datacenter CPUs on the
> market, and their CPU capabilities are different, so one might create a VM
> on hostA with feature1 and feature2 because it uses host-passthrough and
> hostA has these features. Now, with your definition of host-passthrough (if
> I understand correctly) and the current code (which returns identical
> directly), this VM can be migrated to hostB with only feature1, since there
> are no limitations. If one has some important application that depends on
> feature2, the app will break as feature2 is not available on hostB.

And that's why migrating a domain with host-passthrough CPU is generally
not supported. It should work if you have two identical hosts in terms
of CPU, microcode (if applicable to ARM), system settings (not sure
about ARM, but x86 lets you hide some CPU features), kernel and its
command line arguments, kvm modules and their options, and QEMU. If any
of this is different the guest CPU may change in an unexpected way
during migration and there's no way libvirt could know about it or
prevent it in general.

Even if we made the change and compared CPU features, the CPUs may have
other features which libvirt does not know about or the CPUs could even
differ in other aspects which libvirt does not cover in the CPU modeling
code.

But thinking about this and your patch, I guess I know what you are
trying to do. You're not trying to compare the CPU definition from a
domain XML (which was the primary use case for CPU comparison APIs), but
you want to compare the host CPU definition from one host to the host
CPU on the other host to see whether the two hosts might be compatible.

This sounds like a valid use case (not sure it is that useful), but we
need to be careful. But we should make sure implementing this does not
break anything. This means, we need to do different things depending on
the type of the CPU definition we are asked to compare. Guest CPU
definitions should keep the old behaviour (return IDENTICAL) and host
CPU definitions can be compared to the host CPU. But when doing so,
don't get too influenced by x86 code, which I believe is way too
complicated for ARM. Specifically you don't need to create armCompute
and mess with guestData and other stuff there as all you want to do is
compare the two CPU definitions. In x86 the same function is used for
several things, but that's not the case for ARM.

> Considering this, I proposed to add basic checks to compare CPU to
> limit the migration to only the same CPU models. And once the
> capability of ARM driver is enhanced in QEMU or other related
> projects, we can make the compare API better.

Sure, but that would be comparing guest CPU definition with the CPU we
get from QEMU rather than the one we detect ourselves.

Jirka