On Thu, Sep 08, 2022 at 02:24:00PM +0200, Roman Mohr wrote:
Hi,
I have a question regarding capability caching in the context of KubeVirt.
Since KubeVirt starts one libvirt instance per VM, libvirt has to
re-discover the qemu capabilities on every VM start, which leads to a
1-2s+ delay in startup.
We already discover the features in a dedicated KubeVirt pod on each node.
Therefore I tried to copy the capabilities over to see if that would work.
It looks like it could work in general, but libvirt seems to detect a
mismatch in the exposed KVM CPUID in every pod and therefore invalidates
the cache. The recreated capability cache looks exactly like the original
one though ...
The check responsible for the invalidation logs this:
```
Outdated capabilities for '%s': host cpuid changed
```
So the KVM_GET_SUPPORTED_CPUID call seems to return
slightly different values in different containers.
After running the attached Go scripts in different containers, I could
indeed see differences. However, I cannot really judge what the
differences in these CPUID function registers mean, and I am curious
whether someone else knows. The files are attached too (as JSON for easy
diffing).
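For reference, the kind of diff I am after boils down to something like the
sketch below (the JSON field names here are illustrative, not necessarily
what the attached scripts emit):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CPUIDEntry mirrors one entry returned by KVM_GET_SUPPORTED_CPUID.
// Field names are hypothetical; they only need to match the dump format.
type CPUIDEntry struct {
	Function uint32 `json:"function"`
	Index    uint32 `json:"index"`
	EAX      uint32 `json:"eax"`
	EBX      uint32 `json:"ebx"`
	ECX      uint32 `json:"ecx"`
	EDX      uint32 `json:"edx"`
}

type leafKey struct{ function, index uint32 }

// diffCPUID reports, per CPUID leaf, which registers differ between two
// dumps, including the XOR of the two values (i.e. the differing bits).
func diffCPUID(a, b []CPUIDEntry) []string {
	index := func(entries []CPUIDEntry) map[leafKey]CPUIDEntry {
		m := make(map[leafKey]CPUIDEntry)
		for _, e := range entries {
			m[leafKey{e.Function, e.Index}] = e
		}
		return m
	}
	ma, mb := index(a), index(b)

	var diffs []string
	for k, ea := range ma {
		eb, ok := mb[k]
		if !ok {
			diffs = append(diffs, fmt.Sprintf("leaf %#x.%d: missing in second dump",
				k.function, k.index))
			continue
		}
		regs := []struct {
			name   string
			va, vb uint32
		}{
			{"eax", ea.EAX, eb.EAX},
			{"ebx", ea.EBX, eb.EBX},
			{"ecx", ea.ECX, eb.ECX},
			{"edx", ea.EDX, eb.EDX},
		}
		for _, r := range regs {
			if r.va != r.vb {
				diffs = append(diffs, fmt.Sprintf("leaf %#x.%d %s: %#x vs %#x (bits %#x)",
					k.function, k.index, r.name, r.va, r.vb, r.va^r.vb))
			}
		}
	}
	return diffs
}

func main() {
	// Two tiny hand-made dumps that differ in one ECX bit of leaf 0x1.
	dumpA := []byte(`[{"function":1,"index":0,"eax":1,"ebx":2,"ecx":128,"edx":4}]`)
	dumpB := []byte(`[{"function":1,"index":0,"eax":1,"ebx":2,"ecx":0,"edx":4}]`)

	var a, b []CPUIDEntry
	if err := json.Unmarshal(dumpA, &a); err != nil {
		panic(err)
	}
	if err := json.Unmarshal(dumpB, &b); err != nil {
		panic(err)
	}
	for _, d := range diffCPUID(a, b) {
		fmt.Println(d)
	}
}
```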
Can you confirm whether the two attached data files were captured
by containers running on the same physical host, or could each
container have run on a different host?
My understanding is that KVM_GET_SUPPORTED_CPUID returns the intersection
of the CPUID flags supported by the physical CPU and the CPUID flags
supported by the KVM kernel module.
IOW, I believe the results should only differ if run across hosts with
differing CPU models and/or kernel versions.
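Conceptually (simplified - KVM also force-sets or clears a few specific
bits itself), that intersection is roughly a bitwise AND per register.
The feature bits and masks below are made up purely for illustration:

```go
package main

import "fmt"

// Hypothetical ECX feature bits of CPUID leaf 0x1, for illustration only.
const (
	sse42 uint32 = 1 << 20
	aes   uint32 = 1 << 25
	avx   uint32 = 1 << 28
)

// supported models KVM_GET_SUPPORTED_CPUID for one register: the guest is
// only offered flags that both the physical CPU and the KVM module support.
func supported(hostReg, kvmMask uint32) uint32 {
	return hostReg & kvmMask
}

func main() {
	host := sse42 | aes | avx // what the physical CPU reports
	kvm := sse42 | aes        // what this kernel's KVM can virtualize
	// AVX drops out because KVM (in this made-up example) cannot virtualize
	// it, so hosts with different CPUs or kernels return different values.
	fmt.Printf("exposed ECX: %#x\n", supported(host, kvm))
}
```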
I've not tried to diagnose exactly which feature bits are different
in your dumps yet.
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|