> On Thu, Sep 8, 2022 at 4:32 PM Daniel P. Berrangé <berrange(a)redhat.com>
> wrote:
> > On Thu, Sep 08, 2022 at 04:22:17PM +0200, Roman Mohr wrote:
> > > On Thu, Sep 8, 2022 at 4:04 PM Daniel P. Berrangé <berrange(a)redhat.com>
> > > wrote:
> > >
> > > > On Thu, Sep 08, 2022 at 02:23:31PM +0100, Daniel P. Berrangé wrote:
> > > > > On Thu, Sep 08, 2022 at 03:10:09PM +0200, Roman Mohr wrote:
> > > > > > On Thu, Sep 8, 2022 at 2:56 PM Daniel P. Berrangé <berrange(a)redhat.com>
> > > > > > wrote:
> > > > > >
> > > > > > > On Thu, Sep 08, 2022 at 02:24:00PM +0200, Roman Mohr wrote:
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > I have a question regarding capability caching in the context of
> > > > > > > > KubeVirt. Since we start one libvirt instance per VM in KubeVirt,
> > > > > > > > libvirt has to re-discover the qemu capabilities on every VM
> > > > > > > > start, which leads to a 1-2s+ delay in startup.
> > > > > > > >
> > > > > > > > We already discover the features in a dedicated KubeVirt pod on
> > > > > > > > each node. Therefore I tried to copy the capabilities over to see
> > > > > > > > if that would work.
> > > > > > > >
> > > > > > > > It looks like it could work in general, but libvirt seems to
> > > > > > > > detect a mismatch in the exposed KVM CPUID in every pod and
> > > > > > > > therefore invalidates the cache. The recreated capability cache
> > > > > > > > looks exactly like the original one though ...
> > > > > > > >
> > > > > > > > The check responsible for the invalidation is this:
> > > > > > > >
> > > > > > > > ```
> > > > > > > > Outdated capabilities for '%s': host cpuid changed
> > > > > > > > ```
> > > > > > > >
> > > > > > > > So the KVM_GET_SUPPORTED_CPUID call seems to return slightly
> > > > > > > > different values in different containers.
> > > > > > > >
> > > > > > > > After trying out the attached golang scripts in different
> > > > > > > > containers, I could indeed see differences.
> > > > > > > >
> > > > > > > > I can however not really judge what the differences in these KVM
> > > > > > > > function registers mean, and I am curious if someone else knows.
> > > > > > > > The files are attached too (as json for easy diffing).
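
For anyone wanting to reproduce the comparison: the query behind those
scripts is a single ioctl on /dev/kvm. Below is a minimal C sketch along
the same lines; the file name and output format are illustrative, not
taken from the attached golang scripts.

```c
/* cpuid-dump.c: print every KVM_GET_SUPPORTED_CPUID entry on one line,
 * so the output from two containers can be diffed directly. */
#include <fcntl.h>
#include <linux/kvm.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <unistd.h>

#define MAX_ENTRIES 256

int main(void)
{
    int kvm = open("/dev/kvm", O_RDWR | O_CLOEXEC);
    if (kvm < 0) {
        perror("open /dev/kvm");
        return 1;
    }

    /* kvm_cpuid2 is a variable-length struct: a header plus nent entries */
    struct kvm_cpuid2 *cpuid =
        calloc(1, sizeof(*cpuid) +
                  MAX_ENTRIES * sizeof(struct kvm_cpuid_entry2));
    cpuid->nent = MAX_ENTRIES;

    if (ioctl(kvm, KVM_GET_SUPPORTED_CPUID, cpuid) < 0) {
        perror("KVM_GET_SUPPORTED_CPUID");
        return 1;
    }

    /* the kernel rewrites nent to the number of entries it filled in */
    for (unsigned i = 0; i < cpuid->nent; i++) {
        struct kvm_cpuid_entry2 *e = &cpuid->entries[i];
        printf("func=0x%08x idx=0x%08x flags=0x%x "
               "eax=0x%08x ebx=0x%08x ecx=0x%08x edx=0x%08x\n",
               e->function, e->index, e->flags,
               e->eax, e->ebx, e->ecx, e->edx);
    }

    free(cpuid);
    close(kvm);
    return 0;
}
```

Running that in the virt-handler and virt-launcher pods and diffing the
two outputs should surface the same per-container differences as the
attached json files.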
> > > > > >
> > > > > > Can you confirm whether the two attached data files were
captured
> > > > > > by containers running on the same physical host, or could
each
> > > > > > container have run on a different host.
> > > > > >
> > > > > >
> > > > > > They are coming from the same host; that is the most surprising bit
> > > > > > for me. I am also very sure that this is the case, because I only
> > > > > > had one k8s node from where I took these.
> > > > > > The containers however differ (obviously) on namespaces and on the
> > > > > > privilege level (less obvious). The handler dump is from a fully
> > > > > > privileged container.
> > > > >
> > > > > The privilege level sounds like something that might be impactful,
> > > > > so I'll investigate that. I'd be pretty surprised for namespaces
> > > > > to have any impact though.
> > > >
> > > > The privilege level is a red herring. Peter reminded me that we have
> > > > to filter out some parts of CPUID because the APIC IDs vary depending
> > > > on what host CPU the task executes on.
> > > >
> > > > https://gitlab.com/libvirt/libvirt/-/blob/master/src/util/virhostcpu.c#L1346
> > > >
> > > > In the 2 JSON files you provide, the differences I see should already
> > > > be matched by
> > > >
> > > >     /* filter out local apic id */
> > > >     if (entry->function == 0x01 && entry->index == 0x00)
> > > >         entry->ebx &= 0x00ffffff;
> > > >     if (entry->function == 0x0b)
> > > >         entry->edx &= 0xffffff00;
> > > >
> > > > so those differences ought not to be causing the cache to be
> > > > invalidated.
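
When diffing raw dumps by hand, the same masking can be applied before
comparing so that only meaningful differences remain. A small C sketch of
the idea; the helper name is mine, and it simply mirrors the libvirt
snippet quoted above:

```c
#include <linux/kvm.h>

/* Clear the APIC-ID bits that legitimately vary with whichever host CPU
 * the task happens to execute on. */
static void mask_local_apic_id(struct kvm_cpuid_entry2 *entry)
{
    /* leaf 0x01: EBX bits 31:24 hold the initial local APIC ID */
    if (entry->function == 0x01 && entry->index == 0x00)
        entry->ebx &= 0x00ffffff;
    /* leaf 0x0b: EDX reports the x2APIC ID */
    if (entry->function == 0x0b)
        entry->edx &= 0xffffff00;
}
```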
> > >
> > > Hm, maybe I misinterpreted the logs then. The snippet I looked at was
> > > this:
> > >
> > > ```
> > > {"component":"virt-launcher","level":"info","msg":"/dev/kvm has changed (1661786802 vs 0)","pos":"virQEMUCapsKVMUsable:4850","subcomponent":"libvirt","thread":"25","timestamp":"2022-08-29T15:26:42.936000Z"}
> > > {"component":"virt-launcher","level":"info","msg":"a=0x7f8138153ba0, b=0x7f818001c480","pos":"virCPUDataIsIdentical:1178","subcomponent":"libvirt","thread":"25","timestamp":"2022-08-29T15:26:42.939000Z"}
> > > {"component":"virt-launcher","level":"info","msg":"Outdated capabilities for '/usr/bin/qemu-system-x86_64': host cpuid changed","pos":"virQEMUCapsIsValid:4993","subcomponent":"libvirt","thread":"25","timestamp":"2022-08-29T15:26:42.939000Z"}
> > > {"component":"virt-launcher","level":"info","msg":"Outdated cached capabilities '/var/cache/libvirt/qemu/capabilities/926803a9278e445ec919c2b6cbd8c1c449c75b26dcb1686b774314180376c725.xml' for '/usr/bin/qemu-system-x86_64'","pos":"virFileCacheLoad:163","subcomponent":"libvirt","thread":"25","timestamp":"2022-08-29T15:26:42.939000Z"}
> > > ```
> >
> > Can you capture the
> >
> > /var/cache/libvirt/qemu/capabilities/926803a9278e445ec919c2b6cbd8c1c449c75b26dcb1686b774314180376c725.xml
> >
> > from the virt-handler and virt-launcher pods. It contains a <cpuid>
> > block that will show us the differences libvirt recorded, /after/
> > libvirt has done its filtering. This will show if there is more
> > we need to filter.
>
> Done. Only attached the one from handler since it is 100% identical to
> the launcher.

The only differences between libvirt's XML and what's in your
original testbin_*json files are the two functions that libvirt
masks. The result of masking your JSON matches the libvirt XML
record, so that's good.

This still leaves me puzzled.

> Let me know if you need more information.

Can you run libvirt with a different log filter

  1:cpu 1:qemu

and attach the full log, from when libvirtd starts up until it shows
the capabilities invalidation, as the cpu code might be telling us
more about what the difference is.
With regards,
Daniel