On 29.04.20 16:09, Christian Borntraeger wrote:
On 29.04.20 15:25, Daniel P. Berrangé wrote:
> On Wed, Apr 29, 2020 at 10:19:20AM -0300, Daniel Henrique Barboza wrote:
>>
>>
>> On 4/28/20 12:58 PM, Boris Fiuczynski wrote:
>>> From: Viktor Mihajlovski <mihajlov(a)linux.ibm.com>
>>>
>>
>> [...]
>>> +
>>> +If the check fails despite the host system actually supporting
>>> +protected virtualization guests, this can be caused by a stale
>>> +libvirt capabilities cache. To recover, run the following
>>> +commands
>>> +
>>> +::
>>> +
>>> + $ systemctl stop libvirtd
>>> + $ rm /var/cache/libvirt/qemu/capabilities/*.xml
>>> + $ systemctl start libvirtd
>>> +
>>> +
>>
>>
>> Why isn't Libvirt re-fetching the capabilities after host changes that affect
>> KVM capabilities? I see that we're following up QEMU timestamps to detect
>> if the binary changes, which is sensible, but what about /dev/kvm? Shouldn't
>> we refresh domain capabilities every time following a host reboot?
>
> Caching of capabilities was done precisely to avoid refreshing on every boot
> because it resulted in slow startup for apps using libvirt after boot.
Caching beyond the lifetime of /dev/kvm seems broken from a kernel perspective.
It is totally valid to load a new version of the kvm module with new capabilities.
I am also curious why it took so long: on a normal system we should only have to
refresh _one_ binary, as most binaries are TCG-only.
As Boris said, we are going to provide yet another check (besides the nested
thing).
But in general I think invalidating the cache for that one and only binary
that provides KVM after a change of /dev/kvm seems like the proper solution.
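
To make it concrete, here is a minimal sketch of what I mean (struct and
function names are made up, this is not the actual libvirt code): tie the
cached entry to the timestamps of both the emulator binary, which the cache
already tracks as mentioned above, and /dev/kvm.

    /* Sketch only: validate a cached capabilities entry against the
     * timestamps of the emulator binary and of /dev/kvm. Reloading the
     * kvm module recreates the device node, so its ctime changes and
     * the cached entry is discarded on the next lookup. */
    #include <stdbool.h>
    #include <sys/stat.h>
    #include <time.h>

    struct caps_cache_entry {
        time_t qemu_ctime;      /* recorded when the entry was written */
        time_t kvm_dev_ctime;   /* ditto, for /dev/kvm */
        /* ... cached capability data ... */
    };

    static bool
    caps_cache_entry_is_valid(const struct caps_cache_entry *entry,
                              const char *qemu_binary)
    {
        struct stat sb;

        /* emulator binary was replaced, e.g. by a package update */
        if (stat(qemu_binary, &sb) < 0 || sb.st_ctime != entry->qemu_ctime)
            return false;

        /* kvm module was reloaded, possibly with different capabilities */
        if (stat("/dev/kvm", &sb) < 0 || sb.st_ctime != entry->kvm_dev_ctime)
            return false;

        return true;
    }

Since /dev/kvm is recreated on every boot and on every module reload, this
would also naturally force a rescan of just that one binary after a reboot.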
Looking into that, I found a handler for the "nested" module parameter. Now, we
also have an hpage parameter that decides whether large pages can be used in the
host. Do we need to check that as well, or is that something libvirt does not
care about? This parameter will go away at a later point, though, once we have
added the code to support hpage and nested at the same time.
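
If libvirt did want to look at such parameters, the current values are visible
in sysfs. A throwaway sketch, assuming the usual
/sys/module/kvm/parameters/<name> layout for the s390x kvm module:

    /* Sketch only: read the current value of a kvm module parameter from
     * sysfs. The nested/hpage paths are an assumption about how the
     * s390x kvm module exposes them. */
    #include <stdio.h>

    static int
    kvm_param_enabled(const char *name)
    {
        char path[128];
        char buf[8] = "";
        FILE *fp;

        snprintf(path, sizeof(path), "/sys/module/kvm/parameters/%s", name);
        if (!(fp = fopen(path, "r")))
            return -1;              /* parameter not present on this kernel */
        if (!fgets(buf, sizeof(buf), fp))
            buf[0] = '\0';
        fclose(fp);
        return buf[0] == '1' || buf[0] == 'Y' || buf[0] == 'y';
    }

    int main(void)
    {
        printf("nested: %d\n", kvm_param_enabled("nested"));
        printf("hpage:  %d\n", kvm_param_enabled("hpage"));
        return 0;
    }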
So given the shakiness of these things, the timestamp of /dev/kvm really seems
to be the best way forward long term. It would be really interesting to understand
the conditions under which you saw the long startup times due to rescanning.