On Tue, Nov 20, 2018 at 11:00 AM +0100, Marc Hartmayer
<mhartmay(a)linux.ibm.com> wrote:
> On Thu, Nov 01, 2018 at 04:37 PM +0100, Michal Privoznik
> <mprivozn(a)redhat.com> wrote:
>> On 10/30/2018 01:55 PM, Daniel P. Berrangé wrote:
>>> On Tue, Oct 30, 2018 at 10:32:08AM +0000, Daniel P. Berrangé wrote:
>>>> On Tue, Oct 30, 2018 at 11:08:45AM +0100, Michal Privoznik wrote:
>>>>> On 10/30/2018 10:35 AM, Daniel P. Berrangé wrote:
>>>>>> On Tue, Oct 30, 2018 at 09:13:50AM +0100, Michal Privoznik wrote:
>>>>>>> On 10/29/2018 06:34 PM, Marc Hartmayer wrote:
>>>>>>>> Introduce caching whether /dev/kvm is usable as the QEMU user:QEMU
>>>>>>>> group. This reduces the overhead of the QEMU capabilities cache
>>>>>>>> lookup. Before this patch there were many fork() calls used for
>>>>>>>> checking whether /dev/kvm is accessible. Now we store the result
>>>>>>>> whether /dev/kvm is accessible or not and we only need to re-run the
>>>>>>>> virFileAccessibleAs check if the ctime of /dev/kvm has changed.
>>>>>>>>
>>>>>>>> Suggested-by: Daniel P. Berrangé <berrange(a)redhat.com>
>>>>>>>> Signed-off-by: Marc Hartmayer <mhartmay(a)linux.ibm.com>
>>>>>>>> ---
>>>>>>>>  src/qemu/qemu_capabilities.c | 54 ++++++++++++++++++++++++++++++++++--
>>>>>>>>  1 file changed, 52 insertions(+), 2 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
>>>>>>>> index e228f52ec0bb..85516954149b 100644
>>>>>>>> --- a/src/qemu/qemu_capabilities.c
>>>>>>>> +++ b/src/qemu/qemu_capabilities.c
>>>>>>>> @@ -3238,6 +3238,10 @@ struct _virQEMUCapsCachePriv {
>>>>>>>>      virArch hostArch;
>>>>>>>>      unsigned int microcodeVersion;
>>>>>>>>      char *kernelVersion;
>>>>>>>> +
>>>>>>>> +    /* cache whether /dev/kvm is usable as runUid:runGuid */
>>>>>>>> +    virTristateBool kvmUsable;
>>>>>>>> +    time_t kvmCtime;
>>>>>>>>  };
>>>>>>>>  typedef struct _virQEMUCapsCachePriv virQEMUCapsCachePriv;
>>>>>>>>  typedef virQEMUCapsCachePriv *virQEMUCapsCachePrivPtr;
>>>>>>>> @@ -3824,6 +3828,52 @@ virQEMUCapsSaveFile(void *data,
>>>>>>>> }
>>>>>>>>
>>>>>>>>
>>>>>>>> +/* Determine whether '/dev/kvm' is usable as QEMU user:QEMU group. */
>>>>>>>> +static bool
>>>>>>>> +virQEMUCapsKVMUsable(virQEMUCapsCachePrivPtr priv)
>>>>>>>> +{
>>>>>>>> +    struct stat sb;
>>>>>>>> +    static const char *kvm_device = "/dev/kvm";
>>>>>>>> +    virTristateBool value;
>>>>>>>> +    virTristateBool cached_value = priv->kvmUsable;
>>>>>>>> +    time_t kvm_ctime;
>>>>>>>> +    time_t cached_kvm_ctime = priv->kvmCtime;
>>>>>>>> +
>>>>>>>> +    if (stat(kvm_device, &sb) < 0) {
>>>>>>>> +        virReportSystemError(errno,
>>>>>>>> +                             _("Failed to stat %s"), kvm_device);
>>>>>>>> +        return false;
>>>>>>>> +    }
>>>>>>>> +    kvm_ctime = sb.st_ctime;
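
The quoted hunk stops right after the stat() call. The rest of the check
presumably compares the new ctime against the cached one, falls back to the
fork-based virFileAccessibleAs() only when they differ, and then refreshes
the cache. A rough sketch of that tail end (not the verbatim remainder of
the patch; it assumes the existing priv->runUid/priv->runGid fields and
R_OK | W_OK as the access mode):

    /* Fast path: if the ctime is unchanged, trust the cached answer. */
    if (kvm_ctime != cached_kvm_ctime)
        cached_value = VIR_TRISTATE_BOOL_ABSENT;

    if (cached_value != VIR_TRISTATE_BOOL_ABSENT)
        return cached_value == VIR_TRISTATE_BOOL_YES;

    /* Slow path: re-run the accessibility check and remember the result. */
    if (virFileAccessibleAs(kvm_device, R_OK | W_OK,
                            priv->runUid, priv->runGid) == 0)
        value = VIR_TRISTATE_BOOL_YES;
    else
        value = VIR_TRISTATE_BOOL_NO;

    priv->kvmUsable = value;
    priv->kvmCtime = kvm_ctime;

    return value == VIR_TRISTATE_BOOL_YES;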
>>>>>>>
>>>>>>> This doesn't feel right. /dev/kvm's ctime changes every time qemu is
>>>>>>> started or powered off (try running stat on it before and after a
>>>>>>> domain is started/shut off). So effectively we will fork more often
>>>>>>> than we think. Should we cache the inode number instead? Because all
>>>>>>> we care about is simply whether the file is there.
>>>>>>
>>>>>> Urgh, that is a bit strange and not the usual semantics for timestamps :-(
>>>>>
>>>>> Indeed.
>>>>>
>>>>>>
>>>>>> We can't stat the inode - the code was explicitly trying to cope with
>>>>>> the way /dev/kvm can change permissions when udev rules get applied.
>>>>>> We would have to compare the user, group, permissions mask and even
>>>>>> ACL, or a hash of those.
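
For a sense of what that comparison would involve, a purely illustrative
helper (not part of the patch; real code would also have to fetch the ACL
via acl_get_file() and compare its entries) might boil down to a handful of
stat fields:

#include <stdbool.h>
#include <sys/stat.h>

/* Have the ownership/permission bits relevant to "usable as uid:gid"
 * changed between two stat() snapshots of /dev/kvm?  ACL entries are
 * deliberately ignored here. */
static bool
kvmMetadataUnchanged(const struct stat *cached, const struct stat *cur)
{
    return cached->st_uid == cur->st_uid &&
           cached->st_gid == cur->st_gid &&
           (cached->st_mode & (S_IRWXU | S_IRWXG | S_IRWXO)) ==
               (cur->st_mode & (S_IRWXU | S_IRWXG | S_IRWXO));
}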
>>>>>
>>>>> Well, we can use ctime as suggested and post a patch to the kernel to
>>>>> fix the ctime behaviour. Until the patch is merged our behaviour would
>>>>> be suboptimal, but still better than it is now.
>>>>
>>>> I guess let's talk to the KVM team for their input on this and then
>>>> decide what to do.
>>>
>>> Hmm, I wonder if it is not actually a kernel problem, but rather something
>>> in userspace genuinely touching the device in a way that causes these
>>> timestamps to be updated.
>>>
>>> E.g. I vaguely recall a udev rule that resets permissions on device nodes
>>> whenever an FD is closed, which might cause this kind of behaviour.
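
If it is such a rule, the churn is fully explained by POSIX semantics:
chmod() and chown() always mark the status-change timestamp for update,
even when the mode/owner they set is identical to the current one. A tiny
standalone demo of that behaviour (hypothetical scratch path, nothing
libvirt-specific):

#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    const char *path = "/tmp/ctime-demo";   /* hypothetical scratch file */
    struct stat before, after;
    int fd;

    if ((fd = creat(path, 0644)) < 0 || stat(path, &before) < 0)
        return 1;
    close(fd);

    sleep(1);                               /* make the ctime delta visible */
    chmod(path, before.st_mode & 07777);    /* re-apply the identical mode */

    if (stat(path, &after) < 0)
        return 1;

    /* after.st_ctime is newer even though nothing effectively changed */
    printf("ctime before=%lld after=%lld\n",
           (long long)before.st_ctime, (long long)after.st_ctime);
    return 0;
}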
>>
>> After trying hard to find the rule that mangles the ctime I've found the
>> following bug in udev:
>>
>> https://github.com/zippy2/systemd/commit/51915e4e6a64c581da321c25448d8062...
>>
>> I've sent that as a pull request to systemd. So after all, ctime might
>> be usable again.
>
> Are there any other objections to this patch? Because even with the
> ctime bug in udev it performs _much_ better than before.

Not from my side, really. I would still argue that we *need* to cache it
per-user, but since that was broken before I can convince myself I'm OK with
that going in separately.

Don't know about others, though.

Polite ping :)
[…snip]
--
Kind regards / Beste Grüße
Marc Hartmayer
IBM Deutschland Research & Development GmbH
Chairman of the Supervisory Board: Matthias Hartmann
Managing Director: Dirk Wittkopp
Registered office: Böblingen
Register court: Amtsgericht Stuttgart, HRB 243294