On Tue, Nov 20, 2018 at 11:00 AM +0100, Marc Hartmayer <mhartmay(a)linux.ibm.com>
wrote:
On Thu, Nov 01, 2018 at 04:37 PM +0100, Michal Privoznik
<mprivozn(a)redhat.com> wrote:
> On 10/30/2018 01:55 PM, Daniel P. Berrangé wrote:
>> On Tue, Oct 30, 2018 at 10:32:08AM +0000, Daniel P. Berrangé wrote:
>>> On Tue, Oct 30, 2018 at 11:08:45AM +0100, Michal Privoznik wrote:
>>>> On 10/30/2018 10:35 AM, Daniel P. Berrangé wrote:
>>>>> On Tue, Oct 30, 2018 at 09:13:50AM +0100, Michal Privoznik wrote:
>>>>>> On 10/29/2018 06:34 PM, Marc Hartmayer wrote:
>>>>>>> Introduce caching whether /dev/kvm is usable as the QEMU
user:QEMU
>>>>>>> group. This reduces the overhead of the QEMU capabilities
cache
>>>>>>> lookup. Before this patch there were many fork() calls used
for
>>>>>>> checking whether /dev/kvm is accessible. Now we store the
result
>>>>>>> whether /dev/kvm is accessible or not and we only need to
re-run the
>>>>>>> virFileAccessibleAs check if the ctime of /dev/kvm has
changed.
>>>>>>>
>>>>>>> Suggested-by: Daniel P. Berrangé <berrange(a)redhat.com>
>>>>>>> Signed-off-by: Marc Hartmayer <mhartmay(a)linux.ibm.com>
>>>>>>> ---
>>>>>>> src/qemu/qemu_capabilities.c | 54
++++++++++++++++++++++++++++++++++--
>>>>>>> 1 file changed, 52 insertions(+), 2 deletions(-)
>>>>>>>
>>>>>>> diff --git a/src/qemu/qemu_capabilities.c
b/src/qemu/qemu_capabilities.c
>>>>>>> index e228f52ec0bb..85516954149b 100644
>>>>>>> --- a/src/qemu/qemu_capabilities.c
>>>>>>> +++ b/src/qemu/qemu_capabilities.c
>>>>>>> @@ -3238,6 +3238,10 @@ struct _virQEMUCapsCachePriv {
>>>>>>> virArch hostArch;
>>>>>>> unsigned int microcodeVersion;
>>>>>>> char *kernelVersion;
>>>>>>> +
>>>>>>> + /* cache whether /dev/kvm is usable as runUid:runGuid
*/
>>>>>>> + virTristateBool kvmUsable;
>>>>>>> + time_t kvmCtime;
>>>>>>> };
>>>>>>> typedef struct _virQEMUCapsCachePriv virQEMUCapsCachePriv;
>>>>>>> typedef virQEMUCapsCachePriv *virQEMUCapsCachePrivPtr;
>>>>>>> @@ -3824,6 +3828,52 @@ virQEMUCapsSaveFile(void *data,
>>>>>>> }
>>>>>>>
>>>>>>>
>>>>>>> +/* Determine whether '/dev/kvm' is usable as QEMU
user:QEMU group. */
>>>>>>> +static bool
>>>>>>> +virQEMUCapsKVMUsable(virQEMUCapsCachePrivPtr priv)
>>>>>>> +{
>>>>>>> + struct stat sb;
>>>>>>> + static const char *kvm_device = "/dev/kvm";
>>>>>>> + virTristateBool value;
>>>>>>> + virTristateBool cached_value = priv->kvmUsable;
>>>>>>> + time_t kvm_ctime;
>>>>>>> + time_t cached_kvm_ctime = priv->kvmCtime;
>>>>>>> +
>>>>>>> + if (stat(kvm_device, &sb) < 0) {
>>>>>>> + virReportSystemError(errno,
>>>>>>> + _("Failed to stat
%s"), kvm_device);
>>>>>>> + return false;
>>>>>>> + }
>>>>>>> + kvm_ctime = sb.st_ctime;
>>>>>>
>>>>>> This doesn't feel right. /dev/kvm ctime is changed every time
qemu is
>>>>>> started or powered off (try running stat over it before and after
a
>>>>>> domain is started/shut off). So effectively we will fork more
often than
>>>>>> we would think. Should we cache inode number instead? Because for
all
>>>>>> that we care is simply if the file is there.
>>>>>
>>>>> Urgh, that is a bit strange and not the usual semantics for
timestamps :-(
>>>>
>>>> Indeed.
>>>>
>>>>>
>>>>> We can't stat the inode - the code was explicitly trying to cope
with the
>>>>> way /dev/kvm can change permissions when udev rules get applied. We
would
>>>>> have to compare the user, group, permissions mask and even ACL, or a
hash
>>>>> of those.
>>>>
>>>> Well, we can use ctime as suggested and post a patch for kernel to fix
>>>> ctime behaviour. Until the patch is merged our behaviour would be
>>>> suboptimal, but still better than it is now.
>>>
>>> I guess lets talk to KVM team for their input on this and then decide
>>> what todo.
>>
>> Hmm, I wonder if it is not actually a kernel problem, but rather something
>> in userspace genuinely touching the device in a way that caues these
>> timestamps to be updated.
>>
>> eg I vaguely recall a udev rule that resets permissions on device nodes
>> whenever an FD is closed, which might cause this kind of behaviour
>
> After trying hard to find the rule that mangles the ctime I've found the
> following bug in udev:
>
>
https://github.com/zippy2/systemd/commit/51915e4e6a64c581da321c25448d8062...
>
> I've send that as a pull request to systemd. So after all, ctime might
> be usable again.
Are there any other objections against this patch? Because even with the
ctime bug in udev it’s performs _much_ better than before.
Polite ping :)
[…snip]
--
Kind regards / Beste Grüße
Marc Hartmayer
IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Matthias Hartmann
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294