On 10/30/2018 01:55 PM, Daniel P. Berrangé wrote:
On Tue, Oct 30, 2018 at 10:32:08AM +0000, Daniel P. Berrangé wrote:
> On Tue, Oct 30, 2018 at 11:08:45AM +0100, Michal Privoznik wrote:
>> On 10/30/2018 10:35 AM, Daniel P. Berrangé wrote:
>>> On Tue, Oct 30, 2018 at 09:13:50AM +0100, Michal Privoznik wrote:
>>>> On 10/29/2018 06:34 PM, Marc Hartmayer wrote:
>>>>> Introduce caching whether /dev/kvm is usable as the QEMU user:QEMU
>>>>> group. This reduces the overhead of the QEMU capabilities cache
>>>>> lookup. Before this patch there were many fork() calls used for
>>>>> checking whether /dev/kvm is accessible. Now we store the result
>>>>> whether /dev/kvm is accessible or not and we only need to re-run the
>>>>> virFileAccessibleAs check if the ctime of /dev/kvm has changed.
>>>>>
>>>>> Suggested-by: Daniel P. Berrangé <berrange(a)redhat.com>
>>>>> Signed-off-by: Marc Hartmayer <mhartmay(a)linux.ibm.com>
>>>>> ---
>>>>> src/qemu/qemu_capabilities.c | 54
++++++++++++++++++++++++++++++++++--
>>>>> 1 file changed, 52 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/src/qemu/qemu_capabilities.c
b/src/qemu/qemu_capabilities.c
>>>>> index e228f52ec0bb..85516954149b 100644
>>>>> --- a/src/qemu/qemu_capabilities.c
>>>>> +++ b/src/qemu/qemu_capabilities.c
>>>>> @@ -3238,6 +3238,10 @@ struct _virQEMUCapsCachePriv {
>>>>> virArch hostArch;
>>>>> unsigned int microcodeVersion;
>>>>> char *kernelVersion;
>>>>> +
>>>>> + /* cache whether /dev/kvm is usable as runUid:runGuid */
>>>>> + virTristateBool kvmUsable;
>>>>> + time_t kvmCtime;
>>>>> };
>>>>> typedef struct _virQEMUCapsCachePriv virQEMUCapsCachePriv;
>>>>> typedef virQEMUCapsCachePriv *virQEMUCapsCachePrivPtr;
>>>>> @@ -3824,6 +3828,52 @@ virQEMUCapsSaveFile(void *data,
>>>>> }
>>>>>
>>>>>
>>>>> +/* Determine whether '/dev/kvm' is usable as QEMU user:QEMU
group. */
>>>>> +static bool
>>>>> +virQEMUCapsKVMUsable(virQEMUCapsCachePrivPtr priv)
>>>>> +{
>>>>> + struct stat sb;
>>>>> + static const char *kvm_device = "/dev/kvm";
>>>>> + virTristateBool value;
>>>>> + virTristateBool cached_value = priv->kvmUsable;
>>>>> + time_t kvm_ctime;
>>>>> + time_t cached_kvm_ctime = priv->kvmCtime;
>>>>> +
>>>>> + if (stat(kvm_device, &sb) < 0) {
>>>>> + virReportSystemError(errno,
>>>>> + _("Failed to stat %s"),
kvm_device);
>>>>> + return false;
>>>>> + }
>>>>> + kvm_ctime = sb.st_ctime;
>>>>
>>>> This doesn't feel right. /dev/kvm ctime is changed every time qemu
is
>>>> started or powered off (try running stat over it before and after a
>>>> domain is started/shut off). So effectively we will fork more often than
>>>> we would think. Should we cache inode number instead? Because for all
>>>> that we care is simply if the file is there.
>>>
>>> Urgh, that is a bit strange and not the usual semantics for timestamps :-(
>>
>> Indeed.
>>
>>>
>>> We can't stat the inode - the code was explicitly trying to cope with
the
>>> way /dev/kvm can change permissions when udev rules get applied. We would
>>> have to compare the user, group, permissions mask and even ACL, or a hash
>>> of those.
>>
>> Well, we can use ctime as suggested and post a patch for kernel to fix
>> ctime behaviour. Until the patch is merged our behaviour would be
>> suboptimal, but still better than it is now.
>
> I guess lets talk to KVM team for their input on this and then decide
> what todo.
Hmm, I wonder if it is not actually a kernel problem, but rather something
in userspace genuinely touching the device in a way that caues these
timestamps to be updated.
It is kernel problem. In my testing, the moment I call:
ioctl(kvm, KVM_CREATE_VM, 0);
all three times (atime, mtime, ctime) are changed. However, prior to
that I'm calling some 'query' ioctls (KVM_GET_API_VERSION,
KVM_CHECK_EXTENSION) but none of the Xtime is changed.
eg I vaguely recall a udev rule that resets permissions on device
nodes
whenever an FD is closed, which might cause this kind of behaviour
Huh, I did not know that udev can hook on close(2).
Michal