On 10/30/2018 10:35 AM, Daniel P. Berrangé wrote:
On Tue, Oct 30, 2018 at 09:13:50AM +0100, Michal Privoznik wrote:
> On 10/29/2018 06:34 PM, Marc Hartmayer wrote:
>> Introduce caching whether /dev/kvm is usable as the QEMU user:QEMU
>> group. This reduces the overhead of the QEMU capabilities cache
>> lookup. Before this patch there were many fork() calls used for
>> checking whether /dev/kvm is accessible. Now we store the result
>> whether /dev/kvm is accessible or not and we only need to re-run the
>> virFileAccessibleAs check if the ctime of /dev/kvm has changed.
>>
>> Suggested-by: Daniel P. Berrangé <berrange(a)redhat.com>
>> Signed-off-by: Marc Hartmayer <mhartmay(a)linux.ibm.com>
>> ---
>> src/qemu/qemu_capabilities.c | 54 ++++++++++++++++++++++++++++++++++--
>> 1 file changed, 52 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
>> index e228f52ec0bb..85516954149b 100644
>> --- a/src/qemu/qemu_capabilities.c
>> +++ b/src/qemu/qemu_capabilities.c
>> @@ -3238,6 +3238,10 @@ struct _virQEMUCapsCachePriv {
>> virArch hostArch;
>> unsigned int microcodeVersion;
>> char *kernelVersion;
>> +
>> + /* cache whether /dev/kvm is usable as runUid:runGuid */
>> + virTristateBool kvmUsable;
>> + time_t kvmCtime;
>> };
>> typedef struct _virQEMUCapsCachePriv virQEMUCapsCachePriv;
>> typedef virQEMUCapsCachePriv *virQEMUCapsCachePrivPtr;
>> @@ -3824,6 +3828,52 @@ virQEMUCapsSaveFile(void *data,
>> }
>>
>>
>> +/* Determine whether '/dev/kvm' is usable as QEMU user:QEMU group. */
>> +static bool
>> +virQEMUCapsKVMUsable(virQEMUCapsCachePrivPtr priv)
>> +{
>> + struct stat sb;
>> + static const char *kvm_device = "/dev/kvm";
>> + virTristateBool value;
>> + virTristateBool cached_value = priv->kvmUsable;
>> + time_t kvm_ctime;
>> + time_t cached_kvm_ctime = priv->kvmCtime;
>> +
>> + if (stat(kvm_device, &sb) < 0) {
>> + virReportSystemError(errno,
>> + _("Failed to stat %s"), kvm_device);
>> + return false;
>> + }
>> + kvm_ctime = sb.st_ctime;
>
> This doesn't feel right. /dev/kvm ctime is changed every time qemu is
> started or powered off (try running stat over it before and after a
> domain is started/shut off). So effectively we will fork more often than
> we would think. Should we cache inode number instead? Because for all
> that we care is simply if the file is there.
Urgh, that is a bit strange and not the usual semantics for timestamps :-(
Indeed.
We can't stat the inode - the code was explicitly trying to cope with the
way /dev/kvm can change permissions when udev rules get applied. We would
have to compare the user, group, permissions mask and even ACL, or a hash
of those.
Well, we can use ctime as suggested and post a patch for kernel to fix
ctime behaviour. Until the patch is merged our behaviour would be
suboptimal, but still better than it is now.
Michal