On Tue, Oct 30, 2018 at 10:32:08AM +0000, Daniel P. Berrangé wrote:
On Tue, Oct 30, 2018 at 11:08:45AM +0100, Michal Privoznik wrote:
> On 10/30/2018 10:35 AM, Daniel P. Berrangé wrote:
> > On Tue, Oct 30, 2018 at 09:13:50AM +0100, Michal Privoznik wrote:
> >> On 10/29/2018 06:34 PM, Marc Hartmayer wrote:
> >>> Introduce caching whether /dev/kvm is usable as the QEMU user:QEMU
> >>> group. This reduces the overhead of the QEMU capabilities cache
> >>> lookup. Before this patch there were many fork() calls used for
> >>> checking whether /dev/kvm is accessible. Now we store the result
> >>> whether /dev/kvm is accessible or not and we only need to re-run the
> >>> virFileAccessibleAs check if the ctime of /dev/kvm has changed.
> >>>
> >>> Suggested-by: Daniel P. Berrangé <berrange(a)redhat.com>
> >>> Signed-off-by: Marc Hartmayer <mhartmay(a)linux.ibm.com>
> >>> ---
> >>> src/qemu/qemu_capabilities.c | 54
++++++++++++++++++++++++++++++++++--
> >>> 1 file changed, 52 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/src/qemu/qemu_capabilities.c
b/src/qemu/qemu_capabilities.c
> >>> index e228f52ec0bb..85516954149b 100644
> >>> --- a/src/qemu/qemu_capabilities.c
> >>> +++ b/src/qemu/qemu_capabilities.c
> >>> @@ -3238,6 +3238,10 @@ struct _virQEMUCapsCachePriv {
> >>> virArch hostArch;
> >>> unsigned int microcodeVersion;
> >>> char *kernelVersion;
> >>> +
> >>> + /* cache whether /dev/kvm is usable as runUid:runGuid */
> >>> + virTristateBool kvmUsable;
> >>> + time_t kvmCtime;
> >>> };
> >>> typedef struct _virQEMUCapsCachePriv virQEMUCapsCachePriv;
> >>> typedef virQEMUCapsCachePriv *virQEMUCapsCachePrivPtr;
> >>> @@ -3824,6 +3828,52 @@ virQEMUCapsSaveFile(void *data,
> >>> }
> >>>
> >>>
> >>> +/* Determine whether '/dev/kvm' is usable as QEMU user:QEMU
group. */
> >>> +static bool
> >>> +virQEMUCapsKVMUsable(virQEMUCapsCachePrivPtr priv)
> >>> +{
> >>> + struct stat sb;
> >>> + static const char *kvm_device = "/dev/kvm";
> >>> + virTristateBool value;
> >>> + virTristateBool cached_value = priv->kvmUsable;
> >>> + time_t kvm_ctime;
> >>> + time_t cached_kvm_ctime = priv->kvmCtime;
> >>> +
> >>> + if (stat(kvm_device, &sb) < 0) {
> >>> + virReportSystemError(errno,
> >>> + _("Failed to stat %s"),
kvm_device);
> >>> + return false;
> >>> + }
> >>> + kvm_ctime = sb.st_ctime;
> >>
> >> This doesn't feel right. /dev/kvm ctime is changed every time qemu is
> >> started or powered off (try running stat over it before and after a
> >> domain is started/shut off). So effectively we will fork more often than
> >> we would think. Should we cache inode number instead? Because for all
> >> that we care is simply if the file is there.
> >
> > Urgh, that is a bit strange and not the usual semantics for timestamps :-(
>
> Indeed.
>
> >
> > We can't stat the inode - the code was explicitly trying to cope with the
> > way /dev/kvm can change permissions when udev rules get applied. We would
> > have to compare the user, group, permissions mask and even ACL, or a hash
> > of those.
>
> Well, we can use ctime as suggested and post a patch for kernel to fix
> ctime behaviour. Until the patch is merged our behaviour would be
> suboptimal, but still better than it is now.
I guess lets talk to KVM team for their input on this and then decide
what todo.
Hmm, I wonder if it is not actually a kernel problem, but rather something
in userspace genuinely touching the device in a way that caues these
timestamps to be updated.
eg I vaguely recall a udev rule that resets permissions on device nodes
whenever an FD is closed, which might cause this kind of behaviour
Regards,
Daniel
--
|:
https://berrange.com -o-
https://www.flickr.com/photos/dberrange :|
|:
https://libvirt.org -o-
https://fstop138.berrange.com :|
|:
https://entangle-photo.org -o-
https://www.instagram.com/dberrange :|