Re: [libvirt] [PATCH 01/12] conf: introduce domain XML element <polling> for iothread

21 Feb 2017


      On Tue, Feb 21, 2017 at 02:14:44PM +0100, Pavel Hrdina wrote:
...
On Tue, Feb 21, 2017 at 12:55:51PM +0000, Daniel P. Berrange wrote:
...
On Tue, Feb 21, 2017 at 01:48:15PM +0100, Pavel Hrdina wrote:
...
On Tue, Feb 21, 2017 at 12:26:02PM +0000, Daniel P. Berrange wrote:
...
On Tue, Feb 21, 2017 at 01:14:57PM +0100, Pavel Hrdina wrote:
...
QEMU 2.9.0 will introduce polling feature for AioContext that instead
of blocking syscalls polls for events without blocking.  This means
that polling can be in most cases faster but it also increases CPU
utilization.
To address this issue QEMU implements self-tuning algorithm that
modifies the current polling time to adapt to different workloads
and it can also fallback to blocking syscalls.
For each IOThread this all is controlled by three parameters,
poll-max-ns, poll-grow and poll-shrink.  If parameter poll-max-ns
is set to 0 it disables the polling, if it is omitted the default
behavior is used and any value more than 0 enables polling.
Parameters poll-grow and poll-shrink configure how the self-tuning
algorithm will adapt the current polling time.  If they are omitted
or set to 0 default values will be used.
With my app developer hat on I have to wonder how an app is supposed
to figure out what to set these parameters to ? It has been difficult
enough figuring out existing QEMU block tunables, but at least most
of those can be set dependant on the type of storage use on the host
side. Tunables whose use depends on the guest workload are harder to
use since it largely involves predicting the unknown.  IOW is there
a compelling reason to add these low level parameters that are tightly
coupled to the specific algorithm that QEMU happens to use today.
I agree that it's probably way to complicated for management applications,
but there is a small issue with QEMU.  Currently it you don't specify
anything the polling is enabled with some reasonable default value and base
on experience with QEMU I'm not planning to count on that they will not change
the default behavior in the future.  To explicitly enable polling you need
to set poll-max-ns to some value more than 0.  We would have to check qemu
source code, and define the default value in our code in order to let
users explicitly enable polling.
The QEMU commit says polling is now enabled by default without needing
to set poll-max-ns AFAICT
commit cdd7abfdba9287a289c404dfdcb02316f9ffee7d
  Author: Stefan Hajnoczi <stefanha@redhat.com>
  Date:   Thu Jan 26 17:01:19 2017 +0000
iothread: enable AioContext polling by default
IOThread AioContexts are likely to consist only of event sources like
    virtqueue ioeventfds and LinuxAIO completion eventfds that are pollable
    from userspace (without system calls).
We recently merged the AioContext polling feature but didn't enable it
    by default yet.  I have gone back over the performance data on the
    mailing list and picked a default polling value that gave good results.
Let's enable AioContext polling by default so users don't have another
    switch they need to set manually.  If performance regressions are found
    we can still disable this for the QEMU 2.9 release.
...
...
The QEMU commits say the tunables all default to sane parameters so
I'm inclined to say we ignore them at the libvirt level entirely.
Yes, it would be way batter to have only <polling enable='yes|no'> end let
QEMU deal with the sane values for all parameters but that would mean to come
up with the sane values ourselves or modify QEMU to add another property that
would simply control whether it is enabled or not.
I'm saying don't even add that.
Do exactly nothing and just rely on the QEMU defaults here. This is not
affecting guest ABI at all so it doesn't matter if QEMU changes its
defaults later. In fact if QEMU changes defaults based on newer performance
measurements, it is a good thing if libvirt hasn't hardcoded all its VM
configs to the old default.
What if someone would like to disable it even if QEMU thinks that the
performance is good?  This patch series doesn't hardcode anything into
VM config.  If you don't set the polling element at all Libvirt will follow
the QEMU defaults and only the live XML would contain current state of
polling with default values loaded from QEMU.
This patch series ads a possibility to explicitly configure the polling if
someone want's to do that for some reason, but it also preserve the benefit
when you just don't care about it and you want to use the default.
If you still think that we should not export this feature at all, well we
don't have to.  The use-case that you've described is still possible with
this series, it only adds extra functionality on top of that.
I'm very wary of adding config parameters in libvirt just because they
exist in QEMU, particularly when the parameters are totally specific to
an algorithm that just happens to be the one implemented right now. We've
no idea if QEMU will stick with this algorithm or change it entirely, and
if the latter, then any config parameters will be likely meaningless for
any other algorithm. I can understand why QEMU would expose them on its
CLI, but I don't feel they are a good fit for exposing up the stack,
particularly given a lack of any guidance as to how people might consider
changing the values, other than random uninformed guesswork.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|