On Tue, Feb 21, 2017 at 1:39 PM, Pavel Hrdina <phrdina(a)redhat.com> wrote:
On Tue, Feb 21, 2017 at 01:26:25PM +0000, Daniel P. Berrange wrote:
> On Tue, Feb 21, 2017 at 02:14:44PM +0100, Pavel Hrdina wrote:
> > On Tue, Feb 21, 2017 at 12:55:51PM +0000, Daniel P. Berrange wrote:
> > > On Tue, Feb 21, 2017 at 01:48:15PM +0100, Pavel Hrdina wrote:
> > > > On Tue, Feb 21, 2017 at 12:26:02PM +0000, Daniel P. Berrange wrote:
> > > > > On Tue, Feb 21, 2017 at 01:14:57PM +0100, Pavel Hrdina wrote:
> > > > > > QEMU 2.9.0 will introduce polling feature for AioContext
that instead
> > > > > > of blocking syscalls polls for events without blocking.
This means
> > > > > > that polling can be in most cases faster but it also
increases CPU
> > > > > > utilization.
> > > > > >
> > > > > > To address this issue QEMU implements self-tuning algorithm
that
> > > > > > modifies the current polling time to adapt to different
workloads
> > > > > > and it can also fallback to blocking syscalls.
> > > > > >
> > > > > > For each IOThread this all is controlled by three
parameters,
> > > > > > poll-max-ns, poll-grow and poll-shrink. If parameter
poll-max-ns
> > > > > > is set to 0 it disables the polling, if it is omitted the
default
> > > > > > behavior is used and any value more than 0 enables polling.
> > > > > > Parameters poll-grow and poll-shrink configure how the
self-tuning
> > > > > > algorithm will adapt the current polling time. If they are
omitted
> > > > > > or set to 0 default values will be used.
> > > > >
> > > > > With my app developer hat on I have to wonder how an app is
supposed
> > > > > to figure out what to set these parameters to ? It has been
difficult
> > > > > enough figuring out existing QEMU block tunables, but at least
most
> > > > > of those can be set dependant on the type of storage use on the
host
> > > > > side. Tunables whose use depends on the guest workload are harder
to
> > > > > use since it largely involves predicting the unknown. IOW is
there
> > > > > a compelling reason to add these low level parameters that are
tightly
> > > > > coupled to the specific algorithm that QEMU happens to use
today.
> > > >
> > > > I agree that it's probably way to complicated for management
applications,
> > > > but there is a small issue with QEMU. Currently it you don't
specify
> > > > anything the polling is enabled with some reasonable default value and
base
> > > > on experience with QEMU I'm not planning to count on that they
will not change
> > > > the default behavior in the future. To explicitly enable polling you
need
> > > > to set poll-max-ns to some value more than 0. We would have to check
qemu
> > > > source code, and define the default value in our code in order to let
> > > > users explicitly enable polling.
> > >
> > > The QEMU commit says polling is now enabled by default without needing
> > > to set poll-max-ns AFAICT
> > >
> > > commit cdd7abfdba9287a289c404dfdcb02316f9ffee7d
> > > Author: Stefan Hajnoczi <stefanha(a)redhat.com>
> > > Date: Thu Jan 26 17:01:19 2017 +0000
> > >
> > > iothread: enable AioContext polling by default
> > >
> > > IOThread AioContexts are likely to consist only of event sources like
> > > virtqueue ioeventfds and LinuxAIO completion eventfds that are
pollable
> > > from userspace (without system calls).
> > >
> > > We recently merged the AioContext polling feature but didn't enable
it
> > > by default yet. I have gone back over the performance data on the
> > > mailing list and picked a default polling value that gave good
results.
> > >
> > > Let's enable AioContext polling by default so users don't have
another
> > > switch they need to set manually. If performance regressions are
found
> > > we can still disable this for the QEMU 2.9 release.
> > >
> > > > > The QEMU commits say the tunables all default to sane parameters
so
> > > > > I'm inclined to say we ignore them at the libvirt level
entirely.
> > > >
> > > > Yes, it would be way batter to have only <polling
enable='yes|no'> end let
> > > > QEMU deal with the sane values for all parameters but that would mean
to come
> > > > up with the sane values ourselves or modify QEMU to add another
property that
> > > > would simply control whether it is enabled or not.
> > >
> > > I'm saying don't even add that.
> > >
> > > Do exactly nothing and just rely on the QEMU defaults here. This is not
> > > affecting guest ABI at all so it doesn't matter if QEMU changes its
> > > defaults later. In fact if QEMU changes defaults based on newer
performance
> > > measurements, it is a good thing if libvirt hasn't hardcoded all its
VM
> > > configs to the old default.
> >
> > What if someone would like to disable it even if QEMU thinks that the
> > performance is good? This patch series doesn't hardcode anything into
> > VM config. If you don't set the polling element at all Libvirt will follow
> > the QEMU defaults and only the live XML would contain current state of
> > polling with default values loaded from QEMU.
> >
> > This patch series ads a possibility to explicitly configure the polling if
> > someone want's to do that for some reason, but it also preserve the benefit
> > when you just don't care about it and you want to use the default.
> >
> > If you still think that we should not export this feature at all, well we
> > don't have to. The use-case that you've described is still possible
with
> > this series, it only adds extra functionality on top of that.
>
> I'm very wary of adding config parameters in libvirt just because they
> exist in QEMU, particularly when the parameters are totally specific to
> an algorithm that just happens to be the one implemented right now. We've
> no idea if QEMU will stick with this algorithm or change it entirely, and
> if the latter, then any config parameters will be likely meaningless for
> any other algorithm. I can understand why QEMU would expose them on its
> CLI, but I don't feel they are a good fit for exposing up the stack,
> particularly given a lack of any guidance as to how people might consider
> changing the values, other than random uninformed guesswork.
Yes, that's true and I had the same worrying feeling about the parameters
specifically tied to QEMU algorithm, but I thought that it would be nice
to export them anyway. Let's ignore this series for now and if someone asks
for a feature to disable polling we can revive it.
libvirt doesn't need a dedicated API for <iothread> polling parameters
but the QEMU command-line passthrough feature must make it possible
using <qemu:arg value='-newarg'/>. I have tested the following QEMU
command-line:
-object iothread,id=iothread0 \ # assume libvirt defines the iothreads
-object iothread,id=iothread1 \
-set object.iothread0.poll-max-ns=0 # override poll-max-ns using -set
This disables polling in iothread0 and leaves the default value in iothread1.
I'm fine if libvirt doesn't add a dedicated API for setting <iothread>
polling parameters. It's unlikely that users will need to change the
setting. In an emergency (e.g. disabling it due to a performance
regression) they can use <qemu:arg value='-newarg'/>.
Stefan