On Fri, 2 Feb 2018 18:09:12 -0200
Eduardo Habkost <ehabkost(a)redhat.com> wrote:
On Fri, Feb 02, 2018 at 01:50:33PM -0500, Luiz Capitulino wrote:
> On Fri, 2 Feb 2018 15:42:49 -0200
> Eduardo Habkost <ehabkost(a)redhat.com> wrote:
>
> > On Fri, Feb 02, 2018 at 05:19:34PM +0100, Viktor Mihajlovski wrote:
> > > On 02.02.2018 17:01, Luiz Capitulino wrote:
> > [...]
> > > > o Make qemuDomainRefreshVcpuHalted() s390-only in libvirt. This by
> > > > itself fixes the original performance issue
> > > We are normally trying to avoid architecture-specific code in libvirt
> > > (not always successfully). We could omit the call, based on a QEMU
> > > Capability derived from the presence of said flag. This would change the
> > > libvirt-client side default to not report halted. A client can then still
> > > request the value via a tbd libvirt flag. Which is what an s390-aware
> > > management app would have to do...
> >
> > The problem I see here is that the current semantics of the
> > "halted" field in QEMU is arch-specific, so either libvirt or
> > upper layers will necessarily need arch-specific code if they
> > want to support QEMU 2.11 or older.
>
> My understanding of this plan is:
>
> 1. Deprecate the "halted" field in query-cpus (that is, make it
> always return halted=false)
I don't think we really need to do this. If we do, this should
be the last step (after libvirt is already using the new
interfaces).
Yeah, I've just started looking at how to implement this plan,
and my first conclusion was to leave the current query-cpus alone.
> 2. Add a new command, say query-cpu-state, which is arch dependent
> and is only available in archs that support sane "halted"
> semantics (I guess we can have per-arch QMP commands, right?)
I don't see why we would make the new command arch-dependent. We
need two new interfaces:
1) A lightweight version of query-cpus that won't interrupt the
VCPUs. This can be a new command, or a new parameter to
query-cpus.
Exactly how I thought of doing it. I prefer a new command so that
we untangle this from query-cpus.
2) An arch-independent way to query "CPU is online" state, as the
existing "halted" field has confusing arch-specific semantics.
Honest question: is it at all possible for QEMU to know a CPU
is online? My impression is that the halted thing is the best
we can do. If we need better than this, then libvirt should use
the guest agent instead.
Btw, I haven't checked all archs yet, but I'm under the impression
the halted thing is confusing on x86 because of the in-kernel irqchip.
Otherwise this state is maintained by QEMU itself.
> 3. Modify libvirt to use query-cpu-state if it's available,
> otherwise use query-cpus (in which case "halted" will be bogus,
> but that's a feature :) )
>
> In essence, we're moving the arch-specific code from libvirt to
> qemu.
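
The fallback in step 3 could be sketched roughly as below. This is only
an illustration, not libvirt code: it assumes the client has already
fetched the command list with QMP's query-commands (a list of objects
with a "name" key), and "query-cpu-state" is just the placeholder name
used in this thread, not an existing command.

```python
# Placeholder name taken from this thread; the final command name may differ.
NEW_COMMAND = "query-cpu-state"
OLD_COMMAND = "query-cpus"


def pick_cpu_query_command(available_commands):
    """Choose which QMP command to use for vCPU state.

    available_commands is assumed to have the shape of the
    query-commands reply: a list of dicts, each with a "name" key.
    """
    names = {entry["name"] for entry in available_commands}
    if NEW_COMMAND in names:
        return NEW_COMMAND
    # Older QEMU: fall back to query-cpus, whose "halted" value
    # has arch-specific meaning.
    return OLD_COMMAND
```

With this shape the decision is made once per QEMU binary, which is
similar in spirit to how a capability flag would be used.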
Your plan above covers what will happen when using newer QEMU
versions, but libvirt still needs to work sanely if running QEMU
2.11. My suggestion is that libvirt not run query-cpus to ask
for the "halted" field on any architecture except s390.
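
A minimal sketch of that gate for QEMU 2.11 and older, under stated
assumptions: the function name is hypothetical, and the architecture
strings "s390" and "s390x" are assumed spellings, not taken from this
thread.

```python
# Architectures where the legacy "halted" field is considered worth
# the cost of running query-cpus (assumed spellings).
HALTED_ARCHES = frozenset({"s390", "s390x"})


def should_query_halted(arch):
    """Return True only where query-cpus should be run for "halted"."""
    return arch in HALTED_ARCHES
```

On every other architecture the caller would skip query-cpus entirely,
which avoids interrupting the VCPUs.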
My current plan is to ask libvirt to completely remove query-cpus
usage, independent of the arch, and use the new command instead.