On Mon, 5 Feb 2018 17:10:18 +0100
Viktor Mihajlovski <mihajlov(a)linux.vnet.ibm.com> wrote:
On 05.02.2018 16:37, Luiz Capitulino wrote:
> On Mon, 5 Feb 2018 13:47:27 +0000
> Daniel P. Berrangé <berrange(a)redhat.com> wrote:
>
>> On Mon, Feb 05, 2018 at 02:43:15PM +0100, Viktor Mihajlovski wrote:
>>> On 02.02.2018 21:41, Eduardo Habkost wrote:
>>>> On Fri, Feb 02, 2018 at 03:19:45PM -0500, Luiz Capitulino wrote:
>>>>> On Fri, 2 Feb 2018 18:09:12 -0200
>>>>> Eduardo Habkost <ehabkost(a)redhat.com> wrote:
>>>> [...]
>>>>>> Your plan above covers what will happen when using newer QEMU
>>>>>> versions, but libvirt still needs to work sanely if running QEMU
>>>>>> 2.11. My suggestion is that libvirt do not run query-cpus to ask
>>>>>> for the "halted" field on any architecture except s390.
>>>>>
>>>>> My current plan is to ask libvirt to completely remove query-cpus
>>>>> usage, independent of the arch and use the new command instead.
>>>>
>>>> This would be a regression for people running QEMU 2.11 on s390.
>>>>
>>>> (But maybe it would be an acceptable regression? Viktor, what do
>>>> you think? Are there production releases of management systems
>>>> that already rely on vcpu.<n>.halted?)
>>>>
>>> Unfortunately, there's code out there looking at vcpu.<n>.halted. I've
>>> informed the product team about the issue.
>>>
>>> If we drop/deprecate vcpu.<n>.halted from the domain statistics, this
>>> should be done for all arches, if there's a replacement mechanism (i.e.
>>> new VCPU states). As a stop-gap measure we can make the call
>>> arch-dependent until the new stuff is in place.
>>
>> Yes, I think libvirt should just restrict this 'halted' feature reporting
>> to s390 only, since the other archs have different semantics for this
>> item, and the s390 semantics are the ones we want.
>
> From this whole discussion, there's only one thing that I still don't
> understand (in a very honest way): what makes s390 halted semantics
> different?

One problem is that using the halted property to indicate that the CPU
has assumed the architected disabled wait state may not have been the
wisest decision (my fault). If the CPU enters disabled wait, it will
stay inactive until it is explicitly restarted which is different on x86.

Ah, OK. So, s390 does indeed have different semantics.

> By quickly looking at the code, it seems to be very like the x86 one
> when in kernel irqchip is not used: if a guest vCPU executes HLT, the
> vCPU exits to userspace and qemu will put the vCPU thread to sleep.
> This is the semantics I'd expect for HLT, and maybe for all archs.
>
> What makes x86 different, is when the in kernel irqchip is used (which
> should be the default with libvirt). In this case, the vCPU thread avoids
> exiting to user-space. So, qemu doesn't know the vCPU halted.
>
> That's only one of the reasons why query-cpus forces vCPUs to user-space.
> But there are other reasons, and that's why even on s390 query-cpus
> will also force vCPUs to user-space, which means s390 has the same perf
> issue but maybe this hasn't been detected yet.
>
> For the immediate term, I still think we should have a query-cpus
> replacement that doesn't cause vCPUs to go to userspace. I'll work on
> this this week.

FWIW: I'm currently exploring an extension to query-cpus to report
s390-specific information, allowing us to ditch halted in the long run.
Further, I'm considering a new QAPI event along the lines of "CPU info
has changed" allowing QEMU to announce low-frequency changes of CPU
state (as is the case for s390) and finally wire up a handler in libvirt
to update a tbd. property (!= halted).

I very much prefer adding a replacement for query-cpus, which works
for all archs and which doesn't have any performance impact.
>
> However, IMHO, what we really want is to add an API to the guest agent
> to export the CPU online bit from the guest userspace sysfs. This will
> give the ultimate semantics and move us away from this halted mess.
>