On 10/30/18 5:25 AM, Daniel P. Berrangé wrote:
On Mon, Oct 29, 2018 at 05:55:36PM -0400, John Ferlan wrote:
>
>
> On 10/17/18 10:15 AM, Daniel P. Berrangé wrote:
>> MTTCG is the new multi-threaded impl of TCG which follows
>> KVM in having one host OS thread per vCPU. Historically
>> we have discarded all PIDs reported for TCG guests, but
>> we must now selectively honour this data.
>>
>> We don't have anything in the domain XML that indicates
>> whether a guest is using TCG or MTTCG. While QEMU does
>> have an option (-accel tcg,thread=single|multi), it is
>> not desirable to expose this in libvirt. QEMU will
>> automatically use MTTCG when the host/guest architecture
>> pairing is known to be safe. Only developers of QEMU TCG
>> have a strong reason to override this logic.
>>
>> Thus we use two sanity checks to decide if the vCPU
>> PID information is usable. First we see if the PID
>> duplicates the main emulator PID, and second we see
>> if the PID duplicates any other vCPUs.
>>
>> Signed-off-by: Daniel P. Berrangé <berrange(a)redhat.com>
>> ---
>> src/qemu/qemu_domain.c | 81 ++++++++++++++++++++++++++----------------
>> 1 file changed, 51 insertions(+), 30 deletions(-)
>>
>> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
>> index f00f1b3fdb..c7a0c03e3f 100644
>> --- a/src/qemu/qemu_domain.c
>> +++ b/src/qemu/qemu_domain.c
>> @@ -10326,9 +10326,10 @@ qemuDomainRefreshVcpuInfo(virQEMUDriverPtr driver,
>> qemuDomainVcpuPrivatePtr vcpupriv;
>> qemuMonitorCPUInfoPtr info = NULL;
>> size_t maxvcpus = virDomainDefGetVcpusMax(vm->def);
>> - size_t i;
>> + size_t i, j;
>> bool hotplug;
>> bool fast;
>> + bool validTIDs = true;
>> int rc;
>> int ret = -1;
>>
>> @@ -10336,6 +10337,8 @@ qemuDomainRefreshVcpuInfo(virQEMUDriverPtr driver,
>> fast = virQEMUCapsGet(QEMU_DOMAIN_PRIVATE(vm)->qemuCaps,
>> QEMU_CAPS_QUERY_CPUS_FAST);
>>
>> + VIR_DEBUG("Maxvcpus %zu hotplug %d fast query %d", maxvcpus,
hotplug, fast);
>> +
>> if (qemuDomainObjEnterMonitorAsync(driver, vm, asyncJob) < 0)
>> return -1;
>>
>> @@ -10348,39 +10351,57 @@ qemuDomainRefreshVcpuInfo(virQEMUDriverPtr driver,
>> if (rc < 0)
>> goto cleanup;
>>
>> + /*
>> + * The query-cpus[-fast] commands return information
>> + * about the vCPUs, including the OS level PID that
>> + * is executing the vCPU.
>> + *
>> + * For KVM there is always a 1-1 mapping between
>> + * vCPUs and host OS PIDs.
>> + *
>> + * For TCG things are a little more complicated.
>> + *
>> + * - In some cases the vCPUs will all have the same
>> + * PID as the main emulator thread.
>> + * - In some cases the first vCPU will have a distinct
>> + * PID, but other vCPUs will share the emulator thread
>> + *
>> + * For MTTCG, things work the same as KVM, with each
>> + * vCPU getting its own PID.
>> + *
>> + * We use the Host OS PIDs for doing vCPU pinning
>> + * and reporting. The TCG data reporting will result
>> + * in bad behaviour such as pinning the wrong PID.
>> + * We must thus detect and discard bogus PID info
>> + * from TCG, while still honouring the modern MTTCG
>> + * impl which we can support.
>> + */
>> + for (i = 0; i < maxvcpus && validTIDs; i++) {
>> + if (info[i].tid == vm->pid) {
>> + VIR_DEBUG("vCPU[%zu] PID %llu duplicates process",
>> + i, (unsigned long long)info[i].tid);
>> + validTIDs = false;
>> + }
>> +
>
> If !validTIDs does the next loop matter? IOW:
>
> Should the above section add a "continue;" since the loop exit would
> force the exit?
Mostly I wanted to ensure that we logged all the vCPU pids and it
won't impose an unreasonable performance impact by doing so.
Those are only VIR_DEBUG's though, so who really sees them? Perhaps
using VIR_WARN would be different.
> Beyond that the logic and comments look reasonable. I assume
since
> domain XML doesn't care whether MTTCG or TCG is used and things are
> handled under the covers by QEMU that means there's no migration or
> save/restore issues. Of course you have a much deeper understanding of
> the QEMU code than I do!
>
> The one other question I'd have is should validTIDs setting be done just
> once and saved perhaps in the domain private block? There is more than 1
> caller (and *Launch can call twice). It's not like it's going to change,
> right? So doing the same loop from a hotplug path won't matter nor would
> subsequent reconnects or attaches. So perhaps the validTIDs should be a
> tristate that only needs to be checked when the value is UNKNOWN. It's
> not like the loop is that expensive since it's only numeric comparisons,
> so it doesn't matter. I suppose I can be easily convinced taking this
> route would be fine, but figured I'd ask.
AFAIK, this can only be called once per running VM for each libvirtd
run, so I dont see need to do any more advanced caching.
I looked at/for callers of qemuDomainRefreshVcpuInfo - this one stuck
out once the domain is started:
....
qemuProcessLaunch()
...
VIR_DEBUG("setting up hotpluggable cpus");
if (qemuDomainHasHotpluggableStartupVcpus(vm->def)) {
if (qemuDomainRefreshVcpuInfo(driver, vm, asyncJob, false) < 0)
goto cleanup;
if (qemuProcessValidateHotpluggableVcpus(vm->def) < 0)
goto cleanup;
if (qemuProcessSetupHotpluggableVcpus(driver, vm, asyncJob) < 0)
goto cleanup;
}
VIR_DEBUG("Refreshing VCPU info");
if (qemuDomainRefreshVcpuInfo(driver, vm, asyncJob, false) < 0)
goto cleanup;
...
but this is also called during hotplug add/remove and of course
attach/reconnect (which I'd expect).
John