On 10/29/2018 10:55 PM, John Ferlan wrote:
On 10/17/18 10:15 AM, Daniel P. Berrangé wrote:
> MTTCG is the new multi-threaded impl of TCG which follows
> KVM in having one host OS thread per vCPU. Historically
> we have discarded all PIDs reported for TCG guests, but
> we must now selectively honour this data.
>
> We don't have anything in the domain XML that indicates
> whether a guest is using TCG or MTTCG. While QEMU does
> have an option (-accel tcg,thread=single|multi), it is
> not desirable to expose this in libvirt. QEMU will
> automatically use MTTCG when the host/guest architecture
> pairing is known to be safe. Only developers of QEMU TCG
> have a strong reason to override this logic.
>
> Thus we use two sanity checks to decide if the vCPU
> PID information is usable. First we see if the PID
> duplicates the main emulator PID, and second we see
> if the PID duplicates any other vCPUs.
>
> Signed-off-by: Daniel P. Berrangé <berrange(a)redhat.com>
> ---
> src/qemu/qemu_domain.c | 81 ++++++++++++++++++++++++++----------------
> 1 file changed, 51 insertions(+), 30 deletions(-)
>
> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
> index f00f1b3fdb..c7a0c03e3f 100644
> --- a/src/qemu/qemu_domain.c
> +++ b/src/qemu/qemu_domain.c
> @@ -10326,9 +10326,10 @@ qemuDomainRefreshVcpuInfo(virQEMUDriverPtr driver,
> qemuDomainVcpuPrivatePtr vcpupriv;
> qemuMonitorCPUInfoPtr info = NULL;
> size_t maxvcpus = virDomainDefGetVcpusMax(vm->def);
> - size_t i;
> + size_t i, j;
> bool hotplug;
> bool fast;
> + bool validTIDs = true;
> int rc;
> int ret = -1;
>
> @@ -10336,6 +10337,8 @@ qemuDomainRefreshVcpuInfo(virQEMUDriverPtr driver,
> fast = virQEMUCapsGet(QEMU_DOMAIN_PRIVATE(vm)->qemuCaps,
> QEMU_CAPS_QUERY_CPUS_FAST);
>
> + VIR_DEBUG("Maxvcpus %zu hotplug %d fast query %d", maxvcpus, hotplug,
fast);
> +
> if (qemuDomainObjEnterMonitorAsync(driver, vm, asyncJob) < 0)
> return -1;
>
> @@ -10348,39 +10351,57 @@ qemuDomainRefreshVcpuInfo(virQEMUDriverPtr driver,
> if (rc < 0)
> goto cleanup;
>
> + /*
> + * The query-cpus[-fast] commands return information
> + * about the vCPUs, including the OS level PID that
> + * is executing the vCPU.
> + *
> + * For KVM there is always a 1-1 mapping between
> + * vCPUs and host OS PIDs.
> + *
> + * For TCG things are a little more complicated.
> + *
> + * - In some cases the vCPUs will all have the same
> + * PID as the main emulator thread.
> + * - In some cases the first vCPU will have a distinct
> + * PID, but other vCPUs will share the emulator thread
> + *
> + * For MTTCG, things work the same as KVM, with each
> + * vCPU getting its own PID.
> + *
> + * We use the Host OS PIDs for doing vCPU pinning
> + * and reporting. The TCG data reporting will result
> + * in bad behaviour such as pinning the wrong PID.
> + * We must thus detect and discard bogus PID info
> + * from TCG, while still honouring the modern MTTCG
> + * impl which we can support.
> + */
> + for (i = 0; i < maxvcpus && validTIDs; i++) {
> + if (info[i].tid == vm->pid) {
> + VIR_DEBUG("vCPU[%zu] PID %llu duplicates process",
> + i, (unsigned long long)info[i].tid);
> + validTIDs = false;
> + }
> +
If !validTIDs does the next loop matter? IOW:
Should the above section add a "continue;" since the loop exit would
force the exit?
It should break rather than continue. Even a single mismatch is enough
for us to not believe qemu reply. At any rate, that is just an
optimization and since nobody uses TCG with dozen or more vCPUs the
O(n^2) algorithm is not that bad.
Beyond that the logic and comments look reasonable. I assume since
domain XML doesn't care whether MTTCG or TCG is used and things are
handled under the covers by QEMU that means there's no migration or
save/restore issues. Of course you have a much deeper understanding of
the QEMU code than I do!
The one other question I'd have is should validTIDs setting be done just
once and saved perhaps in the domain private block? There is more than 1
caller (and *Launch can call twice). It's not like it's going to change,
right? So doing the same loop from a hotplug path won't matter nor would
subsequent reconnects or attaches. So perhaps the validTIDs should be a
tristate that only needs to be checked when the value is UNKNOWN. It's
not like the loop is that expensive since it's only numeric comparisons,
so it doesn't matter. I suppose I can be easily convinced taking this
route would be fine, but figured I'd ask.
Maybe we can save this, but I think it can be done in a separate patch.
ACK
Michal