[PATCH 0/3] CI/test improvements in qemuxml2argvtest

Hi, This started as a saga of figuring out why Libvirt was failing to build in a Power 9 host after the switch to meson. Turns out that one of the tests, qemuxml2argvtest, is taking almost 5 minutes to complete, and meson throws a timeout error. This wasn't noticed before because the test actually completes without errors. Running 'perf' showed that 95% of the time spent in qemuxml2argvtest in Power 9 were in virHostCPUGetMicrocodeVersion() calls, which is not needed for PowerPC since 'microcode' is x86 specific. This means that we're calling 'show_cpuinfo' from the kernel, fetching dynamic data from all CPUs, and then failing because there is no 'microcode' string. The Power 9 host I tested with has 128 CPUs, meaning that 'show_cpuinfo' is not something to scoff at. First patch is a trivial cleanup. Patch 02 is a runtime fix. The cake is in patch 03. Patch 03 will improve CI on all archs, including x86, which is nice. Daniel Henrique Barboza (3): virhostcpu.c: modernize virHostCPUGetMicrocodeVersion() virhostcpu.c: skip non x86 hosts in virHostCPUGetMicrocodeVersion() domaincapsmoc.c: mock virHostCPUGetMicrocodeVersion() src/cpu/cpu_x86.c | 2 +- src/qemu/qemu_capabilities.c | 4 ++-- src/util/virhostcpu.c | 13 +++++++------ src/util/virhostcpu.h | 3 ++- tests/domaincapsmock.c | 6 ++++++ 5 files changed, 18 insertions(+), 10 deletions(-) -- 2.26.2

Use g_autofree and remove the cleanup label. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- src/util/virhostcpu.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/src/util/virhostcpu.c b/src/util/virhostcpu.c index 6cea75536d..d7aa39c131 100644 --- a/src/util/virhostcpu.c +++ b/src/util/virhostcpu.c @@ -1241,7 +1241,7 @@ virHostCPUGetKVMMaxVCPUs(void) unsigned int virHostCPUGetMicrocodeVersion(void) { - char *outbuf = NULL; + g_autofree char *outbuf = NULL; char *cur; unsigned int version = 0; @@ -1254,16 +1254,14 @@ virHostCPUGetMicrocodeVersion(void) /* Account for format 'microcode : XXXX'*/ if (!(cur = strstr(outbuf, "microcode")) || !(cur = strchr(cur, ':'))) - goto cleanup; + return 0; cur++; /* Linux places the microcode revision in a 32-bit integer, so * ui is fine for us too. */ if (virStrToLong_ui(cur, &cur, 0, &version) < 0) - goto cleanup; + return 0; - cleanup: - VIR_FREE(outbuf); return version; } -- 2.26.2

On a Monday in 2020, Daniel Henrique Barboza wrote:
Use g_autofree and remove the cleanup label.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- src/util/virhostcpu.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

Non-x86 archs does not have a 'microcode' version like x86. This is covered already inside the function - just return 0 if no microcode is found. Regardless of that, a read of /proc/cpuinfo is always made. Each read will invoke the kernel to fill in the CPU details every time. Now let's consider a non-x86 host, like a Power 9 server with 128 CPUs. Each /proc/cpuinfo read will need to fetch data for each CPU and it won't even matter because we know beforehand that PowerPC chips don't have microcode information. We can do better for non-x86 hosts by skipping this process entirely. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- src/cpu/cpu_x86.c | 2 +- src/qemu/qemu_capabilities.c | 4 ++-- src/util/virhostcpu.c | 5 ++++- src/util/virhostcpu.h | 2 +- 4 files changed, 8 insertions(+), 5 deletions(-) diff --git a/src/cpu/cpu_x86.c b/src/cpu/cpu_x86.c index 1e5cd93abb..aa982589be 100644 --- a/src/cpu/cpu_x86.c +++ b/src/cpu/cpu_x86.c @@ -2759,7 +2759,7 @@ virCPUx86GetHost(virCPUDefPtr cpu, } ret = x86DecodeCPUData(cpu, cpuData, models); - cpu->microcodeVersion = virHostCPUGetMicrocodeVersion(); + cpu->microcodeVersion = virHostCPUGetMicrocodeVersion(cpuData->arch); /* Probing for TSC frequency makes sense only if the CPU supports * invariant TSC (Linux calls this constant_tsc in /proc/cpuinfo). */ diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index ff6ba8c9e9..da743ecbe0 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -5502,7 +5502,7 @@ virQEMUCapsNewData(const char *binary, priv->runUid, priv->runGid, priv->hostCPUSignature, - virHostCPUGetMicrocodeVersion(), + virHostCPUGetMicrocodeVersion(priv->hostArch), priv->kernelVersion); } @@ -5636,7 +5636,7 @@ virQEMUCapsCacheLookup(virFileCachePtr cache, virQEMUCapsCachePrivPtr priv = virFileCacheGetPriv(cache); virQEMUCapsPtr ret = NULL; - priv->microcodeVersion = virHostCPUGetMicrocodeVersion(); + priv->microcodeVersion = virHostCPUGetMicrocodeVersion(priv->hostArch); ret = virFileCacheLookup(cache, binary); diff --git a/src/util/virhostcpu.c b/src/util/virhostcpu.c index d7aa39c131..aafc3a14c6 100644 --- a/src/util/virhostcpu.c +++ b/src/util/virhostcpu.c @@ -1239,12 +1239,15 @@ virHostCPUGetKVMMaxVCPUs(void) * some reason. */ unsigned int -virHostCPUGetMicrocodeVersion(void) +virHostCPUGetMicrocodeVersion(virArch hostArch) { g_autofree char *outbuf = NULL; char *cur; unsigned int version = 0; + if (!ARCH_IS_X86(hostArch)) + return 0; + if (virFileReadHeaderQuiet(CPUINFO_PATH, 4096, &outbuf) < 0) { VIR_DEBUG("Failed to read microcode version from %s: %s", CPUINFO_PATH, g_strerror(errno)); diff --git a/src/util/virhostcpu.h b/src/util/virhostcpu.h index d07503857e..11cbcd72a3 100644 --- a/src/util/virhostcpu.h +++ b/src/util/virhostcpu.h @@ -75,7 +75,7 @@ virBitmapPtr virHostCPUGetSiblingsList(unsigned int cpu); int virHostCPUGetOnline(unsigned int cpu, bool *online); -unsigned int virHostCPUGetMicrocodeVersion(void); +unsigned int virHostCPUGetMicrocodeVersion(virArch hostArch); int virHostCPUGetMSR(unsigned long index, uint64_t *msr); -- 2.26.2

On a Monday in 2020, Daniel Henrique Barboza wrote:
Non-x86 archs does not have a 'microcode' version like x86. This is covered already inside the function - just return 0 if no microcode is found. Regardless of that, a read of /proc/cpuinfo is always made. Each read will invoke the kernel to fill in the CPU details every time.
Now let's consider a non-x86 host, like a Power 9 server with 128 CPUs. Each /proc/cpuinfo read will need to fetch data for each CPU and it won't even matter because we know beforehand that PowerPC chips don't have microcode information.
We can do better for non-x86 hosts by skipping this process entirely.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- src/cpu/cpu_x86.c | 2 +- src/qemu/qemu_capabilities.c | 4 ++-- src/util/virhostcpu.c | 5 ++++- src/util/virhostcpu.h | 2 +- 4 files changed, 8 insertions(+), 5 deletions(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

On a Monday in 2020, Daniel Henrique Barboza wrote:
Non-x86 archs does not have a 'microcode' version like x86. This is covered already inside the function - just return 0 if no microcode is found. Regardless of that, a read of /proc/cpuinfo is always made. Each read will invoke the kernel to fill in the CPU details every time.
Now let's consider a non-x86 host, like a Power 9 server with 128 CPUs. Each /proc/cpuinfo read will need to fetch data for each CPU and it won't even matter because we know beforehand that PowerPC chips don't have microcode information.
We can do better for non-x86 hosts by skipping this process entirely.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- src/cpu/cpu_x86.c | 2 +- src/qemu/qemu_capabilities.c | 4 ++-- src/util/virhostcpu.c | 5 ++++- src/util/virhostcpu.h | 2 +- 4 files changed, 8 insertions(+), 5 deletions(-)
diff --git a/src/cpu/cpu_x86.c b/src/cpu/cpu_x86.c index 1e5cd93abb..aa982589be 100644 --- a/src/cpu/cpu_x86.c +++ b/src/cpu/cpu_x86.c @@ -2759,7 +2759,7 @@ virCPUx86GetHost(virCPUDefPtr cpu, }
ret = x86DecodeCPUData(cpu, cpuData, models); - cpu->microcodeVersion = virHostCPUGetMicrocodeVersion(); + cpu->microcodeVersion = virHostCPUGetMicrocodeVersion(cpuData->arch);
/* Probing for TSC frequency makes sense only if the CPU supports * invariant TSC (Linux calls this constant_tsc in /proc/cpuinfo). */ diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index ff6ba8c9e9..da743ecbe0 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -5502,7 +5502,7 @@ virQEMUCapsNewData(const char *binary, priv->runUid, priv->runGid, priv->hostCPUSignature, - virHostCPUGetMicrocodeVersion(), + virHostCPUGetMicrocodeVersion(priv->hostArch), priv->kernelVersion); }
@@ -5636,7 +5636,7 @@ virQEMUCapsCacheLookup(virFileCachePtr cache, virQEMUCapsCachePrivPtr priv = virFileCacheGetPriv(cache); virQEMUCapsPtr ret = NULL;
- priv->microcodeVersion = virHostCPUGetMicrocodeVersion(); + priv->microcodeVersion = virHostCPUGetMicrocodeVersion(priv->hostArch);
ret = virFileCacheLookup(cache, binary);
diff --git a/src/util/virhostcpu.c b/src/util/virhostcpu.c index d7aa39c131..aafc3a14c6 100644 --- a/src/util/virhostcpu.c +++ b/src/util/virhostcpu.c @@ -1239,12 +1239,15 @@ virHostCPUGetKVMMaxVCPUs(void) * some reason. */ unsigned int -virHostCPUGetMicrocodeVersion(void) +virHostCPUGetMicrocodeVersion(virArch hostArch)
The corresponding non-Linux stub also needed the argument change. I fixed that before pushing. Jano
{ g_autofree char *outbuf = NULL; char *cur; unsigned int version = 0;
+ if (!ARCH_IS_X86(hostArch)) + return 0; + if (virFileReadHeaderQuiet(CPUINFO_PATH, 4096, &outbuf) < 0) { VIR_DEBUG("Failed to read microcode version from %s: %s", CPUINFO_PATH, g_strerror(errno));

Previous patch handled the runtime case where a non-x86 host is fetching /proc/cpuinfo data for a microcode info that we know it doesn't exist. This change alone speeded everything by a bit for non-x86, but there is at least one major culprit left. qemuxml2argvtest does several arch-specific tests, and a good chunk of them are x86 exclusive. This means that 'hostArch' will be seen as x86 for these tests, even when running in non-x86 hosts. In a Power 9 server with 128 CPUs, qemuxmlargvtest takes 298 seconds to complete in average, and 'perf record' indicates that 95% of the time is spent in virHostCPUGetMicrocodeVersion(). This patch mocks virHostCPUGetMicrocodeVersion() to always return 0 in the tests, avoiding /proc/cpuinfo reads. This will make all tests behave arch-agnostic, and the microcode value being 0 has no impact on any existing test. This is a CI speed across the board for all archs, including x86, given that we're not reading /proc/cpuinfo in the tests. For a Thinkpad T480 laptop with 8 Intel i7 CPUs, qemuxml2argvtest went from 15.50 sec to 12.50 seconds. The performance gain is even more noticeable for huge servers with lots of CPUs. For the Power 9 server mentioned above, this patch speeds qemuxml2argvtest to 9 seconds, down from 298 sec. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- src/util/virhostcpu.h | 3 ++- tests/domaincapsmock.c | 6 ++++++ 2 files changed, 8 insertions(+), 1 deletion(-) diff --git a/src/util/virhostcpu.h b/src/util/virhostcpu.h index 11cbcd72a3..6148f9e66f 100644 --- a/src/util/virhostcpu.h +++ b/src/util/virhostcpu.h @@ -75,7 +75,8 @@ virBitmapPtr virHostCPUGetSiblingsList(unsigned int cpu); int virHostCPUGetOnline(unsigned int cpu, bool *online); -unsigned int virHostCPUGetMicrocodeVersion(virArch hostArch); +unsigned int +virHostCPUGetMicrocodeVersion(virArch hostArch) G_GNUC_NO_INLINE; int virHostCPUGetMSR(unsigned long index, uint64_t *msr); diff --git a/tests/domaincapsmock.c b/tests/domaincapsmock.c index 90e17c19f0..d81a898dc0 100644 --- a/tests/domaincapsmock.c +++ b/tests/domaincapsmock.c @@ -34,3 +34,9 @@ virHostCPUGetKVMMaxVCPUs(void) { return INT_MAX; } + +unsigned int +virHostCPUGetMicrocodeVersion(virArch hostArch G_GNUC_UNUSED) +{ + return 0; +} -- 2.26.2

s/capsmoc.c/capsmock/ On a Monday in 2020, Daniel Henrique Barboza wrote:
Previous patch handled the runtime case where a non-x86 host is fetching /proc/cpuinfo data for a microcode info that we know it doesn't exist. This change alone speeded everything by a bit for non-x86, but there is at least one major culprit left.
qemuxml2argvtest does several arch-specific tests, and a good chunk of them are x86 exclusive. This means that 'hostArch' will be seen as x86 for these tests, even when running in non-x86 hosts. In a Power 9 server with 128 CPUs, qemuxmlargvtest
*qemuxml2argvtest
takes 298 seconds to complete in average, and 'perf record' indicates that 95% of the time is spent in virHostCPUGetMicrocodeVersion().
No surprise, we call it 23448 times.
This patch mocks virHostCPUGetMicrocodeVersion() to always return 0 in the tests, avoiding /proc/cpuinfo reads. This will make all tests behave arch-agnostic, and the microcode value being 0 has no impact on any existing test.
This is a CI speed across the board for all archs, including x86, given that we're not reading /proc/cpuinfo in the tests. For a Thinkpad T480 laptop with 8 Intel i7 CPUs, qemuxml2argvtest went from 15.50 sec to 12.50 seconds. The performance gain is even more noticeable for huge servers with lots of CPUs. For the Power 9 server mentioned above, this patch speeds qemuxml2argvtest to 9 seconds, down from 298 sec.
I assume the speedup for Power is measured for the whole series, not just for this patch.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- src/util/virhostcpu.h | 3 ++- tests/domaincapsmock.c | 6 ++++++ 2 files changed, 8 insertions(+), 1 deletion(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

On 8/24/20 11:36 AM, Ján Tomko wrote:
s/capsmoc.c/capsmock/
On a Monday in 2020, Daniel Henrique Barboza wrote:
Previous patch handled the runtime case where a non-x86 host is fetching /proc/cpuinfo data for a microcode info that we know it doesn't exist. This change alone speeded everything by a bit for non-x86, but there is at least one major culprit left.
qemuxml2argvtest does several arch-specific tests, and a good chunk of them are x86 exclusive. This means that 'hostArch' will be seen as x86 for these tests, even when running in non-x86 hosts. In a Power 9 server with 128 CPUs, qemuxmlargvtest
*qemuxml2argvtest
takes 298 seconds to complete in average, and 'perf record' indicates that 95% of the time is spent in virHostCPUGetMicrocodeVersion().
No surprise, we call it 23448 times.
This patch mocks virHostCPUGetMicrocodeVersion() to always return 0 in the tests, avoiding /proc/cpuinfo reads. This will make all tests behave arch-agnostic, and the microcode value being 0 has no impact on any existing test.
This is a CI speed across the board for all archs, including x86, given that we're not reading /proc/cpuinfo in the tests. For a Thinkpad T480 laptop with 8 Intel i7 CPUs, qemuxml2argvtest went from 15.50 sec to 12.50 seconds. The performance gain is even more noticeable for huge servers with lots of CPUs. For the Power 9 server mentioned above, this patch speeds qemuxml2argvtest to 9 seconds, down from 298 sec.
I assume the speedup for Power is measured for the whole series, not just for this patch.
Yes, for the whole series, for both Power and my x86 laptop. Should've make it clearer in the commit msg. Thanks, DHB
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- src/util/virhostcpu.h | 3 ++- tests/domaincapsmock.c | 6 ++++++ 2 files changed, 8 insertions(+), 1 deletion(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Jano
participants (2)
-
Daniel Henrique Barboza
-
Ján Tomko