[PATCH 0/3] Add test for virHostCPUGetMap and fix it
I found an old patch that fell between the cracks and to show that it can manifest itself I also added a test in PATCH 3/3 which fails without PATCH 2/3, but it shows the explained case from the 2nd patch's commit message. Can be squashed to any of the other two, this is just easier to see I feel like. Felix Huettner (1): virhostcpu: Fix potential use of unallocated memory Martin Kletzander (2): tests: Extend virhostcputest to check virHostCPUGetMap tests/virhostcpudata: Add test case with many offline CPUs src/util/virhostcpu.c | 22 ++++--- .../linux-many-offline/cpu/online | 1 + .../linux-many-offline/cpu/present | 1 + tests/virhostcputest.c | 57 +++++++++++++++++++ 4 files changed, 69 insertions(+), 12 deletions(-) create mode 100644 tests/virhostcpudata/linux-many-offline/cpu/online create mode 100644 tests/virhostcpudata/linux-many-offline/cpu/present -- 2.53.0
From: Martin Kletzander <mkletzan@redhat.com> This commit goes through all virhostcpudata subdirectories and checks that `virHostCPUGetMap()` returns valid data for each one of them. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- tests/virhostcputest.c | 57 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 57 insertions(+) diff --git a/tests/virhostcputest.c b/tests/virhostcputest.c index 4e65e5febf93..4b31b7944484 100644 --- a/tests/virhostcputest.c +++ b/tests/virhostcputest.c @@ -244,11 +244,41 @@ linuxTestNodeCPUStats(const void *data) } +static int +linuxTestHostCPUGetMap(const void *data G_GNUC_UNUSED) +{ + g_autofree unsigned char *cpumap = NULL; + + int ncpus = virHostCPUGetMap(&cpumap, NULL, 0); + + g_autoptr(virBitmap) actual = virBitmapNewData(cpumap, VIR_DIV_UP(ncpus, 8)); + g_autoptr(virBitmap) expected = NULL; + + if (virFileReadValueBitmap(&expected, "%s/cpu/online", SYSFS_SYSTEM_PATH) < 0) + return -1; + + if (!virBitmapEqual(actual, expected)) { + g_autofree char *expected_str = virBitmapFormat(expected); + g_autofree char *actual_str = virBitmapFormat(actual); + fprintf(stderr, + "Bitmaps are different\nexpected: %s\nactual: %s\n", + expected_str, actual_str); + return -1; + } + + return 0; +} + + static int mymain(void) { int ret = 0; size_t i; + int rc = 0; + struct dirent *ent = NULL; + g_autofree char *datadir = g_strdup_printf("%s/virhostcpudata", abs_srcdir); + g_autoptr(DIR) dir = NULL; const struct linuxTestHostCPUData nodeData[] = { {"test1", VIR_ARCH_X86_64}, {"test1", VIR_ARCH_PPC}, @@ -300,6 +330,33 @@ mymain(void) DO_TEST_CPU_STATS("24cpu", 24, false); DO_TEST_CPU_STATS("24cpu", 25, true); + /* Tests for virHostCPUGetMap() with each data subdirectory. */ + if (virDirOpen(&dir, datadir) < 0) + return -1; + + while ((rc = virDirRead(dir, &ent, datadir)) > 0) { + struct stat sb; + g_autofree char *path = g_strdup_printf("%s/%s", datadir, ent->d_name); + + if (stat(path, &sb) < 0) { + fprintf(stderr, "Cannot stat %s\n", path); + return -1; + } + + if (!S_ISDIR(sb.st_mode)) + continue; + + virFileWrapperAddPrefix(SYSFS_SYSTEM_PATH, path); + if (virTestRun(ent->d_name, linuxTestHostCPUGetMap, NULL) < 0) + ret = -1; + virFileWrapperRemovePrefix(SYSFS_SYSTEM_PATH); + } + + if (rc < 0) { + fprintf(stderr, "Error reading %s\n", SYSFS_SYSTEM_PATH); + return -1; + } + return ret == 0 ? EXIT_SUCCESS : EXIT_FAILURE; } -- 2.53.0
From: Felix Huettner <felix.huettner@stackit.cloud> In case of a host that has a large number of cpus offline the count of host cpus and the last bit set in the virHostCPUGetOnlineBitmap might diverge significantly. This can e.g. be the case when disabling smt via /sys/devices/system/cpu/smt/control. On the host this looks like: ``` $ cat /sys/devices/system/cpu/present 0-255 $ cat /sys/devices/system/cpu/online 0-127 ``` However in this case virBitmapToData previously only allocated 16 bytes for the output bitmap. This is becase the last set bit is on the 15th byte. Users of virHostCPUGetMap however rely on the "cpumap" containing enough space for all existing cpus (so they would expect 32 bytes in this case). E.g. cmdNodeCpuMap relies on this for its output. It will then actually read 32 bytes from the start of the "cpumap" address where in this case the last 16 of these bytes are uninitialized. This manifests itself in flapping outputs of "virsh nodecpumap --pretty" like: ``` $ virsh nodecpumap --pretty CPUs present: 256 CPUs online: 128 CPU map: 0-127,192,194,202 $ virsh nodecpumap --pretty CPUs present: 256 CPUs online: 128 CPU map: 0-127,192,194,197 $ virsh nodecpumap --pretty CPUs present: 256 CPUs online: 128 CPU map: 0-127,192,194,196-197 ``` This in turn potentially causes users of this data to report wrong cpu counts. Note that this only seems to happen with at least 256 physical cpus where at least 128 are offline. We fix this by preallocating the expected bitmap size. Signed-off-by: Felix Huettner <felix.huettner@stackit.cloud> Reviewed-by: Martin Kletzander <mkletzan@redhat.com> --- src/util/virhostcpu.c | 22 ++++++++++------------ 1 file changed, 10 insertions(+), 12 deletions(-) diff --git a/src/util/virhostcpu.c b/src/util/virhostcpu.c index 8688b6ec67a6..870338edad0a 100644 --- a/src/util/virhostcpu.c +++ b/src/util/virhostcpu.c @@ -1148,28 +1148,26 @@ virHostCPUGetMap(unsigned char **cpumap, unsigned int flags) { g_autoptr(virBitmap) cpus = NULL; - int ret = -1; - int dummy; + int ncpus = virHostCPUGetCount(); virCheckFlags(0, -1); if (!cpumap && !online) - return virHostCPUGetCount(); + return ncpus; if (!(cpus = virHostCPUGetOnlineBitmap())) - goto cleanup; + return -1; + + if (cpumap) { + int len = (ncpus + CHAR_BIT) / CHAR_BIT; + *cpumap = g_new0(unsigned char, len); + virBitmapToDataBuf(cpus, *cpumap, len); + } - if (cpumap) - virBitmapToData(cpus, cpumap, &dummy); if (online) *online = virBitmapCountBits(cpus); - ret = virHostCPUGetCount(); - - cleanup: - if (ret < 0 && cpumap) - VIR_FREE(*cpumap); - return ret; + return ncpus; } -- 2.53.0
From: Martin Kletzander <mkletzan@redhat.com> Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- tests/virhostcpudata/linux-many-offline/cpu/online | 1 + tests/virhostcpudata/linux-many-offline/cpu/present | 1 + 2 files changed, 2 insertions(+) create mode 100644 tests/virhostcpudata/linux-many-offline/cpu/online create mode 100644 tests/virhostcpudata/linux-many-offline/cpu/present diff --git a/tests/virhostcpudata/linux-many-offline/cpu/online b/tests/virhostcpudata/linux-many-offline/cpu/online new file mode 100644 index 000000000000..fb812ed1df3c --- /dev/null +++ b/tests/virhostcpudata/linux-many-offline/cpu/online @@ -0,0 +1 @@ +0-127 diff --git a/tests/virhostcpudata/linux-many-offline/cpu/present b/tests/virhostcpudata/linux-many-offline/cpu/present new file mode 100644 index 000000000000..8d232c7839e8 --- /dev/null +++ b/tests/virhostcpudata/linux-many-offline/cpu/present @@ -0,0 +1 @@ +0-255 -- 2.53.0
On a Thursday in 2026, Martin Kletzander via Devel wrote:
I found an old patch that fell between the cracks and to show that it can manifest itself I also added a test in PATCH 3/3 which fails without PATCH 2/3, but it shows the explained case from the 2nd patch's commit message. Can be squashed to any of the other two, this is just easier to see I feel like.
Felix Huettner (1): virhostcpu: Fix potential use of unallocated memory
Martin Kletzander (2): tests: Extend virhostcputest to check virHostCPUGetMap tests/virhostcpudata: Add test case with many offline CPUs
src/util/virhostcpu.c | 22 ++++--- .../linux-many-offline/cpu/online | 1 + .../linux-many-offline/cpu/present | 1 + tests/virhostcputest.c | 57 +++++++++++++++++++ 4 files changed, 69 insertions(+), 12 deletions(-) create mode 100644 tests/virhostcpudata/linux-many-offline/cpu/online create mode 100644 tests/virhostcpudata/linux-many-offline/cpu/present
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano
participants (2)
-
Ján Tomko -
Martin Kletzander