[libvirt] [PATCHv3 0/4] Use nodeGetCPUCount/virNodeGetCPUMap where appropriate

This series concludes the introduction of the virNodeGetCPUMap API by replacing the usage of virNodeGetInfo for computing the maximum number of node CPUs with either nodeGetCPUCount or virNodeGetCPUMap. It's basically a resend of the original 3-patch series plus the nodeGetCPUCount fix. Event in the light of Peter Krempa's recent CPU topology fix this series is still relevant. 1/4 is needed to support older kernels 2/4 is not strictly required but is more efficient 3/4 and 4/4 are required for correct function of vcpu pinning calls of python and virsh clients when talking to libvirtd 1.0.0. V3 Changes: - pulled in the nodeGetCPUCount patch for older kernels - changed commit message of 2/4 V2 Changes: Rework based on Eric's feedback: - Use nodeGetCPUCount instead of nodeGetCPUMap - Avoid code bloat by computing node CPU count in a helper function Viktor Mihajlovski (4): nodeinfo: enable nodeGetCPUCount for older kernels qemu, lxc: Change host CPU number detection logic. python: Use virNodeGetCPUMap where possible virsh: Use virNodeGetCPUMap if possible python/libvirt-override.c | 87 ++++++++++++++++++++++++++++------------------- src/lxc/lxc_controller.c | 8 ++--- src/nodeinfo.c | 33 +++++++++++++++--- src/qemu/qemu_driver.c | 14 +++----- src/qemu/qemu_process.c | 8 ++--- tools/virsh-domain.c | 32 ++++++++++++----- 6 files changed, 114 insertions(+), 68 deletions(-) -- 1.7.12.4

Since /sys/devices/system/cpu/present is not available on older kernels like on RHEL 5.x nodeGetCPUCount will fail there. The fallback implemented is to scan for /sys/devices/system/cpu/cpuNN entries. Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com> --- src/nodeinfo.c | 33 ++++++++++++++++++++++++++++----- 1 file changed, 28 insertions(+), 5 deletions(-) diff --git a/src/nodeinfo.c b/src/nodeinfo.c index e3d7cfe..cea3775 100644 --- a/src/nodeinfo.c +++ b/src/nodeinfo.c @@ -978,11 +978,34 @@ int nodeGetCPUCount(void) { #ifdef __linux__ - /* XXX should we also work on older kernels, like RHEL5, that lack - * cpu/present and cpu/online files? Those kernels also lack cpu - * hotplugging, so it would be a matter of finding the largest - * cpu/cpuNN directory, and returning NN + 1 */ - return linuxParseCPUmax(SYSFS_SYSTEM_PATH "/cpu/present"); + /* to support older kernels, like RHEL5, that lack + * cpu/present we fall back to count cpu/cpuNN + * entries. + */ + char *cpupath = NULL; + int i = 0; + + if (virFileExists(SYSFS_SYSTEM_PATH "/cpu/present")) { + i = linuxParseCPUmax(SYSFS_SYSTEM_PATH "/cpu/present"); + } else if (virFileExists(SYSFS_SYSTEM_PATH "/cpu/cpu0")) { + do { + i++; + VIR_FREE(cpupath); + if (virAsprintf(&cpupath, "%s/cpu/cpu%d", + SYSFS_SYSTEM_PATH, i) < 0) { + virReportOOMError(); + return -1; + } + } while (virFileExists(cpupath)); + } else { + /* no cpu/cpu0: we give up */ + virReportError(VIR_ERR_NO_SUPPORT, "%s", + _("host cpu counting not supported on this node")); + return -1; + } + + VIR_FREE(cpupath); + return i; #else virReportError(VIR_ERR_NO_SUPPORT, "%s", _("host cpu counting not implemented on this platform")); -- 1.7.12.4

On 11/13/2012 05:54 AM, Viktor Mihajlovski wrote:
Since /sys/devices/system/cpu/present is not available on older kernels like on RHEL 5.x nodeGetCPUCount will fail there. The fallback implemented is to scan for /sys/devices/system/cpu/cpuNN entries.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com> --- src/nodeinfo.c | 33 ++++++++++++++++++++++++++++----- 1 file changed, 28 insertions(+), 5 deletions(-)
#ifdef __linux__ - /* XXX should we also work on older kernels, like RHEL5, that lack - * cpu/present and cpu/online files? Those kernels also lack cpu - * hotplugging, so it would be a matter of finding the largest - * cpu/cpuNN directory, and returning NN + 1 */ - return linuxParseCPUmax(SYSFS_SYSTEM_PATH "/cpu/present"); + /* to support older kernels, like RHEL5, that lack + * cpu/present we fall back to count cpu/cpuNN + * entries. + */ + char *cpupath = NULL; + int i = 0; + + if (virFileExists(SYSFS_SYSTEM_PATH "/cpu/present")) { + i = linuxParseCPUmax(SYSFS_SYSTEM_PATH "/cpu/present"); + } else if (virFileExists(SYSFS_SYSTEM_PATH "/cpu/cpu0")) { + do { + i++; + VIR_FREE(cpupath); + if (virAsprintf(&cpupath, "%s/cpu/cpu%d", + SYSFS_SYSTEM_PATH, i) < 0) { + virReportOOMError(); + return -1; + } + } while (virFileExists(cpupath));
The assumption here is that any kernel lacking cpu/present also lacks hotplug, and therefore the cpuNN will be consecutive and we aren't going to miss anything. You removed that assumption from the comment, but I think it is important to leave in. ACK. I'm firing up my RHEL 5 VM to test this before I push with this comment change, but it looks sane. diff --git i/src/nodeinfo.c w/src/nodeinfo.c index cea3775..4589b77 100644 --- i/src/nodeinfo.c +++ w/src/nodeinfo.c @@ -978,9 +978,10 @@ int nodeGetCPUCount(void) { #ifdef __linux__ - /* to support older kernels, like RHEL5, that lack - * cpu/present we fall back to count cpu/cpuNN - * entries. + /* To support older kernels that lack cpu/present, such as 2.6.18 + * in RHEL5, we fall back to count cpu/cpuNN entries; this assumes + * that such kernels also lack hotplug, and therefore cpu/cpuNN + * will be consecutive. */ char *cpupath = NULL; int i = 0; -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 11/15/2012 04:43 AM, Eric Blake wrote:
The assumption here is that any kernel lacking cpu/present also lacks hotplug, and therefore the cpuNN will be consecutive and we aren't going to miss anything. You removed that assumption from the comment, but I think it is important to leave in.
ACK. I'm firing up my RHEL 5 VM to test this before I push with this comment change, but it looks sane.
FWIW: I also run a 64-bit CentOS 5.8 and it supports hotplugging despite the lack of cpu/present. Only cpu/cpu0 cannot be hotplugged (no cpu/cpu0/online pseudofile). Still the cpu/cpuNN directories are consecutive as they are present even if the respective CPUs are offline. -- Mit freundlichen Grüßen/Kind Regards Viktor Mihajlovski IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Martin Jetter Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294

The drivers for QEMU and LXC use virNodeGetInfo only to determine the number of host CPUs. On Linux hosts nodeGetCPUCount has less overhead. --- V3 Changes: Commit message wording, since Peter's fix, nodeGetInfo will return correct/consistent topology. Still, this here is more efficient. V2 Changes: Implemented Eric Blake's suggestion to remove code bloat introduced by first patch version. New helper function getPyNodeCPUCount is now used to calculate the number of node CPUs. src/lxc/lxc_controller.c | 8 +++----- src/qemu/qemu_driver.c | 14 +++++--------- src/qemu/qemu_process.c | 8 +++----- 3 files changed, 11 insertions(+), 19 deletions(-) diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c index a41c903..ed3d3d0 100644 --- a/src/lxc/lxc_controller.c +++ b/src/lxc/lxc_controller.c @@ -492,17 +492,15 @@ static int virLXCControllerSetupNUMAPolicy(virLXCControllerPtr ctrl) static int virLXCControllerSetupCpuAffinity(virLXCControllerPtr ctrl) { int hostcpus, maxcpu = CPU_SETSIZE; - virNodeInfo nodeinfo; virBitmapPtr cpumap, cpumapToSet; VIR_DEBUG("Setting CPU affinity"); - if (nodeGetInfo(NULL, &nodeinfo) < 0) - return -1; - /* setaffinity fails if you set bits for CPUs which * aren't present, so we have to limit ourselves */ - hostcpus = VIR_NODEINFO_MAXCPUS(nodeinfo); + if ((hostcpus = nodeGetCPUCount()) < 0) + return -1; + if (maxcpu > hostcpus) maxcpu = hostcpus; diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 5556f1e..826efe6 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -4170,7 +4170,6 @@ qemudDomainGetVcpuPinInfo(virDomainPtr dom, struct qemud_driver *driver = dom->conn->privateData; virDomainObjPtr vm = NULL; - virNodeInfo nodeinfo; virDomainDefPtr targetDef = NULL; int ret = -1; int maxcpu, hostcpus, vcpu, pcpu; @@ -4206,9 +4205,9 @@ qemudDomainGetVcpuPinInfo(virDomainPtr dom, /* Coverity didn't realize that targetDef must be set if we got here. */ sa_assert(targetDef); - if (nodeGetInfo(dom->conn, &nodeinfo) < 0) + if ((hostcpus = nodeGetCPUCount()) < 0) goto cleanup; - hostcpus = VIR_NODEINFO_MAXCPUS(nodeinfo); + maxcpu = maplen * 8; if (maxcpu > hostcpus) maxcpu = hostcpus; @@ -4421,7 +4420,6 @@ qemudDomainGetEmulatorPinInfo(virDomainPtr dom, { struct qemud_driver *driver = dom->conn->privateData; virDomainObjPtr vm = NULL; - virNodeInfo nodeinfo; virDomainDefPtr targetDef = NULL; int ret = -1; int maxcpu, hostcpus, pcpu; @@ -4453,9 +4451,9 @@ qemudDomainGetEmulatorPinInfo(virDomainPtr dom, /* Coverity didn't realize that targetDef must be set if we got here. */ sa_assert(targetDef); - if (nodeGetInfo(dom->conn, &nodeinfo) < 0) + if ((hostcpus = nodeGetCPUCount()) < 0) goto cleanup; - hostcpus = VIR_NODEINFO_MAXCPUS(nodeinfo); + maxcpu = maplen * 8; if (maxcpu > hostcpus) maxcpu = hostcpus; @@ -4498,7 +4496,6 @@ qemudDomainGetVcpus(virDomainPtr dom, int maplen) { struct qemud_driver *driver = dom->conn->privateData; virDomainObjPtr vm; - virNodeInfo nodeinfo; int i, v, maxcpu, hostcpus; int ret = -1; qemuDomainObjPrivatePtr priv; @@ -4524,10 +4521,9 @@ qemudDomainGetVcpus(virDomainPtr dom, priv = vm->privateData; - if (nodeGetInfo(dom->conn, &nodeinfo) < 0) + if ((hostcpus = nodeGetCPUCount()) < 0) goto cleanup; - hostcpus = VIR_NODEINFO_MAXCPUS(nodeinfo); maxcpu = maplen * 8; if (maxcpu > hostcpus) maxcpu = hostcpus; diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 8bf80e7..29b7ae1 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -1902,15 +1902,13 @@ qemuPrepareCpumap(struct qemud_driver *driver, virBitmapPtr nodemask) { int i, hostcpus, maxcpu = QEMUD_CPUMASK_LEN; - virNodeInfo nodeinfo; virBitmapPtr cpumap = NULL; - if (nodeGetInfo(NULL, &nodeinfo) < 0) - return NULL; - /* setaffinity fails if you set bits for CPUs which * aren't present, so we have to limit ourselves */ - hostcpus = VIR_NODEINFO_MAXCPUS(nodeinfo); + if ((hostcpus = nodeGetCPUCount()) < 0) + return NULL; + if (maxcpu > hostcpus) maxcpu = hostcpus; -- 1.7.12.4

On 11/13/2012 05:54 AM, Viktor Mihajlovski wrote:
The drivers for QEMU and LXC use virNodeGetInfo only to determine the number of host CPUs. On Linux hosts nodeGetCPUCount has less overhead. --- V3 Changes: Commit message wording, since Peter's fix, nodeGetInfo will return correct/consistent topology. Still, this here is more efficient.
Indeed - parsing cpu/present (when available) is much faster than parsing each of cpu/cpuNN/online. ACK and pushed. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

Modified the places where virNodeGetInfo was used for the purpose of obtaining the maximum node CPU number. Transparently falling back to virNodeGetInfo in case of failure. Wrote a utility function getPyNodeCPUCount for that purpose. Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com> --- V2 Changes: Implemented Eric Blake's suggestion to remove code bloat introduced by first patch version. New helper function getPyNodeCPUCount is now used to calculate the number of node CPUs. python/libvirt-override.c | 87 ++++++++++++++++++++++++++++------------------- 1 file changed, 52 insertions(+), 35 deletions(-) diff --git a/python/libvirt-override.c b/python/libvirt-override.c index c07b33a..169df11 100644 --- a/python/libvirt-override.c +++ b/python/libvirt-override.c @@ -253,6 +253,39 @@ cleanup: return NULL; } +/* + * Utility function to retrieve the number of node CPUs present. + * It first tries virGetNodeCPUMap, which will return the + * number reliably, if available. + * As a fallback and for compatibility with backlevel libvirt + * versions virGetNodeInfo will be called to calculate the + * CPU number, which has the potential to return a too small + * number if some host CPUs are offline. + */ +static int +getPyNodeCPUCount(virConnectPtr conn) { + int i_retval; + virNodeInfo nodeinfo; + + LIBVIRT_BEGIN_ALLOW_THREADS; + i_retval = virNodeGetCPUMap(conn, NULL, NULL, 0); + LIBVIRT_END_ALLOW_THREADS; + + if (i_retval < 0) { + /* fallback: use nodeinfo */ + LIBVIRT_BEGIN_ALLOW_THREADS; + i_retval = virNodeGetInfo(conn, &nodeinfo); + LIBVIRT_END_ALLOW_THREADS; + if (i_retval < 0) + goto cleanup; + + i_retval = VIR_NODEINFO_MAXCPUS(nodeinfo); + } + +cleanup: + return i_retval; +} + /************************************************************************ * * * Statistics * @@ -1338,22 +1371,18 @@ libvirt_virDomainGetVcpus(PyObject *self ATTRIBUTE_UNUSED, virDomainPtr domain; PyObject *pyobj_domain, *pyretval = NULL, *pycpuinfo = NULL, *pycpumap = NULL; PyObject *error = NULL; - virNodeInfo nodeinfo; virDomainInfo dominfo; virVcpuInfoPtr cpuinfo = NULL; unsigned char *cpumap = NULL; size_t cpumaplen, i; - int i_retval; + int i_retval, cpunum; if (!PyArg_ParseTuple(args, (char *)"O:virDomainGetVcpus", &pyobj_domain)) return NULL; domain = (virDomainPtr) PyvirDomain_Get(pyobj_domain); - LIBVIRT_BEGIN_ALLOW_THREADS; - i_retval = virNodeGetInfo(virDomainGetConnect(domain), &nodeinfo); - LIBVIRT_END_ALLOW_THREADS; - if (i_retval < 0) + if ((cpunum = getPyNodeCPUCount(virDomainGetConnect(domain))) < 0) return VIR_PY_INT_FAIL; LIBVIRT_BEGIN_ALLOW_THREADS; @@ -1365,7 +1394,7 @@ libvirt_virDomainGetVcpus(PyObject *self ATTRIBUTE_UNUSED, if (VIR_ALLOC_N(cpuinfo, dominfo.nrVirtCpu) < 0) return PyErr_NoMemory(); - cpumaplen = VIR_CPU_MAPLEN(VIR_NODEINFO_MAXCPUS(nodeinfo)); + cpumaplen = VIR_CPU_MAPLEN(cpunum); if (xalloc_oversized(dominfo.nrVirtCpu, cpumaplen) || VIR_ALLOC_N(cpumap, dominfo.nrVirtCpu * cpumaplen) < 0) { error = PyErr_NoMemory(); @@ -1423,11 +1452,11 @@ libvirt_virDomainGetVcpus(PyObject *self ATTRIBUTE_UNUSED, goto cleanup; } for (i = 0 ; i < dominfo.nrVirtCpu ; i++) { - PyObject *info = PyTuple_New(VIR_NODEINFO_MAXCPUS(nodeinfo)); + PyObject *info = PyTuple_New(cpunum); int j; if (info == NULL) goto cleanup; - for (j = 0 ; j < VIR_NODEINFO_MAXCPUS(nodeinfo) ; j++) { + for (j = 0 ; j < cpunum ; j++) { PyObject *item = NULL; if ((item = PyBool_FromLong(VIR_CPU_USABLE(cpumap, cpumaplen, i, j))) == NULL || PyTuple_SetItem(info, j, item) < 0) { @@ -1467,9 +1496,8 @@ libvirt_virDomainPinVcpu(PyObject *self ATTRIBUTE_UNUSED, virDomainPtr domain; PyObject *pyobj_domain, *pycpumap; PyObject *ret = NULL; - virNodeInfo nodeinfo; unsigned char *cpumap; - int cpumaplen, i, vcpu, tuple_size; + int cpumaplen, i, vcpu, tuple_size, cpunum; int i_retval; if (!PyArg_ParseTuple(args, (char *)"OiO:virDomainPinVcpu", @@ -1477,10 +1505,7 @@ libvirt_virDomainPinVcpu(PyObject *self ATTRIBUTE_UNUSED, return NULL; domain = (virDomainPtr) PyvirDomain_Get(pyobj_domain); - LIBVIRT_BEGIN_ALLOW_THREADS; - i_retval = virNodeGetInfo(virDomainGetConnect(domain), &nodeinfo); - LIBVIRT_END_ALLOW_THREADS; - if (i_retval < 0) + if ((cpunum = getPyNodeCPUCount(virDomainGetConnect(domain))) < 0) return VIR_PY_INT_FAIL; if (PyTuple_Check(pycpumap)) { @@ -1492,7 +1517,7 @@ libvirt_virDomainPinVcpu(PyObject *self ATTRIBUTE_UNUSED, return ret; } - cpumaplen = VIR_CPU_MAPLEN(VIR_NODEINFO_MAXCPUS(nodeinfo)); + cpumaplen = VIR_CPU_MAPLEN(cpunum); if (VIR_ALLOC_N(cpumap, cpumaplen) < 0) return PyErr_NoMemory(); @@ -1509,7 +1534,7 @@ libvirt_virDomainPinVcpu(PyObject *self ATTRIBUTE_UNUSED, VIR_UNUSE_CPU(cpumap, i); } - for (; i < VIR_NODEINFO_MAXCPUS(nodeinfo); i++) + for (; i < cpunum; i++) VIR_UNUSE_CPU(cpumap, i); LIBVIRT_BEGIN_ALLOW_THREADS; @@ -1534,9 +1559,8 @@ libvirt_virDomainPinVcpuFlags(PyObject *self ATTRIBUTE_UNUSED, virDomainPtr domain; PyObject *pyobj_domain, *pycpumap; PyObject *ret = NULL; - virNodeInfo nodeinfo; unsigned char *cpumap; - int cpumaplen, i, vcpu, tuple_size; + int cpumaplen, i, vcpu, tuple_size, cpunum; unsigned int flags; int i_retval; @@ -1545,10 +1569,7 @@ libvirt_virDomainPinVcpuFlags(PyObject *self ATTRIBUTE_UNUSED, return NULL; domain = (virDomainPtr) PyvirDomain_Get(pyobj_domain); - LIBVIRT_BEGIN_ALLOW_THREADS; - i_retval = virNodeGetInfo(virDomainGetConnect(domain), &nodeinfo); - LIBVIRT_END_ALLOW_THREADS; - if (i_retval < 0) + if ((cpunum = getPyNodeCPUCount(virDomainGetConnect(domain))) < 0) return VIR_PY_INT_FAIL; if (PyTuple_Check(pycpumap)) { @@ -1560,7 +1581,7 @@ libvirt_virDomainPinVcpuFlags(PyObject *self ATTRIBUTE_UNUSED, return ret; } - cpumaplen = VIR_CPU_MAPLEN(VIR_NODEINFO_MAXCPUS(nodeinfo)); + cpumaplen = VIR_CPU_MAPLEN(cpunum); if (VIR_ALLOC_N(cpumap, cpumaplen) < 0) return PyErr_NoMemory(); @@ -1577,7 +1598,7 @@ libvirt_virDomainPinVcpuFlags(PyObject *self ATTRIBUTE_UNUSED, VIR_UNUSE_CPU(cpumap, i); } - for (; i < VIR_NODEINFO_MAXCPUS(nodeinfo); i++) + for (; i < cpunum; i++) VIR_UNUSE_CPU(cpumap, i); LIBVIRT_BEGIN_ALLOW_THREADS; @@ -1599,23 +1620,19 @@ libvirt_virDomainGetVcpuPinInfo(PyObject *self ATTRIBUTE_UNUSED, PyObject *args) { virDomainPtr domain; PyObject *pyobj_domain, *pycpumaps = NULL; - virNodeInfo nodeinfo; virDomainInfo dominfo; unsigned char *cpumaps; size_t cpumaplen, vcpu, pcpu; unsigned int flags; - int i_retval; + int i_retval, cpunum; if (!PyArg_ParseTuple(args, (char *)"Oi:virDomainGetVcpuPinInfo", &pyobj_domain, &flags)) return NULL; domain = (virDomainPtr) PyvirDomain_Get(pyobj_domain); - LIBVIRT_BEGIN_ALLOW_THREADS; - i_retval = virNodeGetInfo(virDomainGetConnect(domain), &nodeinfo); - LIBVIRT_END_ALLOW_THREADS; - if (i_retval < 0) - return VIR_PY_NONE; + if ((cpunum = getPyNodeCPUCount(virDomainGetConnect(domain))) < 0) + return VIR_PY_INT_FAIL; LIBVIRT_BEGIN_ALLOW_THREADS; i_retval = virDomainGetInfo(domain, &dominfo); @@ -1623,7 +1640,7 @@ libvirt_virDomainGetVcpuPinInfo(PyObject *self ATTRIBUTE_UNUSED, if (i_retval < 0) return VIR_PY_NONE; - cpumaplen = VIR_CPU_MAPLEN(VIR_NODEINFO_MAXCPUS(nodeinfo)); + cpumaplen = VIR_CPU_MAPLEN(cpunum); if (xalloc_oversized(dominfo.nrVirtCpu, cpumaplen) || VIR_ALLOC_N(cpumaps, dominfo.nrVirtCpu * cpumaplen) < 0) goto cleanup; @@ -1639,11 +1656,11 @@ libvirt_virDomainGetVcpuPinInfo(PyObject *self ATTRIBUTE_UNUSED, goto cleanup; for (vcpu = 0; vcpu < dominfo.nrVirtCpu; vcpu++) { - PyObject *mapinfo = PyTuple_New(VIR_NODEINFO_MAXCPUS(nodeinfo)); + PyObject *mapinfo = PyTuple_New(cpunum); if (mapinfo == NULL) goto cleanup; - for (pcpu = 0; pcpu < VIR_NODEINFO_MAXCPUS(nodeinfo); pcpu++) { + for (pcpu = 0; pcpu < cpunum; pcpu++) { PyTuple_SetItem(mapinfo, pcpu, PyBool_FromLong(VIR_CPU_USABLE(cpumaps, cpumaplen, vcpu, pcpu))); } -- 1.7.12.4

On 11/13/2012 05:54 AM, Viktor Mihajlovski wrote:
Modified the places where virNodeGetInfo was used for the purpose of obtaining the maximum node CPU number. Transparently falling back to virNodeGetInfo in case of failure. Wrote a utility function getPyNodeCPUCount for that purpose.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com> --- V2 Changes: Implemented Eric Blake's suggestion to remove code bloat introduced by first patch version. New helper function getPyNodeCPUCount is now used to calculate the number of node CPUs.
python/libvirt-override.c | 87 ++++++++++++++++++++++++++++------------------- 1 file changed, 52 insertions(+), 35 deletions(-)
@@ -1599,23 +1620,19 @@ libvirt_virDomainGetVcpuPinInfo(PyObject *self ATTRIBUTE_UNUSED, PyObject *args) {
- LIBVIRT_BEGIN_ALLOW_THREADS; - i_retval = virNodeGetInfo(virDomainGetConnect(domain), &nodeinfo); - LIBVIRT_END_ALLOW_THREADS; - if (i_retval < 0) - return VIR_PY_NONE; + if ((cpunum = getPyNodeCPUCount(virDomainGetConnect(domain))) < 0) + return VIR_PY_INT_FAIL;
Oops - unintentional type change, due to too much copy-n-paste. ACK with that fixed, and pushed. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

Modified the places where virNodeGetInfo was used for the purpose of obtaining the maximum node CPU number. Transparently falling back to virNodeGetInfo in case of failure. Wrote utility function vshNodeGetCPUCount to compute node CPU number. Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com> --- V2 Changes: Implemented Eric Blake's suggestion to remove code bloat introduced by first patch version. New helper function vshNodeGetCPUCount is now used to calculate the number of node CPUs. tools/virsh-domain.c | 32 +++++++++++++++++++++++--------- 1 file changed, 23 insertions(+), 9 deletions(-) diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c index 86ed4d3..183a598 100644 --- a/tools/virsh-domain.c +++ b/tools/virsh-domain.c @@ -126,6 +126,26 @@ vshDomainVcpuStateToString(int state) } /* + * Determine number of CPU nodes present by trying + * virNodeGetCPUMap and falling back to virNodeGetInfo + * if needed. + */ +static int +vshNodeGetCPUCount(virConnectPtr conn) +{ + int ret; + virNodeInfo nodeinfo; + + if ((ret = virNodeGetCPUMap(conn, NULL, NULL, 0)) < 0) { + /* fall back to nodeinfo */ + if (virNodeGetInfo(conn, &nodeinfo) == 0) { + ret = VIR_NODEINFO_MAXCPUS(nodeinfo); + } + } + return ret; +} + +/* * "attach-device" command */ static const vshCmdInfo info_attach_device[] = { @@ -4497,7 +4517,6 @@ cmdVcpuinfo(vshControl *ctl, const vshCmd *cmd) { virDomainInfo info; virDomainPtr dom; - virNodeInfo nodeinfo; virVcpuInfoPtr cpuinfo; unsigned char *cpumaps; int ncpus, maxcpu; @@ -4508,7 +4527,7 @@ cmdVcpuinfo(vshControl *ctl, const vshCmd *cmd) if (!(dom = vshCommandOptDomain(ctl, cmd, NULL))) return false; - if (virNodeGetInfo(ctl->conn, &nodeinfo) != 0) { + if ((maxcpu = vshNodeGetCPUCount(ctl->conn)) < 0) { virDomainFree(dom); return false; } @@ -4519,7 +4538,6 @@ cmdVcpuinfo(vshControl *ctl, const vshCmd *cmd) } cpuinfo = vshMalloc(ctl, sizeof(virVcpuInfo)*info.nrVirtCpu); - maxcpu = VIR_NODEINFO_MAXCPUS(nodeinfo); cpumaplen = VIR_CPU_MAPLEN(maxcpu); cpumaps = vshMalloc(ctl, info.nrVirtCpu * cpumaplen); @@ -4645,7 +4663,6 @@ cmdVcpuPin(vshControl *ctl, const vshCmd *cmd) { virDomainInfo info; virDomainPtr dom; - virNodeInfo nodeinfo; int vcpu = -1; const char *cpulist = NULL; bool ret = true; @@ -4695,7 +4712,7 @@ cmdVcpuPin(vshControl *ctl, const vshCmd *cmd) return false; } - if (virNodeGetInfo(ctl->conn, &nodeinfo) != 0) { + if ((maxcpu = vshNodeGetCPUCount(ctl->conn)) < 0) { virDomainFree(dom); return false; } @@ -4712,7 +4729,6 @@ cmdVcpuPin(vshControl *ctl, const vshCmd *cmd) return false; } - maxcpu = VIR_NODEINFO_MAXCPUS(nodeinfo); cpumaplen = VIR_CPU_MAPLEN(maxcpu); /* Query mode: show CPU affinity information then exit.*/ @@ -4864,7 +4880,6 @@ static bool cmdEmulatorPin(vshControl *ctl, const vshCmd *cmd) { virDomainPtr dom; - virNodeInfo nodeinfo; const char *cpulist = NULL; bool ret = true; unsigned char *cpumap = NULL; @@ -4905,12 +4920,11 @@ cmdEmulatorPin(vshControl *ctl, const vshCmd *cmd) } query = !cpulist; - if (virNodeGetInfo(ctl->conn, &nodeinfo) != 0) { + if ((maxcpu = vshNodeGetCPUCount(ctl->conn)) < 0) { virDomainFree(dom); return false; } - maxcpu = VIR_NODEINFO_MAXCPUS(nodeinfo); cpumaplen = VIR_CPU_MAPLEN(maxcpu); /* Query mode: show CPU affinity information then exit.*/ -- 1.7.12.4

On 11/13/2012 05:54 AM, Viktor Mihajlovski wrote:
Modified the places where virNodeGetInfo was used for the purpose of obtaining the maximum node CPU number. Transparently falling back to virNodeGetInfo in case of failure. Wrote utility function vshNodeGetCPUCount to compute node CPU number.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com> --- V2 Changes: Implemented Eric Blake's suggestion to remove code bloat introduced by first patch version. New helper function vshNodeGetCPUCount is now used to calculate the number of node CPUs.
+static int +vshNodeGetCPUCount(virConnectPtr conn) +{ + int ret; + virNodeInfo nodeinfo; + + if ((ret = virNodeGetCPUMap(conn, NULL, NULL, 0)) < 0) {
Probably worth clearing out the error here, so that the rest of virsh doesn't leak an error message that we ignored.
+ /* fall back to nodeinfo */ + if (virNodeGetInfo(conn, &nodeinfo) == 0) { + ret = VIR_NODEINFO_MAXCPUS(nodeinfo); + } + } + return ret; +} +
ACK with that fixed, and pushed. diff --git i/tools/virsh-domain.c w/tools/virsh-domain.c index 183a598..d483f3f 100644 --- i/tools/virsh-domain.c +++ w/tools/virsh-domain.c @@ -138,6 +138,7 @@ vshNodeGetCPUCount(virConnectPtr conn) if ((ret = virNodeGetCPUMap(conn, NULL, NULL, 0)) < 0) { /* fall back to nodeinfo */ + vshResetLibvirtError(); if (virNodeGetInfo(conn, &nodeinfo) == 0) { ret = VIR_NODEINFO_MAXCPUS(nodeinfo); } -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 11/15/2012 05:06 PM, Eric Blake wrote:
ACK with that fixed, and pushed.
Thanks! -- Mit freundlichen Grüßen/Kind Regards Viktor Mihajlovski IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Martin Jetter Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294

On RHEL 5, I was getting a segfault trying to start libvirtd, because we were failing virNodeParseSocket but not checking for errors, and then calling CPU_SET(-1, &sock_map) as a result. But if you don't have a topology/physical_package_id file, then you can just assume that the cpu belongs to socket 0. * src/nodeinfo.c (virNodeGetCpuValue): Change bool into default_value. (virNodeParseSocket): Allow for default value when file is missing, different from fatal error on reading file. (virNodeParseNode): Update call sites to fail on error. --- src/nodeinfo.c | 29 ++++++++++++++++------------- 1 file changed, 16 insertions(+), 13 deletions(-) diff --git a/src/nodeinfo.c b/src/nodeinfo.c index 4589b77..d5f2c00 100644 --- a/src/nodeinfo.c +++ b/src/nodeinfo.c @@ -79,13 +79,14 @@ static int linuxNodeGetMemoryStats(FILE *meminfo, int *nparams); /* Return the positive decimal contents of the given - * DIR/cpu%u/FILE, or -1 on error. If MISSING_OK and the - * file could not be found, return 1 instead of an error; this is - * because some machines cannot hot-unplug cpu0, or because - * hot-unplugging is disabled. */ + * DIR/cpu%u/FILE, or -1 on error. If DEFAULT_VALUE is non-negative + * and the file could not be found, return that instead of an error; + * this is useful for machines that cannot hot-unplug cpu0, or where + * hot-unplugging is disabled, or where the kernel is too old + * to support NUMA cells, etc. */ static int virNodeGetCpuValue(const char *dir, unsigned int cpu, const char *file, - bool missing_ok) + int default_value) { char *path; FILE *pathfp; @@ -100,8 +101,8 @@ virNodeGetCpuValue(const char *dir, unsigned int cpu, const char *file, pathfp = fopen(path, "r"); if (pathfp == NULL) { - if (missing_ok && errno == ENOENT) - value = 1; + if (default_value >= 0 && errno == ENOENT) + value = default_value; else virReportSystemError(errno, _("cannot open %s"), path); goto cleanup; @@ -174,7 +175,7 @@ static int virNodeParseSocket(const char *dir, unsigned int cpu) { int ret = virNodeGetCpuValue(dir, cpu, "topology/physical_package_id", - false); + 0); # if defined(__powerpc__) || \ defined(__powerpc64__) || \ defined(__s390__) || \ @@ -241,14 +242,15 @@ virNodeParseNode(const char *node, if (sscanf(cpudirent->d_name, "cpu%u", &cpu) != 1) continue; - if ((online = virNodeGetCpuValue(node, cpu, "online", true)) < 0) + if ((online = virNodeGetCpuValue(node, cpu, "online", 1)) < 0) goto cleanup; if (!online) continue; /* Parse socket */ - sock = virNodeParseSocket(node, cpu); + if ((sock = virNodeParseSocket(node, cpu)) < 0) + goto cleanup; CPU_SET(sock, &sock_map); if (sock > sock_max) @@ -280,7 +282,7 @@ virNodeParseNode(const char *node, if (sscanf(cpudirent->d_name, "cpu%u", &cpu) != 1) continue; - if ((online = virNodeGetCpuValue(node, cpu, "online", true)) < 0) + if ((online = virNodeGetCpuValue(node, cpu, "online", 1)) < 0) goto cleanup; if (!online) { @@ -291,7 +293,8 @@ virNodeParseNode(const char *node, processors++; /* Parse socket */ - sock = virNodeParseSocket(node, cpu); + if ((sock = virNodeParseSocket(node, cpu)) < 0) + goto cleanup; if (!CPU_ISSET(sock, &sock_map)) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("CPU socket topology has changed")); @@ -304,7 +307,7 @@ virNodeParseNode(const char *node, /* logical cpu is equivalent to a core on s390 */ core = cpu; # else - core = virNodeGetCpuValue(node, cpu, "topology/core_id", false); + core = virNodeGetCpuValue(node, cpu, "topology/core_id", 0); # endif CPU_SET(core, &core_maps[sock]); -- 1.7.11.7

On 11/15/2012 04:38 PM, Eric Blake wrote:
On RHEL 5, I was getting a segfault trying to start libvirtd, because we were failing virNodeParseSocket but not checking for errors, and then calling CPU_SET(-1, &sock_map) as a result. But if you don't have a topology/physical_package_id file, then you can just assume that the cpu belongs to socket 0.
* src/nodeinfo.c (virNodeGetCpuValue): Change bool into default_value. (virNodeParseSocket): Allow for default value when file is missing, different from fatal error on reading file. (virNodeParseNode): Update call sites to fail on error. --- src/nodeinfo.c | 29 ++++++++++++++++------------- 1 file changed, 16 insertions(+), 13 deletions(-)
Keeps the original good path behavior and provides a sound fallback. Looks good to me, +1. -- Mit freundlichen Grüßen/Kind Regards Viktor Mihajlovski IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Martin Jetter Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294

Prior to this patch, 'virsh nodecpumap' on older kernels reported: error: Unable to get cpu map error: out of memory * src/nodeinfo.c (linuxParseCPUmax): Don't overwrite error. (nodeGetCPUBitmap): Provide backup implementation. --- src/nodeinfo.c | 33 +++++++++++++++++++++++---------- 1 file changed, 23 insertions(+), 10 deletions(-) diff --git a/src/nodeinfo.c b/src/nodeinfo.c index d5f2c00..75d6524 100644 --- a/src/nodeinfo.c +++ b/src/nodeinfo.c @@ -792,10 +792,8 @@ linuxParseCPUmax(const char *path) char *tmp; int ret = -1; - if (virFileReadAll(path, 5 * VIR_DOMAIN_CPUMASK_LEN, &str) < 0) { - virReportOOMError(); + if (virFileReadAll(path, 5 * VIR_DOMAIN_CPUMASK_LEN, &str) < 0) goto cleanup; - } tmp = str; do { @@ -1024,15 +1022,30 @@ nodeGetCPUBitmap(int *max_id ATTRIBUTE_UNUSED) virBitmapPtr cpumap; int present; - present = linuxParseCPUmax(SYSFS_SYSTEM_PATH "/cpu/present"); - /* XXX should we also work on older kernels, like RHEL5, that lack - * cpu/present and cpu/online files? Those kernels also lack cpu - * hotplugging, so it would be a matter of finding the largest - * cpu/cpuNN directory, and creating a map that size with all bits - * set. */ + present = nodeGetCPUCount(); if (present < 0) return NULL; - cpumap = linuxParseCPUmap(present, SYSFS_SYSTEM_PATH "/cpu/online"); + + if (virFileExists(SYSFS_SYSTEM_PATH "/cpu/online")) { + cpumap = linuxParseCPUmap(present, SYSFS_SYSTEM_PATH "/cpu/online"); + } else { + int i; + + cpumap = virBitmapNew(present); + if (!cpumap) { + virReportOOMError(); + return NULL; + } + for (i = 0; i < present; i++) { + int online = virNodeGetCpuValue(SYSFS_SYSTEM_PATH, i, "online", 1); + if (online < 0) { + virBitmapFree(cpumap); + return NULL; + } + if (online) + ignore_value(virBitmapSetBit(cpumap, i)); + } + } if (max_id && cpumap) *max_id = present; return cpumap; -- 1.7.11.7

On 11/15/2012 04:38 PM, Eric Blake wrote:
- cpumap = linuxParseCPUmap(present, SYSFS_SYSTEM_PATH "/cpu/online"); + + if (virFileExists(SYSFS_SYSTEM_PATH "/cpu/online")) { + cpumap = linuxParseCPUmap(present, SYSFS_SYSTEM_PATH "/cpu/online"); + } else { + int i; + + cpumap = virBitmapNew(present); + if (!cpumap) { + virReportOOMError(); + return NULL; + } + for (i = 0; i < present; i++) { + int online = virNodeGetCpuValue(SYSFS_SYSTEM_PATH, i, "online", 1); + if (online < 0) { + virBitmapFree(cpumap); + return NULL; + } + if (online) + ignore_value(virBitmapSetBit(cpumap, i)); + } + } if (max_id && cpumap) *max_id = present; return cpumap;
Fallback should provide correct result on back-level kernels. The code doesn't use the goto error/cleanup pattern, but there's a mixture in nodeinfo.c anyway and it wouldn't make it anymore compact. As far as I am concerned: +1. -- Mit freundlichen Grüßen/Kind Regards Viktor Mihajlovski IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Martin Jetter Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294

On 11/16/2012 09:57 AM, Viktor Mihajlovski wrote:
On 11/15/2012 04:38 PM, Eric Blake wrote:
- cpumap = linuxParseCPUmap(present, SYSFS_SYSTEM_PATH "/cpu/online"); + + if (virFileExists(SYSFS_SYSTEM_PATH "/cpu/online")) { + cpumap = linuxParseCPUmap(present, SYSFS_SYSTEM_PATH "/cpu/online"); + } else { + int i; + + cpumap = virBitmapNew(present); + if (!cpumap) { + virReportOOMError(); + return NULL; + } + for (i = 0; i < present; i++) { + int online = virNodeGetCpuValue(SYSFS_SYSTEM_PATH, i, "online", 1); + if (online < 0) { + virBitmapFree(cpumap); + return NULL; + } + if (online) + ignore_value(virBitmapSetBit(cpumap, i)); + } + } if (max_id && cpumap) *max_id = present; return cpumap;
Fallback should provide correct result on back-level kernels. The code doesn't use the goto error/cleanup pattern, but there's a mixture in nodeinfo.c anyway and it wouldn't make it anymore compact. As far as I am concerned: +1.
Thanks for reviewing. I've now pushed these two. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
participants (2)
-
Eric Blake
-
Viktor Mihajlovski