[libvirt] [PATCH v2 0/2] Guest NUMA topology support

Hi, This is v2 of the patchset that adds support for specifying NUMA topology for guests. <cpu> ... <topology sockets='2' cores='4' threads='2'/> <numa> <cell cpus='0-7' mems='512000'/> <cell cpus='8-15' mems='512000'/> </numa> ... </cpu> This change allows libvirt to generate -numa options for QEMU/KVM. This patchset passes all tests except daemon-conf test. Changes for v2 -------------- - Renamed mems to memory in NUMA cell specification. - Make both cpus= and memory= mandatory in a NUMA cell. - Reuse cpuset and memoryKB definitions for cpus and memory. - Fix a bug in reading memory=. - Correct error handling for usages of virXMLPropString. - Support virsh dumpxml. - Fix XML in domaincommon.rng so that <numa> works correctly for both <cpu> ... <cpu> as well as <cpu match="..."> ... <cpu> - Don't use virBufferTruncate. - Modifiy qemuxml2argv test cases for s/mems/memory change. - Use virBufferLit wherever possible. - Now qemuBuildNumaCPUArgStr can't fail, hence doesn't need return val. - Pass memory in MB to qemu. v1 - https://www.redhat.com/archives/libvir-list/2011-November/msg00247.html v0 - https://www.redhat.com/archives/libvir-list/2011-October/msg00025.html RFC - http://permalink.gmane.org/gmane.comp.emulators.libvirt/44626 Regards, Bharata.

XML definitions for guest NUMA and parsing routines. From: Bharata B Rao <bharata@linux.vnet.ibm.com> This patch adds XML definitions for guest NUMA specification and contains routines to parse the same. The guest NUMA specification looks like this: <cpu> ... <topology sockets='2' cores='4' threads='2'/> <numa> <cell cpus='0-7' memory='512000'/> <cell cpus='8-15' memory='512000'/> </numa> ... </cpu> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> --- docs/formatdomain.html.in | 29 ++++++++++++++ docs/schemas/domaincommon.rng | 31 ++++++++++++++- src/conf/cpu_conf.c | 86 ++++++++++++++++++++++++++++++++++++++++- src/conf/cpu_conf.h | 13 ++++++ src/conf/domain_conf.c | 7 +++ 5 files changed, 163 insertions(+), 3 deletions(-) diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index cbad196..28a4edc 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -628,6 +628,35 @@ </dd> </dl> + <p> + Guest NUMA topology can be specifed using <code>numa</code> element. + <span class="since">Since 0.9.8</span> + </p> + +<pre> + ... + <cpu> + ... + <numa> + <cell cpus='0-3' memory='512000'/> + <cell cpus='4-7' memory='512000'/> + </numa> + ... + </cpu> + ...</pre> + + <p> + Each <code>cell</code> element specifies a NUMA cell or a NUMA node. + <code>cpus</code> specifies the CPU or range of CPUs that are part of + the node. <code>memory</code> specifies the node memory in kilobytes + (i.e. blocks of 1024 bytes). Each cell or node is assigned cellid + or nodeid in the increasing order starting from 0. + </p> + + <p> + This guest NUMA specification is currently available only for QEMU/KVM. + </p> + <h3><a name="elementsLifecycle">Lifecycle control</a></h3> <p> diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index b6f858e..afcaccc 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -2297,7 +2297,14 @@ <define name="cpu"> <element name="cpu"> <choice> - <ref name="cpuTopology"/> + <group> + <optional> + <ref name="cpuTopology"/> + </optional> + <optional> + <ref name="cpuNuma"/> + </optional> + </group> <group> <ref name="cpuMatch"/> <interleave> @@ -2311,6 +2318,9 @@ <zeroOrMore> <ref name="cpuFeature"/> </zeroOrMore> + <optional> + <ref name="cpuNuma"/> + </optional> </interleave> </group> </choice> @@ -2371,6 +2381,25 @@ </element> </define> + <define name="cpuNuma"> + <element name="numa"> + <oneOrMore> + <ref name="numaCell"/> + </oneOrMore> + </element> + </define> + + <define name="numaCell"> + <element name="cell"> + <attribute name="cpus"> + <ref name="cpuset"/> + </attribute> + <attribute name="memory"> + <ref name="memoryKB"/> + </attribute> + </element> + </define> + <!-- System information specification: Placeholder for system specific informations likes the ones diff --git a/src/conf/cpu_conf.c b/src/conf/cpu_conf.c index 41e997e..4aabe98 100644 --- a/src/conf/cpu_conf.c +++ b/src/conf/cpu_conf.c @@ -28,6 +28,7 @@ #include "util.h" #include "buf.h" #include "cpu_conf.h" +#include "domain_conf.h" #define VIR_FROM_THIS VIR_FROM_CPU @@ -67,6 +68,12 @@ virCPUDefFree(virCPUDefPtr def) VIR_FREE(def->features[i].name); VIR_FREE(def->features); + for (i = 0 ; i < def->ncells ; i++) { + VIR_FREE(def->cells[i].cpumask); + VIR_FREE(def->cells[i].cpustr); + } + VIR_FREE(def->cells); + VIR_FREE(def); } @@ -109,7 +116,6 @@ no_memory: return NULL; } - virCPUDefPtr virCPUDefParseXML(const xmlNodePtr node, xmlXPathContextPtr ctxt, @@ -289,9 +295,75 @@ virCPUDefParseXML(const xmlNodePtr node, def->features[i].policy = policy; } + if (virXPathNode("./numa[1]", ctxt)) { + VIR_FREE(nodes); + n = virXPathNodeSet("./numa[1]/cell", ctxt, &nodes); + if (n <= 0) { + virCPUReportError(VIR_ERR_INTERNAL_ERROR, + "%s", _("NUMA topology defined without NUMA cells")); + goto error; + } + + if (VIR_RESIZE_N(def->cells, def->ncells_max, + def->ncells, n) < 0) + goto no_memory; + + def->ncells = n; + + for (i = 0 ; i < n ; i++) { + char *cpus, *cpus_parse, *memory; + int cpumasklen = VIR_DOMAIN_CPUMASK_LEN; + int ret, ncpus = 0; + + def->cells[i].cellid = i; + cpus = cpus_parse = virXMLPropString(nodes[i], "cpus"); + if (!cpus) { + virCPUReportError(VIR_ERR_INTERNAL_ERROR, + "%s", _("Missing 'cpus' attribute in NUMA cell")); + goto error; + } + + def->cells[i].cpustr = strdup(cpus); + if (!def->cells[i].cpustr) { + VIR_FREE(cpus); + goto no_memory; + } + + if (VIR_ALLOC_N(def->cells[i].cpumask, cpumasklen) < 0) { + VIR_FREE(cpus); + goto no_memory; + } + + ncpus = virDomainCpuSetParse((const char **)&cpus_parse, + 0, def->cells[i].cpumask, cpumasklen); + if (ncpus <= 0) { + VIR_FREE(cpus); + goto error; + } + def->cells_cpus += ncpus; + + memory = virXMLPropString(nodes[i], "memory"); + if (!memory) { + virCPUReportError(VIR_ERR_INTERNAL_ERROR, + "%s", _("Missing 'memory' attribute in NUMA cell")); + VIR_FREE(cpus); + goto error; + } + + ret = virStrToLong_ui(memory, NULL, 10, &def->cells[i].mem); + if (ret == -1) { + VIR_FREE(cpus); + VIR_FREE(memory); + goto error; + } + + VIR_FREE(cpus); + VIR_FREE(memory); + } + } + cleanup: VIR_FREE(nodes); - return def; no_memory: @@ -414,6 +486,16 @@ virCPUDefFormatBuf(virBufferPtr buf, } } + if (def->ncells) { + virBufferAddLit(buf, "<numa>\n"); + for (i = 0; i < def->ncells; i++) { + virBufferAddLit(buf, " <cell"); + virBufferAsprintf(buf, " cpus='%s'", def->cells[i].cpustr); + virBufferAsprintf(buf, " memory='%d'", def->cells[i].mem); + virBufferAddLit(buf, "/>\n"); + } + virBufferAddLit(buf, "</numa>\n"); + } return 0; } diff --git a/src/conf/cpu_conf.h b/src/conf/cpu_conf.h index 4406cba..efff473 100644 --- a/src/conf/cpu_conf.h +++ b/src/conf/cpu_conf.h @@ -67,6 +67,15 @@ struct _virCPUFeatureDef { int policy; /* enum virCPUFeaturePolicy */ }; +typedef struct _virCellDef virCellDef; +typedef virCellDef *virCellDefPtr; +struct _virCellDef { + int cellid; + char *cpumask; /* CPUs that are part of this node */ + char *cpustr; /* CPUs stored in string form for dumpxml */ + unsigned int mem; /* Node memory in kB */ +}; + typedef struct _virCPUDef virCPUDef; typedef virCPUDef *virCPUDefPtr; struct _virCPUDef { @@ -81,6 +90,10 @@ struct _virCPUDef { size_t nfeatures; size_t nfeatures_max; virCPUFeatureDefPtr features; + size_t ncells; + size_t ncells_max; + virCellDefPtr cells; + unsigned int cells_cpus; }; diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index a85f837..1203119 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -7572,6 +7572,13 @@ static virDomainDefPtr virDomainDefParseXML(virCapsPtr caps, if (def->cpu == NULL) goto error; + + if (def->cpu->cells_cpus > def->maxvcpus) { + virDomainReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Number of CPUs in <numa> exceeds the" + " <vcpu> count")); + goto error; + } } if ((node = virXPathNode("./sysinfo[1]", ctxt)) != NULL) {

On Fri, Nov 11, 2011 at 06:21:45PM +0530, Bharata B Rao wrote:
XML definitions for guest NUMA and parsing routines.
From: Bharata B Rao <bharata@linux.vnet.ibm.com>
This patch adds XML definitions for guest NUMA specification and contains routines to parse the same. The guest NUMA specification looks like this:
<cpu> ... <topology sockets='2' cores='4' threads='2'/> <numa> <cell cpus='0-7' memory='512000'/> <cell cpus='8-15' memory='512000'/> </numa> ... </cpu>
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> ---
docs/formatdomain.html.in | 29 ++++++++++++++ docs/schemas/domaincommon.rng | 31 ++++++++++++++- src/conf/cpu_conf.c | 86 ++++++++++++++++++++++++++++++++++++++++- src/conf/cpu_conf.h | 13 ++++++ src/conf/domain_conf.c | 7 +++ 5 files changed, 163 insertions(+), 3 deletions(-)
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index cbad196..28a4edc 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -628,6 +628,35 @@ </dd> </dl>
+ <p> + Guest NUMA topology can be specifed using <code>numa</code> element. + <span class="since">Since 0.9.8</span> + </p> + +<pre> + ... + <cpu> + ... + <numa> + <cell cpus='0-3' memory='512000'/> + <cell cpus='4-7' memory='512000'/> + </numa> + ... + </cpu> + ...</pre> + + <p> + Each <code>cell</code> element specifies a NUMA cell or a NUMA node. + <code>cpus</code> specifies the CPU or range of CPUs that are part of + the node. <code>memory</code> specifies the node memory in kilobytes + (i.e. blocks of 1024 bytes). Each cell or node is assigned cellid + or nodeid in the increasing order starting from 0. + </p> + + <p> + This guest NUMA specification is currently available only for QEMU/KVM. + </p> + <h3><a name="elementsLifecycle">Lifecycle control</a></h3>
<p> diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index b6f858e..afcaccc 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -2297,7 +2297,14 @@ <define name="cpu"> <element name="cpu"> <choice> - <ref name="cpuTopology"/> + <group> + <optional> + <ref name="cpuTopology"/> + </optional> + <optional> + <ref name="cpuNuma"/> + </optional> + </group> <group> <ref name="cpuMatch"/> <interleave> @@ -2311,6 +2318,9 @@ <zeroOrMore> <ref name="cpuFeature"/> </zeroOrMore> + <optional> + <ref name="cpuNuma"/> + </optional> </interleave> </group> </choice> @@ -2371,6 +2381,25 @@ </element> </define>
+ <define name="cpuNuma"> + <element name="numa"> + <oneOrMore> + <ref name="numaCell"/> + </oneOrMore> + </element> + </define> + + <define name="numaCell"> + <element name="cell"> + <attribute name="cpus"> + <ref name="cpuset"/> + </attribute> + <attribute name="memory"> + <ref name="memoryKB"/> + </attribute> + </element> + </define> + <!-- System information specification: Placeholder for system specific informations likes the ones diff --git a/src/conf/cpu_conf.c b/src/conf/cpu_conf.c index 41e997e..4aabe98 100644 --- a/src/conf/cpu_conf.c +++ b/src/conf/cpu_conf.c @@ -28,6 +28,7 @@ #include "util.h" #include "buf.h" #include "cpu_conf.h" +#include "domain_conf.h"
#define VIR_FROM_THIS VIR_FROM_CPU
@@ -67,6 +68,12 @@ virCPUDefFree(virCPUDefPtr def) VIR_FREE(def->features[i].name); VIR_FREE(def->features);
+ for (i = 0 ; i < def->ncells ; i++) { + VIR_FREE(def->cells[i].cpumask); + VIR_FREE(def->cells[i].cpustr); + } + VIR_FREE(def->cells); + VIR_FREE(def); }
@@ -109,7 +116,6 @@ no_memory: return NULL; }
- virCPUDefPtr virCPUDefParseXML(const xmlNodePtr node, xmlXPathContextPtr ctxt, @@ -289,9 +295,75 @@ virCPUDefParseXML(const xmlNodePtr node, def->features[i].policy = policy; }
+ if (virXPathNode("./numa[1]", ctxt)) { + VIR_FREE(nodes); + n = virXPathNodeSet("./numa[1]/cell", ctxt, &nodes); + if (n <= 0) { + virCPUReportError(VIR_ERR_INTERNAL_ERROR, + "%s", _("NUMA topology defined without NUMA cells")); + goto error; + } + + if (VIR_RESIZE_N(def->cells, def->ncells_max, + def->ncells, n) < 0) + goto no_memory; + + def->ncells = n; + + for (i = 0 ; i < n ; i++) { + char *cpus, *cpus_parse, *memory; + int cpumasklen = VIR_DOMAIN_CPUMASK_LEN; + int ret, ncpus = 0; + + def->cells[i].cellid = i; + cpus = cpus_parse = virXMLPropString(nodes[i], "cpus"); + if (!cpus) { + virCPUReportError(VIR_ERR_INTERNAL_ERROR, + "%s", _("Missing 'cpus' attribute in NUMA cell")); + goto error; + } + + def->cells[i].cpustr = strdup(cpus); + if (!def->cells[i].cpustr) { + VIR_FREE(cpus); + goto no_memory; + } + + if (VIR_ALLOC_N(def->cells[i].cpumask, cpumasklen) < 0) { + VIR_FREE(cpus); + goto no_memory; + } + + ncpus = virDomainCpuSetParse((const char **)&cpus_parse, + 0, def->cells[i].cpumask, cpumasklen); + if (ncpus <= 0) { + VIR_FREE(cpus); + goto error; + } + def->cells_cpus += ncpus; + + memory = virXMLPropString(nodes[i], "memory"); + if (!memory) { + virCPUReportError(VIR_ERR_INTERNAL_ERROR, + "%s", _("Missing 'memory' attribute in NUMA cell")); + VIR_FREE(cpus); + goto error; + } + + ret = virStrToLong_ui(memory, NULL, 10, &def->cells[i].mem); + if (ret == -1) { + VIR_FREE(cpus); + VIR_FREE(memory); + goto error; + } + + VIR_FREE(cpus); + VIR_FREE(memory); + } + } + cleanup: VIR_FREE(nodes); - return def;
no_memory: @@ -414,6 +486,16 @@ virCPUDefFormatBuf(virBufferPtr buf, } }
+ if (def->ncells) { + virBufferAddLit(buf, "<numa>\n"); + for (i = 0; i < def->ncells; i++) { + virBufferAddLit(buf, " <cell"); + virBufferAsprintf(buf, " cpus='%s'", def->cells[i].cpustr); + virBufferAsprintf(buf, " memory='%d'", def->cells[i].mem); + virBufferAddLit(buf, "/>\n"); + } + virBufferAddLit(buf, "</numa>\n"); + } return 0; }
diff --git a/src/conf/cpu_conf.h b/src/conf/cpu_conf.h index 4406cba..efff473 100644 --- a/src/conf/cpu_conf.h +++ b/src/conf/cpu_conf.h @@ -67,6 +67,15 @@ struct _virCPUFeatureDef { int policy; /* enum virCPUFeaturePolicy */ };
+typedef struct _virCellDef virCellDef; +typedef virCellDef *virCellDefPtr; +struct _virCellDef { + int cellid; + char *cpumask; /* CPUs that are part of this node */ + char *cpustr; /* CPUs stored in string form for dumpxml */ + unsigned int mem; /* Node memory in kB */ +}; + typedef struct _virCPUDef virCPUDef; typedef virCPUDef *virCPUDefPtr; struct _virCPUDef { @@ -81,6 +90,10 @@ struct _virCPUDef { size_t nfeatures; size_t nfeatures_max; virCPUFeatureDefPtr features; + size_t ncells; + size_t ncells_max; + virCellDefPtr cells; + unsigned int cells_cpus; };
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index a85f837..1203119 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -7572,6 +7572,13 @@ static virDomainDefPtr virDomainDefParseXML(virCapsPtr caps,
if (def->cpu == NULL) goto error; + + if (def->cpu->cells_cpus > def->maxvcpus) { + virDomainReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Number of CPUs in <numa> exceeds the" + " <vcpu> count")); + goto error; + } }
if ((node = virXPathNode("./sysinfo[1]", ctxt)) != NULL) {
ACK Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 11/15/2011 04:48 AM, Daniel P. Berrange wrote:
On Fri, Nov 11, 2011 at 06:21:45PM +0530, Bharata B Rao wrote:
XML definitions for guest NUMA and parsing routines.
From: Bharata B Rao <bharata@linux.vnet.ibm.com>
This patch adds XML definitions for guest NUMA specification and contains routines to parse the same. The guest NUMA specification looks like this:
<cpu> ... <topology sockets='2' cores='4' threads='2'/> <numa> <cell cpus='0-7' memory='512000'/> <cell cpus='8-15' memory='512000'/> </numa> ... </cpu>
ACK
Now pushed, once I fixed the trailing space ('make syntax-check' catches that). I also added you to AUTHORS; let me know if I need to update any spellings. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On Fri, 2011-11-11 at 18:21 +0530, Bharata B Rao wrote:
XML definitions for guest NUMA and parsing routines.
From: Bharata B Rao <bharata@linux.vnet.ibm.com>
This patch adds XML definitions for guest NUMA specification and contains routines to parse the same. The guest NUMA specification looks like this:
<cpu> ... <topology sockets='2' cores='4' threads='2'/> <numa> <cell cpus='0-7' memory='512000'/> <cell cpus='8-15' memory='512000'/> </numa> ... </cpu>
Hi Bharata, I realise I'm a bit late on this, but I'm just catching up on the list traffic. Firstly why is the XML tag called "cell", it seems to represent what we would normally call a "node" in terms of NUMA? Also does the parser support disjoint ranges for cpus and memory? Or do we not need that complexity for some reason? For example what if I have a topology that looks like: Node 0: CPUs: 0-3,8-11 MEM : 0-1G,2G-3G Node 1: CPUs: 4-7,12-15 MEM : 1G-2G,3G-4G cheers --

On Wed, Dec 07, 2011 at 03:49:59PM +1100, Michael Ellerman wrote:
On Fri, 2011-11-11 at 18:21 +0530, Bharata B Rao wrote:
XML definitions for guest NUMA and parsing routines.
From: Bharata B Rao <bharata@linux.vnet.ibm.com>
This patch adds XML definitions for guest NUMA specification and contains routines to parse the same. The guest NUMA specification looks like this:
<cpu> ... <topology sockets='2' cores='4' threads='2'/> <numa> <cell cpus='0-7' memory='512000'/> <cell cpus='8-15' memory='512000'/> </numa> ... </cpu>
Hi Bharata,
I realise I'm a bit late on this, but I'm just catching up on the list traffic.
Firstly why is the XML tag called "cell", it seems to represent what we would normally call a "node" in terms of NUMA?
I initially started out with "node", but in libvirt node is referred to as "cell". Hence stuck to "cell" to be consistent.
Also does the parser support disjoint ranges for cpus and memory? Or do we not need that complexity for some reason?
For example what if I have a topology that looks like:
Node 0: CPUs: 0-3,8-11
This is supported, and since we use the same parsing routine that parses cpuset, we support fancy stuff like cpus-0-4^3,8-11 But such things aren't support by QEMU currently.
MEM : 0-1G,2G-3G
QEMU currently doesn't support such construct. Do we really want this in guest topology ? There are some wierd NUMA topologies in the real world which aren't supported as of now, we wanted to start simple and if needed support futher extensions (Refer to discussions on my v0 post) Regards, Bharata.

qemu: Generate -numa option From: Bharata B Rao <bharata@linux.vnet.ibm.com> Add routines to generate -numa QEMU command line option based on <numa> ... </numa> XML specifications. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> --- src/conf/cpu_conf.c | 3 + src/qemu/qemu_command.c | 63 ++++++++++++++++++++ tests/qemuxml2argvdata/qemuxml2argv-cpu-numa1.args | 5 ++ tests/qemuxml2argvdata/qemuxml2argv-cpu-numa1.xml | 25 ++++++++ tests/qemuxml2argvdata/qemuxml2argv-cpu-numa2.args | 6 ++ tests/qemuxml2argvdata/qemuxml2argv-cpu-numa2.xml | 25 ++++++++ tests/qemuxml2argvtest.c | 2 + 7 files changed, 128 insertions(+), 1 deletions(-) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-cpu-numa1.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-cpu-numa1.xml create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-cpu-numa2.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-cpu-numa2.xml diff --git a/src/conf/cpu_conf.c b/src/conf/cpu_conf.c index 4aabe98..7c5bf69 100644 --- a/src/conf/cpu_conf.c +++ b/src/conf/cpu_conf.c @@ -352,11 +352,12 @@ virCPUDefParseXML(const xmlNodePtr node, ret = virStrToLong_ui(memory, NULL, 10, &def->cells[i].mem); if (ret == -1) { + virCPUReportError(VIR_ERR_INTERNAL_ERROR, + "%s", _("Invalid 'memory' attribute in NUMA cell")); VIR_FREE(cpus); VIR_FREE(memory); goto error; } - VIR_FREE(cpus); VIR_FREE(memory); } diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 01ee23f..7083928 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -3266,6 +3266,65 @@ qemuBuildSmpArgStr(const virDomainDefPtr def, return virBufferContentAndReset(&buf); } +static void +qemuBuildNumaCPUArgStr(char *cpumask, virBufferPtr buf) +{ + int i, first, last; + int cpuSet = 0; + + for (i = 0; i < VIR_DOMAIN_CPUMASK_LEN; i++) { + if (cpumask[i]) { + if (cpuSet) + last = i; + else { + first = last = i; + cpuSet = 1; + } + } else { + if (!cpuSet) + continue; + if (first == last) + virBufferAsprintf(buf, "%d,", first); + else + virBufferAsprintf(buf, "%d-%d,", first, last); + cpuSet = 0; + } + } + + if (cpuSet) { + if (first == last) + virBufferAsprintf(buf, "%d,", first); + else + virBufferAsprintf(buf, "%d-%d,", first, last); + } +} + +static int +qemuBuildNumaArgStr(const virDomainDefPtr def, virCommandPtr cmd) +{ + int i; + virBuffer buf = VIR_BUFFER_INITIALIZER; + + for (i = 0; i < def->cpu->ncells; i++) { + virCommandAddArg(cmd, "-numa"); + virBufferAsprintf(&buf, "node,nodeid=%d", def->cpu->cells[i].cellid); + virBufferAddLit(&buf, ",cpus="); + qemuBuildNumaCPUArgStr(def->cpu->cells[i].cpumask, &buf); + virBufferAsprintf(&buf, "mem=%d", + VIR_DIV_UP(def->cpu->cells[i].mem, 1024)); + + if (virBufferError(&buf)) + goto error; + + virCommandAddArgBuffer(cmd, &buf); + } + return 0; + +error: + virBufferFreeAndReset(&buf); + virReportOOMError(); + return -1; +} /* * Constructs a argv suitable for launching qemu with config defined @@ -3429,6 +3488,10 @@ qemuBuildCommandLine(virConnectPtr conn, virCommandAddArg(cmd, smp); VIR_FREE(smp); + if (def->cpu && def->cpu->ncells) + if (qemuBuildNumaArgStr(def, cmd) < 0) + goto error; + if (qemuCapsGet(qemuCaps, QEMU_CAPS_NAME)) { virCommandAddArg(cmd, "-name"); if (driver->setProcessName && diff --git a/tests/qemuxml2argvdata/qemuxml2argv-cpu-numa1.args b/tests/qemuxml2argvdata/qemuxml2argv-cpu-numa1.args new file mode 100644 index 0000000..7c0dd30 --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-cpu-numa1.args @@ -0,0 +1,5 @@ +LC_ALL=C PATH=/bin HOME=/home/test USER=test LOGNAME=test ./qemu.sh -S -M pc \ +-m 214 -smp 16 -numa node,nodeid=0,cpus=0-7,mem=107 \ +-numa node,nodeid=1,cpus=8-15,mem=107 -nographic -monitor \ +unix:/tmp/test-monitor,server,nowait -no-acpi -boot n -net none -serial none \ +-parallel none -usb diff --git a/tests/qemuxml2argvdata/qemuxml2argv-cpu-numa1.xml b/tests/qemuxml2argvdata/qemuxml2argv-cpu-numa1.xml new file mode 100644 index 0000000..9fcdda8 --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-cpu-numa1.xml @@ -0,0 +1,25 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory>219100</memory> + <currentMemory>219100</currentMemory> + <vcpu>16</vcpu> + <os> + <type arch='x86_64' machine='pc'>hvm</type> + <boot dev='network'/> + </os> + <cpu> + <topology sockets="2" cores="4" threads="2"/> + <numa> + <cell cpus="0-7" memory="109550"/> + <cell cpus="8-15" memory="109550"/> + </numa> + </cpu> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/./qemu.sh</emulator> + </devices> +</domain> diff --git a/tests/qemuxml2argvdata/qemuxml2argv-cpu-numa2.args b/tests/qemuxml2argvdata/qemuxml2argv-cpu-numa2.args new file mode 100644 index 0000000..2ac2568 --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-cpu-numa2.args @@ -0,0 +1,6 @@ +LC_ALL=C PATH=/bin HOME=/home/test USER=test LOGNAME=test ./qemu.sh -S -M pc \ +-m 214 -smp 16,sockets=2,cores=4,threads=2 \ +-numa node,nodeid=0,cpus=0-7,mem=107 \ +-numa node,nodeid=1,cpus=8-15,mem=107 -nographic -monitor \ +unix:/tmp/test-monitor,server,nowait -no-acpi -boot n -net none -serial none \ +-parallel none -usb diff --git a/tests/qemuxml2argvdata/qemuxml2argv-cpu-numa2.xml b/tests/qemuxml2argvdata/qemuxml2argv-cpu-numa2.xml new file mode 100644 index 0000000..9fcdda8 --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-cpu-numa2.xml @@ -0,0 +1,25 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory>219100</memory> + <currentMemory>219100</currentMemory> + <vcpu>16</vcpu> + <os> + <type arch='x86_64' machine='pc'>hvm</type> + <boot dev='network'/> + </os> + <cpu> + <topology sockets="2" cores="4" threads="2"/> + <numa> + <cell cpus="0-7" memory="109550"/> + <cell cpus="8-15" memory="109550"/> + </numa> + </cpu> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/./qemu.sh</emulator> + </devices> +</domain> diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c index d9a6e8d..6a8f898 100644 --- a/tests/qemuxml2argvtest.c +++ b/tests/qemuxml2argvtest.c @@ -580,6 +580,8 @@ mymain(void) DO_TEST("cpu-exact1", false, NONE); DO_TEST("cpu-exact2", false, NONE); DO_TEST("cpu-strict1", false, NONE); + DO_TEST("cpu-numa1", false, NONE); + DO_TEST("cpu-numa2", false, QEMU_CAPS_SMP_TOPOLOGY); DO_TEST("memtune", false, QEMU_CAPS_NAME); DO_TEST("blkiotune", false, QEMU_CAPS_NAME);

On Fri, Nov 11, 2011 at 06:23:04PM +0530, Bharata B Rao wrote:
qemu: Generate -numa option
From: Bharata B Rao <bharata@linux.vnet.ibm.com>
Add routines to generate -numa QEMU command line option based on <numa> ... </numa> XML specifications.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> ---
src/conf/cpu_conf.c | 3 + src/qemu/qemu_command.c | 63 ++++++++++++++++++++ tests/qemuxml2argvdata/qemuxml2argv-cpu-numa1.args | 5 ++ tests/qemuxml2argvdata/qemuxml2argv-cpu-numa1.xml | 25 ++++++++ tests/qemuxml2argvdata/qemuxml2argv-cpu-numa2.args | 6 ++ tests/qemuxml2argvdata/qemuxml2argv-cpu-numa2.xml | 25 ++++++++ tests/qemuxml2argvtest.c | 2 + 7 files changed, 128 insertions(+), 1 deletions(-) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-cpu-numa1.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-cpu-numa1.xml create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-cpu-numa2.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-cpu-numa2.xml
diff --git a/src/conf/cpu_conf.c b/src/conf/cpu_conf.c index 4aabe98..7c5bf69 100644 --- a/src/conf/cpu_conf.c +++ b/src/conf/cpu_conf.c @@ -352,11 +352,12 @@ virCPUDefParseXML(const xmlNodePtr node,
ret = virStrToLong_ui(memory, NULL, 10, &def->cells[i].mem); if (ret == -1) { + virCPUReportError(VIR_ERR_INTERNAL_ERROR, + "%s", _("Invalid 'memory' attribute in NUMA cell")); VIR_FREE(cpus); VIR_FREE(memory); goto error; } - VIR_FREE(cpus); VIR_FREE(memory); } diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 01ee23f..7083928 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -3266,6 +3266,65 @@ qemuBuildSmpArgStr(const virDomainDefPtr def, return virBufferContentAndReset(&buf); }
+static void +qemuBuildNumaCPUArgStr(char *cpumask, virBufferPtr buf) +{ + int i, first, last; + int cpuSet = 0; + + for (i = 0; i < VIR_DOMAIN_CPUMASK_LEN; i++) { + if (cpumask[i]) { + if (cpuSet) + last = i; + else { + first = last = i; + cpuSet = 1; + } + } else { + if (!cpuSet) + continue; + if (first == last) + virBufferAsprintf(buf, "%d,", first); + else + virBufferAsprintf(buf, "%d-%d,", first, last); + cpuSet = 0; + } + } + + if (cpuSet) { + if (first == last) + virBufferAsprintf(buf, "%d,", first); + else + virBufferAsprintf(buf, "%d-%d,", first, last); + } +} + +static int +qemuBuildNumaArgStr(const virDomainDefPtr def, virCommandPtr cmd) +{ + int i; + virBuffer buf = VIR_BUFFER_INITIALIZER; + + for (i = 0; i < def->cpu->ncells; i++) { + virCommandAddArg(cmd, "-numa"); + virBufferAsprintf(&buf, "node,nodeid=%d", def->cpu->cells[i].cellid); + virBufferAddLit(&buf, ",cpus="); + qemuBuildNumaCPUArgStr(def->cpu->cells[i].cpumask, &buf); + virBufferAsprintf(&buf, "mem=%d", + VIR_DIV_UP(def->cpu->cells[i].mem, 1024)); + + if (virBufferError(&buf)) + goto error; + + virCommandAddArgBuffer(cmd, &buf); + } + return 0; + +error: + virBufferFreeAndReset(&buf); + virReportOOMError(); + return -1; +}
/* * Constructs a argv suitable for launching qemu with config defined @@ -3429,6 +3488,10 @@ qemuBuildCommandLine(virConnectPtr conn, virCommandAddArg(cmd, smp); VIR_FREE(smp);
+ if (def->cpu && def->cpu->ncells) + if (qemuBuildNumaArgStr(def, cmd) < 0) + goto error; + if (qemuCapsGet(qemuCaps, QEMU_CAPS_NAME)) { virCommandAddArg(cmd, "-name"); if (driver->setProcessName && diff --git a/tests/qemuxml2argvdata/qemuxml2argv-cpu-numa1.args b/tests/qemuxml2argvdata/qemuxml2argv-cpu-numa1.args new file mode 100644 index 0000000..7c0dd30 --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-cpu-numa1.args @@ -0,0 +1,5 @@ +LC_ALL=C PATH=/bin HOME=/home/test USER=test LOGNAME=test ./qemu.sh -S -M pc \ +-m 214 -smp 16 -numa node,nodeid=0,cpus=0-7,mem=107 \ +-numa node,nodeid=1,cpus=8-15,mem=107 -nographic -monitor \ +unix:/tmp/test-monitor,server,nowait -no-acpi -boot n -net none -serial none \ +-parallel none -usb diff --git a/tests/qemuxml2argvdata/qemuxml2argv-cpu-numa1.xml b/tests/qemuxml2argvdata/qemuxml2argv-cpu-numa1.xml new file mode 100644 index 0000000..9fcdda8 --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-cpu-numa1.xml @@ -0,0 +1,25 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory>219100</memory> + <currentMemory>219100</currentMemory> + <vcpu>16</vcpu> + <os> + <type arch='x86_64' machine='pc'>hvm</type> + <boot dev='network'/> + </os> + <cpu> + <topology sockets="2" cores="4" threads="2"/> + <numa> + <cell cpus="0-7" memory="109550"/> + <cell cpus="8-15" memory="109550"/> + </numa> + </cpu> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/./qemu.sh</emulator> + </devices> +</domain> diff --git a/tests/qemuxml2argvdata/qemuxml2argv-cpu-numa2.args b/tests/qemuxml2argvdata/qemuxml2argv-cpu-numa2.args new file mode 100644 index 0000000..2ac2568 --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-cpu-numa2.args @@ -0,0 +1,6 @@ +LC_ALL=C PATH=/bin HOME=/home/test USER=test LOGNAME=test ./qemu.sh -S -M pc \ +-m 214 -smp 16,sockets=2,cores=4,threads=2 \ +-numa node,nodeid=0,cpus=0-7,mem=107 \ +-numa node,nodeid=1,cpus=8-15,mem=107 -nographic -monitor \ +unix:/tmp/test-monitor,server,nowait -no-acpi -boot n -net none -serial none \ +-parallel none -usb diff --git a/tests/qemuxml2argvdata/qemuxml2argv-cpu-numa2.xml b/tests/qemuxml2argvdata/qemuxml2argv-cpu-numa2.xml new file mode 100644 index 0000000..9fcdda8 --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-cpu-numa2.xml @@ -0,0 +1,25 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory>219100</memory> + <currentMemory>219100</currentMemory> + <vcpu>16</vcpu> + <os> + <type arch='x86_64' machine='pc'>hvm</type> + <boot dev='network'/> + </os> + <cpu> + <topology sockets="2" cores="4" threads="2"/> + <numa> + <cell cpus="0-7" memory="109550"/> + <cell cpus="8-15" memory="109550"/> + </numa> + </cpu> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/./qemu.sh</emulator> + </devices> +</domain> diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c index d9a6e8d..6a8f898 100644 --- a/tests/qemuxml2argvtest.c +++ b/tests/qemuxml2argvtest.c @@ -580,6 +580,8 @@ mymain(void) DO_TEST("cpu-exact1", false, NONE); DO_TEST("cpu-exact2", false, NONE); DO_TEST("cpu-strict1", false, NONE); + DO_TEST("cpu-numa1", false, NONE); + DO_TEST("cpu-numa2", false, QEMU_CAPS_SMP_TOPOLOGY);
DO_TEST("memtune", false, QEMU_CAPS_NAME); DO_TEST("blkiotune", false, QEMU_CAPS_NAME);
ACK Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 11/15/2011 04:49 AM, Daniel P. Berrange wrote:
On Fri, Nov 11, 2011 at 06:23:04PM +0530, Bharata B Rao wrote:
qemu: Generate -numa option
From: Bharata B Rao <bharata@linux.vnet.ibm.com>
Add routines to generate -numa QEMU command line option based on <numa> ... </numa> XML specifications.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
ACK
Applied, after fixing a leading TAB issue (again, 'make syntax-check'), as well as fixing a use of if stmt else { stmt stmt } to consistently use {} on both branches, per HACKING. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
participants (4)
-
Bharata B Rao
-
Daniel P. Berrange
-
Eric Blake
-
Michael Ellerman