[libvirt] [PATCH 0/5 0/1 0/1 V3] Add new public API virDomainGetPcpusUsage() and pcpuinfo command in virsh
by Lai Jiangshan
"virt-top -1" can call virDomainGetPcpusUsage() periodically and get
the CPU activities per CPU. (See the last patch in this series).
virsh also gains a pcpuinfo command, which calls virDomainGetPcpusUsage()
to report information about the physical CPUs, such as per-CPU usage and
the vCPU currently attached to each CPU.
# virsh pcpuinfo rhel6
CPU: 0
Curr VCPU: -
Usage: 47.3
CPU: 1
Curr VCPU: 1
Usage: 46.8
CPU: 2
Curr VCPU: 0
Usage: 52.7
CPU: 3
Curr VCPU: -
Usage: 44.1
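The calculation behind the Usage column is not shown in this cover letter. One common approach is to sample a cumulative CPU-time counter twice and divide the delta by the elapsed wall-clock time. A minimal sketch under that assumption (the function name and the nanosecond units are hypothetical and not taken from the patches):

```c
#include <stdint.h>

/* Hypothetical helper: percentage of one physical CPU consumed between two
 * samples of a cumulative CPU-time counter (nanoseconds).
 * Returns 0.0 when no wall-clock time elapsed or the counter went backwards. */
double pcpu_usage_percent(uint64_t prev_cpu_ns, uint64_t curr_cpu_ns,
                          uint64_t elapsed_wall_ns)
{
    if (elapsed_wall_ns == 0 || curr_cpu_ns < prev_cpu_ns)
        return 0.0;
    return 100.0 * (double)(curr_cpu_ns - prev_cpu_ns) / (double)elapsed_wall_ns;
}
```

A caller such as a top-like tool would keep the previous sample per CPU and call this once per refresh interval.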
Changed from V2:
Simple cleanup
Add python implementation of virDomainGetPcpusUsage()
Acked-by: "Richard W.M. Jones" <rjones(a)redhat.com>
Signed-off-by: Lai Jiangshan <laijs(a)cn.fujitsu.com>
Patch for libvirt(5 patches):
daemon/remote.c | 68 ++++++++++++++++++++++++++++
include/libvirt/libvirt.h.in | 5 ++
python/generator.py | 1 +
python/libvirt-override-api.xml | 6 +++
python/libvirt-override.c | 33 ++++++++++++++
src/driver.h | 7 +++
src/libvirt.c | 51 +++++++++++++++++++++
src/libvirt_public.syms | 5 ++
src/qemu/qemu.conf | 5 +-
src/qemu/qemu_conf.c | 3 +-
src/qemu/qemu_driver.c | 74 +++++++++++++++++++++++++++++++
src/remote/remote_driver.c | 51 +++++++++++++++++++++
src/remote/remote_protocol.x | 17 +++++++-
src/remote_protocol-structs | 13 +++++
src/util/cgroup.c | 7 +++
src/util/cgroup.h | 1 +
tools/virsh.c | 93 +++++++++++++++++++++++++++++++++++++++
tools/virsh.pod | 5 ++
18 files changed, 441 insertions(+), 4 deletions(-)
Patch for ocaml-libvirt (1 patch):
libvirt/libvirt.ml | 1 +
libvirt/libvirt.mli | 4 ++++
libvirt/libvirt_c_oneoffs.c | 25 +++++++++++++++++++++++++
3 files changed, 30 insertions(+), 0 deletions(-)
Patch for virt-top (1 patch):
virt-top/virt_top.ml | 75 +++++++++++++++++--------------------------------
1 files changed, 26 insertions(+), 49 deletions(-)
--
1.7.4.4
12 years, 9 months
[libvirt] [PATCH] util: fix mingw (and all non-linux) build failure
by Laine Stump
ATTRIBUTE_UNUSED was accidentally forgotten on one arg of a stub
function for functionality that's not present on non-linux
platforms. This causes a non-linux build with
--enable-compile-warnings=error to fail.
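To illustrate why a single missing annotation breaks the build: with unused-parameter warnings promoted to errors, every parameter of a stub that ignores its arguments has to be marked. A minimal sketch, where MY_ATTRIBUTE_UNUSED is a stand-in for libvirt's ATTRIBUTE_UNUSED (whose real definition may differ) and demo_get_vf_info is a hypothetical stub:

```c
/* Stand-in for libvirt's ATTRIBUTE_UNUSED macro; the real definition lives in
 * libvirt's internal headers and may differ. */
#if defined(__GNUC__)
# define MY_ATTRIBUTE_UNUSED __attribute__((__unused__))
#else
# define MY_ATTRIBUTE_UNUSED
#endif

/* Hypothetical stub for a platform that lacks the feature: every parameter is
 * marked unused so -Wunused-parameter combined with -Werror stays quiet. */
int demo_get_vf_info(const char *path MY_ATTRIBUTE_UNUSED,
                     char **pfname MY_ATTRIBUTE_UNUSED,
                     int *vf_index MY_ATTRIBUTE_UNUSED)
{
    return -1; /* not supported on this platform */
}
```

Dropping the annotation from any one of the three parameters would bring the warning back for that parameter only, which matches the failure described above.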
---
Pushed under build-breaker rule.
src/util/pci.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/src/util/pci.c b/src/util/pci.c
index c8a5287..a6212f2 100644
--- a/src/util/pci.c
+++ b/src/util/pci.c
@@ -2219,7 +2219,8 @@ pciDeviceNetName(char *device_link_sysfs_path ATTRIBUTE_UNUSED,
int
pciDeviceGetVirtualFunctionInfo(const char *vf_sysfs_device_path ATTRIBUTE_UNUSED,
- char **pfname, int *vf_index ATTRIBUTE_UNUSED)
+ char **pfname ATTRIBUTE_UNUSED,
+ int *vf_index ATTRIBUTE_UNUSED)
{
pciReportError(VIR_ERR_INTERNAL_ERROR, _("pciDeviceGetVirtualFunctionInfo "
"is not supported on non-linux platforms"));
--
1.7.7.6
12 years, 9 months
[libvirt] [PATCH RFC v2] qemu: Support numad
by Osier Yang
numad is a user-level daemon that monitors NUMA topology and
process resource consumption to facilitate good NUMA resource
alignment of applications/virtual machines, improving performance
and minimizing the cost of remote memory latencies. It provides a
pre-placement advisory interface, so significant processes can
be pre-bound to nodes with sufficient available resources.
More details: http://fedoraproject.org/wiki/Features/numad
"numad -w ncpus:memory_amount" is the advisory interface numad
provides currently.
This patch adds support by introducing a boolean XML element:
<numatune>
<autonuma/>
</numatune>
If it's specified, the number of vcpus and the current memory
amount specified in domain XML will be used for numad command
line (numad uses MB for memory amount):
numad -w $num_of_vcpus:$current_memory_amount / 1024
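The conversion described above can be sketched as follows. This is only an illustration of the arithmetic, assuming the domain's current memory is held in KiB; format_numad_arg is a hypothetical helper and not a function from the patch:

```c
#include <stdio.h>

/* Hypothetical helper: format the "ncpus:memory_MB" argument for "numad -w".
 * mem_kib is the domain's current memory in KiB; numad expects MB, so the
 * value is divided by 1024 (integer division, remainder dropped).
 * Returns 0 on success, -1 if the buffer is too small. */
int format_numad_arg(char *buf, size_t buflen,
                     unsigned int vcpus, unsigned long mem_kib)
{
    int n = snprintf(buf, buflen, "%u:%lu", vcpus, mem_kib / 1024);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

With the values from the test XML in this patch (2 vcpus, 219136 KiB), the argument would come out as "2:214".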
The advisory nodeset returned by numad is then used to set the
domain process CPU affinity (e.g. in qemuProcessInitCpuAffinity).
If the user specifies both a CPU affinity policy (e.g.
<vcpu cpuset="1-10,^7,^8">4</vcpu>) and XML requesting numad's
advisory nodeset, the specified CPU affinity is ignored.
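As a rough illustration of the cpuset syntax and of the precedence rule just described, here is a simplified sketch. It is not libvirt's virDomainCpuSetParse; demo_cpuset_parse and demo_pick_affinity are hypothetical names, and the parser only handles "N", "N-M" and "^N" entries separated by commas:

```c
#include <stdlib.h>
#include <string.h>

/* Parse a cpuset string such as "1-10,^7,^8" into mask[0..masklen-1]
 * (1 = CPU selected). Entries are applied left to right; "^N" clears CPU N.
 * Returns 0 on success, -1 on malformed input or out-of-range CPU numbers. */
int demo_cpuset_parse(const char *str, char *mask, size_t masklen)
{
    memset(mask, 0, masklen);
    while (*str) {
        int negate = 0;
        char *end;
        unsigned long lo, hi;

        if (*str == '^') {
            negate = 1;
            str++;
        }
        if (*str < '0' || *str > '9')
            return -1;
        lo = hi = strtoul(str, &end, 10);
        if (!negate && *end == '-') {
            str = end + 1;
            if (*str < '0' || *str > '9')
                return -1;
            hi = strtoul(str, &end, 10);
        }
        if (lo > hi || hi >= masklen)
            return -1;
        for (unsigned long i = lo; i <= hi; i++)
            mask[i] = negate ? 0 : 1;
        if (*end == ',') {
            end++;
            if (*end == '\0')
                return -1; /* trailing comma */
        } else if (*end != '\0') {
            return -1;
        }
        str = end;
    }
    return 0;
}

/* Pick the string that drives affinity: the numad advice wins when autonuma
 * is requested, so a user-supplied cpuset is ignored in that case. A NULL
 * result means "no preference"; the caller then uses all CPUs. */
const char *demo_pick_affinity(int autonuma, const char *numad_advice,
                               const char *user_cpuset)
{
    return autonuma ? numad_advice : user_cpuset;
}
```

The second helper only encodes the ordering of the checks; the real patch works on already-parsed masks inside qemuProcessInitCpuAffinity and lxcSetContainerCpuAffinity.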
Only QEMU/KVM and LXC drivers support it now.
v1 - v2:
* Since Bill Gray says it is fine to use the number of vcpus and
the current memory amount as the numad command line arguments,
even though semantically what numad expects is the number of
physical CPUs, let's go this way.
v2 drops the XML <cpu required_cpu='4' required_memory='512000'/>
and adds just a new boolean XML element <autonuma>. The code is
refactored accordingly.
* v1 overrode the cpuset specified by <vcpu cpuset='1-10,^7'>2</vcpu>;
v2 doesn't do that, and just silently ignores it.
* xml2xml test is added
---
configure.ac | 8 ++
docs/formatdomain.html.in | 11 ++-
docs/schemas/domaincommon.rng | 5 +
src/conf/domain_conf.c | 124 +++++++++++++-------
src/conf/domain_conf.h | 1 +
src/lxc/lxc_controller.c | 86 ++++++++++++--
src/qemu/qemu_process.c | 86 ++++++++++++--
.../qemuxml2argv-numatune-autonuma.xml | 32 +++++
tests/qemuxml2xmltest.c | 1 +
9 files changed, 287 insertions(+), 67 deletions(-)
create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-numatune-autonuma.xml
diff --git a/configure.ac b/configure.ac
index c9cdd7b..31f0835 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1445,6 +1445,14 @@ AM_CONDITIONAL([HAVE_NUMACTL], [test "$with_numactl" != "no"])
AC_SUBST([NUMACTL_CFLAGS])
AC_SUBST([NUMACTL_LIBS])
+dnl Do we have numad?
+if test "$with_qemu" = "yes"; then
+ AC_PATH_PROG([NUMAD], [numad], [], [/bin:/usr/bin:/usr/local/bin:$PATH])
+
+ if test -n "$NUMAD"; then
+ AC_DEFINE_UNQUOTED([NUMAD],["$NUMAD"], [Location or name of the numad program])
+ fi
+fi
dnl pcap lib
LIBPCAP_CONFIG="pcap-config"
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
index 42f38d3..22872c8 100644
--- a/docs/formatdomain.html.in
+++ b/docs/formatdomain.html.in
@@ -505,6 +505,7 @@
...
<numatune>
<memory mode="strict" nodeset="1-4,^3"/>
+ <autonuma/>
</numatune>
...
</domain>
@@ -519,7 +520,7 @@
<span class='since'>Since 0.9.3</span>
<dt><code>memory</code></dt>
<dd>
- The optional <code>memory</code> element specify how to allocate memory
+ The optional <code>memory</code> element specifies how to allocate memory
for the domain process on a NUMA host. It contains two attributes,
attribute <code>mode</code> is either 'interleave', 'strict',
or 'preferred',
@@ -527,6 +528,14 @@
syntax with attribute <code>cpuset</code> of element <code>vcpu</code>.
<span class='since'>Since 0.9.3</span>
</dd>
+ <dd>
+ The optional <code>autonuma</code> element indicates pinning the virtual CPUs
+ to the advisory nodeset returned by querying "numad" (a system daemon that
+ monitors NUMA topology and usage). With using this element, the physical CPUs
+ specified by attribute <code>cpuset</code> (of element <code>vcpu</code>) will
+ be ignored.
+ <span class='since'>Since 0.9.11 (QEMU/KVM and LXC only)</span>
+ </dd>
</dl>
diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
index a905457..9066b40 100644
--- a/docs/schemas/domaincommon.rng
+++ b/docs/schemas/domaincommon.rng
@@ -549,6 +549,11 @@
</attribute>
</element>
</optional>
+ <optional>
+ <element name="autonuma">
+ <empty/>
+ </element>
+ </optional>
</element>
</optional>
</interleave>
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index b994718..89d00ae 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -7447,7 +7447,6 @@ error:
goto cleanup;
}
-
static int virDomainDefMaybeAddController(virDomainDefPtr def,
int type,
int idx)
@@ -7507,6 +7506,7 @@ static virDomainDefPtr virDomainDefParseXML(virCapsPtr caps,
bool uuid_generated = false;
virBitmapPtr bootMap = NULL;
unsigned long bootMapSize = 0;
+ xmlNodePtr cur;
if (VIR_ALLOC(def) < 0) {
virReportOOMError();
@@ -7776,47 +7776,76 @@ static virDomainDefPtr virDomainDefParseXML(virCapsPtr caps,
VIR_FREE(nodes);
/* Extract numatune if exists. */
- if ((n = virXPathNodeSet("./numatune", ctxt, NULL)) < 0) {
+ if ((n = virXPathNodeSet("./numatune", ctxt, &nodes)) < 0) {
virDomainReportError(VIR_ERR_INTERNAL_ERROR,
"%s", _("cannot extract numatune nodes"));
goto error;
}
+ if (n > 1) {
+ virDomainReportError(VIR_ERR_XML_ERROR, "%s",
+ _("only one numatune is supported"));
+ VIR_FREE(nodes);
+ goto error;
+ }
+
if (n) {
- tmp = virXPathString("string(./numatune/memory/@nodeset)", ctxt);
- if (tmp) {
- char *set = tmp;
- int nodemasklen = VIR_DOMAIN_CPUMASK_LEN;
+ cur = nodes[0]->children;
+ while (cur != NULL) {
+ if (cur->type == XML_ELEMENT_NODE) {
+ if ((xmlStrEqual(cur->name, BAD_CAST "memory"))) {
+ tmp = virXMLPropString(cur, "nodeset");
- if (VIR_ALLOC_N(def->numatune.memory.nodemask, nodemasklen) < 0) {
- goto no_memory;
- }
+ if (tmp) {
+ char *set = tmp;
+ int nodemasklen = VIR_DOMAIN_CPUMASK_LEN;
- /* "nodeset" leads same syntax with "cpuset". */
- if (virDomainCpuSetParse(set, 0, def->numatune.memory.nodemask,
- nodemasklen) < 0)
- goto error;
- VIR_FREE(tmp);
- } else {
- virDomainReportError(VIR_ERR_INTERNAL_ERROR,
- "%s", _("nodeset for NUMA memory tuning must be set"));
- goto error;
- }
+ if (VIR_ALLOC_N(def->numatune.memory.nodemask,
+ nodemasklen) < 0) {
+ virReportOOMError();
+ goto error;
+ }
- tmp = virXPathString("string(./numatune/memory/@mode)", ctxt);
- if (tmp) {
- if ((def->numatune.memory.mode =
- virDomainNumatuneMemModeTypeFromString(tmp)) < 0) {
- virDomainReportError(VIR_ERR_INTERNAL_ERROR,
- _("Unsupported NUMA memory tuning mode '%s'"),
- tmp);
- goto error;
+ /* "nodeset" leads same syntax with "cpuset". */
+ if (virDomainCpuSetParse(set, 0,
+ def->numatune.memory.nodemask,
+ nodemasklen) < 0)
+ goto error;
+ VIR_FREE(tmp);
+ } else {
+ virDomainReportError(VIR_ERR_XML_ERROR, "%s",
+ _("nodeset for NUMA memory "
+ "tuning must be set"));
+ goto error;
+ }
+
+ tmp = virXMLPropString(cur, "mode");
+ if (tmp) {
+ if ((def->numatune.memory.mode =
+ virDomainNumatuneMemModeTypeFromString(tmp)) < 0) {
+ virDomainReportError(VIR_ERR_XML_ERROR,
+ _("Unsupported NUMA memory "
+ "tuning mode '%s'"),
+ tmp);
+ goto error;
+ }
+ VIR_FREE(tmp);
+ } else {
+ def->numatune.memory.mode = VIR_DOMAIN_NUMATUNE_MEM_STRICT;
+ }
+ } else if (xmlStrEqual(cur->name, BAD_CAST "autonuma")) {
+ def->numatune.autonuma = true;
+ } else {
+ virDomainReportError(VIR_ERR_XML_ERROR,
+ _("unsupported XML element %s"),
+ (const char *)cur->name);
+ goto error;
+ }
}
- VIR_FREE(tmp);
- } else {
- def->numatune.memory.mode = VIR_DOMAIN_NUMATUNE_MEM_STRICT;
+ cur = cur->next;
}
}
+ VIR_FREE(nodes);
n = virXPathNodeSet("./features/*", ctxt, &nodes);
if (n < 0)
@@ -12149,23 +12178,30 @@ virDomainDefFormatInternal(virDomainDefPtr def,
def->cputune.period || def->cputune.quota)
virBufferAddLit(buf, " </cputune>\n");
- if (def->numatune.memory.nodemask) {
- const char *mode;
- char *nodemask = NULL;
-
+ if (def->numatune.memory.nodemask ||
+ def->numatune.autonuma) {
virBufferAddLit(buf, " <numatune>\n");
- nodemask = virDomainCpuSetFormat(def->numatune.memory.nodemask,
- VIR_DOMAIN_CPUMASK_LEN);
- if (nodemask == NULL) {
- virDomainReportError(VIR_ERR_INTERNAL_ERROR, "%s",
- _("failed to format nodeset for NUMA memory tuning"));
- goto cleanup;
+ if (def->numatune.memory.nodemask) {
+ const char *mode;
+ char *nodemask = NULL;
+
+ nodemask = virDomainCpuSetFormat(def->numatune.memory.nodemask,
+ VIR_DOMAIN_CPUMASK_LEN);
+ if (nodemask == NULL) {
+ virDomainReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("failed to format nodeset for "
+ "NUMA memory tuning"));
+ goto cleanup;
+ }
+
+ mode = virDomainNumatuneMemModeTypeToString(def->numatune.memory.mode);
+ virBufferAsprintf(buf, " <memory mode='%s' nodeset='%s'/>\n",
+ mode, nodemask);
+ VIR_FREE(nodemask);
}
- mode = virDomainNumatuneMemModeTypeToString(def->numatune.memory.mode);
- virBufferAsprintf(buf, " <memory mode='%s' nodeset='%s'/>\n",
- mode, nodemask);
- VIR_FREE(nodemask);
+ if (def->numatune.autonuma)
+ virBufferAddLit(buf, " <autonuma/>\n");
virBufferAddLit(buf, " </numatune>\n");
}
diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h
index 87b4103..b15faf0 100644
--- a/src/conf/domain_conf.h
+++ b/src/conf/domain_conf.h
@@ -1458,6 +1458,7 @@ struct _virDomainNumatuneDef {
int mode;
} memory;
+ bool autonuma;
/* Future NUMA tuning related stuff should go here. */
};
diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c
index 8f336f5..507b2c6 100644
--- a/src/lxc/lxc_controller.c
+++ b/src/lxc/lxc_controller.c
@@ -327,6 +327,41 @@ static int lxcSetContainerNUMAPolicy(virDomainDefPtr def)
}
#endif
+#if defined(NUMAD)
+static char *
+lxcGetNumadAdvice(virDomainDefPtr def)
+{
+ virCommandPtr cmd = NULL;
+ char *args = NULL;
+ char *ret = NULL;
+
+ if (virAsprintf(&args, "%d:%lu", def->vcpus, def->mem.cur_balloon) < 0) {
+ virReportOOMError();
+ goto out;
+ }
+ cmd = virCommandNewArgList(NUMAD, "-w", args, NULL);
+
+ virCommandSetOutputBuffer(cmd, &ret);
+
+ if (virCommandRun(cmd, NULL) < 0) {
+ lxcError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("Failed to query numad for the advisory nodeset"));
+ }
+
+out:
+ VIR_FREE(args);
+ virCommandFree(cmd);
+ return ret;
+}
+#else
+static char *
+lxcGetNumadAdvice(virDomainDefPtr def ATTRIBUTE_UNUSED)
+{
+ lxcError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+ _("numad is not available on this host"));
+ return NULL;
+}
+#endif
/*
* To be run while still single threaded
@@ -355,19 +390,48 @@ static int lxcSetContainerCpuAffinity(virDomainDefPtr def)
return -1;
}
- if (def->cpumask) {
- /* XXX why don't we keep 'cpumask' in the libvirt cpumap
- * format to start with ?!?! */
- for (i = 0 ; i < maxcpu && i < def->cpumasklen ; i++)
- if (def->cpumask[i])
+ if (def->numatune.autonuma) {
+ char *tmp_cpumask = NULL;
+ char *nodeset = NULL;
+
+ nodeset = lxcGetNumadAdvice(def);
+ if (!nodeset)
+ return -1;
+
+ if (VIR_ALLOC_N(tmp_cpumask, VIR_DOMAIN_CPUMASK_LEN) < 0) {
+ virReportOOMError();
+ return -1;
+ }
+
+ if (virDomainCpuSetParse(nodeset, 0, tmp_cpumask,
+ VIR_DOMAIN_CPUMASK_LEN) < 0) {
+ VIR_FREE(tmp_cpumask);
+ VIR_FREE(nodeset);
+ return -1;
+ }
+
+ for (i = 0; i < maxcpu && i < VIR_DOMAIN_CPUMASK_LEN; i++) {
+ if (tmp_cpumask[i])
VIR_USE_CPU(cpumap, i);
+ }
+
+ VIR_FREE(tmp_cpumask);
+ VIR_FREE(nodeset);
} else {
- /* You may think this is redundant, but we can't assume libvirtd
- * itself is running on all pCPUs, so we need to explicitly set
- * the spawned LXC instance to all pCPUs if no map is given in
- * its config file */
- for (i = 0 ; i < maxcpu ; i++)
- VIR_USE_CPU(cpumap, i);
+ if (def->cpumask) {
+ /* XXX why don't we keep 'cpumask' in the libvirt cpumap
+ * format to start with ?!?! */
+ for (i = 0 ; i < maxcpu && i < def->cpumasklen ; i++)
+ if (def->cpumask[i])
+ VIR_USE_CPU(cpumap, i);
+ } else {
+ /* You may think this is redundant, but we can't assume libvirtd
+ * itself is running on all pCPUs, so we need to explicitly set
+ * the spawned LXC instance to all pCPUs if no map is given in
+ * its config file */
+ for (i = 0 ; i < maxcpu ; i++)
+ VIR_USE_CPU(cpumap, i);
+ }
}
/* We are pressuming we are running between fork/exec of LXC
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 7b99814..209403f 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -1633,6 +1633,41 @@ qemuProcessInitNumaMemoryPolicy(virDomainObjPtr vm)
}
#endif
+#if defined(NUMAD)
+static char *
+qemuGetNumadAdvice(virDomainDefPtr def)
+{
+ virCommandPtr cmd = NULL;
+ char *args = NULL;
+ char *output = NULL;
+
+ if (virAsprintf(&args, "%d:%lu", def->vcpus, def->mem.cur_balloon) < 0) {
+ virReportOOMError();
+ goto out;
+ }
+ cmd = virCommandNewArgList(NUMAD, "-w", args, NULL);
+
+ virCommandSetOutputBuffer(cmd, &output);
+
+ if (virCommandRun(cmd, NULL) < 0)
+ qemuReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("Failed to query numad for the advisory nodeset"));
+
+out:
+ VIR_FREE(args);
+ virCommandFree(cmd);
+ return output;
+}
+#else
+static char *
+qemuGetNumadAdvice(virDomainDefPtr def ATTRIBUTE_UNUSED)
+{
+ qemuReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+ _("numad is not available on this host"));
+ return NULL;
+}
+#endif
+
/*
* To be run between fork/exec of QEMU only
*/
@@ -1661,19 +1696,48 @@ qemuProcessInitCpuAffinity(virDomainObjPtr vm)
return -1;
}
- if (vm->def->cpumask) {
- /* XXX why don't we keep 'cpumask' in the libvirt cpumap
- * format to start with ?!?! */
- for (i = 0 ; i < maxcpu && i < vm->def->cpumasklen ; i++)
- if (vm->def->cpumask[i])
+ if (vm->def->numatune.autonuma) {
+ char *tmp_cpumask = NULL;
+ char *nodeset = NULL;
+
+ nodeset = qemuGetNumadAdvice(vm->def);
+ if (!nodeset)
+ return -1;
+
+ if (VIR_ALLOC_N(tmp_cpumask, VIR_DOMAIN_CPUMASK_LEN) < 0) {
+ virReportOOMError();
+ return -1;
+ }
+
+ if (virDomainCpuSetParse(nodeset, 0, tmp_cpumask,
+ VIR_DOMAIN_CPUMASK_LEN) < 0) {
+ VIR_FREE(tmp_cpumask);
+ VIR_FREE(nodeset);
+ return -1;
+ }
+
+ for (i = 0; i < maxcpu && i < VIR_DOMAIN_CPUMASK_LEN; i++) {
+ if (tmp_cpumask[i])
VIR_USE_CPU(cpumap, i);
+ }
+
+ VIR_FREE(tmp_cpumask);
+ VIR_FREE(nodeset);
} else {
- /* You may think this is redundant, but we can't assume libvirtd
- * itself is running on all pCPUs, so we need to explicitly set
- * the spawned QEMU instance to all pCPUs if no map is given in
- * its config file */
- for (i = 0 ; i < maxcpu ; i++)
- VIR_USE_CPU(cpumap, i);
+ if (vm->def->cpumask) {
+ /* XXX why don't we keep 'cpumask' in the libvirt cpumap
+ * format to start with ?!?! */
+ for (i = 0 ; i < maxcpu && i < vm->def->cpumasklen ; i++)
+ if (vm->def->cpumask[i])
+ VIR_USE_CPU(cpumap, i);
+ } else {
+ /* You may think this is redundant, but we can't assume libvirtd
+ * itself is running on all pCPUs, so we need to explicitly set
+ * the spawned QEMU instance to all pCPUs if no map is given in
+ * its config file */
+ for (i = 0 ; i < maxcpu ; i++)
+ VIR_USE_CPU(cpumap, i);
+ }
}
/* We are pressuming we are running between fork/exec of QEMU
diff --git a/tests/qemuxml2argvdata/qemuxml2argv-numatune-autonuma.xml b/tests/qemuxml2argvdata/qemuxml2argv-numatune-autonuma.xml
new file mode 100644
index 0000000..7e50bbd
--- /dev/null
+++ b/tests/qemuxml2argvdata/qemuxml2argv-numatune-autonuma.xml
@@ -0,0 +1,32 @@
+<domain type='qemu'>
+ <name>QEMUGuest1</name>
+ <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
+ <memory>219136</memory>
+ <currentMemory>219136</currentMemory>
+ <vcpu current='1'>2</vcpu>
+ <numatune>
+ <autonuma/>
+ </numatune>
+ <os>
+ <type arch='i686' machine='pc'>hvm</type>
+ <boot dev='hd'/>
+ </os>
+ <cpu>
+ <topology sockets='2' cores='1' threads='1'/>
+ </cpu>
+ <clock offset='utc'/>
+ <on_poweroff>destroy</on_poweroff>
+ <on_reboot>restart</on_reboot>
+ <on_crash>destroy</on_crash>
+ <devices>
+ <emulator>/usr/bin/qemu</emulator>
+ <disk type='block' device='disk'>
+ <source dev='/dev/HostVG/QEMUGuest1'/>
+ <target dev='hda' bus='ide'/>
+ <address type='drive' controller='0' bus='0' target='0' unit='0'/>
+ </disk>
+ <controller type='usb' index='0'/>
+ <controller type='ide' index='0'/>
+ <memballoon model='virtio'/>
+ </devices>
+</domain>
diff --git a/tests/qemuxml2xmltest.c b/tests/qemuxml2xmltest.c
index 03c75f8..3893ece 100644
--- a/tests/qemuxml2xmltest.c
+++ b/tests/qemuxml2xmltest.c
@@ -199,6 +199,7 @@ mymain(void)
DO_TEST("blkiotune");
DO_TEST("blkiotune-device");
DO_TEST("cputune");
+ DO_TEST("numatune-autonuma");
DO_TEST("smp");
DO_TEST("lease");
--
1.7.1
12 years, 9 months
[libvirt] [PATCH] rpc: allow truncated return for virDomainGetCPUStats
by Eric Blake
The RPC code assumed that the array returned by the driver would be
fully populated; that is, ncpus on entry resulted in ncpus * return
value on exit. However, while we don't support holes in the middle
of ncpus, we do want to permit the case of ncpus on entry being
longer than the array returned by the driver (that is, it should be
safe for the caller to pass ncpus=128 on entry, and the driver will
stop populating the array when it hits max_id).
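The client-side consequence of allowing a truncated reply can be sketched like this. The helper below is hypothetical and only mirrors the kind of length check the patch adjusts in remoteDomainGetCPUStats; the zero-stride guard is an addition of this sketch:

```c
#include <stddef.h>

/* Hypothetical check: the caller asked for nparams stats on each of ncpus
 * CPUs; the server returned ret_len entries in strides of ret_nparams.
 * Returns the number of CPUs actually populated, or -1 if the reply is
 * malformed. A reply covering fewer CPUs than requested is accepted. */
int demo_returned_cpus(size_t nparams, size_t ncpus,
                       size_t ret_nparams, size_t ret_len)
{
    if (ret_len == 0)
        return 0;
    if (ret_len > nparams * ncpus)      /* more data than requested */
        return -1;
    if (ret_nparams == 0 || ret_nparams > nparams)
        return -1;
    if (ret_len % ret_nparams != 0)     /* must be whole per-CPU strides */
        return -1;
    return (int)(ret_len / ret_nparams);
}
```

For example, a caller passing ncpus=128 on a 2-CPU host would get back two strides and treat the rest of its array as untouched.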
There are now three cases:
server 0.9.10 and client 0.9.10 or newer: No impact - there were no
hypervisor drivers that supported cpu stats
server 0.9.11 or newer and client 0.9.10: if the client calls with
ncpus beyond the max, then the rpc call will fail on the client side
and disconnect the client, but the server is no worse for the wear
server 0.9.11 or newer and client 0.9.11: the server can return a
truncated array and the client will do just fine
I reproduced the problem by using a host with 2 CPUs, and doing:
virsh cpu-stats $dom --start 1 --count 2
* daemon/remote.c (remoteDispatchDomainGetCPUStats): Allow driver
to omit tail of array.
* src/remote/remote_driver.c (remoteDomainGetCPUStats):
Accommodate driver that omits tail of array.
---
daemon/remote.c | 10 ++++++++--
src/remote/remote_driver.c | 6 ++++--
2 files changed, 12 insertions(+), 4 deletions(-)
diff --git a/daemon/remote.c b/daemon/remote.c
index 74a5f16..39302cc 100644
--- a/daemon/remote.c
+++ b/daemon/remote.c
@@ -3574,11 +3574,17 @@ remoteDispatchDomainGetCPUStats(virNetServerPtr server ATTRIBUTE_UNUSED,
args->flags) < 0)
goto cleanup;
- percpu_len = ret->params.params_len / args->ncpus;
-
success:
rv = 0;
ret->nparams = percpu_len;
+ if (args->nparams && !(args->flags & VIR_TYPED_PARAM_STRING_OKAY)) {
+ int i;
+
+ for (i = 0; i < percpu_len; i++) {
+ if (params[i].type == VIR_TYPED_PARAM_STRING)
+ ret->nparams--;
+ }
+ }
cleanup:
if (rv < 0)
diff --git a/src/remote/remote_driver.c b/src/remote/remote_driver.c
index 9e74cea..031167a 100644
--- a/src/remote/remote_driver.c
+++ b/src/remote/remote_driver.c
@@ -2382,7 +2382,7 @@ static int remoteDomainGetCPUStats(virDomainPtr domain,
/* Check the length of the returned list carefully. */
if (ret.params.params_len > nparams * ncpus ||
(ret.params.params_len &&
- ret.nparams * ncpus != ret.params.params_len)) {
+ ((ret.params.params_len % ret.nparams) || ret.nparams > nparams))) {
remoteError(VIR_ERR_RPC, "%s",
_("remoteDomainGetCPUStats: "
"returned number of stats exceeds limit"));
@@ -2399,9 +2399,11 @@ static int remoteDomainGetCPUStats(virDomainPtr domain,
}
/* The remote side did not send back any zero entries, so we have
- * to expand things back into a possibly sparse array.
+ * to expand things back into a possibly sparse array, where the
+ * tail of the array may be omitted.
*/
memset(params, 0, sizeof(*params) * nparams * ncpus);
+ ncpus = ret.params.params_len / ret.nparams;
for (cpu = 0; cpu < ncpus; cpu++) {
int tmp = nparams;
remote_typed_param *stride = &ret.params.params_val[cpu * ret.nparams];
--
1.7.7.6
12 years, 9 months
[libvirt] [PATCH] Ensure max_id is initialized in linuxParseCPUmap()
by Daniel P. Berrange
From: "Daniel P. Berrange" <berrange(a)redhat.com>
Pushing as a build-breaker fix
---
src/nodeinfo.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/src/nodeinfo.c b/src/nodeinfo.c
index 709e94a..61a5925 100644
--- a/src/nodeinfo.c
+++ b/src/nodeinfo.c
@@ -581,7 +581,7 @@ linuxParseCPUmap(int *max_cpuid, const char *path)
{
char *map = NULL;
char *str = NULL;
- int max_id, i;
+ int max_id = 0, i;
if (virFileReadAll(path, 5 * VIR_DOMAIN_CPUMASK_LEN, &str) < 0) {
virReportOOMError();
--
1.7.7.6
12 years, 9 months
[libvirt] ANNOUNCE: libvirt-glib release 0.0.6
by Daniel P. Berrange
I am pleased to announce that a new release of the libvirt-glib package,
version 0.0.6 is now available from
ftp://libvirt.org/libvirt/glib/
The packages are GPG signed with
Key fingerprint: DAF3 A6FD B26B 6291 2D0E 8E3F BE86 EBB4 1510 4FDF (4096R)
New in this release:
- Add binding for virDomainBlockResize(): gvir_domain_disk_resize().
- Set correct target node attribute for domain interface.
gvir_config_domain_interface_set_ifname() should be setting 'dev' attribute
under 'target', not 'device'.
- Getter for the associated domain of a domain device.
- Getters for GVirConfigDomainInterface attributes.
- GVirDomainDevice now has an associated GVirConfigDomainDevice.
- Remove now redundant 'path' property from GVirDomainDevice subclasses.
- Add gvir_domain_get_devices().
- Empty statistics for user-mode interfaces. One of the limitations of user-mode
networking of libvirt is that you can't get statistics for it (not yet, at
least). Instead of erroring-out in that case, simply return empty statistics
result and spit a debug message.
- Fix a GVirStream leak.
- Also distribute GNUmakefile, cfg.mk and maint.mk files.
libvirt-glib comprises three distinct libraries:
- libvirt-glib - Integrate with the GLib event loop and error handling
- libvirt-gconfig - Representation of libvirt XML documents as GObjects
- libvirt-gobject - Mapping of libvirt APIs into the GObject type system
NB: While libvirt aims to be API/ABI stable, for the first few releases,
we are *NOT* guaranteeing that libvirt-glib libraries are API/ABI stable.
ABI stability will only be guaranteed once the bulk of the APIs have been
fleshed out and proved in non-trivial application usage. We anticipate
this will be within the next few months in order to line up with Fedora 17.
Follow up comments about libvirt-glib should be directed to the regular
libvir-list redhat com development list.
Thanks to all the people involved in contributing to this release.
Regards,
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
12 years, 9 months
[libvirt] [PATCH] docs: Fix typo
by Osier Yang
It used "<" for ">", reported by Kyla Zhang <weizhan(a)redhat.com>
--
Pushed under trivial rule.
---
docs/formatdomain.html.in | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
index 6434ae5..42f38d3 100644
--- a/docs/formatdomain.html.in
+++ b/docs/formatdomain.html.in
@@ -1096,9 +1096,9 @@
</disk>
<disk type='block' device='lun'>
<driver name='qemu' type='raw'/>
- <source dev='/dev/sda'/<
- <target dev='sda' bus='scsi'/<
- <address type='drive' controller='0' bus='0' target='3' unit='0'/<
+ <source dev='/dev/sda'/>
+ <target dev='sda' bus='scsi'/>
+ <address type='drive' controller='0' bus='0' target='3' unit='0'/>
</disk>
</devices>
...</pre>
--
1.7.1
12 years, 9 months
[libvirt] [PATCH RFC]: Support numad
by Osier Yang
numad is a user-level daemon that monitors NUMA topology and
process resource consumption to facilitate good NUMA resource
alignment of applications/virtual machines, improving performance
and minimizing the cost of remote memory latencies. It provides a
pre-placement advisory interface, so significant processes can
be pre-bound to nodes with sufficient available resources.
More details: http://fedoraproject.org/wiki/Features/numad
"numad -w ncpus:memory_amount" is the advisory interface numad
provides currently.
This patch adds support by introducing new XML like:
<numatune>
<cpu required_cpus="4" required_memory="524288"/>
</numatune>
And the corresponding numad command line will be:
numad -w 4:512
The advisory nodeset returned by numad is then used to set the
domain process CPU affinity (e.g. in qemuProcessInitCpuAffinity).
If the user specifies both a CPU affinity policy (e.g.
<vcpu cpuset="1-10,^7,^8">4</vcpu>) and XML requesting numad's
advisory nodeset, the specified CPU affinity is overridden by the
nodeset returned from numad.
If the XML specifies no CPU affinity policy but does request numad,
the returned nodeset is printed as
<vcpu cpuset="$nodeset_from_numad">4</vcpu>.
Only QEMU/KVM and LXC drivers support it now.
---
configure.ac | 8 +++
docs/formatdomain.html.in | 18 ++++++-
docs/schemas/domaincommon.rng | 12 ++++
src/conf/domain_conf.c | 125 +++++++++++++++++++++++++++++++----------
src/conf/domain_conf.h | 5 ++
src/lxc/lxc_controller.c | 98 ++++++++++++++++++++++++++++----
src/qemu/qemu_process.c | 99 +++++++++++++++++++++++++++++----
7 files changed, 311 insertions(+), 54 deletions(-)
diff --git a/configure.ac b/configure.ac
index c9cdd7b..31f0835 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1445,6 +1445,14 @@ AM_CONDITIONAL([HAVE_NUMACTL], [test "$with_numactl" != "no"])
AC_SUBST([NUMACTL_CFLAGS])
AC_SUBST([NUMACTL_LIBS])
+dnl Do we have numad?
+if test "$with_qemu" = "yes"; then
+ AC_PATH_PROG([NUMAD], [numad], [], [/bin:/usr/bin:/usr/local/bin:$PATH])
+
+ if test -n "$NUMAD"; then
+ AC_DEFINE_UNQUOTED([NUMAD],["$NUMAD"], [Location or name of the numad program])
+ fi
+fi
dnl pcap lib
LIBPCAP_CONFIG="pcap-config"
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
index 6fcca94..d8e70a6 100644
--- a/docs/formatdomain.html.in
+++ b/docs/formatdomain.html.in
@@ -505,6 +505,7 @@
...
<numatune>
<memory mode="strict" nodeset="1-4,^3"/>
+ <cpu required_cpus="3" required_memory="524288"/>
</numatune>
...
</domain>
@@ -519,7 +520,7 @@
<span class='since'>Since 0.9.3</span>
<dt><code>memory</code></dt>
<dd>
- The optional <code>memory</code> element specify how to allocate memory
+ The optional <code>memory</code> element specifies how to allocate memory
for the domain process on a NUMA host. It contains two attributes,
attribute <code>mode</code> is either 'interleave', 'strict',
or 'preferred',
@@ -527,6 +528,21 @@
syntax with attribute <code>cpuset</code> of element <code>vcpu</code>.
<span class='since'>Since 0.9.3</span>
</dd>
+ <dd>
+ The optional <code>cpu</code> element indicates pinning the virtual CPUs
+ to the nodeset returned by querying "numad" (a system daemon that monitors
+ NUMA topology and usage). It has two attributes, attribute
+ <code>required_cpus</code> specifies the number of physical CPUs the guest
+ process want to use. And the optional attribute <code>required_memory</code>
+ specifies the amount of free memory the guest process want to see on a node,
+ "numad" will pick the physical CPUs on the node which has enough free
+ memory of amount specified by <code>required_memory</code>.
+
+ NB, with using this element, the physical CPUs specified by attribute
+ <code>cpuset</code> (of element <code>vcpu</code>) will be overridden by the
+ nodeset returned from "numad".
+ <span class='since'>Since 0.9.11 (QEMU/KVM and LXC only)</span>
+ </dd>
</dl>
diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
index 3908733..d0f443d 100644
--- a/docs/schemas/domaincommon.rng
+++ b/docs/schemas/domaincommon.rng
@@ -549,6 +549,18 @@
</attribute>
</element>
</optional>
+ <optional>
+ <element name="cpu">
+ <attribute name="required_cpu">
+ <ref name="countCPU"/>
+ </attribute>
+ <optional>
+ <attribute name="required_memory">
+ <ref name="memoryKB"/>
+ </attribute>
+ </optional>
+ </element>
+ </optional>
</element>
</optional>
</interleave>
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index f9654f1..aa03c05 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -7125,7 +7125,6 @@ error:
goto cleanup;
}
-
static int virDomainDefMaybeAddController(virDomainDefPtr def,
int type,
int idx)
@@ -7185,6 +7184,7 @@ static virDomainDefPtr virDomainDefParseXML(virCapsPtr caps,
bool uuid_generated = false;
virBitmapPtr bootMap = NULL;
unsigned long bootMapSize = 0;
+ xmlNodePtr cur;
if (VIR_ALLOC(def) < 0) {
virReportOOMError();
@@ -7454,47 +7454,100 @@ static virDomainDefPtr virDomainDefParseXML(virCapsPtr caps,
VIR_FREE(nodes);
/* Extract numatune if exists. */
- if ((n = virXPathNodeSet("./numatune", ctxt, NULL)) < 0) {
+ if ((n = virXPathNodeSet("./numatune", ctxt, &nodes)) < 0) {
virDomainReportError(VIR_ERR_INTERNAL_ERROR,
"%s", _("cannot extract numatune nodes"));
goto error;
}
+ if (n > 1) {
+ virDomainReportError(VIR_ERR_XML_ERROR, "%s",
+ _("only one numatune is supported"));
+ VIR_FREE(nodes);
+ goto error;
+ }
+
if (n) {
- tmp = virXPathString("string(./numatune/memory/@nodeset)", ctxt);
- if (tmp) {
- char *set = tmp;
- int nodemasklen = VIR_DOMAIN_CPUMASK_LEN;
+ cur = nodes[0]->children;
+ while (cur != NULL) {
+ if (cur->type == XML_ELEMENT_NODE) {
+ if ((xmlStrEqual(cur->name, BAD_CAST "memory"))) {
+ tmp = virXMLPropString(cur, "nodeset");
- if (VIR_ALLOC_N(def->numatune.memory.nodemask, nodemasklen) < 0) {
- goto no_memory;
- }
+ if (tmp) {
+ char *set = tmp;
+ int nodemasklen = VIR_DOMAIN_CPUMASK_LEN;
- /* "nodeset" leads same syntax with "cpuset". */
- if (virDomainCpuSetParse(set, 0, def->numatune.memory.nodemask,
- nodemasklen) < 0)
- goto error;
- VIR_FREE(tmp);
- } else {
- virDomainReportError(VIR_ERR_INTERNAL_ERROR,
- "%s", _("nodeset for NUMA memory tuning must be set"));
- goto error;
- }
+ if (VIR_ALLOC_N(def->numatune.memory.nodemask,
+ nodemasklen) < 0) {
+ virReportOOMError();
+ goto error;
+ }
- tmp = virXPathString("string(./numatune/memory/@mode)", ctxt);
- if (tmp) {
- if ((def->numatune.memory.mode =
- virDomainNumatuneMemModeTypeFromString(tmp)) < 0) {
- virDomainReportError(VIR_ERR_INTERNAL_ERROR,
- _("Unsupported NUMA memory tuning mode '%s'"),
- tmp);
- goto error;
+ /* "nodeset" leads same syntax with "cpuset". */
+ if (virDomainCpuSetParse(set, 0,
+ def->numatune.memory.nodemask,
+ nodemasklen) < 0)
+ goto error;
+ VIR_FREE(tmp);
+ } else {
+ virDomainReportError(VIR_ERR_XML_ERROR, "%s",
+ _("nodeset for NUMA memory "
+ "tuning must be set"));
+ goto error;
+ }
+
+ tmp = virXMLPropString(cur, "mode");
+ if (tmp) {
+ if ((def->numatune.memory.mode =
+ virDomainNumatuneMemModeTypeFromString(tmp)) < 0) {
+ virDomainReportError(VIR_ERR_XML_ERROR,
+ _("Unsupported NUMA memory "
+ "tuning mode '%s'"),
+ tmp);
+ goto error;
+ }
+ VIR_FREE(tmp);
+ } else {
+ def->numatune.memory.mode = VIR_DOMAIN_NUMATUNE_MEM_STRICT;
+ }
+ } else if (xmlStrEqual(cur->name, BAD_CAST "cpu")) {
+ char *req_cpus = NULL;
+ char *req_memory = NULL;
+ req_cpus = virXMLPropString(cur, "required_cpus");
+ req_memory = virXMLPropString(cur, "required_memory");
+
+ if (req_cpus &&
+ virStrToLong_ui(req_cpus, NULL, 10,
+ &def->numatune.cpu.required_cpus) < 0) {
+ virDomainReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("Cannot parse <cpu> 'required_cpus'"
+ " attribute"));
+ goto error;
+ }
+
+ if (req_memory &&
+ virStrToLong_ul(req_memory, NULL, 10,
+ &def->numatune.cpu.required_memory) < 0) {
+ virDomainReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("Cannot parse <cpu> 'required_memory'"
+ " attribute"));
+ goto error;
+ }
+
+ VIR_FREE(req_cpus);
+ VIR_FREE(req_memory);
+ } else {
+ virDomainReportError(VIR_ERR_XML_ERROR,
+ _("unsupported XML element %s"),
+ (const char *)cur->name);
+ goto error;
+ }
}
- VIR_FREE(tmp);
- } else {
- def->numatune.memory.mode = VIR_DOMAIN_NUMATUNE_MEM_STRICT;
+ cur = cur->next;
}
}
+ VIR_FREE(nodes);
n = virXPathNodeSet("./features/*", ctxt, &nodes);
if (n < 0)
@@ -11761,7 +11814,8 @@ virDomainDefFormatInternal(virDomainDefPtr def,
def->cputune.period || def->cputune.quota)
virBufferAddLit(buf, " </cputune>\n");
- if (def->numatune.memory.nodemask) {
+ if (def->numatune.memory.nodemask ||
+ def->numatune.cpu.required_cpus) {
const char *mode;
char *nodemask = NULL;
@@ -11778,6 +11832,15 @@ virDomainDefFormatInternal(virDomainDefPtr def,
virBufferAsprintf(buf, " <memory mode='%s' nodeset='%s'/>\n",
mode, nodemask);
VIR_FREE(nodemask);
+
+ if (def->numatune.cpu.required_cpus)
+ virBufferAsprintf(buf, " <cpu required_cpus='%d' ",
+ def->numatune.cpu.required_cpus);
+
+ if (def->numatune.cpu.required_memory)
+ virBufferAsprintf(buf, "required_memory='%lu'/>\n",
+ def->numatune.cpu.required_memory);
+
virBufferAddLit(buf, " </numatune>\n");
}
diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h
index 596be4d..1284599 100644
--- a/src/conf/domain_conf.h
+++ b/src/conf/domain_conf.h
@@ -1416,6 +1416,11 @@ struct _virDomainNumatuneDef {
int mode;
} memory;
+ struct {
+ unsigned int required_cpus;
+ unsigned long required_memory;
+ } cpu;
+
/* Future NUMA tuning related stuff should go here. */
};
diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c
index 8f336f5..ec6434d 100644
--- a/src/lxc/lxc_controller.c
+++ b/src/lxc/lxc_controller.c
@@ -327,6 +327,47 @@ static int lxcSetContainerNUMAPolicy(virDomainDefPtr def)
}
#endif
+#if defined(NUMAD)
+static char *
+lxcGetNumadAdvice(unsigned int req_cpus,
+ unsigned long req_memory) {
+ virCommandPtr cmd = NULL;
+ char *reqs = NULL;
+ char *ret = NULL;
+
+ /* numad uses "MB" for memory. */
+ if (req_memory) {
+ req_memory = req_memory / 1024;
+ if (virAsprintf(&reqs, "%d:%lu", req_cpus, req_memory) < 0) {
+ virReportOOMError();
+ goto out;
+ }
+ cmd = virCommandNewArgList(NUMAD, "-w", reqs, NULL);
+ } else {
+ cmd = virCommandNewArgList(NUMAD, "-w", "%d", req_cpus, NULL);
+ }
+
+ virCommandSetOutputBuffer(cmd, &ret);
+
+ if (virCommandRun(cmd, NULL) < 0) {
+ lxcError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("Failed to query numad for the advisory nodeset"));
+ }
+
+out:
+ VIR_FREE(reqs);
+ virCommandFree(cmd);
+ return ret;
+}
+#else
+static char *
+lxcGetNumadAdvice(unsigned int req_cpus ATTRIBUTE_UNUSED,
+ unsigned long req_memory ATTRIBUTE_UNUSED) {
+ lxcError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+ _("numad is not available on this host"));
+ return NULL;
+}
+#endif
/*
* To be run while still single threaded
@@ -355,19 +396,54 @@ static int lxcSetContainerCpuAffinity(virDomainDefPtr def)
return -1;
}
- if (def->cpumask) {
- /* XXX why don't we keep 'cpumask' in the libvirt cpumap
- * format to start with ?!?! */
- for (i = 0 ; i < maxcpu && i < def->cpumasklen ; i++)
- if (def->cpumask[i])
+ /* def->cpumask will be overridden by the nodeset
+ * suggested by numad if it's specified.
+ */
+ if (def->numatune.cpu.required_cpus) {
+ char *tmp_cpumask = NULL;
+ char *nodeset = NULL;
+
+ nodeset = lxcGetNumadAdvice(def->numatune.cpu.required_cpus,
+ def->numatune.cpu.required_memory);
+ if (!nodeset)
+ return -1;
+
+ if (VIR_ALLOC_N(tmp_cpumask, VIR_DOMAIN_CPUMASK_LEN) < 0) {
+ virReportOOMError();
+ return -1;
+ }
+
+ if (virDomainCpuSetParse(nodeset, 0, tmp_cpumask,
+ VIR_DOMAIN_CPUMASK_LEN) < 0) {
+ VIR_FREE(tmp_cpumask);
+ VIR_FREE(nodeset);
+ return -1;
+ }
+
+ for (i = 0; i < maxcpu && i < VIR_DOMAIN_CPUMASK_LEN; i++) {
+ if (tmp_cpumask[i])
VIR_USE_CPU(cpumap, i);
+ }
+
+ /* Update def->cpumask */
+ VIR_FREE(def->cpumask);
+ def->cpumask = tmp_cpumask;
+ VIR_FREE(nodeset);
} else {
- /* You may think this is redundant, but we can't assume libvirtd
- * itself is running on all pCPUs, so we need to explicitly set
- * the spawned LXC instance to all pCPUs if no map is given in
- * its config file */
- for (i = 0 ; i < maxcpu ; i++)
- VIR_USE_CPU(cpumap, i);
+ if (def->cpumask) {
+ /* XXX why don't we keep 'cpumask' in the libvirt cpumap
+ * format to start with ?!?! */
+ for (i = 0 ; i < maxcpu && i < def->cpumasklen ; i++)
+ if (def->cpumask[i])
+ VIR_USE_CPU(cpumap, i);
+ } else {
+ /* You may think this is redundant, but we can't assume libvirtd
+ * itself is running on all pCPUs, so we need to explicitly set
+ * the spawned LXC instance to all pCPUs if no map is given in
+ * its config file */
+ for (i = 0 ; i < maxcpu ; i++)
+ VIR_USE_CPU(cpumap, i);
+ }
}
/* We are pressuming we are running between fork/exec of LXC
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 41218de..eb9f8f1 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -1633,6 +1633,48 @@ qemuProcessInitNumaMemoryPolicy(virDomainObjPtr vm)
}
#endif
+#if defined(NUMAD)
+static char *
+qemuGetNumadAdvice(unsigned int req_cpus,
+ unsigned long req_memory) {
+ virCommandPtr cmd = NULL;
+ char *reqs = NULL;
+ char *output = NULL;
+
+ /* numad uses "MB" for memory. */
+ if (req_memory) {
+ req_memory = req_memory / 1024;
+ if (virAsprintf(&reqs, "%d:%lu", req_cpus, req_memory) < 0) {
+ virReportOOMError();
+ goto out;
+ }
+
+ cmd = virCommandNewArgList(NUMAD, "-w", reqs, NULL);
+ } else {
+ cmd = virCommandNewArgList(NUMAD, "-w", "%u", req_cpus, NULL);
+ }
+
+ virCommandSetOutputBuffer(cmd, &output);
+
+ if (virCommandRun(cmd, NULL) < 0)
+ qemuReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("Failed to query numad for the advisory nodeset"));
+
+out:
+ VIR_FREE(reqs);
+ virCommandFree(cmd);
+ return output;
+}
+#else
+static char *
+qemuGetNumadAdvice(unsigned int req_cpus ATTRIBUTE_UNUSED,
+ unsigned long req_memory ATTRIBUTE_UNUSED) {
+ qemuReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+ _("numad is not available on this host"));
+ return NULL;
+}
+#endif
+
/*
* To be run between fork/exec of QEMU only
*/
@@ -1661,19 +1703,54 @@ qemuProcessInitCpuAffinity(virDomainObjPtr vm)
return -1;
}
- if (vm->def->cpumask) {
- /* XXX why don't we keep 'cpumask' in the libvirt cpumap
- * format to start with ?!?! */
- for (i = 0 ; i < maxcpu && i < vm->def->cpumasklen ; i++)
- if (vm->def->cpumask[i])
+ /* vm->def->cpumask will be overridden by the nodeset
+ * suggested by numad if it's specified.
+ */
+ if (vm->def->numatune.cpu.required_cpus) {
+ char *tmp_cpumask = NULL;
+ char *nodeset = NULL;
+
+ nodeset = qemuGetNumadAdvice(vm->def->numatune.cpu.required_cpus,
+ vm->def->numatune.cpu.required_memory);
+ if (!nodeset)
+ return -1;
+
+ if (VIR_ALLOC_N(tmp_cpumask, VIR_DOMAIN_CPUMASK_LEN) < 0) {
+ virReportOOMError();
+ return -1;
+ }
+
+ if (virDomainCpuSetParse(nodeset, 0, tmp_cpumask,
+ VIR_DOMAIN_CPUMASK_LEN) < 0) {
+ VIR_FREE(tmp_cpumask);
+ VIR_FREE(nodeset);
+ return -1;
+ }
+
+ for (i = 0; i < maxcpu && i < VIR_DOMAIN_CPUMASK_LEN; i++) {
+ if (tmp_cpumask[i])
VIR_USE_CPU(cpumap, i);
+ }
+
+ /* Update vm->def->cpumask */
+ VIR_FREE(vm->def->cpumask);
+ vm->def->cpumask = tmp_cpumask;
+ VIR_FREE(nodeset);
} else {
- /* You may think this is redundant, but we can't assume libvirtd
- * itself is running on all pCPUs, so we need to explicitly set
- * the spawned QEMU instance to all pCPUs if no map is given in
- * its config file */
- for (i = 0 ; i < maxcpu ; i++)
- VIR_USE_CPU(cpumap, i);
+ if (vm->def->cpumask) {
+ /* XXX why don't we keep 'cpumask' in the libvirt cpumap
+ * format to start with ?!?! */
+ for (i = 0 ; i < maxcpu && i < vm->def->cpumasklen ; i++)
+ if (vm->def->cpumask[i])
+ VIR_USE_CPU(cpumap, i);
+ } else {
+ /* You may think this is redundant, but we can't assume libvirtd
+ * itself is running on all pCPUs, so we need to explicitly set
+ * the spawned QEMU instance to all pCPUs if no map is given in
+ * its config file */
+ for (i = 0 ; i < maxcpu ; i++)
+ VIR_USE_CPU(cpumap, i);
+ }
}
/* We are pressuming we are running between fork/exec of QEMU
--
1.7.7.3
12 years, 9 months
[libvirt] [PATCH] add screendump async to qemu
by Alon Levy
RHBZ: 800338
Adds a new capability to qemu, QEMU_CAPS_SCREENDUMP_ASYNC, available if
the qmp command "screendump-async" exists.
If that cap exists qemuDomainScreenshot uses it. The implementation
consists of a hash from filename to struct holding the stream and
temporary fd. The fd is closed and the stream is written to (in reverse
order) by the completion callback, qemuProcessScreenshotComplete.
Note: in qemuDomainScreenshot I don't check for an existing entry in the
screenshots hash table because the key is a temporary filename,
produced by mkstemp, and it's only unlinked in
qemuProcessScreenshotComplete.
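The bookkeeping described above can be sketched in plain C. This is only an
illustration under stated assumptions, not the code from this patch: the
fixed-size array stands in for the virHash table, the integer stream_id
stands in for the virStreamPtr, and the names pending_shot, pending_add and
pending_complete are hypothetical. It shows a pending request being keyed by
its temporary filename and released when the completion event names that
file.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_PENDING 16  /* same limit as QEMU_DOMAIN_SCREENSHOTS_CONCURRENT_MAX */

/* Hypothetical stand-in for qemuScreenshotAsync. */
struct pending_shot {
    char *filename;  /* key: temporary file name, owned by the entry */
    int fd;          /* descriptor of the open temporary file */
    int stream_id;   /* assumption: a non-negative id replaces virStreamPtr */
    int used;        /* 1 while the request is outstanding */
};

static struct pending_shot pending[MAX_PENDING];

/* Record a request before the async command is sent.
 * Returns 0 on success, -1 if the table is full or allocation fails. */
static int pending_add(const char *filename, int fd, int stream_id)
{
    assert(filename != NULL);
    for (int i = 0; i < MAX_PENDING; i++) {
        if (!pending[i].used) {
            char *copy = malloc(strlen(filename) + 1);
            if (!copy)
                return -1;
            strcpy(copy, filename);
            pending[i] = (struct pending_shot){ copy, fd, stream_id, 1 };
            return 0;
        }
    }
    return -1;
}

/* Handle a completion event: find the entry by filename, release it and
 * return its stream id. Returns -1 for a NULL or unknown filename, without
 * touching any entry. Real code would also close the fd here. */
static int pending_complete(const char *filename)
{
    if (!filename)
        return -1;
    for (int i = 0; i < MAX_PENDING; i++) {
        if (pending[i].used && strcmp(pending[i].filename, filename) == 0) {
            int stream_id = pending[i].stream_id;
            free(pending[i].filename);
            pending[i].filename = NULL;
            pending[i].used = 0;
            return stream_id;
        }
    }
    return -1;
}
```

Because each key comes from mkstemp, two outstanding requests cannot share a
filename, which is why the sketch, like the note above, skips a duplicate
check on insertion.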
For testing you need to apply the following patches (they are still
pending review on qemu-devel):
http://patchwork.ozlabs.org/patch/144706/
http://patchwork.ozlabs.org/patch/144705/
http://patchwork.ozlabs.org/patch/144704/
Signed-off-by: Alon Levy <alevy(a)redhat.com>
---
src/qemu/qemu_capabilities.c | 1 +
src/qemu/qemu_capabilities.h | 1 +
src/qemu/qemu_domain.c | 6 ++++++
src/qemu/qemu_domain.h | 12 ++++++++++++
src/qemu/qemu_driver.c | 42 +++++++++++++++++++++++++++++++++++-------
src/qemu/qemu_monitor.c | 26 ++++++++++++++++++++++++++
src/qemu/qemu_monitor.h | 8 ++++++++
src/qemu/qemu_monitor_json.c | 39 +++++++++++++++++++++++++++++++++++++++
src/qemu/qemu_monitor_json.h | 3 +++
src/qemu/qemu_process.c | 29 +++++++++++++++++++++++++++++
10 files changed, 160 insertions(+), 7 deletions(-)
diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index 64a4546..57771ff 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -154,6 +154,7 @@ VIR_ENUM_IMPL(qemuCaps, QEMU_CAPS_LAST,
"drive-iotune", /* 85 */
"system_wakeup",
"scsi-disk.channel",
+ "screendump-async",
);
struct qemu_feature_flags {
diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h
index db584ce..24d620d 100644
--- a/src/qemu/qemu_capabilities.h
+++ b/src/qemu/qemu_capabilities.h
@@ -122,6 +122,7 @@ enum qemuCapsFlags {
QEMU_CAPS_DRIVE_IOTUNE = 85, /* -drive bps= and friends */
QEMU_CAPS_WAKEUP = 86, /* system_wakeup monitor command */
QEMU_CAPS_SCSI_DISK_CHANNEL = 87, /* Is scsi-disk.channel available? */
+ QEMU_CAPS_SCREENDUMP_ASYNC = 88, /* screendump-async qmp command */
QEMU_CAPS_LAST, /* this must always be the last item */
};
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 2fed91e..acf56c4 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -181,6 +181,10 @@ qemuDomainObjFreeJob(qemuDomainObjPrivatePtr priv)
ignore_value(virCondDestroy(&priv->job.asyncCond));
}
+static void
+freeScreenshot(void *payload, const void *name ATTRIBUTE_UNUSED) {
+ VIR_FREE(payload);
+}
static void *qemuDomainObjPrivateAlloc(void)
{
@@ -196,6 +200,8 @@ static void *qemuDomainObjPrivateAlloc(void)
goto error;
priv->migMaxBandwidth = QEMU_DOMAIN_DEFAULT_MIG_BANDWIDTH_MAX;
+ priv->screenshots = virHashCreate(QEMU_DOMAIN_SCREENSHOTS_CONCURRENT_MAX,
+ freeScreenshot);
return priv;
diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
index 1333d8c..15721ec 100644
--- a/src/qemu/qemu_domain.h
+++ b/src/qemu/qemu_domain.h
@@ -40,6 +40,8 @@
# define QEMU_DOMAIN_DEFAULT_MIG_BANDWIDTH_MAX 32
+# define QEMU_DOMAIN_SCREENSHOTS_CONCURRENT_MAX 16
+
# define JOB_MASK(job) (1 << (job - 1))
# define DEFAULT_JOB_MASK \
(JOB_MASK(QEMU_JOB_QUERY) | \
@@ -91,6 +93,14 @@ struct qemuDomainJobObj {
virDomainJobInfo info; /* Async job progress data */
};
+struct _qemuScreenshotAsync {
+ virStreamPtr stream; /* stream to write results to */
+ const char *filename; /* temporary file to read results from */
+ int fd; /* handle to open temporary file */
+};
+typedef struct _qemuScreenshotAsync qemuScreenshotAsync;
+typedef qemuScreenshotAsync *qemuScreenshotAsyncPtr;
+
typedef struct _qemuDomainPCIAddressSet qemuDomainPCIAddressSet;
typedef qemuDomainPCIAddressSet *qemuDomainPCIAddressSetPtr;
@@ -130,6 +140,8 @@ struct _qemuDomainObjPrivate {
char *origname;
virConsolesPtr cons;
+
+ virHashTablePtr screenshots;
};
struct qemuDomainWatchdogEvent
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 733df0a..e397bc3 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -3135,6 +3135,7 @@ qemuDomainScreenshot(virDomainPtr dom,
int tmp_fd = -1;
char *ret = NULL;
bool unlink_tmp = false;
+ qemuScreenshotAsync *screenshot;
virCheckFlags(0, NULL);
@@ -3184,9 +3185,34 @@ qemuDomainScreenshot(virDomainPtr dom,
virSecurityManagerSetSavedStateLabel(qemu_driver->securityManager, vm->def, tmp);
qemuDomainObjEnterMonitor(driver, vm);
- if (qemuMonitorScreendump(priv->mon, tmp) < 0) {
- qemuDomainObjExitMonitor(driver, vm);
- goto endjob;
+ if (qemuCapsGet(priv->qemuCaps, QEMU_CAPS_SCREENDUMP_ASYNC)) {
+ if (virHashSize(priv->screenshots) >=
+ QEMU_DOMAIN_SCREENSHOTS_CONCURRENT_MAX) {
+ qemuReportError(VIR_ERR_INTERNAL_ERROR,
+ "%s", _("too many ongoing screenshots"));
+ goto endjob;
+ }
+ if (VIR_ALLOC(screenshot) < 0) {
+ qemuReportError(VIR_ERR_NO_MEMORY, "%s", _("out of memory"));
+ goto endjob;
+ }
+ screenshot->fd = tmp_fd;
+ screenshot->filename = tmp;
+ screenshot->stream = st;
+ virHashAddEntry(priv->screenshots, tmp, screenshot);
+ if (qemuMonitorScreendumpAsync(priv->mon, tmp) < 0) {
+ qemuDomainObjExitMonitor(driver, vm);
+ goto endjob;
+ }
+ /* string and fd are freed by qmp event callback */
+ tmp = NULL;
+ tmp_fd = -1;
+ unlink_tmp = false;
+ } else {
+ if (qemuMonitorScreendump(priv->mon, tmp) < 0) {
+ qemuDomainObjExitMonitor(driver, vm);
+ goto endjob;
+ }
}
qemuDomainObjExitMonitor(driver, vm);
@@ -3195,10 +3221,12 @@ qemuDomainScreenshot(virDomainPtr dom,
goto endjob;
}
- if (virFDStreamOpenFile(st, tmp, 0, 0, O_RDONLY) < 0) {
- qemuReportError(VIR_ERR_OPERATION_FAILED, "%s",
- _("unable to open stream"));
- goto endjob;
+ if (!qemuCapsGet(priv->qemuCaps, QEMU_CAPS_SCREENDUMP_ASYNC)) {
+ if (virFDStreamOpenFile(st, tmp, 0, 0, O_RDONLY) < 0) {
+ qemuReportError(VIR_ERR_OPERATION_FAILED, "%s",
+ _("unable to open stream"));
+ goto endjob;
+ }
}
ret = strdup("image/x-portable-pixmap");
diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c
index 1da73f6..0df63e7 100644
--- a/src/qemu/qemu_monitor.c
+++ b/src/qemu/qemu_monitor.c
@@ -1054,6 +1054,18 @@ int qemuMonitorEmitBlockJob(qemuMonitorPtr mon,
}
+int qemuMonitorEmitScreenDumpComplete(qemuMonitorPtr mon,
+ const char *filename)
+{
+ int ret = -1;
+ VIR_DEBUG("mon=%p", mon);
+
+ QEMU_MONITOR_CALLBACK(mon, ret, domainScreenshotComplete, mon->vm,
+ filename);
+ return ret;
+}
+
+
int qemuMonitorSetCapabilities(qemuMonitorPtr mon,
virBitmapPtr qemuCaps)
@@ -2710,6 +2722,20 @@ int qemuMonitorScreendump(qemuMonitorPtr mon,
return ret;
}
+int qemuMonitorScreendumpAsync(qemuMonitorPtr mon,
+ const char *file)
+{
+ VIR_DEBUG("mon=%p, file=%s", mon, file);
+
+ if (!mon) {
+ qemuReportError(VIR_ERR_INVALID_ARG,"%s",
+ _("monitor must not be NULL"));
+ return -1;
+ }
+
+ return qemuMonitorJSONScreendumpAsync(mon, file);
+}
+
int qemuMonitorBlockJob(qemuMonitorPtr mon,
const char *device,
const char *base,
diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h
index b1c956c..abe977c 100644
--- a/src/qemu/qemu_monitor.h
+++ b/src/qemu/qemu_monitor.h
@@ -124,6 +124,9 @@ struct _qemuMonitorCallbacks {
const char *diskAlias,
int type,
int status);
+ int (*domainScreenshotComplete)(qemuMonitorPtr mon,
+ virDomainObjPtr vm,
+ const char *filename);
};
@@ -195,6 +198,8 @@ int qemuMonitorEmitBlockJob(qemuMonitorPtr mon,
const char *diskAlias,
int type,
int status);
+int qemuMonitorEmitScreenDumpComplete(qemuMonitorPtr mon,
+ const char *filename);
@@ -505,6 +510,9 @@ int qemuMonitorInjectNMI(qemuMonitorPtr mon);
int qemuMonitorScreendump(qemuMonitorPtr mon,
const char *file);
+int qemuMonitorScreendumpAsync(qemuMonitorPtr mon,
+ const char *file);
+
int qemuMonitorSendKey(qemuMonitorPtr mon,
unsigned int holdtime,
unsigned int *keycodes,
diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c
index dc67b4b..79ec6ba 100644
--- a/src/qemu/qemu_monitor_json.c
+++ b/src/qemu/qemu_monitor_json.c
@@ -59,6 +59,7 @@ static void qemuMonitorJSONHandleVNCConnect(qemuMonitorPtr mon, virJSONValuePtr
static void qemuMonitorJSONHandleVNCInitialize(qemuMonitorPtr mon, virJSONValuePtr data);
static void qemuMonitorJSONHandleVNCDisconnect(qemuMonitorPtr mon, virJSONValuePtr data);
static void qemuMonitorJSONHandleBlockJob(qemuMonitorPtr mon, virJSONValuePtr data);
+static void qemuMonitorJSONHandleScreenDumpComplete(qemuMonitorPtr mon, virJSONValuePtr data);
static struct {
const char *type;
@@ -75,6 +76,7 @@ static struct {
{ "VNC_INITIALIZED", qemuMonitorJSONHandleVNCInitialize, },
{ "VNC_DISCONNECTED", qemuMonitorJSONHandleVNCDisconnect, },
{ "BLOCK_JOB_COMPLETED", qemuMonitorJSONHandleBlockJob, },
+ { "SCREEN_DUMP_COMPLETE", qemuMonitorJSONHandleScreenDumpComplete, },
};
@@ -725,6 +727,16 @@ out:
qemuMonitorEmitBlockJob(mon, device, type, status);
}
+static void qemuMonitorJSONHandleScreenDumpComplete(qemuMonitorPtr mon,
+ virJSONValuePtr data)
+{
+ const char *filename;
+
+ if ((filename = virJSONValueObjectGetString(data, "filename")) == NULL) {
+ VIR_WARN("missing filename in screen dump complete event");
+ }
+ qemuMonitorEmitScreenDumpComplete(mon, filename);
+}
int
qemuMonitorJSONHumanCommandWithFd(qemuMonitorPtr mon,
@@ -836,6 +848,9 @@ qemuMonitorJSONCheckCommands(qemuMonitorPtr mon,
if (STREQ(name, "system_wakeup"))
qemuCapsSet(qemuCaps, QEMU_CAPS_WAKEUP);
+
+ if (STREQ(name, "screendump-async"))
+ qemuCapsSet(qemuCaps, QEMU_CAPS_SCREENDUMP_ASYNC);
}
ret = 0;
@@ -3135,6 +3150,30 @@ int qemuMonitorJSONScreendump(qemuMonitorPtr mon,
return ret;
}
+int qemuMonitorJSONScreendumpAsync(qemuMonitorPtr mon,
+ const char *file)
+{
+ int ret;
+ virJSONValuePtr cmd, reply = NULL;
+
+ cmd = qemuMonitorJSONMakeCommand("screendump-async",
+ "s:filename", file,
+ NULL);
+
+ if (!cmd)
+ return -1;
+
+ ret = qemuMonitorJSONCommand(mon, cmd, &reply);
+
+ if (ret == 0)
+ ret = qemuMonitorJSONCheckError(cmd, reply);
+
+ virJSONValueFree(cmd);
+ virJSONValueFree(reply);
+ return ret;
+}
+
+
static int qemuMonitorJSONGetBlockJobInfoOne(virJSONValuePtr entry,
const char *device,
virDomainBlockJobInfoPtr info)
diff --git a/src/qemu/qemu_monitor_json.h b/src/qemu/qemu_monitor_json.h
index 0932a2c..74acce1 100644
--- a/src/qemu/qemu_monitor_json.h
+++ b/src/qemu/qemu_monitor_json.h
@@ -244,6 +244,9 @@ int qemuMonitorJSONSendKey(qemuMonitorPtr mon,
int qemuMonitorJSONScreendump(qemuMonitorPtr mon,
const char *file);
+int qemuMonitorJSONScreendumpAsync(qemuMonitorPtr mon,
+ const char *file);
+
int qemuMonitorJSONBlockJob(qemuMonitorPtr mon,
const char *device,
const char *base,
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 7b99814..7985a37 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -47,6 +47,7 @@
#include "datatypes.h"
#include "logging.h"
+#include "fdstream.h"
#include "virterror_internal.h"
#include "memory.h"
#include "hooks.h"
@@ -918,6 +919,33 @@ qemuProcessHandleBlockJob(qemuMonitorPtr mon ATTRIBUTE_UNUSED,
}
static int
+qemuProcessScreenshotComplete(qemuMonitorPtr mon ATTRIBUTE_UNUSED,
+ virDomainObjPtr vm,
+ const char *filename)
+{
+ qemuDomainObjPrivatePtr priv = vm->privateData;
+ qemuScreenshotAsyncPtr screenshot;
+ int ret = 0;
+
+ if ((screenshot = virHashLookup(priv->screenshots, filename)) == NULL) {
+ qemuReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("got screendump completion event for wrong filename"));
+ ret = -1;
+ goto end;
+ }
+ if (virFDStreamOpenFile(screenshot->stream, filename, 0, 0, O_RDONLY) < 0) {
+ qemuReportError(VIR_ERR_OPERATION_FAILED, "%s",
+ _("unable to open stream"));
+ ret = -1;
+ }
+end:
+ VIR_FORCE_CLOSE(screenshot->fd);
+ VIR_FREE(screenshot->filename);
+ virHashRemoveEntry(priv->screenshots, filename);
+ return ret;
+}
+
+static int
qemuProcessHandleGraphics(qemuMonitorPtr mon ATTRIBUTE_UNUSED,
virDomainObjPtr vm,
int phase,
@@ -1034,6 +1062,7 @@ static qemuMonitorCallbacks monitorCallbacks = {
.domainIOError = qemuProcessHandleIOError,
.domainGraphics = qemuProcessHandleGraphics,
.domainBlockJob = qemuProcessHandleBlockJob,
+ .domainScreenshotComplete = qemuProcessScreenshotComplete,
};
static int
--
1.7.9.1
12 years, 9 months
[libvirt] [PATCH 0/4 v3] Support mac and port profile for <interface type='hostdev'>
by Roopa Prabhu
v3:
Changes include:
- Review comments from Laine
- rebased with latest upstream
v2:
Changes include:
- Feedback from Stefan for 802.1Qbg. The code now prints an error if virtualport is
specified for 802.1Qbg on an interface of type hostdev
- Feedback from Laine for non-sriov devices. Interface type hostdev for non-sriov devices
is not supported.
v1: https://www.redhat.com/archives/libvir-list/2012-March/msg00015.html
This patch series is based on Laine's patches to support <interface type='hostdev'>.
https://www.redhat.com/archives/libvir-list/2012-February/msg01126.html
It adds support for setting the mac and port profile on an interface of type hostdev.
* If virtualport is specified, the existing virtual port functions are
called to set mac, vlan and port profile.
* If virtualport is not specified and device is a sriov virtual function,
- mac is set using IFLA_VF_MAC
* If virtualport is not specified and device is a non-sriov virtual function,
- mac is set using existing SIOCGIFHWADDR (This requires that the
netdev be present on the host before starting the VM)
This series implements the following:
01/4 pci: Add two new pci util pciDeviceGetVirtualFunctionInfo and pciConfigAddressToSysfsFile
02/4 virtnetdev: Add support functions for mac and portprofile associations on a hostdev
03/4 virnetdevvportprofile: Changes to support portprofiles for hostdevs
04/4 qemu_hostdev: Add support to install port profile and mac address on hostdev
Stefan Berger is CC'ed for the 802.1Qbg changes in patch 03/4. The current
802.1Qbg code uses the macvtap ifname, which does not exist for network
interfaces with type=hostdev. This patch just adds a NULL check for ifname in
the 802.1Qbg port profile handling code.
12 years, 9 months