[libvirt] [PATCH 00/10] Introduce x86 Cache Monitoring Technology (CMT)
by Wang Huaqiang
This series of patches introduces the x86 Cache Monitoring Technology
(CMT) to libvirt by interacting with the kernel resource control (resctrl)
interface. CMT is one of the Intel(R) x86 CPU features that belong to
the Resource Director Technology (RDT). CMT reports the occupancy of the
last level cache, which is shared by all CPU cores.
We have had several discussions about enabling CMT; please refer to
the following links for the RFCs.
RFCv3
https://www.redhat.com/archives/libvir-list/2018-August/msg01213.html
RFCv2
https://www.redhat.com/archives/libvir-list/2018-July/msg00409.html
https://www.redhat.com/archives/libvir-list/2018-July/msg01241.html
RFCv1
https://www.redhat.com/archives/libvir-list/2018-June/msg00674.html
1. Why is CMT necessary in libvirt?
The perf events 'CMT, MBML, MBMT' have been phased out since Linux
kernel commit c39a0e2c8850f08249383f2425dbd8dbe4baad69, so the
perf-based cmt and mbm support in libvirt will not work with the latest
Linux kernel. These patches add the CMT feature to libvirt through the
kernel resctrlfs interface.
2. High-level interfaces for CMT.
2.1 Query the host capability of CMT.
The element 'monitor' represents the host capabilities of CMT.
The CMT attributes involved are:
- 'maxAllocs' denotes the maximum number of monitoring groups that can be
created, which is limited by the number of hardware 'RMID's.
- 'threshold' denotes the upper bound of cache occupancy for the current
group, in bytes, used to determine whether an RMID can be reused.
- element 'feature' denotes a supported monitoring feature.
- 'llc_occupancy' is the feature for reporting last level cache
occupancy information.
# virsh capabilities
...
<cache>
<bank id='0' level='3' type='both' size='15' unit='MiB' cpus='0-5'>
<control granularity='768' unit='KiB' type='code' maxAllocs='8'/>
<control granularity='768' unit='KiB' type='data' maxAllocs='8'/>
+ <monitor threshold='540672' unit='B' maxAllocs='176'>
+ <feature name='llc_occupancy'/>
+ </monitor>
</bank>
<bank id='1' level='3' type='both' size='15' unit='MiB' cpus='6-11'>
<control granularity='768' unit='KiB' type='code' maxAllocs='8'/>
<control granularity='768' unit='KiB' type='data' maxAllocs='8'/>
+ <monitor threshold='540672' unit='B' maxAllocs='176'>
+ <feature name='llc_occupancy'/>
+ </monitor>
</bank>
</cache>
...
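A minimal sketch in C of how the host values above could be gathered,
assuming they come from the resctrl info files named in the test data of
this series (num_rmids and max_threshold_occupancy under info/L3_MON).
The helper names, the struct, and the mapping of 'threshold' to
max_threshold_occupancy are assumptions for illustration, not the code
from the patches.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read one unsigned integer from <dir>/<name>.
 * Returns 0 on success, -1 if the file is missing or malformed. */
int
read_resctrl_uint(const char *dir, const char *name, unsigned long long *val)
{
    char path[4096];
    FILE *fp;
    int n;
    int rc = -1;

    assert(dir && name && val);

    n = snprintf(path, sizeof(path), "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;

    if (!(fp = fopen(path, "r")))
        return -1;

    if (fscanf(fp, "%llu", val) == 1)
        rc = 0;

    fclose(fp);
    return rc;
}

/* Hypothetical container mirroring the <monitor> attributes. */
struct cmt_caps {
    unsigned long long max_allocs;  /* assumed source: num_rmids */
    unsigned long long threshold;   /* assumed source: max_threshold_occupancy */
};

/* Fill 'caps' from an info directory such as
 * /sys/fs/resctrl/info/L3_MON. Returns 0 on success, -1 on error. */
int
read_cmt_caps(const char *info_dir, struct cmt_caps *caps)
{
    assert(caps);

    if (read_resctrl_uint(info_dir, "num_rmids", &caps->max_allocs) < 0 ||
        read_resctrl_uint(info_dir, "max_threshold_occupancy",
                          &caps->threshold) < 0)
        return -1;

    return 0;
}
```

Passing the directory as a parameter keeps the sketch testable against a
fixture tree like the one added under tests/vircaps2xmldata.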
2.2 Create a cache monitoring group (cache monitor).
The main interface for creating a monitoring group is an XML file. The
proposed configuration looks like:
<cputune>
<cachetune vcpus='1'>
<cache id='0' level='3' type='code' size='7680' unit='KiB'/>
<cache id='1' level='3' type='data' size='3840' unit='KiB'/>
+ <monitor vcpus='1'/>
</cachetune>
<cachetune vcpus='4-7'>
+ <monitor vcpus='4-6'/>
</cachetune>
</cputune>
The XML above creates 2 resctrl cache allocation groups and 2 resctrl
monitoring groups.
Changes to a cache monitor take effect the next time the VM boots.
2.3 Show CMT results through the 'domstats' command
An interface is added in qemu to report this information for resource
monitoring groups through the command 'virsh domstats --cpu-total'.
Below is a typical output:
# virsh domstats 1 --cpu-total
Domain: 'ubuntu16.04-base'
...
cpu.cache.monitor.count=2
cpu.cache.0.name=vcpus_1
cpu.cache.0.vcpus=1
cpu.cache.0.bank.count=2
cpu.cache.0.bank.0.id=0
cpu.cache.0.bank.0.bytes=4505600
cpu.cache.0.bank.1.id=1
cpu.cache.0.bank.1.bytes=5586944
cpu.cache.1.name=vcpus_4-6
cpu.cache.1.vcpus=4,5,6
cpu.cache.1.bank.count=2
cpu.cache.1.bank.0.id=0
cpu.cache.1.bank.0.bytes=17571840
cpu.cache.1.bank.1.id=1
cpu.cache.1.bank.1.bytes=29106176
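For consumers of this output, here is a small C sketch that sums the
'bytes' values of one monitor across its banks. It assumes the key=value
lines have already been split into separate strings without leading
whitespace; the function name sum_monitor_bytes is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Sum the 'bytes' values reported for monitor index 'monitor'.
 * 'lines' holds key=value strings in the domstats format shown above.
 * Stores the sum in *total and returns the number of bank entries used. */
int
sum_monitor_bytes(const char *const *lines, size_t nlines,
                  unsigned int monitor, unsigned long long *total)
{
    int matched = 0;

    assert(lines && total);
    *total = 0;

    for (size_t i = 0; i < nlines; i++) {
        unsigned int mon, bank;
        unsigned long long bytes;
        int consumed = 0;

        /* Only "cpu.cache.<m>.bank.<b>.bytes=<n>" converts all three
         * fields; name, vcpus, count and id lines stop earlier. */
        if (sscanf(lines[i], "cpu.cache.%u.bank.%u.bytes=%llu%n",
                   &mon, &bank, &bytes, &consumed) == 3 &&
            lines[i][consumed] == '\0' &&
            mon == monitor) {
            *total += bytes;
            matched++;
        }
    }

    return matched;
}
```

For the sample output above, monitor 0 totals 10092544 bytes and
monitor 1 totals 46678016 bytes.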
**Changes Since RFCv3**
In the output of 'domstats', added
'cpu.cache.<cmt_group_index>.bank.<bank_index>.id'
to report the OS-assigned cache bank id of the current cache.
Changes are prefixed with a '+':
# virsh domstats 1 --cpu-total
Domain: 'ubuntu16.04-base'
...
cpu.cache.monitor.count=2
cpu.cache.0.name=vcpus_1
cpu.cache.0.vcpus=1
cpu.cache.0.bank.count=2
+ cpu.cache.0.bank.0.id=0
cpu.cache.0.bank.0.bytes=4505600
+ cpu.cache.0.bank.1.id=1
cpu.cache.0.bank.1.bytes=5586944
cpu.cache.1.name=vcpus_4-6
cpu.cache.1.vcpus=4,5,6
cpu.cache.1.bank.count=2
+ cpu.cache.1.bank.0.id=0
cpu.cache.1.bank.0.bytes=17571840
+ cpu.cache.1.bank.1.id=1
cpu.cache.1.bank.1.bytes=29106176
Wang Huaqiang (10):
conf: Renamed 'controlBuf' to 'childrenBuf'
util: add interface retrieving CMT capability
conf: Add CMT capability to host
test: add test case for resctrl monitor
util: resctrl: refactoring some functions
util: Introduce resctrl monitor for CMT
conf: refactor virDomainResctrlAppend
conf: introduce resctrl monitor group in domain
qemu: Introduce resctrl monitoring group
qemu: Report cache occupancy (CMT) with domstats
.gnulib | 1 -
docs/formatdomain.html.in | 14 +-
docs/schemas/capability.rng | 28 +
docs/schemas/domaincommon.rng | 11 +-
src/conf/capabilities.c | 51 +-
src/conf/capabilities.h | 1 +
src/conf/domain_conf.c | 159 +++++-
src/conf/domain_conf.h | 20 +
src/libvirt-domain.c | 9 +
src/libvirt_private.syms | 6 +
src/qemu/qemu_driver.c | 265 ++++++++-
src/qemu/qemu_process.c | 40 +-
src/util/virresctrl.c | 597 +++++++++++++++++++--
src/util/virresctrl.h | 48 +-
tests/genericxml2xmlindata/cachetune-cdp.xml | 2 +
.../cachetune-colliding-monitors.xml | 36 ++
tests/genericxml2xmlindata/cachetune-small.xml | 1 +
tests/genericxml2xmlindata/cachetune.xml | 3 +
tests/genericxml2xmltest.c | 4 +
.../resctrl/info/L3_MON/max_threshold_occupancy | 1 +
.../linux-resctrl/resctrl/info/L3_MON/mon_features | 3 +
.../linux-resctrl/resctrl/info/L3_MON/num_rmids | 1 +
tests/vircaps2xmldata/vircaps-x86_64-resctrl.xml | 6 +
23 files changed, 1208 insertions(+), 99 deletions(-)
delete mode 160000 .gnulib
create mode 100644 tests/genericxml2xmlindata/cachetune-colliding-monitors.xml
create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/L3_MON/max_threshold_occupancy
create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/L3_MON/mon_features
create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/L3_MON/num_rmids
--
2.7.4
[libvirt] [PATCH v2] numa: fix unsafe access to numa_nodes_ptr
by Wang Yechao
numa_nodes_ptr is a global variable in libnuma.so. It is freed
after the main thread exits. If we have many running VMs and restart the
libvirtd service continuously at intervals of a few seconds, the main
thread may exit before the qemuProcessReconnect thread, and a segfault
occurs. Backtrace as follows:
0 0x00007f40e3d2dd72 in numa_bitmask_isbitset () from /lib64/libnuma.so.1
1 0x00007f40e4d14c55 in virNumaNodeIsAvailable (node=node@entry=0) at util/virnuma.c:396
2 0x00007f40e4d16010 in virNumaGetHostMemoryNodeset () at util/virnuma.c:1011
3 0x00007f40b94ced90 in qemuRestoreCgroupState (vm=0x7f407c39df00, vm=0x7f407c39df00) at qemu/qemu_cgroup.c:877
4 qemuConnectCgroup (driver=driver@entry=0x7f407c21fe80, vm=0x7f407c39df00) at qemu/qemu_cgroup.c:969
5 0x00007f40b94eef93 in qemuProcessReconnect (opaque=<optimized out>) at qemu/qemu_process.c:3531
6 0x00007f40e4d34bd2 in virThreadHelper (data=<optimized out>) at util/virthread.c:206
7 0x00007f40e214ee25 in start_thread () from /lib64/libpthread.so.0
8 0x00007f40e1e7c36d in clone () from /lib64/libc.so.6
Signed-off-by: Wang Yechao <wang.yechao255(a)zte.com.cn>
---
src/util/virnuma.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/src/util/virnuma.c b/src/util/virnuma.c
index 67e6c86..f06f6b3 100644
--- a/src/util/virnuma.c
+++ b/src/util/virnuma.c
@@ -381,7 +381,10 @@ virNumaGetMaxCPUs(void)
bool
virNumaNodeIsAvailable(int node)
{
- return numa_bitmask_isbitset(numa_nodes_ptr, node);
+ if (numa_nodes_ptr)
+ return numa_bitmask_isbitset(numa_nodes_ptr, node);
+ else
+ return false;
}
--
1.8.3.1
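The guard in this patch can be illustrated with a self-contained C
sketch. The struct node_mask and the global nodes_ptr are hypothetical
stand-ins for libnuma's bitmask and numa_nodes_ptr, and the sketch
assumes the owner resets the pointer to NULL on teardown. A check like
this does not by itself close the window between the check and the use
when another thread frees the mask concurrently.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for libnuma's struct bitmask. */
struct node_mask {
    unsigned long size;     /* number of valid bits */
    unsigned long *bits;    /* bit storage, least significant bit first */
};

/* Stand-in for the library-owned global; NULL once torn down. */
struct node_mask *nodes_ptr = NULL;

/* Report whether 'node' is set, treating a missing mask as "no". */
bool
node_is_available(int node)
{
    const unsigned long bits_per_word = sizeof(unsigned long) * CHAR_BIT;

    if (!nodes_ptr || node < 0 || (unsigned long)node >= nodes_ptr->size)
        return false;

    assert(nodes_ptr->bits != NULL);

    return (nodes_ptr->bits[node / bits_per_word] >>
            (node % bits_per_word)) & 1UL;
}
```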
[libvirt] [PATCH v2] qemu: check for vhostusers bandwidth
by Roland Schulz
Vhostuser doesn't support bandwidth, and for backwards compatibility
it was decided to just warn users instead of erroring out.
https://bugzilla.redhat.com/show_bug.cgi?id=1524230
Signed-off-by: Roland Schulz <schullzroll(a)gmail.com>
---
src/qemu/qemu_command.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index ff9589f593..011e2b45af 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -8244,6 +8244,8 @@ qemuBuildVhostuserCommandLine(virQEMUDriverPtr driver,
virQEMUCapsPtr qemuCaps,
unsigned int bootindex)
{
+ virNetDevBandwidthPtr actualBandwidth = virDomainNetGetActualBandwidth(net);
+ virDomainNetType actualType = virDomainNetGetActualType(net);
virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver);
char *chardev = NULL;
char *netdev = NULL;
@@ -8257,6 +8259,13 @@ qemuBuildVhostuserCommandLine(virQEMUDriverPtr driver,
goto cleanup;
}
+ /* Warn if unsupported bandwidth requested */
+ if (actualBandwidth && !virNetDevSupportBandwidth(actualType)) {
+ VIR_WARN(_("setting bandwidth on interfaces of "
+ "type '%s' is not implemented yet"),
+ virDomainNetTypeToString(actualType));
+ }
+
switch ((virDomainChrType)net->data.vhostuser->type) {
case VIR_DOMAIN_CHR_TYPE_UNIX:
if (!(chardev = qemuBuildChrChardevStr(logManager, secManager,
--
2.17.1
[libvirt] [PATCH] vhost-user: define conventions for vhost-user backends
by Marc-André Lureau
As discussed during "[PATCH v4 00/29] vhost-user for input & GPU"
review, let's define a common set of backend conventions to help with
management layer implementation, and interoperability.
Cc: libvir-list(a)redhat.com
Cc: Gerd Hoffmann <kraxel(a)redhat.com>
Cc: Daniel P. Berrangé <berrange(a)redhat.com>
Cc: Changpeng Liu <changpeng.liu(a)intel.com>
Cc: Dr. David Alan Gilbert <dgilbert(a)redhat.com>
Cc: Felipe Franciosi <felipe(a)nutanix.com>
Cc: Gonglei <arei.gonglei(a)huawei.com>
Cc: Maxime Coquelin <maxime.coquelin(a)redhat.com>
Cc: Michael S. Tsirkin <mst(a)redhat.com>
Cc: Victor Kaplansky <victork(a)redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau(a)redhat.com>
---
docs/interop/vhost-user.txt | 106 +++++++++++++++++++++++++++++++++++-
1 file changed, 104 insertions(+), 2 deletions(-)
diff --git a/docs/interop/vhost-user.txt b/docs/interop/vhost-user.txt
index ba5e37d714..691ce173ed 100644
--- a/docs/interop/vhost-user.txt
+++ b/docs/interop/vhost-user.txt
@@ -17,8 +17,13 @@ The protocol defines 2 sides of the communication, master and slave. Master is
the application that shares its virtqueues, in our case QEMU. Slave is the
consumer of the virtqueues.
-In the current implementation QEMU is the Master, and the Slave is intended to
-be a software Ethernet switch running in user space, such as Snabbswitch.
+In the current implementation QEMU is the Master, and the Slave is the
+external process consuming the virtio queues, for example a software
+Ethernet switch running in user space, such as Snabbswitch, or a block
+device backend processing read & write to a virtual disk. In order to
+facilitate interoperability between various backend implementations,
+it is recommended to follow the "Backend program conventions"
+described in this document.
Master and slave can be either a client (i.e. connecting) or server (listening)
in the socket communication.
@@ -859,3 +864,100 @@ resilient for selective requests.
For the message types that already solicit a reply from the client, the
presence of VHOST_USER_PROTOCOL_F_REPLY_ACK or need_reply bit being set brings
no behavioural change. (See the 'Communication' section for details.)
+
+Backend program conventions
+---------------------------
+
+vhost-user backends provide various services and they may need to be
+configured manually depending on the use case. However, it is a good
+idea to follow the conventions listed here when possible. Users, QEMU
+or libvirt, can then rely on some common behaviour to avoid
+heterogeneous configuration and management of the backend program and
+facilitate interoperability.
+
+In order to be discoverable, default vhost-user backends should be
+located under "/usr/libexec", and be named "vhost-user-$device" where
+"$device" is the device name in lower-case following the name listed
+in the Linux virtio_ids.h header (ex: the VIRTIO_ID_RPROC_SERIAL
+backend would be named "vhost-user-rproc-serial").
+
+Mechanisms to list, and to select among alternative implementations
+or modify the default backend are not described at this point (a
+distribution may use update-alternatives, for example, to list and to
+pick a different default backend).
+
+The backend program must end (as quickly and cleanly as possible) when
+the SIGTERM signal is received. Eventually, it may be sent SIGKILL by the
+management layer after a few seconds.
+
+The following command line options have an expected behaviour. They
+are mandatory, unless explicitly said differently:
+
+* --socket-path=PATH
+
+This option specifies the location of the vhost-user Unix domain socket.
+It is incompatible with --fd.
+
+* --fd=FDNUM
+
+When this argument is given, the backend program is started with the
+vhost-user socket as file descriptor FDNUM. It is incompatible with
+--socket-path.
+
+* --print-capabilities
+
+Output to stdout a line-separated list of backend capabilities, and
+then exit successfully. Other options and arguments should be ignored,
+and the backend program should not perform its normal function.
+
+At the time of writing, there are no common capabilities. Some
+device-specific capabilities are listed in the respective sections. By
+convention, device-specific capabilities are prefixed by their device
+name.
+
+* --pidfile=PATH
+
+Write the process id (PID) to the given file PATH. This is mostly
+useful if the backend daemonizes/forks itself.
+
+vhost-user-input program conventions
+------------------------------------
+
+Capabilities:
+
+input-evdev-path
+
+ The --evdev-path command line option is supported.
+
+input-no-grab
+
+ The --no-grab command line option is supported.
+
+* --evdev-path=PATH (optional)
+
+Specify the linux input device.
+
+* --no-grab (optional)
+
+Do not request exclusive access to the input device.
+
+vhost-user-gpu program conventions
+----------------------------------
+
+Capabilities:
+
+gpu-render-node
+
+ The --render-node command line option is supported.
+
+gpu-virgl
+
+ The --virgl command line option is supported.
+
+* --render-node=PATH (optional)
+
+Specify the GPU DRM render node.
+
+* --virgl (optional)
+
+Enable virgl rendering support.
--
2.19.0.rc1
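As an illustration of the common options described in the patch, here is
a minimal C sketch of how a backend might parse them. The struct, the
function names, and the choice to reject a command line that gives
neither --socket-path nor --fd are assumptions made for the example;
the document only states that the two options are incompatible and that
--print-capabilities causes other options to be ignored.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result of parsing the common backend options. */
struct backend_opts {
    const char *socket_path;    /* --socket-path=PATH, NULL if unset */
    int fd;                     /* --fd=FDNUM, -1 if unset */
    const char *pidfile;        /* --pidfile=PATH, NULL if unset */
    bool print_capabilities;    /* --print-capabilities */
};

/* Return the VALUE part if arg has the form "<name>=VALUE", else NULL. */
static const char *
opt_value(const char *arg, const char *name)
{
    size_t len = strlen(name);

    if (strncmp(arg, name, len) == 0 && arg[len] == '=')
        return arg + len + 1;
    return NULL;
}

/* Parse argv into *o. Returns 0 on success, -1 on a usage error. */
int
parse_backend_opts(int argc, char **argv, struct backend_opts *o)
{
    bool bad_fd = false;

    assert(o != NULL);
    o->socket_path = NULL;
    o->fd = -1;
    o->pidfile = NULL;
    o->print_capabilities = false;

    for (int i = 1; i < argc; i++) {
        const char *v;

        if (strcmp(argv[i], "--print-capabilities") == 0) {
            o->print_capabilities = true;
        } else if ((v = opt_value(argv[i], "--socket-path"))) {
            o->socket_path = v;
        } else if ((v = opt_value(argv[i], "--fd"))) {
            char *end;
            long n = strtol(v, &end, 10);

            if (*v == '\0' || *end != '\0' || n < 0 || n > INT_MAX)
                bad_fd = true;
            else
                o->fd = (int)n;
        } else if ((v = opt_value(argv[i], "--pidfile"))) {
            o->pidfile = v;
        }
        /* Device-specific options are skipped in this sketch. */
    }

    /* With --print-capabilities, other options are ignored. */
    if (o->print_capabilities)
        return 0;
    if (bad_fd)
        return -1;
    /* --socket-path and --fd are mutually exclusive. */
    if (o->socket_path && o->fd >= 0)
        return -1;
    /* Assumption: one of the two must be given to reach the socket. */
    if (!o->socket_path && o->fd < 0)
        return -1;

    return 0;
}
```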
[libvirt] [PATCH] util: Add stubs for virDoes{User, Group}Exist() without getpwuid_r
by Martin Kletzander
Signed-off-by: Martin Kletzander <mkletzan(a)redhat.com>
---
Pushed under the build-breaker rule.
src/util/virutil.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/src/util/virutil.c b/src/util/virutil.c
index 3d1b02eceb31..b0334edb36a0 100644
--- a/src/util/virutil.c
+++ b/src/util/virutil.c
@@ -1248,6 +1248,18 @@ virGetGroupList(uid_t uid ATTRIBUTE_UNUSED, gid_t gid ATTRIBUTE_UNUSED,
return 0;
}
+int
+virDoesUserExist(const char *name ATTRIBUTE_UNUSED)
+{
+ return 0;
+}
+
+int
+virDoesGroupExist(const char *name ATTRIBUTE_UNUSED)
+{
+ return 0;
+}
+
# ifdef WIN32
/* These methods are adapted from GLib2 under terms of LGPLv2+ */
static int
--
2.18.0
[libvirt] [PATCH] qemu: Introduce state_lock_timeout to qemu.conf
by Yi Wang
When a job holds the state lock for a long time,
we may come across the error:
"Timed out during operation: cannot acquire state change lock"
Sometimes this is not a problem and users want to continue
to wait, so this patch allows users to decide how long they
are willing to wait for the state lock.
Signed-off-by: Yi Wang <wang.yi59(a)zte.com.cn>
Reviewed-by: Xi Xu <xu.xi8(a)zte.com.cn>
---
changes in v6:
- modify the description in qemu.conf
- move the multiplication to BeginJobInternal
changes in v5:
- adjust position of state lock in aug file
- fix state lock time got from conf file from milliseconds to
seconds
changes in v4:
- fix syntax-check error
changes in v3:
- add user-friendly description and nb of state lock
- check validity of stateLockTimeout
changes in v2:
- change default value to 30 in qemu.conf
- set the default value in virQEMUDriverConfigNew()
---
src/qemu/libvirtd_qemu.aug | 1 +
src/qemu/qemu.conf | 7 +++++++
src/qemu/qemu_conf.c | 14 ++++++++++++++
src/qemu/qemu_conf.h | 2 ++
src/qemu/qemu_domain.c | 7 +++----
src/qemu/test_libvirtd_qemu.aug.in | 1 +
6 files changed, 28 insertions(+), 4 deletions(-)
diff --git a/src/qemu/libvirtd_qemu.aug b/src/qemu/libvirtd_qemu.aug
index ddc4bbf..a5601e1 100644
--- a/src/qemu/libvirtd_qemu.aug
+++ b/src/qemu/libvirtd_qemu.aug
@@ -100,6 +100,7 @@ module Libvirtd_qemu =
| str_entry "lock_manager"
let rpc_entry = int_entry "max_queued"
+ | int_entry "state_lock_timeout"
| int_entry "keepalive_interval"
| int_entry "keepalive_count"
diff --git a/src/qemu/qemu.conf b/src/qemu/qemu.conf
index cd57b3c..f5e34f1 100644
--- a/src/qemu/qemu.conf
+++ b/src/qemu/qemu.conf
@@ -667,6 +667,13 @@
#
#max_queued = 0
+
+# It is strongly recommended to not touch this setting
+#
+# Default is 30
+#
+#state_lock_timeout = 60
+
###################################################################
# Keepalive protocol:
# This allows qemu driver to detect broken connections to remote
diff --git a/src/qemu/qemu_conf.c b/src/qemu/qemu_conf.c
index a4f545e..5be37dc 100644
--- a/src/qemu/qemu_conf.c
+++ b/src/qemu/qemu_conf.c
@@ -129,6 +129,9 @@ void qemuDomainCmdlineDefFree(qemuDomainCmdlineDefPtr def)
#endif
+/* Give up waiting for mutex after 30 seconds */
+#define QEMU_JOB_WAIT_TIME (30)
+
virQEMUDriverConfigPtr virQEMUDriverConfigNew(bool privileged)
{
virQEMUDriverConfigPtr cfg;
@@ -346,6 +349,8 @@ virQEMUDriverConfigPtr virQEMUDriverConfigNew(bool privileged)
cfg->glusterDebugLevel = 4;
cfg->stdioLogD = true;
+ cfg->stateLockTimeout = QEMU_JOB_WAIT_TIME;
+
if (!(cfg->namespaces = virBitmapNew(QEMU_DOMAIN_NS_LAST)))
goto error;
@@ -863,6 +868,9 @@ int virQEMUDriverConfigLoadFile(virQEMUDriverConfigPtr cfg,
if (virConfGetValueUInt(conf, "keepalive_count", &cfg->keepAliveCount) < 0)
goto cleanup;
+ if (virConfGetValueInt(conf, "state_lock_timeout", &cfg->stateLockTimeout) < 0)
+ goto cleanup;
+
if (virConfGetValueInt(conf, "seccomp_sandbox", &cfg->seccompSandbox) < 0)
goto cleanup;
@@ -1055,6 +1063,12 @@ virQEMUDriverConfigValidate(virQEMUDriverConfigPtr cfg)
return -1;
}
+ if (cfg->stateLockTimeout <= 0) {
+ virReportError(VIR_ERR_CONF_SYNTAX, "%s",
+ _("state_lock_timeout must be larger than zero"));
+ return -1;
+ }
+
return 0;
}
diff --git a/src/qemu/qemu_conf.h b/src/qemu/qemu_conf.h
index a8d84ef..97cf2e1 100644
--- a/src/qemu/qemu_conf.h
+++ b/src/qemu/qemu_conf.h
@@ -190,6 +190,8 @@ struct _virQEMUDriverConfig {
int keepAliveInterval;
unsigned int keepAliveCount;
+ int stateLockTimeout;
+
int seccompSandbox;
char *migrateHost;
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 886e3fb..306772a 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -6652,9 +6652,6 @@ qemuDomainObjCanSetJob(qemuDomainObjPrivatePtr priv,
priv->job.agentActive == QEMU_AGENT_JOB_NONE));
}
-/* Give up waiting for mutex after 30 seconds */
-#define QEMU_JOB_WAIT_TIME (1000ull * 30)
-
/**
* qemuDomainObjBeginJobInternal:
* @driver: qemu driver
@@ -6714,7 +6711,9 @@ qemuDomainObjBeginJobInternal(virQEMUDriverPtr driver,
}
priv->jobs_queued++;
- then = now + QEMU_JOB_WAIT_TIME;
+
+ cfg->stateLockTimeout *= 1000;
+ then = now + cfg->stateLockTimeout;
retry:
if ((!async && job != QEMU_JOB_DESTROY) &&
diff --git a/src/qemu/test_libvirtd_qemu.aug.in b/src/qemu/test_libvirtd_qemu.aug.in
index f1e8806..8e072d0 100644
--- a/src/qemu/test_libvirtd_qemu.aug.in
+++ b/src/qemu/test_libvirtd_qemu.aug.in
@@ -82,6 +82,7 @@ module Test_libvirtd_qemu =
{ "relaxed_acs_check" = "1" }
{ "lock_manager" = "lockd" }
{ "max_queued" = "0" }
+{ "state_lock_timeout" = "60" }
{ "keepalive_interval" = "5" }
{ "keepalive_count" = "5" }
{ "seccomp_sandbox" = "1" }
--
1.8.3.1
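A minimal C sketch of the seconds-to-milliseconds conversion done at the
point of use, assuming the configured value stays in seconds as this
version intends. Note that the hunk above scales cfg->stateLockTimeout
in place each time qemuDomainObjBeginJobInternal runs; the sketch
instead converts into a local value so the stored setting is never
modified. The function name job_deadline_ms is hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Compute an absolute deadline in milliseconds from the current time
 * and a timeout given in seconds. The timeout is taken by value, so
 * the caller's configuration is left untouched.
 * Returns 0 on success, -1 if the timeout is not positive or the
 * deadline would overflow. */
int
job_deadline_ms(unsigned long long now_ms, int timeout_sec,
                unsigned long long *deadline_ms)
{
    unsigned long long wait_ms;

    assert(deadline_ms != NULL);

    if (timeout_sec <= 0)
        return -1;

    wait_ms = (unsigned long long)timeout_sec * 1000ull;
    if (wait_ms > ULLONG_MAX - now_ms)
        return -1;

    *deadline_ms = now_ms + wait_ms;
    return 0;
}
```

Calling it twice with the same configured value yields the same wait
interval both times, which is the property the in-place multiplication
would not preserve.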
[libvirt] [PATCH v5] qemu: Introduce state_lock_timeout to qemu.conf
by Yi Wang
When a job holds the state lock for a long time,
we may come across the error:
"Timed out during operation: cannot acquire state change lock"
Sometimes this is not a problem and users want to continue
to wait, so this patch allows users to decide how long they
are willing to wait for the state lock.
Signed-off-by: Yi Wang <wang.yi59(a)zte.com.cn>
Reviewed-by: Xi Xu <xu.xi8(a)zte.com.cn>
---
changes in v5:
- adjust position of state lock in aug file
- fix state lock time got from conf file from milliseconds to
seconds
changes in v4:
- fix syntax-check error
changes in v3:
- add user-friendly description and nb of state lock
- check validity of stateLockTimeout
changes in v2:
- change default value to 30 in qemu.conf
- set the default value in virQEMUDriverConfigNew()
---
src/qemu/libvirtd_qemu.aug | 1 +
src/qemu/qemu.conf | 10 ++++++++++
src/qemu/qemu_conf.c | 15 +++++++++++++++
src/qemu/qemu_conf.h | 2 ++
src/qemu/qemu_domain.c | 5 +----
src/qemu/test_libvirtd_qemu.aug.in | 1 +
6 files changed, 30 insertions(+), 4 deletions(-)
diff --git a/src/qemu/libvirtd_qemu.aug b/src/qemu/libvirtd_qemu.aug
index ddc4bbf..a5601e1 100644
--- a/src/qemu/libvirtd_qemu.aug
+++ b/src/qemu/libvirtd_qemu.aug
@@ -100,6 +100,7 @@ module Libvirtd_qemu =
| str_entry "lock_manager"
let rpc_entry = int_entry "max_queued"
+ | int_entry "state_lock_timeout"
| int_entry "keepalive_interval"
| int_entry "keepalive_count"
diff --git a/src/qemu/qemu.conf b/src/qemu/qemu.conf
index cd57b3c..8920a1a 100644
--- a/src/qemu/qemu.conf
+++ b/src/qemu/qemu.conf
@@ -667,6 +667,16 @@
#
#max_queued = 0
+
+# When two or more threads want to work with the same domain they use a
+# job lock to mutually exclude each other. However, waiting for the lock
+# is limited up to state_lock_timeout seconds.
+# NB, strong recommendation to set the timeout longer than 30 seconds.
+#
+# Default is 30
+#
+#state_lock_timeout = 60
+
###################################################################
# Keepalive protocol:
# This allows qemu driver to detect broken connections to remote
diff --git a/src/qemu/qemu_conf.c b/src/qemu/qemu_conf.c
index a4f545e..012f4d1 100644
--- a/src/qemu/qemu_conf.c
+++ b/src/qemu/qemu_conf.c
@@ -129,6 +129,9 @@ void qemuDomainCmdlineDefFree(qemuDomainCmdlineDefPtr def)
#endif
+/* Give up waiting for mutex after 30 seconds */
+#define QEMU_JOB_WAIT_TIME (1000ull * 30)
+
virQEMUDriverConfigPtr virQEMUDriverConfigNew(bool privileged)
{
virQEMUDriverConfigPtr cfg;
@@ -346,6 +349,8 @@ virQEMUDriverConfigPtr virQEMUDriverConfigNew(bool privileged)
cfg->glusterDebugLevel = 4;
cfg->stdioLogD = true;
+ cfg->stateLockTimeout = QEMU_JOB_WAIT_TIME;
+
if (!(cfg->namespaces = virBitmapNew(QEMU_DOMAIN_NS_LAST)))
goto error;
@@ -863,6 +868,10 @@ int virQEMUDriverConfigLoadFile(virQEMUDriverConfigPtr cfg,
if (virConfGetValueUInt(conf, "keepalive_count", &cfg->keepAliveCount) < 0)
goto cleanup;
+ if (virConfGetValueInt(conf, "state_lock_timeout", &cfg->stateLockTimeout) < 0)
+ goto cleanup;
+ cfg->stateLockTimeout *= 1000;
+
if (virConfGetValueInt(conf, "seccomp_sandbox", &cfg->seccompSandbox) < 0)
goto cleanup;
@@ -1055,6 +1064,12 @@ virQEMUDriverConfigValidate(virQEMUDriverConfigPtr cfg)
return -1;
}
+ if (cfg->stateLockTimeout <= 0) {
+ virReportError(VIR_ERR_CONF_SYNTAX, "%s",
+ _("state_lock_timeout should larger than zero"));
+ return -1;
+ }
+
return 0;
}
diff --git a/src/qemu/qemu_conf.h b/src/qemu/qemu_conf.h
index a8d84ef..97cf2e1 100644
--- a/src/qemu/qemu_conf.h
+++ b/src/qemu/qemu_conf.h
@@ -190,6 +190,8 @@ struct _virQEMUDriverConfig {
int keepAliveInterval;
unsigned int keepAliveCount;
+ int stateLockTimeout;
+
int seccompSandbox;
char *migrateHost;
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 886e3fb..5a2ca52 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -6652,9 +6652,6 @@ qemuDomainObjCanSetJob(qemuDomainObjPrivatePtr priv,
priv->job.agentActive == QEMU_AGENT_JOB_NONE));
}
-/* Give up waiting for mutex after 30 seconds */
-#define QEMU_JOB_WAIT_TIME (1000ull * 30)
-
/**
* qemuDomainObjBeginJobInternal:
* @driver: qemu driver
@@ -6714,7 +6711,7 @@ qemuDomainObjBeginJobInternal(virQEMUDriverPtr driver,
}
priv->jobs_queued++;
- then = now + QEMU_JOB_WAIT_TIME;
+ then = now + cfg->stateLockTimeout;
retry:
if ((!async && job != QEMU_JOB_DESTROY) &&
diff --git a/src/qemu/test_libvirtd_qemu.aug.in b/src/qemu/test_libvirtd_qemu.aug.in
index f1e8806..8e072d0 100644
--- a/src/qemu/test_libvirtd_qemu.aug.in
+++ b/src/qemu/test_libvirtd_qemu.aug.in
@@ -82,6 +82,7 @@ module Test_libvirtd_qemu =
{ "relaxed_acs_check" = "1" }
{ "lock_manager" = "lockd" }
{ "max_queued" = "0" }
+{ "state_lock_timeout" = "60" }
{ "keepalive_interval" = "5" }
{ "keepalive_count" = "5" }
{ "seccomp_sandbox" = "1" }
--
1.8.3.1
[libvirt] Re:Re: [PATCH] add nodeset='all' and default for interleavemode
by peng.hao2@zte.com.cn
>On 09/11/2018 04:28 PM, Peng Hao wrote:
>> For interleave mode, sometimes we want to allocate memory evenly
>> across all nodes on the host. But different hosts have different numbers
>> of nodes. So we add nodeset='all' for interleave mode, and if
>> nodeset=NULL the default nodeset is 'all' for interleave mode.
>>
>> Signed-off-by: Peng Hao <peng.hao2(a)zte.com.cn>
>> ---
>> src/conf/numa_conf.c | 73 ++++++++++++++++++++++++++++++++++++++++------------
>> 1 file changed, 57 insertions(+), 16 deletions(-)
>
>Firstly, this patch does not pass 'syntax-check'. Secondly, it breaks
>qemuxml2argvtest.
>
I will pay attention to this.
>> + numa->memory.allnode = true;
>> + } else {
>Any patch that changes accepted XML needs to go hand in hand with
>documentation and RNG update and a test case.
I will add these in the next version.
>> + if (placement == VIR_DOMAIN_NUMATUNE_PLACEMENT_STATIC &&
>> + mode == VIR_DOMAIN_NUMATUNE_MEM_INTERLEAVE &&
>> + numa->memory.allnode) {
>> + if ((bitmap = virBitmapNew(VIR_DOMAIN_CPUMASK_LEN)) == NULL)
>> + goto cleanup;
>> + virBitmapClearAll(bitmap);
>> + maxnode = numa_max_node();
>So, you're including numa.h to get this function. What if:
>a) numa.h is not available?
>b) what is wrong with virNumaGetMaxNode()?
>> + for (i = 0; i <= maxnode; i++) {
>> + if (virBitmapSetBit(bitmap, i) < 0) {
>> + virBitmapFree(bitmap);
>> + goto cleanup;
>> + }
>> + }
>> + if (numa->memory.nodeset)
>> + virBitmapFree(numa->memory.nodeset);
>> + numa->memory.nodeset = bitmap;
>> }
>>
>> /* setting nodeset when placement auto is invalid */
>>
>But more importantly, why is this patch needed? I might be missing
>something, but:
>a) you can just not pin the memory to avoid mismatch of NUMA nodes on
>migration,
>b) supply new domain XML on migration where NUMA nodes match the destination
>Isn't pinning memory to all NUMA nodes equivalent to no pinning at all?
I would use 'interleave' to distribute the virtual machine's memory evenly across all nodes.
However, the 'interleave' setting requires a 'nodeset' to be provided, which I think is not so convenient.
>Michal
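The idea behind nodeset='all' can be sketched in C as building a mask
that covers every node from 0 up to the highest node on the host. This
is a simplified illustration limited to 64 nodes, with a hypothetical
function name; it does not use virBitmap or libnuma, and the caller is
assumed to obtain the highest node number elsewhere.

```c
#include <stdint.h>

/* Return a mask with bits 0..max_node set, i.e. "all nodes".
 * Returns 0 if max_node is negative or does not fit in 64 bits. */
uint64_t
all_nodes_mask(int max_node)
{
    if (max_node < 0 || max_node > 63)
        return 0;

    /* Shifting by 64 would be undefined, so handle the full mask
     * separately. */
    if (max_node == 63)
        return UINT64_MAX;

    return (UINT64_C(1) << (max_node + 1)) - 1;
}
```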
[libvirt] [PATCH 0/5] qemu: misc graphics fixes
by Nikolay Shirokovskiy
Nikolay Shirokovskiy (5):
qemu: fix typo in vnc port releasing
qemu: simplify graphics port releasing
qemu: vnc: mark websocket as used on reconnect
qemu: mark graphics ports as used on migration
qemu: keep websocketGenerated on libvirtd restarts
src/conf/domain_conf.c | 9 +++++++++
src/conf/domain_conf.h | 1 +
src/qemu/qemu_migration.c | 6 ++++++
src/qemu/qemu_process.c | 51 ++++++++++++++++++++++-------------------------
src/qemu/qemu_process.h | 3 +++
5 files changed, 43 insertions(+), 27 deletions(-)
--
1.8.3.1
[libvirt] [libvirt PATCH v2 0/4] Share cgroup code that is duplicated between QEMU and LXC
by Fabiano Fidêncio
virLXCCgroupSetupBlkioTune() and qemuSetupBlkioCgroup() and
virLXCCgroupSetupCpuTune() and qemuSetupCpuCgroup() are the most similar
functions between QEMU and LXC code.
Let's move their common code to virCgroup.
Mind that the first two patches are basically preparing the ground for
the changes introduced in the last two patches.
changes since v1:
- Michal Privoznik pointed out (as did the `make syntax-check` :-)) that
we do want to keep src/util independent of the parsing code (thus,
including "conf/domain_conf.h" in vircgroup.h is not the way to go).
This has been solved now by partially following Michal's suggestion
and splitting the structs and functions that would be used in the
common code into new, separate files.
Fabiano Fidêncio (4):
domain_conf: split out virBlkioDevice and virDomainBlkiotune
definitions
domain_conf: split out virDomainMemtune and virDomainHugePage
definitions
vircgroup: Add virCgroupSetupBlkioTune()
vircgroup: Add virCgroupSetupMemTune()
src/Makefile.am | 1 +
src/conf/domain_conf.c | 22 ++++--------
src/conf/domain_conf.h | 70 +++----------------------------------
src/libvirt_private.syms | 2 ++
src/lxc/lxc_cgroup.c | 69 ++-----------------------------------
src/qemu/qemu_cgroup.c | 61 ++-------------------------------
src/qemu/qemu_command.c | 4 +--
src/util/Makefile.inc.am | 2 ++
src/util/virblkio.c | 37 ++++++++++++++++++++
src/util/virblkio.h | 52 ++++++++++++++++++++++++++++
src/util/vircgroup.c | 74 ++++++++++++++++++++++++++++++++++++++++
src/util/vircgroup.h | 7 ++++
src/util/virmem.h | 66 +++++++++++++++++++++++++++++++++++
13 files changed, 259 insertions(+), 208 deletions(-)
create mode 100644 src/util/virblkio.c
create mode 100644 src/util/virblkio.h
create mode 100644 src/util/virmem.h
--
2.17.1