[libvirt] [PATCHv8 00/17] Introduce x86 Cache Monitoring Technology (CMT)
by Wang Huaqiang
This patch series, together with the series that has already been merged,
introduces x86 Cache Monitoring Technology (CMT) to libvirt by interacting
with the kernel resource control (resctrl) interface. CMT is one of the
Intel(R) x86 CPU features that belong to the Resource Director
Technology (RDT). CMT reports the occupancy of the last level cache,
which is shared by all CPU cores.
The v1 series introduced the original, complete CMT feature.
The v2 and v3 patches address the host capability part of CMT.
v4 addresses monitoring the cache occupancy of a set of VM vcpu threads
and reporting it through a virsh command.
We have had several discussions about enabling CMT; please refer to
the following links for the RFCs.
RFCv3
https://www.redhat.com/archives/libvir-list/2018-August/msg01213.html
RFCv2
https://www.redhat.com/archives/libvir-list/2018-July/msg00409.html
https://www.redhat.com/archives/libvir-list/2018-July/msg01241.html
RFCv1
https://www.redhat.com/archives/libvir-list/2018-June/msg00674.html
The commits already merged for the host capability of CMT are listed below.
6af8417415508c31f8ce71234b573b4999f35980
8f6887998bf63594ae26e3db18d4d5896c5f2cb4
58fcee6f3a2b7e89c21c1fb4ec21429c31a0c5b8
12093f1feaf8f5023dcd9d65dff111022842183d
a5d293c18831dcf69ec6195798387fbb70c9f461
1. Why is CMT necessary in libvirt?
The perf events 'CMT, MBML, MBMT' have been phased out since Linux
kernel commit c39a0e2c8850f08249383f2425dbd8dbe4baad69, so the perf-based
cmt and mbm support in libvirt does not work with the latest Linux kernel. These
patches add the CMT feature to libvirt through the kernel resctrlfs interface.
2. Create a cache monitoring group (cache monitor).
The main interface for creating a monitoring group is the domain XML file. The
proposed configuration looks like this:
  <cputune>
    <cachetune vcpus='1'>
      <cache id='0' level='3' type='code' size='7680' unit='KiB'/>
      <cache id='1' level='3' type='data' size='3840' unit='KiB'/>
+     <monitor level='3' vcpus='1'/>
    </cachetune>
    <cachetune vcpus='4-7'>
+     <monitor level='3' vcpus='4-6'/>
    </cachetune>
  </cputune>
The XML above creates 2 cache resctrl allocation groups and 2 resctrl
monitoring groups.
Changes to the cache monitor take effect at the next boot of the VM.
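The relationship between allocation groups and monitor groups in the XML above can be sketched in Python. This is only an illustration, not libvirt code: `expand_vcpus` and `list_monitors` are hypothetical helper names, and the vcpu list syntax is assumed to be comma-separated numbers and ranges.

```python
import xml.etree.ElementTree as ET

# The proposed configuration from this cover letter, without the '+' markers.
CPUTUNE_XML = """
<cputune>
  <cachetune vcpus='1'>
    <cache id='0' level='3' type='code' size='7680' unit='KiB'/>
    <cache id='1' level='3' type='data' size='3840' unit='KiB'/>
    <monitor level='3' vcpus='1'/>
  </cachetune>
  <cachetune vcpus='4-7'>
    <monitor level='3' vcpus='4-6'/>
  </cachetune>
</cputune>
"""


def expand_vcpus(spec):
    """Expand a vcpu list such as '1' or '4-6,8' into a sorted list of ints."""
    vcpus = set()
    for part in spec.split(","):
        if "-" in part:
            first, last = part.split("-")
            vcpus.update(range(int(first), int(last) + 1))
        else:
            vcpus.add(int(part))
    return sorted(vcpus)


def list_monitors(xml_text):
    """Return (allocation vcpus, monitor vcpus, cache level) per <monitor>."""
    root = ET.fromstring(xml_text)
    result = []
    for cachetune in root.findall("cachetune"):
        alloc_vcpus = expand_vcpus(cachetune.get("vcpus"))
        for monitor in cachetune.findall("monitor"):
            result.append((alloc_vcpus,
                           expand_vcpus(monitor.get("vcpus")),
                           int(monitor.get("level"))))
    return result


monitors = list_monitors(CPUTUNE_XML)
for alloc_vcpus, mon_vcpus, level in monitors:
    print(f"allocation vcpus={alloc_vcpus} monitor vcpus={mon_vcpus} level={level}")
```

Each `<monitor>` is read together with its enclosing `<cachetune>`, which makes it easy to see that the second monitor covers vcpus 4-6 of an allocation group spanning vcpus 4-7.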
3. Show the CMT result through the 'domstats' command
The qemu driver gains an interface that reports this information for each
resource monitor group through the command 'virsh domstats --cpu-total'.
Below is typical output:
# virsh domstats 1 --cpu-total
Domain: 'ubuntu16.04-base'
...
cpu.cache.monitor.count=2
cpu.cache.monitor.0.name=vcpus_1
cpu.cache.monitor.0.vcpus=1
cpu.cache.monitor.0.bank.count=2
cpu.cache.monitor.0.bank.0.id=0
cpu.cache.monitor.0.bank.0.bytes=4505600
cpu.cache.monitor.0.bank.1.id=1
cpu.cache.monitor.0.bank.1.bytes=5586944
cpu.cache.monitor.1.name=vcpus_4-6
cpu.cache.monitor.1.vcpus=4,5,6
cpu.cache.monitor.1.bank.count=2
cpu.cache.monitor.1.bank.0.id=0
cpu.cache.monitor.1.bank.0.bytes=17571840
cpu.cache.monitor.1.bank.1.id=1
cpu.cache.monitor.1.bank.1.bytes=29106176
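A consumer of this output could rebuild the per-monitor structure from the flat key names. The following Python sketch assumes only the key layout shown in the sample above; `parse_cache_monitors` is a hypothetical helper, not part of virsh or libvirt.

```python
# Sample lines copied from the 'virsh domstats --cpu-total' output above.
SAMPLE = """\
cpu.cache.monitor.count=2
cpu.cache.monitor.0.name=vcpus_1
cpu.cache.monitor.0.vcpus=1
cpu.cache.monitor.0.bank.count=2
cpu.cache.monitor.0.bank.0.id=0
cpu.cache.monitor.0.bank.0.bytes=4505600
cpu.cache.monitor.0.bank.1.id=1
cpu.cache.monitor.0.bank.1.bytes=5586944
cpu.cache.monitor.1.name=vcpus_4-6
cpu.cache.monitor.1.vcpus=4,5,6
cpu.cache.monitor.1.bank.count=2
cpu.cache.monitor.1.bank.0.id=0
cpu.cache.monitor.1.bank.0.bytes=17571840
cpu.cache.monitor.1.bank.1.id=1
cpu.cache.monitor.1.bank.1.bytes=29106176
"""


def parse_cache_monitors(text):
    """Turn flat cpu.cache.monitor.* lines into a list of monitor dicts."""
    flat = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("cpu.cache.monitor.") and "=" in line:
            key, value = line.split("=", 1)
            flat[key] = value

    monitors = []
    for i in range(int(flat.get("cpu.cache.monitor.count", 0))):
        prefix = f"cpu.cache.monitor.{i}."
        banks = {}
        for j in range(int(flat[prefix + "bank.count"])):
            bank_id = int(flat[f"{prefix}bank.{j}.id"])
            banks[bank_id] = int(flat[f"{prefix}bank.{j}.bytes"])
        monitors.append({
            "name": flat[prefix + "name"],
            "vcpus": flat[prefix + "vcpus"],
            "banks": banks,  # cache bank id -> occupancy in bytes
        })
    return monitors


for monitor in parse_cache_monitors(SAMPLE):
    total = sum(monitor["banks"].values())
    print(f"{monitor['name']}: {total} bytes across {len(monitor['banks'])} banks")
```

Summing the banks gives the total last level cache occupancy of a monitor group across all cache banks.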
Changes in v8:
- Addressing John's review comments for v7.
- Add patch for refactoring virResctrlAllocSetID and a separate patch
for virResctrlMonitorSetID.
- Removed patch for 'resctrl->id'.
- Removed patch for validating monitor through checking *tasks file.
- Removed patch for setting up vcpus on libvirt reconnection.
- Re-designed the functions for showing the result for command 'virsh domstats'.
- Move virResctrlMonitorGetCacheOccupancy and its local helper functions to
virresctrl.c.
Changes in v7:
- Add several lines removed by mistake.
Changes in v6:
- Addressing John's review comments for v5.
- Removed and cleaned the concepts of 'default allocation' and
'default monitor'.
- qemu: virsh domstats --cpu-total output for CMT, add identifier
'monitor' for each item.
Changes in v5:
- qemu: Setting up vcpu and adding pids to resctrl monitor groups during
re-connection.
- Add the document for domain configuration related to resctrl monitor.
Changes in v4:
v4 is addressing the feature for monitoring VM vcpu
thread set cache occupancy and reporting it through a
virsh command.
- Introduced resctrl default allocation
- Introduced resctrl monitor and default monitor
Changes in v3:
- Addressed John Ferlan's review.
- Typo fixed.
- Removed VIR_ENUM_DECL(virMonitor);
Changes in v2:
- Introduced MBM capability.
- Capability layout changed
* Moved <monitor> from cache <bank> to <cache>
* Renamed <Threshold> to <reuseThreshold>
- Document for 'reuseThreshold' changed.
- Introduced API virResctrlInfoGetMonitorPrefix
- Added more tests, covering standalone CMT, fake new
feature.
- Creating CMT resource control group will be
subsequent job.
Wang Huaqiang (17):
docs,util: Refactor schemas and virresctrl to support optional cache
util: Introduce resctrl monitor for CMT
util: Refactor code for determining allocation path
util: Add interface to determine monitor path
util: Refactor code for adding PID to the resource group
util: Add interface for adding PID to the monitor
util: Refactor code for creating resctrl group
util: Add interface for creating monitor group
util: Refactor virResctrlAllocSetID to set allocation ID
util: Add interface for setting monitor ID.
util: Add more interfaces for resctrl monitor
conf: Remove virDomainResctrlAppend and introduce virDomainResctrlNew
conf: Introduce cache monitor element in cachetune
qemu: enable resctrl monitor in qemu
qemu: Refactor qemuDomainGetStatsCpu
qemu: Report cache occupancy (CMT) with domstats
docs: Updated news.xml about the CMT support
docs/formatdomain.html.in | 30 +-
docs/news.xml | 11 +
docs/schemas/domaincommon.rng | 14 +-
src/conf/domain_conf.c | 286 +++++++++++-
src/conf/domain_conf.h | 11 +
src/libvirt-domain.c | 9 +
src/libvirt_private.syms | 9 +
src/qemu/qemu_driver.c | 219 +++++++++-
src/qemu/qemu_process.c | 49 ++-
src/util/virresctrl.c | 484 +++++++++++++++++++--
src/util/virresctrl.h | 46 ++
tests/genericxml2xmlindata/cachetune-cdp.xml | 3 +
.../cachetune-colliding-monitor.xml | 30 ++
tests/genericxml2xmlindata/cachetune-small.xml | 7 +
tests/genericxml2xmltest.c | 2 +
15 files changed, 1142 insertions(+), 68 deletions(-)
create mode 100644 tests/genericxml2xmlindata/cachetune-colliding-monitor.xml
--
2.7.4
6 years, 1 month
[libvirt] [PATCH for-3.2 v2] vhost-user: define conventions for vhost-user backends
by Marc-André Lureau
As discussed during "[PATCH v4 00/29] vhost-user for input & GPU"
review, let's define a common set of backend conventions to help with
management layer implementation, and interoperability.
v2:
- use a vhost-user.json schema to discover backends and describe
capability format
- drop --pidfile
- add some notes about daemonizing & stdin/out/err
Cc: libvir-list(a)redhat.com
Cc: Gerd Hoffmann <kraxel(a)redhat.com>
Cc: Daniel P. Berrangé <berrange(a)redhat.com>
Cc: Changpeng Liu <changpeng.liu(a)intel.com>
Cc: Dr. David Alan Gilbert <dgilbert(a)redhat.com>
Cc: Felipe Franciosi <felipe(a)nutanix.com>
Cc: Gonglei <arei.gonglei(a)huawei.com>
Cc: Maxime Coquelin <maxime.coquelin(a)redhat.com>
Cc: Michael S. Tsirkin <mst(a)redhat.com>
Cc: Victor Kaplansky <victork(a)redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau(a)redhat.com>
---
MAINTAINERS | 1 +
docs/interop/vhost-user.json | 219 +++++++++++++++++++++++++++++++++++
docs/interop/vhost-user.txt | 101 +++++++++++++++-
3 files changed, 319 insertions(+), 2 deletions(-)
create mode 100644 docs/interop/vhost-user.json
diff --git a/MAINTAINERS b/MAINTAINERS
index bd2dff7827..58082c6d92 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1238,6 +1238,7 @@ vhost
M: Michael S. Tsirkin <mst(a)redhat.com>
S: Supported
F: hw/*/*vhost*
+F: docs/interop/vhost-user.json
F: docs/interop/vhost-user.txt
F: backends/vhost-user.c
F: include/sysemu/vhost-user-backend.h
diff --git a/docs/interop/vhost-user.json b/docs/interop/vhost-user.json
new file mode 100644
index 0000000000..91b5bf499e
--- /dev/null
+++ b/docs/interop/vhost-user.json
@@ -0,0 +1,219 @@
+# -*- Mode: Python -*-
+#
+# Copyright (C) 2018 Red Hat, Inc.
+#
+# Authors:
+# Marc-André Lureau <marcandre.lureau(a)redhat.com>
+#
+# This work is licensed under the terms of the GNU GPL, version 2 or
+# later. See the COPYING file in the top-level directory.
+
+##
+# = vhost user backend discovery & capabilities
+##
+
+##
+# @VHostUserBackendType:
+#
+# List the various vhost user backend types.
+#
+# @net: virtio net
+# @block: virtio block
+# @console: virtio console
+# @rng: virtio rng
+# @balloon: virtio balloon
+# @rpmsg: virtio remote processor messaging
+# @scsi: virtio scsi
+# @9p: 9p virtio console
+# @rproc-serial: virtio remoteproc serial link
+# @caif: virtio caif
+# @gpu: virtio gpu
+# @input: virtio input
+# @vsock: virtio vsock transport
+# @crypto: virtio crypto
+#
+# Since: 3.2
+##
+{
+ 'enum': 'VHostUserBackendType',
+ 'data': [ 'net', 'block', 'console', 'rng', 'balloon', 'rpmsg',
+ 'scsi', '9p', 'rproc-serial', 'caif', 'gpu', 'input', 'vsock',
+ 'crypto' ]
+}
+
+##
+# @VHostUserBackendInputFeature:
+#
+# List of vhost user "input" features.
+#
+# @evdev-path: The --evdev-path command line option is supported.
+# @no-grab: The --no-grab command line option is supported.
+#
+# Since: 3.2
+##
+{
+ 'enum': 'VHostUserBackendInputFeature',
+ 'data': [ 'evdev-path', 'no-grab' ]
+}
+
+##
+# @VHostUserBackendCapabilitiesInput:
+#
+# Capabilities reported by vhost user "input" backends
+#
+# @features: list of supported features.
+#
+# Since: 3.2
+##
+{
+ 'struct': 'VHostUserBackendCapabilitiesInput',
+ 'data': {
+ 'features': [ 'VHostUserBackendInputFeature' ]
+ }
+}
+
+##
+# @VHostUserBackendGPUFeature:
+#
+# List of vhost user "gpu" features.
+#
+# @render-node: The --render-node command line option is supported.
+# @virgl: The --virgl command line option is supported.
+#
+# Since: 3.2
+##
+{
+ 'enum': 'VHostUserBackendGPUFeature',
+ 'data': [ 'render-node', 'virgl' ]
+}
+
+##
+# @VHostUserBackendCapabilitiesGPU:
+#
+# Capabilities reported by vhost user "gpu" backends.
+#
+# @features: list of supported features.
+#
+# Since: 3.2
+##
+{
+ 'struct': 'VHostUserBackendCapabilitiesGPU',
+ 'data': {
+ 'features': [ 'VHostUserBackendGPUFeature' ]
+ }
+}
+
+##
+# @VHostUserBackendCapabilities:
+#
+# Capabilities reported by vhost user backends.
+#
+# @type: The vhost user backend type.
+#
+# Since: 3.2
+##
+{
+ 'union': 'VHostUserBackendCapabilities',
+ 'base': { 'type': 'VHostUserBackendType' },
+ 'discriminator': 'type',
+ 'data': {
+ 'input': 'VHostUserBackendCapabilitiesInput',
+ 'gpu': 'VHostUserBackendCapabilitiesGPU'
+ }
+}
+
+##
+# @VhostUserBackend:
+#
+# Describes a vhost user backend to management software.
+#
+# It is possible for multiple @VhostUserBackend elements to match the
+# search criteria of management software. Applications thus need rules
+# to pick one of the many matches, and users need the ability to
+# override distro defaults.
+#
+# It is recommended to create vhost user backend JSON files (each
+# containing a single @VhostUserBackend root element) with a
+# double-digit prefix, for example "50-qemu-gpu.json",
+# "50-crosvm-gpu.json", etc, so they can be sorted in predictable
+# order. The backend JSON files should be searched for in three
+# directories:
+#
+# - /usr/share/qemu/vhost-user -- populated by distro-provided
+# packages (XDG_DATA_DIRS covers
+# /usr/share by default),
+#
+# - /etc/qemu/vhost-user -- exclusively for sysadmins' local additions,
+#
+# - $XDG_CONFIG_HOME/qemu/vhost-user -- exclusively for per-user local
+# additions (XDG_CONFIG_HOME
+# defaults to $HOME/.config).
+#
+# Top-down, the list of directories goes from general to specific.
+#
+# Management software should build a list of files from all three
+# locations, then sort the list by filename (i.e., last pathname
+# component). Management software should choose the first JSON file on
+# the sorted list that matches the search criteria. If a more specific
+# directory has a file with same name as a less specific directory, then
+# the file in the more specific directory takes effect. If the more
+# specific file is zero length, it hides the less specific one.
+#
+# For example, if a distro ships
+#
+# - /usr/share/qemu/vhost-user/50-qemu-gpu.json
+#
+# - /usr/share/qemu/vhost-user/50-crosvm-gpu.json
+#
+# then the sysadmin can prevent the default QEMU being used at all with
+#
+# $ touch /etc/qemu/vhost-user/50-qemu-gpu.json
+#
+# The sysadmin can replace/alter the distro default QEMU GPU with
+#
+# $ vim /etc/qemu/vhost-user/50-qemu-gpu.json
+#
+# or they can provide a parallel QEMU GPU with higher priority
+#
+# $ vim /etc/qemu/vhost-user/10-qemu-gpu.json
+#
+# or they can provide a parallel QEMU GPU with lower priority
+#
+# $ vim /etc/qemu/vhost-user/99-qemu-gpu.json
+#
+# @type: The vhost user backend type.
+#
+# @description: Provides a human-readable description of the backend.
+# Management software may or may not display @description.
+#
+# @binary: Absolute path to the backend binary.
+#
+# @tags: An optional list of auxiliary strings associated with the
+# backend for which @description is not appropriate, due to the
+# latter's possible exposure to the end-user. @tags serves
+# development and debugging purposes only, and management
+# software shall explicitly ignore it.
+#
+# Since: 3.2
+#
+# Example:
+#
+# {
+# "description": "QEMU vhost-user-gpu",
+# "type": "gpu",
+# "binary": "/usr/libexec/qemu/vhost-user-gpu",
+# "tags": [
+# "CONFIG_OPENGL_DMABUF=y"
+# ]
+# }
+#
+##
+{
+ 'struct' : 'VhostUserBackend',
+ 'data' : {
+ 'description': 'str',
+ 'type': 'VHostUserBackendType',
+ 'binary': 'str',
+ '*tags': [ 'str' ]
+ }
+}
diff --git a/docs/interop/vhost-user.txt b/docs/interop/vhost-user.txt
index 5d5bdcb8cb..e3e765e2ac 100644
--- a/docs/interop/vhost-user.txt
+++ b/docs/interop/vhost-user.txt
@@ -17,8 +17,13 @@ The protocol defines 2 sides of the communication, master and slave. Master is
the application that shares its virtqueues, in our case QEMU. Slave is the
consumer of the virtqueues.
-In the current implementation QEMU is the Master, and the Slave is intended to
-be a software Ethernet switch running in user space, such as Snabbswitch.
+In the current implementation QEMU is the Master, and the Slave is the
+external process consuming the virtio queues, for example a software
+Ethernet switch running in user space, such as Snabbswitch, or a block
+device backend processing read & write to a virtual disk. In order to
+facilitate interoperability between various backend implementations,
+it is recommended to follow the "Backend program conventions"
+described in this document.
Master and slave can be either a client (i.e. connecting) or server (listening)
in the socket communication.
@@ -842,3 +847,95 @@ resilient for selective requests.
For the message types that already solicit a reply from the client, the
presence of VHOST_USER_PROTOCOL_F_REPLY_ACK or need_reply bit being set brings
no behavioural change. (See the 'Communication' section for details.)
+
+Backend program conventions
+---------------------------
+
+vhost-user backends can provide various devices & services and may
+need to be configured manually depending on the use case. However, it
+is a good idea to follow the conventions listed here when
+possible. Users, QEMU or libvirt, can then rely on some common
+behaviour to avoid heterogeneous configuration and management of the
+backend programs and facilitate interoperability.
+
+Each backend installed on a host system should come with at least one
+JSON file that conforms to the vhost-user.json schema. Each file
+informs the management applications about the backend type, and binary
+location. In addition, it defines rules for management apps for
+picking the highest priority backend when multiple match the search
+criteria (see @VhostUserBackend documentation in the schema file).
+
+If the backend is not capable of enabling a requested feature on the
+host (such as 3D acceleration with virgl), or the initialization
+failed, the backend should fail to start early and exit with a status
+!= 0. It may also print a message to stderr for further details.
+
+The backend program must not daemonize itself, but it may be
+daemonized by the management layer. It may also have a restricted
+access to the system.
+
+File descriptors 0, 1 and 2 will exist, and have regular
+stdin/stdout/stderr usage (they may have been redirected to /dev/null
+by the management layer, or to a log handler).
+
+The backend program must end (as quickly and cleanly as possible) when
+the SIGTERM signal is received. Eventually, it may be sent SIGKILL by the
+management layer after a few seconds.
+
+The following command line options have an expected behaviour. They
+are mandatory, unless explicitly said differently:
+
+* --socket-path=PATH
+
+This option specifies the location of the vhost-user Unix domain socket.
+It is incompatible with --fd.
+
+* --fd=FDNUM
+
+When this argument is given, the backend program is started with the
+vhost-user socket as file descriptor FDNUM. It is incompatible with
+--socket-path.
+
+* --print-capabilities
+
+Output to stdout the backend capabilities in JSON format, and then
+exit successfully. Other options and arguments should be ignored, and
+the backend program should not perform its normal function. The
+capabilities can be reported dynamically depending on the host
+capabilities.
+
+The JSON output is described in the vhost-user.json schema, by
+@VHostUserBackendCapabilities. Example:
+{
+ "type": "foo",
+ "features": [
+ "feature-a",
+ "feature-b"
+ ]
+}
+
+vhost-user-input
+----------------
+
+Command line options:
+
+* --evdev-path=PATH (optional)
+
+Specify the Linux input device.
+
+* --no-grab (optional)
+
+Do not request exclusive access to the input device.
+
+vhost-user-gpu
+--------------
+
+Command line options:
+
+* --render-node=PATH (optional)
+
+Specify the GPU DRM render node.
+
+* --virgl (optional)
+
+Enable virgl rendering support.
--
2.19.1.708.g4ede3d42df
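The backend lookup rules in the @VhostUserBackend documentation above (merge the directories, let the more specific directory win on equal filenames, treat a zero-length file as a mask, then take the first match in filename order) could be implemented by management software roughly as follows. This is a sketch under assumptions: `find_backend` and `demo` are hypothetical names, the caller passes the directories ordered from least to most specific, and "60-other-gpu.json" is an invented file used only for the demonstration.

```python
import json
import os
import tempfile


def find_backend(search_dirs, backend_type):
    """Return the first matching backend description, or None.

    search_dirs must be ordered from least to most specific, for example
    /usr/share/qemu/vhost-user, /etc/qemu/vhost-user,
    $XDG_CONFIG_HOME/qemu/vhost-user.
    """
    by_name = {}
    for directory in search_dirs:
        if not os.path.isdir(directory):
            continue
        for name in os.listdir(directory):
            if name.endswith(".json"):
                # A later (more specific) directory overrides the same name.
                by_name[name] = os.path.join(directory, name)

    for name in sorted(by_name):
        path = by_name[name]
        if os.path.getsize(path) == 0:
            continue  # zero-length file hides the less specific one
        with open(path, encoding="utf-8") as handle:
            backend = json.load(handle)
        if backend.get("type") == backend_type:
            return backend
    return None


def write_backend(directory, name, binary):
    """Write a minimal backend description (demo helper)."""
    with open(os.path.join(directory, name), "w", encoding="utf-8") as handle:
        json.dump({"description": name, "type": "gpu", "binary": binary}, handle)


def demo():
    """Show default selection, masking, and a higher-priority override."""
    with tempfile.TemporaryDirectory() as root:
        distro = os.path.join(root, "usr-share")
        etc = os.path.join(root, "etc")
        os.makedirs(distro)
        os.makedirs(etc)
        write_backend(distro, "50-qemu-gpu.json", "/usr/libexec/qemu/vhost-user-gpu")
        write_backend(distro, "60-other-gpu.json", "/usr/libexec/other-gpu")
        default = find_backend([distro, etc], "gpu")["binary"]

        # Equivalent of: touch /etc/qemu/vhost-user/50-qemu-gpu.json
        open(os.path.join(etc, "50-qemu-gpu.json"), "w").close()
        masked = find_backend([distro, etc], "gpu")["binary"]

        write_backend(etc, "10-qemu-gpu.json", "/usr/local/bin/vhost-user-gpu")
        preferred = find_backend([distro, etc], "gpu")["binary"]
    return default, masked, preferred


print(demo())
```

Keying the merged list by filename gives both the override and the masking behaviour with a single dictionary, and sorting the keys afterwards yields the priority order implied by the double-digit prefix.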
6 years, 1 month
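The command line conventions from the vhost-user patch above (--socket-path and --fd exclude each other, --print-capabilities prints JSON and exits successfully while ignoring other options) can be sketched for a hypothetical backend written in Python. The program name and the reported capabilities are illustrative assumptions, and so is the choice to require one of the two transport options at startup.

```python
import argparse
import json

# Hypothetical capabilities for an "input" backend; the field names follow
# VHostUserBackendCapabilitiesInput from the schema in the patch.
CAPABILITIES = {"type": "input", "features": ["evdev-path", "no-grab"]}


def parse_options(argv):
    """Parse the conventional options; --socket-path and --fd are exclusive."""
    parser = argparse.ArgumentParser(prog="vhost-user-example")
    transport = parser.add_mutually_exclusive_group(required=True)
    transport.add_argument("--socket-path", metavar="PATH")
    transport.add_argument("--fd", metavar="FDNUM", type=int)
    return parser.parse_args(argv)


def main(argv):
    """Return the process exit status for the given argument list."""
    # --print-capabilities takes precedence: other options are ignored and
    # the backend does not perform its normal function.
    if "--print-capabilities" in argv:
        print(json.dumps(CAPABILITIES, indent=2))
        return 0
    options = parse_options(argv)  # argparse exits with status 2 on bad usage
    # A real backend would now open options.socket_path or adopt options.fd.
    return 0


status = main(["--print-capabilities", "--fd", "3"])
```

Checking for --print-capabilities before normal parsing keeps the "other options should be ignored" rule intact even when the remaining arguments would otherwise be rejected.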
[libvirt] [PATCH v3 0/2] util, qemu: Fix virDoes*Exist usage
by Martin Kletzander
le blurb
Martin Kletzander (2):
qemu: Fix virDoes*Exist usage
util: Fix virDoes*Exist return type
src/qemu/qemu_conf.c | 4 ++--
src/util/virutil.c | 12 ++++++------
src/util/virutil.h | 4 ++--
3 files changed, 10 insertions(+), 10 deletions(-)
--
2.19.1
6 years, 1 month
[libvirt] [PATCH v2] util, qemu: Fix virDoes*Exist usage
by Martin Kletzander
Since the functions only return 0 or 1, they should return bool (missed the
change in the first commit). That way it's clearer that the check for a
non-existing user or group should be "== 0" rather than "!= 0". Fix this by
using proper negation instead.
Signed-off-by: Martin Kletzander <mkletzan(a)redhat.com>
---
src/qemu/qemu_conf.c | 4 ++--
src/util/virutil.c | 12 ++++++------
src/util/virutil.h | 4 ++--
3 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/src/qemu/qemu_conf.c b/src/qemu/qemu_conf.c
index 32da9a735184..a946b05d5d47 100644
--- a/src/qemu/qemu_conf.c
+++ b/src/qemu/qemu_conf.c
@@ -193,10 +193,10 @@ virQEMUDriverConfigPtr virQEMUDriverConfigNew(bool privileged)
if (virAsprintf(&cfg->swtpmStorageDir, "%s/lib/libvirt/swtpm",
LOCALSTATEDIR) < 0)
goto error;
- if (virDoesUserExist("tss") != 0 ||
+ if (!virDoesUserExist("tss") ||
virGetUserID("tss", &cfg->swtpm_user) < 0)
cfg->swtpm_user = 0; /* fall back to root */
- if (virDoesGroupExist("tss") != 0 ||
+ if (!virDoesGroupExist("tss") ||
virGetGroupID("tss", &cfg->swtpm_group) < 0)
cfg->swtpm_group = 0; /* fall back to root */
} else {
diff --git a/src/util/virutil.c b/src/util/virutil.c
index c0783ecb285b..77baaa33f472 100644
--- a/src/util/virutil.c
+++ b/src/util/virutil.c
@@ -1133,7 +1133,7 @@ virGetGroupID(const char *group, gid_t *gid)
/* Silently checks if User @name exists.
* Returns if the user exists and fallbacks to false on error.
*/
-int
+bool
virDoesUserExist(const char *name)
{
return virGetUserIDByName(name, NULL, true) == 0;
@@ -1142,7 +1142,7 @@ virDoesUserExist(const char *name)
/* Silently checks if Group @name exists.
* Returns if the group exists and fallbacks to false on error.
*/
-int
+bool
virDoesGroupExist(const char *name)
{
return virGetGroupIDByName(name, NULL, true) == 0;
@@ -1243,16 +1243,16 @@ virGetGroupList(uid_t uid ATTRIBUTE_UNUSED, gid_t gid ATTRIBUTE_UNUSED,
return 0;
}
-int
+bool
virDoesUserExist(const char *name ATTRIBUTE_UNUSED)
{
- return 0;
+ return false;
}
-int
+bool
virDoesGroupExist(const char *name ATTRIBUTE_UNUSED)
{
- return 0;
+ return false;
}
# ifdef WIN32
diff --git a/src/util/virutil.h b/src/util/virutil.h
index 2407f54efd47..e0ab0da0f2fc 100644
--- a/src/util/virutil.h
+++ b/src/util/virutil.h
@@ -152,8 +152,8 @@ int virGetUserID(const char *name,
int virGetGroupID(const char *name,
gid_t *gid) ATTRIBUTE_RETURN_CHECK;
-int virDoesUserExist(const char *name);
-int virDoesGroupExist(const char *name);
+bool virDoesUserExist(const char *name);
+bool virDoesGroupExist(const char *name);
bool virIsDevMapperDevice(const char *dev_name) ATTRIBUTE_NONNULL(1);
--
2.19.1
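The effect of the inverted check fixed in qemu_conf.c can be modelled outside C. In this Python sketch, `pick_swtpm_user` is a hypothetical stand-in for the fallback logic in virQEMUDriverConfigNew, and the failed ID lookup is modelled simply as "the user does not exist".

```python
def pick_swtpm_user(known_users, user_missing_check):
    """Return the user swtpm would run as under the given existence check."""
    exists = "tss" in known_users   # stands in for virDoesUserExist("tss")
    lookup_failed = not exists      # stands in for virGetUserID(...) < 0
    if user_missing_check(exists) or lookup_failed:
        return "root"               # fall back to root
    return "tss"


def old_check(exists):
    # Old code: virDoesUserExist("tss") != 0, which is true when the user EXISTS.
    return exists != 0


def new_check(exists):
    # Fixed code: !virDoesUserExist("tss"), which is true when the user is missing.
    return not exists


for users in (set(), {"tss"}):
    print(sorted(users),
          "old:", pick_swtpm_user(users, old_check),
          "new:", pick_swtpm_user(users, new_check))
```

With the old comparison the fallback to root is taken whether or not the "tss" user exists; with the negation it is taken only when the user is missing.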
6 years, 1 month
[libvirt] RFC: put domain's interfaces into distinct namespaces
by Nikolay Shirokovskiy
Hi, all!
There is a performance issue with network filters and broadcast Ethernet traffic.
If an L2 segment is large enough (several thousand VMs), there is a lot of
broadcast ARP traffic (about 100 frames/s). As a result, on a host with several hundred
VMs (say 300) we have a kernel thread eating 100% of CPU just to check this traffic
against firewall rules. The problem is that if there are rules in the ebtables POSTROUTING chain
(clean-traffic is an example of such a filter), every single broadcast frame turns into
300 frames, one for every distinct bridge port, and each of these 300 is checked against
300 / 2 rules on average to find the chain for that port. As a result we have 100 * 300 * 300 / 2
= 4.5 * 10^6 rule checks per second. The kernel does not spread this workload across
different CPUs, and in any case this is a waste of CPU time!
The simple solution is to put rules that ACCEPT ARP traffic into POSTROUTING
itself, before any port-specific chains. But this would affect non-VM ports
and the host itself too. So can we instead make a distinct network namespace for every
VM and put the tap device there, then add a bridge in the namespace as well so that we can
apply ebtables rules there, and insert the tap into that bridge? Finally, connect the bridges
in the root namespace and the VM namespace with a veth pair. As a result, in the situation
described above each cloned frame will be checked only against the rules for this
very VM. Regular TCP traffic gets the same benefit. On the other hand, we
need a bridge and a veth pair for every VM, and some CPU power to process this extra
traffic path.
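The estimate above can be reproduced with a small calculation. The shared-chain numbers come from this message; the per-VM rule count used for the namespace case is a purely hypothetical value, since the message does not state how many rules a single VM has.

```python
def rule_checks_per_second(frames_per_second, ports, rules_scanned_per_clone):
    """Each broadcast frame is cloned once per port; each clone scans rules."""
    return frames_per_second * ports * rules_scanned_per_clone


FRAMES_PER_SECOND = 100   # broadcast ARP rate from the message
VMS_ON_HOST = 300         # bridge ports, one per VM

# Shared POSTROUTING chain: on average half of the 300 per-port jump rules
# are scanned before the chain for the right port is found.
shared = rule_checks_per_second(FRAMES_PER_SECOND, VMS_ON_HOST, VMS_ON_HOST // 2)

# Per-VM namespace: a clone only meets the rules of its own VM.
RULES_PER_VM = 10         # hypothetical value, not taken from the message
namespaced = rule_checks_per_second(FRAMES_PER_SECOND, VMS_ON_HOST, RULES_PER_VM)

print(f"shared chain:   {shared} checks/s")
print(f"per-VM netns:   {namespaced} checks/s")
```

The shared-chain cost grows with the square of the number of VMs on the host, while the per-namespace cost grows only linearly, which is the motivation for the proposal.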
The proposed approach also fixes the problem of slow libvirtd restarts with
network filters ([1], [2]), because it is rather difficult to disturb network rules
that live in a different network namespace; at least restarting/reloading firewalld
won't hurt such rules, so we don't need to reinstantiate the rules at all.
[1] [RFC] Faster libvirtd restart with nwfilter rules
https://www.redhat.com/archives/libvir-list/2018-September/msg01206.html
which continues in
https://www.redhat.com/archives/libvir-list/2018-October/msg00657.html
[2] [PATCH v2 0/2] nwfilter: don't reinstantiate rules if there is no need to
https://www.redhat.com/archives/libvir-list/2018-October/msg01317.html
Nikolay
6 years, 1 month
[libvirt] [PATCH] util, qemu: Fix virDoes*Exist usage
by Martin Kletzander
Since the functions only return 0 or 1, they should return bool (missed the
change in the first commit). That way it's clearer that the check for a
non-existing user or group should be "== 0" rather than "!= 0". Fix this by
using proper negation instead.
Signed-off-by: Martin Kletzander <mkletzan(a)redhat.com>
---
src/qemu/qemu_conf.c | 4 ++--
src/util/virutil.c | 8 ++++----
src/util/virutil.h | 4 ++--
3 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/src/qemu/qemu_conf.c b/src/qemu/qemu_conf.c
index 32da9a735184..a946b05d5d47 100644
--- a/src/qemu/qemu_conf.c
+++ b/src/qemu/qemu_conf.c
@@ -193,10 +193,10 @@ virQEMUDriverConfigPtr virQEMUDriverConfigNew(bool privileged)
if (virAsprintf(&cfg->swtpmStorageDir, "%s/lib/libvirt/swtpm",
LOCALSTATEDIR) < 0)
goto error;
- if (virDoesUserExist("tss") != 0 ||
+ if (!virDoesUserExist("tss") ||
virGetUserID("tss", &cfg->swtpm_user) < 0)
cfg->swtpm_user = 0; /* fall back to root */
- if (virDoesGroupExist("tss") != 0 ||
+ if (!virDoesGroupExist("tss") ||
virGetGroupID("tss", &cfg->swtpm_group) < 0)
cfg->swtpm_group = 0; /* fall back to root */
} else {
diff --git a/src/util/virutil.c b/src/util/virutil.c
index c0783ecb285b..1407c026e298 100644
--- a/src/util/virutil.c
+++ b/src/util/virutil.c
@@ -1133,7 +1133,7 @@ virGetGroupID(const char *group, gid_t *gid)
/* Silently checks if User @name exists.
* Returns if the user exists and fallbacks to false on error.
*/
-int
+bool
virDoesUserExist(const char *name)
{
return virGetUserIDByName(name, NULL, true) == 0;
@@ -1142,7 +1142,7 @@ virDoesUserExist(const char *name)
/* Silently checks if Group @name exists.
* Returns if the group exists and fallbacks to false on error.
*/
-int
+bool
virDoesGroupExist(const char *name)
{
return virGetGroupIDByName(name, NULL, true) == 0;
@@ -1243,13 +1243,13 @@ virGetGroupList(uid_t uid ATTRIBUTE_UNUSED, gid_t gid ATTRIBUTE_UNUSED,
return 0;
}
-int
+bool
virDoesUserExist(const char *name ATTRIBUTE_UNUSED)
{
return 0;
}
-int
+bool
virDoesGroupExist(const char *name ATTRIBUTE_UNUSED)
{
return 0;
diff --git a/src/util/virutil.h b/src/util/virutil.h
index 2407f54efd47..e0ab0da0f2fc 100644
--- a/src/util/virutil.h
+++ b/src/util/virutil.h
@@ -152,8 +152,8 @@ int virGetUserID(const char *name,
int virGetGroupID(const char *name,
gid_t *gid) ATTRIBUTE_RETURN_CHECK;
-int virDoesUserExist(const char *name);
-int virDoesGroupExist(const char *name);
+bool virDoesUserExist(const char *name);
+bool virDoesGroupExist(const char *name);
bool virIsDevMapperDevice(const char *dev_name) ATTRIBUTE_NONNULL(1);
--
2.19.1
6 years, 1 month
[libvirt] [PATCH v8 00/14] PCI passthrough support on s390
by Yi Min Zhao
Abstract
========
The PCI representation in QEMU has been extended for S390
allowing configuration of zPCI attributes like uid (user-defined
identifier) and fid (PCI function identifier).
The details can be found here:
https://lists.gnu.org/archive/html/qemu-devel/2016-06/msg07262.html
To support the new zPCI feature of the S390 platform, a new element of
PCI address is introduced. It has two optional attributes, @uid and
@fid. For example:
  <hostdev mode='subsystem' type='pci'>
    <driver name='vfio'/>
    <source>
      <address domain='0x0001' bus='0x00' slot='0x00' function='0x0'/>
    </source>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'>
      <zpci uid='0x0003' fid='0x00000027'/>
    </address>
  </hostdev>
If they are defined by the user, unique values within the guest domain
must be used. If they are not specified and the architecture requires
them, they are automatically generated with non-conflicting values.
The zPCI address, as an extension of the PCI address, is stored in a new
structure 'virZPCIDeviceAddress', which is a member of the common PCI
address structure. Additionally, two hashtables are used for assignment
and reservation of zPCI uid/fid.
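The reservation and automatic generation of uid/fid values can be illustrated with a small allocator. This Python sketch is not the libvirt implementation: `ZpciIdAllocator` is a hypothetical class, Python sets stand in for the two hashtables, and the value ranges are an assumption inferred from the widths of the example values ('0x0003' and '0x00000027').

```python
UID_MIN, UID_MAX = 0x0001, 0xFFFF              # assumed 16-bit uid range
FID_MIN, FID_MAX = 0x00000000, 0xFFFFFFFF      # assumed 32-bit fid range


class ZpciIdAllocator:
    """Track used uid/fid values for one guest domain."""

    def __init__(self):
        self.uids = set()   # stands in for the uid hashtable
        self.fids = set()   # stands in for the fid hashtable

    def reserve(self, uid, fid):
        """Reserve user-defined values; both must be unique in the domain."""
        if not UID_MIN <= uid <= UID_MAX or not FID_MIN <= fid <= FID_MAX:
            raise ValueError("uid or fid out of range")
        if uid in self.uids:
            raise ValueError(f"uid {uid:#06x} is already in use")
        if fid in self.fids:
            raise ValueError(f"fid {fid:#010x} is already in use")
        self.uids.add(uid)
        self.fids.add(fid)

    def allocate(self):
        """Generate a non-conflicting (uid, fid) pair and reserve it."""
        uid = self._first_free(self.uids, UID_MIN, UID_MAX)
        fid = self._first_free(self.fids, FID_MIN, FID_MAX)
        self.reserve(uid, fid)
        return uid, fid

    @staticmethod
    def _first_free(used, low, high):
        """Return the lowest value in [low, high] that is not in used."""
        candidate = low
        while candidate in used:
            candidate += 1
        if candidate > high:
            raise RuntimeError("no free value left")
        return candidate


allocator = ZpciIdAllocator()
allocator.reserve(0x0003, 0x00000027)            # values from the XML example
generated = [allocator.allocate() for _ in range(3)]
print([(f"{uid:#06x}", f"{fid:#010x}") for uid, fid in generated])
```

User-defined values are reserved first, so the automatically generated pairs skip over them; a duplicate user-defined uid or fid is rejected instead of being silently reassigned.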
To support extending the PCI address, a new PCI address extension flag is
introduced. This extension flag is not dedicated to the S390
platform only; other architectures needing certain extensions to the PCI
address space can use it as well.
Code Base
=========
commit in master:
4f1107614d docs: Enhance polkit documentation to describe secondary connection
Change Log
==========
v7->v8:
1. Rebase the code to the newest code in master branch.
2. Update the words regarding version number in docs to 4.10.0.
3. Move the code introducing zpci member from patch 1 to patch 3.
v6->v7:
1. Optimize some functions' names and code logic.
2. Fixup build error.
3. Add negative test case for patch 9.
4. Use virXMLFormatElement() in virDomainDeviceInfoFormat().
v5->v6:
1. Modify zPCI XML definition.
2. Optimize the logic of zPCI address assignment and reservation.
3. Add extension flag into PCI address structure.
4. Update commit messages.
v4->v5:
1. Update the version number.
2. Fixup code style error.
3. Separate qemu code into single patch.
4. Rebase the patches to the new code of master branch.
v3->v4:
1. Update docs.
2. Format code style.
3. Optimize zPCI support check.
4. Move the check for zPCI defined in XML but unsupported by QEMU to
qemuDomainDeviceDefValidate().
5. Change zpci address member of PCI address struct from pointer to
instance.
6. Modify zpci address definition principle. Currently the user must
define either both uid and fid or neither.
v2->v3:
1. Revise code style.
2. Update test cases.
3. Introduce qemuDomainCollectPCIAddressExtension() to collect PCI
extension addresses.
4. Introduce virDeviceInfoPCIAddressExtensionPresent() to check if zPCI
address exists.
5. Optimize zPCI address check logic.
6. Optimize passed parameters of zPCI addr alloc/release/reserve functions.
7. Report enum range error in qemuDomainDeviceSupportZPCI().
8. Update commit messages.
v1->v2:
1. Separate the test commit and merge test cases into the corresponding commits that
first introduce the functionality.
2. Spare some checks for zpci device.
3. Add vsock and controller support.
4. Add uint32 type schema.
5. Rename zpciuid and zpcifid to zpci_uid and zpci_fid.
6. Always return multibus support on S390.
Yi Min Zhao (14):
conf: Add definitions for 'uid' and 'fid' PCI address attributes
qemu: Introduce zPCI capability
conf: Introduce extension flag and zPCI member for PCI address
qemu: Enable PCI multi bus for S390 guests
qemu: Auto add pci-root for s390/s390x guests
conf: Introduce address caching for PCI extensions
conf: use virXMLFormatElement() in virDomainDeviceInfoFormat()
conf: Introduce parser, formatter for uid and fid
qemu: Add zPCI address definition check
conf: Allocate/release 'uid' and 'fid' in PCI address
qemu: Generate and use zPCI device in QEMU command line
qemu: Add hotpluging support for PCI devices on S390 guests
docs: Add 'uid' and 'fid' information
news: Update news for PCI address extension attributes
cfg.mk | 1 +
docs/formatdomain.html.in | 10 +-
docs/news.xml | 11 +
docs/schemas/basictypes.rng | 27 ++
docs/schemas/domaincommon.rng | 1 +
src/bhyve/bhyve_device.c | 3 +-
src/conf/device_conf.c | 69 ++++
src/conf/device_conf.h | 7 +
src/conf/domain_addr.c | 340 +++++++++++++++++-
src/conf/domain_addr.h | 27 +-
src/conf/domain_conf.c | 50 ++-
src/libvirt_private.syms | 7 +
src/qemu/qemu_capabilities.c | 6 +
src/qemu/qemu_capabilities.h | 1 +
src/qemu/qemu_command.c | 104 ++++++
src/qemu/qemu_command.h | 2 +
src/qemu/qemu_domain.c | 37 ++
src/qemu/qemu_domain_address.c | 205 ++++++++++-
src/qemu/qemu_hotplug.c | 160 ++++++++-
src/util/virpci.c | 26 ++
src/util/virpci.h | 15 +
.../caps_2.10.0.s390x.xml | 1 +
.../caps_2.11.0.s390x.xml | 1 +
.../caps_2.12.0.s390x.xml | 1 +
.../qemucapabilitiesdata/caps_2.7.0.s390x.xml | 1 +
.../qemucapabilitiesdata/caps_2.8.0.s390x.xml | 1 +
.../qemucapabilitiesdata/caps_2.9.0.s390x.xml | 1 +
.../qemucapabilitiesdata/caps_3.0.0.s390x.xml | 1 +
.../disk-virtio-s390-zpci.args | 26 ++
.../disk-virtio-s390-zpci.xml | 19 +
.../hostdev-vfio-zpci-autogenerate.args | 25 ++
.../hostdev-vfio-zpci-autogenerate.xml | 18 +
.../hostdev-vfio-zpci-boundaries.args | 29 ++
.../hostdev-vfio-zpci-boundaries.xml | 30 ++
.../hostdev-vfio-zpci-multidomain-many.args | 39 ++
.../hostdev-vfio-zpci-multidomain-many.xml | 79 ++++
.../hostdev-vfio-zpci-wrong-arch.xml | 34 ++
tests/qemuxml2argvdata/hostdev-vfio-zpci.args | 25 ++
tests/qemuxml2argvdata/hostdev-vfio-zpci.xml | 21 ++
tests/qemuxml2argvtest.c | 22 ++
.../disk-virtio-s390-zpci.xml | 31 ++
.../hostdev-vfio-zpci-autogenerate.xml | 34 ++
.../hostdev-vfio-zpci-boundaries.xml | 48 +++
.../hostdev-vfio-zpci-multidomain-many.xml | 97 +++++
.../qemuxml2xmloutdata/hostdev-vfio-zpci.xml | 32 ++
tests/qemuxml2xmltest.c | 17 +
46 files changed, 1707 insertions(+), 35 deletions(-)
create mode 100644 tests/qemuxml2argvdata/disk-virtio-s390-zpci.args
create mode 100644 tests/qemuxml2argvdata/disk-virtio-s390-zpci.xml
create mode 100644 tests/qemuxml2argvdata/hostdev-vfio-zpci-autogenerate.args
create mode 100644 tests/qemuxml2argvdata/hostdev-vfio-zpci-autogenerate.xml
create mode 100644 tests/qemuxml2argvdata/hostdev-vfio-zpci-boundaries.args
create mode 100644 tests/qemuxml2argvdata/hostdev-vfio-zpci-boundaries.xml
create mode 100644 tests/qemuxml2argvdata/hostdev-vfio-zpci-multidomain-many.args
create mode 100644 tests/qemuxml2argvdata/hostdev-vfio-zpci-multidomain-many.xml
create mode 100644 tests/qemuxml2argvdata/hostdev-vfio-zpci-wrong-arch.xml
create mode 100644 tests/qemuxml2argvdata/hostdev-vfio-zpci.args
create mode 100644 tests/qemuxml2argvdata/hostdev-vfio-zpci.xml
create mode 100644 tests/qemuxml2xmloutdata/disk-virtio-s390-zpci.xml
create mode 100644 tests/qemuxml2xmloutdata/hostdev-vfio-zpci-autogenerate.xml
create mode 100644 tests/qemuxml2xmloutdata/hostdev-vfio-zpci-boundaries.xml
create mode 100644 tests/qemuxml2xmloutdata/hostdev-vfio-zpci-multidomain-many.xml
create mode 100644 tests/qemuxml2xmloutdata/hostdev-vfio-zpci.xml
--
Yi Min
6 years, 1 month
[libvirt] [PATCH] docs: remove redundant words and blank lines
by luzhipeng@uniudc.com
From: ZhiPeng Lu <luzhipeng(a)uniudc.com>
Signed-off-by: ZhiPeng Lu <luzhipeng(a)uniudc.com>
---
docs/formatdomain.html.in | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
index 8a23b78..de25ad3 100644
--- a/docs/formatdomain.html.in
+++ b/docs/formatdomain.html.in
@@ -2835,7 +2835,6 @@
</source>
<target dev='sdb' bus='scsi'/>
</disk>
- </disk>
<disk type='network' device='lun'>
<driver name='qemu' type='raw'/>
<source protocol='iscsi' name='iqn.2013-07.com.example:iscsi-nopool/0'>
@@ -5216,7 +5215,6 @@
<virtualport>
<parameters instanceid='09b11c53-8b5c-4eeb-8f00-d84eaa0aaa4f'/>
</virtualport>
-
</interface>
</devices>
...</pre>
@@ -5709,8 +5707,7 @@ qemu-kvm -net nic,model=? /dev/null
<host csum='off' gso='off' tso4='off' tso6='off' ecn='off' ufo='off' mrg_rxbuf='off'/>
<guest csum='off' tso4='off' tso6='off' ecn='off' ufo='off'/>
</driver>
- </b>
- </interface>
+ </b></interface>
</devices>
...</pre>
@@ -6212,7 +6209,7 @@ qemu-kvm -net nic,model=? /dev/null
<b><route family='ipv4' address='192.168.122.0' prefix='24' gateway='192.168.122.1'/></b>
<b><route family='ipv4' address='192.168.122.8' gateway='192.168.122.1'/></b>
</hostdev>
-
+ ...
</devices>
...
</pre>
--
1.8.3.1
6 years, 1 month
[libvirt] [PATCH v4 0/8] Virtio-crypto device support
by Longpeng(Mike)
virtio-crypto has been supported since QEMU 2.8 and the frontend
driver was merged in Linux 4.10, so it's necessary to support
virtio-crypto in libvirt.
---
Changes since v3:
- split the capabilities part into a separate patch. [Boris]
- include Boris's virtio-crypto ccw support(PATCH 6 & 8). [Boris]
- add the missing capabilities in caps_2.9.0.x86_64.xml. [Boris]
- fix indentation and missing virDomainCryptoDefFree. [Marc]
Changes since v2:
- PATCH 1: modify docs as suggested by Martin & Boris. [Martin & Boris]
- PATCH 2: add the missing 'ToString'. [Martin]
- PATCH 3: use virAsprintf instead of virBufferAsprintf. [Martin]
remove pointless virBufferCheckError. [Martin]
- rebase on master. [Longpeng]
Changes since v1:
- split patch [Martin]
- rebase on master [Martin]
- add docs/tests/schema [Martin]
- fix typos [Gonglei]
---
Boris Fiuczynski (2):
qemu: virtio-crypto: add ccw support
qemu: virtio-crypto: add test for ccw support
Longpeng(Mike) (6):
docs: schema: Add basic documentation for the virtual
docs: news: Add virtio-crypto devices
conf: Parse virtio-crypto in the domain XML
caps: Add qemu capabilities about virtio-crypto
qemu: Implement support for 'builtin' backend for virtio-crypto
tests: Add testcase for virtio-crypto parsing
docs/formatdomain.html.in | 61 ++++++
docs/news.xml | 10 +
docs/schemas/domaincommon.rng | 30 +++
src/conf/domain_conf.c | 213 ++++++++++++++++++++-
src/conf/domain_conf.h | 32 ++++
src/libvirt_private.syms | 5 +
src/qemu/qemu_alias.c | 20 ++
src/qemu/qemu_alias.h | 3 +
src/qemu/qemu_capabilities.c | 6 +
src/qemu/qemu_capabilities.h | 4 +
src/qemu/qemu_command.c | 130 +++++++++++++
src/qemu/qemu_command.h | 3 +
src/qemu/qemu_domain_address.c | 25 +++
src/qemu/qemu_driver.c | 6 +
src/qemu/qemu_hotplug.c | 1 +
tests/qemucapabilitiesdata/caps_2.8.0.s390x.xml | 2 +
tests/qemucapabilitiesdata/caps_2.8.0.x86_64.xml | 2 +
tests/qemucapabilitiesdata/caps_2.9.0.x86_64.xml | 2 +
.../qemuxml2argv-virtio-crypto-builtin.xml | 26 +++
.../qemuxml2argv-virtio-crypto-ccw.args | 22 +++
.../qemuxml2argv-virtio-crypto-ccw.xml | 16 ++
.../qemuxml2argv-virtio-crypto.args | 22 +++
tests/qemuxml2argvtest.c | 6 +
.../qemuxml2xmlout-virtio-crypto-builtin.xml | 31 +++
tests/qemuxml2xmltest.c | 2 +
25 files changed, 679 insertions(+), 1 deletion(-)
create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-virtio-crypto-builtin.xml
create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-virtio-crypto-ccw.args
create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-virtio-crypto-ccw.xml
create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-virtio-crypto.args
create mode 100644 tests/qemuxml2xmloutdata/qemuxml2xmlout-virtio-crypto-builtin.xml
--
1.8.3.1
6 years, 1 month
[libvirt] [RFC/WIP] [PATCH 0/5] Add support for revert and delete operations to external disk snapshots
by Povilas Kanapickas
Hey all,
Currently libvirt only supports creation of external disk snapshots, but not
reversion and deletion, which are essential for any serious use of this feature.
I've looked into implementing removal and reversion of external disk snapshots
and came up with some prototype code that works with my simple test VMs (see
attached patches).
I'd like to discuss how these features could be implemented properly. As
I haven't contributed significantly to libvirt yet, I wanted to delay the
discussion until I understood the problem space myself so that the discussion
could be productive.
My current approach is relatively simple. For snapshot deletion we either
simply remove the disk or use `qemu-img rebase` to reparent a snapshot on top
of the parent of the snapshot that is being deleted. For reversion we delete
the current overlay disk and create another that uses the image of the
snapshot we want to revert to as the backing disk.
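As a rough illustration of the two operations described above (not the code
from the attached patches), the following sketch builds the `qemu-img`
command lines involved. The function names, file names and the qcow2 default
are assumptions made for the example; only the `qemu-img rebase` and
`qemu-img create` subcommands and their `-b`, `-F` and `-f` options are
taken from qemu-img itself.

```python
from typing import List


def rebase_child_cmd(child: str, new_backing: str,
                     backing_fmt: str = "qcow2") -> List[str]:
    """Deletion case: reparent `child` onto the parent of the snapshot
    being deleted. Without -u, qemu-img runs in safe mode and first
    copies into `child` any data that differs between the old and the
    new backing file, so the deleted snapshot's image is no longer needed."""
    return ["qemu-img", "rebase", "-b", new_backing, "-F", backing_fmt, child]


def revert_overlay_cmd(snapshot_image: str, new_overlay: str,
                       backing_fmt: str = "qcow2") -> List[str]:
    """Reversion case: create a fresh, empty qcow2 overlay whose backing
    file is the image of the snapshot being reverted to. The previous
    overlay is assumed to have been deleted by the caller."""
    return ["qemu-img", "create", "-f", "qcow2",
            "-b", snapshot_image, "-F", backing_fmt, new_overlay]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(rebase_child_cmd("snap2.qcow2", "base.qcow2")))
    print(" ".join(revert_overlay_cmd("snap1.qcow2", "current.qcow2")))
```

The commands are only built and printed here, not executed; a real
implementation would also have to update the snapshot metadata and handle a
running domain, which this sketch does not attempt.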
Are the attached patches good in principle? Are there any major blockers aside
from lack of tests, code formatting, bugs and so on? Are there any design
issues which prevent a simple implementation of external disk snapshot
support that I didn't see?
If there aren't significant blockers, my plan would be to continue work on the
feature until I have something that could actually be reviewed and possibly
merged.
Regards,
Povilas
Povilas Kanapickas (5):
snapshot: Implement reverting for external disk snapshots
snapshot: Add VIR_DEBUG to qemuDomainSnapshotCreateXML()
snapshot: Support deleting external disk snapshots when deleting
snapshot: Extract qemuDomainSnapshotReparentChildrenMetadata()
snapshot: Support reparenting external disk snapshots when deleting
src/qemu/qemu_domain.c | 45 ++++-
src/qemu/qemu_driver.c | 372 +++++++++++++++++++++++++++++++++++++----
2 files changed, 379 insertions(+), 38 deletions(-)
--
2.17.1
6 years, 1 month