[libvirt] Supporting vhost-net and macvtap in libvirt for QEMU
by Anthony Liguori
Disclaimer: I am neither an SR-IOV nor a vhost-net expert, but I've CC'd
people that are who can throw tomatoes at me for getting bits wrong :-)
I wanted to start a discussion about supporting vhost-net in libvirt.
vhost-net has not yet been merged into qemu but I expect it will be soon
so it's a good time to start this discussion.
There are two modes worth supporting for vhost-net in libvirt. The
first mode is where vhost-net backs to a tun/tap device. This is
behaves in very much the same way that -net tap behaves in qemu today.
Basically, the difference is that the virtio backend is in the kernel
instead of in qemu so there should be some performance improvement.
Current, libvirt invokes qemu with -net tap,fd=X where X is an already
open fd to a tun/tap device. I suspect that after we merge vhost-net,
libvirt could support vhost-net in this mode by just doing -net
vhost,fd=X. I think the only real question for libvirt is whether to
provide a user visible switch to use vhost or to just always use vhost
when it's available and it makes sense. Personally, I think the later
makes sense.
The more interesting invocation of vhost-net though is one where the
vhost-net device backs directly to a physical network card. In this
mode, vhost should get considerably better performance than the current
implementation. I don't know the syntax yet, but I think it's
reasonable to assume that it will look something like -net
tap,dev=eth0. The effect will be that eth0 is dedicated to the guest.
On most modern systems, there is a small number of network devices so
this model is not all that useful except when dealing with SR-IOV
adapters. In that case, each physical device can be exposed as many
virtual devices (VFs). There are a few restrictions here though. The
biggest is that currently, you can only change the number of VFs by
reloading a kernel module so it's really a parameter that must be set at
startup time.
I think there are a few ways libvirt could support vhost-net in this
second mode. The simplest would be to introduce a new tag similar to
<source network='br0'>. In fact, if you probed the device type for the
network parameter, you could probably do something like <source
network='eth0'> and have it Just Work.
Another model would be to have libvirt see an SR-IOV adapter as a
network pool whereas it handled all of the VF management. Considering
how inflexible SR-IOV is today, I'm not sure whether this is the best model.
Has anyone put any more thought into this problem or how this should be
modeled in libvirt? Michael, could you share your current thinking for
-net syntax?
--
Regards,
Anthony Liguori
1 year, 2 months
[libvirt] Libvirt multi queue support
by Naor Shlomo
Hello experts,
Could anyone please tell me if Multi Queue it fully supported in Libvirt and if so what version contains it?
Thanks,
Naor
8 years, 6 months
[libvirt] [PATCH/RFC] Add missing delta from Ubuntu to apparmor profiles
by Stefan Bader
This had been on the Debian package list before but its time to take
this onwards. So the goal would be to have one set to rule them all
(when using apparmor) and drop the seperate set of definitions which
exist at least in the Ubuntu packaging.
Right now the patch would be at a state which adds all missing files
and rules to the current examples in libvirt and installs them when
using --with-apparmor-profiles.
One problem seems to be that some of the definitions might cause
parse failures on certain versions of apparmor. I checked this morning
and this looks a bit hairy. So some apparmor 2.8 versions potentially
have issues, but not all apparmor 2.8 are the same (gah).
I could imagine (but John, we really could use some guidance here ;))
that at least some changes could be related to version 2.8.95~2430:
+ debian/patches/mediate-signals.patch,
debian/patches/change-signal-syntax.patch: Parse signal rules with
apparmor_parser. See the apparmor.d(5) man page for syntax details.
+ debian/patches/change-ptrace-syntax.patch,
debian/patches/mediate-ptrace.patch: Parse ptrace rules with
apparmor_parser. See the apparmor.d(5) man page for syntax details.
But, regardless of the when, the apparmor rules maybe need a way to handle
versioned features of the parser. One proposal was to comment out problematic
rules and allow the packager to re-enable things. Maybe going one step
further and have some pre-processing that handles version based sections
(like #if (APPARMOR_VERSION >= xxx)).
So that is where we stand. Ideas are very welcome.
-Stefan
---
>From aec5cf8cc30c80492a37856626264c3d4c27a31f Mon Sep 17 00:00:00 2001
From: Stefan Bader <stefan.bader(a)canonical.com>
Date: Thu, 18 Sep 2014 14:15:17 +0200
Subject: [PATCH] Add missing delta from Ubuntu to apparmor profiles
This fixes up the upstream profiles and would allow to drop apparmor
related delta from the Ubuntu package.
Thanks to Serge Hallyn for the Makefile.am install hook that allows
to rename the local file.
Signed-off-by: Stefan Bader <stefan.bader(a)canonical.com>
---
examples/apparmor/Makefile.am | 10 ++++++++
examples/apparmor/libvirt-lxc | 15 +++++++++++-
examples/apparmor/libvirt-qemu | 31 +++++++++++++++++++++++-
examples/apparmor/local-usr.sbin.libvirtd | 2 ++
examples/apparmor/usr.lib.libvirt.virt-aa-helper | 25 ++++++++++++++++---
examples/apparmor/usr.sbin.libvirtd | 17 ++++++++++++-
6 files changed, 94 insertions(+), 6 deletions(-)
create mode 100644 examples/apparmor/local-usr.sbin.libvirtd
diff --git a/examples/apparmor/Makefile.am b/examples/apparmor/Makefile.am
index 7a20e16..aa46cb9 100644
--- a/examples/apparmor/Makefile.am
+++ b/examples/apparmor/Makefile.am
@@ -20,6 +20,7 @@ EXTRA_DIST= \
libvirt-qemu \
libvirt-lxc \
usr.lib.libvirt.virt-aa-helper \
+ local-usr.sbin.libvirtd \
usr.sbin.libvirtd
if WITH_APPARMOR_PROFILES
@@ -29,6 +30,15 @@ apparmor_DATA = \
usr.sbin.libvirtd \
$(NULL)
+localdir = $(apparmordir)/local
+local_DATA = \
+ local-usr.sbin.libvirtd \
+ $(NULL)
+
+install-data-hook:
+ mv $(DESTDIR)$(localdir)/local-usr.sbin.libvirtd \
+ $(DESTDIR)$(localdir)/usr.sbin.libvirtd
+
abstractionsdir = $(apparmordir)/abstractions
abstractions_DATA = \
libvirt-qemu \
diff --git a/examples/apparmor/libvirt-lxc b/examples/apparmor/libvirt-lxc
index 4bfb503..4705e0a 100644
--- a/examples/apparmor/libvirt-lxc
+++ b/examples/apparmor/libvirt-lxc
@@ -1,12 +1,18 @@
-# Last Modified: Fri Feb 7 13:01:36 2014
+# Last Modified: Thu, 18 Sep 2014 13:56:49 +0200
#include <abstractions/base>
umount,
+ dbus,
+ signal,
+ ptrace,
# ignore DENIED message on / remount
deny mount options=(ro, remount) -> /,
+ # support use of cgmanager proxy
+ mount options=(move) /sys/fs/cgroup/cgmanager/ -> /sys/fs/cgroup/cgmanager.lower/,
+
# allow tmpfs mounts everywhere
mount fstype=tmpfs,
@@ -33,8 +39,15 @@
mount fstype=fusectl -> /sys/fs/fuse/connections/,
mount fstype=securityfs -> /sys/kernel/security/,
mount fstype=debugfs -> /sys/kernel/debug/,
+ deny mount fstype=debugfs -> /var/lib/ureadahead/debugfs/,
mount fstype=proc -> /proc/,
mount fstype=sysfs -> /sys/,
+
+ mount options=(rw nosuid nodev noexec remount) -> /sys/,
+ mount options=(rw remount) -> /sys/kernel/security/,
+ mount options=(rw remount) -> /sys/fs/pstore/,
+ mount options=(ro remount) -> /sys/fs/pstore/,
+
deny /sys/firmware/efi/efivars/** rwklx,
deny /sys/kernel/security/** rwklx,
diff --git a/examples/apparmor/libvirt-qemu b/examples/apparmor/libvirt-qemu
index c6de6dd..b69e64c 100644
--- a/examples/apparmor/libvirt-qemu
+++ b/examples/apparmor/libvirt-qemu
@@ -1,4 +1,4 @@
-# Last Modified: Wed Sep 3 21:52:03 2014
+# Last Modified: Thu, 18 Sep 2014 16:41:21 +0200
#include <abstractions/base>
#include <abstractions/consoles>
@@ -13,15 +13,22 @@
capability setgid,
capability setuid,
+ # this is needed with libcap-ng support, however it breaks a lot of things
+ # atm, so just silence the denial until libcap-ng works right. LP: #522845
+ deny capability setpcap,
+
network inet stream,
network inet6 stream,
/dev/net/tun rw,
+ /dev/tap* rw,
/dev/kvm rw,
/dev/ptmx rw,
/dev/kqemu rw,
@{PROC}/*/status r,
@{PROC}/sys/kernel/cap_last_cap r,
+ owner @{PROC}/*/auxv r,
+ @{PROC}/sys/vm/overcommit_memory r,
# For hostdev access. The actual devices will be added dynamically
/sys/bus/usb/devices/ r,
@@ -38,6 +45,9 @@
/dev/snd/* rw,
capability ipc_lock,
# spice
+ /usr/bin/qemu-system-i386-spice rmix,
+ /usr/bin/qemu-system-x86_64-spice rmix,
+ /{dev,run}/shm/ r,
owner /{dev,run}/shm/spice.* rw,
# 'kill' is not required for sound and is a security risk. Do not enable
# unless you absolutely need it.
@@ -73,6 +83,7 @@
# the various binaries
/usr/bin/kvm rmix,
/usr/bin/qemu rmix,
+ /usr/bin/qemu-system-aarch64 rmix,
/usr/bin/qemu-system-arm rmix,
/usr/bin/qemu-system-cris rmix,
/usr/bin/qemu-system-i386 rmix,
@@ -91,6 +102,7 @@
/usr/bin/qemu-system-sparc rmix,
/usr/bin/qemu-system-sparc64 rmix,
/usr/bin/qemu-system-x86_64 rmix,
+ /usr/bin/qemu-system-x86_64-spice rmix,
/usr/bin/qemu-alpha rmix,
/usr/bin/qemu-arm rmix,
/usr/bin/qemu-armeb rmix,
@@ -117,6 +129,16 @@
/bin/dash rmix,
/bin/dd rmix,
/bin/cat rmix,
+ /etc/pki/CA/ r,
+ /etc/pki/CA/* r,
+ /etc/pki/libvirt/ r,
+ /etc/pki/libvirt/** r,
+
+ # for rbd
+ /etc/ceph/ceph.conf r,
+
+ # for access to hugepages
+ owner "/run/hugepages/kvm/libvirt/qemu/**" rw,
# for usb access
/dev/bus/usb/ r,
@@ -124,6 +146,13 @@
/sys/bus/ r,
/sys/class/ r,
+ signal (receive) peer=/usr/sbin/libvirtd,
+ ptrace (tracedby) peer=/usr/sbin/libvirtd,
+
+ # for ppc device-tree access
+ @{PROC}/device-tree/ r,
+ @{PROC}/device-tree/** r,
+
/usr/{lib,libexec}/qemu-bridge-helper Cx -> qemu_bridge_helper,
# child profile for bridge helper process
profile qemu_bridge_helper {
diff --git a/examples/apparmor/local-usr.sbin.libvirtd b/examples/apparmor/local-usr.sbin.libvirtd
new file mode 100644
index 0000000..6e19f20
--- /dev/null
+++ b/examples/apparmor/local-usr.sbin.libvirtd
@@ -0,0 +1,2 @@
+# Site-specific additions and overrides for usr.sbin.libvirtd.
+# For more details, please see /etc/apparmor.d/local/README.
diff --git a/examples/apparmor/usr.lib.libvirt.virt-aa-helper b/examples/apparmor/usr.lib.libvirt.virt-aa-helper
index bceaaff..4df86b0 100644
--- a/examples/apparmor/usr.lib.libvirt.virt-aa-helper
+++ b/examples/apparmor/usr.lib.libvirt.virt-aa-helper
@@ -1,8 +1,9 @@
-# Last Modified: Mon Apr 5 15:10:27 2010
+# Last Modified: Thu, 18 Sep 2014 14:05:36 +0200
#include <tunables/global>
/usr/lib/libvirt/virt-aa-helper {
#include <abstractions/base>
+ #include <abstractions/user-tmp>
# needed for searching directories
capability dac_override,
@@ -19,6 +20,12 @@
# for hostdev
/sys/devices/ r,
/sys/devices/** r,
+ /sys/bus/usb/devices/ r,
+ /sys/bus/usb/devices/** r,
+ deny /dev/sd* r,
+ deny /dev/dm-* r,
+ deny /dev/mapper/ r,
+ deny /dev/mapper/* r,
/usr/lib/libvirt/virt-aa-helper mr,
/sbin/apparmor_parser Ux,
@@ -26,8 +33,11 @@
/etc/apparmor.d/libvirt/* r,
/etc/apparmor.d/libvirt/libvirt-[0-9a-f]*-[0-9a-f]*-[0-9a-f]*-[0-9a-f]*-[0-9a-f]* rw,
- # for backingstore -- allow access to non-hidden files in @{HOME} as well
- # as storage pools
+ # For backingstore, virt-aa-helper needs to peek inside the disk image, so
+ # allow access to non-hidden files in @{HOME} as well as storage pools, and
+ # removable media and filesystems, and certain file extentions. A
+ # virt-aa-helper failure when checking a disk for backinsgstore is non-fatal
+ # (but obviously the backingstore won't be added).
audit deny @{HOME}/.* mrwkl,
audit deny @{HOME}/.*/ rw,
audit deny @{HOME}/.*/** mrwkl,
@@ -35,8 +45,17 @@
audit deny @{HOME}/bin/** mrwkl,
@{HOME}/ r,
@{HOME}/** r,
+ @{HOME}/.Private/** mrwlk,
+ @{HOMEDIRS}/.ecryptfs/*/.Private/** mrwlk,
+
/var/lib/libvirt/images/ r,
/var/lib/libvirt/images/** r,
+ /var/lib/nova/images/** r,
+ /var/lib/nova/instances/_base/** r,
+ /var/lib/nova/instances/snapshots/** r,
+ /var/lib/eucalyptus/instances/**/disk* r,
+ /var/lib/eucalyptus/instances/**/loader* r,
+ /var/lib/uvtool/libvirt/images/** r,
/{media,mnt,opt,srv}/** r,
/**.img r,
diff --git a/examples/apparmor/usr.sbin.libvirtd b/examples/apparmor/usr.sbin.libvirtd
index 3011eff..814b4d81 100644
--- a/examples/apparmor/usr.sbin.libvirtd
+++ b/examples/apparmor/usr.sbin.libvirtd
@@ -1,10 +1,12 @@
-# Last Modified: Mon Apr 5 15:03:58 2010
+# Last Modified: Tue, 23 Sep 2014 09:28:07 +0200
#include <tunables/global>
@{LIBVIRT}="libvirt"
/usr/sbin/libvirtd {
#include <abstractions/base>
#include <abstractions/dbus>
+ # Site-specific additions and overrides. See local/README for details.
+ #include <local/usr.sbin.libvirtd>
capability kill,
capability net_admin,
@@ -23,6 +25,7 @@
capability setpcap,
capability mknod,
capability fsetid,
+ capability ipc_lock,
capability audit_write,
# Needed for vfio
@@ -33,6 +36,12 @@
network inet6 stream,
network inet6 dgram,
network packet dgram,
+ network netlink,
+
+ dbus bus=system,
+ signal,
+ ptrace,
+ unix,
# Very lenient profile for libvirtd since we want to first focus on confining
# the guests. Guests will have a very restricted profile.
@@ -45,6 +54,12 @@
/usr/sbin/* PUx,
/lib/udev/scsi_id PUx,
/usr/lib/xen-common/bin/xen-toolstack PUx,
+ /usr/lib/xen-*/bin/pygrub PUx,
+ /usr/lib/xen-*/bin/libxl-save-helper PUx,
+
+ # Required by nwfilter_ebiptables_driver.c:ebiptablesWriteToTempFile() to
+ # write and run an ebtables script.
+ /var/lib/libvirt/virtd* ixr,
# force the use of virt-aa-helper
audit deny /sbin/apparmor_parser rwxl,
--
1.9.1
8 years, 7 months
[libvirt] [PATCH v2 0/8] Add support for fetching statistics of completed jobs
by Jiri Denemark
Using virDomainGetJobStats, we can monitor running jobs but sometimes it
may be useful to get statistics about a job that already finished, for
example, to get the final amount of data transferred during migration or
to get an idea about total downtime. This is what the following patches
are about.
Version 2:
- changed according to John's review (see individual patches for
details)
Jiri Denemark (8):
Refactor job statistics
qemu: Avoid incrementing jobs_queued if virTimeMillisNow fails
Add support for fetching statistics of completed jobs
qemu: Silence coverity on optional migration stats
virsh: Add support for completed job stats
qemu: Transfer migration statistics to destination
qemu: Recompute downtime and total time when migration completes
qemu: Transfer recomputed stats back to source
include/libvirt/libvirt.h.in | 11 ++
src/libvirt.c | 11 +-
src/qemu/qemu_domain.c | 189 ++++++++++++++++++++++++++-
src/qemu/qemu_domain.h | 32 ++++-
src/qemu/qemu_driver.c | 130 ++++--------------
src/qemu/qemu_migration.c | 304 ++++++++++++++++++++++++++++++++++++-------
src/qemu/qemu_monitor_json.c | 10 +-
src/qemu/qemu_process.c | 9 +-
tools/virsh-domain.c | 27 +++-
tools/virsh.pod | 10 +-
10 files changed, 557 insertions(+), 176 deletions(-)
--
2.1.0
8 years, 7 months
[libvirt] securityselinuxlabeltest test fails on v1.2.5
by Scott Sullivan
I am trying to build v1.2.5-maint, however I have one test failing
causing the build to fail:
TEST: securityselinuxlabeltest
!!!. 4 FAIL
PASS: virsh-undefine
=======================================
1 of 112 tests failed
Please report to libvir-list(a)redhat.com
=======================================
make[2]: *** [check-TESTS] Error 1
make[2]: Leaving directory `/home/rpmbuild/packages/libvirt/tests'
make[1]: *** [check-am] Error 2
make[1]: Leaving directory `/home/rpmbuild/packages/libvirt/tests'
make: *** [check-recursive] Error 1
error: Bad exit status from /var/tmp/rpm-tmp.UGNUaq (%build)
Is anyone else having this problem? Im building on CentOS 6.5. Im happy
to provide any further information as needed.
9 years, 8 months
[libvirt] [PATCH] LXC: create a bind mount for sysfs when enable userns but disable netns
by Chen Hanxiao
kernel commit 7dc5dbc879bd0779924b5132a48b731a0bc04a1e
forbid us doing a fresh mount for sysfs
when enable userns but disable netns.
This patch will create a bind mount in this senario.
Signed-off-by: Chen Hanxiao <chenhanxiao(a)cn.fujitsu.com>
---
src/lxc/lxc_container.c | 44 +++++++++++++++++++++++++++++++++-----------
1 file changed, 33 insertions(+), 11 deletions(-)
diff --git a/src/lxc/lxc_container.c b/src/lxc/lxc_container.c
index 4d89677..8a27215 100644
--- a/src/lxc/lxc_container.c
+++ b/src/lxc/lxc_container.c
@@ -815,10 +815,13 @@ static int lxcContainerSetReadOnly(void)
}
-static int lxcContainerMountBasicFS(bool userns_enabled)
+static int lxcContainerMountBasicFS(bool userns_enabled,
+ bool netns_disabled)
{
size_t i;
int rc = -1;
+ char* mnt_src = NULL;
+ int mnt_mflags;
VIR_DEBUG("Mounting basic filesystems");
@@ -826,8 +829,25 @@ static int lxcContainerMountBasicFS(bool userns_enabled)
bool bindOverReadonly;
virLXCBasicMountInfo const *mnt = &lxcBasicMounts[i];
+ /* When enable userns but disable netns, kernel will
+ * forbid us doing a new fresh mount for sysfs.
+ * So we had to do a bind mount for sysfs instead.
+ */
+ if (userns_enabled && netns_disabled &&
+ STREQ(mnt->src, "sysfs")) {
+ if (VIR_STRDUP(mnt_src, "/sys") < 0) {
+ goto cleanup;
+ }
+ mnt_mflags = MS_NOSUID|MS_NOEXEC|MS_NODEV|MS_RDONLY|MS_BIND;
+ } else {
+ if (VIR_STRDUP(mnt_src, mnt->src) < 0) {
+ goto cleanup;
+ }
+ mnt_mflags = mnt->mflags;
+ }
+
VIR_DEBUG("Processing %s -> %s",
- mnt->src, mnt->dst);
+ mnt_src, mnt->dst);
if (mnt->skipUnmounted) {
char *hostdir;
@@ -856,7 +876,7 @@ static int lxcContainerMountBasicFS(bool userns_enabled)
if (virFileMakePath(mnt->dst) < 0) {
virReportSystemError(errno,
_("Failed to mkdir %s"),
- mnt->src);
+ mnt_src);
goto cleanup;
}
@@ -867,24 +887,24 @@ static int lxcContainerMountBasicFS(bool userns_enabled)
* we mount the filesystem in read-write mode initially, and then do a
* separate read-only bind mount on top of that.
*/
- bindOverReadonly = !!(mnt->mflags & MS_RDONLY);
+ bindOverReadonly = !!(mnt_mflags & MS_RDONLY);
VIR_DEBUG("Mount %s on %s type=%s flags=%x",
- mnt->src, mnt->dst, mnt->type, mnt->mflags & ~MS_RDONLY);
- if (mount(mnt->src, mnt->dst, mnt->type, mnt->mflags & ~MS_RDONLY, NULL) < 0) {
+ mnt_src, mnt->dst, mnt->type, mnt_mflags & ~MS_RDONLY);
+ if (mount(mnt_src, mnt->dst, mnt->type, mnt_mflags & ~MS_RDONLY, NULL) < 0) {
virReportSystemError(errno,
_("Failed to mount %s on %s type %s flags=%x"),
- mnt->src, mnt->dst, NULLSTR(mnt->type),
- mnt->mflags & ~MS_RDONLY);
+ mnt_src, mnt->dst, NULLSTR(mnt->type),
+ mnt_mflags & ~MS_RDONLY);
goto cleanup;
}
if (bindOverReadonly &&
- mount(mnt->src, mnt->dst, NULL,
+ mount(mnt_src, mnt->dst, NULL,
MS_BIND|MS_REMOUNT|MS_RDONLY, NULL) < 0) {
virReportSystemError(errno,
_("Failed to re-mount %s on %s flags=%x"),
- mnt->src, mnt->dst,
+ mnt_src, mnt->dst,
MS_BIND|MS_REMOUNT|MS_RDONLY);
goto cleanup;
}
@@ -893,6 +913,7 @@ static int lxcContainerMountBasicFS(bool userns_enabled)
rc = 0;
cleanup:
+ VIR_FREE(mnt_src);
VIR_DEBUG("rc=%d", rc);
return rc;
}
@@ -1643,7 +1664,8 @@ static int lxcContainerSetupPivotRoot(virDomainDefPtr vmDef,
goto cleanup;
/* Mounts the core /proc, /sys, etc filesystems */
- if (lxcContainerMountBasicFS(vmDef->idmap.nuidmap) < 0)
+ if (lxcContainerMountBasicFS(vmDef->idmap.nuidmap,
+ !vmDef->nnets) < 0)
goto cleanup;
/* Ensure entire root filesystem (except /.oldroot) is readonly */
--
1.9.0
9 years, 9 months
[libvirt] [RFC PATCH v2 0/4] Enable spapr-pci-vfio-host-bridge controllers for VFIO passthrough support
by Shivaprasad G Bhat
The following series of patches enable spapr-pci-vfio-host-bridge
controllers on PPC64-pseries machine which is required for supporting
host device passthrough using VFIO.
There were some initial enablement work on the same at
http://www.redhat.com/archives/libvir-list/2013-September/msg00838.html.
On pseries(ppc64), the vfio host devices(will refer as
hostdevs here on) cannot be assigned to the default emulated pci-host-bus(phb)
controller(like the default pci.0). The hostdevs goto spapr-pci-vfio-host-bridge.
The hostdevs belonging to the same iommu group share the same
spapr-pci-vfio-host-bridge. Henceforth, new spapr-pci-host-bridge needs to be
added for every hostdev belonging to any new iommu group. The hostdevs should
be attached to their respective spapr-pci-vfio-host-bridge.
Libvirt today adds all the devices to the default pci domain. The patch series
take care to add the new controller. A new pci domain in the guest per
controller is created. The hostdevs get their pci address in the respective
domain. The patch series taskes care of device addressing in vfio hostdevs,
SR-IOV interfaces and network interfaces from SRIOV virtual function pools.
Reference:
======
v1: http://www.redhat.com/archives/libvir-list/2014-October/msg00500.html
Changes Since v1:
* Some minor code clean up and rebase to latest code base after review with Prerna
* Patch 2 : Added logic to remove redundant spapr-vfio controllers.
* : Moved domaincommon.rng from Patch 4 to Patch 2.
---
Shivaprasad G Bhat (4):
qemu: Add SPAPR_VFIO_HOST_BRIDGE capability for PPC platform
qemu: parse spapr-vfio-pci controller from xml
qemu: assign addresses for spapr vfio hostdevices and generate cli
qemu: add test case for spapr-pci-vfio-host-bridge
docs/schemas/domaincommon.rng | 28 ++
src/bhyve/bhyve_domain.c | 2
src/conf/domain_addr.c | 8 -
src/conf/domain_addr.h | 2
src/conf/domain_conf.c | 165 +++++++++++-
src/conf/domain_conf.h | 19 +
src/libvirt_private.syms | 2
src/qemu/qemu_capabilities.c | 2
src/qemu/qemu_capabilities.h | 1
src/qemu/qemu_command.c | 276 +++++++++++++++++++-
src/qemu/qemu_command.h | 17 +
src/qemu/qemu_domain.c | 12 -
src/qemu/qemu_driver.c | 6
tests/qemuhotplugtest.c | 2
.../qemuxml2argv-hostdev-spapr-vfio.args | 20 +
.../qemuxml2argv-hostdev-spapr-vfio.xml | 75 +++++
tests/qemuxml2argvtest.c | 8 +
17 files changed, 609 insertions(+), 36 deletions(-)
create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-spapr-vfio.args
create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-spapr-vfio.xml
--
Signature
9 years, 9 months
[libvirt] [PATCH] Fix reporting of i/o errors by iohelper process
by Jason J. Herne
From: "Jason J. Herne" <jjherne(a)us.ibm.com>
libvirt_iohelper is a helper process that is exec'ed and used to handle I/O
during a Qemu managed save operation. Due to a missing call to
virFileWrapperFdClose, all I/O error messages reported by iohelper are lost.
This patch adds a call to virFileWrapperFdClose to the cleanup phase of
qemuDomainSaveMemory.
This patch also modifies virFileWrapperFdClose such that errors are only
reported when the length of the err_msg buffer is > 0. Before now, the
existence of the buffer would trigger error reporting in virFileWrapperFdClose.
Signed-off-by: Jason J. Herne <jjherne(a)us.ibm.com>
---
src/qemu/qemu_driver.c | 1 +
src/util/virfile.c | 2 +-
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index ecccf6c..8d78805 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -3015,6 +3015,7 @@ qemuDomainSaveMemory(virQEMUDriverPtr driver,
cleanup:
VIR_FORCE_CLOSE(fd);
+ virFileWrapperFdClose(wrapperFd);
virFileWrapperFdFree(wrapperFd);
VIR_FREE(xml);
diff --git a/src/util/virfile.c b/src/util/virfile.c
index 463064c..813b4f5 100644
--- a/src/util/virfile.c
+++ b/src/util/virfile.c
@@ -322,7 +322,7 @@ virFileWrapperFdClose(virFileWrapperFdPtr wfd)
return 0;
ret = virCommandWait(wfd->cmd, NULL);
- if (wfd->err_msg)
+ if (wfd->err_msg && strlen(wfd->err_msg))
VIR_WARN("iohelper reports: %s", wfd->err_msg);
return ret;
--
1.8.3.2
9 years, 10 months
[libvirt] [PATCH v3] qemu: Pass file descriptor when using TPM passthrough
by Stefan Berger
Pass the TPM file descriptor to QEMU via command line.
Instead of passing /dev/tpm0 we now pass /dev/fdset/10 and the additional
parameters -add-fd set=10,fd=20.
This addresses the use case when QEMU is started with non-root privileges
and QEMU cannot open /dev/tpm0 for example.
One problem is that for the passing of the file descriptor set to work,
virCommandReorderFDs must not be called on the virCommand. This is prevented
by setting a flag in the virCommandPassFDGetFDIndex that is checked to be
clear when virCommandReorderFDs is run.
Signed-off-by: Stefan Berger <stefanb(a)linux.vnet.ibm.com>
v2->v3: Fixed some memory leaks
---
src/libvirt_private.syms | 1 +
src/qemu/qemu_command.c | 136 ++++++++++++++++++++++++++++++++++++++++++++---
src/util/vircommand.c | 33 ++++++++++++
src/util/vircommand.h | 3 ++
4 files changed, 166 insertions(+), 7 deletions(-)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index aeec440..3194e8b 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -1164,6 +1164,7 @@ virCommandNewArgList;
virCommandNewArgs;
virCommandNonblockingFDs;
virCommandPassFD;
+virCommandPassFDGetFDIndex;
virCommandPassListenFDs;
virCommandRawStatus;
virCommandRequireHandshake;
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index 8ed7934..17debba 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -159,6 +159,58 @@ VIR_ENUM_IMPL(qemuNumaPolicy, VIR_DOMAIN_NUMATUNE_MEM_LAST,
"interleave");
/**
+ * qemuVirCommandGetFDSet:
+ * @cmd: the command to modify
+ * @fd: fd to reassign to the child
+ *
+ * Get the parameters for the QEMU -add-fd command line option
+ * for the given file descriptor. The file descriptor must previously
+ * have been 'transferred' in a virCommandPassFD() call.
+ * This function for example returns "set=10,fd=20".
+ */
+static char *
+qemuVirCommandGetFDSet(virCommandPtr cmd, int fd)
+{
+ char *result = NULL;
+ int idx = virCommandPassFDGetFDIndex(cmd, fd);
+
+ if (idx >= 0) {
+ ignore_value(virAsprintf(&result, "set=%d,fd=%d", idx, fd) < 0);
+ } else {
+ virReportError(VIR_ERR_INTERNAL_ERROR,
+ _("file descriptor %d has not been transferred"), fd);
+ }
+
+ return result;
+}
+
+/**
+ * qemuVirCommandGetDevSet:
+ * @cmd: the command to modify
+ * @fd: fd to reassign to the child
+ *
+ * Get the parameters for the QEMU path= parameter where a file
+ * descriptor is accessed via a file descriptor set, for example
+ * /dev/fdset/10. The file descriptor must previously have been
+ * 'transferred' in a virCommandPassFD() call.
+ */
+static char *
+qemuVirCommandGetDevSet(virCommandPtr cmd, int fd)
+{
+ char *result = NULL;
+ int idx = virCommandPassFDGetFDIndex(cmd, fd);
+
+ if (idx >= 0) {
+ ignore_value(virAsprintf(&result, "/dev/fdset/%d", idx) < 0);
+ } else {
+ virReportError(VIR_ERR_INTERNAL_ERROR,
+ _("file descriptor %d has not been transferred"), fd);
+ }
+ return result;
+}
+
+
+/**
* qemuPhysIfaceConnect:
* @def: the definition of the VM (needed by 802.1Qbh and audit)
* @driver: pointer to the driver instance
@@ -5926,14 +5978,20 @@ qemuBuildRNGDeviceArgs(virCommandPtr cmd,
static char *qemuBuildTPMBackendStr(const virDomainDef *def,
+ virCommandPtr cmd,
virQEMUCapsPtr qemuCaps,
- const char *emulator)
+ const char *emulator,
+ int *tpmfd, int *cancelfd)
{
const virDomainTPMDef *tpm = def->tpm;
virBuffer buf = VIR_BUFFER_INITIALIZER;
const char *type = virDomainTPMBackendTypeToString(tpm->type);
- char *cancel_path;
+ char *cancel_path = NULL;
const char *tpmdev;
+ char *devset = NULL, *cancel_devset = NULL;
+
+ *tpmfd = -1;
+ *cancelfd = -1;
virBufferAsprintf(&buf, "%s,id=tpm-%s", type, tpm->info.alias);
@@ -5946,11 +6004,49 @@ static char *qemuBuildTPMBackendStr(const virDomainDef *def,
if (!(cancel_path = virTPMCreateCancelPath(tpmdev)))
goto error;
- virBufferAddLit(&buf, ",path=");
- virBufferEscape(&buf, ',', ",", "%s", tpmdev);
+ if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_ADD_FD)) {
+ *tpmfd = open(tpmdev, O_RDWR);
+ if (*tpmfd < 0) {
+ virReportError(VIR_ERR_INTERNAL_ERROR,
+ _("Could not open TPM device %s"), tpmdev);
+ goto error;
+ }
+
+ virCommandPassFD(cmd, *tpmfd,
+ VIR_COMMAND_PASS_FD_CLOSE_PARENT);
+ devset = qemuVirCommandGetDevSet(cmd, *tpmfd);
+ if (devset == NULL)
+ goto error;
+
+ *cancelfd = open(cancel_path, O_WRONLY);
+ if (*cancelfd < 0) {
+ virReportError(VIR_ERR_INTERNAL_ERROR,
+ _("Could not open TPM device's cancel path "
+ "%s"), cancel_path);
+ goto error;
+ }
+
+ virCommandPassFD(cmd, *cancelfd,
+ VIR_COMMAND_PASS_FD_CLOSE_PARENT);
+ cancel_devset = qemuVirCommandGetDevSet(cmd, *cancelfd);
+ if (cancel_devset == NULL)
+ goto error;
+
+ virBufferAddLit(&buf, ",path=");
+ virBufferEscape(&buf, ',', ",", "%s", devset);
+ VIR_FREE(devset);
- virBufferAddLit(&buf, ",cancel-path=");
- virBufferEscape(&buf, ',', ",", "%s", cancel_path);
+ virBufferAddLit(&buf, ",cancel-path=");
+ virBufferEscape(&buf, ',', ",", "%s", cancel_devset);
+ VIR_FREE(cancel_devset);
+ } else {
+ /* all test cases will use this path */
+ virBufferAddLit(&buf, ",path=");
+ virBufferEscape(&buf, ',', ",", "%s", tpmdev);
+
+ virBufferAddLit(&buf, ",cancel-path=");
+ virBufferEscape(&buf, ',', ",", "%s", cancel_path);
+ }
VIR_FREE(cancel_path);
break;
@@ -5970,6 +6066,10 @@ static char *qemuBuildTPMBackendStr(const virDomainDef *def,
emulator, type);
error:
+ VIR_FREE(devset);
+ VIR_FREE(cancel_devset);
+ VIR_FREE(cancel_path);
+
virBufferFreeAndReset(&buf);
return NULL;
}
@@ -9223,13 +9323,35 @@ qemuBuildCommandLine(virConnectPtr conn,
if (def->tpm) {
char *optstr;
+ int tpmfd = -1;
+ int cancelfd = -1;
+ char *fdset;
- if (!(optstr = qemuBuildTPMBackendStr(def, qemuCaps, emulator)))
+ if (!(optstr = qemuBuildTPMBackendStr(def, cmd, qemuCaps, emulator,
+ &tpmfd, &cancelfd)))
goto error;
virCommandAddArgList(cmd, "-tpmdev", optstr, NULL);
VIR_FREE(optstr);
+ if (tpmfd >= 0) {
+ fdset = qemuVirCommandGetFDSet(cmd, tpmfd);
+ if (!fdset)
+ goto error;
+
+ virCommandAddArgList(cmd, "-add-fd", fdset, NULL);
+ VIR_FREE(fdset);
+ }
+
+ if (cancelfd >= 0) {
+ fdset = qemuVirCommandGetFDSet(cmd, cancelfd);
+ if (!fdset)
+ goto error;
+
+ virCommandAddArgList(cmd, "-add-fd", fdset, NULL);
+ VIR_FREE(fdset);
+ }
+
if (!(optstr = qemuBuildTPMDevStr(def, qemuCaps, emulator)))
goto error;
diff --git a/src/util/vircommand.c b/src/util/vircommand.c
index 6527d85..2616446 100644
--- a/src/util/vircommand.c
+++ b/src/util/vircommand.c
@@ -67,6 +67,7 @@ enum {
VIR_EXEC_RUN_SYNC = (1 << 3),
VIR_EXEC_ASYNC_IO = (1 << 4),
VIR_EXEC_LISTEN_FDS = (1 << 5),
+ VIR_EXEC_FIXED_FDS = (1 << 6),
};
typedef struct _virCommandFD virCommandFD;
@@ -214,6 +215,12 @@ virCommandReorderFDs(virCommandPtr cmd)
if (!cmd || cmd->has_error || !cmd->npassfd)
return;
+ if ((cmd->flags & VIR_EXEC_FIXED_FDS)) {
+ virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("The fds are fixed and cannot be reordered"));
+ goto error;
+ }
+
for (i = 0; i < cmd->npassfd; i++)
maxfd = MAX(cmd->passfd[i].fd, maxfd);
@@ -1019,6 +1026,32 @@ virCommandPassListenFDs(virCommandPtr cmd)
cmd->flags |= VIR_EXEC_LISTEN_FDS;
}
+/*
+ * virCommandPassFDGetFDIndex:
+ * @cmd: pointer to virCommand
+ * @fd: FD to get index of
+ *
+ * Determine the index of the FD in the transfer set.
+ *
+ * Returns index >= 0 if @set contains @fd,
+ * -1 otherwise.
+ */
+int
+virCommandPassFDGetFDIndex(virCommandPtr cmd, int fd)
+{
+ size_t i = 0;
+
+ while (i < cmd->npassfd) {
+ if (cmd->passfd[i].fd == fd) {
+ cmd->flags |= VIR_EXEC_FIXED_FDS;
+ return i;
+ }
+ i++;
+ }
+
+ return -1;
+}
+
/**
* virCommandSetPidFile:
* @cmd: the command to modify
diff --git a/src/util/vircommand.h b/src/util/vircommand.h
index bf65de4..198da2f 100644
--- a/src/util/vircommand.h
+++ b/src/util/vircommand.h
@@ -62,6 +62,9 @@ void virCommandPassFD(virCommandPtr cmd,
void virCommandPassListenFDs(virCommandPtr cmd);
+int virCommandPassFDGetFDIndex(virCommandPtr cmd,
+ int fd);
+
void virCommandSetPidFile(virCommandPtr cmd,
const char *pidfile) ATTRIBUTE_NONNULL(2);
--
1.9.3
9 years, 10 months