September 2013 - Devel - Libvirt List Archives

[libvirt] USB: USB Passthrough Device Autoconnect Feature

by Wangyufei (A)

Hello,

QEMU upstream has implemented a USB passthrough device autoconnect feature for the guest. It covers cases such as a USB device being unplugged from the host and then plugged back into the same physical USB port. The patch was:

https://lists.gnu.org/archive/html/qemu-devel/2011-05/msg02341.html

However, libvirt does not yet provide an interface that identifies a USB device for passthrough by its physical port rather than by its device number. Do you have any ideas or plans about this problem?

Thanks in advance.

Best Regards,
-WangYufei


[libvirt] [PATCH v2] Add some notes about security considerations when using LXC

by Daniel P. Berrange

From: "Daniel P. Berrange" <berrange(a)redhat.com> Describe some of the issues to be aware of when configuring LXC guests with security isolation as a goal. Signed-off-by: Daniel P. Berrange <berrange(a)redhat.com> --- docs/drvlxc.html.in | 103 ++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 103 insertions(+) In v2: - Clarify UNIX domain socket issues wrt filesystem & network namespaces diff --git a/docs/drvlxc.html.in b/docs/drvlxc.html.in index 1e6aa1d..66d97e4 100644 --- a/docs/drvlxc.html.in +++ b/docs/drvlxc.html.in @@ -168,6 +168,109 @@ Further block or character devices will be made available to containers depending on their configuration. </p> +<h2><a name="security">Security considerations</a></h2> + +<p> +The libvirt LXC driver is fairly flexible in how it can be configured, +and as such does not enforce a requirement for strict security +separation between a container and the host. This allows it to be used +in scenarios where only resource control capabilities are important, +and resource sharing is desired. Applications wishing to ensure secure +isolation between a container and the host must ensure that they are +writing a suitable configuration. +</p> + +<h3><a name="securenetworking">Network isolation</a></h3> + +<p> +If the guest configuration does not list any network interfaces, +the <code>network</code> namespace will not be activated, and thus +the container will see all the host's network interfaces. This will +allow apps in the container to bind to/connect from TCP/UDP addresses +and ports from the host OS. It also allows applications to access +UNIX domain sockets associated with the host OS, which are in the +abstract namespace. If access to UNIX domains sockets in the abstract +namespace is not wanted, then applications should set the +<code><privnet/></code> flag in the +<code><features>....</features></code> element. 
+</p> + +<h3><a name="securefs">Filesystem isolation</a></h3> + +<p> +If the guest configuration does not list any filesystems, then +the container will be set up with a root filesystem that matches +the host's root filesystem. As noted earlier, only a few locations +such as <code>/dev</code>, <code>/proc</code> and <code>/sys</code> +will be altered. This means that, in the absence of restrictions +from sVirt, a process running as user/group N:M inside the container +will be able to access almost exactly the same files as a process +running as user/group N:M in the host. +</p> + +<p> +There are multiple options for restricting this. It is possible to +simply map the existing root filesystem through to the container in +read-only mode. Alternatively a completely separate root filesystem +can be configured for the guest. In both cases, further sub-mounts +can be applied to customize the content that is made visible. Note +that in the absence of sVirt controls, it is still possible for the +root user in a container to unmount any sub-mounts applied. The user +namespace feature can also be used to restrict access to files based +on the UID/GID mappings. +</p> + +<p> +Sharing the host filesystem tree, also allows applications to access +UNIX domains sockets associated with the host OS, which are in the +filesystem namespaces. It should be noted that a number of init +systems including at least <code>systemd</code> and <code>upstart</code> +have UNIX domain socket which are used to control their operation. +Thus, if the directory/filesystem holding their UNIX domain socket is +exposed to the container, it will be possible for a user in the container +to invoke operations on the init service in the same way it could if +outside the container. This also applies to other applications in the +host which use UNIX domain sockets in the filesystem, such as DBus, +Libvirtd, and many more. 
If this is not desired, then applications +should either specify the UID/GID mapping in the configuration to +enable user namespaces & thus block access to the UNIX domain socket +based on permissions, or should ensure the relevant directories have +a bind mount to hide them. This is particularly important for the +<code>/run</code> or <code>/var/run</code> directories. +</p> + + +<h3><a name="secureusers">User and group isolation</a></h3> + +<p> +If the guest configuration does not list any ID mapping, then the +user and group IDs used inside the container will match those used +outside the container. In addition, the capabilities associated with +a process in the container will infer the same privileges they would +for a process in the host. This has obvious implications for security, +since a root user inside the container will be able to access any +file owned by root that is visible to the container, and perform more +or less any privileged kernel operation. In the absence of additional +protection from sVirt, this means that the root user inside a container +is effectively as powerful as the root user in the host. There is no +security isolation of the root user. +</p> + +<p> +The ID mapping facility was introduced to allow for stricter control +over the privileges of users inside the container. It allows apps to +define rules such as "user ID 0 in the container maps to user ID 1000 +in the host". In addition the privileges associated with capabilities +are somewhat reduced so that they can not be used to escape from the +container environment. A full description of user namespaces is outside +the scope of this document, however LWN has +<a href="https://lwn.net/Articles/532593/">a good write-up on the topic</a>. +From the libvirt point of view, the key thing to remember is that defining +an ID mapping for users and groups in the container XML configuration +causes libvirt to activate the user namespace feature. 
+</p> + + <h2><a name="activation">Systemd Socket Activation Integration</a></h2> <p> -- 1.8.3.1


[libvirt] [PATCH] Fix naming of permission for detecting storage pools

by Daniel P. Berrange

From: "Daniel P. Berrange" <berrange(a)redhat.com>

The VIR_ACCESS_PERM_CONNECT_DETECT_STORAGE_POOLS enum
constant had its string format be 'detect_storage_pool',
note the missing trailing 's'. This prevents the ACL
check from ever succeeding. Fix this and add a simple
test script to validate this problem of matching names.

Signed-off-by: Daniel P. Berrange <berrange(a)redhat.com>
---
 src/Makefile.am            |  8 ++++-
 src/access/viraccessperm.c |  2 +-
 src/check-aclperms.pl      | 75 ++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 83 insertions(+), 2 deletions(-)
 create mode 100755 src/check-aclperms.pl

diff --git a/src/Makefile.am b/src/Makefile.am
index 711da32..9f9dcd9 100644
--- a/src/Makefile.am
+++ b/src/Makefile.am
@@ -528,10 +528,16 @@ check-aclrules:
 		$(REMOTE_PROTOCOL) \
 		$(addprefix $(srcdir)/,$(filter-out /%,$(STATEFUL_DRIVER_SOURCE_FILES)))
 
+check-aclperms:
+	$(AM_V_GEN)$(PERL) $(srcdir)/check-aclperms.pl \
+		$(srcdir)/access/viraccessperm.h \
+		$(srcdir)/access/viraccessperm.c
+
 EXTRA_DIST += check-driverimpls.pl check-aclrules.pl
 
 check-local: check-protocol check-symfile check-symsorting \
-	check-drivername check-driverimpls check-aclrules
+	check-drivername check-driverimpls check-aclrules \
+	check-aclperms
 .PHONY: check-protocol $(PROTOCOL_STRUCTS:structs=struct)
 
 # Mock driver, covering domains, storage, networks, etc
diff --git a/src/access/viraccessperm.c b/src/access/viraccessperm.c
index 9c720f9..d517c66 100644
--- a/src/access/viraccessperm.c
+++ b/src/access/viraccessperm.c
@@ -30,7 +30,7 @@ VIR_ENUM_IMPL(virAccessPermConnect,
               "search_storage_pools", "search_node_devices",
               "search_interfaces", "search_secrets",
               "search_nwfilters",
-              "detect_storage_pool", "pm_control",
+              "detect_storage_pools", "pm_control",
               "interface_transaction");
 
 VIR_ENUM_IMPL(virAccessPermDomain,
diff --git a/src/check-aclperms.pl b/src/check-aclperms.pl
new file mode 100755
index 0000000..b7fadcd
--- /dev/null
+++ b/src/check-aclperms.pl
@@ -0,0 +1,75 @@
+#!/usr/bin/perl
+#
+# Copyright (C) 2013 Red Hat, Inc.
+#
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2.1 of the License, or (at your option) any later version.
+#
+# This library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with this library. If not, see
+# <http://www.gnu.org/licenses/>.
+#
+# This script just validates that the stringified version of
+# a virAccessPerm enum matches the enum constant name. We do
+# alot of auto-generation of code, so when these don't match
+# problems occur, preventing auth from succeeding at all.
+
+my $hdr = shift;
+my $impl = shift;
+
+my %perms;
+
+my @perms;
+
+open HDR, $hdr or die "cannot read $hdr: $!";
+
+while (<HDR>) {
+    if (/^\s+VIR_ACCESS_PERM_([_A-Z]+)(,?|\s|$)/) {
+        my $perm = $1;
+
+        $perms{$perm} = 1 unless ($perm =~ /_LAST$/);
+    }
+}
+
+close HDR;
+
+
+open IMPL, $impl or die "cannot read $impl: $!";
+
+my $group;
+my $warned = 0;
+
+while (defined (my $line = <IMPL>)) {
+    if ($line =~ /VIR_ACCESS_PERM_([_A-Z]+)_LAST/) {
+        $group = $1;
+    } elsif ($line =~ /"[_a-z]+"/) {
+        my @bits = split /,/, $line;
+        foreach my $bit (@bits) {
+            if ($bit =~ /"([_a-z]+)"/) {
+                #print $1, "\n";
+
+                my $perm = uc($group . "_" . $1);
+                if (!exists $perms{$perm}) {
+                    print STDERR "Unknown perm string $1 for group $group\n";
+                    $warned = 1;
+                }
+                delete $perms{$perm};
+            }
+        }
+    }
+}
+close IMPL;
+
+foreach my $perm (keys %perms) {
+    print STDERR "Perm $perm had not string form\n";
+    $warned = 1;
+}
+
+exit $warned;
-- 
1.8.3.1
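The naming rule that the check script enforces can be sketched in C as well. This is only an illustration of the rule (upper-casing "<group>_<string>" must give the tail of the enum constant name); the helper name perm_string_matches is hypothetical and is not part of libvirt.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not libvirt API: returns 1 when the string form
 * `str` of a permission in `group` (e.g. "CONNECT") upper-cases to the
 * tail of the enum constant name `constant`, and 0 otherwise. */
static int
perm_string_matches(const char *group, const char *constant, const char *str)
{
    static const char prefix[] = "VIR_ACCESS_PERM_";
    size_t plen = sizeof(prefix) - 1;
    size_t glen = strlen(group);
    size_t i;

    /* The constant must start with VIR_ACCESS_PERM_<GROUP>_ */
    if (strncmp(constant, prefix, plen) != 0)
        return 0;
    constant += plen;
    if (strncmp(constant, group, glen) != 0 || constant[glen] != '_')
        return 0;
    constant += glen + 1;

    /* The remainder must equal the upper-cased string form */
    if (strlen(constant) != strlen(str))
        return 0;
    for (i = 0; str[i] != '\0'; i++) {
        if (toupper((unsigned char)str[i]) != (unsigned char)constant[i])
            return 0;
    }
    return 1;
}
```

With this rule, "detect_storage_pools" matches VIR_ACCESS_PERM_CONNECT_DETECT_STORAGE_POOLS in group "CONNECT", while the old "detect_storage_pool" string does not.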


[libvirt] [PATCH] rbd: Use rbd_create3 to create RBD format 2 images by default

by Wido den Hollander

This new RBD format supports snapshotting and cloning. By having
libvirt create images in format 2 end-users of the created images
can benefit from the new RBD format.

Signed-off-by: Wido den Hollander <wido(a)widodh.nl>
---
 src/storage/storage_backend_rbd.c | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/src/storage/storage_backend_rbd.c b/src/storage/storage_backend_rbd.c
index d9e1789..e79873f 100644
--- a/src/storage/storage_backend_rbd.c
+++ b/src/storage/storage_backend_rbd.c
@@ -435,6 +435,26 @@ cleanup:
     return ret;
 }
 
+static int virStorageBackendRBDCreateImage(rados_ioctx_t io,
+                                           char *name, long capacity)
+{
+    int order = 0;
+    #if LIBRBD_VERSION_CODE > 260
+    uint64_t features = 3;
+    uint64_t stripe_count = 1;
+    uint64_t stripe_unit = 4194304;
+
+    if (rbd_create3(io, name, capacity, features, &order,
+                    stripe_count, stripe_unit) < 0) {
+    #else
+    if (rbd_create(io, name, capacity, &order) < 0) {
+    #endif
+        return -1;
+    }
+
+    return 0;
+}
+
 static int virStorageBackendRBDCreateVol(virConnectPtr conn,
                                          virStoragePoolObjPtr pool,
                                          virStorageVolDefPtr vol)
@@ -442,7 +462,6 @@ static int virStorageBackendRBDCreateVol(virConnectPtr conn,
     virStorageBackendRBDStatePtr ptr;
     ptr.cluster = NULL;
     ptr.ioctx = NULL;
-    int order = 0;
     int ret = -1;
 
     VIR_DEBUG("Creating RBD image %s/%s with size %llu",
@@ -467,7 +486,7 @@ static int virStorageBackendRBDCreateVol(virConnectPtr conn,
         goto cleanup;
     }
 
-    if (rbd_create(ptr.ioctx, vol->name, vol->capacity, &order) < 0) {
+    if (virStorageBackendRBDCreateImage(ptr.ioctx, vol->name, vol->capacity) < 0) {
         virReportError(VIR_ERR_INTERNAL_ERROR,
                        _("failed to create volume '%s/%s'"),
                        pool->def->source.name,
-- 
1.7.9.5
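The magic numbers in the patch can be read as follows. This is a sketch under an assumption: the two feature bits below mirror what I understand librbd's RBD_FEATURE_LAYERING and RBD_FEATURE_STRIPINGV2 to be, and they are redefined locally under hypothetical SKETCH_ names so the fragment builds without the librbd headers.

```c
#include <stdint.h>

/* Assumed values mirroring librbd's RBD_FEATURE_LAYERING and
 * RBD_FEATURE_STRIPINGV2; the SKETCH_ names are hypothetical. */
#define SKETCH_FEATURE_LAYERING    (UINT64_C(1) << 0)
#define SKETCH_FEATURE_STRIPINGV2  (UINT64_C(1) << 1)

/* Feature mask equivalent to the literal "features = 3" in the patch. */
static uint64_t
sketch_format2_features(void)
{
    return SKETCH_FEATURE_LAYERING | SKETCH_FEATURE_STRIPINGV2;
}

/* Stripe unit in bytes for a size given in MiB. */
static uint64_t
sketch_stripe_unit_bytes(uint64_t mib)
{
    return mib * 1024 * 1024;
}
```

Under that assumption, the patch asks for layering plus striping v2, with a stripe unit of 4 MiB (4194304 bytes) and a stripe count of 1.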


[libvirt] [PATCH] Allow root users to have their own configuration file

by Martin Kletzander

Currently, we have two configuration file paths, one global (where
"global" means root-only and we're probably not changing this in near
future) and one per-user. Unfortunately root user cannot use the
second option because until now we were choosing the file path
depending only on whether the user is root or not.

This patch modifies the mentioned behavior for root only, allowing him
to set his own configuration files without changing anything in
system-wide configuration folders.

This also makes the virsh-uriprecedence test pass its first test case
when run as root.

Signed-off-by: Martin Kletzander <mkletzan(a)redhat.com>
---

Notes:
    I'm playing along previously mentioned "proper behavior" in this
    patch. However, IMNSHO, our "global" or "system-wide" configuration
    file (defaulting to '/etc/libvirt/libvirt.conf') should be accessible
    for all users since this has no security impact (security information
    may be in files 'libvirtd.conf' or 'qemu.conf'). This file should be
    also read and used for all users. After that, settings in user
    configuration file (defaulting to '~/.config/libvirt/libvirt.conf')
    may override some of these settings for that user.

    This is how all sensible configurations are loaded and that's also
    what I'd prefer. Unfortunately some developers feel this should be
    done in completely different way.

 src/libvirt.c | 56 ++++++++++++++++++++++++++++++++++++--------------------
 1 file changed, 36 insertions(+), 20 deletions(-)

diff --git a/src/libvirt.c b/src/libvirt.c
index 20a2d4c..bfc466b 100644
--- a/src/libvirt.c
+++ b/src/libvirt.c
@@ -957,28 +957,34 @@ error:
     return -1;
 }
 
-static char *
-virConnectGetConfigFilePath(void)
+/*
+ * Return code 0 means no error, but doesn't guarantee path != NULL.
+ */
+static int
+virConnectGetConfigFilePath(char **path, bool global)
 {
-    char *path;
-    if (geteuid() == 0) {
-        if (virAsprintf(&path, "%s/libvirt/libvirt.conf",
+    char *userdir = NULL;
+    int ret = -1;
+    *path = NULL;
+
+    /* Don't provide the global configuration file to non-root users */
+    if (geteuid() != 0 && global)
+        return 0;
+
+    if (global) {
+        if (virAsprintf(path, "%s/libvirt/libvirt.conf",
                         SYSCONFDIR) < 0)
-            return NULL;
+            goto cleanup;
     } else {
-        char *userdir = virGetUserConfigDirectory();
-        if (!userdir)
-            return NULL;
-
-        if (virAsprintf(&path, "%s/libvirt.conf",
-                        userdir) < 0) {
-            VIR_FREE(userdir);
-            return NULL;
-        }
-        VIR_FREE(userdir);
+        if (!(userdir = virGetUserConfigDirectory()) ||
+            virAsprintf(path, "%s/libvirt.conf", userdir) < 0)
+            goto cleanup;
     }
 
-    return path;
+    ret = 0;
+ cleanup:
+    VIR_FREE(userdir);
+    return ret;
 }
 
 static int
@@ -989,12 +995,22 @@ virConnectGetConfigFile(virConfPtr *conf)
 
     *conf = NULL;
 
-    if (!(filename = virConnectGetConfigFilePath()))
+    /* Try reading user configuration file unconditionally */
+    if (virConnectGetConfigFilePath(&filename, false) < 0)
         goto cleanup;
 
     if (!virFileExists(filename)) {
-        ret = 0;
-        goto cleanup;
+        /* and in case there is none, try the global one. */
+
+        VIR_FREE(filename);
+        if (virConnectGetConfigFilePath(&filename, true) < 0)
+            goto cleanup;
+
+        if (!filename ||
+            !virFileExists(filename)) {
+            ret = 0;
+            goto cleanup;
+        }
     }
 
     VIR_DEBUG("Loading config file '%s'", filename);
-- 
1.8.3.2
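The lookup order this patch ends up with can be summarized as a small decision function. This is a sketch of the behavior as I read the diff, not code from libvirt; the names config_choice and choose_config are hypothetical, and the two "exists" flags stand in for the virFileExists() checks.

```c
#include <stdbool.h>

/* Hypothetical names, for illustration only. */
typedef enum {
    CONFIG_NONE,    /* no configuration file is loaded */
    CONFIG_USER,    /* per-user libvirt.conf */
    CONFIG_GLOBAL   /* system-wide libvirt.conf */
} config_choice;

/* The user file is tried first for everyone; the global file is only
 * a fallback, and only for root. */
static config_choice
choose_config(bool is_root, bool user_file_exists, bool global_file_exists)
{
    if (user_file_exists)
        return CONFIG_USER;
    if (is_root && global_file_exists)
        return CONFIG_GLOBAL;
    return CONFIG_NONE;
}
```

Note that a non-root user without a per-user file still gets no configuration at all, which is the "root-only global file" behavior the notes in this mail argue against.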


[libvirt] [PATCHv2] netcf driver: use a single netcf handle for all connections

by Laine Stump

This resolves: https://bugzilla.redhat.com/show_bug.cgi?id=983026

The netcf interface driver previously had no state driver associated with it - as a connection was opened, it would create a new netcf instance just for that connection, and close it when it was finished. The problem with this is that each connection to libvirt used up a netlink socket, and there is a per process maximum of ~1000 netlink sockets.

The solution is to create a state driver to go along with the netcf driver. The state driver opens a netcf instance, then all connections share that same netcf instance, thus only a single netlink socket will be used no matter how many connections are made to libvirtd.

This was rather simple to do - a new virObjectLockable class is created for the single driverState object, which is created in netcfStateInitialize and contains the single netcf handle; instead of creating a new object for each client connection, netcfInterfaceOpen now just increments the driverState object's reference count and puts a pointer to it into the connection's privateData. Similarly, netcfInterfaceClose() just un-refs the driverState object (as does netcfStateCleanup()), and virNetcfInterfaceDriverStateDispose() handles closing the netcf instance. Since all the functions already have locking around them, the static lock functions used by all functions just needed to be changed to call virObjectLock() and virObjectUnlock() instead of directly calling the virMutex* functions.
---
Changes from V1:

* make driverState a static.

* switch to using a virObjectLockable for driverState, at Eric's suggestion.

* add a simple error message if ncf_init() fails.

Again, I've tried this with a small number of simultaneous connections (including virt-manager), but I don't have a ready-made stress test.
src/interface/interface_backend_netcf.c | 173 +++++++++++++++++++++++--------- 1 file changed, 125 insertions(+), 48 deletions(-) diff --git a/src/interface/interface_backend_netcf.c b/src/interface/interface_backend_netcf.c index f47669e..627c225 100644 --- a/src/interface/interface_backend_netcf.c +++ b/src/interface/interface_backend_netcf.c @@ -41,19 +41,119 @@ /* Main driver state */ typedef struct { - virMutex lock; + virObjectLockable parent; struct netcf *netcf; } virNetcfDriverState, *virNetcfDriverStatePtr; +static virClassPtr virNetcfDriverStateClass; +static void virNetcfDriverStateDispose(void *obj); -static void interfaceDriverLock(virNetcfDriverStatePtr driver) +static int +virNetcfDriverStateOnceInit(void) +{ + if (!(virNetcfDriverStateClass = virClassNew(virClassForObjectLockable(), + "virNetcfDriverState", + sizeof(virNetcfDriverState), + virNetcfDriverStateDispose))) + return -1; + return 0; +} + +VIR_ONCE_GLOBAL_INIT(virNetcfDriverState) + +static virNetcfDriverStatePtr driverState = NULL; + +static void +virNetcfDriverStateDispose(void *obj) +{ + virNetcfDriverStatePtr driver = obj; + + if (driver->netcf) + ncf_close(driver->netcf); +} + +static void +interfaceDriverLock(virNetcfDriverStatePtr driver) +{ + virObjectLock(driver); +} + +static void +interfaceDriverUnlock(virNetcfDriverStatePtr driver) +{ + virObjectUnlock(driver); +} + +static int +netcfStateInitialize(bool privileged ATTRIBUTE_UNUSED, + virStateInhibitCallback callback ATTRIBUTE_UNUSED, + void *opaque ATTRIBUTE_UNUSED) +{ + if (virNetcfDriverStateInitialize() < 0) + return -1; + + if (!(driverState = virObjectLockableNew(virNetcfDriverStateClass))) + return -1; + + /* open netcf */ + if (ncf_init(&driverState->netcf, NULL) != 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("failed to initialize netcf")); + virObjectUnref(driverState); + driverState = NULL; + return -1; + } + return 0; +} + +static int +netcfStateCleanup(void) { - virMutexLock(&driver->lock); + if 
(!driverState) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Attempt to close netcf state driver already closed")); + return -1; + } + + if (virObjectUnref(driverState)) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Attempt to close netcf state driver " + "with open connections")); + return -1; + } + driverState = NULL; + return 0; } -static void interfaceDriverUnlock(virNetcfDriverStatePtr driver) +static int +netcfStateReload(void) { - virMutexUnlock(&driver->lock); + int ret = -1; + + if (!driverState) + return 0; + + interfaceDriverLock(driverState); + ncf_close(driverState->netcf); + if (ncf_init(&driverState->netcf, NULL) != 0) + { + /* this isn't a good situation, because we can't shut down the + * driver as there may still be connections to it. If we set + * the netcf handle to NULL, any subsequent calls to netcf + * will just fail rather than causing a crash. Not ideal, but + * livable (since this should never happen). + */ + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("failed to re-init netcf")); + driverState->netcf = NULL; + goto cleanup; + } + + ret = 0; +cleanup: + interfaceDriverUnlock(driverState); + + return ret; } /* @@ -148,61 +248,30 @@ static struct netcf_if *interfaceDriverGetNetcfIF(struct netcf *ncf, virInterfac return iface; } -static virDrvOpenStatus netcfInterfaceOpen(virConnectPtr conn, - virConnectAuthPtr auth ATTRIBUTE_UNUSED, - unsigned int flags) +static virDrvOpenStatus +netcfInterfaceOpen(virConnectPtr conn, + virConnectAuthPtr auth ATTRIBUTE_UNUSED, + unsigned int flags) { - virNetcfDriverStatePtr driverState; - virCheckFlags(VIR_CONNECT_RO, VIR_DRV_OPEN_ERROR); - if (VIR_ALLOC(driverState) < 0) - goto alloc_error; - - /* initialize non-0 stuff in driverState */ - if (virMutexInit(&driverState->lock) < 0) - { - /* what error to report? */ - goto mutex_error; - } - - /* open netcf */ - if (ncf_init(&driverState->netcf, NULL) != 0) - { - /* what error to report? 
*/ - goto netcf_error; - } + if (!driverState) + return VIR_DRV_OPEN_ERROR; + virObjectRef(driverState); conn->interfacePrivateData = driverState; return VIR_DRV_OPEN_SUCCESS; - -netcf_error: - if (driverState->netcf) - { - ncf_close(driverState->netcf); - } - virMutexDestroy(&driverState->lock); -mutex_error: - VIR_FREE(driverState); -alloc_error: - return VIR_DRV_OPEN_ERROR; } -static int netcfInterfaceClose(virConnectPtr conn) +static int +netcfInterfaceClose(virConnectPtr conn) { if (conn->interfacePrivateData != NULL) { - virNetcfDriverStatePtr driver = conn->interfacePrivateData; - - /* close netcf instance */ - ncf_close(driver->netcf); - /* destroy lock */ - virMutexDestroy(&driver->lock); - /* free driver state */ - VIR_FREE(driver); + virObjectUnref(conn->interfacePrivateData); + conn->interfacePrivateData = NULL; } - conn->interfacePrivateData = NULL; return 0; } @@ -1070,7 +1139,7 @@ static int netcfInterfaceChangeRollback(virConnectPtr conn, unsigned int flags) #endif /* HAVE_NETCF_TRANSACTIONS */ static virInterfaceDriver interfaceDriver = { - "netcf", + .name = INTERFACE_DRIVER_NAME, .interfaceOpen = netcfInterfaceOpen, /* 0.7.0 */ .interfaceClose = netcfInterfaceClose, /* 0.7.0 */ .connectNumOfInterfaces = netcfConnectNumOfInterfaces, /* 0.7.0 */ @@ -1093,11 +1162,19 @@ static virInterfaceDriver interfaceDriver = { #endif /* HAVE_NETCF_TRANSACTIONS */ }; +static virStateDriver interfaceStateDriver = { + .name = INTERFACE_DRIVER_NAME, + .stateInitialize = netcfStateInitialize, + .stateCleanup = netcfStateCleanup, + .stateReload = netcfStateReload, +}; + int netcfIfaceRegister(void) { if (virRegisterInterfaceDriver(&interfaceDriver) < 0) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("failed to register netcf interface driver")); return -1; } + virRegisterStateDriver(&interfaceStateDriver); return 0; } -- 1.7.11.7
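The sharing scheme in this patch boils down to reference counting around one handle. The following is a minimal, single-threaded sketch of that idea in plain C; it deliberately leaves out the locking that virObjectLockable provides, and shared_state, state_new, state_ref and state_unref are hypothetical names, not libvirt or netcf API.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the shared driver state. */
typedef struct {
    int refs;          /* owners: the state driver plus open connections */
    int handle_open;   /* stands in for the single netcf handle */
} shared_state;

/* Created once at state-driver init; the driver holds the first ref. */
static shared_state *
state_new(void)
{
    shared_state *s = calloc(1, sizeof(*s));

    if (!s)
        return NULL;
    s->refs = 1;
    s->handle_open = 1;   /* where ncf_init() would be called */
    return s;
}

/* Called when a connection is opened: no new handle, just one more ref. */
static void
state_ref(shared_state *s)
{
    s->refs++;
}

/* Called on connection close and on driver cleanup. Returns the number
 * of remaining references; the handle is closed only when it hits 0. */
static int
state_unref(shared_state *s)
{
    if (--s->refs > 0)
        return s->refs;

    s->handle_open = 0;   /* where ncf_close() would be called */
    free(s);
    return 0;
}
```

However many connections take a reference, only one handle (and so one netlink socket) exists, and it is released by whichever owner drops the last reference.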


[libvirt] Doc: How to use NPIV in libvirt

by Osier Yang

Before posting it to WIKI or somewhere, I want to see if there is any suggestions on it, or if I missed something. ============================================ How to use NPIV in libvirt I planned to wrote a document about how to use NPIV in libvirt after more features are supported, but it looks like I can't wait till then, got lots lots of questions from both the bugs and mails. So here we go. The document tries to summary up the things about NPIV that libvirt supports till now, and the TODO list. Feedback or suggestion is welcomed. 1) How to find out which HBA(s) support vHBA For libvirt newer than "1.0.4", you can find it out simply by: # virsh nodedev-list --cap vports "--cap vports" is to tell "nodedev-list" only outputs the devices which support "vports" capability, i.e. support vHBA. And also since version "1.0.4", you should be able to know the maximum vports the HBA supports and the current vports number from the HBA's XML, e.g. # virsh nodedev-dumpxml scsi_host5 <device> <name>scsi_host5</name> <parent>pci_0000_04_00_1</parent> <capability type='scsi_host'> <host>5</host> <capability type='fc_host'> <wwnn>2001001b32a9da4e</wwnn> <wwpn>2101001b32a9da4e</wwpn> <fabric_wwn>2001000dec9877c1</fabric_wwn> </capability> <capability type='vport_ops'> <max_vports>164</max_vports> <vports>5</vports> </capability> </capability> </device> For libvirt older than "1.0.4", it's a bit complicated than above: First you need to find out all the HBAs, e.g. # virsh nodedev-list --cap scsi_host scsi_host0 scsi_host1 scsi_host2 scsi_host3 scsi_host4 scsi_host5 And then, to see if the HBA supports vHBA, check if the dumped XML contains "vport_ops" capability. E.g. 
# virsh nodedev-dumpxml scsi_host3 <device> <name>scsi_host3</name> <parent>pci_0000_00_08_0</parent> <capability type='scsi_host'> <host>3</host> </capability> </device> That says "scsi_host3" doesn't support vHBA # virsh nodedev-dumpxml scsi_host5 <device> <name>scsi_host5</name> <parent>pci_0000_04_00_1</parent> <capability type='scsi_host'> <host>5</host> <capability type='fc_host'> <wwnn>2001001b32a9da4e</wwnn> <wwpn>2101001b32a9da4e</wwpn> <fabric_wwn>2001000dec9877c1</fabric_wwn> </capability> <capability type='vport_ops' /> </capability> </device> But "scsi_host5" supports it. One might be confused with the node device naming style (e.g. scsi_host5) in this document and RHEL6 Virtualization Guide [1] (pci_10df_fe00_scsi_host_0). It's because of libvirt has two backends for node device driver: udev and HAL. We prefer the udev backend more than HAL backend in internal implementation, I think there is good enough reason to do so (HAL is maintenance mode now). I believe udev backend is used more than HAL backend, but if your destribution packager build libvirt without udev backend, don't be surprised with the node device names like the ones in [1]. 2) How to create a vHBA Pick up one HBA which supports vHBA, use it's "node device name" as the "parent" of vHBA, and specify the "wwnn" and "wwpn" in the vHBA's XML. E.g. <device> <name>scsi_host6</name> <parent>scsi_host5</parent> <capability type='scsi_host'> <capability type='fc_host'> <wwnn>2001001b32a9da5e</wwnn> <wwpn>2101001b32a9da5e</wwpn> </capability> </capability> </device> Then create the vHBA with virsh command "nodedev-create" (assuming above XML file is named "vhba.xml"): # virsh nodedev-create vhba.xml Node device scsi_host6 created from vhba.xml Since "0.9.10", libvirt will generate "wwnn" and "wwpn" automatically if they are not specified. 
It means one can create the vHBA by a more simple XML like: <device> <parent>scsi_host5</parent> <capability type='scsi_host'> <capability type='fc_host'> </capability> </capability> </device> 3) How to destroy a vHBA As usual, destroying something is always simpler than creating it: # virsh nodedev-destroy scsi_host6 Destroyed node device 'scsi_host6' You might already realize that the vHBA is removed permanently, don't be surprised, it's the life, node device driver doesn't support persistent config. I won't say it's nightmare for users who screams when realizing the vHBA disappeared after a system rebooting, but it's relatively not good, (assuming that you got the wwnn:wwpn pair from the storage admin, but didn't record it). Fortunately, we support the persistent vHBA now, see next section for details. 4) How to create a persistent vHBA Let's go back to the history a bit firstly. Prior to libvirt "1.0.5", one can define a "scsi" type pool based on a (v)HBA by it's scsi host name (e.g. "host5" in XML below). E.g. <pool type='scsi'> <name>poolhba0</name> <uuid>e9392370-2917-565e-692b-d057f46512d6</uuid> <capacity unit='bytes'>0</capacity> <allocation unit='bytes'>0</allocation> <available unit='bytes'>0</available> <source> <adapter name='host0'/> </source> <target> <path>/dev/disk/by-path</path> <permissions> <mode>0700</mode> <owner>0</owner> <group>0</group> </permissions> </target> </pool> Quite nice? yeah, at least it looks so, but the problem is the scsi host number is *unstable* (it can be changed after system rebooting, or kernel module reloading, or a vHBA recreating etc), and thus the "scsi" type pool based on a (v)HBA becomes unstable too. Obviously it doesn't help on the "persistent vHBA" problem. To solve the problems, since libvirt "1.0.5", we introduced new XML schema to indicate the (v)HBA. 
An example of the XML: <pool type='scsi'> <name>poolvhba0</name> <uuid>e9392370-2917-565e-692b-d057f46512d6</uuid> <source> <adapter type='fc_host' parent='scsi_host5' wwnn='20000000c9831b4b' wwpn='10000000c9831b4b'/> </source> <target> <path>/dev/disk/by-path</path> <permissions> <mode>0700</mode> <owner>0</owner> <group>0</group> </permissions> </target> </pool> It allows to define a "scsi" type pool based on either a HBA or a vHBA. For HBA, "parent" attribute can be omitted. For vHBA, if "parent" is not specified, libvirt will pick up the first HBA which supports vHBA, and doesn't exceed the maximum vports it supports, automatically. For the pool based on a vHBA, When the pool is starting, libvirt will check if the specified vHBA (wwnn:wwpn) is existing on host or not, if it doesn't exist yet, libvirt will create it automatically. When the pool is being stopped, the vHBA is destroyed. But since storage driver supports the persistent config, one can easily gets the vHBA with same "wwnn:wwpn" in next starting (Don't scream if your pool is transient). It's not the end if you want to get the vHBA created automatically after system rebooting, you will need to set the pool as "autostart": # virsh pool-autostart poolvhba0 One might be curious about why not to support persistent config for node device driver, and support to create persistent vHBA there. One of the reason is that it will be duplicate with what storage pool does. And another reason (the important one) is we want to assiciate the libvirt storage pool/volume with domain (see section "Use LUN for guest" below). 5) How to find out the LUN's path If you have defined the "scsi" type pool based on the (v)HBA, it's simple to lookup what LUNs attached to the (v)HBA by virsh command "vol-list", e.g. 
# virsh vol-list poolvhba0 --details Name Path Type Capacity Allocation -------------------------------------------------------------------------------------------------------- unit:0:2:0 /dev/disk/by-path/pci-0000:04:00.1-fc-0x203500a0b85ad1d7-lun-0 block 20.01 GiB 20.01 GiB If you have not defined a "scsi" type pool based on the (v)HBA, you can find it out (v)HBA by either virsh command "nodedev-list --tree", or iterating sysfs manually. To find out the LUNs by virsh command "nodedev-list" (irrelevant ouputs are omitted): # virsh nodedev-list --tree +- pci_0000_00_0d_0 | | | +- pci_0000_04_00_0 | | | | | +- scsi_host4 | | | +- pci_0000_04_00_1 | | | +- scsi_host5 | | | +- scsi_host7 | +- scsi_target5_0_0 | | | | | +- scsi_5_0_0_0 | | | +- scsi_target5_0_1 | | | | | +- scsi_5_0_1_0 | | | +- scsi_target5_0_2 | | | | | +- scsi_5_0_2_0 | | | | | +- block_sdb_3600a0b80005adb0b0000ab2d4cae9254 | | | +- scsi_target5_0_3 | | | +- scsi_5_0_3_0 "scsi_host5" is an HBA on my host, it has a LUN named "block_sdb_3600a0b80005adb0b0000ab2d4cae9254", don't be confused with the naming, it's the naming style libvirt uses, meaningful only for libvirt. It indicates the LUN has a short device path "/dev/sdb", and a ID "3600a0b80005adb0b0000ab2d4cae9254": # ls /dev/disk/by-id/ | grep 3600a0b80005adb0b0000ab2d4cae9254 scsi-3600a0b80005adb0b0000ab2d4cae9254 To manually find the LUNs of a (v)HBA: First, you need to iterate over all the directores begins with the SCSI scsi host number of the v(HBA) under "/sys/bus/scsi/devices". E.g. I will look up the LUNs of the HBA with SCSI host number 5 on my host: # ls /sys/bus/scsi/devices/5:* -d /sys/bus/scsi/devices/5:0:0:0 /sys/bus/scsi/devices/5:0:1:0 /sys/bus/scsi/devices/5:0:2:0 /sys/bus/scsi/devices/5:0:3:0 # ls /sys/bus/scsi/devices/5\:0\:3\:0/block/sdc It means scsi_host5 has a LUN attached with device name "sdc" on address "5:0:3:0". 
  # ls /sys/bus/scsi/devices/5\:0\:1\:0/ | grep block
  device_blocked

This means scsi_host5 doesn't have a LUN attached at address "5:0:1:0".

A device name like "sdc" is not stable. To find the stable path, look for the symbolic link that points to the device name. E.g.

  # ls -l /dev/disk/by-path/
  lrwxrwxrwx. 1 root root  9 Sep 10 22:28 pci-0000:00:07.0-scsi-0:0:0:0 -> ../../sda
  lrwxrwxrwx. 1 root root 10 Sep 10 22:28 pci-0000:00:07.0-scsi-0:0:0:0-part1 -> ../../sda1
  lrwxrwxrwx. 1 root root  9 Sep 10 22:28 pci-0000:04:00.1-fc-0x203400a0b85ad1d7-lun-0 -> ../../sdc

Then "/dev/disk/by-path/pci-0000:04:00.1-fc-0x203400a0b85ad1d7-lun-0" is the stable path of the LUN attached at address "5:0:3:0". Of course, you can use a similar method to get the "by-id | by-uuid | by-label" stable paths.

6) Use the LUN for a guest

Since libvirt "1.0.5", a storage volume can be used as a disk source through two new attributes ("pool" and "volume") of the disk "<source>" element. E.g.

  <disk type='volume' device='disk'>
    <driver name='qemu' type='raw'/>
    <source pool='poolvhba0' volume='unit:0:2:0'/>
    <target dev='hda' bus='ide'/>
  </disk>

There are lots of advantages to doing so. Since the main purpose of this document is "how to use", I will only mention two here to persuade you to use it. First, you don't need to look up the LUN's path yourself. Second, assume you want to migrate a domain which uses a LUN attached to a vHBA: do you want to create the vHBA manually on the target host? With the pool, you can simply define/start a pool with the same config on the target host. So, if your libvirt is "1.0.5" or newer, we recommend that you define a "scsi" type pool based on the (v)HBA, and use the "pool/volume" names to use the LUN as a disk source.

You can either use the LUN as a qemu emulated disk, or pass it through to the guest. To use it as a qemu emulated disk, specify the "device" attribute as "device='disk|cdrom|floppy'". E.g.
  <disk type='volume' device='disk'>
    <driver name='qemu' type='raw'/>
    <source pool='blk-pool0' volume='blk-pool0-vol0'/>
    <target dev='hda' bus='ide'/>
  </disk>

Or (using the LUN's path directly)

  <disk type='block' device='disk'>
    <driver name='qemu' type='raw'/>
    <source dev='/dev/disk/by-path/pci-0000:04:00.1-fc-0x203400a0b85ad1d7-lun-0'/>
    <target dev='sda' bus='scsi'/>
  </disk>

To pass the LUN through, specify the "device" attribute as "device='lun'", e.g.

  <disk type='block' device='lun'>
    <driver name='qemu' type='raw'/>
    <source dev='/dev/disk/by-path/pci-0000:04:00.1-fc-0x203400a0b85ad1d7-lun-0'/>
    <target dev='sda' bus='scsi'/>
  </disk>

7) Future work

  * NPIV based SCSI host passthrough

    That's what users ask for: how to pass a (v)HBA through to a guest?

  * Expose vendor information, LUN's path, state of (v)HBA in its XML

  * Maybe a virsh command to simplify vHBA creation with options

[1] http://www.linuxtopia.org/online_books/rhel6/rhel_6_virtualization/rhel_6...

Regards,
Osier

11 years, 10 months


[libvirt] [v0.9.12-maint v2 00/12] Debian's 0.9.12 patches

by Guido Günther

These are the patches Debian is currently carrying on 0.9.12. Most are straight cherry-picks. Since we're maintaining 0.9.12 for our current stable release I'm happy to push these to v0.9.12-maint.

Daniel P. Berrange (2):
  Don't ignore return value of qemuProcessKill
  Fix race condition when destroying guests

Eric Blake (1):
  build: fix virnetlink on glibc 2.11

Jiri Denemark (3):
  daemon: Fix crash in virTypedParameterArrayClear
  Revert "rpc: Discard non-blocking calls only when necessary"
  qemu: Add support for -no-user-config

Luca Tettamanti (1):
  Make sure regfree is called close to it's usage

Martin Kletzander (1):
  security: Fix libvirtd crash possibility

Peter Krempa (4):
  qemu: Fix off-by-one error while unescaping monitor strings
  rpc: Fix crash on error paths of message dispatching
  conf: Remove callback from stream when freeing entries in console hash
  conf: Remove console stream callback only when freeing console helper

 cfg.mk                                |   3 +-
 daemon/remote.c                       |  16 +-
 src/conf/virconsole.c                 |  13 ++
 src/qemu/qemu_capabilities.c          |   7 +-
 src/qemu/qemu_capabilities.h          |   1 +
 src/qemu/qemu_command.c               |  11 +-
 src/qemu/qemu_driver.c                |  21 ++-
 src/qemu/qemu_monitor.c               |  11 +-
 src/rpc/virnetclient.c                |  21 +--
 src/rpc/virnetserverclient.c          |   3 +
 src/rpc/virnetserverprogram.c         |  11 +-
 src/storage/storage_backend_logical.c |   5 +-
 src/util/virnetlink.h                 |   2 +
 tests/qemuhelpdata/qemu-1.1           | 268 ++++++++++++++++++++++++++++++++++
 tests/qemuhelpdata/qemu-1.1-device    | 160 ++++++++++++++++++++
 tests/qemuhelptest.c                  |  75 ++++++++++
 16 files changed, 586 insertions(+), 42 deletions(-)
 create mode 100644 tests/qemuhelpdata/qemu-1.1
 create mode 100644 tests/qemuhelpdata/qemu-1.1-device

-- 
1.8.4.rc3

11 years, 10 months


[libvirt] [PATCH] LXC: don't try to mount selinux filesystem when user namespace enabled

by Gao feng

Right now we mount selinuxfs even when user namespace is enabled, and ignore the error. But we shouldn't ignore these errors when user namespace is not enabled.

This patch skips mounting selinuxfs when user namespace is enabled.

Signed-off-by: Gao feng <gaofeng(a)cn.fujitsu.com>
---
 src/lxc/lxc_container.c | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/src/lxc/lxc_container.c b/src/lxc/lxc_container.c
index 661ac52..84b1b57 100644
--- a/src/lxc/lxc_container.c
+++ b/src/lxc/lxc_container.c
@@ -797,7 +797,7 @@ static int lxcContainerMountBasicFS(bool userns_enabled)
 
 #if WITH_SELINUX
         if (STREQ(mnts[i].src, SELINUX_MOUNT) &&
-            !is_selinux_enabled())
+            (!is_selinux_enabled() || userns_enabled))
             continue;
 #endif
 
@@ -814,12 +814,6 @@ static int lxcContainerMountBasicFS(bool userns_enabled)
         VIR_DEBUG("Mount %s on %s type=%s flags=%x, opts=%s",
                   srcpath, mnts[i].dst, mnts[i].type, mnts[i].mflags, mnts[i].opts);
         if (mount(srcpath, mnts[i].dst, mnts[i].type, mnts[i].mflags, mnts[i].opts) < 0) {
-#if WITH_SELINUX
-            if (STREQ(mnts[i].src, SELINUX_MOUNT) &&
-                (errno == EINVAL || errno == EPERM))
-                continue;
-#endif
-
             virReportSystemError(errno,
                                  _("Failed to mount %s on %s type %s flags=%x opts=%s"),
                                  srcpath, mnts[i].dst, NULLSTR(mnts[i].type),
-- 
1.8.3.1
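As a reading aid, the behaviour after this patch can be summarised as a small decision function. This is a Python sketch of the logic only, not libvirt code; selinuxfs_outcome and its arguments are hypothetical names chosen for illustration.

```python
import errno


def selinuxfs_outcome(selinux_enabled, userns_enabled, mount_errno=None):
    """Return 'skipped', 'mounted' or 'fatal' for the selinuxfs entry.

    mount_errno is the errno a hypothetical mount() call would fail
    with, or None if the mount would succeed.
    """
    # After the patch the mount is not attempted at all when SELinux is
    # disabled or when a user namespace is in use.
    if not selinux_enabled or userns_enabled:
        return "skipped"
    # The mount is attempted. Any failure is now reported, including
    # EINVAL and EPERM, which were silently ignored before the patch.
    if mount_errno is not None:
        return "fatal"
    return "mounted"
```

With a user namespace the entry is skipped before any mount attempt, so the old special case for EINVAL and EPERM is no longer needed.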

11 years, 10 months


[libvirt] [PATCH] qemu: Fix checking of guest ABI compatibility when reverting snapshots

by Peter Krempa

When reverting a live internal snapshot with a live guest the ABI compatibility check was comparing a "migratable" definition with a normal one. This resulted in the check failing with:

  revert requires force: Target device address type none does not match source pci

This patch generates a "migratable" definition from the actual one to check against the definition from the snapshot to avoid this problem.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1006886
---
 src/qemu/qemu_driver.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index bbf2d23..ae1948f 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -13037,6 +13037,7 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot,
     qemuDomainObjPrivatePtr priv;
     int rc;
     virDomainDefPtr config = NULL;
+    virDomainDefPtr migratableDef = NULL;
     virQEMUDriverConfigPtr cfg = NULL;
     virCapsPtr caps = NULL;
 
@@ -13151,8 +13152,13 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot,
          * to have finer control.  */
         if (virDomainObjIsActive(vm)) {
             /* Transitions 5, 6, 8, 9 */
-            /* Check for ABI compatibility.  */
-            if (config && !virDomainDefCheckABIStability(vm->def, config)) {
+            /* Check for ABI compatibility. We need to do this check against
+             * the migratable XML or it will always fail otherwise */
+            if (!(migratableDef = qemuDomainDefCopy(driver, vm->def,
+                                                    VIR_DOMAIN_XML_MIGRATABLE)))
+                goto cleanup;
+
+            if (config && !virDomainDefCheckABIStability(migratableDef, config)) {
                 virErrorPtr err = virGetLastError();
 
                 if (!(flags & VIR_DOMAIN_SNAPSHOT_REVERT_FORCE)) {
@@ -13357,6 +13363,7 @@ cleanup:
     }
     if (vm)
         virObjectUnlock(vm);
+    virDomainDefFree(migratableDef);
     virObjectUnref(caps);
     virObjectUnref(cfg);
 
-- 
1.8.3.2

11 years, 10 months
