[libvirt] [PATCHv2 00/17] Support for <interface type='hostdev'>
by Laine Stump
This series of patches enhances the <interface device to support a
sort of "intelligent hostdev", i.e. PCI passthrough where device-type
specific initialization is done prior to assigning the device to the
guest, in particular to allow setting the MAC address and do 802.1QbX
setup for network devices.
The first posting of this patch only supported parsing and formatting
of these devices. This version also supports them in persistent
config, as well as hotplug (both persistent and live-only).
The only piece that isn't in this patchset (because it is coming from
another author) is the code that actually
Rather than adding all of the device-type specific config to
<hostdev>, this is accomplished through adding a new type of
<interface> element, type='hostdev'. When an interface is
type='hostdev' the following is changed:
* in the toplevel device, the managed attribute can be specified
(with identical results as when it's specified in a <hostdev>
* The <source> element can specify a pci address or usb address,
just as can be done in <hostdev>. One notable difference is that
the type of the address is specified directly in the source
<address> element, rather than as an attribute of the toplevel
device (that's how it's done for <hostdev>, but for <interface>,
the toplevel element's type attribute is already used).
NB: a type=hostdev interface will reside in both the interface list
(for configuration and memory management) and hostdev list (for PCI
attach/detach, and tracking of which devices are assigned)).
This entire series is available on gitorious:
git://gitorious.org/~laine/libvirt/laine-staging.git
in the "passthrough8" branch.
Patches 1-7, 9-12, and 15-16 are just setup for the new
functionality - they reorder and refactor existing code to allow
greater re-use of existing code and easier plugin of the new
code. Those marked with "X" are unchanged from V1 (as far as my
git logs tell me). Those marked "+" are new patches that weren't
in V1.
+ [PATCH 01/17] conf: add missing device types to
[PATCH 02/17] conf: relocate virDomainDeviceDef and
X [PATCH 03/17] conf: reorder static functions in domain_conf.c
+ [PATCH 04/17] qemu: rename virDomainDeviceInfoPtr variables to avoid
+ [PATCH 05/17] conf: add device pointer to args of
[PATCH 06/17] conf: make hostdev info a separate object
X [PATCH 07/17] conf: HostdevDef parse/format helper functions
+ [PATCH 09/17] conf: put subsys part of virDomainHostdevDef into its
+ [PATCH 10/17] conf: hostdev utility functions
+ [PATCH 11/17] qemu: re-order functions in qemu_hotplug.c
+ [PATCH 12/17] qemu: refactor hotplug detach of hostdevs
+ [PATCH 15/17] conf: change virDomainNetRemove from static to global
+ [PATCH 16/17] qemu: use virDomainNetRemove instead of inline code
Patch 8 is just a couple lines:
[PATCH 08/17] conf: give each hostdevdef a parent pointer
[PATCH 13/17] conf: parse/format type='hostdev' network interfaces
+ [PATCH 14/17] qemu: support type='hostdev' network devices at domain start
+ [PATCH 17/17] qemu: support type=hostdev network device live hotplug
12 years, 7 months
Re: [libvirt] [PATCH] Add migration APIs for libxl driver
by Chun Yan Liu
> I didn't get a chance to test this yet, but have some initial review
> comments.
>
>> Signed-off-by: Chunyan Liu
>> ---
>> src/libxl/libxl_driver.c | 617
>> ++++++++++++++++++++++++++++++++++++++++++++++
>> src/libxl/libxl_driver.h | 17 ++-
>> 2 files changed, 632 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/libxl/libxl_driver.c b/src/libxl/libxl_driver.c
>> index d5fa64a..4037635 100644
>> --- a/src/libxl/libxl_driver.c
>> +++ b/src/libxl/libxl_driver.c
>> @@ -30,6 +30,12 @@
>> #include
>> #include
>> #include
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include
>>
>> #include "internal.h"
>> #include "logging.h"
>> @@ -60,6 +66,13 @@
>> #define XEN_SCHED_CREDIT_NPARAM 2
>>
>> static libxlDriverPrivatePtr libxl_driver = NULL;
>> +static virThread migrate_receive_thread;
>>
>
> This prevents receiving concurrent migrations.
Yes. It is a problem. Defined as "static" is to be used in Begin3 function virThreadCreate and in Finish3() function virThreadJoin, but actually the thread will exit when receiving data completed or meets error, so virThreadJoin doesn't have much effect here. If a call of virThreadJoin is not needed, then can specify migrate_receive_thread as a local variable.
>
>> +
>> +typedef struct migrate_receive_args {
>> + virConnectPtr conn;
>> + virDomainObjPtr vm;
>> + int sockfd;
>> +} migrate_receive_args;
>>
>
> If there is a future need to extend this structure, will it cause
> incompatibility issues between a source with the extensions and a
> destination without? Or vise versa?
It will if send logic and receive logic doesn't match. Maybe need to add some extra check, but seems no better way to completely avoid that?
>>
>> /* Function declarations */
>> static int
>> @@ -1095,6 +1108,17 @@ libxlClose(virConnectPtr conn ATTRIBUTE_UNUSED)
>> return 0;
>> }
>>
>> +static int
>> +libxlSupportsFeature(virConnectPtr conn ATTRIBUTE_UNUSED, int feature)
>> +{
>> + switch (feature) {
>> + case VIR_DRV_FEATURE_MIGRATION_V3:
>> + return 1;
>> + default:
>> + return 0;
>> + }
>> +}
>> +
>> static const char *
>> libxlGetType(virConnectPtr conn ATTRIBUTE_UNUSED)
>> {
>> @@ -3842,12 +3866,600 @@ libxlIsAlive(virConnectPtr conn ATTRIBUTE_UNUSED)
>> return 1;
>> }
>>
>> +static int libxlReadFixedMessage(int fd, const void *msg, int msgsz)
>> +{
>> + char buf[msgsz];
>> +
>> + if (saferead(fd, buf, msgsz) != msgsz || memcmp(buf, msg, msgsz)) {
>> + return -1;
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +static int doParseURI(const char *uri, char **p_hostname, int *p_port)
>> +{
>> + char *p, *hostname;
>> + char port[16] = "0";
>> +
>> + if (strstr (uri, "//")) { /* Full URI. */
>> + virURIPtr uriptr = virURIParse(uri);
>> + if (!uriptr) {
>> + libxlError(VIR_ERR_INVALID_ARG,
>> + _("Invalid URI"));
>> + return -1;
>> + }
>> + if (uriptr->scheme && STRCASENEQ(uriptr->scheme, "xlmigr")) {
>> + libxlError(VIR_ERR_INVALID_ARG,
>> + _("Only xlmigr://"
>> + " migrations are supported by Xen"));
>> + xmlFreeURI(uriptr);
>> + return -1;
>> + }
>>
>
> This is just tcp migration. Any reason for requiring the "xlmigr" scheme?
It's not a necessary. Just refer to xen_driver syntax, migration uri could be [xlmigr://]IP[:Port]. We can also define libxl own syntax, migration uri like IP[:Port].
>> + if (!uriptr->server) {
>> + libxlError(VIR_ERR_INVALID_ARG,
>> + _("A hostname must be"
>> + " specified in the URI"));
>> + xmlFreeURI(uriptr);
>> + return -1;
>> + }
>> + hostname = strdup(uriptr->server);
>>
>
> I think it would be better to rework this, and other uses of strdup()
> and snprintf() in this function, to use virAsprintf().
>
>> + if (!hostname) {
>> + virReportOOMError();
>> + xmlFreeURI(uriptr);
>> + return -1;
>> + }
>> + if (uriptr->port)
>> + snprintf(port, sizeof(port), "%d", uriptr->port);
>> + xmlFreeURI(uriptr);
>> + }
>> + else if ((p = strrchr(uri, ':')) != NULL) { /* "hostname:port" */
>> + int port_nr, n;
>> +
>> + if (virStrToLong_i(p+1, NULL, 10, &port_nr) < 0) {
>> + libxlError(VIR_ERR_INVALID_ARG,
>> + _("Invalid port number"));
>> + return -1;
>> + }
>> + snprintf(port, sizeof(port), "%d", port_nr);
>> +
>> + /* Get the hostname. */
>> + n = p - uri; /* n = Length of hostname in bytes. */
>> + hostname = strdup(uri);
>> + if (!hostname) {
>> + virReportOOMError();
>> + return -1;
>> + }
>> + hostname[n] = '\0';
>> + }
>> + else {/* "hostname" (or IP address) */
>> + hostname = strdup(uri);
>> + if (!hostname) {
>> + virReportOOMError();
>> + return -1;
>> + }
>> + }
>> + *p_hostname = hostname;
>> + *p_port = atoi(port);
>> + return 0;
>> +}
>> +
>> +static bool libxlMigrationFromIsAllowed(libxlDriverPrivatePtr driver,
>> virDomainObjPtr vm)
>> +{
>> + /* to be extended */
>> + VIR_DEBUG("driver=%p, vm=%p", driver, vm);
>> + return true;
>> +}
>> +
>> +static bool libxlMigrationToIsAllowed(libxlDriverPrivatePtr driver,
>> virDomainDefPtr def)
>> +{
>> + /* to be extended */
>> + VIR_DEBUG("driver=%p, def=%p", driver, def);
>> + return true;
>> +}
>>
>
> I think these functions (and their call sites) can be added in a future
> patch that provides an implementation.
>
>> +
>> +static char *
>> +libxlDomainMigrateBegin3(virDomainPtr domain,
>> + const char *xmlin,
>> + char **cookieout ATTRIBUTE_UNUSED,
>> + int *cookieoutlen ATTRIBUTE_UNUSED,
>> + unsigned long flags,
>> + const char *dname ATTRIBUTE_UNUSED,
>> + unsigned long resource ATTRIBUTE_UNUSED)
>> +{
>> + libxlDriverPrivatePtr driver = domain->conn->privateData;
>> + virDomainObjPtr vm;
>> + virDomainDefPtr def = NULL;
>> + char *xml = NULL;
>> +
>> + virCheckFlags(LIBXL_MIGRATION_FLAGS, NULL);
>> +
>> + libxlDriverLock(driver);
>> + vm = virDomainFindByUUID(&driver->domains, domain->uuid);
>> + if (!vm) {
>> + char uuidstr[VIR_UUID_STRING_BUFLEN];
>> + virUUIDFormat(domain->uuid, uuidstr);
>> + libxlError(VIR_ERR_OPERATION_INVALID,
>> + _("no domain with matching uuid '%s'"), uuidstr);
>> + goto cleanup;
>> + }
>> +
>> + if (!virDomainObjIsActive(vm)) {
>> + libxlError(VIR_ERR_OPERATION_INVALID,
>> + _("domain is not running"));
>> + goto cleanup;
>> + }
>> +
>> + if (!libxlMigrationFromIsAllowed(driver, vm)) {
>> + libxlError(VIR_ERR_OPERATION_INVALID,
>> + _("domain cannot do migration"));
>> + goto cleanup;
>> + }
>> +
>> + if (xmlin) {
>> + if (!(def = virDomainDefParseString(driver->caps, xml,
>> + 1 << VIR_DOMAIN_VIRT_XEN,
>> + VIR_DOMAIN_XML_INACTIVE)))
>> + goto cleanup;
>> +
>> + xml = virDomainDefFormat(def, VIR_DOMAIN_XML_SECURE);
>>
>
> Do we need to ensure the name provided in xmlin is the same as def->name?
Now that it supports domain name change, I didn't check it. I noticed qemu driver check that, but I couldn't think of any reason that needs to check?
>> + } else {
>> + xml = virDomainDefFormat(vm->def, VIR_DOMAIN_XML_SECURE);
>> + }
>> +
>> +cleanup:
>> + if (vm)
>> + virDomainObjUnlock(vm);
>> + libxlDriverUnlock(driver);
>> + return xml;
>> +}
>> +
>> +static void doMigrateReceive(void *opaque)
>> +{
>> + migrate_receive_args *data = opaque;
>> + virConnectPtr conn = data->conn;
>> + int sockfd = data->sockfd;
>> + virDomainObjPtr vm = data->vm;
>> + libxlDriverPrivatePtr driver = conn->privateData;
>> + int recv_fd;
>> + struct sockaddr_in new_addr;
>> + socklen_t socklen = sizeof(new_addr);
>> + int len;
>> +
>> + do {
>> + recv_fd = accept(sockfd, (struct sockaddr *)&new_addr, &socklen);
>> + } while(recv_fd < 0 && errno == EINTR);
>> +
>> + VIR_DEBUG("Accepted migration\n");
>>
>
> This should be moved after checking for a valid recv_fd.
>
>> + if (recv_fd < 0) {
>> + VIR_DEBUG("Could not accept migration connection\n");
>>
>
> libxlError?
>
>> + goto cleanup;
>> + }
>> +
>> + len = sizeof(migrate_receiver_banner);
>> + if (safewrite(recv_fd, migrate_receiver_banner, len) != len) {
>> + libxlError(VIR_ERR_OPERATION_FAILED,
>> + _("Failed to write migrate_receiver_banner"));
>> + goto cleanup;
>> + }
>> +
>> + if (libxlVmStart(driver, vm, false, recv_fd) < 0) {
>> + libxlError(VIR_ERR_INTERNAL_ERROR,
>> + _("Failed to restore domain with libxenlight"));
>> + if (!vm->persistent) {
>> + virDomainRemoveInactive(&driver->domains, vm);
>> + vm = NULL;
>> + }
>> + goto cleanup;
>> + }
>> +
>> + len = sizeof(migrate_receiver_ready);
>> + if (safewrite(recv_fd, migrate_receiver_ready, len) != len) {
>> + libxlError(VIR_ERR_OPERATION_FAILED,
>> + _("Failed to write migrate_receiver_ready"));
>> + }
>> +
>> +cleanup:
>> + if (VIR_CLOSE(recv_fd) < 0)
>> + virReportSystemError(errno, "%s", _("cannot close recv_fd"));
>> + if (VIR_CLOSE(sockfd) < 0)
>> + virReportSystemError(errno, "%s", _("cannot close sockfd"));
>> + if (vm)
>> + virDomainObjUnlock(vm);
>> + VIR_FREE(opaque);
>> + return;
>> +}
>> +
>> +static int doMigrateSend(virDomainPtr dom, unsigned long flags, int
>> sockfd)
>> +{
>> + libxlDriverPrivatePtr driver = dom->conn->privateData;
>> + virDomainObjPtr vm;
>> + libxlDomainObjPrivatePtr priv;
>> + libxl_domain_suspend_info suspinfo;
>> + virDomainEventPtr event = NULL;
>> + int live = 0;
>> + int ret = -1;
>> +
>> + if (flags & VIR_MIGRATE_LIVE)
>> + live = 1;
>> +
>> + vm = virDomainFindByUUID(&driver->domains, dom->uuid);
>> + if (!vm) {
>> + char uuidstr[VIR_UUID_STRING_BUFLEN];
>> + virUUIDFormat(dom->uuid, uuidstr);
>> + libxlError(VIR_ERR_OPERATION_INVALID,
>> + _("no domain with matching uuid '%s'"),
>> uuidstr);
>> + goto cleanup;
>> + }
>> +
>> + priv = vm->privateData;
>> +
>> + /* read fixed message from dest (ready to receive) */
>> + if (libxlReadFixedMessage(sockfd, migrate_receiver_banner,
>> + sizeof(migrate_receiver_banner))) {
>> + goto cleanup;
>> + }
>> +
>> + /* send suspend data */
>> + memset(&suspinfo, 0, sizeof(suspinfo));
>> + if (live == 1)
>> + suspinfo.flags |= XL_SUSPEND_LIVE;
>> + if (libxl_domain_suspend(&priv->ctx, &suspinfo, vm->def->id, sockfd)
>> != 0) {
>> + libxlError(VIR_ERR_INTERNAL_ERROR,
>> + _("Failed to save domain '%d' with libxenlight"),
>> + vm->def->id);
>> + goto cleanup;
>> + }
>> +
>> + /* read fixed message from dest (receive completed) */
>> + if (libxlReadFixedMessage(sockfd, migrate_receiver_ready,
>> + sizeof(migrate_receiver_ready))) {
>> + /* Src side should be resumed, but for ret < 0, virsh won't call
>> Src side
>> + * Confirm3, handle it here.
>> + */
>> + libxl_domain_resume(&priv->ctx, vm->def->id);
>> + goto cleanup;
>> + }
>> +
>> + ret = 0;
>> +
>> +cleanup:
>> + if (vm)
>> + virDomainObjUnlock(vm);
>> + if (event)
>> + libxlDomainEventQueue(driver, event);
>> + return ret;
>> +}
>> +
>> +static int
>> +libxlDomainMigratePrepare3(virConnectPtr dconn,
>> + const char *cookiein ATTRIBUTE_UNUSED,
>> + int cookieinlen ATTRIBUTE_UNUSED,
>> + char **cookieout ATTRIBUTE_UNUSED,
>> + int *cookieoutlen ATTRIBUTE_UNUSED,
>> + const char *uri_in,
>> + char **uri_out,
>> + unsigned long flags,
>> + const char *dname,
>> + unsigned long resource ATTRIBUTE_UNUSED,
>> + const char *dom_xml)
>> +{
>> + libxlDriverPrivatePtr driver = dconn->privateData;
>> + virDomainDefPtr def = NULL;
>> + virDomainObjPtr vm = NULL;
>> + int port = 0;
>> + int sockfd = -1;
>> + struct sockaddr_in addr;
>> + migrate_receive_args *args;
>> + int ret = -1;
>> +
>> + virCheckFlags(LIBXL_MIGRATION_FLAGS, -1);
>> +
>> + libxlDriverLock(driver);
>> + if (!dom_xml) {
>> + libxlError(VIR_ERR_OPERATION_INVALID,
>> + _("no domain XML passed"));
>> + goto cleanup;
>> + }
>> +
>> + def = virDomainDefParseString(driver->caps, dom_xml,
>> + 1 << VIR_DOMAIN_VIRT_XEN,
>> + VIR_DOMAIN_XML_INACTIVE);
>> +
>> + if (!libxlMigrationToIsAllowed(driver, def)) {
>> + libxlError(VIR_ERR_OPERATION_INVALID,
>> + _("cannot migrate domain to this host"));
>> + goto cleanup;
>> + }
>> +
>> + /* Target domain name, maybe renamed. */
>> + if (dname) {
>> + def->name = strdup(dname);
>>
>
> This leaks the original name.
I didn't save original name as qemu driver does. In a remote migration, seems no place needs the original name again?
>
Thanks,
Chunyan
12 years, 7 months
[libvirt] [PATCH 0/4] Support mac and port profile for <interface type='hostdev'>
by Roopa Prabhu
v2:
changes include:
- feedback from stefan for 802.1Qbg. Code now prints an error if virtualport is
specified for 802.1Qbg on an interface of type hostdev
- feedback from laine for non-sriov devices. Interface type hostdev for non-sriov devices
is not supported.
v1: https://www.redhat.com/archives/libvir-list/2012-March/msg00015.html
This patch series is based on laines patches to support <interface type='hostdev'>.
https://www.redhat.com/archives/libvir-list/2012-February/msg01126.html
It support to set mac and port profile on an interface of type hostdev.
* If virtualport is specified, the existing virtual port functions are
called to set mac, vlan and port profile.
* If virtualport is not specified and device is a sriov virtual function,
- mac is set using IFLA_VF_MAC
* If virtualport is not specified and device is a non-sriov virtual function,
- mac is set using existing SIOCGIFHWADDR (This requires that the
netdev be present on the host before starting the VM)
This series implements the below :
01/4 pci: Add two new pci util pciDeviceGetVirtualFunctionInfo and pciConfigAddressToSysfsFile
02/4 virtnetdev: Add support functions for mac and portprofile associations on a hostdev
03/4 virnetdevvportprofile: Changes to support portprofiles for hostdevs
04/4 qemu_hostdev: Add support to install port profile and mac address on hostdev
Stefan Berger is CC'ed for 802.1Qbg changes in patch 03/4. Current code for
802.1Qbg uses macvtap ifname. And for network interfaces with type=hostdev a
macvtap ifname does not exist. This patch just adds a null check for ifname in
802.1Qbg port profile handling code.
12 years, 7 months
[libvirt] [PATCH] util: fail attempts to use same mac address for guest and tap
by Laine Stump
This patch is in response to:
https://bugzilla.redhat.com/show_bug.cgi?id=798467
If a guest's tap device is created using the same MAC address the
guest uses for its own network card (which connects to the tap
device), the Linux kernel will log the following message and traffic
will not pass:
kernel: vnet9: received packet with own address as source address
This patch disallows MAC addresses with a first byte of 0xFE, but only in
the case that the MAC address is used for a guest interface that's
connected by way of a standard tap device. (In other words, the
validation is done at runtime at the same place the MAC address is
modified for the tap device, rather than when mac address is parsed,
the idea being that it is then we know for sure the address will be
problematic.)
---
src/util/virnetdevtap.c | 22 ++++++++++++++++++++--
1 files changed, 20 insertions(+), 2 deletions(-)
diff --git a/src/util/virnetdevtap.c b/src/util/virnetdevtap.c
index fb0a8d2..b19c006 100644
--- a/src/util/virnetdevtap.c
+++ b/src/util/virnetdevtap.c
@@ -1,5 +1,5 @@
/*
- * Copyright (C) 2007-2011 Red Hat, Inc.
+ * Copyright (C) 2007-2012 Red Hat, Inc.
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
@@ -45,6 +45,10 @@
#define VIR_FROM_THIS VIR_FROM_NONE
+#define virNetDevTapError(code, ...) \
+ virReportErrorHelper(VIR_FROM_NONE, code, __FILE__, \
+ __FUNCTION__, __LINE__, __VA_ARGS__)
+
/**
* virNetDevProbeVnetHdr:
* @tapfd: a tun/tap file descriptor
@@ -293,8 +297,22 @@ int virNetDevTapCreateInBridgePort(const char *brname,
* device before we set our static MAC.
*/
memcpy(tapmac, macaddr, VIR_MAC_BUFLEN);
- if (!(flags & VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE))
+ if (!(flags & VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE)) {
+ if (macaddr[0] == 0xFE) {
+ /* For normal use, the tap device's MAC address cannot
+ * match the MAC address used by the guest. This results
+ * in "received packet on vnetX with own address as source
+ * address" error logs from the kernel.
+ */
+ virNetDevTapError(VIR_ERR_CONFIG_UNSUPPORTED,
+ "Unable to use MAC address starting with "
+ "reserved value 0xFE - '%02X:%02X:%02X:%02X:%02X:%02X' - ",
+ macaddr[0], macaddr[1], macaddr[2],
+ macaddr[3], macaddr[4], macaddr[5]);
+ goto error;
+ }
tapmac[0] = 0xFE; /* Discourage bridge from using TAP dev MAC */
+ }
if (virNetDevSetMAC(*ifname, tapmac) < 0)
goto error;
--
1.7.7.6
12 years, 7 months
[libvirt] [PATCH] util: eliminate crash in virNetDevMacVLanCreateWithVPortProfile
by Laine Stump
From: root <root(a)vlap.laine.org>
Commit 723d5c (added after the release of 0.9.10) adds a
NetlinkEventClient for each interface sent to
virNetDevMacVLanCreateWithVPortProfile. This should only be done if
the interface actually *has* a virtPortProfile, otherwise the event
handler would be a NOP. The bigger problem is that part of the setup
to create the NetlinkEventClient is to do a memcpy of virtPortProfile
- if it's NULL, this triggers a segv.
This patch just qualifies the code that adds the client - if
virtPortProfile is NULL, it's skipped.
---
src/util/virnetdevmacvlan.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/src/util/virnetdevmacvlan.c b/src/util/virnetdevmacvlan.c
index 57fc69e..e18b149 100644
--- a/src/util/virnetdevmacvlan.c
+++ b/src/util/virnetdevmacvlan.c
@@ -915,7 +915,7 @@ create_name:
goto disassociate_exit;
}
- if (virNetlinkEventServiceIsRunning()) {
+ if (virtPortProfile && virNetlinkEventServiceIsRunning()) {
if (VIR_ALLOC(calld) < 0)
goto memory_error;
if ((calld->cr_ifname = strdup(cr_ifname)) == NULL)
--
1.7.7.6
12 years, 7 months
[libvirt] [PATCH] blockResize: add flag for bytes
by Eric Blake
Qemu supports sizing by bytes; we shouldn't force the user to
round up if they really wanted an unaligned total size.
* include/libvirt/libvirt.h.in (VIR_DOMAIN_BLOCK_RESIZE_BYTES):
New flag.
* src/libvirt.c (virDomainBlockResize): Document it.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockResize): Take
size in bytes.
* src/qemu/qemu_monitor_text.c (qemuMonitorTextBlockResize):
Likewise. Pass bytes, not megabytes, to monitor.
* src/qemu/qemu_driver.c (qemuDomainBlockResize): Implement new
flag.
---
Virsh support will come later, as part of my cleanups to make
virsh handle scaled integers.
include/libvirt/libvirt.h.in | 10 ++++++++++
src/libvirt.c | 17 +++++++++++------
src/qemu/qemu_driver.c | 24 ++++++++++++++----------
src/qemu/qemu_monitor_json.c | 5 +++--
src/qemu/qemu_monitor_text.c | 6 +++---
5 files changed, 41 insertions(+), 21 deletions(-)
diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in
index 58c4366..b2f8f5f 100644
--- a/include/libvirt/libvirt.h.in
+++ b/include/libvirt/libvirt.h.in
@@ -1636,6 +1636,16 @@ int virDomainBlockPeek (virDomainPtr dom,
size_t size,
void *buffer,
unsigned int flags);
+
+/**
+ * virDomainBlockResizeFlags:
+ *
+ * Flags available for virDomainBlockResize().
+ */
+typedef enum {
+ VIR_DOMAIN_BLOCK_RESIZE_BYTES = 1 << 0, /* size in bytes instead of KiB */
+} virDomainBlockResizeFlags;
+
int virDomainBlockResize (virDomainPtr dom,
const char *disk,
unsigned long long size,
diff --git a/src/libvirt.c b/src/libvirt.c
index c2e9733..d98741b 100644
--- a/src/libvirt.c
+++ b/src/libvirt.c
@@ -7601,12 +7601,15 @@ error:
* virDomainBlockResize:
* @dom: pointer to the domain object
* @disk: path to the block image, or shorthand
- * @size: new size of the block image in kilobytes
- * @flags: extra flags; not used yet, so callers should always pass 0
+ * @size: new size of the block image, see below for unit
+ * @flags: bitwise-OR of virDomainBlockResizeFlags
*
- * Note that this call may fail if the underlying virtualization hypervisor
- * does not support it. And this call requires privileged access to the
- * hypervisor.
+ * Resize a block device of domain while the domain is running. If
+ * @flags is 0, then @size is in kibibytes (blocks of 1024); since
+ * 0.9.11, if @flags includes VIR_DOMAIN_BLOCK_RESIZE_BYTES, @size is
+ * in bytes instead. @size is taken directly as the new size.
+ * Depending on the file format, the hypervisor may round up to the
+ * next alignment boundary.
*
* The @disk parameter is either an unambiguous source name of the
* block device (the <source file='...'/> sub-element, such as
@@ -7615,7 +7618,9 @@ error:
* can be found by calling virDomainGetXMLDesc() and inspecting
* elements within //domain/devices/disk.
*
- * Resize a block device of domain while the domain is running.
+ * Note that this call may fail if the underlying virtualization hypervisor
+ * does not support it; this call requires privileged access to the
+ * hypervisor.
*
* Returns: 0 in case of success or -1 in case of failure.
*/
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 2afcc3f..1f57508 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -7348,10 +7348,10 @@ qemuGetSchedulerParameters(virDomainPtr dom,
* like LVM volumes.
*/
static int
-qemuDomainBlockResize (virDomainPtr dom,
- const char *path,
- unsigned long long size,
- unsigned int flags)
+qemuDomainBlockResize(virDomainPtr dom,
+ const char *path,
+ unsigned long long size,
+ unsigned int flags)
{
struct qemud_driver *driver = dom->conn->privateData;
virDomainObjPtr vm;
@@ -7360,7 +7360,7 @@ qemuDomainBlockResize (virDomainPtr dom,
char *device = NULL;
virDomainDiskDefPtr disk = NULL;
- virCheckFlags(0, -1);
+ virCheckFlags(VIR_DOMAIN_BLOCK_RESIZE_BYTES, -1);
if (path[0] == '\0') {
qemuReportError(VIR_ERR_INVALID_ARG,
@@ -7368,11 +7368,15 @@ qemuDomainBlockResize (virDomainPtr dom,
return -1;
}
- if (size > ULLONG_MAX / 1024) {
- qemuReportError(VIR_ERR_INVALID_ARG,
- _("size must be less than %llu"),
- ULLONG_MAX / 1024);
- return -1;
+ /* We prefer operating on bytes. */
+ if ((flags & VIR_DOMAIN_BLOCK_RESIZE_BYTES) == 0) {
+ if (size > ULLONG_MAX / 1024) {
+ qemuReportError(VIR_ERR_INVALID_ARG,
+ _("size must be less than %llu"),
+ ULLONG_MAX / 1024);
+ return -1;
+ }
+ size *= 1024;
}
qemuDriverLock(driver);
diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c
index c0f148b..dc67b4b 100644
--- a/src/qemu/qemu_monitor_json.c
+++ b/src/qemu/qemu_monitor_json.c
@@ -1783,7 +1783,8 @@ cleanup:
return ret;
}
-/* Return 0 on success, -1 on failure, or -2 if not supported. */
+/* Return 0 on success, -1 on failure, or -2 if not supported. Size
+ * is in bytes. */
int qemuMonitorJSONBlockResize(qemuMonitorPtr mon,
const char *device,
unsigned long long size)
@@ -1794,7 +1795,7 @@ int qemuMonitorJSONBlockResize(qemuMonitorPtr mon,
cmd = qemuMonitorJSONMakeCommand("block_resize",
"s:device", device,
- "U:size", size * 1024,
+ "U:size", size,
NULL);
if (!cmd)
return -1;
diff --git a/src/qemu/qemu_monitor_text.c b/src/qemu/qemu_monitor_text.c
index d6f7dac..a7ebfba 100644
--- a/src/qemu/qemu_monitor_text.c
+++ b/src/qemu/qemu_monitor_text.c
@@ -1097,7 +1097,8 @@ int qemuMonitorTextGetBlockExtent(qemuMonitorPtr mon ATTRIBUTE_UNUSED,
return -1;
}
-/* Return 0 on success, -1 on failure, or -2 if not supported. */
+/* Return 0 on success, -1 on failure, or -2 if not supported. Size
+ * is in bytes. */
int qemuMonitorTextBlockResize(qemuMonitorPtr mon,
const char *device,
unsigned long long size)
@@ -1106,8 +1107,7 @@ int qemuMonitorTextBlockResize(qemuMonitorPtr mon,
char *reply = NULL;
int ret = -1;
- if (virAsprintf(&cmd, "block_resize %s %llu",
- device, VIR_DIV_UP(size, 1024)) < 0) {
+ if (virAsprintf(&cmd, "block_resize %s %lluB", device, size) < 0) {
virReportOOMError();
goto cleanup;
}
--
1.7.7.6
12 years, 7 months
[libvirt] [PATCH] qemu: Shared or readonly disks are always safe wrt migration
by Jiri Denemark
No matter what cache mode is used, readonly disks are always safe wrt
migration. Shared disks are required to be readonly or to disable
host-side cache, which makes them safe as well.
---
src/qemu/qemu_migration.c | 8 +++++---
1 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
index 84037e4..5c4297c 100644
--- a/src/qemu/qemu_migration.c
+++ b/src/qemu/qemu_migration.c
@@ -828,10 +828,12 @@ qemuMigrationIsSafe(virDomainDefPtr def)
for (i = 0 ; i < def->ndisks ; i++) {
virDomainDiskDefPtr disk = def->disks[i];
- /* shared && !readonly implies cache=none */
+ /* Our code elsewhere guarantees shared disks are either readonly (in
+ * which case cache mode doesn't matter) or used with cache=none */
if (disk->src &&
- disk->cachemode != VIR_DOMAIN_DISK_CACHE_DISABLE &&
- (disk->cachemode || !disk->shared || disk->readonly)) {
+ !disk->shared &&
+ !disk->readonly &&
+ disk->cachemode != VIR_DOMAIN_DISK_CACHE_DISABLE) {
int cfs;
if ((cfs = virStorageFileIsClusterFS(disk->src)) == 1)
continue;
--
1.7.8.5
12 years, 7 months
[libvirt] [PATCH v2] libvirt-guests: Add parallel startup and shutdown of guests
by Peter Krempa
With this patch, it's possible to shut down guests in parallel. Parallel
startup was possible before, but this functionality was not documented
properly.
To enable parallel startup set the START_DELAY to 0.
Parallel shutdown has a configurable parameter PARALLEL_SHUTDOWN that
defines the number of machines being shut down in parallel. Enabling
this feature changes the semantics of SHUTDOWN_TIMEOUT parameter that is
applied as a cumulative timeout to shutdown all guests on a URI.
---
Diff to v1:
-changed "domains" to "guests"
-used efficient code to avoid subshells
-remove unused helper functions
-added error message when loosing track of guest in case of error
-added docs that setting of SHUTDOWN_TIMEOUT is needed with shutdown action
-added initialisation of PARALLEL_SHUTDOWN
-fixed output redirections
-fixed typo in sysconfig file
tools/libvirt-guests.init.sh | 116 +++++++++++++++++++++++++++++++++++++++---
tools/libvirt-guests.sysconf | 12 ++++-
2 files changed, 119 insertions(+), 9 deletions(-)
diff --git a/tools/libvirt-guests.init.sh b/tools/libvirt-guests.init.sh
index c867ece..141d7fa 100644
--- a/tools/libvirt-guests.init.sh
+++ b/tools/libvirt-guests.init.sh
@@ -42,6 +42,7 @@ URIS=default
ON_BOOT=start
ON_SHUTDOWN=suspend
SHUTDOWN_TIMEOUT=0
+PARALLEL_SHUTDOWN=0
START_DELAY=0
BYPASS_CACHE=0
@@ -273,6 +274,102 @@ shutdown_guest()
fi
}
+# shutdown_guest_async URI GUEST
+# Start a ACPI shutdown of GUEST on URI. This function returns after the command
+# was issued to libvirt to allow parallel shutdown.
+shutdown_guest_async()
+{
+ uri=$1
+ guest=$2
+
+ name=$(guest_name "$uri" "$guest")
+ eval_gettext "Starting shutdown on guest: \$name"
+ echo
+ retval run_virsh "$uri" shutdown "$guest" > /dev/null
+}
+
+# guest_count GUEST_LIST
+# Returns number of guests in GUEST_LIST
+guest_count()
+{
+ set -- $1
+ echo $#
+}
+
+# check_guests_shutdown URI GUESTS
+# check if shutdown is complete on guests in "GUESTS" and returns only
+# guests that are still shutting down
+check_guests_shutdown()
+{
+ uri=$1
+ guests=$2
+
+ guests_up=
+ for guest in $guests; do
+ if ! guest_is_on "$uri" "$guest" >/dev/null 2>&1; then
+ eval_gettext "Failed to determine state of guest: \$guest. Not tracking it anymore."
+ echo
+ continue
+ fi
+ if "$guest_running"; then
+ guests_up="$guests_up $guest"
+ fi
+ done
+ echo "$guests_up"
+}
+
+# print_guests_shutdown URI BEFORE AFTER
+# Checks for differences in the lists BEFORE and AFTER and prints
+# a shutdown complete notice for guests that have finished
+print_guests_shutdown()
+{
+ uri=$1
+ before=$2
+ after=$3
+
+ for guest in $before; do
+ case " $after " in
+ *" $guest "*) continue;;
+ esac
+
+ name=$(guest_name "$uri" "$guest")
+ eval_gettext "Shutdown of guest \$name complete."
+ echo
+ done
+}
+
+# shutdown_guests_parallel URI GUESTS
+# Shutdown guests GUESTS on machine URI in parallel
+shutdown_guests_parallel()
+{
+ uri=$1
+ guests=$2
+
+ on_shutdown=
+ timeout=$SHUTDOWN_TIMEOUT
+ while [ -n "$on_shutdown" ] || [ -n "$guests" ]; do
+ while [ -n "$guests" ] &&
+ [ $(guest_count "$on_shutdown") -lt "$PARALLEL_SHUTDOWN" ]; do
+ set -- $guests
+ guest=$1
+ shift
+ guests=$*
+ shutdown_guest_async "$uri" "$guest"
+ on_shutdown="$on_shutdown $guest"
+ done
+ sleep 1
+ timeout=$(($timeout - 1))
+ if [ $timeout -le 0 ]; then
+ eval_gettext "Timeout expired while shutting down domains"; echo
+ RETVAL=1
+ return
+ fi
+ on_shutdown_prev=$on_shutdown
+ on_shutdown=$(check_guests_shutdown "$uri" "$on_shutdown")
+ print_guests_shutdown "$uri" "$on_shutdown_prev" "$on_shutdown"
+ done
+}
+
# stop
# Shutdown or save guests on the configured uris
stop() {
@@ -359,13 +456,18 @@ stop() {
eval_gettext "Shutting down guests on \$uri URI..."; echo
fi
- for guest in $list; do
- if "$suspending"; then
- suspend_guest "$uri" "$guest"
- else
- shutdown_guest "$uri" "$guest"
- fi
- done
+ if [ "$PARALLEL_SHUTDOWN" -gt 1 ] &&
+ ! "$suspending"; then
+ shutdown_guests_parallel "$uri" "$list"
+ else
+ for guest in $list; do
+ if "$suspending"; then
+ suspend_guest "$uri" "$guest"
+ else
+ shutdown_guest "$uri" "$guest"
+ fi
+ done
+ fi
done <"$LISTFILE"
rm -f "$VAR_SUBSYS_LIBVIRT_GUESTS"
diff --git a/tools/libvirt-guests.sysconf b/tools/libvirt-guests.sysconf
index 9b8b64f..3e0db46 100644
--- a/tools/libvirt-guests.sysconf
+++ b/tools/libvirt-guests.sysconf
@@ -10,7 +10,8 @@
# libvirtd
#ON_BOOT=start
-# number of seconds to wait between each guest start
+# Number of seconds to wait between each guest start. Set to 0 to allow
+# parallel startup.
#START_DELAY=0
# action taken on host shutdown
@@ -23,7 +24,14 @@
# value suitable for your guests.
#ON_SHUTDOWN=suspend
-# number of seconds we're willing to wait for a guest to shut down
+# If set to non-zero, shutdown will suspend guests concurrently. Number of
+# guests on shutdown at any time will not exceed number set in this variable.
+#PARALLEL_SHUTDOWN=0
+
+# Number of seconds we're willing to wait for a guest to shut down. If parallel
+# shutdown is enabled, this timeout applies as a timeout for shutting down all
+# guests on a single URI defined in the variable URIS. This must be set to
+# a nonzero positive value if the shutdown action is requested.
#SHUTDOWN_TIMEOUT=0
# If non-zero, try to bypass the file system cache when saving and
--
1.7.3.4
12 years, 7 months
[libvirt] [PATCH] rpc: Fix client crash on connection close
by Jiri Denemark
A multi-threaded client with event loop may crash if one of its threads
closes a connection while event loop is in the middle of sending
keep-alive message (either request or response). The right place for it
is inside virNetClientIOEventLoop() between poll() and
virNetClientLock(). We should only close a connection directly if no-one
is using it and defer the closing to the last user otherwise. So far we
only did so if the close was initiated by keep-alive timeout.
---
src/rpc/virnetclient.c | 18 ++++--------------
1 files changed, 4 insertions(+), 14 deletions(-)
diff --git a/src/rpc/virnetclient.c b/src/rpc/virnetclient.c
index 167fbf6..c2b901d 100644
--- a/src/rpc/virnetclient.c
+++ b/src/rpc/virnetclient.c
@@ -108,8 +108,6 @@ struct _virNetClient {
};
-static void virNetClientRequestClose(virNetClientPtr client);
-
static void virNetClientLock(virNetClientPtr client)
{
virMutexLock(&client->lock);
@@ -253,7 +251,7 @@ virNetClientKeepAliveStart(virNetClientPtr client,
static void
virNetClientKeepAliveDeadCB(void *opaque)
{
- virNetClientRequestClose(opaque);
+ virNetClientClose(opaque);
}
static int
@@ -512,19 +510,11 @@ virNetClientCloseLocked(virNetClientPtr client)
void virNetClientClose(virNetClientPtr client)
{
- if (!client)
- return;
-
- virNetClientLock(client);
- virNetClientCloseLocked(client);
- virNetClientUnlock(client);
-}
-
-static void
-virNetClientRequestClose(virNetClientPtr client)
-{
VIR_DEBUG("client=%p", client);
+ if (!client)
+ return;
+
virNetClientLock(client);
/* If there is a thread polling for data on the socket, set wantClose flag
--
1.7.8.5
12 years, 7 months