[libvirt] [RFC PATCH v2 1/1] qemu: host NUMA hugepage policy without guest NUMA
by Sam Bobroff
At the moment, guests that are backed by hugepages in the host are
only able to use policy to control the placement of those hugepages
on a per-(guest-)CPU basis. Policy applied globally is ignored.
Such guests would use <memoryBacking><hugepages/></memoryBacking> and
a <numatune> block with <memory mode=... nodeset=.../> but no <memnode
.../> elements.
This patch corrects this by, in this specific case, changing the QEMU
command line from "-mem-prealloc -mem-path=..." (which cannot
specify NUMA policy) to "-object memory-backend-file ..." (which can).
Note: This is not visible to the guest and does not appear to create
a migration incompatibility.
Signed-off-by: Sam Bobroff <sam.bobroff(a)au1.ibm.com>
---
Hello libvirt community,
This patch is a RFC of an attempt to understand and fix a problem with NUMA
memory placement policy being ignored in a specific situation.
Previous discussion on this issue:
https://www.redhat.com/archives/libvir-list/2016-October/msg00033.html
https://www.redhat.com/archives/libvir-list/2016-October/msg00514.html
(Sorry, this is a long description because the situation is complicated and
there are several ways of approaching it and I need to explain my reasoning for
choosing this specific method.)
The issue I'm trying to fix:
The problem is apparent when using this combination of features:
* The guest is backed by hugepages in the host.
* The backing memory is constrained by NUMA memory placement policy.
* The guest either does not use NUMA nodes at all, or does but does not specify a per-node policy.
The guest XML would therefore contain blocks similar to the following:
<memoryBacking><hugepages/></memoryBacking>
<numatune><memory mode='strict' nodeset='0'/></numatune>
(And does not use any <numatune><memnode .../> elements.)
(Policy works correctly without hugepages, or when memnode is used.)
What happens is that the guest runs successfully but the NUMA memory placement
policy is ignored, allowing memory to be allocated on any (host) node.
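For example (an ad-hoc check only, nothing from the patch; the guest name is made
up and the QEMU binary name depends on the architecture), comparing the requested
policy with the per-node usage of the running QEMU process makes this visible:
virsh numatune example-guest
numastat -p $(pidof qemu-system-x86_64)
The first command reports mode 'strict' and nodeset '0', while the "Huge" row of
the second can show hugepages allocated on other host nodes as well.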
Attempted solutions:
The most obvious approach to me is a trivial patch to
qemuBuildMemoryBackendStr(). The test that checks to see if a memory-backend
is necessary (around line 3332 in qemu_command.c) is this:
    /* If none of the following is requested... */
    if (!needHugepage && !userNodeset && !memAccess && !nodeSpecified && !force) {
        /* report back that using the new backend is not necessary
         * to achieve the desired configuration */
        ret = 1;
    } else {
This test does not consider the nodemask, so when no other reason exists to use
a memory-backend (which is the case in point) the policy is lost.
Adding the nodemask to this test appears to fix the issue easily. However:
* It only works when the guest uses NUMA nodes (because otherwise the whole
function is skipped at a higher level).
* When the guest does use NUMA nodes, it causes a memory-backend object to be
added to each one (when they did not previously have one), and this causes
migration incompatibility.
So it seems worthwhile to look for other options.
Other ways of controlling NUMA memory placement are numactl (libnuma) or
cgroups but in practice neither can work in this situation due to the
interaction of several features:
* numactl and cgroups can only apply policy to all memory allocated by that thread.
* When a guest is backed by hugepages, memory needs to be preallocated.
* QEMU performs memory pre-allocation from its main thread, not the VCPU threads.
This seems to mean that policy applied to the QEMU VCPU threads does not apply
to pre-allocated memory and policy applied to the QEMU main thread incorrectly
applies to all of QEMU's allocations, not just the VCPU ones. My testing seems to
confirm this.
If that is true, then neither of those approaches can work (at least, not
without changes to QEMU).
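(Purely as an illustration of that interaction, and not something libvirt does:
wrapping QEMU like this would bind every allocation the whole process makes to
node 0, not just the hugepage-backed guest RAM:)
numactl --membind=0 qemu-system-x86_64 -m 4G -mem-path /dev/hugepages -mem-prealloc ...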
At this point it appeared that the only ways to proceed would all require a
migration-breaking change, so I started to look at QEMU to see if that could be
worked around somehow.
... another option:
While investigating QEMU I discovered that it is possible to use a QEMU memory
backend object without attaching it to any guest NUMA node and when it is used
that way it does not break migration. This is (to me at least) a new way of
using a memory-backend object, but it appears to directly replace -mem-path and
-mem-prealloc and additionally allow NUMA memory placement policy to be
specified. (The ID field is still required even though it is not used.)
For example, this...
-m 4G -mem-path /dev/hugepages -mem-prealloc
... seems to be equivalent to this:
-m 4G -object memory-backend-file,id=xxx,prealloc=yes,path=/dev/hugepages,size=4G
But this is now possible as well:
-m 4G -object memory-backend-file,id=xxx,prealloc=yes,path=/dev/hugepages,host-nodes=0-1,size=4G
... so it seems to be a way of solving this issue without breaking migration.
Additionally, there seems to be code specifically in QEMU to prevent the
migration data from changing in this case, which leads me to believe that this
is an intended use of the option but I haven't directly consulted with the QEMU
community.
Implementation:
My patch is based on this last case. It is intended to work by switching the
"-mem-path" style argument to an un-attached memory-backend-file, which is why
the change is done within qemuBuildMemPathStr().
In the case where QEMU doesn't support the needed option, I've opted to raise a
warning rather than an error: existing guests may already have this
configuration, and although their policy is being ignored they do at least run,
which I assume is important to preserve.
I realize that this policy can't be changed at run-time by libvirt, unlike
policy set by cgroups or libnuma, but memory-backend objects are already in use
in more common situations and already have this limitation, so I don't think
this adds significantly to that problem.
Questions:
So does this line of reasoning make sense?
Is this a good way to fix it? Are there better solutions?
Are there concerns with using memory-backend in this way: do I need to consult
with the QEMU community before a libvirt patch that uses it could be accepted?
Should the warning be an error?
It seemed safest to me to only use the backend object when necessary, but there
doesn't seem to be a reason to avoid using it whenever it is available. Is
there one? Should I have done that instead?
Thanks for reading!
Sam.
v2:
Incorporated review feedback from Peter Krempa <pkrempa(a)redhat.com>
and Martin Kletzander <mkletzan(a)redhat.com>.
src/qemu/qemu_command.c | 56 +++++++++++++++++++++++++++++++++++++++++--------
1 file changed, 47 insertions(+), 9 deletions(-)
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index 405e118..0dea9fd 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -7105,12 +7105,18 @@ qemuBuildSmpCommandLine(virCommandPtr cmd,
static int
qemuBuildMemPathStr(virQEMUDriverConfigPtr cfg,
- const virDomainDef *def,
+ virDomainDefPtr def,
virQEMUCapsPtr qemuCaps,
- virCommandPtr cmd)
+ virCommandPtr cmd,
+ virBitmapPtr auto_nodeset)
{
const long system_page_size = virGetSystemPageSizeKB();
char *mem_path = NULL;
+ virBitmapPtr nodemask = NULL;
+ const char *backendType = NULL;
+ char *backendStr = NULL;
+ virJSONValuePtr props = NULL;
+ int rv = -1;
/*
* No-op if hugepages were not requested.
@@ -7135,18 +7141,50 @@ qemuBuildMemPathStr(virQEMUDriverConfigPtr cfg,
if (qemuGetHupageMemPath(cfg, def->mem.hugepages[0].size, &mem_path) < 0)
return -1;
- virCommandAddArgList(cmd, "-mem-prealloc", "-mem-path", mem_path, NULL);
+ if (virDomainNumatuneMaybeGetNodeset(def->numa, auto_nodeset,
+ &nodemask, -1) < 0)
+ return -1;
+ if (nodemask && virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_MEMORY_FILE)) {
+ if (qemuBuildMemoryBackendStr(virDomainDefGetMemoryInitial(def),
+ 0, -1, NULL, auto_nodeset,
+ def, qemuCaps, cfg, &backendType,
+ &props, false) < 0)
+ goto cleanup;
+ /* The memory-backend object created here is not going to be
+ * attached anywhere and there is only one so the ID can be hard coded.
+ * It acts like -mem-prealloc -mem-path ... but allows policy to be
+ * set. */
+ if (!(backendStr = virQEMUBuildObjectCommandlineFromJSON(backendType,
+ "mem",
+ props)))
+ goto cleanup;
+ virCommandAddArgList(cmd, "-object", backendStr, NULL);
+ rv = 0;
+ cleanup:
+ virJSONValueFree(props);
+ VIR_FREE(backendStr);
+ } else {
+ if (nodemask)
+ /* Ideally this would be an error, but if it was it would
+ * break existing guests already using this configuration. */
+ VIR_WARN("Memory file backend objects are "
+ "unsupported by QEMU binary. Global NUMA "
+ "hugepage policy will be ignored.");
+ virCommandAddArgList(cmd, "-mem-prealloc", "-mem-path", mem_path, NULL);
+ rv = 0;
+ }
VIR_FREE(mem_path);
- return 0;
+ return rv;
}
static int
qemuBuildMemCommandLine(virCommandPtr cmd,
virQEMUDriverConfigPtr cfg,
- const virDomainDef *def,
- virQEMUCapsPtr qemuCaps)
+ virDomainDefPtr def,
+ virQEMUCapsPtr qemuCaps,
+ virBitmapPtr auto_nodeset)
{
if (qemuDomainDefValidateMemoryHotplug(def, qemuCaps, NULL) < 0)
return -1;
@@ -7170,7 +7208,7 @@ qemuBuildMemCommandLine(virCommandPtr cmd,
* there is no numa node specified.
*/
if (!virDomainNumaGetNodeCount(def->numa) &&
- qemuBuildMemPathStr(cfg, def, qemuCaps, cmd) < 0)
+ qemuBuildMemPathStr(cfg, def, qemuCaps, cmd, auto_nodeset) < 0)
return -1;
if (def->mem.locked && !virQEMUCapsGet(qemuCaps, QEMU_CAPS_REALTIME_MLOCK)) {
@@ -7307,7 +7345,7 @@ qemuBuildNumaArgStr(virQEMUDriverConfigPtr cfg,
}
if (!needBackend &&
- qemuBuildMemPathStr(cfg, def, qemuCaps, cmd) < 0)
+ qemuBuildMemPathStr(cfg, def, qemuCaps, cmd, auto_nodeset) < 0)
goto cleanup;
for (i = 0; i < ncells; i++) {
@@ -9385,7 +9423,7 @@ qemuBuildCommandLine(virQEMUDriverPtr driver,
if (!migrateURI && !snapshot && qemuDomainAlignMemorySizes(def) < 0)
goto error;
- if (qemuBuildMemCommandLine(cmd, cfg, def, qemuCaps) < 0)
+ if (qemuBuildMemCommandLine(cmd, cfg, def, qemuCaps, nodeset) < 0)
goto error;
if (qemuBuildSmpCommandLine(cmd, def) < 0)
--
2.10.0.297.gf6727b0
[libvirt] [PATCH 5/5] docs: Add Qt Virtual machines manager and Qt Remote viewer
by Alexander Vasilenko
---
docs/apps.html.in | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/docs/apps.html.in b/docs/apps.html.in
index 1a138b3..4efb318 100644
--- a/docs/apps.html.in
+++ b/docs/apps.html.in
@@ -208,6 +208,17 @@
to remote consoles supporting the VNC protocol. Also provides
an optional mozilla browser plugin.
</dd>
+ <dt><a href="http://f1ash.github.io/qt-virt-manager">qt-virt-manager</a></dt>
+ <dd>
+ The Qt GUI for creating and controlling VMs and other virtual entities
+ (networks, storage, interfaces, secrets, network filters).
+ Contains an integrated LXC/SPICE/VNC viewer for accessing the graphical or
+ text console associated with a virtual machine or container.
+ </dd>
+ <dt><a href="http://f1ash.github.io/qt-virt-manager/#virtual-machines-viewer">qt-remote-viewer</a></dt>
+ <dd>
+ The Qt VNC/SPICE viewer for accessing remote desktops or VMs.
+ </dd>
</dl>
<h2><a name="iaas">Infrastructure as a Service (IaaS)</a></h2>
--
2.10.1
[libvirt] [PATCH 0/3] wireshark: Further build system fixes
by Andrea Bolognani
This takes care of the few remaining nits.
All use cases I could think of are covered; if any more
issues are discovered, we'll take care of them then.
Andrea Bolognani (3):
wireshark: Don't redefine ws_plugindir
wireshark: Try a bunch of possible prefixes
wireshark: Use ${exec_prefix} for $ws_plugindir
m4/virt-wireshark.m4 | 23 ++++++++++++++++-------
tools/Makefile.am | 1 -
2 files changed, 16 insertions(+), 8 deletions(-)
--
2.7.4
[libvirt] [RFC] Toward a better NEWS file
by Andrea Bolognani
Hi,
there's an idea that has been kicking around in my head for
a while, and I'd like to share it with the list to gather
some feedback before I forget about it :)
Right now, each entry in our NEWS file contains what is
basically the output of
git log \
--pretty=oneline \
vX.Y-1.0..vX.Y.0
with the commits organized somewhat arbitrarily into a bunch
of sections with partially overlapping scopes.
I believe the current form is less than useful, as it is too
detailed for end users and distro packagers, who only care
about the high-level user visible changes, and not detailed
enough for developers, who are always going to refer to the
proper git log anyway. Moreover, we ship a ChangeLog file
that contains very detailed information already.
Ideally, the NEWS file would contain a summary of notable
changes (new APIs, significantly improved features, etc.)
laid out in an easy to digest fashion, such that people who
are not knee-deep into libvirt development can grasp them
and hopefully get excited about upgrading to our latest and
greatest release :)
Of course, it would take an insane amount of time for any
single one of us to turn the git log into such a document,
and the result would still be sub-par because we simply
can't expect anyone to have full insight in every single
change to the project.
My solution for this is to ask the people with the most
knowledge of the changes - the authors themselves!
The workflow I'm envisioning would look like this:
* DV, at the same time as he announces that libvirt has
entered freeze, will put out a Call for NEWS and ask
people who have contributed code to the upcoming release
to post a summary of their changes
* the authors will go over
git log \
--author=$(git config user.email) \
vX.Y-1.0..master
and come up with a short (one to three sentence) summary
for each of the changes, if they are notable. Commits
that are part of a larger series would not be described
on their own: a short summary of the series would be
used instead, akin to the one you would put in your
cover letter.
To give a practical example: I've mostly been busy with
reviews this cycle, but if I were to go over my commits
since v2.3.0 right now I would write something like
* Bug fix: don't restart libvirt-guests.service when
libvirtd.service is restarted
for commit 2273bbd, and omit both 61e1014 and a0da413 as
they're neither notable enough on their own, nor part of
a larger series: we'll always have a "various bug fixes
and improvements" bullet point in a NEWS file entry to
take care of that kind of small cleanups and improvements.
* the authors would post the resulting summaries to the
list. We could simply post them as regular patches to
docs/news.html.in (potentially without requiring review
before pushing them), or post them as plain text and have
someone collect them and prepare a single commit
* DV will tag the release and push the tarballs, and
everyone will be able to enjoy the NEWS file :)
Some light editorial work might be needed throughout the
process, e.g. fixing typos or posting one or two reminders during
the freeze: I volunteer to take such tasks upon myself.
I'm looking forward to feedback about this idea, especially
from people who might be part of any community where anything
like this is already happening.
--
Andrea Bolognani / Red Hat / Virtualization
[libvirt] [PATCH 0/3] vz: nettype=bridge and other corrections
by Maxim Nestratov
Maxim Nestratov (3):
vz: support type=bridge network interface type correctly
vz: remove Bridged network name and rename Routed
vz: add MIGRATION_V3 capability
src/vz/vz_driver.c | 1 +
src/vz/vz_sdk.c | 100 ++++++++---------------------------------------------
src/vz/vz_utils.h | 3 +-
3 files changed, 16 insertions(+), 88 deletions(-)
--
2.4.11
[libvirt] [REPOST] regarding cgroup v2 support in libvirt
by Tejun Heo
(reposting w/ libvir-list cc'd, sorry about the delay in reposting,
was traveling and then on vacation)
Hello, Daniel. How have you been?
We (facebook) are deploying cgroup v2 and internally use libvirt to
manage virtual machines, so I'm trying to add cgroup v2 support to
libvirt.
Because cgroup v2's resource configurations differ from v1 in varying
degrees depending on the specific resource type, it unfortunately
introduces new configurations (some completely new configs, others
just a different range / format). This means that adding cgroup v2
support to libvirt requires adding new config options to it and maybe
implementing some form of translation mechanism between overlapping
configs.
The upcoming systemd release includes all that's necessary to support
v1/v2 compatibility so that users setting resource configs through
systemd don't have to worry about whether v1 or v2 is in use. I'm
wondering whether it would make sense to make libvirt use dbus calls
to systemd to set resource configs when systemd is in use, so that it
can piggyback on systemd's v1/v2 compatibility.
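(As a rough illustration, with an arbitrary unit and limit chosen just for the
example, this is the kind of setting libvirt could apply through systemd's
SetUnitProperties D-Bus call rather than writing to cgroupfs itself:)
systemctl set-property --runtime machine.slice MemoryHigh=8G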
It is true that, as libvirt can be used without systemd, libvirt will
probably want its own direct implementation down the line, but I think
there are benefits to going through systemd for resource settings in
general given that hierarchy setup is already done through systemd
when available.
What do you think?
Thanks!
--
tejun
[libvirt] [PATCH] qemu_driver: don't set cpu pin for TCG domain
by Chen Hanxiao
From: Chen Hanxiao <chenhanxiao(a)gmail.com>
We don't support pinning CPUs for TCG domains,
but pinning can still be set via the vcpupin command,
which results in a failed startup.
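For example, a hypothetical reproducer (the domain name is made up and the guest
is defined with <domain type='qemu'>); the second command then fails to start
the guest:
virsh vcpupin tcg-guest 0 2 --config
virsh start tcg-guest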
Signed-off-by: Chen Hanxiao <chenhanxiao(a)gmail.com>
---
src/qemu/qemu_driver.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 93ea5e2..98cfcab 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -5177,6 +5177,14 @@ qemuDomainPinVcpuFlags(virDomainPtr dom,
if (virDomainObjGetDefs(vm, flags, &def, &persistentDef) < 0)
goto endjob;
+ if ((def && def->virtType == VIR_DOMAIN_VIRT_QEMU) ||
+ (persistentDef && persistentDef->virtType == VIR_DOMAIN_VIRT_QEMU))
+ {
+ virReportError(VIR_ERR_OPERATION_FAILED, "%s",
+ _("Virt type 'Qemu'(TCG) did not support CPU pin"));
+ goto endjob;
+ }
+
if (persistentDef &&
!(vcpuinfo = virDomainDefGetVcpu(persistentDef, vcpu))) {
virReportError(VIR_ERR_INVALID_ARG,
--
1.8.3.1
[libvirt] [RFC PATCH] libxl: add tunnelled migration support
by Bob Liu
Tunnelled migration doesn't require any extra network connections besides the
libvirt daemon connection.
It's capable of strong encryption and is the default option in openstack-nova.
This patch adds tunnelled migration (Tunnel3params) support to libxl.
The data flow on the src side is:
* libxlDoMigrateSend() -> pipe
* libxlTunnel3MigrationFunc() polls the pipe and then writes to the dest stream.
While on the dest side:
Stream -> pipe -> 'recvfd of libxlDomainStartRestore'
The usage is the same as p2p migration, except for adding one more '--tunnelled'
flag to the libvirt p2p migration command.
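For example (the destination host name is a placeholder):
virsh migrate --live --p2p --tunnelled guest xen+ssh://desthost/system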
Signed-off-by: Bob Liu <bob.liu(a)oracle.com>
---
src/libxl/libxl_driver.c | 58 ++++++++++-
src/libxl/libxl_migration.c | 241 +++++++++++++++++++++++++++++++++++++++++---
src/libxl/libxl_migration.h | 9 ++
3 files changed, 292 insertions(+), 16 deletions(-)
diff --git a/src/libxl/libxl_driver.c b/src/libxl/libxl_driver.c
index b66cb1f..a01bbff 100644
--- a/src/libxl/libxl_driver.c
+++ b/src/libxl/libxl_driver.c
@@ -5918,6 +5918,61 @@ libxlDomainMigrateBegin3Params(virDomainPtr domain,
}
static int
+libxlDomainMigratePrepareTunnel3Params(virConnectPtr dconn,
+ virStreamPtr st,
+ virTypedParameterPtr params,
+ int nparams,
+ const char *cookiein,
+ int cookieinlen,
+ char **cookieout ATTRIBUTE_UNUSED,
+ int *cookieoutlen ATTRIBUTE_UNUSED,
+ unsigned int flags)
+{
+ libxlDriverPrivatePtr driver = dconn->privateData;
+ virDomainDefPtr def = NULL;
+ const char *dom_xml = NULL;
+ const char *dname = NULL;
+ const char *uri_in = NULL;
+
+#ifdef LIBXL_HAVE_NO_SUSPEND_RESUME
+ virReportUnsupportedError();
+ return -1;
+#endif
+
+ virCheckFlags(LIBXL_MIGRATION_FLAGS, -1);
+ if (virTypedParamsValidate(params, nparams, LIBXL_MIGRATION_PARAMETERS) < 0)
+ goto error;
+
+ if (virTypedParamsGetString(params, nparams,
+ VIR_MIGRATE_PARAM_DEST_XML,
+ &dom_xml) < 0 ||
+ virTypedParamsGetString(params, nparams,
+ VIR_MIGRATE_PARAM_DEST_NAME,
+ &dname) < 0 ||
+ virTypedParamsGetString(params, nparams,
+ VIR_MIGRATE_PARAM_URI,
+ &uri_in) < 0)
+
+ goto error;
+
+ if (!(def = libxlDomainMigrationPrepareDef(driver, dom_xml, dname)))
+ goto error;
+
+ if (virDomainMigratePrepareTunnel3ParamsEnsureACL(dconn, def) < 0)
+ goto error;
+
+ if (libxlDomainMigrationPrepareTunnel3(dconn, st, &def, cookiein,
+ cookieinlen, flags) < 0)
+ goto error;
+
+ return 0;
+
+ error:
+ virDomainDefFree(def);
+ return -1;
+}
+
+static int
libxlDomainMigratePrepare3Params(virConnectPtr dconn,
virTypedParameterPtr params,
int nparams,
@@ -6017,7 +6072,7 @@ libxlDomainMigratePerform3Params(virDomainPtr dom,
if (virDomainMigratePerform3ParamsEnsureACL(dom->conn, vm->def) < 0)
goto cleanup;
- if (flags & VIR_MIGRATE_PEER2PEER) {
+ if ((flags & (VIR_MIGRATE_TUNNELLED | VIR_MIGRATE_PEER2PEER))) {
if (libxlDomainMigrationPerformP2P(driver, vm, dom->conn, dom_xml,
dconnuri, uri, dname, flags) < 0)
goto cleanup;
@@ -6501,6 +6556,7 @@ static virHypervisorDriver libxlHypervisorDriver = {
.nodeDeviceReset = libxlNodeDeviceReset, /* 1.2.3 */
.domainMigrateBegin3Params = libxlDomainMigrateBegin3Params, /* 1.2.6 */
.domainMigratePrepare3Params = libxlDomainMigratePrepare3Params, /* 1.2.6 */
+ .domainMigratePrepareTunnel3Params = libxlDomainMigratePrepareTunnel3Params, /* 2.3.1 */
.domainMigratePerform3Params = libxlDomainMigratePerform3Params, /* 1.2.6 */
.domainMigrateFinish3Params = libxlDomainMigrateFinish3Params, /* 1.2.6 */
.domainMigrateConfirm3Params = libxlDomainMigrateConfirm3Params, /* 1.2.6 */
diff --git a/src/libxl/libxl_migration.c b/src/libxl/libxl_migration.c
index 534abb8..88c9bb8 100644
--- a/src/libxl/libxl_migration.c
+++ b/src/libxl/libxl_migration.c
@@ -44,6 +44,7 @@
#include "libxl_migration.h"
#include "locking/domain_lock.h"
#include "virtypedparam.h"
+#include "fdstream.h"
#define VIR_FROM_THIS VIR_FROM_LIBXL
@@ -484,6 +485,90 @@ libxlDomainMigrationPrepareDef(libxlDriverPrivatePtr driver,
}
int
+libxlDomainMigrationPrepareTunnel3(virConnectPtr dconn,
+ virStreamPtr st,
+ virDomainDefPtr *def,
+ const char *cookiein,
+ int cookieinlen,
+ unsigned int flags)
+{
+ libxlMigrationCookiePtr mig = NULL;
+ libxlDriverPrivatePtr driver = dconn->privateData;
+ virDomainObjPtr vm = NULL;
+ libxlMigrationDstArgs *args = NULL;
+ virThread thread;
+ int dataFD[2] = { -1, -1 };
+ int ret = 0;
+
+ if (libxlMigrationEatCookie(cookiein, cookieinlen, &mig) < 0)
+ goto error;
+
+ if (mig->xenMigStreamVer > LIBXL_SAVE_VERSION) {
+ virReportError(VIR_ERR_OPERATION_UNSUPPORTED,
+ _("Xen migration stream version '%d' is not supported on this host"),
+ mig->xenMigStreamVer);
+ goto error;
+ }
+
+ if (!(vm = virDomainObjListAdd(driver->domains, *def,
+ driver->xmlopt,
+ VIR_DOMAIN_OBJ_LIST_ADD_LIVE |
+ VIR_DOMAIN_OBJ_LIST_ADD_CHECK_LIVE,
+ NULL)))
+ goto error;
+
+ /*
+ * The data flow of tunnel3 migration in the dest side:
+ * stream -> pipe -> recvfd of libxlDomainStartRestore
+ */
+ if (pipe(dataFD) < 0)
+ goto error;
+
+ /* Stream data will be written to pipeIn */
+ if (virFDStreamOpen(st, dataFD[1]) < 0)
+ goto error;
+
+ if (libxlMigrationDstArgsInitialize() < 0)
+ goto error;
+
+ if (!(args = virObjectNew(libxlMigrationDstArgsClass)))
+ goto error;
+
+ args->conn = dconn;
+ args->vm = vm;
+ args->flags = flags;
+ args->migcookie = mig;
+ /* Receive from pipeOut */
+ args->recvfd = dataFD[0];
+ args->nsocks = 0;
+ if (virThreadCreate(&thread, false, libxlDoMigrateReceive, args) < 0) {
+ virReportError(VIR_ERR_OPERATION_FAILED, "%s",
+ _("Failed to create thread for receiving migration data"));
+ goto error;
+ }
+
+ goto done;
+
+ error:
+ VIR_FORCE_CLOSE(dataFD[1]);
+ VIR_FORCE_CLOSE(dataFD[0]);
+ virObjectUnref(args);
+ /* Remove virDomainObj from domain list */
+ if (vm) {
+ virDomainObjListRemove(driver->domains, vm);
+ vm = NULL;
+ }
+ ret = -1;
+
+ done:
+ /* Nobody will close dataFD[1]? */
+ if (vm)
+ virObjectUnlock(vm);
+
+ return ret;
+}
+
+int
libxlDomainMigrationPrepare(virConnectPtr dconn,
virDomainDefPtr *def,
const char *uri_in,
@@ -710,9 +795,90 @@ libxlDomainMigrationPrepare(virConnectPtr dconn,
return ret;
}
-/* This function is a simplification of virDomainMigrateVersion3Full
- * excluding tunnel support and restricting it to migration v3
- * with params since it was the first to be introduced in libxl.
+typedef struct _libxlTunnelMigrationThread libxlTunnelMigrationThread;
+struct _libxlTunnelMigrationThread {
+ virThread thread;
+ virStreamPtr st;
+ int srcFD;
+};
+#define TUNNEL_SEND_BUF_SIZE 65536
+
+/*
+ * The data flow of tunnel3 migration in the src side:
+ * libxlDoMigrateSend() -> pipe
+ * libxlTunnel3MigrationFunc() polls pipe out and then write to dest stream.
+ */
+static void libxlTunnel3MigrationFunc(void *arg)
+{
+ libxlTunnelMigrationThread *data = (libxlTunnelMigrationThread *)arg;
+ char *buffer = NULL;
+ struct pollfd fds[1];
+ int timeout = -1;
+
+ if (VIR_ALLOC_N(buffer, TUNNEL_SEND_BUF_SIZE) < 0) {
+ virReportError(errno, "%s", _("poll failed in migration tunnel"));
+ return;
+ }
+
+ fds[0].fd = data->srcFD;
+ for (;;) {
+ int ret;
+
+ fds[0].events = POLLIN;
+ fds[0].revents = 0;
+ ret = poll(fds, ARRAY_CARDINALITY(fds), timeout);
+ if (ret < 0) {
+ if (errno == EAGAIN || errno == EINTR)
+ continue;
+ virReportError(errno, "%s",
+ _("poll failed in libxlTunnel3MigrationFunc"));
+ goto abrt;
+ }
+
+ if (ret == 0) {
+ VIR_DEBUG("poll got timeout");
+ break;
+ }
+
+ if (fds[0].revents & (POLLIN | POLLERR | POLLHUP)) {
+ int nbytes;
+
+ nbytes = read(data->srcFD, buffer, TUNNEL_SEND_BUF_SIZE);
+ if (nbytes > 0) {
+ /* Write to dest stream */
+ if (virStreamSend(data->st, buffer, nbytes) < 0) {
+ virReportError(errno, "%s",
+ _("tunnelled migration failed to send to virStream"));
+ goto abrt;
+ }
+ } else if (nbytes < 0) {
+ virReportError(errno, "%s",
+ _("tunnelled migration failed to read from xen side"));
+ goto abrt;
+ } else {
+ /* EOF; transferred all data */
+ break;
+ }
+ }
+ }
+
+ if (virStreamFinish(data->st) < 0)
+ virReportError(errno, "%s",
+ _("tunnelled migration failed to finish stream"));
+
+ cleanup:
+ VIR_FREE(buffer);
+
+ return;
+
+ abrt:
+ virStreamAbort(data->st);
+ goto cleanup;
+}
+
+/* This function is a simplification of virDomainMigrateVersion3Full and
+ * restricting it to migration v3 with params since it was the first to be
+ * introduced in libxl.
*/
static int
libxlDoMigrateP2P(libxlDriverPrivatePtr driver,
@@ -737,6 +903,10 @@ libxlDoMigrateP2P(libxlDriverPrivatePtr driver,
bool cancelled = true;
virErrorPtr orig_err = NULL;
int ret = -1;
+ /* For tunnel migration */
+ virStreamPtr st = NULL;
+ libxlTunnelMigrationThread *libxlTunnelMigationThreadPtr = NULL;
+ int dataFD[2] = { -1, -1 };
dom_xml = libxlDomainMigrationBegin(sconn, vm, xmlin,
&cookieout, &cookieoutlen);
@@ -764,29 +934,62 @@ libxlDoMigrateP2P(libxlDriverPrivatePtr driver,
VIR_DEBUG("Prepare3");
virObjectUnlock(vm);
- ret = dconn->driver->domainMigratePrepare3Params
- (dconn, params, nparams, cookieout, cookieoutlen, NULL, NULL, &uri_out, destflags);
+ if (flags & VIR_MIGRATE_TUNNELLED) {
+ if (!(st = virStreamNew(dconn, 0)))
+ goto cleanup;
+ ret = dconn->driver->domainMigratePrepareTunnel3Params
+ (dconn, st, params, nparams, cookieout, cookieoutlen, NULL, NULL, destflags);
+ } else {
+ ret = dconn->driver->domainMigratePrepare3Params
+ (dconn, params, nparams, cookieout, cookieoutlen, NULL, NULL, &uri_out, destflags);
+ }
virObjectLock(vm);
if (ret == -1)
goto cleanup;
- if (uri_out) {
- if (virTypedParamsReplaceString(¶ms, &nparams,
- VIR_MIGRATE_PARAM_URI, uri_out) < 0) {
- orig_err = virSaveLastError();
+ if (!(flags & VIR_MIGRATE_TUNNELLED)) {
+ if (uri_out) {
+ if (virTypedParamsReplaceString(¶ms, &nparams,
+ VIR_MIGRATE_PARAM_URI, uri_out) < 0) {
+ orig_err = virSaveLastError();
+ goto finish;
+ }
+ } else {
+ virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("domainMigratePrepare3 did not set uri"));
goto finish;
}
- } else {
- virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
- _("domainMigratePrepare3 did not set uri"));
- goto finish;
}
VIR_DEBUG("Perform3 uri=%s", NULLSTR(uri_out));
- ret = libxlDomainMigrationPerform(driver, vm, NULL, NULL,
- uri_out, NULL, flags);
+ if (flags & VIR_MIGRATE_TUNNELLED) {
+ if (VIR_ALLOC(libxlTunnelMigationThreadPtr) < 0)
+ goto cleanup;
+ if (pipe(dataFD) < 0) {
+ virReportError(errno, "%s", _("Unable to make pipes"));
+ goto cleanup;
+ }
+ /* Read from pipe */
+ libxlTunnelMigationThreadPtr->srcFD = dataFD[0];
+ /* Write to dest stream */
+ libxlTunnelMigationThreadPtr->st = st;
+ if (virThreadCreate(&libxlTunnelMigationThreadPtr->thread, true,
+ libxlTunnel3MigrationFunc,
+ libxlTunnelMigationThreadPtr) < 0) {
+ virReportError(errno, "%s",
+ _("Unable to create tunnel migration thread"));
+ goto cleanup;
+ }
+ virObjectUnlock(vm);
+ /* Send data to pipe */
+ ret = libxlDoMigrateSend(driver, vm, flags, dataFD[1]);
+ virObjectLock(vm);
+ } else {
+ ret = libxlDomainMigrationPerform(driver, vm, NULL, NULL,
+ uri_out, NULL, flags);
+ }
if (ret < 0)
orig_err = virSaveLastError();
@@ -824,6 +1027,14 @@ libxlDoMigrateP2P(libxlDriverPrivatePtr driver,
vm->def->name);
cleanup:
+ if (libxlTunnelMigationThreadPtr) {
+ virThreadCancel(&libxlTunnelMigationThreadPtr->thread);
+ VIR_FREE(libxlTunnelMigationThreadPtr);
+ }
+ VIR_FORCE_CLOSE(dataFD[0]);
+ VIR_FORCE_CLOSE(dataFD[1]);
+ virObjectUnref(st);
+
if (ddomain) {
virObjectUnref(ddomain);
ret = 0;
diff --git a/src/libxl/libxl_migration.h b/src/libxl/libxl_migration.h
index 8a074a0..fcea558 100644
--- a/src/libxl/libxl_migration.h
+++ b/src/libxl/libxl_migration.h
@@ -29,6 +29,7 @@
# define LIBXL_MIGRATION_FLAGS \
(VIR_MIGRATE_LIVE | \
VIR_MIGRATE_PEER2PEER | \
+ VIR_MIGRATE_TUNNELLED | \
VIR_MIGRATE_PERSIST_DEST | \
VIR_MIGRATE_UNDEFINE_SOURCE | \
VIR_MIGRATE_PAUSED)
@@ -53,6 +54,14 @@ libxlDomainMigrationPrepareDef(libxlDriverPrivatePtr driver,
const char *dname);
int
+libxlDomainMigrationPrepareTunnel3(virConnectPtr dconn,
+ virStreamPtr st,
+ virDomainDefPtr *def,
+ const char *cookiein,
+ int cookieinlen,
+ unsigned int flags);
+
+int
libxlDomainMigrationPrepare(virConnectPtr dconn,
virDomainDefPtr *def,
const char *uri_in,
--
2.6.5
[libvirt] [PATCH 0/6] Address assignment fixes
by Ján Tomko
Fixes a crash on usb-serial hotplug and missing addresses
after libvirtd restart, and makes some code more readable.
Ján Tomko (6):
Add 'FromCache' to virDomainVirtioSerialAddrAutoAssign
Introduce virDomainVirtioSerialAddrAutoAssign again
Return directly from qemuDomainAttachChrDeviceAssignAddr
Also create the USB address cache for domains with all the USB
addresses
Fix crash on usb-serial hotplug
Do not try to release virtio serial addresses
src/conf/domain_addr.c | 41 +++++++++++++++++++++++++++++++++----
src/conf/domain_addr.h | 14 +++++++++++--
src/libvirt_private.syms | 2 ++
src/qemu/qemu_domain_address.c | 22 ++++++++++++++++----
src/qemu/qemu_hotplug.c | 46 +++++++++++++++++-------------------------
5 files changed, 87 insertions(+), 38 deletions(-)
--
2.7.3
[libvirt] libvirt compiles on RISC-V (RV64G)
by Richard W.M. Jones
I'm happy to announce that libvirt compiles fine from git on
Fedora/RISC-V. This has little or no practical value at all, since
RISC-V lacks such essentials as virtualization, qemu, etc.
However, I suppose you could use it as a remote client.
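For example (the remote host name is made up):
virsh -c qemu+ssh://some-x86-host.example.com/system list --all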
# file src/.libs/libvirt.so.0.2004.0
src/.libs/libvirt.so.0.2004.0: ELF 64-bit LSB shared object, UCB RISC-V, version 1 (SYSV), dynamically linked, BuildID[sha1]=525ed42ce6d1284c6a909bd6f1b0d6181e88af7b, not stripped
# file tools/.libs/virsh
tools/.libs/virsh: ELF 64-bit LSB shared object, UCB RISC-V, version 1 (SYSV), dynamically linked, interpreter /lib/ld.so.1, for GNU/Linux 2.6.32, BuildID[sha1]=67e2b69f9007c02137545fed1b7fd9b2871740ee, not stripped
Rich.
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines. Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v