[libvirt] Supporting vhost-net and macvtap in libvirt for QEMU
by Anthony Liguori
Disclaimer: I am neither an SR-IOV nor a vhost-net expert, but I've CC'd
people that are who can throw tomatoes at me for getting bits wrong :-)
I wanted to start a discussion about supporting vhost-net in libvirt.
vhost-net has not yet been merged into qemu but I expect it will be soon
so it's a good time to start this discussion.
There are two modes worth supporting for vhost-net in libvirt. The
first mode is where vhost-net backs to a tun/tap device. This
behaves in very much the same way that -net tap behaves in qemu today.
Basically, the difference is that the virtio backend is in the kernel
instead of in qemu so there should be some performance improvement.
Currently, libvirt invokes qemu with -net tap,fd=X where X is an already
open fd to a tun/tap device. I suspect that after we merge vhost-net,
libvirt could support vhost-net in this mode by just doing -net
vhost,fd=X. I think the only real question for libvirt is whether to
provide a user visible switch to use vhost or to just always use vhost
when it's available and it makes sense. Personally, I think the latter
makes sense.
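To make the "always use vhost when it's available" option concrete, here is a minimal sketch of how the -net argument could be chosen. The "vhost,fd=N" form is only the syntax guessed at above, not a confirmed qemu option, and format_net_arg and vhost_available are hypothetical names for illustration.

```c
#include <stdio.h>
#include <stdbool.h>

/*
 * Format the value of -net for a NIC backed by an already-open tun/tap fd.
 * "tap,fd=N" is what libvirt passes today; "vhost,fd=N" is an assumed
 * syntax that may well differ once vhost-net is merged.
 * Returns the number of characters written, or -1 if buf is too small.
 */
static int format_net_arg(char *buf, size_t len, int tapfd, bool vhost_available)
{
    const char *backend = vhost_available ? "vhost" : "tap";
    int n = snprintf(buf, len, "%s,fd=%d", backend, tapfd);

    if (n < 0 || (size_t)n >= len)
        return -1;
    return n;
}
```

With this shape there is no user-visible switch: the caller only has to find out whether vhost is usable and pass that in.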
The more interesting invocation of vhost-net though is one where the
vhost-net device backs directly to a physical network card. In this
mode, vhost should get considerably better performance than the current
implementation. I don't know the syntax yet, but I think it's
reasonable to assume that it will look something like -net
tap,dev=eth0. The effect will be that eth0 is dedicated to the guest.
On most modern systems, there is a small number of network devices so
this model is not all that useful except when dealing with SR-IOV
adapters. In that case, each physical device can be exposed as many
virtual devices (VFs). There are a few restrictions here though. The
biggest is that currently, you can only change the number of VFs by
reloading a kernel module so it's really a parameter that must be set at
startup time.
I think there are a few ways libvirt could support vhost-net in this
second mode. The simplest would be to introduce a new tag similar to
<source network='br0'>. In fact, if you probed the device type for the
network parameter, you could probably do something like <source
network='eth0'> and have it Just Work.
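As a rough sketch of what "probing the device type" could mean on Linux: a bridge has a "bridge" subdirectory under its /sys/class/net entry, while a plain NIC does not. The function and enum names below are made up for illustration, and the sysfs root is passed in as a parameter so the logic can be tried against a scratch directory.

```c
#include <stdio.h>
#include <sys/stat.h>

enum net_source_kind {
    NET_SOURCE_MISSING,   /* no such interface */
    NET_SOURCE_BRIDGE,    /* interface is a bridge such as br0 */
    NET_SOURCE_OTHER      /* anything else, e.g. a physical NIC or a VF */
};

/* sysfs_net would normally be "/sys/class/net". */
static enum net_source_kind probe_net_source(const char *sysfs_net, const char *name)
{
    char path[512];
    struct stat st;
    int n;

    n = snprintf(path, sizeof(path), "%s/%s", sysfs_net, name);
    if (n < 0 || (size_t)n >= sizeof(path) || stat(path, &st) < 0)
        return NET_SOURCE_MISSING;

    /* Bridges expose a "bridge" directory below their sysfs entry. */
    n = snprintf(path, sizeof(path), "%s/%s/bridge", sysfs_net, name);
    if (n < 0 || (size_t)n >= sizeof(path))
        return NET_SOURCE_MISSING;
    if (stat(path, &st) == 0 && S_ISDIR(st.st_mode))
        return NET_SOURCE_BRIDGE;

    return NET_SOURCE_OTHER;
}
```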
Another model would be to have libvirt see an SR-IOV adapter as a
network pool where it handles all of the VF management. Considering
how inflexible SR-IOV is today, I'm not sure whether this is the best model.
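For the pool model, a minimal sketch of the bookkeeping could look like the following. It assumes the number of VFs is fixed when the pool is created, matching the module-load restriction above; vf_pool and its functions are hypothetical names, not an existing libvirt API.

```c
#include <stdbool.h>
#include <stddef.h>

#define VF_POOL_MAX 64

struct vf_pool {
    size_t num_vfs;              /* fixed when the driver module is loaded */
    bool in_use[VF_POOL_MAX];    /* true while a VF is handed to a guest */
};

static int vf_pool_init(struct vf_pool *pool, size_t num_vfs)
{
    if (num_vfs > VF_POOL_MAX)
        return -1;
    pool->num_vfs = num_vfs;
    for (size_t i = 0; i < VF_POOL_MAX; i++)
        pool->in_use[i] = false;
    return 0;
}

/* Hand out the lowest free VF index, or -1 if the pool is exhausted. */
static int vf_pool_acquire(struct vf_pool *pool)
{
    for (size_t i = 0; i < pool->num_vfs; i++) {
        if (!pool->in_use[i]) {
            pool->in_use[i] = true;
            return (int)i;
        }
    }
    return -1;
}

/* Return a VF to the pool; -1 if the index is invalid or not in use. */
static int vf_pool_release(struct vf_pool *pool, int vf)
{
    if (vf < 0 || (size_t)vf >= pool->num_vfs || !pool->in_use[vf])
        return -1;
    pool->in_use[vf] = false;
    return 0;
}
```

The pool can never grow at runtime here, which is exactly the inflexibility that makes me unsure about this model.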
Has anyone put any more thought into this problem or how this should be
modeled in libvirt? Michael, could you share your current thinking for
-net syntax?
--
Regards,
Anthony Liguori
1 year, 2 months
[libvirt] [PATCH 0/4] Multiple problems with saving to block devices
by Daniel P. Berrange
This patch series makes it possible to save to a block device,
instead of a plain file. There were multiple problems:
- When save failed, we might dereference a NULL pointer
- When save failed, we unlinked the device node !!
- The approach of using >> to append doesn't work with block devices
- CGroups was blocking QEMU access to the block device when enabled
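To illustrate the second problem in the list, here is a small standalone sketch of cleanup that removes a failed save target only when it is a regular file. This is not the code from the series; cleanup_failed_save is a hypothetical helper that just shows the check with stat() and S_ISREG().

```c
#include <sys/stat.h>
#include <unistd.h>

/*
 * Clean up the save target after a failed save. Only a regular file is
 * removed; a block device node (or any other file type) is left in place.
 * Returns 1 if the path was unlinked, 0 if it was left alone, -1 on error.
 */
static int cleanup_failed_save(const char *path)
{
    struct stat st;

    if (stat(path, &st) < 0)
        return -1;
    if (!S_ISREG(st.st_mode))
        return 0;
    return unlink(path) == 0 ? 1 : -1;
}
```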
One remaining problem is not in libvirt, but rather QEMU. The QEMU
exec: based migration often fails to detect failure of the command
and will thus hang forever attempting a migration that'll never
succeed! Fortunately you can now work around this in libvirt using
the virsh domjobabort command.
11 years, 9 months
[libvirt] libvirt(-java): virDomainMigrateSetMaxDowntime
by Thomas Treutner
Hi,
I'm having some trouble with virDomainMigrate &
virDomainMigrateSetMaxDowntime. The core problem is that KVM's default
value for the maximum allowed downtime is 30ms (max_downtime in
migration.c, it's nanoseconds there; 0.12.3) which is too low for my VMs
when they're busy (~50% CPU util and above). Migrations then take
literally forever; I had to abort them after 15 minutes or so. I'm using
GBit Ethernet, so plenty of bandwidth should be available. Increasing the
allowed downtime to 50ms seems to help, but I have not tested situations
where the VM is completely utilized. Anyways, the default value is too
low for me, so I tried virDomainMigrateSetMaxDowntime, or rather the Java
wrapper function.
Here I'm facing a problem I can overcome only with a quite crude hack:
org.libvirt.Domain.migrate(..) blocks until the migration is done, which
is of course reasonable. So I tried calling migrateSetMaxDowntime(..)
before migrating, causing an error:
"Requested operation is not valid: domain is not being migrated"
This tells me that calling migrateSetMaxDowntime is only allowed during
migrations. As I'm migrating VMs automatically and without any user
intervention I'd need to create some glue code that runs in an extra
thread, waiting "some time" and hoping that the migration has already been
kicked off in the main thread, and then calling migrateSetMaxDowntime. I'd like to
avoid such quirks in the long run, if possible.
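Roughly, the glue code would be a retry loop like the sketch below. It is written in C for brevity and uses stand-ins rather than the real API: fake_domain and fake_set_max_downtime are made up and only mimic the "domain is not being migrated" failure until the migration is considered started.

```c
/* Stand-in for the migrating domain; purely illustrative. */
struct fake_domain {
    int calls;                      /* how often the setter was invoked */
    int migrating_after;            /* setter starts succeeding on this call */
    unsigned long long applied_ms;  /* downtime that was finally accepted */
};

/* Stand-in for migrateSetMaxDowntime: fails until the "migration" runs. */
static int fake_set_max_downtime(struct fake_domain *dom,
                                 unsigned long long downtime_ms)
{
    dom->calls++;
    if (dom->calls < dom->migrating_after)
        return -1;                  /* "domain is not being migrated" */
    dom->applied_ms = downtime_ms;
    return 0;
}

/*
 * Keep trying to set the downtime until it is accepted or we give up.
 * Returns the attempt number that succeeded, or -1 after max_attempts.
 */
static int retry_set_max_downtime(struct fake_domain *dom,
                                  unsigned long long downtime_ms,
                                  int max_attempts)
{
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        if (fake_set_max_downtime(dom, downtime_ms) == 0)
            return attempt;
        /* real glue code would sleep here before trying again */
    }
    return -1;
}
```

The loop has no way of knowing how long to wait or when to give up, which is why I would prefer passing the value along with the migrate call.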
So my question: Would it be possible to extend the migrate() method
or the virDomainMigrate() function with an optional maxDowntime parameter
that is passed down as QEMU_JOB_SIGNAL_MIGRATE_DOWNTIME so that
qemuDomainWaitForMigrationComplete would set the value? Or are there
easier ways?
Thanks and regards,
-t
13 years, 3 months
[libvirt] snapshots += domain description?
by Philipp Hahn
Hello,
I just encountered a problem with the current snapshot implementation for
qemu/kvm: To revert back to a snapshot, the domain description must closely
match the domain description when the snapshot was created. If the ram-size
doesn't match or the VM now contains fewer CPUs or contains additional disks,
kvm will either not load the snapshot or the machine will be broken
afterwards.
Is this a known problem?
Without looking into the implementation details I think saving the full domain
description instead of just a reference to the domain uuid as part of the
snapshot description would work better. Or am I supposed to save the domain
description myself in addition to the data saved
in /var/lib/libvirt/qemu/snapshot/$DOMAIN_NAME/$SNAPSHOT_NAME.xml? I think
this is a very important feature, because when you - for example - notice that
you need additional disk space and would like to include an additional block
device, this is your ideal time to take a snapshot you can revert to when you
do something wrong.
Looking at VMware I see that reverting a VM there will not only revert the
disks and ram back to the saved state, but also the description including
name, ram size, etc.
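To make the mismatch cases concrete, here is a small sketch of the check I would otherwise have to do by hand before reverting. It covers only the three cases mentioned above (ram size, fewer CPUs, additional disks); the struct and function names are invented and this is not how libvirt stores snapshots.

```c
#include <stdbool.h>

/* The few properties of a domain description that matter for this check. */
struct domain_shape {
    unsigned long long memory_kib;
    unsigned int vcpus;
    unsigned int ndisks;
};

/*
 * Returns true if reverting to a snapshot taken with shape "snap" looks
 * safe for a domain that currently has shape "cur".
 */
static bool snapshot_matches_domain(const struct domain_shape *snap,
                                    const struct domain_shape *cur)
{
    if (cur->memory_kib != snap->memory_kib)
        return false;               /* ram size differs */
    if (cur->vcpus < snap->vcpus)
        return false;               /* VM now has fewer CPUs */
    if (cur->ndisks > snap->ndisks)
        return false;               /* VM now has additional disks */
    return true;
}
```

If the full domain description were stored with the snapshot, this comparison would not be needed at all, because the revert could restore the description as well.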
Any comment or advice is appreciated.
Sincerely
Philipp Hahn
--
Philipp Hahn Open Source Software Engineer hahn(a)univention.de
Univention GmbH Linux for Your Business fon: +49 421 22 232- 0
Mary-Somerville-Str.1 28359 Bremen fax: +49 421 22 232-99
http://www.univention.de
13 years, 4 months
[libvirt] [PATCH] qemu: Fix -chardev udp if parameters are omitted
by Cole Robinson
The following XML:
<serial type='udp'>
<source mode='connect' service='9999'/>
</serial>
is accepted by domain_conf.c but maps to the qemu command line:
-chardev udp,host=127.0.0.1,port=2222,localaddr=(null),localport=(null)
qemu can cope with everything being omitted except the connection port, which
seems to also be the intent of domain_conf validation, so let's not
generate bogus command lines for that case.
Additionally, tweak the qemu cli parsing to handle omitted host parameters
for -serial udp.
---
src/qemu/qemu_command.c | 48 +++++++++++---------
.../qemuxml2argv-serial-udp-chardev.args | 6 ++-
.../qemuxml2argv-serial-udp-chardev.xml | 4 ++
.../qemuxml2argvdata/qemuxml2argv-serial-udp.args | 2 +-
tests/qemuxml2argvdata/qemuxml2argv-serial-udp.xml | 4 ++
5 files changed, 39 insertions(+), 25 deletions(-)
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index c9b9850..231d7c3 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -2119,14 +2119,17 @@ qemuBuildChrChardevStr(virDomainChrSourceDefPtr dev, const char *alias,
break;
case VIR_DOMAIN_CHR_TYPE_UDP:
- virBufferVSprintf(&buf,
- "udp,id=char%s,host=%s,port=%s,localaddr=%s,"
- "localport=%s",
+ virBufferVSprintf(&buf, "udp,id=char%s,port=%s",
alias,
- dev->data.udp.connectHost,
- dev->data.udp.connectService,
- dev->data.udp.bindHost,
- dev->data.udp.bindService);
+ dev->data.udp.connectService);
+
+ if (dev->data.udp.connectHost)
+ virBufferVSprintf(&buf, ",host=%s", dev->data.udp.connectHost);
+ if (dev->data.udp.bindHost)
+ virBufferVSprintf(&buf, ",localaddr=%s", dev->data.udp.bindHost);
+ if (dev->data.udp.bindService)
+ virBufferVSprintf(&buf, ",localport=%s",
+ dev->data.udp.bindService);
break;
case VIR_DOMAIN_CHR_TYPE_TCP:
@@ -2216,11 +2219,13 @@ qemuBuildChrArgStr(virDomainChrSourceDefPtr dev, const char *prefix)
break;
case VIR_DOMAIN_CHR_TYPE_UDP:
- virBufferVSprintf(&buf, "udp:%s:%s@%s:%s",
- dev->data.udp.connectHost,
- dev->data.udp.connectService,
- dev->data.udp.bindHost,
- dev->data.udp.bindService);
+ virBufferVSprintf(&buf, "udp:%s:%s",
+ dev->data.udp.connectHost ?: "",
+ dev->data.udp.connectService);
+ if (dev->data.udp.bindService)
+ virBufferVSprintf(&buf, "@%s:%s",
+ dev->data.udp.bindHost ?: "",
+ dev->data.udp.bindService);
break;
case VIR_DOMAIN_CHR_TYPE_TCP:
@@ -5302,13 +5307,12 @@ qemuParseCommandLineChr(const char *val)
host2 = svc1 ? strchr(svc1, '@') : NULL;
svc2 = host2 ? strchr(host2, ':') : NULL;
- if (svc1)
+ if (svc1 && (svc1 != val)) {
def->source.data.udp.connectHost = strndup(val, svc1-val);
- else
- def->source.data.udp.connectHost = strdup(val);
- if (!def->source.data.udp.connectHost)
- goto no_memory;
+ if (!def->source.data.udp.connectHost)
+ goto no_memory;
+ }
if (svc1) {
svc1++;
@@ -5323,14 +5327,14 @@ qemuParseCommandLineChr(const char *val)
if (host2) {
host2++;
- if (svc2)
+ if (svc2 && (svc2 != host2)) {
def->source.data.udp.bindHost = strndup(host2, svc2-host2);
- else
- def->source.data.udp.bindHost = strdup(host2);
- if (!def->source.data.udp.bindHost)
- goto no_memory;
+ if (!def->source.data.udp.bindHost)
+ goto no_memory;
+ }
}
+
if (svc2) {
svc2++;
def->source.data.udp.bindService = strdup(svc2);
diff --git a/tests/qemuxml2argvdata/qemuxml2argv-serial-udp-chardev.args b/tests/qemuxml2argvdata/qemuxml2argv-serial-udp-chardev.args
index 7525110..8c6a6d5 100644
--- a/tests/qemuxml2argvdata/qemuxml2argv-serial-udp-chardev.args
+++ b/tests/qemuxml2argvdata/qemuxml2argv-serial-udp-chardev.args
@@ -2,6 +2,8 @@ LC_ALL=C PATH=/bin HOME=/home/test USER=test LOGNAME=test /usr/bin/qemu -S -M \
pc -m 214 -smp 1 -nographic -nodefconfig -nodefaults -chardev socket,\
id=charmonitor,path=/tmp/test-monitor,server,nowait -mon chardev=charmonitor,\
id=monitor,mode=readline -no-acpi -boot c -hda /dev/HostVG/QEMUGuest1 -chardev \
-udp,id=charserial0,host=127.0.0.1,port=9998,localaddr=127.0.0.1,localport=9999 \
--device isa-serial,chardev=charserial0,id=serial0 -usb -device \
+udp,id=charserial0,port=9998,host=127.0.0.1,localaddr=127.0.0.1,localport=9999 \
+-device isa-serial,chardev=charserial0,id=serial0 \
+-chardev udp,id=charserial1,port=9999 \
+-device isa-serial,chardev=charserial1,id=serial1 -usb -device \
virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x2
diff --git a/tests/qemuxml2argvdata/qemuxml2argv-serial-udp-chardev.xml b/tests/qemuxml2argvdata/qemuxml2argv-serial-udp-chardev.xml
index 12622d4..9627c67 100644
--- a/tests/qemuxml2argvdata/qemuxml2argv-serial-udp-chardev.xml
+++ b/tests/qemuxml2argvdata/qemuxml2argv-serial-udp-chardev.xml
@@ -25,6 +25,10 @@
<source mode='connect' host='127.0.0.1' service='9998'/>
<target port='0'/>
</serial>
+ <serial type='udp'>
+ <source mode='connect' service='9999'/>
+ <target port='1'/>
+ </serial>
<console type='udp'>
<source mode='bind' host='127.0.0.1' service='9999'/>
<source mode='connect' host='127.0.0.1' service='9998'/>
diff --git a/tests/qemuxml2argvdata/qemuxml2argv-serial-udp.args b/tests/qemuxml2argvdata/qemuxml2argv-serial-udp.args
index 53c69bc..cf25fe0 100644
--- a/tests/qemuxml2argvdata/qemuxml2argv-serial-udp.args
+++ b/tests/qemuxml2argvdata/qemuxml2argv-serial-udp.args
@@ -1,4 +1,4 @@
LC_ALL=C PATH=/bin HOME=/home/test USER=test LOGNAME=test /usr/bin/qemu -S -M \
pc -m 214 -smp 1 -nographic -monitor unix:/tmp/test-monitor,server,nowait \
-no-acpi -boot c -hda /dev/HostVG/QEMUGuest1 -net none -serial \
-udp:127.0.0.1:9998@127.0.0.1:9999 -parallel none -usb
+udp:127.0.0.1:9998@127.0.0.1:9999 -serial udp::9999 -parallel none -usb
diff --git a/tests/qemuxml2argvdata/qemuxml2argv-serial-udp.xml b/tests/qemuxml2argvdata/qemuxml2argv-serial-udp.xml
index 8697f5a..f606ea4 100644
--- a/tests/qemuxml2argvdata/qemuxml2argv-serial-udp.xml
+++ b/tests/qemuxml2argvdata/qemuxml2argv-serial-udp.xml
@@ -25,6 +25,10 @@
<source mode='connect' host='127.0.0.1' service='9998'/>
<target port='0'/>
</serial>
+ <serial type='udp'>
+ <source mode='connect' service='9999'/>
+ <target port='1'/>
+ </serial>
<console type='udp'>
<source mode='bind' host='127.0.0.1' service='9999'/>
<source mode='connect' host='127.0.0.1' service='9998'/>
--
1.7.4
13 years, 5 months
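The optional-parameter handling in the -chardev udp patch above can also be tried outside libvirt. The following is a standalone sketch using snprintf instead of virBuffer; build_udp_chardev and append_opt are illustrative names and not functions from qemu_command.c.

```c
#include <stdio.h>

/* Append ",key=value" to buf if value is non-NULL. Returns -1 on overflow. */
static int append_opt(char *buf, size_t len, size_t *used,
                      const char *key, const char *value)
{
    int n;

    if (!value)
        return 0;                   /* omitted parameter: emit nothing */
    n = snprintf(buf + *used, len - *used, ",%s=%s", key, value);
    if (n < 0 || (size_t)n >= len - *used)
        return -1;
    *used += (size_t)n;
    return 0;
}

/*
 * Build the value of -chardev for a UDP character device. Only the
 * connection port is mandatory; host, localaddr and localport are
 * emitted only when set, so "(null)" never reaches the command line.
 */
static int build_udp_chardev(char *buf, size_t len, const char *alias,
                             const char *connect_host,
                             const char *connect_service,
                             const char *bind_host,
                             const char *bind_service)
{
    size_t used;
    int n = snprintf(buf, len, "udp,id=char%s,port=%s", alias, connect_service);

    if (n < 0 || (size_t)n >= len)
        return -1;
    used = (size_t)n;

    if (append_opt(buf, len, &used, "host", connect_host) < 0 ||
        append_opt(buf, len, &used, "localaddr", bind_host) < 0 ||
        append_opt(buf, len, &used, "localport", bind_service) < 0)
        return -1;
    return 0;
}
```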
[libvirt] (how much) support for kqemu domain
by John Lumby
I am wondering about the extent to which "old" qemu-0.11.1 and kqemu-1.4.0 are supported by virt-manager.
I see I can specify --virt-type=kqemu on virt-install and it remembers domain type='kqemu', and does things such as refusing to start the vm if the kqemu kernel module is not loaded, but it seems it does not add the
-enable-kqemu -kernel-kqemu
options to the qemu command line. There is really not much point in trying to start a qemu-based vm with neither hardware kvm nor kqemu ...
I can work around it with an override script to intercept the qemu command, but does anyone think virt-manager ought to do this for me?
John Lumby
13 years, 7 months
[libvirt] [PATCH 0/6] Introduce a new migration protocol to QEMU driver
by Daniel P. Berrange
The current migration protocol has several flaws:
- No initial hook on the source host to do work before
the dst VM is launched
- No ability to restart src VM if dst fails to recv all
migration data, but src successfully sent it all
This introduces a new five-step migration process to address
these limitations. To support features such as seamless
migration of SPICE clients and lock driver state passing,
this now makes use of the migration cookie feature too.
13 years, 8 months
[libvirt] [PATCH V2] xen: check if device is assigned to guest before reattaching
by Yufang Zhang
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=664059
This is version 2 of the patch for BZ#664059. Reattaching a PCI device back to
the host without destroying the guest or detaching the device from the guest
would cause the host to crash. This patch adds a check before doing device
reattach. If the device is assigned to a guest, libvirt refuses to reattach the
device to the host. The patch only works for Xen, because it just checks
xenstore to get PCI device information.
Signed-off-by: Yufang Zhang <yuzhang(a)redhat.com>
---
src/xen/xen_driver.c | 67 ++++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 67 insertions(+), 0 deletions(-)
diff --git a/src/xen/xen_driver.c b/src/xen/xen_driver.c
index 4c11b11..318bb6a 100644
--- a/src/xen/xen_driver.c
+++ b/src/xen/xen_driver.c
@@ -1891,11 +1891,70 @@ out:
}
static int
+xenUnifiedNodeDeviceAssignedDomainId (virNodeDevicePtr dev)
+{
+ int numdomains;
+ int ret = -1, i;
+ int *ids = NULL;
+ char *bdf = NULL;
+ char *xref = NULL;
+ unsigned int domain, bus, slot, function;
+ virConnectPtr conn = dev->conn;
+ xenUnifiedPrivatePtr priv = conn->privateData;
+
+ /* Get active domains */
+ numdomains = xenUnifiedNumOfDomains(conn);
+ if (numdomains < 0) {
+ return ret;
+ }
+ if (numdomains > 0){
+ if (VIR_ALLOC_N(ids, numdomains) < 0){
+ virReportOOMError();
+ goto out;
+ }
+ if ((numdomains = xenUnifiedListDomains(conn, &ids[0], numdomains)) < 0){
+ goto out;
+ }
+ }
+
+ /* Get pci bdf */
+ if (xenUnifiedNodeDeviceGetPciInfo(dev, &domain, &bus, &slot, &function) < 0)
+ goto out;
+
+ if (virAsprintf(&bdf, "%04x:%02x:%02x.%0x",
+ domain, bus, slot, function) < 0) {
+ virReportOOMError();
+ goto out;
+ }
+
+ /* Check if bdf is assigned to one of active domains */
+ for (i = 0; i < numdomains; i++ ){
+ xenUnifiedLock(priv);
+ xref = xenStoreDomainGetPCIID(conn, ids[i], bdf);
+ xenUnifiedUnlock(priv);
+ if (xref == NULL)
+ continue;
+ else {
+ ret = ids[i];
+ break;
+ }
+ }
+
+ VIR_FREE(xref);
+ VIR_FREE(bdf);
+out:
+ VIR_FREE(ids);
+
+ return ret;
+}
+
+static int
xenUnifiedNodeDeviceReAttach (virNodeDevicePtr dev)
{
pciDevice *pci;
unsigned domain, bus, slot, function;
int ret = -1;
+ int domid;
if (xenUnifiedNodeDeviceGetPciInfo(dev, &domain, &bus, &slot, &function) < 0)
return -1;
@@ -1904,6 +1963,14 @@ xenUnifiedNodeDeviceReAttach (virNodeDevicePtr dev)
if (!pci)
return -1;
+ /* Check if device is assigned to an active guest */
+ if ((domid = xenUnifiedNodeDeviceAssignedDomainId(dev)) >= 0){
+ xenUnifiedError(VIR_ERR_INTERNAL_ERROR,
+ _("Device %s has been assigned to guest %d"),
+ dev->name, domid);
+ goto out;
+ }
+
if (pciReAttachDevice(pci, NULL) < 0)
goto out;
--
1.7.3.4
13 years, 8 months
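The core of the check in the Xen patch above is to format the device's domain:bus:slot.function string and see whether any active guest owns it. A standalone sketch of that idea follows; the in-memory pci_assignment table is a made-up stand-in for the xenstore lookup, and none of these names come from xen_driver.c.

```c
#include <stdio.h>
#include <string.h>

/* Stand-in for what xenstore knows: which guest holds which device. */
struct pci_assignment {
    int domid;
    const char *bdf;
};

/* Format a PCI address as dddd:bb:ss.f. Returns -1 if buf is too small. */
static int format_bdf(char *buf, size_t len, unsigned domain, unsigned bus,
                      unsigned slot, unsigned function)
{
    int n = snprintf(buf, len, "%04x:%02x:%02x.%x", domain, bus, slot, function);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Return the id of the guest that holds bdf, or -1 if no guest does. */
static int assigned_domain_id(const struct pci_assignment *table, size_t n,
                              const char *bdf)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(table[i].bdf, bdf) == 0)
            return table[i].domid;
    }
    return -1;
}
```

A caller would refuse to reattach the device whenever assigned_domain_id returns a value of 0 or greater.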