[libvirt] Supporting vhost-net and macvtap in libvirt for QEMU
by Anthony Liguori
Disclaimer: I am neither an SR-IOV nor a vhost-net expert, but I've CC'd
people that are who can throw tomatoes at me for getting bits wrong :-)
I wanted to start a discussion about supporting vhost-net in libvirt.
vhost-net has not yet been merged into qemu but I expect it will be soon
so it's a good time to start this discussion.
There are two modes worth supporting for vhost-net in libvirt. The
first mode is where vhost-net backs to a tun/tap device. This
behaves in very much the same way that -net tap behaves in qemu today.
Basically, the difference is that the virtio backend is in the kernel
instead of in qemu so there should be some performance improvement.
Currently, libvirt invokes qemu with -net tap,fd=X where X is an already
open fd to a tun/tap device. I suspect that after we merge vhost-net,
libvirt could support vhost-net in this mode by just doing -net
vhost,fd=X. I think the only real question for libvirt is whether to
provide a user visible switch to use vhost or to just always use vhost
when it's available and it makes sense. Personally, I think the latter
makes sense.
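The "always use vhost when it's available" option can be sketched in a few lines of Python. This is only an illustration, not libvirt code: the function name and the vhost_available flag are hypothetical, and -net vhost,fd=X is merely the syntax guessed at above, since vhost-net has not been merged yet.

```python
def build_net_args(tap_fd, vhost_available):
    """Return qemu -net arguments for an already-open tun/tap fd.

    'vhost' is the backend name speculated in this thread; it is an
    assumption, not a confirmed qemu option.
    """
    backend = "vhost" if vhost_available else "tap"
    return ["-net", "%s,fd=%d" % (backend, tap_fd)]


if __name__ == "__main__":
    print(build_net_args(5, vhost_available=True))
    print(build_net_args(5, vhost_available=False))
```

With this shape there is no user-visible switch; the caller only has to decide how vhost_available is detected.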
The more interesting invocation of vhost-net though is one where the
vhost-net device backs directly to a physical network card. In this
mode, vhost should get considerably better performance than the current
implementation. I don't know the syntax yet, but I think it's
reasonable to assume that it will look something like -net
tap,dev=eth0. The effect will be that eth0 is dedicated to the guest.
On most modern systems, there is a small number of network devices so
this model is not all that useful except when dealing with SR-IOV
adapters. In that case, each physical device can be exposed as many
virtual devices (VFs). There are a few restrictions here though. The
biggest is that currently, you can only change the number of VFs by
reloading a kernel module so it's really a parameter that must be set at
startup time.
I think there are a few ways libvirt could support vhost-net in this
second mode. The simplest would be to introduce a new tag similar to
<source network='br0'>. In fact, if you probed the device type for the
network parameter, you could probably do something like <source
network='eth0'> and have it Just Work.
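Probing the device type could look roughly like the sketch below. It relies on the fact that Linux exposes a "bridge" subdirectory under /sys/class/net/<name> for bridge devices; the function name and the returned labels are hypothetical, and a real implementation would need to distinguish more device types.

```python
import os


def classify_netdev(name, sysfs_root="/sys/class/net"):
    """Classify a host network device as 'bridge', 'device' or 'missing'."""
    dev_dir = os.path.join(sysfs_root, name)
    if not os.path.isdir(dev_dir):
        return "missing"
    # Bridge devices expose a 'bridge' subdirectory in sysfs.
    if os.path.isdir(os.path.join(dev_dir, "bridge")):
        return "bridge"
    return "device"


if __name__ == "__main__":
    for dev in ("br0", "eth0"):
        print(dev, classify_netdev(dev))
```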
Another model would be to have libvirt see an SR-IOV adapter as a
network pool for which it handles all of the VF management. Considering
how inflexible SR-IOV is today, I'm not sure whether this is the best model.
Has anyone put any more thought into this problem or how this should be
modeled in libvirt? Michael, could you share your current thinking for
-net syntax?
--
Regards,
Anthony Liguori
1 year
[libvirt] [PATCH 0/4] Multiple problems with saving to block devices
by Daniel P. Berrange
This patch series makes it possible to save to a block device,
instead of a plain file. There were multiple problems:
- When save failed, we might de-reference a NULL pointer
- When save failed, we unlinked the device node !!
- The approach of using >> to append, doesn't work with block devices
- CGroups was blocking QEMU access to the block device when enabled
One remaining problem is not in libvirt, but rather QEMU. The QEMU
exec: based migration often fails to detect failure of the command
and will thus hang forever attempting a migration that'll never
succeed! Fortunately you can now work around this in libvirt using
the virsh domjobabort command.
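As an illustration of writing to a target without relying on append semantics, the sketch below writes data at an explicit offset with pwrite(). It is not the approach taken by these patches (which is not described here); the function name is hypothetical. Note that it opens the target without O_CREAT or O_TRUNC, so an existing device node is neither recreated nor truncated.

```python
import os


def write_at(path, offset, data):
    """Write 'data' at byte 'offset' of an existing file or block device."""
    fd = os.open(path, os.O_WRONLY)
    try:
        written = 0
        # pwrite() may write fewer bytes than requested, so loop.
        while written < len(data):
            written += os.pwrite(fd, data[written:], offset + written)
        os.fsync(fd)
    finally:
        os.close(fd)
```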
11 years, 8 months
[libvirt] [PATCHv4 00/51] another round of snapshot patches
by Eric Blake
I think I've addressed most findings from round 3 - by implementing
the ability to redefine a snapshot, it becomes possible to restore
snapshot hierarchy when recreating a transient domain by the same
name. New goodies in this round: several bug fixes, add virsh
snapshot-edit, drop undefine --snapshots-full (you can only remove
snapshot metadata on undefine). I tested as I went, but this went
through so many rebases that there may be some nasties that snuck
in; but I wanted to get this posted now. I also know that I'm
missing at least one major feature requested in the v3 review:
namely, transient domains _should_ auto-remove snapshot metadata
files when they halt, but right now aren't doing that.
v3 was at:
https://www.redhat.com/archives/libvir-list/2011-August/msg01132.html
Also available here:
git fetch git://repo.or.cz/libvirt/ericb.git snapshot
or browse online at:
http://repo.or.cz/w/libvirt/ericb.git/shortlog/refs/heads/snapshot
I'm also trying to group things by several bugzilla related to
various patches (looks like I still need to create a few):
Eric Blake (51):
https://bugzilla.redhat.com/show_bug.cgi?id=674537
snapshot: fix corner case on OOM during creation
https://bugzilla.redhat.com/show_bug.cgi?id=733762
snapshot: better events when starting paused
snapshot: fine-tune ability to start paused
snapshot: expose --running and --paused in virsh
snapshot: fine-tune qemu saved images starting paused
snapshot: improve reverting to qemu paused snapshots
snapshot: properly revert qemu to offline snapshots
snapshot: fine-tune qemu snapshot revert states
no bug filed yet... should be one about no stale metadata
snapshot: allow deletion of just snapshot metadata
snapshot: add snapshot-list --parent to virsh
https://bugzilla.redhat.com/show_bug.cgi?id=733529
snapshot: speed up snapshot location
snapshot: avoid crash when deleting qemu snapshots
snapshot: track current domain across deletion of children
snapshot: simplify acting on just children
no bug filed yet... should be one about no stale metadata
snapshot: let qemu discard only snapshot metadata
snapshot: identify which snapshots have metadata
snapshot: reflect new dumpxml and list options in virsh
snapshot: identify qemu snapshot roots
snapshot: allow recreation of metadata
snapshot: refactor virsh snapshot creation
snapshot: improve virsh snapshot-create, add snapshot-edit
snapshot: add qemu snapshot creation without metadata
no bug filed yet... should be one about snapshot migration
snapshot: add qemu snapshot redefine support
snapshot: prevent stranding snapshot data on domain destruction
snapshot: teach virsh about new undefine flags
snapshot: refactor some qemu code
snapshot: cache qemu-img location
snapshot: support new undefine flags in qemu
snapshot: prevent migration from stranding snapshot data
https://bugzilla.redhat.com/show_bug.cgi?id=638510
snapshot: refactor domain xml output
snapshot: allow full domain xml in snapshot
snapshot: correctly escape generated xml
snapshot: update rng to support full domain in xml
snapshot: store qemu domain details in xml
snapshot: additions to domain xml for disks
snapshot: reject transient disks where code is not ready
snapshot: introduce new deletion flag
snapshot: expose new delete flag in virsh
snapshot: allow halting after snapshot
snapshot: expose halt-after-creation in virsh
snapshot: wire up new qemu monitor command
snapshot: support extra state in snapshots
snapshot: add <disks> to snapshot xml
snapshot: also support disks by path
snapshot: add virsh domblklist command
snapshot: add flag for requesting disk snapshot
snapshot: wire up disk-only flag to snapshot-create
snapshot: reject unimplemented disk snapshot features
snapshot: make it possible to audit external snapshot
snapshot: wire up live qemu disk snapshots
snapshot: use SELinux and lock manager with external snapshots
docs/formatdomain.html.in | 40 +-
docs/formatsnapshot.html.in | 269 ++-
docs/schemas/Makefile.am | 1 +
docs/schemas/domain.rng | 2555 +-------------------
docs/schemas/{domain.rng => domaincommon.rng} | 32 +-
docs/schemas/domainsnapshot.rng | 84 +-
examples/domain-events/events-c/event-test.c | 37 +-
include/libvirt/libvirt.h.in | 66 +-
src/conf/domain_audit.c | 12 +-
src/conf/domain_audit.h | 4 +-
src/conf/domain_conf.c | 902 ++++++--
src/conf/domain_conf.h | 76 +-
src/esx/esx_driver.c | 38 +-
src/libvirt.c | 256 ++-
src/libvirt_private.syms | 8 +
src/libxl/libxl_conf.c | 5 +
src/libxl/libxl_driver.c | 11 +-
src/qemu/qemu_command.c | 5 +
src/qemu/qemu_conf.h | 1 +
src/qemu/qemu_driver.c | 1532 +++++++++---
src/qemu/qemu_hotplug.c | 18 +-
src/qemu/qemu_migration.c | 48 +-
src/qemu/qemu_migration.h | 2 -
src/qemu/qemu_monitor.c | 24 +
src/qemu/qemu_monitor.h | 4 +
src/qemu/qemu_monitor_json.c | 33 +
src/qemu/qemu_monitor_json.h | 4 +
src/qemu/qemu_monitor_text.c | 40 +
src/qemu/qemu_monitor_text.h | 4 +
src/qemu/qemu_process.c | 11 +-
src/uml/uml_driver.c | 56 +-
src/vbox/vbox_tmpl.c | 43 +-
src/xen/xend_internal.c | 12 +-
src/xenxs/xen_sxpr.c | 5 +
src/xenxs/xen_xm.c | 5 +
tests/domainsnapshotxml2xmlin/disk_snapshot.xml | 16 +
tests/domainsnapshotxml2xmlout/disk_snapshot.xml | 77 +
tests/domainsnapshotxml2xmlout/full_domain.xml | 35 +
.../qemuxml2argv-disk-snapshot.args | 7 +
.../qemuxml2argv-disk-snapshot.xml | 39 +
.../qemuxml2argv-disk-transient.xml | 27 +
tests/qemuxml2argvtest.c | 2 +
tests/virsh-optparse | 20 +
tools/virsh.c | 772 +++++-
tools/virsh.pod | 214 ++-
45 files changed, 3978 insertions(+), 3474 deletions(-)
copy docs/schemas/{domain.rng => domaincommon.rng} (98%)
create mode 100644 tests/domainsnapshotxml2xmlin/disk_snapshot.xml
create mode 100644 tests/domainsnapshotxml2xmlout/disk_snapshot.xml
create mode 100644 tests/domainsnapshotxml2xmlout/full_domain.xml
create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-disk-snapshot.args
create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-disk-snapshot.xml
create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-disk-transient.xml
--
1.7.4.4
12 years
[libvirt] [PATCH 0/2] Introduce two new virsh commands
by Osier Yang
These two patches introduce two new virsh commands: eject-media,
which ejects media from a CD or floppy drive, and insert-media,
which inserts media into a CD or floppy drive.
Existing commands such as "update-device" can already be used to
eject or insert media, but they are not easy to use. That is the
motivation for these patches.
Both commands operate only on CDROM or floppy devices.
[PATCH 1/2] virsh: Introduce two new commands to insert or eject media
[PATCH 2/2] doc: Add docs for two new introduced commands
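To illustrate why "update-device" is awkward for this, the sketch below builds the kind of disk XML a user would have to write by hand. It assumes that a CDROM element without a <source> child represents an empty drive; the function name, the 'ide' bus and the example target 'hdc' are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET


def cdrom_xml(target_dev, source_file=None):
    """Build CDROM disk XML; omit <source> to describe an empty drive."""
    disk = ET.Element("disk", {"type": "file", "device": "cdrom"})
    if source_file is not None:
        ET.SubElement(disk, "source", {"file": source_file})
    # The 'ide' bus is an assumption made for this example.
    ET.SubElement(disk, "target", {"dev": target_dev, "bus": "ide"})
    ET.SubElement(disk, "readonly")
    return ET.tostring(disk, encoding="unicode")


if __name__ == "__main__":
    print(cdrom_xml("hdc"))                     # empty drive ("eject")
    print(cdrom_xml("hdc", "/iso/rhel5.iso"))   # media present ("insert")
```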
Regards
Osier
12 years, 11 months
[libvirt] [test-API][PATCH 1/2] Add support for spice graphics
by Nan Zhang
* utils/Python/xmlgenerator.py: Extend the graphics element to compose
spice XML, and support sub-element settings for audio, images,
streaming and so on:
<graphics type='spice' autoport='yes'>
<image compression='auto_glz'/>
<jpeg compression='auto'/>
<zlib compression='auto'/>
<playback compression='on'/>
<streaming mode='filter'/>
<clipboard copypaste='no'/>
</graphics>
* utils/Python/xmlbuilder.py: Add 2 methods add_graphics() and
build_graphics() to XmlBuilder class.
---
utils/Python/xmlbuilder.py | 36 +++++++++++++++++++++++-
utils/Python/xmlgenerator.py | 62 +++++++++++++++++++++++++++++++++++++----
2 files changed, 91 insertions(+), 7 deletions(-)
diff --git a/utils/Python/xmlbuilder.py b/utils/Python/xmlbuilder.py
index 5a0f8c8..739eccb 100644
--- a/utils/Python/xmlbuilder.py
+++ b/utils/Python/xmlbuilder.py
@@ -64,6 +64,13 @@ class XmlBuilder:
hostdev_node, domain.getElementsByTagName("console")[0])
return hostdev
+ def add_graphics(self, params, domain):
+ graphics = xmlgenerator.graphics_xml(params)
+ graphics_node = domain.importNode(graphics.childNodes[0], True)
+ domain.getElementsByTagName("devices")[0].insertBefore(
+ graphics_node, domain.getElementsByTagName("console")[0])
+ return graphics
+
def build_domain_install(self, params):
domain = xmlgenerator.domain_xml(params, True)
self.add_disk(params, domain)
@@ -151,6 +158,12 @@ class XmlBuilder:
self.write_toxml(hostdev)
return hostdev.toxml()
+ def build_graphics(self, params):
+ graphics = xmlgenerator.graphics_xml(params)
+ if __DEBUG__:
+ self.write_toxml(graphics)
+ return graphics.toxml()
+
def build_pool(self, params):
pool = xmlgenerator.pool_xml(params)
if __DEBUG__:
@@ -242,6 +255,20 @@ if __name__ == "__main__":
interfacexml = xmlobj.build_interface(params)
+ #--------------------------
+ # get graphics xml string
+ #--------------------------
+ print '=' * 30, 'graphics xml', '=' * 30
+ params['graphtype'] = 'spice'
+ params['image'] = 'auto_glz'
+ params['jpeg'] = 'auto'
+ params['zlib'] = 'auto'
+ params['playback'] = 'on'
+ params['streaming'] = 'filter'
+ params['clipboard'] = 'no'
+
+ graphicsxml = xmlobj.build_graphics(params)
+
#---------------------
# get pool xml string
#---------------------
@@ -297,6 +324,13 @@ if __name__ == "__main__":
params['memory'] = '1048576'
params['vcpu'] = '2'
params['inputbus'] = 'usb'
+ params['graphtype'] = 'spice'
+ params['image'] = 'auto_glz'
+ params['jpeg'] = 'auto'
+ params['zlib'] = 'auto'
+ params['playback'] = 'on'
+ params['streaming'] = 'filter'
+ params['clipboard'] = 'no'
params['sound'] = 'ac97'
params['bootcd'] = '/iso/rhel5.iso'
@@ -367,7 +401,7 @@ if __name__ == "__main__":
#----------------------------------------
# get domain snapshot xml string
#----------------------------------------
- params['name'] = 'hello'
+ params['snapshotname'] = 'hello'
params['description'] = 'hello snapshot'
snapshot_xml = xmlobj.build_domain_snapshot(params)
diff --git a/utils/Python/xmlgenerator.py b/utils/Python/xmlgenerator.py
index d57dd33..460f2e5 100644
--- a/utils/Python/xmlgenerator.py
+++ b/utils/Python/xmlgenerator.py
@@ -233,12 +233,6 @@ def domain_xml(params, install = False):
input_element.setAttribute('bus', 'ps2')
devices_element.appendChild(input_element)
- # <graphics>
- graphics_element = domain.createElement('graphics')
- graphics_element.setAttribute('type', 'vnc')
- graphics_element.setAttribute('port', '-1')
- graphics_element.setAttribute('keymap', 'en-us')
- devices_element.appendChild(graphics_element)
domain_element.appendChild(devices_element)
# <sound>
@@ -253,6 +247,62 @@ def domain_xml(params, install = False):
return domain
+def graphics_xml(params):
+ graphics = xml.dom.minidom.Document()
+ # <graphics>
+ graphics_element = graphics.createElement('graphics')
+ if not params.has_key('graphtype'):
+ params['graphtype'] = 'vnc'
+
+ graphics_element.setAttribute('type', params['graphtype'])
+ graphics.appendChild(graphics_element)
+
+ if params['graphtype'] == 'vnc':
+ graphics_element.setAttribute('port', '-1')
+ graphics_element.setAttribute('keymap', 'en-us')
+ elif params['graphtype'] == 'spice':
+ graphics_element.setAttribute('autoport', 'yes')
+ if params.has_key('image'):
+ image_element = graphics.createElement('image')
+ # image to set image compression (accepts
+ # auto_glz, auto_lz, quic, glz, lz, off)
+ image_element.setAttribute('compression', params['image'])
+ graphics_element.appendChild(image_element)
+ if params.has_key('jpeg'):
+ jpeg_element = graphics.createElement('jpeg')
+ # jpeg for JPEG compression for images over wan (accepts
+ # auto, never, always)
+ jpeg_element.setAttribute('compression', params['jpeg'])
+ graphics_element.appendChild(jpeg_element)
+ if params.has_key('zlib'):
+ zlib_element = graphics.createElement('zlib')
+ # zlib for configuring wan image compression (accepts
+ # auto, never, always)
+ zlib_element.setAttribute('compression', params['zlib'])
+ graphics_element.appendChild(zlib_element)
+ if params.has_key('playback'):
+ playback_element = graphics.createElement('playback')
+ # playback for enabling audio stream compression (accepts on or off)
+ playback_element.setAttribute('compression', params['playback'])
+ graphics_element.appendChild(playback_element)
+ if params.has_key('streaming'):
+ streaming_element = graphics.createElement('streaming')
+ # streaming: set its mode attribute to one of
+ # filter, all or off
+ streaming_element.setAttribute('mode', params['streaming'])
+ graphics_element.appendChild(streaming_element)
+ if params.has_key('clipboard'):
+ clipboard_element = graphics.createElement('clipboard')
+ # Copy & Paste functionality is enabled by default, and can
+ # be disabled by setting the copypaste property to no
+ clipboard_element.setAttribute('copypaste', params['clipboard'])
+ graphics_element.appendChild(clipboard_element)
+ else:
+ print 'Wrong graphics type was specified.'
+ sys.exit(1)
+
+ return graphics
+
def disk_xml(params, cdrom = False):
disk = xml.dom.minidom.Document()
# <disk> -- START
--
1.7.4.4
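For readers who want to try the XML composition outside the test-API tree, here is a standalone Python 3 sketch of the same idea. It is a condensed rewrite, not the patch itself: it uses params.get() so that a missing 'graphtype' falls back to 'vnc' without modifying the caller's dict, drives the spice sub-elements from a table, and raises ValueError instead of calling sys.exit(). The SPICE_CHILDREN name is hypothetical.

```python
import xml.dom.minidom

# (element name, attribute name) pairs for the spice sub-elements.
SPICE_CHILDREN = [
    ("image", "compression"),
    ("jpeg", "compression"),
    ("zlib", "compression"),
    ("playback", "compression"),
    ("streaming", "mode"),
    ("clipboard", "copypaste"),
]


def graphics_xml(params):
    """Return a minidom Document holding a <graphics> element."""
    doc = xml.dom.minidom.Document()
    graphtype = params.get("graphtype", "vnc")
    element = doc.createElement("graphics")
    element.setAttribute("type", graphtype)
    doc.appendChild(element)

    if graphtype == "vnc":
        element.setAttribute("port", "-1")
        element.setAttribute("keymap", "en-us")
    elif graphtype == "spice":
        element.setAttribute("autoport", "yes")
        for tag, attr in SPICE_CHILDREN:
            if tag in params:
                child = doc.createElement(tag)
                child.setAttribute(attr, params[tag])
                element.appendChild(child)
    else:
        raise ValueError("unsupported graphics type: %r" % graphtype)
    return doc


if __name__ == "__main__":
    demo = {"graphtype": "spice", "image": "auto_glz", "clipboard": "no"}
    print(graphics_xml(demo).documentElement.toprettyxml())
```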
12 years, 11 months
[libvirt] [PATCH 00/14] Add a virtlockd lock manager daemon
by Daniel P. Berrange
The lock manager infrastructure we recently added to QEMU only has
two possible drivers at this time, 'nop' and 'sanlock'. The former
does absolutely nothing, while the latter requires a 3rd party
package installed and is a little heavy on disk I/O and storage
requirements.
This series adds a new daemon 'virtlockd' which is intended to be
enabled by default on all hosts running 'libvirtd'. This daemon
provides a service for disk locking based on the traditional
fcntl() lock primitives. There is a new libvirt manager plugin
which talks to this daemon over RPC. The reason for doing the
locks in a separate process is that we want the locks to remain
active, even if libvirtd crashes, or is restarted. The virtlockd
daemon has this one single job so should be pretty reliable and
self-contained. This patch series really benefits from the new RPC
APIs, requiring minimal code for the new daemon / client.
At this time, virtlockd does not lock the actual disk files, but
instead creates a lockspace & leases under /var/lib/libvirt/lockd.
The lockspace we use for disks is named org.libvirt.lockd.files,
and lease names are based on a SHA256 checksum of the fully
qualified disk name. eg
/var/lib/libvirt/lockd/org.libvirt.lockd.files/adf94fc33a24da1abff7dd7374a9919bb51efee646da8c3ac464c10cd59750bd
These leases are all zero-bytes long and no I/O is ever performed
on them, only fcntl() is used. So there is no material overhead.
Whenever creating or deleting leases, we first acquire a lock on
/var/lib/libvirt/lockd/org.libvirt.lockd.files/org.libvirt.lockd.index
A non-root virtlockd will instead use $HOME/.libvirt/lockd
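The lease naming and locking scheme described above can be sketched as follows. This is an illustration under stated assumptions, not virtlockd's code: the function names are hypothetical, the exact string that gets hashed is assumed to be the UTF-8 disk path, and the lock on the index file taken when creating or deleting leases is left out.

```python
import fcntl
import hashlib
import os

LOCKSPACE = "org.libvirt.lockd.files"


def lease_path(lockd_dir, disk_path):
    """Map a fully qualified disk path to its lease file path."""
    name = hashlib.sha256(disk_path.encode("utf-8")).hexdigest()
    return os.path.join(lockd_dir, LOCKSPACE, name)


def acquire_lease(lockd_dir, disk_path):
    """Create the zero-byte lease if needed and take an exclusive lock.

    Returns the open fd; the lock is held until the fd is closed.
    Raises OSError if another process already holds the lock.
    """
    path = lease_path(lockd_dir, disk_path)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Non-blocking fcntl()-style lock; no data is written to the lease.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        raise
    return fd
```

Because the lock lives on the open file descriptor of the process that took it, keeping that process separate from libvirtd is what lets the lock survive a libvirtd crash or restart.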
By default we gain protection out of the box against
- Starting two guests on the same host with the same disk image
not marked with <shareable/>
- libvirtd getting confused and forgetting a guest, allowing it
to be started for a 2nd time
If the admin mounts a shared filesystem (eg NFS) on /var/lib/libvirt/lockd
then this protection is extended across all hosts sharing that
mount volume.
As part of this series, I also introduce support for systemd
services for libvirtd and libvirt-guests.
12 years, 11 months
[libvirt] [RFC PATCH 0/5] Support online resizing of block devices.
by Osier Yang
This patch series introduces a new API, "virDomainBlockResize", to expose
the qemu monitor command "block_resize", which resizes a block
device while the domain is running.
The prototype for the new API is:
int
virDomainBlockResize (virDomainPtr dom,
const char *path,
unsigned long long size,
unsigned int flags)
* "@path" is the absolute path of the block device, which can be
extracted from the domain XML.
* The unit for "@size" is kilobytes, which might not be the best choice.
(qemu HMP uses megabytes as the default unit and QMP uses bytes, so
we need to divide "@size" by 1024 for HMP and multiply "@size" by 1024
for QMP. We also need to check for overflow.) Any ideas on this are
welcome.
* "@flags" is unused currently.
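The unit conversion and overflow check discussed above can be sketched like this. The function names are hypothetical, the limit assumes a 64-bit unsigned long long, and rounding down for HMP is an assumption (the series does not say how a size that is not a whole number of megabytes should be handled).

```python
ULLONG_MAX = 2**64 - 1  # assumes a 64-bit unsigned long long


def size_for_qmp(size_kib):
    """Convert the API's kilobyte size to bytes for QMP."""
    if size_kib < 0 or size_kib > ULLONG_MAX // 1024:
        raise OverflowError("size does not fit in unsigned long long")
    return size_kib * 1024


def size_for_hmp(size_kib):
    """Convert the API's kilobyte size to megabytes for HMP (rounds down)."""
    if size_kib < 0 or size_kib > ULLONG_MAX:
        raise OverflowError("size does not fit in unsigned long long")
    return size_kib // 1024


if __name__ == "__main__":
    one_gib_in_kib = 1024 * 1024
    print(size_for_qmp(one_gib_in_kib), size_for_hmp(one_gib_in_kib))
```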
[PATCH 1/5] block_resize: Define the new API
[PATCH 2/5] block_resize: Wire up the remote protocol
[PATCH 3/5] block_resize: Implement qemu monitor functions
[PATCH 4/5] block_resize: Implement qemu driver method
[PATCH 5/5] block_resize: Expose the new API to virsh
12 years, 12 months
[libvirt] bug: try to take disk snapshot for LVM2 Volume
by MATSUDA, Daiki
I tried the new snapshot function implemented by Eric Blake.
It works very well for QCOW2 disk images.
But I often use LVM2 volumes for QEMU virtual machines, so I tried to take
a disk snapshot with the virsh command (snapshot-create DOMNAME --disk-only).
The qemu monitor command 'snapshot_blkdev' accepts the LVM2
volume and creates a QCOW2 snapshot image. In addition, the domain's
configuration file is changed to use the snapshot disk image instead of
the LVM2 volume.
The configuration file changes from
....
<disk type='block' device='disk'>
<driver name='qemu' type='raw' cache='none'/>
<source dev='/dev/VG1/LVM2_dom'/>
....
to
<disk type='block' device='disk'>
<driver name='qemu' type='qcow2' cache='none'/>
<source dev='/dev/VG1/LVM2_dom.1317357844'/>
After that, the domain runs well until it is shut down. When I start the
domain again, it fails with the following error:
virsh # start LVM2_dom
error: Failed to start domain LVM2_dom
error: internal error Process exited while reading console log output: char
device redirected to /dev/pts/7
qemu: could not open disk image /dev/VG1/LVM2_dom.1317357844: Invalid
argument.
I think libvirt should refuse the request when the given disk is a volume
rather than a qcow2 image, e.g. in qemuDomainSnapshotCreateDiskActive(),
by checking the volume driver type. But currently the structures related
to snapshots and disks have no member to hold such volume driver
information. In addition, as we would like LVM2 and other volume snapshot
functions to be added, we hope you will add this information and fix the
problem.
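The kind of check proposed above could be prototyped outside libvirt along these lines. It is a sketch of the idea only, not a patch: the function name is hypothetical, and it simply treats every disk with type='block' in the domain XML as one for which a disk-only snapshot should be refused.

```python
import xml.etree.ElementTree as ET


def block_disks(domain_xml):
    """Return the source devices of all block-backed disks in a domain."""
    root = ET.fromstring(domain_xml)
    found = []
    for disk in root.findall("./devices/disk"):
        if disk.get("type") == "block":
            source = disk.find("source")
            found.append(source.get("dev") if source is not None else None)
    return found


if __name__ == "__main__":
    xml_text = """
    <domain type='kvm'>
      <devices>
        <disk type='block' device='disk'>
          <driver name='qemu' type='raw' cache='none'/>
          <source dev='/dev/VG1/LVM2_dom'/>
        </disk>
      </devices>
    </domain>
    """
    for dev in block_disks(xml_text):
        print("refusing disk-only snapshot for block device", dev)
```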
Regards
MATSUDA Daiki
13 years