[libvirt] Supporting vhost-net and macvtap in libvirt for QEMU
by Anthony Liguori
Disclaimer: I am neither an SR-IOV nor a vhost-net expert, but I've CC'd
people that are who can throw tomatoes at me for getting bits wrong :-)
I wanted to start a discussion about supporting vhost-net in libvirt.
vhost-net has not yet been merged into qemu but I expect it will be soon
so it's a good time to start this discussion.
There are two modes worth supporting for vhost-net in libvirt. The
first mode is where vhost-net backs to a tun/tap device. This is
behaves in very much the same way that -net tap behaves in qemu today.
Basically, the difference is that the virtio backend is in the kernel
instead of in qemu so there should be some performance improvement.
Current, libvirt invokes qemu with -net tap,fd=X where X is an already
open fd to a tun/tap device. I suspect that after we merge vhost-net,
libvirt could support vhost-net in this mode by just doing -net
vhost,fd=X. I think the only real question for libvirt is whether to
provide a user visible switch to use vhost or to just always use vhost
when it's available and it makes sense. Personally, I think the later
makes sense.
The more interesting invocation of vhost-net though is one where the
vhost-net device backs directly to a physical network card. In this
mode, vhost should get considerably better performance than the current
implementation. I don't know the syntax yet, but I think it's
reasonable to assume that it will look something like -net
tap,dev=eth0. The effect will be that eth0 is dedicated to the guest.
On most modern systems, there is a small number of network devices so
this model is not all that useful except when dealing with SR-IOV
adapters. In that case, each physical device can be exposed as many
virtual devices (VFs). There are a few restrictions here though. The
biggest is that currently, you can only change the number of VFs by
reloading a kernel module so it's really a parameter that must be set at
startup time.
I think there are a few ways libvirt could support vhost-net in this
second mode. The simplest would be to introduce a new tag similar to
<source network='br0'>. In fact, if you probed the device type for the
network parameter, you could probably do something like <source
network='eth0'> and have it Just Work.
Another model would be to have libvirt see an SR-IOV adapter as a
network pool whereas it handled all of the VF management. Considering
how inflexible SR-IOV is today, I'm not sure whether this is the best model.
Has anyone put any more thought into this problem or how this should be
modeled in libvirt? Michael, could you share your current thinking for
-net syntax?
--
Regards,
Anthony Liguori
1 year, 3 months
[libvirt] [PATCH 0/2] qemu_cgroup: allow access to /dev/dri/render*
by Ján Tomko
Technically a v2, but v1 is already pushed.
This version is based on the <gl enable> in <spice> instead
of accel3d="yes" in <video><model type="virtio".
It also only allows access to the render* devices, instead of all of them.
https://bugzilla.redhat.com/show_bug.cgi?id=1337290
Ján Tomko (2):
Revert "qemu_cgroup: allow access to /dev/dri for virtio-vga"
qemu_cgroup: allow access to /dev/dri/render*
src/qemu/qemu_cgroup.c | 71 ++++++++++++++++++++++++++++++++++++++++----------
1 file changed, 57 insertions(+), 14 deletions(-)
--
2.7.3
8 years, 1 month
Re: [libvirt] question about rdma migration
by Michael R. Hines
Hi Roy,
On 02/09/2016 03:57 AM, Roy Shterman wrote:
> Hi,
>
> I tried to understand the rdma-migration in qemu code and i have two
> questions about it:
>
> 1. I'm working with qemu-kvm using libvirt and i'm getting
>
> MEMLOCK max locked-in-memory address space 65536 65536 bytes
>
> in qemu process so I don't understand how can you use rdma-pin-all
> with such low MEMLOCK.
>
> I found a solution in libvirt to lock all vm memory in advance and to
> enlarge MEMLOCK.
> It uses memoryBacking locking and memory tuning hard_limit of vm
> memory but I couldn't find a usage of this in rdma-migration code.
>
You're absolutey right, the RDMA migration code itself doesn't set this
lock limit explicitly because there are system-wide restrictions in both
appArmour,
/etc/security, as well as SELINUX that restrict applications from
arbitrarily setting their maximum memory lock limits.
The other problem is CGROUPS: If someone sets a cgroup control for
maximum memory and forgets about that mlock() limits, then
there will be a conflict.
So, libvirt must have a policy to deal with all of these possibilities,
not just handle a special case for RDMA migration.
The only way "simple" way (without patching the problems above) to apply
a higher lock limit to QEMU is to set the ulimit for libvirt
(or for QEMU if starting QEMU manually) in your environment or the
command line with $ ulimit # before attempting the migration,
then the RDMA subsystem will be able to lock the memory successfully.
The other option is to use /etc/security/limits.conf and set the option
for a specific libvirt process user and make sure your libvirt/qemu
are not running as root.
QEMU itself also has a "mlock" option built into the command line, but
it also suffers from the same problem --- you have to find
a way (currently) to increase the limit before using the option.
> 2. Do you have any comparison of IOPS and bandwidth between TCP
> migration and rdma migration?
>
Yes, lots of comparisons.
http://wiki.qemu.org/Features/RDMALiveMigration
http://www.canturkisci.com/ETC/papers/IBMJRD2011/preprint.pdf
> Regards,
> Roy
>
>
8 years, 3 months
[libvirt] Qemu: create empty cdrom
by Gromak Yuriy
Hello.
Qemu is latest from master branch.
Tryingto start a domain, which is connected toa blankcdrom:
<disk type='file' device='cdrom'>
<driver name='qemu' type='raw'/>
<target dev='sdb' bus='scsi'/>
<readonly/>
<address type='drive' controller='0' target='1' bus='0'
unit='0'/>
</disk>
But I get an error:
qemu-system-x86_64: -drive
if=none,id=drive-scsi0-0-1-0,readonly=on,format=raw: Can't use 'raw' as
a block driver for the protocol level.
8 years, 3 months
[libvirt] [PATCH 0/3] add option to keep nvram file on undefine
by Nikolay Shirokovskiy
There is already a patch [1] on this topic with a different approach - keep
nvram file by default. There is also some discussion there. To sum up keeping
nvram on undefine could be useful in some usecases so there should be an option
to do it. On the other hand there is a danger of leaving domain assets after
its undefine and unsing them unintentionally on defining domain with the same
name.
AFAIU keeping nvram by default was motivated by domain disks behaviour.
I think there is a difference as libvirt never create disks for domain as
opposed to nvram and managed save and without disks domain will not start so
user is quite aware of disks files. On the other hand one can start using nvram
file solely putting <nvram> in config and managed save is created on daemon
shutdown. So user is much less aware of nvram and managed save existence. Thus
one can easily mess up by unaware define $name/using/undefine/define $name again
usecase. Thus I vote for keeping said assets only if it is specified explicitly
so user knows what he is doing.
Adding option to undefine is best solution I come up with. The other options
are add checks on define or start and both are impossible. Such a check should
be done without any extra flags for it to be useful but this way we break
existing users.
As this a proof of concept this series does not add extra flag for managed save.
[1] https://www.redhat.com/archives/libvir-list/2015-February/msg00915.html
Nikolay Shirokovskiy (3):
api: add VIR_DOMAIN_UNDEFINE_KEEP_NVRAM flag
qemu: add VIR_DOMAIN_UNDEFINE_KEEP_NVRAM support
virsh: add --keep-nvram option to undefine command
include/libvirt/libvirt-domain.h | 1 +
src/qemu/qemu_driver.c | 26 +++++++++++++++++---------
tools/virsh-domain.c | 8 ++++++++
tools/virsh.pod | 6 +++---
4 files changed, 29 insertions(+), 12 deletions(-)
--
1.8.3.1
8 years, 4 months
[libvirt] [libvirt-glib/libvirt-gconfig 00/17] Graphics: Introduce the new Remote and Local classes (and also implement a few missing methods).
by Fabiano Fidêncio
While trying to use libvirt-gobject and libvirt-gconfig for accessing VMs
and looking at their config, instead of using libvirt and parsing XML
directly, I found out that a few methods have been missing and that
libvirt-gconfig is not exactly thought for the "reading their config" use
case (see more explanations on the 10th and 14th commits.
This series, unfortunately, introduces an ABI breakage.
Fabiano Fidêncio (17):
gconfig: Implement gvir_config_domain_graphics_vnc_get_autoport()
gconfig: Implement gvir_config_domain_graphics_spice_get_autoport()
gconfig: Implement gvir_config_domain_graphics_rdp_get_autoport()
gconfig: Implement gvir_config_domain_graphics_sdl_get_display()
gconfig: Implement gvir_config_domain_graphics_sdl_get_fullscreen()
gconfig: Implement gvir_config_domain_graphics_spice_get_tls_port()
gconfig: Implement gvir_config_domain_graphics_spice_{get,set}_host()
gconfig: Implement gvir_config_domain_graphics_vnc_{get,set}_host()
gconfig: Implement gvir_config_domain_graphics_rdp_{get,set}_host()
gconfig: Add GVirCofigDomainGraphicsRemote class
gconfig: Adapt GVirConfigDomainGraphicsSpice to
GVirConfigDomainGraphicsRemote
gconfig: Adapt GVirConfigDomainGraphicsRdp to
GVirConfigDomainGraphicsRemote
gconfig: Adapt GVirConfigDomainGraphicsVnc to
GVirConfigDomainGraphicsRemote
gconfig: Add GVirCofigDomainGraphicsLocal class
gconfig: Adapt GVirConfigDomainGraphicsSdl to
GVirConfigDomainGraphicsLocal
gconfig: Adapt GVirConfigDomainGraphicsDesktop to
GVirConfigDomainGraphicsLocal
gconfig,graphics: Avoid crash when gvir_config_object_new_from_xml()
returns NULL
libvirt-gconfig/Makefile.am | 4 +
.../libvirt-gconfig-domain-graphics-desktop.c | 14 ++-
.../libvirt-gconfig-domain-graphics-desktop.h | 4 +-
.../libvirt-gconfig-domain-graphics-local.c | 97 +++++++++++++++++++
.../libvirt-gconfig-domain-graphics-local.h | 68 ++++++++++++++
.../libvirt-gconfig-domain-graphics-rdp.c | 32 ++++++-
.../libvirt-gconfig-domain-graphics-rdp.h | 9 +-
.../libvirt-gconfig-domain-graphics-remote.c | 103 +++++++++++++++++++++
.../libvirt-gconfig-domain-graphics-remote.h | 70 ++++++++++++++
.../libvirt-gconfig-domain-graphics-sdl.c | 19 +++-
.../libvirt-gconfig-domain-graphics-sdl.h | 6 +-
.../libvirt-gconfig-domain-graphics-spice.c | 40 +++++++-
.../libvirt-gconfig-domain-graphics-spice.h | 10 +-
.../libvirt-gconfig-domain-graphics-vnc.c | 32 ++++++-
.../libvirt-gconfig-domain-graphics-vnc.h | 9 +-
libvirt-gconfig/libvirt-gconfig.h | 2 +
libvirt-gconfig/libvirt-gconfig.sym | 20 ++++
po/POTFILES.in | 2 +
18 files changed, 513 insertions(+), 28 deletions(-)
create mode 100644 libvirt-gconfig/libvirt-gconfig-domain-graphics-local.c
create mode 100644 libvirt-gconfig/libvirt-gconfig-domain-graphics-local.h
create mode 100644 libvirt-gconfig/libvirt-gconfig-domain-graphics-remote.c
create mode 100644 libvirt-gconfig/libvirt-gconfig-domain-graphics-remote.h
--
2.5.0
8 years, 4 months
[libvirt] [PATCH 0/6] auto-assign addresses when <address type='pci'/> is specified
by Laine Stump
This is an alternative to Cole's series that permits <address
type='pci'/> to force assignment of a PCI address, which is
particularly useful on platforms that could connect the same device in
different ways (e.g. aarch64/virt).
Here is Cole's last iteration of the series:
https://www.redhat.com/archives/libvir-list/2016-May/msg01088.html
I had expressed a dislike of the "auto_allocate" flag that his series
temporarily adds to the address (while simultaneously changing the
address type to NONE) and suggested just changing all the necessary
places to check for a valid PCI address instead of just checking the
address type. He replied that this wasn't as simple as it looked, so I
decided to try it; turns out he's right. But I still like this method
better because it's not playing tricks with the address type, or
adding a temporary private attribute to what should be pure config
data.
Your opinion may vary though :-)
Note that patch 5/6 incorporates the same test case that Cole used in
his penultimate patch, and I've added his patch to check the case of
aarch64 at the end as well.
Cole Robinson (1):
tests: qemu: test <address type='pci'/> with aarch64
Laine Stump (5):
conf: move virDomainDeviceInfo definition from domain_conf.h to
device_conf.h
conf: new functions to check if PCI address is wanted/present
conf: allow type='pci' addresses with no address attributes specified
bhyve: auto-assign addresses when <address type='pci'/> is specified
qemu: auto-assign addresses when <address type='pci'/> is specified
docs/schemas/basictypes.rng | 8 +-
src/bhyve/bhyve_device.c | 10 +-
src/conf/device_conf.c | 6 +-
src/conf/device_conf.h | 132 ++++++++++++++++++++-
src/conf/domain_addr.c | 2 +-
src/conf/domain_conf.c | 13 +-
src/conf/domain_conf.h | 129 --------------------
src/qemu/qemu_domain_address.c | 64 +++++-----
...l2argv-aarch64-virtio-pci-manual-addresses.args | 4 +-
...ml2argv-aarch64-virtio-pci-manual-addresses.xml | 5 +
.../qemuxml2argv-pci-autofill-addr.args | 25 ++++
.../qemuxml2argv-pci-autofill-addr.xml | 35 ++++++
tests/qemuxml2argvtest.c | 1 +
...2xmlout-aarch64-virtio-pci-manual-addresses.xml | 5 +
.../qemuxml2xmlout-pci-autofill-addr.xml | 41 +++++++
tests/qemuxml2xmltest.c | 1 +
16 files changed, 298 insertions(+), 183 deletions(-)
create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-pci-autofill-addr.args
create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-pci-autofill-addr.xml
create mode 100644 tests/qemuxml2xmloutdata/qemuxml2xmlout-pci-autofill-addr.xml
--
2.5.5
8 years, 5 months
[libvirt] [RFC PATCH 0/2] nodeinfo: PPC64: Fix topology and siblings info on capabilities and nodeinfo
by Shivaprasad G Bhat
The nodeinfo output was fixed earlier to reflect the actual cpus available in
KVM mode on PPC64. The earlier fixes covered the aspect of not making a host
look overcommitted when its not. The current fixes are aimed at helping the
users make better decisions on the kind of guest cpu topology that can be
supported on the given sucore_per_core setting of KVM host and also hint the
way to pin the guest vcpus efficiently.
I am planning to add some test cases once the approach is accepted.
With respect to Patch 2:
The second patch adds a new element to the cpus tag and I need your inputs on
if that is okay. Also if there is a better way. I am not sure if the existing
clients have RNG checks that might fail with the approach. Or if the checks
are not enoforced on the elements but only on the tags.
With my approach if the rng checks pass, the new element "capacity" even if
ignored by many clients would have no impact except for PPC64.
To the extent I looked at code, the siblings changes dont affect existing
libvirt functionality. Please do let me know otherwise.
---
Shivaprasad G Bhat (2):
nodeinfo: Reflect guest usable host topology on PPC64
Introduce capacity to virCapsHostNUMACellCPU to help vcpu pinning decisions
src/conf/capabilities.c | 3 +
src/conf/capabilities.h | 1
src/nodeinfo.c | 51 ++++++++++++++++++++++--
tests/vircaps2xmldata/vircaps-basic-4-4-2G.xml | 32 ++++++++-------
tests/vircaps2xmltest.c | 1
5 files changed, 67 insertions(+), 21 deletions(-)
--
Signature
8 years, 6 months
[libvirt] [PATCH 0/2] vz: add tcp and udp serial device support
by Nikolay Shirokovskiy
Nikolay Shirokovskiy (2):
vz: add mode of unix socket serial device to xml dump
vz: add tcp and udp serial device support
src/vz/vz_sdk.c | 111 ++++++++++++++++++++++++++++++++++++++++++++++++--------
1 file changed, 96 insertions(+), 15 deletions(-)
--
1.8.3.1
8 years, 6 months