[libvirt] Supporting vhost-net and macvtap in libvirt for QEMU
by Anthony Liguori
Disclaimer: I am neither an SR-IOV nor a vhost-net expert, but I've CC'd
people that are who can throw tomatoes at me for getting bits wrong :-)
I wanted to start a discussion about supporting vhost-net in libvirt.
vhost-net has not yet been merged into qemu but I expect it will be soon
so it's a good time to start this discussion.
There are two modes worth supporting for vhost-net in libvirt. The
first mode is where vhost-net backs to a tun/tap device. This is
behaves in very much the same way that -net tap behaves in qemu today.
Basically, the difference is that the virtio backend is in the kernel
instead of in qemu so there should be some performance improvement.
Current, libvirt invokes qemu with -net tap,fd=X where X is an already
open fd to a tun/tap device. I suspect that after we merge vhost-net,
libvirt could support vhost-net in this mode by just doing -net
vhost,fd=X. I think the only real question for libvirt is whether to
provide a user visible switch to use vhost or to just always use vhost
when it's available and it makes sense. Personally, I think the later
makes sense.
The more interesting invocation of vhost-net though is one where the
vhost-net device backs directly to a physical network card. In this
mode, vhost should get considerably better performance than the current
implementation. I don't know the syntax yet, but I think it's
reasonable to assume that it will look something like -net
tap,dev=eth0. The effect will be that eth0 is dedicated to the guest.
On most modern systems, there is a small number of network devices so
this model is not all that useful except when dealing with SR-IOV
adapters. In that case, each physical device can be exposed as many
virtual devices (VFs). There are a few restrictions here though. The
biggest is that currently, you can only change the number of VFs by
reloading a kernel module so it's really a parameter that must be set at
startup time.
I think there are a few ways libvirt could support vhost-net in this
second mode. The simplest would be to introduce a new tag similar to
<source network='br0'>. In fact, if you probed the device type for the
network parameter, you could probably do something like <source
network='eth0'> and have it Just Work.
Another model would be to have libvirt see an SR-IOV adapter as a
network pool whereas it handled all of the VF management. Considering
how inflexible SR-IOV is today, I'm not sure whether this is the best model.
Has anyone put any more thought into this problem or how this should be
modeled in libvirt? Michael, could you share your current thinking for
-net syntax?
--
Regards,
Anthony Liguori
1 year, 1 month
[libvirt] Don't add iptables rules when creating networks
by Felix Schwarz
Hi,
I just found out that libvirt always add some iptables rules if it creates a
natted (or routed) network. There were a couple of mailing list posts about
this so I'm pretty sure this is not news to you.
I don't want to go into the debate if your approach is sensible or not (I
guess there are some use cases where I kind of like it). However on my server
machine I really need full control over my (rather complicated) firewall
settings.
Currently the newly added rules really create a lot of problems for me. For
example if I manage to have a good configuration after startup and then start
a libvirt network afterwards, it will inject its rules at the start of the
FORWARD queue (even though the same parameters are already present at the
end!). On every net start there will be more duplicated rules and they will
take preference over my existing rules.
Besides that specific issue I think this is only one tiny problem compared to
others (central configuration of firewall rules, auditing requirements, ...).
Therefore I would like to have some kind 'power user' flag that prevents
libvirt from adding any filter rules. I'm fine with activating it manually as
long as I don't have to patch libvirt.
fs
14 years, 6 months
[libvirt] cannot start a qemu64 model on an Intel host
by Dan Kenigsberg
If I specify
<cpu match="exact">
<model>qemu64</model>
</cpu>
according to the new cpu schema, I get
error: internal error guest CPU is not compatible with host CPU
because qemu64 supports svm, and my host does not.
However, the error remains when I explicitly ask to disable svm with
<feature policy="disable" name="svm"/>
I am not sure if this is a bug or an intended feature, but I'd expect
cpu_disable taken into account when doing
x86ModelCompare(host_model, cpu_require)
Regards,
Dan.
14 years, 8 months
[libvirt] Two issues regarding TFTP support in virtual networks...
by Darryl L. Pierce
The first is just a wiki fix: the wiki says this functionality is
available as of 0.7.1 of libvirt. The code though is only in the 0.7.2
tag and later. So the wiki should say 0.7.2 instead.
The second regards how I'm using it and what I'm doing wrong. I'm
creating a virtual network and pointing it to a temporary directory
where I've run livecd-iso-to-pxeboot to setup an ISO file for PXE
booting. So the network XML looks like this:
(mcpierce@mcpierce-desktop:node-image)$ sudo virsh net-dumpxml testbr541
<network>
<name>testbr541</name>
<uuid>f389317f-8420-7516-df9d-756b7deb3d37</uuid>
<forward mode='nat'/>
<bridge name='testbr541' stp='on' delay='0' />
<ip address='192.168.31.1' netmask='255.255.255.0'>
<tftp root='/tmp/tmp.3B8opJfBXw/tftpboot/' />
<dhcp>
<range start='192.168.31.100' end='192.168.31.199' />
</dhcp>
</ip>
</network>
When I start up the VM, I see it get an IP address within the range
specified, but it never pxeboots the ISO image.
To what directory should the root attribute be pointed?
--
Darryl L. Pierce, Sr. Software Engineer @ Red Hat, Inc.
Delivering value year after year.
Red Hat ranks #1 in value among software vendors.
http://www.redhat.com/promo/vendor/
14 years, 8 months
[libvirt] [RFC] Proposal for introduction of network traffic filtering capabilities for filtering of network traffic from and to VMs
by Stefan Berger
Hello!
The following is a proposal to introduce network traffic filtering
capabilities for the network traffic originating from and destined to
virtual machines. Libvirt's capabilities will be extended to allow users
to describe what traffic filtering rules are to be applied on a virtual
machine (using XML) and libvirt then creates the appropriate ebtables and
iptables rules and applies those on the host when the virtual machine
starts up, resumes after suspension or resumes on a new host after
migration. libvirt tears down the traffic filtering rules when the virtual
machine shuts down, suspends, or a VM is migrated to another host. It will
also be possible to modify the filtering rules while a virtual machine is
running. In this architecture we apply the firewall rules on the outside
of the virtual machines on the Linux host and make use of the fact that
virtual machines can be configured by libvirt to have their network
traffic pass through the host and the host exposes (tap) 'backend'
interfaces on which the firewall rules can be applied. The application of
the firewall rules is optional and those who do not want to introduce a
raw network throughput performance hit on their system due to the
evaluation of every packet passing through the filtering chains, do not
have to use these capabilities. An initial implementation would be done
for libvirt's Qemu support.
As stated above, the application of firewall rules on virtual machines
will require some form of XML description that lets the user choose what
type of filtering is to be performed. In effect the user needs to be able
to tell libvirt which ebtables and iptables rules to generate so that the
appropriate filtering can be done. We realize that ebtables and iptables
have lots of capabilities but think that we need to restrict which
capabilities can actually be 'reached' through this XML.
The following proposal is based on an XML as defined by the DMTF (
http://www.dmtf.org/standards/cim/cim_schema_v2230/CIM_Network.pdf, slide
10) with extensions for processing of ARP packets.It gives control over
the evaluation of Ethernet (MAC) frames, ARP packet data and IP header
information. A (draft) XML skeleton in this case could look as follows:
<FilterEntry>
<Action>DROP|ACCEPT</Action> <!-- from FilterEntry -->
<TrafficType>incoming [to VM]|outgoing [from VM]</TrafficType>
<Hdr8021Filter>
<HdrSrcMACAddr8021> </HdrSrcMACAddr8021>
<HdrSrcMACMask8021> </HdrSrcMACMask8021>
<HdrDestMACAddr8021> </HdrDestMACAddr8021>
<HdrDestMACMask8021> </HdrDestMACMask8021>
<HdrProtocolID8021> numeric and/or string?
</HdrProtocolID8021>
<HdrPriorityValue8021></HdrPriorityValue8021>
<HdrVLANID8021> </HdrVLANID8021>
</Hdr8021Filter>
<ARPFilter>
<HWType> </HWType>
<ProtocolType> </ProtocolType>
<OPCode> </OPCode>
<SrcMACAddr> </SrcMACAddr>
<SrcIPAddr> </SrcIPAddr>
<DestMACAddr> </DestMACAddr>
<DestIPAddr> </DestIPAddr>
</ARPFilter>
<IPHeadersFilter>
<HdrIPVersion> </HdrIPVersion>
<HdrSrcAddress> </HdrSrcAddress>
<HdrSrcMask> </HdrSrcMask>
<HdrDestAddress> </HdrDestAddress>
<HdrDestMask> </HdrDestMask>
<HdrProtocolID> numeric and/or string? </HdrProtocolID>
<HdrSrcPortStart> </HdrSrcPortStart>
<HdrSrcPortEnd> </HdrSrcPortEnd>
<HdrDestPortStart> </HdrDestPortStart>
<HdrDestPortEnd> </HdrDestPortEnd>
<HdrDSCP> </HdrDSCP>
</IPHeadersFilter>
</FilterEntry>
Since the ebtables and iptables command cannot accept all possible
parameters at the same time, only a certain subset of the above parameters
may be set for any given filter command. Higher level application writers
will likely use a library that lets them choose which features they would
want to have enforced, such as no-broadcast or no-multicast, enforcement
of MAC spoofing prevention or ARP poisoning prevention, which then
generates this lower level XML rules in the appropriate order of the
rules.
Further, we will introduce filter pools where a collection of the above
filter rules can be stored and referenced to by virtual machines'
individual interface. A reference to such a filter pool entry will be
given in the interface description and may look as in the following draft
XML:
<interface type='bridge'>
<source bridge='virbr0'/>
<script path='vif-bridge'/>
<firewall>
<reference profile='generic-layer2' ip_address='10.0.0.1'/>
<reference profile='VM-specific-layer3'/>
</firewall>
</interface>
The above XML has one reference to a generic layer2 filter template XML
that should be usable by multiple virtual machines but would need to be
customized with interface-specific parameters. In this case, the IP
address of the interface is explicitly provided in order to make the
filter XML template a concrete XML for the particular interface. The MAC
address of the interface is not explicitly provided since it will already
have been randomly generated by libvirt at the point when this layer2
filter XML needs to be converted into concrete ebtables commands/rules. We
still need to determine how the format of a template should look like,
though an idea would be to indicate a placeholder for a VM's MAC address
using THIS_MAC and similarly for the IP address with THIS_IP as
placeholder.
Further, we would introduce the management of filter 'pools'. Considering
existing functionality in libvirt and CLI commands for similar type of
management functionality, the following CLI commands would be added:
virsh nwfilter-create <file>
virsh nwfilter-destroy <profile name>
virsh nwfilter-list <options>
virsh nwfilter-dumpxml <profile name>
virsh nwfilter-update <filename> (performs an update on an existing
profile replacing all rules)
possibly also:
virsh nwfilter-edit <profile name>
virsh nwfilter-create-from <profile name>
Please let us know what you think of this proposal.
Regards
Stefan, Gerhard and Vivek
14 years, 9 months
[libvirt] [PATCH 0/1] A couple of problems with the daemon-conf test
by David Allan
Here's a patch fixing two problems I found with the daemon-conf test that prevented the test from passing if the system libvirt was running on my system.
The first change only affects the direct running of the daemon-conf script, i.e, if someone executes ./daemon-conf, which will always exit failure because the default config file contains a line that the test believes should not be there. The invocation by make-check supplies a different config file that's not subject to that problem.
The second problem is that the corrupt config tests don't supply the pid-file argument to libvirt, so on my system it was attempting to use the same pidfile as the running system daemon and failing all the tests except the valid config test. Supplying the --pid-file argument when running the corrupt config tests as it is supplied for the valid config tests fixes the problem.
Dave
14 years, 9 months
[libvirt] [ PATCH ] fix multiple veth problem for OpenVZ
by Yuji NISHIDA
Dear all
This is to fix multiple veth problem.
NETIF setting was overwritten after first CT because any CT could not be found by character name.
---
src/openvz/openvz_conf.c | 38 ++++++++++++++++++++++++++++++++++++++
src/openvz/openvz_conf.h | 1 +
src/openvz/openvz_driver.c | 2 +-
3 files changed, 40 insertions(+), 1 deletions(-)
diff --git a/src/openvz/openvz_conf.c b/src/openvz/openvz_conf.c
index 43bbaf2..9fb9f7e 100644
--- a/src/openvz/openvz_conf.c
+++ b/src/openvz/openvz_conf.c
@@ -948,3 +948,41 @@ static int openvzAssignUUIDs(void)
VIR_FREE(conf_dir);
return 0;
}
+
+
+/*
+ * Return CTID from name
+ *
+ */
+
+int openvzGetVEID(char *name) {
+
+ char cmd[64];
+ int veid;
+ FILE *fp;
+
+ strcpy( cmd, VZLIST );
+ strcat( cmd, " " );
+ strcat( cmd, name );
+ strcat( cmd, " -ovpsid -H" );
+
+ if ((fp = popen(cmd, "r")) == NULL) {
+ openvzError(NULL, VIR_ERR_INTERNAL_ERROR, "%s", _("popen failed"));
+ return -1;
+ }
+
+ if (fscanf(fp, "%d\n", &veid ) != 1) {
+ if (feof(fp))
+ return -1;
+
+ openvzError(NULL, VIR_ERR_INTERNAL_ERROR,
+ "%s", _("Failed to parse vzlist output"));
+ goto cleanup;
+ }
+
+ return veid;
+
+ cleanup:
+ fclose(fp);
+ return -1;
+}
diff --git a/src/openvz/openvz_conf.h b/src/openvz/openvz_conf.h
index 00e18b4..518c267 100644
--- a/src/openvz/openvz_conf.h
+++ b/src/openvz/openvz_conf.h
@@ -66,5 +66,6 @@ void openvzFreeDriver(struct openvz_driver *driver);
int strtoI(const char *str);
int openvzSetDefinedUUID(int vpsid, unsigned char *uuid);
unsigned int openvzGetNodeCPUs(void);
+int openvzGetVEID(char *name);
#endif /* OPENVZ_CONF_H */
diff --git a/src/openvz/openvz_driver.c b/src/openvz/openvz_driver.c
index 196fd8c..879b5d0 100644
--- a/src/openvz/openvz_driver.c
+++ b/src/openvz/openvz_driver.c
@@ -667,7 +667,7 @@ openvzDomainSetNetwork(virConnectPtr conn, const char *vpsid,
if (net->type == VIR_DOMAIN_NET_TYPE_BRIDGE) {
virBuffer buf = VIR_BUFFER_INITIALIZER;
char *dev_name_ve;
- int veid = strtoI(vpsid);
+ int veid = openvzGetVEID(vpsid);
//--netif_add ifname[,mac,host_ifname,host_mac]
ADD_ARG_LIT("--netif_add") ;
--
1.5.2.2
-----
Yuji Nishida
nishidy(a)nict.go.jp
14 years, 9 months
[libvirt] [PATCH 0/9] virDomain{Attach,Detach}DeviceFlags patches
by Jim Fehlig
This set implements two new APIs, virDomainAttachDeviceFlags and
virDomainDetachDeviceFlags as discussed here
https://www.redhat.com/archives/libvir-list/2009-December/msg00124.html
Introduce two new APIs
virDomainAttachDeviceFlags(virDomainPtr dom, const char *xml,
unsigned flags)
virDomainDetachDeviceFlags(virDomainPtr dom, const char *xml,
unsigned flags)
with flags being one or more from VIR_DOMAIN_DEVICE_MODIFY_CURRENT,
VIR_DOMAIN_DEVICE_MODIFY_LIVE, VIR_DOMAIN_DEVICE_MODIFY_CONFIG.
If caller specifies CURRENT (default), add or remove the device
depending on the current state of domain. E.g. if domain is active add
or remove the device to/from live domain, if it is inactive change the
persistent config. If caller specifies LIVE, only change the active
domain. If caller specifies CONFIG, only change persistent config -
even if the domain is active. If caller specifies both LIVE and CONFIG,
then change both.
If a driver can not satisfy the exact requested flags it must return
an error. E.g if user specified LIVE but the driver can only change
live and persisted config, the driver must fail the request.
The existing virDomain{Attach,Detach}Device is now explicitly restricted
to active domains and is equivalent to
virDomain{Attach,Detach}DeviceFlags(LIVE).
Finally, virsh {attach,detach}-{disk,interface,device} has been modified
to add a --persistent flag in order to set the appropriate flags when
calling the new APIs.
Jim Fehlig (9):
Restrict virDomain{Attach,Detach}Device to active domains
Public API
Internal API
Public API Implementation
Wire protocol format
Remote driver
Server side dispatcher
domain{Attach,Detach}DeviceFlags handler for drivers
Modify virsh commands
daemon/remote.c | 53 ++++++++++++++++++
include/libvirt/libvirt.h.in | 13 +++++
src/driver.h | 10 ++++
src/esx/esx_driver.c | 2 +
src/libvirt.c | 120 ++++++++++++++++++++++++++++++++++++++----
src/libvirt_public.syms | 6 ++
src/lxc/lxc_driver.c | 2 +
src/opennebula/one_driver.c | 2 +
src/openvz/openvz_driver.c | 2 +
src/phyp/phyp_driver.c | 2 +
src/qemu/qemu_driver.c | 26 +++++++++
src/remote/remote_driver.c | 54 +++++++++++++++++++
src/remote/remote_protocol.x | 17 ++++++-
src/test/test_driver.c | 2 +
src/uml/uml_driver.c | 2 +
src/vbox/vbox_tmpl.c | 24 ++++++++
src/xen/proxy_internal.c | 4 +-
src/xen/xen_driver.c | 42 +++++++++++++--
src/xen/xen_driver.h | 4 +-
src/xen/xen_hypervisor.c | 4 +-
src/xen/xen_inotify.c | 4 +-
src/xen/xend_internal.c | 98 ++++++++++++++++++++++++++++------
src/xen/xm_internal.c | 30 +++++++----
src/xen/xs_internal.c | 4 +-
tools/virsh.c | 55 +++++++++++++++++--
25 files changed, 523 insertions(+), 59 deletions(-)
14 years, 10 months