[libvirt] [PATCH] Allow nwfilter functions to be compiled with C++
by Chris Lalancette
Unfortunately the NWFilter functions were outside of the
extern "C" { ... } declaration in include/libvirt/libvirt.h.in,
which means that they couldn't be properly used with C++. Move
them inside of the braces, which should fix the problem.
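
For illustration only, the usual guard pattern looks roughly like the
sketch below. The names DEMO_H and demo_add are hypothetical and not part
of libvirt; the point is just that every declaration has to sit between
the opening and the closing guard to get C linkage when included from C++.

```c
/* Hypothetical header content, shown inline so the sketch is self-contained. */
#ifndef DEMO_H
# define DEMO_H

# ifdef __cplusplus
extern "C" {
# endif

/* Declared inside the guard: a C++ caller sees C linkage (no name mangling). */
int demo_add(int a, int b);

# ifdef __cplusplus
}
# endif

#endif /* DEMO_H */

/* Definition, as it would appear in the C source file. */
int demo_add(int a, int b)
{
    return a + b;
}
```

A declaration placed after the closing brace would still compile as C, but
a C++ translation unit would look for a mangled symbol and fail at link
time, which matches the problem described above.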
Signed-off-by: Chris Lalancette <clalance(a)redhat.com>
---
include/libvirt/libvirt.h.in | 8 ++++----
1 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in
index d95d7ae..e6e4d0c 100644
--- a/include/libvirt/libvirt.h.in
+++ b/include/libvirt/libvirt.h.in
@@ -2213,10 +2213,6 @@ int virConnectDomainEventRegisterAny(virConnectPtr conn,
int virConnectDomainEventDeregisterAny(virConnectPtr conn,
int callbackID);
-#ifdef __cplusplus
-}
-#endif
-
/**
* virNWFilter:
@@ -2280,4 +2276,8 @@ int virNWFilterGetUUIDString (virNWFilterPtr nwfilter,
char * virNWFilterGetXMLDesc (virNWFilterPtr nwfilter,
int flags);
+#ifdef __cplusplus
+}
+#endif
+
#endif /* __VIR_VIRLIB_H__ */
--
1.6.6.1
[libvirt] [PATCH v6] add 802.1Qbh handling
by Stefan Berger
This patch builds on the work recently posted by Stefan Berger. It builds
on top of Stefan's three posted patches:
[PATCH v10] vepa: parsing for 802.1Qb{g|h} XML
[RFC][PATCH 1/3] vepa+vsi: Introduce dependency on libnl
[PATCH v3] Add host UUID (to libvirt capabilities)
Stefan's RFC patches 2/3 and 3/3 are incorporated into my patch.
Changes from v5 to v6:
- Renamed occurrences of virVirtualPortProfileDef to
virVirtualPortProfileParams
- 802.1Qbg part prepared for sending a RTM_SETLINK and getting
processing status back plus a subsequent RTM_GETLINK to
get IFLA_PORT_RESPONSE.
Note: This interface for 802.1Qbg may still change
Changes from v4 to v5:
- [David Allan] move getPhysfn inside IFLA_VF_PORT_MAX to avoid a
compiler warning when latest if_link.h isn't available
Changes from v3 to v4:
- move from Stefan's 802.1Qb{g|h} XML v8 to v9
- move hostuuid and vf index calcs to inside doPortProfileOp8021Qbh
Changes from v2 to v3:
- remove debug fprintfs
- use virGetHostUUID (thanks Stefan!)
- fix compile issue when latest if_link.h isn't available
- change poll timeout to 10s, at 1/8 intervals
- if polling times out, log msg and return -ETIMEDOUT
Changes from v1 to v2:
- Add Stefan's code for getPortProfileStatus
- Poll for up to 2 secs for port-profile status, at 1/8 sec intervals:
- if status indicates error, abort openMacvtapTap
- if status indicates success, exit polling
- if status is "in-progress" after 2 secs of polling, exit
polling loop silently, without error
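
The polling behaviour described in the changelog above (a fixed interval
between attempts, stop on success or error, report a timeout otherwise)
can be sketched as follows. This is a minimal illustration, not the code
from the patch: demo_status, demo_status_fn, demo_poll_status and
demo_countdown_status are hypothetical names, and the real implementation
queries the kernel through netlink instead of a callback.

```c
#ifndef _POSIX_C_SOURCE
# define _POSIX_C_SOURCE 200809L   /* for nanosleep() under strict -std modes */
#endif
#include <errno.h>
#include <time.h>

/* Hypothetical status codes; the real values come from the kernel headers. */
enum demo_status { DEMO_INPROGRESS, DEMO_SUCCESS, DEMO_ERROR };

/* Hypothetical callback: stores the current status, returns 0 if the query worked. */
typedef int (*demo_status_fn)(void *opaque, enum demo_status *status);

static void
demo_sleep_usecs(long usecs)
{
    struct timespec ts;

    ts.tv_sec = usecs / 1000000L;
    ts.tv_nsec = (usecs % 1000000L) * 1000L;
    nanosleep(&ts, NULL);
}

/*
 * Query the status up to max_polls times, sleeping interval_usecs between
 * attempts. Returns 0 on success, -1 if the query fails or reports an
 * error, and -ETIMEDOUT if the status is still "in progress" at the end.
 */
static int
demo_poll_status(demo_status_fn get_status, void *opaque,
                 int max_polls, long interval_usecs)
{
    int i;

    for (i = 0; i < max_polls; i++) {
        enum demo_status status;

        if (get_status(opaque, &status) != 0)
            return -1;
        if (status == DEMO_SUCCESS)
            return 0;
        if (status == DEMO_ERROR)
            return -1;

        demo_sleep_usecs(interval_usecs);
    }
    return -ETIMEDOUT;
}

/* Example callback: reports "in progress" until the counter reaches zero. */
static int
demo_countdown_status(void *opaque, enum demo_status *status)
{
    int *remaining = opaque;

    if (*remaining > 0) {
        (*remaining)--;
        *status = DEMO_INPROGRESS;
    } else {
        *status = DEMO_SUCCESS;
    }
    return 0;
}
```

With max_polls = 80 and interval_usecs = 125000 this gives roughly the 10
second budget mentioned for v3.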
My patch finishes out the 802.1Qbh parts, which Stefan had mostly complete.
I've tested using the recent kernel updates for VF_PORT netlink msgs and
enic for Cisco's 10G Ethernet NIC. I tested many VMs, each with several
direct interfaces, each configured with a port-profile per the XML. VM-to-VM,
and VM-to-external work as expected. VM-to-VM on same host (using same NIC)
works same as VM-to-VM where VMs are on diff hosts. I'm able to change
settings on the port-profile while the VM is running to change the virtual
port behaviour. For example, adjusting a QoS setting like rate limit. All
VMs with interfaces using that port-profile immediately see the effect of the
change to the port-profile.
I don't have an SR-IOV device to test with, so the source dev is a non-SR-IOV
device, but most of the code paths include support for specifying the source
dev and VF index. We'll need to complete this by discovering the PF given the
VF linkdev. Once we have the PF, we'll also have the VF index. All this
information is available from sysfs.
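
As a rough sketch of the sysfs lookup mentioned above: on Linux, a VF's
PCI device directory carries a "physfn" symlink to its PF, and the PF
carries "virtfnN" symlinks to its VFs. The two helpers below only build
the path and parse the index; demo_physfn_path and demo_parse_virtfn_index
are hypothetical names, and the actual readlink()/directory scan is left
out. This is an assumption about how the lookup could be done, not code
from this patch.

```c
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Build the sysfs path of the "physfn" symlink for a network interface.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int
demo_physfn_path(const char *ifname, char *buf, size_t buflen)
{
    int n = snprintf(buf, buflen, "/sys/class/net/%s/device/physfn", ifname);

    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}

/*
 * Parse the VF index out of a PF directory entry such as "virtfn3".
 * Returns 0 and stores the index on success, -1 if the name does not match.
 */
static int
demo_parse_virtfn_index(const char *name, int *vf)
{
    static const char prefix[] = "virtfn";
    char *end = NULL;
    long val;

    if (strncmp(name, prefix, sizeof(prefix) - 1) != 0)
        return -1;
    name += sizeof(prefix) - 1;
    if (*name < '0' || *name > '9')
        return -1;

    val = strtol(name, &end, 10);
    if (*end != '\0' || val > INT_MAX)
        return -1;

    *vf = (int)val;
    return 0;
}
```

A complete getPhysfn would resolve the physfn link, then walk the PF's
virtfnN entries until one resolves to the VF's own PCI device; the N of
that entry is the VF index.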
Signed-off-by: Scott Feldman <scofeldm(a)cisco.com>
Signed-off-by: Stefan Berger <stefanb(a)us.ibm.com>
---
configure.ac | 16 -
src/qemu/qemu_conf.c | 2
src/qemu/qemu_driver.c | 4
src/util/macvtap.c | 768 ++++++++++++++++++++++++++++++++++++++++++++++++-
src/util/macvtap.h | 1
5 files changed, 775 insertions(+), 16 deletions(-)
Index: libvirt-acl/configure.ac
===================================================================
--- libvirt-acl.orig/configure.ac
+++ libvirt-acl/configure.ac
@@ -2005,13 +2005,26 @@ if test "$with_macvtap" != "no" ; then
fi
AM_CONDITIONAL([WITH_MACVTAP], [test "$with_macvtap" = "yes"])
+AC_TRY_COMPILE([ #include <sys/socket.h>
+ #include <linux/rtnetlink.h> ],
+ [ int x = IFLA_PORT_MAX; ],
+ [ with_virtualport=yes ],
+ [ with_virtualport=no ])
+if test "$with_virtualport" = "yes"; then
+ val=1
+else
+ val=0
+fi
+AC_DEFINE_UNQUOTED([WITH_VIRTUALPORT], $val, [whether vsi vepa support is enabled])
+AM_CONDITIONAL([WITH_VIRTUALPORT], [test "$with_virtualport" = "yes"])
+
dnl netlink library
LIBNL_CFLAGS=""
LIBNL_LIBS=""
-if test "$with_macvtap" = "yes"; then
+if test "$with_macvtap" = "yes" || test "$with_virtualport" = "yes"; then
PKG_CHECK_MODULES([LIBNL], [libnl-1 >= $LIBNL_REQUIRED], [
], [
AC_MSG_ERROR([libnl >= $LIBNL_REQUIRED is required for macvtap support])
@@ -2084,6 +2097,7 @@ AC_MSG_NOTICE([ Network: $with_network])
AC_MSG_NOTICE([Libvirtd: $with_libvirtd])
AC_MSG_NOTICE([ netcf: $with_netcf])
AC_MSG_NOTICE([ macvtap: $with_macvtap])
+AC_MSG_NOTICE([virtport: $with_virtualport])
AC_MSG_NOTICE([])
AC_MSG_NOTICE([Storage Drivers])
AC_MSG_NOTICE([])
Index: libvirt-acl/src/qemu/qemu_conf.c
===================================================================
--- libvirt-acl.orig/src/qemu/qemu_conf.c
+++ libvirt-acl/src/qemu/qemu_conf.c
@@ -1505,7 +1505,7 @@ qemudPhysIfaceConnect(virConnectPtr conn
if (err) {
close(rc);
rc = -1;
- delMacvtap(net->ifname,
+ delMacvtap(net->ifname, net->data.direct.linkdev,
&net->data.direct.virtPortProfile);
}
}
Index: libvirt-acl/src/qemu/qemu_driver.c
===================================================================
--- libvirt-acl.orig/src/qemu/qemu_driver.c
+++ libvirt-acl/src/qemu/qemu_driver.c
@@ -3709,7 +3709,7 @@ static void qemudShutdownVMDaemon(struct
for (i = 0; i < def->nnets; i++) {
virDomainNetDefPtr net = def->nets[i];
if (net->type == VIR_DOMAIN_NET_TYPE_DIRECT)
- delMacvtap(net->ifname,
+ delMacvtap(net->ifname, net->data.direct.linkdev,
&net->data.direct.virtPortProfile);
}
#endif
@@ -8513,7 +8513,7 @@ qemudDomainDetachNetDevice(struct qemud_
#if WITH_MACVTAP
if (detach->type == VIR_DOMAIN_NET_TYPE_DIRECT)
- delMacvtap(detach->ifname,
+ delMacvtap(detach->ifname, detach->data.direct.linkdev,
&detach->data.direct.virtPortProfile);
#endif
Index: libvirt-acl/src/util/macvtap.c
===================================================================
--- libvirt-acl.orig/src/util/macvtap.c
+++ libvirt-acl/src/util/macvtap.c
@@ -27,7 +27,7 @@
#include <config.h>
-#if WITH_MACVTAP
+#if WITH_MACVTAP || WITH_VIRTUALPORT
# include <stdio.h>
# include <errno.h>
@@ -41,6 +41,8 @@
# include <linux/rtnetlink.h>
# include <linux/if_tun.h>
+# include <netlink/msg.h>
+
# include "util.h"
# include "memory.h"
# include "logging.h"
@@ -48,6 +50,7 @@
# include "interface.h"
# include "conf/domain_conf.h"
# include "virterror_internal.h"
+# include "uuid.h"
# define VIR_FROM_THIS VIR_FROM_NET
@@ -58,15 +61,23 @@
# define MACVTAP_NAME_PREFIX "macvtap"
# define MACVTAP_NAME_PATTERN "macvtap%d"
+# define MICROSEC_PER_SEC (1000 * 1000)
+
static int associatePortProfileId(const char *macvtap_ifname,
+ const char *linkdev,
const virVirtualPortProfileParamsPtr virtPort,
- int vf,
const unsigned char *vmuuid);
static int disassociatePortProfileId(const char *macvtap_ifname,
+ const char *linkdev,
const virVirtualPortProfileParamsPtr virtPort);
+enum virVirtualPortOp {
+ ASSOCIATE = 0x1,
+ DISASSOCIATE = 0x2,
+};
+
static int nlOpen(void)
{
@@ -159,6 +170,156 @@ err_exit:
}
+# ifdef IFLA_VF_PORT_MAX
+
+/**
+ * nlCommWaitSuccess:
+ *
+ * @nlmsg: pointer to netlink message
+ * @nl_groups: the netlink multicast groups to send to
+ * @respbuf: pointer to pointer where response buffer will be allocated
+ * @respbuflen: pointer to integer holding the size of the response buffer
+ * on return of the function.
+ * @to_usecs: timeout in microseconds to wait for a success message
+ * to be returned
+ *
+ * Send the given message to the netlink multicast group and receive
+ * responses. Skip responses indicating an error and keep on receiving
+ * responses until a success response is returned.
+ * Returns 0 on success, -1 on error. In case of error, no response
+ * buffer will be returned.
+ */
+static int
+nlCommWaitSuccess(struct nlmsghdr *nlmsg, int nl_groups,
+ char **respbuf, int *respbuflen, long to_usecs)
+{
+ int rc = 0;
+ struct sockaddr_nl nladdr = {
+ .nl_family = AF_NETLINK,
+ .nl_pid = getpid(),
+ .nl_groups = nl_groups,
+ };
+ int rcvChunkSize = 1024; // expecting less than that
+ int rcvoffset = 0;
+ ssize_t nbytes;
+ int n;
+ struct timeval tv = {
+ .tv_sec = to_usecs / MICROSEC_PER_SEC,
+ .tv_usec = to_usecs % MICROSEC_PER_SEC,
+ };
+ fd_set rfds;
+ bool gotvalid = false;
+ int fd = nlOpen();
+ static uint32_t seq = 0x1234;
+ uint32_t myseq = seq++;
+ uint32_t mypid = getpid();
+
+ if (fd < 0)
+ return -1;
+
+ nlmsg->nlmsg_pid = mypid;
+ nlmsg->nlmsg_seq = myseq;
+ nlmsg->nlmsg_flags |= NLM_F_ACK;
+
+ nbytes = sendto(fd, (void *)nlmsg, nlmsg->nlmsg_len, 0,
+ (struct sockaddr *)&nladdr, sizeof(nladdr));
+ if (nbytes < 0) {
+ virReportSystemError(errno,
+ "%s", _("cannot send to netlink socket"));
+ rc = -1;
+ goto err_exit;
+ }
+
+ while (!gotvalid) {
+ rcvoffset = 0;
+ while (1) {
+ socklen_t addrlen = sizeof(nladdr);
+
+ if (VIR_REALLOC_N(*respbuf, rcvoffset+rcvChunkSize) < 0) {
+ virReportOOMError();
+ rc = -1;
+ goto err_exit;
+ }
+
+ FD_ZERO(&rfds);
+ FD_SET(fd, &rfds);
+
+ n = select(fd + 1, &rfds, NULL, NULL, &tv);
+ if (n == 0) {
+ rc = -1;
+ goto err_exit;
+ }
+
+ nbytes = recvfrom(fd, &((*respbuf)[rcvoffset]), rcvChunkSize, 0,
+ (struct sockaddr *)&nladdr, &addrlen);
+ if (nbytes < 0) {
+ if (errno == EAGAIN || errno == EINTR)
+ continue;
+ virReportSystemError(errno, "%s",
+ _("error receiving from netlink socket"));
+ rc = -1;
+ goto err_exit;
+ }
+ rcvoffset += nbytes;
+ break;
+ }
+ *respbuflen = rcvoffset;
+
+ /* check message for error */
+ if (*respbuflen > NLMSG_LENGTH(0) && *respbuf != NULL) {
+ struct nlmsghdr *resp = (struct nlmsghdr *)*respbuf;
+ struct nlmsgerr *err;
+
+ if (resp->nlmsg_pid != mypid ||
+ resp->nlmsg_seq != myseq)
+ continue;
+
+ /* skip reflected message */
+ if (resp->nlmsg_type & 0x10)
+ continue;
+
+ switch (resp->nlmsg_type) {
+ case NLMSG_ERROR:
+ err = (struct nlmsgerr *)NLMSG_DATA(resp);
+ if (resp->nlmsg_len >= NLMSG_LENGTH(sizeof(*err))) {
+ if (-err->error != EOPNOTSUPP) {
+ /* assuming error msg from daemon */
+ gotvalid = true;
+ break;
+ }
+ }
+ /* whatever this is, skip it */
+ VIR_FREE(*respbuf);
+ *respbuf = NULL;
+ *respbuflen = 0;
+ break;
+
+ case NLMSG_DONE:
+ gotvalid = true;
+ break;
+
+ default:
+ VIR_FREE(*respbuf);
+ *respbuf = NULL;
+ *respbuflen = 0;
+ break;
+ }
+ }
+ }
+
+err_exit:
+ if (rc == -1) {
+ VIR_FREE(*respbuf);
+ *respbuf = NULL;
+ *respbuflen = 0;
+ }
+
+ nlClose(fd);
+ return rc;
+}
+
+# endif
+
static struct rtattr *
rtattrCreate(char *buffer, int bufsize, int type,
const void *data, int datalen)
@@ -204,6 +365,8 @@ nlAppend(struct nlmsghdr *nlm, int totle
}
+# if WITH_MACVTAP
+
static int
link_add(const char *type,
const unsigned char *macaddress, int macaddrsize,
@@ -655,8 +818,8 @@ create_name:
}
if (associatePortProfileId(cr_ifname,
+ linkdev,
virtPortProfile,
- -1,
vmuuid) != 0) {
rc = -1;
goto link_del_exit;
@@ -689,6 +852,7 @@ create_name:
disassociate_exit:
disassociatePortProfileId(cr_ifname,
+ linkdev,
virtPortProfile);
link_del_exit:
@@ -701,6 +865,7 @@ link_del_exit:
/**
* delMacvtap:
* @ifname : The name of the macvtap interface
+ * @linkdev: The interface name of the NIC to connect to the external bridge
* @virtPortProfile: pointer to object holding the virtual port profile data
*
* Delete an interface given its name. Disassociate
@@ -709,24 +874,593 @@ link_del_exit:
*/
void
delMacvtap(const char *ifname,
+ const char *linkdev,
virVirtualPortProfileParamsPtr virtPortProfile)
{
if (ifname) {
disassociatePortProfileId(ifname,
+ linkdev,
virtPortProfile);
link_del(ifname);
}
}
-#endif
+# endif
+
+
+# ifdef IFLA_PORT_MAX
+
+static struct nla_policy ifla_policy[IFLA_MAX + 1] =
+{
+ [IFLA_VF_PORTS] = { .type = NLA_NESTED },
+};
+
+static struct nla_policy ifla_vf_ports_policy[IFLA_VF_PORT_MAX + 1] =
+{
+ [IFLA_VF_PORT] = { .type = NLA_NESTED },
+};
+
+static struct nla_policy ifla_port_policy[IFLA_PORT_MAX + 1] =
+{
+ [IFLA_PORT_RESPONSE] = { .type = NLA_U16 },
+};
+
+
+static int
+link_dump(bool multicast, int ifindex, struct nlattr **tb, char **recvbuf)
+{
+ int rc = 0;
+ char nlmsgbuf[256] = { 0, };
+ struct nlmsghdr *nlm = (struct nlmsghdr *)nlmsgbuf, *resp;
+ struct nlmsgerr *err;
+ struct ifinfomsg i = {
+ .ifi_family = AF_UNSPEC,
+ .ifi_index = ifindex
+ };
+ int recvbuflen;
+
+ *recvbuf = NULL;
+
+ nlInit(nlm, NLM_F_REQUEST, RTM_GETLINK);
+
+ if (!nlAppend(nlm, sizeof(nlmsgbuf), &i, sizeof(i)))
+ goto buffer_too_small;
+
+ if (!multicast) {
+ if (nlComm(nlm, recvbuf, &recvbuflen) < 0)
+ return -1;
+ } else {
+ if (nlCommWaitSuccess(nlm, RTMGRP_LINK, recvbuf, &recvbuflen,
+ 5 * MICROSEC_PER_SEC) < 0)
+ return -1;
+ }
+
+ if (recvbuflen < NLMSG_LENGTH(0) || *recvbuf == NULL)
+ goto malformed_resp;
+
+ resp = (struct nlmsghdr *)*recvbuf;
+
+ switch (resp->nlmsg_type) {
+ case NLMSG_ERROR:
+ err = (struct nlmsgerr *)NLMSG_DATA(resp);
+ if (resp->nlmsg_len < NLMSG_LENGTH(sizeof(*err)))
+ goto malformed_resp;
+
+ switch (-err->error) {
+ case 0:
+ break;
+
+ default:
+ virReportSystemError(-err->error,
+ _("error dumping %d interface"),
+ ifindex);
+ rc = -1;
+ }
+ break;
+
+ case GENL_ID_CTRL:
+ case NLMSG_DONE:
+ if (nlmsg_parse(resp, sizeof(struct ifinfomsg),
+ tb, IFLA_MAX, ifla_policy)) {
+ goto malformed_resp;
+ }
+ break;
+
+ default:
+ goto malformed_resp;
+ }
+
+ if (rc != 0)
+ VIR_FREE(*recvbuf);
+
+ return rc;
+
+malformed_resp:
+ macvtapError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("malformed netlink response message"));
+ VIR_FREE(*recvbuf);
+ return -1;
+
+buffer_too_small:
+ macvtapError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("internal buffer is too small"));
+ return -1;
+}
+
+
+static int
+getPortProfileStatus(struct nlattr **tb, int32_t vf, uint16_t *status)
+{
+ int rc = 1;
+ const char *msg = NULL;
+ struct nlattr *tb2[IFLA_VF_PORT_MAX + 1],
+ *tb3[IFLA_PORT_MAX+1];
+
+ if (vf == PORT_SELF_VF) {
+ if (tb[IFLA_PORT_SELF]) {
+ if (nla_parse_nested(tb3, IFLA_PORT_MAX, tb[IFLA_PORT_SELF],
+ ifla_port_policy)) {
+ msg = _("error parsing nested IFLA_PORT_SELF part");
+ goto err_exit;
+ }
+ }
+ } else {
+ if (tb[IFLA_VF_PORTS]) {
+ if (nla_parse_nested(tb2, IFLA_VF_PORT_MAX, tb[IFLA_VF_PORTS],
+ ifla_vf_ports_policy)) {
+ msg = _("error parsing nested IFLA_VF_PORTS part");
+ goto err_exit;
+ }
+ if (tb2[IFLA_VF_PORT]) {
+ if (nla_parse_nested(tb3, IFLA_PORT_MAX, tb2[IFLA_VF_PORT],
+ ifla_port_policy)) {
+ msg = _("error parsing nested IFLA_VF_PORT part");
+ goto err_exit;
+ }
+ }
+ }
+ }
+ if (tb3[IFLA_PORT_RESPONSE]) {
+ *status = *(uint16_t *)RTA_DATA(tb3[IFLA_PORT_RESPONSE]);
+ rc = 0;
+ } else {
+ msg = _("no IFLA_PORT_RESPONSE found in netlink message");
+ goto err_exit;
+ }
+
+err_exit:
+ if (msg)
+ macvtapError(VIR_ERR_INTERNAL_ERROR, "%s", msg);
+
+ return rc;
+}
+
+
+static int
+doPortProfileOpSetLink(bool multicast,
+ int ifindex,
+ const char *profileId,
+ struct ifla_port_vsi *portVsi,
+ const unsigned char *instanceId,
+ const unsigned char *hostUUID,
+ int32_t vf,
+ uint8_t op)
+{
+ int rc = 0;
+ char nlmsgbuf[256];
+ struct nlmsghdr *nlm = (struct nlmsghdr *)nlmsgbuf, *resp;
+ struct nlmsgerr *err;
+ char rtattbuf[64];
+ struct rtattr *rta, *vfports = NULL, *vfport;
+ struct ifinfomsg ifinfo = {
+ .ifi_family = AF_UNSPEC,
+ .ifi_index = ifindex,
+ };
+ char *recvbuf = NULL;
+ int recvbuflen = 0;
+
+ memset(&nlmsgbuf, 0, sizeof(nlmsgbuf));
+
+ nlInit(nlm, NLM_F_REQUEST, RTM_SETLINK);
+
+ if (!nlAppend(nlm, sizeof(nlmsgbuf), &ifinfo, sizeof(ifinfo)))
+ goto buffer_too_small;
+
+ if (vf == PORT_SELF_VF) {
+ rta = rtattrCreate(rtattbuf, sizeof(rtattbuf), IFLA_PORT_SELF, NULL, 0);
+ } else {
+ rta = rtattrCreate(rtattbuf, sizeof(rtattbuf), IFLA_VF_PORTS, NULL, 0);
+ if (!rta)
+ goto buffer_too_small;
+
+ if (!(vfports = nlAppend(nlm, sizeof(nlmsgbuf),
+ rtattbuf, rta->rta_len)))
+ goto buffer_too_small;
+
+ /* begin nesting vfports */
+ rta = rtattrCreate(rtattbuf, sizeof(rtattbuf), IFLA_VF_PORT, NULL, 0);
+ }
+
+ if (!rta)
+ goto buffer_too_small;
+
+ if (!(vfport = nlAppend(nlm, sizeof(nlmsgbuf), rtattbuf, rta->rta_len)))
+ goto buffer_too_small;
+
+ if (profileId) {
+ rta = rtattrCreate(rtattbuf, sizeof(rtattbuf), IFLA_PORT_PROFILE,
+ profileId, strlen(profileId) + 1);
+ if (!rta)
+ goto buffer_too_small;
+
+ if (!nlAppend(nlm, sizeof(nlmsgbuf), rtattbuf, rta->rta_len))
+ goto buffer_too_small;
+ }
+
+ if (portVsi) {
+ rta = rtattrCreate(rtattbuf, sizeof(rtattbuf), IFLA_PORT_VSI_TYPE,
+ portVsi, sizeof(*portVsi));
+ if (!rta)
+ goto buffer_too_small;
+
+ if (!nlAppend(nlm, sizeof(nlmsgbuf), rtattbuf, rta->rta_len))
+ goto buffer_too_small;
+ }
+
+ if (instanceId) {
+ rta = rtattrCreate(rtattbuf, sizeof(rtattbuf), IFLA_PORT_INSTANCE_UUID,
+ instanceId, VIR_UUID_BUFLEN);
+ if (!rta)
+ goto buffer_too_small;
+
+ if (!nlAppend(nlm, sizeof(nlmsgbuf), rtattbuf, rta->rta_len))
+ goto buffer_too_small;
+ }
+
+ if (hostUUID) {
+ rta = rtattrCreate(rtattbuf, sizeof(rtattbuf), IFLA_PORT_HOST_UUID,
+ hostUUID, VIR_UUID_BUFLEN);
+ if (!rta)
+ goto buffer_too_small;
+
+ if (!nlAppend(nlm, sizeof(nlmsgbuf), rtattbuf, rta->rta_len))
+ goto buffer_too_small;
+ }
+
+ if (vf != PORT_SELF_VF) {
+ rta = rtattrCreate(rtattbuf, sizeof(rtattbuf), IFLA_PORT_VF,
+ &vf, sizeof(vf));
+ if (!rta)
+ goto buffer_too_small;
+
+ if (!nlAppend(nlm, sizeof(nlmsgbuf), rtattbuf, rta->rta_len))
+ goto buffer_too_small;
+ }
+
+ rta = rtattrCreate(rtattbuf, sizeof(rtattbuf), IFLA_PORT_REQUEST,
+ &op, sizeof(op));
+ if (!rta)
+ goto buffer_too_small;
+
+ if (!nlAppend(nlm, sizeof(nlmsgbuf), rtattbuf, rta->rta_len))
+ goto buffer_too_small;
+
+ /* end nesting of vport */
+ vfport->rta_len = (char *)nlm + nlm->nlmsg_len - (char *)vfport;
+
+ if (vf != PORT_SELF_VF) {
+ /* end nesting of vfports */
+ vfports->rta_len = (char *)nlm + nlm->nlmsg_len - (char *)vfports;
+ }
+
+ if (!multicast) {
+ if (nlComm(nlm, &recvbuf, &recvbuflen) < 0)
+ return -1;
+ } else {
+ if (nlCommWaitSuccess(nlm, RTMGRP_LINK, &recvbuf, &recvbuflen,
+ 5 * MICROSEC_PER_SEC) < 0)
+ return -1;
+ }
+
+ if (recvbuflen < NLMSG_LENGTH(0) || recvbuf == NULL)
+ goto malformed_resp;
+
+ resp = (struct nlmsghdr *)recvbuf;
+
+ switch (resp->nlmsg_type) {
+ case NLMSG_ERROR:
+ err = (struct nlmsgerr *)NLMSG_DATA(resp);
+ if (resp->nlmsg_len < NLMSG_LENGTH(sizeof(*err)))
+ goto malformed_resp;
+
+ switch (-err->error) {
+ case 0:
+ break;
+
+ default:
+ virReportSystemError(-err->error,
+ _("error during virtual port configuration of ifindex %d"),
+ ifindex);
+ rc = -1;
+ }
+ break;
+
+ case NLMSG_DONE:
+ break;
+
+ default:
+ goto malformed_resp;
+ }
+
+ VIR_FREE(recvbuf);
+
+ return rc;
+
+malformed_resp:
+ macvtapError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("malformed netlink response message"));
+ VIR_FREE(recvbuf);
+ return -1;
+
+buffer_too_small:
+ macvtapError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("internal buffer is too small"));
+ return -1;
+}
+
+
+static int
+doPortProfileOpCommon(bool multicast,
+ int ifindex,
+ const char *profileId,
+ struct ifla_port_vsi *portVsi,
+ const unsigned char *instanceId,
+ const unsigned char *hostUUID,
+ int32_t vf,
+ uint8_t op)
+{
+ int rc;
+ char *recvbuf = NULL;
+ struct nlattr *tb[IFLA_MAX + 1];
+ int repeats = 80;
+ uint16_t status = 0;
+
+ rc = doPortProfileOpSetLink(multicast,
+ ifindex,
+ profileId,
+ portVsi,
+ instanceId,
+ hostUUID,
+ vf,
+ op);
+
+ if (rc != 0) {
+ macvtapError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("sending of PortProfileRequest failed."));
+ return rc;
+ }
+
+ while (--repeats) {
+ rc = link_dump(multicast, ifindex, tb, &recvbuf);
+ if (rc)
+ goto err_exit;
+ rc = getPortProfileStatus(tb, vf, &status);
+ if (rc == 0) {
+ if (status == PORT_PROFILE_RESPONSE_SUCCESS ||
+ status == PORT_VDP_RESPONSE_SUCCESS) {
+ break;
+ } else if (status == PORT_PROFILE_RESPONSE_INPROGRESS) {
+ // keep trying...
+ } else {
+ virReportSystemError(EINVAL,
+ _("error %d during port-profile setlink on ifindex %d"),
+ status, ifindex);
+ rc = 1;
+ break;
+ }
+ }
+ usleep(125000);
+
+ VIR_FREE(recvbuf);
+ }
+
+ if (status == PORT_PROFILE_RESPONSE_INPROGRESS) {
+ macvtapError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("port-profile setlink timed out"));
+ rc = -ETIMEDOUT;
+ }
+
+err_exit:
+ VIR_FREE(recvbuf);
+
+ return rc;
+}
+
+# endif /* IFLA_PORT_MAX */
+
+static int
+doPortProfileOp8021Qbg(const char *ifname,
+ const virVirtualPortProfileParamsPtr virtPort,
+ enum virVirtualPortOp virtPortOp)
+{
+ int rc;
+
+# ifndef IFLA_VF_PORT_MAX
+
+ (void)ifname;
+ (void)virtPort;
+ (void)virtPortOp;
+ macvtapError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("Kernel VF Port support was missing at compile time."));
+ rc = 1;
+
+# else /* IFLA_VF_PORT_MAX */
+
+ int op = PORT_REQUEST_ASSOCIATE;
+ struct ifla_port_vsi portVsi = {
+ .vsi_mgr_id = virtPort->u.virtPort8021Qbg.managerID,
+ .vsi_type_version = virtPort->u.virtPort8021Qbg.typeIDVersion,
+ };
+ bool multicast = true;
+ int ifindex;
+
+ if (ifaceGetIndex(true, ifname, &ifindex) != 0) {
+ rc = 1;
+ goto err_exit;
+ }
+
+ portVsi.vsi_type_id[2] = virtPort->u.virtPort8021Qbg.typeID >> 16;
+ portVsi.vsi_type_id[1] = virtPort->u.virtPort8021Qbg.typeID >> 8;
+ portVsi.vsi_type_id[0] = virtPort->u.virtPort8021Qbg.typeID;
+
+ switch (virtPortOp) {
+ case ASSOCIATE:
+ op = PORT_REQUEST_ASSOCIATE;
+ break;
+ case DISASSOCIATE:
+ op = PORT_REQUEST_DISASSOCIATE;
+ break;
+ default:
+ macvtapError(VIR_ERR_INTERNAL_ERROR,
+ _("operation type %d not supported"), virtPortOp);
+ rc = 1;
+ goto err_exit;
+ }
+
+ rc = doPortProfileOpCommon(multicast, ifindex,
+ NULL,
+ &portVsi,
+ virtPort->u.virtPort8021Qbg.instanceID,
+ NULL,
+ PORT_SELF_VF,
+ op);
+
+err_exit:
+
+# endif /* IFLA_VF_PORT_MAX */
+
+ return rc;
+}
+
+
+# ifdef IFLA_VF_PORT_MAX
+static int
+getPhysfn(const char *linkdev,
+ int32_t *vf,
+ char **physfndev)
+{
+ int rc = 0;
+ bool virtfn = false;
+
+ if (virtfn) {
+
+ // XXX: if linkdev is SR-IOV VF, then set vf = VF index
+ // XXX: and set linkdev = PF device
+ // XXX: need to use get_physical_function_linux() or
+ // XXX: something like that to get PF
+ // XXX: device and figure out VF index
+
+ rc = 1;
+
+ } else {
+
+ /* Not SR-IOV VF: physfndev is linkdev and VF index
+ * refers to linkdev self
+ */
+
+ *vf = PORT_SELF_VF;
+ *physfndev = (char *)linkdev;
+ }
+
+ return rc;
+}
+# endif /* IFLA_VF_PORT_MAX */
+
+static int
+doPortProfileOp8021Qbh(const char *ifname,
+ const virVirtualPortProfileParamsPtr virtPort,
+ const unsigned char *vm_uuid,
+ enum virVirtualPortOp virtPortOp)
+{
+ int rc;
+
+# ifndef IFLA_VF_PORT_MAX
+
+ (void)ifname;
+ (void)virtPort;
+ (void)vm_uuid;
+ (void)virtPortOp;
+ macvtapError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("Kernel VF Port support was missing at compile time."));
+ rc = 1;
+
+# else /* IFLA_VF_PORT_MAX */
+
+ char *physfndev;
+ unsigned char hostuuid[VIR_UUID_BUFLEN];
+ int32_t vf;
+ int op = PORT_REQUEST_ASSOCIATE;
+ bool multicast = false;
+ int ifindex;
+
+ rc = virGetHostUUID(hostuuid);
+ if (rc)
+ goto err_exit;
+
+ rc = getPhysfn(ifname, &vf, &physfndev);
+ if (rc)
+ goto err_exit;
+
+ if (ifaceGetIndex(true, physfndev, &ifindex) != 0) {
+ rc = 1;
+ goto err_exit;
+ }
+
+ switch (virtPortOp) {
+ case ASSOCIATE:
+ op = PORT_REQUEST_ASSOCIATE;
+ break;
+ case DISASSOCIATE:
+ op = PORT_REQUEST_DISASSOCIATE;
+ break;
+ default:
+ macvtapError(VIR_ERR_INTERNAL_ERROR,
+ _("operation type %d not supported"), virtPortOp);
+ rc = 1;
+ goto err_exit;
+ }
+
+ rc = doPortProfileOpCommon(multicast, ifindex,
+ virtPort->u.virtPort8021Qbh.profileID,
+ NULL,
+ vm_uuid,
+ hostuuid,
+ vf,
+ op);
+
+ switch (virtPortOp) {
+ case ASSOCIATE:
+ ifaceUp(ifname);
+ break;
+ case DISASSOCIATE:
+ ifaceDown(ifname);
+ break;
+ }
+
+err_exit:
+
+# endif /* IFLA_VF_PORT_MAX */
+
+ return rc;
+}
/**
* associatePortProfile
*
* @macvtap_ifname: The name of the macvtap device
+ * @linkdev: The link device in case of macvtap
* @virtPort: pointer to the object holding port profile parameters
- * @vf: virtual function number, -1 if to be ignored
* @vmuuid : the UUID of the virtual machine
*
* Associate a port on a swtich with a profile. This function
@@ -740,15 +1474,14 @@ delMacvtap(const char *ifname,
*/
static int
associatePortProfileId(const char *macvtap_ifname,
+ const char *linkdev,
const virVirtualPortProfileParamsPtr virtPort,
- int vf,
const unsigned char *vmuuid)
{
int rc = 0;
+
VIR_DEBUG("Associating port profile '%p' on link device '%s'",
virtPort, macvtap_ifname);
- (void)vf;
- (void)vmuuid;
switch (virtPort->virtPortType) {
case VIR_VIRTUALPORT_NONE:
@@ -756,11 +1489,14 @@ associatePortProfileId(const char *macvt
break;
case VIR_VIRTUALPORT_8021QBG:
-
+ rc = doPortProfileOp8021Qbg(macvtap_ifname, virtPort,
+ ASSOCIATE);
break;
case VIR_VIRTUALPORT_8021QBH:
-
+ rc = doPortProfileOp8021Qbh(linkdev, virtPort,
+ vmuuid,
+ ASSOCIATE);
break;
}
@@ -772,6 +1508,7 @@ associatePortProfileId(const char *macvt
* disassociatePortProfile
*
* @macvtap_ifname: The name of the macvtap device
+ * @linkdev: The link device in case of macvtap
* @virtPort: point to object holding port profile parameters
*
* Returns 0 in case of success, != 0 otherwise with error
@@ -779,9 +1516,11 @@ associatePortProfileId(const char *macvt
*/
static int
disassociatePortProfileId(const char *macvtap_ifname,
+ const char *linkdev,
const virVirtualPortProfileParamsPtr virtPort)
{
int rc = 0;
+
VIR_DEBUG("Disassociating port profile id '%p' on link device '%s' ",
virtPort, macvtap_ifname);
@@ -791,13 +1530,18 @@ disassociatePortProfileId(const char *ma
break;
case VIR_VIRTUALPORT_8021QBG:
-
+ rc = doPortProfileOp8021Qbg(macvtap_ifname, virtPort,
+ DISASSOCIATE);
break;
case VIR_VIRTUALPORT_8021QBH:
-
+ rc = doPortProfileOp8021Qbh(linkdev, virtPort,
+ NULL,
+ DISASSOCIATE);
break;
}
return rc;
}
+
+#endif
Index: libvirt-acl/src/util/macvtap.h
===================================================================
--- libvirt-acl.orig/src/util/macvtap.h
+++ libvirt-acl/src/util/macvtap.h
@@ -72,6 +72,7 @@ int openMacvtapTap(const char *ifname,
char **res_ifname);
void delMacvtap(const char *ifname,
+ const char *linkdev,
virVirtualPortProfileParamsPtr virtPortProfile);
# endif /* WITH_MACVTAP */
[libvirt] [PATCH] virDrvStorageVolLookupByKey and virDrvStorageVolLookupByPath should use virStoragePoolPtr as parameter
by Eduardo Otubo
Hello,
These two functions, virDrvStorageVolLookupByKey and
virDrvStorageVolLookupByPath should use virStoragePoolPtr as parameter
instead of virConnectPtr for a few reasons:
1) Should follow the standard virStorage*Ptr parameters like the rest of
storage related functions.
2) Functions now are able to access pool structure. This is particularly
important for the optimization of the PowerHypervisor
phypVolumeLookupByKey function.
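
To illustrate the second point, here is a minimal sketch with mock types.
demo_connect, demo_pool, demo_vol and demo_vol_lookup_by_key are
hypothetical stand-ins, not the real libvirt structures; the sketch only
shows that a callback receiving the pool can restrict its search to that
pool while still reaching the connection through the back-pointer.

```c
#include <stddef.h>
#include <string.h>

/* Mock types for illustration only; the real libvirt structs differ. */
struct demo_connect { const char *uri; };
struct demo_vol { const char *key; };
struct demo_pool {
    struct demo_connect *conn;   /* back-pointer, like pool->conn in the patch */
    struct demo_vol *vols;
    size_t nvols;
};

/*
 * Lookup that receives the pool: it scans only that pool's volumes and can
 * still reach the connection via pool->conn. Returns NULL if not found.
 */
static struct demo_vol *
demo_vol_lookup_by_key(struct demo_pool *pool, const char *key)
{
    size_t i;

    /* Validate the pool before dereferencing it. */
    if (!pool || !pool->conn || !key)
        return NULL;

    for (i = 0; i < pool->nvols; i++) {
        if (strcmp(pool->vols[i].key, key) == 0)
            return &pool->vols[i];
    }
    return NULL;
}
```

Note that the sketch checks the pool pointer before reading pool->conn; a
lookup taking only a connection would have to search every pool instead.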
Thanks,
--
Eduardo Otubo
Software Engineer
Linux Technology Center
IBM Systems & Technology Group
Mobile: +55 19 8135 0885
eotubo(a)linux.vnet.ibm.com
--
diff --git a/src/driver.h b/src/driver.h
index 0975b59..bb05306 100644
--- a/src/driver.h
+++ b/src/driver.h
@@ -798,10 +798,10 @@ typedef virStorageVolPtr
(*virDrvStorageVolLookupByName) (virStoragePoolPtr pool,
const char *name);
typedef virStorageVolPtr
- (*virDrvStorageVolLookupByKey) (virConnectPtr pool,
+ (*virDrvStorageVolLookupByKey) (virStoragePoolPtr pool,
const char *key);
typedef virStorageVolPtr
- (*virDrvStorageVolLookupByPath) (virConnectPtr pool,
+ (*virDrvStorageVolLookupByPath) (virStoragePoolPtr pool,
const char *path);
diff --git a/src/libvirt.c b/src/libvirt.c
index 9d42c76..c43ce9c 100644
--- a/src/libvirt.c
+++ b/src/libvirt.c
@@ -8418,11 +8418,12 @@ error:
* Returns a storage volume, or NULL if not found / error
*/
virStorageVolPtr
-virStorageVolLookupByKey(virConnectPtr conn,
+virStorageVolLookupByKey(virStoragePoolPtr pool,
const char *key)
{
- DEBUG("conn=%p, key=%s", conn, key);
+ DEBUG("pool=%p, key=%s", pool, key);
+ virConnectPtr conn = pool->conn;
virResetLastError();
if (!VIR_IS_CONNECT(conn)) {
@@ -8437,7 +8438,7 @@ virStorageVolLookupByKey(virConnectPtr conn,
if (conn->storageDriver && conn->storageDriver->volLookupByKey) {
virStorageVolPtr ret;
- ret = conn->storageDriver->volLookupByKey (conn, key);
+ ret = conn->storageDriver->volLookupByKey (pool, key);
if (!ret)
goto error;
return ret;
@@ -8461,11 +8462,12 @@ error:
* Returns a storage volume, or NULL if not found / error
*/
virStorageVolPtr
-virStorageVolLookupByPath(virConnectPtr conn,
+virStorageVolLookupByPath(virStoragePoolPtr pool,
const char *path)
{
- DEBUG("conn=%p, path=%s", conn, path);
+ DEBUG("pool=%p, path=%s", pool, path);
+ virConnectPtr conn = pool->conn;
virResetLastError();
if (!VIR_IS_CONNECT(conn)) {
@@ -8480,7 +8482,7 @@ virStorageVolLookupByPath(virConnectPtr conn,
if (conn->storageDriver && conn->storageDriver->volLookupByPath) {
virStorageVolPtr ret;
- ret = conn->storageDriver->volLookupByPath (conn, path);
+ ret = conn->storageDriver->volLookupByPath (pool, path);
if (!ret)
goto error;
return ret;
diff --git a/src/remote/remote_driver.c b/src/remote/remote_driver.c
index 80977a3..12380f4 100644
--- a/src/remote/remote_driver.c
+++ b/src/remote/remote_driver.c
@@ -5559,9 +5559,10 @@ done:
}
static virStorageVolPtr
-remoteStorageVolLookupByKey (virConnectPtr conn,
+remoteStorageVolLookupByKey (virStoragePoolPtr pool,
const char *key)
{
+ virConnectPtr conn = pool->conn;
virStorageVolPtr vol = NULL;
remote_storage_vol_lookup_by_key_args args;
remote_storage_vol_lookup_by_key_ret ret;
@@ -5586,9 +5587,10 @@ done:
}
static virStorageVolPtr
-remoteStorageVolLookupByPath (virConnectPtr conn,
+remoteStorageVolLookupByPath (virStoragePoolPtr pool,
const char *path)
{
+ virConnectPtr conn = pool->conn;
virStorageVolPtr vol = NULL;
remote_storage_vol_lookup_by_path_args args;
remote_storage_vol_lookup_by_path_ret ret;
diff --git a/src/storage/storage_driver.c b/src/storage/storage_driver.c
index b148e39..25fe1d1 100644
--- a/src/storage/storage_driver.c
+++ b/src/storage/storage_driver.c
@@ -1168,8 +1168,9 @@ cleanup:
static virStorageVolPtr
-storageVolumeLookupByKey(virConnectPtr conn,
+storageVolumeLookupByKey(virStoragePoolPtr pool,
const char *key) {
+ virConnectPtr conn = pool->conn;
virStorageDriverStatePtr driver = conn->storagePrivateData;
unsigned int i;
virStorageVolPtr ret = NULL;
@@ -1199,8 +1200,9 @@ storageVolumeLookupByKey(virConnectPtr conn,
}
static virStorageVolPtr
-storageVolumeLookupByPath(virConnectPtr conn,
+storageVolumeLookupByPath(virStoragePoolPtr pool,
const char *path) {
+ virConnectPtr conn = pool->conn;
virStorageDriverStatePtr driver = conn->storagePrivateData;
unsigned int i;
virStorageVolPtr ret = NULL;
diff --git a/src/test/test_driver.c b/src/test/test_driver.c
index 395c8c9..c65e4a0 100644
--- a/src/test/test_driver.c
+++ b/src/test/test_driver.c
@@ -4251,8 +4251,9 @@ cleanup:
static virStorageVolPtr
-testStorageVolumeLookupByKey(virConnectPtr conn,
+testStorageVolumeLookupByKey(virStoragePoolPtr pool,
const char *key) {
+ virConnectPtr conn = pool->conn;
testConnPtr privconn = conn->privateData;
unsigned int i;
virStorageVolPtr ret = NULL;
@@ -4285,8 +4286,9 @@ testStorageVolumeLookupByKey(virConnectPtr conn,
}
static virStorageVolPtr
-testStorageVolumeLookupByPath(virConnectPtr conn,
+testStorageVolumeLookupByPath(virStoragePoolPtr pool,
const char *path) {
+ virConnectPtr conn = pool->conn;
testConnPtr privconn = conn->privateData;
unsigned int i;
virStorageVolPtr ret = NULL;
diff --git a/src/vbox/vbox_tmpl.c b/src/vbox/vbox_tmpl.c
index 6a9a2bf..ce665ef 100644
--- a/src/vbox/vbox_tmpl.c
+++ b/src/vbox/vbox_tmpl.c
@@ -7547,7 +7547,8 @@ static virStorageVolPtr vboxStorageVolLookupByName(virStoragePoolPtr pool, const
return ret;
}
-static virStorageVolPtr vboxStorageVolLookupByKey(virConnectPtr conn,
const char *key) {
+static virStorageVolPtr vboxStorageVolLookupByKey(virStoragePoolPtr
pool, const char *key) {
+ virConnectPtr conn = pool->conn;
VBOX_OBJECT_CHECK(conn, virStorageVolPtr, NULL);
vboxIID *hddIID = NULL;
IHardDisk *hardDisk = NULL;
@@ -7616,7 +7617,8 @@ cleanup:
return ret;
}
-static virStorageVolPtr vboxStorageVolLookupByPath(virConnectPtr conn,
const char *path) {
+static virStorageVolPtr vboxStorageVolLookupByPath(virStoragePoolPtr
pool, const char *path) {
+ virConnectPtr conn = pool->conn;
VBOX_OBJECT_CHECK(conn, virStorageVolPtr, NULL);
PRUnichar *hddPathUtf16 = NULL;
IHardDisk *hardDisk = NULL;
14 years, 7 months
[libvirt] [PATCH v8] vepa: parsing for 802.1Qb{g|h} XML
by Stefan Berger
Below is David Allan's original patch with lots of changes.
In particular, it now parses the following two XML descriptions, one
for 802.1Qbg and one for 802.1Qbh, and stores the data internally. The
actual triggering of the switch setup protocol has not been implemented
here, but the relevant code to do that should go into the functions
associatePortProfileId() and disassociatePortProfileId().
<interface type='direct'>
<source dev='eth0.100' mode='vepa'/>
<model type='virtio'/>
<virtualport type='802.1Qbg'>
<parameters managerid='12' typeid='0x123456' typeidversion='1'
instanceid='fa9b7fff-b0a0-4893-8e0e-beef4ff18f8f'/>
</virtualport>
<filterref filter='clean-traffic'/>
</interface>
<interface type='direct'>
<source dev='eth0.100' mode='vepa'/>
<model type='virtio'/>
<virtualport type='802.1Qbh'>
<parameters profileid='my_profile'/>
</virtualport>
</interface>
I'd suggest using this patch as a base for triggering the setup
protocol with the 802.1Qb{g|h} switch.
Changes from V7 to V8:
- Addressed most of Chris Wright's comments:
- indicating error in case virtualport XML node cannot be parsed
properly
- parsing hex and decimal numbers using virStrToLong_ui() with
parameter '0' for base
- tgifname (target interface name) variable wasn't necessary
to pass to openMacvtapTap function anymore
- assigning the virtual port data structure to the virDomainNetDef
only if it was previously parsed
-> still leaving possibility to start a domain with macvtap but no profile
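The base-0 parsing mentioned above can be sketched with the standard C library. This is only an illustration, not the libvirt code: parse_uint_max() is a hypothetical helper that uses strtoul() where the patch uses virStrToLong_ui(), and it assumes the same rules as the patch (whole string must be consumed, value must fit the field).

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of libvirt): parse 'str' as an unsigned
 * number with base 0, so that "0x123456" and "1193046" are both accepted,
 * then reject anything larger than 'max'.
 * Returns 0 on success, -1 on failure.
 */
static int
parse_uint_max(const char *str, unsigned long max, unsigned long *out)
{
    char *end = NULL;
    unsigned long val;

    assert(out != NULL);

    /* strtoul() would accept leading whitespace and a sign; require a digit. */
    if (str == NULL || !isdigit((unsigned char)str[0]))
        return -1;

    errno = 0;
    val = strtoul(str, &end, 0);   /* base 0: prefix decides hex/octal/decimal */
    if (errno != 0 || end == str || *end != '\0')
        return -1;                 /* overflow or trailing garbage */

    if (val > max)
        return -1;                 /* does not fit the target field */

    *out = val;
    return 0;
}
```

A caller would use 0xff as the limit for managerid and typeidversion and 0xffffff for typeid. Note that with base 0 a leading "0" selects octal in strtoul(), so "012" parses as 10.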
Changes from V6 to V7:
- make sure that the error code returned by openMacvtapTap() is a negative number
in case the associatePortProfileId() function failed.
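The error handling this change refers to follows the usual goto-unwind pattern. Below is a minimal sketch of that pattern only; open_device() and the two counters are hypothetical stand-ins, not the functions from src/util/macvtap.c. A failure after the association step undoes the association and then falls through to delete the link, and every error path returns a negative value.

```c
#include <assert.h>

/* Counters standing in for the real cleanup calls (hypothetical). */
static int links_deleted;
static int profiles_disassociated;

/*
 * Sketch of the unwind order: the link is created first, then the port
 * profile is associated, then the interface is brought up. Each failure
 * jumps to the label that undoes everything done so far.
 */
static int
open_device(int fail_associate, int fail_iface_up)
{
    int rc = -1;

    /* Step 1: the link is assumed to have been created successfully. */

    /* Step 2: association failed, so only the link must be removed. */
    if (fail_associate) {
        rc = -1;
        goto link_del_exit;
    }

    /* Step 3: a later failure must also undo the association. */
    if (fail_iface_up) {
        rc = -1;
        goto disassociate_exit;
    }

    return 3;   /* stand-in for the tap file descriptor */

disassociate_exit:
    profiles_disassociated++;
    /* fall through: the link has to go as well */
link_del_exit:
    links_deleted++;
    assert(rc < 0);   /* callers rely on a negative value on failure */
    return rc;
}
```

Ordering the labels in reverse order of setup keeps each cleanup step in exactly one place.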
Changes from V5 to V6:
- renaming vsi in the XML to virtualport
- replace all occurrences of vsi in the source as well
Changes from V4 to V5:
- removing mode and MAC address parameters from the functions that
will communicate with the hardware directly or indirectly
Changes from V3 to V4:
- moving the associate and disassociate functions to the end of the
file so that subsequent patches can more easily make them generally
available for export
- passing the macvtap interface name rather than the link device since
this otherwise gives funny side effects when using netlink messages
where IFLA_IFNAME and IFLA_ADDRESS are specified and the link dev
all of a sudden gets the MAC address of the macvtap interface.
- Removing rc = -1 error indications in the 802.1Qbg|h setup path, so
that the setup does not fail here in case we want to use hook scripts
for it.
Changes from V2 to V3:
- if instance ID UUID is not supplied it will automatically be generated
- adapted schema to make instance ID UUID optional
- added test case
Some of the changes from V1 to V2:
- parser and XML generator have been separated into their own
functions so they can be re-used elsewhere (passthrough case
for example)
- Adapted XML parser and generator to support the types shown above
(802.1Qbg, 802.1Qbh).
- Adapted schema to above XML
- Adapted test XML to above XML
- Passing through the VM's UUID which seems to be necessary for
802.1Qbh -- sorry no host UUID
- adding virtual function ID to association function, in case it's
necessary to use (for SR-IOV)
Signed-off-by: Stefan Berger <stefanb(a)us.ibm.com>
From a945107f047c7cd71f9c1b74fd74c47d8cdc3670 Mon Sep 17 00:00:00 2001
From: David Allan <dallan(a)redhat.com>
Date: Fri, 12 Mar 2010 13:25:04 -0500
Subject: [PATCH 1/1] POC of port profile id support
* Modified schema per DanPB's feedback
* Added test for modified schema
---
docs/schemas/domain.rng | 69 ++++++++++++++
src/conf/domain_conf.c | 155 +++++++++++++++++++++++++++++++++
src/conf/domain_conf.h | 35 +++++++
src/qemu/qemu_conf.c | 18 +--
src/qemu/qemu_conf.h | 5 -
src/qemu/qemu_driver.c | 17 +--
src/util/macvtap.c | 151 +++++++++++++++++++++++++++-----
src/util/macvtap.h | 10 +-
tests/domainschemadata/portprofile.xml | 36 +++++++
9 files changed, 446 insertions(+), 50 deletions(-)
create mode 100644 tests/domainschemadata/portprofile.xml
Index: libvirt-acl/docs/schemas/domain.rng
===================================================================
--- libvirt-acl.orig/docs/schemas/domain.rng
+++ libvirt-acl/docs/schemas/domain.rng
@@ -817,6 +817,9 @@
</optional>
<empty/>
</element>
+ <optional>
+ <ref name="virtualPortProfile"/>
+ </optional>
<ref name="interface-options"/>
</interleave>
</group>
@@ -902,6 +905,45 @@
</optional>
</interleave>
</define>
+ <define name="virtualPortProfile">
+ <choice>
+ <group>
+ <element name="virtualport">
+ <attribute name="type">
+ <value>802.1Qbg</value>
+ </attribute>
+ <element name="parameters">
+ <attribute name="managerid">
+ <ref name="uint8range"/>
+ </attribute>
+ <attribute name="typeid">
+ <ref name="uint24range"/>
+ </attribute>
+ <attribute name="typeidversion">
+ <ref name="uint8range"/>
+ </attribute>
+ <optional>
+ <attribute name="instanceid">
+ <ref name="UUID"/>
+ </attribute>
+ </optional>
+ </element>
+ </element>
+ </group>
+ <group>
+ <element name="virtualport">
+ <attribute name="type">
+ <value>802.1Qbh</value>
+ </attribute>
+ <element name="parameters">
+ <attribute name="profileid">
+ <ref name="virtualPortProfileID"/>
+ </attribute>
+ </element>
+ </element>
+ </group>
+ </choice>
+ </define>
<!--
An emulator description is just a path to the binary used for the task
-->
@@ -1769,4 +1811,31 @@
<param name="pattern">[a-zA-Z0-9_\.:]+</param>
</data>
</define>
+ <define name="uint8range">
+ <choice>
+ <data type="string">
+ <param name="pattern">0x[0-9a-fA-F]{1,2}</param>
+ </data>
+ <data type="int">
+ <param name="minInclusive">0</param>
+ <param name="maxInclusive">255</param>
+ </data>
+ </choice>
+ </define>
+ <define name="uint24range">
+ <choice>
+ <data type="string">
+ <param name="pattern">0x[0-9a-fA-F]{1,6}</param>
+ </data>
+ <data type="int">
+ <param name="minInclusive">0</param>
+ <param name="maxInclusive">16777215</param>
+ </data>
+ </choice>
+ </define>
+ <define name="virtualPortProfileID">
+ <data type="string">
+ <param name="maxLength">39</param>
+ </data>
+ </define>
</grammar>
Index: libvirt-acl/src/conf/domain_conf.c
===================================================================
--- libvirt-acl.orig/src/conf/domain_conf.c
+++ libvirt-acl/src/conf/domain_conf.c
@@ -242,6 +242,11 @@ VIR_ENUM_IMPL(virDomainNetdevMacvtap, VI
"private",
"bridge")
+VIR_ENUM_IMPL(virVirtualPort, VIR_VIRTUALPORT_TYPE_LAST,
+ "none",
+ "802.1Qbg",
+ "802.1Qbh")
+
VIR_ENUM_IMPL(virDomainClockOffset, VIR_DOMAIN_CLOCK_OFFSET_LAST,
"utc",
"localtime",
@@ -1807,6 +1812,190 @@ cleanup:
}
+static int
+virVirtualPortProfileDefParseXML(xmlNodePtr node,
+ virVirtualPortProfileDefPtr virtPort)
+{
+ int ret = -1;
+ char *virtPortType;
+ char *virtPortManagerID = NULL;
+ char *virtPortTypeID = NULL;
+ char *virtPortTypeIDVersion = NULL;
+ char *virtPortInstanceID = NULL;
+ char *virtPortProfileID = NULL;
+ xmlNodePtr cur = node->children;
+ const char *msg = NULL;
+
+ virtPortType = virXMLPropString(node, "type");
+ if (!virtPortType)
+ return -1;
+
+ while (cur != NULL) {
+ if (xmlStrEqual(cur->name, BAD_CAST "parameters")) {
+
+ virtPortManagerID = virXMLPropString(cur, "managerid");
+ virtPortTypeID = virXMLPropString(cur, "typeid");
+ virtPortTypeIDVersion = virXMLPropString(cur, "typeidversion");
+ virtPortInstanceID = virXMLPropString(cur, "instanceid");
+ virtPortProfileID = virXMLPropString(cur, "profileid");
+
+ break;
+ }
+
+ cur = cur->next;
+ }
+
+ virtPort->virtPortType = VIR_VIRTUALPORT_NONE;
+
+ switch (virVirtualPortTypeFromString(virtPortType)) {
+
+ case VIR_VIRTUALPORT_8021QBG:
+ if (virtPortManagerID != NULL && virtPortTypeID != NULL &&
+ virtPortTypeIDVersion != NULL) {
+ unsigned int val;
+
+ if (virStrToLong_ui(virtPortManagerID, NULL, 0, &val)) {
+ msg = _("cannot parse value of managerid parameter");
+ goto err_exit;
+ }
+
+ if (val > 0xff) {
+ msg = _("value of managerid out of range");
+ goto err_exit;
+ }
+
+ virtPort->u.virtPort8021Qbg.managerID = (uint8_t)val;
+
+ if (virStrToLong_ui(virtPortTypeID, NULL, 0, &val)) {
+ msg = _("cannot parse value of typeid parameter");
+ goto err_exit;
+ }
+
+ if (val > 0xffffff) {
+ msg = _("value for typeid out of range");
+ goto err_exit;
+ }
+
+ virtPort->u.virtPort8021Qbg.typeID = (uint32_t)val;
+
+ if (virStrToLong_ui(virtPortTypeIDVersion, NULL, 0, &val)) {
+ msg = _("cannot parse value of typeidversion parameter");
+ goto err_exit;
+ }
+
+ if (val > 0xff) {
+ msg = _("value of typeidversion out of range");
+ goto err_exit;
+ }
+
+ virtPort->u.virtPort8021Qbg.typeIDVersion = (uint8_t)val;
+
+ if (virtPortInstanceID != NULL) {
+ if (virUUIDParse(virtPortInstanceID,
+ virtPort->u.virtPort8021Qbg.instanceID)) {
+ msg = _("cannot parse instanceid parameter as a uuid");
+ goto err_exit;
+ }
+ } else {
+ if (virUUIDGenerate(virtPort->u.virtPort8021Qbg.instanceID)) {
+ msg = _("cannot generate a random uuid for instanceid");
+ goto err_exit;
+ }
+ }
+
+ virtPort->virtPortType = VIR_VIRTUALPORT_8021QBG;
+ ret = 0;
+ } else {
+ msg = _("a parameter is missing for 802.1Qbg description");
+ goto err_exit;
+ }
+ break;
+
+ case VIR_VIRTUALPORT_8021QBH:
+ if (virtPortProfileID != NULL) {
+ if (virStrcpyStatic(virtPort->u.virtPort8021Qbh.profileID,
+ virtPortProfileID) != NULL) {
+ virtPort->virtPortType = VIR_VIRTUALPORT_8021QBH;
+ ret = 0;
+ } else {
+ msg = _("profileid parameter too long");
+ goto err_exit;
+ }
+ } else {
+ msg = _("profileid parameter is missing for 802.1Qbh description");
+ goto err_exit;
+ }
+ break;
+
+
+ default:
+ case VIR_VIRTUALPORT_NONE:
+ case VIR_VIRTUALPORT_TYPE_LAST:
+ msg = _("unknown virtualport type");
+ goto err_exit;
+ break;
+ }
+
+err_exit:
+
+ if (msg)
+ virDomainReportError(VIR_ERR_INTERNAL_ERROR, "%s", msg);
+
+ VIR_FREE(virtPortManagerID);
+ VIR_FREE(virtPortTypeID);
+ VIR_FREE(virtPortTypeIDVersion);
+ VIR_FREE(virtPortInstanceID);
+ VIR_FREE(virtPortProfileID);
+ VIR_FREE(virtPortType);
+
+ return ret;
+}
+
+
+static void
+virVirtualPortProfileFormat(virBufferPtr buf,
+ virVirtualPortProfileDefPtr virtPort,
+ const char *indent)
+{
+ char uuidstr[VIR_UUID_STRING_BUFLEN];
+
+ if (virtPort->virtPortType == VIR_VIRTUALPORT_NONE)
+ return;
+
+ virBufferVSprintf(buf, "%s<virtualport type='%s'>\n",
+ indent,
+ virVirtualPortTypeToString(virtPort->virtPortType));
+
+ switch (virtPort->virtPortType) {
+ case VIR_VIRTUALPORT_NONE:
+ case VIR_VIRTUALPORT_TYPE_LAST:
+ break;
+
+ case VIR_VIRTUALPORT_8021QBG:
+ virUUIDFormat(virtPort->u.virtPort8021Qbg.instanceID,
+ uuidstr);
+ virBufferVSprintf(buf,
+ "%s <parameters managerid='%d' typeid='%d' "
+ "typeidversion='%d' instanceid='%s'/>\n",
+ indent,
+ virtPort->u.virtPort8021Qbg.managerID,
+ virtPort->u.virtPort8021Qbg.typeID,
+ virtPort->u.virtPort8021Qbg.typeIDVersion,
+ uuidstr);
+ break;
+
+ case VIR_VIRTUALPORT_8021QBH:
+ virBufferVSprintf(buf,
+ "%s <parameters profileid='%s'/>\n",
+ indent,
+ virtPort->u.virtPort8021Qbh.profileID);
+ break;
+ }
+
+ virBufferVSprintf(buf, "%s</virtualport>\n", indent);
+}
+
+
/* Parse the XML definition for a network interface
* @param node XML nodeset to parse for net definition
* @return 0 on success, -1 on failure
@@ -1832,6 +2021,8 @@ virDomainNetDefParseXML(virCapsPtr caps,
char *devaddr = NULL;
char *mode = NULL;
virNWFilterHashTablePtr filterparams = NULL;
+ virVirtualPortProfileDef virtPort;
+ bool virtPortParsed = false;
if (VIR_ALLOC(def) < 0) {
virReportOOMError();
@@ -1873,6 +2064,12 @@ virDomainNetDefParseXML(virCapsPtr caps,
xmlStrEqual(cur->name, BAD_CAST "source")) {
dev = virXMLPropString(cur, "dev");
mode = virXMLPropString(cur, "mode");
+ } else if ((virtPortParsed == false) &&
+ (def->type == VIR_DOMAIN_NET_TYPE_DIRECT) &&
+ xmlStrEqual(cur->name, BAD_CAST "virtualport")) {
+ if (virVirtualPortProfileDefParseXML(cur, &virtPort))
+ goto error;
+ virtPortParsed = true;
} else if ((network == NULL) &&
((def->type == VIR_DOMAIN_NET_TYPE_SERVER) ||
(def->type == VIR_DOMAIN_NET_TYPE_CLIENT) ||
@@ -2048,6 +2245,9 @@ virDomainNetDefParseXML(virCapsPtr caps,
} else
def->data.direct.mode = VIR_DOMAIN_NETDEV_MACVTAP_MODE_VEPA;
+ if (virtPortParsed)
+ def->data.direct.virtPortProfile = virtPort;
+
def->data.direct.linkdev = dev;
dev = NULL;
@@ -5140,6 +5340,8 @@ virDomainNetDefFormat(virBufferPtr buf,
virBufferVSprintf(buf, " mode='%s'",
virDomainNetdevMacvtapTypeToString(def->data.direct.mode));
virBufferAddLit(buf, "/>\n");
+ virVirtualPortProfileFormat(buf, &def->data.direct.virtPortProfile,
+ " ");
break;
case VIR_DOMAIN_NET_TYPE_USER:
Index: libvirt-acl/src/conf/domain_conf.h
===================================================================
--- libvirt-acl.orig/src/conf/domain_conf.h
+++ libvirt-acl/src/conf/domain_conf.h
@@ -259,6 +259,39 @@ enum virDomainNetdevMacvtapType {
};
+enum virVirtualPortType {
+ VIR_VIRTUALPORT_NONE,
+ VIR_VIRTUALPORT_8021QBG,
+ VIR_VIRTUALPORT_8021QBH,
+
+ VIR_VIRTUALPORT_TYPE_LAST,
+};
+
+# ifdef IFLA_VF_PORT_PROFILE_MAX
+# define LIBVIRT_IFLA_VF_PORT_PROFILE_MAX IFLA_VF_PORT_PROFILE_MAX
+# else
+# define LIBVIRT_IFLA_VF_PORT_PROFILE_MAX 40
+# endif
+
+/* profile data for macvtap (VEPA) */
+typedef struct _virVirtualPortProfileDef virVirtualPortProfileDef;
+typedef virVirtualPortProfileDef *virVirtualPortProfileDefPtr;
+struct _virVirtualPortProfileDef {
+ enum virVirtualPortType virtPortType;
+ union {
+ struct {
+ uint8_t managerID;
+ uint32_t typeID; // 24 bit valid
+ uint8_t typeIDVersion;
+ unsigned char instanceID[VIR_UUID_BUFLEN];
+ } virtPort8021Qbg;
+ struct {
+ char profileID[LIBVIRT_IFLA_VF_PORT_PROFILE_MAX];
+ } virtPort8021Qbh;
+ } u;
+};
+
+
/* Stores the virtual network interface configuration */
typedef struct _virDomainNetDef virDomainNetDef;
typedef virDomainNetDef *virDomainNetDefPtr;
@@ -290,6 +323,7 @@ struct _virDomainNetDef {
struct {
char *linkdev;
int mode;
+ virVirtualPortProfileDef virtPortProfile;
} direct;
} data;
char *ifname;
@@ -1089,6 +1123,7 @@ VIR_ENUM_DECL(virDomainSeclabel)
VIR_ENUM_DECL(virDomainClockOffset)
VIR_ENUM_DECL(virDomainNetdevMacvtap)
+VIR_ENUM_DECL(virVirtualPort)
VIR_ENUM_DECL(virDomainTimerName)
VIR_ENUM_DECL(virDomainTimerTrack)
Index: libvirt-acl/src/util/macvtap.c
===================================================================
--- libvirt-acl.orig/src/util/macvtap.c
+++ libvirt-acl/src/util/macvtap.c
@@ -43,6 +43,7 @@
# include "util.h"
# include "memory.h"
+# include "logging.h"
# include "macvtap.h"
# include "interface.h"
# include "conf/domain_conf.h"
@@ -57,6 +58,16 @@
# define MACVTAP_NAME_PREFIX "macvtap"
# define MACVTAP_NAME_PATTERN "macvtap%d"
+
+static int associatePortProfileId(const char *macvtap_ifname,
+ const virVirtualPortProfileDefPtr virtPort,
+ int vf,
+ const unsigned char *vmuuid);
+
+static int disassociatePortProfileId(const char *macvtap_ifname,
+ const virVirtualPortProfileDefPtr virtPort);
+
+
static int nlOpen(void)
{
int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
@@ -567,39 +578,36 @@ configMacvtapTap(int tapfd, int vnet_hdr
return 0;
}
-
-
/**
* openMacvtapTap:
* Create an instance of a macvtap device and open its tap character
* device.
- * @tgifname: Interface name that the macvtap is supposed to have. May
- * be NULL if this function is supposed to choose a name
- * @macaddress: The MAC address for the macvtap device
- * @linkdev: The interface name of the NIC to connect to the external bridge
- * @mode_str: String describing the mode. Valid are 'bridge', 'vepa' and
- * 'private'.
+ * @net: pointer to the virDomainNetDef object describing the direct
+ * type of an interface
* @res_ifname: Pointer to a string pointer where the actual name of the
* interface will be stored into if everything succeeded. It is up
* to the caller to free the string.
+ * @vnet_hdr: Whether to enable IFF_VNET_HDR on the interface
+ * @vmuuid: The (raw) UUID of the VM
*
* Returns file descriptor of the tap device in case of success,
* negative value otherwise with error reported.
*
+ * Open a macvtap device and trigger the switch setup protocol
+ * if valid port profile parameters were provided.
*/
int
-openMacvtapTap(const char *tgifname,
- const unsigned char *macaddress,
- const char *linkdev,
- int mode,
+openMacvtapTap(virDomainNetDefPtr net,
char **res_ifname,
- int vnet_hdr)
+ int vnet_hdr,
+ const unsigned char *vmuuid)
{
const char *type = "macvtap";
+ const char *tgifname = net->ifname;
int c, rc;
char ifname[IFNAMSIZ];
int retries, do_retry = 0;
- uint32_t macvtapMode = macvtapModeFromInt(mode);
+ uint32_t macvtapMode = macvtapModeFromInt(net->data.direct.mode);
const char *cr_ifname;
int ifindex;
@@ -616,7 +624,7 @@ openMacvtapTap(const char *tgifname,
return -1;
}
cr_ifname = tgifname;
- rc = link_add(type, macaddress, 6, tgifname, linkdev,
+ rc = link_add(type, net->mac, 6, tgifname, net->data.direct.linkdev,
macvtapMode, &do_retry);
if (rc)
return -1;
@@ -626,7 +634,8 @@ create_name:
for (c = 0; c < 8192; c++) {
snprintf(ifname, sizeof(ifname), MACVTAP_NAME_PATTERN, c);
if (ifaceGetIndex(false, ifname, &ifindex) == ENODEV) {
- rc = link_add(type, macaddress, 6, ifname, linkdev,
+ rc = link_add(type, net->mac, 6, ifname,
+ net->data.direct.linkdev,
macvtapMode, &do_retry);
if (rc == 0)
break;
@@ -639,6 +648,14 @@ create_name:
cr_ifname = ifname;
}
+ if (associatePortProfileId(cr_ifname,
+ &net->data.direct.virtPortProfile,
+ -1,
+ vmuuid) != 0) {
+ rc = -1;
+ goto link_del_exit;
+ }
+
rc = ifaceUp(cr_ifname);
if (rc != 0) {
virReportSystemError(errno,
@@ -647,7 +664,7 @@ create_name:
"MAC address"),
cr_ifname);
rc = -1;
- goto link_del_exit;
+ goto disassociate_exit;
}
rc = openTap(cr_ifname, 10);
@@ -656,14 +673,18 @@ create_name:
if (configMacvtapTap(rc, vnet_hdr) < 0) {
close(rc);
rc = -1;
- goto link_del_exit;
+ goto disassociate_exit;
}
*res_ifname = strdup(cr_ifname);
} else
- goto link_del_exit;
+ goto disassociate_exit;
return rc;
+disassociate_exit:
+ disassociatePortProfileId(cr_ifname,
+ &net->data.direct.virtPortProfile);
+
link_del_exit:
link_del(cr_ifname);
@@ -673,14 +694,102 @@ link_del_exit:
/**
* delMacvtap:
- * @ifname : The name of the macvtap interface
+ * @net: pointer to virDomainNetDef object
*
- * Delete an interface given its name.
+ * Delete an interface given its name. Disassociate
+ * it with the switch if port profile parameters
+ * were provided.
*/
void
-delMacvtap(const char *ifname)
+delMacvtap(virDomainNetDefPtr net)
{
- link_del(ifname);
+ if (net->ifname) {
+ disassociatePortProfileId(net->ifname,
+ &net->data.direct.virtPortProfile);
+ link_del(net->ifname);
+ }
}
#endif
+
+
+/**
+ * associatePortProfileId
+ *
+ * @macvtap_ifname: The name of the macvtap device
+ * @virtPort: pointer to the object holding port profile parameters
+ * @vf: virtual function number, -1 if to be ignored
+ * @vmuuid : the UUID of the virtual machine
+ *
+ * Associate a port on a switch with a profile. This function
+ * may notify a kernel driver or an external daemon to run
+ * the setup protocol. If profile parameters were not supplied
+ * by the user, then this function returns without doing
+ * anything.
+ *
+ * Returns 0 in case of success, != 0 otherwise with error
+ * having been reported.
+ */
+static int
+associatePortProfileId(const char *macvtap_ifname,
+ const virVirtualPortProfileDefPtr virtPort,
+ int vf,
+ const unsigned char *vmuuid)
+{
+ int rc = 0;
+ VIR_DEBUG("Associating port profile '%p' on link device '%s'",
+ virtPort, macvtap_ifname);
+ (void)vf;
+ (void)vmuuid;
+
+ switch (virtPort->virtPortType) {
+ case VIR_VIRTUALPORT_NONE:
+ case VIR_VIRTUALPORT_TYPE_LAST:
+ break;
+
+ case VIR_VIRTUALPORT_8021QBG:
+
+ break;
+
+ case VIR_VIRTUALPORT_8021QBH:
+
+ break;
+ }
+
+ return rc;
+}
+
+
+/**
+ * disassociatePortProfileId
+ *
+ * @macvtap_ifname: The name of the macvtap device
+ * @virtPort: pointer to the object holding port profile parameters
+ *
+ * Returns 0 in case of success, != 0 otherwise with error
+ * having been reported.
+ */
+static int
+disassociatePortProfileId(const char *macvtap_ifname,
+ const virVirtualPortProfileDefPtr virtPort)
+{
+ int rc = 0;
+ VIR_DEBUG("Disassociating port profile id '%p' on link device '%s' ",
+ virtPort, macvtap_ifname);
+
+ switch (virtPort->virtPortType) {
+ case VIR_VIRTUALPORT_NONE:
+ case VIR_VIRTUALPORT_TYPE_LAST:
+ break;
+
+ case VIR_VIRTUALPORT_8021QBG:
+
+ break;
+
+ case VIR_VIRTUALPORT_8021QBH:
+
+ break;
+ }
+
+ return rc;
+}
Index: libvirt-acl/src/util/macvtap.h
===================================================================
--- libvirt-acl.orig/src/util/macvtap.h
+++ libvirt-acl/src/util/macvtap.h
@@ -27,15 +27,14 @@
# if defined(WITH_MACVTAP)
# include "internal.h"
+# include "conf/domain_conf.h"
-int openMacvtapTap(const char *ifname,
- const unsigned char *macaddress,
- const char *linkdev,
- int mode,
+int openMacvtapTap(virDomainNetDefPtr net,
char **res_ifname,
- int vnet_hdr);
+ int vnet_hdr,
+ const unsigned char *vmuuid);
-void delMacvtap(const char *ifname);
+void delMacvtap(virDomainNetDefPtr net);
# endif /* WITH_MACVTAP */
Index: libvirt-acl/tests/domainschemadata/portprofile.xml
===================================================================
--- /dev/null
+++ libvirt-acl/tests/domainschemadata/portprofile.xml
@@ -0,0 +1,36 @@
+<domain type='lxc'>
+ <name>portprofile</name>
+ <uuid>00000000-0000-0000-0000-000000000000</uuid>
+ <memory>1048576</memory>
+ <os>
+ <type>exe</type>
+ <init>/sh</init>
+ </os>
+ <devices>
+ <interface type='direct'>
+ <source dev='eth0' mode='vepa'/>
+ <virtualport type='802.1Qbg'>
+ <parameters managerid='12' typeid='1193046' typeidversion='1'
+ instanceid='fa9b7fff-b0a0-4893-8e0e-beef4ff18f8f'/>
+ </virtualport>
+ </interface>
+ <interface type='direct'>
+ <source dev='eth0' mode='vepa'/>
+ <virtualport type='802.1Qbg'>
+ <parameters managerid='12' typeid='1193046' typeidversion='1'/>
+ </virtualport>
+ </interface>
+ <interface type='direct'>
+ <source dev='eth0' mode='vepa'/>
+ <virtualport type='802.1Qbh'>
+ <parameters profileid='my_profile'/>
+ </virtualport>
+ </interface>
+ <interface type='direct'>
+ <source dev='eth0' mode='vepa'/>
+ </interface>
+ <interface type='direct'>
+ <source dev='eth0' mode='vepa'/>
+ </interface>
+ </devices>
+</domain>
Index: libvirt-acl/src/qemu/qemu_conf.h
===================================================================
--- libvirt-acl.orig/src/qemu/qemu_conf.h
+++ libvirt-acl/src/qemu/qemu_conf.h
@@ -274,9 +274,8 @@ qemudOpenVhostNet(virDomainNetDefPtr net
int qemudPhysIfaceConnect(virConnectPtr conn,
struct qemud_driver *driver,
virDomainNetDefPtr net,
- char *linkdev,
- int brmode,
- unsigned long long qemuCmdFlags);
+ unsigned long long qemuCmdFlags,
+ const unsigned char *vmuuid);
int qemudProbeMachineTypes (const char *binary,
virCapsGuestMachinePtr **machines,
Index: libvirt-acl/src/qemu/qemu_driver.c
===================================================================
--- libvirt-acl.orig/src/qemu/qemu_driver.c
+++ libvirt-acl/src/qemu/qemu_driver.c
@@ -3702,10 +3702,8 @@ static void qemudShutdownVMDaemon(struct
def = vm->def;
for (i = 0; i < def->nnets; i++) {
virDomainNetDefPtr net = def->nets[i];
- if (net->type == VIR_DOMAIN_NET_TYPE_DIRECT) {
- if (net->ifname)
- delMacvtap(net->ifname);
- }
+ if (net->type == VIR_DOMAIN_NET_TYPE_DIRECT)
+ delMacvtap(net);
}
#endif
@@ -7464,9 +7462,8 @@ static int qemudDomainAttachNetDevice(vi
}
if ((tapfd = qemudPhysIfaceConnect(conn, driver, net,
- net->data.direct.linkdev,
- net->data.direct.mode,
- qemuCmdFlags)) < 0)
+ qemuCmdFlags,
+ vm->def->uuid)) < 0)
return -1;
}
@@ -8509,10 +8506,8 @@ qemudDomainDetachNetDevice(struct qemud_
virNWFilterTearNWFilter(detach);
#if WITH_MACVTAP
- if (detach->type == VIR_DOMAIN_NET_TYPE_DIRECT) {
- if (detach->ifname)
- delMacvtap(detach->ifname);
- }
+ if (detach->type == VIR_DOMAIN_NET_TYPE_DIRECT)
+ delMacvtap(detach);
#endif
if ((driver->macFilter) && (detach->ifname != NULL)) {
Index: libvirt-acl/src/qemu/qemu_conf.c
===================================================================
--- libvirt-acl.orig/src/qemu/qemu_conf.c
+++ libvirt-acl/src/qemu/qemu_conf.c
@@ -1470,9 +1470,8 @@ int
qemudPhysIfaceConnect(virConnectPtr conn,
struct qemud_driver *driver,
virDomainNetDefPtr net,
- char *linkdev,
- int brmode,
- unsigned long long qemuCmdFlags)
+ unsigned long long qemuCmdFlags,
+ const unsigned char *vmuuid)
{
int rc;
#if WITH_MACVTAP
@@ -1484,8 +1483,7 @@ qemudPhysIfaceConnect(virConnectPtr conn
net->model && STREQ(net->model, "virtio"))
vnet_hdr = 1;
- rc = openMacvtapTap(net->ifname, net->mac, linkdev, brmode,
- &res_ifname, vnet_hdr);
+ rc = openMacvtapTap(net, &res_ifname, vnet_hdr, vmuuid);
if (rc >= 0) {
VIR_FREE(net->ifname);
net->ifname = res_ifname;
@@ -1505,17 +1503,16 @@ qemudPhysIfaceConnect(virConnectPtr conn
if (err) {
close(rc);
rc = -1;
- delMacvtap(net->ifname);
+ delMacvtap(net);
}
}
}
#else
(void)conn;
(void)net;
- (void)linkdev;
- (void)brmode;
(void)qemuCmdFlags;
(void)driver;
+ (void)vmuuid;
qemuReportError(VIR_ERR_INTERNAL_ERROR,
"%s", _("No support for macvtap device"));
rc = -1;
@@ -4135,9 +4132,8 @@ int qemudBuildCommandLine(virConnectPtr
goto no_memory;
} else if (net->type == VIR_DOMAIN_NET_TYPE_DIRECT) {
int tapfd = qemudPhysIfaceConnect(conn, driver, net,
- net->data.direct.linkdev,
- net->data.direct.mode,
- qemuCmdFlags);
+ qemuCmdFlags,
+ def->uuid);
if (tapfd < 0)
goto error;
14 years, 7 months
[libvirt] [PATCH] Query block allocation extent from QEMU monitor
by Daniel P. Berrange
The virDomainGetBlockInfo API allows querying the physical block
extent and the allocated block extent. These are normally the
same value unless a special format like qcow2 is stored
inside a block device. In this scenario we can query QEMU
to get the actual allocated extent.
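The decision described here can be sketched in isolation as follows. This is a simplified illustration under stated assumptions, not the driver code: struct block_info, fill_allocation() and the extent_query_fn callback are hypothetical names, and the callback stands in for the QEMU monitor query.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the block info returned to the caller. */
struct block_info {
    unsigned long long capacity;
    unsigned long long allocation;
    unsigned long long physical;
};

/* Hypothetical callback standing in for the monitor query. */
typedef int (*extent_query_fn)(unsigned long long *extent);

/* Stub query used for illustration: always reports 4096 bytes. */
static int
fixed_extent_query(unsigned long long *extent)
{
    *extent = 4096;
    return 0;
}

/*
 * Default the allocation to the physical size, and only ask the running
 * guest when a non-raw format sits on a block device.
 * Returns 0 on success, or the query's error code.
 */
static int
fill_allocation(struct block_info *info, bool vm_active,
                bool is_block_dev, bool is_raw, extent_query_fn query)
{
    assert(info != NULL);

    info->allocation = info->physical;

    if (vm_active && is_block_dev && !is_raw && query != NULL)
        return query(&info->allocation);

    return 0;
}
```

For an inactive guest, a raw image, or a plain file, the sketch leaves the allocation equal to the physical size, which matches the default described above.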
Since last time:
- Return fatal error in text monitor
- Only invoke monitor command for block devices
- Fix error handling JSON code
* src/qemu/qemu_driver.c: Fill in block allocation extent when VM
is running
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Add
API to query the highest block extent via info blockstats
---
src/qemu/qemu_driver.c | 33 ++++++++++++---
src/qemu/qemu_monitor.c | 16 +++++++
src/qemu/qemu_monitor.h | 4 ++
src/qemu/qemu_monitor_json.c | 94 ++++++++++++++++++++++++++++++++++++++++++
src/qemu/qemu_monitor_json.h | 3 +
src/qemu/qemu_monitor_text.c | 10 ++++
src/qemu/qemu_monitor_text.h | 4 +-
7 files changed, 156 insertions(+), 8 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 4ef2f57..356c4be 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -9651,6 +9651,7 @@ static int qemuDomainGetBlockInfo(virDomainPtr dom,
int fd = -1;
off_t end;
virStorageFileMetadata meta;
+ virDomainDiskDefPtr disk = NULL;
struct stat sb;
int i;
@@ -9677,19 +9678,17 @@ static int qemuDomainGetBlockInfo(virDomainPtr dom,
for (i = 0 ; i < vm->def->ndisks ; i++) {
if (vm->def->disks[i]->src != NULL &&
STREQ (vm->def->disks[i]->src, path)) {
- ret = 0;
+ disk = vm->def->disks[i];
break;
}
}
- if (ret != 0) {
+ if (!disk) {
qemuReportError(VIR_ERR_INVALID_ARG,
_("invalid path %s not assigned to domain"), path);
goto cleanup;
}
- ret = -1;
-
/* The path is correct, now try to open it and get its size. */
fd = open (path, O_RDONLY);
if (fd == -1) {
@@ -9740,11 +9739,31 @@ static int qemuDomainGetBlockInfo(virDomainPtr dom,
if (meta.capacity)
info->capacity = meta.capacity;
- /* XXX allocation will need to be pulled from QEMU for
- * the qcow inside LVM case */
+ /* Set default value .. */
info->allocation = info->physical;
- ret = 0;
+ /* ..but if guest is running & not using raw
+ disk format and on a block device, then query
+ highest allocated extent from QEMU */
+ if (virDomainObjIsActive(vm) &&
+ disk->type == VIR_DOMAIN_DISK_TYPE_BLOCK &&
+ meta.format != VIR_STORAGE_FILE_RAW &&
+ S_ISBLK(sb.st_mode)) {
+ qemuDomainObjPrivatePtr priv = vm->privateData;
+ if (qemuDomainObjBeginJob(vm) < 0)
+ goto cleanup;
+
+ qemuDomainObjEnterMonitor(vm);
+ ret = qemuMonitorGetBlockExtent(priv->mon,
+ disk->info.alias,
+ &info->allocation);
+ qemuDomainObjExitMonitor(vm);
+
+ if (qemuDomainObjEndJob(vm) == 0)
+ vm = NULL;
+ } else {
+ ret = 0;
+ }
cleanup:
if (fd != -1)
diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c
index 582225e..efaf74a 100644
--- a/src/qemu/qemu_monitor.c
+++ b/src/qemu/qemu_monitor.c
@@ -1009,6 +1009,22 @@ int qemuMonitorGetBlockStatsInfo(qemuMonitorPtr mon,
return ret;
}
+int qemuMonitorGetBlockExtent(qemuMonitorPtr mon,
+ const char *devname,
+ unsigned long long *extent)
+{
+ int ret;
+ DEBUG("mon=%p, fd=%d, devname=%p",
+ mon, mon->fd, devname);
+
+ if (mon->json)
+ ret = qemuMonitorJSONGetBlockExtent(mon, devname, extent);
+ else
+ ret = qemuMonitorTextGetBlockExtent(mon, devname, extent);
+
+ return ret;
+}
+
int qemuMonitorSetVNCPassword(qemuMonitorPtr mon,
const char *password)
diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h
index 7b1589e..adfb3bc 100644
--- a/src/qemu/qemu_monitor.h
+++ b/src/qemu/qemu_monitor.h
@@ -185,6 +185,10 @@ int qemuMonitorGetBlockStatsInfo(qemuMonitorPtr mon,
long long *wr_bytes,
long long *errs);
+int qemuMonitorGetBlockExtent(qemuMonitorPtr mon,
+ const char *devname,
+ unsigned long long *extent);
+
int qemuMonitorSetVNCPassword(qemuMonitorPtr mon,
const char *password);
diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c
index 6d8f328..31a66f1 100644
--- a/src/qemu/qemu_monitor_json.c
+++ b/src/qemu/qemu_monitor_json.c
@@ -1186,6 +1186,100 @@ cleanup:
}
+int qemuMonitorJSONGetBlockExtent(qemuMonitorPtr mon,
+ const char *devname,
+ unsigned long long *extent)
+{
+ int ret = -1;
+ int i;
+ int found = 0;
+ virJSONValuePtr cmd = qemuMonitorJSONMakeCommand("query-blockstats",
+ NULL);
+ virJSONValuePtr reply = NULL;
+ virJSONValuePtr devices;
+
+ *extent = 0;
+
+ if (!cmd)
+ return -1;
+
+ if (qemuMonitorJSONCommand(mon, cmd, &reply) < 0)
+ goto cleanup;
+
+ if (qemuMonitorJSONCheckError(cmd, reply) < 0)
+ goto cleanup;
+
+ devices = virJSONValueObjectGet(reply, "return");
+ if (!devices || devices->type != VIR_JSON_TYPE_ARRAY) {
+ qemuReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("blockstats reply was missing device list"));
+ goto cleanup;
+ }
+
+ for (i = 0 ; i < virJSONValueArraySize(devices) ; i++) {
+ virJSONValuePtr dev = virJSONValueArrayGet(devices, i);
+ virJSONValuePtr stats;
+ virJSONValuePtr parent;
+ const char *thisdev;
+ if (!dev || dev->type != VIR_JSON_TYPE_OBJECT) {
+ qemuReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("blockstats device entry was not in expected format"));
+ goto cleanup;
+ }
+
+ if ((thisdev = virJSONValueObjectGetString(dev, "device")) == NULL) {
+ qemuReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("blockstats device entry was not in expected format"));
+ goto cleanup;
+ }
+
+ /* New QEMU has separate names for host & guest side of the disk
+ * and libvirt gives the host side a 'drive-' prefix. The passed
+ * in devname is the guest side though
+ */
+ if (STRPREFIX(thisdev, QEMU_DRIVE_HOST_PREFIX))
+ thisdev += strlen(QEMU_DRIVE_HOST_PREFIX);
+
+ if (STRNEQ(thisdev, devname))
+ continue;
+
+ found = 1;
+ if ((parent = virJSONValueObjectGet(dev, "parent")) == NULL ||
+ parent->type != VIR_JSON_TYPE_OBJECT) {
+ qemuReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("blockstats parent entry was not in expected format"));
+ goto cleanup;
+ }
+
+ if ((stats = virJSONValueObjectGet(parent, "stats")) == NULL ||
+ stats->type != VIR_JSON_TYPE_OBJECT) {
+ qemuReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("blockstats stats entry was not in expected format"));
+ goto cleanup;
+ }
+
+ if (virJSONValueObjectGetNumberUlong(stats, "wr_highest_offset", extent) < 0) {
+ qemuReportError(VIR_ERR_INTERNAL_ERROR,
+ _("cannot read %s statistic"),
+ "wr_highest_offset");
+ goto cleanup;
+ }
+ }
+
+ if (!found) {
+ qemuReportError(VIR_ERR_INTERNAL_ERROR,
+ _("cannot find statistics for device '%s'"), devname);
+ goto cleanup;
+ }
+ ret = 0;
+
+cleanup:
+ virJSONValueFree(cmd);
+ virJSONValueFree(reply);
+ return ret;
+}
+
+
int qemuMonitorJSONSetVNCPassword(qemuMonitorPtr mon,
const char *password)
{
diff --git a/src/qemu/qemu_monitor_json.h b/src/qemu/qemu_monitor_json.h
index 26fc865..14597f4 100644
--- a/src/qemu/qemu_monitor_json.h
+++ b/src/qemu/qemu_monitor_json.h
@@ -56,6 +56,9 @@ int qemuMonitorJSONGetBlockStatsInfo(qemuMonitorPtr mon,
long long *wr_req,
long long *wr_bytes,
long long *errs);
+int qemuMonitorJSONGetBlockExtent(qemuMonitorPtr mon,
+ const char *devname,
+ unsigned long long *extent);
int qemuMonitorJSONSetVNCPassword(qemuMonitorPtr mon,
diff --git a/src/qemu/qemu_monitor_text.c b/src/qemu/qemu_monitor_text.c
index d725d6d..72e3fd5 100644
--- a/src/qemu/qemu_monitor_text.c
+++ b/src/qemu/qemu_monitor_text.c
@@ -711,6 +711,16 @@ int qemuMonitorTextGetBlockStatsInfo(qemuMonitorPtr mon,
}
+int qemuMonitorTextGetBlockExtent(qemuMonitorPtr mon ATTRIBUTE_UNUSED,
+ const char *devname ATTRIBUTE_UNUSED,
+ unsigned long long *extent ATTRIBUTE_UNUSED)
+{
+ qemuReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("unable to query block extent with this QEMU"));
+ return -1;
+}
+
+
static int
qemuMonitorSendVNCPassphrase(qemuMonitorPtr mon ATTRIBUTE_UNUSED,
qemuMonitorMessagePtr msg,
diff --git a/src/qemu/qemu_monitor_text.h b/src/qemu/qemu_monitor_text.h
index 2a62c7e..6fb7d7a 100644
--- a/src/qemu/qemu_monitor_text.h
+++ b/src/qemu/qemu_monitor_text.h
@@ -55,7 +55,9 @@ int qemuMonitorTextGetBlockStatsInfo(qemuMonitorPtr mon,
long long *wr_req,
long long *wr_bytes,
long long *errs);
-
+int qemuMonitorTextGetBlockExtent(qemuMonitorPtr mon,
+ const char *devname,
+ unsigned long long *extent);
int qemuMonitorTextSetVNCPassword(qemuMonitorPtr mon,
const char *password);
--
1.6.6.1
[libvirt] [PATCH] lxcSetSchedulerParameters: reverse order of tests; improve a diagnostic
by Jim Meyering
Investigating something else in the vicinity,
I noticed that the ordering of tests was backwards.
Here's the fix for that.
Also, looking at the following code, I saw that failure of
virCgroupSetCpuShares could evoke no diagnostic.
Now it does.
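The reordering can be illustrated with a small stand-alone sketch. The types and names below (`struct sched_param`, `check_param`, the enum values) are hypothetical stand-ins rather than libvirt's own definitions; the only point is that the field name is validated before any complaint is made about the type of a "cpu_shares" value.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for libvirt's scheduler parameter types. */
enum field_type { FIELD_INT, FIELD_ULLONG };

struct sched_param {
    const char *field;
    enum field_type type;
    unsigned long long value;
};

enum check_result { PARAM_OK = 0, PARAM_BAD_NAME = -1, PARAM_BAD_TYPE = -2 };

/* Validate the name first: only once we know the parameter really is
 * "cpu_shares" does a diagnostic about its type make sense. */
static enum check_result
check_param(const struct sched_param *param)
{
    assert(param != NULL && param->field != NULL);

    if (strcmp(param->field, "cpu_shares") != 0) {
        fprintf(stderr, "Invalid parameter `%s'\n", param->field);
        return PARAM_BAD_NAME;
    }

    if (param->type != FIELD_ULLONG) {
        fprintf(stderr, "Invalid type for cpu_shares tunable, "
                        "expected a 'ullong'\n");
        return PARAM_BAD_TYPE;
    }

    return PARAM_OK;
}
```

With the old ordering, a parameter with an unknown name and a non-ullong type would have been reported as a badly typed cpu_shares value; with this ordering it is reported as an invalid parameter.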
>From bc3404d9f12c42cf883a43395fee6fc14c952b2c Mon Sep 17 00:00:00 2001
From: Jim Meyering <meyering(a)redhat.com>
Date: Tue, 11 May 2010 15:43:32 +0200
Subject: [PATCH] lxcSetSchedulerParameters: reverse order of tests; diagnose a failure
* src/lxc/lxc_driver.c (lxcSetSchedulerParameters): Ensure that
"->field" is "cpu_shares" before possibly giving a diagnostic about
a type for a "cpu_shares" value.
Also, virCgroupSetCpuShares could fail without evoking a diagnostic.
Add one.
---
src/lxc/lxc_driver.c | 19 ++++++++++++-------
1 files changed, 12 insertions(+), 7 deletions(-)
diff --git a/src/lxc/lxc_driver.c b/src/lxc/lxc_driver.c
index fc0df37..5636eee 100644
--- a/src/lxc/lxc_driver.c
+++ b/src/lxc/lxc_driver.c
@@ -2054,18 +2054,23 @@ static int lxcSetSchedulerParameters(virDomainPtr domain,
for (i = 0; i < nparams; i++) {
virSchedParameterPtr param = &params[i];
+
+ if (STRNEQ(param->field, "cpu_shares")) {
+ lxcError(VIR_ERR_INVALID_ARG,
+ _("Invalid parameter `%s'"), param->field);
+ goto cleanup;
+ }
+
if (param->type != VIR_DOMAIN_SCHED_FIELD_ULLONG) {
lxcError(VIR_ERR_INVALID_ARG, "%s",
- _("Invalid type for cpu_shares tunable, expected a 'ullong'"));
+ _("Invalid type for cpu_shares tunable, expected a 'ullong'"));
goto cleanup;
}
- if (STREQ(param->field, "cpu_shares")) {
- if (virCgroupSetCpuShares(group, params[i].value.ul) != 0)
- goto cleanup;
- } else {
- lxcError(VIR_ERR_INVALID_ARG,
- _("Invalid parameter `%s'"), param->field);
+ int rc = virCgroupSetCpuShares(group, params[i].value.ul);
+ if (rc != 0) {
+ virReportSystemError(-rc, _("failed to set cpu_shares=%llu"),
+ params[i].value.ul);
goto cleanup;
}
}
--
1.7.1.189.g07419
[libvirt] [v4 PATCH] add 802.1Qbh handling for port-profiles based on Stefan's previous patches
by Scott Feldman
From: Scott Feldman <scofeldm(a)cisco.com>
This patch builds on the work recently posted by Stefan Berger. It builds
on top of Stefan's three posted patches:
[PATCH v9] vepa: parsing for 802.1Qb{g|h} XML
[RFC][PATCH 1/3] vepa+vsi: Introduce dependency on libnl
[PATCH v3] Add host UUID (to libvirt capabilities)
Stefan's RFC patches 2/3 and 3/3 are incorporated into my patch.
Changes from v3 to v4:
- move from Stefan's 802.1Qb{g|h} XML v8 to v9
- move hostuuid and vf index calcs to inside doPortProfileOp8021Qbh
Changes from v2 to v3:
- remove debug fprintfs
- use virGetHostUUID (thanks Stefan!)
- fix compile issue when latest if_link.h isn't available
- change poll timeout to 10s, at 1/8 intervals
- if polling times out, log msg and return -ETIMEDOUT
Changes from v1 to v2:
- Add Stefan's code for getPortProfileStatus
- Poll for up to 2 secs for port-profile status, at 1/8 sec intervals:
- if status indicates error, abort openMacvtapTap
- if status indicates success, exit polling
- if status is "in-progress" after 2 secs of polling, exit
polling loop silently, without error
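As a rough illustration of the polling behaviour described in the v3 notes (fixed 1/8 sec interval, bounded number of attempts, -ETIMEDOUT when the status is still "in-progress" at the end), here is a minimal stand-alone sketch. The status enum, the callback type and the names `poll_port_status` and `countdown_status` are assumptions made for this example; they are not taken from the patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Hypothetical status values and callback type, for illustration only. */
enum poll_status { STATUS_SUCCESS, STATUS_INPROGRESS, STATUS_ERROR };
typedef enum poll_status (*status_fn)(void *opaque);

/* Query the status at most max_polls times, sleeping interval_ms
 * milliseconds after each "in-progress" answer.  Returns 0 on success,
 * -EINVAL if the status reports an error, and -ETIMEDOUT if the
 * operation is still in progress after the last attempt. */
static int
poll_port_status(status_fn query, void *opaque, int max_polls,
                 long interval_ms)
{
    struct timespec delay = {
        .tv_sec = interval_ms / 1000,
        .tv_nsec = (interval_ms % 1000) * 1000000L,
    };
    int i;

    assert(query != NULL);

    for (i = 0; i < max_polls; i++) {
        switch (query(opaque)) {
        case STATUS_SUCCESS:
            return 0;
        case STATUS_ERROR:
            return -EINVAL;
        case STATUS_INPROGRESS:
            break;              /* keep trying */
        }
        nanosleep(&delay, NULL);
    }

    return -ETIMEDOUT;
}

/* Example callback: reports in-progress until the counter reaches zero;
 * a negative counter simulates an error status. */
static enum poll_status
countdown_status(void *opaque)
{
    int *remaining = opaque;

    if (*remaining < 0)
        return STATUS_ERROR;
    if (*remaining == 0)
        return STATUS_SUCCESS;
    (*remaining)--;
    return STATUS_INPROGRESS;
}
```

Calling `poll_port_status(query, opaque, 80, 125)` would correspond to the 10 sec budget at 1/8 sec intervals mentioned above.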
My patch finishes out the 802.1Qbh parts, which Stefan had mostly complete.
I've tested using the recent kernel updates for VF_PORT netlink msgs and
enic for Cisco's 10G Ethernet NIC. I tested many VMs, each with several
direct interfaces, each configured with a port-profile per the XML. VM-to-VM
and VM-to-external traffic work as expected. VM-to-VM on the same host (using
the same NIC) works the same as VM-to-VM where the VMs are on different hosts.
I'm able to change settings on the port-profile while the VM is running to
change the virtual port behaviour, for example adjusting a QoS setting like
rate limit. All VMs with interfaces using that port-profile immediately see
the effect of the change to the port-profile.
I don't have an SR-IOV device to test with, so the source dev is a non-SR-IOV
device, but most of the code paths include support for specifying the source
dev and VF index. We'll need to complete this by discovering the PF given the
VF linkdev. Once we have the PF, we'll also have the VF index. All this
information is available from sysfs.
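A minimal sketch of how the VF index might be recovered from sysfs follows. It assumes the usual Linux layout in which `<net>/<vfdev>/device` resolves to the VF's PCI device directory, that directory has a `physfn` link to the PF, and the PF directory holds `virtfn0`, `virtfn1`, ... links back to its VFs. The function name `find_vf_index`, the `sysfs_net` parameter (normally "/sys/class/net") and the `max_vfs` bound are hypothetical and not part of this patch.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return the VF index of network device vfdev, or -1 if vfdev is not an
 * SR-IOV VF (or cannot be resolved).  sysfs_net is the directory that
 * holds one entry per network interface. */
static int
find_vf_index(const char *sysfs_net, const char *vfdev, int max_vfs)
{
    char path[PATH_MAX];
    char vf_real[PATH_MAX];
    char cand[PATH_MAX];
    int i;

    assert(sysfs_net != NULL && vfdev != NULL);

    /* Canonical path of the VF's own PCI device directory. */
    snprintf(path, sizeof(path), "%s/%s/device", sysfs_net, vfdev);
    if (!realpath(path, vf_real))
        return -1;

    /* Walk the PF's virtfnN links and look for the one that points
     * back at our device; N is then the VF index. */
    for (i = 0; i < max_vfs; i++) {
        snprintf(path, sizeof(path), "%s/%s/device/physfn/virtfn%d",
                 sysfs_net, vfdev, i);
        if (!realpath(path, cand))
            continue;
        if (strcmp(cand, vf_real) == 0)
            return i;
    }

    return -1;
}
```

Under the same assumptions, the PF's interface name would be the entry found under the `physfn/net` directory; that step is left out here to keep the sketch short.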
Signed-off-by: Scott Feldman <scofeldm(a)cisco.com>
---
configure.ac | 16 +
src/qemu/qemu_conf.c | 2
src/qemu/qemu_driver.c | 4
src/util/macvtap.c | 778 +++++++++++++++++++++++++++++++++++++++++++++++-
src/util/macvtap.h | 1
5 files changed, 785 insertions(+), 16 deletions(-)
diff --git a/configure.ac b/configure.ac
index 36ba703..885b0ae 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2005,13 +2005,26 @@ if test "$with_macvtap" != "no" ; then
fi
AM_CONDITIONAL([WITH_MACVTAP], [test "$with_macvtap" = "yes"])
+AC_TRY_COMPILE([ #include <sys/socket.h>
+ #include <linux/rtnetlink.h> ],
+ [ int x = IFLA_PORT_MAX; ],
+ [ with_virtualport=yes ],
+ [ with_virtualport=no ])
+if test "$with_virtualport" = "yes"; then
+ val=1
+else
+ val=0
+fi
+AC_DEFINE_UNQUOTED([WITH_VIRTUALPORT], $val, [whether vsi vepa support is enabled])
+AM_CONDITIONAL([WITH_VIRTUALPORT], [test "$with_virtualport" = "yes"])
+
dnl netlink library
LIBNL_CFLAGS=""
LIBNL_LIBS=""
-if test "$with_macvtap" = "yes"; then
+if test "$with_macvtap" = "yes" || test "$with_virtualport" = "yes"; then
PKG_CHECK_MODULES([LIBNL], [libnl-1 >= $LIBNL_REQUIRED], [
], [
AC_MSG_ERROR([libnl >= $LIBNL_REQUIRED is required for macvtap support])
@@ -2084,6 +2097,7 @@ AC_MSG_NOTICE([ Network: $with_network])
AC_MSG_NOTICE([Libvirtd: $with_libvirtd])
AC_MSG_NOTICE([ netcf: $with_netcf])
AC_MSG_NOTICE([ macvtap: $with_macvtap])
+AC_MSG_NOTICE([virtport: $with_virtualport])
AC_MSG_NOTICE([])
AC_MSG_NOTICE([Storage Drivers])
AC_MSG_NOTICE([])
diff --git a/src/qemu/qemu_conf.c b/src/qemu/qemu_conf.c
index 111fa6e..95d4c1a 100644
--- a/src/qemu/qemu_conf.c
+++ b/src/qemu/qemu_conf.c
@@ -1505,7 +1505,7 @@ qemudPhysIfaceConnect(virConnectPtr conn,
if (err) {
close(rc);
rc = -1;
- delMacvtap(net->ifname,
+ delMacvtap(net->ifname, net->data.direct.linkdev,
&net->data.direct.virtPortProfile);
}
}
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index f02bf3b..f1a0d0e 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -3679,7 +3679,7 @@ static void qemudShutdownVMDaemon(struct qemud_driver *driver,
for (i = 0; i < def->nnets; i++) {
virDomainNetDefPtr net = def->nets[i];
if (net->type == VIR_DOMAIN_NET_TYPE_DIRECT)
- delMacvtap(net->ifname,
+ delMacvtap(net->ifname, net->data.direct.linkdev,
&net->data.direct.virtPortProfile);
}
#endif
@@ -8369,7 +8369,7 @@ qemudDomainDetachNetDevice(struct qemud_driver *driver,
#if WITH_MACVTAP
if (detach->type == VIR_DOMAIN_NET_TYPE_DIRECT)
- delMacvtap(detach->ifname,
+ delMacvtap(detach->ifname, detach->data.direct.linkdev,
&detach->data.direct.virtPortProfile);
#endif
diff --git a/src/util/macvtap.c b/src/util/macvtap.c
index 5cbd02b..d5a08d9 100644
--- a/src/util/macvtap.c
+++ b/src/util/macvtap.c
@@ -27,7 +27,7 @@
#include <config.h>
-#if WITH_MACVTAP
+#if WITH_MACVTAP || WITH_VIRTUALPORT
# include <stdio.h>
# include <errno.h>
@@ -41,6 +41,8 @@
# include <linux/rtnetlink.h>
# include <linux/if_tun.h>
+# include <netlink/msg.h>
+
# include "util.h"
# include "memory.h"
# include "logging.h"
@@ -48,6 +50,7 @@
# include "interface.h"
# include "conf/domain_conf.h"
# include "virterror_internal.h"
+# include "uuid.h"
# define VIR_FROM_THIS VIR_FROM_NET
@@ -58,15 +61,23 @@
# define MACVTAP_NAME_PREFIX "macvtap"
# define MACVTAP_NAME_PATTERN "macvtap%d"
+# define MICROSEC_PER_SEC (1000 * 1000)
+
static int associatePortProfileId(const char *macvtap_ifname,
+ const char *linkdev,
const virVirtualPortProfileDefPtr virtPort,
- int vf,
const unsigned char *vmuuid);
static int disassociatePortProfileId(const char *macvtap_ifname,
+ const char *linkdev,
const virVirtualPortProfileDefPtr virtPort);
+enum virVirtualPortOp {
+ ASSOCIATE = 0x1,
+ DISASSOCIATE = 0x2,
+};
+
static int nlOpen(void)
{
@@ -159,6 +170,156 @@ err_exit:
}
+# ifdef IFLA_VF_PORT_MAX
+
+/**
+ * nlCommWaitSuccess:
+ *
+ * @nlmsg: pointer to netlink message
+ * @nl_groups: the netlink multicast groups to send to
+ * @respbuf: pointer to pointer where response buffer will be allocated
+ * @respbuflen: pointer to integer holding the size of the response buffer
+ * on return of the function.
+ * @to_usecs: timeout in microseconds to wait for a success message
+ * to be returned
+ *
+ * Send the given message to the netlink multicast group and receive
+ * responses. Skip responses indicating an error and keep on receiving
+ * responses until a success response is returned.
+ * Returns 0 on success, -1 on error. In case of error, no response
+ * buffer will be returned.
+ */
+static int
+nlCommWaitSuccess(struct nlmsghdr *nlmsg, int nl_groups,
+ char **respbuf, int *respbuflen, long to_usecs)
+{
+ int rc = 0;
+ struct sockaddr_nl nladdr = {
+ .nl_family = AF_NETLINK,
+ .nl_pid = getpid(),
+ .nl_groups = nl_groups,
+ };
+ int rcvChunkSize = 1024; // expecting less than that
+ int rcvoffset = 0;
+ ssize_t nbytes;
+ int n;
+ struct timeval tv = {
+ .tv_sec = to_usecs / MICROSEC_PER_SEC,
+ .tv_usec = to_usecs % MICROSEC_PER_SEC,
+ };
+ fd_set rfds;
+ bool gotvalid = false;
+ int fd = nlOpen();
+ static uint32_t seq = 0x1234;
+ uint32_t myseq = seq++;
+ uint32_t mypid = getpid();
+
+ if (fd < 0)
+ return -1;
+
+ nlmsg->nlmsg_pid = mypid;
+ nlmsg->nlmsg_seq = myseq;
+ nlmsg->nlmsg_flags |= NLM_F_ACK;
+
+ nbytes = sendto(fd, (void *)nlmsg, nlmsg->nlmsg_len, 0,
+ (struct sockaddr *)&nladdr, sizeof(nladdr));
+ if (nbytes < 0) {
+ virReportSystemError(errno,
+ "%s", _("cannot send to netlink socket"));
+ rc = -1;
+ goto err_exit;
+ }
+
+ while (!gotvalid) {
+ rcvoffset = 0;
+ while (1) {
+ socklen_t addrlen = sizeof(nladdr);
+
+ if (VIR_REALLOC_N(*respbuf, rcvoffset+rcvChunkSize) < 0) {
+ virReportOOMError();
+ rc = -1;
+ goto err_exit;
+ }
+
+ FD_ZERO(&rfds);
+ FD_SET(fd, &rfds);
+
+ n = select(fd + 1, &rfds, NULL, NULL, &tv);
+ if (n == 0) {
+ rc = -1;
+ goto err_exit;
+ }
+
+ nbytes = recvfrom(fd, &((*respbuf)[rcvoffset]), rcvChunkSize, 0,
+ (struct sockaddr *)&nladdr, &addrlen);
+ if (nbytes < 0) {
+ if (errno == EAGAIN || errno == EINTR)
+ continue;
+ virReportSystemError(errno, "%s",
+ _("error receiving from netlink socket"));
+ rc = -1;
+ goto err_exit;
+ }
+ rcvoffset += nbytes;
+ break;
+ }
+ *respbuflen = rcvoffset;
+
+ /* check message for error */
+ if (*respbuflen > NLMSG_LENGTH(0) && *respbuf != NULL) {
+ struct nlmsghdr *resp = (struct nlmsghdr *)*respbuf;
+ struct nlmsgerr *err;
+
+ if (resp->nlmsg_pid != mypid ||
+ resp->nlmsg_seq != myseq)
+ continue;
+
+ /* skip reflected message */
+ if (resp->nlmsg_type & 0x10)
+ continue;
+
+ switch (resp->nlmsg_type) {
+ case NLMSG_ERROR:
+ err = (struct nlmsgerr *)NLMSG_DATA(resp);
+ if (resp->nlmsg_len >= NLMSG_LENGTH(sizeof(*err))) {
+ if (-err->error != EOPNOTSUPP) {
+ /* assuming error msg from daemon */
+ gotvalid = true;
+ break;
+ }
+ }
+ /* whatever this is, skip it */
+ VIR_FREE(*respbuf);
+ *respbuf = NULL;
+ *respbuflen = 0;
+ break;
+
+ case NLMSG_DONE:
+ gotvalid = true;
+ break;
+
+ default:
+ VIR_FREE(*respbuf);
+ *respbuf = NULL;
+ *respbuflen = 0;
+ break;
+ }
+ }
+ }
+
+err_exit:
+ if (rc == -1) {
+ VIR_FREE(*respbuf);
+ *respbuf = NULL;
+ *respbuflen = 0;
+ }
+
+ nlClose(fd);
+ return rc;
+}
+
+#endif
+
static struct rtattr *
rtattrCreate(char *buffer, int bufsize, int type,
const void *data, int datalen)
@@ -204,6 +365,8 @@ nlAppend(struct nlmsghdr *nlm, int totlen, const void *data, int datalen)
}
+# if WITH_MACVTAP
+
static int
link_add(const char *type,
const unsigned char *macaddress, int macaddrsize,
@@ -655,8 +818,8 @@ create_name:
}
if (associatePortProfileId(cr_ifname,
+ linkdev,
virtPortProfile,
- -1,
vmuuid) != 0) {
rc = -1;
goto link_del_exit;
@@ -689,6 +852,7 @@ create_name:
disassociate_exit:
disassociatePortProfileId(cr_ifname,
+ linkdev,
virtPortProfile);
link_del_exit:
@@ -701,6 +865,7 @@ link_del_exit:
/**
* delMacvtap:
* @ifname : The name of the macvtap interface
+ * @linkdev: The interface name of the NIC to connect to the external bridge
* @virtPortProfile: pointer to object holding the virtual port profile data
*
* Delete an interface given its name. Disassociate
@@ -709,24 +874,603 @@ link_del_exit:
*/
void
delMacvtap(const char *ifname,
+ const char *linkdev,
virVirtualPortProfileDefPtr virtPortProfile)
{
if (ifname) {
disassociatePortProfileId(ifname,
+ linkdev,
virtPortProfile);
link_del(ifname);
}
}
-#endif
+# endif
+
+
+# ifdef IFLA_PORT_MAX
+
+static struct nla_policy ifla_policy[IFLA_MAX + 1] =
+{
+ [IFLA_VF_PORTS] = { .type = NLA_NESTED },
+};
+
+static struct nla_policy ifla_vf_ports_policy[IFLA_VF_PORT_MAX + 1] =
+{
+ [IFLA_VF_PORT] = { .type = NLA_NESTED },
+};
+
+static struct nla_policy ifla_port_policy[IFLA_PORT_MAX + 1] =
+{
+ [IFLA_PORT_RESPONSE] = { .type = NLA_U16 },
+};
+
+
+static int
+link_dump(int ifindex, struct nlattr **tb, char **recvbuf)
+{
+ int rc = 0;
+ char nlmsgbuf[256] = { 0, };
+ struct nlmsghdr *nlm = (struct nlmsghdr *)nlmsgbuf, *resp;
+ struct nlmsgerr *err;
+ struct ifinfomsg i = {
+ .ifi_family = AF_UNSPEC,
+ .ifi_index = ifindex
+ };
+ int recvbuflen;
+
+ *recvbuf = NULL;
+
+ nlInit(nlm, NLM_F_REQUEST, RTM_GETLINK);
+
+ if (!nlAppend(nlm, sizeof(nlmsgbuf), &i, sizeof(i)))
+ goto buffer_too_small;
+
+ if (nlComm(nlm, recvbuf, &recvbuflen) < 0)
+ return -1;
+
+ if (recvbuflen < NLMSG_LENGTH(0) || *recvbuf == NULL)
+ goto malformed_resp;
+ resp = (struct nlmsghdr *)*recvbuf;
+
+ switch (resp->nlmsg_type) {
+ case NLMSG_ERROR:
+ err = (struct nlmsgerr *)NLMSG_DATA(resp);
+ if (resp->nlmsg_len < NLMSG_LENGTH(sizeof(*err)))
+ goto malformed_resp;
+
+ switch (-err->error) {
+ case 0:
+ break;
+
+ default:
+ virReportSystemError(-err->error,
+ _("error dumping %d interface"),
+ ifindex);
+ rc = -1;
+ }
+ break;
+
+ case GENL_ID_CTRL:
+ case NLMSG_DONE:
+ if (nlmsg_parse(resp, sizeof(struct ifinfomsg),
+ tb, IFLA_MAX, ifla_policy)) {
+ goto malformed_resp;
+ }
+ break;
+
+ default:
+ goto malformed_resp;
+ }
+
+ if (rc != 0)
+ VIR_FREE(*recvbuf);
+
+ return rc;
+
+malformed_resp:
+ macvtapError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("malformed netlink response message"));
+ VIR_FREE(*recvbuf);
+ return -1;
+
+buffer_too_small:
+ macvtapError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("internal buffer is too small"));
+ return -1;
+}
+
+
+static int
+getPortProfileStatus(struct nlattr **tb, int32_t vf, uint16_t *status)
+{
+ int rc = 1;
+ const char *msg = NULL;
+ struct nlattr *tb2[IFLA_VF_PORT_MAX + 1] = { NULL, },
+ *tb3[IFLA_PORT_MAX+1] = { NULL, };
+
+ if (vf == PORT_SELF_VF) {
+ if (tb[IFLA_PORT_SELF]) {
+ if (nla_parse_nested(tb3, IFLA_PORT_MAX, tb[IFLA_PORT_SELF],
+ ifla_port_policy)) {
+ msg = _("error parsing nested IFLA_PORT_SELF part");
+ goto err_exit;
+ }
+ }
+ } else {
+ if (tb[IFLA_VF_PORTS]) {
+ if (nla_parse_nested(tb2, IFLA_VF_PORT_MAX, tb[IFLA_VF_PORTS],
+ ifla_vf_ports_policy)) {
+ msg = _("error parsing nested IFLA_VF_PORTS part");
+ goto err_exit;
+ }
+ if (tb2[IFLA_VF_PORT]) {
+ if (nla_parse_nested(tb3, IFLA_PORT_MAX, tb2[IFLA_VF_PORT],
+ ifla_port_policy)) {
+ msg = _("error parsing nested IFLA_VF_PORT part");
+ goto err_exit;
+ }
+ }
+ }
+ }
+
+ if (tb3[IFLA_PORT_RESPONSE]) {
+ *status = *(uint16_t *)RTA_DATA(tb3[IFLA_PORT_RESPONSE]);
+ rc = 0;
+ } else {
+ msg = _("no IFLA_PORT_RESPONSE found in netlink message");
+ goto err_exit;
+ }
+
+err_exit:
+ if (msg)
+ macvtapError(VIR_ERR_INTERNAL_ERROR, "%s", msg);
+
+ return rc;
+}
+
+
+static int
+doPortProfileOpSetLink(bool multicast,
+ int ifindex,
+ const char *profileId,
+ struct ifla_port_vsi *portVsi,
+ const unsigned char *instanceId,
+ const unsigned char *hostUUID,
+ int32_t vf,
+ uint8_t op)
+{
+ int rc = 0;
+ char nlmsgbuf[256];
+ struct nlmsghdr *nlm = (struct nlmsghdr *)nlmsgbuf, *resp;
+ struct nlmsgerr *err;
+ char rtattbuf[64];
+ struct rtattr *rta, *vfports, *vfport;
+ struct ifinfomsg ifinfo = {
+ .ifi_family = AF_UNSPEC,
+ .ifi_index = ifindex,
+ };
+ char *recvbuf = NULL;
+ int recvbuflen = 0;
+
+ memset(&nlmsgbuf, 0, sizeof(nlmsgbuf));
+
+ nlInit(nlm, NLM_F_REQUEST, RTM_SETLINK);
+
+ if (!nlAppend(nlm, sizeof(nlmsgbuf), &ifinfo, sizeof(ifinfo)))
+ goto buffer_too_small;
+
+ if (vf == PORT_SELF_VF) {
+ rta = rtattrCreate(rtattbuf, sizeof(rtattbuf), IFLA_PORT_SELF, NULL, 0);
+ } else {
+ rta = rtattrCreate(rtattbuf, sizeof(rtattbuf), IFLA_VF_PORTS, NULL, 0);
+ if (!rta)
+ goto buffer_too_small;
+
+ if (!(vfports = nlAppend(nlm, sizeof(nlmsgbuf),
+ rtattbuf, rta->rta_len)))
+ goto buffer_too_small;
+
+ /* begin nesting vfports */
+ rta = rtattrCreate(rtattbuf, sizeof(rtattbuf), IFLA_VF_PORT, NULL, 0);
+ }
+
+ if (!rta)
+ goto buffer_too_small;
+
+ if (!(vfport = nlAppend(nlm, sizeof(nlmsgbuf), rtattbuf, rta->rta_len)))
+ goto buffer_too_small;
+
+ if (profileId) {
+ rta = rtattrCreate(rtattbuf, sizeof(rtattbuf), IFLA_PORT_PROFILE,
+ profileId, strlen(profileId) + 1);
+ if (!rta)
+ goto buffer_too_small;
+
+ if (!nlAppend(nlm, sizeof(nlmsgbuf), rtattbuf, rta->rta_len))
+ goto buffer_too_small;
+ }
+
+ if (portVsi) {
+ rta = rtattrCreate(rtattbuf, sizeof(rtattbuf), IFLA_PORT_VSI_TYPE,
+ portVsi, sizeof(*portVsi));
+ if (!rta)
+ goto buffer_too_small;
+
+ if (!nlAppend(nlm, sizeof(nlmsgbuf), rtattbuf, rta->rta_len))
+ goto buffer_too_small;
+ }
+
+ if (instanceId) {
+ rta = rtattrCreate(rtattbuf, sizeof(rtattbuf), IFLA_PORT_INSTANCE_UUID,
+ instanceId, VIR_UUID_BUFLEN);
+ if (!rta)
+ goto buffer_too_small;
+
+ if (!nlAppend(nlm, sizeof(nlmsgbuf), rtattbuf, rta->rta_len))
+ goto buffer_too_small;
+ }
+
+ if (hostUUID) {
+ rta = rtattrCreate(rtattbuf, sizeof(rtattbuf), IFLA_PORT_HOST_UUID,
+ hostUUID, VIR_UUID_BUFLEN);
+ if (!rta)
+ goto buffer_too_small;
+
+ if (!nlAppend(nlm, sizeof(nlmsgbuf), rtattbuf, rta->rta_len))
+ goto buffer_too_small;
+ }
+
+ if (vf != PORT_SELF_VF) {
+ rta = rtattrCreate(rtattbuf, sizeof(rtattbuf), IFLA_PORT_VF,
+ &vf, sizeof(vf));
+ if (!rta)
+ goto buffer_too_small;
+
+ if (!nlAppend(nlm, sizeof(nlmsgbuf), rtattbuf, rta->rta_len))
+ goto buffer_too_small;
+ }
+
+ rta = rtattrCreate(rtattbuf, sizeof(rtattbuf), IFLA_PORT_REQUEST,
+ &op, sizeof(op));
+ if (!rta)
+ goto buffer_too_small;
+
+ if (!nlAppend(nlm, sizeof(nlmsgbuf), rtattbuf, rta->rta_len))
+ goto buffer_too_small;
+
+ /* end nesting of vport */
+ vfport->rta_len = (char *)nlm + nlm->nlmsg_len - (char *)vfport;
+
+ if (vf != PORT_SELF_VF) {
+ /* end nesting of vfports */
+ vfports->rta_len = (char *)nlm + nlm->nlmsg_len - (char *)vfports;
+ }
+
+ if (!multicast) {
+ if (nlComm(nlm, &recvbuf, &recvbuflen) < 0)
+ return -1;
+ } else {
+ if (nlCommWaitSuccess(nlm, RTMGRP_LINK, &recvbuf, &recvbuflen,
+ 5 * MICROSEC_PER_SEC) < 0)
+ return -1;
+ }
+
+ if (recvbuflen < NLMSG_LENGTH(0) || recvbuf == NULL)
+ goto malformed_resp;
+
+ resp = (struct nlmsghdr *)recvbuf;
+
+ switch (resp->nlmsg_type) {
+ case NLMSG_ERROR:
+ err = (struct nlmsgerr *)NLMSG_DATA(resp);
+ if (resp->nlmsg_len < NLMSG_LENGTH(sizeof(*err)))
+ goto malformed_resp;
+
+ switch (-err->error) {
+ case 0:
+ break;
+
+ default:
+ virReportSystemError(-err->error,
+ _("error during virtual port configuration of ifindex %d"),
+ ifindex);
+ rc = -1;
+ }
+ break;
+
+ case NLMSG_DONE:
+ break;
+
+ default:
+ goto malformed_resp;
+ }
+
+ VIR_FREE(recvbuf);
+
+ return rc;
+
+malformed_resp:
+ macvtapError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("malformed netlink response message"));
+ VIR_FREE(recvbuf);
+ return -1;
+
+buffer_too_small:
+ macvtapError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("internal buffer is too small"));
+ return -1;
+}
+
+
+static int
+doPortProfileOpCommon(bool multicast,
+ int ifindex,
+ const char *profileId,
+ struct ifla_port_vsi *portVsi,
+ const unsigned char *instanceId,
+ const unsigned char *hostUUID,
+ int32_t vf,
+ uint8_t op)
+{
+ int rc;
+ char *recvbuf = NULL;
+ struct nlattr *tb[IFLA_MAX + 1];
+ int repeats = 80;
+ uint16_t status = 0;
+
+ rc = doPortProfileOpSetLink(multicast,
+ ifindex,
+ profileId,
+ portVsi,
+ instanceId,
+ hostUUID,
+ vf,
+ op);
+
+ if (rc != 0) {
+ macvtapError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("sending of PortProfileRequest failed"));
+ return rc;
+ }
+
+ if (!multicast) {
+ while (--repeats) {
+ rc = link_dump(ifindex, tb, &recvbuf);
+ if (rc)
+ goto err_exit;
+ rc = getPortProfileStatus(tb, vf, &status);
+ if (rc == 0) {
+ if (status == PORT_PROFILE_RESPONSE_SUCCESS ||
+ status == PORT_VDP_RESPONSE_SUCCESS) {
+ break;
+ } else if (status == PORT_PROFILE_RESPONSE_INPROGRESS) {
+ // keep trying...
+ } else {
+ virReportSystemError(EINVAL,
+ _("error %d during port-profile setlink on ifindex %d"),
+ status, ifindex);
+ rc = 1;
+ break;
+ }
+ }
+ usleep(125000);
+
+ VIR_FREE(recvbuf);
+ }
+ }
+
+ if (status == PORT_PROFILE_RESPONSE_INPROGRESS) {
+ macvtapError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("port-profile setlink timed out"));
+ rc = -ETIMEDOUT;
+ }
+
+err_exit:
+ VIR_FREE(recvbuf);
+
+ return rc;
+}
+
+# endif /* IFLA_PORT_MAX */
+
+static int
+doPortProfileOp8021Qbg(const char *ifname,
+ const virVirtualPortProfileDefPtr virtPort,
+ enum virVirtualPortOp virtPortOp)
+{
+ int rc;
+
+# ifndef IFLA_VF_PORT_MAX
+
+ (void)ifname;
+ (void)virtPort;
+ (void)virtPortOp;
+ macvtapError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("Kernel VF Port support was missing at compile time."));
+ rc = 1;
+
+# else /* IFLA_VF_PORT_MAX */
+
+ int op = PORT_REQUEST_ASSOCIATE;
+ struct ifla_port_vsi portVsi = {
+ .vsi_mgr_id = virtPort->u.virtPort8021Qbg.managerID,
+ .vsi_type_version = virtPort->u.virtPort8021Qbg.typeIDVersion,
+ };
+ bool multicast = true;
+ int ifindex;
+
+ if (ifaceGetIndex(true, ifname, &ifindex) != 0) {
+ rc = 1;
+ goto err_exit;
+ }
+
+ portVsi.vsi_type_id[2] = virtPort->u.virtPort8021Qbg.typeID >> 16;
+ portVsi.vsi_type_id[1] = virtPort->u.virtPort8021Qbg.typeID >> 8;
+ portVsi.vsi_type_id[0] = virtPort->u.virtPort8021Qbg.typeID;
+
+ switch (virtPortOp) {
+ case ASSOCIATE:
+ op = PORT_REQUEST_ASSOCIATE;
+ break;
+ case DISASSOCIATE:
+ op = PORT_REQUEST_DISASSOCIATE;
+ break;
+ default:
+ macvtapError(VIR_ERR_INTERNAL_ERROR,
+ _("operation type %d not supported"), virtPortOp);
+ rc = 1;
+ goto err_exit;
+ }
+
+ rc = doPortProfileOpCommon(multicast, ifindex,
+ NULL,
+ &portVsi,
+ virtPort->u.virtPort8021Qbg.instanceID,
+ NULL,
+ PORT_SELF_VF,
+ op);
+
+err_exit:
+
+# endif /* IFLA_VF_PORT_MAX */
+
+ return rc;
+}
+
+
+static int
+getPhysfn(const char *linkdev,
+ int32_t *vf,
+ char **physfndev)
+{
+ int rc = 0;
+
+# ifndef IFLA_VF_PORT_MAX
+
+ (void)linkdev;
+ (void)vf;
+ (void)physfndev;
+ macvtapError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("Kernel VF Port support was missing at compile time."));
+ rc = 1;
+
+# else /* IFLA_VF_PORT_MAX */
+
+ bool virtfn = false;
+
+ if (virtfn) {
+
+ // XXX: if linkdev is SR-IOV VF, then set vf = VF index
+ // XXX: and set linkdev = PF device
+ // XXX: need to use get_physical_function_linux() or
+ // XXX: something like that to get PF
+ // XXX: device and figure out VF index
+
+ rc = 1;
+
+ } else {
+
+ /* Not SR-IOV VF: physfndev is linkdev and VF index
+ * refers to linkdev self
+ */
+
+ *vf = PORT_SELF_VF;
+ *physfndev = (char *)linkdev;
+ }
+
+# endif /* IFLA_VF_PORT_MAX */
+
+ return rc;
+}
+
+
+static int
+doPortProfileOp8021Qbh(const char *ifname,
+ const virVirtualPortProfileDefPtr virtPort,
+ const unsigned char *vm_uuid,
+ enum virVirtualPortOp virtPortOp)
+{
+ int rc;
+
+# ifndef IFLA_VF_PORT_MAX
+
+ (void)ifname;
+ (void)virtPort;
+ (void)vm_uuid;
+ (void)virtPortOp;
+ macvtapError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("Kernel VF Port support was missing at compile time."));
+ rc = 1;
+
+# else /* IFLA_VF_PORT_MAX */
+
+ char *physfndev;
+ unsigned char hostuuid[VIR_UUID_BUFLEN];
+ int32_t vf;
+ int op = PORT_REQUEST_ASSOCIATE;
+ bool multicast = false;
+ int ifindex;
+
+ rc = virGetHostUUID(hostuuid);
+ if (rc)
+ goto err_exit;
+
+ rc = getPhysfn(ifname, &vf, &physfndev);
+ if (rc)
+ goto err_exit;
+
+ if (ifaceGetIndex(true, physfndev, &ifindex) != 0) {
+ rc = 1;
+ goto err_exit;
+ }
+
+ switch (virtPortOp) {
+ case ASSOCIATE:
+ op = PORT_REQUEST_ASSOCIATE;
+ break;
+ case DISASSOCIATE:
+ op = PORT_REQUEST_DISASSOCIATE;
+ break;
+ default:
+ macvtapError(VIR_ERR_INTERNAL_ERROR,
+ _("operation type %d not supported"), virtPortOp);
+ rc = 1;
+ goto err_exit;
+ }
+
+ rc = doPortProfileOpCommon(multicast, ifindex,
+ virtPort->u.virtPort8021Qbh.profileID,
+ NULL,
+ vm_uuid,
+ hostuuid,
+ vf,
+ op);
+
+ switch (virtPortOp) {
+ case ASSOCIATE:
+ ifaceUp(ifname);
+ break;
+ case DISASSOCIATE:
+ ifaceDown(ifname);
+ break;
+ }
+
+err_exit:
+
+# endif /* IFLA_VF_PORT_MAX */
+
+ return rc;
+}
/**
* associatePortProfile
*
* @macvtap_ifname: The name of the macvtap device
+ * @linkdev: The link device in case of macvtap
* @virtPort: pointer to the object holding port profile parameters
- * @vf: virtual function number, -1 if to be ignored
* @vmuuid : the UUID of the virtual machine
*
* Associate a port on a swtich with a profile. This function
@@ -740,15 +1484,14 @@ delMacvtap(const char *ifname,
*/
static int
associatePortProfileId(const char *macvtap_ifname,
+ const char *linkdev,
const virVirtualPortProfileDefPtr virtPort,
- int vf,
const unsigned char *vmuuid)
{
int rc = 0;
+
VIR_DEBUG("Associating port profile '%p' on link device '%s'",
virtPort, macvtap_ifname);
- (void)vf;
- (void)vmuuid;
switch (virtPort->virtPortType) {
case VIR_VIRTUALPORT_NONE:
@@ -756,11 +1499,14 @@ associatePortProfileId(const char *macvtap_ifname,
break;
case VIR_VIRTUALPORT_8021QBG:
-
+ rc = doPortProfileOp8021Qbg(macvtap_ifname, virtPort,
+ ASSOCIATE);
break;
case VIR_VIRTUALPORT_8021QBH:
-
+ rc = doPortProfileOp8021Qbh(linkdev, virtPort,
+ vmuuid,
+ ASSOCIATE);
break;
}
@@ -772,6 +1518,7 @@ associatePortProfileId(const char *macvtap_ifname,
* disassociatePortProfile
*
* @macvtap_ifname: The name of the macvtap device
+ * @linkdev: The link device in case of macvtap
* @virtPort: point to object holding port profile parameters
*
* Returns 0 in case of success, != 0 otherwise with error
@@ -779,9 +1526,11 @@ associatePortProfileId(const char *macvtap_ifname,
*/
static int
disassociatePortProfileId(const char *macvtap_ifname,
+ const char *linkdev,
const virVirtualPortProfileDefPtr virtPort)
{
int rc = 0;
+
VIR_DEBUG("Disassociating port profile id '%p' on link device '%s' ",
virtPort, macvtap_ifname);
@@ -791,13 +1540,18 @@ disassociatePortProfileId(const char *macvtap_ifname,
break;
case VIR_VIRTUALPORT_8021QBG:
-
+ rc = doPortProfileOp8021Qbg(macvtap_ifname, virtPort,
+ DISASSOCIATE);
break;
case VIR_VIRTUALPORT_8021QBH:
-
+ rc = doPortProfileOp8021Qbh(linkdev, virtPort,
+ NULL,
+ DISASSOCIATE);
break;
}
return rc;
}
+
+#endif
diff --git a/src/util/macvtap.h b/src/util/macvtap.h
index ae11c5c..35db31c 100644
--- a/src/util/macvtap.h
+++ b/src/util/macvtap.h
@@ -72,6 +72,7 @@ int openMacvtapTap(const char *ifname,
char **res_ifname);
void delMacvtap(const char *ifname,
+ const char *linkdev,
virVirtualPortProfileDefPtr virtPortProfile);
# endif /* WITH_MACVTAP */
[libvirt] [PATCH 0/2] Device assignment hotplug fixes
by Alex Williamson
These add a couple fixes for device assignment hotplug
---
Alex Williamson (2):
qemu: Release bus address on PCI host device remove
qemu: avoid corrupting guest info struct on host device PCI hot add
src/qemu/qemu_driver.c | 23 +++++++++++++++--------
1 files changed, 15 insertions(+), 8 deletions(-)
[libvirt] [PATCH] [DOCS] nwfilter: documentation
by Stefan Berger
This patch adds documentation of the nwfilter subsystem of libvirt to
the existing (web) docs. I am attaching a PDF in case you don't want to
read the plain html sources.
Signed-off-by: Stefan Berger <stefanb(a)linux.vnet.ibm.com>
---
docs/formatnwfilter.html.in | 1407 ++++++++++++++++++++++++++++++++++++++++++++
docs/sitemap.html.in | 6
2 files changed, 1413 insertions(+)
Index: libvirt-acl/docs/sitemap.html.in
===================================================================
--- libvirt-acl.orig/docs/sitemap.html.in
+++ libvirt-acl/docs/sitemap.html.in
@@ -97,6 +97,12 @@
<li>
<a href="formatnetwork.html">Networks</a>
<span>The virtual network XML format</span>
+ <ul>
+ <li>
+ <a href="formatnwfilter.html">Network Filtering</a>
+ <span>Network filter XML format</span>
+ </li>
+ </ul>
</li>
<li>
<a href="formatstorage.html">Storage</a>
Index: libvirt-acl/docs/formatnwfilter.html.in
===================================================================
--- /dev/null
+++ libvirt-acl/docs/formatnwfilter.html.in
@@ -0,0 +1,1407 @@
+<html>
+ <body>
+ <h1>Network Filters</h1>
+
+ <ul id="toc">
+ </ul>
+
+ <p>
+ This page provides an introduction to libvirt's network filters,
+ their goals, concepts and XML format.
+ </p>
+
+ <h2><a name="goals">Goals and background</a></h2>
+
+ <p>
+ The goal of the network filtering XML is to enable administrators
+ of a virtualized system to configure and enforce network traffic
+ filtering rules on virtual
+ machines and manage the parameters of network traffic that
+ virtual machines
+ are allowed to send or receive.
+ The network traffic filtering rules are
+ applied on the host when a virtual machine is started. Because the
+ filtering rules
+ cannot be circumvented from within
+ the virtual machine, they are mandatory from the point of
+ view of a virtual machine user.
+ <br><br>
+ The network filter subsystem allows each virtual machine's network
+ traffic filtering rules to be configured individually on a per
+ interface basis. The rules are
+ applied on the host when the virtual machine is started and can be modified
+ while the virtual machine is running. The latter can be achieved by
+ modifying the XML description of a network filter.
+ <br><br>
+ Multiple virtual machines can make use of the same generic network filter.
+ When such a filter is modified, the network traffic filtering rules
+ of all running virtual machines that reference this filter are updated.
+ <br><br>
+ Network filtering support is available <span class="since">since 0.8.1
+ (Qemu, KVM)</span>.
+ </p>
+
+ <h2><a name="nwfconcpts">Concepts</a></h2>
+ <p>
+ The network traffic filtering subsystem enables configuration
+ of network traffic filtering rules on individual network
+ interfaces that are configured for certain types of
+ network configurations. Supported network types are
+ </p>
+ <ul>
+ <li><code>network</code></li>
+ <li><code>ethernet</code> -- must be used in bridging mode</li>
+ <li><code>bridge</code></li>
+ <li><code>direct</code> -- only protocols mac, arp, ip and ipv6
+ can be filtered</li>
+ </ul>
+ <p>
+ The interface XML is used to reference a top-level filter. In the
+ following example, the interface description references
+ the filter <code>clean-traffic</code>.
+ </p>
+<pre>
+ ...
+ <devices>
+ <interface type='bridge'>
+ <mac address='00:16:3e:5d:c7:9e'/>
+ <filterref filter='clean-traffic'/>
+ </interface>
+ </devices>
+ ...</pre>
+
+ <p>
+ Network filters are written in XML and may contain references
+ to other filters, rules for traffic filtering, or
+ a combination of both. The filter
+ <code>clean-traffic</code> referenced above, for example, only
+ contains references to
+ other filters and no actual filtering rules. Since references to
+ other filters can be used, a <i>tree</i> of filters can be built.
+ The <code>clean-traffic</code> filter can be viewed using the
+ command <code>virsh nwfilter-dumpxml clean-traffic</code>.
+ <br><br>
+ As previously mentioned, a single network filter can be referenced
+ by multiple virtual machines. Since interfaces will typically
+ have individual parameters associated with their respective traffic
+ filtering rules, the rules described in a filter XML can
+ be parameterized with variables. In this case, the variable name
+ is used in the filter XML and the name and value are provided at the
+ place where the filter is referenced. In the
+ following example, the interface description has been extended with
+ the parameter <code>IP</code> and a dotted IP address as value.
+ </p>
+<pre>
+ ...
+ <devices>
+ <interface type='bridge'>
+ <mac address='00:16:3e:5d:c7:9e'/>
+ <filterref filter='clean-traffic'>
+ <parameter name='IP' value='10.0.0.1'/>
+ </filterref>
+ </interface>
+ </devices>
+ ...</pre>
+
+ <p>
+ In this particular example, the <code>clean-traffic</code> network
+ traffic filter will be instantiated with the IP address parameter
+ 10.0.0.1 and enforce that the traffic from this interface will
+ always be using 10.0.0.1 as the source IP address, which is
+ one of the purposes of this particular filter.
+ <br><br>
+ </p>
+
+ <h3><a name="nwfconcptsvars">Usage of variables in filters</a></h3>
+ <p>
+
+ Two variable names have so far been reserved for usage by the
+ network traffic filtering subsystem: <code>MAC</code> and
+ <code>IP</code>.
+ <br><br>
+ <code>MAC</code> is the MAC address of the
+ network interface. A filtering rule that references this variable
+ will automatically be instantiated with the MAC address of the
+ interface. This works without the user having to explicitly provide
+ the MAC parameter. Even though it is possible to specify the MAC
+ parameter similar to the IP parameter above, it is discouraged
+ since libvirt knows what MAC address an interface will be using.
+ <br><br>
+ The parameter <code>IP</code> represents the IP address
+ that the operating system inside the virtual machine is expected
+ to use on the given interface. The <code>IP</code> parameter
+ is special in so far as the libvirt daemon will try to determine
+ the IP address (and thus the IP parameter's value) that is being
+ used on an interface if the parameter
+ is not explicitly provided but referenced.
+ For current limitations on IP address detection, and for what to
+ expect when using this feature, consult the
+ <a href="#nwflimits">section on limitations</a>.
+ <br><br>
+ The following is the XML description of the network filter
+ <code>no-arp-spoofing</code>. It serves as an example for
+ a network filter XML referencing the <code>MAC</code> and
+ <code>IP</code> parameters. This particular filter is referenced by the
+ <code>clean-traffic</code> filter.
+ </p>
+<pre>
+<filter name='no-arp-spoofing' chain='arp'>
+ <uuid>f88f1932-debf-4aa1-9fbe-f10d3aa4bc95</uuid>
+ <rule action='drop' direction='out' priority='300'>
+ <mac match='no' srcmacaddr='$MAC'/>
+ </rule>
+ <rule action='drop' direction='out' priority='350'>
+ <arp match='no' arpsrcmacaddr='$MAC'/>
+ </rule>
+ <rule action='drop' direction='out' priority='400'>
+ <arp match='no' arpsrcipaddr='$IP'/>
+ </rule>
+ <rule action='drop' direction='in' priority='450'>
+ <arp opcode='Reply'/>
+ <arp match='no' arpdstmacaddr='$MAC'/>
+ </rule>
+ <rule action='drop' direction='in' priority='500'>
+ <arp match='no' arpdstipaddr='$IP'/>
+ </rule>
+ <rule action='accept' direction='inout' priority='600'>
+ <arp opcode='Request'/>
+ </rule>
+ <rule action='accept' direction='inout' priority='650'>
+ <arp opcode='Reply'/>
+ </rule>
+ <rule action='drop' direction='inout' priority='1000'/>
+</filter>
+</pre>
+
+ <p>
+ Note that referenced variables are always prefixed with the
+ $ (dollar) sign. The format of the value of a variable
+ must be of the type expected by the filter attribute in the
+ XML. In the above example, the <code>IP</code> parameter
+ must hold an IP address in dotted decimal format.
+ Failure to provide the correct
+ value type means that the filter cannot be instantiated,
+ which prevents a virtual machine from starting or an
+ interface from being attached when hotplugging is used. The types
+ that are expected for each XML attribute are shown
+ below.
+ </p>
+
+ <h2><a name="nwfelems">Element and attribute overview</a></h2>
+
+ <p>
+ The root element required for all network filters is
+ named <code>filter</code> with two possible attributes. The
+ <code>name</code> attribute provides a unique name of the
+ given filter. The <code>chain</code> attribute is optional but
+ allows certain filters to be better organized for more efficient
+ processing by the firewall subsystem of the underlying host.
+ Currently the system only supports the chains <code>root</code>,
+ <code>ipv4</code>, <code>ipv6</code>, <code>arp</code> and <code>rarp</code>.
+ </p>
+
+ <h3><a name="nwfelemsRefs">References to other filters</a></h3>
+ <p>
+ Any filter may hold references to other filters. Individual
+ filters may be referenced multiple times in a filter tree but
+ references between filters must not introduce loops; the references
+ must form a directed acyclic graph.
+ <br><br>
+ The following shows the XML of the <code>clean-traffic</code>
+ network filter referencing several other filters.
+ </p>
+<pre>
+<filter name='clean-traffic'>
+ <uuid>6ef53069-ba34-94a0-d33d-17751b9b8cb1</uuid>
+ <filterref filter='no-mac-spoofing'/>
+ <filterref filter='no-ip-spoofing'/>
+ <filterref filter='allow-incoming-ipv4'/>
+ <filterref filter='no-arp-spoofing'/>
+ <filterref filter='no-other-l2-traffic'/>
+ <filterref filter='qemu-announce-self'/>
+</filter>
+</pre>
+
+ <p>
+ To reference another filter, the XML node <code>filterref</code>
+ needs to be provided inside a <code>filter</code> node. This
+ node must have the attribute <code>filter</code> whose value contains
+ the name of the filter to be referenced.
+ <br><br>
+ New network filters can be defined at any time and
+ may contain references to network filters that are
+ not yet known to libvirt. However, once a virtual machine
+ is started or a network interface
+ referencing a filter is to be hotplugged, all network filters
+ in the filter tree must be available. Otherwise the virtual
+ machine will not start or the network interface cannot be
+ attached.
+ </p>
+
+ <h3><a name="nwfelemsRules">Filter rules</a></h3>
+ <p>
+ The following XML shows a simple example of a network
+ traffic filter implementing a rule to drop traffic if
+ the IP address (provided through the value of the
+ variable IP) in an outgoing IP packet is not the expected
+ one, thus preventing IP address spoofing by the VM.
+ </p>
+<pre>
+<filter name='no-ip-spoofing' chain='ipv4'>
+ <uuid>fce8ae33-e69e-83bf-262e-30786c1f8072</uuid>
+ <rule action='drop' direction='out' priority='500'>
+ <ip match='no' srcipaddr='$IP'/>
+ </rule>
+</filter>
+</pre>
+
+ <p>
+ A traffic filtering rule starts with the <code>rule</code>
+ node. This node may contain up to three attributes:
+ </p>
+ <ul>
+ <li>
+ action -- mandatory; must be either <code>drop</code> or <code>accept</code>,
+ depending on whether a packet matching the filtering rule is to be
+ dropped or accepted
+ </li>
+ <li>
+ direction -- mandatory; must be <code>in</code>, <code>out</code> or
+ <code>inout</code>, depending on whether the rule applies to incoming,
+ outgoing or both incoming and outgoing traffic
+ </li>
+ <li>
+ priority -- optional; the priority of the rule controls the order in
+ which the rule will be instantiated relative to other rules.
+ Rules with a lower value will be instantiated and therefore evaluated
+ before rules with a higher value.
+ Valid values are in the range of 0 to 1000. If this attribute is not
+ provided, the value 500 will automatically be assigned.
+ </li>
+ </ul>
+ <p>
+ The above example indicates that the traffic of type <code>ip</code>
+ will be associated with the chain 'ipv4' and the rule will have
+ priority 500. If, for example, another filter is referenced whose
+ traffic of type <code>ip</code> is also associated with the chain
+ 'ipv4', then that filter's rules will be ordered relative to the priority
+ 500 of the shown rule.
+ <br><br>
+ A rule filters on a single type of traffic. The
+ above example shows that traffic of type <code>ip</code> is to be
+ filtered.
+ </p>
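+
+ <p>
+ As an illustrative sketch of how priorities order rules, the
+ following hypothetical filter accepts outgoing UDP traffic to
+ port 53 and drops all other outgoing UDP traffic. The filter name
+ <code>example-dns-only-udp</code> and the port number are
+ assumptions made for this example; this is not a filter shipped
+ with libvirt. The first rule has priority 400 and is therefore
+ evaluated before the second rule, which is assigned the default
+ priority 500. The attributes used inside the <code>ip</code> node
+ are listed in the section on supported protocols.
+ </p>
+<pre>
+<filter name='example-dns-only-udp' chain='ipv4'>
+  <!-- evaluated first: 400 is lower than the default of 500 -->
+  <rule action='accept' direction='out' priority='400'>
+    <ip protocol='udp' dstportstart='53' dstportend='53'/>
+  </rule>
+  <!-- no priority given, so 500 is assigned automatically -->
+  <rule action='drop' direction='out'>
+    <ip protocol='udp'/>
+  </rule>
+</filter>
+</pre>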
+
+ <h4><a name="nwfelemsRulesProto">Supported protocols</a></h4>
+ <p>
+ The following sections enumerate the list of protocols that
+ are supported by the network filtering subsystem. The
+ type of traffic a rule is supposed to filter on is provided
+ in the <code>rule</code> node as a nested node. Depending
+ on the traffic type a rule is filtering, the attributes are
+ different. The above example showed the single
+ attribute <code>srcipaddr</code> that is valid inside the
+ <code>ip</code> traffic filtering node. The following sections
+ show what attributes are valid and what type of data they are
+ expecting. The following datatypes are available:
+ </p>
+ <ul>
+ <li>UINT8: 8-bit integer; range 0-255</li>
+ <li>UINT16: 16-bit integer; range 0-65535</li>
+ <li>MAC_ADDR: MAC address in colon-separated hexadecimal format, e.g., 00:11:22:33:44:55</li>
+ <li>MAC_MASK: MAC address mask in MAC address format, e.g., FF:FF:FF:FC:00:00</li>
+ <li>IP_ADDR: IP address in dotted decimal format, e.g., 10.1.2.3</li>
+ <li>IP_MASK: IP address mask in either dotted decimal format (255.255.248.0) or CIDR mask (0-32)</li>
+ <li>IPV6_ADDR: IPv6 address in colon-separated hexadecimal format, e.g., FFFF::1</li>
+ <li>IPV6_MASK: IPv6 mask in IPv6 address format (FFFF:FFFF:FC00::) or CIDR mask (0-128)</li>
+ <li>STRING: A string</li>
+ </ul>
+ <p>
+ <br><br>
+ Every attribute except for those of type IP_MASK or IPV6_MASK can
+ be negated using the <code>match</code>
+ attribute with value <code>no</code>. Multiple negated attributes
+ may be grouped together. The following
+ XML fragment shows such an example using abstract attributes.
+ </p>
+<pre>
+[...]
+ <rule action='drop' direction='in'>
+ <protocol match='no' attribute1='value1' attribute2='value2'/>
+ <protocol attribute3='value3'/>
+ </rule>
+[...]
+</pre>
+ <p>
+ Rules perform a logical AND evaluation on all values of the given
+ protocol attributes. Thus, if a single attribute's value does not match
+ the one given in the rule, the whole rule will be skipped during
+ evaluation. Therefore, in the above example incoming traffic
+ will only be dropped if
+ the protocol property attribute1 does not match value1 AND
+ the protocol property attribute2 does not match value2 AND
+ the protocol property attribute3 matches value3.
+ <br><br>
+ </p>
+
+
+ <h5><a name="nwfelemsRulesProtoMAC">MAC (Ethernet)</a></h5>
+ <p>
+ Protocol ID: <code>mac</code>
+ <br>
+ Note: Rules of this type should go into the <code>root</code> chain.
+ </p>
+ <table class="top_table">
+ <tr>
+ <th> Attribute </th>
+ <th> Datatype </th>
+ <th> Semantics </th>
+ </tr>
+ <tr>
+ <td>srcmacaddr</td>
+ <td>MAC_ADDR</td>
+ <td>MAC address of sender</td>
+ </tr>
+ <tr>
+ <td>srcmacmask</td>
+ <td>MAC_MASK</td>
+ <td>Mask applied to MAC address of sender</td>
+ </tr>
+ <tr>
+ <td>dstmacaddr</td>
+ <td>MAC_ADDR</td>
+ <td>MAC address of destination</td>
+ </tr>
+ <tr>
+ <td>dstmacmask</td>
+ <td>MAC_MASK</td>
+ <td>Mask applied to MAC address of destination</td>
+ </tr>
+ <tr>
+ <td>protocolid</td>
+ <td>UINT16 (0x600-0xffff), STRING</td>
+ <td>Layer 3 protocol ID</td>
+ </tr>
+ </table>
+ <p>
+ Valid strings for <code>protocolid</code> are: arp, rarp, ipv4, ipv6
+ <br><br>
+ Example: <code><mac match='no' srcmacaddr='$MAC'/></code>
+ <br><br>
+ </p>
+
+ <h5><a name="nwfelemsRulesProtoARP">ARP/RARP</a></h5>
+ <p>
+ Protocol ID: <code>arp</code> or <code>rarp</code>
+ <br>
+ Note: Rules of this type should either go into the
+ <code>root</code> or <code>arp/rarp</code> chain.
+ </p>
+ <table class="top_table">
+ <tr>
+ <th> Attribute </th>
+ <th> Datatype </th>
+ <th> Semantics </th>
+ </tr>
+ <tr>
+ <td>srcmacaddr</td>
+ <td>MAC_ADDR</td>
+ <td>MAC address of sender</td>
+ </tr>
+ <tr>
+ <td>srcmacmask</td>
+ <td>MAC_MASK</td>
+ <td>Mask applied to MAC address of sender</td>
+ </tr>
+ <tr>
+ <td>dstmacaddr</td>
+ <td>MAC_ADDR</td>
+ <td>MAC address of destination</td>
+ </tr>
+ <tr>
+ <td>dstmacmask</td>
+ <td>MAC_MASK</td>
+ <td>Mask applied to MAC address of destination</td>
+ </tr>
+ <tr>
+ <td>hwtype</td>
+ <td>UINT16</td>
+ <td>Hardware type</td>
+ </tr>
+ <tr>
+ <td>protocoltype</td>
+ <td>UINT16</td>
+ <td>Protocol type</td>
+ </tr>
+ <tr>
+ <td>opcode</td>
+ <td>UINT16, STRING</td>
+ <td>Opcode</td>
+ </tr>
+ <tr>
+ <td>arpsrcmacaddr</td>
+ <td>MAC_ADDR</td>
+ <td>Source MAC address in ARP/RARP packet</td>
+ </tr>
+ <tr>
+ <td>arpdstmacaddr</td>
+ <td>MAC_ADDR</td>
+ <td>Destination MAC address in ARP/RARP packet</td>
+ </tr>
+ <tr>
+ <td>arpsrcipaddr</td>
+ <td>IP_ADDR</td>
+ <td>Source IP address in ARP/RARP packet</td>
+ </tr>
+ <tr>
+ <td>arpdstipaddr</td>
+ <td>IP_ADDR</td>
+ <td>Destination IP address in ARP/RARP packet</td>
+ </tr>
+ </table>
+ <p>
+ Valid strings for the <code>opcode</code> attribute are:
+ Request, Reply, Request_Reverse, Reply_Reverse, DRARP_Request,
+ DRARP_Reply, DRARP_Error, InARP_Request, ARP_NAK
+ <br><br>
+ </p>
+
+ <h5><a name="nwfelemsRulesProtoIP">IPv4</a></h5>
+ <p>
+ Protocol ID: <code>ip</code>
+ <br>
+ Note: Rules of this type should either go into the
+ <code>root</code> or <code>ipv4</code> chain.
+ </p>
+ <table class="top_table">
+ <tr>
+ <th> Attribute </th>
+ <th> Datatype </th>
+ <th> Semantics </th>
+ </tr>
+ <tr>
+ <td>srcmacaddr</td>
+ <td>MAC_ADDR</td>
+ <td>MAC address of sender</td>
+ </tr>
+ <tr>
+ <td>srcmacmask</td>
+ <td>MAC_MASK</td>
+ <td>Mask applied to MAC address of sender</td>
+ </tr>
+ <tr>
+ <td>dstmacaddr</td>
+ <td>MAC_ADDR</td>
+ <td>MAC address of destination</td>
+ </tr>
+ <tr>
+ <td>dstmacmask</td>
+ <td>MAC_MASK</td>
+ <td>Mask applied to MAC address of destination</td>
+ </tr>
+ <tr>
+ <td>srcipaddr</td>
+ <td>IP_ADDR</td>
+ <td>Source IP address</td>
+ </tr>
+ <tr>
+ <td>srcipmask</td>
+ <td>IP_MASK</td>
+ <td>Mask applied to source IP address</td>
+ </tr>
+ <tr>
+ <td>dstipaddr</td>
+ <td>IP_ADDR</td>
+ <td>Destination IP address</td>
+ </tr>
+ <tr>
+ <td>dstipmask</td>
+ <td>IP_MASK</td>
+ <td>Mask applied to destination IP address</td>
+ </tr>
+ <tr>
+ <td>protocol</td>
+ <td>UINT8, STRING</td>
+ <td>Layer 4 protocol identifier</td>
+ </tr>
+ <tr>
+ <td>srcportstart</td>
+ <td>UINT16</td>
+ <td>Start of range of valid source ports; requires <code>protocol</code></td>
+ </tr>
+ <tr>
+ <td>srcportend</td>
+ <td>UINT16</td>
+ <td>End of range of valid source ports; requires <code>protocol</code></td>
+ </tr>
+ <tr>
+ <td>dstportstart</td>
+ <td>UINT16</td>
+ <td>Start of range of valid destination ports; requires <code>protocol</code></td>
+ </tr>
+ <tr>
+ <td>dstportend</td>
+ <td>UINT16</td>
+ <td>End of range of valid destination ports; requires <code>protocol</code></td>
+ </tr>
+ </table>
+ <p>
+ Valid strings for <code>protocol</code> are:
+ tcp, udp, udplite, esp, ah, icmp, igmp, sctp
+ <br><br>
+ </p>
+
+
+ <h5><a name="nwfelemsRulesProtoIPv6">IPv6</a></h5>
+ <p>
+ Protocol ID: <code>ipv6</code>
+ <br>
+ Note: Rules of this type should either go into the
+ <code>root</code> or <code>ipv6</code> chain.
+ </p>
+ <table class="top_table">
+ <tr>
+ <th> Attribute </th>
+ <th> Datatype </th>
+ <th> Semantics </th>
+ </tr>
+ <tr>
+ <td>srcmacaddr</td>
+ <td>MAC_ADDR</td>
+ <td>MAC address of sender</td>
+ </tr>
+ <tr>
+ <td>srcmacmask</td>
+ <td>MAC_MASK</td>
+ <td>Mask applied to MAC address of sender</td>
+ </tr>
+ <tr>
+ <td>dstmacaddr</td>
+ <td>MAC_ADDR</td>
+ <td>MAC address of destination</td>
+ </tr>
+ <tr>
+ <td>dstmacmask</td>
+ <td>MAC_MASK</td>
+ <td>Mask applied to MAC address of destination</td>
+ </tr>
+ <tr>
+ <td>srcipaddr</td>
+ <td>IPV6_ADDR</td>
+ <td>Source IPv6 address</td>
+ </tr>
+ <tr>
+ <td>srcipmask</td>
+ <td>IPV6_MASK</td>
+ <td>Mask applied to source IPv6 address</td>
+ </tr>
+ <tr>
+ <td>dstipaddr</td>
+ <td>IPV6_ADDR</td>
+ <td>Destination IPv6 address</td>
+ </tr>
+ <tr>
+ <td>dstipmask</td>
+ <td>IPV6_MASK</td>
+ <td>Mask applied to destination IPv6 address</td>
+ </tr>
+ <tr>
+ <td>protocol</td>
+ <td>UINT8, STRING</td>
+ <td>Layer 4 protocol identifier</td>
+ </tr>
+ <tr>
+ <td>srcportstart</td>
+ <td>UINT16</td>
+ <td>Start of range of valid source ports; requires <code>protocol</code></td>
+ </tr>
+ <tr>
+ <td>srcportend</td>
+ <td>UINT16</td>
+ <td>End of range of valid source ports; requires <code>protocol</code></td>
+ </tr>
+ <tr>
+ <td>dstportstart</td>
+ <td>UINT16</td>
+ <td>Start of range of valid destination ports; requires <code>protocol</code></td>
+ </tr>
+ <tr>
+ <td>dstportend</td>
+ <td>UINT16</td>
+ <td>End of range of valid destination ports; requires <code>protocol</code></td>
+ </tr>
+ </table>
+ <p>
+ Valid strings for <code>protocol</code> are:
+ tcp, udp, udplite, esp, ah, icmpv6, sctp
+ <br><br>
+ </p>
+
+ <h5><a name="nwfelemsRulesProtoTCP-ipv4">TCP/UDP/SCTP</a></h5>
+ <p>
+ Protocol ID: <code>tcp</code>, <code>udp</code>, <code>sctp</code>
+ <br>
+ Note: The chain parameter is ignored for this type of traffic
+ and should either be omitted or set to <code>root</code>.
+ </p>
+ <table class="top_table">
+ <tr>
+ <th> Attribute </th>
+ <th> Datatype </th>
+ <th> Semantics </th>
+ </tr>
+ <tr>
+ <td>srcmacaddr</td>
+ <td>MAC_ADDR</td>
+ <td>MAC address of sender</td>
+ </tr>
+ <tr>
+ <td>srcipaddr</td>
+ <td>IP_ADDR</td>
+ <td>Source IP address</td>
+ </tr>
+ <tr>
+ <td>srcipmask</td>
+ <td>IP_MASK</td>
+ <td>Mask applied to source IP address</td>
+ </tr>
+ <tr>
+ <td>dstipaddr</td>
+ <td>IP_ADDR</td>
+ <td>Destination IP address</td>
+ </tr>
+ <tr>
+ <td>dstipmask</td>
+ <td>IP_MASK</td>
+ <td>Mask applied to destination IP address</td>
+ </tr>
+
+ <tr>
+ <td>srcipfrom</td>
+ <td>IP_ADDR</td>
+ <td>Start of range of source IP address</td>
+ </tr>
+ <tr>
+ <td>srcipto</td>
+ <td>IP_ADDR</td>
+ <td>End of range of source IP address</td>
+ </tr>
+ <tr>
+ <td>dstipfrom</td>
+ <td>IP_ADDR</td>
+ <td>Start of range of destination IP address</td>
+ </tr>
+ <tr>
+ <td>dstipto</td>
+ <td>IP_ADDR</td>
+ <td>End of range of destination IP address</td>
+ </tr>
+
+ <tr>
+ <td>srcportstart</td>
+ <td>UINT16</td>
+ <td>Start of range of valid source ports</td>
+ </tr>
+ <tr>
+ <td>srcportend</td>
+ <td>UINT16</td>
+ <td>End of range of valid source ports</td>
+ </tr>
+ <tr>
+ <td>dstportstart</td>
+ <td>UINT16</td>
+ <td>Start of range of valid destination ports</td>
+ </tr>
+ <tr>
+ <td>dstportend</td>
+ <td>UINT16</td>
+ <td>End of range of valid destination ports</td>
+ </tr>
+ </table>
+ <p>
+ <br><br>
+ </p>
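+
+ <p>
+ A minimal sketch of a rule using these attributes, assuming the
+ goal is to accept incoming TCP connections to port 22 from a
+ single subnet. The filter name <code>example-allow-ssh-in</code>,
+ the subnet and the port are hypothetical values chosen for
+ illustration. The <code>chain</code> attribute is omitted, in line
+ with the note above.
+ </p>
+<pre>
+<filter name='example-allow-ssh-in'>
+  <rule action='accept' direction='in'>
+    <!-- source subnet given as address plus CIDR mask (IP_MASK) -->
+    <tcp srcipaddr='10.1.2.0' srcipmask='24'
+         dstportstart='22' dstportend='22'/>
+  </rule>
+</filter>
+</pre>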
+
+
+ <h5><a name="nwfelemsRulesProtoICMP">ICMP</a></h5>
+ <p>
+ Protocol ID: <code>icmp</code>
+ <br>
+ Note: The chain parameter is ignored for this type of traffic
+ and should either be omitted or set to <code>root</code>.
+ </p>
+ <table class="top_table">
+ <tr>
+ <th> Attribute </th>
+ <th> Datatype </th>
+ <th> Semantics </th>
+ </tr>
+ <tr>
+ <td>srcmacaddr</td>
+ <td>MAC_ADDR</td>
+ <td>MAC address of sender</td>
+ </tr>
+ <tr>
+ <td>srcmacmask</td>
+ <td>MAC_MASK</td>
+ <td>Mask applied to MAC address of sender</td>
+ </tr>
+ <tr>
+ <td>dstmacaddr</td>
+ <td>MAC_ADDR</td>
+ <td>MAC address of destination</td>
+ </tr>
+ <tr>
+ <td>dstmacmask</td>
+ <td>MAC_MASK</td>
+ <td>Mask applied to MAC address of destination</td>
+ </tr>
+ <tr>
+ <td>srcipaddr</td>
+ <td>IP_ADDR</td>
+ <td>Source IP address</td>
+ </tr>
+ <tr>
+ <td>srcipmask</td>
+ <td>IP_MASK</td>
+ <td>Mask applied to source IP address</td>
+ </tr>
+ <tr>
+ <td>dstipaddr</td>
+ <td>IP_ADDR</td>
+ <td>Destination IP address</td>
+ </tr>
+ <tr>
+ <td>dstipmask</td>
+ <td>IP_MASK</td>
+ <td>Mask applied to destination IP address</td>
+ </tr>
+
+ <tr>
+ <td>srcipfrom</td>
+ <td>IP_ADDR</td>
+ <td>Start of range of source IP address</td>
+ </tr>
+ <tr>
+ <td>srcipto</td>
+ <td>IP_ADDR</td>
+ <td>End of range of source IP address</td>
+ </tr>
+ <tr>
+ <td>dstipfrom</td>
+ <td>IP_ADDR</td>
+ <td>Start of range of destination IP address</td>
+ </tr>
+ <tr>
+ <td>dstipto</td>
+ <td>IP_ADDR</td>
+ <td>End of range of destination IP address</td>
+ </tr>
+ <tr>
+ <td>type</td>
+ <td>UINT16</td>
+ <td>ICMP type</td>
+ </tr>
+ <tr>
+ <td>code</td>
+ <td>UINT16</td>
+ <td>ICMP code</td>
+ </tr>
+ </table>
+ <p>
+ <br><br>
+ </p>
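+
+ <p>
+ As a small sketch of the <code>type</code> and <code>code</code>
+ attributes, the following fragment would drop incoming ICMP echo
+ requests, which use ICMP type 8 and code 0. The surrounding
+ filter is left out, and the choice of traffic to drop is only an
+ example.
+ </p>
+<pre>
+[...]
+  <rule action='drop' direction='in'>
+    <!-- ICMP echo request: type 8, code 0 -->
+    <icmp type='8' code='0'/>
+  </rule>
+[...]
+</pre>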
+
+ <h5><a name="nwfelemsRulesProtoMisc">IGMP, ESP, AH, UDPLITE, 'ALL'</a></h5>
+ <p>
+ Protocol ID: <code>igmp</code>, <code>esp</code>, <code>ah</code>, <code>udplite</code>, <code>all</code>
+ <br>
+ Note: The chain parameter is ignored for this type of traffic
+ and should either be omitted or set to <code>root</code>.
+ </p>
+ <table class="top_table">
+ <tr>
+ <th> Attribute </th>
+ <th> Datatype </th>
+ <th> Semantics </th>
+ </tr>
+ <tr>
+ <td>srcmacaddr</td>
+ <td>MAC_ADDR</td>
+ <td>MAC address of sender</td>
+ </tr>
+ <tr>
+ <td>srcmacmask</td>
+ <td>MAC_MASK</td>
+ <td>Mask applied to MAC address of sender</td>
+ </tr>
+ <tr>
+ <td>dstmacaddr</td>
+ <td>MAC_ADDR</td>
+ <td>MAC address of destination</td>
+ </tr>
+ <tr>
+ <td>dstmacmask</td>
+ <td>MAC_MASK</td>
+ <td>Mask applied to MAC address of destination</td>
+ </tr>
+ <tr>
+ <td>srcipaddr</td>
+ <td>IP_ADDR</td>
+ <td>Source IP address</td>
+ </tr>
+ <tr>
+ <td>srcipmask</td>
+ <td>IP_MASK</td>
+ <td>Mask applied to source IP address</td>
+ </tr>
+ <tr>
+ <td>dstipaddr</td>
+ <td>IP_ADDR</td>
+ <td>Destination IP address</td>
+ </tr>
+ <tr>
+ <td>dstipmask</td>
+ <td>IP_MASK</td>
+ <td>Mask applied to destination IP address</td>
+ </tr>
+
+ <tr>
+ <td>srcipfrom</td>
+ <td>IP_ADDR</td>
+ <td>Start of range of source IP address</td>
+ </tr>
+ <tr>
+ <td>srcipto</td>
+ <td>IP_ADDR</td>
+ <td>End of range of source IP address</td>
+ </tr>
+ <tr>
+ <td>dstipfrom</td>
+ <td>IP_ADDR</td>
+ <td>Start of range of destination IP address</td>
+ </tr>
+ <tr>
+ <td>dstipto</td>
+ <td>IP_ADDR</td>
+ <td>End of range of destination IP address</td>
+ </tr>
+ </table>
+ <p>
+ <br><br>
+ </p>
+
+
+ <h5><a name="nwfelemsRulesProtoTCP-ipv6">TCP/UDP/SCTP over IPv6</a></h5>
+ <p>
+ Protocol ID: <code>tcp-ipv6</code>, <code>udp-ipv6</code>, <code>sctp-ipv6</code>
+ <br>
+ Note: The chain parameter is ignored for this type of traffic
+ and should either be omitted or set to <code>root</code>.
+ </p>
+ <table class="top_table">
+ <tr>
+ <th> Attribute </th>
+ <th> Datatype </th>
+ <th> Semantics </th>
+ </tr>
+ <tr>
+ <td>srcmacaddr</td>
+ <td>MAC_ADDR</td>
+ <td>MAC address of sender</td>
+ </tr>
+ <tr>
+ <td>srcipaddr</td>
+ <td>IPV6_ADDR</td>
+ <td>Source IP address</td>
+ </tr>
+ <tr>
+ <td>srcipmask</td>
+ <td>IPV6_MASK</td>
+ <td>Mask applied to source IP address</td>
+ </tr>
+ <tr>
+ <td>dstipaddr</td>
+ <td>IPV6_ADDR</td>
+ <td>Destination IP address</td>
+ </tr>
+ <tr>
+ <td>dstipmask</td>
+ <td>IPV6_MASK</td>
+ <td>Mask applied to destination IP address</td>
+ </tr>
+
+ <tr>
+ <td>srcipfrom</td>
+ <td>IPV6_ADDR</td>
+ <td>Start of range of source IP address</td>
+ </tr>
+ <tr>
+ <td>srcipto</td>
+ <td>IPV6_ADDR</td>
+ <td>End of range of source IP address</td>
+ </tr>
+ <tr>
+ <td>dstipfrom</td>
+ <td>IPV6_ADDR</td>
+ <td>Start of range of destination IP address</td>
+ </tr>
+ <tr>
+ <td>dstipto</td>
+ <td>IPV6_ADDR</td>
+ <td>End of range of destination IP address</td>
+ </tr>
+
+ <tr>
+ <td>srcportstart</td>
+ <td>UINT16</td>
+ <td>Start of range of valid source ports</td>
+ </tr>
+ <tr>
+ <td>srcportend</td>
+ <td>UINT16</td>
+ <td>End of range of valid source ports</td>
+ </tr>
+ <tr>
+ <td>dstportstart</td>
+ <td>UINT16</td>
+ <td>Start of range of valid destination ports</td>
+ </tr>
+ <tr>
+ <td>dstportend</td>
+ <td>UINT16</td>
+ <td>End of range of valid destination ports</td>
+ </tr>
+ </table>
+ <p>
+ <br><br>
+ </p>
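+
+ <p>
+ The IPv6 variants are used in the same way as their IPv4
+ counterparts. The following fragment is a sketch that accepts
+ outgoing UDP traffic to port 53 for destinations within one IPv6
+ prefix. The prefix shown is the IPv6 documentation prefix and,
+ like the port number, is only a placeholder.
+ </p>
+<pre>
+[...]
+  <rule action='accept' direction='out'>
+    <!-- destination prefix given as address plus CIDR mask (IPV6_MASK) -->
+    <udp-ipv6 dstipaddr='2001:db8::' dstipmask='32'
+              dstportstart='53' dstportend='53'/>
+  </rule>
+[...]
+</pre>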
+
+
+ <h5><a name="nwfelemsRulesProtoICMPv6">ICMPv6</a></h5>
+ <p>
+ Protocol ID: <code>icmpv6</code>
+ <br>
+ Note: The chain parameter is ignored for this type of traffic
+ and should either be omitted or set to <code>root</code>.
+ </p>
+ <table class="top_table">
+ <tr>
+ <th> Attribute </th>
+ <th> Datatype </th>
+ <th> Semantics </th>
+ </tr>
+ <tr>
+ <td>srcmacaddr</td>
+ <td>MAC_ADDR</td>
+ <td>MAC address of sender</td>
+ </tr>
+ <tr>
+ <td>srcipaddr</td>
+ <td>IPV6_ADDR</td>
+ <td>Source IPv6 address</td>
+ </tr>
+ <tr>
+ <td>srcipmask</td>
+ <td>IPV6_MASK</td>
+ <td>Mask applied to source IPv6 address</td>
+ </tr>
+ <tr>
+ <td>dstipaddr</td>
+ <td>IPV6_ADDR</td>
+ <td>Destination IPv6 address</td>
+ </tr>
+ <tr>
+ <td>dstipmask</td>
+ <td>IPV6_MASK</td>
+ <td>Mask applied to destination IPv6 address</td>
+ </tr>
+
+ <tr>
+ <td>srcipfrom</td>
+ <td>IPV6_ADDR</td>
+ <td>Start of range of source IP address</td>
+ </tr>
+ <tr>
+ <td>srcipto</td>
+ <td>IPV6_ADDR</td>
+ <td>End of range of source IP address</td>
+ </tr>
+ <tr>
+ <td>dstipfrom</td>
+ <td>IPV6_ADDR</td>
+ <td>Start of range of destination IP address</td>
+ </tr>
+ <tr>
+ <td>dstipto</td>
+ <td>IPV6_ADDR</td>
+ <td>End of range of destination IP address</td>
+ </tr>
+
+ <tr>
+ <td>type</td>
+ <td>UINT16</td>
+ <td>ICMPv6 type</td>
+ </tr>
+ <tr>
+ <td>code</td>
+ <td>UINT16</td>
+ <td>ICMPv6 code</td>
+ </tr>
+ </table>
+ <p>
+ <br><br>
+ </p>
+
+ <h5><a name="nwfelemsRulesProtoMiscv6">IGMP, ESP, AH, UDPLITE, 'ALL' over IPv6</a></h5>
+ <p>
+ Protocol ID: <code>igmp-ipv6</code>, <code>esp-ipv6</code>, <code>ah-ipv6</code>, <code>udplite-ipv6</code>, <code>all-ipv6</code>
+ <br>
+ Note: The chain parameter is ignored for this type of traffic
+ and should either be omitted or set to <code>root</code>.
+ </p>
+ <table class="top_table">
+ <tr>
+ <th> Attribute </th>
+ <th> Datatype </th>
+ <th> Semantics </th>
+ </tr>
+ <tr>
+ <td>srcmacaddr</td>
+ <td>MAC_ADDR</td>
+ <td>MAC address of sender</td>
+ </tr>
+ <tr>
+ <td>srcipaddr</td>
+ <td>IPV6_ADDR</td>
+ <td>Source IPv6 address</td>
+ </tr>
+ <tr>
+ <td>srcipmask</td>
+ <td>IPV6_MASK</td>
+ <td>Mask applied to source IPv6 address</td>
+ </tr>
+ <tr>
+ <td>dstipaddr</td>
+ <td>IPV6_ADDR</td>
+ <td>Destination IPv6 address</td>
+ </tr>
+ <tr>
+ <td>dstipmask</td>
+ <td>IPV6_MASK</td>
+ <td>Mask applied to destination IPv6 address</td>
+ </tr>
+
+ <tr>
+ <td>srcipfrom</td>
+ <td>IPV6_ADDR</td>
+ <td>Start of range of source IP address</td>
+ </tr>
+ <tr>
+ <td>srcipto</td>
+ <td>IPV6_ADDR</td>
+ <td>End of range of source IP address</td>
+ </tr>
+ <tr>
+ <td>dstipfrom</td>
+ <td>IPV6_ADDR</td>
+ <td>Start of range of destination IP address</td>
+ </tr>
+ <tr>
+ <td>dstipto</td>
+ <td>IPV6_ADDR</td>
+ <td>End of range of destination IP address</td>
+ </tr>
+
+ </table>
+ <p>
+ <br><br>
+ </p>
+
+ <h2><a name="nwfcli">Command line tools</a></h2>
+ <p>
+ The libvirt command line tool <code>virsh</code> has been extended
+ with life-cycle support for network filters. All commands related
+ to the network filtering subsystem start with the prefix
+ <code>nwfilter</code>. The following commands are available:
+ </p>
+ <ul>
+ <li>nwfilter-list : list UUIDs and names of all network filters</li>
+ <li>nwfilter-define : define a new network filter or update an existing one</li>
+ <li>nwfilter-undefine : delete a network filter given its name; it must not be currently in use</li>
+ <li>nwfilter-dumpxml : display a network filter given its name</li>
+ <li>nwfilter-edit : edit a network filter given its name</li>
+ </ul>
+
+ <h2><a name="nwfexamples">Example network filters</a></h2>
+ <p>
+ The following is a list of example network filters that are
+ automatically installed with libvirt. </p>
+ <table class="top_table">
+ <tr>
+ <th> Name </th>
+ <th> Description </th>
+ </tr>
+ <tr>
+ <td> no-arp-spoofing </td>
+ <td> Prevent a VM from spoofing ARP traffic; this filter
+ only allows ARP request and reply messages and enforces
+ that those packets contain the MAC and IP addresses
+ of the VM.</td>
+ </tr>
+ <tr>
+ <td> allow-dhcp </td>
+ <td> Allow a VM to request an IP address via DHCP (from any
+ DHCP server)</td>
+ </tr>
+ <tr>
+ <td> allow-dhcp-server </td>
+ <td> Allow a VM to request an IP address from a specified
+ DHCP server. The dotted decimal IP address of the DHCP
+ server must be provided in a reference to this filter.
+ The name of the variable must be <i>DHCPSERVER</i>.</td>
+ </tr>
+ <tr>
+ <td> no-ip-spoofing </td>
+ <td> Prevent a VM from sending IP packets with
+ a source IP address different from the one
+ expected for the VM (the value of the <code>IP</code> variable). </td>
+ </tr>
+ <tr>
+ <td> no-ip-multicast </td>
+ <td> Prevent a VM from sending IP multicast packets. </td>
+ </tr>
+ <tr>
+ <td> clean-traffic </td>
+ <td> Prevent MAC, IP and ARP spoofing. This filter references
+ several other filters as building blocks. </td>
+ </tr>
+ </table>
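+ <p>
+ As a sketch of how the <i>DHCPSERVER</i> variable could be
+ supplied, the following interface description references the
+ <code>allow-dhcp-server</code> filter and passes the address of
+ the DHCP server as a parameter. The MAC address and the server
+ address 10.0.0.254 are placeholder values.
+ </p>
+<pre>
+  ...
+  <devices>
+    <interface type='bridge'>
+      <mac address='00:16:3e:5d:c7:9e'/>
+      <filterref filter='allow-dhcp-server'>
+        <!-- placeholder address of the only DHCP server to be used -->
+        <parameter name='DHCPSERVER' value='10.0.0.254'/>
+      </filterref>
+    </interface>
+  </devices>
+  ...</pre>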
+ <p>
+ Note that most of the above filters are only building blocks and
+ require a combination with other filters to provide useful network
+ traffic filtering.
+ The most useful one in the above list is the <i>clean-traffic</i>
+ filter. This filter itself can for example be combined with the
+ <i>no-ip-multicast</i>
+ filter to prevent virtual machines from sending IP multicast traffic
+ on top of the prevention of packet spoofing.
+ </p>
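+
+ <p>
+ The combination described above could be sketched as a new
+ filter that references both building blocks. The name
+ <code>example-clean-no-multicast</code> is hypothetical, and the
+ <code>uuid</code> node is left out on the assumption that libvirt
+ assigns one when the filter is defined.
+ </p>
+<pre>
+<filter name='example-clean-no-multicast'>
+  <!-- prevent MAC, IP and ARP spoofing -->
+  <filterref filter='clean-traffic'/>
+  <!-- additionally prevent the VM from sending IP multicast packets -->
+  <filterref filter='no-ip-multicast'/>
+</filter>
+</pre>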
+
+ <h2><a name="nwfwrite">Writing your own filters</a></h2>
+
+ <p>
+ Since libvirt only provides a couple of example network filters, you
+ may consider writing your own. Before doing so,
+ there are a few things
+ you need to know about the network filtering subsystem and how
+ it works internally. You also need a thorough understanding of
+ the protocols you want to filter on, so that
+ no traffic other than what you intend can pass and the
+ traffic you want to allow does in fact pass.
+ <br><br>
+ The network filtering subsystem is currently only available on
+ Linux hosts and only works for Qemu and KVM type of virtual machines.
+ On Linux
+ it builds upon the support for <code>ebtables</code>, <code>iptables
+ </code> and <code>ip6tables</code> and makes use of their features.
+ From the above list of supported protocols the following ones are
+ implemented using <code>ebtables</code>:
+ </p>
+ <ul>
+ <li>mac</li>
+ <li>arp, rarp</li>
+ <li>ip</li>
+ <li>ipv6</li>
+ </ul>
+
+ <p>
+ All other protocols over IPv4 are supported using iptables; those over
+ IPv6 are implemented using ip6tables.
+ <br><br>
+ On a Linux host, all traffic filtering instantiated by libvirt's network
+ filter subsystem first passes through the filtering support implemented
+ by ebtables and only then through iptables or ip6tables filters. If
+ a filter tree has rules with the protocols <code>mac</code>,
+ <code>arp</code>, <code>rarp</code>, <code>ip</code>, or <code>ipv6</code>
+ ebtables rules will automatically be instantiated.
+ <br>
+ The role of the <code>chain</code> attribute in the network filter
+ XML is that internally a new user-defined ebtables chain is created
+ that then, for example, receives all <code>arp</code> traffic coming
+ from or going to a virtual machine if the chain <code>arp</code>
+ has been specified. Further, a rule is generated in an interface's
+ <code>root</code> chain that directs all ARP traffic into the
+ user-defined chain. Therefore, all ARP traffic rules should then be
+ placed into filters specifying this chain. This type of branching
+ into user-defined chains is only supported with filtering on the ebtables
+ layer.
+ <br>
+ As an example, it is
+ possible to filter on UDP traffic by source and destination ports using
+ the <code>ip</code> protocol filter and specifying attributes for the
+ protocol, source and destination IP addresses and ports of UDP packets
+ that are to be accepted. This allows
+ early filtering of UDP traffic with ebtables. However, once an IP or IPv6
+ packet, such as a UDP packet,
+ has passed the ebtables layer and there is at least one rule in a filter
+ tree that instantiates iptables or ip6tables rules, a rule that lets
+ the UDP packet pass must also be provided for those
+ filtering layers. This can be
+ achieved with a rule containing an appropriate <code>udp</code> or
+ <code>udp-ipv6</code> traffic filtering node.
+ </p>
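+ <p>
+ The following sketch illustrates this for outgoing DNS queries. The
+ filter name <i>allow-dns-out</i> and the choice of UDP port 53 are
+ assumptions made for this example only. The first rule is meant to be
+ handled on the ebtables layer, the second one lets the same packets
+ pass the iptables layer:
+ </p>
+<pre>
+<filter name='allow-dns-out'>
+ <!-- ebtables layer: accept outgoing IPv4 packets carrying UDP
+ with destination port 53 -->
+ <rule action='accept' direction='out'>
+ <ip protocol='udp' dstportstart='53'/>
+ </rule>
+
+ <!-- iptables layer: accept the same outgoing UDP packets -->
+ <rule action='accept' direction='out'>
+ <udp dstportstart='53'/>
+ </rule>
+</filter>
+</pre>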
+
+ <h3><a name="nwfwriteexample">Example custom filter</a></h3>
+ <p>
+ As an example, we now want to build a filter that fulfills the following
+ list of requirements:
+ </p>
+ <ul>
+ <li>prevents a VM's interface from MAC, IP and ARP spoofing</li>
+ <li>opens only TCP ports 22 and 80 of a VM's interface</li>
+ <li>allows the VM to send ping traffic from an interface
+ but does not let the VM be pinged on the interface</li>
+ </ul>
+ <p>
+ The requirement to prevent spoofing is fulfilled by the existing
+ <code>clean-traffic</code> network filter, thus we will reference this
+ filter from our custom filter.
+ <br>
+ To enable traffic for TCP ports 22 and 80 we will add two rules
+ accepting this type of traffic. To allow the VM to send ping traffic
+ we will add a rule for ICMP traffic. For simplicity
+ we allow general ICMP traffic to be initiated from the VM, not
+ just ICMP echo request and response messages. To
+ disallow all other traffic from reaching or being initiated by the
+ VM we will then add a rule that drops all other traffic.
+ Assuming our VM is called <i>test</i> and
+ the interface we want to associate our filter with is called <i>eth0</i>,
+ we name our filter <i>test-eth0</i>.
+ The result of these considerations is the following network filter XML:
+ </p>
+<pre>
+<filter name='test-eth0'>
+ <!-- reference the clean-traffic filter preventing
+ MAC, IP and ARP spoofing. By not providing
+ an IP address parameter libvirt will detect the
+ IP address the VM is using. -->
+ <filterref filter='clean-traffic'/>
+
+ <!-- enable TCP ports 22 (ssh) and 80 (http) to be reachable -->
+ <rule action='accept' direction='in'>
+ <tcp dstportstart='22'/>
+ </rule>
+
+ <rule action='accept' direction='in'>
+ <tcp dstportstart='80'/>
+ </rule>
+
+ <!-- enable general ICMP traffic to be initiated by the VM;
+ this includes ping traffic -->
+ <rule action='accept' direction='out'>
+ <icmp/>
+ </rule>
+
+ <!-- drop all other traffic -->
+ <rule action='drop' direction='inout'>
+ <all/>
+ </rule>
+
+</filter>
+</pre>
+ <p>
+ Note that none of the rules in the above XML contain the
+ IP address of the VM as either source or destination address, yet
+ the filtering of the traffic works correctly. The reason is that
+ the evaluation of the rules internally happens on a
+ per-interface basis and the rules are evaluated based on the knowledge
+ about which (tap) interface has sent or will receive the packet rather
+ than what their source or destination IP address may be.
+ <br><br>
+ An XML fragment for a possible network interface description inside
+ the domain XML of the <code>test</code> VM could then look like this:
+ </p>
+<pre>
+ [...]
+ <interface type='bridge'>
+ <source bridge='mybridge'/>
+ <filterref filter='test-eth0'/>
+ </interface>
+ [...]
+</pre>
+
+ <p>
+ To more strictly control the ICMP traffic and enforce that only
+ ICMP echo requests can be sent from the VM
+ and only ICMP echo responses be received by the VM, the above
+ <code>ICMP</code> rule can be replaced with the following two rules:
+ </p>
+<pre>
+ <!-- enable outgoing ICMP echo requests-->
+ <rule action='accept' direction='out'>
+ <icmp type='8'/>
+ </rule>
+
+ <!-- enable incoming ICMP echo replies-->
+ <rule action='accept' direction='in'>
+ <icmp type='0'/>
+ </rule>
+</pre>
+
+
+ <h2><a name="nwflimits">Limitations</a></h2>
+ <p>
+ The following sections list (current) limitations of the network
+ filtering subsystem.
+ </p>
+
+ <h3><a name="nwflimitsIP">IP Address Detection</a></h3>
+ <p>
+ In case a network filter references the variable
+ <i>IP</i> and no variable was defined in any higher layer
+ references to the filter, IP address detection will automatically
+ be started when the filter is to be instantiated (VM start, interface
+ hotplug event). Only IPv4
+ addresses can be detected and only a single IP address
+ legitimately in use by a VM on a single interface will be detected.
+ If a VM is to use multiple IP addresses on a single interface
+ (IP aliasing),
+ the IP addresses have to be provided explicitly either
+ in the network filter itself or as variables used in attributes'
+ values. These
+ variables must then be defined in a higher level reference to the filter
+ and each assigned the value of the IP address that the VM is expected
+ to be using.
+ Different IP addresses in use by multiple interfaces of a VM
+ (one IP address each) will be independently detected.
+ <br><br>
+ Once a VM's IP address has been detected, its IP network traffic
+ may be locked to that address, if for example IP address spoofing
+ is prevented by one of its filters. In that case the user of the VM
+ will not be able to change the IP address on the interface inside
+ the VM, which would be considered IP address spoofing.
+ <br><br>
+ In case a VM is resumed after suspension or migrated, IP address
+ detection will be restarted.
+ </p>
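+ <p>
+ As a sketch of how detection can be avoided, the IP address may be
+ provided as the variable <i>IP</i> in the reference to the filter
+ inside the interface description of the domain XML. The bridge name
+ and the address 10.0.0.1 are placeholders chosen for this example:
+ </p>
+<pre>
+ [...]
+ <interface type='bridge'>
+ <source bridge='mybridge'/>
+ <filterref filter='clean-traffic'>
+ <parameter name='IP' value='10.0.0.1'/>
+ </filterref>
+ </interface>
+ [...]
+</pre>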
+
+ <h3><a name="nwflimitsmigr">VM Migration</a></h3>
+ <p>
+ VM migration is only supported if the whole filter tree
+ that is referenced by a virtual machine's top level filter
+ is also available on the target host. The network filter
+ <i>clean-traffic</i>
+ for example should be available on all libvirt installations
+ of version 0.8.1 or later and thus enable migration of VMs that
+ for example reference this filter. All other
+ custom filters must be migrated using higher layer software. It is
+ outside the scope of libvirt to ensure that referenced filters
+ on the source system are equivalent to those on the target system
+ and vice versa.
+ <br><br>
+ Migration must occur between libvirt installations of version
+ 0.8.1 or later in order not to lose the network traffic filters
+ associated with an interface.
+ </p>
+
+ </body>
+</html>