[libvirt] [PATCHv3 0/4] util: fix libvirtd startup failure due to netlink error

The initial patch in this series is the same as the single patch I provided before, but in addition there are 3 new patches that take care of the fallout from 1/4 by: 2/4) removing hard-coding of the src nl_pid when virNetLinkCommand is sending a message 3/4) adding a helper function to retrieve the local nl_pid from a libnl socket and 4/4) using that value when appropriate (I think). As before, I'm unable to fully test this myself, so I won't push unless/until I get verification it works.

This solves the problem detailed in: https://bugzilla.redhat.com/show_bug.cgi?id=816465 and further detailed in https://www.redhat.com/archives/libvir-list/2012-May/msg00202.htm A short explanation is included in the comments of the patch itself. Even with ACK, I will wait to push this until I have verification that it does not break lldpad<-->libvirtd communication (if it does, I may need to use the nl_handle allocated during virNetlinkStartup() for virNetlinkEventServiceStart()). --- daemon/libvirtd.c | 6 +++++ src/libvirt_private.syms | 2 ++ src/util/virnetlink.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++ src/util/virnetlink.h | 5 +++- 4 files changed, 77 insertions(+), 1 deletion(-) diff --git a/daemon/libvirtd.c b/daemon/libvirtd.c index b098f6a..5d57b50 100644 --- a/daemon/libvirtd.c +++ b/daemon/libvirtd.c @@ -1007,6 +1007,11 @@ int main(int argc, char **argv) { goto cleanup; } + if (virNetlinkStartup() < 0) { + ret = VIR_DAEMON_ERR_INIT; + goto cleanup; + } + if (!(srv = virNetServerNew(config->min_workers, config->max_workers, config->prio_workers, @@ -1143,6 +1148,7 @@ cleanup: virNetServerProgramFree(qemuProgram); virNetServerClose(srv); virNetServerFree(srv); + virNetlinkShutdown(); if (statuswrite != -1) { if (ret != 0) { /* Tell parent of daemon what failed */ diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 88f8a21..9b20dd4 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1333,6 +1333,8 @@ virNetlinkEventRemoveClient; virNetlinkEventServiceIsRunning; virNetlinkEventServiceStop; virNetlinkEventServiceStart; +virNetlinkShutdown; +virNetlinkStartup; # virnetmessage.h diff --git a/src/util/virnetlink.c b/src/util/virnetlink.c index 2cbf32a..1b64033 100644 --- a/src/util/virnetlink.c +++ b/src/util/virnetlink.c @@ -98,10 +98,63 @@ static int nextWatch = 1; # define NETLINK_EVENT_ALLOC_EXTENT 10 static virNetlinkEventSrvPrivatePtr server = NULL; +static struct nl_handle *placeholder_nlhandle = NULL; /* Function definitions */ /** + * virNetlinkStartup: + * + * Perform any initialization that needs to take place before the + * program starts up worker threads. This is currently used to assure + * that an nl_handle is allocated prior to any attempts to bind a + * netlink socket. For a discussion of why this is necessary, please + * see the following email message: + * + * https://www.redhat.com/archives/libvir-list/2012-May/msg00202.html + * + * The short version is that, without this placeholder allocation of + * an nl_handle that is never used, it is possible for nl_connect() in + * one thread to collide with a direct bind() of a netlink socket in + * another thread, leading to failure of the operation (which could + * lead to failure of libvirtd to start). Since getaddrinfo() (used by + * libvirtd in virSocketAddrParse, which is called quite frequently + * during startup) directly calls bind() on a netlink socket, this is + * actually a very common occurence (15-20% failure rate on some + * hardware). + * + * Returns 0 on success, -1 on failure. + */ +int +virNetlinkStartup(void) +{ + if (placeholder_nlhandle) + return 0; + placeholder_nlhandle = nl_handle_alloc(); + if (!placeholder_nlhandle) { + virReportSystemError(errno, "%s", + _("cannot allocate placeholder nlhandle for netlink")); + return -1; + } + return 0; +} + +/** + * virNetlinkShutdown: + * + * Undo any initialization done by virNetlinkStartup. This currently + * destroys the placeholder nl_handle. + */ +void +virNetlinkShutdown(void) +{ + if (placeholder_nlhandle) { + nl_handle_destroy(placeholder_nlhandle); + placeholder_nlhandle = NULL; + } +} + +/** * virNetlinkCommand: * @nlmsg: pointer to netlink message * @respbuf: pointer to pointer where response buffer will be allocated @@ -546,6 +599,18 @@ static const char *unsupported = N_("libnl was not available at build time"); static const char *unsupported = N_("not supported on non-linux platforms"); # endif +int +virNetlinkStartup(void) +{ + return 0; +} + +void +virNetlinkShutdown(void) +{ + return; +} + int virNetlinkCommand(struct nl_msg *nl_msg ATTRIBUTE_UNUSED, unsigned char **respbuf ATTRIBUTE_UNUSED, unsigned int *respbuflen ATTRIBUTE_UNUSED, diff --git a/src/util/virnetlink.h b/src/util/virnetlink.h index a72612e..93df59a 100644 --- a/src/util/virnetlink.h +++ b/src/util/virnetlink.h @@ -1,5 +1,5 @@ /* - * Copyright (C) 2010-2011 Red Hat, Inc. + * Copyright (C) 2010-2012 Red Hat, Inc. * Copyright (C) 2010-2012 IBM Corporation * * This library is free software; you can redistribute it and/or @@ -35,6 +35,9 @@ struct nlattr; # endif /* __linux__ */ +int virNetlinkStartup(void); +void virNetlinkShutdown(void); + int virNetlinkCommand(struct nl_msg *nl_msg, unsigned char **respbuf, unsigned int *respbuflen, int nl_pid); -- 1.7.10

On 05/04/2012 12:51 PM, Laine Stump wrote:
This solves the problem detailed in:
https://bugzilla.redhat.com/show_bug.cgi?id=816465
and further detailed in
https://www.redhat.com/archives/libvir-list/2012-May/msg00202.htm
A short explanation is included in the comments of the patch itself.
Even with ACK, I will wait to push this until I have verification that it does not break lldpad<-->libvirtd communication (if it does, I may need to use the nl_handle allocated during virNetlinkStartup() for virNetlinkEventServiceStart()). ---
Conditional ACK from v2 remains good; here's hoping we get feedback on patches 2-4 fixing the lldpad interaction. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

Until now, virNetlinkCommand has assumed that the nl_pid in the source address of outgoing netlink messages should always be the return value of getpid(). In most cases it actually doesn't matter, but in the case of communication with lldpad, lldpad saves this info and later uses it to send netlink messages back to libvirt. A recent patch to fix Bug 816465 changed the order of the universe such that the netlink event service socket is no longer bound with nl_pid == getpid(), so lldpad could no longer send unsolicited messages to libvirtd. Adding src_pid as an argument to virNetlinkCommand() is the first step in notifying lldpad of the proper address of the netlink event service socket. --- src/util/virnetdev.c | 4 ++-- src/util/virnetdevmacvlan.c | 6 +++--- src/util/virnetdevvportprofile.c | 4 ++-- src/util/virnetlink.c | 13 +++++++------ src/util/virnetlink.h | 2 +- 5 files changed, 15 insertions(+), 14 deletions(-) diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index 460aa83..72af3cd 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -1288,7 +1288,7 @@ virNetDevLinkDump(const char *ifname, int ifindex, } } - if (virNetlinkCommand(nl_msg, recvbuf, &recvbuflen, pid) < 0) + if (virNetlinkCommand(nl_msg, recvbuf, &recvbuflen, 0, pid) < 0) goto cleanup; if (recvbuflen < NLMSG_LENGTH(0) || *recvbuf == NULL) @@ -1416,7 +1416,7 @@ virNetDevSetVfConfig(const char *ifname, int ifindex, int vf, } } - if (virNetlinkCommand(nl_msg, &recvbuf, &recvbuflen, pid) < 0) + if (virNetlinkCommand(nl_msg, &recvbuf, &recvbuflen, 0, pid) < 0) goto cleanup; if (recvbuflen < NLMSG_LENGTH(0) || recvbuf == NULL) diff --git a/src/util/virnetdevmacvlan.c b/src/util/virnetdevmacvlan.c index 401ee74..d168d42 100644 --- a/src/util/virnetdevmacvlan.c +++ b/src/util/virnetdevmacvlan.c @@ -1,5 +1,5 @@ /* - * Copyright (C) 2010-2011 Red Hat, Inc. + * Copyright (C) 2010-2012 Red Hat, Inc. * Copyright (C) 2010-2012 IBM Corporation * * This library is free software; you can redistribute it and/or @@ -154,7 +154,7 @@ virNetDevMacVLanCreate(const char *ifname, nla_nest_end(nl_msg, linkinfo); - if (virNetlinkCommand(nl_msg, &recvbuf, &recvbuflen, 0) < 0) { + if (virNetlinkCommand(nl_msg, &recvbuf, &recvbuflen, 0, 0) < 0) { goto cleanup; } @@ -242,7 +242,7 @@ int virNetDevMacVLanDelete(const char *ifname) if (nla_put(nl_msg, IFLA_IFNAME, strlen(ifname)+1, ifname) < 0) goto buffer_too_small; - if (virNetlinkCommand(nl_msg, &recvbuf, &recvbuflen, 0) < 0) { + if (virNetlinkCommand(nl_msg, &recvbuf, &recvbuflen, 0, 0) < 0) { goto cleanup; } diff --git a/src/util/virnetdevvportprofile.c b/src/util/virnetdevvportprofile.c index 5b562be..def5c62 100644 --- a/src/util/virnetdevvportprofile.c +++ b/src/util/virnetdevvportprofile.c @@ -1,5 +1,5 @@ /* - * Copyright (C) 2009-2011 Red Hat, Inc. + * Copyright (C) 2009-2012 Red Hat, Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -395,7 +395,7 @@ virNetDevVPortProfileOpSetLink(const char *ifname, int ifindex, goto cleanup; } - if (virNetlinkCommand(nl_msg, &recvbuf, &recvbuflen, pid) < 0) + if (virNetlinkCommand(nl_msg, &recvbuf, &recvbuflen, 0, pid) < 0) goto cleanup; if (recvbuflen < NLMSG_LENGTH(0) || recvbuf == NULL) diff --git a/src/util/virnetlink.c b/src/util/virnetlink.c index 1b64033..df0f57a 100644 --- a/src/util/virnetlink.c +++ b/src/util/virnetlink.c @@ -168,12 +168,12 @@ virNetlinkShutdown(void) */ int virNetlinkCommand(struct nl_msg *nl_msg, unsigned char **respbuf, unsigned int *respbuflen, - int nl_pid) + uint32_t src_pid, uint32_t dst_pid) { int rc = 0; struct sockaddr_nl nladdr = { .nl_family = AF_NETLINK, - .nl_pid = nl_pid, + .nl_pid = dst_pid, .nl_groups = 0, }; ssize_t nbytes; @@ -201,7 +201,7 @@ int virNetlinkCommand(struct nl_msg *nl_msg, nlmsg_set_dst(nl_msg, &nladdr); - nlmsg->nlmsg_pid = getpid(); + nlmsg->nlmsg_pid = src_pid ? src_pid : getpid(); nbytes = nl_send_auto_complete(nlhandle, nl_msg); if (nbytes < 0) { @@ -612,9 +612,10 @@ virNetlinkShutdown(void) } int virNetlinkCommand(struct nl_msg *nl_msg ATTRIBUTE_UNUSED, - unsigned char **respbuf ATTRIBUTE_UNUSED, - unsigned int *respbuflen ATTRIBUTE_UNUSED, - int nl_pid ATTRIBUTE_UNUSED) + unsigned char **respbuf ATTRIBUTE_UNUSED, + unsigned int *respbuflen ATTRIBUTE_UNUSED, + uint32_t src_pid ATTRIBUTE_UNUSED, + uint32_t dst_pid ATTRIBUTE_UNUSED) { netlinkError(VIR_ERR_INTERNAL_ERROR, "%s", _(unsupported)); return -1; diff --git a/src/util/virnetlink.h b/src/util/virnetlink.h index 93df59a..c57be82 100644 --- a/src/util/virnetlink.h +++ b/src/util/virnetlink.h @@ -40,7 +40,7 @@ void virNetlinkShutdown(void); int virNetlinkCommand(struct nl_msg *nl_msg, unsigned char **respbuf, unsigned int *respbuflen, - int nl_pid); + uint32_t src_port, uint32_t dst_port); typedef void (*virNetlinkEventHandleCallback)(unsigned char *msg, int length, struct sockaddr_nl *peer, bool *handled, void *opaque); -- 1.7.10

On 05/04/2012 12:51 PM, Laine Stump wrote:
Until now, virNetlinkCommand has assumed that the nl_pid in the source address of outgoing netlink messages should always be the return value of getpid(). In most cases it actually doesn't matter, but in the case of communication with lldpad, lldpad saves this info and later uses it to send netlink messages back to libvirt. A recent patch to fix Bug 816465 changed the order of the universe such that the netlink event service socket is no longer bound with nl_pid == getpid(), so lldpad could no longer send unsolicited messages to libvirtd. Adding src_pid as an argument to virNetlinkCommand() is the first step in notifying lldpad of the proper address of the netlink event service socket. --- src/util/virnetdev.c | 4 ++-- src/util/virnetdevmacvlan.c | 6 +++--- src/util/virnetdevvportprofile.c | 4 ++-- src/util/virnetlink.c | 13 +++++++------ src/util/virnetlink.h | 2 +- 5 files changed, 15 insertions(+), 14 deletions(-)
ACK. This one can technically be applied without waiting on lldpad testing (although it doesn't hurt to wait to apply the series as a unit when patch 1 and 4 pass testing). -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

This value will be needed to set the src_pid when sending netlink messages to lldpad. It is part of the solution to: https://bugzilla.redhat.com/show_bug.cgi?id=816465 Note that libnl's port generation algorithm guarantees that the nl_socket_get_local_port() will always be > 0, so it is okay to cast the uint32_t to int (thus allowing us to use -1 as an error sentinel). --- src/libvirt_private.syms | 1 + src/util/virnetlink.c | 18 ++++++++++++++++++ src/util/virnetlink.h | 5 +++++ 3 files changed, 24 insertions(+) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 9b20dd4..4ab1e65 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1331,6 +1331,7 @@ virNetlinkCommand; virNetlinkEventAddClient; virNetlinkEventRemoveClient; virNetlinkEventServiceIsRunning; +virNetlinkEventServiceLocalPid; virNetlinkEventServiceStop; virNetlinkEventServiceStart; virNetlinkShutdown; diff --git a/src/util/virnetlink.c b/src/util/virnetlink.c index df0f57a..aeca2a7 100644 --- a/src/util/virnetlink.c +++ b/src/util/virnetlink.c @@ -379,6 +379,24 @@ virNetlinkEventServiceIsRunning(void) } /** + * virNetlinkEventServiceLocalPid: + * + * Returns the nl_pid value that was used to bind() the netlink socket + * used by the netlink event service, or -1 on error (netlink + * guarantees that this value will always be > 0). + */ +int virNetlinkEventServiceLocalPid(void) +{ + if (!(server && server->netlinknh)) { + netlinkError(VIR_ERR_INTERNAL_ERROR, "%s", + _("netlink event service not running")); + return -1; + } + return (int)nl_socket_get_local_port(server->netlinknh); +} + + +/** * virNetlinkEventServiceStart: * * start a monitor to receive netlink messages for libvirtd. diff --git a/src/util/virnetlink.h b/src/util/virnetlink.h index c57be82..bafe8ca 100644 --- a/src/util/virnetlink.h +++ b/src/util/virnetlink.h @@ -62,6 +62,11 @@ int virNetlinkEventServiceStart(void); bool virNetlinkEventServiceIsRunning(void); /** + * virNetlinkEventServiceLocalPid: returns nl_pid used to bind() netlink socket + */ +int virNetlinkEventServiceLocalPid(void); + +/** * virNetlinkEventAddClient: register a callback for handling of netlink messages */ int virNetlinkEventAddClient(virNetlinkEventHandleCallback handleCB, -- 1.7.10

On 05/04/2012 12:51 PM, Laine Stump wrote:
This value will be needed to set the src_pid when sending netlink messages to lldpad. It is part of the solution to:
https://bugzilla.redhat.com/show_bug.cgi?id=816465
Note that libnl's port generation algorithm guarantees that the nl_socket_get_local_port() will always be > 0, so it is okay to cast the uint32_t to int (thus allowing us to use -1 as an error sentinel).
I had to look; we always let libnl generate a socket, and indeed the libnl code: http://git.infradead.org/users/tgr/libnl.git/blob?f=lib/socket.c#l50 returns a uint32_t, but computes that value as 'pid + (n << 22);', where n is at most 32, and where pid is at most PID_MAX_LIMIT of 2^22, so the end result is less than 31 bits and therefore positive when cast to int. Maybe it would help to tweak the wording a bit: Note that libnl's port generation algorithm guarantees that the uint32_t result of nl_socket_get_local_port() will never set the most significant bit, and therefore will be positive when cast to int, allowing us to use -1 as an error sentinel.
/** + * virNetlinkEventServiceLocalPid: + * + * Returns the nl_pid value that was used to bind() the netlink socket + * used by the netlink event service, or -1 on error (netlink + * guarantees that this value will always be > 0). + */ +int virNetlinkEventServiceLocalPid(void) +{ + if (!(server && server->netlinknh)) { + netlinkError(VIR_ERR_INTERNAL_ERROR, "%s", + _("netlink event service not running")); + return -1; + } + return (int)nl_socket_get_local_port(server->netlinknh);
Technically, the cast is redundant, but I'm okay if you leave it in. ACK; this one can technically be applied without test results, although it doesn't hurt to wait for the rest of the series. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

Until now, the nl_pid of the source address of every message sent by virNetlinkCommand has been set to the value of getpid(). Most of the time this doesn't matter, and in the one case where it does (communication with lldpad), it previously was the proper thing to do, because the netlink event service (which listens on a netlink socket for unsolicited messages from lldpad) coincidentally always happened to bind with a local nl_pid == getpid(). With the fix for: https://bugzilla.redhat.com/show_bug.cgi?id=816465 that particular nl_pid is now effectively a reserved value, so the netlink event service will always bind to something else (coincidentally "getpid() + (1 << 22)", but it really could be anything). The result is that communication between lldpad and libvirtd is broken (lldpad gets a "disconnected" error when it tries to send a directed message). The solution to this problem cause by a solution, is to query the netlink event service's nlhandle for its "local_port", and send that as the source nl_pid (but only when sending to lldpad, of course - in other cases we maintain the old behavior of sending getpid()). There are two cases where a message is being directed at lldpad - one in virNetDevLinkDump, and one in virNetDevVPortProfileOpSetLink. The case of virNetDevVPortProfileOpSetLink is simplest to explain - only if !nltarget_kernel, i.e. the message isn't targetted for the kernel, is the dst_pid set (by calling virNetDevVPortProfileGetLldpadPid()), so only in that case do we call virNetlinkEvenServiceLocalPid() to set src_pid. For virNetDevLinkDump, it's a bit more complicated. The call to virNetDevVPortProfileGetLldpadPid() was effectively up one level (in virNetDevVPortProfileOpCommon), although obscured by an unnecessary passing of a function pointer. This patch removes the function pointer, and calls virNetDevVPortProfileGetLldpadPid() directly in virNetDevVPortProfileOpCommon - if it's doing this, it knows that it should also call virNetlinkEventServiceLocalPid() to set src_pid too; then it just passes src_pid and dst_pid down to virNetDevLinkDump. Since (src_pid == 0 && dst_pid == 0) implies that the kernel is the destination, there is no longer any need to send nltarget_kernel as an arg to virNetDevLinkDump, so it's been removed. The disparity between src_pid being int and dst_pid being uint32_t may be a bit disconcerting to some, but I didn't want to complicate virNetlinkEventServiceLocalPid() by having status returned separately from the value. --- src/util/virnetdev.c | 30 +++++++++++------------------- src/util/virnetdev.h | 4 ++-- src/util/virnetdevvportprofile.c | 24 +++++++++++++++++------- 3 files changed, 30 insertions(+), 28 deletions(-) diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index 72af3cd..d53352f 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -1229,15 +1229,15 @@ static struct nla_policy ifla_vf_policy[IFLA_VF_MAX+1] = { * * @ifname: The name of the interface; only use if ifindex < 0 * @ifindex: The interface index; may be < 0 if ifname is given - * @nltarget_kernel: whether to send the message to the kernel or another - * process * @nlattr: pointer to a pointer of netlink attributes that will contain * the results * @recvbuf: Pointer to the buffer holding the returned netlink response * message; free it, once not needed anymore - * @getPidFunc: Pointer to a function that will be invoked if the kernel - * is not the target of the netlink message but it is to be - * sent to another process. + * @src_pid: pid used for nl_pid of the local end of the netlink message + * (0 == "use getpid()") + * @dst_pid: pid of destination nl_pid if the kernel + * is not the target of the netlink message but it is to be + * sent to another process (0 if sending to the kernel) * * Get information about an interface given its name or index. * @@ -1245,9 +1245,9 @@ static struct nla_policy ifla_vf_policy[IFLA_VF_MAX+1] = { */ int virNetDevLinkDump(const char *ifname, int ifindex, - bool nltarget_kernel, struct nlattr **tb, + struct nlattr **tb, unsigned char **recvbuf, - uint32_t (*getPidFunc)(void)) + uint32_t src_pid, uint32_t dst_pid) { int rc = -1; struct nlmsghdr *resp; @@ -1257,7 +1257,6 @@ virNetDevLinkDump(const char *ifname, int ifindex, .ifi_index = ifindex }; unsigned int recvbuflen; - uint32_t pid = 0; struct nl_msg *nl_msg; *recvbuf = NULL; @@ -1281,14 +1280,7 @@ virNetDevLinkDump(const char *ifname, int ifindex, goto buffer_too_small; } - if (!nltarget_kernel) { - pid = getPidFunc(); - if (pid == 0) { - goto cleanup; - } - } - - if (virNetlinkCommand(nl_msg, recvbuf, &recvbuflen, 0, pid) < 0) + if (virNetlinkCommand(nl_msg, recvbuf, &recvbuflen, src_pid, dst_pid) < 0) goto cleanup; if (recvbuflen < NLMSG_LENGTH(0) || *recvbuf == NULL) @@ -1526,7 +1518,7 @@ virNetDevGetVfConfig(const char *ifname, int vf, unsigned char *mac, struct nlattr *tb[IFLA_MAX + 1] = {NULL, }; int ifindex = -1; - rc = virNetDevLinkDump(ifname, ifindex, true, tb, &recvbuf, NULL); + rc = virNetDevLinkDump(ifname, ifindex, tb, &recvbuf, 0, 0); if (rc < 0) return rc; @@ -1658,10 +1650,10 @@ virNetDevRestoreNetConfig(char *linkdev, int vf, char *stateDir) int virNetDevLinkDump(const char *ifname ATTRIBUTE_UNUSED, int ifindex ATTRIBUTE_UNUSED, - bool nltarget_kernel ATTRIBUTE_UNUSED, struct nlattr **tb ATTRIBUTE_UNUSED, unsigned char **recvbuf ATTRIBUTE_UNUSED, - uint32_t (*getPidFunc)(void) ATTRIBUTE_UNUSED) + uint32_t src_pid ATTRIBUTE_UNUSED, + uint32_t dst_pid ATTRIBUTE_UNUSED) { virReportSystemError(ENOSYS, "%s", _("Unable to dump link info on this platform")); diff --git a/src/util/virnetdev.h b/src/util/virnetdev.h index 73f4c64..660d2db 100644 --- a/src/util/virnetdev.h +++ b/src/util/virnetdev.h @@ -107,9 +107,9 @@ int virNetDevGetVirtualFunctions(const char *pfname, ATTRIBUTE_RETURN_CHECK; int virNetDevLinkDump(const char *ifname, int ifindex, - bool nltarget_kernel, struct nlattr **tb, + struct nlattr **tb, unsigned char **recvbuf, - uint32_t (*getPidFunc)(void)) + uint32_t src_pid, uint32_t dst_pid) ATTRIBUTE_RETURN_CHECK; int virNetDevReplaceNetConfig(char *linkdev, int vf, diff --git a/src/util/virnetdevvportprofile.c b/src/util/virnetdevvportprofile.c index def5c62..38261d1 100644 --- a/src/util/virnetdevvportprofile.c +++ b/src/util/virnetdevvportprofile.c @@ -282,7 +282,8 @@ virNetDevVPortProfileOpSetLink(const char *ifname, int ifindex, }; unsigned char *recvbuf = NULL; unsigned int recvbuflen = 0; - uint32_t pid = 0; + int src_pid = 0; + uint32_t dst_pid = 0; struct nl_msg *nl_msg; struct nlattr *vfports = NULL, *vfport; @@ -390,12 +391,13 @@ virNetDevVPortProfileOpSetLink(const char *ifname, int ifindex, } if (!nltarget_kernel) { - pid = virNetDevVPortProfileGetLldpadPid(); - if (pid == 0) + if ((src_pid = virNetlinkEventServiceLocalPid()) < 0) + goto cleanup; + if ((dst_pid = virNetDevVPortProfileGetLldpadPid()) == 0) goto cleanup; } - if (virNetlinkCommand(nl_msg, &recvbuf, &recvbuflen, 0, pid) < 0) + if (virNetlinkCommand(nl_msg, &recvbuf, &recvbuflen, src_pid, dst_pid) < 0) goto cleanup; if (recvbuflen < NLMSG_LENGTH(0) || recvbuf == NULL) @@ -477,7 +479,7 @@ virNetDevVPortProfileGetNthParent(const char *ifname, int ifindex, unsigned int return -1; while (!end && i <= nthParent) { - rc = virNetDevLinkDump(ifname, ifindex, true, tb, &recvbuf, NULL); + rc = virNetDevLinkDump(ifname, ifindex, tb, &recvbuf, 0, 0); if (rc < 0) break; @@ -524,6 +526,8 @@ virNetDevVPortProfileOpCommon(const char *ifname, int ifindex, bool setlink_only) { int rc; + int src_pid = 0; + uint32_t dst_pid = 0; unsigned char *recvbuf = NULL; struct nlattr *tb[IFLA_MAX + 1] = { NULL , }; int repeats = STATUS_POLL_TIMEOUT_USEC / STATUS_POLL_INTERVL_USEC; @@ -549,9 +553,15 @@ virNetDevVPortProfileOpCommon(const char *ifname, int ifindex, if (setlink_only) /*for re-associations on existing links*/ return 0; + if (!nltarget_kernel && + (((src_pid = virNetlinkEventServiceLocalPid()) < 0) || + ((dst_pid = virNetDevVPortProfileGetLldpadPid()) == 0))) { + rc = -1; + goto cleanup; + } + while (--repeats >= 0) { - rc = virNetDevLinkDump(NULL, ifindex, nltarget_kernel, tb, - &recvbuf, virNetDevVPortProfileGetLldpadPid); + rc = virNetDevLinkDump(NULL, ifindex, tb, &recvbuf, src_pid, dst_pid); if (rc < 0) goto cleanup; -- 1.7.10

On 05/04/2012 12:51 PM, Laine Stump wrote:
Until now, the nl_pid of the source address of every message sent by virNetlinkCommand has been set to the value of getpid(). Most of the time this doesn't matter, and in the one case where it does (communication with lldpad), it previously was the proper thing to do, because the netlink event service (which listens on a netlink socket for unsolicited messages from lldpad) coincidentally always happened to bind with a local nl_pid == getpid().
With the fix for:
https://bugzilla.redhat.com/show_bug.cgi?id=816465
that particular nl_pid is now effectively a reserved value, so the netlink event service will always bind to something else (coincidentally "getpid() + (1 << 22)", but it really could be anything). The result is that communication between lldpad and libvirtd is broken (lldpad gets a "disconnected" error when it tries to send a directed message).
Thanks for the detailed commit message - it will be a lifesaver looking back at this patch down the road.
The solution to this problem cause by a solution, is to query the
s/cause/caused/
netlink event service's nlhandle for its "local_port", and send that as the source nl_pid (but only when sending to lldpad, of course - in other cases we maintain the old behavior of sending getpid()).
There are two cases where a message is being directed at lldpad - one in virNetDevLinkDump, and one in virNetDevVPortProfileOpSetLink.
The case of virNetDevVPortProfileOpSetLink is simplest to explain - only if !nltarget_kernel, i.e. the message isn't targetted for the kernel, is the dst_pid set (by calling virNetDevVPortProfileGetLldpadPid()), so only in that case do we call virNetlinkEvenServiceLocalPid() to set src_pid.
s/EvenService/EventService/
For virNetDevLinkDump, it's a bit more complicated. The call to virNetDevVPortProfileGetLldpadPid() was effectively up one level (in virNetDevVPortProfileOpCommon), although obscured by an unnecessary passing of a function pointer. This patch removes the function pointer, and calls virNetDevVPortProfileGetLldpadPid() directly in virNetDevVPortProfileOpCommon - if it's doing this, it knows that it should also call virNetlinkEventServiceLocalPid() to set src_pid too; then it just passes src_pid and dst_pid down to virNetDevLinkDump. Since (src_pid == 0 && dst_pid == 0) implies that the kernel is the destination, there is no longer any need to send nltarget_kernel as an arg to virNetDevLinkDump, so it's been removed.
The disparity between src_pid being int and dst_pid being uint32_t may be a bit disconcerting to some, but I didn't want to complicate virNetlinkEventServiceLocalPid() by having status returned separately from the value.
I can live with the difference; after all, pid_t is a signed type, precisely because negative pids have special meaning in POSIX, so we already have issues if the most significant bit of a variable labeled 'uint32_t pid' is set. I suppose you could have also used pid==0 as error instead of worrying about int, but what you have is reasonable. ACK to the code aspects; but wait for the test results before pushing. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On Fri, 2012-05-04 at 14:51 -0400, Laine Stump wrote:
As before, I'm unable to fully test this myself, so I won't push unless/until I get verification it works.
We tried various test scenarios, and libvirtd->lldpad and lldpad->libvirtd communication appeared to be functional and stable. -- Best regards, Gerhard Stenzel, ----------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Research & Development GmbH Vorsitzende des Aufsichtsrats: Martina Koederitz Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294

On 05/07/2012 06:14 AM, Gerhard Stenzel wrote:
On Fri, 2012-05-04 at 14:51 -0400, Laine Stump wrote:
As before, I'm unable to fully test this myself, so I won't push unless/until I get verification it works. We tried various test scenarios, and libvirtd->lldpad and lldpad->libvirtd communication appeared to be functional and stable.
Based on this report of successful testing of the 802.1Qbg macvtap code and communication with lldpad, my own successful testing that multiple restarts are reliable, and the reviews, I've pushed this series. Thanks to everyone for their reviews and testing!
participants (3)
-
Eric Blake
-
Gerhard Stenzel
-
Laine Stump