[PATCHv3 0/3] Use netlink to create veth device pair when netlink is supported

V2 here: https://www.redhat.com/archives/libvir-list/2020-November/msg01239.html Since V2: * Remove the argument 'status' of virNetDevVethCreateInternal * fix a memory leak in virLXCProcessSetupInterfaceTap V1 here: https://www.redhat.com/archives/libvir-list/2020-November/msg00898.html Since V1: * Include <linux/veth.h> rather than introducing enum VETH_INFO_* * Have complete functions within an #ifdefs rather than putting #ifdefs in function Shi Lei (3): util:netlink: Enable virNetlinkNewLink to support veth util:veth: Create veth device pair by netlink lxc: fix a memory leak src/lxc/lxc_process.c | 4 +- src/util/virnetdevveth.c | 126 ++++++++++++++++++++++----------------- src/util/virnetlink.c | 14 +++++ src/util/virnetlink.h | 1 + 4 files changed, 89 insertions(+), 56 deletions(-) -- 2.25.1

Signed-off-by: Shi Lei <shi_lei@massclouds.com> --- src/util/virnetlink.c | 14 ++++++++++++++ src/util/virnetlink.h | 1 + 2 files changed, 15 insertions(+) diff --git a/src/util/virnetlink.c b/src/util/virnetlink.c index fdd3a6a4..8625b896 100644 --- a/src/util/virnetlink.c +++ b/src/util/virnetlink.c @@ -24,6 +24,7 @@ #include <config.h> #include <unistd.h> +#include <linux/veth.h> #include "virnetlink.h" #include "virnetdev.h" @@ -535,6 +536,19 @@ virNetlinkNewLink(const char *ifname, NETLINK_MSG_NEST_END(nl_msg, infodata); } + if (STREQ(type, "veth") && extra_args && extra_args->veth_peer) { + struct nlattr *infoveth = NULL; + + NETLINK_MSG_NEST_START(nl_msg, infodata, IFLA_INFO_DATA); + NETLINK_MSG_NEST_START(nl_msg, infoveth, VETH_INFO_PEER); + nlmsg_reserve(nl_msg, sizeof(struct ifinfomsg), 0); + NETLINK_MSG_PUT(nl_msg, IFLA_IFNAME, + (strlen(extra_args->veth_peer) + 1), + extra_args->veth_peer); + NETLINK_MSG_NEST_END(nl_msg, infoveth); + NETLINK_MSG_NEST_END(nl_msg, infodata); + } + NETLINK_MSG_NEST_END(nl_msg, linkinfo); if (extra_args) { diff --git a/src/util/virnetlink.h b/src/util/virnetlink.h index 7121eac4..7c4ed202 100644 --- a/src/util/virnetlink.h +++ b/src/util/virnetlink.h @@ -84,6 +84,7 @@ struct _virNetlinkNewLinkData { const int *ifindex; /* The index for the 'link' device */ const virMacAddr *mac; /* The MAC address of the device */ const uint32_t *macvlan_mode; /* The mode of macvlan */ + const char *veth_peer; /* The peer name for veth */ }; int virNetlinkNewLink(const char *ifname, -- 2.25.1

On 12/16/20 1:01 AM, Shi Lei wrote:
Signed-off-by: Shi Lei <shi_lei@massclouds.com> --- src/util/virnetlink.c | 14 ++++++++++++++ src/util/virnetlink.h | 1 + 2 files changed, 15 insertions(+)
diff --git a/src/util/virnetlink.c b/src/util/virnetlink.c index fdd3a6a4..8625b896 100644 --- a/src/util/virnetlink.c +++ b/src/util/virnetlink.c @@ -24,6 +24,7 @@ #include <config.h>
#include <unistd.h> +#include <linux/veth.h>
^^ This #include had to be moved down inside the #if defined(WITH_LIBNL) to avoid build failures on mingw, MacOS, and FreeBSD. I squashed in that change before pushing.
#include "virnetlink.h" #include "virnetdev.h" @@ -535,6 +536,19 @@ virNetlinkNewLink(const char *ifname, NETLINK_MSG_NEST_END(nl_msg, infodata); }
+ if (STREQ(type, "veth") && extra_args && extra_args->veth_peer) { + struct nlattr *infoveth = NULL; + + NETLINK_MSG_NEST_START(nl_msg, infodata, IFLA_INFO_DATA); + NETLINK_MSG_NEST_START(nl_msg, infoveth, VETH_INFO_PEER); + nlmsg_reserve(nl_msg, sizeof(struct ifinfomsg), 0); + NETLINK_MSG_PUT(nl_msg, IFLA_IFNAME, + (strlen(extra_args->veth_peer) + 1), + extra_args->veth_peer); + NETLINK_MSG_NEST_END(nl_msg, infoveth); + NETLINK_MSG_NEST_END(nl_msg, infodata); + } + NETLINK_MSG_NEST_END(nl_msg, linkinfo);
if (extra_args) { diff --git a/src/util/virnetlink.h b/src/util/virnetlink.h index 7121eac4..7c4ed202 100644 --- a/src/util/virnetlink.h +++ b/src/util/virnetlink.h @@ -84,6 +84,7 @@ struct _virNetlinkNewLinkData { const int *ifindex; /* The index for the 'link' device */ const virMacAddr *mac; /* The MAC address of the device */ const uint32_t *macvlan_mode; /* The mode of macvlan */ + const char *veth_peer; /* The peer name for veth */ };
int virNetlinkNewLink(const char *ifname,

When netlink is supported, use netlink to create veth device pair rather than 'ip link' command. Signed-off-by: Shi Lei <shi_lei@massclouds.com> --- src/util/virnetdevveth.c | 126 ++++++++++++++++++++++----------------- 1 file changed, 72 insertions(+), 54 deletions(-) diff --git a/src/util/virnetdevveth.c b/src/util/virnetdevveth.c index 194f595a..7133af44 100644 --- a/src/util/virnetdevveth.c +++ b/src/util/virnetdevveth.c @@ -27,21 +27,69 @@ #include "virfile.h" #include "virstring.h" #include "virnetdev.h" +#include "virnetlink.h" #define VIR_FROM_THIS VIR_FROM_NONE VIR_LOG_INIT("util.netdevveth"); +#if defined(WITH_LIBNL) +static int +virNetDevVethCreateInternal(const char *veth1, const char *veth2) +{ + int status; /* Just ignore it */ + virNetlinkNewLinkData data = { .veth_peer = veth2 }; + + return virNetlinkNewLink(veth1, "veth", &data, &status); +} + +static int +virNetDevVethDeleteInternal(const char *veth) +{ + return virNetlinkDelLink(veth, NULL); +} +#else +static int +virNetDevVethCreateInternal(const char *veth1, const char *veth2) +{ + g_autoptr(virCommand) cmd = virCommandNew("ip"); + virCommandAddArgList(cmd, "link", "add", veth1, "type", "veth", + "peer", "name", veth2, NULL); + + return virCommandRun(cmd, NULL); +} + +static int +virNetDevVethDeleteInternal(const char *veth) +{ + int status; + g_autoptr(virCommand) cmd = virCommandNewArgList("ip", "link", + "del", veth, NULL); + + if (virCommandRun(cmd, &status) < 0) + return -1; + + if (status != 0) { + if (!virNetDevExists(veth)) { + VIR_DEBUG("Device %s already deleted (by kernel namespace cleanup)", veth); + return 0; + } + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Failed to delete veth device %s"), veth); + return -1; + } + + return 0; +} +#endif /* WITH_LIBNL */ + /** * virNetDevVethCreate: - * @veth1: pointer to name for parent end of veth pair - * @veth2: pointer to return name for container end of veth pair + * @veth1: pointer to name for one end of veth pair + * @veth2: pointer to name for another end of veth pair * - * Creates a veth device pair using the ip command: - * ip link add veth1 type veth peer name veth2 - * If veth1 points to NULL on entry, it will be a valid interface on - * return. veth2 should point to NULL on entry. + * Creates a veth device pair. * * NOTE: If veth1 and veth2 names are not specified, ip will auto assign * names. There seems to be two problems here - @@ -58,44 +106,31 @@ VIR_LOG_INIT("util.netdevveth"); * * Returns 0 on success or -1 in case of error */ -int virNetDevVethCreate(char** veth1, char** veth2) +int virNetDevVethCreate(char **veth1, char **veth2) { - int status; - g_autofree char *veth1auto = NULL; - g_autofree char *veth2auto = NULL; - g_autoptr(virCommand) cmd = NULL; + const char *orig1 = *veth1; + const char *orig2 = *veth2; - if (virNetDevGenerateName(&veth1auto, VIR_NET_DEV_GEN_NAME_VNET) < 0) - return -1; + if (virNetDevGenerateName(veth1, VIR_NET_DEV_GEN_NAME_VNET) < 0) + goto cleanup; - if (virNetDevGenerateName(&veth2auto, VIR_NET_DEV_GEN_NAME_VNET) < 0) - return -1; + if (virNetDevGenerateName(veth2, VIR_NET_DEV_GEN_NAME_VNET) < 0) + goto cleanup; - cmd = virCommandNew("ip"); - virCommandAddArgList(cmd, "link", "add", - *veth1 ? *veth1 : veth1auto, - "type", "veth", "peer", "name", - *veth2 ? *veth2 : veth2auto, - NULL); + if (virNetDevVethCreateInternal(*veth1, *veth2) < 0) + goto cleanup; - if (virCommandRun(cmd, &status) < 0 || status) { - virReportError(VIR_ERR_INTERNAL_ERROR, - "%s", _("Failed to allocate free veth pair")); - return -1; - } + VIR_DEBUG("Create Host: %s guest: %s", *veth1, *veth2); + return 0; - VIR_DEBUG("create veth host: %s guest: %s: %d", - *veth1 ? *veth1 : veth1auto, - *veth2 ? *veth2 : veth2auto, - status); + cleanup: + if (orig1 == NULL) + VIR_FREE(*veth1); - if (veth1auto) - *veth1 = g_steal_pointer(&veth1auto); - if (veth2auto) - *veth2 = g_steal_pointer(&veth2auto); + if (orig2 == NULL) + VIR_FREE(*veth2); - VIR_DEBUG("Create Host: %s guest: %s", *veth1, *veth2); - return 0; + return -1; } /** @@ -111,22 +146,5 @@ int virNetDevVethCreate(char** veth1, char** veth2) */ int virNetDevVethDelete(const char *veth) { - int status; - g_autoptr(virCommand) cmd = virCommandNewArgList("ip", "link", - "del", veth, NULL); - - if (virCommandRun(cmd, &status) < 0) - return -1; - - if (status != 0) { - if (!virNetDevExists(veth)) { - VIR_DEBUG("Device %s already deleted (by kernel namespace cleanup)", veth); - return 0; - } - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Failed to delete veth device %s"), veth); - return -1; - } - - return 0; + return virNetDevVethDeleteInternal(veth); } -- 2.25.1

In virLXCProcessSetupInterfaceTap, containerVeth needs to be freed on failure. Signed-off-by: Shi Lei <shi_lei@massclouds.com> --- src/lxc/lxc_process.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/lxc/lxc_process.c b/src/lxc/lxc_process.c index 85d0287a..0f7c9295 100644 --- a/src/lxc/lxc_process.c +++ b/src/lxc/lxc_process.c @@ -303,7 +303,7 @@ virLXCProcessSetupInterfaceTap(virDomainDefPtr vm, const char *brname) { char *parentVeth; - char *containerVeth = NULL; + g_autofree char *containerVeth = NULL; const virNetDevVPortProfile *vport = virDomainNetGetActualVirtPortProfile(net); VIR_DEBUG("calling vethCreate()"); @@ -357,7 +357,7 @@ virLXCProcessSetupInterfaceTap(virDomainDefPtr vm, virDomainConfNWFilterInstantiate(vm->name, vm->uuid, net, false) < 0) return NULL; - return containerVeth; + return g_steal_pointer(&containerVeth); } -- 2.25.1

Coverity reminds us of the ancient software engineering proverb related to being stuck with ownership because you touched the code last :-) - I know this patch didn't cause the problem, but because the code was touched Coverity decided to look harder and found another leak. On 12/16/20 1:01 AM, Shi Lei wrote:
In virLXCProcessSetupInterfaceTap, containerVeth needs to be freed on failure.
Signed-off-by: Shi Lei <shi_lei@massclouds.com> --- src/lxc/lxc_process.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/lxc/lxc_process.c b/src/lxc/lxc_process.c index 85d0287a..0f7c9295 100644 --- a/src/lxc/lxc_process.c +++ b/src/lxc/lxc_process.c @@ -303,7 +303,7 @@ virLXCProcessSetupInterfaceTap(virDomainDefPtr vm, const char *brname) { char *parentVeth;
Coverity complains that @parentVeth is leaked too - although it's far more opaque and ornery. Let's assume on input that net->ifname != NULL - that means that in virNetDevVethCreate the call to virNetDevGenerateName and the memory in @**ifname (e.g. @parentVeth's copy of net->ifname) will be g_free()'d when !virNetDevExists(try) succeeds and @*ifname gets the memory from @try. In this case @try is a g_strdup_printf of @*ifname. So this code essentially free's memory pointed to by net->ifname, but since @parentVeth is used in the call net->ifname is *NOT* updated which is a second problem that Coverity did not note. Then let's say the call to virNetDevGenerateName fails for @veth2... Since on input @orig1 != NULL (e.g. @parentVeth != NULL), we will not VIR_FREE(*veth1); - that's fine given the original assumption, but when we return to virLXCProcessSetupInterfaceTap that means net->ifname will point at memory that was g_free'd and @parentVeth will not be free'd, thus Coverity squawks loud and proud about the resource leak - although it doesn't complain about the fact that net->ifname now points to memory that was free'd. As an aside, I see no way on input @*veth2 could not be NULL so in the cleanup path the check for @orig2 would seem to be dead code. Although, sure future changes could alter that reality. As a test if I replace all @parentVeth refs w/ net->ifname - then Coverity is happy again. I will leave it up to Laine or Shi Lei to generate the "real fix" and/or validate that my reading of the logic is right or not ;-). John
- char *containerVeth = NULL; + g_autofree char *containerVeth = NULL; const virNetDevVPortProfile *vport = virDomainNetGetActualVirtPortProfile(net);
VIR_DEBUG("calling vethCreate()"); @@ -357,7 +357,7 @@ virLXCProcessSetupInterfaceTap(virDomainDefPtr vm, virDomainConfNWFilterInstantiate(vm->name, vm->uuid, net, false) < 0) return NULL;
- return containerVeth; + return g_steal_pointer(&containerVeth); }

On 2020-12-18 at 22:01, John Ferlan wrote:
Coverity reminds us of the ancient software engineering proverb related
to being stuck with ownership because you touched the code last :-) - I
know this patch didn't cause the problem, but because the code was
touched Coverity decided to look harder and found another leak.
On 12/16/20 1:01 AM, Shi Lei wrote:
In virLXCProcessSetupInterfaceTap, containerVeth needs to be freed on
failure.
Signed-off-by: Shi Lei <shi_lei@massclouds.com>
---
src/lxc/lxc_process.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/lxc/lxc_process.c b/src/lxc/lxc_process.c
index 85d0287a..0f7c9295 100644
--- a/src/lxc/lxc_process.c
+++ b/src/lxc/lxc_process.c
@@ -303,7 +303,7 @@ virLXCProcessSetupInterfaceTap(virDomainDefPtr vm,
const char *brname)
{
char *parentVeth;
Coverity complains that @parentVeth is leaked too - although it's far
more opaque and ornery.
Let's assume on input that net->ifname != NULL - that means that in
virNetDevVethCreate the call to virNetDevGenerateName and the memory in
@**ifname (e.g. @parentVeth's copy of net->ifname) will be g_free()'d
when !virNetDevExists(try) succeeds and @*ifname gets the memory from
@try. In this case @try is a g_strdup_printf of @*ifname. So this code
essentially free's memory pointed to by net->ifname, but since
@parentVeth is used in the call net->ifname is *NOT* updated which is a
second problem that Coverity did not note.
Then let's say the call to virNetDevGenerateName fails for @veth2...
Since on input @orig1 != NULL (e.g. @parentVeth != NULL), we will not
VIR_FREE(*veth1); - that's fine given the original assumption, but when
we return to virLXCProcessSetupInterfaceTap that means net->ifname will
point at memory that was g_free'd and @parentVeth will not be free'd,
thus Coverity squawks loud and proud about the resource leak - although
it doesn't complain about the fact that net->ifname now points to memory
that was free'd.
As an aside, I see no way on input @*veth2 could not be NULL so in the
cleanup path the check for @orig2 would seem to be dead code. Although,
sure future changes could alter that reality.
As a test if I replace all @parentVeth refs w/ net->ifname - then
Coverity is happy again. I will leave it up to Laine or Shi Lei to
generate the "real fix" and/or validate that my reading of the logic is
right or not ;-).
John
Thanks! :-) As far as I can see, if the original net->ifname != NULL, it should be a user-provided name and NOT a template (prefix+'%d'). When @parentVeth (as the copy of net->ifname) is passed into virNetDevGenerateName, this function will leave it unchanged because it is not a template. So for now these problems will not happen. But for future, I think it's necessary to fix virLXCProcessSetupInterfaceTap. I agree that it's a right way to replace all *parentVeth* with net->ifname in virLXCProcessSetupInterfaceTap. And the *parentVeth* should be removed. Shi Lei

On 12/18/20 10:45 PM, Shi Lei wrote:
On 2020-12-18 at 22:01, John Ferlan wrote:
In virLXCProcessSetupInterfaceTap, containerVeth needs to be freed on failure. Signed-off-by: Shi Lei <shi_lei@massclouds.com> --- src/lxc/lxc_process.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/lxc/lxc_process.c b/src/lxc/lxc_process.c index 85d0287a..0f7c9295 100644 --- a/src/lxc/lxc_process.c +++ b/src/lxc/lxc_process.c @@ -303,7 +303,7 @@ virLXCProcessSetupInterfaceTap(virDomainDefPtr vm, const char *brname) { char *parentVeth; Coverity complains that @parentVeth is leaked too - although it's far more opaque and ornery. Let's assume on input that net->ifname != NULL - that means that in virNetDevVethCreate the call to virNetDevGenerateName and the memory in @**ifname (e.g. @parentVeth's copy of net->ifname) will be g_free()'d when !virNetDevExists(try) succeeds and @*ifname gets the memory from @try. In this case @try is a g_strdup_printf of @*ifname. So this code essentially free's memory pointed to by net->ifname, but since @parentVeth is used in the call net->ifname is *NOT* updated which is a second problem that Coverity did not note. Then let's say the call to virNetDevGenerateName fails for @veth2... Since on input @orig1 != NULL (e.g. @parentVeth != NULL), we will not VIR_FREE(*veth1); - that's fine given the original assumption, but when we return to virLXCProcessSetupInterfaceTap that means net->ifname will
Coverity reminds us of the ancient software engineering proverb related to being stuck with ownership because you touched the code last :-) - I know this patch didn't cause the problem, but because the code was touched Coverity decided to look harder and found another leak. On 12/16/20 1:01 AM, Shi Lei wrote: point at memory that was g_free'd and @parentVeth will not be free'd, thus Coverity squawks loud and proud about the resource leak - although it doesn't complain about the fact that net->ifname now points to memory that was free'd. As an aside, I see no way on input @*veth2 could not be NULL so in the cleanup path the check for @orig2 would seem to be dead code. Although, sure future changes could alter that reality. As a test if I replace all @parentVeth refs w/ net->ifname - then Coverity is happy again. I will leave it up to Laine or Shi Lei to generate the "real fix" and/or validate that my reading of the logic is right or not ;-). John Thanks! :-)
As far as I can see, if the original net->ifname != NULL, it should be a user-provided name and NOT a template (prefix+'%d'). When @parentVeth (as the copy of net->ifname) is passed into virNetDevGenerateName, this function will leave it unchanged because it is not a template.
So for now these problems will not happen. But for future, I think it's necessary to fix virLXCProcessSetupInterfaceTap.
I agree that it's a right way to replace all *parentVeth* with net->ifname in virLXCProcessSetupInterfaceTap. And the *parentVeth* should be removed.
I tried something a bit different: https://www.redhat.com/archives/libvir-list/2021-January/msg00311.html

On 12/16/20 1:01 AM, Shi Lei wrote:
V2 here: https://www.redhat.com/archives/libvir-list/2020-November/msg01239.html Since V2: * Remove the argument 'status' of virNetDevVethCreateInternal * fix a memory leak in virLXCProcessSetupInterfaceTap
V1 here: https://www.redhat.com/archives/libvir-list/2020-November/msg00898.html Since V1: * Include <linux/veth.h> rather than introducing enum VETH_INFO_* * Have complete functions within an #ifdefs rather than putting #ifdefs in function
Shi Lei (3): util:netlink: Enable virNetlinkNewLink to support veth util:veth: Create veth device pair by netlink lxc: fix a memory leak
src/lxc/lxc_process.c | 4 +- src/util/virnetdevveth.c | 126 ++++++++++++++++++++++----------------- src/util/virnetlink.c | 14 +++++ src/util/virnetlink.h | 1 + 4 files changed, 89 insertions(+), 56 deletions(-)
Reviewed-by: Laine Stump <laine@redhat.com> for all 3 (with one tiny change to patch 1 to fix build failure in mingw, MacOS and FreeBSD). They're pushed now. Thanks!
participants (3)
-
John Ferlan
-
Laine Stump
-
Shi Lei