On 04/24/2012 12:05 PM, Michal Privoznik wrote:
On 24.04.2012 17:52, Michal Privoznik wrote:
> Currently, we are calling memcpy(virtPortProfile, ...) unconditionally.
Actually, we're not - right at the top of the function it already has:
if (virtPortProfile && virNetLinkEventServiceIsRunning)
It's a very common occurence that virtPortProfile is NULL.
Are you using an out-of-date source directory?
> Which means if virtPortProfile is NULL we SIGSEGV. Therefore, add
check
> to call memcpy() conditionally.
> (gdb) bt
> #0 virNetDevMacVLanVPortProfileRegisterCallback (ifname=0x7ffff3f1ee60
"macvtap0", macaddress=0x7fffe8096e34 "RT", linkdev=0x7fffe8098b90
"eth0",
> vmuuid=0x7fffe8099558
"l\220*\237u`\326\213\325\227F\f-B\bD0g\t\350\377\177", virtPortProfile=0x0,
vmOp=VIR_NETDEV_VPORT_PROFILE_OP_CREATE) at /usr/include/bits/string3.h:52
> #1 0x00007ffff7782446 in virNetDevMacVLanCreateWithVPortProfile
(tgifname=<optimized out>, macaddress=0x7fffe8096e34 "RT",
linkdev=0x7fffe8098b90 "eth0", mode=<optimized out>, withTap=true,
vnet_hdr=0,
> vmuuid=0x7fffe8099558
"l\220*\237u`\326\213\325\227F\f-B\bD0g\t\350\377\177", virtPortProfile=0x0,
res_ifname=0x7ffff3f1ef00, vmOp=VIR_NETDEV_VPORT_PROFILE_OP_CREATE,
> stateDir=0x7fffe800d900 "/var/run/libvirt/qemu", bandwidth=0x0) at
util/virnetdevmacvlan.c:954
> #2 0x00000000004738d4 in qemuPhysIfaceConnect ()
> #3 0x000000000047e5f3 in qemuBuildCommandLine ()
> #4 0x000000000049a404 in qemuProcessStart ()
> #5 0x0000000000468ac9 in qemuDomainObjStart ()
> #6 0x0000000000469112 in qemuDomainStartWithFlags ()
> #7 0x00007ffff77ef0c6 in virDomainCreate (domain=0x768cb0) at libvirt.c:8162
> #8 0x000000000043e158 in remoteDispatchDomainCreateHelper ()
> #9 0x00007ffff783d835 in virNetServerProgramDispatchCall (msg=0x7fffe80cbe50,
client=0x7699d0, server=0x7658a0, prog=0x770ab0) at rpc/virnetserverprogram.c:416
> #10 virNetServerProgramDispatch (prog=0x770ab0, server=0x7658a0, client=0x7699d0,
msg=0x7fffe80cbe50) at rpc/virnetserverprogram.c:289
> #11 0x00007ffff7839961 in virNetServerHandleJob (jobOpaque=<optimized out>,
opaque=0x7658a0) at rpc/virnetserver.c:161
> #12 0x00007ffff776c1ce in virThreadPoolWorker (opaque=<optimized out>) at
util/threadpool.c:144
> #13 0x00007ffff776b876 in virThreadHelper (data=<optimized out>) at
util/threads-pthread.c:161
> #14 0x00007ffff74f0e2c in start_thread () from /lib64/libpthread.so.0
> #15 0x00007ffff723766d in clone () from /lib64/libc.so.6
> ---
> src/util/virnetdevmacvlan.c | 8 +++++---
> 1 files changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/src/util/virnetdevmacvlan.c b/src/util/virnetdevmacvlan.c
> index 879d846..c2fd6d0 100644
> --- a/src/util/virnetdevmacvlan.c
> +++ b/src/util/virnetdevmacvlan.c
> @@ -769,9 +769,11 @@ virNetDevMacVLanVPortProfileRegisterCallback(const char
*ifname,
> goto memory_error;
> if ((calld->cr_ifname = strdup(ifname)) == NULL)
> goto memory_error;
> - if (VIR_ALLOC(calld->virtPortProfile) < 0)
> - goto memory_error;
> - memcpy(calld->virtPortProfile, virtPortProfile,
sizeof(*virtPortProfile));
> + if (virtPortProfile) {
> + if (VIR_ALLOC(calld->virtPortProfile) < 0)
> + goto memory_error;
> + memcpy(calld->virtPortProfile, virtPortProfile,
sizeof(*virtPortProfile));
> + }
> if (VIR_ALLOC_N(calld->macaddress, VIR_MAC_BUFLEN) < 0)
> goto memory_error;
> memcpy(calld->macaddress, macaddress, VIR_MAC_BUFLEN);
Self NACK; It turned out to be race because it's still reproducible in
some cases: "if(virPortProfile)" evaluates to true; however gdb still
catches SIGSEGV in memcpy(). Meanwhile, something free() virPortProfile,
therefore gdb is unable to print any non-NULL value.
Maybe it's related to qemu process dying right after it was started.
What is your test scenario? Are you just starting and stopping guests?
Or are you migrating them or something more complicated like that?
The above bt doesn't seem like it could wind up even getting into the
main if() statement of the function (because virtPortProfile is NULL) -
was that bt actually collected at the time of the segv?
And even if virtPortProfile wasn't NULL, but was a valid pointer, just
having it be freed by a different thread prior to the memcpy wouldn't
cause a segv - it's still a valid pointer, we just no longer own the
memory it points to.
The two possibilities I can see:
1) maybe a bogus (but non-NULL) virtPortProfile is being passed in
2) possibly it's segv'ing on one of the other calls to memcpy - we
aren't checking for a NULL macaddress or vmuuid (although they shouldn't
ever be NULL).