[libvirt] [PATCH] Fix crash in virNetDevGetVirtualFunctions

Commit 9a3ff01d7f16cc280ce3176620c0714f55511a65 (which was ACKed at the end of January, but for some reason didn't get pushed until during the 1.0.4 freeze) fixed the logic in virPCIGetVirtualFunctions(). Unfortunately, a typo in the fix (replacing VIR_REALLOC_N with VIR_ALLOC_N during code movement) caused not only a memory leak, but also resulted in most of the elements of the result array being replaced with NULL. virNetDevGetVirtualFunctions() assumed (and I think rightly so) that virPCIGetVirtualFunctions() wouldn't return any NULL elements in the array, so it ended up segfaulting. This was found when attempting to use a virtual network with an auto-created pool of SRIOV VFs, e.g.: <forward mode='hostdev' managed='yes'> <pf dev='eth4'/> </forward> (the pool of PCI addresses is discovered by calling virNetDevGetVirtualFunctions() on the PF dev). --- src/util/virpci.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/util/virpci.c b/src/util/virpci.c index a0da1cd..85cd694 100644 --- a/src/util/virpci.c +++ b/src/util/virpci.c @@ -2026,8 +2026,8 @@ virPCIGetVirtualFunctions(const char *sysfs_path, continue; } - if (VIR_ALLOC_N(*virtual_functions, - *num_virtual_functions + 1) < 0) { + if (VIR_REALLOC_N(*virtual_functions, + *num_virtual_functions + 1) < 0) { virReportOOMError(); VIR_FREE(config_addr); goto error; -- 1.7.11.7

On 04/09/2013 12:15 PM, Laine Stump wrote:
Commit 9a3ff01d7f16cc280ce3176620c0714f55511a65 (which was ACKed at the end of January, but for some reason didn't get pushed until during the 1.0.4 freeze) fixed the logic in virPCIGetVirtualFunctions(). Unfortunately, a typo in the fix (replacing VIR_REALLOC_N with VIR_ALLOC_N during code movement) caused not only a memory leak, but also resulted in most of the elements of the result array being replaced with NULL. virNetDevGetVirtualFunctions() assumed (and I think rightly so) that virPCIGetVirtualFunctions() wouldn't return any NULL elements in the array, so it ended up segfaulting.
What a difference two characters makes.
src/util/virpci.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/util/virpci.c b/src/util/virpci.c index a0da1cd..85cd694 100644 --- a/src/util/virpci.c +++ b/src/util/virpci.c @@ -2026,8 +2026,8 @@ virPCIGetVirtualFunctions(const char *sysfs_path, continue; }
- if (VIR_ALLOC_N(*virtual_functions, - *num_virtual_functions + 1) < 0) { + if (VIR_REALLOC_N(*virtual_functions, + *num_virtual_functions + 1) < 0) { virReportOOMError();
ACK. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 04/09/2013 02:21 PM, Eric Blake wrote:
On 04/09/2013 12:15 PM, Laine Stump wrote:
Commit 9a3ff01d7f16cc280ce3176620c0714f55511a65 (which was ACKed at the end of January, but for some reason didn't get pushed until during the 1.0.4 freeze) fixed the logic in virPCIGetVirtualFunctions(). Unfortunately, a typo in the fix (replacing VIR_REALLOC_N with VIR_ALLOC_N during code movement) caused not only a memory leak, but also resulted in most of the elements of the result array being replaced with NULL. virNetDevGetVirtualFunctions() assumed (and I think rightly so) that virPCIGetVirtualFunctions() wouldn't return any NULL elements in the array, so it ended up segfaulting. What a difference two characters makes.
And difficult to catch in the middle of a bunch of other changes. It would sure be nice to have a box running a few SRIOV regression tests somewhere. Because it's apparently mostly used only in production environments, it ends up being very late in the cycle when problems are caught (and problems are coming up regardless of whether or not code in libvirt is changed; see https://bugzilla.redhat.com/show_bug.cgi?id=908888 for a Fedora-specific example)
ACK.
Pushed. Thanks! (I noticed there is a v1.0.4-maint branch so (although nothing else had been pushed there since the release of 1.0.4) I pushed this patch to that branch as well.)

On 04/09/2013 12:41 PM, Laine Stump wrote:
(I noticed there is a v1.0.4-maint branch so (although nothing else had been pushed there since the release of 1.0.4) I pushed this patch to that branch as well.)
I pushed a few more patches into that branch. I don't know if Fedora 19 will end up on 1.0.4, or if there is still time to rebase to 1.0.5 before F19 beta freeze hits, but we might as well document the obvious bug fixes. I skipped over some doc-only patches in coming up with this list. c3e33e6 sanlock: add missing test command in virt-sanlock-cleanup.in 6f1b9c8 spec: Require pod2man when running autoreconf 5899e09 build: check correct protocol.o file b1d3154 Ensure LD_PRELOAD exists before running test case 8ad126e rpc: Fix connection close callback race condition and memory corruption/crash 69ab075 virsh: Register and unregister the close callback also in cmdConnect ca9e73e virsh: Move cmdConnect from virsh-host.c to virsh.c e964ba2 virsh: Unregister the connection close notifier upon termination 03a43ef libvirt: Increase connection reference count for callbacks d369e50 storage: Fix volume cloning for logical volume. fc8c178 Enable full RELRO mode 1150999 Build all binaries with PIE 43b6f30 qemu: Fix crash when updating media with shared device deb86ee virsh: Call virDomainFree in cmdDomFSTrim 6f7e4ea smartcard: spell ccid-card-emulated qemu property correctly 9a80050 Resolve valgrind failure When pushing to a maint branch, remember to use 'cherry-pick -x', so that it is obvious which mainline patch you are backporting. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
participants (2)
-
Eric Blake
-
Laine Stump