[libvirt] RHBZ 1013045: Crash on xen domain startup: *** Error in `/usr/sbin/libvirtd': free(): invalid next size (fast): 0x00007f82c8003210 ***

Hi all, I posted this bug (https://bugzilla.redhat.com/show_bug.cgi?id=1013045) to the Redhat Bugzilla a while ago, and the only response has been to post a note to this list about the bug. Summary below, but it looks like a pretty clear use-after-free or something. The full details are attached to the bug report. Thanks, J -- Description of problem: When starting a Xen domain with libvirt + libxl, it crashes after creating the domain. The domain is left in a paused state, and works fine if manually unpaused with xl unpause. virt-manager never shows the domain as running. [...] Steps to Reproduce: 1. Open virt-manager 2. Connect to localhost (xen) 3. Start a domain Actual results: Domain is created in a paused state, virt-manager shows errors about losing connection to the daemon. Logs show libvirtd crashed. Expected results: Domain creation. Additional info: Sep 27 09:08:30 saboo libvirtd[24880]: *** Error in `/usr/sbin/libvirtd': free(): invalid next size (fast): 0x00007f82c8003210 *** Sep 27 09:08:30 saboo libvirtd[24880]: ======= Backtrace: ========= Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libc.so.6(+0x365b27d0e8)[0x7f82f5a7a0e8] Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libvirt.so.0(virFree+0x1a)[0x7f82f8f07d5a] Sep 27 09:08:30 saboo libvirtd[24880]: /usr/lib64/libvirt/connection-driver/libvirt_driver_libxl.so(+0x14b6c)[0x7f82e032bb6c] Sep 27 09:08:30 saboo libvirtd[24880]: /usr/lib64/libvirt/connection-driver/libvirt_driver_libxl.so(+0x154d4)[0x7f82e032c4d4] Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libvirt.so.0(virDomainCreate+0xf7)[0x7f82f8fdb6b7] Sep 27 09:08:30 saboo libvirtd[24880]: /usr/sbin/libvirtd(+0x350c7)[0x7f82f9a1a0c7] Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libvirt.so.0(virNetServerProgramDispatch+0x3ba)[0x7f82f90314aa] Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libvirt.so.0(+0x3a33f822d8)[0x7f82f902c2d8] Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libvirt.so.0(+0x3a33ea0c15)[0x7f82f8f4ac15] Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libvirt.so.0(+0x3a33ea0691)[0x7f82f8f4a691] Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libpthread.so.0(+0x365ba07c53)[0x7f82f61ccc53] Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libc.so.6(clone+0x6d)[0x7f82f5af2d3d]

On Wed, Oct 23, 2013 at 10:46:14AM -0700, Jeremy Fitzhardinge wrote:
Hi all,
I posted this bug (https://bugzilla.redhat.com/show_bug.cgi?id=1013045) to the Redhat Bugzilla a while ago, and the only response has been to post a note to this list about the bug.
Summary below, but it looks like a pretty clear use-after-free or something. The full details are attached to the bug report.
From the looks of the BZ, I think the probnlem found by valgrind (not the one in libxl) is a different than the one which causes the 'invalid free()', but anyway, I posted a patch [1] which might help (read: fixes a problem found out thanks to the valgrind output), but I have no way to test it. If you do, I would appreciate you trying whether the issue gets fixed for you with that patch. Thank you, Martin [1] https://www.redhat.com/archives/libvir-list/2013-October/msg01075.html
Thanks,
J
-- Description of problem: When starting a Xen domain with libvirt + libxl, it crashes after creating the domain. The domain is left in a paused state, and works fine if manually unpaused with xl unpause. virt-manager never shows the domain as running.
[...]
Steps to Reproduce: 1. Open virt-manager 2. Connect to localhost (xen) 3. Start a domain
Actual results: Domain is created in a paused state, virt-manager shows errors about losing connection to the daemon. Logs show libvirtd crashed.
Expected results: Domain creation.
Additional info: Sep 27 09:08:30 saboo libvirtd[24880]: *** Error in `/usr/sbin/libvirtd': free(): invalid next size (fast): 0x00007f82c8003210 *** Sep 27 09:08:30 saboo libvirtd[24880]: ======= Backtrace: ========= Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libc.so.6(+0x365b27d0e8)[0x7f82f5a7a0e8] Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libvirt.so.0(virFree+0x1a)[0x7f82f8f07d5a] Sep 27 09:08:30 saboo libvirtd[24880]: /usr/lib64/libvirt/connection-driver/libvirt_driver_libxl.so(+0x14b6c)[0x7f82e032bb6c] Sep 27 09:08:30 saboo libvirtd[24880]: /usr/lib64/libvirt/connection-driver/libvirt_driver_libxl.so(+0x154d4)[0x7f82e032c4d4] Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libvirt.so.0(virDomainCreate+0xf7)[0x7f82f8fdb6b7] Sep 27 09:08:30 saboo libvirtd[24880]: /usr/sbin/libvirtd(+0x350c7)[0x7f82f9a1a0c7] Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libvirt.so.0(virNetServerProgramDispatch+0x3ba)[0x7f82f90314aa] Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libvirt.so.0(+0x3a33f822d8)[0x7f82f902c2d8] Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libvirt.so.0(+0x3a33ea0c15)[0x7f82f8f4ac15] Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libvirt.so.0(+0x3a33ea0691)[0x7f82f8f4a691] Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libpthread.so.0(+0x365ba07c53)[0x7f82f61ccc53] Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libc.so.6(clone+0x6d)[0x7f82f5af2d3d]
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512 On 10/24/2013 02:52 AM, Martin Kletzander wrote:
On Wed, Oct 23, 2013 at 10:46:14AM -0700, Jeremy Fitzhardinge wrote:
Hi all,
I posted this bug (https://bugzilla.redhat.com/show_bug.cgi?id=1013045) to the Redhat Bugzilla a while ago, and the only response has been to post a note to this list about the bug.
Summary below, but it looks like a pretty clear use-after-free or something. The full details are attached to the bug report.
From the looks of the BZ, I think the probnlem found by valgrind (not the one in libxl) is a different than the one which causes the 'invalid free()', but anyway, I posted a patch [1] which might help (read: fixes a problem found out thanks to the valgrind output), but I have no way to test it. If you do, I would appreciate you trying whether the issue gets fixed for you with that patch.
Thanks, I'll give it a try when I get the chance. I agree that none of the Valgrind messages really point to why I'm getting invalid frees, or why valgrind itself is crashing. The invalid memory accesses are all reads, so while not good, don't explain the symptom. I think SVN Valgrind has more Xen support in it, so I'm going to try that. J
Thank you, Martin
[1] https://www.redhat.com/archives/libvir-list/2013-October/msg01075.html
Thanks,
J
-- Description of problem: When starting a Xen domain with libvirt + libxl, it crashes after creating the domain. The domain is left in a paused state, and works fine if manually unpaused with xl unpause. virt-manager never shows the domain as running.
[...]
Steps to Reproduce: 1. Open virt-manager 2. Connect to localhost (xen) 3. Start a domain
Actual results: Domain is created in a paused state, virt-manager shows errors about losing connection to the daemon. Logs show libvirtd crashed.
Expected results: Domain creation.
Additional info: Sep 27 09:08:30 saboo libvirtd[24880]: *** Error in `/usr/sbin/libvirtd': free(): invalid next size (fast): 0x00007f82c8003210 *** Sep 27 09:08:30 saboo libvirtd[24880]: ======= Backtrace: ========= Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libc.so.6(+0x365b27d0e8)[0x7f82f5a7a0e8] Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libvirt.so.0(virFree+0x1a)[0x7f82f8f07d5a] Sep 27 09:08:30 saboo libvirtd[24880]:
/usr/lib64/libvirt/connection-driver/libvirt_driver_libxl.so(+0x14b6c)[0x7f82e032bb6c]
Sep 27 09:08:30 saboo libvirtd[24880]:
/usr/lib64/libvirt/connection-driver/libvirt_driver_libxl.so(+0x154d4)[0x7f82e032c4d4]
Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libvirt.so.0(virDomainCreate+0xf7)[0x7f82f8fdb6b7] Sep 27 09:08:30 saboo libvirtd[24880]: /usr/sbin/libvirtd(+0x350c7)[0x7f82f9a1a0c7] Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libvirt.so.0(virNetServerProgramDispatch+0x3ba)[0x7f82f90314aa] Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libvirt.so.0(+0x3a33f822d8)[0x7f82f902c2d8] Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libvirt.so.0(+0x3a33ea0c15)[0x7f82f8f4ac15] Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libvirt.so.0(+0x3a33ea0691)[0x7f82f8f4a691] Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libpthread.so.0(+0x365ba07c53)[0x7f82f61ccc53] Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libc.so.6(clone+0x6d)[0x7f82f5af2d3d]
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
-----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.15 (GNU/Linux) iQEkBAEBCgAGBQJSaU6eAAoJEAUkni6MUg7HY2MIOwdamON9ggteRT+FVO6cPISE FEmP/75Hu06SqdcnNw8agwha4ZYBG5JpdrUgWcpudbg4A2XUVsrRXWLJOukaF8t7 d5OgZ9lKOU9Hv0o3+kDK+Yh6KWu9NwnvxoTtX+Ft+z+9vwARtL1JBIfuIcXegT9m eV0A0M+mI3x0cp2PwnQepzJwxA3IOh9PtbP+3K+ydm/sU3Tiv/Qn9HEpgkR4AEOk S6xTrJ2pPwi6/+/Tan7ya4Xcsyma2YTg0mu2dYkQighsSTp6yqI/fE2DFzsV6aJC SDkdqlmxDzm1+bM5ybt8Afukvp1/wZJLR0Hk4TqggWiAxNpA+3j1TGt2VqsUUZWo /lvHGs3KKQ== =PzKp -----END PGP SIGNATURE-----

On 10/24/2013 02:52 AM, Martin Kletzander wrote:
On Wed, Oct 23, 2013 at 10:46:14AM -0700, Jeremy Fitzhardinge wrote:
Hi all,
I posted this bug (https://bugzilla.redhat.com/show_bug.cgi?id=1013045) to the Redhat Bugzilla a while ago, and the only response has been to post a note to this list about the bug.
Summary below, but it looks like a pretty clear use-after-free or something. The full details are attached to the bug report.
From the looks of the BZ, I think the probnlem found by valgrind (not the one in libxl) is a different than the one which causes the 'invalid free()', but anyway, I posted a patch [1] which might help (read: fixes a problem found out thanks to the valgrind output), but I have no way to test it. If you do, I would appreciate you trying whether the issue gets fixed for you with that patch.
Actually, the (original) code in question looks completely bogus. It's casting a virBitmap * into a uint8_t * and then inspecting it byte-by-byte, whereas it looks like it should be using the bit-test API. I'm reworking it at the moment. J
Thank you, Martin
[1] https://www.redhat.com/archives/libvir-list/2013-October/msg01075.html
Thanks,
J
-- Description of problem: When starting a Xen domain with libvirt + libxl, it crashes after creating the domain. The domain is left in a paused state, and works fine if manually unpaused with xl unpause. virt-manager never shows the domain as running.
[...]
Steps to Reproduce: 1. Open virt-manager 2. Connect to localhost (xen) 3. Start a domain
Actual results: Domain is created in a paused state, virt-manager shows errors about losing connection to the daemon. Logs show libvirtd crashed.
Expected results: Domain creation.
Additional info: Sep 27 09:08:30 saboo libvirtd[24880]: *** Error in `/usr/sbin/libvirtd': free(): invalid next size (fast): 0x00007f82c8003210 *** Sep 27 09:08:30 saboo libvirtd[24880]: ======= Backtrace: ========= Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libc.so.6(+0x365b27d0e8)[0x7f82f5a7a0e8] Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libvirt.so.0(virFree+0x1a)[0x7f82f8f07d5a] Sep 27 09:08:30 saboo libvirtd[24880]:
/usr/lib64/libvirt/connection-driver/libvirt_driver_libxl.so(+0x14b6c)[0x7f82e032bb6c]
Sep 27 09:08:30 saboo libvirtd[24880]:
/usr/lib64/libvirt/connection-driver/libvirt_driver_libxl.so(+0x154d4)[0x7f82e032c4d4]
Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libvirt.so.0(virDomainCreate+0xf7)[0x7f82f8fdb6b7] Sep 27 09:08:30 saboo libvirtd[24880]: /usr/sbin/libvirtd(+0x350c7)[0x7f82f9a1a0c7] Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libvirt.so.0(virNetServerProgramDispatch+0x3ba)[0x7f82f90314aa] Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libvirt.so.0(+0x3a33f822d8)[0x7f82f902c2d8] Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libvirt.so.0(+0x3a33ea0c15)[0x7f82f8f4ac15] Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libvirt.so.0(+0x3a33ea0691)[0x7f82f8f4a691] Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libpthread.so.0(+0x365ba07c53)[0x7f82f61ccc53] Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libc.so.6(clone+0x6d)[0x7f82f5af2d3d]
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512 On 10/24/2013 02:52 AM, Martin Kletzander wrote:
On Wed, Oct 23, 2013 at 10:46:14AM -0700, Jeremy Fitzhardinge wrote:
Hi all,
I posted this bug (https://bugzilla.redhat.com/show_bug.cgi?id=1013045) to the Redhat Bugzilla a while ago, and the only response has been to post a note to this list about the bug.
Summary below, but it looks like a pretty clear use-after-free or something. The full details are attached to the bug report.
From the looks of the BZ, I think the probnlem found by valgrind (not the one in libxl) is a different than the one which causes the 'invalid free()', but anyway, I posted a patch [1] which might help (read: fixes a problem found out thanks to the valgrind output), but I have no way to test it. If you do, I would appreciate you trying whether the issue gets fixed for you with that patch.
I reverted your change then applied the following, which looks like it fixes the problem. Thanks, J commit 65d342a6df5e8020b682a6085aa7aced7215e93b Author: Jeremy Fitzhardinge <jeremy@goop.org> Date: Wed Oct 30 10:36:37 2013 -0700 libxl: fix dubious cpumask handling in libxlDomainSetVcpuAffinities Rather than casting the virBitmap pointer to uint8_t* and then using the structure contents as a byte array, use the virBitmap API to determine the bitmap size and test each bit. Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org> diff --git a/src/libxl/libxl_driver.c b/src/libxl/libxl_driver.c index e2a6d44..ab509a6 100644 - --- a/src/libxl/libxl_driver.c +++ b/src/libxl/libxl_driver.c @@ -448,7 +448,7 @@ libxlDomainSetVcpuAffinities(libxlDriverPrivatePtr driver, virDomainObjPtr vm) libxlDomainObjPrivatePtr priv = vm->privateData; virDomainDefPtr def = vm->def; libxl_bitmap map; - - uint8_t *cpumask = NULL; + virBitmapPtr cpumask = NULL; uint8_t *cpumap = NULL; virNodeInfo nodeinfo; size_t cpumaplen; @@ -468,10 +468,16 @@ libxlDomainSetVcpuAffinities(libxlDriverPrivatePtr driver, virDomainObjPtr vm) if (VIR_ALLOC_N(cpumap, cpumaplen) < 0) goto cleanup; - - cpumask = (uint8_t*) def->cputune.vcpupin[vcpu]->cpumask; + cpumask = def->cputune.vcpupin[vcpu]->cpumask; - - for (i = 0; i < VIR_DOMAIN_CPUMASK_LEN; ++i) { - - if (cpumask[i]) + for (i = 0; i < virBitmapSize(cpumask); ++i) { + bool bit; + if (virBitmapGetBit(cpumask, i, &bit) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Failed to get cpumask bit '%zd' with libxenlight"), i); + goto cleanup; + } + if (bit) VIR_USE_CPU(cpumap, i); }
Thank you, Martin
[1] https://www.redhat.com/archives/libvir-list/2013-October/msg01075.html
Thanks,
J
-- Description of problem: When starting a Xen domain with libvirt + libxl, it crashes after creating the domain. The domain is left in a paused state, and works fine if manually unpaused with xl unpause. virt-manager never shows the domain as running.
[...]
Steps to Reproduce: 1. Open virt-manager 2. Connect to localhost (xen) 3. Start a domain
Actual results: Domain is created in a paused state, virt-manager shows errors about losing connection to the daemon. Logs show libvirtd crashed.
Expected results: Domain creation.
Additional info: Sep 27 09:08:30 saboo libvirtd[24880]: *** Error in `/usr/sbin/libvirtd': free(): invalid next size (fast): 0x00007f82c8003210 *** Sep 27 09:08:30 saboo libvirtd[24880]: ======= Backtrace: ========= Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libc.so.6(+0x365b27d0e8)[0x7f82f5a7a0e8] Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libvirt.so.0(virFree+0x1a)[0x7f82f8f07d5a] Sep 27 09:08:30 saboo libvirtd[24880]:
/usr/lib64/libvirt/connection-driver/libvirt_driver_libxl.so(+0x14b6c)[0x7f82e032bb6c]
Sep 27 09:08:30 saboo libvirtd[24880]:
/usr/lib64/libvirt/connection-driver/libvirt_driver_libxl.so(+0x154d4)[0x7f82e032c4d4]
Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libvirt.so.0(virDomainCreate+0xf7)[0x7f82f8fdb6b7] Sep 27 09:08:30 saboo libvirtd[24880]: /usr/sbin/libvirtd(+0x350c7)[0x7f82f9a1a0c7] Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libvirt.so.0(virNetServerProgramDispatch+0x3ba)[0x7f82f90314aa] Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libvirt.so.0(+0x3a33f822d8)[0x7f82f902c2d8] Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libvirt.so.0(+0x3a33ea0c15)[0x7f82f8f4ac15] Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libvirt.so.0(+0x3a33ea0691)[0x7f82f8f4a691] Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libpthread.so.0(+0x365ba07c53)[0x7f82f61ccc53] Sep 27 09:08:30 saboo libvirtd[24880]: /lib64/libc.so.6(clone+0x6d)[0x7f82f5af2d3d]
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
-----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.15 (GNU/Linux) iQEkBAEBCgAGBQJScUP4AAoJEAUkni6MUg7HuRwIQJF41DkVUDNeuYaQd+wzrT56 XJRvzuH5IKXw0SwgVv0q6cNQ0VfpSgLhIjUM2I3TIAs/d8JIqrBuV7Dy3D0y71Iz Kk+x01mSnT3N5uUi2PQqiJAPSDZanD0c//m5mDgUa5YcvY1RrG8toVbvewkZg36o 7kJPn8kGZPSVE7kw9o9GNeP8JSJHmEo6oJEyRwvIzGZtEV+zzEeOehM/mitF/N4X kewKFz6m4A/QFytasc43kOokQd6DWeSqF6lLT4Usi6uZ/ktikevrc843dd6OEzTl 9KV8L7lRaqY/z1/OiWtflMmZonadwFpTCS8R43zCf2TzHSFfRkqrzxQVSur+m9dX gvd+vyCPTg== =ItmR -----END PGP SIGNATURE-----

On 10/30/2013 11:38 AM, Jeremy Fitzhardinge wrote:
libxl: fix dubious cpumask handling in libxlDomainSetVcpuAffinities
Rather than casting the virBitmap pointer to uint8_t* and then using the structure contents as a byte array, use the virBitmap API to determine the bitmap size and test each bit.
Hmm, we already have virBitmapToData for converting from a virBitmap to a uint8_t; using that would be much faster for populating cpumap than doing a per-bit iteration over cpumask. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512 On 10/30/2013 11:37 AM, Eric Blake wrote:
On 10/30/2013 11:38 AM, Jeremy Fitzhardinge wrote:
libxl: fix dubious cpumask handling in libxlDomainSetVcpuAffinities
Rather than casting the virBitmap pointer to uint8_t* and then using the structure contents as a byte array, use the virBitmap API to determine the bitmap size and test each bit.
Hmm, we already have virBitmapToData for converting from a virBitmap to a uint8_t; using that would be much faster for populating cpumap than doing a per-bit iteration over cpumask.
I looked at that, but I couldn't see it being any more efficient given that we're typically talking about a small number of (v)CPUs. J -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.15 (GNU/Linux) iQEkBAEBCgAGBQJScVdcAAoJEAUkni6MUg7HF7IIQLBNIlssnxTx9dAqPSIc9mL+ gNvBzuflYT1QXjggfj49c/MozJJ+rZdCCu5NjAKqpLGYUpYohNg5aG1Vc1uw3OtQ zyyU1vmF7/kwxLKKcqHpxcpc+miB4NYIVMTYQ+bvIHcpZv8nk+a4freR0/8zVgX1 zQH94kcV8zBn5i03BRuCMO+GCd6AYJRNz++hBadrn/OErE/D8iwcfssasIDl3kFP lpby15niqiv3BT6bmrg2bMRJ5qKYnhbezhcMilOw6nZ2QOqP2/cZltHv99qwcbns 32uD+FdeTjN6BJ61UP8nAzEy96cvY1YIbc7z1kmr8F2aMPELoHfXPUCkJyUj9YLM pc+NV9915A== =MRR8 -----END PGP SIGNATURE-----

On 10/30/2013 01:00 PM, Jeremy Fitzhardinge wrote:
On 10/30/2013 11:37 AM, Eric Blake wrote:
On 10/30/2013 11:38 AM, Jeremy Fitzhardinge wrote:
libxl: fix dubious cpumask handling in libxlDomainSetVcpuAffinities
Rather than casting the virBitmap pointer to uint8_t* and then using the structure contents as a byte array, use the virBitmap API to determine the bitmap size and test each bit.
Hmm, we already have virBitmapToData for converting from a virBitmap to a uint8_t; using that would be much faster for populating cpumap than doing a per-bit iteration over cpumask.
I looked at that, but I couldn't see it being any more efficient given that we're typically talking about a small number of (v)CPUs.
Efficiency isn't my concern. Maintainability is. It's better to reuse existing functions instead of risking open-coding it wrong. We have a testsuite over our existing functions, but not over your open-coding. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 10/30/2013 12:04 PM, Eric Blake wrote:
On 10/30/2013 01:00 PM, Jeremy Fitzhardinge wrote:
On 10/30/2013 11:37 AM, Eric Blake wrote:
On 10/30/2013 11:38 AM, Jeremy Fitzhardinge wrote:
libxl: fix dubious cpumask handling in libxlDomainSetVcpuAffinities
Rather than casting the virBitmap pointer to uint8_t* and then
using
the structure contents as a byte array, use the virBitmap API to determine the bitmap size and test each bit.
Hmm, we already have virBitmapToData for converting from a virBitmap to a uint8_t; using that would be much faster for populating cpumap than doing a per-bit iteration over cpumask.
I looked at that, but I couldn't see it being any more efficient given that we're typically talking about a small number of (v)CPUs.
Efficiency isn't my concern. Maintainability is. It's better to reuse existing functions instead of risking open-coding it wrong. We have a testsuite over our existing functions, but not over your open-coding.
Well, that was also a consideration, but using virBitmapToData would be more complex. The same loop would exist, but I'd also have to manage freeing the array. So this seems like the better of the two approaches. J

Jeremy Fitzhardinge wrote:
On 10/30/2013 12:04 PM, Eric Blake wrote:
On 10/30/2013 01:00 PM, Jeremy Fitzhardinge wrote:
On 10/30/2013 11:37 AM, Eric Blake wrote:
On 10/30/2013 11:38 AM, Jeremy Fitzhardinge wrote:
libxl: fix dubious cpumask handling in libxlDomainSetVcpuAffinities
Rather than casting the virBitmap pointer to uint8_t* and then
using
the structure contents as a byte array, use the virBitmap API to determine the bitmap size and test each bit.
Hmm, we already have virBitmapToData for converting from a virBitmap to a uint8_t; using that would be much faster for populating cpumap than doing a per-bit iteration over cpumask.
I looked at that, but I couldn't see it being any more efficient given that we're typically talking about a small number of (v)CPUs.
Efficiency isn't my concern. Maintainability is. It's better to reuse existing functions instead of risking open-coding it wrong. We have a testsuite over our existing functions, but not over your open-coding.
Well, that was also a consideration, but using virBitmapToData would be more complex. The same loop would exist, but I'd also have to manage freeing the array. So this seems like the better of the two approaches.
What is the consensus here? It would be nice to get this pushed for 1.1.4 since it fixes a libvirtd segfault. I prefer Jeremy's approach of less memory management. Regards, Jim

On Thu, Oct 31, 2013 at 06:50:57PM -0600, Jim Fehlig wrote:
Jeremy Fitzhardinge wrote:
On 10/30/2013 12:04 PM, Eric Blake wrote:
On 10/30/2013 01:00 PM, Jeremy Fitzhardinge wrote:
On 10/30/2013 11:37 AM, Eric Blake wrote:
On 10/30/2013 11:38 AM, Jeremy Fitzhardinge wrote:
libxl: fix dubious cpumask handling in libxlDomainSetVcpuAffinities
Rather than casting the virBitmap pointer to uint8_t* and then
using
the structure contents as a byte array, use the virBitmap API to determine the bitmap size and test each bit.
Hmm, we already have virBitmapToData for converting from a virBitmap to a uint8_t; using that would be much faster for populating cpumap than doing a per-bit iteration over cpumask.
I looked at that, but I couldn't see it being any more efficient given that we're typically talking about a small number of (v)CPUs.
Efficiency isn't my concern. Maintainability is. It's better to reuse existing functions instead of risking open-coding it wrong. We have a testsuite over our existing functions, but not over your open-coding.
Well, that was also a consideration, but using virBitmapToData would be more complex. The same loop would exist, but I'd also have to manage freeing the array. So this seems like the better of the two approaches.
What is the consensus here? It would be nice to get this pushed for 1.1.4 since it fixes a libvirtd segfault. I prefer Jeremy's approach of less memory management.
Absolutely want this in 1.1.4. Personally I think Jeremy's patch is fine. Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 11/01/2013 03:55 AM, Daniel P. Berrange wrote:
On Thu, Oct 31, 2013 at 06:50:57PM -0600, Jim Fehlig wrote:
Jeremy Fitzhardinge wrote:
On 10/30/2013 12:04 PM, Eric Blake wrote:
On 10/30/2013 01:00 PM, Jeremy Fitzhardinge wrote:
On 10/30/2013 11:37 AM, Eric Blake wrote:
On 10/30/2013 11:38 AM, Jeremy Fitzhardinge wrote:
> libxl: fix dubious cpumask handling in libxlDomainSetVcpuAffinities > > Rather than casting the virBitmap pointer to uint8_t* and then > using
> the structure contents as a byte array, use the virBitmap API to > determine > the bitmap size and test each bit. > Hmm, we already have virBitmapToData for converting from a virBitmap to a uint8_t; using that would be much faster for populating cpumap than doing a per-bit iteration over cpumask.
I looked at that, but I couldn't see it being any more efficient given that we're typically talking about a small number of (v)CPUs.
Efficiency isn't my concern. Maintainability is. It's better to reuse existing functions instead of risking open-coding it wrong. We have a testsuite over our existing functions, but not over your open-coding.
Well, that was also a consideration, but using virBitmapToData would be more complex. The same loop would exist, but I'd also have to manage freeing the array. So this seems like the better of the two approaches.
What is the consensus here? It would be nice to get this pushed for 1.1.4 since it fixes a libvirtd segfault. I prefer Jeremy's approach of less memory management.
Absolutely want this in 1.1.4. Personally I think Jeremy's patch is fine.
Thanks. I've pushed the revert of 394d6e0a, and Jeremy's patch after removing the unreachable error as noted by Eric https://www.redhat.com/archives/libvir-list/2013-November/msg00006.html Regards, Jim

Jeremy Fitzhardinge wrote:
On 10/24/2013 02:52 AM, Martin Kletzander wrote:
On Wed, Oct 23, 2013 at 10:46:14AM -0700, Jeremy Fitzhardinge wrote:
Hi all,
I posted this bug (https://bugzilla.redhat.com/show_bug.cgi?id=1013045) to the Redhat Bugzilla a while ago, and the only response has been to post a note to this list about the bug.
Summary below, but it looks like a pretty clear use-after-free or something. The full details are attached to the bug report.
From the looks of the BZ, I think the probnlem found by valgrind (not the one in libxl) is a different than the one which causes the 'invalid free()', but anyway, I posted a patch [1] which might help (read: fixes a problem found out thanks to the valgrind output), but I have no way to test it. If you do, I would appreciate you trying whether the issue gets fixed for you with that patch.
I reverted your change then applied the following, which looks like it fixes the problem.
Thanks,
J
commit 65d342a6df5e8020b682a6085aa7aced7215e93b Author: Jeremy Fitzhardinge <jeremy@goop.org> Date: Wed Oct 30 10:36:37 2013 -0700
libxl: fix dubious cpumask handling in libxlDomainSetVcpuAffinities
Rather than casting the virBitmap pointer to uint8_t* and then using the structure contents as a byte array, use the virBitmap API to determine the bitmap size and test each bit.
Yikes...
Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
diff --git a/src/libxl/libxl_driver.c b/src/libxl/libxl_driver.c index e2a6d44..ab509a6 100644 --- a/src/libxl/libxl_driver.c +++ b/src/libxl/libxl_driver.c @@ -448,7 +448,7 @@ libxlDomainSetVcpuAffinities(libxlDriverPrivatePtr driver, virDomainObjPtr vm) libxlDomainObjPrivatePtr priv = vm->privateData; virDomainDefPtr def = vm->def; libxl_bitmap map; - uint8_t *cpumask = NULL; + virBitmapPtr cpumask = NULL; uint8_t *cpumap = NULL; virNodeInfo nodeinfo; size_t cpumaplen; @@ -468,10 +468,16 @@ libxlDomainSetVcpuAffinities(libxlDriverPrivatePtr driver, virDomainObjPtr vm) if (VIR_ALLOC_N(cpumap, cpumaplen) < 0) goto cleanup;
- cpumask = (uint8_t*) def->cputune.vcpupin[vcpu]->cpumask; + cpumask = def->cputune.vcpupin[vcpu]->cpumask;
- for (i = 0; i < VIR_DOMAIN_CPUMASK_LEN; ++i) { - if (cpumask[i]) + for (i = 0; i < virBitmapSize(cpumask); ++i) { + bool bit; + if (virBitmapGetBit(cpumask, i, &bit) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Failed to get cpumask bit '%zd' with libxenlight"), i);
I don't think that is the most informative error message, but can't really suggest anything better since I'm having a problem understanding how virtBitmapGetBit() could fail. The virDomainObjPtr is locked, so the cpumask for the vcpu should not change. We just asked for the size of the map, thus shouldn't be asking for a bit outside that range, right? I suppose if failure was somehow possible, it would indicate misconfiguration. E.g. an invalid cpumask for the vcpu. If so, something like "Failed to decode cpumask for vcpu '%zd'" is more helpful Regards, Jim

On 10/31/2013 05:23 PM, Jim Fehlig wrote:
Jeremy Fitzhardinge wrote:
On 10/24/2013 02:52 AM, Martin Kletzander wrote:
On Wed, Oct 23, 2013 at 10:46:14AM -0700, Jeremy Fitzhardinge wrote:
Hi all,
I posted this bug (https://bugzilla.redhat.com/show_bug.cgi?id=1013045) to the Redhat Bugzilla a while ago, and the only response has been to post a note to this list about the bug.
Summary below, but it looks like a pretty clear use-after-free or something. The full details are attached to the bug report.
From the looks of the BZ, I think the probnlem found by valgrind (not the one in libxl) is a different than the one which causes the 'invalid free()', but anyway, I posted a patch [1] which might help (read: fixes a problem found out thanks to the valgrind output), but I have no way to test it. If you do, I would appreciate you trying whether the issue gets fixed for you with that patch. I reverted your change then applied the following, which looks like it fixes the problem.
Thanks,
J
commit 65d342a6df5e8020b682a6085aa7aced7215e93b Author: Jeremy Fitzhardinge <jeremy@goop.org> Date: Wed Oct 30 10:36:37 2013 -0700
libxl: fix dubious cpumask handling in libxlDomainSetVcpuAffinities
Rather than casting the virBitmap pointer to uint8_t* and then using the structure contents as a byte array, use the virBitmap API to determine the bitmap size and test each bit. Yikes...
Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
diff --git a/src/libxl/libxl_driver.c b/src/libxl/libxl_driver.c index e2a6d44..ab509a6 100644 --- a/src/libxl/libxl_driver.c +++ b/src/libxl/libxl_driver.c @@ -448,7 +448,7 @@ libxlDomainSetVcpuAffinities(libxlDriverPrivatePtr driver, virDomainObjPtr vm) libxlDomainObjPrivatePtr priv = vm->privateData; virDomainDefPtr def = vm->def; libxl_bitmap map; - uint8_t *cpumask = NULL; + virBitmapPtr cpumask = NULL; uint8_t *cpumap = NULL; virNodeInfo nodeinfo; size_t cpumaplen; @@ -468,10 +468,16 @@ libxlDomainSetVcpuAffinities(libxlDriverPrivatePtr driver, virDomainObjPtr vm) if (VIR_ALLOC_N(cpumap, cpumaplen) < 0) goto cleanup;
- cpumask = (uint8_t*) def->cputune.vcpupin[vcpu]->cpumask; + cpumask = def->cputune.vcpupin[vcpu]->cpumask;
- for (i = 0; i < VIR_DOMAIN_CPUMASK_LEN; ++i) { - if (cpumask[i]) + for (i = 0; i < virBitmapSize(cpumask); ++i) { + bool bit; + if (virBitmapGetBit(cpumask, i, &bit) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Failed to get cpumask bit '%zd' with libxenlight"), i); I don't think that is the most informative error message, but can't really suggest anything better since I'm having a problem understanding how virtBitmapGetBit() could fail. The virDomainObjPtr is locked, so the cpumask for the vcpu should not change. We just asked for the size of the map, thus shouldn't be asking for a bit outside that range, right?
I suppose if failure was somehow possible, it would indicate misconfiguration. E.g. an invalid cpumask for the vcpu. If so, something like "Failed to decode cpumask for vcpu '%zd'" is more helpful
It returns an error, so out of an abundance of caution I test for the error. But no, I can't see how it could happen. J
Regards, Jim

On 10/31/2013 06:23 PM, Jim Fehlig wrote:
+ for (i = 0; i < virBitmapSize(cpumask); ++i) { + bool bit; + if (virBitmapGetBit(cpumask, i, &bit) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Failed to get cpumask bit '%zd' with libxenlight"), i);
I don't think that is the most informative error message, but can't really suggest anything better since I'm having a problem understanding how virtBitmapGetBit() could fail.
Typically it fails only for an out-of-range request, which is a coding error. So some places use ignore_value() rather than bothering with an unreachable error message. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

Eric Blake wrote:
On 10/31/2013 06:23 PM, Jim Fehlig wrote:
+ for (i = 0; i < virBitmapSize(cpumask); ++i) { + bool bit; + if (virBitmapGetBit(cpumask, i, &bit) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Failed to get cpumask bit '%zd' with libxenlight"), i);
I don't think that is the most informative error message, but can't really suggest anything better since I'm having a problem understanding how virtBitmapGetBit() could fail.
Typically it fails only for an out-of-range request, which is a coding error. So some places use ignore_value() rather than bothering with an unreachable error message.
Would there be any objections to committing this, with the use of ignore_value() squashed in? Regards, Jim
participants (5)
-
Daniel P. Berrange
-
Eric Blake
-
Jeremy Fitzhardinge
-
Jim Fehlig
-
Martin Kletzander