Reviving old thread, this came up again via a Fedora bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=1278847
https://retrace.fedoraproject.org/faf/reports/597209/
I took a cursory look at libxl's sigchld handling... it's intense to say the
least, but there are some driver options that tweak the handling. Maybe there's
a simple fix.
jfehlig, any suggestions here? The summary is that virtualbox loads a stub
SIGCHLD handler of its own, which causes libxl to assert in
sigchld_installhandler_core.
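
For reference, the condition that the assertion quoted further down checks can
be restated roughly as below. This is only a sketch, not libxl's code: the
names sigchld_is_unclaimed, stub_handler and install_stub_handler are made up
for illustration, and it assumes the literals 4, 0 and 1 in the assertion text
are SA_SIGINFO, SIG_DFL and SIG_IGN, as they are on Linux/glibc.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if nobody has claimed SIGCHLD yet, i.e. the
 * disposition is still SIG_DFL or SIG_IGN and SA_SIGINFO is not set. */
static bool sigchld_is_unclaimed(void)
{
    struct sigaction cur;

    /* Passing NULL as the new action only queries the current one. */
    if (sigaction(SIGCHLD, NULL, &cur) != 0)
        return false;
    if (cur.sa_flags & SA_SIGINFO)
        return false;
    return cur.sa_handler == SIG_DFL || cur.sa_handler == SIG_IGN;
}

/* Hypothetical stand-in for a do-nothing handler set by another library. */
static void stub_handler(int signo)
{
    (void)signo;
}

/* Installs the stub the way a third-party library might.
 * Returns 0 on success, -1 on failure. */
static int install_stub_handler(void)
{
    struct sigaction sa;

    sa.sa_handler = stub_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGCHLD, &sa, NULL);
}
```

Calling sigchld_is_unclaimed() before and after install_stub_handler() shows
the state flipping from "unclaimed" to "claimed", which is the situation libxl
refuses to start from.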
Thanks,
Cole
On 01/26/2015 08:17 AM, Klaus Espenlaub wrote:
Hi all,
On 26.01.2015 10:52, Michal Privoznik wrote:
> [CC'ing vbox-dev list]
>
> On 23.01.2015 21:42, CloudPatch Staff wrote:
>> After some debugging we found what was causing the assert. In our
>> configuration we have two kernels to boot: one is a pv-linux for Xen dom0
>> and the other is just a normal linux kernel. We have libvirt built with
>> both Xen and vbox support. When running with Xen, the libxl driver is used,
>> so it ends up calling libxenlight, which doesn't want any SIGCHLD handler
>> set. Normally this is the case, but since we have vbox support in libvirt,
>> the vbox driver loads some of the vbox libs and one of them sets a SIGCHLD
>> handler. When libxenlight checks whether there is any handler for SIGCHLD,
>> it finds that one and fails.
>>
>> Here is the backtrace from when vbox is setting the handler:
>>
>> (gdb) bt
>> #0 0x00007ffff71d8250 in sigaction () from /lib64/libpthread.so.0
>> #1 0x00007fffedbf3716 in ?? () from /usr/lib64/virtualbox/VBoxRT.so
>> #2 0x00007fffedbf3960 in ?? () from /usr/lib64/virtualbox/VBoxRT.so
>> #3 0x00007fffedeea485 in VBoxGetCAPIFunctions () from /usr/lib64/virtualbox/VBoxXPCOMC.so
[...]
That's
https://www.virtualbox.org/browser/vbox/trunk/src/VBox/Main/cbinding/VBox...
- initializing the runtime our code uses which abstracts platform specifics.
The rationale why the runtime installs a dummy signal handler is that if
SIGCHLD is set to be ignored (that's the default) then POSIX compliant
waitpid() won't work. It's mentioned in the wait(2) man page on my linux
system, see the notes about the ECHILD error code.
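
The ECHILD behaviour mentioned above can be reproduced with a small test. This
is a sketch under assumptions: wait_errno_with is a made-up helper name, and
the behaviour shown is what Linux does when SIGCHLD is explicitly set to
SIG_IGN. With the SIG_DFL disposition waitpid() reaps the child normally, so
the problematic case is an explicit SIG_IGN, which on Linux can also be
inherited across exec.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: set SIGCHLD to `disposition` (SIG_IGN or SIG_DFL),
 * fork a child that exits immediately, then wait for it.
 * Returns 0 if waitpid() reaped the child, otherwise the errno value of
 * the call that failed. */
static int wait_errno_with(void (*disposition)(int))
{
    struct sigaction sa;

    sa.sa_handler = disposition;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGCHLD, &sa, NULL) != 0)
        return errno;

    pid_t pid = fork();
    if (pid < 0)
        return errno;
    if (pid == 0)
        _exit(0);               /* child: terminate right away */

    int status;
    pid_t r;
    do {
        r = waitpid(pid, &status, 0);
    } while (r < 0 && errno == EINTR);

    return r == pid ? 0 : errno;
}
```

On Linux, wait_errno_with(SIG_IGN) yields ECHILD because the kernel discards
the child's exit status, while wait_errno_with(SIG_DFL) yields 0.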
It's really excessive that some other code prevents using the full
functionality. VirtualBox checks if the SIGCHLD handler is already set. If the
signal isn't ignored it happily continues without touching SIGCHLD. So if
libvirt wants to establish a sane default it would also work.
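
A check-then-install step of the kind described above might look like the
following. This is a minimal sketch, not the VirtualBox runtime code: the names
noop_sigchld and ensure_sigchld_not_ignored are hypothetical, and it assumes
that "ignored" means exactly SIG_IGN. Whether the real code also replaces
SIG_DFL is not stated in this thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Hypothetical no-op handler: its only purpose is to make the SIGCHLD
 * disposition something other than SIG_IGN so waitpid() can reap. */
static void noop_sigchld(int signo)
{
    (void)signo;
}

/* Hypothetical helper. Returns 1 if a no-op handler was installed,
 * 0 if SIGCHLD was left untouched, -1 on error. */
static int ensure_sigchld_not_ignored(void)
{
    struct sigaction cur;

    /* Query only: a NULL new action leaves the disposition unchanged. */
    if (sigaction(SIGCHLD, NULL, &cur) != 0)
        return -1;

    /* Someone already installed a handler, or the default is in place:
     * leave the signal alone. */
    if ((cur.sa_flags & SA_SIGINFO) || cur.sa_handler != SIG_IGN)
        return 0;

    struct sigaction sa;
    sa.sa_handler = noop_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;   /* restart interrupted syscalls where possible */
    return sigaction(SIGCHLD, &sa, NULL) == 0 ? 1 : -1;
}
```

Under this assumption, a host application that sets any non-ignored
disposition before the library initializes would keep the library from
touching SIGCHLD at all.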
To state the obvious: running an API client with crippled waitpid() can lead
to extremely strange behavior. XPCOM creates worker threads on demand and
expects to be able to wait on their termination.
>> After building libvirt without vbox support the assert disappeared.
>
> So what you are saying is that vbox interferes with libxenlight? While I
> see why the vbox library wants a SIGCHLD handler, maybe they have become
> fault tolerant in the meantime. Or?
Hope the explanations help resolve the conflict. It's not that we want to
sabotage anyone else.
Klaus
>
>>
>>
>> On Fri, Jan 23, 2015 at 5:14 AM, Michal Privoznik <mprivozn(a)redhat.com> wrote:
>>
>> On 22.01.2015 17:49, CloudPatch Staff wrote:
>> > We're hitting an assert whenever we try to create an HVM instance under
>> > Xen via libvirtd.
>> >
>> > System is running on Gentoo, package information as follows:
>> >
>> > app-emulation/xen-4.5.0 USE="api debug flask hvm pam pygrub python qemu screen"
>> > app-emulation/xen-tools-4.5.0 USE="api debug flask hvm pam pygrub python qemu screen"
>> > app-emulation/libvirt-1.2.11-r2:0/1.2.11 USE="caps libvirtd lvm macvtap nls qemu udev vepa virtualbox xen"
>> >
>> > The following commands are run in parallel:
>> >
>> > vmmachine ~ # libvirtd --listen
>> > 2015-01-22 16:33:13.596+0000: 2620: info : libvirt version: 1.2.11
>> > 2015-01-22 16:33:13.596+0000: 2620: error : udevGetDMIData:1607 : Failed to get udev device for syspath '/sys/devices/virtual/dmi/id' or '/sys/class/dmi/id'
>> > libvirtd: libxl_fork.c:350: sigchld_installhandler_core: Assertion `((void)"application must negotiate with libxl about SIGCHLD", !(sigchld_saved_action.sa_flags & 4) && (sigchld_saved_action.__sigaction_handler.sa_handler == ((__sighandler_t) 0) || sigchld_saved_action.__sigaction_handler.sa_handler == ((__sighandler_t) 1)))' failed.
>> > Aborted
>>
>> Interesting. Can you attach a debugger so we can see stacktrace?
>>
>> Michal
>>
>>
>
> Michal
>
--
libvir-list mailing list
libvir-list(a)redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list