[libvirt] Assert with libvirt + xen hvm

We're hitting an assert whenever we try to create an HVM instance under Xen via libvirtd.
System is running on Gentoo, package information as follows:
    app-emulation/xen-4.5.0 USE="api debug flask hvm pam pygrub python qemu screen"
    app-emulation/xen-tools-4.5.0 USE="api debug flask hvm pam pygrub python qemu screen"
    app-emulation/libvirt-1.2.11-r2:0/1.2.11 USE="caps libvirtd lvm macvtap nls qemu udev vepa virtualbox xen"
The following commands are run in parallel:
    vmmachine ~ # libvirtd --listen
    2015-01-22 16:33:13.596+0000: 2620: info : libvirt version: 1.2.11
    2015-01-22 16:33:13.596+0000: 2620: error : udevGetDMIData:1607 : Failed to get udev device for syspath '/sys/devices/virtual/dmi/id' or '/sys/class/dmi/id'
    libvirtd: libxl_fork.c:350: sigchld_installhandler_core: Assertion `((void)"application must negotiate with libxl about SIGCHLD", !(sigchld_saved_action.sa_flags & 4) && (sigchld_saved_action.__sigaction_handler.sa_handler == ((__sighandler_t) 0) || sigchld_saved_action.__sigaction_handler.sa_handler == ((__sighandler_t) 1)))' failed.
    Aborted
    vmmachine ~ # VIRSH_DEBUG=0 virsh create xml
    create: file(optdata): xml
    libvirt: XML-RPC error : End of file while reading data: Input/output error
    error: Failed to create domain from xml
    error: End of file while reading data: Input/output error
    libvirt: Domain Config error : Requested operation is not valid: A different callback was requested

On 22.01.2015 17:49, CloudPatch Staff wrote:
> We're hitting an assert whenever we try to create an HVM instance under Xen via libvirtd.
> [...]
Interesting. Can you attach a debugger so we can see the stack trace?
Michal

After some debugging we found what was causing the assert.
In our configuration we have two kernels to boot: one is a PV Linux kernel for the Xen dom0 and the other is just a normal Linux kernel. We have libvirt built with both Xen and vbox support. When running under Xen, the libxl driver is used, so it ends up calling libxenlight, which doesn't want any SIGCHLD handler set. Normally none is set but, since we have vbox support in libvirt, the vbox driver loads some of the vbox libs and one of them sets a SIGCHLD handler. When libxenlight checks whether there is any handler for SIGCHLD, it finds that one and fails.
Here is the backtrace from when vbox is setting the handler:
    (gdb) bt
    #0  0x00007ffff71d8250 in sigaction () from /lib64/libpthread.so.0
    #1  0x00007fffedbf3716 in ?? () from /usr/lib64/virtualbox/VBoxRT.so
    #2  0x00007fffedbf3960 in ?? () from /usr/lib64/virtualbox/VBoxRT.so
    #3  0x00007fffedeea485 in VBoxGetCAPIFunctions () from /usr/lib64/virtualbox/VBoxXPCOMC.so
    #4  0x00007fffee37d7ca in ?? () from /usr/lib64/libvirt/connection-driver/libvirt_driver_vbox_network.so
    #5  0x00007fffee37d995 in VBoxCGlueInit () from /usr/lib64/libvirt/connection-driver/libvirt_driver_vbox_network.so
    #6  0x00007fffee321e7b in vboxNetworkRegister () from /usr/lib64/libvirt/connection-driver/libvirt_driver_vbox_network.so
    #7  0x00007ffff755bbe8 in virDriverLoadModule () from /usr/lib64/libvirt.so.0
    #8  0x0000555555568714 in ?? ()
    #9  0x000055555556ab86 in ?? ()
    #10 0x00007ffff6e54aa5 in __libc_start_main () from /lib64/libc.so.6
    #11 0x0000555555567f59 in ?? ()
After building libvirt without vbox support the assert disappeared.
On Fri, Jan 23, 2015 at 5:14 AM, Michal Privoznik <mprivozn@redhat.com> wrote:
> Interesting. Can you attach a debugger so we can see the stack trace?

[CC'ing vbox-dev list]
On 23.01.2015 21:42, CloudPatch Staff wrote:
> After some debugging we found what was causing the assert.
> [...]
> When libxenlight checks whether there is any handler for SIGCHLD, it finds that one and fails.
> [...]
> After building libvirt without vbox support the assert disappeared.
So what you are saying is that vbox interferes with libxenlight? While I see why the vbox library wants a SIGCHLD handler, maybe they became fault tolerant meanwhile. Or?
Michal

Hi all,
On 26.01.2015 10:52, Michal Privoznik wrote:
> [CC'ing vbox-dev list]
> On 23.01.2015 21:42, CloudPatch Staff wrote:
> > [...]
> > Here is the backtrace from when vbox is setting the handler:
> > (gdb) bt
> > #0 0x00007ffff71d8250 in sigaction () from /lib64/libpthread.so.0
> > #1 0x00007fffedbf3716 in ?? () from /usr/lib64/virtualbox/VBoxRT.so
> > #2 0x00007fffedbf3960 in ?? () from /usr/lib64/virtualbox/VBoxRT.so
> > #3 0x00007fffedeea485 in VBoxGetCAPIFunctions () from /usr/lib64/virtualbox/VBoxXPCOMC.so
> > [...]
That's https://www.virtualbox.org/browser/vbox/trunk/src/VBox/Main/cbinding/VBoxCAP... - initializing the runtime our code uses, which abstracts platform specifics.
The reason the runtime installs a dummy signal handler is that if SIGCHLD is set to be ignored (that's the default) then POSIX compliant waitpid() won't work. It's mentioned in the wait(2) man page on my Linux system, see the notes about the ECHILD error code.
It's really excessive that some other code prevents using the full functionality. VirtualBox checks if the SIGCHLD handler is already set. If the signal isn't ignored it happily continues without touching SIGCHLD. So if libvirt wants to establish a sane default it would also work.
To state the obvious: running an API client with a crippled waitpid() can lead to extremely strange behavior. XPCOM creates worker threads on demand and expects to be able to wait on their termination.
> > After building libvirt without vbox support the assert disappeared.
> So what you are saying is that vbox interferes with libxenlight? While I see why the vbox library wants a SIGCHLD handler, maybe they became fault tolerant meanwhile. Or?
Hope the explanations help resolve the conflict. It's not that we want to sabotage anyone else.
Klaus

Reviving old thread, this came up again via a Fedora bug report:
    https://bugzilla.redhat.com/show_bug.cgi?id=1278847
    https://retrace.fedoraproject.org/faf/reports/597209/
I took a cursory look at libxl's sigchld handling... it's intense to say the least, but there are some driver options that tweak the handling. Maybe there's a simple fix. jfehlig, any suggestions here?
The summary is that virtualbox loads a stub SIGCHLD handler of its own, which causes libxl to assert in sigchld_installhandler_core.
Thanks,
Cole
On 01/26/2015 08:17 AM, Klaus Espenlaub wrote:
> [...]
> VirtualBox checks if the SIGCHLD handler is already set. If the signal isn't ignored it happily continues without touching SIGCHLD. So if libvirt wants to establish a sane default it would also work.
> [...]
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
participants (4): CloudPatch Staff, Cole Robinson, Klaus Espenlaub, Michal Privoznik