[libvirt-users] libvirtd-1.1.0 crashes when attempting to start some (but not all) LXC containers

Hello all,

I have two issues:

1) I am unable to start a seemingly correct LXC domain (I cloned it from a working domain).
2) I am able to crash "libvirtd" by attempting to start the cloned domain, but starting the original works just fine.

I humbly submit that item #2 is a bug - the "libvirtd" daemon should never crash due to anything the "libvirt" client throws at it. As for item #1, I'm not sure where I went wrong. A full walk-through is below (ending with a diff of the XML from the two domains).

I created my original domain ("dwj-lnx-dev") a long time ago. Today I created the new domain ("dwj-hfax-dev") as follows:

1) Shut down "dwj-lnx-dev"
2) Clone the root file system: "cd /vm/lxc/; cp -a dwj-lnx-dev dwj-hfax-dev" (2.5GB, ~5 min)
3) "virsh -c lxc:/// dumpxml dwj-lnx-dev > a.xml"
4) ${EDITOR} a.xml
   a) changed MAC address, name, memory, source directory for "/"
5) "virsh -c lxc:/// define a.xml"
6) Edit "/etc/bind/pri/*" and "/etc/dhcp/dhcpd.conf" on my host.

It does not matter if "dwj-lnx-dev" is running or not. Any attempt to start "dwj-hfax-dev" will crash libvirtd.

In the past I was asked to turn on some debugging and capture a detailed log (https://www.redhat.com/archives/libvirt-users/2013-May/msg00076.html). I will do this soon and post my results as a follow-up.

ostara ~ # uname -a
Linux ostara 3.8.13-gentoo #1 SMP PREEMPT Mon Jun 3 17:10:56 CDT 2013 x86_64 Intel(R) Core(TM) i5 CPU 760 @ 2.80GHz GenuineIntel GNU/Linux

ostara ~ # equery l libvirt
 * Searching for libvirt ...
[IP-] [  ] app-emulation/libvirt-1.1.0-r1:0

ostara ~ # virsh -c lxc:/// version
Compiled against library: libvirt 1.1.0
Using library: libvirt 1.1.0
Using API: LXC 1.1.0
Running hypervisor: LXC 3.8.13

ostara ~ # /etc/init.d/libvirtd restart
 * Caching service dependencies ... [ ok ]
 * Stopping libvirtd ...
 * Shutting down network(s):
 *   default [ ok ]
 * Starting libvirtd ... [ ok ]

ostara ~ # virsh -c lxc:/// list --all
 Id    Name                           State
----------------------------------------------------
 -     dwj-hfax-dev                   shut off
 -     dwj-lnx-dev                    shut off
 -     vm1                            shut off

ostara ~ # virsh -c lxc:/// start dwj-lnx-dev
Domain dwj-lnx-dev started

ostara ~ # virsh -c lxc:/// list --all
 Id    Name                           State
----------------------------------------------------
 9441  dwj-lnx-dev                    running
 -     dwj-hfax-dev                   shut off
 -     vm1                            shut off

ostara ~ # virsh -c lxc:/// start dwj-hfax-dev
error: Failed to start domain dwj-hfax-dev
error: End of file while reading data: Input/output error
error: One or more references were leaked after disconnect from the hypervisor
error: Failed to reconnect to the hypervisor

ostara ~ # virsh -c lxc:/// list --all
error: failed to connect to the hypervisor
error: no valid connection
error: Failed to connect socket to '/var/run/libvirt/libvirt-sock': Connection refused

ostara ~ # ls -l /var/run/libvirt/libvirt-sock
srwx------ 1 root root 0 Jul 12 11:21 /var/run/libvirt/libvirt-sock

ostara ~ # ps axfw | grep libvirt
 9997 pts/2    S+     0:00          \_ grep --colour=auto libvirt
 8446 ?        S      0:00 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf
 9441 ?        Ss     0:00 /usr/libexec/libvirt_lxc --name dwj-lnx-dev --console 19 --security=none --handshake 23 --background --veth veth1

ostara ~ # /etc/init.d/libvirtd restart
 * Stopping libvirtd ...
 * start-stop-daemon: no matching processes found [ ok ]
 * Starting libvirtd ... [ ok ]

ostara ~ # ps axfw | grep libvirt
10130 pts/2    S+     0:00          \_ grep --colour=auto libvirt
 8446 ?        S      0:00 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf
 9441 ?        Ss     0:00 /usr/libexec/libvirt_lxc --name dwj-lnx-dev --console 19 --security=none --handshake 23 --background --veth veth1
10033 ?        Sl     0:00 /usr/sbin/libvirtd -d --listen

ostara ~ # virsh -c lxc:/// list --all
 Id    Name                           State
----------------------------------------------------
 9441  dwj-lnx-dev                    running
 -     dwj-hfax-dev                   shut off
 -     vm1                            shut off

ostara ~ # virsh -c lxc:/// dumpxml dwj-hfax-dev
<domain type='lxc'>
  <name>dwj-hfax-dev</name>
  <uuid>681410de-7b56-41bd-b38d-3c66ce97e7b3</uuid>
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>4194304</currentMemory>
  <vcpu placement='static'>4</vcpu>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64'>exe</type>
    <init>/sbin/init</init>
  </os>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/libexec/libvirt_lxc</emulator>
    <filesystem type='mount' accessmode='passthrough'>
      <source dir='/vm/lxc/dwj-hfax-dev'/>
      <target dir='/'/>
    </filesystem>
    <filesystem type='mount' accessmode='passthrough'>
      <source dir='/usr/portage'/>
      <target dir='/usr/portage'/>
    </filesystem>
    <filesystem type='mount' accessmode='passthrough'>
      <source dir='/usr/src'/>
      <target dir='/usr/src'/>
    </filesystem>
    <filesystem type='mount' accessmode='passthrough'>
      <source dir='/home'/>
      <target dir='/home'/>
    </filesystem>
    <interface type='bridge'>
      <mac address='82:00:00:00:01:01'/>
      <source bridge='br0'/>
      <target dev='veth0'/>
    </interface>
    <console type='pty'>
      <target type='lxc' port='0'/>
    </console>
  </devices>
  <seclabel type='none'/>
</domain>

ostara ~ # virsh -c lxc:/// dumpxml dwj-lnx-dev
<domain type='lxc' id='9441'>
  <name>dwj-lnx-dev</name>
  <uuid>fbcd8c3a-9939-12b4-727d-5d3526bc448f</uuid>
  <memory unit='KiB'>500000</memory>
  <currentMemory unit='KiB'>500000</currentMemory>
  <vcpu placement='static'>2</vcpu>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64'>exe</type>
    <init>/sbin/init</init>
  </os>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/libexec/libvirt_lxc</emulator>
    <filesystem type='mount' accessmode='passthrough'>
      <source dir='/vm/lxc/dwj-lnx-dev'/>
      <target dir='/'/>
    </filesystem>
    <filesystem type='mount' accessmode='passthrough'>
      <source dir='/usr/portage'/>
      <target dir='/usr/portage'/>
    </filesystem>
    <filesystem type='mount' accessmode='passthrough'>
      <source dir='/usr/src'/>
      <target dir='/usr/src'/>
    </filesystem>
    <filesystem type='mount' accessmode='passthrough'>
      <source dir='/home'/>
      <target dir='/home'/>
    </filesystem>
    <interface type='bridge'>
      <mac address='82:00:00:00:01:00'/>
      <source bridge='br0'/>
      <target dev='veth0'/>
    </interface>
    <console type='pty' tty='/dev/pts/3'>
      <source path='/dev/pts/3'/>
      <target type='lxc' port='0'/>
      <alias name='console0'/>
    </console>
  </devices>
  <seclabel type='none'/>
</domain>

ostara ~ # virsh -c lxc:/// dumpxml dwj-lnx-dev > lnx.xml
ostara ~ # virsh -c lxc:/// dumpxml dwj-hfax-dev > hfax.xml
ostara ~ # diff lnx.xml hfax.xml
1,6c1,6
< <domain type='lxc' id='9441'>
<   <name>dwj-lnx-dev</name>
<   <uuid>fbcd8c3a-9939-12b4-727d-5d3526bc448f</uuid>
<   <memory unit='KiB'>500000</memory>
<   <currentMemory unit='KiB'>500000</currentMemory>
<   <vcpu placement='static'>2</vcpu>
---
> <domain type='lxc'>
>   <name>dwj-hfax-dev</name>
>   <uuid>681410de-7b56-41bd-b38d-3c66ce97e7b3</uuid>
>   <memory unit='KiB'>4194304</memory>
>   <currentMemory unit='KiB'>4194304</currentMemory>
>   <vcpu placement='static'>4</vcpu>
21c21
<       <source dir='/vm/lxc/dwj-lnx-dev'/>
---
>       <source dir='/vm/lxc/dwj-hfax-dev'/>
37c37
<       <mac address='82:00:00:00:01:00'/>
---
>       <mac address='82:00:00:00:01:01'/>
41,42c41
<     <console type='pty' tty='/dev/pts/3'>
<       <source path='/dev/pts/3'/>
---
>     <console type='pty'>
44d42
<       <alias name='console0'/>

(After resetting everything, and attempting to boot hfax with dev offline, libvirtd still crashes)

ostara ~ # virsh -c lxc:/// list --all
 Id    Name                           State
----------------------------------------------------
 -     dwj-hfax-dev                   shut off
 -     dwj-lnx-dev                    shut off
 -     vm1                            shut off

ostara ~ # virsh -c lxc:/// start dwj-hfax-dev
error: Failed to start domain dwj-hfax-dev
error: End of file while reading data: Input/output error
error: One or more references were leaked after disconnect from the hypervisor
error: Failed to reconnect to the hypervisor

The debug log ends with this:

2013-07-12 16:43:31.740+0000: 21365: debug : virCgroupMakeGroup:708 : Make group /machine/dwj-hfax-dev.libvirt-lxc
2013-07-12 16:43:31.740+0000: 21365: debug : virCgroupMakeGroup:729 : Make controller /sys/fs/cgroup/cpu/machine/dwj-hfax-dev.libvirt-lxc/
2013-07-12 16:43:31.740+0000: 21365: debug : virCgroupMakeGroup:729 : Make controller /sys/fs/cgroup/cpuacct/machine/dwj-hfax-dev.libvirt-lxc/
2013-07-12 16:43:31.740+0000: 21365: debug : virCgroupMakeGroup:729 : Make controller /sys/fs/cgroup/cpuset/machine/dwj-hfax-dev.libvirt-lxc/
2013-07-12 16:43:31.740+0000: 21365: debug : virCgroupMakeGroup:729 : Make controller /sys/fs/cgroup/memory/machine/dwj-hfax-dev.libvirt-lxc/
2013-07-12 16:43:31.740+0000: 21365: debug : virCgroupMakeGroup:729 : Make controller /sys/fs/cgroup/devices/machine/dwj-hfax-dev.libvirt-lxc/
2013-07-12 16:43:31.740+0000: 21365: debug : virCgroupMakeGroup:729 : Make controller /sys/fs/cgroup/freezer/machine/dwj-hfax-dev.libvirt-lxc/
2013-07-12 16:43:31.740+0000: 21365: debug : virCgroupMakeGroup:729 : Make controller /sys/fs/cgroup/blkio/machine/dwj-hfax-dev.libvirt-lxc/
2013-07-12 16:43:31.740+0000: 21365: debug : virCgroupMakeGroup:715 : Skipping unmounted controller net_cls
2013-07-12 16:43:31.740+0000: 21365: debug : virCgroupMakeGroup:729 : Make controller /sys/fs/cgroup/perf_event/machine/dwj-hfax-dev.libvirt-lxc/
2013-07-12 16:43:31.740+0000: 21365: debug : virCgroupMakeGroup:779 : Done making controllers for group
2013-07-12 16:43:31.740+0000: 21365: debug : virFileMakePathHelper:1995 : path=/var/log/libvirt/lxc mode=0777
2013-07-12 16:43:31.740+0000: 21365: debug : virLXCProcessStart:1096 : Setting current domain def as transient
2013-07-12 16:43:31.741+0000: 21365: debug : virLXCProcessStart:1121 : Preparing host devices
2013-07-12 16:43:31.741+0000: 21365: debug : virLXCProcessStart:1139 : Generating domain security label (if required)
====== end of log =====

Segmentation fault (core dumped)

(gdb) bt
#0  0x00007fe4750c5d76 in __strcmp_sse42 () from /lib64/libc.so.6
#1  0x00007fe47578ad31 in virSecurityManagerGenLabel () from /usr/lib64/libvirt.so.0
#2  0x00007fe46aa92979 in virLXCProcessStart () from /usr/lib64/libvirt/connection-driver/libvirt_driver_lxc.so
#3  0x00007fe46aa9736e in lxcDomainCreateWithFlags () from /usr/lib64/libvirt/connection-driver/libvirt_driver_lxc.so
#4  0x00007fe47569c067 in virDomainCreate () from /usr/lib64/libvirt.so.0
#5  0x00007fe4760d5578 in remoteDispatchDomainCreateHelper ()
#6  0x00007fe4756fcd78 in virNetServerProgramDispatch () from /usr/lib64/libvirt.so.0
#7  0x00007fe4756f7302 in virNetServerProcessMsg () from /usr/lib64/libvirt.so.0
#8  0x00007fe4756f7a93 in virNetServerHandleJob () from /usr/lib64/libvirt.so.0
#9  0x00007fe47560f95e in virThreadPoolWorker () from /usr/lib64/libvirt.so.0
#10 0x00007fe47560efc6 in virThreadHelper () from /usr/lib64/libvirt.so.0
#11 0x00007fe475354da6 in start_thread () from /lib64/libpthread.so.0
#12 0x00007fe47508d99d in clone () from /lib64/libc.so.6
(gdb)

Update: I am able to edit the XML in "dwj-hfax-dev" such that libvirtd no longer crashes, and edit the XML for "dwj-lnx-dev" such that it will crash.

The presence of "<seclabel type='none'/>" near the bottom causes libvirtd to crash. I do not recall ever manually adding that to my domain.

In any event, libvirtd should probably not crash due to the XML element (which seems valid - or at least "virsh edit" allows it).

On 12.07.2013 20:32, Dennis Jenkins wrote:
Update: I am able to edit the XML in "dwj-hfax-dev" such that libvirtd no longer crashes, and edit the XML for "dwj-lnx-dev" such that it will crash.
The presence of "<seclabel type='none'/>" near the bottom causes libvirtd to crash.
I do not recall ever manually adding that to my domain.
In any event, libvirtd should probably not crash due to the XML element (which seems valid - or at least "virsh edit" allows it).
Interesting. If you are still able to reproduce the crash, can you try to get the line number within virSecurityManagerGenLabel where the crash happened? I think it's the STREQ line (line 440). The question is whether model or name is NULL.

src/security/security_manager.c-438-    for (i = 0; i < vm->nseclabels; i++) {
src/security/security_manager.c-439-        for (j = 0; sec_managers[j]; j++)
src/security/security_manager.c-440-            if (STREQ(vm->seclabels[i]->model, sec_managers[j]->drv->name))
src/security/security_manager.c-441-                break;
src/security/security_manager.c-442-
src/security/security_manager.c-443-        if (!sec_managers[j]) {
src/security/security_manager.c-444-            virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
src/security/security_manager.c-445-                           _("Unable to find security driver for label %s"),
src/security/security_manager.c-446-                           vm->seclabels[i]->model);
src/security/security_manager.c-447-            goto cleanup;
src/security/security_manager.c-448-        }
src/security/security_manager.c-449-    }

Michal
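If the crash is indeed a NULL model or name reaching strcmp() through STREQ, as suspected above, the lookup loop could be made tolerant of missing strings. The following is only a minimal sketch of that idea, not libvirt's actual code or its eventual fix: the struct types, streq_nullable() and find_driver() are hypothetical stand-ins that model just the fields the quoted loop touches.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for libvirt's internal types; only the
 * fields used by the lookup loop are modelled here. */
struct seclabel { const char *model; };
struct driver   { const char *name; };

/* NULL-tolerant equality: two NULLs compare equal, and a NULL never
 * equals a real string. strcmp() is only called on non-NULL input. */
static int streq_nullable(const char *a, const char *b)
{
    if (!a || !b)
        return a == b;
    return strcmp(a, b) == 0;
}

/* Return the index of the first driver whose name matches
 * label->model, or -1 if none matches. The driver array is
 * NULL-terminated, like sec_managers in the quoted loop. */
static int find_driver(const struct seclabel *label,
                       const struct driver *const *drivers)
{
    assert(label && drivers);

    for (int j = 0; drivers[j]; j++)
        if (streq_nullable(label->model, drivers[j]->name))
            return j;
    return -1;
}
```

With this shape, a label whose model is NULL simply fails to match any named driver and find_driver() returns -1, so the caller can report a configuration error instead of the daemon dereferencing a NULL pointer.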

On Mon, Jul 15, 2013 at 3:18 AM, Michal Privoznik <mprivozn@redhat.com> wrote:
Interesting. If you are still able to reproduce the crash, can you try to get the line number within virSecurityManagerGenLabel where the crash happened? I think it's the STREQ line (440 linenr). Question is whether model or name is NULL.
I'll try.
I'm not sure why GDB failed to list line numbers in the backtrace. I will recompile libvirt with "-O0 -g3" and try again.
I'm running libvirt on my Gentoo development server, built from portage. Instead of tinkering with portage and rebuilding libvirt, I thought that I would just try the latest pull from git. "./configure" fails, unable to find an input file. I'll try again, using the same source tarball as listed in Gentoo's ebuild.
ostara libvirt # CFLAGS="-O0 -g3" ./configure --with-lxc
config.status: creating libvirt.pc
config.status: creating libvirt.spec
config.status: error: cannot find input file: `mingw32-libvirt.spec.in'

Ah ha! I just learned about "gdb bt full". The existing core dump might have what you need: Line #442. However, the line numbers for the source code in the source tree that my Gentoo system is building from do not match exactly what you listed.
Line #442 for me is the one containing the "STREQ" macro:
    virObjectLock(mgr);
    for (i = 0; i < vm->nseclabels; i++) {
        for (j = 0; sec_managers[j]; j++)
            if (STREQ(vm->seclabels[i]->model, sec_managers[j]->drv->name))
                break;
I can rebuild with "-O0" and try again. If I can still trigger the crash, the backtrace might have useful values for the optimized variables. I'll post again in a few minutes.
(gdb) bt full
#0  0x00007fe4750c5d76 in __strcmp_sse42 () from /lib64/libc.so.6
No symbol table info available.
#1  0x00007fe47578ad31 in virSecurityManagerGenLabel (mgr=0x7fe4640acfa0, vm=0x7fe4640c5e40) at security/security_manager.c:442
        ret = -1
        i = <optimized out>
        j = <optimized out>
        sec_managers = 0x7fe458001880
        seclabel = <optimized out>
        generated = false
        __FUNCTION__ = "virSecurityManagerGenLabel"
        __func__ = "virSecurityManagerGenLabel"
#2  0x00007fe46aa92979 in virLXCProcessStart (conn=0x7fe460000a80, driver=0x7fe4640bd340, vm=0x7fe4640c1610, autoDestroy=false, reason=VIR_DOMAIN_RUNNING_BOOTED) at lxc/lxc_process.c:1144
        rc = -1
        r = <optimized out>
        nttyFDs = 1
        ttyFDs = 0x7fe458001790
        i = <optimized out>
        logfile = 0x7fe458000ad0 "/var/log/libvirt/lxc/dwj-hfax-dev.log"
        logfd = -1
        nveths = 0
        veths = 0x0
        handshakefds = {-1, -1}
        pos = -1
        ebuf = "\000\000\000\000\344\177\000\000\020\000\000\000\000\000\000\000\376\377\377\377\377\377\377\377\000\000\000\000\000\000\000\000\377\377\377\377", '\000' <repeats 12 times>"\364, \377\377\377\377\377\377\377\220Y|u\344\177\000\000\034\313\376t\344\177\000\000\000\000\000\000\020\000\000\000\217\b~u\344\177\000\000\000\000\000\000\000\000\000\000\250\246\327o\344\177\000\000t\b~u\344\177\000\000\000\000\000\000\000\000\000\000\020", '\000' <repeats 15 times>, "P\251\327o", '\000' <repeats 12 times>, "\006\247\327o\344\177\000\000\034\313\376t\344\177\000\000\000\000\000\000\344\177\000\000C\342{u\344\177\000\000x\247\327o\344\177\000\000\b\247\327o2739\000\342{u\344\177\000\000\000\000\000\000\000\000\000\000x", '\000' <repeats 15 times>"\247, Z\016v\000\000\000\000(\000\000\000\060\000\000\000\200\246\327o\344\177\000\000\300\245\327o\344\177\000\000\n", '\000' <repeats 31 times>"\227"...
        timestamp = <optimized out>
        cmd = 0x0
        priv = 0x7fe464022500
        err = 0x0
        __FUNCTION__ = "virLXCProcessStart"
        __func__ = "virLXCProcessStart"
---Type <return> to continue, or q <return> to quit---
#3  0x00007fe46aa9736e in lxcDomainCreateWithFlags (dom=0x7fe4580008c0, flags=<optimized out>) at lxc/lxc_driver.c:1054
        driver = 0x7fe4640bd340
        vm = 0x7fe4640c1610
        event = 0x0
        ret = -1
        __FUNCTION__ = "lxcDomainCreateWithFlags"

On Mon, Jul 15, 2013 at 7:37 AM, Dennis Jenkins <dennis.jenkins.75@gmail.com> wrote:
Ah ha! I just learned about "gdb bt full". The existing core dump might have what you need: Line #442. However, the line numbers for the source code in the source tree that my Gentoo system is building from do not match exactly what you listed.
Line #442 for me is the one containing the "STREQ" macro:
    virObjectLock(mgr);
    for (i = 0; i < vm->nseclabels; i++) {
        for (j = 0; sec_managers[j]; j++)
            if (STREQ(vm->seclabels[i]->model, sec_managers[j]->drv->name))
                break;
I can rebuild with "-O0" and try again. If I can still trigger the crash, the backtrace might have useful values for the optimized variables. I'll post again in a few minutes.
TL;DR: "vm->seclabels[i]->model" is NULL.
(gdb) bt full
#0  0x00007ffff6fc5d76 in __strcmp_sse42 () from /lib64/libc.so.6
No symbol table info available.
#1  0x00007ffff76e2ae2 in virSecurityManagerGenLabel (mgr=0x7fffe40bef20, vm=0x7fffd00097a0) at security/security_manager.c:442
        ret = -1
        i = 0
        j = 0
        sec_managers = 0x7fffd8003e60
        seclabel = 0x7ffff74c4911 <virAllocN+54>
        generated = false
        __FUNCTION__ = "virSecurityManagerGenLabel"
        __func__ = "virSecurityManagerGenLabel"
(gdb) list 436,450
436         if ((sec_managers = virSecurityManagerGetNested(mgr)) == NULL)
437             return ret;
438
439         virObjectLock(mgr);
440         for (i = 0; i < vm->nseclabels; i++) {
441             for (j = 0; sec_managers[j]; j++)
442                 if (STREQ(vm->seclabels[i]->model, sec_managers[j]->drv->name))
443                     break;
444
445             if (!sec_managers[j]) {
446                 virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
447                                _("Unable to find security driver for label %s"),
448                                vm->seclabels[i]->model);
449                 goto cleanup;
450             }
(gdb) frame 1
#1  0x00007ffff76e2ae2 in virSecurityManagerGenLabel (mgr=0x7fffe40bef20, vm=0x7fffd00097a0) at security/security_manager.c:442
442                 if (STREQ(vm->seclabels[i]->model, sec_managers[j]->drv->name))
(gdb) print vm->nseclabels
$1 = 1
(gdb) print sec_managers
$2 = (virSecurityManagerPtr *) 0x7fffd8003e60
(gdb) print sec_managers[0]
$3 = (virSecurityManagerPtr) 0x7fffe40bef20
(gdb) print vm->seclabels[i]->model
$4 = 0x0
(gdb) print sec_managers[j]->drv->name
$5 = 0x7ffff77729c0 "none"

On Mon, Jul 15, 2013 at 7:53 AM, Dennis Jenkins <dennis.jenkins.75@gmail.com> wrote:
TL;DR: "vm->seclabels[i]->model" is NULL.
It probably would have been helpful had I posted everything in the struct:
(gdb) print vm->seclabels[i]
$6 = (virSecurityLabelDefPtr) 0x7fffd0009dc0
(gdb) print *(vm->seclabels[i])
$7 = {model = 0x0, label = 0x0, imagelabel = 0x0, baselabel = 0x0, type = 1, norelabel = true, implicit = false}
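The struct dump suggests a simple pre-check: scan the labels for a missing model before any name comparison runs. The sketch below assumes a cut-down struct that only mirrors the field names gdb printed; seclabel_sketch and first_label_without_model are hypothetical names, not libvirt definitions, and this is not a claim about where the real fix belongs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative mirror of the fields shown in the gdb dump above.
 * This is NOT libvirt's real definition. */
struct seclabel_sketch {
    const char *model;
    const char *label;
    const char *imagelabel;
    const char *baselabel;
    int type;        /* printed as 1 in the dump; enum meaning not asserted here */
    bool norelabel;
    bool implicit;
};

/* Hypothetical check: return the index of the first label whose model
 * is NULL, or -1 if every label has a model. */
static int first_label_without_model(const struct seclabel_sketch *labels,
                                     size_t n)
{
    assert(labels != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (labels[i].model == NULL)
            return (int)i;
    }
    return -1;
}
```

Fed a single label with the dumped values (all pointers NULL, type = 1, norelabel = true), the check returns 0, which is the i = 0 entry the backtrace crashed on.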

On 15.07.2013 14:37, Dennis Jenkins wrote:
Ah ha! I just learned about "gdb bt full". The existing core dump might have what you need: Line #442. However, the line numbers for the source code in the source tree that my Gentoo system is building from do not match exactly what you listed.
Line #442 for me is the one containing the "STREQ" macro:
    virObjectLock(mgr);
    for (i = 0; i < vm->nseclabels; i++) {
        for (j = 0; sec_managers[j]; j++)
            if (STREQ(vm->seclabels[i]->model, sec_managers[j]->drv->name))
                break;
I can rebuild with "-O0" and try again. If I can still trigger the crash, the backtrace might have useful values for the optimized variables. I'll post again in a few minutes.
So I've managed to reproduce this and found the root cause for this problem. Will post patch shortly.
Michal