Hi Cedric,
On Wed, Dec 21, 2016 at 02:36:39PM +0100, Cedric Bosdonnat wrote:
Hey Christian,
On Tue, 2016-12-20 at 12:29 +0100, Christian Ehrhardt wrote:
> Hi,
> I found an issue in libvirt related to libvirt-lxc, but fail to find the root
cause.
>
> The TL;DR is: libvirt-lxc guests get killed on libvirt restart due to "internal
error: No valid cgroup for machine"
>
> I was able to reproduce it with libvirt 1.3.1, 2.4 and 2.5 as packages in Ubuntu and
Debian.
> I wanted to ask for two things:
> - wider coverage where this does reproduce
I couldn't reproduce here with openSUSE Tumbleweed and libvirt 2.5 packages.
I had a short look and it seems like this sequence is killing all running
libvirt-lxc guests reliably:
# no lxc guest running yet
export LIBVIRT_DEFAULT_URI=lxc:///
DOMAIN=sl
systemctl daemon-reload
# start lxc guest
virsh start ${DOMAIN}
sleep 1 # give vm some time to start
systemctl restart libvirtd
virsh list | grep -qs "${DOMAIN}[[:space:]]\+running"
# lxc guest gone
The important part is the "systemctl daemon-reload". If one leaves that
out, libvirtd restarts no longer kill off any lxc domains.
The issue is that libvirt on reattach fails virCgroupNewDetectMachine
due to /proc/<pid-of-lxc-container>/cgroup having changed after
libvirtd's restart:
Before systemctl restarts libvirtd:
10:perf_event:/machine/lxc-21383-sl.libvirt-lxc
9:cpuset:/machine/lxc-21383-sl.libvirt-lxc
8:net_cls,net_prio:/machine/lxc-21383-sl.libvirt-lxc
7:pids:/system.slice/libvirtd.service
6:memory:/machine/lxc-21383-sl.libvirt-lxc
5:cpu,cpuacct:/machine/lxc-21383-sl.libvirt-lxc
4:devices:/machine/lxc-21383-sl.libvirt-lxc
3:freezer:/machine/lxc-21383-sl.libvirt-lxc
2:blkio:/machine/lxc-21383-sl.libvirt-lxc
1:name=systemd:/system.slice/libvirtd.service
After systemctl restart libvirtd:
10:perf_event:/machine/lxc-21383-sl.libvirt-lxc
9:cpuset:/machine/lxc-21383-sl.libvirt-lxc
8:net_cls,net_prio:/machine/lxc-21383-sl.libvirt-lxc
7:pids:/system.slice/libvirtd.service
6:memory:/system.slice/libvirtd.service
5:cpu,cpuacct:/system.slice/libvirtd.service
4:devices:/system.slice/libvirtd.service
3:freezer:/machine/lxc-21383-sl.libvirt-lxc
2:blkio:/system.slice/libvirtd.service
1:name=systemd:/system.slice/libvirtd.service
so the process has been moved to different memory, cpu,cpuacct, devices and
blkio cgroups, and therefore libvirtd can't find it anymore. The error in
the log looks like:
debug : virCgroupValidateMachineGroup:333 : Name 'libvirtd.service' for controller
'cpu' does not match 'sl', 'lxc-21383-sl',
'sl.libvirt-lxc', 'machine-lxc\x2dsl.scope' or
'machine-lxc\x2d21383\x2dsl.scope'
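To make the difference between the two /proc/<pid>/cgroup listings above
easier to spot, here is a minimal sketch of a helper that compares two saved
snapshots and prints only the controllers whose path changed. The function
name cgroup_changes and the snapshot file names are made up for illustration;
it assumes the "id:controllers:path" line format shown above and paths that
contain no colon.

```shell
#!/bin/sh
# Hypothetical helper (not part of libvirt or systemd): print every
# controller whose cgroup path differs between two snapshots of
# /proc/<pid>/cgroup. Each input line is expected to look like
# "id:controllers:path".
cgroup_changes() {
    before=$1
    after=$2
    awk -F: '
        # First file: remember the path seen for each controller list.
        NR == FNR { old[$2] = $3; next }
        # Second file: report controllers whose path is now different.
        ($2 in old) && old[$2] != $3 {
            printf "%s: %s -> %s\n", $2, old[$2], $3
        }
    ' "$before" "$after"
}

# Usage sketch; CONTAINER_PID is a placeholder you have to fill in:
#   cat /proc/$CONTAINER_PID/cgroup > /tmp/cgroup.before
#   systemctl restart libvirtd
#   cat /proc/$CONTAINER_PID/cgroup > /tmp/cgroup.after
#   cgroup_changes /tmp/cgroup.before /tmp/cgroup.after
```

Run against the two listings above, this should report the memory,
cpu,cpuacct, devices and blkio lines and nothing else.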
This does _not_ happen if one restarts libvirtd right after the "systemctl
daemon-reload" or if one drops the "systemctl daemon-reload" from the above
example. This also does not happen if one stops libvirtd via systemd but
starts it as /usr/sbin/libvirtd directly. So the problem only shows up when both
* systemctl daemon-reload has been run, and
* libvirtd is restarted via systemctl
I've looked at audit logs and straced pid 1 without spotting
anything. Any ideas where to go looking now?
This is systemd 232.
Cheers,
-- Guido
> - your expertise on the case itself.
It seems that you'll need to check what's going on in virCgroupDetect().
> Steps to reproduce:
> 1. Spawn new KVM Guest of your choice
> 2. install test dependencies
> $ apt-get install libvirt-daemon-system libvirt-clients libxml2-utils
> # or package managers / package names of your chosen os
> 3. run the following sequence as root
> export LIBVIRT_DEFAULT_URI=lxc:///
> cat << EOF > /tmp/smoke-lxc.xml
> <domain type='lxc'>
> <name>sl</name>
> <memory unit='KiB'>256000</memory>
> <currentMemory unit='KiB'>256000</currentMemory>
> <vcpu placement='static'>1</vcpu>
> <os>
> <type>exe</type>
> <init>/bin/bash</init>
> </os>
> <features>
> <privnet/>
> </features>
> <clock offset='utc'/>
> <devices>
> <emulator>/usr/lib/libvirt/libvirt_lxc</emulator>
The emulator should be removed from the config for portability: the
libvirt_lxc path may vary from one distro / arch to another, and libvirt's
lxc driver is able to add it automatically.
--
Cedric