Hi Guido,
On Sun, 2016-12-25 at 00:21 +0100, Guido Günther wrote:
On Sat, Dec 24, 2016 at 05:14:44PM +0100, Guido Günther wrote:
> Hi Cedric,
> On Wed, Dec 21, 2016 at 02:36:39PM +0100, Cedric Bosdonnat wrote:
> > Hey Christian,
> >
> > On Tue, 2016-12-20 at 12:29 +0100, Christian Ehrhardt wrote:
> > > Hi,
> > > I found an issue in libvirt related to libvirt-lxc, but fail to find the
> > > root cause.
> > >
> > > The TL;DR is: libvirt-lxc guests get killed on libvirt restart due to
> > > "internal error: No valid cgroup for machine"
> > >
> > > I was able to reproduce it with libvirt 1.3.1, 2.4 and 2.5 as packaged in
> > > Ubuntu and Debian.
> > > I wanted to ask for two things:
> > > - wider coverage where this does reproduce
> >
> > I couldn't reproduce here with openSUSE Tumbleweed and libvirt 2.5
> > packages.
>
> I had a short look and it seems like this sequence is killing all running
> libvirt-lxc guests reliably:
>
> # no lxc guest running yet
> export LIBVIRT_DEFAULT_URI=lxc:///
> DOMAIN=sl
> systemctl daemon-reload
>
> # start lxc guest
> virsh start ${DOMAIN}
> sleep 1 # give vm some time to start
> systemctl restart libvirtd
Using ftrace I can see that systemd moves the process into the wrong
cgroup on start:
systemd-1 [000] .... 652.333068: cgroup_attach_task: dst_root=3 dst_id=80 dst_level=2 dst_path=/system.slice/libvirtd.service pid=4073 comm=libvirt_lxc
systemd-1 [000] .... 652.333117: cgroup_attach_task: dst_root=3 dst_id=80 dst_level=2 dst_path=/system.slice/libvirtd.service pid=4073 comm=libvirt_lxc
systemd-1 [000] .... 652.333160: cgroup_attach_task: dst_root=6 dst_id=80 dst_level=2 dst_path=/system.slice/libvirtd.service pid=4073 comm=libvirt_lxc
systemd-1 [000] .... 652.333203: cgroup_attach_task: dst_root=4 dst_id=107 dst_level=2 dst_path=/system.slice/libvirtd.service pid=4073 comm=libvirt_lxc
systemd-1 [000] .... 652.333245: cgroup_attach_task: dst_root=8 dst_id=80 dst_level=2 dst_path=/system.slice/libvirtd.service pid=4073 comm=libvirt_lxc
systemd-1 [000] .... 652.333286: cgroup_attach_task: dst_root=7 dst_id=84 dst_level=2 dst_path=/system.slice/libvirtd.service pid=4073 comm=libvirt_lxc
I've attached the script to reproduce this and would welcome any
ideas about the root cause.
Thanks for your work: it really helped me reproduce it here too. After quite a
lot of fiddling, I found that it only happens for me if daemon-reload is run
between the container start and the libvirtd restart.
I've also seen that the following doesn't lead to the problem:
virsh start sl
systemctl daemon-reload
systemctl stop libvirtd
libvirtd (manual start)
But after that, if I kill libvirtd and start it using systemctl start libvirtd, then the
container disappears too.
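
One way to compare the working and failing sequences is to check, after each
step, which cgroup the libvirt_lxc process sits in by reading
/proc/<pid>/cgroup. A minimal sketch, assuming a POSIX shell; in_cgroup is a
hypothetical helper written for illustration, not part of libvirt or systemd:

```shell
#!/bin/sh
# in_cgroup FILE PATH (hypothetical helper)
# Succeeds if any line of a /proc/<pid>/cgroup-style FILE names exactly PATH.
# Each line has the form "hierarchy-ID:controller-list:cgroup-path".
in_cgroup() {
    # read -r keeps backslashes intact (systemd escapes unit names with them);
    # the last variable receives the rest of the line, including any colons.
    while IFS=: read -r _id _controllers cg_path; do
        [ "$cg_path" = "$2" ] && return 0
    done < "$1"
    return 1
}
```

For example, with pid set to the PID of the libvirt_lxc process (4073 in the
trace above), `in_cgroup "/proc/$pid/cgroup" /system.slice/libvirtd.service`
succeeding would indicate that the process has been attached to the libvirtd
service cgroup in at least one hierarchy.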
I tried debugging this, but haven't found anything interesting so far. I'll try
again later with a less confused brain ;)
--
Cedric