Quoting Daniel P. Berrange (berrange(a)redhat.com):
On Fri, Feb 14, 2014 at 11:14:39AM +0100, Richard Weinberger wrote:
> Hi!
>
> If we suspend an LXC domain, libvirt freezes all tasks in the cgroup using the process freezer.
> Upon destroy, libvirt tries to kill all tasks using SIGTERM and later SIGKILL, but as they are frozen, the tasks are unkillable.
> This seems to confuse libvirt: all tasks remain, but libvirt forgets the domain.
>
> Here is a small example:
> ---cut---
> lxc-host1:/etc # /opt/libvirt-dev/bin/virsh domstate my3rdcontainer
> paused
>
> lxc-host1:/etc # /opt/libvirt-dev/bin/virsh destroy my3rdcontainer
> error: Failed to destroy domain my3rdcontainer
> error: internal error: Some processes refused to die
>
> lxc-host1:/etc # ps fax
> ...
> 2118 ? Dsl 0:00 /opt/libvirt-dev/lib/libvirt_lxc --name my3rdcontainer --console 19 --security=none --handshake 22 --backgr
> 2128 ? Ds 0:00 \_ /sbin/init
> 2152 ? Ds 0:00 \_ /usr/lib/systemd/systemd-journald
> 2171 ? Ds 0:00 \_ /bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
> 2174 ? Ds 0:00 \_ /usr/lib/systemd/systemd-logind
> 2189 ? Dsl 0:00 \_ /usr/sbin/rsyslogd -n
> 2778 ? Ds 0:00 \_ /usr/sbin/cron -n
> 2782 pts/0 Ds+ 0:00 \_ /sbin/agetty --noclear -s console 115200 38400 9600
> 2786 ? Ds 0:00 \_ /usr/sbin/sshd -D
> ...
> ---cut---
>
> I can think of three options to deal with that.
>
> a) Refuse to destroy a suspended LXC domain
>
> b) Implicitly resume it upon destroy
>
> c) Send a SIGKILL to each task and then thaw all tasks using the process freezer.
> Once a task is woken up, it immediately sees the pending SIGKILL and dies.
>
> I'd vote for c) because I want to destroy an LXC domain without resuming it.
> I.e. I want to kill it to avoid any further IO from the already suspended domain.
Yes, I think c) is the only reasonable option here. Allowing processes
any window where they can continue executing is not ok.
(For the record, that's what lxc does as well. +1)
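
To make the ordering in c) concrete, here is a rough sketch, not libvirt's or lxc's actual code. It assumes the cgroup v1 freezer layout, where a cgroup directory exposes `cgroup.procs` (one PID per line) and `freezer.state` (writing `THAWED` resumes the tasks). The function name `kill_then_thaw` and its `kill` parameter are made up for illustration; the parameter only exists so the ordering can be exercised without real processes.

```python
import os
import signal


def kill_then_thaw(cgroup_dir, kill=os.kill):
    """Queue SIGKILL for every process in a frozen cgroup, then thaw it.

    Hypothetical helper: assumes cgroup_dir is a cgroup v1 freezer
    directory containing 'cgroup.procs' and 'freezer.state'.
    """
    procs_path = os.path.join(cgroup_dir, "cgroup.procs")
    state_path = os.path.join(cgroup_dir, "freezer.state")

    # Read the PIDs while the group is still frozen; frozen tasks
    # cannot fork, so the list cannot grow behind our back.
    with open(procs_path) as f:
        pids = [int(line) for line in f if line.strip()]

    # Step 1: send SIGKILL to each process. The signal stays pending
    # because the tasks are frozen.
    for pid in pids:
        try:
            kill(pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # process already gone, nothing to do

    # Step 2: thaw. Each task wakes up with SIGKILL already pending
    # and should die before running any further userspace code.
    with open(state_path, "w") as f:
        f.write("THAWED")

    return pids
```

The point is only the order: signal first, thaw second. Doing it the other way round would open exactly the window for further execution that we want to avoid.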