On Fri, Aug 04, 2023 at 03:38:07PM +0200, Michal Privoznik wrote:
> When spawning a new container (via clone()) we allocate stack for
> lxcContainerChild(). So far, we allocate 4 pages for the stack
> and this used to be enough until we started rewriting everything
> to glib. With glib we switched to g_strerror() which localizes
> errno strings and thus increases stack usage, while the
> previously used strerror_r() was more compact.
>
> We're allocating the stack using g_new0, so when we overflowed
> the stack we started scribbling over other allocations which
> is horrible to diagnose.
>
> Fortunately, the solution is easy - just increase how much stack
> the child can use (16 pages ought to be enough for anybody).

I wonder if we're better off switching to mmap(), allocating
17 pages, and then using mprotect() to remove read+write
perms from the first and/or last page [1], so that any future
overflow will generate SIGSEGV immediately.

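To illustrate the idea, here is a rough, untested-in-libvirt sketch of what
such an allocation could look like. The helper names stack_alloc_guarded()
and stack_free_guarded() are hypothetical and not existing libvirt functions.
It puts a PROT_NONE guard page at both ends (so N + 2 pages rather than
N + 1), which sidesteps the question of which way the stack grows.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* for MAP_ANONYMOUS on glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper (not libvirt API): map `pages` usable stack pages
 * with one inaccessible guard page below and one above.
 * Returns the start of the usable region, or NULL on failure. */
static char *
stack_alloc_guarded(size_t pages, size_t *usable_len)
{
    size_t pagesz = (size_t)sysconf(_SC_PAGESIZE);
    size_t total = (pages + 2) * pagesz;
    char *base = mmap(NULL, total, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (base == MAP_FAILED)
        return NULL;

    /* Drop all access on the first and last page so that an overflow
     * in either direction faults instead of corrupting the heap. */
    if (mprotect(base, pagesz, PROT_NONE) < 0 ||
        mprotect(base + total - pagesz, pagesz, PROT_NONE) < 0) {
        munmap(base, total);
        return NULL;
    }

    *usable_len = pages * pagesz;
    return base + pagesz;
}

/* Release a region obtained from stack_alloc_guarded(), guards included.
 * Returns 0 on success, -1 on failure (as munmap() does). */
static int
stack_free_guarded(char *usable, size_t usable_len)
{
    size_t pagesz = (size_t)sysconf(_SC_PAGESIZE);

    return munmap(usable - pagesz, usable_len + 2 * pagesz);
}
```

On an arch where the stack grows down, the pointer handed to clone() would
be the returned address plus usable_len. The mapping can no longer be
released via g_autofree and needs an explicit munmap() on every exit path.
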
> Resolves: https://gitlab.com/libvirt/libvirt/-/issues/511
> Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com>
> ---
>  src/lxc/lxc_container.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/lxc/lxc_container.c b/src/lxc/lxc_container.c
> index 63cf283285..f741a754ce 100644
> --- a/src/lxc/lxc_container.c
> +++ b/src/lxc/lxc_container.c
> @@ -2132,7 +2132,7 @@ int lxcContainerStart(virDomainDef *def,
>  {
>      pid_t pid;
>      int cflags;
> -    int stacksize = getpagesize() * 4;
> +    int stacksize = getpagesize() * 16;
>      g_autofree char *stack = NULL;
>      char *stacktop;
>      lxc_child_argv_t args = {
> --
> 2.41.0

With regards,
Daniel

[1] first or last - arches differ on whether the stack grows up vs down, IIRC