Re: [PATCH 8/8] build: Decrease maximum stack frame size to 2048

5 Sep 2023

On 9/5/23 09:43, Peter Krempa wrote:
...
On Mon, Sep 04, 2023 at 16:09:00 +0200, Michal Prívozník wrote:
...
On 8/30/23 13:59, Peter Krempa wrote:
...
After recent cleanups we can now restrict the maximum stack frame size
to 2k.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
---
 meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/meson.build b/meson.build
index 965ada483b..e45f3e2c39 100644
--- a/meson.build
+++ b/meson.build
@@ -248,7 +248,7 @@ alloc_max = run_command(
 )
# sanitizer instrumentation may enlarge stack frames
-stack_frame_size = get_option('b_sanitize') == 'none' ? 4096 : 32768
+stack_frame_size = get_option('b_sanitize') == 'none' ? 2048 : 32768
# array_bounds=2 check triggers false positive on some GCC
 # versions when using sanitizers. Seen on Fedora 34 with
I know this is already pushed but to be honest, I don't really
understand why this is needed. I mean, we certainly do not want large
frames, but IIUC only frames larger than a page are problem (i.e. 4KiB).
Can you please point me to some docs where I can learn more?
The main idea of this is to ensure that we don't have massive stack
frames which could cause a stack overflow.
I'm failing to see what stack size and buffer overflow have in common.
To overwrite the return address you can has as small buffer as 1 byte
and not check the boundaries (obviously). And yet, you can have 2K
buffer on stack but if you check the boundaries when reading into it, no
exploit is possible.
...
But to be honest, it doesn't matter too much whether the actual limit
size is 4k or 2k. It still allows us to have very deep call stack, and a
factor of 2 in the size will not make a real difference.
Yeah. If we care about the actual stack usage (e.g. like kernel does -
it has limit of 8KiB for the whole stack of a kernel thread), then
restricting individual functions is not enough since we have recursive
functions.
...
At one point I was looking at which functions have massive stack frames
and tried avoiding such state. Now this last set of patches was what I
had in my local branch for a long time and decided to finally get rid of
them. As you can see, there were multiple cases of 2k buffers being
stack allocated, so the end result IMO made sense.
Indeed and this reason alone is good enough. It made the code cleaner
too. I was just wondering whether there is some deeper sense.
...
The decreasing of the actual limit to 2k isn't as important as I've
noted but similarly to when we add a syntax check after a full-repo
cleanup it is a way to prevent regressions after a cleanup.
BTW: if you (or anybody else) want to identify functions which have
massive stack usage, I've found clever perl script from kernel.git:

libvirt.git $ objdump -d _build/src/libvirt.so.0.9008.0 | \
kernel.git/scripts/checkstack.pl | \
head -n30

some very interesting functions appear on the list
(virDomainDefFeaturesCheckABIStability()) - but when looking at the
disassembled code directly, it's not just about initial sub 0x68, %rsp;
it's also about how many arguments are passed to functions (esp. with
variable arguments)

Michal