Stack allocation fails when close to heap - C++

Issue Description
The C++ server crashed several times at the same stack frame (frame 0 in the gdb session below).
Investigation
Memory was not exhausted; usage was 400 MB-800 MB.
Stack usage was not large, about 270 KB-350 KB.
No other code issue that could cause the crash has been found.
The distance between the stack and the heap is about 1 MB.
The binary is built as a PIE (position-independent executable).
The OS is 2.6.32-696.16.1.el6.i686 #1 SMP Wed Nov 15 16:16:47 UTC 2017 i686 i686 i386 GNU/Linux.
Stack space used in one crash
(gdb) f 0
#0 0xb717f616 in WDMS_Byte_Stream::write (this=Cannot access memory at
address 0xbfa16b1c) at wdms_bs.cpp:63 in wdms_bs.cpp
(gdb) set $top=$esp
(gdb) f 38
#38 0xb6a1bcbf in main (argc=1, argv=0xbfa5a8b4) at main.cpp:66
main.cpp: No such file or directory.
in main.cpp
(gdb) p $esp-$top
$1 = 277728
(gdb)
Distance between stack and heap
From the process memory map (screenshots omitted):
One heap segment sits close to the stack space: 0xb8a8f000-0xbf917000.
The stack space is 0xbfa18000-0xbfa5c000.
The distance between them:
(gdb) p 0xbfa18000- 0xbf917000
$3 = 1052672
Why it crashed
The program needs to allocate 8268 bytes (0x204c) of stack for the last frame.
The OS only expanded the stack down to 0xbfa18000, so of those 8268 bytes only 2892 fall inside the stack mapping:
(gdb) p $esp+0x204c-0xbfa18000
$19 = (void *) 0xb4c(2892 bytes)
So when the next instruction executes,
call 0xb6a19ec9 <__i686.get_pc_thunk.bx>
the CPU must push the return address at $esp, which now points outside accessible memory.
Why does it always crash at the same stack frame? Probably because this call sequence is the deepest call chain and therefore uses the most stack space.
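For intuition, here is a minimal sketch (hypothetical code, not the server's) that forces the same failure mode: keep allocating ~8 KB frames until the kernel can no longer grow the stack, at which point the first write below the mapping raises SIGSEGV.
/* overflow.c - hypothetical reproducer; compile with -O0 so the
   recursion is not turned into a loop by tail-call optimization. */
#include <string.h>

void grow(void)
{
    char frame[8268];               /* same size as the crashing frame */
    memset(frame, 0, sizeof frame); /* actually touch the frame's pages */
    grow();                         /* recurse until the stack cannot grow */
}

int main(void)
{
    grow();                         /* eventually dies with SIGSEGV */
    return 0;
}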
Question-1
Is this related to the fix for CVE-2017-1000364 (the kernel's enlarged stack guard gap)?
Question-2
Why does the OS place a heap segment so close to the stack? How does it avoid a stack/heap collision rather than simply failing the allocation?

Related

What do the "bytes" in the stack trace of a C++ application mean?

I got a crash dump and a stack trace for an app on Windows 2012 R2, and each stack trace line ends with function + N bytes.
What does this byte offset mean? Can it be used to identify the line in the function's source code where the crash occurred? For example, what does 6092 bytes mean below? It is not the line number.
In other application crashes I have seen, the offset is in hexadecimal.
app.exe caused exception C0000005 (EXCEPTION_ACCESS_VIOLATION) at 000000009067B6BC
Register dump:
Crash dump successfully written to file 'C:\Program Files\App\appcrash.dmp'
Stack Trace:
0x000000009067B6BC (0xA95C44A0 0xA86FCE60 0xA95C44A0 0x00000008) app.exe, appFunction1()+6092 bytes
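For illustration, the offset is the byte distance from the function's entry point to the faulting instruction, not a source line; a sketch of the arithmetic with a hypothetical symbol address:
/* offset.c - hypothetical arithmetic: recover the crash address from
   "appFunction1()+6092 bytes" so a symbolizer can map it to a line. */
#include <stdio.h>

int main(void)
{
    unsigned long long func_start = 0x90679EF0ULL; /* hypothetical address of appFunction1 */
    unsigned long long offset     = 6092;          /* decimal byte offset from the trace */
    printf("crash address: 0x%016llX\n", func_start + offset); /* -> 0x000000009067B6BC */
    return 0;
}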

Invalid cast in gdb (armv7)

I have an issue similar to the one in the question GDB corrupted stack frame - How to debug?:
(gdb) bt
#0 0x76bd6978 in fputs () from /lib/libc.so.6
#1 0x0000b080 in getfunction1 ()
#2 0x0000b080 in getfunction1 ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
Chris Dodd wrote an answer there that pops the saved return address off the top of the stack into the program counter (PC). On a 32-bit x86 machine it would be:
(gdb) set $pc = *(void **)$esp
(gdb) set $esp = $esp + 4
However, after running the first line I get "Invalid cast":
(gdb) set $pc = *(void **)$esp
Invalid cast.
(gdb) set $esp = $esp + 4
Argument to arithmetic operation not a number or boolean.
Why do I get these messages, and what workaround can I use to figure out where the crash occurs? I am working on an ARMv7 machine running Linux.
ESP does not exist on ARM; the stack pointer there is SP (on Cortex-M profiles it is banked as MSP, the Main Stack Pointer, or PSP, the Process Stack Pointer).
ARM Registers
Since ESP does not exist, you get the "Invalid cast" error. If you run the same commands with a valid ARM register, there is no error.
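For example, the same recipe with the ARM stack pointer register (a sketch under the same caveats as the original x86 answer; it assumes the saved return address really sits at the top of the stack):
(gdb) set $pc = *(void **)$sp
(gdb) set $sp = $sp + 4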

arm_dataabort failure when running my program a second time and thereafter

I added my program (it loads a file and does some computation) into the apps of TizenRT on an ARTIK053 board. The program runs successfully the first time, but a data abort failure occurs when I run it a second time. The specific error info is as follows:
arm_dataabort:
Data abort. PC: 040d25a0 DFAR: 00000011 DFSR: 0000080d
up_assert: Assertion failed at file:armv7-r/arm_dataabort.c line: 111 task: ghsom_test
up_dumpstate: Current sp: 020c3eb0
up_dumpstate: User stack:
up_dumpstate: base: 020c3fd0
up_dumpstate: size: 00000fd4
up_dumpstate: used: 00000220
up_dumpstate: User Stack
up_stackdump: 020c3ea0: 00000003 020c3eb0 040c9638 041d38b8 00000000 040c9644 00000011 0000080
.....
.....
up_taskdump: Idle Task: PID=0 Stack Used=1024 of 1024
up_taskdump: hpwork: PID=1 Stack Used=164 of 2028
up_taskdump: lpwork: PID=2 Stack Used=164 of 2028
up_taskdump: logm: PID=3 Stack Used=300 of 2028
up_taskdump: LWIP_TCP/IP: PID=4 Stack Used=228 of 4068
up_taskdump: tash: PID=6 Stack Used=948 of 4076
up_taskdump: ghsom_test: PID=10 Stack Used=616 of 4052
I checked the remaining free RAM space; it is enough for my program. I also added some printing to my main function to check on which line the error comes out. I found that if I comment out some lines before the line where the error occurs, then the next time I run the program the error line moves down a few lines, as if I had released some stack space. So I guess it might be an issue related to the stack size that can be assigned to a single process. Does anyone know the reason, and how to solve the issue? To be clear, it only happens the second time I run the program and thereafter.
With the stack dump you can almost always figure out where the error originated.
Since you have the image file for your build, you can run
arm-none-eabi-addr2line -f -p -i -b build/out/bin/tinyara 0xADDR
where ADDR is one of the relevant addresses from the stack dump.
You can usually check the "current sp" (stack pointer), but it often just points into the arm_dataabort handler shown in the failure above.
Then check the PC address, and also look for addresses in the stack dump (starting from the back of it) that are close to the PC in value.
In your case that could be addresses like (in that order): 040c9644, 041d38b8, 040c9638.
So basically:
arm-none-eabi-addr2line -f -p -i -b build/out/bin/tinyara 0x040c9644
Notice the 0x in front of the address.
The command will give you a good indication of where this address comes from in your binary, for example:
up_idlepm_static at /home/user/tizenrt/os/arch/arm/src/chip/s5j_idle.c:111
(inlined by) up_idle at /home/user/tizenrt/os/arch/arm/src/chip/s5j_idle.c:254
If the address does not point at code, the output will look like:
?? ??:0
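To symbolize several addresses in one go, a small shell loop works (using the candidate addresses from the dump above):
$ for a in 040c9644 041d38b8 040c9638; do arm-none-eabi-addr2line -f -p -i -b build/out/bin/tinyara 0x$a; done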
Hope that helps.

Inconsistencies in values from mallinfo and ps

I am trying to track down huge memory growth in a Linux application that runs around 20-25 threads. From one of those threads I dump the memory statistics using mallinfo. It reports the total allocated space (uordblks) as 1005025904, i.e. about 1 GB. However, top shows 8 GB of total memory and 7 GB resident for the process. Can someone explain this inconsistency?
Following is the full stat from mallinfo:
Total non-mmapped bytes (arena): 1005035520
# of free chunks (ordblks): 2
# of free fastbin blocks (smblks): 0
# of mapped regions (hblks): 43
Bytes in mapped regions (hblkhd): 15769600
Max. total allocated space (usmblks): 0
Free bytes held in fastbins (fsmblks): 0
Total allocated space (uordblks): 1005025904
Total free space (fordblks): 9616
Topmost releasable block (keepcost): 9584
The reason is that mallinfo returns statistics for the main arena only. In a multi-threaded program glibc creates additional arenas; to see totals for all of them, use malloc_stats (a glibc library function, not a system call).
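A minimal sketch of the difference (glibc-specific API; prints the main-arena figure, then per-arena totals):
/* arenas.c - mallinfo covers the main arena; malloc_stats walks all arenas. */
#include <malloc.h>
#include <stdio.h>

int main(void)
{
    struct mallinfo mi = mallinfo(); /* statistics for the main arena only */
    printf("uordblks (main arena allocated): %d\n", mi.uordblks);

    malloc_stats();                  /* prints in-use/system bytes for every arena to stderr */
    return 0;
}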

Valgrind stack misses a function completely

I have two C files:
a.c
void main(){
...
getvtable()->function();
}
The vtable points to a function that is located in b.c:
void function(){
malloc(42);
}
Now if I trace the program in Valgrind I get the following:
==29994== 4,155 bytes in 831 blocks are definitely lost in loss record 26 of 28
==29994== at 0x402CB7A: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==29994== by 0x40A24D2: (below main) (libc-start.c:226)
So the call to function is completely omitted from the stack! How is that possible? When I use GDB, a correct stack including function is shown.
Debug symbols are included; Linux, 32-bit.
Update:
Answering the first question: I get the following output when debugging via Valgrind's GDB server. The breakpoint is never hit, whereas it is hit when I debug directly with GDB.
stasik@gemini:~$ gdb -q
(gdb) set confirm off
(gdb) target remote | vgdb
Remote debugging using | vgdb
relaying data between gdb and process 11665
[Switching to Thread 11665]
0x040011d0 in ?? ()
(gdb) file /home/stasik/leak.so
Reading symbols from /home/stasik/leak.so...done.
(gdb) break function
Breakpoint 1 at 0x110c: file ../../source/leakclass.c, line 32.
(gdb) commands
Type commands for breakpoint(s) 1, one per line.
End with a line saying just "end".
>silent
>end
(gdb) continue
Continuing.
Program received signal SIGTRAP, Trace/breakpoint trap.
0x0404efcb in ?? ()
(gdb) source thread-frames.py
Stack level 0, frame at 0x42348a0:
eip = 0x404efcb; saved eip 0x4f2f544c
called by frame at 0x42348a4
Arglist at 0x4234898, args:
Locals at 0x4234898, Previous frame's sp is 0x42348a0
Saved registers:
ebp at 0x4234898, eip at 0x423489c
Stack level 1, frame at 0x42348a4:
eip = 0x4f2f544c; saved eip 0x6e492056
called by frame at 0x42348a8, caller of frame at 0x42348a0
Arglist at 0x423489c, args:
Locals at 0x423489c, Previous frame's sp is 0x42348a4
Saved registers:
eip at 0x42348a0
Stack level 2, frame at 0x42348a8:
eip = 0x6e492056; saved eip 0x205d6f66
called by frame at 0x42348ac, caller of frame at 0x42348a4
Arglist at 0x42348a0, args:
Locals at 0x42348a0, Previous frame's sp is 0x42348a8
Saved registers:
eip at 0x42348a4
Stack level 3, frame at 0x42348ac:
eip = 0x205d6f66; saved eip 0x61746144
---Type <return> to continue, or q <return> to quit---
called by frame at 0x42348b0, caller of frame at 0x42348a8
Arglist at 0x42348a4, args:
Locals at 0x42348a4, Previous frame's sp is 0x42348ac
Saved registers:
eip at 0x42348a8
Stack level 4, frame at 0x42348b0:
eip = 0x61746144; saved eip 0x65736162
called by frame at 0x42348b4, caller of frame at 0x42348ac
Arglist at 0x42348a8, args:
Locals at 0x42348a8, Previous frame's sp is 0x42348b0
Saved registers:
eip at 0x42348ac
Stack level 5, frame at 0x42348b4:
eip = 0x65736162; saved eip 0x70616d20
called by frame at 0x42348b8, caller of frame at 0x42348b0
Arglist at 0x42348ac, args:
Locals at 0x42348ac, Previous frame's sp is 0x42348b4
Saved registers:
eip at 0x42348b0
Stack level 6, frame at 0x42348b8:
eip = 0x70616d20; saved eip 0x2e646570
called by frame at 0x42348bc, caller of frame at 0x42348b4
Arglist at 0x42348b0, args:
---Type <return> to continue, or q <return> to quit---
Locals at 0x42348b0, Previous frame's sp is 0x42348b8
Saved registers:
eip at 0x42348b4
Stack level 7, frame at 0x42348bc:
eip = 0x2e646570; saved eip 0x0
called by frame at 0x42348c0, caller of frame at 0x42348b8
Arglist at 0x42348b4, args:
Locals at 0x42348b4, Previous frame's sp is 0x42348bc
Saved registers:
eip at 0x42348b8
Stack level 8, frame at 0x42348c0:
eip = 0x0; saved eip 0x0
caller of frame at 0x42348bc
Arglist at 0x42348b8, args:
Locals at 0x42348b8, Previous frame's sp is 0x42348c0
Saved registers:
eip at 0x42348bc
(gdb) continue
Continuing.
Program received signal SIGTRAP, Trace/breakpoint trap.
0x0404efcb in ?? ()
(gdb) continue
Continuing.
I see two possible reasons:
1. Valgrind is using a different stack unwind method than GDB.
2. The address space layout is different when running your program under the two environments, and you're only hitting stack corruption under Valgrind.
We can gain more insight by using Valgrind's builtin gdbserver.
Save this Python snippet to thread-frames.py
import gdb
f = gdb.newest_frame()
while f is not None:
    f.select()
    gdb.execute('info frame')
    f = f.older()
t.gdb
set confirm off
file MY-PROGRAM
break function
commands
silent
end
run
source thread-frames.py
quit
v.gdb
set confirm off
target remote | vgdb
file MY-PROGRAM
break function
commands
silent
end
continue
source thread-frames.py
quit
(Change MY-PROGRAM, function in the scripts above and the commands below as required)
Get details about the stack frames under GDB:
$ gdb -q -x t.gdb
Breakpoint 1 at 0x80484a2: file valgrind-unwind.c, line 6.
Stack level 0, frame at 0xbffff2f0:
eip = 0x80484a2 in function (valgrind-unwind.c:6); saved eip 0x8048384
called by frame at 0xbffff310
source language c.
Arglist at 0xbffff2e8, args:
Locals at 0xbffff2e8, Previous frame's sp is 0xbffff2f0
Saved registers:
ebp at 0xbffff2e8, eip at 0xbffff2ec
Stack level 1, frame at 0xbffff310:
eip = 0x8048384 in main (valgrind-unwind.c:17); saved eip 0xb7e33963
caller of frame at 0xbffff2f0
source language c.
Arglist at 0xbffff2f8, args:
Locals at 0xbffff2f8, Previous frame's sp is 0xbffff310
Saved registers:
ebp at 0xbffff2f8, eip at 0xbffff30c
Get the same data under Valgrind:
$ valgrind --vgdb=full --vgdb-error=0 ./MY-PROGRAM
In another shell:
$ gdb -q -x v.gdb
relaying data between gdb and process 574
0x04001020 in ?? ()
Breakpoint 1 at 0x80484a2: file valgrind-unwind.c, line 6.
Stack level 0, frame at 0xbe88e2c0:
eip = 0x80484a2 in function (valgrind-unwind.c:6); saved eip 0x8048384
called by frame at 0xbe88e2e0
source language c.
Arglist at 0xbe88e2b8, args:
Locals at 0xbe88e2b8, Previous frame's sp is 0xbe88e2c0
Saved registers:
ebp at 0xbe88e2b8, eip at 0xbe88e2bc
Stack level 1, frame at 0xbe88e2e0:
eip = 0x8048384 in main (valgrind-unwind.c:17); saved eip 0x4051963
caller of frame at 0xbe88e2c0
source language c.
Arglist at 0xbe88e2c8, args:
Locals at 0xbe88e2c8, Previous frame's sp is 0xbe88e2e0
Saved registers:
ebp at 0xbe88e2c8, eip at 0xbe88e2dc
If GDB can successfully unwind the stack while connected to Valgrind's gdbserver, then it's a problem with Valgrind's own stack unwind algorithm. You can inspect the "info frame" output carefully for inlined and tail-call frames or some other reason that could throw Valgrind off. Otherwise it's probably stack corruption.
OK, compiling all the .so parts and the main program with an explicit -O0 seems to solve the problem. It seems that some optimization in the 'core' program that was loading the .so (the .so itself was always compiled unoptimized) was breaking the stack trace.
This is tail-call optimization in action.
The function function calls malloc as the very last thing it does. The compiler sees this and discards the stack frame for function before it calls malloc. The advantage is that when malloc returns, it returns directly to whichever function called function, instead of returning to function only to hit yet another return instruction.
Here the optimization merely removed an unnecessary jump and made stack usage slightly more efficient, which is nice, but for a recursive tail call this optimization is a huge win, because it turns recursion into something much more like iteration.
As you've discovered already, disabling optimization makes debugging much easier. If you want to debug optimized code (for performance testing, perhaps), then, as @Zang MingJie already said, you can disable this one optimization with -fno-optimize-sibling-calls.
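A minimal sketch of the effect (hypothetical file name; inspect the generated assembly to see the difference):
/* tail.c - at -O2 the tail call to malloc can be emitted as a jump,
   so malloc returns straight to function's caller and "function"
   never shows up in an unwound stack. */
#include <stdlib.h>

void *function(void)
{
    return malloc(42);  /* tail position: may compile to "jmp malloc" */
}

int main(void)
{
    free(function());
    return 0;
}
$ gcc -O2 -S tail.c                              # function ends in "jmp malloc"
$ gcc -O2 -fno-optimize-sibling-calls -S tail.c  # function uses "call malloc" + "ret"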