More heaps found in each dump, where do they come from? - c++

I am investigating a native memory leak through WinDbg.
With consecutive dumps, !heap -s returns more heaps every dump. Number of heaps returned in the first dump: 1170, second dump: 1208.
There are three sizes of heaps that come back a lot. Here are the statistics for one of those heaps:
0:000> !heap -stat -h 2ba60000
heap # 2ba60000
group-by: TOTSIZE max-display: 20
size #blocks total ( %) (percent of total busy bytes)
1ffa 1 - 1ffa (35.44)
1000 1 - 1000 (17.73)
a52 1 - a52 (11.44)
82a 1 - 82a (9.05)
714 1 - 714 (7.84)
64c 1 - 64c (6.98)
Most blocks refer to the same callstack:
777a5887 ntdll!RtlAllocateHeap
73f9f1de sqlsrv32!SQLAllocateMemory
73fc0370 sqlsrv32!SQLAllocConnect
73fc025d sqlsrv32!SQLAllocHandle
74c6a146 odbc32!GetInfoForConnection
74c6969d odbc32!SQLInternalDriverConnectW
74c6bc24 odbc32!SQLDriverConnectW
74c63141 odbc32!SQLDriverConnect
When will a new heap be created, and how would you dive deeper into this to find the root cause?

If you are able to do live debugging, you can try to set a breakpoint:
bp ntdll!RtlCreateHeap "kc;gc"
This will display the call stack and continue; maybe you will see the culprit.
Do the same with ntdll!RtlDebugCreateHeap.
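A minimal sketch of that approach, assuming you can attach to the live process (the log file path is just an example, not from the original setup):
* example log path; adjust as needed
.logopen c:\temp\heap-create.log
bp ntdll!RtlCreateHeap "kc;gc"
bp ntdll!RtlDebugCreateHeap "kc;gc"
g
* ... reproduce the heap growth, then stop logging:
.logclose
Comparing the logged call stacks against the !heap -s output from two consecutive dumps should show which component keeps creating heaps.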

Related

Solaris 12.3 C++ compiler out of memory

I have a SWIG-generated C++ code file of 24 MB, nearly 500,000 lines of code. I am able to compile it when I set the compiler optimization level to -xO0, but it fails as soon as I add any other C++ compiler flags (like -xprofile ...). I am using the Solaris Studio 12.3 C++ compiler.
Below is the console error:
Element size (in bytes): 48
Table size (in elements): 2560000
Table maximum size: 134217727
Table size increment: 5000
Bytes written to disk: 0
Expansions required: 9
Segments used: 1
Max Segments used: 1
Max Segment offset: 134217727
Segment offset size:: 27
Resizes made: 0
Copies due to expansions: 4
Reset requests: 0
Allocation requests: 2827527
Deallocation requests: 267537
Allocated element count: 4086
Free element count: 2555914
Unused element count: 0
Free list size (elements): 0
ir2hf: error: Out of memory
Thanks in Advance.
I found this article suggesting that it has to do with the way Solaris limits the amount of memory available for data segments.
Following the steps in the blog, try to remove that limit.
$ usermod -K defaultpriv=basic,sys_resource karel
Now log off and log on again, then change the limit:
$ ulimit -d unlimited
Then check that the limit has changed
$ ulimit -d
The output should be unlimited
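Putting the steps together, a minimal session sketch (assuming karel is the build user and that the usermod step is run with root privileges):
$ usermod -K defaultpriv=basic,sys_resource karel   # run as root
# log off and log on again as karel, then:
$ ulimit -d unlimited
$ ulimit -d
unlimited
With the data segment limit lifted, retry the compilation with the extra flags.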

Limiting Java 8 Memory Consumption

I'm running three Java 8 JVMs on a 64 bit Ubuntu VM which was built from a minimal install with nothing extra running other than the three JVMs. The VM itself has 2GB of memory and each JVM was limited by -Xmx512M which I assumed would be fine as there would be a couple of hundred MB spare.
A few weeks ago, one crashed and the hs_err_pid dump showed:
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 196608 bytes for committing reserved memory.
# Possible reasons:
# The system is out of physical RAM or swap space
# In 32 bit mode, the process size limit was hit
# Possible solutions:
# Reduce memory load on the system
# Increase physical memory or swap space
# Check if swap backing store is full
# Use 64 bit Java on a 64 bit OS
# Decrease Java heap size (-Xmx/-Xms)
# Decrease number of Java threads
# Decrease Java thread stack sizes (-Xss)
# Set larger code cache with -XX:ReservedCodeCacheSize=
# This output file may be truncated or incomplete.
I restarted that JVM with a reduced heap size of 384 MB and so far everything is fine. However, when I now look at the VM using the ps command and sort by descending RSS size, I see:
RSS %MEM VSZ PID CMD
708768 35.4 2536124 29568 java -Xms64m -Xmx512m ...
542776 27.1 2340996 12934 java -Xms64m -Xmx384m ...
387336 19.3 2542336 6788 java -Xms64m -Xmx512m ...
12128 0.6 288120 1239 /usr/lib/snapd/snapd
4564 0.2 21476 27132 -bash
3524 0.1 5724 1235 /sbin/iscsid
3184 0.1 37928 1 /sbin/init
3032 0.1 27772 28829 ps ax -o rss,pmem,vsz,pid,cmd --sort -rss
3020 0.1 652988 1308 /usr/bin/lxcfs /var/lib/lxcfs/
2936 0.1 274596 1237 /usr/lib/accountsservice/accounts-daemon
..
..
and the free command shows
total used free shared buff/cache available
Mem: 1952 1657 80 20 213 41
Swap: 0 0 0
Taking the first process as an example, there is an RSS size of 708768 KB even though the heap limit would be 524288 KB (512*1024).
I am aware that extra memory is used over and above the JVM heap, but the question is: how can I control this to ensure I do not run out of memory again? I am trying to set the heap size for each JVM as large as I can without crashing them.
Or is there a good general guideline as to how to set the JVM heap size in relation to overall memory availability?
There does not appear to be a way of controlling how much extra memory the JVM will use over the heap. However by monitoring the application over a period of time, a good estimate of this amount can be obtained. If the overall consumption of the java process is higher than desired, then the heap size can be reduced. Further monitoring is needed to see if this impacts performance.
Continuing with the example above and using the command ps ax -o rss,pmem,vsz,pid,cmd --sort -rss we see usage as of today is
RSS %MEM VSZ PID CMD
704144 35.2 2536124 29568 java -Xms64m -Xmx512m ...
429504 21.4 2340996 12934 java -Xms64m -Xmx384m ...
367732 18.3 2542336 6788 java -Xms64m -Xmx512m ...
13872 0.6 288120 1239 /usr/lib/snapd/snapd
..
..
These java processes are all running the same application but with different data sets. The first process (29568) has stayed stable using about 190M beyond the heap limit while the second (12934) has reduced from 156M to 35M. The total memory usage of the third has stayed well under the heap size which suggests the heap limit could be reduced.
It would seem that allowing 200 MB of extra non-heap memory per Java process here would be more than enough, as that gives 600 MB of leeway in total. Subtracting this from 2 GB leaves 1400 MB, so the three -Xmx parameter values combined should stay below this amount.
As can be gleaned from the article pointed out in a comment by Fairoz, there are many different ways in which the JVM can use non-heap memory. One of these that is measurable, though, is the thread stack size. The default for a JVM can be found on Linux using
java -XX:+PrintFlagsFinal -version | grep ThreadStackSize
In the case above it is 1 MB, and as there are about 25 threads, we can safely say that at least 25 MB extra will always be required.
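For a more detailed breakdown of where the non-heap memory goes, one option (not part of the setup above, so treat it as a sketch) is Java 8's Native Memory Tracking: start the JVM with the tracking flag and query it with jcmd. The jar name and the PID below are placeholders.
# placeholder jar name; in practice, add the flag to the existing java command line
java -XX:NativeMemoryTracking=summary -Xms64m -Xmx384m -jar myapp.jar
# query the running JVM (replace <pid> with its process id)
jcmd <pid> VM.native_memory summary
The summary reports heap, thread stacks, code cache, GC and other JVM-internal categories; memory allocated directly by native libraries still falls outside it, so the ps-based monitoring above remains useful.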

arm_dataabort failure when running my program a second time and thereafter

I added my program (it loads a file and does some computation) to the apps of TizenRT on an ARTIK053. The program runs successfully the first time, but a data abort failure occurs when it is run a second time. The specific error info is as follows:
arm_dataabort:
Data abort. PC: 040d25a0 DFAR: 00000011 DFSR: 0000080d
up_assert: Assertion failed at file:armv7-r/arm_dataabort.c line: 111 task: ghsom_test
up_dumpstate: Current sp: 020c3eb0
up_dumpstate: User stack:
up_dumpstate: base: 020c3fd0
up_dumpstate: size: 00000fd4
up_dumpstate: used: 00000220
up_dumpstate: User Stack
up_stackdump: 020c3ea0: 00000003 020c3eb0 040c9638 041d38b8 00000000 040c9644 00000011 0000080
.....
.....
up_taskdump: Idle Task: PID=0 Stack Used=1024 of 1024
up_taskdump: hpwork: PID=1 Stack Used=164 of 2028
up_taskdump: lpwork: PID=2 Stack Used=164 of 2028
up_taskdump: logm: PID=3 Stack Used=300 of 2028
up_taskdump: LWIP_TCP/IP: PID=4 Stack Used=228 of 4068
up_taskdump: tash: PID=6 Stack Used=948 of 4076
up_taskdump: ghsom_test: PID=10 Stack Used=616 of 4052
I checked the remaining free RAM space and it is enough for my program. I also added some printing to my main function to check on which line the error comes out. I found that if I comment out some lines before the line where the error occurs, then the next time I run the program the error line moves down by a few lines. It seems as if I had released some stack space, so I guess it might be an issue related to the stack size that I can assign to a single proc. Does anyone know the reason, and how to solve the issue? To be clear, it only happens the second time I run the program and thereafter.
With the stack dump you can almost always figure out where the error originated.
Since you have the image file for your build, you can do:
arm-none-eabi-addr2line -f -p -i -b build/out/bin/tinyara 0xADDR
where ADDR is one of the relevant addresses from the stack dump.
You can usually check the "current sp" (stack pointer), but it often just points to the arm_dataabort handling shown in the failure above.
Then check the PC address, and also look for addresses in the stack dump (starting from the back of it) that look close to the PC in value.
In your case that could be addresses like (in that order): 040c9644, 041d38b8, 040c9638
So basically:
arm-none-eabi-addr2line -f -p -i -b build/out/bin/tinyara 0x040c9644
Notice the 0x in front of the address.
The command will give you a good indication of where this address comes from in your binary, like:
up_idlepm_static at /home/user/tizenrt/os/arch/arm/src/chip/s5j_idle.c:111
(inlined by) up_idle at /home/user/tizenrt/os/arch/arm/src/chip/s5j_idle.c:254
If the address does not point to code, the output will instead look like:
?? ??:0
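To resolve several candidate addresses in one go, a small shell loop works (assuming a POSIX shell; the addresses below are just the ones picked out above):
for addr in 040c9644 041d38b8 040c9638; do
    arm-none-eabi-addr2line -f -p -i -b build/out/bin/tinyara 0x$addr
done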
Hope that helps.

Is there a way to get userstack for all heap userptr

I was reading through this article on detecting memory leaks using WinDbg. I am trying to find a way to print the user stack for every UserPtr that appears when the heap is filtered for blocks of a particular size. Is this possible?
I am looking to achieve something like:
foreach(userPtr)
dump_to_a_file !heap -p -a userPtr
where userPtr is each value in the UserPtr column below:
HEAP_ENTRY Size Prev Flags UserPtr UserSize - state
003360e0 03f0 0000 [07] 003360e8 01f64 - (busy)
00338060 03f0 03f0 [07] 00338068 01f64 - (busy)
00339fe0 03f0 03f0 [07] 00339fe8 01f64 - (busy)
I am trying to do this in order to avoid manually checking thousands of such UserPtr values. Thanks for any help you can give.
This is the output of the !heap -flt s xxx command, which contains a lot of text before and after the heap entry table. Let's get rid of that additional text with a small hack:
.shell -ci "!heap -flt s xxx" find "["
Now it's quite stable output, which can be used in a foreach loop:
.foreach (userptr {.shell -ci "!heap -flt s xxx" find "["}) { .echo ${userptr}}
See how it splits each line. Let's get rid of the first 4 tokens (entry, size, prev, flags) and last 3 tokens (usersize, -, state) using /pS 4 /ps 7.
.foreach /pS 4 /ps 7 (userptr {.shell -ci "!heap -flt s xxx" find "["}) { .echo ${userptr}}
Now that you have the pure addresses, do something useful with them, namely !heap -p -a:
.foreach /pS 4 /ps 7 (userptr {.shell -ci "!heap -flt s xxx" find "["}) { !heap -p -a ${userptr}}
To dump it into a file, surround it by a log (.logopen and .logclose):
.logopen d:\debug\logs\heap.log; .foreach /pS 4 /ps 7 (userptr {.shell -ci "!heap -flt s xxx" find "["}) { !heap -p -a ${userptr}}; .logclose
There you go.
You can use umdh.exe for this. UMDH can dump all allocations, or a delta between two snapshots of the same process, which is the most convenient way to locate memory leaks. You can find the tool in the location where you installed the Windows debugging tools.
The catch you need to know about when using umdh.exe is that it only resolves symbols when performing the delta operation, i.e. comparing two snapshots of the process. If you really, really need every call stack, just take the first snapshot at the very beginning of process execution.
Umdh.exe also aggregates allocations with the same call stack into buckets, so in the diff output you will see something like this:
+ 18f0 ( 2354 - a64) 11 allocs BackTrace113457DC
+ c ( 11 - 5) BackTrace113457DC allocations
ntdll!RtlAllocateHeap+38CB9
msvcrt!_calloc_impl+134
msvcrt!_calloc_crt+16
msvcrt!_CRTDLL_INIT+FC
ntdll!LdrxCallInitRoutine+16
ntdll!LdrpCallInitRoutine+43
ntdll!LdrpInitializeThread+106
ntdll!_LdrpInitialize+6A
ntdll!LdrInitializeThunk+10
which is an example of a call stack for 11 allocations, with the number of allocations for this call stack increasing from 5 to 11 between snapshots, and the memory consumed by them growing from 0xa64 to 0x2354 bytes.
Sample steps to show how to use umdh.exe:
Set up the _NT_SYMBOL_PATH environment variable, which is required for umdh. Assuming you are in the directory containing your private symbols (%CD%):
set _NT_SYMBOL_PATH=%CD%;srv*http://msdl.microsoft.com/download/symbols
Start your process under the debugger and stop it at a breakpoint or wherever you need.
Create your first snapshot:
umdh -p:<PID> -f:MyFirstSnapshot.txt
Resume execution of your process and stop it a second time at the place you need.
Create your second snapshot:
umdh -p:<PID> -f:MySecondSnapshot.txt
Create a diff with symbols resolved:
umdh MyFirstSnapshot.txt MySecondSnapshot.txt -f:MyDiff.txt
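Put together, a typical session looks roughly like this (<PID> stands for the process ID of the target process, and the file names are just examples):
set _NT_SYMBOL_PATH=%CD%;srv*http://msdl.microsoft.com/download/symbols
umdh -p:<PID> -f:MyFirstSnapshot.txt
rem ... let the process run the scenario that leaks ...
umdh -p:<PID> -f:MySecondSnapshot.txt
umdh MyFirstSnapshot.txt MySecondSnapshot.txt -f:MyDiff.txt
MyDiff.txt then contains the aggregated call stacks with their growth, like the BackTrace example shown above.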

Inconsistencies in values from mallinfo and ps

I am trying to identify huge memory growth in a Linux application which runs around 20-25 threads. From one of those threads I dump the memory stats using mallinfo. It shows the total allocated space (uordblks) as 1005025904. However, the top command shows 8 GB total memory and 7 GB resident memory for the process. Can someone explain this inconsistency?
Following is the full stat from mallinfo:
Total non-mmapped bytes (arena): 1005035520
# of free chunks (ordblks): 2
# of free fastbin blocks (smblks): 0
# of mapped regions (hblks): 43
Bytes in mapped regions (hblkhd): 15769600
Max. total allocated space (usmblks): 0
Free bytes held in fastbins (fsmblks): 0
Total allocated space (uordblks): 1005025904
Total free space (fordblks): 9616
Topmost releasable block (keepcost): 9584
The reason is that mallinfo gives the stats of the main arena only. To get details of all arenas, you have to use malloc_stats.
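A minimal C++ sketch of the difference (glibc-specific; malloc_stats prints its report to stderr, and this is an illustration rather than code from the application above):
#include <malloc.h>
#include <cstdio>

int main() {
    // mallinfo covers only the main arena, so in a heavily
    // multi-threaded process it can miss most of the usage.
    struct mallinfo mi = mallinfo();
    std::printf("main arena allocated (uordblks): %d\n", mi.uordblks);

    // malloc_stats walks every arena and prints per-arena
    // figures plus system-wide totals to stderr.
    malloc_stats();
    return 0;
}
In a process with 20-25 threads, glibc typically creates several arenas, which is why the main-arena figure from mallinfo (about 1 GB here) can be far below the resident size reported by top or ps.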