Calloc fail on ARM device - c++

I have a C++ program that uses a shared C library (namely Darknet) to load and make use of lightweight neural networks.
The program run flawlessly under Ubuntu Trusty on x86_64 box, but crashes with segmentation fault under the same OS but on the ARM device. The reason of the crash is that calloc returns NULL during memory allocation for an array. The code looks like the following:
l.filters = calloc(c * n * size * size, sizeof(float));
...
for (i = 0; i < c * n * size * size; ++i)
l.filters[i] = scale * rand_uniform(-1, 1);
So, after trying to write the first element, the application halts with segfault.
In my case the amount of the memory to be allocated is 4.7 MB, while there is more than 1GB available. I also tried to run it after reboot to exclude the heap fragmentation, but with the same result.
What is more interesting, when I am trying to load a larger network, it works just fine. And the two networks have the same configuration of the layer for which the crash happens...
Valgrind tells me nothing new:
==2591== Invalid write of size 4
==2591== at 0x40C70: make_convolutional_layer (convolutional_layer.c:135)
==2591== by 0x2C0DF: parse_convolutional (parser.c:159)
==2591== by 0x2D7EB: parse_network_cfg (parser.c:493)
==2591== by 0xBE4D: main (annotation.cpp:58)
==2591== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==2591==
==2591==
==2591== Process terminating with default action of signal 11 (SIGSEGV)
==2591== Access not within mapped region at address 0x0
==2591== at 0x40C70: make_convolutional_layer (convolutional_layer.c:135)
==2591== by 0x2C0DF: parse_convolutional (parser.c:159)
==2591== by 0x2D7EB: parse_network_cfg (parser.c:493)
==2591== by 0xBE4D: main (annotation.cpp:58)
==2591== If you believe this happened as a result of a stack
==2591== overflow in your program's main thread (unlikely but
==2591== possible), you can try to increase the size of the
==2591== main thread stack using the --main-stacksize= flag.
==2591== The main thread stack size used in this run was 4294967295.
==2591==
==2591== HEAP SUMMARY:
==2591== in use at exit: 1,731,358,649 bytes in 2,164 blocks
==2591== total heap usage: 12,981 allocs, 10,817 frees, 9,996,704,911 bytes allocated
==2591==
==2591== LEAK SUMMARY:
==2591== definitely lost: 16,645 bytes in 21 blocks
==2591== indirectly lost: 529,234 bytes in 236 blocks
==2591== possibly lost: 1,729,206,304 bytes in 232 blocks
==2591== still reachable: 1,606,466 bytes in 1,675 blocks
==2591== suppressed: 0 bytes in 0 blocks
==2591== Rerun with --leak-check=full to see details of leaked memory
==2591==
==2591== For counts of detected and suppressed errors, rerun with: -v
==2591== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 402 from 8)
Killed
I am really confused what might be the reason. Could anybody help me?

Related

Why is valgrind reporting two memory allocations when my code only requests one?

When I run the following program through valgrind (valgrind ./a.out --leak-check=yes):
int main() {
char* ptr = new char;
return 0;
}
the report contains this:
==103== error calling PR_SET_PTRACER, vgdb might block
==103==
==103== HEAP SUMMARY:
==103== in use at exit: 1 bytes in 1 blocks
==103== total heap usage: 2 allocs, 1 frees, 72,705 bytes allocated
==103==
==103== LEAK SUMMARY:
==103== definitely lost: 1 bytes in 1 blocks
==103== indirectly lost: 0 bytes in 0 blocks
==103== possibly lost: 0 bytes in 0 blocks
==103== still reachable: 0 bytes in 0 blocks
==103== suppressed: 0 bytes in 0 blocks
==103== Rerun with --leak-check=full to see details of leaked memory
What's the extra 72,704-byte allocation valgrind is reporting? It seems to be taken care of before the program is over, so I'm guessing it's something the compiler is doing. I'm using gcc on an Ubuntu subsystem in Windows 10.
Edit: The memory leak was intentional in this example, but I get similar messages about an extra allocation regardless of whether or not there's a leak.
As you surmise, the extra allocation is part of the runtime, and it is correctly cleaned up as you can see in Valgrind's output. Don't worry about it.
If you're asking specifically what it is, you'll have to read the C++ runtime library of your specific compiler and version (libc since you're using gcc). But again, it shouldn't concern you.

GDB does not display full backtrace

I am writing a CHIP-8 interpreter in C++ with SDL2. The source code is at https://github.com/robbie0630/Chip8Emu. There is a problem where it gets a segmentation fault with this ROM. I tried to debug the problem with GDB, but when I type bt, it displays an incomplete stack trace, only showing the top two functions, making me unable to effectively diagnose the problem. How do I get a full and useful stack trace?
EDIT: When I run bt, GDB displays this:
#0 0x0000000101411a14 in ?? ()
#1 0x0000000000406956 in Chip8_CPU::doCycle (this=0x7fffffffc7b0) at /my/home/code/Chip8Emu/src/cpu.cpp:223
#2 0x0000000000402080 in main (argc=2, argv=0x7fffffffe108) at /my/home/code/Chip8Emu/src/main.cpp:152
This is useless because ?? does not indicate anything, and line 223 of cpu.cpp is a function call.
EDIT 2: I ran valgrind on the program, and here is the output:
==11791== Conditional jump or move depends on uninitialised value(s)
==11791== at 0x406BA0: Chip8_CPU::doCycle() (cpu.cpp:215)
==11791== by 0x4020EF: main (main.cpp:152)
==11791==
==11791== Jump to the invalid address stated on the next line
==11791== at 0x101411A74: ???
==11791== by 0x4020EF: main (main.cpp:152)
==11791== Address 0x101411a74 is not stack'd, malloc'd or (recently) free'd
==11791==
==11791==
==11791== Process terminating with default action of signal 11 (SIGSEGV)
==11791== Access not within mapped region at address 0x101411A74
==11791== at 0x101411A74: ???
==11791== by 0x4020EF: main (main.cpp:152)
==11791== If you believe this happened as a result of a stack
==11791== overflow in your program's main thread (unlikely but
==11791== possible), you can try to increase the size of the
==11791== main thread stack using the --main-stacksize= flag.
==11791== The main thread stack size used in this run was 8388608.
==11791==
==11791== HEAP SUMMARY:
==11791== in use at exit: 7,827,602 bytes in 41,498 blocks
==11791== total heap usage: 169,848 allocs, 128,350 frees, 94,139,303 bytes allocated
==11791==
==11791== LEAK SUMMARY:
==11791== definitely lost: 0 bytes in 0 blocks
==11791== indirectly lost: 0 bytes in 0 blocks
==11791== possibly lost: 4,056,685 bytes in 36,878 blocks
==11791== still reachable: 3,770,917 bytes in 4,620 blocks
==11791== suppressed: 0 bytes in 0 blocks
==11791== Rerun with --leak-check=full to see details of leaked memory
==11791==
==11791== For counts of detected and suppressed errors, rerun with: -v
==11791== Use --track-origins=yes to see where uninitialised values come from
==11791== ERROR SUMMARY: 12 errors from 3 contexts (suppressed: 0 from 0)
Killed
EDIT 3: I ran GDB again, this time watching GfxDraw, and I noticed this happened:
Old value = (void (*)(array2d)) 0x1411bc4
New value = (void (*)(array2d)) 0x101411bc4
Chip8_CPU::doCycle (this=0x7fffffffc7a0) at /home/robbie/code/Chip8Emu/src/cpu.cpp:213
(gdb) cont
Continuing.
Thread 1 "Chip8Emu" received signal SIGSEGV, Segmentation fault.
0x0000000101411bc4 in ?? ()
So somehow GfxDraw is getting modified to an invalid function pointer. I can't figure out where it is modified, however.
After a few months, I finally identified the problem. Some nasty CHIP-8 programs make illegal memory accesses to the graphics memory that are outside of the bounds of the array and corrupt properties of the CPU (such as GfxDraw). I fixed this problem by accessing the graphics memory with at and ignoring std::out_of_range errors. It seems to work for now, so I'm declaring it the solution.

new libstdc++ of gcc5.1 may allocate large heap memory

valgrind detects "still reachable leak" in an empty program compiled with gcc5.1, g++ ./a.cpp,
int main () {}
valgrind says, valgrind ./a.out,
==32037== HEAP SUMMARY:
==32037== in use at exit: 72,704 bytes in 1 blocks
==32037== total heap usage: 1 allocs, 0 frees, 72,704 bytes allocated
==32037==
==32037== LEAK SUMMARY:
==32037== definitely lost: 0 bytes in 0 blocks
==32037== indirectly lost: 0 bytes in 0 blocks
==32037== possibly lost: 0 bytes in 0 blocks
==32037== still reachable: 72,704 bytes in 1 blocks
==32037== suppressed: 0 bytes in 0 blocks
==32037== Rerun with --leak-check=full to see details of leaked memory
==32037==
==32037== For counts of detected and suppressed errors, rerun with: -v
==32037== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 5 from 5)
For c programs, valgrinds reports no memory leaks and no memory allocation.
In addition, for gcc5.0 and gcc4.9.2, valgrinds reports no memory leaks and no memory allocation too.
Then, I guess that new libstdc++ of gcc5.1 is the cause.
My question is how to reduce this huge memory allocation which may be in libstdc++.
Indeed, the execution time of this empty c++ program compiled with -O3 is larger than one of the empty c program by a few milliseconds (without systime).
The space is allocated as an emergency exception buffer in libsup++
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64535
the devs talk of maybe suppressing the trace in Valgrind but presumably nothing was done in the end. The only way of eliminating it from the trace right now would probably be to disable exceptions, but it doesn't look like it's a big deal either way, it's not as if the memory can be reclaimed for something else before the program exits.

Valgrind: Where is my memory leak?

I am working on a rather chaotic library (client/server application) which has a memory leak somewhere, but I cannot find where.
When I start the library and let it do its work, I get the following memory usage using top when it finished:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
21010 root 20 0 111m 12m 6836 S 0.0 0.7 0:29.20 myapp
21008 root 20 0 172m 99m 6480 S 0.0 5.8 0:14.39 myapp
The first line is the client, the second the server.
I let valgrind execute the server and got the following result:
==20904==
==20904== HEAP SUMMARY:
==20904== in use at exit: 4,723,124 bytes in 199 blocks
==20904== total heap usage: 40,423,345 allocs, 40,423,146 frees, 1,977,998,844 bytes allocated
==20904==
==20904== LEAK SUMMARY:
==20904== definitely lost: 0 bytes in 0 blocks
==20904== indirectly lost: 0 bytes in 0 blocks
==20904== possibly lost: 914 bytes in 18 blocks
==20904== still reachable: 4,722,210 bytes in 181 blocks
==20904== suppressed: 0 bytes in 0 blocks
==20904== Rerun with --leak-check=full to see details of leaked memory
==20904==
==20904== For counts of detected and suppressed errors, rerun with: -v
==20904== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2) Killed
If I understand this correctly, valgrind says that the server has just about 4,7M of memory still allocated with no real leaks.
It can be that there are no "real" leaks, but normally the library should at this state have freed all resources. I cannot see any part in the code where something is not freed.
How can I find out where the resources are still allocated?
You could use a brute-force method and print to a log file before each allocation and after each delete.
You may want to print the object's size (or the size allocated) and maybe the name of the object.
You are looking for matching pairs, allocation and deallocation.
You may also want to print out the invocation number which will help match, i.e. if there are 4 invocations of allocation, you should be able to find deallocation #3 to match with allocation #3.
This requires careful inspection of the code to verify that all allocations have been identified.

use signal handler to release resources?

I wrote a socket server. And I realize when I hit Ctrl-C while it's running, there's some possible memory leak. I used valgrind to find this.
My server code is quite simple. Basically I create a Listener object, start a thread to accept connections and try to join that thread:
try {
Server::Listener listener(1234);
boost::thread l(boost::bind(&Server::Listener::start, &listener));
l.join();
} catch(exception& e) {
cout<<e.what()<<endl;
}
When I run valgrind, it give me:
==3580== Command: bin/Debug/p_rpc
==3580==
Listner started ...
in loop..
^C==3580==
==3580== HEAP SUMMARY:
==3580== in use at exit: 3,176 bytes in 24 blocks
==3580== total heap usage: 28 allocs, 4 frees, 4,328 bytes allocated
==3580==
==3580== 288 bytes in 1 blocks are possibly lost in loss record 21 of 24
==3580== at 0x4C29E46: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==3580== by 0x4012084: _dl_allocate_tls (dl-tls.c:297)
==3580== by 0x4E3AABC: pthread_create##GLIBC_2.2.5 (allocatestack.c:571)
==3580== by 0x5260F9F: boost::thread::start_thread() (in /usr/lib/libboost_thread.so.1.49.0)
==3580== by 0x407B93: boost::thread::thread, boost::_bi::list1 > > >(boost::_bi::bind_t, boost::_bi::list1 > >&&) (thread.hpp:171)
==3580== by 0x404CA4: main (main.cpp:179)
==3580==
==3580== LEAK SUMMARY:
==3580== definitely lost: 0 bytes in 0 blocks
==3580== indirectly lost: 0 bytes in 0 blocks
==3580== possibly lost: 288 bytes in 1 blocks
==3580== still reachable: 2,888 bytes in 23 blocks
==3580== suppressed: 0 bytes in 0 blocks
==3580== Reachable blocks (those to which a pointer was found) are not shown.
==3580== To see them, rerun with: --leak-check=full --show-reachable=yes
==3580==
==3580== For counts of detected and suppressed errors, rerun with: -v
==3580== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 2 from 2)
Killed
It points out there's 288bytes possibly lost. I imagine I could use a signal handler to release this resourse. But I don't know how I do that. Can you give me an example please?
Cheers,
Elton
When a process closes, the OS automatically cleans up all memory owned by the process. You don't need to worry about freeing that memory when the program exits. The building is being demolished. Don't bother sweeping the floor and emptying the trash cans and erasing the whiteboards. And don't line up at the exit to the building so everybody can move their in/out magnet to out. All you're doing is making the demolition team wait for you to finish these pointless housecleaning tasks.
The kind of leaks you do need to worry about are those that continuously leak during the program's lifetime.
In principle, you can destroy the object there. There are restrictions on what you can do in a signal handler, and they mix very badly with threads. Note that in this area the compiler can do no (or very little) checking, the signal handler is just an ordinary function. Be extra careful.
The answers to this question give some details on how to do it.