How does GDB know where the heap is allocated

In GDB, running info proc mappings dumps the address space of the target, including the heap. My question is, how does GDB know where the heap is allocated? Obviously, something like malloc returns an address, but it does not specify the exact heap start address or its allocated size.

When debugging a live process on Linux, GDB's info proc mappings command parses the /proc/pid/maps file - which contains the details of a process's memory regions - then formats and displays the information. If the pathname field of an entry in the maps file says [heap], that's what GDB will display.
The Linux kernel's implementation of /proc/pid/maps shows [heap] on the line corresponding to the memory region that contains the address known as the break, which historically has been the top of the data segment. The break can be moved to higher or lower addresses with the brk system call (sbrk is the C library wrapper around it).
glibc's malloc uses the heap for small allocations. For larger allocations, it calls mmap with anonymous backing, and you can see these memory regions in the maps file - they have no pathname field.
I've written a small program which calls malloc to allocate memory in a variety of sizes, then displays the memory region where each allocation was placed. It's in my answer to Can't search into heap using gdb.
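In the same spirit, here is a minimal sketch (not the exact program from that answer) that makes a small and a large allocation and prints the /proc/self/maps line each one landed in:

#include <cstdio>
#include <cstdlib>

// Print the /proc/self/maps line containing a given address, if any.
// Parsing assumes the standard Linux maps format.
static void show_region(void* p)
{
    unsigned long addr = (unsigned long)p;
    FILE* f = fopen("/proc/self/maps", "r");
    if (!f) return;
    char line[512];
    while (fgets(line, sizeof line, f)) {
        unsigned long lo, hi;
        if (sscanf(line, "%lx-%lx", &lo, &hi) == 2 && lo <= addr && addr < hi)
            printf("%p -> %s", p, line);
    }
    fclose(f);
}

int main()
{
    void* small = malloc(100);       // small request: typically placed in [heap]
    void* large = malloc(1 << 20);   // 1 MiB request: typically its own anonymous region
    show_region(small);
    show_region(large);
    free(small);
    free(large);
    return 0;
}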

Related

Analyze Glibc heap memory

I am researching an embedded device that uses GLIBC 2.25.
When I look at /proc/PID/maps I see, under the heap section, some anonymous regions; I understand these regions are created when the process uses new.
I dumped those regions with dd, and they contain interesting values that I want to understand: is a given buffer allocated or free, and what is its size?
How can I do that, please?
You can use gdb (the GNU Debugger) to inspect the memory of a running process. You can attach to the process using its PID and use the x command to examine memory at a specific address, and the info proc mappings command to view the memory maps of the process, including the heap. Some gdb extensions (for example, the gdb-heap plugin) add a heap command that lists heap blocks, and glibc's malloc_info function can be invoked from gdb to show detailed information about the allocator's state.
You can also call the malloc_stats function to display information about heap usage, such as the number of bytes allocated, the number of bytes free, and the number of bytes in use.
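A minimal sketch of calling both of those glibc functions from a program:

#include <malloc.h>
#include <cstdio>
#include <cstdlib>

int main()
{
    void* p = malloc(100000);
    malloc_stats();            // glibc: prints per-arena statistics to stderr
    malloc_info(0, stdout);    // glibc: writes an XML snapshot of the allocator state
    free(p);
    return 0;
}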
You can also use the pmap command to display the memory map of a process, including the heap size. This command is available on some systems and may not be present on others.
It's also worth noting that the /proc/PID/maps file can also give you an idea about the heap section of a process.
Please keep in mind that you need to have the right permission to access the process you want to inspect.
Instead of analyzing the memory from proc, you may want to try the following options, depending on your environment:
use tools like valgrind if you suspect any kind of leak or invalid reads/writes.
rather than looking at the output of dd, attach to the running process and inspect memory within the process; this gives you the context to make sense of the memory usage.
use logging to dump the addresses of allocations/frees/reads/writes. This allows you to build a better understanding of the memory usage.
You may have to use all of the above options depending upon the complexity of your task.

How does GLIBC decide segment for malloc

I am looking at a Linux GLIBC (2.25) system and see that when the code uses malloc,
sometimes the buffer is allocated in the heap segment and sometimes in an anonymous segment. It does not seem to be related to size; I can see all the segments in /proc/PID/maps.
I thought that the heap segment was related to malloc and the anonymous segments to mmap. So why does GLIBC, for the same size, sometimes use the heap and sometimes use mmap?
I also saw that sometimes when I use malloc in one thread the memory is allocated in the heap segment, but when I switch to another thread (using GDB) the memory is allocated in an anonymous segment.
glibc's malloc implementation will sometimes use brk or sbrk (what you're calling the heap -- it shows up as [heap] in /proc/PID/maps) and sometimes use mmap. Which one it uses depends on some tradeoffs, but generally
if a process only needs a small amount of heap space, brk/sbrk is better
if a process needs a lot of heap space and/or very large blocks, mmap is better.
So GLIBC's malloc implementation has a bunch of heuristics to decide what is 'small' and what is 'large', and it looks at what calls have been made so far to malloc/free in order to decide which method to use to get more memory from the system when it needs it. The per-thread behaviour you observed comes from glibc's arena mechanism: the main arena grows via brk, while the arenas glibc creates for additional threads are obtained with mmap, so their allocations show up in anonymous regions.
There's a function mallopt you can call that affects this tuning -- there's a bunch of info on the man page about it.
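A hedged sketch of that knob: M_MMAP_THRESHOLD is the request size at which glibc's malloc switches to mmap (128 KiB by default, and adjusted dynamically unless you pin it with mallopt):

#include <malloc.h>
#include <cstdio>
#include <cstdlib>

int main()
{
    // Force requests of 64 KiB and above to come from mmap
    // rather than from the brk-managed [heap] region.
    mallopt(M_MMAP_THRESHOLD, 64 * 1024);

    void* small = malloc(4 * 1024);    // below the threshold: [heap]
    void* large = malloc(256 * 1024);  // above the threshold: anonymous mmap region
    printf("small=%p large=%p\n", small, large);

    free(small);
    free(large);
    return 0;
}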

Dynamic allocation in uClinux

I'm new to embedded development, and the big difference I see between traditional Linux and uClinux is that uClinux lacks an MMU.
From this article:
Without VM, each process must be located at a place in memory where it can be run. In the simplest case, this area of memory must be contiguous. Generally, it cannot be expanded as there may be other processes above and below it. This means that a process in uClinux cannot increase the size of its available memory at runtime as a traditional Linux process would.
To me, this sounds like all data must reside on the stack, and that heap allocation is impossible, meaning malloc() and/or "new" are out of the question... is that accurate? Perhaps there are techniques/libraries which allow for managing a "static heap" (i.e. a stack based area from which "dynamic" allocations can be requested)?
Or am I overthinking it? Or oversimplifying it?
Under regular Linux, the programmer does not need to deal with physical resources. The kernel takes care of this, and a user space process sees only its own address space. As the stack grows, or malloc-type requests are made, the kernel will map free memory into the process's virtual address space.
In uClinux, the programmer must be more concerned with physical memory. The MMU and VM are not available, and all address space is shared with the kernel. When a user space program is loaded, the process is allocated physical memory pages for the text, stack, and variables. The process's program counter, stack pointer, and data/bss table pointers are set to physical memory addresses. Heap allocations (via malloc-type calls) are made from the same pool.
You will not have to get rid of heap allocation in programs. You will need to be concerned with some new issues. Since the stack cannot grow via virtual memory, you must size it correctly during linking to prevent stack overflows. Memory fragmentation becomes an issue because there's no MMU to consolidate smaller free pages. Errant pointers become more dangerous because they can now cause unintended writes to anywhere in physical memory.
It's been a while since I've worked with uCLinux (it was before it was integrated into the main tree), but I thought malloc was still available as part of the c library. There was a lot higher chance of doing Very Bad Things (tm) in memory since the heap wasn't isolated, but it was possible.
Yes, you can use malloc in user-space applications on uClinux, but you then have to increase the stack size of the user-space application (before running the program, because the stack size is fixed in the binary), so that when malloc runs it will get the space it needs.
For example, for uClinux on an ARM Cortex, the ARM toolchain provides a command to find and change the stack size used by the binary of a user application; you can then transfer the binary to your embedded system and run it:
----- > arm-uclinuxeabi-flthdr
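(A hedged usage example; the -s option sets the stack size recorded in the bFLT header, and the binary name here is illustrative:)
----- > arm-uclinuxeabi-flthdr -s 65536 myapp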

How to get the heap size of a program

How can I find the heap memory size of a C++ program under Linux? I need the heap memory usage before the use of new or malloc and also after it. Can anyone help?
#include <malloc.h>
#include <iostream>

int main()
{
    // here I need the heap memory usage (before the allocation)
    unsigned char* I2C_Read_Data = new unsigned char[250];
    // and here I need the heap memory usage after the use of new
    delete[] I2C_Read_Data;
    return 0;
}
You can also add heap tracking to your own programs by overloading the new and delete operators. In a game engine I am working on, I have all memory allocation going through special functions, which attach each allocation to a particular heap tracker object. This way, at any given moment, I can pull up a report and see how much memory is being taken up by entities, actors, Lua scripts, etc.
It's not as thorough as using an external profiler (particularly when outside libraries handle their own memory management), but it is very nice for seeing exactly what memory you were responsible for.
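A minimal sketch of that idea, with a single global byte counter rather than per-subsystem trackers (all names here are illustrative, not the engine's actual code):

#include <atomic>
#include <cstdio>
#include <cstdlib>
#include <new>

// Running total of bytes handed out through operator new.
static std::atomic<std::size_t> g_allocated{0};

void* operator new(std::size_t size)
{
    // Over-allocate so the block size can be stored in front of the block.
    // (Storing one size_t keeps 8-byte alignment; a real tracker would
    // preserve the full max_align_t guarantee.)
    void* raw = std::malloc(size + sizeof(std::size_t));
    if (!raw) throw std::bad_alloc();
    *static_cast<std::size_t*>(raw) = size;
    g_allocated += size;
    return static_cast<char*>(raw) + sizeof(std::size_t);
}

void operator delete(void* p) noexcept
{
    if (!p) return;
    void* raw = static_cast<char*>(p) - sizeof(std::size_t);
    g_allocated -= *static_cast<std::size_t*>(raw);
    std::free(raw);
}

int main()
{
    int* x = new int(42);
    std::printf("tracked: %zu bytes\n", g_allocated.load());
    delete x;
    std::printf("tracked: %zu bytes\n", g_allocated.load());
    return 0;
}

(The array forms, operator new[] and operator delete[], would need the same treatment; they are omitted to keep the sketch short.)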
Use valgrind's heap profiler: Massif
On Linux you can read /proc/[pid]/statm to get memory usage information.
Provides information about memory usage, measured in pages. The columns are:

size       total program size (same as VmSize in /proc/[pid]/status)
resident   resident set size (same as VmRSS in /proc/[pid]/status)
share      shared pages (from shared mappings)
text       text (code)
lib        library (unused in Linux 2.6)
data       data + stack
dt         dirty pages (unused in Linux 2.6)
See the man page for more details.
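A small sketch that reads those fields for the current process (the values are in pages, so multiply by sysconf(_SC_PAGESIZE) for bytes):

#include <cstdio>
#include <unistd.h>

int main()
{
    long size, resident, share, text, lib, data, dt;
    FILE* f = fopen("/proc/self/statm", "r");
    if (!f) return 1;
    if (fscanf(f, "%ld %ld %ld %ld %ld %ld %ld",
               &size, &resident, &share, &text, &lib, &data, &dt) != 7) {
        fclose(f);
        return 1;
    }
    fclose(f);
    long page = sysconf(_SC_PAGESIZE);
    printf("size: %ld KiB, resident: %ld KiB, data+stack: %ld KiB\n",
           size * page / 1024, resident * page / 1024, data * page / 1024);
    return 0;
}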
The answer by Adam Zalcman to this question describes some interesting details of heap allocation.
You can use the getrlimit function call and pass RLIMIT_DATA for the resource. That gives you the limit on the size of the data segment for your program (note that this is a ceiling, often RLIM_INFINITY, rather than the current usage).
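For example:

#include <cstdio>
#include <sys/resource.h>

int main()
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_DATA, &rl) == 0) {
        if (rl.rlim_cur == RLIM_INFINITY)
            printf("data segment limit: unlimited\n");
        else
            printf("data segment limit: %llu bytes\n",
                   (unsigned long long)rl.rlim_cur);
    }
    return 0;
}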
Apart from external inspection, you can also instrument your implementation of malloc to let you inspect those statistics. jemalloc and tcmalloc are implementations that, on top of performing better for multithreaded code than typical libc implementations, add some utility functions of that sort.
To dig deeper, you should learn a bit more about how heap allocation works. Ultimately, the OS is the one assigning memory to processes as they ask for it; however, requests to the OS (syscalls) are slower than regular calls, so in general an implementation of malloc will request large chunks from the OS (4KB or 8KB blocks are common) and then subdivide them to serve them to its callers.
You need to identify whether you are interested in the total memory consumed by the process (which includes the code itself), the memory the process requested from the OS within a particular procedure call, the memory actually in use by the malloc implementation (which adds its own book-keeping overhead, however small) or the memory you requested.
Also, fragmentation can be a pain for the latter two, and may somewhat blur the difference between memory that is really in use and memory that is merely assigned.
You can try "mallinfo" and "malloc_info". They might work. mallinfo has issues when you allocate more than 2GB. malloc_info is o/s specific and notably very weird. I agree - very often it's nice to do this stuff without 3rd party tools.

Ubuntu System Monitor and valgrind to discover memory leaks in C++ applications

I'm writing an application in C++ which uses some external open source libraries. I tried to look at the Ubuntu System Monitor to have information about how my process uses resources, and I noticed that resident memory continues to increase to very large values (over 100MiB). This application should run in an embedded device, so I have to be careful.
I started to think there should be a (some) memory leak(s), so I'm using valgrind. Unfortunately it seems valgrind is not reporting significant memory leaks, only some minor issues in the libraries I'm using, nothing more.
So, do I have to conclude that my algorithm really uses that much memory? It seems very strange to me... Or maybe I'm misunderstanding the meaning of the columns of the System Monitor? Can someone clarify the meaning of "Virtual Memory", "Resident Memory", "Writable Memory" and "Memory" in the System Monitor when related to software profiling? Should I expect those values to immediately represent how much memory my process is taking in RAM?
In the past I've used tools that were able to tell me where I was using memory, like Apple Profiling Tools. Is there anything similar I can use in Linux as well?
Thanks!
Another tool you can try is the /lib/libmemusage.so library:
$ LD_PRELOAD=/lib/libmemusage.so vim
Memory usage summary: heap total: 4643025, heap peak: 997580, stack peak: 26160
total calls total memory failed calls
malloc| 42346 4528378 0
realloc| 52 7988 0 (nomove:26, dec:0, free:0)
calloc| 34 106659 0
free| 28622 3720100
Histogram for block sizes:
0-15 14226 33% ==================================================
16-31 8618 20% ==============================
32-47 1433 3% =====
48-63 4174 9% ==============
64-79 4736 11% ================
80-95 313 <1% =
...
(I quit vim immediately after startup.)
Maybe the histogram of block sizes will give you enough information to tell where leaks may be happening.
valgrind is very configurable; --leak-check=full --show-reachable=yes might be a good starting point, if you haven't tried it yet.
"Virtual Memory", "Resident Memory", "Writable Memory" and "Memory"
Virtual memory is the address space that your application has allocated. If you run malloc(1024*1024*100);, the malloc(3) library function will request 100 megabytes of storage from the operating system (or handle it out of the free lists). The 100 megabytes will be allocated with mmap(..., MAP_ANONYMOUS), which won't actually allocate any memory. (See the rant at the end of the malloc(3) page for details.) The OS will provide memory the first time each page is written.
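A small sketch of that lazy-allocation behaviour: the mmap below reserves 100 MiB of address space immediately, but physical pages are assigned only as each page is first written, which you can watch in VmRSS while VmSize stays flat:

#include <cstdio>
#include <sys/mman.h>

int main()
{
    const size_t len = 100 * 1024 * 1024;   // 100 MiB of address space
    char* p = (char*)mmap(nullptr, len, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return 1;
    getchar();                   // check VmRSS now: still near zero
    for (size_t i = 0; i < len; i += 4096)
        p[i] = 1;                // the first write faults each page in
    getchar();                   // check VmRSS now: roughly 100 MiB higher
    munmap(p, len);
    return 0;
}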
Virtual memory accounts for all the libraries and executable objects that are mapped into your process, as well as your stack space.
Resident memory is the amount of memory that is actually in RAM. You might link against the entire 1.5 megabyte C library, but only use the 100k (wild guess) of the library required to support the Standard IO interface. The rest of the library will be demand paged in from disk when it is needed. Or, if your system is under memory pressure and some less-recently-used data is paged out to swap, it will no longer count against Resident memory.
Writable memory is the amount of address space that your process has allocated with write privileges. (Check the output of pmap(1) command: pmap $$ for the shell, for example, to see which pages are mapped to which files, anonymous space, the stack, and the privileges on those pages.) This is a reasonable indication of how much swap space the program might require in a worst-case swapping scenario, when everything must be paged to disk, or how much memory the process is using for itself.
Because there are probably 50--100 processes on your system at a time, and almost all of them are linked against the standard C library, all the processes get to share the read-only memory mappings for the library. (They also get to share all the copy-on-write private writable mappings for any files opened with mmap(..., MAP_PRIVATE|PROT_WRITE), until the process writes to the memory.) The top(1) tool will report the amount of memory that can be shared among processes in the SHR column. (Note that the memory might not be shared, but some of it (libc) definitely is shared.)
Memory is very vague. I don't know what it means.