This question already has answers here:
Why is stack memory size so limited?
(9 answers)
Closed 2 years ago.
I mainly program with C++. I have seen from many places that I should put large objects (like array of 10k elements) on heap (using new) but not on stack (using pure array type).
I don't really understand. The reason might be because abuse of stack many result in stack overflow error at runtime. But why should OS set a limit for the stack size of a process (or more precisely, a thread) as virtual memory can go as large as needed (or 4G in practice).
Can anyone help me, I really have no clue.
Tradition, and threads.
Stacks are per thread, and for efficiency must be contiguous memory space. Non contiguous stacks make every function call more expensive.
Heaps are usually shared between threads; when they aren't, they don't have to be contiguous.
In the 32 bit days, having 1000 threads isn't impossible. At 1 meg per thread, that is 1 gigabyte of address space. And 1 meg isn't that big.
In comparison, 2 gigs of heap serves all threads.
On 64 bit systems, often addressable memory is way less than 64 bits. At 40 bits, if you gave half of the address space to stacks and you had 10,000 threads, that is a mere 50 megs per stack.
48 bits is more common, but that still leaves you with mere gigabytes of address space per stack.
In comparison, the heap has tebibytes.
What more is that with a large object on the stack doesn't help much with cache coherance; no cpu cache can hold the tront and the back. Having to follow a single pointer is trivial if you are working heavily with it, and can even ensure the stack stays in cache better.
So (a) stack size cost scales with threads (b) address space can be limited (c) the benefits of mega stacks are small.
Finally, infinite recursion is a common bug. You want your stack to blow and trap (the binary loader surrounds the stack with trap pages often) before your user shell crashes from resource exhastion. A modest size stack makes that more likely.
It's totally possible to use a static array of 10k INTs. It's usually better to use "new" because when you're dealing with big chunks of data, you can't guarantee they'll always be the size of your array. Dynamic allocation of memory lets you use only what you need. It's more efficient.
If your curious about how the OS chooses stack sizes or "pages", read this:
https://en.m.wikipedia.org/wiki/Page_(computer_memory)
Also, I'm not sure you're using "heap" the right way. "heap" is a tree-based data structure. The Stack refers to the layout of the executable and process data in RAM.
In my program I see some resident size increase. I suppose it is because of heap fragmentation.
So I am planning to use #pragma pack 1.
Will it reduce the heap fragmentation?
Will it be having some other overheads?
Shall I go for it or not?
There is a well proved technique called Memory pools. It is designed especially to reduce memory fragmentation and help with memory leaks. And it should be used in case where memory fragmentation became the bottleneck of the program functionality.
'pragma pack 1' isn't helpful to avoid heap fragmentation.
'pragma pack 1' is used to remove the padding bytes from structures to help with transmission of binary structures between programs.
It's simply how the operating system works. When you free some memory you've allocated, it's not unmapped from the process memory map. This is kind of an optimization from the OS in case the process needs to allocate more memory again, because then the OS doesn't have to add a new mapping to the process memory map.
#pragma pack N, tells the compiler to align members of structure in a particular way, with (N-1) bytes padding. For example, if N is 2, each char will occupy 2 bytes, one assigned, one padding. With N being 1, there will be no padding. This will have more fragmentation, as there would be odd number of bytes if the structure has say one char and one int, totaling 5 bytes.
Check: #pragma pack effect
Packing structures probably won't have much affect on heap fragmentation. Heap fragmentation normally occurs when there is a repeating pattern of allocations and freeing of memory. There are two issues here, one issue is that the virtual address space gets fragmented, the other issue is that physical 4k pages end up with unused gaps, consuming increasing amounts of memory over time. Microsoft addresses the 4k page issue with it's .net framework that occasionally repacks memory, but does so by "pausing" a .net application during the repacks. I'm not sure how server apps that run 24 hours a day / 7 days a week deal with this without having to deal with the pauses, unless they occasionally fork off a new process to take over the server side and then close down the old process which would refresh the new process virtual address space with a new set of pages.
If you're specifically concerned about heap fragmentation then you might want to increase the structure packing. This would (occasionally) result in different structures being distributed between fewer different-sized buckets and reducing the likelihood of allocations leaving unusable gaps when they occupy the space of a previously-freed, slightly larger structure.
But this is unlikely to be your real concern. As another answer points out, the OS does not reclaim freed memory right away, and this can affect the apparent memory usage of the process.
I've heard the term "memory fragmentation" used a few times in the context of C++ dynamic memory allocation. I've found some questions about how to deal with memory fragmentation, but can't find a direct question that deals with it itself. So:
What is memory fragmentation?
How can I tell if memory fragmentation is a problem for my application? What kind of program is most likely to suffer?
What are good common ways to deal with memory fragmentation?
Also:
I've heard using dynamic allocations a lot can increase memory fragmentation. Is this true? In the context of C++, I understand all the standard containers (std::string, std::vector, etc) use dynamic memory allocation. If these are used throughout a program (especially std::string), is memory fragmentation more likely to be a problem?
How can memory fragmentation be dealt with in an STL-heavy application?
Imagine that you have a "large" (32 bytes) expanse of free memory:
----------------------------------
| |
----------------------------------
Now, allocate some of it (5 allocations):
----------------------------------
|aaaabbccccccddeeee |
----------------------------------
Now, free the first four allocations but not the fifth:
----------------------------------
| eeee |
----------------------------------
Now, try to allocate 16 bytes. Oops, I can't, even though there's nearly double that much free.
On systems with virtual memory, fragmentation is less of a problem than you might think, because large allocations only need to be contiguous in virtual address space, not in physical address space. So in my example, if I had virtual memory with a page size of 2 bytes then I could make my 16 byte allocation with no problem. Physical memory would look like this:
----------------------------------
|ffffffffffffffeeeeff |
----------------------------------
whereas virtual memory (being much bigger) could look like this:
------------------------------------------------------...
| eeeeffffffffffffffff
------------------------------------------------------...
The classic symptom of memory fragmentation is that you try to allocate a large block and you can't, even though you appear to have enough memory free. Another possible consequence is the inability of the process to release memory back to the OS (because each of the large blocks it has allocated from the OS, for malloc etc. to sub-divide, has something left in it, even though most of each block is now unused).
Tactics to prevent memory fragmentation in C++ work by allocating objects from different areas according to their size and/or their expected lifetime. So if you're going to create a lot of objects and destroy them all together later, allocate them from a memory pool. Any other allocations you do in between them won't be from the pool, hence won't be located in between them in memory, so memory will not be fragmented as a result. Or, if you're going to allocate a lot of objects of the same size then allocate them from the same pool. Then a stretch of free space in the pool can never be smaller than the size you're trying to allocate from that pool.
Generally you don't need to worry about it much, unless your program is long-running and does a lot of allocation and freeing. It's when you have mixtures of short-lived and long-lived objects that you're most at risk, but even then malloc will do its best to help. Basically, ignore it until your program has allocation failures or unexpectedly causes the system to run low on memory (catch this in testing, for preference!).
The standard libraries are no worse than anything else that allocates memory, and standard containers all have an Alloc template parameter which you could use to fine-tune their allocation strategy if absolutely necessary.
What is memory fragmentation?
Memory fragmentation is when most of your memory is allocated in a large number of non-contiguous blocks, or chunks - leaving a good percentage of your total memory unallocated, but unusable for most typical scenarios. This results in out of memory exceptions, or allocation errors (i.e. malloc returns null).
The easiest way to think about this is to imagine you have a big empty wall that you need to put pictures of varying sizes on. Each picture takes up a certain size and you obviously can't split it into smaller pieces to make it fit. You need an empty spot on the wall, the size of the picture, or else you can't put it up. Now, if you start hanging pictures on the wall and you're not careful about how you arrange them, you will soon end up with a wall that's partially covered with pictures and even though you may have empty spots most new pictures won't fit because they're larger than the available spots. You can still hang really small pictures, but most ones won't fit. So you'll have to re-arrange (compact) the ones already on the wall to make room for more..
Now, imagine that the wall is your (heap) memory and the pictures are objects.. That's memory fragmentation..
How can I tell if memory fragmentation is a problem for my application? What kind of program is most likely to suffer?
A telltale sign that you may be dealing with memory fragmentation is if you get many allocation errors, especially when the percentage of used memory is high - but not you haven't yet used up all the memory - so technically you should have plenty of room for the objects you are trying to allocate.
When memory is heavily fragmented, memory allocations will likely take longer because the memory allocator has to do more work to find a suitable space for the new object. If in turn you have many memory allocations (which you probably do since you ended up with memory fragmentation) the allocation time may even cause noticeable delays.
What are good common ways to deal with memory fragmentation?
Use a good algorithm for allocating memory. Instead of allocating memory for a lot of small objects, pre-allocate memory for a contiguous array of those smaller objects. Sometimes being a little wasteful when allocating memory can go along way for performance and may save you the trouble of having to deal with memory fragmentation.
Memory fragmentation is the same concept as disk fragmentation: it refers to space being wasted because the areas in use are not packed closely enough together.
Suppose for a simple toy example that you have ten bytes of memory:
| | | | | | | | | | |
0 1 2 3 4 5 6 7 8 9
Now let's allocate three three-byte blocks, name A, B, and C:
| A | A | A | B | B | B | C | C | C | |
0 1 2 3 4 5 6 7 8 9
Now deallocate block B:
| A | A | A | | | | C | C | C | |
0 1 2 3 4 5 6 7 8 9
Now what happens if we try to allocate a four-byte block D? Well, we have four bytes of memory free, but we don't have four contiguous bytes of memory free, so we can't allocate D! This is inefficient use of memory, because we should have been able to store D, but we were unable to. And we can't move C to make room, because very likely some variables in our program are pointing at C, and we can't automatically find and change all of these values.
How do you know it's a problem? Well, the biggest sign is that your program's virtual memory size is considerably larger than the amount of memory you're actually using. In a real-world example, you would have many more than ten bytes of memory, so D would just get allocated starting a byte 9, and bytes 3-5 would remain unused unless you later allocated something three bytes long or smaller.
In this example, 3 bytes is not a whole lot to waste, but consider a more pathological case where two allocations of a a couple of bytes are, for example, ten megabytes apart in memory, and you need to allocate a block of size 10 megabytes + 1 byte. You have to go ask the OS for over ten megabytes more virtual memory to do that, even though you're just one byte shy of having enough space already.
How do you prevent it? The worst cases tend to arise when you frequently create and destroy small objects, since that tends to produce a "swiss cheese" effect with many small objects separated by many small holes, making it impossible to allocate larger objects in those holes. When you know you're going to be doing this, an effective strategy is to pre-allocate a large block of memory as a pool for your small objects, and then manually manage the creation of the small objects within that block, rather than letting the default allocator handle it.
In general, the fewer allocations you do, the less likely memory is to get fragmented. However, STL deals with this rather effectively. If you have a string which is using the entirety of its current allocation and you append one character to it, it doesn't simply re-allocate to its current length plus one, it doubles its length. This is a variation on the "pool for frequent small allocations" strategy. The string is grabbing a large chunk of memory so that it can deal efficiently with repeated small increases in size without doing repeated small reallocations. All STL containers in fact do this sort of thing, so generally you won't need to worry too much about fragmentation caused by automatically-reallocating STL containers.
Although of course STL containers don't pool memory between each other, so if you're going to create many small containers (rather than a few containers that get resized frequently) you may have to concern yourself with preventing fragmentation in the same way you would for any frequently-created small objects, STL or not.
What is memory fragmentation?
Memory fragmentation is the problem of memory becoming unusable even though it is theoretically available. There are two kinds of fragmentation: internal fragmentation is memory that is allocated but cannot be used (e.g. when memory is allocated in 8 byte chunks but the program repeatedly does single allocations when it needs only 4 bytes). external fragmentation is the problem of free memory becoming divided into many small chunks so that large allocation requests cannot be met although there is enough overall free memory.
How can I tell if memory fragmentation is a problem for my application? What kind of program is most likely to suffer?
memory fragmentation is a problem if your program uses much more system memory than its actual paylod data would require (and you've ruled out memory leaks).
What are good common ways to deal with memory fragmentation?
Use a good memory allocator. IIRC, those that use a "best fit" strategy are generally much superior at avoiding fragmentation, if a little slower. However, it has also been shown that for any allocation strategy, there are pathological worst cases. Fortunately, the typical allocation patterns of most applications are actually relatively benign for the allocators to handle. There's a bunch of papers out there if you're interested in the details:
Paul R. Wilson, Mark S. Johnstone, Michael Neely and David Boles. Dynamic Storage Allocation: A Survey and Critical Review. In Proceedings of the 1995
International Workshop on Memory Management, Springer Verlag LNCS, 1995
Mark S.Johnstone, Paul R. Wilson. The Memory Fragmentation Problem: Solved?
In ACM SIG-PLAN Notices, volume 34 No. 3, pages 26-36, 1999
M.R. Garey, R.L. Graham and J.D. Ullman. Worst-Case analysis of memory allocation algorithms. In Fourth Annual ACM Symposium on the Theory of Computing, 1972
Update:
Google TCMalloc: Thread-Caching Malloc
It has been found that it is quite good at handling fragmentation in a long running process.
I have been developing a server application that had problems with memory fragmentation on HP-UX 11.23/11.31 ia64.
It looked like this. There was a process that made memory allocations and deallocations and ran for days. And even though there were no memory leaks memory consumption of the process kept increasing.
About my experience. On HP-UX it is very easy to find memory fragmentation using HP-UX gdb. You set a break-point and when you hit it you run this command: info heap and see all memory allocations for the process and the total size of heap. Then your continue your program and then some time later your again hit the break-point. You do again info heap. If the total size of heap is bigger but the number and the size of separate allocations are the same then it is likely that you have memory allocation problems. If necessary do this check few fore times.
My way of improving the situation was this. After I had done some analysis with HP-UX gdb I saw that memory problems were caused by the fact that I used std::vector for storing some types of information from a database. std::vector requires that its data must be kept in one block. I had a few containers based on std::vector. These containers were regularly recreated. There were often situations when new records were added to the database and after that the containers were recreated. And since the recreated containers were bigger their did not fit into available blocks of free memory and the runtime asked for a new bigger block from the OS. As a result even though there were no memory leaks the memory consumption of the process grew. I improved the situation when I changed the containers. Instead of std::vector I started using std::deque which has a different way of allocating memory for data.
I know that one of ways to avoid memory fragmentation on HP-UX is to use either Small Block Allocator or use MallocNextGen. On RedHat Linux the default allocator seems to handle pretty well allocating of a lot of small blocks. On Windows there is Low-fragmentation Heap and it adresses the problem of large number of small allocations.
My understanding is that in an STL-heavy application you have first to identify problems. Memory allocators (like in libc) actually handle the problem of a lot of small allocations, which is typical for std::string (for instance in my server application there are lots of STL strings but as I see from running info heap they are not causing any problems). My impression is that you need to avoid frequent large allocations. Unfortunately there are situations when you can't avoid them and have to change your code. As I say in my case I improved the situation when switched to std::deque. If you identify your memory fragmention it might be possible to talk about it more precisely.
Memory fragmentation is most likely to occur when you allocate and deallocate many objects of varying sizes. Suppose you have the following layout in memory:
obj1 (10kb) | obj2(20kb) | obj3(5kb) | unused space (100kb)
Now, when obj2 is released, you have 120kb of unused memory, but you cannot allocate a full block of 120kb, because the memory is fragmented.
Common techniques to avoid that effect include ring buffers and object pools. In the context of the STL, methods like std::vector::reserve() can help.
A very detailed answer on memory fragmentation can be found here.
http://library.softwareverify.com/memory-fragmentation-your-worst-nightmare/
This is the culmination of 11 years of memory fragmentation answers I have been providing to people asking me questions about memory fragmentation at softwareverify.com
What is memory fragmentation?
When your app uses dynamic memory, it allocates and frees chunks of memory. In the beginning, the whole memory space of your app is one contiguous block of free memory. However, when you allocate and free blocks of different size, the memory starts to get fragmented, i.e. instead of a big contiguous free block and a number of contiguous allocated blocks, there will be a allocated and free blocks mixed up. Since the free blocks have limited size, it is difficult to reuse them. E.g. you may have 1000 bytes of free memory, but can't allocate memory for a 100 byte block, because all the free blocks are at most 50 bytes long.
Another, unavoidable, but less problematic source of fragmentation is that in most architectures, memory addresses must be aligned to 2, 4, 8 etc. byte boundaries (i.e. the addresses must be multiples of 2, 4, 8 etc.) This means that even if you have e.g. a struct containing 3 char fields, your struct may have a size of 12 instead of 3, due to the fact that each field is aligned to a 4-byte boundary.
How can I tell if memory fragmentation is a problem for my application? What kind of program is most likely to suffer?
The obvious answer is that you get an out of memory exception.
Apparently there is no good portable way to detect memory fragmentation in C++ apps. See this answer for more details.
What are good common ways to deal with memory fragmentation?
It is difficult in C++, since you use direct memory addresses in pointers, and you have no control over who references a specific memory address. So rearranging the allocated memory blocks (the way the Java garbage collector does) is not an option.
A custom allocator may help by managing the allocation of small objects in a bigger chunk of memory, and reusing the free slots within that chunk.
This is a super-simplified version for dummies.
As objects get created in memory, they get added to the end of the used portion in memory.
If an object that is not at the end of the used portion of memory is deleted, meaning this object was in between 2 other objects, it will create a "hole".
This is what's called fragmentation.
When you want to add an item on the heap what happens is that the computer has to do a search for space to fit that item. That's why dynamic allocations when not done on a memory pool or with a pooled allocator can "slow" things down. For a heavy STL application if you're doing multi-threading there is the Hoard allocator or the TBB Intel version.
Now, when memory is fragmented two things can occur:
There will have to be more searches to find a good space to stick "large" objects. That is, with many small objects scattered about finding a nice contigous chunk of memory could under certain conditions be difficult (these are extreme.)
Memory is not some easily read entity. Processors are limited to how much they can hold and where. They do this by swapping pages if an item they need is one place but the current addresses are another. If you are constantly having to swap pages, processing can slow down (again, extreme scenarios where this impacts performance.) See this posting on virtual memory.
Memory fragmentation occurs because memory blocks of different sizes are requested. Consider a buffer of 100 bytes. You request two chars, then an integer. Now you free the two chars, then request a new integer- but that integer can't fit in the space of the two chars. That memory cannot be re-used because it is not in a large enough contiguous block to re-allocate. On top of that, you've invoked a lot of allocator overhead for your chars.
Essentially, memory only comes in blocks of a certain size on most systems. Once you split these blocks up, they cannot be rejoined until the whole block is freed. This can lead to whole blocks in use when actually only a small part of the block is in use.
The primary way to reduce heap fragmentation is to make larger, less frequent allocations. In the extreme, you can use a managed heap that is capable of moving objects, at least, within your own code. This completely eliminates the problem - from a memory perspective, anyway. Obviously moving objects and such has a cost. In reality, you only really have a problem if you are allocating very small amounts off the heap often. Using contiguous containers (vector, string, etc) and allocating on the stack as much as humanly possible (always a good idea for performance) is the best way to reduce it. This also increases cache coherence, which makes your application run faster.
What you should remember is that on a 32bit x86 desktop system, you have an entire 2GB of memory, which is split into 4KB "pages" (pretty sure the page size is the same on all x86 systems). You will have to invoke some omgwtfbbq fragmentation to have a problem. Fragmentation really is an issue of the past, since modern heaps are excessively large for the vast majority of applications, and there's a prevalence of systems that are capable of withstanding it, such as managed heaps.
What kind of program is most likely to suffer?
A nice (=horrifying) example for the problems associated with memory fragmentation was the development and release of "Elemental: War of Magic", a computer game by Stardock.
The game was built for 32bit/2GB Memory and had to do a lot of optimisation in memory management to make the game work within those 2GB of Memory. As the "optimisation" lead to constant allocation and de-allocation, over time heap memory fragmentation occurred and made the game crash every time.
There is a "war story" interview on YouTube.
I am running a long run C++ application which allocates different objects and store it in several deque and maps,and deallocate these object from these data structures at a time.I'm experiencing a small amount of increase in memory(generally 1 mb to 2 mb) day by day.I have run memory leakage detector(valgrind),but i could'nt find any suspicious memory leakage.
I was wondering whether the problem is with the deque and maps where the objects are stored.
Whether the memory of deque and map releases the memory to OS as soon as object is popped from the data structures?
Can anyone please point out solutions or general possible cause for the increase in memory?
C++ standard provides no guarantees that delete will release memory to the OS. In fact many standard C++ libraries do not do this. If you want memory to be released to the OS then you will have to use your OS's own memory allocation routines.
Standard C++ library provides custom allocators which might help you do this.
You might run into heap fragmentation.
If you allocate memory blocks of different sizes, it could mean that large memory blocks are eventually divided into smaller blocks and become unusable. For example:
you allocate one large block (say 1 MB) and the runtime gets that from the OS
you freethe large block
you allocate a smaller block and malloc chops that off the freed 1 MB block
you try to allocate a 1 MB block again, but the free'd one is no longer big enough, so the runtime requests a new 1 MB block from the OS
If this goes on for days, you might end up with lots of free 0.99 MB blocks but the runtime will still have to get a new 1 MB block from the OS each time it is needed.