I have some code here where there is an array of "Bacon" objects. I can compile and run it and add objects to the array, but when I make the array size more than one million, running it gives 'bacon.exe has stopped working' and I have to close it. I think it might be a memory leak, but I am still learning about that. I am using the NetBeans IDE, and I tried allocating more memory at compile time, but I couldn't figure out how to do that. Note: it isn't because my whole computer runs out of memory, because I still have 2 GB free after running the program. Here is my code:
#include <iostream>
#include "Bacon.h"
using namespace std;

int main() {
    const int objs = 1000000;
    Bacon *bacs[objs];
    for (int i = 0; i < objs; i++) {
        bacs[i] = new Bacon(2, 3);
    }
    for (int i = 0; i < objs; i++) {
        bacs[i]->print();
    }
    cin.ignore();
    return 0;
}
Your computer has plenty of memory, but only so much of it can be allocated on the stack. Try allocating it on the heap instead:
Bacon **bacs = new Bacon*[objs];
and later:
delete[] bacs;
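Putting it together, a minimal sketch of the heap-based version might look like this (assuming Bacon really has the two-argument constructor and print() used in your code):

#include <iostream>
#include "Bacon.h"
using namespace std;

int main() {
    const int objs = 1000000;
    Bacon **bacs = new Bacon*[objs];   // the array of pointers now lives on the heap
    for (int i = 0; i < objs; i++)
        bacs[i] = new Bacon(2, 3);
    for (int i = 0; i < objs; i++)
        bacs[i]->print();
    for (int i = 0; i < objs; i++)     // free the objects themselves...
        delete bacs[i];
    delete[] bacs;                     // ...then the pointer array
    cin.ignore();
    return 0;
}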
You're probably out of stack space.
You allocate a huge array of pointers right on the stack. The stack is a limited resource (typically 1 MB per thread on Windows and 8 MB on Linux). A pointer is usually 4 or 8 bytes; multiply that by one million and you overrun the limit.
As I understand it, when you request memory, the operating system you use (Windows in this case, I think) decides whether to let you take it; only then can you take and use that space.
For some reason, Windows may not be letting you take that much memory in this situation. But I'm not much of an expert in this field; I am offering this only as a thought.
The default stack size is 1 MB (in Visual Studio 2005 on Windows; other toolchains probably use a similar number). See http://msdn.microsoft.com/en-us/library/tdkhxaks%28v=vs.80%29.aspx for how to change it.
On Linux, use ulimit to change it.
The heap solution is valid too, but in your example you don't strictly need the heap. Requesting heap memory from the OS for something that never escapes the current function is not good practice: in assembly, a stack allocation translates to little more than a larger subtraction from the stack pointer, whereas heap memory is requested through other mechanisms that require more processing.
When assigning values to a large array, the used memory keeps increasing even though no new memory is allocated. I am checking the used memory simply with the Task Manager (Windows) or System Monitor (Ubuntu).
The problem is the same on both operating systems. I am using gcc 4.7 or 4.6 respectively.
This is my code:
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    int i, j;
    int n = 40000000; // array size
    int s = 100;
    double *array;

    array = malloc(n * sizeof(double)); // allocate array
    if (array == NULL) {
        return -1;
    }
    for (i = 0; i < n; i++) {     // loop over the array; memory use increases during this loop
        for (j = 0; j < s; j++) { // inner loop just slows the program down
            array[i] = 3.0;
        }
    }
    return 0;
}
I do not see any logical problem, and to my knowledge I do not exceed any system limits either. So my questions are:
can the problem be reproduced by others?
what is the reason for the growing memory?
how do I solve this issue?
When modern systems 'allocate' memory, the pages are not actually allocated in physical RAM; you get a virtual memory allocation. Only as you write to those pages are physical pages taken. So the virtual memory used increases when you do the malloc(), but the physical RAM is taken only when you write the values in (on a page-by-page basis).
You should see the virtual memory used increase immediately. After that the RSS, or real memory used, will increase as you write into the newly allocated memory. More information at How to measure actual memory usage of an application or process?
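To see this for yourself, here is a rough, Linux-only sketch (it assumes the VmRSS line format of /proc/self/status) that prints the resident set size before and after the pages are actually written:

#include <cstdio>
#include <cstdlib>

// Returns the process's resident set size in kB, or -1 if it cannot be read.
static long rss_kb() {
    FILE *f = fopen("/proc/self/status", "r");
    char line[256];
    long kb = -1;
    while (f && fgets(line, sizeof line, f))
        if (sscanf(line, "VmRSS: %ld kB", &kb) == 1) break;
    if (f) fclose(f);
    return kb;
}

int main() {
    const size_t n = 40000000;
    double *array = (double *)malloc(n * sizeof(double));
    if (array == NULL) return -1;
    printf("RSS after malloc: %ld kB\n", rss_kb());  // small: pages not touched yet
    for (size_t i = 0; i < n; i++) array[i] = 3.0;   // writing faults the pages in
    printf("RSS after writes: %ld kB\n", rss_kb());  // now roughly n*8/1024 kB larger
    free(array);
    return 0;
}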
This is because memory allocated on Linux, and on many other operating systems, isn't actually given to your program until you use it.
So you could malloc 1 GB on a 256 MB machine and not run out of memory until you actually tried to use all 1 GB.
On Linux there is a group of overcommit settings that changes this behavior. See Cent OS: How do I turn off or reduce memory overcommitment, and is it safe to do it?
I am using C++ on Windows 7 with MSVC 9.0, and have also been able to test and reproduce on Windows XP SP3 with MSVC 9.0.
If I allocate 1 GB of 0.5 MB sized objects, then when I delete them everything is OK and behaves as expected. However, if I allocate 1 GB of 0.25 MB sized objects, then when I delete them the memory remains reserved (yellow in Address Space Monitor) and from then on can only be used for allocations smaller than 0.25 MB.
This simple code will let you test both scenarios by changing which struct is typedef'd. After it has allocated and deleted the structs it will then allocate 1 GB of 1 MB char buffers to see if the char buffers will use the memory that the structs once occupied.
struct HalfMegStruct
{
    HalfMegStruct() : m_Next(0) {}

    /* return the number of objects needed to allocate one gig */
    static int getIterations() { return 2048; }

    int m_Data[131071];
    HalfMegStruct* m_Next;
};

struct QuarterMegStruct
{
    QuarterMegStruct() : m_Next(0) {}

    /* return the number of objects needed to allocate one gig */
    static int getIterations() { return 4096; }

    int m_Data[65535];
    QuarterMegStruct* m_Next;
};

// which struct to use
typedef QuarterMegStruct UseType;

int main()
{
    UseType* first = new UseType;
    UseType* current = first;

    for ( int i = 0; i < UseType::getIterations(); ++i )
        current = current->m_Next = new UseType;

    while ( first->m_Next )
    {
        UseType* temp = first->m_Next;
        delete first;
        first = temp;
    }
    delete first;

    for ( unsigned int i = 0; i < 1024; ++i )
        // one meg buffer; I'm aware this is a leak, but it's for illustrative purposes.
        new char[ 1048576 ];

    return 0;
}
Below you can see my results from within Address Space Monitor [screenshots not reproduced here]. Let me stress that the only difference between these two end results is the size of the structs being allocated up to the 1 GB marker.
This seems like quite a serious problem to me, and one that many people could be suffering from and not even know it.
So is this by design or should this be considered a bug?
Can I make small deleted objects actually be free for use by larger allocations?
And more out of curiosity, does a Mac or a Linux machine suffer from the same problem?
I cannot state this positively, but it does look like memory fragmentation (in one of its many forms). The allocator (malloc) might be keeping buckets of different sizes to enable fast allocation; after you release memory, instead of giving it directly back to the OS, it keeps the buckets so that later allocations of the same size can be served from the same memory. If this is the case, the memory would still be available for further allocations of the same size.
This type of optimization is usually disabled for big objects, as it requires keeping memory reserved even when it is not in use. If the threshold is somewhere between your two sizes, that would explain the behavior.
Note that while this might look weird, in most programs (real ones, not tests) memory usage patterns repeat: if you asked for 100k blocks once, it is more often than not the case that you will do so again. Keeping the memory reserved can improve performance and actually reduce the fragmentation that would come from all requests being granted from the same bucket.
If you want to invest some time, you can learn how your allocator works by analyzing its behavior. Write some tests that acquire blocks of size X, release them, then acquire blocks of size Y, and show the memory usage at each step. Fix the value of X and play with Y. If the requests for both sizes are served from the same buckets you will not see reserved/unused memory, while when they are served from different buckets you will see exactly the effect you observed.
I don't usually code for windows, and I don't even have Windows 7, so I cannot positively state that this is the case, but it does look like it.
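As a concrete starting point, a rough sketch of such a test harness might look like the following (the block sizes, counts, and pause points are arbitrary choices for illustration; memory usage is meant to be watched from an external tool such as Task Manager or Address Space Monitor while the program waits):

#include <cstdio>
#include <vector>

// Pause so the process can be inspected in an external memory monitor.
static void pause_here(const char *msg) {
    printf("%s -- press Enter to continue\n", msg);
    getchar();
}

int main() {
    const size_t X = 256 * 1024;    // ~0.25 MB blocks
    const size_t Y = 1024 * 1024;   // 1 MB blocks
    const int    count = 2048;      // ~512 MB total in the first phase

    std::vector<char *> blocks;
    for (int i = 0; i < count; ++i) blocks.push_back(new char[X]);
    pause_here("allocated X-sized blocks");

    for (size_t i = 0; i < blocks.size(); ++i) delete[] blocks[i];
    blocks.clear();
    pause_here("freed X-sized blocks");

    for (int i = 0; i < count / 4; ++i) blocks.push_back(new char[Y]);
    pause_here("allocated Y-sized blocks");

    for (size_t i = 0; i < blocks.size(); ++i) delete[] blocks[i];
    return 0;
}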
I can confirm the same behaviour with g++ 4.4.0 under Windows 7, so it's not in the compiler. In fact, the program fails when getIterations() returns 3590 or more -- do you get the same cutoff? This looks like a bug in Windows system memory allocation. It's all very well for knowledgeable souls to talk about memory fragmentation, but everything got deleted here, so the observed behaviour definitely shouldn't happen.
Using your code I performed your test and got the same result. I suspect that David Rodríguez is right in this case.
I ran the test and had the same result as you. It seems this "bucket" behaviour really is going on.
I tried two other tests as well. Instead of allocating 1 GB of data using 1 MB buffers, I allocated it the same way the memory was first allocated, after deleting. In the second test I allocated the half-meg buffers, cleaned up, and then allocated the quarter-meg buffers, adding up to 512 MB for each. Both tests had the same memory result in the end: only 512 MB is allocated, and there is no large chunk of reserved memory.
As David mentions, most applications tend to make allocations of the same size. One can see quite clearly why this could be a problem, though.
Perhaps the solution is that if you are allocating many smaller objects in this way, you would do better to allocate a large block of memory and manage it yourself. Then, when you're done, free the large block.
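A very simplified illustration of that idea, assuming fixed-size objects (this is a sketch, not a production allocator; it hands out slots and only ever frees the whole block at once):

#include <cstddef>
#include <new>

// Carve fixed-size slots out of one big allocation instead of calling
// new/delete per object; destroying the pool releases everything in one shot,
// so the heap never sees thousands of small frees.
class FixedPool {
public:
    FixedPool(size_t slotSize, size_t slotCount)
        : m_slotSize(slotSize), m_slotCount(slotCount), m_used(0),
          m_block(new char[slotSize * slotCount]) {}
    ~FixedPool() { delete[] m_block; }

    void *allocate() {                 // returns 0 when the pool is exhausted
        if (m_used == m_slotCount) return 0;
        return m_block + (m_used++) * m_slotSize;
    }

private:
    size_t m_slotSize, m_slotCount, m_used;
    char  *m_block;
};

// Usage sketch: construct objects into the pool with placement new, e.g.
//   FixedPool pool(sizeof(QuarterMegStruct), 4096);
//   QuarterMegStruct *p = new (pool.allocate()) QuarterMegStruct;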
I spoke with some authorities on the subject (Greg, if you're out there, say hi ;D) and can confirm that what David is saying is basically right.
As the heap grows in the first pass of allocating ~0.25MB objects, the heap is reserving and committing memory. As the heap shrinks in the delete pass, it decommits at some pace but does not necessarily release the virtual address ranges it reserved in the allocation pass. In the last allocation pass, the 1MB allocations are bypassing the heap due to their size and thus begin to compete with the heap for VA.
Note that the heap is reserving the VA, not keeping it committed. VirtualAlloc and VirtualFree can help explain the difference if you're curious. This fact doesn't solve the problem you ran into, which is that the process ran out of virtual address space.
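If you want to see the reserve/commit distinction in isolation, here is a small Windows-only demo (not the heap's internals, just the raw VirtualAlloc/VirtualFree calls mentioned above):

#include <windows.h>
#include <cstdio>
#include <cstring>

int main() {
    const SIZE_T size = 64 * 1024 * 1024;   // 64 MB

    // Reserving takes virtual address space but no physical storage yet.
    void *p = VirtualAlloc(NULL, size, MEM_RESERVE, PAGE_NOACCESS);
    printf("reserved at %p (nothing committed yet)\n", p);

    // Committing backs the range with storage so it can actually be used.
    VirtualAlloc(p, size, MEM_COMMIT, PAGE_READWRITE);
    memset(p, 0, size);

    // Decommitting gives the storage back but keeps the address range reserved.
    VirtualFree(p, size, MEM_DECOMMIT);

    // Only MEM_RELEASE frees the address range itself.
    VirtualFree(p, 0, MEM_RELEASE);
    return 0;
}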
This is a side-effect of the Low-Fragmentation Heap.
http://msdn.microsoft.com/en-us/library/aa366750(v=vs.85).aspx
You should try disabling it to see if that helps. Run against both GetProcessHeap and the CRT heap (and any other heaps you may have created).
I have a strongly recursive function that creates a (very small) std::multimap locally for each function instance using new (which goes down to malloc/calloc in the standard library). After some hundreds of recursions new fails, although I am using a native 64-bit application on Windows XP x64. The machine has 10 GB of RAM and the application only uses about 1 GB. No other big apps are running.
This happens a few minutes after starting the program and its recursive function. The recursive function has been called about 150,000 times at that point, with a maximum recursion depth of probably a few hundred. The problem occurring is not a stack overflow.
I am using Visual Studio 2005 and the Dinkumware STL. The fault occurs in a release build.
EDIT:
Ok, here is some code.
I have rearranged the code now and put the map on the stack, but it uses new to initialize, and that is where it fails. I also tried a std::multimap instead of hash_multimap. None of this changed the behavior.
int TraceBackSource(CalcParams *CalcData, CKnoObj *theKno, int qualNo,
                    double maschFak, double partAmount, int MaschLevel, char *MaschID,
                    double *totalStrFlow, int passNo,
                    CTraceBackData *ResultData)
{
    typedef std::hash_multimap<double, CStrObj *> StrFMap;
    StrFMap thePipes;

    for(...)
    {
        ...
        thePipes.insert(std::make_pair(thisFlow, theStr));
    }

    // max. 5 elements in "thePipes"
    for(StrFMap::iterator it = thePipes.begin(); it != thePipes.end(); it++)
    {
        ...
        try
        {
            TraceBackSource(CalcData, otherKno, qualNo, maschFak * nodeFak, nodeAmount,
                            SubMaschlevel, newMaschID, totalStrFlow, passNo, ResultData);
        }
        catch(std::exception &it)
        {
            Trace(0, "*** Exception, %s", it.what());
            return 0;
        }
        return 0;
    }
}
Interestingly, the first failure lands in the catch handler; quite a bit later I end up with an ACCESS VIOLATION and a corrupted stack.
The amount of RAM on your machine and the other processes running are irrelevant for this particular scenario. Every process gets the same amount of virtual address space assigned to it, regardless of the amount of RAM on the machine or of the other processes running.
What's happening here is likely one of the following:
You've simply allocated too much memory. Hard to do in 64-bit, yes, but possible.
There is no contiguous block of available memory of the requested size.
Your numbers suggest the easily exhausted default 1 MB stack size (circa 150K x 8). So from a quick look at your code (especially that map::insert, and the fact that the for(...) code isn't shown) you are running into an interaction with stackoverflow.com :)
You are probably hitting the stack limit of the OS you're running on. On Windows, use the Visual Studio linker settings, editbin.exe, or some exotic unportable API; triple your stack size and see whether that significantly changes the observed recursion count at the time of the exception.
Your application is probably suffering from memory fragmentation. There might be plenty of memory available, but it may be fragmented into much smaller contiguous blocks than your application asks for.
As Majkara mentions, the thread stack space is a fixed size and you are running out of it; it doesn't matter how much memory you have free. You need to rewrite your algorithm to be iterative, using a std::stack allocated on the heap (or some other data structure) to keep track of the depth.
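A sketch of the general recursion-to-iteration pattern (this deliberately omits the real TraceBackSource parameters; the Node type, process(), and the traversal are hypothetical stand-ins for whatever the real data structure looks like):

#include <stack>
#include <vector>
#include <cstdio>

// Hypothetical node type standing in for the real graph being traced.
struct Node {
    std::vector<Node*> children;
};

// Stand-in for whatever work the recursive body did per node.
static void process(Node *n) { printf("visiting %p\n", (void*)n); }

void traceIterative(Node *root) {
    std::stack<Node*> work;            // the explicit "call stack" lives on the heap
    work.push(root);
    while (!work.empty()) {
        Node *n = work.top();
        work.pop();
        process(n);
        for (size_t i = 0; i < n->children.size(); ++i)
            work.push(n->children[i]);
    }
}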
I want to declare an array:
int a[256][256][256];
and the program hangs. (I have already commented out all other code...)
When I try int a[256][256], it runs okay.
I am using the MinGW C++ compiler with Eclipse CDT.
My code is:
int main() {
    int a[256][256][256];
    return 0;
}
Any comments are welcome.
This might happen if your array is local to a function. In that case, you'd need a stack size sufficient to hold 2^24 ints (2^26 bytes, or 64 MB).
If you make the array a global, it should work. I'm not sure how to modify the stack size in Windows; in Linux you'd use "ulimit -s 10000" (units are KB).
If you have a good reason not to use a global (concurrency or recursion), you can use malloc/free. The important thing is to either increase your stack (not a good idea if you're using threads), or get the data on the heap (malloc/free) or the static data segment (global).
Ideally you'd get program termination (a core dump) rather than a hang; that's what I get under Cygwin.
Maybe you don't have 64 MB of free contiguous memory? Kind of hard to imagine, but possible...
You want something like this:
#include <stdlib.h>

int main()
{
    int *a;
    a = (int*)malloc(256 * 256 * 256 * sizeof(int)); // allocate the array on the heap
    if (a == NULL)
        return 1; // allocation failed
    /* use the array as a[(x*256 + y)*256 + z] ... */
    free(a);
    return 0;
}
Otherwise, you get something like this (screenshot of the resulting crash): http://bweaver.net/files/stackoverflow1.jpg
Because, as others have pointed out, in your code you're allocating the array on the stack, and blowing it up.
Allocating the array via malloc or its friends is the way to go. (Creating it globally works too, if you must go that route.)
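If the surrounding code is C++ rather than plain C, a std::vector gives the same heap allocation with automatic cleanup; a minimal sketch (indexing flattened by hand):

#include <vector>

int main()
{
    // 256*256*256 ints on the heap, zero-initialized, freed automatically.
    std::vector<int> a(256 * 256 * 256, 0);
    a[(1 * 256 + 2) * 256 + 3] = 42;   // element [1][2][3] of the flattened cube
    return 0;
}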
Simple question: I'm writing a program that needs to open huge image files (8k x 8k), but I'm a little confused about how to initialize the huge arrays that hold the images in C++.
I've been trying something like this:
long long SIZE = 8092*8092; ///8096*8096
double* array;
array = (double*) malloc(sizeof(double) * SIZE);
if (array == NULL)
{
    fprintf(stderr, "Could not allocate that much memory");
}
But sometimes my NULL check does not catch that the array was not initialized; any idea why?
Also, I can't initialize more than 2 or 3 arrays, even when running on an x64 machine with 12 GB of RAM; any idea why?
I would really prefer not to have to work with sections of the array instead. Any help is welcome.
Thanks.
You're not running into an array size problem. 8K x 8K is merely 64M elements. Even 64M doubles (sizeof == 8) are not an issue; that requires a mere 512 MB. A 32-bit application (no matter where it's running) should be able to allocate a few of them; not 8, because the OS typically reserves part of the address space for itself (often slightly over 2 GB), and sometimes not even 3 when memory is fragmented.
The behavior of "malloc failed but didn't return NULL" is a Linux overcommit configuration issue, fixed by # echo 2 > /proc/sys/vm/overcommit_memory
malloc() does not initialize memory, it just reserves it. You will have to initialize it explicitly, e.g. via memset() from string.h:
array = (double*) malloc(SIZE * sizeof(double));
if (array) memset(array, 0, SIZE * sizeof(double));
However, in C++ you should use new instead of malloc. Note that plain new throws std::bad_alloc on failure rather than returning NULL, so use the nothrow form if you want to keep the check:
double* array = new (std::nothrow) double[SIZE];
if (!array) {
    cerr << "Could not allocate that much memory" << endl;
} else {
    for (long long i = 0; i < SIZE; i++) array[i] = 0.0;
}
Regarding size: each such array is 512 MB. Are you positively sure you need double precision (which means the image has 64-bit pixel depth)? Maybe a float would suffice? That would halve the memory footprint.
You might be running into a 2 GB per-process address space limit if you are running a 32-bit operating system. With a few hundred MB of system libraries and other data, plus 2 or 3 arrays of 512 MB each, that reaches 2 GB easily. A 64-bit OS would help you there.
Are you compiling your application as a 32-bit application (the default in Visual Studio, if that's what you're using), or as a 64-bit application? You shouldn't have troubles if you build it as a 64-bit app.
malloc only allocates (reserves memory and returns a pointer); calloc allocates and initializes (writes zeros to that memory).
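A tiny illustration of the difference (the size here just reuses the SIZE from the question):

#include <cstdlib>
#include <cstring>

int main()
{
    const size_t SIZE = 8092 * 8092;

    // calloc: allocation and zero-initialization in one call
    double *a = (double *)calloc(SIZE, sizeof(double));

    // malloc + memset: same end result in two steps
    double *b = (double *)malloc(SIZE * sizeof(double));
    if (b) memset(b, 0, SIZE * sizeof(double));

    free(a);   // free(NULL) is harmless if either allocation failed
    free(b);
    return 0;
}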
It seems that you have no contiguous memory block of that size (~500 MB) in the C runtime heap. Instead of copying the file into memory, try mapping the image into the process's address space. You could map only the necessary parts of the file.
Just as a side note: although you'd rather not have to deal with the whole image not being in memory at once, there are reasons not to keep it all there. Maybe think about an abstraction that keeps only the currently needed chunk in memory; the rest of the program code can then be written as though it were ignorant of the memory issues.
I would really prefer not to have to work with sections of the array instead. Any help is welcome.
Have you looked into memory-mapped files?
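For example, a POSIX-only sketch using mmap (on Windows the equivalents are CreateFileMapping and MapViewOfFile); "image.raw" is a placeholder filename and a raw-pixel layout is assumed:

#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <cstdio>

int main()
{
    int fd = open("image.raw", O_RDONLY);          // placeholder file name
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    fstat(fd, &st);

    // The file's pages are brought in on demand; nothing is read up front.
    void *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    const unsigned char *pixels = static_cast<const unsigned char *>(p);
    printf("first byte: %d\n", pixels[0]);         // touching a page faults it in

    munmap(p, st.st_size);
    close(fd);
    return 0;
}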
Yep, sounds a lot like heap fragmentation, as Kirill pointed out. See also: How to avoid heap fragmentation?
I suggest using compression: decompress the part you need to process in your code whenever you need it, and compress it again when you're done with that part.
Second proposal: write code that overloads the pointer arithmetic operators "operator+" and "operator-" so you can use non-contiguous memory buffers; using several smaller buffers makes your code more stable than one large contiguous one. I have done this and written some of the operator overloading; see http://code.google.com/p/effoaddon/source/browse/trunk/devel/effo/codebase/addons/mem/include/mcur_i.h for an example. When I tested with 47 GB of malloc()ed system memory on an x86_64 machine, I allocated just 1 GB per malloc() call, so 47 memory blocks in total. EDIT: when I instead tried to allocate as much as possible with a single malloc(), I only got 30 GB on a 48 GB system, i.e. less than 70%, because the larger the buffer requested per malloc(), the more management memory the system/libc itself consumes. (I called mlock() to prevent the allocated memory from being swapped out to disk.)
Third: try POSIX file mapping, mapping one image into memory at a time.
Btw: calling malloc() is more predictable than new even when writing C++, because under memory pressure new tends to throw an exception instead of returning NULL.