Size of dynamic array vs static array - c++

What is the maximum size of static array, and dynamic array? I think that there is no limit for dynamic array but why static arrays have a limited size?

Unhandled exception at 0x011164A7 in StackOverflow.exe: 0xC00000FD: Stack overflow (parameters: 0x00000000, 0x00482000)
This looks more like a runtime error. More precisely - stack overflow.
In most places the size of array is limited only by available memory. However, the limit on stack allocated objects is usually much more severe. By default, it's 1Mb on Windows and 8Mb on Linux. It looks like your array and other data already on the stack is taking more space than the limit.
There are few ways to avoid this error:
Make this array static or declare it at top level of your module. This way it will be allocated in .bss segment instead of stack.
Use malloc/new to explicitly allocate this array on heap.
Use C++ collections such as std::vector instead of arrays.
Increase stack size limit. On Linux this can be done with ulimit -s unlimited

The maximum size of an array is determined by the amount of memory that a program can access. On a 32-bit system, the maximum amount of memory that can be addressed by a pointer is 2^32 bytes which is 4 gigabytes. The actual limit may be less, depending on operating system implementation details.
Note that this has nothing to do with the amount of physical memory you have available. Even on a machine with substantially less than 1 GB of RAM, you can allocate a 2 GB array... it's just going to be slow, as most of the array will be in virtual memory, swapped out to disk.

Related

Some problem about the initialization of the vector

I can initialize the vector with 10^8,but I can't initialize it with 10^9.Why?
vector<int> bucket;
bucket.resize(100000000); √
bucket.resize(1000000000); ×
It's because resize function will apply memory from heap. As you can figure that, the size will be 4000000000 bytes in your second resize operation, which is larger than the space your system could allocate(may be your computer couldn't find a piece of continuous space for you), and will cause exception and failure.
The maximum memory you can apply for depends on many reasons as follow:
the hardware limitation of physical memory.
the os bit(32 or 64)
memory left for user. Operating system should meet the kernel's need first. Generally speaking, windows kernel needs more memory than linux or unix.
..
In a word, it is hard to know accurate memory size you can use, because it's a dynamic value. But you can make a rough estimation by new operator, and here is a good reference.
C++ vectors allocate memory in a contiguous block and it is likely that the operating system cannot find such a block when the block size gets too large.
Would the error message you are getting indicate that you are running out of memory?
The point is: Even if you think that you have enough memory left on your system, if your program's address space can not accommodate the large block in one chunk then you cannot construct the large vector (the maximum address space size may differ for 32-bit and 64-bit programs).

Get the size of bin per memory allocation in jemalloc

I run a c++ program that uses jemalloc as memory allocator which pre-divides big chunks into small chunks of pre-defined sizes (i.e. 1, 2, 4, 8, ... bytes)
Even though I ask 110 bytes of memory allocation, it returns a memory with 128 bytes capacity.
In my program, I track the amount of dynamically allocated memory (with highly diverse size) and limit the memory allocation of threads to avoid OutOfMemory crash.
However, due to the discrepancy between the size requested and the actual size granted, I cannot exactly count the amounts of dynamically allocated bytes.
Is there any 'jemalloc' API that receives a request size as an input and provides an actual allocation size as an output?
Thanks
As per the documentation, you can use malloc_usable_size() and passing your allocated pointer.
The malloc_usable_size function returns the usable size of the allocation pointed to by ptr. The return value may be larger than the size that was requested during allocation. The malloc_usable_size function is not a mechanism for in-place realloc; rather it is provided solely as a tool for introspection purposes. Any discrepancy between the requested allocation size and the size reported by malloc_usable_size should not be depended on, since such behavior is entirely implementation-dependent.
More info :
Documentation

Why can't we declare an array, say of int data type, of any size within the memory limit?

int A[10000000]; //This gives a segmentation fault
int *A = (int*)malloc(10000000*sizeof(int));//goes without any set fault.
Now my question is, just out of curiosity, that if ultimately we are able to allocate higher space for our data structures, say for example, BSTs and linked lists created using the pointers approach in C have no as such memory limit(unless the total size exceeds the size of RAM for our machine) and for example, in the second statement above of declaring a pointer type, why is that we can't have an array declared of higher size(until it reaches the memory limit!!)...Is this because the space allocated is contiguous in a static sized array?.But then from where do we get the guarantee that in the next 1000000 words in RAM no other piece of code would be running...??
PS: I may be wrong in some of the statements i made..please correct in that case.
Firstly, in a typical modern OS with virtual memory (Linux, Windows etc.) the amount of RAM makes no difference whatsoever. Your program is working with virtual memory, not with RAM. RAM is just a cache for virtual memory access. The absolute limiting factor for maximum array size is not RAM, it is the size of the available address space. Address space is the resource you have to worry about in OSes with virtual memory. In 32-bit OSes you have 4 gigabytes of address space, part of which is taken up for various household needs and the rest is available to you. In 64-bit OSes you theoretically have 16 exabytes of address space (less than that in practical implementations, since CPUs usually use less than 64 bits to represent the address), which can be perceived as practically unlimited.
Secondly, the amount of available address space in a typical C/C++ implementation depends on the memory type. There's static memory, there's automatic memory, there's dynamic memory. The address space limits for each memory type are pre-set in advance by the compiler. Which raises the question: where are you declaring your large array? Which memory type? Automatic? Static? You provided no information, but this is absolutely necessary. If you are attempting to declare it as a local variable (automatic memory), then no wonder it doesn't work, since automatic memory (aka "stack memory") has very limited address space assigned to it. Your array simply does not fit. Meanwhile, malloc allocates dynamic memory, which normally has the largest amount of address space available.
Thirdly, many compilers provide you with options that control the initial distribution of address space between different kinds of memory. You can request a much larger stack size for your program by manipulating such options. Quite possibly you can request a stack so large, than your local array will fit in it without any problems. But in practice, for obvious reasons, it makes very little sense to declare huge arrays as local variables.
Assuming local variables, this is because on modern implementations automatic variables will be allocated on the stack which is very limited in space. This link gives some of the common stack sizes:
platform default size
=====================================
SunOS/Solaris 8172K bytes
Linux 8172K bytes
Windows 1024K bytes
cygwin 2048K bytes
The linked article also notes that the stack size can be changed for example in Linux, one possible way from the shell before running your process would be:
ulimit -s 32768 # sets the stack size to 32M bytes
While malloc on modern implementations will come from the heap, which is only limited to the memory you have available to the process and in many cases you can even allocate more than is available due to overcommit.
I THINK you're missing the difference between total memory, and your programs memory space. Your program runs in an environment created by your operating system. It grants it a specific memory range to the program, and the program has to try to deal with that.
The catch: Your compiler can't 100% know the size of this range.
That means your compiler will successfully build, and it will REQUEST that much room in memory when the time comes to make the call to malloc (or move the stack pointer when the function is called). When the function is called (creating a stack frame) you'll get a segmentation fault, caused by the stack overflow. When the malloc is called, you won't get a segfault unless you try USING the memory. (If you look at the manpage for malloc() you'll see it returns NULL when there's not enough memory.)
To explain the two failures, your program is granted two memory spaces. The stack, and the heap. Memory allocated using malloc() is done using a system call, and is created on the heap of your program. This dynamically accepts or rejects the request and returns either the start address, or NULL, depending on a success or fail. The stack is used when you call a new function. Room for all the local variables is made on the stack, this is done by program instructions. Calling a function can't just FAIL, as that would break program flow completely. That causes the system to say "You're now overstepping" and segfault, stopping the execution.

C++ stack array limit?

I'm running some code which may be pointing out I don't understand the difference between the heap and stack that well. Below I have some example code, where I either declare an array on the stack or the heap of 1234567 elements. Both work.
int main(int argc, char** argv){
int N = 1234567;
int A[N];
//int* A = new int[N];
}
But if we take N to be 12345678, I get a seg fault with int A[N], whereas the heap declaration still works fine. (I'm using g++ O3 -std=c++0x if that matters). What madness is this? Does the stack have a (rather small) array size limit?
This is because the stack is of a much smaller size than the heap. The heap can occupy all memory available to the program. By default VC++ compiles the stack with a size of 1 MB. The stack offers better performance but is for smaller quantities of data. In general it is not used for large data structures. This is why functions accepting lists/arrays/dictionaries/ect in c++ generally take a pointer or reference to that structure. Parameters passed by value are copied onto the stack and passing such structures would frequently cause programs to crash.
In your example you're using N int's, an int is 4 bytes. That makes the size of A[N] ~4.7 MB, much larger than the size of your stack.
The heap grows dynamically with allocation through malloc and co. The stack grows with each function call made in the course of running a program. The return address, arguments, local variables are usually stored in the stack (except that in certain processor architectures a handful of these are stored in registers instead). It is also possible (but not common) to allocate stack space dynamically.
The heap and the stack compete for the use of the same memory. You can think on one growing left to right and the other growing right to left. There is a possibility that, if left unchecked, they may collide. The stack is typically restrained from growing beyond a certain bound. This is relatively small because it is expected that it will use only a few bytes for most calls and only a few stack levels will be used. The limit is small but sufficient for most tasks. You can expand this limit by changing your build settings (not for Linux ELF binaries though) or by calling setrlimit. The OS may also impose a limit which you can change. There may be soft and hard limits (http://www.nics.tennessee.edu/node/327).
Going into greater detail about the limits falls outside the scope of the question. The bottomline is that the stack is limited and it is quite small because it competes with the heap for actual memory and for typical applications it need not be bigger.
http://en.wikipedia.org/wiki/Call_stack

Stack overflow - static memory vs. dynamic memory

If you write int m[1000000]; inside the main function of C/C++, it will get a runtime error for stack overflow. Instead if you write vector<int> m; and then push_back 1000000 elements there, it will run fine.
I am very curious about why this is happening. They both are local memory, aren't they? Thanks in advance.
Yes, the vector itself is an automatic (stack) object. But the vector holds a pointer to its contents (an internal dynamic array), and that will be allocated on the heap (by default). To simplify a little, you can think of vector as doing malloc/realloc or new[] calls internally (actually it uses an allocator).
EDIT: As I noted, automatic variables are allocated on the stack, while malloc generally allocates on the heap. The available memory for each are platform and even configuration-specific, but the available stack memory is typically much more limited.
The amount of stack memory is limited, because it has to be reserved in advance. However, the amount of heap memory will typically expend up to much higher limits imposed by your OS, but "nearly" reaching the limits of your virtual address space (2GB for a 32-bit machine, a whole lot more for a 64-bit machine).
You can increase the amount of reserved stack space, typically as a setting to your linker.
int m[1000000] - It will allocate the 1000000 ints on stack. as stack is limited so it will throw the stack overflow runtime error.
vector m; and then push_back of 1000000 elements are working because internally vector allocates the memory on heap not on stack. so in your application stack only the vector object is present so it is not throwing the stack overflow runtime error.
The vector object itself is on the stack; but internally it will allocate memory from the heap as needed to store an arbitary number of elements. So the stack cost is small and fixed for it.