Stack Overflow in Fortran program - fortran

I have a problem with my simple Fortran program. I am working in Fortran 77, using Compaq Visual Fortran. The program structure must be in the form of a main and a subroutine, because it is part of a big program related to the finite element method.
My issue is that I would like to set the values 10000 & 10000 for NHELE and NVELE respectively, but when I run the code, the program stops and gives the following error:
forrt1: server <170>: program Exception - stack overflow
I've tried iteratively reducing the required values, until I reached 507 & 507. At this point the code runs without errors.
However, increasing the values to 508 & 508 causes the same error to reappear.
I think the problem is related to the subroutine NIGTEE, because when I rearrange the program without it, everything works fine.
I've tried increasing the stack size to a maximum by using the menu project>>settings>>link>>output>>reserve & commit
but this didn't make a difference.
How can I solve this problem?
Here is my program:
PARAMETER(NHELE=508,NVELE=508)
PARAMETER(NHNODE=NHELE+1,NVNODE=NVELE+1)
PARAMETER(NTOTALELE=NHELE*NVELE)
DIMENSION MELE(NTOTALELE,4)
CALL NIGTEE(NHELE,NVELE,NHNODE,NVNODE,NTOTALELE,MELE)
OPEN(UNIT=7,FILE='MeshNO For Rectangular.TXT',STATUS='UNKNOWN')
WRITE(7,500) ((MELE(I,J),J=1,4),I=1,NTOTALELE)
500 FORMAT(4I20)
STOP
END
SUBROUTINE NIGTEE(NHELE,NVELE,NHNODE,NVNODE,NTOTALELE,MELE)
DIMENSION NM(NVNODE,NHNODE),NODE(4)
DIMENSION MELE(NTOTALELE,4)
KK=0
DO 20 I=1,NVNODE
DO 20 J=1,NHNODE
KK=KK+1
NM(I,J)=KK
20 CONTINUE
KK=0
DO 30 I=1,NVELE
DO 30 J=1,NHELE
NODE(1)=NM(I,J)
NODE(2)=NM(I,J+1)
NODE(3)=NM(I+1,J+1)
NODE(4)=NM(I+1,J)
KK=KK+1
DO 50 II=1,4
50 MELE(KK,II)=NODE(II)
30 CONTINUE
RETURN
END
Thanks.

Update:
Here's your actual problem. Your NM array is being declared as being a two-dimensional array of NHNODE cells by NVNODE rows. If that is 10,000 by 10,000, then you will need more than 381 megabytes of memory to allocate this array alone, aside from any other memory being used by your program. (By contrast, if the array is 500 by 500, you only need about 1 megabyte of memory for the same array.)
The problem is that old Fortran would allocate all the arrays directly in the code segment or on the stack. The concept of an OS "heap" (general purpose memory for large objects) had been invented by 1977, but Fortran 77 still didn't have any constructs for making use of it. So every time your subroutine is called, it has to push the stack pointer to make room for 381 megabytes of space on the stack. This is almost certainly larger than the amount of space your operating system is allowing for the stack segment, and you are overflowing stack memory (and hence getting a stack overflow).
The solution is to allocate that memory from a different place. I know in old Fortran it is possible to use COMMON blocks to statically allocate memory directly from your code segment. You still can't dynamically allocate more, so your subroutine can't be reentrant, but if your subroutine only gets called once at a time (which it appears to be) this may be the best solution.
A better solution would be to switch to Fortran 90 or newer and use the ALLOCATE keyword to dynamically allocate the arrays on the heap instead of the stack. Then you can allocate as large a chunk as your OS can give you, but you won't have to worry about overflowing the stack, since the memory will be coming from another place.
You may be able to fix this by changing it in the compiler, as M.S.B. suggests, but a better solution is to simply fix the code.

Does that compiler have an option to put arrays on the heap?
You could try a different compiler, such as one that is still supported. Fortran 95 compilers will compile FORTRAN 77. There are many choices, including open source. Intel Visual Fortran, the successor to Compaq Visual Fortran, has the heap option on Windows & Mac OS X for placing automatic and temporary arrays on the heap.

MELE is actually a larger array then NM: 10000 x 10000 x 4 x 4, versus 10001 x 100001 x 4 (supposing 4 byte numbers, as did Daniel) -- 1.49 GB versus 381 kB. MELE is declared in your main program and, from your tests, is acceptable, even though it is larger. So either adding NM pushes the memory usage over a limit (the total for these two arrays is 1.86 GB) or the difference in the declaration matters. The size of MELE is known at compile time, that of NM only at run time, so probably the compiler allocates the memory differently. Really in this program the size of NM is known, but in the subroutine the dimensions are received as arguments, so to the compiler the size is unknown. If you change this, the compiler may change how it allocates the memory for NM and the program may run. Don't pass the dimensions of NM as arguments -- make them constants. The elegant way would be to make an include file with three PARAMETER statements setting up the array sizes, then include that include file wherever needed. Quick and dirty, as a test, would be to repeat identical PARAMETER statements in the subroutine -- but then you have the same information twice, which has to be changed twice if you make changes. In either case, you have to remove the array dimensions from the subroutine arguments in both the call and subroutine declaration, or use different names, because the same variable in a subroutine can't be a parameter and an argument. If that doesn't work, declare NM in the main program and pass it as an argument to the subroutine.
Re the COMMON block -- the dimensions need to be known at compile time, and so can't be subroutine arguments -- same as above. As Daniel explained, putting the array into COMMON would definitely cause it not to be on the stack.
This is beyond the language standard -- how the compiler provides memory is an implementation detail, "under the hood". So the solution is partially guess work. The manual or help for the compiler might have answers, e.g., a heap allocation option.

Stack overflows related to array size is a warning sign that they are being pushed whole onto the call stack, instead of on the heap. Have you tried making the array variables allocatable? (I'm not sure if this is possible in F77, though)

Related

Very large array - C array vs C++ array. Visual Studio - exceeds max (268435456)

I am trying to create a very large array, to which, I then get the following error.
char largearray[1744830451];
warning LNK4084: total image size 1750372352 exceeds max (268435456); image may not run
I was told I could use a C-array and not C++ . I'm not sure I fully understood my friend's response. I am currently using Visual Studio 6.0 C++ . Do I need to get another compiler to do straight C or is it a method to how to declare the array that needs to change?
If I need to change compilers, does someone have suggestions?
The char array[size] syntax means the array will be created in the data section of your compiled program and not allocated at runtime.
Win32 PE code cannot exceed 256MB (according to your linker's error message), but the array you're declaring is 1.6GB in length.
If you want a 1.6GB array, use malloc (and don't forget to call free!)
...but why on earth are you running VC6?
If you predefine the size, then you are restricted to the stack size (stack has less size but faster), so it is better to define the size dynamically, which means your data is stored in heap (heap has bigger size but a little bit slower than stack).
Have a look at http://gribblelab.org/CBootcamp/7_Memory_Stack_vs_Heap.html which explains the difference of stack and heap.

What does "Memory allocated at compile time" really mean?

In programming languages like C and C++, people often refer to static and dynamic memory allocation. I understand the concept but the phrase "All memory was allocated (reserved) during compile time" always confuses me.
Compilation, as I understand it, converts high level C/C++ code to machine language and outputs an executable file. How is memory "allocated" in a compiled file ? Isn't memory always allocated in the RAM with all the virtual memory management stuff ?
Isn't memory allocation by definition a runtime concept ?
If I make a 1KB statically allocated variable in my C/C++ code, will that increase the size of the executable by the same amount ?
This is one of the pages where the phrase is used under the heading "Static allocation".
Back To Basics: Memory allocation, a walk down the history
Memory allocated at compile-time means the compiler resolves at compile-time where certain things will be allocated inside the process memory map.
For example, consider a global array:
int array[100];
The compiler knows at compile-time the size of the array and the size of an int, so it knows the entire size of the array at compile-time. Also a global variable has static storage duration by default: it is allocated in the static memory area of the process memory space (.data/.bss section). Given that information, the compiler decides during compilation in what address of that static memory area the array will be.
Of course that memory addresses are virtual addresses. The program assumes that it has its own entire memory space (From 0x00000000 to 0xFFFFFFFF for example). That's why the compiler could do assumptions like "Okay, the array will be at address 0x00A33211". At runtime that addresses are translated to real/hardware addresses by the MMU and OS.
Value initialized static storage things are a bit different. For example:
int array[] = { 1 , 2 , 3 , 4 };
In our first example, the compiler only decided where the array will be allocated, storing that information in the executable.
In the case of value-initialized things, the compiler also injects the initial value of the array into the executable, and adds code which tells the program loader that after the array allocation at program start, the array should be filled with these values.
Here are two examples of the assembly generated by the compiler (GCC4.8.1 with x86 target):
C++ code:
int a[4];
int b[] = { 1 , 2 , 3 , 4 };
int main()
{}
Output assembly:
a:
.zero 16
b:
.long 1
.long 2
.long 3
.long 4
main:
pushq %rbp
movq %rsp, %rbp
movl $0, %eax
popq %rbp
ret
As you can see, the values are directly injected into the assembly. In the array a, the compiler generates a zero initialization of 16 bytes, because the Standard says that static stored things should be initialized to zero by default:
8.5.9 (Initializers) [Note]:
Every object of static storage duration is zero-initialized at
program startup before any other initial- ization takes place. In some
cases, additional initialization is done later.
I always suggest people to disassembly their code to see what the compiler really does with the C++ code. This applies from storage classes/duration (like this question) to advanced compiler optimizations. You could instruct your compiler to generate the assembly, but there are wonderful tools to do this on the Internet in a friendly manner. My favourite is GCC Explorer.
Memory allocated at compile time simply means there will be no further allocation at run time -- no calls to malloc, new, or other dynamic allocation methods. You'll have a fixed amount of memory usage even if you don't need all of that memory all of the time.
Isn't memory allocation by definition a runtime concept?
The memory is not in use prior to run time, but immediately prior to execution starting its allocation is handled by the system.
If I make a 1KB statically allocated variable in my C/C++ code, will that increase the size of the executable by the same amount?
Simply declaring the static will not increase the size of your executable more than a few bytes. Declaring it with an initial value that is non-zero will (in order to hold that initial value). Rather, the linker simply adds this 1KB amount to the memory requirement that the system's loader creates for you immediately prior to execution.
Memory allocated in compile time means that when you load the program, some part of the memory will be immediately allocated and the size and (relative) position of this allocation is determined at compile time.
char a[32];
char b;
char c;
Those 3 variables are "allocated at compile time", it means that the compiler calculates their size (which is fixed) at compile time. The variable a will be an offset in memory, let's say, pointing to address 0, b will point at address 33 and c at 34 (supposing no alignment optimization). So, allocating 1Kb of static data will not increase the size of your code, since it will just change an offset inside it. The actual space will be allocated at load time.
Real memory allocation always happens in run time, because the kernel needs to keep track of it and to update its internal data structures (how much memory is allocated for each process, pages and so on). The difference is that the compiler already knows the size of each data you are going to use and this is allocated as soon as your program is executed.
Remember also that we are talking about relative addresses. The real address where the variable will be located will be different. At load time the kernel will reserve some memory for the process, lets say at address x, and all the hard coded addresses contained in the executable file will be incremented by x bytes, so that variable a in the example will be at address x, b at address x+33 and so on.
Adding variables on the stack that take up N bytes doesn't (necessarily) increase the bin's size by N bytes. It will, in fact, add but a few bytes most of the time.
Let's start off with an example of how adding a 1000 chars to your code will increase the bin's size in a linear fashion.
If the 1k is a string, of a thousand chars, which is declared like so
const char *c_string = "Here goes a thousand chars...999";//implicit \0 at end
and you then were to vim your_compiled_bin, you'd actually be able to see that string in the bin somewhere. In that case, yes: the executable will be 1 k bigger, because it contains the string in full.
If, however you allocate an array of ints, chars or longs on the stack and assign it in a loop, something along these lines
int big_arr[1000];
for (int i=0;i<1000;++i) big_arr[i] = some_computation_func(i);
then, no: it won't increase the bin... by 1000*sizeof(int)
Allocation at compile time means what you've now come to understand it means (based on your comments): the compiled bin contains information the system requires to know how much memory what function/block will need when it gets executed, along with information on the stack size your application requires. That's what the system will allocate when it executes your bin, and your program becomes a process (well, the executing of your bin is the process that... well, you get what I'm saying).
Of course, I'm not painting the full picture here: The bin contains information about how big a stack the bin will actually be needing. Based on this information (among other things), the system will reserve a chunk of memory, called the stack, that the program gets sort of free reign over. Stack memory still is allocated by the system, when the process (the result of your bin being executed) is initiated. The process then manages the stack memory for you. When a function or loop (any type of block) is invoked/gets executed, the variables local to that block are pushed to the stack, and they are removed (the stack memory is "freed" so to speak) to be used by other functions/blocks. So declaring int some_array[100] will only add a few bytes of additional information to the bin, that tells the system that function X will be requiring 100*sizeof(int) + some book-keeping space extra.
On many platforms, all of the global or static allocations within each module will be consolidated by the compiler into three or fewer consolidated allocations (one for uninitialized data (often called "bss"), one for initialized writable data (often called "data"), and one for constant data ("const")), and all of the global or static allocations of each type within a program will be consolidated by the linker into one global for each type. For example, assuming int is four bytes, a module has the following as its only static allocations:
int a;
const int b[6] = {1,2,3,4,5,6};
char c[200];
const int d = 23;
int e[4] = {1,2,3,4};
int f;
it would tell the linker that it needed 208 bytes for bss, 16 bytes for "data", and 28 bytes for "const". Further, any reference to a variable would be replaced with an area selector and offset, so a, b, c, d, and e, would be replaced by bss+0, const+0, bss+4, const+24, data+0, or bss+204, respectively.
When a program is linked, all of the bss areas from all the modules are be concatenated together; likewise the data and const areas. For each module, the address of any bss-relative variables will be increased by the size of all preceding modules' bss areas (again, likewise with data and const). Thus, when the linker is done, any program will have one bss allocation, one data allocation, and one const allocation.
When a program is loaded, one of four things will generally happen depending upon the platform:
The executable will indicate how many bytes it needs for each kind of data and--for the initialized data area, where the initial contents may be found. It will also include a list of all the instructions which use a bss-, data-, or const- relative address. The operating system or loader will allocate the appropriate amount of space for each area and then add the starting address of that area to each instruction which needs it.
The operating system will allocate a chunk of memory to hold all three kinds of data, and give the application a pointer to that chunk of memory. Any code which uses static or global data will dereference it relative to that pointer (in many cases, the pointer will be stored in a register for the lifetime of an application).
The operating system will initially not allocate any memory to the application, except for what holds its binary code, but the first thing the application does will be to request a suitable allocation from the operating system, which it will forevermore keep in a register.
The operating system will initially not allocate space for the application, but the application will request a suitable allocation on startup (as above). The application will include a list of instructions with addresses that need to be updated to reflect where memory was allocated (as with the first style), but rather than having the application patched by the OS loader, the application will include enough code to patch itself.
All four approaches have advantages and disadvantages. In every case, however, the compiler will consolidate an arbitrary number of static variables into a fixed small number of memory requests, and the linker will consolidate all of those into a small number of consolidated allocations. Even though an application will have to receive a chunk of memory from the operating system or loader, it is the compiler and linker which are responsible for allocating individual pieces out of that big chunk to all the individual variables that need it.
The core of your question is this: "How is memory "allocated" in a compiled file? Isn't memory always allocated in the RAM with all the virtual memory management stuff? Isn't memory allocation by definition a runtime concept?"
I think the problem is that there are two different concepts involved in memory allocation. At its basic, memory allocation is the process by which we say "this item of data is stored in this specific chunk of memory". In a modern computer system, this involves a two step process:
Some system is used to decide the virtual address at which the item will be stored
The virtual address is mapped to a physical address
The latter process is purely run time, but the former can be done at compile time, if the data have a known size and a fixed number of them is required. Here's basically how it works:
The compiler sees a source file containing a line that looks a bit like this:
int c;
It produces output for the assembler that instructs it to reserve memory for the variable 'c'. This might look like this:
global _c
section .bss
_c: resb 4
When the assembler runs, it keeps a counter that tracks offsets of each item from the start of a memory 'segment' (or 'section'). This is like the parts of a very large 'struct' that contains everything in the entire file it doesn't have any actual memory allocated to it at this time, and could be anywhere. It notes in a table that _c has a particular offset (say 510 bytes from the start of the segment) and then increments its counter by 4, so the next such variable will be at (e.g.) 514 bytes. For any code that needs the address of _c, it just puts 510 in the output file, and adds a note that the output needs the address of the segment that contains _c adding to it later.
The linker takes all of the assembler's output files, and examines them. It determines an address for each segment so that they won't overlap, and adds the offsets necessary so that instructions still refer to the correct data items. In the case of uninitialized memory like that occupied by c (the assembler was told that the memory would be uninitialized by the fact that the compiler put it in the '.bss' segment, which is a name reserved for uninitialized memory), it includes a header field in its output that tells the operating system how much needs to be reserved. It may be relocated (and usually is) but is usually designed to be loaded more efficiently at one particular memory address, and the OS will try to load it at this address. At this point, we have a pretty good idea what the virtual address is that will be used by c.
The physical address will not actually be determined until the program is running. However, from the programmer's perspective the physical address is actually irrelevant—we'll never even find out what it is, because the OS doesn't usually bother telling anyone, it can change frequently (even while the program is running), and a main purpose of the OS is to abstract this away anyway.
An executable describes what space to allocate for static variables. This allocation is done by the system, when you run the executable. So your 1kB static variable won't increase the size of the executable with 1kB:
static char[1024];
Unless of course you specify an initializer:
static char[1024] = { 1, 2, 3, 4, ... };
So, in addition to 'machine language' (i.e. CPU instructions), an executable contains a description of the required memory layout.
Memory can be allocated in many ways:
in application heap (whole heap is allocated for your app by OS when the program starts)
in operating system heap (so you can grab more and more)
in garbage collector controlled heap (same as both above)
on stack (so you can get a stack overflow)
reserved in code/data segment of your binary (executable)
in remote place (file, network - and you receive a handle not a pointer to that memory)
Now your question is what is "memory allocated at compile time". Definitely it is just an incorrectly phrased saying, which is supposed to refer to either binary segment allocation or stack allocation, or in some cases even to a heap allocation, but in that case the allocation is hidden from programmer eyes by invisible constructor call. Or probably the person who said that just wanted to say that memory is not allocated on heap, but did not know about stack or segment allocations.(Or did not want to go into that kind of detail).
But in most cases person just wants to say that the amount of memory being allocated is known at compile time.
The binary size will only change when the memory is reserved in the code or data segment of your app.
You are right. Memory is actually allocated (paged) at load time, i.e. when the executable file is brought into (virtual) memory. Memory can also be initialized on that moment. The compiler just creates a memory map. [By the way, stack and heap spaces are also allocated at load time !]
I think you need to step back a bit. Memory allocated at compile time.... What can that mean? Can it mean that memory on chips that have not yet been manufactured, for computers that have not yet been designed, is somehow being reserved? No. No, time travel, no compilers that can manipulate the universe.
So, it must mean that the compiler generates instructions to allocate that memory somehow at runtime. But if you look at it in from the right angle, the compiler generates all instructions, so what can be the difference. The difference is that the compiler decides, and at runtime, your code can not change or modify its decisions. If it decided it needed 50 bytes at compile time, at runtime, you can't make it decide to allocate 60 -- that decision has already been made.
If you learn assembly programming, you will see that you have to carve out segments for the data, the stack, and code, etc. The data segment is where your strings and numbers live. The code segment is where your code lives. These segments are built into the executable program. Of course the stack size is important as well... you wouldn't want a stack overflow!
So if your data segment is 500 bytes, your program has a 500 byte area. If you change the data segment to 1500 bytes, the size of the program will be 1000 bytes larger. The data is assembled into the actual program.
This is what is going on when you compile higher level languages. The actual data area is allocated when it is compiled into an executable program, increasing the size of the program. The program can request memory on the fly, as well, and this is dynamic memory. You can request memory from the RAM and the CPU will give it to you to use, you can let go of it, and your garbage collector will release it back to the CPU. It can even be swapped to a hard disk, if necessary, by a good memory manager. These features are what high level languages provide you.
I would like to explain these concepts with the help of few diagrams.
This is true that memory cannot be allocated at compile time, for sure.
But, then what happens in fact at compile time.
Here comes the explanation.
Say, for example a program has four variables x,y,z and k.
Now, at compile time it simply makes a memory map, where the location of these variables with respect to each other is ascertained.
This diagram will illustrate it better.
Now imagine, no program is running in memory.
This I show by a big empty rectangle.
Next, the first instance of this program is executed.
You can visualize it as follows.
This is the time when actually memory is allocated.
When second instance of this program is running, the memory would look like as follows.
And the third ..
So on and so forth.
I hope this visualization explains this concept well.
There is very nice explanation given in the accepted answer. Just in case i will post the link which i have found useful.
https://www.tenouk.com/ModuleW.html
One among the many thing what a compiler does is that create and maintain a SYMTAB(Symbol Table under the section.symtab). This will be purely created and maintained by compilers using any Data Structure(List, Trees...etc) and not for the developers eyes. Any access request made by the developers this is where it will hit first.
Now about the Symbol Table,
We only need to know about the two columns Symbol Name and the Offset.
Symbol Name column will have the variable names and the offset column will have the offset value.
Lets see this with an example:
int a , b , c ;
Now we all know that the register Stack_Pointer(sp) points to the Top of the Stack Memory. Let that be sp = 1000.
Now the Symbol Name column will have three values in it a then b and then c. Reminding you all that variable a will be at the top of the stack memory.
So a's equivalent offset value will be 0.
(Compile Time Offset_Value)
Then b and its equivalent offset value will be 1. (Compile Time Offset_Value)
Then c and its equivalent offset value will be 2. (Compile Time Offset_Value)
Now calculating a's Physical address (or) Runtime Memory Address = (sp + offset_value of a)
= (1000 + 0) = 1000
Now calculating b's Physical address (or) Runtime Memory Address = (sp - offset_value of b)
= (1000 - 1) = 996
Now calculating c's Physical address (or) Runtime Memory Address = (sp - offset_value of c)
= (1000 - 2) = 992
Therefore at the time of the compilation we will only be having the offset values and only during the runtime the actual physical addresses are calculated.
Note:
Stack_Pointer value will be assigned only after the program is loaded. Pointer Arithmetic happens between the Stack_Pointer register and the variables offset to calculate the variables Physical Address.
"POINTERS AND POINTER ARITHMETIC, WAY OF THE PROGRAMMING WORLD"
Share what I learned about this question.
You can understand this issue in two steps:
First, the compilation step: the compiler generates the binary. In Linux system, binary is a file in ELF (Executable and Linkable Format) format. ELF file contains several sections, including .bss and .data
.data
Initialized data, with read/write access rights
.bss
Uninitialized data, with read/write access rights (=WA)
.data and .bss just map to the segments of process's memory layout, which contains static variables.
second, the loading step. When the binary file get executed, the ELF file will be loaded into process's memory. The loader can find static variables' information from ELF file.
Simply speaking, the compiler and the loader follow the same standard to communicate with each other, and the standard is ELF format.

Declaring large character array in c++

I am trying right now to declare a large character array. I am using the character array as a bitmap (as in a map of booleans, not the image file type). The following code generates a compilation error.
//This is code before main. I want these as globals.
unsigned const long bitmap_size = (ULONG_MAX/(sizeof(char)));
char bitmap[bitmap_size];
The error is overflow in array dimension. I recognize that I'm trying to have my process consume a lot of data and that there might be some limit in place that prevents me from doing so. I am curious as to whether I am making a syntax error or if I need to request more resources from the kernel. Also, I have no interest in creating a bitmap with some class. Thank you for your time.
EDIT
ULONG_MAX is very much dependent upon the machine that you are using. On the particular machine I was compiling my code on it was well over 4.2 billion. All in all, I wouldn't to use that constant like a constant, at least for the purpose of memory allocation.
ULONG_MAX/sizeof(char) is the same as ULONG_MAX, which is a very large number. So large, in fact, that you don't have room for it even in virtual memory (because ULONG_MAX is probably the number of bytes in your entire virtual memory).
You definitely need to rethink what you are trying to do.
It's impossible to declare an array that large on most systems -- on a 32-bit system, that array is 4 GB, which doesn't fit into the available address space, and on most 64-bit systems, it's 16 exabytes (16 million terabytes), which doesn't fit into the available address space there either (and, incidentally, may be more memory than exists on the entire planet).
Use malloc() to allocate large amounts of memory. But be realistic. :)
As I understand it, the maximum size of an array in c++ is the largest integer the platform supports. It is likely that your long-type bitmap_size constant exceeds that limit.

Initializing C++ array in constant time

char buffer[1000] = {0};
This initializes all 1000 elements to 0. Is this constant time? If not, why?
It seems like the compiler could optimize this to O(1) based on the following facts:
The array is of fixed size and known at compile time
The array is located on the stack, which means that presumably the executable could contain this data in the data segment of the executable (on Windows) as a chunk of data that is already filled with 0's.
Note that answers can be general to any compiler, but I'm specifically interested in answers tested on the MSVC compiler (any version) on Windows.
Bonus Points: Links to any articles, white papers, etc. on the details of this would be greatly appreciated.
If it's inside a function, no, it's not constant time.
Your second assumption isn't correct:
"The array is located on the stack, which means that presumably the executable could contain this data in the data segment of the executable (on Windows) as a chunk of data that is already filled with 0's."
The stack isn't already filled with zeros. It's filled with junk leftover from previous function calls.
So it's not possible to do it in O(1) because it will have to zero it.
It can be O(1) only as a global variable. If it is a local variable (on stack) it is O(n) where n is the size of array.
Stack is a shared memory, you need to actively zero always you want to have 1000 zeros in there. An array like you defined is not implemented as a pointer to data segment, it is 1000 variables on the stack and must be initialized in O(1000).
EDIT: Dani is right, I have to fix my statement: If it is a global array, it is initialized when program starts. And it is O(n) as well.
It will never be constant time, global or not. its true the compiler initializes that, but the operating system must load all the file into memory, which takes O(n) time.
The array is located on the stack, which means that presumably the executable could contain this data in the data segment of the executable (on Windows) as a chunk of data that is already filled with 0's.
What if you recurse into the function that defines array? The global DATA segment would need to have a copy of this array for each function call to allow each function to have its own array to work over. The compiler would have to run your code to decide the maximum recursions are going to happen.
Also what happens when you have multiple threads in your program and each calls foo? All the sudden you have shared stuff in DATA that has to be locked. The locking might cause more performance problems than getting rid of the initialization.
I wouldn't worry about it too much too. Most platforms have fairly efficient ways of zero filling memory. Unless you profile it and find a problem, don't sweat it.
As others have pointed out, assumption 2 is wrong. Stack variables are allocated at run-time in O(1) time but are not normally initialised unless you're running a debug build.
PUSH ebp
MOV ebp, esp
SUB esp, 10
// function body code goes here
Here, the stack pointer 'esp' is decremented by 10 to make room for some local function variables. They are not initalised... that would require a loop.
This article seems friendly enough.
If it's a global static, the "constant" here is ZERO - initialization is done at COMPILE TIME.

c++ memory allocation question

im trying to create an array:
int HR[32487834];
doesn't this only take up about 128 - 130 megabytes of memory?
im using MS c++ visual studios 2005 SP1 and it crashes and tells me stack overflow.
Use a vector - the array data will be located on the heap, while you'll still get the array cleaned up automatically when you leave the function or block:
std::vector<int> HR( 32487834);
While your computer may have gigabytes of memory, the stack does not (by default, I think it is ~1 MB on windows, but you can make it larger).
Try allocating it on the heap with new [].
The stack is not that big by default. You can set the stack size with the /F compiler switch.
Without this option the stack size
defaults to 1 MB. The number argument
can be in decimal or C-language
notation. The argument can range from
1 to the maximum stack size accepted
by the linker. The linker rounds up
the specified value to the nearest 4
bytes. The space between /F and number
is optional.
You can also use the /STACK linker option for executables
But likely you should be splitting up your problem into parts instead of doing everything at once. Do you really need all that memory all at once?
You can usually allocate more memory on the heap than on the stack as well.