Stack allocation of unknown size complexity - c++

I know that stack allocation takes constant time. From what I understand, this happens because the allocation size can be determined at compile time. In this case the program knows how much memory is needed to run (for example) a function and the entire chunk of memory that is needed can be allocated at once.
What happens in cases where the allocation size is only known at run time?
Consider this code snippet,
void func(){
int n;
std::cin >> n;
// this is a static allocation and its size is only known at run time
int arr[n];
}
Edit: I'm using g++ 5.4 on linux and this code compiles and runs.

What happens in cases where the allocation size is only known at run time?
Then the program is ill-formed, and therefore compilers are not required to compile the program.
If a compiler does compile it, then it is up to the compiler to decide what happens (other than issuing a diagnostic message, as required by the standard). This is usually called a "language extension". What probably happens is that an amount of memory determined by the run-time value is allocated for the array. More details may be available in the compiler's documentation.

It is impossible (using standard C++ language) to allocate space on the stack without knowing how much space to allocate.
The line int arr[n]; is not valid C++ code. It only compiles because the compiler you are using decided to let you do that, so for more information you should refer to your compiler's documentation.
For GCC, you might take a look at this page: https://gcc.gnu.org/onlinedocs/gcc/Variable-Length.html
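For comparison, here is a minimal sketch of a standard-conforming way to get an array whose length is only known at run time. The helper name make_squares is hypothetical and only serves the illustration; std::vector keeps its elements in dynamically allocated storage, so no compiler extension is involved.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: builds an array whose length is only known at run time.
// std::vector stores its elements in dynamically allocated memory, so this is
// valid standard C++ and does not rely on variable-length arrays.
std::vector<int> make_squares(std::size_t n)
{
    std::vector<int> arr(n);                 // n ints, all initialized to 0
    for (std::size_t i = 0; i < n; ++i)
        arr[i] = static_cast<int>(i * i);
    return arr;                              // storage is freed by the vector
}
```

In the question's func(), this would amount to replacing int arr[n]; with std::vector<int> arr(n);.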

I'm using g++ 5.4 on linux and this code compiles and runs.
Yes, and this invalid code compiles under MSVC 2010:
int& u = 0;
The standard says that this code is ill-formed. Yet MSVC compiles it! This is because of a compiler extension.
GCC accepts it because it implements it as an extension.
When compiling with -pedantic-errors, GCC will reject the code correctly.
Likewise, MSVC has the /permissive- compiler argument to disable some of its extensions.

The memory allocation procedure varies when the size to be allocated is determined at run time. In standard C++, memory whose size is not known at compile time is normally reserved on the heap rather than on the stack. Heap allocation can continue until the memory available to the process is used up. Also, in languages like C and C++ the allocation persists until the user explicitly deallocates the memory after use.
For the example given above, the standard-conforming equivalent reserves n*sizeof(int) bytes on the heap; that memory is garbage collected (in Java or Python) or manually deallocated through the pointer it was assigned to (in C/C++).


How does GCC create an array on the stack without its size being given by a constant variable? [duplicate]

How does this example compile and run?
#include <iostream>
int main() {
int input;
std::cin >> input;
int arr[input];
return 0;
}
My understanding is that since the input's value is not known during compile time, it'd have to be a heap allocated array. Isn't the stack space for things like arrays (without allocating on the heap) allocated when the program starts?
My understanding is that since the input's value is not known during compile time, it'd have to be a heap allocated array.
While the C++ language rules do indeed say that you can’t do this, from a technical perspective about how the call stack is typically implemented this actually isn’t necessarily true. Generally, yes, objects that don’t have a size known in advance aren’t put on the stack. However, in the case of a local variable that’s an array of variable length, it’s not that hard for the compiler to put that array in stack space. For example, the C language supports this, and this older question addresses one way of implementing it. Even though C++ doesn’t allow for variable-length arrays the way C does (technically speaking what you have here isn’t legal C++ code), some compilers allow for it.
Isn't the stack space for things like arrays (without allocating on the heap) allocated when the program starts?
This usually isn’t the case. When the program starts up, it’s allocated a region of memory and told “this is where your stack should go,” but that space is typically not determined by anything about how the program is written and is usually controlled by the OS or set by the compiler. Then, whenever space on the stack is needed - say, because a function is called or because a new block scope is entered - the program takes up some space on the call stack, handing it back when the block scope exits or the function returns.
As a result, the program doesn’t need to know at the point where it starts how much space to reserve on the stack for each function. It can defer that until the actual functions are called.
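A small sketch can make this visible. It assumes nothing about the platform beyond standard C++: the hypothetical helpers record_frames and distinct_frames record the address of one local variable per active call, and because each call claims its own stack space when it is entered, every simultaneously live local has a different address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>

// Hypothetical helper: records the address of one local variable per active
// call. Each call takes its own stack space on entry, so all simultaneously
// live copies of `local` must sit at distinct addresses.
void record_frames(int depth, std::set<std::uintptr_t>& seen)
{
    volatile int local = depth;   // volatile keeps the variable in memory
    seen.insert(reinterpret_cast<std::uintptr_t>(&local));
    if (depth > 1)
        record_frames(depth - 1, seen);
    local = 0;                    // still alive after the nested call returns
}

// Returns how many distinct addresses were observed for `depth` nested calls.
std::size_t distinct_frames(int depth)
{
    std::set<std::uintptr_t> seen;
    record_frames(depth, seen);
    return seen.size();
}
```

Nothing here reserves space for eight frames in advance; the space is taken call by call and handed back on return.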
My understanding is that since the input's value is not known during compile time, it'd have to be a heap allocated array.
Your understanding is correct.
Isn't the stack space for things like arrays (without allocating on the heap) allocated when the program starts?
In practice, the memory for execution stack is typically allocated when the program starts. This isn't something specified by the C++ language, but is an implementation detail.
How does this example compile and run?
The program is ill-formed. Compilers aren't required to compile the program and are required to diagnose the problem (if a compiler doesn't diagnose it, then it doesn't conform to the C++ standard). A compiler may still compile the program as a language extension. How that happens isn't specified by the C++ language.

What really happens, and who is responsible, when delete[] X is executed?

I am trying to figure out which components or modules (maybe belonging to the OS?) actually do the work when an application or process is running and executes the command delete[] X.
My question came up after I read about delete[] X and understood that the compiler is responsible (according to its implementation) for knowing how many objects of X to delete. But the compiler is not "active" at run time! I mean, at compile time the compiler does not know how much memory the user needs in a new command, nor does it know at delete, so what actually happens at run time when the program is running?
One of the answers I read mentioned something called the run-time system. What is it? Is it connected to the CPU (because the CPU eventually executes the command), or maybe to the OS?
Another answer I saw said it "is done by the system's allocator" (How does delete[] know how much memory to delete?). Again, where is this component (OS, CPU)?
The compiler is responsible to generate code that deletes it when the need arises. It doesn't need to be running when it happens. The generated code will probably be a function call to a routine that does something along these lines:
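void delete_arr(object *ptr)
{
    // new[] stored the element count just before the first element
    size_t *actual_start = ((size_t *)ptr) - 1;
    size_t count = *actual_start;
    // destroy the elements in reverse order of construction
    for (size_t i = count; i-- > 0; )
        destruct(ptr[i]);
    free(actual_start);
}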
When new[] is called, it actually saves the number of elements next to the allocated memory. When you call delete[], it looks up that count and then destroys that number of elements.
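The observable effect can be checked without relying on where the count is stored. The following is a minimal sketch; the type Tracked and the function destroyed_by_delete_array are hypothetical names introduced for the illustration. It shows that delete[] runs exactly as many destructors as new[] ran constructors, even though no count is passed to it.

```cpp
#include <cassert>

// Hypothetical type that counts constructor and destructor calls.
struct Tracked {
    static int constructed;
    static int destroyed;
    Tracked()  { ++constructed; }
    ~Tracked() { ++destroyed; }
};
int Tracked::constructed = 0;
int Tracked::destroyed = 0;

// Allocates n objects with new[] and releases them with delete[].
// Returns how many destructors delete[] ran.
int destroyed_by_delete_array(int n)
{
    assert(n > 0);
    const int before = Tracked::destroyed;
    Tracked* items = new Tracked[n];   // runs n constructors
    delete[] items;                    // no element count is passed here
    return Tracked::destroyed - before;
}
```

How the implementation remembers n is its own business; storing it next to the allocation, as described above, is one common approach.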
The library that provides these facilities is called the C++ standard library or the C++ runtime environment. The standard doesn't say anything about what constitutes a runtime, so definitions might differ, but the gist is that it's what's needed to support running C++ code.
C++ runtime is (indirectly) using Operating System primitives to change the virtual address space of the process running your program.
Read more about computer architecture, CPU modes, operating systems, OS kernels, system calls, instruction sets, machine code, object code, linkers, relocation, name mangling, compilers, virtual memory.
On my Linux system, new (provided by the C++ standard library) is generally built above malloc(3) (provided by the C standard library) which may call the mmap(2) system call (implemented inside the kernel) which changes the virtual address space (by dealing with the MMU). And delete (from C++ standard library) is generally built above free(3) which may call munmap(2) system call which changes the virtual address space.
Things are much more complex in the details:
new is calling the constructor after having allocated memory with malloc
delete is calling the destructor before releasing memory with free
free usually marks the freed memory zone as reusable by future malloc calls (so it usually doesn't release memory with munmap)
so malloc usually reuses a previously freed memory zone before requesting more address space (using mmap) from the kernel
for array new[] and delete[], the memory zone contains the size of the array, and the constructor (new[]) or destructor (delete[]) is called in a loop
technically, when you code SomeClass*p = new SomeClass(12); the memory is first allocated using ::operator new (which calls malloc), and then the constructor of SomeClass is called with 12 as argument
when you code delete p;, the destructor of SomeClass is called and then the memory is released using ::operator delete (which calls free)
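The last two points can be spelled out by hand. This is only a sketch: SomeClass here is a hypothetical stand-in that logs its constructor and destructor, and manual_new_delete is an illustrative name. It performs the same four steps that new SomeClass(12) followed by delete p performs.

```cpp
#include <cassert>
#include <new>
#include <string>

// Hypothetical class that logs its construction and destruction.
struct SomeClass {
    static std::string log;
    int value;
    explicit SomeClass(int v) : value(v) { log += "ctor;"; }
    ~SomeClass() { log += "dtor;"; }
};
std::string SomeClass::log;

// Spells out by hand what `new SomeClass(12)` followed by `delete p` does.
int manual_new_delete()
{
    void* raw = ::operator new(sizeof(SomeClass)); // 1. allocate raw memory
    SomeClass* p = new (raw) SomeClass(12);        // 2. construct in place
    const int seen = p->value;
    p->~SomeClass();                               // 3. run the destructor
    ::operator delete(raw);                        // 4. release the memory
    return seen;
}
```

In ordinary code you would simply write new and delete (or better, avoid both); the split is shown only to make the two phases visible.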
BTW, a Linux system is made of free software, so I strongly suggest that you install a Linux distribution on your machine and use it. Then you can study the source code of libstdc++ (the standard C++ library, which is part of the GCC compiler source code but linked into your program), of libc (the standard C library), and of the kernel. You could also strace(1) your C++ program and process to understand which system calls it makes.
If using GCC, you can get the generated assembler code by compiling your foo.cc C++ source file with g++ -Wall -O -fverbose-asm -S foo.cc which produces the foo.s assembler file. You can also get some textual view of the intermediate Gimple internal representation inside the compiler with g++ -Wall -O -fdump-tree-gimple -c foo.cc (you'll get some foo.cc.*.gimple and perhaps many other GCC dump files). You could even search something inside the Gimple representation using the GCC MELT tool (I designed and implemented most of it; use g++ -fplugin=melt -fplugin-arg-melt-mode=findgimple).
The standard C++ library has internal invariants and conventions, and the C++ compiler is responsible for following them when emitting assembler code. So the compiler and its standard C++ library are co-designed and written in close cooperation (and some dirty tricks inside your C++ library implementations require compiler support, perhaps through compiler builtins etc.). This is not specific to C++: OCaml folks also co-design and co-implement the OCaml language and its standard library.
The C++ runtime system conceptually has several layers: the C++ standard library libstdc++, the C standard library libc, the operating system (and at the bottom the hardware, including the MMU). All these are implementation details; the C++11 language standard doesn't really mention them.
This might be helpful:
for each call to global ::operator new() it will take the object size passed and add the size of extra data
it will allocate a memory block of size deduced at previous step
it will offset the pointer to the part of the block not occupied with extra data and return that offset value to the caller
::operator delete() will do the same in reverse: shift the pointer, access the extra data, and deallocate the memory.
Usually delete[] is used when you delete an array of objects allocated on the heap. As far as I know, new[] also adds extra data at the start of the allocated memory, in which it stores the size of the array for the delete[] operator. This could also be useful:
In other words, in general case a memory block allocated by new[] has two sets of extra bytes in front of the actual data: the block size in bytes (introduced by malloc) and the element count (introduced by new[]). The second one is optional, as your example demonstrates. The first one is typically always present, as it is unconditionally allocated by malloc. I.e. your malloc call will physically allocate more than 20 bytes even if you request only 20. These extra bytes will be used by malloc to store the block size in bytes.
...
"Extra bytes" requested by new[] from operator new[] are not used to "store the size of allocated memory", as you seem to believe. They are used to store the number of elements in the array, so that the delete[] will know how many destructors to call. In your example destructors are trivial. There's no need to call them. So, there's no need to allocate these extra bytes and store the element count.
It works in two stages. One is the compiler doing things that seem magical and then the heap doing things that also seem magical.
That is, until you realize the trick. Then the magic is gone.
But first, let's recap what happens when you do new X[12];
The code that the compiler writes under the cover conceptually looks like this:
char* data = (char*)malloc(12 * sizeof(X));
for (int i = 0; i != 12; ++i) {
    X::Ctor(data);      // construct one X at this address
    data += sizeof(X);  // step to the next slot
}
Where Ctor(void* this_ptr) is a secret function that sets the this pointer and calls the constructor of X. In this case the default one.
So at destruction, we can undo this, if only we could stash the 12 somewhere easy to find ...
I am guessing you have guessed where already ...
Anyplace, really! For example, it could be stored right before the start of the first object.
The first line becomes these 3 lines:
char* data = (char*)malloc((12 * sizeof(X)) + sizeof(int));
*((int*)data) = 12;   // stash the element count
data += sizeof(int);  // the objects start right after it
The rest stays the same.
When the compiler sees delete [] addr, it knows that it can find the object count sizeof(int) bytes (typically 4) before addr. It also needs to call free((char*)addr - sizeof(int));
This is, by the way, in essence the same trick that malloc and free use. At least in the old days when we had simple allocators.
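Put together, the trick can be written as compilable code. This is a sketch under stated assumptions, not what any particular compiler emits: new_array, delete_array, Counted, kHeader and count_round_trip are hypothetical names, the header is padded to alignof(std::max_align_t) so that the elements stay properly aligned, and a constructor that throws is not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Header size, padded so that the elements after it stay suitably aligned.
constexpr std::size_t kHeader = alignof(std::max_align_t);
static_assert(kHeader >= sizeof(std::size_t), "header too small for the count");

// Hypothetical stand-in for new[]: stores the count in front of the elements.
template <typename T>
T* new_array(std::size_t count)
{
    static_assert(alignof(T) <= alignof(std::max_align_t), "over-aligned T");
    char* block = static_cast<char*>(std::malloc(kHeader + count * sizeof(T)));
    if (!block) throw std::bad_alloc();
    *reinterpret_cast<std::size_t*>(block) = count;   // stash the count
    T* first = reinterpret_cast<T*>(block + kHeader);
    for (std::size_t i = 0; i != count; ++i)
        new (first + i) T();                          // default-construct
    return first;
}

// Hypothetical stand-in for delete[]: recovers the count, destroys, frees.
template <typename T>
void delete_array(T* first)
{
    char* block = reinterpret_cast<char*>(first) - kHeader;
    const std::size_t count = *reinterpret_cast<std::size_t*>(block);
    for (std::size_t i = count; i-- > 0; )
        first[i].~T();                                // reverse order
    std::free(block);
}

// Hypothetical element type that tracks how many instances are alive.
struct Counted {
    static int alive;
    Counted()  { ++alive; }
    ~Counted() { --alive; }
};
int Counted::alive = 0;

// True if n objects were built and all of them were destroyed again.
bool count_round_trip(std::size_t n)
{
    Counted* items = new_array<Counted>(n);
    const bool all_built = (Counted::alive == static_cast<int>(n));
    delete_array(items);
    return all_built && Counted::alive == 0;
}
```

Note that delete_array receives only the pointer; the count comes entirely from the hidden header.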
When you use the new keyword, the program requests a block of memory on the heap to hold the object, and a pointer to that memory is returned. Without new, the compiler puts the object on the stack, and the layout of those objects within the stack frame is worked out at compile time. Any object created using new needs to be deleted when it is no longer needed, so it is important not to lose the original pointer to the heap block, so that you can call delete on it. delete[] destroys and frees all the elements of an array. For example, you use delete[] if you created char* anarray = new char[128], and you use delete if you did string *str = new string(), because the latter allocates a single object while the former allocates an array.
Edit: some classes overload operator delete so that they can control how their dynamic memory is freed, so a class can be made responsible for determining that behavior.
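As a minimal sketch of that last point, a class can declare its own operator new and operator delete, which new and delete expressions for that class then use. The class Pooled and the function hooks_ran_once are hypothetical; here the hooks merely count calls and forward to malloc and free.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical class that supplies its own allocation functions.
struct Pooled {
    static int allocations;
    static int deallocations;
    int payload = 0;

    static void* operator new(std::size_t size)
    {
        ++allocations;
        void* p = std::malloc(size);
        if (!p) throw std::bad_alloc();
        return p;
    }
    static void operator delete(void* p) noexcept
    {
        ++deallocations;
        std::free(p);
    }
};
int Pooled::allocations = 0;
int Pooled::deallocations = 0;

// Creates and destroys one object; true if each hook ran exactly once.
bool hooks_ran_once()
{
    const int a = Pooled::allocations;
    const int d = Pooled::deallocations;
    Pooled* p = new Pooled;   // uses Pooled::operator new
    delete p;                 // uses Pooled::operator delete
    return Pooled::allocations == a + 1 && Pooled::deallocations == d + 1;
}
```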

c++ memory allocated at compile time

I read that while dynamic memory is allocated on the heap during runtime, static memory is allocated on the stack during compile time since the compiler knows how much memory has to be allocated at compile time.
Consider the following code:
int n;
cin>>n;
int a[n];
How does the compiler possibly know how much memory to allocate for a[] at compile time if its actual size is read during the run only?
In standard C++ you won't be able to compile that, for the exact reason you specified: the language needs a compile-time constant there. (Some compilers accept it anyway as an extension.) If you want a size that is read at run time, you have to use dynamic allocation.
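One way to do the dynamic allocation, sketched here under the assumption that C++14 or later is available, is std::make_unique<int[]>(n), which sizes the array at run time and frees it automatically. The helper name sum_first is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <numeric>

// Hypothetical helper: sums 1..n using a heap array sized at run time.
long sum_first(std::size_t n)
{
    std::unique_ptr<int[]> a = std::make_unique<int[]>(n);
    std::iota(a.get(), a.get() + n, 1);              // fill with 1, 2, ..., n
    return std::accumulate(a.get(), a.get() + n, 0L);
}   // the array is released here, when `a` goes out of scope
```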

C/C++ Dynamic or Static memory allocation?

Dynamic memory allocation in C/C++ happens through malloc, while static memory allocation, e.g. int a[3];, is allocated after the code is executed.
But the code int x[y+1]; can only happen after a value is assigned to y, and that happens at execution time, so is it static, dynamic, or both? Does the compiler automatically insert a malloc into the machine code?
It is a Variable Length Array (VLA). Wikipedia: http://en.wikipedia.org/wiki/Variable-length_array
Technically, it is not legal in C++, but compilers often support it as an extension and generate warnings when those are turned on. See
Why aren't variable-length arrays part of the C++ standard?
It is legal in C.
int[] is on the stack, while malloc'd or new'd things are on the heap.
VERY basically int[] gets allocated automatically when it is reached (where y is already known) and gets dropped when it gets out of scope. It is not everything already allocated at startup.
There are no hidden malloc calls or stuff, that is just how the stack memory works.
(I hope for an answer from someone who actually knows C/C++)

error creating a dynamic c++ array

I was trying to make a dynamic array in this form :
int x;
cin>>x;
int ar[x];
My g++ (gcc) compiler on Linux refused to create an array without a fixed size. However, the same code compiled and ran on Windows with Dev-C++, which also let me create and use the dynamic array. I thought it was a compiler bug. However, when I restarted and returned to g++, it compiled and ran the code, although it had refused to do so before I tried the code on Windows. How can that be, and is it dangerous?
C++ requires the size of an automatic storage array to be known at compile time, otherwise the array must be dynamically allocated (unless with compiler extension).
You should use
int *ar = new int[x];
...
delete[] ar; // free the memory after use
or
vector<int> ar(x); // x elements, released automatically
As the other answerers point out, if you don't know the size of the array at compile time then you should allocate it dynamically using new. But (somewhat shamefully) they fail to tell you that you will be responsible for deallocating this memory with delete[].
This responsibility (making sure you always release memory you have allocated) is the biggest source of problems in C++. A technique like RAII can make this easier (put simply : wrap the memory in an object, new in the constructor and delete in the destructor, then the language makes sure the destructor is always called)
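A minimal sketch of such a wrapper follows. The class IntBuffer and the function live_inside_scope are hypothetical names for the illustration; in real code std::vector<int> or std::unique_ptr<int[]> already provide this behavior.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical RAII wrapper around a run-time sized int array.
class IntBuffer {
public:
    static int live;   // number of buffers currently holding memory

    explicit IntBuffer(std::size_t n) : size_(n), data_(new int[n]()) { ++live; }
    ~IntBuffer() { delete[] data_; --live; }

    // Copying would lead to a double delete, so this sketch forbids it.
    IntBuffer(const IntBuffer&) = delete;
    IntBuffer& operator=(const IntBuffer&) = delete;

    int& operator[](std::size_t i) { assert(i < size_); return data_[i]; }
    std::size_t size() const { return size_; }

private:
    std::size_t size_;
    int* data_;
};
int IntBuffer::live = 0;

// Returns the number of live buffers observed inside the scope.
int live_inside_scope(std::size_t n)
{
    IntBuffer buf(n);          // new[] happens in the constructor
    if (n > 0) buf[0] = 42;
    return IntBuffer::live;
}   // ~IntBuffer runs here and calls delete[]
```

The caller never writes delete[]; leaving the scope, whether normally or through an exception, is what releases the memory.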