I'm a beginning C++ programmer. I just learned that gcc has an extension that allows variably sized arrays without having to dynamically allocate memory. I want to know whether such an array is allocated on the stack or the heap.
Conceptually it's allocated with automatic storage duration, so in terms of implementation, you can think of it as being on the stack.
Do consider using std::vector as an alternative though as that's standard and therefore portable C++.
The variable-sized array is allocated on the stack.
VLAs are not supported by the C++ standard, although some compilers such as GCC offer them as an extension.
std::vector is not the same thing as a VLA in the GCC implementation.
std::vector is resizable and allocates memory on the heap.
A VLA is not resizable, is limited by the maximum stack size, and doesn't allocate heap memory.
So there is a flexibility difference, and there can be a performance difference especially if the array creation happens regularly (such as in a tight loop).
That said, some of these differences can sometimes be mitigated, for example by moving the 'array' outside of loops.
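As a rough illustration of moving the 'array' outside of a loop, the sketch below reuses one std::vector across iterations, so a heap allocation is only needed when a size exceeds the capacity already held. The function name sum_of_squares_batches and its parameter are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: for each requested size n, fills a scratch buffer with
// 0*0, 1*1, ..., (n-1)*(n-1) and adds those values to a running total.
long long sum_of_squares_batches(const std::vector<std::size_t>& sizes) {
    std::vector<long long> scratch;  // declared once, outside the loop
    long long total = 0;
    for (std::size_t n : sizes) {
        scratch.resize(n);           // reuses existing capacity when it is large enough
        assert(scratch.size() == n);
        for (std::size_t i = 0; i < n; ++i) {
            scratch[i] = static_cast<long long>(i) * static_cast<long long>(i);
        }
        for (long long v : scratch) {
            total += v;
        }
    }
    return total;
}
```

Whether this is measurably faster than a VLA or a per-iteration vector depends on the sizes involved, so it is worth profiling before relying on it.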
I'm reading about dynamic arrays (specifically at https://www.learncpp.com/cpp-tutorial/dynamically-allocating-arrays/), and it seems to me that dynamic arrays are not actually dynamic, in that the size allocated for them cannot be changed.
If I am understanding correctly, the main use or point of dynamic arrays vs fixed arrays is that dynamic arrays will be allocated on the heap rather than the stack, and therefore can be larger. The terms "dynamic" and "fixed" give me the impression that one can be changed and the other cannot, but it doesn't seem to be the case.
Is this correct, or am I misunderstanding something about dynamic vs fixed arrays?
Dynamic arrays are dynamic i.e. they have dynamic lifetime / dynamic storage (i.e. they are stored in free store aka "heap").
Dynamic arrays are also dynamic in the sense that unlike array variables, their size can be determined at runtime i.e. it doesn't need to be compile time constant. Example:
int size;
std::cin >> size;
auto ptr = std::make_unique<int[]>(size); // dynamic
int arr[size]; // ill-formed
You're correct in that the size of a (dynamic) array cannot change during its lifetime. Thus, a dynamic array in C++ isn't the abstract data structure of the same name, also known as a "growable array", "resizable array", "dynamic table", "mutable array", or "array list". The C++ standard library has an implementation of that data structure under the name std::vector.
"Dynamic" refers not to size, but to allocation method. In most C++ implementations there are several methods of storing data, but the two relevant here are local, as in on the stack, and dynamic, as in on the heap.
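For contrast with a fixed-size dynamic array, here is a minimal sketch of std::vector acting as a growable array. The helper name collect_first is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: collects the values 0, 1, ..., count-1 without sizing
// the container up front, letting std::vector grow as elements are added.
std::vector<int> collect_first(std::size_t count) {
    std::vector<int> values;                    // starts empty
    for (std::size_t i = 0; i < count; ++i) {
        values.push_back(static_cast<int>(i));  // grows the storage on demand
    }
    assert(values.size() == count);
    return values;
}
```

The returned vector can keep growing after the call, for example with another push_back, which is exactly what a plain new[] array cannot do.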
Dynamic tends to mean "allocated with new" or a similar mechanism, while stack allocations happen implicitly.
X* x = new X[20]; // "Dynamic" array
X x[20]; // Local array
Neither of these can be easily resized.
A dynamic array can be reallocated at a new size and its contents copied over, while a stack array cannot: its size is fixed. This makes dynamic arrays more flexible in that regard, but it doesn't come for free. You need to do a lot of work to ensure that the newly allocated array is consistent and that all pointers to it have been updated.
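The work involved in "resizing" a new[] array by hand can be sketched roughly as follows. The function resize_int_array is a hypothetical helper written for this illustration, and it only handles int.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper: allocates a new array of new_size ints, copies over the
// first min(old_size, new_size) elements, frees the old array and returns the
// new one. Any other pointer to the old array is dangling after this call.
int* resize_int_array(int* old_data, std::size_t old_size, std::size_t new_size) {
    assert(old_data != nullptr || old_size == 0);
    int* new_data = new int[new_size]();  // value-initialised, so extra slots are 0
    std::copy(old_data, old_data + std::min(old_size, new_size), new_data);
    delete[] old_data;                    // the caller must not use old_data again
    return new_data;
}
```

The caller still has to delete[] the returned pointer and update every copy of the old pointer, which is the bookkeeping std::vector does for you.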
Additionally, the lifetime is different: dynamically allocated structures have an indefinite lifetime that you as the programmer control, while local variables have a scope-defined lifetime that is non-negotiable. When you exit the scope, they're gone.
Dynamic allocations come with considerable overhead on the part of the programmer: you're responsible for managing that lifecycle or delegating it to a wrapper that can do it for you, like a smart pointer.
In modern C++, using actual low-level arrays is difficult and error-prone, which is why std::vector is typically used instead. It provides far more guarantees and helps manage lifecycles for you if used effectively. Its implementation manages resizing and object deletion for you.
I have seen two ways to declare a dynamic array in C++. One uses the new operator:
int *arr = new int [size];
and other is directly declaring:
int arr[size];
NOTE: size is a variable whose value will be provided by the user at runtime.
The question is: what is the best approach to declaring a dynamic array in C++?
Assuming your question is, "what is better?".
The second, the direct declaration of the array, creates the array on the stack; the first creates it on the heap. The second is called a variable-length array (VLA), which is nonstandard and not portable in C++, although it is standard in C99 (and optional since C11). The GNU C++ compilers support it as an extension, but some others, such as MSVC, do not. Internally the array is allocated much as with alloca/__builtin_alloca (GNU), which extends the stack frame. Note that alloca is itself nonstandard and is not part of POSIX. A VLA can smash your stack if the size is large (this may produce a SIGSEGV, but may also corrupt other data), while the new operator throws a catchable exception when allocation fails. (However, recursive functions can smash your stack the same way.) Using VLAs is not bad practice when you know the size is relatively small. VLAs can even improve performance when the array needs to be allocated many times, because allocating a VLA is faster than allocating on the heap. Because the VLA lives on the stack, it doesn't need to be freed or deleted; it is automatically released when its enclosing block exits.
This applies to the GNU compilers: VLAs do call the destructors of their elements when they are destroyed, but memory allocated with alloca/__builtin_alloca is simply released at the end of the function without running any destructors, just as memory allocated with malloc is released by free.
In conclusion, I think allocation with new is better for most problems, but a VLA is good for fast memory allocation local to a function. There is no portable way to return a VLA from a function (without hacking through assembly). (You can return arrays of constant size from a function, but the size needs to be specified in the signature.) For this there are std::array and std::vector, which I recommend over hand-made memory management (allocation with new and delete, or C's malloc and free), where the memory is not freed when an exception is raised. If you need such functions, memory management should always be wrapped in the constructor and destructor of a class. The destructor is always called when the object goes out of scope, so there are no memory leaks.
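The advice about wrapping memory management in a constructor and destructor could look roughly like the sketch below. IntBuffer and survives_exception are hypothetical names, and the class is deliberately minimal (int only, no copying).

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical minimal owner of a heap array of ints: the constructor
// allocates, the destructor frees.
class IntBuffer {
public:
    explicit IntBuffer(std::size_t n) : data_(new int[n]()), size_(n) {}
    ~IntBuffer() { delete[] data_; }

    // Copying is disabled so that two objects never free the same array.
    IntBuffer(const IntBuffer&) = delete;
    IntBuffer& operator=(const IntBuffer&) = delete;

    int& operator[](std::size_t i) {
        assert(i < size_);
        return data_[i];
    }
    std::size_t size() const { return size_; }

private:
    int* data_;
    std::size_t size_;
};

// Throws while an IntBuffer is alive. Stack unwinding runs the destructor
// before the catch block is entered, so the array is not leaked.
bool survives_exception(std::size_t n) {
    bool caught = false;
    try {
        IntBuffer buf(n);
        throw std::runtime_error("simulated failure");
    } catch (const std::runtime_error&) {
        caught = true;
    }
    return caught;
}
```

In real code std::vector<int> or std::unique_ptr<int[]> already provide this behaviour, so a hand-written class like this is mainly useful for seeing how the mechanism works.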
One thing you cannot do with VLAs or new/delete is fast resizing; even std::vector does not do it. Fast resizing is done with the C function realloc, which tries to keep the buffer in place. Note that realloc moves raw bytes without running constructors, so it is only safe for trivially copyable element types. When you need this, you can design a std::vector-like class that calls free in its destructor. To destroy an element you call element.~T(), where T is the type of the element.
However, std::vector tries to improve the performance of resizing by allocating a buffer with additional space.
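The effect of that additional space can be observed through capacity(). The sketch below counts how often the capacity changes while appending; count_reallocations is a hypothetical helper, and the exact growth pattern is implementation-specific.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: appends n ints and counts how many times capacity()
// changed, which corresponds to the vector acquiring a new, larger buffer.
std::size_t count_reallocations(std::size_t n, bool reserve_first) {
    std::vector<int> v;
    if (reserve_first) {
        v.reserve(n);  // request room for n elements up front
    }
    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last_capacity) {
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    assert(v.size() == n);
    return reallocations;
}
```

With reserve_first set to true, no reallocation happens during the loop. Without it, typical implementations reallocate only a handful of times for a thousand elements, far fewer than once per push_back.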
The main difference between the two methods is that the first allocates memory from the free store (heap), while the second allocates from the stack. The second is risky for large sizes because stack memory is very limited compared to the heap. Also, the first statement yields a pointer to the first element of the allocated memory, while the second declares the array itself.
I am writing a container that uses alloca internally to allocate data on the stack. Risks of using alloca aside, assume that I must use it for the domain I am in (it's partly a learning exercise around alloca and partly to investigate possible implementations of dynamically-sized stack-allocated containers).
According to the man page for alloca (emphasis mine) :
The alloca() function allocates size bytes of space in the stack frame of the caller. This temporary space is automatically freed when the function that called alloca() returns to its caller.
Using implementation-specific features, I have managed to force inlining in such a way that the caller's stack is used for this function-level "scoping".
However, that means that the following code will allocate a huge amount of memory on the stack (compiler optimisations aside):
for(auto iteration : range(0, 10000)) {
// the ctor parameter is the number of
// instances of T to allocate on the stack,
// it's not normally known at compile-time
my_container<T> instance(32);
}
Without knowing the implementation details of this container, one might expect any memory it allocates to be freed when instance goes out of scope. This is not the case and can result in a stack overflow / high memory usage for the duration of the enclosing function.
One approach that came to mind was to explicitly free the memory in the destructor. Short of reverse engineering the resulting assembly, I haven't found a way of doing that yet.
The only other approach I have thought of is to have a maximum size specified at compile-time, use that to allocate a fixed-size buffer, have the real size specified at runtime and use the fixed-size buffer internally. The issue with this is that it's potentially very wasteful (suppose your maximum were 256 bytes per container, but you only needed 32 most of the time).
Hence this question: I want to find a way to provide these scope semantics to the users of this container. Non-portable is fine, so long as it's reliable on the platform it's targeting (for example, some documented compiler extension that only works for x86_64 is fine).
I appreciate this could be an XY problem, so let me restate my goals clearly:
I am writing a container that must always allocate its memory on the stack (to the best of my knowledge, this rules out C VLAs).
The size of the container is not known at compile-time.
I would like to maintain the semantics of the memory as if it were held by an std::unique_ptr inside of the container.
Whilst the container must have a C++ API, using compiler extensions from C is fine.
The code need only work on x86_64 for now.
The target operating system can be Linux-based or Windows, it doesn't need to work on both.
I am writing a container that must always allocate its memory on the stack (to the best of my knowledge, this rules out C VLAs).
The normal implementation of C VLAs in most compilers is on the stack. Of course ISO C++ doesn't say anything about how automatic storage is implemented under the hood, but it's (nearly?) universal for C implementations on normal machines (that do have a call+data stack) to use that for all automatic storage including VLAs.
If your VLA is too large, you get a stack overflow rather than a fallback to malloc / free.
Neither C nor C++ specify alloca; it's only available on implementations that have a stack like "normal" machines, i.e. the same machines where you can expect VLAs to do what you want.
All of these conditions hold for all the major compilers on x86-64 (except that MSVC doesn't support VLAs).
If you have a C++ compiler that supports C99 VLAs (like GNU C++), smart compilers may reuse the same stack memory for a VLA with loop scope.
have a maximum size specified at compile-time, use that to allocate a fixed-size buffer ... wasteful
For a special case like you mention, you could maybe have a fixed-size buffer as part of the object (size as a template param), and use that if it's big enough. If not, dynamically allocate. Maybe use a pointer member to point to either the internal or external buffer, and a flag to remember whether to delete it or not in the destructor. (You need to avoid delete on an array that's part of the object, of course.)
#include <cstddef>

template <typename T, std::size_t internalsize>
class my_container {
    // optionally static_assert(!(internalsize & (internalsize - 1)), "internalsize not a power of 2");
    // if you do anything that's easier with a power of 2 size
    T *data;                       // points at internaldata or at a dynamic allocation
    T internaldata[internalsize];  // in-object buffer used for small sizes
    unsigned used_size;
    int allocated_size;  // intended for small containers: use int instead of size_t
    // bool needs_delete;  // negative allocated size means internal
};
The allocated_size only needs to be checked when it grows, so I made it signed int so we can overload it instead of needing an extra boolean member.
Normally a container uses 3 pointers instead of pointer + 2 integers, but if you don't grow/shrink often then we save space (on x86-64 where int is 32 bits and pointers are 64-bit), and allow this overloading.
A container that grows large enough to need dynamic allocation should keep using that dynamic space even if it later shrinks, so that it's cheaper to grow again and to avoid copying back into the internal storage. The exception is when the caller uses a function to release unused excess storage; in that case, copy back.
A move constructor should probably keep allocation as-is, but a copy constructor should copy into the internal buffer if possible instead of allocating new dynamic storage.
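A much simplified sketch of the internal-buffer-with-fallback idea is shown below. It is not the design described above in full: the size is fixed at construction, there is no growing or shrinking, copying and moving are disabled, and the names small_buffer and uses_internal_storage are invented for this example. It also assumes T is default-constructible.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical container: uses the in-object buffer when the requested count
// fits, otherwise falls back to a heap allocation owned by a unique_ptr.
// Simplification: all InternalSize elements of the internal buffer are
// constructed even when they are not used.
template <typename T, std::size_t InternalSize>
class small_buffer {
    static_assert(InternalSize > 0, "internal buffer must not be empty");

public:
    explicit small_buffer(std::size_t count)
        : heap_(count > InternalSize ? new T[count]() : nullptr),
          data_(heap_ ? heap_.get() : internal_),
          size_(count) {}

    // data_ may point into this object, so copying and moving are disabled.
    small_buffer(const small_buffer&) = delete;
    small_buffer& operator=(const small_buffer&) = delete;

    T& operator[](std::size_t i) {
        assert(i < size_);
        return data_[i];
    }
    std::size_t size() const { return size_; }
    bool uses_internal_storage() const { return data_ == internal_; }

private:
    T internal_[InternalSize] = {};  // lives inside the object itself
    std::unique_ptr<T[]> heap_;      // non-null only when count > InternalSize
    T* data_;                        // points at internal_ or at the heap array
    std::size_t size_;
};
```

When such an object is a local variable, the internal buffer sits on the stack and is released at the end of the enclosing scope, which gives the scope semantics asked for in the common small case without alloca.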
int m,n;
cin>>m>>n;
int A[m][n];
Question is: Will array A get memory on stack or heap in C++ ?
Edit: I know using new is a better route.
This technique works in my mingw g++ compiler. I am just curious.
This behaviour depends on the particular compiler and is not part of the standard.
In gcc, of which mingw is a port, the memory for automatic variables, including variable-length arrays, is allocated on the stack.
According to the gcc manual:
6.19 Arrays of Variable Length
[...] These arrays are declared like any other automatic arrays, but with a
length that is not a constant expression. The storage is allocated at
the point of declaration and deallocated when the block scope
containing the declaration exits. [...] You can use the function alloca to get an effect much like variable-length arrays.
Ref: https://gcc.gnu.org/onlinedocs/gcc/Variable-Length.html
According to man 3 alloca:
The space allocated by alloca() is allocated within the stack frame
Please keep in mind that:
ISO C++ forbids variable length arrays
Alternatively you can allocate your array dynamically (with new) or preferably use the C++ containers anyway where possible.
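As one possible container-based replacement for int A[m][n], the sketch below stores an m-by-n table row by row in a single std::vector. The helper names make_table and at are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: builds an m-by-n table stored row by row in one
// vector, where element (i, j) holds the value i * n + j.
std::vector<int> make_table(std::size_t m, std::size_t n) {
    std::vector<int> a(m * n);  // sizes chosen at runtime, elements on the heap
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            a[i * n + j] = static_cast<int>(i * n + j);
        }
    }
    return a;
}

// Hypothetical accessor: reads element (i, j) of a table with n columns.
int at(const std::vector<int>& a, std::size_t n, std::size_t i, std::size_t j) {
    assert(j < n && i * n + j < a.size());
    return a[i * n + j];
}
```

A std::vector<std::vector<int>> also works and allows the A[i][j] syntax, at the cost of one allocation per row.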
Edit: Added note on variable behaviour between compilers, based on Paul's comment.
In which cases should I use the keyword new to allocate an array when the size is a variable? I am reading this code: https://github.com/Hawstein/cracking-the-coding-interview/blob/master/1.7.cpp
In the function zero(), why don't the row[m] and col[n] declarations cause errors? m and n are function variables.
Thanks
VLAs (variable-length arrays) are a nonstandard extension of the C++ language, so your code can only be compiled with a compiler that provides that extension.
You should use dynamically allocated memory whenever you don't know the size of the array in advance and you don't want to, or can't, waste precious stack memory on a temporary. Even better, if it works for you, use a std::vector (a vector uses heap memory for its elements anyway).
Edit: Another important suggestion is to take a look at smart pointers, which can often offer additional advantages over raw pointers.
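A portable way to declare runtime-sized flag arrays such as row and col is sketched below. This is not the code from the linked repository: the function name zero_rows_and_columns, the Matrix alias and the exact behaviour (clearing every row and column that contains a zero) are assumptions made for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<int>>;  // hypothetical representation

// Hypothetical function: sets every element to 0 whose row or column
// contained a 0 in the original matrix.
void zero_rows_and_columns(Matrix& a) {
    const std::size_t m = a.size();
    const std::size_t n = m ? a[0].size() : 0;
    std::vector<bool> row(m, false);  // replaces "bool row[m]"
    std::vector<bool> col(n, false);  // replaces "bool col[n]"
    for (std::size_t i = 0; i < m; ++i) {
        assert(a[i].size() == n);     // all rows must have the same length
        for (std::size_t j = 0; j < n; ++j) {
            if (a[i][j] == 0) {
                row[i] = true;
                col[j] = true;
            }
        }
    }
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            if (row[i] || col[j]) {
                a[i][j] = 0;
            }
        }
    }
}
```

Because the flags are std::vector objects, this compiles with any standard C++11 compiler, and the sizes m and n can be arbitrary runtime values.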
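Never.
Some C++ compilers, such as GCC, accept variables as array sizes as an extension. They just use the values currently in m and n. With such a compiler there is no need to use new, but the code is not standard C++ and will not compile everywhere.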