I'm reading about dynamic arrays (specifically at https://www.learncpp.com/cpp-tutorial/dynamically-allocating-arrays/), and it seems to me that dynamic arrays are not actually dynamic, in that the size allocated for them cannot be changed.
If I am understanding correctly, the main use or point of dynamic arrays vs fixed arrays is that dynamic arrays will be allocated on the heap rather than the stack, and therefore can be larger. The terms "dynamic" and "fixed" give me the impression that one can be changed and the other cannot, but it doesn't seem to be the case.
Is this correct, or am I misunderstanding something about dynamic vs fixed arrays?
Dynamic arrays are dynamic in that they have dynamic lifetime and dynamic storage, i.e. they are stored in the free store (aka the "heap").
Dynamic arrays are also dynamic in the sense that unlike array variables, their size can be determined at runtime i.e. it doesn't need to be compile time constant. Example:
int size;
std::cin >> size;
auto ptr = std::make_unique<int[]>(size); // dynamic
int arr[size]; // ill-formed: size is not a compile-time constant
You're correct that the size of a (dynamic) array cannot change during its lifetime. Thus, a dynamic array in C++ isn't the abstract data structure of the same name, also known as a "growable array", "resizable array", "dynamic table", "mutable array", or "array list". The C++ standard library has an implementation of that data structure named std::vector.
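To make the contrast concrete, here is a minimal sketch of the "growable array" behaviour using std::vector. The helper name fill_vector is made up for this example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: grows a std::vector one element at a time.
// Unlike an array created with new[], the vector's size changes at runtime.
std::vector<int> fill_vector(std::size_t count) {
    std::vector<int> values;                     // starts empty: size() == 0
    for (std::size_t i = 0; i < count; ++i) {
        values.push_back(static_cast<int>(i));   // may reallocate internally
    }
    return values;
}
```

Calling fill_vector(5) gives a vector holding 0 through 4, and it can still be grown or shrunk afterwards with push_back or resize.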
"Dynamic" refers not to size, but to allocation method. In most C++ implementations there's several methods of storing data, but the two relevant here are local, as in on the stack, and dynamic, as in on the heap.
Dynamic tends to mean "allocated with new" or a similar mechanism, while stack allocations happen implicitly.
X* x = new X[20]; // "Dynamic" array
X x[20]; // Local array
Neither of these can be resized in place.
A dynamic array can be replaced by a newly allocated array of a different size, with the contents copied over, while a stack array cannot: its size is fixed. This makes dynamic arrays more flexible in that regard, but it doesn't come for free. You need to do a lot of work to ensure that the newly allocated array is consistent and that all pointers to the old one have been updated.
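As a rough sketch of what that manual work looks like for a plain int array (the helper name resize_array is an assumption of this example, not a standard function):

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper: "resizes" a new[]-allocated int array by allocating
// a second block, copying the elements that fit, and freeing the old block.
// The old pointer is invalid after the call; the caller must use the result.
int* resize_array(int* old_data, std::size_t old_size, std::size_t new_size) {
    int* new_data = new int[new_size]();   // new elements start at zero
    std::copy(old_data, old_data + std::min(old_size, new_size), new_data);
    delete[] old_data;
    return new_data;
}
```

Every other pointer into the old block dangles after this call, which is exactly the bookkeeping problem described above.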
Additionally, the lifetime is different. Dynamically allocated structures have an indefinite lifetime that you, as the programmer, control, while local variables have a scope-defined lifetime that is non-negotiable. When you exit the scope, they're gone.
Dynamic allocations come with considerable overhead on the part of the programmer: you're responsible for managing that lifecycle or delegating it to a wrapper that can do it for you, like a smart pointer.
In modern C++, using actual low-level arrays is difficult and error-prone, which is why std::vector is typically used instead. It provides far more guarantees and, used effectively, manages lifecycles for you: its implementation handles resizing and object destruction.
For example if an integer array is declared:
int ar[12];
And here a vector of integers:
vector<int> ar; //OR
vector<int> ar(12);
In either case, is memory allocated to the array at compile time or runtime? I know that vector class in C++ STL uses dynamic memory allocation but what about the ordinary array? Also:
int n;
cin >> n;
char ar[n];
If memory allocation is at compile time then how does this work? I can't find anything scavenging the net.
"Normal" arrays will have a size known at compile-time, which means the compiler can (and will) make sure that there's space for them. That space might not be allocated inside the executable program but allocated at run-time (like e.g. a local variable inside a function).
The size of a vector is unknown at compile-time, and it's the vector's constructor that will allocate memory (if asked to, as in the case with vector<int> ar(12);). The memory for a vector's elements will always be allocated dynamically from the heap.
Then there's also std::array which is a C++ standard container around a compile-time array. When it comes to size and allocations it acts like a "normal" array, but since it's also a standard container object it can be used with functions and algorithms designed for those.
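For illustration, a small sketch of std::array being passed to standard algorithms. The function name sort_and_sum and the size 4 are arbitrary choices for this example.

```cpp
#include <algorithm>
#include <array>
#include <numeric>

// Hypothetical helper: sorts a fixed-size std::array and returns the sum.
// The size (4) is part of the type, just as with `int ar[4]`.
int sort_and_sum(std::array<int, 4>& values) {
    std::sort(values.begin(), values.end());
    return std::accumulate(values.begin(), values.end(), 0);
}
```

No heap allocation is involved here; the four ints live wherever the std::array object itself lives.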
And to confuse matters even more, something being "static" has a special meaning in C++, so saying that an array is "statically" allocated could mean different things depending on one's viewpoint. However, "statically allocated" seems to be commonly used for things like arrays, whose memory is allocated and handled by the compiler and its generated code.
I am currently implementing my own vector container and I encountered a pretty interesting issue (at least for me). It may be a stupid question, but idk.
My vector uses a heap array of pointers to heap-allocated objects of unknown type (T**).
I did this because I wanted the pointers and references to individual elements to stay same, even after resizing.
This comes at performance cost when constructing and copying, because I need to create the array on the heap and each object of the array on the heap too. (Heap allocation is slower than on the stack, right?)
T** arr = new T*[size]{}; // every slot starts as nullptr
and then for each element
arr[i] = new T{data};
Now I wonder if it would be safe, beneficial (faster) and possible if, instead of allocating each object individually, I could create a second array on the heap and save the pointer of each object in the first one. Then use (and delete) these objects later as if they were allocated separately.
=> Is allocating arrays on the heap faster than allocating each object individually?
=> Is it safe to allocate objects in an array and forgetting about the array later? (sounds pretty dumb i think)
Link to my github repo: https://github.com/LinuxGameGeek/personal/tree/main/c%2B%2B/vector
Thanks for your help :)
First a remark, you should not think of the comparison heap/stack in terms of efficiency, but on object lifetime:
automatic arrays (what you call on stack) end their life at the end of the block where they are defined
dynamic arrays (what you call on heap) exist until they are explicitly deleted
Now, it is generally more efficient to allocate a bunch of objects in one array than to allocate them separately: you save a number of internal calls and some of the bookkeeping data needed to maintain the heap. The catch is that you can only deallocate the array as a whole, not the individual objects.
Finally, except for trivially copyable objects, only the compiler and not the programmer knows about the exact allocation detail. For example (and for common implementations) an automatic string (so on stack) contains a pointer to a dynamic char array (so on heap)...
Said differently, unless you plan to only use your container for POD or trivially copyable objects, do not expect to handle all the allocation and deallocation yourself: non-trivial objects have internal allocations.
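A minimal sketch of the "one array plus a table of pointers" idea from the question, under the restriction stated above that the block is only ever released as a whole. The class name PointerTable is invented for this example, and it assumes T is default-constructible.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical type: one new[] call for all objects, plus a table of
// pointers into that block. Elements are never deleted individually;
// the whole block is released at once when the PointerTable is destroyed.
template <typename T>
class PointerTable {
public:
    explicit PointerTable(std::size_t count)
        : objects_(new T[count]()), table_(new T*[count]), count_(count) {
        for (std::size_t i = 0; i < count_; ++i) {
            table_[i] = &objects_[i];   // pointer to element i of the block
        }
    }

    T& operator[](std::size_t i) { return *table_[i]; }
    std::size_t size() const { return count_; }

private:
    std::unique_ptr<T[]> objects_;   // the single allocation holding every T
    std::unique_ptr<T*[]> table_;    // pointers into objects_
    std::size_t count_;
};
```

This is two allocations instead of count + 1, but it does not support erasing or growing; calling delete on one of the stored pointers would be an error.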
Heap allocation is slower than on the stack, right?
Yes. Dynamic allocation has a cost.
Is allocating arrays on the heap faster than allocating each object individually?
Yes. Multiple allocations have that cost multiplied.
I wonder if it would be ... possible, if instead of allocating each object individually, I could create a second array on the heap and save the pointer of each object in the first one
It would be possible, but not trivial. Think hard how you would implement element erasure. And then think about how you would implement other features such as random access correctly into the container with arrays that contain indices from which elements have been erased.
... safe
It can be implemented safely.
... beneficial(faster)
Of course, reducing allocations from N to 1 would be beneficial by itself. But it comes at the cost of some scheme to implement the erasure. Whether this cost is greater than the benefit of reduced allocations depends on many things such as how the container is used.
Is it safe to allocate objects in an array and forgetting about the array later?
"Forgetting" about an allocation seems like a way to say "memory leak".
You could achieve similar advantages with a custom "pool" allocator. Implementing support for custom allocators to your container might be more generally useful.
P.S. Boost already has a "ptr_vector" container that supports custom allocators. No need to reinvent the wheel.
I did this because I wanted the pointers and references to individual
elements to stay same, even after resizing.
You should just use std::vector::reserve to prevent reallocation of vector data when it is resized.
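A small sketch of that guarantee: as long as the size stays within the reserved capacity, push_back does not reallocate, so element addresses do not change. The function name first_element_stays_put is made up for this example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical check: returns true if the address of the first element is
// unchanged after appending `extra` elements to a vector that reserved
// room for them in advance.
bool first_element_stays_put(std::size_t extra) {
    std::vector<int> values;
    values.reserve(extra + 1);                   // capacity() >= extra + 1
    values.push_back(42);
    const int* before = &values[0];
    for (std::size_t i = 0; i < extra; ++i) {
        values.push_back(static_cast<int>(i));   // stays within capacity
    }
    return before == &values[0];
}
```

Note that this only holds up to the reserved capacity, and that insert or erase in the middle still invalidates pointers to the elements that get shifted.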
Vector is quite primitive, but it is highly optimized. It will be extremely hard for you to beat it with your own code. Just inspect its API and try all of its functionality. Creating something better requires advanced knowledge of template programming (which apparently you do not have yet).
What you are trying to come up with is a use of placement new allocation for a deque-like container. It's a viable optimization, but usually it's done to reduce allocation calls and memory fragmentation, e.g. on some RT or embedded systems. The array may even be a static array in that case. But if you also require that instances of T occupy adjacent space, that's a contradictory requirement: re-sorting them would kill any performance gains.
... beneficial(faster)
Depends on T. E.g. there is no point to do that to something like strings or shared pointers. Or anything that actually allocates resources elsewhere, unless T allows to change that behaviour too.
I wonder if it would be ... possible, if instead of allocating each
object individually, I could create a second array on the heap and
save the pointer of each object in the first one
Yes it is possible, even with standard ISO containers, thanks to allocators.
There is a concern of thread safety or awareness if this "array" is a resource shared between multiple writer and reader threads. You might want to use thread-local storage instead of a shared one and implement semaphores for crossover cases.
The usual application for this is to allocate not on the heap but in a statically allocated array of predetermined size, or in an array that was allocated once at the start of the program.
Note that if you use placement new, you should not use delete on the created objects; you have to call the destructor directly. The placement new overload is not a true new as far as delete is concerned. Calling delete anyway is undefined behavior: it may or may not produce an immediate error, but it will most likely crash if you used a static array, and it will corrupt the heap when you delete an element that has the same address as the beginning of the dynamically allocated array.
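A minimal sketch of the placement new pattern with a statically allocated buffer. The type Tracked and the two helper functions are invented for this example; the live-instance counter is only there to make construction and destruction observable.

```cpp
#include <new>

// Hypothetical type that counts how many instances are currently alive.
struct Tracked {
    static int alive;
    int value;
    explicit Tracked(int v) : value(v) { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Static storage, suitably aligned for one Tracked.
// It did not come from new, so it must never be passed to delete.
alignas(Tracked) unsigned char storage[sizeof(Tracked)];

// Constructs a Tracked inside `storage` with placement new.
Tracked* create_in_storage(int v) {
    return new (storage) Tracked(v);
}

// Ends the object's lifetime by calling the destructor directly.
void destroy_in_storage(Tracked* p) {
    p->~Tracked();
}
```

After destroy_in_storage the buffer can be reused for another object; no memory is ever returned to the heap because none was taken from it.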
This comes at performance cost when constructing and copying, because I need to create the array on the heap and each object of the array on the heap too.
Copying a POD is extremely cheap. If you research perfect forwarding you can achieve the zero cost abstraction for constructors and the emplace_back() function. When copying, use std::copy() as it is very fast.
Is allocating arrays on the heap faster than allocating each object individually?
Each allocation is a separate request to the memory allocator, which may in turn have to ask the operating system for memory. Unless you are asking for a particularly large amount of memory, you can assume each request takes a roughly constant amount of time. Instead of asking for a parking space 10 times, ask for 10 parking spaces.
Is it safe to allocate objects in an array and forgetting about the array later? (sounds pretty dumb i think)
Depends what you mean by safe. If you can't answer this question on your own, then you must clean up the memory and not leak under any circumstance.
An example of a time you might skip cleaning up memory is when you know the program is about to end and cleaning up memory just to exit is kinda pointless. Still, you should clean it up. Read Serge Ballesta's answer for more information about lifetime.
Just came across this. I can't believe it compiles, but it does. What kind of string initialization is this? And why do this?
std::string* name = new std::string[12];
This is the dynamic C-style array syntax, which was common before std::vector made all but a small fraction of its uses obsolete; since C++11 even those remaining uses have largely vanished.
This code dynamically creates and initializes 12 empty strings and sets the name pointer to point to the very first of them. Now those strings can be accessed with the [] operator, for example:
std::cout << name[0] << "\n";
This outputs an empty string.
There should never be any reason to use this construct, though, and instead
std::vector<std::string> name(12);
should be used.
What ... is this?
That is a new-expression. It allocates an object in the free store. More specifically, this expression allocates an array of 12 std::string objects.
What kind of ... initialization is this?
The strings of the array are default-initialized.
And why do this?
The scope of this question is unclear...
Why use an array?
Because arrays are among the most efficient data structures. They incur no space overhead and (depending on the situation) interact well with processor caching.
Why allocate a dynamic array (from the free store)?
Because the size of an automatic array must be known at compile time. The size of a dynamic array does not need to be known until runtime. Of course, your example uses a compile time constant size for the array, so dynamic allocation is not necessary for that reason.
Also because the memory for automatic variables is limited (one to a few megabytes on typical desktop systems). As such, large objects such as arrays that contain many objects must be allocated from the free store. An array of 12 strings is not significantly large in relation to the size of memory that is usually available for automatic objects.
Also because dynamic objects are not automatically destroyed at the end of current scope, so their lifetime is more flexible than automatic or static objects. Of course, this is as much a reason to not use dynamic objects: They are not destroyed automatically, and managing their lifetime is difficult and proving the correctness of a program that uses dynamic memory can be very difficult.
Why use a new expression to allocate an array
There's typically no reason to do so. The standard library provides a RAII container that handles the lifetime of the dynamically allocated array: std::vector.
This code is allocating an array of 12 std::string objects and storing the pointer to the first element of the array in the name variable.
std::string* name = new std::string[12];
The new expression allocates an array of 12 std::string objects with dynamic storage duration. Each std::string object in the array is initialized via its default constructor.
The new expression attempts to allocate storage and then attempts to construct and initialize either a single unnamed object, or an unnamed array of objects in the allocated storage. The new-expression returns a prvalue pointer to the constructed object or, if an array of objects was constructed, a pointer to the initial element of the array.
The pointer to the initial element of the array is then stored in name so that you can access the elements of the array using the [] subscript operator.
One of the C++ features that sets it apart from other languages is the ability to allocate complex objects as member variables or local variables instead of always having to allocate them with new. But this then leads to the question of which to choose in any given situation.
Is there some good set of criteria for choosing how to allocate variables? When should I declare a member variable as a straight variable instead of as a reference or a pointer? When should I allocate a variable with new rather than use a local variable that's allocated on the stack?
One of the C++ features that sets it apart from other languages
... is that you have to do memory allocation manually. But let's leave that aside:
allocate on the heap when an object has to be long-lived, i.e. must outlive a certain scope, and is expensive or impossible to copy or move,
allocate on the heap when an object is large (where large might mean several kilobytes if you want to be on the safe side) to prevent stack overflows, even if the object is only needed temporarily,
allocate on the heap if you're using the pimpl (compiler firewall) idiom,
allocate variable-sized arrays on the heap,
allocate on the stack otherwise because it's so much more convenient.
Note that in the second rule, by "large object" I mean something like
char buffer[1024 * 1024]; // 1MB buffer
but not
std::vector<char> buffer(1024 * 1024);
since the second is actually a very small object wrapping a pointer to a heap-allocated buffer.
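To see the difference, one can compare the size of the objects themselves. This is a sketch; the names RawBuffer and vector_handle_size are made up here, and the exact size of a std::vector object is implementation-defined.

```cpp
#include <cstddef>
#include <vector>

// A 1 MB array is itself a 1 MB object wherever it is placed.
struct RawBuffer {
    char data[1024 * 1024];
};

// Size in bytes of the vector object itself, not of its heap buffer.
std::size_t vector_handle_size() {
    std::vector<char> buffer(1024 * 1024);
    return sizeof(buffer);   // typically a few pointers' worth
}
```

sizeof(RawBuffer) is at least 1048576, whereas vector_handle_size() is commonly 24 on 64-bit implementations, so only the former is a stack overflow risk as a local variable.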
As for pointer vs. value members:
use a pointer if you need heap allocation,
use a pointer if you're sharing structure,
use a pointer or reference for polymorphism,
use a reference if you get an object from client code and the client promises to keep it alive,
use a value in most other cases.
The use of smart pointers is of course recommended where appropriate. Note that you can use a reference in case of heap allocation because you can always delete &ref, but I wouldn't recommend doing that. References are pointers in disguise with few differences (a reference can't be null and can't be reseated), but they also signal a different intent.
There is little to add to the answer of larsmans.
Allocating on the stack usually simplifies resource management: you do not have to bother with memory leaks, ownership, etc. A GUI library is built around this observation; see "Everything belongs somewhere" and "Who owns widgets."
If you allocate all members on the stack, then the default copy ctor and default op= usually suffice. If you allocate the members on the heap, you have to be careful how you implement them.
If you allocate the member variable on the stack, the member's definition has to be visible. If you allocate it on the heap then you can forward declare that member. I personally like forward declarations, it reduces dependency.
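A sketch of the forward-declaration point, squeezed into one file for brevity. The classes Car and Engine are invented for this example; in a real project the first part would sit in a header and the second in a .cpp file.

```cpp
#include <memory>

// --- what would go in the header: Engine is only forward declared ---
class Engine;

class Car {
public:
    Car();
    ~Car();                            // defined where Engine is complete
    int horsepower() const;
private:
    std::unique_ptr<Engine> engine_;   // heap member: incomplete type is fine here
};

// --- what would go in the .cpp file: the full definition of Engine ---
class Engine {
public:
    int horsepower() const { return 150; }
};

Car::Car() : engine_(new Engine) {}
Car::~Car() = default;
int Car::horsepower() const { return engine_->horsepower(); }
```

Code that includes only the header part never sees the definition of Engine, so changing Engine does not force those files to recompile. Had engine_ been a plain Engine member, the header would need the full definition.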
I have a class which requires a large amount of memory.
class BigClass {
public:
BigClass() {
bf1[96000000-1] = 1;
}
double bf1[96000000];
};
I can only instantiate the class by creating an object in heap memory with "new".
BigClass *c = new BigClass();
assert( c->bf1[96000000-1] == 1 );
delete c;
If I instantiate it without "new", I get a segmentation fault at runtime.
BigClass c; // SIGSEGV!
How can I determine the memory limit? or should I better always use "new"?
First of all, since you've titled this C++ and not C, why are you using arrays? Instead, may I suggest vector<double> or, if contiguous memory is causing problems, deque<double>, which relaxes the constraint on contiguous memory without giving up nearly constant-time lookup.
Using vector or deque may also alleviate other seg fault issues which could plague your project at a later date. For instance, overrunning bounds in your array. If you convert to using vector or deque you can use the .at(x) member function to retrieve and set values in your collection. Should you attempt to write out of bounds, that function will throw an error.
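For instance, a small sketch of bounds-checked writes with .at(), which throws std::out_of_range for an invalid index. The helper name try_write is made up for this example.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: returns true if writing to `index` succeeded,
// false if .at() rejected the index by throwing std::out_of_range.
bool try_write(std::vector<double>& values, std::size_t index, double x) {
    try {
        values.at(index) = x;   // bounds-checked access
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

With a vector of three elements, try_write(v, 2, 1.5) succeeds and try_write(v, 3, 2.0) reports failure instead of silently writing past the end, as v[3] = 2.0 would.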
The stack has a fixed size that depends on the compiler options. See your compiler documentation to change the stack size for your executable.
Anyway, for big objects, prefer using new or, better, smart pointers like shared_ptr (from boost, from std::tr1, or from std:: if you have a very recent compiler).
You shouldn't play that game ever. Your code could be called from another function or on a thread with a lower stack size limit and then your code will break nastily. See this closely related question.
If you're in doubt, use heap allocation (new), either directly with smart pointers (like std::unique_ptr, or auto_ptr in pre-C++11 code) or indirectly using std::vector.
There is no platform-independent way of determining the memory limit. For "large" amounts of memory, you're far safer allocating on the heap (i.e. using new); you can check for success by catching std::bad_alloc exceptions, or by using new (std::nothrow) and comparing the resulting pointer against NULL.
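The two ways of detecting a failed allocation can be sketched as follows. Note that a plain new expression throws std::bad_alloc rather than returning null, so the null check only applies to the std::nothrow form. The function names are made up for this example.

```cpp
#include <cstddef>
#include <new>

// Plain `new` reports failure by throwing std::bad_alloc.
// This wrapper turns the exception into a null pointer for the caller.
double* allocate_or_null(std::size_t count) {
    try {
        return new double[count];
    } catch (const std::bad_alloc&) {
        return nullptr;
    }
}

// `new (std::nothrow)` returns a null pointer instead of throwing.
double* allocate_nothrow(std::size_t count) {
    return new (std::nothrow) double[count];
}
```

Either way the caller owns the result and must release a non-null pointer with delete[]. On systems that overcommit memory, a successful allocation may still fail later when the pages are first touched.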
The way your class is designed is, as you discovered, quite fragile. Instead of requiring that your objects always be allocated on the heap, your class itself should allocate the huge memory block on the heap, preferably with std::vector, or possibly with a shared_ptr if vector doesn't work for some reason. Then you don't have to worry about how your clients use the object: it's safe to put on the stack or the heap.
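A minimal sketch of that redesign, keeping the names from the question. The count parameter is an addition for this example so that the size can be varied; it defaults to the original 96000000.

```cpp
#include <cstddef>
#include <vector>

// Sketch of the redesign: the object itself is small, and the large
// buffer lives on the heap inside the std::vector member.
class BigClass {
public:
    explicit BigClass(std::size_t count = 96000000)
        : bf1(count) {          // elements are value-initialized to 0.0
        if (!bf1.empty()) {
            bf1.back() = 1;     // same effect as bf1[count - 1] = 1
        }
    }
    std::vector<double> bf1;
};
```

With this version BigClass c; is fine as a local variable, since only the small vector object sits on the stack, and no destructor needs to be written because the vector frees its own buffer.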
On Linux, in the Bash shell, you can check the stack size with ulimit -s. Variables with automatic storage duration will have their space allocated on the stack. As others have said, there are better ways of approaching this:
Use a std::vector to hold your data inside your BigClass.
Allocate the memory for bf1 inside BigClass's constructor and then free it in the destructor.
If you must have a large double[] member, allocate an instance of BigClass with some kind of smart pointer; if you don't need shared access, something as simple as std::auto_ptr (std::unique_ptr since C++11) will let you safely construct/destroy your object:
std::auto_ptr<BigClass> myBigClass(new BigClass); // use std::unique_ptr<BigClass> in C++11 and later
myBigClass->bf1; // your array