C++, Array of objects VS array of pointers pointing to those objects

Consider a C++ class named A. What are the pros and cons of using an array of objects:
std::array<A, 10>
instead of an array of pointers:
std::array<A*, 10>

Here are important differences:
Array of objects:
Memory of the objects is managed by std::array.
Objects are stored in contiguous memory (good cache efficiency)
All objects are of same type
All objects exist
Assigning an element makes a copy of the object
Array of pointers:
Memory of the objects that are pointed to is not managed by the std::array which contains the pointers.
You can store pointers to a common base of polymorphic types
A pointer can be nullptr, i.e. it does not have to point to any object
Assigning an element does not make a copy of the object which is pointed to
Whether any of these things is a pro or a con depends on your use case.
And now for the opinion based part, as a hint to beginners: In my opinion, the fact that the memory is managed by the array makes it clear that the array "owns" the objects. It's often not clear who owns the objects that are pointed to by the pointers. The clarity of ownership, combined with the cache efficiency which is always a bonus, makes the array of objects a good default choice when you are not sure. Use objects in arrays when you can, pointers when you need to. And when you need pointers, consider whether std::unique_ptr is appropriate.
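The differences listed above can be sketched in a few lines. This is only an illustration: A, Shape and Circle are hypothetical types made up for the example, and std::unique_ptr is used for the pointer case as the answer suggests.

```cpp
#include <array>
#include <cassert>
#include <memory>
#include <string>

// Hypothetical types, used only for illustration.
struct A { int value = 0; };

struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const { return "shape"; }
};
struct Circle : Shape {
    std::string name() const override { return "circle"; }
};

// Array of objects: assigning an element copies the object.
inline bool assignment_copies() {
    std::array<A, 3> objects{};   // all three A objects exist right away
    A original{42};
    objects[0] = original;        // copy of original
    original.value = 7;           // does not affect objects[0]
    return objects[0].value == 42;
}

// Array of pointers: slots may be empty and may refer to derived types.
inline std::string polymorphic_name() {
    std::array<std::unique_ptr<Shape>, 3> shapes{};  // every slot starts as nullptr
    shapes[0] = std::make_unique<Circle>();
    assert(shapes[1] == nullptr);                    // no object in this slot
    return shapes[0]->name();                        // virtual call reaches Circle
}
```

Calling assignment_copies() shows that the array element keeps its own copy, while polymorphic_name() goes through the base-class pointer to the derived object.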

If you don't want to use std::array, you can use a built-in array:
1) A array[10]
or
2) A* array[10]
For #1:
A must have a default constructor (unless every element is given an initializer).
The array holds the objects themselves, so it needs more memory: sizeof(array) = sizeof(A) * 10
Whenever you assign an object to an element of the array, the copy assignment operator is called.
A must be a complete type where the array is declared, which can increase compilation time.
For #2:
No constructor of A is called when the array is created.
sizeof(array) = sizeof(A*) * 10
A forward declaration of A is enough, which can reduce compilation time.
A does not need a default constructor or a copy constructor.
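The size claims above can be checked at compile time. The following is a small sketch; A and B are hypothetical types, and the 64-byte payload is an arbitrary choice for the example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical class with some payload; default-constructible.
struct A {
    char payload[64];
};

struct B;  // forward declaration only: B is an incomplete type here

A  objects[10];    // needs the full definition of A; stores the 10 objects
A* pointers[10];   // stores 10 pointers; the A objects live elsewhere, if at all
B* opaque[10];     // compiles even though B is incomplete

static_assert(sizeof(objects)  == 10 * sizeof(A),  "array of objects holds the objects");
static_assert(sizeof(pointers) == 10 * sizeof(A*), "array of pointers holds only pointers");
```

Because these arrays have static storage duration, the pointer arrays start out filled with null pointers; a local A* array[10] would be uninitialized instead.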

Related

array class member with different size for each class instance

I would like to have a class that contains an array member, where the constructor lets me set the size of that array.
Is this doable? I do not think I need dynamic allocation, since once the class instances are created, there is no need for the array to change size; it is just that each class instance will have a different size.

Although several comments suggest that this is impossible, it actually is possible.
The simplest way, of course, is to use an indirection and allocate the array during construction just the normal way (with a = new type[size] and calling delete[] a - not delete a - in the destructor).
But if for some reason you really do not want to have the array data being allocated separately from your object, you can use placement-new to construct your object into a pre-allocated buffer that is large enough to contain all your elements. This avoids a separate allocation for your array and you can still have dynamic size.
I would not recommend using this technique, though, unless you really have a demanding use case for it.
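As a minimal sketch of the "simplest way" mentioned above (allocate in the constructor with new[], release with delete[] in the destructor): the class name Samples and its members are hypothetical, and copying is simply disabled to keep the example short.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical class: the array size is chosen per instance at construction
// and never changes afterwards.
class Samples {
public:
    explicit Samples(std::size_t size)
        : size_(size), data_(new double[size]()) {}  // () zero-initializes the elements
    ~Samples() { delete[] data_; }                   // delete[], not delete

    // A real class would need a deep copy here; disabled to keep the sketch short.
    Samples(const Samples&) = delete;
    Samples& operator=(const Samples&) = delete;

    std::size_t size() const { return size_; }
    double& operator[](std::size_t i) {
        assert(i < size_);
        return data_[i];
    }

private:
    std::size_t size_;
    double* data_;
};
```

Two instances such as Samples a(3) and Samples b(8) then carry arrays of different sizes. In practice a std::vector<double> member sized in the constructor gives the same behaviour without a hand-written destructor.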

Why is vector of unique_ptr the preferred way to store pointers?

From what I have read, a common approach for a vector that owns the objects it points to (MyObject, for example, in simple uses) is vector<unique_ptr<MyObject>>.
But each access to an element calls unique_ptr::get(), which adds a little overhead.
Why isn't a vector of raw pointers with a "custom deleter", if such a thing exists (I have not used allocators), more standard? That is, a smart vector instead of a vector of smart pointers. It would eliminate the small overhead of calling unique_ptr::get().
Something like vector<MyObject*, delete_on_destroy_allocator<MyObject>> or unique_vector<MyObject>.
The vector would take on the "delete the pointer on destruction" behaviour instead of duplicating that behaviour in each unique_ptr. Is there a reason this is not done, or is the overhead simply negligible?

Why isn't vector of pointer with "custom deleter", if such a thing exists
Because such a thing doesn't exist and cannot exist.
The allocator supplied to a container exists to allocate memory for the container and (optionally) creates/destroys the objects in that container. A vector<T*> is a container of pointers; therefore, the allocator allocates memory for the pointer and (optionally) creates/destroys the pointers. It is not responsible for the content of the pointer: the object it points to. That is the domain of the user to provide and manage.
If an allocator takes responsibility for destroying the object being pointed to, then it must logically also have responsibility for creating the object being pointed to, yes? After all, if it didn't, and we copied such a vector<T*, owning_allocator>, each copy would expect to destroy the objects being pointed to. But since they're pointing to the same objects (copying a vector<T> copies the Ts), you get a double destroy.
Therefore, if owning_allocator::destroy is going to delete the memory, owning_allocator::construct must also create the object being pointed to.
So... what does this do:
vector<T*, owning_allocator> vec;
vec.push_back(new T());
See the problem? allocator::construct cannot decide when to create a T and when not to. It doesn't know if it's being called because of a vector copy operation or because push_back is being called with a user-created T*. All it knows is that it is being called with a T* value (technically a reference to a T*, but that's irrelevant, since it will be called with such a reference in both cases).
Therefore, either it 1) allocates a new object (initialized via a copy from the pointer it is given), or 2) it copies the pointer value. And since it cannot detect which situation is in play, it must always pick the same option. If it does #1, then the above code is a memory leak, because the vector didn't store the new T(), and nobody else deleted it. If it does #2, then you can't copy such a vector (and the story for internal vector reallocation is equally hazy).
What you want is not possible.
A vector<T> is a container of Ts, whatever T may be. It treats T as whatever it is; any meaning of this value is up to the user. And ownership semantics are part of that meaning.
T* has no ownership semantics, so vector<T*> also has no ownership semantics. unique_ptr<T> has ownership semantics, so vector<unique_ptr<T>> also has ownership semantics.
This is why Boost has ptr_vector<T>, which is explicitly a vector-style class that specifically contains pointers to Ts. It has a slightly modified interface because of this; if you hand it a T*, it knows it is adopting the T* and will destroy it. If you hand it a T, then it allocates a new T and copies/moves the value into the newly allocated T. This is a different container, with a different interface, and different behavior; therefore, it merits a different type from vector<T*>.
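To illustrate the point that ownership semantics come from the element type, here is a small sketch with vector<unique_ptr<T>>. The Tracked type and its instance counter are hypothetical helpers made up for the example.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical type that counts its live instances.
struct Tracked {
    static int live;
    Tracked() { ++live; }
    ~Tracked() { --live; }
    Tracked(const Tracked&) = delete;
    Tracked& operator=(const Tracked&) = delete;
};
int Tracked::live = 0;

// Returns the number of live objects while the owning vector still exists.
inline int live_inside_scope() {
    std::vector<std::unique_ptr<Tracked>> owners;
    owners.push_back(std::make_unique<Tracked>());
    owners.push_back(std::make_unique<Tracked>());

    // Ownership can be moved but never silently duplicated:
    // auto copy = owners;            // would not compile
    auto moved = std::move(owners);   // 'moved' now owns both objects
    return Tracked::live;
}   // 'moved' is destroyed here, which deletes both Tracked objects
```

Because copying the vector does not compile, the double-destroy problem described above cannot arise, and every object is deleted exactly once when its owning vector goes away.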

Neither a vector of unique_ptrs nor a vector of plain pointers is the preferred way to store data. In your example, std::vector<MyObject> is usually just fine, and if you know the size at compile time, try std::array<MyObject, N>.
If you absolutely need indirect references, you can also consider std::vector<std::reference_wrapper<MyObject>>.
Having said that... if you:
Need to store your vector somewhere else than your actual data, or
If MyObjects are very large / expensive to move, or
If construction or destruction of MyObjects has real-world side-effects which you want to avoid;
and, additionally, you want each MyObject to be freed once the vector no longer refers to it or the vector itself is gone, then the vector of unique pointers is relevant.
Now, pointers are just a plain and simple data type inherited from the C language; they don't have custom deleters or custom anything... but std::unique_ptr does support custom deleters. Also, it may be the case that you have more complex resource management needs for which it doesn't make sense to have each element manage its own allocation and de-allocation, in which case a "smart" vector class may be relevant.
So: Different data structures fit different scenarios.
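As a rough sketch of the std::reference_wrapper option mentioned above: one vector owns the objects by value and a second vector only refers to them. MyObject here is a hypothetical stand-in with a single int member.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-in for the MyObject type from the question.
struct MyObject { int value = 0; };

inline int sum_after_update() {
    std::vector<MyObject> storage(3);   // owns the objects
    // Non-owning view: each element refers to an object inside 'storage'.
    std::vector<std::reference_wrapper<MyObject>> view(storage.begin(), storage.end());

    for (MyObject& obj : view) obj.value += 5;  // wrapper converts to MyObject&
    view[0].get().value = 1;                    // explicit access through get()

    int sum = 0;
    for (const MyObject& obj : storage) sum += obj.value;
    return sum;  // 1 + 5 + 5
}
```

The view stays valid only as long as storage is alive and does not reallocate, which is the usual caveat for any non-owning reference.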

What kind of string initialization is this?

Just came across this. I can't believe it compiles, but it does. What kind of string initialization is this? And why do this?
std::string* name = new std::string[12];

This is the syntax for a dynamically allocated C-style array. It was common before std::vector made all but a small fraction of its uses obsolete, and since C++11 even those remaining uses have vanished.
This code dynamically creates and initializes 12 empty strings and sets name pointer to point to the very first of them. Now those strings can be accessed with [] operator, for example:
std::cout << name[0] << "\n";
This prints an empty string followed by a newline.
There should never be any reason to use this construct, though, and instead
std::vector<std::string> name(12);
should be used.
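For comparison, here are the two variants side by side. This is only a sketch; the function names are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Manual version: every path out of the function has to reach delete[].
inline std::size_t count_empty_manual() {
    std::string* name = new std::string[12];  // 12 default-constructed (empty) strings
    std::size_t empty = 0;
    for (std::size_t i = 0; i < 12; ++i)
        if (name[i].empty()) ++empty;
    delete[] name;                            // array form of delete is required
    return empty;
}

// Vector version: the 12 strings are released automatically.
inline std::size_t count_empty_vector() {
    std::vector<std::string> name(12);        // also 12 empty strings
    std::size_t empty = 0;
    for (const std::string& s : name)
        if (s.empty()) ++empty;
    return empty;
}
```

Both functions see twelve empty strings; the difference is that the manual version leaks if anything between new[] and delete[] throws or returns early.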

What ... is this?
That is a new-expression. It allocates an object in the free store. More specifically, this expression allocates an array of 12 std::string objects.
What kind of ... initialization is this?
The strings of the array are default-initialized.
And why do this?
The scope of this question is unclear...
Why use an array?
Because arrays are the most efficient data structure. They incur zero space overhead and (depending on situation) interact well with processor caching.
Why allocate a dynamic array (from the free store)?
Because the size of an automatic array must be known at compile time. The size of a dynamic array does not need to be known until runtime. Of course, your example uses a compile time constant size for the array, so dynamic allocation is not necessary for that reason.
Also because the memory for automatic variables is limited (one to a few megabytes on typical desktop systems). As such, large objects such as arrays that contain many objects must be allocated from the free store. An array of 12 strings is not significantly large in relation to the size of memory that is usually available for automatic objects.
Also because dynamic objects are not automatically destroyed at the end of current scope, so their lifetime is more flexible than automatic or static objects. Of course, this is as much a reason to not use dynamic objects: They are not destroyed automatically, and managing their lifetime is difficult and proving the correctness of a program that uses dynamic memory can be very difficult.
Why use a new expression to allocate an array?
There's typically no reason to do so. The standard library provides a RAII container that handles the lifetime of the dynamically allocated array: std::vector.

This code is allocating an array of 12 std::string objects and storing the pointer to the first element of the array in the name variable.
std::string* name = new std::string[12];
The new expression allocates an array of 12 std::string objects with dynamic storage duration. Each std::string object in the array is initialized via its default constructor.
The new expression attempts to allocate storage and then attempts to construct and initialize either a single unnamed object, or an unnamed array of objects in the allocated storage. The new-expression returns a prvalue pointer to the constructed object or, if an array of objects was constructed, a pointer to the initial element of the array.
The pointer to the initial element of the array is then stored in name so that you can access the elements of the array using the [] subscript operator.

C++ vector of objects vs. vector of pointers to objects

I am writing an application using openFrameworks, but my question is not specific to oF; rather, it is a question about C++ vectors in general.
I wanted to create a class that contains multiple instances of another class, but also provides an intuitive interface for interacting with those objects. Internally, my class used a vector of the class, but when I tried to manipulate an object using vector.at(), the program would compile but not work properly (in my case, it would not display a video).
// instantiate object dynamically, do something, then append to vector
vector<ofVideoPlayer> videos;
ofVideoPlayer *video = new ofVideoPlayer;
video->loadMovie(filename);
videos.push_back(*video);
// access object in vector and do something; compiles but does not work properly
// without going into specific openFrameworks details, the problem was that the video would
// not draw to screen
videos.at(0).draw();
Somewhere, it was suggested that I make a vector of pointers to objects of that class instead of a vector of those objects themselves. I implemented this and indeed it worked like a charm.
vector<ofVideoPlayer*> videos;
ofVideoPlayer * video = new ofVideoPlayer;
video->loadMovie(filename);
videos.push_back(video);
// now dereference pointer to object and call draw
videos.at(0)->draw();
I was allocating memory for the objects dynamically, i.e. ofVideoPlayer *video = new ofVideoPlayer;
My question is simple: why did using a vector of pointers work, and when would you create a vector of objects versus a vector of pointers to those objects?

What you have to know about vectors in C++ is that they use the copy constructor of your class to place objects into the vector. If these objects hold memory that is automatically deallocated when the destructor is called, that could explain your problem: your object was copied into the vector, and then the original was destroyed.
If your class contains a pointer to an allocated buffer, a copy of the object will point to the same buffer (if you use the default copy constructor). If the destructor deallocates the buffer, then as soon as either the original or the copy is destroyed, the shared buffer is deallocated and its data is no longer available to the other object.
This problem doesn't happen if you use pointers, because you control the lifetime of your elements via new/delete, and the vector functions only copy the pointers to your elements.
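The other way out is to give the class a deep copy, so that storing it by value is safe. Below is a minimal sketch; Frame is a hypothetical resource-owning class, not the actual ofVideoPlayer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical class that owns a buffer and copies it deeply.
class Frame {
public:
    explicit Frame(std::size_t n) : size_(n), pixels_(new int[n]()) {}
    Frame(const Frame& other) : size_(other.size_), pixels_(new int[other.size_]) {
        std::copy(other.pixels_, other.pixels_ + size_, pixels_);  // copy the data, not the pointer
    }
    Frame& operator=(Frame other) {           // copy-and-swap assignment
        std::swap(size_, other.size_);
        std::swap(pixels_, other.pixels_);
        return *this;
    }
    ~Frame() { delete[] pixels_; }

    int* data() { return pixels_; }

private:
    std::size_t size_;
    int* pixels_;
};

inline bool copy_has_own_buffer() {
    std::vector<Frame> frames;
    {
        Frame temp(4);
        temp.data()[0] = 9;
        frames.push_back(temp);           // copy-constructs a Frame inside the vector
    }                                     // temp is destroyed and frees only its own buffer
    return frames[0].data()[0] == 9;      // the stored copy still has valid data
}
```

With the compiler-generated copy constructor instead, both objects would share one buffer and the access after the inner scope would read freed memory.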

My question is simple: why did using a vector of pointers work, and when would you create a vector of objects versus a vector of pointers to those objects?
std::vector is like a raw array allocated with new and reallocated when you try to push in more elements than its current capacity.
So, if it contains A pointers, it's as if you were manipulating an array of A*.
When it needs to grow (you push_back() an element while it is already filled to its current capacity), it creates another A* array and copies the pointers over from the previous array.
If it contains A objects, then it's as if you were manipulating an array of A, so A must be copy-constructible (or move-constructible) for automatic reallocation to work. In this case, the whole A objects are copied into the new array too.
See the difference? The A objects in std::vector<A> can change address if you do something that requires resizing the internal array. That's where most problems with storing objects in std::vector come from.
A way to use std::vector without having such problems is to allocate a large enough array from the start. The keyword here is "capacity". The std::vector capacity is the real size of the memory buffer in which it will put the objects. So, to set up the capacity, you have two choices:
1) size your std::vector on construction with the maximum number of objects, building all of them from the start; that will call the constructor of each object.
2) once the std::vector is constructed (but has nothing in it), use its reserve() function: the vector will then allocate a large enough buffer (you provide the maximum size of the vector) and set its capacity accordingly. If you push_back() objects into this vector or resize() it without exceeding the size you passed to reserve(), it will never reallocate the internal buffer and your objects will not change location in memory, so pointers to those objects stay valid (adding some assertions to check that the capacity never changes is an excellent practice).
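Option 2 can be sketched as follows. The element type A and the limit of 100 elements are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element type.
struct A { int id; };

inline bool addresses_stable_within_capacity() {
    std::vector<A> items;
    items.reserve(100);                  // capacity is now at least 100, size is still 0
    items.push_back(A{0});
    const A* first = &items[0];          // pointer into the vector's buffer
    const std::size_t capacity = items.capacity();

    for (int i = 1; i < 100; ++i)
        items.push_back(A{i});           // size never exceeds the reserved capacity

    assert(items.capacity() == capacity);  // the suggested "capacity never changed" check
    return first == &items[0];             // the first element did not move
}
```

Pushing a 101st element (or more, depending on the capacity the implementation actually reserved) may reallocate, after which the saved pointer must no longer be used.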

If you are allocating memory for the objects using new, you are allocating it on the heap. In this case, you should use pointers. However, in C++, the convention is generally to create all objects on the stack and pass copies of those objects around instead of passing pointers to objects on the heap.
Why is this better? It is because C++ does not have garbage collection, so memory for objects on the heap will not be reclaimed unless you specifically delete the object. However, objects on the stack are always destroyed when they leave scope. If you create objects on the stack instead of the heap, you minimize your risk of memory leaks.
If you do use the stack instead of the heap, you will need to write good copy constructors and destructors. Badly written copy constructors or destructors can lead to either memory leaks or double frees.
If your objects are too large to be efficiently copied, then it is acceptable to use pointers. However, you should use reference-counting smart pointers (either the C++0x std::shared_ptr or one of the Boost library pointers) to avoid memory leaks. Note that std::auto_ptr is not reference-counted and must not be stored in standard containers.

Adding to a vector and the vector's internal housekeeping use copies of the original object. If taking a copy is very expensive or impossible, then using a pointer is preferable.
If you make the vector member a pointer, use a smart pointer to simplify your code and minimize the risk of leaks.
Maybe your class does not do proper (i.e. deep) copy construction/assignment? If so, pointers would work but object instances would not work as vector elements.

Usually I don't store class objects directly in std::vector. The reason is simple: you would not know whether an object is actually of a derived class.
E.g.:
In headers:
class base
{
public:
    virtual base * clone() { return new base(*this); }
    virtual ~base() {}
};
class derived : public base
{
public:
    virtual base * clone() { return new derived(*this); }
};
void some_code(void);
void work_on_some_class( base &_arg );
In source:
void some_code(void)
{
    ...
    derived instance;
    work_on_some_class(instance);
    ...
}
void work_on_some_class( base &_arg )
{
    vector<base> store;
    ...
    store.push_back(*_arg.clone());
    // Issue!
    // clone() returns a pointer to a derived object, which can be larger than base;
    // push_back copies only the base part (slicing), and the clone itself is leaked
}
So I prefer to use shared_ptr:
void work_on_some_class( base &_arg )
{
    vector<shared_ptr<base> > store;
    ...
    // shared_ptr's constructor from a raw pointer is explicit
    store.push_back(shared_ptr<base>(_arg.clone()));
    // no issue :)
}
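The slicing issue can be made visible with a small self-contained sketch. The name() member is an addition for the example and is not part of the classes above.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct base {
    virtual ~base() = default;
    virtual std::string name() const { return "base"; }  // hypothetical member for the demo
};
struct derived : base {
    std::string name() const override { return "derived"; }
};

// Storing by value keeps only the base part of the argument.
inline std::string stored_by_value(const base& arg) {
    std::vector<base> store;
    store.push_back(arg);             // slicing: a plain base object is stored
    return store[0].name();
}

// Storing a smart pointer keeps the dynamic type intact.
inline std::string stored_by_pointer(std::shared_ptr<base> arg) {
    std::vector<std::shared_ptr<base>> store;
    store.push_back(std::move(arg));
    return store[0]->name();
}
```

Passing a derived object to stored_by_value yields "base", while stored_by_pointer(std::make_shared<derived>()) yields "derived".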

The main idea of using a vector is to store objects in contiguous memory; when using pointers or smart pointers, the objects themselves are not stored contiguously.

Here you also need to keep in mind how efficiently the CPU can access memory.
std::vector<Object> guarantees that its elements are stored in one contiguous memory block.
std::vector<std::unique_ptr<Object>> keeps the smart pointers in contiguous memory, but the actual memory blocks for the objects can be scattered across RAM.
So I would guess that std::vector<Object> will be faster when the size of the vector is known and reserved in advance. However, std::vector<std::unique_ptr<Object>> could be faster if we don't know the final size or plan to reorder the objects.

Pointer to vector vs vector of pointers vs pointer to vector of pointers

Just wondering what you think is the best practice regarding vectors in C++.
If I have a class containing a vector member variable, when should this vector be declared as a:
1) "Whole-object" vector member variable containing values, i.e. vector<MyClass> my_vector;
2) Pointer to a vector, i.e. vector<MyClass>* my_vector;
3) Vector of pointers, i.e. vector<MyClass*> my_vector;
4) Pointer to vector of pointers, i.e. vector<MyClass*>* my_vector;
I have a specific example in one of my classes where I have currently declared a vector as case 4, i.e. vector<AnotherClass*>* my_vector;
where AnotherClass is another of the classes I have created.
Then, in the initialization list of my constructor, I create the vector using new:
MyClass::MyClass()
    : my_vector(new vector<AnotherClass*>())
{}
In my destructor I do the following:
MyClass::~MyClass()
{
    for (int i=my_vector->size(); i>0; i--)
    {
        delete my_vector->at(i-1);
    }
    delete my_vector;
}
The elements of the vectors are added in one of the methods of my class.
I cannot know how many objects will be added to my vector in advance. That is decided when the code executes, based on parsing an xml-file.
Is this good practice? Or should the vector instead be declared as one of the other cases 1, 2 or 3 ?
When to use which case?
I know the elements of a vector should be pointers if they are subclasses of another class (polymorphism). But should pointers be used in any other cases ?
Thank you very much!!

Usually solution 1 is what you want since it's the simplest in C++: you don't have to take care of managing the memory, C++ does all that for you (for example you wouldn't need to provide any destructor then).
There are specific cases where this doesn't work (most notably when working with polymorphic objects) but in general this is the only good way.
Even when working with polymorphic objects or when you need heap allocated objects (for whatever reason), raw pointers are almost never a good idea. Instead, use a smart pointer or container of smart pointers. Compilers that support C++11 provide std::shared_ptr and std::unique_ptr in the standard library. If you're using a compiler that doesn't yet have them, you can use the implementation from Boost.

Definitely the first!
You use vector for its automatic memory management. Using a raw pointer to a vector means you don't get automatic memory management anymore, which does not make sense.
As for the value type: all containers basically assume value-like semantics. Again, you'd have to do memory management when using pointers, and it's vector's purpose to do that for you. This is also described in item 79 from the book C++ Coding Standards. If you need to use shared ownership or "weak" links, use the appropriate smart pointer instead.

Deleting all elements in a vector manually is an anti-pattern and violates the RAII idiom in C++. So if you have to store pointers to objects in a vector, better use a "smart pointer" (for example boost::shared_ptr) to handle the destruction of the resources. boost::shared_ptr, for example, calls delete automatically when the last reference to an object is destroyed.
There is also no need to allocate MyClass::my_vector using new. A simple solution would be:
class MyClass {
std::vector<whatever> m_vector;
};
Assuming whatever is a smart pointer type, there is no extra work to be done. That's it, all resources are automatically destroyed when the lifetime of a MyClass instance ends.
In many cases you can even use a plain std::vector<MyClass> - that's when the objects in the vector are safe to copy.
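A possible shape for the class from the question, following this advice, is sketched below. It uses std::unique_ptr instead of the boost::shared_ptr named above (an assumption that the vector is the sole owner), and the instance counter and the add() and count() members are hypothetical additions for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical element type that counts its live instances.
struct AnotherClass {
    static int live;
    AnotherClass() { ++live; }
    virtual ~AnotherClass() { --live; }
};
int AnotherClass::live = 0;

class MyClass {
public:
    // Elements are added at run time, e.g. while parsing a file.
    void add() { my_vector.push_back(std::make_unique<AnotherClass>()); }
    std::size_t count() const { return my_vector.size(); }
    // No destructor and no new/delete for the vector itself:
    // the vector and its unique_ptr elements clean up automatically.
private:
    std::vector<std::unique_ptr<AnotherClass>> my_vector;
};
```

When a MyClass instance goes out of scope, every AnotherClass it added is deleted without any hand-written loop. The class also becomes move-only, which avoids the shared-vector problem that copying a raw pointer member would cause.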

In your example, the vector is created when the object is created, and it is destroyed when the object is destroyed. This is exactly the behavior you get when making the vector a normal member of the class.
Also, in your current approach, you will run into problems when making copies of your object. By default, a pointer member results in a shallow copy, meaning all copies of the object would share the same vector. This is the reason why, if you manually manage resources, you usually need The Big Three.
A vector of pointers is useful in cases of polymorphic objects, but there are alternatives you should consider:
If the vector owns the objects (that means their lifetime is bounded by that of the vector), you could use a boost::ptr_vector.
If the objects are not owned by the vector, you could either use a vector of boost::shared_ptr, or a vector of boost::ref.

A pointer to a vector is very rarely useful - a vector is cheap to construct and destruct.
For elements in the vector, there's no correct answer. How often does the vector change? How much does it cost to copy-construct the elements in the vector? Do other containers have references or pointers to the vector elements?
As a rule of thumb, I'd go with no pointers until you see or measure that the copying of your classes is expensive. And of course the case you mentioned, where you store various subclasses of a base class in the vector, will require pointers.
A reference counting smart pointer like boost::shared_ptr will likely be the best choice if your design would otherwise require you to use pointers as vector elements.

Complex answer: it depends.
If your vector is shared or has a lifecycle different from the class which embeds it, it might be better to keep it as a pointer.
If the objects you're referencing have no (or have expensive) copy constructors, then it's better to keep a vector of pointers. On the other hand, if your objects can be copied with a shallow copy, using a vector of objects protects you from leaks.
