Array: Storing Objects or References - c++

As a Java developer I have the following C++ question.
If I have objects of type A and I want to store a collection of them in an array,
then should I just store pointers to the objects or is it better to store the object itself?
In my opinion it is better to store pointers because:
1) One can easily remove an object, by setting its pointer to null
2) One saves space.

Pointers or just the objects?
You can't put references in an array in C++. You can make an array of pointers, but I'd still prefer a container and of actual objects rather than pointers because:
No chance to leak, exception safety is easier to deal with.
It isn't less space - if you store an array of pointers you need the memory for the object plus the memory for a pointer.
The only times I'd advocate putting pointers (or smart pointers would be better) in a container (or array if you must) is when your object isn't copy construable and assignable (a requirement for containers, pointers always meet this) or you need them to be polymorphic. E.g.
#include <vector>
struct foo {
virtual void it() {}
};
struct bar : public foo {
int a;
virtual void it() {}
};
int main() {
std::vector<foo> v;
v.push_back(bar()); // not doing what you expected! (the temporary bar gets "made into" a foo before storing as a foo and your vector doesn't get a bar added)
std::vector<foo*> v2;
v2.push_back(new bar()); // Fine
}
If you want to go down this road boost pointer containers might be of interest because they do all of the hard work for you.
Removing from arrays or containers.
Assigning NULL doesn't cause there to be any less pointers in your container/array, (it doesn't handle the delete either), the size remains the same but there are now pointers you can't legally dereference. This makes the rest of your code more complex in the form of extra if statements and prohibits things like:
// need to go out of our way to make sure there's no NULL here
std::for_each(v2.begin(),v2.end(), std::mem_fun(&foo::it));
I really dislike the idea of allowing NULLs in sequences of pointers in general because you quickly end up burying all the real work in a sequence of conditional statements. The alternative is that std::vector provides an erase method that takes an iterator so you can write:
v2.erase(v2.begin());
to remove the first or v2.begin()+1 for the second. There's no easy "erase the nth element" method though on std::vector because of the time complexity - if you're doing lots of erasing then there are other containers which might be more appropriate.
For an array you can simulate erasing with:
#include <utility>
#include <iterator>
#include <algorithm>
#include <iostream>
int main() {
int arr[] = {1,2,3,4};
int len = sizeof(arr)/sizeof(*arr);
std::copy(arr, arr+len, std::ostream_iterator<int>(std::cout, " "));
std::cout << std::endl;
// remove 2nd element, without preserving order:
std::swap(arr[1], arr[len-1]);
len -= 1;
std::copy(arr, arr+len, std::ostream_iterator<int>(std::cout, " "));
std::cout << std::endl;
// and again, first element:
std::swap(arr[0], arr[len-1]);
len -= 1;
std::copy(arr, arr+len, std::ostream_iterator<int>(std::cout, " "));
std::cout << std::endl;
}
preserving the order requires a series of shuffles instead of a single swap, which nicely illustrates the complexity of erasing that std::vector faces. Of course by doing this you've just reinvented a pretty big wheel a whole lot less usefully and flexibly than a standard library container would do for you for free!

It sounds like you are confusing references with pointers. C++ has 3 common ways of representing object handles
References
Pointers
Values
Coming from Java the most analogous way is to do so with a pointer. This is likely what you are trying to do here.
How they are stored though has some pretty fundamental effects on their behaviors. When you store as a value you are often dealing with copies of the values. Where pointers are dealing with one object with multiple references. Giving a flat answer of one is better than the other is not really possible without a bit more context on what these objects do

It completely depends on what you want to do... but you're misguided in some ways.
Things you should know are:
You can't set a reference to NULL in C++, though you can set a pointer to NULL.
A reference can only be made to an existing object - it must start initialized as such.
A reference cannot be changed (though the referenced value can be).
You wouldn't save space, in fact you would use more since you're using an object and a reference. If you need to reference the same object multiple times then you save space, but you might as well use a pointer - it's more flexible in MOST (read: not all) scenarios.
A last important one: STL containers (vector, list, etc) have COPY semantics - they cannot work with references. They can work with pointers, but it gets complicated, so for now you should always use copyable objects in those containers and accept that they will be copied, like it or not. The STL is designed to be efficient and safe with copy semantics.
Hope that helps! :)
PS (EDIT): You can use some new features in BOOST/TR1 (google them), and make a container/array of shared_ptr (reference counting smart pointers) which will give you a similar feel to Java's references and garbage collection. There's a flurry of differences but you'll have to read about it yourself - they are a great feature of the new standard.

You should always store objects when possible; that way, the container will manage the objects' lifetimes for you.
Occasionally, you will need to store pointers; most commonly, pointers to a base class where the objects themselves will be of different types. In that case, you need to be careful to manage the lifetime of the objects yourself; ensuring that they are not destroyed while in the container, but that they are destroyed once they are no longer needed.
Unlike Java, setting a pointer to null does not deallocate the object pointed to; instead, you get a memory leak if there are no more pointers to the object. If the object was created using new, then delete must be called at some point. Your best options here are to store smart pointers (shared_ptr, or perhaps unique_ptr if available), or to use Boost's pointer containers.

You can't store references in a container. You could store (naked) pointers instead, but that's prone to errors and is therefore frowned upon.
Thus, the real choice is between storing objects and smart pointers to objects. Both have their uses. My recommendation would be to go with storing objects by value unless the particular situation demands otherwise. This could happen:
if you need to NULL out the object without removing it from the
container;
if you need to store pointers to the same object in
multiple containers;
if you need to treat elements of the container
polymorphically.
One reason to not do it is to save space, since storing elements by value is likely to be more space-efficient.

To add to the answer of aix:
If you want to store polymorphic objects, you must use smart pointers because the containers make a copy, and for derived types only copy the base part (at least the standard ones, I think boost has some containers which work differently). Therefore you'll lose any polymorphic behaviour (and any derived-class state) of your objects.

Related

Reason for using smart pointers with a container

Simply written I would like to ask "what is a good reason to use smart pointers?"
for ex std::unique_ptr
However, I am not asking for reasons to use smart pointers over regular (dumb) pointers. I think every body knows that or a quick search can find the reason.
What I am asking is a comparison of these two cases:
Given a class (or a struct) named MyObject use
std:queue<std::unique_ptr<MyObject>>queue;
rather than
std:queue<MyObject> queue;
(it can be any container, not necessarily a queue)
Why should someone use option 1 rather than 2?
That is actually a good question.
There are a few reasons I can think of:
Polymorphism works only with references and pointers, not with value types. So if you want to hold derived objects in a container you can't have std::queue<MyObject>. One options is unique_ptr, another is reference_wrapper
the contained objects are referenced (*) from outside of the container. Depending on the container, the elements it holds can move, invalidating previous references to it. For instance std::vector::insert or the move of the container itself. In this case std::unique_ptr<MyObject> assures that the reference is valid, regardless of what the container does with it (ofc, as long as the unique_ptr is alive).
In the following example in Objects you can add a bunch of objects in a queue. However two of those objects can be special and you can access those two at any time.
struct MyObject { MyObject(int); };
struct Objects
{
std::queue<std::unique_ptr<MyObject>> all_objects_;
MyObject* special_object_ = nullptr;
MyObject* secondary_special_object_ = nullptr;
void AddObject(int i)
{
all_objects_.emplace(std::make_unique<MyObject>(i));
}
void AddSpecialObject(int i)
{
auto& emplaced = all_objects_.emplace(std::make_unique<MyObject>(i));
special_object_ = emplaced.get();
}
void AddSecondarySpecialObject(int i)
{
auto& emplaced = all_objects_.emplace(std::make_unique<MyObject>(i));
secondary_special_object_ = emplaced.get();
}
};
(*) I use "reference" here with its english meaning, not the C++ type. Any way to refer to an object (e.g. via a raw pointer)
Usecase: You want to store something in a std::vector with constant indices, while at the same time being able to remove objects from that vector.
If you use pointers, you can delete a pointed to object and set vector[i] = nullptr, (and also check for it later) which is something you cannot do when storing objects themselves. If you'd store Objects you would have to keep the instance in the vector and use a flag bool valid or something, because if you'd delete an object from a vector all indices after that object's index change by -1.
Note: As mentioned in a comment to this answer, the same can be archieved using std::optional, if you have access to C++17 or later.
The first declaration generates a container with pointer elements and the second one generates pure objects.
Here are some benefits of using pointers over objects:
They allow you to create dynamically sized data structures.
They allow you to manipulate memory directly (such as when packing or
unpacking data from hardware devices.)
They allow object references(function or data objects)
They allow you to manipulate an object(through an API) without needing to know the details of the object(other than the API.)
(raw) pointers are usually well matched to CPU registers, which makes dereferencing a value via a pointer efficient. (C++ “smart” pointers are more complicated data objects.)
Also, polymorphism is considered as one of the important features of Object-Oriented Programming.
In C++ polymorphism is mainly divided into two types:
Compile-time Polymorphism
This type of polymorphism is achieved by function overloading or operator overloading.
Runtime Polymorphism
This type of polymorphism is achieved by Function Overriding which if we want to use the base class to use these functions, it is necessary to use pointers instead of objects.

How to use multiple pointers to an object by vector/map

I have a question about using multiple pointers to an object.
I have a pointer in a vector and another one in a map.
The map uses the vector to index the object. Example code:
class Thing
{
public:
int x = 1;
};
Thing obj_Thing;
std::vector<Thing*> v_Things;
v_Things.push_back(&obj_Thing);
std::map<int, Thing*> m_ThingMap;
m_ThingsMap[v_Things[0]->x] = v_Things[0]; // crucial part
is it good practice to assign pointers to each other like this?
Should the vector and/or map hold addresses instead? Or should I be using a pointer to pointer for the map?
It all depends on what you want to do.
However, your approach can get really hairy when your project grows, and especially when others contribute to it.
To start with, this:
m_ThingsMap[v_Things[0]->x] = v_Things(0);
should be:
m_ThingsMap[v_Things[0]->x] = v_Things[0];
Moreover, storing raw pointers in a std::vector is possible, but needs special care, since you may end up with dangling pointers, if the object that a pointer points to gets deallocated too soon.
For that I suggest you to use std::weak_ptr, like this:
std::vector<std::weak_ptr<Thing>> v_Things;
in case you decide to stick with that approach (I mean if the pointer points to an object that is pointer-shared from another pointer).
If I were you, I would redesign my approach, since your code is not clean enough, let alone your logic; it takes one moment or two for someone to understand what is going on with all the pointers and shared places.

Do standard library containers structures store copies or references?

I'm afraid that standard library containers store inside themselves copies of all elements that I push into them. I hoped that they worked with references or pointers to my elements, so they don't waste extra memory and time making copies of each element. I made this proof:
queue<int> prueba;
int x = 5;
prueba.push(x);
x++;
cout << prueba.front() << ", ";
cout << x;
prueba.pop();
And the result was: 5, 6.
So, If I make a big class with a lot of heavy members, and then, I push a lot of objects of that class into a standard library container.
Will containers make a copy of each object inside? That's terrible!
Is there any way to avoid this catastrophic end, other than create just containers of pointers?
Does STL structures stores copies or references?
C++ standard containers are non-instrusive containers and as such they have the following properties:
Object doesn't "know" and contain details about the container in which is to be stored. Example:
struct Node
{
T data;
}
1. Pros:
does not containe additional information regarding the container integration.
object's lifetime managed by the container. (less complex.)
2. Cons:
store copies of values passed by the user. (inplace emplace construction possible.)
an object can belong only to one container. (or the contaier should store pointers to objects.)
overhead on storing copies. (bookkeeping on each allocation.)
can't store derived object and still maintain its original type. (slicing - looses polymorphism.)
Thus, the answer to your question is - they store copies.
Is there any way to avoid this catastrophic end, other than create just containers of pointers?
As far as I know, a reasonable solution is container of smart pointers.
The answer to your question is simple: STL containers store copies.

When to use pointers in C++

I just started learning about pointers in C++, and I'm not very sure on when to use pointers, and when to use actual objects.
For example, in one of my assignments we have to construct a gPolyline class, where each point is defined by a gVector. Right now my variables for the gPolyline class looks like this:
private:
vector<gVector3*> points;
If I had vector< gVector3 > points instead, what difference would it make? Also, is there a general rule of thumb for when to use pointers? Thanks in advance!
The general rule of thumb is to use pointers when you need to, and values or references when you can.
If you use vector<gVector3> inserting elements will make copies of these elements and the elements will not be connected any more to the item you inserted. When you store pointers, the vector just refers to the object you inserted.
So if you want several vectors to share the same elements, so that changes in the element are reflected in all the vectors, you need the vectors to contain pointers. If you don't need such functionality storing values is usually better, for example it saves you from worrying about when to delete all these pointed to objects.
Pointers are generally to be avoided in modern C++. The primary purpose for pointers nowadays revolves around the fact that pointers can be polymorphic, whereas explicit objects are not.
When you need polymorphism nowadays though it's better to use a smart pointer class -- such as std::shared_ptr (if your compiler supports C++0x extensions), std::tr1::shared_ptr (if your compiler doesn't support C++0x but does support TR1) or boost::shared_ptr.
Generally, it's a good idea to use pointers when you have to, but references or alternatively objects objects (think of values) when you can.
First you need to know if gVector3 fulfils requirements of standard containers, namely if the type gVector3 copyable and assignable. It is useful if gVector3 is default constructible as well (see UPDATE note below).
Assuming it does, then you have two choices, store objects of gVector3 directly in std::vector
std::vector<gVector3> points;
points.push_back(gVector(1, 2, 3)); // std::vector will make a copy of passed object
or manage creation (and also destruction) of gVector3 objects manually.
std::vector points;
points.push_back(new gVector3(1, 2, 3));
//...
When the points array is no longer needed, remember to talk through all elements and call delete operator on it.
Now, it's your choice if you can manipulate gVector3 as objects (you can assume to think of them as values or value objects) because (if, see condition above) thanks to availability of copy constructor and assignment operator the following operations are possible:
gVector3 v1(1, 2, 3);
gVector3 v2;
v2 = v1; // assignment
gVector3 v3(v2); // copy construction
or you may want or need to allocate objects of gVector3 in dynamic storage using new operator. Meaning, you may want or need to manage lifetime of those objects on your own.
By the way, you may be also wondering When should I use references, and when should I use pointers?
UPDATE: Here is explanation to the note on default constructibility. Thanks to Neil for pointing that it was initially unclear. As Neil correctly noticed, it is not required by C++ standard, however I pointed on this feature because it is an important and useful one. If type T is not default constructible, what is not required by the C++ standard, then user should be aware of potential problems which I try to illustrate below:
#include <vector>
struct T
{
int i;
T(int i) : i(i) {}
};
int main()
{
// Request vector of 10 elements
std::vector<T> v(10); // Compilation error about missing T::T() function/ctor
}
You can use pointers or objects - it's really the same at the end of the day.
If you have a pointer, you'll need to allocate space for the actual object (then point to it) any way. At the end of the day, if you have a million objects regardless of whether you are storing pointers or the objects themselves, you'll have the space for a million objects allocated in the memory.
When to use pointers instead? If you need to pass the objects themselves around, modify individual elements after they are in the data structure without having to retrieve them each and every time, or if you're using a custom memory manager to manage the allocation, deallocation, and cleanup of the objects.
Putting the objects themselves in the STL structure is easier and simpler. It requires less * and -> operators which you may find to be difficult to comprehend. Certain STL objects would need to have the objects themselves present instead of pointers in their default format (i.e. hashtables that need to hash the entry - and you want to hash the object, not the pointer to it) but you can always work around that by overriding functions, etc.
Bottom line: use pointers when it makes sense to. Use objects otherwise.
Normally you use objects.
Its easier to eat an apple than an apple on a stick (OK 2 meter stick because I like candy apples).
In this case just make it a vector<gVector3>
If you had a vector<g3Vector*> this implies that you are dynamically allocating new objects of g3Vector (using the new operator). If so then you need to call delete on these pointers at some point and std::Vector is not designed to do that.
But every rule is an exception.
If g3Vector is a huge object that costs a lot to copy (hard to tell read your documentation) then it may be more effecient to store as a pointer. But in this case I would use the boost::ptr_vector<g3Vector> as this automatically manages the life span of the object.

Container of Pointers vs Container of Objects - Performance

I was wondering if there is any difference in performance when you compare/contrast
A) Allocating objects on the heap, putting pointers to those objects in a container, operating on the container elsewhere in the code
Ex:
std::list<SomeObject*> someList;
// Somewhere else in the code
SomeObject* foo = new SomeObject(param1, param2);
someList.push_back(foo);
// Somewhere else in the code
while (itr != someList.end())
{
(*itr)->DoStuff();
//...
}
B) Creating an object, putting it in a container, operating on that container elsewhere in the code
Ex:
std::list<SomeObject> someList;
// Somewhere else in the code
SomeObject newObject(param1, param2);
someList.push_back(newObject);
// Somewhere else in the code
while (itr != someList.end())
{
itr->DoStuff();
...
}
Assuming the pointers are all deallocated correctly and everything works fine, my question is...
If there is a difference, what would yield better performance, and how great would the difference be?
There is a performance hit when inserting objects instead of pointers to objects.
std::list as well as other std containers make a copy of the parameter that you store (for std::map both key and value is copied).
As your someList is a std::list the following line copies your object:
Foo foo;
someList.push_back(foo); // copy foo object
It will get copied again when you retrieve it from list. So you are making of copies of the whole object compared to making copies of pointer when using:
Foo * foo = new Foo();
someList.push_back(foo); // copy of foo*
You can double check by inserting print statements into Foo's constructor, destructor, copy constructor.
EDIT: As mentioned in comments, pop_front does not return anything. You usually get reference to front element with front then you pop_front to remove the element from list:
Foo * fooB = someList.front(); // copy of foo*
someList.pop_front();
OR
Foo fooB = someList.front(); // front() returns reference to element but if you
someList.pop_front(); // are going to pop it from list you need to keep a
// copy so Foo fooB = someList.front() makes a copy
Like most performance questions, this doesn't have one clear cut answer.
For one thing, it depends on what exactly you're doing with the list. Pointers might make it easier to do various operations (like sorting). That's because comparing pointers and swapping pointers is probably going to be faster than comparing/swapping SomeObject (of course, it depends on the implementation of SomeObject).
On the other hand, dynamic memory allocation tends to be worse than allocating on the stack. So, assuming you have enough memory on the stack for all the objects, that's another thing to consider.
In the end, I would personally recommend the best piece of advice I've ever gotten: It's pointless trying to guess what will perform better. Code it the way that makes the most sense (easiest to implement/maintain). If, and only if* you later discover there is a performance problem, run a profiler and figure out why. Chances are, most programs won't need all these optimizations, and this will turn out to be a moot point.
It depends how you use the list. Do you just fill it with stuff, and do lookups, or do you insert and remove data regularly. Lookups may be marginally faster without pointers, while adding and removing elements will be faster with pointers.
With objects it is going to be memberwise copy (thus new object creation and copy of members) assuming there aren't any copy constructors and = operator overloads. Therefore, using pointers is efficient std::auto_ptr or boost's smart pointers better, but that is beyond the scope of this question.
If you still have to use object syntax using reference.
Some additional things to consider (You have already been made aware of the copy semantics of STL containers):
Are your objects really smaller than pointers to them? This becomes more relevant if you use any kind of smart pointer as those have a tendency to be larger.
Copy operations are (often?) optimized to use memcpy() by the compiler. Especially this is probably not true for smart pointers.
Additional dereferencing caused by pointers
All the things I have mentioned are micro optimizations considerations and I'd discourage even thinking about them and go with them. On the other hand: A lot of my claims would need verification and would make for interesting test cases. Feel free to benchmark them.