Recently I came across the opinion that I shouldn't use a vector of pointers.
I wanted to know: why can't I?
For example if I have a class foo it is possible to do this:
vector <foo*> v;
v.push_back(new foo());
I've already seen some people down voting such practices, why is that?
Storing plain pointers in a container can lead to memory leaks and dangling pointers. Storing a pointer in a container does not define any kind of ownership of the pointer, so the container does not know the semantics of destruction and copy operations. When elements are removed from the container, the container does not know how to destroy them properly; when a copy operation is performed, no ownership semantics are known. Of course, you can always handle these things yourself, but then there is still a chance of human error.
Using smart pointers leaves the ownership and destruction semantics up to them.
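To make the ownership difference concrete, here is a minimal sketch. The `Tracked` type and both function names are hypothetical and exist only for this illustration; `Tracked` simply counts how many instances are alive.

```cpp
#include <memory>
#include <vector>

// Hypothetical helper type: counts how many instances are currently alive.
struct Tracked {
    static int live;
    Tracked() { ++live; }
    Tracked(const Tracked&) = delete;
    Tracked& operator=(const Tracked&) = delete;
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// A vector of raw pointers destroys only the pointers, not the objects.
int live_after_raw_vector() {
    Tracked* a = new Tracked();
    Tracked* b = new Tracked();
    {
        std::vector<Tracked*> v;
        v.push_back(a);
        v.push_back(b);
    }  // v is destroyed here; the two Tracked objects are untouched
    const int still_alive = Tracked::live;
    delete a;  // manual cleanup, easy to forget in real code
    delete b;
    return still_alive;
}

// A vector of unique_ptr destroys the objects together with the vector.
int live_after_smart_vector() {
    {
        std::vector<std::unique_ptr<Tracked>> v;
        v.push_back(std::unique_ptr<Tracked>(new Tracked()));
        v.push_back(std::unique_ptr<Tracked>(new Tracked()));
    }  // each unique_ptr deletes its object here
    return Tracked::live;
}
```

In the first function the objects outlive the vector and must be deleted by hand; in the second, nothing is left alive once the vector is gone.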
Another thing to mention is that containers are divided into non-intrusive and intrusive containers. Intrusive containers store the actual provided object instead of a copy, so they effectively come down to a collection of pointers. Intrusive containers have some advantages, so you can't generalize that pointers in a container should be avoided at all times; still, in most cases avoiding them is recommended.
Using a vector of raw pointers is not necessarily bad style, as long as you remember that the pointers do not have ownership semantics. When you start using new and delete, it usually means that you're doing something wrong.
In particular, the only cases where you should use new or delete in modern C++ code are when constructing unique_ptr's, or when constructing shared_ptr's with custom deleters.
For example, assume that we have a class that implements a bidirectional Graph; a Graph contains some number of Vertexes.
#include <memory>
#include <vector>

class Vertex
{
public:
    Vertex() = default;
    // raw pointers: no ownership
    std::vector<Vertex *> edges;
};

class Graph
{
public:
    Graph() = default;
    void addNode()
    {
        // unique_ptr's constructor from a raw pointer is explicit, so wrap it here;
        // in C++14, prefer std::make_unique<Vertex>()
        vertexes.push_back(std::unique_ptr<Vertex>(new Vertex));
    }
    // not shown: our Graph class implements a method to traverse over its nodes
private:
    // unique_ptr: explicit ownership
    std::vector<std::unique_ptr<Vertex>> vertexes;
};

void connect(Vertex *a, Vertex *b)
{
    a->edges.push_back(b);
    b->edges.push_back(a);
}
Notice how I have a vector of raw Vertex * in the Vertex class? I can do that because the lifetime of the Vertexes it points to is managed by the class Graph. The ownership of my Vertex objects is explicit from just looking at the code.
A different answer suggests using shared_ptr's. I personally dislike that approach because shared pointers, in general, make it very hard to reason about the lifetime of objects. In this particular example, shared pointers would not have worked at all because of the circular references between the Vertexes.
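The circular-reference problem can be sketched as follows. This is an illustration only: `Node` and `live_after_shared_cycle` are hypothetical names, and `Node` stands in for a Vertex whose edges are owning shared_ptr's.

```cpp
#include <memory>
#include <vector>

// Hypothetical vertex type whose edges own their neighbours.
struct Node {
    static int live;
    std::vector<std::shared_ptr<Node>> edges;
    Node() { ++live; }
    ~Node() { --live; }
};
int Node::live = 0;

// Connects two nodes in both directions, drops the local handles, and
// reports how many nodes are still alive afterwards.
int live_after_shared_cycle() {
    std::weak_ptr<Node> observer;  // non-owning, used only to clean up later
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->edges.push_back(b);
        b->edges.push_back(a);
        observer = a;
    }  // a and b go out of scope, but each node still owns the other
    const int still_alive = Node::live;
    if (auto a = observer.lock()) {
        a->edges.clear();  // break the cycle by hand so the sketch does not leak
    }
    return still_alive;
}
```

Both nodes survive the end of the inner scope because each keeps the other's reference count above zero; only breaking the cycle manually lets them be destroyed.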
The vector's destructor won't call delete on the pointers, so it's easy to accidentally leak memory. A vector's destructor calls the destructors of all the elements in the vector, but raw pointers don't have destructors.
However, you can use a vector of smart pointers to ensure that destroying the vector will free the objects in it. vector<unique_ptr<foo>> can be used in C++11, and in C++98 with TR1 you can use vector<tr1::shared_ptr<foo> > (though shared_ptr has a slight overhead compared to a raw pointer or unique_ptr).
Boost also has a pointer container library, where the special delete-on-destruction behavior is built into the container itself so you don't need smart pointers.
One of the problems is exception-safety.
For example, suppose that somewhere an exception is thrown: in this case, the destructor of std::vector is called. But this destructor call does not delete the raw owning pointers stored in the vector. So, the resources managed by those pointers are leaked (these can be both memory resources, so you have a memory leak, but they could also be non-memory resources, e.g. sockets, OpenGL textures, etc.).
Instead, if you have a vector of smart pointers (e.g. std::vector<std::unique_ptr<Foo>>), then if the vector's destructor is called, each pointed item (safely owned by a smart pointer) in the vector is properly deleted, calling its destructor. So, the resources associated to each item ("smartly" pointed to in the vector) are properly released.
Note that vectors of observing raw pointers are fine (assuming that the lifetime of the observed items exceeds that of the vector). The problem is with raw owning pointers.
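The exception-safety argument can be sketched like this. `Resource`, `fill_then_throw` and `live_after_exception` are hypothetical names, and the exception is thrown deliberately to simulate a failure.

```cpp
#include <memory>
#include <stdexcept>
#include <vector>

// Hypothetical resource type that counts live instances.
struct Resource {
    static int live;
    Resource() { ++live; }
    ~Resource() { --live; }
};
int Resource::live = 0;

// Fills a vector of owning smart pointers, then fails.
void fill_then_throw() {
    std::vector<std::unique_ptr<Resource>> v;
    v.push_back(std::make_unique<Resource>());
    v.push_back(std::make_unique<Resource>());
    throw std::runtime_error("simulated failure");
    // stack unwinding destroys v, and each unique_ptr deletes its Resource
}

// Returns how many Resource objects survived the exception.
int live_after_exception() {
    try {
        fill_then_throw();
    } catch (const std::runtime_error&) {
        // swallowed on purpose for this sketch
    }
    return Resource::live;
}
```

With `std::vector<Resource*>` in place of the smart-pointer vector, the same throw would leave both objects allocated.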
I will talk specifically about a vector of pointers that's responsible for managing the lifetime of the pointed objects, because that's the only case where a vector of pointers is clearly a questionable choice.
There are much better alternatives. Specifically:
std::vector<std::shared_ptr<foo>> v;
and
std::vector<std::unique_ptr<foo>> v;
and
boost::ptr_vector<foo> v; // www.boost.org
The above versions tell the user how the lifetime of the objects is taken care of. Using raw pointers instead can possibly lead to the pointers being deleted either more or less than once, especially if the code is modified over time, or if exceptions become involved.
If you use an interface like shared_ptr or unique_ptr, this self-documents the lifetime management for the user. When you use raw pointers, you have to clearly document how you handle the lifetime management of the objects, and hope that the right people read the documentation at the right time.
The benefits of using vectors of raw pointers are that you have more flexibility in how you take care of lifetime management, and that you can possibly get rid of some performance and space overhead.
There is absolutely no problem in using a vector of pointers. Most here are suggesting smart pointers, but I just have to say, there is no problem with using a vector of pointers without smart pointers. I do it all the time.
I agree with juanchopanza that the problem in your example is that the pointers come from new foo(). In a normal, completely valid use case, you might have the objects in some other collection C, so that the objects will automatically get destroyed when C is destroyed. Then, in the process of doing in-depth operations on the objects in C, you might create any number of other collections containing pointers to the objects in C. (If the other collections used object copies, that would waste time and memory, while collections of references are expressly forbidden.) In this use case, we never want to destroy any objects when a collection of pointers is destroyed.
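A minimal sketch of that use case, assuming a hypothetical `Employee` type and a hypothetical helper `sorted_by_age`: the owning collection holds the objects by value, and a second vector of raw pointers provides a differently ordered view without copying anything.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical element type for the owning collection.
struct Employee {
    std::string name;
    int age;
};

// Builds a non-owning view of `owner`, ordered by age. The returned
// pointers stay valid only while `owner` is alive and not reallocated.
std::vector<const Employee*> sorted_by_age(const std::vector<Employee>& owner) {
    std::vector<const Employee*> view;
    view.reserve(owner.size());
    for (const Employee& e : owner) {
        view.push_back(&e);
    }
    std::sort(view.begin(), view.end(),
              [](const Employee* a, const Employee* b) { return a->age < b->age; });
    return view;
}
```

Destroying the view frees nothing, which is exactly what is wanted here: the owning vector remains solely responsible for the objects.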
From what I have read, a common approach to making a vector of pointers that owns the pointed-to objects (of type MyObject, say, for simple uses) is vector<unique_ptr<MyObject>>.
But each access to an element then goes through unique_ptr::get(), which adds a little overhead.
Why isn't a vector of pointers with a "custom deleter", if such a thing exists (I haven't used allocators), more standard? That is, a smart vector instead of a vector of smart pointers. It would eliminate the little overhead of using unique_ptr::get().
Something like vector<MyObject*, delete_on_destroy_allocator<MyObject>> or unique_vector<MyObject>.
The vector would take on the "delete the pointer on destruction" behaviour instead of duplicating it in each unique_ptr. Is there a reason this isn't done, or is the overhead just negligible?
Why isn't vector of pointer with "custom deleter", if such a thing exists
Because such a thing doesn't exist and cannot exist.
The allocator supplied to a container exists to allocate memory for the container and (optionally) creates/destroys the objects in that container. A vector<T*> is a container of pointers; therefore, the allocator allocates memory for the pointer and (optionally) creates/destroys the pointers. It is not responsible for the content of the pointer: the object it points to. That is the domain of the user to provide and manage.
If an allocator takes responsibility for destroying the object being pointed to, then it must logically also have responsibility for creating the object being pointed to, yes? After all, if it didn't, and we copied such a vector<T*, owning_allocator>, each copy would expect to destroy the objects being pointed to. But since they're pointing to the same objects (copying a vector<T> copies the Ts), you get a double destroy.
Therefore, if owning_allocator::destroy is going to delete the memory, owning_allocator::construct must also create the object being pointed to.
So... what does this do:
vector<T*, owning_allocator> vec;
vec.push_back(new T());
See the problem? allocator::construct cannot decide when to create a T and when not to. It doesn't know if it's being called because of a vector copy operation or because push_back is being called with a user-created T*. All it knows is that it is being called with a T* value (technically a reference to a T*, but that's irrelevant, since it will be called with such a reference in both cases).
Therefore, either it 1) allocates a new object (initialized via a copy from the pointer it is given), or 2) it copies the pointer value. And since it cannot detect which situation is in play, it must always pick the same option. If it does #1, then the above code is a memory leak, because the vector didn't store the new T(), and nobody else deleted it. If it does #2, then you can't copy such a vector (and the story for internal vector reallocation is equally hazy).
What you want is not possible.
A vector<T> is a container of Ts, whatever T may be. It treats T as whatever it is; any meaning of this value is up to the user. And ownership semantics are part of that meaning.
T* has no ownership semantics, so vector<T*> also has no ownership semantics. unique_ptr<T> has ownership semantics, so vector<unique_ptr<T>> also has ownership semantics.
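As a small sketch of the point about element semantics (the function names are hypothetical): copying a vector<T*> copies only the pointer values, so both vectors refer to the same object, whereas a vector<unique_ptr<T>> hands its elements over when it is moved.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Copying a vector of raw pointers duplicates the pointers, not the pointees.
bool copy_shares_pointees() {
    int value = 42;  // automatic storage, so nothing here needs deleting
    std::vector<int*> original;
    original.push_back(&value);
    std::vector<int*> copy = original;
    return copy[0] == original[0];  // both point at the same int
}

// Moving a vector of unique_ptr transfers ownership of every element.
int move_transfers_ownership() {
    std::vector<std::unique_ptr<int>> first;
    first.push_back(std::make_unique<int>(7));
    std::vector<std::unique_ptr<int>> second = std::move(first);
    return *second.front();  // `second` is now the sole owner
}
```

If both raw-pointer vectors believed they owned the pointee, each would try to destroy it, which is the double destroy described above.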
This is why Boost has ptr_vector<T>, which is explicitly a vector-style class that specifically contains pointers to Ts. It has a slightly modified interface because of this; if you hand it a T*, it knows it is adopting the T* and will destroy it. If you hand it a T, then it allocates a new T and copies/moves the value into the newly allocated T. This is a different container, with a different interface, and different behavior; therefore, it merits a different type from vector<T*>.
Neither a vector of unique_ptr's nor a vector of plain pointers is the preferred way to store data. In your example, std::vector<MyObject> is usually just fine, and if you know the size at compile time, try std::array<MyObject, N>.
If you absolutely need indirect references, you can also consider std::vector<std::reference_wrapper<MyObject>>. Read about reference wrappers here.
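A minimal sketch of the reference_wrapper option, assuming a hypothetical `MyObject` with a single `value` field and a hypothetical helper `bump_all`: the vector refers to objects that live elsewhere and never owns them.

```cpp
#include <functional>
#include <vector>

// Hypothetical stand-in for MyObject.
struct MyObject {
    int value;
};

// Increments every referenced object in place; nothing is copied or owned.
void bump_all(std::vector<std::reference_wrapper<MyObject>>& refs) {
    for (MyObject& obj : refs) {  // reference_wrapper converts to MyObject&
        ++obj.value;
    }
}
```

A caller would fill the vector with `refs.push_back(std::ref(some_object))`; as with observing raw pointers, the referenced objects must outlive the vector.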
Having said that... if you:
need to store your vector somewhere other than your actual data, or
have MyObjects that are very large or expensive to move, or
have MyObjects whose construction or destruction has real-world side effects that you want to avoid;
and, additionally, you want each MyObject to be freed once the vector that refers to it is gone, then the vector of unique pointers is relevant.
Now, pointers are just a plain and simple data type inherited from the C language; they don't have custom deleters or custom anything... but std::unique_ptr does support custom deleters. Also, it may be the case that you have more complex resource management needs, for which it doesn't make sense to have each element manage its own allocation and de-allocation, in which case a "smart" vector class may be relevant.
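Here is a rough sketch of a custom deleter on std::unique_ptr. `Handle`, `CountingDeleter`, `HandlePtr` and `release_count_after_scope` are hypothetical names; the deleter merely counts releases so that its effect is observable.

```cpp
#include <memory>
#include <vector>

// Hypothetical resource type.
struct Handle {
    int id;
};

// Hypothetical deleter: records each release, then frees the object.
struct CountingDeleter {
    int* released;  // counter owned by the caller
    void operator()(Handle* h) const {
        ++*released;
        delete h;
    }
};

using HandlePtr = std::unique_ptr<Handle, CountingDeleter>;

// Returns how many times the deleter ran when the vector went away.
int release_count_after_scope() {
    int released = 0;
    {
        std::vector<HandlePtr> handles;
        handles.push_back(HandlePtr(new Handle{1}, CountingDeleter{&released}));
        handles.push_back(HandlePtr(new Handle{2}, CountingDeleter{&released}));
    }  // destroying the vector invokes the deleter once per element
    return released;
}
```

In real code the deleter would typically wrap a release function of some C API instead of counting.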
So: Different data structures fit different scenarios.
I'm starting with the assumption that, generally, it is a good idea to allocate small objects in the stack, and big objects in dynamic memory. Another assumption is that I'm possibly confused while trying to learn about memory, STL containers and smart pointers.
Consider the following example, where I have an object that is necessarily allocated in the free store through a smart pointer, and I can rely on clients getting said object from a factory, for instance. This object contains some data that is specifically allocated using an STL container, which happens to be a std::vector. In one case, this data vector itself is dynamically allocated using some smart pointer, and in the other situation I just don't use a smart pointer.
Is there any practical difference between design A and design B, described below?
Situation A:
class SomeClass{
public:
SomeClass(){ /* initialize some potentially big STL container */ }
private:
std::vector<double> dataVector_;
};
Situation B:
class SomeOtherClass{
public:
SomeOtherClass() { /* initialize some potentially big STL container,
but is it allocated in any different way? */ }
private:
std::unique_ptr<std::vector<double>> pDataVector_;
};
Some factory functions.
std::unique_ptr<SomeClass> someClassFactory(){
return std::make_unique<SomeClass>();
}
std::unique_ptr<SomeOtherClass> someOtherClassFactory(){
return std::make_unique<SomeOtherClass>();
}
Use case:
int main(){
//in my case I can reliably assume that objects themselves
//are going to always be allocated in dynamic memory
auto pSomeClassObject(someClassFactory());
auto pSomeOtherClassObject(someOtherClassFactory());
return 0;
}
I would expect that both design choices have the same outcome, but do they?
Is there any advantage or disadvantage for choosing A or B? Specifically, should I generally choose design A because it's simpler or are there more considerations? Is B morally wrong because it can dangle for a std::vector?
tl;dr : Is it wrong to have a smart pointer pointing to a STL container?
edit:
The related answers pointed to useful additional information for someone as confused as myself.
Usage of objects or pointers to objects as class members and memory allocation
and Class members that are objects - Pointers or not? C++
And changing some Google keywords led me to When vectors are allocated, do they use memory on the heap or the stack?
std::unique_ptr<std::vector<double>> is slower, takes more memory, and its only advantage is that it has an additional possible state: "vector doesn't exist". However, if you care about that state, use boost::optional<std::vector<double>> instead. You should almost never have a heap-allocated container, and you should definitely never hold one in a unique_ptr. It actually works fine, with no "dangling"; it's just pointlessly slow.
Using std::unique_ptr here is just wasteful unless your goal is a compiler firewall (basically hiding the compile-time dependency to vector, but then you'd need a forward declaration to standard containers).
You're adding an indirection but, more importantly, the full contents of SomeClass turns into 3 separate memory blocks to load when accessing the contents (SomeClass merged with/containing unique_ptr's block pointing to std::vector's block pointing to its element array). In addition you're paying one extra superfluous level of heap overhead.
Now you might start imagining scenarios where an indirection is helpful to the vector, like maybe you can shallow move/swap the unique_ptrs between two SomeClass instances. Yes, but vector already provides that without a unique_ptr wrapper on top. And it already has states like empty that you can reuse for some concept of validity/nilness.
Remember that variable-sized containers themselves are small objects, not big ones, pointing to potentially big blocks. vector isn't big, its dynamic contents can be. The idea of adding indirections for big objects isn't a bad rule of thumb, but vector is not a big object. With move semantics in place, it's worth thinking of it more like a little memory block pointing to a big one that can be shallow copied and swapped cheaply. Before move semantics, there were more reasons to think of something like std::vector as one indivisibly large object (though its contents were always swappable), but now it's worth thinking of it more like a little handle pointing to big, dynamic contents.
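The "little handle pointing to big contents" view can be checked with a short sketch. `move_keeps_buffer` is a hypothetical name; the comparison relies on move construction handing over the existing buffer, which is what mainstream standard library implementations do.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Moves a large vector and reports whether the new owner uses the very same
// element buffer, i.e. whether only the small handle changed hands.
bool move_keeps_buffer() {
    const std::size_t count = 1000000;
    std::vector<double> big(count, 1.0);
    const double* before = big.data();
    std::vector<double> moved = std::move(big);  // no element is copied
    return moved.data() == before && moved.size() == count;
}
```

The vector object itself is typically only a few pointers in size, so wrapping it in a unique_ptr to make it cheap to pass around buys nothing.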
Some common reasons to introduce an indirection through something like unique_ptr are:
Abstraction & hiding. If you're trying to abstract or hide the concrete definition of some type/subtype, Foo, then this is where you need the indirection so that its handle can be captured (or potentially even used with abstraction) by those who don't know exactly what Foo is.
To allow a big, contiguous 1-block-type object to be passed around from owner to owner without invoking a copy or invalidating references/pointers (iterators included) to it or its contents.
A hasty kind of reason that's wasteful but sometimes useful in a deadline rush is to simply introduce a validity/null state to something that doesn't inherently have it.
Occasionally it's useful as an optimization to hoist out certain less frequently-accessed, larger members of an object so that its commonly-accessed elements fit more snugly (and perhaps with adjacent objects) in a cache line. There unique_ptr can let you split apart that object's memory layout while still conforming to RAII.
Now wrapping a shared_ptr on top of a standard container might have more legitimate applications if you have a container that can actually be owned (sensibly) by more than one owner. With unique_ptr, only one owner can possess the object at a time, and standard containers already let you swap and move each other's internal guts (the big, dynamic parts). So there's very little reason I can think of to wrap a standard container directly with a unique_ptr, as it's already somewhat like a smart pointer to a dynamic array (but with more functionality to work with that dynamic data, including deep copying it if desired).
And if we talk about non-standard containers, like say you're working with a third party library that provides some data structures whose contents can get very large but they fail to provide those cheap, non-invalidating move/swap semantics, then you might superficially wrap it around a unique_ptr, exchanging some creation/access/destruction overhead to get those cheap move/swap semantics back as a workaround. For the standard containers, no such workaround is needed.
I agree with @MooingDuck; I don't think using std::unique_ptr has any compelling advantages. However, I could see a use case for std::shared_ptr if the member data is very large and the class is going to support COW (copy-on-write) semantics (or any other use case where the data is shared across multiple instances).
I want a vector to hold pointers to some objects that it will own.
Here is the vector:
private:
std::vector<fppVirtual*> m_fapps;
I have created elements like this:
m_fapps.push_back(new fpp1(renderingEngine)); //fpp* are subclasses of fppVirtual
m_fapps.push_back(new fpp2(renderingEngine));
m_fapps.push_back(new fpp3(renderingEngine));
As m_fapps is a vector instance variable in another class, I want to make sure that class's destructor properly cleans up m_fapps:
for (int i=0, size=m_fapps.size();i<size;++i){
delete m_fapps[i];
}
Is this an acceptable memory management technique? I assume this loop is needed because, when the vector goes out of scope as its owning class is destructed, only the pointers to these new objects will be removed, right?
This works (with a few caveats) but is not considered idiomatic C++, for good reason.
You should strongly consider using a vector of smart pointers (or a smart vector like boost::ptr_vector) instead, in order to avoid having to do manual memory management.
This would also give you exception safety for free, and would also avoid nasty ownership issues that occur if your outer class is copyable.
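A minimal sketch of the smart-pointer version, with hypothetical names throughout: `App` stands in for fppVirtual, `AppOne` and `AppTwo` for its subclasses, and `Host` for the class that owns the vector.

```cpp
#include <memory>
#include <vector>

// Hypothetical base class, standing in for fppVirtual.
struct App {
    virtual ~App() = default;  // needed to delete derived objects via App*
    virtual int id() const = 0;
};
struct AppOne : App {
    int id() const override { return 1; }
};
struct AppTwo : App {
    int id() const override { return 2; }
};

// Hypothetical owning class: no destructor and no delete loop required.
class Host {
public:
    Host() {
        apps_.push_back(std::make_unique<AppOne>());
        apps_.push_back(std::make_unique<AppTwo>());
    }
    int sum_ids() const {
        int sum = 0;
        for (const auto& app : apps_) {
            sum += app->id();
        }
        return sum;
    }

private:
    std::vector<std::unique_ptr<App>> apps_;
};
```

Because unique_ptr is not copyable, `Host` also becomes non-copyable by default, which sidesteps the ownership problem a copyable class holding raw pointers would have.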
As no one has given you a straightforward answer yet: yes, it is acceptable, and given this declaration of the vector it is the only way to free this memory.
This can and should be avoided by using smart pointers, as @OliCharlesworth suggested, or by using some other container, as pointed out by @BjörnPollex.
You should use boost::ptr_vector instead. The interface is the same, but it handles memory management for you. See this question for some guidelines about whether to use ptr_vector or vector<shared_ptr<>>.
Just wondering what you think is the best practice regarding vectors in C++.
If I have a class containing a vector member variable.
When should this vector be declared a:
1. "Whole-object" vector member variable containing values, i.e. vector<MyClass> my_vector;
2. Pointer to a vector, i.e. vector<MyClass>* my_vector;
3. Vector of pointers, i.e. vector<MyClass*> my_vector;
4. Pointer to vector of pointers, i.e. vector<MyClass*>* my_vector;
I have a specific example in one of my classes where I have currently declared a vector as case 4, i.e. vector<AnotherClass*>* my_vector;
where AnotherClass is another of the classes I have created.
Then, in the initialization list of my constructor, I create the vector using new:
MyClass::MyClass()
: my_vector(new vector<AnotherClass*>())
{}
In my destructor I do the following:
MyClass::~MyClass()
{
for (int i=my_vector->size(); i>0; i--)
{
delete my_vector->at(i-1);
}
delete my_vector;
}
The elements of the vectors are added in one of the methods of my class.
I cannot know how many objects will be added to my vector in advance. That is decided when the code executes, based on parsing an xml-file.
Is this good practice? Or should the vector instead be declared as one of the other cases 1, 2 or 3 ?
When to use which case?
I know the elements of a vector should be pointers if they are subclasses of another class (polymorphism). But should pointers be used in any other cases ?
Thank you very much!!
Usually solution 1 is what you want since it’s the simplest in C++: you don’t have to take care of managing the memory, C++ does all that for you (for example you wouldn’t need to provide any destructor then).
There are specific cases where this doesn’t work (most notably when working with polymorphic objects) but in general this is the only good way.
Even when working with polymorphic objects or when you need heap-allocated objects (for whatever reason), raw pointers are almost never a good idea. Instead, use a smart pointer or a container of smart pointers. Modern C++ compilers provide shared_ptr from the upcoming C++ standard. If you’re using a compiler that doesn’t yet have that, you can use the implementation from Boost.
Definitely the first!
You use vector for its automatic memory management. Using a raw pointer to a vector means you don't get automatic memory management anymore, which does not make sense.
As for the value type: all containers basically assume value-like semantics. Again, you'd have to do memory management when using pointers, and it's vector's purpose to do that for you. This is also described in item 79 from the book C++ Coding Standards. If you need to use shared ownership or "weak" links, use the appropriate smart pointer instead.
Deleting all elements in a vector manually is an anti-pattern and violates the RAII idiom in C++. So if you have to store pointers to objects in a vector, better use a 'smart pointer' (for example boost::shared_ptr) to facilitate resource destructions. boost::shared_ptr for example calls delete automatically when the last reference to an object is destroyed.
There is also no need to allocate MyClass::my_vector using new. A simple solution would be:
class MyClass {
std::vector<whatever> m_vector;
};
Assuming whatever is a smart pointer type, there is no extra work to be done. That's it, all resources are automatically destroyed when the lifetime of a MyClass instance ends.
In many cases you can even use a plain std::vector<MyClass> - that's when the objects in the vector are safe to copy.
In your example, the vector is created when the object is created, and it is destroyed when the object is destroyed. This is exactly the behavior you get when making the vector a normal member of the class.
Also, in your current approach, you will run into problems when making copies of your object. By default, a pointer would result in a flat copy, meaning all copies of the object would share the same vector. This is the reason why, if you manually manage resources, you usually need The Big Three.
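For contrast, here is a sketch of the value-member version. `ByValue` and `copies_are_independent` are hypothetical names; with a plain vector member the compiler-generated copy is a deep copy, so no hand-written Big Three is needed.

```cpp
#include <vector>

// Hypothetical class holding its vector as an ordinary member.
struct ByValue {
    std::vector<int> items;
};

// Copies the object, modifies the copy, and checks the original is unchanged.
bool copies_are_independent() {
    ByValue a;
    a.items = {1, 2, 3};
    ByValue b = a;  // implicitly generated copy constructor copies the vector
    b.items.push_back(4);
    return a.items.size() == 3 && b.items.size() == 4;
}
```

With a `std::vector<int>*` member instead, the same copy would duplicate only the pointer, and both objects would end up sharing (and later deleting) one vector.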
A vector of pointers is useful in cases of polymorphic objects, but there are alternatives you should consider:
If the vector owns the objects (that means their lifetime is bounded by that of the vector), you could use a boost::ptr_vector.
If the objects are not owned by the vector, you could either use a vector of boost::shared_ptr, or a vector of boost::ref.
A pointer to a vector is very rarely useful - a vector is cheap to construct and destruct.
For elements in the vector, there's no correct answer. How often does the vector change? How much does it cost to copy-construct the elements in the vector? Do other containers have references or pointers to the vector elements?
As a rule of thumb, I'd go with no pointers until you see or measure that the copying of your classes is expensive. And of course the case you mentioned, where you store various subclasses of a base class in the vector, will require pointers.
A reference counting smart pointer like boost::shared_ptr will likely be the best choice if your design would otherwise require you to use pointers as vector elements.
Complex answer : it depends.
If your vector is shared or has a lifecycle different from the class which embeds it, it might be better to keep it as a pointer.
If the objects you're referencing have no (or have expensive) copy constructors, then it's better to keep a vector of pointers. On the contrary, if your objects use shallow copy, using a vector of objects prevents you from leaking...
How is the destructor for the vector managed when adding elements to this list? Is the object destroyed correctly when it goes out of scope? Are there cases where it would not delete the object correctly? For example what are the consequences if "table" was a child of object, and we added a new table to a vector of object pointers?
vector<object*> _objectList;
_objectList.push_back(new object);
Since you're making a vector of "bare" pointers, C++ can't possibly know that the pointers in question are meant to have "ownership" of the objects they point to, and so it will not call those objects' destructors when the pointer goes away. You should use a simple "smart" pointer instead of a "bare" pointer as the vector's item. For example, Boost's shared_ptr would be perfectly adequate for the task (although you can surely do it with "cheaper", lighter-weight approaches, if you don't want to deal with Boost as a whole and have no other need for smart pointers in your code).
Edit: since you (the OP) say that using a framework such as Boost is not feasible, and a couple comments usefully point out that even wrapping std::auto_ptr doesn't really qualify as a decent shortcut, you may have to implement your own smart pointers (or, if you find an open-source, stand-alone smart pointer template class that looks usable, audit it for compliance with your requirements). This article is a useful primer to smart pointers in C++, whether you have to roll your own or audit an existing implementation.
You could use Boost's ptr_vector. It will automatically destruct the objects that the items point to when they are either deleted or the instance of ptr_vector goes out of scope. More info is available here.
In your case, the object pointers are destroyed properly, but the actual objects themselves won't be touched. The STL properly destructs all contained elements - but will not implicitly dereference pointers to types.
STL Vectors make a copy of whatever you put in there, and ultimately delete that copy.
So in this case, the vector is storing a pointer to an object - not the object itself. So it makes a copy of the pointer, and deletes the pointer. But, as Chris said, the object itself will not be deleted.
So, solutions:
If you don't really need to use pointers, then don't:
vector<object> _objectList;
_objectList.push_back(object());
If you do need to use pointers, you can either use a smart pointer (which handles reference counting for you, and will delete the object along with the pointer), as Alex suggested, or use a ptr_vector, as Igor mentioned.