I am having some difficulty understanding how containers are implemented in C++. Specifically, how can I deal with data allocated on the stack vs data allocated on the heap. For instance:
vector<int> VectorA;
VectorA.push_back (1);
VectorA.push_back (2);
VectorA.push_back (3);
vector<int*> VectorB;
VectorB.push_back (new int (1));
VectorB.push_back (new int (2));
VectorB.push_back (new int (3));
How does one deal with making sure the integers in VectorB are deleted properly. I remember reading somewhere that std::vector only calls the destructor and doesn't actually delete anything. Also, if I wanted to implement my own LinkedList class, how would I deal with this particular problem?
The ideal way to deal with the problem is by using Smart Pointers as elements of your container instead of raw pointers.
Otherwise you have to manually do the memory management yourself.
The objects stored in the vector are stored on the heap anyway (in space allocated by the vector).
There is very little advantage in allocating the objects separately (option B), unless you intend to manage them somehow outside of the vector. In that case the vector can just destroy the pointers it stores, and trust the real owner to destroy the objects themselves.
The short answer is that most don't deal with such things at all. The standard containers (e.g., std::vector) don't store the original object at all -- they store a copy of the object. They manage the storage for their own copy, and it's up to you to manage the storage for the original.
In the case of storing pointers, you're basically taking responsibility for managing the storage pointed to by the pointer. The container is still making (and managing the storage for) a copy, but it's just a copy of the pointer, not the object to which it refers. That's up to you to deal with.
Related
Recently I've met with opinion that I shouldn't use vector of pointers.
I wanted to know - why I cant?
For example if I have a class foo it is possible to do this:
vector <foo*> v;
v.push_back(new foo());
I've already seen some people down voting such practices, why is that?
Storing plain pointers in a container can lead to memory leaks and dangling pointers. Storing a pointer in a container does not define any kind of ownership of the pointer. Thus the container does not know the semantics of desctruction and copy operations. When the elements are being removed from the container the container is not aware how to properly destroy them, when a copy operation is performend no ownership semanctics are known. Of course, you can always handle these things by yourself, but then still a chance of human error is possible.
Using smart pointers leaves the ownership and destruction semantics up to them.
Another thing to mention is that containers are divided into non-intrusive and intrusive contaiers - they store the actual provided object instead of a copy so it actually comes down to a collection of pointers. Non intrusive pointers have some advantages, so you can't generalize that pointers in a container is something that should be avoided in all times, still in most cases it is recommended.
Using an vector of raw pointers is not necessary bad style, as long as you remember that the pointers do not have ownership semantics. When you start using new and delete, it usually means that you're doing something wrong.
In particular, the only cases where you should use new or delete in modern C++ code is when constructing unique_ptr's, or constructing shared_ptr's with custom deleters.
For example, assume that we have an class that implemented an bidirectional Graph, a Graph contains some amount of Vertexes.
class Vertex
{
public:
Vertex();
// raw pointer. No ownership
std::vector<Vertex *> edges;
}
class Graph
{
public:
Graph() {};
void addNode()
{
vertexes.push_back(new Vertex); // in C++14: prefer std::make_unique<>
}
// not shown: our Graph class implements a method to traverse over it's nodes
private:
// unique_ptr. Explicit ownership
std::vector<std::unique_ptr<Vertex>> vertexes;
}
void connect(Vertex *a, Vertex *b)
{
a->edges.push_back(b);
b->edges.push_back(a);
}
Notice how i have an vector of raw Vertex * in that Vertex class? I can do that because the lifetime of the Vertexes that it points to are managed by the class Graph. The ownership of my Vertex class is explicit from just looking at the code.
An different answer suggests using shared_ptr's. I personally dislike that approach because shared pointers, in general, make it very hard to reason about the lifetime of objects. In this particular example, shared pointers would not have worked at all because of the circular references between the Vertexes.
Because the vector's destructor won't call delete on the pointers, so it's easy to accidentally leak memory. A vector's destructor calls the destructors of all the elements in the vector, but raw pointers don't have destructors.
However, you can use a vector of smart pointers to ensure that destroying the vector will free the objects in it. vector<unique_ptr<foo>> can be used in C++11, and in C++98 with TR1 you can use vector<tr1::shared_ptr<foo>> (though shared_ptr has a slight overhead compared to a raw pointer or unique_ptr).
Boost also has a pointer container library, where the special delete-on-destruction behavior is built into the container itself so you don't need smart pointers.
One of the problems is exception-safety.
For example, suppose that somewhere an exception is thrown: in this case, the destructor of std::vector is called. But this destructor call does not delete the raw owning pointers stored in the vector. So, the resources managed by those pointers are leaked (these can be both memory resources, so you have a memory leak, but they could also be non-memory resources, e.g. sockets, OpenGL textures, etc.).
Instead, if you have a vector of smart pointers (e.g. std::vector<std::unique_ptr<Foo>>), then if the vector's destructor is called, each pointed item (safely owned by a smart pointer) in the vector is properly deleted, calling its destructor. So, the resources associated to each item ("smartly" pointed to in the vector) are properly released.
Note that vectors of observing raw pointers are fine (assuming that the lifetime of the observed items exceeeds that of the vector). The problem is with raw owning pointers.
I will talk specifically about a vector of pointers that's responsible for managing the lifetime of the pointed objects, because that's the only case where a vector of pointers is clearly a questionable choice.
There are much better alternatives. Specifically:
std::vector<std::shared_ptr<foo>> v;
and
std::vector<std::unique_ptr<foo>> v;
and
boost::ptr_vector<foo> v; // www.boost.org
The above versions tell the user how the lifetime of the objects is taken care of. Using raw pointers instead can possibly lead to the pointers being deleted either more or less than once, especially if the code is modified over time, or if exceptions become involved.
If you use an interface like ´shared_ptr´ or ´unique_ptr´, this self-documents the lifetime management for the user. When you use raw pointers you have to be clearly document how you handle the lifetime management of the objects, and hope that the right people read the documentation at the right time.
The benefits of using vectors of raw pointers are that you have more flexibility in how taking care of lifetime management, and that you can possibly get rid of some performance and space overhead.
There is absolutely no problem in using a vector of pointers. Most here are suggesting smart pointers, but I just have to say, there is no problem with using a vector of pointers without smart pointers. I do it all the time.
I agree with juanchopanza that the problem is your example is the pointers come from new foo(). In a normal completely-valid use case, you might have the objects in some other collection C, so that the objects will automatically get destroyed when C is destroyed. Then, in doing in the process of doing in-depth operations on the objects in C, you might create any number of other collections containing pointers to the objects in C. (If the other collections used object copies that would be time and memory wasteful, while collections of references is expressly forbidden.) In this use case, we never want to destroy any objects when a collection of pointers is destroyed.
What is the difference in memory usage between:
std::vector<X*> vec
where each element is on the heap, but the vector itself isn't
and
std::vector<X>* vec
where the vector is declared on the heap, but each element is (on the stack?).
The second option doesn't make much sense- does it mean the vector pointer is on the heap, but it points back at each element, which are on the stack??
std::vector<X*> vec
Is an array of pointers of the class X. This is useful, for example, when making an array of non-copyable classes/objects like std::fstream in C++98. So
std::vector<std::fstream> vec;
is WRONG, and won't work. But
std::vector<std::fstream*> vec;
works, while you have to create a new object for each element, so for example if you want 5 fstream elements you have to write something like
vec.resize(5);
for(unsigned long i = 0; i < vec.size(); i++)
{
vec[i] = new std::fstream;
}
Of course, there are many other uses depending on your application.
Now the second case is a pointer of the vector itself. So:
vector<int>* vec;
is just a pointer! it doesn't carry any information, and you can't use it unless you create the object for the vector itself, like
vec = new vector<int>();
and eventually you may use it as:
vec->resize(5);
Now this is not really useful, since vectors anyway store their data on the heap and manage the memory they carry. So use it only if you have a good reason to do it, and sometimes you would need it. I don't have any example in mind on how it could be useful.
If this is what you really asked:
vector<X>* vec = new vector<X>();
it means that the whole vector with all its elements is on the heap. The elements occupy a contiguous memory block on the heap.
The difference is where (and what) you need to do for manual memory management.
Whenever you have a raw C-style pointer in C++, you need to do some manual memory management -- the raw pointer can point at anything, and the compiler won't do any automatic construction or destruction for you. So you need to be aware of where the pointer points and who 'owns' the memory pointed at in the rest of your code.
So when you have
std::vector<X*> vec;
you don't need to worry about memory management for the vector itself (the compiler will do it for you), but you do need to worry about the memory management of the pointed at X objects for the pointers you put in the vector. If you're allocating them with new, you need to make sure to manually delete them at some point.
When you have
std::vector<X> *vec;
You DO need to worry about memory management for the vector itself, but you DON'T need to worry about memory management for the individual elements.
Simplest is if you have:
std::vector<X> vec;
then you don't need to worry about memory management at all -- the compiler will take care of it for you.
In code using good modern C++ style, none of the above is true.
std::vector<X*> is a collection of handles to objects of type X or any of its subclasses, which you do not own. The owner knows how they were allocated and will deallocate them -- you don't know and don't care.
std::vector<X>* would in practice, only ever be used as a function argument which represents a vector you do not own (the caller does) but which you are going to modify. According to one common approach, the fact that it's a pointer rather than a vector means that it is optional. Much more rarely it might be used as a class member where the lifetime of the attached vector is known to outlive the class pointing to it.
std::vector<std::unique_ptr<X>> is a polymorphic collection of mixed objects of various subclasses of X (and maybe X itself directly). Occasionally you might use it non-polymorphically if X is expensive to move, but modern style makes most types cheap to move.
Prior to C++11, std::vector<some_smart_pointer<X> > (yes, there's a space between the closing brackets) would be used for both the polymorphic case and the non-copyable case. Note that some_smart_pointer isn't std::unique_ptr, which didn't exist yet, not std::auto_ptr, which wasn't usable in collections. boost::unique_ptr was a good choice. With C++11, the copyability requirement for collection elements is relaxed to moveability, so this reason completely went away. (There remain some types which are neither copyable nor moveable, such as the ScopeGuard pattern, but these should not be stored in a collection anyway)
What is better -
std::vector<MyClass*>
or
std::vector<MyClass>
?
I mean, will the second option store objects in heap anyway?
Which one is faster and cleaner?
std::vector<MyClass> shall be preferable in most cases. Yes, it will store objects in heap (dynamic storage) by default std::allocator.
The advantages are that objects are automatically destroyed on vector destruction and are allocated in single contiguous memory block reducing heap fragmentation. So this way it's cleaner.
Also this way is faster because it could minimize memory allocation operations. Vector will preallocate storage before constructing objects, so for N objects it will be M allocation operations and N constructor call, N > M (the bigger is N - the greater is difference). If you create objects manually and place them to vector by pointers it will lead to M allocations and N constructions, M = N + X, where X is vector storage allocations. And you always can minimize vector memory allocations if you know number of stored objects - using std::vector::reserve().
On the contrary using std::vector of pointers will require you to destroy objects manually, e.g. calling delete for dynamically allocated objects. This is not recommended. Such containers shall be used only as non-owning. Objects ownership shall be maintained externally in this case.
Yes, the second version will store the objects on the heap too, but in a more compact form, namely an array of MyClass.
In the first form you must allocate and deallocate your objects, while std::vector will do this for you in the second version.
So, as always, it depends on your needs and requirements. If you can choose, take the second form:
std::vector<MyClass>
it's much easier to maintain.
Depends. Storing copies of your class in the container is easier to work with, of course, but you need to run the copy constructor every time to save an instance to the container which may be problematical. Also, your container cannot store anything but that one class - in particular it cannot store other classes that are specialised ( inherited ) from your base class. So, in general, you usually end up having to store pointers to the class.
There are snags with storing pointers. However, to get the best of both worlds, consider using boost::ptr_vector instead, which has the advantages of smart pointers without the overhead.
std::vector<MyClass> will store objects on the heap (probably), std::vector<MyClass*> will store your pointers on the heap (probably) and your objects wherever they were created. Use std::vector<MyClass> whenever applicable, if you need to support inheritance std::vector<std::unique_ptr<MyClass>> again when applicable. If your vector isn't supposed to own the objects an std::vector<MyClass*> can be useful, but that's rarely the case.
I am writing an application using openFrameworks, but my question is not specific to just oF; rather, it is a general question about C++ vectors in general.
I wanted to create a class that contains multiple instances of another class, but also provides an intuitive interface for interacting with those objects. Internally, my class used a vector of the class, but when I tried to manipulate an object using vector.at(), the program would compile but not work properly (in my case, it would not display a video).
// instantiate object dynamically, do something, then append to vector
vector<ofVideoPlayer> videos;
ofVideoPlayer *video = new ofVideoPlayer;
video->loadMovie(filename);
videos.push_back(*video);
// access object in vector and do something; compiles but does not work properly
// without going into specific openFrameworks details, the problem was that the video would
// not draw to screen
videos.at(0)->draw();
Somewhere, it was suggested that I make a vector of pointers to objects of that class instead of a vector of those objects themselves. I implemented this and indeed it worked like a charm.
vector<ofVideoPlayer*> videos;
ofVideoPlayer * video = new ofVideoPlayer;
video->loadMovie(filename);
videos.push_back(video);
// now dereference pointer to object and call draw
videos.at(0)->draw();
I was allocating memory for the objects dynamically, i.e. ofVideoPlayer = new ofVideoPlayer;
My question is simple: why did using a vector of pointers work, and when would you create a vector of objects versus a vector of pointers to those objects?
What you have to know about vectors in c++ is that they have to use the copy operator of the class of your objects to be able to enter them into the vector. If you had memory allocation in these objects that was automatically deallocated when the destructor was called, that could explain your problems: your object was copied into the vector then destroyed.
If you have, in your object class, a pointer that points towards a buffer allocated, a copy of this object will point towards the same buffer (if you use the default copy operator). If the destructor deallocates the buffer, when the copy destructor will be called, the original buffer will be deallocated, therefore your data won't be available anymore.
This problem doesn't happen if you use pointers, because you control the life of your elements via new/destroy, and the vector functions only copy pointer towards your elements.
My question is simple: why did using a
vector of pointers work, and when
would you create a vector of objects
versus a vector of pointers to those
objects?
std::vector is like a raw array allocated with new and reallocated when you try to push in more elements than its current size.
So, if it contains A pointers, it's like if you were manipulating an array of A*.
When it needs to resize (you push_back() an element while it's already filled to its current capacity), it will create another A* array and copy in the array of A* from the previous vector.
If it contains A objects, then it's like you were manipulating an array of A, so A should be default-constructible if there are automatic reallocations occuring. In this case, the whole A objects get copied too in another array.
See the difference? The A objects in std::vector<A> can change address if you do some manipulations that requires the resizing of the internal array. That's where most problems with containing objects in std::vector comes from.
A way to use std::vector without having such problems is to allocate a large enough array from the start. The keyword here is "capacity". The std::vector capacity is the real size of the memory buffer in which it will put the objects. So, to setup the capacity, you have two choices:
1) size your std::vector on construction to build all the object from the start , with maximum number of objects - that will call constructors of each objects.
2) once the std::vector is constructed (but has nothing in it), use its reserve() function : the vector will then allocate a large enough buffer (you provide the maximum size of the vector). The vector will set the capacity. If you push_back() objects in this vector or resize() under the limit of the size you've provided in the reserve() call, it will never reallocate the internal buffer and your objects will not change location in memory, making pointers to those objects always valid (some assertions to check that change of capacity never occurs is an excellent practice).
If you are allocating memory for the objects using new, you are allocating it on the heap. In this case, you should use pointers. However, in C++, the convention is generally to create all objects on the stack and pass copies of those objects around instead of passing pointers to objects on the heap.
Why is this better? It is because C++ does not have garbage collection, so memory for objects on the heap will not be reclaimed unless you specifically delete the object. However, objects on the stack are always destroyed when they leave scope. If you create objects on the stack instead of the heap, you minimize your risk of memory leaks.
If you do use the stack instead of the heap, you will need to write good copy constructors and destructors. Badly written copy constructors or destructors can lead to either memory leaks or double frees.
If your objects are too large to be efficiently copied, then it is acceptable to use pointers. However, you should use reference-counting smart pointers (either the C++0x auto_ptr or one the Boost library pointers) to avoid memory leaks.
vector addition and internal housekeeping use copies of the original object - if taking a copy is very expensive or impossible, then using a pointer is preferable.
If you make the vector member a pointer, use a smart pointer to simplify your code and minimize the risk of leaks.
Maybe your class does not do proper (ie. deep) copy construction/assignment? If so, pointers would work but not object instances as the vector member.
Usually I don't store classes directly in std::vector. The reason is simple: you would not know if the class is derived or not.
E.g.:
In headers:
class base
{
public:
virtual base * clone() { new base(*this); };
virtual ~base(){};
};
class derived : public base
{
public:
virtual base * clone() { new derived(*this); };
};
void some_code(void);
void work_on_some_class( base &_arg );
In source:
void some_code(void)
{
...
derived instance;
work_on_some_class(derived instance);
...
}
void work_on_some_class( base &_arg )
{
vector<base> store;
...
store.push_back(*_arg.clone());
// Issue!
// get derived * from clone -> the size of the object would greater than size of base
}
So I prefer to use shared_ptr:
void work_on_some_class( base &_arg )
{
vector<shared_ptr<base> > store;
...
store.push_back(_arg.clone());
// no issue :)
}
The main idea of using vector is to store objects in a continue space, when using pointer or smart pointer that won't happen
Here also need to keep in mind the performance of memory usage by CPU.
std::vector vector guarantees(not sure) that the mem block is
continuous.
std::vectorstd::unique_ptr<Object> will keep smart-pointers in continuous memory, but real memory blocks for objects can be placed in different positions in RAM.
So I can guess that std::vector will be faster for cases when the size of the vector is reserved and known. However, std::vectorstd::unique_ptr<Object> will be faster if we don't know the planned size or we have plans to change the order of objects.
Just wondering what you think is the best practice regarding vectors in C++.
If I have a class containing a vector member variable.
When should this vector be declared a:
"Whole-object" vector member varaiable containing values, i.e. vector<MyClass> my_vector;
Pointer to a vector, i.e vector<MyClass>* my_vector;
Vector of pointers, i.e. vector<MyClass*> my_vector;
Pointer to vector of pointers, i.e. vector<MyClass*>* my_vector;
I have a specific example in one of my classes where I have currently declared a vector as case 4, i.e. vector<AnotherClass*>* my_vector;
where AnotherClass is another of the classes I have created.
Then, in the initialization list of my constructor, I create the vector using new:
MyClass::MyClass()
: my_vector(new vector<AnotherClass*>())
{}
In my destructor I do the following:
MyClass::~MyClass()
{
for (int i=my_vector->size(); i>0; i--)
{
delete my_vector->at(i-1);
}
delete my_vector;
}
The elements of the vectors are added in one of the methods of my class.
I cannot know how many objects will be added to my vector in advance. That is decided when the code executes, based on parsing an xml-file.
Is this good practice? Or should the vector instead be declared as one of the other cases 1, 2 or 3 ?
When to use which case?
I know the elements of a vector should be pointers if they are subclasses of another class (polymorphism). But should pointers be used in any other cases ?
Thank you very much!!
Usually solution 1 is what you want since it’s the simplest in C++: you don’t have to take care of managing the memory, C++ does all that for you (for example you wouldn’t need to provide any destructor then).
There are specific cases where this doesn’t work (most notably when working with polymorphous objects) but in general this is the only good way.
Even when working with polymorphous objects or when you need heap allocated objects (for whatever reason) raw pointers are almost never a good idea. Instead, use a smart pointer or container of smart pointers. Modern C++ compilers provide shared_ptr from the upcoming C++ standard. If you’re using a compiler that doesn’t yet have that, you can use the implementation from Boost.
Definitely the first!
You use vector for its automatic memory management. Using a raw pointer to a vector means you don't get automatic memory management anymore, which does not make sense.
As for the value type: all containers basically assume value-like semantics. Again, you'd have to do memory management when using pointers, and it's vector's purpose to do that for you. This is also described in item 79 from the book C++ Coding Standards. If you need to use shared ownership or "weak" links, use the appropriate smart pointer instead.
Deleting all elements in a vector manually is an anti-pattern and violates the RAII idiom in C++. So if you have to store pointers to objects in a vector, better use a 'smart pointer' (for example boost::shared_ptr) to facilitate resource destructions. boost::shared_ptr for example calls delete automatically when the last reference to an object is destroyed.
There is also no need to allocate MyClass::my_vector using new. A simple solution would be:
class MyClass {
std::vector<whatever> m_vector;
};
Assuming whatever is a smart pointer type, there is no extra work to be done. That's it, all resources are automatically destroyed when the lifetime of a MyClass instance ends.
In many cases you can even use a plain std::vector<MyClass> - that's when the objects in the vector are safe to copy.
In your example, the vector is created when the object is created, and it is destroyed when the object is destroyed. This is exactly the behavior you get when making the vector a normal member of the class.
Also, in your current approach, you will run into problems when making copies of your object. By default, a pointer would result in a flat copy, meaning all copies of the object would share the same vector. This is the reason why, if you manually manage resources, you usually need The Big Three.
A vector of pointers is useful in cases of polymorphic objects, but there are alternatives you should consider:
If the vector owns the objects (that means their lifetime is bounded by that of the vector), you could use a boost::ptr_vector.
If the objects are not owned by the vector, you could either use a vector of boost::shared_ptr, or a vector of boost::ref.
A pointer to a vector is very rarely useful - a vector is cheap to construct and destruct.
For elements in the vector, there's no correct answer. How often does the vector change? How much does it cost to copy-construct the elements in the vector? Do other containers have references or pointers to the vector elements?
As a rule of thumb, I'd go with no pointers until you see or measure that the copying of your classes is expensive. And of course the case you mentioned, where you store various subclasses of a base class in the vector, will require pointers.
A reference counting smart pointer like boost::shared_ptr will likely be the best choice if your design would otherwise require you to use pointers as vector elements.
Complex answer : it depends.
if your vector is shared or has a lifecycle different from the class which embeds it, it might be better to keep it as a pointer.
If the objects you're referencing have no (or have expensive) copy constructors , then it's better to keep a vector of pointer. In the contrary, if your objects use shallow copy, using vector of objects prevent you from leaking...