Is it safe to push_back 'dynamically allocated object' to vector? - c++

Whenever I need to add a dynamically allocated object to a vector, I've been doing it the following way:
class Foo { ... };
vector<Foo*> v;
v.push_back(new Foo);
// do stuff with Foo in v
// delete all Foo in v
It just worked and many others seem to do the same thing.
Today, I learned vector::push_back can throw an exception. That means the code above is not exception safe. :-( So I came up with a solution:
class Foo { ... };
vector<Foo*> v;
auto_ptr<Foo> p(new Foo);
v.push_back(p.get());
p.release();
// do stuff with Foo in v
// delete all Foo in v
But the problem is that the new way is verbose, tedious, and I see nobody's doing it. (At least not around me...)
Should I go with the new way?
Or, can I just stick with the old way?
Or, is there a better way of doing it?

If all you care about is exception-safety of this operation:
v.reserve(v.size()+1); // reserve can throw, but that doesn't matter
v.push_back(new Foo); // new can throw, that doesn't matter either.
The issue of a vector having responsibility for freeing the objects pointed to by its contents is a separate thing, I'm sure you'll get plenty of advice about that ;-)
Edit: hmm, I was going to quote the standard, but I actually can't find the necessary guarantee. What I'm looking for is that push_back will not throw unless either (a) it has to reallocate (which we know it won't because of the capacity), or (b) a constructor of T throws (which we know it won't since T is a pointer type). Sounds reasonable, but reasonable != guaranteed.
So, unless there's a beneficial answer over on this question:
Is std::vector::push_back permitted to throw for any reason other than failed reallocation or construction?
this code depends on the implementation not doing anything too "imaginative". Failing that, your solution from the question can be templated up:
template <typename T, typename Container>
void push_back_new(Container &c) {
    auto_ptr<T> p(new T);
    c.push_back(p.get());
    p.release();
}
Usage then isn't too tedious:
struct Bar : Foo { };
vector<Foo*> v;
push_back_new<Foo>(v);
push_back_new<Bar>(v);
If it's really a factory function rather than new then you could modify the template accordingly. Passing a lot of different parameter lists in different situations would be difficult, though.
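On the point about differing parameter lists: with C++11 variadic templates, the helper can forward arbitrary constructor arguments. The following is a minimal sketch, assuming a C++11 compiler. std::unique_ptr stands in for auto_ptr, and the Bar constructor taking an int is a hypothetical addition for illustration, as is the demo function.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

struct Foo {
    virtual ~Foo() = default;          // virtual so delete through Foo* is safe
    virtual int id() const { return 0; }
};

struct Bar : Foo {                     // hypothetical derived type with an argument
    explicit Bar(int n) : n_(n) {}
    int id() const override { return n_; }
    int n_;
};

// Hypothetical C++11 variant of push_back_new: constructor arguments are
// perfect-forwarded, and a unique_ptr guards the object until push_back succeeds.
template <typename T, typename Container, typename... Args>
void push_back_new(Container& c, Args&&... args) {
    std::unique_ptr<T> p(new T(std::forward<Args>(args)...));
    c.push_back(p.get());  // may throw; p still owns the object if it does
    p.release();           // the container now holds the only pointer
}

// Builds a small vector, reads it, then deletes every element by hand.
int demo_push_back_new() {
    std::vector<Foo*> v;
    push_back_new<Foo>(v);
    push_back_new<Bar>(v, 7);
    assert(v.size() == 2);
    int sum = v[0]->id() + v[1]->id();
    for (Foo* f : v) delete f;  // a raw-pointer vector still needs manual cleanup
    return sum;
}
```

The insertion itself is guarded, but the vector still does not own its elements, so the manual delete loop remains the caller's job.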

Your new way is more exception safe but there is a reason that you don't see it done anywhere else.
A vector of pointers only owns the pointers, it doesn't express ownership of the pointed-to objects. You are effectively releasing ownership to an object that doesn't "want" ownership.
Most people will use a vector of shared_ptr to express the ownership correctly or use something like boost::ptr_vector. Either of these mean that you don't have to explicitly delete the objects whose pointers you are storing which is error prone and potentially exception 'dangerous' at other points in the program.
Edit: You still have to be very careful with insertion into ptr_vector. Unfortunately, push_back taking a raw pointer provides the strong guarantee which means that either insertion succeeds or (effectively) nothing happens, so the object passed in is neither taken over nor destroyed. The version taking a smart pointer by value is defined as calling .release() before calling the strongly guaranteed version which effectively means that it can leak.
Using a vector of shared_ptr together with make_shared is much easier to use correctly.
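As a rough illustration of that last point, here is a sketch assuming C++11 (std::shared_ptr and std::make_shared rather than the Boost equivalents). The live counter on Foo and the demo function are hypothetical, added only to make the cleanup observable.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Foo {
    static int live;   // number of Foo objects currently alive (illustration only)
    Foo() { ++live; }
    ~Foo() { --live; }
};
int Foo::live = 0;

int demo_shared_vector() {
    {
        std::vector<std::shared_ptr<Foo>> v;
        // make_shared returns a temporary shared_ptr; if push_back throws,
        // that temporary is destroyed and the Foo is freed with it.
        v.push_back(std::make_shared<Foo>());
        v.push_back(std::make_shared<Foo>());
        assert(Foo::live == 2);
    }  // the vector is destroyed here and each shared_ptr deletes its Foo
    return Foo::live;
}
```

There is no explicit delete anywhere, and no window in which a raw pointer exists without an owner.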

The preferred way to do this is to use a container of smart pointers, for example, a std::vector<std::shared_ptr<Foo> > or a std::vector<std::unique_ptr<Foo> > (shared_ptr can be found in Boost and C++ TR1 as well; std::unique_ptr is effectively restricted to C++0x).
Another option is to use a container that owns dynamic objects, like those containers provided by the Boost Pointer Containers library.

How resilient to memory shortage is your program? If you really care about this you have to be prepared for new to throw as well. If you aren't going to handle that, I would not worry about jumping through the push_back hoops.
In the general case, if you run out of memory, the program already has likely insurmountable problems unless it's specifically designed to run forever in a constrained footprint (embedded systems) - in that case you do have to care about all of these cases.
If that applies to you, you could have a lengthy code review and retest cycle ahead of you. My guess is that you are OK to follow your team's practice here, though.
As others have pointed out, using vector to store raw pointers has its own problems, and there is a wealth of material on this site and in the other answers to direct you to a safer pattern.

Related

Trouble switching from vector of dumb pointers to boost::shared_ptr

Alright, this has had me stumped on and off for a couple of months now, so I'll see if anyone else can help with this.
So in my main program I have two kinds of structs (solarBody_t, ship_t) that are both derived from the same base class (physical_t). I make a vector of both objects, since I can't put a solarBody and a ship in the same vector. They are both dumb pointer vectors.
I tried putting both in the same vector, using a vector of boost::shared_ptrs. Now, to my understanding, shared_ptrs should have the same semantics and syntax as dumb pointers (e.g. foobar->mass = 42; should work for both). However, after just changing the declaration from a vector of dumb pointers to a vector of boost::shared_ptr, I get an error when I try to push_back something onto the vector of shared_ptrs.
From what I can tell, this should work. The boost docs give the example of
boost::shared_ptr<int> p(new int(2));
which is pretty much what I'm doing.
Has anyone had previous experiences with this? Or want to suggest another way to store everything in a vector?
For more details, here's the gist of it (kind of a contradiction of terms, but I made the pun, so there.)
shared_ptr will not let you implicitly construct it from a bare pointer, because its raw-pointer constructor is explicit. push_back is expecting a shared_ptr<foobar_t>, not a bare foobar_t*. You should take a look at boost::make_shared and try something like this:
entity_smart.push_back(boost::make_shared<foobar_t>(42));
make_shared has a few advantages: namely, it allocates the pointer control block and the object itself in one allocation and keeps an unmatched new out of your code. It also makes it explicitly clear that you're creating a shared_ptr to an object.
Other than that, yes, the semantics should be basically the same. Keep in mind that shared_ptr may be overkill for what you're doing, though, if you don't actually need to share ownership of the objects.
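To make the conversion issue concrete, here is a small sketch. It uses std::shared_ptr from C++11 instead of boost::shared_ptr (an assumption, made so that it needs no external library), and the layout of foobar_t is guessed from the snippets above.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <vector>

struct foobar_t {                       // assumed layout, for illustration only
    explicit foobar_t(int m) : mass(m) {}
    int mass;
};

// The raw-pointer constructor of shared_ptr is explicit, so a foobar_t*
// never converts to shared_ptr<foobar_t> implicitly.
static_assert(!std::is_convertible<foobar_t*, std::shared_ptr<foobar_t>>::value,
              "a raw pointer does not convert implicitly");
static_assert(std::is_constructible<std::shared_ptr<foobar_t>, foobar_t*>::value,
              "explicit construction is allowed");

int demo_explicit_shared() {
    std::vector<std::shared_ptr<foobar_t>> entity_smart;
    // entity_smart.push_back(new foobar_t(42));  // would not compile
    entity_smart.push_back(std::shared_ptr<foobar_t>(new foobar_t(42)));  // explicit
    entity_smart.push_back(std::make_shared<foobar_t>(43));               // usually preferred
    assert(entity_smart.size() == 2);
    entity_smart[0]->mass += 1;  // same -> syntax as a dumb pointer
    return entity_smart[0]->mass + entity_smart[1]->mass;
}
```

Once the element is a shared_ptr, the rest of the code can keep using -> exactly as before.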
Lateral thinking:
You do not actually need a shared_ptr here, what you want is a STL(-like) collection in which to store polymorphic values; the collection being the owner.
You have basically two solutions:
use a pointer-aware collection: boost::ptr_vector, from the Pointer Container library.
use a better pointer: std::unique_ptr, from the C++11 Standard
I would still advise boost::ptr_vector even if you have access to C++11 because it provides additional guarantees (null not allowed by default) and sugar coating (dereferencing an iterator gives a reference, not a pointer that you have to dereference once more).
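For the std::unique_ptr route, a minimal sketch might look like the following. The type names come from the question, but their members (kind(), mass) and the demo function are assumptions made up for illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

struct physical_t {
    virtual ~physical_t() = default;      // virtual so deletion via base pointer is safe
    virtual std::string kind() const = 0; // hypothetical member
    double mass = 0.0;
};
struct solarBody_t : physical_t {
    std::string kind() const override { return "solarBody"; }
};
struct ship_t : physical_t {
    std::string kind() const override { return "ship"; }
};

std::string demo_polymorphic_vector() {
    // One vector owns both kinds of object through the common base class.
    std::vector<std::unique_ptr<physical_t>> entities;
    entities.push_back(std::unique_ptr<physical_t>(new solarBody_t));
    entities.push_back(std::unique_ptr<physical_t>(new ship_t));
    entities[1]->mass = 42.0;
    assert(entities.size() == 2);

    std::string kinds;
    for (const auto& e : entities) {
        if (!kinds.empty()) kinds += ",";
        kinds += e->kind();   // note the extra dereference compared to ptr_vector
    }
    return kinds;
}   // entities goes out of scope and every object is deleted
```

The vector is the sole owner, so no reference counting is involved and nothing has to be deleted by hand.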

Using auto_ptr<> with array

I'm using auto_ptr<> with an array of class pointer type, and I don't know how to assign a value to it.
e.g.
auto_ptr<class*> arr[10];
How can I assign a value to the arr array?
You cannot use auto_ptr with array, because it calls delete p, not delete [] p.
You want boost::scoped_array or some other boost::smart_array :)
If you have C++0x (e.g. MSVC10, GCC >= 4.3), I'd strongly advise to use either a std::vector<T> or a std::array<T, n> as your base object type (depending on whether the size is fixed or variable), and if you allocate this guy on the heap and need to pass it around, put it in a std::shared_ptr:
typedef std::array<T, n> mybox_t;
typedef std::shared_ptr<mybox_t> mybox_p;
mybox_p makeBox() { auto bp = std::make_shared<mybox_t>(...); ...; return bp; }
Arrays and auto_ptr<> don't mix.
From the GotW site:
Every delete must match the form of its new. If you use single-object new, you must use single-object delete; if you use the array form of new, you must use the array form of delete. Doing otherwise yields undefined behaviour.
I'm not going to copy the GotW site verbatim; however, I will summarize your options to solve your problem:
1. Roll your own auto array
1a. Derive from auto_ptr. Few advantages, too difficult.
1b. Clone auto_ptr code. Easy to implement, no significant space/overhead. Hard to maintain.
2. Use the Adapter Pattern. Easy to implement, hard to use, maintain, understand. Takes more time / overhead.
3. Replace auto_ptr With Hand-Coded EH Logic. Easy to use, no significant space/time/overhead. Hard to implement, read, brittle.
4. Use a vector<> Instead Of an Array. Easy to implement, easy to read, less brittle, no significant space, time, overhead. Syntactic changes needed and sometimes usability changes.
So the bottom line is to use a vector<> instead of C-style arrays.
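A short sketch of the vector option, assuming C++11. It also shows std::unique_ptr<T[]>, which is not mentioned in the answers above but is a standard owner that calls delete[]; the demo function name is hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <vector>

int demo_array_ownership() {
    // Option 1: std::vector owns the array and frees it automatically.
    std::vector<int> values(10, 0);
    values[3] = 5;

    // Option 2 (C++11): the array form of unique_ptr calls delete[], not delete.
    std::unique_ptr<int[]> raw(new int[10]());
    raw[3] = 7;

    assert(values.size() == 10);
    return values[3] + raw[3];
}   // both arrays are released here without an explicit delete[]
```

The vector also knows its own size, which the unique_ptr<int[]> does not.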
As everyone said here, don't mix arrays with auto_ptr. Use auto_ptr only when a function has multiple return paths that make releasing memory by hand difficult, or when you receive an allocated pointer from somewhere else and are responsible for cleaning it up before exiting the function.
The other thing is that the destructor of auto_ptr calls delete on the stored pointer. What you're passing is a single element of an array, so the memory manager will try to find and free a block starting at that address. That address is probably not the start of any block the heap is tracking, so you may get undefined behavior such as a crash or memory corruption.

Why is creating STL containers dynamically considered bad practice?

Title says it.
Sample of bad practice:
std::vector<Point>* FindPoints()
{
    std::vector<Point>* result = new std::vector<Point>();
    //...
    return result;
}
What's wrong with it if I delete that vector later?
I mostly program in C#, so this problem is not very clear for me in C++ context.
As a rule of thumb, you don't do this because the less you allocate on the heap, the less you risk leaking memory. :)
std::vector is also useful because it automatically manages the memory used for the vector in RAII fashion; by allocating it on the heap you now require an explicit deallocation (with delete result) to avoid leaking its memory. Exceptions complicate matters, because they can alter your return path and skip any delete you put on the way. (In C# you don't have such problems because inaccessible memory is reclaimed periodically by the garbage collector.)
If you want to return an STL container you have several choices:
just return it by value; in theory you incur a copy penalty because of the temporaries that are created in the process of returning result, but newer compilers should be able to elide the copy using NRVO (see the note after this list). There may also be std::vector implementations that implement copy-on-write optimization like many std::string implementations do, but I've never heard about that.
On C++0x compilers, instead, the move semantics should trigger, avoiding any copy.
Store the pointer of result in an ownership-transferring smart pointer like std::auto_ptr (or std::unique_ptr in C++0x), and also change the return type of your function to std::auto_ptr<std::vector<Point > >; in that way, your pointer is always encapsulated in a stack-object, that is automatically destroyed when the function exits (in any way), and destroys the vector if its still owned by it. Also, it's completely clear who owns the returned object.
Make the result vector a parameter passed by reference by the caller, and fill that one instead of returning a new vector.
Hardcore STL option: you would instead provide your data as iterators; the client code would then use std::copy+std::back_inserter or whatever to store such data in whichever container it wants. Not seen much (it can be tricky to code right) but it's worth mentioning.
As @Steve Jessop pointed out in the comments, NRVO works completely only if the return value is used directly to initialize a variable in the calling method; otherwise, it would still be able to elide the construction of the temporary return value, but the assignment operator for the variable to which the return value is assigned could still be called (see @Steve Jessop's comments for details).
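The iterator-based option from the list above can be sketched roughly as follows. The Point layout, the placeholder data, and the demo function are all hypothetical; only the shape of the interface is the point.

```cpp
#include <cassert>
#include <iterator>
#include <vector>

struct Point { int x; int y; };   // assumed layout, for illustration only

// Hypothetical iterator-based FindPoints: results are written through an
// output iterator, so the caller decides which container receives them.
template <typename OutputIt>
OutputIt FindPoints(OutputIt out) {
    for (int i = 0; i < 3; ++i) {
        *out++ = Point{i, i * i};   // placeholder data
    }
    return out;
}

int demo_find_points() {
    std::vector<Point> points;
    FindPoints(std::back_inserter(points));   // append into a vector
    assert(points.size() == 3);
    return points[2].y;
}
```

The same function could fill a std::list or a std::deque without any change, at the cost of being a template.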
Creating anything dynamically is bad practice unless it's really necessary. There's rarely a good reason to create a container dynamically, so it's usually not a good idea.
Edit: Usually, instead of worrying about things like how fast or slow returning a container is, most of the code should deal only with an iterator (or two) into the container.
Creating objects dynamically in general is considered a bad practice in C++. What if an exception is thrown from your "//..." code? You'll never be able to delete the object. It is easier and safer to simply do:
std::vector<Point> FindPoints()
{
    std::vector<Point> result;
    //...
    return result;
}
Shorter, safer, more straightforward... As for the performance, modern compilers will optimize away the copy on return, and if they are not able to, the move constructor will be executed, so this is still a cheap operation.
Perhaps you're referring to this recent question: C++: vector<string> *args = new vector<string>(); causes SIGABRT
One liner: It's bad practice because it's a pattern that's prone to memory leaks.
You're forcing the caller to accept dynamic allocation and take charge of its lifetime. It's ambiguous from the declaration whether the pointer returned is a static buffer, a buffer owned by some other API (or object), or a buffer that's now owned by the caller. You should avoid this pattern in any language (including plain C) unless it's clear from the function name what's going on (e.g. strdup, malloc).
The usual way is to instead do this:
void FindPoints(std::vector<Point>* ret) {
    std::vector<Point> result;
    //...
    ret->swap(result);
}
void caller() {
    //...
    std::vector<Point> foo;
    FindPoints(&foo);
    // foo deletes itself
}
All objects are on the stack, and all the deletion is taken care of by the compiler. Or just return by value, if you're running a C++0x compiler+STL, or don't mind the copy.
I like Jerry Coffin's answer. Additionally, if you want to avoid returning a copy, consider passing the result container as a reference, and the swap() method may be needed sometimes.
void FindPoints(std::vector<Point> &points)
{
    std::vector<Point> result;
    //...
    result.swap(points);
}
Programming is the art of finding good compromises. Dynamically allocated memory can have its place of course, and I can even think of problems where a good compromise between code complexity and efficiency is obtained using std::vector<std::vector<T>*>.
However std::vector does a great job of hiding most needs of dynamically allocated arrays, and managed pointers are many times just a perfect solution for dynamically allocated single instances. This means that it's just not so common finding cases where an unmanaged dynamically allocated container (or dynamically allocated whatever, actually) is the best compromise in C++.
This in my opinion doesn't make dynamic allocation "bad", but just "suspect" if you see it in code, because there's a high probability that better solutions could be possible.
In your case, for example, I see no reason for using dynamic allocation; just making the function return a std::vector would be efficient and safe. With any decent compiler Return Value Optimization will be used when assigning to a newly declared vector, and if you need to assign the result to an existing vector you can still do something like:
FindPoints().swap(myvector);
that will not do any copying of the data but just some pointer twiddling (note that you cannot use the apparently more natural myvector.swap(FindPoints()) because of a C++ rule that is sometimes annoying that forbids passing temporaries as non-const references).
In my experience the biggest source of needs of dynamically allocated objects are complex data structures where the same instance can be reached using multiple access paths (e.g. instances are at the same time both in a doubly linked list and indexed by a map). In the standard library containers are always the only owner of the contained objects (C++ is a copy semantic language) so it may be difficult to implement those solutions efficiently without the pointer and dynamic allocation concept.
Often, however, you can still find reasonable-enough compromises that just use standard containers (maybe paying for some extra O(log N) lookups that you could have avoided) and that, considering the much simpler code, can be IMO the best compromise in most cases.

How do ensure that while writing C++ code itself it will not cause any memory leaks?

Running valgrind or purify would be the next steps
But while writing the code itself, how do you ensure that it will not cause any memory leaks?
You can ensure following things:-
1: Number of new equal to delete
2: Opened File descriptor is closed or not
Is there any thing else?
Use the RAII idiom everywhere you can
Use smart pointers, e.g. std::auto_ptr where appropriate. (Don't use auto_ptr in any of the standard collections, as it won't work as you think it will.)
Avoid creating objects dynamically wherever possible. Programmers coming from Java and other similar languages often write stuff like:
string * s = new string( "hello world" );
when they should have written:
string s = "hello world";
Similarly, they create collections of pointers when they should create collections of values. For example, if you have a class like this:
class Person {
public:
    Person( const string & name ) : mName( name ) {}
    ...
private:
    string mName;
};
Rather than writing code like:
vector <Person *> vp;
or even:
vector <shared_ptr <Person> > vp;
instead use values:
vector <Person> vp;
You can easily add to such a vector:
vp.push_back( Person( "neil butterworth" ) );
and all the memory for both Person and the vector is managed for you. Of course, if you need a collection of polymorphic types, you should use (smart) pointers
Use Smart Pointers
Use RAII
Hide default copy ctors, operator=() in EVERY CLASS, unless a) your class is trivial and only uses native types and YOU KNOW IT ALWAYS WILL BE SO b) you explicitly define your own
On 1) RAII, the idea is to have deletes happen automatically, if you find yourself thinking "I just called new, I'll need to remember to call delete somewhere" then you're doing something wrong. The delete should either be a) automatic or b) be put in a dtor (and which dtor should be obvious).
On 2) Hiding defaults. Identifying rogue default copy ctors etc. can be a nightmare; the easiest thing is to avoid them by hiding them. If you have a generic "root" object that everything inherits from (can be handy for debugging / profiling anyway), hide the defaults there. Then, when something tries to assign / copy an inheriting class, the compiler barfs because the ctors etc. aren't available on the base class.
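A sketch of that "root object" idea, assuming C++11, where = delete plays the role that private, unimplemented declarations played in older code. The class names NonCopyable and Connection are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical root class: deleting the copy operations here makes every
// derived class non-copyable unless it explicitly defines its own.
class NonCopyable {
protected:
    NonCopyable() = default;
    ~NonCopyable() = default;
public:
    NonCopyable(const NonCopyable&) = delete;
    NonCopyable& operator=(const NonCopyable&) = delete;
};

class Connection : private NonCopyable {   // hypothetical resource-owning class
public:
    explicit Connection(int id) : id_(id) {}
    int id() const { return id_; }
private:
    int id_;
};

// The compiler now rejects accidental copies at compile time.
static_assert(!std::is_copy_constructible<Connection>::value, "copy ctor is hidden");
static_assert(!std::is_copy_assignable<Connection>::value, "operator= is hidden");

int demo_noncopyable() {
    Connection c(5);
    // Connection d = c;   // would not compile: the copy constructor is deleted
    assert(c.id() == 5);
    return c.id();
}
```

A class that genuinely needs copying can still opt back in by defining its own copy constructor and operator=, which matches exception b) in the list.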
Minimize the calls to new by using the STL containers for storing your data.
I'm with Glen and jalf regarding RAII at every opportunity.
IMHO you should aim to write completely delete-free code. The only explicit "delete"s should be in your smart pointer class implementations. If you find yourself wanting to write a "delete", go and find an appropriate smart pointer type instead. If none of the "industry standard" ones (boost's etc) fit and you find yourself wanting to write some bizarre new one, chances are your architecture is broken or at the least there will be maintenance difficulties in future.
I've long held that explicit "delete" is to memory management what "goto" is to flow control. More on this in this answer.
I always use std::auto_ptr when I need to create a new object on the heap.
std::auto_ptr<Foo> CreateFoo()
{
    return std::auto_ptr<Foo>(new Foo());
}
Even if you call
CreateFoo()
it won't leak
The basic steps are twofold:
Firstly, be aware that every new requires a delete. So, when you use the new operator, up your awareness of what that object will be doing, how it will be used, and how its lifetime will be managed.
Secondly, make sure that you never overwrite a pointer. You can do this using a smart pointer class instead of raw pointers, but if you do, make absolutely sure you never use it with implicit conversion. (An example: using the MSXML library, I created a CComPtr smart pointer to hold nodes. To get a node you call the get_Node method, passing in the address of the smart pointer, which had a conversion operator that returned the underlying pointer type. Unfortunately, this meant that if the smart pointer already held data, that member data would be overwritten, leaking the previous node.)
I think those 2 cases are the times when you might leak memory. If you only use the smart pointer directly - never allowing its internal data to be exposed, you're safe from the latter issue. If you wrap all your code that uses new and delete in a class (ie using RAII) then you're pretty safe from the former too.
Avoiding memory leaks in C++ is very easy if you do the above.
Two simple rules of thumb:
Never call delete explicitly (outside a RAII class, that is). Every memory allocation should be the responsibility of a RAII class which calls delete in the destructor.
Almost never call new explicitly. If you do, you should immediately wrap the resulting pointer in a smart pointer, which takes ownership of the allocation, and works as above.
In your own RAII classes, two common pitfalls are:
Failure to handle copying correctly: Who takes ownership of the memory if the object is copied? Do they create a new allocation? Do you implement both copy constructor and assignment operator? Does the latter handle self assignment?
Failure to consider exception safety. What happens if an exception is thrown during an operation (an assignment, for example)? Does the object revert to a consistent state? (it should always do this, no matter what) Does it roll back to the state it had before the operation? (it should do this when possible) std::vector has to handle this, during push_back for example. It might cause the vector to resize, which means 1) a memory allocation which may throw, and 2) all the existing elements have to be copied, each of which may throw. An algorithm like std::sort has to deal with it too. It has to call a user-supplied comparer, which could potentially throw too! if that happens, is the sequence left in a valid state? Are temporary objects destructed cleanly?
If you handle the above two cases in your RAII classes, it is pretty much impossible for them to leak memory.
And if you use RAII classes to wrap all resource allocations (memory allocations, file handles, database connections and any other type of resource that has to be acquired and released), then your application can not leak memory.
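To illustrate the two pitfalls above, here is a minimal sketch of a RAII class that owns a heap array and handles copying and self-assignment through copy-and-swap. The class name IntBuffer and the demo function are hypothetical, and copy-and-swap is one common technique among several.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical RAII class owning a heap array of ints.
class IntBuffer {
public:
    explicit IntBuffer(std::size_t n) : size_(n), data_(new int[n]()) {}
    ~IntBuffer() { delete[] data_; }   // the only delete[] in the program

    // Copying makes a new allocation, so each object owns its own memory.
    IntBuffer(const IntBuffer& other)
        : size_(other.size_), data_(new int[other.size_]) {
        std::copy(other.data_, other.data_ + size_, data_);
    }

    // Copy-and-swap: the parameter is already a copy. If making that copy
    // throws, *this is untouched; self-assignment needs no special case.
    IntBuffer& operator=(IntBuffer other) {
        swap(other);
        return *this;
    }

    void swap(IntBuffer& other) noexcept {
        std::swap(size_, other.size_);
        std::swap(data_, other.data_);
    }

    int& operator[](std::size_t i) { return data_[i]; }
    std::size_t size() const { return size_; }

private:
    std::size_t size_;
    int* data_;
};

int demo_int_buffer() {
    IntBuffer a(4);
    a[0] = 1;
    IntBuffer b(2);
    b = a;                 // deep copy; b's old array is freed by the temporary
    b[0] = 9;              // does not affect a
    const IntBuffer& alias = a;
    a = alias;             // self-assignment through an alias is harmless
    assert(b.size() == 4);
    return a[0] * 10 + b[0];
}
```

In everyday code a std::vector<int> member would make all of this unnecessary; the sketch only shows what a hand-written RAII class has to get right.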
Make sure shared memory created by your application is freed if nobody's using it anymore, clean up memory mapped files...
Basically, make sure you clean up any type of resource your application directly or indirectly creates. File descriptors are only one type of resource your application may use during runtime.
If you build a tree or graph recursively for your data structure, it may eat all of your memory.
There are static code analysis tools available that do this sort of thing; wikipedia is a good place to start looking. Basically, outside of being careful and choosing the correct containers you can not make guarantees about the code you write - hence the need for tools such as valgrind and gdb.
Incorporate valgrind into unit and system testing early in your development cycle and use it consistently.

Does myVector.erase(myPtr) delete the object pointed by myPtr?

If I have the following code,
Foo *f = new Foo();
vector<Foo*> vect;
vect.push_back(f);
// do stuff
vect.erase(f);
Did I create a memory leak?
I guess so, but the word erase gives the feeling that it is deleting it.
Writing this, I am wondering if it is not a mistake to put a pointer in a STL vector. What do you think?
Yes, you created a memory leak by that. std::vector and other containers will just remove the pointer, they won't free the memory the pointer points to.
It's not unusual to put a pointer into a standard library container. The problem, however, is that you have to keep track of deleting it when removing it from the container. A better, yet simple, way to do the above, is to use boost::shared_ptr:
{
    boost::shared_ptr<foo> f(new foo);
    std::vector< boost::shared_ptr<foo> > v;
    v.push_back(f);
    v.erase(v.begin());
} /* if the last copy of foo goes out of scope, the memory is automatically freed */
The next C++ standard (called C++1x and C++0x commonly) will include std::shared_ptr. There, you will also be able to use std::unique_ptr<T> which is faster, as it doesn't allow copying. Using std::unique_ptr with containers in c++0x is similar to the ptr_container library in boost.
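With std::unique_ptr, the same idea might look like the sketch below, assuming a C++11 compiler. The live counter on Foo and the demo function are hypothetical additions used only to make the deletion visible.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Foo {
    static int live;   // number of Foo objects currently alive (illustration only)
    Foo() { ++live; }
    ~Foo() { --live; }
};
int Foo::live = 0;

int demo_erase_unique() {
    std::vector<std::unique_ptr<Foo>> vect;
    vect.push_back(std::unique_ptr<Foo>(new Foo));
    vect.push_back(std::unique_ptr<Foo>(new Foo));
    assert(Foo::live == 2);

    vect.erase(vect.begin());  // removes the element and deletes the Foo it owned
    return Foo::live;          // the remaining Foo is still owned by the vector
}   // the vector is destroyed here and deletes the remaining Foo
```

Here erase does free the object, because the element type itself knows how to dispose of what it points to.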
Another option is to use the Boost Pointer Containers. They are designed to do exactly what you want.
Alternatively there is the boost::ptr_vector container.
It knows it is holding pointers that it owns and thus auto deletes them.
As a nice side effect, when accessing elements it returns a reference to the object, not the pointer, which makes the code look nice.
Foo *f = new Foo();
boost::ptr_vector<Foo> vect;
vect.push_back(f);
// do stuff
vect.erase(vect.begin()); // erase takes an iterator; the Foo is deleted too
To clarify why the pointer is not deleted, consider
std::vector<char const*> strings;
strings.push_back("hello");
strings.push_back("world");
// .erase should not call delete, pointers are to literals
std::vector<int*> arrays;
arrays.push_back(new int[10]);
arrays.push_back(new int[20]);
// .erase should call delete[] instead of delete
std::vector<unsigned char*> raw;
raw.push_back(static_cast<unsigned char*>(malloc(1000)));
raw.push_back(static_cast<unsigned char*>(malloc(2000)));
// .erase should call free() instead of delete
In general, vector<T*>::erase cannot guess how you'd dispose of a T*.
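One way to tell the container how to dispose of each kind of pointer is to encode the policy in the element type. The sketch below assumes C++11; FreeDeleter and the demo function are hypothetical names.

```cpp
#include <cassert>
#include <cstdlib>
#include <memory>
#include <vector>

// Deleter for memory obtained from malloc (hypothetical helper).
struct FreeDeleter {
    void operator()(void* p) const { std::free(p); }
};

int demo_disposal_policies() {
    // The array form of unique_ptr selects delete[].
    std::vector<std::unique_ptr<int[]>> arrays;
    arrays.push_back(std::unique_ptr<int[]>(new int[10]()));

    // The custom deleter type selects free().
    std::vector<std::unique_ptr<unsigned char, FreeDeleter>> raw;
    raw.push_back(std::unique_ptr<unsigned char, FreeDeleter>(
        static_cast<unsigned char*>(std::malloc(1000))));

    // String literals are not owned by anyone, so plain pointers are fine.
    std::vector<const char*> strings;
    strings.push_back("hello");

    assert(arrays.size() == 1 && raw.size() == 1 && strings.size() == 1);

    arrays.erase(arrays.begin());    // runs delete[]
    raw.erase(raw.begin());          // runs free()
    strings.erase(strings.begin());  // releases nothing, as intended
    return static_cast<int>(arrays.size() + raw.size() + strings.size());
}
```

In each case erase only destroys the element; the element's type decides what, if anything, happens to the pointed-to memory.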
It is definitely not a mistake to put a pointer into a standard container (it's a mistake to make a container of auto_ptrs, however). Yes, you do need to explicitly delete to free the memory pointed to by the individual elements, or you can use one of the boost smart pointers.
vector deletes the data it contains. Since your vector contains pointers, it only deletes the pointers, not the data they may or may not point to.
It's a pretty general rule in C++ that memory is released where it was allocated. The vector did not allocate whatever your pointers point to, so it must not release it.
You probably shouldn't store pointers in your vector in the first place.
In many cases, you would be better off with something like this:
vector<Foo> vect;
vect.push_back(Foo());
// do stuff
vect.erase(vect.begin()); // erase takes an iterator and destroys the Foo it removes
Of course this assumes that Foo is copyable, and that its copy constructor is not too expensive, but it avoids memory leaks, and you don't have to remember to delete the Foo object. Another approach would be to use smart pointers (such as Boost's shared_ptr), but you may not need pointer semantics at all, in which case the simple solution is the best one.
STL containers will not free your memory. The best advice is using smart pointers, knowing that std::auto_ptr will not fit inside containers. I would recommend boost::shared_ptr, or if your compiler vendor has support for TR1 extensions (many do) you can use std::tr1::shared_ptr.
Also note that the vector will not even free the internal memory reserved for the pointers. A std::vector does not reduce its capacity when elements are erased, not even with a call to clear(). If you need to downsize a vector, you will have to resort to creating another vector and swapping contents (C++11 also adds shrink_to_fit(), but that is only a non-binding request).
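The swap idiom can be sketched as follows. The exact capacity values are implementation-dependent, so the code only compares them; the demo function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

bool demo_release_capacity() {
    std::vector<int> v(1000);
    const std::size_t before = v.capacity();   // at least 1000

    v.clear();
    assert(v.empty());
    assert(v.capacity() == before);   // clear() keeps the allocated storage

    // Swap with an empty temporary: v takes over the temporary's storage and
    // the old buffer is freed when the temporary is destroyed.
    std::vector<int>().swap(v);
    return v.capacity() < before;
}
```

To trim a non-empty vector instead of emptying it, the same idiom is usually written as std::vector<int>(v).swap(v).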