Related
I'll start out by saying, use smart pointers and you'll never have to worry about this.
What are the problems with the following code?
Foo * p = new Foo;
// (use p)
delete p;
p = NULL;
This was sparked by an answer and comments to another question. One comment from Neil Butterworth generated a few upvotes:
Setting pointers to NULL following delete is not universal good practice in C++. There are times when it is a good thing to do, and times when it is pointless and can hide errors.
There are plenty of circumstances where it wouldn't help. But in my experience, it can't hurt. Somebody enlighten me.
Setting a pointer to 0 (which is "null" in standard C++, the NULL define from C is somewhat different) avoids crashes on double deletes.
Consider the following:
Foo* foo = 0; // Sets the pointer to 0 (C++ NULL)
delete foo; // Won't do anything
Whereas:
Foo* foo = new Foo();
delete foo; // Deletes the object
delete foo; // Undefined behavior
In other words, if you don't set deleted pointers to 0, you will get into trouble if you're doing double deletes. An argument against setting pointers to 0 after delete would be that doing so just masks double delete bugs and leaves them unhandled.
It's best to not have double delete bugs, obviously, but depending on ownership semantics and object lifecycles, this can be hard to achieve in practice. I prefer a masked double delete bug over UB.
Finally, a sidenote regarding managing object allocation, I suggest you take a look at std::unique_ptr for strict/singular ownership, std::shared_ptr for shared ownership, or another smart pointer implementation, depending on your needs.
Setting pointers to NULL after you've deleted what it pointed to certainly can't hurt, but it's often a bit of a band-aid over a more fundamental problem: Why are you using a pointer in the first place? I can see two typical reasons:
You simply wanted something allocated on the heap. In which case wrapping it in a RAII object would have been much safer and cleaner. End the RAII object's scope when you no longer need the object. That's how std::vector works, and it solves the problem of accidentally leaving pointers to deallocated memory around. There are no pointers.
Or perhaps you wanted some complex shared ownership semantics. The pointer returned from new might not be the same as the one that delete is called on. Multiple objects may have used the object simultaneously in the meantime. In that case, a shared pointer or something similar would have been preferable.
My rule of thumb is that if you leave pointers around in user code, you're Doing It Wrong. The pointer shouldn't be there to point to garbage in the first place. Why isn't there an object taking responsibility for ensuring its validity? Why doesn't its scope end when the pointed-to object does?
I've got an even better best practice: Where possible, end the variable's scope!
{
Foo* pFoo = new Foo;
// use pFoo
delete pFoo;
}
I always set a pointer to NULL (now nullptr) after deleting the object(s) it points to.
It can help catch many references to freed memory (assuming your platform faults on a deref of a null pointer).
It won't catch all references to free'd memory if, for example, you have copies of the pointer lying around. But some is better than none.
It will mask a double-delete, but I find those are far less common than accesses to already freed memory.
In many cases the compiler is going to optimize it away. So the argument that it's unnecessary doesn't persuade me.
If you're already using RAII, then there aren't many deletes in your code to begin with, so the argument that the extra assignment causes clutter doesn't persuade me.
It's often convenient, when debugging, to see the null value rather than a stale pointer.
If this still bothers you, use a smart pointer or a reference instead.
I also set other types of resource handles to the no-resource value when the resource is free'd (which is typically only in the destructor of an RAII wrapper written to encapsulate the resource).
I worked on a large (9 million statements) commercial product (primarily in C). At one point, we used macro magic to null out the pointer when memory was freed. This immediately exposed lots of lurking bugs that were promptly fixed. As far as I can remember, we never had a double-free bug.
Update: Microsoft believes that it's a good practice for security and recommends the practice in their SDL policies. Apparently MSVC++11 will stomp the deleted pointer automatically (in many circumstances) if you compile with the /SDL option.
Firstly, there are a lot of existing questions on this and closely related topics, for example Why doesn't delete set the pointer to NULL?.
In your code, the issue what goes on in (use p). For example, if somewhere you have code like this:
Foo * p2 = p;
then setting p to NULL accomplishes very little, as you still have the pointer p2 to worry about.
This is not to say that setting a pointer to NULL is always pointless. For example, if p were a member variable pointing to a resource who's lifetime was not exactly the same as the class containing p, then setting p to NULL could be a useful way of indicating the presence or absence of the resource.
If there is more code after the delete, Yes. When the pointer is deleted in a constructor or at the end of method or function, No.
The point of this parable is to remind the programmer, during run-time, that the object has already been deleted.
An even better practice is to use Smart Pointers (shared or scoped) which automagically delete their target objects.
As others have said, delete ptr; ptr = 0; is not going to cause demons to fly out of your nose. However, it does encourage the usage of ptr as a flag of sorts. The code becomes littered with delete and setting the pointer to NULL. The next step is to scatter if (arg == NULL) return; through your code to protect against the accidental usage of a NULL pointer. The problem occurs once the checks against NULL become your primary means of checking for the state of an object or program.
I'm sure that there is a code smell about using a pointer as a flag somewhere but I haven't found one.
I'll change your question slightly:
Would you use an uninitialized
pointer? You know, one that you didn't
set to NULL or allocate the memory it
points to?
There are two scenarios where setting the pointer to NULL can be skipped:
the pointer variable goes out of scope immediately
you have overloaded the semantic of the pointer and are using its value not only as a memory pointer, but also as a key or raw value. this approach however suffers from other problems.
Meanwhile, arguing that setting the pointer to NULL might hide errors to me sounds like arguing that you shouldn't fix a bug because the fix might hide another bug. The only bugs that might show if the pointer is not set to NULL would be the ones that try to use the pointer. But setting it to NULL would actually cause exactly the same bug as would show if you use it with freed memory, wouldn't it?
If you have no other constraint that forces you to either set or not set the pointer to NULL after you delete it (one such constraint was mentioned by Neil Butterworth), then my personal preference is to leave it be.
For me, the question isn't "is this a good idea?" but "what behavior would I prevent or allow to succeed by doing this?" For example, if this allows other code to see that the pointer is no longer available, why is other code even attempting to look at freed pointers after they are freed? Usually, it's a bug.
It also does more work than necessary as well as hindering post-mortem debugging. The less you touch memory after you don't need it, the easier it is to figure out why something crashed. Many times I have relied on the fact that memory is in a similar state to when a particular bug occurred to diagnose and fix said bug.
Explicitly nulling after delete strongly suggests to a reader that the pointer represents something which is conceptually optional. If I saw that being done, I'd start worrying that everywhere in the source the pointer gets used that it should be first tested against NULL.
If that's what you actually mean, it's better to make that explicit in the source using something like boost::optional
optional<Foo*> p (new Foo);
// (use p.get(), but must test p for truth first!...)
delete p.get();
p = optional<Foo*>();
But if you really wanted people to know the pointer has "gone bad", I'll pitch in 100% agreement with those who say the best thing to do is make it go out of scope. Then you're using the compiler to prevent the possibility of bad dereferences at runtime.
That's the baby in all the C++ bathwater, shouldn't throw it out. :)
In a well structured program with appropriate error checking, there is no reason not to assign it null. 0 stands alone as a universally recognized invalid value in this context. Fail hard and Fail soon.
Many of the arguments against assigning 0 suggest that it could hide a bug or complicate control flow. Fundamentally, that is either an upstream error (not your fault (sorry for the bad pun)) or another error on the programmer's behalf -- perhaps even an indication that program flow has grown too complex.
If the programmer wants to introduce the use of a pointer which may be null as a special value and write all the necessary dodging around that, that's a complication they have deliberately introduced. The better the quarantine, the sooner you find cases of misuse, and the less they are able to spread into other programs.
Well structured programs may be designed using C++ features to avoid these cases. You can use references, or you can just say "passing/using null or invalid arguments is an error" -- an approach which is equally applicable to containers, such as smart pointers. Increasing consistent and correct behavior forbids these bugs from getting far.
From there, you have only a very limited scope and context where a null pointer may exist (or is permitted).
The same may be applied to pointers which are not const. Following the value of a pointer is trivial because its scope is so small, and improper use is checked and well defined. If your toolset and engineers cannot follow the program following a quick read or there is inappropriate error checking or inconsistent/lenient program flow, you have other, bigger problems.
Finally, your compiler and environment likely has some guards for the times when you would like to introduce errors (scribbling), detect accesses to freed memory, and catch other related UB. You can also introduce similar diagnostics into your programs, often without affecting existing programs.
Let me expand what you've already put into your question.
Here's what you've put into your question, in bullet-point form:
Setting pointers to NULL following delete is not universal good practice in C++. There are times when:
it is a good thing to do
and times when it is pointless and can hide errors.
However, there is no times when this is bad! You will not introduce more bugs by explicitly nulling it, you will not leak memory, you will not cause undefined behaviour to happen.
So, if in doubt, just null it.
Having said that, if you feel that you have to explicitly null some pointer, then to me this sounds like you haven't split up a method enough, and should look at the refactoring approach called "Extract method" to split up the method into separate parts.
There is always Dangling Pointers to worry about.
Yes.
The only "harm" it can do is to introduce inefficiency (an unnecessary store operation) into your program - but this overhead will be insignificant in relation to the cost of allocating and freeing the block of memory in most cases.
If you don't do it, you will have some nasty pointer derefernce bugs one day.
I always use a macro for delete:
#define SAFEDELETE(ptr) { delete(ptr); ptr = NULL; }
(and similar for an array, free(), releasing handles)
You can also write "self delete" methods that take a reference to the calling code's pointer, so they force the calling code's pointer to NULL. For example, to delete a subtree of many objects:
static void TreeItem::DeleteSubtree(TreeItem *&rootObject)
{
if (rootObject == NULL)
return;
rootObject->UnlinkFromParent();
for (int i = 0; i < numChildren)
DeleteSubtree(rootObject->child[i]);
delete rootObject;
rootObject = NULL;
}
edit
Yes, these techniques do violate some rules about use of macros (and yes, these days you could probably achieve the same result with templates) - but by using over many years I never ever accessed dead memory - one of the nastiest and most difficult and most time consuming to debug problems you can face. In practice over many years they have effectively eliminated a whjole class of bugs from every team I have introduced them on.
There are also many ways you could implement the above - I am just trying to illustrate the idea of forcing people to NULL a pointer if they delete an object, rather than providing a means for them to release the memory that does not NULL the caller's pointer.
Of course, the above example is just a step towards an auto-pointer. Which I didn't suggest because the OP was specifically asking about the case of not using an auto pointer.
"There are times when it is a good thing to do, and times when it is pointless and can hide errors"
I can see two problems:
That simple code:
delete myObj;
myobj = 0
becomes to a for-liner in multithreaded environment:
lock(myObjMutex);
delete myObj;
myobj = 0
unlock(myObjMutex);
The "best practice" of Don Neufeld don't apply always. E.g. in one automotive project we had to set pointers to 0 even in destructors. I can imagine in safety-critical software such rules are not uncommon. It is easier (and wise) to follow them than trying to persuade
the team/code-checker for each pointer use in code, that a line nulling this pointer is redundant.
Another danger is relying on this technique in exceptions-using code:
try{
delete myObj; //exception in destructor
myObj=0
}
catch
{
//myObj=0; <- possibly resource-leak
}
if (myObj)
// use myObj <--undefined behaviour
In such code either you produce resource-leak and postpone the problem or the process crashes.
So, this two problems going spontaneously through my head (Herb Sutter would for sure tell more) make for me all the questions of the kind "How to avoid using smart-pointers and do the job safely with normal pointers" as obsolete.
If you're going to reallocate the pointer before using it again (dereferencing it, passing it to a function, etc.), making the pointer NULL is just an extra operation. However, if you aren't sure whether it will be reallocated or not before it is used again, setting it to NULL is a good idea.
As many have said, it is of course much easier to just use smart pointers.
Edit: As Thomas Matthews said in this earlier answer, if a pointer is deleted in a destructor, there isn't any need to assign NULL to it since it won't be used again because the object is being destroyed already.
I can imagine setting a pointer to NULL after deleting it being useful in rare cases where there is a legitimate scenario of reusing it in a single function (or object). Otherwise it makes no sense - a pointer needs to point to something meaningful as long as it exists - period.
If the code does not belong to the most performance-critical part of your application, keep it simple and use a shared_ptr:
shared_ptr<Foo> p(new Foo);
//No more need to call delete
It performs reference counting and is thread-safe. You can find it in the tr1 (std::tr1 namespace, #include < memory >) or if your compiler does not provide it, get it from boost.
I'll start out by saying, use smart pointers and you'll never have to worry about this.
What are the problems with the following code?
Foo * p = new Foo;
// (use p)
delete p;
p = NULL;
This was sparked by an answer and comments to another question. One comment from Neil Butterworth generated a few upvotes:
Setting pointers to NULL following delete is not universal good practice in C++. There are times when it is a good thing to do, and times when it is pointless and can hide errors.
There are plenty of circumstances where it wouldn't help. But in my experience, it can't hurt. Somebody enlighten me.
Setting a pointer to 0 (which is "null" in standard C++, the NULL define from C is somewhat different) avoids crashes on double deletes.
Consider the following:
Foo* foo = 0; // Sets the pointer to 0 (C++ NULL)
delete foo; // Won't do anything
Whereas:
Foo* foo = new Foo();
delete foo; // Deletes the object
delete foo; // Undefined behavior
In other words, if you don't set deleted pointers to 0, you will get into trouble if you're doing double deletes. An argument against setting pointers to 0 after delete would be that doing so just masks double delete bugs and leaves them unhandled.
It's best to not have double delete bugs, obviously, but depending on ownership semantics and object lifecycles, this can be hard to achieve in practice. I prefer a masked double delete bug over UB.
Finally, a sidenote regarding managing object allocation, I suggest you take a look at std::unique_ptr for strict/singular ownership, std::shared_ptr for shared ownership, or another smart pointer implementation, depending on your needs.
Setting pointers to NULL after you've deleted what it pointed to certainly can't hurt, but it's often a bit of a band-aid over a more fundamental problem: Why are you using a pointer in the first place? I can see two typical reasons:
You simply wanted something allocated on the heap. In which case wrapping it in a RAII object would have been much safer and cleaner. End the RAII object's scope when you no longer need the object. That's how std::vector works, and it solves the problem of accidentally leaving pointers to deallocated memory around. There are no pointers.
Or perhaps you wanted some complex shared ownership semantics. The pointer returned from new might not be the same as the one that delete is called on. Multiple objects may have used the object simultaneously in the meantime. In that case, a shared pointer or something similar would have been preferable.
My rule of thumb is that if you leave pointers around in user code, you're Doing It Wrong. The pointer shouldn't be there to point to garbage in the first place. Why isn't there an object taking responsibility for ensuring its validity? Why doesn't its scope end when the pointed-to object does?
I've got an even better best practice: Where possible, end the variable's scope!
{
Foo* pFoo = new Foo;
// use pFoo
delete pFoo;
}
I always set a pointer to NULL (now nullptr) after deleting the object(s) it points to.
It can help catch many references to freed memory (assuming your platform faults on a deref of a null pointer).
It won't catch all references to free'd memory if, for example, you have copies of the pointer lying around. But some is better than none.
It will mask a double-delete, but I find those are far less common than accesses to already freed memory.
In many cases the compiler is going to optimize it away. So the argument that it's unnecessary doesn't persuade me.
If you're already using RAII, then there aren't many deletes in your code to begin with, so the argument that the extra assignment causes clutter doesn't persuade me.
It's often convenient, when debugging, to see the null value rather than a stale pointer.
If this still bothers you, use a smart pointer or a reference instead.
I also set other types of resource handles to the no-resource value when the resource is free'd (which is typically only in the destructor of an RAII wrapper written to encapsulate the resource).
I worked on a large (9 million statements) commercial product (primarily in C). At one point, we used macro magic to null out the pointer when memory was freed. This immediately exposed lots of lurking bugs that were promptly fixed. As far as I can remember, we never had a double-free bug.
Update: Microsoft believes that it's a good practice for security and recommends the practice in their SDL policies. Apparently MSVC++11 will stomp the deleted pointer automatically (in many circumstances) if you compile with the /SDL option.
Firstly, there are a lot of existing questions on this and closely related topics, for example Why doesn't delete set the pointer to NULL?.
In your code, the issue what goes on in (use p). For example, if somewhere you have code like this:
Foo * p2 = p;
then setting p to NULL accomplishes very little, as you still have the pointer p2 to worry about.
This is not to say that setting a pointer to NULL is always pointless. For example, if p were a member variable pointing to a resource who's lifetime was not exactly the same as the class containing p, then setting p to NULL could be a useful way of indicating the presence or absence of the resource.
If there is more code after the delete, Yes. When the pointer is deleted in a constructor or at the end of method or function, No.
The point of this parable is to remind the programmer, during run-time, that the object has already been deleted.
An even better practice is to use Smart Pointers (shared or scoped) which automagically delete their target objects.
As others have said, delete ptr; ptr = 0; is not going to cause demons to fly out of your nose. However, it does encourage the usage of ptr as a flag of sorts. The code becomes littered with delete and setting the pointer to NULL. The next step is to scatter if (arg == NULL) return; through your code to protect against the accidental usage of a NULL pointer. The problem occurs once the checks against NULL become your primary means of checking for the state of an object or program.
I'm sure that there is a code smell about using a pointer as a flag somewhere but I haven't found one.
I'll change your question slightly:
Would you use an uninitialized
pointer? You know, one that you didn't
set to NULL or allocate the memory it
points to?
There are two scenarios where setting the pointer to NULL can be skipped:
the pointer variable goes out of scope immediately
you have overloaded the semantic of the pointer and are using its value not only as a memory pointer, but also as a key or raw value. this approach however suffers from other problems.
Meanwhile, arguing that setting the pointer to NULL might hide errors to me sounds like arguing that you shouldn't fix a bug because the fix might hide another bug. The only bugs that might show if the pointer is not set to NULL would be the ones that try to use the pointer. But setting it to NULL would actually cause exactly the same bug as would show if you use it with freed memory, wouldn't it?
If you have no other constraint that forces you to either set or not set the pointer to NULL after you delete it (one such constraint was mentioned by Neil Butterworth), then my personal preference is to leave it be.
For me, the question isn't "is this a good idea?" but "what behavior would I prevent or allow to succeed by doing this?" For example, if this allows other code to see that the pointer is no longer available, why is other code even attempting to look at freed pointers after they are freed? Usually, it's a bug.
It also does more work than necessary as well as hindering post-mortem debugging. The less you touch memory after you don't need it, the easier it is to figure out why something crashed. Many times I have relied on the fact that memory is in a similar state to when a particular bug occurred to diagnose and fix said bug.
Explicitly nulling after delete strongly suggests to a reader that the pointer represents something which is conceptually optional. If I saw that being done, I'd start worrying that everywhere in the source the pointer gets used that it should be first tested against NULL.
If that's what you actually mean, it's better to make that explicit in the source using something like boost::optional
optional<Foo*> p (new Foo);
// (use p.get(), but must test p for truth first!...)
delete p.get();
p = optional<Foo*>();
But if you really wanted people to know the pointer has "gone bad", I'll pitch in 100% agreement with those who say the best thing to do is make it go out of scope. Then you're using the compiler to prevent the possibility of bad dereferences at runtime.
That's the baby in all the C++ bathwater, shouldn't throw it out. :)
In a well structured program with appropriate error checking, there is no reason not to assign it null. 0 stands alone as a universally recognized invalid value in this context. Fail hard and Fail soon.
Many of the arguments against assigning 0 suggest that it could hide a bug or complicate control flow. Fundamentally, that is either an upstream error (not your fault (sorry for the bad pun)) or another error on the programmer's behalf -- perhaps even an indication that program flow has grown too complex.
If the programmer wants to introduce the use of a pointer which may be null as a special value and write all the necessary dodging around that, that's a complication they have deliberately introduced. The better the quarantine, the sooner you find cases of misuse, and the less they are able to spread into other programs.
Well structured programs may be designed using C++ features to avoid these cases. You can use references, or you can just say "passing/using null or invalid arguments is an error" -- an approach which is equally applicable to containers, such as smart pointers. Increasing consistent and correct behavior forbids these bugs from getting far.
From there, you have only a very limited scope and context where a null pointer may exist (or is permitted).
The same may be applied to pointers which are not const. Following the value of a pointer is trivial because its scope is so small, and improper use is checked and well defined. If your toolset and engineers cannot follow the program following a quick read or there is inappropriate error checking or inconsistent/lenient program flow, you have other, bigger problems.
Finally, your compiler and environment likely has some guards for the times when you would like to introduce errors (scribbling), detect accesses to freed memory, and catch other related UB. You can also introduce similar diagnostics into your programs, often without affecting existing programs.
Let me expand what you've already put into your question.
Here's what you've put into your question, in bullet-point form:
Setting pointers to NULL following delete is not universal good practice in C++. There are times when:
it is a good thing to do
and times when it is pointless and can hide errors.
However, there is no times when this is bad! You will not introduce more bugs by explicitly nulling it, you will not leak memory, you will not cause undefined behaviour to happen.
So, if in doubt, just null it.
Having said that, if you feel that you have to explicitly null some pointer, then to me this sounds like you haven't split up a method enough, and should look at the refactoring approach called "Extract method" to split up the method into separate parts.
There is always Dangling Pointers to worry about.
Yes.
The only "harm" it can do is to introduce inefficiency (an unnecessary store operation) into your program - but this overhead will be insignificant in relation to the cost of allocating and freeing the block of memory in most cases.
If you don't do it, you will have some nasty pointer derefernce bugs one day.
I always use a macro for delete:
#define SAFEDELETE(ptr) { delete(ptr); ptr = NULL; }
(and similar for an array, free(), releasing handles)
You can also write "self delete" methods that take a reference to the calling code's pointer, so they force the calling code's pointer to NULL. For example, to delete a subtree of many objects:
static void TreeItem::DeleteSubtree(TreeItem *&rootObject)
{
if (rootObject == NULL)
return;
rootObject->UnlinkFromParent();
for (int i = 0; i < numChildren)
DeleteSubtree(rootObject->child[i]);
delete rootObject;
rootObject = NULL;
}
edit
Yes, these techniques do violate some rules about use of macros (and yes, these days you could probably achieve the same result with templates) - but by using over many years I never ever accessed dead memory - one of the nastiest and most difficult and most time consuming to debug problems you can face. In practice over many years they have effectively eliminated a whjole class of bugs from every team I have introduced them on.
There are also many ways you could implement the above - I am just trying to illustrate the idea of forcing people to NULL a pointer if they delete an object, rather than providing a means for them to release the memory that does not NULL the caller's pointer.
Of course, the above example is just a step towards an auto-pointer. Which I didn't suggest because the OP was specifically asking about the case of not using an auto pointer.
"There are times when it is a good thing to do, and times when it is pointless and can hide errors"
I can see two problems:
That simple code:
delete myObj;
myobj = 0
becomes to a for-liner in multithreaded environment:
lock(myObjMutex);
delete myObj;
myobj = 0
unlock(myObjMutex);
The "best practice" of Don Neufeld don't apply always. E.g. in one automotive project we had to set pointers to 0 even in destructors. I can imagine in safety-critical software such rules are not uncommon. It is easier (and wise) to follow them than trying to persuade
the team/code-checker for each pointer use in code, that a line nulling this pointer is redundant.
Another danger is relying on this technique in exceptions-using code:
try{
delete myObj; //exception in destructor
myObj=0
}
catch
{
//myObj=0; <- possibly resource-leak
}
if (myObj)
// use myObj <--undefined behaviour
In such code either you produce resource-leak and postpone the problem or the process crashes.
So, this two problems going spontaneously through my head (Herb Sutter would for sure tell more) make for me all the questions of the kind "How to avoid using smart-pointers and do the job safely with normal pointers" as obsolete.
If you're going to reallocate the pointer before using it again (dereferencing it, passing it to a function, etc.), making the pointer NULL is just an extra operation. However, if you aren't sure whether it will be reallocated or not before it is used again, setting it to NULL is a good idea.
As many have said, it is of course much easier to just use smart pointers.
Edit: As Thomas Matthews said in this earlier answer, if a pointer is deleted in a destructor, there isn't any need to assign NULL to it since it won't be used again because the object is being destroyed already.
I can imagine setting a pointer to NULL after deleting it being useful in rare cases where there is a legitimate scenario of reusing it in a single function (or object). Otherwise it makes no sense - a pointer needs to point to something meaningful as long as it exists - period.
If the code does not belong to the most performance-critical part of your application, keep it simple and use a shared_ptr:
shared_ptr<Foo> p(new Foo);
//No more need to call delete
It performs reference counting and is thread-safe. You can find it in the tr1 (std::tr1 namespace, #include < memory >) or if your compiler does not provide it, get it from boost.
I know this has probably been asked and I've looked through other answers, but still I cannot get this completely.
I want to understand the difference between the two following codes:
MyClass getClass(){
return MyClass();
}
and
MyClass* returnClass(){
return new MyClass();
}
Now let's say I call such functions in a main:
MyClass what = getClass();
MyClass* who = returnClass();
If I got this straight, in the first case the object created in the
function scope will have automatic storage, i.e. when you exit the
scope of the function its memory block will be freed. Also, before
freeing such memory, the returned object will be copied into the
"what" variable I created. So there will exist only one copy of
the object. Am I correct?
1a. If I'm correct, why is RVO (Return Value Optimization) needed?
In the second case, the object will be allocated through a dynamic storage, i.e. it will exist even out of the function scope. So I need to use a deleteon it. The function returns a pointer to such object, so there's no copy made this time, and performing delete who will free the previously allocated memory. Am I (hopefully) correct?
Also I understand I can do something like this:
MyClass& getClass(){
return MyClass();
}
and then in main:
MyClass who = getClass();
In this way I'm just telling that "who" is the same object as the one created in the function. Though, now we're out of the function scope and thus that object doesn't necessarily exists anymore. So I think this should be avoided in order to avoid trouble, right? (and the same goes for
MyClass* who = &getClass();
which would create a pointer to the local variable).
Bonus question: I assume that anything said till now is also true when returning vector<T>(say, for example, vector<double>), though I miss some pieces.
I know that a vector is allocated in the stack while the things it contains are in the heap, but using vector<T>::clear() is enough to clear such memory.
Now I want to follow the first procedure (i.e. return a vector by value): when the vector will be copied, also the onjects it contains will be copied; but exiting the function scope destroys the first object. Now I have the original objects that are contained nowhere, since their vector has been destroyed and I have no way of deleting such objects that are still in the heap. Or maybe a clear() is performed automatically?
I know that I may beatray some misunderstandings in these subjects (expecially in the vector part), so I hope you can help me clarify them.
Q1. What happens conceptually is the following: you create an object of type MyClass on the stack in the stack frame of getClass.
You then copy that object into the return value of the function, which is a bit of stack that was allocated before the function call to hold this object.
Then the function returns, the temporary gets cleaned up. You copy the return value into the local variable what. So you have one allocation and two copies.
Most (all?) compilers are smart enough to omit the first copy: the temporary is not used except as return value. However, the copy from the return value into the local variable on the caller side cannot be omitted, because the return value lives on a part of the stack that is freed as soon as the function finishes.
Q1a. Return Value Optimization (RVO) is a special feature, that does allow that final copy to be elided. That is, instead of returning the function result on the stack, it will be allocated straight away in the memory allocated for what, avoiding all copying altogether. Note that, contrary to all other compiler optimizations, RVO can change the behaviour of your program! You could give MyClass a non-default copy constructor, that has side effects, like printing a message to the console or liking a post on Facebook. Normally, the compiler is not allowed to remove such function calls unless it can prove that these side effects are absent. However, the C++ specs contain a special exception for RVO, that says that even if the copy constructor does something non-trivial, it is still allowed to omit the return value copy and reduce the whole thing to a single constructor call.
2. In the second case, the MyClass instance is not allocated on the stack, but on the heap. The result of the new operator is an integer: the address of the object on the heap. This is the only point where you will ever be able to obtain this address (provided you didn't use placement new), so you need to hold onto it: if you lose it, you cannot call delete and you will have created a memory leak.
You assign the result of new to a variable whose type is denoted by MyClass* so that the compiler can do type checking and stuff, but in memory it is just an integer large enough to hold an address on your system (32- or 64-bits). You can check this for yourself by trying to coerce the result to a size_t (which is typedef'd to typically an unsigned int or something larger depending on your architecture) and seeing the conversion succeed.
This integer is returned to the caller by value, i.e. on the stack, just as in example (1). So again,
in principle, there is copying going on, but in this case only copying of a single integer which your CPU is very good at (most of the times it will not even go on the stack but get passed in a register) and not the whole MyClass object (which in general has to go on the stack because it's very large, read: larger than an integer).
3. Yes, you should not do that. Your analysis is correct: as the function finishes, the local object is cleaned up and its address becomes meaningless. The problem is, that it sometimes seems to work. Forgetting about optimizations for the time being, the main reason the way memory works: clearing (zero-ing) memory is quite expensive, so that is hardly ever done. Instead, it is just marked as available again, but it's not overwritten until you make another allocation that needs it. Therefore, even though the object is technically dead, its data may still be in the memory so when you dereference the pointer you may still get the right data back. However, since the memory is technically free, it may be overwritten at any time between right now and at the end of the universe. You have created what C++ calls Undefined Behaviour (UB): it may seem to work right now on your computer, but there's no telling what may happen somewhere else or at another point in time.
Bonus: When you return a vector by value, as you remarked, it is not just destroyed: it is first copied to the return value or - taking RVO into account - into the target variable. There are two options now: (1) The copy creates its own objects on the heap, and modifies its internal pointers accordingly. You now have two proper (deep) copies co-existing temporarily -- then when the temporary object goes out of scope, you are just left with the one valid vector. Or (2): When copying the vector, the new copy takes ownership of all the pointers that the old one holds. This is possible, if you know that the old vector is about to be destroyed: rather than re-allocating all the contents again on the heap, you can just move them to the new vector and leave the old one in a sort of half-dead state -- as soon as the function is done cleaning that stack the old vector is no longer there anyway.
Which of these two options is used, is really irrelevant or rather, an implementation detail: they have the same result and whether the compiler is smart enough to choose (2) should not usually be your concern (though in practice option (2) will always happen: deep copying an object just to destroy the original is just pointless and easily avoided).
As long as you realize that the thing that gets copied is the part on the stack and the ownership of the pointers on the heap gets transferred: no copying happens on the heap and nothing gets cleared.
Here are my answers to your different questions:
1- You are absolutely correct. If I understand the sequentiallity correctly, your code will allocate memory, create your object, copy the variable into the what variable, and get destroyed as out of scope. The same thing happens when you do:
int SomeFunction()
{
return 10;
}
This will create a temporary that holds 10 (so allocate), copy it to the return vairbale, and then destroy the temporary (so deallocate) (Here I'm not sure of the specifics, maybe the compiler can remove some stuff via automatic inlining, constante values, ... but you get the idea). Which brings me to
1a- You need RVO when to limit this allocation, copy, and deallocation part. If your class allocates a lot of data upon construction it is a bad idea to return it directly. You can use move constructor in that case, and reuse the storage space allocated by the temporary for example. Or return a pointer. Which takes all the way down to
2- Returning a pointer works exactly as returning an int from a function. But because pointers are only 4 or 8 bytes long, allocation and deallocation cost a lot less than doing so for a class that's 10 Mb long. And instead of copying the object you copy its adress on the heap (usually less heavy, but copy nonetheless). Do not forget it is not because a pointer represents a memory that its size is 0 byte. So using a pointer requires getting the value from some memory address. Returning a reference and inlining are also good ideas to optimise your code, as you avoid chasing pointer, function calls, etc.
3- I think you are correct there. I'd have to make sure by testing, but if follow my logic you are right.
I hope I answered your questions. And I hope my answers are as correct as can be. But maybe someone more clever than me can correct me :-)
Best.
I am learning about pointers in C++ currently, in college. I have coded a program that is a binary tree of objects that points to a linked list of sub-objects. IF I am even wording that correctly. Anyways, my program seems to work correctly, but I am having trouble wrapping my head around how to test pointer deletion.
For instance, my delete function for single object of the binary tree is:
void EmployeeRecord::destroyCustomerList()
{
if(m_oCustomerList != NULL)
{
delete m_oCustomerList;
m_oCustomerList = NULL;
}
}
When printing my tree, everything populates and is taken off correctly (meaning the tree is kept intact through every removal of a node)...but how do I confirm what happens to the deallocated memory? I know that since I am setting the pointer *m_oCustomerList to NULL, that I can test for a NULL value on a previously populated object, but what happens to the actual memory?
I am using Visual Studio/C++ and have read that the debugger will use a code starting at 0xCC for deallocated memory...but I can't seem to figure out how to use that information.
Note that your code
void EmployeeRecord::destroyCustomerList()
{
if(m_oCustomerList != NULL)
{
delete m_oCustomerList;
m_oCustomerList = NULL;
}
}
Simplifies to:
void EmployeeRecord::destroyCustomerList()
{
delete m_oCustomerList;
m_oCustomerList = NULL;
}
It is safe to invoke the delete operator on a null pointer in C++. It does nothing. In other words, the check for null is already "built in".
Once you delete an object, it no longer exists, and the pointer to that object becomes and indeterminate value (so it's not a bad idea to null out all copies of that pointer).
What really happens to the memory in actual C++ implementations, rather than in the abstract sense, is that it continues to exist at the same address, but is marked as free, so that it can be allocated for another purpose. An allocation request coming from the program (possibly a completely unrelated module) or possibly from another program in the system, could obtain that memory for its own use.
Any uses of a pointer to an object which no longer exists are "undefined behavior". Functions for safely verifying such a pointer do exist, but they are very platform-specific and rarely perfect.
The problem is that whereas it is not particularly hard for an implementation to confirm that a pointer is bad, it is not possible to confirm that a pointer is good. We can walk the internal memory data structures of the memory allocator to determine that some pointer refers to free storage. But what if the storage is subsequently allocated? Then the pointer no longer refers to free storage. But it does not refer to the original object which was allocated, either! This is known as an "ABA ambiguity": because some A changed into a B, but then back into A, indistinguishable from the original A.
Approaches exist to solve the ABA ambiguity (if not completely than at least partially). For instance, pointers be made "fat" so the they have an extra field in addition to the address bits. The field could contain a sequence number which is used to stamp the pointer that are returned from the allocator. Now when an object is deleted and reallocated, the new pointer to the same location has a different sequence number: we have ABA'. The pointer A has gone bad, making it B, but the when it is resurrected it comes back as A'. If we ask the system to validate A, it will correctly determine that A is bad, because it does not have the expected sequence number. The correct, valid pointer to the object is A', which does not match A.
However, sequence number fields are only so many bits wide and they will wrap around eventually. So the ABA problem has not really been solved. The validation of good versus bad pointers has only been made substantially more reliable. To absolutely deal with the ABA problem, the system must always hand out new pointers which are not equal to any pointers which could still be in use. This means never actually freeing anything (thereby running out of memory) or implementing garbage collection. (Meaning that delete actually does nothing: deleted objects are destructed, but stick around in memory until they are garbage-collected, which happens when the program no longer remembers any copies of the pointer. At that point, the program no longer remembers A, and so A can be re-introduced, and there is no ABA problem.)
To make all pointers "fat", you have to change the entire toolchain and runtime: compilers, libraries, et cetera. There are further difficulties because large programs tend to have multiple memory allocators. If you ask the wrong allocator "is this pointer valid", all it can say is "this pointer is not from my arena". Another approach you can do is to invent your own pointers and implement them as smart pointers in C++. Your pointers can support an is_valid method which tries to be as reliable as possible (dealing with the ABA problem somehow: either partially with some sequence numbers and such, or by implementing your own garbage collection scheme.)
Accessing deleted memory is undefined behaviour by the standard. For instance, if this was a multithreaded application (or some other process had injected a thread into your application) then a new allocation could allocate the memory you just deallocated before you are able to "verify" it.
Once you delete your memory and set your pointer to NULL you no longer have access to that memory even if you want it. So, there is no way to verify that it really gone. However, if you did something wrong and the memory was never deleted it would consist of a memory leak which would cause your program to increase the amount of ram it uses, you could see this as a symptom of a pointer not properly disposed of.
You will probably learn later that you will not have to worry about the deletion of your pointers because of std::shared_ptr which will delete your object when the pointer goes out of scope. Which will be safer later on because you will probably will learn that exceptions can cause your destructor to never fire leaving a memory leak.
...
...
delete m_oCustomerList;
// Try using the deleted pointer here
// This should cause a runtime exception
// which means you did free the pointer
m_oCustomerList->someStrMemberVariable = "This will fail"
...
...
Needless to say, don't do this in the actual code. Hope this helps.
I'll start out by saying, use smart pointers and you'll never have to worry about this.
What are the problems with the following code?
Foo * p = new Foo;
// (use p)
delete p;
p = NULL;
This was sparked by an answer and comments to another question. One comment from Neil Butterworth generated a few upvotes:
Setting pointers to NULL following delete is not universal good practice in C++. There are times when it is a good thing to do, and times when it is pointless and can hide errors.
There are plenty of circumstances where it wouldn't help. But in my experience, it can't hurt. Somebody enlighten me.
Setting a pointer to 0 (which is "null" in standard C++, the NULL define from C is somewhat different) avoids crashes on double deletes.
Consider the following:
Foo* foo = 0; // Sets the pointer to 0 (C++ NULL)
delete foo; // Won't do anything
Whereas:
Foo* foo = new Foo();
delete foo; // Deletes the object
delete foo; // Undefined behavior
In other words, if you don't set deleted pointers to 0, you will get into trouble if you're doing double deletes. An argument against setting pointers to 0 after delete would be that doing so just masks double delete bugs and leaves them unhandled.
It's best to not have double delete bugs, obviously, but depending on ownership semantics and object lifecycles, this can be hard to achieve in practice. I prefer a masked double delete bug over UB.
Finally, a sidenote regarding managing object allocation, I suggest you take a look at std::unique_ptr for strict/singular ownership, std::shared_ptr for shared ownership, or another smart pointer implementation, depending on your needs.
Setting pointers to NULL after you've deleted what it pointed to certainly can't hurt, but it's often a bit of a band-aid over a more fundamental problem: Why are you using a pointer in the first place? I can see two typical reasons:
You simply wanted something allocated on the heap. In which case wrapping it in a RAII object would have been much safer and cleaner. End the RAII object's scope when you no longer need the object. That's how std::vector works, and it solves the problem of accidentally leaving pointers to deallocated memory around. There are no pointers.
Or perhaps you wanted some complex shared ownership semantics. The pointer returned from new might not be the same as the one that delete is called on. Multiple objects may have used the object simultaneously in the meantime. In that case, a shared pointer or something similar would have been preferable.
My rule of thumb is that if you leave pointers around in user code, you're Doing It Wrong. The pointer shouldn't be there to point to garbage in the first place. Why isn't there an object taking responsibility for ensuring its validity? Why doesn't its scope end when the pointed-to object does?
I've got an even better best practice: Where possible, end the variable's scope!
{
Foo* pFoo = new Foo;
// use pFoo
delete pFoo;
}
I always set a pointer to NULL (now nullptr) after deleting the object(s) it points to.
It can help catch many references to freed memory (assuming your platform faults on a deref of a null pointer).
It won't catch all references to free'd memory if, for example, you have copies of the pointer lying around. But some is better than none.
It will mask a double-delete, but I find those are far less common than accesses to already freed memory.
In many cases the compiler is going to optimize it away. So the argument that it's unnecessary doesn't persuade me.
If you're already using RAII, then there aren't many deletes in your code to begin with, so the argument that the extra assignment causes clutter doesn't persuade me.
It's often convenient, when debugging, to see the null value rather than a stale pointer.
If this still bothers you, use a smart pointer or a reference instead.
I also set other types of resource handles to the no-resource value when the resource is free'd (which is typically only in the destructor of an RAII wrapper written to encapsulate the resource).
I worked on a large (9 million statements) commercial product (primarily in C). At one point, we used macro magic to null out the pointer when memory was freed. This immediately exposed lots of lurking bugs that were promptly fixed. As far as I can remember, we never had a double-free bug.
Update: Microsoft believes that it's a good practice for security and recommends the practice in their SDL policies. Apparently MSVC++11 will stomp the deleted pointer automatically (in many circumstances) if you compile with the /SDL option.
Firstly, there are a lot of existing questions on this and closely related topics, for example Why doesn't delete set the pointer to NULL?.
In your code, the issue what goes on in (use p). For example, if somewhere you have code like this:
Foo * p2 = p;
then setting p to NULL accomplishes very little, as you still have the pointer p2 to worry about.
This is not to say that setting a pointer to NULL is always pointless. For example, if p were a member variable pointing to a resource who's lifetime was not exactly the same as the class containing p, then setting p to NULL could be a useful way of indicating the presence or absence of the resource.
If there is more code after the delete, Yes. When the pointer is deleted in a constructor or at the end of method or function, No.
The point of this parable is to remind the programmer, during run-time, that the object has already been deleted.
An even better practice is to use Smart Pointers (shared or scoped) which automagically delete their target objects.
As others have said, delete ptr; ptr = 0; is not going to cause demons to fly out of your nose. However, it does encourage the usage of ptr as a flag of sorts. The code becomes littered with delete and setting the pointer to NULL. The next step is to scatter if (arg == NULL) return; through your code to protect against the accidental usage of a NULL pointer. The problem occurs once the checks against NULL become your primary means of checking for the state of an object or program.
I'm sure that there is a code smell about using a pointer as a flag somewhere but I haven't found one.
I'll change your question slightly:
Would you use an uninitialized
pointer? You know, one that you didn't
set to NULL or allocate the memory it
points to?
There are two scenarios where setting the pointer to NULL can be skipped:
the pointer variable goes out of scope immediately
you have overloaded the semantic of the pointer and are using its value not only as a memory pointer, but also as a key or raw value. this approach however suffers from other problems.
Meanwhile, arguing that setting the pointer to NULL might hide errors to me sounds like arguing that you shouldn't fix a bug because the fix might hide another bug. The only bugs that might show if the pointer is not set to NULL would be the ones that try to use the pointer. But setting it to NULL would actually cause exactly the same bug as would show if you use it with freed memory, wouldn't it?
If you have no other constraint that forces you to either set or not set the pointer to NULL after you delete it (one such constraint was mentioned by Neil Butterworth), then my personal preference is to leave it be.
For me, the question isn't "is this a good idea?" but "what behavior would I prevent or allow to succeed by doing this?" For example, if this allows other code to see that the pointer is no longer available, why is other code even attempting to look at freed pointers after they are freed? Usually, it's a bug.
It also does more work than necessary as well as hindering post-mortem debugging. The less you touch memory after you don't need it, the easier it is to figure out why something crashed. Many times I have relied on the fact that memory is in a similar state to when a particular bug occurred to diagnose and fix said bug.
Explicitly nulling after delete strongly suggests to a reader that the pointer represents something which is conceptually optional. If I saw that being done, I'd start worrying that everywhere in the source the pointer gets used that it should be first tested against NULL.
If that's what you actually mean, it's better to make that explicit in the source using something like boost::optional
optional<Foo*> p (new Foo);
// (use p.get(), but must test p for truth first!...)
delete p.get();
p = optional<Foo*>();
But if you really wanted people to know the pointer has "gone bad", I'll pitch in 100% agreement with those who say the best thing to do is make it go out of scope. Then you're using the compiler to prevent the possibility of bad dereferences at runtime.
That's the baby in all the C++ bathwater, shouldn't throw it out. :)
In a well structured program with appropriate error checking, there is no reason not to assign it null. 0 stands alone as a universally recognized invalid value in this context. Fail hard and Fail soon.
Many of the arguments against assigning 0 suggest that it could hide a bug or complicate control flow. Fundamentally, that is either an upstream error (not your fault (sorry for the bad pun)) or another error on the programmer's behalf -- perhaps even an indication that program flow has grown too complex.
If the programmer wants to introduce the use of a pointer which may be null as a special value and write all the necessary dodging around that, that's a complication they have deliberately introduced. The better the quarantine, the sooner you find cases of misuse, and the less they are able to spread into other programs.
Well structured programs may be designed using C++ features to avoid these cases. You can use references, or you can just say "passing/using null or invalid arguments is an error" -- an approach which is equally applicable to containers, such as smart pointers. Increasing consistent and correct behavior forbids these bugs from getting far.
From there, you have only a very limited scope and context where a null pointer may exist (or is permitted).
The same may be applied to pointers which are not const. Following the value of a pointer is trivial because its scope is so small, and improper use is checked and well defined. If your toolset and engineers cannot follow the program following a quick read or there is inappropriate error checking or inconsistent/lenient program flow, you have other, bigger problems.
Finally, your compiler and environment likely has some guards for the times when you would like to introduce errors (scribbling), detect accesses to freed memory, and catch other related UB. You can also introduce similar diagnostics into your programs, often without affecting existing programs.
Let me expand what you've already put into your question.
Here's what you've put into your question, in bullet-point form:
Setting pointers to NULL following delete is not universal good practice in C++. There are times when:
it is a good thing to do
and times when it is pointless and can hide errors.
However, there is no times when this is bad! You will not introduce more bugs by explicitly nulling it, you will not leak memory, you will not cause undefined behaviour to happen.
So, if in doubt, just null it.
Having said that, if you feel that you have to explicitly null some pointer, then to me this sounds like you haven't split up a method enough, and should look at the refactoring approach called "Extract method" to split up the method into separate parts.
There is always Dangling Pointers to worry about.
Yes.
The only "harm" it can do is to introduce inefficiency (an unnecessary store operation) into your program - but this overhead will be insignificant in relation to the cost of allocating and freeing the block of memory in most cases.
If you don't do it, you will have some nasty pointer derefernce bugs one day.
I always use a macro for delete:
#define SAFEDELETE(ptr) { delete(ptr); ptr = NULL; }
(and similar for an array, free(), releasing handles)
You can also write "self delete" methods that take a reference to the calling code's pointer, so they force the calling code's pointer to NULL. For example, to delete a subtree of many objects:
static void TreeItem::DeleteSubtree(TreeItem *&rootObject)
{
if (rootObject == NULL)
return;
rootObject->UnlinkFromParent();
for (int i = 0; i < numChildren)
DeleteSubtree(rootObject->child[i]);
delete rootObject;
rootObject = NULL;
}
edit
Yes, these techniques do violate some rules about use of macros (and yes, these days you could probably achieve the same result with templates) - but by using over many years I never ever accessed dead memory - one of the nastiest and most difficult and most time consuming to debug problems you can face. In practice over many years they have effectively eliminated a whjole class of bugs from every team I have introduced them on.
There are also many ways you could implement the above - I am just trying to illustrate the idea of forcing people to NULL a pointer if they delete an object, rather than providing a means for them to release the memory that does not NULL the caller's pointer.
Of course, the above example is just a step towards an auto-pointer. Which I didn't suggest because the OP was specifically asking about the case of not using an auto pointer.
"There are times when it is a good thing to do, and times when it is pointless and can hide errors"
I can see two problems:
That simple code:
delete myObj;
myobj = 0
becomes to a for-liner in multithreaded environment:
lock(myObjMutex);
delete myObj;
myobj = 0
unlock(myObjMutex);
The "best practice" of Don Neufeld don't apply always. E.g. in one automotive project we had to set pointers to 0 even in destructors. I can imagine in safety-critical software such rules are not uncommon. It is easier (and wise) to follow them than trying to persuade
the team/code-checker for each pointer use in code, that a line nulling this pointer is redundant.
Another danger is relying on this technique in exceptions-using code:
try{
delete myObj; //exception in destructor
myObj=0
}
catch
{
//myObj=0; <- possibly resource-leak
}
if (myObj)
// use myObj <--undefined behaviour
In such code either you produce resource-leak and postpone the problem or the process crashes.
So, this two problems going spontaneously through my head (Herb Sutter would for sure tell more) make for me all the questions of the kind "How to avoid using smart-pointers and do the job safely with normal pointers" as obsolete.
If you're going to reallocate the pointer before using it again (dereferencing it, passing it to a function, etc.), making the pointer NULL is just an extra operation. However, if you aren't sure whether it will be reallocated or not before it is used again, setting it to NULL is a good idea.
As many have said, it is of course much easier to just use smart pointers.
Edit: As Thomas Matthews said in this earlier answer, if a pointer is deleted in a destructor, there isn't any need to assign NULL to it since it won't be used again because the object is being destroyed already.
I can imagine setting a pointer to NULL after deleting it being useful in rare cases where there is a legitimate scenario of reusing it in a single function (or object). Otherwise it makes no sense - a pointer needs to point to something meaningful as long as it exists - period.
If the code does not belong to the most performance-critical part of your application, keep it simple and use a shared_ptr:
shared_ptr<Foo> p(new Foo);
//No more need to call delete
It performs reference counting and is thread-safe. You can find it in the tr1 (std::tr1 namespace, #include < memory >) or if your compiler does not provide it, get it from boost.