Is this valid? An acceptable practice?
typedef vector<int> intArray;
intArray& createArray()
{
intArray *arr = new intArray(10000, 0);
return(*arr);
}
int main(int argc, char *argv[])
{
intArray& array = createArray();
//..........
delete &array;
return 0;
}
The code will behave as you intend. The problem is that while programming is partly about writing something for the compiler to process, it is just as much about writing something that other programmers (or you in the future) will understand and be able to maintain. To the compiler, the code you provided is in many cases equivalent to using pointers; to other programmers, it is just a potential source of errors.
References are meant to be aliases to objects that are managed somewhere else, somehow else. In general, people will be surprised when they encounter delete &ref, and in most cases programmers won't expect having to perform a delete on the address of a reference, so chances are that in the future someone is going to call the function and forget about the delete, and you will have a memory leak.
In most cases, memory can be better managed by the use of smart pointers (if you cannot use other high-level constructs like std::vector). By hiding the pointer away behind the reference you make it harder to use smart pointers on the returned reference, and thus you are making it harder, not easier, for users to work with your interface.
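For contrast, here is a minimal sketch of what the same factory might look like with a smart pointer (assuming C++11's std::unique_ptr, which the original question predates; the names are the same as in the question):
#include <memory>
#include <vector>

typedef std::vector<int> intArray;

// Ownership is visible in the signature, and the array is freed automatically
// when the unique_ptr goes out of scope.
std::unique_ptr<intArray> createArray()
{
    return std::unique_ptr<intArray>(new intArray(10000, 0));
}

int main()
{
    std::unique_ptr<intArray> array = createArray();
    //..........
    return 0; // no delete needed
}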
Finally, the good thing about references is that when you read them in code, you know that the lifetime of the object is managed somewhere else and you need not worry about it. By using a reference instead of a pointer you are basically going back to the single solution of old C (only pointers), and suddenly extra care must be taken with every reference to figure out whether memory must be managed there or not. That means more effort, more time spent thinking about memory management, and less time spent on the actual problem being solved -- with the extra strain of unusual code: people are used to looking for memory leaks with pointers and expect none from references.
In a few words: having memory held by reference hides from the user the requirement to handle the memory and makes it harder to do so correctly.
Yes, I think it will work. But if I saw something like this in any code I worked on, I would rip it out and refactor right away.
If you intend to return an allocated object, use a pointer. Please!
It's valid... but I don't see why you'd ever want to do it. It's not exception safe, and std::vector is going to manage the memory for you anyway. Why new it?
EDIT: If you are returning new'd memory from a function, you should return the pointer, lest users of your function's heads explode.
Is this valid?
Yes.
An acceptable practice?
No.
This code has several problems:
The guideline of designing for least surprising behavior is broken: you return something that "looks like" an object but must be deleted by the client code (that should mean a pointer - a reference should be something that always points to a valid object).
Your allocation can fail. Even if you check the result in the allocating function, what will you return? An invalid reference? Do you rely on the allocation throwing an exception in such a case?
As a design principle, consider either creating a RAII object that is responsible for managing the lifetime of your object (in this case a smart pointer) or deleting the pointer at the same abstraction level that you created it:
typedef vector<int> intArray;
intArray& createArray()
{
intArray *arr = new intArray(10000, 0);
return(*arr);
}
void deleteArray(intArray& object)
{
delete &object;
}
int main(int argc, char *argv[])
{
intArray& array = createArray();
//..........
deleteArray(array);
return 0;
}
This design improves coding style consistency (allocation and deallocation are hidden and implemented at the same abstraction level) but it would still make more sense to work through a pointer than a reference (unless the fact that your object is dynamically allocated must remain an implementation detail for some design reason).
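For reference, a sketch of the same pair of functions expressed through pointers rather than references, so the ownership transfer is visible at the call site (an illustrative variation, not code from the question):
typedef vector<int> intArray;

intArray* createArray()
{
    return new intArray(10000, 0);
}

void deleteArray(intArray* object)
{
    delete object;
}

int main(int argc, char *argv[])
{
    intArray* array = createArray();
    //..........
    deleteArray(array);
    return 0;
}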
It will work but I'm afraid it's flat-out unacceptable practice. There's a strong convention in the C++ world that memory management is done with pointers. Your code violates this convention, and is liable to trip up just about anyone who uses it.
It seems like you're going out of your way to avoid returning a raw pointer from this function. If your concern is having to check repeatedly for a valid pointer in main, you can use a reference for the processing of your array. But have createArray return a pointer, and make sure that the code which deletes the array takes it as a pointer too. Or, if it's really as simple as this, simply declare the array on the stack in main and forego the function altogether. (Initialization code in that case could take a reference to the array object to be initialized, and the caller could pass its stack object to the init code.)
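A rough sketch of that last suggestion, with the vector living in main and an initialization routine taking it by reference (the function name is made up for illustration):
typedef vector<int> intArray;

void initArray(intArray& arr)
{
    arr.assign(10000, 0); // fill with 10000 zeros
}

int main(int argc, char *argv[])
{
    intArray array;   // automatic object; the vector manages its own storage
    initArray(array);
    //..........
    return 0;         // no delete anywhere
}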
It is valid in the sense that the compiler will accept it and it will run successfully. However, this kind of coding practice makes code harder for readers and maintainers because of:
Manual memory management
Vague ownership transfer to the client side
There is, however, a subtle point in this question: the efficiency requirement. Sometimes we cannot return by value because the object is too big or bulky, as in this example (10000 * sizeof(int)). For that reason, we should use pointers if we need to transfer such objects to different parts of our code. But that does not mean the implementation above is acceptable, because for this kind of requirement there is a very useful tool: smart pointers. The design decision is up to the programmer, but for this kind of implementation detail the programmer should use accepted patterns, such as smart pointers in this example.
I'll start out by saying, use smart pointers and you'll never have to worry about this.
What are the problems with the following code?
Foo * p = new Foo;
// (use p)
delete p;
p = NULL;
This was sparked by an answer and comments to another question. One comment from Neil Butterworth generated a few upvotes:
Setting pointers to NULL following delete is not universal good practice in C++. There are times when it is a good thing to do, and times when it is pointless and can hide errors.
There are plenty of circumstances where it wouldn't help. But in my experience, it can't hurt. Somebody enlighten me.
Setting a pointer to 0 (which is "null" in standard C++, the NULL define from C is somewhat different) avoids crashes on double deletes.
Consider the following:
Foo* foo = 0; // Sets the pointer to 0 (C++ NULL)
delete foo; // Won't do anything
Whereas:
Foo* foo = new Foo();
delete foo; // Deletes the object
delete foo; // Undefined behavior
In other words, if you don't set deleted pointers to 0, you will get into trouble if you're doing double deletes. An argument against setting pointers to 0 after delete would be that doing so just masks double delete bugs and leaves them unhandled.
It's best to not have double delete bugs, obviously, but depending on ownership semantics and object lifecycles, this can be hard to achieve in practice. I prefer a masked double delete bug over UB.
Finally, a sidenote regarding managing object allocation, I suggest you take a look at std::unique_ptr for strict/singular ownership, std::shared_ptr for shared ownership, or another smart pointer implementation, depending on your needs.
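As a minimal sketch of that suggestion (assuming C++11), the delete-then-null pair simply disappears once the pointer is owned by a smart pointer:
#include <memory>

struct Foo {};

void useFoo()
{
    std::unique_ptr<Foo> p(new Foo);
    // (use *p)
}   // the Foo is deleted here; there is no raw pointer left to set to null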
Setting pointers to NULL after you've deleted what they pointed to certainly can't hurt, but it's often a bit of a band-aid over a more fundamental problem: Why are you using a pointer in the first place? I can see two typical reasons:
You simply wanted something allocated on the heap. In which case wrapping it in a RAII object would have been much safer and cleaner. End the RAII object's scope when you no longer need the object. That's how std::vector works, and it solves the problem of accidentally leaving pointers to deallocated memory around. There are no pointers.
Or perhaps you wanted some complex shared ownership semantics. The pointer returned from new might not be the same as the one that delete is called on. Multiple objects may have used the object simultaneously in the meantime. In that case, a shared pointer or something similar would have been preferable.
My rule of thumb is that if you leave pointers around in user code, you're Doing It Wrong. The pointer shouldn't be there to point to garbage in the first place. Why isn't there an object taking responsibility for ensuring its validity? Why doesn't its scope end when the pointed-to object does?
I've got an even better best practice: Where possible, end the variable's scope!
{
Foo* pFoo = new Foo;
// use pFoo
delete pFoo;
}
I always set a pointer to NULL (now nullptr) after deleting the object(s) it points to.
It can help catch many references to freed memory (assuming your platform faults on a deref of a null pointer).
It won't catch all references to free'd memory if, for example, you have copies of the pointer lying around. But some is better than none.
It will mask a double-delete, but I find those are far less common than accesses to already freed memory.
In many cases the compiler is going to optimize it away. So the argument that it's unnecessary doesn't persuade me.
If you're already using RAII, then there aren't many deletes in your code to begin with, so the argument that the extra assignment causes clutter doesn't persuade me.
It's often convenient, when debugging, to see the null value rather than a stale pointer.
If this still bothers you, use a smart pointer or a reference instead.
I also set other types of resource handles to the no-resource value when the resource is free'd (which is typically only in the destructor of an RAII wrapper written to encapsulate the resource).
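As an illustration of that habit, a bare-bones RAII wrapper around a C-style handle might look like the sketch below; FILE* is just an example and the class name is made up:
#include <cstdio>

class FileHandle {
public:
    explicit FileHandle(const char* path) : file_(std::fopen(path, "r")) {}
    ~FileHandle()
    {
        if (file_) {
            std::fclose(file_);
            file_ = NULL; // reset the handle to the no-resource value
        }
    }
    FILE* get() const { return file_; }
private:
    // copying a handle owner is not meaningful; a real wrapper would disable it
    FileHandle(const FileHandle&);
    FileHandle& operator=(const FileHandle&);
    FILE* file_;
};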
I worked on a large (9 million statements) commercial product (primarily in C). At one point, we used macro magic to null out the pointer when memory was freed. This immediately exposed lots of lurking bugs that were promptly fixed. As far as I can remember, we never had a double-free bug.
Update: Microsoft believes that it's a good practice for security and recommends the practice in their SDL policies. Apparently MSVC++11 will stomp the deleted pointer automatically (in many circumstances) if you compile with the /SDL option.
Firstly, there are a lot of existing questions on this and closely related topics, for example Why doesn't delete set the pointer to NULL?.
In your code, the issue is what goes on in (use p). For example, if somewhere you have code like this:
Foo * p2 = p;
then setting p to NULL accomplishes very little, as you still have the pointer p2 to worry about.
This is not to say that setting a pointer to NULL is always pointless. For example, if p were a member variable pointing to a resource whose lifetime was not exactly the same as the class containing p, then setting p to NULL could be a useful way of indicating the presence or absence of the resource.
If there is more code after the delete, yes. When the pointer is deleted in a destructor or at the end of a method or function, no.
The point of this practice is to remind the programmer, at run-time, that the object has already been deleted.
An even better practice is to use Smart Pointers (shared or scoped) which automagically delete their target objects.
As others have said, delete ptr; ptr = 0; is not going to cause demons to fly out of your nose. However, it does encourage the usage of ptr as a flag of sorts. The code becomes littered with delete and setting the pointer to NULL. The next step is to scatter if (arg == NULL) return; through your code to protect against the accidental usage of a NULL pointer. The problem occurs once the checks against NULL become your primary means of checking for the state of an object or program.
I'm sure that there is a code smell about using a pointer as a flag somewhere but I haven't found one.
I'll change your question slightly:
Would you use an uninitialized pointer? You know, one that you didn't set to NULL or allocate the memory it points to?
There are two scenarios where setting the pointer to NULL can be skipped:
the pointer variable goes out of scope immediately
you have overloaded the semantics of the pointer and are using its value not only as a memory pointer, but also as a key or raw value. This approach, however, suffers from other problems.
Meanwhile, arguing that setting the pointer to NULL might hide errors to me sounds like arguing that you shouldn't fix a bug because the fix might hide another bug. The only bugs that might show if the pointer is not set to NULL would be the ones that try to use the pointer. But setting it to NULL would actually cause exactly the same bug as would show if you use it with freed memory, wouldn't it?
If you have no other constraint that forces you to either set or not set the pointer to NULL after you delete it (one such constraint was mentioned by Neil Butterworth), then my personal preference is to leave it be.
For me, the question isn't "is this a good idea?" but "what behavior would I prevent or allow to succeed by doing this?" For example, if this allows other code to see that the pointer is no longer available, why is other code even attempting to look at freed pointers after they are freed? Usually, it's a bug.
It also does more work than necessary as well as hindering post-mortem debugging. The less you touch memory after you don't need it, the easier it is to figure out why something crashed. Many times I have relied on the fact that memory is in a similar state to when a particular bug occurred to diagnose and fix said bug.
Explicitly nulling after delete strongly suggests to a reader that the pointer represents something which is conceptually optional. If I saw that being done, I'd start worrying that everywhere in the source where the pointer gets used, it should first be tested against NULL.
If that's what you actually mean, it's better to make that explicit in the source using something like boost::optional
optional<Foo*> p (new Foo);
// (use p.get(), but must test p for truth first!...)
delete p.get();
p = optional<Foo*>();
But if you really wanted people to know the pointer has "gone bad", I'll pitch in 100% agreement with those who say the best thing to do is make it go out of scope. Then you're using the compiler to prevent the possibility of bad dereferences at runtime.
That's the baby in all the C++ bathwater, shouldn't throw it out. :)
In a well structured program with appropriate error checking, there is no reason not to assign it null. 0 stands alone as a universally recognized invalid value in this context. Fail hard and Fail soon.
Many of the arguments against assigning 0 suggest that it could hide a bug or complicate control flow. Fundamentally, that is either an upstream error (not your fault (sorry for the bad pun)) or another error on the programmer's behalf -- perhaps even an indication that program flow has grown too complex.
If the programmer wants to introduce the use of a pointer which may be null as a special value and write all the necessary dodging around that, that's a complication they have deliberately introduced. The better the quarantine, the sooner you find cases of misuse, and the less they are able to spread into other programs.
Well structured programs may be designed using C++ features to avoid these cases. You can use references, or you can just say "passing/using null or invalid arguments is an error" -- an approach that is equally applicable to containers, such as smart pointers. Increasingly consistent and correct behavior keeps these bugs from getting far.
From there, you have only a very limited scope and context where a null pointer may exist (or is permitted).
The same may be applied to pointers which are not const. Following the value of a pointer is trivial because its scope is so small, and improper use is checked and well defined. If your toolset and engineers cannot follow the program following a quick read or there is inappropriate error checking or inconsistent/lenient program flow, you have other, bigger problems.
Finally, your compiler and environment likely has some guards for the times when you would like to introduce errors (scribbling), detect accesses to freed memory, and catch other related UB. You can also introduce similar diagnostics into your programs, often without affecting existing programs.
Let me expand what you've already put into your question.
Here's what you've put into your question, in bullet-point form:
Setting pointers to NULL following delete is not universal good practice in C++. There are times when:
it is a good thing to do
and times when it is pointless and can hide errors.
However, there are no times when this is bad! You will not introduce more bugs by explicitly nulling it, you will not leak memory, and you will not cause undefined behaviour.
So, if in doubt, just null it.
Having said that, if you feel that you have to explicitly null some pointer, then to me this sounds like you haven't split up a method enough, and should look at the refactoring approach called "Extract method" to split up the method into separate parts.
There are always dangling pointers to worry about.
Yes.
The only "harm" it can do is to introduce inefficiency (an unnecessary store operation) into your program - but this overhead will be insignificant in relation to the cost of allocating and freeing the block of memory in most cases.
If you don't do it, you will have some nasty pointer dereference bugs one day.
I always use a macro for delete:
#define SAFEDELETE(ptr) { delete(ptr); ptr = NULL; }
(and similar for an array, free(), releasing handles)
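The array and free() variants alluded to above would follow the same pattern (illustrative names, not from the original answer):
#define SAFEDELETEARRAY(ptr) { delete[] (ptr); ptr = NULL; }
#define SAFEFREE(ptr)        { free(ptr); ptr = NULL; }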
You can also write "self delete" methods that take a reference to the calling code's pointer, so they force the calling code's pointer to NULL. For example, to delete a subtree of many objects:
// DeleteSubtree is declared static inside TreeItem; the keyword is not repeated
// on the out-of-class definition.
void TreeItem::DeleteSubtree(TreeItem *&rootObject)
{
    if (rootObject == NULL)
        return;
    rootObject->UnlinkFromParent();
    for (int i = 0; i < rootObject->numChildren; ++i)
        DeleteSubtree(rootObject->child[i]);
    delete rootObject;
    rootObject = NULL;   // force the caller's pointer to NULL
}
Edit:
Yes, these techniques do violate some rules about the use of macros (and yes, these days you could probably achieve the same result with templates) - but by using them over many years I never once accessed dead memory - one of the nastiest, most difficult, and most time-consuming problems to debug that you can face. In practice, over many years, they have effectively eliminated a whole class of bugs from every team I have introduced them on.
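For completeness, the template version hinted at above might look roughly like this (a sketch, not the code actually used on those teams):
template <typename T>
void deleteAndNull(T*& ptr)
{
    delete ptr;
    ptr = NULL;
}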
There are also many ways you could implement the above - I am just trying to illustrate the idea of forcing people to NULL a pointer if they delete an object, rather than providing a means for them to release the memory that does not NULL the caller's pointer.
Of course, the above example is just a step towards an auto-pointer. Which I didn't suggest because the OP was specifically asking about the case of not using an auto pointer.
"There are times when it is a good thing to do, and times when it is pointless and can hide errors"
I can see two problems:
That simple code:
delete myObj;
myObj = 0;
becomes a four-liner in a multithreaded environment:
lock(myObjMutex);
delete myObj;
myObj = 0;
unlock(myObjMutex);
The "best practice" of Don Neufeld doesn't always apply. E.g. in one automotive project we had to set pointers to 0 even in destructors. I can imagine that in safety-critical software such rules are not uncommon. It is easier (and wiser) to follow them than to try to persuade the team/code-checker, for each pointer use in the code, that a line nulling this pointer is redundant.
Another danger is relying on this technique in exceptions-using code:
try {
    delete myObj; // exception thrown from the destructor
    myObj = 0;    // never reached if the destructor throws
}
catch (...)
{
    // myObj = 0; <- possibly a resource leak
}
if (myObj)
    // use myObj <-- undefined behaviour
In such code you either produce a resource leak and postpone the problem, or the process crashes.
So, these two problems that come spontaneously to mind (Herb Sutter would surely name more) make all questions of the kind "How do I avoid smart pointers and do the job safely with normal pointers?" obsolete for me.
If you're going to reallocate the pointer before using it again (dereferencing it, passing it to a function, etc.), making the pointer NULL is just an extra operation. However, if you aren't sure whether it will be reallocated or not before it is used again, setting it to NULL is a good idea.
As many have said, it is of course much easier to just use smart pointers.
Edit: As Thomas Matthews said in this earlier answer, if a pointer is deleted in a destructor, there isn't any need to assign NULL to it since it won't be used again because the object is being destroyed already.
I can imagine setting a pointer to NULL after deleting it being useful in rare cases where there is a legitimate scenario of reusing it in a single function (or object). Otherwise it makes no sense - a pointer needs to point to something meaningful as long as it exists - period.
If the code does not belong to the most performance-critical part of your application, keep it simple and use a shared_ptr:
shared_ptr<Foo> p(new Foo);
//No more need to call delete
It performs reference counting and is thread-safe. You can find it in TR1 (the std::tr1 namespace, #include <memory>), or, if your compiler does not provide it, get it from Boost.
I have seen at least 5 C++ tutorial sites that return pointers this way:
int* somefunction(){
int x = 5;
int *result = &x;
return result;
}
Is that not a very, VERY bad idea? My logic tells me that the memory the returned pointer points to can be overwritten at any time. I would rather think that this would be the right solution:
int* somefunction(){
int x = 5;
int* result = new int;
*(result) = x;
return result;
}
And then leave the calling function to delete the pointer. Or am I wrong?
Your instinct about the problem is correct - UB is the result. However, your proposed solution is le bad. "Leave the caller to delete it" is hideously error-prone and inadvisable. Instead, return it in some owning class that properly encapsulates its intended usage - preferably std::unique_ptr<int>.
Yes, the first option returns a dangling pointer, which leads to undefined behavior. Your second option is correct, although you could just write:
int* somefunction(){
return new int(5);
}
or have a static variable inside the method and return its address.
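The static-variable alternative mentioned above would look roughly like this; note that every caller then shares the same object, which is rarely what you want:
int* somefunction()
{
    static int x = 5; // lives for the whole program, so the address stays valid
    return &x;
}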
Yes, the first example is not good as you'll be returning a pointer to memory that the system may re-purpose for something else. The second example is better, but still risks leaking memory as it's not clear to the caller of somefunction that it's their responsibility to delete the memory that's pointed at.
Something like this might be better:
std::unique_ptr<int> somefunction(){
int x = 5;
std::unique_ptr<int> result( new int );
*result = x;
return result;
}
This way, the unique_ptr will take care of delete'ing the memory that you new'ed, and will help to eliminate potential memory leaks.
Your question mixes several different issues into one. In reality, the main question here is whether you really need that mixture.
There's no such thing as "returning a pointer" by itself. You don't "return a pointer" just because you want to "return a pointer". Returning pointers is done for some specific reason and that reason will dictate how it is done and what needs to be done in order to ensure that it works properly.
Your original example does not really illustrate that since in your original example there's simply no meaningful reason to return a pointer. It looks like you can simply return an int.
For example, in many cases you'll want to return a pointer because it is a pointer to a dynamically allocated object, i.e. an object whose lifetime is not subject to the scoping rules of the language. Note that the causal relationship in this case works in the opposite direction: you need a dynamic object -> you have to return a pointer. It works that way, not the other way around. In your example you seem to use it backwards: I want to return a pointer -> I have to allocate the object dynamically. That latter reasoning is fundamentally flawed, although one might see it used more often than one would expect.
If you really need a dynamically allocated object (for which, as I said above, the main reason is to override the scope-based lifetime rules of the language), then memory ownership becomes an issue. In order to know when this memory can/has to be deallocated and who has to deallocate it, you have to implement either an exclusive (one designated owner at any moment) or a shared (like reference counting) ownership scheme. It can be done with raw pointers, but a better idea would be to use the various smart pointer classes provided by the libraries.
But in many situations you can also return pointers to non-dynamic objects (static or automatic), which is perfectly fine assuming that the lifetime of these pointers is the same or shorter than the lifetime of the objects they point to.
In other words, the reasoning behind the decision to return a pointer is not really different between C and C++. It is more design/intent-related than language-related. It is just that C++ provides you with more tools to make your life easier once you already decided to return a pointer. (Which sometimes works as incentive for C++ programmers to overuse concealed pointers).
In any case, again, this is an issue of what functionality you are trying to implement. Once you know that, you can make a good decision about whether you should "return a pointer" or not. And if you finally decided to return a pointer, it will help you to choose the proper return method. That's how it works. Trying to think about it backwards ("I just want to return a pointer, but I don't have a real reason for it yet") will only produce academically useless answers, each of which can be shown to be "wrong" in some specific circumstances.
As you suspected and others clarified, the first method is clearly wrong. Although C++ is a systems language and there are some circumstances where you might want to do this (it would return a particular, relative location on the stack, on most systems), it's ALMOST NEVER right. The second method is much more sane.
However, neither method should be encouraged in C++. One of the main points of C++ is that you now have references and exceptions rather than just the pointers of C. So what you do is return a reference, and allow new to throw an exception up the stack, if the memory allocation fails.
Don't forget to delete the pointer after invocation of your method, which is correct.
T *p = new T();
For a pointer to the heap, there can be disastrous operations such as:
p++; // (1) scope missed
p = new T(); // (2) re-assignment
These would result in memory leaks or crashes due to a wrong delete. Apart from using smart pointers, is it always advisable to make a heap pointer const?
T* const p = new T(); // now "p" is not modifiable
This question is in regards of maintaining good programming practice and coding style.
Well, about the only time I use raw heap pointers is when writing my own data structures. And if you used a const pointer for them, your data structure immediately becomes unassignable, which may or may not be what you want.
I hesitate to say always, but what you propose seems reasonable for many/most cases. Const correctness is something most C++ folks pay a fair bit of attention to in function parameters, but not so much in local (or even member) variables. We might be better off to do so.
There is one major potential problem I see with this:
after delete you should set the pointer to NULL to (help) prevent it being used in other parts of the code.
Const will not allow you to do this.
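A short sketch of that conflict, reusing T from the question:
T* const p = new T();
// (use p)
delete p;
// p = NULL;   // error: cannot assign to a const pointer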
I believe it is unusual to use a pointer if it is not going to change. Perhaps because I hardly ever use new in normal code, but rely on container classes.
Otherwise it is generally a good idea to preserve const-correctness by making things const whenever possible.
Title says it.
Sample of bad practice:
std::vector<Point>* FindPoints()
{
std::vector<Point>* result = new std::vector<Point>();
//...
return result;
}
What's wrong with it if I delete that vector later?
I mostly program in C#, so this problem is not very clear for me in C++ context.
As a rule of thumb, you don't do this because the less you allocate on the heap, the less you risk leaking memory. :)
std::vector is useful also because it automatically manages the memory used for the vector in RAII fashion; by allocating it on the heap you now require an explicit deallocation (with delete result) to avoid leaking its memory. Things are made more complicated by exceptions, which can alter your return path and skip any delete you put along the way. (In C# you don't have such problems because inaccessible memory is simply reclaimed periodically by the garbage collector.)
If you want to return an STL container you have several choices:
Just return it by value; in theory you would incur a copy penalty because of the temporaries created in the process of returning result, but newer compilers should be able to elide the copy using NRVO (see the note after this list). There may also be std::vector implementations that implement a copy-on-write optimization like many std::string implementations do, but I've never heard of one.
On C++0x compilers, instead, move semantics should kick in, avoiding any copy.
Store the result pointer in an ownership-transferring smart pointer like std::auto_ptr (or std::unique_ptr in C++0x), and change the return type of your function to std::auto_ptr<std::vector<Point> >; that way, your pointer is always encapsulated in a stack object that is automatically destroyed when the function exits (by any path) and that destroys the vector if it still owns it. Also, it's completely clear who owns the returned object.
Make the result vector a parameter passed by reference by the caller, and fill that one instead of returning a new vector.
Hardcore STL option: you would instead provide your data as iterators; the client code would then use std::copy + std::back_inserter or whatever to store the data in whichever container it wants. Not seen much (it can be tricky to code right) but it's worth mentioning; a small sketch is given below.
As #Steve Jessop pointed out in the comments, NRVO works completely only if the return value is used directly to initialize a variable in the calling method; otherwise, it would still be able to elide the construction of the temporary return value, but the assignment operator for the variable to which the return value is assigned could still be called (see #Steve Jessop's comments for details).
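Below is a small sketch of the "hardcore STL" option from the list above; the Point definition and function signature are illustrative only:
#include <iterator>
#include <vector>

struct Point { int x, y; };

// The function writes through an output iterator; the caller decides which
// container (if any) receives the points.
template <typename OutputIterator>
void FindPoints(OutputIterator out)
{
    Point p = { 1, 2 };
    *out++ = p;
    // ... more points ...
}

int main()
{
    std::vector<Point> points;
    FindPoints(std::back_inserter(points));
}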
Creating anything dynamically is bad practice unless it's really necessary. There's rarely a good reason to create a container dynamically, so it's usually not a good idea.
Edit: Usually, instead of worrying about things like how fast or slow returning a container is, most of the code should deal only with an iterator (or two) into the container.
Creating objects dynamically in general is considered a bad practice in C++. What if an exception is thrown from your "//..." code? You'll never be able to delete the object. It is easier and safer to simply do:
std::vector<Point> FindPoints()
{
std::vector<Point> result;
//...
return result;
}
Shorter, safer, more straightforward... As for the performance, modern compilers will optimize away the copy on return, and if they cannot, move constructors will be executed, so this is still a cheap operation.
Perhaps you're referring to this recent question: C++: vector<string> *args = new vector<string>(); causes SIGABRT
One liner: It's bad practice because it's a pattern that's prone to memory leaks.
You're forcing the caller to accept dynamic allocation and take charge of its lifetime. It's ambiguous from the declaration whether the pointer returned is a static buffer, a buffer owned by some other API (or object), or a buffer that's now owned by the caller. You should avoid this pattern in any language (including plain C) unless it's clear from the function name what's going on (e.g. strdup, malloc).
The usual way is to instead do this:
void FindPoints(std::vector<Point>* ret) {
std::vector<Point> result;
//...
ret->swap(result);
}
void caller() {
//...
std::vector<Point> foo;
FindPoints(&foo);
// foo deletes itself
}
All objects are on the stack, and all the deletion is taken care of by the compiler. Or just return by value, if you're running a C++0x compiler+STL, or don't mind the copy.
I like Jerry Coffin's answer. Additionally, if you want to avoid returning a copy, consider passing the result container as a reference, and the swap() method may be needed sometimes.
void FindPoints(std::vector<Point> &points)
{
std::vector<Point> result;
//...
result.swap(points);
}
Programming is the art of finding good compromises. Dynamically allocated memory has its place, of course, and I can even think of problems where a good compromise between code complexity and efficiency is obtained using std::vector<std::vector<T>*>.
However std::vector does a great job of hiding most needs of dynamically allocated arrays, and managed pointers are many times just a perfect solution for dynamically allocated single instances. This means that it's just not so common finding cases where an unmanaged dynamically allocated container (or dynamically allocated whatever, actually) is the best compromise in C++.
This in my opinion doesn't make dynamic allocation "bad", but merely "suspect" if you see it in code, because there is a high probability that better solutions are possible.
In your case, for example, I see no reason for using dynamic allocation; just making the function return a std::vector would be efficient and safe. With any decent compiler, Return Value Optimization will be used when assigning to a newly declared vector, and if you need to assign the result to an existing vector you can still do something like:
FindPoints().swap(myvector);
that will not do any copying of the data but just some pointer twiddling (note that you cannot use the apparently more natural myvector.swap(FindPoints()) because of a sometimes-annoying C++ rule that forbids binding temporaries to non-const references).
In my experience the biggest source of needs of dynamically allocated objects are complex data structures where the same instance can be reached using multiple access paths (e.g. instances are at the same time both in a doubly linked list and indexed by a map). In the standard library containers are always the only owner of the contained objects (C++ is a copy semantic language) so it may be difficult to implement those solutions efficiently without the pointer and dynamic allocation concept.
Often, however, you can still find reasonable-enough compromises that just use standard containers (maybe paying for some extra O(log N) lookups that you could have avoided), and that, considering the much simpler code, can IMO be the best compromise in most cases.