Is "delete p; p = NULL(nullptr);" an antipattern? - c++

Searching something on SO, I stumbled across this question and one of the comments to the most voted answer (the fifth comment to that most voted answer) suggests that delete p; p = NULL; is an antipattern. I must confess that I happen to use it quite often, pairing it sometimes/most of the time with the check if (NULL != p). The Man himself seems to suggest it (please see the destroy() function example) so I'm really confused as to why it might be such a dreaded thing to be considered an antipattern. I use it for the following reasons:
when I'm releasing a resource I also want to kind of invalidate it for further usage and NULL is the right tool to say a pointer is invalid
I don't want to leave behind dangling pointers
I want to avoid double/multiple free bugs - deleting a NULL pointer is like a no-op but deleting a dangling pointer is like "shooting yourself in the foot"
Please note that I'm not asking the question in the context of the "this" pointer and let's assume we don't live in a perfect C++ world and that legacy code does exist and it has to be maintained, so please do not suggest any kind of smart pointer :).

Yes, I would not recommend doing it.
The reason is that the extra setting to null will only help in very limited contexts. If you are in a destructor, the pointer itself will not exist right after the destructor execution, which means that whether it is null or not does not matter.
If the pointer was copied, setting p to null will not set the rest of the pointers, and you can then hit the same problems with the extra issue that you will be expecting to find deleted pointers being null, and it won't make sense how your pointer became non-zero and yet the object is not there....
Additionally it might hide other errors. For example, if your application tries to delete a pointer twice, then by setting the pointer to null the second delete is converted into a no-op, and while the application will not crash, the error in the logic is still there (consider a later edit that accesses the pointer right before the delete that is not failing... how can that fail?).

I recommend doing so.
Obviously, it's valid in a context where NULL is a valid value of the pointer. This, of course, means if it's used in other places, it must be checked.
Even if the pointer may, technically, not be NULL, it does help in real-world scenarios when customers send you a crash dump. If it's NULL and it's not supposed to be (and it wasn't trapped in testing with the assert() that you should do), then it's easy to see that this is the problem - you'll crash at something like mov eax, [edx+4] or something, you'll see that edx is 0, and you know what the problem is. If, on the other hand, you don't NULL the pointer, but it is deleted, then all sorts of things may happen. It may work, it may crash immediately, it may crash later, it may show weird things - at this point, anything that happens is soft-non-deterministic.
Defensive programming is King. That includes setting a pointer to NULL extraneously, even if you think you don't have to, and doing that extra NULL check in a few places even though you technically know it isn't supposed to happen.
Even if you have the logic error of a pointer going through delete twice, it's better to not crash and handle it safely than to crash. That may mean you do that extra check, and you'll probably want to log that, or maybe even end the program gracefully, but it's not just a crash. Customers don't like that.
Personally, I use an inline function that takes the pointer as a reference and sets it to NULL.
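A minimal sketch of such a helper, under my own naming (the answerer's actual code is not shown in the thread):

#include <cstddef>   // for NULL

template <typename T>
inline void deleteAndNull(T *&p)
{
    delete p;    // delete on a null pointer is already a no-op
    p = NULL;    // leave the caller's copy in a recognizably invalid state
}

It would be called as deleteAndNull(somePtr); taking the pointer by reference is what lets the helper null the caller's own variable rather than a copy.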

No, it's not an anti-pattern.
Setting a pointer to NULL is a perfectly good way of indicating that the pointer is no longer pointing at anything valid. In fact, that's exactly what NULL value is intended to mean.
If there's an anti-pattern to be avoided here, the anti-pattern would be not having a simple and well-defined set of rules/conventions for how your program manages its memory. It doesn't matter so much what those rules are, as long as they work to avoid leaks and crashes, and as long as you can and do follow them consistently.

It depends on how the pointer is used. If all the code that can see the pointer should "know" that it's no longer valid — especially in a destructor where the pointer is about to go out of scope anyway — there's no need to set it to null.
On the other hand, the pointer may represent an object that sometimes exists, sometimes doesn't, and you have code like if (p) { p->doStuff(); } to act upon the object when it exists. In that case, obviously, you should set it to null after deleting the object.
The important distinction in the latter case is that the lifetime of the pointer variable is much longer than the lifetime of the objects it (sometimes) points to, and its null-ness carries some significant meaning that has to be communicated to other parts of the program.
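A small sketch of that "sometimes exists" pattern (the Widget type and the function names are illustrative):

#include <cstddef>   // for NULL

struct Widget { void doStuff() {} };

Widget *w = NULL;                        // the pointer variable far outlives the objects it points to

void enable()  { if (!w) w = new Widget; }
void disable() { delete w; w = NULL; }   // null-ness now tells the rest of the code "no object"
void tick()    { if (w) w->doStuff(); }  // act on the object only when it exists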

In my opinion the anti-pattern is not delete p; p = NULL;, it's assert(this != NULL).
I use the anti-pattern as you call it for two reasons - first to enhance the likelihood that bad code will crash spectacularly without hiding, and second to make the core problem more obvious in debugging. I wouldn't litter my code with asserts just on the off chance that it might catch something.

Although I think @Mark Ransom is on sort of the right track, I would suggest there's an even more fundamental problem than just assert(this!=NULL).
I think it's much more telling that you're using raw pointers and new (directly) often enough to care. While this isn't (necessarily) a code smell/anti-pattern/whatever in itself, it often points toward code that's unnecessarily similar to C, and isn't taking advantage of the things like containers in the standard library.
Even if the standard containers don't fit your needs well, you can still normally wrap your pointer up into a small enough, simple enough package that techniques like this are irrelevant -- you've restricted access to the pointer in question to such a small amount of code that you can glance through it and assure that it's only ever used when it's valid. As Hoare said, there are two ways of doing things: one is to keep the code so simple there are obviously no deficiencies; the other is to make the code so complex there are no obvious deficiencies. This technique only appears relevant (to me) when you're already dealing with the latter case.
Ultimately, I see the desire to do this at all as basically admitting defeat -- rather than even attempting to assure that the code is correct, you've done the equivalent of setting up a bug-zapper next to a swamp. It reduces the appearance of bugs in a small area by a small percentage, but leaves so many more free to breed that if there's any effect on the total population, it's far too small to measure.

The reason is that the extra setting to null will only help in very limited contexts. If you are in a destructor, the pointer itself will not exist right after the destructor execution, which means that whether it is null or not does not matter.
Got to correct this statement since it is false in C++.
The other functions of an object being destroyed may be called while the object is getting destroyed (because somehow the destruction process requires it). It is generally viewed as ugly, but there is not always a good workaround.
Therefore, clearing your pointers may be the only good solution to avoid problems (i.e. these other functions being called can then test to see whether the object is valid before using it.)
Still, a good idea in C++ is to use smart objects (probably what you are talking about): more or less, a class that holds a reference to an object and makes sure said object is released when the destructor is hit, rather than having one object destroy many others all at once (although the result is the same, it is cleaner).
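A minimal sketch of such a smart object, here shown with std::unique_ptr, the standard library's version of the idea (my example, assuming C++11 is available):

#include <memory>

struct Resource { void use() {} };

class Owner {
public:
    Owner() : res(new Resource) {}        // resource acquired here...
    // ...and released automatically in Owner's destructor; no manual delete / null needed
    void work() { if (res) res->use(); }
private:
    std::unique_ptr<Resource> res;
};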

Related

Should I reset primitive member variable in destructor?

Please see the following code:
#include <iostream>
#include <windows.h>   // for Sleep()
using namespace std;

class MyClass
{
public:
    int i;
    MyClass()
    {
        i = 10;
    }
};

MyClass* pObj = nullptr;

int main()
{
    {
        MyClass obj;
        pObj = &obj;
    }
    while (1)
    {
        cout << pObj->i; // pObj is a dangling pointer, still no crash.
        Sleep(1000);
    }
    return 0;
}
obj will die once it goes out of scope. But when I tested this in VS 2017, I saw no crash even after using it.
Is it good practice to reset the int member variable i?
Accessing a member after an object got destroyed is undefined behavior. It may seem like a good idea to set members in a destructor to a predictable and most likely unexpected value, e.g., a rather large value or a value with a specific bit pattern, making it easy to recognize the value in a debugger.
However, this idea is flawed and thwarted by the system:
All classes would need to play along, and instead of concentrating on creating correct code, developers would spend time (both development time as well as run-time) making pointless changes.
Compilers happen to be rather smart and can detect that changes in a destructor are not needed. Since a correct program cannot detect whether the change was made, they may not make the change at all. This effect is an actual issue for security applications where, e.g., a password should be erased from memory so it cannot be read (using some non-portable means).
Even if the value gets set to a specific value, memory gets reused and the values get overwritten. Especially with objects on the stack it is most likely that the memory is used for something else before you see the bad value in a debugger.
Even when resetting values you would not necessarily see a "crash": a crash is caused by something being set up to protect against something invalid. In your example you are accessing an int on the stack: the stack will remain accessible from a CPU point of view, and at best you'd get an unexpected value. Use of unusual pointer values typically leads to a crash because the memory management system tries to access a location which isn't mapped, but even that isn't guaranteed: on a busy 32 bit system pretty much all memory may be in use. That is, trying to rely on undefined behavior being detected is also futile.
Correspondingly, it is much better to use good coding practices which avoid dangling references right away and concentrate on using these. For example, I'm always initializing members in the member initializer list, even in the rare cases they end up getting changed in the body of the constructor (i.e., you'd write your constructor as MyClass(): i() {}).
As a debugging tool it may be reasonable to replace the allocation functions (ideally the allocator object, but potentially the global operator new()/operator delete() and family) with a version which doesn't quickly hand out released memory and instead fills the released memory with a predictable pattern. Since these actions slow down the program you'd only use this code in a debug build, but since it is relatively simple to implement once and easy to enable/disable centrally, it may be worth the effort. In practice I don't think even such a system pays off, as use of managed pointers and proper design of ownership and lifetime avoid most errors due to dangling references.
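A rough sketch of that debugging idea, assuming a debug-only build; a real version would also replace operator new[]/delete[], delay reuse of released blocks, and handle the sized and aligned overloads:

#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>

void* operator new(std::size_t n)
{
    const std::size_t header = alignof(std::max_align_t);   // keep the user block aligned
    void* raw = std::malloc(n + header);
    if (!raw) throw std::bad_alloc();
    *static_cast<std::size_t*>(raw) = n;                    // stash the size in front of the block
    return static_cast<char*>(raw) + header;
}

void operator delete(void* p) noexcept
{
    if (!p) return;
    char* raw = static_cast<char*>(p) - alignof(std::max_align_t);
    std::size_t n = *reinterpret_cast<std::size_t*>(raw);
    std::memset(p, 0xDD, n);                                // scribble a recognizable pattern over freed memory
    std::free(raw);
}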
The behaviour of the code you gave is undefined. One particular case of undefined behaviour is working as expected, so there is nothing strange about the code working. The code can work now and it can break at any time, depending on the compiler version, compiler options, stack content and the phase of the moon.
So first and most important is to avoid dangling pointers (and all other kinds of undefined behaviour) everywhere.
As for clearing variables in the destructor, here is what I consider best practice:
Follow coding rules that save me from mistakes of accessing unallocated or destroyed objects. I cannot describe them in a few words, but the rules are pretty common (see here and elsewhere).
Have the code analyzed by humans (code review) or by static analyzers (like cppcheck or PVS-Studio or another) to avoid cases similar to the one you described above.
Do not call delete manually; better to use scoped_ptr or similar object lifetime managers. When delete is reasonable, I usually (usually) set the pointer to nullptr after deletion to keep myself from mistakes.
Use pointers as rarely as possible. References are preferred.
When objects of my class are used outside and I suspect that somebody can access them after deletion, I can put a signature field inside, set it to something like 0xDEAD in the destructor and check it on entry to every public method (see the sketch after this list). Here be careful not to slow down your code to an unacceptable speed.
After all of this, setting i from your example to 0 or -1 is redundant. For me, it's not a thing you should focus your attention on.
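A sketch of the signature-field idea from the list above (the class, the marker values and the method are illustrative):

#include <cassert>

class Tracked {
public:
    Tracked()  : signature_(0xA11FE) {}   // "alive" marker set on construction
    ~Tracked() { signature_ = 0xDEAD; }   // stamped over on destruction

    void doWork() {
        assert(signature_ == 0xA11FE && "Tracked object used after destruction?");
        // ... real work ...
    }

private:
    unsigned signature_;
};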

How bad is "if (!this)" in a C++ member function?

If I come across old code that does if (!this) return; in an app, how severe a risk is this? Is it a dangerous ticking time bomb that requires an immediate app-wide search and destroy effort, or is it more like a code smell that can be quietly left in place?
I am not planning on writing code that does this, of course. Rather, I've recently discovered something in an old core library used by many pieces of our app.
Imagine a CLookupThingy class has a non-virtual CThingy *CLookupThingy::Lookup( name ) member function. Apparently one of the programmers back in those cowboy days encountered many crashes where NULL CLookupThingy *s were being passed from functions, and rather than fixing hundreds of call sites, he quietly fixed up Lookup():
CThingy *CLookupThingy::Lookup( name )
{
    if (!this)
    {
        return NULL;
    }
    // else do the lookup code...
}

// now the above can be used like
CLookupThingy *GetLookup()
{
    if (notReady()) return NULL;
    // else etc...
}

CThingy *pFoo = GetLookup()->Lookup( "foo" ); // will set pFoo to NULL without crashing
I discovered this gem earlier this week, but now am conflicted as to whether I ought to fix it. This is in a core library used by all of our apps. Several of those apps have already been shipped to millions of customers, and it seems to be working fine; there are no crashes or other bugs from that code. Removing the if !this in the lookup function will mean fixing thousands of call sites that potentially pass NULL; inevitably some will be missed, introducing new bugs that will pop up randomly over the next year of development.
So I'm inclined to leave it alone, unless absolutely necessary.
Given that it is technically undefined behavior, how dangerous is if (!this) in practice? Is it worth man-weeks of labor to fix, or can MSVC and GCC be counted on to safely return NULL?
Our app compiles on MSVC and GCC, and runs on Windows, Ubuntu, and MacOS. Portability to other platforms is irrelevant. The function in question is guaranteed to never be virtual.
Edit: The kind of objective answer I am looking for is something like
"Current versions of MSVC and GCC use an ABI where nonvirtual members are really statics with an implicit 'this' parameter; therefore they will safely branch into the function even if 'this' is NULL" or
"a forthcoming version of GCC will change the ABI so that even nonvirtual functions require loading a branch target from the class pointer" or
"the current GCC 4.5 has an inconsistent ABI where sometimes it compiles nonvirtual members as direct branches with an implicit parameter, and sometimes as class-offset function pointers."
The first means the code is stinky but unlikely to break; the second is something to test after a compiler upgrade; the third requires immediate action even at high cost.
Clearly this is a latent bug waiting to happen, but right now I'm only concerned with mitigating risk on our specific compilers.
I would leave it alone. This might have been a deliberate choice as an old-fashioned version of the SafeNavigationOperator. As you say, in new code, I wouldn't recommend it, but for existing code, I'd leave it alone. If you do end up modifying it, I'd make sure that all calls to it are well-covered by tests.
Edit to add: you could choose to remove it only in debug versions of your code via:
CThingy *CLookupThingy::Lookup( name )
{
#if !defined(DEBUG)
    if (!this)
    {
        return NULL;
    }
#endif
    // else do the lookup code...
}
Thus, it wouldn't break anything on production code, while giving you a chance to test it in debug mode.
Like all undefined behavior
if (!this)
{
    return NULL;
}
this is a bomb waiting to go off. If it works with your current compilers, you are kind-of lucky, kind-of unlucky!
The next release of the same compilers might be more aggressive and see this as dead code. As this can never be null, the code can "safely" be removed.
I think it is better if you removed it!
If you have many GetLookup functions that return NULL, then you're better off fixing the code that calls methods through a NULL pointer. First, replace
if (!this) return NULL;
with
if (!this) {
    // TODO(Crashworks): Replace this case with an assertion on July, 2012, once all callers are fixed.
    printf("Please mail the following stack trace to myemailaddress. Thanks!");
    print_stacktrace();
    return NULL;
}
Now, carry on with your other work, but fix these as they roll in. Replace:
GetLookup(x)->Lookup(y)...
with
convert_to_proxy(GetLookup(x))->Lookup(y)...
Where convert_to_proxy returns the pointer unchanged, unless it's NULL, in which case it returns a FailedLookupObject as in my other answer.
It may not crash in most compilers since non-virtual functions are typically either inlined or translated into non-member functions taking "this" as a parameter. However, the standard specifically says that calling a non-static member function outside the lifetime of the object is undefined, and the lifetime of an object is defined as beginning when memory for the object has been allocated and the constructor has completed, if it has non-trivial initialization.
The standard only makes an exception to this rule for calls made by the object itself during construction or destruction, but even then one must be careful because the behavior of virtual calls can differ from the behavior during the object's lifetime.
TL;DR: I'd kill it with fire, even if it will take a long time to clean up all the call sites.
Future versions of the compiler are likely to more aggressively optimize in cases of formally undefined behavior. I wouldn't worry about existing deployments (where you know the behavior the compiler actually implemented), but it should be fixed in the source code in case you ever use a different compiler or different version.
This is something that's called 'a smart and ugly hack'. Note: smart != wise.
Finding all the call sites without any refactoring tools should be easy enough: break GetLookup() somehow so it doesn't compile (e.g. change its signature) so you can identify misuse statically. Then add a function called DoLookup() which does what all these hacks are doing right now.
In this case I'd suggest removing the NULL check from the member function and create a non-member function
CThingy* SafeLookup(CLookupThing *lookupThing) {
    if (lookupThing == NULL) {
        return NULL;
    } else {
        return lookupThing->Lookup();
    }
}
Then it should be easy enough to find every call to the Lookup member function and replace it with the safe non-member function.
If it's something that's bothering you today, it'll bother you a year from now. As you pointed out, changing it will almost certainly introduce some bugs -- but you can begin by retaining the return NULL functionality, add a bit of logging, let it run in the wild for a few weeks, and find how many times it even gets hit?
This is a "ticking bomb" only if you are pedantic about the wording of the specification. However, regardless, it is a terrible, ill-advised approach because it obscures a program error. For that reason alone, I would remove it, even if it means considerable work. It is not an immediate (or even middle-term) risk, but it just isn't a good approach.
Such error hiding behavior really isn't something you want to rely on, either. Imagine you rely on this behavior (i.e. it doesn't matter whether my objects are valid, it will work anyway!) and then, by some chance, the compiler optimizes out the if statement in a particular case because it can prove that this is not a null pointer. That is a legitimate optimization not just for some hypothetical future compiler, but for very real, present-time compilers as well.
But of course, since your program isn't well-formed, it happens that at some point you pass it a null this around 20 corners. Bang, you're dead.
That's very contrived, admittedly, and it probably won't happen, but you cannot be 100% certain that it cannot possibly happen.
Note that when I shout out "remove!", that does not mean the whole lot of them must be removed immediately or in one massive manpower operation. You could remove these checks one by one as you encounter them (when you change something in the same file anyway, to avoid recompilations), or just text-search for one (preferably in a highly used function), and remove that one.
Since you are using GCC, you may be interested in __builtin_return_address, which may help you remove these checks without massive manpower, without totally disrupting the whole workflow, and without rendering the application unusable in the meantime.
Before removing the check, modify it to output the caller's address, and addr2line will tell you the location in your source. That way, you should be able to quickly identify all the locations in the application that are behaving wrongly, so you can fix them.
So instead of
if(!this) return 0;
change one location at a time to something like:
if(!this) { __builtin_printf("!!! %p\n", __builtin_return_address(0)); return 0; }
That lets you identify the invalid call sites for this change while still letting the program "work as intended" (you can also query the caller's caller if needed). Fix every ill-behaved location, one by one. The program will still "work" as normal.
Once no more addresses come up, remove the check altogether. You might still have to fix one crash or another if you are unlucky (because it didn't show up while you tested), but that should be a very rare thing to happen. In any case, it should prevent your co-worker from shouting at you.
Remove one or two checks per week, and eventually none will be left. Meanwhile, life goes on and nobody notices what you're doing at all.
TL;DR
As for "current versions of GCC", you are fine for non-virtual functions, but of course nobody can tell what a future version might do. I deem it however highly unlikely that a future version will cause your code to break. Not few existing projects have this kind of check (I remember we had literally hundreds of them in Code::Blocks code completion at some time, don't ask me why!). Compiler makers probably don't want to make dozens/hundreds of major project maintainers unhappy on purpose, only to prove a point.
Also, consider the last paragraph ("from a logical point of view"). Even if this check will crash and burn with a future compiler, it will crash and burn anyway.
The if(!this) return; statement is somewhat useless insofar as this cannot ever be a null pointer in a well-formed program (it means you called a member function on a null pointer). This does not mean, of course, that it couldn't possibly happen. But in this case, the program should crash hard or abort with an assertion. Under no conditions should such a program continue silently.
On the other hand, it is perfectly possible to call a member function on an invalid object that happens to be not null. Checking whether this is the null pointer obviously doesn't catch that case, but it is the exact same UB. So, apart from hiding wrong behavior, this check also only detects one half of the problematic cases.
If you are going by the wording of the specification, using this (which includes checking whether it's a null pointer) is undefined behavior. So, strictly speaking, it is a "time bomb". However, I would not reasonably deem that a problem, both from a practical point of view and from a logical one.
From a practical point of view, it doesn't really matter whether you read a pointer that is not valid, as long as you do not dereference it. Yes, strictly to the letter, this is not allowed. Yes, in theory someone might build a CPU which checks invalid pointers when you load them, and faults. But realistically, this isn't the case.
From a logical point of view, assuming that the check will blow up, it still isn't going to happen. For this statement to be executed, the member function must be called, and the call (virtual or not, inlined or not) uses an invalid this, which it makes available inside the function body. If one illegitimate use of this blows up, the other will, too. Thus, the check is obsolete because the program already crashes earlier.
n.b.: This check is very similar to the "safe delete idiom" which sets a pointer to nullptr after deleting it (using a macro or a templated safe_delete function). Presumably, this is "safe" because it doesn't crash deleting the same pointer twice. Some people even add a redundant if(!ptr) delete ptr;.
As you know, operator delete is guaranteed to be a no-op on a null pointer. Which means no more and no less than by setting a pointer to the null pointer, you have successfully eliminated the only chance to detect double deletion (which is a program error that needs to be fixed!). It is not any "safer", but it instead hides incorrect program behavior. If you delete an object twice, the program should crash hard.
You should either leave a deleted pointer alone, or, if you insist on tampering, set it to a non-null invalid pointer (such as the address of a special "invalid" global variable, or a magic value like -1 if you will -- but you should not try to cheat and hide the crash when it is to occur).
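One way that suggestion could look (the type name and the magic address are illustrative, and of course not portable):

struct Thing { int x; };

// a recognizable, non-null address that is very unlikely to be mapped
Thing* const INVALID_THING = reinterpret_cast<Thing*>(0xDEADBEEF);

void destroy(Thing*& p)
{
    delete p;
    p = INVALID_THING;   // a later dereference or double delete now fails loudly,
                         // instead of being silently turned into a no-op
}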
You can safely fix this today by returning a failed lookup object.
class CLookupThingy: public Interface {
    // ...
};

class CFailedLookupThingy: public Interface {
public:
    CThingy* Lookup(string const& name) {
        return NULL;
    }
    operator bool() const { return false; } // So that GetLookup() can be tested in a condition.
} failed_lookup;

Interface *GetLookup() {
    if (notReady())
        return &failed_lookup;
    // else etc...
}
This code still works:
CThingy *pFoo = GetLookup()->Lookup( "foo" ); // will set pFoo to NULL without crashing
It's my personal opinion that you should fail as early as possible to alert you to problems. In that case, I'd unceremoniously remove each and every occurrence of if(!this) I could find.

C++: Safe to use locals of caller in function?

I think it's best if I describe the situation using a code example:
int MyFuncA()
{
    MyClass someInstance;
    // <Work with and fill someInstance...>
    MyFuncB( &someInstance );
    return 0;
}

int MyFuncB( MyClass* instance )
{
    // Do anything you could imagine with instance, *except*:
    // * Allowing references to it or any of its data members to escape this function
    // * Freeing anything the class will free in its destructor, including itself
    instance->DoThis();
    instance->ModifyThat();
    return 0;
}
And here come my straightforward questions:
Is the above concept guaranteed, by the C and C++ standards, to work as expected? Why (not)?
Is doing this, sparingly and with care, considered bad practice?
Is the above concept guaranteed, by the C and C++ standards, to work as expected? Why (not)?
Yes, it will work as expected. someInstance is available through the scope of MyFuncA. The call to MyFuncB is within that scope.
Is doing this, sparingly and with care, considered bad practice?
Don't see why.
I don't see any problem in actually using the pointer you were passed to call functions on the object. As long as you call public methods of MyClass, everything remains valid C/C++.
The actual instance you create at the beginning of MyFuncA() will get destroyed at the end of MyFuncA(), and you are guaranteed that the instance will remain valid for the whole execution of MyFuncB() because someInstance is still valid in the scope of MyFuncA().
Yes it will work. It does not matter if the pointer you pass into MyFuncB is on the stack or on the heap (in this specific case).
As regards the bad practice part, you can probably argue both ways. In general I think it's bad, because if for any reason any object which lives outside of MyFuncA gets hold of the object reference, then it will die a horrible death later on and cause some very hard to track bugs. It really depends on how extensive the usage of the object becomes in MyFuncB. Especially when it starts involving another, third class, it can get messy.
Others have answered the basic question, with "yeah, that's legal". And in the absence of greater architecture it is hard to call it good or bad practice. But I'll try and wax philosophical on the broader question you seem to be picking up about pointers, object lifetimes, and expectations across function calls...
In the C++ language, there's no built-in way to pass a pointer to a function and "enforce" that it won't stow that away after the call is complete. And since C++ pointers are "weak references" by default, the objects pointed to may disappear out from under someone you pass it to.
But explicitly weak pointer abstractions do exist, for instance in Qt:
http://doc.qt.nokia.com/latest/qweakpointer.html
These are designed to specifically encode the "paranoia" to the recipient that the object it is holding onto can disappear out from under it. Anyone dereferencing one sort of realizes something is up, and they have to take the proper cautions under the design contract.
Additionally, abstractions like shared pointer exist which signal a different understanding to the recipient. Passing them one of those gives them the right to keep the object alive as long as they want, giving you something like garbage collection:
http://doc.qt.nokia.com/4.7-snapshot/qsharedpointer.html
These are only some options. But in the most general sense, if you come up with any interesting invariant for the lifetimes of your object...consider not passing raw pointers. Instead pass some pointer-wrapping class that embodies and documents the rules of the "game" in your architecture.
(One of the major reasons to use C++ instead of other languages is the wealth of tools you have to do cool things like that, without too much runtime cost!)
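A sketch of the same idea using the standard library's smart pointers instead of Qt's (my example; std::shared_ptr/std::weak_ptr express the same ownership contracts as QSharedPointer/QWeakPointer):

#include <iostream>
#include <memory>

struct Widget { void use() { std::cout << "using widget\n"; } };

// The parameter type itself tells the callee that the object may vanish.
void worker(std::weak_ptr<Widget> maybe)
{
    if (std::shared_ptr<Widget> w = maybe.lock())   // promote to a strong reference while we use it
        w->use();
    else
        std::cout << "widget is gone\n";
}

int main()
{
    std::shared_ptr<Widget> owner = std::make_shared<Widget>();
    std::weak_ptr<Widget> observer = owner;
    worker(observer);   // widget still alive
    owner.reset();      // last strong reference released, the Widget is destroyed
    worker(observer);   // callee notices and copes instead of crashing
}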
I don't think there should be any problem with that, barring, as you say, something that frees the object or otherwise trashes its state. I think whatever unexpected things happen would not have anything to do with using the class this way. (Nothing in life is guaranteed, of course, but classes are intended to be passed around and operated on; whether it's a local variable or otherwise I do not believe is relevant.)
The one thing you would not be able to do is keep a reference to the class after it goes out of scope when MyFuncA() returns, but that's just the nature of the scoping rules.

Is it good programming practice to always check for null pointers before using an object in C++?

This seems like a lot of work, checking for null each time an object is used.
I have been advised that it is a good idea to check for null pointers so you don't have to spend time looking for where segmentation faults occur.
Just wondering what the community here thinks?
Use references whenever you can, because they can't be null, therefore you don't have to check if they are null.
It's good practice to check for null in function parameters and other places you may be dealing with pointers someone else is passing you. However, in your own code, you might have pointers you know will always be pointing to a valid object, so a null check is probably overkill... just use your common sense.
I don't know if it really helps with debugging because any debugger will be showing you pretty clearly that a null pointer was used and it won't take long to find it. It's more about making sure you don't crash if another programmer passes in NULL, or that the mistake is picked up by an assert in a debug build.
No. You should instead make sure the pointers were not set to NULL in the first place. Note that in Standard C++:
int * p = new int;
then p can never be NULL because new will throw an exception if the allocation fails.
If you are writing functions that can take a pointer as a parameter, you should treat them like this
// does something with p
// note that p cannot be NULL
void f( int * p );
In other words you should document the requirements of the function. You can also use assert() to check if someone has ignored your documentation (if they have, it's their problem, not yours), but I must admit I have gone off this as time has gone on - simply say what the function requires, and leave the responsibility with the caller.
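A small sketch of that combination, documenting the requirement and backing it with an assert (the function body is illustrative):

#include <cassert>

// does something with p
// note that p must not be NULL
void f( int * p )
{
    assert( p != NULL );   // catches callers that ignored the documentation (debug builds only)
    *p = 42;
}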
A third bit of advice is simply not to use pointers - most C++ code that I've seen overuses pointers to a ridiculous extent - you should use values and references wherever possible.
In general, I would advise against doing this, as it makes your code harder to read and you also have to come up with some sensible way of dealing with the situation if a pointer is actually NULL.
In my C++ projects, I check whether a pointer (if I am using pointers at all) is NULL only if NULL could be a valid state of the pointer. Checking for NULL if the pointer should never actually be NULL is a bit pointless, because you are then trying to work around some programming error you should fix instead.
Additionally, when you feel the need to check if a pointer is NULL, you probably should define more clearly who owns the pointer/object.
Also, you never have to check if new returns NULL, because it never will return NULL. It will throw an exception if it could not create an object.
I hate the amount of code that checking for nulls adds, so I only do it for functions I export to another person.
If I use the function internally, and I know how I use it, I don't check for nulls since it would make the code too messy.
The answer is yes, if you are not in control of the object. That is, if the object is returned from some method you do not control, or if in your own code you expect (or it is possible) that an object can be null.
It also depends on where the code will run. If you are writing professional code that customers / users will see, it's generally bad for them to see null pointer problems. It's better if you can detect them beforehand and print out some debugging information, or otherwise report them in a "nicer" way.
If it's just code you are using informally, you will probably be able to understand the source of the null pointer without any additional information.
I figure I can do a whole lot of checks for NULL pointers for the cost of (debugging) just one segfault.
And the performance hit is negligible. TWO INSTRUCTIONS. Test for register == zero, branch if test succeeds. Depending on the machine, maybe only ONE instruction, if the register load sets the condition codes (and some do).
Others (AshleysBrain and Neil Butterworth) already answered correctly, but I will summarize it here:
Use references as much as possible
If using pointers, initialize them either to NULL or to a valid memory address/object
If using pointers, always verify if they are NULL before using them
Use references (again)... This is C++, not C.
Still, there is one corner case where a reference can be invalid/NULL:
void foo(T & t)
{
    t.blah() ;
}

void bar()
{
    T * t = NULL ;
    foo(*t) ;
}
The compiler will probably compile this, and then, at execution, the code will crash at the t.blah() line (if T::blah() uses this one way or another).
Still, this is cheating/sabotage: The writer of the bar() function dereferenced t without verifying t was NOT null. So, even if the crash happens in foo(), the error is in the code of bar(), and the writer of bar() is responsible.
So, yes, use references as much as possible, know this corner case, and don't bother to protect against sabotaged references...
And if you really need to use a pointer in C++, unless you are 100% sure the pointer is not NULL (some functions guarantee that kind of thing), then always test the pointer.
I think that is a good idea for a debug version.
In a release version, checking for null pointers can result in a performance degradation.
Moreover, there are cases where you can check the pointer value in a parent function and avoid the checking in its children.
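A sketch of that "check once in the parent" idea (the types and functions are illustrative):

#include <cstddef>   // for NULL

struct Record { int value; };

int readValue(const Record* r) { return r->value; }           // assumes r is not NULL
int doubled(const Record* r)   { return 2 * readValue(r); }   // same assumption, no extra check

int process(const Record* r)
{
    if (r == NULL)       // the single check at the boundary
        return -1;
    return doubled(r);   // everything below may rely on r being valid
}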
If the pointers are coming to you as parameters to a function, then make sure they are valid at the beginning of the function. Otherwise, there is not much point. new throws an exception on failure.

Checking for null before pointer usage

Most people use pointers like this...
if ( p != NULL ) {
    DoWhateverWithP();
}
However, if the pointer is null for whatever reason, the function won't be called.
My question is, could it possibly be more beneficial to just not check for NULL? Obviously on safety critical systems this isn't an option, but your program crashing in a blaze of glory is more obvious than a function not being called if the program can still run without it.
In relation to the first question, do you always check for NULL before you use pointers?
Secondly, consider you have a function that takes a pointer as an argument, and you use this function multiple times on multiple pointers throughout your program. Do you find it more beneficial to test for NULL in the function (the benefit being you don't have to test for NULL all over the place), or on the pointer before calling the function (the benefit being no overhead from calling the function)?
You are right in thinking that NULL pointers often result in immediate crashes, but do not forget that if you are indexing into a large array through a NULL pointer, you might indeed get a valid memory address if your index is high enough. And then, you'll get memory corruption or incorrect memory reads, which will be much harder to locate.
Whenever I can assume that calling a function with NULL is a bug, which should never happen in production code, I prefer using ASSERT guards in the function, which are only compiled into real code in a debug build, and not checking for NULL otherwise.
And from my point of view, generally, a function should check its arguments, not the caller. You should always assume that your callers might have been a bit sloppy about the checking, or that they might contain bugs...
The moral: check for NULL in the function being called, either through some if() statement that throws, or using some ASSERT construct (possibly with a clear message of why this happened). Also check for NULL in the callers, but only if the callers know that this condition might happen in a normal program execution, and act accordingly.
When it's acceptable for the program to just crash if a NULL pointer comes up, I'm partial to:
assert(p);
DoWhateverWithP();
This will only check the pointer in debug builds, since defining NDEBUG turns assert() into a no-op at the preprocessor level. It documents your assumption and assists with debugging but has zero performance impact on the released binary (though, to be fair, checking for a NULL pointer should have effectively zero impact on performance in the vast majority of circumstances).
As a side benefit, this is legal for C as well as C++ and, in the latter case, doesn't require exceptions to be enabled in your compiler/runtime.
Concerning your second question, I prefer to put the assertions at the beginning of the subroutine. Again, the beauty of assert() is the fact that there's really no 'overhead' to speak of. As such, there's nothing to weigh against the benefits of only requiring one assertion in the subroutine definition.
Of course, the caveat is that you never want to assert an expression with side-effects:
assert(p = malloc(1)); // NEVER DO THIS!
DoSomethingWithP(); // If NDEBUG was defined, malloc() was never called!
Don't make it a rule to just check for null and do nothing if you find it.
If the pointer is allowed to be null, then you have to think about what your code does in the case that it actually is null. Usually, just doing nothing is the wrong answer. With care it's possible to define APIs which work like that, but this requires more than just scattering a few NULL checks about the place.
So, if the pointer is allowed to be null, then you must check for null, and you must do whatever is appropriate.
If the pointer is not allowed to be null, then it's perfectly reasonable to write code which invokes undefined behaviour if it is null. It's no different from writing string-handling routines which invoke undefined behaviour if the input is not NUL-terminated, or writing buffer-using routines which invoke undefined behaviour if the caller passes in the wrong value for the length, or writing a function that takes a FILE* parameter and invokes undefined behaviour if the user passes in a file descriptor reinterpret_cast to FILE*. In C and C++, you simply have to be able to rely on what your caller tells you. Garbage in, garbage out.
However, you might like to write code which helps out your caller (who is probably you, after all) when the most likely kinds of garbage are passed in. Asserts and exceptions are good for this.
Taking up the analogy from Franci's comment on the question: most people do not look for cars when crossing a footpath, or before sitting down on their sofa. They could still be hit by a car. It happens. But it would generally be considered paranoid to spend any effort checking for cars in those circumstances, or for the instructions on a can of soup to say "first, check for cars in your kitchen. Then, heat the soup".
The same goes for your code. It's much easier to pass an invalid value to a function than it is to accidentally drive your car into someone's kitchen. But it's still the fault of the driver if they do so and hit someone, not a failure of the cook to exercise due care. You don't necessarily want cooks (or callees) to clutter up their recipes (code) with checks that ought to be redundant.
There are other ways to find problems, such as unit tests and debuggers. In any case it is much safer to create a car-free environment except where necessary (roads), than it is to drive cars willy-nilly all over the place and hope everybody can cope with them at all times. So, if you do check for null in cases where it isn't allowed, you shouldn't let this give people the idea that it is allowed after all.
[Edit - I literally just hit an example of a bug where checking for null would not find an invalid pointer. I'm going to use a map to hold some objects. I will be using pointers to those objects (to represent a graph), which is fine because map never relocates its contents. But I haven't defined an ordering for the objects yet (and it's going to be a bit tricky to do so). So, to get things moving and prove that some other code works, I used a vector and a linear search instead of a map. That's right, I didn't mean vector, I meant deque. So after the first time the vector resized, I wasn't passing null pointers into functions, but I was passing pointers to memory which had been freed.
I make dumb errors which pass invalid garbage approximately as often as I make dumb errors which pass null pointers invalidly. So regardless of whether I add checking for null, I still need to be able to diagnose problems where the program just crashes for reasons I can't check. Since this will also diagnose null pointer accesses, I usually don't bother checking for null unless I'm writing code to generally check the preconditions on entry to the function. In that case it should if possible do a lot more than just check null.]
I prefer this style:
if (p == NULL) {
    // throw some exception here
}
DoWhateverWithP();
This means that whatever function this code lives in will fail quickly in the event that p is NULL. You are correct that if p is NULL there is no way that DoWhateverWithP can execute, but using a null pointer or simply not executing the function are both unacceptable ways to handle the fact that p is NULL.
The important thing to remember is to exit early and fail fast - this kind of approach yields code that is easier to debug.
In addition to the other answers, it depends upon what NULL signifies. For example, this code is perfectly OK, and is pretty idiomatic:
while (fgets(buf, sizeof buf, fp) != NULL) {
    process(buf);
}
Here, a NULL value indicates not only an error, but the end-of-file condition as well. Similarly, strtok() returns NULL to say, "there are no more tokens" (although one should avoid strtok() to begin with, but I digress). In cases like this, it is perfectly OK to call a function if the returned pointer is not NULL, and do nothing otherwise.
Edit: another example, closer to what was asked:
const char *data = "this;is;a;test;";
const char *curr = data;
const char *p;

while ((p = strchr(curr, ';')) != NULL) {
    /* process data in [curr, p) */
    process(curr, p);
    curr = p + 1;
}
Once again, NULL here is an indication from strchr() that it couldn't find a ;, and that we should stop processing the data further.
Having said that, if NULL is not used as an indication, then it depends:
If the pointer can't be NULL at this point in the code, it's useful to have an assert(p != NULL); when developing, and also to have a fprintf(stderr, "Can't happen\n"); or equivalent statement, and then take whatever action is appropriate (abort() or similar is probably the only sane choice at this point).
If the pointer can be NULL, and it's not critical, it might be better to just bypass the usage of the null pointer. Suppose you're trying to allocate memory for writing a log message, and malloc() fails. You shouldn't abort the program because of this. If malloc() succeeds, you want to call a function (sprintf()/whatever) to fill the buffer.
If the pointer can be NULL, and it's critical. In this case, you probably want to fail, and hopefully such conditions don't happen too often.
Secondly, consider you have a function that takes a pointer as an argument, and you use this function multiple times on multiple pointers throughout your program. Do you find it more beneficial to test for NULL in the function (the benefit being you don't have to test for NULL all over the place), or on the pointer before calling the function (the benefit being no overhead from calling the function)?
This depends upon a lot of factors. If I can be sure sometimes or most of the times that the pointer passed to a function cannot be NULL, the extra check in the function is wasteful. If the pointer passed comes out of a lot of places, and it's tricky to put in a check everywhere, sure, then the check is good to have in the function itself.
The standard library functions, for the most part, don't check for NULL: the str* and mem* functions, for example. An exception is free(), which does check for NULL.
A comment about assert: assert is a no-op if NDEBUG is defined, so one should not rely on it in release builds; its use is during development to catch programming errors. Also, in C89, assert takes an int, so assert(p != NULL) is better in such cases than just a plain assert(p).
This non-NULLness check can be avoided by using references instead of pointers. This way, the compiler ensures the parameter passed is not NULL. For example:
void f(Param& param)
{
    // "param" is a reference, so it cannot (legitimately) be null
}
In this case, it is up to the client to do the checking. However, mostly the client situation will be like this:
Param instance;
f(instance);
No non-NULLness checking is needed.
When using objects allocated on the heap, you can do the following:
Param& instance = *new Param();
f(instance);
Update: As user Crashworks remarks, it is still possible to make your program crash. However, when using references, it is the responsibility of the client to pass a valid reference, and as I show in the example, this is very easy to do.
How about: a comment clarifying the intent? If the intent is "this can't happen", then perhaps an assert would be the right thing to do instead of the if statement.
On the other hand, if a null value is normal, perhaps an "else comment" explaining why we can skip the "then" step would be in order. Steve McConnell has a good section in "Code Complete" about if/else statements, and how a missing else is a very common error (distracted, forgot it?).
Consequently, I usually put a comment in for a "no-op else", unless it is something of the form "if done, return/break".
When you check for NULL, it is not good idea just to skip the function call. You should have an else-part that does something meaningful in case of NULL, for example throws an error or returns error code to upper level.
On the other hand, NULL is not always an error. It is often used to indicate for example that end of data has been reached. In such case, you will have to handle the situation as normal program flow.
Well, the answer to the first question is: you are talking about an ideal situation; most of the code that I see which uses if ( p != NULL ) is legacy. Also, suppose you want to return an evaluator and then call the evaluator with the data, but say there is no evaluator for that data: it makes logical sense to return NULL and check for NULL before calling the evaluator.
The answer to the second question is: it depends on the situation. For example, delete checks for a NULL pointer, whereas lots of other functions don't. Sometimes, if you test the pointer inside the function, then you might have to test it in lots of functions, like:
ABC(p);
a = DEF(p);
d = GHI(a);
JKL(p, d);
but this code would be much better:
if(p)
{
    ABC(p);
    a = DEF(p);
    d = GHI(a);
    JKL(p, d);
}
Could it possibly be more beneficial to just not check for NULL?
I wouldn't do it, I favor assertions on the frontline and some form of recovery in the body past that. What would assertions not provide to you, that not checking for null would? Similar effect, with easier interpretation and a formal acknowledgement.
In relation to the first question, do you always check for NULL before you use pointers?
It really depends on the code and the time available, but I am irritatingly good at it; a good chunk of 'implementation' in my programs consists of what a program should not do, rather than the usual 'what it should do'.
Secondly, consider you have a function that takes a pointer as an argument...
I test it in the function, as the function is (hopefully) the code that is reused more frequently. I also tend to test it before making the call; without that test, the error loses localization (useful for reporting and isolation).
The form below is the one I think I've seen more of. This way you don't proceed if you know it's going to blow up anyway.
if (NULL == p)
{
    goto FunctionExit; // or some other common label to exit the function.
}
I think it is better to check for null. Although, you can cut down on the amount of checks you need to make.
For most cases I prefer a simple guard clause at the top of a function:
if (p == NULL) return;
That said, I typically only put the check on functions that are publicly exposed.
However, when the null pointer is unexpected I will throw an exception. (There are some functions it doesn't make any sense to call with null, and the consumer should be responsible enough to use them right.)
Constructor initialization can be used as an alternative to checking for null all the time. This is especially useful when the class contains a collection. The collection can be used throughout the class without checking whether it has been initialized.
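A small sketch of that point (the class is illustrative): the collection member is always initialized, so nothing in the class ever has to check it for null.

#include <cstddef>
#include <string>
#include <vector>

class NameRegistry {
public:
    NameRegistry() : names_() {}                  // always a valid (empty) collection from the start

    void add(const std::string& n) { names_.push_back(n); }
    std::size_t count() const      { return names_.size(); }   // no null check needed anywhere

private:
    std::vector<std::string> names_;
};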
Dereferencing a null pointer is undefined behavior. If you want to crash if the pointer is null, use an assert or something similar (and, depending on the defined behavior of your class, that can be a perfectly valid response - it's certainly better than continuing to run when people may be expecting something to have been done!).
Since the behavior of dereferencing a null pointer is undefined, it can do anything. Crash, corrupt memory, create a wormhole to an alternate dimension allowing the Elder Gods to come forth and devour all of mankind... anything. While bugs happen, depending upon undefined behavior is, by definition, a bug. So don't do it deliberately.