This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
is const (c++) optional?
In C++ or any programming language, what is the point of declaring a variable const or constant? I understand what const does, but isn't it safer to declare everything not constant, because doesn't the programmer know whether or not to change the variable? I just don't see the objective of const.
If the programmer (or programming team) can successfully keep track of every detail of every variable and not accidentally assign a value to a constant, then by all means don't declare constants const.
The addition of const to the language is useful for preventing easily preventable errors, unlike languages of the dinosaur era where evil things would happen.
Here is an approximation of a bug I once had to track down in a huge FORTRAN 77 application. FORTRAN 77 passes parameters by reference unless extraordinary measures are taken:
      subroutine increment(i)
      integer i
      i = i + 1
      end
      subroutine process ()
      call increment (1)
      call someprocedure (1, 2, 3)
      ...
The result was that someprocedure() was called with (2, 2, 3)!
but isn't it safer to declare everything not constant, because doesn't the programmer know whether or not to change the variable?
That is exactly wrong. It is "safer" to ensure that what is supposed to be a constant value is not changed by mistake. It conveys the intent of the value to all programmers who may stumble upon it in the future. If a program assumes that a value should never change, then why allow it to be changed? That can cause bugs that are very hard to track down.
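As a rough illustration of the point above, here is a minimal sketch of const stating intent and having the compiler enforce it. The names kMaxRetries, sum and retriesLeft are invented for this example; the commented-out lines are the mistakes the compiler would reject.

```cpp
#include <vector>

// Hypothetical constant: nothing in the program is supposed to change it.
const int kMaxRetries = 3;

// Taking the vector by const reference documents, and enforces, that
// this function only reads from it.
int sum(const std::vector<int>& values) {
    int total = 0;
    for (int v : values) {
        total += v;
    }
    // values.push_back(0);  // would not compile: values is const
    return total;
}

int retriesLeft(int used) {
    // kMaxRetries = 5;      // would not compile: assignment to a const
    return kMaxRetries - used;
}
```

A later maintainer who tries either commented-out line gets a compile error instead of a silent change in behaviour.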
Don't assume your program is correct, make it so (as much as is possible) using the utilities that your language provides. Most real-world projects are not completed by one guy sitting in his basement. They involve multiple programmers and will be maintained for many years, often by a group of people who had nothing to do with the initial version.
Please don't make me or anyone else guess as to what your design decisions were, make them explicit whenever possible. Hell, even you will forget what you were thinking when you come back to a program you haven't touched for some time.
because doesn't the programmer know whether or not to change the variable?
No. You will write a lot of code for other programmers. They may want to change that value.
Maybe you make a mistake and change the value unintentionally. If it were const, the compiler wouldn't have let you. Const is also very useful for overloading operators.
Yes, you are right when you say that the basic idea is that constant variables prevent programming errors. But one additional use is that the interface you provide to your client also ensures that whatever you want to be const remains constant! It prevents people who use your code from violating the constraint.
This will come in handy especially in OOP. By making sure your object is const you can write a lot of code without worrying about the consequences. The objects could be used by a new programmer or a client who would have to keep the property. Const iterators are also very handy.
This link should help you out
const can help the compiler optimize code. Temporaries can be bound to const references, so the lifetime of a temporary can be extended a bit by using const.
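A minimal sketch of the lifetime point, assuming a made-up function makeGreeting that returns a std::string by value. Binding the returned temporary to a const reference keeps it alive for as long as the reference is in scope.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper that returns a temporary std::string.
std::string makeGreeting() {
    return std::string("hello, ") + "world";
}

std::size_t greetingLength() {
    // The temporary returned by makeGreeting() is bound to a const
    // reference, so it lives until `greeting` goes out of scope.
    const std::string& greeting = makeGreeting();
    return greeting.size();
}
```

Without the const, the line declaring greeting would not compile, because a non-const lvalue reference cannot bind to a temporary.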
In web programming, it can ensure that the variable isn't changed (sometimes maliciously) via injection.
It also helps when you may accidentally change the variable but know ahead of time that that isn't desired.
Constants are useful when writing huge programs, where you may get lost and forget what a variable (or constant) does or whether you are allowed to change its value or not.
More importantly, constants matter in teamwork and when writing libraries for others. If you write code for others, they may change a variable you intend to keep at a constant value, so declaring it as constant is very useful.
It's also useful when you want to use the same number in places where only constants are allowed, for example when declaring static arrays. In this case you cannot determine the size of the array with a variable, and it's very annoying to review all the code and change the number every time you want to change the size of the array, so declaring that number as a constant is very useful. I have run into exactly this myself.
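The array-size case could look something like this small sketch. kBufferSize and buffer are illustrative names only; the point is that the size is written once and the array bound follows it.

```cpp
#include <cstddef>

// Hypothetical size constant: change it here and every use follows.
const std::size_t kBufferSize = 16;

// OK: a const integer initialized with a literal is a compile-time
// constant, so it can be used as an array bound.
int buffer[kBufferSize];

// Number of elements in the array, derived from the array itself.
std::size_t bufferCapacity() {
    return sizeof(buffer) / sizeof(buffer[0]);
}
```

Replacing kBufferSize with a plain non-const variable would make the array declaration ill-formed in standard C++.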
Hope that's convincing, that's what I remember for now, I will come back and edit should I remember something else.
Have a nice programming :)
It baffles me why C++ compilers do not initialise every integer declaration to 0, be it local or global or members? Why do uninitialised sections exists in the memory model?
I know it's a dull answer, but your question begs for it exactly:
Because the C++ standard says so.
Why does it say so? Because C++ is built on a principle:
Don't pay for what you don't use.
Setting memory to a certain value costs CPU time and memory bandwidth. If you want to do it, do it explicitly. A variable declaration should not incur this cost.
C++ is based on C, and in C a primary design concern was code efficiency. In most cases you want to initialize a new variable to a specific value after declaring it. If the compiler wrote 0 to that memory address only for the program to write another value to it shortly afterwards, that would waste CPU cycles.
Sure, a smart compiler could detect that a variable isn't read before it gets a value assigned and could optimize the initialization to 0 away. But when C was developed, compilers weren't that smart yet.
The C language and its standard library generally follow the principle of not doing things automatically when doing so might be unnecessary under some circumstances.
It might make life easier for you if it did, but C++ errs on the side of avoiding overheads, e.g. setting values which you might then reset to something else.
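To make the "do it explicitly" idea concrete, here is a minimal sketch contrasting variables that are explicitly zeroed with one that is left uninitialized until its first real value is known. The function name is made up for the example.

```cpp
#include <array>

int sumOfValueInitialized() {
    int counter{};               // value-initialized: guaranteed to be 0
    std::array<int, 4> slots{};  // every element is 0

    int scratch;                 // indeterminate value: no initialization is required here
    scratch = 7;                 // first write; reading it before this line would be undefined behavior

    int total = counter + scratch;
    for (int s : slots) {
        total += s;
    }
    return total;
}
```

The empty braces are how you opt in to paying for zero-initialization; leaving them off is how you opt out.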
As per the title, I am planning to move some legacy code developed a decade+ ago for AIX. The problem is the code base is huge. The developers didn't initialize their pointers in the original code. Now while migrating the code to the latest servers, I see some problems with it.
I know that the best solution is to run through all the code and initialize all the variables wherever required. However, I am keen to know if there are any other solutions available to this problem. I tried Google but couldn't find an appropriate answer.
The most preventive long-term approach is to initialize all pointers at the location they're declared, changing the code to use appropriate smart pointers to manage the lifetime. If you have any sort of unit tests this refactoring can be relatively painless.
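A rough sketch of what that refactoring might look like, assuming a hypothetical legacy type called Connection. One function shows a raw pointer initialized at its declaration, the other shows ownership handed to a std::unique_ptr so cleanup happens automatically.

```cpp
#include <memory>

// Hypothetical legacy type, for illustration only.
struct Connection {
    int id;
};

// Returns the id, or -1 when there is no connection.
int describe(const Connection* c) {
    return c ? c->id : -1;
}

int demoRawInitialized() {
    Connection* legacy = nullptr;  // initialized where it is declared
    return describe(legacy);
}

int demoOwned() {
    // The unique_ptr deletes the Connection when it goes out of scope.
    std::unique_ptr<Connection> owned(new Connection{42});
    return describe(owned.get());
}
```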
In a shorter term and if you're porting to Linux you could use valgrind and get a good shot at tracking down the one or two real issues that are biting you, giving you time to refactor at a more leisurely pace.
Just initializing all the variables may not be a good idea.
Reliable behavior generally depends on variables having values known to be correct ("guaranteed by construction" to be correct). The problem with uninitialized variables isn't simply that they have unknown values. Obviously being unknown is a problem, but again the desired state is having known and correct values. Initializing a variable to a known value that is not correct does not yield reliable behavior.
Not infrequently it happens that there is no 'default' value that is correct to use as a fallback if more complicated initialization fails. A program may choose not to initialize a variable with a value if that value must be over-written before the variable can be used.
Initializing a variable to a default value may have a few problems in such cases. Often 'default' values are inoffensive in that if they are used the consequences aren't immediately obvious. That's not generally desirable because as the developer you want to notice when things go wrong. You can avoid this problem by picking default values that will have obvious consequences, but that doesn't solve a second issue: static analyzers can often detect and report when an uninitialized variable is used. If there's a problem with some complicated initialization logic such that no value is set, you want that to be detectable. Setting a default value prevents static analysis from detecting such cases. So there are cases where you do not want to initialize variables.
With pointers the default value is typically nullptr, which to a certain extent avoids the first issue discussed above because dereferencing a null pointer typically produces an immediate crash (good for debugging). However code might also detect a null pointer and report an error (good for debugging) or might fall back to some other method (bad for debugging). You may be better off using static analysis to detect usages of uninitialized pointers rather than initializing them. Though static analysis may detect dereferencing of null pointers it won't detect when null pointers cause error reporting or fallback routines to be used.
In response to your comment:
The major problems that i see are
Pointers to local variables are returned from functions.
Almost all the pointer variables are not initialized. I am sure that AIX provided this comfort for the customer on the earlier platform; however, I really doubt that the code would run flawlessly on Linux when it is put to the real test (production).
I cannot deliver partial solutions which may work. I prefer to give the best to my customer, who pays me for my work, so I won't use workarounds.
Quality cannot be compromised.
fix them (and pay special attention to correctly cleaning up)
As I argue above simply lacking an initializer is not in and of itself a defect. There is only a defect if the uninitialized value is actually used in an illegal manner. I'm not sure what you mean about AIX providing comfort.
As I argue above the 'partial solution' and 'workaround' would be to blindly initialize everything.
Again, blindly initializing everything can result not only in useless work, but it can actually compromise quality by taking away some tools for detecting bugs.
I'm pretty much a beginner at C++. Just started learning it a few weeks ago. I'm really interested in improving my skills as a programmer, and there's something that's been confusing me in the last few days. It is pointers. Pointers and the reference operator. My question is, what exactly is the functionality of the pointers and reference operator? How will I know when to use them, and what are their purposes and common usages. Any examples consisting of common algorithms using dereference and reference will be greatly appreciated.
How can I use reference and dereference to become a better programmer, and improve my algorithms (and possibly make them simpler)?
Thanks :D
Definitely check this question out, the accepted answer explains pointers and common errors with them in a nice manner.
Update: a few words of my own
Pointers are bunches of bits, like any other kind of variable. We use them so much because they have several very convenient properties:
Their size (in bytes) is fixed, making it trivial to know how many bytes we need to read to get the value of a pointer.
When using other types of variables (e.g. objects), some mechanism needs to be in place so that the compiler knows how large each object is. This introduces various restrictions which vary among languages and compilers. Pointers have no such problems.
Their size is also small (typically 4 or 8 bytes), making it very fast to update their values.
This is very useful when you use the pointer as a token that points to a potentially large amount of information. Consider an example: we have a book with pictures of paintings. You need to describe a painting to me, so I can find it in the book. You can either sit down and paint an exact copy of it, show it to me, and let me search the book for it; or you can tell me "it's in page 25". This would be like using a pointer, and so much faster.
They can be used to implement polymorphism, which is one of the foundations of object-oriented-programming.
So, to find out how to use pointers: find cases where these properties will come in handy. :)
There's some things a programmer needs to understand before diving into pointers and C++ references.
First you must understand how a program works. When you write variables out, when you write statements, you need to understand what's happening at a lower level; it's important to know what happens from a computer stand-point.
Essentially your program becomes data in memory (a process) when you execute it. At this point you must have a simple way to reference spots of data - we call these variables. You can store things and read them, all from memory (the computer's memory).
Now imagine having to pass some data to a function - you want this function to manipulate this data - you can either do this by passing the entire set of data, or you can do it by passing its address (the location of the data in memory). All the function really needs is the address of this data, it doesn't need the entire data itself.
So pointers are used exactly for this sort of task - when you need to pass the address of data around - pointers are in fact just regular variables that contain an address.
C++ makes things a bit easier with references (int &var) but the concept is the same. It lets you skip the step of creating a pointer to store the address of some data, and it does it all automatically for you when passing data to a function.
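A minimal sketch of the same operation written both ways. The function names are invented for the example; in both cases the function works on the caller's variable rather than on a copy.

```cpp
// Pointer version: the caller passes the address explicitly with &.
void incrementViaPointer(int* value) {
    if (value != nullptr) {  // a pointer may be null, so check first
        ++*value;            // dereference to reach the caller's variable
    }
}

// Reference version: the address is passed implicitly.
void incrementViaReference(int& value) {
    ++value;
}

int incrementTwice() {
    int n = 0;
    incrementViaPointer(&n);   // n becomes 1
    incrementViaReference(n);  // n becomes 2
    return n;
}
```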
This is just a simple introduction to how they work - you should search for more detailed resources and all the cool things you can do with pointers/references.
A better name for the operator is the "address-of" operator, because it returns the address of the operand.
In C++ you will use pointers (and both reference/dereference operators) when dealing with dynamically allocated memory or when working with pointer arithmetic.
Pointers are also used to break down static bindings since they imply dynamic binding (through the address stored in the pointer, which can change dynamically).
For all other uses, it is usually better to use references instead of pointers.
To be short:
References are an improvement on the pointers that C++ inherited from C.
They are a bit safer because they let you avoid using "*" in your functions, which causes fewer segmentation faults.
Or, as my friends say, "avoid the Star Wars".
There is a lot to learn about it!
Look up the use of "&" for sending and receiving values by reference.
Understand the use of "&" for getting a variable's address.
It's a very, very big question; if you can be more specific, it will be better.
Let's say I know a guy who is new to C++. He does not pass around pointers (rightly so) but he refuses to pass by reference. He always uses pass by value, the reason being that he feels that "passing objects by reference is a sign of a broken design".
The program is a small graphics program and most of the passing in question is mathematical Vector(3-tuple) objects. There are some big controller objects but nothing more complicated than that.
I'm finding it hard to find a killer argument against only using the stack.
I would argue that pass by value is fine for small objects such as vectors but even then there is a lot of unnecessary copying occurring in the code. Passing large objects by value is obviously wasteful and most likely not what you want functionally.
On the pro side, I believe the stack is faster at allocating/deallocating memory and has a constant allocation time.
The only major argument I can think of is that the stack could possibly overflow, but I'm guessing that it is improbable that this will occur? Are there any other arguments against using only the stack/pass by value as opposed to pass by reference?
Subtyping-polymorphism is a case where passing by value wouldn't work because you would slice the derived class to its base class. Maybe to some, using subtyping-polymorphism is bad design?
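The slicing problem can be sketched in a few lines. Shape and Circle are hypothetical classes used only to show the effect: the same object reports a different name depending on how it is passed.

```cpp
#include <string>

struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const { return "shape"; }
};

struct Circle : Shape {
    std::string name() const override { return "circle"; }
};

// By value: only the Shape part of the argument is copied (slicing),
// so the call always resolves to Shape::name.
std::string nameByValue(Shape s) { return s.name(); }

// By reference: the dynamic type is preserved.
std::string nameByReference(const Shape& s) { return s.name(); }
```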
Your friend's problem is not his idea as much as his religion. Given any function, always consider the pros and cons of passing by value, reference, const reference, pointer or smart pointer. Then decide.
The only sign of broken design I see here is your friend's blind religion.
That said, there are a few signatures that don't bring much to the table. Taking a const by value might be silly, because if you promise not to change the object then you might as well not make your own copy of it. Unless it's a primitive, of course, in which case the compiler can be smart enough to take a reference still. Or, sometimes it's clumsy to take a pointer to a pointer as an argument. This adds complexity; instead, you might be able to get away with taking a reference to a pointer, and get the same effect.
But don't take these guidelines as set in stone; always consider your options because there is no formal proof that eliminates any alternative's usefulness.
If you need to change the argument for your own needs, but don't want to affect the client, then take the argument by value.
If you want to provide a service to the client, and the client is not closely related to the service, then consider taking an argument by reference.
If the client is closely related to the service then consider taking no arguments but write a member function.
If you wish to write a service function for a family of clients that are closely related to the service but very distinct from each other then consider taking a reference argument, and perhaps make the function a friend of the clients that need this friendship.
If you don't need to change the client at all then consider taking a const-reference.
There are all sorts of things that cannot be done without using references - starting with a copy constructor. References (or pointers) are fundamental and whether he likes it or not, he is using references. (One advantage, or maybe disadvantage, of references is that you do not have to alter the code, in general, to pass a (const) reference.) And there is no reason not to use references most of the time.
And yes, passing by value is OK for smallish objects without requirements for dynamic allocation, but it is still silly to hobble oneself by saying "no references" without concrete measurements that the so-called overhead is (a) perceptible and (b) significant. "Premature optimization is the root of all evil" [1].
[1] Various attributions, including C. A. R. Hoare (although apparently he disclaims it).
I think there is a huge misunderstanding in the question itself.
There is no relationship between stack or heap allocated objects on the one hand and pass by value or reference or pointer on the other.
Stack vs Heap allocation
Always prefer stack when possible, the object's lifetime is then managed for you which is much easier to deal with.
It might not be possible in a couple of situations though:
Virtual construction (think of a Factory)
Shared Ownership (though you should always try to avoid it)
And I might be missing some, but in these cases you should use SBRM (Scope-Bound Resource Management) to leverage the stack's lifetime management abilities, for example by using smart pointers.
Pass by: value, reference, pointer
First of all, there is a difference of semantics:
value, const reference: the passed object will not be modified by the method
reference: the passed object might be modified by the method
pointer/const pointer: same as reference (for the behavior), but might be null
Note that some languages (the functional kind like Haskell) do not offer reference/pointer by default. The values are immutable once created. Apart from some work-arounds for dealing with the exterior environment, they are not that restricted by this use and it somehow makes debugging easier.
Your friend should learn that there is absolutely nothing wrong with pass-by-reference or pass-by-pointer: for example, think of swap, which cannot be implemented with pass-by-value.
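A minimal sketch of why swap needs reference (or pointer) parameters. Both functions are written for illustration; the standard library already provides std::swap.

```cpp
// By value: a and b are copies, so the caller's variables are untouched.
void swapByValue(int a, int b) {
    int tmp = a;
    a = b;
    b = tmp;
}

// By reference: a and b are aliases for the caller's variables.
void swapByReference(int& a, int& b) {
    int tmp = a;
    a = b;
    b = tmp;
}
```

Calling swapByValue(x, y) leaves x and y exactly as they were, while swapByReference(x, y) actually exchanges them.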
Finally, Polymorphism does not allow pass-by-value semantics.
Now, let's speak about performances.
It's usually well accepted that built-ins should be passed by value (to avoid an indirection) and big user-defined classes should be passed by reference/pointer (to avoid copying). "Big" in fact generally means that the copy constructor is not trivial.
There is however an open question regarding small user-defined classes. Some articles published recently suggest that in some cases pass-by-value might allow better optimization from the compiler, for example, in this case:
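// Hypothetical minimal Object so the snippet compiles on its own.
struct Object { void bar() {} };

// d is taken by value, modified, and returned by value.
Object foo(Object d) { d.bar(); return d; }

int main(int argc, char* argv[])
{
    Object o;
    o = foo(o);
    return 0;
}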
Here a smart compiler is able to determine that o can be modified in place without any copying! (It is necessary that the function definition be visible I think, I don't know if Link-Time Optimization would figure it out)
Therefore, there is only one possibility to the performance issue, like always: measure.
Reason being that he feels that "passing objects by reference is a sign of a broken design".
Although this is wrong in C++ for purely technical reasons, always using pass-by-value is a good enough approximation for beginners – it’s certainly much better than passing everything by pointers (or perhaps even than passing everything by reference). It will make some code inefficient but, hey! As long as this doesn’t bother your friend, don’t be unduly disturbed by this practice. Just remind him that someday he might want to reconsider.
On the other hand, this:
There are some big controller objects but nothing more complicated than that.
is a problem. Your friend is talking about broken design, and then all the code uses are a few 3D vectors and large control structures? That is a broken design. Good code achieves modularity through the use of data structures. It doesn’t seem as though this were the case.
… And once you use such data structures, code without pass-by-reference may indeed become quite inefficient.
First thing is, stack rarely overflows outside this website, except in the recursion case.
About his reasoning, I think he might be wrong because he is overgeneralizing, but what he has done might be correct... or not?
For example, the Windows Forms library uses a Rectangle struct that has 4 members, Apple's QuartzCore also has a CGRect struct, and those structs are always passed by value. I think we can compare that to a Vector with 3 floating-point variables.
However, as I have not seen the code, I feel I should not judge what he has done, though I have a feeling he might have done the right thing despite his overgeneralized idea.
I would argue that pass by value is fine for small objects such as vectors but even then there is a lot of unnecessary copying occurring in the code. Passing large objects by value is obviously wasteful and most likely not what you want functionally.
It's not quite as obvious as you might think. C++ compilers perform copy elision very aggressively, so you can often pass by value without incurring the cost of a copy operation. And in some cases, passing by value might even be faster.
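One way to check the copy-elision claim yourself is to count copy-constructor calls. This is only a sketch: Tracker and the function names are made up for the example. Since C++17, initializing a by-value parameter directly from a temporary is guaranteed not to copy; earlier compilers usually elide the copy as well but were not required to.

```cpp
// Hypothetical type that counts how often it is copy-constructed.
struct Tracker {
    static int copies;
    Tracker() = default;
    Tracker(const Tracker&) { ++copies; }
};
int Tracker::copies = 0;

// Takes its argument by value.
void consume(Tracker) {}

int copiesForTemporary() {
    Tracker::copies = 0;
    consume(Tracker());  // the temporary initializes the parameter directly
    return Tracker::copies;
}

int copiesForNamedObject() {
    Tracker::copies = 0;
    Tracker t;
    consume(t);          // a named object has to be copied
    return Tracker::copies;
}
```

So passing a temporary by value costs no copy, while passing a named object by value costs one, which is where a const reference can still help.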
Before condemning the issue for performance reasons, you should at the very least produce the benchmarks to back it up. And they might be hard to create because the compiler typically eliminates the performance difference.
So the real issue should be one of semantics. How do you want your code to behave? Sometimes, reference semantics are what you want, and then you should pass by reference. If you specifically want/need value semantics then you pass by value.
There is one point in favor of passing by value. It's helpful in achieving a more functional style of code, with fewer side effects and where immutability is the default. That makes a lot of code easier to reason about, and it may make it easier to parallelize the code as well.
But in truth, both have their place. And never using pass-by-reference is definitely a big warning sign.
For the last 6 months or so, I've been experimenting with making pass-by-value the default. If I don't explicitly need reference semantics, then I try to assume that the compiler will perform copy elision for me, so I can pass by value without losing any efficiency.
So far, the compiler hasn't really let me down. I'm sure I'll run into cases where I have to go back and change some calls to passing by reference, but I'll do that when I know that
performance is a problem, and
the compiler failed to apply copy elision
I would say that not using pointers in C is a sign of a newbie programmer.
It sounds like your friend is scared of pointers.
Remember, C++ pointers were actually inherited from the C language, and C was developed when computers were much less powerful. Nevertheless, speed and efficiency continue to be vital until this day.
So, why use pointers? They allow the developer to optimize a program to run faster or use less memory than it would otherwise! Referring to the memory location of some data is much more efficient than copying all the data around.
Pointers usually are a concept that is difficult to grasp for those beginning to program, because all the experiments done involve small arrays, maybe a few structs, but basically they consist of working with a couple of megabytes (if you're lucky) when you have 1GB of memory laying around the house. In this scene, a couple of MB are nothing and it usually is too little to have a significant impact on the performance of your program.
So let's exaggerate that a little bit. Think of a char array with 2147483648 elements - 2GB of data - that you need to pass to a function that will write all the data to the disk. Now, what technique do you think is going to be more efficient/faster?
Pass by value, which is going to have to re-copy those 2GB of data to another location in memory before the program can write the data to the disk, or
Pass by reference, which will just refer to that memory location.
What happens when you just don't have 4GB of RAM? Will you spend $ and buy chips of RAM just because you are afraid of using pointers?
Re-copying the data in memory sounds a bit redundant when you don't have to, and it's a waste of computer resources.
Anyway, be patient with your friend. If he would like to become a serious/professional programmer at some point in his life he will eventually have to take the time to really understand pointers.
Good Luck.
As already mentioned the big difference between a reference and a pointer is that a pointer can be null. If a class requires data a reference declaration will make it required. Adding const will make it 'read only' if that is what is desired by the caller.
The pass-by-value 'flaw' mentioned is simply not true. Passing everything by value will completely change the performance of an application. It is not so bad when primitive types (e.g. int, double, etc.) are passed by value, but when a class instance is passed by value, temporary objects are created, which requires constructors and later on destructors to be called on the class and on all of the member variables in the class. This is exacerbated when large class hierarchies are used because parent class constructors/destructors must be called as well.
Also, just because the vector is passed by value does not mean that it only uses stack memory. Heap memory may be used for each element as it is created in the temporary vector that is passed to the method/function. The vector itself may also have to reallocate on the heap if it reaches its capacity.
If pass by value is being used so that the caller's values are not modified, then just use a const reference.
The answers that I've seen so far have all focused on performance: cases where pass-by-reference is faster than pass-by-value. You may have more success in your argument if you focus on cases that are impossible with pass-by-value.
Small tuples or vectors are a very simple type of data-structure. More complex data-structures share information, and that sharing can't be represented directly as values. You either need to use references/pointers or something that simulates them such as arrays and indices.
Lots of problems boil down to data that forms a Graph, or a Directed-Graph. In both cases you have a mixture of edges and nodes that need to be stored within the data-structure. Now you have the problem that the same data needs to be in multiple places. If you avoid references then firstly the data needs to be duplicated, and then every change needs to be carefully replicated in each of the other copies.
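A small sketch of the sharing problem, using indices as the stand-in for references. Node, Graph and the helper functions are hypothetical. Each node's data is stored once, and its neighbours refer to it instead of holding a copy, so a change is visible everywhere at once.

```cpp
#include <cstddef>
#include <vector>

struct Node {
    int value;
    std::vector<std::size_t> neighbours;  // indices into Graph::nodes
};

struct Graph {
    std::vector<Node> nodes;
};

// Three nodes, each connected to the other two.
Graph makeTriangle() {
    Graph g;
    g.nodes.push_back(Node{1, {1, 2}});
    g.nodes.push_back(Node{2, {0, 2}});
    g.nodes.push_back(Node{3, {0, 1}});
    return g;
}

// Adds up the values of the nodes adjacent to node i.
int sumOfNeighbours(const Graph& g, std::size_t i) {
    int total = 0;
    for (std::size_t n : g.nodes[i].neighbours) {
        total += g.nodes[n].value;
    }
    return total;
}
```

If each node stored copies of its neighbours instead, updating one node's value would mean finding and updating every copy by hand.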
Your friend's argument boils down to saying: tackling any problem complex enough to be represented by a Graph is a bad-design....
The only major argument I can think of is that the stack could possibly overflow, but I'm guessing that it is improbable that this will occur? Are there any other arguments against using only the stack/pass by value as opposed to pass by reference?
Well, gosh, where to start...
As you mention, "there is a lot of unnecessary copying occurring in the code". Let's say you've got a loop where you call a function on these objects. Using a pointer instead of duplicating the objects can accelerate execution by one or more orders of magnitude.
You can't pass variable-sized data structures, arrays, etc. around on the stack. You have to dynamically allocate them and pass a pointer or reference to the beginning. If your friend hasn't run into this, then yes, he's "new to C++."
As you mention, the program in question is simple and mostly uses quite small objects like graphics 3-tuples, which if the elements are doubles would be 24 bytes apiece. But in graphics, it's common to deal with 4x4 arrays, which handle both rotation and translation. Those would be 128 bytes apiece, so a program that had to deal with those could be roughly five times slower per function call with pass-by-value due to the increased copying. With pass-by-reference, passing a 3-tuple or a 4x4 array in a 32-bit executable would just involve duplicating a single 4-byte pointer.
On register-rich CPU architectures like ARM, PowerPC, 64-bit x86, 680x0 - but not 32-bit x86 - pointers (and references, which are secretly pointers wearing fancy syntactical clothing) are commonly passed or returned in a register, which is really freaking fast compared to the memory access involved in a stack operation.
You mention the improbability of running out of stack space. And yes, that's so on a small program one might write for a class assignment. But a couple of months ago, I was debugging commercial code that was probably 80 function calls below main(). If they'd used pass-by-value instead of pass-by-reference, the stack would have been ginormous. And lest your friend think this was a "broken design", this was actually a WebKit-based browser implemented on Linux using GTK+, all of which is very state-of-the-art, and the function call depth is normal for professional code.
Some executable architectures limit the size of an individual stack frame, so even though you might not run out of stack space per se, you could exceed that and wind up with perfectly valid C++ code that wouldn't build on such a platform.
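The byte counts quoted above for 3-tuples and 4x4 arrays can be checked with sizeof. Vec3 and Mat4 are hypothetical types for this sketch, and the 24 and 128 byte figures assume an 8-byte double, which is typical but not guaranteed by the standard.

```cpp
#include <cstddef>

struct Vec3 { double x, y, z; };   // 3 doubles
struct Mat4 { double m[4][4]; };   // 16 doubles

// Bytes copied when each type is passed by value.
constexpr std::size_t vec3Bytes() { return sizeof(Vec3); }
constexpr std::size_t mat4Bytes() { return sizeof(Mat4); }

// Bytes copied when a Mat4 is passed by pointer (and, in practice,
// by reference): 4 on a 32-bit target, 8 on a 64-bit one.
constexpr std::size_t referenceBytes() { return sizeof(const Mat4*); }
```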
I could go on and on.
If your friend is interested in graphics, he should take a look at some of the common APIs used in graphics: OpenGL and XWindows on Linux, Quartz on Mac OS X, DirectX on Windows. And he should look at the internals of large C/C++ systems like the WebKit or Gecko HTML rendering engines, or any of the Mozilla browsers, or the GTK+ or Qt GUI toolkits. They all pass anything much larger than a single integer or float by reference, and often fill in results by reference rather than as a function return value.
Nobody with any serious real world C/C++ chops - and I mean nobody - passes data structures by value. There's a reason for this: it's just flipping inefficient and problem-prone.
Wow, there are already 13 answers… I didn't read all in detail but I think this is quite different from the others…
He has a point. The advantage of pass-by-value as a rule is that subroutines cannot subtly modify their arguments. Passing non-const references would indicate that every function has ugly side effects, indicating poor design.
Simply explain to him the difference between vector3 & and vector3 const&, and demonstrate how the latter may be initialized by a constant as in vec_function( vector3(1,2,3) );, but not the former. Pass by const reference is a simple optimization of pass by value.
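A minimal sketch of that difference. The vector3 struct and the two function names are made up for illustration; the original post does not define them.

```cpp
// Hypothetical 3-component vector (an assumption, not from the post).
struct vector3 {
    float x, y, z;
    vector3(float x_, float y_, float z_) : x(x_), y(y_), z(z_) {}
};

// Const reference: no copy is made, the callee cannot modify the argument,
// and a temporary such as vector3(1,2,3) can bind to the parameter.
float length_squared(vector3 const& v) {
    return v.x * v.x + v.y * v.y + v.z * v.z;
}

// Non-const reference: the signature announces that the argument is modified.
// A temporary does not bind to it, so scale_in_place(vector3(1,2,3), 2.0f)
// is rejected by a conforming compiler.
void scale_in_place(vector3& v, float factor) {
    v.x *= factor;
    v.y *= factor;
    v.z *= factor;
}
```

Calling length_squared(vector3(1,2,3)) reads exactly like the pass-by-value version at the call site, which is why const reference works as a drop-in optimization.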
Buy your friend a good C++ book. Passing non-trivial objects by reference is good practice and saves you a lot of unnecessary constructor/destructor calls. This also has nothing to do with allocating on the free store vs. using the stack. You can (or should) pass objects allocated on the program stack by reference without any free store usage. You can also ignore the free store completely, but that throws you back to the old Fortran days, which your friend probably didn't have in mind - otherwise he would pick an ancient F77 compiler for your project, wouldn't he...?
This question already has answers here:
Closed 13 years ago.
Possible Duplicate:
Defensive programming
We had a great discussion this morning about the subject of defensive programming. We had a code review where a pointer was passed in and was not checked for validity.
Some people felt that only a check for a null pointer was needed. I questioned whether it could be checked at a higher level, rather than in every method it is passed through, and argued that checking for null is a very limited check if the object at the other end of the pointer does not meet certain requirements.
I understand and agree that a check for null is better than nothing, but it feels to me that checking only for null provides a false sense of security since it is limited in scope. If you want to ensure that the pointer is usable, check for more than the null.
What are your experiences on the subject? How do you write defenses into your code for parameters that are passed to subordinate methods?
In Code Complete 2, in the chapter on error handling, I was introduced to the idea of barricades. In essence, a barricade is code which rigorously validates all input coming into it. Code inside the barricade can assume that any invalid input has already been dealt with, and that the inputs that are received are good. Inside the barricade, code only needs to worry about invalid data passed to it by other code within the barricade. Asserting conditions and judicious unit testing can increase your confidence in the barricaded code. In this way, you program very defensively at the barricade, but less so inside the barricade. Another way to think about it is that at the barricade, you always handle errors correctly, and inside the barricade you merely assert conditions in your debug build.
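A rough sketch of the barricade idea, assuming a made-up task (parsing a percentage from untrusted text). The names parse_percent and percent_of are hypothetical and not from the book.

```cpp
#include <cassert>
#include <string>

// Inside the barricade: callers are our own code, so a violated precondition
// is a programming error and is only asserted (active in debug builds).
int percent_of(int percent, int total) {
    assert(percent >= 0 && percent <= 100);
    return total * percent / 100;
}

// At the barricade: the text comes from outside and is validated rigorously.
// On failure nothing is written to percent_out and false is returned.
bool parse_percent(const std::string& text, int& percent_out) {
    if (text.empty() || text.size() > 3) return false;
    int value = 0;
    for (char c : text) {
        if (c < '0' || c > '9') return false;  // digits only
        value = value * 10 + (c - '0');
    }
    if (value > 100) return false;
    percent_out = value;
    return true;
}
```

Code that receives its percentage through parse_percent can call percent_of without re-validating; the assert only documents and checks the assumption during development.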
As far as using raw pointers goes, usually the best you can do is assert that the pointer is not null. If you know what is supposed to be in that memory then you could ensure that the contents are consistent in some way. This raises the question of why that memory is not wrapped up in an object which can verify its consistency itself.
So, why are you using a raw pointer in this case? Would it be better to use a reference or a smart pointer? Does the pointer contain numeric data, and if so, would it be better to wrap it up in an object which managed the lifecycle of that pointer?
Answering these questions can help you find a way to be more defensive, in that you'll end up with a design that is easier to defend.
The best way to be defensive is not to check pointers for null at runtime, but to avoid using pointers that may be null to begin with.
If the object being passed in must not be null, use a reference! Or pass it by value! Or use a smart pointer of some sort.
The best way to do defensive programming is to catch your errors at compile-time.
If it is considered an error for an object to be null or point to garbage, then you should make those things compile errors.
Ultimately, you have no way of knowing if a pointer points to a valid object. So rather than checking for one specific corner case (which is far less common than the really dangerous ones, pointers pointing to invalid objects), make the error impossible by using a data type that guarantees validity.
I can't think of another mainstream language that allows you to catch as many errors at compile-time as C++ does. Use that capability.
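To illustrate, here is a small sketch with a hypothetical Account type (not from the question). The pointer version has to decide what null means at runtime; the reference version leaves no null to pass.

```cpp
#include <string>

struct Account {  // hypothetical type for illustration
    std::string owner;
    int balance;
};

// Pointer parameter: null is a legal argument, so the function needs a
// runtime check and a policy for what to return in that case.
int balance_or_zero(const Account* account) {
    return account ? account->balance : 0;
}

// Reference parameter: balance_of(nullptr) does not compile, so the
// "must not be null" rule is enforced before the program ever runs.
int balance_of(const Account& account) {
    return account.balance;
}
```

A reference can still dangle if the object it refers to is destroyed first, so this removes the null case rather than every lifetime bug.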
There is no way to check if a pointer is valid.
In all seriousness, it depends on how many bugs you'd like to have inflicted upon you.
Checking for a null pointer is definitely something that I would consider necessary but not sufficient. There are plenty of other solid principles you can use starting with entry points of your code (e.g., input validation = does that pointer point to something useful) and exit points (e.g., you thought the pointer pointed to something useful but it happened to cause your code to throw an exception).
In short, if you assume that everyone calling your code is going to do their best to ruin your life, you'll probably find a lot of the worst culprits.
EDIT for clarity: some other answers are talking about unit tests. I firmly believe that test code is sometimes more valuable than the code that it's testing (depending on who's measuring the value). That said, I also think that unit tests are necessary but not sufficient for defensive coding.
Concrete example: consider a 3rd party search method that is documented to return a collection of values that match your request. Unfortunately, what wasn't clear in the documentation for that method is that the original developer decided that it would be better to return a null rather than an empty collection if nothing matched your request.
So now you call your defensive, well unit-tested method (which is sadly lacking an internal null pointer check) and boom! A NullPointerException that, without an internal check, you have no way of dealing with:
defensiveMethod(thirdPartySearch("Nothing matches me"));
// You just passed a null to your own code.
I'm a big fan of the "let it crash" school of design. (Disclaimer: I don't work on medical equipment, avionics, or nuclear power-related software.) If your program blows up, you fire up the debugger and figure out why. In contrast, if your program keeps running after illegal parameters have been detected, by the time it crashes you'll probably have no idea what went wrong.
Good code consists of many small functions/methods, and adding a dozen lines of parameter-checking to every one of those snippets of code makes it harder to read and harder to maintain. Keep it simple.
I may be a bit extreme, but I don't like Defensive Programming, I think it's laziness that has introduced the principle.
For this particular example, there is no sense in asserting that the pointer is not null. If you want a pointer that is never null, there is no better way to enforce it (and document it clearly at the same time) than to use a reference instead. And it's documentation that is actually enforced by the compiler and costs zilch at runtime!
In general, I tend not to use 'raw' types directly. Let's illustrate:
void myFunction(std::string const& foo, std::string const& bar);
What are the possible values of foo and bar ? Well that's pretty much limited only by what a std::string may contain... which is pretty vague.
On the other hand:
void myFunction(Foo const& foo, Bar const& bar);
is much better!
if people mistakenly reverse the order of the arguments, it's detected by the compiler
each class is solely responsible for checking that the value is right, the users are not burdened.
I have a tendency to favor Strong Typing. If I have an entry that should be composed only of alphabetical characters and be up to 12 characters, I'd rather create a small class wrapping a std::string, with a simple validate method used internally to check the assignments, and pass that class around instead. This way I know that if I test the validation routine ONCE, I don't have to actually worry about all the paths through which that value can get to me: it will already be validated when it reaches me.
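Such a wrapper might look roughly like this. The class name ShortAlphaName, the greet function, and the choice to reject the empty string are assumptions made for the sketch.

```cpp
#include <cctype>
#include <stdexcept>
#include <string>

// Hypothetical wrapper: alphabetic characters only, 1 to 12 of them.
// Rejecting the empty string is an assumption; the rule only says "up to 12".
class ShortAlphaName {
public:
    explicit ShortAlphaName(const std::string& value) : value_(value) {
        if (!is_valid(value)) {
            throw std::invalid_argument("invalid ShortAlphaName: " + value);
        }
    }

    const std::string& str() const { return value_; }

    // The single validation routine; this is the part to test once.
    static bool is_valid(const std::string& s) {
        if (s.empty() || s.size() > 12) return false;
        for (unsigned char c : s) {
            if (!std::isalpha(c)) return false;
        }
        return true;
    }

private:
    std::string value_;
};

// Receivers take the wrapper, not a raw std::string, and rely on the invariant.
std::string greet(const ShortAlphaName& name) {
    return "Hello, " + name.str();
}
```

Because the constructor is the only way in, any ShortAlphaName that reaches greet has already passed is_valid.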
Of course, that doesn't mean that the code should not be tested. It's just that I favor strong encapsulation, and validation of an input is part of knowledge encapsulation in my opinion.
And since no rule comes without an exception... an exposed interface is necessarily bloated with validation code, because you never know what might come at you. However, with self-validating objects in your BOM it's quite transparent in general.
"Unit tests verifying the code does what it should do" > "production code trying to verify it's not doing what it's not supposed to do".
I wouldn't even check for null myself, unless it's part of a published API.
It very much depends; is the method in question ever called by code external to your group, or is it an internal method?
For internal methods, you can test enough to make this a moot point, and if you're building code where the goal is highest possible performance, you might not want to spend the time on checking inputs you're pretty darn sure are right.
For externally visible methods - if you have any - you should always double check your inputs. Always.
From a debugging point of view, it is most important that your code is fail-fast. The earlier the code fails, the easier it is to find the point of failure.
For internal methods, we usually stick to asserts for these kinds of checks. That does get errors picked up in unit tests (you have good test coverage, right?) or at least in integration tests that are running with assertions on.
Checking for a null pointer is only half of the story;
you should also assign a null value to every unassigned pointer.
Most responsible APIs do the same.
Checking for a null pointer is very cheap in CPU cycles, while an application crashing once it's delivered can cost you and your company money and reputation.
You can skip null pointer checks if the code is in a private interface you have complete control of and/or you check for null by running a unit test or some debug build test (e.g. assert).
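A small sketch of the "assign null to every unassigned pointer" habit, using a hypothetical Buffer type. The point is that a null check is only meaningful if pointers that do not own anything are reliably null.

```cpp
#include <cstddef>

struct Buffer {            // hypothetical type for illustration
    char* data = nullptr;  // never left uninitialized
    std::size_t size = 0;
};

void allocate(Buffer& b, std::size_t n) {
    delete[] b.data;         // deleting a null pointer is a no-op
    b.data = new char[n]();  // value-initialized to zero bytes
    b.size = n;
}

void release(Buffer& b) {
    delete[] b.data;
    b.data = nullptr;  // reset, so later null checks stay meaningful
    b.size = 0;
}

bool is_allocated(const Buffer& b) {
    return b.data != nullptr;
}
```

With the reset in release, calling it twice is harmless and is_allocated never reports a freed block as live.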
There are a few things at work here in this question which I would like to address:
Coding guidelines should specify that you either deal with a reference or a value directly instead of using pointers. By definition, pointers are value types that just hold an address in memory -- validity of a pointer is platform specific and means many things (range of addressable memory, platform, etc.)
If you find yourself ever needing a pointer for any reason (like for dynamically generated and polymorphic objects) consider using smart pointers. Smart pointers give you many advantages with the semantics of "normal" pointers.
If a type for instance has an "invalid" state then the type itself should provide for this. More specifically, you can implement the NullObject pattern that specifies how an "ill-defined" or "un-initialized" object behaves (maybe by throwing exceptions or by providing no-op member functions).
You can create a smart pointer that does the NullObject default that looks like this:
#include <memory>

template <class Type, class NullTypeDefault>
struct possibly_null_ptr {
    // Default construction installs the null-object implementation.
    possibly_null_ptr() : p(new NullTypeDefault) {}
    possibly_null_ptr(Type* p_) : p(p_) {}
    Type* operator->() { return p.get(); }
    ~possibly_null_ptr() {}
private:
    std::shared_ptr<Type> p;
    // Friend template, so that the free operator* below can reach p.
    template <class T, class N>
    friend T& operator*(possibly_null_ptr<T, N>&);
};
template <class Type, class NullTypeDefault>
Type& operator*(possibly_null_ptr<Type, NullTypeDefault>& p) {
    return *p.p;
}
Then use the possibly_null_ptr<> template in cases where you support possibly null pointers to types that have a default derived "null behavior". This makes it explicit in the design that there is an acceptable behavior for "null objects", and this makes your defensive practice documented in the code -- and more concrete -- than a general guideline or practice.
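As an illustration of what a "null behavior" type could be, here is a sketch with a hypothetical Logger interface. It uses std::shared_ptr directly as a stand-in for the fallback the template performs; all names here are assumptions.

```cpp
#include <memory>
#include <string>

struct Logger {  // hypothetical interface
    virtual ~Logger() {}
    virtual void log(const std::string& message) = 0;
};

// Null object: a valid Logger whose operations do nothing.
struct NullLogger : Logger {
    void log(const std::string&) override {}
};

// A real implementation, here one that just counts messages.
struct CountingLogger : Logger {
    int count = 0;
    void log(const std::string&) override { ++count; }
};

// Stand-in for the smart pointer's default: substitute the null object
// whenever no real logger was supplied.
std::shared_ptr<Logger> logger_or_null_object(std::shared_ptr<Logger> maybe) {
    if (maybe) return maybe;
    return std::make_shared<NullLogger>();
}

// Client code never checks for null; it always has something to call.
void process(Logger& log) {
    log.log("processing");
}
```

The decision about what "no logger" means is made once, in the null object, instead of at every call site.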
Pointers should only be used if you need to do something with the pointer itself, such as pointer arithmetic to traverse some data structure. Even then, if possible, that should be encapsulated in a class.
If the pointer is passed into the function to do something with the object to which it points, then pass in a reference instead.
One method for defensive programming is to assert almost everything that you can. At the beginning of the project it is annoying but later it is a good adjunct to unit testing.
A number of answers address the question of how to write defenses into your code, but not much has been said about "how defensive should you be?". That's something you have to evaluate based on the criticality of your software components.
We're doing flight software and the impacts of a software error range from a minor annoyance to loss of aircraft/crew. We categorize different pieces of software based on their potential adverse impacts which affects coding standards, testing, etc. You need to evaluate how your software will be used and the impacts of errors and set what level of defensiveness you want (and can afford). The DO-178B standard calls this "Design Assurance Level".