We had a great discussion this morning about defensive programming. We had a code review where a pointer was passed in and was not checked for validity.
Some people felt that only a check for a null pointer was needed. I questioned whether it could be checked at a higher level, rather than in every method it is passed through, and noted that checking for null is a very limited check if the object at the other end of the pointer does not meet certain requirements.
I understand and agree that a check for null is better than nothing, but it feels to me that checking only for null provides a false sense of security, since it is limited in scope. If you want to ensure that the pointer is usable, check for more than null.
What are your experiences on the subject? How do you write defenses into your code for parameters that are passed to subordinate methods?
In Code Complete 2, in the chapter on error handling, I was introduced to the idea of barricades. In essence, a barricade is code which rigorously validates all input coming into it. Code inside the barricade can assume that any invalid input has already been dealt with, and that the inputs that are received are good. Inside the barricade, code only needs to worry about invalid data passed to it by other code within the barricade. Asserting conditions and judicious unit testing can increase your confidence in the barricaded code. In this way, you program very defensively at the barricade, but less so inside the barricade. Another way to think about it is that at the barricade, you always handle errors correctly, and inside the barricade you merely assert conditions in your debug build.
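To make the idea concrete, here is a minimal sketch with hypothetical function names: the outer function is the barricade and validates rigorously; the inner function trusts its input and merely asserts.
#include <cassert>
#include <stdexcept>
#include <string>

void processValidated(const std::string& request);

// Barricade: external input is rigorously validated here, at the boundary.
void handleRequest(const char* userInput) {
    if (userInput == nullptr || *userInput == '\0') {
        throw std::invalid_argument("empty request");
    }
    processValidated(userInput);
}

// Inside the barricade: input can be trusted, so a debug-build assert suffices.
void processValidated(const std::string& request) {
    assert(!request.empty());
    // ... real work ...
}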
As far as using raw pointers goes, usually the best you can do is assert that the pointer is not null. If you know what is supposed to be in that memory then you could ensure that the contents are consistent in some way. This raises the question of why that memory is not wrapped up in an object which can verify its consistency itself.
So, why are you using a raw pointer in this case? Would it be better to use a reference or a smart pointer? Does the pointer contain numeric data, and if so, would it be better to wrap it up in an object which managed the lifecycle of that pointer?
Answering these questions can help you find a way to be more defensive, in that you'll end up with a design that is easier to defend.
The best way to be defensive is not to check pointers for null at runtime, but to avoid using pointers that may be null to begin with.
If the object being passed in must not be null, use a reference! Or pass it by value! Or use a smart pointer of some sort.
The best way to do defensive programming is to catch your errors at compile-time.
If it is considered an error for an object to be null or point to garbage, then you should make those things compile errors.
Ultimately, you have no way of knowing if a pointer points to a valid object. So rather than checking for one specific corner case (which is far less common than the really dangerous ones, pointers pointing to invalid objects), make the error impossible by using a data type that guarantees validity.
I can't think of another mainstream language that allows you to catch as many errors at compile time as C++ does. Use that capability.
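As a sketch of what that looks like in a signature (Widget and render are hypothetical names):
#include <memory>

struct Widget { void draw() {} };

// Raw pointer: the caller may pass nullptr, so the callee has to check.
void render(Widget* w) { if (w) w->draw(); }

// Reference: a "null reference" cannot be created without the caller having
// already invoked undefined behavior, so no check is needed here.
void render(Widget& w) { w.draw(); }

// Smart pointer: ownership and lifetime are explicit in the signature.
void render(std::unique_ptr<Widget> w) { if (w) w->draw(); }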
There is no way to check if a pointer is valid.
In all seriousness, it depends on how many bugs you'd like to have inflicted upon you.
Checking for a null pointer is definitely something that I would consider necessary but not sufficient. There are plenty of other solid principles you can use starting with entry points of your code (e.g., input validation = does that pointer point to something useful) and exit points (e.g., you thought the pointer pointed to something useful but it happened to cause your code to throw an exception).
In short, if you assume that everyone calling your code is going to do their best to ruin your life, you'll probably find a lot of the worst culprits.
EDIT for clarity: some other answers are talking about unit tests. I firmly believe that test code is sometimes more valuable than the code that it's testing (depending on who's measuring the value). That said, I also think that unit tests are necessary but not sufficient for defensive coding.
Concrete example: consider a 3rd party search method that is documented to return a collection of values that match your request. Unfortunately, what wasn't clear in the documentation for that method is that the original developer decided that it would be better to return a null rather than an empty collection if nothing matched your request.
So now, you call your defensive and well unit-tested method (which is sadly lacking an internal null pointer check) and boom! A NullPointerException that, without an internal check, you have no way of dealing with:
defensiveMethod(thirdPartySearch("Nothing matches me"));
// You just passed a null to your own code.
I'm a big fan of the "let it crash" school of design. (Disclaimer: I don't work on medical equipment, avionics, or nuclear power-related software.) If your program blows up, you fire up the debugger and figure out why. In contrast, if your program keeps running after illegal parameters have been detected, by the time it crashes you'll probably have no idea what went wrong.
Good code consists of many small functions/methods, and adding a dozen lines of parameter-checking to every one of those snippets of code makes it harder to read and harder to maintain. Keep it simple.
I may be a bit extreme, but I don't like defensive programming; I think it's laziness that introduced the principle.
For this particular example, there is no sense in asserting that the pointer is not null. If you don't want a null pointer, there is no better way to enforce it (and document it clearly at the same time) than to use a reference instead. And it's documentation that is actually enforced by the compiler and costs nothing at runtime!
In general, I tend not to use 'raw' types directly. Let's illustrate:
void myFunction(std::string const& foo, std::string const& bar);
What are the possible values of foo and bar? Well, that's pretty much limited only by what a std::string may contain... which is pretty vague.
On the other hand:
void myFunction(Foo const& foo, Bar const& bar);
is much better!
if people mistakenly reverse the order of the arguments, it's detected by the compiler
each class is solely responsible for checking that its value is right; the users are not burdened.
I have a tendency to favor Strong Typing. If I have an entry that should be composed only of alphabetical characters and be up to 12 characters long, I'd rather create a small class wrapping a std::string, with a simple validate method used internally to check the assignments, and pass that class around instead. This way I know that if I test the validation routine ONCE, I don't have to worry about all the paths through which that value can get to me; it will have been validated by the time it reaches me.
Of course, that doesn't mean that the code should not be tested. It's just that I favor strong encapsulation, and validation of an input is part of knowledge encapsulation, in my opinion.
And as no rule comes without an exception... an exposed interface is necessarily bloated with validation code, because you never know what might come at you. However, with self-validating objects in your BOM it's quite transparent in general.
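A minimal sketch of the 12-character alphabetic wrapper described above (the class name Name and its interface are hypothetical):
#include <algorithm>
#include <cctype>
#include <stdexcept>
#include <string>

// Validation happens once, at construction; every holder of a Name
// can then rely on the invariant.
class Name {
public:
    explicit Name(std::string value) : value_(std::move(value)) {
        if (value_.empty() || value_.size() > 12 ||
            !std::all_of(value_.begin(), value_.end(),
                         [](unsigned char c) { return std::isalpha(c) != 0; })) {
            throw std::invalid_argument("Name: 1-12 alphabetic characters required");
        }
    }
    const std::string& str() const { return value_; }
private:
    std::string value_;
};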
"Unit tests verifying the code does what it should do" > "production code trying to verify its not doing what its not supposed to do".
I wouldn't even check for null myself, unless its part of a published API.
It very much depends; is the method in question ever called by code external to your group, or is it an internal method?
For internal methods, you can test enough to make this a moot point, and if you're building code where the goal is highest possible performance, you might not want to spend the time on checking inputs you're pretty darn sure are right.
For externally visible methods - if you have any - you should always double check your inputs. Always.
From a debugging point of view, it is most important that your code is fail-fast. The earlier the code fails, the easier it is to find the point of failure.
For internal methods, we usually stick to asserts for these kinds of checks. That does get errors picked up in unit tests (you have good test coverage, right?) or at least in integration tests that are running with assertions on.
Checking for a null pointer is only half of the story;
you should also assign a null value to every unassigned pointer.
Most responsible APIs will do the same.
Checking for a null pointer comes very cheap in CPU cycles, while an application crashing after it has been delivered can cost you and your company money and reputation.
You can skip null pointer checks if the code is in a private interface you have complete control of and/or you check for null by running a unit test or some debug-build test (e.g., assert).
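A minimal illustration of both conventions (the Connection type and functions are hypothetical):
#include <cassert>

struct Connection { /* ... */ };

Connection* gConn = nullptr;   // never leave a pointer unassigned

void closeConnection(Connection*& c) {
    delete c;
    c = nullptr;               // null out what you free, so stale uses are caught
}

void send(Connection* c) {
    assert(c != nullptr);      // debug-build check, free in release builds
    // ... use c ...
}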
There are a few things at work here in this question which I would like to address:
Coding guidelines should specify that you either deal with a reference or a value directly instead of using pointers. By definition, pointers are value types that just hold an address in memory -- validity of a pointer is platform specific and means many things (range of addressable memory, platform, etc.)
If you find yourself ever needing a pointer for any reason (like for dynamically generated and polymorphic objects) consider using smart pointers. Smart pointers give you many advantages with the semantics of "normal" pointers.
If a type for instance has an "invalid" state then the type itself should provide for this. More specifically, you can implement the NullObject pattern that specifies how an "ill-defined" or "un-initialized" object behaves (maybe by throwing exceptions or by providing no-op member functions).
You can create a smart pointer that provides the NullObject default, which looks like this:
#include <memory>

template <class Type, class NullTypeDefault>
struct possibly_null_ptr {
    possibly_null_ptr() : p(new NullTypeDefault) {}
    // Fall back to the null-object default when handed a null pointer.
    possibly_null_ptr(Type* p_) : p(p_ ? p_ : new NullTypeDefault) {}
    Type* operator->() { return p.get(); }
    ~possibly_null_ptr() {}
private:
    std::shared_ptr<Type> p;
    template <class T, class N>
    friend T& operator*(possibly_null_ptr<T, N>&);
};

template <class Type, class NullTypeDefault>
Type& operator*(possibly_null_ptr<Type, NullTypeDefault>& p) {
    return *p.p;
}
Then use the possibly_null_ptr<> template in cases where you support possibly null pointers to types that have a default derived "null behavior". This makes it explicit in the design that there is an acceptable behavior for "null objects", and it makes your defensive practice documented in the code -- and more concrete than a general guideline or practice.
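For illustration, a hypothetical usage sketch (Logger, NullLogger, and maybeGetLogger are made-up names), assuming the constructor above falls back to NullTypeDefault when handed a null pointer:
struct Logger {
    virtual void log(const char*) = 0;
    virtual ~Logger() {}
};

struct NullLogger : Logger {
    void log(const char*) override {}   // the defined "null behavior": do nothing
};

Logger* maybeGetLogger();               // may return nullptr

void report() {
    possibly_null_ptr<Logger, NullLogger> logger(maybeGetLogger());
    logger->log("hello");               // safe whether or not a real logger exists
}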
A pointer should only be used if you need to do something with the pointer itself, such as pointer arithmetic to traverse some data structure. Even then, if possible, that should be encapsulated in a class.
If the pointer is passed into the function to do something with the object to which it points, then pass in a reference instead.
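A sketch of that encapsulation, with a hypothetical ByteCursor class: the arithmetic lives in one place, and callers never touch the raw pointer.
class ByteCursor {
public:
    ByteCursor(const unsigned char* begin, const unsigned char* end)
        : cur_(begin), end_(end) {}
    bool done() const { return cur_ == end_; }
    unsigned char next() { return *cur_++; }   // the only pointer arithmetic
private:
    const unsigned char* cur_;
    const unsigned char* end_;
};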
One method for defensive programming is to assert almost everything that you can. At the beginning of the project it is annoying but later it is a good adjunct to unit testing.
A number of answers address the question of how to write defenses in your code, but not much was said about "how defensive should you be?". That's something you have to evaluate based on the criticality of your software components.
We're doing flight software and the impacts of a software error range from a minor annoyance to loss of aircraft/crew. We categorize different pieces of software based on their potential adverse impacts which affects coding standards, testing, etc. You need to evaluate how your software will be used and the impacts of errors and set what level of defensiveness you want (and can afford). The DO-178B standard calls this "Design Assurance Level".
I was told recently in a job interview that their project aims to build the smallest possible binary for their application (it runs embedded), so I would not be able to use things such as templates or smart pointers, as these would increase the binary size. They seemed to imply that using things from std would generally be a no-go (though not in all cases).
After the interview, I tried to research online which features of the standard library cause large binary sizes, and I could find basically nothing on this. Is there a way to quantify the size impact of using certain features (without, for example, needing to write 100 smart pointers in a code base versus self-managed pointers)?
This question probably deserves more attention than it’s likely to get, especially for people trying to pursue a career in embedded systems. So far the discussion has gone about the way that I would expect, specifically a lot of conversation about the nuances of exactly how and when a project built with C++ might be more bloated than one written in plain C or a restricted C++ subset.
This is also why you can’t find a definitive answer from a good old fashioned google search. Because if you just ask the question “is C++ more bloated than X?”, the answer is always going to be “it depends.”
So let me approach this from a slightly different angle. I've both worked for and interviewed at companies that enforced these kinds of restrictions; I've even voluntarily enforced them myself. It really comes down to this: when you're running an engineering organization with more than one person, with plans to keep hiring, it is wildly impractical to assume everyone on your team fully understands the implications of using every feature of a language. Coding standards and language restrictions serve as a cheap way to prevent people from doing "bad things" without knowing they're doing "bad things".
How you define a “bad thing” is then also context specific. On a desktop platform, using lots of code space isn’t really a “bad” enough thing to rigorously enforce. On a tiny embedded system, it probably is.
C++ by design makes it very easy for an engineer to generate lots of code without having to type it out explicitly. I think that statement is pretty self-evident, it’s the whole point of meta-programming, and I doubt anyone would challenge it, in fact it’s one of the strengths of the language.
So then coming back to the organizational challenges, if your primary optimization variable is code space, you probably don’t want to allow people to use features that make it trivial to generate code that isn’t obvious. Some people will use that feature responsibly and some people won’t, but you have to standardize around the least common denominator. A C compiler is very simple. Yes you can write bloated code with it, but if you do, it will probably be pretty obvious from looking at it.
(Partially extracted from comments I wrote earlier)
I don't think there is a comprehensive answer. A lot also depends on the specific use case and needs to be judged on a case-by-case basis.
Templates
Templates may result in code bloat, yes, but they can also avoid it. If your alternative is introducing indirection through function pointers or virtual methods, then the templated function itself may become bigger in code size, simply because function calls take several instructions and remove optimization potential.
Another aspect where they can at least not hurt is when used in conjunction with type erasure. The idea here is to write generic code, then put a small template wrapper around it that only provides type safety but does not actually emit any new code. Qt's QList is an example that does this to some extent.
This bare-bones vector type shows what I mean:
#include <cstddef>

class VectorBase
{
protected:
    void **start, **end, **capacity;
    void push_back(void*);
    void* at(std::size_t i);
    void clear(void (*cleanup_function)(void*));
};

template<class T>
class Vector: public VectorBase
{
public:
    void push_back(T* value)
    { this->VectorBase::push_back(value); }

    T* at(std::size_t i)
    { return static_cast<T*>(this->VectorBase::at(i)); }

    ~Vector()
    { clear(+[](void* object) { delete static_cast<T*>(object); }); }
};
By carefully moving as much code as possible into the non-templated base, the template itself can focus on type safety and on providing the necessary indirections, without emitting any code that wouldn't have been there anyway.
(Note: This is just meant as a demonstration of type erasure, not an actually good vector type)
Smart pointers
When written carefully, they won't generate much code that wouldn't be there anyway. Whether an inline function generates a delete statement or the programmer does it manually doesn't really matter.
The main issue that I see with those is that the programmer is better at reasoning about code and avoiding dead code. For example, even after a unique_ptr has been moved away from, its destructor still runs, so the compiler still has to emit code for it. A programmer knows that the value is NULL; the compiler often doesn't.
Another issue comes up with calling conventions. Objects with destructors are usually passed on the stack, even if you declare them pass-by-value. Same for return values. So a function unique_ptr<foo> bar(unique_ptr<foo> baz) will have higher overhead than foo* bar(foo* baz) simply because pointers have to be put on and off the stack.
Even more egregiously, the calling convention used for example on Linux makes the caller clean up parameters instead of the callee. That means if a function accepts a complex object like a smart pointer by value, a call to the destructor for that parameter is replicated at every call site, instead of putting it once inside the function. Especially with unique_ptr this is so stupid because the function itself may know that the object has been moved away and the destructor is superfluous; but the caller doesn't know this (unless you have LTO).
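A sketch of the comparison (foo and the functions are hypothetical; the exact codegen depends on the ABI):
#include <memory>

struct foo { int value; };

// Plain pointers travel in registers on typical ABIs.
foo* bar_raw(foo* baz) { return baz; }

// unique_ptr has a non-trivial destructor, so typical ABIs pass and return
// it indirectly through memory, and (on Linux) the caller emits a destructor
// call for the parameter at every call site.
std::unique_ptr<foo> bar_smart(std::unique_ptr<foo> baz) { return baz; }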
Shared pointers are a different beast altogether, simply because they allow a lot of different tradeoffs. Should they be atomic? Should they allow type casting and weak pointers, and what indirection is used for destruction? Do you really need two raw pointers per shared pointer, or can the reference counter be accessed through the shared object?
Exceptions, RTTI
Generally avoided and removed via compiler flags.
Library components
On a bare-metal system, pulling in parts of the standard library can have a significant effect that can only be measured after the linker step. I suggest any such project use continuous integration and track code size as a metric.
For example I once added a small feature, I don't remember which, and in its error handling it used std::stringstream. That pulled in the entire iostream library. The resulting code exceeded my entire RAM and ROM capacity. IIRC the issue was that even though exception handling was deactivated, the exception message was still being set up.
Move constructors and destructors
It's a shame that C++'s move semantics aren't the same as for example Rust's where objects can be moved with a simple memcpy and then "forgetting" their original location. In C++ the destructor for a moved object is still invoked, which requires more code in the move constructor / move assignment operator, and in the destructor.
Qt for example accounts for such simple cases in its meta type system.
In our code base we have many constructions like this:
auto* pObj = getObjectThatMayVeryRarelyBeNull();
if (!pObj) throw std::runtime_error("Ooops!");
// Use pObj->(...)
In 99.99% of cases this check is not triggered. I'm thinking about the following solution:
auto& obj = deref_or_throw(getObjectThatMayVeryRarelyBeNull());
// Use obj.(...)
Where deref_or_throw is declared as follows:
#include <stdexcept>

template<class T> T& deref_or_throw(T* p) {
    if (p == nullptr) { throw std::invalid_argument("Argument is null!"); }
    return *p;
}
That code is much clearer and works as I need.
The question is: am I reinventing the wheel? Is there some related solution in standard or boost? Or do you have some comments on the solution?
PS. Related question (with no satisfying answer): Is there a C++ equivalent of a NullPointerException
There are two ways of dealing with the "rare case of null-pointer" issue. First, it's the proposed exception solution. For this, the method deref_or_throw is a good thing, though I would prefer to throw a runtime_error, rather than an invalid_argument exception. You may also consider naming it NullPointerException, if you prefer that.
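A sketch of that variant, combining both suggestions (the NullPointerException name is the hypothetical one mentioned above):
#include <stdexcept>

struct NullPointerException : std::runtime_error {
    NullPointerException()
        : std::runtime_error("null pointer passed where an object was required") {}
};

template <class T>
T& deref_or_throw(T* p) {
    if (p == nullptr) { throw NullPointerException(); }
    return *p;
}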
However, if the ptr != nullptr case is actually a precondition of the algorithm, you should try your best to achieve 100% certainty that the pointer is non-null. An assert is the appropriate thing in this case.
Advantages of assert:
No cost at runtime in release mode
Makes it crystal clear to the developer that he is responsible for ensuring this never happens.
Disadvantages of assert:
Potential undefined behavior in release mode
You should also consider writing a method getObjectThatMayNeverBeNullButDoingTheSameAsGetObjectThatMayVeryRarelyBeNull(): a method that is guaranteed to return a non-null pointer to an object that is otherwise completely equivalent to your original method. However, if you can go the extra mile, I would strongly suggest not returning a raw pointer at all. Embrace C++11 and return objects by value :-)
You can, and probably should, design with null cases in mind. For example, in your problem domain, when does it make sense for class methods to return null? When does it make sense to call functions with null arguments?
Broadly speaking, it can be beneficial to remove null references from code whenever possible. In other words, you can program to the invariant that "this will not be null here... or there". This has many benefits. You won't have to obfuscate code by wrapping methods in deref_or_throw. You may achieve more semantic meaning from code, because, how often are things null in the real world? Your code may be more readable if you use exceptions to indicate errors rather than null values. And finally, you reduce the need for error-checking and also reduce the risk of run time errors (the dreaded null pointer dereference).
If your system was not designed with null cases in mind, I'd say it's best to leave it alone and not go crazy wrapping everything with deref_or_throw. Perhaps take an Agile approach. As you are coding, inspect the contracts that your class objects offer as services. How often can these contracts reasonably be expected to return null? What is the semantic value of null in these cases?
You could probably identify the classes in your system that might reasonably be expected to return null. For these classes, null may be valid business logic, rather than a fringe case indicative of implementation errors. For the classes that might be expected to return null, the additional check may be worth the safety. In this sense, checking for null feels more like business logic than a low-level implementation detail. Across the board, though, systematically wrapping all pointer dereferences in deref_or_throw might be a cure that is its own poison.
This is definitely subjective, but I'd like to try to avoid it becoming argumentative. I think it could be an interesting question if people treat it appropriately.
In several of my recent projects I implemented architectures where long delegation chains are a common thing.
Dual delegation chains are encountered very often:
bool Exists = Env->FileSystem->FileExists( "foo.txt" );
And triple delegation is not rare at all:
Env->Renderer->GetCanvas()->TextStr( ... );
Delegation chains of higher order exist but are really scarce.
In the above-mentioned examples no NULL run-time checks are performed, since the objects used are always there: they are vital to the functioning of the program and are explicitly constructed when execution starts. Basically, I split a delegation chain in these cases:
1) I reuse the object obtained through a delegation chain:
{ // make C invisible to the parent scope
clCanvas* C = Env->Renderer->GetCanvas();
C->TextStr( ... );
C->TextStr( ... );
C->TextStr( ... );
}
2) An intermediate object somewhere in the middle of the delegation chain should be checked for NULL before usage, e.g.:
clCanvas* C = Env->Renderer->GetCanvas();
if ( C ) C->TextStr( ... );
I used to fight case (2) by providing proxy objects, so that a method is always invoked on a non-NULL object, leading to an empty result.
My questions are:
Is either of cases (1) or (2) a pattern or an antipattern?
Is there a better way to deal with long delegation chains in C++?
Here are some pros and cons I considered while making my choice:
Pros:
it is very descriptive: it is clear from one line of code where the object came from
long delegation chains look nice
Cons:
interactive debugging is labored since it is hard to inspect more than one temporary object in the delegation chain
I would like to know other pros and cons of long delegation chains. Please present your reasoning, and vote based on how well-argued an opinion is, not on how much you agree with it.
I wouldn't go so far as to call either an anti-pattern. However, the first has the disadvantage that your variable C is visible even after it's no longer logically relevant (gratuitously wide scoping).
You can get around this by using this syntax:
if (clCanvas* C = Env->Renderer->GetCanvas()) {
C->TextStr( ... );
/* some more things with C */
}
This is allowed in C++ (while it's not in C) and allows you to keep proper scope (C is scoped as if it were inside the conditional's block) and check for NULL.
Asserting that something is not NULL is by all means better than getting killed by a SegFault. So I wouldn't recommend simply skipping these checks, unless you're 100% sure that the pointer can never, ever be NULL.
Additionally, you could encapsulate your checks in an extra free function, if you feel particularly dandy:
#include <cassert>

template <typename T>
T notNULL(T value) {
    assert(value);
    return value;
}
// e.g.
notNULL(notNULL(Env)->Renderer->GetCanvas())->TextStr();
In my experience, chains like that often contain getters that are less than trivial, leading to inefficiencies. I think that (1) is a reasonable approach. Using proxy objects seems like overkill. I would rather see a crash on a NULL pointer than use proxy objects.
Such long chains of delegation should not happen if you follow the Law of Demeter. I've often argued with some of its proponents that they were holding themselves to it too conscientiously, but if you have come to the point of wondering how best to handle long delegation chains, you should probably be a little more compliant with its recommendations.
Interesting question, I think this is open to interpretation, but:
My Two Cents
Design patterns are just reusable solutions to common problems which are generic enough to be widely applied in object oriented (usually) programming. Many common patterns will start you out with interfaces, inheritance chains, and/or containment relationships that will result in you using chaining to call things to some extent. The patterns are not trying to solve a programming issue like this though - chaining is just a side effect of them solving the functional problems at hand. So, I wouldn't really consider it a pattern.
Equally, anti-patterns are approaches that (in my mind) counter-act the purpose of design patterns. For example, design patterns are all about structure and the adaptability of your code. People consider a singleton an anti-pattern because it (often, not always) results in spider-web like code due to the fact that it inherently creates a global, and when you have many, your design deteriorates fast.
So, again, your chaining problem doesn't necessarily indicate good or bad design - it's not related to the functional objectives of patterns or the drawbacks of anti-patterns. Some designs just have a lot of nested objects even when designed well.
What to do about it:
Long delegation chains can definitely be a pain in the butt after a while, and as long as your design dictates that the pointers in those chains won't be reassigned, I think saving a temporary pointer to the point in the chain you're interested in is completely fine (function scope or smaller, preferably).
Personally though, I'm against saving a permanent pointer to a part of the chain as a class member as I've seen that end up in people having 30 pointers to sub objects permanently stored, and you lose all conception of how the objects are laid out in the pattern or architecture you're working with.
One other thought: I'm not sure if I like this or not, but I've seen some people create a private (for your sanity) function that navigates the chain, so you can call that instead and not deal with issues about whether your pointer changes under the covers, or whether you have nulls. It can be nice to wrap all that logic up once, put a nice comment at the top of the function stating which part of the chain it gets the pointer from, and then just use the function result directly in your code instead of spelling out your delegation chain each time.
Performance
My last note would be that this wrap-in-function approach, as well as your delegation chain approach, both suffer from performance drawbacks. Saving a temporary pointer lets you avoid the extra two dereferences potentially many times if you're using these objects in a loop. Equally, storing the pointer from the function call will avoid the overhead of an extra function call every loop cycle.
For bool Exists = Env->FileSystem->FileExists( "foo.txt" ); I'd rather go for an even more detailed breakdown of your chain, so in my ideal world, there are the following lines of code:
Environment* env = GetEnv();
FileSystem* fs = env->FileSystem;
bool exists = fs->FileExists( "foo.txt" );
and why? Some reasons:
readability: my attention gets lost by the time I reach the end of the line in the case of bool Exists = Env->FileSystem->FileExists( "foo.txt" ); It's just too long for me.
validity: regardless of your having mentioned that the objects are always there, if your company hires a new programmer tomorrow and he starts writing code, the day after tomorrow the objects might not be there. These long lines are pretty unfriendly; new people might get scared of them and do something interesting such as optimising them... which will take a more experienced programmer extra time to fix.
debugging: if by any chance (and after you have hired the new programmer) the application throws a segmentation fault in the long chain, it is pretty difficult to find out which object was the guilty one. The more detailed the breakdown, the easier it is to find the location of the bug.
speed: if you need to make lots of calls for the same chain elements, it might be faster to "pull out" a local variable from the chain instead of calling a "proper" getter function for it each time. I don't know if your code is production or not, but it seems to lack the "proper" getter functions; instead it seems to use only the attributes.
Long delegation chains are a bit of a design smell to me.
What a delegation chain tells me is that one piece of code has deep access to an unrelated piece of code, which makes me think of high coupling, which goes against the SOLID design principles.
The main problem I have with this is maintainability. If you're reaching two levels deep, that is two independent pieces of code that could evolve on their own and break under you. This quickly compounds when you have functions inside the chain, because they can contain chains of their own - for example, Renderer->GetCanvas() could be choosing the canvas based on information from another hierarchy of objects and it is difficult to enforce a code path that does not end up reaching deep into objects over the life time of the code base.
The better way would be to create an architecture that obeyed the SOLID principles and used techniques like Dependency Injection and Inversion Of Control to guarantee your objects always have access to what they need to perform their duties. Such an approach also lends itself well to automated and unit testing.
Just my 2 cents.
If it is possible I would use references instead of pointers. So delegates are guaranteed to return valid objects or throw exception.
clCanvas & C = Env.Renderer().GetCanvas();
For objects which may not exist, I would provide additional methods such as Has, Is, etc.
if ( Env.HasRenderer() ) { clCanvas& C = Env.Renderer().GetCanvas(); /* ... */ }
If you can guarantee that all the objects exist, I don't really see a problem in what you're doing. As others have mentioned, even if you think that NULL will never happen, it may just happen anyway.
This being said, I see that you use bare pointers everywhere. What I would suggest is that you start using smart pointers instead. When you use the -> operator, a checked smart pointer can throw if the pointer is NULL (the standard ones don't check, but a wrapper can), so you avoid a SegFault. Not only that: if you use smart pointers, you can keep copies and the objects don't just disappear under your feet. You have to explicitly reset each smart pointer before the pointer goes to NULL.
This being said, it wouldn't prevent the -> operator from throwing once in a while.
Otherwise I would rather use the approach proposed by AProgrammer. If object A needs a pointer to object C pointed to by object B, then the work that object A is doing is probably something that object B should actually be doing. So A can guarantee that it has a pointer to B at all times (because it holds a shared pointer to B, which thus cannot go NULL), and it can always call a function on B to do action Z on object C. Inside that function, B knows whether it always has a pointer to C or not; that's part of B's implementation.
Note that with C++11 you have std::shared_ptr<> and std::unique_ptr<>, so use them!
I think it's best if I describe the situation using a code example:
void MyFuncA()
{
    MyClass someInstance;
    //<Work with and fill someInstance...>
    MyFuncB( &someInstance );
}

void MyFuncB( MyClass* instance )
{
    //Do anything you could imagine with instance, *except*:
    //* Allowing references to it or any of its data members to escape this function
    //* Freeing anything the class will free in its destructor, including itself
    instance->DoThis();
    instance->ModifyThat();
}
And here come my straightforward questions:
Is the above concept guaranteed, by the C and C++ standards, to work as expected? Why (not)?
Is doing this, sparingly and with care, considered bad practice?
Is the above concept guaranteed, by the C and C++ standards, to work as expected? Why (not)?
Yes, it will work as expected. someInstance is available throughout the scope of MyFuncA. The call to MyFuncB is within that scope.
Is doing this, sparingly and with care, considered bad practice?
Don't see why.
I don't see any problem in actually using the pointer you were passed to call functions on the object. As long as you call public methods of MyClass, everything remains valid C/C++.
The actual instance you create at the beginning of MyFuncA() will get destroyed at the end of MyFuncA(), and you are guaranteed that the instance will remain valid for the whole execution of MyFuncB() because someInstance is still valid in the scope of MyFuncA().
Yes it will work. It does not matter if the pointer you pass into MyFuncB is on the stack or on the heap (in this specific case).
As regards the bad practice part, you can probably argue both ways. In general I think it's bad, because if for any reason any object living outside of MyFuncA gets hold of the object reference, it will die a horrible death later on and cause some very hard-to-track bugs. It really depends on how extensive the usage of the object becomes in MyFuncB. It can get messy, especially when a third class gets involved.
Others have answered the basic question, with "yeah, that's legal". And in the absence of greater architecture it is hard to call it good or bad practice. But I'll try and wax philosophical on the broader question you seem to be picking up about pointers, object lifetimes, and expectations across function calls...
In the C++ language, there's no built-in way to pass a pointer to a function and "enforce" that it won't stow that away after the call is complete. And since C++ pointers are "weak references" by default, the objects pointed to may disappear out from under someone you pass it to.
But explicitly weak pointer abstractions do exist, for instance in Qt:
http://doc.qt.nokia.com/latest/qweakpointer.html
These are designed to specifically encode the "paranoia" to the recipient that the object it is holding onto can disappear out from under it. Anyone dereferencing one sort of realizes something is up, and they have to take the proper cautions under the design contract.
Additionally, abstractions like shared pointer exist which signal a different understanding to the recipient. Passing them one of those gives them the right to keep the object alive as long as they want, giving you something like garbage collection:
http://doc.qt.nokia.com/4.7-snapshot/qsharedpointer.html
These are only some options. But in the most general sense, if you come up with any interesting invariant for the lifetimes of your object...consider not passing raw pointers. Instead pass some pointer-wrapping class that embodies and documents the rules of the "game" in your architecture.
(One of the major reasons to use C++ instead of other languages is the wealth of tools you have to do cool things like that, without too much runtime cost!)
I don't think there should be any problem with that, barring, as you say, something that frees the object or otherwise trashes its state. I think whatever unexpected things happen would not have anything to do with using the class this way. (Nothing in life is guaranteed, of course, but classes are intended to be passed around and operated on; whether it's through a local variable or otherwise is not, I believe, relevant.)
The one thing you would not be able to do is keep a reference to the class after it goes out of scope when MyFuncA() returns, but that's just the nature of the scoping rules.
This question was inspired by this answer.
I've always been of the philosophy that the callee is never responsible when the caller does something stupid, like passing of invalid parameters. I have arrived at this conclusion for several reasons, but perhaps the most important one comes from this article:
Everything not defined is undefined.
If a function doesn't say in its docs that it's valid to pass nullptr, then you damn well better not be passing nullptr to that function. I don't think it's the responsibility of the callee to deal with such things.
However, I know there are going to be some who disagree with me. I'm curious whether or not I should be checking for these things, and why.
If you're going to check for NULL pointer arguments where you have not entered into a contract to accept and interpret them, do it with an assert, not a conditional error return. This way the bugs in the caller will be immediately detected and can be fixed, and it makes it easy to disable the overhead in production builds. I question the value of the assert except as documentation however; a segfault from dereferencing the NULL pointer is just as effective for debugging.
If you return an error code to a caller which has already proven itself buggy, the most likely result is that the caller will ignore the error, and bad things will happen much later down the line when the original cause of the error has become difficult or impossible to track down. Why is it reasonable to assume the caller will ignore the error you return? Because the caller already ignored the error return of malloc or fopen or some other library-specific allocation function which returned NULL to indicate an error!
In C++, if you don't want to accept NULL pointers, then don't take the chance: accept a reference instead.
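For illustration, a sketch of the assert-based approach with a hypothetical count_tokens function:
#include <cassert>
#include <cstddef>

// Debug builds stop right here on a buggy caller; with NDEBUG defined
// for production, the check compiles away entirely.
std::size_t count_tokens(const char* text) {
    assert(text != nullptr && "caller violated the contract");
    std::size_t count = 0;
    // ... tokenizing work elided ...
    return count;
}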
While in general I don't see the value in detecting NULL (why NULL and not some other invalid address?), for a public API I'd probably still do it, simply because many C and C++ programmers expect such behavior.
The Defense in Depth principle says yes. If this is an external API, it is totally essential. Otherwise, at least use an assert to assist in debugging misuse of your API.
You can document the contract until you are blue in the face, but you cannot in callee code prevent ill-advised or malicious misuse of your function. The decision you have to make is what's the likely cost of misuse.
In my view, it's not a question of responsibility. It's a question of robustness.
Unless I have full control over the caller and must optimize for even a minute speed improvement, I always check for NULL.
I lean heavily on the side of 'don't trust your user's input to not blow up your system' and in defensive programming in general. Since I have made APIs in a past life, I have seen users of the libraries pass in null pointers and then application crashes result.
If it is truly an internal library and I'm the only person (or one of a select few) with the ability to use it, then I might ease up on null pointer checks, as long as everyone agrees to abide by general contracts. I can't trust the user base at large to adhere to that.
The answer is going to be different for C and C++.
C++ has references. The only difference between passing a pointer and passing a reference is that the pointer can be null. So, if the writer of the called function expects a pointer argument and forgets to do something sane when it's null, he's silly, crazy or writing C-with-classes.
Either way, this is not a matter of who wears the responsibility hat. In order to write good software, the two programmers must co-operate, and it is the responsibility of all programmers to 1° avoid special cases that would require this kind of decision and 2° when that fails, write code that blows up in a non-ambiguous and documented way in order to help with debugging.
So, sure, you can point and laugh at the caller because he messed up and "everything not defined is undefined" and had to spend one hour debugging a simple null pointer bug, but your team wasted some precious time on that.
My philosophy is: Your users should be allowed to make mistakes, your programming team should not.
What this means is that the only place you should check for invalid parameters including NULL, is in the top-level user interface. Everywhere the user can provide input to your code, you should check for errors, and handle them as gracefully as possible.
Everywhere else, you should use ASSERTS to ensure the programmers are using the functions correctly.
If you are writing an API, then only the top-level functions should catch and handle bad input. It is pointless to keep checking for a NULL pointer three or four levels deep into your call stack.
I am pro defensive programming.
Unless you can show by profiling that these nullptr checks happen in a bottleneck of your application (in such cases it is conceivable that one should not test those pointer values at those points)...
All in all, comparing an int with 0 is a really cheap operation.
I think it is a shame to leave potential crash bugs in place instead of spending so little CPU.
So: test your pointers against NULL!
I think that you should strive to write code that is robust for every conceivable situation. Passing a NULL pointer to a function is very common; therefore, your code should check for it and deal with it, usually by returning an error value. Library functions should NOT crash an application.
For C++, if your function doesn't accept a null pointer, then use a reference argument. In general.
There are some exceptions. For example, many people, including myself, think it's better with a pointer argument when the actual argument will most naturally be a pointer, especially when the function stores away a copy of the pointer, even when the function doesn't support a null-pointer argument.
How much to defend against invalid argument depends, including that it depends on subjective opinion and gut-feeling.
Cheers & hth.,
One thing you have to consider is what happens if some caller DOES misuse your API. In the case of passing NULL pointers, the result is an obvious crash, so it's OK not to check. Any misuse will be readily apparent to the calling code's developer.
The infamous glibc debacle is another thing entirely. The misuse resulted in actually useful behavior for the caller, and the API stayed that way for decades. Then they changed it.
In this case, the API developers should have checked values with an assert or some similar mechanism. But you can't go back in time to correct an error. The wailing and gnashing of teeth were inevitable. Read all about it here.
If you don't want a NULL then don't make the parameter a pointer.
By using a reference you guarantee that the object will not be NULL.
He who performs invalid operations on invalid or nonexistent data only deserves his system state to become invalid.
I consider it complete nonsense that functions which expect input should check for NULL. Or any other particular value, for that matter. The sole job of a function is to do a task based on its input or scope-state, nothing else. If you have no valid input, or no input at all, then don't even call the function. Besides, a NULL check doesn't detect the millions and millions of other possible invalid values. You know beforehand that you would be passing NULL, so why would you still pass it, waste valuable cycles on yet another function call with parameter passing, an in-function comparison of some pointer, and then check the function output again for success or not? Sure, I might have done so when I was 6 years old back in 1982, but those days have long since gone.
There is of course the argument to be made for public APIs. Like some DLL offering idiot-proof checking. You know, those arguments: "If the user supplies NULL, you don't want your application to crash." What a non-argument. It is the user who passes bogus data in the first place; it's an explicit choice and nothing other than that. If one feels that is quality, well... I prefer solid logic and performance over such things. Besides, a programmer is supposed to know what he's doing. If he's operating on invalid data for the particular scope, then he has no business calling himself a programmer. I see no reason to degrade the performance, increase the power consumption, and increase the binary size (which in turn affects instruction caching and branch prediction) of my products in order to support such users.
I don't think it's the responsibility of the callee to deal with such things
If it doesn't take this responsibility, it might create bad results, like dereferencing NULL pointers. The problem is that it always implicitly takes this responsibility. That's why I prefer graceful handling.
In my opinion, it's the callee's responsibility to enforce its contract.
If the callee shouldn't accept NULL, then it should assert that.
Otherwise, the callee should be well behaved when it's handed a NULL. That is, either it should functionally be a no-op, return an error code, or allocate its own memory, depending on the contract that you specified for it. It should do whatever seems to be the most sensible from the caller's perspective.
As the user of the API, I want to be able to continue using it without having the program crash; I want to be able to recover at the least or shut down gracefully at worst.
One side effect of that approach is that when your library crashes in response to being passed an invalid argument, you will tend to get the blame.
There is no better example of this than the Windows operating system. Initially, Microsoft's approach was to eliminate many tests for bogus arguments. The result was an operating system that was more efficient.
However, the reality is that invalid arguments are passed all the time, whether from programmers who aren't up to snuff or from values returned by other functions that weren't expected to be NULL. Now, Windows performs more validation and is less efficient as a result.
If you want to allow your routines to crash, then don't test for invalid parameters.
Yes, you should check for null pointers. You don't want to crash an application because the developer messed something up.
Development-time overhead plus runtime performance trade off against the robustness of the API you are designing.
If the API you are publishing has to run inside the process of the calling routine, you SHOULD NOT check for NULL or invalid arguments. In this scenario, if you crash, the client program crashes and the developer using your API should mend his ways.
However, if you are providing a runtime/framework which will run the client program inside it (e.g., you are writing a virtual machine or a middleware which can host code, or an operating system), you should definitely check the correctness of the arguments passed. You don't want your program to be blamed for the mistakes of a plugin.
There is a distinction between what I would call legal and moral responsibility in this case. As an analogy, suppose you see a man with poor eyesight walking towards a cliff edge, blithely unaware of its existence. As far as your legal responsibility goes, it would in general not be possible to successfully prosecute you if you fail to warn him and he carries on walking, falls off the cliff and dies. On the other hand, you had an opportunity to warn him -- you were in a position to save his life, and you deliberately chose not to do so. The average person tends to regard such behaviour with contempt, judging that you had a moral responsibility to do the right thing.
How does this apply to the question at hand? Simple -- the callee is not "legally" responsible for the actions of the caller, stupid or otherwise, such as passing in invalid input. On the other hand, when things go belly up and it is observed that a simple check within your function could have saved the caller from his own stupidity, you will end up sharing some of the moral responsibility for what has happened.
There is of course a trade-off going on here, dependent on how much the check actually costs you. Returning to the analogy, suppose that you found out that the same stranger was inching slowly towards a cliff on the other side of the world, and that by spending your life savings to fly there and warn him, you could save him. Very few people would judge you entirely harshly if, in this particular situation, you neglected to do so (let's assume that the telephone has not been invented, for the purposes of this analogy). In coding terms, however, if the check is as simple as checking for NULL, you are remiss if you fail to do so, even if the "real" blame in the situation lies with the caller.