c++: why can't 'this' be a nullptr? - c++

In my early days with C++, I seem to recall you could call a member function with a NULL pointer, and check for that in the member function:
class Thing {public: void x();}
void Thing::x()
{ if (this == NULL) return; //nothing to do
...do stuff...
}
Thing* p = NULL; //nullptr these days, of course
p->x(); //no crash
Doing this may seem silly, but it was absolutely wonderful when writing recursive functions to traverse data structures, where navigating could easily run into the blind alley of a NULL; navigation functions could do a single check for NULL at the top and then blithely call themselves to try to navigate deeper without littering the code with additional checks.
According to g++ at least, the freedom (if it ever existed) has been revoked. The compiler warns about it, and if compiling optimized, it causes crashes.
Question 1: does the C++ standard (any flavor) disallow a NULL this? Or is g++ just getting in my face?
Question 2. More philosophically, why? 'this' is just another pointer. The glory of pointers is that they can be nullptr, and that's a useful condition.
I know I can get around this by making static functions, passing as first parameter a pointer to the data structure (hellllo Days of C) and then check the pointer. I'm just surprised I'd need to.
Edit: To upvote an answer I'd like to see chapter and verse from the standard on why this is disallowed. Note that my example at NO POINT dereferences NULL. Nothing is virtual here, and p is copied to "argument this" but then checked before use. No defererence occurs! so dereference of NULL can't be used as a claim of UB.
People are making a knee-jerk reaction to *p and assuming it isn't valid if p is NULL. But it is, and the evidence is here:
http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#232
In fact it calls out two cases when a pointer, p, is surprisingly valid as *p: when p is null or when p points one element past the end of an array. What you must never do is USE the value of *p... other than to take the address of it. &*p where p == nullptr for any pointer type p IS valid. It's fine to point out that p->x() is really (*p).x(), but at the end of the day that translates to x(&*p) and that is perfectly well formed and valid. For p=nullptr... it simply becomes x(nullptr).
I think my debate should be with the standards community; in their haste to undercut the concept of a null reference, they left wording unclear. Since no one here has demanded p->x() is UB without trying to demand that it's UB because *p is UB; and because *p is definitely not UB because no aspect of x() uses the referenced value, I'm going to put this down to g++ overreaching on a standard ambiguity. The absolutely identical mechanism using a static function and extra parameter is well defined, so it's not like it stops my refactor effort. Consider the question withdrawn; portable code can't assume this==nullptr will work but there's a portable solution available, so in the end it doesn't matter.

To be in a situation where this is nullptr implies you called a non-static member function without using a valid instance such as with a pointer set to nullptr. Since this is forbidden, to obtain a null this you must already be in undefined behavior. In other words, this is never nullptr unless you have undefined behavior. Due to the nature of undefined behavior, you can simplify the statement to simply be "this is never nullptr" since no rule needs to be upheld in the presence of undefined behavior.

Question 1: does the C++ standard (any flavor) disallow a NULL this?
Or is g++ just getting in my face?
The C++ standard disallows it -- calling a method on a NULL pointer is officially 'undefined behavior' and you must avoid doing it or you will get bit. In particular, optimizers will assume that the this-pointer is non-NULL when making optimizations, leading to strange/unexpected behaviors at runtime (I know this from experience :))
Question 2. More philosophically, why? 'this' is just another pointer.
The glory of pointers is that they can be nullptr, and that's a useful
condition.
I'm not sure it matters, really; it's what is specified in the C++ standard, and they probably had their reasons (philosophical or otherwise), but since the standard specifies it, the compilers expect it, therefore as programmers we have to abide by it, or face undefined behavior. (One can imagine an alternate universe where NULL this-pointers are allowed, but we don't live there)

The question has already been answered - it is undefined behavior to dereference a null pointer, and using *obj or obj-> are both dereferencing.
Now (since I assume you have a question on how to work around this) the solution is to use static function:
class Foo {
static auto bar_st(Foo* foo) { if (foo) return foo->bar(); }
}
Having said that, I do think that gcc's decision of eliminating all branches for nullptr this was not a wise one. Nobody gained by that, and a lot of people suffered. What's the benefit?

C++ does not allow calling member functions of null object. Objects need identity and that can not be stored to null pointer. What would happen if member function would read or write a field of a object referenced by null pointer?
It sounds like you could use null object pattern in your code to create wanted result.
Null pointer is recognised a problematic entity in object oriented languages because in most languages it is not a object. This creates a need for code that specifically handles the case something being null. While checking for special null pointer is the norm. There are other approaches. Smalltalk actually has a NullObject which has methods its own methods. As all objects it can also be extended. Go programming language does allow calling struct member functions for something that is nil (which sounds like something required in the question).

this might be null too if you delete this (which is possible but not recommended)

Related

How do I call multiple functions in a single main file? [duplicate]

According to ISO C++, dereferencing a null pointer is undefined behaviour. My curiosity is, why? Why standard has decided to declare it undefined behaviour? What is the rationale behind this decision? Compiler dependency? Doesn't seem, because according to C99 standard, as far as I know, it is well defined. Machine dependency? Any ideas?
Defining consistent behavior for dereferencing a NULL pointer would require the compiler to check for NULL pointers before each dereference on most CPU architectures. This is an unacceptable burden for a language that is designed for speed.
It also only fixes a small part of a larger problem - there are many ways to have an invalid pointer beyond a NULL pointer.
The primary reason is that by the time they wrote the original C standard there were a number of implementations that allowed it, but gave conflicting results.
On the PDP-11, it happened that address 0 always contained the value 0, so dereferencing a null pointer also gave the value 0. Quite a few people who used these machines felt that since they were the original machine C had been written on/used to program, that this should be considered canonical behavior for C on all machines (even though it originally happened quite accidentally).
On some other machines (Interdata comes to mind, though my memory could easily be wrong) address 0 was put to normal use, so it could contain other values. There was also some hardware on which address 0 was actually some memory-mapped hardware, so reading/writing it did special things -- not at all equivalent to reading/writing normal memory at all.
The camps wouldn't agree on what should happen, so they made it undefined behavior.
Edit: I suppose I should add that by the time the wrote the C++ standard, its being undefined behavior was already well established in C, and (apparently) nobody thought there was a good reason to create a conflict on this point so they kept the same.
The only way to give defined behaviour would be to add a runtime check to every pointer dereference, and every pointer arithmetic operation. In some situations, this overhead would be unacceptable, and would make C++ unsuitable for the high-performance applications it's often used for.
C++ allows you to create your own smart pointer types (or use ones supplied by libraries), which can include such a check in cases where safety is more important than performance.
Dereferencing a null pointer is also undefined in C, according to clause 6.5.3.2/4 of the C99 standard.
This answer from #Johannes Schaub - litb, puts forward an interesting rationale, which seems pretty convincing.
The formal problem with merely dereferencing a null pointer is that determining the identity of the resulting lvalue expression is not possible: Each such expression that results from dereferencing a pointer must unambiguously refer to an object or a function when that expression is evaluated. If you dereference a null pointer, you don't have an object or function that this lvalue identifies. This is the argument the Standard uses to forbid null-references.
Another problem that adds to the confusion is that the semantics of the typeid operator make part of this misery well defined. It says that if it was given an lvalue that resulted from dereferencing a null pointer, the result is throwing a bad_typeid exception. Although, this is a limited area where there exist an exception (no pun) to the above problem of finding an identity. Other cases exist where similar exception to undefined behavior is made (although much less subtle and with a reference on the affected sections).
The committee discussed to solve this problem globally, by defining a kind of lvalue that does not have an object or function identity: The so called empty lvalue. That concept, however, still had problems, and they decided not to adopt it.
Note:
Marking this as community wiki, since the answer & the credit should go to the original poster. I am just pasting the relevant parts of the original answer here.
The real question is, what behavior would you expect ?
A null pointer is, by definition, a singular value that represents the absence of an object. The result of dereferencing a pointer is to obtain a reference to the object pointed to.
So how do you get a good reference... from a pointer that points into the void ?
You do not. Thus the undefined behavior.
I suspect it's because if the behavior is well-defined the compiler has to insert code anywhere pointers are dereferenced. If it's implementation defined then one possible behavior could still be a hard crash. If it's unspecified then either the compilers for some systems have extra undue burden or they may generate code that causes hard crashes.
Thus to avoid any possible extra burden on compilers they left the behavior undefined.
Sometimes you need an invalid pointer (also see MmBadPointer on Windows), to represent "nothing".
If everything was valid, then that wouldn't be possible. So they made NULL invalid, and disallowed you from dereferencing it.
Here is a simple test & example:
Allocate a pointer:
int * pointer;
? What value is in the pointer when it is created?
? What is the pointer pointing to?
? What happens when I dereference this point in its current state?
Marking the end of a linked list.
In a linked list, a node points to another node, except for the last.
What is the value of the pointer in the last node?
What happens when you derefernce the "next" field of the last node?
The needs to be a value that indicates a pointer is not pointing to anything or that it's in an invalid state. This is where the NULL pointer concept comes into play. The linked list can use a NULL pointer to indicate the end of the list.
Arguments have been made elsewhere that having well-defined behaviour for null-pointer-references is impossible without a lot of overhead, which I think is true. This is because AFAIU "well-defined" here also means "portable". If you would not treat nullptr references specially, you would end up generating instructions that simply try to read address 0, but that produces different behaviour on different processors, so that would not be well-defined.
So, I guess this is why derereferencing nullptr (and probably also other invalid pointers) is marked as undefined.
I do wonder why this is undefined rather then unspecified or implementation-defined, which are distict from undefined behaviour, but require more consistency.
In particular, when a program triggers undefined behaviour, the compiler can do pretty much anything (e.g. throw away your entire program maybe?) and still be considered correct, which is somewhat problematic. In practice, you would expect that compilers would just compile a null-pointer-dereference to a read of address zero, but with modern optimizers becoming better, but also more sensitive to undefined behaviour, I think, they sometimes do things that end up more thoroughly breaking the program. E.g. consider the following:
matthijs#grubby:~$ cat test.c
unsigned foo () {
unsigned *foo = 0;
return *foo;
}
matthijs#grubby:~$ arm-none-eabi-gcc -c test.c -Os && objdump -d test.o
test.o: file format elf32-littlearm
Disassembly of section .text:
00000000 <foo>:
0: e3a03000 mov r3, #0
4: e5933000 ldr r3, [r3]
8: e7f000f0 udf #0
This program just dereferences and accesses a null pointer, which results in an "Undefined instruction" being generated (halting the program at runtime).
This might be ok when this is an accidental nullpointer dereference, but in this case I was actually writing a bootloader that needs to read address 0 (which contains the reset vector), so I was quite surprised this happened.
So, not so much an answer, but some extra perspective on the matter.
According to original C standard NULL can be any value - not necessarily zero.
The language definition states that for each pointer type, there is a special value - the `null pointer' - which is distinguishable from all other pointer values and which is 'guaranteed to compare unequal to a pointer to any object or function.' That is, a null pointer points definitively nowhere; it is not the address of any object or function
There is a null pointer for each pointer type, and the internal values of null pointers for different types may be different.
(From http://c-faq.com/null/null1.html)
Although dereferencing a NULL pointer in C/C++ indeed leads undefined behavior from the language standpoint, such operation is well defined in compilers for targets which have memory at corresponding address. In this case, the result of such operation consists in simply reading the memory at address 0.
Also, many compilers will allow you to dereference a NULL pointer as long as you don't bind the referenced value. This is done to provide compatibility to non-conforming yet widespread code, like
#define offsetof(st, m) ((size_t)(&((st *)0)->m))
There was even a discussion to make this behaviour part of the standard.
Because you cannot create a null reference. C++ doesn't allow it. Therefore you cannot dereference a null pointer.
Mainly it is undefined because there is no logical way to handle it.
You can actually dereference a null pointer. Someone did it here: http://www.codeproject.com/KB/system/soviet_kernel_hack.aspx

Dereferencing a nullptr class object [duplicate]

Will the program:
#include <stdio.h>
struct foo
{
void blah() {printf("blah\n");}
int i;
};
void main(int, char**)
{
((foo*)NULL)->blah();
}
Ever crash, or do anything other than output blah, on any compiler you are aware of? Will any function crash, when called via a NULL pointer, if it doesn't access any members (including the vtable)?
There have been other questions on this topic, for instance Accessing class members on a NULL pointer and Is it legal/well-defined C++ to call a non-static method that doesn't access members through a null pointer?, and it is always pointed out that this results in undefined behavior. But is this undefined in the real world, or only in the standard's world? Does any extant compiler not behave as expected? Can you think of any plausible reason why any future compiler wouldn't behave as expected?
What if the function does modify members, but the NULL ptr is guarded against. For instance,
void foo::blah()
{
foo* pThis = this ? this : new foo();
pThis->i++;
}
Edit:
For the record, the reason I wanted this was to make the interface to my linked list class as easy and concise as possible. I wanted to initialize the list to NULL have idiomatic usage look like:
pList = pList->Insert(elt);
pList = pList->Remove(elt);
...
Where all the operators return the new head element. Somehow I didn't realize that using a container class would make things even easier, with no downside.
Can you think of any plausible reason why any future compiler wouldn't behave as expected?
A helpful compiler might add code to access the real object under the hood in debug builds in the hope of helping you catch this issue in your code early in the development cycle.
What if the function does modify members, but the NULL ptr is guarded against. For instance,
void foo::blah()
{
foo* pThis = this ? this : new foo();
pThis->i++;
}
Since it is undefined behavior to call that function with a null pointer, the compiler can assume that the test will always pass and optimize that function to:
void foo::blah()
{
this->i++;
}
Note that this is correct, since if this is not null, it behaves as-if the original code was executed, and if this was null, it would be undefined behavior and the compiler does not need to provide any particular behavior at all.
Undefined behavior means you can't rely on what will happen. However it's sometimes useful to know what's happening under the covers while you're debugging so that you're not surprised when the impossible happens.
Most compilers will code this as a simple function with a hidden this parameter, and if the this parameter is never referenced the code will work as expected.
Checking for this == NULL might not work, depending on how aggressively your compiler optimizes. Since a well formed program couldn't possibly have this==NULL, the compiler is free to pretend that it will never happen and optimize away the if statement entirely. I know though that Microsoft's C++ will not make this optimization because their GetSafeHWND function relies on it working as expected.
Trying to guard for this == NULL wouldn't give you any real desirable effect. Mainly dereferencing NULL pointer, AFAIK, is undefined. It works differently for different compilers. Let's say that it does work in one scenario (like this) it doesn't work for this scenarios or this (virtual functions). The second and third scenarios are understandable, since the instance doesn't have a vtable entry to check for which of the virtual functions to call. But I'm not sure the same can be said for the first.
The other thing that you need to consider is that any invalid pointer can also give the same type of error you'd want to guard against, like this. Notice that it successfully printed 'Foo' and then went into a runtime error trying to access a. This is because the memory location being pointed to by Test* t is invalid. The same behavior is seen here, when Test* t is NULL.
So, in general, avoid such behaviors and design in your code. It's not predictable and it would cause undesirable effect if someone comes after you and changes your code thinking it should behave as it did previously.

If de-referencing a NULL pointer is an invalid thing to do, how should auto pointers be implemented?

I thought dereferencing a NULL pointer was dangerous, if so then what about this implementation of an auto_ptr?
http://ootips.org/yonat/4dev/smart-pointers.html
If the default constructor is invoked without a parameter the internal pointer will be NULL, then when operator*() is invoked won't that be dereferencing a null pointer?
Therefore what is the industrial strength implementation of this function?
Yes, dereferencing NULL pointer = bad.
Yes, constructing an auto_ptr with NULL creates a NULL auto_ptr.
Yes, dereferencing a NULL auto_ptr = bad.
Therefore what is the industrial strength implementation of this function?
I don't understand the question. If the definition of the function in question created by the industry itself is not "industrial strength" then I have a very hard time figuring out what might be.
std::auto_ptr is intended to provide essentially the same performance as a "raw" pointer. To that end, it doesn't (necessarily) do any run-time checking that the pointer is valid before being dereferenced.
If you want a pointer that checks validity, it's relatively easy to provide that, but it's not the intended purpose of auto_ptr. In fairness, I should add that the real intent of auto_ptr is rather an interesting question -- its specification was changed several times during the original standardization process, largely because of disagreements over what it should try to accomplish. The version that made it into the standard does have some uses, but quite frankly, not very many. In particular, it has transfer-of-ownership semantics that make it unsuitable for storage in a standard container (among other things), removing one of the obvious purposes for smart pointers in general.
Its purpose to help prevent memory leaks by ensuring that delete is performed on the underlying pointer whenever the auto_ptr goes out of scope (or itself is deleted).
Just like in higher-level languages such as C#, trying to dereference a null pointer/object will still explode, as it should.
Do what you would do if you dereferenced a NULL pointer. On many platforms, this means throw an exception.
Well, just like you said: dereferencing null pointer is illegal, leads to undefined behavior. This immediately means that you must not use operator * on a default-constructed auto_ptr. Period.
Where exactly you see a problem with "industrial strength" of this implementation is not clear to me.
#Jerry Coffin: It is naughty of me to answer your answer here rather than the OP's question but I need more lines than a comment allows..
You are completely right about the ridiculous semantics of the current version, it is so completely rotten that a new feature: "mutable" HAD to be added to the language just to allow these insane semantics to be implemented.
The original purpose of "auto_ptr" was exactly what boost::scoped_ptr does (AFAIK), and I'm happy to see a version of that finally made it into the new Standard. The reason for the name "auto_ptr" is that it should model a first class pointer on the stack, i.e. an automatic variable.
This auto_ptr was an National Body requirement, based on the following logic: if we have catchable exceptions in C++, we MUST have a way to manage pointers which is exception safe IN the Standard. This also applies to non-static class members (although that's a slightly harder problem which required a change to the syntax and semantics of constructors).
In addition a reference counting pointer was required but due to a lot of different possible implementation with different tradeoffs, one can accept that this be left out of the Standard until a later time.
Have you ever played that game where you pass a message around a ring of people and at the end someone reads out the input and output messages? That's what happened. The original intent got lost because some people thought that the auto_ptr, now we HAD to have it, could be made to do more... and finally what got put in the standard can't even do what the original simple scope_ptr style one did (auto_ptr semantics don't assure the pointed at object is destroyed because it could be moved elsewhere).
If I recall the key problem was returning the value of a auto_ptr: the core design simply doesn't allow that (it's uncopyable). A sane solution like
return ap.swap(NULL)
unfortunately still destroys the intended invariant. The right way is probably closer to:
return ap.clone();
which copies the object and returns the copy, destroying the original: the compiler is then free to optimise away the copy (as written not exception safe .. the clone might leak if another exception is thrown before it returns: a ref counted pointer solves this of course).

This code appears to achieve the return of a null reference in C++

My C++ knowledge is somewhat piecemeal. I was reworking some code at work. I changed a function to return a reference to a type. Inside, I look up an object based on an identifier passed in, then return a reference to the object if found. Of course I ran into the issue of what to return if I don't find the object, and in looking around the web, many people claim that returning a "null reference" in C++ is impossible. Based on this advice, I tried the trick of returning a success/fail boolean, and making the object reference an out parameter. However, I ran into the roadblock of needing to initialize the references I would pass as actual parameters, and of course there is no way to do this. I retreated to the usual approach of just returning a pointer.
I asked a colleague about it. He uses the following trick quite often, which is accepted by both a recent version of the Sun compiler and by gcc:
MyType& someFunc(int id)
{
// successful case here:
// ...
// fail case:
return *static_cast<MyType*>(0);
}
// Use:
...
MyType& mt = somefunc(myIdNum);
if (&mt) // test for "null reference"
{
// whatever
}
...
I have been maintaining this code base for a while, but I find that I don't have as much time to look up the small details about the language as I would like. I've been digging through my reference book but the answer to this one eludes me.
Now, I had a C++ course a few years ago, and therein we emphasized that in C++ everything is types, so I try to keep that in mind when thinking things through. Deconstructing the expression: "static_cast<MyType>(0);", it indeed seems to me that we take a literal zero, cast it to a pointer to MyType (which makes it a null pointer), and then apply the dereferencing operator in the context of assigning to a reference type (the return type), which should give me a reference to the same object pointed to by the pointer. This sure looks like returning a null reference to me.
Any advice in explaining why this works (or why it shouldn't) would be greatly appreciated.
Thanks,
Chuck
This code doesn't work, though it may appear to work. This line dereferences a null pointer:
return *static_cast<MyType*>(0);
The zero, cast to a pointer type, results in a null pointer; this null pointer is then dereferenced using the unary-*.
Dereferencing a null pointer results in undefined behavior, so your program may do anything. In the example you describe, you get a "null reference" (or, it appears you get a null reference), but it would also be reasonable for your program to crash or for anything else to happen.
I agree with other posters that the behaviour of your example is undefined and really shouldn't be used. I offer some alternatives here. Each of them has pros and cons
If the object can't be found, throw an exception which is caught in the calling layer.
Create a globally accessible instance of MyType which is a simple shell object (i.e. static const MyType BAD_MYTYPE) and can be used to represent a bad object.
If it's likely that the object will not be found often then maybe pass the object in by reference as a parameter and return a bool or other error code indicating success / failure. If it can't find the object, you just don't assign it in the function.
Use pointers instead and check for 0 on return.
Use Boost smart pointers which allow for the validity of the returned object to be checked.
My personal preference would be for one of the first three.
That is undefined behavior. Because it is undefined behavior, it may "work" on your current compiler, but it could break if you ever upgrade/change your compiler.
From the C++03 spec:
8.3.2/4 ... A reference shall be initialized to refer to a valid object or function. [Note: in particular, a null reference cannot exist in a well-defined program, because the only way to create such a reference would be to bind it to the “object” obtained by dereferencing a null pointer, which causes undefined behavior.
If you are married to that return-a-reference interface, then the Right Thing®
would be to throw an exception if you can't find an object for the given ID. At least that way your poor user can trap the condition with a catch.
If you go dereferencing a null pointer on them, they have no defined way to handle the error.
The line return *static_cast<MyType*>(0); is dereferencing the null pointer which causes undefined behaviour.
Well, you said it yourself in the title of your question. The code appears to work with a null reference.
When people say null references do not exist in C++, it does not mean that the compiler will generate an error message if you try to create one. As you've found out, there's nothing to stop you from creating a reference from a null pointer.
It simply means that the C++ standard does not specify what should happen if you do it. Null references are not possible because the C++ standard says that references must not be created from null pointers. (but doesn't say anything about generating a compile error if you try to do it.)
It is undefined behavior. In practice, because references are typically implemented as pointers, it usually seems to work if you try to create a reference from a null pointer. But there's no guarantee that it'll work. Or that it'll keep working tomorrow. It's not possible in C++ because if you do it, the behavior specified by C++ no longer applies. Your program might do anything
This might all sound a bit hypothetical, because hey, it seems to work just fine. But keep in mind that just because it works, and it makes sense that it works, when naively compiled, it may break when the compiler tries to apply some optimization or other. When the compiler sees a reference, it is guaranteed by the language rules that it is not null. So what happens if the compiler uses this assumption to speed up the code, and you go behind its back creating a "null reference"?
As others have already said, the code unfortunately only appears to work... however, more fundamentally, you are breaking a contract here.
In C++, a reference is meant to be an alias for an object. Based on the signature of your function I would never expect to be passed a 'NULL' reference, and I would certainly NOT test it before using it. As James McNellis said, creating a NULL reference is undefined behavior, however in most case this behavior only creeps in when you actually try to use it, and you are now exposing the users of your methods to nasty / tricky to nail down bugs.
I won't go any further on that issue, just points you toward Herb Sutter's pick on the issue.
Now for the solution to your problem.
The evident solution is of course a plain pointer. People expect a pointer to be possibly null, so they will test it when it's returned to them (and if they don't it's their damn fault for being lazy).
There are other alternatives, but they mainly boil down to having a special value indicating your failure and there is not much point using complicated designs just for the sake of it...
The last alternative is the use of exceptions here, however I am myself partial to the advice. Exceptions are meant for exceptional situations you see, and for a find/search feature it really depends on the expected result:
if you are implementing some internal factory where you register modules and then call them back later on, then not being able to retrieve one module would indicate an error in the program and as such being reported by an exception is fine
if you are implementing a search engine for a database of yours, thus dealing with user input, then not finding a result matching the input criteria is quite likely to occur, and thus I would not use exceptions in this circumstance, for it's not a programming error but a perfectly normal course
Note: other ways include
Boost.Optional, though I find it clumsy to wrap a reference with it
A shared pointer or weak pointer, to control/monitor the object lifetime in case it may be deleted while you still use it... monitoring does not work by itself in Multi-Threaded environment though
A sentinel value (usually declared static const), but it only works if your object has a meaningful "bad" or "null" value. It's certainly not the approach I would recommend since once again you give an object but it blows up in the users hand if they do anything with it
As others have mentioned, your code is erroneous since it dereferences a null pointer.
Secondly, you are using the reference return type incorrectly, returning a reference in your example is usually not good in C/C++ since it violates the memory management model of the language where all objects are referenced to by pointers to a memory address. If you rewrite C/C++ code that was written using pointers into code that uses references you will end up with these problems. The code using pointers could return a NULL pointer without causing problems, but when returning a reference you must return something that can be cast into a 0 or false statement.
Most important of all is the pattern where your erroneous code only gets executed in the case of an exception, in this example your "fail case". Incorrect error handling, error logging with bugs etc are the most disastreous bugs in computer systems since they never show up in the happy case but always causes a system breakdown when something doesnt follow the normal flow.
The only way to ensure that your code is correct is to have testcases with 100% coverage, that means also testing the error handling, which in your example probably would cause a segmentation fault on your program.

Is there a practical benefit to casting a NULL pointer to an object and calling one of its member functions?

Ok, so I know that technically this is undefined behavior, but nonetheless, I've seen this more than once in production code. And please correct me if I'm wrong, but I've also heard that some people use this "feature" as a somewhat legitimate substitute for a lacking aspect of the current C++ standard, namely, the inability to obtain the address (well, offset really) of a member function. For example, this is out of a popular implementation of a PCRE (Perl-compatible Regular Expression) library:
#ifndef offsetof
#define offsetof(p_type,field) ((size_t)&(((p_type *)0)->field))
#endif
One can debate whether the exploitation of such a language subtlety in a case like this is valid or not, or even necessary, but I've also seen it used like this:
struct Result
{
void stat()
{
if(this)
// do something...
else
// do something else...
}
};
// ...somewhere else in the code...
((Result*)0)->stat();
This works just fine! It avoids a null pointer dereference by testing for the existence of this, and it does not try to access class members in the else block. So long as these guards are in place, it's legitimate code, right? So the question remains: Is there a practical use case, where one would benefit from using such a construct? I'm especially concerned about the second case, since the first case is more of a workaround for a language limitation. Or is it?
PS. Sorry about the C-style casts, unfortunately people still prefer to type less if they can.
The first case is not calling anything. It's taking the address. That's a defined, permitted, operation. It yields the offset in bytes from the start of the object to the specified field. This is a very, very, common practice, since offsets like this are very commonly needed. Not all objects can be created on the stack, after all.
The second case is reasonably silly. The sensible thing would be to declare that method static.
I don't see any benefit of ((Result*)0)->stat(); - it is an ugly hack which will likely break sooner than later. The proper C++ approach would be using a static method Result::stat() .
offsetof() on the other hand is legal, as the offsetof() macro never actually calls a method or accesses a member, but only performs address calculations.
Everybody else has done a good job of reiterating that the behavior is undefined. But lets pretend it wasn't, and that p->member is allowed to behave in a consistent manner under certain circumstances even if p isn't a valid pointer.
Your second construct would still serve almost no purpose. From a design perspective, you've probably done something wrong if a single function can do its job both with and without accessing members, and if it can then splitting the static portion of the code into a separate, static function would be much more reasonable than expecting your users to create a null pointer to operate on.
From a safety perspective, you've only protected against a small portion of the ways an invalid this pointer can be created. There's uninitialized pointers, for starters:
Result* p;
p->stat(); //Oops, 'this' is some random value
There's pointers that have been initialized, but are still invalid:
Result* p = new Result;
delete p;
p->stat(); //'this' points to "safe" memory, but the data doesn't belong to you
And even if you always initialize your pointers, and absolutely never accidentally reuse free'd memory:
struct Struct {
int i;
Result r;
}
int main() {
((Struct*)0)->r.stat(); //'this' is likely sizeof(int), not 0
}
So really, even if it weren't undefined behavior, it is worthless behavior.
Although libraries targeting specific C++ implementations may do this, that doesn't mean it's "legitimate" generally.
This works just fine! It avoids a null
pointer dereference by testing for the
existence of this, and it does not try
to access class members in the else
block. So long as these guards are in
place, it's legitimate code, right?
No, because although it might work fine on some C++ implementations, it is perfectly okay for it to not work on any conforming C++ implementation.
Dereferencing a null-pointer is undefined behavior and anything can happen if you do it. Don't do it if you want a program that works.
Just because it doesn't immediately crash in one specific test case doesn't mean that it won't get you into all kinds of trouble.
Undefined behaviour is undefined behaviour. Do this tricks "work" for your particular compiler? well, possibly. will they work for the next iteration of it. or for another compiler? Possibly not. You pays your money and you takes your choice. I can only say that in nearly 25 years of C++ programming I've never felt the need to do any of these things.
Regarding the statement:
It avoids a null pointer dereference by testing for the existence of this, and it does not try to access class members in the else block. So long as these guards are in place, it's legitimate code, right?
The code is not legitimate. There is no guarantee that the compiler and/or runtime will actually call to the method when the pointer is NULL. The checking in the method is of no help because you can't assume that the method will actually end up being called with a NULL this pointer.