Setup
Given this user-defined type:
struct T
{
static int x;
int y;
T() : y(38) {}
};
and the requisite definition placed somewhere useful:
int T::x = 42;
the following is the canonical way to stream the int's value to stdout:
std::cout << T::x;
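For reference, here is a minimal complete sketch of the setup. The helper names read_via_class, read_via_object and print_both are illustrative additions, not part of the question. Both accessors are well defined, since the second one goes through a live object.

```cpp
#include <iostream>

struct T
{
    static int x;   // shared by all instances
    int y;
    T() : y(38) {}
};

int T::x = 42;      // the single definition of the static member

// Access through the class name: no object is involved at all.
int read_via_class() { return T::x; }

// Access through a live object: the object expression `t` is evaluated,
// but it refers to a valid T, so nothing is questionable here.
int read_via_object()
{
    T t;
    return t.x;
}

// Hypothetical helper for trying the sketch out: streams both values.
void print_both(std::ostream& os)
{
    os << read_via_class() << ' ' << read_via_object() << '\n';
}
```

Calling print_both(std::cout) from a main function streams the same value twice.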
Control
Meanwhile, the following is (of course) invalid due to an instance of T not existing:
T* ptr = NULL; // same if left uninitialised
std::cout << ptr->y;
Question
Now consider the horrid and evil and bad following code:
T* ptr = NULL;
std::cout << ptr->x; // remember, x is static
Dereferencing ptr is invalid, as stated above. Even though no physical memory dereference takes place here, I believe that it still counts as one, making the above code UB. Or... does it?
14882:2003 5.2.5/3 states explicitly that a->b is converted to (*(a)).b, and that:
The postfix expression before the dot or arrow is evaluated;
This evaluation happens even if the result is unnecessary to determine the value of the entire postfix expression, for example if the id-expression denotes a static member.
But it's not clear whether "evaluation" here involves an actual dereference. In fact, neither 14882:2003 nor n3035 seems to say explicitly whether the pointer-expression has to evaluate to a pointer to a valid instance when dealing with static members.
My question is, just how invalid is this? Is it really specifically prohibited by the standard (even though there's no physical dereference), or is it just a quirk of the language that we can probably get away with? And even if it is prohibited, to what extent might we expect GCC/MSVC/Clang to treat it safely anyway?
My g++ 4.4 appeared to produce code that never attempts to push the [invalid] this pointer onto the stack, with optimisations turned off.
BTW If T were polymorphic then that would not affect this, as static members cannot be virtual.
it's not clear whether "evaluation" here involves an actual dereference.
I read "evaluation" here as "the subexpression is evaluated." That would mean that the unary * is evaluated and you perform indirection via a null pointer, yielding undefined behavior.
This issue (accessing a static member via a null pointer) is discussed in another question, When does invoking a member function on a null instance result in undefined behavior? While it discusses member functions specifically, I don't see any reason that data members are any different in this respect. There is some good discussion of the issue there.
There was a defect reported against the C++ Standard that asks "Is call of static member function through null pointer undefined?" (see CWG Defect 315). This defect is closed and its resolution states that it is valid to call a static member function via a null pointer:
p->f() is rewritten as (*p).f() according to 5.2.5 [expr.ref]. *p is not an error when p is null unless the lvalue is converted to an rvalue
However, this resolution is in fact wrong.
It presupposes the concept of an "empty lvalue," which is part of the proposed resolution for another defect, CWG defect 232, which asks the more general question, "Is indirection through a null pointer undefined behavior?"
The resolution to that defect would make certain forms of indirection through a null pointer (like calling a static member function) valid. However, that defect is still open and its resolution has not been adopted into the C++ Standard. Until that defect is closed and its resolution is incorporated into the C++ Standard, indirection via a null pointer (or dereferencing a null pointer, if one prefers that term) always yields undefined behavior.
Concerning p->a, where p is a null pointer, and a a static data member:
§9.4/2 says "A static member may be referred to using the class member
access syntax, in which case the object-expression is evaluated." (The
"object-expression" is the expression to the left of the . or the ->.)
Looking at OOP from the machine's side, it is just another way to calculate where data resides in memory. When data is not static, its address is calculated from the instance pointer; when data is static, it is always a fixed address in the data segment. Templates add nothing, since they are resolved at compile time.
So it is a rather popular technique to use NULL as the start pointer (for example, to evaluate the offset of a field in a class for persistence purposes).
So the code above is correct for static data.
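If the goal is the field offset mentioned above, the standard library already provides it without involving a null pointer: the offsetof macro from <cstddef>. A small sketch follows; the Record type and the helper names are made up for illustration.

```cpp
#include <cstddef>   // offsetof, std::size_t

// Hypothetical standard-layout record, e.g. something to be persisted.
struct Record
{
    int    id;
    double value;
    char   tag[8];
};

// offsetof yields the byte offset of a member within its class
// without the caller forming or dereferencing any pointer.
std::size_t offset_of_id()    { return offsetof(Record, id); }
std::size_t offset_of_value() { return offsetof(Record, value); }
std::size_t offset_of_tag()   { return offsetof(Record, tag); }
```

The exact offsets of value and tag depend on the platform's alignment rules; only the ordering of the members and the zero offset of the first one are fixed for a standard-layout type.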
I was reading a post on some nullptr peculiarities in C++, and a particular example caused some confusion in my understanding.
Consider (simplified example from the aforementioned post):
struct A {
void non_static_mem_fn() {}
static void static_mem_fn() {}
};
A* p{nullptr};
/*1*/ *p;
/*6*/ p->non_static_mem_fn();
/*7*/ p->static_mem_fn();
According to the authors, expression /*1*/ that dereferences the nullptr does not cause undefined behaviour by itself. Same with expression /*7*/ that uses the nullptr-object to call a static function.
The justification is based on issue 315 in C++ Standard Core Language Closed Issues, Revision 100 that has
...*p is not an error when p is null unless the lvalue is converted to an rvalue (7.1 [conv.lval]), which it isn't here.
thus making a distinction between /*6*/ and /*7*/.
So, the actual dereferencing of the nullptr is not undefined behaviour (answer on SO, discussion under issue 232 of C++ Standard, ...). Thus, the validity of /*1*/ is understandable under this assumption.
However, how is /*7*/ guaranteed to not cause UB? As per the cited quote, there is no conversion of lvalue to rvalue in p->static_mem_fn();. But the same is true for /*6*/ p->non_static_mem_fn();, and I think my guess is confirmed by the quote from the same issue 315 regarding:
/*6*/ is explicitly noted as undefined in 12.2.2
[class.mfct.non-static], even though one could argue that since non_static_mem_fn(); is
empty, there is no lvalue->rvalue conversion.
(in the quote, I changed "which" and f() to get the connection to the notation used in this question).
So, why is such a distinction made for p->static_mem_fn(); and p->non_static_mem_fn(); regarding the causality of UB? Is there an intended use of calling static functions from pointers that could potentially be nullptr?
Appendix:
this question asks about why dereferencing a nullptr is undefined behaviour. While I agree that in most cases it is a bad idea, I do not believe the statement is absolutely correct as per the links and quotes here.
similar discussion in this Q/A with some links to issue 232.
I was not able to find a question devoted to static methods and the nullptr dereferencing issue. Maybe I missed some obvious answer.
Standard citations in this answer are from the C++17 spec (N4713).
One of the sections cited in your question answers the question for non-static member functions. [class.mfct.non-static]/2:
If a non-static member function of a class X is called for an object that is not of type X, or of a type derived from X, the behavior is undefined.
This applies to, for example, accessing an object through a different pointer type:
std::string foo;
A *ptr = reinterpret_cast<A *>(&foo); // not UB by itself
ptr->non_static_mem_fn(); // UB by [class.mfct.non-static]/2
A null pointer doesn't point at any valid object, so it certainly doesn't point to an object of type A either. Using your own example:
p->non_static_mem_fn(); // UB by [class.mfct.non-static]/2
With that out of the way, why does this work in the static case? Let's pull together two parts of the standard:
[expr.ref]/2:
... The expression E1->E2 is converted to the equivalent form (*(E1)).E2 ...
[class.static]/1 (emphasis mine):
... A static member may be referred to using the class member access syntax, in which case the object expression is evaluated.
The second block, in particular, says that the object expression is evaluated even for static member access. This is important if, for example, it is a function call with side effects.
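A small sketch of that side-effect case, using a valid object so that nothing here is in question. The get_object factory and the evaluations counter are hypothetical, and static_mem_fn returns an int (unlike the void version in the question) purely so the result can be checked.

```cpp
struct A
{
    static int static_mem_fn() { return 7; }
};

int evaluations = 0;   // counts how often the object expression runs

// Hypothetical factory with a visible side effect; returns a valid object.
A* get_object()
{
    static A instance;
    ++evaluations;
    return &instance;
}

// The call resolves to A::static_mem_fn at compile time, yet
// get_object() is still evaluated as the object expression.
int call_through_pointer()
{
    return get_object()->static_mem_fn();
}
```

After one call to call_through_pointer(), the counter has been incremented once even though the static function never needed the object.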
Put together, this implies that these two blocks are equivalent:
// 1
p->static_mem_fn();
// 2
*p;
A::static_mem_fn();
So the final question to answer is whether *p alone is undefined behavior when p is a null pointer value.
Conventional wisdom would say "yes" but this is not actually true. There is nothing in the standard that states dereferencing a null pointer alone is UB and there are several discussions that directly support this:
Issue 315, as you have mentioned in your question, explicitly states that *p is not UB when the result is unused.
DR 1102 removes "dereferencing the null pointer" as an example of UB. The given rationale is:
There are core issues surrounding the undefined behavior of dereferencing a null pointer. It appears the intent is that dereferencing is well defined, but using the result of the dereference will yield undefined behavior. This topic is too confused to be the reference example of undefined behavior, or should be stated more precisely if it is to be retained.
This DR links to issue 232, which discusses adding wording that explicitly makes *p defined behavior when p is a null pointer, as long as the result is not used.
In conclusion:
p->non_static_mem_fn(); // UB by [class.mfct.non-static]/2
p->static_mem_fn(); // Defined behavior per issue 232 and 315.
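In practice the question can be sidestepped entirely. The sketch below assumes p may be null: it calls the static function through the class name and guards the non-static call. The wrapper names and the call counters are illustrative additions.

```cpp
struct A
{
    static int static_calls;
    int non_static_calls = 0;

    void non_static_mem_fn() { ++non_static_calls; }
    static void static_mem_fn() { ++static_calls; }
};

int A::static_calls = 0;

// Never touches p: A::static_mem_fn() needs no object expression.
void call_static(A* /*p*/) { A::static_mem_fn(); }

// Only calls through p when it actually points at an A.
bool call_non_static(A* p)
{
    if (p == nullptr) return false;
    p->non_static_mem_fn();
    return true;
}
```

Written this way, neither call depends on how issues 232 and 315 are eventually resolved.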
Can I initialize a pointer to a data member before initializing the member? In other words, is this valid C++?
#include <string>
class Klass {
public:
Klass()
: ptr_str{&str}
, str{}
{}
private:
std::string *ptr_str;
std::string str;
};
This question is similar to mine, but the order is correct there, and the answer says:
I'd advise against coding like this in case someone changes the order of the members in your class.
Which seems to mean reversing the order would be illegal but I couldn't be sure.
Does a member have to be initialized to take its address?
No.
Can I initialize a pointer to a data member before initializing the member? In other words, is this valid C++?
Yes. Yes.
There is no requirement that the operand of unary & be initialised. The standard's specification of the unary & operator includes an example:
int a;
int* p1 = &a;
Here, the value of a is indeterminate and it is OK to point to it.
What that example doesn't demonstrate is pointing to an object before its lifetime has begun, which is what happens in your example. Using a pointer to an object before and after its lifetime is explicitly allowed if the storage is occupied. Standard draft says:
[basic.life] Before the lifetime of an object has started but after the storage which the object will occupy has been allocated or, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, any pointer that represents the address of the storage location where the object will be or was located may be used but only in limited ways ...
The rule goes on to list how the usage is restricted. You can get by with common sense. In short, you can treat it as you could treat a void*, except that violating these restrictions is UB rather than ill-formed. A similar rule exists for references.
There are also restrictions on computing the address of non-static members specifically. Standard draft says:
[class.cdtor] ... To form a pointer to (or access the value of) a direct non-static member of an object obj, the construction of obj shall have started and its destruction shall not have completed, otherwise the computation of the pointer value (or accessing the member value) results in undefined behavior.
In the constructor of Klass, the construction of Klass has started and destruction hasn't completed, so the above rule is satisfied.
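A compilable sketch of the class from the question, with two accessors added purely for illustration. The initial value "hello" is also an assumption, chosen so there is something to read back. The pointer is only dereferenced after construction has finished.

```cpp
#include <string>

class Klass
{
public:
    Klass()
        : ptr_str{&str}   // str's storage exists, its lifetime has not begun
        , str{"hello"}    // initialised second, following declaration order
    {}

    // Accessors added for illustration only.
    const std::string* pointer() const { return ptr_str; }
    const std::string& member()  const { return str; }

private:
    std::string* ptr_str;  // declared first, so initialised first
    std::string  str;
};

// Checks that the stored pointer designates the member of the same object.
bool pointer_targets_member()
{
    Klass k;
    return k.pointer() == &k.member() && *k.pointer() == "hello";
}
```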
P.S. Your class is copyable, but the copy will have a pointer to the member of another instance. Consider whether that makes sense for your class. If not, you will need to implement custom copy and move constructors and assignment operators. A self-reference like this is a rare case where you may need custom definitions for those, but not a custom destructor, so it is an exception to the rule of five (or three).
P.P.S. If your intention is to point to one of the members, and to no object other than a member, then you might want to use a pointer to member instead of a pointer to object.
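Both postscripts can be sketched in code. The class names SelfPointing and MemberSelecting and their members are hypothetical. The first variant re-seats the raw pointer in a custom copy constructor; the second uses a pointer to member, for which the implicitly generated copy operations already do the right thing.

```cpp
#include <string>

// Variant 1: keep the raw pointer, but make every copy point at its own str.
class SelfPointing
{
public:
    SelfPointing() : ptr_str{&str}, str{} {}
    SelfPointing(const SelfPointing& other) : ptr_str{&str}, str{other.str} {}
    SelfPointing& operator=(const SelfPointing& other)
    {
        str = other.str;   // ptr_str already points at our own str
        return *this;
    }
    bool points_to_self() const { return ptr_str == &str; }

private:
    std::string* ptr_str;
    std::string  str;
};

// Variant 2: a pointer to member records "which member", not "which object",
// so a copied selection automatically applies to the copy.
class MemberSelecting
{
public:
    MemberSelecting() : selected{&MemberSelecting::first} {}
    void select_second() { selected = &MemberSelecting::second; }
    std::string& current() { return this->*selected; }

    std::string first{"first"};
    std::string second{"second"};

private:
    std::string MemberSelecting::* selected;
};
```

Because SelfPointing declares a copy constructor, no move operations are generated implicitly, so moves fall back to the copy operations shown.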
Funny question.
It is legitimate and will "work", though barely. There is a little "but" related to types which makes the whole thing a bit awkward and leaves a bad taste (without making it illegitimate), and which might make it illegal in some border cases involving inheritance.
You can, of course, take the address of any object whether it's initialized or not, as long as it exists in the scope and has a name which you can prepend operator& to. Dereferencing the pointer is a different thing, but that wasn't the question.
Now, the subtle problem is that the standard defines the result of operator& for non-static struct members as "pointer to member of class C of type T" and says it "is a prvalue designating C::m".
Which basically means that ptr_str{&str} will take the address of str, but the type is not pointer-to, but pointer-to-member-of. It is then implicitly and silently cast to pointer-to.
In other words, although you do not need to explicitly write &this->str, that's nevertheless what its type is -- it's what it is and what it means [1].
Is this valid, and is it safe to use it within the initializer list? Well yes, just... barely. It's safe to use it as long as it's not being used to access uninitialized members or virtual functions, directly or indirectly. Which, as it happens, is the case here (it might not be the case in a different, arguably contrived case).
[1] Funnily, paragraph 4 starts with a clause that says that no member pointer is formed when you put stuff in parentheses. That's remarkable because most people would probably do that just to be 100% sure they got operator precedence right. But if I read correctly, then &this->foo and &(this->foo) are not in any way the same!
Is it legal to compare dangling pointers?
int *p, *q;
{
int a;
p = &a;
}
{
int b;
q = &b;
}
std::cout << (p == q) << '\n';
Note how both p and q point to objects that have already vanished. Is this legal?
Introduction: The first issue is whether it is legal to use the value of p at all.
After a has been destroyed, p acquires what is known as an invalid pointer value. Quote from N4430 (for discussion of N4430's status see the "Note" below):
When the end of the duration of a region of storage is reached, the values of all pointers representing the address of any part of the deallocated storage become invalid pointer values.
The behaviour when an invalid pointer value is used is also covered in the same section of N4430 (and almost identical text appears in C++14 [basic.stc.dynamic.deallocation]/4):
Indirection through an invalid pointer value and passing an invalid pointer value to a deallocation function have undefined behavior. Any other use of an invalid pointer value has implementation-defined behavior.
[ Footnote: Some implementations might define that copying an invalid pointer value causes a system-generated runtime fault. — end footnote ]
So you will need to consult your implementation's documentation to find out what should happen here (since C++14).
The term use in the above quotes means necessitating lvalue-to-rvalue conversion, as in C++14 [conv.lval]/2:
When an lvalue-to-rvalue conversion is applied to an expression e, and [...] the object to which the glvalue refers contains an invalid pointer value, the behaviour is implementation-defined.
History: In C++11 this said undefined rather than implementation-defined; it was changed by DR1438. See the edit history of this post for the full quotes.
Application to p == q: Supposing we have accepted in C++14+N4430 that the result of evaluating p and q is implementation-defined, and that the implementation does not define that a hardware trap occurs; [expr.eq]/2 says:
Two pointers compare equal if they are both null, both point to the same function, or both represent the same address (3.9.2), otherwise they compare unequal.
Since it's implementation-defined what values are obtained when p and q are evaluated, we can't say for sure what will happen here. But it must be either implementation-defined or unspecified.
g++ appears to exhibit unspecified behaviour in this case; depending on the -O switch I was able to have it say either 1 or 0, corresponding to whether or not the same memory address was re-used for b after a had been destroyed.
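If the goal is merely to find out whether the two blocks reused the same address, one way to avoid reading an invalid pointer value is to record each address as an integer while the object is still alive. This is a sketch, assuming the implementation provides std::uintptr_t (it is an optional type). The function names are mine.

```cpp
#include <cstdint>

// Converts a valid object pointer to an integer. The mapping is
// implementation-defined, but the result is an ordinary integer value.
std::uintptr_t address_of(const void* p)
{
    return reinterpret_cast<std::uintptr_t>(p);
}

// Reports whether a and b happened to occupy the same address.
// Either answer is possible; it depends on how the compiler lays out the frame.
bool blocks_reused_address()
{
    std::uintptr_t pa = 0;
    std::uintptr_t qb = 0;
    {
        int a = 0;
        pa = address_of(&a);   // taken while a is alive
    }
    {
        int b = 0;
        qb = address_of(&b);   // taken while b is alive
    }
    return pa == qb;           // plain integer comparison
}
```

The comparison at the end only involves integers, so no pointer value is read after the end of its object's storage duration.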
Note about N4430: This is a proposed defect resolution to C++14, that hasn't been accepted yet. It cleans up a lot of wording surrounding object lifetime, invalid pointers, subobjects, unions, and array bounds access.
In the C++14 text, it is defined under [basic.stc.dynamic.deallocation]/4 and subsequent paragraphs that an invalid pointer value arises when delete is used. However it's not clearly stated whether or not the same principle applies to static or automatic storage.
There is a definition of "valid pointer" in [basic.compound]/3 but it is too vague to use sensibly. The [basic.life]/5 (footnote) refers to the same text to define the behaviour of pointers to objects of static storage duration, which suggests that it was meant to apply to all types of storage.
In N4430 the text is moved from that section up one level so that it does clearly apply to all storage durations. There is a note attached:
Drafting note: this should apply to all storage durations that can end, not just to dynamic storage duration. On an implementation supporting threads or segmented stacks, thread and automatic storage may behave in the same way that dynamic storage does.
My opinion: I don't see any consistent way to interpret the standard (pre-N4430) other than to say that p acquires an invalid pointer value. The behaviour doesn't seem to be covered by any other section besides what we have already looked at. So I am happy to treat the N4430 wording as representing the intent of the standard in this case.
Historically, there have been some systems where using a pointer as an rvalue might cause the system to fetch some information identified by some bits in that pointer. For example, if a pointer could contain the address of an object's header along with an offset into the object, fetching a pointer could cause the system to also fetch some information from that header. If the object has ceased to exist, the attempt to fetch information from its header could fail with arbitrary consequences.
That having been said, in the vast majority of C implementations, all pointers that were alive at some particular moment in time will forever hold the same relationships with regard to the relational and subtraction operators as they had at that particular time. Indeed, in most implementations if one has char *p, one may determine whether it identifies part of an object identified by char *base; size_t size; by checking whether (size_t)(p-base) < size; such comparison will work even retrospectively if there is any overlap in the objects' lifetime.
Unfortunately, the Standard defines no means by which code can indicate that it requires any of the latter guarantees, nor is there a standard means by which code can ask whether a particular implementation can promise any of the latter behaviors and refuse compilation if it does not. Further, some hyper-modern implementations will regard any use of relational or subtraction operators on two pointers as a promise by the programmer that the pointers in question will always identify the same live object, and omit any code which would only be relevant if that assumption didn't hold. Consequently, even though many hardware platforms would be able to offer guarantees that would be useful to many algorithms, there's no safe way by which code can exploit any such guarantees even if code will never need to run on hardware which does not naturally provide them.
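For pointers that are still valid at the time of the check, the range test described above can at least be written without subtracting unrelated pointers. This is a sketch using std::less and std::less_equal, whose pointer specializations are guaranteed to yield a total order. Whether that order can never place an unrelated pointer inside the range is an assumption here; it holds on flat-address-space implementations. The function name points_into is mine.

```cpp
#include <cstddef>
#include <functional>

// Returns true if p lies in [base, base + size).
// Uses the library comparison objects instead of the built-in < and
// instead of (size_t)(p - base), which is undefined for unrelated pointers.
bool points_into(const char* p, const char* base, std::size_t size)
{
    std::less<const char*> lt;
    std::less_equal<const char*> le;
    return le(base, p) && lt(p, base + size);
}
```

This does not address the retrospective use discussed above; it only covers pointers whose objects are still alive.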
The pointers contain the addresses of the variables they reference. The addresses are valid even when the variables that used to be stored there are released / destroyed / unavailable.
As long as you don't try to use the values at those addresses you are safe, meaning *p and *q will be undefined.
Obviously the result is implementation defined, therefore this code example can be used to study the features of your compiler if one doesn't want to dig into assembly code.
Whether this is a meaningful practice is totally different discussion.
A blog author has brought up a discussion about null pointer dereferencing:
http://www.viva64.com/en/b/0306/
I've put some counter arguments here:
http://bit.ly/1L98GL4
His main line of reasoning quoting the standard is this:
The '&podhd->line6' expression is undefined behavior in the C language
when 'podhd' is a null pointer.
The C99 standard says the following about the '&' address-of operator
(6.5.3.2 "Address and indirection operators"):
The operand of the unary & operator shall be either a function
designator, the result of a [] or unary * operator, or an lvalue that
designates an object that is not a bit-field and is not declared with
the register storage-class specifier.
The expression 'podhd->line6' is clearly not a function designator,
the result of a [] or * operator. It is an lvalue expression. However,
when the 'podhd' pointer is NULL, the expression does not designate an
object since 6.3.2.3 "Pointers" says:
If a null pointer constant is converted to a pointer type, the
resulting pointer, called a null pointer, is guaranteed to compare
unequal to a pointer to any object or function.
When "an lvalue does not designate an object when it is evaluated, the
behavior is undefined" (C99 6.3.2.1 "Lvalues, arrays, and function
designators"):
An lvalue is an expression with an object type or an incomplete type
other than void; if an lvalue does not designate an object when it is
evaluated, the behavior is undefined.
So, the same idea in brief:
When -> was executed on the pointer, it evaluated to an lvalue where
no object exists, and as a result the behavior is undefined.
This question is purely language based, I'm not asking regarding whether a given system allows one to tamper with what lies at address 0 in any language.
As far as I can see, there's no restriction on dereferencing a pointer variable whose value is equal to nullptr, even though comparisons of a pointer against the nullptr (or (void *) 0) constant can vanish in optimizations in certain situations because of the stated paragraphs. That looks like a separate issue, though: it doesn't prevent dereferencing a pointer whose value is equal to nullptr. Notice that I've checked other SO questions and answers (I particularly like this set of quotations) as well as the standard quotes above, and I didn't stumble upon anything showing clearly from the standard that if a pointer ptr compares equal to nullptr, dereferencing it would be undefined behavior.
At most, what I get is that dereferencing the constant (or its cast to any pointer type) is what is UB, but nothing is said about a variable that's bit-equal to the value that comes from nullptr.
I'd like to clearly separate the nullptr constant from a pointer variable that holds a value equal to it. But an answer that addresses both cases is ideal.
I do realise that optimizations can kick in when there are comparisons against nullptr, etc., and may simply strip code based on that.
If the conclusion is that dereferencing ptr is definitely UB when ptr equals the value of nullptr, another question follows:
Do C and C++ standards imply that a special value in the address space must exist solely to represent the value of null pointers?
As you quote C, dereferencing a null pointer is clearly undefined behavior from this Standard quote (emphasis mine):
(C11, 6.5.3.2p4) "If an invalid value has been assigned to the pointer, the
behavior of the unary * operator is undefined.102)"
102): "Among the invalid values for dereferencing a pointer by the unary * operator are a null pointer, an address inappropriately aligned for the type of object pointed to, and the address of an object after the end of its lifetime."
Exact same quote in C99 and similar in C89 / C90.
C++
dcl.ref/5.
There shall be no references to references, no arrays of references, and no pointers to references. The
declaration of a reference shall contain an initializer (8.5.3) except when the declaration contains an explicit
extern specifier (7.1.1), is a class member (9.2) declaration within a class definition, or is the declaration
of a parameter or a return type (8.3.5); see 3.1. A reference shall be initialized to refer to a valid object or
function. [ Note: in particular, a null reference cannot exist in a well-defined program, because the only way
to create such a reference would be to bind it to the “object” obtained by indirection through a null pointer,
which causes undefined behavior. As described in 9.6, a reference cannot be bound directly to a bit-field.
— end note ]
The note is of interest, as it explicitly says dereferencing a null pointer is undefined.
I'm sure it says it somewhere else in a more relevant context, but this is good enough.
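A sketch of how code can respect that note in practice: turn a possibly-null pointer into a reference only after checking it. The function name checked_ref and the choice of exception are assumptions for illustration.

```cpp
#include <stdexcept>

// Turns a possibly-null pointer into a reference, refusing the null case
// instead of performing indirection through it.
int& checked_ref(int* p)
{
    if (p == nullptr)
        throw std::invalid_argument("null pointer cannot be bound to a reference");
    return *p;
}
```

With a valid pointer, the returned reference designates the pointed-to object; with a null pointer, the function throws before any indirection takes place.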
As I see it, the degree to which a NULL value may be dereferenced is deliberately left platform-dependent in an unspecified manner, because of what is left implementation-defined in C11 6.3.2.3p5 and p6. This is mostly to support freestanding implementations used for developing boot code for a platform, as OP indicates in his rebuttal link, but it has applications for a hosted implementation too.
Re:
(C11, 6.5.3.2p4) "If an invalid value has been assigned to the pointer, the behavior of the unary * operator is undefined.102)"
102): "Among the invalid values for dereferencing a pointer by the unary * operator are a null pointer, an address inappropriately aligned for the type of object pointed to, and the address of an object after the end of its lifetime."
This is phrased as it is, afaict, because each of the cases in the footnote may NOT be invalid for specific platforms a compiler is targeting. If there's a defect there, it is that "invalid value" should be italicized and qualified by "implementation-defined". For the alignment case, a platform may be able to access any type using any address and so have no alignment requirements, especially if address rollover is supported; and a platform may assume an object's lifetime only ends after the application has exited, allocating a new frame via malloc() for automatic variables on each function call.
For null pointers, at boot time a platform may have expectations that structures the processor uses have specific physical addresses, including at address 0, and get represented as object pointers in source code, or may require the function defining the boot process to use a base address of 0. If the standard didn't permit dereferences like '&podhd->line6', where a platform required podhd to have a base address of 0, then assembly language would be needed to access that structure. Similarly, a soft reboot function might need to dereference a 0 valued pointer as a void function invocation. A hosted implementation may consider 0 the base of an executable image, and map a NULL pointer in source code to the header of that image, after loading, as the struct required to be at logical address 0 for that instance of the C virtual machine.
What the standard calls pointers are more handles into the virtual address space of the virtual machine, where object handles have more requirements on what operations are permitted for them. How the compiler emits code that takes the requirements of these handles into account for a specific processor is left undefined. What is efficient for one processor may not be for another, after all.
The requirement on (void *)0 is more that the compiler emit code that guarantees expressions where the source uses (void *)0, explicitly or by referencing NULL, that the actual value stored will be one that says this can't point to any valid function definitions or objects by any mapping code. This does not have to be a 0! Similarly, for casts of (void *)0 to (obj_type) and (func_type), these are only required to get assigned values that evaluate as addresses the compiler guarantees are not being used then for objects or code. The difference with the latter is these are unused, not invalid, so are capable of being dereferenced in the defined manner.
The code that tests for pointer equality would then check if one operand is one of these values that the other is one of the 3, not just the same bit pattern, because this scoreboards them with the RTTI of being a (null *) type, distinct from void, obj, and func pointer types to defined entities. The standard could be more explicit it is a distinct type, if unnamed because compilers only use it internally, but I suppose this is considered obvious by "null pointer" being italicized. Effectively, imo, a '0' in these contexts is an additional keyword token of the compiler, due to the additional requirement of it identifying the (null *) type, but isn't characterized as such because this would complicate the definition of < identifiers >.
This stored value can be SIZE_MAX as easily as a 0, for a (void *)0, in emitted application code when implementations, for example, define the range 0 to SIZE_MAX-4*sizeof(void *) of virtual machine handles as what is valid for code and data. The NULL macro may even be defined as (void *)SIZE_MAX, and it would be up to the compiler to figure out from context that this has the same semantics as 0. The casting code is responsible for noting it is the chosen value, in pointer <--> pointer casts, and for supplying what is appropriate as an object or function pointer. Casts from pointer <--> integer, implicit or explicit, have similar check and supply requirements, especially in unions where a (u)intptr_t field overlays a (type *) field. Portable code can guard against compilers not doing this properly with an explicit *(ptr==NULL?(type *)0:ptr) expression.
Possible Duplicate:
When does invoking a member function on a null instance result in undefined behavior?
C++ standard: dereferencing NULL pointer to get a reference?
Say I have the class:
class A
{
public:
void foo() { cout << "foo"; }
};
and call foo like so:
A* a = NULL;
a->foo();
I suspect this invokes undefined behavior, since it's equivalent to (*a).foo() (or is it?), and dereferencing a NULL is UB, but I can't find the reference. Can anyone help me out? Or is it defined?
No, the function is not virtual. No, I'm not accessing any members.
EDIT: I voted to close this question but will not delete it as I couldn't find the duplicate myself, and I suspect this title might be easier to find by others.
I'm looking for the reference that says a->x is equivalent to (*a).x.
Here it is:
[C++11: 5.2.5/2]: For the first option (dot) the first expression shall have complete class type. For the second option (arrow) the first expression shall have pointer to complete class type. The expression E1->E2 is converted to the equivalent form (*(E1)).E2; the remainder of 5.2.5 will address only the first option (dot). In either case, the id-expression shall name a member of the class or of one of its base classes. [ Note: because the name of a class is inserted in its class scope (Clause 9), the name of a class is also considered a nested member of that class. —end note ] [ Note: 3.4.5 describes how names are looked up after the . and -> operators. —end note ]
There is no direct quotation for dereferencing a NULL pointer being UB, unfortunately. You may find more under this question: When does invoking a member function on a null instance result in undefined behavior?
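With a valid pointer, the equivalence quoted above can be observed directly. A minimal sketch follows; this variant of A has a data member and a foo that returns a value, both added only so there is something to compare.

```cpp
struct A
{
    int x = 0;
    int foo() const { return 1; }
};

// With a valid pointer, a->x and (*a).x name the same member subobject,
// and a->foo() and (*a).foo() call the same function on the same object.
bool arrow_equals_star_dot(A* a)
{
    return &(a->x) == &((*a).x) && a->foo() == (*a).foo();
}
```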
I'm aware of at least one case where this idiom is not only allowed but relied upon: Microsoft's MFC class CWnd provides a member function GetSafeHwnd which tests if this==NULL and returns without accessing any member variables.
Of course there are plenty of people who would claim that MFC is a very bad example.
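The same convenience can be had without testing this against NULL, by making the helper a static member that takes the pointer as an ordinary argument. This is a sketch with hypothetical stand-in types, not the real MFC declarations.

```cpp
// Hypothetical stand-ins; not the real MFC types.
using Handle = void*;

class Window
{
public:
    explicit Window(Handle h) : handle_(h) {}

    // Static member: the null test happens before any member access,
    // and no `this` pointer is involved in the call.
    static Handle safe_handle(const Window* w)
    {
        return w ? w->handle_ : nullptr;
    }

private:
    Handle handle_;
};
```

A caller writes Window::safe_handle(ptr) instead of ptr->GetSafeHwnd(), so no member function is ever invoked on a null pointer.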
Regardless of whether the behavior is undefined or not, in practice it's not likely to behave badly. The compiler will treat a->foo() as A::foo(a) which does not do a dereference at the call site, as long as foo is not virtual.
Yes, that is UB as a has not been initialized to point to a valid memory location before it is dereferenced.
It is covered here: http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#232
At least a couple of places in the IS state that indirection through a
null pointer produces undefined behavior: 1.9 [intro.execution]
paragraph 4 gives "dereferencing the null pointer" as an example of
undefined behavior, and 8.3.2 [dcl.ref] paragraph 4 (in a note) uses
this supposedly undefined behavior as justification for the
nonexistence of "null references."
However, 5.3.1 [expr.unary.op] paragraph 1, which describes the unary
"*" operator, does not say that the behavior is undefined if the
operand is a null pointer, as one might expect. Furthermore, at least
one passage gives dereferencing a null pointer well-defined behavior:
5.2.8 [expr.typeid] paragraph 2 says
If the lvalue expression is obtained by applying the unary * operator
to a pointer and the pointer is a null pointer value (4.10
[conv.ptr]), the typeid expression throws the bad_typeid exception
(18.7.3 [bad.typeid]).
This is inconsistent and should be cleaned up.
Read more at the link if you want to learn more.
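The typeid case quoted above is the one place where applying unary * to a null pointer has specified behaviour, and it can be tried out. This is a sketch that assumes RTTI and exceptions are enabled; Base and typeid_throws_for are illustrative names.

```cpp
#include <typeinfo>

struct Base
{
    virtual ~Base() = default;   // polymorphic, so typeid(*p) is evaluated at run time
};

// Returns true if typeid(*p) threw std::bad_typeid, which is what
// [expr.typeid] specifies when p is a null pointer to a polymorphic type.
bool typeid_throws_for(const Base* p)
{
    try
    {
        (void)typeid(*p).name();
    }
    catch (const std::bad_typeid&)
    {
        return true;
    }
    return false;
}
```

For a non-polymorphic class the operand of typeid is not evaluated at all, so this special case only arises for polymorphic types.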