This question has probably been raised many times, but I still cannot find a well-reasoned answer. Consider the following code:
struct A {virtual int vfunc() = 0;};
struct B {virtual ~B() {}};
struct C {void *cdata;};
//...
struct Z{};
struct Parent:
public A,
virtual B,
private C,
//...
protected Z
{
int data;
virtual ~Parent(){}
virtual int vfunc() {return 0;} // implements A::vfunc interface
virtual void pvfunc() {};
double func() {return 0.0;}
//...etc
};
struct Child:
public Parent
{
virtual ~Child(){}
int more_data;
virtual int vfunc() {return 0;} // reimplements A::vfunc interface
virtual void pvfunc() {};// implements Parent::pvfunc interface
};
template<class T>
struct Wrapper: public T
{
// do nothing, just empty
};
int main()
{
Child ch;
Wrapper<Child> &wr = reinterpret_cast<Wrapper<Child>&/**/>(ch);
wr.data = 100;
wr.more_data = 200;
wr.vfunc();
//some more usage of wr...
Parent pr = wr;
pr.data == wr.data; // true?
//...
return 0;
}
Basically, this shows a cast to a reference to a dummy derived class Wrapper, followed by use of the members of its ancestor classes.
The question is: is this code valid according to the standard? If not, what exactly does it violate?
PS: Do not provide answers like "this is wrong on so many levels omg" and similar please. I need exact quotes from the standard proving the point.
I surely hope this is something you are doing as an academic exercise. Please do not ever write any real code that resembles any of this in any way. I can't possibly point out all the issues with this snippet of code as there are issues with just about everything in here.
However, to answer the real question - this is complete undefined behavior. In C++17, it is section 8.2.10 [expr.reinterpret.cast]. Use the phrase in the brackets to get the relevant section for previous standards.
EDIT I thought a succinct answer would suffice, but more details have been requested. I will not mention the other code issues, because they will just muddy the water.
There are several key issues here. Let's focus on the reinterpret_cast.
Child ch;
Wrapper<Child> &wr = reinterpret_cast<Wrapper<Child>&/**/>(ch);
Most of the wording in the spec uses pointers, so based on 8.2.10/11, we will change the example code slightly to this.
Child ch;
Wrapper<Child> *wr = reinterpret_cast<Wrapper<Child>*>(&ch);
Here is the quoted part of the standard for this justification.
A glvalue expression of type T1 can be cast to the type “reference to T2” if an expression of type “pointer to T1” can be explicitly converted to the type “pointer to T2” using a reinterpret_cast. The result refers to the same object as the source glvalue, but with the specified type. [ Note: That is, for lvalues, a reference cast reinterpret_cast<T&>(x) has the same effect as the conversion *reinterpret_cast<T*>(&x) with the built-in & and * operators (and similarly for reinterpret_cast<T&&>(x)). — end note ] No temporary is created, no copy is made, and constructors (15.1) or conversion functions (15.3) are not called.
One subtle little part of the standard is 6.9.2/4 which allows for certain special cases for treating a pointer to one object as if it were pointing to an object of a different type.
Two objects a and b are pointer-interconvertible if:
(4.1) — they are the same object, or
(4.2) - one is a standard-layout union object and the other is a non-static data member of that object (12.3), or
(4.3) — one is a standard-layout class object and the other is the first non-static data member of that object, or, if the object has no non-static data members, the first base class subobject of that object (12.2), or
(4.4) — there exists an object c such that a and c are pointer-interconvertible, and c and b are pointer-interconvertible.
If two objects are pointer-interconvertible, then they have the same address, and it is possible to obtain a pointer to one from a pointer to the other via a reinterpret_cast (8.2.10). [ Note: An array object and its first element are not pointer-interconvertible, even though they have the same address. — end note ]
However, your case does not meet these criteria, so we can't use this exception to treat a pointer to Child as if it were a pointer to Wrapper<Child>.
We will ignore the stuff about reinterpret_cast that does not deal with casting between two pointer types, since this case just deals with pointer types.
Note the last sentence of 8.2.10/1
Conversions that can be performed explicitly using reinterpret_cast are listed below. No other conversion can be performed explicitly using reinterpret_cast.
There are 10 paragraphs that follow.
Paragraph 2 says reinterpret_cast can't cast away constness. Not our concern.
Paragraph 3 says that the result may or may not produce a different representation.
Paragraphs 4 and 5 are about casting between pointers and integral types.
Paragraph 6 is about casting function pointers.
Paragraph 8 is about converting between function pointers and object pointers.
Paragraph 9 is about converting null pointer values.
Paragraph 10 is about converting between member pointers.
Paragraph 11 is quoted above and basically says that casting references is akin to casting pointers.
That leaves paragraph 7, which states.
An object pointer can be explicitly converted to an object pointer of a different type. When a prvalue v of object pointer type is converted to the object pointer type “pointer to cv T”, the result is static_cast<cv T*>(static_cast<cv void*>(v)). [ Note: Converting a prvalue of type “pointer to T1” to the type “pointer to T2” (where T1 and T2 are object types and where the alignment requirements of T2 are no stricter than those of T1) and back to its original type yields the original pointer value. — end note ]
This means that we can cast back and forth between those two pointer types all day long. However, that's all we can safely do. You are doing more than that, and yes, there are a few exceptions that allow for some other things.
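As a minimal sketch of the one thing paragraph 7 does guarantee, the round trip below converts a pointer and converts it straight back without ever dereferencing the intermediate value. The types Child and Wrapper here are simplified, hypothetical stand-ins for the ones in the question, and the helper name round_trip is an assumption made for illustration.

```cpp
#include <cassert>

// Simplified, hypothetical stand-ins for the question's types.
struct Child { int data = 0; };
template <class T> struct Wrapper : T {};

// The note in paragraph 7 requires that the target type's alignment is
// no stricter than the source type's, so state that precondition explicitly.
static_assert(alignof(Wrapper<Child>) <= alignof(Child),
              "round-trip guarantee needs compatible alignment");

// Converts to Wrapper<Child>* and immediately back. The intermediate
// pointer is only carried around, never dereferenced.
inline Child* round_trip(Child* p) {
    Wrapper<Child>* carried = reinterpret_cast<Wrapper<Child>*>(p);
    return reinterpret_cast<Child*>(carried);
}
```

Calling round_trip(&ch) on a Child object ch yields &ch again. Anything done through the intermediate Wrapper<Child>* beyond carrying it is what the rest of this answer is about.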
Here is 6.10/8
If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:
(8.1) — the dynamic type of the object,
(8.2) — a cv-qualified version of the dynamic type of the object,
(8.3) — a type similar (as defined in 7.5) to the dynamic type of the object,
(8.4) — a type that is the signed or unsigned type corresponding to the dynamic type of the object,
(8.5) — a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,
(8.6) — an aggregate or union type that includes one of the aforementioned types among its elements or non-static data members (including, recursively, an element or non-static data member of a subaggregate or contained union),
(8.7) — a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,
(8.8) — a char, unsigned char, or std::byte type.
Your case does not satisfy any of those.
In your case, you are taking a pointer to one type and forcing the compiler to pretend that it points to a different type. It does not matter how similar the two look to your eyes. Did you know that a completely standard-conforming compiler does not have to put the data for a derived class after the data for a base class? Those details are NOT part of the C++ standard, but part of the ABI your compiler implements.
In fact, there are very few uses of reinterpret_cast, other than carrying a pointer around and then casting it back to its original type, that do not elicit undefined behavior.
As stated in another answer, this discussion relates to section
8.2.10 [expr.reinterpret.cast] of the C++17 standard.
Paragraph 11 of this section explains that for references to
objects we can have the same reasoning as for pointers to
objects.
Wrapper<Child> &wr = reinterpret_cast<Wrapper<Child>&/**/>(ch);
or
Wrapper<Child> *wr = reinterpret_cast<Wrapper<Child>*/**/>(&ch);
Paragraph 7 of this section explains that for pointers to objects
reinterpret_cast can be seen as two static_cast in sequence
(through void *).
In the specific case of this question, the type Wrapper<Child>
actually inherits from Child, so a single static_cast should
be sufficient (no need for two static_cast, nor reinterpret_cast).
So if reinterpret_cast can be seen here as the combination of a
useless static_cast through void * and a correct static_cast,
it should be considered equivalent to this correct static_cast.
hum...
On second thought, I think I'm totally wrong!
(the static_cast is incorrect, I have read it the wrong way)
If we had
Wrapper<Child> wc=...
Child *pc=&wc;
Wrapper<Child> *pwc=static_cast<Wrapper<Child>*>(pc);
the static_cast (then the reinterpret_cast) would be correct
because it goes back to the original type.
But in your example the original type was not
Wrapper<Child> but Child.
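The valid direction described above can be sketched as follows. Child and Wrapper are reduced, hypothetical versions of the question's types, and to_wrapper is a name introduced only for this illustration. The downcast is fine here solely because the object was created as a Wrapper<Child> in the first place.

```cpp
#include <cassert>

// Reduced, hypothetical versions of the question's types.
struct Child { int more_data = 0; };
template <class T> struct Wrapper : T {};

// Well-defined only if *pc really is the Child base subobject of a
// Wrapper<Child> object; the caller has to guarantee that.
inline Wrapper<Child>* to_wrapper(Child* pc) {
    return static_cast<Wrapper<Child>*>(pc);
}
```

With Wrapper<Child> wc; Child* pc = &wc;, the call to_wrapper(pc) returns &wc. Passing the address of a plain Child object instead would compile just as well but would have undefined behavior, which is the situation in the question.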
Even if it is very unlikely, nothing forbids the compiler from
adding some hidden data members to Wrapper<Child>.
Wrapper<Child> is not an empty structure, it participates
in a hierarchy with dynamic polymorphism, and any solution could
be used under the hood by the compiler.
So, after reinterpret_cast, it becomes undefined behavior because
the address stored in the pointer (or reference) will point to
some bytes with the layout of Child but the following code
will use these bytes with the layout of Wrapper<Child>
which may be different.
Let B be derived from class A. By reading various posts I've got an impression that casting like in
const std::shared_ptr<const A> a(new B());
const std::shared_ptr<const B>& b = reinterpret_cast<const std::shared_ptr<const B>&>(a);
is for some reason discouraged and that one should use reinterpret_pointer_cast instead. However, I would like to avoid creating a new shared_ptr for performance reasons. Is the above code legal? Does it lead to undefined behavior? It seems to work in gcc and in Visual Studio.
You want static_pointer_cast.
const std::shared_ptr<const A> a(new B());
const std::shared_ptr<const B> b = std::static_pointer_cast<const B>(a);
I highly doubt the above will cause any performance issues. But if you have evidence that a shared_ptr creates a performance problem, then fallback to the raw pointer:
const B* pB = static_cast<const B*>(a.get());
Another hint. Please try to avoid reinterpret_cast between classes with an inheritance relationship. In cases where there are virtual methods and/or multiple inheritance, the static_cast will correctly adjust the pointer offset to the correct vtable or base offset. But reinterpret_cast will not. (Or technically: undefined behavior)
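The pointer adjustment mentioned in the hint can be made visible with a small sketch. The types Left, Right and Both are hypothetical and exist only for this illustration. Two non-empty base subobjects cannot share an address, so at most one of them can sit at the address of the complete object, and a cast that performs no adjustment cannot be right for both.

```cpp
#include <cassert>

// Hypothetical hierarchy with two non-empty, non-virtual bases.
struct Left  { int l = 1; };
struct Right { int r = 2; };
struct Both : Left, Right {};

// static_cast knows the hierarchy and returns the address of the
// requested base subobject, adjusting the pointer if needed.
inline const Left*  left_part(const Both* b)  { return static_cast<const Left*>(b); }
inline const Right* right_part(const Both* b) { return static_cast<const Right*>(b); }

// True when the two base subobjects live at different addresses.
inline bool bases_at_distinct_addresses(const Both& b) {
    return static_cast<const void*>(left_part(&b)) !=
           static_cast<const void*>(right_part(&b));
}
```

Which of the two bases comes first is up to the implementation, so the sketch only checks that the addresses differ, not what the offsets are.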
reinterpret_cast will usually lead to UB. Sometimes you may be willing to take that risk for performance reasons, but you should avoid it as much as you can. In this case, it is better to use static_pointer_cast.
Note that even if you do not know which other cast you could use here and you are willing to take the risk with reinterpret_cast, you must add validation before and after the cast; otherwise you are likely to run into many errors and lose a lot of time.
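A minimal sketch of the static_pointer_cast approach recommended above, assuming a simple hierarchy where B derives from A. The member tag and the helper as_b are hypothetical names added for illustration. The returned shared_ptr shares ownership with the original, so the only extra cost is one reference-count increment.

```cpp
#include <cassert>
#include <memory>

// Hypothetical hierarchy: B derives from A, as in the question.
struct A { virtual ~A() = default; };
struct B : A { int tag = 42; };

// Valid only when a really owns a B (or something derived from B).
inline std::shared_ptr<const B> as_b(const std::shared_ptr<const A>& a) {
    return std::static_pointer_cast<const B>(a);
}
```

After std::shared_ptr<const A> a = std::make_shared<B>(); auto b = as_b(a);, both a and b refer to the same control block, so a.use_count() is 2 in a single-threaded program.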
First, you create an object a of type const std::shared_ptr<const A> and initialize it with a pointer to some type B. This only works if you can assign a B* to an A*, so there should be a relationship such as inheritance. Ignoring this, you convert an object of some type to a reference to another type with reinterpret_cast:
A glvalue expression of type T1 can be cast to the type “reference to
T2” if an expression of type “pointer to T1” can be explicitly
converted to the type “pointer to T2” using a reinterpret_cast.
The result refers to the same object as the source glvalue, but with
the specified type. [ Note: That is, for lvalues, a reference cast
reinterpret_cast<T&>(x) has the same effect as the conversion
*reinterpret_cast<T*>(&x) with the built-in & and * operators (and similarly for reinterpret_cast<T&&>(x)). —end note ]
For pointers, reinterpret_cast boils down to conversion to void* and then to the target type:
An object pointer can be explicitly converted to an object pointer of
a different type. When a prvalue v of object pointer type is
converted to the object pointer type “pointer to cv T”, the result is
static_cast<cv T*>(static_cast<cv void*>(v)).
The semantics of the two static casts are defined as:
A prvalue of type “pointer to cv1 void” can be converted to a prvalue
of type “pointer to cv2 T,” where T is an object type and cv2 is the
same cv-qualification as, or greater cv-qualification than, cv1. The
null pointer value is converted to the null pointer value of the
destination type. If the original pointer value represents the address
A of a byte in memory and A satisfies the alignment requirement of T,
then the resulting pointer value represents the same address as the
original pointer value, that is, A. The result of any other such
pointer conversion is unspecified.
The platform I am working on has near and far pointers which are 16 or 32 bit. In that case, the types shared_ptr<A> and shared_ptr<B> are of different size and alignment, and casting one into the other is then unspecified behavior. If alignment matches, the result of the static casts is defined.
However, the first clause about reinterpret_cast to a reference also contains a note
[ Note: That is, for lvalues, a reference
cast reinterpret_cast<T&>(x) has the same effect as the conversion *reinterpret_cast<T*>(&x) with
the built-in & and * operators (and similarly for reinterpret_cast<T&&>(x)). —end note ]
so basically, the cast is semantically identical to a pointer conversion followed by an immediate dereference. Even if the pointers are of identical size (and compatible alignment), using the converted pointer will violate the strict aliasing rule, since dereferencing is an access.
If a program attempts to access the stored value of an object
through a glvalue of other than one of the
following types the behavior is undefined:
— the dynamic type of the object,
— a cv-qualified version of the dynamic type of the object,
— a type similar (as defined in 4.4) to the dynamic type of the object,
— a type that is the signed or unsigned type corresponding to the dynamic type of the object,
— a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type
of the object,
— an aggregate or union type that includes one of the aforementioned types among its elements or non-static
data members (including, recursively, an element or non-static data member of a subaggregate
or contained union),
Using the functionality of a shared_ptr, or of any other standard class or template, is only defined when you call its functions (including member functions) with arguments of the type they expect (including the implicit this argument).
Nothing in the standard defines what happens when you call a standard function expecting a Foo and pass a Bar, for any two standard types Foo and Bar (or even for user types).
That's not defined; that's undefined: the call fails to meet the most basic precondition, which is to use arguments of the correct type.
For this question, no polymorphism shall be involved, i.e. no virtual methods, no virtual base classes. Just in case it matters, my case does not involve any of those.
Assume I have a class Derived which has an unambiguous accessible parent of type Base, with no polymorphism (no virtual methods, no virtual base classes), but possibly involving indirect and/or multiple inheritance.
Assume further I have a valid pointer Derived *derived
(points to an object of type Derived or a subclass thereof).
In this case, I believe static_cast<Base*>(derived) is valid
(results in a valid usable pointer). When the ancestry chain between Base and Derived involves multiple inheritance, this static_cast might imply pointer adjustments to locate the Base instance within the Derived instance. To do that, the compiler needs to know the inheritance chain, which it does in this case. However, if an intermediate cast to void * is inserted, that inheritance chain information is hidden from the compiler. For which inheritance chains is such a static cast valid nonetheless? I expect one of the following:
None at all? Accessing a static_cast from void pointer is undefined behaviour unless the pointer really points to the exact type.
For all chains without multiple-inheritance? Then, the compiler could guarantee that Base is always at the start of Derived - but what says the standard?
For all chains where Base is found within the first parent class of all intermediate multiple inheritance chains? Maybe the start of Base and Derived still matches?
Always? static_cast to void pointer could always adjust to the start of the very first parent, and static_cast from void pointer undo that adjustment. But with multiple inheritance, the "very first parent" is not necessarily a parent of all parents.
static_cast<Base*>(static_cast<void*>(derived)) has a name in the C++ standard. It's called a reinterpret_cast. It's specified in [expr.reinterpret.cast] paragraph 7:
An object pointer can be explicitly converted to an object pointer of a different type. When a prvalue v of object pointer type is converted to the object pointer type “pointer to cv T”, the result is static_cast<cv T*>(static_cast<cv void*>(v)). [ Note: Converting a prvalue of type “pointer to T1” to the type “pointer to T2” (where T1 and T2 are object types and where the alignment requirements of T2 are no stricter than those of T1) and back to its original type yields the original pointer value. — end note ]
A reinterpret_cast is us telling the compiler to treat the pointer as something else. There is no adjustment that the compiler can or will do under this instruction. If we lied, then the behavior is simply undefined. Does the standard say when such a reinterpret_cast is valid? It does, actually. There is a concept of pointer-interconvertibility defined at [basic.compound] paragraph 4:
Two objects a and b are pointer-interconvertible if:
they are the same object, or
one is a union object and the other is a non-static data member of that object ([class.union]), or
one is a standard-layout class object and the other is the first non-static data member of that object, or, if the object has no
non-static data members, any base class subobject of that object
([class.mem]), or
there exists an object c such that a and c are pointer-interconvertible, and c and b are pointer-interconvertible.
If two objects are pointer-interconvertible, then they have the same
address, and it is possible to obtain a pointer to one from a pointer
to the other via a reinterpret_cast. [ Note: An array object and its
first element are not pointer-interconvertible, even though they have
the same address. — end note ]
The third bullet is your answer. The objects in the class hierarchy must uphold restrictions (be standard layout from top base to most derived), and only then is the cast guaranteed to give well defined results.
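One way to make that restriction visible in code is to assert the standard-layout property before going through void*. This is a hedged sketch, not a general recipe: Base, Derived, and base_via_void are hypothetical names, and the example assumes the simplest case the quoted bullet covers, a derived class that adds no data members of its own.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical non-polymorphic hierarchy. Derived adds no data members,
// so all non-static data members live in a single class (Base).
struct Base { int value; };
struct Derived : Base {};

// The pointer-interconvertibility bullet only applies to standard-layout
// classes, so fail the build if a later change breaks that property.
static_assert(std::is_standard_layout<Derived>::value,
              "the cast through void* relies on standard layout");

// Equivalent to reinterpret_cast<Base*>(d) per [expr.reinterpret.cast]/7.
inline Base* base_via_void(Derived* d) {
    return static_cast<Base*>(static_cast<void*>(d));
}
```

For such a type the result compares equal to static_cast<Base*>(d). Once Derived gains its own data members or a second non-empty base, the static_assert fires and the plain static_cast is the only safe choice.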
Sample code:
struct S { int x; };
int func()
{
S s{2};
return (int &)s; // Equivalent to *reinterpret_cast<int *>(&s)
}
I believe this is common and considered acceptable. The standard does guarantee that there is no initial padding in the struct. However this case is not listed in the strict aliasing rule (C++17 [basic.lval]/11):
If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:
(11.1) the dynamic type of the object,
(11.2) a cv-qualified version of the dynamic type of the object,
(11.3) a type similar (as defined in 7.5) to the dynamic type of the object,
(11.4) a type that is the signed or unsigned type corresponding to the dynamic type of the object,
(11.5) a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,
(11.6) an aggregate or union type that includes one of the aforementioned types among its elements or non-static data members (including, recursively, an element or non-static data member of a subaggregate or contained union),
(11.7) a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,
(11.8) a char, unsigned char, or std::byte type.
It seems clear that the object s is having its stored value accessed.
The types listed in the bullet points are the type of the glvalue doing the access, not the type of the object being accessed. In this code the glvalue type is int which is not an aggregate or union type, ruling out 11.6.
My question is: Is this code correct, and if so, under which of the above bullet points is it allowed?
The behaviour of the cast comes down to [expr.static.cast]/13;
A prvalue of type “pointer to cv1 void” can be converted to a prvalue of type “pointer to cv2 T”, where T is an object type and cv2 is the same cv-qualification as, or greater cv-qualification than, cv1. If the original
pointer value represents the address A of a byte in memory and A does not satisfy the alignment requirement of T , then the resulting pointer value is unspecified. Otherwise, if the original pointer value points to an object a, and there is an object b of type T (ignoring cv-qualification) that is pointer-interconvertible with a, the result is a pointer to b. Otherwise, the pointer value is unchanged by the conversion.
The definition of pointer-interconvertible is:
Two objects a and b are pointer-interconvertible if:
they are the same object, or
one is a union object and the other is a non-static data member of that object, or
one is a standard-layout class object and the other is the first non-static data member of that object, or, if the object has no non-static data members, the first base class subobject of that object, or
there exists an object c such that a and c are pointer-interconvertible, and c and b are pointer-interconvertible.
So in the original code, s and s.x are pointer-interconvertible and it follows that (int &)s actually designates s.x.
So, in the strict aliasing rule, the object whose stored value is being accessed is s.x and not s and so there is no problem, the code is correct.
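The preconditions this argument relies on can be spelled out as compile-time checks. The sketch below reuses the S from the sample code; the helper name first_member is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

struct S { int x; };

// Pointer-interconvertibility with the first member requires a
// standard-layout class; such a class has no padding before that member.
static_assert(std::is_standard_layout<S>::value, "S must be standard-layout");
static_assert(offsetof(S, x) == 0, "first member sits at offset zero");

// Reads s.x through a pointer obtained from a pointer to s.
inline int first_member(S& s) {
    return *reinterpret_cast<int*>(&s);
}
```

For S s{2}, first_member(s) returns 2, and reinterpret_cast<int*>(&s) compares equal to &s.x.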
I think it's in expr.reinterpret.cast#11
A glvalue expression of type T1, designating an object x, can be cast
to the type “reference to T2” if an expression of type “pointer to T1”
can be explicitly converted to the type “pointer to T2” using a
reinterpret_cast. The result is that of *reinterpret_cast<T2 *>(p)
where p is a pointer to x of type “pointer to T1”. No temporary is
created, no copy is made, and no constructors or
conversion functions are called [1].
[1] This is sometimes referred to as a type pun when the result refers to the same object as the source glvalue
Supporting @M.M's answer about pointer-interconvertibility:
from cppreference:
Assuming that alignment requirements are met, a reinterpret_cast does
not change the value of a pointer outside of a few limited cases
dealing with pointer-interconvertible objects:
struct S { int a; } s;
int* p = reinterpret_cast<int*>(&s); // value of p is "pointer to s.a" because s.a
// and s are pointer-interconvertible
*p = 2; // s.a is also 2
versus
struct S { int a; };
S s{2};
int i = (int &)s; // Equivalent to *reinterpret_cast<int *>(&s)
// i doesn't change S.a;
The cited rule is derived from a similar rule in C89 which would be nonsensical as written unless one stretches the meaning of the word "by", or recognizes what "Undefined Behavior" meant when C89 was written. Given something like struct S {unsigned dat[10];}s;, the statement s.dat[1]++; would clearly modify the stored value of s, but the only lvalue of type struct S in that expression is used solely for the purpose of producing a value of type unsigned*. The only lvalue which is used to modify any object is of type unsigned.
As I see it, there are two related ways of resolving this issue: (1) recognizing that the authors of the Standard wanted to allow cases where an lvalue of one type was visibly derived from one of another type, but didn't want to get hung up on details of what forms of visible derivation must be accounted for, especially since the range of cases compilers would need to recognize would vary considerably based upon the styles of optimization they performed and the tasks for which they were being used; (2) recognizing that the authors of the Standard had no reason to think it should matter whether the Standard actually required that a particular construct be processed usefully, if it would have been clear to everyone that there was no reason to do otherwise.
I don't think there has been consensus among the Committee members over whether a compiler given something like:
struct foo {int ct; int *dat;} it;
void test(void)
{
for (int i=0; i < it.ct; i++)
it.dat[i] = 0;
}
should be required to ensure that e.g. after it.ct = 1234; it.dat = &it.ct;, a call to test(); would zero it.ct and have no other effect. Parts of the Rationale would suggest that at least some committee members would have expected so, but the omission of any rule that would allow for an object of structure type to be accessed using an arbitrary lvalue of member type suggests otherwise. The C Standard has never really resolved this issue, and the C++ Standard cleans things up somewhat but doesn't really solve it either.
I am trying to determine whether the following code invokes undefined behavior:
#include <iostream>
class A;
void f(A& f)
{
char* x = reinterpret_cast<char*>(&f);
for (int i = 0; i < 5; ++i)
std::cout << x[i];
}
int main(int argc, char** argv)
{
A* a = reinterpret_cast<A*>(new char[5]);
f(*a);
}
My understanding is that reinterpret_casts to and from char* are compliant because the standard permits aliasing with char and unsigned char pointers (emphasis mine):
If a program attempts to access the stored value of an object through an lvalue of other than one of the following types the behavior is undefined:
the dynamic type of the object,
a cv-qualified version of the dynamic type of the object,
a type that is the signed or unsigned type corresponding to the dynamic type of the object,
a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,
an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union),
a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,
a char or unsigned char type.
However, I am not sure whether f(*a) invokes undefined behavior by creating an A& reference from the invalid pointer. The deciding factor seems to be what "attempts to access" means in the context of the C++ standard.
My intuition is that this does not constitute an access, since an access would require A to be defined (it is declared, but not defined in this example). Unfortunately, I cannot find a concrete definition of "access" in the C++ standard.
Does f(*a) invoke undefined behavior? What constitutes "access" in the C++ standard?
I understand that, regardless of the answer, it is likely a bad idea to rely on this behavior in production code. I am asking this question primarily out of a desire to improve my understanding of the language.
[Edit] @SergeyA cited this section of the standard. I've included it here for easy reference (emphasis mine):
5.3.1/1 [expr.unary.op]
The unary * operator performs indirection: the expression to which it is applied shall be a pointer to an object type, or a pointer to a function type and the result is an lvalue referring to the object or function to which the expression points. If the type of the expression is “pointer to T,” the type of the result is “T.” [Note: indirection through a pointer to an incomplete type (other than cv void) is valid. The lvalue thus obtained can be used in limited ways (to initialize a reference, for example); this lvalue must not be converted to a prvalue, see 4.1. — end note ]
Tracing the reference to 4.1, we find:
4.1/1 [conv.lval]
A glvalue (3.10) of a non-function, non-array type T can be converted to a prvalue. If T is an incomplete type, a program that necessitates this conversion is ill-formed. If T is a non-class type, the type of the prvalue is the cv-unqualified version of T. Otherwise, the type of the prvalue is T.
When an lvalue-to-rvalue conversion is applied to an expression e, and either:
e is not potentially evaluated, or
the evaluation of e results in the evaluation of a member ex of the set of potential results of e, and ex names a variable x that is not odr-used by ex (3.2)
the value contained in the referenced object is not accessed.
I think our answer lies in whether *a satisfies the second bullet point. I am having trouble parsing that condition, so I am not sure.
char* x = reinterpret_cast<char*>(&f); is valid. Or, more specifically, access through x is allowed - the cast itself is always valid.
A* a = reinterpret_cast<A*>(new char[5]) is not valid - or, to be precise, access through a will trigger undefined behaviour.
The reason for this is that while it's OK to access an object through a char*, it's not OK to access an array of chars through a pointer to an arbitrary object type. The Standard allows the first, but not the second.
Or, in layman's terms, you can alias a type* through char*, but you can't alias char* through type*.
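The permitted direction can be sketched like this: an existing object is inspected byte by byte through unsigned char*. The helper name copy_bytes is hypothetical, and an int is used only to keep the example small.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copies the object representation of an int one byte at a time.
// Reading through unsigned char* is covered by the last bullet of the
// aliasing rule quoted in the question.
inline void copy_bytes(const int& value, unsigned char (&out)[sizeof(int)]) {
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&value);
    for (std::size_t i = 0; i < sizeof(int); ++i) {
        out[i] = bytes[i];
    }
}
```

The copied bytes match what std::memcpy would produce for the same object. The reverse, reading a char array through an int* or an A*, has no such permission.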
EDIT
I just noticed I didn't answer the direct question ("What constitutes "access" in the C++ standard"). Apparently, the Standard does not define access (at least, I was not able to find a formal definition), but dereferencing the pointer is commonly understood to qualify as access.
This question is inspired by comments here.
Consider the following code snippet:
struct X {}; // no virtual members
struct Y : X {}; // may or may not have virtual members, doesn't matter
Y* func(X* x) { return dynamic_cast<Y*>(x); }
Several people suggested that their compiler would reject the body of func.
However, it appears to me that whether this is defined by the Standard depends on the run-time value of x. From section 5.2.7 ([expr.dynamic.cast]):
The result of the expression dynamic_cast<T>(v) is the result of
converting the expression v to type T. T shall be a pointer or
reference to a complete class type, or "pointer to cv void." The
dynamic_cast operator shall not cast away constness.
If T is a pointer type, v shall be a prvalue of a pointer to complete class
type, and the result is a prvalue of type T. If T is an lvalue
reference type, v shall be an lvalue of a complete class type, and the
result is an lvalue of the type referred to by T. If T is an rvalue
reference type, v shall be an expression having a complete class type,
and the result is an xvalue of the type referred to by T.
If the
type of v is the same as T, or it is the same as T except that the
class object type in T is more cv-qualified than the class object type
in v, the result is v (converted if necessary).
If the value of v is a null pointer value in the pointer case, the result is the null pointer value of type T.
If T is "pointer to cv1 B" and v has type
"pointer to cv2 D" such that B is a base class of D, the result is a
pointer to the unique B subobject of the D object pointed to by v.
Similarly, if T is "reference to cv1 B" and v has type cv2 D such that
B is a base class of D, the result is the unique B subobject of the D
object referred to by v. The result is an lvalue if T is an lvalue
reference, or an xvalue if T is an rvalue reference. In both the
pointer and reference cases, the program is ill-formed if cv2 has
greater cv-qualification than cv1 or if B is an inaccessible or
ambiguous base class of D.
Otherwise, v shall be a pointer to or an lvalue of a polymorphic type.
If T is
"pointer to cv void," then the result is a pointer to the most derived
object pointed to by v. Otherwise, a run-time check is applied to see
if the object pointed or referred to by v can be converted to the type
pointed or referred to by T. The most derived object pointed
or referred to by v can contain other B objects as base classes, but
these are ignored.
If C is the class type to which T points or refers,
the run-time check logically executes as follows:
If, in the most
derived object pointed (referred) to by v, v points (refers) to a
public base class subobject of a C object, and if only one object of
type C is derived from the subobject pointed (referred) to by v the
result points (refers) to that C object.
Otherwise, if v points
(refers) to a public base class subobject of the most derived object,
and the type of the most derived object has a base class, of type C,
that is unambiguous and public, the result points (refers) to the C
subobject of the most derived object.
Otherwise, the run-time check
fails.
The value of a failed cast to pointer type is the null
pointer value of the required result type. A failed cast to reference
type throws std::bad_cast.
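To make the quoted run-time rules concrete, here is a minimal sketch of the cases as I understand them. The types Left, Right, Most and Other and the helper functions are hypothetical names invented for this illustration; they are not taken from the Standard or from my original snippet.

```cpp
#include <cassert>
#include <typeinfo>  // std::bad_cast

// Hypothetical polymorphic hierarchy, used only for illustration.
struct Left  { virtual ~Left() {} };
struct Right { virtual ~Right() {} };
struct Most  : Left, Right {};  // most derived type in the first two cases
struct Other : Left {};         // has no Right base at all

// Down-cast: v points to a public base class subobject of a Most object.
inline bool downcast_succeeds() {
    Most m;
    Left* v = &m;
    return dynamic_cast<Most*>(v) == &m;
}

// Cross-cast: the most derived object has an unambiguous public Right base.
inline bool crosscast_succeeds() {
    Most m;
    Left* v = &m;
    return dynamic_cast<Right*>(v) == static_cast<Right*>(&m);
}

// Failed cast to pointer type: the result is the null pointer value.
inline bool failed_pointer_cast_is_null() {
    Other o;
    Left* v = &o;
    return dynamic_cast<Right*>(v) == nullptr;
}

// Failed cast to reference type: std::bad_cast is thrown.
inline bool failed_reference_cast_throws() {
    Other o;
    Left& v = o;
    try {
        (void)dynamic_cast<Right&>(v);
    } catch (const std::bad_cast&) {
        return true;
    }
    return false;
}
```

All four helpers are expected to return true on a conforming implementation with RTTI enabled.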
The way I read this, the requirement of a polymorphic type only applies if none of the above conditions are met, and one of those conditions depends on the runtime value.
Of course, in a few cases the compiler can positively determine that the input cannot properly be NULL (for example, when it is the this pointer), but I still think the compiler cannot reject the code unless it can determine that the statement will be reached (normally a run-time question).
A warning diagnostic is of course valuable here, but is it Standard-compliant for the compiler to reject this code with an error?
A very good point.
Note that in C++03 the wording of 5.2.7/3 and 5.2.7/4 is as follows:
3 If the type of v is the same as the required result type (which, for
convenience, will be called R in this description), or it is the same
as R except that the class object type in R is more cv-qualified than
the class object type in v, the result is v (converted if necessary).
4 If the value of v is a null pointer value in the pointer case, the
result is the null pointer value of type R.
The reference to type R introduced in 5.2.7/3 seems to imply that 5.2.7/4 is intended to be a sub-clause of 5.2.7/3. In other words, it appears that 5.2.7/4 is intended to apply only under the conditions described in 5.2.7/3, i.e. when types are the same.
However, the wording in C++11 is different and no longer involves R, which no longer suggests any special relationship between 5.2.7/3 and 5.2.7/4. I wonder whether it was changed intentionally...
I believe the intention of that wording is that some casts can be done at compile-time, e.g. upcasts or dynamic_cast<Y*>((X*)0), but that others need a run-time check (in which case a polymorphic type is needed.)
If your code snippet were well-formed, it would need a run-time check to see whether v is a null pointer value, which contradicts the idea that a run-time check should only happen in the polymorphic case.
See DR 665 which clarified that certain casts are ill-formed at compile-time, rather than postponed to run-time.
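As a rough illustration of a cast that needs no run-time type check, here is a sketch that assumes X is derived from Y (the relationship is my assumption; X and Y are just the placeholder names from the expression above, and the helper functions are made up for the example). Neither type is polymorphic, yet the up-cast is accepted, including for a null pointer.

```cpp
#include <cassert>
#include <type_traits>

// Assumed relationship for this sketch: Y is a public base of X.
struct Y { int y; };
struct X : Y { int x; };

static_assert(!std::is_polymorphic<X>::value, "X has no virtual functions");
static_assert(!std::is_polymorphic<Y>::value, "Y has no virtual functions");

// Up-cast of a null pointer: accepted even though X is not polymorphic.
inline Y* upcast_null() {
    return dynamic_cast<Y*>(static_cast<X*>(nullptr));
}

// Up-cast of a real object: yields the same Y subobject as static_cast.
inline bool upcast_matches_static_cast(X& obj) {
    return dynamic_cast<Y*>(&obj) == static_cast<Y*>(&obj);
}
```

Reversing the direction (dynamic_cast<X*> applied to a Y*) is the case under discussion, and compilers such as GCC and Clang reject it because Y is not polymorphic.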
To me, it seems pretty clear cut. I think the confusion comes when you make the wrong interpretation that the enumeration of requirements is an "else if .. else if .." type of thing.
Points (1) and (2) simply define what the static input and output types are allowed to be, in terms of cv-qualification and value category (lvalue, xvalue, prvalue). So that's trivial and applies to all cases.
Point (3) is pretty clear: if both the input and output type are the same (added cv-qualifiers aside), then the conversion is trivial (none, or just added cv-qualifiers).
Point (4) clearly requires that if the input pointer is null, then the output pointer is null too. This point needs to be made as a requirement, not as a matter of rejecting or accepting the cast (via static analysis), but as a matter of stressing the fact that if the conversion from input pointer to output pointer would normally entail an offset to the actual pointer value (as it can, under multiple-inheritance class hierarchies), then that offset must not be applied if the input pointer is null, in order to preserve the "nullness" of the pointer. This just means that when the dynamic-cast is performed, the pointer is checked for nullity, and if it is null, the resulting pointer must also have a null-value.
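A small sketch of that null-preservation rule, using a hypothetical multiple-inheritance hierarchy (First, Second, Both and the two helpers are names made up for this example). On common ABIs the Second subobject sits at a non-zero offset inside Both, which is exactly the situation where a naive "add the offset" conversion would turn a null pointer into a non-null one.

```cpp
#include <cassert>

// Hypothetical hierarchy: Second is typically not at offset 0 inside Both.
struct First  { int f; virtual ~First() {} };
struct Second { int s; virtual ~Second() {} };
struct Both : First, Second {};

// Derived-to-base conversion: null in, null out (no offset applied).
inline Second* to_second(Both* p) {
    return dynamic_cast<Second*>(p);
}

// Cross-cast from one base to the other: a null input also gives null.
inline Second* cross_to_second(First* p) {
    return dynamic_cast<Second*>(p);
}
```

For a non-null argument both helpers return the address of the Second subobject, the same pointer that static_cast<Second*> would produce from a Both*.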
Point (5) simply states that if it is an upcast (from derived to base), then the cast is resolved statically (equivalent to static_cast<T>(v)). This is mostly to handle the case (as the footnote indicates) where the upcast itself is well-formed, but a lookup through the most-derived object pointed to by v could be ambiguous (e.g., if v actually points to a derived object with multiple base classes in which the class T appears more than once). In other words: if it's an upcast, do it statically, without a run-time mechanism (thus avoiding a potential failure where none should happen). In this case, the compiler should reject the cast on the same basis as it would reject a static_cast<T>(v).
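Here is a sketch of the situation the footnote has in mind, with a hypothetical hierarchy (Base, MidA, MidB, Most and the helper names are invented for the example). The most derived object contains two Base subobjects, but each up-cast below only looks at the static type of its operand, so it is unambiguous.

```cpp
#include <cassert>

// Hypothetical hierarchy: Most contains two distinct Base subobjects.
struct Base { virtual ~Base() {} };
struct MidA : Base {};
struct MidB : Base {};
struct Most : MidA, MidB {};

// Up-cast from MidA*: MidA has exactly one Base, so this resolves statically.
inline Base* up_from_mid_a(MidA* p) {
    return dynamic_cast<Base*>(p);
}

// Same for MidB*: the other Base inside the Most object is ignored.
inline Base* up_from_mid_b(MidB* p) {
    return dynamic_cast<Base*>(p);
}

// By contrast, dynamic_cast<Base*> applied directly to a Most* would be
// ill-formed, because Base is an ambiguous base class of Most.
```

Passing the address of the same Most object to both helpers gives two different, non-null Base pointers, one per subobject.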
In Point (6), the "otherwise" clearly refers directly to Point (5) (and surely to the trivial case of Point (3)). Together with Point (7), this means that if the cast is neither an upcast nor an identity-cast (Point (3)), then it is a down-cast, and it should be resolved at run-time, with the explicit requirement that the type of v be a polymorphic type (one that has at least one virtual function).
Your code should be rejected by a standard-compliant compiler. To me, there's no doubt about it, because the cast is a down-cast and the type of v is not polymorphic. It doesn't meet the requirements set out by the standard. The null-pointer clause (Point (4)) really has nothing to do with whether the code is accepted or not; it just has to do with preserving a null pointer value across the cast (otherwise, some implementations could make the (stupid) choice to still apply the pointer offset of the cast even if the value is null).
Of course, they could have made a different choice, and allowed the cast to behave as a static-cast from base to derived (i.e., without a run-time check), when the base type is not polymorphic, but I think that breaks the semantics of the dynamic-cast, which is clearly to say "I want a run-time check on this cast", otherwise you wouldn't use a dynamic-cast!
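To illustrate the difference, here is a minimal sketch with two hypothetical hierarchies, one non-polymorphic and one polymorphic (all type and function names are made up for the example). The commented-out line is the kind of cast that I would expect a conforming compiler to reject.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical non-polymorphic hierarchy.
struct Plain        { int tag; };
struct PlainDerived : Plain { int extra; };

// Hypothetical polymorphic hierarchy.
struct Poly        { virtual ~Poly() {} };
struct PolyDerived : Poly {};
struct PolyOther   : Poly {};

static_assert(!std::is_polymorphic<Plain>::value, "Plain has no virtual functions");
static_assert(std::is_polymorphic<Poly>::value, "Poly has a virtual destructor");

// Unchecked down-cast: the caller must already know that p points into a
// PlainDerived object; nothing is verified at run time.
inline PlainDerived* unchecked_down(Plain* p) {
    return static_cast<PlainDerived*>(p);
    // return dynamic_cast<PlainDerived*>(p);  // ill-formed: Plain is not polymorphic
}

// Checked down-cast: yields null when the object is not a PolyDerived.
inline PolyDerived* checked_down(Poly* p) {
    return dynamic_cast<PolyDerived*>(p);
}
```

With these helpers, checked_down returns null for a PolyOther object or a null input, while unchecked_down simply trusts the caller.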