Consider following program:
struct A{};
int main()
{
A a;
A b = a;
A c = reinterpret_cast<A>(a);
}
The compiler(g++14) throws an error about invalid cast from type 'A' to type 'A'.
Why is casting to the same type invalid?
It is not allowed, because the standard says so.
There is a rather limited set of allowed conversion that you can do with reinterpret_cast. See eg cppreference. For example the first point listed there is:
1) An expression of integral, enumeration, pointer, or
pointer-to-member type can be converted to its own type. The resulting
value is the same as the value of expression. (since C++11)
However, casting a custom type (no pointer!) to itself is not on the list. Why would you do that anyhow?
Because reinterpret_cast cannot be used for classes and structures, it should be used to reinterpret pointers, references and integral types. It is very well explained in cpp reference
So, in your case one possible valid expression would be reinterpret_cast<A*>(&a)
reinterpret_cast should be used to pointers, references and integral types.
I don't know, Why someone do that.
But Still you want do.You can do like.
A *d = reinterpret_cast<A*>(&a);
or
A c = reinterpret_cast<A&>(a);
Related
I suspect this is true for all primitive types in C/C++.
For example, if you do this:
((unsigned int*)0x1234) = 1234;
The compiler will not let it pass. Whereas if you do this
((data_t*)0x1234 )->s = 1234;
where data_t is a struct, the compiler allows it.
This seems to be the case for at least two compilers I experimented on, one ARM GCC, one TDM-GCC.
Why is this?
The first code snippet doesn't work because the left hand side is not an lvalue. It is only a pointer value, and pointers by themselves are not lvalues.
The second code snippet works because a pointer is being dereferenced, and a dereferenced pointer is an lvalue. It may not be immediately clear from the syntax this is the case, so let's rewrite this:
((data_t*)0x1234 )->s = 1234;
As:
(*(data_t*)0x1234).s = 1234;
Now we can see that the value which is casted to a pointer is dereferenced to an lvalue of struct type, and a member of that struct is subsequently accessed and assigned to.
This is described in section 6.5.2.3p4 of the C standard regarding the -> operator:
A postfix expression followed by the -> operator and an identifier
designates a member of a structure or union object. The value
is that of the named member of the object to which the first
expression points, and is an lvalue. If the first expression is a
pointer to a qualified type, the result has the so-qualified
version of the type of the designated member.
Regarding the first snippet, section 6.5.4p5 regarding the typecast operator states:
Preceding an expression by a parenthesized type name converts
the value of the expression to the named type. This construction is
called a cast. 104) A cast that specifies no conversion has
no effect on the type or value of an expression.
Where footnote 104 states:
A cast does not yield an lvalue. Thus, a cast to a qualified
type has the same effect as a cast to the unqualified version
of the type.
So this describes why the first snippet won't compile but the second snippet will.
However, treating an arbitrary value as a pointer and dereferencing it is implementation defined behavior at best, and most likely undefined behavior.
Your examples are:
((unsigned int*)0x1234) = 1234;
((data_t*)0x1234 )->s = 1234;
Neither ((unsigned int*)0x1234) nor ((data_t*)0x1234 ) is an lvalue, and you can't assign to either of them.
More generally, the prefix of -> doesn't have to be an lvalue. But prefix->member is always an lvalue, whether prefix is or not. Similarly, *p is an value whether p is an lvalue or not.
I know that reinterpret_cast is primarily used going to or from a char*.
But I was surprised to find that static_cast could do the same with a void*. For example:
auto foo "hello world"s;
auto temp = static_cast<void*>(&foo);
auto bar = static_cast<string*>(temp);
What do we gain from using reinterpret_cast and char* over static_cast and void*? Is it something to do with the strict aliasing problem?
Generally speaking, static_cast will do cast any two types if one of them can be cast to the other implicitly. That includes arithmetic casts, down-casts, up-casts and cast to and from void*.
That is, if this cast is valid:
void foo(A a);
B b;
foo(b);
Then the both static_cast<B>(a) and static_cast<A>(b) will also be valid.
Since any pointer can be cast implicitly to void*, thus your peculiar behavior.
reinterpret_cast do cast by reinterpreting the bit-pattern of the values. That, as you said in the question, is usually done to convert between unrelated pointer types.
Yes, you can convert between unrelated pointer types through void*, by using two static_cast:
B *b;
A *a1 = static_cast<A*>(b); //compiler error
A *a2 = static_cast<A*>(static_cast<void*>(b)); //it works (evil laugh)!
But that is bending the rules. Just use reinterpret_cast if you really need this.
Your question really has 2 parts:
Should I use static_cast or reinterpret_cast to work with a pointer to the underlying bit pattern of an object without concern for the object type?
If I should use reinterpret_cast is a void* or a char* preferable to address this underlying bit pattern?
static_cast: Converts between types using a combination of implicit and user-defined conversions
In 5.2.9[expr.static.cast]13 the standard, in fact, gives the example:
T* p1 = new T;
const T* p2 = static_cast<const T*>(static_cast<void*>(p1));
It leverages the implicit cast:
A prvalue pointer to any (optionally cv-qualified) object type T can be converted to a prvalue pointer to (identically cv-qualified) void. The resulting pointer represents the same location in memory as the original pointer value. If the original pointer is a null pointer value, the result is a null pointer value of the destination type.*
There is however no implicit cast from a pointer of type T to a char*. So the only way to accomplish that cast is with a reinterpret_cast.
reinterpret_cast: Converts between types by reinterpreting the underlying bit pattern
So in answer to part 1 of your question when you cast to a void* or a char* you are looking to work with the underlying bit pattern, reinterpret_cast should be used because it's use denotes to the reader a conversion to/from the underlying bit pattern.
Next let's compare void* to char*. The decision between these two may be a bit more application dependent. If you are going to use a standard library function with your underlying bit pattern just use the type that function accepts:
void* is used in the mem functions provided in the cstring library
read and write use char* as inputs
It's notable that C++ specific libraries prefer char* for pointing to memory.
Holding onto memory as a void* seems to have been preserved for compatibility reasons as pointer out here. So if a cstring library function won't be used on your underlying bit patern, use the C++ specific libraries behavior to answer part 2 of your question: Prefer char* to void*.
The C++ standards mentions that reinterpret_cast is implementation defined, and doesn't give any guarantees except that casting back (using reinterpret_cast) to original type will result in original value passed to first.
C-style casting of at least some types behaves much the same way - casting back and forth results with the same value - Currently I am working with enumerations and ints, but there are some other examples as well.
While C++ standard gives those definitions for both cast-styles, does it also give the same guarantee for mixed casts? If library X returns from function int Y() some enum value, can use any of above casts, without worrying what cast was used to convert initial enum to int in Y's body? I don't have X's source code, so I cannot check (and it can change with next version anyway), and things like that are hardly mentioned in documentation.
I know that under most implementations in such cases both casts behave the same; my question is: what does C++ standard say about such cases - if anything at all.
C++ defines the semantic of the C cast syntax in terms of static_cast, const_cast and reinterpret_cast. So you get the same guaranteed for the same operation whatever syntax you use to achieve it.
reinterpret_cast can only be used for specific conversions:
Pointer to (sufficiently large) integer, and the reverse
Function pointer to function pointer
Object pointer to object pointer
Pointer-to-member to pointer-to-member
lvalue expression to reference
plus (conditionally) function pointer to object pointer and the reverse. In most cases, the converted value is unspecified, but there is a guarantee that a conversion followed by its reverse will yield the original value.
In particular, you can't use reinterpret_cast to convert between integer an enumeration types; the conversion must be done using static_cast (or implicitly, when converting an unscoped enumeration to an integer type), which is well defined for sufficiently large integer types. The only possible problem is if the library did something completely insane such as return reinterpret_cast<int&>(some_enum);
A C-style cast will perform either a static_cast or a reinterpret_cast, followed by a const_cast, as necessary; so any conversion that's well-defined by static_cast is also well-defined by a C-style cast.
No, reinterpret_cast is not equivalent to a C style cast. C style casts allow casting away const-volatile (so it includes the functionality of const_cast) not allowed in reinterpret_cast. If static_cast is allowed between the source and destination types, it will perform a static_cast which has different semantics than reinterpret_cast. It the conversion is not allowed, it will fallback to reinterpret_cast. Finally there is a corner case where the C cast cannot be represented in terms of any of the other casts: it ignores access specifiers.
Some examples that illustrate differences:
class b0 { int a; };
class b1 { int b; };
class b2 { int c; };
class d : public b0, public b1, b2 {};
int main() {
d x;
assert( static_cast<b1*>(&x) == (b1*)&x );
assert( reinterpret_cast<b1*>(&x) != (b1*)&x ); // Different value
assert( reinterpret_cast<b2*>(&x) != (b2*)&x ); // Different value,
// cannot be done with static_cast
const d *p = &x;
// reinterpret_cast<b0*>(p); // Error cannot cast const away
(b0*)p; // C style can
}
If I remember right a C-Style conversion is nothing more than an ordered set of conversions static_cast, dynamic_cast, reinterpret_cast, static_cast...,
consider:
enum NUMBERS
{
NUMBER_ONE,
NUMBER_TWO
};
void Do( NUMBERS a )
{
}
int _tmain(int argc, _TCHAR* argv[])
{
unsigned int a = 1;
Do( a ); //C2664
return 0;
}
a C-Style conversion would do
Do( (NUMBERS)a );
What I would like to know is, what is the correct non C-Style conversion to be made, why?
The correct way would be:
static_cast<NUMBERS>(a)
Because:
8) Integer, floating-point, or enumeration type can be converted to
any enumeration type (the result is unspecified if the value of
expression, converted to the enumeration's underlying type, is not one
of the target enumeration values)
Source: http://en.cppreference.com/w/cpp/language/static_cast
dynamic_cast does runtime checks using RTTI, so it only applies to classes with at least one virtual method.
reinterprest_cast is designed to tell the compiler to treat a specific piece of memory as some different type, without any actual runtime conversions.
static_cast<NUMBERS>(a)
Because the specification for static_cast includes this:
A value of integral or enumeration type can be explicitly converted to
an enumeration type. The value is unchanged if the original value is
within the range of the enumeration values (7.2). Otherwise, the
resulting value is unspecified (and might not be in that range). A
value of floating-point type can also be converted to an enumeration
type. The resulting value is the same as converting the original value
to the underlying type of the enumeration (4.9), and subsequently to
the enumeration type.
static_cast is for generally taking values of one type and getting that value represented as another type. There are about two pages of specification covering all the details, but if you understand that basic idea, then you'll probably know when you want to use static_cast. E.g., if you want to convert an integral value from one integral type to another or if you want to convert a floating point value to an integral value.
dynamic_cast is for working with dynamic types, which is mostly only useful for polymorphic, user-defined types. E.g., convert Base * to Derived * iff the referenced object actually is a Derived.
reinterpret_cast is typically for taking the representation of a value in one type, and getting the value that has the same representation in another type; i.e., type punning (even though a lot of type punning is actually not legal in C++). E.g., if you want to access an integer's storage as an array of char.
const_cast is for adding and removing cv qualifiers (const and volatile) at any level of a type. E.g., int const * volatile **i; const_cast<int volatile ** const *>(i);
static_cast<> is the way to go:
Here's why
I had a discussion with someone on IRC and this question turned up. We are allowed by the Standard to change an object of type int by a char lvalue.
int a;
char *b = (char*) &a;
*b = 0;
Would we be allowed to do this in the opposite direction, if we know that the alignment is fine?
The issue I'm seeing is that the aliasing rule does not cover the simple case of the following, if one considers the aliasing rule as a non-symmetric relation
int a;
a = 0;
The reason is, that each object contains a sequence of sizeof(obj) unsigned char objects (called the "object representation"). If we change the int, we will change some or all of those objects. However, the aliasing rule only states we are allowed to change a int by an char or unsigned char, but not the other way around. Another example
int a[1];
int *ra = a;
*ra = 0;
Only one direction is described by 3.10/15 ("An aggregate or union type that includes..."), but this time we need the other way around ("A type that is the element or non-static data member type of an aggregate...").
Is the other direction implied? This question also applies to C.
The aliasing rule simply states that there's one "effective type" (C99 6.5.7, plus footnote 73) for any given object in memory, and any accesses to such an object go through one of:
A type compatible with the effective type (qualifiers such as const and restrict, as well signed/unsigned-ness may vary)
A struct or union containing one such type
A character type
The effective type isn't specified in advanced, of course - it's just a construct that is used to specify aliasing. But the intent is simply that you don't access the same object with two different non-character types.
So the answer is, yes, you can indeed go the other direction.
The standard (C99 6.3.2.3 §7) defines pointer casts like this as "just fine", the casted pointer will point at the same address. (Unless the CPU has alignment which makes the cast impossible, then it is undefined behavior.)
That is, the actual cast in itself is fine. What will happen if you start to manipulate the data... now that's another implementation-defined story.
Here's from the standard:
"A pointer to an object or incomplete type may be converted to a pointer to a different object or incomplete type. If the resulting pointer is not correctly aligned (57) for the pointed-to type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer.
When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of the object. Successive increments of the result, up to the size of the object, yield pointers
to the remaining bytes of the object."
"57) In general, the concept ‘‘correctly aligned’’ is transitive: if a pointer to type A is correctly aligned for a pointer to type B, which in turn is correctly aligned for a pointer to type C, then a pointer to type A is
correctly aligned for a pointer to type C."
I think I'm a bit confused by the question, but the 2nd and 3rd examples are accessing the int through an lvalue that has the object's type (int in the examples).
C++ 3.10/15 states as it's first item that it's OK to access the object through an lvalue that has the the type of "the dynamic type of the object".
What am I misunderstanding in the question?