reinterpret_cast - bizarre behaviour

I've come across a bizarre error related to reinterpret_cast. Just look at the code below:
int* var;
reinterpret_cast<void const **>(&var);
error in VSC++2010: error C2440: 'reinterpret_cast' : cannot convert from 'int ** ' to 'const void ** '
error in gcc 4.1.2: reinterpret_cast from type ‘int** ’ to type ‘const void** ’ casts away constness
error in gcc 4.6.2: reinterpret_cast from type ‘int** ’ to type ‘const void** ’ casts away qualifiers
Does anyone have a clue why the compilers say that I'm casting const away? A few of my work colleagues and I have no idea what's wrong with it.
Thanks for the help!

Section 5.2.10 of the C++03 standard talks about what a reinterpret_cast can do. It explicitly states "The reinterpret_cast operator shall not cast away constness".
Casting away constness is defined in section 5.2.11 of the C++03 standard. The notation used there is a little confusing, but it basically states that a cast between two pointer types "casts away constness" if the change of qualifiers it makes, taken on its own, could not be done by an implicit conversion.
In your case, you are trying to convert an int ** to a void const**. The compiler asks "Can I implicitly convert between T ** and T const**?", and the answer is no, so it says that you are casting away constness.
The logic here is that reinterpret_cast is made to handle changing types, not changing qualifiers (that's what const_cast is for). So if you are asking it to do something you would need const_cast for, it refuses.
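A minimal sketch of that rule, written as compile-time checks. The names adds_const_at_every_level, adds_const_only_inside and as_const_void_pp are made up for illustration; the cast attempted in the question is left in a comment because it does not compile.

```cpp
#include <type_traits>

// The implicit qualification conversions are the yardstick: if the qualifier
// change alone could be done implicitly, the cast does not cast away constness.
constexpr bool adds_const_at_every_level =
    std::is_convertible<int**, int const* const*>::value;  // allowed implicitly
constexpr bool adds_const_only_inside =
    std::is_convertible<int**, int const**>::value;        // not allowed implicitly

static_assert(adds_const_at_every_level, "int** -> int const* const* is implicit");
static_assert(!adds_const_only_inside, "int** -> int const** is not implicit");

// Accepted: the qualifier change matches an implicit conversion, so
// reinterpret_cast only has to change int into void.
inline void const* const* as_const_void_pp(int** p) {
    return reinterpret_cast<void const* const*>(p);
}

// Rejected (casts away qualifiers), as in the question:
//   reinterpret_cast<void const**>(p);
```

Adding the second const is what turns the rejected cast into an accepted one, because int** converts implicitly to int const* const* but not to int const**.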

To add/remove const, use const_cast.
To deal with confusing casting errors, do things one step at a time:
int* var;
int** v2 = &var;
int const** v3 = const_cast<int const**>(v2);
void const** v4 = reinterpret_cast<void const**>(v3);
Note that an int const** and an int** are very different types, and converting between them is dangerous, more dangerous than void* <-> int*.
Suppose you have an int** bob. You then pass it, through a const_cast, to a function that takes an int const** alice.
In that function, the address of an int stored in read-only memory is assigned to *alice, which is perfectly legal.
Outside the function, you check that bob and *bob are valid, then assign to **bob, and you have just tried to write to read-only memory.
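The bob/alice scenario can be sketched as follows. All names here (read_only_value, store_const_address, bob_points_at_constant) are hypothetical, and the final write is left commented out because executing it would be undefined behaviour.

```cpp
// A constant that an implementation is free to place in read-only storage.
static const int read_only_value = 42;

// Hypothetical callee: storing the address of a const int through an
// int const** is perfectly legal.
inline void store_const_address(int const** alice) {
    *alice = &read_only_value;
}

// Returns true if, after the call, the caller's plain int* points at the constant.
inline bool bob_points_at_constant() {
    int writable = 0;
    int* target = &writable;
    int** bob = &target;

    store_const_address(const_cast<int const**>(bob));

    // **bob = 1;  // would compile, but writes to a const object: undefined behaviour
    return *bob == &read_only_value;
}
```

The type system can no longer catch the bad write, because *bob still has type int*.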

Related

Possible implementation of std::addressof [duplicate]

Having previously been unaware of the existence of std::addressof, why it exists makes sense to me: as a way of taking an address in the presence of an overloaded operator&. The implementation, however, is slightly more opaque. From gcc 4.7.1:
template<typename _Tp>
inline _Tp*
__addressof(_Tp& __r) _GLIBCXX_NOEXCEPT
{
return reinterpret_cast<_Tp*>
(&const_cast<char&>(reinterpret_cast<const volatile char&>(__r)));
}
The reinterpret_cast<_Tp*> is obvious. The rest of it is dark magic. Can someone break down how this actually works?

First you have __r which is of type _Tp&
It is reinterpret_cast'ed to a char& in order to ensure being able to later take its address without fearing an overloaded operator& in the original type; actually it is cast to const volatile char& because reinterpret_cast can always legally add const and volatile qualifiers even if they are not present, but it can't remove them if they are present (this ensures that whatever qualifiers _Tp had originally, they don't interfere with the cast).
This is const_cast'ed to just char&, removing the qualifiers (legally now! const_cast can do what reinterpret_cast couldn't with respect to the qualifiers).
The address is taken with & (now we have a plain char*).
It is reinterpret_cast'ed back to _Tp* (which includes the original const and volatile qualifiers if any).
Edit: since my answer has been accepted, I'll be thorough and add that the choice of char as an intermediate type is due to alignment issues, in order to avoid triggering Undefined Behaviour. See @JamesKanze's comments (under the question) for a full explanation. Thanks James for explaining it so clearly.
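To see the chain do its job, here is a small sketch. my_addressof repeats the cast chain from the question under a made-up name, and Sneaky is a hypothetical type whose operator& deliberately lies.

```cpp
#include <memory>

// Same cast chain as the libstdc++ snippet, under a hypothetical name.
template <typename T>
T* my_addressof(T& r) noexcept {
    return reinterpret_cast<T*>(
        &const_cast<char&>(reinterpret_cast<const volatile char&>(r)));
}

// Hypothetical type whose overloaded operator& hides the real address.
struct Sneaky {
    int value;
    Sneaky* operator&() { return nullptr; }
};
```

For a Sneaky object s, &s yields a null pointer, while my_addressof(s) should agree with std::addressof(s).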

It's actually quite simple when you think about it: to get the real address of an object/function in the presence of an overloaded operator&, you need to treat the object as something other than what it really is, some type which cannot have an overloaded operator, i.e. an intrinsic type (such as char).
A char has the weakest possible alignment requirement and can reside anywhere any other object can; with that said, casting an object to a reference to char is a very good start.
But what about the black magic involved when doing reinterpret_cast<const volatile char&>?
In order to reinterpret the returned pointer from the implementation of addressof we will eventually want to discard qualifiers such as const and volatile (to end up with a plain reference char). These two can be added easily with reinterpret_cast, but asking it to remove them is illegal.
T1 const a; reinterpret_cast<T2&> (a);
/* error: reinterpret_cast from type ‘...’ to type ‘...’ casts away qualifiers */
It's a little bit of a "better safe than sorry" trick: "Let us add them, just in case; we will remove them later."
Later we cast away the qualifiers (const and volatile) with const_cast<char&> to end up with a plain reference to char. As the final step, the address of this result is turned back into a pointer to whatever type we passed into our implementation.
A relevant question at this stage is why we didn't skip the reinterpret_cast and go directly to the const_cast. This too has a simple answer: const_cast can add/remove qualifiers, but it cannot change the underlying type.
T1 a; const_cast<T2&> (a);
/* error: invalid const_cast from type ‘T1*’ to type ‘T2*’ */
it might not be easy as pie, but it sure tastes good when you get it..

The short version:
operator& can't be overloaded for char. So the type is being cast to a char reference to get what's guaranteed to be the true address.
That conversion is done in two casts because of the restrictions on const_cast and reinterpret_cast.
The longer version:
It's performing three sequential casts.
reinterpret_cast<const volatile char&>
This is effectively casting to a char&. The const and volatile only exist because _Tp may be const or volatile, and reinterpret_cast can add those, but would be unable to remove them.
const_cast<char&>
Now the const and volatile have been removed. const_cast may do that.
reinterpret_cast<_Tp*>(&result)
Now the address is taken and the type is converted back to a pointer to the original type.
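The three casts can also be written one per statement. This is only a sketch; addressof_in_steps is a hypothetical name and not the library's code.

```cpp
#include <type_traits>

// Hypothetical helper: the same chain, one cast per statement.
template <typename T>
T* addressof_in_steps(T& r) noexcept {
    // 1. View the object as a char; const volatile is added so the cast is
    //    legal whatever qualifiers T already carries.
    const volatile char& qualified = reinterpret_cast<const volatile char&>(r);
    // 2. Strip the qualifiers again; only const_cast may do this.
    char& plain = const_cast<char&>(qualified);
    // 3. Take the address with the built-in & and restore the original type,
    //    including any const/volatile that is part of T.
    return reinterpret_cast<T*>(&plain);
}
```

When it is called with a const volatile int, T is deduced as const volatile int, so the returned pointer carries both qualifiers again.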

From inside out:
First it casts __r to a const volatile char&. It's casting to a char& just because that's a type that for sure doesn't have an overloaded operator& that does something funky. The const volatile is there because those are restrictions: they can be added but not taken away with reinterpret_cast. _Tp might already have been const and/or volatile, in which case one or both were needed in this cast. If it wasn't, the cast just added them needlessly, but it is written for the most restrictive case.
Next, to take away the const volatile you need a const_cast, which leads to the next part: const_cast<char&>.
From there they simply take the address and cast it to the type you want, a _Tp*. Note that _Tp might be const and/or volatile, which means those qualifiers could be added back at this point.

const_cast vs reinterpret_cast

Referring to the SO C++ FAQ "When should static_cast, dynamic_cast and reinterpret_cast be used?":
const_cast is used to remove or add const to a variable, and it's the only reliable, defined and legal way to remove constness.
reinterpret_cast is used to change the interpretation of a type.
I understand in a reasonable way, why a const variable should be casted to non-const only using const_cast, but I cannot figure out a reasonable justification of issues using reinterpret_cast instead of const_cast to add constness.
I understand that using reinterpret_cast for even adding constness is not sane but would it be an UB or potential time bomb for using reinterpret_cast to add constness?
The reason I was confused here is because of the statement
Largely, the only guarantee you get with reinterpret_cast is that if
you cast the result back to the original type, you will get the exact
same value.
So if I add constness using reinterpret_cast and if you reinterpret_cast the result back to the original type, it should result back to the original type and should not be UB, but that violates the fact that one should only use const_cast to remove the constness
On a separate note, the standard guarantees that you can add constness using reinterpret_cast:
5.2.10 Reinterpret cast (7) ......When a prvalue v of type “pointer to T1” is converted to the type “pointer to cv T2”, the result is
static_cast<cv T2*>(static_cast<cv void*>(v)) if both T1 and T2 are
standard-layout types (3.9) and the alignment requirements of T2 are
no stricter than those of T1........

reinterpret_cast changes the interpretation of the data within the object. const_cast adds or removes the const qualifier. Data representation and constness are orthogonal. So it makes sense to have different cast keywords.
So if I add constness using reinterpret_cast and if you reinterpret_cast the result back to the original type, it should result back to the original type and should not be UB, but that violates the fact that one should only use const_cast to remove the constness
That wouldn't even compile:
int * n = new int;
const int * const_added = reinterpret_cast<const int *>(n);
int * original_type = reinterpret_cast<int*>(const_added);
// error: reinterpret_cast from type ‘const int*’ to type ‘int*’ casts away qualifiers
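For contrast, a sketch of the pairing that does compile: const is added with reinterpret_cast and removed again with const_cast. The helper name add_then_remove_const is made up for illustration.

```cpp
// Hypothetical helper showing a round trip that the compiler accepts.
inline bool add_then_remove_const() {
    int n = 7;
    // Adding const: reinterpret_cast accepts it (an implicit conversion would also do).
    const int* const_added = reinterpret_cast<const int*>(&n);
    // Removing const: only const_cast is allowed to do this.
    int* original_type = const_cast<int*>(const_added);
    *original_type = 8;  // fine, because n itself was never const
    return original_type == &n && n == 8;
}
```

The write through original_type is well defined only because the underlying object n is not const.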

You shouldn't just be adding const with reinterpret_cast. A reinterpret_cast should be primarily that: reinterpreting the pointer (or whatever).
In other words, if you're going from const char* to char* (hopefully because there's a bad API you can't change), then const_cast is your friend. That's really all it's intended to be.
But if you need to go from MyPODType* to const char*, you need reinterpret_cast, and it's just being nice by not requiring a const_cast on top of it.

There is one thing to keep in mind: You can't use const_cast to make a const variable writable. You can only use it to retrieve a non-const reference from a const reference if that const reference refers to a non-const object. Sounds complicated? Example:
// valid:
int x;
int const& x1 = x;
const_cast<int&>(x1) = 0;
// invalid:
int const y = 42;
int const& y1 = y;
const_cast<int&>(y1) = 0;
In reality, both of these will compile and sometimes even "work". However, the second one causes undefined behaviour and in many cases will terminate the program when the constant object is placed in read-only memory.
That said, a few more things: reinterpret_cast is the most powerful cast, but also the most dangerous one, so don't use it unless you have to. When you need to go from void* to sometype*, use static_cast. When going in the opposite direction, rely on the built-in implicit conversion or use an explicit static_cast, too. The same goes for adding const, which also happens implicitly; only removing it needs const_cast. Concerning reinterpret_cast, see also the discussion at "C++ When should we prefer to use a two chained static_cast over reinterpret_cast", where a less hackish alternative is discussed.
Uli

The only place where I can think of for relating reinterpret_cast with const-ness is when passing a const object to an API that accepts a void pointer -
UINT ThreadFunction(void* param)
{
const MyClass* ptr = reinterpret_cast<const MyClass*>(param);
// ... use ptr ...
return 0;
}

Yes, as you know, const_cast removes constness from a specific type.
But when we need to add constness to a type, is there any reason we have to use a cast for it?
For example:
void PrintAnything(void* pData)
{
const CObject* pObject = reinterpret_cast<CObject*>(pData);
// below is bla-bla-bla.
}
reinterpret_cast has nothing to do with const.
const_cast serves two purposes.
The first is to remove constness from a type, and the second is to make that intent explicit in the code. You could do the same with a C-style cast, but that is not explicit, so it is not recommended.
The two casts do not do the same job; they are definitely different.

Why do we have reinterpret_cast in C++ when two chained static_cast can do its job?

Say I want to cast A* to char* and vice versa; we have two choices (I mean, many of us think we have two choices, because both seem to work! Hence the confusion!):
struct A
{
int age;
char name[128];
};
A a;
char *buffer = static_cast<char*>(static_cast<void*>(&a)); //choice 1
char *buffer = reinterpret_cast<char*>(&a); //choice 2
Both work fine.
//convert back
A *pA = static_cast<A*>(static_cast<void*>(buffer)); //choice 1
A *pA = reinterpret_cast<A*>(buffer); //choice 2
Even this works fine!
So why do we have reinterpret_cast in C++ when two chained static_cast can do its job?
Some of you might think this topic is a duplicate of previous topics, such as those listed at the bottom of this post, but it's not. Those topics discuss the matter only theoretically; none of them gives even a single example demonstrating why reinterpret_cast is really needed, where two static_casts would surely fail. I agree, one static_cast would fail. But how about two?
If the syntax of two chained static_cast looks cumbersome, then we can write a function template to make it more programmer-friendly:
template<class To, class From>
To any_cast(From v)
{
return static_cast<To>(static_cast<void*>(v));
}
And then we can use this, as:
char *buffer = any_cast<char*>(&a); //choice 1
char *buffer = reinterpret_cast<char*>(&a); //choice 2
//convert back
A *pA = any_cast<A*>(buffer); //choice 1
A *pA = reinterpret_cast<A*>(buffer); //choice 2
Also, see this situation where any_cast can be useful: Proper casting for fstream read and write member functions.
So my question basically is,
Why do we have reinterpret_cast in C++?
Please show me even a single example where two chained static_cast would surely fail to do the same job?
Which cast to use; static_cast or reinterpret_cast?
Cast from Void* to TYPE* : static_cast or reinterpret_cast

There are things that reinterpret_cast can do that no sequence of static_casts can do (all from C++03 5.2.10):
A pointer can be explicitly converted to any integral type large enough to hold it.
A value of integral type or enumeration type can be explicitly converted to a pointer.
A pointer to a function can be explicitly converted to a pointer to a function of a different type.
An rvalue of type "pointer to member of X of type T1" can be explicitly converted to an rvalue of type "pointer to member of Y of type T2" if T1 and T2 are both function types or both object types.
Also, from C++03 9.2/17:
A pointer to a POD-struct object, suitably converted using a reinterpret_cast, points to its initial member (or if that member is a bit-field, then to the unit in which it resides) and vice versa.
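Two of the listed conversions, sketched as round trips. The helper names are hypothetical, and std::uintptr_t is assumed to be available (it is an optional type, though common implementations provide it).

```cpp
#include <cstdint>

// Pointer -> integer -> pointer; no chain of static_casts can do this.
inline bool pointer_survives_integer_round_trip() {
    int x = 0;
    std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(&x);
    int* back = reinterpret_cast<int*>(bits);
    return back == &x;
}

using IntFn = int (*)(int);
using AnyFn = void (*)();

inline int twice(int v) { return 2 * v; }

// Function pointer -> different function pointer type -> original type.
// The pointer is only called after it has been converted back.
inline int call_after_round_trip(int v) {
    AnyFn erased = reinterpret_cast<AnyFn>(&twice);
    IntFn restored = reinterpret_cast<IntFn>(erased);
    return restored(v);
}
```

Note that void* is no help for the function pointer case, since function pointers do not convert to void* with static_cast.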

You need reinterpret_cast to get a pointer with a hardcoded address:
int* pointer = reinterpret_cast<int*>( 0x1234 );
You might want such code to access a memory-mapped device input/output port.

A concrete example:
#include <iostream>

template <typename T, int N> // <=--- can match this function
void f(T (&)[N]) { std::cout << "array size " << N << '\n'; }

char a[4] = "Hi\n";
char* p = a; // the array decays to a pointer and the size is lost
f(reinterpret_cast<char (&)[4]>(*p)); // call f after restoring full type
// ^-- any_cast<> can't do this...

Other than the practical reasons that others have given, where there is a difference in what they can do, it's a good thing to have because it's doing a different job.
static_cast is saying please convert data of type X to Y.
reinterpret_cast is saying please interpret the data in X as a Y.
It may well be that the underlying operations are the same, and that either would work in many cases. But there is a conceptual difference between saying please convert X into a Y, and saying "yes I know this data is declared as a X but please use it as if it was really a Y".
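A small sketch of that difference, assuming float and int have the same size and that float uses the usual IEEE-754 format. The helper names convert and same_bytes are made up; the bytes are inspected through unsigned char, which is allowed to alias any object.

```cpp
#include <cstring>

// Conversion: produce the int whose value corresponds to the float.
inline int convert(float f) { return static_cast<int>(f); }

// Reinterpretation: look at the same bytes without changing them.
inline bool same_bytes(float f, int i) {
    static_assert(sizeof(float) == sizeof(int), "sketch assumes equal sizes");
    const unsigned char* fb = reinterpret_cast<const unsigned char*>(&f);
    const unsigned char* ib = reinterpret_cast<const unsigned char*>(&i);
    return std::memcmp(fb, ib, sizeof(float)) == 0;
}
```

Converting 1.0f gives the int 1, yet under the stated assumptions the bytes of 1.0f are not the bytes of the int 1.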

As far as I can tell, your choice 1 (two chained static_casts) is the dreaded undefined behaviour. static_cast only guarantees that casting a pointer to void* and then back to the original pointer type works, in the sense that the pointer resulting from these two conversions still points to the original object. All other conversions are UB. For pointers to objects (instances of user-defined classes), static_cast may alter the pointer value.
As for reinterpret_cast, it only alters the type of the pointer and, as far as I know, it never touches the pointer value.
So technically speaking the two choices are not equivalent.
EDIT: For reference, static_cast is described in section 5.2.9 of the current C++0x draft (sorry, I don't have the C++03 standard; the draft I consider current is n3225.pdf). It describes all allowed conversions, and I guess anything not specifically listed is UB. So it can blow up your PC if it chooses to do so.
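A sketch of where a pointer value can change. Left, Right and Both are hypothetical types; the non-zero offset of the second base is typical of common ABIs, not something the standard promises. The mismatched pointers are only compared, never dereferenced.

```cpp
struct Left  { int l; };
struct Right { int r; };
struct Both : Left, Right {};

// A direct static_cast to the second base usually adjusts the address.
inline bool static_cast_to_base_moves_pointer(Both& b) {
    Right* base = static_cast<Right*>(&b);
    return static_cast<void*>(base) != static_cast<void*>(&b);
}

// Going through void*, or using reinterpret_cast, leaves the address as it is.
inline bool other_casts_keep_address(Both& b) {
    void* raw = static_cast<void*>(&b);
    Right* chained = static_cast<Right*>(raw);
    Right* reinterpreted = reinterpret_cast<Right*>(&b);
    return static_cast<void*>(chained) == raw &&
           static_cast<void*>(reinterpreted) == raw;
}
```

So the adjustment comes from static_cast between related class types; once the pointer has passed through void*, the compiler has no hierarchy information left to adjust with.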

Using C-style casts is not safe: a C-style cast never checks whether the types involved can sensibly be mixed.
The C++ casts help you make sure that each conversion is of the kind you intended (according to the cast you use). They are recommended over the traditional C-style casts, which are error-prone.

Look, people, you don't really need reinterpret_cast, static_cast, or even the other two C++ styles casts (dynamic* and const).
Using a C style cast is both shorter and allows you to do everything the four C++-style cast let you do.
anyType someVar = (anyOtherType)otherVar;
So why use the C++-style casts? Readability. Secondly: because the more restrictive casts allow more code safety.
*okay, you might need dynamic

Simple c++ pointer casting

Can someone explain this to me:
char* a;
unsigned char* b;
b = a;
// error: invalid conversion from ‘char*’ to ‘unsigned char*’
b = static_cast<unsigned char*>(a);
// error: invalid static_cast from type ‘char*’ to type ‘unsigned char*’
b = static_cast<unsigned char*>(static_cast<void*>(a));
// everything is fine
What makes the difference between cast 2 and 3? And are there any pitfalls if the approach from 3 is used for other (more complex) types?
[edit]
As some mentioned bad design, etc...
This simple example comes from an image library which gives me the pointer to the image data as char*. Clearly image intensities are always positive so I need to interpret it as unsigned char data.

static_cast<void*> defeats the purpose of type checking, because you are saying that the pointer now points to "something you don't know the type of". The compiler then has to trust you, and when you apply static_cast<unsigned char*> to your new void*, it will just do what you explicitly asked for.
You'd be better off using reinterpret_cast<> if you really must use a cast here (which obviously points to a design problem).

Your third approach works because C++ allows a void pointer to be cast to T* via static_cast (and back again), but is more restrictive with other pointer types for safety reasons. char and unsigned char are two distinct types. This calls for a reinterpret_cast.

C++ tries to be a little more restrictive about type casting than C, so it doesn't let you convert a char* to an unsigned char* using static_cast (note that you will lose sign information). However, the type void* is special: the compiler cannot make any assumption about what it points to and has to rely on the programmer telling it the exact type (hence the third cast works).
As for your second question, of course there are a lot of pitfalls in using void*. Usually you don't have to use it, as the C++ type system, templates, etc. are rich enough that you don't have to rely on that "unknown type". Also, if you really need to use it, you have to be very careful with casts to and from void*, making sure that the types going in and coming out are really the same (for example, not pointers to subclasses, etc.).

static_cast between pointers works correctly only if one of the pointers is a void*, or if the cast is between pointers to classes where one class inherits from the other.

The difference between 2 and 3 is that in 3, you're explicitly telling the compiler to stop checking you by casting to void*. If the approach from 3 is used for pretty much anything that isn't a direct primitive integral type, you will invoke undefined behaviour. You may well invoke undefined behaviour in #3 anyway. If it doesn't cast implicitly, it's almost certainly a bad idea unless you really know what's going on, and if you cast a void* back to something that wasn't its original type, you will get undefined behaviour.

Casts between pointers require reinterpret_cast, with the exception of void*:
Casts from any pointer to void* are implicit, so you don't need to explicitly cast:
char* pch;
void* p = pch;
Casts from void* to any other pointer only require a static_cast:
unsigned char* pi = static_cast<unsigned char*>(p);
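For the image-data case in the question, both spellings can be sketched as below. first_intensity and both_spellings_agree are made-up helper names; reading a char buffer through unsigned char* is fine because unsigned char may alias any object.

```cpp
// Hypothetical helper: read the first byte of a char buffer as unsigned.
inline unsigned char first_intensity(char* data) {
    unsigned char* pixels = reinterpret_cast<unsigned char*>(data);
    return pixels[0];
}

// reinterpret_cast and the static_cast chain through void* yield the same pointer.
inline bool both_spellings_agree(char* data) {
    return reinterpret_cast<unsigned char*>(data) ==
           static_cast<unsigned char*>(static_cast<void*>(data));
}
```

The reinterpret_cast spelling states the intent (same bytes, different interpretation) more directly than the two-step detour.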

Beware: when you cast to void* you lose all type information.
What you are trying to do is incorrect, error-prone and misleading. That's why the compiler reported an error :-)
A simple example:
char* pChar = NULL; // you should always initialize your variables when you declare them
unsigned char* pUnsignedChar = NULL; // you should always initialize your variables when you declare them
char aChar = -128;
pChar = &aChar;
pUnsignedChar = static_cast<unsigned char*>(static_cast<void*>(pChar));
Then, although pUnsignedChar and pChar hold the same address, we nonetheless have *pUnsignedChar == 128 and *pChar == -128 (assuming char is signed and 8 bits wide).
I do believe this is a bad joke, and thus bad code.
