I was looking at https://en.cppreference.com/w/cpp/language/reinterpret_cast and I noticed that it specifies the legal types we can always cast to:
byte*
char*
unsigned char*
But I did not see void* in the list. Is this an oversight? My use case requires a reinterpret_cast because I'm casting from an int** to a void*. And I will eventually cast from the void* back to an int**.
Those types are exempt from strict aliasing rules. It does not mean they are the only type you can use with reinterpret_cast. In the case of casting an object pointer to another object pointer type, failing to meet the requirements of strict aliasing rules means you cannot safely dereference the result. But you can still cast the resulting pointer back to the original type safely, and use the result as-if it was the original pointer.
The relevant section from cppreference on reinterpret_cast :
(Any object pointer type T1* can be converted to another object pointer type cv T2*. This is exactly equivalent to static_cast<cv T2*>(static_cast<cv void*>(expression)) (which implies that if T2's alignment requirement is not stricter than T1's, the value of the pointer does not change and conversion of the resulting pointer back to its original type yields the original value). In any case, the resulting pointer may only be dereferenced safely if allowed by the type aliasing rules)
When casting back to the original type, AliasedType and DynamicType are the same, so they are similar, which is the first case listed by the aliasing rules where it is legal to dereference the result of reinterpret_cast :
Whenever an attempt is made to read or modify the stored value of an object of type DynamicType through a glvalue of type AliasedType, the behavior is undefined unless one of the following is true:
AliasedType and DynamicType are similar.
AliasedType is the (possibly cv-qualified) signed or unsigned variant of DynamicType.
AliasedType is std::byte, (since C++17)char, or unsigned char: this permits examination of the object representation of any object as an array of bytes.
[expr.reinterpret.cast]/7:
An object pointer can be explicitly converted to an object pointer of a different type.
[basic.compound]/3:
The type of a pointer to cv void or a pointer to an object type is called an object pointer type.
You don't need to use reinterpret_cast, though. Every object pointer type whose pointed type is cv-unqualified is implicitly convertible to void*, and the inverse can be done by static_cast.
It is always legal to convert from a pointer to a type to a pointer to a different type including void, so if T is a type this is legal C++:
T* x;
void *y = reinterpret_cast<void *>(x);
In real world it is never used because void * is a special case, and you obtain the same value with static_cast:
void *y = static_cast<void *>(x); // equivalent to previous reinterpret_cast
(in fact above conversion is implicit and can be simply written void *y = x; - thank to Michael Kenzel for noticing it)
To be more explicit the standard even says in draft n4659 for C++17 8.2.10 Reinterpret cast [expr.reinterpret.cast], §7
When a prvalue v of
object pointer type is converted to the object pointer type “pointer to cv T”, the result is static_cast<cv T*>(static_cast<cv void*>(v)).
When you refer to byte and char being the only legal types, it is just that it is legal to dereference the converted pointer only for those types. void is not included here because you can never dereference a void *.
To specifically answer your question
.. I'm casting from an int** to a void*. And I will eventually cast from the void* back to an int**.
The standard guarantees that first one is a standard (read implicit) conversion:
A prvalue of type “pointer to cv T”, where T is an object type, can be converted to a prvalue of type “pointer
to cv void”. The pointer value (6.9.2) is unchanged by this conversion.
So this is always legal:
int **i = ...;
void *v = i;
For back casting, standard says (in static_cast paragraph):
A prvalue of type “pointer to cv1 void” can be converted to a prvalue of type “pointer to cv2 T”,
So this is also legal
int **j = static_cast<int **>(v);
and the standard ensures that j == i.
Related
[expr.reinterpret.cast]/7:
Any object pointer type T1* can be
converted to another object pointer type cv T2*. This is exactly
equivalent to static_cast<cv T2*>(static_cast<cv void*>(T1_var))
(which implies that if T2's alignment requirement is not stricter than
T1's, the value of the pointer does not change and conversion of the
resulting pointer back to its original type yields the original value)
Am I wondering that there's no limitation while casting a pointer from one type to another using reinterpret_cast. The only thing I notice, when reading the given quote, is that the target type shall be stricter than the original type, [otherwise] not mentioned in this quote. For example
int *p = 0;
double *q = reinterpret_cast<double *>(p)
Here alignof(int) is not greater (stricter?) than or equal alignof(double). Although the compiler gives no errors or even warnings, I think in this case, q pointer holds an unspecified value? and dereferencing it is undefined behavior.
So does this mean converting a pointer of type T1* into T2*, the resulting pointer always has an unspecified value as long as static_assert(alignof(T1) >= alignof(T2)) fails?
So does this mean converting a pointer of type T1* into T2*, the
resulting pointer always has an unspecified value as long as
static_assert(alignof(T1) >= alignof(T2)) fails?
Not really. The result is only (but always) unspecified if the actual address of the T1 type doesn't fulfil the alignment requirements of the T2 type. It may do so, in which case the behaviour is well-defined.
The paragraph from the Standard you cite states the equivalence of a reinterpret_cast operation to two static_cast operations via an intermediate void* pointer. And, from the Standard, in the previous section about static_cast (bolding for emphasis mine):
8.5.1.9 Static cast [expr.static.cast]
13 A prvalue of type “pointer to cv1
void” can be converted to a prvalue of type “pointer to cv2 T”,
where T is an object type and cv2 is the same cv-qualification as,
or greater cv-qualification than, cv1. If the original pointer value
represents the address A of a byte in memory and A does not satisfy
the alignment requirement of T, then the resulting pointer value is
unspecified. Otherwise, if the original pointer value points to an
object a, and there is an object b of type T (ignoring
cv-qualification) that is pointer-interconvertible (6.7.2) with a, the
result is a pointer to b. Otherwise, the pointer value is unchanged by
the conversion.
Here's an example:
#include <cstddef>
#include <iostream>
struct A
{
char padding[7];
int x;
};
constexpr int offset = offsetof(A, x);
int main()
{
A a;
a.x = 42;
char *ptr = (char *)&a;
std::cout << *(int *)(ptr + offset) << '\n'; // Well-defined or not?
}
I always assumed that it's well-defined (otherwise what would be the point of offsetof), but wasn't sure.
Recently I was told that it's in fact UB, so I want to figure it out once and for all.
Does the example above cause UB or not? If you modify the class to not be standard-layout, does it affect the result?
And if it's UB, are there any workarounds for it (e.g. applying std::launder)?
This entire topic seems to be moot and underspecified.
Here's some information I was able to find:
Is adding to a “char *” pointer UB, when it doesn't actually point to a char array? - In 2011, CWG confirmed that we're allowed to examine the representation of a standard-layout object through an unsigned char pointer.
Unclear if a char pointer can be used insteaed, common sense says it can.
Unclear if staring from C++17 std::launder needs to be applied to the result of the (unsigned char *) cast. Given that it would be a breaking change, it's probably unnecessarly, at least in practice.
Unclear why C++17 changed offsetof to conditionally-support non-standard-layout types (used to be UB). It seems to imply that if an implementation supports that, then it also lets you examine the representation of non-standard-layout objects through unsigned char *.
Do we need to use std::launder when doing pointer arithmetic within a standard-layout object (e.g., with offsetof)? - A question similar to this one. No definitive answer was given.
Here I will refer to C++20 (draft) wording, because one relevant editorial issue was fixed between C++17 and C++20 and also it is possible to refer to specific sentences in HTML version of the C++20 draft, but otherwise there is nothing new in comparison to C++17.
At first, definitions of pointer values [basic.compound]/3:
Every value of pointer type is one of the following:
— a pointer to an object or function (the pointer is said to point to the object or function), or
— a pointer past the end of an object ([expr.add]), or
— the null pointer value for that type, or
— an invalid pointer value.
Now, lets see what happens in the (char *)&a expression.
Let me not prove that a is an lvalue denoting the object of type A, and I will say «the object a» to refer to this object.
The meaning of the &a subexpression is covered in [expr.unary.op]/(3.2):
if the operand is an lvalue of type T, the resulting expression is a prvalue of type “pointer to T” whose result is a pointer to the designated object
So, &a is a prvalue of type A* with the value «pointer to (the object) a».
Now, the cast in (char *)&a is equivalent to reinterpret_cast<char*>(&a), which is defined as static_cast<char*>(static_cast<void*>(&a)) ([expr.reinterpret.cast]/7).
Cast to void* doesn't change the pointer value ([conv.ptr]/2):
A prvalue of type “pointer to cv T”, where T is an object type, can be converted to a prvalue of type “pointer to cv void”. The pointer value ([basic.compound]) is unchanged by this conversion.
i.e. it is still «pointer to (the object) a».
[expr.static.cast]/13 covers the outer static_cast<char*>(...):
A prvalue of type “pointer to cv1 void” can be converted to a prvalue of type “pointer to cv2 T”, where T is an object type and cv2 is the same cv-qualification as, or greater cv-qualification than, cv1.
If the original pointer value represents the address A of a byte in memory and A does not satisfy the alignment requirement of T, then the resulting pointer value is unspecified.
Otherwise, if the original pointer value points to an object a, and there is an object b of type T (ignoring cv-qualification) that is pointer-interconvertible with a, the result is a pointer to b.
Otherwise, the pointer value is unchanged by the conversion.
There is no object of type char which is pointer-interconvertible with the object a ([basic.compound]/4):
Two objects a and b are pointer-interconvertible if:
— they are the same object, or
— one is a union object and the other is a non-static data member of that object ([class.union]), or
— one is a standard-layout class object and the other is the first non-static data member of that object, or, if the object has no non-static data members, any base class subobject of that object ([class.mem]), or
— there exists an object c such that a and c are pointer-interconvertible, and c and b are pointer-interconvertible.
which means that the static_cast<char*>(...) doesn't change the pointer value and it is the same as in its operand, namely: «pointer to a».
So, (char *)&a is a prvalue of type char* whose value is «pointer to a». This value is stored into char* ptr variable. Then, when you try to do pointer arithmetic with such a value, namely ptr + offset, you step into [expr.add]/6:
For addition or subtraction, if the expressions P or Q have type “pointer to cv T”, where T and the array element type are not similar, the behavior is undefined.
For the purposes of pointer arithmetic, the object a is considered to be an element of an array A[1] ([basic.compound]/3), so the array element type is A, the type of the pointer expression P is «pointer to char», char and A are not similar types (see [conv.qual]/2), so the behavior is undefined.
This question, and the other one about launder, both seem to me to boil down to interpretation of the last sentence of C++17 [expr.static.cast]/13, which covers what happens for static_cast<T *> applied to an operand of pointer to unrelated type which is correctly aligned:
A prvalue of type “pointer to cv1 void” can be converted to a prvalue of type “pointer to cv2 T ”,
[...]
Otherwise, the pointer value is unchanged by the conversion.
Some posters appear to take this to mean that the result of the cast cannot point to an object of type T, and consequently that reinterpret_cast with pointers or references can only be used on pointer-interconvertible types.
But I don't see that as justified, and (this is a reductio ad absurdum argument) that position would also imply:
The resolution to CWG1314 is overturned.
Inspecting any byte of a standard-layout object is not possible (since casting to unsigned char * or whatever character type supposedly cannot be used to access that byte).
The strict aliasing rule would be redundant since the only way to actually achieve such aliasing is to use such casts.
There would be no normative text to justify the note "[Note: Converting a prvalue of type “pointer to T1 ” to the type “pointer to T2 ” (where T1 and T2 are object types and where the alignment requirements of T2 are no stricter than those of T1) and back to its original type yields the original pointer value. —end note ]"
offsetof is useless (and the C++17 changes to it were therefore redundant)
It seems like a more sensible interpretation to me that this sentence means the result of the cast points to the same byte in memory as the operand. (As opposed to pointing to some other byte, as can happen for some pointer casts not covered by this sentence). Saying "the value is unchanged" does not mean "the type is unchanged", e.g. we describe conversion from int to long as preserving the value.
Also, I guess this may be controversial to some but I am taking as axiomatic that if a pointer's value is the address of an object, then the pointer points to that object, unless the Standard specifically excludes the case.
This is consistent with the text of [basic.compound]/3 which says the converse, i.e. that if a pointer points to an object, then its value is the address of the object.
There doesn't seem to be any other explicit statement defining when a pointer can or cannot be said to point to an object, but basic.compound/3 says that all pointers must be one of four cases (points to an object, points past the end, null, invalid).
Examples of excluded cases include:
The use case of std::launder specifically addresses a situation where there was such language ruling out the use of the un-laundered pointer.
A past-the-end pointer does not point to an object. (basic.compound/3)
I want to cast a pointer pc which points to char to a point pi which points to int
char *pc;
int *pi;
pi = (int*)pc // compiler complaint about old-style cast
pi = static_cast<int *>(static_cast<void *>(pc)) // no complaint any more but too complex
is there any simpler ways to do this cast and make compiler silence?
If you really need to do this then reinterpret_cast is your friend:
char *pc = 0;
int *pi = 0;
pi = reinterpret_cast<int*>(pc);
The behaviour on converting, in general, a char* pointer to an int* pointer is undefined. Dereferencing such a pointer will cause you further trouble as you will be breaking strict aliasing rules. Note that the C++ Standard does not require sizeof(char*) to be the same as sizeof(int*).
(Note that converting an unsigned char* pointer to an int* is well-defined if the unsigned char* pointer actually points to an int).
Don't do it. Ever.
I want to put the back and forth under #Bathsheba's post to rest. So here's an answer about the finer details of what you are doing.
#Sean already suggested you reinterpret_cast your pointers instead. And that is equivalent to your second chain of casts. It says as much in [expr.reinterpret.cast]/7:
An object pointer can be explicitly converted to an object pointer of
a different type. When a prvalue v of object pointer type is
converted to the object pointer type “pointer to cv T”, the result is
static_cast<cv T*>(static_cast<cv void*>(v)). [ Note: Converting a
prvalue of type “pointer to T1” to the type “pointer to T2” (where T1
and T2 are object types and where the alignment requirements of T2 are
no stricter than those of T1) and back to its original type yields the
original pointer value. — end note ]
Now, let's examine each step of the two step conversion. First we have a static_cast<void*>. According to [conv.ptr]/2 (emphasis mine):
A prvalue of type “pointer to cv T”, where T is an object type, can be
converted to a prvalue of type “pointer to cv void”. The pointer value
is unchanged by this conversion.
The first conversion doesn't do any alteration to the address. And then it also says in [basic.compound]/5:
A pointer to cv-qualified or cv-unqualified void can be used to point
to objects of unknown type. Such a pointer shall be able to hold any
object pointer. An object of type cv void* shall have the same
representation and alignment requirements as cv char*.
So a char* may store any address a void* may store. Now it doesn't mean the conversion from void* to char* is value preserving, only that they can represent the same values. Now, assuming a very restricted use case, that is enough of a guarantee. But there's more at [expr.static.cast]/13:
A prvalue of type “pointer to cv1 void” can be converted to a prvalue
of type “pointer to cv2 T”, where T is an object type and cv2 is the
same cv-qualification as, or greater cv-qualification than, cv1. If
the original pointer value represents the address A of a byte in
memory and A does not satisfy the alignment requirement of T, then the
resulting pointer value is unspecified. Otherwise, if the original
pointer value points to an object a, and there is an object b of type
T (ignoring cv-qualification) that is pointer-interconvertible with a,
the result is a pointer to b. Otherwise, the pointer value is
unchanged by the conversion.
Where am I going with this? Assuming pc already holds the address of an int (suitably converted according to the above), then casting the char* to an int* via reinterpret_cast will give you the address of the original int. The note under the first paragraph says as much, and the further quotes prove it. If it doesn't hold the address of an int, you are playing roulette and are likely going to lose. Your program has undefined behavior. You should follow Bathsheba's advice to the letter.
I've got an iterator of Things. If I want to convert the current item to a pointer to the item, why does this work:
thing_pointer = &(*it);
But this not:
thing_pointer = reinterpret_cast<Thing*>(it);
This is the compiler error I'm trying to comprehend: http://msdn.microsoft.com/en-us/library/sy5tsf8z(v=vs.90).aspx
Just in case, the type of the iterator is std::_Vector_iterator<std::_Vector_val<Thing,std::allocator<Thing> > >
In
&(*it);
the * is overloaded to do what you logically mean: convert the iterator type to its pointed-to object. You can then safely take the address of this object.
Whereas in
reinterpret_cast<Thing*>(it);
you are telling the compiler to literally reinterpret the it object as a pointer. But it might not be a pointer at all -- it might be a 50-byte struct, for all you know! In that case, the first sizeof (Thing*) bytes of it will absolutely not happen to point at anything sensible.
Tip: reinterpret_cast<> is nearly always the wrong thing.
Obligitory Standard Quotes, emphasis mine:
5.2.19 Reinterpret cast
1/ [...] Conversions that can be performed explicitly using
reinterpret_cast are listed below. No other conversion can be
performed explicitly using reinterpret_cast.
4/ A pointer can be explicitly converted to any integral type large
enough to hold it. [...]
5/ A value of integral type or enumeration type can be explicitly
converted to a pointer. [...]
6/ A function pointer can be explicitly converted to a function
pointer of a different type. [...]
7/ An object pointer can be explicitly converted to an object pointer
of a different type. [...]
8/ Converting a function pointer to an object pointer type or vice
versa is conditionally-supported. [...]
9/ The null pointer value (4.10) is converted to the null pointer
value of the destination type. [...]
10/ [...] “pointer to member of X of type T1” can be explicitly
converted to [...] “pointer to member of Y of type T2” [...]
11/ A [...] T1 can be cast to the type “reference to T2” if an
expression of type “pointer to T1” can be explicitly converted to the
type “pointer to T2” using a reinterpret_cast. [...]
With the exception of the integral-to-pointer and value-to-reference conversions noted in 4/, 5/ and 11/ the only conversions that can be performed using reinterpret_cast are pointer-to-pointer conversions.
However in:
thing_pointer = reinterpret_cast<Thing*>(it);
it is not a pointer, but an object. It just so happens that this object was designed to emulate a pointer in many ways, but it's still not a pointer.
Because * operator of iterator is overloaded and it return a
reference to the object it points on.
You can force it by thing_pointer = *(reinterpret_cast<Thing**>(&it));. But it's undefined behavior.
Because iterator is not a pointer. It is a class of implementation-defined structure, and if you try to reinterpret it to a pointer, the raw data of the iterator class will be taken as a memory pointer, which may, but probably will not point to valid memory
The first gets a reference to the object, then takes the address of it, giving the pointer.
The second tries to cast the iterator to a pointer, which is likely to fail because most types can't be cast to pointers - only other pointers, integers, and class types with a conversion operator.
Taking into consideration the entire C++11 standard, is it possible for any conforming implementation to succeed the first assertion below but fail the latter?
#include <cassert>
int main(int, char**)
{
const int I = 5, J = 4, K = 3;
const int N = I * J * K;
int arr1d[N] = {0};
int (&arr3d)[I][J][K] = reinterpret_cast<int (&)[I][J][K]>(arr1d);
assert(static_cast<void*>(arr1d) ==
static_cast<void*>(arr3d)); // is this necessary?
arr3d[3][2][1] = 1;
assert(arr1d[3 * (J * K) + 2 * K + 1] == 1); // UB?
}
If not, is this technically UB or not, and does that answer change if the first assertion is removed (is reinterpret_cast guaranteed to preserve addresses here?)? Also, what if the reshaping is done in the opposite direction (3d to 1d) or from a 6x35 array to a 10x21 array?
EDIT: If the answer is that this is UB because of the reinterpret_cast, is there some other strictly compliant way of reshaping (e.g., via static_cast to/from an intermediate void *)?
Update 2021-03-20:
This same question was asked on Reddit recently and it was pointed out that my original answer is flawed because it does not take into account this aliasing rule:
If a program attempts to access the stored value of an object through a glvalue whose type is not similar to one of the following types the behavior is undefined:
the dynamic type of the object,
a type that is the signed or unsigned type corresponding to the dynamic type of the object, or
a char, unsigned char, or std::byte type.
Under the rules for similarity, these two array types are not similar for any of the above cases and therefore it is technically undefined behaviour to access the 1D array through the 3D array. (This is definitely one of those situations where, in practice, it will almost certainly work with most compilers/targets)
Note that the references in the original answer refer to an older C++11 draft standard
Original answer:
reinterpret_cast of references
The standard states that an lvalue of type T1 can be reinterpret_cast to a reference to T2 if a pointer to T1 can be reinterpret_cast to a pointer to T2 (§5.2.10/11):
An lvalue expression of type T1 can be cast to the type “reference to T2” if an expression of type “pointer to T1” can be explicitly converted to the type “pointer to T2” using a reinterpret_cast.
So we need to determine if a int(*)[N] can be converted to an int(*)[I][J][K].
reinterpret_cast of pointers
A pointer to T1 can be reinterpret_cast to a pointer to T2 if both T1 and T2 are standard-layout types and T2 has no stricter alignment requirements than T1 (§5.2.10/7):
When a prvalue v of type “pointer to T1” is converted to the type “pointer to cv T2”, the result is static_cast<cv T2*>(static_cast<cv void*>(v)) if both T1 and T2 are standard-layout types (3.9) and the alignment requirements of T2 are no stricter than those of T1, or if either type is void.
Are int[N] and int[I][J][K] standard-layout types?
int is a scalar type and arrays of scalar types are considered to be standard-layout types (§3.9/9).
Scalar types, standard-layout class types (Clause 9), arrays of such types and cv-qualified versions of these types (3.9.3) are collectively called standard-layout types.
Does int[I][J][K] have no stricter alignment requirements than int[N].
The result of the alignof operator gives the alignment requirement of a complete object type (§3.11/2).
The result of the alignof operator reflects the alignment requirement of the type in the complete-object case.
Since the two arrays here are not subobjects of any other object, they are complete objects. Applying alignof to an array gives the alignment requirement of the element type (§5.3.6/3):
When alignof is applied to an array type, the result shall be the alignment of the element type.
So both array types have the same alignment requirement.
That makes the reinterpret_cast valid and equivalent to:
int (&arr3d)[I][J][K] = *reinterpret_cast<int (*)[I][J][K]>(&arr1d);
where * and & are the built-in operators, which is then equivalent to:
int (&arr3d)[I][J][K] = *static_cast<int (*)[I][J][K]>(static_cast<void*>(&arr1d));
static_cast through void*
The static_cast to void* is allowed by the standard conversions (§4.10/2):
A prvalue of type “pointer to cv T,” where T is an object type, can be converted to a prvalue of type “pointer to cv void”. The result of converting a “pointer to cv T” to a “pointer to cv void” points to the start of the storage location where the object of type T resides, as if the object is a most derived object (1.8) of type T (that is, not a base class subobject).
The static_cast to int(*)[I][J][K] is then allowed (§5.2.9/13):
A prvalue of type “pointer to cv1 void” can be converted to a prvalue of type “pointer to cv2 T,” where T is an object type and cv2 is the same cv-qualification as, or greater cv-qualification than, cv1.
So the cast is fine! But are we okay to access objects through the new array reference?
Accessing array elements
Performing array subscripting on an array like arr3d[E2] is equivalent to *((E1)+(E2)) (§5.2.1/1). Let's consider the following array subscripting:
arr3d[3][2][1]
Firstly, arr3d[3] is equivalent to *((arr3d)+(3)). The lvalue arr3d undergoes array-to-pointer conversion to give a int(*)[2][1]. There is no requirement that the underlying array must be of the correct type to do this conversion. The pointers value is then accessed (which is fine by §3.10) and then the value 3 is added to it. This pointer arithmetic is also fine (§5.7/5):
If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.
This this pointer is dereferenced to give an int[2][1]. This undergoes the same process for the next two subscripts, resulting in the final int lvalue at the appropriate array index. It is an lvalue due to the result of * (§5.3.1/1):
The unary * operator performs indirection: the expression to which it is applied shall be a pointer to an object type, or a pointer to a function type and the result is an lvalue referring to the object or function to which the expression points.
It is then perfectly fine to access the actual int object through this lvalue because the lvalue is of type int too (§3.10/10):
If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:
the dynamic type of the object
[...]
So unless I've missed something. I'd say this program is well-defined.
I am under the impression that it will work. You allocate the same piece of contiguous memory. I know the C-standard guarantees it will be contiguous at least. I don't know what is said in the C++11 standard.
However the first assert should always be true. The address of the first element of the array will always be the same. All memory address will be the same since the same piece of memory is allocated.
I would therefore also say that the second assert will always hold true. At least as long as the ordering of the elements are always in row major order. This is also guaranteed by the C-standard and I would be surprised if the C++11 standard says anything differently.