Related
As an example, consider the following structure:
struct S {
int a[4];
int b[4];
} s;
Would it be legal to write s.a[6] and expect it to be equal to s.b[2]?
Personally, I feel that it must be UB in C++, whereas I'm not sure about C.
However, I failed to find anything relevant in the standards of C and C++ languages.
Update
There are several answers suggesting ways to make sure there is no padding
between fields in order to make the code work reliably. I'd like to emphasize
that if such code is UB, then absense of padding is not enough. If it is UB,
then the compiler is free to assume that accesses to S.a[i] and S.b[j] do not
overlap and the compiler is free to reorder such memory accesses. For example,
int x = s.b[2];
s.a[6] = 2;
return x;
can be transformed to
s.a[6] = 2;
int x = s.b[2];
return x;
which always returns 2.
Would it be legal to write s.a[6] and expect it to be equal to s.b[2]?
No. Because accessing an array out of bound invoked undefined behaviour in C and C++.
C11 J.2 Undefined behavior
Addition or subtraction of a pointer into, or just beyond, an array object and an integer type produces a result that points just beyond
the array object and is used as the operand of a unary * operator that
is evaluated (6.5.6).
An array subscript is out of range, even if an object is apparently accessible with the given subscript (as in the lvalue expression
a[1][7] given the declaration int a[4][5]) (6.5.6).
C++ standard draft section 5.7 Additive operators paragraph 5 says:
When an expression that has integral type is added to or subtracted
from a pointer, the result has the type of the pointer operand. If the
pointer operand points to an element of an array object, and the array
is large enough, the result points to an element offset from the
original element such that the difference of the subscripts of the
resulting and original array elements equals the integral expression.
[...] If both the pointer operand and the result point to elements
of the same array object, or one past the last element of the array
object, the evaluation shall not produce an overflow; otherwise, the
behavior is undefined.
Apart from the answer of #rsp (Undefined behavior for an array subscript that is out of range) I can add that it is not legal to access b via a because the C language does not specify how much padding space can be between the end of area allocated for a and the start of b, so even if you can run it on a particular implementation , it is not portable.
instance of struct:
+-----------+----------------+-----------+---------------+
| array a | maybe padding | array b | maybe padding |
+-----------+----------------+-----------+---------------+
The second padding may miss as well as the alignment of struct object is the alignment of a which is the same as the alignment of b but the C language also does not impose the second padding not to be there.
a and b are two different arrays, and a is defined as containing 4 elements. Hence, a[6] accesses the array out of bounds and is therefore undefined behaviour. Note that array subscript a[6] is defined as *(a+6), so the proof of UB is actually given by section "Additive operators" in conjunction with pointers". See the following section of the C11-standard (e.g. this online draft version) describing this aspect:
6.5.6 Additive operators
When an expression that has integer type is added to or subtracted
from a pointer, the result has the type of the pointer operand. If the
pointer operand points to an element of an array object, and the array
is large enough, the result points to an element offset from the
original element such that the difference of the subscripts of the
resulting and original array elements equals the integer expression.
In other words, if the expression P points to the i-th element of an
array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N
(where N has the value n) point to, respectively, the i+n-th and
i-n-th elements of the array object, provided they exist. Moreover, if
the expression P points to the last element of an array object, the
expression (P)+1 points one past the last element of the array object,
and if the expression Q points one past the last element of an array
object, the expression (Q)-1 points to the last element of the array
object. If both the pointer operand and the result point to elements
of the same array object, or one past the last element of the array
object, the evaluation shall not produce an overflow; otherwise, the
behavior is undefined. If the result points one past the last element
of the array object, it shall not be used as the operand of a unary *
operator that is evaluated.
The same argument applies to C++ (though not quoted here).
Further, though it is clearly undefined behaviour due to the fact of exceeding array bounds of a, note that the compiler might introduce padding between members a and b, such that - even if such pointer arithmetics were allowed - a+6 would not necessarily yield the same address as b+2.
Is it legal? No. As others mentioned, it invokes Undefined Behavior.
Will it work? That depends on your compiler. That's the thing about undefined behavior: it's undefined.
On many C and C++ compilers, the struct will be laid out such that b will immediately follow a in memory and there will be no bounds checking. So accessing a[6] will effectively be the same as b[2] and will not cause any sort of exception.
Given
struct S {
int a[4];
int b[4];
} s
and assuming no extra padding, the structure is really just a way of looking at a block of memory containing 8 integers. You could cast it to (int*) and ((int*)s)[6] would point to the same memory as s.b[2].
Should you rely on this sort of behavior? Absolutely not. Undefined means that the compiler doesn't have to support this. The compiler is free to pad the structure which could render the assumption that &(s.b[2]) == &(s.a[6]) incorrect. The compiler could also add bounds checking on the array access (although enabling compiler optimizations would probably disable such a check).
I've have experienced the effects of this in the past. It's quite common to have a struct like this
struct Bob {
char name[16];
char whatever[64];
} bob;
strcpy(bob.name, "some name longer than 16 characters");
Now bob.whatever will be " than 16 characters". (which is why you should always use strncpy, BTW)
As #MartinJames mentioned in a comment, if you need to guarantee that a and b are in contiguous memory (or at least able to be treated as such, (edit) unless your architecture/compiler uses an unusual memory block size/offset and forced alignment that would require padding to be added), you need to use a union.
union overlap {
char all[8]; /* all the bytes in sequence */
struct { /* (anonymous struct so its members can be accessed directly) */
char a[4]; /* padding may be added after this if the alignment is not a sub-factor of 4 */
char b[4];
};
};
You can't directly access b from a (e.g. a[6], like you asked), but you can access the elements of both a and b by using all (e.g. all[6] refers to the same memory location as b[2]).
(Edit: You could replace 8 and 4 in the code above with 2*sizeof(int) and sizeof(int), respectively, to be more likely to match the architecture's alignment, especially if the code needs to be more portable, but then you have to be careful to avoid making any assumptions about how many bytes are in a, b, or all. However, this will work on what are probably the most common (1-, 2-, and 4-byte) memory alignments.)
Here is a simple example:
#include <stdio.h>
union overlap {
char all[2*sizeof(int)]; /* all the bytes in sequence */
struct { /* anonymous struct so its members can be accessed directly */
char a[sizeof(int)]; /* low word */
char b[sizeof(int)]; /* high word */
};
};
int main()
{
union overlap testing;
testing.a[0] = 'a';
testing.a[1] = 'b';
testing.a[2] = 'c';
testing.a[3] = '\0'; /* null terminator */
testing.b[0] = 'e';
testing.b[1] = 'f';
testing.b[2] = 'g';
testing.b[3] = '\0'; /* null terminator */
printf("a=%s\n",testing.a); /* output: a=abc */
printf("b=%s\n",testing.b); /* output: b=efg */
printf("all=%s\n",testing.all); /* output: all=abc */
testing.a[3] = 'd'; /* makes printf keep reading past the end of a */
printf("a=%s\n",testing.a); /* output: a=abcdefg */
printf("b=%s\n",testing.b); /* output: b=efg */
printf("all=%s\n",testing.all); /* output: all=abcdefg */
return 0;
}
No, since accesing an array out of bounds invokes Undefined Behavior, both in C and C++.
Short Answer: No. You're in the land of undefined behavior.
Long Answer: No. But that doesn't mean that you can't access the data in other sketchier ways... if you're using GCC you can do something like the following (elaboration of dwillis's answer):
struct __attribute__((packed,aligned(4))) Bad_Access {
int arr1[3];
int arr2[3];
};
and then you could access via (Godbolt source+asm):
int x = ((int*)ba_pointer)[4];
But that cast violates strict aliasing so is only safe with g++ -fno-strict-aliasing. You can cast a struct pointer to a pointer to the first member, but then you're back in the UB boat because you're accessing outside the first member.
Alternatively, just don't do that. Save a future programmer (probably yourself) the heartache of that mess.
Also, while we're at it, why not use std::vector? It's not fool-proof, but on the back-end it has guards to prevent such bad behavior.
Addendum:
If you're really concerned about performance:
Let's say you have two same-typed pointers that you're accessing. The compiler will more than likely assume that both pointers have the chance to interfere, and will instantiate additional logic to protect you from doing something dumb.
If you solemnly swear to the compiler that you're not trying to alias, the compiler will reward you handsomely:
Does the restrict keyword provide significant benefits in gcc / g++
Conclusion: Don't be evil; your future self, and the compiler will thank you.
Jed Schaff’s answer is on the right track, but not quite correct. If the compiler inserts padding between a and b, his solution will still fail. If, however, you declare:
typedef struct {
int a[4];
int b[4];
} s_t;
typedef union {
char bytes[sizeof(s_t)];
s_t s;
} u_t;
You may now access (int*)(bytes + offsetof(s_t, b)) to get the address of s.b, no matter how the compiler lays out the structure. The offsetof() macro is declared in <stddef.h>.
The expression sizeof(s_t) is a constant expression, legal in an array declaration in both C and C++. It will not give a variable-length array. (Apologies for misreading the C standard before. I thought that sounded wrong.)
In the real world, though, two consecutive arrays of int in a structure are going to be laid out the way you expect. (You might be able to engineer a very contrived counterexample by setting the bound of a to 3 or 5 instead of 4 and then getting the compiler to align both a and b on a 16-byte boundary.) Rather than convoluted methods to try to get a program that makes no assumptions whatsoever beyond the strict wording of the standard, you want some kind of defensive coding, such as static assert(&both_arrays[4] == &s.b[0], "");. These add no run-time overhead and will fail if your compiler is doing something that would break your program, so long as you don’t trigger UB in the assertion itself.
If you want a portable way to guarantee that both sub-arrays are packed into a contiguous memory range, or split a block of memory the other way, you can copy them with memcpy().
The Standard does not impose any restrictions upon what implementations must do when a program tries to use an out-of-bounds array subscript in one structure field to access a member of another. Out-of-bounds accesses are thus "illegal" in strictly conforming programs, and programs which make use of such accesses cannot simultaneously be 100% portable and free of errors. On the other hand, many implementations do define the behavior of such code, and programs which are targeted solely at such implementations may exploit such behavior.
There are three issues with such code:
While many implementations lay out structures in predictable fashion, the Standard allows implementations to add arbitrary padding before any structure member other than the first. Code could use sizeof or offsetof to ensure that structure members are placed as expected, but the other two issues would remain.
Given something like:
if (structPtr->array1[x])
structPtr->array2[y]++;
return structPtr->array1[x];
it would normally be useful for a compiler to assume that the use of structPtr->array1[x] will yield the same value as the preceding use in the "if" condition, even though it would change the behavior of code that relies upon aliasing between the two arrays.
If array1[] has e.g. 4 elements, a compiler given something like:
if (x < 4) foo(x);
structPtr->array1[x]=1;
might conclude that since there would be no defined cases where x isn't less than 4, it could call foo(x) unconditionally.
Unfortunately, while programs can use sizeof or offsetof to ensure that there aren't any surprises with struct layout, there's no way by which they can test whether compilers promise to refrain from the optimizations of types #2 or #3. Further, the Standard is a little vague about what would be meant in a case like:
struct foo {char array1[4],array2[4]; };
int test(struct foo *p, int i, int x, int y, int z)
{
if (p->array2[x])
{
((char*)p)[x]++;
((char*)(p->array1))[y]++;
p->array1[z]++;
}
return p->array2[x];
}
The Standard is pretty clear that behavior would only be defined if z is in the range 0..3, but since the type of p->array in that expression is char* (due to decay) it's not clear the cast in the access using y would have any effect. On the other hand, since converting pointer to the first element of a struct to char* should yield the same result as converting a struct pointer to char*, and the converted struct pointer should be usable to access all bytes therein, it would seem the access using x should be defined for (at minimum) x=0..7 [if the offset of array2 is greater than 4, it would affect the value of x needed to hit members of array2, but some value of x could do so with defined behavior].
IMHO, a good remedy would be to define the subscript operator on array types in a fashion that does not involve pointer decay. In that case, the expressions p->array[x] and &(p->array1[x]) could invite a compiler to assume that x is 0..3, but p->array+x and *(p->array+x) would require a compiler to allow for the possibility of other values. I don't know if any compilers do that, but the Standard doesn't require it.
I recently discovered about the vreinterpret{q}_dsttype_srctype casting operator. However this doesn't seem to support conversion in the data type described at this link (bottom of the page):
Some intrinsics use an array of vector types of the form:
<type><size>x<number of lanes>x<length of array>_t
These types are treated as ordinary C structures containing a single
element named val.
An example structure definition is:
struct int16x4x2_t
{
int16x4_t val[2];
};
Do you know how to convert from uint8x16_t to uint8x8x2_t?
Note that that the problem cannot be reliably addressed using union (reading from inactive members leads to undefined behaviour Edit: That's only the case for C++, while it turns out that C allows type punning), nor by using pointers to cast (breaks the strict aliasing rule).
It's completely legal in C++ to type pun via pointer casting, as long as you're only doing it to char*. This, not coincidentally, is what memcpy is defined as working on (technically unsigned char* which is good enough).
Kindly observe the following passage:
For any object (other than a base-class subobject) of trivially
copyable type T, whether or not the object holds a valid value of type
T, the underlying bytes (1.7) making up the object can be copied into
an array of char or unsigned char.
42 If the content of the array of char or unsigned char is copied back
into the object, the object shall subsequently hold its original
value. [Example:
#define N sizeof(T)
char buf[N];
T obj;
// obj initialized to its original value
std::memcpy(buf, &obj, N);
// between these two calls to std::memcpy,
// obj might be modified
std::memcpy(&obj, buf, N);
// at this point, each subobject of obj of scalar type
// holds its original value
— end example ]
Put simply, copying like this is the intended function of std::memcpy. As long as the types you're dealing with meet the necessary triviality requirements, it's totally legit.
Strict aliasing does not include char* or unsigned char*- you are free to alias any type with these.
Note that for unsigned ints specifically, you have some very explicit leeway here. The C++ Standard requires that they meet the requirements of the C Standard. The C Standard mandates the format. The only way that trap representations or anything like that can be involved is if your implementation has any padding bits, but ARM does not have any- 8bit bytes, 8bit and 16bit integers. So for unsigned integers on implementations with zero padding bits, any byte is a valid unsigned integer.
For unsigned integer types other than unsigned char, the bits
of the object representation shall be divided into two groups:
value bits and padding bits (there need not be any of the
latter). If there are N value bits, each bit shall represent
a different power of 2 between 1 and 2N−1, so that objects
of that type shall be capable of representing values from 0
to 2N−1 using a pure binary representation; this shall be
known as the value representation. The values of any padding bits are
unspecified.
Based on your comments, it seems you want to perform a bona fide conversion -- that is, to produce a distinct, new, separate value of a different type. This is a very different thing than a reinterpretation, such as the lead-in to your question suggests you wanted. In particular, you posit variables declared like this:
uint8x16_t a;
uint8x8x2_t b;
// code to set the value of a ...
and you want to know how to set the value of b so that it is in some sense equivalent to the value of a.
Speaking to the C language:
The strict aliasing rule (C2011 6.5/7) says,
An object shall have its stored value accessed only by an lvalue
expression that has one of the following types:
a type compatible with the effective type of the object, [...]
an aggregate or union type that includes one of the aforementioned types among its members [...], or
a character type.
(Emphasis added. Other enumerated options involve differently-qualified and differently-signed versions of the of the effective type of the object or compatible types; these are not relevant here.)
Note that these provisions never interfere with accessing a's value, including the member value, via variable a, and similarly for b. But don't overlook overlook the usage of the term "effective type" -- this is where things can get bolluxed up under slightly different circumstances. More on that later.
Using a union
C certainly permits you to perform a conversion via an intermediate union, or you could rely on b being a union member in the first place so as to remove the "intermediate" part:
union {
uint8x16_t x1;
uint8x8_2_t x2;
} temp;
temp.x1 = a;
b = temp.x2;
Using a typecast pointer (to produce UB)
However, although it's not so uncommon to see it, C does not permit you to type-pun via a pointer:
// UNDEFINED BEHAVIOR - strict-aliasing violation
b = *(uint8x8x2_t *)&a;
// DON'T DO THAT
There, you are accessing the value of a, whose effective type is uint8x16_t, via an lvalue of type uint8x8x2_t. Note that it is not the cast that is forbidden, nor even, I'd argue, the dereferencing -- it is reading the dereferenced value so as to apply the side effect of the = operator.
Using memcpy()
Now, what about memcpy()? This is where it gets interesting. C permits the stored values of a and b to be accessed via lvalues of character type, and although its arguments are declared to have type void *, this is the only plausible interpretation of how memcpy() works. Certainly its description characterizes it as copying characters. There is therefore nothing wrong with performing a
memcpy(&b, &a, sizeof a);
Having done so, you may freely access the value of b via variable b, as already mentioned. There are aspects of doing so that could be problematic in a more general context, but there's no UB here.
However, contrast this with the superficially similar situation in which you want to put the converted value into dynamically-allocated space:
uint8x8x2_t *c = malloc(sizeof(*c));
memcpy(c, &a, sizeof a);
What could be wrong with that? Nothing is wrong with it, as far as it goes, but here you have UB if you afterward you try to access the value of *c. Why? because the memory to which c points does not have a declared type, therefore its effective type is the effective type of whatever was last stored in it (if that has an effective type), including if that value was copied into it via memcpy() (C2011 6.5/6). As a result, the object to which c points has effective type uint8x16_t after the copy, whereas the expression *c has type uint8x8x2_t; the strict aliasing rule says that accessing that object via that lvalue produces UB.
So there are a bunch of gotchas here. This reflects C++.
First you can convert trivially copyable data to char* or unsigned char* or c++17 std::byte*, then copy it from one location to another. The result is defined behavior. The values of the bytes are unspecified.
If you do this from a value of one one type to another via something like memcpy, this can result in undefined behaviour upon access of the target type unless the target type has valid values for all byte representations, or if the layout of the two types is specified by your compiler.
There is the possibility of "trap representations" in the target type -- byte combinations that result in machine exceptions or something similar if interpreted as a value of that type. Imagine a system that doesn't use IEEE floats and where doing math on NaN or INF or the like causes a segfault.
There are also alignment concerns.
In C, I believe that type punning via unions is legal, with similar qualifications.
Finally, note that under a strict reading of the c++ standard, foo* pf = (foo*)malloc(sizeof(foo)); is not a pointer to a foo even if foo was plain old data. You must create an object before interacting with it, and the only way to create an object outside of automatic storage is via new or placement new. This means you must have data of the target type before you memcpy into it.
Do you know how to convert from uint8x16_t to uint8x8x2_t?
uint8x16_t input = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 };
uint8x8x2_t output = { vget_low_u8(input), vget_high_u8(input) };
One must understand that with neon intrinsics, uint8x16_t represents a 16-byte register; while uint8x8x2_t represents two adjacent 8-byte registers. For ARMv7 these may be the same thing (q0 == {d0, d1}) but for ARMv8 the register layout is different. It's necessary to get (extract) the low 8 bytes and the high 8 bytes of the single 16-byte register using two functions. The clang compiler will determine which instruction(s) are necessary based on the context.
Now we know that doing out-of-bounds-pointer-arithmetic has undefined behavior as described in this SO question.
My question is: can we workaround such restriction by casting to std::uintptr_t for arithmetic operations and then cast back to pointer? is that guaranteed to work?
For example:
char a[5];
auto u = reinterpret_cast<std::uintptr_t>(a) - 1;
auto p = reinterpret_cast<char*>(u + 1); // OK?
The real world usage is for optimizing offsetted memory access -- instead of p[n + offset], I want to do offset_p[n].
EDIT To make the question more explicit:
Given a base pointer p of a char array, if p + n is a valid pointer, will reinterpret_cast<char*>(reinterpret_cast<std::uintptr_t>(p) + n) be guaranteed to yield the same valid pointer?
No, uintptr_t cannot be meaningfully used to avoid undefined behavior when performing pointer arithmetic.
For one thing, at least in C there is no guarantee that uintptr_t even exists. The requirement is that any value of type void* may be converted to uintptr_t and back again, yielding the original value without loss of information. In principle, there might not be any unsigned integer type wide enough to hold all pointer values. (I presume the same applies to C++, since C++ inherits most of the C standard library and defines it by reference to the C standard.)
Even if uintptr_t does exist, there is no guarantee that a given arithmetic operation on a uintptr_t value does the same thing as the corresponding operation on a pointer value.
For example, I've worked on systems (Cray vector systems, T90 and SV1) on which byte pointers are implemented in software. A native address is a 64-bit address that refers to a 64-bit word; there is no hardware support for byte addressing. A char* or void* pointer consists of a word pointer with a 3-bit offset stored in the otherwise unused high-order bits. Conversion between integers and pointers simply copies the bits. So incrementing a char* would advance it to point to the next 8-bit byte in memory; incrementing a uintptr_t obtained by converting a char* would advance it to point to the next 64-bit word.
That's just one example. More generally, conversions between pointers and integers are implementation-defined, and the language standard makes no guarantee about the semantics of those conversions (other than, in some cases, converting back to a pointer).
So yes, you can convert a pointer value to uintptr_t (if that type exists) and perform arithmetic on it without risking undefined behavior -- but the result may or may not be meaningful.
It happens that, on most systems, the mapping between pointers and integers is simpler, and you probably can get away with that kind of game. But you're better off using pointer arithmetic directly, and just being very careful to avoid any invalid operations.
Yes, that is legal, but you must reinterpret_cast exactly the same uintptr_t value back to char*.
(Therefore, what it you're intending to do is illegal; that is, converting a different value back to a pointer.)
5.2.10 Reinterpret cast
4 . A pointer can be explicitly converted to any integral type large enough to hold it. The mapping function is
implementation-defined.
5 . A value of integral type or enumeration type can be explicitly converted to a pointer. A pointer converted
to an integer of sufficient size (if any such exists on the implementation) and back to the same pointer type
will have its original value;
(Note that there'd be no way, in general, for the compiler to know that you subtracted one and then added it back.)
Is this statement correct? Can any "TYPE" of pointer can point to any other type?
Because I believe so, still have doubts.
Why are pointers declared for definite types? E.g. int or char?
The one explanation I could get was: if an int type pointer was pointing to a char array, then when the pointer is incremented, the pointer will jump from 0 position to the 2 position, skipping 1 position in between (because int size=2).
And maybe because a pointer just holds the address of a value, not the value itself, i.e. the int or double.
Am I wrong? Was that statement correct?
Pointers may be interchangeable, but are not required to be.
In particular, on some platforms, certain types need to be aligned to certain byte-boundaries.
So while a char may be anywhere in memory, an int may need to be on a 4-byte boundary.
Another important potential difference is with function-pointers.
Pointers to functions may not be interchangeable with pointers to data-types on many platforms.
It bears repeating: This is platform-specific.
I believe Intel x86 architectures treat all pointers the same.
But you may well encounter other platforms where this is not true.
Every pointer is of some specific type. There's a special generic pointer type void* that can point to any object type, but you have to convert a void* to some specific pointer type before you can dereference it. (I'm ignoring function pointer types.)
You can convert a pointer value from one pointer type to another. In most cases, converting a pointer from foo* to bar* and back to foo* will yield the original value -- but that's not actually guaranteed in all cases.
You can cause a pointer of type foo* to point to an object of type bar, but (a) it's usually a bad idea, and (b) in some cases, it may not work (say, if the target types foo and bar have different sizes or alignment requirements).
You can get away with things like:
int n = 42;
char *p = (char*)&n;
which causes p to point to n -- but then *p doesn't give you the value of n, it gives you the value of the first byte of n as a char.
The differing behavior of pointer arithmetic is only part of the reason for having different pointer types. It's mostly about type safety. If you have a pointer of type int*, you can be reasonably sure (unless you've done something unsafe) that it actually points to an int object. And if you try to treat it as an object of a different type, the compiler will likely complain about it.
Basically, we have distinct pointer types for the same reasons we have other distinct types: so we can keep track of what kind of value is stored in each object, with help from the compiler.
(There have been languages that only have untyped generic pointers. In such a language, it's more difficult to avoid type errors, such as storing a value of one type and accidentally accessing it as if it were of another type.)
Any pointer can refer to any location in memory, so technically the statement is correct. With that said, you need to be careful when reinterpreting pointer types.
A pointer basically has two pieces of information: a memory location, and the type it expects to find there. The memory location could be anything. It could be the location where an object or value is stored; it could be in the middle of a string of text; or it could just be an arbitrary block of uninitialised memory.
The type information in a pointer is important though. The array and pointer arithmetic explanation in your question is correct -- if you try to iterate over data in memory using a pointer, then the type needs to be correct, otherwise you may not iterate correctly. This is because different types have different sizes, and may be aligned differently.
The type is also important in terms of how data is handled in your program. For example, if you have an int stored in memory, but you access it by dereferencing a float* pointer, then you'll probably get useless results (unless you've programmed it that way for a specific reason). This is because an int is stored in memory differently from the way a float is stored.
Can any "TYPE" of pointer can point to any other type?
Generally no. The types have to be related.
It is possible to use reinterpret_cast to cast a pointer from one type to another, but unless those pointers can be converted legally using a static_cast, the reinterpret_cast is invalid. Hence you can't do Foo* foo = ...; Bar* bar = (Bar*)foo; unless Foo and Bar are actually related.
You can also use reinterpret_cast to cast from an object pointer to a void* and vice versa, and in that sense a void* can point to anything -- but that's not what you seem to be asking about.
Further you can reinterpret_cast from object pointer to integral value and vice versa, but again, not what you appear to be asking.
Finally, a special exception is made for char*. You can initialize a char* variable with the address of any other type, and perform pointer math on the resulting pointer. You still can't dereference thru the pointer if the thing being pointed to isn't actually a char, but it can then be casted back to the actual type and used that way.
Also keep in mind that every time you use reinterpret_cast in any context, you are dancing on the precipice of a cliff. Dereferencing a pointer to a Foo when the thing it actually points to is a Bar yields Undefined Behavior when the types are not related. You would do well to avoid these types of casts at all costs.
Some pointers are more equal than others...
First of all, not all pointers are necessarily the same thing. Function pointers can be something very different from data pointers, for instance.
Aside: Function pointers on PPC
On the PPC platform, this was quite obvious: A function pointer was actually two pointers under the hood, so there was simply no way to meaningfully cast a function pointer to a data pointer or back. I.e. the following would hold:
int* dataP;
int (*functionP)(int);
assert(sizeof(dataP) == 4);
assert(sizeof(functionP) == 8);
assert(sizeof(dataP) != sizeof(functionP));
//impossible:
//dataP = (int*)functionP; //would loose information
//functionP = (int (*)(int))dataP; //part of the resulting pointer would be garbage
Alignment
Furthermore, there is problems with alignment: Depending on the platform some data types may need to be aligned in memory. This is especially common with vector data types, but could apply to any type larger than a byte. For instance, if an int must be 4 byte aligned, the following code might crash:
char a[4];
int* alias = (int*)a;
//int foo = *alias; //may crash because alias is not aligned properly
This is not an issue if the pointer comes from a malloc() call, as that is guaranteed to return sufficiently aligned pointers for all types:
char* a = malloc(sizeof(int));
int* alias = (int*)a;
*alias = 0; //perfectly legal, the pointer is aligned
Strict aliasing and type punning
Finally, there are strict aliasing rules: You must not access an object of one type through a pointer to another type. Type punning is forbidden:
assert(sizeof(float) == sizeof(uint32_t));
float foo = 42;
//uint32_t bits = *(uint32_t*)&foo; //type punning is illegal
If you absolutely must reinterpret a bit pattern as another type, you must use memcpy():
assert(sizeof(float) == sizeof(uint32_t));
float foo = 42;
uint32_t bits;
memcpy(&bits, &foo, sizeof(bits)); //bit pattern reinterpretation is legal when copying the data
To allow memcpy() and friends to actually be implementable, the C/C++ language standards provide for an exception for char types: You can cast any pointer to a char*, copy the char data over to another buffer, and then access that other buffer as some other type. The results are implementation defined, but the standards allow it. Use cases are mostly general data manipulation routines like I/O, etc.
TL;DR:
Pointers are much less interchangeable than you think. Don't reinterpret pointers in any other way than to/from char* (check alignment in the "from" case). And even that does not work for function pointers.
Is the following well-defined:
char* charPtr = new char[42];
int* intPtr = (int*)charPtr;
charPtr++;
intPtr = (int*) charPtr;
The intPtr isn't properly aligned (in at least one of the two cases). Is it illegal just having it there? Is it UB using it at any stage? How can you use it and how can't you?
In general, the result is unspecified (5.2.10p7) if the alignment requirements of int are greater than those of char, which they usually will be. The result will be a valid value of the type int * so it can be e.g. printed as a pointer with operator<< or converted to intptr_t.
Because the result has an unspecified value, unless specified by the implementation it is undefined behaviour to indirect it and perform lvalue-to-rvalue conversion on the resulting int lvalue (except in unevaluated contexts). Converting back to char * will not necessarily round-trip.
However, if the original char * was itself the result of a cast from int *, then the cast to int * counts as the second half of a round trip; in that case, the cast is defined.
In particular, in the case above where the char * was the result of a new[] expression, we are guaranteed (5.3.4p10) that the char * pointer is appropriately aligned for int, as long as sizeof(int) <= 42. Because the new[] expression obtains its storage from an allocation function, 3.7.4.1p2 applies; the void * pointer can be converted
to a pointer of any complete object type with a fundamental alignment requirement and then used to access the object [...] which strongly implies, along with the note to 5.3.4p10, that the same holds for the char * pointer returned by the new[] expression. In this case the int * is a pointer to an uninitialised int object, so performing lvalue-to-rvalue conversion on its indirection is undefined (3.8p6), but assigning to its indirection is fully defined. The int object is in the storage allocated (3.7.4.1p2) so converting the int * back to char * will yield the original value per 1.8p6. This does not hold for the incremented char * pointer as unless sizeof(int) == 1 it is not the address of an int object.
First, of course: the pointer is guaranteed to be aligned in the
first case (by §5.3.4/10 and §3.7.4.1/2), and may be correctly
aligned in both cases. (Obviously, if sizeof(int) == 1, but
even when this is not the case, an implementation doesn't
necessarily have alignment requirements.)
And to make things clear: your casts are all reinterpret_cast.
Beyond that, this is an interesting question, because as far as
I can tell, there is no difference in the two casts, as far as
the standard is concerned. The results of the conversion are
unspecified (according to §5.2.10/7); you're not even guaranteed
that converting it back into a char* will result in the
original value. (It obviously won't, for example, on machines
where int* is smaller than a char*.)
In practice, of course: the standard requires that the return
value of new char[N] be sufficiently aligned for any value
which may fit into it, so you are guaranteed to be able to do:
intPtr = new (charPtr) int;
Which has exactly the same effect as your cast, given that the
default constructor for int is a no-op. (And assuming that
sizeof(int) <= 42.) So it's hard to imagine an implementation
in which the first part fails. You should be able to use the
intPtr just like any other legally obtained intPtr. And the
idea that converting it back to a char* would somehow result
in a different value from the original char* seems
preposterous.
In the second part, all bets are off: you definitely can't
dereference the pointer (unless your implementation guarantees
otherwise), and it's also quite possible that converting it back
to char* results in something different. (Imagine a word
addressed machine, for example, where converting a char* to an
int* rounds up. Then converting back would result in
a char* which was sizeof(int) higher than the original. Or
where an attempt to convert a misaligned pointer always resulted
in a null pointer.)