Should std::byte pointers be used for pointer arithmetic? - c++

It seems std::byte has become the way (in C++17) to work with buffers holding object representations, but it's unclear whether this intent still allows for performing pointer arithmetic.
The question in the title is intentionally phrased as should because I'm looking for a recommendation. For example, void* can be used for pointer arithmetic as a gcc extension, but that is not standard (at least this is true for C), hence a possibility but not a recommendation.
I know the motivation for std::byte is to detach the character and the numeric aspects from the concept of byte. But at the same time, does pointer arithmetic stay?
EDIT: adjusted to clarify that I'm looking to do "pointer arithmetic" using std::byte*, not on the numerical value of pointers stored in std::byte objects

Yes, std::byte* can be used for pointer arithmetic.
And you can even do things like
struct foo { int x, y; };
foo f;
int* ptr_to_y = reinterpret_cast<int*>(reinterpret_cast<std::byte*>(&f) + offsetof(foo, y));
You do have to be careful that your locations are reachable through your operations. Just because pointers-as-integers arithmetic gets the right result doesn't mean that the C++ code has defined behavior. There are a number of quirks in C++ around permitting the optimizer to "know" that a certain value cannot be modified.
struct loc {
    int x, y;
};
void f( int* );
loc work( loc l ) {
    l.x = 3;
    f(&l.y);
    return l;
}
in the above case, someone who used the &l.y pointer to do pointer arithmetic (within f) and modify l.x, regardless of whether they went through std::byte* or not, would be invoking undefined behavior. The compiler is allowed to assume the returned l will have an .x value of 3.
These are not new pitfalls introduced by std::byte*.
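For a concrete (hedged) sketch of what the first statement means in practice, here is byte-wise iteration over an object's representation using std::byte* arithmetic (C++17):
#include <cstddef>
#include <cstdio>

int main() {
    int n = 0x01020304;
    // Walk the object representation of n one byte at a time.
    // (The order of the bytes printed depends on endianness.)
    auto* first = reinterpret_cast<std::byte*>(&n);
    for (std::byte* p = first; p != first + sizeof n; ++p)
        std::printf("%02x ", std::to_integer<unsigned>(*p));
    std::printf("\n");
}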

Related

use of reinterpret_cast while reading value from binary file [duplicate]

I hear that reinterpret_cast is implementation defined, but I don't know what this really means. Can you provide an example of how it can go wrong, and if it goes wrong, is it better to use a C-style cast?
The C-style cast isn't better.
It simply tries the various C++-style casts in order, until it finds one that works. That means that when it acts like a reinterpret_cast, it has the exact same problems as a reinterpret_cast. But in addition, it has these problems:
It can do many different things, and it's not always clear from reading the code which type of cast will be invoked (it might behave like a reinterpret_cast, a const_cast or a static_cast, and those do very different things)
Consequently, changing the surrounding code might change the behaviour of the cast
It's hard to find when reading or searching the code - reinterpret_cast is easy to find, which is good, because casts are ugly and should be paid attention to when used. Conversely, a C-style cast (as in (int)42.0) is much harder to find reliably by searching
To answer the other part of your question, yes, reinterpret_cast is implementation-defined. This means that when you use it to convert from, say, an int* to a float*, then you have no guarantee that the resulting pointer will point to the same address. That part is implementation-defined. But if you take the resulting float* and reinterpret_cast it back into an int*, then you will get the original pointer. That part is guaranteed.
But again, remember that this is true whether you use reinterpret_cast or a C-style cast:
int i;
int* p0 = &i;
float* p1 = (float*)p0; // implementation-defined result
float* p2 = reinterpret_cast<float*>(p0); // implementation-defined result
int* p3 = (int*)p1; // guaranteed that p3 == p0
int* p4 = (int*)p2; // guaranteed that p4 == p0
int* p5 = reinterpret_cast<int*>(p1); // guaranteed that p5 == p0
int* p6 = reinterpret_cast<int*>(p2); // guaranteed that p6 == p0
It is implementation-defined in the sense that the standard (almost) doesn't prescribe how values of different types should look at the bit level, how the address space should be structured, and so on. So it's really very platform-specific for conversions like:
double d;
int &i = reinterpret_cast<int&>(d);
However as standard says
It is intended to be unsurprising to those who know the addressing structure
of the underlying machine.
So if you know what you are doing and how it all looks at the low level, nothing can go wrong.
The C-style cast is somewhat similar in the sense that it can perform a reinterpret_cast, but it also "tries" static_cast first, and it can cast away cv-qualification (while static_cast and reinterpret_cast can't) and perform conversions disregarding access control (see 5.4/4 in the C++11 standard). E.g.:
#include <iostream>
using namespace std;
class A { int x; };
class B { int y; };
class C : A, B { int z; };
int main()
{
    C c;
    // just type pun the pointer to c; the pointer value will remain the same,
    // only its type is different.
    B *b1 = reinterpret_cast<B *>(&c);
    // perform the conversion with the semantics of static_cast<B*>(&c), disregarding
    // that B is an inaccessible base of C; the resulting pointer will point
    // to the B sub-object in c.
    B *b2 = (B*)(&c);
    cout << "reinterpret_cast:\t" << b1 << "\n";
    cout << "C-style cast:\t\t" << b2 << "\n";
    cout << "no cast:\t\t" << &c << "\n";
}
and here is an output from ideone:
reinterpret_cast: 0xbfd84e78
C-style cast: 0xbfd84e7c
no cast: 0xbfd84e78
Note that the value produced by reinterpret_cast is exactly the same as the address of c, while the C-style cast resulted in a correctly offset pointer.
There are valid reasons to use reinterpret_cast, and for these reasons the standard actually defines what happens.
The first is to use opaque pointer types, either for a library API or just to store a variety of pointers in a single array (obviously along with their type). You are allowed to convert a pointer to a suitably sized integer and then back to a pointer and it will be the exact same pointer. For example:
T b;
intptr_t a = reinterpret_cast<intptr_t>( &b );
T * c = reinterpret_cast<T*>(a);
In this code c is guaranteed to point to the object b as you'd expected. Conversion back to a different pointer type is of course undefined (sort of).
Similar conversions are allowed for function pointers and member function pointers, but in the latter case you can cast to/from another member function pointer simply to have a variable that is big enough.
The second case is for using standard-layout types. This is something that was de facto supported prior to C++11 and has now been specified in the standard. In this case the standard treats reinterpret_cast as a static_cast to void* first and then a static_cast to the destination type. This is used a lot when doing binary protocols, where data structures often have the same header information, and it allows you to convert types which have the same layout but differ in C++ class structure.
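A hedged sketch of that second case (the message types here are invented for illustration): a standard-layout struct is pointer-interconvertible with its first member, so a pointer to the whole message can be reinterpreted as a pointer to its header:
#include <cstdint>
#include <cassert>

// Hypothetical wire-format types, made up for this example.
struct Header {
    std::uint16_t type;
    std::uint16_t length;
};
struct PingMessage {
    Header hdr;            // first member of a standard-layout type
    std::uint32_t seq;
};

int main() {
    PingMessage msg{{1, sizeof(PingMessage)}, 42};
    // Behaves like static_cast<void*> followed by static_cast<Header*>:
    Header* h = reinterpret_cast<Header*>(&msg);
    assert(h->type == 1 && h->length == sizeof(PingMessage));
}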
In both of these cases you should use the explicit reinterpret_cast operator rather than the C-Style. Though the C-style would normally do the same thing, it has the danger of being subjected to overloaded conversion operators.
C++ has types, and the only way they normally convert between each other is by well-defined conversion operators that you write. In general, that's all you both need and should use to write your programs.
Sometimes, however, you want to reinterpret the bits that represent a type into something else. This is usually used for very low-level operations and is not something you should typically use. For those cases, you can use reinterpret_cast.
It is implementation defined because the C++ standard does not really say much at all about how things should actually be laid out in memory. That is controlled by your specific implementation of C++. Because of this, the behaviour of reinterpret_cast depends upon how your compiler lays structures out in memory and how it implements reinterpret_cast.
C-style casts are quite similar to reinterpret_casts, but they have much less syntax and are not recommended. The thinking goes that casting is inherently an ugly operation and it requires ugly syntax to inform the programmer that something dubious is happening.
An easy example of how it could go wrong:
std::string a;
double* b;
b = reinterpret_cast<double*>(&a);
*b = 3.4;
That program's behaviour is undefined - a compiler could do anything it likes to that. Most probably, you would get a crash when the string's destructor is called, but who knows! It might just corrupt your stack and cause a crash in an unrelated function.
Both reinterpret_cast and C-style casts are implementation-defined and they do almost the same thing. The differences are:
1. reinterpret_cast cannot remove constness. For example:
const unsigned int d = 5;
int *g=reinterpret_cast< int* >( &d );
will issue an error :
error: reinterpret_cast from type 'const unsigned int*' to type 'int*' casts away qualifiers
2. If you use reinterpret_cast, it is easy to find the places where you did it. That is not possible with C-style casts.
C-style casts sometimes type-pun an object in an unspecified way, such as (unsigned int)-1, sometimes convert the same value to a different format, such as (double)42, sometimes could do either, like how (void*)0xDEADBEEF reinterprets bits but (void*)0 is guaranteed to be a null pointer constant, which does not necessarily have the same object representation as (intptr_t)0, and very rarely tells the compiler to do something like shoot_self_in_foot_with((char*)&const_object);.
That's usually all well and good, but when you want to cast a double to a uint64_t, sometimes you want the value and sometimes you want the bits. If you know C, you know which one the C-style cast does, but it's nicer in some ways to have different syntax for both.
Bjarne Stroustrup, in his guidelines, recommended reinterpret_cast in another context: if you want to type-pun in a way that the language does not define by a static_cast, he suggested that you do it with something like reinterpret_cast<double&>(uint64) rather than the other methods. They're all undefined behavior, but that makes it very explicit what you're doing and that you're doing it on purpose. Reading a different member of a union than you last wrote to does not.
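To make that contrast concrete, a small hedged sketch: the explicit reference-style pun the answer describes (still undefined behavior) alongside the memcpy form, which the standard does define:
#include <cstdint>
#include <cstring>

int main() {
    std::uint64_t bits = 0x4045000000000000ull; // IEEE-754 bit pattern of 42.0

    // Explicit but undefined behavior: reinterpret the integer's storage as a double.
    double punned = reinterpret_cast<double&>(bits);

    // Well-defined alternative: copy the bytes into a double.
    double copied;
    std::memcpy(&copied, &bits, sizeof copied);

    return (punned == copied) ? 0 : 1; // both come out as 42.0 on typical targets
}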

Any type of pointer can point to anything?

Is this statement correct? Can any "TYPE" of pointer point to any other type?
Because I believe so, still have doubts.
Why are pointers declared for definite types? E.g. int or char?
The one explanation I could get was: if an int type pointer was pointing to a char array, then when the pointer is incremented, the pointer will jump from 0 position to the 2 position, skipping 1 position in between (because int size=2).
And maybe because a pointer just holds the address of a value, not the value itself, i.e. the int or double.
Am I wrong? Was that statement correct?
Pointers may be interchangeable, but are not required to be.
In particular, on some platforms, certain types need to be aligned to certain byte-boundaries.
So while a char may be anywhere in memory, an int may need to be on a 4-byte boundary.
Another important potential difference is with function-pointers.
Pointers to functions may not be interchangeable with pointers to data-types on many platforms.
It bears repeating: This is platform-specific.
I believe Intel x86 architectures treat all pointers the same.
But you may well encounter other platforms where this is not true.
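A small sketch of how you could inspect those platform properties yourself (C++11):
#include <cstdio>

int main() {
    // Alignment requirements differ per type and per platform.
    std::printf("alignof(char)   = %zu\n", alignof(char));    // always 1
    std::printf("alignof(int)    = %zu\n", alignof(int));     // commonly 4
    std::printf("alignof(double) = %zu\n", alignof(double));  // commonly 8

    // Data pointers and function pointers need not even be the same size.
    std::printf("sizeof(void*) = %zu, sizeof(void(*)()) = %zu\n",
                sizeof(void*), sizeof(void(*)()));
}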
Every pointer is of some specific type. There's a special generic pointer type void* that can point to any object type, but you have to convert a void* to some specific pointer type before you can dereference it. (I'm ignoring function pointer types.)
You can convert a pointer value from one pointer type to another. In most cases, converting a pointer from foo* to bar* and back to foo* will yield the original value -- but that's not actually guaranteed in all cases.
You can cause a pointer of type foo* to point to an object of type bar, but (a) it's usually a bad idea, and (b) in some cases, it may not work (say, if the target types foo and bar have different sizes or alignment requirements).
You can get away with things like:
int n = 42;
char *p = (char*)&n;
which causes p to point to n -- but then *p doesn't give you the value of n, it gives you the value of the first byte of n as a char.
The differing behavior of pointer arithmetic is only part of the reason for having different pointer types. It's mostly about type safety. If you have a pointer of type int*, you can be reasonably sure (unless you've done something unsafe) that it actually points to an int object. And if you try to treat it as an object of a different type, the compiler will likely complain about it.
Basically, we have distinct pointer types for the same reasons we have other distinct types: so we can keep track of what kind of value is stored in each object, with help from the compiler.
(There have been languages that only have untyped generic pointers. In such a language, it's more difficult to avoid type errors, such as storing a value of one type and accidentally accessing it as if it were of another type.)
Any pointer can refer to any location in memory, so technically the statement is correct. With that said, you need to be careful when reinterpreting pointer types.
A pointer basically has two pieces of information: a memory location, and the type it expects to find there. The memory location could be anything. It could be the location where an object or value is stored; it could be in the middle of a string of text; or it could just be an arbitrary block of uninitialised memory.
The type information in a pointer is important though. The array and pointer arithmetic explanation in your question is correct -- if you try to iterate over data in memory using a pointer, then the type needs to be correct, otherwise you may not iterate correctly. This is because different types have different sizes, and may be aligned differently.
The type is also important in terms of how data is handled in your program. For example, if you have an int stored in memory, but you access it by dereferencing a float* pointer, then you'll probably get useless results (unless you've programmed it that way for a specific reason). This is because an int is stored in memory differently from the way a float is stored.
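As a quick illustration of the arithmetic point (a hedged sketch): incrementing an int* and a char* into the same buffer advances them by different numbers of bytes.
#include <cstdio>

int main() {
    int data[3] = {10, 20, 30};

    int*  ip = data;
    char* cp = reinterpret_cast<char*>(data);

    // ++ advances by sizeof(*p): sizeof(int) bytes for ip, 1 byte for cp.
    ++ip;
    ++cp;

    std::printf("int* moved %td bytes, char* moved %td bytes\n",
                reinterpret_cast<char*>(ip) - reinterpret_cast<char*>(data),
                cp - reinterpret_cast<char*>(data));
}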
Can any "TYPE" of pointer can point to any other type?
Generally no. The types have to be related.
It is possible to use reinterpret_cast to cast a pointer from one type to another, but unless those pointers can be converted legally using a static_cast, the reinterpret_cast is invalid. Hence you can't do Foo* foo = ...; Bar* bar = (Bar*)foo; unless Foo and Bar are actually related.
You can also use reinterpret_cast to cast from an object pointer to a void* and vice versa, and in that sense a void* can point to anything -- but that's not what you seem to be asking about.
Further you can reinterpret_cast from object pointer to integral value and vice versa, but again, not what you appear to be asking.
Finally, a special exception is made for char*. You can initialize a char* variable with the address of any other type, and perform pointer math on the resulting pointer. You still can't dereference thru the pointer if the thing being pointed to isn't actually a char, but it can then be casted back to the actual type and used that way.
Also keep in mind that every time you use reinterpret_cast in any context, you are dancing on the precipice of a cliff. Dereferencing a pointer to a Foo when the thing it actually points to is a Bar yields Undefined Behavior when the types are not related. You would do well to avoid these types of casts at all costs.
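A hedged sketch of the char* exception mentioned above: take an object's address as char*, do the pointer math, then cast back to the real type before dereferencing.
#include <cstdio>

int main() {
    int values[2] = {11, 22};

    // char* may point into any object and supports byte-wise pointer math...
    char* p = reinterpret_cast<char*>(&values[0]);
    p += sizeof(int);                    // step over the first int

    // ...but dereference through the actual type, after casting back.
    int* q = reinterpret_cast<int*>(p);
    std::printf("%d\n", *q);             // prints 22
}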
Some pointers are more equal than others...
First of all, not all pointers are necessarily the same thing. Function pointers can be something very different from data pointers, for instance.
Aside: Function pointers on PPC
On the PPC platform, this was quite obvious: A function pointer was actually two pointers under the hood, so there was simply no way to meaningfully cast a function pointer to a data pointer or back. I.e. the following would hold:
int* dataP;
int (*functionP)(int);
assert(sizeof(dataP) == 4);
assert(sizeof(functionP) == 8);
assert(sizeof(dataP) != sizeof(functionP));
//impossible:
//dataP = (int*)functionP; //would lose information
//functionP = (int (*)(int))dataP; //part of the resulting pointer would be garbage
Alignment
Furthermore, there are problems with alignment: Depending on the platform, some data types may need to be aligned in memory. This is especially common with vector data types, but could apply to any type larger than a byte. For instance, if an int must be 4-byte aligned, the following code might crash:
char a[4];
int* alias = (int*)a;
//int foo = *alias; //may crash because alias is not aligned properly
This is not an issue if the pointer comes from a malloc() call, as that is guaranteed to return sufficiently aligned pointers for all types:
char* a = malloc(sizeof(int));
int* alias = (int*)a;
*alias = 0; //perfectly legal, the pointer is aligned
Strict aliasing and type punning
Finally, there are strict aliasing rules: You must not access an object of one type through a pointer to another type. Type punning is forbidden:
assert(sizeof(float) == sizeof(uint32_t));
float foo = 42;
//uint32_t bits = *(uint32_t*)&foo; //type punning is illegal
If you absolutely must reinterpret a bit pattern as another type, you must use memcpy():
assert(sizeof(float) == sizeof(uint32_t));
float foo = 42;
uint32_t bits;
memcpy(&bits, &foo, sizeof(bits)); //bit pattern reinterpretation is legal when copying the data
To allow memcpy() and friends to actually be implementable, the C/C++ language standards provide for an exception for char types: You can cast any pointer to a char*, copy the char data over to another buffer, and then access that other buffer as some other type. The results are implementation defined, but the standards allow it. Use cases are mostly general data manipulation routines like I/O, etc.
TL;DR:
Pointers are much less interchangeable than you think. Don't reinterpret pointers in any other way than to/from char* (check alignment in the "from" case). And even that does not work for function pointers.

Why can I cast int and BOOL to void*, but not float?

void* is a useful feature of C and derivative languages. For example, it's possible to use void* to store objective-C object pointers in a C++ class.
I was working on a type conversion framework recently and due to time constraints was a little lazy - so I used void*... That's how this question came up:
Why can I typecast int to void*, but not float to void* ?
BOOL is not a C++ type. It's probably typedef'd or defined somewhere, and in those cases it would be the same as int. Windows, for example, has this in Windef.h:
typedef int BOOL;
so your question reduces to, why can you typecast int to void*, but not float to void*?
int to void* is ok but generally not recommended (and some compilers will warn about it) because they are inherently the same in representation. A pointer is basically an integer that points to an address in memory.
float to void* is not ok because the interpretation of the float value and the actual bits representing it are different. For example, if you do:
float x = 1.0;
what it does is it sets the 32 bit memory to 00 00 80 3f (the actual representation of the float value 1.0 in IEEE single precision). When you cast a float to a void*, the interpretation is ambiguous. Do you mean the pointer that points to location 1 in memory? or do you mean the pointer that points to location 3f800000 (assuming little endian) in memory?
Of course, if you are sure which of the two cases you want, there is always a way to get around the problem. For example:
void* u = (void*)((int)x); // first case
void* u = (void*)(((unsigned short*)(&x))[0] | (((unsigned int)((unsigned short*)(&x))[1]) << 16)); // second case
Pointers are usually represented internally by the machine as integers. C allows you to cast back and forth between pointer type and integer type. (A pointer value may be converted to an integer large enough to hold it, and back.)
Using void* to hold integer values is unconventional. It's not guaranteed by the language to work, but if you want to be sloppy and constrain yourself to Intel and other commonplace platforms, it will basically scrape by.
Effectively what you're doing is using void* as a generic container of however many bytes are used by the machine for pointers. This differs between 32-bit and 64-bit machines. So converting long long to void* would lose bits on a 32-bit platform.
As for floating-point numbers, the intention of (void*) 10.5f is ambiguous. Do you want to round 10.5 to an integer, then convert that to a nonsense pointer? No, you want the bit-pattern used by the FPU to be placed into a nonsense pointer. This can be accomplished by something like float f = 10.5f; void *vp = (void *)(uintptr_t)*(uint32_t *)&f;, but be warned that this is just nonsense: pointers aren't generic storage for bits.
The best generic storage for bits is char arrays, by the way. The language standards guarantee that memory can be manipulated through char*. But you have to mind data alignment requirements.
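A brief sketch of that suggestion: use a char array as the storage, copy bytes in and out with memcpy, and use alignas if you ever want to access the buffer in place.
#include <cstring>
#include <cstdio>

int main() {
    float f = 10.5f;

    // A char array is legitimate generic storage for the bits;
    // alignas covers the alignment concern if the buffer is reused in place.
    alignas(float) char storage[sizeof(float)];
    std::memcpy(storage, &f, sizeof f);

    float g;
    std::memcpy(&g, storage, sizeof g);
    std::printf("%f\n", g);   // 10.500000
}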
The standard says (752): "An integer may be converted to any pointer type." It doesn't say anything about float-to-pointer conversion.
If you want to pass a float value as a void*, there is a workaround using type punning.
Here is an example;
struct mfloat {
    union {
        float fvalue;
        int ivalue;
    };
};

void print_float(void *data)
{
    struct mfloat mf;
    mf.ivalue = (int)data;
    printf("%.2f\n", mf.fvalue);
}

struct mfloat mf;
mf.fvalue = 1.99f;
print_float((void *)(mf.ivalue));
We have used a union to pun our float value (fvalue) as an integer (ivalue), which is then cast to void* and back.
The question is based on a false premise, namely that void * is somehow a "generic" or "catch-all" type in C or C++. It is not. It is a generic object pointer type, meaning that it can safely store pointers to any type of data, but it cannot itself contain any type of data.
You could use a void * pointer to generically manipulate data of any type by allocating sufficient memory to hold an object of any given type, then using a void * pointer to point to it. In some cases you could also use a union, which is of course designed to be able to contain objects of multiple types.
Now, because pointers can be thought of as integers (and indeed, on conventionally-addressed architectures, typically are integers), it is possible and in some circles fashionable to stuff an integer into a pointer. Some library APIs have even documented and supported this usage; one notable example was X Windows.
Conversions between pointers and integers are implementation-defined, and these days typically draw warnings, and so typically require an explicit cast, not so much to force the conversion as simply to silence the warning. For example, both the code fragments below print 77, but the first one probably draws compiler warnings.
/* fragment 1: */
int i = 77;
void *p = i;
int j = p;
printf("%d\n", j);
/* fragment 2: */
int i = 77;
void *p = (void *)(uintptr_t)i;
int j = (int)p;
printf("%d\n", j);
In both cases, we are not really using the void * pointer p as a pointer at all: we are merely using it as a vessel for some bits. This relies on the fact that on a conventionally-addressed architecture, the implementation-defined behavior of a pointer/integer conversion is the obvious one, which to an assembly-language programmer or an old-school C programmer doesn't seem like a "conversion" at all. And if you can stuff an int into a pointer, it's not surprising if you can stuff in other integral types, like bool, as well.
But what about trying to stuff a floating-point value into a pointer? That's considerably more problematic. Stuffing an integer value into a pointer, though implementation-defined, makes perfect sense if you're doing bare-metal programming: you're taking the numeric value of the integer, and using it as a memory address. But what would it mean to try to stuff a floating-point value into a pointer?
It's so meaningless that the C Standard doesn't even label it "undefined".
It's so meaningless that a typical compiler won't even attempt it.
And if you think about it, it's not even obvious what it should do.
Would you want to use the numeric value, or the bit pattern, as the thing to try to stuff into the pointer? Stuffing in the numeric value is closer to how floating-point-to-integer conversions work, but you'd lose your fractional part. Using the bit pattern is what you'd probably want, but accessing the bit pattern of a floating-point value is never something that C makes easy, as generations of programmers who have attempted things like
uint32_t hexval = (uint32_t)3.0;
have discovered.
Nevertheless, if you were bound and determined to store a floating-point value in a void * pointer, you could probably accomplish it, using sufficiently brute-force casts, although the results are probably both undefined and machine-dependent. (That is, I think there's a strict aliasing violation here, and if pointers are bigger than floats, as of course they are on a 64-bit architecture, I think this will probably only work if the architecture is little-endian.)
float f = 77.75;
void *p = (void *)(uintptr_t)*(uint32_t *)&f;
float f2 = *(float *)&p;
printf("%f\n", f2);
dmr help me, this actually does print 77.75 on my machine.

Does pointer arithmetic have uses outside of arrays?

I think I understand the semantics of pointer arithmetic fairly well, but I only ever see examples when dealing with arrays. Does it have any other uses that can't be achieved by less opaque means? I'm sure you could find a way with clever casting to use it to access members of a struct, but I'm not sure why you'd bother. I'm mostly interested in C, but I'll tag with C++ because the answer probably applies there too.
Edit, based on answers received so far: I know pointers can be used in many non-array contexts. I'm specifically wondering about arithmetic on pointers, e.g. incrementing, taking a difference, etc.
Pointer arithmetic by definition in C happens only on arrays. However, as every object has a representation consisting of an overlaid unsigned char [sizeof object] array, it's also valid to perform pointer arithmetic on this representation. For example:
struct foo {
int a, b, c;
} bar;
/* Equivalent to: bar.c = 1; */
*(int *)((unsigned char *)&bar + offsetof(struct foo, c)) = 1;
Actually char * would work just as well.
If you follow the language standard to the letter, then pointer arithmetic is only defined when pointing to an array, and not in any other case.
A pointer may point to any element of an array, or one step past the end of the array.
Off the top of my head, I know it's used in XOR linked lists (very nifty; see the sketch below) and I've seen it used in very hacky recursions.
On the other hand, it's very hard to find uses, since according to the standard pointer arithmetic is only defined within the bounds of an array.
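For the curious, here is a minimal, purely illustrative sketch of the XOR linked-list trick: each node stores prev XOR next as a single integer, so traversal needs the address of the node you just came from.
#include <cstdint>
#include <cstdio>

struct Node {
    int value;
    std::uintptr_t link;   // XOR of the previous and next nodes' addresses
};

static std::uintptr_t ptoi(Node* p) { return reinterpret_cast<std::uintptr_t>(p); }
static Node* itop(std::uintptr_t i) { return reinterpret_cast<Node*>(i); }

int main() {
    Node a{1, 0}, b{2, 0}, c{3, 0};
    a.link = 0 ^ ptoi(&b);              // head: "previous" is null
    b.link = ptoi(&a) ^ ptoi(&c);
    c.link = ptoi(&b) ^ 0;              // tail: "next" is null

    // Walk forward from a, remembering the previous node's address.
    std::uintptr_t prev = 0;
    for (Node* cur = &a; ; ) {
        std::printf("%d\n", cur->value);
        std::uintptr_t next = prev ^ cur->link;
        if (next == 0) break;           // reached the tail
        prev = ptoi(cur);
        cur = itop(next);
    }
}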
a[n] is "just" syntactic sugar for *(a + n). For lulz, try the following
int a[2];
0[a] = 10;
1[a] = 20;
So one could argue that indexing and pointer arithmetic are merely interchangeable syntax.
Pointer arithmetic is only defined on arrays. Adding an integer to a pointer that does not point to an array element produces undefined behavior.
In embedded systems, pointers are used to represent addresses or locations. There may not be an array defined. (Although one could say that all of memory is one huge array.)
For example, a stack (holding variables and addresses) is manipulated by adding or subtracting values from the stack pointer. (In this case, the stack could be said to be an array based stack.)
Here's a case for pointer arithmetic outside of (strictly defined) arrays:
double d = 0.5;
unsigned char *bytes = (void *)&d;
for(size_t i = 0; i < sizeof d; i++)
printf("Byte %zu of d is %hhu\n", i, bytes[i]);
Why would you do this? I don't know. But if you want to look at the bitwise representation of an object (useful for things like memcpy and memcmp), you'll need to cast their addresses to unsigned char *s (or signed char *s if you like) and work with them byte-by-byte. (If your task isn't too difficult you can even write the code to work word-by-word, which most memcpy implementations will do. It's the same principle, though, just replace char with int32_t.)
Note that, in the standard, the exact values (or the number of values) that are printed are implementation-defined, but that this will always work as a way to access an object's internal bytewise representation. (It is not required to work for larger integer types, but almost always will - no processor I know of has had trap representations for integers in quite some time).

C++: Is it safe to cast pointer to int and later back to pointer again?

Is it safe to cast pointer to int and later back to pointer again?
How about if we know if the pointer is 32 bit long and int is 32 bit long?
long* juggle(long* p) {
    static_assert(sizeof(long*) == sizeof(int));
    int v = reinterpret_cast<int>(p); // or if sizeof(*)==8 choose long here
    do_some_math(v); // prevent compiler from optimizing
    return reinterpret_cast<long*>(v);
}

int main() {
    long* stuff = new long(42);
    long* ffuts = juggle(stuff);
    std::cout << "Is this always 42? " << *ffuts << std::endl;
}
Is this covered by the Standard?
No.
For instance, on x86-64, a pointer is 64-bit long, but int is only 32-bit long. Casting a pointer to int and back again makes the upper 32-bit of the pointer value lost.
You may use the intptr_t type in <cstdint> if you want an integer type which is guaranteed to be as long as the pointer. You could safely reinterpret_cast from a pointer to an intptr_t and back.
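For instance, a hedged sketch of the question's juggle function rewritten with std::intptr_t, so the round trip is guaranteed to return the original pointer:
#include <cstdint>
#include <iostream>

long* juggle(long* p) {
    // intptr_t is wide enough for any object pointer, so converting
    // to it and back yields the original pointer value.
    std::intptr_t v = reinterpret_cast<std::intptr_t>(p);
    return reinterpret_cast<long*>(v);
}

int main() {
    long stuff = 42;
    std::cout << "Always 42? " << *juggle(&stuff) << std::endl;
}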
Yes, if... (or "Yes, but...") and no otherwise.
The standard specifies (3.7.4.3) the following:
A pointer value is a safely-derived pointer [...] if it is the result of a well-defined pointer conversion or reinterpret_cast of a safely-derived pointer value [or] the result of a reinterpret_cast of an integer representation of a safely-derived pointer value
An integer value is an integer representation of a safely-derived pointer [...] if its type is at least as large as std::intptr_t and [...] the result of a reinterpret_cast of a safely-derived pointer value [or] the result of a valid conversion of an integer representation of a safely-derived pointer value [or] the result of an additive or bitwise operation, one of whose operands is an integer representation of a safely-derived pointer value
A traceable pointer object is [...] an object of an integral type that is at least as large as std::intptr_t
The standard further states that implementations may be relaxed or may be strict about enforcing safely-derived pointers. Which means it is unspecified whether using or dereferencing a not-safely-derived pointer invokes undefined behavior (that's a funny thing to say!)
Which altogether means no more and no less than "something different might work anyway, but the only safe thing is as specified above".
Therefore, if you either use std::intptr_t in the first place (the preferable thing to do!) or if you know that the storage size of whatever integer type you use (say, long) is at least the size of std::intptr_t, then it is allowable and well-defined (i.e. "safe") to cast to your integer type and back. The standard guarantees that.
If that's not the case, the conversion from pointer to integer representation will probably (or at least possibly) lose some information, and the conversion back will not give a valid pointer. Or, it might by accident, but this is not guaranteed.
An interesting anecdote is that the C++ standard does not directly define std::intptr_t at all; it merely says "the same as 7.18 in the C standard".
The C standard, on the other hand, states "designates a signed integer type with the property that any valid pointer to void can be converted to this type, then converted back to pointer to void, and the result will compare equal to the original pointer".
Which means, without the rather complicated definitions above (in particular the last bit of the first bullet point), it wouldn't be allowable to convert to/from anything but void*.
Yes and no.
The language specification explicitly states that it is safe (meaning that in the end you will get the original pointer value) as long as the size of the integral type is sufficient to store the [implementation-dependent] integral representation of the pointer.
So, in the general case it is not "safe", since in the general case int can easily turn out to be too small. In your specific case, though, it might be safe, since your int might be sufficiently large to store your pointer.
Normally, when you need to do something like that, you should use the intptr_t/uintptr_t types, which are specifically introduced for that purpose. Unfortunately, intptr_t/uintptr_t are not part of the current C++ standard (they are standard C99 types), but many implementations provide them nevertheless. You can always define these types yourself, of course.
In general, no; pointers may be larger than int, in which case there's no way to reconstruct the value.
If an integer type is known to be large enough, then you can; according to the Standard (5.2.10/5):
A pointer converted to an integer of sufficient size ... and back to the same pointer type will have its original value
However, in C++03, there's no standard way to tell which integer types are large enough. C++11 and C99 (and hence in practice most C++03 implementations), and also Boost.Integer, define intptr_t and uintptr_t for this purpose. Or you could define your own type and assert (preferably at compile time) that it's large enough; or, if you don't have some special reason for it to be an integer type, use void*.
Is it safe? Not really.
In most circumstances, will it work? Yes
Certainly if an int is too small to hold the full pointer value and truncates, you won't get your original pointer back (hopefully your compiler will warn you about this case, with GCC truncating conversions from pointer to integers are hard errors). A long, or uintptr_t if your library supports it, may be better choices.
Even if your integer type and pointer types are the same size, it will not necessarily work depending on your application runtime. In particular, if you're using a garbage collector in your program it might easily decide that the pointer is no longer outstanding, and when you later cast your integer back to a pointer and try to dereference it, you'll find out the object was already reaped.
Absolutely not. Doing so makes the bad assumption that the size of an int and a pointer are the same. This is almost never the case on 64-bit platforms. If they are not the same, precision loss will occur and the final pointer value will be incorrect.
MyType* pValue = ...
int stored = (int)pValue; // Just lost the upper 4 bytes on a 64 bit platform
pValue = (MyType*)stored; // pValue is now invalid
pValue->SomeOp(); // Kaboom
No, it is not (always) safe (thus not safe in general). And it is covered by the standard.
ISO C++ 2003, 5.2.10:
A pointer can be explicitly converted to any integral type large enough to hold it. The mapping function is implementation-defined.
A value of integral type or enumeration type can be explicitly converted to a pointer. A pointer converted to an integer of sufficient size (if any such exists on the implementation) and back to the same pointer type will have its original value; mappings between pointers and integers are otherwise implementation-defined.
(The above emphases are mine.)
Therefore, if you know that the sizes are compatible, then the conversion is safe.
#include <iostream>

// C++03 static_assert.
#define ASSURE(cond) typedef int ASSURE[(cond) ? 1 : -1]

// Assure that the sizes are compatible.
ASSURE(sizeof (int) >= sizeof (char*));

int main() {
    char c = 'A';
    char *p = &c;
    // If this program compiles, it is well formed.
    int i = reinterpret_cast<int>(p);
    p = reinterpret_cast<char*>(i);
    std::cout << *p << std::endl;
}
Use uintptr_t from "stdint.h" or from "boost/stdint.h". It is guaranteed to have enough storage for a pointer.
No, it is not. Even if we rule out the architecture issue, the sizes of a pointer and an integer can differ. A pointer can be of three types on some platforms: near, far, and huge, and they have different sizes. An integer is normally 16 or 32 bits. So casting integers into pointers and vice versa is not safe. Utmost care has to be taken, as there is a very real chance of precision loss. In most cases an integer will be short of space to store a pointer, resulting in loss of value.
If you're going to be doing any system-portable casting, you need to use something like Microsoft's INT_PTR/UINT_PTR; the safety after that relies on the target platforms and what you intend to do with the INT_PTR. Generally, for most arithmetic, char* or uint8_t* works better while being typesafe(ish).
To an int? Not always. If you are on a 64-bit machine, then int is only 4 bytes, whereas pointers are 8 bytes long, and thus you would end up with a different pointer when you cast it back from the int.
There are, however, ways to get around this. You can simply use an 8-byte data type, which would work whether you are on a 32- or 64-bit system, such as unsigned long long (unsigned because you don't want sign extension on 32-bit systems).
It is important to note that on Linux unsigned long will always be pointer size* so if you are targeting Linux systems you could just use that.
*According to cppreference; I also tested it myself, though not on all Linux and Linux-like systems
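If you rely on that, a one-line compile-time check keeps the assumption honest (sketch):
// Holds on typical Linux/LP64 targets; fails e.g. on 64-bit Windows (LLP64).
static_assert(sizeof(unsigned long) == sizeof(void*),
              "unsigned long is not pointer-sized on this platform");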
If the issue is that you want to do normal math on it, probably the safest thing to do would be to cast it to a pointer to char (or better yet, uint8_t*), do your math, and then cast it back.