void* is a useful feature of C and derivative languages. For example, it's possible to use void* to store Objective-C object pointers in a C++ class.
I was working on a type conversion framework recently and due to time constraints was a little lazy - so I used void*... That's how this question came up:
Why can I typecast int to void*, but not float to void* ?
BOOL is not a C++ type. It's probably typedef'd or #define'd somewhere, and in that case it would be the same as int. Windows, for example, has this in Windef.h:
typedef int BOOL;
so your question reduces to, why can you typecast int to void*, but not float to void*?
int to void* is permitted, because a pointer is basically an integer that holds an address in memory, so the two are similar in representation. It is generally not recommended, though, and some compilers will warn about it.
float to void* is not ok because the interpretation of the float value and the actual bits representing it are different. For example, if you do:
float x = 1.0;
what it does is it sets the 32 bit memory to 00 00 80 3f (the actual representation of the float value 1.0 in IEEE single precision). When you cast a float to a void*, the interpretation is ambiguous. Do you mean the pointer that points to location 1 in memory? or do you mean the pointer that points to location 3f800000 (assuming little endian) in memory?
Of course, if you are sure which of the two cases you want, there is always a way to get around the problem. For example:
void* u = (void*)((int)x); // first case
void* u = (void*)(((unsigned short*)(&x))[0] | (((unsigned int)((unsigned short*)(&x))[1]) << 16)); // second case
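The two readings can also be written without casting the float's address to another pointer type. This is only a minimal sketch, not the answerer's code: the function names are made up for illustration, and it assumes a 32-bit IEEE single-precision float, as the answer above does.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes a 32-bit float");

// First case: use the numeric value, dropping any fractional part.
std::uintptr_t address_from_value(float x) {
    assert(x >= 0.0f);  // a negative value has no sensible address
    return static_cast<std::uintptr_t>(x);
}

// Second case: use the raw bit pattern, copied out with memcpy.
std::uintptr_t address_from_bits(float x) {
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    return static_cast<std::uintptr_t>(bits);
}
```

Either result could then be turned into a void* with reinterpret_cast, with the same caveat as before: the pointer is only a container for the number, not something to dereference.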
Pointers are usually represented internally by the machine as integers. C allows you to cast back and forth between pointer type and integer type. (A pointer value may be converted to an integer large enough to hold it, and back.)
Using void* to hold integer values is unconventional. It's not guaranteed by the language to work, but if you want to be sloppy and constrain yourself to Intel and other commonplace platforms, it will basically scrape by.
Effectively what you're doing is using void* as a generic container of however many bytes are used by the machine for pointers. This differs between 32-bit and 64-bit machines. So converting long long to void* would lose bits on a 32-bit platform.
As for floating-point numbers, the intention of (void*) 10.5f is ambiguous. Do you want to round 10.5 to an integer, then convert that to a nonsense pointer? No, you want the bit-pattern used by the FPU to be placed into a nonsense pointer. This can be accomplished by assigning float f = 10.5f; void *vp = (void*)(uintptr_t) *(uint32_t*) &f;, but be warned that this is just nonsense: pointers aren't generic storage for bits.
The best generic storage for bits is char arrays, by the way. The language standards guarantee that memory can be manipulated through char*. But you have to mind data alignment requirements.
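As a rough sketch of the char-array approach (the function name is invented for this example), a float can be copied into an unsigned char buffer and back with memcpy. Going through memcpy also sidesteps the alignment concern, since the buffer is never accessed as a float.

```cpp
#include <cassert>
#include <cstring>

// Store a float's bytes in a plain byte buffer, then read them back.
float round_trip_through_bytes(float value) {
    unsigned char storage[sizeof(float)];        // generic storage for bits
    std::memcpy(storage, &value, sizeof value);  // float -> bytes

    float restored;
    std::memcpy(&restored, storage, sizeof restored);  // bytes -> float
    return restored;
}
```

For example, round_trip_through_bytes(10.5f) gives back 10.5f, because the bytes are copied unchanged in both directions.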
The Standard says that "An integer may be converted to any pointer type" (752). It doesn't say anything about pointer-float conversion.
If you want to pass a float value as a void *, there is a workaround using type punning.
Here is an example:
struct mfloat {
union {
float fvalue;
int ivalue;
};
};
void print_float(void *data)
{
struct mfloat mf;
mf.ivalue = (int)data;
printf("%.2f\n", mf.fvalue);
}
struct mfloat mf;
mf.fvalue = 1.99f;
print_float((void *)(mf.ivalue));
Here we have used a union to convert our float value (fvalue) to an integer (ivalue), which is then cast to void*, and vice versa.
The question is based on a false premise, namely that void * is somehow a "generic" or "catch-all" type in C or C++. It is not. It is a generic object pointer type, meaning that it can safely store pointers to any type of data, but it cannot itself contain any type of data.
You could use a void * pointer to generically manipulate data of any type by allocating sufficient memory to hold an object of any given type, then using a void * pointer to point to it. In some cases you could also use a union, which is of course designed to be able to contain objects of multiple types.
Now, because pointers can be thought of as integers (and indeed, on conventionally-addressed architectures, typically are integers) it is possible and in some circles fashionable to stuff an integer into a pointer. Some library APIs have even documented and supported this usage; one notable example was X Windows.
Conversions between pointers and integers are implementation-defined, and these days typically draw warnings, and so typically require an explicit cast, not so much to force the conversion as simply to silence the warning. For example, both the code fragments below print 77, but the first one probably draws compiler warnings.
/* fragment 1: */
int i = 77;
void *p = i;
int j = p;
printf("%d\n", j);
/* fragment 2: */
int i = 77;
void *p = (void *)(uintptr_t)i;
int j = (int)(uintptr_t)p;
printf("%d\n", j);
In both cases, we are not really using the void * pointer p as a pointer at all: we are merely using it as a vessel for some bits. This relies on the fact that on a conventionally-addressed architecture, the implementation-defined behavior of a pointer/integer conversion is the obvious one, which to an assembly-language programmer or an old-school C programmer doesn't seem like a "conversion" at all. And if you can stuff an int into a pointer, it's not surprising if you can stuff in other integral types, like bool, as well.
But what about trying to stuff a floating-point value into a pointer? That's considerably more problematic. Stuffing an integer value into a pointer, though implementation-defined, makes perfect sense if you're doing bare-metal programming: you're taking the numeric value of the integer, and using it as a memory address. But what would it mean to try to stuff a floating-point value into a pointer?
It's so meaningless that the C Standard doesn't even label it "undefined".
It's so meaningless that a typical compiler won't even attempt it.
And if you think about it, it's not even obvious what it should do.
Would you want to use the numeric value, or the bit pattern, as the thing to try to stuff into the pointer? Stuffing in the numeric value is closer to how floating-point-to-integer conversions work, but you'd lose your fractional part. Using the bit pattern is what you'd probably want, but accessing the bit pattern of a floating-point value is never something that C makes easy, as generations of programmers who have attempted things like
uint32_t hexval = (uint32_t)3.0;
have discovered.
Nevertheless, if you were bound and determined to store a floating-point value in a void * pointer, you could probably accomplish it, using sufficiently brute-force casts, although the results are probably both undefined and machine-dependent. (That is, I think there's a strict aliasing violation here, and if pointers are bigger than floats, as of course they are on a 64-bit architecture, I think this will probably only work if the architecture is little-endian.)
float f = 77.75;
void *p = (void *)(uintptr_t)*(uint32_t *)&f;
float f2 = *(float *)&p;
printf("%f\n", f2);
dmr help me, this actually does print 77.75 on my machine.
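A variant that avoids both the aliasing violation and the endianness worry is sketched below. It is only an illustration under stated assumptions: the helper names are made up, float is assumed to be 32 bits wide, and the integer-to-pointer-to-integer round trip is assumed to preserve the value, which holds on conventionally-addressed platforms but is not guaranteed by the standard.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes a 32-bit float");
static_assert(sizeof(std::uintptr_t) >= sizeof(std::uint32_t),
              "pointer-sized integer must be able to hold 32 bits");

// Copy the float's bits into an integer, then convert that integer to a pointer.
void* float_to_pointer(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return reinterpret_cast<void*>(static_cast<std::uintptr_t>(bits));
}

// Reverse the steps: pointer -> integer -> bits -> float.
float pointer_to_float(void* p) {
    std::uint32_t bits = static_cast<std::uint32_t>(reinterpret_cast<std::uintptr_t>(p));
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}
```

Because the conversion goes through the integer's value rather than through the pointer's memory, it does not matter whether the machine is big- or little-endian.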
Since my Numerical Analysis course exam is near, I was searching for code that implements the representation of floating point numbers in C/C++. Then I found a line in one of the code samples on GitHub. Can you please tell me what the second line in the code snippet below means, and how and why it is important?
float x = ...;
unsigned u = *(unsigned*)&x;
unsigned is just short for unsigned int and using C++-style casts the line would translate to
unsigned int u = *reinterpret_cast<unsigned int*>(&x);
However, read below for why this causes undefined behavior in either case.
(I recommend not using C-style casts like the one in the question, since it is not obvious which C++-style cast they resolve to.)
If x is a float variable, then the line is trying to reinterpret the object representation of the float variable as the object representation of an unsigned int, basically to reinterpret the float's memory as the memory of an unsigned int, and then stores the unsigned int value corresponding to that representation in u.
Step for step, &x is a pointer to x of type float*, reinterpret_cast<unsigned int*>(&x) is a pointer to x, but now of type unsigned int*. And then *reinterpret_cast<unsigned int*>(&x) is supposed to dereference that unsigned int* pointer to the float variable to retrieve an unsigned int value from the pointed-to memory location as if the bytes stored there represented an unsigned int value instead of a float value. Finally unsigned int u = is supposed to use that value to initialize u with it.
That causes undefined behavior because it is an aliasing violation to access a float object through an unsigned int* pointer. Some compilers have options which can be enabled to allow this (under the assumption that float and unsigned int have compatible size and alignment), but it is not permitted by the standard C++ language itself.
Generally, whenever you see reinterpret_cast (or a C-style cast that might resolve to a reinterpret_cast), you are likely to cause undefined behavior if you don't know exactly what you are doing.
Since C++20 the correct way to do this without undefined behavior is using std::bit_cast:
float x = /*...*/;
auto u = std::bit_cast<unsigned>(x);
or before C++20 using std::memcpy:
float x = /*...*/;
unsigned u;
static_assert(sizeof(u) == sizeof(x));
std::memcpy(&u, &x, sizeof(u));
The size verification is done by std::bit_cast automatically. Even without C++20 it would probably be a good idea to wrap the static_assert and memcpy in a similar generic function for reuse.
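A possible shape for such a pre-C++20 wrapper is sketched below. The name bit_cast_compat is made up for this example, and unlike std::bit_cast it is not constexpr and requires the destination type to be default-constructible.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Copy the object representation of `source` into a new object of type To.
template <class To, class From>
To bit_cast_compat(const From& source) {
    static_assert(sizeof(To) == sizeof(From), "source and destination sizes must match");
    static_assert(std::is_trivially_copyable<From>::value, "From must be trivially copyable");
    static_assert(std::is_trivially_copyable<To>::value, "To must be trivially copyable");

    To result;
    std::memcpy(&result, &source, sizeof result);
    return result;
}
```

With this in place, the line from the question could be written as unsigned u = bit_cast_compat<unsigned>(x);, and a size mismatch becomes a compile-time error rather than a silent bug.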
Both of these still require that the representation of x is also a valid representation for u. Otherwise the behavior is still undefined. I don't know whether there even is any C++ implementation where this doesn't hold for all values in the float -> unsigned case.
Also as an additional note: C is a different language. The rules may well be different in C. For example there is obviously no reinterpret_cast in C to which the (unsigned*) cast could resolve and the object model is very different. In this case though, C's aliasing rules will have an equivalent effect.
It is not valid C++. The behavior (of the program) is undefined.
Dereferencing the result of the cast expression violates the type aliasing rules (aka "strict aliasing violation").
See: §6.7 Memory and objects, and §6.8 Types of ISO/IEC JTC1 SC22 WG21.
The problem is the explicit cast will become a reinterpret_cast:
float boat = 420.69f;
// unsigned dont_do_this = * reinterpret_cast<unsigned *> (&boat);
// ~~~~~~~~~~~~~~~^
// Dereferencing `unsigned*` pointer which doesn't point to an `unsigned`.
float* is not Pointer-Interconvertible with unsigned*.
You could do this, instead:
auto since_cpp20 = std::bit_cast<unsigned>(boat); // include <bit>
// Alternatively:
unsigned since_c;
std::memcpy(&since_c, &boat, sizeof since_c);
Assuming that x is float, then the code is a hacky way to access the binary format of a float by copying its bits directly to an integer, i.e. without doing the normal floating point to integer conversion that would happen if you just wrote u = x;
I'm currently writing a runtime for my compiler project and I want a general and easy-to-use struct for encoding different types (the source language is Scheme).
My current approach is:
struct SObj {
SType type;
uint64_t *value;
};
Pointers are always 64 or 32 bit wide, so shouldn't it be possible to literally put a float into my value? Then, if I want the actual value of the float, I just take the raw bytes and interpret them as a float.
Thanks in advance.
Not really.
When you write C++ you're programming an abstraction. You're describing a program. Contrary to popular belief, it's not "all just bytes".
Compilers are complex. They can, and will, assume that you follow the rules, and use that assumption to produce the most efficient "actual" code (read: machine code) possible.
One of those rules is that a uint64_t* is a pointer that points to a uint64_t. When you chuck arbitrary bits into there — whether they are identical to the bits that form a valid float, or something else — it is no longer a valid pointer, and simply evaluating it has undefined behaviour.
There are language facilities that can do what you want, like union. But you have to be careful not to violate aliasing rules. You'd store a flag (presumably, that's what your type is) that tells you which union member you're using. Make life easier and have a std::variant instead, which does all this for you.
That being said, you can std::memcpy/std::copy bits in and copy bits out, in say a uint64_t as long as they are a valid representation of the type you've chosen on your system. Just don't expect reinterpret_cast to be valid: it won't be.
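The union-plus-flag idea might look roughly like the sketch below. The question does not show the definition of SType, so the enumerators, the union members, and the helper functions here are all invented for illustration; std::variant would replace the hand-written tag and switch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical type tags; the real SType is not shown in the question.
enum class SType { Integer, Float, Boolean };

struct SObj {
    SType type;  // records which union member is currently active
    union {
        std::int64_t integer;
        double real;
        bool boolean;
    } value;
};

SObj make_integer(std::int64_t i) {
    SObj obj;
    obj.type = SType::Integer;
    obj.value.integer = i;
    return obj;
}

SObj make_float(double d) {
    SObj obj;
    obj.type = SType::Float;
    obj.value.real = d;
    return obj;
}

// Only ever reads the member that the tag says is active.
double as_double(const SObj& obj) {
    switch (obj.type) {
        case SType::Integer: return static_cast<double>(obj.value.integer);
        case SType::Float:   return obj.value.real;
        case SType::Boolean: return obj.value.boolean ? 1.0 : 0.0;
    }
    return 0.0;  // unreachable for valid tags
}
```

The value is stored as itself rather than disguised as a pointer, so no casts or byte copies are needed to get it back out.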
Pointers are always 64 or 32 bit wide
No.
so shouldn't it be possible to literally put a float into my value?
Yes, that is possible, although it is very strongly advised against. C++ has many, many other facilities so you do not have to resort to such things yourself. Anyway, you can interpret the bytes inside a pointer as another type. Like this:
static_assert(sizeof(float*) >= sizeof(float));
static_assert(std::is_pod<float>::value == true); // overdramatic
float *ptr; // just allocate sizeof(float*) bytes on stack
float a = 5;
// use the memory of the pointer to store float value
std::memcpy(&ptr, &a, sizeof(float));
float b;
std::memcpy(&b, &ptr, sizeof(float));
a == b; // true
What I know about pointers is that they are used to point to a specific location (memory address), so why does a pointer even need to have the same data type as the variable it points to?
Suppose I create an integer variable; then I have to create a pointer to integer to point to it. So why can't I create a void pointer or a float pointer to point to the value stored in that integer variable?
Am I missing some concepts of pointers?
So why can't I create a void pointer [...] to point the value stored in that integer variable
You can do that, no problem:
int x = 10;
double y = 0.4;
void* v = &x;
v = &y;
But now imagine a function like this:
void print(void* value)
How would this function know what to do with the memory at the pointer location? Is it an integer? Or a floating point number? A float or a double? Maybe it's a huge struct or an array of values? You must know this to dereference the pointer (i.e. read the memory) correctly, so it only makes sense to have different pointer types for pointers to different types:
void print(int* value)
This function knows that the pointer points to an int, so it can happily dereference it to get an int value.
The pointer type is also important when dealing with arrays, because an array decays to a pointer to its first element and indexing is defined in terms of pointer arithmetic. When you increment a pointer (which is what indexing does), you need to know how big the type is (int, long, structure, class) in order to access the next item.
arr[5] == *(arr+5) but 5 what? This is determined by the type.
A small addition on Max Langhof's answer:
It is important to realise that in the end, variables are stored simply as a sequence of bits (binary digits), e.g. 01010101 00011101 11100010 11110000. How does your program know what this 'means'? It could be an integer (which is often 4 bytes on modern architectures), it could be a floating-point value. For the memory involved this makes no difference, but for your code the implications can be huge. Therefore, if you refer to this memory location (using a pointer), you will need to specify how the bytes there should be converted to decimal (or other) values.
Pointer arithmetic is the main reason - if p points to an object of type T, then p+1 points to the next object of that type. If p points to a 4-byte int, then p+1 points to the following 4-byte int. If p points to a 128-byte struct, then p+1 points to the following 128-byte struct. If p points to a 2 Kbyte array of double, then p+1 points to the next 2 Kbyte array of double, etc.
But it's also for the same reason we have different types in the first place - at the abstract machine level, we want to distinguish different types of data and operations that are allowed on that data. A pointer to an int is different from a pointer to a double because an int is different from a double.
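The stride rule can be checked directly by measuring, in bytes, how far p+1 is from p. This is a small sketch; stride_in_bytes and Block128 are names invented for the example.

```cpp
#include <cassert>
#include <cstddef>

// A 128-byte struct, standing in for the one mentioned above.
struct Block128 {
    char bytes[128];
};

// Returns the distance in bytes between a T* and that pointer plus one.
template <class T>
std::size_t stride_in_bytes() {
    T pair[2];
    const char* first = reinterpret_cast<const char*>(&pair[0]);
    const char* second = reinterpret_cast<const char*>(&pair[0] + 1);
    return static_cast<std::size_t>(second - first);
}
```

The result always equals sizeof(T): the compiler scales the "+ 1" by the size of the pointed-to type, which it can only do because the pointer carries that type.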
You are right. Although int and float are different types, there should be no difference between their pointers int* and float*. In general, this is the case. However, the binary representation is different between int and float. Therefore accessing an int with a float* pointer leads to garbage being read from the RAM.
Furthermore, what you have on your machine is not the general case, but hardware and implementation dependent.
For example: float and int variables are usually 32 bits long. However, there are systems where int has only 16 bits. What happens now if you try to read a float through an int* pointer? (Or, even if both are 32 bits, what happens if you try to read a float through a char*?)
Memory accesses do not work without knowing what kind of data object you are dealing with.
Imagine some simple assignment:
int a, b=10;
float f;
a = b; // same type => Just copy the integer
f = b; // wrong type => Convert to float.
This works fine because the compiler knows that both variables are of a certain type and size and representation. If the types do not match, a proper conversion is applied.
Now the same with typed pointers:
int a = 10;
float f;
int *pa = &a; // point at real objects before writing through the pointers
float *pf = &f;
f = a; // Type conversion to float applied
*pa = a; // Just copy
*pf = a; // Type conversion
If you take away the knowledge about the type stored at the memory location the pointer points to, how would the compiler know whether a conversion is required?
Or whether an integer promotion is needed, or whether an integer must be truncated into a shorter type?
More problems are waiting around the corner if you want to use a pointer to address elements of an array. Pointer arithmetic won't fly without types.
Types are essential. For variables as well as for pointers.
I think I understand the semantics of pointer arithmetic fairly well, but I only ever see examples when dealing with arrays. Does it have any other uses that can't be achieved by less opaque means? I'm sure you could find a way with clever casting to use it to access members of a struct, but I'm not sure why you'd bother. I'm mostly interested in C, but I'll tag with C++ because the answer probably applies there too.
Edit, based on answers received so far: I know pointers can be used in many non-array contexts. I'm specifically wondering about arithmetic on pointers, e.g. incrementing, taking a difference, etc.
Pointer arithmetic by definition in C happens only on arrays. However, as every object has a representation consisting of an overlaid unsigned char [sizeof object] array, it's also valid to perform pointer arithmetic on this representation. For example:
struct foo {
int a, b, c;
} bar;
/* Equivalent to: bar.c = 1; */
*(int *)((unsigned char *)&bar + offsetof(struct foo, c)) = 1;
Actually char * would work just as well.
If you follow the language standard to the letter, then pointer arithmetic is only defined when pointing to an array, and not in any other case.
A pointer may point to any element of an array, or one step past the end of the array.
Off the top of my head, I know it's used in XOR linked lists (very nifty) and I've seen it used in very hacky recursions.
On the other hand, it's very hard to find uses, since according to the standard pointer arithmetic is only defined within the bounds of an array.
a[n] is "just" syntactic sugar for *(a + n). For lulz, try the following
int a[2];
0[a] = 10;
1[a] = 20;
So one could argue that indexing and pointer arithmetic are merely interchangeable syntax.
Pointer arithmetic is only defined on arrays. Adding an integer to a pointer that does not point to an array element produces undefined behavior.
In embedded systems, pointers are used to represent addresses or locations. There may not be an array defined. (Although one could say that all of memory is one huge array.)
For example, a stack (holding variables and addresses) is manipulated by adding or subtracting values from the stack pointer. (In this case, the stack could be said to be an array based stack.)
Here's a case for pointer arithmetic outside of (strictly defined) arrays:
double d = 0.5;
unsigned char *bytes = (void *)&d;
for(size_t i = 0; i < sizeof d; i++)
printf("Byte %zu of d is %hhu\n", i, bytes[i]);
Why would you do this? I don't know. But if you want to look at the bitwise representation of an object (useful for things like memcpy and memcmp), you'll need to cast their addresses to unsigned char *s (or signed char *s if you like) and work with them byte-by-byte. (If your task isn't too difficult you can even write the code to work word-by-word, which most memcpy implementations will do. It's the same principle, though, just replace char with int32_t.)
Note that, in the standard, the exact values (or the number of values) that are printed are implementation-defined, but that this will always work as a way to access an object's internal bytewise representation. (It is not required to work for larger integer types, but almost always will - no processor I know of has had trap representations for integers in quite some time).
Is it safe to cast pointer to int and later back to pointer again?
How about if we know if the pointer is 32 bit long and int is 32 bit long?
long* juggle(long* p) {
static_assert(sizeof(long*) == sizeof(int));
int v = reinterpret_cast<int>(p); // or if sizeof(*)==8 choose long here
do_some_math(v); // prevent compiler from optimizing
return reinterpret_cast<long*>(v);
}
int main() {
long* stuff = new long(42);
long* ffuts = juggle(stuff);
std::cout << "Is this always 42? " << *ffuts << std::endl;
}
Is this covered by the Standard?
No.
For instance, on x86-64, a pointer is 64 bits long, but int is only 32 bits long. Casting a pointer to int and back again loses the upper 32 bits of the pointer value.
You may use the intptr_t type in <cstdint> if you want an integer type which is guaranteed to be as long as the pointer. You could safely reinterpret_cast from a pointer to an intptr_t and back.
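A minimal sketch of that round trip, using a local variable so the example is self-contained (the function name is made up):

```cpp
#include <cassert>
#include <cstdint>

// Convert a pointer to std::intptr_t and back, then read through the result.
long round_trip_and_read(long initial) {
    long value = initial;

    std::intptr_t as_integer = reinterpret_cast<std::intptr_t>(&value);
    long* recovered = reinterpret_cast<long*>(as_integer);

    assert(recovered == &value);  // the round trip preserves the pointer value
    return *recovered;
}
```

Here round_trip_and_read(42) returns 42 on any implementation that provides std::intptr_t, whereas the int version in the question only works when int happens to be as wide as a pointer.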
Yes, if... (or "Yes, but...") and no otherwise.
The standard specifies (3.7.4.3) the following:
A pointer value is a safely-derived pointer [...] if it is the result of a well-defined pointer conversion or reinterpret_cast of a safely-derived pointer value [or] the result of a reinterpret_cast of an integer representation of a safely-derived pointer value
An integer value is an integer representation of a safely-derived pointer [...] if its type is at least as large as std::intptr_t and [...] the result of a reinterpret_cast of a safely-derived pointer value [or] the result of a valid conversion of an integer representation of a safely-derived pointer value [or] the result of an additive or bitwise operation, one of whose operands is an integer representation of a safely-derived pointer value
A traceable pointer object is [...] an object of an integral type that is at least as large as std::intptr_t
The standard further states that implementations may be relaxed or may be strict about enforcing safely-derived pointers. Which means it is unspecified whether using or dereferencing a not-safely-derived pointer invokes undefined behavior (that's a funny thing to say!)
Which altogether means no more and no less than "something different might work anyway, but the only safe thing is as specified above".
Therefore, if you either use std::intptr_t in the first place (the preferable thing to do!) or if you know that the storage size of whatever integer type you use (say, long) is at least the size of std::intptr_t, then it is allowable and well-defined (i.e. "safe") to cast to your integer type and back. The standard guarantees that.
If that's not the case, the conversion from pointer to integer representation will probably (or at least possibly) lose some information, and the conversion back will not give a valid pointer. Or, it might by accident, but this is not guaranteed.
An interesting anecdote is that the C++ standard does not directly define std::intptr_t at all; it merely says "the same as 7.18 in the C standard".
The C standard, on the other hand, states "designates a signed integer type with the property that any valid pointer to void can be converted to this type, then converted back to pointer to void, and the result will compare equal to the original pointer".
Which means, without the rather complicated definitions above (in particular the last bit of the first bullet point), it wouldn't be allowable to convert to/from anything but void*.
Yes and no.
The language specification explicitly states that it is safe (meaning that in the end you will get the original pointer value) as long as the size of the integral type is sufficient to store the [implementation-dependent] integral representation of the pointer.
So, in the general case it is not "safe", since in the general case int can easily turn out to be too small. In your specific case, though, it might be safe, since your int might be sufficiently large to store your pointer.
Normally, when you need to do something like that, you should use the intptr_t/uintptr_t types, which were introduced specifically for that purpose. Unfortunately, intptr_t/uintptr_t are not part of the C++03 standard (they are standard C99 types, and C++11 later added them in <cstdint>), but many implementations provide them nevertheless. You can always define these types yourself, of course.
In general, no; pointers may be larger than int, in which case there's no way to reconstruct the value.
If an integer type is known to be large enough, then you can; according to the Standard (5.2.10/5):
A pointer converted to an integer of sufficient size ... and back to the same pointer type will have its original value
However, in C++03, there's no standard way to tell which integer types are large enough. C++11 and C99 (and hence in practice most C++03 implementations), and also Boost.Integer, define intptr_t and uintptr_t for this purpose. Or you could define your own type and assert (preferably at compile time) that it's large enough; or, if you don't have some special reason for it to be an integer type, use void*.
Is it safe? Not really.
In most circumstances, will it work? Yes
Certainly, if an int is too small to hold the full pointer value and truncates, you won't get your original pointer back. (Hopefully your compiler will warn you about this case; with GCC, truncating conversions from pointers to integers are hard errors.) A long, or uintptr_t if your library supports it, may be a better choice.
Even if your integer type and pointer types are the same size, it will not necessarily work depending on your application runtime. In particular, if you're using a garbage collector in your program it might easily decide that the pointer is no longer outstanding, and when you later cast your integer back to a pointer and try to dereference it, you'll find out the object was already reaped.
Absolutely not. Doing so makes the bad assumption that the size of an int and the size of a pointer are the same. This is almost never the case on 64-bit platforms. If they are not the same, information is lost and the final pointer value will be incorrect.
MyType* pValue = ...
int stored = (int)pValue; // Just lost the upper 4 bytes on a 64 bit platform
pValue = (MyType*)stored; // pValue is now invalid
pValue->SomeOp(); // Kaboom
No, it is not (always) safe (thus not safe in general). And it is covered by the standard.
ISO C++ 2003, 5.2.10:
A pointer can be explicitly converted to any integral type large enough to hold it. The mapping function is implementation-defined.
A value of integral type or enumeration type can be explicitly converted to a pointer. A pointer converted to an integer of sufficient size (if any such exists on the implementation) and back to the same pointer type will have its original value; mappings between pointers and integers are otherwise implementation-defined.
(The above emphases are mine.)
Therefore, if you know that the sizes are compatible, then the conversion is safe.
#include <iostream>
// C++03 static_assert.
#define ASSURE(cond) typedef int ASSURE[(cond) ? 1 : -1]
// Assure that the sizes are compatible.
ASSURE(sizeof (int) >= sizeof (char*));
int main() {
char c = 'A';
char *p = &c;
// If this program compiles, it is well formed.
int i = reinterpret_cast<int>(p);
p = reinterpret_cast<char*>(i);
std::cout << *p << std::endl;
}
Use uintptr_t from "stdint.h" or from "boost/cstdint.hpp". It is guaranteed to be large enough to hold a pointer.
No, it is not. Even if we rule out the architecture issue, the sizes of a pointer and an integer differ. With some older 16-bit compilers a pointer could be one of three kinds (near, far, and huge), each with a different size, while an integer is normally 16 or 32 bits. So casting integers to pointers and vice versa is not safe. Utmost care has to be taken, as there is a real chance of losing information. In most cases an integer will be too small to store a pointer, resulting in a loss of value.
If you're going to do any portable casting, you need to use something like Microsoft's INT_PTR/UINT_PTR; the safety after that depends on the target platforms and on what you intend to do with the INT_PTR. Generally, for most arithmetic, char* or uint8_t* works better while being typesafe(ish).
To an int? Not always: on a 64-bit machine int is typically only 4 bytes while pointers are 8 bytes long, so you would end up with a different pointer when you cast it back from int.
There are, however, ways to get around this. You can simply use an 8-byte data type, which works whether you are on a 32-bit or a 64-bit system, such as unsigned long long (unsigned because you don't want sign extension on 32-bit systems).
It is important to note that on Linux unsigned long will always be pointer size*, so if you are targeting Linux systems you could just use that.
*According to cppreference and also tested it myself but not on all Linux and Linux like systems
If the issue is that you want to do normal math on it, probably the safest thing to do would be to cast it to a pointer to char (or better yet, uint8_t*), do your math, and then cast it back.
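As a rough illustration of doing the math on a byte pointer instead of on an integer, the sketch below steps through an int array by byte offsets. The array and function names are invented for the example, and the offset is assumed to be a multiple of sizeof(int) that stays inside the array.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sample data for the example.
int sample_values[4] = {10, 20, 30, 40};

// Reads base[index], but performs the address arithmetic in bytes.
int element_via_byte_offset(int* base, std::size_t index) {
    unsigned char* raw = reinterpret_cast<unsigned char*>(base);
    raw += index * sizeof(int);            // byte-wise math on a pointer type
    return *reinterpret_cast<int*>(raw);   // cast back before dereferencing
}
```

The value never leaves pointer types, so there is no question of whether an integer is wide enough to hold it.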