This is in C, but I tagged it C++ in case it's the same. This is being built with:
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.220 for 80x86
if that makes any difference.
Why does this work?
(inVal is 0x80)
float flt = (float) inVal;
outVal = *((unsigned long*)&flt);
(results in outVal being 0x43000000 -- correct)
But this doesn't?
outVal = *((unsigned long*)&((float)inVal));
(results in outVal being 0x00000080 -- NOT CORRECT :( )
Before asking this question I googled around a bit and found this function in Java that basically does what I want. If you're a bit confused about what I'm trying to do, this program might help explain it:
class hello
{
public static void main(String[] args)
{
int inside = Float.floatToIntBits(128.0f);
System.out.printf("0x%08X", inside);
}
}
You're trying to take the address of a temporary (the result of your (float) conversion). A cast result is not an lvalue, so this is illegal in C++ and in C as well. Hence, your code results in garbage.
In your first, working, code, you're not using a temporary so your code is working. Notice that from a standards point of view this is still ill-defined since the size and internal representation of the involved types isn't specified and may differ depending on platform and compiler. You're probably safe, though.
In C99, you may use compound literals to make this work inline:
unsigned long outVal = *((unsigned long *)&((float){ inVal }));
The literal (float){ inVal } will create a variable with automatic storage duration (i.e. stack-allocated) with a well-defined address.
Type punning may also be done using unions instead of pointer casts. Using compound literals and the non-standard __typeof__ operator, you can even do some macro magic:
#define wicked_cast(TYPE, VALUE) \
(((union { __typeof__(VALUE) src; TYPE dest; }){ .src = VALUE }).dest)
unsigned long outVal = wicked_cast(unsigned long, (float)inVal);
GCC prefers unions over pointer casts with regard to optimization. This might not work at all with the MS compiler, as its C99 support is rumored to be non-existent.
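For a compiler without C99 support, a portable alternative is to copy the bytes with memcpy, which needs neither a temporary nor a pointer cast. The following is only a sketch in C++ (the same idea works in C with <string.h>). The names float_bits and float_bits_example are made up for illustration, and the expected value assumes float is a 32-bit IEEE 754 type:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns the object representation of a float
// as a 32-bit unsigned integer.
inline std::uint32_t float_bits(float value)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);  // copies bytes; no pointer cast involved
    return bits;
}

// Usage mirroring the question: inVal is 0x80, i.e. 128.
inline void float_bits_example()
{
    int inVal = 0x80;
    assert(float_bits(static_cast<float>(inVal)) == 0x43000000u);
}
```

Optimizing compilers typically turn the memcpy into a plain register move, so this should cost nothing compared to the pointer cast.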
Assuming: inVal and outVal are parameters.
void func(int inVal,unsigned long* outVal)
{
float flt = (float) inVal;
*outVal = (unsigned long)flt; // Convert the float's value to unsigned long
// (a value conversion, not a copy of its bits),
// then assign it through the pointer.
}
I have a C header file that has a list of definitions like below
#define TAG_A ((A*)0x123456)
#define TAG_B ((B*)0x456789)
I include that file in a cpp file.
I want to cast those definition in a switch case like below
unsigned int get_tag_address(unsigned int i)
{
switch(i)
{
case reinterpret_cast<unsigned int>(TAG_A):
return 1;
case reinterpret_cast<unsigned int>(TAG_B):
return 2;
}
return 3;
}
I still get a compiler error saying that I can't cast a pointer to an unsigned integer.
What am I doing wrong?
The definitions refer to hardware addresses of an embedded system. I want to return an unsigned integer based on what hardware component is used (i.e. passed into the function argument).
This is how I ended up in that situation.
PS: The header file containing the definitions must not change.
It is impossible to use TAG_A and TAG_B in a case label of a switch, except through preprocessor tricks such as stringifying the macro replacement and then parsing the value from the resulting string. That would make the construct dependent on the exact form of the TAG_X macros, and I feel it is not worth it unless you have a strict requirement to obtain compile-time constant values representing the pointers.
The results of the expressions produced by the TAG_A and TAG_B replacement can not be used in a case operand because the operand must be a constant expression, but casting an integer to a pointer as done with (A*) and (B*) disqualifies an expression from being a constant expression.
So, you will need to use if/else if instead:
unsigned int get_tag_address(unsigned int i)
{
if(i == reinterpret_cast<unsigned int>(TAG_A)) {
return 1;
} else if(i == reinterpret_cast<unsigned int>(TAG_B)) {
return 2;
} else {
return 3;
}
}
Also, consider using std::uintptr_t instead of unsigned int for i and in the reinterpret_casts, since it is not guaranteed that unsigned int is large enough to hold the pointer values. However, compilation of the reinterpret_cast should fail if unsigned int is in fact too small. (It is possible that std::uintptr_t in <cstdint> does not exist, in which case you are either using pre-C++11 or, if not that, it would be a hint that the architecture does not allow for representing pointers as integer values. It is not guaranteed that this is possible, but you would need to be working on a pretty exotic architecture for it to not be possible.)
And if you can, simply pass, store and compare pointers (maybe as void*) instead of integer values representing the pointers. That is safer for multiple reasons and always guaranteed to work.
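The std::uintptr_t suggestion above could look roughly like the sketch below. The forward declarations of A and B and the macro definitions stand in for the real header, and get_tag_id and get_tag_id_example are names made up here. Converting the integer constant to a pointer and back is implementation-defined, so the equality only holds on platforms where that round trip preserves the value (the usual case on flat address spaces):

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins for the contents of the unmodifiable header.
struct A;
struct B;
#define TAG_A ((A*)0x123456)
#define TAG_B ((B*)0x456789)

// Hypothetical replacement for get_tag_address using a pointer-sized integer.
inline unsigned int get_tag_id(std::uintptr_t address)
{
    // These are evaluated at run time; they are not constant expressions.
    const std::uintptr_t tag_a = reinterpret_cast<std::uintptr_t>(TAG_A);
    const std::uintptr_t tag_b = reinterpret_cast<std::uintptr_t>(TAG_B);

    if (address == tag_a) return 1;
    if (address == tag_b) return 2;
    return 3;
}

inline void get_tag_id_example()
{
    assert(get_tag_id(reinterpret_cast<std::uintptr_t>(TAG_A)) == 1);
}
```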
Very basic question: how do I write a short literal in C++?
I know the following:
2 is an int
2U is an unsigned int
2L is a long
2LL is a long long
2.0f is a float
2.0 is a double
'\2' is a char.
But how would I write a short literal? I tried 2S but that gives a compiler warning.
((short)2)
Yeah, it's not strictly a short literal, more of a casted-int, but the behaviour is the same and I think there isn't a direct way of doing it.
That's what I've been doing because I couldn't find anything about it. I would guess that the compiler would be smart enough to compile this as if it's a short literal (i.e. it wouldn't actually allocate an int and then cast it every time).
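To check that guess, the type of the cast expression can be inspected at compile time. This is just a small sketch using <type_traits>; the variable name two is made up. It also shows a related point: as soon as two short values take part in arithmetic, they are promoted back to int.

```cpp
#include <type_traits>

// The cast expression really has type short ...
static_assert(std::is_same<decltype((short)2), short>::value,
              "(short)2 has type short");
static_assert(std::is_same<decltype(short{2}), short>::value,
              "short{2} has type short");

// ... but arithmetic on shorts is carried out in int (integral promotion).
static_assert(std::is_same<decltype((short)2 + (short)2), int>::value,
              "short + short yields int");

// The cast is a constant expression, so nothing is converted at run time.
constexpr short two = (short)2;
```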
The following illustrates how much you should worry about this:
a = 2L;
b = 2.0;
c = (short)2;
d = '\2';
Compile -> disassemble ->
movl $2, _a
movl $2, _b
movl $2, _c
movl $2, _d
C++11 gives you pretty close to what you want. (Search for "user-defined literals" to learn more.)
#include <cstdint>
inline std::uint16_t operator "" _u(unsigned long long value)
{
return static_cast<std::uint16_t>(value);
}
void func(std::uint32_t value); // 1
void func(std::uint16_t value); // 2
func(0x1234U); // calls 1
func(0x1234_u); // calls 2
// also
inline std::int16_t operator "" _s(unsigned long long value)
{
return static_cast<std::int16_t>(value);
}
Even the writers of the C99 standard got caught out by this. This is a snippet from Danny Smith's public domain stdint.h implementation:
/* 7.18.4.1 Macros for minimum-width integer constants
Accoding to Douglas Gwyn <gwyn@arl.mil>:
"This spec was changed in ISO/IEC 9899:1999 TC1; in ISO/IEC
9899:1999 as initially published, the expansion was required
to be an integer constant of precisely matching type, which
is impossible to accomplish for the shorter types on most
platforms, because C99 provides no standard way to designate
an integer constant with width less than that of type int.
TC1 changed this to require just an integer constant
*expression* with *promoted* type."
*/
Disclaimer: I'm leaving this answer up as a curiosity, but you really shouldn't be using this in production code. Use UDL or constants of the appropriate types instead.
If you use Microsoft Visual C++, there are literal suffixes available for every integer type:
auto var1 = 10i8; // char
auto var2 = 10ui8; // unsigned char
auto var3 = 10i16; // short
auto var4 = 10ui16; // unsigned short
auto var5 = 10i32; // int
auto var6 = 10ui32; // unsigned int
auto var7 = 10i64; // long long
auto var8 = 10ui64; // unsigned long long
Note that these are a non-standard extension and aren't portable. In fact, I couldn't even locate any info on these suffixes on MSDN.
You can also use pseudo constructor syntax.
short(2)
I find it more readable than casting.
One possibility is to use C++11 "list initialization" for this purpose, e.g.:
short{42};
The advantage of this solution (compared to a cast as in the currently accepted answer) is that it does not allow narrowing conversions:
auto number1 = short(100000); // Oops: Stores -31072, you may get a warning
auto number2 = short{100000}; // Compiler error. Value too large for type short
See https://en.cppreference.com/w/cpp/language/list_initialization#Narrowing_conversions for prohibited narrowing conversions with list-init
As far as I know, you don't, there's no such suffix. Most compilers will warn if an integer literal is too large to fit in whatever variable you're trying to store it in, though.
I know it is an integer type that can be cast to/from pointer without loss of data, but why would I ever want to do this? What advantage does having an integer type have over void* for holding the pointer and THE_REAL_TYPE* for pointer arithmetic?
EDIT
The question marked as "already been asked" doesn't answer this. The question there is if using intptr_t as a general replacement for void* is a good idea, and the answers there seem to be "don't use intptr_t", so my question is still valid: What would be a good use case for intptr_t?
The primary reason is that you cannot do bitwise operations on a void *, but you can on an intptr_t.
On the many occasions where you need to perform a bitwise operation on an address, you can use intptr_t.
However, for bitwise operations, the best approach is to use the unsigned counterpart, uintptr_t.
As mentioned in the other answer by @chux, pointer comparison is another important aspect.
Also, FWIW, as per C11 standard, §7.20.1.4,
These types are optional.
There's also a semantic consideration.
A void* is supposed to point to something. Despite modern practicality, a pointer is not a memory address. Okay, it usually/probably/always(!) holds one, but it's not a number. It's a pointer. It refers to a thing.
An intptr_t does not. It's an integer value that is safe to convert to/from a pointer, so you can use it for antique APIs, packing it into a pthread function argument, things like that.
That's why you can do more numbery and bitty things on an intptr_t than you can on a void*, and why you should be self-documenting by using the proper type for the job.
Ultimately, almost everything could be an integer (remember, your computer works on numbers!). Pointers could have been integers. But they're not. They're pointers, because they are meant for different use. And, theoretically, they could be something other than numbers.
The uintptr_t type is very useful when writing memory management code. That kind of code wants to talk to its clients in terms of generic pointers (void *), but internally do all kinds of arithmetic on addresses.
You can do some of the same things by operating in terms of char *, but not everything, and the result looks like pre-ANSI C.
Not all memory management code uses uintptr_t - as an example, the BSD kernel code defines a vm_offset_t with similar properties. But if you are writing e.g. a debug malloc package, why invent your own type?
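As an illustration of the kind of address arithmetic an allocator does internally, here is a sketch of rounding a pointer up to an alignment boundary. The helper names align_up and is_aligned are made up, and converting the adjusted integer back to a pointer is implementation-defined, so this assumes a flat address space where that conversion does the obvious thing:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns true if p is a multiple of `alignment` (a power of two).
inline bool is_aligned(const void* p, std::size_t alignment)
{
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// Rounds p up to the next multiple of `alignment` (a power of two).
inline void* align_up(void* p, std::size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    const std::uintptr_t mask = alignment - 1;
    const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
    return reinterpret_cast<void*>((addr + mask) & ~mask);
}
```

The client only ever sees void *; the masking happens on the integer form, which is exactly what void * does not allow.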
It's also helpful when you have %p available in your printf, and are writing code that needs to print pointer sized integral variables in hex on a variety of architectures.
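For printing pointer-sized integers in hex, <cinttypes> provides the PRIxPTR format macro, which expands to the right length modifier on each architecture. A small sketch follows; to_hex and format_address are hypothetical helper names:

```cpp
#include <cassert>
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats a pointer-sized integer as lowercase hex without a prefix.
inline std::string to_hex(std::uintptr_t value)
{
    char buffer[2 * sizeof value + 1];  // two hex digits per byte plus the terminator
    const int written = std::snprintf(buffer, sizeof buffer, "%" PRIxPTR, value);
    assert(written > 0 && static_cast<std::size_t>(written) < sizeof buffer);
    (void)written;  // silences the unused-variable warning when NDEBUG is defined
    return std::string(buffer);
}

// Formats the integer value of a pointer with a 0x prefix.
inline std::string format_address(const void* p)
{
    return "0x" + to_hex(reinterpret_cast<std::uintptr_t>(p));
}
```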
I find intptr_t rather less useful, except possibly as a way station when casting, to avoid the dread warning about changing signedness and integer size in the same cast. (Writing portable code that passes -Wall -Werror on all relevant architectures can be a bit of a struggle.)
What is the use of intptr_t?
Example use: order comparing.
Comparing pointers for equality is not a problem.
Other compare operations like >, <= may be UB. C11dr §6.5.8/5 Relational operators.
So convert to intptr_t first.
[Edit] New example: Sort an array of pointers by pointer value.
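#include <stdint.h>
#include <stdlib.h>

int ptr_cmp(const void *a, const void *b) {
    /* Each argument points at an element of the array, i.e. at a void *. */
    intptr_t ia = (intptr_t) (*((void * const *) a));
    intptr_t ib = (intptr_t) (*((void * const *) b));
    return (ia > ib) - (ia < ib);
}
void *a[N];
...
qsort(a, sizeof a/sizeof a[0], sizeof a[0], ptr_cmp);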
[Former example]
Example use: Test whether a pointer points into a given array.
#define N 10
char special[N][1];
// UB as testing order of pointer, not of the same array, is UB.
int test_special1(char *candidate) {
return (candidate >= special[0]) && (candidate <= special[N-1]);
}
// OK - integer compare
int test_special2(char *candidate) {
intptr_t ca = (intptr_t) candidate;
intptr_t mn = (intptr_t) special[0];
intptr_t mx = (intptr_t) special[N-1];
return (ca >= mn) && (ca <= mx);
}
As commented by @M.M, the above code may not work as intended. But at least it is not UB, just non-portable functionality. I was hoping to use this to solve this problem.
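If the code can be compiled as C++, the ordering problem can be avoided without going through integers at all: std::less is specified to give a strict total order over pointers, even ones that do not point into the same array. A sketch, with sort_by_address as a made-up name:

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Sorts pointers by address using the total order that std::less provides.
inline void sort_by_address(std::vector<const void*>& pointers)
{
    const std::less<const void*> by_address;
    std::sort(pointers.begin(), pointers.end(), by_address);
    assert(std::is_sorted(pointers.begin(), pointers.end(), by_address));
}
```

In C, where std::less is not available, the intptr_t comparison shown above remains the practical option.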
(u)intptr_t is used when you want to do arithmetic on pointers, specifically bitwise operations. But as others said, you'll almost always want to use uintptr_t because bitwise operations are better done on unsigned types. However, if you need an arithmetic right shift then you must use intptr_t [1]. This is commonly used for storing data in the pointer itself, a technique usually called a tagged pointer.
On x86-64 you can use the high 16 bits of the address (7 bits with 57-bit addressing) for data, but you must do the sign extension manually to make the pointer canonical, because x86-64 doesn't have a flag for ignoring the high bits like ARM does [2]. So for example, if you have char* tagged_address, you'll need to do this before dereferencing it:
char* pointer = (char*)((intptr_t)tagged_address << 16 >> 16);
The 32-bit Chrome V8 engine uses smi (small integer) optimization where the low bits denote the type
         |----- 32 bits -----|
Pointer: |_____address_____w1| # Address to object, w = weak pointer
Smi:     |___int31_value____0| # Small integer
So when the pointer's least significant bit is 0 then it'll be right shifted to retrieve the original 31-bit signed int
int v = (intptr_t)address >> 1;
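The encoding side of that scheme can be sketched as follows. This is not V8's actual code; smi_encode, smi_decode and is_smi are names invented for this example. It relies on the signed right shift being arithmetic, which C++20 guarantees and which is implementation-defined (but near-universal) before that:

```cpp
#include <cassert>
#include <cstdint>

// A word is a small integer when its least significant (tag) bit is 0.
inline bool is_smi(std::intptr_t word)
{
    return (word & 1) == 0;
}

// Stores the value in the upper bits and leaves the tag bit 0.
// The shift is done on the unsigned type so negative values are well defined.
inline std::intptr_t smi_encode(std::intptr_t value)
{
    const std::uintptr_t shifted = static_cast<std::uintptr_t>(value) << 1;
    return static_cast<std::intptr_t>(shifted);
}

// Recovers the value; the arithmetic shift restores the sign.
inline std::intptr_t smi_decode(std::intptr_t word)
{
    assert(is_smi(word));
    return word >> 1;
}
```

The value passed to smi_encode must fit in one bit less than the width of intptr_t, which is where the "int31" in the diagram comes from on a 32-bit build.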
For more information read
Using the extra 16 bits in 64-bit pointers
Pointer magic for efficient dynamic value representations
Another usage is when you pass a signed integer as void* which is usually done in simple callback functions or threads
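#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

void* my_thread(void *arg)
{
    intptr_t val = (intptr_t)arg; // recover the integer from the pointer
    (void)val;
    // Do something
    return NULL;
}
int main()
{
    pthread_t thread1;
    intptr_t some_val = -2;
    int r = pthread_create(&thread1, NULL, my_thread, (void*)some_val);
    if (r == 0)
        pthread_join(thread1, NULL); // wait for the thread before exiting
    return r;
}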
[1] Provided the implementation does an arithmetic shift on signed types, of course
[2] Very new x86-64 CPUs may have UAI/LAM support for that
ulong foo = 0;
ulong bar = 0UL;//this seems redundant and unnecessary. but I see it a lot.
I also see this in referencing the first element of arrays a good amount
blah = arr[0UL];//this seems silly since I don't expect the compiler to magically
//turn '0' into a signed value
Can someone provide some insight into why I would need 'UL' throughout to state explicitly that this is an unsigned long?
void f(unsigned int x)
{
//
}
void f(int x)
{
//
}
...
f(3); // f(int x)
f(3u); // f(unsigned int x)
It is just another tool in C++; if you don't need it don't use it!
In the examples you provide it isn't needed. But suffixes are often used in expressions to prevent loss of precision. For example:
unsigned long x = 5UL * ...
You may get a different answer if you left off the UL suffix, say if your system had 16-bit ints and 32-bit longs.
Here is another example inspired by Richard Corden's comments:
unsigned long x = 1UL << 17;
Again, if you left the suffix off, the answer would depend on whether int is 16 or 32 bits wide.
The same type of problem will apply with 32 vs 64-bit ints and mixing long and long long in expressions.
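The 32 vs 64-bit case can be reproduced on a typical desktop compiler. In this sketch (the variable names are made up), both products are assigned to a 64-bit variable, but only the one with a suffix is actually computed in 64 bits. Assuming a 32-bit unsigned int, the first product wraps around to 1410065408, while the second is the expected 10000000000:

```cpp
#include <cstdint>

// Both operands are unsigned int, so the multiplication is done in
// unsigned int and wraps around before the result is widened.
constexpr std::uint64_t without_suffix = 100000u * 100000u;

// The ull suffix makes one operand unsigned long long, so the whole
// multiplication is done in unsigned long long.
constexpr std::uint64_t with_suffix = 100000ull * 100000u;
```

The type of the variable on the left-hand side never influences how the right-hand side is evaluated, which is why the suffix matters here.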
Some compilers may emit a warning, I suppose.
The author could be doing this to make sure the code has no warnings?
Sorry, I realize this is a rather old question, but I use this a lot in C++11 code...
The ul and f suffixes are useful for initialising auto variables to your intended type; an unsuffixed floating-point literal is already a double (C++ has no d suffix), e.g.
auto my_u_long = 0ul;
auto my_float = 0.0f;
auto my_double = 0.0;
Check out the cpp reference on numeric literals: http://www.cplusplus.com/doc/tutorial/constants/
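What auto deduces from each literal can be verified at compile time. A small sketch, with variable names chosen only for this example. Note that the f suffix is only valid on a floating-point literal, so it has to be written 0.0f (or 0.f):

```cpp
#include <type_traits>

auto my_u_long = 0ul;   // unsigned long
auto my_float  = 0.0f;  // float
auto my_double = 0.0;   // double, no suffix needed
auto my_int    = 0;     // int, the default for an unsuffixed integer literal

static_assert(std::is_same<decltype(my_u_long), unsigned long>::value, "ul gives unsigned long");
static_assert(std::is_same<decltype(my_float), float>::value, "f gives float");
static_assert(std::is_same<decltype(my_double), double>::value, "no suffix gives double");
static_assert(std::is_same<decltype(my_int), int>::value, "no suffix gives int");
```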
You don't normally need it, and any tolerable editor will have enough assistance to keep things straight. However, the places I use it in C# are (and you'll see these in C++):
Calling a generic method (template in C++), where the parameter types are implied and you want to make sure and call the one with an unsigned long type. This happens reasonably often, including this one recently:
Tuple<ulong, ulong> t = Tuple.Create(someUlongVariable, 0UL);
where without the UL it returns Tuple<ulong, int> and won't compile.
Implicit variable declarations using the var keyword in C# or the auto keyword coming to C++. This is less common for me because I only use var to shorten very long declarations, and ulong is the opposite.
When you make a habit of writing down the type of a constant (even when it is not strictly necessary), you make sure that:
You always consider how the compiler will translate the constant into bits.
Whoever reads your code (including you, when you revisit it) will know what type you intended the constant to have and that you took it into consideration.
You don't spend time wondering whether you need to write the 'U'/'UL' or not.
Also, several software development standards such as MISRA require you to state the type of a constant no matter what (at least write 'U' if it is unsigned).
In other words, some consider it good practice to write the type of a constant: at worst the suffix is simply ignored, and at best you avoid bugs, reduce the chance that different compilers treat your code differently, and improve readability.