I have a C header file that has a list of definitions like below
#define TAG_A ((A*)0x123456)
#define TAG_B ((B*)0x456789)
I include that file in a cpp file.
I want to cast those definitions and use them in a switch statement like below:
unsigned int get_tag_address(unsigned int i)
{
switch(i)
{
case reinterpret_cast<unsigned int>(TAG_A):
return 1;
case reinterpret_cast<unsigned int>(TAG_B):
return 2;
}
return 3;
}
I still get a compiler error saying that I can't cast a pointer to an unsigned integer.
What am I doing wrong?
The definitions refer to hardware addresses of an embedded system. I want to return an unsigned integer based on which hardware component is used (i.e. passed into the function argument).
This is how I ended up in that situation.
PS: The header file containing the definitions must not change.
It is impossible to use TAG_A and TAG_B in a case label of a switch, short of preprocessor tricks such as stringifying the macro replacement itself in another macro and then parsing the value from the resulting string. That would make the construct dependent on the exact form of the TAG_X macros, and I feel it is not worth it when you don't have a strict requirement to obtain compile-time constant values representing the pointers.
The results of the expressions produced by the TAG_A and TAG_B replacements cannot be used as a case operand, because the operand must be a constant expression, and casting an integer to a pointer, as done with (A*) and (B*), disqualifies an expression from being a constant expression.
So, you will need to use if/else if instead:
unsigned int get_tag_address(unsigned int i)
{
if(i == reinterpret_cast<unsigned int>(TAG_A)) {
return 1;
} else if(i == reinterpret_cast<unsigned int>(TAG_B)) {
return 2;
} else {
return 3;
}
}
Also, consider using std::uintptr_t instead of unsigned int for i and in the reinterpret_casts, since it is not guaranteed that unsigned int is large enough to hold the pointer values. However, compilation of the reinterpret_cast should fail if unsigned int is in fact too small. (It is possible that std::uintptr_t in <cstdint> does not exist, in which case you are either using pre-C++11 or, if not that, it would be a hint that the architecture does not allow representing pointers as integer values. That is not guaranteed to be possible, but you would need to be working on some pretty exotic architecture for it not to be.)
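For illustration, here is a sketch of the if/else version using std::uintptr_t (assuming C++11's <cstdint> is available and the header defining TAG_A and TAG_B is included):
#include <cstdint>

unsigned int get_tag_address(std::uintptr_t i)
{
    if (i == reinterpret_cast<std::uintptr_t>(TAG_A)) {
        return 1;
    } else if (i == reinterpret_cast<std::uintptr_t>(TAG_B)) {
        return 2;
    }
    return 3;
}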
And if you can, simply pass, store and compare pointers (maybe as void*) instead of integer values representing the pointers. That is safer for multiple reasons and always guaranteed to work.
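For instance, a sketch of the pointer-based variant (get_tag is a hypothetical name; again assumes the header defining TAG_A and TAG_B is included):
unsigned int get_tag(const void *p)
{
    if (p == TAG_A) return 1; // A* converts to const void* for the comparison
    if (p == TAG_B) return 2;
    return 3;
}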
Related
How does one convert from one integer type to another safely and without setting off alarm bells in compilers and static analysis tools?
Different compilers will warn for something like:
int i = get_int();
size_t s = i;
for loss of signedness or
size_t s = get_size();
int i = s;
for narrowing.
Casting can remove the warnings, but doesn't solve the safety issue.
Is there a proper way of doing this?
You can try boost::numeric_cast<>.
boost::numeric_cast returns the result of converting a value of type Source to a value of type Target. If an out-of-range value is detected, an exception is thrown (see bad_numeric_cast, negative_overflow and positive_overflow).
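A minimal sketch of its use (assuming Boost is available; the exception types live in namespace boost::numeric):
#include <boost/numeric/conversion/cast.hpp>
#include <iostream>

int main()
{
    long long big = 5000000000LL; // does not fit in a 32-bit int
    try {
        int i = boost::numeric_cast<int>(big); // throws on out-of-range values
        std::cout << i << '\n';
    } catch (const boost::numeric::bad_numeric_cast &e) {
        std::cout << "conversion failed: " << e.what() << '\n';
    }
}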
How does one convert from one integer type to another safely and without setting off alarm bells in compilers and static analysis tools?
Control when conversion is needed. Where possible, only convert when there is no value change. Sometimes one must step back and code at a higher level. In other words: was a lossy conversion really needed, or can the code be re-worked to avoid the loss?
It is not hard to add an if(). The test just needs to be carefully formed.
Example where size_t n and int len need a comparison. Note that the positive range of int may exceed that of size_t, or vice versa, or the two may be the same. Note that in this case the conversion of int to unsigned only happens for non-negative values, thus no value change.
int len = snprintf(buf, n, ...);
if (len < 0 || (unsigned)len >= n) {
// Handle_error();
}
An unsigned to int example, for when it is known that the unsigned value at this point in the code is less than or equal to INT_MAX:
unsigned n = ...
int i = n & INT_MAX;
Good analysis tools see that n & INT_MAX always converts into int without loss.
There is no built-in safe narrowing conversion between integer types in C++ or its standard library. You could implement one yourself, using Microsoft's GSL as an example.
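A sketch of what such a checked narrowing helper for integer types could look like, written in the spirit of gsl::narrow (my own illustration, not the GSL's actual code; requires C++11):
#include <stdexcept>
#include <type_traits>

template <typename Target, typename Source>
Target checked_narrow(Source v)
{
    Target t = static_cast<Target>(v);
    // Round-trip check: if converting back changes the value, it didn't fit.
    if (static_cast<Source>(t) != v)
        throw std::runtime_error("narrowing changed the value");
    // Mixed signedness can round-trip and still flip the sign (e.g. -1 -> unsigned).
    if (std::is_signed<Source>::value != std::is_signed<Target>::value &&
        (t < Target{}) != (v < Source{}))
        throw std::runtime_error("narrowing changed the sign");
    return t;
}

// Usage: int i = checked_narrow<int>(some_size_t_value);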
Theoretically, if you want perfect safety, you shouldn't be mixing types like this at all. (And you definitely shouldn't be using explicit casts to silence warnings, as you know.) If you've got values of type size_t, it's best to always carry them around in variables of type size_t.
There is one case where I do sometimes decide I can accept less than 100.000% perfect type safety, and that is when I assign sizeof's return value, which is a size_t, to an int. For any machine I am ever going to use, the only time this conversion might lose information is when sizeof returns a value greater than 2147483647. But I am content to assume that no single object in any of my programs will ever be that big. (In particular, I will unhesitatingly write things like printf("sizeof(int) = %d\n", (int)sizeof(int)), explicit cast and all. There is no possible way that the size of a type like int will not fit in an int!)
[Footnote: Yes, it's true, on a 16-bit machine the assumption is the rather less satisfying threshold that sizeof won't return a value greater than 32767. It's more likely that a single object might have a size like that, but probably not in a program that's running on a 16-bitter.]
A byte of data is being stored in a 'char' member variable. It should probably be stored as an 'unsigned char' instead, but that can't be changed. I need to retrieve it through an 'int' variable, but without propagating the sign bit.
My solution was this (UINT and UCHAR are the obvious types):
void Foo::get_data( int *val )
{
if( val )
*val = (int)(UINT)(UCHAR)m_data; // 'm_data' is type 'char'
}
This seemed the best solution to me. I could use
*val = 0xff & (int)m_data;
instead of the casting, but this doesn't seem as readable. Which alternative is better, if either, and why?
Just write
*val = (UCHAR)m_data;
As the expression (UCHAR)m_data now has an unsigned type, no sign bit will be propagated.
The kind of conversion involved here is integral promotion.
When promoting to a wider integer type the value is always "widened" using its signedness, so that the sign is propagated to the new high order bits for signed values. To avoid the sign propagation convert a signed value to its corresponding unsigned type first.
You can do that with an explicit *val = static_cast<UCHAR>(m_data).
Or, safer, using an as_unsigned function, as in *val = as_unsigned(m_data). The as_unsigned function looks like:
inline unsigned char as_unsigned(char a) { return a; }
inline unsigned char as_unsigned(unsigned char a) { return a; }
inline unsigned char as_unsigned(signed char a) { return a; }
// And so on for the rest of integer types.
Using as_unsigned eliminates the risk of the explicit cast becoming incorrect after maintenance: should m_data become a wider integer, another overload of as_unsigned will be picked automatically, without requiring the maintainer to manually update the expression. The inverse function as_signed is also useful.
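With that in place, the getter from the question reduces to (a sketch):
void Foo::get_data( int *val )
{
    if( val )
        *val = as_unsigned(m_data); // promotes to int without sign extension
}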
The cast is better, because some compilers (e.g. clang) actually generate extra code for the bitwise AND. Of course, you only need the one cast to unsigned char.
The cast also expresses your intent better: the data is actually an unsigned char that you move to an int. So I would call it better even with compilers which generate the same code.
I am going through the book "Accelerated C++" by Andrew Koenig and Barbara E. Moo and I have some questions about the main example in chapter 2. The code can be summarized as below, and compiles without warnings or errors with g++:
#include <string>
using std::string;
int main()
{
const string greeting = "Hello, world!";
// OK
const int pad = 1;
// KO
// int pad = 1;
// OK
// unsigned int pad = 1;
const string::size_type cols = greeting.size() + 2 + pad * 2;
string::size_type c = 0;
if (c == 1 + pad)
{;}
return 0;
}
However, if I replace const int pad = 1; by int pad = 1;, the g++ compiler will return a warning:
warning: comparison between signed and unsigned integer expressions [-Werror=sign-compare]
if (c == 1 + pad)
If I replace const int pad = 1; by unsigned int pad = 1;, the g++ compiler will not return a warning.
I understand why g++ returns the warning, but I am not sure about the three points below:
Is it safe to use an unsigned int in order to compare with a std::string::size_type? The compiler does not return a warning in that case but I am not sure if it is safe.
Why is the compiler not giving a warning with the original code const int pad = 1;? Is the compiler automatically converting the variable pad to an unsigned int?
I could also replace const int pad = 1; by string::size_type pad = 1;, but the meaning of the variable pad is not really linked to a string size in my opinion. Still, would this be the best approach in that case to avoid having different types in the comparison?
From the compiler point of view:
It is unsafe to compare signed and unsigned variables (non-constants).
It is safe to compare two unsigned variables of different sizes.
It is safe to compare an unsigned variable with a signed constant if the compiler can check that the constant is in the allowed range for the type of the signed variable (e.g. for a 16-bit signed integer it is safe to use a constant in the range [0..32767]).
So the answers to your questions:
Yes, it is safe to compare unsigned int and std::string::size_type.
There is no warning because the compiler can perform the safety check (while compiling :)).
There is no problem using different unsigned types in a comparison. Use unsigned int.
Comparing signed and unsigned values is "dangerous" in the sense that you may not get what you expect when the signed value is negative: it may well behave as a very large unsigned value, and thus a > b gives true when a = -1 and b = 100. (The use of const int works because the compiler knows the value isn't changing and thus can say "well, this value is always 1, so it works fine here".)
As long as the value you want to compare fits in an unsigned int (on typical machines, a little over 4 billion), it is fine.
If you are using std::string with the default allocator (which is likely), then size_type is actually size_t.
[support.types]/6 defines size_t as
an implementation-defined unsigned integer type that is large enough to contain the size in bytes of any object.
So it's not technically guaranteed to be an unsigned int, but I believe it is defined that way in most cases.
Now regarding your second question: if you use const int something = 2, the compiler sees that this integer is a) never negative and b) never changes, so it's always safe to compare this variable with size_t. In some cases the compiler may optimize the variable out completely and simply replace all its occurrences with 2.
I would say that it is better to use size_type everywhere you refer to the size of something, since it is more explicit.
What the compiler warns about is the comparison of unsigned and signed integer types. This is risky because a signed integer can be negative, and the result is counter-intuitive: the signed value is converted to unsigned before the comparison, which means a negative number will compare greater than a positive one.
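A small program demonstrating the surprise:
#include <iostream>

int main()
{
    int a = -1;
    unsigned int b = 100;
    // a is converted to unsigned int before the comparison, becoming a huge
    // value, so this prints "a > b" even though -1 < 100 mathematically.
    if (a > b)
        std::cout << "a > b\n";
    else
        std::cout << "a <= b\n";
}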
Is it safe to use an unsigned int in order to compare with a std::string::size_type? The compiler does not return a warning in that case but I am not sure if it is safe.
Yes: they are both unsigned, and then the semantics are what's expected. If their ranges differ, the narrower is converted to the wider type.
Why is the compiler not giving a warning with the original code const int pad = 1. Is the compiler automatically converting the variable pad to an unsigned int?
This is because of how the compiler is constructed. The compiler parses, and to some extent optimizes, the code before warnings are issued. The important point is that at the point where this warning is considered, the compiler knows that the signed integer is 1, and then it's safe to compare with an unsigned integer.
I could also replace const int pad = 1; by string::size_type pad = 1;, but the meaning of the variable pad is not really linked to a string size in my opinion. Still, would this be the best approach in that case to avoid having different types in the comparison?
If you don't want it to be constant, the best solution would probably be to make it at least an unsigned integer type. However, you should be aware that there is no guaranteed relation between the normal integer types and the size types; for example, unsigned int may be narrower than, wider than or equal to size_t and size_type (and the latter two may also differ).
This is in C, but I tagged it C++ in case it's the same. This is being built with:
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.220 for 80x86
if that makes any difference.
Why does this work?
(inVal is 0x80)
float flt = (float) inVal;
outVal = *((unsigned long*)&flt);
(results in outVal being 0x43000000 -- correct)
But this doesn't?
outVal = *((unsigned long*)&((float)inVal));
(results in outVal being 0x00000080 -- NOT CORRECT :( )
Before asking this question I googled around a bit and found a function in Java that basically does what I want. If you're a bit confused about what I'm trying to do, this program might help explain it:
class hello
{
public static void main(String[] args)
{
int inside = Float.floatToIntBits(128.0f);
System.out.printf("0x%08X", inside);
}
}
You're trying to take the address of a non-const temporary (the result of your (float) conversion) – this is illegal in C++ (and probably also in C). Hence, your code results in garbage.
In your first, working code, you're not using a temporary, so the code works. Notice that from a standards point of view this is still ill-defined, since the size and internal representation of the involved types aren't specified and may differ depending on platform and compiler. You're probably safe, though.
In C99, you may use compound literals to make this work inline:
unsigned long outVal = *((unsigned long *)&((float){ inVal }));
The literal (float){ inVal } creates a variable with automatic storage duration (i.e. stack-allocated), and thus with a well-defined address.
Type punning may also be done using unions instead of pointer casts. Using compound literals and the non-standard __typeof__ operator, you can even do some macro magic:
#define wicked_cast(TYPE, VALUE) \
(((union { __typeof__(VALUE) src; TYPE dest; }){ .src = VALUE }).dest)
unsigned long outVal = wicked_cast(unsigned long, (float)inVal);
GCC prefers unions over pointer casts in regard to optimization. This might not work at all with the MS compiler, as its C99 support is rumored to be non-existent.
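In both C and C++, the usual well-defined alternative is to copy the bytes with memcpy, which compilers optimize to the same code as the cast. A sketch (assuming, as on the 32-bit MSVC target in the question, that unsigned long is at least as large as float):
#include <cstring>

unsigned long float_bits(float f)
{
    unsigned long out = 0;
    // Copy the object representation of the float into the integer.
    std::memcpy(&out, &f, sizeof f);
    return out;
}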
Assuming inVal and outVal are parameters:
void func(int inVal,unsigned long* outVal)
{
float flt = (float) inVal;
*outVal = (unsigned long)flt; // Convert the float to unsigned long, then
                              // assign to the variable by de-referencing
                              // the pointer.
}
Can someone point me to the implementation of the sizeof operator in C++, and to some description of its implementation?
sizeof is one of the operators that cannot be overloaded.
So does that mean we cannot change its default behavior?
sizeof is not a real operator in C++. It is merely special syntax which inserts a constant equal to the size of the argument. sizeof doesn't need or have any runtime support.
Edit: do you want to know how to determine the size of a class/structure by looking at its definition? The rules for this are part of the ABI, and compilers merely implement them. Basically the rules consist of:
size and alignment definitions for primitive types;
size and alignment of the various pointer types;
rules for packing fields in structures;
rules about virtual table-related stuff (more esoteric).
However, ABIs are platform- and often vendor-specific, e.g. on x86 and (say) IA64 the size of A below will be different, because IA64 does not permit unaligned data access.
struct A
{
char i ;
int j ;
} ;
assert (sizeof (A) == 5) ; // x86, MSVC #pragma pack(1)
assert (sizeof (A) == 8) ; // x86, MSVC default
assert (sizeof (A) == 16) ; // IA64
http://en.wikipedia.org/wiki/Sizeof
Basically, to quote Bjarne Stroustrup's C++ FAQ:
Sizeof cannot be overloaded because built-in operations, such as incrementing a pointer into an array implicitly depends on it. Consider:
X a[10];
X* p = &a[3];
X* q = &a[3];
p++; // p points to a[4]
// thus the integer value of p must be
// sizeof(X) larger than the integer value of q
Thus, sizeof(X) could not be given a new and different meaning by the programmer without violating basic language rules.
No, you can't change it. What do you hope to learn from seeing an implementation of it?
What sizeof does can't be written in C++ using more basic operations. It's not a function, or part of a library header like e.g. printf or malloc. It's inside the compiler.
Edit: If the compiler is itself written in C or C++, then you can think of the implementation as being something like this:
size_t calculate_sizeof(expression_or_type)
{
if (is_type(expression_or_type))
{
if (is_array_type(expression_or_type))
{
return array_size(expression_or_type) *
calculate_sizeof(underlying_type_of_array(expression_or_type));
}
else
{
switch (expression_or_type)
{
case int_type:
case unsigned_int_type:
return 4; //for example
case char_type:
case unsigned_char_type:
case signed_char_type:
return 1;
case pointer_type:
return 4; //for example
//etc., for all the built-in types
case class_or_struct_type:
{
int base_size = compiler_overhead(expression_or_type);
for (/*loop over each class member*/)
{
base_size += calculate_sizeof(class_member) +
padding(class_member);
}
return round_up_to_multiple(base_size,
alignment_of_type(expression_or_type));
}
case union_type:
{
int max_size = 0;
for (/*loop over each class member*/)
{
max_size = max(max_size,
calculate_sizeof(class_member));
}
return round_up_to_multiple(max_size,
alignment_of_type(expression_or_type));
}
}
}
}
else
{
return calculate_sizeof(type_of(expression_or_type));
}
}
Note that this is very much pseudo-code. There is a lot I haven't included, but it gives the general idea. The compiler probably doesn't actually do this: it probably calculates the size of a type (including a class) once and stores it, instead of recalculating it every time you write sizeof(X). It is also allowed to, e.g., have pointers of different sizes depending on what they point to.
sizeof does what it does at compile time. Operator overloads are simply functions, and do what they do at run time. It is therefore not possible to overload sizeof, even if the C++ Standard allowed it.
sizeof is a compile-time operator, which means that it is evaluated at compile-time.
It cannot be overloaded, because it already has a meaning on all user-defined types - the sizeof() a class is the size that the object the class defines takes in memory, and the sizeof() a variable is the size that the object the variable names occupies in memory.
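Because the result is known at compile time, it can be used anywhere a constant expression is required, for example (a sketch; static_assert needs C++11):
struct Packet { char header[4]; int payload; };

char buffer[sizeof(Packet)]; // array bound must be a constant expression
static_assert(sizeof(Packet) >= sizeof(int) + 4,
              "Packet is smaller than the sum of its members");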
Unless you need to see how C++-specific sizes are calculated (such as allocation for the vtable), you can look at Plan 9's C compiler. It's much simpler than trying to tackle g++.
Variable:
#define getsize_var(x) ((char *)(&(x) + 1) - (char *)&(x))
Type:
#define getsize_type(type) ( (char*)((type*)(1) + 1) - (char*)((type *)(1)))
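A quick sanity check that these agree with sizeof (a sketch; note that getsize_type relies on arithmetic on a fake address, which is technically undefined but works on common compilers):
#include <cstdio>

// The two macros from above, repeated so the example is self-contained.
#define getsize_var(x) ((char *)(&(x) + 1) - (char *)&(x))
#define getsize_type(type) ( (char*)((type*)(1) + 1) - (char*)((type *)(1)))

int main()
{
    double d = 0.0;
    // All three typically print 8 on common platforms.
    std::printf("%lu %lu %lu\n",
                (unsigned long)getsize_var(d),
                (unsigned long)getsize_type(double),
                (unsigned long)sizeof(double));
}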
Take a look at the source of the GNU C++ compiler for a real-world look at how this is done.