interpret unsigned as signed - c++

I'm working on an embedded platform (ARM) and have to be careful when dealing with bit patterns. Let's pretend this line is beyond my influence:
uint8_t foo = 0xCE; // 0b11001110
Interpreted as unsigned this would be 206. But actually it's signed, thus resembling -50. How can I continue using this value as signed?
int8_t bar = foo; // doesn't work
neither do (resulting in 0x10 or 0x00 for all input values)
int8_t bar = static_cast<int8_t>(foo);
int8_t bar = reinterpret_cast<int8_t&>(foo);
I just want the bits to remain untouched, ie. (bar == 0xCE)
Vice versa I'd be interested how to get bit-patters, representing negative numbers, into unsigned variables without messing the bit-pattern. I'm using GCC.

The following works fine for me, as it should though as the comments say, this is implementation-defined:
int x = (signed char)(foo);
In C++, you can also say:
int x = static_cast<signed char>(foo);
Note that promotion always tries to preserve the value before reinterpreting bit patterns. Thus you first have to cast to the signed type of the same size as your unsigned type to force the signed reinterpretation.
(I usually face the opposite problem when trying to print chars as pairs of hex digits.)

uint8_t foo = 0xCE; // 0b11001110
int8_t bar;
memcpy( &bar, &foo, 1 );
It even has the added bonus that 99% of compilers will completely optimise out the call to memcpy ...

Something ugly along the lines of this?
int8_t bar = (foo > 127) ? ((int)foo - 256) : foo;
Doesn't rely on a conversion whose behaviour is undefined.

With GCC chances are that unsigned values are two's complement, even on your embedded platform.
Then the 8-bit number 0xCE represents 0xCE-256.
Because two's complement is really just modulo 2n, where n is the number of bits in the representation.
EDIT: hm, for rep's sake I'd better give a concrete example:
int8_t toInt8( uint8_t x )
{
return (x >= 128? x - 256 : x);
}
EDIT 2: I didn't see the final question about how to get a bit pattern into an unsigned variable. That's extremely easy: just assign. The result is guaranteed by the C++ standard, namely that the value stored is congruent (on-the-clock-face equal) to the value assigned, modulo 2n.
Cheers & hth.,

You can access the representation of a value using a pointer. By reinterpreting the pointer type, not the value type, you should be able to replicate the representation.
uint8_t foo = 0xCE;
int8_t bar = *reinterpret_cast<int8_t*>(&foo);

Related

C++: Why is the value assignment interpretation always int?

I'd like to assign a value to a variable like this:
double var = 0xFFFFFFFF;
As a result var gets the value 65535.0 assigned. Since the compiler assumes a 64bit target system the number literal (i.e. all respective 32 bits) is interpreted significand precision bits. However, since 0xFFFF FFFF is just a notation for a bit pattern, without any hint about the representation, it could be quite differently interpreted w.r.t. becoming a floating point value. Thus, I was wondering if there is a way to manipulate this fixed interpretation of the value. In other words, give a hint about the desired representation. (Maybe someone could also point me to part in the standard where this implicit interpretation is defined).
So far, the default precision interpretation on my system seems to be
(int)0xFFFFFFFF x 100.
Only the fraction field is getting filled1.
So maybe (here: for 16 bit cross-compilation) I want it to be a different representation like:
(int)0xFFFFFF x 10(int)0xFF
(ignoring the sign bit for a moment).
Thus my question: How can I force a custom double interpretation of the hex literal notation?
1 Even when my hex literal would be 0xFFFF FFFF FFFF FFFF the value is only interpreted as the fraction part - so clearly, bits should be used for exponent and sign field. But it seems the literal gets just cut off.
C++ doesn't specify the in-memory representation for double, moreover, it doesn't even specify the in-memory representation of integer types (and it can really be different on systems with different endings). So if you want to interpret bytes 0xFF, 0xFF as a double, you can do something like:
uint8_t bytes[sizeof(double)] = {0xFF, 0xFF};
double var;
memcpy(&var, bytes, sizeof(double));
Note that using unions or reinterpret_casting pointers is, strictly speaking, undefined behavior, though in practice also works.
"I was wondering if there is a way to manipulate this interpretation."
Yes, you can use a reinterpret_cast<double&> via address, to force type (re-)interpretation from a certain bit pattern in memory.
"Thus my question: How can I force double interpretation of the hex notation?"
You can also use a union, to make it clearer:
union uint64_2_double {
uint64_t bits;
double dValue;
};
uint64_2_double x;
x.bits = 0x000000000000FFFF;
std::cout << x.dValue << std::endl;
There does not seem to be a direct way to initialize a double variable with an hexadecimal pattern, the c-style cast is equivalent to a C++ static_cast and the reinterpret_cast will complain it can't perform the conversion. I will give you two options, one simple solution but that will not initialize directly the variable, and a complicated one. You can do the following:
double var;
*reinterpret_cast<long *>(&var) = 0xFFFF;
Note: watch out that I would expect you to want to initialize all 64 bits of the double, your constant 0xFFFF seems small, it gives 3.23786e-319
A literal value that begins with 0x is an hexadecimal number of type unsigned int. You should use the suffix ul to make it a literal of unsigned long, which in most architectures will mean a 64 bit unsigned; or, #include <stdint.h> and do for example uint64_t(0xABCDFE13)
Now for the complicated stuff: In old C++ you can program a function that converts the integral constant to a double, but it won't be constexpr.
In constexpr functions you can't make reinterpret_cast. Then, your only choice to make a constexpr converter to double is to use an union in the middle, for example:
struct longOrDouble {
union {
unsigned long asLong;
double asDouble;
};
constexpr longOrDouble(unsigned long v) noexcept: asLong(v) {}
};
constexpr double toDouble(long v) { return longOrDouble(v).asDouble; }
This is a bit complicated, but this answers your question. Now, you can write:
double var = toDouble(0xFFFF)
And this will insert the given binary pattern into the double.
Using union to write to one member and read from another is undefined behavior in C++, there is an excellent question and excellent answers on this right here:
Accessing inactive union member and undefined behavior?

Curious arithmetic error- 255x256x256x256=18446744073692774400

I encountered a strange thing when I was programming under c++. It's about a simple multiplication.
Code:
unsigned __int64 a1 = 255*256*256*256;
unsigned __int64 a2= 255 << 24; // same as the above
cerr()<<"a1 is:"<<a1;
cerr()<<"a2 is:"<<a2;
interestingly the result is:
a1 is: 18446744073692774400
a2 is: 18446744073692774400
whereas it should be:(using calculator confirms)
4278190080
Can anybody tell me how could it be possible?
255*256*256*256
all operands are int you are overflowing int. The overflow of a signed integer is undefined behavior in C and C++.
EDIT:
note that the expression 255 << 24 in your second declaration also invokes undefined behavior if your int type is 32-bit. 255 x (2^24) is 4278190080 which cannot be represented in a 32-bit int (the maximum value is usually 2147483647 on a 32-bit int in two's complement representation).
C and C++ both say for E1 << E2 that if E1 is of a signed type and positive and that E1 x (2^E2) cannot be represented in the type of E1, the program invokes undefined behavior. Here ^ is the mathematical power operator.
Your literals are int. This means that all the operations are actually performed on int, and promptly overflow. This overflowed value, when converted to an unsigned 64bit int, is the value you observe.
It is perhaps worth explaining what happened to produce the number 18446744073692774400. Technically speaking, the expressions you wrote trigger "undefined behavior" and so the compiler could have produced anything as the result; however, assuming int is a 32-bit type, which it almost always is nowadays, you'll get the same "wrong" answer if you write
uint64_t x = (int) (255u*256u*256u*256u);
and that expression does not trigger undefined behavior. (The conversion from unsigned int to int involves implementation-defined behavior, but as nobody has produced a ones-complement or sign-and-magnitude CPU in many years, all implementations you are likely to encounter define it exactly the same way.) I have written the cast in C style because everything I'm saying here applies equally to C and C++.
First off, let's look at the multiplication. I'm writing the right hand side in hex because it's easier to see what's going on that way.
255u * 256u = 0x0000FF00u
255u * 256u * 256u = 0x00FF0000u
255u * 256u * 256u * 256u = 0xFF000000u (= 4278190080)
That last result, 0xFF000000u, has the highest bit of a 32-bit number set. Casting that value to a signed 32-bit type therefore causes it to become negative as-if 232 had been subtracted from it (that's the implementation-defined operation I mentioned above).
(int) (255u*256u*256u*256u) = 0xFF000000 = -16777216
I write the hexadecimal number there, sans u suffix, to emphasize that the bit pattern of the value does not change when you convert it to a signed type; it is only reinterpreted.
Now, when you assign -16777216 to a uint64_t variable, it is back-converted to unsigned as-if by adding 264. (Unlike the unsigned-to-signed conversion, this semantic is prescribed by the standard.) This does change the bit pattern, setting all of the high 32 bits of the number to 1 instead of 0 as you had expected:
(uint64_t) (int) (255u*256u*256u*256u) = 0xFFFFFFFFFF000000u
And if you write 0xFFFFFFFFFF000000 in decimal, you get 18446744073692774400.
As a closing piece of advice, whenever you get an "impossible" integer from C or C++, try printing it out in hexadecimal; it's much easier to see oddities of twos-complement fixed-width arithmetic that way.
The answer is simple -- overflowed.
Here Overflow occurred on int and when you are assigning it to unsigned int64 its converted in to 18446744073692774400 instead of 4278190080

Testing for a maximum unsigned value

Is this the correct way to test for a maximum unsigned value in C and C++ code:
if(foo == -1)
{
// at max possible value
}
where foo is an unsigned int, an unsigned short, and so on.
For C++, I believe you should preferably use the numeric_limits template from the <limits> header :
if (foo == std::numeric_limits<unsigned int>::max())
/* ... */
For C, others have already pointed out the <limits.h> header and UINT_MAX.
Apparently, "solutions which are allowed to name the type are easy", so you can have :
template<class T>
inline bool is_max_value(const T t)
{
return t == std::numeric_limits<T>::max();
}
[...]
if (is_max_value(foo))
/* ... */
I suppose that you ask this question since at a certain point you don't know the concrete type of your variable foo, otherwise you naturally would use UINT_MAX etc.
For C your approach is the right one only for types with a conversion rank of int or higher. This is because before being compared an unsigned short value, e.g, is first converted to int, if all values fit, or to unsigned int otherwise. So then your value foo would be compared either to -1 or to UINT_MAX, not what you expect.
I don't see an easy way of implementing the test that you want in C, since basically using foo in any type of expression would promote it to int.
With gcc's typeof extension this is easily possible. You'd just have to do something like
if (foo == (typeof(foo))-1)
As already noted, you should probably use if (foo == std::numeric_limits<unsigned int>::max()) to get the value.
However for completeness, in C++ -1 is "probably" guaranteed to be the max unsigned value when converted to unsigned (this wouldn't be the case if there were unused bit patterns at the upper end of the unsigned value range).
See 4.7/2:
If the destination type is unsigned, the resulting value is the
least unsigned integer congruent to
the source integer (modulo 2^n where n
is the number of bits used to
represent the unsigned type). [Note:
In a two’s complement representation,
this conversion is conceptual and
there is no change in the bit pattern
(if there is no truncation). ]
Note that specifically for the unsigned int case, due to the rules in 5/9 it appears that if either operand is unsigned, the other will be converted to unsigned automatically so you don't even need to cast the -1 (if I'm reading the standard correctly). In the case of unsigned short you'll need a direct check or explicit cast because of the automatic integral promotion induced by the ==.
using #include <limits.h> you could just do
if(foo == UINT_MAX)
if foo is an unsigned int it has valued [0 - +4,294,967,295] (if 32 bit)
More : http://en.wikipedia.org/wiki/Limits.h
edit: in C
if you do
#include <limits.h>
#include <stdio.h>
int main() {
unsigned int x = -1;
printf("%u",x);
return 0;
}
you will get the result 4294967295 (in a 32-bit system) and that is because internally, -1 is represented by 11111111111111111111111111111111 in two's complement. But because it is an unsigned, there is now no "sign bit" therefore making it work in the range [0-2^n]
Also see : http://en.wikipedia.org/wiki/Two%27s_complement
See other's answers for the C++ part std::numeric_limits<unsigned int>::max()
I would define a constant that would hold the maximum value as needed by the design of your code. Using "-1" is confusing. Imagine that someone in the future will change the type from unsigned int to int, it will mess your code.
Here's an attempt at doing this in C. It depends on the implementation not having padding bits:
#define IS_MAX_UNSIGNED(x) ( (sizeof(x)>=sizeof(int)) ? ((x)==-1) : \
((x)==(1<<CHAR_BIT*sizeof(x))-1) )
Or, if you can modify the variable, just do something like:
if (!(x++,x--)) { /* x is at max possible value */ }
Edit: And if you don't care about possible implementation-defined extended integer types:
#define IS_MAX_UNSIGNED(x) ( (sizeof(x)>=sizeof(int)) ? ((x)==-1) : \
(sizeof(x)==sizeof(short)) ? ((x)==USHRT_MAX) : \
(sizeof(x)==1 ? ((x)==UCHAR_MAX) : 42 )
You could use sizeof(char) in the last line, of course, but I consider it a code smell and would typically catch it grepping for code smells, so I just wrote 1. Of course you could also just remove the last conditional entirely.

typecasting to unsigned in C

int a = -534;
unsigned int b = (unsigned int)a;
printf("%d, %d", a, b);
prints -534, -534
Why is the typecast not taking place?
I expected it to be -534, 534
If I modify the code to
int a = -534;
unsigned int b = (unsigned int)a;
if(a < b)
printf("%d, %d", a, b);
its not printing anything... after all a is less than b??
Because you use %d for printing. Use %u for unsigned. Since printf is a vararg function, it cannot know the types of the parameters and must instead rely on the format specifiers. Because of this the type cast you do has no effect.
First, you don't need the cast: the value of a is implicitly converted to unsigned int with the assignment to b. So your statement is equivalent to:
unsigned int b = a;
Now, an important property of unsigned integral types in C and C++ is that their values are always in the range [0, max], where max for unsigned int is UINT_MAX (it's defined in limits.h). If you assign a value that's not in that range, it is converted to that range. So, if the value is negative, you add UINT_MAX+1 repeatedly to make it in the range [0, UINT_MAX]. For your code above, it is as if we wrote: unsigned int b = (UINT_MAX + a) + 1. This is not equal to -a (534).
Note that the above is true whether the underlying representation is in two's complement, ones' complement, or sign-magnitude (or any other exotic encoding). One can see that with something like:
signed char c = -1;
unsigned int u = c;
printf("%u\n", u);
assert(u == UINT_MAX);
On a typical two's complement machine with a 4-byte int, c is 0xff, and u is 0xffffffff. The compiler has to make sure that when value -1 is assigned to u, it is converted to a value equal to UINT_MAX.
Now going back to your code, the printf format string is wrong for b. You should use %u. When you do, you will find that it prints the value of UINT_MAX - 534 + 1 instead of 534.
When used in the comparison operator <, since b is unsigned int, a is also converted to unsigned int. This, given with b = a; earlier, means that a < b is false: a as an unsigned int is equal to b.
Let's say you have a ones' complement machine, and you do:
signed char c = -1;
unsigned char uc = c;
Let's say a char (signed or unsigned) is 8-bits on that machine. Then c and uc will store the following values and bit-patterns:
+----+------+-----------+
| c | -1 | 11111110 |
+----+------+-----------+
| uc | 255 | 11111111 |
+----+------+-----------+
Note that the bit patterns of c and uc are not the same. The compiler must make sure that c has the value -1, and uc has the value UCHAR_MAX, which is 255 on this machine.
There are more details on my answer to a question here on SO.
your specifier in the printf is asking printf to print a signed integer, so the underlying bytes are interpreted as a signed integer.
You should specify that you want an unsigned integer by using %u.
edit: a==b is true for the comparison, which is odd behaviour, but it's perfectly valid. You haven't changed the underlying bits you have only asked the compiler to treat the underlying bits in a certain way. Therefore a bitwise comparison yields true.
[speculation] I would suspect that behaviour might vary among compiler implementations -i.e., a fictitious CPU might not use the same logic for both signed and unsigned numerals in which case a bitwise comparison would fail. [/speculation]
C can be an ugly beast sometimes. The problem is that -534 always represents the value 0xfffffdea whether it is stored in a variable with the type unsigned int or signed int. To compare these variables they must be the same type so one will get automatically converted to either an unsigned or signed int to match the other. Once they are the same type they are equal as they represent the same value.
It seems likely that the behaviour you want is provided by the function abs:
int a = -534;
int b = abs(a);
printf("%d, %d", a, b);
I guess the first case of why b is being printed as -534 has been sufficiently answered by Tronic and Hassan. You should not be using %d and should be using %u.
As far as your second case is concered, again an implicit typecasting will be happening and both a and b will be same due to which your comparison does yield the expected result.
As far as I can see, the if fails because the compiler assumes the second variable should be considered the same type as the first. Try
if(b > a)
to see the difference.
Re 2nd question:
comparison never works between two different types - they are always implicitly cast to the "lowest common denominator", which in this case will be unsigned int. Nasty and counter-intuitive, I know.
Casting an integer type from signed to unsigned does not modify the bit pattern, it merely changes the interpretation of the bit pattern.
You also have a format specifier mismatch, %u should be used for unsigned integers, but even then the result will not be 534 as you expect, but 4294966762.
If you want to make a negative value positive, simply negate it:
unsigned b = (unsigned)-a ;
printf("%d, %u", a, b);
As for the second example, operations between types with differing signed-ness involve arcane implicit conversion rules - avoid. You should set your compiler's warning level high to trap many of these errors. I suggest /W4 /WX in VC++ and -Wall -Werror -Wformat for GCC for example.

Is it safe to use -1 to set all bits to true?

I've seen this pattern used a lot in C & C++.
unsigned int flags = -1; // all bits are true
Is this a good portable way to accomplish this? Or is using 0xffffffff or ~0 better?
I recommend you to do it exactly as you have shown, since it is the most straight forward one. Initialize to -1 which will work always, independent of the actual sign representation, while ~ will sometimes have surprising behavior because you will have to have the right operand type. Only then you will get the most high value of an unsigned type.
For an example of a possible surprise, consider this one:
unsigned long a = ~0u;
It won't necessarily store a pattern with all bits 1 into a. But it will first create a pattern with all bits 1 in an unsigned int, and then assign it to a. What happens when unsigned long has more bits is that not all of those are 1.
And consider this one, which will fail on a non-two's complement representation:
unsigned int a = ~0; // Should have done ~0u !
The reason for that is that ~0 has to invert all bits. Inverting that will yield -1 on a two's complement machine (which is the value we need!), but will not yield -1 on another representation. On a one's complement machine, it yields zero. Thus, on a one's complement machine, the above will initialize a to zero.
The thing you should understand is that it's all about values - not bits. The variable is initialized with a value. If in the initializer you modify the bits of the variable used for initialization, the value will be generated according to those bits. The value you need, to initialize a to the highest possible value, is -1 or UINT_MAX. The second will depend on the type of a - you will need to use ULONG_MAX for an unsigned long. However, the first will not depend on its type, and it's a nice way of getting the highest value.
We are not talking about whether -1 has all bits one (it doesn't always have). And we're not talking about whether ~0 has all bits one (it has, of course).
But what we are talking about is what the result of the initialized flags variable is. And for it, only -1 will work with every type and machine.
unsigned int flags = -1; is portable.
unsigned int flags = ~0; isn't portable because it
relies on a two's-complement representation.
unsigned int flags = 0xffffffff; isn't portable because
it assumes 32-bit ints.
If you want to set all bits in a way guaranteed by the C standard, use the first one.
Frankly I think all fff's is more readable. As to the comment that its an antipattern, if you really care that all the bits are set/cleared, I would argue that you are probably in a situation where you care about the size of the variable anyway, which would call for something like boost::uint16_t, etc.
A way which avoids the problems mentioned is to simply do:
unsigned int flags = 0;
flags = ~flags;
Portable and to the point.
I am not sure using an unsigned int for flags is a good idea in the first place in C++. What about bitset and the like?
std::numeric_limit<unsigned int>::max() is better because 0xffffffff assumes that unsigned int is a 32-bit integer.
unsigned int flags = -1; // all bits are true
"Is this a good[,] portable way to accomplish this?"
Portable? Yes.
Good? Debatable, as evidenced by all the confusion shown on this thread. Being clear enough that your fellow programmers can understand the code without confusion should be one of the dimensions we measure for good code.
Also, this method is prone to compiler warnings. To elide the warning without crippling your compiler, you'd need an explicit cast. For example,
unsigned int flags = static_cast<unsigned int>(-1);
The explicit cast requires that you pay attention to the target type. If you're paying attention to the target type, then you'll naturally avoid the pitfalls of the other approaches.
My advice would be to pay attention to the target type and make sure there are no implicit conversions. For example:
unsigned int flags1 = UINT_MAX;
unsigned int flags2 = ~static_cast<unsigned int>(0);
unsigned long flags3 = ULONG_MAX;
unsigned long flags4 = ~static_cast<unsigned long>(0);
All of which are correct and more obvious to your fellow programmers.
And with C++11: We can use auto to make any of these even simpler:
auto flags1 = UINT_MAX;
auto flags2 = ~static_cast<unsigned int>(0);
auto flags3 = ULONG_MAX;
auto flags4 = ~static_cast<unsigned long>(0);
I consider correct and obvious better than simply correct.
Converting -1 into any unsigned type is guaranteed by the standard to result in all-ones. Use of ~0U is generally bad since 0 has type unsigned int and will not fill all the bits of a larger unsigned type, unless you explicitly write something like ~0ULL. On sane systems, ~0 should be identical to -1, but since the standard allows ones-complement and sign/magnitude representations, strictly speaking it's not portable.
Of course it's always okay to write out 0xffffffff if you know you need exactly 32 bits, but -1 has the advantage that it will work in any context even when you do not know the size of the type, such as macros that work on multiple types, or if the size of the type varies by implementation. If you do know the type, another safe way to get all-ones is the limit macros UINT_MAX, ULONG_MAX, ULLONG_MAX, etc.
Personally I always use -1. It always works and you don't have to think about it.
As long as you have #include <limits.h> as one of your includes, you should just use
unsigned int flags = UINT_MAX;
If you want a long's worth of bits, you could use
unsigned long flags = ULONG_MAX;
These values are guaranteed to have all the value bits of the result set to 1, regardless of how signed integers are implemented.
Yes. As mentioned in other answers, -1 is the most portable; however, it is not very semantic and triggers compiler warnings.
To solve these issues, try this simple helper:
static const struct All1s
{
template<typename UnsignedType>
inline operator UnsignedType(void) const
{
static_assert(std::is_unsigned<UnsignedType>::value, "This is designed only for unsigned types");
return static_cast<UnsignedType>(-1);
}
} ALL_BITS_TRUE;
Usage:
unsigned a = ALL_BITS_TRUE;
uint8_t b = ALL_BITS_TRUE;
uint16_t c = ALL_BITS_TRUE;
uint32_t d = ALL_BITS_TRUE;
uint64_t e = ALL_BITS_TRUE;
On Intel's IA-32 processors it is OK to write 0xFFFFFFFF to a 64-bit register and get the expected results. This is because IA32e (the 64-bit extension to IA32) only supports 32-bit immediates. In 64-bit instructions 32-bit immediates are sign-extended to 64-bits.
The following is illegal:
mov rax, 0ffffffffffffffffh
The following puts 64 1s in RAX:
mov rax, 0ffffffffh
Just for completeness, the following puts 32 1s in the lower part of RAX (aka EAX):
mov eax, 0ffffffffh
And in fact I've had programs fail when I wanted to write 0xffffffff to a 64-bit variable and I got a 0xffffffffffffffff instead. In C this would be:
uint64_t x;
x = UINT64_C(0xffffffff)
printf("x is %"PRIx64"\n", x);
the result is:
x is 0xffffffffffffffff
I thought to post this as a comment to all the answers that said that 0xFFFFFFFF assumes 32 bits, but so many people answered it I figured I'd add it as a separate answer.
See litb's answer for a very clear explanation of the issues.
My disagreement is that, very strictly speaking, there are no guarantees for either case. I don't know of any architecture that does not represent an unsigned value of 'one less than two to the power of the number of bits' as all bits set, but here is what the Standard actually says (3.9.1/7 plus note 44):
The representations of integral types shall define values by use of a pure binary numeration system. [Note 44:]A positional representation for integers that uses the binary digits 0 and 1, in which the values represented by successive bits are additive, begin with 1, and are multiplied by successive integral power of 2, except perhaps for the bit with the highest position.
That leaves the possibility for one of the bits to be anything at all.
I would not do the -1 thing. It's rather non-intuitive (to me at least). Assigning signed data to an unsigned variable just seems to be a violation of the natural order of things.
In your situation, I always use 0xFFFF. (Use the right number of Fs for the variable size of course.)
[BTW, I very rarely see the -1 trick done in real-world code.]
Additionally, if you really care about the individual bits in a vairable, it would be good idea to start using the fixed-width uint8_t, uint16_t, uint32_t types.
Although the 0xFFFF (or 0xFFFFFFFF, etc.) may be easier to read, it can break portability in code which would otherwise be portable. Consider, for example, a library routine to count how many items in a data structure have certain bits set (the exact bits being specified by the caller). The routine may be totally agnostic as to what the bits represent, but still need to have an "all bits set" constant. In such a case, -1 will be vastly better than a hex constant since it will work with any bit size.
The other possibility, if a typedef value is used for the bitmask, would be to use ~(bitMaskType)0; if bitmask happens to only be a 16-bit type, that expression will only have 16 bits set (even if 'int' would otherwise be 32 bits) but since 16 bits will be all that are required, things should be fine provided that one actually uses the appropriate type in the typecast.
Incidentally, expressions of the form longvar &= ~[hex_constant] have a nasty gotcha if the hex constant is too large to fit in an int, but will fit in an unsigned int. If an int is 16 bits, then longvar &= ~0x4000; or longvar &= ~0x10000; will clear one bit of longvar, but longvar &= ~0x8000; will clear out bit 15 and all bits above that. Values which fit in int will have the complement operator applied to a type int, but the result will be sign extended to long, setting the upper bits. Values which are too big for unsigned int will have the complement operator applied to type long. Values which are between those sizes, however, will apply the complement operator to type unsigned int, which will then be converted to type long without sign extension.
As others have mentioned, -1 is the correct way to create an integer that will convert to an unsigned type with all bits set to 1. However, the most important thing in C++ is using correct types. Therefore, the correct answer to your problem (which includes the answer to the question you asked) is this:
std::bitset<32> const flags(-1);
This will always contain the exact amount of bits you need. It constructs a std::bitset with all bits set to 1 for the same reasons mentioned in other answers.
It is certainly safe, as -1 will always have all available bits set, but I like ~0 better. -1 just doesn't make much sense for an unsigned int. 0xFF... is not good because it depends on the width of the type.
Practically: Yes
Theoretically: No.
-1 = 0xFFFFFFFF (or whatever size an int is on your platform) is only true with two's complement arithmetic. In practice, it will work, but there are legacy machines out there (IBM mainframes, etc.) where you've got an actual sign bit rather than a two's complement representation. Your proposed ~0 solution should work everywhere.
I say:
int x;
memset(&x, 0xFF, sizeof(int));
This will always give you the desired result.
Leveraging on the fact that assigning all bits to one for an unsigned type is equivalent to taking the maximum possible value for the given type,
and extending the scope of the question to all unsigned integer types:
Assigning -1 works for any unsigned integer type (unsigned int, uint8_t, uint16_t, etc.) for both C and C++.
As an alternative, for C++, you can either:
Include <limits> and use std::numeric_limits< your_type >::max()
Write a custom templated function (This would also allow some sanity check, i.e. if the destination type is really an unsigned type)
The purpose could be add more clarity, as assigning -1 would always need some explanatory comment.
A way to make the meaning bit more obvious and yet to avoid repeating the type:
const auto flags = static_cast<unsigned int>(-1);
An additional effort to emphasize, why Adrian McCarthy's approach here might be the best solution at latest since C++11 in terms of a compromise between standard conformity, type safety/explicit clearness and reduction of possible ambiguities:
unsigned int flagsPreCpp11 = ~static_cast<unsigned int>(0);
auto flags = ~static_cast<unsigned int>(0); // C++11 initialization
predeclaredflags = ~static_cast<decltype(predeclaredflags)>(0); // C++11 assignment to already declared variable
I'm going to explain my preference in detail below. As Johannes mentioned totally correctly, the fundamental origin of irritations here is the question about value vs. according bit representation semantics and about what types we're talking about exactly (the assigned value type vs. the possible compile time integral constant's type). Since there's no standard built-in mechanism to explicitly ensure the set of all bits to 1 for the concrete use case of the OP about unsigned integer values, it's obvious, that it's impossible to be fully independent of value semantics here (std::bitset is a common pure bit-layer refering container but the question was about unsigned integers in general). But we might be able to reduce ambiguity here.
Comparison of the 'better' standard compliant approaches:
The OP's way:
unsigned int flags = -1;
PROs:
is "established" and short
is quite intuitive in terms of modulo perspective of value to "natural" bit value representation
changing the target unsigned type to unsigned long for instance is possible without any further adaptions
CONs:
At least beginners might not be sure about the standard conformity ("Do I have to concern about padding bits?").
Violates type ranges (in the heavier way: signed vs. unsigned).
Solely from the code, you do not directly see any bit semantics association.
Refering to maximum values via defines:
unsigned int flags = UINT_MAX;
This circumvents the signed unsigned transition issue of the -1 approach but introduces several new problems: In doubt, one has to look twice here again, at the latest if you want to change the target type to unsigned long for instance. And here, one has to be sure about the fact, that the maximum value leads to all bits set to 1 by the standard (and padding bit concerns again). Bit semantics are also not obvious here directly from the code solely again.
Refering to maximum values more explicitly:
auto flags = std::numeric_limits<unsigned int>::max();
On my opinion, that's the better maximum value approach since it's macro/define free and one is explicit about the involved type. But all other concerns about the approach type itself remain.
Adrian's approach (and why I think, it's the preferred one before C++11 and since):
unsigned int flagsPreCpp11 = ~static_cast<unsigned int>(0);
auto flagsCpp11 = ~static_cast<unsigned int>(0);
PROs:
Only the simplest integral compile time constant is used: 0. So no worries about further bit representation or (implicit) casts are justified. From an intuitive point of view, I think we all can agree on the fact, that the bit representation for zero is commonly clearer than for maximum values, not only for unsigned integrals.
No type ambiguities are involved, no further look-ups required in doubt.
Explicit bit semantics are involved here via the complement ~. So it's quite clear from the code, what the intention was. And it's also very explicit, on which type and type range, the complement is applied.
CONs:
If assigned to a member for instance, there's a small chance that you mismatch types with pre C++11:
Declaration in class:
unsigned long m_flags;
Initialization in constructor:
m_flags(~static_cast<unsigned int>(0))
But since C++11, the usage of decltype + auto is powerful to prevent most of these possible issues. And some of these type mismatch scenarios (on interface boundaries for instance) are also possible for the -1 approach.
Robust final C++11 approach for pre-declared variables:
m_flags(~static_cast<decltype(m_flags)>(0)) // member initialization case
So with a full view on the weighting of the PROs and CONs of all approaches here, I recommend this one as the preferred approach, at latest since C++11.
Update: Thanks to a hint by Andrew Henle, I removed the statement about its readability since that might be a too subjective statement. But I still think, its readability is at least not that worse than most of the maximum value approaches or the ones with explicit maximum value provision via compile time integrals/literals since static_cast-usage is "established" too and built-in in contrast to defines/macros and even the std-lib.
yes the representation shown is very much correct as if we do it the other way round u will require an operator to reverse all the bits but in this case the logic is quite straightforward if we consider the size of the integers in the machine
for instance in most machines an integer is 2 bytes = 16 bits maximum value it can hold is 2^16-1=65535 2^16=65536
0%65536=0
-1%65536=65535 which corressponds to 1111.............1 and all the bits are set to 1 (if we consider residue classes mod 65536)
hence it is much straight forward.
I guess
no if u consider this notion it is perfectly dine for unsigned ints and it actually works out
just check the following program fragment
int main()
{
unsigned int a=2;
cout<<(unsigned int)pow(double(a),double(sizeof(a)*8));
unsigned int b=-1;
cout<<"\n"<<b;
getchar();
return 0;
}
answer for b = 4294967295 whcih is -1%2^32 on 4 byte integers
hence it is perfectly valid for unsigned integers
in case of any discrepancies plzz report