Integer overflow in operations - C++

The following code:
UINT32 dword = 4294967295;
if(dword + 1 != 0) // condition
In such operations, is there any guarantee that the biggest (whole) register available on the architecture is always used, so that the above condition will always be true in 64-bit mode and false in 32-bit mode?

That'll depend on what sort of type UINT32 really is.
If it's an unsigned type (as you'd expect) then results are guaranteed to be reduced modulo the largest value that can be represented + 1, so code like this:
if (std::numeric_limits<T>::is_unsigned)
    assert(std::numeric_limits<T>::max() + 1 == 0);
...should succeed. OTOH, based on the name, we'd typically expect that to be a 32-bit type regardless of the implementation, register size, etc., so we'd expect to get the same result regardless.
Edit: [sorry, had to stop and feed baby for a few minutes] I should add in more detail. Although we can certainly hope it's unlikely in practice, it's conceivable that UINT32 could really be (say) a 16-bit unsigned short. For the sake of discussion, let's assume that int is 32 bits.
In this case, dword+1 would involve math between an unsigned short and an int (the implicit type of 1). In that case, dword would actually be initialized to 65535. Then, when you did the addition, that 65535 would be promoted to a 32-bit int, and 1 added as an int, so the result would be 65536.
At least in theory, the same basic thing could happen if UINT32 was an unsigned 32-bit type (as we'd expect) but int was a 64-bit type. Again, dword would be promoted to int before doing the math, so the math would be done on 64-bit quantities rather than 32-bit, so (again) the result would not wrap around to 0.
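A small sketch of both scenarios (assuming the usual 32-bit int; unsigned short stands in for the hypothetical 16-bit UINT32):

#include <cstdint>
#include <iostream>

int main() {
    unsigned short word = 65535;     // what dword would hold if UINT32 were really 16 bits
    std::cout << word + 1 << "\n";   // promoted to int before adding: prints 65536, not 0

    std::uint32_t dword = 4294967295u;
    std::cout << dword + 1 << "\n";  // prints 0, provided int is not wider than 32 bits
    return 0;
}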

Related

Typecasting char to long

Say I have a variable, a
char a = 0x01;
and I want to cast this to a long, as in
long b;
b = (long)a;
Will the upper 3 bytes in b be guaranteed to be 0? With my setup they are 0, but I'm not sure if this is compiler-dependent.
Yes, b is guaranteed to have the value 0x1 after this assignment, even without the cast. The assignment operator in C++ is generally semantic, or value driven: it copies the value or state rather than performing a bitwise copy (even if the two are sometimes equivalent, such as for trivial types).
In some cases, especially because of operator overloading, this may not be the case. Developers are very strongly encouraged to keep to this concept when they design new types, but a careless programmer could overload the assignment operator for non-fundamental types to do anything he/she wants.
As a long can represent all values for a char (be it signed or unsigned) the conversion is guaranteed to not change the value.
If you initially have a positive value, either because char is unsigned on your architecture or because the char value is between 0 and 127 (assuming 8-bit characters), the resulting long is guaranteed to be positive and less than 256. So on an architecture where long is 4 bytes large, the 3 high-order bytes are guaranteed to be 0.
If char is signed and the initial value is negative, things will be different! The value will be unchanged and will still be negative. On a common two's complement architecture, the 3 high-order bytes will be 0xFF.
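A small sketch of that negative case (assuming an 8-bit signed char and a two's complement machine):

#include <iostream>

int main() {
    signed char a = -1;   // bit pattern 0xFF in an 8-bit char
    long b = a;           // value-preserving conversion: b == -1
    std::cout << b << "\n";                              // prints -1
    std::cout << std::hex << (b & 0xFFFFFFFFL) << "\n";  // prints ffffffff: on a 32-bit
                                                         // long, the 3 high-order bytes are 0xFF
    return 0;
}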
The answer already given is right, but I thought I'd add that for C++, it is recommended to use one of the C++-specific casting notations, to make it abundantly clear what you are doing. Here, you would use:
long b;
b = static_cast<long>(a);
This makes it very clear what you are doing (a cast to long whose validity is checked at compile time), and you know that the "right" sort of cast will be performed.
char a = 0x01;
long b;
b = (long)a;
C and C++ are two different (but closely related) languages. Their rules happen to be the same in this case.
The cast (not "typecast") is not necessary. The assignment could, and probably should, be written as:
b = a;
which causes an implicit conversion from char to long. Since the value being converted is within the representable range of type long, the result of the conversion is 1. The result of the conversion is specified in terms of values, not representations.
The representation of the value 1 in type long probably has a 1 in the low-order bit, and 0s in all the other bits. (And the position of the low-order bit can vary; some systems are big-endian, some are little-endian, and there are other possibilities.)
There is no guarantee that type long even has three high-order bytes. Type long is at least 32 bits wide, but a byte can be wider than 8 bits. It's even possible that there are values of type char that exceed LONG_MAX (if plain char is signed and long is 1 byte, which implies CHAR_BIT >= 32).
It's also possible that the representation of type long includes padding bits, bits that do not contribute to the value. It's guaranteed that the sign bit is 0, the low-order value bit is 1, and all other value bits are 0, but if there are padding bits their values are not guaranteed. (Some combinations of padding bits can result in a trap representation that does not represent any value, but that can't happen in this particular case.)
Most of these exotic possibilities are very unlikely to occur in real life. C implementations for some DSPs do have bytes wider than 8 bits, but any system you're using almost certainly has 8-bit bytes.
The point is that the result of the conversion is defined in terms of values, not representations, and 99% of the time that's all you need to care about. If you write:
char a = 1; /* same as 0x01 */
long b = a;
printf("b = %ld\n", b);
it will print b = 1, even if you're using some exotic system where the value 1 is represented strangely.
b will be 1; this is always, compiler and endianness-independent, true. Additionally, the following expressions will be true:
b == 1
b == 01
b == 0x1
b == 0x00000001
b == 0x00000000000000000000000000000000000000000000000000001
The right hand side in all cases is an int constant with the value 1; not more, not less. Note that the zeroes do not represent bytes in memory (an int most likely does not have the number of bytes the last expression appears to suggest). The hexadecimal notation is just another way to write down a 1, exactly like 1.
In particular, we don't know where in memory the byte with the value 1 is located, because that is architecture dependent. It may be the one at the address of the int, or it may be the other end, or even in between.
Now comes the sweet thing: C does not care how the memory in an int is laid out. None of the ways to write an integer constant is architecture dependent. That seems self-evident for decimal constants: did we expect that the meaning of int i = 1 is architecture dependent? Certainly not. Nor is int i = 0x00000001;. The same is true for the bit shift operators: << shifts towards more significant bits, >> towards less significant bits. The digits in (decimal or hexadecimal) integer constants are ordered so that the most significant digits are on the left side, aligning with the "direction" indicated by the arrow-like bit shift operators. That may or may not reflect your machine's int representation; on a PC it does not.
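For instance (a minimal, hypothetical one-liner), shifting left moves the value toward more significant bits whatever the byte order in memory:

unsigned int i = 0x01;
i <<= 8;   // i == 0x100 on big- and little-endian machines alike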
Bottom line: If you use the standard C (or C++) means to test the "upper 3 bytes", you are home free, and the following is always true, independent of the implementation or architecture:
char a = 0x01;
long b = a;
(b & 0xff) == 1       // least significant byte is 1
(b & 0x000000ff) == 1 // exactly the same as above
(b & 0xffffff00) == 0 // the three more significant bytes are all 0
It's possible that your long has more bits, but that is implementation dependent. However many more there are, they are all zero, save for the least significant one.

Is it a good practice to use long int to avoid overflow?

Suppose I have declared following variables :
some_dataType a,b,c; //a,b,c are positive
int res;
res=a+b-c;
And I have been provided with the information that the variable res cannot cross integer limit. Just to avoid overflow from the expression a+b in a+b-c, I have the following options:
declare a,b,c as long int
(a-c)+b
(long int)(a+b-c) //or some other type casting
My question is which of the above is good practice. Also, just to escape from all this, I prefer doing option 1. But it may increase the memory size of the program (not for this program though, but for large applications).
Assumption: long int is wider than int (thanks, Retired Ninja).
There's no simple, general, portable way to avoid integer overflow.
Using a wider integer type is not a general solution, because there's no guarantee that there is a wider integer type. It's very common for the types int and long to be the same size on some systems. 32-bit systems typically make both int and long 32 bits; even some 64-bit systems do the same thing. And some 64-bit systems make both int and long 64 bits.
You wrote:
And I have been provided with the information that the variable res cannot cross integer limit.
It's best to be careful with terminology. Although the type name int is obviously an abbreviation of the word "integer", the words are not synonymous. In C and C++, int is just one of several integer types; others include unsigned char and long long.
Presumably what you mean is that the value of res must not be outside the range of type int, which is INT_MIN to INT_MAX.
If you use long long for the result, it's likely that you can avoid overflow; you can then compare the result to the values INT_MIN and INT_MAX, and take whatever corrective action is appropriate if it's out of range. But it's still possible for both int and long long to be 64 bits. What you do with that depends on how portable your code needs to be.
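A minimal sketch of that approach (assuming long long is actually wider than int; the function name is just for illustration):

#include <climits>

int add_sub_checked(int a, int b, int c, bool& in_range)
{
    long long wide = static_cast<long long>(a) + b - c;   // compute in the wider type
    in_range = (wide >= INT_MIN && wide <= INT_MAX);      // does it fit back into int?
    return in_range ? static_cast<int>(wide) : 0;         // 0 is an arbitrary fallback
}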
You cannot safely check whether a signed integer addition or subtraction overflowed after the fact. An overflow in signed arithmetic causes undefined behavior. Typically the result wraps around, but in principle your program could crash before you have a chance to examine the result.
Some compilers might provide functions that perform safe signed arithmetic and tell you whether there was an overflow. Consult your compiler's documentation.
There are ways to test the operands before performing the operation. They're complicated, and I'm too lazy to work out the details, but briefly:
If the operands of + (call them x and y) have opposite signs, or if one operand is 0, the addition is safe.
If both operands are positive, the addition is safe if x does not exceed INT_MAX - y.
If both operands are negative, the addition is safe if x is no less than INT_MIN - y.
I do not guarantee that the above is completely correct (I just wrote it off the top of my head), but in most cases it's probably more effort than it's worth.
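For what it's worth, recent GCC and Clang also provide built-ins such as __builtin_add_overflow for this. A sketch of the pre-check described above (the function name is just for illustration):

#include <climits>

bool add_would_overflow(int x, int y)
{
    if (y > 0) return x > INT_MAX - y;   // sum would exceed INT_MAX
    if (y < 0) return x < INT_MIN - y;   // sum would fall below INT_MIN
    return false;                        // y == 0 is always safe
}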
Trust, but verify:
assert(a > 0);
assert(b > 0);
assert(c > 0);
assert((long)a - (long)c + (long)b <= INT_MAX);
int res = a - c + b;

Overflowing of Unsigned Int

What will the unsigned int contain when I overflow it? To be specific, I want to do a multiplication with two unsigned ints: what will be in the unsigned int after the multiplication is finished?
unsigned int someint = 253473829*13482018273;
unsigned numbers can't overflow, but instead wrap around using the properties of modulo.
For instance, when unsigned int is 32 bits, the result would be: (a * b) mod 2^32.
As CharlesBailey pointed out, 253473829*13482018273 may use signed multiplication before being converted, and so you should be explicit about unsigned before the multiplication:
unsigned int someint = 253473829U * 13482018273U;
Unsigned integer overflow, unlike its signed counterpart, exhibits well-defined behaviour.
Values basically "wrap" around. It's safe and commonly used for counting down, or hashing/mod functions.
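A small sketch of that reduction (assuming a 32-bit unsigned int and a 64-bit unsigned long long):

#include <cstdint>
#include <iostream>

int main() {
    std::uint64_t full = 253473829ull * 13482018273ull;   // exact product, fits in 64 bits
    unsigned int someint = static_cast<unsigned int>(full % 4294967296ull);  // same value mod 2^32

    std::cout << full << "\n" << someint << "\n";
    return 0;
}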
It probably depends a bit on your compiler. I had errors like this years ago, and sometimes you would get a runtime error; other times it would basically "wrap" back to a really small number resulting from chopping off the highest-level bits and keeping the remainder, i.e. if it's a 32-bit unsigned int and the result of your multiplication would be a 34-bit number, it would chop off the high-order 2 bits and give you the remainder. You would probably have to try it on your compiler to see exactly what you get, which may not be the same thing you would get with a different compiler, especially if the overflow happens in the middle of an expression where the end result is within the range of an unsigned int.

char* to double and back to char* again ( 64 bit application)

I am trying to convert a char* to double and back to char* again. The following code works fine if the application is 32-bit, but doesn't work for a 64-bit application. The problem occurs when you try to convert back to char*. For example, if hello is 0x000000013fcf7888 then converted is 0x000000003fcf7888; only the last 32 bits are right.
#include <iostream>
#include <stdlib.h>
#include <tchar.h>
using namespace std;

int _tmain(int argc, _TCHAR* argv[]){
    char* hello = "hello";
    unsigned int hello_to_int = (unsigned int)hello;
    double hello_to_double = (double)hello_to_int;
    cout<<hello<<endl;
    cout<<hello_to_int<<"\n"<<hello_to_double<<endl;
    unsigned int converted_int = (unsigned int)hello_to_double;
    char* converted = reinterpret_cast<char*>(converted_int);
    cout<<converted_int<<"\n"<<converted<<endl;
    getchar();
    return 0;
}
On 64-bit Windows pointers are 64-bit while int is 32-bit. This is why you're losing data in the upper 32-bits while casting. Instead of int use long long to hold the intermediate result.
char* hello = "hello";
unsigned long long hello_to_int = (unsigned long long)hello;
Make similar changes for the reverse conversion. But this is not guaranteed to make the conversions function correctly, because a double can easily represent the entire 32-bit integer range without loss of precision, but the same is not true for a 64-bit integer.
Also, this isn't going to work
unsigned int converted_int = (unsigned int)hello_to_double;
That conversion will simply truncate any digits after the decimal point in the floating point representation. The problem exists even if you change the data type to unsigned long long. You'd need to reinterpret the double's bits (a reinterpret_cast<unsigned long long&>, for instance) rather than convert its value to make it work.
Even after all that you may still run into trouble depending on the value of the pointer. The conversion to double may cause the value to be a signalling NaN for instance, in which case your code might throw an exception.
Simple answer is, unless you're trying this out for fun, don't do conversions like these.
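If the real goal is only to round-trip a pointer through an integer (with no double involved), std::uintptr_t from <cstdint> is the type intended for that; a minimal sketch:

#include <cstdint>
#include <iostream>

int main() {
    const char* hello = "hello";

    std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(hello);  // wide enough for a pointer
    const char* back = reinterpret_cast<const char*>(bits);         // and back again

    std::cout << back << std::endl;   // prints "hello"
    return 0;
}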
You can't cast a char* to int on 64-bit Windows because an int is 32 bits, while a char* is 64 bits because it's a pointer. Since a double is always 64 bits, you might be able to get away with casting between a double and char*.
A couple of issues with encoding any integer (specifically, a collection of bits) into a floating point value:
Conversions from 64-bit integers to doubles can be lossy. A double has 53 bits of actual precision, so integers above 2^53 will not necessarily be represented exactly.
If you decide to reinterpret the bits of a pointer as a double instead (via union or reinterpret_cast) you will still have issues if you happen to encode a pointer as set of bits that are not a valid double representation. Unless you can guarantee that the double value never gets written back by the FPU, the FPU can silently transform an invalid double into another invalid double (see NaN), i.e., a double value that represents the same value but has different bits. (See this for issues related to using floating point formats as bits.)
You can probably safely get away with encoding a 32-bit pointer in a double, as that will definitely fit within the 53-bit precision range.
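If one really must smuggle a 64-bit pointer through a 64-bit double, copying the bytes (rather than converting the value) at least avoids the precision loss, though the invalid-representation caveats above still apply. A hedged sketch:

#include <cstring>
#include <iostream>

int main() {
    static_assert(sizeof(const char*) <= sizeof(double), "pointer must fit in a double");

    const char* hello = "hello";
    double d = 0.0;
    std::memcpy(&d, &hello, sizeof hello);   // reinterpret the bits, no value conversion

    const char* back = nullptr;
    std::memcpy(&back, &d, sizeof back);     // and back again
    std::cout << back << std::endl;          // prints "hello" in practice
    return 0;
}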
only the last 32 bits are right.
That's because an int on your platform is only 32 bits long. Note that reinterpret_cast only guarantees that you can convert a pointer to an integer of sufficient size (not your case), and back.
If it works in any system, anywhere, just call yourself lucky and move on. Converting a pointer to an integer is one thing (as long as the integer is large enough, you can get away with it), but a double is a floating point number - what you are doing simply doesn't make any sense, because a double is NOT necessarily capable of representing any random number. A double has range and precision limitations, and limits on how it represents things. It can represent numbers across a wide range of values, but it can't represent EVERY number in that range.
Remember that a double has two components: the mantissa and the exponent. Together, these allow you to represent either very big or very small numbers, but the mantissa has a limited number of bits. If you run out of bits in the mantissa, you're going to lose some bits in the number you are trying to represent.
Apparently you got away with it under certain circumstances, but you're asking it to do something it wasn't made for, and for which it is manifestly inappropriate.
Just don't do that - it's not supposed to work.
This is as expected.
Typically a char* is going to be 32 bits on a 32-bit system, 64 bits on a 64-bit system; double is typically 64 bits on both systems. (These sizes are typical, and probably correct for Windows; the language permits a lot more variations.)
Conversion from a pointer to a floating-point type is, as far as I know, undefined. That doesn't just mean that the result of the conversion is undefined; the behavior of a program that attempts to perform such a conversion is undefined. If you're lucky, the program will crash or fail to compile.
But you're converting from a pointer to an integer (which is permitted, but implementation-defined) and then from an integer to a double (which is permitted and meaningful for meaningful numeric values -- but converted pointer values are not numerically meaningful). You're losing information because not all of the 64 bits of a double are used to represent the magnitude of the number; typically 11 or so bits are used to represent the exponent.
What you're doing quite simply makes no sense.
What exactly are you trying to accomplish? Whatever it is, there's surely a better way to do it.

Is it safe to use -1 to set all bits to true?

I've seen this pattern used a lot in C & C++.
unsigned int flags = -1; // all bits are true
Is this a good portable way to accomplish this? Or is using 0xffffffff or ~0 better?
I recommend doing it exactly as you have shown, since it is the most straightforward. Initializing to -1 will always work, independent of the actual sign representation, while ~ will sometimes have surprising behavior because you have to have the right operand type; only then will you get the highest value of the unsigned type.
For an example of a possible surprise, consider this one:
unsigned long a = ~0u;
It won't necessarily store a pattern with all bits 1 into a. It first creates a pattern with all bits 1 in an unsigned int and then assigns it to a; if unsigned long has more bits, not all of them will be 1.
And consider this one, which will fail on a non-two's complement representation:
unsigned int a = ~0; // Should have done ~0u !
The reason for that is that ~0 has to invert all the bits of 0. Doing so yields -1 on a two's complement machine (which is the value we need!), but does not yield -1 on other representations. On a one's complement machine, it yields zero. Thus, on a one's complement machine, the above will initialize a to zero.
The thing you should understand is that it's all about values - not bits. The variable is initialized with a value. If in the initializer you modify the bits of the variable used for initialization, the value will be generated according to those bits. The value you need, to initialize a to the highest possible value, is -1 or UINT_MAX. The second will depend on the type of a - you will need to use ULONG_MAX for an unsigned long. However, the first will not depend on its type, and it's a nice way of getting the highest value.
We are not talking about whether -1 has all bits one (it doesn't always). And we're not talking about whether ~0 has all bits one (it does, of course).
But what we are talking about is what the result of the initialized flags variable is. And for it, only -1 will work with every type and machine.
unsigned int flags = -1; is portable.
unsigned int flags = ~0; isn't portable because it relies on a two's-complement representation.
unsigned int flags = 0xffffffff; isn't portable because it assumes 32-bit ints.
If you want to set all bits in a way guaranteed by the C standard, use the first one.
Frankly I think all fff's is more readable. As to the comment that it's an antipattern: if you really care that all the bits are set/cleared, I would argue that you are probably in a situation where you care about the size of the variable anyway, which would call for something like boost::uint16_t, etc.
A way which avoids the problems mentioned is to simply do:
unsigned int flags = 0;
flags = ~flags;
Portable and to the point.
I am not sure using an unsigned int for flags is a good idea in the first place in C++. What about bitset and the like?
std::numeric_limits<unsigned int>::max() is better because 0xffffffff assumes that unsigned int is a 32-bit integer.
unsigned int flags = -1; // all bits are true
"Is this a good[,] portable way to accomplish this?"
Portable? Yes.
Good? Debatable, as evidenced by all the confusion shown on this thread. Being clear enough that your fellow programmers can understand the code without confusion should be one of the dimensions we measure for good code.
Also, this method is prone to compiler warnings. To suppress the warning without crippling your compiler's diagnostics, you'd need an explicit cast. For example,
unsigned int flags = static_cast<unsigned int>(-1);
The explicit cast requires that you pay attention to the target type. If you're paying attention to the target type, then you'll naturally avoid the pitfalls of the other approaches.
My advice would be to pay attention to the target type and make sure there are no implicit conversions. For example:
unsigned int flags1 = UINT_MAX;
unsigned int flags2 = ~static_cast<unsigned int>(0);
unsigned long flags3 = ULONG_MAX;
unsigned long flags4 = ~static_cast<unsigned long>(0);
All of which are correct and more obvious to your fellow programmers.
And with C++11: We can use auto to make any of these even simpler:
auto flags1 = UINT_MAX;
auto flags2 = ~static_cast<unsigned int>(0);
auto flags3 = ULONG_MAX;
auto flags4 = ~static_cast<unsigned long>(0);
I consider correct and obvious better than simply correct.
Converting -1 into any unsigned type is guaranteed by the standard to result in all-ones. Use of ~0U is generally bad since 0 has type unsigned int and will not fill all the bits of a larger unsigned type, unless you explicitly write something like ~0ULL. On sane systems, ~0 should be identical to -1, but since the standard allows ones-complement and sign/magnitude representations, strictly speaking it's not portable.
Of course it's always okay to write out 0xffffffff if you know you need exactly 32 bits, but -1 has the advantage that it will work in any context even when you do not know the size of the type, such as macros that work on multiple types, or if the size of the type varies by implementation. If you do know the type, another safe way to get all-ones is the limit macros UINT_MAX, ULONG_MAX, ULLONG_MAX, etc.
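A small sketch of that difference (assuming a 32-bit unsigned int and a 64-bit unsigned long long):

#include <iostream>

int main() {
    unsigned long long a = -1;      // all 64 bits set: ffffffffffffffff
    unsigned long long b = ~0u;     // only the low 32 bits set: ffffffff
    unsigned long long c = ~0ull;   // all 64 bits set again

    std::cout << std::hex << a << "\n" << b << "\n" << c << "\n";
    return 0;
}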
Personally I always use -1. It always works and you don't have to think about it.
As long as you have #include <limits.h> as one of your includes, you should just use
unsigned int flags = UINT_MAX;
If you want a long's worth of bits, you could use
unsigned long flags = ULONG_MAX;
These values are guaranteed to have all the value bits of the result set to 1, regardless of how signed integers are implemented.
Yes. As mentioned in other answers, -1 is the most portable; however, it is not very semantic and triggers compiler warnings.
To solve these issues, try this simple helper:
#include <type_traits>

static const struct All1s
{
    template<typename UnsignedType>
    inline operator UnsignedType(void) const
    {
        static_assert(std::is_unsigned<UnsignedType>::value, "This is designed only for unsigned types");
        return static_cast<UnsignedType>(-1);
    }
} ALL_BITS_TRUE;
Usage:
unsigned a = ALL_BITS_TRUE;
uint8_t b = ALL_BITS_TRUE;
uint16_t c = ALL_BITS_TRUE;
uint32_t d = ALL_BITS_TRUE;
uint64_t e = ALL_BITS_TRUE;
On Intel's 64-bit x86 processors (IA-32e) it is OK to write 0xFFFFFFFF to a 64-bit register and get the expected results. This is because IA-32e (the 64-bit extension to IA-32) generally only supports 32-bit immediates. In 64-bit instructions, 32-bit immediates are sign-extended to 64 bits.
The following is illegal:
mov rax, 0ffffffffffffffffh
The following puts 64 1s in RAX:
mov rax, 0ffffffffh
Just for completeness, the following puts 32 1s in the lower part of RAX (aka EAX):
mov eax, 0ffffffffh
And in fact I've had programs fail when I wanted to write 0xffffffff to a 64-bit variable and I got a 0xffffffffffffffff instead. In C this would be:
uint64_t x;
x = UINT64_C(0xffffffff);
printf("x is %"PRIx64"\n", x);
the result is:
x is 0xffffffffffffffff
I thought to post this as a comment to all the answers that said that 0xFFFFFFFF assumes 32 bits, but so many people answered it I figured I'd add it as a separate answer.
See litb's answer for a very clear explanation of the issues.
My disagreement is that, very strictly speaking, there are no guarantees for either case. I don't know of any architecture that does not represent an unsigned value of 'one less than two to the power of the number of bits' as all bits set, but here is what the Standard actually says (3.9.1/7 plus note 44):
The representations of integral types shall define values by use of a pure binary numeration system. [Note 44:]A positional representation for integers that uses the binary digits 0 and 1, in which the values represented by successive bits are additive, begin with 1, and are multiplied by successive integral power of 2, except perhaps for the bit with the highest position.
That leaves the possibility for one of the bits to be anything at all.
I would not do the -1 thing. It's rather non-intuitive (to me at least). Assigning signed data to an unsigned variable just seems to be a violation of the natural order of things.
In your situation, I always use 0xFFFF. (Use the right number of Fs for the variable size of course.)
[BTW, I very rarely see the -1 trick done in real-world code.]
Additionally, if you really care about the individual bits in a variable, it would be a good idea to start using the fixed-width uint8_t, uint16_t, uint32_t types.
Although the 0xFFFF (or 0xFFFFFFFF, etc.) may be easier to read, it can break portability in code which would otherwise be portable. Consider, for example, a library routine to count how many items in a data structure have certain bits set (the exact bits being specified by the caller). The routine may be totally agnostic as to what the bits represent, but still need to have an "all bits set" constant. In such a case, -1 will be vastly better than a hex constant since it will work with any bit size.
The other possibility, if a typedef value is used for the bitmask, would be to use ~(bitMaskType)0; if bitmask happens to only be a 16-bit type, that expression will only have 16 bits set (even if 'int' would otherwise be 32 bits) but since 16 bits will be all that are required, things should be fine provided that one actually uses the appropriate type in the typecast.
Incidentally, expressions of the form longvar &= ~[hex_constant] have a nasty gotcha if the hex constant is too large to fit in an int, but will fit in an unsigned int. If an int is 16 bits, then longvar &= ~0x4000; or longvar &= ~0x10000; will clear one bit of longvar, but longvar &= ~0x8000; will clear out bit 15 and all bits above that. Values which fit in int will have the complement operator applied to a type int, but the result will be sign extended to long, setting the upper bits. Values which are too big for unsigned int will have the complement operator applied to type long. Values which are between those sizes, however, will apply the complement operator to type unsigned int, which will then be converted to type long without sign extension.
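A sketch of that gotcha translated to a platform with a 32-bit int and a 64-bit long long (the constants are purely for illustration):

#include <iostream>

int main() {
    long long x = -1, y = -1;

    x &= ~0x40000000;   // 0x40000000 fits in int: ~ yields a negative int that sign-extends,
                        // so only bit 30 is cleared
    y &= ~0x80000000;   // 0x80000000 has type unsigned int: ~ yields 0x7fffffff, which
                        // zero-extends, so bit 31 and every bit above it is cleared

    std::cout << std::hex
              << static_cast<unsigned long long>(x) << "\n"   // ffffffffbfffffff
              << static_cast<unsigned long long>(y) << "\n";  // 7fffffff
    return 0;
}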
As others have mentioned, -1 is the correct way to create an integer that will convert to an unsigned type with all bits set to 1. However, the most important thing in C++ is using correct types. Therefore, the correct answer to your problem (which includes the answer to the question you asked) is this:
std::bitset<32> const flags(-1);
This will always contain the exact amount of bits you need. It constructs a std::bitset with all bits set to 1 for the same reasons mentioned in other answers.
It is certainly safe, as -1 will always have all available bits set, but I like ~0 better. -1 just doesn't make much sense for an unsigned int. 0xFF... is not good because it depends on the width of the type.
Practically: Yes
Theoretically: No.
-1 = 0xFFFFFFFF (or whatever size an int is on your platform) is only true with two's complement arithmetic. In practice, it will work, but there are legacy machines out there (IBM mainframes, etc.) where you've got an actual sign bit rather than a two's complement representation. Your proposed ~0 solution should work everywhere.
I say:
int x;
memset(&x, 0xFF, sizeof(int));
This will always give you the desired result.
Leveraging on the fact that assigning all bits to one for an unsigned type is equivalent to taking the maximum possible value for the given type,
and extending the scope of the question to all unsigned integer types:
Assigning -1 works for any unsigned integer type (unsigned int, uint8_t, uint16_t, etc.) for both C and C++.
As an alternative, for C++, you can either:
Include <limits> and use std::numeric_limits< your_type >::max()
Write a custom templated function (This would also allow some sanity check, i.e. if the destination type is really an unsigned type)
The purpose could be to add more clarity, as assigning -1 would always need some explanatory comment.
A way to make the meaning a bit more obvious and yet avoid repeating the type:
const auto flags = static_cast<unsigned int>(-1);
An additional effort to emphasize why Adrian McCarthy's approach here might be the best solution, at least since C++11, in terms of a compromise between standard conformity, type safety/explicitness, and reduction of possible ambiguities:
unsigned int flagsPreCpp11 = ~static_cast<unsigned int>(0);
auto flags = ~static_cast<unsigned int>(0); // C++11 initialization
predeclaredflags = ~static_cast<decltype(predeclaredflags)>(0); // C++11 assignment to already declared variable
I'm going to explain my preference in detail below. As Johannes correctly mentioned, the fundamental source of confusion here is value semantics vs. bit representation semantics, and the question of which types we're talking about exactly (the assigned value's type vs. the compile-time integral constant's type). Since there's no standard built-in mechanism to explicitly set all bits to 1 for the OP's concrete use case of unsigned integer values, it's impossible to be fully independent of value semantics here (std::bitset is a common pure bit-level container, but the question was about unsigned integers in general). But we might be able to reduce the ambiguity.
Comparison of the 'better' standard compliant approaches:
The OP's way:
unsigned int flags = -1;
PROs:
is "established" and short
is quite intuitive once you view the value modulo 2^N together with its "natural" bit representation
changing the target unsigned type to unsigned long, for instance, is possible without any further adaptations
CONs:
At least beginners might not be sure about the standard conformity ("Do I have to concern about padding bits?").
Violates type ranges (in the heavier way: signed vs. unsigned).
Solely from the code, you do not directly see any bit semantics association.
Referring to maximum values via defines:
unsigned int flags = UINT_MAX;
This circumvents the signed-to-unsigned conversion issue of the -1 approach, but introduces several new problems: if in doubt, one has to look twice here again, at the latest when changing the target type to unsigned long, for instance. And here, one has to be sure that the maximum value leads to all bits set to 1 by the standard (and the padding-bit concerns apply again). The bit semantics are, again, not directly obvious from the code alone.
Referring to maximum values more explicitly:
auto flags = std::numeric_limits<unsigned int>::max();
In my opinion, that's the better maximum value approach, since it's macro/define free and one is explicit about the involved type. But all other concerns about the approach itself remain.
Adrian's approach (and why I think it's the preferred one, both before C++11 and since):
unsigned int flagsPreCpp11 = ~static_cast<unsigned int>(0);
auto flagsCpp11 = ~static_cast<unsigned int>(0);
PROs:
Only the simplest integral compile-time constant is used: 0. So there's no need to worry about further bit representations or (implicit) casts. From an intuitive point of view, I think we can all agree that the bit representation of zero is commonly clearer than that of maximum values, and not only for unsigned integrals.
No type ambiguities are involved, no further look-ups required in doubt.
Explicit bit semantics are involved here via the complement ~. So it's quite clear from the code, what the intention was. And it's also very explicit, on which type and type range, the complement is applied.
CONs:
If assigned to a member, for instance, there's a small chance that you mismatch types pre-C++11:
Declaration in class:
unsigned long m_flags;
Initialization in constructor:
m_flags(~static_cast<unsigned int>(0))
But since C++11, the usage of decltype + auto is powerful to prevent most of these possible issues. And some of these type mismatch scenarios (on interface boundaries for instance) are also possible for the -1 approach.
Robust final C++11 approach for pre-declared variables:
m_flags(~static_cast<decltype(m_flags)>(0)) // member initialization case
So with a full view of the weighting of the PROs and CONs of all approaches here, I recommend this one as the preferred approach, at least since C++11.
Update: Thanks to a hint from Andrew Henle, I removed the statement about readability, since that might be too subjective. But I still think its readability is not much worse than that of most of the maximum-value approaches, or of the ones that provide the maximum value explicitly via compile-time integrals/literals, since static_cast usage is "established" too and built into the language, in contrast to defines/macros and even the standard library.
Yes, the representation shown is correct. If we did it the other way round, you would need an operator to reverse all the bits, but in this case the logic is quite straightforward if we consider the size of the integers on the machine.
For instance, on a machine where an integer is 2 bytes = 16 bits, the maximum value it can hold is 2^16 - 1 = 65535 (2^16 = 65536), and:
0 % 65536 = 0
-1 % 65536 = 65535, which corresponds to 1111.............1, with all the bits set to 1 (if we consider residue classes mod 65536).
Hence it is quite straightforward.
I guess no; if you consider this notion, it is perfectly fine for unsigned ints and it actually works out.
Just check the following program fragment:
#include <iostream>
#include <cmath>
#include <cstdio>
using namespace std;

int main()
{
    unsigned int a = 2;
    cout << (unsigned int)pow(double(a), double(sizeof(a) * 8));
    unsigned int b = -1;
    cout << "\n" << b;
    getchar();
    return 0;
}
The answer for b is 4294967295, which is -1 % 2^32 on 4-byte integers.
Hence it is perfectly valid for unsigned integers.
In case of any discrepancies, please report them.