Detection of overflow - c++

How to detect overflow of unsigned char variable in c++?

unsigned numbers always positive between 0 to 255.and obey the law 2^n(n = numbers of bit in type);if char 8 bit then unsigned char variables have values between 0 and 255 , while signed chars have values between -128 and 127.

unsigned char Test = 260;
Because 260 is an integer literal your compiler should emit a warning. How to handle that? Do not ignore compiler warnings (or use an alternative syntax to avoid automatic conversions or enable this warning as error). Also note that integer literals are always positive (unsigned): -1 is not an integer literal: it's i.l. 1 and unary operator -. For gcc I'd suggest to use -Wstrict-overflow=2 (or more, according to your code policies) and possibly enabling -Werror=strict-overflow. For MS VC++ you may enable warning C4307 with /we4307 and /W14307 if you keep warnings at level 1 (!!!) (you may also do it with #pragma warning directive).
How to detect overflow of unsigned char variable in c++?
At compile-time compiler warnings are your friends but at run-time?
There is not a portable way (like, for example, checked in C#) to do this and better technique depends on which type of operation you want to monitor. For a simple assignment (made with values known at run-time) you may write something like this:
int32_t bigNumber = 260;
uint8_t smallNumber = static_cast<uint8_t>(bigNumber);
if (static_cast<int32_t>(smallNumber) != bigNumber) {
// Overflow...
}
In alternative you may check before assigning:
int32_t bigNumber = 260;
if (bigNumber > UINT8_MAX) {
// Overflow
}
Note that you may also make compiler life easier writing (after assignment):
if (smallNumber != bigNumber) {
// Overflow
}
It works because automatic promotions will convert smallNumber to bigNumber type (unless you're performing a signed/unsigned comparison, in this case you should simply avoid this alternative).
If you need it often you may write a small helper function to perform this conversion. For some ideas and possible implementations, if you're using MS compiler, you may take a look to SafeInt family functions (note, however, that in this case assignment and casting won't throw).

You can use braces to initialize your value to force a compile-time error (assuming you use C++11 or later):
unsigned char Test{260};
Brace-initialization doesn't allow narrowing conversions.
Of course, that still wouldn't allow sticking the value 260 into an unsigned char but it would draw attention to the attempt. You'd need a bigger data type, e.g., unsigned short, to represent 260.

Related

Converting Integer Types

How does one convert from one integer type to another safely and with setting off alarm bells in compilers and static analysis tools?
Different compilers will warn for something like:
int i = get_int();
size_t s = i;
for loss of signedness or
size_t s = get_size();
int i = s;
for narrowing.
casting can remove the warnings but don't solve the safety issue.
Is there a proper way of doing this?
You can try boost::numeric_cast<>.
boost numeric_cast returns the result of converting a value of type Source to a value of type Target. If out-of-range is detected, an exception is thrown (see bad_numeric_cast, negative_overflow and positive_overflow ).
How does one convert from one integer type to another safely and with setting off alarm bells in compilers and static analysis tools?
Control when conversion is needed. As able, only convert when there is no value change. Sometimes, then one must step back and code at a higher level. IOWs, was a lossy conversion needed or can code be re-worked to avoid conversion loss?
It is not hard to add an if(). The test just needs to be carefully formed.
Example where size_t n and int len need a compare. Note that positive values of int may exceed that of size_t - or visa-versa or the same. Note in this case, the conversion of int to unsigned only happens with non-negative values - thus no value change.
int len = snprintf(buf, n, ...);
if (len < 0 || (unsigned)len >= n) {
// Handle_error();
}
unsigned to int example when it is known that the unsigned value at this point of code is less than or equal to INT_MAX.
unsigned n = ...
int i = n & INT_MAX;
Good analysis tools see that n & INT_MAX always converts into int without loss.
There is no built-in safe narrowing conversion between int types in c++ and STL. You could implement it yourself using as an example Microsoft GSL.
Theoretically, if you want perfect safety, you shouldn't be mixing types like this at all. (And you definitely shouldn't be using explicit casts to silence warnings, as you know.) If you've got values of type size_t, it's best to always carry them around in variables of type size_t.
There is one case where I do sometimes decide I can accept less than 100.000% perfect type safety, and that is when I assign sizeof's return value, which is a size_t, to an int. For any machine I am ever going to use, the only time this conversion might lose information is when sizeof returns a value greater than 2147483647. But I am content to assume that no single object in any of my programs will ever be that big. (In particular, I will unhesitatingly write things like printf("sizeof(int) = %d\n", (int)sizeof(int)), explicit cast and all. There is no possible way that the size of a type like int will not fit in an int!)
[Footnote: Yes, it's true, on a 16-bit machine the assumption is the rather less satisfying threshold that sizeof won't return a value greater than 32767. It's more likely that a single object might have a size like that, but probably not in a program that's running on a 16-bitter.]

Will 'comparison between signed and unsigned integer expressions' ever actually result in errors?

Often an object I use will have (signed) int parameters (e.g. int iSize) which eventually store how large something should be. At the same time, I will often initialize them to -1 to signify that the object (etc) hasn't been setup / hasn't been filled / isn't ready for use.
I often end up with the warning comparison between signed and unsigned integer, when I do something like if( iSize >= someVector.size() ) { ... }.
Thus, I nominally don't want to be using an unsigned int. Are there any situations where this will lead to an error or unexpected behavior?
If not: what is the best way to handle this? If I use the compiler flag -Wno-sign-compare I could (hypothetically) miss a situation in which I should be using an unsigned int (or something like that). So should I just use a cast when comparing with an unsigned int--e.g. if( iSize >= (int)someVector.size() ) { ... } ?
Yes, there are, and very subtle ones. If you are curious, you can check this interesting presentation by Stephan T. Lavavej about arithmetic conversion and a bug in Microsoft's implementation of STL which was caused just by signed vs unsigned comparison.
In general, the problem is due to the fact that because of complement 2 arithmetic, a very small negative integral value has the same bit representation as a very big unsigned integral value (e.g. -1 = 0xFFFF = 65535).
In the specific case of checking size(), why not using type size_t for iSize in the first place? Unsigned values just give you greater expressivity, use it.
And if you do not want to declare iSize as size_t, just make it clear by using an explicit cast that you are aware of the nature of this comparison. The compiler is trying to do you a favor with those warnings and, as you correctly wrote, there might be situations where ignoring them would cause you a very bad headache.
Thus, if iSize is sometimes negative (and should be evaluated as less than all unsigned int values of size()), use the idiom: if ((iSize < 0) || ((unsigned)iSize < somevector.size())) ...

Verifying that C / C++ signed right shift is arithmetic for a particular compiler?

According to the C / C++ standard (see this link), the >> operator in C and C++ is not necessarily an arithmetic shift for signed numbers. It is up to the compiler implementation whether 0's (logical) or the sign bit (arithmetic) are shifted in as bits are shifted to the right.
Will this code function to ASSERT (fail) at compile time for compilers that implement a logical right shift for signed integers ?
#define COMPILE_TIME_ASSERT(EXP) \
typedef int CompileTimeAssertType##__LINE__[(EXP) ? 1 : -1]
#define RIGHT_SHIFT_IS_ARITHMETIC \
( (((signed int)-1)>>1) == ((signed int)-1) )
// SHR must be arithmetic to use this code
COMPILE_TIME_ASSERT( RIGHT_SHIFT_IS_ARITHMETIC );
Looks good to me! You can also set the compiler to emit an assembly file (or load the compiled program in the debugger) and look at which opcode it emits for signed int i; i >> 1;, but that's not automatic like your solution.
If you ever find a compiler that does not implement arithmetic right shift of a signed number, I'd like to hear about it.
Why assert? If your compiler's shift operator doesn't suit your needs, you could gracefully remedy the situation by sign-extending the result. Also, sometimes run-time is good enough. After all, the compiler's optimizer can make compile-time out of run-time:
template <typename Number>
inline Number shift_logical_right(Number value, size_t bits)
{
static const bool shift_is_arithmetic = (Number(-1) >> 1) == Number(-1);
const bool negative = value < 0;
value >>= bits;
if (!shift_is_arithmetic && negative) // sign extend
value |= -(Number(1) << (sizeof(Number) * 8 - bits));
}
The static const bool can be evaluated at compile time, so if shift_is_arithmetic is guaranteed to be true, every compiler worth its salt will eliminate the whole if clause and the construction of const bool negative as dead code.
Note: code is adapted from Mono's encode_sleb128 function: here.
Update
If you really want to abort compilation on machines without arithmetic shift, you're still better off not relying on the preprocessor. You can use static_assert (or BOOST_STATIC_ASSERT):
static_assert((Number(-1) >> 1) == Number(-1), "Arithmetic shift unsupported.");
From your various comments, you talk about using this cross-platform. Make sure that your compilers guarantee that when they compile for a platform, their compile-time operators will behave the same as run-time ones.
An example of differing behavior can be found with floating point numbers. Is your compiler doing its constant-expression math in single, double, or extended precision if you're casting back to int? Such as
constexpr int a = 41;
constexpr int b = (a / 7.5);
What I am saying is, you should make sure your compilers guarantee the same behavior during run-time as compile-time when you're working across so many different architectures.
It is entirely possible that a compiler might sign-extend internally but not generate the intended opcode(s) on the target. The only way to be sure is to test at run-time or look at the assembly output.
It's not the end of the world to look at assembly output...How many different platforms are there? Since this is so performance-critical just do the "work" of looking at 1-3 lines of assembler output for 5 different architectures. It isn't as if you have to dive through an entire assembly output (usually!) to find your line. It's very, very easy to do.

Worst side effects from chars signedness. (Explanation of signedness effects on chars and casts)

I frequently work with libraries that use char when working with bytes in C++. The alternative is to define a "Byte" as unsigned char but that not the standard they decided to use. I frequently pass bytes from C# into the C++ dlls and cast them to char to work with the library.
When casting ints to chars or chars to other simple types what are some of the side effects that can occur. Specifically, when has this broken code that you have worked on and how did you find out it was because of the char signedness?
Lucky i haven't run into this in my code, used a char signed casting trick back in an embedded systems class in school. I'm looking to better understand the issue since I feel it is relevant to the work I am doing.
One major risk is if you need to shift the bytes. A signed char keeps the sign-bit when right-shifted, whereas an unsigned char doesn't.
Here's a small test program:
#include <stdio.h>
int main (void)
{
signed char a = -1;
unsigned char b = 255;
printf("%d\n%d\n", a >> 1, b >> 1);
return 0;
}
It should print -1 and 127, even though a and b start out with the same bit pattern (given 8-bit chars, two's-complement and signed values using arithmetic shift).
In short, you can't rely on shift working identically for signed and unsigned chars, so if you need portability, use unsigned char rather than char or signed char.
The most obvious gotchas come when you need to compare the numeric value of a char with a hexadecimal constant when implementing protocols or encoding schemes.
For example, when implementing telnet you might want to do this.
// Check for IAC (hex FF) byte
if (ch == 0xFF)
{
// ...
Or when testing for UTF-8 multi-byte sequences.
if (ch >= 0x80)
{
// ...
Fortunately these errors don't usually survive very long as even the most cursory testing on a platform with a signed char should reveal them. They can be fixed by using a character constant, converting the numeric constant to a char or converting the character to an unsigned char before the comparison operator promotes both to an int. Converting the char directly to an unsigned won't work, though.
if (ch == '\xff') // OK
if ((unsigned char)ch == 0xff) // OK, so long as char has 8-bits
if (ch == (char)0xff) // Usually OK, relies on implementation defined behaviour
if ((unsigned)ch == 0xff) // still wrong
I've been bitten by char signedness in writing search algorithms that used characters from the text as indices into state trees. I've also had it cause problems when expanding characters into larger types, and the sign bit propagates causing problems elsewhere.
I found out when I started getting bizarre results, and segfaults arising from searching texts other than the one's I'd used during the initial development (obviously characters with values >127 or <0 are going to cause this, and won't necessarily be present in your typical text files.
Always check a variable's signedness when working with it. Generally now I make types signed unless I have a good reason otherwise, casting when necessary. This fits in nicely with the ubiquitous use of char in libraries to simply represent a byte. Keep in mind that the signedness of char is not defined (unlike with other types), you should give it special treatment, and be mindful.
The one that most annoys me:
typedef char byte;
byte b = 12;
cout << b << endl;
Sure it's cosmetics, but arrr...
When casting ints to chars or chars to other simple types
The critical point is, that casting a signed value from one primitive type to another (larger) type does not retain the bit pattern (assuming two's complement). A signed char with bit pattern 0xff is -1, while a signed short with the decimal value -1 is 0xffff. Casting an unsigned char with value 0xff to a unsigned short, however, yields 0x00ff. Therefore, always think of proper signedness before you typecast to a larger or smaller data type. Never carry unsigned data in signed data types if you don't need to - if an external library forces you to do so, do the conversion as late as possible (or as early as possible if the external code acts as data source).
The C and C++ language specifications define 3 data types for holding characters: char, signed char and unsigned char. The latter 2 have been discussed in other answers. Let's look at the char type.
The standard(s) say that the char data type may be signed or unsigned and is an implementation decision. This means that some compilers or versions of compilers, can implement char differently. The implication is that the char data type is not conducive for arithmetic or Boolean operations. For arithmetic and Boolean operations, signed and unsigned versions of char will work fine.
In summary, there are 3 versions of char data type. The char data type performs well for holding characters, but is not suited for arithmetic across platforms and translators since it's signedness is implementation defined.
You will fail miserably when compiling for multiple platforms because the C++ standard doesn't define char to be of a certain "signedness".
Therefore GCC introduces -fsigned-char and -funsigned-char options to force certain behavior. More on that topic can be found here, for example.
EDIT:
As you asked for examples of broken code, there are plenty of possibilities to break code that processes binary data. For example, image you process 8-bit audio samples (range -128 to 127) and you want to halven the volume. Now imagine this scenario (in which the naive programmer assumes char == signed char):
char sampleIn;
// If the sample is -1 (= almost silent), and the compiler treats char as unsigned,
// then the value of 'sampleIn' will be 255
read_one_byte_sample(&sampleIn);
// Ok, halven the volume. The value will be 127!
char sampleOut = sampleOut / 2;
// And write the processed sample to the output file, for example.
// (unsigned char)127 has the exact same bit pattern as (signed char)127,
// so this will write a sample with the loudest volume!!
write_one_byte_sample_to_output_file(&sampleOut);
I hope you like that example ;-) But to be honest I've never really came across such problems, not even as a beginner as far as I can remember...
Hope this answer is sufficient for you downvoters. What about a short comment?
Sign extension. The first version of my URL encoding function produced strings like "%FFFFFFA3".

Is it safe to use -1 to set all bits to true?

I've seen this pattern used a lot in C & C++.
unsigned int flags = -1; // all bits are true
Is this a good portable way to accomplish this? Or is using 0xffffffff or ~0 better?
I recommend you to do it exactly as you have shown, since it is the most straight forward one. Initialize to -1 which will work always, independent of the actual sign representation, while ~ will sometimes have surprising behavior because you will have to have the right operand type. Only then you will get the most high value of an unsigned type.
For an example of a possible surprise, consider this one:
unsigned long a = ~0u;
It won't necessarily store a pattern with all bits 1 into a. But it will first create a pattern with all bits 1 in an unsigned int, and then assign it to a. What happens when unsigned long has more bits is that not all of those are 1.
And consider this one, which will fail on a non-two's complement representation:
unsigned int a = ~0; // Should have done ~0u !
The reason for that is that ~0 has to invert all bits. Inverting that will yield -1 on a two's complement machine (which is the value we need!), but will not yield -1 on another representation. On a one's complement machine, it yields zero. Thus, on a one's complement machine, the above will initialize a to zero.
The thing you should understand is that it's all about values - not bits. The variable is initialized with a value. If in the initializer you modify the bits of the variable used for initialization, the value will be generated according to those bits. The value you need, to initialize a to the highest possible value, is -1 or UINT_MAX. The second will depend on the type of a - you will need to use ULONG_MAX for an unsigned long. However, the first will not depend on its type, and it's a nice way of getting the highest value.
We are not talking about whether -1 has all bits one (it doesn't always have). And we're not talking about whether ~0 has all bits one (it has, of course).
But what we are talking about is what the result of the initialized flags variable is. And for it, only -1 will work with every type and machine.
unsigned int flags = -1; is portable.
unsigned int flags = ~0; isn't portable because it
relies on a two's-complement representation.
unsigned int flags = 0xffffffff; isn't portable because
it assumes 32-bit ints.
If you want to set all bits in a way guaranteed by the C standard, use the first one.
Frankly I think all fff's is more readable. As to the comment that its an antipattern, if you really care that all the bits are set/cleared, I would argue that you are probably in a situation where you care about the size of the variable anyway, which would call for something like boost::uint16_t, etc.
A way which avoids the problems mentioned is to simply do:
unsigned int flags = 0;
flags = ~flags;
Portable and to the point.
I am not sure using an unsigned int for flags is a good idea in the first place in C++. What about bitset and the like?
std::numeric_limit<unsigned int>::max() is better because 0xffffffff assumes that unsigned int is a 32-bit integer.
unsigned int flags = -1; // all bits are true
"Is this a good[,] portable way to accomplish this?"
Portable? Yes.
Good? Debatable, as evidenced by all the confusion shown on this thread. Being clear enough that your fellow programmers can understand the code without confusion should be one of the dimensions we measure for good code.
Also, this method is prone to compiler warnings. To elide the warning without crippling your compiler, you'd need an explicit cast. For example,
unsigned int flags = static_cast<unsigned int>(-1);
The explicit cast requires that you pay attention to the target type. If you're paying attention to the target type, then you'll naturally avoid the pitfalls of the other approaches.
My advice would be to pay attention to the target type and make sure there are no implicit conversions. For example:
unsigned int flags1 = UINT_MAX;
unsigned int flags2 = ~static_cast<unsigned int>(0);
unsigned long flags3 = ULONG_MAX;
unsigned long flags4 = ~static_cast<unsigned long>(0);
All of which are correct and more obvious to your fellow programmers.
And with C++11: We can use auto to make any of these even simpler:
auto flags1 = UINT_MAX;
auto flags2 = ~static_cast<unsigned int>(0);
auto flags3 = ULONG_MAX;
auto flags4 = ~static_cast<unsigned long>(0);
I consider correct and obvious better than simply correct.
Converting -1 into any unsigned type is guaranteed by the standard to result in all-ones. Use of ~0U is generally bad since 0 has type unsigned int and will not fill all the bits of a larger unsigned type, unless you explicitly write something like ~0ULL. On sane systems, ~0 should be identical to -1, but since the standard allows ones-complement and sign/magnitude representations, strictly speaking it's not portable.
Of course it's always okay to write out 0xffffffff if you know you need exactly 32 bits, but -1 has the advantage that it will work in any context even when you do not know the size of the type, such as macros that work on multiple types, or if the size of the type varies by implementation. If you do know the type, another safe way to get all-ones is the limit macros UINT_MAX, ULONG_MAX, ULLONG_MAX, etc.
Personally I always use -1. It always works and you don't have to think about it.
As long as you have #include <limits.h> as one of your includes, you should just use
unsigned int flags = UINT_MAX;
If you want a long's worth of bits, you could use
unsigned long flags = ULONG_MAX;
These values are guaranteed to have all the value bits of the result set to 1, regardless of how signed integers are implemented.
Yes. As mentioned in other answers, -1 is the most portable; however, it is not very semantic and triggers compiler warnings.
To solve these issues, try this simple helper:
static const struct All1s
{
template<typename UnsignedType>
inline operator UnsignedType(void) const
{
static_assert(std::is_unsigned<UnsignedType>::value, "This is designed only for unsigned types");
return static_cast<UnsignedType>(-1);
}
} ALL_BITS_TRUE;
Usage:
unsigned a = ALL_BITS_TRUE;
uint8_t b = ALL_BITS_TRUE;
uint16_t c = ALL_BITS_TRUE;
uint32_t d = ALL_BITS_TRUE;
uint64_t e = ALL_BITS_TRUE;
On Intel's IA-32 processors it is OK to write 0xFFFFFFFF to a 64-bit register and get the expected results. This is because IA32e (the 64-bit extension to IA32) only supports 32-bit immediates. In 64-bit instructions 32-bit immediates are sign-extended to 64-bits.
The following is illegal:
mov rax, 0ffffffffffffffffh
The following puts 64 1s in RAX:
mov rax, 0ffffffffh
Just for completeness, the following puts 32 1s in the lower part of RAX (aka EAX):
mov eax, 0ffffffffh
And in fact I've had programs fail when I wanted to write 0xffffffff to a 64-bit variable and I got a 0xffffffffffffffff instead. In C this would be:
uint64_t x;
x = UINT64_C(0xffffffff)
printf("x is %"PRIx64"\n", x);
the result is:
x is 0xffffffffffffffff
I thought to post this as a comment to all the answers that said that 0xFFFFFFFF assumes 32 bits, but so many people answered it I figured I'd add it as a separate answer.
See litb's answer for a very clear explanation of the issues.
My disagreement is that, very strictly speaking, there are no guarantees for either case. I don't know of any architecture that does not represent an unsigned value of 'one less than two to the power of the number of bits' as all bits set, but here is what the Standard actually says (3.9.1/7 plus note 44):
The representations of integral types shall define values by use of a pure binary numeration system. [Note 44:]A positional representation for integers that uses the binary digits 0 and 1, in which the values represented by successive bits are additive, begin with 1, and are multiplied by successive integral power of 2, except perhaps for the bit with the highest position.
That leaves the possibility for one of the bits to be anything at all.
I would not do the -1 thing. It's rather non-intuitive (to me at least). Assigning signed data to an unsigned variable just seems to be a violation of the natural order of things.
In your situation, I always use 0xFFFF. (Use the right number of Fs for the variable size of course.)
[BTW, I very rarely see the -1 trick done in real-world code.]
Additionally, if you really care about the individual bits in a vairable, it would be good idea to start using the fixed-width uint8_t, uint16_t, uint32_t types.
Although the 0xFFFF (or 0xFFFFFFFF, etc.) may be easier to read, it can break portability in code which would otherwise be portable. Consider, for example, a library routine to count how many items in a data structure have certain bits set (the exact bits being specified by the caller). The routine may be totally agnostic as to what the bits represent, but still need to have an "all bits set" constant. In such a case, -1 will be vastly better than a hex constant since it will work with any bit size.
The other possibility, if a typedef value is used for the bitmask, would be to use ~(bitMaskType)0; if bitmask happens to only be a 16-bit type, that expression will only have 16 bits set (even if 'int' would otherwise be 32 bits) but since 16 bits will be all that are required, things should be fine provided that one actually uses the appropriate type in the typecast.
Incidentally, expressions of the form longvar &= ~[hex_constant] have a nasty gotcha if the hex constant is too large to fit in an int, but will fit in an unsigned int. If an int is 16 bits, then longvar &= ~0x4000; or longvar &= ~0x10000; will clear one bit of longvar, but longvar &= ~0x8000; will clear out bit 15 and all bits above that. Values which fit in int will have the complement operator applied to a type int, but the result will be sign extended to long, setting the upper bits. Values which are too big for unsigned int will have the complement operator applied to type long. Values which are between those sizes, however, will apply the complement operator to type unsigned int, which will then be converted to type long without sign extension.
As others have mentioned, -1 is the correct way to create an integer that will convert to an unsigned type with all bits set to 1. However, the most important thing in C++ is using correct types. Therefore, the correct answer to your problem (which includes the answer to the question you asked) is this:
std::bitset<32> const flags(-1);
This will always contain the exact amount of bits you need. It constructs a std::bitset with all bits set to 1 for the same reasons mentioned in other answers.
It is certainly safe, as -1 will always have all available bits set, but I like ~0 better. -1 just doesn't make much sense for an unsigned int. 0xFF... is not good because it depends on the width of the type.
Practically: Yes
Theoretically: No.
-1 = 0xFFFFFFFF (or whatever size an int is on your platform) is only true with two's complement arithmetic. In practice, it will work, but there are legacy machines out there (IBM mainframes, etc.) where you've got an actual sign bit rather than a two's complement representation. Your proposed ~0 solution should work everywhere.
I say:
int x;
memset(&x, 0xFF, sizeof(int));
This will always give you the desired result.
Leveraging on the fact that assigning all bits to one for an unsigned type is equivalent to taking the maximum possible value for the given type,
and extending the scope of the question to all unsigned integer types:
Assigning -1 works for any unsigned integer type (unsigned int, uint8_t, uint16_t, etc.) for both C and C++.
As an alternative, for C++, you can either:
Include <limits> and use std::numeric_limits< your_type >::max()
Write a custom templated function (This would also allow some sanity check, i.e. if the destination type is really an unsigned type)
The purpose could be add more clarity, as assigning -1 would always need some explanatory comment.
A way to make the meaning bit more obvious and yet to avoid repeating the type:
const auto flags = static_cast<unsigned int>(-1);
An additional effort to emphasize, why Adrian McCarthy's approach here might be the best solution at latest since C++11 in terms of a compromise between standard conformity, type safety/explicit clearness and reduction of possible ambiguities:
unsigned int flagsPreCpp11 = ~static_cast<unsigned int>(0);
auto flags = ~static_cast<unsigned int>(0); // C++11 initialization
predeclaredflags = ~static_cast<decltype(predeclaredflags)>(0); // C++11 assignment to already declared variable
I'm going to explain my preference in detail below. As Johannes mentioned totally correctly, the fundamental origin of irritations here is the question about value vs. according bit representation semantics and about what types we're talking about exactly (the assigned value type vs. the possible compile time integral constant's type). Since there's no standard built-in mechanism to explicitly ensure the set of all bits to 1 for the concrete use case of the OP about unsigned integer values, it's obvious, that it's impossible to be fully independent of value semantics here (std::bitset is a common pure bit-layer refering container but the question was about unsigned integers in general). But we might be able to reduce ambiguity here.
Comparison of the 'better' standard compliant approaches:
The OP's way:
unsigned int flags = -1;
PROs:
is "established" and short
is quite intuitive in terms of modulo perspective of value to "natural" bit value representation
changing the target unsigned type to unsigned long for instance is possible without any further adaptions
CONs:
At least beginners might not be sure about the standard conformity ("Do I have to concern about padding bits?").
Violates type ranges (in the heavier way: signed vs. unsigned).
Solely from the code, you do not directly see any bit semantics association.
Refering to maximum values via defines:
unsigned int flags = UINT_MAX;
This circumvents the signed unsigned transition issue of the -1 approach but introduces several new problems: In doubt, one has to look twice here again, at the latest if you want to change the target type to unsigned long for instance. And here, one has to be sure about the fact, that the maximum value leads to all bits set to 1 by the standard (and padding bit concerns again). Bit semantics are also not obvious here directly from the code solely again.
Refering to maximum values more explicitly:
auto flags = std::numeric_limits<unsigned int>::max();
On my opinion, that's the better maximum value approach since it's macro/define free and one is explicit about the involved type. But all other concerns about the approach type itself remain.
Adrian's approach (and why I think, it's the preferred one before C++11 and since):
unsigned int flagsPreCpp11 = ~static_cast<unsigned int>(0);
auto flagsCpp11 = ~static_cast<unsigned int>(0);
PROs:
Only the simplest integral compile time constant is used: 0. So no worries about further bit representation or (implicit) casts are justified. From an intuitive point of view, I think we all can agree on the fact, that the bit representation for zero is commonly clearer than for maximum values, not only for unsigned integrals.
No type ambiguities are involved, no further look-ups required in doubt.
Explicit bit semantics are involved here via the complement ~. So it's quite clear from the code, what the intention was. And it's also very explicit, on which type and type range, the complement is applied.
CONs:
If assigned to a member for instance, there's a small chance that you mismatch types with pre C++11:
Declaration in class:
unsigned long m_flags;
Initialization in constructor:
m_flags(~static_cast<unsigned int>(0))
But since C++11, the usage of decltype + auto is powerful to prevent most of these possible issues. And some of these type mismatch scenarios (on interface boundaries for instance) are also possible for the -1 approach.
Robust final C++11 approach for pre-declared variables:
m_flags(~static_cast<decltype(m_flags)>(0)) // member initialization case
So with a full view on the weighting of the PROs and CONs of all approaches here, I recommend this one as the preferred approach, at latest since C++11.
Update: Thanks to a hint by Andrew Henle, I removed the statement about its readability since that might be a too subjective statement. But I still think, its readability is at least not that worse than most of the maximum value approaches or the ones with explicit maximum value provision via compile time integrals/literals since static_cast-usage is "established" too and built-in in contrast to defines/macros and even the std-lib.
yes the representation shown is very much correct as if we do it the other way round u will require an operator to reverse all the bits but in this case the logic is quite straightforward if we consider the size of the integers in the machine
for instance in most machines an integer is 2 bytes = 16 bits maximum value it can hold is 2^16-1=65535 2^16=65536
0%65536=0
-1%65536=65535 which corressponds to 1111.............1 and all the bits are set to 1 (if we consider residue classes mod 65536)
hence it is much straight forward.
I guess
no if u consider this notion it is perfectly dine for unsigned ints and it actually works out
just check the following program fragment
int main()
{
unsigned int a=2;
cout<<(unsigned int)pow(double(a),double(sizeof(a)*8));
unsigned int b=-1;
cout<<"\n"<<b;
getchar();
return 0;
}
answer for b = 4294967295 whcih is -1%2^32 on 4 byte integers
hence it is perfectly valid for unsigned integers
in case of any discrepancies plzz report