Unsigned negative primitives? - c++

In C++ we can make primitives unsigned. But they are always positive. Is there also a way to make unsigned negative variables? I know the word unsigned means "without sign", so also not a minus (-) sign. But I think C++ must provide it.

No. unsigned can only contain nonnegative numbers.
If you need a type that only represents negative numbers, you need to write a class yourself, or just interpret the value as negative in your program.
(But why do you need such a type?)

Unsigned integers are never negative. From 3.9.1 paragraph 3 of the 2003 C++ standard:
The range of nonnegative values of a
signed integer type is a subrange of
the corresponding unsigned integer
type, and the value representation of
each corresponding signed/unsigned
type shall be the same.
The main purpose of the unsigned integer types is to support modulo arithmetic. From 3.9.1 paragraph 4:
Unsigned integers, declared unsigned,
shall obey the laws of arithmetic
modulo 2^n where n is the
number of bits in the value
representation of that particular size
of integer.
You are free, of course, to treat them as negative if you wish, you'll just have to keep track of that yourself somehow (perhaps with a Boolean flag).

I think you are approaching this the wrong way.
If you want a negative number, then you must not declare the variable as unsigned.
If your problem is the value range, and you want that one extra bit, then you could use a "bigger" data type (a 64-bit integer, for example).
If you are working with legacy code, creating a new struct can solve your problem, but that is a consequence of your specific situation; C++ shouldn't have to handle it.

Don't be fooled by the name: unsigned is often misunderstood as non-negative, but the rules for the language are different... probably a better name would have been "bitmask" or "modulo_integer".
If you think that unsigned means non-negative, then for example the implicit conversion rules are total nonsense (why should the difference between two non-negative values be non-negative? Why should the sum of a non-negative value and an integer be non-negative?).
It's very unfortunate that the C++ standard library itself fell into that misunderstanding: for example, vector.size() is unsigned (absurd if you read unsigned the way the language itself does, as a bitmask or modulo_integer). This choice for sizes has more to do with the old 16-bit days than with unsigned-ness, and it was in my opinion a terrible choice that we're still paying for in bugs.

But I think C++ must provide it.
Why? You think that C++ must provide the ability to make non-negative numbers negative? That's silly.
Unsigned values in C++ are defined to always be non-negative. They even wrap around — rather than underflowing — at zero! (And the same at the other end of the range)

Related

Is the using `int` is more preferably than the using of `unsigned int`? [duplicate]

Should one ever declare a variable as an unsigned int if they don't require the extra range of values? For example, when declaring the variable in a for loop, if you know it's not going to be negative, does it matter? Is one faster than the other? Is it bad to declare an unsigned int just as unsigned in C++?
To reiterate, should it be done even if the extra range is not required? I heard they should be avoided because they cause confusion (IIRC that's why Java doesn't have them).
The reason to use uints is that it gives the compiler a wider variety of optimizations. For example, it may replace an instance of 'abs(x)' with 'x' if it knows that x is nonnegative. It also opens up a variety of bitwise 'strength reductions' that only work for nonnegative numbers. If you always multiply or divide an int by a power of two, then the compiler may replace the operation with a bit shift (i.e. x*8 == x<<3, x/8 == x>>3), which tends to perform much faster. Unfortunately, for division this relation only holds if 'x' is nonnegative: signed division rounds toward zero, while an arithmetic right shift rounds toward negative infinity. With ints, the compiler may apply this trick if it can prove that the value is always nonnegative (or can be modified earlier in the code to be so). In the case of uints, this attribute is trivial to prove, which greatly increases the odds of it being applied.
Another example might be the equation y = 16 * x + 12. If x can be negative, then a multiply and add would be required. Yet if x is always positive, then not only can the x*16 term be replaced with x<<4, but since the term would always end with four zeros this opens up replacing the '+ 12' with a binary OR (as long as the '12' term is less than 16). The result would be y = (x<<4) | 12.
In general, the 'unsigned' qualifier gives the compiler more information about the variable, which in turn allows it to squeeze in more optimizations.
You should use unsigned integers when it doesn't make sense for them to have negative values. This is completely independent of the range issue. So yes, you should use unsigned integer types even if the extra range is not required, and no, you shouldn't use unsigned ints (or anything else) if not necessary, but you need to revise your definition of what is necessary.
More often than not, you should use unsigned integers.
They are more predictable in terms of undefined behavior on overflow and such.
This is a huge subject of its own, so I won't say much more about it.
It's a very good reason to avoid signed integers unless you actually need signed values.
Also, they are easier to work with when range-checking -- you don't have to check for negative values.
Typical rules of thumb:
If you are writing a forward for loop with an index as the control variable, you almost always want unsigned integers. In fact, you almost always want size_t.
If you're writing a reverse for loop with an index as the control variable, you should probably use signed integers, for obvious reasons. Probably ptrdiff_t would do.
The one thing to be careful with is when casting between signed and unsigned values of different sizes.
You probably want to double-check (or triple-check) to make sure the cast is working the way you expect.
int is the general purpose integer type. If you need an integer, and int meets your requirements (range [-32767,32767]), then use it.
If you have more specialized purposes, then you can choose something else. If you need an index into an array, then use size_t. If you need an index into a vector, then use std::vector<T>::size_type. If you need specific sizes, then pick something from <cstdint>. If you need something larger than 64 bits, then find a library like gmp.
I can't think of any good reasons to use unsigned int. At least, not directly (size_t and some of the specifically sized types from <cstdint> may be typedefs of unsigned int).
The problem with the systematic use of unsigned when values can't be negative isn't that Java doesn't have unsigned, it is that expressions with unsigned values, especially when mixed with signed one, give sometimes confusing results if you think about unsigned as an integer type with a shifted range. Unsigned is a modular type, not a restriction of integers to positive or zero.
Thus the traditional view is that unsigned should be used when you need a modular type or for bitwise manipulation. That view is implicit in K&R — look how int and unsigned are used —, and more explicit in TC++PL (2nd edition, p. 50):
The unsigned integer types are ideal for uses that treat storage as a bit array. Using an unsigned instead of an int to gain one more bit to represent positive integers is almost never a good idea. Attempts to ensure that some values are positive by declaring variables unsigned will typically be defeated by the implicit conversion rules.
In almost all architectures the cost of a signed operation and an unsigned operation is the same. So efficiency-wise you won't get any advantage from using unsigned over signed. But as you pointed out, if you use unsigned you will have a bigger range.
Even if you have variables that should only take non-negative values, unsigned can be a problem. Here is an example. Suppose a programmer is asked to write code that prints all pairs of integers (a,b) with 0 <= a < b <= n, where n is a given input. An incorrect attempt is
for (unsigned b = 0; b <= n; b++)
    for (unsigned a = 0; a <= b-1; a++)  // bug: when b == 0, b-1 wraps around to UINT_MAX
        cout << a << ',' << b << '\n';
This is easy to correct, but thinking with unsigned is a bit less natural than thinking with int.

Why prefer signed over unsigned in C++? [closed]

I'd like to understand better why choose int over unsigned?
Personally, I've never liked signed values unless there is a valid reason for them. e.g. count of items in an array, or length of a string, or size of memory block, etc., so often these things cannot possibly be negative. Such a value has no possible meaning. Why prefer int when it is misleading in all such cases?
I ask this because both Bjarne Stroustrup and Chandler Carruth gave the advice to prefer int over unsigned here (approx 12:30').
I can see the argument for using int over short or long - int is the "most natural" data width for the target machine architecture.
But signed over unsigned has always annoyed me. Are signed values genuinely faster on typical modern CPU architectures? What makes them better?
As per requests in comments: I prefer int instead of unsigned because...
it's shorter (I'm serious!)
it's more generic and more intuitive (i. e. I like to be able to assume that 1 - 2 is -1 and not some obscure huge number)
what if I want to signal an error by returning an out-of-range value?
Of course there are counter-arguments, but these are the principal reasons I like to declare my integers as int instead of unsigned. Of course, this is not always true, in other cases, an unsigned is just a better tool for a task, I am just answering the "why would anyone prefer defaulting to signed" question specifically.
Let me paraphrase the video, as the experts said it succinctly.
Andrei Alexandrescu:
No simple guideline.
In systems programming, we need integers of different sizes and signedness.
Many conversions and arcane rules govern arithmetic (like for auto), so we need to be careful.
Chandler Carruth:
Here's some simple guidelines:
Use signed integers unless you need two's complement arithmetic or a bit pattern
Use the smallest integer that will suffice.
Otherwise, use int if you think you could count the items, and a 64-bit integer if it's even more than you would want to count.
Stop worrying and use tools to tell you when you need a different type or size.
Bjarne Stroustrup:
Use int until you have a reason not to.
Use unsigned only for bit patterns.
Never mix signed and unsigned
Wariness about signedness rules aside, my one-sentence take away from the experts:
Use the appropriate type, and when you don't know, use an int until you do know.
Several reasons:
Arithmetic on unsigned always yields unsigned, which can be a problem when subtracting integer quantities that can reasonably result in a negative result — think subtracting money quantities to yield balance, or array indices to yield distance between elements. If the operands are unsigned, you get a perfectly defined, but almost certainly meaningless result, and a result < 0 comparison will always be false (of which modern compilers will fortunately warn you).
unsigned has the nasty property of contaminating the arithmetic where it gets mixed with signed integers. So, if you add a signed and unsigned and ask whether the result is greater than zero, you can get bitten, especially when the unsigned integral type is hidden behind a typedef.
There are no reasons to prefer signed over unsigned, aside from purely sociological ones, i.e. some people believe that average programmers are not competent and/or attentive enough to write proper code in terms of unsigned types. This is often the main reasoning used by various "speakers", regardless of how respected those speakers might be.
In reality, competent programmers quickly develop and/or learn the basic set of programming idioms and skills that allow them to write proper code in terms of unsigned integral types.
Note also that the fundamental differences between signed and unsigned semantics are always present (in superficially different form) in other parts of C and C++ language, like pointer arithmetic and iterator arithmetic. Which means that in general case the programmer does not really have the option of avoiding dealing with issues specific to unsigned semantics and the "problems" it brings with it. I.e. whether you want it or not, you have to learn to work with ranges that terminate abruptly at their left end and terminate right here (not somewhere in the distance), even if you adamantly avoid unsigned integers.
Also, as you probably know, many parts of standard library already rely on unsigned integer types quite heavily. Forcing signed arithmetic into the mix, instead of learning to work with unsigned one, will only result in disastrously bad code.
The only real reason to prefer signed in some contexts that comes to mind is that in mixed integer/floating-point code signed integer formats are typically directly supported by the FPU instruction set, while unsigned formats are not supported at all, forcing the compiler to generate extra code for conversions between floating-point values and unsigned values. In such code signed types might perform better.
But at the same time in purely integer code unsigned types might perform better than signed types. For example, integer division often requires additional corrective code in order to satisfy the requirements of the language spec. The correction is only necessary in case of negative operands, so it wastes CPU cycles in situations when negative operands are not really used.
In my practice I devotedly stick to unsigned wherever I can, and use signed only if I really have to.
The integral types in C and many languages which derive from it have two general usage cases: to represent numbers, or represent members of an abstract algebraic ring. For those unfamiliar with abstract algebra, the primary notion behind a ring is that adding, subtracting, or multiplying two items of a ring should yield another item of that ring--it shouldn't crash or yield a value outside the ring. On a 32-bit machine, adding unsigned 0x12345678 to unsigned 0xFFFFFFFF doesn't "overflow"--it simply yields the result 0x12345677 which is defined for the ring of integers congruent mod 2^32 (because the arithmetic result of adding 0x12345678 to 0xFFFFFFFF, i.e. 0x112345677, is congruent to 0x12345677 mod 2^32).
Conceptually, both purposes (representing numbers, or representing members of the ring of integers congruent mod 2^n) may be served by both signed and unsigned types, and many operations are the same for both usage cases, but there are some differences. Among other things, an attempt to add two numbers should not be expected to yield anything other than the correct arithmetic sum. While it's debatable whether a language should be required to generate the code necessary to guarantee that it won't (e.g. that an exception would be thrown instead), one could argue that for code which uses integral types to represent numbers such behavior would be preferable to yielding an arithmetically-incorrect value and compilers shouldn't be forbidden from behaving that way.
The implementers of the C standards decided to use signed integer types to represent numbers and unsigned types to represent members of the algebraic ring of integers congruent mod 2^n. By contrast, Java uses signed integers to represent members of such rings (though they're interpreted differently in some contexts; conversions among differently-sized signed types, for example, behave differently from among unsigned ones) and Java has neither unsigned integers nor any primitive integral types which behave as numbers in all non-exceptional cases.
If a language provided a choice of signed and unsigned representations for both numbers and algebraic-ring numbers, it might make sense to use unsigned numbers to represent quantities that will always be positive. If, however, the only unsigned types represent members of an algebraic ring, and the only types that represent numbers are the signed ones, then even if a value will always be positive it should be represented using a type designed to represent numbers.
Incidentally, the reason that (uint32_t)-1 is 0xFFFFFFFF stems from the fact that casting a signed value to unsigned is equivalent to adding unsigned zero, and adding an integer to an unsigned value is defined as adding or subtracting its magnitude to/from the unsigned value according to the rules of the algebraic ring, which specify that if X=Y-Z, then X is the one and only member of that ring such that X+Z=Y. In unsigned math, 0xFFFFFFFF is the only number which, when added to unsigned 1, yields unsigned zero.
Speed is the same on modern architectures. The problem with unsigned int is that it can sometimes generate unexpected behavior. This can create bugs that wouldn't show up otherwise.
Normally when you subtract 1 from a value, the value gets smaller. Now, with both signed and unsigned int variables, there will be a time that subtracting 1 creates a value that is MUCH LARGER. The key difference between unsigned int and int is that with unsigned int the value that generates the paradoxical result is a commonly used value --- 0 --- whereas with signed the number is safely far away from normal operations.
As far as returning -1 for an error value --- modern thinking is that it's better to throw an exception than to test for return values.
It's true that if you properly defend your code you won't have this problem, and if you use unsigned religiously everywhere you will be okay (provided that you are only adding, and never subtracting, and that you never get near UINT_MAX). I use unsigned int everywhere. But it takes a lot of discipline. For a lot of programs, you can get by with using int and spend your time on other bugs.
Use int by default: it plays nicer with the rest of the language
most common domain usage is regular arithmetic, not modular arithmetic
int main() {} // see an unsigned?
auto i = 0; // i is of type int
Only use unsigned for modulo arithmetic and bit-twiddling (in particular shifting)
has different semantics than regular arithmetic, make sure it is what you want
bit-shifting signed types is subtle (see comments by @ChristianRau)
if you need a > 2 GB vector on a 32-bit machine, upgrade your OS / hardware
Never mix signed and unsigned arithmetic
the rules for that are complicated and surprising (either one can be converted to the other, depending on the relative type sizes)
turn on -Wconversion -Wsign-conversion -Wsign-promo (gcc is better than Clang here)
the Standard Library got it wrong with std::size_t (quote from the GN13 video)
use range-for if you can,
for(auto i = 0; i < static_cast<int>(v.size()); ++i) if you must
Don't use short or large types unless you actually need them
the data flow of current architectures caters well to 32-bit non-pointer data (but note the comment by @BenVoigt about cache effects for smaller types)
char and short save space but suffer from integral promotions
are you really going to count to over all int64_t?
To answer the actual question: For the vast majority of things, it doesn't really matter. int can be a little easier to deal with in cases like subtraction with the second operand larger than the first, where you still get an "expected" result.
There is absolutely no speed difference in 99.9% of cases, because the ONLY instructions that are different for signed and unsigned numbers are:
Making the number longer (fill with the sign for signed or zero for unsigned) - it takes the same effort to do both.
Comparisons - for signed numbers, the processor has to take into account whether either number is negative. But again, it's the same speed to make a compare with signed or unsigned numbers - it's just using a different instruction code to say "numbers that have the highest bit set are smaller than numbers with the highest bit not set" (essentially). [Pedantically, it's nearly always the operation using the RESULT of a comparison that is different - the most common case being a conditional jump or branch instruction - but either way, it's the same effort, just that the inputs are taken to mean slightly different things].
Multiply and divide. Obviously, sign conversion of the result needs to happen if it's a signed multiplication, whereas an unsigned multiplication should not change the sign of the result if the highest bit of one of the inputs is set. And again, the effort is (as near as we care for) identical.
(I think there are one or two other cases, but the result is the same - it really doesn't matter if it's signed or unsigned, the effort to perform the operation is the same for both).
The int type more closely resembles the behavior of mathematical integers than the unsigned type.
It is naive to prefer the unsigned type simply because a situation does not require negative values to be represented.
The problem is that the unsigned type has a discontinuous behavior right next to zero. Any operation that tries to compute a small negative value, instead produces some large positive value. (Worse: one that is implementation-defined.)
Algebraic relationships such as that a < b implies that a - b < 0 are wrecked in the unsigned domain, even for small values like a = 3 and b = 4.
A descending loop like for (i = max - 1; i >= 0; i--) fails to terminate if i is made unsigned.
Unsigned quirks can cause a problem which will affect code regardless of whether that code expects to be representing only positive quantities.
The virtue of the unsigned types is that certain operations that are not portably defined at the bit level for the signed types are that way for the unsigned types. The unsigned types lack a sign bit, and so shifting and masking through the sign bit isn't a problem. The unsigned types are good for bitmasks, and for code that implements precise arithmetic in a platform-independent way. Unsigned operations will simulate two's complement semantics even on a non-two's-complement machine. Writing a multi-precision (bignum) library practically requires arrays of unsigned types to be used for the representation, rather than signed types.
The unsigned types are also suitable in situations in which numbers behave like identifiers and not as arithmetic types. For instance, an IPv4 address can be represented in a 32 bit unsigned type. You wouldn't add together IPv4 addresses.
int is preferred because it's most commonly used. unsigned is usually associated with bit operations. Whenever I see an unsigned, I assume it's used for bit twiddling.
If you need a bigger range, use a 64-bit integer.
If you're iterating over stuff using indexes, types usually have size_type, and you shouldn't care whether it's signed or unsigned.
Speed is not an issue.
For me, in addition to all the integers in the range of 0..+2,147,483,647 contained within the set of signed and unsigned integers on 32 bit architectures, there is a higher probability that I will need to use -1 (or smaller) than need to use +2,147,483,648 (or larger).
One good reason that I can think of is in case of detecting overflow.
For use cases such as the count of items in an array, the length of a string, or the size of a memory block, you can overflow an unsigned int and not notice anything wrong even when you look at the variable. If it is a signed int, the variable will typically end up less than zero and be clearly wrong (although signed overflow is formally undefined behavior, so this is not guaranteed).
You can simply check whether the variable is negative when you want to use it. This way, you do not have to check for overflow after every arithmetic operation, as is the case for unsigned ints.
It gives unexpected result when doing simple arithmetic operation:
unsigned int i;
i = 1 - 2;
//i is now 4294967295 when unsigned int is 32 bits wide
It gives unexpected result when doing simple comparison:
unsigned int j = 1;
std::cout << (j>-1) << std::endl;
//output 0 as false but 1 is greater than -1
This is because, in the operations above, the signed ints are converted to unsigned, so negative values wrap around to very large numbers.

Why is unsigned integer overflow defined behavior but signed integer overflow isn't?

Unsigned integer overflow is well defined by both the C and C++ standards. For example, the C99 standard (§6.2.5/9) states
A computation involving unsigned operands can never overflow,
because a result that cannot be represented by the resulting unsigned integer type is
reduced modulo the number that is one greater than the largest value that can be
represented by the resulting type.
However, both standards state that signed integer overflow is undefined behavior. Again, from the C99 standard (§3.4.3/1)
An example of undefined behavior is the behavior on integer overflow
Is there an historical or (even better!) a technical reason for this discrepancy?
The historical reason is that most C implementations (compilers) just used whatever overflow behaviour was easiest to implement with the integer representation it used. C implementations usually used the same representation used by the CPU - so the overflow behavior followed from the integer representation used by the CPU.
In practice, it is only the representations for signed values that may differ according to the implementation: one's complement, two's complement, sign-magnitude. For an unsigned type there is no reason for the standard to allow variation because there is only one obvious binary representation (the standard only allows binary representation).
Relevant quotes:
C99 6.2.6.1:3:
Values stored in unsigned bit-fields and objects of type unsigned char shall be represented using a pure binary notation.
C99 6.2.6.2:2:
If the sign bit is one, the value shall be modified in one of the following ways:
— the corresponding value with sign bit 0 is negated (sign and magnitude);
— the sign bit has the value −(2^N) (two’s complement);
— the sign bit has the value −(2^N − 1) (one’s complement).
Nowadays, all processors use two's complement representation, but signed arithmetic overflow remains undefined and compiler makers want it to remain undefined because they use this undefinedness to help with optimization. See for instance this blog post by Ian Lance Taylor or this complaint by Agner Fog, and the answers to his bug report.
Aside from Pascal's good answer (which I'm sure is the main motivation), it is also possible that some processors cause an exception on signed integer overflow, which of course would cause problems if the compiler had to "arrange for another behaviour" (e.g. use extra instructions to check for potential overflow and calculate differently in that case).
It is also worth noting that "undefined behaviour" doesn't mean "doesn't work". It means that the implementation is allowed to do whatever it likes in that situation. This includes doing "the right thing" as well as "calling the police" or "crashing". Most compilers, when possible, will choose "do the right thing", assuming that is relatively easy to define (in this case, it is). However, if you are having overflows in the calculations, it is important to understand what that actually results in, and that the compiler MAY do something other than what you expect (and that this may vary depending on compiler version, optimisation settings, etc).
First of all, please note that C11 3.4.3, like all examples and footnotes, is not normative text and therefore not relevant to cite!
The relevant text that states that overflow of integers and floats is undefined behavior is this:
C11 6.5/5
If an exceptional condition occurs during the evaluation of an
expression (that is, if the result is not mathematically defined or
not in the range of representable values for its type), the behavior
is undefined.
A clarification regarding the behavior of unsigned integer types specifically can be found here:
C11 6.2.5/9
The range of nonnegative values of a signed integer type is a subrange
of the corresponding unsigned integer type, and the representation of
the same value in each type is the same. A computation involving
unsigned operands can never overflow, because a result that cannot be
represented by the resulting unsigned integer type is reduced modulo
the number that is one greater than the largest value that can be
represented by the resulting type.
This makes unsigned integer types a special case.
Also note that there is an exception if any type is converted to a signed type and the old value can no longer be represented. The behavior is then merely implementation-defined, although a signal may be raised.
C11 6.3.1.3
6.3.1.3 Signed and unsigned integers
When a value with integer
type is converted to another integer type other than _Bool, if the
value can be represented by the new type, it is unchanged.
Otherwise, if the new type is unsigned, the value is converted by
repeatedly adding or subtracting one more than the maximum value that
can be represented in the new type until the value is in the range of
the new type.
Otherwise, the new type is signed and the value
cannot be represented in it; either the result is
implementation-defined or an implementation-defined signal is raised.
In addition to the other issues mentioned, having unsigned math wrap makes the unsigned integer types behave as abstract algebraic groups (meaning that, among other things, for any pair of values X and Y, there will exist some other value Z such that X+Z will, if properly cast, equal Y and Y-Z will, if properly cast, equal X). If unsigned values were merely storage-location types and not intermediate-expression types (e.g. if there were no unsigned equivalent of the largest integer type, and arithmetic operations on unsigned types behaved as though they were first converted to larger signed types), then there wouldn't be as much need for defined wrapping behavior, but it's difficult to do calculations in a type which doesn't have e.g. an additive inverse.
This helps in situations where wrap-around behavior is actually useful - for example with TCP sequence numbers or certain algorithms, such as hash calculation. It may also help in situations where it's necessary to detect overflow, since performing calculations and checking whether they overflowed is often easier than checking in advance whether they would overflow, especially if the calculations involve the largest available integer type.
Perhaps another reason for why unsigned arithmetic is defined is because unsigned numbers form integers modulo 2^n, where n is the width of the unsigned number. Unsigned numbers are simply integers represented using binary digits instead of decimal digits. Performing the standard operations in a modulus system is well understood.
The OP's quote refers to this fact, but also highlights the fact that there is only one, unambiguous, logical way to represent unsigned integers in binary. By contrast, signed numbers are most often represented using two's complement, but other choices are possible as described in the standard (section 6.2.6.2).
Two's complement representation allows certain operations to make more sense in binary format. E.g., incrementing negative numbers is the same as for positive numbers (except under overflow conditions). Some operations at the machine level can be the same for signed and unsigned numbers. However, when interpreting the result of those operations, some cases don't make sense - positive and negative overflow. Furthermore, the overflow results differ depending on the underlying signed representation.
The most technical reason of all is simply that trying to capture overflow in an unsigned integer requires more moving parts from you (exception handling) and the processor (exception throwing).
C and C++ won't make you pay for that unless you ask for it by using a signed integer. This isn't a hard-and-fast rule, as you'll see near the end, but just how they proceed for unsigned integers. In my opinion, this makes signed integers the odd one out, not unsigned, but it's fine that they offer this fundamental difference, as the programmer can still perform well-defined signed operations with overflow. But to do so, you must cast for it.
Because:
unsigned integers have well defined overflow and underflow
casts from signed -> unsigned int are well defined: [uint's name]_MAX + 1 is conceptually added to negative values, to map them to the extended positive number range
casts from unsigned -> signed int behave the same way on two's complement platforms: [uint's name]_MAX + 1 is conceptually deducted from positive values beyond the signed type's max, to map them to negative numbers (strictly speaking, this result is implementation-defined before C++20 and guaranteed from C++20 onward)
You can always perform arithmetic operations with well-defined overflow and underflow behavior, where signed integers are your starting point, albeit in a round-about way, by casting to unsigned integer first then back once finished.
int32_t x = 10;
int32_t y = -50;
// writes -60 into z, this is well defined
int32_t z = int32_t(uint32_t(y) - uint32_t(x));
Casts between signed and unsigned integer types of the same width are free if the CPU is using two's complement (nearly all do). If for some reason the platform you're targeting doesn't use two's complement for signed integers, you will pay a small conversion price when casting between uint32 and int32.
But be wary when using bit widths smaller than int.
Usually, if you are relying on unsigned overflow, you are using a smaller word width, 8-bit or 16-bit. These will promote to signed int at the drop of a hat (C has absolutely insane implicit integer conversion rules; this is one of C's biggest hidden gotchas). Consider:
unsigned char a = 0;
unsigned char b = 1;
printf("%i", a - b); // outputs -1, not 255 as you'd expect
To avoid this, you should always cast to the type you want when you are relying on that type's width, even in the middle of an operation where you think it's unnecessary. This will cast the temporary and get you the signedness AND truncate the value so you get what you expected. It's almost always free to cast, and in fact, your compiler might thank you for doing so as it can then optimize on your intentions more aggressively.
unsigned char a = 0;
unsigned char b = 1;
printf("%i", (unsigned char)(a - b)); // cast turns -1 to 255, outputs 255

C++ why does this not provide the system maximum size for integer?

So, if I understand correctly, an integer is a collection of bytes, it represents numbers in base-two format, if you will.
Therefore, if I have unsigned int test=0, it should really just consist of a field of bits, all of which are zero. However,
unsigned int test=0;
test=~test;
produces -1.
I would've thought that this would've filled all the bits with '1', making the integer as large as it can be on that system....
Thanks for any help!
How do you print the value?
Whether it's displayed as "-1" or as a large unsigned integer is just a matter of how the bits are interpreted when printing them out; the bits themselves don't know the difference.
You need to print it as an unsigned value.
Also, as pointed out by other answers, you're assuming a lot about how the system stores the numbers; there's no guarantee that there's a specific correlation between a number and the bits used to represent that number.
Anyway, the proper way to get this value is to #include <climits> and then just use UINT_MAX.
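As a quick sanity check, here is a small sketch showing that the complement trick from the question, UINT_MAX, and std::numeric_limits all agree for unsigned int. The function name max_via_complement is made up for illustration.

```cpp
#include <cassert>
#include <climits>
#include <limits>

// Hypothetical helper mirroring the code in the question: flip every bit of
// an unsigned zero. For an unsigned type this yields the largest value.
unsigned int max_via_complement() {
    unsigned int test = 0;
    test = ~test;
    assert(test == UINT_MAX);
    return test;
}

// The same value, obtained the recommended ways.
static_assert(std::numeric_limits<unsigned int>::max() == UINT_MAX,
              "numeric_limits and UINT_MAX describe the same maximum");
// Converting -1 to unsigned is defined to wrap modulo 2^n.
static_assert(static_cast<unsigned int>(-1) == UINT_MAX,
              "-1 converts to the largest unsigned value");
```

So the bits were all ones the whole time; the "-1" only comes from printing the value with a signed format.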
You're not understanding correctly. An integer represents an integer, and that's it. The specifics of the representation are not part of the standard (with a few exceptions), and you have no business assuming any correlation between bitwise operations and integer values.
(Ironically, what the standard does mandate via modular arithmetic rules is that -1 converted to an unsigned integer is in fact the largest possible value for that unsigned type.)
Update: To clarify, I'm speaking generally for all integral types. If you only use unsigned types (which I assumed you weren't because of your negative answer), you have a well-defined correspondence between bitwise operations and the represented value.
Alternatively you can use:
unsigned int test =0;
test--;

Should unsigned ints be used if not necessary?

Should one ever declare a variable as an unsigned int if they don't require the extra range of values? For example, when declaring the variable in a for loop, if you know it's not going to be negative, does it matter? Is one faster than the other? Is it bad to declare an unsigned int just as unsigned in C++?
To reiterate, should it be done even if the extra range is not required? I heard they should be avoided because they cause confusion (IIRC that's why Java doesn't have them).
The reason to use uints is that it gives the compiler a wider variety of optimizations. For example, it may replace an instance of 'abs(x)' with 'x' if it knows that x is non-negative. It also opens up a variety of bitwise 'strength reductions' that only work for non-negative numbers. If you always multiply or divide an int by a power of two, then the compiler may replace the operation with a bit shift (i.e. x*8 == x<<3), which tends to perform much faster. Unfortunately, this relation only holds in general if 'x' is non-negative, because negative numbers are encoded in a way that precludes it (for example, x/8 rounds toward zero, while an arithmetic right shift of a negative x rounds toward negative infinity). With ints, the compiler may apply this trick if it can prove that the value is always non-negative (or can be modified earlier in the code to be so). In the case of uints, this attribute is trivial to prove, which greatly increases the odds of it being applied.
Another example might be the equation y = 16 * x + 12. If x can be negative, then a multiply and add would be required. Yet if x is always positive, then not only can the x*16 term be replaced with x<<4, but since the term would always end with four zeros this opens up replacing the '+ 12' with a binary OR (as long as the '12' term is less than 16). The result would be y = (x<<4) | 12.
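For what it's worth, the equivalence is easy to check by hand. This is only a sketch with made-up function names; whether a given compiler actually performs the rewrite is something you'd have to confirm by looking at its output.

```cpp
#include <cassert>
#include <climits>

// The equation as written: a multiply followed by an add.
unsigned int scale_mul_add(unsigned int x) {
    return 16u * x + 12u;
}

// The strength-reduced form: x << 4 always ends in four zero bits, and 12
// fits in four bits, so OR-ing it in never produces a carry and gives the
// same result as adding it.
unsigned int scale_shift_or(unsigned int x) {
    return (x << 4) | 12u;
}
```

Because unsigned arithmetic wraps, the two functions agree for every x, including values large enough that 16 * x no longer fits.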
In general, the 'unsigned' qualifier gives the compiler more information about the variable, which in turn allows it to squeeze in more optimizations.
You should use unsigned integers when it doesn't make sense for them to have negative values. This is completely independent of the range issue. So yes, you should use unsigned integer types even if the extra range is not required, and no, you shouldn't use unsigned ints (or anything else) if not necessary, but you need to revise your definition of what is necessary.
More often than not, you should use unsigned integers.
They are more predictable on overflow and such: unsigned overflow is well defined (it wraps), whereas signed overflow is undefined behavior.
This is a huge subject of its own, so I won't say much more about it.
It's a very good reason to avoid signed integers unless you actually need signed values.
Also, they are easier to work with when range-checking -- you don't have to check for negative values.
Typical rules of thumb:
If you are writing a forward for loop with an index as the control variable, you almost always want unsigned integers. In fact, you almost always want size_t.
If you're writing a reverse for loop with an index as the control variable, you should probably use signed integers, for obvious reasons. Probably ptrdiff_t would do.
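In case the reasons aren't obvious: with an unsigned index, a condition like i >= 0 is always true, so the naive reverse loop never terminates. Here is a sketch of two ways around that. The function names are made up, and the second one is a common idiom rather than anything the rule of thumb above prescribes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Signed index: i can actually become -1, so the condition i >= 0 ends the loop.
std::vector<int> reversed_signed(const std::vector<int>& v) {
    std::vector<int> out;
    for (std::ptrdiff_t i = static_cast<std::ptrdiff_t>(v.size()) - 1; i >= 0; --i)
        out.push_back(v[static_cast<std::size_t>(i)]);
    return out;
}

// Unsigned index: test first, then decrement, so i never has to go below zero
// inside the loop body. Also works for an empty vector.
std::vector<int> reversed_unsigned(const std::vector<int>& v) {
    std::vector<int> out;
    for (std::size_t i = v.size(); i-- > 0; )
        out.push_back(v[i]);
    return out;
}
```

Both produce the same result; the signed version is easier to read, the unsigned one avoids the cast.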
The one thing to be careful with is when casting between signed and unsigned values of different sizes.
You probably want to double-check (or triple-check) to make sure the cast is working the way you expect.
int is the general purpose integer type. If you need an integer, and int meets your requirements (range [-32767,32767]), then use it.
If you have more specialized purposes, then you can choose something else. If you need an index into an array, then use size_t. If you need an index into a vector, then use std::vector<T>::size_type. If you need specific sizes, then pick something from <cstdint>. If you need something larger than 64 bits, then find a library like gmp.
I can't think of any good reasons to use unsigned int. At least, not directly (size_t and some of the specifically sized types from <cstdint> may be typedefs of unsigned int).
The problem with the systematic use of unsigned when values can't be negative isn't that Java doesn't have unsigned, it is that expressions with unsigned values, especially when mixed with signed one, give sometimes confusing results if you think about unsigned as an integer type with a shifted range. Unsigned is a modular type, not a restriction of integers to positive or zero.
Thus the traditional view is that unsigned should be used when you need a modular type or for bitwise manipulation. That view is implicit in K&R — look how int and unsigned are used —, and more explicit in TC++PL (2nd edition, p. 50):
The unsigned integer types are ideal for uses that treat storage as a bit array. Using an unsigned instead of an int to gain one more bit to represent positive integers is almost never a good idea. Attempts to ensure that some values are positive by declaring variables unsigned will typically be defeated by the implicit conversion rules.
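A small sketch of what "defeated by the implicit conversion rules" looks like in practice. The function names are made up for illustration, and the conversion that the compiler performs implicitly is written out explicitly here so the code compiles without sign-comparison warnings.

```cpp
#include <cassert>
#include <climits>

// What `a < b` means when a is int and b is unsigned int: a is converted to
// unsigned first. A negative a becomes a huge positive value.
bool less_after_conversion(int a, unsigned int b) {
    return static_cast<unsigned int>(a) < b;
}

// The difference of two "non-negative" values is itself unsigned, so a
// result that should be negative wraps around instead.
unsigned int difference(unsigned int a, unsigned int b) {
    return a - b;
}
```

With these, less_after_conversion(-1, 1u) is false and difference(2u, 3u) is UINT_MAX, neither of which is what you'd expect from "an int that just can't be negative".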
In almost all architectures the cost of a signed operation and an unsigned operation is the same. So efficiency-wise you won't get any advantage from using unsigned over signed. But as you pointed out, if you use unsigned you will have a bigger range.
Even if you have variables that should only take non-negative values, unsigned can be a problem. Here is an example. Suppose a programmer is asked to write code to print all pairs of integers (a,b) with 0 <= a < b <= n, where n is a given input. An incorrect version is
for (unsigned b = 0; b <= n; b++)
for (unsigned a = 0; a <= b - 1; a++) // bug: when b == 0, b - 1 wraps to UINT_MAX
cout << a << ',' << b << '\n';
This is easy to correct, but thinking with unsigned is a bit less natural than thinking with int.
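One possible correction of the loop above, as a sketch: compare with a < b instead of a <= b - 1, so nothing is ever subtracted from an unsigned zero. The function name pairs_up_to is made up, and the pairs are collected in a vector instead of printed so they are easy to inspect.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper: all pairs (a, b) with 0 <= a < b <= n.
// Assumes n is smaller than the maximum unsigned value; otherwise the outer
// condition b <= n could never become false.
std::vector<std::pair<unsigned, unsigned>> pairs_up_to(unsigned n) {
    std::vector<std::pair<unsigned, unsigned>> out;
    for (unsigned b = 1; b <= n; ++b)      // b == 0 has no valid a, so start at 1
        for (unsigned a = 0; a < b; ++a)   // no b - 1, so no wrap-around
            out.emplace_back(a, b);
    return out;
}
```

For n = 3 this yields (0,1), (0,2), (1,2), (0,3), (1,3), (2,3).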