Why are signed negative numbers converted to unsigned numbers? - c++

In particular, I have a large unsigned number and a large negative signed number.
Why can't I divide a large unsigned number by a negative number in C++?
When I try to divide the two, I get the result zero, simply because something somewhere is converting my signed number to unsigned (meaning that it has to be positive). >.>;
Edit.
Just to be absolutely clear: I know that the division resolves to zero and I'm happy with that. I'm also happy with the earlier explanation that it resolves to zero because one operand is converted from a signed number to an unsigned number.
However that doesn't answer why the number is converted from a signed number to an unsigned number in the first place. There doesn't seem to be any logical reason for doing so.

The operands of the arithmetic operators need to have the same types. That makes the language easier to define and easier to implement, in part because there are fewer cases to handle, and in part because hardware doesn't generally support mixed-type operations either. Formally, C++ is defined in terms of an abstract machine, but the capabilities expected of that machine are influenced by those of real-world hardware. Operations dictated by the standard but not supported by hardware (such as your hypothetical signed-unsigned division) would need to be implemented by compiler-writers in library code, and that would affect performance.
Either the signed operand needs to be converted to unsigned, or vice versa. The standard committee picked the former.
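A minimal sketch of the behaviour being asked about, assuming 32-bit int and unsigned int (the values are invented for illustration):

#include <iostream>

int main() {
    unsigned int big_unsigned = 3000000000u;  // a large unsigned number
    int          big_negative = -1000000000;  // a large negative signed number

    // The usual arithmetic conversions turn big_negative into an unsigned value
    // before the division: -1000000000 wraps to 3294967296 (mod 2^32), which is
    // larger than 3000000000, so the quotient is 0.
    std::cout << big_unsigned / big_negative << '\n';  // prints 0

    // Widening the unsigned operand to a larger signed type keeps the sign:
    std::cout << static_cast<long long>(big_unsigned) / big_negative << '\n';  // prints -3
}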

Related

Is using an unsigned rather than signed int more likely to cause bugs? Why?

In the Google C++ Style Guide, on the topic of "Unsigned Integers", it is suggested that
Because of historical accident, the C++ standard also uses unsigned integers to represent the size of containers - many members of the standards body believe this to be a mistake, but it is effectively impossible to fix at this point. The fact that unsigned arithmetic doesn't model the behavior of a simple integer, but is instead defined by the standard to model modular arithmetic (wrapping around on overflow/underflow), means that a significant class of bugs cannot be diagnosed by the compiler.
What is wrong with modular arithmetic? Isn't that the expected behaviour of an unsigned int?
What kind of bugs (a significant class) does the guide refer to? Overflowing bugs?
Do not use an unsigned type merely to assert that a variable is non-negative.
One reason that I can think of for using signed int over unsigned int is that if it does overflow (to negative), it is easier to detect.
Some of the answers here mention the surprising promotion rules between signed and unsigned values, but that seems more like a problem relating to mixing signed and unsigned values, and doesn't necessarily explain why signed variables would be preferred over unsigned outside of mixing scenarios.
In my experience, outside of mixed comparisons and promotion rules, there are two primary reasons why unsigned values are bug magnets, as follows.
Unsigned values have a discontinuity at zero, the most common value in programming
Both unsigned and signed integers have discontinuities at their minimum and maximum values, where they wrap around (unsigned) or cause undefined behavior (signed). For unsigned these points are at zero and UINT_MAX. For int they are at INT_MIN and INT_MAX. Typical values of INT_MIN and INT_MAX on a system with 4-byte int values are -2^31 and 2^31-1, and on such a system UINT_MAX is typically 2^32-1.
The primary bug-inducing problem with unsigned that doesn't apply to int is that it has a discontinuity at zero. Zero, of course, is a very common value in programs, along with other small values like 1,2,3. It is common to add and subtract small values, especially 1, in various constructs, and if you subtract anything from an unsigned value and it happens to be zero, you just got a massive positive value and an almost certain bug.
Consider code that iterates over all values in a vector by index, except the last0.5:
for (size_t i = 0; i < v.size() - 1; i++) { /* do something */ }
This works fine until one day you pass in an empty vector. Instead of doing zero iterations, you get v.size() - 1 == a giant number1, and you'll do 4 billion iterations and all but guarantee a buffer overflow vulnerability.
You need to write it like this:
for (size_t i = 0; i + 1 < v.size(); i++) { /* do something */ }
So it can be "fixed" in this case, but only by carefully thinking about the unsigned nature of size_t. Sometimes you can't apply the fix above because instead of a constant one you have some variable offset you want to apply, which may be positive or negative: so which "side" of the comparison you need to put it on depends on the signedness - now the code gets really messy.
There is a similar issue with code that tries to iterate down to and including zero. Something like while (index-- > 0) works fine, but the apparently equivalent while (--index >= 0) will never terminate for an unsigned value. Your compiler might warn you when the right hand side is literal zero, but certainly not if it is a value determined at runtime.
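As a small illustrative sketch of those two loop shapes (not from the original answer, just an example):

#include <cstddef>
#include <iostream>
#include <vector>

int main() {
    std::vector<int> v{10, 20, 30};

    // Fine with an unsigned index: the test happens before the decrement,
    // so the body never sees a wrapped-around value.
    for (std::size_t i = v.size(); i-- > 0; ) {
        std::cout << v[i] << '\n';
    }

    // The apparently equivalent form below never terminates with an unsigned
    // index, because i >= 0 is always true (left commented out on purpose):
    // for (std::size_t i = v.size() - 1; i >= 0; --i) { ... }
}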
Counterpoint
Some might argue that signed values also have two discontinuities, so why pick on unsigned? The difference is that both discontinuities are very (maximally) far away from zero. I really consider this a separate problem of "overflow"; both signed and unsigned values may overflow at very large values. In many cases overflow is impossible due to constraints on the possible range of the values, and overflow of many 64-bit values may be physically impossible. Even if possible, the chance of an overflow-related bug is often minuscule compared to an "at zero" bug, and overflow occurs for unsigned values too. So unsigned combines the worst of both worlds: potential overflow with very large magnitude values, and a discontinuity at zero. Signed only has the former.
Many will argue "you lose a bit" with unsigned. This is often true - but not always (if you need to represent differences between unsigned values you'll lose that bit anyways: so many 32-bit things are limited to 2 GiB anyways, or you'll have a weird grey area where say a file can be 4 GiB, but you can't use certain APIs on the second 2 GiB half).
Even in the cases where unsigned buys you a bit: it doesn't buy you much: if you had to support more than 2 billion "things", you'll probably soon have to support more than 4 billion.
Logically, unsigned values are a subset of signed values
Mathematically, unsigned values (non-negative integers) are a subset of signed integers (just called integers)2. Yet signed values naturally pop out of operations solely on unsigned values, such as subtraction. We might say that unsigned values aren't closed under subtraction. The same isn't true of signed values.
Want to find the "delta" between two unsigned indexes into a file? Well you better do the subtraction in the right order, or else you'll get the wrong answer. Of course, you often need a runtime check to determine the right order! When dealing with unsigned values as numbers, you'll often find that (logically) signed values keep appearing anyways, so you might as well start of with signed.
Counterpoint
As mentioned in footnote (2) above, unsigned values in C++ aren't actually a subset of signed values of the same size, so unsigned values can represent the same number of results that signed values can.
True, but the range is less useful. Consider subtraction, and unsigned numbers with a range of 0 to 2N, and signed numbers with a range of -N to N. Arbitrary subtractions produce results in the range -2N to 2N in both cases, and either type of integer can only represent half of it. It turns out that the region centered around zero, -N to N, is usually way more useful (contains more actual results in real-world code) than the range 0 to 2N. Consider any typical distribution other than uniform (log, Zipfian, normal, whatever) and consider subtracting randomly selected values from that distribution: way more values end up in [-N, N] than in [0, 2N] (indeed, the resulting distribution is always centered at zero).
64-bit closes the door on many of the reasons to use unsigned values as numbers
I think the arguments above were already compelling for 32-bit values, but the overflow cases, which affect both signed and unsigned at different thresholds, do occur for 32-bit values, since "2 billion" is a number that can be exceeded by many abstract and physical quantities (billions of dollars, billions of nanoseconds, arrays with billions of elements). So if someone is convinced enough by the doubling of the positive range for unsigned values, they can make the case that overflow does matter and it slightly favors unsigned.
Outside of specialized domains 64-bit values largely remove this concern. Signed 64-bit values have an upper range of 9,223,372,036,854,775,807 - more than nine quintillion. That's a lot of nanoseconds (about 292 years worth), and a lot of money. It's also a larger array than any computer is likely to have RAM in a coherent address space for a long time. So maybe 9 quintillion is enough for everybody (for now)?
When to use unsigned values
Note that the style guide doesn't forbid or even necessarily discourage use of unsigned numbers. It concludes with:
Do not use an unsigned type merely to assert that a variable is non-negative.
Indeed, there are good uses for unsigned variables:
When you want to treat an N-bit quantity not as an integer, but simply a "bag of bits". For example, as a bitmask or bitmap, or N boolean values or whatever. This use often goes hand-in-hand with the fixed width types like uint32_t and uint64_t since you often want to know the exact size of the variable. A hint that a particular variable deserves this treatment is that you only operate on it with the bitwise operators such as ~, |, &, ^, >> and so on, and not with the arithmetic operations such as +, -, *, / etc.
Unsigned is ideal here because the behavior of the bitwise operators is well-defined and standardized. Signed values have several problems, such as undefined and unspecified behavior when shifting, and an unspecified representation.
When you actually want modular arithmetic. Sometimes you actually want 2^N modular arithmetic. In these cases "overflow" is a feature, not a bug. Unsigned values give you what you want here since they are defined to use modular arithmetic. Signed values cannot be (easily, efficiently) used at all since they have an unspecified representation and overflow is undefined.
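As a minimal sketch of the "overflow is a feature" case: a toy hash mixer whose constants are arbitrary (this is an illustration, not a real hash function):

#include <cstdint>
#include <string>

// The repeated xor-and-multiply is *supposed* to wrap modulo 2^64,
// which is exactly what unsigned arithmetic guarantees.
std::uint64_t toy_hash(const std::string& s) {
    std::uint64_t h = 0x9e3779b97f4a7c15ull;  // arbitrary odd seed constant
    for (unsigned char c : s) {
        h = (h ^ c) * 0x100000001b3ull;       // wraparound on overflow is intended
    }
    return h;
}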
0.5 After I wrote this I realized this is nearly identical to Jarod's example, which I hadn't seen - and for good reason, it's a good example!
1 We're talking about size_t here so usually 2^32-1 on a 32-bit system or 2^64-1 on a 64-bit one.
2 In C++ this isn't exactly the case because unsigned values contain more values at the upper end than the corresponding signed type, but the basic problem exists that manipulating unsigned values can result in (logically) signed values, but there is no corresponding issue with signed values (since signed values already include unsigned values).
As stated, mixing unsigned and signed might lead to unexpected behaviour (even if well defined).
Suppose you want to iterate over all elements of a vector except for the last five; you might wrongly write:
for (int i = 0; i < v.size() - 5; ++i) { foo(v[i]); } // Incorrect
// for (int i = 0; i + 5 < v.size(); ++i) { foo(v[i]); } // Correct
Suppose v.size() < 5; then, as v.size() is unsigned, v.size() - 5 would be a very large number, and so i < v.size() - 5 would be true for a far larger range of values of i than expected. UB then happens quickly (out-of-bounds access once i >= v.size()).
If v.size() returned a signed value, then v.size() - 5 would have been negative, and in the above case the condition would have been false immediately.
On the other hand, an index should be in the range [0; v.size()[, so unsigned makes sense there.
Signed also has its own issues, such as UB on overflow or implementation-defined behaviour for right-shifting a negative signed number, but these are a less frequent source of bugs in iteration.
One of the most hair-raising examples of an error is when you MIX signed and unsigned values:
#include <iostream>

int main() {
    auto qualifier = -1 < 1u ? "makes" : "does not make";
    std::cout << "The world " << qualifier << " sense" << std::endl;
}
The output:
The world does not make sense
Unless you have a trivial application, it's inevitable that you'll end up with either dangerous mixes between signed and unsigned values (resulting in runtime errors) or, if you crank up warnings and make them compile-time errors, a lot of static_casts in your code. That's why it's best to strictly use signed integers for math or logical comparison. Only use unsigned for bitmasks and types representing bits.
Modeling a type to be unsigned based on the expected domain of the values of your numbers is a Bad Idea. Most numbers are closer to 0 than they are to 2 billion, so with unsigned types, a lot of your values are closer to the edge of the valid range. To make things worse, the final value may be in a known positive range, but while evaluating expressions, intermediate values may underflow and if they are used in intermediate form may be VERY wrong values. Finally, even if your values are expected to always be positive, that doesn't mean that they won't interact with other variables that can be negative, and so you end up with a forced situation of mixing signed and unsigned types, which is the worst place to be.
Why is using an unsigned int more likely to cause bugs than using a signed int?
Using an unsigned type is not more likely to cause bugs than using a signed type with certain classes of tasks.
Use the right tool for the job.
What is wrong with modular arithmetic? Isn't that the expected behaviour of an unsigned int?
Why is using an unsigned int more likely to cause bugs than using a signed int?
If the task is well-matched: nothing wrong. No, not more likely.
Security, encryption, and authentication algorithms count on unsigned modular math.
Compression/decompression algorithms too as well as various graphic formats benefit and are less buggy with unsigned math.
Any time bit-wise operators and shifts are used, the unsigned operations do not get messed up with the sign-extension issues of signed math.
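A short sketch of the sign-extension point; it assumes a 32-bit int, and the cast result noted in the comments is the typical two's-complement outcome:

#include <cstdint>
#include <iostream>

int main() {
    std::uint32_t u = 0xFF000000u;
    std::int32_t  s = static_cast<std::int32_t>(0xFF000000u);  // typically a negative value

    // Unsigned right shift always brings in zero bits:
    std::cout << std::hex << (u >> 8) << '\n';                            // ff0000

    // Right-shifting a negative signed value was implementation-defined before
    // C++20 (arithmetic shift on typical hardware), so the sign bit is smeared in:
    std::cout << std::hex << static_cast<std::uint32_t>(s >> 8) << '\n';  // typically ffff0000
}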
Signed integer math has an intuitive look and feel readily understood by all, including learners to coding. C/C++ was not originally targeted as an intro language, nor should it be now. For rapid coding that employs safety nets concerning overflow, other languages are better suited. For lean, fast code, C assumes that coders know what they are doing (that they are experienced).
A pitfall of signed math today is the ubiquitous 32-bit int, which for so many problems is easily wide enough for the common tasks without range checking. This leads to complacency, so overflow is not coded against. Instead, for (int i=0; i < n; i++) and int len = strlen(s); are viewed as OK because n is assumed < INT_MAX and strings will never be too long, rather than being fully range-protected in the first case or using size_t, unsigned or even long long in the second.
C/C++ developed in an era that included 16-bit as well as 32-bit int, and the extra bit that an unsigned 16-bit size_t affords was significant. Attention was needed in regard to overflow issues, be it int or unsigned.
Google's 32-bit (or wider) applications on platforms that don't use 16-bit int/unsigned can afford a lack of attention to +/- overflow of int, given its ample range. This makes sense for such applications and encourages int over unsigned. Yet int math is not well protected.
The narrow 16-bit int/unsigned concerns apply today with select embedded applications.
Google's guidelines apply well for code they write today. It is not a definitive guideline for the larger wide scope range of C/C++ code.
One reason that I can think of using signed int over unsigned int, is that if it does overflow (to negative), it is easier to detect.
In C/C++, signed int math overflow is undefined behavior and so is not necessarily easier to detect than the defined behavior of unsigned math.
As @Chris Uzdavinis well commented, mixing signed and unsigned is best avoided by all (especially beginners) and otherwise coded carefully when needed.
I have some experience with Google's style guide, AKA the Hitchhiker's Guide to Insane Directives from Bad Programmers Who Got into the Company a Long Long Time Ago. This particular guideline is just one example of the dozens of nutty rules in that book.
Errors only occur with unsigned types if you try to do arithmetic with them (see Chris Uzdavinis's example above), in other words if you use them as numbers. Unsigned types are not intended to be used to store numeric quantities; they are intended to store counts, such as the sizes of containers, which can never be negative, and they can and should be used for that purpose.
The idea of using arithmetical types (like signed integers) to store container sizes is idiotic. Would you use a double to store the size of a list, too? That there are people at Google storing container sizes using arithmetical types and requiring others to do the same thing says something about the company. One thing I notice about such dictates is that the dumber they are, the more they need to be strict do-it-or-you-are-fired rules because otherwise people with common sense would ignore the rule.
Using unsigned types to represent non-negative values...
is more likely to cause bugs involving type promotion, when using signed and unsigned values, as other answer demonstrate and discuss in depth, but
is less likely to cause bugs involving the choice of types with domains capable of representing undesirable/disallowed values. In some places you'll assume the value is in the domain, and may get unexpected and potentially hazardous behavior when other values sneak in somehow.
The Google Coding Guidelines puts emphasis on the first kind of consideration. Other guideline sets, such as the C++ Core Guidelines, put more emphasis on the second point. For example, consider Core Guideline I.12:
I.12: Declare a pointer that must not be null as not_null
Reason
To help avoid dereferencing nullptr errors. To improve performance by avoiding redundant checks for nullptr.
Example
int length(const char* p); // it is not clear whether length(nullptr) is valid
length(nullptr); // OK?
int length(not_null<const char*> p); // better: we can assume that p cannot be nullptr
int length(const char* p); // we must assume that p can be nullptr
By stating the intent in source, implementers and tools can provide better diagnostics, such as finding some classes of errors through static analysis, and perform optimizations, such as removing branches and null tests.
Of course, you could argue for a non_negative wrapper for integers, which avoids both categories of errors, but that would have its own issues...
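For illustration only, a very rough sketch of what such a hypothetical non_negative wrapper might look like (not from any guideline or library; the name and behaviour are invented):

#include <stdexcept>

// Hypothetical: stores a plain int but rejects negative values at runtime.
// It dodges both unsigned wraparound and "impossible" negative states, at the
// cost of run-time checks and friction when mixing with raw ints.
class non_negative {
public:
    explicit non_negative(int value) : value_(check(value)) {}

    non_negative& operator-=(int delta) {
        value_ = check(value_ - delta);  // going below zero throws instead of wrapping
        return *this;
    }

    int get() const { return value_; }

private:
    static int check(int v) {
        if (v < 0) throw std::out_of_range("non_negative: value went negative");
        return v;
    }
    int value_;
};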
The google statement is about using unsigned as a size type for containers. In contrast, the question appears to be more general. Please keep that in mind, while you read on.
Since most answers so far reacted to the google statement, less so to the bigger question, I will start my answer about negative container sizes and subsequently try to convince anyone (hopeless, I know...) that unsigned is good.
Signed container sizes
Let's assume someone coded a bug which results in a negative container index. The result is either undefined behavior or an exception / access violation. Is that really better than getting undefined behavior or an exception / access violation when the index type is unsigned? I think not.
Now, there is a class of people who love to talk about mathematics and what is "natural" in this context. How can an integral type with negative number be natural to describe something, which is inherently >= 0? Using arrays with negative sizes much? IMHO, especially mathematically inclined people would find this mismatch of semantics (size/index type says negative is possible, while a negative sized array is hard to imagine) irritating.
So the only question remaining on this matter is whether, as stated in the google comment, a compiler could actually actively assist in finding such bugs, and do so even better than the alternative, which would be underflow-protected unsigned integers (x86-64 assembly and probably other architectures have means to achieve that; only C/C++ does not use those means). The only way I can fathom is if the compiler automatically added run-time checks (if (index < 0) throwOrWhatever) or, in the case of compile-time analysis, produced a lot of potentially false-positive warnings/errors of the form "The index for this array access could be negative." I have my doubts this would be helpful.
Also, for people who actually write runtime checks for their array/container indices, it is more work dealing with signed integers. Instead of writing if (index < container.size()) { ... } you now have to write if (index >= 0 && index < container.size()) { ... }. Looks like forced labor to me, not like an improvement...
Languages without unsigned types suck...
Yes, this is a stab at java. Now, I come from embedded programming background and we worked a lot with field buses, where binary operations (and,or,xor,...) and bit wise composition of values is literally the bread and butter. For one of our products, we - or rather a customer - wanted a java port... and I sat opposite to the luckily very competent guy who did the port (I refused...). He tried to stay composed... and suffer in silence... but the pain was there, he could not stop cursing after a few days of constantly dealing with signed integral values, which SHOULD be unsigned... Even writing unit tests for those scenarios is painful and me, personally I think java would have been better off if they had omitted signed integers and just offered unsigned... at least then, you do not have to care about sign extensions etc... and you can still interpret numbers as 2s complement.
Those are my 5 cents on the matter.

What does 'Natural Size' really mean in C++?

I understand that the 'natural size' is the width of integer that is processed most efficiently by a particular hardware. When using short in an array or in arithmetic operations, the short integer must first be converted into int.
Q: What exactly determines this 'natural size'?
I am not looking for simple answers such as
If it has a 32-bit architecture, its natural size is 32-bit
I want to understand why this is most efficient, and why a short must be converted before doing arithmetic operations on it.
Bonus Q: What happens when arithmetic operations are conducted on a long integer?
Generally speaking, each computer architecture is designed such that certain type sizes provide the most efficient numerical operations. The specific size then depends on the architecture, and the compiler will select an appropriate size. More detailed explanations as to why hardware designers selected certain sizes for particular hardware would be out of scope for Stack Overflow.
A short must be promoted to int before performing integral operations because that's the way it was in C, and C++ inherited that behavior with little or no reason to change it, possibly breaking existing code. I'm not sure of the reason it was originally added in C, but one could speculate that it's related to "default int", where if no type were specified, int was assumed by the compiler.
Bonus A: from 5/9 (expressions) we learn: Many binary operators that expect operands of arithmetic or enumeration type cause conversions and yield result types in a similar way. The purpose is to yield a common type, which is also the type of the result. This pattern is called the usual arithmetic conversions, which are defined as follows:
And then of interest specifically:
floating point rules that don't matter here
Otherwise, the integral promotions (4.5) shall be performed on both operands.
Then, if either operand is unsigned long the other shall be converted to unsigned long.
Otherwise, if one operand is a long int and the other unsigned int, then if a long int can represent all the values of an unsigned int, the unsigned int shall be converted to a long int; otherwise both operands shall be converted to unsigned long int.
Otherwise, if either operand is long, the other shall be converted to long.
In summary the compiler tries to use the "best" type it can to do binary operations, with int being the smallest size used.
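A small sketch of what those rules do in practice (needs C++17 for std::is_same_v; the variable names are illustrative):

#include <type_traits>

int main() {
    short a = 1, b = 2;
    unsigned int u = 3;
    long l = 4;

    // Integral promotion: short + short is computed (and typed) as int.
    static_assert(std::is_same_v<decltype(a + b), int>);

    // Mixing int and unsigned int: the int operand is converted to unsigned.
    static_assert(std::is_same_v<decltype(1 + u), unsigned int>);

    // Mixing long and int: the int operand is converted to long.
    static_assert(std::is_same_v<decltype(l + 1), long>);

    (void)a; (void)b; (void)u; (void)l;  // silence unused-variable warnings
}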
the 'natural size' is the width of integer that is processed most efficiently by a particular hardware.
Not really. Consider the x64 architecture. Arithmetic on any size from 8 to 64 bits will be essentially the same speed. So why have all x64 compilers settled on a 32-bit int? Well, because there was a lot of code out there which was originally written for 32-bit processors, and a lot of it implicitly relied on ints being 32-bits. And given the near-uselessness of a type which can represent values up to nine quintillion, the extra four bytes per integer would have been virtually unused. So we've decided that 32-bit ints are "natural" for this 64-bit platform.
Compare the 80286 architecture. Only 16 bits in a register. Performing 32-bit integer addition on such a platform basically requires splitting it into two 16-bit additions. Doing virtually anything with it involves splitting it up, really-- and an attendant slowdown. The 80286's "natural integer size" is most definitely not 32 bits.
So really, "natural" comes down to considerations like processing efficiency, memory usage, and programmer-friendliness. It is not an acid test. It is very much a matter of subjective judgment on the part of the architecture/compiler designer.
What exactly determines this 'natural size'?
For some processors (e.g. 32-bit ARM, and most DSP-style processors), it's determined by the architecture; the processor registers are a particular size, and arithmetic can only be done on values of that size.
Others (e.g. Intel x64) are more flexible, and there's no single "natural" size; it's up to the compiler designers to choose a size, a compromise between efficiency, range of values, and memory usage.
why this is most efficient
If the processor requires values to be a particular size for arithmetic, then choosing another size will force you to convert the values to the required size - probably for a cost.
why a short must be converted before doing arithmetic operations on it
Presumably, that was a good match for the behaviour of commonly-used processors when C was developed, half a century ago. C++ inherited the promotion rules from C. I can't really comment on exactly why it was deemed a good idea, since I wasn't born then.
What happens when arithmetic operations are conducted on a long integer?
If the processor registers are large enough to hold a long, then the arithmetic will be much the same as for int. Otherwise, the operations will have to be broken down into several operations on values split between multiple registers.
I understand that the 'natural size' is the width of integer that is processed most efficiently by a particular hardware.
That's an excellent start.
Q: What exactly determines this 'natural size'?
The paragraph above is the definition of "natural size". Nothing else determines it.
I want to understand why this is most efficient
By definition.
and why a short must be converted before doing arithmetic operations on it.
It is so because the C language definition says so. There are no deep architectural reasons (there could have been some when C was invented).
Bonus Q: What happens when arithmetic operations are conducted on a long integer?
A bunch of electrons rushes through dirty sand and meets a bunch of holes. (No, really. Ask a vague question...)

Why prefer signed over unsigned in C++? [closed]

I'd like to understand better why choose int over unsigned?
Personally, I've never liked signed values unless there is a valid reason for them. Things like the count of items in an array, the length of a string, or the size of a memory block cannot possibly be negative; a negative value there has no possible meaning. Why prefer int when it is misleading in all such cases?
I ask this because both Bjarne Stroustrup and Chandler Carruth gave the advice to prefer int over unsigned here (approx 12:30').
I can see the argument for using int over short or long - int is the "most natural" data width for the target machine architecture.
But signed over unsigned has always annoyed me. Are signed values genuinely faster on typical modern CPU architectures? What makes them better?
As per requests in comments: I prefer int instead of unsigned because...
it's shorter (I'm serious!)
it's more generic and more intuitive (i.e. I like to be able to assume that 1 - 2 is -1 and not some obscure huge number)
what if I want to signal an error by returning an out-of-range value?
Of course there are counter-arguments, but these are the principal reasons I like to declare my integers as int instead of unsigned. Of course, this is not always true, in other cases, an unsigned is just a better tool for a task, I am just answering the "why would anyone prefer defaulting to signed" question specifically.
Let me paraphrase the video, as the experts said it succinctly.
Andrei Alexandrescu:
No simple guideline.
In systems programming, we need integers of different sizes and signedness.
Many conversions and arcane rules govern arithmetic (like for auto), so we need to be careful.
Chandler Carruth:
Here's some simple guidelines:
Use signed integers unless you need two's complement arithmetic or a bit pattern
Use the smallest integer that will suffice.
Otherwise, use int if you think you could count the items, and a 64-bit integer if it's even more than you would want to count.
Stop worrying and use tools to tell you when you need a different type or size.
Bjarne Stroustrup:
Use int until you have a reason not to.
Use unsigned only for bit patterns.
Never mix signed and unsigned
Wariness about signedness rules aside, my one-sentence take away from the experts:
Use the appropriate type, and when you don't know, use an int until you do know.
Several reasons:
Arithmetic on unsigned always yields unsigned, which can be a problem when subtracting integer quantities that can reasonably result in a negative result — think subtracting money quantities to yield balance, or array indices to yield distance between elements. If the operands are unsigned, you get a perfectly defined, but almost certainly meaningless result, and a result < 0 comparison will always be false (of which modern compilers will fortunately warn you).
unsigned has the nasty property of contaminating the arithmetic where it gets mixed with signed integers. So, if you add a signed and unsigned and ask whether the result is greater than zero, you can get bitten, especially when the unsigned integral type is hidden behind a typedef.
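A minimal sketch of both points; the balance example is invented, and it assumes 32-bit int/unsigned:

#include <iostream>

int main() {
    unsigned int balance = 100;
    unsigned int withdrawal = 150;

    // Perfectly defined, but almost certainly not what was meant:
    unsigned int remaining = balance - withdrawal;  // wraps to 4294967246
    if (remaining < 0) {                            // always false for unsigned
        std::cout << "overdrawn\n";                 // never reached
    }
    std::cout << remaining << '\n';

    // The same subtraction done in a signed type gives the expected -50:
    int signed_remaining = static_cast<int>(balance) - static_cast<int>(withdrawal);
    std::cout << signed_remaining << '\n';
}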
There are no reasons to prefer signed over unsigned, aside from purely sociological ones, i.e. some people believe that average programmers are not competent and/or attentive enough to write proper code in terms of unsigned types. This is often the main reasoning used by various "speakers", regardless of how respected those speakers might be.
In reality, competent programmers quickly develop and/or learn the basic set of programming idioms and skills that allow them to write proper code in terms of unsigned integral types.
Note also that the fundamental differences between signed and unsigned semantics are always present (in superficially different form) in other parts of C and C++ language, like pointer arithmetic and iterator arithmetic. Which means that in general case the programmer does not really have the option of avoiding dealing with issues specific to unsigned semantics and the "problems" it brings with it. I.e. whether you want it or not, you have to learn to work with ranges that terminate abruptly at their left end and terminate right here (not somewhere in the distance), even if you adamantly avoid unsigned integers.
Also, as you probably know, many parts of standard library already rely on unsigned integer types quite heavily. Forcing signed arithmetic into the mix, instead of learning to work with unsigned one, will only result in disastrously bad code.
The only real reason to prefer signed in some contexts that comes to mind is that in mixed integer/floating-point code, signed integer formats are typically directly supported by the FPU instruction set, while unsigned formats are not supported at all, making the compiler generate extra code for conversions between floating-point values and unsigned values. In such code signed types might perform better.
But at the same time in purely integer code unsigned types might perform better than signed types. For example, integer division often requires additional corrective code in order to satisfy the requirements of the language spec. The correction is only necessary in case of negative operands, so it wastes CPU cycles in situations when negative operands are not really used.
In my practice I devotedly stick to unsigned wherever I can, and use signed only if I really have to.
The integral types in C and many languages which derive from it have two general usage cases: to represent numbers, or represent members of an abstract algebraic ring. For those unfamiliar with abstract algebra, the primary notion behind a ring is that adding, subtracting, or multiplying two items of a ring should yield another item of that ring--it shouldn't crash or yield a value outside the ring. On a 32-bit machine, adding unsigned 0x12345678 to unsigned 0xFFFFFFFF doesn't "overflow"--it simply yields the result 0x12345677 which is defined for the ring of integers congruent mod 2^32 (because the arithmetic result of adding 0x12345678 to 0xFFFFFFFF, i.e. 0x112345677, is congruent to 0x12345677 mod 2^32).
Conceptually, both purposes (representing numbers, or representing members of the ring of integers congruent mod 2^n) may be served by both signed and unsigned types, and many operations are the same for both usage cases, but there are some differences. Among other things, an attempt to add two numbers should not be expected to yield anything other than the correct arithmetic sum. While it's debatable whether a language should be required to generate the code necessary to guarantee that it won't (e.g. that an exception would be thrown instead), one could argue that for code which uses integral types to represent numbers such behavior would be preferable to yielding an arithmetically-incorrect value and compilers shouldn't be forbidden from behaving that way.
The implementers of the C standards decided to use signed integer types to represent numbers and unsigned types to represent members of the algebraic ring of integers congruent mod 2^n. By contrast, Java uses signed integers to represent members of such rings (though they're interpreted differently in some contexts; conversions among differently-sized signed types, for example, behave differently from among unsigned ones) and Java has neither unsigned integers nor any primitive integral types which behave as numbers in all non-exceptional cases.
If a language provided a choice of signed and unsigned representations for both numbers and algebraic-ring numbers, it might make sense to use unsigned numbers to represent quantities that will always be positive. If, however, the only unsigned types represent members of an algebraic ring, and the only types that represent numbers are the signed ones, then even if a value will always be positive it should be represented using a type designed to represent numbers.
Incidentally, the reason that (uint32_t)-1 is 0xFFFFFFFF stems from the fact that casting a signed value to unsigned is equivalent to adding unsigned zero, and adding an integer to an unsigned value is defined as adding or subtracting its magnitude to/from the unsigned value according to the rules of the algebraic ring which specify that if X=Y-Z, then X is the one and only member of that ring such X+Z=Y. In unsigned math, 0xFFFFFFFF is the only number which, when added to unsigned 1, yields unsigned zero.
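A tiny sketch of that last point, assuming the fixed-width types from <cstdint> and a platform with 32-bit int:

#include <cstdint>
#include <iostream>

int main() {
    std::uint32_t x = static_cast<std::uint32_t>(-1);
    std::cout << x << '\n';       // 4294967295, i.e. 0xFFFFFFFF

    // In the ring of integers mod 2^32, 0xFFFFFFFF is the unique value
    // that yields 0 when unsigned 1 is added to it:
    std::cout << x + 1u << '\n';  // 0
}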
Speed is the same on modern architectures. The problem with unsigned int is that it can sometimes generate unexpected behavior. This can create bugs that wouldn't show up otherwise.
Normally when you subtract 1 from a value, the value gets smaller. Now, with both signed and unsigned int variables, there will be a time that subtracting 1 creates a value that is MUCH LARGER. The key difference between unsigned int and int is that with unsigned int the value that generates the paradoxical result is a commonly used value --- 0 --- whereas with signed the number is safely far away from normal operations.
As far as returning -1 for an error value --- modern thinking is that it's better to throw an exception than to test for return values.
It's true that if you properly defend your code you won't have this problem, and if you use unsigned religiously everywhere you will be okay (provided that you are only adding and never subtracting, and that you never get near UINT_MAX). I use unsigned int everywhere. But it takes a lot of discipline. For a lot of programs, you can get by with using int and spend your time on other bugs.
Use int by default: it plays nicer with the rest of the language
most common domain usage is regular arithmetic, not modular arithmetic
int main() {} // see an unsigned?
auto i = 0; // i is of type int
Only use unsigned for modulo arithmetic and bit-twiddling (in particular shifting)
has different semantics than regular arithmetic, make sure it is what you want
bit-shifting signed types is subtle (see comments by @ChristianRau)
if you need a > 2Gb vector on a 32-bit machine, upgrade your OS / hardware
Never mix signed and unsigned arithmetic
the rules for that are complicated and surprising (either one can be converted to the other, depending on the relative type sizes)
turn on -Wconversion -Wsign-conversion -Wsign-promo (gcc is better than Clang here)
the Standard Library got it wrong with std::size_t (quote from the GN13 video)
use range-for if you can,
for(auto i = 0; i < static_cast<int>(v.size()); ++i) if you must
Don't use short or large types unless you actually need them
current architectures data flow caters well to 32-bit non-pointer data (but note the comment by @BenVoigt about cache effects for smaller types)
char and short save space but suffer from integral promotions
are you really going to count past what an int64_t can hold?
To answer the actual question: for the vast majority of things, it doesn't really matter. int can be a little easier to deal with for things like subtraction where the second operand is larger than the first, and you still get an "expected" result.
There is absolutely no speed difference in 99.9% of cases, because the ONLY instructions that are different for signed and unsigned numbers are:
Making the number longer (fill with the sign for signed or zero for unsigned) - it takes the same effort to do both.
Comparisons - a signed number, the processor has to take into account if either number is negative or not. But again, it's the same speed to make a compare with signed or unsigned numbers - it's just using a different instruction code to say "numbers that have the highest bit set are smaller than numbers with the highest bit not set" (essentially). [Pedantically, it's nearly always the operation using the RESULT of a comparison that is different - the most common case being a conditional jump or branch instruction - but either way, it's the same effort, just that the inputs are taken to mean slightly different things].
Multiply and divide. Obviously, the sign of the result needs to be handled if it's a signed multiplication, whereas an unsigned multiply should not change the sign of the result even if the highest bit of one of the inputs is set. And again, the effort is (as near as we care) identical.
(I think there are one or two other cases, but the result is the same - it really doesn't matter if it's signed or unsigned, the effort to perform the operation is the same for both).
The int type more closely resembles the behavior of mathematical integers than the unsigned type.
It is naive to prefer the unsigned type simply because a situation does not require negative values to be represented.
The problem is that the unsigned type has a discontinuous behavior right next to zero. Any operation that tries to compute a small negative value, instead produces some large positive value. (Worse: one that is implementation-defined.)
Algebraic relationships such as that a < b implies that a - b < 0 are wrecked in the unsigned domain, even for small values like a = 3 and b = 4.
A descending loop like for (i = max - 1; i >= 0; i--) fails to terminate if i is made unsigned.
Unsigned quirks can cause a problem which will affect code regardless of whether that code expects to be representing only positive quantities.
The virtue of the unsigned types is that certain operations that are not portably defined at the bit level for the signed types are portably defined for the unsigned types. The unsigned types lack a sign bit, and so shifting and masking through the sign bit isn't a problem. The unsigned types are good for bitmasks, and for code that implements precise arithmetic in a platform-independent way. Unsigned operations will simulate two's complement semantics even on a non-two's-complement machine. Writing a multi-precision (bignum) library practically requires arrays of unsigned types to be used for the representation, rather than signed types.
The unsigned types are also suitable in situations in which numbers behave like identifiers and not as arithmetic types. For instance, an IPv4 address can be represented in a 32 bit unsigned type. You wouldn't add together IPv4 addresses.
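A tiny sketch of that identifier use; the helper name and packing layout are invented for illustration:

#include <cstdint>

// Pack four dotted-quad octets into one 32-bit unsigned value.
// The result is an identifier: comparing for equality or masking out the
// subnet makes sense, while adding two addresses together does not.
constexpr std::uint32_t make_ipv4(std::uint8_t a, std::uint8_t b,
                                  std::uint8_t c, std::uint8_t d) {
    return (std::uint32_t(a) << 24) | (std::uint32_t(b) << 16) |
           (std::uint32_t(c) << 8)  |  std::uint32_t(d);
}

constexpr std::uint32_t net_mask = 0xFFFFFF00u;  // /24 netmask
static_assert((make_ipv4(192, 168, 1, 10) & net_mask) ==
              (make_ipv4(192, 168, 1, 99) & net_mask),
              "both hosts are on the 192.168.1.0/24 subnet");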
int is preferred because it's most commonly used. unsigned is usually associated with bit operations. Whenever I see an unsigned, I assume it's used for bit twiddling.
If you need a bigger range, use a 64-bit integer.
If you're iterating over stuff using indexes, types usually have size_type, and you shouldn't care whether it's signed or unsigned.
Speed is not an issue.
For me: beyond the integers in the range 0..+2,147,483,647 that both signed and unsigned can hold on 32-bit architectures, there is a higher probability that I will need to use -1 (or smaller) than that I will need to use +2,147,483,648 (or larger).
One good reason that I can think of is in case of detecting overflow.
For use cases such as the count of items in an array, the length of a string, or the size of a memory block, you can overflow an unsigned int and you may not notice a difference even when you take a look at the variable. If it is a signed int, the variable will be less than zero and clearly wrong.
You can simply check whether the variable is negative when you want to use it. This way, you do not have to check for overflow after every arithmetic operation, as is the case for unsigned ints.
It gives unexpected results when doing simple arithmetic operations:
unsigned int i;
i = 1 - 2;
// i is now 4294967295 (unsigned int wraps modulo 2^32)
It gives an unexpected result when doing a simple comparison:
unsigned int j = 1;
std::cout << (j>-1) << std::endl;
// outputs 0 (false), even though 1 is greater than -1
This is because when doing the operations above, the signed int is converted to unsigned, and it wraps around to a really big number.

Why does C++ have signed and unsigned

I'm learning C++ coming from doing JavaScript professionally for years. I know the differences between signed and unsigned, but I'm not sure I understand why they even exist. Is it for performance or something?
Because they're needed.
signed types exist because sometimes one wants to use negative numbers.
unsigned types exist because for some operations, e. g. bitwise logic and shifting, they're cleaner to use (e. g. you don't have to worry about how the sign bit is represented, etc.)
You can argue that with unsigned values you can obtain a larger positive range than with the equivalent signed types, but the main reason is that they allow mapping concepts in the domain where negative numbers make little or no sense. For example, the size of a container can never be negative. By using an unsigned type that becomes explicit in the type.
Then again, the existence of the two, and the implicit conversions pull a whole different set of problems...
It comes in handy for low level situations where you need to describe length which obviously can't be negative. File sizes for example.
They exist because the math needed to handle them can differ.
Consider multiplying two 32-bit values to get a 64-bit value. Depending upon the sign of the operands, the upper 32 bits can differ between a signed multiply and an unsigned multiply. Getting down to assembly, this is why there are separate IMUL and MUL instructions on x86.
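A small sketch of the differing upper bits, widening to 64-bit types before multiplying (the values are invented for illustration):

#include <cstdint>
#include <iostream>

int main() {
    std::int32_t  sa = -2, sb = 3;
    std::uint32_t ua = static_cast<std::uint32_t>(sa);  // 4294967294
    std::uint32_t ub = 3;

    // Widen first, then multiply, so the full 64-bit product is visible.
    std::int64_t  sprod = static_cast<std::int64_t>(sa) * sb;   // -6
    std::uint64_t uprod = static_cast<std::uint64_t>(ua) * ub;  // 12884901882

    // The low 32 bits agree (both are 0xFFFFFFFA); the upper 32 bits differ,
    // which is why the hardware needs separate signed and unsigned multiplies.
    std::cout << std::hex << static_cast<std::uint64_t>(sprod) << '\n';  // fffffffffffffffa
    std::cout << std::hex << uprod << '\n';                              // 2fffffffa
}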
C is designed to be a high-level language that is also able to do low-level things easily, such as directly inserting assembly language code into the source using the asm keyword. Assembly language works directly with the CPU registers, and the arithmetic instructions work with both signed and unsigned integers. The variable types are probably designed in such a way that values can be transferred between CPU registers and variables without any conversion at all. C++, which extended C, also inherits this feature.
After all, at the time when C/C++ was invented, most programmers were also hardware engineers.

Unsigned negative primitives?

In C++ we can make primitives unsigned. But they are always positive. Is there also a way to make unsigned negative variables? I know the word unsigned means "without sign", so also not a minus (-) sign. But I think C++ must provide it.
No. unsigned can only contain nonnegative numbers.
If you need a type that only represent negative numbers, you need to write a class yourself, or just interpret the value as negative in your program.
(But why do you need such a type?)
unsigned integers are only positive. From 3.9.1 paragraph 3 of the 2003 C++ standard:
The range of nonnegative values of a signed integer type is a subrange of the corresponding unsigned integer type, and the value representation of each corresponding signed/unsigned type shall be the same.
The main purpose of the unsigned integer types is to support modulo arithmetic. From 3.9.1 paragraph 4:
Unsigned integers, declared unsigned, shall obey the laws of arithmetic modulo 2^n where n is the number of bits in the value representation of that particular size of integer.
You are free, of course, to treat them as negative if you wish, you'll just have to keep track of that yourself somehow (perhaps with a Boolean flag).
I think you are thinking it the wrong way.
If you want a negative number, then you must not declare the variable as unsigned.
If your problem is the value range, and you want that one more bit, then you could use a "bigger" data type (a 64-bit integer, for example...).
Then again, if you are working with legacy code, creating a new struct can solve your problem, but that is specific to your situation; it isn't something C++ should handle for you.
Don't be fooled by the name: unsigned is often misunderstood as non-negative, but the rules for the language are different... probably a better name would have been "bitmask" or "modulo_integer".
If you think that unsigned means non-negative, then, for example, the implicit conversion rules are total nonsense (why should the difference between two non-negatives be non-negative? why should the sum of a non-negative and an integer be non-negative?).
It's very unfortunate that the C++ standard library itself fell into that misunderstanding, because for example vector.size() is unsigned (absurd if you read it as the language itself does, in terms of bitmask or modulo_integer). This choice for sizes has more to do with the old 16-bit times than with unsigned-ness, and it was in my opinion a terrible choice that we're still paying for in bugs.
But I think C++ must provide it.
Why? You think that C++ must provide the ability to make non-negative numbers negative? That's silly.
Unsigned values in C++ are defined to always be non-negative. They even wrap around — rather than underflowing — at zero! (And the same at the other end of the range)