I'm investigating a standard for my team around using size_t vs int (or long, etc). The biggest drawback I've seen pointed out is that taking the difference of two size_t objects can cause problems (I'm unsure of the specific problems -- maybe something isn't two's complement and the signed/unsigned mix angers the compiler). I wrote a quick program in C++ using the V120 VS2013 compiler that allowed me to do the following:
#include <iostream>
int main()
{
size_t a = 10;
size_t b = 100;
int result = a - b;
std::cout << result << std::endl; // prints -90
}
The program resulted in -90, which, although correct, makes me nervous about type mismatches, signed/unsigned problems, or just plain undefined behavior if size_t happens to get used in complex math.
My question is whether it's safe to do math with size_t objects, specifically taking the difference. I'm considering using size_t as a standard for things like indexes. I've seen some interesting posts on the topic here, but they don't address the math issue (or I missed it).
What type for subtracting 2 size_t's?
typedef for a signed type that can contain a size_t?
This is not guaranteed to work portably, but is not UB either. The code must run without error; it is only the resulting int value that is implementation-defined (when the wrapped size_t value does not fit into an int). So as long as you are working on platforms that guarantee the desired behavior, this is fine (and as long as the difference can be represented by an int, of course); otherwise, just use signed types everywhere (see the last paragraph).
Subtracting two std::size_ts will yield a new std::size_t† and its value will be determined by wrapping. In your example, assuming a 64-bit size_t, a - b will equal 18446744073709551526. This does not fit into a (commonly used 32-bit) int, so an implementation-defined value is assigned to result.
To be honest, I would recommend not using unsigned integers for anything but bit magic. Several members of the standard committee agree with me: https://channel9.msdn.com/Events/GoingNative/2013/Interactive-Panel-Ask-Us-Anything 9:50, 42:40, 1:02:50
Rule of thumb (paraphrasing Chandler Carruth from the above video): If you could count it yourself, use int, otherwise use std::int64_t.
†Unless its conversion rank is less than int, e.g. if std::size_t is unsigned short. In that case, the result is an int and everything will work fine (unless int is not wider than short). However:
1. I do not know of any platform that does this.
2. This would still be platform specific; see the first paragraph.
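For illustration, here is a minimal sketch of both behaviours described in this answer; the assumption of a 64-bit size_t and a 32-bit int is mine, and long long is just one possible signed type:
#include <cstddef>
#include <iostream>
int main()
{
    std::size_t a = 10;
    std::size_t b = 100;
    std::size_t wrapped = a - b;  // wraps to 18446744073709551526 with a 64-bit size_t
    long long diff = static_cast<long long>(a) - static_cast<long long>(b);  // -90, no wrap
    std::cout << wrapped << '\n' << diff << '\n';
}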
The size_t type is unsigned. The subtraction of any two size_t values is defined behavior.
However, firstly, the numeric result depends on the implementation-defined range of size_t if a larger value is subtracted from a smaller one. The result is the mathematical value, reduced to the smallest positive residue modulo SIZE_MAX + 1. For instance, if the largest value of size_t is 65535, and the result of subtracting two size_t values is -3, then the result will be 65536 - 3 = 65533. On a different compiler or machine with a different size_t, the numeric value will be different.
Secondly, a size_t value might be out of range of the type int. If that is the case, we get a second implementation-defined result arising from the forced conversion. In this situation, any behavior can apply; it just has to be documented by the implementation, and the conversion must not fail. For instance, the result could be clamped into the int range, producing INT_MAX. A common behavior seen on two's complement machines (virtually all) in the conversion of wider (or equal width) unsigned types to narrower signed types is simple bit truncation: enough bits are taken from the unsigned value to fill the signed value, including its sign bit.
Because of the way two's complement works, if the original arithmetically correct abstract result itself fits into int, then the conversion will produce that result.
For instance, suppose that the subtraction of a pair of 64-bit size_t values on a two's complement machine yields the abstract arithmetic value -3, which becomes the positive value 0xFFFFFFFFFFFFFFFD. When this is coerced into a 32-bit int, the common behavior seen in many compilers for two's complement machines is that the lower 32 bits are taken as the image of the resulting int: 0xFFFFFFFD. And, of course, that is just the value -3 in 32 bits.
So the upshot is that your code is de facto quite portable, because virtually all mainstream machines are two's complement with conversion rules based on sign extension and bit truncation, including between signed and unsigned.
Except that sign extension doesn't occur when a value is widened while converting from unsigned to signed. Thus the one problem is the rare situation in which int is wider than size_t. If a 16-bit size_t result is 65533, due to 4 being subtracted from 1, this will not produce a -3 when converted to a 32-bit int; it will produce 65533!
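A small sketch of that truncation behaviour; the assumption of a 64-bit size_t and a 32-bit two's complement int is mine, and the result shown is only what typical implementations produce:
#include <cstddef>
#include <iostream>
int main()
{
    std::size_t one = 1, four = 4;
    std::size_t wrapped = one - four;           // 0xFFFFFFFFFFFFFFFD with a 64-bit size_t
    int truncated = static_cast<int>(wrapped);  // commonly keeps the low 32 bits: 0xFFFFFFFD
    std::cout << truncated << '\n';             // prints -3 on typical two's complement machines
}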
If you don't use size_t, you are screwed: size_t is the one type that exists to be used for memory sizes, and which is consequently guaranteed to always be big enough for that purpose. (uintptr_t is quite similar, but it's neither the first such type, nor is it used by the standard libraries, nor is it available without including stdint.h.) If you use an int, you can get undefined behavior when your allocations exceed 2GiB of address space (or 32kiB if you are on a platform where int is only 16 bits!), even though the machine has more memory and you are executing in 64 bit mode.
If you need a difference of size_t values that may become negative, use the signed variant ssize_t (POSIX), or the standard std::ptrdiff_t.
Probably not, but I can't think of a good solution. I'm no expert in C++ yet.
Recently I've converted a lot of ints to unsigned ints in a project. Basically everything that should never be negative is made unsigned. This removed a lot of these warnings by MinGW:
warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
I love it. It makes the program more robust and the code more descriptive. However, there is one place where they still occur. It looks like this:
unsigned int subroutine_point_size = settings->get<unsigned int>("subroutine_point_size");
...
for(int dx = -subroutine_point_size;dx <= subroutine_point_size;dx++) //Fill pixels in the point's radius.
{
for(int dy = -subroutine_point_size;dy <= subroutine_point_size;dy++)
{
//Do something with dx and dy here.
}
}
In this case I can't make dx and dy unsigned. They start out negative and depend on comparing which is lesser or greater.
I don't like to make subroutine_point_size signed either, though this is the lesser evil. It indicates a size of a kernel in a pass over an image, and the kernel size can't be negative (it's probably unwise for a user ever to set this kernel size to anything more than 100 but the settings file allows for numbers up to 2^32 - 1).
So it seems there is no way to cast any of the variables to fix this. Is there a way to get rid of this warning and solve this neatly?
We're using C++11, compiling with GCC for Windows, Mac and various Unix distributions.
Cast the variables to a signed type wide enough to hold both the full range of unsigned int (0..2^32-1) and a sign, i.e. long long int (a plain long int is only guaranteed to be 32 bits, so it is not necessarily wide enough).
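A minimal sketch of that suggestion applied to the loop from the question; the value 3 is my own placeholder standing in for whatever settings->get<unsigned int>(...) returns:
#include <iostream>
int main()
{
    unsigned int subroutine_point_size = 3;  // stand-in for settings->get<unsigned int>(...)
    long long size = subroutine_point_size;  // signed, and wide enough for all of 0..2^32-1
    for (long long dx = -size; dx <= size; dx++)      // no signed/unsigned mixing, no warning
        for (long long dy = -size; dy <= size; dy++)
            std::cout << dx << ',' << dy << '\n';     // do something with dx and dy here
}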
You're making a big mistake.
Basically you like the name "unsigned" and you intend it to mean "not negative", but this is not the semantics associated with the type.
Consider the statement:
when adding a signed integer and an unsigned integer, the result is unsigned
Clearly it makes no sense if you consider the term "unsigned" as "not negative", yet this is what the language does: adding -3 to the unsigned value 2 you will get a huge nonsense number instead of the correct answer -1.
Indeed the choice of using an unsigned type for the size of containers is a design mistake of C++, a mistake that it is too late to fix now because of backward compatibility. By the way, the reason it happened has nothing to do with "non-negativeness", but just with the ability to use the 16th bit when computers were that small (i.e. being able to use 65535 elements instead of 32767). Even back then I don't think the price of the wrong semantics was worth the gain (and if 32767 is not enough now, then 65535 won't be enough for long anyway).
Do not repeat the same mistake in your programs... the name is irrelevant; what counts is the semantics, and for unsigned in C and C++ it is "member of the modulo ring Z_n with n = 2^k".
You don't want the size of a container to be a member of a modulo ring. Do you?
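A tiny illustration of that modulo-ring semantics; the printed value assumes a 32-bit unsigned int:
#include <iostream>
int main()
{
    unsigned int u = 2;
    int s = -3;
    auto result = u + s;          // s is converted to unsigned, so the sum is unsigned
    std::cout << result << '\n';  // prints 4294967295 with a 32-bit unsigned int, not -1
}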
Instead of the current
for(int dx = -subroutine_point_size;dx <= subroutine_point_size;dx++) //Fill pixels in the point's radius.
you can do this:
for(int dx = -int(subroutine_point_size);dx <= int(subroutine_point_size);dx++) //Fill pixels in the point's radius.
where the first int cast is (1) technically redundant, but is there for consistency, because the second cast removes the signed/unsigned warning that presumably is the issue here.
However, I strongly advise you to undo the work of converting signed to unsigned types everywhere. A good rule of thumb is to use signed types for numbers, and unsigned types for bit level stuff. That avoids the problems with wrap-around due to implicit conversions, where e.g. std::string("Bah").length() < -5 is guaranteed to be true (very silly), and because it does away with actual problems, it also reduces spurious warnings.
Note that you can just define a suitable name for the places where you want to indicate that some value will never be negative.
1) Technically redundant in practice, for two's complement representation of signed integers, with no trapping inserted by the compiler. As far as I know no extant C++ compiler behaves otherwise.
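As a minimal sketch of that "suitable name" idea (the alias name Size is my own invention, not something from the answer):
#include <iostream>
using Size = int;  // documents "never negative by convention" while keeping signed arithmetic
int main()
{
    Size subroutine_point_size = 3;
    for (Size dx = -subroutine_point_size; dx <= subroutine_point_size; dx++)
        std::cout << dx << '\n';  // -3 .. 3, and no signed/unsigned warning
}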
Firstly, without knowing the range of values that will be stored in the variables, your claim that changing signed to unsigned variables makes the program more robust is unsubstantiated - there are circumstances where that claim is false.
Secondly, the compiler is not issuing a warning only as a result of changing variables (and I assume calls of template functions like settings.get()) to be unsigned. It is warning about the fact that you have expressions involving both signed and unsigned variables. Compilers typically issue warnings about such expressions because - in practice - they are more likely to indicate a programming error or to potentially involve some behaviour that the programmer may not have anticipated (e.g. instances of undefined behaviour, expressions where a negative result is expected but a large positive result is what will occur, etc).
A rule of thumb is that, if you need to have expressions involving both signed and unsigned types, you are better off making all the relevant variables signed. While there are exceptions where that rule of thumb isn't needed, you wouldn't have asked this question if you understood how to decide that.
On that basis, I suggest the most appropriate action is to unwind your changes.
Say I have the following:
int i = 23;
float f = 3.14;
if (i == f) // do something
i will be promoted to a float and the two float numbers will be compared, but can a float represent all int values? Why not promote both the int and the float to a double?
When an int is converted to unsigned in the usual arithmetic conversions, negative values are also lost (which leads to such fun as 0u < -1 being true).
Like most mechanisms in C (that are inherited in C++), the usual arithmetic conversions should be understood in terms of hardware operations. The makers of C were very familiar with the assembly language of the machines with which they worked, and they wrote C to make immediate sense to themselves and people like themselves when writing things that would until then have been written in assembly (such as the UNIX kernel).
Now, processors, as a rule, do not have mixed-type instructions (add float to double, compare int to float, etc.) because it would be a huge waste of real estate on the wafer -- you'd have to implement many times more opcodes to support every combination of types. That you only have instructions for "add int to int," "compare float to float", "multiply unsigned with unsigned" etc. makes the usual arithmetic conversions necessary in the first place -- they are a mapping of two types to the instruction family that makes most sense to use with them.
From the point of view of someone who's used to writing low-level machine code, if you have mixed types, the assembler instructions you're most likely to consider in the general case are those that require the least conversions. This is particularly the case with floating point, where conversions are runtime-expensive; and back in the early 1970s, when C was developed, computers were slow and floating point calculations were done in software. This shows in the usual arithmetic conversions -- only one operand is ever converted (with the single exception of long/unsigned int, where the long may be converted to unsigned long, which does not require anything to be done on most machines; perhaps not on any where the exception applies).
So, the usual arithmetic conversions are written to do what an assembly coder would do most of the time: you have two types that don't fit, convert one to the other so that it does. This is what you'd do in assembler code unless you had a specific reason to do otherwise, and to people who are used to writing assembler code and do have a specific reason to force a different conversion, explicitly requesting that conversion is natural. After all, you can simply write
if((double) i < (double) f)
It is interesting to note in this context, by the way, that unsigned is higher in the hierarchy than int, so that comparing int with unsigned will end in an unsigned comparison (hence the 0u < -1 bit from the beginning). I suspect this to be an indicator that people in olden times considered unsigned less as a restriction on int than as an extension of its value range: We don't need the sign right now, so let's use the extra bit for a larger value range. You'd use it if you had reason to expect that an int would overflow -- a much bigger worry in a world of 16-bit ints.
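A short demonstration of that unsigned-wins rule; the explanatory comments are mine, and the comparison triggers the usual sign-compare warning, which is exactly the point:
#include <iostream>
int main()
{
    std::cout << (0u < -1) << '\n';  // prints 1: -1 is converted to unsigned and becomes UINT_MAX
    std::cout << (0 < -1) << '\n';   // prints 0: plain signed comparison
}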
Even a double may not be able to represent all int values, depending on how many bits int contains.
Why not promote both the int and the float to a double?
Probably because it's more costly to convert both types to double than to use one of the operands, which is already a float, as a float. It would also introduce special rules for comparison operators incompatible with the rules for arithmetic operators.
There's also no guarantee how floating point types will be represented, so it would be a blind shot to assume that converting int to double (or even long double) for comparison will solve anything.
The type promotion rules are designed to be simple and to work in a predictable manner. The types in C/C++ are naturally "sorted" by the range of values they can represent. See this for details. Although floating point types cannot represent all integers represented by integral types because they can't represent the same number of significant digits, they might be able to represent a wider range.
To have predictable behavior, when requiring type promotions, the numeric types are always converted to the type with the larger range to avoid overflow in the smaller one. Imagine this:
int i = 23464364; // more digits than float can represent!
float f = 123.4212E36f; // larger range than int can represent!
if (i == f) { /* do something */ }
If the conversion was done towards the integral type, the float f would certainly overflow when converted to int, leading to undefined behavior. On the other hand, converting i to f only causes a loss of precision which is irrelevant since f has the same precision so it's still possible that the comparison succeeds. It's up to the programmer at that point to interpret the result of the comparison according to the application requirements.
Finally, besides the fact that double precision floating point numbers suffer from the same problem representing integers (a limited number of significant digits), using promotion on both types would lead to having a higher precision representation for i, while f is doomed to have the original precision, so the comparison would not succeed if i has more significant digits than f to begin with. That would also be surprising: the comparison might succeed for some pairs (i, f) but not for others.
can a float represent all int values?
For a typical modern system where both int and float are stored in 32 bits, no. Something's gotta give. 32 bits' worth of integers doesn't map 1-to-1 onto a same-sized set that includes fractions.
The i will be promoted to a float and the two float numbers will be compared…
Not necessarily. You don't really know what precision will apply. C++14 §5/12:
The values of the floating operands and the results of floating expressions may be represented in greater precision and range than that required by the type; the types are not changed thereby.
Although i after promotion has nominal type float, the value may be represented using double hardware. C++ doesn't guarantee floating-point precision loss or overflow. (This is not new in C++14; it's inherited from C since olden days.)
Why not promote both the int and the float to a double?
If you want optimal precision everywhere, use double instead and you'll never see a float. Or long double, but that might run slower. The rules are designed to be relatively sensible for the majority of use-cases of limited-precision types, considering that one machine may offer several alternative precisions.
Most of the time, fast and loose is good enough, so the machine is free to do whatever is easiest. That might mean a rounded, single-precision comparison, or double precision and no rounding.
But, such rules are ultimately compromises, and sometimes they fail. To precisely specify arithmetic in C++ (or C), it helps to make conversions and promotions explicit. Many style guides for extra-reliable software prohibit using implicit conversions altogether, and most compilers offer warnings to help you expunge them.
To learn about how these compromises came about, you can peruse the C rationale document. (The latest edition covers up to C99.) It is not just senseless baggage from the days of the PDP-11 or K&R.
It is fascinating that a number of answers here argue from the origin of the C language, explicitly naming K&R and historical baggage as the reason that an int is converted to a float when combined with a float.
This points the blame at the wrong parties. In K&R C, there was no such thing as a float calculation. All floating point operations were done in double precision. For that reason, an integer (or anything else) was never implicitly converted to a float, but only to a double. A float also could not be the type of a function argument: you had to pass a pointer to float if you really, really, really wanted to avoid conversion into a double. For that reason, the functions
int x(float a)
{ ... }
and
int y(a)
float a;
{ ... }
have different calling conventions. The first gets a float argument, the second (by now no longer permissible as syntax) gets a double argument.
Single-precision floating point arithmetic and function arguments were only introduced with ANSI C. Kernighan/Ritchie is innocent.
Now with the newly available single float expressions (single float previously was only a storage format), there also had to be new type conversions. Whatever the ANSI C team picked here (and I would be at a loss for a better choice) is not the fault of K&R.
Q1: Can a float represent all int values?
IEEE 754 single precision can represent all integers exactly as floats, up to about 2^24, as mentioned in this answer.
Q2: Why not promote both the int and the float to a double?
The rules in the Standard for these conversions are slight modifications of those in K&R: the modifications accommodate the added types and the value preserving rules. Explicit license was added to perform calculations in a “wider” type than absolutely necessary, since this can sometimes produce smaller and faster code, not to mention the correct answer more often. Calculations can also be performed in a “narrower” type by the as if rule so long as the same end result is obtained. Explicit casting can always be used to obtain a value in a desired type.
Source
Performing calculations in a wider type means that given float f1; and float f2;, f1 + f2 might be calculated in double precision. And it means that given int i; and float f;, i == f might be calculated in double precision. But it isn't required to calculate i == f in double precision, as hvd stated in the comment.
Also, the C standard says so. These are known as the usual arithmetic conversions. The following description is taken straight from the ANSI C standard.
...if either operand has type float, the other operand is converted to type float.
Source; you can see it in the ref too.
A relevant link is this answer. A more analytic source is here.
Here is another way to explain this: the usual arithmetic conversions are implicitly performed to bring the operands to a common type. The compiler first performs integer promotion; if the operands still have different types, they are then converted to the type that appears highest in the usual-arithmetic-conversions hierarchy (roughly: long double, then double, then float, then the integer types in order of decreasing rank, with unsigned preferred over signed at equal rank).
Source.
When a programming language is created some decisions are made intuitively.
For instance, why not convert int+float to int+int instead of float+float or double+double? Why call int->float a promotion if it holds the same amount of bits? Why not call float->int a promotion?
If you rely on implicit type conversions you should know how they work, otherwise just convert manually.
Some language could have been designed without any automatic type conversions at all. And not every decision during a design phase could have been made logically with a good reason.
JavaScript with its duck typing has even more obscure decisions under the hood. Designing an absolutely logical language is impossible; I think it comes down to Gödel's incompleteness theorem. You have to balance logic, intuition, practice and ideals.
The question is why: Because it is fast, easy to explain, easy to compile, and these were all very important reasons at the time when the C language was developed.
You could have had a different rule: That for every comparison of arithmetic values, the result is that of comparing the actual numerical values. That would be somewhere between trivial if one of the expressions compared is a constant, one additional instruction when comparing signed and unsigned int, and quite difficult if you compare long long and double and want correct results when the long long cannot be represented as double. (0u < -1 would be false, because it would compare the numerical values 0 and -1 without considering their types).
In Swift, the problem is solved easily by disallowing operations between different types.
The rules are written for 16-bit ints (the smallest required size). Your compiler with 32-bit ints will likely convert both sides to double. There are no float-only registers on x87 hardware anyway, so it has to convert to double. Now if you have 64-bit ints, I'm not too sure what it does. long double would be appropriate (normally 80 bits, but it's not even standard).
What actually happens when we do the following:
1)
int i = -1; // 32 bit
void *p;
p = reinterpret_cast<void*>(i);
on 64 bit architecture, sizeof(void*) == 8
2)
long long i; // 64 bit
void *p = (void *)(unsigned int)(-1);
i = reinterpret_cast<long long>(p);
on 32 bit architecture sizeof(void*) = 4
I generally understand what will be the result, but I want somebody to describe the mechanism in terms of the C++ standard for better understanding.
In the second case the behaviour is like that described in "integral promotions" (4.5) for the "int - unsigned long" case (i will be -1), so do we generally say that pointers are converted like signed integers?
3)
int i = ...;
unsigned long long L = ...;
i = L;
What rules apply here?
All conversions between pointers and integral types are implementation defined. The only guarantee the standard makes is that:
A value of integral type or enumeration type can be explicitly converted to a pointer. A pointer converted to an integer of sufficient size (if any such exists on the implementation) and back to the same pointer type will have its original value.
The standard does, however, make the statement (in a non-normative note) that:
It is intended to be unsurprising to those who know the addressing structure of the underlying machine.
From a quality of implementation point of view, if the machine has linear addressing, casting an int to a pointer should result in a value that corresponds to the value of the int, if the word (whatever its size) is viewed as an integral type; in other words, the bit pattern isn't changed. If the integral type is smaller, and is negative, it is an open question whether it should be sign extended, or whether the remaining bits should be set to 0. I prefer the second, but I think that both can be considered "unsurprising".
From a pragmatic point of view: there's a significant amount of software out there which will occasionally cast a small non-negative integral value to void*, and expects to get it back with a cast later. Formally, getting it back requires converting first to an intptr_t (or larger); otherwise the code shouldn't compile. But I can't imagine a compiler breaking this otherwise. For negative values, on the other hand, I'd feel significantly less sure. And I'm not sure how small is small, either. (I currently use the technique in one special case for values less than some 40 or 50. It works with MSC, g++ and Sun CC, at least, and I can't imagine it failing on any of the other mainstream Unix machines I've used in the past. But I wouldn't count on it on a 16 bit Intel, or some of the other more exotic machines I've seen, and certainly not on an embedded system.)
Finally, as for your exact questions: it could vary. I'd try it and see, counting on the fact that whatever it does, there's some code somewhere which counts on it doing that, and that the vendor won't risk changing the behavior. Formally, since it is implementation defined, the implementor is required to document it (and could, of course, change it in the next version), but I've generally found it very, very difficult to find this documentation.
EDIT:
I just noticed that your final question concerned unsigned long long to int. This is an integral conversion, and not a reinterpret_cast, so different rules apply. Or rather, the rules are specified in a different section; the basic rule is still "implementation defined":
If the destination type is signed, the value is unchanged if it can be represented in the destination type (and bit-field width); otherwise, the value is implementation-defined.
The C standard is somewhat different here, in that it explicitly allows an implementation defined signal to be raised. (It's stretching it somewhat to say that the "value" in the C++ standard could be the raising of a signal, even if this would be the preferred implementation.) In practice: all of the compilers I know just ignore the extra high order bits.
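As a hedged illustration of the round trip described above (it relies on the implementation-defined guarantee quoted from the standard; std::intptr_t comes from <cstdint>, and the value 42 is just an example):
#include <cstdint>
#include <iostream>
int main()
{
    int small = 42;
    void* p = reinterpret_cast<void*>(static_cast<std::intptr_t>(small));  // widen first, then cast to a pointer
    int back = static_cast<int>(reinterpret_cast<std::intptr_t>(p));       // recover the original value
    std::cout << back << '\n';  // 42 on the mainstream implementations discussed above
}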
The behavior is Undefined. Anything may happen. It's not even guaranteed that the result is consistent. reinterpret_cast<void*>(1) == reinterpret_cast<void*>(1) may be false.
I'd like to understand better why choose int over unsigned?
Personally, I've never liked signed values unless there is a valid reason for them. Things like the count of items in an array, the length of a string, or the size of a memory block cannot possibly be negative; a negative value there has no possible meaning. Why prefer int when it is misleading in all such cases?
I ask this because both Bjarne Stroustrup and Chandler Carruth gave the advice to prefer int over unsigned here (approx 12:30').
I can see the argument for using int over short or long - int is the "most natural" data width for the target machine architecture.
But signed over unsigned has always annoyed me. Are signed values genuinely faster on typical modern CPU architectures? What makes them better?
As per requests in comments: I prefer int instead of unsigned because...
it's shorter (I'm serious!)
it's more generic and more intuitive (i.e. I like to be able to assume that 1 - 2 is -1 and not some obscure huge number)
what if I want to signal an error by returning an out-of-range value?
Of course there are counter-arguments, but these are the principal reasons I like to declare my integers as int instead of unsigned. Of course, this is not always true; in other cases an unsigned is just a better tool for the task. I am just answering the "why would anyone prefer defaulting to signed" question specifically.
Let me paraphrase the video, as the experts said it succinctly.
Andrei Alexandrescu:
No simple guideline.
In systems programming, we need integers of different sizes and signedness.
Many conversions and arcane rules govern arithmetic (like for auto), so we need to be careful.
Chandler Carruth:
Here's some simple guidelines:
Use signed integers unless you need two's complement arithmetic or a bit pattern
Use the smallest integer that will suffice.
Otherwise, use int if you think you could count the items, and a 64-bit integer if it's even more than you would want to count.
Stop worrying and use tools to tell you when you need a different type or size.
Bjarne Stroustrup:
Use int until you have a reason not to.
Use unsigned only for bit patterns.
Never mix signed and unsigned
Wariness about signedness rules aside, my one-sentence take away from the experts:
Use the appropriate type, and when you don't know, use an int until you do know.
Several reasons:
Arithmetic on unsigned always yields unsigned, which can be a problem when subtracting integer quantities that can reasonably result in a negative result — think subtracting money quantities to yield balance, or array indices to yield distance between elements. If the operands are unsigned, you get a perfectly defined, but almost certainly meaningless result, and a result < 0 comparison will always be false (of which modern compilers will fortunately warn you).
unsigned has the nasty property of contaminating the arithmetic where it gets mixed with signed integers. So, if you add a signed and unsigned and ask whether the result is greater than zero, you can get bitten, especially when the unsigned integral type is hidden behind a typedef.
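A minimal sketch of both of these effects; the container and the money example are my own stand-ins, and the huge printed value assumes a 64-bit size_t / 32-bit unsigned int:
#include <iostream>
#include <vector>
int main()
{
    std::vector<int> v{1, 2, 3};
    auto distance = v.size() - 5;  // unsigned arithmetic: wraps to a huge value instead of -2
    std::cout << distance << '\n';
    unsigned int balance = 10;
    int withdrawal = 20;
    std::cout << (balance - withdrawal < 0) << '\n';  // always prints 0: the difference is unsigned
}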
There are no reasons to prefer signed over unsigned, aside from purely sociological ones, i.e. some people believe that average programmers are not competent and/or attentive enough to write proper code in terms of unsigned types. This is often the main reasoning used by various "speakers", regardless of how respected those speakers might be.
In reality, competent programmers quickly develop and/or learn the basic set of programming idioms and skills that allow them to write proper code in terms of unsigned integral types.
Note also that the fundamental differences between signed and unsigned semantics are always present (in superficially different form) in other parts of C and C++ language, like pointer arithmetic and iterator arithmetic. Which means that in general case the programmer does not really have the option of avoiding dealing with issues specific to unsigned semantics and the "problems" it brings with it. I.e. whether you want it or not, you have to learn to work with ranges that terminate abruptly at their left end and terminate right here (not somewhere in the distance), even if you adamantly avoid unsigned integers.
Also, as you probably know, many parts of standard library already rely on unsigned integer types quite heavily. Forcing signed arithmetic into the mix, instead of learning to work with unsigned one, will only result in disastrously bad code.
The only real reason to prefer signed in some contexts that comes to mind is that in mixed integer/floating-point code signed integer formats are typically directly supported by the FPU instruction set, while unsigned formats are not supported at all, making the compiler generate extra code for conversions between floating-point values and unsigned values. In such code signed types might perform better.
But at the same time in purely integer code unsigned types might perform better than signed types. For example, integer division often requires additional corrective code in order to satisfy the requirements of the language spec. The correction is only necessary in case of negative operands, so it wastes CPU cycles in situations when negative operands are not really used.
In my practice I devotedly stick to unsigned wherever I can, and use signed only if I really have to.
The integral types in C and many languages which derive from it have two general usage cases: to represent numbers, or represent members of an abstract algebraic ring. For those unfamiliar with abstract algebra, the primary notion behind a ring is that adding, subtracting, or multiplying two items of a ring should yield another item of that ring--it shouldn't crash or yield a value outside the ring. On a 32-bit machine, adding unsigned 0x12345678 to unsigned 0xFFFFFFFF doesn't "overflow"--it simply yields the result 0x12345677 which is defined for the ring of integers congruent mod 2^32 (because the arithmetic result of adding 0x12345678 to 0xFFFFFFFF, i.e. 0x112345677, is congruent to 0x12345677 mod 2^32).
Conceptually, both purposes (representing numbers, or representing members of the ring of integers congruent mod 2^n) may be served by both signed and unsigned types, and many operations are the same for both usage cases, but there are some differences. Among other things, an attempt to add two numbers should not be expected to yield anything other than the correct arithmetic sum. While it's debatable whether a language should be required to generate the code necessary to guarantee that it won't (e.g. that an exception would be thrown instead), one could argue that for code which uses integral types to represent numbers such behavior would be preferable to yielding an arithmetically-incorrect value and compilers shouldn't be forbidden from behaving that way.
The implementers of the C standards decided to use signed integer types to represent numbers and unsigned types to represent members of the algebraic ring of integers congruent mod 2^n. By contrast, Java uses signed integers to represent members of such rings (though they're interpreted differently in some contexts; conversions among differently-sized signed types, for example, behave differently from among unsigned ones) and Java has neither unsigned integers nor any primitive integral types which behave as numbers in all non-exceptional cases.
If a language provided a choice of signed and unsigned representations for both numbers and algebraic-ring numbers, it might make sense to use unsigned numbers to represent quantities that will always be positive. If, however, the only unsigned types represent members of an algebraic ring, and the only types that represent numbers are the signed ones, then even if a value will always be positive it should be represented using a type designed to represent numbers.
Incidentally, the reason that (uint32_t)-1 is 0xFFFFFFFF stems from the fact that casting a signed value to unsigned is equivalent to adding unsigned zero, and adding an integer to an unsigned value is defined as adding or subtracting its magnitude to/from the unsigned value according to the rules of the algebraic ring, which specify that if X=Y-Z, then X is the one and only member of that ring such that X+Z=Y. In unsigned math, 0xFFFFFFFF is the only number which, when added to unsigned 1, yields unsigned zero.
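A tiny, hedged illustration of that last point (uint32_t comes from <cstdint>; the hex formatting is just for readability):
#include <cstdint>
#include <iostream>
int main()
{
    std::uint32_t x = static_cast<std::uint32_t>(-1);  // -1 reduced modulo 2^32
    std::cout << std::hex << x << '\n';                // ffffffff
    std::uint32_t y = x + 1u;                          // the one value that sums with 1 to zero
    std::cout << std::dec << y << '\n';                // 0
}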
Speed is the same on modern architectures. The problem with unsigned int is that it can sometimes generate unexpected behavior. This can create bugs that wouldn't show up otherwise.
Normally when you subtract 1 from a value, the value gets smaller. Now, with both signed and unsigned int variables, there will be a time that subtracting 1 creates a value that is MUCH LARGER. The key difference between unsigned int and int is that with unsigned int the value that generates the paradoxical result is a commonly used value --- 0 --- whereas with signed the number is safely far away from normal operations.
As far as returning -1 for an error value --- modern thinking is that it's better to throw an exception than to test for return values.
It's true that if you properly defend your code you won't have this problem, and if you use unsigned religiously everywhere you will be okay (provided that you are only adding, and never subtracting, and that you never get near UINT_MAX). I use unsigned int everywhere. But it takes a lot of discipline. For a lot of programs, you can get by with using int and spend your time on other bugs.
Use int by default: it plays nicer with the rest of the language
most common domain usage is regular arithmetic, not modular arithmetic
int main() {} // see an unsigned?
auto i = 0; // i is of type int
Only use unsigned for modulo arithmetic and bit-twiddling (in particular shifting)
has different semantics than regular arithmetic, make sure it is what you want
bit-shifting signed types is subtle (see the comments by @ChristianRau)
if you need a > 2Gb vector on a 32-bit machine, upgrade your OS / hardware
Never mix signed and unsigned arithmetic
the rules for that are complicated and surprising (either one can be converted to the other, depending on the relative type sizes)
turn on -Wconversion -Wsign-conversion -Wsign-promo (gcc is better than Clang here)
the Standard Library got it wrong with std::size_t (quote from the GN13 video)
use range-for if you can,
for(auto i = 0; i < static_cast<int>(v.size()); ++i) if you must
Don't use short or large types unless you actually need them
current architectures' data flow caters well to 32-bit non-pointer data (but note the comment by @BenVoigt about cache effects for smaller types)
char and short save space but suffer from integral promotions
are you really going to need to count all the way up to the limit of an int64_t?
To answer the actual question: for the vast majority of things, it doesn't really matter. int can make things like subtraction, where the second operand is larger than the first, a little easier to deal with: you still get the "expected" result.
There is absolutely no speed difference in 99.9% of cases, because the ONLY instructions that are different for signed and unsigned numbers are:
Making the number longer (fill with the sign for signed or zero for unsigned) - it takes the same effort to do both.
Comparisons - for signed numbers, the processor has to take into account whether either number is negative. But again, it's the same speed to make a compare with signed or unsigned numbers - it's just a different instruction code to say "numbers that have the highest bit set are smaller than numbers with the highest bit not set" (essentially). [Pedantically, it's nearly always the operation using the RESULT of a comparison that is different - the most common case being a conditional jump or branch instruction - but either way, it's the same effort, just that the inputs are taken to mean slightly different things].
Multiply and divide. Obviously, the sign of the result needs to be handled if it's a signed multiplication, whereas an unsigned multiplication must not change the sign of the result even if the highest bit of one of the inputs is set. And again, the effort is (as near as we care about) identical.
(I think there are one or two other cases, but the result is the same - it really doesn't matter if it's signed or unsigned, the effort to perform the operation is the same for both).
The int type more closely resembles the behavior of mathematical integers than the unsigned type.
It is naive to prefer the unsigned type simply because a situation does not require negative values to be represented.
The problem is that the unsigned type has a discontinuous behavior right next to zero. Any operation that tries to compute a small negative value instead produces some large positive value. (Worse: one whose exact value depends on the implementation-defined width of the type.)
Algebraic relationships such as that a < b implies that a - b < 0 are wrecked in the unsigned domain, even for small values like a = 3 and b = 4.
A descending loop like for (i = max - 1; i >= 0; i--) fails to terminate if i is made unsigned.
Unsigned quirks can cause a problem which will affect code regardless of whether that code expects to be representing only positive quantities.
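For example, a hedged sketch of the descending-loop pitfall mentioned above and the usual unsigned-safe workaround (the names and values are mine):
#include <cstddef>
#include <iostream>
int main()
{
    const std::size_t max = 3;
    // for (std::size_t i = max - 1; i >= 0; i--)  // never terminates: i >= 0 is always true for unsigned i
    //     ...;
    for (std::size_t i = max; i-- > 0; )           // decrement before the body, test the old value
        std::cout << i << '\n';                    // prints 2, 1, 0
}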
The virtue of the unsigned types is that certain operations that are not portably defined at the bit level for the signed types are that way for the unsigned types. The unsigned types lack a sign bit, and so shifting and masking through the sign bit isn't a problem. The unsigned types are good for bitmasks, and for code that implements precise arithmetic in a platform-independent way. Unsigned operations will simulate two's complement semantics even on a non two's complement machine. Writing a multi-precision (bignum) library practically requires arrays of unsigned types to be used for the representation, rather than signed types.
The unsigned types are also suitable in situations in which numbers behave like identifiers and not as arithmetic types. For instance, an IPv4 address can be represented in a 32 bit unsigned type. You wouldn't add together IPv4 addresses.
int is preferred because it's most commonly used. unsigned is usually associated with bit operations. Whenever I see an unsigned, I assume it's used for bit twiddling.
If you need a bigger range, use a 64-bit integer.
If you're iterating over stuff using indexes, types usually have size_type, and you shouldn't care whether it's signed or unsigned.
Speed is not an issue.
For me, in addition to all the integers in the range of 0..+2,147,483,647 contained within the set of signed and unsigned integers on 32 bit architectures, there is a higher probability that I will need to use -1 (or smaller) than need to use +2,147,483,648 (or larger).
One good reason that I can think of is in case of detecting overflow.
For use cases such as the count of items in an array, the length of a string, or the size of a memory block, you can overflow an unsigned int and you may not notice a difference even when you take a look at the variable. If it is a signed int, the variable will be less than zero and clearly wrong.
You can simply check to see if the variable is less than zero when you want to use it. This way, you do not have to check for overflow after every arithmetic operation as is the case for unsigned ints.
It gives an unexpected result when doing a simple arithmetic operation:
unsigned int i;
i = 1 - 2;
//i is now 4294967295 on a machine with a 32-bit unsigned int
It gives an unexpected result when doing a simple comparison:
unsigned int j = 1;
std::cout << (j>-1) << std::endl;
//output 0 as false but 1 is greater than -1
This is because when doing the operations above, the signed ints are converted to unsigned, and the negative values wrap around to really big numbers.