How to avoid integral promotion for bitwise operations - c++

I'm floored that VisualStudio 2015 insists on promoting a WORD (unsigned short) to an unsigned int when only WORD values are involved in only bit manipulations. (i.e. promotes 16 bit to 32 bit when doing 16bit | 16bit).
e.g.
// where WORD is a 'unsigned short'
const WORD kFlag = 1;
WORD old = 2;
auto value = old | kFlag; // why the blazes is value an unsigned int (32 bits)
Moreover, is there a way to get 0x86 intrinsics for WORD|WORD? I surely do not want to pay for (16->32|16->)->16. Nor does this code need to consume more than a couple of 16 bit registers, not a few 32 bit regs.
But the registry use is really just an aside. The optimizer is welcome to do as it pleases, so long as the results are indistinguishable for me. (i.e. it should not change the size in a visible way).
The main problem for me is that using flags|kFlagValue results in a wider entity, and then pumping that into a template gives me a type mismatch error (template is rather much longer than I want to get into here, but the point is it takes two arguments, and they should match in type, or be trivially convertible, but aren't, due to this automatic size-promotion rule).
If I had access to a "conservative bit processing function set" then I could use:
flag non-promoting-bit-operator kFlagValue
To achieve my ends.
I guess I have to go write that, or use casts all over the place, because of this unfortunate rule.
C++ should not promote in this instance. It was a poor language choice.

Why is value promoted to a larger type? Because the language spec says it is (a 16-bit unsigned short will be converted to a 32-bit int). 16-bit ops on x86 actually incur a penalty over the corresponding 32 bit ones (due to a prefix opcode), so the 32 bit version just may run faster.

Related

Why is int being treated as a 32 bit value when compiled x86_64 [duplicate]

Why is int typically 32 bit on 64 bit compilers? When I was starting programming, I've been taught int is typically the same width as the underlying architecture. And I agree that this also makes sense, I find it logical for a unspecified width integer to be as wide as the underlying platform (unless we are talking 8 or 16 bit machines, where such a small range for int will be barely applicable).
Later on I learned int is typically 32 bit on most 64 bit platforms. So I wonder what is the reason for this. For storing data I would prefer an explicitly specified width of the data type, so this leaves generic usage for int, which doesn't offer any performance advantages, at least on my system I have the same performance for 32 and 64 bit integers. So that leaves the binary memory footprint, which would be slightly reduced, although not by a lot...
Bad choices on the part of the implementors?
Seriously, according to the standard, "Plain ints have the
natural size suggested by the architecture of the execution
environment", which does mean a 64 bit int on a 64 bit
machine. One could easily argue that anything else is
non-conformant. But in practice, the issues are more complex:
switching from 32 bit int to 64 bit int would not allow
most programs to handle large data sets or whatever (unlike the
switch from 16 bits to 32); most programs are probably
constrained by other considerations. And it would increase the
size of the data sets, and thus reduce locality and slow the
program down.
Finally (and probably most importantly), if int were 64 bits,
short would have to be either 16 bits or
32 bits, and you'ld have no way of specifying the other (except
with the typedefs in <stdint.h>, and the intent is that these
should only be used in very exceptional circumstances).
I suspect that this was the major motivation.
The history, trade-offs and decisions are explained by The Open Group at http://www.unix.org/whitepapers/64bit.html. It covers the various data models, their strengths and weaknesses and the changes made to the Unix specifications to accommodate 64-bit computing.
Because there is no advantage to a lot of software to have 64-bit integers.
Using 64-bit int's to calculate things that can be calculated in a 32-bit integer (and for many purposes, values up to 4 billion (or +/- 2 billon) are sufficient), and making them bigger will not help anything.
Using a bigger integer will however have a negative effect on how many integers sized "things" fit in the cache on the processor. So making them bigger will make calculations that involve large numbers of integers (e.g. arrays) take longer because.
The int is the natural size of the machine-word isn't something stipulated by the C++ standard. In the days when most machines where 16 or 32 bit, it made sense to make it either 16 or 32 bits, because that is a very efficient size for those machines. When it comes to 64 bit machines, that no longer "helps". So staying with 32 bit int makes more sense.
Edit:
Interestingly, when Microsoft moved to 64-bit, they didn't even make long 64-bit, because it would break too many things that relied on long being a 32-bit value (or more importantly, they had a bunch of things that relied on long being a 32-bit value in their API, where sometimes client software uses int and sometimes long, and they didn't want that to break).
ints have been 32 bits on most major architectures for so long that changing them to 64 bits will probably cause more problems than it solves.
I originally wrote this up in response to this question. While I've modified it some, it's largely the same.
To get started, it is possible to have plain ints wider than 32 bits, as the C++ draft says:
 Note: Plain ints are intended to have the natural size suggested by the architecture of the execution environment; the other signed integer types are provided to meet special needs. — end note
Emphasis mine
This would ostensibly seem to say that on my 64 bit architecture (and everyone else's) a plain int should have a 64 bit size; that's a size suggested by the architecture, right? However I must assert that the natural size for even 64 bit architecture is 32 bits. The quote in the specs is mainly there for cases where 16 bit plain ints is desired--which is the minimum size the specifications allow.
The largest factor is convention, going from a 32 bit architecture with a 32 bit plain int and adapting that source for a 64 bit architecture is simply easier if you keep it 32 bits, both for the designers and their users in two different ways:
The first is that less differences across systems there are the easier is for everyone. Discrepancies between systems been only headaches for most programmer: they only serve to make it harder to run code across systems. It'll even add on to the relatively rare cases where you're not able to do it across computers with the same distribution just 32 bit and 64 bit. However, as John Kugelman pointed out, architectures have gone from a 16 bit to 32 bit plain int, going through the hassle to do so could be done again today, which ties into his next point:
The more significant component is the gap it would cause in integer sizes or a new type to be required. Because sizeof(short) <= sizeof(int) <= sizeof(long) <= sizeof(long long) is in the actual specification, a gap is forced if the plain int is moved to 64 bits. It starts with shifting long. If a plain int is adjusted to 64 bits, the constraint that sizeof(int) <= sizeof(long) would force long to be at least 64 bits and from there there's an intrinsic gap in sizes. Since long or a plain int usually are used as a 32 bit integer and neither of them could now, we only have one more data type that could, short. Because short has a minimum of 16 bits if you simply discard that size it could become 32 bits and theoretically fill that gap, however short is intended to be optimized for space so it should be kept like that and there are use cases for small, 16 bit, integers as well. No matter how you arrange the sizes there is a loss of a width and therefore use case for an int entirely unavailable. A bigger width doesn't necessarily mean it's better.
This now would imply a requirement for the specifications to change, but even if a designer goes rogue, it's highly likely it'd be damaged or grow obsolete from the change. Designers for long lasting systems have to work with an entire base of entwined code, both their own in the system, dependencies, and user's code they'll want to run and a huge amount of work to do so without considering the repercussions is simply unwise.
As a side note, if your application is incompatible with a >32 bit integer, you can use static_assert(sizeof(int) * CHAR_BIT <= 32, "Int wider than 32 bits!");. However, who knows maybe the specifications will change and 64 bits plain ints will be implemented, so if you want to be future proof, don't do the static assert.
Main reason is backward compatibility. Moreover, there is already a 64 bit integer type long and same goes for float types: float and double. Changing the sizes of these basic types for different architectures will only introduce complexity. Moreover, 32 bit integer responds to many needs in terms of range.
The C + + standard does not say how much memory should be used for the int type, tells you how much memory should be used at least for the type int. In many programming environments on 32-bit pointer variables, "int" and "long" are all 32 bits long.
Since no one pointed this out yet.
int is guaranteed to be between -32767 to 32767(2^16) That's required by the standard. If you want to support 64 bit numbers on all platforms I suggest using the right type long long which supports (-9223372036854775807 to 9223372036854775807).
int is allowed to be anything so long as it provides the minimum range required by the standard.

In new code, why would you use `int` instead of `int_fast16_t` or `int_fast32_t` for a counting variable?

If you need a counting variable, surely there must be an upper and a lower limit that your integer must support. So why wouldn't you specify those limits by choosing an appropriate (u)int_fastxx_t data type?
The simplest reason is that people are more used to int than the additional types introduced in C++11, and that it's the language's "default" integral type (so much as C++ has one); the standard specifies, in [basic.fundamental/2] that:
Plain ints have the natural size suggested by the architecture of the execution environment46; the other signed integer types are provided to meet special needs.
46) that is, large enough to contain any value in the range of INT_MIN and INT_MAX, as defined in the header <climits>.
Thus, whenever a generic integer is needed, which isn't required to have a specific range or size, programmers tend to just use int. While using other types can communicate intent more clearly (for example, using int8_t indicates that the value should never exceed 127), using int also communicates that these details aren't crucial to the task at hand, while simultaneously providing a little leeway to catch values that exceed your required range (if a system handles signed overflow with modulo arithmetic, for example, an int8_t would treat 313 as 57, making the invalid value harder to troubleshoot); typically, in modern programming, it either indicates that the value can be represented within the system's word size (which int is supposed to represent), or that the value can be represented within 32 bits (which is nearly always the size of int on x86 and x64 platforms).
Sized types also have the issue that the (theoretically) most well-known ones, the intX_t line, are only defined on platforms which support sizes of exactly X bits. While the int_leastX_t types are guaranteed to be defined on all platforms, and guaranteed to be at least X bits, a lot of people wouldn't want to type that much if they don't have to, since it adds up when you need to specify types often. [You can't use auto, either because it detects integer literals as ints. This can be mitigated by making user-defined literal operators, but that still takes more time to type.] Thus, they'll typically use int if it's safe to do so.
Or in short, int is intended to be the go-to type for normal operation, with the other types intended to be used in extranormal circumstances. Many programmers stick to this mindset out of habit, and only use sized types when they explicitly require specific ranges and/or sizes. This also communicates intent relatively well; int means "number", and intX_t means "number that always fits in X bits".
It doesn't help that int has evolved to unofficially mean "32-bit integer", due to both 32- and 64-bit platforms usually using 32-bit ints. It's very likely that many programmers expect int to always be at least 32 bits in the modern age, to the point where it can very easily bite them in the rear if they have to program for platforms that don't support 32-bit ints.
Conversely, the sized types are typically used when a specific range or size is explicitly required, such as when defining a struct that needs to have the same layout on systems with different data models. They can also prove useful when working with limited memory, using the smallest type that can fully contain the required range.
A struct intended to have the same layout on 16- and 32-bit systems, for example, would use either int16_t or int32_t instead of int, because int is 16 bits in most 16-bit data models and the LP32 32-bit data model (used by the Win16 API and Apple Macintoshes), but 32 bits in the ILP32 32-bit data model (used by the Win32 API and *nix systems, effectively making it the de facto "standard" 32-bit model).
Similarly, a struct intended to have the same layout on 32- and 64-bit systems would use int/int32_t or long long/int64_t over long, due to long having different sizes in different models (64 bits in LP64 (used by 64-bit *nix), 32 bits in LLP64 (used by Win64 API) and the 32-bit models).
Note that there is also a third 64-bit model, ILP64, where int is 64 bits; this model is very rarely used (to my knowledge, it was only used on early 64-bit Unix systems), but would mandate the use of a sized type over int if layout compatibility with ILP64 platforms is required.
There are several reasons. One, these long names make the code less readable. Two, you might introduce really hard to find bugs. Say you used int_fast16_t but you really need to count up to 40,000. The implementation might use 32 bits and the code work just fine. Then you try to run the code on an implementation that uses 16 bits and you get hard-to-find bugs.
A note: In C / C++ you have types char, short, int, long and long long which must cover 8 to 64 bits, so int cannot be 64 bits (because char and short cannot cover 8, 16 and 32 bits), even if 64 bits is the natural word size. In Swift, for example, Int is the natural integer size, either 32 and 64 bits, and you have Int8, Int16, Int32 and Int64 for explicit sizes. Int is the best type unless you absolutely need 64 bits, in which case you use Int64, or if you need to save space.

Perform 64 bit calculations in 64 bit executable

I am using MinGW64 (with the -m64 flag) with Code::Blocks and am looking to know how to perform 64 bit calculations without having to cast a really big number to int64_t before multiplying it. For example, this does not result in overflow:
int64_t test = int64_t(2123123123) * 17; //Returns 36093093091
Without the cast, the calculation overflows like such:
int64_t test = 2123123123 * 17; //Returns 1733354723
A VirusTotal scan confirms that my executable is x64.
Additional Information: OS is Windows 7 x64.
The default int type is still 32 bit even in 64 bit compilations for compatibility resons.
The "shortest" version I guess would be to add the ll suffix to the number
int64_t test = 2123123123ll * 17;
Another way would be to store the numbers in their own variables of type int64_t (or long long) and multiply the varaibles. usually it's rare anyway in a program to have many "magic-numbers" hard-coded into the codebase.
Some background:
Once upon a time, most computers had 8-bit arithmetic logic units and a 16-bit address bus. We called them 8-bit computers.
One of the first things we learned was that no real-world arithmetic problem can be expressed in 8-bits. It's like trying to reason about space flight with the arithmetic abilities of a chimpanzee. So we learned to write multi-word add, multiply, subtract and divide sequences. Because in most real-world problems, the numerical domain of the problem was bigger than 255.
The we briefly had 16-bit computers (where the same problem applied, 65535 is just not enough to model things) and then quite quickly, 32-bit arithmetic logic built in to chips. Gradually, the address bus caught up (20 bits, 24 bits, 32 bits if designers were feeling extravagant).
Then an interesting thing happened. Most of us didn't need to write multi-word arithmetic sequences any more. It turns out that most(tm) real world integer problems could be expressed in 32 bits (up to 4 billion).
Then we started producing more data at a faster rate than ever before, and we perceived the need to address more memory. The 64-bit computer eventually became the norm.
But still, most real-world integer arithmetic problems could be expressed in 32 bits. 4 billion is a big (enough) number for most things.
So, presumably through statistical analysis, your compiler writers decided that on your platform, the most useful size for an int would be 32 bits. Any smaller would be inefficient for 32-bit arithmetic (which we have needed from day 1) and any larger would waste space/registers/memory/cpu cycles.
Expressing an integer literal in c++ (and c) yields an int - the natural arithmetic size for the environment. In the present day, that is almost always a 32-bit value.
The c++ specification says that multiplying two ints yields an int. If it didn't then multiplying two ints would need to yield a long. But then what would multiplying two longs yield? A long long? Ok, that's possible. Now what if we multiply those? A long long long long?
So that's that.
int64_t x = 1 * 2; will do the following:
take the integer (32 bits) of value 1.
take the integer (32 bits) of value 2.
multiply them together, storing the result in an integer. If the arithmetic overflows, so be it. That's your lookout.
cast the resulting integer (whatever that may now be) to int64 (probably on your system a long int.
So in a nutshell, no. There is no shortcut to spelling out the type of at least one of the operands in the code snippet in the question. You can, of course, specify a literal. But there is no guarantee that the a long long (LL literal suffix) on your system is the same as int64_t. If you want an int64_t, and you want the code to be portable, you must spell it out.
For what it's worth:
In a post-c++11 world all the worrying about extra keystrokes and non-DRYness can disappear:
definitely an int64:
auto test = int64_t(2123123123) * 17;
definitely a long long:
auto test = 2'123'123'123LL * 17;
definitely int64, definitely initialised with a (possibly narrowing, but that's ok) long long:
auto test = int64_t(36'093'093'091LL);
Since you're most likely in an LP64 environment, where int is only 32 bits, you have to be careful about literal constants in expressions. The easiest way to do this is to get into the habit of using the proper suffix on literal constants, so you would write the above as:
int64_t test = 2123123123LL * 17LL;
2123123123 is an int (usually 32 bits).
Add an L to make it a long: 2123123123L (usually 32 or 64 bits, even in 64-bit mode).
Add another L to make it a long long: 2123123123LL (64 bits or more starting with C++11).
Note that you only need to add the suffix to constants that exceed the size of an int. Integral conversion will take care of producing the right result*.
(2123123123LL * 17) // 17 is automatically converted to long long, the result is long long
* But beware: even if individual constants in an expression fit into an int, the whole operation can still overflow like in
(1024 * 1024 * 1024 * 10)
In that case you should make sure the arithmetic is performed at sufficient width (taking operator precedence into account):
(1024LL * 1024 * 1024 * 10)
- will perform all 3 operations in 64 bits, with a 64-bit result.
Edit: Literal constants (A.K.A. magic numbers) are frowned upon, so the best way to do it would be to use symbolic constants (const int64_t value = 5). See What is a magic number, and why is it bad? for more info. It's best that you don't read the rest of this answer, unless you really want to use magic numbers for some strange reason.
Also, you can use intptr_t and uintprt_t from #include <cstdint> to let the compiler choose whether to use int or __int64.
For those who stumble upon this question, `LL` at the end of a number can do the trick, but it isn't recommended, as Richard Hodges told me that `long long` may not be always 64 bit, and can increase in size in the future, although it's not likely. See Richard Hodge's answer and the comments on it for more information.
The reliable way would be to put `using QW = int_64t;` at the top and use `QW(5)` instead of `5LL`.
Personally I think there should be an option to define all literals 64 bit without having to add any suffixes or functions to them, and use `int32_t(5)` when necessary, because some programs are unaffected by this change. Example: only use numbers for normal calculations instead of relying on integer overflow to do it's work. The problem is going from 64 bit to 32 bit, rather than going from 32 to 64, as the first 4 bytes are cut off.

Difference between uint8_t, uint_fast8_t and uint_least8_t

The C99 standard introduces the following datatypes. The documentation can be found here for the AVR stdint library.
uint8_t means it's an 8-bit unsigned type.
uint_fast8_t means it's the fastest unsigned int with at least 8
bits.
uint_least8_t means it's an unsigned int with at least 8 bits.
I understand uint8_t and what is uint_fast8_t( I don't know how it's implemented in register level).
1.Can you explain what is the meaning of "it's an unsigned int with at least 8 bits"?
2.How uint_fast8_t and uint_least8_t help increase efficiency/code space compared to the uint8_t?
uint_least8_t is the smallest type that has at least 8 bits.
uint_fast8_t is the fastest type that has at least 8 bits.
You can see the differences by imagining exotic architectures. Imagine a 20-bit architecture. Its unsigned int has 20 bits (one register), and its unsigned char has 10 bits. So sizeof(int) == 2, but using char types requires extra instructions to cut the registers in half. Then:
uint8_t: is undefined (no 8 bit type).
uint_least8_t: is unsigned char, the smallest type that is at least 8 bits.
uint_fast8_t: is unsigned int, because in my imaginary architecture, a half-register variable is slower than a full-register one.
uint8_t means: give me an unsigned int of exactly 8 bits.
uint_least8_t means: give me the smallest type of unsigned int which has at least 8 bits. Optimize for memory consumption.
uint_fast8_t means: give me an unsigned int of at least 8 bits. Pick a larger type if it will make my program faster, because of alignment considerations. Optimize for speed.
Also, unlike the plain int types, the signed version of the above stdint.h types are guaranteed to be 2's complement format.
The theory goes something like:
uint8_t is required to be exactly 8 bits but it's not required to exist. So you should use it where you are relying on the modulo-256 assignment behaviour* of an 8 bit integer and where you would prefer a compile failure to misbehaviour on obscure architectures.
uint_least8_t is required to be the smallest available unsigned integer type that can store at least 8 bits. You would use it when you want to minimise the memory use of things like large arrays.
uint_fast8_t is supposed to be the "fastest" unsigned type that can store at least 8 bits; however, it's not actually guaranteed to be the fastest for any given operation on any given processor. You would use it in processing code that performs lots of operations on the value.
The practice is that the "fast" and "least" types aren't used much.
The "least" types are only really useful if you care about portability to obscure architectures with CHAR_BIT != 8 which most people don't.
The problem with the "fast" types is that "fastest" is hard to pin down. A smaller type may mean less load on the memory/cache system but using a type that is smaller than native may require extra instructions. Furthermore which is best may change between architecture versions but implementers often want to avoid breaking ABI in such cases.
From looking at some popular implementations it seems that the definitions of uint_fastn_t are fairly arbitrary. glibc seems to define them as being at least the "native word size" of the system in question taking no account of the fact that many modern processors (especially 64-bit ones) have specific support for fast operations on items smaller than their native word size. IOS apparently defines them as equivalent to the fixed-size types. Other platforms may vary.
All in all if performance of tight code with tiny integers is your goal you should be bench-marking your code on the platforms you care about with different sized types to see what works best.
* Note that unfortunately modulo-256 assignment behaviour does not always imply modulo-256 arithmetic, thanks to C's integer promotion misfeature.
Some processors cannot operate as efficiently on smaller data types as on large ones. For example, given:
uint32_t foo(uint32_t x, uint8_t y)
{
x+=y;
y+=2;
x+=y;
y+=4;
x+=y;
y+=6;
x+=y;
return x;
}
if y were uint32_t a compiler for the ARM Cortex-M3 could simply generate
add r0,r0,r1,asl #2 ; x+=(y<<2)
add r0,r0,#12 ; x+=12
bx lr ; return x
but since y is uint8_t the compiler would have to instead generate:
add r0,r0,r1 ; x+=y
add r1,r1,#2 ; Compute y+2
and r1,r1,#255 ; y=(y+2) & 255
add r0,r0,r1 ; x+=y
add r1,r1,#4 ; Compute y+4
and r1,r1,#255 ; y=(y+4) & 255
add r0,r0,r1 ; x+=y
add r1,r1,#6 ; Compute y+6
and r1,r1,#255 ; y=(y+6) & 255
add r0,r0,r1 ; x+=y
bx lr ; return x
The intended purpose of the "fast" types was to allow compilers to replace smaller types which couldn't be processed efficiently with faster ones. Unfortunately, the semantics of "fast" types are rather poorly specified, which in turn leaves murky questions of whether expressions will be evaluated using signed or unsigned math.
1.Can you explain what is the meaning of "it's an unsigned int with at least 8 bits"?
That ought to be obvious. It means that it's an unsigned integer type, and that it's width is at least 8 bits. In effect this means that it can at least hold the numbers 0 through 255, and it can definitely not hold negative numbers, but it may be able to hold numbers higher than 255.
Obviously you should not use any of these types if you plan to store any number outside the range 0 through 255 (and you want it to be portable).
2.How uint_fast8_t and uint_least8_t help increase efficiency/code space compared to the uint8_t?
uint_fast8_t is required to be faster so you should use that if your requirement is that the code be fast. uint_least8_t on the other hand requires that there is no candidate of lesser size - so you would use that if size is the concern.
And of course you use only uint8_t when you absolutely require it to be exactly 8 bits. Using uint8_t may make the code non-portable as uint8_t is not required to exist (because such small integer type does not exist on certain platforms).
The "fast" integer types are defined to be the fastest integer available with at least the amount of bits required (in your case 8).
A platform can define uint_fast8_t as uint8_t then there will be absolutely no difference in speed.
The reason is that there are platforms that are slower when not using their native word length.
As the name suggests, uint_least8_t is the smallest type that has at least 8 bits, uint_fast8_t is the fastest type that has at least 8 bits. uint8_t has exactly 8 bits, but it is not guaranteed to exist on all platforms, although this is extremely uncommon.
In most case, uint_least8_t = uint_fast8_t = uint8_t = unsigned char. The only exception I have seen is the C2000 DSP from Texas Instruments, it is 32-bit, but its minimum data width is 16-bit. It does not have uint8_t, you can only use uint_least8_t and uint_fast8_t, they are defined as unsigned int, which is 16-bit.
I'm using the fast datatypes (uint_fast8_t) for local vars and function parameters, and using the normal ones (uint8_t) in arrays and structures which are used frequently and memory footprint is more important than the few cycles that could be saved by not having to clear or sign extend the upper bits.
Works great, except with MISRA checkers. They go nuts from the fast types. The trick is that the fast types are used through derived types that can be defined differently for MISRA builds and normal ones.
I think these types are great to create portable code, that's efficient on both low-end microcontrollers and big application processors. The improvement might be not huge, or totally negligible with good compilers, but better than nothing.
Some guessing in this thread.
"fast": The compiler should place "fast" type vars in IRAM (local processor RAM) which requires fewer cycles to access and write than vars stored in the hinterlands of RAM. "fast" is used if you need quickest possible action on a var, such as in an Interrupt Service Routine (ISR). Same as declaring a function to have an IRAM_ATTR; this == faster access. There is limited space for "fast" or IRAM vars/functions, so only use when needed, and never persist unless they qualify for that. Most compilers will move "fast" vars to general RAM if processor RAM is all allocated.

Why is int typically 32 bit on 64 bit compilers?

Why is int typically 32 bit on 64 bit compilers? When I was starting programming, I've been taught int is typically the same width as the underlying architecture. And I agree that this also makes sense, I find it logical for a unspecified width integer to be as wide as the underlying platform (unless we are talking 8 or 16 bit machines, where such a small range for int will be barely applicable).
Later on I learned int is typically 32 bit on most 64 bit platforms. So I wonder what is the reason for this. For storing data I would prefer an explicitly specified width of the data type, so this leaves generic usage for int, which doesn't offer any performance advantages, at least on my system I have the same performance for 32 and 64 bit integers. So that leaves the binary memory footprint, which would be slightly reduced, although not by a lot...
Bad choices on the part of the implementors?
Seriously, according to the standard, "Plain ints have the
natural size suggested by the architecture of the execution
environment", which does mean a 64 bit int on a 64 bit
machine. One could easily argue that anything else is
non-conformant. But in practice, the issues are more complex:
switching from 32 bit int to 64 bit int would not allow
most programs to handle large data sets or whatever (unlike the
switch from 16 bits to 32); most programs are probably
constrained by other considerations. And it would increase the
size of the data sets, and thus reduce locality and slow the
program down.
Finally (and probably most importantly), if int were 64 bits,
short would have to be either 16 bits or
32 bits, and you'ld have no way of specifying the other (except
with the typedefs in <stdint.h>, and the intent is that these
should only be used in very exceptional circumstances).
I suspect that this was the major motivation.
The history, trade-offs and decisions are explained by The Open Group at http://www.unix.org/whitepapers/64bit.html. It covers the various data models, their strengths and weaknesses and the changes made to the Unix specifications to accommodate 64-bit computing.
Because there is no advantage to a lot of software to have 64-bit integers.
Using 64-bit int's to calculate things that can be calculated in a 32-bit integer (and for many purposes, values up to 4 billion (or +/- 2 billon) are sufficient), and making them bigger will not help anything.
Using a bigger integer will however have a negative effect on how many integers sized "things" fit in the cache on the processor. So making them bigger will make calculations that involve large numbers of integers (e.g. arrays) take longer because.
The int is the natural size of the machine-word isn't something stipulated by the C++ standard. In the days when most machines where 16 or 32 bit, it made sense to make it either 16 or 32 bits, because that is a very efficient size for those machines. When it comes to 64 bit machines, that no longer "helps". So staying with 32 bit int makes more sense.
Edit:
Interestingly, when Microsoft moved to 64-bit, they didn't even make long 64-bit, because it would break too many things that relied on long being a 32-bit value (or more importantly, they had a bunch of things that relied on long being a 32-bit value in their API, where sometimes client software uses int and sometimes long, and they didn't want that to break).
ints have been 32 bits on most major architectures for so long that changing them to 64 bits will probably cause more problems than it solves.
I originally wrote this up in response to this question. While I've modified it some, it's largely the same.
To get started, it is possible to have plain ints wider than 32 bits, as the C++ draft says:
 Note: Plain ints are intended to have the natural size suggested by the architecture of the execution environment; the other signed integer types are provided to meet special needs. — end note
Emphasis mine
This would ostensibly seem to say that on my 64 bit architecture (and everyone else's) a plain int should have a 64 bit size; that's a size suggested by the architecture, right? However I must assert that the natural size for even 64 bit architecture is 32 bits. The quote in the specs is mainly there for cases where 16 bit plain ints is desired--which is the minimum size the specifications allow.
The largest factor is convention, going from a 32 bit architecture with a 32 bit plain int and adapting that source for a 64 bit architecture is simply easier if you keep it 32 bits, both for the designers and their users in two different ways:
The first is that less differences across systems there are the easier is for everyone. Discrepancies between systems been only headaches for most programmer: they only serve to make it harder to run code across systems. It'll even add on to the relatively rare cases where you're not able to do it across computers with the same distribution just 32 bit and 64 bit. However, as John Kugelman pointed out, architectures have gone from a 16 bit to 32 bit plain int, going through the hassle to do so could be done again today, which ties into his next point:
The more significant component is the gap it would cause in integer sizes or a new type to be required. Because sizeof(short) <= sizeof(int) <= sizeof(long) <= sizeof(long long) is in the actual specification, a gap is forced if the plain int is moved to 64 bits. It starts with shifting long. If a plain int is adjusted to 64 bits, the constraint that sizeof(int) <= sizeof(long) would force long to be at least 64 bits and from there there's an intrinsic gap in sizes. Since long or a plain int usually are used as a 32 bit integer and neither of them could now, we only have one more data type that could, short. Because short has a minimum of 16 bits if you simply discard that size it could become 32 bits and theoretically fill that gap, however short is intended to be optimized for space so it should be kept like that and there are use cases for small, 16 bit, integers as well. No matter how you arrange the sizes there is a loss of a width and therefore use case for an int entirely unavailable. A bigger width doesn't necessarily mean it's better.
This now would imply a requirement for the specifications to change, but even if a designer goes rogue, it's highly likely it'd be damaged or grow obsolete from the change. Designers for long lasting systems have to work with an entire base of entwined code, both their own in the system, dependencies, and user's code they'll want to run and a huge amount of work to do so without considering the repercussions is simply unwise.
As a side note, if your application is incompatible with a >32 bit integer, you can use static_assert(sizeof(int) * CHAR_BIT <= 32, "Int wider than 32 bits!");. However, who knows maybe the specifications will change and 64 bits plain ints will be implemented, so if you want to be future proof, don't do the static assert.
Main reason is backward compatibility. Moreover, there is already a 64 bit integer type long and same goes for float types: float and double. Changing the sizes of these basic types for different architectures will only introduce complexity. Moreover, 32 bit integer responds to many needs in terms of range.
The C + + standard does not say how much memory should be used for the int type, tells you how much memory should be used at least for the type int. In many programming environments on 32-bit pointer variables, "int" and "long" are all 32 bits long.
Since no one pointed this out yet.
int is guaranteed to be between -32767 to 32767(2^16) That's required by the standard. If you want to support 64 bit numbers on all platforms I suggest using the right type long long which supports (-9223372036854775807 to 9223372036854775807).
int is allowed to be anything so long as it provides the minimum range required by the standard.