Difference between uint8_t, uint_fast8_t and uint_least8_t - c++

The C99 standard introduces the following data types; documentation can be found in the AVR stdint library reference.
uint8_t means it's an 8-bit unsigned type.
uint_fast8_t means it's the fastest unsigned int with at least 8 bits.
uint_least8_t means it's an unsigned int with at least 8 bits.
I understand uint8_t, and I understand what uint_fast8_t is supposed to be (though I don't know how it is implemented at the register level).
1. Can you explain what "an unsigned int with at least 8 bits" means?
2. How do uint_fast8_t and uint_least8_t help increase efficiency/code space compared to uint8_t?

uint_least8_t is the smallest type that has at least 8 bits.
uint_fast8_t is the fastest type that has at least 8 bits.
You can see the differences by imagining exotic architectures. Imagine a 20-bit architecture. Its unsigned int has 20 bits (one register), and its unsigned char has 10 bits. So sizeof(int) == 2, but using char types requires extra instructions to cut the registers in half. Then:
uint8_t: is undefined (no 8 bit type).
uint_least8_t: is unsigned char, the smallest type that is at least 8 bits.
uint_fast8_t: is unsigned int, because in my imaginary architecture, a half-register variable is slower than a full-register one.
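To make that concrete, here is a hedged sketch of what the <stdint.h> typedefs might look like on that imaginary 20-bit machine (the exact choices are the implementation's, not mandated by the standard):

/* Hypothetical excerpt from <stdint.h> on the imaginary 20-bit machine:
   unsigned char is 10 bits, unsigned int is 20 bits (one register). */
typedef unsigned char uint_least8_t;  /* smallest type with at least 8 bits  */
typedef unsigned int  uint_fast8_t;   /* full register, so no masking needed */
/* uint8_t is simply not defined: no type with exactly 8 bits exists here. */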

uint8_t means: give me an unsigned int of exactly 8 bits.
uint_least8_t means: give me the smallest type of unsigned int which has at least 8 bits. Optimize for memory consumption.
uint_fast8_t means: give me an unsigned int of at least 8 bits. Pick a larger type if it will make my program faster, because of alignment considerations. Optimize for speed.
Also, unlike the plain int types, the signed versions of the above stdint.h types are guaranteed to use two's complement representation.

The theory goes something like:
uint8_t is required to be exactly 8 bits but it's not required to exist. So you should use it where you are relying on the modulo-256 assignment behaviour* of an 8 bit integer and where you would prefer a compile failure to misbehaviour on obscure architectures.
uint_least8_t is required to be the smallest available unsigned integer type that can store at least 8 bits. You would use it when you want to minimise the memory use of things like large arrays.
uint_fast8_t is supposed to be the "fastest" unsigned type that can store at least 8 bits; however, it's not actually guaranteed to be the fastest for any given operation on any given processor. You would use it in processing code that performs lots of operations on the value.
The practice is that the "fast" and "least" types aren't used much.
The "least" types are only really useful if you care about portability to obscure architectures with CHAR_BIT != 8 which most people don't.
The problem with the "fast" types is that "fastest" is hard to pin down. A smaller type may mean less load on the memory/cache system but using a type that is smaller than native may require extra instructions. Furthermore which is best may change between architecture versions but implementers often want to avoid breaking ABI in such cases.
From looking at some popular implementations it seems that the definitions of uint_fastn_t are fairly arbitrary. glibc seems to define them as being at least the "native word size" of the system in question, taking no account of the fact that many modern processors (especially 64-bit ones) have specific support for fast operations on items smaller than their native word size. iOS apparently defines them as equivalent to the fixed-size types. Other platforms may vary.
All in all, if performance of tight code with tiny integers is your goal, you should be benchmarking your code on the platforms you care about with different sized types to see what works best.
* Note that unfortunately modulo-256 assignment behaviour does not always imply modulo-256 arithmetic, thanks to C's integer promotion misfeature.
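A small sketch of that misfeature: the operands of + are promoted to int, so the wrap to modulo 256 happens only when the result is stored back into an 8-bit object.

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint8_t a = 200, b = 100;
    int wide = a + b;        /* operands promoted to int: wide == 300              */
    uint8_t narrow = a + b;  /* wraps only on assignment: narrow == 44 (300 % 256) */
    printf("%d %d\n", wide, narrow);
    return 0;
}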

Some processors cannot operate as efficiently on smaller data types as on large ones. For example, given:
uint32_t foo(uint32_t x, uint8_t y)
{
    x += y;
    y += 2;
    x += y;
    y += 4;
    x += y;
    y += 6;
    x += y;
    return x;
}
if y were uint32_t a compiler for the ARM Cortex-M3 could simply generate
add r0,r0,r1,asl #2 ; x += (y << 2)
add r0,r0,#20       ; x += 2 + 6 + 12
bx  lr              ; return x
but since y is uint8_t the compiler would have to instead generate:
add r0,r0,r1   ; x += y
add r1,r1,#2   ; y += 2
and r1,r1,#255 ; y &= 255 (wrap to 8 bits)
add r0,r0,r1   ; x += y
add r1,r1,#4   ; y += 4
and r1,r1,#255 ; y &= 255
add r0,r0,r1   ; x += y
add r1,r1,#6   ; y += 6
and r1,r1,#255 ; y &= 255
add r0,r0,r1   ; x += y
bx  lr         ; return x
The intended purpose of the "fast" types was to allow compilers to replace smaller types which couldn't be processed efficiently with faster ones. Unfortunately, the semantics of "fast" types are rather poorly specified, which in turn leaves murky questions of whether expressions will be evaluated using signed or unsigned math.

1. Can you explain what "an unsigned int with at least 8 bits" means?
That ought to be obvious. It means that it's an unsigned integer type, and that its width is at least 8 bits. In effect, this means that it can hold at least the numbers 0 through 255, it definitely cannot hold negative numbers, but it may be able to hold numbers higher than 255.
Obviously you should not use any of these types if you plan to store any number outside the range 0 through 255 (and you want it to be portable).
2. How do uint_fast8_t and uint_least8_t help increase efficiency/code space compared to uint8_t?
uint_fast8_t is meant to be the fastest such type, so you should use it if your requirement is that the code be fast. uint_least8_t, on the other hand, requires that there be no candidate of lesser size, so you would use it if size is the concern.
And of course you use uint8_t only when you absolutely require exactly 8 bits. Using uint8_t may make the code non-portable, as uint8_t is not required to exist (because such a small integer type does not exist on certain platforms).

The "fast" integer types are defined to be the fastest integer available with at least the amount of bits required (in your case 8).
A platform can define uint_fast8_t as uint8_t then there will be absolutely no difference in speed.
The reason is that there are platforms that are slower when not using their native word length.

As the name suggests, uint_least8_t is the smallest type that has at least 8 bits, and uint_fast8_t is the fastest type that has at least 8 bits. uint8_t has exactly 8 bits, but it is not guaranteed to exist on all platforms, although this is extremely uncommon.
In most cases, uint_least8_t = uint_fast8_t = uint8_t = unsigned char. The only exception I have seen is the C2000 DSP from Texas Instruments: it is 32-bit, but its minimum data width is 16 bits. It does not have uint8_t; you can only use uint_least8_t and uint_fast8_t, which are defined as unsigned int (16 bits there).

I use the fast data types (uint_fast8_t) for local variables and function parameters, and the normal ones (uint8_t) in arrays and structures that are used frequently, where the memory footprint matters more than the few cycles that could be saved by not having to clear or sign-extend the upper bits.
This works great, except with MISRA checkers, which go nuts over the fast types. The trick is that the fast types are used through derived types that can be defined differently for MISRA builds and normal ones.
I think these types are great for creating portable code that's efficient on both low-end microcontrollers and big application processors. The improvement might not be huge, or might be totally negligible with good compilers, but it's better than nothing.
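A minimal sketch of that convention, assuming nothing beyond <stdint.h> (the struct and function names are invented for illustration):

#include <stdint.h>
#include <stddef.h>

/* Storage: exact widths keep the array small and the layout predictable. */
struct Sample {
    uint8_t  channel;
    uint8_t  flags;
    uint16_t value;
};

/* Processing: locals and parameters use the fast variants, so the compiler
   may widen them to whatever the target handles most efficiently.          */
uint_fast32_t sum_channel(const struct Sample *samples, size_t count,
                          uint_fast8_t channel)
{
    uint_fast32_t total = 0;
    for (size_t i = 0; i < count; ++i) {
        if (samples[i].channel == channel) {
            total += samples[i].value;
        }
    }
    return total;
}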

Some guessing in this thread.
"fast": The compiler should place "fast" type vars in IRAM (local processor RAM) which requires fewer cycles to access and write than vars stored in the hinterlands of RAM. "fast" is used if you need quickest possible action on a var, such as in an Interrupt Service Routine (ISR). Same as declaring a function to have an IRAM_ATTR; this == faster access. There is limited space for "fast" or IRAM vars/functions, so only use when needed, and never persist unless they qualify for that. Most compilers will move "fast" vars to general RAM if processor RAM is all allocated.

Related

In new code, why would you use `int` instead of `int_fast16_t` or `int_fast32_t` for a counting variable?

If you need a counting variable, surely there must be an upper and a lower limit that your integer must support. So why wouldn't you specify those limits by choosing an appropriate (u)int_fastxx_t data type?
The simplest reason is that people are more used to int than the additional types introduced in C++11, and that it's the language's "default" integral type (so much as C++ has one); the standard specifies, in [basic.fundamental/2] that:
Plain ints have the natural size suggested by the architecture of the execution environment [46]; the other signed integer types are provided to meet special needs.
46) that is, large enough to contain any value in the range of INT_MIN and INT_MAX, as defined in the header <climits>.
Thus, whenever a generic integer is needed, which isn't required to have a specific range or size, programmers tend to just use int. While using other types can communicate intent more clearly (for example, using int8_t indicates that the value should never exceed 127), using int also communicates that these details aren't crucial to the task at hand, while simultaneously providing a little leeway to catch values that exceed your required range (if a system handles signed overflow with modulo arithmetic, for example, an int8_t would treat 313 as 57, making the invalid value harder to troubleshoot); typically, in modern programming, it either indicates that the value can be represented within the system's word size (which int is supposed to represent), or that the value can be represented within 32 bits (which is nearly always the size of int on x86 and x64 platforms).
Sized types also have the issue that the (theoretically) most well-known ones, the intX_t line, are only defined on platforms which support sizes of exactly X bits. While the int_leastX_t types are guaranteed to be defined on all platforms, and guaranteed to be at least X bits, a lot of people wouldn't want to type that much if they don't have to, since it adds up when you need to specify types often. [You can't use auto, either because it detects integer literals as ints. This can be mitigated by making user-defined literal operators, but that still takes more time to type.] Thus, they'll typically use int if it's safe to do so.
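To illustrate the bracketed point about literals, here is a sketch of a user-defined literal (the _l32 suffix is made up for this example):

#include <cstdint>

// With a plain literal, auto deduces int; a user-defined literal can force a
// sized type instead. The suffix name _l32 is purely illustrative.
constexpr std::int_least32_t operator""_l32(unsigned long long value)
{
    return static_cast<std::int_least32_t>(value);
}

auto a = 100000;      // deduced as int (on typical platforms)
auto b = 100000_l32;  // std::int_least32_t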
Or in short, int is intended to be the go-to type for normal operation, with the other types intended to be used in extranormal circumstances. Many programmers stick to this mindset out of habit, and only use sized types when they explicitly require specific ranges and/or sizes. This also communicates intent relatively well; int means "number", and intX_t means "number that always fits in X bits".
It doesn't help that int has evolved to unofficially mean "32-bit integer", due to both 32- and 64-bit platforms usually using 32-bit ints. It's very likely that many programmers expect int to always be at least 32 bits in the modern age, to the point where it can very easily bite them in the rear if they have to program for platforms that don't support 32-bit ints.
Conversely, the sized types are typically used when a specific range or size is explicitly required, such as when defining a struct that needs to have the same layout on systems with different data models. They can also prove useful when working with limited memory, using the smallest type that can fully contain the required range.
A struct intended to have the same layout on 16- and 32-bit systems, for example, would use either int16_t or int32_t instead of int, because int is 16 bits in most 16-bit data models and the LP32 32-bit data model (used by the Win16 API and Apple Macintoshes), but 32 bits in the ILP32 32-bit data model (used by the Win32 API and *nix systems, effectively making it the de facto "standard" 32-bit model).
Similarly, a struct intended to have the same layout on 32- and 64-bit systems would use int/int32_t or long long/int64_t over long, due to long having different sizes in different models (64 bits in LP64 (used by 64-bit *nix), 32 bits in LLP64 (used by Win64 API) and the 32-bit models).
Note that there is also a third 64-bit model, ILP64, where int is 64 bits; this model is very rarely used (to my knowledge, it was only used on early 64-bit Unix systems), but would mandate the use of a sized type over int if layout compatibility with ILP64 platforms is required.
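A short sketch of the layout-compatibility case (the field names are invented; the assert documents the expectation rather than guaranteeing it everywhere):

#include <cstdint>

// A record meant to have the same field sizes on 16-, 32- and 64-bit targets;
// with plain int/long the sizes would shift between data models.
struct FileHeader {
    std::uint32_t magic;
    std::uint16_t version;
    std::uint16_t flags;
    std::int64_t  timestamp;
};

static_assert(sizeof(FileHeader) == 16, "unexpected padding or field sizes");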
There are several reasons. One, these long names make the code less readable. Two, you might introduce really hard-to-find bugs. Say you used int_fast16_t but you really need to count up to 40,000. The implementation might use 32 bits and the code works just fine. Then you try to run the code on an implementation that uses 16 bits and you get hard-to-find bugs.
A note: in C/C++ you have the types char, short, int, long and long long, which between them must cover 8 to 64 bits, so int cannot be 64 bits (because char and short alone cannot cover 8, 16 and 32 bits), even if 64 bits is the natural word size. In Swift, for example, Int is the natural integer size, either 32 or 64 bits, and you have Int8, Int16, Int32 and Int64 for explicit sizes. Int is the best type unless you absolutely need 64 bits, in which case you use Int64, or unless you need to save space.

When to use `short` over `int`?

There are many questions that ask about the difference between the short and int integer types in C++, but practically speaking, when do you choose short over int?
(See Eric's answer for more detailed explanation)
Notes:
Generally, int is set to the 'natural size' - the integer form that the hardware handles most efficiently
When using short in an array or in arithmetic operations, the short integer is converted into int, and so this can introduce a hit on the speed in processing short integers
Using short can conserve memory if it is narrower than int, which can be important when using a large array
Your program will use more memory in a 32-bit int system compared to a 16-bit int system
Conclusion:
Use int unless conserving memory is critical, or your program uses a lot of memory (e.g. many arrays). In that case, use short.
You choose short over int when:
Either
You want to decrease the memory footprint of the values you're storing (for instance, if you're targeting a low-memory platform),
You want to increase performance by increasing the number of values that can be packed into a single memory page (reducing page faults when accessing your values) and/or into the memory caches (reducing cache misses when accessing values), and profiling has revealed that there are performance gains to be had here,
Or you are sending data over a network or storing it to disk, and want to decrease your footprint (to take up less disk space or network bandwidth). Although for these cases, you should prefer types which specify exactly the size in bits rather than int or short, which can vary based on platform (as you want a platform with a 32-bit short to be able to read a file written on a platform with a 16-bit short). Good candidates are the types defined in stdint.h.
And:
You have a numeric value which does not need to take on any values that can't be stored in a short on your target platform (for a 16-bit short, this is -32768 to 32767, or 0 to 65535 for a 16-bit unsigned short).
Your target platform (or one of your target platforms) uses less memory for a short than for an int. The standard only guarantees that short is not larger than int, so implementations are allowed to use the same size for a short and for an int.
Note:
chars can also be used as arithmetic types. An answer to "When should I use char instead of short or int?" would read very similarly to this one, but with different numbers (-128 to 127 for an 8-bit char, 0 to 255 for an 8-bit unsigned char).
In reality, you likely don't actually want to use the short type specifically. If you want an integer of specific size, there are types defined in <cstdint> that should be preferred, as, for example, an int16_t will be 16 bits on every system, whereas you cannot guarantee the size of a short will be the same across all targets your code will be compiled for.
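For instance (a sketch; the array size is arbitrary and the sizes shown assume the usual 8-bit bytes):

#include <cstdint>
#include <cstdio>

int main()
{
    // One million samples: about 2 MB as int16_t versus about 4 MB if
    // they were stored as 32-bit ints.
    static std::int16_t samples[1000000];
    std::printf("%zu bytes\n", sizeof(samples));  // 2000000 on typical targets
    return 0;
}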
In general, you don't prefer short over int.
The int type is the processor's native word size
Usually, an int is the processor's word size.
For example, with a 32-bit word size processor, an int would be 32 bits. The processor is most efficient using 32-bits. Assuming that short is 16-bit, the processor still fetches 32-bits from memory. So no efficiency here; actually it's longer because the processor may have to shift the bits to be placed in the correct position in a 32-bit word.
Choosing a smaller data type
There are standardized data types that are bit specific in length, such as uint16_t. These are preferred to the ambiguous types of char, short, and int. These width specific data types are usually used for accessing hardware, or compressing space (such as message protocols).
Choosing a smaller range
The short data type is based on range not bit width. On a 32-bit system, both short and int may have the same 32-bit length.
One reason for using short is that the values will never go past a given range. This is usually a fallacy, because programs change and the data type could overflow.
Summary
Presently, I do not use short anymore. I use uint16_t when I access 16-bit hardware devices. I use unsigned int for quantities, including loop indices. I use uint8_t, uint16_t and uint32_t when size matters for data storage. The short data type is ambiguous for data storage, since it is a minimum. With the advent of stdint header files, there is no longer any need for short.
If you don't have any specific constraints imposed by your architecture, I would say you can always use int. The type short is meant for specific systems where memory is a precious resource.

C++ Does converting long, short and all ints to uint32_t, int32_t and so forth help at all?

I run a gaming server that is coded in C++, with a bit of ASM and C in there too. I saw that someone had updated the same server I run, and among all the updates was the fact that every int, unsigned, short and everything else had been changed to int32_t, uint32_t, uint64_t and the rest.
Is there any benefit in changing all of them to the types above? Let's say I changed every int to int32_t and every unsigned int to uint32_t, and of course everything else that's possible to change.
I was trying to read up on and understand whether there are any benefits, but I simply didn't grasp the real meaning of them. So yeah, the question goes: is there any benefit in doing what I just said?
The compiler I use is Orwell Dev-C++
The normal types, like int and unsigned int, have a variable size depending on which platform you run. int32_t and uint32_t, however, are guaranteed to be 32 bits on any platform that has a 32-bit integral type; on platforms that don't, they don't exist. The size of an int may vary; usually it's 32 or 64 bits long. The same rules apply to the other types, e.g. int64_t is 64 bits.
Knowing the size of the data types is needed, for instance, in network programming, because network packets are sent between platforms with different default integer sizes, while the size of the data stored in the packets, like addresses and port numbers, is always the same. An IPv4 address is always 32 bits long, and one should use a data type that is guaranteed to be this size to store it.
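A rough sketch of that use case, assuming the POSIX byte-order helpers from <arpa/inet.h> (the struct and field names are invented):

#include <cstdint>
#include <arpa/inet.h>  // htonl/htons (POSIX)

// Wire format: the field widths are fixed regardless of the host's int size.
struct PacketHeader {
    std::uint32_t source_addr;  // IPv4 address, always 32 bits
    std::uint16_t source_port;  // always 16 bits
    std::uint16_t length;
};

PacketHeader make_header(std::uint32_t addr, std::uint16_t port, std::uint16_t len)
{
    // Convert to network byte order before the header goes on the wire.
    return PacketHeader{htonl(addr), htons(port), htons(len)};
}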
In most programs, where you just store numbers, you should use the normal types. The size of these is the most efficient on the current platform.
uint32_t and int32_t add a little bit more control to the handling of your data types by having a fixed size and sign; IMHO, the more predictability you have in your program, the better. You can also see from the type name whether it is unsigned or not, which may not always be self-evident.
With a well-defined size, you are also better protected when writing portable code.
Actually, the default int and long are so vague and compiler/platform dependent that I for one try to avoid them. With int and long:
If you need to know the size of your struct, you need to know how many bytes an int and a long take, per architecture, per operating system, per compiler. It's a waste of precious brain.
If you need to explain to somebody the probability of having a collision in that hash table, would you say "it depends on the size of int, which depends on your architecture", or would you rather say, 1.0e-5? In other words, int and long "undefine" the properties of your program.
If you use int, and the compiler swears that that's 64 bits, the chances of it optimizing it to 32 or 16 are minimal. So you end up using much more memory if all you wanted was space for representing a short symbol... not that I know of many alphabets with 2^32 characters.
With more informative types, like uint32_t, or uint_least32_t, your compiler can perhaps make better assumptions and use 64 bits if it believes that that would bring better performance. I don't know that for sure. But you as human have better chances of understanding the range of values in which your program works well.
Additionally, all binary protocols likely need to specify the size and endianness of integers.
I have no problem with int32_t in some contexts, such as struct fields, but the practice of writing all your int as int32_t, and unsigned as uint32_t (for portability?) is much overused; I think, in practical terms, you don't get any real portability benefit from this, and there are some significant downsides.
Most likely the code you are writing will only ever be compiled on a machine where int is 32 bits. If you are going to port it to a machine with 16-bit ints, then you generally have much bigger problems than a blanket typedef will solve.
If you do move to a 16-bit machine, you are probably going to want to change a great number of the local int32_t variables to int anyway, since most of them are counts and indices and a 16-bit machine can't have more than 32K of most things. People just wrote them all that way out of habit, and if you leave them as int32_t the code will be huge and slow.
int being 64 bits in the future sometime? Won't happen, for a number of reasons, most of them just pragmatic things.
On some weird machines int could be e.g. 24 bits. Very unusual, but again, you have bigger problems than a typedef will solve, and likely the first thing you want to do if porting to such a beast, is to change all the int32_t which are just counting things back to int.
So much for it not solving portability problems. What downside is there?
Certain library functions have int * parameters, e.g. frexp. You need to supply an address of an int variable, or it won't port, and int32_t may or may not be int even if int is 32 bits (see below). I've seen people write frexp( val, (int*)&myvar); just so they can write int32_t myvar; instead of int myvar;, which is terrible and can create a bug undetected by the compiler if int32_t is ever a different size from int.
Likewise printf( "%d", intvar); requires an int, not some typedef which happens to be the same size as int but might really be long, and gcc/clang issue warnings about this if int32_t is long.
On many platforms int32_t is long (32-bit long), even though int is also 32 bits, this is unfortunate (and I think it can be traced back to Microsoft needing to thunk things over 16/32/64 and deciding that long should be 32 bits forever and vice versa).
The uncertainty of whether int32_t is int or long can cause issues with c++ overloads, when porting code across different platforms.

What is uint_fast32_t and why should it be used instead of the regular int and uint32_t?

So the reason for typedef'd primitive data types is to abstract the low-level representation and make it easier to comprehend (uint64_t instead of a long long type, which is 8 bytes).
However, there is also uint_fast32_t, which is typedef'd to the same type as uint32_t. Will using the "fast" version make the program faster?
int may be as small as 16 bits on some platforms. It may not be sufficient for your application.
uint32_t is not guaranteed to exist. It's an optional typedef that the implementation must provide iff it has an unsigned integer type of exactly 32 bits. Some platforms have 9-bit bytes, for example, so they don't have a uint32_t.
uint_fast32_t states your intent clearly: it's a type of at least 32 bits which is the best from a performance point-of-view. uint_fast32_t may be in fact 64 bits long. It's up to the implementation.
There's also uint_least32_t in the mix. It designates the smallest type that's at least 32 bits long, thus it can be smaller than uint_fast32_t. It's an alternative to uint32_t if the latter isn't supported by the platform.
... there is uint_fast32_t which has the same typedef as uint32_t ...
What you are looking at is not the standard. It's a particular implementation (BlackBerry). So you can't deduce from there that uint_fast32_t is always the same as uint32_t.
See also:
Exotic architectures the standards committees care about.
My pragmatic opinion about integer types in C and C++.
The difference lies in their exactness and availability.
The documentation says:
unsigned integer type with width of exactly 8, 16, 32 and 64 bits respectively (provided only if the implementation directly supports the type):
uint8_t
uint16_t
uint32_t
uint64_t
And
fastest unsigned integer type with width of at least 8, 16, 32 and 64 bits respectively
uint_fast8_t
uint_fast16_t
uint_fast32_t
uint_fast64_t
So the difference is clear: uint32_t is a type that has exactly 32 bits, and an implementation should provide it only if it has a type with exactly 32 bits, which it can then typedef as uint32_t. This means uint32_t may or may not be available.
On the other hand, uint_fast32_t is a type that has at least 32 bits, which also means an implementation may typedef uint32_t as uint_fast32_t if it provides uint32_t. If it doesn't provide uint32_t, then uint_fast32_t can be a typedef of any type that has at least 32 bits.
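A quick way to see what your own implementation chose (the output varies; as a point of reference, one common 64-bit Linux/glibc setup prints 4, 8 and 4):

#include <cstdint>
#include <cstdio>

int main()
{
    std::printf("uint32_t:       %zu\n", sizeof(std::uint32_t));
    std::printf("uint_fast32_t:  %zu\n", sizeof(std::uint_fast32_t));
    std::printf("uint_least32_t: %zu\n", sizeof(std::uint_least32_t));
    return 0;
}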
When you #include <inttypes.h> in your program, you get access to a bunch of different ways of representing integers.
The uint_fast*_t types simply name the fastest type for representing a given number of bits.
Think about it this way: you can define a variable of type short and use it several times in the program, which is totally valid. However, the system you're working on might work more quickly with values of type int. By defining a variable as a uint_fast*_t type, the computer simply chooses the most efficient representation that it can work with.
If there is no difference between these representations, then the system chooses whichever one it wants, and uses it consistently throughout.
Note that the fast version could be larger than 32 bits. While the fast int will fit nicely in a register and be aligned and the like, it will use more memory. If you have large arrays of these, your program will be slower due to more cache misses and higher memory bandwidth use.
I don't think modern CPUs will benefit from int_fast32_t, since generally the sign extension from 32 to 64 bits can happen during the load instruction, and the idea that there is a 'native' integer format that is faster is old-fashioned.

Usage of 'short' in C++

Why is it that for any numeric input we prefer an int rather than a short, even if the input values are small?
On my x86 machine short is 2 bytes and int is 4 bytes; shouldn't a short be better and faster to allocate than an int?
Or am I wrong in saying that short is not used?
CPUs are usually fastest when dealing with their "native" integer size. So even though a short may be smaller than an int, the int is probably closer to the native size of a register in your CPU, and therefore is likely to be the most efficient of the two.
In a typical 32-bit CPU architecture, to load a 32-bit value requires one bus cycle to load all the bits. Loading a 16-bit value requires one bus cycle to load the bits, plus throwing half of them away (this operation may still happen within one bus cycle).
A 16-bit short makes sense if you're keeping so many in memory (in a large array, for example) that the 50% reduction in size adds up to an appreciable reduction in memory overhead. They are not faster than 32-bit integers on modern processors, as Greg correctly pointed out.
In embedded systems, the short and unsigned short data types are used for accessing items that require less bits than the native integer.
For example, if my USB controller has 16 bit registers, and my processor has a native 32 bit integer, I would use an unsigned short to access the registers (provided that the unsigned short data type is 16-bits).
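A hedged sketch of that kind of access, with uint16_t playing the role of the 16-bit unsigned short (the register name and address are made up):

#include <stdint.h>

/* Hypothetical 16-bit status register of a USB controller. */
#define USB_STATUS_REG (*(volatile uint16_t *)0x40005C44u)

void acknowledge_interrupt(void)
{
    uint16_t status = USB_STATUS_REG;   /* 16-bit read from the device */
    USB_STATUS_REG = status | 0x0001u;  /* 16-bit write back           */
}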
Most of the advice from experienced users (see news:comp.lang.c++.moderated) is to use the native integer size unless a smaller data type must be used. The problem with using short to save memory is that the values may exceed the limits of short. Also, this may be a performance hit on some 32-bit processors, as they have to fetch 32 bits near the 16-bit variable and eliminate the unwanted 16 bits.
My advice is to work on the quality of your programs first, and only worry about optimization if it is warranted and you have extra time in your schedule.
Using type short does not guarantee that the actual values will be smaller than those of type int. It allows for them to be smaller, and ensures that they are no bigger. Note too that short must be larger than or equal in size to type char.
The original question above contains actual sizes for the processor in question, but when porting code to a new environment, one can only rely on weak relative assumptions without verifying the implementation-defined sizes.
The C header <stdint.h> -- or, from C++, <cstdint> -- defines types of specified size, such as uint8_t for an unsigned integral type exactly eight bits wide. Use these types when attempting to conform to an externally-specified format such as a network protocol or binary file format.
The short type is very useful if you have a big array full of them and int is just way too big.
Given that the array is big enough, the memory saving will be important (instead of just using an array of ints).
Unicode arrays are also encoded in shorts (although other encoding schemes exist).
On embedded devices, space still matters and short might be very beneficial.
Last but not least, some transmission protocols insist on using shorts, so you still need them there.
Maybe we should consider this in different situations. For example, on x86 or x64 you should choose the most suitable type rather than just defaulting to int; in some cases int is faster than short. The top answer above has already covered this.