Related
The C99 standard introduces the following datatypes. The documentation can be found here for the AVR stdint library.
uint8_t means it's an 8-bit unsigned type.
uint_fast8_t means it's the fastest unsigned int with at least 8
bits.
uint_least8_t means it's an unsigned int with at least 8 bits.
I understand uint8_t and what is uint_fast8_t( I don't know how it's implemented in register level).
1.Can you explain what is the meaning of "it's an unsigned int with at least 8 bits"?
2.How uint_fast8_t and uint_least8_t help increase efficiency/code space compared to the uint8_t?
uint_least8_t is the smallest type that has at least 8 bits.
uint_fast8_t is the fastest type that has at least 8 bits.
You can see the differences by imagining exotic architectures. Imagine a 20-bit architecture. Its unsigned int has 20 bits (one register), and its unsigned char has 10 bits. So sizeof(int) == 2, but using char types requires extra instructions to cut the registers in half. Then:
uint8_t: is undefined (no 8 bit type).
uint_least8_t: is unsigned char, the smallest type that is at least 8 bits.
uint_fast8_t: is unsigned int, because in my imaginary architecture, a half-register variable is slower than a full-register one.
uint8_t means: give me an unsigned int of exactly 8 bits.
uint_least8_t means: give me the smallest type of unsigned int which has at least 8 bits. Optimize for memory consumption.
uint_fast8_t means: give me an unsigned int of at least 8 bits. Pick a larger type if it will make my program faster, because of alignment considerations. Optimize for speed.
Also, unlike the plain int types, the signed version of the above stdint.h types are guaranteed to be 2's complement format.
The theory goes something like:
uint8_t is required to be exactly 8 bits but it's not required to exist. So you should use it where you are relying on the modulo-256 assignment behaviour* of an 8 bit integer and where you would prefer a compile failure to misbehaviour on obscure architectures.
uint_least8_t is required to be the smallest available unsigned integer type that can store at least 8 bits. You would use it when you want to minimise the memory use of things like large arrays.
uint_fast8_t is supposed to be the "fastest" unsigned type that can store at least 8 bits; however, it's not actually guaranteed to be the fastest for any given operation on any given processor. You would use it in processing code that performs lots of operations on the value.
The practice is that the "fast" and "least" types aren't used much.
The "least" types are only really useful if you care about portability to obscure architectures with CHAR_BIT != 8 which most people don't.
The problem with the "fast" types is that "fastest" is hard to pin down. A smaller type may mean less load on the memory/cache system but using a type that is smaller than native may require extra instructions. Furthermore which is best may change between architecture versions but implementers often want to avoid breaking ABI in such cases.
From looking at some popular implementations it seems that the definitions of uint_fastn_t are fairly arbitrary. glibc seems to define them as being at least the "native word size" of the system in question taking no account of the fact that many modern processors (especially 64-bit ones) have specific support for fast operations on items smaller than their native word size. IOS apparently defines them as equivalent to the fixed-size types. Other platforms may vary.
All in all if performance of tight code with tiny integers is your goal you should be bench-marking your code on the platforms you care about with different sized types to see what works best.
* Note that unfortunately modulo-256 assignment behaviour does not always imply modulo-256 arithmetic, thanks to C's integer promotion misfeature.
Some processors cannot operate as efficiently on smaller data types as on large ones. For example, given:
uint32_t foo(uint32_t x, uint8_t y)
{
x+=y;
y+=2;
x+=y;
y+=4;
x+=y;
y+=6;
x+=y;
return x;
}
if y were uint32_t a compiler for the ARM Cortex-M3 could simply generate
add r0,r0,r1,asl #2 ; x+=(y<<2)
add r0,r0,#12 ; x+=12
bx lr ; return x
but since y is uint8_t the compiler would have to instead generate:
add r0,r0,r1 ; x+=y
add r1,r1,#2 ; Compute y+2
and r1,r1,#255 ; y=(y+2) & 255
add r0,r0,r1 ; x+=y
add r1,r1,#4 ; Compute y+4
and r1,r1,#255 ; y=(y+4) & 255
add r0,r0,r1 ; x+=y
add r1,r1,#6 ; Compute y+6
and r1,r1,#255 ; y=(y+6) & 255
add r0,r0,r1 ; x+=y
bx lr ; return x
The intended purpose of the "fast" types was to allow compilers to replace smaller types which couldn't be processed efficiently with faster ones. Unfortunately, the semantics of "fast" types are rather poorly specified, which in turn leaves murky questions of whether expressions will be evaluated using signed or unsigned math.
1.Can you explain what is the meaning of "it's an unsigned int with at least 8 bits"?
That ought to be obvious. It means that it's an unsigned integer type, and that it's width is at least 8 bits. In effect this means that it can at least hold the numbers 0 through 255, and it can definitely not hold negative numbers, but it may be able to hold numbers higher than 255.
Obviously you should not use any of these types if you plan to store any number outside the range 0 through 255 (and you want it to be portable).
2.How uint_fast8_t and uint_least8_t help increase efficiency/code space compared to the uint8_t?
uint_fast8_t is required to be faster so you should use that if your requirement is that the code be fast. uint_least8_t on the other hand requires that there is no candidate of lesser size - so you would use that if size is the concern.
And of course you use only uint8_t when you absolutely require it to be exactly 8 bits. Using uint8_t may make the code non-portable as uint8_t is not required to exist (because such small integer type does not exist on certain platforms).
The "fast" integer types are defined to be the fastest integer available with at least the amount of bits required (in your case 8).
A platform can define uint_fast8_t as uint8_t then there will be absolutely no difference in speed.
The reason is that there are platforms that are slower when not using their native word length.
As the name suggests, uint_least8_t is the smallest type that has at least 8 bits, uint_fast8_t is the fastest type that has at least 8 bits. uint8_t has exactly 8 bits, but it is not guaranteed to exist on all platforms, although this is extremely uncommon.
In most case, uint_least8_t = uint_fast8_t = uint8_t = unsigned char. The only exception I have seen is the C2000 DSP from Texas Instruments, it is 32-bit, but its minimum data width is 16-bit. It does not have uint8_t, you can only use uint_least8_t and uint_fast8_t, they are defined as unsigned int, which is 16-bit.
I'm using the fast datatypes (uint_fast8_t) for local vars and function parameters, and using the normal ones (uint8_t) in arrays and structures which are used frequently and memory footprint is more important than the few cycles that could be saved by not having to clear or sign extend the upper bits.
Works great, except with MISRA checkers. They go nuts from the fast types. The trick is that the fast types are used through derived types that can be defined differently for MISRA builds and normal ones.
I think these types are great to create portable code, that's efficient on both low-end microcontrollers and big application processors. The improvement might be not huge, or totally negligible with good compilers, but better than nothing.
Some guessing in this thread.
"fast": The compiler should place "fast" type vars in IRAM (local processor RAM) which requires fewer cycles to access and write than vars stored in the hinterlands of RAM. "fast" is used if you need quickest possible action on a var, such as in an Interrupt Service Routine (ISR). Same as declaring a function to have an IRAM_ATTR; this == faster access. There is limited space for "fast" or IRAM vars/functions, so only use when needed, and never persist unless they qualify for that. Most compilers will move "fast" vars to general RAM if processor RAM is all allocated.
I run a gaming server that is coded C++, also a bit of ASM and C there. I saw someone had updated the same server I run and amongst all the updates was the fact that all int, unsigned, short and everything else he had changed to int32_t, uint32_t, uint64_t and the rest.
Is there any benefit in changing all of them to the above said ones? Lets say I changed all int to int32_t and all unsigned int to uint32_t and of course everything else thats possible to change.
I was trying to read and understand if there are any benefits but I simply didnt grasp the real meaning of them. So yeah, the question goes: Is there any benefit in doing what I just said?
The compiler I use is Orwell Dev-C++
The normal types, like int and unsigned int, have variable size depending on which platform you run. int32_t and uint32_t, however, is guaranteed to be 32 bits on any platform which have a 32-bit integral type. On platforms that don't, it doesn't exist. The size of a int may vary, usually its 32 or 64 bits long. The same rules apply to the other types, like int64_t is 64 bits.
To know the size of the data types is needed for instance in network programming because the network packets are sent between different platforms with different default integral sizes while the size of the data stored in the packets, like addresses and port numbers, are always the same. An IPv4-address is always 32 bits long, and one should use a data type which is guaranteed to be this size to store it.
In most programs, where you just store numbers, you should use the normal types. The size of these is the most efficient on the current platform.
uint32_t and int32_t add a little bit more control to the handling of your data types by having a fixed size and sign - IMHO the more predictability you have in your program, the better it is. You also can see on the typename whether it is unsigned or not which may not always be self-evident
With well defined size you also are more protected when doing portable code.
Actually, the default int and long are so vague and compiler/platform dependent that I for one try to avoid them. With int and long:
If you need to know the size of your struct, you need to know how many bytes for int and long, per architecture, per operative system, per compiler. It's a waste of precious brain.
If you need to explain to somebody the probability of having a collision in that hash table, would you say "it depends on the size of int, which depends on your architecture", or would you rather say, 1.0e-5? In other words, int and long "undefine" the properties of your program.
If you use int, and the compiler swears that that's 64 bits, the chances of it optimizing it to 32 or 16 are minimal. So you end up using much more memory if all you wanted was space for representing a short symbol... not that I know of many alphabets with 2^32 characters.
With more informative types, like uint32_t, or uint_least32_t, your compiler can perhaps make better assumptions and use 64 bits if it believes that that would bring better performance. I don't know that for sure. But you as human have better chances of understanding the range of values in which your program works well.
Additionally, all binary protocols likely need to specify the size and endianess of integers.
I have no problem with int32_t in some contexts, such as struct fields, but the
practice of writing all your int as int32_t, and unsigned as uint32_t (for portability?) is much overused; I think, in practical terms, you don't get any real portability benefit from this, and there are some significant downsides.
Most likely the code you are writing will only ever be compiled on a machine where int is 32 bits. If you are going to port it to a machine with 16-bit ints, then you generally have much bigger problems than a blanket typedef will solve.
If you do move to a 16-bit machine, you are probably going to want to change a great number of the local int32_t variables to int anyway since most of them are counts and indices, the 16-bit machine can't have more than 32K of most things, but people just wrote them all that way out of habit, and if you leave them as int32_t the code will be huge and slow.
int being 64 bits in the future sometime? Won't happen, for a number of reasons, most of them just pragmatic things.
On some weird machines int could be e.g. 24 bits. Very unusual, but again, you have bigger problems than a typedef will solve, and likely the first thing you want to do if porting to such a beast, is to change all the int32_t which are just counting things back to int.
So much for it not solving portability problems. What downside is there?
Certain library functions have int * parameters, e.g. frexp. You need to supply an address of an int variable, or it won't port, and int32_t may or may not be int even if int is 32 bits (see below). I've seen people write frexp( val, (int*)&myvar); just so they can write int32_t myvar; instead of int myvar;, which is terrible and can create a bug undetected by the compiler if int32_t is ever a different size from int.
Likewise printf( "%d", intvar); requires an int, not some typedef which happens to be the same size as int but might really be long, and gcc/clang issue warnings about this if int32_t is long.
On many platforms int32_t is long (32-bit long), even though int is also 32 bits, this is unfortunate (and I think it can be traced back to Microsoft needing to thunk things over 16/32/64 and deciding that long should be 32 bits forever and vice versa).
The uncertainty of whether int32_t is int or long can cause issues with c++ overloads, when porting code across different platforms.
So the reason for typedef:ed primitive data types is to abstract the low-level representation and make it easier to comprehend (uint64_t instead of long long type, which is 8 bytes).
However, there is uint_fast32_t which has the same typedef as uint32_t. Will using the "fast" version make the program faster?
int may be as small as 16 bits on some platforms. It may not be sufficient for your application.
uint32_t is not guaranteed to exist. It's an optional typedef that the implementation must provide iff it has an unsigned integer type of exactly 32-bits. Some have a 9-bit bytes for example, so they don't have a uint32_t.
uint_fast32_t states your intent clearly: it's a type of at least 32 bits which is the best from a performance point-of-view. uint_fast32_t may be in fact 64 bits long. It's up to the implementation.
There's also uint_least32_t in the mix. It designates the smallest type that's at least 32 bits long, thus it can be smaller than uint_fast32_t. It's an alternative to uint32_t if the later isn't supported by the platform.
... there is uint_fast32_t which has the same typedef as uint32_t ...
What you are looking at is not the standard. It's a particular implementation (BlackBerry). So you can't deduce from there that uint_fast32_t is always the same as uint32_t.
See also:
Exotic architectures the standards committees care about.
My pragmatic opinion about integer types in C and C++.
The difference lies in their exact-ness and availability.
The doc here says:
unsigned integer type with width of exactly 8, 16, 32 and 64 bits respectively (provided only if the implementation directly supports the type):
uint8_t
uint16_t
uint32_t
uint64_t
And
fastest unsigned unsigned integer type with width of at least 8, 16, 32 and 64 bits respectively
uint_fast8_t
uint_fast16_t
uint_fast32_t
uint_fast64_t
So the difference is pretty much clear that uint32_t is a type which has exactly 32 bits, and an implementation should provide it only if it has type with exactly 32 bits, and then it can typedef that type as uint32_t. This means, uint32_t may or may not be available.
On the other hand, uint_fast32_t is a type which has at least 32 bits, which also means, if an implementation may typedef uint32_t as uint_fast32_t if it provides uint32_t. If it doesn't provide uint32_t, then uint_fast32_t could be a typedef of any type which has at least 32 bits.
When you #include inttypes.h in your program, you get access to a bunch of different ways for representing integers.
The uint_fast*_t type simply defines the fastest type for representing a given number of bits.
Think about it this way: you define a variable of type short and use it several times in the program, which is totally valid. However, the system you're working on might work more quickly with values of type int. By defining a variable as type uint_fast*t, the computer simply chooses the most efficient representation that it can work with.
If there is no difference between these representations, then the system chooses whichever one it wants, and uses it consistently throughout.
Note that the fast version could be larger than 32 bits. While the fast int will fit nicely in a register and be aligned and the like: but, it will use more memory. If you have large arrays of these your program will be slower due to more memory cache hits and bandwidth.
I don't think modern CPUS will benefit from fast_int32, since generally the sign extending of 32 to 64 bit can happen during the load instruction and the idea that there is a 'native' integer format that is faster is old fashioned.
As I've learned recently, a long in C/C++ is the same length as an int. To put it simply, why? It seems almost pointless to even include the datatype in the language. Does it have any uses specific to it that an int doesn't have? I know we can declare a 64-bit int like so:
long long x = 0;
But why does the language choose to do it this way, rather than just making a long well...longer than an int? Other languages such as C# do this, so why not C/C++?
When writing in C or C++, every datatype is architecture and compiler specific. On one system int is 32, but you can find ones where it is 16 or 64; it's not defined, so it's up to compiler.
As for long and int, it comes from times, where standard integer was 16bit, where long was 32 bit integer - and it indeed was longer than int.
The specific guarantees are as follows:
char is at least 8 bits (1 byte by definition, however many bits it is)
short is at least 16 bits
int is at least 16 bits
long is at least 32 bits
long long (in versions of the language that support it) is at least 64 bits
Each type in the above list is at least as wide as the previous type (but may well be the same).
Thus it makes sense to use long if you need a type that's at least 32 bits, int if you need a type that's reasonably fast and at least 16 bits.
Actually, at least in C, these lower bounds are expressed in terms of ranges, not sizes. For example, the language requires that INT_MIN <= -32767, and INT_MAX >= +32767. The 16-bit requirements follows from this and from the requirement that integers are represented in binary.
C99 adds <stdint.h> and <inttypes.h>, which define types such as uint32_t, int_least32_t, and int_fast16_t; these are typedefs, usually defined as aliases for the predefined types.
(There isn't necessarily a direct relationship between size and range. An implementation could make int 32 bits, but with a range of only, say, -2**23 .. +2^23-1, with the other 8 bits (called padding bits) not contributing to the value. It's theoretically possible (but practically highly unlikely) that int could be larger than long, as long as long has at least as wide a range as int. In practice, few modern systems use padding bits, or even representations other than 2's-complement, but the standard still permits such oddities. You're more likely to encounter exotic features in embedded systems.)
long is not the same length as an int. According to the specification, long is at least as large as int. For example, on Linux x86_64 with GCC, sizeof(long) = 8, and sizeof(int) = 4.
long is not the same size as int, it is at least the same size as int. To quote the C++03 standard (3.9.1-2):
There are four signed integer types: “signed char”, “short int”,
“int”, and “long int.” In this list, each type provides at least as
much storage as those preceding it in the list. Plain ints have the
natural size suggested by the architecture of the execution
environment); the other signed integer types are provided to meet special needs.
My interpretation of this is "just use int, but if for some reason that doesn't fit your needs and you are lucky to find another integral type that's better suited, be our guest and use that one instead". One way that long might be better is if you 're on an architecture where it is... longer.
looking for something completely unrelated and stumbled across this and needed to answer. Yeah, this is old, so for people who surf on in later...
Frankly, I think all the answers on here are incomplete.
The size of a long is the size of the number of bits your processor can operate on at one time. It's also called a "word". A "half-word" is a short. A "doubleword" is a long long and is twice as large as a long (and originally was only implemented by vendors and not standard), and even bigger than a long long is a "quadword" which is twice the size of a long long but it had no formal name (and not really standard).
Now, where does the int come in? In part registers on your processor, and in part your OS. Your registers define the native sizes the CPU handles which in turn define the size of things like the short and long. Processors are also designed with a data size that is the most efficient size for it to operate on. That should be an int.
On todays 64bit machines you'd assume, since a long is a word and a word on a 64bit machine is 64bits, that a long would be 64bits and an int whatever the processor is designed to handle, but it might not be. Why? Your OS has chosen a data model and defined these data sizes for you (pretty much by how it's built). Ultimately, if you're on Windows (and using Win64) it's 32bits for both a long and int. Solaris and Linux use different definitions (the long is 64bits). These definitions are called things like ILP64, LP64, and LLP64. Windows uses LLP64 and Solaris and Linux use LP64:
Model ILP64 LP64 LLP64
int 64 32 32
long 64 64 32
pointer 64 64 64
long long 64 64 64
Where, e.g., ILP means int-long-pointer, and LLP means long-long-pointer
To get around this most compilers seem to support setting the size of an integer directly with types like int32 or int64.
I notice that modern C and C++ code seems to use size_t instead of int/unsigned int pretty much everywhere - from parameters for C string functions to the STL. I am curious as to the reason for this and the benefits it brings.
The size_t type is the unsigned integer type that is the result of the sizeof operator (and the offsetof operator), so it is guaranteed to be big enough to contain the size of the biggest object your system can handle (e.g., a static array of 8Gb).
The size_t type may be bigger than, equal to, or smaller than an unsigned int, and your compiler might make assumptions about it for optimization.
You may find more precise information in the C99 standard, section 7.17, a draft of which is available on the Internet in pdf format, or in the C11 standard, section 7.19, also available as a pdf draft.
Classic C (the early dialect of C described by Brian Kernighan and Dennis Ritchie in The C Programming Language, Prentice-Hall, 1978) didn't provide size_t. The C standards committee introduced size_t to eliminate a portability problem
Explained in detail at embedded.com (with a very good example)
In short, size_t is never negative, and it maximizes performance because it's typedef'd to be the unsigned integer type that's big enough -- but not too big -- to represent the size of the largest possible object on the target platform.
Sizes should never be negative, and indeed size_t is an unsigned type. Also, because size_t is unsigned, you can store numbers that are roughly twice as big as in the corresponding signed type, because we can use the sign bit to represent magnitude, like all the other bits in the unsigned integer. When we gain one more bit, we are multiplying the range of numbers we can represents by a factor of about two.
So, you ask, why not just use an unsigned int? It may not be able to hold big enough numbers. In an implementation where unsigned int is 32 bits, the biggest number it can represent is 4294967295. Some processors, such as the IP16L32, can copy objects larger than 4294967295 bytes.
So, you ask, why not use an unsigned long int? It exacts a performance toll on some platforms. Standard C requires that a long occupy at least 32 bits. An IP16L32 platform implements each 32-bit long as a pair of 16-bit words. Almost all 32-bit operators on these platforms require two instructions, if not more, because they work with the 32 bits in two 16-bit chunks. For example, moving a 32-bit long usually requires two machine instructions -- one to move each 16-bit chunk.
Using size_t avoids this performance toll. According to this fantastic article, "Type size_t is a typedef that's an alias for some unsigned integer type, typically unsigned int or unsigned long, but possibly even unsigned long long. Each Standard C implementation is supposed to choose the unsigned integer that's big enough--but no bigger than needed--to represent the size of the largest possible object on the target platform."
The size_t type is the type returned by the sizeof operator. It is an unsigned integer capable of expressing the size in bytes of any memory range supported on the host machine. It is (typically) related to ptrdiff_t in that ptrdiff_t is a signed integer value such that sizeof(ptrdiff_t) and sizeof(size_t) are equal.
When writing C code you should always use size_t whenever dealing with memory ranges.
The int type on the other hand is basically defined as the size of the (signed) integer value that the host machine can use to most efficiently perform integer arithmetic. For example, on many older PC type computers the value sizeof(size_t) would be 4 (bytes) but sizeof(int) would be 2 (byte). 16 bit arithmetic was faster than 32 bit arithmetic, though the CPU could handle a (logical) memory space of up to 4 GiB.
Use the int type only when you care about efficiency as its actual precision depends strongly on both compiler options and machine architecture. In particular the C standard specifies the following invariants: sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long) placing no other limitations on the actual representation of the precision available to the programmer for each of these primitive types.
Note: This is NOT the same as in Java (which actually specifies the bit precision for each of the types 'char', 'byte', 'short', 'int' and 'long').
Type size_t must be big enough to store the size of any possible object. Unsigned int doesn't have to satisfy that condition.
For example in 64 bit systems int and unsigned int may be 32 bit wide, but size_t must be big enough to store numbers bigger than 4G
This excerpt from the glibc manual 0.02 may also be relevant when researching the topic:
There is a potential problem with the size_t type and versions of GCC prior to release 2.4. ANSI C requires that size_t always be an unsigned type. For compatibility with existing systems' header files, GCC defines size_t in stddef.h' to be whatever type the system'ssys/types.h' defines it to be. Most Unix systems that define size_t in `sys/types.h', define it to be a signed type. Some code in the library depends on size_t being an unsigned type, and will not work correctly if it is signed.
The GNU C library code which expects size_t to be unsigned is correct. The definition of size_t as a signed type is incorrect. We plan that in version 2.4, GCC will always define size_t as an unsigned type, and the fixincludes' script will massage the system'ssys/types.h' so as not to conflict with this.
In the meantime, we work around this problem by telling GCC explicitly to use an unsigned type for size_t when compiling the GNU C library. `configure' will automatically detect what type GCC uses for size_t arrange to override it if necessary.
If my compiler is set to 32 bit, size_t is nothing other than a typedef for unsigned int. If my compiler is set to 64 bit, size_t is nothing other than a typedef for unsigned long long.
size_t is the size of a pointer.
So in 32 bits or the common ILP32 (integer, long, pointer) model size_t is 32 bits.
and in 64 bits or the common LP64 (long, pointer) model size_t is 64 bits (integers are still 32 bits).
There are other models but these are the ones that g++ use (at least by default)