<stdint.h> or standard types? - c++

Which types should I use when programming C++ on Linux? Is it good idea to use types from stdint.h, such as int16_t and uint8_t?
On one hand, surely stdint.h won't be available when programming on Windows. On the other, the size of e.g. short isn't clear at first glance. And it's even more intuitive to write int8_t instead of char...
Does the C++ standard guarantee that the sizes of the standard types will remain unchanged in the future?

First off, Microsoft's implementation does support <stdint.h>.
Use the appropriate type for what you're doing.
If you need, for example, an unsigned type that's exactly 16 bits wide with no padding bits, use uint16_t, defined in <stdint.h>.
If you need an unsigned type that's at least 16 bits wide, you can use uint_least16_t, or uint_fast16_t, or short, or int.
You probably don't need exact-width types as often as you think you do. Very often what matters is not the exact size of a type, but the range of values it supports. But exact representation is important when you're interfacing to some externally defined data format. In that case, you should already have declarations that tell you what types to use.
There are specific requirements on the ranges of the predefined types: char is at least 8 bits, short and int are at least 16 bits, long is at least 32 bits, and long long is at least 64 bits. Also, short is at least as wide as char, int is at least as wide as short, and so forth. (The standard specifies minimum ranges, but the minimum sizes can be derived from the ranges and the fact that a binary representation is required.)
Note that <stdint.h> is a C header. If you #include it in a C++ program, the type names will be imported directly into the global namespace, and may or may not also be imported into the std namespace. If you #include <cstdint>, then the type names will be imported into the std namespace, and may or may not also be imported into the global namespace. Macro names such as UINT32_MAX are not in any namespace; they're always global. You can use either version of the header; just be consistent about using or not using the std:: prefix.
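For illustration, a minimal sketch of the two include styles (the variable name is made up; whether the unqualified names are also visible is exactly the "may or may not" case described above):
#include <cstdint>   // type names guaranteed in namespace std

std::uint16_t port = 8080;   // always valid with <cstdint>
// uint16_t port2 = 8080;    // often compiles too, but not guaranteed by the standard
static_assert(UINT16_MAX == 65535, "limit macros are global, no std:: prefix");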

The C++ standard does not specify much about the sizes of integer types (such as int, long or char). If you want to be sure that a certain type has a fixed size across platforms, you can use C++11's fixed-width integer types, which are standardized and guaranteed to have the given size.
To use them, #include <cstdint>.
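For example, a small sketch (the variable names are invented; the static_assert is just a sanity check that holds on any platform where the type exists at all):
#include <climits>
#include <cstdint>

std::int16_t sample = -1234;   // exactly 16 bits, two's complement, no padding bits
std::uint8_t flags  = 0xFF;    // exactly 8 bits, unsigned
static_assert(sizeof(std::int32_t) * CHAR_BIT == 32, "int32_t is exactly 32 bits wide");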
Does the C++ standard guarantee that the sizes of the standard types will remain unchanged in the future?
Not likely. On 8-bit computers, the sizes of the integer types were different from what they are today. In the future, in 2042, with 1024-bit computers, I would expect long long to be 1024 bits wide.
However, we can be almost absolutely sure that std::uint32_t will stay 32 bits wide.

Related

In new code, why would you use `int` instead of `int_fast16_t` or `int_fast32_t` for a counting variable?

If you need a counting variable, surely there must be an upper and a lower limit that your integer must support. So why wouldn't you specify those limits by choosing an appropriate (u)int_fastxx_t data type?
The simplest reason is that people are more used to int than the additional types introduced in C++11, and that it's the language's "default" integral type (insofar as C++ has one); the standard specifies, in [basic.fundamental]/2, that:
Plain ints have the natural size suggested by the architecture of the execution environment46; the other signed integer types are provided to meet special needs.
46) that is, large enough to contain any value in the range of INT_MIN and INT_MAX, as defined in the header <climits>.
Thus, whenever a generic integer is needed, which isn't required to have a specific range or size, programmers tend to just use int. While using other types can communicate intent more clearly (for example, using int8_t indicates that the value should never exceed 127), using int also communicates that these details aren't crucial to the task at hand, while simultaneously providing a little leeway to catch values that exceed your required range (if a system handles signed overflow with modulo arithmetic, for example, an int8_t would treat 313 as 57, making the invalid value harder to troubleshoot). Typically, in modern programming, int either indicates that the value can be represented within the system's word size (which int is supposed to represent), or that the value can be represented within 32 bits (which is nearly always the size of int on x86 and x64 platforms).
Sized types also have the issue that the (theoretically) most well-known ones, the intX_t line, are only defined on platforms which support sizes of exactly X bits. While the int_leastX_t types are guaranteed to be defined on all platforms, and guaranteed to be at least X bits, a lot of people wouldn't want to type that much if they don't have to, since it adds up when you need to specify types often. [You can't use auto either, because it deduces integer literals as int. This can be mitigated by making user-defined literal operators, as sketched below, but that still takes more time to type.] Thus, they'll typically use int if it's safe to do so.
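As a sketch of that user-defined-literal workaround (the _l32 suffix is an invented name, not a standard one):
#include <cstdint>

constexpr std::int_least32_t operator""_l32(unsigned long long v)
{
    return static_cast<std::int_least32_t>(v);   // narrows to an at-least-32-bit type
}

auto counter = 0_l32;   // deduced as std::int_least32_t instead of plain int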
Or in short, int is intended to be the go-to type for normal operation, with the other types intended to be used in extranormal circumstances. Many programmers stick to this mindset out of habit, and only use sized types when they explicitly require specific ranges and/or sizes. This also communicates intent relatively well; int means "number", and intX_t means "number that always fits in X bits".
It doesn't help that int has evolved to unofficially mean "32-bit integer", due to both 32- and 64-bit platforms usually using 32-bit ints. It's very likely that many programmers expect int to always be at least 32 bits in the modern age, to the point where it can very easily bite them in the rear if they have to program for platforms that don't support 32-bit ints.
Conversely, the sized types are typically used when a specific range or size is explicitly required, such as when defining a struct that needs to have the same layout on systems with different data models. They can also prove useful when working with limited memory, using the smallest type that can fully contain the required range.
A struct intended to have the same layout on 16- and 32-bit systems, for example, would use either int16_t or int32_t instead of int, because int is 16 bits in most 16-bit data models and the LP32 32-bit data model (used by the Win16 API and Apple Macintoshes), but 32 bits in the ILP32 32-bit data model (used by the Win32 API and *nix systems, effectively making it the de facto "standard" 32-bit model).
Similarly, a struct intended to have the same layout on 32- and 64-bit systems would use int/int32_t or long long/int64_t over long, due to long having different sizes in different models (64 bits in LP64 (used by 64-bit *nix), 32 bits in LLP64 (used by Win64 API) and the 32-bit models).
Note that there is also a third 64-bit model, ILP64, where int is 64 bits; this model is very rarely used (to my knowledge, it was only used on early 64-bit Unix systems), but would mandate the use of a sized type over int if layout compatibility with ILP64 platforms is required.
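For instance, a sketch of such a layout-sensitive struct (the field names are invented; padding and alignment can still differ between compilers, so this is not a complete serialization strategy by itself):
#include <cstdint>

struct FileHeader {
    std::uint32_t magic;     // 32 bits in every data model
    std::int32_t  offset;    // 'long' here would be 32 or 64 bits depending on the model
    std::uint16_t version;   // 'int' here would be 16 or 32 bits depending on the model
};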
There are several reasons. One, these long names make the code less readable. Two, you might introduce really hard-to-find bugs. Say you used int_fast16_t but you really need to count up to 40,000. The implementation might use 32 bits and the code will work just fine. Then you try to run the code on an implementation that uses 16 bits and you get hard-to-find bugs.
A note: In C / C++ you have the types char, short, int, long and long long, which together must cover 8 to 64 bits, so int cannot be 64 bits (because then char and short alone would have to cover 8, 16 and 32 bits), even if 64 bits is the natural word size. In Swift, for example, Int is the natural integer size, either 32 or 64 bits, and you have Int8, Int16, Int32 and Int64 for explicit sizes. Int is the best type unless you absolutely need 64 bits, in which case you use Int64, or if you need to save space.

Uses and when to use int16_t, int32_t, int64_t and respectively short int, int, long int, long

Uses and when to use int16_t, int32_t, int64_t and respectively short, int, long.
There are too many damn types in C++. For integers when is it correct to use one over the other?
Use the well-defined types when the precision is important. Use the less-determinate ones when it is not. It's never wrong to use the more precise ones. It sometimes leads to bugs when you use the flexible ones.
Use the exact-width types when you actually need an exact width. For example, int32_t is guaranteed to be exactly 32 bits wide, with no padding bits, and with a two's-complement representation. If you need all those requirements (perhaps because they're imposed by an external data format), use int32_t. Likewise for the other [u]intN_t types.
If you merely need a signed integer type of at least 32 bits, use int_least32_t or int_fast32_t, depending on whether you want to optimize for size or speed. (They're likely to be the same type.)
Use the predefined types short, int, long, et al when they're good enough for your purposes and you don't want to use the longer names. short and int are both guaranteed to be at least 16 bits, long at least 32 bits, and long long at least 64 bits. int is normally the "natural" integer type suggested by the system's architecture; you can think of it as int_fast16_t, and long as int_fast32_t, though they're not guaranteed to be the same.
I haven't given firm criteria for using the built-in vs. the [u]int_leastN_t and [u]int_fastN_t types because, frankly, there are no such criteria. If the choice isn't imposed by the API you're using or by your organization's coding standard, it's really a matter of personal taste. Just try to be consistent.
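If you are curious what these choices come out to on a particular implementation, a quick sketch that just prints the sizes (the output varies by platform, which is rather the point):
#include <cstdint>
#include <iostream>

int main()
{
    std::cout << "int:           " << sizeof(int) << " bytes\n";
    std::cout << "long:          " << sizeof(long) << " bytes\n";
    std::cout << "int_least32_t: " << sizeof(std::int_least32_t) << " bytes\n";
    std::cout << "int_fast32_t:  " << sizeof(std::int_fast32_t) << " bytes\n";
    // A common result on x86-64 Linux is 4, 8, 4, 8, but none of that is guaranteed.
}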
This is a good question, but hard to answer.
In one line: it depends on the context.
My rule of thumb:
I'd prefer code performance (speed: less time first, then less complexity).
When using an existing library, I'd follow the library's coding style (context).
When coding in a team, I'd follow the team's coding style (context).
When coding new things, I'd use int16_t, int32_t, int64_t, ... whenever possible.
Explanation:
Using int (which is the system word size) gives you performance in some contexts, but not in others.
I'd use uint64_t over unsigned long long because it is more concise, whenever possible.
So it depends on the context.
A use that I have found for them is when I am bitpacking data for, say, an image compressor. Using these types that precisely specify the number of bytes can save a lot of headache, since the C++ standard does not explicitly define the number of bytes in its types, only the MIN and MAX ranges.
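As a sketch of that kind of bit-packing (the function is illustrative, not from any particular compressor):
#include <cstdint>

// Pack four 8-bit channel values into one 32-bit word.
std::uint32_t pack_rgba(std::uint8_t r, std::uint8_t g, std::uint8_t b, std::uint8_t a)
{
    return static_cast<std::uint32_t>(r)
         | static_cast<std::uint32_t>(g) << 8
         | static_cast<std::uint32_t>(b) << 16
         | static_cast<std::uint32_t>(a) << 24;
}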
In MISRA-C 2004 and MISRA-C++ 2008 guidelines, the advisory is to prefer specific-length typedefs:
typedefs that indicate size and signedness should be used in place of the basic numerical types. [...] This rule helps to clarify the size of the storage, but does not guarantee portability because of the asymmetric behaviour of integral promotion. [...]
An exception is made for the plain char type:
The plain char type shall be used only for the storage and use of character values.
However, keep in mind that the MISRA guidelines are for critical systems.
Personally, I follow these guidelines for embedded systems, not for computer applications where I simply use an int when I want an integer, letting the compiler optimize as it wants.

Will int32 be the same size on all platforms?

I'm developing a multi platform app (iOS, Android, etc), using C++.
Are there base types in the C++ standard which are guaranteed to be a fixed width, and portable across multiple platforms?
I'm looking for fixed-width types such as Int32, UInt32, Int16, UInt16, Float32, etc.
int32 is a custom typedef; only int exists by default. If you need a specified width, take a look at <stdint.h>.
#include <cstdint>
int32_t integer32bits;
I don't think any floating-point counterpart exists in the standard; correct me if I'm wrong.
Floats are almost always 32 bit except on some obscure platforms that do not comply with IEEE 754. You don't need to bother with those, in all likelihood. Integer types may vary, but if your target platform has a C++11-compliant compiler, then you can use the cstdint header to access types of a specific size in a standard way. If you can't use C++11, then you will need separate code for each platform, most likely.
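If you'd rather verify the IEEE 754 assumption than rely on it, a small sketch using <limits>:
#include <limits>

static_assert(std::numeric_limits<float>::is_iec559,  "float is not IEEE 754 binary32");
static_assert(sizeof(float) == 4,                      "float is not 32 bits wide");
static_assert(std::numeric_limits<double>::is_iec559, "double is not IEEE 754 binary64");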
The definitions in <stdint.h>, or <cstdint> can be used for portability:
int32_t is guaranteed to be a typedef for a signed 32-bit type, or not to exist at all. Since this is C++, you can use enable_if to decide on a course of action.
int_least32_t is a typedef for the smallest type that has at least 32 bits.
int_fast32_t is a typedef for a type that has at least 32 bits and can be operated on efficiently (e.g. if the memory bus is 64 bits wide and allows no partial stores, it is faster to use a 64-bit type and waste memory rather than perform read-modify-write accesses).
See also The difference of int8_t, int_least8_t and int_fast8_t.
Note that different systems can also have different endianness, so it is never safe to transmit these over the network.
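One common way around the endianness issue is to serialize explicitly, byte by byte, in a fixed order; a sketch (big-endian on the wire, buffer management omitted):
#include <cstdint>

// Write a 32-bit value into a byte buffer in big-endian (network) order.
void put_u32_be(std::uint8_t* out, std::uint32_t value)
{
    out[0] = static_cast<std::uint8_t>(value >> 24);
    out[1] = static_cast<std::uint8_t>(value >> 16);
    out[2] = static_cast<std::uint8_t>(value >> 8);
    out[3] = static_cast<std::uint8_t>(value);
}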

How to guarantee a C++ type's number of bits

I am looking to typedef my own arithmetic types (e.g. Byte8, Int16, Int32, Float754, etc) with the intention of ensuring they comprise a specific number of bits (and in the case of the float, adhere to the IEEE754 format). How can I do this in a completely cross-platform way?
I have seen snippets of the C/C++ standards here and there and there is a lot of:
"type is at least x bytes"
and not very much of:
"type is exactly x bytes".
Given that typedef unsigned short int Int16 may not necessarily result in a 16-bit Int16, is there a cross-platform way to guarantee my types will have specific sizes?
You can use the exact-width integer types int8_t, int16_t, int32_t, int64_t declared in <cstdint>. This way the sizes are fixed on all platforms.
The only available way to truly guarantee an exact number of bits is to use a bit-field:
struct X {
    int abc : 14; // exactly 14 bits, regardless of platform
};
There is some upper limit on the size you can specify this way -- at least 16 bits for int, and 32 bits for long (but a modern platform may easily allow up to 64 bits for either). Note, however, that while this guarantees that arithmetic on X::abc will use (or at least emulate) exactly 14 bits, it does not guarantee that the size of a struct X is the minimum number of bytes necessary to provide 14 bits (e.g., given 8-bit bytes, its size could easily be 4 or 8 instead of the 2 that are absolutely necessary).
The C and C++ standards both now include a specification for fixed-size types (e.g., int8_t, int16_t), but no guarantee that they'll be present. They're required if the platform provides the right type, but otherwise won't be present. If memory serves, these are also required to use a 2's complement representation, so a platform with a 16-bit 1's complement integer type (for example) still won't define int16_t.
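One way to detect at compile time whether the exact-width type exists is to test the corresponding limit macro, which is defined if and only if the type is; a sketch (the alias name is invented):
#include <cstdint>

#ifdef INT16_MAX
    typedef std::int16_t sample_t;        // exact, padding-free 16-bit type available
#else
    typedef std::int_least16_t sample_t;  // always present, at least 16 bits
#endif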
Have a look at the types declared in stdint.h. This is part of the standard library, so it is expected (though technically not guaranteed) to be available everywhere. Among the types declared here are int8_t, uint8_t, int16_t, uint16_t, int32_t, uint32_t, int64_t, and uint64_t. Local implementations will map these types to the appropriate-width type for the given complier and architecture.
This is not possible.
There are platforms where char is 16 or even 32 bits.
Note that I'm not saying there are in theory platforms where this happens... it is a real and quite concrete possibility (e.g. DSPs).
On that kind of hardware there is just no way to use only 8 bits for an operation, and if you need 8-bit modular arithmetic, for example, then the only way is to do a masking operation yourself.
The C language doesn't provide this kind of emulation for you...
With C++ you could try to build a class that behaves like the expected native elementary type in most cases (with the exclusion of sizeof, obviously). The result will have however truly horrible performances.
I can think to no use case in which forcing the hardware this way against its nature would be a good idea.
It is possible to use C++ templates at compile time to check types and create new ones on the fly that fit your requirements, specifically that sizeof() of the type is the size you want.
Take a look at this code: Compile time "if".
Do note that if the requested type is not available, then it is entirely possible that your program will simply not compile. It depends on whether or not that works for you!
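With C++11, the same "fail to compile if the requirement isn't met" effect is usually written more directly with static_assert; a sketch using the Int16 alias from the question:
#include <climits>
#include <cstdint>

typedef std::int_least16_t Int16;   // smallest type with at least 16 bits

static_assert(sizeof(Int16) * CHAR_BIT == 16,
              "this platform has no exact 16-bit integer type; adjust Int16");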

Why is uint_8 etc. used in C/C++?

I've seen some code where they don't use primitive types int, float, double etc. directly.
They usually typedef it and use it or use things like
uint_8 etc.
Is it really necessary even these days? Or is C/C++ standardized enough that it is preferable to use int, float etc directly.
Because the types like char, short, int, long, and so forth, are ambiguous: they depend on the underlying hardware. Back in the days when C was basically considered an assembler language for people in a hurry, this was okay. Now, in order to write programs that are portable -- which means "programs that mean the same thing on any machine" -- people have built special libraries of typedefs and #defines that allow them to make machine-independent definitions.
The secret code is really quite straightforward. Here, you have uint_8, which is interpreted as:
u for unsigned
int to say it's treated as a number
_8 for the size in bits.
In other words, this is an unsigned integer with 8 bits (minimum) or what we used to call, in the mists of C history, an "unsigned char".
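The nearest standard spelling of that idea today comes from <cstdint> (the short alias name here is only for illustration):
#include <cstdint>

typedef std::uint8_t u8;   // unsigned, integer, 8 bits: what the naming scheme encodes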
uint8_t is rather useless, because due to other requirements in the standard, it exists if and only if unsigned char is 8-bit, in which case you could just use unsigned char. The others, however, are extremely useful. int is (and will probably always be) 32-bit on most modern platforms, but on some ancient stuff it's 16-bit, and on a few rare early 64-bit systems, int is 64-bit. It could also of course be various odd sizes on DSPs.
If you want a 32-bit type, use int32_t or uint32_t, and so on. It's a lot cleaner and easier than all the nasty legacy hacks of detecting the sizes of types and trying to use the right one yourself...
Most code I read, and write, uses the fixed-size typedefs only when the size is an important assumption in the code.
For example if you're parsing a binary protocol that has two 32-bit fields, you should use a typedef guaranteed to be 32-bit, if only as documentation.
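As a sketch of that protocol case (field names are invented; byte order and struct padding still need care in real code):
#include <cstdint>
#include <cstring>

struct MessageHeader {
    std::uint32_t type;     // the wire format says: 32 bits
    std::uint32_t length;   // the wire format says: 32 bits
};

// Copy the two fixed-width fields out of a received buffer (host byte order).
MessageHeader read_header(const unsigned char* buf)
{
    MessageHeader h;
    std::memcpy(&h.type, buf, sizeof(h.type));
    std::memcpy(&h.length, buf + 4, sizeof(h.length));
    return h;
}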
I'd only use int16 or int64 when the size must be that, say for a binary protocol or to avoid overflow or keep a struct small. Otherwise just use int.
If you're just doing "int i" to use i in a for loop, then I would not write "int32" for that. I would never expect any "typical" (meaning "not weird embedded firmware") C/C++ code to see a 16-bit "int," and the vast majority of C/C++ code out there would implode if faced with 16-bit ints. So if you start to care about "int" being 16 bit, either you're writing code that cares about weird embedded firmware stuff, or you're sort of a language pedant. Just assume "int" is the best int for the platform at hand and don't type extra noise in your code.
The sizes of types in C are not particularly well standardized. 64-bit integers are one example: a 64-bit integer could be long long, __int64, or even int on some systems. To get better portability, C99 introduced the <stdint.h> header, which has types like int32_t to get a signed type that is exactly 32 bits; many programs had their own, similar sets of typedefs before that.
C and C++ purposefully don't define the exact size of an int. This is because of a number of reasons, but that's not important in considering this problem.
Since int isn't set to a standard size, those who want a standard size must do a bit of work to guarantee a certain number of bits. The code that defines uint_8 does that work, and without it (or a technique like it) you wouldn't have a means of defining an unsigned 8 bit number.
The width of primitive types often depends on the system, not just the C++ standard or compiler. If you want true consistency across platforms when you're doing scientific computing, for example, you should use the specific uint_8 or whatever so that the same errors (or precision errors for floats) appear on different machines, so that the memory overhead is the same, etc.
C and C++ don't restrict the exact size of the numeric types, the standards only specify a minimum range of values that has to be represented. This means that int can be larger than you expect.
The reason for this is that often a particular architecture will have a size for which arithmetic works faster than other sizes. Allowing the implementor to use this size for int and not forcing it to use a narrower type may make arithmetic with ints faster.
This isn't going to go away any time soon. Even once servers and desktops are all fully transitioned to 64-bit platforms, mobile and embedded platforms may well be operating with a different integer size. Apart from anything else, you don't know what architectures might be released in the future. If you want your code to be portable, you have to use a fixed-size typedef anywhere that the type size is important to you.