Will int32 be the same size on all platforms? - c++

I'm developing a multi-platform app (iOS, Android, etc.) in C++.
Are there base types in the C++ standard which are guaranteed to be a fixed width, and portable across multiple platforms?
I'm looking for fixed-width types such as Int32, UInt32, Int16, UInt16, Float32, etc.

int32 is a custom typedef; only int exists by default. If you need a specific width, take a look at stdint.h:
#include <cstdint>
int32_t integer32bits;
I don't think any floating point counterpart exists in the standard, correct me if I'm wrong.

Floats are almost always 32 bit except on some obscure platforms that do not comply with IEEE 754. You don't need to bother with those, in all likelihood. Integer types may vary, but if your target platform has a C++11-compliant compiler, then you can use the cstdint header to access types of a specific size in a standard way. If you can't use C++11, then you will need separate code for each platform, most likely.
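If you do rely on that, a minimal sketch (assuming a C++11 compiler) is to turn the assumption into a compile-time check rather than leaving it implicit:
#include <climits>
#include <limits>

// Fail the build on any platform where float is not a 32-bit IEEE 754 type.
static_assert(sizeof(float) * CHAR_BIT == 32, "float is not 32 bits");
static_assert(std::numeric_limits<float>::is_iec559, "float is not IEEE 754");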

The definitions in <stdint.h>, or <cstdint> can be used for portability:
int32_t is guaranteed to be a typedef for a signed 32-bit type, or not exist at all. Since this is C++, you can use enable_if to decide on a course of action.
int_least32_t is a typedef for the smallest type that has at least 32 bits.
int_fast32_t is a typedef for a type that has at least 32 bits and can be operated on efficiently (e.g. if the memory bus is 64 bits wide and allows no partial stores, it is faster to use a 64-bit type and waste memory rather than perform read-modify-write accesses).
See also The difference of int8_t, int_least8_t and int_fast8_t.
Note that different systems can also have different endianness, so it is never safe to transmit these over the network.
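As a quick illustration of the difference between these typedefs (the exact widths printed are platform-dependent; on x86-64 Linux, int_fast32_t is commonly 64 bits):
#include <climits>
#include <cstdint>
#include <cstdio>

int main() {
    // int32_t is exactly 32 bits (when it exists); the least/fast variants may be wider.
    std::printf("int32_t:       %zu bits\n", sizeof(std::int32_t) * CHAR_BIT);
    std::printf("int_least32_t: %zu bits\n", sizeof(std::int_least32_t) * CHAR_BIT);
    std::printf("int_fast32_t:  %zu bits\n", sizeof(std::int_fast32_t) * CHAR_BIT);
}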

Related

In new code, why would you use `int` instead of `int_fast16_t` or `int_fast32_t` for a counting variable?

If you need a counting variable, surely there must be an upper and a lower limit that your integer must support. So why wouldn't you specify those limits by choosing an appropriate (u)int_fastxx_t data type?
The simplest reason is that people are more used to int than the additional types introduced in C++11, and that it's the language's "default" integral type (insofar as C++ has one); the standard specifies, in [basic.fundamental/2], that:
Plain ints have the natural size suggested by the architecture of the execution environment [46]; the other signed integer types are provided to meet special needs.
[46] That is, large enough to contain any value in the range of INT_MIN and INT_MAX, as defined in the header <climits>.
Thus, whenever a generic integer is needed that isn't required to have a specific range or size, programmers tend to just use int. While using other types can communicate intent more clearly (for example, using int8_t indicates that the value should never exceed 127), using int also communicates that these details aren't crucial to the task at hand, while simultaneously providing a little leeway to catch values that exceed your required range (if a system handles signed overflow with modulo arithmetic, for example, an int8_t would treat 313 as 57, making the invalid value harder to troubleshoot). Typically, in modern programming, it indicates either that the value can be represented within the system's word size (which int is supposed to represent) or that the value can be represented within 32 bits (which is nearly always the size of int on x86 and x64 platforms).
Sized types also have the issue that the (theoretically) most well-known ones, the intX_t line, are only defined on platforms which support sizes of exactly X bits. While the int_leastX_t types are guaranteed to be defined on all platforms, and guaranteed to be at least X bits, a lot of people wouldn't want to type that much if they don't have to, since it adds up when you need to specify types often. [You can't use auto either, because it deduces integer literals as int; this can be mitigated with user-defined literal operators, but that still takes more time to type.] Thus, they'll typically use int if it's safe to do so.
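As an aside, a sketch of that literal-operator workaround (the _i32 suffix is invented here for illustration, not anything from the standard):
#include <cstdint>

// Invented suffix so that auto deduces int_least32_t instead of plain int.
constexpr std::int_least32_t operator""_i32(unsigned long long v) {
    return static_cast<std::int_least32_t>(v);
}

auto counter = 1000_i32;  // counter is int_least32_t rather than int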
Or in short, int is intended to be the go-to type for normal operation, with the other types intended to be used in extranormal circumstances. Many programmers stick to this mindset out of habit, and only use sized types when they explicitly require specific ranges and/or sizes. This also communicates intent relatively well; int means "number", and intX_t means "number that always fits in X bits".
It doesn't help that int has evolved to unofficially mean "32-bit integer", due to both 32- and 64-bit platforms usually using 32-bit ints. It's very likely that many programmers expect int to always be at least 32 bits in the modern age, to the point where it can very easily bite them in the rear if they have to program for platforms that don't support 32-bit ints.
Conversely, the sized types are typically used when a specific range or size is explicitly required, such as when defining a struct that needs to have the same layout on systems with different data models. They can also prove useful when working with limited memory, using the smallest type that can fully contain the required range.
A struct intended to have the same layout on 16- and 32-bit systems, for example, would use either int16_t or int32_t instead of int, because int is 16 bits in most 16-bit data models and the LP32 32-bit data model (used by the Win16 API and Apple Macintoshes), but 32 bits in the ILP32 32-bit data model (used by the Win32 API and *nix systems, effectively making it the de facto "standard" 32-bit model).
Similarly, a struct intended to have the same layout on 32- and 64-bit systems would use int/int32_t or long long/int64_t over long, due to long having different sizes in different models (64 bits in LP64 (used by 64-bit *nix), 32 bits in LLP64 (used by Win64 API) and the 32-bit models).
Note that there is also a third 64-bit model, ILP64, where int is 64 bits; this model is very rarely used (to my knowledge, it was only used on early 64-bit Unix systems), but would mandate the use of a sized type over int if layout compatibility with ILP64 platforms is required.
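To make the layout point concrete, a small sketch (the struct and field names are invented for illustration):
#include <cstdint>

// Meant to have the same layout in LP32, ILP32, and LP64 builds; using int or
// long here would change the field sizes depending on the data model.
struct SharedRecord {
    std::int32_t id;       // always 32 bits
    std::int16_t flags;    // always 16 bits
    std::int16_t version;  // fills the struct to avoid trailing padding on common ABIs
};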
There are several reasons. One, these long names make the code less readable. Two, you might introduce really hard-to-find bugs. Say you used int_fast16_t but you really need to count up to 40,000. The implementation might use 32 bits and the code works just fine. Then you try to run the code on an implementation that uses 16 bits and you get hard-to-find bugs.
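A sketch of that trap (hypothetical code; fine where int_fast16_t is 32 bits, broken where it is 16):
#include <cstddef>
#include <cstdint>

// Count matching bytes in a buffer. If int_fast16_t is only 16 bits and there are
// more than 32,767 matches, the increment overflows (undefined behaviour).
std::int_fast16_t count_matches(const unsigned char* data, std::size_t n) {
    std::int_fast16_t hits = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (data[i] == 0xFF) {
            ++hits;
        }
    }
    return hits;
}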
A note: In C/C++ you have the types char, short, int, long and long long, which between them have to cover 8 to 64 bits, so int cannot in practice be 64 bits (because char and short alone cannot cover 8, 16 and 32 bits), even if 64 bits is the natural word size. In Swift, for example, Int is the natural integer size, either 32 or 64 bits, and you have Int8, Int16, Int32 and Int64 for explicit sizes. Int is the best type unless you absolutely need 64 bits, in which case you use Int64, or unless you need to save space.

<stdint.h> or standard types?

Which types should I use when programming C++ on Linux? Is it good idea to use types from stdint.h, such as int16_t and uint8_t?
On one hand, surely stdint.h won't be available for programming on Windows. On the other hand, the size of e.g. short isn't clear at first glance. And it's even more intuitive to write int8_t instead of char...
Does the C++ standard guarantee that the sizes of the standard types will be unchanged in the future?
First off, Microsoft's implementation does support <stdint.h>.
Use the appropriate type for what you're doing.
If you need, for example, an unsigned type that's exactly 16 bits wide with no padding bits, use uint16_t, defined in <stdint.h>.
If you need an unsigned type that's at least 16 bits wide, you can use uint_least16_t, or uint_fast16_t, or short, or int.
You probably don't need exact-width types as often as you think you do. Very often what matters is not the exact size of a type, but the range of values it supports. But exact representation is important when you're interfacing to some externally defined data format. In that case, you should already have declarations that tell you what types to use.
There are specific requirements on the ranges of the predefined types: char is at least 8 bits, short and int are at least 16 bits, long is at least 32 bits, and long long is at least 64 bits. Also, short is at least as wide as char, int is at least as wide as short, and so forth. (The standard specifies minimum ranges, but the minimum sizes can be derived from the ranges and the fact that a binary representation is required.)
Note that <stdint.h> is a C header. If you #include it in a C++ program, the type names will be imported directly into the global namespace, and may or may not also be imported into the std namespace. If you #include <cstdint>, then the type names will be imported into the std namespace, and may or may not also be imported into the global namespace. Macro names such as UINT32_MAX are not in any namespace; they're always global. You can use either version of the header; just be consistent about using or not using the std:: prefix.
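For example (a tiny sketch; both spellings typically work in practice, but only one is formally guaranteed by each header):
#include <cstdint>

std::uint32_t a = UINT32_MAX;  // std::uint32_t is guaranteed by <cstdint>; the macro is always global
// uint32_t   b = 0;           // often works too, but is only formally guaranteed by <stdint.h>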
The C++ standard does not specify much about the sizes of integer types (such as int, long or char). If you want to be sure that a certain type has a fixed size across platforms, you can use C++11's fixed-width integer types, which are standardized and guaranteed to have the given size.
To use them, #include <cstdint>.
Does the C++ standard guarantee that the sizes of the standard types will be unchanged in the future?
Not likely. On 8-bit computers, the sizes of integer types were different from what they are today. In the future, say in 2042 with 1024-bit computers, I'd expect long long to be 1024 bits.
However, we can be almost absolutely sure that std::uint32_t will stay 32 bits.

How to guarantee a C++ type's number of bits

I am looking to typedef my own arithmetic types (e.g. Byte8, Int16, Int32, Float754, etc) with the intention of ensuring they comprise a specific number of bits (and in the case of the float, adhere to the IEEE754 format). How can I do this in a completely cross-platform way?
I have seen snippets of the C/C++ standards here and there and there is a lot of:
"type is at least x bytes"
and not very much of:
"type is exactly x bytes".
Given that typedef Int16 unsigned short int may not necessarily result in a 16-bit Int16, is there a cross-platform way to guarantee my types will have specific sizes?
You can use the exact-width integer types int8_t, int16_t, int32_t, int64_t declared in <cstdint>. That way the sizes are fixed on all platforms.
The only available way to truly guarantee an exact number of bits is to use a bit-field:
struct X {
int abc : 14; // exactly 14 bits, regardless of platform
};
There is some upper limit on the size you can specify this way -- at least 16 bits for int, and 32 bits for long (but a modern platform may easily allow up to 64 bits for either). Note, however, that while this guarantees that arithmetic on X::abc will use (or at least emulate) exactly 14 bits, it does not guarantee that the size of a struct X is the minimum number of bytes necessary to provide 14 bits (e.g., given 8-bit bytes, its size could easily be 4 or 8 instead of the 2 that are absolutely necessary).
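For instance, a sketch that makes that size difference visible (the number printed is implementation-dependent):
#include <cstdio>

struct X {
    int abc : 14;  // arithmetic uses exactly 14 bits
};

int main() {
    // Commonly prints 4 (one int allocation unit), not the 2 bytes that 14 bits would need.
    std::printf("sizeof(X) = %zu\n", sizeof(X));
}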
The C and C++ standards both now include a specification for fixed-size types (e.g., int8_t, int16_t), but no guarantee that they'll be present. They're required if the platform provides the right type, but otherwise won't be present. If memory serves, these are also required to use a 2's complement representation, so a platform with a 16-bit 1's complement integer type (for example) still won't define int16_t.
Have a look at the types declared in stdint.h. This is part of the standard library, so it is expected (though technically not guaranteed) to be available everywhere. Among the types declared here are int8_t, uint8_t, int16_t, uint16_t, int32_t, uint32_t, int64_t, and uint64_t. Local implementations will map these types to the appropriate-width type for the given compiler and architecture.
This is not possible.
There are platforms where char is 16 or even 32 bits.
Note that I'm not saying such platforms exist only in theory... it is a real and quite concrete possibility (e.g. DSPs).
On that kind of hardware there is just no way to use only 8 bits for an operation, and if you need 8-bit modular arithmetic, for example, then the only way is to do the masking operation yourself.
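For example, a minimal sketch of that manual masking (hypothetical, for a platform whose narrowest type is wider than 8 bits):
// Emulate 8-bit unsigned wrap-around arithmetic by masking after the operation.
unsigned add_mod256(unsigned a, unsigned b) {
    return (a + b) & 0xFFu;  // keep only the low 8 bits
}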
The C language doesn't provide this kind of emulation for you...
With C++ you could try to build a class that behaves like the expected native elementary type in most cases (with the exception of sizeof, obviously). However, the result will have truly horrible performance.
I can think of no use case in which forcing the hardware this way, against its nature, would be a good idea.
It is possible to use C++ templates at compile time to check for, and create on the fly, new types that fit your requirements, specifically that sizeof() of the type is the size you want.
Take a look at this code: Compile time "if".
Do note that if the requested type is not available, then it is entirely possible that your program will simply not compile. It just depends on whether that works for you!
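One way to sketch that technique with C++11's std::conditional (the alias names here are invented; the static_assert makes the build fail when no built-in type matches):
#include <cstddef>
#include <type_traits>

// Pick the first built-in signed integer type whose size is exactly Bytes, else void.
template <std::size_t Bytes>
using exact_int =
    typename std::conditional<sizeof(signed char) == Bytes, signed char,
    typename std::conditional<sizeof(short)       == Bytes, short,
    typename std::conditional<sizeof(int)         == Bytes, int,
    typename std::conditional<sizeof(long)        == Bytes, long,
    typename std::conditional<sizeof(long long)   == Bytes, long long,
                              void>::type>::type>::type>::type>::type;

using Int32 = exact_int<4>;
static_assert(!std::is_void<Int32>::value, "no built-in type of exactly 4 bytes");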

Any situation where long is preferred over int when sizeof(int)=sizeof(long)

My desktop and laptop machines have 64-bit and 32-bit Ubuntu 10.10 running on them respectively. I use the gcc compiler.
Now on my desktop machine I observe that sizeof(long)=8 while on my laptop sizeof(long)=4.
On machines such as my laptop, where sizeof(int) == sizeof(long) == 4, are there any situations where I would prefer long over int, even though they cover the same range of integers?
On my desktop, of course, long would be advantageous if I want a larger range of integers (though I could have used int64_t or long long for that as well).
Don't use either of them. In modern C (starting with C89) or C++ there are typedefs whose semantics help you write portable code. int is almost always wrong; the only use case I still have for it is the return value of library functions. Otherwise use:
bool or _Bool for Booleans (if you have C++ or C99, otherwise use a typedef)
enum for applicative case distinction
size_t for counting and indexing
unsigned types when you use integers for bit patterns
ptrdiff_t (if you must) for differences of addresses
If you really have an application use for a signed integer type, use either intmax_t, to have the widest type and be on the safe side, or one of the intXX_t, to have a type with well-defined precision and arithmetic.
Edit: If your main concern is performance with some minimum width guarantee, use the "least" or "fast" types, e.g. int_least32_t. On all platforms that I have programmed for so far, there was not much of a difference between the exact-width types and the "least" types, but who knows.
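A small sketch of those choices in practice (the function name is made up for illustration):
#include <cstddef>
#include <cstdint>

// size_t for counting and indexing; an unsigned fixed-width type for bit patterns.
std::size_t count_set_bits(std::uint32_t mask) {
    std::size_t count = 0;
    for (std::uint32_t bit = 1u; bit != 0; bit <<= 1) {
        if (mask & bit) {
            ++count;
        }
    }
    return count;
}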
On a 32-bit OS, where sizeof(int)==sizeof(long)==4 an int and a long offer the same services.
But, for portability reasons (if you compile your code for 64-bit, for example), since an int will stay 32 bits while a long can be either 32 or 64 bits, you should use types with a fixed size to avoid overflows.
For this purpose, the <stdint.h> header declares non-ambiguous types like:
int8_t
int16_t
int32_t
uint8_t
uint16_t
uint32_t
intptr_t
Where intptr_t / uintptr_t can represent pointers better than a long (the common sizeof(long)==sizeof(void*) assumption is not always true).
time_t and size_t are also types defined to make it easier to write portable code without wondering about the platform specifications.
Just make sure that, when you need to allocate memory, you use sizeof (like sizeof(size_t)) instead of assuming that a type has any given (hardcoded) value.
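As a small sketch of those two points (intptr_t for holding a pointer value, and sizeof instead of a hardcoded size):
#include <cstdint>
#include <cstdlib>

int main() {
    int value = 0;
    // intptr_t is wide enough to round-trip an object pointer; long may not be.
    std::intptr_t as_int = reinterpret_cast<std::intptr_t>(&value);
    int* back = reinterpret_cast<int*>(as_int);

    // Size the allocation with sizeof instead of assuming a hardcoded width.
    std::size_t* buffer = static_cast<std::size_t*>(std::malloc(16 * sizeof(std::size_t)));
    std::free(buffer);
    return back == &value ? 0 : 1;
}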
While it certainly isn't a common practice, the Linux kernel source makes the assumption that pointers -- any pointer to any type -- can fit entirely inside an unsigned long.
As #Alf noted above, you might choose either int or long for portability reasons.
On older, 16-bit operating systems, int was 16 bits and long was 32 bits;
On 32-bit Unix, DOS, and Windows (or 64-bit processors running 32-bit programs), int and long are 32 bits;
On 64-bit Unix, int is 32 bits, while long is 64 bits.
For portability reasons you should use a long if you need more than 16 bits and up to 32 bits of precision. That's really all there is to it - if you know that your values won't exceed 16 bits, then int is fine.

Why is uint_8 etc. used in C/C++?

I've seen some code where they don't use primitive types int, float, double etc. directly.
They usually typedef it and use it or use things like
uint_8 etc.
Is it really necessary even these days? Or is C/C++ standardized enough that it is preferable to use int, float, etc. directly?
Because the types like char, short, int, long, and so forth, are ambiguous: they depend on the underlying hardware. Back in the days when C was basically considered an assembler language for people in a hurry, this was okay. Now, in order to write programs that are portable -- which means "programs that mean the same thing on any machine" -- people have built special libraries of typedefs and #defines that allow them to make machine-independent definitions.
The secret code is really quite straightforward. Here you have uint_8, which is interpreted as:
u for unsigned
int to say it's treated as a number
_8 for the size in bits.
In other words, this is an unsigned integer with 8 bits (minimum) or what we used to call, in the mists of C history, an "unsigned char".
uint8_t is rather useless, because due to other requirements in the standard, it exists if and only if unsigned char is 8-bit, in which case you could just use unsigned char. The others, however, are extremely useful. int is (and will probably always be) 32-bit on most modern platforms, but on some ancient stuff it's 16-bit, and on a few rare early 64-bit systems, int is 64-bit. It could also of course be various odd sizes on DSPs.
If you want a 32-bit type, use int32_t or uint32_t, and so on. It's a lot cleaner and easier than all the nasty legacy hacks of detecting the sizes of types and trying to use the right one yourself...
Most code I read, and write, uses the fixed-size typedefs only when the size is an important assumption in the code.
For example if you're parsing a binary protocol that has two 32-bit fields, you should use a typedef guaranteed to be 32-bit, if only as documentation.
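A sketch of that situation (a hypothetical wire header with two 32-bit fields, read out of a raw byte buffer):
#include <cstdint>
#include <cstring>

// Hypothetical message header: two 32-bit fields, back to back on the wire.
struct MessageHeader {
    std::uint32_t length;
    std::uint32_t checksum;
};

MessageHeader parse_header(const unsigned char* bytes) {
    MessageHeader h;
    std::memcpy(&h.length, bytes, sizeof(h.length));
    std::memcpy(&h.checksum, bytes + sizeof(h.length), sizeof(h.checksum));
    return h;  // note: byte order still needs handling if the wire format differs from the host
}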
I'd only use int16 or int64 when the size must be that, say for a binary protocol or to avoid overflow or keep a struct small. Otherwise just use int.
If you're just doing "int i" to use i in a for loop, then I would not write "int32" for that. I would never expect any "typical" (meaning "not weird embedded firmware") C/C++ code to see a 16-bit "int," and the vast majority of C/C++ code out there would implode if faced with 16-bit ints. So if you start to care about "int" being 16 bit, either you're writing code that cares about weird embedded firmware stuff, or you're sort of a language pedant. Just assume "int" is the best int for the platform at hand and don't type extra noise in your code.
The sizes of types in C are not particularly well standardized. 64-bit integers are one example: a 64-bit integer could be long long, __int64, or even int on some systems. To get better portability, C99 introduced the <stdint.h> header, which has types like int32_t to get a signed type that is exactly 32 bits; many programs had their own, similar sets of typedefs before that.
C and C++ purposefully don't define the exact size of an int. This is because of a number of reasons, but that's not important in considering this problem.
Since int isn't set to a standard size, those who want a standard size must do a bit of work to guarantee a certain number of bits. The code that defines uint_8 does that work, and without it (or a technique like it) you wouldn't have a means of defining an unsigned 8-bit number.
The width of primitive types often depends on the system, not just the C++ standard or compiler. If you want true consistency across platforms when you're doing scientific computing, for example, you should use the specific uint_8 or whatever so that the same errors (or precision errors for floats) appear on different machines, so that the memory overhead is the same, etc.
C and C++ don't restrict the exact size of the numeric types; the standards only specify a minimum range of values that has to be representable. This means that int can be larger than you expect.
The reason for this is that often a particular architecture will have a size for which arithmetic works faster than other sizes. Allowing the implementor to use this size for int and not forcing it to use a narrower type may make arithmetic with ints faster.
This isn't going to go away any time soon. Even once servers and desktops are all fully transitioned to 64-bit platforms, mobile and embedded platforms may well be operating with a different integer size. Apart from anything else, you don't know what architectures might be released in the future. If you want your code to be portable, you have to use a fixed-size typedef anywhere that the type size is important to you.