Why is uint8_t etc. used in C/C++?

I've seen some code where they don't use primitive types int, float, double etc. directly.
They usually typedef them, or use things like
uint8_t etc.
Is it really necessary even these days? Or are C and C++ standardized enough that it is preferable to use int, float, etc. directly?

Because the types like char, short, int, long, and so forth, are ambiguous: they depend on the underlying hardware. Back in the days when C was basically considered an assembler language for people in a hurry, this was okay. Now, in order to write programs that are portable -- which means "programs that mean the same thing on any machine" -- people have built special libraries of typedefs and #defines that allow them to make machine-independent definitions.
The secret code is really quite straightforward. Here you have uint8_t, which is read as
u for unsigned
int to say it's an integer type
8 for the size in bits
_t to mark it as a type name.
In other words, this is an unsigned integer that is exactly 8 bits wide, or what we used to call, in the mists of C history, an "unsigned char".
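For illustration, a minimal sketch of how these names look in code (the variable names are made up; in C++ the exact-width types come from <cstdint>, in C from <stdint.h>):

#include <cstdint>

std::uint8_t  small_counter = 0;   // unsigned, exactly 8 bits
std::int32_t  file_offset   = -1;  // signed, exactly 32 bits
std::uint64_t packet_id     = 0;   // unsigned, exactly 64 bits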

uint8_t is rather useless, because due to other requirements in the standard, it exists if and only if unsigned char is 8-bit, in which case you could just use unsigned char. The others, however, are extremely useful. int is (and will probably always be) 32-bit on most modern platforms, but on some ancient stuff it's 16-bit, and on a few rare early 64-bit systems, int is 64-bit. It could also of course be various odd sizes on DSPs.
If you want a 32-bit type, use int32_t or uint32_t, and so on. It's a lot cleaner and easier than all the nasty legacy hacks of detecting the sizes of types and trying to use the right one yourself...
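For illustration, a small sketch of that approach; the static_assert merely restates what int32_t already guarantees, and the crc variable is a made-up example:

#include <climits>
#include <cstdint>

// Take the exact-width type directly and, if you like, state the assumption
// once instead of probing type sizes by hand.
static_assert(sizeof(std::int32_t) * CHAR_BIT == 32, "int32_t is exactly 32 bits");

std::uint32_t crc = 0xFFFFFFFFu;  // 32 bits on every platform that provides uint32_t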

Most code I read, and write, uses the fixed-size typedefs only when the size is an important assumption in the code.
For example, if you're parsing a binary protocol that has two 32-bit fields, you should use a typedef guaranteed to be 32-bit, if only as documentation (a small sketch follows at the end of this answer).
I'd only use int16_t or int64_t when the size must be exactly that, say for a binary protocol, to avoid overflow, or to keep a struct small. Otherwise just use int.
If you're just writing "int i" for a loop counter, I would not write "int32_t" for that. I would never expect any "typical" (meaning "not weird embedded firmware") C or C++ code to see a 16-bit int, and the vast majority of C/C++ code out there would implode if faced with 16-bit ints. So if you start to care about "int" being 16 bits, either you're writing code that cares about weird embedded firmware, or you're something of a language pedant. Just assume "int" is the best int for the platform at hand and don't add extra noise to your code.
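A minimal sketch of that binary-protocol case (the struct, field names, and parse_header helper are invented for the example):

#include <cstdint>
#include <cstring>

// Hypothetical wire format with two 32-bit fields, as in the example above.
struct RecordHeader {
    std::uint32_t length;    // exactly 32 bits, matching the protocol spec
    std::uint32_t checksum;  // exactly 32 bits as well
};

// Copying from a raw buffer avoids alignment issues; byte order is a separate concern.
RecordHeader parse_header(const unsigned char* buffer) {
    RecordHeader h;
    std::memcpy(&h, buffer, sizeof h);
    return h;
}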

The sizes of types in C are not particularly well standardized. 64-bit integers are one example: a 64-bit integer could be long long, __int64, or even int on some systems. To get better portability, C99 introduced the <stdint.h> header, which has types like int32_t to get a signed type that is exactly 32 bits; many programs had their own, similar sets of typedefs before that.
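As a rough illustration of the before-and-after (my_int64 is a made-up name; the MSVC branch is just one example of a pre-C99 workaround):

// Roughly what pre-C99 projects did by hand:
#if defined(_MSC_VER)
typedef __int64 my_int64;
#else
typedef long long my_int64;
#endif

// With the standard header, the same intent is a single, portable name:
#include <cstdint>
std::int64_t offset = 0;  // exactly 64 bits wherever it is defined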

C and C++ purposefully don't define the exact size of an int. There are a number of reasons for this, but they aren't important for this question.
Since int isn't set to a standard size, those who want a standard size must do a bit of work to guarantee a certain number of bits. The code that defines uint8_t does that work, and without it (or a technique like it) you wouldn't have a means of defining an unsigned 8-bit number.

The width of primitive types often depends on the system, not just the C++ standard or compiler. If you want true consistency across platforms when you're doing scientific computing, for example, you should use a specific type like uint8_t so that the same errors (or precision errors for floats) appear on different machines, the memory overhead is the same, and so on.

C and C++ don't fix the exact size of the numeric types; the standards only specify a minimum range of values that has to be representable. This means that int can be larger than you expect.
The reason for this is that often a particular architecture will have a size for which arithmetic works faster than other sizes. Allowing the implementor to use this size for int and not forcing it to use a narrower type may make arithmetic with ints faster.
This isn't going to go away any time soon. Even once servers and desktops are all fully transitioned to 64-bit platforms, mobile and embedded platforms may well be operating with a different integer size. Apart from anything else, you don't know what architectures might be released in the future. If you want your code to be portable, you have to use a fixed-size typedef anywhere that the type size is important to you.
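To make the minimum-range point concrete, a small sketch using the limits from <climits> (nothing here is specific to any particular platform):

#include <climits>
#include <cstdint>

// The standard only promises minimums: int has at least 16 bits of range,
// long at least 32, long long at least 64. The real widths are up to the platform.
static_assert(INT_MAX >= 32767, "guaranteed minimum range of int");
static_assert(LONG_MAX >= 2147483647L, "guaranteed minimum range of long");

// When the width itself matters, say so explicitly with a fixed-size typedef:
std::uint32_t register_value = 0;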

Related

In new code, why would you use `int` instead of `int_fast16_t` or `int_fast32_t` for a counting variable?

If you need a counting variable, surely there must be an upper and a lower limit that your integer must support. So why wouldn't you specify those limits by choosing an appropriate (u)int_fastxx_t data type?
The simplest reason is that people are more used to int than the additional types introduced in C++11, and that it's the language's "default" integral type (so much as C++ has one); the standard specifies, in [basic.fundamental/2] that:
Plain ints have the natural size suggested by the architecture of the execution environment [46]; the other signed integer types are provided to meet special needs.
[46] that is, large enough to contain any value in the range of INT_MIN and INT_MAX, as defined in the header <climits>.
Thus, whenever a generic integer is needed, which isn't required to have a specific range or size, programmers tend to just use int. While using other types can communicate intent more clearly (for example, using int8_t indicates that the value should never exceed 127), using int also communicates that these details aren't crucial to the task at hand, while simultaneously providing a little leeway to catch values that exceed your required range (if a system handles signed overflow with modulo arithmetic, for example, an int8_t would treat 313 as 57, making the invalid value harder to troubleshoot); typically, in modern programming, it either indicates that the value can be represented within the system's word size (which int is supposed to represent), or that the value can be represented within 32 bits (which is nearly always the size of int on x86 and x64 platforms).
Sized types also have the issue that the (theoretically) most well-known ones, the intX_t line, are only defined on platforms which support sizes of exactly X bits. While the int_leastX_t types are guaranteed to be defined on all platforms, and guaranteed to be at least X bits, a lot of people wouldn't want to type that much if they don't have to, since it adds up when you need to specify types often. [You can't use auto either, because it deduces integer literals as int. This can be mitigated with user-defined literal operators, but that still takes more time to type.] Thus, they'll typically use int if it's safe to do so.
Or in short, int is intended to be the go-to type for normal operation, with the other types intended to be used in extranormal circumstances. Many programmers stick to this mindset out of habit, and only use sized types when they explicitly require specific ranges and/or sizes. This also communicates intent relatively well; int means "number", and intX_t means "number that always fits in X bits".
It doesn't help that int has evolved to unofficially mean "32-bit integer", due to both 32- and 64-bit platforms usually using 32-bit ints. It's very likely that many programmers expect int to always be at least 32 bits in the modern age, to the point where it can very easily bite them in the rear if they have to program for platforms that don't support 32-bit ints.
Conversely, the sized types are typically used when a specific range or size is explicitly required, such as when defining a struct that needs to have the same layout on systems with different data models. They can also prove useful when working with limited memory, using the smallest type that can fully contain the required range.
A struct intended to have the same layout on 16- and 32-bit systems, for example, would use either int16_t or int32_t instead of int, because int is 16 bits in most 16-bit data models and the LP32 32-bit data model (used by the Win16 API and Apple Macintoshes), but 32 bits in the ILP32 32-bit data model (used by the Win32 API and *nix systems, effectively making it the de facto "standard" 32-bit model).
Similarly, a struct intended to have the same layout on 32- and 64-bit systems would use int/int32_t or long long/int64_t over long, due to long having different sizes in different models (64 bits in LP64 (used by 64-bit *nix), 32 bits in LLP64 (used by Win64 API) and the 32-bit models).
Note that there is also a third 64-bit model, ILP64, where int is 64 bits; this model is very rarely used (to my knowledge, it was only used on early 64-bit Unix systems), but would mandate the use of a sized type over int if layout compatibility with ILP64 platforms is required.
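A hedged sketch of such a layout-stable struct (the struct and field names are invented):

#include <cstdint>

// Hypothetical record meant to have the same layout under ILP32, LLP64 and LP64.
struct SavedState {
    std::int32_t version;    // plain int would be 16 bits on 16-bit data models
    std::int64_t timestamp;  // plain long would be 32 bits on LLP64 but 64 on LP64
};
// (Alignment and padding still need attention if the raw bytes are shared across platforms.)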
There are several reasons. One, these long names make the code less readable. Two, you might introduce really hard-to-find bugs. Say you used int_fast16_t but you really need to count up to 40,000. The implementation might use 32 bits and the code works just fine. Then you try to run the code on an implementation that uses 16 bits and you get hard-to-find bugs.
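A small sketch of how to guard against exactly that (assuming the 40,000 counter from the example):

#include <cstdint>

// int_fast16_t only promises at least 16 bits, so a counter that must reach
// 40,000 may or may not fit. A compile-time check turns the silent wraparound
// into an error on the offending implementation.
static_assert(INT_FAST16_MAX >= 40000, "int_fast16_t too narrow for this counter");

std::int_fast16_t counter = 0;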
A note: in C and C++ you have the types char, short, int, long and long long, which between them are expected to cover 8 to 64 bits, so in practice int is almost never 64 bits even when 64 bits is the natural word size (if int were 64 bits, only char and short would be left to cover 8, 16 and 32 bits; the rare ILP64 model mentioned above is the exception). In Swift, for example, Int is the natural integer size, either 32 or 64 bits, and you have Int8, Int16, Int32 and Int64 for explicit sizes. Int is the best type unless you absolutely need 64 bits, in which case you use Int64, or unless you need to save space.

Uses and when to use int16_t, int32_t, int64_t and, respectively, short int, int, long int, long

Uses and when to use int16_t, int32_t, int64_t and respectively short, int, long.
There are too many damn types in C++. For integers when is it correct to use one over the other?
Use the well-defined types when the precision is important. Use the less-determinate ones when it is not. It's never wrong to use the more precise ones. It sometimes leads to bugs when you use the flexible ones.
Use the exact-width types when you actually need an exact width. For example, int32_t is guaranteed to be exactly 32 bits wide, with no padding bits, and with a two's-complement representation. If you need all those requirements (perhaps because they're imposed by an external data format), use int32_t. Likewise for the other [u]intN_t types.
If you merely need a signed integer type of at least 32 bits, use int_least32_t or int_fast32_t, depending on whether you want to optimize for size or speed. (They're likely to be the same type.)
Use the predefined types short, int, long, et al when they're good enough for your purposes and you don't want to use the longer names. short and int are both guaranteed to be at least 16 bits, long at least 32 bits, and long long at least 64 bits. int is normally the "natural" integer type suggested by the system's architecture; you can think of it as int_fast16_t, and long as int_fast32_t, though they're not guaranteed to be the same.
I haven't given firm criteria for using the built-in vs. the [u]int_leastN_t and [u]int_fastN_t types because, frankly, there are no such criteria. If the choice isn't imposed by the API you're using or by your organization's coding standard, it's really a matter of personal taste. Just try to be consistent.
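For reference, one declaration of each flavour discussed above (the variable names are placeholders):

#include <cstdint>

std::int32_t       exact_width;  // exactly 32 bits, two's complement; optional, may not exist
std::int_least32_t least_width;  // smallest type with at least 32 bits; always present
std::int_fast32_t  fast_width;   // "fastest" type with at least 32 bits; always present
int                plain_int;    // at least 16 bits; the platform's natural choice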
This is a good question, but hard to answer.
In one line: it depends on the context.
My rule of thumb:
I'd prefer code performance (speed first, then lower complexity).
When using an existing library, I'd follow that library's coding style (context).
When coding in a team, I'd follow the team's coding style (context).
When coding new things, I'd use int16_t, int32_t, int64_t, etc. whenever possible.
Explanation:
Using int (the system's word size) gives you performance in some contexts, but not in others.
I'd use uint64_t over unsigned long long because it is more concise, but again only whenever possible.
So it depends on the context.
A use that I have found for them is when I am bitpacking data for, say, an image compressor. Using these types that precisely specify the number of bytes can save a lot of headache, since the C++ standard does not explicitly define the number of bytes in its types, only the MIN and MAX ranges.
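As a rough illustration of that bit-packing use (the 5-6-5 RGB layout and the function name are invented for the example):

#include <cstdint>

// Exact-width types make the bit budget explicit when packing fields together.
std::uint16_t pack_rgb565(std::uint8_t r, std::uint8_t g, std::uint8_t b) {
    return static_cast<std::uint16_t>(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}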
In MISRA-C 2004 and MISRA-C++ 2008 guidelines, the advisory is to prefer specific-length typedefs:
typedefs that indicate size and signedness should be used in place of
the basic numerical types. [...]
This rule helps to clarify the size
of the storage, but does not guarantee portability because of the
asymmetric behaviour of integral promotion. [...]
The exception is the plain char type:
The plain char type shall be used only for the storage and use of character values.
However, keep in mind that the MISRA guidelines are for critical systems.
Personally, I follow these guidelines for embedded systems, not for computer applications where I simply use an int when I want an integer, letting the compiler optimize as it wants.

Is there any advantage of using non-fixed integers (int, long) instead of fixed-size ones (int64_t, int32_t)?

Maybe performance? I feel that using non-fixed integers just makes programs more complicated and prone to fail when porting to another architecture.
std::intN_t are provided only if the implementation can directly support them. So porting code that uses them can fail.
I would prefer std::int_fastN_t for general use because they have fewer restrictions and should be as fast as or faster than int.
Also, most C++ code uses int everywhere, so you might run into promotion weirdness when passing a std::int32_t into a function accepting an int, especially if int is only 16 bits wide.
Many APIs accept or return values of non-fixed types. For example, file descriptors are of type int, file offsets or sizes are of type off_t and strtol() returns a long. Blindly converting such values from or to fixed-size types is likely to cause overflow on some machine.
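As a small illustration of the point about strtol() returning long (parse_i32 is a made-up helper):

#include <cstdint>
#include <cstdlib>
#include <optional>

// strtol() returns long, whatever width long happens to be on the platform;
// narrowing it blindly into a fixed-width type can overflow, so check the range first.
std::optional<std::int32_t> parse_i32(const char* text) {
    long raw = std::strtol(text, nullptr, 10);
    if (raw < INT32_MIN || raw > INT32_MAX) return std::nullopt;
    return static_cast<std::int32_t>(raw);
}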
The guaranteed-width types (intN_t) are just typedefs for the appropriate 'standard' integer types. If a platform does not have an appropriate type (for example, it uses 36-bit integers), then it can't and mustn't provide the guaranteed-width typedefs.
This means that performance can hardly be an argument.
The general guideline for maximum portability (in this regard) is to use the 'standard' integer types by default and the guaranteed-width types only if your algorithm demands an exact number of bits.
The 'standard' integer types should be assumed to be only as wide as guaranteed by the relevant standards (if you only look at the C++ standard, that would be: an 8-bit char, a 16-bit int, a 32-bit long and, if your compiler supports it, a 64-bit long long).
If you have data where the size of your type is critical to its functionality, then you should use types with defined sizes. However, for a piece of code that stays well within the range you can reasonably expect of int (say, a 1 ... 1000 loop counter), there is no reason to use int32_t just to pin down the size of your variable. It will work just fine with a 16-, 32-, 64-, 36-, 18- or 49-bit integer, all the same. So let the compiler pick the size that is best.
There is a possibility that the compiler generates worse code for fixed size integers that aren't "best choice" for the architecture.
Obviously, any data that is presented over a network or in a file needs to have a fixed size. Likewise, if you have interfaces that require binary compatibility across the interface boundary, then using defined-size types is very useful to avoid the size becoming a problem.

How to guarantee a C++ type's number of bits

I am looking to typedef my own arithmetic types (e.g. Byte8, Int16, Int32, Float754, etc) with the intention of ensuring they comprise a specific number of bits (and in the case of the float, adhere to the IEEE754 format). How can I do this in a completely cross-platform way?
I have seen snippets of the C/C++ standards here and there and there is a lot of:
"type is at least x bytes"
and not very much of:
"type is exactly x bytes".
Given that typedef unsigned short int Int16 may not necessarily result in a 16-bit Int16, is there a cross-platform way to guarantee my types will have specific sizes?
You can use the exact-width integer types int8_t, int16_t, int32_t and int64_t declared in <cstdint>. This way the sizes are fixed on all platforms that provide these types.
The only available way to truly guarantee an exact number of bits is to use a bit-field:
struct X {
int abc : 14; // exactly 14 bits, regardless of platform
};
There is some upper limit on the size you can specify this way -- at least 16 bits for int, and 32 bits for long (but a modern platform may easily allow up to 64 bits for either). Note, however, that while this guarantees that arithmetic on X::abc will use (or at least emulate) exactly 14 bits, it does not guarantee that the size of a struct X is the minimum number of bytes necessary to provide 14 bits (e.g., given 8-bit bytes, its size could easily be 4 or 8 instead of the 2 that are absolutely necessary).
The C and C++ standards both now include a specification for fixed-size types (e.g., int8_t, int16_t), but no guarantee that they'll be present. They're required if the platform provides the right type, but otherwise won't be present. If memory serves, these are also required to use a 2's complement representation, so a platform with a 16-bit 1's complement integer type (for example) still won't define int16_t.
Have a look at the types declared in stdint.h. This is part of the standard library, so it is expected (though technically not guaranteed) to be available everywhere. Among the types declared here are int8_t, uint8_t, int16_t, uint16_t, int32_t, uint32_t, int64_t, and uint64_t. Local implementations will map these types to the appropriate-width type for the given compiler and architecture.
This is not possible.
There are platforms where char is 16 or even 32 bits.
Note that I'm not talking about platforms that exist only in theory... this is a real and quite concrete possibility (e.g. DSPs).
On that kind of hardware there is simply no way to use only 8 bits for an operation; if you need 8-bit modular arithmetic, for example, the only way is to do the masking yourself.
The C language doesn't provide this kind of emulation for you...
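A minimal sketch of the masking idea (the function name is invented):

// Emulate 8-bit wraparound on hardware whose smallest native unit is wider than 8 bits.
unsigned add_mod_256(unsigned a, unsigned b) {
    return (a + b) & 0xFFu;  // keep only the low 8 bits
}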
With C++ you could try to build a class that behaves like the expected native elementary type in most cases (with the exclusion of sizeof, obviously). The result will, however, have truly horrible performance.
I can think to no use case in which forcing the hardware this way against its nature would be a good idea.
It is possible to use C++ templates at compile time to check and create new types on the fly that do fit your requirements, specifically that sizeof() of the type is the correct size that you want.
Take a look at this code: Compile time "if".
Do note that if the requested type is not available then it is entirely possible that your program will simply not compile. It simply depends on whether that works for you or not!
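For illustration, a hedged sketch of that kind of compile-time selection using std::conditional (not the exact code from the linked answer):

#include <cstddef>
#include <cstdint>
#include <type_traits>

// Map a bit count onto one of the standard exact-width types, or fall through
// to void (which then fails to compile when used, mirroring the caveat above).
template <std::size_t Bits>
using exact_uint =
    std::conditional_t<Bits == 8,  std::uint8_t,
    std::conditional_t<Bits == 16, std::uint16_t,
    std::conditional_t<Bits == 32, std::uint32_t,
    std::conditional_t<Bits == 64, std::uint64_t, void>>>>;

static_assert(sizeof(exact_uint<32>) == 4, "assumes 8-bit bytes for the byte count");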

Converting size_t into long: is there any disadvantage?

Is there any disadvantage to converting size_t to long? I am writing a program that maintains a linked list in a file, so I traverse to another node based on a size_t, and I also keep track of the total number of lists as a size_t. Hence, there is obviously going to be some conversion between, or addition of, long and size_t. Is there any disadvantage to this? If there is, then I will make everything long instead of size_t, even the sizes. Please advise.
The "long" type, unfortunately, doesn't have a good theoretical basis. Originally it was introduced on 32 bit unix ports to differentiate it from the 16 bit "int" assumed by the existing PDP11 software. Then later "int" was changed to 32 bits on those platforms (and "short" was introduced) and "long" and "int" became synonyms, which they were for a very long time.
Now, on 64 bit unix-like platforms (Linux, the BSDs, OS X, iOS and whatever proprietary unixes people might still care about) "long" is a 64 bit quantity. But, sadly, not on windows: there was too much legacy "code" in the existing headers that made the sizeof(int)==sizeof(long) assumption, so they went with an abomination called "LLP64" and left long as 32 bits. Sigh.
But "size_t" isn't like that. It has always meant precisely one thing: it's the unsigned type that stores the native pointer size in the address space. If you have an unsigned (! -- use ssize_t or ptrdiff_t if you need signed arithmetic) pointer that needs an integer representation (i.e. you need to store the memory size of an object), this is what you use.
It's not a problem now, but it may be in the future depending on where you'll port your app. That's because size_t is defined to be large enough to hold the size of any object, so if you have 64-bit pointers, size_t will in practice be 64 bits too. Now, long may or may not be 64 bits, because the size rules for fundamental types in C/C++ leave room for variation.
But if you're to write these values to a file, you have to choose a specific size anyway, so there's no option other than convert to long (or long long, if needed). Better yet, use one of the new size-specific types like int32_t.
My advice: somewhere in the header of your file, store the sizeof for the type you converted the size_t to. By doing that, if in the future you decide to use a larger one, you can still support the old size. And for the current version of the program, you can check if the size is supported or not, and issue an error if not.
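A minimal sketch of that advice, assuming uint64_t as the chosen on-disk width (to_file_count is a made-up name):

#include <cstddef>
#include <cstdint>

// Pick one fixed on-disk width, record it in the file header, and check before narrowing.
std::uint64_t to_file_count(std::size_t n) {
    static_assert(sizeof(std::size_t) <= sizeof(std::uint64_t),
                  "size_t is wider than the on-disk field");
    return static_cast<std::uint64_t>(n);
}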
Is there any disadvantage of converting size_t to long?
Theoretically long can be smaller than size_t. Also, long is signed while size_t is unsigned, so if you start using them both in the same expression, a compiler like g++ will complain about it. A lot. Theoretically it might lead to unexpected errors due to signed-to-unsigned assignments.
obviously there is going to be some conversion or addition of long
I don't see why there has to be some conversion or addition involving long. You can keep using size_t for all arithmetic operations. You can typedef it as "ListIndex" or whatever and keep using it throughout the code. If you mix types (long and size_t), g++/mingw will nag you to death about it.
Alternatively, you could select a specific type which has a guaranteed size. Newer compilers have the cstdint header, which includes types like uint64_t (it is extremely unlikely that you will encounter a file larger than 2^64 bytes, for example). If your compiler doesn't have the header, it should be available in Boost.