Defining machine-independent datatypes in C++

Is there a way to do this?
#if sizeof(int) == 4
typedef unsigned int Integer32;
#else
typedef unsigned long Integer32;
#endif
or do you have to just #define integer size and compile different headers in?

If you need exact sizes you can use the intXX_t and uintXX_t variants, where XX is 8, 16, 32, or 64.
If you need types that are at least some size, use int_leastXX_t and uint_leastXX_t;
if you need fast, use int_fastXX_t and uint_fastXX_t.
You get these from <stdint.h>, which came in with C99. If you don't have C99 it's a little harder. You can't use sizeof(int) because the preprocessor doesn't know about types. So use INT_MAX and the other limit macros from <limits.h> to figure out whether a particular type is large enough for what you need.
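A minimal sketch of that fallback, written in C++ with <climits>. It assumes an exactly 32-bit type can be found among unsigned int and unsigned long. Integer32 is the name from the question; the #error branch is an addition for illustration.

```cpp
#include <climits>

// The preprocessor cannot evaluate sizeof, but it can compare the
// integer limit macros from <climits>.
#if UINT_MAX == 0xFFFFFFFF
typedef unsigned int Integer32;
#elif ULONG_MAX == 0xFFFFFFFF
typedef unsigned long Integer32;
#else
#error "No exactly-32-bit unsigned type among unsigned int and unsigned long"
#endif
```

Comparing against the maximum value checks the range of the type, which is what the preprocessor can see. It says nothing about padding bits.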

Related

is it possible to get definitive/absolute sized types in C/C++?

I've glossed over some documentation and it seems like the spec only requires 'int' or 'long' or whatever to be able to hold "at least some range of values" (often corresponding to the max range afforded by n bytes).
Anyways, is there a reasonable way to ask for an integer of exactly n bits/bytes? I don't even need a way to specify arbitrary length or anything weird, I'd just want a type with definitively 2 bytes, or definitively 4 bytes. like "int32" or something.
Currently, the way I'm dealing with this is by having a char array of n length, then casting it to an int * and dereferencing.
(My reasoning for wanting this has to do with reading/writing to files directly from structs- and I acknowledge that with this I'll have to worry about struct packing and endianness and stuff with that, but that's another issue...)
Also, "compatibility" with like super limited embedded systems is not a particular concern.
Thanks!
The C++11 standard defines integer types of definite size, provided they are available on the target architecture.
#include <cstdint>
std::int8_t c; // 8-bit signed integer
std::int16_t s; // 16-bit signed integer
std::int32_t i; // 32-bit signed integer
std::int64_t l; // 64-bit signed integer
and the corresponding unsigned types with
std::uint8_t uc; // 8-bit unsigned integer
std::uint16_t us; // 16-bit unsigned integer
std::uint32_t ui; // 32-bit unsigned integer
std::uint64_t ul; // 64-bit unsigned integer
As noted in the comments, these types are also available in C from the stdint.h header without the std:: namespace prefix:
#include <stdint.h>
uint32_t ui;
In addition to the types of definite size, these header files also define types
that are at least n bits wide but may be larger, e.g. int_least16_t with at least 16 bits
that provide the fastest implementation of integers with at least n bits but may be larger, e.g. std::int_fast32_t with at least 32 bits.
The types declared in <cstdint>, such as int32_t, will either be exactly that number of bits (32 in this example) or not exist if the architecture doesn't support values of that size. There are also types such as int_least32_t, which is guaranteed to hold a 32-bit value but could be larger, and int_fast32_t, which has a similar guarantee.
The current C++ standard provides fixed-width integer types like std::int16_t and std::uint16_t, where 16 means the type size in bits.
You can use the types from <stdint.h>, but you cannot be sure that there is exactly the type you want.
If your architecture has exact 16- and 32-bit types, which is highly likely, then you can use int16_t, uint16_t, int32_t and uint32_t. If not, the types int_fast32_t and uint_fast32_t as well as int_least32_t and uint_least32_t, etc. are always available.
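The question mentions casting a char array to an int * and dereferencing it. A hedged alternative, assuming std::uint32_t exists and the file stores the value least significant byte first, is to assemble the value from individual bytes. The helper names load_le32 and store_le32 are hypothetical and not from any library.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: decode a 32-bit value stored least significant byte first.
// Reading byte by byte avoids alignment and aliasing problems of pointer casts.
inline std::uint32_t load_le32(const unsigned char* p) {
    assert(p != nullptr);
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Hypothetical helper: the inverse operation, writing 4 bytes in the same order.
inline void store_le32(unsigned char* p, std::uint32_t v) {
    assert(p != nullptr);
    p[0] = static_cast<unsigned char>(v & 0xFFu);
    p[1] = static_cast<unsigned char>((v >> 8) & 0xFFu);
    p[2] = static_cast<unsigned char>((v >> 16) & 0xFFu);
    p[3] = static_cast<unsigned char>((v >> 24) & 0xFFu);
}
```

Because the byte order is spelled out in the shifts, the result does not depend on the endianness of the machine running the code.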

Why are stoi, stol not fixed width integers?

Since ints and longs and other integer types may be different sizes on different systems, why not have stouint8_t(), stoint64_t(), etc. so that portable string to int code could be written?
Because typing that would make me want to chop off my fingers.
Seriously, the basic integer types are int and long and the std::stoX functions are just very simple wrappers around strtol etc. and note that C doesn't provide strtoi32 or strtoi64 or anything that std::stouint32_t could wrap.
If you want something more complicated you can write it yourself.
I could just as well ask "why do people use int and long, instead of int32_t and int64_t everywhere, so the code is portable?" and the answer would be because it's not always necessary.
But the actual reason is probably that no one ever proposed it for the standard. Things don't just magically appear in the standard; someone has to write a proposal, justify adding them, and convince the rest of the committee to add them. So the answer to most "why isn't this thing I just thought of in the standard?" is that no one proposed it.
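For the "write it yourself" route, a minimal sketch could look like this. The name stoi32 is hypothetical and not part of the standard library. The sketch assumes std::int32_t exists on the target and simply range-checks the result of std::stoll.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <string>

// Hypothetical helper, not part of the standard library.
// Parses with std::stoll, then rejects values outside the int32_t range.
inline std::int32_t stoi32(const std::string& str,
                           std::size_t* pos = nullptr,
                           int base = 10) {
    assert(base == 0 || (base >= 2 && base <= 36));
    const long long value = std::stoll(str, pos, base);
    if (value < std::numeric_limits<std::int32_t>::min() ||
        value > std::numeric_limits<std::int32_t>::max()) {
        throw std::out_of_range("stoi32: value does not fit in std::int32_t");
    }
    return static_cast<std::int32_t>(value);
}
```

Like the std::stoX functions, it throws std::invalid_argument when nothing can be parsed, since that exception propagates from std::stoll.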
Because it's usually not necessary.
stoll and stoull return results of type long long and unsigned long long respectively. If you want to convert a string to int64_t, you can just call stoll() and store the result in your int64_t object; the value will be implicitly converted.
This assumes that long long is the widest signed integer type. Like C (starting with C99), C++ permits extended integer types, some of which might be wider than [unsigned] long long. C provides conversion functions strtoimax and strtoumax (operating on intmax_t and uintmax_t, respectively) in <inttypes.h>. For whatever reason, C++ doesn't provide wrappers for these functions (the logical names would be stoimax and stoumax).
But that's not going to matter unless you're using a C++ compiler that provides an extended integer type wider than [unsigned] long long, and I'm not aware that any such compilers actually exist. For any types no wider than 64 bits, the existing functions are all you need.
For example:
#include <iostream>
#include <string>
#include <cstdint>
int main() {
    const char *s = "0xdeadbeeffeedface";
    // Base 0 lets stoull detect the 0x prefix; nullptr means we ignore the end position.
    std::uint64_t u = std::stoull(s, nullptr, 0);
    std::cout << u << "\n";
}
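If a wrapper around strtoimax were wanted, a rough sketch might look like the following. The name stoimax is the hypothetical one suggested above; it is not a standard function, and the error handling only approximates what the std::stoX functions do.

```cpp
#include <cassert>
#include <cerrno>
#include <cinttypes>
#include <stdexcept>
#include <string>

// Hypothetical wrapper, not part of the standard library.
inline std::intmax_t stoimax(const std::string& str, int base = 10) {
    assert(base == 0 || (base >= 2 && base <= 36));
    errno = 0;
    char* end = nullptr;
    const std::intmax_t value = std::strtoimax(str.c_str(), &end, base);
    if (end == str.c_str()) {
        // No characters were consumed, so nothing was converted.
        throw std::invalid_argument("stoimax: no conversion");
    }
    if (errno == ERANGE) {
        throw std::out_of_range("stoimax: value out of range for intmax_t");
    }
    return value;
}
```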

Compiler Independent Types

I've seen several libraries and some C++ header files that provide compiler independent types but I don't understand quite why they are compiler independent.
For example:
int Number; // Not compiler Independent
typedef unsigned int U32;
U32 Number2; // Now this is compiler independent
Is this above true? If so, why? I don't quite understand why the usage of a typedef would mean that the size of Number2 is the same across compilers.
Elaborating on the comment,
Proposition : Use a typedef for compiler independence.
Rationale : Platform independence is a Good Thing
Implementation:
#ifdef _MSC_VER
#if _MSC_VER < 1400
typedef int bar;
#elif _MSC_VER < 1600
typedef char bar;
#else
typedef bool bar;
#endif
#else
#error "Unknown compiler"
#endif
The preprocessor macro chain is the important part not the typedef.
Disclaimer: I haven't compiled it!
I'm assuming that you meant for the types to be the same with unsigned int Number.
But no, these are exactly the same. Both declarations, Number and Number2, have the same type. Neither is more compiler independent than the other.
However, the point of using a typedef like this is so that the developers of the library can easily change the integer type used by all functions that use U32. If, for example, they are on a system where an unsigned int is not 32 bits, but an unsigned long is, they could change the typedef to:
typedef unsigned long U32;
In fact, it's possible to use the build system to conditionally change the typedef depending on the target platform.
However, if you want a nice standardised way to ensure that the type is a 32 bit unsigned integer type, I recommend using std::uint32_t from the <cstdint> header. However, this type is not guaranteed to exist if you're on a machine with no 32 bit integer type. Instead, you can use std::uint_least32_t, which will give you the smallest integer type with at least 32 bits.
As stated in the comments, the shown typedef is not compiler independent.
If you want a compiler independent way to get fixed sizes, you might want to use cstdint.
This header file comes with your compiler. Its least and fast types assure you a minimum size but no maximum, so a type may be wider than the width in its name.
If you want to be completely sure about all the sizes of your types, you need to check it.
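One way to do that check at compile time, sketched with C++11 static_assert. U32 is the typedef from the question; the checks themselves are an illustration, not something the libraries in question are known to do.

```cpp
#include <climits>
#include <limits>

typedef unsigned int U32;  // same alias as in the question

// Fail the build instead of silently getting a different width.
static_assert(CHAR_BIT * sizeof(U32) == 32,
              "U32 must occupy exactly 32 bits of storage");
static_assert(std::numeric_limits<U32>::digits == 32,
              "U32 must have exactly 32 value bits");
```

On a platform where unsigned int is not 32 bits wide, compilation stops at the first assertion, which is the cue to change the typedef.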

What is int8_t if a machine has > 8 bits per byte?

I was reading the C++ FAQ and it says
The C++ language guarantees a byte must always have at least 8 bits
So what does that mean for the <cstdint> types?
Side question - if I want an array of bytes should I use int8_t or char and why?
C++ (and C as well) defines intX_t (i.e. the exact width integer types) typedefs as optional. So, it just won't be there if there is no addressable unit that's exactly 8-bit wide.
If you want an array of bytes, you should use char, as sizeof(char) (and likewise for signed char and unsigned char) is well-defined to always be 1 byte.
To add to what Cat Plus Plus has already said (that the type is
optional), you can test whether it is present by using something like:
#ifdef INT8_MAX
// type int8_t exists.
#endif
or more likely:
#ifndef INT8_MAX
#error "Machines whose bytes are not 8 bits wide are not supported"
#endif
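A small sketch of the "use char for bytes" advice. The names buffer and buffer_bits are made up for illustration.

```cpp
#include <climits>
#include <cstddef>

// sizeof counts in units of char, so a char array's size is its element count.
unsigned char buffer[16];
static_assert(sizeof(unsigned char) == 1, "sizeof(unsigned char) is 1 by definition");
static_assert(sizeof(buffer) == 16, "16 elements occupy 16 bytes");

// Number of bits the buffer holds; this is 128 only when CHAR_BIT is 8.
constexpr std::size_t buffer_bits = sizeof(buffer) * CHAR_BIT;
```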

size guarantee for integral/arithmetic types in C and C++

I know that the C++ standard explicitly guarantees the size of only char, signed char and unsigned char. Also it gives guarantees that, say, short is at least as big as char, int as big as short etc. But no explicit guarantees about absolute value of, say, sizeof(int). This was the info in my head and I lived happily with it. Some time ago, however, I came across a comment in SO (can't find it) that in C long is guaranteed to be at least 4 bytes, and that requirement is "inherited" by C++. Is that the case? If so, what other implicit guarantees do we have for the sizes of arithmetic types in C++? Please note that I am absolutely not interested in practical guarantees across different platforms in this question, just theoretical ones.
18.2.2 guarantees that <climits> has the same contents as the C library header <limits.h>.
The ISO C90 standard is tricky to get hold of, which is a shame considering that C++ relies on it, but the section "Numerical limits" (numbered 2.2.4.2 in a random draft I tracked down on one occasion and have lying around) gives minimum values for the INT_MAX etc. constants in <limits.h>. For example ULONG_MAX must be at least 4294967295, from which we deduce that the width of long is at least 32 bits.
There are similar restrictions in the C99 standard, but of course those aren't the ones referenced by C++03.
This does not guarantee that long is at least 4 bytes, since in C and C++ "byte" is basically defined to mean "char", and it is not guaranteed that CHAR_BIT is 8 in C or C++. CHAR_BIT == 8 is guaranteed by both POSIX and Windows.
Don't know about C++. In C you have
Annex E
(informative)
Implementation limits
[#1] The contents of the header <limits.h> are given below,
in alphabetical order. The minimum magnitudes shown shall
be replaced by implementation-defined magnitudes with the
same sign. The values shall all be constant expressions
suitable for use in #if preprocessing directives. The
components are described further in 5.2.4.2.1.
#define CHAR_BIT 8
#define CHAR_MAX UCHAR_MAX or SCHAR_MAX
#define CHAR_MIN 0 or SCHAR_MIN
#define INT_MAX +32767
#define INT_MIN -32767
#define LONG_MAX +2147483647
#define LONG_MIN -2147483647
#define LLONG_MAX +9223372036854775807
#define LLONG_MIN -9223372036854775807
#define MB_LEN_MAX 1
#define SCHAR_MAX +127
#define SCHAR_MIN -127
#define SHRT_MAX +32767
#define SHRT_MIN -32767
#define UCHAR_MAX 255
#define USHRT_MAX 65535
#define UINT_MAX 65535
#define ULONG_MAX 4294967295
#define ULLONG_MAX 18446744073709551615
So char <= short <= int <= long <= long long
and
CHAR_BIT * sizeof (char) >= 8
CHAR_BIT * sizeof (short) >= 16
CHAR_BIT * sizeof (int) >= 16
CHAR_BIT * sizeof (long) >= 32
CHAR_BIT * sizeof (long long) >= 64
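These minimums can be written down as compile-time checks. The sketch below uses C++11 static_assert with <climits>; on a conforming implementation every assertion should hold, so it mainly documents the guarantees.

```cpp
#include <climits>

// Minimum magnitudes required by the C standard.
static_assert(CHAR_BIT >= 8, "a byte has at least 8 bits");
static_assert(SHRT_MAX >= 32767, "short covers at least 16 bits");
static_assert(INT_MAX >= 32767, "int covers at least 16 bits");
static_assert(LONG_MAX >= 2147483647L, "long covers at least 32 bits");
static_assert(LLONG_MAX >= 9223372036854775807LL, "long long covers at least 64 bits");

// The same guarantees expressed through storage size.
static_assert(CHAR_BIT * sizeof(short) >= 16, "short storage");
static_assert(CHAR_BIT * sizeof(int) >= 16, "int storage");
static_assert(CHAR_BIT * sizeof(long) >= 32, "long storage");
static_assert(CHAR_BIT * sizeof(long long) >= 64, "long long storage");
```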
Yes, C++ type sizes are inherited from C89.
I can't find the specification right now. But it's in the Bible.
Be aware that the guaranteed ranges of these types are one less wide than on most machines:
signed char -127 ... +127 guaranteed, but most two's complement machines have -128 ... +127
Likewise for the larger types.
There are several inaccuracies in what you read. These inaccuracies were either present in the source, or maybe you remembered it all incorrectly.
Firstly, a pedantic remark about one peculiar difference between C and C++. C language does not make any guarantees about the relative sizes of integer types (in bytes). C language only makes guarantees about their relative ranges. It is true that the range of int is always at least as large as the range of short and so on. However, it is formally allowed by C standard to have sizeof(short) > sizeof(int). In such case the extra bits in short would serve as padding bits, not used for value representation. Obviously, this is something that is merely allowed by the legal language in the standard, not something anyone is likely to encounter in practice.
In C++ on the other hand, the language specification makes guarantees about both the relative ranges and relative sizes of the types, so in C++, in addition to the above range relationship inherited from C, it is guaranteed that sizeof(int) is greater than or equal to sizeof(short).
Secondly, the C language standard guarantees minimum range for each integer type (these guarantees are present in both C and C++). Knowing the minimum range for the given type, you can always say how many value-forming bits this type is required to have (as minimum number of bits). For example, it is true that type long is required to have at least 32 value-forming bits in order to satisfy its range requirements. If you want to recalculate that into bytes, it will depend on what you understand under the term byte. If you are talking specifically about 8-bit bytes, then indeed type long will always consist of at least four 8-bit bytes. However, that does not mean that sizeof(long) is always at least 4, since in C/C++ terminology the term byte refers to char objects. char objects are not limited to 8-bits. It is quite possible to have 32-bit char type in some implementation, meaning that sizeof(long) in C/C++ bytes can legally be 1, for example.
The C standard does not explicitly say that long has to be at least 4 bytes, but it does specify a minimum range for the different integral types, which implies a minimum size.
For example, the minimum range of an unsigned long is 0 to 4,294,967,295. You need at least 32 bits to represent every single number in that range. So yes, the standard guarantees (indirectly) that a long is at least 32 bits.
C++ inherits the data types from C, so you have to go look at the C standard. The C++ standard actually refers to parts of the C standard in this case.
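The step from a minimum range to a minimum bit count can be sketched as a small constexpr function. The name bits_needed is made up for this illustration.

```cpp
// Number of value bits required to represent every value in 0..max_value.
// Written as a single return statement so it is valid C++11 constexpr.
constexpr int bits_needed(unsigned long long max_value) {
    return max_value == 0 ? 0 : 1 + bits_needed(max_value >> 1);
}

// The minimum ULONG_MAX of 4294967295 forces at least 32 value bits.
static_assert(bits_needed(4294967295ULL) == 32, "unsigned long needs 32 bits");
// The minimum UINT_MAX of 65535 forces at least 16 value bits.
static_assert(bits_needed(65535ULL) == 16, "unsigned int needs 16 bits");
```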
Just be careful about the fact that some machines have chars that are more than 8 bits. For example, IIRC on the TI C5x, a long is 32 bits, but sizeof(long)==2 because chars, shorts and ints are all 16 bits with sizeof(char)==1.
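To compare widths across such machines, it can help to count bits rather than bytes. A minimal sketch, with storage_bits as a made-up helper name:

```cpp
#include <climits>
#include <cstddef>

// Storage size of T in bits: sizeof counts chars, CHAR_BIT counts bits per char.
template <typename T>
constexpr std::size_t storage_bits() {
    return CHAR_BIT * sizeof(T);
}

// Holds on every conforming implementation, whatever CHAR_BIT is.
static_assert(storage_bits<char>() == CHAR_BIT, "char is exactly one byte");
static_assert(storage_bits<long>() >= 32, "long has at least 32 bits of storage");
```

With 16-bit chars and sizeof(long) == 2, as described for the TI C5x above, storage_bits<long>() would come out as 32 even though sizeof(long) is less than 4.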