How to guarantee long is 4 bytes - c++

In C++, is there any way to guarantee that a long is 4 bytes? Perhaps a compiler flag for g++?
We are reusing some Windows code in a Linux program, and on Windows a long is 4 bytes, but on my Linux machine a long is 8 bytes. We can't go and change all the longs to ints because that would break the Windows code.
The reason I need to guarantee longs are 4 bytes is that in certain parts of the code we have a union of a struct and a char array, and when compiling for 64-bit Linux, the char array does not line up with the struct.
Here are some code snippets for clarification:
struct ABC {
    unsigned long data1;
    unsigned short data2;
    unsigned short data3;
    unsigned char data4[8];
};
// sizeof(ABC) (32-bit): 16, sizeof(ABC) (64-bit): 24
union ABC_U {
    ABC abc;
    unsigned char bytes[16];
};
EDIT:
I forgot to mention that this problem only came up when trying to compile for 64-bit. Windows keeps longs 4 bytes regardless of architecture, whereas g++ on Linux usually makes longs the same size as pointers.
I'm leaning towards using a uint32_t here because this particular structure isn't used in the Windows code, and that wouldn't affect the program globally. Hopefully there aren't any other sections of the code where this will be a problem.
I found the compiler flag -mlong32, but this also forces pointers to be 32 bits, which is undesirable, and since it forces nonstandard behavior it would likely break the ABI, as PascalCuoq mentioned.

You can use int32_t (or uint32_t for the unsigned members) from <stdint.h>.
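For example, a minimal sketch of the struct from the question rewritten with fixed-width types (assuming a C++11 compiler for the static_assert; the check simply makes the layout assumption explicit):
#include <cstdint>

struct ABC {
    std::uint32_t data1;   // was unsigned long: 4 bytes on Windows, 8 on 64-bit Linux
    std::uint16_t data2;
    std::uint16_t data3;
    std::uint8_t  data4[8];
};

union ABC_U {
    ABC abc;
    unsigned char bytes[16];
};

static_assert(sizeof(ABC) == 16, "ABC is expected to be 16 bytes on both platforms");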

Instead of using int and long, you will likely want to create a header file that uses typedefs (preferred) or preprocessor macros to define common type names for you to use.
#ifdef _WINDOWS
typedef unsigned long uint32;
#else
typedef unsigned int uint32;
#endif
In your union, you would use uint32 instead of long.
The header file stdint.h does exactly this, but is not always installed as a standard header file with every compiler. Visual Studio (for example) does not have it by default. You can download it from http://msinttypes.googlecode.com/svn/trunk/stdint.h if you would prefer to use it instead.

You want the compiler to generate code where long is 32-bit.
Compile the entire codebase with -m32. This will generate code for the old IA-32 instruction set, which is still widely supported library-wise (*), and where long is traditionally 32-bit like int.
(*) You may need to install these libraries in your Linux distribution, and you may have to ask your users to install them.
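If you go that route, it can also help to encode the assumption directly in the source so a 64-bit build fails at compile time instead of silently misaligning the union (a sketch, assuming a C++11 compiler for static_assert):
static_assert(sizeof(long) == 4, "this code assumes a 4-byte long; build with -m32");
and then build with something like:
g++ -m32 -o myprog main.cpp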

Related

x86-64 MSVC++/Intel C++ change size of int, long, etc

I wish to have the following sizes when I compile (using Visual C++ 2015 and/or Intel C++ 16.0)
char 32 bits unsigned (for UTF-32 characters)
short 32 bits
int 64 bits
long 128 bits
Pointers and size_t 64 bits (which they are currently)
Is this possible to change? My current solution uses the macros:
#define int int64_t
#define char char32_t // Is this unsigned?
#define short int32_t
#define long __int128
But it has problems: for example, "int main" doesn't work... And I can't define "signed int", "unsigned int", etc., as macro names can't have spaces.
EDIT: The reason I want to do this is to improve legibility (so I don't have to write int64_t...) and also to make any code I use, that uses int/char/short/long to automatically upgrade (when recompiling) to using 64/32/32/128 bits, without having to modify it directly.
You cannot do this. The only proper way to achieve this is by introducing your own types and using them instead.
Also, when using types like int you must not depend on the underlying size beyond what the standard says (i.e. in the case of int the only guarantee is that it's at least 16 bits). What you want to achieve is a dependency you shouldn't have, and it would make your code completely unportable. Besides, I don't see why int64_t would be less legible than int. Also, the redefinition you want would come as a surprise to other developers and is therefore likely to cause bugs. Using your own types makes it explicit that the types are different.
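A minimal sketch of that approach (the alias names are made up for illustration; __int128 is a GCC/Clang extension, not standard C++):
#include <cstdint>

using MyChar  = char32_t;        // 32-bit code units for UTF-32 text
using MyShort = std::int32_t;
using MyInt   = std::int64_t;
#if defined(__GNUC__) || defined(__clang__)
using MyLong  = __int128;        // only where the compiler provides a 128-bit type
#endif

MyInt counter = 0;               // the width is explicit at every use site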
It's not necessary to use a macro when you want a shorter name for unsigned int; you can write code like the following:
typedef unsigned int UINT;
Alternatively, you can also write code like this:
#define UINT balabala

fixed length data types in C/C++

I've heard that the size of data types such as int may vary across platforms.
My first question is: can someone give an example of what goes wrong when a program
assumes an int is 4 bytes, but on a different platform it is, say, 2 bytes?
Another question I had is related. I know people solve this issue with typedefs,
so you have types like u8, u16, u32 - which are guaranteed to be 8 bits, 16 bits, and 32 bits regardless of the platform - my question is, how is this usually achieved? (I am not referring to the types from the stdint header - I am curious how one can manually enforce that some type is always, say, 32 bits regardless of the platform?)
I know people solve this issue with typedefs, so you have types like u8, u16, u32 - which are guaranteed to be 8 bits, 16 bits, and 32 bits regardless of the platform
There are some platforms which have no types of a certain size (for example TI's 28xxx, where the size of char is 16 bits). In such cases, it is not possible to have an 8-bit type (unless you really want one, but that may introduce a performance hit).
how is this usually achieved?
Usually with typedefs. C99 (and C++11) provide these typedefs in a header (<stdint.h> / <cstdint>). So, just use them.
can someone give an example of what goes wrong when a program assumes an int is 4 bytes, but on a different platform it is, say, 2 bytes?
The best example is communication between systems with different type sizes. When sending an array of ints from one platform to another where sizeof(int) differs, one has to take extreme care.
Also, saving an array of ints in a binary file on a 32-bit platform and reinterpreting it on a 64-bit platform.
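For instance, a sketch of the safer pattern: serialize fixed-width types so both sides agree on the element size (the file name is made up; byte order still has to be agreed on separately):
#include <cstdint>
#include <cstdio>

int main() {
    std::int32_t values[4] = {1, 2, 3, 4};
    std::FILE* fp = std::fopen("data.bin", "wb");
    if (fp) {
        // every platform writes exactly 4 bytes per element,
        // unlike fwrite(values, sizeof(int), ...) where sizeof(int) may differ
        std::fwrite(values, sizeof(std::int32_t), 4, fp);
        std::fclose(fp);
    }
    return 0;
}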
In earlier iterations of the C standard, you generally made your own typedef statements to ensure you got a (for example) 16-bit type, based on #define values passed into the compiler, for example:
gcc -DINT16_IS_LONG ...
Nowadays (C99 and above), there are specific types such as uint16_t, the exactly 16-bit wide unsigned integer.
Provided you include stdint.h, you get exact-width types, at-least-that-width types, fastest types with a given minimum width, and so on, as documented in C99 7.18 Integer types <stdint.h>. If an implementation has compatible types, it is required to provide these.
Also very useful is inttypes.h which adds some other neat features for format conversion of these new types (printf and scanf format strings).
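For example (a small sketch; the PRI macros come from <cinttypes> in C++ or <inttypes.h> in C):
#include <cinttypes>
#include <cstdio>

int main() {
    std::uint32_t u = 4000000000u;
    std::int64_t  s = -1234567890123LL;
    // PRIu32 / PRId64 expand to the correct printf length modifiers
    // for whatever the underlying types are on this platform
    std::printf("u = %" PRIu32 ", s = %" PRId64 "\n", u, s);
    return 0;
}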
For the first question: Integer Overflow.
For the second question: for example, to typedef an unsigned 32-bit integer on a platform where int is 4 bytes, use:
typedef unsigned int u32;
On a platform where int is 2 bytes while long is 4 bytes:
typedef unsigned long u32;
In this way, you only need to modify one header file to make the types cross-platform.
If there are platform-specific macros, this can be achieved without manual modification:
#if defined(PLAT1)
typedef unsigned int u32;
#elif defined(PLAT2)
typedef unsigned long u32;
#endif
If C99 stdint.h is supported, it's preferred.
First of all: Never write programs that rely on the width of types like short, int, unsigned int,....
Basically: "never rely on the width, if it isn't guaranteed by the standard".
If you want to be truly platform independent and store e.g. the value 33000 as a signed integer, you can't just assume that an int will hold it. An int is only guaranteed a range of at least -32767 to 32767 or -32768 to 32767 (depending on ones' or two's complement). That's just not enough, even though an int usually is 32 bits and therefore capable of storing 33000. For this value you definitely need a type wider than 16 bits, so you simply choose int32_t or int64_t. If this type doesn't exist, the compiler will report an error, but it won't be a silent mistake.
Second: C++11 provides a standard header for fixed-width integer types. None of these are guaranteed to exist on your platform, but when they exist, they are guaranteed to be of the exact width. See this article on cppreference.com for a reference. The types are named in the format int[n]_t and uint[n]_t where n is 8, 16, 32 or 64. You'll need to include the header <cstdint>. The C header is of course <stdint.h>.
Usually, the issue happens when you max out the number or when you're serializing. A less common scenario happens when someone makes an explicit size assumption.
In the first scenario:
int x = 32000;
int y = 32000;
int z = x+y; // can cause overflow for 2 bytes, but not 4
In the second scenario,
struct header {
    int magic;
    int w;
    int h;
};
then one goes to fwrite:
header h;
// fill in h
fwrite(&h, sizeof(h), 1, fp);
// this is all fine and good until one freads from an architecture with a different int size
In the third scenario:
int* x = new int[100];
char* buff = (char*)x;
// now try to change the 3rd element of x via buff assuming int size of 2
*((int*)(buff+2*2)) = 100;
// (of course, it's easy to fix this with sizeof(int))
If you're using a relatively new compiler, I would use uint8_t, int8_t, etc. in order to be assure of the type size.
In older compilers, typedef is usually defined on a per platform basis. For example, one may do:
#ifdef _WIN32
typedef unsigned char uint8_t;
typedef unsigned short uint16_t;
// and so on...
#endif
In this way, there would be a header per platform that defines specifics of that platform.
I am curious how one can manually enforce that some type is always, say, 32 bits regardless of the platform?
If you want your (modern) C++ program's compilation to fail if a given type is not the width you expect, add a static_assert somewhere. I'd add this around where the assumptions about the type's width are being made.
static_assert(sizeof(int) == 4, "Expected int to be four chars wide but it was not.");
chars on most commonly used platforms are 8 bits wide, but not all platforms work this way.
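If that assumption matters for your code too, it can be checked the same way (a sketch):
#include <climits>

static_assert(CHAR_BIT == 8, "Expected 8-bit chars but this platform differs.");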
Well, first example - something like this:
int a = 45000; // both a and b
int b = 40000; // does not fit in 2 bytes.
int c = a + b; // overflows on 16bits, but not on 32bits
If you look into the cstdint header, you will find how all fixed-size types (int8_t, uint8_t, etc.) are defined - the only thing that differs between architectures is this header file. So, on one architecture int16_t could be:
typedef int int16_t;
and on another:
typedef short int16_t;
Also, there are other types which may be useful, like int_least16_t.
If a type is smaller than you think then it may not be able to store a value you need to store in it.
To create fixed-size types, you read the documentation for the platforms to be supported and then define typedefs based on #ifdef for the specific platforms.
can someone give an example of what goes wrong when a program assumes an int is 4 bytes, but on a different platform it is, say, 2 bytes?
Say you've designed your program to read 100,000 inputs, and you're counting them using an unsigned int, assuming a size of 32 bits (a 32-bit unsigned int can count up to 4,294,967,295). If you compile the code on a platform (or compiler) with 16-bit integers (a 16-bit unsigned int can only count up to 65,535), the value will wrap around past 65,535 and give a wrong count.
Compilers are responsible for obeying the standard. When you include <cstdint> or <stdint.h>, they shall provide types with the sizes the standard specifies.
Compilers know which platform they're compiling for, so they can use internal macros or magic to build the suitable types. For example, a compiler targeting a 32-bit machine might generate a __32BIT__ macro and have these lines in its stdint header file:
#ifdef __32BIT__
typedef __int32_internal__ int32_t;
typedef __int64_internal__ int64_t;
...
#endif
and you can use it.
Bit flags are the trivial example: 0x10000 will cause you problems; you can't mask with it or check if a bit is set in the 17th position if everything is being truncated or smashed to fit into 16 bits.
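A minimal sketch of that failure mode (assuming, for illustration, that the flags end up in a 16-bit variable):
#include <cstdint>
#include <cstdio>

int main() {
    std::uint16_t flags = 0;
    flags |= 0x10000;                      // bit 17 is truncated away: flags stays 0
    bool set = (flags & 0x10000) != 0;     // always false, the mask can never match
    std::printf("flag set? %d\n", set ? 1 : 0);
    return 0;
}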

How to check if uint8_t exists as a type, instead of unsigned char?

I have two compilers, one that recognizes uint8_t (GCC ARM-EABI), and one that doesn't (Renesas M16 Standard Toolchain).
The Renesas toolchain is NOT ANSI C compliant, so you can throw out <stdint.h>. So uint8_t, uint16_t, ... aren't defined as existing types.
In order to maintain portability, I would like to have the same types(preferably uint8_t, due to the ambiguity of int).
Also, my platforms have different processor sizes (ARM is 32-bit, and Renesas is 16-bit), causing int to have different sizes.
Is there a way to check if uint8_t exists as a type?
And if not, declare it(and others uint16_t, uint32_t,...) as a type?
Is there a way to check if uint8_t exists as a type?
Use:
#include <stdint.h>
#ifdef UINT8_MAX
...
#endif
uint8_t is not a built-in type, it is defined in stdint.h. So it is not a matter of the compiler "recognising" uint8_t, but rather simply a case of making stdint.h available.
If your toolchain does not provide stdint.h, you can easily provide your own implementation, using the compiler documentation to determine the built-in types that correspond to the specific sizes. On the toolchain without stdint.h you simply provide your own to the project, on the toolchain with stdint.h you don't. That way the code (other than stdint.h itself) will be identical across platforms - you don't need to conditionally define uint8_t.
One problem you may come across (on some TI DSPs, for example) is that memory may not be 8-bit addressable and a char will be 16 bits (or larger). In that case uint8_t, or indeed any 8-bit integer type, will not be supported at all. A char is always the smallest data type on a specific platform, but it may be larger than 8 bits.
There are a few different ways to deal with this. In an open source project that needs to be portable, the common solution is to have a "configure script", which is run to set up the build system. It would then have something like HAVE_UINTX_TYPES that is set or not set in some config.h or similar (one of the outputs of the "configure script"), and do something like this:
#include "config.h"
...
#ifndef HAVE_UINTX_TYPES
#include "uintx_types.h"
#endif
In a less "needs to run on almost anything" system, you could solve the same problem by simply having -DHAVE_UINTX_TYPES as part of the flags to the compiler. And since you (presumably) have some part of the build system that sets different compile options, picks a different compiler, etc., for the two different builds, this shouldn't be a big issue to add.
And assuming that you are happy that your unsigned char is indeed 8 bits, you could also have a uintx_types.h that contains something like this:
typedef unsigned char uint8_t;
typedef unsigned short uint16_t;
typedef unsigned long uint32_t;
Another option is to not use uint8_t and uint16_t etc directly, but have your own definitions [and have these depend on the appropriate build setting for "is it ARM or Renesas", e.g. by using different include options]:
ARM/types.h:
typedef unsigned char u_int8;
typedef unsigned short u_int16;
typedef unsigned int u_int32;
Renesas/types.h:
typedef unsigned char u_int8;
typedef unsigned int u_int16;
typedef unsigned long u_int32;
If uint8_t doesn't exist, it's either because the implementation does not conform to C99, or because it has no type that meets the requirements. The latter probably means CHAR_BIT > 8 (which is vanishingly rare outside embedded systems).
#if __STDC_VERSION__ >= 199901L
#include <stdint.h>
#ifdef UINT8_MAX
// uint8_t exists
#else
// uint8_t doesn't exist in <stdint.h>
#endif
#else
// uint8_t probably doesn't exist because this isn't a C99 or better compiler
#endif
It's possible for an implementation that doesn't fully conform to C99 to provide <stdint.h> and uint8_t as an extension. That's difficult to detect, because there's no conditional #include directive; a #include either includes the requested header or fails.
But you can detect it by going outside the C language, using some kind of configuration script. If this compiles:
#include <stdint.h>
uint8_t dummy;
then uint8_t exists; otherwise it doesn't.
The main point of uint8_t is that it won't exist on platforms that don't support an unsigned integral type with exactly 8 bits. If unsigned char is acceptable, even though it might be larger than 8 bits, then don't use uint8_t. There's nothing to be gained by using unsigned char on one platform and uint8_t on another.

crossplatform 64 bit type

Is there a 64-bit type that on every OS (32/64-bit) and for every compiler has a size of 64 bits?
The same question also applies to a 32-bit type. (Should it be int?)
The origin of the question is: I am implementing a system which has 2 kinds of instructions:
32 bit
64 bit
I want to write something like:
typedef int instruction32bit;
typedef long long instruction64bit; // it is not correct - some systems have a 128-bit long long
You are looking for int64_t and int32_t, or their unsigned friends uint64_t and uint32_t. Include either cinttypes or cstdint.
If you want your code to be truly portable, then you probably want to typedef your own type, and use for example
typedef int32_t instruction32bit;
typedef int64_t instruction64bit;
This will work MOST of the time, but if it doesn't for a particular system/compiler/whatever, you can do something like this:
#ifdef SOMEDEFINE
typedef long long int instruction64bit;
typedef int instruction32bit;
#else
typedef int32_t instruction32bit;
typedef int64_t instruction64bit;
#endif
Of course, for each model of compiler/OS (or group thereof) that doesn't support int32_t and int64_t, you probably will need a special #ifdef.
This is exactly what all truly portable code does, because no matter how much you find that "nearly all compilers do X", if your code gets popular enough, there's always someone who wants to compile it with "Bob's Compiler Project", which doesn't have this feature. Of course, the other option is to just let those who use "Bob's compiler" edit the typedef themselves, and not accept the "For Bob's compiler, you need this ..." patch that inevitably gets sent your way.
As Carl Norum points out in a comment, it may be possible to convert the #ifdef to a #if in many cases, and then use generic types such as int and long.
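A sketch of that #if approach, testing the limits macros from <climits> rather than a platform name (the typedef name follows the one used in the question):
#include <climits>

#if INT_MAX == 0x7FFFFFFF
typedef int  instruction32bit;         // int is 32 bits on this platform
#elif LONG_MAX == 0x7FFFFFFF
typedef long instruction32bit;         // fall back to long where int is narrower
#else
#error "no 32-bit signed integer type found among int/long"
#endif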
Use uint_least32_t and uint_least64_t. The fixed-size types uint32_t and uint64_t will not exist on systems that don't have the exact sizes they describe.

C++: Datatypes, which to use and when?

I've been told that I should always use size_t when I want a 32-bit unsigned int. I don't quite understand why, but I think it has something to do with the fact that if someone compiles the program on a 16- or 64-bit machine, the unsigned int would become 16 or 64 bits but size_t won't - but why doesn't it? And how can I force the bit sizes to be exactly what I want?
So, where is the list of which datatype to use and when? For example, is there a size_t alternative to unsigned short? Or for a 32-bit int? How can I be sure my datatypes have as many bits as I chose in the first place, without worrying about different bit sizes on other machines?
Mostly I care more about the memory used than about the marginal speed boost I would get from doubling the memory usage, since I don't have much RAM. So I want to stop worrying about whether everything will break if my program is compiled on a machine that's not 32-bit. For now I've always used size_t when I want it to be 32-bit, but for short I don't know what to do. Someone help me clear my head.
On the other hand: if I need a 64-bit variable, can I use it successfully on a 32-bit machine? And what is that datatype's name (if I want it to always be 64-bit)?
size_t is for storing object sizes. It is of exactly the right size for that and only that purpose - 4 bytes on 32-bit systems and 8 bytes on 64-bit systems. You shouldn't confuse it with unsigned int or any other datatype. It might be equivalent to unsigned int or it might not be, depending on the implementation (system bitness included).
Once you need to store something other than an object size you shouldn't use size_t and should instead use some other datatype.
As a side note: For containers, to indicate their size, don't use size_t, use container<...>::size_type
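For example (a small sketch):
#include <vector>

int main() {
    std::vector<int> v(10, 0);
    // use the container's own size_type rather than assuming it is size_t
    for (std::vector<int>::size_type i = 0; i < v.size(); ++i) {
        v[i] = static_cast<int>(i);
    }
    return 0;
}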
boost/cstdint.hpp can be used to be sure integers have right size.
size_t is not necessarily 32-bit. It has been 16-bit with some compilers. It's 64-bit on a 64-bit system.
The C++ standard guarantees, via reference down to the C standard, that long is at least 32 bits.
int is only formally guaranteed 16 bits, but in practice I wouldn't worry: the chance that any ordinary code will be used on a 16-bit system is slim indeed, and on any 32-bit system int is 32-bit. Of course it's different if you're coding for a 16-bit system like some embedded computer. But in that case you'd probably be writing system-specific code anyway.
Where you need exact sizes you can use <stdint.h> if your compiler supports that header (it was introduced in C99, and the current C++ standard stems from 1998), or alternatively the corresponding Boost library header boost/cstdint.hpp.
However, in general, just use int. ;-)
Cheers & hth.,
size_t is not always 32-bit. E.g. It's 64-bit on 64-bit platforms.
For fixed-size integers, stdint.h is best. But it doesn't come with VS2008 or earlier - you have to download it separately. (It comes as a standard part of VS2010 and most other compilers).
Since you're using VS2008, you can use the MS-specific __int32, unsigned __int32 etc types. Documentation here.
To answer the 64-bit question: Most modern compilers have a 64-bit type, even on 32-bit systems. The compiler will do some magic to make it work. For Microsoft compilers, you can just use the __int64 or unsigned __int64 types.
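A sketch of those MSVC-specific spellings (they only compile with Microsoft's compiler and a few compatible ones):
__int32          a = 0;    // exactly 32 bits
unsigned __int32 b = 0;
__int64          c = 0;    // exactly 64 bits, even in a 32-bit build
unsigned __int64 d = 0;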
Unfortunately, one of the quirks of the nature of data types is that it depends a great deal on which compiler you're using. Naturally, if you're only compiling for one target, there is no need to worry - just find out how large the type is using sizeof(...).
If you need to cross-compile, you could ensure compatibility by defining your own typedefs for each target (surrounded by #ifdef blocks referencing which target you're cross-compiling to).
If you're ever concerned that it could be compiled on a system that uses types with even weirder sizes than you have anticipated, you could always assert(sizeof(short)==2) or equivalent, so that you could guarantee at runtime that you're using the correctly sized types.
Your question is tagged visual-studio-2008, so I would recommend looking in the documentation for that compiler for pre-defined data types. Microsoft has a number that are predefined, such as BYTE, DWORD, and LARGE_INTEGER.
Take a look in windef.h and winnt.h for more.