Is there a standard way to determine at compile-time if system is 32 or 64 bit? - c++

I need to set up #ifdef checks for conditional compilation. I want to automate the process, but I cannot specify the target OS/machine. Is there some way the preprocessor can resolve whether it is building for a 32-bit or 64-bit target?
(Explanation) I need to define a type that is 64 bits in size. On a 64-bit OS it is a long; on most others it is a long long.
I found this answer - is this the correct way to go?
[edit] a handy reference for compiler macros

The only compile-time check you can do reliably is sizeof(void*) == 8, which is true for x64 and false for x86. It is a constant expression and you can pass it to templates, but you can forget about using it with #ifdef. There is no platform-independent way to know the address size of the target architecture at preprocessing time; you will need to ask your compiler or IDE for one. The Standard doesn't even have the concept of an address size.
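A minimal sketch of that idea (assuming C++11 for static_assert and std::conditional; the names is_64bit and machine_word are just illustrative): the pointer-size check works in static_assert and as a template argument, but not in #if, because the preprocessor knows nothing about types.
#include <cstdint>
#include <type_traits>

constexpr bool is_64bit = sizeof(void*) == 8;  // assumption: 8-byte pointers mean a 64-bit target

static_assert(sizeof(void*) == 4 || is_64bit,
              "expected a 32-bit or 64-bit target");

// The same constant can select a type through templates:
using machine_word = std::conditional<is_64bit, std::uint64_t, std::uint32_t>::type;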

No there is no standard language support for macro to determine if the machine is a 64-bit or 32-bit at preprocessor stage.

In response to your edit, there is a way to get a type that is 64 bits without writing the macros yourself.
If you need a type that can hold 64 bits, then #include <cstdint> and use either int64_t or uint64_t. You can also use the Standard Integer Types provided by Boost (boost/cstdint.hpp).
Another option is to use long long. It's technically not part of the C++ standard (it will be in C++0x) but is supported on just about every compiler.
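For instance, a minimal sketch of the <cstdint> route (variable names are just illustrative):
#include <cstdint>

std::int64_t  signed_total   = 0;  // exactly 64 bits where the exact-width types exist
std::uint64_t unsigned_total = 0;

// If an exact-width type might be unavailable, the "least" variants are always provided:
std::int_least64_t safe_total = 0;  // at least 64 bits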

I would look at the source code of a cross-platform library; this kind of detection makes up quite a large part of one. Every OS/compiler pair has its own set of definitions. A few libraries you may look at:
http://www.libsdl.org/ \include\SDL_config*.h (several files)
http://qt.nokia.com/ \src\corelib\global\qglobal.h

Boost has absorbed the old Predef project. You'll want the architecture macros, more specifically BOOST_ARCH_X86_32/BOOST_ARCH_X86_64, assuming you only care about x86.
If you need broader detection (e.g. ARM64), either add the relevant macros to your check, or check what you actually want to check, e.g.
sizeof(void*) == 8
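A hedged sketch of what the Boost.Predef check can look like (assuming a Boost version recent enough to ship Predef):
#include <boost/predef.h>

#if BOOST_ARCH_X86_64
    // building for 64-bit x86
#elif BOOST_ARCH_X86_32
    // building for 32-bit x86
#else
    // something else entirely (ARM, PowerPC, ...); fall back to other checks
#endif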

Well, the answer is clearly going to be OS-specific, so you need to narrow down your requirements.
For example, on Unix uname -a typically gives enough info to distinguish a 32-bit build of the OS from a 64-bit build.
The command can be invoked by your pre-compiler. Depending on its output, compiler flags can be set appropriately.

I would be tempted to hoist the detection out of the code and put that into the Makefile. Then, you can leverage system tools to detect and set the appropriate macro upon which you are switching in your code.
In your Makefile ...
<do stuff to detect and set SUPPORT_XX_BIT to the appropriate value>
gcc myFile.c -D$(SUPPORT_XX_BIT) -o myFile
In your code ...
#if defined(SUPPORT_32_BIT)
...
#elif defined(SUPPORT_64_BIT)
...
#else
#error "Select either 32 or 64 bit option\n"
#endif
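One hedged way to fill in the detection step above (the file name and the printed macro names are just illustrative, mirroring the example) is a tiny probe program that the Makefile compiles and runs on the build machine; note that running a probe like this rules out cross-compiling.
// probe.cpp - hypothetical helper the Makefile could build and run;
// it prints the macro name that the real build then passes via -D.
#include <cstdio>

int main()
{
    // Assumption: pointer width distinguishes the 32- and 64-bit targets we care about.
    std::puts(sizeof(void*) == 8 ? "SUPPORT_64_BIT" : "SUPPORT_32_BIT");
    return 0;
}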

Probably the easiest way is to compare the sizes of int and long long. You cannot do that in the preprocessor, but you can use it in a static_assert.
Edit: Wow, all the negative votes. I have made my point a bit clearer. It also appears I should have mentioned long long rather than long, because of the way MSVC works.
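A minimal illustration of the suggestion (assuming C++11 for static_assert); the particular relationships asserted here are just examples:
// Compile-time checks on type sizes go through static_assert, not the preprocessor.
static_assert(sizeof(long long) >= 8, "expected long long to be at least 64 bits");
static_assert(sizeof(int) <= sizeof(long long), "expected int to be no wider than long long");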

Related

-Wtype-limits on attempt to limit an unsigned integer

Consider the following example:
unsigned short c = // ...
if (c > 0xfffful)
c = 0xfffful;
Since unsigned short can actually be larger than 16 bits, I want to clamp the value before snprintf-ing it in hex format into a fixed-size buffer.
However, GCC (but not clang) gives a warning: comparison is always false due to limited range of data type [-Wtype-limits].
Is this a bug in GCC, or have I missed something? I understand that on my machine unsigned short is exactly 16 bits, but that's not guaranteed to be so on other platforms.
I'd say it is not a bug. GCC is claiming if (c > 0xfffful) will always be false, which, on your machine, is true. GCC was smart enough to catch this, clang wasn't. Good job GCC!
On the other hand, GCC was not smart enough to notice that while the comparison is always false on your machine, it's not necessarily always false on someone else's machine. Come on GCC!
Note that in C++11, the int_leastN_t/uint_leastN_t types appear (I reserve the right to be proven wrong!) to be implemented by typedef. By the time GCC is running its warning checks it likely has no clue that the original data type was uint_least16_t. If that is the case, the compiler would have no way of inferring that the comparison might be true on other systems. Changing GCC to remember what the original data type was might be extremely difficult. I'm not defending GCC's naive warning, but suggesting why it might be hard to fix.
I'd be curious to see what the GCC guys say about this. Have you considered filing an issue with them?
This doesn't seem like a bug (maybe it could be deemed a slightly naive feature), but I can see why you'd want this code there for portability.
In the absence of any standard macros to tell you what the size of the type is on your platform (and there aren't any), I would probably have a step in my build process that works that out and passes it to your program as a -D definition.
e.g. in Make:
if ...
CFLAGS += -DTRUNCATE_UINT16_LEAST_T
endif
then:
#ifdef TRUNCATE_UINT16_LEAST_T
if (c > 0xfffful)
c = 0xfffful;
#endif
with the Makefile conditional predicated on output from a step in configure, or the execution of some other C++ program that simply prints out sizeofs. Sadly that rules out cross-compiling.
Longer-term I propose suggesting more intelligent behaviour to the GCC guys, for when these particular type aliases are in use.

Portable support for large files

I looked at
this
and this
and this
and I still don't know how to get to know size of file larger than 4 gb in a portable way.
Notably, incorporating some of the answers failed compiling for Cygwin, while the others failed for Linux.
Turns out there are quite a few functions defined by various standards:
fseek/ftell
These are defined by the ANSI standard library and are available virtually everywhere. They are only guaranteed to work with 32-bit offsets, but they aren't required to stop there (meaning you might get support for large files out of the box).
fseeko/ftello
These are defined by the POSIX standard. On many platforms, defining _FILE_OFFSET_BITS=64 will cause off_t to be defined as off64_t and fseeko to resolve to fseeko64.
fseeko64/ftello64
These are the 64-bit equivalents of fseeko and ftello. I couldn't find them in any standard.
Cygwin inconsistency
While Cygwin conforms to POSIX, I can't get fseeko to compile under it no matter what I define, unless I use --std=gnu++11, which is obviously nonsense, since fseeko is part of POSIX rather than a GNU extension. So what gives? According to this discussion:
64 bit file access is the natural file access type for Cygwin. off_t is 8 bytes. There are no foo64 functions for that reason. Just use fopen and friends and you get 64 bit file access for free.
This means an #ifdef for Cygwin even among POSIX platforms.
_fseeki64 / _ftelli64
These are defined by Microsoft Visual C++ and are used exclusively with their compiler. Obviously MSVC doesn't support anything else from the list above (other than fseek), so you're going to need #ifdefs.
EDIT: I actually advise against using them, and I'm not the only one who thinks that. I experienced literally the following:
wfopen a file in binary mode
fwrite 10 bytes worth to it
_ftelli64 the position
It returns 12 rather than 10 bytes
Looks like this is horribly broken.
lseek and lseek64
Defined by POSIX, these are to be used with integer file descriptors opened with open() from unistd.h rather than FILE* structs. These are not compatible with Windows. Again, they use off_t data type.
_lseek, _lseeki64
These are the Windows equivalents of lseek/lseek64. Curiously, _lseeki64 doesn't use off_t and uses __int64 instead, so you know it'll work with big files. Neat.
fsetpos/fgetpos
While these are actually pretty portable, they're almost unusable, since they operate on opaque structures rather than integer offsets. That means you cannot add or subtract them, nor navigate to a position in the file obtained by any means other than a previous fgetpos call.
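A short sketch of the one thing they are good for, saving a position and returning to it later (the function name is just illustrative):
#include <cstdio>

// fpos_t is opaque: you can only store it and hand it back to fsetpos.
void peek_ahead(std::FILE* f)
{
    std::fpos_t mark;
    if (std::fgetpos(f, &mark) != 0)
        return;                    // remember the current position

    // ... read ahead here ...

    std::fsetpos(f, &mark);        // and jump back to where we were
}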
Conclusion
So to make your program portable, depending on the platform, you should use:
fseeko (POSIX) + define _FILE_OFFSET_BITS=64 on POSIX
fseek for Cygwin and for default implementation
_lseeki64 for Windows - or, if you manage to work your way around it - _fseeki64.
An example that uses _ftelli64:
int64_t portable_ftell(FILE *a)
{
#ifdef __CYGWIN__
return ftell(a);
#elif defined (_WIN32)
return _ftelli64(a);
#else
return ftello(a);
#endif
}
In reality, instead of relying on #ifdefs, which have always looked fragile to me, you could check whether the functions compile using your build system and define your own constants such as HAVE_FTELLO64 accordingly.
Note that if you do decide to use the lseek/_lseeki64 family and numeric file descriptors rather than FILE* structures, you should be aware of the following differences between open/fopen:
open doesn't buffer, fopen does; the lack of buffering can mean worse performance for many small reads or writes.
open can't perform newline conversions for text files, fopen can.
More details in this question.
References:
http://www.lix.polytechnique.fr/~liberti/public/computing/prog/c/C/FUNCTIONS/funcref.htm#stdio
http://pubs.opengroup.org/onlinepubs/9699919799/functions/fseek.html
http://pubs.opengroup.org/onlinepubs/009695399/functions/open.html
http://pubs.opengroup.org/onlinepubs/009695399/functions/lseek.html
http://man7.org/linux/man-pages/man2/lseek.2.html
https://msdn.microsoft.com/en-us/library/75yw9bf3.aspx
http://www.cplusplus.com/reference/cstdio/fgetpos/
http://pubs.opengroup.org/onlinepubs/009695399/functions/fgetpos.html

C++: Datatypes, which to use and when?

I've been told that I should always use size_t when I want a 32-bit unsigned int. I don't quite understand why, but I think it has something to do with the fact that if someone compiles the program on a 16- or 64-bit machine, unsigned int would become 16 or 64 bits while size_t won't. But why doesn't it? And how can I force the bit sizes to be exactly what I want?
So, where is the list of which datatype to use and when? For example, is there a size_t-style alternative to unsigned short, or to a 32-bit int, etc.? How can I be sure my datatypes have as many bits as I chose in the first place, without worrying about different bit sizes on other machines?
Mostly I care about the memory used rather than the marginal speed boost from doubling the memory usage, since I don't have much RAM. I want to stop worrying that everything will break apart if my program is compiled on a machine that isn't 32-bit. For now I've always used size_t when I want it to be 32 bits, but for short I don't know what to do. Someone help me clear my head.
On the other hand: if I need a 64-bit variable, can I use it on a 32-bit machine successfully? And what is that datatype's name (if I want it to always be 64 bits)?
size_t is for storing object sizes. It is of exactly the right size for that and only that purpose - typically 4 bytes on 32-bit systems and 8 bytes on 64-bit systems. You shouldn't confuse it with unsigned int or any other datatype. It might be equivalent to unsigned int or might not be, depending on the implementation (system bitness included).
Once you need to store something other than an object size you shouldn't use size_t and should instead use some other datatype.
As a side note: For containers, to indicate their size, don't use size_t, use container<...>::size_type
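For example (the container and variable names here are just illustrative):
#include <vector>

std::vector<int> values;
// The container's own size type is the portable way to hold its size:
std::vector<int>::size_type count = values.size();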
boost/cstdint.hpp can be used to be sure integers have right size.
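A minimal sketch of that header, which mirrors the C99 names in namespace boost for compilers that lack <stdint.h> (variable names are just illustrative):
#include <boost/cstdint.hpp>

boost::int32_t  record_id  = 0;  // exactly 32 bits
boost::uint64_t byte_count = 0;  // exactly 64 bits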
size_t is not necessarily 32-bit. It has been 16-bit with some compilers, and it's 64-bit on a 64-bit system.
The C++ standard guarantees, via reference down to the C standard, that long is at least 32 bits.
int is only formally guaranteed 16 bits, but in practice I wouldn't worry: the chance that any ordinary code will be used on a 16-bit system is slim indeed, and on any 32-bit system int is 32-bit. Of course it's different if you're coding for a 16-bit system like some embedded computer. But in that case you'd probably be writing system-specific code anyway.
Where you need exact sizes you can use <stdint.h> if your compiler supports that header (it was introduced in C99, and the current C++ standard stems from 1998), or alternatively the corresponding Boost library header boost/cstdint.hpp.
However, in general, just use int. ;-)
Cheers & hth.,
size_t is not always 32-bit. E.g. It's 64-bit on 64-bit platforms.
For fixed-size integers, stdint.h is best. But it doesn't come with VS2008 or earlier - you have to download it separately. (It comes as a standard part of VS2010 and most other compilers).
Since you're using VS2008, you can use the MS-specific __int32, unsigned __int32 etc types. Documentation here.
To answer the 64-bit question: Most modern compilers have a 64-bit type, even on 32-bit systems. The compiler will do some magic to make it work. For Microsoft compilers, you can just use the __int64 or unsigned __int64 types.
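A hedged, MSVC-only sketch (variable names are just illustrative):
// Microsoft-specific built-in 64-bit types; they work even when targeting 32-bit Windows.
__int64          file_offset = 0;
unsigned __int64 byte_count  = 0;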
Unfortunately, one of the quirks of C++ data types is that their sizes depend a great deal on which compiler you're using. Naturally, if you're only compiling for one target, there is no need to worry - just find out how large the type is using sizeof(...).
If you need to cross-compile, you could ensure compatibility by defining your own typedefs for each target (surrounded by #ifdef blocks that reference which target you're cross-compiling to).
If you're ever concerned that it could be compiled on a system that uses types with even weirder sizes than you have anticipated, you could always assert(sizeof(short)==2) or equivalent, so that you could guarantee at runtime that you're using the correctly sized types.
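A sketch of both flavours of that guard; the compile-time trick is a common pre-C++11 idiom rather than anything compiler-specific, and the function and typedef names are just illustrative:
#include <cassert>

void check_type_sizes()
{
    // Runtime guard, as suggested above:
    assert(sizeof(short) == 2);
    assert(sizeof(int) == 4);
}

// Or reject the build outright: the array size becomes -1 (an error) if the check fails.
typedef char short_must_be_2_bytes[sizeof(short) == 2 ? 1 : -1];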
Your question is tagged visual-studio-2008, so I would recommend looking in the documentation for that compiler for pre-defined data types. Microsoft has a number that are predefined, such as BYTE, DWORD, and LARGE_INTEGER.
Take a look in windef.h and winnt.h for more.

Cross-platform primitive data types in C++

Unlike Java or C#, primitive data types in C++ can vary in size depending on the platform. For example, int is not guaranteed to be a 32-bit integer.
Various compiler environments define data types such as uint32 or dword for this purpose, but there seems to be no standard include file for fixed-size data types.
What is the recommended method to achieve maximum portability?
I found this header particularly useful:
BOOST cstdint
Usually better than reinventing the wheel (which incurs its own maintenance and testing costs).
Create a header file called types.h, and define all the fixed-size primitive types you need (int32, uint32, uint8, etc.). To support multiple platforms, you can either use #ifdef's or have a separate include directory for each platform (include_x86, include_x86_64, include_sparc). In the latter case you would have separate build configurations for each platform, which would have the right include directory in their include path. The second method is preferable, according to the "The C++ Gotchas" by Stephen Dewhurst.
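A hedged sketch of what such a types.h might contain using the #ifdef approach (the platform test and typedef names are just illustrative):
// types.h - fixed-size primitive typedefs, one branch per toolchain we build with
#ifndef TYPES_H
#define TYPES_H

#if defined(_MSC_VER)
    typedef __int32          int32;
    typedef unsigned __int32 uint32;
#else
    typedef int              int32;   // assumption: int is 32 bits on our other targets
    typedef unsigned int     uint32;
#endif

#endif // TYPES_H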
Just an aside, if you are planning to pass binary data between different platforms, you also have to worry about byte order.
Part of the C99 standard was a stdint.h header file to provide this kind of information. For instance, it defines a type called uint32_t. Unfortunately, a lot of compilers don't support stdint.h. The best cross-platform implementation I've seen of stdint.h is here: http://www.azillionmonkeys.com/qed/pstdint.h. You can just include that in your project.
If you're using boost, I believe it also provides something equivalent to the stdint header.
Define a type (e.g. int32) in a header file. For each platform, use another #ifdef and make sure that int32 is a 32-bit integer. Everywhere in your code use int32, and make sure that when you compile on different platforms you use the right define.
There is a stdint.h header defined by the C99 standard and (I think) some variant or another of ISO C++. This defines nice types like int16_t, uint64_t, etc., which are guaranteed to have a specific size and representation. Unfortunately, its availability isn't exactly standard (Microsoft in particular was a foot-dragger here).
The simple answer is this, which works on every 32 or 64 bit byte-addressable architecture I am aware of:
All char variables are 1 byte
All short variables are 2 bytes
All int variables are 4 bytes
DO NOT use a "long", which is of indeterminate size.
All known compilers with support for 64 bit math allow "long long" as a native 64 bit type.
Be aware that some 32 bit compilers don't have a 64 bit type at all, so using long long will limit you to 64 bit systems and a smaller set of compilers (which includes gcc and MSVC, so most people won't care about this problem).
If its name begins with two underscores (__), a data type is non-standard.
__int8 (unsigned __int8)
__int16 (unsigned __int16)
__int32 (unsigned __int32)
__int64 (unsigned __int64)
Try to use boost/cstdint.hpp
Two things:
First, there is a header file called limits.h that gives lots of useful platform-specific information. It will give the max and min values for the int type, for example; from that, you can deduce how big the int type is.
You can also use the sizeof operator at runtime for these purposes.
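A small sketch combining both ideas:
#include <climits>
#include <cstdio>

int main()
{
    // limits.h/<climits> reports the range; sizeof reports the width in bytes.
    std::printf("INT_MAX     = %d\n", INT_MAX);
    std::printf("sizeof(int) = %u bytes\n", static_cast<unsigned>(sizeof(int)));
    std::printf("CHAR_BIT    = %d bits per byte\n", CHAR_BIT);
    return 0;
}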
I hope this helps . . .
K

Seeking and reading large files in a Linux C++ application

I am running into integer overflow using the standard ftell and fseek options inside of G++, but I guess I was mistaken because it seems that ftell64 and fseek64 are not available. I have been searching and many websites seem to reference using lseek with the off64_t datatype, but I have not found any examples referencing something equal to fseek. Right now the files that I am reading in are 16GB+ CSV files with the expectation of at least double that.
Without any external libraries what is the most straightforward method for achieving a similar structure as with the fseek/ftell pair? My application right now works using the standard GCC/G++ libraries for 4.x.
fseek64 is a C function. To make it available you'll have to define _FILE_OFFSET_BITS=64 before including the system headers. That will more or less define fseek to actually be fseek64. Or do it in the compiler arguments, e.g.
gcc -D_FILE_OFFSET_BITS=64 ....
http://www.suse.de/~aj/linux_lfs.html has a great overview of large file support on Linux:
Compile your programs with "gcc -D_FILE_OFFSET_BITS=64". This forces all file access calls to use the 64 bit variants. Several types change also, e.g. off_t becomes off64_t. It's therefore important to always use the correct types and to not use e.g. int instead of off_t. For portability with other platforms you should use getconf LFS_CFLAGS which will return -D_FILE_OFFSET_BITS=64 on Linux platforms but might return something else on e.g. Solaris. For linking, you should use the link flags that are reported via getconf LFS_LDFLAGS. On Linux systems, you do not need special link flags.
Define _LARGEFILE_SOURCE and _LARGEFILE64_SOURCE. With these defines you can use the LFS functions like open64 directly.
Use the O_LARGEFILE flag with open to operate on large files.
If you want to stick to ISO C standard interfaces, use fgetpos() and fsetpos(). However, these functions are only useful for saving a file position and going back to the same position later. They represent the position using the type fpos_t, which is not required to be an integer data type. For example, on a record-based system it could be a struct containing a record number and offset within the record. This may be too limiting.
POSIX defines the functions ftello() and fseeko(), which represent the position using the off_t type. This is required to be an integer type, and the value is a byte offset from the beginning of the file. You can perform arithmetic on it, and can use fseeko() to perform relative seeks. This will work on Linux and other POSIX systems.
In addition, compile with -D_FILE_OFFSET_BITS=64 (Linux/Solaris). This will define off_t to be a 64-bit type (i.e. off64_t) instead of long, and will redefine the functions that use file offsets to be the versions that take 64-bit offsets. This is the default when you are compiling for 64-bit, so is not needed in that case.
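A minimal sketch of that combination (compile with -D_FILE_OFFSET_BITS=64 on a 32-bit Linux build; the function name is just illustrative):
#include <stdio.h>      // fseeko/ftello are POSIX additions to <stdio.h>
#include <sys/types.h>  // off_t

// With _FILE_OFFSET_BITS=64, off_t is 64 bits wide and this works
// past the 2 GB mark even in a 32-bit build.
off_t file_size(FILE* f)
{
    fseeko(f, 0, SEEK_END);  // like fseek, but the offset parameter is off_t
    return ftello(f);        // reports the current position as off_t
}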
fseek64() isn't standard, the compiler docs should tell you where to find it.
Have you tried fgetpos and fsetpos? They're designed for large files and the implementation typically uses a 64-bit type as the base for fpos_t.
Have you tried fseeko() with the _FILE_OFFSET_BITS preprocessor symbol set to 64?
This will give you an fseek()-like interface but with an offset parameter of type off_t instead of long. Setting _FILE_OFFSET_BITS=64 will make off_t a 64-bit type.
The same for goes for ftello().
Use fsetpos(3) and fgetpos(3). They use the fpos_t datatype, which I believe is guaranteed to be able to hold at least 64 bits.