POINTER_32 - what is it, and why? - c++

I have just been given the task of updating a legacy application from 32-bit to 64-bit. While reviewing the extent of the task, I discovered the following definition immediately before the inclusion of external (e.g. platform) headers:
#define POINTER_32
I cannot find what uses this definition or what effect it has, but it looks like the kind of thing that will be directly relevant to my task!
What is it for? What uses it? Will it be safe to remove it immediately (I presume it will be necessary to remove it in the long run)?
This is using MS VC++ 2008, soon to be 2010.

This is a macro that's normally defined in the Windows SDK header BaseTsd.h. When compiling in 32-bit mode, it is defined to nothing, as you showed. When compiling in 64-bit mode it is defined as
#define POINTER_32 __ptr32
which is an MSVC compiler extension to declare 32-bit pointers in a 64-bit code model. There's also a 64-bit flavor for 32-bit code:
#define POINTER_64 __ptr64
You'd use it if you write a 64-bit program and need to interop with structures that are used by 32-bit code in another process. For example:
typedef struct _SCSI_PASS_THROUGH_DIRECT32 {
    USHORT Length;
    UCHAR  ScsiStatus;
    UCHAR  PathId;
    UCHAR  TargetId;
    UCHAR  Lun;
    UCHAR  CdbLength;
    UCHAR  SenseInfoLength;
    UCHAR  DataIn;
    ULONG  DataTransferLength;
    ULONG  TimeOutValue;
    VOID * POINTER_32 DataBuffer;   // <== here
    ULONG  SenseInfoOffset;
    UCHAR  Cdb[16];
} SCSI_PASS_THROUGH_DIRECT32, *PSCSI_PASS_THROUGH_DIRECT32;

It is used to get around warning C4244 (conversion with possible loss of data). It provides a 32-bit pointer in both 32-bit and 64-bit models.

My guess is that it was originally created for VLM (Very Large Memory) on the Alpha AXP, because Windows NT for Alpha was the first Windows with some 64-bit support. The OS was still 32-bit, but it had the VLM API so applications could allocate large 64-bit memory when necessary; hence there had to be some way to declare both 32-bit and 64-bit pointers in the same program.
VLM returns a POINTER_64 that the app has to store. POINTER_32/__ptr32 is used by default, and POINTER_64/__ptr64 is used in the 64-bit handling code.
You can still see some relics of AXP related to POINTER_32 in the Windows documentation and header files.
POINTER_32 is defined in Ntdef.h and Winnt.h.
#if defined(__AXP64__)
#define POINTER_32 __ptr32
#else
#define POINTER_32
#endif
__AXP64__ is meant for Alpha AXP. You can also find lots of other references to Alpha AXP in Ntdef.h and Winnt.h even though Windows no longer runs on that platform
Because of this there are also POINTER_SIGNED and POINTER_UNSIGNED macros to specify how the 32-bit address is extended to a 64-bit one. They expand to __sptr and __uptr.
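A minimal MSVC-specific sketch of how these macros combine on a pointer declaration (the variable names are illustrative, and the modifier order follows the __sptr/__uptr documentation):
#include <windows.h>

int main()
{
    // In a 64-bit build these expand to "__sptr __ptr32" / "__uptr __ptr32";
    // in a 32-bit build all of the macros expand to nothing.
    int * POINTER_SIGNED   POINTER_32 psp32 = 0; // truncated address is sign-extended
    int * POINTER_UNSIGNED POINTER_32 pup32 = 0; // truncated address is zero-extended
    return psp32 == 0 && pup32 == 0 ? 0 : 1;
}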
Another usage would be for 32-bit pointers when disabling LARGEADDRESSAWARE in a 64-bit application
It's much like the x32 ABI on Linux, where long and pointers are 32 bits wide.
By default, 64-bit Microsoft Windows-based applications have a user-mode address space of several terabytes. For precise values, see Memory Limits for Windows and Windows Server Releases. However, applications can specify that the system should allocate all memory for the application below 2 gigabytes. This feature is beneficial for 64-bit applications if the following conditions are true:
A 2 GB address space is sufficient.
The code has many pointer truncation warnings.
Pointers and integers are freely mixed.
The code has polymorphism using 32-bit data types.
All pointers are still 64-bit pointers, but the system ensures that every memory allocation occurs below the 2 GB limit, so that if the application truncates a pointer, no significant data is lost. Pointers can be truncated to 32-bit values, then extended to 64-bit values by either sign extension or zero extension.
Virtual Address Space
When /LARGEADDRESSAWARE:NO is set, pointers are still 64 bits wide even though the high 32 bits carry no significant value. Therefore, to save memory on such pointers, we can declare them as POINTER_32.
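As a rough illustration of the saving (a sketch assuming a 64-bit MSVC build; the node structs are made up for illustration):
#include <windows.h>

struct NodeWide {
    int value;
    int *next;              // 8 bytes in a 64-bit build
};

struct NodeNarrow {
    int value;
    int * POINTER_32 next;  // 4 bytes even in a 64-bit build
};

// sizeof(NodeWide) is typically 16 (4 + 4 padding + 8);
// sizeof(NodeNarrow) is typically 8 (4 + 4).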
See How to detect X32 on Windows?

Related

Is it necessary to take care of long data type in program that is written for Windows and Linux?

According to cppreference, on 64-bit systems:
LLP64 or 4/4/8 (int and long are 32-bit, pointer is 64-bit)
Win64 API
LP64 or 4/8/8 (int is 32-bit, long and pointer are 64-bit)
Unix and Unix-like systems (Linux, Mac OS X)
Then how should the long data type be handled in code that is written for both Linux and Windows?
In C and C++, in portable code, you never know the exact size of a type like int or long int. If you move your code to a different compiler (or a different machine, or a different OS), the sizes of some of your types may change. This needn't be a problem; in fact it's only a problem if you want to make it a problem. (All of this has always been the case, and has nothing to do with someone's definitions of "LLP64" and "LP64" architecture families.)
On those (hopefully rare) occasions when you need a type of an exact size, one good way is to use types like int32_t and uint64_t from <cstdint> (or <stdint.h> in C).
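For example (a minimal sketch):
#include <cstdint>
#include <iostream>

int main()
{
    std::int32_t  counter = -1; // exactly 32 bits wherever the type exists
    std::uint64_t offset  = 0;  // exactly 64 bits, unsigned
    std::cout << sizeof counter << " " << sizeof offset << "\n"; // prints "4 8"
    return 0;
}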
But you really, really shouldn't need to specify the exact size of a type, most of the time. (There are those who say you need to specify the exact size of every type, but my advice is to ignore those people.)
Pretty much the only time you need to specify exact sizes is when trying to define a structure which you can read and write in "binary" fashion to conform to some externally-imposed storage layout. But there, specifying the exact sizes of data types isn't generally sufficient, because of issues like alignment, padding, and byte order. So you're better off writing explicit serialization and deserialization code anyway (or using "text" data formats instead, if you can get away with it).
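For instance, a minimal serialization sketch (the Record type and its 6-byte little-endian layout are made up for illustration) that writes each field byte by byte, so struct padding and host byte order can't leak into the file format:
#include <cstdint>
#include <cstdio>

struct Record {
    std::uint32_t id;
    std::uint16_t flags;
};

// Write the record as exactly 6 bytes, little-endian, regardless of how
// the compiler lays the struct out in memory.
void write_record(std::FILE *f, const Record &r)
{
    unsigned char buf[6];
    buf[0] = static_cast<unsigned char>(r.id);
    buf[1] = static_cast<unsigned char>(r.id >> 8);
    buf[2] = static_cast<unsigned char>(r.id >> 16);
    buf[3] = static_cast<unsigned char>(r.id >> 24);
    buf[4] = static_cast<unsigned char>(r.flags);
    buf[5] = static_cast<unsigned char>(r.flags >> 8);
    std::fwrite(buf, 1, sizeof buf, f);
}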
My bottom line is that I rarely worry about the exact sizes of types.

Why does my compiler use an 8-bit char when I'm running on a 64-bit machine?

I am using the Microsoft Visual Studio 2013 IDE. When I compile a program in C++ while using the header <climits>, I output the macro constant CHAR_BIT to the screen. It tells me there are 8-bits in my char data type (which is 1-byte in C++). However, Visual Studio is a 32-bit application and I am running it on a 64-bit machine (i.e. a machine whose processor has a 64-bit instruction set and operating system is 64-bit Windows 7).
I don't understand why my char data type uses only 8-bits. Shouldn't it be using at least 32-bits (since my IDE is a 32-bit application), let alone 64-bits (since I'm compiling on a 64-bit machine)?
I am told that the number of bits used in a memory address (1-byte) depends on the hardware and implementation. If that's the case, why does my memory address still only use 8-bits and not more?
I think you are confusing memory address bit-width with data value bit-width. Memory addresses (pointers) are 32 bits for 32-bit programs and 64 bits for 64-bit programs. But data types have different widths for their values depending on type (as governed by the standard). So a char is 8-bits, but a char* will be 32-bits if you are compiling as a 32-bit application (also note here it depends on how you compile the application and not what type of processor or OS you are running on).
Edit for questions:
However, what is the relationship between these two?
Memory addresses will always have the same bit width regardless of what data value is stored there.
For example, if I have a 32-bit address and I assign an 8-bit value to that address, does that mean there are 24-bits of unused address space?
Some code (assume 32-bit compilation):
char i_am_1_byte = 0x00; // an 8-bit data value that lives in memory
char* i_am_a_ptr = &i_am_1_byte; // pointer is 32-bits and points to an 8-bit data value
*i_am_a_ptr = 0xFF; // writes 0xFF to the location pointed to by the pointer
// that is, to i_am_1_byte
So we have i_am_1_byte which is a char and takes up 8 bits somewhere in memory. We can get this memory location using the address-of operator & and store it in the pointer variable i_am_a_ptr, which is your 32-bit address. We can write 8 bits of data to the location pointed to by i_am_a_ptr by dereferencing it.
If not, what is the bit-width for memory addresses actually used for?
All the data that your program uses must be located somewhere in memory and each location has an address. Most programs probably will not use most of the memory available for them to use, but we need a way to address every possible location.
how can having more memory address bit-width be helpful?
That depends on how much data you need to work with. A 32-bit program, at most, can address 4 GB of memory space (and this may be smaller depending on your OS). That used to be a very, very large amount of memory, but these days it is conceivable a program could run out. It is also a lot easier for the CPU to address more than 4 GB of RAM if it is 64-bit (this gets into the difference between physical memory and virtual memory). Of course, 64-bit architecture means a lot more than just bigger addresses and brings many benefits that may be more useful to programs than the bigger memory space.
An interesting fact is that on some processors, such as 32-bit ARM, most instructions are word-aligned. That is, compilers tend to allocate 32 bits (4 bytes) for any data type, even when the data type needs less than 4 bytes, unless otherwise stated in the source code. This happens because ARM architectures are optimized for word-aligned memory access.

How to detect on C++ is windows 32 or 64 bit?

How to detect on C++ is windows 32 or 64 bit?
I see a lot of examples in .NET but I need C++. Also, IsWow64Process() doesn't work for me, because "If the process is running under 32-bit Windows, the value is set to FALSE. If the process is a 64-bit application running under 64-bit Windows, the value is also set to FALSE":
if I have a 32-bit process under a 32-bit OS I get FALSE
if I have a 64-bit process under a 64-bit OS I get FALSE
BUT I don't care about the process bitness, I need the OS bitness
The Win32 API function to detect information about the underlying system is GetNativeSystemInfo. Call the function and read the wProcessorArchitecture member of the SYSTEM_INFO struct that the function populates.
Although it is actually possible to use IsWow64Process to detect this. If you call IsWow64Process and TRUE is returned, then you know that you are running on a 64 bit system. Otherwise, FALSE is returned. And then you just need to test the size of a pointer, for instance. A 32 bit pointer indicates a 32 bit system, and a 64 bit pointer indicates a 64 bit system. In fact, you can probably get the information from a conditional supplied by the compiler, depending on which compiler you use, since the size of the pointer is known at compile time.
Raymond Chen described this approach in a blog article. He helpfully included code which I reproduce here:
BOOL Is64BitWindows()
{
#if defined(_WIN64)
    return TRUE;  // 64-bit programs run only on Win64
#elif defined(_WIN32)
    // 32-bit programs run on both 32-bit and 64-bit Windows,
    // so must sniff
    BOOL f64 = FALSE;
    return IsWow64Process(GetCurrentProcess(), &f64) && f64;
#else
    return FALSE; // Win64 does not support Win16
#endif
}
GetSystemWow64DirectoryW
Retrieves the path of the system directory used by WOW64. This directory is not present on 32-bit Windows.
An alternative method is to check for the existence of the "Program Files (x86)" root folder on the system partition. It will be present on x64 Windows installations, and not on x86 installations.
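A minimal sketch of that heuristic (it assumes the folder lives on the C: drive; a more careful version would consult the ProgramFiles(x86) environment variable instead):
#include <windows.h>

bool Is64BitWindowsHeuristic()
{
    // The "Program Files (x86)" directory only exists on 64-bit Windows.
    DWORD attrs = GetFileAttributesW(L"C:\\Program Files (x86)");
    return attrs != INVALID_FILE_ATTRIBUTES &&
           (attrs & FILE_ATTRIBUTE_DIRECTORY) != 0;
}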

How to write convertible code, 32 bit/64 bit?

A C++-specific question. So I read a question about what makes a program 32-bit or 64-bit, and the answer it got was something like this (sorry, I can't find the question; I looked at it some days ago and can't find it again): as long as you don't make any "pointer assumptions", you only need to recompile it. So my question is: what are pointer assumptions? To my understanding there are 32-bit pointers and 64-bit pointers, so I figure it has something to do with that. Please show the difference in code between them. Any other good habits to keep in mind while writing code that make it easy to convert between the two are also welcome :) though please share examples with them.
PS: I know there is this post:
How do you write code that is both 32 bit and 64 bit compatible?
but I thought it was kind of too general, with no good examples for new programmers like myself. Like, what is a 32-bit storage unit, etc. Kind of hoping to break it down a bit more (no pun intended ^^).
In general it means that your program behavior should never depend on the sizeof() of any types (that are not made to be of some exact size), neither explicitly nor implicitly (this includes possible struct alignments as well).
Pointers are just a subset of them, and it probably also means that you should not try to rely on being able to convert between unrelated pointer types and/or integers, unless they are specifically made for this (e.g. intptr_t).
In the same way you need to take care with things written to disk, where you should also never rely on the sizes of e.g. built-in types being the same everywhere.
Whenever you have to (because of e.g. external data formats), use explicitly sized types like uint32_t.
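For example (a sketch; the variable names are illustrative):
#include <cstdint>

int main()
{
    int   x = 0;
    void *p = &x;

    // Portable: intptr_t is defined to be wide enough to hold a pointer,
    // on 32-bit and 64-bit builds alike.
    std::intptr_t as_int = reinterpret_cast<std::intptr_t>(p);
    void *back = reinterpret_cast<void *>(as_int); // round-trips safely

    // Not portable: int may be narrower than a pointer (it is on Win64
    // and on LP64 Unix), so the high bits of the address would be lost:
    // int bad = (int)p;

    return back == p ? 0 : 1;
}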
For a well-formed program (that is, a program written according to syntax and semantic rules of C++ with no undefined behaviour), the C++ standard guarantees that your program will have one of a set of observable behaviours. The observable behaviours vary due to unspecified behaviour (including implementation-defined behaviour) within your program. If you avoid unspecified behaviour or resolve it, your program will be guaranteed to have a specific and certain output. If you write your program in this way, you will witness no differences between your program on a 32-bit or 64-bit machine.
A simple (forced) example of a program that will have different possible outputs is as follows:
#include <iostream>

int main()
{
    std::cout << sizeof(void*) << std::endl;
    return 0;
}
This program will likely have different output on 32- and 64-bit machines (but not necessarily). The result of sizeof(void*) is implementation-defined. However, it is certainly possible to have a program that contains implementation-defined behaviour but is resolved to be well-defined:
#include <iostream>

int main()
{
    int size = sizeof(void*);
    if (size != 4) {
        size = 4;
    }
    std::cout << size << std::endl;
    return 0;
}
This program will always print out 4, despite the fact it uses implementation-defined behaviour. This is a silly example because we could have just done int size = 4;, but there are cases when this does appear in writing platform-independent code.
So the rule for writing portable code is: aim to avoid or resolve unspecified behaviour.
Here are some tips for avoiding unspecified behaviour:
Do not assume anything about the size of the fundamental types beyond what the C++ standard specifies. That is, a char is at least 8 bits, both short and int are at least 16 bits, and so on.
Don't try to do pointer magic (casting between pointer types or storing pointers in integral types).
Don't use an unsigned char* to read the value representation of a non-char object (for serialisation or related tasks).
Avoid reinterpret_cast.
Be careful when performing operations that may over or underflow. Think carefully when doing bit-shift operations.
Be careful when doing arithmetic on pointer types.
Don't use void*.
There are many more occurrences of unspecified or undefined behaviour in the standard. It's well worth looking them up. There are some great articles online that cover some of the more common differences that you'll experience between 32- and 64-bit platforms.
"Pointer assumptions" is when you write code that relies on pointers fitting in other data types, e.g. int copy_of_pointer = ptr; - if int is a 32-bit type, then this code will break on 64-bit machines, because only part of the pointer will be stored.
So long as pointers are only stored in pointer types, it should be no problem at all.
Typically, pointers are the size of the "machine word", so on a 32-bit architecture, 32 bits, and on a 64-bit architecture, all pointers are 64-bit. However, there are SOME architectures where this is not true. I have never worked on such machines myself [other than x86 with its "far" and "near" pointers - but let's ignore that for now].
Most compilers will tell you when you convert pointers to integers that the pointer doesn't fit into, so if you enable warnings, MOST of the problems will become apparent - fix the warnings, and chances are pretty decent that your code will work straight away.
There will be no difference between 32-bit and 64-bit code; the goal of C/C++ and other high-level languages, unlike assembly, is portability.
The only difference will be the distribution you compile your code on; all the work is done automatically by your compiler/linker, so just don't think about that.
But: if you are programming on a 64-bit distribution and you need to use an external library, for example SDL, the external library will also have to be compiled as 64-bit if you want your code to compile.
One thing to know is that your ELF file will be bigger on a 64-bit distribution than on a 32-bit one; that's just logic.
What's the point with pointers? When you increment/change a pointer, the compiler advances it by the size of the pointed-to type.
The pointer's size is defined by your processor's register size / the distribution you're working on.
But you just don't have to care about this; the compiler does everything for you.
In sum: that's why you can't execute a 64-bit ELF file on a 32-bit distribution.
Typical pitfalls for 32bit/64bit porting are:
The implicit assumption by the programmer that sizeof(void*) == 4 * sizeof(char).
If you're making this assumption and e.g. allocate arrays that way ("I need 20 pointers so I allocate 80 bytes"), your code breaks on 64bit because it'll cause buffer overruns.
The "kitten-killer" , int x = (int)&something; (and the reverse, void* ptr = (void*)some_int). Again an assumption of sizeof(int) == sizeof(void*). This doesn't cause overflows but looses data - the higher 32bit of the pointer, namely.
Both of these issues are of a class called type aliasing (assuming identity / interchangability / equivalence on a binary representation level between two types), and such assumptions are common; like on UN*X, assuming time_t, size_t, off_t being int, or on Windows, HANDLE, void* and long being interchangeable, etc...
Assumptions about data structure / stack space usage (See 5. below as well). In C/C++ code, local variables are allocated on the stack, and the space used there is different between 32bit and 64bit mode due to the point below, and due to the different rules for passing arguments (32bit x86 usually on the stack, 64bit x86 in part in registers). Code that just about gets away with the default stacksize on 32bit might cause stack overflow crashes on 64bit.
This is relatively easy to spot as a cause of the crash but depending on the configurability of the application possibly hard to fix.
Timing differences between 32bit and 64bit code (due to different code sizes / cache footprints, or different memory access characteristics / patterns, or different calling conventions ) might break "calibrations". Say, for (int i = 0; i < 1000000; ++i) sleep(0); is likely going to have different timings for 32bit and 64bit ...
Finally, the ABI (Application Binary Interface). There's usually bigger differences between 64bit and 32bit environments than the size of pointers...
Currently, two main "branches" of 64-bit environments exist: IL32P64 (what Win64 uses - int and long are int32_t, only uintptr_t/void* is uint64_t, talking in terms of the sized integers from <stdint.h>) and LP64 (what UN*X uses - int is int32_t, long is int64_t and uintptr_t/void* is uint64_t), but there are "subdivisions" of different alignment rules as well - some environments assume long, float or double align at their respective sizes, while others assume they align at multiples of four bytes. In 32-bit Linux, they all align at four bytes, while in 64-bit Linux, float aligns at four bytes, and long and double at eight-byte multiples.
The consequence of these rules is that in many cases, both sizeof(struct { ... }) and the offsets of structure/class members are different between 32-bit and 64-bit environments even if the data type declaration is completely identical.
Beyond impacting array/vector allocations, these issues also affect data input/output, e.g. through files - if a 32-bit app writes e.g. struct { char a; int b; char c; long d; double e; } to a file that the same app recompiled for 64-bit reads in, the result will not be quite what's hoped for.
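One cheap safeguard is to pin the assumed layout down at compile time so the build breaks instead of silently reading garbage - a sketch assuming a C++11 compiler, mirroring the struct above:
#include <cstddef>

struct OnDisk { char a; int b; char c; long d; double e; };

// These hold for a typical 32-bit MSVC build; on an LP64 system, where
// long is 8 bytes, the assertions fail at compile time - which is the point.
static_assert(sizeof(long) == 4, "file format assumes a 4-byte long");
static_assert(offsetof(OnDisk, d) == 12, "unexpected struct layout");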
The examples just given are only about language primitives (char, int, long etc.) but of course affect all sorts of platform-dependent / runtime library data types, whether size_t, off_t, time_t, HANDLE, essentially any nontrivial struct/union/class... - so the room for error here is large.
And then there are the lower-level differences, which come into play e.g. for hand-optimized assembly (SSE/SSE2/...); 32-bit and 64-bit have different (numbers of) registers and different argument passing rules; all of this strongly affects how such optimizations perform, and it's very likely that e.g. SSE2 code which gives best performance in 32-bit mode will need to be rewritten / enhanced to give best performance in 64-bit mode.
There's also code design constraints which are very different for 32bit and 64bit, particularly around memory allocation / management; an application that's been carefully coded to "maximize the hell out of the mem it can get in 32bit" will have complex logic on how / when to allocate/free memory, memory-mapped file usage, internal caching, etc - much of which will be detrimental in 64bit where you could "simply" take advantage of the huge available address space. Such an app might recompile for 64bit just fine, but perform worse there than some "ancient simple deprecated version" which didn't have all the maximize-32bit peephole optimizations.
So, ultimately, it's also about enhancements / gains, and that's where more work, partly in programming, partly in design/requirements comes in. Even if your app cleanly recompiles both on 32bit and 64bit environments and is verified on both, is it actually benefitting from 64bit ? Are there changes that can/should be done to the code logic to make it do more / run faster in 64bit ? Can you do those changes without breaking 32bit backward compatibility ? Without negative impacts on the 32bit target ? Where will the enhancements be, and how much can you gain ?
For a large commercial project, answers to these questions are often important markers on the roadmap because your starting point is some existing "money maker"...

C++: Datatypes, which to use and when?

I've been told that I should always use size_t when I want a 32-bit unsigned int. I don't quite understand why, but I think it has something to do with the fact that if someone compiles the program on a 16- or 64-bit machine, unsigned int would become 16 or 64 bits, but size_t won't - but why doesn't it? And how can I force the bit sizes to be exactly what I want?
So, where is the list of which datatype to use and when? For example, is there a size_t alternative to unsigned short? Or for a 32-bit int? etc. How can I be sure my datatypes have as many bits as I chose in the first place, without needing to worry about different bit sizes on other machines?
Mostly I care more about the memory used rather than the marginal speed boost I'd get from doubling the memory usage, since I don't have much RAM. So I want to stop worrying about whether everything will break if my program is compiled on a machine that's not 32-bit. For now I've always used size_t when I want it to be 32-bit, but for short I don't know what to do. Someone help me clear my head.
On the other hand: if I need a 64-bit size variable, can I use it on a 32-bit machine successfully? And what is that datatype's name (if I want it to be 64 bits always)?
size_t is for storing object sizes. It is of exactly the right size for that and only that purpose - 4 bytes on 32-bit systems and 8 bytes on 64-bit systems. You shouldn't confuse it with unsigned int or any other datatype. It might be equivalent to unsigned int or might be not depending on the implementation (system bitness included).
Once you need to store something other than an object size you shouldn't use size_t and should instead use some other datatype.
As a side note: For containers, to indicate their size, don't use size_t, use container<...>::size_type
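For example (a minimal sketch):
#include <iostream>
#include <vector>

int main()
{
    std::vector<int> v(3, 42);

    // Use the container's own size type rather than unsigned int or size_t:
    for (std::vector<int>::size_type i = 0; i < v.size(); ++i)
        std::cout << v[i] << "\n";

    return 0;
}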
boost/cstdint.hpp can be used to be sure integers have the right size.
size_t is not necessarily 32-bit. It has been 16-bit with some compilers. It's 64-bit on a 64-bit system.
The C++ standard guarantees, via reference down to the C standard, that long is at least 32 bits.
int is only formally guaranteed 16 bits, but in practice I wouldn't worry: the chance that any ordinary code will be used on a 16-bit system is slim indeed, and on any 32-bit system int is 32-bit. Of course it's different if you're coding for a 16-bit system like some embedded computer. But in that case you'd probably be writing system-specific code anyway.
Where you need exact sizes you can use <stdint.h> if your compiler supports that header (it was introduced in C99, and the current C++ standard stems from 1998), or alternatively the corresponding Boost library header boost/cstdint.hpp.
However, in general, just use int. ;-)
Cheers & hth.,
size_t is not always 32-bit. E.g. it's 64-bit on 64-bit platforms.
For fixed-size integers, stdint.h is best. But it doesn't come with VS2008 or earlier - you have to download it separately. (It comes as a standard part of VS2010 and most other compilers).
Since you're using VS2008, you can use the MS-specific __int32, unsigned __int32 etc types. Documentation here.
To answer the 64-bit question: Most modern compilers have a 64-bit type, even on 32-bit systems. The compiler will do some magic to make it work. For Microsoft compilers, you can just use the __int64 or unsigned __int64 types.
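For example (MSVC-specific; this compiles in a 32-bit build too):
int main()
{
    // The compiler synthesizes 64-bit arithmetic even on 32-bit x86 targets.
    __int64 big = 5000000000i64;                // too large for 32 bits
    unsigned __int64 mask = 0xFFFFFFFFFFFFui64; // 48 set bits
    return (big > 0 && mask != 0) ? 0 : 1;
}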
Unfortunately, one of the quirks of the nature of data types is that it depends a great deal on which compiler you're using. Naturally, if you're only compiling for one target, there is no need to worry - just find out how large the type is using sizeof(...).
If you need to cross-compile, you could ensure compatibility by defining your own typedefs for each target (surrounded by #ifdef blocks referencing which target you're cross-compiling to).
If you're ever concerned that it could be compiled on a system that uses types with even weirder sizes than you have anticipated, you could always assert(sizeof(short)==2) or equivalent, so that you could guarantee at runtime that you're using the correctly sized types.
Your question is tagged visual-studio-2008, so I would recommend looking in the documentation for that compiler for pre-defined data types. Microsoft has a number that are predefined, such as BYTE, DWORD, and LARGE_INTEGER.
Take a look in windef.h and winnt.h for more.