Shared Memory Interface between Windows 64 bits and 32 bits - c++

I need to write code on Windows 7 (64-bit) that launches a 32-bit program that has a Shared Memory Interface (SMI). More precisely, the program I am writing writes into the SMI and the 32-bit program reads from it.
The first problem is that I don't have access to the source code of the 32-bit program, and that can't be solved. The second problem is that the SMI stores the address of the information that is written. This pointer is stored as a based pointer using the following code:
gpSharedBlock->m_pData[uiDataPointer] = (char __based(gpSharedBlock)*)pData;
Where pData is a pointer to the data we are writing, and gpSharedBlock->m_pData[i] points to the i-th element stored.
You have probably already noticed the problem: a pointer in a 32-bit Windows process is 4 bytes, while a pointer in a 64-bit process is 8 bytes. Since the value stored is a 64-bit pointer, the value finally read by the 32-bit program is not the desired one.
My question is: is there a way to do a translation of the 64-bit address to a 32-bit address such that the program that is running reads the correct information?
I have read about WOW64, and I suppose that the W32 program is running under it, but I don't know how to take advantage of that. Any ideas?

A __based pointer is a numeric offset from another pointer. It is effectively a virtual pointer interpreted at runtime.
A pointer is 8 bytes in 64-bit code, so to be compatible with the 32-bit program, you will have to declare the pointer members of the SharedBlock type in your 64-bit code as 4-byte integers instead of pointers, e.g.:
struct sSharedBlock
{
int32_t m_pData[...];
};
pData is __based on gpSharedBlock, so the value of pData is a relative offset from the value of gpSharedBlock. Use that fact to determine the actual byte offset of your data block relative to the gpSharedBlock memory block, and then store that offset value into m_pData[] as an integer. That is what the SMI memory block is actually expecting anyway - an offset, not a real pointer. The __based keyword is just a fancy way of handling offsets using pointers without doing the offset calculations manually in code.
The original code is effectively the same as the following without needing the __based keyword:
gpSharedBlock->m_pData[uiDataPointer] = (int32_t) ( ((char*)pData) - ((char*)gpSharedBlock) );
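As a minimal sketch of the offset approach, assuming a hypothetical block layout (the real layout is dictated by the 32-bit program and is not known here), the writer stores a 32-bit offset and a reader rebases it onto wherever the block is mapped in its own address space. The names SharedBlock, m_storage, storeOffset and resolveOffset are illustrative only:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical layout; the real one is defined by the 32-bit program.
struct SharedBlock {
    std::int32_t m_pData[4];    // 32-bit offsets from the start of the block
    char         m_storage[64]; // payload area living inside the same block
};

// Store the offset of pData (which must lie inside the block) in a slot.
inline void storeOffset(SharedBlock* block, unsigned slot, const void* pData) {
    const std::ptrdiff_t offset =
        static_cast<const char*>(pData) - reinterpret_cast<const char*>(block);
    assert(offset >= 0 && static_cast<std::size_t>(offset) < sizeof(SharedBlock));
    block->m_pData[slot] = static_cast<std::int32_t>(offset);
}

// What a reader does: add the stored offset to its own base address.
inline const char* resolveOffset(const SharedBlock* block, unsigned slot) {
    return reinterpret_cast<const char*>(block) + block->m_pData[slot];
}
```

Because only the offset is shared, it does not matter that the 64-bit and 32-bit processes see the block at different virtual addresses.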

Related

Why does my compiler use an 8-bit char when I'm running on a 64-bit machine?

I am using the Microsoft Visual Studio 2013 IDE. When I compile a program in C++ while using the header <climits>, I output the macro constant CHAR_BIT to the screen. It tells me there are 8-bits in my char data type (which is 1-byte in C++). However, Visual Studio is a 32-bit application and I am running it on a 64-bit machine (i.e. a machine whose processor has a 64-bit instruction set and operating system is 64-bit Windows 7).
I don't understand why my char data type uses only 8-bits. Shouldn't it be using at least 32-bits (since my IDE is a 32-bit application), let alone 64-bits (since I'm compiling on a 64-bit machine)?
I am told that the number of bits used in a memory address (1-byte) depends on the hardware and implementation. If that's the case, why does my memory address still only use 8-bits and not more?
I think you are confusing memory address bit-width with data value bit-width. Memory addresses (pointers) are 32 bits for 32-bit programs and 64 bits for 64-bit programs. But data types have different widths for their values depending on type (as governed by the standard). So a char is 8-bits, but a char* will be 32-bits if you are compiling as a 32-bit application (also note here it depends on how you compile the application and not what type of processor or OS you are running on).
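A small sketch, assuming a C++11 compiler, that keeps the two widths apart. The helper names charValueBits and charPointerBits are made up for illustration:

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Number of bits in one char value (CHAR_BIT is at least 8 by the standard).
constexpr std::size_t charValueBits() { return CHAR_BIT; }

// Number of bits in a char* for the target this code is compiled for:
// typically 32 for a 32-bit build and 64 for a 64-bit build.
constexpr std::size_t charPointerBits() { return sizeof(char*) * CHAR_BIT; }

// sizeof(char) is 1 by definition, whatever the pointer width is.
static_assert(sizeof(char) == 1, "char is always one byte");
```

Rebuilding the same source for a different target changes charPointerBits() but not charValueBits().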
Edit for questions:
However, what is the relationship between these two?
Memory addresses will always have the same bit width regardless of what data value is stored there.
For example, if I have a 32-bit address and I assign an 8-bit value to that address, does that mean there are 24-bits of unused address space?
Some code (assume 32-bit compilation):
char i_am_1_byte = 0x00; // an 8-bit data value that lives in memory
char* i_am_a_ptr = &i_am_1_byte; // pointer is 32-bits and points to an 8-bit data value
*i_am_a_ptr = 0xFF; // writes 0xFF to the location pointed to by the pointer
// that is, to i_am_1_byte
So we have i_am_1_byte which is a char and takes up 8 bits somewhere in memory. We can get this memory location using the address-of operator & and store it in the pointer variable i_am_a_ptr, which is your 32-bit address. We can write 8 bits of data to the location pointed to by i_am_a_ptr by dereferencing it.
If not, what is the bit-width for memory address actually used for
All the data that your program uses must be located somewhere in memory and each location has an address. Most programs probably will not use most of the memory available for them to use, but we need a way to address every possible location.
how can having more memory address bit-width be helpful?
That depends on how much data you need to work with. A 32-bit program can address at most 4GB of memory space (and this may be smaller depending on your OS). That used to be a very, very large amount of memory, but these days it is conceivable a program could run out. It is also a lot easier for the CPU to address more than 4GB of RAM if it is 64-bit (this gets into the difference between physical memory and virtual memory). Of course, a 64-bit architecture means a lot more than just bigger addresses and brings many benefits that may be more useful to programs than the bigger memory space.
An interesting fact is that some processors, such as 32-bit ARM, are optimized for word-aligned memory access. As a result, compilers for them often align or pad data to 32 bits (4 bytes) even when the data type needs less than 4 bytes, unless the source code says otherwise. The size of the type itself does not change; only its placement in memory does.

does the size of char * (character pointer) in C/C++ vary? - use for database column fixed size

per the following code, I get the size of a character pointer is 8 bytes. Yet this site has a size of 1 byte for the char pointer.
#include <stdio.h>
int main(void ){
char *a = "saher asd asd asldasdas;daksd ahwal";
printf("\nSize = %zu \n", sizeof(a)); /* sizeof yields a size_t, so use %zu rather than %d */
return 0;
}
Is this always the case? I am writing a connector for a simple database I am implementing and want to read TEXT field of mysql into my database. Since TEXT has variable size, I was wondering if my column Type/metadata can have a fixed size of 8 bytes where I store the pointer in memory to the string (char *)?
per the following code, I get the size of a character pointer is 8 bytes. Yet this site has a size of 1 byte for the char pointer.
It's implementation-defined. It's usually 8 on a 64-bit Intel system and 4 on a 32-bit Intel system. Don't rely on it being any particular size.
I am writing a connector for a simple database I am implementing and want to read TEXT field of mysql into my database. Since TEXT has variable size, I was wondering if my column can have a fixed size of 8 bytes where I store the pointer in memory to the string (char *)?
It makes no sense at all to store pointers into memory in a database. A database is for persistent data. On the other hand, data stored in memory is liable to disappear whenever a process exits (or the system is restarted).
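One common alternative, sketched here under the assumption that a variable-length column may be stored as a length prefix followed by the raw bytes, is to persist the characters themselves rather than their address. The functions appendText and readText are hypothetical and not part of MySQL or any other library:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Append a text value as a 4-byte little-endian length followed by its bytes.
inline void appendText(std::vector<unsigned char>& out, const std::string& text) {
    assert(text.size() <= UINT32_MAX);
    const std::uint32_t len = static_cast<std::uint32_t>(text.size());
    for (int i = 0; i < 4; ++i)
        out.push_back(static_cast<unsigned char>((len >> (8 * i)) & 0xff));
    out.insert(out.end(), text.begin(), text.end());
}

// Read one value back starting at pos; pos is advanced past the record.
inline std::string readText(const std::vector<unsigned char>& in, std::size_t& pos) {
    assert(pos + 4 <= in.size());
    std::uint32_t len = 0;
    for (int i = 0; i < 4; ++i)
        len |= static_cast<std::uint32_t>(in[pos + i]) << (8 * i);
    pos += 4;
    assert(pos + len <= in.size());
    std::string text(in.begin() + static_cast<std::ptrdiff_t>(pos),
                     in.begin() + static_cast<std::ptrdiff_t>(pos + len));
    pos += len;
    return text;
}
```

The stored bytes remain meaningful after the process exits, which a raw char * value would not.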
No, it is not. The size of a pointer depends on the CPU architecture and the compiler. Some architectures even have different sizes depending on the "type" of the pointer. On x86_64, a pointer occupies 64 bits, but current implementations typically use only 48 bits of virtual address, and the upper 16 bits must be copies of bit 47. One could therefore use pointer packing to serialize/deserialize pointers into 48-bit chunks.
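A rough sketch of such pointer packing, assuming x86_64 addresses in 48-bit canonical form (this does not hold on systems that enable wider virtual addresses). The names pack48 and unpack48 are invented for this example, and the address is passed as a plain integer:

```cpp
#include <cassert>
#include <cstdint>

// Pack the low 48 bits of an address into 6 bytes, most significant first.
inline void pack48(std::uint64_t address, unsigned char out[6]) {
    for (int i = 0; i < 6; ++i)
        out[i] = static_cast<unsigned char>((address >> (8 * (5 - i))) & 0xff);
}

// Rebuild the 64-bit value, copying bit 47 into bits 48..63 (canonical form).
inline std::uint64_t unpack48(const unsigned char in[6]) {
    std::uint64_t value = 0;
    for (int i = 0; i < 6; ++i)
        value = (value << 8) | in[i];
    if (value & (std::uint64_t{1} << 47))
        value |= 0xffff000000000000ULL;
    return value;
}
```

The round trip is only lossless for addresses whose upper 16 bits already equal bit 47.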
A variable can be different sizes based on the computer that you are using. This is causing the discrepancy between your results and the results you see online.
However, the variable will always be the same size on the same machine.
On a given platform, data pointers generally all have the same size, regardless of the pointed-to type (char, string, object, etc.).
On a PC with a 64-bit operating system (and a compiler that targets 64-bit), the size of a pointer is 8 bytes (64-bit address space).
Other platforms may use 4 bytes, 2 bytes, or even 1 byte (like an 8-bit microcontroller).

Storing hexadecimal addresses in a file

I have a pintool application which stores the memory addresses accessed by an application in a file. These addresses are in hexadecimal form. If I write these addresses as strings, they will take a huge amount of storage (nearly 300GB). Writing such a large file will also take a large amount of time, so I am looking for an alternative way to reduce the amount of storage used.
Each character of hexadecimal address represent 4 bits and each ASCII character is of 8 bits. So I am thinking of representing two hexadecimal characters by one ASCII character.
For example :
if my hexadecimal address is 0x26234B
then corresponding converted ASCII address will be &#K (0x is ignored as I know all address will be hexadecimal).
I want to know whether there is a more efficient method for doing this that takes less storage.
NOTE : I am working in c++
This is a good start. If you really want to go further, you can consider compressing the data using something like a zip library or Huffman encoding.
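The packing proposed in the question (two hexadecimal digits per byte) can be sketched as follows. This is an illustration only; hexValue and packHex are made-up names, and the input is assumed to have an even number of digits and no 0x prefix:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Value of one hexadecimal digit, or -1 if c is not a hex digit.
inline int hexValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Pack each pair of hex digits into a single byte.
inline std::string packHex(const std::string& hex) {
    assert(hex.size() % 2 == 0);
    std::string bytes;
    bytes.reserve(hex.size() / 2);
    for (std::size_t i = 0; i < hex.size(); i += 2) {
        const int hi = hexValue(hex[i]);
        const int lo = hexValue(hex[i + 1]);
        assert(hi >= 0 && lo >= 0);
        bytes.push_back(static_cast<char>((hi << 4) | lo));
    }
    return bytes;
}
```

With an ASCII character set, packHex("26234B") yields the three characters &#K from the question. The packed bytes can include zero bytes and newlines, so the file should be opened in binary mode and use fixed-width records rather than line separators.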
Assuming your addresses are 64-bit pointers, and that such a representation is sensible for your platform, you can just store them as 64-bit ints. For example, 0x1234567890abcdef could be stored as the eight bytes:
12 34 56 78 90 ab cd ef
(your pointer, stored in 8 bytes.)
or the same, but backwards, depending on what endianness you choose. Specifically, you should read up on endianness.
We can even do this somewhat platform-independently: uintptr_t is an unsigned integer type wide enough to hold a pointer (assuming one exists, which it usually does, but it's not a sure thing), and sizeof(our_pointer) gives us the size in bytes of a pointer. We can arrive at the above bytes as follows:
Convert the pointer to an integer representation (i.e., 0x0026234b)
Shift the bytes around to pick out the one we want.
Stick it somewhere.
In code:
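// Needs <cstdint> for uintptr_t; assumes 8-bit bytes.
unsigned char buffer[sizeof(YourPointerType)];
// Walk from the most significant byte to the least significant one.
for (unsigned int i = 0; i < sizeof(YourPointerType); ++i) {
    // The shift count is in bits, so multiply the byte index by 8.
    const unsigned int shift = 8 * (sizeof(YourPointerType) - i - 1);
    buffer[i] = static_cast<unsigned char>(
        (reinterpret_cast<uintptr_t>(your_pointer) >> shift) & 0xff
    );
}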
Some notes:
That'll do a >> 0 on the last loop iteration. A shift by zero is well-defined (it leaves the value unchanged), so no special case is needed.
This will write out pointers of the size of your platform, and requires that they can be converted sensibly to integers. (I think uintptr_t won't exist if this isn't the case.) It won't do the same thing on 64- as it will on 32-bit platforms, as they have different pointer sizes. (Or any other pointer-sized platform you run across.)
A program's pointers aren't valid once the program dies, and might not even remain valid when the program is still running. (If the pointer points to memory that the program decides to free, then the pointer is invalid.)
There's likely a library that'll do this for you. (struct, in Python, does this.)
The above is a big-endian encoder. Alternatively, you can write out little endian — the Wikipedia article details the difference.
Last, you can just cast a pointer to the pointer to an unsigned char *, and write that. (I.e., dump the actual memory of the pointer to a file.) That's way more platform dependent though.
If you need to save even more space, I'd run it through gzip.
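To read the addresses back, the big-endian encoder above needs a matching decoder. The pair below is a minimal sketch assuming 8-bit bytes and an implementation that provides uintptr_t. The function names encodeBigEndian and decodeBigEndian are illustrative:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Encode value into sizeof(value) bytes, most significant byte first.
inline void encodeBigEndian(std::uintptr_t value, unsigned char* out) {
    for (std::size_t i = 0; i < sizeof(value); ++i)
        out[i] = static_cast<unsigned char>(
            (value >> (8 * (sizeof(value) - i - 1))) & 0xff);
}

// Rebuild the integer from bytes written by encodeBigEndian.
inline std::uintptr_t decodeBigEndian(const unsigned char* in) {
    std::uintptr_t value = 0;
    for (std::size_t i = 0; i < sizeof(value); ++i)
        value = (value << 8) | in[i];
    return value;
}
```

A file written this way can only be decoded by a reader that assumes the same pointer width, so it may help to record sizeof(uintptr_t) once at the start of the file.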

Storing address of buffer in an unsigned integer;

I have a memory buffer whose address I want to store in an unsigned integer value.
uint8_t* _buff = new uint8_t[1024];
uint64_t* _base_addr = (uint64_t *)_buff;
I want the address of the location pointed to by _buff or _base_addr (it is the same location either way) to be stored in, say, a uint32_t value,
so that when I read the value of the integer it gives me the address.
How can this be done?
You cannot store an address in "say" uint32_t variable, as the address might not fit - on 64-bit systems the pointers require 64 bits of storage. Instead of fixing the size, use the uintptr_t of C99 (<stdint.h>) or C++11 (<cstdint>).
To store the address in such an integer variable, use
uintptr_t variable = (uintptr_t)pointer;
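A short round-trip sketch, assuming the implementation provides uintptr_t. The helpers toInteger, toBuffer and fitsIn32Bits are hypothetical names used only here:

```cpp
#include <cassert>
#include <cstdint>

// Convert a pointer to an integer wide enough to hold it.
inline std::uintptr_t toInteger(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p);
}

// Convert the integer back to the original buffer pointer.
inline std::uint8_t* toBuffer(std::uintptr_t v) {
    return reinterpret_cast<std::uint8_t*>(v);
}

// True when the address could also be stored in a uint32_t without loss.
inline bool fitsIn32Bits(std::uintptr_t v) {
    return v <= UINT32_MAX;
}
```

On a 64-bit build, fitsIn32Bits can return false for a heap address, which is exactly the case in which a uint32_t would silently truncate it.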
Just cast it: uint32_t addr = (uint32_t)_buff;
You should have a very good reason for doing it though, for 2 reasons:
It's not portable and might even be wrong: the size of an address differs between systems, and is normally (but not necessarily) either 32 or 64 bits.
It harms the readability of your code. Pointers exist for precisely this reason - to store (and manipulate) addresses.
You might want to store an address in an integer when you need to manipulate HW devices mapped into memory. In this case you need to have a very good idea of what you're doing.

Why the size of a pointer is 4bytes in C++

On a 32-bit machine, why is the size of a pointer 32 bits? Why not 16 or 64 bits? What are the pros and cons?
Because it mimics the size of the actual "pointers" in assembler. On a machine with a 64 bit address bus, it will be 64 bits. The old 6502 was an 8 bit machine, but it had a 16 bit address bus so that it could address 64K of memory. On most 32 bit machines, 32 bits were enough to address all the memory, so that's what the pointer size was in C++. I know that some of the early M68000 series chips only had a 24 bit memory address space, but it was addressed from a 32 bit register so even on those the pointer would be 32 bits.
In the bad old days of the 80286, it was worse - there was a 16 bit address register, and a 16 bit segment register. Some C++ compilers didn't hide that from you, and made you declare your pointers as near or far depending on whether you wanted to change the segment register. Mercifully, I've recycled most of those brain cells, so I forget if near pointers were 16 bits - but at the machine level, they would be.
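For illustration, here is how a segment and a 16-bit offset combine into one address, assuming x86 real-mode addressing (linear address = segment * 16 + offset). Protected-mode segments on the 80286 work differently, and the FarPointer type is invented for this sketch rather than taken from any compiler:

```cpp
#include <cassert>
#include <cstdint>

// A "far" pointer carries both halves; a "near" pointer is the offset alone.
struct FarPointer {
    std::uint16_t segment;
    std::uint16_t offset;
};

// Real-mode address computation: shift the segment left by 4 bits, add the offset.
constexpr std::uint32_t linearAddress(FarPointer p) {
    return (static_cast<std::uint32_t>(p.segment) << 4) + p.offset;
}
```

For example, FarPointer{0xB800, 0x0000} maps to linear address 0xB8000.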
The size of a pointer in C++ is implementation-defined. C++ might run on anything from your toaster's chip up to huge mainframes. Different architectures require different sizes of the data types.
If on your implementation a pointer is 32bit, then that's very likely an architecture which can address 2^32 bytes. (Note that even the size of bytes might be different depending on the implementation.) 64bit architectures generally can address 2^64 bytes, so implementations on these architectures will likely have a pointer size of 64bit.
16 bit would obviously be insufficient - you could only address 64K then.
Why not emulate 64 bit on 32 bit systems - I guess because the performance of pointer arithmetic would degrade.
As mentioned in many other answers, the size of a pointer need not be 32-bits - the implementation will set the size of a pointer to be whatever the architecture of the platform dictates. On a system with 64-bit addressing, the size of a pointer will generally be 64-bits.
However, you should also note that even on a single implementation, different types of pointers might have different sizes. In particular, pointer-to-member types (which I'll grant are odd-ball pointers) may have different sizes than plain-old pointers to objects.
The same is true about pointers to plain old functions - they might have a different size than pointers to objects (this applies to C as well as C++). However on modern desktop systems you'll usually find that pointers to functions are the same size as pointers to objects.
Here's a short example of fun with pointer-to-member-functions:
#include <stdio.h>
class A {};
class B {};
class VirtD: public virtual A, public virtual B {
public:
virtual int Dfunc() { return 5; };
};
typedef int (VirtD::* Derived_mfp)();
int main()
{
VirtD virtd;
Derived_mfp mfp = &VirtD::Dfunc;
printf( "sizeof( mfp) == %u\n", (unsigned int) sizeof( mfp));
}
Displays: sizeof( mfp) == 12 on MSVC.
The size of the pointer has little to do with the architecture(32bit, 64bit). 32bit usually refers to the fact that the register size is 32bit. As a result, the maximum possible number of address that you can address using one register is 2^32. So, it boils down to efficiency of addressing the memory slots using a register.
With a 32-bit pointer you can point to a wider range of memory than with 16-bit pointers. When 32-bit pointers were standardized, 64-bit CPUs were not very popular (or even existent?). Therefore a 64-bit pointer would not have fit inside a CPU register, which is a very important factor for speed.
Why not 16-bit? Because, presuming a flat 32-bit address space, you cannot address every byte. Far from it: you can only address 2^16 unique locations with a 16-bit pointer. Even if your pointers only point to dwords and not bytes, this still leaves 1073676288 dwords unaddressable.
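The arithmetic behind that figure can be checked at compile time. This is a small sketch assuming a C++11 compiler, with helper names chosen for the example:

```cpp
#include <cassert>
#include <cstdint>

// Number of distinct values an address of the given width can take (bits < 64).
constexpr std::uint64_t locations(unsigned bits) {
    return std::uint64_t{1} << bits;
}

// A flat 32-bit space holds 2^32 bytes, i.e. 2^30 dwords of 4 bytes each.
// A 16-bit pointer reaches only 2^16 of them.
constexpr std::uint64_t unreachableDwords() {
    return locations(30) - locations(16);
}

static_assert(unreachableDwords() == 1073676288ULL, "matches the figure in the text");
```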
Assuming a flat 32-bit address space, you can already address every single byte with a 32-bit pointer. At this point, 64-bit pointers are just wasting space, unless you want to add additional information to each pointer. For example, on 32-bit PowerPC, a function descriptor is actually a 96-bit entity, with one third pointing to the executable code and the rest being data that helps make relocating modules easier.
In a segmented address space, having larger-than-32-bit pointers to data could be useful. Windows NT on the DEC Alpha was a 32-bit operating system, but the Alpha hardware was 64-bit capable. Your ordinary address space was still 32-bit, but there were special APIs to allow 32-bit programs to access 64-bit addresses, as if they were in otherwise-inaccessible segments.
To answer your question: C++ itself says very little about the size of a pointer, and certainly not that it has to be 32 bits or anything. The size of a pointer should be the natural one for the machine architecture.