I am trying to print the address of a data member of a class:
#include <iostream>
struct test { int x; };
int main() {
    test t;
    std::cout << &t.x << std::endl;
}
The output is:
0x23fe4c
I don't understand how this points to a memory address. I want to know the meaning of this way of representing addresses.
The 0x (or sometimes 0X) prefix indicates that the value following it is written in hexadecimal, i.e. represented in base (or radix) 16 instead of the base 10 used for decimal values. For example, 0x1234abcd means 1234abcd₁₆, which written in decimal is 305441741₁₀, or simply 305441741. This is one common representation used for memory addresses and in other computer- and programming-related contexts.
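For instance, a quick sketch printing the same value in both bases:

#include <iostream>

int main()
{
    unsigned value = 0x1234abcd;                 // hexadecimal literal
    std::cout << value << std::endl;             // 305441741 (decimal is the default)
    std::cout << std::hex << value << std::endl; // 1234abcd
}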
Related
I was reading the Wolfenstein 3D code, and I encountered the ISPOINTER macro:
#define ISPOINTER(x) ((((uintptr_t)(x)) & ~0xffff) != 0)
I know we have std::is_pointer, but how does this macro work? When I tried it, I got strange behavior that I couldn't explain:
#define ISPOINTER(x) ((((uintptr_t)(x)) & ~0xffff) != 0)

int main()
{
    int* ptr;
    int val;
    if (ISPOINTER(ptr)) {
        std::cout << "`ptr`: Is Pointer" << std::endl;
    }
    if (ISPOINTER(val)) {
        std::cout << "`val`: Is Pointer" << std::endl;
    }
}
I don't have any output, but if I add another pointer:
#define ISPOINTER(x) ((((uintptr_t)(x)) & ~0xffff) != 0)

int main()
{
    int* ptr;
    int val;
    int* ptr2;
    if (ISPOINTER(ptr)) {
        std::cout << "`ptr`: Is Pointer" << std::endl;
    }
    if (ISPOINTER(val)) {
        std::cout << "`val`: Is Pointer" << std::endl;
    }
    if (ISPOINTER(ptr2)) {
        std::cout << "`ptr2`: Is Pointer" << std::endl;
    }
}
The output will be:
`ptr`: Is Pointer
What is ISPOINTER doing? Is it undefined behavior?
Let's do this in steps:
((uintptr_t)(x)) is simply a cast from whatever x is into a uintptr_t (an unsigned integer type capable of storing pointer values)
~0xffff is the bitwise complement of 0xffff (which is 16 bits of all 1s). The result is a value that is all 1s except the lowest 16 bits, which are 0.
((uintptr_t)(x)) & ~0xffff is a bitwise AND of the pointer value with that mask, which effectively zeroes out the 16 lowest bits of the pointer value.
The full expression then checks whether the result is nonzero. So the whole thing checks if any bits other than the least-significant 16 are set, and if so considers the value a pointer.
Since this came from Wolfenstein 3D, the authors probably assumed that all dynamically allocated memory lives at high addresses (higher than 2^16). So this is NOT a check of whether a type is a pointer using the type system, the way std::is_pointer does it. It is an assumption based on the target architecture Wolfenstein 3D was likely to run on.
Keep in mind that this is not a safe assumption: "normal" values above 2^16 would also be classified as pointers, and the memory layout of your process can vary with a lot of factors (e.g. ASLR). Also note that your test reads the uninitialized variables ptr and val, which is itself undefined behaviour and explains the inconsistent output you saw.
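A quick sketch demonstrating both the intended behaviour and the caveat (with initialized values this time, so the output is well-defined; the heap result depends on where your platform places allocations):

#include <stdint.h>
#include <iostream>

#define ISPOINTER(x) ((((uintptr_t)(x)) & ~0xffff) != 0)

int main()
{
    int small = 42;         // fits in the low 16 bits
    int big = 0x12345;      // a plain value above 2^16
    int* heap = new int(0); // heap addresses are typically high

    std::cout << ISPOINTER(small) << std::endl; // 0: classified as "not a pointer"
    std::cout << ISPOINTER(big) << std::endl;   // 1: misclassified as a pointer!
    std::cout << ISPOINTER(heap) << std::endl;  // almost certainly 1
    delete heap;
}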
It might be similar to certain parameters in Windows, which can be an ordinal value or a pointer. An ordinal value will be <64K. On that operating system, any legal pointer will be >64K. (On older 16-bit versions of Windows, only 4K was reserved.)
This code may be doing the same thing. A "resource" may be a built-in or registered value referred to by a small number, or a pointer to an ad-hoc object. This macro is used to decide whether to use it as-is or look it up in the table instead.
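A minimal sketch of that ordinal-vs-pointer pattern, modeled on Windows' IS_INTRESOURCE macro (the value 42 is just a made-up ordinal for illustration):

#include <stdint.h>
#include <iostream>

// Same idea as Windows' IS_INTRESOURCE: values below 64K are treated as
// ordinals, everything else as a real pointer.
#define IS_ORDINAL(p) ((((uintptr_t)(p)) >> 16) == 0)

int main()
{
    int object = 0;
    const void* ordinal = (const void*)42;         // a small "registered" id
    std::cout << IS_ORDINAL(ordinal) << std::endl; // 1: look it up in a table
    std::cout << IS_ORDINAL(&object) << std::endl; // almost certainly 0: use as-is
}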
When I want to calculate the address of a function, I do the following:
HMODULE base = GetModuleHandle(L"program.exe"); // Module base addr
// Adding an offset to the base
std::cout << (base + 0x8F0A0) << std::endl; // -> Wrong!
I'm not sure why the result is wrong. I've tested it with online hex calculators and also used a debugger to check both values.
Could base be treated as decimal while the other value is hex, producing a wrong result?
How can I get the result in hex?
As explained here, depending on whether STRICT is defined, HMODULE is essentially either a void* or a <unique type>*, the purpose being to make each handle type a distinct C++ type, so that mixing and matching produces compiler errors. In the former case, pointer arithmetic won't compile. In the latter case, it will compile, but you can't rely on the result, because pointer arithmetic takes the pointed-to type's size into account and because pointer arithmetic is undefined if you leave the bounds of the object/array being pointed to.
You should treat this pointer as pointing to nothing in particular, and therefore not do pointer arithmetic. You have to reinterpret_cast it to an integral type that you're sure is large enough (std::uintptr_t) and then do arithmetic on that integral value.
In my local header, this unique type contains an int member, so adding 1 will actually move the pointer ahead by 4 bytes (you know, except for the undefined behaviour and all). It just so happens that 0x00DE0000 + 4 * 0x8F0A0 is your 0x0101C280 value.
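Putting the advice together, a minimal sketch (Windows-only; GetModuleHandle(nullptr) returns the handle of the running executable):

#include <windows.h>
#include <cstdint>
#include <iostream>

int main()
{
    HMODULE module = GetModuleHandle(nullptr);
    // Convert the handle to an integer first; no pointer arithmetic on HMODULE.
    std::uintptr_t base = reinterpret_cast<std::uintptr_t>(module);
    std::uintptr_t address = base + 0x8F0A0; // the offset from the question
    std::cout << std::hex << address << std::endl;
}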
Your problem lies with the value GetModuleHandle(L"program.exe") returns: 00DE0000. You need to use C hexadecimal syntax, i.e. prepend "0x" to your hex number 00DE0000.
Hence, your base number should be cast to a numeric value: 0x00DE0000
0x00DE0000 is equal to 00DE0000
Try using std::string to_string(int value); to convert it to a string, then convert your hex value (base) to C hexadecimal syntax (add "0x" at the beginning). To finish off, convert your base value back to a numeric value (e.g. use std::stoi) and perform the addition using std::hex.
Try this code here.
#include <iostream>

int main () {
    int hex1 = 0x8F0A0;
    int hex2 = 0x00DE0000; // Using int values
    std::cout << std::hex << hex1 + hex2 << std::endl;
}
As Chris said, I had the same case and solved it like this:
int offset = 0x8F0A0;
std::uintptr_t base = reinterpret_cast<std::uintptr_t>(GetModuleHandle(L"program.exe"));
// Here 0x1000 (4096) bytes are added to the offset.
std::cout << std::hex << (base + (offset + 4096)) << std::endl;
I'm studying casting in C++, and the code below is magic to me.
#include <iostream>
using namespace std;

class Base {
public:
    virtual void f() { }
};

#define SOME_VALUE 8

int main() {
    cout << SOME_VALUE << endl;
    getchar();
}
The output is: 8
The code is very simple, but what type is SOME_VALUE? int, double, or char?
The following is more complex:
#include <iostream>
using namespace std;

class Base {
public:
    virtual void f() { }
};

#define SOME_VALUE 8

int main() {
    cout << (Base*)SOME_VALUE - SOME_VALUE << endl;
    getchar();
}
The output is: FFFFFFE8
From this code, I can understand that SOME_VALUE is a numeric type. I also tested sizeof(SOME_VALUE) and the output is 4. But if SOME_VALUE is numeric, how can it be converted to an object pointer? And how can an integer be subtracted from an object pointer?
#define is a preprocessor command. It gets evaluated before the code is compiled. All that happens is that SOME_VALUE in the main function has its text replaced by the text SOME_VALUE is defined as, which is 8.
SOME_VALUE itself doesn't have a C++ type because it only exists before preprocessing. After preprocessing, SOME_VALUE won't exist in the C++ program; you'll just have the literal 8, which is an int.
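In other words, after preprocessing the compiler effectively sees this (the class is omitted since it plays no role in the first example):

#include <cstdio>
#include <iostream>
using namespace std;

int main() {
    cout << 8 << endl; // the token SOME_VALUE was replaced by 8, an int literal
    getchar();
}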
For the second question, the cast to Base* uses a C-style cast. That is capable of converting almost anything to anything, here by simply reinterpreting the integer value as a pointer of the target type. So it can be quite dangerous if the value doesn't actually refer to an object of the target type. In C++, I suggest using static_cast or reinterpret_cast to make it more explicit what is being cast.
I think (Base*)SOME_VALUE will end up as a Base* to the memory address 8. So, this is a pointer to a Base object that starts at the 8th byte of memory. There probably isn't a Base object at address 8, so it's not actually very useful. Then "- 8" takes away 8 multiples of the size of Base, the pointed-to type. On a 32-bit computer, a Base containing nothing but its vtable pointer is 4 bytes, so 8 - (8*4) = -24 in decimal, which is FFFFFFE8 in hex.
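A sketch of that arithmetic; this is formally undefined behaviour and is shown purely to illustrate the scaling, with the question's output assuming a 32-bit target where sizeof(Base) == 4:

#include <iostream>

class Base {
public:
    virtual void f() { }
};

int main()
{
    Base* b = (Base*)8;                 // pretend there is a Base at address 8
    std::cout << (b - 8) << std::endl;  // moves back 8 * sizeof(Base) bytes:
                                        // prints 0xffffffe8 on a 32-bit target
}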
If you want to know why a computer represents negative numbers as big numbers, that's a different question. Start here: http://en.wikipedia.org/wiki/Signed_number_representations
SOME_VALUE is a macro--it doesn't have a type. 8, however, is an integer.
Use #define SOME_VALUE ((Base*)8), if you want SOME_VALUE to always act like a Base*.
cout << (Base*)SOME_VALUE-SOME_VALUE <<endl;
Is basically a (horrible) way of doing:
Base* b = (Base*)8;
b = b - 8;
The 8 will be silently multiplied by the size of Base, though (so you're subtracting 8 Base-sized slots, not 8 bytes).
Pointers are typically unsigned, so what's happening is that the unsigned pointer value is wrapping around.
0xFFFFFFE8 is 4294967272, which (assuming 4-byte pointers with the usual wraparound) is the representation of 8 - 32 = -24.
Also, you should never do this in real code. Assigning an arbitrary value to a pointer is sure to end in a fiery explosion.
An easier to understand situation might be like this:
int* p = (int*) 24;
p -= 4; //like ((char*) p) - 4 * sizeof(int)
With 4 byte integers, the value of p would then be 8 because 24 - 4 * sizeof(int) = 24 - 4 * 4 = 24 - 16 = 8.
For example:
int* x = new int;
int y = reinterpret_cast<int>(x);
y now holds the integer value of the memory address stored in x.
Variable y has the size of an int. Will an int always be large enough to store the converted memory address, whatever type is being converted?
EDIT:
Or is it safer to use long int to avoid a possible loss of data?
EDIT 2: Sorry people, to make this question more understandable: what I want to find out here is the size of the returned HEX value as a number, not the size of an int nor the size of a pointer to int, but the plain hex value. I need to get that value in human-readable notation; that's why I'm using reinterpret_cast to convert that memory HEX value to a DEC value. But to store the value safely, I also need to find out what kind of variable to put it in: int, long, what type is big enough?
No, that's not safe. There's no guarantee that sizeof(int) == sizeof(int*).
On a 64 bit platform you're almost guaranteed that it's not.
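A quick check you can run on your own platform (the comments show typical 64-bit values):

#include <cstdint>
#include <iostream>

int main()
{
    std::cout << sizeof(int) << std::endl;            // typically 4
    std::cout << sizeof(int*) << std::endl;           // 8 on most 64-bit targets
    std::cout << sizeof(std::uintptr_t) << std::endl; // always large enough for a pointer
}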
As for the "hexadecimal value" ... I'm not sure what you're talking about. If you're talking about the textual representation of the pointer in hexadecimal ... you'd need a string.
Edit to try and help the OP based on comments:
Because computers don't work in hex. I don't know how else to explain it. An int stores some number of bits (binary), as does a long. Hexadecimal is a textual representation of those bits (specifically, the base-16 representation). Strings are used for textual representations of values. If you need a hexadecimal representation of a pointer, you need to convert that pointer to text (hex).
Here's a c++ example of how you would do that:
test.cpp
#include <string>
#include <iostream>
#include <sstream>
int main()
{
    int x = 0;              // an int to take the address of
    int *p = &x;            // declare a pointer to an int
    std::ostringstream oss; // create a stringstream
    std::string s;          // create a string

    // this takes the value of p (the memory address), converts it to
    // the hexadecimal textual representation, and puts it in the stream
    oss << std::hex << p;

    // Get a std::string from the stream
    s = oss.str();

    // Display the string
    std::cout << s << std::endl;
}
Sample output:
roach$ g++ -o test test.cpp
roach$ ./test
0x7fff68e07730
It's worth noting that the same thing is needed when you want to see the base 10 (decimal) representation of a number: you have to convert it to a string. Everything in memory is stored in binary (base 2).
On most 64-bit targets, int is still 32-bit while pointers are 64-bit, so it won't work.
http://en.wikipedia.org/wiki/64-bit#64-bit_data_models
What you probably want is to use std::ostream's formatting of addresses:
int x(0);
std::cout << &x << '\n';
As to the length of the produced string, you need to determine the size of the respective pointer: each byte of the pointer is printed as two hex digits, because one hex digit represents 16 values (4 bits). Typically all bytes are printed, even if it is unlikely your process has memory across the full range, e.g. with 8-byte pointers on 64-bit systems. This is because the stack often grows downward from the highest addresses while the executable code starts near the beginning of the address range (though the very first page is usually left unmapped so that touching it causes a segmentation violation). Above the executable code live some data segments, followed by the heap, and lots of unused pages.
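For instance, a small sketch that prints an address together with the maximum digit count on the current platform:

#include <iostream>

int main()
{
    int x = 0;
    // Two hex digits per byte: 8 digits for 4-byte pointers, 16 for 8-byte ones.
    std::cout << &x << " (up to " << 2 * sizeof(void*) << " hex digits)" << std::endl;
}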
There is a question addressing a similar topic:
https://stackoverflow.com/a/2369593/1010666
Summary: do not try to write pointers into non-pointer variables.
If you need to print out the pointer value, there are other solutions.
Today I have a weird question.
The Code (C++)
#include <iostream>

union name
{
    int num;
    float num2;
} oblong;

int main(void)
{
    oblong.num2 = 27.881;
    std::cout << oblong.num << std::endl;
    return 0;
}
The Code (C)
#include <stdio.h>

int main(void)
{
    float num = 27.881;
    printf("%d\n", num);
    return 0;
}
The Question
As we know, a C++ union can hold more than one type of data element, but only one type at a time. So the name oblong reserves a single 32-bit portion of memory (because the biggest members of the union, int and float, are each 32-bit), and this portion can hold either an int or a float.
So I just assign a value of 27.881 to oblong.num2 (as you can see in the code above). But out of curiosity, I access the memory through oblong.num, which refers to the same memory location.
As expected, it gives me a value that is not 27, because float and int are represented differently in memory; when I use oblong.num to access that portion of memory, its contents are treated as an integer and interpreted using the integer representation.
I know this phenomenon also occurs in C, which is why I initialized a float variable with a value and later read it using %d. I tried it with the same value, 27.881, as you can see above. But when I run it, something weird happens: the value I get in C is different from the one in C++.
Why does this happen? From what I know, the two values I get from the two programs are not garbage values, but why are they different? I also used sizeof to verify that int and float are 32-bit in both C and C++, so memory size isn't what causes this. What prompts the difference in values?
First of all, having the wrong printf() format string is undefined behavior. Now that said, here is what is actually happening in your case:
In vararg functions such as printf(), integers smaller than int are promoted to int and floats smaller than double are promoted to double.
The result is that your 27.881 is being converted to an 8-byte double as it is passed into printf(). Therefore, the binary representation is no longer the same as a float.
Format string %d expects a 4-byte integer. So in effect, you will be printing the lower 4 bytes of the double-precision representation of 27.881 (assuming little-endian).
*Actually (assuming strict FP), you are seeing the bottom 4 bytes of 27.881 after it is cast to float and then promoted to double.
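As an aside, if you want to inspect a float's bit pattern without relying on undefined behaviour, std::memcpy is the well-defined route; a minimal sketch in C++:

#include <cstdint>
#include <cstring>
#include <iostream>

int main()
{
    float f = 27.881f;
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits); // copy the representation; no type punning
    std::cout << bits << std::endl;      // the number the C++ union code printed
}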
In both cases you are encountering undefined behaviour. Your implementation just happens to do something strange.