Hexadecimal addition end result is always wrong? (C++)

When I want to calculate the address of a function, I do the following:
HMODULE base = GetModuleHandle(L"program.exe"); // Module base addr
// Adding an offset to the base
std::cout << (base + 0x8F0A0) << std::endl; // -> Wrong!
I'm not sure why the result is wrong. I've checked it with online hex calculators and with a debugger to verify both values.
Could it be that base is treated as decimal while the other value is hex, producing a wrong result?
How can I get the result in hex?

As explained here, depending on whether STRICT is defined, HMODULE is essentially either a void* or a <unique type>*. The purpose of this is to make each handle type a distinct C++ type, so that mixing and matching them produces compiler errors. In the former case, pointer arithmetic won't compile. In the latter case, it will compile, but you can't rely on anything sensible happening, because pointer arithmetic takes the pointed-to type's size into account and because pointer arithmetic is undefined if you go outside the object/array being pointed to.
You should treat this pointer as pointing to nothing in particular, and therefore not do pointer arithmetic. You have to reinterpret_cast it to an integral type that you're sure is large enough (std::uintptr_t) and then do arithmetic on that integral value.
In my local header, this unique type contains an int member, so adding 1 will actually move the pointer ahead by 4 bytes (you know, except for the undefined behaviour and all). It just so happens that 0x00DE0000 + 4 * 0x8F0A0 is your 0x0101C280 value.
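A minimal sketch of that approach, assuming the same module name and offset as in the question, might look like this:
#include <cstdint>
#include <iostream>
#include <Windows.h>

int main()
{
    // Convert the handle to an integer type wide enough to hold a pointer,
    // then do plain integer arithmetic and print the sum in hex.
    std::uintptr_t base =
        reinterpret_cast<std::uintptr_t>(GetModuleHandleW(L"program.exe"));
    std::cout << std::hex << (base + 0x8F0A0) << std::endl;
}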

Your problem lies with the value that GetModuleHandle(L"program.exe") returns: 00DE0000. You need to use C hexadecimal syntax, i.e. prepend "0x" to your hex number 00DE0000.
Hence, your base number should be cast to a numeric value: 0x00DE0000
0x00DE0000 is equal to 00DE0000
Try using std::string to_string(int value); to convert it to a string, then convert your hex values (base) to C hexadecimal syntax (add "0x" at the beginning of your hex value). To finish off, convert your base value back to a numeric value (e.g. with std::stoi) and perform the addition, printing with std::hex.
Try this code here.
#include <iostream>
int main () {
    int hex1 = 0x8F0A0;
    int hex2 = 0x00DE0000; // Using int values
    std::cout << std::hex << hex1 + hex2 << std::endl;
}

As Chris said, I had the same case and solved it like this:
int offset = 0x8F0A0;
std::uintptr_t base = reinterpret_cast<uintptr_t>(GetModuleHandle(L"program.exe"));
// Here an extra 0x1000 (4096) bytes is added on top of the offset.
std::cout << std::hex << (base + (offset + 4096)) << std::endl;

Related

Why is this pointer null

In Visual Studio, it seems like pointers to member variables are 32-bit signed integers behind the scenes (even in 64-bit mode), and a null pointer is -1 in that context. So if I have a class like:
#include <iostream>
#include <cstdint>
#include <climits> // for INT_MAX

struct Foo
{
    char arr1[INT_MAX];
    char arr2[INT_MAX];
    char ch1;
    char ch2;
};

int main()
{
    auto p = &Foo::ch2;
    std::cout << (p ? "Not null" : "null") << '\n';
}
It compiles, and prints "null". So, am I causing some kind of undefined behavior, or was the compiler supposed to reject this code and this is a bug in the compiler?
Edit:
It appears that I can keep the "2 INT_MAX arrays plus 2 chars" pattern and only in that case the compiler allows me to add as many members as I wish and the second character is always considered to be null. See demo. If I changed the pattern slightly (like 1 or 3 chars instead of 2 at some point) it complains that the class is too large.
The size limit of an object is implementation defined, per Annex B of the standard [1]. Your struct is of an absurd size.
If the struct is:
struct Foo
{
    char arr1[INT_MAX];
    //char arr2[INT_MAX];
    char ch1;
    char ch2;
};
... the size of your struct in a relatively recent version of 64-bit MSVC appears to be around 2147483649 bytes. If you then add in arr2, suddenly sizeof will tell you that Foo is of size 1.
The C++ standard (Annex B) states that the compiler must document limitations, which MSVC does [2]. It states that it follows the recommended limit. Annex B, Section 2.17 provides a recommended limit of 262144(?) for the size of an object. While it's clear that MSVC can handle more than that, it documents that it follows that minimum recommendation so I'd assume you should take care when your object size is more than that.
[1] http://eel.is/c++draft/implimits
[2] https://learn.microsoft.com/en-us/cpp/cpp/compiler-limits?view=vs-2019
It's clearly a collision between an optimization on pointer-to-member representation (use only 4 bytes of storage when no virtual bases are present) and the pigeonhole principle.
For a type X containing N subobjects of type char, there are N+1 possible valid pointer-to-members of type char X::*... one for each subobject, and one for null-pointer-to-member.
This works when there are at least N+1 distinct values in the pointer-to-member representation, which for a 4-byte representation implies that N+1 <= 2^32 and therefore the maximum object size is 2^32 - 1.
Unfortunately the compiler in question made the maximum object-type size (before it rejects the program) equal to 2^32, which is one too large and creates a pigeonhole problem -- at least one pair of pointer-to-members must be indistinguishable. It's not necessary that the null pointer-to-member be one half of this pair, but as you've observed, in this implementation it is.
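As a quick way to see the 4-byte representation the answer refers to, a minimal check (the struct name here is my own) could be:
#include <iostream>

struct Dummy { char c; };

int main()
{
    // With MSVC's default representation (no virtual bases), a pointer-to-member
    // typically occupies 4 bytes, which is where the 2^32 pigeonhole limit comes from.
    std::cout << sizeof(char Dummy::*) << '\n';
}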
The expression &Foo::ch2 is of type char Foo::*, which is a pointer to member of class Foo. By the rules, a pointer to member converted to bool should evaluate to false ONLY if it is a null pointer, i.e. it had nullptr assigned to it.
The fault here appears to be an implementation flaw, i.e. on gcc compilers with -march=x86-64 any assigned pointer-to-member evaluates to non-null (1) unless it had nullptr assigned to it, as the following code shows:
#include <iostream>
#include <climits> // for LLONG_MAX, ULLONG_MAX
#include <cstddef> // for offsetof

struct foo
{
    char arr1[LLONG_MAX];
    char arr2[LLONG_MAX];
    char ch1;
    char ch2;
};

int main()
{
    char foo::* p1 = &foo::ch1;
    char foo::* p2 = &foo::ch2;
    std::cout << (p1 ? "Not null " : "null ") << '\n';
    std::cout << (p2 ? "Not null " : "null ") << '\n';
    std::cout << LLONG_MAX + LLONG_MAX << '\n';
    std::cout << ULLONG_MAX << '\n';
    std::cout << offsetof(foo, ch1) << '\n';
}
Output:
Not null
null
-2
18446744073709551615
18446744073709551614
Likely it's related to the fact that the class size exceeds platform limitations, so the offset of the member wraps around past 0 (the internal value of nullptr). The compiler doesn't detect it because it falls victim to... integer overflow with a signed value, and it's the programmer's fault for causing UB inside the compiler by using signed literals as array sizes: LLONG_MAX + LLONG_MAX = -2 would be the "size" of the two arrays combined.
Essentially, the size of the first two members is calculated as negative, and the offset of ch1 is -2, represented as the unsigned value 18446744073709551614.
And since -2 is not the null value, the pointer is not null. Another compiler may clamp the value to 0, producing a nullptr, or actually detect the problem, as clang does.
If the offset of ch1 is -2, then the offset of ch2 is -1? Let's add this:
std::cout << static_cast<long long>(offsetof(foo, ch1)) << '\n';
std::cout << static_cast<long long>(offsetof(foo, ch2)) << '\n';
Additional output:
-2
-1
And the offset of the first member is obviously 0, so if pointers to members represent offsets, another value is needed to represent nullptr. It's logical to assume that this particular compiler considers only -1 to be the null value, which may or may not be the case for other implementations.
When I test the code, VS reports that 'Foo': the class is too large.
When I add char arr3[INT_MAX], Visual Studio reports Error C2089: 'Foo': 'struct' too large. Microsoft Docs explains it as "The specified structure or union exceeds the 4GB limit."

What is the significance of x in the address of a variable?

I am trying to print the address of a data member of a class:
#include <iostream>
struct test { int x; };
int main() {
    test t;
    std::cout << &t.x << std::endl;
}
The output is:
0x23fe4c
I don't understand how this points to a memory address. I want to know the meaning of this way of representing addresses.
The 0x (or sometimes 0X) prefix indicates that the value following is presented as a hexadecimal value, i.e. is represented in base (or radix) 16 instead of base 10 as decimal values are. For example, 0x1234abcd means 1234abcd in base 16, which written as a decimal (base 10) value is 305441741. This is simply one common representation used for memory addresses and other computer- or programming-related contexts.
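As an illustration (example values of my own, not from the question), printing the same number in both bases shows the relationship:
#include <iostream>

int main()
{
    unsigned value = 0x1234abcd;            // hexadecimal literal
    std::cout << value << '\n';             // 305441741, decimal by default
    std::cout << std::hex << value << '\n'; // 1234abcd
}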

Error when compiling C++ program: "error: cast from ‘double*’ to ‘int’ loses precision"

I get 2 errors when trying to compile this code:
#include <iostream>
using namespace std;
int main() {
    int i;
    char myCharArray[51] = "This string right here contains exactly 50 chars.";
    double myDoubleArray[4] = {100, 101, 102, 103};
    char *cp, *cbp;
    double *dp, *dbp;

    dp = &myDoubleArray[0];
    dbp = myDoubleArray;
    cp = &myCharArray[0];
    cbp = myCharArray;

    while ((cp-cbp) < sizeof(myCharArray)) { cp++; dp++; }

    cout << "Without cast: " << (dp-dbp) << endl;
    cout << " Cast 1: " << ((int *) dp-(int *) dbp) << endl;
    cout << " Cast 2: " << ((int) dp-(int) dbp) << endl;
}
The errors I get are:
error: cast from ‘double*’ to ‘int’ loses precision [-fpermissive]
error: cast from ‘double*’ to ‘int’ loses precision [-fpermissive]
g++ won't let me compile the program. I'm asking what I could change to make it compile.
cast from ‘double*’ to ‘int’ loses precision
is as simple as it reads: the number of bits an int can store is less than the number of bits stored in a pointer on your platform. Normally it helps to change the int to an unsigned int, because on most platforms a pointer can be stored in an unsigned integer type. An unsigned int has one more bit for the value because there is no need to distinguish between positive and negative, and pointers are always positive.
Even better is to use the types intended for such things, to make your code more portable. Have a look at std::uintptr_t.
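If the intent of "Cast 2" was a byte distance, a minimal sketch using std::uintptr_t (my own variable names, not the questioner's code) could look like this:
#include <cstdint>
#include <iostream>

int main()
{
    double arr[4] = {100, 101, 102, 103};
    // Convert the pointers to an unsigned integer type wide enough to hold them,
    // then subtract the integer values; the result is a distance in bytes.
    std::uintptr_t a = reinterpret_cast<std::uintptr_t>(&arr[0]);
    std::uintptr_t b = reinterpret_cast<std::uintptr_t>(&arr[3]);
    std::cout << (b - a) << '\n'; // 24 where sizeof(double) == 8
}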
Your "Without cast" line performs pointer subtraction, which yields the difference (in units of the size of the pointed-to type) between two pointers. If the two pointers point to elements of the same array, or just past the end of it, then the difference is the number of array elements between them. The result is of the signed integer type ptrdiff_t.
That's a perfectly sensible thing to do.
Your second line ("Cast 1:") converts the pointers (which are of type double*) to int* before the subtraction. That in effect pretends that the pointers are pointing to elements of an array of int, and determines the number of elements between the int objects to which they point. It's not at all clear why you'd want to do that.
Your third line ("Cast 2:") converts both pointer values to int before subtracting them. If int is not big enough to hold the converted pointer value, then the result may be nonsense. If it is, then on most systems it will probably yield the distinct between the two pointed-to objects in bytes. But I've worked on systems (Cray T90) where the byte offset of a pointer is stored in the high-order 3 bits of the pointer value. On such a system your code would probably yield the distance between the pointed-to objects in words. Or it might yield complete nonsense. In any case, the behavior is undefined.
The problem with the conversion from double* to int isn't just that it loses precision (which is what your compiler happened to complain about). It's that the result of the conversion doesn't necessarily mean anything.
The easiest, and probably the best, way to get your code to compile is to delete the second and third lines.
If you want a solution other than that, you'll have to explain what you're trying to do. Converting the pointer values to uintptr_t will probably avoid the error message, but it won't cause what you're doing to make sense.

What does this mean? (int &)a

Define a float variable a, then convert a to float & and int &. What does this mean? After the conversion, is a a reference to itself? And why are the two results different?
#include <iostream>
using namespace std;
int main(void)
{
    float a = 1.0;
    cout << (float &)a << endl;
    cout << (int &)a << endl;
    return 0;
}
thinkpad ~ # ./a.out
1
1065353216
cout << (float &)a <<endl;
cout << (int &)a << endl;
The first one treats the bits in a like it's a float. The second one treats the bits in a like it's an int. The bits for float 1.0 just happen to be the bits for integer 1065353216.
It's basically the equivalent of:
float a = 1.0;
int* b = (int*) &a;
cout << a << endl;
cout << *b << endl;
(int &) a casts a to a reference to an integer. In other words, an integer reference to a. (Which, as I said, treats the contents of a as an integer.)
Edit: I'm looking around now to see if this is valid. I suspect that it's not. It depends on the type being less than or equal to the actual size.
It means undefined behavior:-).
Seriously, it is a form of type punning. a is a float, but a is also a block of memory (typically four bytes) with bits in it. (float&)a means to treat that block of memory as if it were a float (in other words, what it actually is); (int&)a means to treat it as an int. Formally, accessing an object (such as a) through an lvalue expression with a type other than the actual type of the object is undefined behavior, unless the type is a character type. Practically, if the two types have the same size, I would expect the results to be a reinterpretation of the bit pattern.
In the case of a float, the bit pattern contains bits for the sign, an exponent and a mantissa. Typically, the exponent will use some excess-n notation, and only 0.0 will have 0 as an exponent. (Some representations, including the one used on PCs, will not store the high order bit of the mantissa, since in a normalized form in base 2, it must always be 1. In such cases, the stored mantissa for 1.0 will have all bits 0.) Also typically (and I don't know of any exceptions here), the exponent will be stored in the high order bits. The result is that when you "type pun" a floating point value to an integer of the same size, the value will be fairly large, regardless of the floating point value.
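For comparison, a well-defined way to inspect the same bit pattern (a sketch using std::memcpy rather than the reference cast) would be:
#include <cstring>
#include <cstdint>
#include <iostream>

int main()
{
    float a = 1.0f;
    std::uint32_t bits;
    static_assert(sizeof a == sizeof bits, "assumes a 32-bit float");
    std::memcpy(&bits, &a, sizeof bits); // copy the object representation of the float
    std::cout << bits << '\n';           // 1065353216, i.e. 0x3F800000
}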
The values are different because interpreting a float as an int & (reference to int) throws the doors wide open. a is not an int, so pretty much anything could actually happen when you do that. As it happens, looking at that float like it's an int gives you 1065353216, but depending on the underlying machine architecture it could be 42 or an elephant in a pink tutu or even crash.
Note that this is not the same as casting to an int, which understands how to convert from float to int. Casting to int & just looks at bits in memory without understanding what the original meaning is.

what's the size of hex value of some memory address converted to int or other type?

For example:
int* x = new int;
int y = reinterpret_cast<int>(x);
y now holds the integer value of the memory address of variable x.
Variable y is of size int. Will that int size always be large enough to store the converted memory address of ANY TYPE being converted to int?
EDIT:
Or is it safer to use long int to avoid a possible loss of data?
EDIT 2: Sorry people, to make this question more understandable: the thing I want to find out here is the size of the returned HEX value as a number, not the size of int nor the size of a pointer to int, but the plain hex value. I need to get that value in human-readable notation. That's why I'm using reinterpret_cast to convert that memory HEX value to a DEC value. But to store the value safely I also need to find out what kind of variable to put it into: int, long - what type is big enough?
No, that's not safe. There's no guarantee sizeof(int) == sizeof(int*)
On a 64 bit platform you're almost guaranteed that it's not.
As for the "hexadecimal value" ... I'm not sure what you're talking about. If you're talking about the textual representation of the pointer in hexadecimal ... you'd need a string.
Edit to try and help the OP based on comments:
Because computers don't work in hex. I don't know how else to explain it. An int stores some number of bits (binary), as does a long. Hexadecimal is a textual representation of those bits (specifically, the base-16 representation). Strings are used for textual representations of values. If you need a hexadecimal representation of a pointer, you would need to convert that pointer to text (hex).
Here's a c++ example of how you would do that:
test.cpp
#include <string>
#include <iostream>
#include <sstream>
int main()
{
    int x = 0;              // a local int to take the address of
    int *p = &x;            // declare a pointer to an int.
    std::ostringstream oss; // create a stringstream
    std::string s;          // create a string

    // this takes the value of p (the memory address), converts it to
    // the hexadecimal textual representation, and puts it in the stream
    oss << std::hex << p;

    // Get a std::string from the stream
    s = oss.str();

    // Display the string
    std::cout << s << std::endl;
}
Sample output:
roach$ g++ -o test test.cpp
roach$ ./test
0x7fff68e07730
It's worth noting that the same thing is needed when you want to see the base10 (decimal) representation of a number - you have to convert it to a string. Everything in memory is stored in binary (base2)
On most 64-bit targets, int is still 32-bit, while a pointer is 64-bit, so it won't work.
http://en.wikipedia.org/wiki/64-bit#64-bit_data_models
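A trivial check of this on your own platform (not part of the original answer):
#include <iostream>

int main()
{
    // On a typical 64-bit target this prints "4 8": an int cannot hold a pointer.
    std::cout << sizeof(int) << ' ' << sizeof(int*) << '\n';
}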
What you probably want is to use std::ostream's formatting of addresses:
int x(0);
std::cout << &x << '\n';
As to the length of the produced string, you need to determine the size of the respective pointer: for each byte used, the output will use two hex digits, because each hex digit can represent 16 values (half a byte). All bytes are typically printed, even though it is unlikely that you have memory at all representable addresses, e.g. when the size of pointers is 8 bytes, as happens on 64-bit systems. This is because the stack often grows from the biggest address downwards, while the executable code starts at the beginning of the address range (well, the very first page may be left unused so that touching it in any way causes a segmentation violation). Above the executable code live some data segments, followed by the heap, and lots of unused pages.
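A small sketch of the maximum length this implies (my own example, assuming two hex digits per pointer byte):
#include <iostream>

int main()
{
    // Each byte of a pointer needs at most two hex digits, so this is the
    // longest textual representation an address can have on this platform.
    std::cout << sizeof(void*) * 2 << " hex digits at most\n"; // typically 16 on 64-bit
}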
There is a question addressing a similar topic:
https://stackoverflow.com/a/2369593/1010666
Summary: do not try to write pointers into non-pointer variables.
If you need to print out the pointer value, there are other solutions.