Can we scramble the declaration order in C or C++? - c++

Is there a method/plugin/addon in place to ignore the following clause (for some c/c++ compiler)? To reorder the declaration of members in a struct during the same stage as the preprocessor or similar? Perhaps by adding a keyword like volatile or something similar to the front of the struct declaration.
I was thinking: a compiler option, a built-in keyword, or a programming method.
C99 §6.7.2.1 clause 13 states:
Within a structure object, the
non-bit-field members and the units in
which bit-fields reside have addresses
that increase in the order in which
they are declared.
C++ seems to have a similar clause, and I am interested in that as well. Both clauses specify a reasonable guarantee: members declared later have greater memory offsets. But I often do not need to know the declaration order of my struct, for interface purposes or otherwise. It would be nice to write some code like:
scrambled struct foo {
int a;
int bar;
};
or, suppose order doesn't really matter with this struct.
scrambled struct foo {
int bar;
int a;
};
And so, have the declarations of a and bar swapped randomly each time I compile. I believe the same idea also applies to variables that set aside stack memory.
main() {
scrambled int a;
scrambled int foo;
scrambled int bar;
..
Why do I ask?
I was curious to see how program bots are created. I watched some people analyze memory offsets for changes while running the program they were building a hack for.
It seems the process is: watch the memory offsets and take note of the purpose for the given offsets. Later, hack programs will inject desired values into memory at those offsets.
Now suppose those memory offsets changed every single time the program is compiled. Maybe it would hinder or dissuade individuals from taking the time to understand something you would rather they not know.

Run-time foxing is the best way, because then you only have to release a single version. Where a struct has several fields of the same type, you can use an array instead.
Step 1, instead of a structure with three int fields, use an array.
#define foo 0
#define bar 1
#define zee 2
struct {
int scramble [3];
} abc; // abc is a variable, so abc.scramble[...] below is valid
...
value = abc.scramble[bar];
Step 2, now use an indexing array which is randomised every time the program is run.
int abcindex [3]; // index lookup
int abcpool [3]; // index pool for randomise
int i, j; // loop counter and random pick
for (i=0; i<3; i++) // initialise index pool
abcpool[i] = i;
srand (time(NULL));
for (i=0; i<3; i++) { // initialise lookup array
j = rand()%(3-i);
abcindex[i] = abcpool[j]; // allocate random index from pool
abcpool[j] = abcpool[2-i]; // remove index from pool
}
value = abc.scramble[abcindex[bar]];
Another way to try to fox a hacker is to include decoy variables that look as if they matter but make the program exit if tampered with. Lastly, you can keep a checksum or an encrypted copy of key variables to check whether they have been tampered with.
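The last idea, keeping a disguised copy of a key variable and comparing the two, could be sketched roughly like this. It is only an illustration, not a hardened anti-tamper scheme: the class name GuardedInt, the raw() accessor and the mask constant are all made up for the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: stores a value together with an XOR-masked shadow copy.
// A memory editor that overwrites only value_ leaves the shadow stale,
// which intact() detects.
class GuardedInt {
public:
    explicit GuardedInt(std::int32_t v) { set(v); }

    void set(std::int32_t v) {
        value_ = v;
        shadow_ = static_cast<std::uint32_t>(v) ^ kMask;
    }

    // True if the value still matches its masked shadow copy.
    bool intact() const {
        return (static_cast<std::uint32_t>(value_) ^ kMask) == shadow_;
    }

    // Aborts (in builds with assertions enabled) if the value was tampered with.
    std::int32_t get() const {
        assert(intact() && "value was modified behind our back");
        return value_;
    }

    // Exposed only so the example can simulate an external memory write.
    std::int32_t* raw() { return &value_; }

private:
    static constexpr std::uint32_t kMask = 0xA5A5A5A5u;  // arbitrary constant
    std::int32_t value_;
    std::uint32_t shadow_;
};
```

Writing through raw() without calling set() makes intact() return false, which is the point where a real program would decide to exit. Someone who finds both copies can of course patch both, so this only raises the effort required.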

Your intention is good, but the solution isn't (sorry). Usually you can't recompile your program before each run, so the attacker only has to analyse the one binary you shipped. However, there is a related technique called ASLR (address space layout randomization): the operating system changes the load address for you, making "return-oriented programming" and "return-to-libc" style hacks harder.

Related

How do you determine the size of a class when reverse engineering?

I've been trying to learn a bit about reverse engineering and how to essentially wrap an existing class (that we do not have the source for, we'll call it PrivateClass) with our own class (we'll call it WrapperClass).
Right now I'm basically calling the constructor of PrivateClass while feeding a pointer to WrapperClass as the this argument...
Doing this populates m_string_address, m_somevalue1, m_somevalue2, and missingBytes with the PrivateClass object data. The dilemma now is that I am noticing issues with the original program (first a crash that was resolved by adding m_u1 and m_u2) and then text not rendering that was fixed by adding mData[2900].
I'm able to deduce that m_u1 and m_u2 hold the size of the string in m_string_address, but I wasn't expecting there to be any other member variables after them (which is why I was surprised with mData[2900] resolving the text rendering problem). 2900 is also just a random large value I threw in.
So my question is: how can we determine the real size of a class that we do not have the source for? Is there a tool that will tell you what variables exist in a class and their order (or at least the correct datatypes or datatype sizes of each variable)? I'm assuming this might be possible by processing assembly in an address range into a semi-decompiled state.
class WrapperClass
{
public:
WrapperClass(const wchar_t* original);
private:
uintptr_t m_string_address;
int m_somevalue1;
int m_somevalue2;
char missingBytes[2900];
};
WrapperClass::WrapperClass(const wchar_t* original)
{
typedef void(__thiscall* PrivateClassCtor)(void* pThis, const wchar_t* original);
PrivateClassCtor PrivateClassCtorFunc = PrivateClassCtor(DLLBase + 0x1c00);
PrivateClassCtorFunc(this, original);
}
So my question is how can we determine the real size of a class that
we do not have the source for?
You have to guess or logically deduce it for yourself. Or just guess. If guessing doesn't work out for you, you'll have to guess again.
Is there a tool that will tell you what variables exist in a class and
their order (or at least the correct datatypes or datatype sizes of
each variable) I'm assuming by decompiling and processing assembly in
an address range.
No, there is not. The kind of meta information that describes a class, its members, etc. simply isn't written out, because the program does not need it, and the C++ Standard currently defines no facility that would require a compiler to generate it.
There are exactly zero guarantees that you can reliably 'guess' the size of a class. You can however probably make a reasonable estimate in most cases.
The one thing you can be sure of, though: the only problem is when you have too little memory for a class instance. Having too much memory isn't really a problem at all (which is why adding 2900 extra bytes works).
On the assumption that the code was originally well written (e.g. the developer decided to initialise all the variables nicely), then you may be able to guess the size using something like this:
#define MAGIC 0xCD
// allocate a big buffer; unsigned char so that the comparison with
// 0xCD below also works where plain char is signed
unsigned char temp_buffer[8092];
memset(temp_buffer, MAGIC, sizeof(temp_buffer));
// call ctor on the buffer (not on 'this') so it writes into temp_buffer
PrivateClassCtor PrivateClassCtorFunc = PrivateClassCtor(DLLBase + 0x1c00);
PrivateClassCtorFunc(temp_buffer, original);
// step backwards until we find a byte that isn't 0xCD.
// Might want to change the magic value and run again
// just to be sure (e.g. the original ctor sets the last
// few bytes of the class to 0xCD by coincidence).
//
// Obviously fails if the developer never initialises member vars though!
for(int i = (int)sizeof(temp_buffer) - 1; i >= 0; --i) {
if(temp_buffer[i] != MAGIC) {
printf("class size might be: %d\n", i + 1);
break;
}
}
That's probably a decent guess, however the only way to be 100% sure would be to stick a breakpoint where you call the ctor, switch to assembly view in your debugger of choice, and then step through the assembly line by line to see what the max address being written to is.
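To try the fill-and-scan idea without the DLL, here is a small self-contained sketch that uses a stand-in class and placement new in place of the real constructor call. Mystery and estimate_size are hypothetical names for this example only. The static_assert uses sizeof, which is only possible because the stand-in is known; with the real class you would simply pick a generously large buffer.

```cpp
#include <cstddef>
#include <cstring>
#include <new>

// Stand-in for the class whose size we pretend not to know.
struct Mystery {
    int a, b, c;
    Mystery() : a(1), b(2), c(3) {}
};

// Fills a scratch buffer with `magic`, constructs a T inside it, and returns
// the index just past the last byte that no longer holds `magic`.
template <typename T>
std::size_t estimate_size(unsigned char magic) {
    alignas(std::max_align_t) unsigned char buffer[1024];
    static_assert(sizeof(T) <= sizeof(buffer), "scratch buffer too small");
    std::memset(buffer, magic, sizeof(buffer));

    T* obj = new (buffer) T();  // the constructor writes into the buffer

    std::size_t estimate = 0;
    for (std::size_t i = sizeof(buffer); i > 0; --i) {
        if (buffer[i - 1] != magic) {
            estimate = i;
            break;
        }
    }
    obj->~T();
    return estimate;
}
```

For this stand-in the estimate matches sizeof(Mystery), because every byte is initialised to something other than the magic value. In general the result is only a lower bound: trailing padding and members the constructor leaves untouched are invisible to the scan, which is why running it with a second magic value is a sensible cross-check.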

How to let the compiler do the offset computations for an odd polymorphism structure, with as little code as possible?

I am not sure if this is possible at all in standard C++, so whether it can be done at all is a secondary way to put my question.
I have some binary data which I want to read and re-create using structs. The data is originally created as a stream, with the content appended to a buffer one field at a time; nothing special about that. I could simply read it as a stream, the same way it was written. Instead, I wanted to see whether I could let the compiler do the math for me by representing the binary data as a data structure.
The fields of the binary data have a predictable order which allows it to be represented as a data type, the issue I am having is with the depth and variable length of repeating fields. I am hoping the example code below makes it clearer.
Simple Example
struct Common {
int length;
};
struct Boo {
long member0;
char member1;
};
struct FooSimple : Common {
int count;
Boo boo_list[];
};
char buffer[1024];
int index = 15;
((FooSimple *)buffer)->boo_list[index].member0;
Advanced Example
struct Common {
int length;
};
struct Boo {
long member0;
char member1;
};
struct Goo {
int count;
Boo boo_list[];
};
struct FooAdvanced : Common {
int count;
Goo goo_list[];
};
char buffer[1024];
int index0 = 5, index1 = 15;
((FooAdvanced *)buffer)->goo_list[index0].boo_list[index1].member0;
The examples are not supposed to relate. I re-used some code due to lack of creativity for unique names.
For the simple example, there is nothing unusual about it. The Boo struct is of fixed size, therefore the compiler can do the calculations just fine, to reach the member0 field.
For the advanced example, as far as I can tell at least, it isn't as trivial of a case. The problem that I see, is that if I use the array selector operator to select a Goo object from the inline array of Goo-elements (goo_list), the compiler will not be able to do the offset calculations properly unless it makes some assumptions; possibly assuming that all preceding Goo-elements in the array have zero Boo-elements in the inline array (boo_list), or some other constant value. Naturally, that won't be the case.
Question(s):
What ways are there to achieve the offset computations to be done by the compiler, despite the inline arrays having variable lengths? Unless I am missing something, I believe templates can't help at all, due to their compile-time nature.
Is this even possible to achieve in C++?
How do you handle instantiating a FooAdvanced object when a variable number of Goo and Boo elements has to go into the goo_list and boo_list members, respectively?
If it is impossible, would I have to write some sort of wrapper code to handle the calculations instead?
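For illustration, the kind of wrapper code the last question alludes to might look like the sketch below. It assumes a hypothetical, tightly packed layout in host byte order (the real data may well differ), walks the Goo records one by one because each can hold a different number of Boo elements, and does no bounds checking. The names read_at, find_goo, boo_member0 and kBooSize are invented for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Assumed wire layout (hypothetical, tightly packed, host byte order):
//   int32 length | int32 goo_count | goo_count x Goo
//   Goo: int32 boo_count | boo_count x Boo
//   Boo: int64 member0 | char member1
constexpr std::size_t kBooSize = sizeof(std::int64_t) + 1;

// Reads a T from a possibly unaligned position.
template <typename T>
T read_at(const unsigned char* p) {
    T v;
    std::memcpy(&v, p, sizeof v);
    return v;
}

// Returns a pointer to Goo number `index`. Preceding Goos are skipped one at
// a time, since their sizes depend on their own boo_count fields.
inline const unsigned char* find_goo(const unsigned char* buffer, std::size_t index) {
    const unsigned char* p = buffer + 2 * sizeof(std::int32_t);  // skip length, goo_count
    for (std::size_t i = 0; i < index; ++i) {
        const std::int32_t boo_count = read_at<std::int32_t>(p);
        p += sizeof(std::int32_t) + static_cast<std::size_t>(boo_count) * kBooSize;
    }
    return p;
}

// Equivalent of goo_list[goo_index].boo_list[boo_index].member0.
inline std::int64_t boo_member0(const unsigned char* buffer,
                                std::size_t goo_index, std::size_t boo_index) {
    const unsigned char* goo = find_goo(buffer, goo_index);
    return read_at<std::int64_t>(goo + sizeof(std::int32_t) + boo_index * kBooSize);
}
```

The linear walk in find_goo is the price of the variable-length elements: the offset of a given Goo depends on data that is only available at run time, so a plain array subscript cannot compute it.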

precision about structure loading in memory

typedef struct sample2_s
{
int a;
int b;
int c;
int d;
} sample2;
typedef struct sample_s
{
int sampleint;
sample2 b;
} sample;
int main()
{
sample t;
}
In this example, when I create the instance t of the sample structure, sample2 is also loaded into memory.
The question is: how is it possible to load only sampleint into memory?
Is there a way to load only part of a structure into memory?
If the answer is inheritance, as I think it is, how does it work exactly? Will time be wasted during execution because of a hash table?
I am asking these questions because I want to develop a DOD (data-oriented design) program and want to better understand how structures are managed in memory.
Thank you
If you just want to copy sampleint, you can declare int s = t.sampleint; You can also memcpy() a range of memory defined by the offsetof macro in <stddef.h> to get a range of consecutive member variables.
It seems as if what you want is one of the following:
Declare a samplebase type that, in C++, sample can inherit from.
Declare storage for only the individual members you want to copy.
Have sample hold a pointer to a sample2, and set that to NULL if you aren’t allocating one.
Declare the sample as a temporary in a block of code, copy the parts you want, let the memory be reclaimed when it goes out of scope.
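As a rough illustration of the pointer option from the list, here is a C++ sketch. The type names are capitalised to keep them apart from the question's typedefs, load_details is a made-up helper, and std::unique_ptr stands in for a raw pointer that would otherwise be set to NULL.

```cpp
#include <memory>

struct Sample2 {
    int a = 0, b = 0, c = 0, d = 0;
};

// Sample stores only an int and a pointer; the Sample2 block is allocated
// on demand instead of being embedded in every Sample.
struct Sample {
    int sampleint = 0;
    std::unique_ptr<Sample2> details;  // null until load_details() is called

    void load_details() {
        if (!details) details.reset(new Sample2());
    }
};
```

A default-constructed Sample pays only for the pointer until load_details() runs. The trade-off is an extra indirection and a separate allocation when the second part is needed.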

Using private vars to initialize array

I'm trying to do a little application that would calculate some paths for a given graph.
I've created a class to handle simple graphs, as follows:
class SimpleGraph {
int _nbNodes;
int _nbLines;
protected:
int AdjMatrix[_nbNodes, _nbNodes]; //Error happens here...
int IncMatrix[_nbNodes, _nbLines]; //...and here!
public:
SimpleGraph(int nbNodes, int nbLines) { this->_nbNodes = nbNodes - 1; this->_nbLines = nbLines - 1; };
virtual bool isSimple();
};
At compilation time, I get an error on the two protected members declaration.
I don't understand what is wrong, as there is only one constructor that takes these values as parameters. As such, they cannot be uninitialized.
What am I missing here?
The compiler needs to know how much space to allocate for an object of class SimpleGraph. However, AdjMatrix and IncMatrix are arrays stored directly inside the object, and their sizes would only be determined at run-time (i.e., after compilation), so it cannot do that. Specifically, the standard requires the size of an array to be a constant expression.
To fix this, you can:
Allocate AdjMatrix and IncMatrix on the heap instead, so that the memory can be sized at run-time.
Use a fixed size for the two arrays and keep them inside the object.
--
Another major issue with your code is that you cannot create multi-dimensional arrays using a comma (AdjMatrix[int, int]). You must instead either use:
AdjMatrix[int][int]
AdjMatrix[int * int]
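A minimal sketch of the second form combined with heap allocation: one contiguous block sized at run-time and indexed as row * columns + column. The Matrix class is a hypothetical helper, and std::vector is used here in place of a raw new[].

```cpp
#include <cstddef>
#include <vector>

// One contiguous block of rows * cols ints, sized at run-time.
class Matrix {
public:
    Matrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), cells_(rows * cols, 0) {}

    // vector::at throws if the computed index is past the end of the block.
    int& at(std::size_t r, std::size_t c) { return cells_.at(r * cols_ + c); }
    int at(std::size_t r, std::size_t c) const { return cells_.at(r * cols_ + c); }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_, cols_;
    std::vector<int> cells_;
};
```

With something like this, SimpleGraph could hold a Matrix for AdjMatrix and another for IncMatrix and construct both from nbNodes and nbLines in its constructor.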
Objects in C++ have a fixed size that needs to be known at compilation time. The sizes of AdjMatrix and IncMatrix are not known at compilation time, only at run time.
In the lines
int AdjMatrix[_nbNodes, _nbNodes]; //Error happens here...
int IncMatrix[_nbNodes, _nbLines]; //...and here!
The array notation is wrong. You cannot specify a 2 dimensional array that way in C++. The correct notation uses brackets on each dimension, as for instance:
int data[5][2];
Regarding the problem you are facing, the dimensions of an array in C++ must be specified at compile time, i.e. the compiler must know the values of the array dimensions when compiling the program. This is clearly not the case here. You must either fall back to integer literals, as in my example, or change the code to use vectors:
std::vector<std::vector<int> > AdjMatrix;
and in the constructor:
SimpleGraph(int nbNodes, int nbLines) : AdjMatrix(nbNodes) {
for (int i = 0; i < nbNodes; i++)
AdjMatrix[i].resize(nbNodes); // square matrix: nbNodes x nbNodes
}
Note that you won't need _nbNodes anymore, and use instead the size() method on AdjMatrix. You will have to do the same for IncMatrix.
Another option, if you know the values at compile time, is to use macros to define them symbolically.
#define NBNODES 20
int AdjMatrix[NBNODES][NBNODES];
but since you wish to pass them as constructor parameters, this may not fit your need. Still, if you know that the sizes are constants at compile time, you could make them template parameters of the class; note that constexpr cannot be applied to function parameters.
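If the sizes really are compile-time constants, one way to express that without macros is to pass them as template arguments. This is just a sketch under that assumption; StaticGraph and its member functions are hypothetical names.

```cpp
#include <array>
#include <cstddef>

// Sizes are template parameters, so they are constant expressions and can be
// used as array bounds. The {} initialisers zero-fill both matrices.
template <std::size_t NbNodes, std::size_t NbLines>
class StaticGraph {
public:
    static constexpr std::size_t node_count() { return NbNodes; }
    static constexpr std::size_t line_count() { return NbLines; }

    std::array<std::array<int, NbNodes>, NbNodes> AdjMatrix{};
    std::array<std::array<int, NbLines>, NbNodes> IncMatrix{};
};
```

A graph with 4 nodes and 6 lines would then be declared as StaticGraph<4, 6> g;. Each size combination is a distinct type, so this does not help when the sizes come from user input.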

Large Dynamic MultiDimensional Array Not Working

My code is quite large and complicated, so I won't waste your time making you read it, but as a result you will have to make certain assumptions about the variables in it. I will tell you the values of the variables which I have confirmed in the debugger, so you know them with certainty. I have omitted a lot of unrelated code here, so what you see isn't everything, but I have included everything that is relevant.
// This is defined in a class:
char**** m_DataKeys;
// This is in a member function of the same class:
m_DataKeys = new char*** [m_iNumOfHeroes]; // m_iNumOfHeroes = 2
while ( pkvHero )
{
// iHeroNum = 0 and then 1
// NUM_OF_ABILITIES is #defined as 4
m_DataKeys[iHeroNum] = new char** [NUM_OF_ABILITIES];
for (int ability = 0; ability < NUM_OF_ABILITIES; ability++)
{
if (pkvExtraData) // only is true when iHeroNum == 1 and ability == 0
{
// iNumOfExtraData == 2
m_DataKeys[iHeroNum][ability] = new char* [iNumOfExtraData];
while ( pkvSubKey )
{
// iCurExtraDataNum increments from 0 to 2
m_DataKeys[iHeroNum][ability][iCurExtraDataNum] = new char [50];
I put a break point on the line
m_DataKeys[iHeroNum] = new char** [NUM_OF_ABILITIES];
Before the line is called and when iHeroNum == 0 the m_DataKeys array looks like:
m_DataKeys | 0x02072a60
pointer | 0xffeeffee
Error : expression cannot be evaluated
Which is expected. After the line gets called it looks like:
m_DataKeys | 0x02072a60
pointer | 0x02496b00
pointer | 0xffeeffee
Error : expression cannot be evaluated
Which seems to look correct. However, since I set a breakpoint there, I hit play and had it hit it on the next loop around, where iHeroNum == 1 now and ran the line and m_DataKeys then looked like this:
m_DataKeys | 0x02072a60
pointer | 0x02496b00
pointer | 0xffeeffee
Error : expression cannot be evaluated
Which is the exact same as before! The line didn't change the array.... At all!
For clarification, m_DataKeys is a 3 dimensional array of character pointers to character arrays of size 50.
I can't figure out why this is happening, it looks like my code is correct. Is it possible that the garbage collector is screwing me over here? Or maybe the new allocator?
Edit: A Symptom of a Larger Problem
Let me elaborate a little more on the structure of my code, because really, this is just a cheap solution to a bigger problem.
I already have structs as one of you wisely suggested:
struct AbilityData; // forward declaration, defined below
struct HeroData
{
// Lots o data here
// ...
// .
//
AbilityData* Abilities[NUM_OF_ABILITIES];
};
struct AbilityData
{
// More data here
// ...
// .
CUtlMap<char*,int> ExtraData [MAX_ABILITY_LEVELS];
};
Things only got complicated, and I ended up with these DataKeys arrays of pointers to arrays of pointers, when the need arose to load some data into a dynamic structure where the keys, the values, and the number of entries are all completely dynamic. So I thought to use a map of char arrays to ints, but the problem is that I can't store the actual char array in my map; I have to use a char *. I tried defining the map as:
CUtlMap<char[50],int> ExtraData [MAX_ABILITY_LEVELS];
But that really didn't work and it seems sort of strange to me anyway. So, I had to find some place to stick all these ExtraDataKeys and for some reason I thought it cool to do it like this. How can I store char arrays in objects like arrays or maps?
Since you are using pointers as class members, my best guess is that you are violating The Rule Of Three. That is, you did not provide a copy constructor and a copy assignment operator for your class. That usually leads to strange data loss when passing objects of your class around.
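For reference, this is roughly what following the Rule of Three looks like for a class that owns a raw char buffer. KeyBuffer is a hypothetical class made up for the illustration; it is not taken from the question's code.

```cpp
#include <cstddef>
#include <cstring>

// Owns a heap-allocated copy of a C string. All three special members are
// written out so that copies never share (and double-delete) the buffer.
class KeyBuffer {
public:
    explicit KeyBuffer(const char* text)
        : size_(std::strlen(text) + 1), data_(new char[size_]) {
        std::memcpy(data_, text, size_);
    }

    // Copy constructor: allocate a separate buffer for the new object.
    KeyBuffer(const KeyBuffer& other)
        : size_(other.size_), data_(new char[size_]) {
        std::memcpy(data_, other.data_, size_);
    }

    // Copy assignment: build the new buffer first, then release the old one.
    KeyBuffer& operator=(const KeyBuffer& other) {
        if (this != &other) {
            char* fresh = new char[other.size_];
            std::memcpy(fresh, other.data_, other.size_);
            delete[] data_;
            data_ = fresh;
            size_ = other.size_;
        }
        return *this;
    }

    ~KeyBuffer() { delete[] data_; }

    const char* c_str() const { return data_; }

private:
    std::size_t size_;  // declared before data_, so it is initialised first
    char* data_;
};
```

Without the copy constructor and assignment operator, the compiler-generated versions would copy only the pointer, and two objects would end up deleting the same buffer.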
Note that no sane C++ programmer would use char****. Here is my best attempt to fix your problem using vectors and strings, but there is probably a much better design for your specific problem:
#include <string>
#include <vector>
class Foo
{
int m_iNumOfHeroes;
std::vector<std::vector<std::vector<std::string> > > m_DataKeys;
enum { NUM_OF_ABILITIES = 4, iNumOfExtraData = 2 };
public:
explicit Foo(int iNumOfHeroes)
: m_iNumOfHeroes(iNumOfHeroes)
, m_DataKeys(m_iNumOfHeroes, std::vector<std::vector<std::string> >
(NUM_OF_ABILITIES, std::vector<std::string>(iNumOfExtraData)))
{
}
};
int main()
{
Foo x(2);
}
In case you have never seen that colon syntax in the constructor before, that is a member initializer list.
I really wish C++ had array bounds checking
std::vector and std::string do have bounds checking if you use the foo.at(i) syntax instead of foo[i]. In Debug mode, even foo[i] has bounds checking enabled in Visual C++, IIRC.
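A small sketch of what the at() check does in practice; index_is_rejected is a made-up helper name for this example.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true if reading index i through at() throws std::out_of_range.
inline bool index_is_rejected(const std::vector<int>& v, std::size_t i) {
    try {
        (void)v.at(i);  // bounds-checked access
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

For a vector of three elements, index 2 is accepted and index 3 is rejected, whereas v[3] would be undefined behaviour with no guaranteed diagnostic.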
Though the code might be correct, I personally find that working with something like a char **** can get pretty confusing pretty fast.
This is just my personal preference, but I always try to organize things in the most clear and unambiguous way I can, so what I would do in your situation would be something like
struct Ability
{
char extraData[NUM_OF_EXTRA_DATA][50];
};
struct HeroData
{
Ability abilities[NUM_OF_ABILITIES];
};
class Foo
{
// here you can choose a
HeroData *heroArray;
// and then you would alloc it with "heroArray = new HeroData[m_iNumOfHeroes];"
// or you can more simply go with a
std::vector<HeroData> heroVector;
};
I think this makes things more clear, making it easier for you and other programmers working on that code to keep track of what is what inside your arrays.
I think you expect the wrong thing to happen (that the visual display in the debugger would change), even though your code seems correct.
Your debugger displays m_DataKeys, *m_DataKeys and **m_DataKeys, which is the same as m_DataKeys, m_DataKeys[0] and m_DataKeys[0][0]. When you change m_DataKeys[1], you are not going to notice it in your debugger output.
The following might help you: in my debugger (MS Visual Studio 2005), if you enter e.g. m_DataKeys,5 as your watch expression, you will see the first 5 elements of the array, that is, m_DataKeys[0], m_DataKeys[1], ..., m_DataKeys[4] - arranged in a neat table. If this syntax (with the ,5) doesn't work for you, just add m_DataKeys[1] into the debugger's watch window.
Not sure why this didn't occur to me last night, but I was pretty tired. Here's what I decided to do:
struct AbilityData
{
// Stuff
CUtlMap<char*,int> ExtraData [MAX_ABILITY_LEVELS];
char **DataKeys;
};
That's what my AbilityData struct now looks like, and it now works, but now I want to reorganize it to be like:
struct AbilityData
{
// Stuff
CUtlMap<char*,int[MAX_ABILITY_LEVELS]> ExtraData;
char **DataKeys;
};
Because it makes more sense that way, but then I run into the same problem that I had before with the char array. It almost seems to me that it might be best to ditch the whole map idea and make it like:
struct AbilityData
{
// Stuff
int *ExtraData;
char **DataKeys;
};
Where ExtraData is now also a dynamically allocated array.
The only problem with that is that I now have to get my data via a function which will loop through all the DataKeys, find a matching key for my input string, then return the ExtraData associated with it.
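For comparison, here is a sketch of the same lookup using the standard library rather than CUtlMap (a deliberate swap, since std::string keys own their characters and std::map already does the key search). The extra_data function and its fallback parameter are assumptions made for the example.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Keys are std::string, so no separate DataKeys storage is needed.
struct AbilityData {
    std::map<std::string, std::vector<int>> ExtraData;  // key -> one value per level
};

// Returns the value stored for `key` at `level`, or `fallback` if the key is
// unknown or the level is out of range.
inline int extra_data(const AbilityData& ability, const std::string& key,
                      std::size_t level, int fallback) {
    const auto it = ability.ExtraData.find(key);
    if (it == ability.ExtraData.end() || level >= it->second.size()) {
        return fallback;
    }
    return it->second[level];
}
```

This keeps the "find the key, then return its data" function, but the map performs the search, so there is no hand-written loop over a char** array.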