I've written a DLL that is injected into another process. Inside that process,
I search its memory space with the following function:
void MyDump(const void *m, unsigned int n)
{
const char *p = reinterpret_cast<const char *>(m);
for (unsigned int i = 0; i < n; ++i) {
// Do something with p[i]...
}
}
Now my question. If the target process uses a data structure, let's say
struct S
{
unsigned char a;
unsigned char b;
unsigned char c;
};
Is it always presented the same way in the process' memory? I mean, if S.a = 2 (always followed by b = 3 and c = 4), is the structure laid out as one contiguous run in the process' memory space, like
Offset
---------------------
0x0000 | 0x02 0x03 0x04
Or can those variables be in a different places there, like
Offset
---------------------
0x0000 | 0x00 0x02 0x00
0x03fc | 0x00 0x03 0x04
If the latter one, how to reconstruct the data-structure from various points from the memory?
Many thanks in advance,
nhaa123
If your victim is written in C or C++, and the datatypes used are truly that simple, then you'll always find them as a single block of bytes in memory.
But as soon as you have C++ types like std::string that observation no longer holds. For starters, the exact layout will differ between C++ compilers, and even different versions of the same compiler. The bytes of a std::string will likely not be in a contiguous array, but sometimes they are. If they're split in two, finding the second half probably will not help you in finding the first half.
Now throw in more complicated environments like a JIT'ting JVM running a Java app. The types you encounter in memory are very, very complex; one could write a book about decoding them.
The order of the members will always be the same, and the structure will occupy a contiguous memory block.
Depending on the compiler, padding might be added between members, but it will still be the same if the program is recompiled with the same compiler and the same settings. If padding is added and you are unaware of it, you can't detect it reliably at runtime: all the information the compiler had is lost by that point, and you are left to analyze the patterns and guess.
It depends on the alignment of the structure.
If you have something like this:
struct A
{
int16_t a;
char b;
int32_t c;
char d;
};
then by default on a 32-bit platform (I don't know if that is true for 64-bit), the offset of c is 4, as one byte is padded after b, and 3 more bytes are padded at the end after d (if I remember correctly).
It will be different if the structure has a specified alignment.
Now my question. If the target process uses a data structure [...] is it always presented the same way in the process' memory? I mean, if S.a = 2 (which always follows b = 3, c = 4), is the structure presented in a continuous row in the process' memory space?
Yes, however it will often be padded to align members in ways you may not expect. Thus, simply recreating the data structure from its declared members may not be enough to interface with it via code injection.
I would highly recommend using ReClassEx or ReClass.NET, two open-source programs created specifically for reconstructing data structures from memory and generating usable C++ code!
I often see structures in the code, at the end of which there is a memory reserve.
struct STAT_10K4
{
int32_t npos; // position number
...
float Plts;
Pxts;
float Plto [NUM];
uint32_t reserv [(NUM * 3)% 2 + 1];
};
Why do they do this?
Why are some of the reserve values dependent on constants?
What can happen if you do not make such reserves? Or make a mistake in their size?
This is a form of manual padding of a class to make its size a multiple of some number. In your case:
uint32_t reserv [(NUM * 3)% 2 + 1];
(NUM * 3) % 2 is actually nonsensical, as it is equivalent to NUM % 2 (not considering overflow). So if NUM is odd, the struct is padded with one additional uint32_t on top of the + 1 that is always present. This padding means that STAT_10K4's size is always a multiple of 8 bytes.
You will have to consult the documentation of your software to see why exactly this is done. Perhaps padding this struct with up to 8 bytes makes some algorithm easier to implement. Or maybe it has some perceived performance benefit. But this is pure speculation.
Typically, the compiler will pad your structs to 64-bit boundaries if you use any 64-bit types, so you don't need to do this manually.
Note: This answer is specific to mainstream compilers and x86. Obviously this does not apply to compiling for TI-calculators with 20-bit char & co.
This would typically be to support variable-length records. A couple of ways this could be used:
1. If the maximum number of records is known, then a simple structure definition can accommodate all cases.
2. In many protocols there is a "header-data" idiom. The header will be a fixed size but the data variable. The data will be received as a "blob". Thus the structure of the header can be declared and accessed by a pointer to the blob, and the data will follow on from that. For example:
typedef struct
{
uint32_t messageId;
uint32_t dataType;
uint32_t dataLenBytes;
uint8_t data[MAX_PAYLOAD];
}
tsMessageFormat;
The data is received in a blob, so a void* ptr, size_t len.
The buffer pointer is then cast so the message can be read as follows:
tsMessageFormat* pMessage = (tsMessageFormat*) ptr;
for (uint32_t i = 0; i < pMessage->dataLenBytes; i++)
{
//do something with pMessage->data[i];
}
In some languages the "data" could be specified as being an empty record, but C++ does not allow this. Sometimes you will see the "data" omitted and you have to perform pointer arithmetic to access the data.
The alternative to this would be to use a builder pattern and/or streams.
Windows uses this pattern a lot; many structures have a cbSize field which allows additional data to be conveyed beyond the structure. The structure accommodates most cases, but having cbSize allows additional data to be provided if necessary.
Is bitfield a C concept or C++?
Can it be used only within a structure? What are the other places we can use them?
AFAIK, bitfields are special structure variables that occupy the memory only for specified no. of bits. It is useful in saving memory and nothing else. Am I correct?
I coded a small program to understand the usage of bitfields - But, I think it is not working as expected. I expect the size of the below structure to be 1+4+2 = 7 bytes (considering the size of unsigned int is 4 bytes on my machine), But to my surprise it turns out to be 12 bytes (4+4+4). Can anyone let me know why?
#include <stdio.h>
struct s{
unsigned int a:1;
unsigned int b;
unsigned int c:2;
};
int main()
{
printf("sizeof struct s = %zu bytes \n",sizeof(struct s));
return 0;
}
OUTPUT:
sizeof struct s = 12 bytes
Because a and c are not contiguous, they each reserve a full int's worth of memory space. If you move a and c together, the size of the struct becomes 8 bytes.
Moreover, you are telling the compiler that you want a to occupy only 1 bit, not 1 byte. So even though a and c next to each other should occupy only 3 bits total (still under a single byte), the combination of a and c still becomes word-aligned in memory on your 32-bit machine, hence occupying a full 4 bytes in addition to the int b.
Similarly, you would find that
struct s{
unsigned int b;
short s1;
short s2;
};
occupies 8 bytes, while
struct s{
short s1;
unsigned int b;
short s2;
};
occupies 12 bytes because in the latter case, the two shorts each sit in their own 32-bit-aligned slot.
1) They originated in C, but are part of C++ too, unfortunately.
2) Yes, or within a class in C++.
3) As well as saving memory, they can be used for some forms of bit twiddling. However, both memory saving and twiddling are inherently implementation dependent - if you want to write portable software, avoid bit fields.
It's C.
Your compiler has rounded the memory allocation up to 12 bytes for alignment purposes. Most computer memory subsystems can't handle byte addressing.
Your program is working exactly as I'd expect. The compiler allocates adjacent bitfields into the same memory word, but yours are separated by a non-bitfield.
Move the bitfields next to each other and you'll probably get 8, which is the size of two ints on your machine. The bitfields would be packed into one int. This is compiler specific, however.
Bitfields are useful for saving space, but not much else.
Bitfields are widely used in firmware to map the different fields of registers. This saves a lot of the manual bitwise operations that would otherwise be necessary to read and write those fields.
One disadvantage is that you can't take the address of a bitfield.
I am using a 32-bit Linux OS
and the GCC compiler.
I tried three different structures.
In the first structure I defined only one char variable. The size of this structure is 1, which is correct.
In the second structure I defined only one int variable. Here the size of the structure is shown as 4, which is also correct.
But in the third structure I defined one char and one int, so the total size should be 5, yet the output shows 8. Can anyone please explain how memory for a structure is assigned?
typedef struct struct_size_tag
{
char c;
//int i;
}struct_size;
int main()
{
printf("Size of structure:%zu\n",sizeof(struct_size));
return 0;
}
Output: Size of structure:1
typedef struct struct_size_tag
{
//char c;
int i;
}struct_size;
int main()
{
printf("Size of structure:%zu\n",sizeof(struct_size));
return 0;
}
Output: Size of structure:4
typedef struct struct_size_tag
{
char c;
int i;
}struct_size;
int main()
{
printf("Size of structure:%zu\n",sizeof(struct_size));
return 0;
}
Output:
Size of structure:8
The difference in size is due to alignment. The compiler is free to choose padding bytes, which make the total size of a structure not necessarily the sum of its individual elements.
If the padding of a structure is undesired, because it might have to interface with some hardware requirement (or other reasons), compilers usually support packing structures, so the padding is disabled.
You definitely get Data Structure Alignment
"Data alignment means putting the data at a memory offset equal to some
multiple of the word size, which increases the system's performance
due to the way the CPU handles memory. To align the data, it may be
necessary to insert some meaningless bytes between the end of the last
data structure and the start of the next, which is data structure
padding."
For more, take a look at this, Data Structure Alignment
The C standard allows a compiler to add padding bytes to structs after any field to allow the following field to be aligned according to any specific requirements of the compiler (or the user of the compiler). The standard does not specify, but typically a compiler will provide a command line argument to specify the (default) alignment. Good compilers also invariably support the de facto standard of #pragma pack, including push and pop options.
Padding bytes improve performance by reducing the number of memory accesses needed for suitably aligned data types. For example, on a 32-bit processor (more specifically, a system whose memory has 32 data lines), reading or writing a 32-bit integer takes two memory accesses rather than one if the integer crosses a 4-byte boundary (i.e., unless the bottom two bits of its address are zero).
See My Blog Post for more details (better than Wikipedia article).
The magic word is padding/memory alignment; see data structure alignment.
Please have a look at the following code sample, executed on a Windows-32 system using Visual Studio 2010:
#include <iostream>
using namespace std;
class LogicallyClustered
{
bool _fA;
int _nA;
char _cA;
bool _fB;
int _nB;
char _cB;
};
class TypeClustered
{
bool _fA;
bool _fB;
char _cA;
char _cB;
int _nA;
int _nB;
};
int main(int argc, char* argv[])
{
cout << sizeof(LogicallyClustered) << endl; // 20
cout << sizeof(TypeClustered) << endl; // 12
return 0;
}
Question 1
The sizes of the two classes differ because the compiler inserts padding bytes to achieve an optimized memory alignment of the variables. Is this correct?
Question 2
Why is the memory footprint smaller if I cluster the variables by type as in class TypeClustered?
Question 3
Is it a good rule of thumb to always cluster member variables according to their type?
Should I also sort them according to their size ascending (bool, char, int, double...)?
EDIT
Additional Question 4
A smaller memory footprint will improve data cache efficiency, since more objects can be cached and you avoid full memory accesses into "slow" RAM. So can the ordering and grouping of the member declarations be considered a small but easy-to-achieve performance optimization?
1) Absolutely correct.
2) It's not smaller because they are grouped, but because of the way they are ordered and grouped. For example, if you declare 4 chars one after the other, they can be packed into 4 bytes. If you declare one char and immediately one int, 3 padding bytes will be inserted, as the int will need to be aligned to 4 bytes.
3) No! You should group members in a class so that the class becomes more readable.
Important note: this is all platform/compiler specific. Don't take it literally.
Another note: on some platforms there is also a small performance increase for accessing members that reside in the first n (varies) bytes of a class instance. So declaring frequently accessed members at the beginning of a class can result in a small speed increase. However, this too shouldn't be a criterion. I'm just stating a fact, but in no way recommend you do this.
You are right, the size differs because the compiler inserts padding bytes in class LogicallyClustered. The compiler should use a memory layout like this:
class LogicallyClustered
{
// class starts well aligned
bool _fA;
// 3 bytes padding (int needs to be aligned)
int _nA;
char _cA;
bool _fB;
// 2 bytes padding (int needs to be aligned)
int _nB;
char _cB;
// 3 bytes padding (so next class object in an array would be aligned)
};
Your class TypeClustered does not need any padding bytes because all elements are aligned. bool and char do not need alignment, int needs to be aligned on 4 byte boundary.
Regarding question 3 I would say (as often :-)) "It depends.". If you are in an environment where memory footprint does not matter very much I would rather sort logically to make the code more readable. If you are in an environment where every byte counts you might consider moving around the members for optimal usage of space.
Unless there are extreme memory footprint restrictions, cluster them logically, which improves code readability and ease of maintenance.
Unless you actually have problems of space (i.e. very, very large
vectors with such structures), don't worry about it. Otherwise: padding
is added for alignment: on most machines, for example, a double will
be aligned on an 8 byte boundary. Regrouping all members according to
type, with the types requiring the most alignment at the start will
result in the smallest memory footprint.
Q1: Yes
Q2: Depends on the size of bool (which is AFAIK compiler-dependent). Assuming it is 1 byte (like char), the first 4 members together use 4 bytes, which is as much as is used by one integer. Therefore, the compiler does not need to insert alignment padding in front of the integers.
Q3: If you want to order by type, size-descending is a better idea. However, that kind of clustering impedes readability. If you want to avoid padding under all circumstances, just make sure that every variable which needs more memory than 1 byte starts at an alignment boundary.
The alignment boundary, however, differs from architecture to architecture. That is (besides the possibly different sizes of int) why the same struct may have different sizes on different architectures. It is generally safe to start every member x at an offset that is a multiple of sizeof(x). I.e., in
struct {
char a;
char b;
char c;
int d;
};
Without padding, the int d would start at an offset of 3, which is not a multiple of sizeof(int) (= 4 on x86/x64), so the compiler has to pad before it; you should probably move it to the front. It is, however, not necessary to strictly cluster by type.
Some compilers also offer the possibility to omit padding completely, e.g. __attribute__((packed)) in g++. This, however, may slow down your program, because an int then might actually need two memory accesses.
An object foo is written to a new file on platform 1 as:
write( file, &myFoo, sizeof(struct foo) );
...and then read on platform 2 using:
read(file, &myFoo, filesize(file) );
The foo object has the following definition:
struct foo
{
char a;
int b;
long c;
char* d;
};
What kind of issues might arise when loading foo on platform 2?
Every kind of issue!
We don't know if char, int, long or char* are the same size on different platforms.
And what happened to the stuff d pointed to?
There might also be padding between the members, which could differ between platforms. Big endian and little endian systems would store the bytes of integers and pointers in different order. If you are really unlucky, there might be a middle endian system as well.
When you do this you need to watch out for:
Data type sizes (char is the only one you can trust)
Alignment / padding
Endianness
Pointing to invalid memory
Floating point representation
ASCII vs EBCDIC ? (yeah, seriously ?)
Probably others
I think you have to use the pack pragma to ensure there is no padding.
Otherwise the char may take up 4 bytes, depending on the default padding.
A char* pointer can be 32 bits on a 32-bit machine but 64 bits on a 64-bit machine.
So writing a pointer out directly makes no sense.
The last issue is endianness.