precision about structure loading in memory - c++

typedef struct sample2_s
{
int a;
int b;
int c;
int d;
} sample2;
typedef struct sample_s
{
int sampleint;
sample2 b; // sample2 must be declared before it is used here
} sample;
int main()
{
sample t;
}
In this example, when I create the instance t of the sample structure, its sample2 member is also loaded in memory.
The question is: how can I load only sampleint in memory?
Is there a way to load only part of a structure in memory?
If the answer is inheritance, as I think it is, how exactly does it work? Will there be a cost at run time due to a hash table?
I am asking these questions because I want to develop a DOD (data-oriented design) program and I want to better understand how structures are managed in memory.
Thank you

If you just want to copy sampleint, you can declare int s = x.sampleint; You can also memcpy() a range of memory defined by the offsetof macro in <stddef.h> to get a range of consecutive member variables.
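A minimal sketch of the offsetof/memcpy idea, assuming you want the consecutive members b and c of sample2. The bc_pair type and the copy_b_and_c function are hypothetical names introduced only for this illustration.

```cpp
#include <cstddef>
#include <cstring>

struct sample2 {
    int a;
    int b;
    int c;
    int d;
};

// Hypothetical destination type holding only the members we want.
struct bc_pair {
    int b;
    int c;
};

// Copies the byte range [offsetof(b), offsetof(d)) out of a sample2.
inline bc_pair copy_b_and_c(const sample2& src) {
    // The range must be exactly as large as the destination.
    static_assert(offsetof(sample2, d) - offsetof(sample2, b) == sizeof(bc_pair),
                  "b..c must be contiguous and match bc_pair");
    bc_pair out;
    std::memcpy(&out,
                reinterpret_cast<const char*>(&src) + offsetof(sample2, b),
                sizeof out);
    return out;
}
```

For a single member, the plain assignment shown above is simpler and just as fast.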
It seems as if what you want is one of the following:
Declare a samplebase type that, in C++, sample can inherit from.
Declare storage for only the individual members you want to copy.
Have sample hold a pointer to a sample2, and set that to NULL if you aren’t allocating one.
Declare the sample as a temporary in a block of code, copy the parts you want, let the memory be reclaimed when it goes out of scope.
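As a rough illustration of the first and third options, here is a sketch. The names sample_full and sample_lazy are made up for this example, and std::unique_ptr is used in place of a raw pointer set to NULL. Note that plain (non-virtual) inheritance involves no hash table: the base part is simply stored at the start of the derived object, and member offsets are fixed at compile time.

```cpp
#include <memory>

struct sample2 {
    int a, b, c, d;
};

// Option 1: keep the always-needed part in a small base type.
struct samplebase {
    int sampleint;
};

// The full type extends the base; the base subobject is stored inline.
struct sample_full : samplebase {
    sample2 b;
};

// Option 3: hold the large part through a pointer that stays null
// until the sample2 is actually needed.
struct sample_lazy {
    int sampleint = 0;
    std::unique_ptr<sample2> b;  // null by default, no sample2 allocated
};
```

Code that only needs sampleint can work with samplebase objects, which occupy just sizeof(int) bytes each. For data-oriented design, an array of samplebase (or a plain array of int) keeps the hot data densely packed.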

Related

Map a layout onto memory address

In C++ is there a way to "map" my desired layout onto a memory data, without memcopying it?
I.e. there is a void* buffer, and I know its layout:
byte1: uint8_t
byte2-3: uint16_t
byte4: uint8_t
I know I can create a struct, and memcpy the data to the struct, and then I can have the values as fields of struct.
But is there a way to achieve this without copying? The data is already there, I just need to read some fields, and I'm looking for something that can help with the layout.
(I can have some static ints for the memory offsets, but I'm hoping for some more generic).
I.e: I would have more "layouts", and based on type of the raw data I'd map the appropriate layout and access its fields which still points to the original data.
I know I can point structs to data, it is easy:
struct message {
uint8_t type;
};
struct request:message {
uint8_t rid;
uint8_t other;
};
struct response:message {
uint8_t result;
};
vector<uint8_t> data;
data.push_back(1); //type
data.push_back(10);
data.push_back(11);
data.push_back(12);
data.push_back(13);
struct request* ptrRequest;
ptrRequest = (struct request*)&data[0]; // start at the type byte so rid and other line up
cout << (int)ptrRequest->rid; //10
cout << (int)ptrRequest->other; //11
But what I'd like to achieve is to have a map with the layouts, i.e:
map<int, struct message*> messagetypes;
But I have no clue on how can I proceed as emplacing would need a new object, and casting is also challenging if the maps stores the base pointers only.
If your layout structure is POD, you can use a placement new-expression with no initializer, which serves as an object creation marker. E.g.:
#include <new> // Placement new.
#include <type_traits> // std::is_pod.
// ...
uint8_t* data = ...; // Read from disk, network, or elsewhere.
static_assert(std::is_pod<request>::value, "struct request must be POD.");
request* ptrRequest = new (static_cast<void*>(data)) request;
That only works with PODs. This is a long-standing issue documented in P0593R6, "Implicit creation of objects for low-level object manipulation".
If your target architecture requires data to be aligned, add an alignment check on the data pointer.
As another answer states, the compiler may eliminate the memcpy; examine the assembly output to confirm.
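For comparison, here is a sketch of the memcpy-based alternative, which has no alignment or object-lifetime requirements. The read_at helper and the PackedView type are hypothetical names. The offsets assume the layout from the question, counted from zero: byte 0 is a uint8_t, bytes 1-2 a uint16_t, byte 3 a uint8_t.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Reads a T stored at `offset` bytes into `buffer`, with no alignment requirement.
template <typename T>
T read_at(const void* buffer, std::size_t offset) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "read_at only supports trivially copyable types");
    T value;
    std::memcpy(&value, static_cast<const unsigned char*>(buffer) + offset, sizeof(T));
    return value;
}

// A non-owning view: it stores only a pointer, the fields stay in the buffer.
struct PackedView {
    const void* base;
    std::uint8_t  first()  const { return read_at<std::uint8_t>(base, 0); }
    std::uint16_t middle() const { return read_at<std::uint16_t>(base, 1); }  // native byte order
    std::uint8_t  last()   const { return read_at<std::uint8_t>(base, 3); }
};
```

Each accessor copies only the bytes of one field, and optimizing compilers typically turn such a fixed-size memcpy into a plain load. One view type per message kind would give the "several layouts" the question asks for.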
In C++ is there a way to "map" my desired layout onto a memory data, without memcopying it?
No, not in standard C++.
If the layout matches that of the class [1], then what you might be able to do is write the memory data onto the class instance initially, so that no copying is needed afterwards.
If the above is not possible, then what you might do is copy (yes, this is memcopy, but hold that thought) the data onto an automatic instance of the class, then placement-new a copy of the automatic instance onto the source array. A good optimiser can see that these copies back and forth do not change the value, and can optimise them away. Matching layout is also necessary here. Example:
struct data {
std::uint8_t byte;
std::uint8_t another;
std::uint16_t properly_aligned;
};
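void* buffer = get_some_buffer();
// Assumed: buffer holds at least sizeof(data) bytes. With space == sizeof(data),
// std::align succeeds only if buffer is already suitably aligned.
std::size_t space = sizeof(data);
if (!std::align(alignof(data), sizeof(data), buffer, space))
throw std::invalid_argument("bad alignment");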
data local{};
std::memcpy(&local, buffer, sizeof local);
data* dataptr = new(buffer) data{local};
std::uint16_t value_from_offset = dataptr->properly_aligned;
https://godbolt.org/z/uvrXS2 Notice how there is no call to std::memcpy in the generated assembly.
One thing to consider here is that the multi-byte integers must have the same byte order as the CPU uses natively. Therefore the data is not portable across systems of different endianness. More advanced de-serialisation is required for portability.
[1] It seems unlikely, however, that the data could match the layout of the class, because the second element, a uint16_t, is not aligned to a 16-bit boundary from the start of the layout.
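On the byte-order point, a minimal sketch of portable decoding: build the integer from individual bytes with shifts, so the result does not depend on the CPU's native order. The function names load_le16 and load_be16 are hypothetical, and which one applies depends on the byte order of your data format, which the question does not state.

```cpp
#include <cstdint>

// Decodes a 16-bit value stored least-significant byte first.
inline std::uint16_t load_le16(const unsigned char* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

// Decodes a 16-bit value stored most-significant byte first.
inline std::uint16_t load_be16(const unsigned char* p) {
    return static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}
```

Because these read single bytes, they also work at unaligned addresses such as the second byte of the buffer.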

How to let the compiler do the offset computations for an odd polymorphism structure, with as little code as possible?

I am not sure whether this is possible at all in standard C++, so "is it even possible?" is effectively a second form of my question.
I have this binary data which I want to read and re-create using structs. The data was originally created as a stream, with the content appended to a buffer one field at a time; nothing special about that. I could simply read it as a stream, the same way it was written. Instead, I wanted to see whether I could let the compiler do the math for me by modelling the binary data as a data structure.
The fields of the binary data have a predictable order which allows it to be represented as a data type, the issue I am having is with the depth and variable length of repeating fields. I am hoping the example code below makes it clearer.
Simple Example
struct Common {
int length;
};
struct Boo {
long member0;
char member1;
};
struct FooSimple : Common {
int count;
Boo boo_list[];
};
char buffer[1024];
int index = 15;
((FooSimple *)buffer)->boo_list[index].member0;
Advanced Example
struct Common {
int length;
};
struct Boo {
long member0;
char member1;
};
struct Goo {
int count;
Boo boo_list[];
};
struct FooAdvanced : Common {
int count;
Goo goo_list[];
};
char buffer[1024];
int index0 = 5, index1 = 15;
((FooAdvanced *)buffer)->goo_list[index0].boo_list[index1].member0;
The examples are not supposed to relate. I re-used some code due to lack of creativity for unique names.
For the simple example, there is nothing unusual about it. The Boo struct is of fixed size, therefore the compiler can do the calculations just fine, to reach the member0 field.
For the advanced example, as far as I can tell at least, it isn't as trivial of a case. The problem that I see, is that if I use the array selector operator to select a Goo object from the inline array of Goo-elements (goo_list), the compiler will not be able to do the offset calculations properly unless it makes some assumptions; possibly assuming that all preceding Goo-elements in the array have zero Boo-elements in the inline array (boo_list), or some other constant value. Naturally, that won't be the case.
Question(s):
What ways are there to achieve the offset computations to be done by the compiler, despite the inline arrays having variable lengths? Unless I am missing something, I believe templates can't help at all, due to their compile-time nature.
Is this even possible to achieve in C++?
How would you instantiate a FooAdvanced object, given a variable number of Goo and Boo elements for the goo_list and boo_list members, respectively?
If it is impossible, would I have to write some sort of wrapper code to handle the calculations instead?
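One possible shape for the wrapper code mentioned in the last question is sketched below. The serialized layout is an assumption, not something the question states: an int length, an int Goo count, then for each Goo an int Boo count followed by that many raw Boo records. The names boo_offset and read_boo are hypothetical, and the sketch does no bounds checking.

```cpp
#include <cstddef>
#include <cstring>

struct Boo {
    long member0;
    char member1;
};

// Assumed header: Common::length followed by the Goo count.
constexpr std::size_t kHeaderSize = 2 * sizeof(int);

// Byte offset of goo_list[goo_index].boo_list[boo_index] within the buffer.
// Earlier Goo records must be walked because each has its own length.
inline std::size_t boo_offset(const char* buffer, int goo_index, int boo_index) {
    std::size_t offset = kHeaderSize;
    for (int g = 0; g < goo_index; ++g) {
        int boo_count = 0;
        std::memcpy(&boo_count, buffer + offset, sizeof boo_count);
        // Skip this Goo's count field and all of its Boo records.
        offset += sizeof(int) + static_cast<std::size_t>(boo_count) * sizeof(Boo);
    }
    // Skip the selected Goo's count field, then index into its Boo records.
    return offset + sizeof(int) + static_cast<std::size_t>(boo_index) * sizeof(Boo);
}

// Copies one Boo out of the buffer; memcpy avoids alignment problems.
inline Boo read_boo(const char* buffer, int goo_index, int boo_index) {
    Boo out;
    std::memcpy(&out, buffer + boo_offset(buffer, goo_index, boo_index), sizeof out);
    return out;
}
```

The walk is linear in goo_index. If random access matters, the Goo offsets could be computed once and cached in a side table.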

Can we scramble the declaration order in C or C++?

Is there a method/plugin/addon in place to ignore the following clause (for some c/c++ compiler)? To reorder the declaration of members in a struct during the same stage as the preprocessor or similar? Perhaps by adding a keyword like volatile or something similar to the front of the struct declaration.
I was thinking: a compiler option, a built-in keyword, or a programming method.
C99 §6.7.2.1 clause 13 states:
Within a structure object, the
non-bit-field members and the units in
which bit-fields reside have addresses
that increase in the order in which
they are declared.
C++ seems to have a similar clause, and I am interested in that as well. Both clauses specify a reasonable guarantee: later declarations have greater memory offsets. But I often do not need to know the declaration order of my struct, whether for interface purposes or anything else. It would be nice to write some code like:
scrambled struct foo {
int a;
int bar;
};
or, suppose order doesn't really matter with this struct.
scrambled struct foo {
int bar;
int a;
};
And so, have the declarations of a and bar swapped randomly each time I compile. I believe that this also applies to setting aside stack memory.
main() {
scrambled int a;
scrambled int foo;
scrambled int bar;
..
Why do I ask?
I was curious to see how program bots were created. I watched some people analyzing memory offsets for changes while running the program to which a hack will be created.
It seems the process is: watch the memory offsets and take note of the purpose for the given offsets. Later, hack programs will inject desired values into memory at those offsets.
Now suppose those memory offsets changed every single time the program is compiled. Maybe it would hinder or dissuade individuals from taking the time to understand something you would rather they not know.
Run-time foxing is the best way; then you only have to release a single version. Where a struct has several fields of the same type, you can use an array instead.
Step 1. Instead of a structure with three int fields, use an array.
#define foo 0
#define bar 1
#define zee 2
struct {
int scramble [3];
} abc; // abc is the instance that the accesses refer to
...
value = abc.scramble[bar];
Step 2. Now use an indexing array which is randomised every time the program is run.
int abcindex [3]; // index lookup
int abcpool [3]; // index pool for randomise
for (i=0; i<3; i++) // initialise index pool
abcpool[i] = i;
srand (time(NULL));
for (i=0; i<3; i++) { // initialise lookup array
j = rand()%(3-i);
abcindex[i] = abcpool[j]; // allocate random index from pool
abcpool[j] = abcpool[2-i]; // remove index from pool
}
value = abc.scramble[abcindex[bar]];
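The two steps can be combined into one self-contained sketch. This version swaps rand() and the hand-written pool for std::shuffle with a std::mt19937 generator. The class name ScrambledInts and its member functions are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <numeric>
#include <random>

enum Field { foo = 0, bar = 1, zee = 2, field_count = 3 };

class ScrambledInts {
public:
    ScrambledInts() {
        // Start with the identity mapping 0, 1, 2 ...
        std::iota(index_.begin(), index_.end(), std::size_t{0});
        // ... then permute it differently on every run.
        std::mt19937 rng{std::random_device{}()};
        std::shuffle(index_.begin(), index_.end(), rng);
    }

    // Access a logical field through the randomised lookup table.
    int& at(Field f) { return slots_[index_[f]]; }

    // Exposed only so the mapping can be inspected in tests.
    std::size_t slot_of(Field f) const { return index_[f]; }

private:
    std::array<std::size_t, field_count> index_{};
    std::array<int, field_count> slots_{};
};
```

The lookup table itself still sits in memory next to the data, so this raises the effort needed rather than preventing inspection.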
Another way to try to fox a hacker is to include subterfuge variables that behave as if they have something to do with it but make the program exit if tampered with. Lastly you can keep some kind of checksum or encrypted copy of key variables, to check if they have been tampered with.
Your intention is good, but the solution isn't (sorry). Usually you can't recompile your program before each run. The attacker will hack the inspected program. However, there is an existing mitigation called ASLR (address space layout randomization): the operating system changes the load address for you, which makes hacks such as "return-oriented programming" and "return to libc" harder.

Default values for implicit Default constructor in C++

I was going through the C++ Object model when this question came. What are the default values for the data members of a class if the default constructor is invoked?
For example
class A
{
int x;
char* s;
double d;
string str; // very high doubt here as string is a wrapper class
int y[20];
public :
void print_values()
{
cout<<x<<' '<<s<<' '<<d<<' '<<str<<' '<<y[0]<<' '<<y<<endl;
}
};
int main()
{
A temp;
temp.print_values(); // what does this print?
return 0;
}
The value of an uninitialized variable of built-in type is indeterminate, whether it lives on the stack or on the heap (objects with static storage duration are the exception: they are zero-initialized).
Indeterminate does not necessarily mean zero, or anything in particular. For example, many debug builds fill memory with a pattern that can be used to detect invalid memory accesses. Release builds omit this fill, and the memory is simply left as it was found.
You can't really predict what will be in your memory when you allocate it.
It could contain pretty much anything, because the memory you are reading has not been set to 0 (or to any other value).
In small executables you will often find numeric values to be 0, but that is not guaranteed.
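To make the distinction concrete, here is a small sketch. The members are public only so the example can inspect them, and the helper function names are hypothetical. With `A a;` the built-in members are left indeterminate, while the std::string member is still default-constructed to an empty string. With `A a{};` every member is zeroed or empty.

```cpp
#include <string>

struct A {
    int x;
    char* s;
    double d;
    std::string str;
    int y[20];
};

// Empty braces: built-in members become zero/null, str becomes empty.
inline A make_value_initialized() {
    return A{};
}

// Default-initialization: only str is guaranteed to hold a known value,
// so that is the only member this function reads.
inline std::string default_initialized_str() {
    A a;
    return a.str;
}
```

Reading a.x, a.s, a.d, or a.y from the default-initialized object would be undefined behaviour, which is why the second function never touches them.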

C++ save vector of vector of vector of class object into file

I have a class in c++ like the following:
class myCls
{
public:
myCls();
void setAngle(float angle);
void setArr(unsigned char arr[64]);
unsigned char arr[64];
double angle;
int index;
static float calcMean(const unsigned char arr[64]);
static float sqrt7(float x);
};
Now in my main program I have a 3D vector of the class:
vector<vector<vector< myCls > > > obj;
The size of the vector also changes dynamically. My question is: how can I store the contents of my vector in a file and retrieve them afterwards?
I have tried many ways with no success. This is my attempt:
std::ofstream outFile;
outFile.open(fileName, ios::out);
for(int i=0;i<obj.size();i++)
{
outFile.write((const char *)(obj.data()),sizeof(vector<vector<myCls> >)*obj.size());
}
outFile.close();
And for reading it:
vector<vector<vector<myCls>>> myObj;
if(inFile.is_open())
{
inFile.read((char*)(myObj.data()),sizeof(vector<vector<myCls> >)*obj.size());
}
All I get is a runtime error.
Can anyone help me with this issue, please?
If you don't care too much about performance, try boost::serialization. Since it already implements serialization functions for the STL containers, you would only have to write the serialize function for myCls, and everything else comes for free. Since your member variables are all public, you can do that intrusively or non-intrusively.
Internally, a vector most usually consists of two numbers, representing the current length and the allocated length (capacity), as well as a pointer to the actual data. So the size of the “raw” object is fixed and approximately thrice the size of a pointer. This is what your code currently writes. The values the pointer points at won't be stored. When you read things back, you're setting the pointer to something which in most cases won't even be allocated memory, thus the runtime error.
In general, it's a really bad idea to directly manipulate the memory of any class which provides constructors, destructors or assignment operators. Your code writing to the private members of the vector would thoroughly confuse memory management, even if you took care to restore the pointed-at data as well. For this reason, you should only write simple (POD) data the way you did. Everything else should be serialized with custom code.
In the case of a vector, you'd probably store the length first, and then write the elements one at a time. For reading, you'd read the length, probably reserve memory accordingly, and then read elements one at a time. The boost::serialization templates suggested by Voltron will probably save you the trouble of implementing all that.