c++ figuring out memory layout of members programmatically

Suppose in one program, I'm given:
struct Foo {
int x;
double y;
char z;
};
struct Bar {
Foo f1;
int t;
Foo f2;
};
int main() {
Bar b;
b.f1.z = 'h';
b.f2.z = 'w';
// ... some code setting the rest of b
FILE *f = fopen("dump", "wb"); // C-style file
fwrite(&b, sizeof(Bar), 1, f);
fclose(f);
}
Suppose in another program, I have:
int main() {
FILE *f = fopen("dump", "rb");
std::string Foo = "int x; double y; char z;";
std::string Bar = "Foo f1; int t; Foo f2;";
// now, given this is it possible to read out
// the value of bar.f1.z and bar.f2.z set earlier?
}
What I'm asking is:
given I have the types of a class, can I figure out how C++ lays it out?

You need to research "serialization". There is a library, Boost Serialization, that people have been recommending.
FWIW, I recommend against using fwrite or std::ostream::write on classes, structures and unions. The compiler is allowed to insert padding between members, so there may be garbage written out. Also, pointers don't serialize very well.
To answer your question, in order to determine which structure to load data from, you need some kind of sentinel to indicate the object type. This can be anything from an enum to the name of the object.
Also investigate the Factory design pattern.
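
As a rough illustration of the advice above, here is a minimal sketch of field-by-field serialization for the Foo from the question. It assumes the writer and the reader run on the same architecture (byte order and the sizes of int and double are not normalized), and the tag value kFooTag and all helper names are hypothetical, not taken from Boost Serialization or any other library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

struct Foo {
    int x;
    double y;
    char z;
};

// Hypothetical sentinel identifying the object type in the stream.
const unsigned char kFooTag = 1;

// Append the raw bytes of one value to the buffer.
template <typename T>
void putField(std::vector<unsigned char>& out, const T& value) {
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&value);
    out.insert(out.end(), p, p + sizeof(T));
}

// Copy one value out of the buffer and advance the read position.
template <typename T>
void getField(const std::vector<unsigned char>& in, std::size_t& pos, T& value) {
    assert(pos + sizeof(T) <= in.size());
    std::memcpy(&value, &in[pos], sizeof(T));
    pos += sizeof(T);
}

// Writes the tag and then each member in turn, so no padding bytes are written.
std::vector<unsigned char> serialize(const Foo& f) {
    std::vector<unsigned char> out;
    out.push_back(kFooTag);
    putField(out, f.x);
    putField(out, f.y);
    putField(out, f.z);
    return out;
}

// Returns false if the buffer does not start with the expected tag.
bool deserialize(const std::vector<unsigned char>& in, Foo& f) {
    if (in.empty() || in[0] != kFooTag) return false;
    std::size_t pos = 1;
    getField(in, pos, f.x);
    getField(in, pos, f.y);
    getField(in, pos, f.z);
    return true;
}
```

Because each member is written separately, the reader no longer needs to know where the compiler placed padding inside Foo; it only needs to know the order and types of the fields.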

I'm not quite sure what you're asking, so I'll take a leap...
If you really need to figure out where the fields are in a struct, use offsetof.
Note the "POD" restriction in the linked page. This is a C macro, included in C++ for compatibility reasons. We are supposed to use member pointers instead these days, though member pointers don't address all the same problems.
"offsetof" basically imagines an instance of your struct at address zero, and then looks at the address of the field you're interested in. This goes horribly wrong if your struct/class uses multiple or virtual inheritance, since finding the field then involves (typically) a check in the virtual table. Since the imaginary instance at address zero doesn't exist, it doesn't have a virtual table pointer, so you probably get some kind of access violation crash.
Some compilers can cope with this, as they have replaced the traditional offsetof macro with an intrinsic that knows the layout of the struct without trying to do the imaginary-instance trickery. Even so, it's best not to rely on this.
For POD structs, though, offsetof is a convenient way to find the offset to a particular field, and a safe one in that it determines the actual offset irrespective of the alignment applied by your platform.
For the sizeof a field, you obviously just use sizeof. That just leaves platform-specific issues - different layout on different platforms etc due to alignment, endianness and so on ;-)
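
To make this concrete, here is a small sketch that applies offsetof to the Foo and Bar from the question and uses the result to pick the two chars out of a raw byte copy of a Bar. The helper names dumpBytes and charAt are made up for illustration, and the offsets are only valid for the compiler and options that built this code.

```cpp
#include <cassert>
#include <cstddef>   // offsetof
#include <cstring>
#include <vector>

struct Foo {
    int x;
    double y;
    char z;
};

struct Bar {
    Foo f1;
    int t;
    Foo f2;
};

// Offsets of the two chars of interest, relative to the start of a Bar.
const std::size_t kOffsetF1Z = offsetof(Bar, f1) + offsetof(Foo, z);
const std::size_t kOffsetF2Z = offsetof(Bar, f2) + offsetof(Foo, z);

// Raw copy of a Bar, standing in for the contents of the "dump" file.
std::vector<unsigned char> dumpBytes(const Bar& b) {
    std::vector<unsigned char> bytes(sizeof(Bar));
    std::memcpy(&bytes[0], &b, sizeof(Bar));
    return bytes;
}

// Pick a single char out of the dumped bytes.
char charAt(const std::vector<unsigned char>& bytes, std::size_t offset) {
    assert(offset < bytes.size());
    return static_cast<char>(bytes[offset]);
}
```

Calling charAt(bytes, kOffsetF1Z) on the dump of a Bar whose f1.z was set to 'h' should give back 'h', as long as the dump was produced by a build with the same layout.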
EDIT
Possibly a silly question, but why not fread the data from the file straight into an instance of the struct, doing essentially what you did with the fwrite but in reverse?
You get the same portability issues as above, meaning your code may not be able to read its own files if recompiled using different options, a different compiler or for a different platform. But for a single-platform app this kind of thing works very well.

You can't assume anything about the order of the bytes that represent Bar. If the file moves between systems, or the program is compiled with different flags, then you may be reading and writing the bytes in different orders.
I've seen a way around this, but it may only work for very simple types.
and I quote from a raknet tutorial:
#pragma pack(push, 1)
struct structName
{
unsigned char typeId; // Your type here
// Your data here
};
#pragma pack(pop)
Notice the #pragma pack(push,1) and #pragma pack(pop)? These force your compiler (in this case VC++) to pack the structure as byte-aligned. Check your compiler documentation to learn more.
You want serialization.
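
A small sketch of the effect of the pragma, comparing the same two members with and without packing. #pragma pack is a compiler extension rather than standard C++ (MSVC, GCC and Clang accept this form); the struct names Natural and Packed and the helper checkPackedLayout are invented for this example.

```cpp
#include <cassert>
#include <cstddef>   // offsetof

// Default layout: the compiler may insert padding before value.
struct Natural {
    unsigned char typeId;
    int value;
};

#pragma pack(push, 1)
// Byte-aligned layout: members follow each other with no padding.
struct Packed {
    unsigned char typeId;
    int value;
};
#pragma pack(pop)

// Sanity check that could be run once at startup before trusting the layout.
void checkPackedLayout() {
    assert(sizeof(Packed) == sizeof(unsigned char) + sizeof(int));
    assert(offsetof(Packed, value) == sizeof(unsigned char));
}
```

On a typical platform where int is 4 bytes and 4-byte aligned, sizeof(Natural) would be 8 while sizeof(Packed) is 5. Packing removes the padding but does nothing about byte order or differing type sizes between platforms.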

For the example that you give, it looks like you really need some sort of C parser that would parse the strings with your type declarations. Then you'd be able to interpret the bytes that you read from the file in the correct way.
Structs in C are laid out member to member in order of declaration. The compiler may insert padding between members according to platform-specific alignment needs. The size of the variables is also platform-specific.
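
The layout rule described above can be sketched as a small calculator. This assumes the common ABI behaviour for plain structs (members in declaration order, each aligned to its own alignment, total size rounded up to the largest member alignment); it ignores bit-fields, inheritance and packing pragmas, and the sizes and alignments have to be supplied by whoever parses the declarations. All names here are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Description of one member, as a hypothetical declaration parser might produce it.
struct MemberInfo {
    std::string name;
    std::size_t size;
    std::size_t align;
    std::size_t offset;   // filled in by computeLayout
};

struct Layout {
    std::vector<MemberInfo> members;
    std::size_t size;
    std::size_t align;
};

// Round n up to the next multiple of align (align must be non-zero).
std::size_t alignUp(std::size_t n, std::size_t align) {
    assert(align != 0);
    return (n + align - 1) / align * align;
}

// Convenience constructor for a member whose size and alignment are known.
MemberInfo makeMember(const std::string& name, std::size_t size, std::size_t align) {
    MemberInfo m;
    m.name = name;
    m.size = size;
    m.align = align;
    m.offset = 0;
    return m;
}

// Lay members out in declaration order, padding each to its alignment,
// then pad the total size to the largest member alignment.
Layout computeLayout(std::vector<MemberInfo> members) {
    Layout result;
    result.size = 0;
    result.align = 1;
    for (std::size_t i = 0; i < members.size(); ++i) {
        result.size = alignUp(result.size, members[i].align);
        members[i].offset = result.size;
        result.size += members[i].size;
        if (members[i].align > result.align) result.align = members[i].align;
    }
    result.size = alignUp(result.size, result.align);
    result.members = members;
    return result;
}
```

For Foo with a 4-byte int, an 8-byte double aligned to 8, and a 1-byte char, this gives offsets 0, 8 and 16 and a total size of 24. Feeding Foo back in as a 24-byte member with alignment 8 gives Bar the offsets 0, 24 and 32 and a size of 56. A real compiler on a different platform may well disagree, which is exactly the portability problem the answers describe.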

If you have control over the class you can use member pointers. You definitely can do this. The question is whether or not you should...
class Metadata
{
public:
virtual int getOffset() = 0;
};
template <typename THost, typename TField>
class TypedMetadata : Metadata
{
private:
TField (THost::*memberPointer_);
TypedMetadata(TField (THost::*memberPointer))
{
memberPointer_ = memberPointer;
}
public:
static Metadata* getInstance(TField (THost::*memberPointer))
{
return new TypedMetadata<THost, TField>(memberPointer);
}
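virtual int getOffset()
{
// Imagines an instance at address zero, like the traditional offsetof macro.
// Subtracting char pointers avoids truncating a pointer to int on 64-bit targets.
THost* host = 0;
int result = (int)(reinterpret_cast<char*>(&(host->*memberPointer_)) - reinterpret_cast<char*>(host));
return result;
}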
};
template<typename THost, typename TField>
Metadata* getTypeMetadata(TField (THost::*memberPointer))
{
return TypedMetadata<THost, TField>::getInstance(memberPointer);
}
class Contained
{
char foo[47];
};
class Container
{
private:
int x;
int y;
Contained contained;
char c1;
char* z;
char c2;
public:
static Metadata** getMetadata()
{
Metadata** metadata = new Metadata*[6];
metadata[0] = getTypeMetadata(&Container::x);
metadata[1] = getTypeMetadata(&Container::y);
metadata[2] = getTypeMetadata(&Container::contained);
metadata[3] = getTypeMetadata(&Container::c1);
metadata[4] = getTypeMetadata(&Container::z);
metadata[5] = getTypeMetadata(&Container::c2);
return metadata;
}
};
int main()
{
Metadata** metadata = Container::getMetadata();
std::cout << metadata[0]->getOffset() << std::endl;
std::cout << metadata[1]->getOffset() << std::endl;
std::cout << metadata[2]->getOffset() << std::endl;
std::cout << metadata[3]->getOffset() << std::endl;
std::cout << metadata[4]->getOffset() << std::endl;
std::cout << metadata[5]->getOffset() << std::endl;
return 0;
}

Related

How to make function return pointer to an array or object in C++?

I'm confused with a lot of answers found about what is a simple thing in other languages. I would like to get a reference to an object contained in class or struct. I've come up to using one of two different functions (here - getData()).
Question:
So, I am not sure which one to use, they appear to do the same thing. Other thing, is there some reason I should care because it's a union? And the most important question here is - I'm not sure about delete part I found in some answers, which scares me that this code example I've shown is not complete and will cause some memory leaks at some point.
#include <iostream>
#include <stdint.h>
using namespace std;
class settings_t {
private:
static const long b1 =0;
uint8_t setmap;
public:
uint8_t myBaseID;
uint8_t reserved1;
uint8_t reserved2;
};
class test1 {
public: //actually, I want this to be private
long v1;
settings_t st;
union {
uint8_t data[4];
uint32_t m1;
settings_t st1;
};
public:
uint8_t * getData() {
return data;
}
uint8_t (&getData2())[4] {
return data;
}
};
int main() {
test1 t1;
t1.data[2]=65;
uint8_t *d1 = t1.getData();
cout<<" => " << d1[2];
d1[2]=66;
uint8_t *d2 = t1.getData2();
cout<<" => " << d2[2];
}

The main difference between C++ and languages like C# or Java is that C++ has no built-in memory management (it is not a managed language). If a program allocates memory dynamically in C++, the program is responsible for releasing that memory when it is no longer needed. The delete you saw in other answers follows from this requirement.
However, in your case the getData() function returns a pointer to data that is part of the class test1. It is an array, and the array exists for as long as the object of this class exists. Both versions of getData will work.
You did not use any dynamic allocation: the object t1 of type test1 was allocated on the stack of the main function and exists until main returns. You do not need to worry about delete.
The difference between the two methods is that the first does not carry the array size in its return type, whereas the second does. For that reason the second method has more limited practical use, but provides better compile-time checking.
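
A short sketch of that difference, using a hypothetical class Holder as a stand-in for test1. Returning a reference to the array keeps the length in the type, so a template can deduce it; returning a pointer forces the caller to carry the length separately.

```cpp
#include <cassert>
#include <cstddef>
#include <stdint.h>

class Holder {
    uint8_t data[4];
public:
    Holder() { for (std::size_t i = 0; i < 4; ++i) data[i] = 0; }
    uint8_t* asPointer() { return data; }          // size information is lost
    uint8_t (&asArray())[4] { return data; }       // size stays part of the type
};

// Accepts only real arrays; N is deduced from the reference type.
template <std::size_t N>
std::size_t countOf(uint8_t (&)[N]) {
    return N;
}

// Writes through a plain pointer; the caller must supply the length itself.
void setByte(uint8_t* p, std::size_t index, std::size_t length, uint8_t value) {
    assert(index < length);
    p[index] = value;
}
```

Neither accessor allocates anything, so there is nothing to delete in either case; both simply expose storage that lives inside the Holder object.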

How to address variable by name stored in another variable (C++)

Good day,
I would like to reference a structure in a function by using a variable to store its name. Is this possible to do something like this in C++?
Definitely, all existing structures will be declared and initialised before any call is made (probably as global) and I will build in a check to make sure that only existing structures are referenced.
I would like something in this spirit:
struct StructName
{
...stuff
}a,b,c;
StructName a;
StructName b;
.
.
.
etc. including setting required values (in initialisation or elsewhere in code as needed)
and then I would have something like this to call from another portion of code:
void myFunction(char someInput)
{
some stuff
some stuff
externalFunction(static parameter, static parameter, _someInput_, static parameter);
yet some other stuff
}
where someInput is either a, b or c.
Please bear in mind I am a beginner with C, with little to no formal training in subject matter.
Thank you.
edit: If it was just myself, I would make do with case switch for someInput, referencing the structure directly in each case, but this part of the code is meant to be extendable by a non-programmer who would supply structures themselves. I would provide him with a template of structure initialisation code, and he would add the initialisation code to a specified place in the code, amend the list of allowed names and compile the library.

You cannot convert a char or a char const * (runtime data) into a type (compile time information).
[edit] Well, actually you can with something like the following code, but since it uses templates, it will still be available only at the compile time, so you will not be able to pass function parameters, for example.
template < char C >
struct GetType;
template < >
struct GetType< 'a' > {
typedef struct { int foo; } type; };
template < >
struct GetType< 'b' > {
typedef struct { int bar; } type; };
GetType< 'a' >::type foo;
GetType< 'b' >::type bar;

Variable names disappear as part of the compilation step(s) in C and C++.
Typically, there are two scenarios that solve the type of problem you are describing:
User input corresponds to specific variable name.
You don't actually want the "name" of the variable, but just need a way to associate different data with different parts of your code.
In the second, simpler case, just use an array, and use the index to the element you want as the way to associate the correct data.
In the first case, you can use a std::map<std::string, StructName*>. A std::map acts somewhat like an array, but it is indexed by the first type given in the template parameters to std::map, so in this case you can use a std::string as the index. (A map cannot hold references directly, so the example stores pointers.)
Something like this:
#include <map>
#include <iostream>
#include <string>
struct StructName
{
int x;
};
std::map<std::string, StructName *> vars;
StructName a;
StructName b;
void registerVars()
{
vars["a"] = &a;
vars["b"] = &b;
}
int main()
{
std::string v;
registerVars();
while(std::cin >> v)
{
std::cin.ignore(1000, '\n');
if (vars.find(v) == vars.end())
{
std::cout << "No such variable" << std::endl;
}
else
{
vars[v]->x++;
std::cout << "variable " << v << " is " << vars[v]->x << std::endl;
}
}
return 0;
}

C++ Pointer to Members

If I have two classes that are in the same hierarchy with a member of the same name and type, what is the "correct" way to create a member pointer to the base class's variable.
Ex.
class A
{
public:
int x;
A():x(1){}
};
class B : public A
{
public:
int x;
B():x(2){}
};
int main(int argc, char *argv[]) {
B classB;
int B::*ptr = &B::x;
int B::*ptr1 = &B::A::x;
int A::*ptr2 = &A::x;
printf("%d,%d,%d\n", classB.*ptr, classB.*ptr1, classB.*ptr2);
return 0;
}
On my compiler (LLVM GCC) this will print 2,1,1 like I would expect it to. This leads me to my two questions.
Are all three of the above implementations "safe" when it comes to the C++ standard?
And if so, do any mainstream compilers have incompatibilities with any of these?

I believe all three are safe, although I can't cite chapter and verse from the standard on them. :)
That said, I have run across one very specific bug in member function pointers on an older version of Visual Studio (I don't remember which, I'm afraid). Specifically, I had a structure like this:
struct optable_entry {
const char *name;
void (MyClass::*run)();
};
const optable_entry operations[] = {
{ "foo", &MyClass::foo },
/* ... */
};
With this, for some reason, the member function values would not be properly initialized. In my case, this was generated code, so it wasn't too much trouble to replace it with a massive switch statement instead, but it's something to watch out for - member function pointers are rarely enough used that weird corner cases like this may be lurking in your compiler.
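
For reference, a minimal sketch of such a table of member function pointers that compiles with current compilers. MyClass, its foo and bar members, and the lookup helper runByName are hypothetical; the original generated code is not shown in the answer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

class MyClass {
public:
    MyClass() : counter(0) {}
    void foo() { counter += 1; }
    void bar() { counter += 10; }
    int counter;
};

struct optable_entry {
    const char* name;
    void (MyClass::*run)();   // pointer to a member function of MyClass
};

const optable_entry operations[] = {
    { "foo", &MyClass::foo },
    { "bar", &MyClass::bar },
};
const std::size_t operationCount = sizeof(operations) / sizeof(operations[0]);

// Look the operation up by name and invoke it on obj; returns false if unknown.
bool runByName(MyClass& obj, const char* name) {
    assert(name != 0);
    for (std::size_t i = 0; i < operationCount; ++i) {
        if (std::strcmp(operations[i].name, name) == 0) {
            (obj.*operations[i].run)();
            return true;
        }
    }
    return false;
}
```

The parentheses around obj.*operations[i].run are required, because the function call operator binds more tightly than .* does.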

On what platforms will this crash, and how can I improve it?

I've written the rudiments of a class for creating dynamic structures in C++. Dynamic structure members are stored contiguously with (as far as my tests indicate) the same padding that the compiler would insert in the equivalent static structure. Dynamic structures can thus be implicitly converted to static structures for interoperability with existing APIs.
Foremost, I don't trust myself to be able to write Boost-quality code that can compile and work on more or less any platform. What parts of this code are dangerously in need of modification?
I have one other design-related question: Is a templated get accessor the only way of providing the compiler with the requisite static type information for type-safe code? As it is, the user of dynamic_struct must specify the type of the member they are accessing, whenever they access it. If that type should change, all of the accesses become invalid, and will either cause spectacular crashes—or worse, fail silently. And it can't be caught at compile time. That's a huge risk, and one I'd like to remedy.
Example of usage:
struct Test {
char a, b, c;
int i;
Foo object;
};
void bar(const Test&);
int main(int argc, char** argv) {
dynamic_struct<std::string> ds(sizeof(Test));
ds.append<char>("a") = 'A';
ds.append<char>("b") = '2';
ds.append<char>("c") = 'D';
ds.append<int>("i") = 123;
ds.append<Foo>("object");
bar(ds);
}
And the code follows:
//
// dynamic_struct.h
//
// Much omitted for brevity.
//
/**
* For any type, determines the alignment imposed by the compiler.
*/
template<class T>
class alignment_of {
private:
struct alignment {
char a;
T b;
}; // struct alignment
public:
enum { value = sizeof(alignment) - sizeof(T) };
}; // class alignment_of
/**
* A dynamically-created structure, whose fields are indexed by keys of
* some type K, which can be substituted at runtime for any structure
* with identical members and packing.
*/
template<class K>
class dynamic_struct {
public:
// Default maximum structure size.
static const int DEFAULT_SIZE = 32;
/**
* Create a structure with normal inter-element padding.
*/
dynamic_struct(int size = DEFAULT_SIZE) : max(size) {
data.reserve(max);
} // dynamic_struct()
/**
* Copy structure from another structure with the same key type.
*/
dynamic_struct(const dynamic_struct& structure) :
members(structure.members), max(structure.max) {
data.reserve(max);
for (iterator i = members.begin(); i != members.end(); ++i)
i->second.copy(&data[0] + i->second.offset,
&structure.data[0] + i->second.offset);
} // dynamic_struct()
/**
* Destroy all members of the structure.
*/
~dynamic_struct() {
for (iterator i = members.begin(); i != members.end(); ++i)
i->second.destroy(&data[0] + i->second.offset);
} // ~dynamic_struct()
/**
* Get a value from the structure by its key.
*/
template<class T>
T& get(const K& key) {
iterator i = members.find(key);
if (i == members.end()) {
std::ostringstream message;
message << "Read of nonexistent member \"" << key << "\".";
throw dynamic_struct_access_error(message.str());
} // if
return *reinterpret_cast<T*>(&data[0] + i->second.offset);
} // get()
/**
* Append a member to the structure.
*/
template<class T>
T& append(const K& key, int alignment = alignment_of<T>::value) {
iterator i = members.find(key);
if (i != members.end()) {
std::ostringstream message;
message << "Add of already existing member \"" << key << "\".";
throw dynamic_struct_access_error(message.str());
} // if
const int modulus = data.size() % alignment;
const int delta = modulus == 0 ? 0 : alignment - modulus;
if (data.size() + delta + sizeof(T) > max) {
std::ostringstream message;
message << "Attempt to add " << delta + sizeof(T)
<< " bytes to struct, exceeding maximum size of "
<< max << ".";
throw dynamic_struct_size_error(message.str());
} // if
data.resize(data.size() + delta + sizeof(T));
new (static_cast<void*>(&data[0] + data.size() - sizeof(T))) T;
std::pair<iterator, bool> j = members.insert
({key, member(data.size() - sizeof(T), destroy<T>, copy<T>)});
if (j.second) {
return *reinterpret_cast<T*>(&data[0] + j.first->second.offset);
} else {
std::ostringstream message;
message << "Unable to add member \"" << key << "\".";
throw dynamic_struct_access_error(message.str());
} // if
} // append()
/**
* Implicit checked conversion operator.
*/
template<class T>
operator T&() { return as<T>(); }
/**
* Convert from structure to real structure.
*/
template<class T>
T& as() {
// This naturally fails more frequently if changed to "!=".
if (sizeof(T) < data.size()) {
std::ostringstream message;
message << "Attempt to cast dynamic struct of size "
<< data.size() << " to type of size " << sizeof(T) << ".";
throw dynamic_struct_size_error(message.str());
} // if
return *reinterpret_cast<T*>(&data[0]);
} // as()
private:
// Map from keys to member offsets.
map_type members;
// Data buffer.
std::vector<unsigned char> data;
// Maximum allowed size.
const unsigned int max;
}; // class dynamic_struct

There's nothing inherently wrong with this kind of code. Delaying type-checking until runtime is perfectly valid, although you will have to work hard to defeat the compile-time type system. I wrote a heterogeneous stack class, where you could insert any type, which functioned in a similar fashion.
However, you have to ask yourself: what are you actually going to be using this for? I wrote that stack to replace the C++ stack for an interpreted language, which is a pretty tall order for any particular class. If you're not doing something drastic, this probably isn't the right thing to do.
In short, you can do it, and it's not illegal or bad or undefined and you can make it work, but you only should if you have a very desperate need to do things outside the normal language scope. Also, your code will die horrendously when C++0x becomes Standard and you need to handle move semantics and all the rest of it.
The easiest way to think of your code is as a managed heap of miniature size. You place various types of object on it, they're stored contiguously, and so on.
Edit: Wait, you didn't manage to enforce type safety at runtime either? You just blew compile-time type safety but didn't replace it? Let me post some far superior code (that is somewhat slower, probably).
Edit: Oh wait. You want to convert your dynamic_struct, as the whole thing, to arbitrary unknown other structs, at runtime? Oh. Oh, man. Oh, seriously. What. Just no. Just don't. Really, really, don't. That's so wrong, it's unbelievable. If you had reflection, you could make this work, but C++ doesn't offer that. You can enforce type safety at runtime per each individual member using dynamic_cast and type erasure with inheritance. Not for the whole struct, because given a type T you can't tell what the types or binary layout is.
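
A minimal sketch of the per-member runtime check described above, using type erasure through a common base class and dynamic_cast. This is not the "far superior code" the answer promises, only an illustration of the idea; CheckedStruct, MemberBase and Member are hypothetical names, it needs C++11 for std::shared_ptr, and unlike the questioner's dynamic_struct it does not store members contiguously.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>

// Type-erased base: the map stores pointers to this.
struct MemberBase {
    virtual ~MemberBase() {}
};

// Concrete holder that remembers the static type T of its value.
template <typename T>
struct Member : MemberBase {
    T value;
    Member() : value() {}
};

class CheckedStruct {
    std::map<std::string, std::shared_ptr<MemberBase> > members;
public:
    // Add a value-initialized member of type T under the given key.
    template <typename T>
    T& append(const std::string& key) {
        assert(members.find(key) == members.end());
        std::shared_ptr<Member<T> > m = std::make_shared<Member<T> >();
        members[key] = m;
        return m->value;
    }

    // Throws if the key is missing or was appended with a different type.
    template <typename T>
    T& get(const std::string& key) {
        std::map<std::string, std::shared_ptr<MemberBase> >::iterator i = members.find(key);
        if (i == members.end())
            throw std::out_of_range("no such member: " + key);
        Member<T>* m = dynamic_cast<Member<T>*>(i->second.get());
        if (m == 0)
            throw std::runtime_error("type mismatch for member: " + key);
        return m->value;
    }
};
```

With this, asking for get<double>("i") on a member appended as int throws instead of silently reinterpreting the bytes, which addresses the "fail silently" risk raised in the question at the cost of one allocation per member.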

I think the type-checking could be improved. Right now it will reinterpret_cast itself to any type with the same size.
Maybe create an interface to register client structures at program startup, so they may be verified member-by-member — or even rearranged on the fly, or constructed more intelligently in the first place.
#define REGISTER_DYNAMIC_STRUCT_CLIENT( STRUCT, MEMBER ) \
do dynamic_struct::registry< STRUCT >() /* one registry obj per client type */ \
.add( # MEMBER, &STRUCT::MEMBER, offsetof( STRUCT, MEMBER ) ); while(0)
// ^ name as str ^ ptr to memb ^ check against dynamic offset

I have one question: what do you get out of it ?
I mean it's a clever piece of code but:
you're fiddling with memory, the chances of blow-up are huge
it's quite complicated too, I didn't get everything and I would certainly have to ponder longer...
What I am really wondering is what you actually want...
For example, using Boost.Fusion
struct a_key { typedef char type; };
struct object_key { typedef Foo type; };
typedef boost::fusion::map<
boost::fusion::pair<a_key, a_key::type>,
boost::fusion::pair<object_key, object_key::type>
> data_type;
int main(int argc, char* argv[])
{
data_type data;
boost::fusion::at_key<a_key>(data) = 'a'; // compile time checked
}
Using Boost.Fusion you get compile-time reflection as well as correct packing.
I don't really see the need for "runtime" selection here (using a value as key instead of a type) when you need to pass the right type to the assignment anyway (char vs Foo).
Finally, note that this can be automated, thanks to preprocessor programming:
DECLARE_ATTRIBUTES(
mData,
(char, a)
(char, b)
(char, c)
(int, i)
(Foo, object)
)
Not much wordier than a typical declaration, though a, b, etc. will be inner types rather than attribute names.
This has several advantages over your solution:
compile-time checking
perfect compliance with default generated constructors / copy constructors / etc...
much more compact representation
no runtime lookup of the "right" member

Pointer-to-data-member-of-data-member

I had the following piece of code (simplified for this question):
struct StyleInfo
{
int width;
int height;
};
typedef int (StyleInfo::*StyleInfoMember);
void AddStyleInfoMembers(std::vector<StyleInfoMember>& members)
{
members.push_back(&StyleInfo::width);
members.push_back(&StyleInfo::height);
}
Now, we had to restructure this a bit, and we did something like this:
struct Rectangle
{
int width;
int height;
};
struct StyleInfo
{
Rectangle size;
};
typedef int (StyleInfo::*StyleInfoMember);
void AddStyleInfoMembers(std::vector<StyleInfoMember>& members)
{
members.push_back(&StyleInfo::size::width);
members.push_back(&StyleInfo::size::height);
}
If this all looks like a stupid thing to do, or if you feel there's a good opportunity to apply BOOST here for some reason, I must warn you that I really simplified it all down to the problem at hand:
error C3083: 'size': the symbol to the left of a '::' must be a type
The point I'm trying to make is that I don't know what the correct syntax is to use here. It might be that "StyleInfo" is not the correct type to take the address from to begin with, but in my project I can fix that sort of thing (there's a whole framework there). I simply don't know how to point to this member-within-a-member.

Remember that a pointer to a member is used just like a member.
Obj x;
int y = x.*ptrMem;
But as with normal members, you cannot reach the members of a nested object in a single member access. So you need to access it the way you would access a member of the object (in your case via the size member).
#include <vector>
#include <iostream>
struct Rectangle
{
int width;
int height;
};
struct StyleInfo
{
Rectangle size;
};
typedef Rectangle (StyleInfo::*StyleInfoMember);
typedef int (Rectangle::*RectangleMember);
typedef std::pair<StyleInfoMember,RectangleMember> Access;
void AddStyleInfoMembers(std::vector<Access>& members)
{
members.push_back(std::make_pair(&StyleInfo::size,&Rectangle::width));
members.push_back(std::make_pair(&StyleInfo::size,&Rectangle::height));
}
int main()
{
std::vector<Access> data;
AddStyleInfoMembers(data);
StyleInfo obj;
obj.size.width = 10;
std::cout << obj.*(data[0].first).*(data[0].second) << std::endl;
}
This is not something I would recommend doing!
An alternative (that I recommend even less) is to find the byte offset from the beginning of the class and then just add this to the object's address. Obviously this will involve a lot of casting backwards and forwards, so it looks even worse than the above.

Is it definitely possible? I honestly don't know, never having played much with pointer-to-member.
Suppose you were using non-POD types (I know you aren't, but the syntax would have to support it). Then pointer-to-member might have to encapsulate more than just an offset from the base pointer. There might be indirection as well, depending how multiple inheritance is implemented. With multiple levels of member indirection, this could get arbitrarily complicated, which is a lot to ask for a type that has to have fixed size.
Perhaps you need a vector of pairs, of types defined by:
typedef Rectangle (StyleInfo::*StyleInfoMember);
typedef int (Rectangle::*RectangleMember);
Apply each in turn to get where you want to be. Of course this still doesn't let you build a vector of mappings from a StyleInfo to arbitrary members-of-members-of StyleInfo, since they wouldn't all go through Rectangle. For that you may need to open a can of functors...
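
One way the "can of functors" could look, as a sketch only: each entry is a callable that takes a StyleInfo and returns a reference to one int inside it, however deeply nested. This assumes C++11 (lambdas and std::function); the zIndex member and the setAll helper are invented here to show that the access paths need not all go through Rectangle.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

struct Rectangle {
    int width;
    int height;
};

struct StyleInfo {
    Rectangle size;
    int zIndex;   // hypothetical direct member, to show mixed access paths
};

// An accessor maps a StyleInfo to a reference to one of its ints.
typedef std::function<int&(StyleInfo&)> StyleInfoAccessor;

void AddStyleInfoMembers(std::vector<StyleInfoAccessor>& members) {
    members.push_back([](StyleInfo& s) -> int& { return s.size.width; });
    members.push_back([](StyleInfo& s) -> int& { return s.size.height; });
    members.push_back([](StyleInfo& s) -> int& { return s.zIndex; });
}

// Set every registered int to the same value.
void setAll(StyleInfo& s, const std::vector<StyleInfoAccessor>& members, int value) {
    assert(!members.empty());
    for (std::size_t i = 0; i < members.size(); ++i)
        members[i](s) = value;
}
```

The explicit -> int& return type matters: without it the lambdas would return a copy of the int and assignments through the accessor would not reach the object.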

size (as in &StyleInfo::size::width) is not the name of a type.
Try size->width or size.width instead, depending on how your 'AddStyleInfoMembers' knows about size at all.
