Going through members of a C++ class - c++

As far as I know, if I have a class such as the following:
class TileSurface{
public:
    Tile * tile;
    enum Type{
        Top,
        Left,
        Right
    };
    Type type;
    Point2D screenverts[4]; // it's a rectangle.. so..
    TileSurface(Tile * thetile, Type thetype);
};
There's no easy way to programmatically (using templates or whatever) go through each member and do things like print their types (for example, typeinfo's typeid(Tile).name()).
Being able to loop through them would be a useful and easy way to generate class size reports, etc. Is this impossible to do, or is there a way (even using external tools) for this?

Simply not possible in C++. You would need something like Reflection to implement this, which C++ doesn't have.
As far as your code is concerned after it is compiled, the "class" doesn't exist -- the names of the variables as well as their types have no meaning in assembly, and therefore they aren't encoded into the binary.
(Note: When I say "Not possible in C++" I mean "not possible to do built into the language" -- you could of course write a C++ parser in C++ which could implement this sort of thing...)

No. There is no easy way. If you put "easy way" aside, then with C++ you can do anything imaginable.
If you just want to dump your data at run time, the simplest way is to implement operator<<(ostream&, YourClass const&) for each YourClass you are interested in. A bit more complex is to implement the visitor pattern, but with it you can have different reports produced by different visitors, and the visitors can also do things other than generate reports.
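A minimal sketch of the visitor idea, assuming stand-in definitions for Tile and Point2D (the question does not show them) and hypothetical names accept, visit, SizeReport and size_report. The member list is still written by hand, once, and every visitor reuses it:

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Stand-ins for the question's types; their real definitions are not shown.
struct Tile {};
struct Point2D { int x, y; };

class TileSurface {
public:
    enum Type { Top, Left, Right };

    Tile* tile;
    Type type;
    Point2D screenverts[4];

    TileSurface(Tile* thetile, Type thetype)
        : tile(thetile), type(thetype), screenverts() {}

    // Hand-written "member list": each member is named exactly once here,
    // and every visitor goes through this one function.
    template <class Visitor>
    void accept(Visitor& v) const {
        v.visit("tile", tile);
        v.visit("type", type);
        v.visit("screenverts", screenverts);
    }
};

// One possible visitor: collects "name: N bytes" lines for a size report.
struct SizeReport {
    std::ostringstream out;

    template <class T>
    void visit(const char* name, const T& member) {
        out << name << ": " << sizeof(member) << " bytes\n";
    }
};

// Hypothetical helper that builds the report for one object.
std::string size_report(const TileSurface& s) {
    SizeReport r;
    s.accept(r);
    return r.out.str();
}
```

Calling size_report(surface) returns one line per member. A second visitor with the same visit signature could print values or serialize them without touching TileSurface again.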
If you want it as static analysis (the program is not running and you want to generate reports), then you can use the debugger database. Alternatively, you can analyze the AST generated by some compilers (g++ and Clang have options to dump it) and generate reports from that.
If you really need run-time reflection, then you have to build it into your classes, and that involves overhead. For example, you may use common base classes and also put all data members of the classes into an array. This is often done to communicate on more equal terms with applications written in languages that have reflection (the oldest example is Lisp).

I beg to differ from the conventional wisdom. C++ does have it; it's not part of the C++ standard, but every single C++ compiler I've seen emits metadata of this sort for use by the debugger.
Moreover, two formats for the debug database cover almost all modern compilers: pdb (the Microsoft format) and dwarf2 (just about everything else).

Our DMS Software Reengineering Toolkit is what you call an "external tool" for extracting/transforming arbitrary code. DMS is generalized compiler technology parameterized by explicit language definitions. It has language definitions for C, C++, Java, COBOL, PHP, ...
For the C, C++, Java and COBOL versions, it provides complete access to parse trees and symbol table information. That symbol table information includes the kind of data you are likely to want from "reflection". If your goal is to enumerate some set of fields or methods and do something with them, DMS can be used to transform the code (or generate derived code) in arbitrary ways according to what you find in the symbol tables.

If you derive the types of all member variables from a common type-info-provider base class, then you can get that. It is a bit more work than in Java, but possible.

External tools: you mentioned that you need reports like class size, etc.
Doxygen (http://www.doxygen.nl/manual/features.html) could help to generate class member lists (including inherited members).

Related

Are (Self) Reflections in C++ possible?

Without using macros or Boost libraries, is it possible to iterate through a class's own members in C++?
I know "Reflections" are not natively possible in C++ like they are in Java, C# and Go (heartbreaking), but I don't know if that applies to just classes looking at attributes of other classes or if that also applies to themselves.
I'm hopeful some class minding its own business might be able to see the attributes of itself somehow at runtime; is this possible?
Nonono. C++ is a statically typed, compiled language; it does not need to know the names of the members at runtime, since all access at runtime is done by address. This makes member names useless cruft that does not justify being in the executable. You can't access what's not there.
The only way to know member names at runtime is to include code that explicitly stores the name during the compilation process - i.e. macros.
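For illustration, a minimal sketch of the macro approach using an "X-macro". The Player struct and the macro names are hypothetical; the point is that the member list is written once and expanded twice, once into declarations and once into string literals that do end up in the executable:

```cpp
#include <string>
#include <vector>

// Hypothetical example class; the member list is written once as an X-macro.
#define PLAYER_FIELDS(X) \
    X(int, health)       \
    X(double, speed)     \
    X(bool, alive)

struct Player {
    // First expansion: ordinary member declarations.
#define DECLARE_FIELD(type, name) type name;
    PLAYER_FIELDS(DECLARE_FIELD)
#undef DECLARE_FIELD

    // Second expansion: the same list turned into "type name" strings,
    // which are stored in the binary and available at run time.
    static std::vector<std::string> field_names() {
        std::vector<std::string> names;
#define PUSH_NAME(type, name) names.push_back(#type " " #name);
        PLAYER_FIELDS(PUSH_NAME)
#undef PUSH_NAME
        return names;
    }
};
```

Player::field_names() then yields "int health", "double speed" and "bool alive". Nothing here is discovered by the compiler; the names exist at run time only because the macro explicitly stored them.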

Options for parsing/processing C++ files

So I have a need to be able to parse some relatively simple C++ files with annotations and generate additional source files from that.
As an example, I may have something like this:
//# service
struct MyService
{
    int getVal() const;
};
I will need to find the //# service annotation, and get a description of the structure that follows it.
I am looking at possibly leveraging LLVM/Clang since it seems to have library support for embedding compiler/parsing functionality in third-party applications. But I'm really pretty clueless as far as parsing source code goes, so I'm not sure what exactly I would need to look for, or where to start.
I understand that ASTs are at the core of language representations, and there is library support for generating an AST from source files in Clang. But comments would not really be part of an AST right? So what would be a good way of finding the representation of a structure that follows a specific comment annotation?
I'm not too worried about handling cases where the annotation would appear in an inappropriate place as it will only be used to parse C++ files that are specifically written for this application. But of course the more robust I can make it, the better.
One way I've been doing this is annotating identifiers of:
classes
base classes
class members
enumerations
enumerators
E.g.:
class /* #ann-class */ MyClass
    : /* #ann-base-class */ MyBaseClass
{
    int /* #ann-member */ member_;
};
Such annotation makes it easy to write a python or perl script that reads the header line by line and extracts the annotation and the associated identifier.
The annotation and the associated identifier make it possible to generate C++ reflection in the form of function templates that traverse objects passing base classes and members to a functor, e.g:
template<class Functor>
void reflect(MyClass& obj, Functor f) {
    f.on_object_start(obj);
    f.on_base_subobject(static_cast<MyBaseClass&>(obj));
    f.on_member(obj.member_);
    f.on_object_end(obj);
}
It is also handy to generate numeric ids (enumeration) for each base class and member and pass that to the functor, e.g:
f.on_base_subobject(static_cast<MyBaseClass&>(obj), BaseClassIndex<MyClass>::MyBaseClass);
f.on_member(obj.member_, MemberIndex<MyClass>::member_);
Such reflection code allows you to write functors that serialize and de-serialize any object type to/from a number of different formats. Functors use function overloading and/or type deduction to treat different types appropriately.
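To make the functor side concrete, here is a minimal sketch of one such functor. Several things are assumptions for the sake of a compilable example: the classes are structs with public members, MyBaseClass gets a hypothetical base_value_ member and its own generated reflect, the functor is passed by reference so it can accumulate output, and TextWriter and to_text are made-up names:

```cpp
#include <sstream>
#include <string>

// Simplified stand-ins for the annotated classes (public members so the
// generated functions can reach them without friend declarations).
struct MyBaseClass { int base_value_; };
struct MyClass : MyBaseClass { int member_; };

// What the generator would emit, one overload per annotated class.
template <class Functor>
void reflect(MyBaseClass& obj, Functor& f) {
    f.on_object_start(obj);
    f.on_member(obj.base_value_);
    f.on_object_end(obj);
}

template <class Functor>
void reflect(MyClass& obj, Functor& f) {
    f.on_object_start(obj);
    f.on_base_subobject(static_cast<MyBaseClass&>(obj));
    f.on_member(obj.member_);
    f.on_object_end(obj);
}

// A functor that writes a crude text form of whatever it is shown.
struct TextWriter {
    std::ostringstream out;

    template <class T> void on_object_start(T&) { out << "{ "; }
    template <class T> void on_object_end(T&) { out << "} "; }

    // A base subobject is itself reflectable, so recurse into it.
    template <class Base>
    void on_base_subobject(Base& base) { reflect(base, *this); }

    // Overloading decides how each member type is written.
    void on_member(int value) { out << "int:" << value << ' '; }
    void on_member(const std::string& value) { out << "str:" << value << ' '; }
};

// Hypothetical convenience wrapper.
std::string to_text(MyClass& obj) {
    TextWriter w;
    reflect(obj, w);
    return w.out.str();
}
```

For an object with base_value_ set to 1 and member_ set to 2, to_text returns "{ { int:1 } int:2 } " (with a trailing space). A binary or JSON writer would only need a different functor; the generated reflect functions stay the same.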
Parsing C++ code is an extremely complex task. Leveraging a C++ compiler might help, but it could be beneficial to restrict yourself to a less powerful, more domain-specific format: generate both the source and the additional C++ files from a simpler representation, something like protobuf's .proto files or SOAP's WSDL, or something even simpler for your specific case.
I did some very similar work recently. The research I did indicated that there weren't any out-of-the-box solutions available, so I ended up hand-rolling one.
The other answers are dead-on regarding parsing C++ code. I needed something that could get ~90% of C++ code parsed correctly; I ended up using srcML. This tool takes C++ or Java source code and converts it to an XML document, which makes it easier for you to parse. It keeps the comments intact. Furthermore, if you need to do a source code transformation, it comes with a reverse tool which will take the XML document and produce source code.
It works correctly in 90% of the cases, but it trips on complicated template metaprogramming and the darkest corners of C++ parsing. Fortunately, my input source code is fairly consistent in design (not a lot of C++ trickery), so it works for us.
Other items to look at include gcc-xml and reflex (which actually uses gcc-xml). I'm not sure whether GCC-XML preserves comments, but it does preserve GCC attributes and pragmas.
One last item to look at is this blog on writing GCC plugins, written by the author of the CodeSynthesis ODB tool.
Good luck!

STL Containers and Binary Interface Compatibility

STL Binary Interfaces
I'm curious to know if anyone is working on compatible interface layers for STL objects across multiple compilers and platforms for C++.
The goal would be to support STL types as first-class or intrinsic data types.
Is there some inherent design limitation imposed by templating in general that prevents this? This seems like a major limitation of using the STL for binary distribution.
Theory - Perhaps the answer is pragmatic
Microsoft has put effort into .NET and doesn't really care about C++ STL support being "first class".
Open-source doesn't want to promote binary-only distribution and focuses on getting things right with a single compiler instead of a mismatch of 10 different versions.
This seems to be supported by my experience with Qt and other libraries - they generally provide a build for the environment you're going to be using. Qt 4.6 and VS2008 for example.
References:
http://code.google.com/p/stabi/
Binary compatibility of STL containers
I think the problem precedes your theory: C++ doesn't specify the ABI (application binary interface).
In fact even C doesn't, but because a C library is just a collection of functions (and maybe global variables), the ABI is essentially just the names of the functions themselves. Depending on the platform, names can be mangled somehow, but since every compiler must be able to make system calls, everything ends up using the convention of the operating system vendor (on Windows, __cdecl just results in prepending a _ to the function name).
But C++ has overloading, hence more complex mangling schemes are required.
As of today, no general agreement exists between compiler manufacturers about how such mangling must be done.
In general you cannot compile a C++ static library with one compiler and link it against a C++ object file coming from another compiler. The same goes for DLLs.
And since compilers differ even in how they compile overloaded member functions, no one is actually addressing the problem of templates.
It CAN technically be addressed by introducing an indirection for every parametric type and introducing dispatch tables, but this makes a templated function no different (in terms of call dispatching) from virtual functions of virtual bases, so template performance becomes similar to classic OOP dispatching (although it can limit code bloat; the trade-off is not always obvious).
Right now, there seems to be no interest among compiler manufacturers in agreeing on a common standard, since it would sacrifice the performance differences each manufacturer can achieve with its own optimizations.
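Since a C-style interface is the one thing the answer above says compilers on a platform do agree on, a common workaround is to keep the STL container on one side of the boundary and export only C functions. A minimal sketch, assuming a hypothetical IntList handle and intlist_* function names (none of this comes from an existing library):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Opaque handle: callers in other binaries never see its layout.
struct IntList;

// Exported surface: only C-compatible types, no mangled C++ names.
extern "C" {
    IntList* intlist_create();
    void intlist_push(IntList* list, std::uint32_t value);
    std::size_t intlist_size(const IntList* list);
    const std::uint32_t* intlist_data(const IntList* list);
    void intlist_destroy(IntList* list);
}

// Library-side implementation: the std::vector never crosses the boundary.
struct IntList { std::vector<std::uint32_t> items; };

extern "C" IntList* intlist_create() { return new IntList(); }

extern "C" void intlist_push(IntList* list, std::uint32_t value) {
    list->items.push_back(value);
}

extern "C" std::size_t intlist_size(const IntList* list) {
    return list->items.size();
}

// Valid only until the next push or destroy on the same list.
extern "C" const std::uint32_t* intlist_data(const IntList* list) {
    return list->items.data();
}

// Memory allocated inside the library is also released inside it.
extern "C" void intlist_destroy(IntList* list) { delete list; }
```

The caller sees a pointer and a length, which any compiler can consume, at the cost of losing the container's own interface on the other side.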
C++ templates are compile-time generated code.
This means that if you want to use a templated class, you have to include its header (declaration) so the compiler can generate the right code for the templated class you need.
So, templates can't be pre-compiled to binary files.
What other libraries give you is pre-compiled base-utility classes that aren't templated.
C# generics for example are compiled into IL code in the form of dlls or executables.
But IL code is just like another programming language so this allows the compiler to read generics information from the included library.
.Net IL code is compiled into actual binary code at runtime, so the compiler at runtime has all the definitions it needs in IL to generate the right code for the generics.
I'm curious to know if anyone is working on compatible interface
layers for STL objects across multiple compilers and platforms for
C++.
Yes, I am. I am working on a layer of standardized interfaces which you can (among other things) use to pass binary safe "managed" references to instances of STL, Boost or other C++ types across component boundaries. The library (called 'Vex') will provide implementations of these interfaces plus proper factories to wrap and unwrap popular std:: or boost:: types. Additionally the library provides LINQ-like query operators to filter and manipulate contents of what I call Range and RangeSource. The library is not yet ready to "go public" but I intend to publish an early "preview" version as soon as possible...
Example:
com1 passes a reference to std::vector<uint32_t> to com2:
com1:
class Com1 : public ICom1 {
    std::vector<uint32_t> mVector;
    virtual void SendDataTo(ICom2* pI)
    {
        pI->ReceiveData(Vex::Query::From(mVector) | Vex::Query::ToInterface());
    }
};
com2:
class Com2 : public ICom2 {
    virtual void ReceiveData(Vex::Ranges::IRandomAccessRange<uint32_t>* pItf)
    {
        std::deque<uint32_t> tTmp;
        // filter even numbers, reverse order and process data with STL or Boost algorithms
        Vex::Query::From(pItf)
            | Vex::Query::Where([](uint32_t value) { return value % 2 == 0; })
            | Vex::Query::Reverse
            | Vex::ToStlSequence(std::back_inserter(tTmp));
        // use tTmp...
    }
};
You will recognize the relationship to various familiar concepts: LINQ, Boost.Range, any_iterator and D's Ranges... One of the basic intents of 'Vex' is not to reinvent wheel - it only adds that interface layer plus some required infrastructures and syntactic sugar for queries.
Cheers,
Paul

Ways to use variable as object name in c/c++

Just out of curiosity: is there a way to use a variable as an object name in C++?
Something along the lines of:
char a[] = "testme\0";
*a *vr = new *a();
If you were to write a C/C++ compiler, how would you go about implementing such a thing?
I know they implemented this feature in the Zend engine, but I'm too lazy to look it up.
Maybe some of you guys can enlighten me :)
In case what you are looking for is something like this
<?php
$className = "ClassName";
$instance = new $className();
?>
That's simply not possible in C++. It fails for many reasons, one of them being that C++ at run time no longer knows much about the names of classes (only in debug builds). If somebody wanted to write a compiler that allowed something like this, it would have to keep a lot of information that a C++ compiler only needs during compilation and linking. Changing this would create a new language.
If you want to dynamically create classes depending on information only available at runtime, in C++ you would most likely use some of the Creational Design Patterns.
Edit:
PHP is one language; C++ is a very different one. 16M may not be much nowadays, but for a C++ programmer, where some programs are in the kilobyte range, it's a whole world. Nobody wants to ship a complete compiler with their C++ app just to get all the dynamic features (which, by the way, PHP also implements only in a limited way as far as I know; if you want really dynamic runtime code creation, have a look at Ruby or Python). C++ has (like all languages) a certain philosophy, and creating objects by name from a string doesn't fit it very well. This feature alone is quite useless anyway and would by no means justify the overhead necessary to implement it. It could most likely be done without adding runtime compilation, but even the extra kilobytes needed to store the names alone make no sense in the C++ world. C++ is also strictly typed, and this functionality would have to make sure that type checking doesn't break.
In C and C++, identifier names do not have the same meaning they do in PHP.
PHP is a dynamic language, and (at least conceptually) runs in an interpreted context. Identifier names are present at run time, they can be inspected through PHP's reflection features, you can use strings to refer to functions, variables, globals, and object properties by name, etc. PHP identifiers are actual semantic entities.
In C++, identifiers are lost at run time (again, conceptually speaking). You use them in your source code to identify variables, functions, classes, etc., but the compiler translates them into memory addresses or literal values, or even optimizes them away completely. Identifier names are not generally present in the compiled binary (unless you instructed the compiler to include debug symbols), and there is no way to inspect them at run time. Even with RTTI, the best you can get is a std::type_info object to identify a type; you can compare two of them for equality, and name() gives you an implementation-defined (often mangled) string, but you cannot go from a name back to a type.
Consequently, if you want to translate strings into identifier names at run-time in C++, you have to perform the mapping manually. std::map can be a great help for this - you hand it a string, and it gives you a value. This doesn't work directly for class names; for these, you need to implement some sort of factory method. A nice solution is to have one wrapper function for each type, and then a std::map that maps class names to the corresponding wrappers. Something like:
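// Foo, Bar and Baz all derive from Foobar
typedef Foobar* (*FoobarFactoryMethod)();
map<string, FoobarFactoryMethod> factory_map;
Foobar* FooFactory() { return new Foo(); }
Foobar* BarFactory() { return new Bar(); }
Foobar* BazFactory() { return new Baz(); }
void fill_map() {
    factory_map["Foo"] = FooFactory;
    factory_map["Bar"] = BarFactory;
    factory_map["Baz"] = BazFactory;
}
// and then later (classname must be a key in the map: operator[] inserts a
// null pointer for an unknown name, and calling it is undefined behavior):
Foobar* f = factory_map[classname]();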
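A self-contained variant of the same idea, as a sketch only. The class bodies, the make template and the create function are assumptions added to make it compile; it looks names up with find so that an unknown class name yields an empty pointer instead of a call through a null function pointer:

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical class hierarchy standing in for Foobar, Foo and Bar.
struct Foobar {
    virtual ~Foobar() {}
    virtual std::string name() const = 0;
};
struct Foo : Foobar { std::string name() const override { return "Foo"; } };
struct Bar : Foobar { std::string name() const override { return "Bar"; } };

typedef std::unique_ptr<Foobar> (*FoobarFactoryMethod)();

// One wrapper per type, generated from a single template.
template <class T>
std::unique_ptr<Foobar> make() { return std::unique_ptr<Foobar>(new T()); }

// The string-to-factory mapping still has to be filled in by hand.
const std::map<std::string, FoobarFactoryMethod>& factory_map() {
    static std::map<std::string, FoobarFactoryMethod> m;
    if (m.empty()) {
        m["Foo"] = &make<Foo>;
        m["Bar"] = &make<Bar>;
    }
    return m;
}

// Returns an empty pointer when the name was never registered.
std::unique_ptr<Foobar> create(const std::string& classname) {
    std::map<std::string, FoobarFactoryMethod>::const_iterator it =
        factory_map().find(classname);
    if (it == factory_map().end()) return std::unique_ptr<Foobar>();
    return it->second();
}
```

create("Foo") gives back a Foo behind a Foobar pointer, while create("Nope") gives back an empty pointer the caller can test.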
Why do you even want to have this feature? You are most likely misusing OOP. Whenever my needs ran into hard language barriers like this I ended up doing one of the following:
Rethink your solution to the problem so it fits OOP better
Create a DSL for your problem (domain specific language)
Create a code generator for this part of your problem
Pick a language that fits your problem better
A combination of the above
I would think that what you want to do would be best accomplished using interfaces and a factory pattern.

Get list of available data member from a POD struct in C++

The question may sound a bit unusual. Let's take a POD struct:
struct MyStruct
{
    int myInt;
    double myDouble;
    AnotherPOD* myPointer;
};
The compiler knows the list of available data members. Do you know any way to get the list of data member names (and types), either at compile time (better) or at run time?
I have a huge amount of POD structs and I would like to automate the creation of operator<<.
I know I could create a parser for the header files, create some files and compile those. However, I am sure the compiler has already had this information and I would like to exploit it.
Any ideas?
Thanks
BOOST_FUSION_ADAPT_STRUCT introduces compile-time reflection (which is awesome).
It is up to you to map this to run-time reflection of course, and it won't be too easy, but it is possible in this direction, while it would not be in the reverse :)
I don't know of any way to do what you want directly, but you might want to take a look at clang, which is a compiler front-end implementation that you can make use of to do other things:
http://clang.llvm.org
I guess you'd then be able to traverse the abstract syntax tree it creates and get at the information you're after.
Well, standard C++ compilers can't do that; they lack reflection capabilities.
Sounds like a task for a code generator. So either use a toolkit to extract this information from the headers, or generate both the headers and the serialization functions from another source. Just make sure you do not repeat yourself.
I am afraid C++ doesn't support reflection. You can use Boost.TypeTraits to achieve a restricted form of reflection at compile time.
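To show what "restricted" means here, a small sketch using the standard <type_traits> header (C++11), which provides traits of the same kind as Boost.TypeTraits. The AnotherPOD body and the category helper are assumptions for illustration. Traits can answer questions about a type, or about a member you name yourself, but they cannot list the members:

```cpp
#include <string>
#include <type_traits>

struct AnotherPOD { int value; };  // assumed body; the question does not show it

struct MyStruct {
    int myInt;
    double myDouble;
    AnotherPOD* myPointer;
};

// Questions about the type as a whole, answered at compile time.
static_assert(std::is_standard_layout<MyStruct>::value, "expected standard layout");
static_assert(std::is_trivially_copyable<MyStruct>::value, "expected trivially copyable");

// Questions about a member, but only once you spell out its name yourself.
static_assert(std::is_same<decltype(MyStruct::myDouble), double>::value, "myDouble is a double");
static_assert(std::is_pointer<decltype(MyStruct::myPointer)>::value, "myPointer is a pointer");

// Hypothetical helper: classify a type using traits only.
template <class T>
std::string category() {
    if (std::is_pointer<T>::value) return "pointer";
    if (std::is_floating_point<T>::value) return "floating";
    if (std::is_integral<T>::value) return "integral";
    return "other";
}
```

category<decltype(MyStruct::myInt)>() returns "integral", but there is no trait that would produce the list myInt, myDouble, myPointer; that part still needs a macro, an adapter such as BOOST_FUSION_ADAPT_STRUCT, or an external generator.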