I have an unmanaged C++ application (unmanaged meaning: not using any of the fancy .NET stuff). I want to extend it with some meta information, and it looks like I could use the concept of attributes.
What I am actually trying to achieve is the following.
Starting from a simple class like this:
class Book
{
public:
...
private:
string m_name;
string m_author;
int m_year;
};
I want to build functionality that can access the 'meta information' of the class and use it to dynamically build logic on it, e.g.
a dialog containing 3 edit fields (name, author, year)
a data grid with 3 columns
serialization logic
logic that maps this class to a database table with 3 columns
...
In my wildest dreams I imagine modifying this class like this:
[id="Book"]
class Book
{
public:
...
private:
[id="Name", defaultValue="", maximumLength=100]
string m_name;
[id="Author", defaultValue="", maximumLength=100]
string m_author;
[id="Year", defaultValue=2000, minimum=1900]
int m_year;
};
And then being able to get this 'meta' information to build dialogs, fill data grids, serialize and deserialize instances, ...
But, is the concept of attributes limited to .Net/managed code?
And if I could use attributes in unmanaged code, would it be possible to do something like this? And what is a good place to start? (examples, ...)
Also, can the same (or similar) concepts be found in other compilers, on other platforms?
I am using Visual Studio 2010 and, as said before, unmanaged/native C++.
Visual C++ for a while supported a similar attribute notation when defining COM objects. I think support was eventually dropped because programmers use C++ for COM implementation when they want complete control, and the compiler doing things magically outside the programmer's control runs counter to that.
OTOH IDL does still allow you to define metadata, it compiles to C++ source code along with a type library which contains the metadata, and it can be retrieved at runtime.
No. C++ does not have introspection, and it has no attributes in the .NET sense (user-defined metadata that can be queried at run time).
Look into Boost Serialization for the serialization stuff, for the others you need to implement it manually, as far as I know.
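One common way to do the "manual" part only once per class is an X-macro: the fields are listed a single time and the list is expanded into both the data members and a generic visitor. The following is a minimal sketch, not a library API; BOOK_FIELDS, forEachField, FieldPrinter and describe are hypothetical names, and Book is shown as a plain struct with public members for brevity.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical field list: (type, member name, display id), written once.
#define BOOK_FIELDS(X)                 \
    X(std::string, m_name,   "Name")   \
    X(std::string, m_author, "Author") \
    X(int,         m_year,   "Year")

struct Book {
    // Expand the list into ordinary data members.
#define DECLARE_MEMBER(type, name, id) type name;
    BOOK_FIELDS(DECLARE_MEMBER)
#undef DECLARE_MEMBER
};

// Expand the same list into one visitor call per field.
template <typename Visitor>
void forEachField(Book& book, Visitor& visit) {
#define VISIT_MEMBER(type, name, id) visit(id, book.name);
    BOOK_FIELDS(VISIT_MEMBER)
#undef VISIT_MEMBER
}

// Example visitor: one overload per supported field type.
struct FieldPrinter {
    std::ostream& out;
    void operator()(const char* id, const std::string& value) { out << id << "=\"" << value << "\"\n"; }
    void operator()(const char* id, int value) { out << id << '=' << value << '\n'; }
};

// Builds a textual dump of a Book without naming any field explicitly.
inline std::string describe(Book& book) {
    std::ostringstream out;
    FieldPrinter printer = {out};
    forEachField(book, printer);
    return out.str();
}
```

A dialog builder, grid filler or database mapper would be another visitor with the same overloads. Extra metadata such as defaultValue or maximumLength could be added as further macro arguments.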
The problem
The Unreal Engine 4 Editor allows you to add objects of your own types to the scene.
Doing so requires minimal work from the user - to make a class visible in the editor you only need to add some macros, like UCLASS()
UCLASS()
class MyInputComponent: public UInputComponent //you can instantiate it in the editor!
{
UPROPERTY(EditAnywhere)
bool IsSomethingEnabled;
};
This is enough to allow the editor to serialize the created-in-editor object's data (remember: the class is user-defined but the user doesn't have to hardcode loading specific fields. Also note that the UPROPERTY variable can be of user-defined type as well). It is then deserialized while loading the actual game. So how is it handled so painlessly?
My attempt - hardcoded loading for every new class
class Component //abstract class
{
public:
virtual void LoadFromStream(std::stringstream& str) = 0;
//virtual void SaveIntoStream(std::stringstream& str) = 0;
};
class UserCreatedComponent: public Component
{
std::string Name;
int SomeInteger;
vec3 SomeVector; //example of user-defined type
public:
virtual void LoadFromStream(std::stringstream& str) override //you have to write a function like this every time you create a new class
{
str >> Name >> SomeInteger >> SomeVector.x >> SomeVector.y >> SomeVector.z;
}
};
std::vector<Component*> ComponentsFromStream(std::stringstream& str)
{
std::vector<Component*> components;
std::string type;
while (str >> type)
{
if (type == "UserCreatedComponent") //do this for every user-defined type...
components.push_back(new UserCreatedComponent);
else
continue;
components.back()->LoadFromStream(str);
}
return components;
}
Example of a UserCreatedComponent object's stream representation:
UserCreatedComponent MyComponent 5 0.707 0.707 0.707
The engine user has to do these things every time he creates a new class:
1. Modify ComponentsFromStream by adding another if
2. Add two methods, one which loads from stream and another which saves to stream.
We want to simplify it so the user only has to use a macro like UPROPERTY.
Our goal is to free the user from all this work and create a more extensible solution, like UE4's (described above).
Attempt at simplifying 1: Using type-int mapping
This section is based on the following: https://stackoverflow.com/a/17409442/12703830
The idea is that for every new class we map an integer, so when we create an object we can just pass the integer given in the stream to the factory.
Example of a UserCreatedComponent object's stream representation:
1 MyComponent 5 0.707 0.707 0.707
This solves the problem of working out the type of created object but also seems to create two new problems:
How should we map classes to integers? What would happen if we include two libraries containing classes that map themselves to the same number?
What will initializing e.g. components that need vectors for construction look like? We don't always use strings and ints for object construction (and streams give us pretty much only that).
So how is it handled so painlessly?
The C++ language does not provide features that would allow de/serialization of class instances to be as simple as it is in Unreal Engine. There are various ways to work around the language's limitations; Unreal uses a code generator.
The general idea is as follows:
When you start project compilation, a code generator is executed.
The code generator parses your header files and searches for macros that have a special meaning, like UCLASS, USTRUCT, UENUM, UPROPERTY, etc.
Based on collected data, it generates not only code for de/serialization, but also for other purposes, like reflection (ability to iterate certain members), information about inheritance, etc.
After that, your code is finally compiled along with the generated code.
Note: this is also why you have to include "MyClass.generated.h" in all header files which declare UCLASS, USTRUCT and similar.
In other words, someone must write the de/serialization code in some form. The Unreal solution is that the author of such code is an application.
If you want to implement such a system yourself, be aware that it's a lot of work. I'm no expert in this field, so I'll just provide general information:
The primary idea of code generators is to automate repetitive work, nothing more. In other words, there's no special magic. That means that "how objects are de/serialized" (how they're transformed from memory to file) and "how the code which de/serializes is created" (whether it's written by a person or generated by an application) are two separate topics.
First, it should be established how objects are de/serialized. For example, std::stringstream can be used, or objects can be de/serialized from/to well-known formats like XML, JSON, BSON, YAML, etc., or a custom solution can be defined.
Establish what the source of data for the generated de/serialization code is. In the case of Unreal Engine, it's the user code itself. But that's not the only way: for example, Protocol Buffers uses a simple language that only defines the data structure, and the generator creates code which you can include and use.
If the source of data should be the C++ code itself, do not write your own C++ parser! (The only exceptions to this rule are educational purposes, or wanting to spend the rest of your life working on the parser.) Luckily, there are projects which you can use, for example the Clang AST.
How should we map classes to integers? What would happen if we include two libraries containing classes that map themselves to the same number?
There's one fundamental problem with mapping classes to integers: it's not possible to uniquely map every possible class name to an integer.
Proof: create classes named Foo_[integer] and map each one to its [integer], i.e. Foo_0 -> 0, Foo_1 -> 1, Foo_2 -> 2, etc. After you have used the biggest integer value, how do you map Bar_0?
You can start assigning the numbers sequentially as classes are added to a project, but as you correctly pointed out, what if you include a new library? You could start counting from some big number, like 1,000,000, but how do you determine what the first number for each library should be? There is no clear solution.
Some solutions to this problem are:
Define a clear subset of classes which can be de/serialized and assign sequential integers to these classes. The subset can be, for example, "only classes in my project, no library support".
Identify classes with two integers: one for the class, one for the library. This means you have to have some central register which assigns library integers uniquely (e.g. in the order they're registered).
Use a string which uniquely identifies the class, including the library name. This is what Unreal uses.
Generate a hash from the class and library name. There's a risk of hash collision; the better the hash you use, the lower the risk. For example git (the version control application) uses SHA-1 (which is considered unsafe today) to identify its objects (files, directories, commits) and the program is used worldwide without bigger issues.
Generate a UUID, a 128-bit random number (with special rules). There's also a risk of collision, but it's generally considered highly improbable. Used by Java and the Unity game engine.
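As an illustration of the hash variant, here is a minimal sketch that derives a 64-bit identifier from the library and class name using FNV-1a. FNV-1a is swapped in purely because it fits in a few lines; it is not what git or Unreal use, and classId and the "::" separator are assumptions made for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// FNV-1a, 64-bit: a simple non-cryptographic hash.
inline std::uint64_t fnv1a64(const std::string& text) {
    std::uint64_t hash = 0xcbf29ce484222325ULL;  // FNV offset basis
    for (unsigned char c : text) {
        hash ^= c;
        hash *= 0x100000001b3ULL;  // FNV prime
    }
    return hash;
}

// Hypothetical helper: qualify the class name with its library before hashing,
// so equally named classes from different libraries get different inputs.
inline std::uint64_t classId(const std::string& library, const std::string& className) {
    return fnv1a64(library + "::" + className);
}
```

The stream would then start with the number returned by classId instead of a hand-assigned integer. Collisions remain possible, so registration code should still check for them.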
What would happen if we include two libraries containing classes that map themselves to the same number?
That's called a collision. How it's handled depends on the design of the de/serialization code; there are mainly two approaches to this problem:
Detect it. For example, if your class identifier contains a library identifier, don't allow loading/registering a library with an ID which is already registered. In the case of an ID which doesn't include a library ID (e.g. the hash/UUID variant), don't allow registering such classes. Throw an exception or exit the application.
Assume there's no collision. If a collision actually happens, the behaviour is undefined (so-called UB). The application will probably crash or act weirdly. It might corrupt stored data.
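The "detect it" approach can be sketched as a registry that refuses a second registration under the same identifier. This is an illustrative sketch only; ComponentRegistry, registerClass and create are hypothetical names, and string identifiers are assumed.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

struct Component {
    virtual ~Component() {}
};

struct UserCreatedComponent : Component {};

// Hypothetical registry mapping a class identifier to a factory function.
class ComponentRegistry {
public:
    typedef std::function<std::unique_ptr<Component>()> Factory;

    // Throws if the identifier is already taken instead of silently overwriting it.
    void registerClass(const std::string& id, Factory factory) {
        if (!factories_.insert(std::make_pair(id, factory)).second)
            throw std::runtime_error("duplicate class id: " + id);
    }

    // Returns a null pointer for unknown identifiers.
    std::unique_ptr<Component> create(const std::string& id) const {
        std::map<std::string, Factory>::const_iterator it = factories_.find(id);
        return it == factories_.end() ? std::unique_ptr<Component>() : it->second();
    }

private:
    std::map<std::string, Factory> factories_;
};
```

With such a registry, ComponentsFromStream no longer needs one if per type: it reads the identifier and asks the registry to create the object.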
What will initializing e.g. components that need vectors for construction look like? We don't always use strings and ints for object construction (and streams give us pretty much only that).
This depends on what is required from the de/serializing code.
The simplest solution is actually to use a string of values separated by spaces.
For example, let's define the following structure:
struct Person
{
std::string Name;
float Age;
};
A vector of Person instances could look like: 3 Adam 22.2 Bob 34.5 Cecil 19.0 (i.e. first serialize the number of items (the vector size), then the individual items).
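A minimal sketch of that count-prefixed format, assuming names never contain whitespace; savePersons and loadPersons are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

struct Person {
    std::string Name;
    float Age;
};

// Writes the element count first, then each item's fields separated by spaces.
inline void savePersons(std::ostream& out, const std::vector<Person>& persons) {
    out << persons.size();
    for (const Person& p : persons)
        out << ' ' << p.Name << ' ' << p.Age;
}

// Reads the format produced by savePersons; returns an empty vector on malformed input.
inline std::vector<Person> loadPersons(std::istream& in) {
    std::vector<Person> persons;
    std::size_t count = 0;
    if (!(in >> count))
        return persons;
    for (std::size_t i = 0; i < count; ++i) {
        Person p;
        if (!(in >> p.Name >> p.Age))
            return std::vector<Person>();
        persons.push_back(p);
    }
    return persons;
}
```

Note that the stream's default formatting drops trailing zeros, so an age of 19.0 is written as 19.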
However, what if you add, remove or rename a member? The serialized data would become unreadable. If you want a more robust solution, it might be better to use more structured data, for example YAML:
persons:
  - name: Adam
    age: 22.2
  - name: Bob
    age: 34.5
  - name: Cecil
    age: 19.0
Final notes
The problem of de/serializing objects (in C++) is actually big, and various systems use various solutions. That's why this answer is so generic and doesn't provide exact code: there is no single silver bullet. Every solution has its advantages and disadvantages. Even a detailed description of just Unreal Engine's serialization system would fill a book.
So this answer assumes that the reader is able to search for the various topics mentioned, like the YAML file format, Protocol Buffers, UUID, etc.
Every mentioned solution to a sub-problem has lots of its own problems which weren't explored, for example de/serialization of strings with spaces or newlines from/to a simple string stream. If you need to solve such problems, it's recommended to first search for more specialized questions, or write one if nothing can be found.
Also, C++ is constantly evolving. For example, better support for reflection is being worked on, which might, one day, provide enough features to implement a high-quality de/serializer. However, if it is done at compile time, it would depend heavily on templates, which slow down compilation significantly and decrease code readability. That's why code generators might still be considered a better choice.
I wonder how could I serialize an object of a given class (e.g. Person) with its attributes (e.g. name, age) to a JSON string using POCO C++ libraries.
Maybe I should create my models using Poco::Dynamic and Poco::Dynamic::Var in order to use Poco::JSON::Stringifier? I can't imagine how to do this...
Thanks in advance!
Unlike Java or C#, C++ doesn't have an introspection/reflection feature outside of Run-time type information (RTTI), which has a different focus and is limited to polymorphic objects. That means outside of a non-standard pre-compiler, you'll have to tell the serialisation framework one way or another how your object is structured and how you would eventually like to map it to a hierarchy of int, std::string and other basic data types. I usually differentiate between three different approaches to do so: pre-compiler, inline specification, property conversion.
Pre-compiler: A good example of the pre-compiler approach is Google Protocol Buffers: https://developers.google.com/protocol-buffers/docs/cpptutorial. You define your entities in a separate .proto file, which is transformed by a dedicated compiler (protoc) into .pb.cc and .pb.h entity classes. These classes can be used like regular POCO entities and can be serialised using Protocol Buffers.
Inline specification: Boost serialization (https://www.boost.org/doc/libs/1_67_0/libs/serialization/doc/index.html), s11n (www.s11n.net) and restc-cpp (https://github.com/jgaa/restc-cpp) are examples of explicitly specifying the structure of your POCOs for the framework inside your own code. The API to do so may be more or less sophisticated, but the principle behind it is always the same: You provide the framework serialise/deserialise implementations for your classes or you register metadata information which allows the framework to generate them. The example below is from restc-cpp:
struct Post {
int userId = 0;
int id = 0;
string title;
string body;
};
BOOST_FUSION_ADAPT_STRUCT(
Post,
(int, userId)
(int, id)
(string, title)
(string, body)
)
Property conversion: The last kind of serialisation that I don't want to miss mentioning is the explicit conversion to a framework-provided intermediate data type. Boost property tree (https://www.boost.org/doc/libs/1_67_0/doc/html/property_tree.html) and JsonCpp (http://open-source-parsers.github.io/jsoncpp-docs/doxygen/index.html) are good examples of this approach. You are responsible for implementing a conversion from your own types to ptree, which Boost can serialise to and from any format you like (XML, JSON).
Having had my share of experience with all three approaches in C++, I would recommend option 3 as your default. It seems to map nicely to POCO C++'s Parser and Var model for JSON. One option is to have all your entity POCO classes implement a to_var or from_var function, or you can keep these serialisation functions in a different namespace for each POCO class, so that you only have to include them when necessary.
If you are working on projects with a significant number of objects to serialise (e.g. messages in communication libraries), the pre-compiler option may be worth the initial setup effort and additional build complexity, but that depends, as always, on the specific project you're dealing with.
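The shape of the property-conversion approach (option 3) can be sketched without any library. In the sketch below, PropertyMap is a hypothetical stand-in for a framework-provided intermediate type such as POCO's Var/Object model or Boost's ptree, and toProperties/fromProperties play the role of the to_var/from_var functions mentioned above. It is not POCO's API.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical intermediate representation: field name -> value as text.
typedef std::map<std::string, std::string> PropertyMap;

struct Person {
    std::string name;
    int age;
};

// Entity -> intermediate representation.
inline PropertyMap toProperties(const Person& person) {
    PropertyMap props;
    props["name"] = person.name;
    std::ostringstream age;
    age << person.age;
    props["age"] = age.str();
    return props;
}

// Intermediate representation -> entity. Missing keys fall back to defaults.
inline Person fromProperties(const PropertyMap& props) {
    Person person;
    person.age = 0;
    PropertyMap::const_iterator it = props.find("name");
    if (it != props.end())
        person.name = it->second;
    it = props.find("age");
    if (it != props.end()) {
        std::istringstream age(it->second);
        age >> person.age;
    }
    return person;
}
```

With a real framework, the intermediate object would then be handed to its writer (for POCO, the JSON stringifier), so the entity classes never deal with the output format themselves.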
I am searching for a simple, light-weight solution for interface-based runtime object composition in C++. I want to be able to specify interfaces (methods declarations), and objects (creatable through factory pattern) implementing these. At runtime I want mechanisms to instantiate these objects and interconnect these based on interface-connectors. The method calls at runtime should remain fairly cheap, i.e. only several more instructions per call, comparable to functor patterns.
The whole thing needs to be platform independent (at least MS Windows and Linux). And the solution needs to be licensed liberally, like open source LGPL or (even better) BSD or something, especially allowing use in commercial products.
What I do not want are heavy things like networking, inter-process-communication, extra compiler steps (one-time code generation is ok though), or dependencies to some heavy libraries (like Qt).
The concrete scenario is: I have such a mechanism in a larger software, but the mechanism is not very well implemented. Interfaces are realized by base classes exported by Dlls. These Dlls also export factory functions to instantiate the implementing objects, based on hand-written class ids.
Before I now start to redesign and implement something better myself, I want to know if there is something out there which would be even better.
Edit: The solution also needs to support multi-threading environments. Additionally, as everything will happen inside the same process, I do not need data serialization mechanisms of any kind.
Edit: I know how such mechanisms work, and I know that several teaching books contain corresponding examples. I do not want to write it myself. The aim of my question is: Is there some sort of "industry standard" lib for this? It is a small problem (within a single process) and I am really only searching for a small solution.
Edit: I got the suggestion to add a pseudo-code example of what I really want to do. So here it is:
Somewhere I want to define interfaces. I do not care if it's C-Headers or some language and code generation.
class interface1 {
public:
virtual void do_stuff(void) = 0;
};
class interface2 {
public:
virtual void do_more_stuff(void) = 0;
};
Then I want to provide (multiple) implementations. These may even be placed in Dll-based plugins. In particular, these two classes may be implemented in two different Dlls not knowing each other at compile time.
class A : public interface1 {
public:
virtual void do_stuff(void) {
// I even need to call further interfaces here
// This call should, however, not require anything heavy, like data serialization or something.
this->con->do_more_stuff();
}
// Interface connectors of some kind. Here I use something like a template
some_connector<interface2> con;
};
class B : public interface2 {
public:
virtual void do_more_stuff() {
// finally doing some stuff
}
};
Finally, in my application's main code I want to be able to compose my application logic at runtime (e.g. based on user input):
void main(void) {
// first I create my objects through a factory
some_object a = some_factory::create(some_guid<A>);
some_object b = some_factory::create(some_guid<B>);
// Then I want to connect the interface-connector 'con' of object 'a' to the instance of object 'b'
some_thing::connect(a, some_guid<A::con>, b);
// finally I want to call an interface-method.
interface1 *ia = a.some_cast<interface1>();
ia->do_stuff();
}
I am perfectly able to write such a solution myself (including all pitfalls). What I am searching for is a solution (e.g. a library) which is used and maintained by a wide user base.
While not widely used, I wrote a library several years ago that does this.
You can see it on GitHub zen-core library, and it's also available on Google Code
The GitHub version only contains the core libraries, which is really all that you need. The Google Code version contains a LOT of extra libraries, primarily for game development, but it does provide a lot of good examples on how to use it.
The implementation was inspired by Eclipse's plugin system, using a plugin.xml file that indicates a list of available plugins, and a config.xml file that indicates which plugins you would like to load. I'd also like to change it so that it doesn't depend on libxml2 and allow you to be able to specify plugins using other methods.
The documentation has been destroyed thanks to some hackers, but if you think this would be useful then I can write enough documentation to get you started.
A co-worker gave me two further tips:
The loki library (originating from the modern c++ book):
http://loki-lib.sourceforge.net/
A boost-like library:
http://kifri.fri.uniza.sk/~chochlik/mirror-lib/html/
I still have not looked at all the ideas I got.
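For comparison with whatever library is chosen, the pseudo-code from the question can be made concrete in plain C++ roughly as follows. This is a sketch under several assumptions: Object, Connector and create are hypothetical names, the factory is keyed by plain strings, everything lives in one module (dynamic_cast across DLL boundaries needs extra care), and the interface methods return int only so that the result can be checked.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Common base so the factory can hand out any object.
struct Object {
    virtual ~Object() {}
};

struct Interface1 {
    virtual ~Interface1() {}
    virtual int do_stuff() = 0;
};

struct Interface2 {
    virtual ~Interface2() {}
    virtual int do_more_stuff() = 0;
};

// A connector is a typed pointer; a call through it costs one virtual dispatch.
template <typename Interface>
struct Connector {
    Interface* target;
    Connector() : target(nullptr) {}

    // Returns false (and disconnects) if 'object' does not implement Interface.
    bool connect(Object& object) {
        target = dynamic_cast<Interface*>(&object);
        return target != nullptr;
    }

    Interface* operator->() const { return target; }
};

struct A : Object, Interface1 {
    Connector<Interface2> con;
    int do_stuff() override { return con->do_more_stuff() + 1; }
};

struct B : Object, Interface2 {
    int do_more_stuff() override { return 41; }
};

// String-keyed factory, standing in for the hand-written class ids.
inline std::unique_ptr<Object> create(const std::string& classId) {
    if (classId == "A") return std::unique_ptr<Object>(new A);
    if (classId == "B") return std::unique_ptr<Object>(new B);
    return std::unique_ptr<Object>();
}
```

The type check happens once, at connection time; afterwards each call is an ordinary virtual call.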
I'm not too sure how to explain this, but I will try.
I have an object A which has a row number and a partition number. B and C inherit from A and add a few other variables (and getters/setters for them).
I then have a function which takes a variable that is derived from A (B, C, etc.) and creates a record in a database table with the same columns as the variables the object has.
For example:
class A {
int partitionKey;
int rowKey;
set/get for them both
}
class B : A {
string color;
...
}
One table will then be called "B" and have 3 columns, partitionKey, rowKey and color.
Is there any way of not hard coding this? Or would the best way be to create a toString method in the classes that returns a part of the xml request body that will be used to construct the new row in the table? (using REST)
It sounds like you are asking if there is a way to do automated marshalling of C++ objects into a database. The short answer is no, there is no built-in way in the C++ language to do this. Your toString() method isn't a bad approach, although it does require you to write toString() (and likely at some point also fromString()) methods for each of your classes... whether that is too much work or not would depend on how many such classes you need to support.
Alternatively you might also take a look at Qt's property system -- if you don't mind subclassing your data objects from QObject, you can decorate your class definitions with Q_PROPERTY declarations, along with getter methods for each property, and then you can write generic code that uses Qt's QMetaObject class to iterate over all declared properties of any given QObject in a generic fashion. This works because Qt's moc preprocessor (which you will be running anyway if you are using Qt) will parse the Q_PROPERTY macros and it can auto-generate a lot of the necessary glue code for you. You'll still have to write the last step (converting the QObject's data to XML or SQL commands by iterating over myObject->metaObject()->property(int) and calling myObject->property(propName) for each property) yourself, but at least you can do that in a generic fashion, without having to write a separate marshalling routine for each class.
The approach I'm using is indeed a "toString", or rather "toXml", the hierarchical nature of XML being perfect for this. Schematically:
void A::toXml(QDomElement *parentEl)
{
    // QDomElement is a value type; ownerDocument() returns the document by value.
    QDomElement el = parentEl->ownerDocument().createElement("A");
    parentEl->appendChild(el);
    el.setAttribute("partitionKey", partitionKey);
    el.setAttribute("rowKey", rowKey);
}
void B::toXml(QDomElement *parentEl)
{
    QDomElement el = parentEl->ownerDocument().createElement("B");
    parentEl->appendChild(el);
    el.setAttribute("color", color); // assumes color is a QString
    A::toXml(&el); // nest the base-class part inside <B>
}
Which gives e.g.:
[...]
<B color="blue">
<A partitionKey="2" rowKey="25"/>
</B>
[...]
Same logic for class "C".
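If a flat row (one table per class, one column per member) is wanted rather than nested elements, the same idea can be expressed with a virtual function that lets each class append its own columns after those of its base. A minimal sketch, assuming hypothetical names (Columns, appendColumns, tableName, toXmlRow) and doing no XML escaping:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

typedef std::vector<std::pair<std::string, std::string> > Columns;

class A {
public:
    A(int partitionKey, int rowKey) : partitionKey_(partitionKey), rowKey_(rowKey) {}
    virtual ~A() {}

    virtual std::string tableName() const { return "A"; }

    // Each class appends its own columns after those of its base class.
    virtual void appendColumns(Columns& out) const {
        out.push_back(std::make_pair(std::string("partitionKey"), std::to_string(partitionKey_)));
        out.push_back(std::make_pair(std::string("rowKey"), std::to_string(rowKey_)));
    }

private:
    int partitionKey_;
    int rowKey_;
};

class B : public A {
public:
    B(int partitionKey, int rowKey, const std::string& color)
        : A(partitionKey, rowKey), color_(color) {}

    std::string tableName() const override { return "B"; }

    void appendColumns(Columns& out) const override {
        A::appendColumns(out);
        out.push_back(std::make_pair(std::string("color"), color_));
    }

private:
    std::string color_;
};

// Generic code: works for any class derived from A without knowing its members.
inline std::string toXmlRow(const A& object) {
    Columns columns;
    object.appendColumns(columns);
    std::string xml = "<" + object.tableName();
    for (std::size_t i = 0; i < columns.size(); ++i)
        xml += " " + columns[i].first + "=\"" + columns[i].second + "\"";
    return xml + "/>";
}
```

The same Columns list could equally be turned into an SQL INSERT statement or a REST request body.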
I know that isn't exactly possible in C++, but maybe there is a toolchain that can generate code containing a function which, when called, gives me a list of all classes derived from a particular class. For example, across multiple files I have stuff like:
class MyClass : public ParticularClass {
....
}
class MyClass2 : public ParticularClass {
....
}
Then, during runtime, I just want a pointer to single instances of the class. Let's say my generated code looks something like this:
void __populate_classes() {
superList.append(new MyClass());
superList.append(new MyClass2());
}
Also, superList would be of type List<ParticularClass*>. Plus, I'll be using Qt and ParticularClass will be QObject derived, so I can fetch the name of the class anyways. I need to basically introspect the class, so my internal code doesn't really bother much about the newly defined type.
So, is there a way to generate this code with some toolchain? If it is possible with qmake alone, that'd be like icing on the freaking cake :)
Thanks a lot for your time.
Doxygen does a nice job at doing this -- offline. Various IDEs do a nice job at this -- offline. The compiler does not do this. Such knowledge is not needed or used by the compiler.
Here at work I use a tool called Understand 4 C++. It is a tool that helps you analyze your code. It will do this quite easily.
But my favorite part is it comes with a C and Perl API which allows you to take advantage of the abstract syntax tree that 'understand' encapsulates and write your own static analysis tools. I have written tons of tools using this API.
Anyways, it's written by SciTools. http://scitools.com and I don't work for them. I just wholeheartedly like their product. In fact I wrote a C# API that wraps their C API and posted it on CodePlex a few years ago. Sure beats using C or Perl to write static analysis tools.
I don't think what you're trying to do is a good idea. Those who maintain the code after you will have a hard time understanding it.
Maybe instead you could see how to do it in plain C++. One possible solution which comes to mind is to implement the factory design pattern. Then you can iterate over all data types in the factory and add them to superList.
Anyway, ack (a simple grep replacement) can do the job if you always declare the inheritance on one line:
ack ": *public ParticularClass" *.h
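The factory idea mentioned above can also be made self-registering, so that no generated __populate_classes() is needed. Below is a minimal sketch in plain C++ without Qt; Registrar, creators, REGISTER_CLASS and populateClasses are hypothetical names. One caveat: if the classes live in a static library, the linker may drop object files that nothing references, and their registration with them.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct ParticularClass {
    virtual ~ParticularClass() {}
    virtual std::string name() const = 0;
};

typedef ParticularClass* (*Creator)();

// Function-local static avoids static-initialization-order problems across files.
inline std::vector<Creator>& creators() {
    static std::vector<Creator> list;
    return list;
}

// Constructing a Registrar at namespace scope records a creator during static initialization.
struct Registrar {
    explicit Registrar(Creator creator) { creators().push_back(creator); }
};

// Hypothetical macro: put one line after each class definition.
#define REGISTER_CLASS(Type)                                       \
    static ParticularClass* create_##Type() { return new Type(); } \
    static Registrar registrar_##Type(create_##Type);

struct MyClass : ParticularClass {
    std::string name() const override { return "MyClass"; }
};
REGISTER_CLASS(MyClass)

struct MyClass2 : ParticularClass {
    std::string name() const override { return "MyClass2"; }
};
REGISTER_CLASS(MyClass2)

// Equivalent of the hand-written __populate_classes(): one new instance per registered class.
inline std::vector<ParticularClass*> populateClasses() {
    std::vector<ParticularClass*> superList;
    for (std::size_t i = 0; i < creators().size(); ++i)
        superList.push_back(creators()[i]());
    return superList;
}
```

The caller owns the returned instances and is responsible for deleting them.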