Parsing different xml messages. Versions

Parsing different xml messages. Versions - c++

Say we want to Parse a XML messages to Business Objects. We split the process in two parts, namely:
-Parsing the XML messages to XML Grammar Objects.
-Transform XML Objects to Business Objects.
The first part is done automatically, generation a grammar object for each node.
The second part is done following the XML architecture so far. Example:
If we have the XML Message(Simplified):
<Main>
<ChildA>XYZ</ChildA>
<ChildB att1="0">
<InnerChild>YUK</InnerChild>
</ChildB>
</Main>
We could find the following classes:
DecodeMain(Calls DecodeChildA and B)
DecodeChildA
DecodeChildB(Calls DecodeInnerChild)
DecodeInnerChild
The main problem arrives when we need to handle versions of the same messages. Say we have a new version where only DecodeInnerChild changes(e.g.: We need to add an "a" at the end of the value)
It is really important that the solutions agile for further versions and as clean as possible. I considered the following options:
1)Simple Inheritance:Create two classes of DecodeInnerChild. One for each version.
Shortcomming: I will need to create different classes for every parent class to call the right one.
2)Version Parameter: Add to each method an Object with the version as a parameter. This way we will know what to do within each method according to each version.
Shortcoming: Not clean at all. The code of different versions is mixed.
3)Inheritance + Version Parameter: Create 2 classes with a base class for the common code for the nodes that directly changes (Like InnerChild) and add the version as a parameter in each method. When a node call the another class to decode the child object, it will use one or another class depending on the Version parameter.
4)Some kind of executor pattern(I do not know how to do it): Define at the start some kind of specifications object, where all the methods that are going to be used are indicated and I pass this object to a class that is in charge of execute them.
How would you do it? Other ideas are welcomed.
Thanks in advance. :)

How would you do it? Other ideas are welcomed.
Rather than parse XML myself I would as first step let something like CodesynthesisXSD to generate all needed classes for me and work on those. Later when performance or something becomes issue I would possibly start to look aound for more efficient parsers and if that is not fruitful only then i would start to design and write my own parser for specific case.
Edit:
Sorry, I should have been more specific :P, the first part is done
automatically, the whole code is generated from the XML schema.
OK, lets discuss then how to handle the usual situation that with evolution of software you will eventually have evolved input too. I put all silver bullets and magic wands on table here. If and what you implement of them is totally up to you.
Version attribute I have anyway with most things that I create. It is sane to have before backward-compatibility issue that can not be solved elegantly. Most importantly it achieves that when old software fails to parse newer input then it will produce complaint that makes immediately sense to everybody.
I usually also add some interface for converter. So old software can be equipped with converter from newer version of input when it fails to parse that. Also new software can use same converter to parse older input. Plus it is place where to plug converter from totally "alien" input. Win-win-win situation. ;)
On special case of minor change I would consider if it is cheap to make new DecodeInnerChild to be internally more flexible so accepts the value with or without that "a" in end as valid. In converter I have still to get rid of that "a" when converting for older versions.
Often what actually happens is that InnerChild does split and both versions will be used side-by-side. If there is sufficient behavioral difference between two InnerChilds then there is no point to avoid polymorphic InnerChilds. When polymorphism is added then indeed like you say in your 1) all containing classes that now have such polymorphic members have to be altered. Converter should usually on such cases either produce crippled InnerChild or forward to older version that the input is outside of their capabilities.

Related

Cross-compiling plugins for multiple DCCs

Been using Haxe off and on for a few years, and I feel like this is something the compiler might be abusable for:
Is is possible to add target such that when a Haxe class is compiled, it can introspect the compiled code and generate boilerplate in C++? An example:
In Autodesk Maya to create a plugin node, you have to have a number of special functions overridden in a child class. Attributes on the node have to be declared first in the header, then declared statically in the cpp file, and finally added in a particular way (with error checks) in one of the function overrides. Apart from that, you have to write functions that register and de-register the plugin from the system. The pattern is just non-trivial enough that simple search / replace isn't good enough; you have to replace text in different patterns in different places, and also add function calls depending on the type of attributes and the attribute's settings.
A node with the same behavior in Foundry's Modo is very different-- you have to define three classes and compose functionality.
In both cases, the way you access data from within the node is also different, so if you were to wrap a single C++ function for both programs you'd be doing a lot of work outside the function just to prep the data in an agnostic way.
I'd like to be able to write a single node using Haxe code and generate C++ code from the Haxe class. In the case of Maya, it would subclass from MPxNode and provide the proper function overrides. In the case of Modo, it would generate the required classes properly. And in the future if I wanted to target Cinema 4D, I would add an additional target and compile against that SDK to create its version of the same functionality (probably a Tag).
I've actually done this partly in Python (generating C++ code with stubs for node functionality) and while it works, I've always been curious if there could be a better way to do this through Haxe directly. But again, it's something where the compiler would have to be aware of the structure of the Haxe class and the data within it in order to generate the proper code for each target.
Thanks in advance!

framework/library for property-tree-like data structure with generic get/set-implementation?

I'm looking for a data structure which behaves similar to boost::property_tree but (optionally) leaves the get/set implementation for each value item to the developer.
You should be able to do something like this:
std::function<int(void)> f_foo = ...;
my_property_tree tree;
tree.register<int>("some.path.to.key", f_foo);
auto v1 = tree.get<int>("some.path.to.key"); // <-- calls f_foo
auto v2 = tree.get<int>("some.other.path"); // <-- some fallback or throws exception
I guess you could abuse property_tree for this but I haven't looked into the implementation yet and I would have a bad feeling about this unless I knew that this is an intended use case.
Writing a class that handles requests like val = tree.get("some.path.to.key") by calling a provided function doesn't look too hard in the first place but I can imagine a lot of special cases which would make this quite a bulky library.
Some extra features might be:
subtree-handling: not only handle terminal keys but forward certain subtrees to separate implementations. E.g.
tree.register("some.path.config", some_handler);
// calls some_handler.get<int>("network.hostname")
v = tree.get<int>("some.path.config.network.hostname");
search among values / keys
automatic type casting (like in boost::property_tree)
"path overloading", e.g. defaulting to a property_tree-implementation for paths without registered callback.
Is there a library that comes close to what I'm looking for? Has anyone made experiences with using boost::property_tree for this purpose? (E.g. by subclassing or putting special objects into the tree like described here)

After years of coding my own container classes I ended up just adopting QVariantMap. This way it pretty much behaves (and is as flexible as) python. Just one interface. Not for performance code though.
If you care to know, I really caved in for Qt as my de facto STL because:
Industry standard - used even in avionics and satellite software
It has been around for decades with little interface change (think about long term support)
It has excellent performance, awesome documentation and enormous user base.
Extensive feature set, way beyond the STL

Would an std::map do the job you are interested in?
Have you tried this approach?
I don't quite understand what you are trying to do. So please provide a domain example.
Cheers.

I have some home-cooked code that lets you register custom callbacks for each type in GitHub. It is quite basic and still missing most of the features you would like to have. I'm working on the second version, though. I'm finishing a helper structure that will do most of the job of making callbacks. Tell me if you're interested. Also, you could implement some of those features yourself, as the code to register callbacks is already done. It shouldn't be so difficult.

Using only provided data structures:
First, getters and setters are not native features to c++ you need to call the method one way or another. To make such behaviour occur you can overload assignment operator. I assume you also want to store POD data in your data structure as well.
So without knowing the type of the data you're "get"ting, the only option I can think of is to use boost::variant. But still, you have some overloading to do, and you need at least one assignment.
You can check out the documentation. It's pretty straight-forward and easy to understand.
http://www.boost.org/doc/libs/1_61_0/doc/html/variant/tutorial.html
Making your own data structures:
Alternatively, as Dani mentioned, you can come up with your own implementation and keep a register of overloaded methods and so on.
Best

Message receive for c++ actor system

I am trying to implement message handling for actors in c++. The following code in scala is something I am trying to implement in c++
def receive = {
case Message1 =>{/* logic code */}
case Message2 =>{/* logic code */}
}
Thus the idea is to create a set of handler functions for the various message type and create a dispatch method to route the message to its appropiate message handler. All messages will extends the base message type.
What would be the best approach to solve this problem:
Maintain a Map(Message_type, function_pointer), the dispatch method will check the map and call the appropiate method. This mapping however needs to be done mannually in the Actor class.
I read this library, the lbrary is handling message exactly as I want to but I cant understand how they do pattern matching on the lambda fucntions created on line 56.
I would appreciate any suggestion or reading links that could get me closer to the solution of this problem.

Since you've already mentioned CAF: why do you want to implement your own actor library instead of using CAF? If you are writing the lib as an exercise, I suggest start reading libcaf_core/caf/match_case.hpp, libcaf_core/caf/on.hpp and libcaf_core/caf/detail/try_match.hpp. This is the "core" of CAF's pattern matching facility. Be warned, you will be looking at a lot of metaprogramming code. The code is meant to be read by C++ experts. It's definitely not a good place to learn the techniques.
I can outline what's going on, though.
CAF stores patterns as a list of match_case objects in detail::behavior_impl
You never get a pointer to either one as user
message_handler and behavior store a pointer to behavior_impl
Match cases can be generated differently:
Directly from callbacks/lambdas (trivial case)
Using a catch-all rule (via others >> ...)
Using the advanced on(...) >> ... notation
CAF can only match against tuples stored in message objects
"Emulates" (a subset of) reflections
Values and meta information (i.e. type information) are needed
For the matching itself, CAF simply iterates over the list of match_case objects
Try match the input with each case
Stop on first match (just like functional languages implement this)
We put a lot of effort into the pattern matching implementation to get a high-level and clean interface on the user's end. It's not easy, though. So, if you're doing this as an exercise, be warned that you need a lot of metaprogramming experience to understand the code.
In case you don't do this as an exercise, I would be interested why you think CAF does not cover your use case and maybe we can convince you to participate in its development rather than developing something else from scratch. ;)

Try to use sobjectizer (batteries included) or rotor(still experimental, but quite lightweight actor-like solution).

Which is the design pattern for easy maintanace of backward compatibility of data files?

We develop a desktop application which goes into versions. We save the data in a file (which has version in which it is saved) and open it lateron to work on it. Since our data model keeps changing over versions, we keep adding APIs like upgrade_from_ver1_to_ver2. This is getting chiotic, since we are now in version 10.0. We also slowly stop supporting older versions (say files saved before 4.0). Can you suggest a better way or a design pattern to save and provide backward compatibility?
It will be good if this can be designed as a seperate utility, called during import/opening of data files. It shall also be called seperately just to get current data model from old data files.

How about you just implement an upgrade function from the version before to the new version? Then you can chain those together to upgrade from an older version.
This seems to me the minimal implementation effort, while always allowing to upgrades from any prior version. The downside is, that you might loose information in the conversion chain, e.g. if the possible values of one property is reduced in one version and later extended again.

The first important thing to do is to have a separate class(es) for saving your data so your data objects aren't burdened with the responsibility of saving.
Next I would use the template pattern, defining the basics of your read/open operation. Your AbstractReader class might just define a single Read method. You could then create classes inheriting from the AbstractReader - potentially one for each file version. You could then use the abstract factory pattern to detect the version of file being opened, then return the corresponding Reader implementation for that version.

You can use ParallelChange Design Pattern here.
To give you a little bit idea, this pattern has 3 phases expand, migrate, and contract.
Let say you have method1 in your file and you want to update the method structure.
public void method1(Obj1 obj1) {
...
}
1. Expand
Now you need to support different data structure with backward compatibility. Then add new method keeping the old one as is.
public void method1(Obj1 obj1) {
...
}
public void method1(Obj2 obj2) {
...
}
After this release, all your clients will still use the method1(Obj1 obj1). You have just added capability to call a different method which will use now onwards.
2. Migrate
In this phase, you need to upgrade your all clients to new method structure ie method1(Obj2 obj2). Sometimes we need to force the client to use the new method structure and sometimes the client moves gradually. We generally stop to give new features to deprecated methods. Like after the first Expand, we found a bug in method1(Obj1 obj1), we say this will be solved in method1(Obj2 obj2).
3. Contract
Once your all clients are moved, it's totally safe to remove your method1(Obj1 obj1) implementation.
Eventually, you will upgrade your system from version to version with backward compatibility.
The only downfall that I personally feel is, it will take at least 3 release before you remove your whole code. But without a doubt, this would be surely hassle-free.

Flexible application configuration in C++

I am developing a C++ application used to simulate a real world scenario. Based on this simulation our team is going to develop, test and evaluate different algorithms working within such a real world scenrio.
We need the possibility to define several scenarios (they might differ in a few parameters, but a future scenario might also require creating objects of new classes) and the possibility to maintain a set of algorithms (which is, again, a set of parameters but also the definition which classes are to be created). Parameters are passed to the classes in the constructor.
I am wondering which is the best way to manage all the scenario and algorithm configurations. It should be easily possible to have one developer work on one scenario with "his" algorithm and another developer working on another scenario with "his" different algorithm. Still, the parameter sets might be huge and should be "sharable" (if I defined a set of parameters for a certain algorithm in Scenario A, it should be possible to use the algorithm in Scenario B without copy&paste).
It seems like there are two main ways to accomplish my task:
Define a configuration file format that can handle my requirements. This format might be XML based or custom. As there is no C#-like reflection in C++, it seems like I have to update the config-file parser each time a new algorithm class is added to project (in order to convert a string like "MyClass" into a new instance of MyClass). I could create a name for every setup and pass this name as command line argument.
The pros are: no compilation required to change a parameter and re-run, I can easily store the whole config file with the simulation results
contra: seems like a lot of effort, especially hard because I am using a lot of template classes that have to be instantiated with given template arguments. No IDE support for writing the file (at least without creating a whole XSD which I would have to update everytime a parameter/class is added)
Wire everything up in C++ code. I am not completely sure how I would do this to separate all the different creation logic but still be able to reuse parameters across scenarios. I think I'd also try to give every setup a (string) name and use this name to select the setup via command line arg.
pro: type safety, IDE support, no parser needed
con: how can I easily store the setup with the results (maybe some serialization?)?, needs compilation after every parameter change
Now here are my questions:
- What is your opinion? Did I miss
important pros/cons?
- did I miss a third option?
- Is there a simple way to implement the config file approach that gives
me enough flexibility?
- How would you organize all the factory code in the seconde approach? Are there any good C++ examples for something like this out there?
Thanks a lot!

There is a way to do this without templates or reflection.
First, you make sure that all the classes you want to create from the configuration file have a common base class. Let's call this MyBaseClass and assume that MyClass1, MyClass2 and MyClass3 all inherit from it.
Second, you implement a factory function for each of MyClass1, MyClass2 and MyClass3. The signatures of all these factory functions must be identical. An example factory function is as follows.
MyBaseClass * create_MyClass1(Configuration & cfg)
{
// Retrieve config variables and pass as parameters
// to the constructor
int age = cfg->lookupInt("age");
std::string address = cfg->lookupString("address");
return new MyClass1(age, address);
}
Third, you register all the factory functions in a map.
typedef MyBaseClass* (*FactoryFunc)(Configuration *);
std::map<std::string, FactoryFunc> nameToFactoryFunc;
nameToFactoryFunc["MyClass1"] = &create_MyClass1;
nameToFactoryFunc["MyClass2"] = &create_MyClass2;
nameToFactoryFunc["MyClass3"] = &create_MyClass3;
Finally, you parse the configuration file and iterate over it to find all the entries that specify the name of a class. When you find such an entry, you look up its factory function in the nameToFactoryFunc table and invoke the function to create the corresponding object.

If you don't use XML, it's possible that boost::spirit could short-circuit at least some of the problems you are facing. Here's a simple example of how config data could be parsed directly into a class instance.

I found this website with a nice template supporting factory which I think will be used in my code.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js