I'm looking for a simple way to transform a C++ object into an XML string representation, so that I can communicate with a server.
For instance, let us say I have an object:
#include <list>
#include <string>

// B is defined first so that std::list<B> in A uses a complete type
class B {
    std::string moreData;
};

class A {
    std::string data1;
    std::string data2;
    std::string dataN;
    std::list<B> bList;
};
I would like the following XML-representation:
(Assume I have created one instance of A and it contains two instances of B.)
<A>
    <data1>content</data1>
    <data2>content</data2>
    <dataN>content</dataN>
    <B>
        <moreData>content</moreData>
    </B>
    <B>
        <moreData>content</moreData>
    </B>
</A>
What you describe is referred to as XML Data Binding.
There are a number of products that will generate the C++ code from an XSD or DTD; have a look at http://www.xmldatabinding.org/ for a list, and at http://www.rpbourret.com/xml/XMLDataBinding.htm for more information.
Also have a look at this XML Data Binding example for C++, it shows the example source schema and generated code.
If your schemas are fairly basic and you have the ability to tune them to the generator, then there are probably some open source projects that will do the job. If you are binding to an XML standard, then you quickly run up against the limitations of most generators. The Liquid XML generator copes with pretty much all of the XSD standard, but you have to pay for it.
There is no universal solution for this problem in C++; however, many ad hoc implementations exist.
This question has some remarkable links and how-tos: How to implement serialization in C++
So there is no standard way, simply because there is no general way of serializing pointers and things like that. It will always be application-specific.
However, you can create your own class and serialize as you want.
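For example, a minimal hand-rolled sketch of that idea, assuming the A/B classes from the question (the toXml() name is just illustrative, and real code should also escape &, < and > in the data):

#include <list>
#include <sstream>
#include <string>

struct B {
    std::string moreData;
    std::string toXml() const {
        return "<B><moreData>" + moreData + "</moreData></B>";
    }
};

struct A {
    std::string data1, data2, dataN;
    std::list<B> bList;
    std::string toXml() const {
        std::ostringstream out;
        out << "<A>"
            << "<data1>" << data1 << "</data1>"
            << "<data2>" << data2 << "</data2>"
            << "<dataN>" << dataN << "</dataN>";
        for (const B& b : bList)   // one <B> element per list entry
            out << b.toXml();
        out << "</A>";
        return out.str();
    }
};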
As for XML parsers, have you tried this one? It is extremely simple, efficient, and easy to learn. I've done basically everything with it. You can even ask for a commercial license for it.
Related
I would like to test the performance of a custom string and map implementation in my code. I would like to replace all objects of types std::string and std::map with custom::string and custom::map. Is there a reasonably scriptable way of doing this?
I am more interested in a methodology that would work for any given source and target types. Ideally a method that would also support different API names, i.e. replace std::map::insert() with custom::map::custom_insert().
I don't/can't trust a search/replace/regex or any solution that solely depends on textual representation of the types. Providing a tutorial or working example would also be amazing.
I'm learning Go by coding a small web blog and writing my own router (I know there are a few available already: gorilla/mux, martini, etc.).
I have simple struct
type Routes struct {
    method  string
    pattern string
    handler Handler
}
and some regex matchers. But I can't figure out how to keep all the routes I define in one place. Is a slice of structs (i.e. []Routes) a good way to keep them all together?
P.S. This is meant for my personal understanding of how it all works together.
Your question is not really well defined. You told us you want to implement routing functionality based on regular expressions, but you haven't told us what kinds of tasks you want to achieve, which greatly influences the optimal or best data structure to use.
You already mentioned you know about a lot of other implementations which are open source, maybe you should check their sources.
This answer might also be of help to you; it shows a simple, basic implementation of routing functionality using regular expressions.
If you just want to be able to register regular expressions which if matched by the request path and then forward the serving to a Handler, yes, storing the "rules" in a []Routes is a viable and simple option.
Things to keep in mind:
I would definitely compile the regexps in advance and store the results rather than compiling them each time, which is an awful waste of resources. So your Routes struct should contain a field of type *regexp.Regexp instead of the pattern (you can keep the string pattern too, e.g. for debugging purposes).
If your Routes struct grows bigger, I would consider storing pointers in the slice rather than struct values, e.g. []*Routes, because each time you loop over them (e.g. on each request, to see which one matches) or create a local variable from one of the Routes, a copy is made of the value. Copying a large struct is inefficient compared to copying a pointer, which is fast.
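A minimal sketch of what that could look like, assuming plain net/http handlers (the route and router names here are just illustrative):

package main

import (
    "net/http"
    "regexp"
)

type route struct {
    method  string
    pattern *regexp.Regexp // compiled once, up front
    handler http.HandlerFunc
}

type router struct {
    routes []*route // pointers, so looping over the routes copies no structs
}

func (r *router) handle(method, pattern string, h http.HandlerFunc) {
    r.routes = append(r.routes, &route{method, regexp.MustCompile(pattern), h})
}

func (r *router) ServeHTTP(w http.ResponseWriter, req *http.Request) {
    for _, rt := range r.routes {
        if rt.method == req.Method && rt.pattern.MatchString(req.URL.Path) {
            rt.handler(w, req)
            return
        }
    }
    http.NotFound(w, req)
}

func main() {
    r := &router{}
    r.handle("GET", `^/posts/\d+$`, func(w http.ResponseWriter, req *http.Request) {
        w.Write([]byte("a single post"))
    })
    http.ListenAndServe(":8080", r)
}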
Say we want to parse XML messages into business objects. We split the process into two parts, namely:
- Parsing the XML messages into XML grammar objects.
- Transforming the XML objects into business objects.
The first part is done automatically, generating a grammar object for each node.
The second part follows the structure of the XML so far. Example:
If we have the XML Message(Simplified):
<Main>
    <ChildA>XYZ</ChildA>
    <ChildB att1="0">
        <InnerChild>YUK</InnerChild>
    </ChildB>
</Main>
We could find the following classes:
DecodeMain (calls DecodeChildA and DecodeChildB)
DecodeChildA
DecodeChildB (calls DecodeInnerChild)
DecodeInnerChild
The main problem arises when we need to handle different versions of the same messages. Say we have a new version where only DecodeInnerChild changes (e.g. we need to add an "a" at the end of the value).
It is really important that the solution stays agile for further versions and is as clean as possible. I considered the following options:
1) Simple inheritance: create two classes of DecodeInnerChild, one for each version.
Shortcoming: I will need to create a different class for every parent class so that it calls the right one.
2) Version parameter: add an object with the version as a parameter to each method. This way we will know what to do within each method according to the version.
Shortcoming: not clean at all; the code for the different versions is mixed.
3) Inheritance + version parameter: create two classes, with a base class for the common code, for the nodes that actually change (like InnerChild), and add the version as a parameter to each method. When a node calls another class to decode a child object, it will use one class or the other depending on the version parameter.
4) Some kind of executor pattern (I do not know how to do it): define at the start some kind of specification object that lists all the methods that are going to be used, and pass this object to a class that is in charge of executing them.
How would you do it? Other ideas are welcomed.
Thanks in advance. :)
How would you do it? Other ideas are welcomed.
Rather than parse XML myself, as a first step I would let something like CodeSynthesis XSD generate all the needed classes for me and work on those. Later, if performance or something else becomes an issue, I would start looking around for more efficient parsers, and only if that is not fruitful would I design and write my own parser for the specific case.
Edit:
Sorry, I should have been more specific :P, the first part is done
automatically, the whole code is generated from the XML schema.
OK, let's discuss then how to handle the usual situation: with the evolution of software you will eventually have evolved input too. I'll put all the silver bullets and magic wands on the table here; if and what you implement of them is totally up to you.
A version attribute is something I add to most things I create anyway. It is sane to have it in place before a backward-compatibility issue arrives that cannot be solved elegantly. Most importantly, it ensures that when old software fails to parse newer input, it produces a complaint that immediately makes sense to everybody.
I usually also add some interface for a converter, so old software can be equipped with a converter from a newer version of the input when it fails to parse it. New software can use the same converter to parse older input. Plus, it is the place to plug in a converter from totally "alien" input. Win-win-win situation. ;)
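As a rough illustration of the converter idea (the InputConverter name and its methods are my assumptions, not an established interface):

#include <string>

class InputConverter {
public:
    virtual ~InputConverter() = default;

    // True if this converter knows how to handle input declaring the given version.
    virtual bool accepts(int inputVersion) const = 0;

    // Rewrites the raw XML of an older/"alien" version into the current format,
    // after which the normal decoding pipeline can run unchanged.
    virtual std::string convert(const std::string& rawXml) const = 0;
};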
In the special case of a minor change, I would consider whether it is cheap to make the new DecodeInnerChild internally more flexible, so it accepts the value both with and without that trailing "a" as valid. In the converter I still have to get rid of that "a" when converting for older versions.
Often what actually happens is that InnerChild splits and both versions end up being used side by side. If there is sufficient behavioral difference between the two InnerChilds, then there is no point in avoiding polymorphic InnerChilds. When polymorphism is added then indeed, as you say in your option 1), all containing classes that now have such polymorphic members have to be altered. In such cases the converter should usually either produce a crippled InnerChild or report back to the older version that the input is outside of its capabilities.
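To make the polymorphic-InnerChild idea concrete, here is a rough sketch (options 1/3 from the question; all names below are illustrative, not from the original code):

#include <memory>
#include <string>

struct InnerChild { std::string value; };

class DecodeInnerChild {
public:
    virtual ~DecodeInnerChild() = default;
    virtual InnerChild decode(const std::string& rawValue) const = 0;
};

class DecodeInnerChildV1 : public DecodeInnerChild {
public:
    InnerChild decode(const std::string& rawValue) const override {
        return InnerChild{rawValue};            // version 1: value taken as-is
    }
};

class DecodeInnerChildV2 : public DecodeInnerChild {
public:
    InnerChild decode(const std::string& rawValue) const override {
        return InnerChild{rawValue + "a"};      // version 2: append the trailing "a"
    }
};

// The containing decoder (DecodeChildB in the example) picks the right child
// decoder once, based on the version attribute of the message.
std::unique_ptr<DecodeInnerChild> makeInnerChildDecoder(int version) {
    if (version >= 2)
        return std::make_unique<DecodeInnerChildV2>();
    return std::make_unique<DecodeInnerChildV1>();
}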
So I have a need to be able to parse some relatively simple C++ files with annotations and generate additional source files from that.
As an example, I may have something like this:
//# service
struct MyService
{
    int getVal() const;
};
I will need to find the //# service annotation, and get a description of the structure that follows it.
I am looking at possibly leveraging LLVM/Clang since it seems to have library support for embedding compiler/parsing functionality in third-party applications. But I'm really pretty clueless as far as parsing source code goes, so I'm not sure what exactly I would need to look for, or where to start.
I understand that ASTs are at the core of language representations, and that Clang has library support for generating an AST from source files. But comments would not really be part of an AST, right? So what would be a good way of finding the representation of a structure that follows a specific comment annotation?
I'm not too worried about handling cases where the annotation would appear in an inappropriate place as it will only be used to parse C++ files that are specifically written for this application. But of course the more robust I can make it, the better.
One way I've been doing this is annotating identifiers of:
classes
base classes
class members
enumerations
enumerators
E.g.:
class /* #ann-class */ MyClass
    : /* #ann-base-class */ MyBaseClass
{
    int /* #ann-member */ member_;
};
Such annotation makes it easy to write a Python or Perl script that reads the header line by line and extracts the annotation and the associated identifier.
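For illustration, a rough version of that extraction (the answer suggests Python or Perl; shown here in C++ for consistency, and it only handles the simple /* #ann-... */ pattern above):

#include <fstream>
#include <iostream>
#include <regex>
#include <string>

int main(int argc, char** argv) {
    if (argc < 2) return 1;
    std::ifstream header(argv[1]);
    std::string line;
    // Matches e.g. "/* #ann-member */ member_" and captures the tag and the identifier.
    std::regex ann(R"(/\*\s*#(ann-[\w-]+)\s*\*/\s*([A-Za-z_]\w*))");
    while (std::getline(header, line)) {
        for (std::sregex_iterator it(line.begin(), line.end(), ann), end; it != end; ++it)
            std::cout << (*it)[1] << " -> " << (*it)[2] << '\n';
    }
}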
The annotation and the associated identifier make it possible to generate C++ reflection in the form of function templates that traverse objects, passing base classes and members to a functor, e.g.:
template<class Functor>
void reflect(MyClass& obj, Functor f) {
    f.on_object_start(obj);
    f.on_base_subobject(static_cast<MyBaseClass&>(obj));
    f.on_member(obj.member_);
    f.on_object_end(obj);
}
It is also handy to generate numeric ids (an enumeration) for each base class and member and pass those to the functor, e.g.:
f.on_base_subobject(static_cast<MyBaseClass&>(obj), BaseClassIndex<MyClass>::MyBaseClass);
f.on_member(obj.member_, MemberIndex<MyClass>::member_);
Such reflection code allows you to write functors that serialize and de-serialize any object type to/from a number of different formats. Functors use function overloading and/or type deduction to treat different types appropriately.
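For example, a functor compatible with the reflect() template above might look like this (the XmlPrinter name and its output format are just an assumption for illustration):

#include <iostream>
#include <string>

struct XmlPrinter {
    template<class T>
    void on_object_start(const T&) { std::cout << "<object>\n"; }

    template<class T>
    void on_base_subobject(const T& base) { /* typically recurses: reflect(base, *this) */ }

    // Overloads pick the right formatting for each member type.
    void on_member(const std::string& s) { std::cout << "  <string>" << s << "</string>\n"; }
    void on_member(int i)                { std::cout << "  <int>"    << i << "</int>\n"; }

    template<class T>
    void on_object_end(const T&) { std::cout << "</object>\n"; }
};

// Usage, assuming the generated reflect() above:
//   MyClass obj;
//   reflect(obj, XmlPrinter{});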
Parsing C++ code is an extremely complex task. Leveraging a C++ compiler might help, but it could be beneficial to restrict yourself to a more domain-specific, less powerful format, i.e. to generate the source and the additional C++ files from a simpler representation, something like protobuf's .proto files or SOAP's WSDL, or something even simpler in your specific case.
I did some very similar work recently. The research I did indicated that there wasn't any out-of-the-box solutions available already, so I ended up hand-rolling one.
The other answers are dead-on regarding parsing C++ code. I needed something that could get ~90% of C++ code parsed correctly; I ended up using srcML. This tool takes C++ or Java source code and converts it to an XML document, which makes it easier for you to parse. It keeps the comments intact. Furthermore, if you need to do a source code transformation, it comes with a reverse tool which will take the XML document and produce source code.
It works in 90% of the cases correctly, but it trips on complicated template metaprogramming and the darkest corners of C++ parsing. Fortunately, my input source code is fairly consistent in design (not a lot of C++ trickery), so it works for us.
Other items to look at include gcc-xml and reflex (which actually uses gcc-xml). I'm not sure if GCC-XML preserves comments or not, but it does preserve GCC attributes and pragmas.
One last item to look at is this blog on writing GCC plugins, written by the author of the CodeSynthesis ODB tool.
Good luck!
So I've been attempting to create some classes around the Xerces XML library so I can 'hide' it from the rest of my project, i.e. so that the underlying XML library stays independent from the rest of the code.
This was supposed to be a fairly easy task; however, it seems entirely impossible to hide a library from the rest of a project just by writing some classes around it.
Have I got the wrong approach or is my 'wrapper' idea completely silly?
I end up with something like this:
DOMElement* root(); // in my 'wrapper' class; however, this DOMElement is part of the xerces
                    // library, so at this point my 'wrapper' is broken: now I have to use
                    // the xerces library everywhere I want to use this function.
Where is my thinking gone wrong?
I would recommend avoiding the wrapper in the first place. Just make sure that the layers and their borders are clear, i.e. the network layer takes care of serializing/deserializing the XML, and from there on you only use your internal types. If you do this and at a later stage you need to replace xerces with some other library, you just replace the serialization layer. That is, instead of wrapping each XML object, just wrap the overall operation: serialize/deserialize.
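A rough sketch of that boundary, assuming a hypothetical Request type and serialize/deserialize names (only the single .cpp file implementing these two functions would ever include the XML library's headers):

#include <string>

// Internal type used everywhere in the codebase; knows nothing about xerces.
struct Request {
    std::string action;
    std::string payload;
};

// The only two functions that touch the XML library. Their implementations
// (in one .cpp file) can use xerces, TinyXML, or anything else.
std::string serialize(const Request& req);
Request deserialize(const std::string& xml);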
Writing your own abstract interface for a library is not a silly idea IF you plan to change, or want the possibility of changing, the library you are using.
You should not let your library's objects leak into your wrapper interface. Implement your own structures and your own function interface. It will save a lot of work when you want to change how the XML layer is implemented (e.g. switch libraries).
One example of an implementation:
class XmlElement
{
private:
    DOMElement* element = nullptr; // points to the element of the underlying library

public:
    // Here you define its public interface. There should be enough
    // methods/parameters to interact with any XML library you may use in the future.
    XmlElement getSubElement(const std::string& name)
    {
        XmlElement child;
        // Look up the named child via the underlying library,
        // set child.element to the corresponding DOMElement, and return it.
        return child;
    }
};
In your program you will see:
void function()
{
    XmlElement root;                // note: "XmlElement root();" would declare a function
    root.getSubElement("value");    // for example
}
That way, no DOMElement or any of its functions appears anywhere in the project.
As I mentioned in my comments, I would take a slightly different approach. I would not want my codebase to be dependent on the particular messaging format (XML) that I am using (what if, for example, you decide to change the XML to something else later?). Instead, I would work with a well-defined object model and have a simple encoder/decoder to handle the conversion to an XML string and vice versa. This encoder/decoder would then be the bit that I would replace if the underlying wire format changed.
The decoder would take in the data read from the socket and produce a suitable object (with nested objects to represent the request), and the encoder would take a similar object and generate the XML from it. If performance is not a primary concern, I would use a library such as TinyXML, which is quite lightweight; heck, you can strip that down even further and make it even more lightweight...