C++ code generation from XML

I would like to be able to specify the layout of different packets in a client server protocol in XML, and at (or before) compile time, have that XML be parsed and C++ code be generated to parse the defined layouts.
Example:
<layout>
  <opcode id="1">
    <field type="uint32" label="itemId" />
    <field type="uint8" label="count" />
    <field type="strlen32" label="name" />
  </opcode>
</layout>
Here 'strlen32' means a string whose 32-bit length prefixes the content. Processing this layout would produce the following handler:
void Handle_Opcode1(Packet &packet, std::function<...> emitter)
{
    std::uint32_t itemId = packet.Read<std::uint32_t>();
    std::uint8_t count = packet.Read<std::uint8_t>();
    std::uint32_t nameLen = packet.Read<std::uint32_t>();
    std::vector<char> name(static_cast<std::size_t>(nameLen));
    packet.Read(&name[0], name.size());
    emitter(itemId, count, name);
}
I don't know exactly how I would want the emitter to work in a generic way, but I consider that beyond the scope of this question. Presently I am just searching for recommendations on the C++ code generation. Are there any existing tools to do this?
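One lightweight option is to roll a small generator and run it as a pre-build step. Below is a minimal sketch, assuming tinyxml2 for the XML parsing; the Packet class is the one from the question, while the file names and the uint8/uint32/strlen32 handling are illustrative:
// gen_handlers.cpp -- hypothetical pre-build step: reads layout.xml and writes
// handlers.generated.h with one Handle_OpcodeN function per <opcode>.
// Assumes well-formed input (quoted attributes) and the Packet class above;
// the emitter parameter from the question is omitted for brevity.
#include <tinyxml2.h>
#include <fstream>
#include <map>
#include <string>

int main()
{
    tinyxml2::XMLDocument doc;
    if (doc.LoadFile("layout.xml") != tinyxml2::XML_SUCCESS)
        return 1;
    auto *layout = doc.FirstChildElement("layout");
    if (layout == nullptr)
        return 1;

    // Fixed-width field types map directly to a Packet::Read call;
    // unknown types will throw from .at() below, so extend the map as needed.
    const std::map<std::string, std::string> simpleReads = {
        { "uint8",  "packet.Read<std::uint8_t>()"  },
        { "uint32", "packet.Read<std::uint32_t>()" },
    };

    std::ofstream out("handlers.generated.h");
    for (auto *op = layout->FirstChildElement("opcode");
         op != nullptr; op = op->NextSiblingElement("opcode"))
    {
        out << "void Handle_Opcode" << op->IntAttribute("id") << "(Packet &packet)\n{\n";
        for (auto *field = op->FirstChildElement("field");
             field != nullptr; field = field->NextSiblingElement("field"))
        {
            const std::string type  = field->Attribute("type");
            const std::string label = field->Attribute("label");
            if (type == "strlen32")
            {
                // Length-prefixed string: read the 32-bit length, then the bytes.
                out << "    std::uint32_t " << label << "Len = packet.Read<std::uint32_t>();\n"
                    << "    std::vector<char> " << label << "(" << label << "Len);\n"
                    << "    packet.Read(" << label << ".data(), " << label << ".size());\n";
            }
            else
            {
                out << "    auto " << label << " = " << simpleReads.at(type) << ";\n";
            }
        }
        out << "}\n\n";
    }
    return 0;
}
Running it as a custom build step regenerates the header whenever layout.xml changes.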

Related

Reference XSD in XSLT file and match on element name to get element type

I am a beginner with XSLT, so I am not sure if this is even feasible. I really appreciate your help.
I am using XSLT to transform from one format to another.
The source XML does not have the type.
I need to reference the XSD to get the type for the given element.
<xsl:template match="*[not(*)]">
  <elementName>
    <key>
      <xsl:value-of select="name()"/>
    </key>
    <type>
      <!-- if name() matches an element declaration in the XSD (e.g. name="id"),
           output its declared type here (e.g. xs:string) -->
    </type>
  </elementName>
</xsl:template>
This is sample XSD:
<xs:complexType name="test">
  <xs:sequence>
    <xs:element name="id" type="xs:string"/>
  </xs:sequence>
</xs:complexType>
Using an XSLT 2.0 schema-aware transformation, you can write:
<xsl:import-schema>
... schema goes here, either inline or by reference ...
</xsl:import-schema>
<xsl:template match="element(id, xs:string)">
...
</xsl:template>
The normal way of using this assumes that you write your stylesheet knowing what is in the schema. Saxon has extended this with extension functions allowing you to discover what is in your schema. For example:
<xsl:variable name="type" select="saxon:type-annotation(.)"/>
<key><xsl:value-of select="name()"/></key>
<type><xsl:value-of select="concat('Q{', namespace-uri-from-QName($type), '}', local-name-from-QName($type))"/></type>
See the saxon:type-annotation and saxon:schema extension functions at http://www.saxonica.com/documentation/index.html#!functions/saxon
Analysing a schema document from XSLT directly is possible in theory but it's an enormous amount of work to get it right, if you're going to handle things such as xs:include/import/redefine, named types and anonymous types, global and local element declarations, substitution groups, etc. etc.
Yet another approach is to analyse the "precompiled schema" in XML format (SCM) which you can output from Saxon, which eliminates many of these difficulties.
Other products also offer APIs to access the schema, but there is no real standard.

Building concrete objects from files. Are there any design patterns appropriate for this?

I have to say I always try to keep code simple and beautiful, mainly using design patterns when possible. Also, I am surprised I did not find anything related to this on the internet (except simple and very vague examples, mostly in JavaScript using JSON).
The scenario is: I have to parse/build concrete objects from a file, whose content may be XML, JSON and/or other formats. Here's an example:
Concrete object:
// Contains the common states for the entities
struct EntityModel
{
    int hp;
    int level;
    int armor;
    int speed;
    // Other attributes...
};
class Entity
{
    // Stuff (protected/public/private attributes and functions/methods)
private:
    EntityModel* m_model; // Pointer to the model used (flyweight)
    // Other attributes...
};
File (XML, in this case):
<entity name="Skeleton" class="Undead">
<attributes>
<hp value="150" />
<level value="10" />
<armor value="75" />
<speed value="15" />
<width value="32" />
<height value="32" />
<experience value="372" />
<texture value="skeleton.png" />
<intelligence value="skeleton.script" />
</attributes>
<resistances>
<resist type="Shock" value="30" />
<resist type="Fire" value="10" />
</resistances>
<attacks>
<spell name="Blizzard" mp="50" damage="130" distance="0" />
<spell name="Fireball" mp="30" damage="100" distance="0" />
</attacks>
<loot>
<drop item="Gold Coin" min="30" max="50" probability="1" />
<drop item="Ruby" min="0" max="2" probability="0.7" />
<drop item="Enchanted Sword" probability="0.25" />
</loot>
</entity>
This is the example of the relationship between an entity model and its file. There will also be other objects that have to be able to be parsed/built/created from their files.
Some may say that a design pattern is not really necessary in this case, as I have seen in a few implementations, although I do really believe there is one. The whole entity creation system involves the abstract factory, pool and flyweight patterns (a createEntity call is requested to the factory, which will see if a flyweight model has already been created and cached in the pool or create and cache the new model).
So, the question is: is there a proper way to do this? Which one?
I'll base myself on the answer for this case and adapt it to the other object creations, as I have stated. In other words, I need a generic answer.
If this post is missing some information, or is in a wrong section, please forgive me as this is my first post here.
Thanks in advance.
Try the Boost Serialization Library. It has XML, binary, and text save formats. It's not too complicated and has good documentation.
I recommend a derivative of the Factory Design Pattern.
The pattern allows you to construct objects based on a criteria, such as a name or number. The traditional pattern creates objects based on a common base class.
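A minimal sketch of that registry-based factory variant follows (all names here are illustrative, not taken from the question): creators are registered under a string key, e.g. the class attribute of the <entity> element, and each creator builds a concrete Entity from the parsed data.
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>

struct EntityData {                 // values pulled out of the XML file
    std::map<std::string, int> attributes;
};

class Entity {
public:
    virtual ~Entity() = default;
};

class Undead : public Entity {      // one concrete class per "class" attribute
public:
    explicit Undead(const EntityData &) {}
};

class EntityFactory {
public:
    using Creator = std::function<std::unique_ptr<Entity>(const EntityData &)>;

    void registerClass(const std::string &name, Creator creator) {
        m_creators[name] = std::move(creator);
    }

    std::unique_ptr<Entity> create(const std::string &name, const EntityData &data) const {
        auto it = m_creators.find(name);
        if (it == m_creators.end())
            throw std::runtime_error("unknown entity class: " + name);
        return it->second(data);    // a flyweight pool lookup could hook in here
    }

private:
    std::map<std::string, Creator> m_creators;
};

int main() {
    EntityFactory factory;
    factory.registerClass("Undead",
        [](const EntityData &d) { return std::make_unique<Undead>(d); });

    EntityData skeleton;            // in practice filled by the XML parser
    skeleton.attributes["hp"] = 150;
    auto entity = factory.create("Undead", skeleton);
}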

Solr : importing dynamic field names from XML with DIH and xpath

I'm indexing data from an XML file, with many fields like these declared in DataImportHandler's dataconfig.xml:
<field column="pos_A" xpath="/positions/pos_A/@pos" />
<field column="pos_B" xpath="/positions/pos_B/@pos" />
<field column="pos_C" xpath="/positions/pos_C/@pos" />
...
And one matching dynamicField declaration in schema.xml :
<dynamicField name="pos_*" type="sint" indexed="true" stored="true" />
I'm wondering if it's possible to use a transformer to dynamically generate the field names in dataconfig.xml, and have a single line, kinda like :
<field column="pos_{$1}" xpath="/positions/pos_(*)/#pos" />
(pardon my xpath and regex syntax :)
See https://issues.apache.org/jira/browse/SOLR-3251: the latest release claims that you can dynamically add fields to the schema. I tried to find documentation for the public interface, but not much luck so far.
SOLR-4658: In preparation for REST API requests that can modify the schema, a "managed schema" is introduced. Add '<schemaFactory class="ManagedSchemaFactory" mutable="true"/>' to solrconfig.xml in order to use it, and to enable schema modifications via REST API requests. (Steve Rowe, Robert Muir)

How to unveil typedef in C/C++ sources?

I want to parse C/C++ code with my primitive parser to get an AST,
but it doesn't support macros and typedefs.
It is possible to expand macro definitions in any C/C++ project with the help of GCC options.
After that my own parser is able to cope with the C/C++ code, but only if there are no typedefs in it.
So I'd like to get rid of the typedefs in some way, but I have no idea how.
I want to replace redefined type names, for example:
typedef char CHAR;
typedef int& INT;
INT a;
CHAR b;
by their originals:
int &a;
char b;
As a result, I want to get the same sources, but with the original types and without the typedefs.
I guess it is a very simple task for a compiler, but not for a student's project. :)
As far as I know, DECL_ORIGINAL_TYPE (TYPE_NAME (t)) in g++ points to the tree node with the object's original type.
But I really wouldn't like to dive into the g++ sources to adapt it to my needs.
So, what is the easiest way to unveil the typedefs?
Any help would be greatly appreciated.
Edited:
The solution with GCCXML is really good, but I still don't understand how to get
C/C++ code back from its XML representation. Could you explain what I should do to transform this XML:
(an example from http://www.gccxml.org/HTML/example1out.html)
<?xml version="1.0"?>
<GCC_XML>
<Namespace id="_1" name="::" members="_2 _3 _4 "/>
<Function id="_2" name="main" returns="_5" context="_1" location="f0:8"/>
<Function id="_3" name="a_function" returns="_5" context="_1" location="f0:4">
<Argument name="f" type="_6"/>
<Argument name="e" type="_4"/>
</Function>
<Struct id="_4" name="EmptyClass" context="_1" location="f0:1" members="_7 _8 " bases=""/>
<FundamentalType id="_5" name="int"/>
<FundamentalType id="_6" name="float"/>
<Constructor id="_7" name="EmptyClass" context="_4" location="f0:1">
<Argument name="_ctor_arg" type="_9"/>
</Constructor>
<Constructor id="_8" name="EmptyClass" context="_4" location="f0:1"/>
<ReferenceType id="_9" type="_4c"/>
<File id="f0" name="example1.cxx"/>
</GCC_XML>
back to C/C++:
(an example from http://www.gccxml.org/HTML/example1in.html)
struct EmptyClass {};
int a_function(float f, EmptyClass e)
{
}
int main(void)
{
return 0;
}
Could you explain it please?
Since types are a big, complex subject, I would suggest using GCCXML. It's a frontend that produces an abstract syntax tree from concrete source. I used it to generate Prolog/OpenGL interfaces. If you want to put it to good use you'll need a good XML reader (SWI-Prolog is really good at this).
edit
the following micro file x.c
typedef struct A {
int X, Y;
} T;
T v[100];
processed with
gccxml -fxml=x.xml x.c
produces in x.xml (among many other elements) the following XML:
...
<Variable id="_3" name="v" type="_141" context="_1" location="f0:5" file="f0" line="5"/>
...
<Struct id="_139" name="A" context="_1" mangled="1A" demangled="A" location="f0:1" file="f0" line="1" artificial="1" size="64" align="32" members="_160 _161 _162 _163 _164 _165 " bases=""/>
<Typedef id="_140" name="T" type="_139" context="_1" location="f0:3" file="f0" line="3"/>
<ArrayType id="_141" min="0" max="99u" type="_140" size="6400" align="32"/>
...
<Field id="_160" name="X" type="_147" offset="0" context="_139" access="public" location="f0:2" file="f0" line="2"/>
<Field id="_161" name="Y" type="_147" offset="32" context="_139" access="public" location="f0:2" file="f0" line="2"/>
<Destructor id="_162" name="A" artificial="1" throw="" context="_139" access="public" mangled="_ZN1AD1Ev *INTERNAL* " demangled="A::~A()" location="f0:1" file="f0" line="1" endline="1" inline="1">
</Destructor>
You can see that by following the type="..." id chain you can reconstruct the type assigned to the typedef.
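For illustration, here is a minimal sketch that mechanically follows that chain in the gccxml output, assuming tinyxml2 for the XML reading (the file name x.xml matches the command above; the rest is illustrative):
// For every Typedef in the gccxml dump, follow the type="..." chain to the
// underlying declaration (e.g. T -> Struct A).
#include <tinyxml2.h>
#include <cstdio>
#include <map>
#include <string>

int main()
{
    tinyxml2::XMLDocument doc;
    if (doc.LoadFile("x.xml") != tinyxml2::XML_SUCCESS)
        return 1;

    // Index every element of the dump by its id attribute.
    std::map<std::string, tinyxml2::XMLElement*> byId;
    for (auto *e = doc.RootElement()->FirstChildElement();
         e != nullptr; e = e->NextSiblingElement())
    {
        if (const char *id = e->Attribute("id"))
            byId[id] = e;
    }

    for (auto &entry : byId)
    {
        tinyxml2::XMLElement *e = entry.second;
        if (std::string(e->Name()) != "Typedef")
            continue;
        // Walk the chain until the target is no longer a Typedef.
        tinyxml2::XMLElement *target = e;
        while (std::string(target->Name()) == "Typedef")
        {
            const char *next = target->Attribute("type");
            auto it = next ? byId.find(next) : byId.end();
            if (it == byId.end())       // cv-qualified ids such as "_4c" are not handled here
                break;
            target = it->second;
        }
        std::printf("typedef %s -> %s %s\n",
                    e->Attribute("name"), target->Name(),
                    target->Attribute("name") ? target->Attribute("name") : "");
    }
    return 0;
}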
For resolving macros you can use cpp, the GCC preprocessor; it prints the preprocessed code to stdout.
Unfortunately for you, typedefs are not macros, so you would need to handle them yourself.
It seems to me that you're jumping the gun on the translation phases. Typedef substitution seems easy in comparison to comment substitution. Does your program recognise the following as comments yet? If not, then I'd suggest going back to translation phases 1 and 2 before attempting 3 and 4.
// this is a basic comment
/* this is another basic comment */
// this is a slightly\
less basic comment
/* this is a slightly
* less basic comment */
/??/
*??/
c??/
o??/
m??/
m??/
e??/
n??/
t??/
*??/
/
Parsing C++ is very hard, typically requiring a hand-written recursive descent parser. I suggest you use GCCXML as proposed by @CapelliC, or, as a better maintained alternative, use libclang. There even exist Python bindings which make its use much simpler.
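For the libclang route, a minimal sketch using the C API (assuming the libclang headers and library are installed): it prints each typedef together with its underlying type, which is exactly the mapping a substitution pass over the source would need.
#include <clang-c/Index.h>
#include <cstdio>

// Visitor: report every typedef declaration and its underlying type.
static CXChildVisitResult visit(CXCursor cursor, CXCursor, CXClientData)
{
    if (clang_getCursorKind(cursor) == CXCursor_TypedefDecl)
    {
        CXString name = clang_getCursorSpelling(cursor);
        CXString underlying = clang_getTypeSpelling(
            clang_getTypedefDeclUnderlyingType(cursor));
        std::printf("typedef %s -> %s\n",
                    clang_getCString(name), clang_getCString(underlying));
        clang_disposeString(name);
        clang_disposeString(underlying);
    }
    return CXChildVisit_Recurse;
}

int main(int argc, char **argv)
{
    if (argc < 2)
        return 1;
    CXIndex index = clang_createIndex(0, 0);
    CXTranslationUnit tu = clang_parseTranslationUnit(
        index, argv[1], nullptr, 0, nullptr, 0, CXTranslationUnit_None);
    if (!tu)
        return 1;
    clang_visitChildren(clang_getTranslationUnitCursor(tu), visit, nullptr);
    clang_disposeTranslationUnit(tu);
    clang_disposeIndex(index);
    return 0;
}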

MSXML Select Nodes Not Working

I am working on an automated testing app, and am currently in the process of writing a function that compares values between two XML files that should be identical, but may not be. Here is a sample of the XML I'm trying to process:
<?xml version="1.0" encoding="utf-8"?>
<report xmlns="http://www.**.com/**">
<subreport name="RBDReport">
<record rowNumber="1">
<field name="Time">
<value>0</value>
</field>
<field name="Reliability">
<value>1.000000</value>
</field>
<field name="Unreliability">
<value>0.000000</value>
</field>
<field name="Availability">
<value> </value>
</field>
<field name="Unavailability">
<value> </value>
</field>
<field name="Failure Rate">
<value>N/A</value>
</field>
<field name="Number of Failures">
<value> </value>
</field>
<field name="Total Downtime">
<value> </value>
</field>
</record>
(Note there may be multiple <subreport> elements and within those, multiple <record> elements.)
What I'd like is to extract the <value> tags of two documents and then compare their values. That part I know how to do. The problem is the extraction itself.
Since I'm stuck in C++, I'm using MSXML, and have written a wrapper to allow my app to abstract away the actual XML manipulation, in case I ever decide to change my data format.
That wrapper, CSimpleXMLParser, loads an XML document and sets its "top record" to the document element. CRecord is an abstract class and CXMLRecord is one of its subclasses; it gives access to child records singly or by group, and also to the "value" of the record (the values of child elements or attributes, in the case of CXMLRecord). A CXMLRecord contains an MSXML2::IXMLDOMNodePtr and a pointer to an instance of CSimpleXMLParser. The wrapper also contains utility functions for returning children, which CXMLRecord uses to return its child records.
In my code, I do the following (trying to return all <subreport> nodes just to see if it works):
CSimpleXMLParser parserReportData;
parserReportData.OpenXMLDocument(strPathToXML);
bool bGetChildrenSuccess = parserReportData.GetFirstRecord()->GetChildRecords(listpChildren, _T("subreport"));
This is always returning false. The meat of the implementation of CXMLRecord::GetChildRecords() is basically
MSXML2::IXMLDOMNodeListPtr pListChildren = m_pParser->SelectNodes(strPath, m_pXMLNode);
if (pListChildren->Getlength() == 0)
{
    return false;
}
for (long l = 0; l < pListChildren->Getlength(); ++l)
{
    listRecords.push_back(new CXMLRecord(pListChildren->Getitem(l), m_pParser));
}
return true;
And CSimpleXMLParser::SelectNodes() is:
MSXML2::IXMLDOMNodeListPtr CSimpleXMLParser::SelectNodes(LPCTSTR strXPathFilter, MSXML2::IXMLDOMNodePtr pXMLNode)
{
    return pXMLNode->selectNodes(_bstr_t(strXPathFilter));
}
When run, the top record is definitely being set to the <report> element properly. I can do all sorts of things with it, like getting its child nodes (through the MSXML interface, not through my wrapper) or its name, etc. I know that my wrapper can work, because I use it elsewhere in the app for parsing an XML configuration file, and that works flawlessly.
I thought maybe I was doing something faulty with the XPath query expression, but every permutation I could think of gave no joy. The MSXML2::IXMLDOMNodeListPtr returned by IXMLDOMNodePtr::selectNodes() is always of length 0 when I try to deal with this XML file.
This is driving me crazy.
I'm used to doing this with .NET's XmlDocument objects, but I think the effect is the same here:
If the XML document declares a namespace -- even a default, unprefixed one -- then the XPath query has to use one as well. So you'll have to register the namespace with the XML document, give it a prefix in the code, and then include that prefix in the XPath query (it doesn't matter that the prefixes differ between the XML document and the XPath, as long as they map to the same namespace URI).
So, while you are using an XPath like /report/subreport/record/field/value, you actually need to first set the selection namespaces of your document:
pXMLDoc->setProperty(_bstr_t("SelectionNamespaces"),
                     _bstr_t("xmlns:r='http://www.**.com/**'"));
and then selectNodes() using /r:report/r:subreport/r:record/r:field/r:value
I see no reference to a namespace when you're selecting nodes. I'd expect this to be the fundamental problem.
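To make the fix concrete in MSXML terms, here is a minimal sketch (the document creation and names are illustrative; in the question this would live inside CSimpleXMLParser, and COM is assumed to be initialised by the caller):
#include <comdef.h>
#import <msxml6.dll>   // generates the MSXML2 smart-pointer wrappers

void SelectReportValues()
{
    MSXML2::IXMLDOMDocument2Ptr pDoc;
    pDoc.CreateInstance(__uuidof(MSXML2::DOMDocument60));
    pDoc->async = VARIANT_FALSE;
    pDoc->load(_variant_t(L"report.xml"));

    // Bind a prefix of our choosing to the document's default namespace.
    pDoc->setProperty(_bstr_t(L"SelectionNamespaces"),
                      _variant_t(L"xmlns:r='http://www.**.com/**'"));

    // Every step of the query must now carry that prefix.
    MSXML2::IXMLDOMNodeListPtr pValues =
        pDoc->selectNodes(_bstr_t(L"/r:report/r:subreport/r:record/r:field/r:value"));
    for (long i = 0; i < pValues->Getlength(); ++i)
    {
        MSXML2::IXMLDOMNodePtr pValue = pValues->Getitem(i);
        // pValue->Gettext() is the string to compare between the two documents.
    }
}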