I am writing a parser for a hugely complex internal file format - for training purposes. I need to implement some sort of DSL for parsing - and then display the sections of format in a GUI, as some sort of a tree view. I want to be able to have the parsing engine in one process and the UI in another. So for example I want the UI to ask the parser engine to parse a file. The parser then returns some sort of tree containing sections and fields in the tree, and the GUI then displays it.
How do I make them communicate? (The language is C++). Do I make the engine a DLL and export the needed functions or how else to implement this. I cannot use external libraries.
Not a big problem. The interaction is one way, that helps a lot. Define an output format for the parser/input format for the UK, before writing either. If the format is well defined, you can even have two developers write the two parts independently.
Related
so I posted a question on OpenData and after advancing me a bit further in my quest to create maps from real cartographic information, I was advised to post my follow up questions on this website.
So after this intro, here is the original question:
I'm creating a 2D realistic RTS (Real time strategy game) and I wanted to be able to use real locations as the scenarios for the games.
The game will be developed via unreal engine which uses c++. The idea is for the engine to read an file and convert into a grid (coordinate point) where each square has type of terrain associated, like in this image of a scenario editor.
The file resulting from the other website is a GML (Geographic Markup Language) of a given location in the globe. GML is a XML extension.
The problem I'm facing is converting that GML for a given location into data that can be used by my game, like an location array or something like that.
Any sugestions?
There are many possibilities into doing this but the general guidelines or concepts should be as follows:
Know the data structure used by your engine
Open the incoming file and being able to parse and extract the needed data
Populate your structures or classes
Use your custom structures or classes to create game content (map data).
The first suggestion that I can give would be to read up on any and all documentation on the GML file format to know how the tags of the marked up language works then from there look to see if you can find any libraries that make it easy to read in a GML or XML file type and to generate C++ classes or structures, otherwise you will have to write both the file loader and gml(xml)parser with schema manually. Then from those structures being able to extract the data that you need and to pass them or store them into your engine's custom data structures. Then finally you should be able to use the stored data in your engine to generate the content that you want.
References
www.opengeospatial.org
www.w3.org
www.ogcnetwork.net
www.gdal.org
www.svgopen.org
Books
Geo-Informatics... 200+ page preview of almost 1,000 pages
Whitepapers
ResearchGate
Tools - Some Free Some Not
Google Search - GML File Converter
GoLoader by Snowflake Software
Various Sources by OGCNetwork
Open Source software from opengeospatial
GML Reader by tsusiatsoftware
Libraries
Question found on Stack Exchange
Various Libraries & Frameworks for Different Formats and Languages
gmXML Parser by YOYO games - *This Might be for the GameMaker Language
Java Code Examples for org.geotools.xml.Parser from programcreek
GML API for C/C++ Documentation
GML - The Geometric Modelling Library - API & GUI
Open Source - CityGML
The information needed is out there; it just takes time and effort; good research and the ability to apply what you have read into an actual code base.
What is the best way/common practice for maintaining all string resources found on a UI in Qt, especially the textual input/text in combo boxes etc. (since these are the once that are frequently used in the code itself)?
I know that Android has this string resources thing such that resources only have to be modified at one position.
Does Qt have something like that too or do I have to initialize string resources in code instead of in the UI's XML itself...
AFAIK, there is no built-in mechanism for string resources in Qt. If you want to maintain strings at build time you can define them in one .h/.cpp file as global variables and reuse them in your code.
Otherwise you can use Qt's translator files (binary) and load them along with your application. If you need to change a string, you simply will need to edit the translation file (xml) and "recompile" it with lrelease utility without building the application again.
There is a mechanism to dynamically translate texts in application, but it works a bit different than Android string resources, but achieves the same goals.
Qt uses i18n system modelled after standard, well known unix gettext. It works in a very similar way to iOS NSLocalizedString, if that rings a bell.
http://doc.qt.io/qt-5/qobject.html#tr
This is worth reading too:
http://en.wikipedia.org/wiki/Gettext
http://doc.qt.io/qt-5/internationalization.html
Android approach is a bit unique and you should not expect it to be a "standard everywhere". It works, it's ok, but it's not a standard way of doing things on desktop.
OK, so here's the background:
We have a third-party piece of software that does a lot of complicated stuff to generate an XML file from a lot of tables based on a wide array of business rules. The software allows you to apply an XSL transformation by supplying an XSLT file as part of its workflow, before continuing on in the process, which is usually an upload to one or more servers, based on more business rules.
Here's the problem:
One of the elements (with more on the way) this application is processing contains RTF text, and needs to be converted into formatted HTML before being uploaded. There are no means of transforming the XML inside the application other than through an XSLT file, and once we output the file, we cannot resume the workflow. My original thought was, "Easy! someone must have written a few XSL transforms for converting RTF to formatted HTML!" Hours of searching later, I must conclude I either suck at searching or it's awfully obscure.
Disclaimers:
I know the software is pretty darned limited; I'm stuck with it.
I know there are a lot of third-party tools to do this; they are not available to me because I would need to run them externally.
I know that this is not a pretty or efficient thing to do with XSLT. Changing that is not an option for me at this point.
If I cannot find a means to do this through pure XSL transforms, I will need to output the files locally, run the extra process, and take the destination routing on through a custom process. I really don't want to do that.
Does anyone have access to an XSL transformation function/ scheme that will allow me to do this natively in the application? Perhaps a series of regular expressions I could use or something?
So it turns out that external scripts can be invoked from the XSLT. It seems I will be using another scripting language to get this to work. I'm a little bummed there was no other answer available.
Well a lot of questions have been made about parsing XML in C++ and so on...
But, instead of a generic problem, mine is very specific.
I am asking for a very efficient XML parser for C++. In particular I have a VERY VERY BIG XML file to parse.
My application must open this file and retrieve data. It must also insert new nodes and save the final result in the file again.
To do this I used, at the beginning, rapidxml, but it requires me to open the file, parse it all (all the content because this lib has no functions to access the file directly without loading the entire tree first), then edit the tree, modify it and store the final tree on the file by overwriting it... It consumes too much resources.
Is there an XML parser that does not require me to load the entire file, but that I can use to insert, quickly, new nodes and retrieve data? Can you please indicate solutions for this problem of mine?
You want a streaming XML parser rather than what is called a DOM parser.
There are two types of streaming parsers: pull and push. A pull parser is good for quickly writing XML parsers that load data into program memory. A push parser is good for writing a program to translate one document to another (which is what you are trying to accomplish). I think, therefore, that a push parser would be best for your problem.
In order to use a push parser, you need to write what is essentially an event handler for parsing events. By "parsing event", I mean events like "start tag reached", "end tag reached", "text found", "attribute parsed", etc.
I suggest that as you read in the document, you write out the transformed document to a separate, temporary file. Thus, your XML parsing event handlers will need to be written so that they are stateful and write out the XML of the translated document incrementally.
Three excellent push parser libraries for C++ include Expat, Xerces-C++, and libxml2.
Search for "SAX parser". They are mostly tokenizers, i.e. they emit tag by tag without building a tree.
SAX parsers are faster than DOM parsers because DOM parsers read the entire file into memory before building an in-memory representation of the XML document, whereas a SAX parser behaves like an event listener and builds the document as it reads in the file. Go here for an explanation.
As you mentioned Xerces is a good C++ SAX parser.
I would recommend looking into ways of breaking the XML document into smaller XML documents as that seems to be part of your problem.
Okay, here is one off the beaten track, I looked at this, but haven't really used it myself, it's called asmxml. These boys claim performance bar none, downside, you need x86 assembler.
If you really seek high performance XML stream parser then libhpxml is likely the right thing for you.
I’m convinced that no XML library exists that allows you to modify a file without loading it first. This simply isn’t possible because files don’t work that way: you cannot insert (or remove) in the middle of a file. You can only overwrite a block of identical size, or append at the end. But your request would require to append or remove in the middle of the file.
Reading only parts of an XML file may be possible. But writing … no way.
Go for template libraries as much as possible, like Boost::property_tree or Boost::XMLParser or POCO::XML and Folly has XML Parser in it.
Avoid old C libraries, it all old code designs.
someone say QtXML module is high performance for huge XML files.
(Not sure if this should be CW or not, you're welcome to comment if you think it should be).
At my workplace, we have many many different file formats for all kinds of purposes. Most, if not all, of these file formats are just written in plain text, with no consistency. I'm only a student working part-time, and I have no experience with using xml in production, but it seems to me that using xml would improve productivity, as we often need to parse, check and compare these outputs.
So my questions are: given that I can only control one small application and its output (only - the inputs are formats that are used in other applications as well), is it worth trying to change the output to be xml-based? If so, what are the best known ways to do that in C++ (i.e., xml parsers/writers, etc.)? Also, should I also provide a plain-text output to make it easy for the users (which are also programmers) to get used to xml? Should I provide a script to translate xml-plaintext? What are your experiences with this subject?
Thanks.
Don't just use XML because it's XML.
Use XML because:
other applications (that only accept XML) are going to read your output
you have an hierarchical data structure that lends itself perfectly for XML
you want to transform the data to other formats using XSL (e.g. to HTML)
EDIT:
A nice personal experience:
Customer: your application MUST be able to read XML.
Me: Er, OK, I will adapt my application so it can read XML.
Same customer (a few days later): your application MUST be able to read fixed width files, because we just realized our mainframe cannot generate XML.
Amir, to parse an XML you can use TinyXML which is incredibly easy to use and start with. Check its documentation for a quick brief, and read carefully the "what it does not do" clause. Been using it for reading and all I can say is that this tiny library does the job, very well.
As for writing - if your XML files aren't complex you might build them manually with a string object. "Aren't complex" for me means that you're only going to store text at most.
For more complex XML reading/writing you better check Xerces which is heavier than TinyXML. I haven't used it yet I've seen it in production and it does deliver it.
You can try using the boost::property_tree class.
http://www.boost.org/doc/libs/1_43_0/doc/html/property_tree.html
http://www.boost.org/doc/libs/1_43_0/doc/html/boost_propertytree/tutorial.html
http://www.boost.org/doc/libs/1_43_0/doc/html/boost_propertytree/parsers.html#boost_propertytree.parsers.xml_parser
It's pretty easy to use, but the page does warn that it doesn't support the XML format completely. If you do use this though, it gives you the freedom to easily use XML, INI, JSON, or INFO files without changing more than just the read_xml line.
If you want that ability though, you should avoid xml attributes. To use an attribute, you have to look at the key , which won't transfer between filetypes (although you can manually create your own subnodes).
Although using TinyXML is probably better. I've seen it used before in a couple of projects I've worked on, but don't have any experience with it.
Another approach to handling XML in your application is to use a data binding tool, such as CodeSynthesis XSD. Such a tool will generate C++ classes that hide all the gory details of parsing/serializing XML -- all that you see are objects corresponding to your XML vocabulary and functions that you can call to get/set the data, for example:
Person p = person ("person.xml");
cout << p.name ();
p.name ("John");
p.age (30);
ofstream ofs ("person.xml");
person (ofs, p);
Here's what previous SO threads have said on the topic. Please add others you know of that are relevant:
What is the best open XML parser for C++?
What is XML good for and when should i be using it?
What are good alternative data formats to XML?
BTW, before you decide on an XML parser, you may want to make sure that it will actually be able to parse all XML documents instead of just the "simple" ones, as discussed in this article:
Are you using a real XML parser?