I am a latecomer to XML and have to parse an XML file. Our company already uses Xerces, so I managed to cobble together a sample SAX app that displays all the data in a file. However, once parsing is complete, I expected to be able to call the parser, or some other entity holding an internal representation of the file, and iterate through the fields/data.
Basically I want to be able to hand it some key or other string(s) and get back strings or collections of key/value pairs. I do not see a way to do that, even though it seems like an obvious thing to offer. Am I missing something?
Is DOM parsing what I want, or does that fall short too?
Xerces provides both SAX and DOM processing. SAX parsing doesn't construct a model, so once parsing is finished there is nothing to examine or iterate through. DOM processing produces a tree-structured model which gives you what you want.
Check out the beginner's sample in the YoLinux Tutorial on Parsing XML.
If you use the XercesDOMParser, there is still no way to request a specific key/value pair after the document is parsed. I ran into the same problem recently, and while iterating through the DOM tree I stored all the key/value pairs in an STL map. You can then look up key/value pairs in the map later in the program.
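The map-based approach described above can be sketched roughly as follows. This is a minimal illustration, not Xerces code: the Node struct is a hypothetical stand-in for a parsed DOM element, and collect, buildIndex, and the slash-separated key format are assumptions made for the example. With Xerces you would walk the real DOM nodes instead, but the bookkeeping would likely look similar.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed DOM element; a real Xerces tree would be
// walked through its own node interface instead.
struct Node {
    std::string name;            // element name
    std::string text;            // text content, empty for container elements
    std::vector<Node> children;  // child elements in document order
};

// Walk the tree depth-first and record every leaf as "path/to/leaf" -> text.
void collect(const Node& node, const std::string& prefix,
             std::map<std::string, std::string>& out) {
    const std::string path = prefix.empty() ? node.name : prefix + "/" + node.name;
    if (node.children.empty()) {
        out[path] = node.text;  // a later duplicate path overwrites an earlier one
        return;
    }
    for (const Node& child : node.children) {
        collect(child, path, out);
    }
}

// Convenience wrapper: build the lookup table for a whole document.
std::map<std::string, std::string> buildIndex(const Node& root) {
    std::map<std::string, std::string> index;
    collect(root, "", index);
    return index;
}
```

Once the index is built, a lookup such as index.at("config/host") replaces a repeated tree walk. A std::map keeps only one value per path, so documents with repeated elements would need a std::multimap or a vector as the mapped type.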
I have a question about my internship project. They want me to create a basic login page (ID, password). I created an XML file for the username and password. The program should check the XML file for the username and password. If they are correct, it will direct the user to a second window. I'm stuck on processing the XML file for the username and password. How can I read that information from the XML file?
As @JarMan said, I would recommend the QXmlStreamReader. You can feed it from a file (QIODevice), a QString, a QByteArray, etc.
Reading an attribute value could, for example, look like this:
xml.attributes().value( attribute ).toString();
where attribute is a QString and xml is the QXmlStreamReader.
See the documentation: https://doc.qt.io/qt-5/qxmlstreamreader.html
There are several ways to do it. Marris mentioned one, but another is to have this kind of code generated automatically. The way this works is that you first write an XML Schema that describes what your XML data looks like. An introduction to the XML Schema language can be found e.g. here.
Then you use an XML Schema compiler to translate the XML Schema to C++ classes. The schema compiler will also generate code to parse an XML file into objects, meaning you don't have to write any code to deal with XML by hand. It's a purely declarative approach: declare what the data looks like, and let the computer figure out the details.
I am specifically looking to parse the XSLT to retrieve the fields of the input XML and to recover the logic that maps the input data to the generated output.
I am not sure, but have I been given the task of creating an XSLT parser, like the submodule found in a browser?
It is more like reverse engineering some code to recover the source fields and map them to the destination data.
I have an XML file with the following structure:
<JobList>
<Job><subnodes/></Job>
<Job><subnodes/></Job>
</JobList>
This XML is sometimes broken: the closing </JobList> tag and a closing </Job> tag may be missing.
I would like to extract, with their full content, the <Job> nodes that are properly closed with </Job>. What is the best way to do this?
To make a long story short, I am using .NET and its built-in serializers to deserialize XML content. But since new properties get added, you cannot just go back and forth between different versions, as it is too strict. Mostly it works, but I would like to have a backup recovery method for this, hence the question.
The current situation is that the deserializer aborts the whole deserialization when a new property has been added instead of ignoring it. I am looking to parse the content manually when that error occurs.
As mentioned in the comments, the ideal solution would be to make the XML valid. If for whatever reason that is not possible, the workaround is to parse the file as text with a regex.
A general regex for this case could be something like:
<Job>((?!<Job>).)*</Job>$
This matches everything between a complete <Job>...</Job> pair.
Please note that this will also return nodes with 'broken' inner nodes, but according to your question you are only concerned about missing </JobList> and </Job> tags.
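The regex idea above can be sketched in C++ with std::regex, kept in one language with the other examples here; the same pattern should carry over to other engines that support lookahead, including .NET. This is an illustration under stated assumptions, not a robust XML recovery tool: the function name extractCompleteJobs is made up for the example, <Job> is assumed to carry no attributes, the trailing $ anchor is left out so that matches are not tied to the end of a line, and [\s\S] is used so that a job may span several lines.

```cpp
#include <regex>
#include <string>
#include <vector>

// Return every <Job>...</Job> block that is properly closed.
std::vector<std::string> extractCompleteJobs(const std::string& xml) {
    // [\s\S] matches any character including newlines; the negative lookahead
    // stops a match from running across the start of another <Job>, so an
    // unclosed job is skipped instead of swallowing the next one.
    static const std::regex jobPattern(R"(<Job>(?:(?!<Job>)[\s\S])*?</Job>)");

    std::vector<std::string> jobs;
    for (std::sregex_iterator it(xml.begin(), xml.end(), jobPattern), end;
         it != end; ++it) {
        jobs.push_back(it->str());
    }
    return jobs;
}
```

Each returned string is a complete element that can be handed to a regular XML parser or deserializer on its own. For very large inputs, some std::regex implementations are slow or use deep recursion, so a hand-written scan for the two tags may be the safer choice.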
The XML below represents a FIX message. This message has a variable number of fields (numbered using the id tag), each containing differing attributes. So I would like to parse this XML and with my additional coding abilities output a C++ message object which includes all the attribute information per field.
My first question is: is there a Boost library which I can use to do this? My second question is what the interface is between what the XML parser can provide and where I have to write code to create the objects. So for example, in the XML on line 8 there is a <delta/> tag and this is an attribute of the object. So for field 52 (line 8) the attribute would be a Delta subtype object, but for line 9 the attribute would be a Copy subtype object. I would like to store these subtypes in an std::unordered_map with the field ID being the key.
I guess another way of wording this is: what "end result" will the XML parser give me to help build the objects the way I want them?
You should probably use one of the many commonly used XML parsers; Xerces and TinyXML are two possibilities. There are more. Google is your friend.
You want to run in SAX mode rather than DOM mode (the documentation for the parser you choose will explain). That means the parser will call code you supply for each element and attribute it parses rather than building an arbitrary structure in memory that doesn't match your domain-specific needs.
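A callback-driven build of the field map might look roughly like the sketch below. Everything about the interface is an assumption made for illustration: MessageBuilder, FieldOperator, the startElement/endElement signatures, and the <field id="..."> element shape are hypothetical and are not the API of Xerces or any other parser. Only the Delta and Copy subtypes and the unordered_map keyed by field ID come from the question. In a real program the parser's own handler class would forward its events to something like this.

```cpp
#include <memory>
#include <string>
#include <unordered_map>

// Base class for the per-field objects; Delta and Copy are the subtypes
// named in the question.
struct FieldOperator {
    virtual ~FieldOperator() = default;
    virtual std::string kind() const = 0;
};
struct Delta : FieldOperator {
    std::string kind() const override { return "delta"; }
};
struct Copy : FieldOperator {
    std::string kind() const override { return "copy"; }
};

using Attributes = std::unordered_map<std::string, std::string>;

// Hypothetical SAX-style handler: the parser is expected to call
// startElement/endElement for each tag as it reads the document.
class MessageBuilder {
public:
    void startElement(const std::string& name, const Attributes& attrs) {
        if (name == "field") {
            // Remember which field we are inside; std::stoi throws if the id
            // attribute is not a number.
            const auto id = attrs.find("id");
            currentId_ = (id != attrs.end()) ? std::stoi(id->second) : -1;
        } else if (currentId_ >= 0) {
            // A child tag selects the subtype stored for the current field.
            if (name == "delta") {
                fields_[currentId_] = std::make_unique<Delta>();
            } else if (name == "copy") {
                fields_[currentId_] = std::make_unique<Copy>();
            }
        }
    }

    void endElement(const std::string& name) {
        if (name == "field") currentId_ = -1;  // leaving the field's scope
    }

    const std::unordered_map<int, std::unique_ptr<FieldOperator>>& fields() const {
        return fields_;
    }

private:
    int currentId_ = -1;  // -1 means "not inside a <field>"
    std::unordered_map<int, std::unique_ptr<FieldOperator>> fields_;
};
```

The "end result" in this style is whatever the handler accumulates: the parser itself only delivers a stream of element and attribute events, and the domain objects are built entirely in your own callbacks.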
I want a program that parses an XML file, builds a structure with the tags I need, and finally prints an HTML report using HTML templates with keywords that get replaced by the data from the XML files.
Since I'm not (yet) really into OO programming, I hoped to get some tips and advice on how to structure a program like this.
I thought that two classes should be enough: a parser class and a data class.
The first would go through the XML file and report every tag I want to keep to a data object, which stores all the tags in hierarchical order. After that I want to call a print function which prints everything as an HTML report.
I'm not sure how to report the tags to the data object.
Could I store the tags in one object which stores a tree of structs, or would it be better to store each tag in a separate object?
Any help would be greatly appreciated!
You don't mention Qt in your question, but as you added it as a tag: there is QtXML, which will give a way to parse and generate XML documents, and will also work for HTML output. XML is typically handled either via DOM or SAX. With DOM, the documents are parsed into a tree structure, and you will work on the tree as your central data element. With SAX, you use callback functions that are called for the different XML elements while parsing the XML input.
There is a lot about DOM and SAX on the internet, Wikipedia is a good starting point. There is also a lot of documentation on QtXML on-line.
Using DOM and/or SAX will give a nice architecture for solving the problem.
I solved my problem and want to share my architecture.
I made a Parser class that parses the elements and reports the tags to an HTMLHandler class, which has subclasses like Header, Content and Sub-content. These store the data, and all have write() methods to print themselves out.
It works fine for me and is quite simple :)