Build an XML tree from scratch - pugixml C++

First, some background: I have been using an XML parser written by Frank Vanden Berghen and am now trying to migrate to pugixml. I am finding the transition a bit difficult and hope to get some help here.
Question: How can I build a tree from scratch for the small XML below using the pugixml API? I looked at the examples on the pugixml home page, but most of them parse a hard-coded string that already contains the root node. What I mean is that
if (!doc.load("<node id='123'>text</node><!-- comment -->", pugi::parse_default | pugi::parse_comments)) return -1;
is hard-coded. I also read the xml_document and xml_node documentation, but could not figure out where to start if I have to build a tree from scratch.
#include "pugixml.hpp"
#include <string.h>
#include <iostream>
int main()
{
    pugi::xml_document doc;
    if (!doc.load("<node id='123'>text</node><!-- comment -->", pugi::parse_default | pugi::parse_comments)) return -1;
    //[code_modify_base_node
    pugi::xml_node node = doc.child("node");
    // change node name
    std::cout << node.set_name("notnode");
    std::cout << ", new node name: " << node.name() << std::endl;
    // change comment text
    std::cout << doc.last_child().set_value("useless comment");
    std::cout << ", new comment text: " << doc.last_child().value() << std::endl;
    // we can't change value of the element or name of the comment
    std::cout << node.set_value("1") << ", " << doc.last_child().set_name("2") << std::endl;
    //]
    //[code_modify_base_attr
    pugi::xml_attribute attr = node.attribute("id");
    // change attribute name/value
    std::cout << attr.set_name("key") << ", " << attr.set_value("345");
    std::cout << ", new attribute: " << attr.name() << "=" << attr.value() << std::endl;
    // we can use numbers or booleans
    attr.set_value(1.234);
    std::cout << "new attribute value: " << attr.value() << std::endl;
    // we can also use assignment operators for more concise code
    attr = true;
    std::cout << "final attribute value: " << attr.value() << std::endl;
    //]
}
XML:
<?xml version="1.0" encoding="UTF-8"?>
<d:testrequest xmlns:d="DAV:" xmlns:o="urn:example.com:testdrive">
  <d:basicsearch>
    <d:select>
      <d:prop>
        <o:versionnumber/>
        <d:creationdate />
      </d:prop>
    </d:select>
    <d:from>
      <d:scope>
        <d:href>/</d:href>
        <d:depth>infinity</d:depth>
      </d:scope>
    </d:from>
    <d:where>
      <d:like>
        <d:prop>
          <o:name />
        </d:prop>
        <d:literal>%img%</d:literal>
      </d:like>
    </d:where>
  </d:basicsearch>
</d:testrequest>
Most of the examples I could find show how to read or parse XML, but I could not find how to create a document from scratch.

Please refer to the following section of the manual https://github.com/zeux/pugixml/blob/master/docs/manual.html#manual.modify.add and to the following sample code https://github.com/zeux/pugixml/blob/master/docs/samples/modify_add.cpp

The pugixml home page gives sample code for building an XML tree from scratch.
Summary: Default-construct a pugi::xml_document doc, then call append_child on it to create the root node. In general, you insert a node first; the value returned by the insertion call is a handle that you then use to fill in that node (name, attributes, children).
Constructing xml tree

Related

QT 5.11.3 - QDomNode : can't print value of dom element

I'm trying to write a simple function that reads a very simple XML file and prints its content to the Qt Creator console.
I created the following XML file :
<SCANNERS>
<SCANNER>
<NAME>Test scanner</NAME>
<SERIAL>10102030</SERIAL>
</SCANNER>
<SCANNER>
<NAME>Test scanner 2</NAME>
<SERIAL>10102031</SERIAL>
</SCANNER>
<SCANNER>
<NAME>Test scanner 3</NAME>
<SERIAL>10102032</SERIAL>
</SCANNER>
<SCANNER>
<NAME>Test scanner 4</NAME>
<SERIAL>10102033</SERIAL>
</SCANNER>
<SCANNER>
<NAME>Test scanner 5</NAME>
<SERIAL>10102034</SERIAL>
</SCANNER>
</SCANNERS>
Then I created the following function, which is supposed to print each node inside each "SCANNER" tag:
void printDomDocument(QString xml)
{
    QDomDocument xmlScanners;
    QFile file(xml);
    if (!file.open(QIODevice::ReadOnly))
    {
        std::cout << "QScannerEntryList : couldn't open XML file : " << xml.toStdString() << std::endl;
    }
    if (xmlScanners.setContent(&file))
    {
        QDomElement elem = xmlScanners.documentElement();
        QDomNode n = elem.firstChild();
        while (!n.isNull())
        {
            QDomElement e = n.toElement(); // try to convert the node to an element.
            if (!e.isNull())
            {
                QDomNode n2 = e.firstChild();
                std::cout << n2.nodeName().toStdString() << " " << n2.nodeValue().toStdString() << std::endl;
                n2 = n2.nextSibling();
                std::cout << n2.nodeName().toStdString() << " " << n2.nodeValue().toStdString() << std::endl;
            }
            n = n.nextSibling();
        }
    }
    else
    {
        std::cout << "QScannerEntryList : couldn't grab content from XML file : " << xml.toStdString() << std::endl;
    }
    file.close();
}
My problem is, I can print the tagnames of each node perfectly, but for some reason I can't manage to print the values inside each of these tags. n2.nodeValue() doesn't show on the console.
Is there something I am doing wrong ?
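Found out what's wrong.
The text is not the value of the NAME or SERIAL element itself. In the DOM it is stored in a child text node, one level below the element, and nodeValue() is empty for element nodes.
The solution is simply to "dig" one layer deeper (calling n2.toElement().text() would also work):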
QDomNode n2 = e.firstChild();
std::cout << n2.nodeName().toStdString() << " " << n2.firstChild().nodeValue().toStdString() << std::endl;
n2 = n2.nextSibling();
std::cout << n2.nodeName().toStdString() << " " << n2.firstChild().nodeValue().toStdString() << std::endl;
Returns the expected result :
NAME Test scanner
SERIAL 10102030
NAME Test scanner 2
SERIAL 10102031
NAME Test scanner 3
SERIAL 10102032
NAME Test scanner 4
SERIAL 10102033
NAME Test scanner 5
SERIAL 10102034

Reading Json file's root in c++ with jsoncpp

File:
{
"somestring":{
"a":1,
"b":7,
"c":17,
"d":137,
"e":"Republic"
},
}
How can I read the somestring key (the root's member name) with jsoncpp?
Use the getMemberNames() method.
Json::Value root;
std::istringstream in(jsonString); // requires <sstream>
in >> root;                        // parse the JSON text into root
Json::Value::Members propNames = root.getMemberNames();
std::string firstProp = propNames[0];
std::cout << firstProp << '\n'; // should print somestring
If you want to see all the properties, you can loop over the names:
for (const auto& name : propNames) {
    std::cout << "Property: " << name << " Value: " << root[name].asString() << "\n";
}
This simple loop only works for properties whose values can be converted to strings. asString() does not accept an object value such as somestring in your example, so to handle nested objects you'll need to make it recursive, which I'm leaving as an exercise for the reader.

Parsing XML file with RapidXML - only parsing first line of files

I am having trouble with RapidXML only parsing the first line of my file (or so it appears). When I feed in a sample file, it only gets the first node (“map”) and nothing else. I set a breakpoint in Xcode after the parsing to inspect the result, and the majority of the attributes seem to be NULL. Does anyone have any recommendations on how to fix this? It is my understanding that the parser is supposed to produce some form of tree-like structure. Perhaps I have misunderstood the resulting data structure?
Here is my usage:
#include <iostream>
#include "rapidxml_utils.hpp"
using namespace std;
int main(){
    rapidxml::file<> xmlFile("sample.txt.xml");
    rapidxml::xml_document<> doc;
    doc.parse<0>(xmlFile.data());
    cout << "Name of my first node is: " << doc.first_node()->name() << "\n";
    rapidxml::xml_node<> *node = doc.first_node("map");
    cout << "Node map has value " << node->value() << "\n";
    for (rapidxml::xml_attribute<> *attr = node->first_attribute();
         attr; attr = attr->next_attribute())
    {
        cout << "Node foobar has attribute " << attr->name() << " ";
        cout << "with value " << attr->value() << "\n";
    }
}
Here is an example of the file I am trying to parse:
<?xml version="1.0" encoding="utf-8"?>
<map>
<room>
<name>Entrance</name>
<description>You find yourself at the mouth of a cave</description>
<item>torch</item>
<trigger>
<type>permanent</type>
<command>n</command>
<condition>
<has>no</has>
<object>torch</object>
<owner>inventory</owner>
</condition>
<print>*stumble* need some light...</print>
</trigger>
<border>
<direction>north</direction>
<name>MainCavern</name>
</border>
</room>
</map>
You're confusing XML attributes and elements.
Attributes would look like this: <map name="Zork" author="Infocom">
Your file contains only elements, so first_attribute() returns null for map and the loop prints nothing. If you want to iterate over all elements in the tree, you need a recursive algorithm that uses the RapidXML first_node() and next_sibling() methods.

Pugixml C++ parsing XML

I am new to pugixml. Consider the XML given here. I want to get the values of Name and Roll for every student. The code below only finds the tags, not the values.
#include <iostream>
#include "pugixml.hpp"
int main()
{
std::string xml_mesg = "<data> \
<student>\
<Name>student 1</Name>\
<Roll>111</Roll>\
</student>\
<student>\
<Name>student 2</Name>\
<Roll>222</Roll>\
</student>\
<student>\
<Name>student 3</Name>\
<Roll>333</Roll>\
</student>\
</data>";
pugi::xml_document doc;
doc.load_string(xml_mesg.c_str());
pugi::xml_node data = doc.child("data");
for(pugi::xml_node_iterator it=data.begin(); it!=data.end(); ++it)
{
for(pugi::xml_node_iterator itt=it->begin(); itt!=it->end(); ++itt)
std::cout << itt->name() << " " << std::endl;
}
return 0;
}
I want the output to include the Name and Roll values for each student. How can I modify the above code? Also, the tester linked here (press Test) suggests that I can write XPath directly, which pugixml supports. If so, how can I get the values I need using XPath in pugixml?
Here's how you can do it with just Xpath:
pugi::xpath_query student_query("/data/student");
pugi::xpath_query name_query("Name/text()");
pugi::xpath_query roll_query("Roll/text()");
pugi::xpath_node_set xpath_students = doc.select_nodes(student_query);
for (pugi::xpath_node xpath_student : xpath_students)
{
// Since Xpath results can be nodes or attributes, you must explicitly get
// the node out with .node()
pugi::xml_node student = xpath_student.node();
pugi::xml_node name = student.select_node(name_query).node();
pugi::xml_node roll = student.select_node(roll_query).node();
std::cout << "Student name: " << name.value() << std::endl;
std::cout << " roll: " << roll.value() << std::endl;
}
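I think the reason you are getting the tags instead of their values is that you are calling name(). Note that simply replacing itt->name() with itt->value() is not enough: for an element node value() is empty, because the text is stored in a child PCDATA node. Use itt->child_value() or itt->text().get() instead.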
I found some good documentation about accessing document data here
Thanks @Cornstalks for the insight about using XPath in pugixml. I used child_value, described here. My code became:
for(pugi::xml_node_iterator it=data.begin(); it!=data.end(); ++it)
{
for(pugi::xml_node_iterator itt=it->begin(); itt!=it->end(); ++itt)
std::cout << itt->name() << " " << itt->child_value() << " " << std::endl;
}
I could also use XPath as @Cornstalks suggested, which makes my code:
pugi::xml_document doc;
doc.load_string(xml_mesg.c_str());
pugi::xpath_query student_query("/data/student");
pugi::xpath_query name_query("Name/text()");
pugi::xpath_query roll_query("Roll/text()");
pugi::xpath_node_set xpath_students = doc.select_nodes(student_query);
for (pugi::xpath_node xpath_student : xpath_students)
{
// Since Xpath results can be nodes or attributes, you must explicitly get
// the node out with .node()
pugi::xml_node student = xpath_student.node();
pugi::xml_node name = student.select_node(name_query).node();
pugi::xml_node roll = student.select_node(roll_query).node();
std::cout << "Student name: " << name.value() << std::endl;
std::cout << " roll: " << roll.value() << std::endl;
}
In your inner loop, change the following line to get the values (student 1, 111, and so on):
std::cout << itt->text().get() << " " << std::endl;

boost recognize a child

My question is related to Boost.
Some of the Boost code correctly detects that a node has a child, but if a node has two child nodes it does not recognize the children.
It is a recursive call that should read all the tree nodes and then copy each value into the Google Protocol Buffer message.
void ReadXML(iptree& tree, string doc)
{
const GPF* gpf= pMessage->GetGPF();
for(int i = 0 ; i < gpf->field_count(); ++i)
{
string fieldName = GetName(i);
boost::optional< iptree & > chl = pt.get_child_optional(fieldName);
if(chl) {
for( auto a : *chl ){
boost::property_tree::iptree subtree = (boost::property_tree::iptree) a.second ;
assignDoc(doc);
ReadXML(subtree, doc);
}
}
}
}
the XML file
<?xml version="1.0" encoding="utf-8"?>
<nodeA xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<nodeA.1>This is the Adresse</nodeA.1>
<nodeA.2>
<node1>
<node1.1>
<node1.1.1>Female</node1.1.1>
<node1.1.2>23</node1.1.2>
<node1.1.3>Engineer</node1.1.3>
</node1.1>
<node1.1>
<node1.2.1>Female</node1.2.1>
<node1.2.2>35</node1.2.2>
<node1.2.3>Doctors</node1.2.3>
</node1.1>
</node1>
</nodeA.2>
<nodeA.3>Car 1</nodeA.3>
</nodeA>
My problem is that node1 is not recognised as having children. I don't know if it's because there are two child nodes with the same name.
Note that the XML files may change from one client to another. I may have different nodes.
Do I have to use a.second or a.first?
Here
boost::optional< iptree & > chl = pt.get_child_optional(fieldName);
you explicitly search for a child with a given name. This name never seems to change during recursion: on every level you look for children with the same names.
I think you could/should be looking at this problem from a higher level.
Boost Property Tree uses RapidXML under the hood. PugiXML is a similar, but more modern library that can also be used in header-only mode. With PugiXML you could write:
pugi::xml_document doc;
doc.load(iss);
for (auto& node : doc.select_nodes("*/descendant::*[count(*)=3]/*[count(*)=0]/.."))
{
auto values = node.node().select_nodes("*/text()");
std::cout << "Gender " << values[0].node().value() << "\n";
std::cout << "Age " << values[1].node().value() << "\n";
std::cout << "Job Title " << values[2].node().value() << "\n";
}
It selects every descendant of the root node (nodeA) that has exactly three child elements, at least one of which is a leaf, and interprets the text of those three children as Gender, Age and Job Title. It prints:
Gender Female
Age 23
Job Title Engineer
Gender Female
Age 35
Job Title Doctors
I hope you will find this constructive.
Full Demo
To build on my system, simply run:
sudo apt-get install libpugixml-dev
g++ -std=c++11 demo.cpp -lpugixml -o demo
./demo
demo.cpp:
#include <pugiconfig.hpp>
#define PUGIXML_HEADER_ONLY
#include <pugixml.hpp>
#include <iostream>
#include <sstream>
int main()
{
std::istringstream iss("<?xml version=\"1.0\" encoding=\"utf-8\"?>\n"
"<nodeA xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\">"
"<nodeA.1>This is the Adresse</nodeA.1>"
"<nodeA.2>"
"<node1>"
"<node1.1>"
"<node1.1.1>Female</node1.1.1>"
"<node1.1.2>23</node1.1.2>"
"<node1.1.3>Engineer</node1.1.3>"
"</node1.1>"
"<node1.2>"
"<node1.2.1>Female</node1.2.1>"
"<node1.2.2>35</node1.2.2>"
"<node1.2.3>Doctors</node1.2.3>"
"</node1.2>"
"</node1>"
"</nodeA.2>"
"<nodeA.3>Car 1</nodeA.3>"
"</nodeA>");
pugi::xml_document doc;
doc.load(iss);
for (auto& node : doc.select_nodes("*/descendant::*[count(*)=3]/*[count(*)=0]/.."))
{
auto values = node.node().select_nodes("*/text()");
std::cout << "Gender " << values[0].node().value() << "\n";
std::cout << "Age " << values[1].node().value() << "\n";
std::cout << "Job Title " << values[2].node().value() << "\n";
}
//doc.save(std::cout);
}