Parsing XML Attributes with Boost - c++

I'm having an issue processing the attributes of XML elements in C++ with the Boost libraries (version 1.52.0). Given the following code:
#define ATTR_SET ".<xmlattr>"
#define XML_PATH1 "./pets.xml"
#include <iostream>
#include <string>
#include <boost/foreach.hpp>
#include <boost/property_tree/ptree.hpp>
#include <boost/property_tree/xml_parser.hpp>
using namespace std;
using namespace boost;
using namespace boost::property_tree;
const ptree& empty_ptree(){
static ptree t;
return t;
}
int main() {
ptree tree;
read_xml(XML_PATH1, tree);
const ptree & formats = tree.get_child("pets", empty_ptree());
BOOST_FOREACH(const ptree::value_type & f, formats){
string at = f.first + ATTR_SET;
const ptree & attributes = formats.get_child(at, empty_ptree());
cout << "Extracting attributes from " << at << ":" << endl;
BOOST_FOREACH(const ptree::value_type &v, attributes){
cout << "First: " << v.first.data() << " Second: " << v.second.data() << endl;
}
}
}
Let's say I have the following XML structure:
<?xml version="1.0" encoding="utf-8"?>
<pets>
<cat name="Garfield" weight="4Kg">
<somestuff/>
</cat>
<dog name="Milu" weight="7Kg">
<somestuff/>
</dog>
<bird name="Tweety" weight="0.1Kg">
<somestuff/>
</bird>
</pets>
With this file, the console output is:
Extracting attributes from cat.<xmlattr>:
First: name Second: Garfield
First: weight Second: 4Kg
Extracting attributes from dog.<xmlattr>:
First: name Second: Milu
First: weight Second: 7Kg
Extracting attributes from bird.<xmlattr>:
First: name Second: Tweety
First: weight Second: 0.1Kg
However, if I use a common element name for every child of the root node (and identify each one by its attributes instead), the result changes completely. The XML file in that case might look like this:
<?xml version="1.0" encoding="utf-8"?>
<pets>
<pet type="cat" name="Garfield" weight="4Kg">
<somestuff/>
</pet>
<pet type="dog" name="Milu" weight="7Kg">
<somestuff/>
</pet>
<pet type="bird" name="Tweety" weight="0.1Kg">
<somestuff/>
</pet>
</pets>
And the output would be the following:
Extracting attributes from pet.<xmlattr>:
First: type Second: cat
First: name Second: Garfield
First: weight Second: 4Kg
Extracting attributes from pet.<xmlattr>:
First: type Second: cat
First: name Second: Garfield
First: weight Second: 4Kg
Extracting attributes from pet.<xmlattr>:
First: type Second: cat
First: name Second: Garfield
First: weight Second: 4Kg
It seems the number of elements hanging from the root node is being properly recognized since three sets of attributes have been printed. Nevertheless, all of them refer to the attributes of the very first element...
I'm not an expert in C++ and really new to Boost, so this might be something I'm missing with respect to hash mapping processing or so... Any advice will be much appreciated.

The problem with your program is located in this line:
const ptree & attributes = formats.get_child(at, empty_ptree());
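This line asks pets for the child at path pet.<xmlattr>, and it does so three times regardless of which f the loop is currently visiting. Because all three children share the key pet, the path lookup resolves to the same child every time (in practice the first one; the Boost documentation leaves the choice among duplicate keys unspecified). Following this article, I'd guess that what you need to use is: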
const ptree & attributes = f.second.get_child("<xmlattr>", empty_ptree());
The full code, that works with both your xml files, is:
#define ATTR_SET ".<xmlattr>"
#define XML_PATH1 "./pets.xml"
#include <iostream>
#include <string>
#include <boost/foreach.hpp>
#include <boost/property_tree/ptree.hpp>
#include <boost/property_tree/xml_parser.hpp>
using namespace std;
using namespace boost;
using namespace boost::property_tree;
const ptree& empty_ptree(){
static ptree t;
return t;
}
int main() {
ptree tree;
read_xml(XML_PATH1, tree);
const ptree & formats = tree.get_child("pets", empty_ptree());
BOOST_FOREACH(const ptree::value_type & f, formats){
string at = f.first + ATTR_SET;
const ptree & attributes = f.second.get_child("<xmlattr>", empty_ptree());
cout << "Extracting attributes from " << at << ":" << endl;
BOOST_FOREACH(const ptree::value_type &v, attributes){
cout << "First: " << v.first.data() << " Second: " << v.second.data() << endl;
}
}
}

Without having used this feature myself, I suspect that the boost::property_tree XML parser isn't a general-purpose XML parser, but expects a certain schema in which each property has exactly one specific tag.
If you want to work with XML beyond boost::property_tree's capabilities, you might prefer another XML parser that can handle arbitrary XML schemas. Have a look at e.g. Xerces-C++ or Poco XML.

File to be parsed, pets.xml
<pets>
<pet type="cat" name="Garfield" weight="4Kg">
<something name="test" value="*"/>
<something name="demo" value="#"/>
</pet>
<pet type="dog" name="Milu" weight="7Kg">
<something name="test1" value="$"/>
</pet>
<birds type="parrot">
<bird name="african grey parrot"/>
<bird name="amazon parrot"/>
</birds>
</pets>
code:
// DemoPropertyTree.cpp : Defines the entry point for the console application.
//Prerequisite boost library
#include "stdafx.h"
#include <boost/property_tree/xml_parser.hpp>
#include <boost/property_tree/ptree.hpp>
#include <boost/foreach.hpp>
#include<iostream>
using namespace std;
using namespace boost;
using namespace boost::property_tree;
void processPet(const ptree& subtree)
{
BOOST_FOREACH(ptree::value_type petChild,subtree.get_child(""))
{
//processing attributes of element pet
if(petChild.first=="<xmlattr>")
{
BOOST_FOREACH(ptree::value_type petAttr,petChild.second.get_child(""))
{
cout<<petAttr.first<<"="<<petAttr.second.data()<<endl;
}
}
//processing child element of pet(something)
else if(petChild.first=="something")
{
BOOST_FOREACH(ptree::value_type somethingChild,petChild.second.get_child(""))
{
//processing attributes of element something
if(somethingChild.first=="<xmlattr>")
{
BOOST_FOREACH(ptree::value_type somethingAttr,somethingChild.second.get_child(""))
{
cout<<somethingAttr.first<<"="<<somethingAttr.second.data()<<endl;
}
}
}
}
}
}
void processBirds(const ptree& subtree)
{
BOOST_FOREACH(ptree::value_type birdsChild,subtree.get_child(""))
{
//processing attributes of element birds
if(birdsChild.first=="<xmlattr>")
{
BOOST_FOREACH(ptree::value_type birdsAttr,birdsChild.second.get_child(""))
{
cout<<birdsAttr.first<<"="<<birdsAttr.second.data()<<endl;
}
}
//processing child element of birds(bird)
else if(birdsChild.first=="bird")
{
BOOST_FOREACH(ptree::value_type birdChild,birdsChild.second.get_child(""))
{
//processing attributes of element bird
if(birdChild.first=="<xmlattr>")
{
BOOST_FOREACH(ptree::value_type birdAttr,birdChild.second.get_child(""))
{
cout<<birdAttr.first<<"="<<birdAttr.second.data()<<endl;
}
}
}
}
}
}
int _tmain(int argc, _TCHAR* argv[])
{
const std::string XML_PATH1 = "C:/Users/10871/Desktop/pets.xml";
ptree pt1;
boost::property_tree::read_xml( XML_PATH1, pt1 );
cout<<"********************************************"<<endl;
BOOST_FOREACH( ptree::value_type const& topNodeChild, pt1.get_child( "pets" ) )
{
ptree subtree = topNodeChild.second;
if( topNodeChild.first == "pet" )
{
processPet(subtree);
cout<<"********************************************"<<endl;
}
else if(topNodeChild.first=="birds")
{
processBirds(subtree);
cout<<"********************************************"<<endl;
}
}
getchar();
return 0;
}
The output is shown here:
[image: output]

Related

Getting an exception when reading values using BOOST_FOREACH from the JSON array in C++

I am getting the below error when reading the values using BOOST_FOREACH:
Unhandled exception at 0x76FCB502 in JSONSampleApp.exe: Microsoft C++ exception: boost::wrapexcept<boost::property_tree::ptree_bad_path> at memory location 0x00CFEB18.
Could someone help me read values from the array in the JSON format below?
#include <string>
#include <iostream>
#include <boost/property_tree/json_parser.hpp>
#include <boost/foreach.hpp>
using namespace std;
using boost::property_tree::ptree;
int main()
{
const char* f_strSetting = R"({"Class": [{"Student": {"Name":"John","Course":"C++"}}]})";
boost::property_tree::ptree pt1;
std::istringstream l_issJson(f_strSetting);
boost::property_tree::read_json(l_issJson, pt1);
BOOST_FOREACH(boost::property_tree::ptree::value_type & v, pt1.get_child("Class.Student"))
{
std::string l_strName;
std::string l_strCourse;
l_strName = v.second.get <std::string>("Name");
l_strCourse = v.second.get <std::string>("Course");
cout << l_strName << "\n";
cout << l_strCourse << "\n";
}
return 0;
}
You asked a very similar question yesterday. We told you not to abuse a property tree library to parse JSON. I even anticipated:
For more serious code you might want to use type-mapping
Here's how you'd expand from that answer to parse the entire array into a vector at once:
Live On Coliru
#include <boost/json.hpp>
#include <boost/json/src.hpp> // for header-only
#include <iostream>
#include <string>
#include <vector>
namespace json = boost::json;
struct Student {
std::string name, course;
friend Student tag_invoke(json::value_to_tag<Student>, json::value const& v) {
std::cerr << "DEBUG: " << v << "\n";
auto const& s = v.at("Student");
return {
json::value_to<std::string>(s.at("Name")),
json::value_to<std::string>(s.at("Course")),
};
}
};
using Class = std::vector<Student>;
int main()
{
auto doc = json::parse(R"({ "Class": [
{ "Student": { "Name": "John", "Course": "C++" } },
{ "Student": { "Name": "Carla", "Course": "Cobol" } }
]
})");
auto c = json::value_to<Class>(doc.at("Class"));
for (Student const& s : c)
std::cout << "Name: " << s.name << ", Course: " << s.course << "\n";
}
Printing
Name: John, Course: C++
Name: Carla, Course: Cobol
I even threw in a handy debug line in case you need help figuring out exactly what you get at some point:
DEBUG: {"Student":{"Name":"John","Course":"C++"}}
DEBUG: {"Student":{"Name":"Carla","Course":"Cobol"}}

Unhandled exception when trying to retrieve the value from the JSON ptree using Boost C++

I am getting the below error when reading the value from the JSON ptree using Boost C++
Unhandled exception at 0x7682B502 in JSONSampleApp.exe: Microsoft C++ exception :
boost::wrapexcept<boost::property_tree::ptree_bad_path> at memory location 0x00DFEB38.
Below is the program. Could someone please tell me what I am missing here?
#include <string>
#include <iostream>
#include <boost/property_tree/json_parser.hpp>
#include <boost/property_tree/ptree.hpp>
#include <boost/foreach.hpp>
using namespace std;
using boost::property_tree::ptree;
int main()
{
const char* f_strSetting = "{\"Student\": {\"Name\":\"John\",\"Course\":\"C++\"}}";
boost::property_tree::ptree pt1;
std::istringstream l_issJson(f_strSetting);
boost::property_tree::read_json(l_issJson, pt1);
BOOST_FOREACH(boost::property_tree::ptree::value_type & v, pt1.get_child("Student"))
{
std::string l_strColor;
std::string l_strPattern;
l_strColor = v.second.get <std::string>("Name");
l_strPattern = v.second.get <std::string>("Course");
}
return 0;
}
There is a shape mismatch between your code and your data:
The data is a plain nested dictionary: Student.Name is "John".
The code expects to see an array under the Student key, so it tries to fetch Student.0.Name, Student.1.Name, ... for every subitem of Student.
Either fix the code:
// Drop the BOOST_FOREACH
auto & l_Student = pt1.get_child("Student");
l_strColor = l_Student.get<std::string>("Name");
or fix the data:
// Note the extra []
const char * f_strSetting = R"({"Student": [{"Name":"John","Course":"C++"}]})";
Firstly, may I suggest modernizing and thus simplifying your code, while avoiding using directives:
#include <boost/property_tree/json_parser.hpp>
#include <string>
using boost::property_tree::ptree;
int main() {
ptree pt;
{
std::istringstream l_issJson( R"({"Student": {"Name":"John","Course":"C++"}})");
read_json(l_issJson, pt);
}
for(auto& [k,v] : pt.get_child("Student")) {
auto name = v.get<std::string>("Name");
auto course = v.get<std::string>("Course");
}
}
Secondly, you're selecting the wrong levels, as the other answer points out:
#include <boost/property_tree/json_parser.hpp>
#include <iostream>
#include <string>
using boost::property_tree::ptree;
int main() {
ptree pt;
{
std::istringstream l_issJson( R"({"Student": {"Name":"John","Course":"C++"}})");
read_json(l_issJson, pt);
}
auto name = pt.get<std::string>("Student.Name");
auto course = pt.get<std::string>("Student.Course");
std::cout << "Name: '" << name << "', Course: '" << course << "'\n";
}
See it Live
But the REAL problem is:
USE A JSON LIBRARY
Boost Property Tree is not a JSON library.
Boost JSON exists:
Live On Coliru
#include <boost/json.hpp>
#include <boost/json/src.hpp> // for header-only
#include <iostream>
#include <string>
namespace json = boost::json;
int main() {
auto pt = json::parse(R"({"Student": {"Name":"John","Course":"C++"}})");
auto& student = pt.at("Student");
auto name = student.at("Name").as_string();
auto course = student.at("Course").as_string();
std::cout << "Name: " << name << ", Course: " << course << "\n";
}
Prints
Name: "John", Course: "C++"
BONUS
For more serious code you might want to use type-mapping:
#include <boost/json.hpp>
#include <boost/json/src.hpp> // for header-only
#include <iostream>
#include <string>
namespace json = boost::json;
struct Student {
std::string name, course;
friend Student tag_invoke(json::value_to_tag<Student>, json::value const& v) {
return {
json::value_to<std::string>(v.at("Name")),
json::value_to<std::string>(v.at("Course")),
};
}
};
int main()
{
auto doc = json::parse(R"({"Student": {"Name":"John","Course":"C++"}})");
auto s = json::value_to<Student>(doc.at("Student"));
std::cout << "Name: " << s.name << ", Course: " << s.course << "\n";
}
See it Live On Coliru

boost::property_tree passing subtree including <xmlattr>

I'm trying to pass elements of a boost::property_tree::ptree to a function.
In detail, I have the following XML code from which a ptree is initialised:
<Master Name='gamma'>
<Par1 Name='name1'>
<Value>0.</Value>
<Fix>1</Fix>
</Par1>
<Par2 Name='name2'>
<Value>0.</Value>
<Fix>1</Fix>
</Par2>
</Master>
I would like to pass part of it to a function. Basically I want to pass:
<Par2 Name='name2'>
<Value>0.</Value>
<Fix>1</Fix>
</Par2>
The function could look like this:
void processTree( which_type_do_I_put_here element ){
std::string n = element.get<std::string>("<xmlattr>.Name");
double val = element.get<double>("Value");
}
In general I could pass a subtree using ptree::get_child("par2"). This has the disadvantage that the function has no access to <xmlattr> of this node.
How can I pass this part of the tree with access to <xmlattr>?
Thanks in advance for any ideas.
~Peter
The type is a ptree.
In general I could pass a subtree using ptree::get_child("par2").
Indeed.
This has the disadvantage that the function has no access to <xmlattr> of this node
That's not right:
Live On Coliru
#include <boost/property_tree/xml_parser.hpp>
#include <iostream>
std::string const sample = R"(
<Master Name='gamma'>
<Par1 Name='name1'>
<Value>0.</Value>
<Fix>1</Fix>
</Par1>
<Par2 Name='name2'>
<Value>0.</Value>
<Fix>1</Fix>
</Par2>
</Master>
)";
using boost::property_tree::ptree;
void processTree(ptree const& element) {
std::string n = element.get<std::string>("<xmlattr>.Name");
double val = element.get<double>("Value");
std::cout << __FUNCTION__ << ": n=" << n << " val=" << val << "\n";
}
int main() {
ptree pt;
{
std::istringstream iss(sample);
read_xml(iss, pt);
}
processTree(pt.get_child("Master.Par2"));
}
Which prints:
processTree: n=name2 val=0

Parsing an XML document

I want to parse an XML document in C++ and be able to identify what text exists in a particular tag. I have checked parsers like TinyXML and PugiXML, but none of them seem to identify the tags separately. How can I achieve this?
Using RapidXml, you can traverse the nodes and attributes and identify the text of their tag.
#include <iostream>
#include <rapidxml.hpp>
#include <rapidxml_utils.hpp>
int main()
{
using namespace rapidxml;
file<> in("input.xml"); // Load the file into memory.
xml_document<> doc;
doc.parse<0>(in.data()); // Parse the file in place.
// Traverse the first-level elements.
for (xml_node<>* node = doc.first_node(); node; node = node->next_sibling())
{
std::cout << node->name() << '\n'; // Write tag name.
// Traverse the attributes of the element.
for (xml_attribute<>* attr = node->first_attribute(); attr; attr = attr->next_attribute())
{
std::cout << attr->name() << '\n'; // Write attribute name.
}
}
}
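To get all tag names with pugixml:
#include <iostream>
#include <pugixml.hpp>
// Recursively print the name of node and of every element below it.
void dumpTags(const pugi::xml_node& node) {
// Only element nodes have a tag name; text nodes are skipped.
if (node.type() == pugi::node_element) {
std::cout << node.name() << std::endl;
for (pugi::xml_node child=node.first_child(); child; child=child.next_sibling())
dumpTags(child);
}
}
int main() {
pugi::xml_document doc;
pugi::xml_parse_result result = doc.load_string("<tag1>abc<tag2>def</tag2>pqr</tag1>");
if (result)
dumpTags(doc.first_child());
}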

RapidXML giving empty CDATA nodes

I wrote the code below to get the CDATA node value too. I get the node's name, but the values are blank.
I changed the parse flags to parse_full, but that did not work either.
If I manually remove "<![CDATA[" and "]]>" from the XML, it gives the value as expected, but removing them before parsing is not an option.
The code:
#include <iostream>
#include <vector>
#include <sstream>
#include "rapidxml/rapidxml_utils.hpp"
using std::vector;
using std::stringstream;
using std::cout;
using std::endl;
int main(int argc, char* argv[]) {
rapidxml::file<> xmlFile("test.xml");
rapidxml::xml_document<> doc;
doc.parse<rapidxml::parse_full>(xmlFile.data());
rapidxml::xml_node<>* nodeFrame = doc.first_node()->first_node()->first_node();
cout << "BEGIN\n\n";
do {
cout << "name: " << nodeFrame->first_node()->name() << "\n";
cout << "value: " << nodeFrame->first_node()->value() << "\n\n";
} while( nodeFrame = nodeFrame->next_sibling() );
cout << "END\n\n";
return 0;
}
The XML:
<rss version="2.0" xmlns:g="http://base.google.com/ns/1.0" xmlns:c="http://base.google.com/cns/1.0">
<itens>
<item>
<title><![CDATA[Title 1]]></title>
<g:id>34022</g:id>
<g:price>2173.00</g:price>
<g:sale_price>1070.00</g:sale_price>
</item>
<item>
<title><![CDATA[Title 2]]></title>
<g:id>34021</g:id>
<g:price>217.00</g:price>
<g:sale_price>1070.00</g:sale_price>
</item>
</itens>
</rss>
When you use CDATA, RapidXML parses it as a separate node 'below' the outer element in the hierarchy.
Your code correctly gets 'title' by using nodeFrame->first_node()->name(), but because the CDATA text is in a separate child node, you need to go one level deeper to extract the value:
cout << "value: " <<nodeFrame->first_node()->first_node()->value();