How can I retrieve node and attribute values from an XML file in C++ using libxml2 and XPath?
Thanks in advance,
Bhargava
Since this is tagged C++ I'll assume you can use the libxml++ library bindings.
I wrote a simple program that:
Parses the document using a DomParser
Makes an XPath query using find() on the document root node to get to the attribute
Casts the first node of the XPath result to an Attribute node
Gets that attribute's string value using get_value()
Displays that value
Here's the code:
#include <iostream>
#include <libxml++/libxml++.h>
using namespace std;
using namespace Glib;
using namespace xmlpp;
int main(int argc, char* argv[])
{
// Parse the file
DomParser parser;
parser.parse_file("file.xml");
Node* rootNode = parser.get_document()->get_root_node();
// Xpath query
NodeSet result = rootNode->find("/root/a/b/@attr");
// Get first node from result
Node *firstNodeInResult = result.at(0);
// Cast to Attribute node (dynamic_cast on reference can throw [fail fast])
Attribute &attribute = dynamic_cast<Attribute&>(*firstNodeInResult);
// Get value of the attribute
ustring attributeValue = attribute.get_value();
// Print attribute value
cout << attributeValue << endl;
}
Given this input:
<!-- file.xml -->
<root>
<a>
<b attr="I want to get this"> </b>
</a>
</root>
The code will output:
I want to get this
To compile this on a Unix system:
c++ `pkg-config libxml++-2.6 --cflags` `pkg-config libxml++-2.6 --libs` file.cpp
sp1.xml:
<users noofids="1">
<user user="vin" password="abc"/>
</users>
Program:
#include <libxml/parser.h>
#include <libxml/xpath.h>
#include <libxml/tree.h>
#include <iostream>
int main()
{
    xmlInitParser();
    xmlDoc *doc = xmlParseFile("sp1.xml");
    if (!doc)
        return 1; // file missing or not well-formed
    xmlXPathContext *xpathCtx = xmlXPathNewContext(doc);
    // Absolute path, so the result does not depend on the context node
    xmlXPathObject *xpathObj =
        xmlXPathEvalExpression(BAD_CAST "/users/user", xpathCtx);
    if (xpathObj && xpathObj->nodesetval && xpathObj->nodesetval->nodeNr > 0)
    {
        xmlNode *node = xpathObj->nodesetval->nodeTab[0];
        // Walk every attribute of the first matching element
        for (xmlAttr *attr = node->properties; attr; attr = attr->next)
        {
            const xmlChar *value = attr->children ? attr->children->content : BAD_CAST "";
            std::cout << "Attribute name: " << attr->name << " value: " << value << std::endl;
        }
    }
    // Release everything that was allocated above
    xmlXPathFreeObject(xpathObj);
    xmlXPathFreeContext(xpathCtx);
    xmlFreeDoc(doc);
    xmlCleanupParser();
    return 0;
}
I got this output by working it out on my own.
This is an example of how to get the value for the xpath you are trying to evaluate with libxml++.
It is based on Alexandre Jasmin's answer, but his only shows how to print an attribute selected by XPath. It's not trivial to figure out how to print a node's value, because you have to cast the node to a specific type (also, his answer throws an exception).
#include <iostream>
#include <libxml++/libxml++.h>
using namespace std;
using namespace Glib;
using namespace xmlpp;
int main(int argc, char* argv[])
{
// Parse the file
DomParser parser;
parser.parse_file("sample.xml");
Node* root = parser.get_document()->get_root_node();
// Xpath query
NodeSet result = root->find("/root/ApplicationSettings/level_three");
// Get first element from result
Element *first_element = (Element *)result.at(0);
// Print the content of the Element
cout << first_element->get_child_text()->get_content() << endl;
}
sample.xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<root>
<ApplicationSettings>
<level_three>hello world</level_three>
</ApplicationSettings>
</root>
compile
g++ test.cpp -o test `pkg-config libxml++-2.6 --cflags` `pkg-config libxml++-2.6 --libs`
run
./test
I have the following XML data and I want to parse it with the Boost XML parser.
<?xml version="1.0" encoding="UTF-8"?>
<applications>
<application>
<id>1</id>
<platform>linux-x64</platform>
<version>2.4</version>
</application>
<application>
<id>2</id>
<platform>windows</platform>
<version>2.5</version>
</application>
<application>
<id>3</id>
<platform>linux</platform>
<version>2.6</version>
</application>
</applications>
I have written the Boost code below, but it reads only the first child of "applications" and never the other two children. Every time, the inner loop gets the data of the first child.
boost::property_tree::ptree pt;
boost::property_tree::read_xml(sModel, pt); // sModel is filename which contains above xml data
BOOST_FOREACH(boost::property_tree::ptree::value_type &v, pt.get_child("applications"))
{
std::string key = v.first.data();
std::string pkgId, platform, version;
if (key == std::string("application"))
{
BOOST_FOREACH(boost::property_tree::ptree::value_type &v_, pt.get_child("applications.application"))
{
std::string app_key = v_.first.data();
std::string app_value = v_.second.data();
if (app_key == std::string("id"))
pkgId = app_value;
else if (app_key == std::string("platform"))
platform = app_value;
else if (app_key == std::string("version"))
version = app_value;
}
}
}
Here, every time I get the platform as "linux-x64".
Can someone explain how to read all the children with the Boost XML parser?
Thanks in advance.
get_child (and all the other path-based access functions) isn't very good at dealing with multiple identical keys. It will choose the first child with the given key and return that, ignoring all others.
But you don't need get_child, because you already hold the node you want in your hand.
pt.get_child("applications") gives you a ptree. Iterating over that gives you a ptree::value_type, which is a std::pair<std::string, ptree>.
The first weird thing, then, is this line:
std::string key = v.first.data();
The data() function you're calling here is std::string::data, not ptree::data. You could just write
std::string key = v.first;
The next strange thing is the comparison:
if (key == std::string("application"))
You don't need to explicitly construct a std::string here. In fact, doing so is a pessimization, because it has to allocate a string buffer and copy the string there, when std::string has comparison operators for C-style strings.
Then you iterate over pt.get_child("applications.application"), but you don't need to do this lookup: v.second is already the tree you want.
Furthermore, you don't need to iterate over the child at all, you can use its lookup functions to get what you need.
std::string pkgId = v.second.get("id", "");
So to sum up, this is the code I would write:
boost::property_tree::ptree pt;
boost::property_tree::read_xml(sModel, pt);
BOOST_FOREACH(boost::property_tree::ptree::value_type &v, pt.get_child("applications"))
{
// You can even omit this check if you can rely on all children
// being application nodes.
if (v.first == "application")
{
std::string pkgId = v.second.get("id", "");
std::string platform = v.second.get("platform", "");
std::string version = v.second.get("version", "");
}
}
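To see why the original nested loop kept producing "linux-x64", here is a toy model of first-match lookup. It is only a sketch under assumptions: Fields, Children, first_child and the two platforms_* functions are hypothetical names, and this is not how Boost implements ptree. It only mimics the behaviour described above, where a keyed lookup returns the first child with that key.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for one level of a property tree (NOT Boost's implementation):
// children are ordered and several of them may share the same key.
using Fields = std::map<std::string, std::string>;
using Children = std::vector<std::pair<std::string, Fields>>;

// Mimics a keyed lookup such as get_child: returns the FIRST match only.
const Fields* first_child(const Children& children, const std::string& key)
{
    for (const auto& kv : children)
        if (kv.first == key)
            return &kv.second;
    return nullptr;
}

// Buggy pattern: looks "application" up again on every iteration,
// so each pass sees the first <application> once more.
std::vector<std::string> platforms_by_lookup(const Children& apps)
{
    std::vector<std::string> out;
    for (const auto& kv : apps) {
        (void)kv; // the loop variable is ignored, which is the bug
        if (const Fields* app = first_child(apps, "application"))
            out.push_back(app->at("platform"));
    }
    return out;
}

// Fixed pattern: use the child the loop already provides (kv.second).
std::vector<std::string> platforms_by_iteration(const Children& apps)
{
    std::vector<std::string> out;
    for (const auto& kv : apps)
        if (kv.first == "application")
            out.push_back(kv.second.at("platform"));
    return out;
}
```

With three "application" entries, the first function returns the first platform three times, while the second returns each platform once, which matches the difference between the question's code and the code above.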
Check this example:
#include <boost/property_tree/xml_parser.hpp>
#include <boost/property_tree/ptree.hpp>
#include <boost/foreach.hpp>
#include <string>
#include <vector>
struct Application
{
    int m_id;
    std::string m_platform;
    float m_version;
};
typedef std::vector<Application> AppList;
AppList Read()
{
    using boost::property_tree::ptree;
    // Populate tree structure (pt):
    ptree pt;
    read_xml("applications.xml", pt); // For example.
    // Traverse pt:
    AppList List;
    BOOST_FOREACH(ptree::value_type const& v, pt.get_child("applications"))
    {
        if (v.first == "application")
        {
            Application App;
            // get<T>() without a default value throws if the key is missing
            App.m_id = v.second.get<int>("id");
            App.m_platform = v.second.get<std::string>("platform");
            App.m_version = v.second.get<float>("version");
            List.push_back(App);
        }
    }
    return List;
}
I'm parsing an XML file from a string.
My node Id is bar, and I want to change it to foo and then write the document to a file.
After writing, the file still has bar, not foo.
#include "rapidxml.hpp"
#include "rapidxml_print.hpp"
void main()
{
std::string newXml = "<?xml version=\"1.0\" encoding=\"UTF - 8\"?><Parent><FileId>fileID</FileId><IniVersion>2.0.0</IniVersion><Child><Id>bar</Id></Child></Parent>";
xml_document<> doc;
xml_node<> * root_node;
std::string str = newXml;
std::vector<char> buffer(str.begin(), str.end());
buffer.push_back('\0');
doc.parse<0>(&buffer[0]);
root_node = doc.first_node("Parent");
xml_node<> * node = root_node->first_node("Child");
xml_node<> * xml = node->first_node("Id");
xml->value("foo"); // I want to change my id from bar to foo!!!!
std::ofstream outFile("output.xml");
outFile << doc; // after I write to file, I still see the ID as bar
}
What am I missing here?
The issue is in the layout of the data. Under the node_element node xml there is another node of type node_data, and that one contains "bar".
Your posted code also does not compile. Here I made it compile and show how to fix it:
#include <vector>
#include <iostream>
#include "rapidxml.hpp"
#include "rapidxml_print.hpp"
int main()
{
std::string newXml = "<?xml version=\"1.0\" encoding=\"UTF - 8\"?><Parent><FileId>fileID</FileId><IniVersion>2.0.0</IniVersion><Child><Id>bar</Id></Child></Parent>";
rapidxml::xml_document<> doc;
std::string str = newXml;
std::vector<char> buffer(str.begin(), str.end());
buffer.push_back('\0');
doc.parse<0>(&buffer[0]);
rapidxml::xml_node<>* root_node = doc.first_node("Parent");
rapidxml::xml_node<>* node = root_node->first_node("Child");
rapidxml::xml_node<>* xml = node->first_node("Id");
// xml->value("foo"); // does change something that isn't output!!!!
rapidxml::xml_node<> *real_thing = xml->first_node();
if (real_thing != nullptr // these checks just demonstrate that
&& real_thing->next_sibling() == nullptr // it is there and how it is located
&& real_thing->type() == rapidxml::node_data) // when element does contain text data
{
real_thing->value("yuck"); // now that should work
}
std::cout << doc; // lets see it
}
And so it outputs:
<Parent>
<FileId>fileID</FileId>
<IniVersion>2.0.0</IniVersion>
<Child>
<Id>yuck</Id>
</Child>
</Parent>
See? Note that how the data is laid out during parsing depends on the flags you pass to parse. For example, if you instead call doc.parse<rapidxml::parse_fastest>, the parser will not create such node_data nodes, and then changing the node_element's value (as you first tried) will work (and what I did above will not). See the manual for details.
I have just downloaded the pugixml library and I am trying to adapt it to my needs. It is mostly oriented toward DOM-style use, which I am not using. The data I store looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<profile>
<points>
<point>
<index>0</index>
<x>0</x>
<y>50</y>
</point>
<point>
<index>1</index>
<x>2</x>
<y>49.9583</y>
</point>
<point>
<index>2</index>
<x>12</x>
<y>50.3083</y>
</point>
</points>
</profile>
Pugixml guide says:
It is common to store data as text contents of some node - i.e.
<node><description>This is a node</description></node>. In this case,
<description> node does not have a value, but instead has a child of
type node_pcdata with value "This is a node". pugixml provides
child_value() and text() helper functions to parse such data.
But I am having problems using those methods; I am not getting the node values out.
#include "pugixml.hpp"
#include <string.h>
#include <iostream>
int main()
{
pugi::xml_document doc;
if (!doc.load_file("/home/lukasz/Programy/eclipse_linux_projects/xmlTest/Debug/pidtest.xml"))
return -1;
pugi::xml_node points = doc.child("profile").child("points");
for (pugi::xml_node point = points.first_child(); point; point = point.next_sibling())
{
// ?
}
return 0;
}
How do I read out the index, x and y values inside the for loop? I would appreciate any help.
There are several ways, documented on the quickstart page:
http://pugixml.org/docs/samples/traverse_iter.cpp
http://pugixml.org/docs/samples/traverse_rangefor.cpp
There is also a tree walker for heavier jobs: http://pugixml.org/docs/samples/traverse_walker.cpp
May I suggest Xpath?
#include <pugixml.hpp>
#include <iostream>
int main()
{
pugi::xml_document doc;
if (doc.load_file("input.txt")) {
for (auto point : doc.select_nodes("//profile/points/point")) {
point.node().print(std::cout, "", pugi::format_raw);
std::cout << "\n";
}
}
}
Prints
<point><index>0</index><x>0</x><y>50</y></point>
<point><index>1</index><x>2</x><y>49.9583</y></point>
<point><index>2</index><x>12</x><y>50.3083</y></point>
I've created an XML file that represents a directory layout for a project. It looks like this:
<folder>
<folder>
<name>src</name>
<file>
<name>main.cpp</name>
</file>
</folder>
<file>
<name>Makefile</name>
</file>
<file>
<name>README.md</name>
</file>
</folder>
I'm using the Boost property tree (boost::property_tree::ptree) to parse, represent, and create the directory (the program I'm trying to write is a command line tool that generates empty C++ projects). I'm trying to write a function that will create the directory recursively, but I've never used this library before, am currently running into a few mental blocks, and feel like I'm going about it all wrong. If anyone has used this library before and can give me a few pointers with my code, I'd appreciate it. Here's what I have so far:
static void create_directory_tree(std::string &root_name,
boost::property_tree::ptree &directory_tree)
{
// Create the root folder.
boost::filesystem::create_directory(root_name);
for (auto &tree_value : directory_tree.get_child("folder"))
{
// If the child is a file, create an empty file with the
// name attribute.
if (tree_value.first == "file")
{
std::ofstream file(tree_value.second.get<std::string>("name"));
file << std::flush;
file.close();
}
// If the child is a folder, call this function again with
// the folder as the root. I don't understand the data
// structure enough to know how to access this for my
// second parameter.
else if (tree_value.first == "folder")
{
create_directory_tree(tree_value.second.get<std::string>("name"), /* ??? */)
}
// If it's not a file or folder, something's wrong with the XML file.
else
{
throw std::runtime_error("");
}
}
}
It's not exactly clear to me what you're asking.
I hope my take on it helps:
#include <boost/property_tree/ptree.hpp>
#include <boost/property_tree/xml_parser.hpp>
#include <boost/filesystem.hpp>
#include <fstream>
#include <iostream>
using namespace boost::property_tree;
namespace fs = boost::filesystem;
namespace project_definition {
void apply(ptree const& def, fs::path const& rootFolder) {
for (auto node : def) {
if ("folder" == node.first) {
fs::path where = rootFolder / node.second.get<std::string>("name");
fs::create_directories(where);
apply(node.second, where);
}
if ("file" == node.first) {
std::ofstream((rootFolder / node.second.get<std::string>("name")).native(), std::ios::trunc);
}
}
}
}
int main()
{
ptree projdef;
read_xml("input.txt", projdef);
try {
project_definition::apply(projdef, "./rootfolder/");
} catch(std::exception const& e)
{
std::cerr << e.what() << "\n";
}
}
With an input.txt of
<folder>
<name>project</name>
<folder>
<name>src</name>
<file><name>main.cpp</name></file>
</folder>
<file><name>Makefile</name></file>
<file><name>README.md</name></file>
</folder>
this creates the structure:
./rootfolder
./rootfolder/project
./rootfolder/project/README.md
./rootfolder/project/src
./rootfolder/project/src/main.cpp
./rootfolder/project/Makefile
I can turn my XML document file into a rapidxml object:
if(exists("generators.xml")) { //http://stackoverflow.com/a/12774387/607407
rapidxml::file<> xmlFile("generators.xml"); // Open file, default template is char
xml_document<> doc; // character type defaults to char
doc.parse<0>(xmlFile.data()); // 0 means default parse flags
xml_node<> *main = doc.first_node(); //Get the main node that contains everything
cout << "Name of my first node is: " << doc.first_node()->name() << "\n";
if(main!=NULL) {
//Get random child node?
}
}
I'd like to pick one random child node from the main object. My XML looks like this:
<?xml version="1.0" encoding="windows-1250"?>
<Stripes>
<Generator>
<stripe h="0,360" s="255,255" l="50,80" width="10,20" />
</Generator>
<Generator>
<loop>
<stripe h="0,360" s="255,255" l="50,80" width="10,20" />
<stripe h="0,360" s="255,255" l="0,0" width="10,20" />
</loop>
</Generator>
</Stripes>
I want to pick a random <Generator> entry. I think getting the child count would be a way to do it:
//Fictional code - **all** methods are fictional!
unsigned int count = node->child_count();
//In real code, `rand` is not a good way to get anything random
xmlnode<> *child = node->childAt(rand(0, count));
How can I get child count and child at offset from rapidxml node?
RapidXML stores the DOM tree using linked lists, which, as you'll know, are not directly indexable.
So you'd basically need to implement those two methods yourself by traversing the node's children. Something like this (untested) code:
int getChildCount(rapidxml::xml_node<> *n)
{
    int c = 0;
    // Walk the sibling list once, counting every child node
    for (rapidxml::xml_node<> *child = n->first_node(); child != NULL; child = child->next_sibling())
    {
        c++;
    }
    return c;
}
getChildAt is obviously similar.
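A possible shape for both helpers is sketched below. To keep the sketch compilable without RapidXML, the functions are written as templates over any node type that exposes first_node() and next_sibling() returning a pointer, which is the interface rapidxml::xml_node<> offers. FakeNode is a hypothetical stand-in used only for illustration and is not part of RapidXML.

```cpp
#include <cstddef>

// Counts the direct children by walking the sibling list once.
template <typename Node>
std::size_t getChildCount(const Node* parent)
{
    std::size_t count = 0;
    for (auto* child = parent->first_node(); child != nullptr;
         child = child->next_sibling())
        ++count;
    return count;
}

// Returns the child at the zero-based index, or nullptr if index is out of range.
template <typename Node>
auto getChildAt(const Node* parent, std::size_t index)
    -> decltype(parent->first_node())
{
    auto* child = parent->first_node();
    while (child != nullptr && index > 0) {
        child = child->next_sibling();
        --index;
    }
    return child;
}

// Hypothetical stand-in node so the sketch compiles on its own;
// with RapidXML you would pass a rapidxml::xml_node<>* instead.
struct FakeNode {
    FakeNode* first = nullptr;
    FakeNode* next = nullptr;
    FakeNode* first_node() const { return first; }
    FakeNode* next_sibling() const { return next; }
};
```

Both helpers are linear in the number of children, so to pick a random child it is cheaper to count once, draw an index in [0, count - 1] (for example with std::uniform_int_distribution), and then call getChildAt once.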