Xerces XPath causes seg fault when path doesn't exist - c++

I can successfully use Xerces XPath feature to query for information from an XML with the following XML and C++ code.
XML
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<root>
<ApplicationSettings>
hello universe
</ApplicationSettings>
</root>
C++
int main()
{
XMLPlatformUtils::Initialize();
// create the DOM parser
XercesDOMParser *parser = new XercesDOMParser;
parser->setValidationScheme(XercesDOMParser::Val_Never);
parser->parse("fake_cmf.xml");
// get the DOM representation
DOMDocument *doc = parser->getDocument();
// get the root element
DOMElement* root = doc->getDocumentElement();
// evaluate the xpath
DOMXPathResult* result=doc->evaluate(
XMLString::transcode("/root/ApplicationSettings"), // <-- HERE IS THE XPATH
root,
NULL,
DOMXPathResult::ORDERED_NODE_SNAPSHOT_TYPE, //DOMXPathResult::ANY_UNORDERED_NODE_TYPE, //DOMXPathResult::STRING_TYPE,
NULL);
// look into the XPath evaluation result
result->snapshotItem(0);
std::cout<<TranscodeToStr(result->getNodeValue()->getFirstChild()->getNodeValue(),"ascii").str()<<std::endl;
XMLPlatformUtils::Terminate();
return 0;
}
The problem is that sometimes my XML will only have certain fields. But if I remove the ApplicationSettings entry from the XML it will seg fault. How can I properly handle these optional fields? I know that trying to correct from seg faults is risky business.

The seg fault is occurring in this line
std::cout<<TranscodeToStr(result->getNodeValue()->getFirstChild()->getNodeValue(),"ascii").str()<<std::endl;
specifically in the getFirstChild() call, because the result of getNodeValue() is NULL.
This is my quick and dirty solution. It's not really ideal but it works. I would prefer a more sophisticated evaluation and response.
if (result->getNodeValue() == NULL)
{
cout << "There is no result for the provided XPath " << endl;
}
else
{
cout<<TranscodeToStr(result->getNodeValue()->getFirstChild()->getNodeValue(),"ascii").str()<<endl;
}
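A slightly more structured variant of the same check is to wrap the pointer chain in a small helper that tests every step before dereferencing it. The sketch below is deliberately library-independent: Node, firstChild, text and firstChildTextOr are hypothetical stand-ins for the DOM node, getFirstChild() and getNodeValue(), not Xerces types.

```cpp
#include <string>

// Hypothetical stand-in for a DOM node; this is not a Xerces type.
struct Node {
    const Node* firstChild;  // nullptr when the node has no children
    std::string text;        // text content carried by the node
};

// Return the text of the first child of `node`, or `fallback` when either
// the node itself or its first child is missing.
std::string firstChildTextOr(const Node* node, const std::string& fallback) {
    if (node == nullptr || node->firstChild == nullptr) {
        return fallback;
    }
    return node->firstChild->text;
}
```

With Xerces the same shape would apply to the pointer returned by result->getNodeValue(): test it, then test getFirstChild(), and only then read the value.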

Related

QT: QXmlStreamReader always returns "Premature End of Document" error

I have a strange issue with Qt's QXmlStreamReader. I am trying to parse a simple document (note: it is generated using QXmlStreamWriter):
<?xml version="1.0" encoding="UTF-8"?>
<tex>
<used_by/>
<facade>
<tags>
<town_related></town_related>
<zone_related></zone_related>
<visual_related></visual_related>
<kind_related></kind_related>
<other>flamingo</other>
</tags>
<additional_textures>
<id>flamingo_top.psd</id>
</additional_textures>
</facade>
</tex>
Using this code:
QFile file(filename);
if (file.open(QFile::ReadOnly | QFile::Text))
{
QXmlStreamReader xmlReader(&file);
while (xmlReader.readNextStartElement())
{
/* same issue when uncommented:
if (xmlReader.name() == "tex")
t->readXml(xmlReader);//parse texture
else*/
xmlReader.skipCurrentElement();
}
if (xmlReader.hasError())
emit reportError(xmlReader.errorString());
}
...
And it always reports the error "Premature end of document". Why? When debugging, it seems that all elements are parsed or skipped correctly.
I verified the behavior of your code. Indeed, it seems that readNextStartElement() does not correctly check for end of document. It only checks for start/end element to return its value, so if reading past the end of document, its internal call to readNext raises "premature end".
For a quick fix, try checking for the end of the document yourself using readNext(), e.g.:
while (!xml.atEnd()) {
if (xml.readNext() != QXmlStreamReader::EndDocument) {
if (xml.isStartElement())
std::cout << qPrintable(xml.name().toString()) << std::endl;
}
}
if (xml.hasError())
std::cout << (xml.errorString().toUtf8().constData()) << std::endl;
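To see why the two loops end differently, here is a toy model of the behaviour described above. It is not Qt code: ToyReader, Token and both helper functions are hypothetical names, and the reader only mimics the rule that readNextStartElement() stops on start/end elements and nothing else.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Toy model of a pull parser, written only to illustrate the loop logic.
// It is NOT Qt code; every name here is hypothetical.
enum class Token { None, StartElement, EndElement, EndDocument, Invalid };

class ToyReader {
public:
    explicit ToyReader(std::vector<Token> tokens) : tokens_(std::move(tokens)) {}

    // Advance by one token; reading past the last token is an error.
    Token readNext() {
        if (pos_ >= tokens_.size()) {
            error_ = true;
            current_ = Token::Invalid;
        } else {
            current_ = tokens_[pos_++];
        }
        return current_;
    }

    // Stops only on start/end elements, so EndDocument is stepped over.
    bool readNextStartElement() {
        while (readNext() != Token::Invalid) {
            if (current_ == Token::EndElement) return false;
            if (current_ == Token::StartElement) return true;
        }
        return false;
    }

    // Consume tokens up to the end tag matching the current start tag.
    void skipCurrentElement() {
        int depth = 1;
        while (depth > 0 && readNext() != Token::Invalid) {
            if (current_ == Token::StartElement) ++depth;
            if (current_ == Token::EndElement) --depth;
        }
    }

    bool atEnd() const { return error_ || current_ == Token::EndDocument; }
    bool hasError() const { return error_; }

private:
    std::vector<Token> tokens_;
    std::size_t pos_ = 0;
    Token current_ = Token::None;
    bool error_ = false;
};

// Loop shape from the question: reads past EndDocument and ends in an error.
bool questionLoopFails(ToyReader reader) {
    while (reader.readNextStartElement()) reader.skipCurrentElement();
    return reader.hasError();
}

// Loop shape from the answer: stops at EndDocument without reading past it.
bool answerLoopFails(ToyReader reader) {
    while (!reader.atEnd()) reader.readNext();
    return reader.hasError();
}
```

For a stream consisting of one start element, its end element and the end of the document, the first loop steps over the end-of-document token and then fails on the next read, while the second loop stops cleanly.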

Find Key in XML with Boost

I am using Boost for the first time within an old code base that we have:
iptree children = pt.get_child(fieldName);
for (const auto& kv : children) {
boost::property_tree::iptree subtree = (boost::property_tree::iptree) kv.second ;
//Recursive call
}
My problem is that sometimes fieldName doesn't exist in the XML file and I get an exception.
I tried:
boost::property_tree::iptree::assoc_iterator it = pt.find(fieldName);
but I don't know how to use the iterator; I can't write: if (it != null)
Any help will be appreciated.
I am using VS 2012.
If it's very complicated, is there any other way to read an XML file with nested nodes? I have been working on this for 3 days.
This is an example of the XML:
<?xml version="1.0" encoding="utf-8"?>
<nodeA xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<nodeA.1>This is the Adresse</nodeA.1>
<nodeA.2>
<node1>
<node1.1>
<node1.1.1>Female</node1.1.1>
<node1.1.2>23</node1.1.2>
<node1.1.3>Engineer</node1.1.3>
</node1.1>
<node1.2>
<node1.2.1>Female</node1.2.1>
<node1.2.2>35</node1.2.2>
<node1.2.3>Doctors</node1.2.3>
</node1.2>
</node1>
</nodeA.2>
<nodeA.3>Car 1</nodeA.3>
</nodeA>
Use pt.get_child_optional(...) to prevent an exception. pt.find(...) returns an iterator which compares equal to pt.not_found() on failure.
EDIT: How to use boost::optional<--->
boost::optional< iptree & > chl = pt.get_child_optional(fieldName);
if(chl) {
for( const auto& a : *chl )
std::cerr << ":" << a.first << ":" << std::endl;
}
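The same "look up, test, then recurse" pattern can be sketched without Boost. Everything below is a hypothetical stand-in (Tree, findChild, collectLeaves, leavesUnder), not the property_tree API; a null pointer plays the role of the empty boost::optional.

```cpp
#include <string>
#include <vector>

// Hypothetical minimal tree; a stand-in for a property tree, not Boost code.
struct Tree {
    std::string name;
    std::string value;
    std::vector<Tree> children;
};

// Return the first direct child called `name`, or nullptr when there is none.
const Tree* findChild(const Tree& node, const std::string& name) {
    for (const Tree& child : node.children) {
        if (child.name == name) return &child;
    }
    return nullptr;
}

// Collect every leaf value below `node`, depth first.
void collectLeaves(const Tree& node, std::vector<std::string>& out) {
    if (node.children.empty()) {
        out.push_back(node.value);
        return;
    }
    for (const Tree& child : node.children) collectLeaves(child, out);
}

// Walk the subtree under `fieldName` if it exists; a missing key yields an
// empty list instead of an exception.
std::vector<std::string> leavesUnder(const Tree& root, const std::string& fieldName) {
    std::vector<std::string> out;
    if (const Tree* sub = findChild(root, fieldName)) collectLeaves(*sub, out);
    return out;
}
```

Calling leavesUnder(root, "missing") simply returns an empty vector, which is the behaviour get_child_optional gives you when you test the result before iterating.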

Reading xml file using QDomDocument just get the first line

I generate a xml file using QXmlStreamWriter. The file looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<pedestrianinfo>
<pedestrian uuid="2112e2ed-fc9b-41e8-bbcb-b44ad78bde11">
<module>11.1208</module>
<direction>4</direction>
<row>5</row>
<column>71</column>
</pedestrian>
<pedestrian uuid="1aabb9c1-4aa7-4f47-9542-36d2dfaa26e4">
<module>1.48032</module>
<direction>4</direction>
<row>67</row>
<column>31</column>
</pedestrian>
...
</pedestrianinfo>
Then I try to read the content by QDomDocument. My code looks like this:
xmlReader *xp = new xmlReader(QString("D:\\0T.xml"));
if(xp->openFile()) {
if(xp->isGetRootIndex()) {
xp->parseRootIndexElement();
}
else
cout<<"Unable to get root index."<<endl;
}
Here is isGetRootIndex():
bool xmlReader::isGetRootIndex()
{
doc.setContent(&file,false);
root = doc.documentElement();
if(root.tagName() == getRootIndex()) //rootIndex=="pedestrianinfo"
return true;
return false;
}
This is parseRootIndexElement():
void xmlReader::parseRootIndexElement()
{
QDomNode child = root.firstChild();
while(!child.isNull()) {
if(child.toElement().tagName() == getTagNameP()) //"childTagName=="pedestrian"
parseEntryElement(child.toElement());
qDebug()<<"module="<<module<<" direction="<<direction<<" row="<<row<<" column="<<column;
child = child.nextSibling();
}
}
parseEntryElement(const QDomElement &element) is a function to get the information in each tag and save it into variables such as module.
However, each time I run my code, only the first child of the XML file is printed by qDebug(). It seems that after executing child.nextSibling(), child becomes null. Why does it not get the next pedestrian info?
Looks correct to me based on what I see in the documentation. Perhaps parseEntryElement is advancing the iterator unexpectedly?

Parsing XML file using pugixml

Hi
I want to use an XML file as a config file, from which I will read parameters for my application. I came across the pugixml library, however I have a problem with getting the values of attributes.
My XML file looks like this:
<?xml version="1.0"?>
<settings>
<deltaDistance> </deltaDistance>
<deltaConvergence>0.25 </deltaConvergence>
<deltaMerging>1.0 </deltaMerging>
<m> 2</m>
<multiplicativeFactor>0.7 </multiplicativeFactor>
<rhoGood> 0.7 </rhoGood>
<rhoMin>0.3 </rhoMin>
<rhoSelect>0.6 </rhoSelect>
<stuckProbability>0.2 </stuckProbability>
<zoneOfInfluenceMin>2.25 </zoneOfInfluenceMin>
</settings>
To parse the XML file I use this code:
void ReadConfig(char* file)
{
pugi::xml_document doc;
if (!doc.load_file(file)) return false;
pugi::xml_node tools = doc.child("settings");
//[code_traverse_iter
for (pugi::xml_node_iterator it = tools.begin(); it != tools.end(); ++it)
{
cout<<it->name() << " " << it->attribute(it->name()).as_double();
}
}
and I was also trying to use this:
void ReadConfig(char* file)
{
pugi::xml_document doc;
if (!doc.load_file(file)) return false;
pugi::xml_node tools = doc.child("settings");
//[code_traverse_iter
for (pugi::xml_node_iterator it = tools.begin(); it != tools.end(); ++it)
{
cout<<it->name() << " " << it->value();
}
}
Attributes are loaded correctly, however all values are equal to 0. Could somebody tell me what I am doing wrong?
I think your problem is that you're expecting the value to be stored in the node itself, but it's really in a CHILD text node. A quick scan of the documentation showed that you might need
it->child_value()
instead of
it->value()
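Once you have the element text as a string, note that the sample file contains values with surrounding spaces ("0.25 ") and one element that holds only a space. A minimal sketch of a stricter conversion, using only the standard library (parseDouble is a hypothetical helper, not part of pugixml):

```cpp
#include <cstdlib>
#include <string>

// Parse `text` as a double, ignoring surrounding whitespace.
// Returns false and leaves `out` untouched when no number is present
// or when non-space characters follow the number.
bool parseDouble(const std::string& text, double& out) {
    const char* begin = text.c_str();
    char* end = nullptr;
    const double value = std::strtod(begin, &end);  // skips leading whitespace
    if (end == begin) return false;                 // nothing was converted
    while (*end == ' ' || *end == '\t' || *end == '\n' || *end == '\r') ++end;
    if (*end != '\0') return false;                 // trailing junk after the number
    out = value;
    return true;
}
```

This lets the caller tell an empty setting apart from a real 0, instead of silently getting 0 for both.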
Are you trying to get all the attributes for a given node or do you want to get the attributes by name?
For the first case, you should be able to use this code:
unsigned int numAttributes = node.attributes();
for (unsigned int nAttribute = 0; nAttribute < numAttributes; ++nAttribute)
{
pug::xml_attribute attrib = node.attribute(nAttribute);
if (!attrib.empty())
{
// process here
}
}
For the second case:
LPCTSTR GetAttribute(pug::xml_node & node, LPCTSTR szAttribName)
{
if (szAttribName == NULL)
return NULL;
pug::xml_attribute attrib = node.attribute(szAttribName);
if (attrib.empty())
return NULL; // or empty string
return attrib.value();
}
If you want to store plain text data in the nodes, like
<name>My Name</name>
you need to write it like
rootNode.append_child("name").append_child(node_pcdata).set_value("My Name");
If you want to store datatypes, you need to set an attribute. I think what you want is to be able to read the value directly right?
When you are writing the node,
rootNode.append_child("version").append_attribute("value").set_value(0.11);
When you want to read it,
rootNode.child("version").attribute("value").as_double()
At least that's my way of doing it!

How do I run XPath queries in Qt?

How do I run an XPath query in Qt?
I need to sort out certain tags with specific values in a certain attribute. The QXmlQuery documentation is anything but legible.
The schema I'm parsing is the Rhythmbox DB format:
<rhythmdb version="1.6">
<entry type="ignore">
<title></title>
<genre></genre>
<artist></artist>
<album></album>
<location>file:///mnt/disk/music/Cover.jpg</location>
<mountpoint>file:///mnt/disk</mountpoint>
<mtime>1222396828</mtime>
<date>0</date>
<mimetype>application/octet-stream</mimetype>
<mb-trackid></mb-trackid>
<mb-artistid></mb-artistid>
<mb-albumid></mb-albumid>
<mb-albumartistid></mb-albumartistid>
<mb-artistsortname></mb-artistsortname>
</entry>
<entry type="song">
<title>Bar</title>
<genre>Foobared Music</genre>
<artist>Foo</artist>
<album>The Great big Bar</album>
<track-number>1</track-number>
<disc-number>1</disc-number>
<duration>208</duration>
<file-size>8694159</file-size>
<location>file:///media/disk/music/01-Foo_-_Bar.ogg</location>
<mountpoint>file:///media/disk</mountpoint>
<mtime>1216995840</mtime>
<first-seen>1250478814</first-seen>
<last-seen>1250478814</last-seen>
<bitrate>301</bitrate>
<date>732677</date>
<mimetype>application/x-id3</mimetype>
<mb-trackid></mb-trackid>
<mb-artistid></mb-artistid>
<mb-albumid></mb-albumid>
<mb-albumartistid></mb-albumartistid>
<mb-artistsortname></mb-artistsortname>
</entry>
</rhythmdb>
This is your basic XML Schema which has a collection of structured entries. My intention was to filter out the entries with the type 'ignore'.
The relevant documentation is at: http://qt-project.org/doc/qt-4.8/qxmlquery.html#running-xpath-expressions.
The solution I came to was to use QXmlQuery to generate an XML file then parse it again using QDomDocument.
RhythmboxTrackModel::RhythmboxTrackModel()
{
QXmlQuery query;
QXmlQuery entries;
QString res;
QDomDocument rhythmdb;
/*
* Try and open the Rhythmbox DB. An API call which tells us where
* the file is would be nice.
*/
QFile db(QDir::homePath() + "/.gnome2/rhythmbox/rhythmdb.xml");
if ( ! db.exists()) {
db.setFileName(QDir::homePath() + "/.local/share/rhythmbox/rhythmdb.xml");
if ( ! db.exists())
return;
}
if (!db.open(QIODevice::ReadOnly | QIODevice::Text))
return;
/*
* Use QXmlQuery to execute an XPath query. Check the version to
* make sure.
*/
query.setFocus(&db);
query.setQuery("rhythmdb[@version='1.6']/entry[@type='song']");
if ( ! query.isValid())
return;
query.evaluateTo(&res);
db.close();
/*
* Parse the result as an XML file. These shenanigans actually
* reduce the load time from a minute to a matter of seconds.
*/
rhythmdb.setContent("<rhythmdb version='1.6'>" + res + "</rhythmdb>");
m_entryNodes = rhythmdb.elementsByTagName("entry");
for (int i = 0; i < m_entryNodes.count(); i++) {
QDomNode n = m_entryNodes.at(i);
QString location = n.firstChildElement("location").text();
m_mTracksByLocation[location] = n;
}
qDebug() << rhythmdb.doctype().name();
qDebug() << "RhythmboxTrackModel: m_entryNodes size is" << m_entryNodes.size();
}
In case anyone is wondering, this is my code taken from a recent branch of the Mixxx project, specifically the features_looping branch.
The things I dislike about this solution are:
Parsing the XML twice
Concatenating the result with a starting and ending tag.
If it fits your parsing requirements, you can use the SAX-based reader instead of a DOM-based one. Using QXmlSimpleReader with a sub-classed QXmlDefaultHandler, you can get access to each element of your XPath query as well as its attributes as the document is scanned. I think this approach would be faster than a DOM-based one; you don't have to read anything twice and it's already built into Qt. There is an example here: http://www.digitalfanatics.org/projects/qt_tutorial/chapter09.html under "Reading Using SAX."