Parsing data with the pugixml library in C++

I have a problem parsing a document with the pugixml library in C++.
I'm completely new to parsing in C++.
Here is my SVG file:
<?xml version='1.0' encoding='UTF-8'?>
<!-- This file was generated by dvisvgm 2.9.1 -->
<svg version='1.1' xmlns='http://www.w3.org/2000/svg' xmlns:xlink='http://www.w3.org/1999/xlink' width='260.973912pt' height='53.742909pt' viewBox='229.120747 -42.285873 260.973912 53.742909'>
<defs>
<font id='cmr10' horiz-adv-x='0'>
<font-face font-family='cmr10' units-per-em='1000' ascent='750' descent='250'/>
</font>
<font id='cmmi10' horiz-adv-x='0'>
<font-face font-family='cmmi10' units-per-em='1000' ascent='750' descent='250'/>
</font>
</defs>
<style type='text/css'>
<![CDATA[text.f0 {font-family:cmmi10;font-size:9.96264px}
text.f1 {font-family:cmr10;font-size:9.96264px}
]]>
</style>
<g id='page1'>
<text class='f1' x='485.11332' y='-35.865504'>1</text>
<text class='f0' x='229.120747' y='11.457036'>abc</text>
</g>
</svg>
And here is my code:
#include <string>
#include "pugixml.hpp"
#include "pugixml.cpp"
int main(int argc, char* argv[])
{
std::string svg_file_path = "path_to_file.xml";
pugi::xml_document doc;
pugi::xml_parse_result result = doc.load_file(svg_file_path.c_str(), pugi::parse_default | pugi::parse_declaration);
pugi::xml_node root = doc.document_element();
return 0;
}
However, when I debug, the variable root appears to be undefined. Meanwhile, the debugger shows pugi::parse_default as 116U and pugi::parse_declaration as 256U.
For the result variable it shows status = status_ok (0).
What could I possibly be doing wrong? How do I access things such as the content of the children and their class (f0, f1), and finally the CDATA section, from which I need the correspondence {"f0": "cmmi10"} and {"f1": "cmr10"}?
Yaroslav.
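For the last part of the question (the correspondence between f0/f1 and the font names), here is a minimal sketch in standard C++. It assumes the CDATA text has already been extracted into a string; with pugixml, something like doc.child("svg").child("style").child_value() should return it, though that has not been checked against this file. The function name class_to_font is made up for illustration, and the regex only handles rules shaped like the two in the file above.

```cpp
#include <map>
#include <regex>
#include <string>

// Maps each CSS class found in rules of the form
//   text.f0 {font-family:cmmi10;font-size:9.96264px}
// to its font-family value. Hypothetical helper, not part of pugixml.
std::map<std::string, std::string> class_to_font(const std::string& css)
{
    // Group 1: class name after "text."; group 2: font-family value.
    static const std::regex rule(
        R"re(text\.(\w+)\s*\{[^}]*?font-family\s*:\s*([^;}]+))re");

    std::map<std::string, std::string> fonts;
    for (std::sregex_iterator it(css.begin(), css.end(), rule), end; it != end; ++it) {
        fonts[(*it)[1].str()] = (*it)[2].str();
    }
    return fonts;
}
```

The class of each text element could then be looked up in the returned map, for example fonts.at("f0").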

Related

When merging multiple xml files, how can I set EntityResolver for child xml files as well besides the parent xml?

I have a book XML file which references multiple other XML files. When I run an XSLT on the book.xml file, the EntityResolver in my code resolves the DTD path. However, for the child XML files that are being merged, the DTD paths are not resolved.
Sample sample_book.ditamap
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE bookmap PUBLIC "-//OASIS//DTD DITA BookMap//EN" "bookmap.dtd">
<bookmap>
<booktitle>
<mainbooktitle>sample book</mainbooktitle>
</booktitle>
<part navtitle="Overview">
<topicref href="../topics/introduction.dita"/>
<topicref href="../topics/install.dita"/>
</part>
</bookmap>
Java Code
public class XMLProcessor {
public void transform(String xmlf, String xslf) throws TransformerConfigurationException, TransformerException, org.xml.sax.SAXException, IOException{
org.xml.sax.XMLReader reader = XMLReaderFactory.createXMLReader();
Transformer transformer;
TransformerFactory factory = TransformerFactory.newInstance();
StreamSource stylesheet = new StreamSource(xslf);
//Source source = StreamSource(xmlf);
//SAXSource source = new SAXSource(new InputSource(xmlf));
// org.xml.sax.XMLReader reader = XMLReaderFactory.createXMLReader();
EntityResolver ent = new EntityResolver() {
@Override
public InputSource resolveEntity(String publicId, String systemId) throws SAXException, IOException {
System.out.println(publicId);
System.out.println(systemId);
if(publicId.equals("-//OASIS//DTD DITA BookMap//EN")){
return new InputSource("file:///D:/sample/dtd/bookmap.dtd");
}
return null;
}
};
// sour.setPublicId("file:///D:/sample/dtd/bookmap/dtd/bookmap.dtd");
reader.setEntityResolver(ent);
SAXSource source = new SAXSource(reader, new InputSource(xmlf));
//reader.parse(new InputSource(xmlf));
//StreamSource sourcedoc = new StreamSource(xmlf);
transformer = factory.newTransformer(stylesheet);
try {
transformer.transform(source, new StreamResult(new FileWriter("D:\\sample\\out\\result.xml")));
} catch (IOException ex) {
Logger.getLogger(XMLProcessor.class.getName()).log(Level.SEVERE, null, ex);
}
}
}
Expected Result
<?xml version="1.0" encoding="UTF-8"?>
<pages xmlns:mf="urn:mf">
<parentpage>
<parentpagename>sample book</parentpagename>
</parentpage>
<part>
<partname>Overview</partname>
<text/>
<level-2>
<concept xmlns:ditaarch="http://dita.oasis-open.org/architecture/2005/"
id="introduction" ditaarch:DITAArchVersion="1.3"
domains="(topic concept) (topic abbrev-d) a(props deliveryTarget) (topic equation-d) (topic hazard-d) (topic hi-d) (topic indexing-d) (topic markup-d) (topic mathml-d) (topic pr-d) (topic relmgmt-d) (topic sw-d) (topic svg-d) (topic ui-d) (topic ut-d) (topic markup-d xml-d) "
class="- topic/topic concept/concept ">
<title class="- topic/title ">Introduction</title>
<shortdesc class="- topic/shortdesc "/>
<conbody class="- topic/body concept/conbody ">
<p class="- topic/p ">Sample introduction</p>
</conbody>
</concept>
</level-2>
<text/>
<level-2>
<task xmlns:ditaarch="http://dita.oasis-open.org/architecture/2005/" id="install"
ditaarch:DITAArchVersion="1.3"
domains="(topic task) (topic abbrev-d) a(props deliveryTarget) (topic equation-d) (topic hazard-d) (topic hi-d) (topic indexing-d) (topic markup-d) (topic mathml-d) (topic pr-d) (topic relmgmt-d) (topic sw-d) (topic svg-d) (topic ui-d) (topic ut-d) (topic markup-d xml-d) (topic task strictTaskbody-c) "
class="- topic/topic task/task ">
<title class="- topic/title ">Install</title>
<shortdesc class="- topic/shortdesc "/>
<taskbody class="- topic/body task/taskbody ">
<context class="- topic/section task/context ">
<p class="- topic/p ">Download xyz installer from here. </p>
</context>
<steps class="- topic/ol task/steps ">
<step class="- topic/li task/step ">
<cmd class="- topic/ph task/cmd ">Double-click the downloader
installer.</cmd>
</step>
<step class="- topic/li task/step ">
<cmd class="- topic/ph task/cmd ">Do this.</cmd>
</step>
<step class="- topic/li task/step ">
<cmd class="- topic/ph task/cmd ">Do that</cmd>
</step>
</steps>
</taskbody>
</task>
</level-2>
</part>
</pages>
Actual Result
When the XSLT is run, the following messages are displayed. The errors go away when I move the DTD files to the topics folder.
Warning XTDE0540: Ambiguous rule match for /bookmap/booktitle[1] Matches both "element(Q{}booktitle)" on line 54 of file:///D:/sample/xsl/merge.xsl and "element(Q{}booktitle)" on line 18 of file:///D:/sample/xsl/merge.xsl
Warning at char 11 in xsl:apply-templates/@select on line 30 column 104 of merge.xsl:
FODC0002: I/O error reported by XML parser processing file:/D:/sample/sampledoc/topics/introduction.dita: D:\sample\sampledoc\topics\concept.dtd (The system cannot find the file specified)
Warning at char 11 in xsl:apply-templates/@select on line 30 column 104 of merge.xsl:
FODC0002: I/O error reported by XML parser processing file:/D:/sample/sampledoc/topics/install.dita: D:\sample\sampledoc\topics\task.dtd (The system cannot find the file specified)
You can set a URIResolver on the Transformer, which will be called when your XSLT code calls doc() or document() to fetch the referenced XML files. The URIResolver can then set an EntityResolver on the XML parser used to parse these files.
Alternatively, you can do all of this with the Apache XMLResolver, which dereferences URIs at both the XSLT and the XML level by reference to a catalog file in a format defined by OASIS.
Thanks to Michael, I was able to resolve this issue.
I added a CatalogManager.properties file to the classpath.
I created a catalog.xml with all the public IDs.
public class XMLProcessor {
public void transform(String xmlf, String xslf, String outpath) throws TransformerConfigurationException, TransformerException, org.xml.sax.SAXException, IOException{
org.xml.sax.XMLReader reader = XMLReaderFactory.createXMLReader();
Transformer transformer = null;
TransformerFactory factory = TransformerFactory.newInstance();
StreamSource stylesheet = new StreamSource(xslf);
CatalogResolver cr = new CatalogResolver();
reader.setEntityResolver(cr);
factory.setURIResolver(cr);
SAXSource source = new SAXSource(reader, new InputSource(xmlf));
transformer = factory.newTransformer(stylesheet);
try {
transformer.transform(source, new StreamResult(new FileWriter(outpath)));
} catch (IOException ex) {
Logger.getLogger(XMLProcessor.class.getName()).log(Level.SEVERE, null, ex);
}
}
}

How do I use QXmlStreamReader to parse an XML file that contains references to other XML files?

I'm attempting to parse an xml file in C++ using a QXmlStreamReader (Qt 5.5.1). I'm using the XML file to map numeric keys to corresponding image files. I've gotten it to work with a simple XML file such as:
<?xml version="1.0" encoding="UTF-8"?>
<images>
<group1>
<image key="1" value="image1_group1.png"/>
<image key="2" value="image2_group1.png"/>
</group1>
<group2>
<image key="1" value="image1_group2.png"/>
<image key="2" value="image2_group2.png"/>
</group2>
</images>
using the following code:
#include <QFile>
#include <QMap>
#include <QXmlStreamReader>
xml_image_mapper::xml_image_mapper(QObject* p_parent_ptr)
{
QFile file("myfile.xml");
file.open(QFile::ReadOnly | QFile::Text);
QXmlStreamReader stream_reader(&file);
stream_reader.readNextStartElement();
while (stream_reader.readNextStartElement())
{
auto* inner_map_ptr = new QMap<quint32, QString>();
m_image_map.insert(stream_reader.name().toString(), inner_map_ptr);
parse_xml(stream_reader, *inner_map_ptr);
}
}
void xml_image_mapper::parse_xml(QXmlStreamReader& stream_reader, QMap<quint32, QString>& p_map)
{
while (stream_reader.readNextStartElement())
{
static const QString key_name = "key";
static const QString value_name = "value";
quint32 key = 0;
QString value;
foreach (const QXmlStreamAttribute attribute, stream_reader.attributes())
{
if (key_name == attribute.name())
key = attribute.value().toUInt();
else if (value_name == attribute.name())
value = attribute.value().toString();
}
// Insert once per element, after both attributes have been read.
p_map.insert(key, value);
stream_reader.skipCurrentElement();
}
}
This code correctly creates the map from numeric keys to filenames for the simple XML shown above, but fails to work for an XML file that includes references, such as:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE doc [
<!ENTITY group1 SYSTEM "group1.xml">
<!ENTITY group2 SYSTEM "group2.xml">
]>
<images>
<group1>
&group1;
</group1>
<group2>
&group2;
</group2>
</images>
Where group1.xml reads:
<?xml version="1.0" encoding="UTF-8"?>
<image key="1" value="image1_group1.png"/>
<image key="2" value="image2_group1.png"/>
and group2.xml reads:
<?xml version="1.0" encoding="UTF-8"?>
<image key="1" value="image1_group2.png"/>
<image key="2" value="image2_group2.png"/>
Is there a way to parse XML files with references to other XML files using a QXmlStreamReader?
From the Qt documentation:
QXmlStreamReader is a well-formed XML 1.0 parser that does not include external parsed entities.
So you seem to be out of luck here.
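One possible workaround, which is not a QXmlStreamReader feature, is to expand the external entities textually before handing the document to the reader. The sketch below uses standard C++ only. The names inline_external_entities and load_file are hypothetical, and it assumes simple double-quoted `<!ENTITY name SYSTEM "file">` declarations inside an internal DTD subset, with no nested or parameter entities.

```cpp
#include <functional>
#include <map>
#include <regex>
#include <string>

// Replaces each &name; reference with the content of the file declared by
// <!ENTITY name SYSTEM "file">, and drops the DOCTYPE block afterwards.
// load_file is supplied by the caller and returns the content of a file.
std::string inline_external_entities(
    const std::string& xml,
    const std::function<std::string(const std::string&)>& load_file)
{
    static const std::regex decl(R"re(<!ENTITY\s+(\w+)\s+SYSTEM\s+"([^"]*)"\s*>)re");
    static const std::regex text_decl(R"re(^\s*<\?xml[^?]*\?>)re");
    static const std::regex doctype(R"re(<!DOCTYPE[^\[>]*\[[^\]]*\]\s*>)re");

    // Collect entity name -> file content, without the leading <?xml ...?>.
    std::map<std::string, std::string> entities;
    for (std::sregex_iterator it(xml.begin(), xml.end(), decl), end; it != end; ++it) {
        const std::string content = load_file((*it)[2].str());
        entities[(*it)[1].str()] = std::regex_replace(content, text_decl, "");
    }

    // Remove the DOCTYPE, then substitute every reference.
    std::string out = std::regex_replace(xml, doctype, "");
    for (const auto& entity : entities) {
        const std::string ref = "&" + entity.first + ";";
        std::string::size_type pos = out.find(ref);
        while (pos != std::string::npos) {
            out.replace(pos, ref.size(), entity.second);
            pos = out.find(ref, pos + entity.second.size());
        }
    }
    return out;
}
```

The expanded string could then be given to QXmlStreamReader, for example through QString::fromStdString, and parsed with the same code as the simple file.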

How to manually create a boost ptree with different XML attributes?

I am using the Boost libraries to parse XML files and I have to create a ptree manually.
I want to create the XML file below using a Boost ptree.
<?xml version="1.0"?>
<Txn>
<Resp errCode="0" errInfo="" />
<A exptime="20171230">xyz Information</A>
<B>xyz Information</B>
<C type="Active">xyz Information</C>
</Txn>
To produce this XML, I wrote the following sample code:
boost::property_tree::ptree pt;
boost::property_tree::ptree ptr1;
boost::property_tree::ptree ptr2;
boost::property_tree::ptree ptr3;
ptr1.put("<xmlattr>.errCode", Txn.resp.errCode);
ptr1.put("<xmlattr>.errInfo", Txn.resp.errInfo);
ptr2.push_back(boost::property_tree::ptree::value_type("A", boost::property_tree::ptree(data)));
ptr2.push_back(boost::property_tree::ptree::value_type("C", boost::property_tree::ptree(data)));
ptr2.put("A.<xmlattr>.exptime", data);
ptr2.put("C.<xmlattr>.type", data);
ptr3.put("<xmlattr>", data);
pt.add_child("Txn.Resp", ptr1);
pt.add_child("Txn", ptr2);
pt.add_child("Txn.B", ptr3);
Here, children A and C are always created under a separate Txn parent, but I want all the children under a single Txn parent. I don't understand why A and C end up separate here.
It would be very helpful if someone could show me the right way.
Here's the simplest thing I can think of:
#include <boost/property_tree/xml_parser.hpp>
#include <iostream>
using boost::property_tree::ptree;
static auto pretty = boost::property_tree::xml_writer_make_settings<std::string>(' ', 4);
int main() {
ptree root;
root.add("Txn.Resp.<xmlattr>.errCode", 0);
root.add("Txn.Resp.<xmlattr>.errInfo", "");
root.add("Txn.A", "xyz Information");
root.add("Txn.A.<xmlattr>.exptime", "20171230");
root.add("Txn.B", "xyz Information");
root.add("Txn.C", "xyz Information");
root.add("Txn.C.<xmlattr>.type", "Active");
write_xml(std::cout, root, pretty);
}
Prints:
<?xml version="1.0"?>
<Txn>
<Resp errCode="0" errInfo="" />
<A exptime="20171230">xyz Information</A>
<B>xyz Information</B>
<C type="Active">xyz Information</C>
</Txn>
The key point is to create the element node before adding its attributes; otherwise you get this instead:
<?xml version="1.0" encoding="utf-8"?>
<Txn>
<Resp errCode="0" errInfo=""/>
<A exptime="20171230"/>
<A>xyz Information</A>
<B>xyz Information</B>
<C type="Active"/>
<C>xyz Information</C>
</Txn>

Overwrite an existing text file c++

This is how my Save As works - it is copying the current file's lines until it reaches the first figure and then I use my print methods to print the figure's info and then close the tag.
std::ofstream newFile(filePath1_fixed, std::ios::app);
std::fstream openedFile(filePath);
std::string line1;
while (std::getline(openedFile, line1)) {
if (line1.find("<rect") != std::string::npos
|| line1.find("<circle") != std::string::npos
|| line1.find("<line") != std::string::npos)
break;
newFile << line1 << std::endl;
}
figc.printToFile(newFile);
newFile << "</svg>\n";
My question is how to save the changes to the current file? I tried something like this:
std::ifstream openedFile(filePath);
std::ofstream newFile(filePath, std::ios::app);
std::string line1;
std::string info_beg[100];
int t = 0;
while (std::getline(openedFile, line1)) {
std::cout << "HELLYEAH";
if (line1.find("<rect") != std::string::npos
|| line1.find("<circle") != std::string::npos
|| line1.find("<line") != std::string::npos)
break;
info_beg[t++] = line1;
}
for (int i = 0; i < t; i++)
newFile << info_beg[i] << std::endl;
figc.printToFile(newFile);
newFile << "</svg>\n";
This is the closest I've gotten. I get this:
<?xml version="1.0" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<svg width="12cm" height="4cm" viewBox="0 0 1200 400"
xmlns="http://www.w3.org/2000/svg" version="1.1">
<desc>Example rect01 - rectangle with sharp corners</desc>
<!-- Show outline of canvas using 'rect' element -->
<rect x="1" y="1" width="1198" height="398"
fill="none" stroke="blue" stroke-width="2" />
<line x1="20" y1="100" x2="100" y2="20"
stroke="red" stroke-width="2" />
<rect x="20" y="30" width="40" height="50"
fill="red" stroke="red" stroke-width="1" />
<rect x="10" y="20" width="30" height="40"
fill="red" stroke="blue" stroke-width="1" />
<line x1="100" y1="200" x2="300" y2="400"
stroke="red" stroke-width="2" />
<circle cx="10" cy="20" r="30"
fill="red" stroke="blue" stroke-width="2" />
</svg>
<?xml version="1.0" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<svg width="12cm" height="4cm" viewBox="0 0 1200 400"
xmlns="http://www.w3.org/2000/svg" version="1.1">
<desc>Example rect01 - rectangle with sharp corners</desc>
<!-- Show outline of canvas using 'rect' element -->
<rect x="1" y="1" width="1198" height="398"
fill="none" stroke="blue" stroke-width="2" />
<line x1="20" y1="100" x2="100" y2="20"
stroke="red" stroke-width="2" />
<rect x="20" y="30" width="40" height="50"
fill="red" stroke="red" stroke-width="1" />
<rect x="10" y="20" width="30" height="40"
fill="red" stroke="blue" stroke-width="1" />
<line x1="100" y1="200" x2="300" y2="400"
stroke="red" stroke-width="2" />
<circle cx="10" cy="20" r="30"
fill="red" stroke="blue" stroke-width="2" />
<rect x="10" y="20" width="30" height="40"
fill="red" stroke="blue" stroke-width="2" />
</svg>
So my actual question is how to delete the first copy or overwrite it, or whether I need a different approach.
Use ios::trunc instead of ios::app
Using std::ios::app in the constructor for your std::ofstream tells the program to append to the file rather than overwrite it. If you want to overwrite it (i.e. truncate), std::ios::trunc tells the program to discard the existing contents. ofstream does this by default, so you could write the initialization as just std::ofstream newFile(filePath);.
Also, don't read the file and write to it at the same time: once the output stream is opened in truncating mode, the contents you still need to read are gone. Use ifstream to read the data into a buffer, then call close() on it. Then initialize newFile to overwrite the file and write out the buffer.
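A minimal sketch of that read-then-overwrite sequence, assuming the figures have already been serialized into a string. The figures parameter and both function names are made up for illustration; in the question, figc.printToFile plays that role.

```cpp
#include <fstream>
#include <string>
#include <vector>

// True if the line starts one of the figure elements the question looks for.
bool is_figure_line(const std::string& line)
{
    return line.find("<rect") != std::string::npos
        || line.find("<circle") != std::string::npos
        || line.find("<line") != std::string::npos;
}

// Keeps the lines before the first figure, then rewrites the same file with
// those lines, the new figures, and the closing tag.
bool overwrite_figures(const std::string& path, const std::string& figures)
{
    std::vector<std::string> header;
    {
        std::ifstream in(path);
        if (!in) return false;
        std::string line;
        while (std::getline(in, line) && !is_figure_line(line))
            header.push_back(line);
    }   // the input stream is closed here, before the file is reopened

    std::ofstream out(path);   // default mode truncates the existing file
    if (!out) return false;
    for (const std::string& line : header)
        out << line << '\n';
    out << figures << "</svg>\n";
    return static_cast<bool>(out);
}
```

Buffering into a std::vector also avoids the fixed 100-line array from the question.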

TinyXML2 query text if attribute matches

I am trying to figure out a way to load the text from an XML document I have created using TinyXML2. Here is the entire document.
<?xml version="1.0" encoding="UTF-8"?>
<map version="1.0" orientation="orthogonal" width="15" height="13" tilewidth="32" tileheight="32">
<tileset firstgid="1" name="Background" tilewidth="32" tileheight="32">
<image source="background.png" width="64" height="32"/>
</tileset>
<tileset firstgid="3" name="Block" tilewidth="32" tileheight="32">
<image source="block.png" width="32" height="32"/>
</tileset>
<layer name="Background" width="15" height="13">
<data encoding="base64">
AgAAAAIAAAACAAAA...
</data>
</layer>
<layer name="Block" width="15" height="13">
<data encoding="base64">
AwAAAAMAAAADAAAAAwAAAAM...
</data>
</layer>
</map>
Basically, I want to copy the text from <data> into a string called background only if the layer name is "Background".
I have gotten the other variables like so:
// Get the basic information about the level
version = doc.FirstChildElement("map")->FloatAttribute("version");
orientation = doc.FirstChildElement("map")->Attribute("orientation");
mapWidth = doc.FirstChildElement("map")->IntAttribute("width");
mapHeight = doc.FirstChildElement("map")->IntAttribute("height");
That works great because I know the element name and the attribute name. Is there a way to say get the doc.FirstChildElement("map")->FirstChildElement("layer") and if it == "Background", get the text.
How would I accomplish this?
I know this thread is quite old, but just in case someone perusing the internet might stumble upon this question as I have, I wish to point out that Xanx's answer can be simplified slightly.
In tinyxml2.h it says that for the function const char* Attribute( const char* name, const char* value=0 ) const, if the value parameter is not null, the function returns a non-null pointer only if the attribute exists and its value matches value. According to the comments in the file, this:
if ( ele->Attribute( "foo", "bar" ) ) callFooIsBar();
can be written instead of this:
if ( ele->Attribute( "foo" ) ) {
if ( strcmp( ele->Attribute( "foo" ), "bar" ) == 0 ) callFooIsBar();
}
So the code Xanx provided can be rewritten like this:
XMLElement * node = doc.FirstChildElement("map")->FirstChildElement("layer");
std::string value;
if (node->Attribute("name", "Background")) // no need for strcmp()
{
value = node->FirstChildElement("data")->GetText();
}
A minor change, yes, but something I wanted to add.
I advise you to do something like this:
XMLElement * node = doc.FirstChildElement("map")->FirstChildElement("layer");
std::string value;
// Get the data element's text if this is the Background layer:
if (strcmp(node->Attribute("name"), "Background") == 0)
{
value = node->FirstChildElement("data")->GetText();
}
auto bgData = text (find_element (doc, "map/layer[@name='Background']/data"));
This uses a tinyxml2 extension (#include <tixml2ex.h>).
N.B. It should really be wrapped in a try/catch block.
The extension is a work in progress and its documentation is incomplete (usage can be deduced from the test example until it's ready).
I'll mention in passing that the other two answers only work properly when the desired <layer> element appears first.