Delete comments from xml file while parsing it using libxml - c++

The following XML file has one of its nodes (<date>) commented out:
<?xml version="1.0"?>
<story>
<info>
<author>Abc Xyz</author>
<!--<date>June 2, 2017</date> -->
<keyword>example keyword</keyword>
</info>
</story>
I want to remove that commented-out line/node completely from the XML file using the libxml library, so that the result looks like this:
<?xml version="1.0"?>
<story>
<info>
<author>Abc Xyz</author>
<keyword>example keyword</keyword>
</info>
</story>
I also consulted the libxml documentation, but it didn't help me much with comments in an XML file.

I tried a different approach and it worked. It looks like xmlreader is not much help for modifying the XML. Instead, I loaded the document with xmlReadMemory() and then, while walking the resulting tree, applied the following check to each node:
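if (node->type == XML_COMMENT_NODE) { // node is an xmlNodePtr
    // If you are iterating over siblings, read node->next before this block:
    // the node must not be touched after xmlFreeNode().
    xmlUnlinkNode(node);
    xmlFreeNode(node);
}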
Finally, I called xmlDocDumpFormatMemory() to serialize the modified XML into a memory buffer.

You can use NodeType() while parsing the xml and check for each node if it’s a comment (8 means comment, see here: http://xmlsoft.org/xmlreader.html#Extracting) and then remove it with xmlUnlinkNode() and xmlFreeNode().

Related

Is this a bug in xmllint or xmlstarlet pattern matching?

Here is my product.xml file contents:
<ProductCode>ABC</ProductCode>
And here is the corresponding validating schema, product.xsd file contents:
<?xml version="1.0" encoding="utf-8"?>
<xsd:schema
version="1.0"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:element name="ProductCode">
<xsd:simpleType>
<xsd:restriction base="xsd:string">
<xsd:minLength value="1"/>
<xsd:maxLength value="15"/>
<xsd:pattern value="[\P{Ll}]*"></xsd:pattern>
</xsd:restriction>
</xsd:simpleType>
</xsd:element>
</xsd:schema>
I opened a command-line shell and used xmlstarlet to validate the XML:
xmlstarlet val -e --xsd product.xsd product.xml
product.xml:1.31: Element 'ProductCode': [facet 'pattern'] The value 'ABC' is not accepted by the pattern '[\P{Ll}]*'.
product.xml:1.31: Element 'ProductCode': 'ABC' is not a valid value of the local atomic type.
product.xml - invalid
Then I tried to use xmllint to validate the XML:
xmllint -schema product.xsd product.xml
<?xml version="1.0"?>
<ProductCode>ABC</ProductCode>
Element 'ProductCode': [facet 'pattern'] The value 'ABC' is not accepted by the pattern '[\P{Ll}]*'.
Element 'ProductCode': 'ABC' is not a valid value of the local atomic type.
product.xml fails to validate
I spent a couple of hours tinkering with it and I found that I can make it work by removing the enclosing brackets:
<xsd:pattern value="\P{Ll}*"></xsd:pattern>
I can also retain the enclosing brackets and make it work by using the inclusive \p category escape preceded by a negation ^:
<xsd:pattern value="[^\p{Ll}]*"></xsd:pattern>
It seems that there is a bug in the underlying implementation of xmllint and xmlstarlet, and I need confirmation of whether this is indeed the case.
The versions I used are:
xmllint:
xmllint --version
xmllint: using libxml version 20904
compiled with: Threads Tree Output Push Reader Patterns Writer SAXv1 FTP HTTP DTDValid HTML Legacy C14N Catalog XPath XPointer XInclude Iconv ISO8859X Unicode Regexps Automata Expr Schemas Schematron Modules Debug Zlib Lzma
xmlstarlet:
xmlstarlet --version
1.6.1
compiled against libxml2 2.9.1, linked with 20904
compiled against libxslt 1.1.28, linked with 10129
Additional Info
Using Python, with code taken from the Python XML schema validation snippets, I found that product.xsd does not validate product.xml there either. It's hard to believe that Python also has this bug, so I am now looking for an explanation of why the pattern expression in product.xsd is not working.
The question is: why do the enclosing brackets not work with the exclusive \P{Ll}?
More Additional Info
On the other hand, the Scala snippet here is able to validate product.xml against product.xsd, so the pattern syntax in product.xsd appears to be correct. Yet xmllint, xmlstarlet and Python could not validate it. What is going on here?

How to read a value from property file and replace it on xml using shell script

I have an XML file which contains a certain path in multiple places.
I want to fetch a value from a .properties file and use it to replace that part of the path wherever it appears in the XML.
For example, consider the XML file below.
<?xml version="1.0" encoding="ISO-8859-1"?>
...
...
<classpath>
<pathelement location="/profiles/sh/finalFolder/Apache/example.jar" />
</classpath>
<property name="executable" value="/profiles/sh/finalFolder/Apache/instjamr/install" />
<fileset dir="/profiles/sh/finalFolder/Apache/ant"/>
This XML file contains the path /profiles/sh/finalFolder, followed by various suffixes, in many places.
I also have a path.properties file which contains (key,value) pairs such as
FinalFolder=/new/final/exit (the user can edit this value in the properties file at any time)
I want to replace that path with the value given in the .properties file for the key FinalFolder.
So, finally, I need to write the code for this in a .sh file.
Please help. Thanks in advance.
(Please don't mark this question as a duplicate, as I didn't find an appropriate, implementable answer to my question.)

How to create news ticker XML file in Joomla?

I have this extension, and all I want to know is how to feed the news ticker from an XML file and how I should create that file.
I tried this one and it is not working.
XML file
http://www.4shared.com/document/uNJ9sdmD/newsxml.html
Extension link
http://extensions.joomla.org/extensions/news-display/articles-display/news-tickers-a-scrollers/6633?qh=YToxMDp7aTowO3M6NDoibmV3cyI7aToxO3M6MzoibmV3IjtpOjI7czo3OiJuZXduZXNzIjtpOjM7czo1OiInbmV3cyI7aTo0O3M6NToibmV3J3MiO2k6NTtzOjY6Im5ld2VseSI7aTo2O3M6NToibmV3cyciO2k6NztzOjY6Im5ld3MnLCI7aTo4O3M6NjoiJ25ldycsIjtpOjk7czo2OiInbmV3cyciO30%3D
First, create the XML file and put it in the correct directory:
/modules/mod_highlighter_gk4/xml --> this is where you need to put the file.
In that folder you can find a sample XML file named Sample, which shows you exactly how to format your XML file. It looks something like this:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<highlighter>
<item>
<title>Title 1</title>
<desc>Item description 1</desc>
<link>http://item1.title.com</link>
</item>
<item>
<title>Title 1</title>
<desc>Item description 1</desc>
<link>http://item1.title.com</link>
</item>
</highlighter>
In the module itself, first make sure you select the XML file as the "data source".
Then, in the XML File option, enter the name of the XML file you created, and it should work.
You can find more information on their website as well.
http://www.gavick.com/forums/highlighter.html

Removing specific element from xml using sax with xerces library in c++

My problem is that I want to remove the lastfiles element and all its child elements using SAX with the Xerces API in C++.
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE config>
<config datecreated="20011210">
<user>
John Smith
</user>
<login>jsmith</login>
<password>topsecret</password>
<lastfiles>
<lastfile timestamp="20011210T1002">accounts.txt</lastfile>
<lastfile timestamp="20011190T1132">/home/jsmith/docs/letter.doc</lastfile>
</lastfiles>
</config>

XercesDOMParser and XIncludes

I am attempting to get xincludes working in an existing system that uses a XercesDOMParser in xercesc to parse incoming xml from a client. I am working with Apache Xercesc v3.0.1, and the incoming XML, read from an input stream, is:
<?xml version="1.0" encoding="UTF-8"?>
<VisionServer xmlns:xi="http://www.w3.org/2001/XInclude">
<CompositeObject>
<xi:include href="testguioutput.xml" />
while testguioutput.xml contains
<?xml version="1.0" encoding="UTF-8"?>
<GUIOutput>
<Input>Input</Input>
<Title>IDC2_1</Title>
</GUIOutput>
The existing code uses a wrapper around a XercesDOMParser to parse the XML as it comes in. After setting setDoNamespaces and setDoXInclude to true, the parser does attempt to process the XInclude, but I get a persistent "Fatal: include failed and no fallback element found in document '{0}'" error, no matter where in the directory structure I put testguioutput.xml.
I am working in Visual Studio 2008 with the default working directory and running out of /project/debug, but the include fails whether the target file is in /project/ or /project/debug/.
I was able to expand the xinclude tags using the XInclude.exe sample application that ships with Xerces. To do this, I created two files based on your files above:
test1.xml:
<?xml version="1.0" encoding="UTF-8"?>
<VisionServer xmlns:xi="http://www.w3.org/2001/XInclude">
<CompositeObject>
<xi:include href="test2.xml"/>
</CompositeObject>
</VisionServer>
test2.xml:
<?xml version="1.0" encoding="UTF-8"?>
<GUIOutput>
<Input>Input</Input>
<Title>IDC2_1</Title>
</GUIOutput>
At the command line I executed:
"XInclude.exe test1.xml test1_expanded.xml" without quotes.
The resulting test1_expanded.xml file:
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<VisionServer xmlns="" xmlns:xi="http://www.w3.org/2001/XInclude">
<CompositeObject>
<GUIOutput xml:base="test2.xml">
<Input>Input</Input>
<Title>IDC2_1</Title>
</GUIOutput>
</CompositeObject>
</VisionServer>