Extra line break produced in converted file in XSLT - xslt

I am using XSLT to convert a document from XML to text, but an extra line break is produced after every element. Please suggest how to solve this problem. I have attached a screenshot for reference.
Example
Input:
<?xml version="1.0" encoding="UTF-8"?>
<book-part book-part-type="chapter" id="chapter1">
<book-part-meta>
<title-group>
<label>1</label>
<title>The Developmental Origins of Health and Disease—Where Did It All Begin?</title>
</title-group>
<contrib-group>
<contrib contrib-type="author"><name><surname>Nicholas</surname><given-names>L. M.</given-names></name></contrib>
<contrib contrib-type="author"><name><surname>Ozanne</surname><given-names>S. E.</given-names></name></contrib>
</contrib-group>
</book-part-meta>
<body>
<sec id="sec1_1">
<label>1.1</label>
<title>THE DEVELOPMENTAL ORIGINS OF ADULT DISEASE—ORIGINS OF THE HYPOTHESIS</title>
<p>One of the earliest proposals establishing the association between early life events and the risk for disease in adult life was more than 80 years ago by Kermack and colleagues.
<fig id="fig1_1">
<label>FIGURE 1.1</label>
<caption><p>Exposure to suboptimal nutrition during fetal development results in an adaptive response to optimize the growth of key body organs to the detriment of others. </p></caption>
<graphic href="001x001"/>
</fig>
</p>
</sec>
</body>
</book-part>
XSL:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:fo="http://www.w3.org/1999/XSL/Format" xmlns:xlink="http://www.w3.org/1999/xlink">
<xsl:output method="text" omit-xml-declaration="yes" standalone="yes" indent="no"/>
<xsl:template match="fig/label"/>
<xsl:template match="fig/caption">
<xsl:text disable-output-escaping="yes">\caption{</xsl:text><xsl:apply-templates/><xsl:text disable-output-escaping="yes">}</xsl:text>
</xsl:template>
<xsl:template match="fig/graphic">
<xsl:text disable-output-escaping="yes">\includegraphics{</xsl:text><xsl:apply-templates select="@href"/><xsl:text disable-output-escaping="yes">.pdf}</xsl:text>
</xsl:template>
</xsl:stylesheet>

You can use <xsl:strip-space elements="*" /> to remove extra line breaks.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:fo="http://www.w3.org/1999/XSL/Format"
xmlns:xlink="http://www.w3.org/1999/xlink">
<xsl:output method="text" omit-xml-declaration="yes" standalone="yes" indent="no" />
<xsl:strip-space elements="*" />
<xsl:template match="fig/label" />
<xsl:template match="fig/caption">
<xsl:text disable-output-escaping="yes">\caption{</xsl:text>
<xsl:apply-templates />
<xsl:text disable-output-escaping="yes">}</xsl:text>
</xsl:template>
<xsl:template match="fig/graphic">
<xsl:text disable-output-escaping="yes">\includegraphics{</xsl:text>
<xsl:apply-templates select="@href" />
<xsl:text disable-output-escaping="yes">.pdf}</xsl:text>
</xsl:template>
</xsl:stylesheet>
http://xsltransform.net/gVAjbT2
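The <xsl:strip-space> element names the elements whose whitespace-only text nodes should be removed from the source tree before processing. Preserving whitespace is the default, so the <xsl:preserve-space> element is only necessary if the <xsl:strip-space> element is used. In the question's stylesheet, the built-in template rules copy the whitespace-only text nodes (the line breaks between elements) to the text output, which is where the extra lines come from.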
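As a minimal sketch of how the two declarations combine: the fragment below strips whitespace-only text nodes everywhere except directly inside p. Exempting p is only an assumption for illustration; the original answer strips whitespace for all elements.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <!-- Remove whitespace-only text nodes from every element ... -->
  <xsl:strip-space elements="*"/>
  <!-- ... except directly inside p (assumed exemption, for illustration only).
       A name test takes precedence over the * wildcard. -->
  <xsl:preserve-space elements="p"/>
</xsl:stylesheet>
```

Only whitespace-only text nodes are affected; text that contains any non-whitespace character is never stripped by these declarations.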

Related

How to use one cycle for different tags

I have XML with numbered variants of the same tags:
<?xml version="1.0" encoding="UTF-8"?>
<main>
<ROUTES>
<A1_NE>LSN/EMS_XDM_12/1021</A1_NE>
<A2_NE>LSN/EMS_XDM_12/1022</A2_NE>
<Z1_NE>LSN/EMS_XDM_12/1023</Z1_NE>
<Z2_NE>LSN/EMS_XDM_12/1024</Z2_NE>
</ROUTES>
<ROUTES>
<A1_NE>LSN/EMS_XDM_12/1001</A1_NE>
<A2_NE>LSN/EMS_XDM_12/1002</A2_NE>
<A3_NE>LSN/EMS_XDM_12/1003</A3_NE>
<A4_NE>LSN/EMS_XDM_12/1004</A4_NE>
<Z1_NE>LSN/EMS_XDM_12/1005</Z1_NE>
<Z2_NE>LSN/EMS_XDM_12/1006</Z2_NE>
</ROUTES>
</main>
XSLT:
<?xml version="1.1" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="no"/>
<xsl:template match="/">
<MAIN>
<xsl:for-each select="main/ROUTES">
<xsl:element name="ROUTES">
<A_NE><xsl:value-of select="A1_NE"/></A_NE>
<A_NE><xsl:value-of select="A2_NE"/></A_NE>
<A_NE><xsl:value-of select="A3_NE"/></A_NE>
<A_NE><xsl:value-of select="A4_NE"/></A_NE>
<Z_NE><xsl:value-of select="Z1_NE"/></Z_NE>
<Z_NE><xsl:value-of select="Z2_NE"/></Z_NE>
</xsl:element>
</xsl:for-each>
</MAIN>
</xsl:template>
</xsl:stylesheet>
How can I use for-each to transform A1_NE, A2_NE, etc. into A_NE elements?
I also don't understand how to find out the row number in the source XML.
Perhaps XSLT 1.0 can't do this transformation.
Desired output:
<?xml version="1.0" encoding="UTF-8"?>
<main>
<ROUTES>
<A_NE>LSN/EMS_XDM_12/1021</A_NE>
<A_NE>LSN/EMS_XDM_12/1022</A_NE>
<Z_NE>LSN/EMS_XDM_12/1023</Z_NE>
<Z_NE>LSN/EMS_XDM_12/1024</Z_NE>
<A_NE>LSN/EMS_XDM_12/1001</A_NE>
<A_NE>LSN/EMS_XDM_12/1002</A_NE>
<A_NE>LSN/EMS_XDM_12/1003</A_NE>
<A_NE>LSN/EMS_XDM_12/1004</A_NE>
<Z_NE>LSN/EMS_XDM_12/1005</Z_NE>
<Z_NE>LSN/EMS_XDM_12/1006</Z_NE>
</ROUTES>
</main>
You should make use of template matching, to change the node names.
First select the child nodes of all ROUTES like so:
<xsl:apply-templates select="main/ROUTES/*" />
Then, have templates like this, for example, to do the renaming
<xsl:template match="A1_NE|A2_NE|A3_NE|A4_NE">
<A_NE><xsl:value-of select="."/></A_NE>
</xsl:template>
Try this XSLT
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/">
<main>
<ROUTES>
<xsl:apply-templates select="main/ROUTES/*" />
</ROUTES>
</main>
</xsl:template>
<xsl:template match="A1_NE|A2_NE|A3_NE|A4_NE">
<A_NE><xsl:value-of select="."/></A_NE>
</xsl:template>
<xsl:template match="Z1_NE|Z2_NE|Z3_NE|Z4_NE">
<Z_NE><xsl:value-of select="."/></Z_NE>
</xsl:template>
</xsl:stylesheet>
Alternatively, if those are indeed your real element names, you could try and make it generic
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/">
<main>
<ROUTES>
<xsl:apply-templates select="main/ROUTES/*" />
</ROUTES>
</main>
</xsl:template>
<xsl:template match="ROUTES/*">
<xsl:element name="{substring(local-name(), 1, 1)}_{substring-after(local-name(), '_')}">
<xsl:value-of select="."/>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
The logic that needs to be applied is not apparent from the example given. Perhaps all you need to do is:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/main">
<main>
<ROUTES>
<xsl:for-each select="ROUTES/*">
<xsl:element name="{translate(name(), '1234567890', '')}">
<xsl:value-of select="." />
</xsl:element>
</xsl:for-each>
</ROUTES>
</main>
</xsl:template>
</xsl:stylesheet>
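The question also asks how to find the row number, which the answers above do not cover. A minimal sketch, assuming "row number" means the index of each ROUTES element in the source: position() inside xsl:for-each returns the 1-based index of the current node. The row attribute is a hypothetical name, and this variant keeps one ROUTES per source row rather than merging them as in the desired output.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="/main">
    <main>
      <xsl:for-each select="ROUTES">
        <!-- position() is the 1-based index of the current ROUTES
             among the nodes selected by the for-each -->
        <ROUTES row="{position()}">
          <xsl:for-each select="*">
            <!-- Drop the digits from the element name, as in the answer above -->
            <xsl:element name="{translate(name(), '1234567890', '')}">
              <xsl:value-of select="."/>
            </xsl:element>
          </xsl:for-each>
        </ROUTES>
      </xsl:for-each>
    </main>
  </xsl:template>
</xsl:stylesheet>
```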

XSL - Make all node values with namespace available in CDATA sections

I am looking for an XSL stylesheet that transforms the input below into the expected output. This is just a sample; the actual input XML has more than 1000 nodes, which is too many to handle individually with CDATA sections in the XSL. Could you please help?
Input:
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
Output:
<note>
<to><![CDATA[Tove]]></to>
<from><![CDATA[Jani]]></from>
<heading><![CDATA[Reminder]]></heading>
<body><![CDATA[Don't forget me this weekend!]]></body>
</note>
You can achieve this by using the cdata-section-elements attribute of the xsl:output element which specifies all the elements that should be output in CDATA sections.
So use the following XSLT-1.0:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml" version="1.0" cdata-section-elements="to from heading body" encoding="UTF-8" indent="yes" omit-xml-declaration="yes" />
<!-- Identity template -->
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*" />
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Note that cdata-section-elements lists the elements to, from, heading and body, so the text content of each is written as a CDATA section. The identity template simply copies the whole document, and the serializer applies the CDATA wrapping on output.
If your elements are in a namespace, you have to prefix the element's names in the cdata-section-elements with the appropriate namespace-prefix.
For example, if you have the following XML with a namespace on the root element, all children nodes are in that namespace, too.
<?xml version="1.0" encoding="utf-8"?>
<Bank xmlns="http://xxyy.x.com" Operation="Create">
<Customer type="random">
<CustomerId>Id10</CustomerId>
<CountryCode>CountryCode19</CountryCode>
<LanguageCode>LanguageCode20</LanguageCode>
<AddressArray>
<Address type="primary">
<StreetAddress>179 Alfred St</StreetAddress>
<City>Fortitude Valley</City>
<County>GR</County>
<Country>India</Country>
</Address>
</AddressArray>
</Customer>
</Bank>
Use this XSLT (pay attention to the namespace declaration on the xsl:stylesheet element):
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0" xmlns:ns0="http://xxyy.x.com">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes" cdata-section-elements="ns0:CustomerId ns0:CountryCode ns0:LanguageCode ns0:StreetAddress ns0:City ns0:County ns0:Country" omit-xml-declaration="yes" />
<!-- Identity template -->
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*" />
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Make sure that the namespace of the XSLT matches the namespace of the XML, here both are http://xxyy.x.com, but in your sample XML it is xxyy.x.com.
EDIT 2:
If you have a large number of elements, you can either add them all to cdata-section-elements (maybe by constructing the list with another stylesheet) or use the solution I found here: Wrapping all elements in CDATA. It is kind of a hack.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes" omit-xml-declaration="yes" />
<xsl:variable name="CDATABegin" select="'&lt;![CDATA['" />
<xsl:variable name="CDATAEnd" select="']]&gt;'" />
<!-- Identity template -->
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*" />
</xsl:copy>
</xsl:template>
<xsl:template match="text()[normalize-space()]">
<xsl:value-of select="$CDATABegin" disable-output-escaping="yes"/>
<xsl:value-of select="." disable-output-escaping="yes"/>
<xsl:value-of select="$CDATAEnd" disable-output-escaping="yes"/>
</xsl:template>
</xsl:stylesheet>
This wraps all non-empty text() nodes in CDATA sections. But here, too, you'd have to mention all elements in template matching rules containing the CDATA wrapping code.
Which variant would be easier to apply depends on the greater scenario.
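The idea of building the cdata-section-elements list with another stylesheet could look roughly like the sketch below. It is an assumption about how one might do it, not part of the original answer: it prints the distinct names of all elements that directly contain non-whitespace text, separated by spaces, using a key for de-duplication. The key name text-elements is hypothetical.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <!-- Index every element that has a non-whitespace text child by its name -->
  <xsl:key name="text-elements"
           match="*[text()[normalize-space()]]"
           use="name()"/>
  <xsl:template match="/">
    <!-- Keep only the first element for each distinct name -->
    <xsl:for-each select="//*[text()[normalize-space()]]
                             [generate-id() = generate-id(key('text-elements', name())[1])]">
      <xsl:value-of select="name()"/>
      <xsl:if test="position() != last()">
        <xsl:text> </xsl:text>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

For the note sample above this should list to, from, heading and body. For elements in a default namespace, name() returns the unprefixed name, so a prefix would still have to be added by hand before pasting the list into cdata-section-elements.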

XSLT for iTunes importer

I need to transform a given XML to another format. This is the source:
<?xml version="1.0" encoding="utf-8"?>
<package xmlns="http://apple.com/itunes/importer" version="film5.1">
<provider>Provider</provider>
<language>de-DE</language>
<video>
<type>film</type>
<subtype>feature</subtype>
<vendor_id>some_id</vendor_id>
<country>US</country>
<original_spoken_locale>en</original_spoken_locale>
<title>Some movie title</title>
</video>
</package>
And this is the XSLT I tried:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:importer="http://apple.com/itunes/importer" version="1.0">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="importer:package">
<xsl:variable name="var-title">
<xsl:apply-templates select="video/title"/>
</xsl:variable>
<Movie>
<Title><xsl:value-of select="$var-title"/></Title>
</Movie>
</xsl:template>
</xsl:stylesheet>
But the <title> from source XML is not selected. What did I do wrong?
The namespace applies to the descendants as well so change <xsl:apply-templates select="video/title"/> to <xsl:apply-templates select="importer:video/importer:title"/> to use a prefix as well.
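Applied to the stylesheet from the question, the fix might look like the sketch below. The intermediate variable is dropped and exclude-result-prefixes is added to keep the importer namespace declaration out of the result; both are optional simplifications, not part of the original answer.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:importer="http://apple.com/itunes/importer"
                exclude-result-prefixes="importer"
                version="1.0">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:template match="importer:package">
    <Movie>
      <!-- Every step of the path needs the prefix, because video and title
           inherit the default namespace declared on package -->
      <Title><xsl:value-of select="importer:video/importer:title"/></Title>
    </Movie>
  </xsl:template>
</xsl:stylesheet>
```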

XSLT Transformation: Search nodes, and return hierarchical parents

Hoping this hasn't been asked before. I have the following XML:
<Company id="1000" name="Company1000">
<Company id="1020" name="Company1020" />
<Company id="1004" name="Company1004">
<Company id="1005" name="Company1005" />
</Company>
<Company id="1022" name="Company1022" />
</Company>
I have the following XPath to search for nodes: //*[contains(translate(@name, "ABCDEFGHIJKLMNOPQRSTUVWXYZ", "abcdefghijklmnopqrstuvwxyz"), "005")]
I would like this to return:
<Company id="1000" name="Company1000">
<Company id="1004" name="Company1004">
<Company id="1005" name="Company1005" />
</Company>
</Company>
So this matches the Company1005 node, and all its respective parents. I would like the above to also be returned if I were searching for "100", which in that case, would match each element in turn, but I clearly don't want duplication of nodes.
I've been struggling with this for hours now, so your help will be much appreciated!!!
Thanks.
The solution is to copy only those Company elements that match your requirement themselves or have a matching descendant, which is what the descendant-or-self axis selects.
See the picture at the bottom of this page for the XPath axes you need.
Long version:
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<!-- Default: copy everything -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<!-- copy a Company only if it, or one of its descendants, contains the matching string -->
<xsl:template match="Company">
<xsl:if test="descendant-or-self::Company[contains(translate(@name, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), '005')]">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
Short version:
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<!-- Default: copy everything -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<!-- do not copy a Company unless it, or one of its descendants, contains the matching string -->
<xsl:template match="Company[not(descendant-or-self::Company[contains(translate(@name, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), '005')])]"/>
</xsl:stylesheet>
Output for your XML:
<?xml version="1.0" encoding="UTF-8"?>
<Company id="1000" name="Company1000">
<Company id="1004" name="Company1004">
<Company id="1005" name="Company1005"/>
</Company>
</Company>
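Since the question also mentions searching for other strings such as "100", the long version can be parameterized. This is a sketch under assumptions: the parameter name term and the variables upper and lower are hypothetical, and the xsl:if form is used because XSLT 1.0 does not allow variable references in match patterns.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>
  <!-- Search string; can be overridden by the caller (hypothetical name) -->
  <xsl:param name="term" select="'005'"/>
  <xsl:variable name="upper" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>
  <xsl:variable name="lower" select="'abcdefghijklmnopqrstuvwxyz'"/>
  <!-- Default: copy everything -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <!-- Copy a Company only if it or a descendant matches, case-insensitively -->
  <xsl:template match="Company">
    <xsl:if test="descendant-or-self::Company[contains(
                    translate(@name, $upper, $lower),
                    translate($term, $upper, $lower))]">
      <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

Each Company is visited once by the recursive copy, so a term that matches several levels of the hierarchy does not produce duplicated nodes.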

xpath selection on elements with namespaces

Here's a trivial but valid Docbook article:
<?xml version="1.0" encoding="utf-8"?>
<article xmlns="http://docbook.org/ns/docbook" version="5.0">
<title>I Am Title</title>
<para>I am content.</para>
</article>
Here's a stylesheet that selects title if I remove the xmlns attribute above, and not if I leave it in:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html"/>
<xsl:template match="/">
<xsl:apply-templates select="article"/>
</xsl:template>
<xsl:template match="article">
<p><xsl:value-of select="title"/></p>
</xsl:template>
<xsl:template match="text()"/>
</xsl:stylesheet>
How do I talk XPath into selecting title through article if it has that namespace attribute?
You need to declare a prefix (an alias) for your namespace and use that prefix in your XPath expressions:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:a="http://docbook.org/ns/docbook"
exclude-result-prefixes="a"
>
<xsl:output method="html"/>
<xsl:template match="/">
<xsl:apply-templates select="a:article"/>
</xsl:template>
<xsl:template match="a:article">
<p><xsl:value-of select="a:title"/></p>
</xsl:template>
<xsl:template match="text()"/>
</xsl:stylesheet>