Break up XML into parts using XSLT 2.0 - xslt

I am given XML as input and have no control over its structure. I need to break the XML up into parts and process each part separately. Below is a very simplified version of a file that I would process.
I am trying to use the grouping functionality of XSLT 2.0 to break up this XML, using the <breakEle> tag as the part boundary. The <breakEle> can appear at any level. Is what I'm trying to do even possible with XSLT 2.0? I have accomplished this with XSLT 1.0 using Muenchian grouping, but I want to get away from that if I can.
Sample input:
<item class="poem">
<div>
<div>
<p>paragraph 1</p>
<breakEle groupNum="1"/>
</div>
<div>
<p>Paragraph in another div.</p>
</div>
<breakEle groupNum="2"/>
<div>
<div>
<h4>header</h4>
<p>1st line</p>
<p>2nd line</p>
<br/>
<p>3rd line</p>
<p>4th line</p>
<page n="100"/>
<p>5th line</p>
</div>
<breakEle groupNum="3"/>
</div>
</div>
</item>
What I'm trying to work with:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:xd="http://www.oxygenxml.com/ns/doc/xsl"
exclude-result-prefixes="xs xd"
version="2.0">
<xsl:template match="/">
<newRoot>
<xsl:copy>
<xsl:for-each-group select="*" group-ending-with="breakEle">
<div num="{@groupNum}">
<xsl:copy-of select="current-group()"/>
</div>
</xsl:for-each-group>
</xsl:copy>
</newRoot>
</xsl:template>
</xsl:stylesheet>
Would like to end up with something like this:
<newRoot>
<div num="1">
<p>paragraph 1</p>
</div>
<div num="2">
<p>Paragraph in another div.</p>
</div>
<div num="3">
<h4>header</h4>
<p>1st line</p>
<p>2nd line</p>
<br/>
<p>3rd line</p>
<p>4th line</p>
<page n="100"/>
<p>5th line</p>
</div>
</newRoot>

The following stylesheet returns the expected result when applied to the given example.
It works under the assumption that each group should only contain leaf elements.
XSLT 2.0
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/item">
<newRoot>
<xsl:for-each-group select=".//*[not(*)]" group-ending-with="breakEle">
<div num="{current-group()[last()]/@groupNum}">
<xsl:copy-of select="current-group()[not(self::breakEle)]"/>
</div>
</xsl:for-each-group>
</newRoot>
</xsl:template>
</xsl:stylesheet>
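To make the "leaf elements only" assumption concrete, here is a minimal Python sketch of the same idea outside XSLT: walk the leaf elements in document order and close a group at every breakEle. The function names (group_leaves, build_new_root) and the shortened sample are made up for illustration; this is not a replacement for the stylesheet above, just a way to see what group-ending-with is doing.

```python
import xml.etree.ElementTree as ET

# Shortened version of the sample input from the question.
SAMPLE = """<item class="poem">
  <div>
    <div>
      <p>paragraph 1</p>
      <breakEle groupNum="1"/>
    </div>
    <div>
      <p>Paragraph in another div.</p>
    </div>
    <breakEle groupNum="2"/>
    <div>
      <div>
        <h4>header</h4>
        <p>1st line</p>
        <page n="100"/>
      </div>
      <breakEle groupNum="3"/>
    </div>
  </div>
</item>"""


def group_leaves(root, boundary="breakEle"):
    """Return (groupNum, [leaf elements]) pairs, closing a group at each boundary."""
    groups, current = [], []
    # iter() walks in document order; skipping the root and every element
    # that has child elements roughly mirrors the XPath .//*[not(*)].
    for elem in root.iter():
        if elem is root or len(elem) > 0:
            continue
        if elem.tag == boundary:
            groups.append((elem.get("groupNum"), current))
            current = []
        else:
            current.append(elem)
    # Leaves after the last boundary form a final group without a number.
    if current:
        groups.append((None, current))
    return groups


def build_new_root(root):
    """Build <newRoot> with one <div num="..."> per group of copied leaves."""
    new_root = ET.Element("newRoot")
    for num, leaves in group_leaves(root):
        div = ET.SubElement(new_root, "div", {"num": num or ""})
        for leaf in leaves:
            copy = ET.SubElement(div, leaf.tag, dict(leaf.attrib))
            copy.text = leaf.text
    return new_root


result = build_new_root(ET.fromstring(SAMPLE))
print(ET.tostring(result, encoding="unicode"))
```

Note that any wrapper div elements are dropped, exactly as in the stylesheet: only leaves survive, so mixed content or nested inline markup would need a different approach.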

Related

Strip all the text from a specific node and remove all tags from xml using xslt1

I'm trying to strip all tags from an XML document, and I also need to strip all text from one specific node only. For more clarity, see the example below:
<root>
<p>My 1st Semester Visual</p>
<p>
<b>Self Reflection</b>
</p>
<p>The activity</p>
<content-block>
<div class="imageWrapper" />
</content-block>
<p id="5fce699db97470099ea6c7e6"> </p>
<content-block>
<div class="carousel">
<div class="carouselHeader" />
<div class="carouselNavbar">
<div class="carouselNavbarThumbnails" />
</div>
</div>
My Space Unit Flyer
</content-block>
<div>
<br />
</div>
</root>
Result:
<root><text>My 1st Semester VisualSelf ReflectionThe activity
My Space Unit Flyer
</text><contentBlocks>2</contentBlocks></root>
Expected result: I also need to remove text that is inside the <content-block>.
<root><text>My 1st Semester VisualSelf ReflectionThe activity
</text><contentBlocks>2</contentBlocks></root>
My xslt:
<xsl:transform version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" encoding="UTF-8" indent="no" omit-xml-declaration="yes"/>
<!-- Strip out white space -->
<xsl:strip-space elements="*"/>
<!-- Strip out all html tags, only leaving text contents -->
<xsl:template match="*">
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="root">
<root>
<text>
<xsl:apply-templates/>
</text>
<contentBlocks>
<xsl:if test="//content-block">
<xsl:value-of select="count(//content-block)"/>
</xsl:if>
<xsl:if test="figure">
<xsl:value-of select="count(figure)"/>
</xsl:if>
</contentBlocks>
</root>
</xsl:template>
</xsl:transform>
Thanks in advance
<!-- Add this to your code. It suppresses content-block. -->
<xsl:template match="content-block"/>

cdata-section-elements not working for dynamically created element

I am trying to define some dynamically created elements as cdata sections, but it's not working for some reason:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="no" indent="yes" method="xml"
cdata-section-elements="DESCRIPTION2"
/>
<xsl:strip-space elements="*"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()" />
</xsl:copy>
</xsl:template>
<xsl:template match="/RSS/ITEM/TEST">
<DESCRIPTION2>
<div class="container">
<xsl:if test="NAME != ''">
<div class="test">
<xsl:value-of select="NAME"/>
</div>
</xsl:if>
</div>
</DESCRIPTION2>
</xsl:template>
</xsl:stylesheet>
Test XML:
<?xml version="1.0" encoding="UTF-8"?>
<RSS>
<ITEM>
<CODE>41,000</CODE>
<TEST>
<NAME><p>HTML code</p></NAME>
</TEST>
</ITEM>
</RSS>
Live test.
Sure, I can add it manually (<xsl:text disable-output-escaping="yes"><![CDATA[</xsl:text>), but I would like to know why it's not working if I define it via cdata-section-elements.
CDATA serialization is applied only to text nodes that are children of the nominated elements; if you put elements there, they are serialized as ordinary markup and no CDATA section is produced. Note that, assuming an XSLT 3 processor with XPath 3.1 support, you can use the serialize function to serialize the content you build as html and then output it as a text node:
<xsl:template match="/RSS/ITEM/TEST">
<xsl:variable name="html">
<div class="container">
<xsl:if test="NAME != ''">
<div class="test">
<xsl:value-of select="NAME"/>
</div>
</xsl:if>
</div>
</xsl:variable>
<DESCRIPTION2>
<xsl:value-of select="serialize($html, map { 'method' : 'html' })"/>
</DESCRIPTION2>
</xsl:template>
http://xsltfiddle.liberty-development.net/948Fn5i/1 then gives the result as a CDATA section
<DESCRIPTION2><![CDATA[<div class="container">
<div class="test">Peter</div>
</div>]]></DESCRIPTION2>
Your content is well-formed XHTML, so it doesn't need to apply CDATA when serializing the content.
If you escaped the markup and constructed a string, it would serialize as CDATA:
<xsl:template match="/RSS/ITEM/TEST">
<DESCRIPTION2>
&lt;div class="container"&gt;
<xsl:if test="NAME != ''">
&lt;div class="test"&gt;
<xsl:value-of select="NAME"/>
&lt;/div&gt;
</xsl:if>
&lt;/div&gt;
</DESCRIPTION2>
</xsl:template>
Produces:
<DESCRIPTION2><![CDATA[
<div class="container">
<div class="test">
Peter
</div>
</div>
]]></DESCRIPTION2>
But why would you want to generate a string when you could have well-formed markup? It makes it a pain for everyone downstream.

XSLT, I'm studying with my own XSLT example, but it doesn't work as I expected

First, my XML is:
<?xml version="1.0" encoding="UTF-8"?>
<mpml>
<problem>
<context>
<p>For the two polynomials $A=x^2 - xy + 2y^2$ and $B=3x^2 + 2xy - y^2$, when the result of computing $A-B$ is $ax^2 +bxy + cy^2$, what is the value of the constant $a+b+c$?</p>
</context>
<answerlist>
<i>-4</i>
<i>-2</i>
<i>0</i>
<i>2</i>
<i>4</i>
</answerlist>
</problem>
<problem>
<context>
<p>If the solution of the system of equations $\begin{cases} x+y+z=30 \\ 2x+3y+4z=93 \\ y=z+3 \end{cases}$ is $x=a$, $y=b$, $z=c$, what is the value of $a-2b+3c$? (Here $a$, $b$, $c$ are real numbers.)</p>
</context>
<answerlist>
<i>7</i>
<i>9</i>
<i>11</i>
<i>13</i>
<i>15</i>
</answerlist>
</problem>
</mpml>
and its XSL code is:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<h2> Testing.. </h2>
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="problem">
<div class="problem">
<xsl:apply-templates select="context"/>
<xsl:apply-templates select="answerlist"/>
</div>
</xsl:template>
<xsl:template match="context">
<div class="context">
</div>
</xsl:template>
<xsl:template match="answerlist">
<div class="answerlist">
test answerlist
</div>
</xsl:template>
</xsl:transform>
In the XSLT Tryit editor (thanks to w3schools), the above XSLT works as I expected. However, when I tried it on my server, I got an 'extra content at the end of the document.' error and nothing is shown except the output of the very first template.
I think the problem is that your XSLT does not produce well-formed XML in output: in the template <xsl:template match="/">, you should wrap content within a single root tag.
E.g.:
<xsl:template match="/">
<body>
<h2> Testing.. </h2>
<xsl:apply-templates/>
</body>
</xsl:template>
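To see why the missing wrapper matters, here is a small Python sketch that feeds both shapes of output to an XML parser. The helper name is_well_formed and the two sample strings are made up for illustration, and I'm assuming your server parses the transformation result as XML. The exact error text depends on the parser (libxml2 reports "Extra content at the end of the document", while Python's expat-based parser reports "junk after document element"), but the cause is the same: more than one top-level element.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as a single well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Roughly what the original match="/" template emits: several top-level elements.
no_single_root = '<h2> Testing.. </h2><div class="problem"/><div class="problem"/>'

# The same content wrapped in one root element, as suggested above.
wrapped = "<body>" + no_single_root + "</body>"

print(is_well_formed(no_single_root))
print(is_well_formed(wrapped))
```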

Removing a div tag in variable of xsl file

I have to remove a div (menu) with a ul tag in it. All the data is stored in a variable $data. I have to remove that div from the variable through XSLT.
Before:
<div id="container">
<div id="menu">
<ul>
</ul>
</div>
</div>
After:
<div id="container">
</div>
If you know that the div id="menu" is the only content of that container div, you could simply make a shallow copy of the container div. In general, with XSLT 1.0, a variable with content is a result tree fragment; to process it further with XSLT/XPath (other than outputting it with value-of or copy-of) you need to apply exsl:node-set to the variable. You can then process the elements with the identity transformation plus an empty template for div[@id = 'menu'], which deletes that div by not processing it (online at http://xsltransform.net/bFN1y9C):
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
xmlns:exsl="http://exslt.org/common" exclude-result-prefixes="exsl">
<xsl:output method="html" indent="yes"/>
<xsl:variable name="data">
<div id="container">
<div id="menu">
<ul>
</ul>
</div>
</div>
</xsl:variable>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:variable name="data2">
<xsl:apply-templates select="exsl:node-set($data)/node()"/>
</xsl:variable>
<xsl:template match="div[@id = 'menu']"/>
<xsl:template match="/">
<xsl:copy-of select="$data2"/>
</xsl:template>
</xsl:transform>
If you need to perform other transformation steps you might need to separate the different steps by using modes.

How to Import stylesheets in xslt conditionally?

Is there any way to import stylesheets after checking some conditions?
Like,if the value of variable $a="1" then import 1.xsl or else import 2.xsl.
No, the <xsl:import> directive is only compile-time.
In XSLT 2.0 one can use the use-when attribute for a limited conditional compilation.
For example:
<xsl:import href="module-A.xsl"
use-when="system-property('xsl:vendor')='vendor-A'"/>
The limitations of the use-when attribute are that there is no dynamic context when the attribute is evaluated -- in particular that means that there are no in-scope variables defined.
A non-XSLT solution is to dynamically change the href attribute of the <xsl:import> declaration before the transformation is invoked:
1. Parse the xsl stylesheet as an XML file.
2. Evaluate the condition that determines which stylesheet should be imported.
3. Set the value of the href attribute of the <xsl:import> declaration to the URI of the dynamically determined stylesheet-to-be-imported.
4. Invoke the transformation with the in-memory xsl stylesheet that was just modified.
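As a rough illustration of the first three steps, here is a minimal Python sketch using only the standard library. The function name choose_import, the placeholder stylesheet, and the 1.xsl / 2.xsl file names (taken from the question) are assumptions for the example; the last step, invoking the transformation, needs whatever XSLT processor you are using and is not shown.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"
# Keep the conventional "xsl" prefix when the tree is written back out.
ET.register_namespace("xsl", XSL_NS)

# Hypothetical main stylesheet whose import target is decided at run time.
STYLESHEET = """<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:import href="placeholder.xsl"/>
  <xsl:template match="/">
    <xsl:apply-imports/>
  </xsl:template>
</xsl:stylesheet>"""


def choose_import(stylesheet_text, a):
    """Return the stylesheet text with xsl:import pointing at 1.xsl or 2.xsl."""
    # Step 1: parse the stylesheet as ordinary XML.
    root = ET.fromstring(stylesheet_text)
    # Step 2: evaluate the condition from the question ($a = "1").
    target = "1.xsl" if a == "1" else "2.xsl"
    # Step 3: rewrite the href of every top-level xsl:import.
    for imp in root.findall(f"{{{XSL_NS}}}import"):
        imp.set("href", target)
    # Step 4 (not shown): hand this text to your XSLT processor.
    return ET.tostring(root, encoding="unicode")


print(choose_import(STYLESHEET, "1"))
```

One caveat with this particular library: ElementTree only keeps namespace declarations that are used in element or attribute names, so a stylesheet that uses extra prefixes only inside XPath expressions may lose them on output. For such stylesheets a different XML library, or doing the rewrite in the host language of your XSLT processor, is probably safer.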
I know this post is old, but I want to share my approach.
Each display uses one named template, and a single stylesheet picks between them. The value of the display element will be changed by a VB application.
breakfast_menu.xml:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="conditionDisplay.xsl" ?>
<data>
<breakfast_menu>
<food>
<name>Belgian Waffles</name>
<price>$5.95</price>
<description>Two of our famous Belgian Waffles with plenty of real maple syrup</description>
<calories>650</calories>
</food>
<food>
<name>Strawberry Belgian Waffles</name>
<price>$7.95</price>
<description>Light Belgian waffles covered with strawberries and whipped cream</description>
<calories>900</calories>
</food>
<food>
<name>Homestyle Breakfast</name>
<price>$6.95</price>
<description>Two eggs, bacon or sausage, toast, and our ever-popular hash browns</description>
<calories>950</calories>
</food>
</breakfast_menu>
<display>1</display>
</data>
In the following file, I import both displays and use a condition to choose which template to call.
conditionDisplay.xsl
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >
<xsl:import href="display1.xsl"/>
<xsl:import href="display2.xsl"/>
<xsl:template match="/">
<xsl:variable name="display"><xsl:value-of select= "data/display"/></xsl:variable>
<xsl:choose>
<xsl:when test="$display='1'">
<xsl:call-template name="display1" />
</xsl:when>
<xsl:otherwise>
<xsl:call-template name="display2" />
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
display1.xsl:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template name="display1">
<html xsl:version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<body style="font-family:Arial;font-size:12pt;background-color:#EEEEEE">
<xsl:for-each select="data/breakfast_menu/food">
<div style="background-color:teal;color:white;padding:4px">
<span style="font-weight:bold"><xsl:value-of select="name"/> - </span>
<xsl:value-of select="price"/>
</div>
<div style="margin-left:20px;margin-bottom:1em;font-size:10pt">
<p>
<xsl:value-of select="description"/>
<span style="font-style:italic"> (<xsl:value-of select="calories"/> calories per serving)</span>
</p>
</div>
</xsl:for-each>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
display2.xsl:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template name="display2">
<html xsl:version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<body style="font-family:Arial;font-size:12pt;background-color:#222222">
<xsl:for-each select="data/breakfast_menu/food">
<div style="background-color:teal;color:white;padding:4px">
<span style="font-weight:bold"><xsl:value-of select="name"/> - </span>
<xsl:value-of select="price"/>
</div>
<div style="margin-left:20px;margin-bottom:1em;font-size:10pt">
<p>
<xsl:value-of select="description"/>
<span style="font-style:italic"> (<xsl:value-of select="calories"/> calories per serving)</span>
</p>
</div>
</xsl:for-each>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
I apologize for my English. I don't think this is the best solution, but I hope it helps someone.