All X elements below Y, but before another, descendant, Y - xslt

<div n="a">
. . .
. . .
<spec>red</spec>
<div n="d">
. . .
</div>
<spec>green</spec>
. . .
<div n="b">
. . .
<spec>blue</spec>
. . .
</div>
<div n="c">
<spec>yellow</spec>
</div>
. . .
. . .
. . .
</div>
[Edited to remove the ambiguity Sean noticed. -- Thanks]
When the current element is <div n="a">, I need an XPath expression that returns the red and green elements, but not the blue and yellow ones, as .//spec does.
When the current element is <div n="b">, the same expression needs to return the blue element; when <div n="c">, the yellow element.
Something like .//spec[but no deeper than another div if there is one]

In XSLT 1.0, assuming that the current node is a div:
.//spec[generate-id(current())=generate-id(ancestor::div[1])]
In XSLT 2.0 under the same assumptions:
.//spec[ancestor::div[1] is current()]
And a pure XPath 2.0 expression:
for $this in .
return
$this//spec[ancestor::div[1] is $this]
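All three expressions rest on the same test: keep a spec only if its nearest div ancestor is the current div. For readers who want to check that logic outside an XSLT processor, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The helper name own_specs and the trimmed sample are assumptions made for illustration. ElementTree has no ancestor axis, so the sketch walks downward and stops at nested div elements instead, which should select the same nodes.

```python
import xml.etree.ElementTree as ET

# Trimmed version of the question's sample (filler lines omitted).
SAMPLE = """
<div n="a">
  <spec>red</spec>
  <div n="d"/>
  <spec>green</spec>
  <div n="b"><spec>blue</spec></div>
  <div n="c"><spec>yellow</spec></div>
</div>
"""

def own_specs(div):
    """Return spec descendants of `div` whose nearest div ancestor is `div`."""
    found = []
    for child in div:
        if child.tag == "div":
            continue  # specs below a nested div belong to that div
        if child.tag == "spec":
            found.append(child)
        found.extend(own_specs(child))  # descend through non-div elements
    return found

root = ET.fromstring(SAMPLE)
for div in root.iter("div"):
    print(div.get("n"), [s.text for s in own_specs(div)])
```

With this sample, div a yields red and green, b yields blue, c yields yellow, and the empty div d yields nothing.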
Full XSLT 1.0 transformation:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="div">
<div n="{@n}"/>
<xsl:copy-of select=
".//spec[generate-id(current())=generate-id(ancestor::div[1])]"/>
==============
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="text()"/>
</xsl:stylesheet>
when applied on the provided XML document:
<div n="a">
. . .
. . .
<spec>red</spec>
<spec>green</spec>
. . .
<div n="b">
. . .
<spec>blue</spec>
. . .
</div>
<div n="c">
<spec>yellow</spec>
</div>
. . .
. . .
. . .
</div>
the wanted, correct result is produced:
<div n="a"/>
<spec>red</spec>
<spec>green</spec>
==============
<div n="b"/>
<spec>blue</spec>
==============
<div n="c"/>
<spec>yellow</spec>
==============
Full XSLT 2.0 transformation:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="div">
<div n="{@n}"/>
<xsl:sequence select=".//spec[ancestor::div[1] is current()]"/>
===================================
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="text()"/>
</xsl:stylesheet>
When applied on the same XML document (above), the same correct result is produced:
<div n="a"/>
<spec>red</spec>
<spec>green</spec>
===================================
<div n="b"/>
<spec>blue</spec>
===================================
<div n="c"/>
<spec>yellow</spec>
===================================
And using pure XPath 2.0 (no current()):
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="div">
<div n="{@n}"/>
<xsl:sequence select="
for $this in .
return
$this//spec[ancestor::div[1] is $this]"/>
===================================
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="text()"/>
</xsl:stylesheet>
produces the same correct result:
<div n="a"/>
<spec>red</spec>
<spec>green</spec>
===================================
<div n="b"/>
<spec>blue</spec>
===================================
<div n="c"/>
<spec>yellow</spec>
===================================

Assuming that you are using XSLT 1.0, and you want to select `spec' children, all children, then with your desired 'current' node as the XSLT focus node, set the following variable...
<xsl:variable name="divs" select="*//div" />
Now you can select all spec descendants which are not preceded by a div descendant with this XPath expression...
//spec[not((preceding::div|ancestor::div)[count(. | $divs) = count($divs)])]
Caveat
This should work, but I have not tested it. With this caution, I leave it as an exercise to the OP to test.
Note
If you really desperately want an XPath expression that does not require you to declare an additional variable, AND you happen to be lucky enough that you already hold the current node in a node-set (let's call it $ref), then you could use this rather inefficient XPath expression...
$ref//spec[not((preceding::div|ancestor::div)[count(. | $ref/*//div) =
count( $ref/*//div) ])]
Addendum
Here is a test case that I may refer to in the comment stream.
Test Case 1 Input:
<div n="a">
<spec>red</spec>
<div n="x"/>
<spec>green</spec>
<div n="b">
<spec>blue</spec>
</div>
<div n="c">
<spec>yellow</spec>
</div>
</div>
Test Case 1 Expected output:
Should be just red

Related

XSLT target specific string before closing tags

I am currently trying to add a new attribute to an element but the value needs to come from the data itself and I have no clue how to target it as the text value can happen in 2 different places.
My input XML is as following:
Case 1
<div>
<title/>
<p>This is an example where the string is being used in the text (0123-45-6789) and how a sentence can look like. (0123-45-6789)</p>
</div>
Case 2
<div>
<title>This is an example title. (0123-45)</title>
<p>This is an example sentence.</p>
</div>
Target
<div id="0123-45">
<title>This is an example title. (0123-45)</title>
<p>This is an example sentence.</p>
</div>
The string I need is the one between the brackets and it can consist of 2 digits, 4 digits, 6 digits or 10 digits. As the string can also be used in text, I can only target the ones that are directly before the closing tags </title> and </p>.
I already tried to use analyze-string with regex but ended up targeting all of the strings instead of the ones I need.
Is there any way this can be done in XSLT? Thanks in advance to point me in the right direction!
Kind regards
How about:
XSLT 2.0
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<!-- identity transform -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="div">
<div id="{replace(., '.*\((.*)\).*', '$1')}">
<xsl:apply-templates/>
</div>
</xsl:template>
</xsl:stylesheet>
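The replace() call relies on the leading greedy .* to push the capture group onto the last bracketed string in the string value of the div. That regex behaviour can be checked quickly in Python, used here only as a convenient illustration; the function name last_bracketed and the variable names are hypothetical. One difference to keep in mind: when the text contains no brackets, XPath replace() returns the input unchanged, while this sketch returns None.

```python
import re

def last_bracketed(text):
    """Return the content of the last (...) group in `text`, or None."""
    # The greedy leading .* consumes as much as it can, so the capture
    # group lands on the last opening bracket, as in the XSLT replace().
    match = re.fullmatch(r".*\((.*)\).*", text)
    return match.group(1) if match else None

# String values of the two sample divs once whitespace-only nodes are stripped.
case1 = ("This is an example where the string is being used in the text "
         "(0123-45-6789) and how a sentence can look like. (0123-45-6789)")
case2 = "This is an example title. (0123-45)This is an example sentence."

print(last_bracketed(case1))  # 0123-45-6789
print(last_bracketed(case2))  # 0123-45
```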

XSLT sequential processing

Within class xyz only, I want to examine exactly two divs and their classnames.
If classname="yes" then output '1'.
If classname="no" then output '0'.
<div class="xyz">
<div class="no"></div>
<div class="yes"></div>
</div>
Desired output: 0 1
<div class="xyz">
<div class="yes"></div>
<div class="yes"></div>
</div>
Desired output: 1 1
.. etc ..
Finding the first is easy but how do I do it "sequentially"?
Recursive processing can be used as in the XSLT-1.0 code below:
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="text"/>
<xsl:template match="div[@class='xyz']/div[@class='no']">
<xsl:text>0 </xsl:text>
</xsl:template>
<xsl:template match="div[@class='xyz']/div[@class='yes']">
<xsl:text>1 </xsl:text>
</xsl:template>
<xsl:template match="node()">
<xsl:apply-templates select="node()"/>
</xsl:template>
</xsl:transform>
The third template processes all nodes recursively, starting from the document node. The first two templates produce the desired output for @class values 'yes' and 'no'.
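As a rough cross-check of the expected output, the same mapping can be sketched in Python with the standard library's ElementTree. The names FLAGS and flags_for are hypothetical, and the sketch looks at every div with class xyz it finds, which is a simplification of the template matching above.

```python
import xml.etree.ElementTree as ET

FLAGS = {"yes": "1", "no": "0"}

def flags_for(xml_text):
    """Emit 1/0 for the yes/no child divs of each div with class 'xyz'."""
    root = ET.fromstring(xml_text)
    out = []
    for parent in root.iter("div"):          # document order, root included
        if parent.get("class") != "xyz":
            continue
        for child in parent.findall("div"):  # direct children only
            flag = FLAGS.get(child.get("class"))
            if flag is not None:
                out.append(flag)
    return " ".join(out)

print(flags_for('<div class="xyz"><div class="no"></div><div class="yes"></div></div>'))  # 0 1
```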

XSLT Order classes in a non-alphabetical non-numerical order

I have researched this problem but the suggestions that I have found seem to be rather convoluted and aimed at a more general scenario. Perhaps there is a more concise solution that is specific to this scenario.
I have a large number of html files like the following:
<html>
<head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type" />
<title>t</title>
</head>
<body>
<div class="a">
<div class="f">f1</div>
<div class="e">e1</div>
<div class="e">e2</div>
<div class="g">g</div>
<div class="c">c1</div>
<div class="b">
<div class="ba">ba</div>
<div class="bb">bb</div>
</div>
<div class="c">c2</div>
<div class="f">f2</div>
<div class="d">d</div>
<div class="c">c3</div>
</div>
...
</body>
</html>
Rule # 1
I want to order the divs inside div class="a" in a specific order of their class attribute that is non-alphabetic and non-numeric. For the purpose of this example, let the final order be the following:
g
f
b
c
e
d
In my real examples, the list is much longer.
Rule # 2
If for a given class attribute there is more than one node, then they should be left in the same order as in the original file, for instance:
c1
c2
c3
Please notice that in my real examples these values would not be in alphanumerical order.
Rule # 3
The order of child nodes must not be affected, for instance:
ba
bb
Please notice that in my real examples these values would not be in alphanumerical order either.
The final output should be like the following:
<html>
<head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type" />
<title>t</title>
</head>
<body>
<div class="a">
<div class="g">g</div>
<div class="f">f1</div>
<div class="f">f2</div>
<div class="b">
<div class="ba">ba</div>
<div class="bb">bb</div>
</div>
<div class="c">c1</div>
<div class="c">c2</div>
<div class="c">c3</div>
<div class="e">e1</div>
<div class="e">e2</div>
<div class="d">d</div>
</div>
...
</body>
</html>
I have thought at first to:
Prepend a number to the class attribute value, for instance rename class="g" to class="01g", etc
Order the classes in alphanumerical order
Remove the number, for instance rename class="01g" to class = "g", etc
However I dislike this solution because it requires too many transformations.
What I would really like is to come up with a more elegant solution. Perhaps I could define an ordered list of class values, and a clever index would somehow put the nodes in that defined order?
Do you have any suggestions to add to my xslt template?
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
AFAICT, you want to do something like:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" omit-xml-declaration="yes" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<!-- identity transform -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="div[@class='a']">
<xsl:variable name="sort-order">gfbced</xsl:variable>
<xsl:copy>
<xsl:apply-templates select="@*|node()">
<xsl:sort select="string-length(substring-before($sort-order, @class))" data-type="number" order="ascending"/>
</xsl:apply-templates>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
To accommodate class values that are not single characters, you can use:
<xsl:template match="div[@class='a']">
<xsl:variable name="sort-order">|g|f|b|c|e|d|</xsl:variable>
<xsl:copy>
<xsl:apply-templates select="@*|node()">
<xsl:sort select="string-length(substring-before($sort-order, concat('|', @class, '|')))" data-type="number" order="ascending"/>
</xsl:apply-templates>
</xsl:copy>
</xsl:template>
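The sort key in both variants is simply the position of the class value in the desired order, and the sort is stable, so nodes with the same class keep their document order. A minimal Python sketch of that idea follows, with the child divs reduced to (class, label) pairs; SORT_ORDER, rank and children are names made up for this illustration.

```python
SORT_ORDER = ["g", "f", "b", "c", "e", "d"]

def rank(class_name):
    """Position in SORT_ORDER. Unknown classes get 0, like the XSLT key,
    where substring-before() returns '' for a value that is not found."""
    return SORT_ORDER.index(class_name) if class_name in SORT_ORDER else 0

# (class, label) pairs in the document order of the question's sample.
children = [("f", "f1"), ("e", "e1"), ("e", "e2"), ("g", "g"), ("c", "c1"),
            ("b", "b"), ("c", "c2"), ("f", "f2"), ("d", "d"), ("c", "c3")]

# sorted() is stable, so equal ranks stay in their original order (Rule 2).
ordered = sorted(children, key=lambda item: rank(item[0]))
print([label for _, label in ordered])
# ['g', 'f1', 'f2', 'b', 'c1', 'c2', 'c3', 'e1', 'e2', 'd']
```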

Break up XML into parts using XSLT 2.0

I am given XML as input whose structure I have no control over. I need to break the XML up into parts and process each part separately. Below is a very simplified version of a file that I would process.
I am trying to use the grouping functionality of XSLT 2.0 to break up this XML by using the <breakEle> tag as the part boundaries. The <breakEle> can appear at any level too. Is what I'm trying to do even possible with XSLT 2.0? I have been successful in accomplishing this with XSLT 1.0 using Muenchian grouping but I want to get away from that if we can.
Sample input:
<item class="poem">
<div>
<div>
<p>paragraph 1</p>
<breakEle groupNum="1"/>
</div>
<div>
<p>Paragraph in another div.</p>
</div>
<breakEle groupNum="2"/>
<div>
<div>
<h4>header</h4>
<p>1st line</p>
<p>2nd line</p>
<br/>
<p>3rd line</p>
<p>4th line</p>
<page n="100"/>
<p>5th line</p>
</div>
<breakEle groupNum="3"/>
</div>
</div>
</item>
What I'm trying to work with:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:xd="http://www.oxygenxml.com/ns/doc/xsl"
exclude-result-prefixes="xs xd"
version="2.0">
<xsl:template match="/">
<newRoot>
<xsl:copy>
<xsl:for-each-group select="*" group-ending-with="breakEle">
<div num="{@groupNum}">
<xsl:copy-of select="current-group()"/>
</div>
</xsl:for-each-group>
</xsl:copy>
</newRoot>
</xsl:template>
</xsl:stylesheet>
Would like to end up with something like this:
<newRoot>
<div num="1">
<p>paragraph 1</p>
</div>
<div num="2">
<p>Paragraph in another div.</p>
</div>
<div num="3">
<h4>header</h4>
<p>1st line</p>
<p>2nd line</p>
<br/>
<p>3rd line</p>
<p>4th line</p>
<page n="100"/>
<p>5th line</p>
</div>
</newRoot>
The following stylesheet returns the expected result when applied to the given example.
It works under the assumption that each group should only contain leaf elements.
XSLT 2.0
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/item">
<newRoot>
<xsl:for-each-group select=".//*[not(*)]" group-ending-with="breakEle">
<div num="{current-group()[last()]/@groupNum}">
<xsl:copy-of select="current-group()[not(self::breakEle)]"/>
</div>
</xsl:for-each-group>
</newRoot>
</xsl:template>
</xsl:stylesheet>
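The grouping step can be pictured as a single pass over the leaf elements in document order, closing a group each time a breakEle is met. The Python sketch below illustrates that pass with the standard library's ElementTree. The function name split_on_breaks and the shortened sample are assumptions, and a trailing group with no closing breakEle is labelled None here, whereas the stylesheet would emit an empty num attribute.

```python
import xml.etree.ElementTree as ET

# Shortened version of the question's sample input.
SAMPLE = """<item class="poem"><div>
  <div><p>paragraph 1</p><breakEle groupNum="1"/></div>
  <div><p>Paragraph in another div.</p></div>
  <breakEle groupNum="2"/>
  <div><div><h4>header</h4><p>1st line</p><br/><page n="100"/></div>
    <breakEle groupNum="3"/></div>
</div></item>"""

def split_on_breaks(root):
    """Group leaf elements in document order, ending a group at each breakEle."""
    groups, current = [], []
    for el in root.iter():
        if len(el) > 0:
            continue  # has element children, so not a leaf
        if el.tag == "breakEle":
            groups.append((el.get("groupNum"), current))
            current = []
        else:
            current.append(el)
    if current:
        groups.append((None, current))  # leaves after the last breakEle
    return groups

for num, elements in split_on_breaks(ET.fromstring(SAMPLE)):
    print(num, [el.tag for el in elements])
```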

Modify "on the fly" the portal column content tag's class

I need to modify "on the fly" a class name in portal-column-content tag, this is the html code rendered:
<div id="portal-column-content" class="cell width-9 position-1:4">
I want to replace only "width-9" with "width-12".
Any advice?
Thanks
Vito
Since you ask for advice, here is some:
Do not use a css class to signify anything concrete, let it signify intent.
The concrete implementation of the intent comes in the css. For instance, do not create a class named width-9, rather create one named portal-column-content. You can then make portal-column-content be width:9px, width:12em or whatever.
Doing a string-replace like this is not really a thing you would do with xslt.
Even though you could. Depending on your setup there are other, better ways.
If you can't/won't follow any of the above advice, try
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="@class">
<xsl:attribute name="class">
<xsl:value-of select="substring-before(.,'width-9')"/>width-12<xsl:value-of select="substring-after(.,'width-9')"/>
</xsl:attribute>
</xsl:template>
</xsl:stylesheet>
This XSLT 1.0 transformation:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="div[@id='portal-column-content']/@class">
<xsl:attribute name="class">
<xsl:value-of select=
"substring-before(concat(.,'width-9'), 'width-9')"/>
<xsl:value-of select="'width-12'"/>
<xsl:value-of select="substring-after(., 'width-9')"/>
</xsl:attribute>
</xsl:template>
</xsl:stylesheet>
when applied on the following sample XML document:
<html>
<div id="a" class="a"/>
<div id="b" class="b"/>
<div id="c" class="cell width-9 position-1:4"/>
<div id="portal-column-content" class="cell width-9 position-1:4"/>
<div id="d" class="d"/>
<div id="e" class="cell width-9 position-1:4"/>
</html>
produces the wanted, correct result:
<html>
<div id="a" class="a"></div>
<div id="b" class="b"></div>
<div id="c" class="cell width-9 position-1:4"></div>
<div id="portal-column-content" class="cell width-12 position-1:4"></div>
<div id="d" class="d"></div>
<div id="e" class="cell width-9 position-1:4"></div>
</html>
Do note:
Only replaced is the 'width-9' substring of the class attribute of any div that has id attribute with string value 'portal-column-content'. Other div elements that have a different id attribute aren't affected.
The transformation does not discard the value of a class attribute that doesn't contain 'width-9' (in that case it appends 'width-12' to the existing value) -- compare with the other answer, whose XSLT solution in such a case completely replaces the string value of the class attribute with 'width-12'.
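The difference between the two answers comes down to what substring-before() returns when the search string is absent. A small Python sketch that emulates the two XPath functions makes this easy to try; all four function names are hypothetical helpers written for this illustration, not part of any library.

```python
def substring_before(s, sep):
    """Emulates XPath substring-before(): '' when sep does not occur in s."""
    head, found, _ = s.partition(sep)
    return head if found else ""

def substring_after(s, sep):
    """Emulates XPath substring-after(): '' when sep does not occur in s."""
    return s.partition(sep)[2]

def replace_plain(value):
    # First answer: substring-before(., 'width-9') directly.
    return (substring_before(value, "width-9") + "width-12"
            + substring_after(value, "width-9"))

def replace_guarded(value):
    # Second answer: the concat() sentinel keeps the prefix when there is no match.
    return (substring_before(value + "width-9", "width-9") + "width-12"
            + substring_after(value, "width-9"))

for value in ("cell width-9 position-1:4", "cell position-1:4"):
    print(repr(replace_plain(value)), repr(replace_guarded(value)))
# 'cell width-12 position-1:4' 'cell width-12 position-1:4'
# 'width-12' 'cell position-1:4width-12'
```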
Since you've tagged this with "diazo", the simplest solution is probably to use diazo rules. Just use a replace, before or after rule to copy content children of #portal-column-content to the theme children of a correctly classed #portal-column-content.
Use an "if" expression if it's to be done selectively.