Removing unwanted XML content using XSLT

I am using XSLT 1.0 and am trying to do this XML-to-XML transformation.
My input.xml:
<RootElement>
<CustomAttr name="GVillage"/>
.
.
<CustomAttr name="RVC">
<ValStart>
<RetailAttr name="RVC">
.
.
</RetailAttr>
</ValStart>
</CustomAttr>
<CustomAttr name="GTX">
<ValStart>
<RetailAttr name="GTX">
.
.
</RetailAttr>
</ValStart>
</CustomAttr>
.
.
.
<CustomAttr name=".......">
<ValStart>
<RetailAttr name=".......">
.
.
</RetailAttr>
</ValStart>
</CustomAttr>
<CustomAttr name="mode" value="DummyValue"/>
<CustomAttr name="affinity" value="SomeValue"/>
<CustomAttr name="names" value="SampleValue">
<list>
<value>nodevalue</value>
</list>
</CustomAttr>
</RootElement>
My expected output.xml :
<RootElement>
<CustomAttr name="GVillage"/>
.
.
<CustomAttr name="ShortCodes">
<ValStart>
<RetailAttr name="RVC">
.
.
</RetailAttr>
</ValStart>
<ValStart>
<RetailAttr name="GTX">
.
.
</RetailAttr>
</ValStart>
.
.
.
<ValStart>
<RetailAttr name=".......">
.
.
</RetailAttr>
</ValStart>
</CustomAttr>
<CustomAttr name="mode" value="DummyValue"/>
<CustomAttr name="affinity" value="SomeValue"/>
<CustomAttr name="names" value="SampleValue">
<list>
<value>nodevalue</value>
</list>
</CustomAttr>
</RootElement>
The input XML has the elements <CustomAttr name="RVC"> and <CustomAttr name="GTX">, followed by any number of further <CustomAttr name="......."> elements, which I have generalized with ".......". I need to combine all of these into a single <CustomAttr name="ShortCodes"> element that opens once at the beginning and closes once at the end (as shown in output.xml), instead of opening and closing for each <RetailAttr name="RVC">, <RetailAttr name="GTX">, and so on.
The XSLT I have currently is :
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="CustomAttr[@name='RVC']">
<xsl:copy>
<xsl:attribute name="name">ShortCodes</xsl:attribute>
<xsl:copy-of select="//CustomAttr/ValStart"/>
</xsl:copy>
</xsl:template>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
The output I am getting from this XSLT is xsltoutput.xml:
<?xml version="1.0"?>
<RootElement>
<CustomAttr name="GVillage"></CustomAttr>
.
.
<CustomAttr name="ShortCodes">
<ValStart>
<RetailAttr name="RVC">
.
.
</RetailAttr>
</ValStart>
<ValStart>
<RetailAttr name="GTX">
.
.
</RetailAttr>
</ValStart>
<ValStart>
<RetailAttr name=".......">
.
.
</RetailAttr>
</ValStart>
</CustomAttr>
<CustomAttr name="GTX">
<ValStart>
<RetailAttr name="GTX">
.
.
</RetailAttr>
</ValStart>
</CustomAttr>
.
.
.
<CustomAttr name=".......">
<ValStart>
<RetailAttr name=".......">
.
.
</RetailAttr>
</ValStart>
</CustomAttr>
<CustomAttr name="mode" value="DummyValue"/>
<CustomAttr name="affinity" value="SomeValue"/>
<CustomAttr name="names" value="SampleValue">
<list>
<value>nodevalue</value>
</list>
</CustomAttr>
</RootElement>
I am getting the transformation I need, but unwanted content is copied as well: everything after the </CustomAttr> that closes <CustomAttr name="ShortCodes"> in xsltoutput.xml. I do not understand how to remove it.
The unwanted content in xsltoutput.xml that I want to remove:
<CustomAttr name="GTX">
<ValStart>
<RetailAttr name="GTX">
.
.
</RetailAttr>
</ValStart>
</CustomAttr>
.
.
.
<CustomAttr name=".......">
<ValStart>
<RetailAttr name=".......">
.
.
</RetailAttr>
</ValStart>
</CustomAttr>
Since the input XML can contain any number of CustomAttr elements, I am looking for a generalized XSLT. Any help will be highly appreciated. Thank you!

I am not sure the identity transformation is a good starting point, but if you use it, it looks as if you need <xsl:template match="CustomAttr[preceding-sibling::CustomAttr[@name='RVC']]"/> to block the copying of those elements.
Based on your comment and edit, <xsl:template match="CustomAttr[ValStart and preceding-sibling::CustomAttr[@name='RVC']]"/> might do: it suppresses only those later siblings that have a ValStart child, so elements such as mode, affinity and names are still copied.
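Putting the pieces together, a complete stylesheet might look like the sketch below. It keeps the two templates from the question and adds the empty template suggested above. It assumes that there is exactly one <CustomAttr name="RVC"> and that it comes before all other elements to be merged. Using ../CustomAttr/ValStart instead of //CustomAttr/ValStart is an additional narrowing made here so that only sibling elements are collected.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

  <!-- Identity template: copies everything that no other template handles -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- The RVC element becomes the single ShortCodes wrapper and collects
       the ValStart children of all sibling CustomAttr elements -->
  <xsl:template match="CustomAttr[@name='RVC']">
    <xsl:copy>
      <xsl:attribute name="name">ShortCodes</xsl:attribute>
      <xsl:copy-of select="../CustomAttr/ValStart"/>
    </xsl:copy>
  </xsl:template>

  <!-- Later siblings with a ValStart child were already merged above,
       so they produce no output of their own -->
  <xsl:template match="CustomAttr[ValStart and preceding-sibling::CustomAttr[@name='RVC']]"/>
</xsl:stylesheet>
```

If the first element to be merged is not always named RVC, the two CustomAttr patterns would have to be keyed on something else, for example on the presence of a ValStart child.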

Related

Adding an attribute via xslt

I am working on an issue where I need to add an attribute to an element under certain conditions. Here is the XML that I have. When an AdditionalItem element has a non-empty Value element, I need to add an attribute called action as such:
<AdditionalItems>
<AdditionalItem>
<Keys>
<Key>Intake Source</Key>
</Keys>
<IdentifierDisplay>Intake Source</IdentifierDisplay>
<DataType>
<type>Enumeration</type>
<enumeration>
<String>311</String>
<String>NIS Inspector</String>
<String>Other CCD Agency</String>
</enumeration>
<inputRange>
<maxValue>0.0</maxValue>
</inputRange>
<Enumerations>
<Enumeration>
<Keys>
<Key>311</Key>
</Keys>
<IdentifierDisplay>311</IdentifierDisplay>
</Enumeration>
<Enumeration>
<Keys>
<Key>NIS Inspector</Key>
</Keys>
<IdentifierDisplay>NIS Inspector</IdentifierDisplay>
</Enumeration>
<Enumeration>
<Keys>
<Key>Other CCD Agency</Key>
</Keys>
<IdentifierDisplay>Other CCD Agency</IdentifierDisplay>
</Enumeration>
</Enumerations>
<inputRequired>false</inputRequired>
<fieldType>Enumeration</fieldType>
</DataType>
<Name>Intake Source</Name>
<Value>311</Value>
<security>F</security>
<drillDown>false</drillDown>
</AdditionalItem>
<AdditionalItem>
<Keys>
<Key>Other CCD Agency</Key>
</Keys>
<IdentifierDisplay>Other CCD Agency</IdentifierDisplay>
<DataType>
<type>String</type>
<inputRange>
<maxValue>0.0</maxValue>
</inputRange>
<inputRequired>false</inputRequired>
<fieldType>Text</fieldType>
</DataType>
<Name>Other CCD Agency</Name>
<Value/>
<security>F</security>
<drillDown>false</drillDown>
</AdditionalItem>
<AdditionalItem>
<Keys>
<Key>311 Agent</Key>
</Keys>
<IdentifierDisplay>311 Agent</IdentifierDisplay>
<DataType>
<type>String</type>
<inputRange>
<maxValue>0.0</maxValue>
</inputRange>
<inputRequired>false</inputRequired>
<fieldType>Text</fieldType>
</DataType>
<Name>311 Agent</Name>
<Value/>
<security>F</security>
<drillDown>false</drillDown>
</AdditionalItem>
<AdditionalItem>
<Keys>
<Key>Case Number</Key>
</Keys>
<IdentifierDisplay>Case Number</IdentifierDisplay>
<DataType>
<type>String</type>
<inputRange>
<maxValue>0.0</maxValue>
</inputRange>
<inputRequired>false</inputRequired>
<fieldType>Text</fieldType>
</DataType>
<Name>Case Number</Name>
<Value/>
<security>F</security>
<drillDown>false</drillDown>
</AdditionalItem>
<AdditionalItem>
<Keys>
<Key>Case Created Date</Key>
</Keys>
<IdentifierDisplay>Case Created Date</IdentifierDisplay>
<DataType>
<type>Date</type>
<inputRange>
<maxValue>0.0</maxValue>
</inputRange>
<inputRequired>false</inputRequired>
<fieldType>Date</fieldType>
</DataType>
<Name>Case Created Date</Name>
<Value/>
<security>F</security>
<drillDown>false</drillDown>
</AdditionalItem>
<AdditionalItem>
<Keys>
<Key>Complaintant Name:</Key>
</Keys>
<IdentifierDisplay>Complaintant Name:</IdentifierDisplay>
<DataType>
<type>String</type>
<inputRange>
<maxValue>0.0</maxValue>
</inputRange>
<inputRequired>false</inputRequired>
<fieldType>Text</fieldType>
</DataType>
<Name>Complaintant Name:</Name>
<Value>Fred Fredderson</Value>
<security>F</security>
<drillDown>false</drillDown>
</AdditionalItem>
<AdditionalItem>
<Keys>
<Key>Phone Number:</Key>
</Keys>
<IdentifierDisplay>Phone Number:</IdentifierDisplay>
<DataType>
<type>String</type>
<inputRange>
<maxValue>0.0</maxValue>
</inputRange>
<inputRequired>false</inputRequired>
<fieldType>Text</fieldType>
</DataType>
<Name>Phone Number:</Name>
<Value>3033333333</Value>
<security>F</security>
<drillDown>false</drillDown>
</AdditionalItem>
<AdditionalItem>
<Keys>
<Key>Email</Key>
</Keys>
<IdentifierDisplay>Email</IdentifierDisplay>
<DataType>
<type>String</type>
<inputRange>
<maxValue>0.0</maxValue>
</inputRange>
<inputRequired>false</inputRequired>
<fieldType>Text</fieldType>
</DataType>
<Name>Email</Name>
<Value>1@2.com</Value>
<security>F</security>
<drillDown>false</drillDown>
</AdditionalItem>
<AdditionalItem>
<Keys>
<Key>Council District:</Key>
</Keys>
<IdentifierDisplay>Council District:</IdentifierDisplay>
<DataType>
<type>String</type>
<inputRange>
<maxValue>0.0</maxValue>
</inputRange>
<inputRequired>false</inputRequired>
<fieldType>Text</fieldType>
</DataType>
<Name>Council District:</Name>
<Value/>
<security>F</security>
<drillDown>false</drillDown>
</AdditionalItem>
<AdditionalItem>
<Keys>
<Key>Inspector Distict:</Key>
</Keys>
<IdentifierDisplay>Inspector Distict:</IdentifierDisplay>
<DataType>
<type>String</type>
<inputRange>
<maxValue>0.0</maxValue>
</inputRange>
<inputRequired>false</inputRequired>
<fieldType>Text</fieldType>
</DataType>
<Name>Inspector Distict:</Name>
<Value/>
<security>F</security>
<drillDown>false</drillDown>
</AdditionalItem>
<AdditionalItem>
<Keys>
<Key>Permit Number</Key>
</Keys>
<IdentifierDisplay>Permit Number</IdentifierDisplay>
<DataType>
<type>String</type>
<inputRange>
<maxValue>0.0</maxValue>
</inputRange>
<inputRequired>false</inputRequired>
<fieldType>Text</fieldType>
</DataType>
<Name>Permit Number</Name>
<Value/>
<security>F</security>
<drillDown>false</drillDown>
</AdditionalItem>
</AdditionalItems>
My first thought was to do a for-each on //AdditionalItem, then check to see if the length of the Value element was > 0. If so, add the action attribute. Does that seem like a reasonable approach? Something similar to this:
<xsl:for-each select="/ns2:UpdateCAP/ns2:AdditionalInformation//AdditionalItem">
<xsl:if test="string-length(Value) > 0">
<!-- somehow add the attribute -->
</xsl:if>
</xsl:for-each>
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:template match="node() | @*">
<xsl:copy>
<xsl:apply-templates select="node() | @*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="AdditionalItem[Value[text()]]">
<xsl:copy>
<xsl:attribute name="action">Add</xsl:attribute>
<xsl:apply-templates select="node() | @*"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
XSLT cannot modify an existing XML document in place; it builds a new result document. So we copy every node and add the attribute where it is needed.
The first template (the identity template) copies all nodes and attributes. The second template copies each AdditionalItem element that contains a non-empty Value child and adds the attribute.
AdditionalItem matches any element named AdditionalItem.
AdditionalItem[Value] matches an AdditionalItem element that has a Value child, whether or not that child is empty.
AdditionalItem[Value[text()]] matches an AdditionalItem element whose Value child has at least one text node, i.e. is non-empty.
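One caveat: Value[text()] is also true when Value contains nothing but whitespace. If such values should count as empty, a variant along the following lines may be closer to the string-length test from the question. Treating whitespace-only values as empty is an assumption here, not something the question states.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>

  <!-- Identity template: copy everything unchanged by default -->
  <xsl:template match="node() | @*">
    <xsl:copy>
      <xsl:apply-templates select="node() | @*"/>
    </xsl:copy>
  </xsl:template>

  <!-- normalize-space() trims and collapses whitespace, so the comparison
       is false for a missing, empty or whitespace-only Value -->
  <xsl:template match="AdditionalItem[normalize-space(Value) != '']">
    <xsl:copy>
      <xsl:attribute name="action">Add</xsl:attribute>
      <xsl:apply-templates select="node() | @*"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```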

Transform nodes with HXT using the number of <section> ancestor nodes

I'm looking to replace all title elements with h1, h2, ... , h6 elements depending on how many ancestors are section elements. Example input/output:
Input.xml
<document>
<section>
<title>Title A</title>
<section>
<title>Title B</title>
</section>
<section>
<title>Title C</title>
<section>
<title>Title D</title>
</section>
</section>
</section>
</document>
Output.xml
<document>
<section>
<h1>Title A</h1>
<section>
<h2>Title B</h2>
</section>
<section>
<h2>Title C</h2>
<section>
<h3>Title D</h3>
</section>
</section>
</section>
</document>
I can replace all titles with h1s using something like this
swapTitles :: ArrowXml a => a XmlTree XmlTree
swapTitles = processTopDown $
(changeQName . const $ mkName "h1")
`when`
(isElem >>> (hasQName $ mkName "title"))
I believe I should be using ArrowState, but I've not been able to figure out how. Can someone point me in the right direction?
Using XSL with package hxt-xslt. Standards make life easier :-)
{-# LANGUAGE Arrows, PackageImports #-}
import System.Environment ( getArgs )
import System.Exit (exitSuccess, exitWith, ExitCode(..))
import Control.Arrow
import "hxt" Text.XML.HXT.Core
import "hxt" Text.XML.HXT.DOM.XmlKeywords
import "hxt-xslt" Text.XML.HXT.XSLT.XsltArrows
import "hxt" Text.XML.HXT.Arrow.XmlState.TraceHandling (withTraceLevel)

process :: String -> String -> IO [String]
process xslStylesheetPath xmlDocPath = do
    -- compile stylesheet
    compiledStyleSheetResults <- runX $
        arr (const xslStylesheetPath)
        >>> readXSLTDoc [ withValidate yes, withInputEncoding utf8] -- withTrace 2
        >>> {- withTraceLevel 2 -} xsltCompileStylesheet
    case compiledStyleSheetResults of
        [] -> return ["error compiling " ++ xslStylesheetPath]
        compiledStyleSheet : _ -> do
            -- apply compiled stylesheet to xml doc
            runX $ arr (const xmlDocPath)
                >>> readXSLTDoc [ withValidate yes, withInputEncoding utf8] -- withTrace 2
                >>> xsltApplyStylesheet compiledStyleSheet
                >>> writeDocumentToString [withOutputEncoding utf8,
                                           withXmlPi yes, withIndent yes]

-- readXSLTDoc from internals of module Text.XML.HXT.XSLT.XsltArrows
readXSLTDoc :: SysConfigList -> IOSArrow String XmlTree
readXSLTDoc options
    = readFromDocument (options ++ defaultOptions)
  where
    defaultOptions
        = [ withCheckNamespaces yes
          , withValidate no
          , withPreserveComment no
          ]

main = do
    args <- getArgs
    case args of
        [arg1, arg2] -> do
            results <- process arg1 arg2
            case results of
                [] -> putStrLn "errors"
                result : _ -> putStrLn result
            exitSuccess
        _ -> do
            putStrLn "missing parameters: xslStylesheetPath xmlDocPath"
            exitWith $ ExitFailure 1
with XSL file "mystyle.xsl"
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/">
<xsl:for-each select="document">
<xsl:copy>
<xsl:call-template name="myloop">
<xsl:with-param name="nesting" select="0"/>
</xsl:call-template>
</xsl:copy>
</xsl:for-each>
</xsl:template>
<xsl:template name="myloop">
<xsl:param name="nesting"/>
<xsl:if test="title">
<xsl:element name="{concat('h',string($nesting))}">
<xsl:value-of select="title" />
</xsl:element>
</xsl:if>
<xsl:for-each select="section">
<xsl:copy>
<xsl:call-template name="myloop">
<xsl:with-param name="nesting" select="$nesting+1"/>
</xsl:call-template>
</xsl:copy>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
with "yourdata.xml"
<?xml version="1.0" encoding="UTF-8"?>
<document>
<section>
<title>Title A</title>
<section>
<title>Title B</title>
</section>
<section>
<title>Title C</title>
<section>
<title>Title D</title>
</section>
</section>
</section>
</document>
running
runhaskell test.hs mystyle.xsl yourdata.xml
result:
<?xml version="1.0" encoding="UTF-8"?>
<document>
<section>
<h1>Title A</h1>
<section>
<h2>Title B</h2>
</section>
<section>
<h2>Title C</h2>
<section>
<h3>Title D</h3>
</section>
</section>
</section>
</document>
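A shorter stylesheet that computes the heading level directly from the number of section ancestors, as the question title suggests, could look like the sketch below. It has not been run through hxt-xslt, so whether that package supports every construct used here is an assumption. Unlike the stylesheet above, it copies all other content unchanged.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity template: copy everything unchanged by default -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Rename a title to h1, h2, ... according to how many section
       elements enclose it; the name is built by an attribute value template -->
  <xsl:template match="section/title">
    <xsl:element name="h{count(ancestor::section)}">
      <xsl:apply-templates select="@*|node()"/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
```

Nothing in this sketch caps the level at 6, so a title nested more than six sections deep would become h7 and so on; a title that is not a child of a section is left as it is.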

How to convert XML to CSV using XSLT without knowing the node name

I would like to parse and convert an XML file to CSV format using XSLT.
The XML format looks like this:
<a:level1>
<a:level2>
<b:level3>
<b:date a:value="TODAY">
<c:level4>
<d:level5>
<d:level6 a:value="AAA">
<d:level7 a:value="AAA_AAA">
<d:level8 a:value="XXX/123">
<d:leaf a:value="150415">
<b:leaf1>100</b:leaf1>
<b:leaf2>100</b:leaf2>
</d:leaf>
<d:leaf a:value="200814">
<b:leaf1>1961</b:leaf1>
<b:leaf2>1961</b:leaf2>
</d:leaf>
</d:level8>
</d:level7>
</d:level6>
<d:level6 a:value="BBB">
<d:level7 a:value="BBB_BBB">
<d:level8 a:value="XXX/123">
<d:leaf a:value="1505">
<b:leaf1>0.42</b:leaf1>
<b:leaf2>0.42</b:leaf2>
</d:leaf>
</d:level8>
</d:level7>
</d:level6>
</d:level5>
</c:level4>
</b:date>
</b:level3>
</a:level2>
</a:level1>
The objective is to extract only nodes with values and use the node name as the header. The output CSV file will be like:
date, level6, level7, level8, leaf, leaf1, leaf2
TODAY, AAA , AAA_AAA, XXX/123, 150415, 100,100
TODAY, AAA , AAA_AAA, XXX/123, 200814, 1961,1961
TODAY, BBB , BBB_BBB, XXX/123, 1505, 0.42,0.42
I am a newbie to XSLT, so do you have any samples showing how to construct the header and the rows of the CSV? The names of the level6, level7, level8 and leaf nodes may change between files.
This transformation:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:a="some:a">
<xsl:output method="text"/>
<xsl:variable name="vFirstLeaf" select=
"(//*[@a:value][not(descendant::*[@a:value])])[1]"/>
<xsl:variable name="vAllLevels" select=
"$vFirstLeaf/ancestor-or-self::*[@a:value]
|
$vFirstLeaf/*
"/>
<xsl:template match="/">
<xsl:apply-templates select="$vAllLevels" mode="title"/>
<xsl:text>&#xA;</xsl:text>
<xsl:apply-templates mode="lastNormal" select=
"//*[@a:value and not(descendant::*[@a:value])]"/>
</xsl:template>
<xsl:template match="*" mode="title">
<xsl:if test="not(position()=1)">, </xsl:if>
<xsl:value-of select="local-name()"/>
</xsl:template>
<xsl:template match="*" mode="lastNormal">
<xsl:apply-templates mode="value"
select="ancestor-or-self::*[@a:value]"/>
<xsl:apply-templates select="*"/>
<xsl:text>&#xA;</xsl:text>
</xsl:template>
<xsl:template match="*" mode="value">
<xsl:if test="not(position()=1)">, </xsl:if>
<xsl:value-of select="@a:value"/>
</xsl:template>
<xsl:template match="*">
<xsl:text>, </xsl:text>
<xsl:value-of select="."/>
</xsl:template>
</xsl:stylesheet>
when applied on the provided XML document:
<a:level1 xmlns:a="some:a" xmlns:b="some:b"
xmlns:c="some:c" xmlns:d="some:d">
<a:level2>
<b:level3>
<b:date a:value="TODAY">
<c:level4>
<d:level5>
<d:level6 a:value="AAA">
<d:level7 a:value="AAA_AAA">
<d:level8 a:value="XXX/123">
<d:leaf a:value="150415">
<b:leaf1>100</b:leaf1>
<b:leaf2>100</b:leaf2>
</d:leaf>
<d:leaf a:value="200814">
<b:leaf1>1961</b:leaf1>
<b:leaf2>1961</b:leaf2>
</d:leaf>
</d:level8>
</d:level7>
</d:level6>
<d:level6 a:value="BBB">
<d:level7 a:value="BBB_BBB">
<d:level8 a:value="XXX/123">
<d:leaf a:value="1505">
<b:leaf1>0.42</b:leaf1>
<b:leaf2>0.42</b:leaf2>
</d:leaf>
</d:level8>
</d:level7>
</d:level6>
</d:level5>
</c:level4>
</b:date>
</b:level3>
</a:level2>
</a:level1>
produces the wanted, correct result:
date, level6, level7, level8, leaf, leaf1, leaf2
TODAY, AAA, AAA_AAA, XXX/123, 150415, 100, 100
TODAY, AAA, AAA_AAA, XXX/123, 200814, 1961, 1961
TODAY, BBB, BBB_BBB, XXX/123, 1505, 0.42, 0.42

All X elements below Y, but before another, descendent, Y

<div n="a">
. . .
. . .
<spec>red</spec>
<div n="d">
. . .
</div>
<spec>green</spec>
. . .
<div n="b">
. . .
<spec>blue</spec>
. . .
</div>
<div n="c">
<spec>yellow</spec>
</div>
. . .
. . .
. . .
</div>
[Edited to remove the ambiguity Sean noticed. -- Thanks]
When the current element is <div n="a">, I need an XPATH expression that returns the red and green elements, but not the blue and yellow ones, as .//spec does.
When the current element is <div n="b">, the same expression needs to return the blue element; when <div n="c">, the yellow element.
Something like .//spec[but no deeper than another div if there is one]
In XSLT 1.0, assuming that the current node is a div:
.//spec[generate-id(current())=generate-id(ancestor::div[1])]
In XSLT 2.0 under the same assumptions:
.//spec[ancestor::div[1] is current()]
And a pure XPath 2.0 expression:
for $this in .
return
$this//spec[ancestor::div[1] is $this]
Full XSLT 1.0 transformation:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="div">
<div n="{@n}"/>
<xsl:copy-of select=
".//spec[generate-id(current())=generate-id(ancestor::div[1])]"/>
==============
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="text()"/>
</xsl:stylesheet>
when applied on the provided XML document:
<div n="a">
. . .
. . .
<spec>red</spec>
<spec>green</spec>
. . .
<div n="b">
. . .
<spec>blue</spec>
. . .
</div>
<div n="c">
<spec>yellow</spec>
</div>
. . .
. . .
. . .
</div>
the wanted, correct result is produced:
<div n="a"/>
<spec>red</spec>
<spec>green</spec>
==============
<div n="b"/>
<spec>blue</spec>
==============
<div n="c"/>
<spec>yellow</spec>
==============
Full XSLT 2.0 transformation:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="div">
<div n="{@n}"/>
<xsl:sequence select=".//spec[ancestor::div[1] is current()]"/>
===================================
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="text()"/>
</xsl:stylesheet>
When applied on the same XML document (above), the same correct result is produced:
<div n="a"/>
<spec>red</spec>
<spec>green</spec>
===================================
<div n="b"/>
<spec>blue</spec>
===================================
<div n="c"/>
<spec>yellow</spec>
===================================
And using pure XPath 2.0 (no current()):
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="div">
<div n="{@n}"/>
<xsl:sequence select="
for $this in .
return
$this//spec[ancestor::div[1] is $this]"/>
===================================
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="text()"/>
</xsl:stylesheet>
produces the same correct result:
<div n="a"/>
<spec>red</spec>
<spec>green</spec>
===================================
<div n="b"/>
<spec>blue</spec>
===================================
<div n="c"/>
<spec>yellow</spec>
===================================
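In XPath 1.0, node identity can also be tested with count() over a union instead of generate-id(): the union of two single-node sets contains exactly one node only when both are the same node. A sketch of the XSLT 1.0 transformation written that way, without the separator line:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <xsl:template match="div">
    <div n="{@n}"/>
    <!-- Keep a spec only if its nearest div ancestor is the current div:
         ancestor::div[1] and current() are then one and the same node,
         so their union has exactly one member -->
    <xsl:copy-of select=".//spec[count(ancestor::div[1] | current()) = 1]"/>
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Suppress the text nodes between the elements -->
  <xsl:template match="text()"/>
</xsl:stylesheet>
```

This relies on ancestor::div[1] selecting the nearest div, which holds because ancestor is a reverse axis.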
Assuming that you are using XSLT 1.0 and want to select spec descendants at any depth, then with your desired 'current' node as the XSLT focus node, set the following variable...
<xsl:variable name="divs" select="*//div" />
Now you can select all spec descendants which are not preceded by a div descendant with this XPath expression...
//spec[not((preceding::div|ancestor::div)[count(. | $divs) = count($divs)])]
Caveat
This should work, but I have not tested it. With this caution, I leave it as an exercise to the OP to test.
Note
If you really desperately want an XPath expression that does not require you to declare an additional variable, AND you are lucky enough to already hold the current node in a node-set (let's call it $ref), then you could use this rather inefficient XPath expression...
$ref//spec[not((preceding::div|ancestor::div)[count(. | $ref/*//div) =
count( $ref/*//div) ])]
Addendum
Here is a test case that I may refer to in the comment stream.
Test Case 1 Input:
<div n="a">
<spec>red</spec>
<div n="x"/>
<spec>green</spec>
<div n="b">
<spec>blue</spec>
</div>
<div n="c">
<spec>yellow</spec>
</div>
</div>
Test Case 1 Expected output:
Should be just red

Counting distinct items in XSLT independent of depth

If I run the following XSLT code:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:key name="kValueByVal" match="variable_name/@value"
use="."/>
<xsl:template match="assessment">
<xsl:for-each select="
/*/*/variable/attributes/variable_name/@value
[generate-id()
=
generate-id(key('kValueByVal', .)[1])
]
">
<xsl:value-of select=
"concat(., ' ', count(key('kValueByVal', .)), '&#xA;')"/>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
on the following XML:
<assessment>
<variables>
<variable>
<attributes>
<variable_name value="FRED"/>
</attributes>
</variable>
</variables>
<variables>
<variable>
<attributes>
<variable_name value="MORTIMER"/>
</attributes>
</variable>
</variables>
<variables>
<variable>
<attributes>
<variable_name value="FRED"/>
</attributes>
</variable>
</variables>
</assessment>
I get the desired output:
FRED 2
MORTIMER 1
(See my original question for more info, if you wish.)
However, if I run it on this input:
<ExamStore>
<assessment>
<variables>
<variable>
<attributes>
<variable_name value="FRED"/>
</attributes>
</variable>
</variables>
<variables>
<variable>
<attributes>
<variable_name value="MORTIMER"/>
</attributes>
</variable>
</variables>
<variables>
<variable>
<attributes>
<variable_name value="FRED"/>
</attributes>
</variable>
</variables>
</assessment>
</ExamStore>
I get nothing. (Note that I just wrapped the original input in an ExamStore tag.) I was expecting and hoping to get the same output.
Why don't I? How can I change the original XSLT code to get the same output?
Well, your select XPath /*/*/variable/attributes/variable_name/... is no longer correct, because you added another element higher in the node tree.
If you want true independence, you need to use something like:
//variable/attributes/variable_name/...
(note the double slash at the start), but this is fairly dangerous because it will catch all occurrences of that structure, so be really sure that's what you mean.
Otherwise, just prepend your XPath with another /*
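As a sketch, the depth-independent variant could look like this. Matching the document root instead of assessment is a choice made here so that the template does not depend on where assessment sits. The sketch also assumes that every variable_name element lies under variable/attributes, because the key indexes all variable_name/@value attributes in the document.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Index every value attribute of a variable_name element by its value -->
  <xsl:key name="kValueByVal" match="variable_name/@value" use="."/>

  <xsl:template match="/">
    <!-- Visit only the first attribute of each distinct value
         (Muenchian grouping), at whatever depth it occurs -->
    <xsl:for-each select="
        //variable/attributes/variable_name/@value
          [generate-id() = generate-id(key('kValueByVal', .)[1])]">
      <!-- One line per distinct value: the value and its number of occurrences -->
      <xsl:value-of select="concat(., ' ', count(key('kValueByVal', .)), '&#xA;')"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Because the counts come from the key, they cover the whole document; with several assessment elements the result is one combined list, not one list per assessment.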
When you introduced yet another level in the XML document, this broke the absolute XPath expression used in the original solution (tailored exactly to your original XML file).
Therefore, in order to make the XPath expression work in the new situation, just do the following:
Replace:
/*/*/variable/attributes/variable_name/@value
with
/*/*/*/variable/attributes/variable_name/@value
and now you again get the wanted neat result:
FRED 2
MORTIMER 1
I would never give you an "independent" solution, because you haven't provided any properties/guarantees/constraints about the set of possible XML documents on which you want to apply the transformation.
In your original question you used:
.//variables/variable/attributes/variable_name
off assessment,
and this is why I used the absolute XPath expression in my solution. There was no guarantee that another XML document wouldn't contain variable_name elements whose chain of ancestors is not variables/variable/attributes. If that were the case, you would probably not be interested in the values of such "irregular" variable_name elements.
The lesson is that one should not be too specific in defining a question and then want general solutions. :)
For real structure independence you should use:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:key name="kValueByVal" match="variable_name/@value"
use="."/>
<xsl:template match="variable_name[@value
[generate-id()
=
generate-id(key('kValueByVal', .)[1])
]]">
<xsl:value-of select=
"concat(@value, ' ', count(key('kValueByVal', @value)), '&#xA;')"/>
</xsl:template>
</xsl:stylesheet>
Result with first input:
FRED 2
MORTIMER 1
Result with second input:
FRED 2
MORTIMER 1
Note: Never use // as the first XPath operator.