sorting elements by attribute on a simple nested structure - xslt

I'm still struggling to get my head around a lot of XSLT but have a specific question.
I have a simple nested structure that I want to sort by an attribute (name).
The file has a single root node and then a series of nested nodes. I need to have all the nodes under root sorted within the level they are. The hierarchy is nested to an unspecified level.
The input:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<node name="A">
<node name="C"/>
<node name="B"/>
</node>
<node name="F"/>
<node name="E"/>
</root>
Needs to be transformed into:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<node name="A">
<node name="B"/>
<node name="C"/>
</node>
<node name="E"/>
<node name="F"/>
</root>
I won't bore you with my feeble attempts at solving this.

Assuming you do wish the elements to stay within the level they are currently in, firstly, you would need a template to match any element
<xsl:template match="*">
Then you would use xsl:copy to copy the element, and xsl:copy-of to copy any attributes
<xsl:copy>
<xsl:copy-of select="@*"/>
... more code here...
</xsl:copy>
And within the xsl:copy you would then use xsl:apply-templates to process the child elements, along with xsl:sort to select the order
<xsl:apply-templates select="*">
<xsl:sort select="@name" />
</xsl:apply-templates>
Putting this all together gives you this
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes" />
<xsl:template match="*">
<xsl:copy>
<xsl:copy-of select="@*"/>
<xsl:apply-templates select="*">
<xsl:sort select="@name" />
</xsl:apply-templates>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
When applied to your input XML the following is output
<root>
<node name="A">
<node name="B"/>
<node name="C"/>
</node>
<node name="E"/>
<node name="F"/>
</root>

This answer is similar to Tim C's, but it just uses an identity transform with an xsl:sort. This way you don't lose comments or processing instructions if they're present.
XSLT 1.0
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()">
<xsl:sort select="@name"/>
</xsl:apply-templates>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>

Related

Creating taxonomic structure from individual records

I am having a failure of imagination in how to effectively solve this problem. My actual data set has thousands of thousands of records. Each record indicates its location in a taxonomic structure. I need to create that taxonomic structure and place the records within that structure (e.g., records that indicate they go in "/a/b/c" end up in the "/a/b/c" but there is only one each of the taxonomic levels "a", "b", and "c"). Due to client confidentiality, I've posted a naive representation of this here.
Using XSLT 3 is desirable. I know there is a solution to this using xsl:iterate but I cannot figure it out.
Input:
<outer>
<record>
<id>rec1</id>
<taxNodes>
<node>
<id>1</id>
<note>First level</note>
<node>
<id>node2a</id>
<note>Second level Entry A</note>
</node>
</node>
</taxNodes>
</record>
<record>
<id>rec3</id>
<taxNodes>
<node>
<id>1</id>
<note>First level</note>
<node>
<id>node2b</id>
<note>Second level Entry B</note>
</node>
</node>
</taxNodes>
</record>
<record>
<id>rec4</id>
<taxNodes>
<node>
<id>1</id>
<note>First level</note>
<node>
<id>node2b</id>
<note>Second level Entry B</note>
</node>
</node>
</taxNodes>
</record>
</outer>
Desired Output:
<outer>
<node>
<id>1</id>
<note>First level</note>
<node>
<id>node2a</id>
<note>Second level Entry A</note>
<records>
<record>
<id>rec1</id>
</record>
</records>
</node>
<node>
<id>node2b</id>
<note>Second level Entry B</note>
<records>
<record>
<id>rec3</id>
</record>
<record>
<id>rec4</id>
</record>
</records>
</node>
</node>
</outer>
I think this can be seen as a grouping problem and then solved using a recursive function using xsl:for-each-group:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:math="http://www.w3.org/2005/xpath-functions/math"
xmlns:mf="http://example.com/mf"
exclude-result-prefixes="xs math mf"
version="3.0">
<xsl:output indent="yes"/>
<xsl:function name="mf:group" as="node()*">
<xsl:param name="input-nodes" as="element(node)*"/>
<xsl:for-each-group select="$input-nodes" group-by="id">
<xsl:copy>
<xsl:copy-of select="id, note"/>
<xsl:choose>
<xsl:when test="current-group()/node">
<xsl:sequence select="mf:group(current-group()/node)"/>
</xsl:when>
<xsl:otherwise>
<records>
<xsl:apply-templates select="current-group()/ancestor::record"/>
</records>
</xsl:otherwise>
</xsl:choose>
</xsl:copy>
</xsl:for-each-group>
</xsl:function>
<xsl:template match="outer">
<xsl:copy>
<xsl:sequence select="mf:group(record/taxNodes/node)"/>
</xsl:copy>
</xsl:template>
<xsl:template match="record">
<xsl:copy>
<xsl:copy-of select="id"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
I think that gives the desired result for the input sample you have posted. I am not sure it will work for other inputs, mainly because I don't know how variable the input can be. If I understand the requirement 'there is only one each of the taxonomic levels "a", "b", and "c"' correctly, then it should work fine.
As for having a huge input file and using XSLT 3.0 (with streaming?), I am not sure that a streaming solution is possible, due to the nature of the problem where we need to recursively group the whole set of input nodes.
This stylesheet provides a solution but does so using a function for looping.
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:local="http://www.local.com"
exclude-result-prefixes="xs local"
version="2.0">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/">
<xsl:variable name="maxTaxonomyDepth" select="max(//node[not(node)]/count(ancestor-or-self::node))" as="xs:integer"/>
<xsl:sequence select="local:getTaxNodesForDepth(outer, 1, $maxTaxonomyDepth, '')" />
</xsl:template>
<xsl:function name="local:getTaxNodesForDepth">
<xsl:param name="out" as="element()" />
<xsl:param name="curDepth" as="xs:integer" />
<xsl:param name="maxDepth" as="xs:integer" />
<xsl:param name="parentId" as="xs:string*" />
<xsl:for-each select="distinct-values($out/record/taxNodes//node[count(ancestor-or-self::node) = $curDepth]
[if ($curDepth > 1) then parent::node/id/normalize-space(.) = $parentId else true()]
/id/normalize-space(.))">
<xsl:variable name="context" select="." as="xs:string" />
<node>
<xsl:sequence select="($out/record/taxNodes//node[count(ancestor-or-self::node) = $curDepth][id/normalize-space(.) = $context])[1]/(id | descriptor)" />
<xsl:apply-templates select="$out/record[taxNodes/descendant::node[last()][id/normalize-space(.) = $context]]" />
<xsl:choose>
<xsl:when test="$curDepth &lt; $maxDepth">
<xsl:sequence select="local:getTaxNodesForDepth($out, $curDepth + 1, $maxDepth,
($out/record/taxNodes//node[count(ancestor-or-self::node) = $curDepth][id/normalize-space(.) = $context])[1]/id/normalize-space(.))" />
</xsl:when>
</xsl:choose>
</node>
</xsl:for-each>
</xsl:function>
<xsl:template match="taxNodes"/>
<xsl:template match="@*|node()" mode="#default">
<xsl:copy>
<xsl:apply-templates select="@*" mode="#current"/>
<xsl:apply-templates select="node()" mode="#current"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>

Getting Unique values and adding the values in XSLT

Hi I am pretty new to XSLT so need some help on simple XSL code.
My input XML
<?xml version="1.0" encoding="ASCII"?>
<Node Name="Person" Received="1" Good="1" Bad="0" Condition="byPerson:1111">
</Node>
<Node Name="Person" Received="1" Good="1" Bad="0" Condition="byPerson:1111">
</Node>
<Node Name="Person" Received="1" Good="1" Bad="0" Condition="byPerson:2222">
</Node>
<Node Name="Person" Received="1" Good="1" Bad="0" Condition="byPerson:2222">
</Node>
<Node Name="Person" Received="1" Good="1" Bad="0" Condition="byPerson:3333">
</Node>
I am expecting the result to be the sum of all Received, Good and Bad values, but each unique Condition should be counted only once.
Something like this
<?xml version="1.0" encoding="ASCII"?>
<Received>3</Received >
<Good>3</Good>
<Bad>0</Bad>
I was trying the code below, but with no success so far: I just get the sum of everything, whereas I would like each 'Condition' to be counted only once.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:template match="/">
<xsl:value-of select= "sum(Node/@Received)"/>
<xsl:value-of select= "sum(Node/@Good)"/>
<xsl:value-of select= "sum(Node/@Bad)"/>
</xsl:template>
</xsl:stylesheet>
The following stylesheet uses an xsl:key to group the <Node> elements by the value of @Condition. It uses the Muenchian method, with key() and generate-id(), to select the first Node element for each unique @Condition, and then computes the sum() of the attributes of the selected Node elements.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:output indent="yes"/>
<xsl:key name="nodesByCondition" match="Node" use="@Condition"/>
<xsl:template match="/">
<results>
<xsl:variable name="distinctNodes"
select="*/Node[generate-id() =
generate-id(key('nodesByCondition', @Condition)[1])]"/>
<Received>
<xsl:value-of select= "sum($distinctNodes/@Received)"/>
</Received>
<Good><xsl:value-of select= "sum($distinctNodes/@Good)"/></Good>
<Bad><xsl:value-of select= "sum($distinctNodes/@Bad)"/></Bad>
</results>
</xsl:template>
</xsl:stylesheet>
In XSLT 2.0 you can use distinct-values() instead.
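As a rough sketch of that XSLT 2.0 route (not a tested drop-in replacement), the Muenchian key can be swapped for distinct-values() over the Condition attributes, keeping the first Node for each distinct value. Like the 1.0 stylesheet above, it assumes the Node elements sit under a single wrapper element, which the posted input does not show.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output indent="yes"/>
  <xsl:template match="/">
    <!-- Assumption: Node elements are children of one wrapper element.
         For each distinct Condition value, keep only the first Node that carries it. -->
    <xsl:variable name="distinctNodes"
                  select="for $c in distinct-values(*/Node/@Condition)
                          return (*/Node[@Condition = $c])[1]"/>
    <results>
      <Received><xsl:value-of select="sum($distinctNodes/@Received)"/></Received>
      <Good><xsl:value-of select="sum($distinctNodes/@Good)"/></Good>
      <Bad><xsl:value-of select="sum($distinctNodes/@Bad)"/></Bad>
    </results>
  </xsl:template>
</xsl:stylesheet>
```

For the five sample Node elements (once wrapped in a root element) this should give 3, 3 and 0, since there are three distinct Condition values.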

Get distinct values from xml

My sample XML is below. I need to get the distinct states from the XML. I am using XSLT 1.0 in the VS 2010 editor.
<?xml version="1.0" encoding="utf-8" ?>
<states>
<node>
<value>2</value>
<state>DE</state>
</node>
<node>
<value>1</value>
<state>DE</state>
</node>
<node>
<value>1</value>
<state>NJ</state>
</node>
<node>
<value>1</value>
<state>NY</state>
</node>
<node>
<value>1</value>
<state>NY</state>
</node>
</states>
My xslt looks like below:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl"
xmlns:user="urn:my-scripts">
<xsl:output method="text" indent="yes"/>
<xsl:key name="st" match="//states/node/state" use="." />
<xsl:variable name="disst">
<xsl:for-each select="//states/node[contains(value,1)]/state[generate-id()=generate-id(key('st',.)[1])]" >
<xsl:choose>
<xsl:when test="(position() != 1)">
<xsl:value-of select="concat(', ',.)" disable-output-escaping="yes"/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="." disable-output-escaping="yes"/>
</xsl:otherwise>
</xsl:choose>
</xsl:for-each>
</xsl:variable>
<xsl:template match="/" >
<xsl:value-of disable-output-escaping="yes" select="$disst"/>
</xsl:template>
</xsl:stylesheet>
Output: DE,NJ,NY
My XSLT above works for the test XML above.
If I change the xml as below:
<?xml version="1.0" encoding="utf-8" ?>
<states>
<node>
<value>2</value>
<state>DE</state>
</node>
<node>
<value>1</value>
<state>DE</state>
</node>
<node>
<value>1</value>
<state>NJ</state>
</node>
<node>
<value>1</value>
<state>NY</state>
</node>
<node>
<value>1</value>
<state>NY</state>
</node>
</states>
It is not picking up the state DE. Can anyone suggest a suitable solution? Thanks in advance.
I need to find out the distinct states from the xml.
The problem here is your use of a predicate in your Muenchian grouping XPath:
[contains(value,1)]
This will often make Muenchian grouping fail to find all of the available distinct values. Instead, you should add the predicate to the key:
<xsl:key name="st" match="//states/node[contains(value, 1)]/state" use="." />
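As a minimal sketch of how this first option could look as a complete stylesheet (the for-each and the comma handling are illustrative choices, not part of the original question), the predicate lives only in the key, so the grouping expression itself needs no extra filter:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <!-- Index only those state elements whose sibling value contains "1" -->
  <xsl:key name="st" match="states/node[contains(value, 1)]/state" use="."/>
  <xsl:template match="/">
    <!-- A state is selected only if it is the first entry in the key for its value;
         states that never appear in the key yield an empty key() result and are skipped -->
    <xsl:for-each select="states/node/state[generate-id() = generate-id(key('st', .)[1])]">
      <xsl:if test="position() &gt; 1">,</xsl:if>
      <xsl:value-of select="."/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

On the sample XML, the first DE node (value 2) is not in the key, but the second DE node (value 1) is, so DE is still reported and the output should be DE,NJ,NY.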
Alternatively, you can apply the predicate inside the grouping statement:
<xsl:apply-templates
select="//states/node
/state[generate-id() =
generate-id(key('st',.)[contains(../value, 1)][1])]" />
Full XSLT (with some improvements):
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:user="urn:my-scripts">
<xsl:output method="text" indent="yes"/>
<xsl:key name="st" match="//states/node/state" use="." />
<xsl:variable name="a" select="1" />
<xsl:variable name="disst">
<xsl:apply-templates
select="//states/node
/state[generate-id() =
generate-id(key('st',.)[contains(../value, $a)][1])]" />
</xsl:variable>
<xsl:template match="state">
<xsl:if test="position() > 1">
<xsl:text>,</xsl:text>
</xsl:if>
<xsl:value-of select ="." disable-output-escaping="yes" />
</xsl:template>
<xsl:template match="/" >
<xsl:value-of disable-output-escaping="yes" select="$disst"/>
</xsl:template>
</xsl:stylesheet>
Result when run on your sample XML:
DE,NJ,NY

Difference between * and node() in XSLT

What's the difference between these two templates?
<xsl:template match="node()">
<xsl:template match="*">
<xsl:template match="node()">
is an abbreviation for:
<xsl:template match="child::node()">
This matches any node type that can be selected via the child:: axis:
element
text-node
processing-instruction (PI) node
comment node.
On the other side:
<xsl:template match="*">
is an abbreviation for:
<xsl:template match="child::*">
This matches any element.
The XPath expression: someAxis::* matches any node of the primary node-type for the given axis.
For the child:: axis the primary node-type is element.
Just to illustrate one of the differences, viz that * doesn't match text:
Given xml:
<A>
Text1
<B/>
Text2
</A>
Matching on node()
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>
<!--Suppress unmatched text-->
<xsl:template match="text()" />
<xsl:template match="/">
<root>
<xsl:apply-templates />
</root>
</xsl:template>
<xsl:template match="node()">
<node>
<xsl:copy />
</node>
<xsl:apply-templates />
</xsl:template>
</xsl:stylesheet>
Gives:
<root>
<node>
<A />
</node>
<node>
Text1
</node>
<node>
<B />
</node>
<node>
Text2
</node>
</root>
Whereas matching on *:
<xsl:template match="*">
<star>
<xsl:copy />
</star>
<xsl:apply-templates />
</xsl:template>
Doesn't match the text nodes.
<root>
<star>
<A />
</star>
<star>
<B />
</star>
</root>
Also refer to XSL xsl:template match="/" for other match patterns.

Get a collection of the attributes of a nodeset

I have a collection of nodes like this
<node id="1">
<languaje>c</languaje>
<os>linux</os>
</node>
<node id="2">
<languaje>c++</languaje>
<os>linux</os>
</node>
<node id="3">
<languaje>c#</languaje>
<os>window</os>
</node>
<node id="4">
<languaje>basic</languaje>
<os>mac</os>
</node>
And I want to create a new collection of all the id attributes, like this
<root>
<token>1</token>
<token>2</token>
<token>3</token>
<token>4</token>
</root>
How can I do that?
If you can use XQuery you can do it like this:
<root>
{ ($document/node/<node>{string(@id)}</node>) }
</root>
which is imho the clearest solution.
Otherwise you could create a string (not a document) containing your desired result with XPath 2 by concatenating the tags and your ids :
concat("<root>", string-join(for $i in /base/node/@id return concat("<node>",$i,"</node>"), " ") , "</root>")
All you need is
<xsl:output indent="yes"/>
<xsl:template match="*[node]">
<root>
<xsl:apply-templates select="node"/>
</root>
</xsl:template>
<xsl:template match="node">
<token><xsl:value-of select="@id"/></token>
</xsl:template>
If you want to store the result in a variable you can create a result tree fragment with XSLT 1.0 with e.g.
<xsl:variable name="rtf1">
<xsl:apply-templates select="node()" mode="m1"/>
</xsl:variable>
<xsl:template match="*[node]" mode="m1">
<root>
<xsl:apply-templates select="node" mode="m1"/>
</root>
</xsl:template>
<xsl:template match="node" mode="m1">
<token><xsl:value-of select="@id"/></token>
</xsl:template>
Then you can do <xsl:copy-of select="$rtf1"/> to use the result tree fragment, or with exsl:node-set you can process the created nodes with XPath and XSLT, e.g.
<xsl:apply-templates select="exsl:node-set($rtf1)/root/token"/>
With XSLT 2.0 there are no longer result tree fragments so you can use the variable like any input without the need for an extension function.
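A small sketch of that XSLT 2.0 behaviour, assuming the node elements are wrapped in a single root element (the variable name tokens and the result element are made up for illustration): the variable holds a temporary tree, so ordinary XPath steps can be applied to it directly.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output indent="yes"/>
  <!-- Assumption: the node elements are children of one wrapper element -->
  <xsl:template match="/*">
    <xsl:variable name="tokens">
      <root>
        <xsl:for-each select="node">
          <token><xsl:value-of select="@id"/></token>
        </xsl:for-each>
      </root>
    </xsl:variable>
    <!-- No exsl:node-set() needed: $tokens can be navigated like any input document -->
    <result count="{count($tokens/root/token)}">
      <xsl:copy-of select="$tokens/root/token"/>
    </result>
  </xsl:template>
</xsl:stylesheet>
```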
If you wrap all the nodes under a tag, like <nodes> this works:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<root>
<xsl:apply-templates select="*" />
</root>
</xsl:template>
<!-- templates -->
<xsl:template match="node">
<token><xsl:value-of select="@id" /></token>
</xsl:template>
</xsl:stylesheet>
Tested on XsltCake
http://www.xsltcake.com/slices/E937yH