XSLT 1.0, select group of nodes with key

I want to select nodes based on some variables.
The XML code:
<data>
<prot seq="AAA">
<node num="1">1345</node>
<node num="1">11245</node>
<node num="2">88885</node>
</prot>
<prot seq="BBB">
<node num="1">678</node>
<node num="1">456</node>
<node num="2">6666</node>
</prot>
<prot seq="CCC">
<node num="1">111</node>
<node num="1">222</node>
<node num="2">333</node>
</prot>
</data>
The XML that I want
<output>
<prot seq="AAA">
<node num="1">1345</node>
<node num="2">88885</node>
</prot>
<prot seq="BBB">
<node num="1">678</node>
<node num="2">6666</node>
</prot>
<prot seq="CCC">
<node num="1">111</node>
<node num="2">333</node>
</prot>
</output>
So, my idea has been to group the nodes with an xsl:key element, and then do a for-each over them. For example:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>
<xsl:key name="by" match="/data/prot" use="concat(@seq,'|',node/@num)"/>
<xsl:template match="/">
<root>
<xsl:apply-templates select="/data/prot"/>
</root>
</xsl:template>
<xsl:template match="/data/prot">
<xsl:for-each select="./node">
<xsl:for-each select="key('by',concat(current()/../@seq,'|',current()/@num))">
node <xsl:value-of select="./node" />
</xsl:for-each>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
But the output is not what I expected, and I cannot see what I am doing wrong. I would prefer to keep the for-each structure. It seems I am not using the xsl:key grouping features properly.
The unwanted output that I get:
<root>
node 1345
node 1345
node 678
node 678
node 111
node 111</root>
The code can be tested here:
http://www.xsltcake.com/slices/sgWUFu/20
Thanks!

The main problem in your code is that the key indexes prot elements, but what we want to de-duplicate (and need to index) is the node elements.
Here is a short and correct solution:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:key name="nodeByParentAndNum" match="node"
use="concat(generate-id(..), '+', @num)"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="/*">
<data>
<xsl:apply-templates/>
</data>
</xsl:template>
<xsl:template match=
"node
[not(generate-id()
=
generate-id(key('nodeByParentAndNum',
concat(generate-id(..), '+', @num)
)
[1]
)
)
]
"/>
</xsl:stylesheet>
when this transformation is applied on the provided XML document:
<data>
<prot seq="AAA">
<node num="1">1345</node>
<node num="1">11245</node>
<node num="2">88885</node>
</prot>
<prot seq="BBB">
<node num="1">678</node>
<node num="1">456</node>
<node num="2">6666</node>
</prot>
<prot seq="CCC">
<node num="1">111</node>
<node num="1">222</node>
<node num="2">333</node>
</prot>
</data>
the wanted, correct result is produced:
<data>
<prot seq="AAA">
<node num="1">1345</node>
<node num="2">88885</node>
</prot>
<prot seq="BBB">
<node num="1">678</node>
<node num="2">6666</node>
</prot>
<prot seq="CCC">
<node num="1">111</node>
<node num="2">333</node>
</prot>
</data>
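Since the question asks to keep the for-each structure, the same key can also be used inside xsl:for-each. The following is only a sketch of that variant, assuming the input has the /data/prot/node shape shown above; it rebuilds the data and prot elements by hand instead of relying on the identity template.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Same key as in the answer: index each node by its parent's id plus @num -->
  <xsl:key name="nodeByParentAndNum" match="node"
           use="concat(generate-id(..), '+', @num)"/>

  <xsl:template match="/data">
    <data>
      <xsl:for-each select="prot">
        <prot seq="{@seq}">
          <!-- Keep only the first node of each (parent, @num) group -->
          <xsl:for-each select="node[generate-id() =
              generate-id(key('nodeByParentAndNum',
                              concat(generate-id(..), '+', @num))[1])]">
            <xsl:copy-of select="."/>
          </xsl:for-each>
        </prot>
      </xsl:for-each>
    </data>
  </xsl:template>
</xsl:stylesheet>
```

The inner for-each visits one node per distinct num within each prot, so it should yield the same three prot elements with two node children each.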

Related

transform move nodes with the same names into their parent nodes

the document to be transformed looks more or less like this:
<?xml version="1.0" encoding="utf-8"?>
<root>
<someCatalogProp>ąć</someCatalogProp>
<meanProp>
<node id="1">
<someProperty>blabla1</someProperty>
<children>
<node idref="2"/>
</children>
</node>
<node id="2">
<someProperty>blabla2</someProperty>
<children>
<node idref="3"/>
</children>
</node>
</meanProp>
<node id="1">
<someProperty>blabla1</someProperty>
<children>
<node idref="2"/>
</children>
</node>
<node id="2">
<someProperty>blabla2</someProperty>
<children>
<node idref="3"/>
</children>
</node>
<node id="3">
<someProperty>blabla3</someProperty>
<children>
</children>
</node>
</root>
the result document should look like this:
<root>
<someCatalogProp>ąć</someCatalogProp>
<node id = "1">
<someProperty>blabla1</someProperty>
<children>
<node id = "2">
<someProperty>blabla2</someProperty>
<children>
<node id = "3">
<someProperty>blabla3</someProperty>
<children>
</children>
</node>
</children>
</node>
</children>
</node>
</root>
A node can have multiple children, and the depth of the hierarchy is not limited.
What should the XSLT transformation look like?
Thank you in advance.
This is actually quite simple to accomplish using keys.
Provided you have a well-formed input such as:
XML
<root>
<node id="1">
<someProperty>blabla1</someProperty>
<children>
<node idref="2"/>
</children>
</node>
<node id="2">
<someProperty>blabla2</someProperty>
<children>
<node idref="3"/>
</children>
</node>
<node id="3">
<someProperty>blabla2</someProperty>
<children>
</children>
</node>
</root>
applying the following stylesheet:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:key name="child" match="node" use="@id" />
<xsl:key name="parent" match="node" use="@idref" />
<!-- identity transform -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="/root">
<xsl:copy>
<xsl:apply-templates select="node[not(key('parent', @id))]"/>
</xsl:copy>
</xsl:template>
<xsl:template match="node[@idref]">
<xsl:apply-templates select="key('child', @idref)"/>
</xsl:template>
</xsl:stylesheet>
will produce:
Result
<?xml version="1.0" encoding="UTF-8"?>
<root>
<node id="1">
<someProperty>blabla1</someProperty>
<children>
<node id="2">
<someProperty>blabla2</someProperty>
<children>
<node id="3">
<someProperty>blabla2</someProperty>
<children/>
</node>
</children>
</node>
</children>
</node>
</root>

sorting elements by attribute on a simple nested structure

I'm still struggling to get my head around a lot of XSLT but have a specific question.
I have a simple nested structure that I want to sort by an attribute (name).
The file has a single root node and then a series of nested nodes. I need to have all the nodes under root sorted within the level they are. The hierarchy is nested to an unspecified level.
The input:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<node name="A">
<node name="C"/>
<node name="B"/>
</node>
<node name="F"/>
<node name="E"/>
</root>
Needs to be transformed into:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<node name="A">
<node name="B"/>
<node name="C"/>
</node>
<node name="E"/>
<node name="F"/>
</root>
I won't bore you with my feeble attempts at solving this.
Assuming you do wish the elements to stay within the level they are currently in, firstly, you would need a template to match any element
<xsl:template match="*">
Then you would use xsl:copy to copy the element, and xsl:copy-of to copy any attributes
<xsl:copy>
<xsl:copy-of select="@*"/>
... more code here...
</xsl:copy>
And within the xsl:copy you would then use xsl:apply-templates to process the child elements, along with xsl:sort to select the order
<xsl:apply-templates select="*">
<xsl:sort select="@name" />
</xsl:apply-templates>
Putting this all together gives you this
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes" />
<xsl:template match="*">
<xsl:copy>
<xsl:copy-of select="@*"/>
<xsl:apply-templates select="*">
<xsl:sort select="@name" />
</xsl:apply-templates>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
When applied to your input XML the following is output
<root>
<node name="A">
<node name="B"/>
<node name="C"/>
</node>
<node name="E"/>
<node name="F"/>
</root>
This answer is similar to Tim C's, but just uses an identity transform with an xsl:sort. This way you don't lose comments or processing instructions if they're present.
XSLT 1.0
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()">
<xsl:sort select="@name"/>
</xsl:apply-templates>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>

Getting Unique values and adding the values in XSLT

Hi, I am pretty new to XSLT, so I need some help with some simple XSL code.
My input XML
<?xml version="1.0" encoding="ASCII"?>
<Node Name="Person" Received="1" Good="1" Bad="0" Condition="byPerson:1111">
</Node>
<Node Name="Person" Received="1" Good="1" Bad="0" Condition="byPerson:1111">
</Node>
<Node Name="Person" Received="1" Good="1" Bad="0" Condition="byPerson:2222">
</Node>
<Node Name="Person" Received="1" Good="1" Bad="0" Condition="byPerson:2222">
</Node>
<Node Name="Person" Received="1" Good="1" Bad="0" Condition="byPerson:3333">
</Node>
I am expecting the result to be the sum of all Received, Good and Bad values, but each should be added only once per unique Condition.
Something like this:
<?xml version="1.0" encoding="ASCII"?>
<Received>3</Received >
<Good>3</Good>
<Bad>0</Bad>
I tried the code below, but with no success so far: I just get the sum of everything, whereas I would like each Condition to be counted only once.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:template match="/">
<xsl:value-of select="sum(Node/@Received)"/>
<xsl:value-of select="sum(Node/@Good)"/>
<xsl:value-of select="sum(Node/@Bad)"/>
</xsl:template>
</xsl:stylesheet>
The following stylesheet uses an xsl:key to group the <Node> elements by the value of @Condition. It applies the Muenchian method, with key() and generate-id(), to select the first Node element for each unique @Condition, and then computes the sum() of the attributes of the selected Node elements. Note that the select expression */Node assumes the Node elements are wrapped in a single root element.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:output indent="yes"/>
<xsl:key name="nodesByCondition" match="Node" use="@Condition"/>
<xsl:template match="/">
<results>
<xsl:variable name="distinctNodes"
select="*/Node[generate-id() =
generate-id(key('nodesByCondition', @Condition)[1])]"/>
<Received>
<xsl:value-of select= "sum($distinctNodes/@Received)"/>
</Received>
<Good><xsl:value-of select= "sum($distinctNodes/@Good)"/></Good>
<Bad><xsl:value-of select= "sum($distinctNodes/@Bad)"/></Bad>
</results>
</xsl:template>
</xsl:stylesheet>
in XSLT 2.0 you can use distinct-values()
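As a rough XSLT 2.0 sketch of the same idea: distinct-values() returns atomic values rather than nodes, so this version uses xsl:for-each-group instead to pick one Node per Condition. This is a swapped-in technique, not the one named above. It assumes, like the XSLT 1.0 stylesheet, that the Node elements sit under a single root element.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0">
  <xsl:output indent="yes"/>

  <xsl:template match="/">
    <!-- One Node per distinct @Condition: inside for-each-group the
         context item is the first item of the current group -->
    <xsl:variable name="distinctNodes" as="element(Node)*">
      <xsl:for-each-group select="*/Node" group-by="@Condition">
        <xsl:sequence select="."/>
      </xsl:for-each-group>
    </xsl:variable>
    <results>
      <Received><xsl:value-of select="sum($distinctNodes/@Received)"/></Received>
      <Good><xsl:value-of select="sum($distinctNodes/@Good)"/></Good>
      <Bad><xsl:value-of select="sum($distinctNodes/@Bad)"/></Bad>
    </results>
  </xsl:template>
</xsl:stylesheet>
```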

XSLT: merging files but with better performance of the process

I have two XML files and want to merge them. The criterion for the merge is as follows:
nodes1.xml file content:
<nodes>
<node>
<type>a</type>
<name>joe</name>
</node>
<node>
<type>b</type>
<name>sam</name>
</node>
<node>
<type>c</type>
<name>pez</name>
</node>
<node>
<type>g</type>
<name>lua</name>
</node>
<node>
<type>a</type>
<name>tol</name>
</node>
<node>
<type>c</type>
<name>jua</name>
</node>
</nodes>
nodes2.xml file content:
<nodes>
<node>
<type>a</type>
<name>jill</name>
</node>
<node>
<type>c</type>
<name>imol</name>
</node>
<node>
<type>h</type>
<name>teli</name>
</node>
<node>
<type>f</type>
<name>jopp</name>
</node>
<node>
<type>c</type>
<name>zolh</name>
</node>
</nodes>
and with my XSL template I get:
<?xml version="1.0" encoding="UTF-8"?>
<nodes>
<node tipo="a">
<name>joe</name>
<name>tol</name>
<name>jill</name>
</node>
<node tipo="c">
<name>pez</name>
<name>jua</name>
<name>imol</name>
<name>zolh</name>
</node>
<node tipo="h">
<name>teli</name>
</node>
<node tipo="f">
<name>jopp</name>
</node>
</nodes>
I need a solution to get better performance.
My current solution is:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:variable name="Source2" select="document('nodes2.xml')/nodes/node"/>
<xsl:variable name="Source1" select="document('nodes1.xml')/nodes/node"/>
<xsl:template match="/nodes" >
<nodes>
<xsl:for-each-group select="node" group-by="type">
<node tipo="{type}">
<xsl:apply-templates select="$Source1[type=current-grouping-key()]/name"/>
<xsl:apply-templates select="$Source2[type=current-grouping-key()]/name"/>
</node>
</xsl:for-each-group>
</nodes>
</xsl:template>
<xsl:template match="name">
<name><xsl:value-of select="."/></name>
</xsl:template>
</xsl:stylesheet>
I run it with Saxon for Java:
$ java net.sf.saxon.Transform nodes2.xml mysolution.xsl
I think it is a shame to also load the input file into a variable, but I cannot figure out how to do it differently.
I would appreciate any help or pointers.
--Paulino
Assuming you have the second of the files as the primary input to the XSLT code you can use the following:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:param name="source1-uri" select="'nodes1.xml'"/>
<xsl:variable name="doc1" select="doc($source1-uri)"/>
<xsl:key name="by-type" match="nodes/node" use="type"/>
<xsl:template match="/nodes" >
<nodes>
<xsl:for-each-group select="key('by-type', node/type, $doc1), node" group-by="type">
<node tipo="{current-grouping-key()}">
<xsl:copy-of select="for $n in current-group() return $n/name"/>
</node>
</xsl:for-each-group>
</nodes>
</xsl:template>
</xsl:stylesheet>
I am not sure whether the order of the merged name elements matters to you, but to ensure with Saxon 9.5 that I get the order you posted in your result sample I had to use <xsl:copy-of select="for $n in current-group() return $n/name"/> instead of the shorter and more usual <xsl:copy-of select="current-group()/name"/>.
That solution should be more efficient, mainly because it groups all input nodes in one pass and then simply uses current-group() instead of selecting the nodes again with a predicate.

How to remove duplicates based on level in hierarchy?

I have the following XML structure:
<node name="A">
<node name="B">
<node name="C"/>
<node name="D"/>
<node name="E"/>
</node>
<node name="D"/>
<node name="E"/>
</node>
I need to get all the leaf nodes. I use //node[not(node)] to get those. Now I need to remove duplicates, keeping the elements that are deeper in the hierarchy. How do I do that?
I. XSLT 1.0 Solution:
This transformation:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:variable name="vallLeaves" select="//node()[not(node())]"/>
<xsl:template match="/">
$vallLeaves:
<xsl:copy-of select="$vallLeaves"/>
$vallDistinctLeaves:
<xsl:for-each select="$vallLeaves">
<xsl:if test=
"generate-id()
=
generate-id($vallLeaves[@name
=
current()/@name
]
[1]
)
">
<xsl:copy-of select="."/>
</xsl:if>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
when applied on the provided XML document:
<node name="A">
<node name="B">
<node name="C"/>
<node name="D"/>
<node name="E"/>
</node>
<node name="D"/>
<node name="E"/>
</node>
produces the wanted, correct result:
$vallLeaves:
<node name="C"/>
<node name="D"/>
<node name="E"/>
<node name="D"/>
<node name="E"/>
$vallDistinctLeaves:
<node name="C"/>
<node name="D"/>
<node name="E"/>
II. XSLT 2.0 Solution:
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:variable name="vallLeaves" select="//node()[not(node())]"/>
<xsl:variable name="vallDistinctLeaves" as="element()*">
<xsl:for-each-group select="$vallLeaves" group-by="@name">
<xsl:sequence select="."/>
</xsl:for-each-group>
</xsl:variable>
<xsl:template match="/">
$vallLeaves:
<xsl:sequence select="$vallLeaves"/>
$vallDistinctLeaves:
<xsl:sequence select="$vallDistinctLeaves"/>
</xsl:template>
</xsl:stylesheet>
when this transformation is applied on the same XML document (above), the same correct results are produced:
$vallLeaves:
<node name="C"/>
<node name="D"/>
<node name="E"/>
<node name="D"/>
<node name="E"/>
$vallDistinctLeaves:
<node name="C"/>
<node name="D"/>
<node name="E"/>
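The XSLT 1.0 solution above compares each leaf against the whole $vallLeaves node-set, which may get slow on large documents. A key-based (Muenchian) variant is sketched below. The key name leafByName is hypothetical, and the sketch assumes that leaves are node elements without node children.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Hypothetical key name: index only the leaf elements by @name -->
  <xsl:key name="leafByName" match="node[not(node)]" use="@name"/>

  <xsl:template match="/">
    <!-- A leaf is kept only if it is the first leaf, in document order,
         with its @name -->
    <xsl:copy-of select="//node[not(node)]
        [generate-id() = generate-id(key('leafByName', @name)[1])]"/>
  </xsl:template>
</xsl:stylesheet>
```

Like the solutions above, this keeps the first leaf in document order for each name. In the sample document that happens to be the deeper occurrence, but with a different layout, depth would need to be compared explicitly, for example via count(ancestor::*).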