How do I pair appropriate xml elements with xmlstarlet? - xslt

I have two sets of XML nodes, and I want to pair the elements that have an identical "phone" child. For example:
<set1>
<node>
<phone>111</phone>
<name>John</name>
</node>
<node>
<phone>444</phone>
<name>Amy</name>
</node>
<node>
<phone>777</phone>
<name>Robin</name>
</node>
</set1>
<set2>
<node>
<phone>111</phone>
<city>Moscow</city>
</node>
<node>
<phone>444</phone>
<city>Prag</city>
</node>
<node>
<phone>999</phone>
<city>Rome</city>
</node>
</set2>
Now I want to get the following:
<result>
<node>
<phone>111</phone>
<name>John</name>
<city>Moscow</city>
</node>
<node>
<phone>444</phone>
<name>Amy</name>
<city>Prag</city>
</node>
<node>
<phone>777</phone>
<name>Robin</name>
</node>
<node>
<phone>999</phone>
<city>Rome</city>
</node>
</result>
I'm a beginner in XSLT. I managed to merge two XML files and put them in an HTML table, but this pairing is beyond me.

Use a key
<xsl:key name="phone" match="node" use="phone"/>
then group with Muenchian grouping as follows:
<xsl:template match="/">
<result>
<xsl:apply-templates select="//node[generate-id() = generate-id(key('phone', phone)[1])]"/>
</result>
</xsl:template>
<xsl:template match="node">
<xsl:copy>
<xsl:copy-of select="phone"/>
<xsl:copy-of select="key('phone', phone)/*[not(self::phone)]"/>
</xsl:copy>
</xsl:template>
For readability add
<xsl:output indent="yes"/>
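Put together, the fragments above form a complete stylesheet. This is a minimal sketch, assuming the two sets are wrapped in a single root element so that the input is well-formed (the name of that wrapper does not matter, because the nodes are selected with //node):

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Index every node element by the value of its phone child -->
  <xsl:key name="phone" match="node" use="phone"/>

  <xsl:template match="/">
    <result>
      <!-- Muenchian grouping: keep only the first node for each distinct phone -->
      <xsl:apply-templates
          select="//node[generate-id() = generate-id(key('phone', phone)[1])]"/>
    </result>
  </xsl:template>

  <xsl:template match="node">
    <xsl:copy>
      <xsl:copy-of select="phone"/>
      <!-- Pull in the non-phone children of every node sharing this phone -->
      <xsl:copy-of select="key('phone', phone)/*[not(self::phone)]"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Since the question mentions xmlstarlet: assuming the stylesheet is saved as merge.xsl and the wrapped input as sets.xml (both file names are placeholders), it can be applied with xmlstarlet tr merge.xsl sets.xml.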

Related

transform move nodes with the same names into their parent nodes

the document to be transformed looks more or less like this:
<?xml version="1.0" encoding="utf-8"?>
<root>
<someCatalogProp>ąć</someCatalogProp>
<meanProp>
<node id="1">
<someProperty>blabla1</someProperty>
<children>
<node idref="2"/>
</children>
</node>
<node id="2">
<someProperty>blabla2</someProperty>
<children>
<node idref="3"/>
</children>
</node>
</meanProp>
<node id="1">
<someProperty>blabla1</someProperty>
<children>
<node idref="2"/>
</children>
</node>
<node id="2">
<someProperty>blabla2</someProperty>
<children>
<node idref="3"/>
</children>
</node>
<node id="3">
<someProperty>blabla3</someProperty>
<children>
</children>
</node>
</root>
the result document should look like this:
<root>
<someCatalogProp>ąć</someCatalogProp>
<node id = "1">
<someProperty>blabla1</someProperty>
<children>
<node id = "2">
<someProperty>blabla2</someProperty>
<children>
<node id = "3">
<someProperty>blabla3</someProperty>
<children>
</children>
</node>
</children>
</node>
</children>
</node>
</root>
A node can have multiple children, and the depth of the hierarchy is not limited.
What would the XSLT transformation look like?
Thank you in advance.
This is actually quite simple to accomplish using keys.
Provided you have a well-formed input such as:
XML
<root>
<node id="1">
<someProperty>blabla1</someProperty>
<children>
<node idref="2"/>
</children>
</node>
<node id="2">
<someProperty>blabla2</someProperty>
<children>
<node idref="3"/>
</children>
</node>
<node id="3">
<someProperty>blabla2</someProperty>
<children>
</children>
</node>
</root>
applying the following stylesheet:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:key name="child" match="node" use="@id" />
<xsl:key name="parent" match="node" use="@idref" />
<!-- identity transform -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="/root">
<xsl:copy>
<xsl:apply-templates select="node[not(key('parent', @id))]"/>
</xsl:copy>
</xsl:template>
<xsl:template match="node[@idref]">
<xsl:apply-templates select="key('child', @idref)"/>
</xsl:template>
</xsl:stylesheet>
will produce:
Result
<?xml version="1.0" encoding="UTF-8"?>
<root>
<node id="1">
<someProperty>blabla1</someProperty>
<children>
<node id="2">
<someProperty>blabla2</someProperty>
<children>
<node id="3">
<someProperty>blabla2</someProperty>
<children/>
</node>
</children>
</node>
</children>
</node>
</root>
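The sample in the question also contains a someCatalogProp element that should be kept, and a meanProp wrapper holding duplicate node definitions. A possible adaptation, sketched under the assumption that only the node elements directly under root are the authoritative definitions and that meanProp should be dropped entirely, is to restrict the child key to root/node and to select someCatalogProp explicitly:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Definitions: only node elements that are direct children of root -->
  <xsl:key name="child" match="root/node" use="@id"/>
  <!-- References: any node carrying an idref attribute -->
  <xsl:key name="parent" match="node" use="@idref"/>

  <!-- identity transform -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="/root">
    <xsl:copy>
      <!-- Keep the catalog property and start from nodes nobody refers to;
           meanProp is not selected, so it is dropped -->
      <xsl:apply-templates
          select="someCatalogProp | node[not(key('parent', @id))]"/>
    </xsl:copy>
  </xsl:template>

  <!-- Replace each reference with the definition it points to -->
  <xsl:template match="node[@idref]">
    <xsl:apply-templates select="key('child', @idref)"/>
  </xsl:template>
</xsl:stylesheet>
```

Without the root/node restriction, the child key would also find the copies inside meanProp, and each referenced node would be emitted twice.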

XSLT: merging files but with better performance of the process

I have two XML files that I want to merge; the criterion for the merge is shown by the following example:
nodes1.xml file content:
<nodes>
<node>
<type>a</type>
<name>joe</name>
</node>
<node>
<type>b</type>
<name>sam</name>
</node>
<node>
<type>c</type>
<name>pez</name>
</node>
<node>
<type>g</type>
<name>lua</name>
</node>
<node>
<type>a</type>
<name>tol</name>
</node>
<node>
<type>c</type>
<name>jua</name>
</node>
</nodes>
nodes2.xml file content:
<nodes>
<node>
<type>a</type>
<name>jill</name>
</node>
<node>
<type>c</type>
<name>imol</name>
</node>
<node>
<type>h</type>
<name>teli</name>
</node>
<node>
<type>f</type>
<name>jopp</name>
</node>
<node>
<type>c</type>
<name>zolh</name>
</node>
</nodes>
and with my XSL template I get:
<?xml version="1.0" encoding="UTF-8"?>
<nodes>
<node tipo="a">
<name>joe</name>
<name>tol</name>
<name>jill</name>
</node>
<node tipo="c">
<name>pez</name>
<name>jua</name>
<name>imol</name>
<name>zolh</name>
</node>
<node tipo="h">
<name>teli</name>
</node>
<node tipo="f">
<name>jopp</name>
</node>
</nodes>
I need a solution with better performance.
My current solution is:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:variable name="Source2" select="document('nodes2.xml')/nodes/node"/>
<xsl:variable name="Source1" select="document('nodes1.xml')/nodes/node"/>
<xsl:template match="/nodes" >
<nodes>
<xsl:for-each-group select="node" group-by="type">
<node tipo="{type}">
<xsl:apply-templates select="$Source1[type=current-grouping-key()]/name"/>
<xsl:apply-templates select="$Source2[type=current-grouping-key()]/name"/>
</node>
</xsl:for-each-group>
</nodes>
</xsl:template>
<xsl:template match="name">
<name><xsl:value-of select="."/></name>
</xsl:template>
</xsl:stylesheet>
I run it with Saxon on Java:
$ java net.sf.saxon.Transform nodes2.xml mysolution.xsl
It seems a shame to load the input file a second time into a variable, but I cannot figure out how to do it differently.
I would appreciate any help or pointers.
--Paulino
Assuming the second file is the primary input to the XSLT code, you can use the following:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:param name="source1-uri" select="'nodes1.xml'"/>
<xsl:variable name="doc1" select="doc($source1-uri)"/>
<xsl:key name="by-type" match="nodes/node" use="type"/>
<xsl:template match="/nodes" >
<nodes>
<xsl:for-each-group select="key('by-type', node/type, $doc1), node" group-by="type">
<node tipo="{current-grouping-key()}">
<xsl:copy-of select="for $n in current-group() return $n/name"/>
</node>
</xsl:for-each-group>
</nodes>
</xsl:template>
</xsl:stylesheet>
I am not sure whether the order of the merged name elements matters to you, but to get the order shown in your result sample with Saxon 9.5 I had to use <xsl:copy-of select="for $n in current-group() return $n/name"/> instead of the shorter and more usual <xsl:copy-of select="current-group()/name"/>.
This solution should be more efficient, mainly because it groups all input nodes in one pass and then simply uses current-group() instead of selecting the nodes again with a predicate.

sum of nodes after blank record until the next blank record using xslt 1.0

I am working on a project where I need to calculate the sum of the hours that follow a blank Hours record, up to the next blank record, and display them as in the output below.
Here is the Input:
<Nodes>
<Node>
<EmpId>1<EmpId>
<InTime></InTime>
<Hours></Hours>
</Node>
<Node>
<EmpId>1<EmpId>
<InTime>10/12/2010</InTime>
<Hours>5</Hours>
</Node>
<Node>
<EmpId>1<EmpId>
<InTime>10/13/2010</InTime>
<Hours>5</Hours>
</Node>
<Node>
<EmpId>1<EmpId>
<InTime></InTime>
<Hours></Hours>
</Node>
<Node>
<EmpId>1</EmpId>
<InTime></InTime>
<Hours></Hours>
</Node>
<Node>
<EmpId>1</EmpId>
<InTime>10/14/2010</InTime>
<Hours>2</Hours>
</Node>
<Node>
<EmpId>1</EmpId>
<InTime>10/14/2010</InTime>
<Hours>3</Hours>
</Node>
</Nodes>
Output should be like:
<Nodes>
<Detail>
<EmpId>1</EmpId>
<InTime>10/12/2010</InTime>
<Hours>10</Hours>
</Detail>
<Detail>
<EmpId>1</EmpId>
<InTime>10/14/2010</InTime>
<Hours>5</Hours>
</Detail>
</Nodes>
I would appreciate it if anyone could help me with this.
Your input XML is malformed (several <EmpId> tags where you should have </EmpId>), but once that's fixed, I believe this does what you describe:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/Nodes">
<Nodes>
<xsl:apply-templates select="Node[Hours != '' and not(normalize-space(preceding-sibling::Node[1]/Hours))]" />
</Nodes>
</xsl:template>
<xsl:template match="Node">
<Detail>
<xsl:copy-of select="EmpId | InTime"/>
<Hours>
<xsl:apply-templates select="." mode="SumHours" />
</Hours>
</Detail>
</xsl:template>
<xsl:template match="Node[normalize-space(following-sibling::Node[1]/Hours)]" mode="SumHours">
<xsl:param name="total" select="0" />
<xsl:apply-templates select="following-sibling::Node[1]" mode="SumHours">
<xsl:with-param name="total" select="$total + Hours" />
</xsl:apply-templates>
</xsl:template>
<xsl:template match="Node" mode="SumHours">
<xsl:param name="total" select="0" />
<xsl:value-of select="$total + Hours"/>
</xsl:template>
</xsl:stylesheet>
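An alternative sketch, not part of the answer above: instead of walking the siblings recursively, each non-blank Node can be keyed by the identity of the nearest preceding blank Node, so that a whole run is summed with a single sum() call. The key name run is made up for this illustration, and the input is assumed to have its EmpId tags fixed.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Key every Node with hours by the id of the closest blank Node before it -->
  <xsl:key name="run" match="Node[normalize-space(Hours)]"
      use="generate-id(preceding-sibling::Node[not(normalize-space(Hours))][1])"/>

  <xsl:template match="/Nodes">
    <Nodes>
      <!-- A run starts at a Node with hours whose previous sibling is blank or missing -->
      <xsl:for-each select="Node[normalize-space(Hours)]
                                [not(normalize-space(preceding-sibling::Node[1]/Hours))]">
        <Detail>
          <xsl:copy-of select="EmpId | InTime"/>
          <Hours>
            <!-- Sum the hours of all Nodes that share this run's key -->
            <xsl:value-of select="sum(key('run',
                generate-id(preceding-sibling::Node[not(normalize-space(Hours))][1]))/Hours)"/>
          </Hours>
        </Detail>
      </xsl:for-each>
    </Nodes>
  </xsl:template>
</xsl:stylesheet>
```

The trade-off is that the key is built once over the whole document, which may help on long inputs, while the recursive version needs no key and keeps the logic in templates.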

xslt 1.0, select group of nodes with key

I want to select nodes based on some variables.
The XML code:
<data>
<prot seq="AAA">
<node num="1">1345</node>
<node num="1">11245</node>
<node num="2">88885</node>
</prot>
<prot seq="BBB">
<node num="1">678</node>
<node num="1">456</node>
<node num="2">6666</node>
</prot>
<prot seq="CCC">
<node num="1">111</node>
<node num="1">222</node>
<node num="2">333</node>
</prot>
</data>
The XML that I want
<output>
<prot seq="AAA">
<node num="1">1345</node>
<node num="2">88885</node>
</prot>
<prot seq="BBB">
<node num="1">678</node>
<node num="2">6666</node>
</prot>
<prot seq="CCC">
<node num="1">111</node>
<node num="2">333</node>
</prot>
</output>
So, my idea has been to group the nodes with a xsl:key element, and then do a for-each of them. For example:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>
<xsl:key name="by" match="/data/prot" use="concat(@seq,'|',node/@num)"/>
<xsl:template match="/">
<root>
<xsl:apply-templates select="/data/prot"/>
</root>
</xsl:template>
<xsl:template match="/data/prot">
<xsl:for-each select="./node">
<xsl:for-each select="key('by',concat(current()/../@seq,'|',current()/@num))">
node <xsl:value-of select="./node" />
</xsl:for-each>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
but the output is not what I expected, and I cannot see what I am doing wrong. I would prefer to keep the for-each structure. It seems I am not using the xsl:key grouping features properly.
The (unwanted) output that I get:
<root>
node 1345
node 1345
node 678
node 678
node 111
node 111</root>
And the code, ready to be tested:
http://www.xsltcake.com/slices/sgWUFu/20
Thanks!
The main problem in your code is that the key indexes prot elements, but what we want to de-duplicate (and need to index) is the node elements.
Here is a short and correct solution:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:key name="nodeByParentAndNum" match="node"
use="concat(generate-id(..), '+', @num)"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="/*">
<data>
<xsl:apply-templates/>
</data>
</xsl:template>
<xsl:template match=
"node
[not(generate-id()
=
generate-id(key('nodeByParentAndNum',
concat(generate-id(..), '+', @num)
)
[1]
)
)
]
"/>
</xsl:stylesheet>
when this transformation is applied on the provided XML document:
<data>
<prot seq="AAA">
<node num="1">1345</node>
<node num="1">11245</node>
<node num="2">88885</node>
</prot>
<prot seq="BBB">
<node num="1">678</node>
<node num="1">456</node>
<node num="2">6666</node>
</prot>
<prot seq="CCC">
<node num="1">111</node>
<node num="1">222</node>
<node num="2">333</node>
</prot>
</data>
the wanted, correct result is produced:
<data>
<prot seq="AAA">
<node num="1">1345</node>
<node num="2">88885</node>
</prot>
<prot seq="BBB">
<node num="1">678</node>
<node num="2">6666</node>
</prot>
<prot seq="CCC">
<node num="1">111</node>
<node num="2">333</node>
</prot>
</data>
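Because the question asks to keep the for-each structure, the same key can also be used inside nested xsl:for-each loops. The following is only a sketch of that variant; it names the root element output, as in the wanted result shown in the question.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <!-- Index node elements by their parent's identity plus their num attribute -->
  <xsl:key name="nodeByParentAndNum" match="node"
      use="concat(generate-id(..), '+', @num)"/>

  <xsl:template match="/data">
    <output>
      <xsl:for-each select="prot">
        <prot seq="{@seq}">
          <!-- Visit only the first node for each num value within this prot -->
          <xsl:for-each select="node[generate-id() =
              generate-id(key('nodeByParentAndNum',
                              concat(generate-id(..), '+', @num))[1])]">
            <xsl:copy-of select="."/>
          </xsl:for-each>
        </prot>
      </xsl:for-each>
    </output>
  </xsl:template>
</xsl:stylesheet>
```

The key still indexes node elements rather than prot elements, which is the point made at the start of the answer; only the way the selected nodes are visited changes.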

Check if string appears within node value in XSLT

I have the following XML:
<nodes>
<node>
<articles>125,1,9027</articles>
</node>
<node>
<articles>999,48,123</articles>
</node>
<node>
<articles>123,1234,4345,567</articles>
</node>
</nodes>
I need to write some XSLT which will return only the nodes that have a particular article id; in the example above, only those nodes which contain article 123.
My XSLT isn't great, so I'm struggling with this. I'd like to do something like this, but I know of course there isn't an 'instring' extension method in XSLT:
<xsl:variable name="currentNodeId" select="1234"/>
<xsl:for-each select="$allNodes [instring(articles,$currentNodeId)]">
<!-- Output stuff -->
</xsl:for-each>
I know this is hacky, but I'm not sure of the best approach to tackle this. The node-set is likely to be huge, and the number of article ids inside the nodes is likely to be huge too, so I'm pretty sure that splitting the value of each node and turning it into a node-set isn't going to be very efficient, but I could be wrong!
Any help as to the best way to do this would be much appreciated, thanks.
XSLT 2.0: this matches nodes whose articles value contains 123 as a complete number, not as part of a longer one. Note that matches() requires an XSLT 2.0 processor.
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:variable name="id" select="123"/>
<xsl:template match="/">
<xsl:for-each select="//node[matches(articles, concat('(^|\D)', $id, '($|\D)'))]">
<xsl:value-of select="current()"/>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
Sample input :
<?xml version="1.0" encoding="utf-8"?>
<nodes>
<node>
<articles>1234,1000,9027</articles>
</node>
<node>
<articles>999,48,01234</articles>
</node>
<node>
<articles>123,1234,4345,567</articles>
</node>
<node>
<articles> 123 , 456 </articles>
</node>
</nodes>
Output :
123,1234,4345,567
123 , 456
I don't know how to do this efficiently with XSLT 1.0, but since the OP said he is using XSLT 2.0, this should be a sufficient answer.
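As a variation on the XSLT 2.0 approach above, the list can be split with tokenize() and compared item by item, which avoids building a regular expression around the id. This is a sketch that assumes the ids are separated by commas with optional surrounding whitespace; the id is declared as a string here so that it can be compared with the tokens directly.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- The article id to look for, as a string -->
  <xsl:variable name="id" select="'123'"/>

  <xsl:template match="/nodes">
    <xsl:copy>
      <!-- Keep a node when any token of its comma-separated list equals the id -->
      <xsl:copy-of
          select="node[tokenize(normalize-space(articles), '\s*,\s*') = $id]"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

With the sample input above, this should select the same two nodes as the matches() version, but it copies the node elements instead of printing their text.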
In XSLT 1.0 you can use this simple solution, which needs only the normalize-space, translate, concat and contains functions. Wrapping both the cleaned list and the id in commas ensures that only whole ids match, so 123 does not match 1234 or 9123.
Sample input XML:
<nodes>
<node>
<articles>125,1,9027</articles>
</node>
<node>
<articles>999,48,123</articles>
</node>
<node>
<articles>123,1234,4345,567</articles>
</node>
<node>
<articles> 123 , 456 </articles>
</node>
<node>
<articles>789, 456</articles>
</node>
<node>
<articles> 123 </articles>
</node>
<node>
<articles>456, 123 ,789</articles>
</node>
</nodes>
XSLT:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:variable name="id" select="123"/>
<xsl:template match="node">
<!-- Remove all whitespace, then delimit the list with a comma on both ends -->
<xsl:variable name="s" select="concat(',', translate(normalize-space(articles), ' ', ''), ',')"/>
<!-- The id matches only when it is delimited by commas on both sides -->
<xsl:if test="contains($s, concat(',', $id, ','))">
<xsl:copy-of select="."/>
</xsl:if>
</xsl:template>
<xsl:template match="/nodes">
<xsl:copy>
<xsl:apply-templates select="node"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Output:
<nodes>
<node>
<articles>999,48,123</articles>
</node>
<node>
<articles>123,1234,4345,567</articles>
</node>
<node>
<articles> 123 , 456 </articles>
</node>
<node>
<articles> 123 </articles>
</node>
<node>
<articles>456, 123 ,789</articles>
</node>
</nodes>