XSLT: wrapping duplicate lines case-insensitively inside a for-each

I am trying to write a loop in XSLT that automatically groups all items with the same ID, but in a case-insensitive way. Unfortunately, the data I am parsing is client-driven, so I cannot change it before it is loaded.
Here is the XML structure:
<Document>
<Row>
<Cell>ID</Cell>
</Row>
<Row>
<Cell>hi</Cell>
</Row>
<Row>
<Cell>Hi</Cell>
</Row>
<Row>
<Cell>Hello</Cell>
</Row>
<Row>
<Cell>Hello</Cell>
</Row>
<Row>
<Cell>Hola</Cell>
</Row>
</Document>
This is the XSLT I am currently using...
<xsl:template match="Document">
<NewDocument xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<xsl:for-each select="//Row[position() > 1]/Cell[1][not(.=preceding::Row/Cell[1])]">
<xsl:variable name="currentOrderID" select="." />
<xsl:variable name="currentOrderGroup" select="//Row[Cell[1] = $currentOrderID]" />
<MainID>
<xsl:value-of select="$currentOrderGroup[1]/Cell[1]"/>
</MainID>
<IDs>
<xsl:for-each select="$currentOrderGroup">
<id>
<xsl:value-of select="Cell[1]"/>
</id>
</xsl:for-each>
</IDs>
</xsl:for-each>
</NewDocument>
</xsl:template>
This wraps things up as expected, but in a CaSe SeNSiTiVe way.
I've been trying to use translate() in there to make everything uppercase, but I can't seem to get the syntax right.
The result I am trying to achieve here is this:
<NewDocument>
<MainID>hi</MainID>
<IDs>
<id>hi</id>
<id>Hi</id>
</IDs>
<MainID>Hello</MainID>
<IDs>
<id>Hello</id>
<id>Hello</id>
</IDs>
<MainID>Hola</MainID>
<IDs>
<id>Hola</id>
</IDs>
</NewDocument>
Can't seem to find anything specifically for what I need.
Thanks!

In XSLT 1.0, converting a string to lower case requires the rather cumbersome XPath translate function:
translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz')
Furthermore, your problem is one of grouping, and in XSLT 1.0 that usually means a technique known as Muenchian grouping. To do this, you first define a key to look up the items in the groups you require:
<xsl:key
name="Cell"
match="Cell"
use="translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz')"/>
Here we are looking up cells based on their (lower-case) text content.
To find the first element in each group, you look for Cell elements in the XML which also happen to be the first element occurring in your look-up key
<xsl:apply-templates
select="Row/Cell
[generate-id()
= generate-id(
key('Cell',
translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'))[1])]"/>
Once you have matched the first element of a group, you can select all elements in that group by looking them up with the key.
Here is the full XSLT
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:key name="Cell" match="Cell" use="translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz')"/>
<xsl:template match="Document">
<NewDocument>
<xsl:apply-templates select="Row/Cell[generate-id() = generate-id(key('Cell', translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'))[1])]"/>
</NewDocument>
</xsl:template>
<xsl:template match="Cell">
<MainID>
<xsl:value-of select="."/>
</MainID>
<IDs>
<xsl:apply-templates select="key('Cell', translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'))" mode="group"/>
</IDs>
</xsl:template>
<xsl:template match="Cell" mode="group">
<id>
<xsl:value-of select="."/>
</id>
</xsl:template>
</xsl:stylesheet>
Note the use of the mode attribute, to distinguish between the two templates matching Cell elements.
When applied to your XML, the following is output:
<NewDocument>
<MainID>ID</MainID>
<IDs>
<id>ID</id>
</IDs>
<MainID>hi</MainID>
<IDs>
<id>hi</id>
<id>Hi</id>
</IDs>
<MainID>Hello</MainID>
<IDs>
<id>Hello</id>
<id>Hello</id>
</IDs>
<MainID>Hola</MainID>
<IDs>
<id>Hola</id>
</IDs>
</NewDocument>
Note, I wasn't sure what to do with the Cell that has ID as its value, so I left it in. If you do want to exclude it, just add this template to the XSLT:
<xsl:template match="Cell[. = 'ID']" />
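A variant sketch, assuming the header is always the first Row: restricting the key itself to data rows keeps the header out of every group, so a data value such as "id" would still be grouped on its own instead of landing in the header's group and being dropped with it. The key name DataCell and the variables upper and lower are illustrative, not from the original. XSLT 1.0 does not allow variable references in the match or use attributes of xsl:key, so the literal alphabets have to stay there.

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Illustrative helper variables; they cannot be used inside xsl:key in XSLT 1.0 -->
  <xsl:variable name="upper" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>
  <xsl:variable name="lower" select="'abcdefghijklmnopqrstuvwxyz'"/>
  <!-- Index only the cells of data rows (assumes the header is the first Row) -->
  <xsl:key name="DataCell" match="Row[position() &gt; 1]/Cell"
           use="translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz')"/>
  <xsl:template match="Document">
    <NewDocument>
      <!-- Visit the first cell of each case-insensitive group -->
      <xsl:for-each select="Row[position() &gt; 1]/Cell[generate-id() = generate-id(key('DataCell', translate(., $upper, $lower))[1])]">
        <MainID><xsl:value-of select="."/></MainID>
        <IDs>
          <!-- Emit every member of the current group with its original casing -->
          <xsl:for-each select="key('DataCell', translate(., $upper, $lower))">
            <id><xsl:value-of select="."/></id>
          </xsl:for-each>
        </IDs>
      </xsl:for-each>
    </NewDocument>
  </xsl:template>
</xsl:stylesheet>

With the sample input this should yield the three groups asked for in the question (hi, Hello, Hola), without the ID header.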

Related

How can I loop and generate keys for maps with XSLT 3.0?

I tried to construct a new map. My source XML contains many products (product data and IDs). How can I generate one key per product?
The goal is an XML-to-XML transformation with XSLT. The idea was to create a map and then, in a next step, use the keys to address the specific product data I need. Is this possible with maps, or is there another solution?
Example for the source XML
<?xml version="1.0" encoding="UTF-8"?>
<root>
<row>
<id>102</id>
<product>Lenovo 1234</product>
<productfamily>laptop</productfamily>
</row>
<row>
.....
XSLT
<xsl:variable name="val" as="map(xs:integer, xs:integer)">
<xsl:map>
<xsl:for-each select="//id">
<xsl:map-entry key="" select="."/>
</xsl:map>
</xsl:variable>
<xsl:template match="/">
<xsl:value-of select="map:get($val , 102)"/>
</xsl:template>
To create a map based on a simple functional relationship in the data you can do
<xsl:variable name="index" as="map(*)">
<xsl:map>
<xsl:for-each select="//x">
<xsl:map-entry key=".//@id" select="."/>
</xsl:for-each>
</xsl:map>
</xsl:variable>
or if you prefer
<xsl:variable name="index" as="map(*)"
select="map:merge(//x ! map:entry(.//@id, .))"/>
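Applied to the sample in the question, a minimal sketch might look as follows. It assumes an XSLT 3.0 processor, that the row elements sit directly under root, and that every row has a unique integer id (xsl:map raises an error on duplicate keys). The variable name products and the output element are illustrative.

<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:map="http://www.w3.org/2005/xpath-functions/map"
    exclude-result-prefixes="xs map">
  <!-- One map entry per row: key is the integer id, value is the row element itself -->
  <xsl:variable name="products" as="map(xs:integer, element(row))">
    <xsl:map>
      <xsl:for-each select="/root/row">
        <xsl:map-entry key="xs:integer(id)" select="."/>
      </xsl:for-each>
    </xsl:map>
  </xsl:variable>
  <xsl:template match="/">
    <!-- Look up a row by id, then navigate into it as usual -->
    <product>
      <xsl:value-of select="map:get($products, 102)/product"/>
    </product>
  </xsl:template>
</xsl:stylesheet>

Storing the whole row as the map value, rather than only the id, is what makes the later lookup useful: any child of the row can be reached from the result of map:get.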

XSLT using extra for-each or apply-templates using Muenchian method?

I've started learning XSLT and I've used the Muenchian method in an exercise. I've found 2 different ways of getting my expected result. With the apply-templates and with an extra for-each.
The key:
<xsl:key name="tech" match="technology" use="."/>
The first solution using the apply-templates:
<xsl:for-each select="//./technology[generate-id(.)=generate-id(key('tech', .)[1])]">
<team>
<xsl:variable name="selectedTech" select="."/>
<xsl:apply-templates select="../../person[./technology=$selectedTech]">
</team>
</xsl:for-each>
<xsl:template match="person">
<member><xsl:value-of select="name"/></member>
</xsl:template>
The second solution using an additional for-each:
<xsl:for-each select="//./technology[generate-id(.)=generate-id(key('tech', .)[1])]">
<team>
<xsl:variable name="selectedTech" select="."/>
<xsl:for-each select="key('tech', .)">
<member><xsl:value-of select="../name"/></member>
</xsl:for-each>
</team>
</xsl:for-each>
Input is something like this:
<employees>
<person>
<name>Bert</name>
<technology>IBM</technology>
</person>
<person>
<name>Jack</name>
<technology>Microsoft</technology>
</person>
<person>
<name>Karel</name>
<technology>IBM</technology>
</person>
<person>
<name>Bill</name>
<technology>Microsoft</technology>
</person>
<person>
<name>Joris</name>
<technology>OpenSource</technology>
</person>
<person>
<name>Piet</name>
<technology>OpenSource</technology>
</person>
</employees>
Is it better to use a particular solution of these 2? Or which one of these do you recommend and why?
Once you have defined a key and want to access the items in a group, it is certainly more efficient to use key('key-name', keyValueExpression) than to walk an axis and filter with a predicate.
So instead of ../../person[./technology=$selectedTech] (which, with technology as the context node, climbs to employees and selects its person children) I would certainly use key('tech', .) to find the items in a group.
The decision between apply-templates and for-each is a separate question, as you can use either.
Generally, a stylesheet that uses apply-templates and separate templates is better structured and more readable, but for quick and short ones for-each might suffice.
For the whole problem I would define the key on person:
<xsl:key name="tech" match="person" use="technology"/>
<xsl:for-each select="//person[generate-id(.)=generate-id(key('tech', technology)[1])]">
<team>
<xsl:apply-templates select="key('tech', technology)"/>
</team>
</xsl:for-each>
<xsl:template match="person">
<member><xsl:value-of select="name"/></member>
</xsl:template>
And of course the first for-each could also be eliminated using apply-templates and a mode:
<xsl:key name="tech" match="person" use="technology"/>
<xsl:template match="employees">
<xsl:copy>
<xsl:apply-templates select="//person[generate-id(.)=generate-id(key('tech', technology)[1])]" mode="team"/>
</xsl:copy>
</xsl:template>
<xsl:template match="person" mode="team">
<team>
<xsl:apply-templates select="key('tech', technology)"/>
</team>
</xsl:template>
<xsl:template match="person">
<member><xsl:value-of select="name"/></member>
</xsl:template>

Multiple records/elements grouped to create new structure

I searched and came close to finding a solution, but it requires XSLT 2.0 and I'm stuck on 1.0.
This is the sample XML I have:
<root>
<row>A1: Apples</row>
<row>B1: Red</row>
<row>C1: Reference text</row>
<row>badly formatted text which belongs to row above</row>
<row>and here.</row>
<row>D1: ABC</row>
<row>E1: 123</row>
<row>A1: Oranges</row>
<row>B1: Purple</row>
<row>C1: More References</row>
<row>with no identifier</row>
<row>again and here.</row>
<row>D1: DEF</row>
<row>E1: 456</row>
.
.
I want it to look like:
<root>
<row>
<A1>Apples</A1>
<B1>Red</B1>
<C1>Reference text badly formatted text which belongs to row above and here.</C1>
<D1>ABC</D1>
<E1>123</E1>
</row>
<row>
<A1>Oranges</A1>
<B1>Purple</B1>
<C1>More References with no identifier again and here.</C1>
<D1>DEF</D1>
<E1>456</E1>
</row>
.
.
There is a pattern to this, which I could convert using other utilities, but it is quite hard with XSLT 1.0.
There are headings within the elements that I can use. The reference text field is multi-line; when it gets converted to XML, each line becomes its own row, but it is always in the same position between C1 and D1. The actual name of the elements, ie is not important.
The row should break up after E1. I think my example is straightforward but this transformation is not. I consider myself not even a beginner at XML/XSL. I am learning from scratch and then I get shifted to other projects and then have to come back to it again. TIA.
Update: Another case I ran into with slightly different structure but I want the result to be the same:
<root>
<row>
<Field>A1: Apples</Field>
</row>
<row>
<Field>B1: Red</Field>
</row>
<row>
<Field>C1: Reference text</Field>
</row>
<row>
<Field>badly formatted text which belongs to row above</Field>
</row>
<row>
<Field>and here.</Field>
</row>
<row>
<Field>D1: ABC</Field>
</row>
<row>
<Field>E1: 123</Field>
</row>
<row>
<Field>A1: Oranges</Field>
</row>
<row>
<Field>B1: Purple</Field>
</row>
<row>
<Field>C1: More References</Field>
</row>
<row>
<Field>with no identifier</Field>
</row>
<row>
<Field>again and here.</Field>
</row>
<row>
<Field>D1: DEF</Field>
</row>
<row>
<Field>E1: 456</Field>
</row>
I tried applying an identity transform, but it didn't seem to work:
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="row/Field">
<xsl:apply-templates/>
</xsl:template>
This looks kind of tricky, but I have a solution which seems to work. It allows for a variable number of rows after the C1 row (it wasn't clear whether this was always 2 rows or not).
The solution makes heavy use of the following-sibling axis, which is probably very inefficient, especially for a large input file.
<xsl:template match="/root">
<!-- Loop through every "A1" row -->
<xsl:for-each select="row[substring-before(text(), ':') = 'A1']">
<!-- Add a <row> tag -->
<xsl:element name="row">
<!-- Add each of the A1-E1 tags by finding the first following-sibling that matches before the colon -->
<xsl:apply-templates select="." />
<xsl:apply-templates select="following-sibling::*[substring-before(text(), ':') = 'B1'][1]" />
<xsl:apply-templates select="following-sibling::*[substring-before(text(), ':') = 'C1'][1]" />
<xsl:apply-templates select="following-sibling::*[substring-before(text(), ':') = 'D1'][1]" />
<xsl:apply-templates select="following-sibling::*[substring-before(text(), ':') = 'E1'][1]" />
</xsl:element>
</xsl:for-each>
</xsl:template>
<!-- Process each row -->
<xsl:template match="/root/row">
<!-- Create an element whose name is whatever is before the colon in the text -->
<xsl:element name="{substring-before(text(), ':')}">
<!-- Output everything after the colon -->
<xsl:value-of select="normalize-space(substring-after(text(), ':'))" />
<!-- Special treatment for the C1 node -->
<xsl:if test="substring-before(text(), ':') = 'C1'">
<!-- Count how many A1 nodes exist after this node -->
<xsl:variable name="remainingA1nodes" select="count(following-sibling::*[substring-before(text(), ':') = 'A1'])" />
<!-- Loop through all following-siblings that don't have a colon at position 3, and still have the same number of following A1 rows as this one does -->
<xsl:for-each select="following-sibling::*[substring(text(), 3, 1) != ':'][count(following-sibling::*[substring-before(text(), ':') = 'A1']) = $remainingA1nodes]">
<xsl:text> </xsl:text>
<xsl:value-of select="." />
</xsl:for-each>
</xsl:if>
</xsl:element>
</xsl:template>
Every record or group is 7 lines.
Then why not do it simply by the numbers:
XSLT 1.0
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/root">
<root>
<xsl:for-each select="row[position() mod 7 = 1]">
<row>
<xsl:apply-templates select=". | following-sibling::row[position() &lt; 3] | following-sibling::row[4 &lt; position() and position() &lt; 7]"/>
</row>
</xsl:for-each>
</root>
</xsl:template>
<xsl:template match="row">
<xsl:element name="{substring-before(., ': ')}">
<xsl:value-of select="substring-after(., ': ')"/>
</xsl:element>
</xsl:template>
<xsl:template match="row[starts-with(., 'C1: ')]">
<C1>
<xsl:value-of select="substring-after(., 'C1: ')"/>
<xsl:for-each select="following-sibling::row[position() &lt; 3]">
<xsl:text> </xsl:text>
<xsl:value-of select="."/>
</xsl:for-each>
</C1>
</xsl:template>
</xsl:stylesheet>
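The updated structure in the question, where each row wraps its text in a Field child, is not covered by the stylesheet above. A sketch of the same by-the-numbers approach adapted to it, assuming every record is still exactly 7 row elements and every row holds a single Field:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:template match="/root">
    <root>
      <!-- Every 7th row, starting with the first, opens a new record -->
      <xsl:for-each select="row[position() mod 7 = 1]">
        <row>
          <!-- A1 row, then B1 and C1 (siblings 1-2), then D1 and E1 (siblings 5-6) -->
          <xsl:apply-templates select="(. | following-sibling::row[position() &lt; 3] | following-sibling::row[4 &lt; position() and position() &lt; 7])/Field"/>
        </row>
      </xsl:for-each>
    </root>
  </xsl:template>
  <!-- Text before ': ' becomes the element name, text after it the content -->
  <xsl:template match="Field">
    <xsl:element name="{substring-before(., ': ')}">
      <xsl:value-of select="substring-after(., ': ')"/>
    </xsl:element>
  </xsl:template>
  <!-- C1 also absorbs the Field text of the two continuation rows that follow it -->
  <xsl:template match="Field[starts-with(., 'C1: ')]">
    <C1>
      <xsl:value-of select="substring-after(., 'C1: ')"/>
      <xsl:for-each select="../following-sibling::row[position() &lt; 3]/Field">
        <xsl:text> </xsl:text>
        <xsl:value-of select="."/>
      </xsl:for-each>
    </C1>
  </xsl:template>
</xsl:stylesheet>

The only structural change is that the row selection stays the same while the templates match Field, so the continuation rows have to be reached through the parent row.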

XSLT: Get row element inside a for-loop over one table from another table

I have three tables in my XML file: tableX, tableA and tableB.
This is my algorithm:
Go through each row element of tableX and check whether Xelement1 is NULL (empty).
If match:
  Go through each row in tableA and compare the value of the row element Aelement2 with another row element Xelement2 of tableX.
  If match:
    Go through each row element of tableB and compare the value of row element Belement1 of tableB with the value of the row element Aelement1 of tableA.
    If match:
      Print a value of another row element Belement2 of tableB.
Currently I am doing this and it is working:
<xsl:for-each select="/root/table[@name='tableX']/row">
<xsl:variable name="rec" select="."/>
<xsl:choose>
<xsl:when test="Xelement1=''">
<xsl:for-each select="/root/table[@name='tableA']/row">
<xsl:variable name="member" select="."/>
<xsl:if test="Aelement2=$rec/Xelement2">
<xsl:for-each select="/root/table[@name='tableB']/row">
<xsl:if test="Belement1=$member/Aelement1">
<xsl:value-of select="Belement2"/>&#160;
</xsl:if>
</xsl:for-each>
</xsl:if>
</xsl:for-each>
</xsl:when>
<xsl:otherwise>
<!-- Xelement1 is not null -->
</xsl:otherwise>
</xsl:choose>
</xsl:for-each>
However, I wish I could access e.g. Aelement1 within the third for-each loop, without having to save it to a variable member.
Also, why doesn't this work?
[...]
<xsl:for-each select="/root/table[@name='tableA']/row">
<xsl:variable name="member" select="Aelement1"/>
<xsl:if test="Aelement2=$rec/Xelement2">
<xsl:for-each select="/root/table[@name='tableB']/row">
<xsl:if test="Belement1=$member">
<xsl:value-of select="Belement2"/>&#160;
</xsl:if>
</xsl:for-each>
</xsl:if>
</xsl:for-each>
[...]
Minimal, but complete XML example:
<root>
<table name="tableX">
<row>
<Xelement1>11</Xelement1>
<Xelement2>3</Xelement2>
<Xother>failure</Xother>
</row>
<row>
<Xelement1>NULL</Xelement1>
<Xelement2>9</Xelement2>
<Xother>success</Xother>
</row>
</table>
<table name="tableA">
<row>
<Aelement1>10</Aelement1>
<Aelement2>16</Aelement2>
<Aother>failure</Aother>
</row>
<row>
<Aelement1>12</Aelement1>
<Aelement2>9</Aelement2>
<Aother>success</Aother>
</row>
<row>
<Aelement1>12</Aelement1>
<Aelement2>16</Aelement2>
<Aother>failure</Aother>
</row>
<row>
<Aelement1>14</Aelement1>
<Aelement2>9</Aelement2>
<Aother>success</Aother>
</row>
</table>
<table name="tableB">
<row>
<Belement1>10</Belement1>
<Belement2>failure</Belement2>
<Bother>random</Bother>
</row>
<row>
<Belement1>12</Belement1>
<Belement2>success</Belement2>
<Bother>random</Bother>
</row>
<row>
<Belement1>14</Belement1>
<Belement2>success</Belement2>
<Bother>random</Bother>
</row>
</table>
</root>
You should be able to select the corresponding rows from table B with a single XPath expression:
<xsl:for-each select="/root/table[@name = 'tableB']/row[
Belement = /root/table[@name = 'tableA']/row/Aelement
]">
<!-- Do something -->
</xsl:for-each>
This works because XPath's =, when operating on node-sets, compares all nodes on the left-hand side with all nodes on the right-hand side (just like an INNER JOIN in SQL).
It will select one node in your example (namely the <row> that has <Belement>5</Belement>), but it would select more if there were more matches.
After a substantial edit to the question, the XPath expression got more complex. The same principle applies.
//table[@name = 'tableB']/row[
Belement1 = //table[@name = 'tableA']/row[
Aelement2 = //table[@name = 'tableX']/row[
Xelement1 = 'NULL'
]/Xelement2
]/Aelement1
]/Belement2
will select the elements containing "success" from your sample.
Read it from the inside out:
from the tableX rows with Xelement1 = 'NULL' you want the Xelement2
from the tableA rows where Aelement2 corresponds to those you want Aelement1
from the tableB rows where Belement1 corresponds to those you want Belement2
I would define a key
<xsl:key name="row" match="table[@name = 'tableB']/row" use="Belement"/>
then you can shorten
<xsl:for-each select="/root/table[@name='tableA']/row">
<xsl:variable name="member" select="."/>
<xsl:for-each select="/root/table[@name='tableB']/row">
<xsl:if test="Belement=$member/Aelement">
<!--Do something-->
</xsl:if>
</xsl:for-each>
</xsl:for-each>
to
<xsl:for-each select="/root/table[@name='tableA']/row/key('row', Aelement)">
<!--Do something-->
</xsl:for-each>
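Note that a function call used as a path step, as in row/key('row', Aelement), is XPath 2.0 syntax and needs an XSLT 2.0 or later processor. In XSLT 1.0 the same lookup can be written by passing a node-set as the second argument of key(), which returns the union of the matches for each node's string value. A sketch for the three-table sample in the edited question, where the key names a-by-x and b-by-a and the output and match elements are illustrative:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <!-- tableA rows indexed by Aelement2, tableB rows indexed by Belement1 -->
  <xsl:key name="a-by-x" match="table[@name='tableA']/row" use="Aelement2"/>
  <xsl:key name="b-by-a" match="table[@name='tableB']/row" use="Belement1"/>
  <xsl:template match="/">
    <output>
      <!-- Inner key(): tableA rows whose Aelement2 equals an Xelement2 of a NULL tableX row.
           Outer key(): tableB rows whose Belement1 equals the Aelement1 of those rows. -->
      <xsl:for-each select="key('b-by-a', key('a-by-x', /root/table[@name='tableX']/row[Xelement1 = 'NULL']/Xelement2)/Aelement1)">
        <match><xsl:value-of select="Belement2"/></match>
      </xsl:for-each>
    </output>
  </xsl:template>
</xsl:stylesheet>

With the sample input this should select the two tableB rows whose Belement1 is 12 or 14, both of which have success as their Belement2.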
As for the terminology, your code processes row elements or row element nodes.
As for the sample that is not working, you would need to show us minimal but complete samples of XML input, XSLT code, result you want, result you get so that we can easily reproduce the problem.
Also, why doesn't this work?
> <xsl:for-each select="/root/table[@name='tableA']/row">
> <xsl:variable name="member" select="Aelement"/>
> <xsl:for-each select="/root/table[@name='tableB']/row">
> <xsl:if test="Belement=$member">
> <!--Do something-->
> </xsl:if>
> </xsl:for-each>
> </xsl:for-each>
Actually, it does work. If you try actually doing something with the matching row in tableB, for example:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/">
<output>
<xsl:for-each select="/root/table[@name='tableA']/row">
<xsl:variable name="member" select="Aelement"/>
<xsl:for-each select="/root/table[@name='tableB']/row">
<xsl:if test="Belement=$member">
<!--Do something-->
<xsl:copy-of select="."/>
</xsl:if>
</xsl:for-each>
</xsl:for-each>
</output>
</xsl:template>
</xsl:stylesheet>
you will receive:
<?xml version="1.0" encoding="UTF-8"?>
<output>
<row>
<Belement>5</Belement>
</row>
</output>

Change template so XSLT Outputs a sum instead of a list of values

I have an XSLT template that is working fine.
<xsl:template match="Row[contains(BenefitType, 'MyBenefit')]">
<value>
<xsl:value-of select="BenefitList/Row/Premium* 12" />
</value>
</xsl:template>
The output is
<value>100</value>
<value>110</value>
What I would prefer is if it would just output 220. So, basically in the template I would need to use some sort of variable or looping to do this and then output the final summed value?
XSLT 1 compliance is required.
The template is being used as follows:
<xsl:apply-templates select="Root/Row[contains(BenefitType, 'MyBenefit')]" />
For some reason, when I use contains here, it only sums the first structure that matches and not all of them. If the parent of the XML values weren't dependent on having a sibling element that matched a specific value, then a 'sum' approach would work.
The direct solution to the problem was already mentioned in the comments, but assuming you really want to do the same with some variables, this might be interesting for you:
XML:
<Root>
<Row>
<BenefitType>MyBenefit</BenefitType>
<BenefitList>
<Premium>100</Premium>
</BenefitList>
</Row>
<Row>
<BenefitType>MyBenefit, OtherBenefit</BenefitType>
<BenefitList>
<Premium>100</Premium>
</BenefitList>
</Row>
<Row>
<BenefitType>OtherBenefit</BenefitType>
<BenefitList>
<Premium>1000</Premium>
</BenefitList>
</Row>
<Row>
<BenefitType>OtherBenefit</BenefitType>
<BenefitList>
<Premium>1000</Premium>
</BenefitList>
</Row>
</Root>
XSLT:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:exsl="http://exslt.org/common"
exclude-result-prefixes="exsl">
<xsl:template match="/">
<total>
<xsl:variable name="valuesXml">
<values>
<xsl:apply-templates select="Root/Row[contains(BenefitType, 'MyBenefit')]" />
</values>
</xsl:variable>
<xsl:variable name="values" select="exsl:node-set($valuesXml)/values/value" />
<xsl:value-of select="sum($values)" />
</total>
</xsl:template>
<xsl:template match="Row[contains(BenefitType, 'MyBenefit')]">
<value>
<xsl:value-of select="BenefitList/Premium * 12" />
</value>
</xsl:template>
</xsl:stylesheet>
Here the same result set generated in your question is saved in another variable, which can then again be processed.
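The direct solution referred to at the top of this answer is not quoted here; presumably it is along these lines. Because multiplying by 12 distributes over the sum, the matching Premium values can be summed first and multiplied once, with no extension function. A sketch against the sample XML above:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <total>
      <!-- Sum every Premium under a Row whose BenefitType contains 'MyBenefit', then annualise -->
      <xsl:value-of select="sum(Root/Row[contains(BenefitType, 'MyBenefit')]/BenefitList/Premium) * 12"/>
    </total>
  </xsl:template>
</xsl:stylesheet>

For the sample this gives (100 + 100) * 12 = 2400, the same total as the node-set version.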