XSLT for table content replacement only in tbody

Could you please help us with the scenario below?
We need XSLT code that does the following:
Keep the ref tag inside para in thead.
Remove the ref tag inside para in tbody, but keep its content.
Do not perform this ref removal for the last cell of a row, i.e. the last cell should behave like thead.
Sample Input:
<xml>
<Table>
<thead>
<Row>
<Cell>
<para id=4>
<ref>A</ref>
</para>
</Cell>
</Row>
</thead>
<tbody>
<Row>
<Cell>
<para id=1>
<ref>b</ref>
</para>
</Cell>
.
.
<Cell>
<para id=6>
<ref>retrieve</ref>
</para>
</Cell>
</Row>
<Row>
<Cell>
<para id=2>
c
</para>
</Cell>
.
.
<Cell>
<para id=7>
<ref>retrieve</ref>
</para>
</Cell>
</Row>
<Row>
<Cell >
<para id=3>
<ref>d</ref>
<ref>e</ref>
</para>
</Cell>
.
.
<Cell>
<para id=8>
<ref>retrieve</ref>
</para>
</Cell>
</Row>
</tbody>
</table>
Expected Output:
<xml>
<Table>
<thead>
<Row>
<Cell>
<para id=4>
<ref>A</ref> (No change in thead)
</para>
</Cell>
</Row>
</thead>
<tbody>
<Row>
<Cell>
<para id=1> (para attribute should be retrieved)
b (ref tag should be removed but content should be retrieved)
</para>
</Cell>
.
.
<Cell>
<para id=6>
<ref>retrieve</ref> (Should retrieve ref tag with value)
</para>
</Cell>
</Row>
<Row>
<Cell>
<para id=2>
c
</para>
</Cell>
.
.
<Cell>
<para id=7>
<ref>retrieve</ref> (Should retrieve ref tag with value)
</para>
</Cell>
</Row>
<Row>
<Cell>
<para id=3>
d
e
</para>
</Cell>
.
.
<Cell>
<para id=8>
<ref>retrieve</ref> (Should retrieve ref tag with value)
</para>
</Cell>
</Row>
</tbody>
</table>

After adjusting your input XML so that the closing Table tag matches the opening Table tag and the id values are wrapped in quotation marks, the following XSLT
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml" omit-xml-declaration="no"
encoding="UTF-8" indent="yes" />
<xsl:strip-space elements="*"/>
<xsl:template match="table">
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="tbody/Row/Cell[position()!=last()]/para/ref">
<xsl:value-of select="."/>
</xsl:template>
</xsl:transform>
when applied to this corrected input XML produces the output
<?xml version="1.0" encoding="UTF-8"?>
<Table>
<thead>
<Row>
<Cell>
<para id="4">
<ref>A</ref>
</para>
</Cell>
</Row>
</thead>
<tbody>
<Row>
<Cell>
<para id="1">b</para>
</Cell>
<Cell>
<para id="6">
<ref>retrieve</ref>
</para>
</Cell>
</Row>
<Row>
<Cell>
<para id="2">
c
</para>
</Cell>
<Cell>
<para id="7">
<ref>retrieve</ref>
</para>
</Cell>
</Row>
<Row>
<Cell>
<para id="3">de</para>
</Cell>
<Cell>
<para id="8">
<ref>retrieve</ref>
</para>
</Cell>
</Row>
</tbody>
</Table>
The <xsl:template match="tbody/Row/Cell[position()!=last()]/para/ref">
matches the ref elements inside para in every Cell of a tbody Row except the last Cell (position()!=last()) and replaces each matched ref element with its text value.
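As a side note on the predicate: in a match pattern, position() and last() are counted among the sibling Cell elements, so "not the last Cell" can also be written as "a Cell that has a following Cell sibling". The sketch below is only an illustration of that equivalent form, assuming the corrected input described above; it uses xsl:apply-templates instead of xsl:value-of inside the ref template, which gives the same result as long as ref contains only text.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity template: copy everything unchanged by default -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- A Cell with a following Cell sibling is, by definition, not the last Cell.
       For ref inside such a Cell in tbody, drop the element and keep its content. -->
  <xsl:template match="tbody/Row/Cell[following-sibling::Cell]/para/ref">
    <xsl:apply-templates/>
  </xsl:template>
</xsl:stylesheet>
```

Because thead is not named in the second pattern, its ref elements fall through to the identity template and are copied unchanged.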

Related

XSL Required: Merging content of next cell based on attribute of the current cell

I have a table with two columns.
Based on the RowMerged and RowSpan attributes of the first column, the values of the second column should be merged.
The RowMerged attribute tells whether cells are merged.
The RowSpan attribute tells how many cells are merged.
If RowSpan is 0, that cell is merged with the one above it.
In the example below the input has 5 rows and the output has 3 rows,
i.e. the first two rows are merged into a single row, and the content of the second column, which is not merged, should be appended to the row above.
The main concern is the content of the cells, not the attributes.
Sample Input:
<Table Name="abc">
<TBODY>
<Row>
<Cell RowMerged="T" RowSpan="2"><Element>ABC</Element></Cell>
<Cell><Element>21</Element></Cell>
</Row>
<Row>
<Cell RowMerged="T" RowSpan="0"></Cell>
<Cell><Element>ABC</Element></Cell>
</Row>
<Row>
<Cell RowMerged="F" RowSpan="1"><Element>PQR</Element></Cell>
<Cell><Element>19</Element></Cell>
</Row>
<Row>
<Cell RowMerged="T" RowSpan="2"><Element>XYZ</Element></Cell>
<Cell><Element>99</Element></Cell>
</Row>
<Row>
<Cell RowMerged="T" RowSpan="0"></Cell>
<Cell><Element>Sample</Element></Cell>
</Row>
</TBODY>
</Table>
Sample Output:
<Table Name="abc">
<TBODY>
<Row>
<Cell RowMerged="F" RowSpan="1"><Element>ABC</Element></Cell>
<Cell><Element>21ABC</Element></Cell>
</Row>
<Row>
<Cell RowMerged="F" RowSpan="1"><Element>PQR</Element></Cell>
<Cell><Element>19</Element></Cell>
</Row>
<Row>
<Cell RowMerged="F" RowSpan="1"><Element>XYZ</Element></Cell>
<Cell><Element>99Sample</Element></Cell>
</Row>
</TBODY>
</Table>
You may try something like this:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml" indent="yes" />
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="Row">
<xsl:variable name="rows" select="Cell[1]/@RowSpan"/>
<xsl:copy>
<Cell RowMerged="F" RowSpan="1">
<xsl:apply-templates select="Cell[1]/*" />
</Cell>
<Cell>
<Element>
<xsl:apply-templates select="Cell[2]/Element/node()" />
<xsl:apply-templates select="following-sibling::Row[position() &lt; $rows]/Cell[2]/Element/node()" />
</Element>
</Cell>
</xsl:copy>
<xsl:apply-templates select="Row[Cell[1][@RowSpan &gt; 0]]" />
</xsl:template>
<xsl:template match="TBODY">
<xsl:apply-templates select="Row[Cell[1][@RowSpan &gt; 0]]" />
</xsl:template>
</xsl:stylesheet>
With following output:
<Table Name="abc">
<Row>
<Cell RowMerged="F" RowSpan="1">
<Element>ABC</Element>
</Cell>
<Cell>
<Element>21ABC</Element>
</Cell>
</Row>
<Row>
<Cell RowMerged="F" RowSpan="1">
<Element>PQR</Element>
</Cell>
<Cell>
<Element>19</Element>
</Cell>
</Row>
<Row>
<Cell RowMerged="F" RowSpan="1">
<Element>XYZ</Element>
</Cell>
<Cell>
<Element>99Sample</Element>
</Cell>
</Row>
</Table>
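Note that this output has no TBODY wrapper, whereas the sample output in the question keeps it, because the TBODY template applies templates to its rows without copying the TBODY element itself. If the wrapper is wanted, one possible variant (a sketch only, assuming the same input structure as in the question) is to copy TBODY and its attributes before processing the selected rows:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity template -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Keep TBODY itself; process only rows whose first cell has RowSpan > 0,
       i.e. rows that start a merged group or are not merged at all -->
  <xsl:template match="TBODY">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:apply-templates select="Row[Cell[1]/@RowSpan &gt; 0]"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="Row">
    <xsl:variable name="rows" select="Cell[1]/@RowSpan"/>
    <xsl:copy>
      <Cell RowMerged="F" RowSpan="1">
        <xsl:apply-templates select="Cell[1]/*"/>
      </Cell>
      <Cell>
        <Element>
          <!-- Own second-column content, then that of the rows merged into this one -->
          <xsl:apply-templates select="Cell[2]/Element/node()"/>
          <xsl:apply-templates
              select="following-sibling::Row[position() &lt; $rows]/Cell[2]/Element/node()"/>
        </Element>
      </Cell>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

The Row template is otherwise the same as above, minus the trailing apply-templates call, which selects Row children of a Row and therefore never selects anything for this input.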

XSL Transformation <Para> content missing while row merge

INPUT
<?xml version="1.0"?>
<TABLE>
<THEAD>
<ROW id="rh">
<CELL rowmerged="F" rowspan="1" >
<Para >A</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >B</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >C</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >D</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >F</Para>
</CELL>
</ROW>
</THEAD>
<TBODY editable="T">
<ROW id="r1">
<CELL rowmerged="T" rowspan="2" >
<Para >11</Para>
</CELL>
<CELL rowmerged="T" rowspan="2" >
<Para >12</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >13</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >14</Para>
</CELL>
<CELL rowmerged="T" rowspan="2" >
<Para ></Para>
</CELL>
</ROW>
<ROW id="r2">
<CELL rowmerged="T" rowspan="2" >
<Para ></Para>
</CELL>
<CELL rowmerged="T" rowspan="2" >
<Para ></Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >23</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >24</Para>
</CELL>
<CELL rowmerged="T" rowspan="2" >
<Para ></Para>
</CELL>
</ROW>
</TBODY>
</TABLE>
OUTPUT:
<?xml version="1.0"?>
<TABLE>
<THEAD>
<ROW id="rh">
<CELL rowmerged="F" rowspan="1" >
<Para >A</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >B</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >C</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >D</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >F</Para>
</CELL>
</ROW>
</THEAD>
<TBODY editable="T">
<ROW id="r1">
<CELL rowmerged="F" rowspan="1" >
<Para >11</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >12</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >13</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >14</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para ></Para>
</CELL>
</ROW>
<ROW id="r2">
<CELL rowmerged="F" rowspan="1" >
<Para >11</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >12</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >23</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >24</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para ></Para>
</CELL>
</ROW>
</TBODY>
</TABLE>
While transforming the input XML with this transformation XSL:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:template match="@*|node()">
<xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
</xsl:template>
<xsl:template match="CELL">
<CELL rowmerged="F" rowspan="1">
<xsl:apply-templates select="node()"/>
</CELL>
</xsl:template>
<xsl:template match="Para[not(normalize-space())][../@rowmerged='T']">
<xsl:variable name="cellnum" select="count(../preceding-sibling::CELL) + 1" />
<xsl:variable name="matchingCells" select="
../../preceding-sibling::ROW/CELL[$cellnum]/Para" />
<xsl:copy-of select="$matchingCells[normalize-space()][last()]" />
</xsl:template>
</xsl:stylesheet>
The empty Para element is not reproduced at its original location, so the output contains
<CELL>
Para not found in OUTPUTXML
</CELL>
The output is generated, but an empty Para (<Para></Para>) goes missing when the row merge happens.
Kindly help me achieve this; I'm new to XSLT.
Rule: for merged rows, copy the content of the primary merged cell to the other cells in the merged rows.
To preserve elements that have no text node, you should do something like this:
<Para>
<xsl:value-of select="$matchingCells[normalize-space()][last()]"/>
</Para>
Your template is almost unreadable. Why go to all this trouble with
Para[not(normalize-space())][../@rowmerged='T']
It would be more helpful if you explained more clearly what you want to get.
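For illustration, here is how the suggestion above might be slotted into the stylesheet from the question. This is a sketch under the assumption that an empty Para should still be emitted when no preceding row has a non-empty Para in the same column (as for the last cell of row r1 in the input); with xsl:copy-of nothing is written in that case, which is why the Para disappears.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- Identity template -->
  <xsl:template match="@*|node()">
    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
  </xsl:template>

  <!-- Normalise the merge attributes on every CELL -->
  <xsl:template match="CELL">
    <CELL rowmerged="F" rowspan="1">
      <xsl:apply-templates select="node()"/>
    </CELL>
  </xsl:template>

  <!-- Empty Para in a merged cell: always write a Para element,
       filled with the nearest preceding non-empty value if there is one -->
  <xsl:template match="Para[not(normalize-space())][../@rowmerged='T']">
    <xsl:variable name="cellnum" select="count(../preceding-sibling::CELL) + 1"/>
    <xsl:variable name="matchingCells"
                  select="../../preceding-sibling::ROW/CELL[$cellnum]/Para"/>
    <Para>
      <!-- value-of on an empty node-set yields an empty string, so Para stays empty -->
      <xsl:value-of select="$matchingCells[normalize-space()][last()]"/>
    </Para>
  </xsl:template>
</xsl:stylesheet>
```

One trade-off of this form: xsl:value-of copies only the string value, so any child elements or attributes of the source Para would be lost, whereas xsl:copy-of keeps them.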

writing xslt for the below scenario

INPUT xml:
<?xml version="1.0"?>
<TABLE>
<THEAD>
<ROW id="rh">
<CELL rowmerged="F" rowspan="1" >
<Para >A</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >B</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >C</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >D</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >F</Para>
</CELL>
</ROW>
</THEAD>
<TBODY editable="T">
<ROW id="r1">
<CELL rowmerged="T" rowspan="2" >
<Para >11</Para>
</CELL>
<CELL rowmerged="T" rowspan="2" >
<Para >12</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >13</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >14</Para>
</CELL>
<CELL rowmerged="T" rowspan="2" >
<Para >15</Para>
</CELL>
</ROW>
<ROW id="r2">
<CELL rowmerged="T" rowspan="2" >
<Para ></Para>
</CELL>
<CELL rowmerged="T" rowspan="2" >
<Para ></Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >23</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >24</Para>
</CELL>
<CELL rowmerged="T" rowspan="2" >
<Para ></Para>
</CELL>
</ROW>
<ROW id="r3">
<CELL rowmerged="T" rowspan="2" >
<Para ></Para>
</CELL>
<CELL rowmerged="T" rowspan="2" >
<Para ></Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >33</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >34</Para>
</CELL>
<CELL rowmerged="T" rowspan="2" >
<Para ></Para>
</CELL>
</ROW>
<ROW id="r4">
<CELL rowmerged="F" rowspan="1" >
<Para >41</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >42</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >43</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >44</Para>
</CELL>
<CELL rowmerged="T" rowspan="1" >
<Para >45</Para>
</CELL>
</ROW>
</TBODY>
</TABLE>
Rule:
for merged rows: copy content of primary merged cell to other cells in merged rows.
Expected result:
<?xml version="1.0"?>
<TABLE>
<THEAD>
<ROW id="rh">
<CELL rowmerged="F" rowspan="1" >
<Para >A</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >B</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >C</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >D</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >F</Para>
</CELL>
</ROW>
</THEAD>
<TBODY editable="T">
<ROW id="r1">
<CELL rowmerged="F" rowspan="1" >
<Para >11</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >12</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >13</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >14</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >15</Para>
</CELL>
</ROW>
<ROW id="r2">
<CELL rowmerged="F" rowspan="1" >
<Para >11</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >12</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >23</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >24</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >15</Para>
</CELL>
</ROW>
<ROW id="r3">
<CELL rowmerged="F" rowspan="1" >
<Para >11</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >12</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >33</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >34</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >15</Para>
</CELL>
</ROW>
<ROW id="r4">
<CELL rowmerged="F" rowspan="1" >
<Para >41</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >42</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >43</Para>
</CELL>
<CELL rowmerged="F" rowspan="1" >
<Para >44</Para>
</CELL>
<CELL rowmerged="T" rowspan="1" >
<Para >45</Para>
</CELL>
</ROW>
</TBODY>
</TABLE>
Rule:
for merged rows: copy content of primary merged cell to other cells in merged rows.
Could you please help me out in this scenario.
I am new to xslt.
Thanks in advance.
You haven't spelled out the rules in detail in your question, but it appears to me that the output you've asked for can be achieved if, for any cell with rowmerged="T" and an empty Para, you copy the Para from the corresponding cell in the nearest preceding row where it is not empty.
You can start as usual with the identity template, which copies everything unchanged by default but allows you to override this behaviour for specific nodes:
<xsl:template match="@*|node()">
<xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
</xsl:template>
Now for the changes we want to make: for CELL elements we need to fix up their attributes
<xsl:template match="CELL">
<CELL rowmerged="F" rowspan="1">
<xsl:apply-templates select="node()"/><!-- process children -->
</CELL>
</xsl:template>
And for an empty Para in a rowmerged cell
<xsl:template match="Para[not(normalize-space())][../@rowmerged='T']">
we want to find the nearest preceding row where the matching cell is not empty, and copy that
<!-- find the number of this cell in the current row -->
<xsl:variable name="cellnum" select="count(../preceding-sibling::CELL) + 1" />
<!-- look for the corresponding Para in previous rows -->
<xsl:variable name="matchingCells" select="
../../preceding-sibling::ROW/CELL[$cellnum]/Para" />
<!-- filter for just the non-empty ones, and copy the nearest (last in doc order) -->
<xsl:copy-of select="$matchingCells[normalize-space()][last()]" />
The complete stylesheet is as follows:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:template match="@*|node()">
<xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
</xsl:template>
<xsl:template match="CELL">
<CELL rowmerged="F" rowspan="1">
<xsl:apply-templates select="node()"/>
</CELL>
</xsl:template>
<xsl:template match="Para[not(normalize-space())][../@rowmerged='T']">
<xsl:variable name="cellnum" select="count(../preceding-sibling::CELL) + 1" />
<xsl:variable name="matchingCells" select="
../../preceding-sibling::ROW/CELL[$cellnum]/Para" />
<xsl:copy-of select="$matchingCells[normalize-space()][last()]" />
</xsl:template>
</xsl:stylesheet>

XSL: Convert xml to output with xsl

Please help to write xsl:
I've got this xml:
<?xml version="1.0" encoding="utf-8"?>
<root>
<Table name = "My table1">
<Row isHeader="True">
<Cell val ="Header1"> </Cell>
<Cell val ="Header2"> </Cell>
<Cell val ="Header3"> </Cell>
</Row>
<Row isHeader="False">
<Cell val ="Data2.1"> </Cell>
<Cell val ="Data2.2"> </Cell>
<Cell val ="Data2.3"> </Cell>
</Row>
<Row>
<Cell val ="Data3.1"> </Cell>
<Cell val ="Data3.2"> </Cell>
<Cell val ="Data3.3"> </Cell>
</Row>
</Table>
</root>
to this output: First row contains headers.
<?xml version="1.0" encoding="utf-8"?>
<items>
<item>Header1=Data2.1 Header2=Data2.2 Header3=Data2.3 </item>
<item>Header1=Data3.1 Header2=Data3.2 Header3=Data3.3 </item>
</items>
Many thanks for your help!
Here is my suggestion:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:template match="Table">
<items>
<xsl:variable name="headers" select="Row[1]/Cell/@val"/>
<xsl:for-each select="Row[position() gt 1]">
<item>
<xsl:for-each select="Cell">
<xsl:if test="position() gt 1"><xsl:text> </xsl:text></xsl:if>
<xsl:variable name="pos" select="position()"/>
<xsl:value-of select="concat($headers[$pos], '=', @val)"/>
</xsl:for-each>
</item>
</xsl:for-each>
</items>
</xsl:template>
</xsl:stylesheet>
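An alternative XSLT 2.0 template:
<xsl:template match="Table">
<xsl:variable name="headers" select="Row[1]"/>
<items>
<xsl:for-each select="remove(Row, 1)">
<!-- the header names and the data are held in the val attribute of each Cell -->
<item><xsl:value-of
select="for $i in 1 to count($headers/Cell)
return concat($headers/Cell[$i]/@val, '=', Cell[$i]/@val)"/>
</item>
</xsl:for-each>
</items>
</xsl:template>
Not tested.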

Comparing 2 node sets based on attribute sequence

I'm trying to build up a kind of library XML, comparing various nodes and combining them for later reuse. The logic should be fairly straightforward: if the tag_XX attribute value sequence of a given language is equal to the tag_YY attribute value sequence of another language, the nodes can be combined. See the XML example below.
<Book>
<Section>
<GB>
<Para tag_GB="L1">
<Content_GB>string_1</Content_GB>
</Para>
<Para tag_GB="Illanc">
<Content_GB>string_2</Content_GB>
</Para>
<Para tag_GB="|PLB">
<Content_GB>string_3</Content_GB>
</Para>
<Para tag_GB="L1">
<Content_GB>string_4</Content_GB>
</Para>
<Para tag_GB="Sub">
<Content_GB>string_5</Content_GB>
</Para>
<Para tag_GB="L3">
<Content_GB>string_6</Content_GB>
</Para>
<Para tag_GB="Subbull">
<Content_GB>string_7</Content_GB>
</Para>
</GB>
<!-- German translations - OK because same attribute sequence -->
<DE>
<Para tag_DE="L1">
<Content_DE>German_translation of_string_1</Content_DE>
</Para>
<Para tag_DE="Illanc">
<Content_DE>German_translation of_string_2</Content_DE>
</Para>
<Para tag_DE="|PLB">
<Content_DE>German_translation of_string_3</Content_DE>
</Para>
<Para tag_DE="L1">
<Content_DE>German_translation of_string_4</Content_DE>
</Para>
<Para tag_DE="Sub">
<Content_DE>German_translation of_string_5</Content_DE>
</Para>
<Para tag_DE="L3">
<Content_DE>German_translation of_string_6</Content_DE>
</Para>
<Para tag_DE="Subbull">
<Content_DE>German_translation of_string_7</Content_DE>
</Para>
</DE>
<!-- Danish translations - NG because not same attribute sequence -->
<DK>
<Para tag_DK="L1">
<Content_DK>Partial_Danish_translation_of_string_1</Content_DK>
</Para>
<Para tag_DK="L1_sub">
<Content_DK>Partial_Danish_translation_of_string_1</Content_DK>
</Para>
<Para tag_DK="Illanc">
<Content_DK>Danish_translation_of_string_2</Content_DK>
</Para>
<Para tag_DK="L1">
<Content_DK>Danish_translation_of_string_4</Content_DK>
</Para>
<Para tag_DK="|PLB">
<Content_DK>Danish_translation_of_string_3</Content_DK>
</Para>
<Para tag_DK="L3">
<Content_DK>Danish_translation_of_string_6</Content_DK>
</Para>
<Para tag_DK="Sub">
<Content_DK>Danish_translation_of_string_5</Content_DK>
</Para>
<Para tag_DK="Subbull">
<Content_DK>Danish_translation_of_string_7</Content_DK>
</Para>
</DK>
</Section>
</Book>
So
GB tag_GB value sequence = L1 -> Illanc -> ... -> Subbull
DE tag_DE value sequence = L1 -> Illanc -> ... -> Subbull (same as GB, so OK)
DK tag_DK value sequence = L1 -> L1_sub -> Oops, expected Illanc, meaning this sequence is not the same as GB and the locale can be ignored
Since the German and English node sets have the same attribute sequence, I would like to combine them as follows:
<Book>
<Dictionary>
<Para tag="L1">
<Content_GB>string_1</Content_GB>
<Content_DE>German_translation of_string_1</Content_DE>
</Para>
<Para tag="Illanc">
<Content_GB>string_2</Content_GB>
<Content_DE>German_translation of_string_2</Content_DE>
</Para>
<Para tag="|PLB">
<Content_GB>string_3</Content_GB>
<Content_DE>German_translation of_string_3</Content_DE>
</Para>
<Para tag="L1">
<Content_GB>string_4</Content_GB>
<Content_DE>German_translation of_string_4</Content_DE>
</Para>
<Para tag="Sub">
<Content_GB>string_5</Content_GB>
<Content_DE>German_translation of_string_5</Content_DE>
</Para>
<Para tag="L3">
<Content_GB>string_6</Content_GB>
<Content_DE>German_translation of_string_6</Content_DE>
</Para>
<Para tag="Subbull">
<Content_GB>string_7</Content_GB>
<Content_DE>German_translation of_string_7</Content_DE>
</Para>
</Dictionary>
</Book>
The stylesheet I use is the following :
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" xmlns="http://www.w3.org/1999/xhtml" encoding="UTF-8" indent="yes"/>
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="/">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="text()">
<xsl:value-of select="normalize-space(.)"/>
</xsl:template>
<xsl:template match="Section">
<!-- store reference tag list -->
<xsl:variable name="Ref_tagList" select="GB/Para/attribute()[1]"/>
<Dictionary>
<xsl:for-each select="GB/Para">
<xsl:variable name="pos" select="position()"/>
<Para tag="{@tag_GB}">
<!-- Copy English Master -->
<xsl:apply-templates select="element()[1]"/>
<xsl:for-each select="//Book/Section/element()[not(self::GB)]">
<!-- store current locale tag list -->
<xsl:variable name="Curr_tagList" select="Para/attribute()[1]"/>
<xsl:if test="$Ref_tagList = $Curr_tagList">
<!-- Copy current locale if current tag list equals reference tag list -->
<xsl:apply-templates select="Para[position()=$pos]/element()[1]"/>
</xsl:if>
</xsl:for-each>
</Para>
</xsl:for-each>
</Dictionary>
</xsl:template>
</xsl:stylesheet>
Apart from probably not being the most efficient way to do this (I'm fairly new to the XSLT game...), it's not working either. The logic I had in mind is to take the attribute sequence of the English master and, if the attribute sequence of any other locale is equal, copy it; if not, ignore it. But for some reason node sets that have a different attribute sequence are happily copied as well (as seen below). Can someone tell me where my logic conflicts with reality? Thanks in advance!
Current output Including Danish that should have been ignored ...
<Book>
<Dictionary>
<Para tag="L1">
<Content_GB>string_1</Content_GB>
<Content_DE>German_translation of_string_1</Content_DE>
<Content_DK>Partial_Danish_translation_of_string_1</Content_DK>
</Para>
<Para tag="Illanc">
<Content_GB>string_2</Content_GB>
<Content_DE>German_translation of_string_2</Content_DE>
<Content_DK>Partial_Danish_translation_of_string_1</Content_DK>
</Para>
<Para tag="|PLB">
<Content_GB>string_3</Content_GB>
<Content_DE>German_translation of_string_3</Content_DE>
<Content_DK>Danish_translation_of_string_2</Content_DK>
</Para>
<Para tag="L1">
<Content_GB>string_4</Content_GB>
<Content_DE>German_translation of_string_4</Content_DE>
<Content_DK>Danish_translation_of_string_4</Content_DK>
</Para>
<Para tag="Sub">
<Content_GB>string_5</Content_GB>
<Content_DE>German_translation of_string_5</Content_DE>
<Content_DK>Danish_translation_of_string_3</Content_DK>
</Para>
<Para tag="L3">
<Content_GB>string_6</Content_GB>
<Content_DE>German_translation of_string_6</Content_DE>
<Content_DK>Danish_translation_of_string_6</Content_DK>
</Para>
<Para tag="Subbull">
<Content_GB>string_7</Content_GB>
<Content_DE>German_translation of_string_7</Content_DE>
<Content_DK>Danish_translation_of_string_5</Content_DK>
</Para>
</Dictionary>
</Book>
This might not be the best solution. I've used the following XSLT 2.0 features:
I compared the sequences of attributes using string-join().
I exploited the possibility of using RTF (result tree fragment) variables as node sets.
There are probably more XSLT 2.0 facilities that can solve your problem, but I think the BIG problem here is your input document.
I'm sorry, I did not look at your current transform; I just implemented one from scratch. Hope it helps:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="GB">
<Book>
<Dictionary>
<xsl:variable name="matches">
<xsl:for-each select="following-sibling::*
[string-join(Para/@*,'-')
= string-join(current()/Para/@*,'-')]">
<match><xsl:copy-of select="Para/*"/></match>
</xsl:for-each>
</xsl:variable>
<xsl:apply-templates select="Para">
<xsl:with-param name="matches" select="$matches"/>
</xsl:apply-templates>
</Dictionary>
</Book>
</xsl:template>
<xsl:template match="Para[parent::GB]">
<xsl:param name="matches"/>
<xsl:variable name="pos" select="position()"/>
<Para tag="{@tag_GB}">
<xsl:copy-of select="Content_GB"/>
<xsl:copy-of select="$matches/match/*[position()=$pos]"/>
</Para>
</xsl:template>
<xsl:template match="text()"/>
</xsl:stylesheet>
When applied to the input document provided in the question, the following output is produced:
<Book>
<Dictionary>
<Para tag="L1">
<Content_GB>string_1</Content_GB>
<Content_DE>German_translation of_string_1</Content_DE>
</Para>
<Para tag="Illanc">
<Content_GB>string_2</Content_GB>
<Content_DE>German_translation of_string_2</Content_DE>
</Para>
<Para tag="|PLB">
<Content_GB>string_3</Content_GB>
<Content_DE>German_translation of_string_3</Content_DE>
</Para>
<Para tag="L1">
<Content_GB>string_4</Content_GB>
<Content_DE>German_translation of_string_4</Content_DE>
</Para>
<Para tag="Sub">
<Content_GB>string_5</Content_GB>
<Content_DE>German_translation of_string_5</Content_DE>
</Para>
<Para tag="L3">
<Content_GB>string_6</Content_GB>
<Content_DE>German_translation of_string_6</Content_DE>
</Para>
<Para tag="Subbull">
<Content_GB>string_7</Content_GB>
<Content_DE>German_translation of_string_7</Content_DE>
</Para>
</Dictionary>
</Book>
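A note on why the test <xsl:if test="$Ref_tagList = $Curr_tagList"> in the question lets the Danish block through: in XPath 2.0 the general comparison operator = applied to two sequences is true as soon as any item on the left equals any item on the right. It does not compare the sequences item by item in order. Since GB and DK both contain the tag value L1, the test is true. Joining each sequence into a single string, as the stylesheet above does, turns it into an order-sensitive comparison. The following diagnostic stylesheet is a minimal sketch of the difference, assuming the input from the question; the variable names ref and cur and the newline separator are choices made here for illustration.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="Section">
    <!-- Reference tag sequence taken from the English master -->
    <xsl:variable name="ref" select="GB/Para/@tag_GB"/>
    <xsl:for-each select="*[not(self::GB)]">
      <!-- Tag sequence of the current locale -->
      <xsl:variable name="cur" select="Para/@*[starts-with(local-name(), 'tag_')]"/>
      <xsl:value-of select="name()"/>
      <!-- Existential test: true if ANY pair of values is equal -->
      <xsl:text>: general = </xsl:text>
      <xsl:value-of select="$ref = $cur"/>
      <!-- Order-sensitive test: both sequences joined into one string each -->
      <xsl:text>, ordered = </xsl:text>
      <xsl:value-of select="string-join($ref, '&#10;') = string-join($cur, '&#10;')"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

  <!-- Suppress all other text -->
  <xsl:template match="text()"/>
</xsl:stylesheet>
```

For the sample input this should report true for both tests on DE, and true for the general comparison but false for the ordered comparison on DK. A newline is used as the separator here because one of the tag values (|PLB) contains a | character; any separator that cannot occur inside a tag value would do.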
This stylesheet makes use of <xsl:for-each-group>:
First, it groups the elements by their sequence of Para/@* values.
Then, for each of those sequences, it groups the Para elements by the number of following sibling elements that have attributes whose names start with "tag_".
I have predicate filters on the matches for @*, to ensure that only the attributes whose names start with "tag_" are compared. That may not be necessary, but it helps ensure that the stylesheet still works if other attributes are added to the instance XML.
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" xmlns="http://www.w3.org/1999/xhtml" encoding="UTF-8"
indent="yes"/>
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="text()" priority="1">
<xsl:value-of select="normalize-space(.)"/>
</xsl:template>
<xsl:template match="Section">
<xsl:for-each-group select="*"
group-adjacent="string-join(
Para/@*[starts-with(local-name(),'tag_')],'|')">
<Dictionary>
<xsl:for-each-group select="current-group()/Para"
group-by="count(
following-sibling::*[@*[starts-with(local-name(),'tag_')]])">
<Para tag="{(current-group()/@*[starts-with(local-name(),'tag_')])[1]}">
<xsl:copy-of select="current-group()/*"/>
</Para>
</xsl:for-each-group>
</Dictionary>
</xsl:for-each-group>
</xsl:template>
</xsl:stylesheet>
When applied to the sample input XML, produces the following output:
<Book>
<Dictionary>
<Para tag="L1">
<Content_GB>string_1</Content_GB>
<Content_DE>German_translation of_string_1</Content_DE>
</Para>
<Para tag="Illanc">
<Content_GB>string_2</Content_GB>
<Content_DE>German_translation of_string_2</Content_DE>
</Para>
<Para tag="|PLB">
<Content_GB>string_3</Content_GB>
<Content_DE>German_translation of_string_3</Content_DE>
</Para>
<Para tag="L1">
<Content_GB>string_4</Content_GB>
<Content_DE>German_translation of_string_4</Content_DE>
</Para>
<Para tag="Sub">
<Content_GB>string_5</Content_GB>
<Content_DE>German_translation of_string_5</Content_DE>
</Para>
<Para tag="L3">
<Content_GB>string_6</Content_GB>
<Content_DE>German_translation of_string_6</Content_DE>
</Para>
<Para tag="Subbull">
<Content_GB>string_7</Content_GB>
<Content_DE>German_translation of_string_7</Content_DE>
</Para>
</Dictionary>
<Dictionary>
<Para tag="L1">
<Content_DK>Partial_Danish_translation_of_string_1</Content_DK>
</Para>
<Para tag="L1_sub">
<Content_DK>Partial_Danish_translation_of_string_1</Content_DK>
</Para>
<Para tag="Illanc">
<Content_DK>Danish_translation_of_string_2</Content_DK>
</Para>
<Para tag="L1">
<Content_DK>Danish_translation_of_string_4</Content_DK>
</Para>
<Para tag="|PLB">
<Content_DK>Danish_translation_of_string_3</Content_DK>
</Para>
<Para tag="L3">
<Content_DK>Danish_translation_of_string_6</Content_DK>
</Para>
<Para tag="Sub">
<Content_DK>Danish_translation_of_string_5</Content_DK>
</Para>
<Para tag="Subbull">
<Content_DK>Danish_translation_of_string_7</Content_DK>
</Para>
</Dictionary>
</Book>