Comparing 2 node sets based on attribute sequence - xslt

I'm trying to build a kind of library XML by comparing various nodes and combining them for later reuse. The logic should be fairly straightforward: if the tag_XX attribute value sequence of a given language is equal to the tag_YY attribute value sequence of another language, the nodes can be combined. See the XML example below:
<Book>
<Section>
<GB>
<Para tag_GB="L1">
<Content_GB>string_1</Content_GB>
</Para>
<Para tag_GB="Illanc">
<Content_GB>string_2</Content_GB>
</Para>
<Para tag_GB="|PLB">
<Content_GB>string_3</Content_GB>
</Para>
<Para tag_GB="L1">
<Content_GB>string_4</Content_GB>
</Para>
<Para tag_GB="Sub">
<Content_GB>string_5</Content_GB>
</Para>
<Para tag_GB="L3">
<Content_GB>string_6</Content_GB>
</Para>
<Para tag_GB="Subbull">
<Content_GB>string_7</Content_GB>
</Para>
</GB>
<!-- German translations - OK because same attribute sequence -->
<DE>
<Para tag_DE="L1">
<Content_DE>German_translation of_string_1</Content_DE>
</Para>
<Para tag_DE="Illanc">
<Content_DE>German_translation of_string_2</Content_DE>
</Para>
<Para tag_DE="|PLB">
<Content_DE>German_translation of_string_3</Content_DE>
</Para>
<Para tag_DE="L1">
<Content_DE>German_translation of_string_4</Content_DE>
</Para>
<Para tag_DE="Sub">
<Content_DE>German_translation of_string_5</Content_DE>
</Para>
<Para tag_DE="L3">
<Content_DE>German_translation of_string_6</Content_DE>
</Para>
<Para tag_DE="Subbull">
<Content_DE>German_translation of_string_7</Content_DE>
</Para>
</DE>
<!-- Danish translations - NG because not same attribute sequence -->
<DK>
<Para tag_DK="L1">
<Content_DK>Partial_Danish_translation_of_string_1</Content_DK>
</Para>
<Para tag_DK="L1_sub">
<Content_DK>Partial_Danish_translation_of_string_1</Content_DK>
</Para>
<Para tag_DK="Illanc">
<Content_DK>Danish_translation_of_string_2</Content_DK>
</Para>
<Para tag_DK="L1">
<Content_DK>Danish_translation_of_string_4</Content_DK>
</Para>
<Para tag_DK="|PLB">
<Content_DK>Danish_translation_of_string_3</Content_DK>
</Para>
<Para tag_DK="L3">
<Content_DK>Danish_translation_of_string_6</Content_DK>
</Para>
<Para tag_DK="Sub">
<Content_DK>Danish_translation_of_string_5</Content_DK>
</Para>
<Para tag_DK="Subbull">
<Content_DK>Danish_translation_of_string_7</Content_DK>
</Para>
</DK>
</Section>
</Book>
So:
GB tag_GB value sequence = L1 -> Illanc -> ... -> Subbull
DE tag_DE value sequence = L1 -> Illanc -> ... -> Subbull (same as GB, so OK)
DK tag_DK value sequence = L1 -> L1_sub -> oops, expected Illanc, meaning this sequence is not the same as GB and the locale can be ignored
Since the German and English node sets have the same attribute sequence, I'd like to combine them as follows:
<Book>
<Dictionary>
<Para tag="L1">
<Content_GB>string_1</Content_GB>
<Content_DE>German_translation of_string_1</Content_DE>
</Para>
<Para tag="Illanc">
<Content_GB>string_2</Content_GB>
<Content_DE>German_translation of_string_2</Content_DE>
</Para>
<Para tag="|PLB">
<Content_GB>string_3</Content_GB>
<Content_DE>German_translation of_string_3</Content_DE>
</Para>
<Para tag="L1">
<Content_GB>string_4</Content_GB>
<Content_DE>German_translation of_string_4</Content_DE>
</Para>
<Para tag="Sub">
<Content_GB>string_5</Content_GB>
<Content_DE>German_translation of_string_5</Content_DE>
</Para>
<Para tag="L3">
<Content_GB>string_6</Content_GB>
<Content_DE>German_translation of_string_6</Content_DE>
</Para>
<Para tag="Subbull">
<Content_GB>string_7</Content_GB>
<Content_DE>German_translation of_string_7</Content_DE>
</Para>
</Dictionary>
</Book>
The stylesheet I use is the following :
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" xmlns="http://www.w3.org/1999/xhtml" encoding="UTF-8" indent="yes"/>
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="/">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="text()">
<xsl:value-of select="normalize-space(.)"/>
</xsl:template>
<xsl:template match="Section">
<!-- store reference tag list -->
<xsl:variable name="Ref_tagList" select="GB/Para/attribute()[1]"/>
<Dictionary>
<xsl:for-each select="GB/Para">
<xsl:variable name="pos" select="position()"/>
<Para tag="{@tag_GB}">
<!-- Copy English Master -->
<xsl:apply-templates select="element()[1]"/>
<xsl:for-each select="//Book/Section/element()[not(self::GB)]">
<!-- store current locale tag list -->
<xsl:variable name="Curr_tagList" select="Para/attribute()[1]"/>
<xsl:if test="$Ref_tagList = $Curr_tagList">
<!-- Copy current locale if current tag list equals reference tag list -->
<xsl:apply-templates select="Para[position()=$pos]/element()[1]"/>
</xsl:if>
</xsl:for-each>
</Para>
</xsl:for-each>
</Dictionary>
</xsl:template>
</xsl:stylesheet>
Apart from probably not being the most efficient way to do this (I'm fairly new to the XSLT game...), it's not working either. The logic I had in mind is to take the attribute set of the English master and, if the attribute set of any other locale is equal, copy it; if not, ignore it. But for some reason node sets that have a different attribute sequence are also happily copied (as seen below). Can someone tell me where my logic conflicts with reality? Thanks in advance!
Current output, including the Danish that should have been ignored:
<Book>
<Dictionary>
<Para tag="L1">
<Content_GB>string_1</Content_GB>
<Content_DE>German_translation of_string_1</Content_DE>
<Content_DK>Partial_Danish_translation_of_string_1</Content_DK>
</Para>
<Para tag="Illanc">
<Content_GB>string_2</Content_GB>
<Content_DE>German_translation of_string_2</Content_DE>
<Content_DK>Partial_Danish_translation_of_string_1</Content_DK>
</Para>
<Para tag="|PLB">
<Content_GB>string_3</Content_GB>
<Content_DE>German_translation of_string_3</Content_DE>
<Content_DK>Danish_translation_of_string_2</Content_DK>
</Para>
<Para tag="L1">
<Content_GB>string_4</Content_GB>
<Content_DE>German_translation of_string_4</Content_DE>
<Content_DK>Danish_translation_of_string_4</Content_DK>
</Para>
<Para tag="Sub">
<Content_GB>string_5</Content_GB>
<Content_DE>German_translation of_string_5</Content_DE>
<Content_DK>Danish_translation_of_string_3</Content_DK>
</Para>
<Para tag="L3">
<Content_GB>string_6</Content_GB>
<Content_DE>German_translation of_string_6</Content_DE>
<Content_DK>Danish_translation_of_string_6</Content_DK>
</Para>
<Para tag="Subbull">
<Content_GB>string_7</Content_GB>
<Content_DE>German_translation of_string_7</Content_DE>
<Content_DK>Danish_translation_of_string_5</Content_DK>
</Para>
</Dictionary>
</Book>

This might not be the best solution. I've used the following XSLT 2.0 features:
I compared the sequences of attributes using string-join().
I exploited the possibility of using RTF variables (temporary trees, which XSLT 2.0 lets you navigate directly with path expressions).
There are probably more XSLT 2.0 facilities that could solve your problem, but I think the big problem here is your input document.
I'm sorry, I did not look at your current transform; I just implemented one from scratch. Hope it helps:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="GB">
<Book>
<Dictionary>
<xsl:variable name="matches">
<xsl:for-each select="following-sibling::*
[string-join(Para/@*,'-')
= string-join(current()/Para/@*,'-')]">
<match><xsl:copy-of select="Para/*"/></match>
</xsl:for-each>
</xsl:variable>
<xsl:apply-templates select="Para">
<xsl:with-param name="matches" select="$matches"/>
</xsl:apply-templates>
</Dictionary>
</Book>
</xsl:template>
<xsl:template match="Para[parent::GB]">
<xsl:param name="matches"/>
<xsl:variable name="pos" select="position()"/>
<Para tag="{@tag_GB}">
<xsl:copy-of select="Content_GB"/>
<xsl:copy-of select="$matches/match/*[position()=$pos]"/>
</Para>
</xsl:template>
<xsl:template match="text()"/>
</xsl:stylesheet>
When applied to the input document provided in the question, the following output is produced:
<Book>
<Dictionary>
<Para tag="L1">
<Content_GB>string_1</Content_GB>
<Content_DE>German_translation of_string_1</Content_DE>
</Para>
<Para tag="Illanc">
<Content_GB>string_2</Content_GB>
<Content_DE>German_translation of_string_2</Content_DE>
</Para>
<Para tag="|PLB">
<Content_GB>string_3</Content_GB>
<Content_DE>German_translation of_string_3</Content_DE>
</Para>
<Para tag="L1">
<Content_GB>string_4</Content_GB>
<Content_DE>German_translation of_string_4</Content_DE>
</Para>
<Para tag="Sub">
<Content_GB>string_5</Content_GB>
<Content_DE>German_translation of_string_5</Content_DE>
</Para>
<Para tag="L3">
<Content_GB>string_6</Content_GB>
<Content_DE>German_translation of_string_6</Content_DE>
</Para>
<Para tag="Subbull">
<Content_GB>string_7</Content_GB>
<Content_DE>German_translation of_string_7</Content_DE>
</Para>
</Dictionary>
</Book>
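For what it's worth, a likely reason the `$Ref_tagList = $Curr_tagList` test in the question lets the Danish block through: in XPath 2.0, a general comparison with `=` between two sequences is true as soon as any one item on the left equals any one item on the right, so it never checks order or length. Joining each sequence into a single string first turns it into an all-items, in-order comparison. The sketch below only illustrates that difference; the variable names `ref` and `other` and the shortened tag lists are made up for the demonstration, and it can be run against any input document.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <!-- Hypothetical tag sequences: same first item, different order/length -->
    <xsl:variable name="ref" select="('L1', 'Illanc', 'Sub')"/>
    <xsl:variable name="other" select="('L1', 'L1_sub', 'Illanc')"/>
    <!-- General comparison: true because at least one pair ('L1' = 'L1') matches -->
    <xsl:value-of select="$ref = $other"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Joined strings: true only if every item matches, in the same order -->
    <xsl:value-of select="string-join($ref, '-') = string-join($other, '-')"/>
  </xsl:template>
</xsl:stylesheet>
```

This should print `true` on the first line and `false` on the second. One caveat with the string-join() approach: pick a separator that cannot occur inside a tag value, otherwise two different sequences could join to the same string.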

This stylesheet makes use of <xsl:for-each-group>.
First, it groups the elements by their sequence of Para/@* values.
Then, for each of those sequences, it groups the Para elements by the number of following-sibling elements that have attributes whose names start with "tag_".
I have predicate filters on the matches for @* to ensure that it compares only the attributes whose names start with "tag_". That may not be necessary, but it would help ensure that the stylesheet still works if other attributes were added to the instance XML.
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" xmlns="http://www.w3.org/1999/xhtml" encoding="UTF-8"
indent="yes"/>
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="text()" priority="1">
<xsl:value-of select="normalize-space(.)"/>
</xsl:template>
<xsl:template match="Section">
<xsl:for-each-group select="*"
group-adjacent="string-join(
Para/@*[starts-with(local-name(),'tag_')],'|')">
<Dictionary>
<xsl:for-each-group select="current-group()/Para"
group-by="count(
following-sibling::*[@*[starts-with(local-name(),'tag_')]])">
<Para tag="{(current-group()/@*[starts-with(local-name(),'tag_')])[1]}">
<xsl:copy-of select="current-group()/*"/>
</Para>
</xsl:for-each-group>
</Dictionary>
</xsl:for-each-group>
</xsl:template>
</xsl:stylesheet>
When applied to the sample input XML, produces the following output:
<Book>
<Dictionary>
<Para tag="L1">
<Content_GB>string_1</Content_GB>
<Content_DE>German_translation of_string_1</Content_DE>
</Para>
<Para tag="Illanc">
<Content_GB>string_2</Content_GB>
<Content_DE>German_translation of_string_2</Content_DE>
</Para>
<Para tag="|PLB">
<Content_GB>string_3</Content_GB>
<Content_DE>German_translation of_string_3</Content_DE>
</Para>
<Para tag="L1">
<Content_GB>string_4</Content_GB>
<Content_DE>German_translation of_string_4</Content_DE>
</Para>
<Para tag="Sub">
<Content_GB>string_5</Content_GB>
<Content_DE>German_translation of_string_5</Content_DE>
</Para>
<Para tag="L3">
<Content_GB>string_6</Content_GB>
<Content_DE>German_translation of_string_6</Content_DE>
</Para>
<Para tag="Subbull">
<Content_GB>string_7</Content_GB>
<Content_DE>German_translation of_string_7</Content_DE>
</Para>
</Dictionary>
<Dictionary>
<Para tag="L1">
<Content_DK>Partial_Danish_translation_of_string_1</Content_DK>
</Para>
<Para tag="L1_sub">
<Content_DK>Partial_Danish_translation_of_string_1</Content_DK>
</Para>
<Para tag="Illanc">
<Content_DK>Danish_translation_of_string_2</Content_DK>
</Para>
<Para tag="L1">
<Content_DK>Danish_translation_of_string_4</Content_DK>
</Para>
<Para tag="|PLB">
<Content_DK>Danish_translation_of_string_3</Content_DK>
</Para>
<Para tag="L3">
<Content_DK>Danish_translation_of_string_6</Content_DK>
</Para>
<Para tag="Sub">
<Content_DK>Danish_translation_of_string_5</Content_DK>
</Para>
<Para tag="Subbull">
<Content_DK>Danish_translation_of_string_7</Content_DK>
</Para>
</Dictionary>
</Book>

Related

Need to split the parent and child element into two separate elements

Hi, I have the input XML below:
<Description>Same Date <Text>True</Text></Description>
The XSL I have tried:
<xsl:template match="Description">
<def>
<para>
<title>
<xsl:value-of select="Description"/>
</title>
<para>
<xsl:value-of select="Text"/>
</para>
</para>
</def>
</xsl:template>
Expected Output:
<def>
<para>
<title>Same Date</title>
<para>True</para>
</para>
</def>
I need to split off the child element and turn it into a separate element.
You can try this:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="xs"
version="2.0">
<xsl:output method="xml" omit-xml-declaration="no"/>
<xsl:template match="Description">
<def>
<para>
<title>
<xsl:value-of select="normalize-space(node()[1])"/>
</title>
<xsl:if test="Text">
<para>
<xsl:value-of select="Text"/>
</para>
</xsl:if>
</para>
</def>
</xsl:template>
</xsl:stylesheet>
Change the following code:
<title><xsl:value-of select="Description"/></title>
to
<title><xsl:value-of select="normalize-space(substring-before(., Text))"/></title>

Xslt for table content replacement only in Tbody

Could you please help us out with the scenario below? We need XSL code for it.
We need to keep the ref tag inside para in thead.
We need to remove the ref tag inside para in tbody, keeping its content.
For the last cell we should not perform this ref removal, i.e. it should behave like thead.
Sample input:
<xml>
<Table>
<thead>
<Row>
<Cell>
<para id=4>
<ref>A</ref>
</para>
</Cell>
</Row>
</thead>
<tbody>
<Row>
<Cell>
<para id=1>
<ref>b</ref>
</para>
</Cell>
.
.
<Cell>
<para id=6>
<ref>retrive</ref>
</para>
</Cell>
</Row>
<Row>
<Cell>
<para id=2>
c
</para>
</Cell>
.
.
<Cell>
<para id=7>
<ref>retrive</ref>
</para>
</Cell>
</Row>
<Row>
<Cell >
<para id=3>
<ref>d</ref>
<ref>e</ref>
</para>
</Cell>
.
.
<Cell>
<para id=8>
<ref>retrive</ref>
</para>
</Cell>
</Row>
</tbody>
</table>
Expected Output:
<xml>
<Table>
<thead>
<Row>
<Cell>
<para id=4>
<ref>A</ref> (No change in thead)
</para>
</Cell>
</Row>
</thead>
<tbody>
<Row>
<Cell>
<para id=1> (para attribute should be retrieved)
b (ref tag should be removed but content should be retrieved)
</para>
</Cell>
.
.
<Cell>
<para id=6>
<ref>retrieve</ref> (Should retrieve ref tag with value)
</para>
</Cell>
</Row>
<Row>
<Cell>
<para id=2>
c
</para>
</Cell>
.
.
<Cell>
<para id=7>
<ref>retrieve</ref> (Should retrieve ref tag with value)
</para>
</Cell>
</Row>
<Row>
<Cell>
<para id=3>
d
e
</para>
</Cell>
.
.
<Cell>
<para id=8>
<ref>retrieve</ref> (Should retrieve ref tag with value)
</para>
</Cell>
</Row>
</tbody>
</table>
With the adjustments to your input XML to have the closing table tag match the opening table tag and to wrap the value of id in quotation marks, the following XSLT
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml" omit-xml-declaration="no"
encoding="UTF-8" indent="yes" />
<xsl:strip-space elements="*"/>
<xsl:template match="table">
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="tbody/Row/Cell[position()!=last()]/para/ref">
<xsl:value-of select="."/>
</xsl:template>
</xsl:transform>
when applied to this corrected input XML produces the output
<?xml version="1.0" encoding="UTF-8"?>
<Table>
<thead>
<Row>
<Cell>
<para id="4">
<ref>A</ref>
</para>
</Cell>
</Row>
</thead>
<tbody>
<Row>
<Cell>
<para id="1">b</para>
</Cell>
<Cell>
<para id="6">
<ref>retrieve</ref>
</para>
</Cell>
</Row>
<Row>
<Cell>
<para id="2">
c
</para>
</Cell>
<Cell>
<para id="7">
<ref>retrieve</ref>
</para>
</Cell>
</Row>
<Row>
<Cell>
<para id="3">de</para>
</Cell>
<Cell>
<para id="8">
<ref>retrieve</ref>
</para>
</Cell>
</Row>
</tbody>
</Table>
The template <xsl:template match="tbody/Row/Cell[position()!=last()]/para/ref">
matches the ref elements inside every Cell of a tbody Row except the last Cell (position()!=last()) and replaces each ref element with its text value.
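As a sketch of an equivalent spelling: inside a match pattern, position() and last() in the Cell predicate are counted among the Cell children of the same Row, so "not the last Cell" can also be written as "a Cell that still has a following Cell sibling". The stylesheet below is only an alternative way to express the same rule, assuming (as in the corrected sample) that every cell in a Row is named Cell.

```xml
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>
  <!-- Identity template: copy everything that has no more specific rule -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <!-- A Cell with a following Cell sibling is, by definition, not the last one;
       for ref elements in such cells, output only the text and drop the tag -->
  <xsl:template match="tbody/Row/Cell[following-sibling::Cell]/para/ref">
    <xsl:value-of select="."/>
  </xsl:template>
</xsl:transform>
```

Because thead is not part of the pattern, ref elements there fall through to the identity template and are copied unchanged.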

Basic xslt transformation

I have one XML request which I need to modify (to XML) and then send it further. I have no prior knowledge of XSLT.
Say I have
<Combined>
<Profile>
<Fullname>John Doe</Fullname>
<OtherData>
<Birthdate>1996</Birthdate>
<FavoriteBooks>
<Book>
<id>1</id>
<description>Libre1</description>
</Book>
<Book>
<id>2</id>
<description>Libre2</description>
</Book>
<Book>
<id>3</id>
<description></description>
</Book>
<Book>
<id>4</id>
<description>Libre4</description>
</Book>
</FavoriteBooks>
</OtherData>
</Profile>
<LoadedData>
<NewBirthdate>1998</NewBirthdate>
<BooksUpdate>
<Book id="1">
<BookText>Book1</BookText>
</Book>
<Book id="2">
<BookText>Book2</BookText>
</Book>
<Book id="3">
<BookText>Book3</BookText>
</Book>
<Book id="4">
<BookText>Book4</BookText>
</Book>
<Book id="5">
<BookText>Book5</BookText>
</Book>
</BooksUpdate>
</LoadedData>
And want to get
<Profile>
<Fullname>John Doe</Fullname>
<OtherData>
<Birthdate>1998</Birthdate>
<FavoriteBooks>
<Book>
<id>1</id>
<description>Libre1Book1</description>
</Book>
<Book>
<id>2</id>
<description>Libre2Book2</description>
</Book>
<Book>
<id>3</id>
<description>empty</description>
</Book>
<Book>
<id>4</id>
<description>Libre4Book4</description>
</Book>
<Book>
<id>5</id>
<description>new Book5</description>
</Book>
</FavoriteBooks>
</OtherData>
I did a pretty pathetic attempt, which obviously does not work.
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl"
>
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/">
<Profile>
<xsl:apply-templates select="Combined/Profile/Fullname" />
</Profile>
<Otherdata>
<Birthdate>
<xsl:apply-templates select="Combined/LoadedData/NewBirthdate"/>
</Birthdate>
<FavoriteBooks>
<xsl:for-each select="/Combined/Profile/OtherData/FavoriteBooks/Book">
<Book>
<id>
<xsl:value-of select="id"/>
</id>
<description>
<xsl:value-of select="description"/>
<xsl:apply-templates select="/Combined/LoadedData/BooksUpdate/Book[@id='']" />
</description>
</Book>
</xsl:for-each>
</FavoriteBooks>
</Otherdata>
</xsl:template>
</xsl:stylesheet>
How can I get closer to what I want? Could you also recommend a book to get me started, because the w3schools tutorials are useless :(
How's this?
According to the logic you described, the description would not be "empty" but rather "Book3" (empty string merged with "Book3").
<!-- root and static content -->
<xsl:template match="/">
<xsl:apply-templates select='Combined/Profile' />
</xsl:template>
<!-- identity/copy, with some tweaks -->
<xsl:template match='node()|@*'>
<!-- copy node -->
<xsl:copy>
<!-- add in its attributes -->
<xsl:apply-templates select='@*' />
<!-- now either apply same treatment to child nodes, or something special -->
<xsl:choose>
<!-- use updated birthdate -->
<xsl:when test='name() = "Birthdate"'>
<xsl:value-of select='/Combined/LoadedData/NewBirthdate' />
</xsl:when>
<!-- merge book descriptions -->
<xsl:when test='name() = "description"'>
<xsl:value-of select='concat(., /Combined/LoadedData/BooksUpdate/Book[@id = current()/../id]/BookText)' />
</xsl:when>
<!-- or just keep recursing -->
<xsl:otherwise>
<xsl:apply-templates select='node()' />
</xsl:otherwise>
</xsl:choose>
</xsl:copy>
<!-- if we've done all books, add in any in the loaded data but not the original data -->
<xsl:if test='name() = "Book" and not(count(following-sibling::Book))'>
<xsl:variable name='orig_book_ids'>
<xsl:for-each select='../Book'>
<xsl:value-of select='concat("-",id,"-")' />
</xsl:for-each>
</xsl:variable>
<xsl:apply-templates select='/Combined/LoadedData/BooksUpdate/Book[not(contains($orig_book_ids, concat("-",@id,"-")))]' mode='new_books' />
</xsl:if>
</xsl:template>
<!-- new books -->
<xsl:template match='Book' mode='new_books'>
<Book>
<id><xsl:value-of select='@id' /></id>
<description>new <xsl:value-of select='BookText' /></description>
</Book>
</xsl:template>
You can run it at this XMLPlayground session (see output source).
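The lookup of the matching update by id could also be done with xsl:key, which is the usual XSLT 1.0 tool for cross-references. The following is a minimal sketch of just that part, not a replacement for the full answer: the key name `update-by-id` is made up, the input is assumed to be well formed (with a closing `</Combined>` tag), and the Birthdate replacement and the extra "new Book5" entry are deliberately left out.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <!-- Index the update entries by their id attribute (hypothetical key name) -->
  <xsl:key name="update-by-id" match="BooksUpdate/Book" use="@id"/>
  <!-- Only the Profile subtree is emitted; LoadedData is used for lookups -->
  <xsl:template match="/">
    <xsl:apply-templates select="Combined/Profile"/>
  </xsl:template>
  <!-- Identity template -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <!-- Append the BookText of the update whose @id equals the sibling id element -->
  <xsl:template match="FavoriteBooks/Book/description">
    <xsl:copy>
      <xsl:value-of select="concat(., key('update-by-id', ../id)/BookText)"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

With the sample data, the description of book 1 would become Libre1Book1, and the empty description of book 3 would become Book3, consistent with the remark above about the "empty" case.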

group adjacent with constraints?

I'm having an issue while using group-adjacent. Below a simplified XML snippet :
<Paras>
<Para tag="Bind">
<Content>some standalone Bind data</Content>
</Para>
<Para tag="L3">
<Content>some header data</Content>
</Para>
<Para tag="BStep.n=1">
<Content>some data</Content>
</Para>
<Para tag="Bind">
<Content>some data</Content>
</Para>
<Para tag="BStep.n+">
<Content>some data</Content>
</Para>
<Para tag="BStep.n+">
<Content>some data</Content>
</Para>
<Para tag="Bind">
<Content>some data</Content>
</Para>
<Para tag="Bind">
<Content>some data</Content>
</Para>
<Para tag="L1">
<Content>some header</Content>
</Para>
<Para tag="BBox.n=1">
<Content>some data</Content>
</Para>
<Para tag="BBox.n+">
<Content>some data</Content>
</Para>
<Para tag="Bind">
<Content>some data</Content>
</Para>
<Para tag="BBox.n+">
<Content>some data</Content>
</Para>
<Para tag="Bind">
<Content>some data</Content>
</Para>
<Para tag="L2">
<Content>some header</Content>
</Para>
</Paras>
What I'd like to get after the final transformation is something like the below:
<Paras>
<Para tag="Bind">
<Content>some standalone Bind data</Content>
</Para>
<Para tag="L3">
<Content>some header data</Content>
</Para>
<StepGroup>
<Steps>
<Para tag="BStep.n=1">some data</Para>
<Para tag="Bind">some data</Para>
</Steps>
<Steps>
<Para tag="BStep.n+">some data</Para>
</Steps>
<Steps>
<Para tag="BStep.n+">some data</Para>
<Para tag="Bind">some data</Para>
<Para tag="Bind">some data</Para>
</Steps>
</StepGroup>
<Para tag="L1">
<Content>some header</Content>
</Para>
<BoxGroup>
<Steps>
<Para tag="BBox.n=1">some data</Para>
<Para tag="BBox.n+">some data</Para>
<Para tag="Bind">some data</Para>
</Steps>
<Steps>
<Para tag="BBox.n+">some data</Para>
<Para tag="Bind">some data</Para>
</Steps>
</BoxGroup>
<Para tag="L2">
<Content>some header</Content>
</Para>
</Paras>
Or, to put it in words: all 'BStep'-type tags and the 'Bind' tags adjacent to them should be grouped in a StepGroup element, and all adjacent 'BBox'-type tags, including their Bind tags, should be grouped in a BoxGroup element.
I used the following XSLT (only partly shown):
<!-- Some data above this left out ... -->
<xsl:for-each-group select="current-group()" group-adjacent="@tag='BStep.boxnmb.n=1' or @tag='BStep.boxnmb.n+' or @tag='Bind' or @tag='BStep.nobox' ">
<xsl:choose>
<xsl:when test="current-grouping-key()">
<StepGroup>
<!-- do some stuff with group / not included now -->
<xsl:apply-templates select="current-group()"/>
</StepGroup>
</xsl:when>
<xsl:otherwise>
<xsl:for-each-group select="current-group()" group-adjacent="@tag='BBox.n=1' or @tag='BBox.n+' or @tag='Bind'">
<xsl:choose>
<xsl:when test="current-grouping-key()">
<BoxGroup>
<xsl:apply-templates select="current-group()"/>
</BoxGroup>
</xsl:when>
This works partly, but as I have 'Bind' tags in both types of adjacent groups, I need to be able to modify the group-adjacent keys so that the StepGroup includes only those Binds where the previous element has a 'Step type' tag, and the BoxGroup only those Binds where the previous element has a 'Box type' tag. I've tried some things, but they all resulted in nice error messages, so I hope someone can point me in the right direction here.
I don't fully understand your requirements; here is a partial solution:
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="2.0">
<xsl:strip-space elements="*"/>
<xsl:output indent="yes"/>
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* , node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="Paras">
<xsl:copy>
<xsl:for-each-group select="Para" group-adjacent="matches(@tag, 'BStep|Bind')">
<xsl:choose>
<xsl:when test="current-grouping-key()">
<StepGroup>
<xsl:for-each-group select="current-group()" group-starting-with="Para[matches(@tag, 'BStep')]">
<Step>
<xsl:apply-templates select="current-group()"/>
</Step>
</xsl:for-each-group>
</StepGroup>
</xsl:when>
<xsl:otherwise>
<xsl:for-each-group select="current-group()" group-adjacent="matches(@tag, 'BBox|Bind')">
<xsl:choose>
<xsl:when test="current-grouping-key()">
<BoxGroup>
<xsl:apply-templates select="current-group()"/>
</BoxGroup>
</xsl:when>
<xsl:otherwise>
<xsl:apply-templates select="current-group()"/>
</xsl:otherwise>
</xsl:choose>
</xsl:for-each-group>
</xsl:otherwise>
</xsl:choose>
</xsl:for-each-group>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
[edit]
Here is an adaptation of the first stylesheet that should do the first level of grouping as you asked for:
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="2.0">
<xsl:strip-space elements="*"/>
<xsl:output indent="yes"/>
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* , node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="Paras">
<xsl:copy>
<xsl:for-each-group select="Para"
group-adjacent="matches(@tag, 'BStep|Bind')
and (self::Para[matches(@tag, 'BStep')]
or preceding-sibling::*[not(matches(@tag, 'Bind'))][1][self::Para[matches(@tag, 'BStep')]])">
<xsl:choose>
<xsl:when test="current-grouping-key()">
<StepGroup>
<xsl:for-each-group select="current-group()" group-starting-with="Para[matches(@tag, 'BStep')]">
<Step>
<xsl:apply-templates select="current-group()"/>
</Step>
</xsl:for-each-group>
</StepGroup>
</xsl:when>
<xsl:otherwise>
<xsl:for-each-group select="current-group()" group-adjacent="matches(@tag, 'BBox|Bind')">
<xsl:choose>
<xsl:when test="current-grouping-key()">
<BoxGroup>
<xsl:apply-templates select="current-group()"/>
</BoxGroup>
</xsl:when>
<xsl:otherwise>
<xsl:apply-templates select="current-group()"/>
</xsl:otherwise>
</xsl:choose>
</xsl:for-each-group>
</xsl:otherwise>
</xsl:choose>
</xsl:for-each-group>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
When I apply that with Saxon 9 to your input document I get
<Paras>
<StepGroup>
<Step>
<Para tag="BStep.n=1">
<Content>some data</Content>
</Para>
<Para tag="Bind">
<Content>some data</Content>
</Para>
</Step>
<Step>
<Para tag="BStep.n+">
<Content>some data</Content>
</Para>
</Step>
<Step>
<Para tag="BStep.n+">
<Content>some data</Content>
</Para>
<Para tag="Bind">
<Content>some data</Content>
</Para>
<Para tag="Bind">
<Content>some data</Content>
</Para>
</Step>
</StepGroup>
<Para tag="L1">
<Content>some header</Content>
</Para>
<BoxGroup>
<Para tag="BBox.n=1">
<Content>some data</Content>
</Para>
<Para tag="BBox.n+">
<Content>some data</Content>
</Para>
<Para tag="Bind">
<Content>some data</Content>
</Para>
<Para tag="BBox.n+">
<Content>some data</Content>
</Para>
<Para tag="Bind">
<Content>some data</Content>
</Para>
</BoxGroup>
<Para tag="L2">
<Content>some header</Content>
</Para>
</Paras>
I realize this is not yet a final solution, but so far I have not understood what defines the grouping inside a BoxGroup. Maybe you can explain that in more detail, or you can fix that part yourself.

Find absolute position of xml tag with XSLT

I have an XML document that I want to transform to a different format using XSLT. The problem I'm currently facing is finding the absolute position of a tag relative to the root and not to its parent.
For instance take the following example:
<book>
<section>
<chapter>
</chapter>
</section>
</book>
<book>
<section>
<chapter>
</chapter>
</section>
</book>
<book>
<section>
<chapter>
</chapter>
</section>
</book>
<book>
<section>
<chapter>
</chapter>
</section>
</book>
Desired output:
<book id=1>
<section id=1>
<chapter id=1>
</chapter>
</section>
</book>
<book id=2>
<section id=2>
<chapter id=2>
</chapter>
</section>
</book>
<book id=3>
<section id=3>
<chapter id=3>
</chapter>
</section>
</book>
<book id=4>
<section id=4>
<chapter id=4>
</chapter>
</section>
</book>
Getting the id for the book tag is easy using position(), but once we go down to section and chapter things get trickier.
One solution would be to create global variables that work as counters for section and chapter and are incremented every time one of these tags is found in the document, but variables in XSLT behave like constants.
thanks in advance,
fbr
How about
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="*|@*|text()">
<xsl:copy>
<xsl:apply-templates select="*|@*|text()" />
</xsl:copy>
</xsl:template>
<xsl:template match="book|section|chapter">
<xsl:copy>
<xsl:attribute name="ix">
<xsl:value-of select="1 + count(preceding::*[name() = name(current())])"/>
</xsl:attribute>
<xsl:apply-templates select="*|@*|text()" />
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
("ix" used instead of "id" as you really shouldn't have multiple elements with the same id in your XML)
xsl:number was built for this sort of scenario.
It makes it very easy to produce various formatted numbers and counts and is used often in XSL-FO for things such as a Table of Contents and labels for figures and tables (e.g. figure 3.a, section 1.1, etc.)
I adjusted the sample XML by adding a document element in order to make it well formed.
Using this stylesheet produces the desired output.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:template match="/">
<xsl:apply-templates select="*/book" />
</xsl:template>
<xsl:template match="*">
<xsl:copy>
<xsl:attribute name="id">
<xsl:number format="1 " level="single" count="book"/>
</xsl:attribute>
<xsl:apply-templates/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
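If each element type should be numbered across the whole document regardless of its parent, a variation worth trying is level="any". With no count attribute, xsl:number then counts the current node plus the preceding and ancestor elements that have the same name as the current one. The following is a sketch under the same assumption as above, namely that a wrapping document element has been added to make the sample well formed; the choice to write the number into an id attribute simply follows the question.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- Identity template for everything without a more specific rule -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <!-- Number each book, section and chapter among all elements of the
       same name that occur earlier in the document -->
  <xsl:template match="book|section|chapter">
    <xsl:copy>
      <xsl:attribute name="id">
        <xsl:number level="any"/>
      </xsl:attribute>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Unlike numbering based on the enclosing book, this keeps counting correctly when a book contains more than one section or a section more than one chapter.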
Must the IDs be integers? An easy way to generate unique IDs would be to create them by prefixing each one with the ID of its parent:
<book id="1">
<section id="1.1">
<chapter id="1.1.1">
</chapter>
</section>
</book>
In that case you can use position() and recursion to generate the IDs easily.
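A minimal sketch of that idea, assuming the books sit under a single wrapping document element and that book, section and chapter only contain each other: each template computes its own ID from position() and passes it down as a prefix. The parameter name `prefix` and the variable name `id` are made up for the illustration.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" indent="yes"/>
  <!-- Copy the (assumed) wrapping document element and process its books -->
  <xsl:template match="/*">
    <xsl:copy>
      <xsl:apply-templates select="book"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="book|section|chapter">
    <!-- ID of the parent followed by a dot; empty for top-level books -->
    <xsl:param name="prefix" select="''"/>
    <!-- position() is the index within the node list selected by the caller -->
    <xsl:variable name="id" select="concat($prefix, position())"/>
    <xsl:copy>
      <xsl:attribute name="id">
        <xsl:value-of select="$id"/>
      </xsl:attribute>
      <!-- Recurse into child elements, handing down this element's ID -->
      <xsl:apply-templates select="*">
        <xsl:with-param name="prefix" select="concat($id, '.')"/>
      </xsl:apply-templates>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

For the sample in the question this would give book 1, section 1.1, chapter 1.1.1 for the first book, then 2, 2.1, 2.1.1 for the second, and so on. Note that position() counts all child elements selected by `*`, so mixed children would need a narrower select.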