XSL efficiency problem - need solution - xslt

I've got an interesting XSL scenario to run by you guys. So far my solutions seem to be inefficient (noticeable increase in transformation time), so I thought I'd put it out there.
The scenario
From the following XML we need to get the id of the latest news item for each category.
The XML
In the XML I have a list of news items, a list of news categories and a list of item category relationships. Both the item list and item category list may as well be in random order (not date ordered).
<news>
<itemlist>
<item id="1">
<attribute name="title">Great new products</attribute>
<attribute name="startdate">2009-06-13T00:00:00</attribute>
</item>
<item id="2">
<attribute name="title">FTSE down</attribute>
<attribute name="startdate">2009-10-01T00:00:00</attribute>
</item>
<item id="3">
<attribute name="title">SAAB go under</attribute>
<attribute name="startdate">2008-01-22T00:00:00</attribute>
</item>
<item id="4">
<attribute name="title">M&amp;A on increase</attribute>
<attribute name="startdate">2010-05-11T00:00:00</attribute>
</item>
</itemlist>
<categorylist>
<category id="1">
<name>Finance</name>
</category>
<category id="2">
<name>Environment</name>
</category>
<category id="3">
<name>Health</name>
</category>
</categorylist>
<itemcategorylist>
<itemcategory itemid="1" categoryid="2" />
<itemcategory itemid="2" categoryid="3" />
<itemcategory itemid="3" categoryid="1" />
<itemcategory itemid="4" categoryid="1" />
<itemcategory itemid="4" categoryid="2" />
<itemcategory itemid="2" categoryid="2" />
</itemcategorylist>
</news>
What I've tried
Using rtf
<xsl:template match="/">
<!-- for each category -->
<xsl:for-each select="/news/categorylist/category">
<xsl:variable name="categoryid" select="@id"/>
<!-- create RTF item list containing only items in that list ordered by startdate -->
<xsl:variable name="ordereditemlist">
<xsl:for-each select="/news/itemlist/item">
<xsl:sort select="attribute[@name='startdate']" order="descending" data-type="text"/>
<xsl:variable name="itemid" select="@id" />
<xsl:if test="/news/itemcategorylist/itemcategory[@categoryid = $categoryid][@itemid=$itemid]">
<xsl:copy-of select="."/>
</xsl:if>
</xsl:for-each>
</xsl:variable>
<!-- get the id of the first item in the list -->
<xsl:variable name="firstitemid" select="msxsl:node-set($ordereditemlist)/item[position()=1]/@id"/>
</xsl:for-each>
</xsl:template>
Would really appreciate any ideas you have.
Thanks,
Alex

Here is how I would do it:
<xsl:stylesheet
version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>
<xsl:output encoding="utf-8" />
<!-- this is (literally) the key to the solution -->
<xsl:key name="kItemByItemCategory" match="item" use="
/news/itemcategorylist/itemcategory[@itemid = current()/@id]/@categoryid
" />
<xsl:template match="/news">
<latest>
<xsl:apply-templates select="categorylist/category" mode="latest" />
</latest>
</xsl:template>
<xsl:template match="category" mode="latest">
<xsl:variable name="self" select="." />
<!-- sorted loop to get the latest news item -->
<xsl:for-each select="key('kItemByItemCategory', @id)">
<xsl:sort select="attribute[@name='startdate']" order="descending" />
<xsl:if test="position() = 1">
<category name="{$self/name}">
<xsl:apply-templates select="." />
</category>
</xsl:if>
</xsl:for-each>
</xsl:template>
<xsl:template match="item">
<!-- for the sake of the example, just copy the node -->
<xsl:copy-of select="." />
</xsl:template>
</xsl:stylesheet>
The <xsl:key> indexes each news item by the associated category ID. Now you have a simple way of retrieving all the news items that belong to a certain category. The rest is straight-forward.
Output for me:
<latest>
<category name="Finance">
<item id="4">
<attribute name="title">M&amp;A on increase</attribute>
<attribute name="startdate">2010-05-11T00:00:00</attribute>
</item>
</category>
<category name="Environment">
<item id="4">
<attribute name="title">M&amp;A on increase</attribute>
<attribute name="startdate">2010-05-11T00:00:00</attribute>
</item>
</category>
<category name="Health">
<item id="2">
<attribute name="title">FTSE down</attribute>
<attribute name="startdate">2009-10-01T00:00:00</attribute>
</item>
</category>
</latest>

It looks like you should explore <xsl:key>. This effectively creates a hashmap and avoids looping through everything.
Update: here is a typical tutorial:
http://www.learn-xslt-tutorial.com/Working-with-Keys.cfm

You're looping through all items and sorting them by date before throwing most of them away because they are not in the correct category.
Maybe something like this might be more suitable in your case:
<xsl:variable name="ordereditemlist">
<xsl:for-each select="/news/itemcategorylist/itemcategory[@categoryid = $categoryid]">
<xsl:variable name="itemid" select="@itemid"/>
And continue from there to gather only the news items that you actually require, then sort and copy them.
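For illustration, here is a minimal sketch of where that idea could lead, assuming the XML from the question. The key name kItemById and the output elements latest and category are made up for this example and are not from the original posts. It loops over the itemcategory rows for one category only, and sorts them by the start date of the item each row points to.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- hypothetical key: look up an item by its id -->
  <xsl:key name="kItemById" match="item" use="@id"/>
  <xsl:template match="/news">
    <latest>
      <xsl:for-each select="categorylist/category">
        <xsl:variable name="categoryid" select="@id"/>
        <!-- visit only the relationships of this category -->
        <xsl:for-each select="/news/itemcategorylist/itemcategory[@categoryid = $categoryid]">
          <!-- ISO 8601 timestamps sort correctly as text -->
          <xsl:sort select="key('kItemById', @itemid)/attribute[@name='startdate']"
                    order="descending"/>
          <!-- the first row after sorting points at the latest item -->
          <xsl:if test="position() = 1">
            <category id="{$categoryid}" latestitemid="{@itemid}"/>
          </xsl:if>
        </xsl:for-each>
      </xsl:for-each>
    </latest>
  </xsl:template>
</xsl:stylesheet>
```

No result tree fragment or node-set() call is needed this way, since only the id is required. A category with no items simply produces no output element.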

Related

Remove duplicates based on condition

I am trying to remove duplicates from my xml based on a condition in XSLT1.0
Here is the input xml.
<?xml version="1.0" encoding="UTF-8"?>
<Envelope
xmlns="http://schemas.microsoft.com/dynamics/2011/01/documents/Message">
<Header>
<MessageId>{D5B72T7A-58E0-4930-9CEB-A06RT56AR21B0}</MessageId>
<Action>http://tempuri.org/TRH_FinalQueryService/find</Action>
</Header>
<Body>
<MessageParts
xmlns="http://schemas.microsoft.com/dynamics/2011/01/documents/Message">
<TRH_FinalQuery
xmlns="http://schemas.microsoft.com/dynamics/2008/01/documents/TRH_FinalQuery">
<TRH_UnionView class="entity">
<Company>1</Company>
<CS/>
<Text_1>1</Text_1>
<Text_2>Lotion</Text_2>
<WS/>
</TRH_UnionView>
<TRH_UnionView class="entity">
<Company>1</Company>
<CS>1</CS>
<Text_1>1</Text_1>
<Text_2>Soap</Text_2>
<WS>6</WS>
</TRH_UnionView>
<TRH_UnionView class="entity">
<Company>2</Company>
<CS/>
<Text_1>5</Text_1>
<Text_2>Shampoo</Text_2>
<WS/>
</TRH_UnionView>
<TRH_UnionView class="entity">
<Company>2</Company>
<CS/>
<Text_1>5</Text_1>
<Text_2>Shampoo</Text_2>
<WS/>
</TRH_UnionView>
</TRH_FinalQuery>
</MessageParts>
</Body>
</Envelope>
Here is the xslt that I have applied.
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:m="http://schemas.microsoft.com/dynamics/2011/01/documents/Message" xmlns:r="http://schemas.microsoft.com/dynamics/2008/01/documents/TRH_FinalQuery" exclude-result-prefixes="m r">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes" />
<xsl:strip-space elements="*" />
<xsl:key name="r:TRH_FinalQuery" match="r:TRH_FinalQuery" use="concat(r:Text_1, '|', r:Company)" />
<!-- move all elements to no namespace -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="r:TRH_FinalQuery[r:TRH_UnionView[@class='entity']/r:WessexCostCenter=''][key('r:TRH_FinalQuery',concat(r:Text_1, '|', r:Company))[1]]"/>
<xsl:template match="*">
<xsl:element name="{local-name()}">
<xsl:copy-of select="@*" />
<xsl:apply-templates />
</xsl:element>
</xsl:template>
<!-- removes Envelope -->
<xsl:template match="m:Envelope">
<xsl:apply-templates />
</xsl:template>
<!-- removes Header,MessageId,Action and Body -->
<xsl:template match="m:*">
<xsl:apply-templates select="*" />
</xsl:template>
<!-- rename MessageParts to Document + skip the Run wrapper -->
<xsl:template match="m:MessageParts">
<DocumentElement>
<xsl:apply-templates select="r:TRH_FinalQuery/*" />
</DocumentElement>
</xsl:template>
<!-- rename RunObject to Item -->
<xsl:template match="r:TRH_UnionView[@class='entity']">
<xsl:choose>
<xsl:when test="r:WS!=''">
<Item>
<Text_1>
<xsl:value-of select="r:WS" />
</Text_1>
<Text_2>WS BodayWash</Text_2>
<Company>
<xsl:value-of select="r:Text_1" />
</Company>
</Item>
<Item>
<Text_1>
<xsl:value-of select="r:WS" />
</Text_1>
<Text_2>WS BodayWash</Text_2>
<Company>0123</Company>
</Item>
</xsl:when>
<xsl:otherwise>
<Item>
<xsl:apply-templates select="r:Text_1" />
<xsl:apply-templates select="r:Text_2" />
<xsl:apply-templates select="r:Company" />
</Item>
<Item>
<xsl:apply-templates select="r:Text_1" />
<xsl:apply-templates select="r:Text_2" />
<Company>0123</Company>
</Item>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
Below is the output I am getting
<?xml version="1.0" encoding="utf-8"?>
<DocumentElement>
<Item>
<Text_1>1</Text_1>
<Text_2>Lotion</Text_2>
<Company>1</Company>
</Item>
<Item>
<Text_1>1</Text_1>
<Text_2>Lotion</Text_2>
<Company>0123</Company>
</Item>
<Item>
<Text_1>6</Text_1>
<Text_2>WS BodayWash</Text_2>
<Company>1</Company>
</Item>
<Item>
<Text_1>6</Text_1>
<Text_2>WS BodayWash</Text_2>
<Company>0123</Company>
</Item>
<Item>
<Text_1>5</Text_1>
<Text_2>Shampoo</Text_2>
<Company>2</Company>
</Item>
<Item>
<Text_1>5</Text_1>
<Text_2>Shampoo</Text_2>
<Company>0123</Company>
</Item>
</DocumentElement>
Below is the expected output
<?xml version="1.0" encoding="utf-8"?>
<DocumentElement>
<Item>
<Text_1>6</Text_1>
<Text_2>WS BodayWash</Text_2>
<Company>1</Company>
</Item>
<Item>
<Text_1>6</Text_1>
<Text_2>WS BodayWash</Text_2>
<Company>0123</Company>
</Item>
<Item>
<Text_1>5</Text_1>
<Text_2>Shampoo</Text_2>
<Company>2</Company>
</Item>
<Item>
<Text_1>5</Text_1>
<Text_2>Shampoo</Text_2>
<Company>0123</Company>
</Item>
</DocumentElement>
I am trying to remove all duplicates based on the following conditions:
1. Text_1 and Company are the same.
2. If point 1 is true, retain all records that have a value in the WS tag and remove the records that have no value in the WS tag.
Can you please suggest what I am doing wrong?
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:m="http://schemas.microsoft.com/dynamics/2011/01/documents/Message"
xmlns:r="http://schemas.microsoft.com/dynamics/2008/01/documents/TRH_FinalQuery"
exclude-result-prefixes="m r">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes" />
<xsl:strip-space elements="*" />
<xsl:key name="myKey" match="r:TRH_UnionView" use="concat(r:Text_1, '|', r:Company)" />
<!-- Simplify things by not having an identity. Using this approach, you will not have to suppress
any elements.
-->
<xsl:template match="node()">
<xsl:apply-templates select="node()"/>
</xsl:template>
<!-- Start at the root. -->
<xsl:template match="/">
<DocumentElement>
<xsl:apply-templates select="node()" />
</DocumentElement>
</xsl:template>
<xsl:template match="r:TRH_UnionView">
<xsl:choose>
<!-- Handle the duplicates with no value in the WS tag. -->
<xsl:when test="count(key('myKey',concat(r:Text_1, '|', r:Company))) > 1 and
count((key('myKey',concat(r:Text_1, '|', r:Company)))[r:WS!='']) = 0">
<!-- Is this the first of the duplicates? -->
<xsl:if test="generate-id(.) = generate-id(key('myKey',concat(r:Text_1, '|', r:Company))[1])">
<Item>
<Text_1>
<xsl:value-of select="r:Text_1"/>
</Text_1>
<Text_2>
<xsl:value-of select="r:Text_2"/>
</Text_2>
<Company>
<xsl:value-of select="r:Company"/>
</Company>
</Item>
<Item>
<Text_1>
<xsl:value-of select="r:Text_1"/>
</Text_1>
<Text_2>
<xsl:value-of select="r:Text_2"/>
</Text_2>
<Company>0123</Company>
</Item>
</xsl:if>
</xsl:when>
<!-- Handle the duplicates with at least one value in the WS tag. -->
<xsl:when test="count(key('myKey',concat(r:Text_1, '|', r:Company))) > 1">
<xsl:if test="r:WS!=''">
<Item>
<Text_1>
<xsl:value-of select="r:WS" />
</Text_1>
<Text_2>WS BodayWash</Text_2>
<Company>
<xsl:value-of select="r:Text_1" />
</Company>
</Item>
<Item>
<Text_1>
<xsl:value-of select="r:WS" />
</Text_1>
<Text_2>WS BodayWash</Text_2>
<Company>0123</Company>
</Item>
</xsl:if>
</xsl:when>
<xsl:otherwise>
<Item>
<Text_1>
<xsl:value-of select="r:Text_1"/>
</Text_1>
<Text_2>
<xsl:value-of select="r:Text_2"/>
</Text_2>
<Company>
<xsl:value-of select="r:Company"/>
</Company>
</Item>
<Item>
<Text_1>
<xsl:value-of select="r:Text_1"/>
</Text_1>
<Text_2>
<xsl:value-of select="r:Text_2"/>
</Text_2>
<Company>0123</Company>
</Item>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>

Sorting fails when using XSLT 1.0 Muenchian Grouping for creating HTML output

I've got the following XML:
<?xml version="1.0" encoding="utf-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="Test.xslt"?>
<test-results>
<test-case name="TestCase1" description="Descriptiontext">
<categories>
<category name="Dimension linked to measure group" />
</categories>
</test-case>
<test-case name="TestCase2" description="DescriptionText">
<categories>
<category name="Dimension linked to measure group" />
</categories>
</test-case>
<test-case name="TestCase3" description="DescriptionText">
<categories>
<category name="Default parameters" />
</categories>
</test-case>
<test-case name="TestCase4" description="DescriptionText">
<categories>
<category name="Default parameters" />
</categories>
</test-case>
<test-case name="TestCase5" description="DescriptionText">
<categories>
<category name="Referential Integrity" />
</categories>
<reason>
<message><![CDATA[Not testable, yet (v1.6.1)]]></message>
</reason>
</test-case>
<test-case name="TestCase6" description="DescriptionText">
<categories>
<category name="Referential Integrity" />
</categories>
<reason>
<message><![CDATA[Not testable, yet (v1.6.1)]]></message>
</reason>
</test-case>
</test-results>
With the following XSLT I try to use Muenchian grouping to order by category name (ascending) and within each category by test-case name (ascending).
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns="http://www.w3.org/1999/xhtml">
<xsl:key name="cases-by-category" match="categories" use="category/@name" />
<xsl:template match="test-case">
<xsl:for-each select="categories[count(. | key('cases-by-category', category/@name)[1]) = 1]">
<xsl:sort select="category/@name" />
<xsl:value-of select="category/@name" /><br/>
<xsl:for-each select="key('cases-by-category', category/@name)">
<xsl:sort select="//test-case/@name" />
<xsl:value-of select="//test-case/@name"/><br/>
</xsl:for-each>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
However, what I get is this:
Dimension linked to measure group
TestCase1
TestCase1
Default parameters
TestCase1
TestCase1
Referential Integrity
TestCase1
TestCase1
The number of test cases for each category is correct, but the sorting doesn't get applied and the first test-case name is always used. How can I fix this?
Given <xsl:key name="cases-by-category" match="categories" use="category/@name" />, the expression key('cases-by-category', category/@name) gives you a node-set of categories elements. If you want to sort them by their parent test-case, I think you want to use <xsl:sort select="../@name" />. Your current //test-case/@name is an absolute path that selects the name of every test-case in the document regardless of the context node, and its string value is that of the first one, which is why TestCase1 is always output.
I also think having
<xsl:template match="test-case">
<xsl:for-each select="categories[count(. | key('cases-by-category', category/@name)[1]) = 1]">
looks odd, as you would process the categories of every matched test-case element. It seems more likely you want
<xsl:template match="test-results">
<xsl:for-each select="test-case/categories[count(. | key('cases-by-category', category/@name)[1]) = 1]">
instead.
Here is a complete sample:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns="http://www.w3.org/1999/xhtml">
<xsl:output indent="yes"/>
<xsl:key name="cases-by-category" match="categories" use="category/@name" />
<xsl:template match="/">
<html>
<body>
<xsl:apply-templates/>
</body>
</html>
</xsl:template>
<xsl:template match="test-results">
<xsl:for-each select="test-case/categories[count(. | key('cases-by-category', category/@name)[1]) = 1]">
<xsl:sort select="category/@name" />
<xsl:value-of select="category/@name" /><br/>
<xsl:for-each select="key('cases-by-category', category/@name)">
<xsl:sort select="../@name" />
<xsl:value-of select="../@name"/><br/>
</xsl:for-each>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
When I run that with Saxon 6.5 against your input I get the following result:
<html xmlns="http://www.w3.org/1999/xhtml">
<body>Default parameters<br/>TestCase3<br/>TestCase4<br/>Dimension linked to measure group<br/>TestCase1<br/>TestCase2<br/>Referential Integrity<br/>TestCase5<br/>TestCase6<br/>
</body>
</html>

A distinct list based on split values

My end goal is to have a unique list of IDs I can iterate through. Here goes:
I have an XML of products (Items). In the complete XML there will be 200,000+ items. In this example there are two:
<?xml version="1.0" encoding="utf-8"?>
<Export Shop="Demo Webshop" Type="Full" Clean="true" CleanIsolationShopID="SHOP1">
<Items>
<Item ItemNo="1001" ShopID="SHOP1" VariantCode="1616_42.1615_01.ct_HD">
</Item>
<Item ItemNo="1001" ShopID="SHOP1" VariantCode="1616_42.1615_02.ct_HD" >
</Item>
</Items>
</Export>
The content of attribute VariantCode I need to split. For the first Item that should give me 1616_42 and 1615_01 and ct_HD. The end result is to import it to a table with the composite primary key ItemNo+VariantOption (VariantOption being the split value).
The XSLT furthermore has:
<table tableName="EcomVariantOptionsProductRelation">
<xsl:for-each select="Export/Items/Item">
<xsl:call-template name="split">
<xsl:with-param name="pText" select="@VariantCode"/>
<xsl:with-param name="ProductID" select="concat(@ItemNo,'##',@ShopID)"/>
</xsl:call-template>
</xsl:for-each>
The template being called that performs the actual split:
<xsl:template match="text()" name="split">
<xsl:param name="pText" select="."/>
<xsl:param name= "ProductID" select="." />
<xsl:choose>
<xsl:when test="string-length($pText) > 0">
<xsl:choose>
<xsl:when test="contains($pText, '.')">
<!-- has dot (more than one variantOption) -->
<item tableName="EcomVariantOptionsProductRelation">
<column columnName="VariantOptionsProductRelationVariantID">
<xsl:value-of select="substring-before($pText,'.')"/>
</column>
<column columnName="VariantOptionsProductRelationProductID">
<xsl:value-of select="$ProductID"/>
</column>
</item>
</xsl:when>
<xsl:otherwise>
<item tableName="EcomVariantOptionsProductRelation">
<column columnName="VariantOptionsProductRelationVariantID">
<xsl:value-of select="$pText"/>
</column>
<column columnName="VariantOptionsProductRelationProductID">
<xsl:value-of select="$ProductID"/>
</column>
</item>
</xsl:otherwise>
</xsl:choose>
<xsl:call-template name="split">
<xsl:with-param name="pText" select="substring-after($pText, '.')"/>
<xsl:with-param name="ProductID" select="$ProductID"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<!-- empty string (no variants) -->
<xsl:value-of select="$pText"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
The problem is that the transformed output, i.e.
<item tableName="EcomVariantOptionsProductRelation">
<column columnName="VariantOptionsProductRelationVariantID"><![CDATA[1616_42]]></column>
<column columnName="VariantOptionsProductRelationProductID"><![CDATA[1001##SHOP1]]></column>
</item>
is repeated, because the "1616_42" (and "ct_HD" also) part exists twice in two different items. And I need for the output to be unique, since it finally goes to a table where this composite key (VariantID+ProductID) is unique.
The desired result for the two should be:
<table tableName="EcomVariantOptionsProductRelation">
<item tableName="EcomVariantOptionsProductRelation">
<column columnName="VariantOptionsProductRelationVariantID"><![CDATA[1616_42]]></column>
<column columnName="VariantOptionsProductRelationProductID"><![CDATA[1001##SHOP1]]></column>
</item>
<item tableName="EcomVariantOptionsProductRelation">
<column columnName="VariantOptionsProductRelationVariantID"><![CDATA[1615_01]]></column>
<column columnName="VariantOptionsProductRelationProductID"><![CDATA[1001##SHOP1]]></column>
</item>
<item tableName="EcomVariantOptionsProductRelation">
<column columnName="VariantOptionsProductRelationVariantID"><![CDATA[ct_HD]]></column>
<column columnName="VariantOptionsProductRelationProductID"><![CDATA[1001##SHOP1]]></column>
</item>
<item tableName="EcomVariantOptionsProductRelation">
<column columnName="VariantOptionsProductRelationVariantID"><![CDATA[1615_02]]></column>
<column columnName="VariantOptionsProductRelationProductID"><![CDATA[1001##SHOP1]]></column>
</item>
<item tableName="EcomVariantOptionsProductRelation">
<column columnName="VariantOptionsProductRelationVariantID"><![CDATA[1616_50]]></column>
<column columnName="VariantOptionsProductRelationProductID"><![CDATA[1001##SHOP1]]></column>
</item>
<item tableName="EcomVariantOptionsProductRelation">
<column columnName="VariantOptionsProductRelationVariantID"><![CDATA[ct_NHD]]></column>
<column columnName="VariantOptionsProductRelationProductID"><![CDATA[1001##SHOP1]]></column>
</item>
</table>
Point being: no duplicates.
Searching the web I can see the possibility of creating lists with some kind of unique identifier. But I have no clue whether that is possible in my scenario, and even if it is, no clue how to implement it.
Ideas? XSLT 1.0 is used.
The only way I can think of doing this (in XSLT 1.0) is by means of a "two-pass" transform. Effectively, you perform two transforms (although this can be done in a single stylesheet, as I am going to demonstrate). The first transform will split your current VariantCode attributes into separate elements, so the result is like so
<Item ProductID="1001##SHOP1">
<Variant>1616_42</Variant>
<Variant>1615_01</Variant>
<Variant>ct_HD</Variant>
</Item>
The second transform can then use a technique called Muenchian Grouping to output the distinct Variant elements you require.
For this to work, the results of the first transform are simply stored in a variable
<xsl:variable name="variantSplit">
<xsl:apply-templates select="//Item" />
</xsl:variable>
So, in this case, you would have a template matching Item to do the copying and splitting required:
<xsl:template match="Item">
<Item ProductID="{@ItemNo}##{@ShopID}">
<xsl:call-template name="VariantCodeSplit" />
</Item>
</xsl:template>
(In case you haven't seen them before, the curly braces in the ProductID attribute are "Attribute Value Templates", and indicate an expression to be evaluated rather than output literally.)
Now, you have transformed XML in a variable, where each Item element has multiple child Variant elements as shown above.
But wait! This is XSLT 1.0, which means the content of the variable is actually a "Result Tree Fragment". If you want to start applying templates to it, you need to use an extension function to convert it into a node-set. Which function depends on the processor you are using, but you are almost certain to have a node-set function available. It is just a case of declaring the correct namespace. (See http://www.xml.com/pub/a/2003/07/16/nodeset.html for more details).
Anyway, the next stage involves the Muenchian Grouping technique. This involves defining a key to match the new Variant elements by a combination of the ProductID and the (split) variant code
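As a minimal sketch of what "declaring the correct namespace" looks like, here is the EXSLT flavour of the function, assuming a processor that implements http://exslt.org/common (libxslt and Saxon 6.5 do, as far as I know). The variable content is hard-coded purely for illustration; on MSXML you would bind a prefix to urn:schemas-microsoft-com:xslt instead.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    exclude-result-prefixes="exsl">
  <xsl:template match="/">
    <!-- a result tree fragment; the content is a made-up sample -->
    <xsl:variable name="variantSplit">
      <Item ProductID="demo">
        <Variant>a</Variant>
        <Variant>b</Variant>
      </Item>
    </xsl:variable>
    <!-- node-set() turns the fragment into something XPath can navigate -->
    <count>
      <xsl:value-of select="count(exsl:node-set($variantSplit)/Item/Variant)"/>
    </count>
  </xsl:template>
</xsl:stylesheet>
```

Without the node-set() call, the path step /Item/Variant on the variable is an error in XSLT 1.0.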
<xsl:key name="Test" match="Variant" use="concat(../@ProductID, '|', .)" />
Then, to get the distinct Variant elements, you look for the elements that occur first in the xsl:key for their given combination of ProductID and code
<xsl:apply-templates select="msxml:node-set($variantSplit)/Item/Variant
[generate-id() = generate-id(key('Test', concat(../@ProductID, '|', .))[1])]" />
(Note the use of the node-set extension function here. In my case, I am using Microsoft's).
You can then have a template that matches the Variant element, and you know each match will be a distinct occurrence, so you can output the product id and code.
Try this XSLT as a starter. Note it doesn't give you the element and attribute names used in your example (I have shortened them for brevity), but it should give you a start, assuming your head hasn't exploded at this point:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxml="urn:schemas-microsoft-com:xslt"
exclude-result-prefixes="msxml">
<xsl:output method="xml" version="1.0" indent="yes" encoding="ISO-8859-1"/>
<xsl:key name="Test" match="Variant" use="concat(../@ProductID, '|', .)" />
<xsl:template match="/">
<xsl:variable name="variantSplit">
<xsl:apply-templates select="//Item" />
</xsl:variable>
<table>
<xsl:apply-templates select="msxml:node-set($variantSplit)/Item/Variant[generate-id() = generate-id(key('Test', concat(../@ProductID, '|', .))[1])]" />
</table>
</xsl:template>
<xsl:template match="Item">
<Item ProductID="{@ItemNo}##{@ShopID}">
<xsl:call-template name="VariantCodeSplit" />
</Item>
</xsl:template>
<xsl:template name="VariantCodeSplit">
<xsl:param name="Code" select="@VariantCode" />
<xsl:choose>
<xsl:when test="contains($Code, '.')">
<Variant>
<xsl:value-of select="substring-before($Code, '.')"/>
</Variant>
<xsl:call-template name="VariantCodeSplit">
<xsl:with-param name="Code" select="substring-after($Code, '.')" />
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<Variant>
<xsl:value-of select="$Code"/>
</Variant>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<xsl:template match="Variant">
<Item>
<Column name="Variant">
<xsl:value-of select="."/>
</Column>
<Column name="Product">
<xsl:value-of select="../@ProductID"/>
</Column>
</Item>
</xsl:template>
</xsl:stylesheet>
Of course, if your actual XML has 200,000+ elements this may not be particularly fast.

Flat XML into tree with XSLT. Show one branch only

I have a flat XML structure that I need to convert into a hierarchy. With the help of Stack Overflow I was able to do it.
Question: Is it possible to show only one branch using the same flat structure?
Here is my xml and xsl files:
XML
<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="Stack.xsl"?>
<Items>
<Item>
<Id>1</Id>
<ParentId>0</ParentId>
<Name>1</Name>
<SortOrder>0</SortOrder>
</Item>
<Item>
<Id>2</Id>
<ParentId>1</ParentId>
<Name>1.1</Name>
<SortOrder>0</SortOrder>
</Item>
<Item>
<Id>3</Id>
<ParentId>1</ParentId>
<Name>1.2</Name>
<SortOrder>0</SortOrder>
</Item>
<Item>
<Id>4</Id>
<ParentId>1</ParentId>
<Name>1.3</Name>
<SortOrder>0</SortOrder>
</Item>
<Item>
<Id>5</Id>
<ParentId>1</ParentId>
<Name>1.4</Name>
<SortOrder>0</SortOrder>
</Item>
<Item>
<Id>6</Id>
<ParentId>0</ParentId>
<Name>2</Name>
<SortOrder>0</SortOrder>
</Item>
<Item>
<Id>7</Id>
<ParentId>6</ParentId>
<Name>2.1</Name>
<SortOrder>0</SortOrder>
</Item>
<Item>
<Id>8</Id>
<ParentId>6</ParentId>
<Name>2.2</Name>
<SortOrder>0</SortOrder>
</Item>
<Item>
<Id>9</Id>
<ParentId>0</ParentId>
<Name>3</Name>
<SortOrder>0</SortOrder>
</Item>
<Item>
<Id>10</Id>
<ParentId>3</ParentId>
<Name>1.2.1</Name>
<SortOrder>0</SortOrder>
</Item>
<Item>
<Id>11</Id>
<ParentId>8</ParentId>
<Name>2.2.1</Name>
<SortOrder>0</SortOrder>
</Item>
<Item>
<Id>11</Id>
<ParentId>5</ParentId>
<Name>1.4.1</Name>
<SortOrder>0</SortOrder>
</Item>
</Items>
XSL
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns="http://www.w3.org/1999/xhtml">
<xsl:param name="SelectedId" select="'10'"/>
<xsl:key name="ChildNodes" match="/Items/Item" use="ParentId"/>
<xsl:template match="Items">
<ul>
<xsl:apply-templates select="Item[ParentId = 0]" />
</ul>
</xsl:template>
<xsl:template match="Item">
<li>
<xsl:choose>
<xsl:when test="Id = $SelectedId">
<b><xsl:value-of select="Name" /></b>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="Name" />
</xsl:otherwise>
</xsl:choose>
<xsl:variable name="Descendants" select="key ('ChildNodes', Id)" />
<xsl:if test="count ($Descendants) > 0">
<ul>
<xsl:apply-templates select="$Descendants" />
</ul>
</xsl:if>
</li>
</xsl:template>
</xsl:stylesheet>
Current output I have:
1
1.1
1.2
1.2.1
1.3
1.4
1.4.1
2
2.1
2.2
2.2.1
3
Desirable result example:
1
1.1
1.2
1.2.1
1.3
1.4
2
3
One way to do this is to make use of the node-set function, which requires an extension namespace in XSLT.
What you could do is, instead of outputting the Descendants variable directly as you do currently:
<ul>
<xsl:apply-templates select="$Descendants"/>
</ul>
You instead store the results in a variable
<xsl:variable name="list">
<ul>
<xsl:apply-templates select="$Descendants"/>
</ul>
</xsl:variable>
Then you can convert this 'result tree fragment' into a node-set, which you can then check for whether the selected element (held in a b element) exists. If so, you can then output it
<xsl:if test="exsl:node-set($list)//li[b]">
<xsl:copy-of select="$list"/>
</xsl:if>
Here is the full XSLT
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:exsl="urn:schemas-microsoft-com:xslt"
exclude-result-prefixes="exsl">
<xsl:output method="html"/>
<xsl:param name="SelectedId" select="'10'"/>
<xsl:key name="ChildNodes" match="/Items/Item" use="ParentId"/>
<xsl:template match="Items">
<ul>
<xsl:apply-templates select="Item[ParentId = 0]"/>
</ul>
</xsl:template>
<xsl:template match="Item">
<li>
<xsl:choose>
<xsl:when test="Id = $SelectedId">
<b>
<xsl:value-of select="Name"/>
</b>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="Name"/>
</xsl:otherwise>
</xsl:choose>
<xsl:variable name="Descendants" select="key ('ChildNodes', Id)"/>
<xsl:if test="count ($Descendants) > 0">
<xsl:variable name="list">
<ul>
<xsl:apply-templates select="$Descendants"/>
</ul>
</xsl:variable>
<xsl:if test="exsl:node-set($list)//li[b]">
<xsl:copy-of select="$list"/>
</xsl:if>
</xsl:if>
</li>
</xsl:template>
</xsl:stylesheet>
When applied to your sample XML, the following is output
<ul>
<li>1
<ul>
<li>1.1</li>
<li>1.2
<ul>
<li>
<b>1.2.1</b>
</li>
</ul></li>
<li>1.3</li>
<li>1.4</li>
</ul></li>
<li>2</li>
<li>3</li>
</ul>
Note that because I am using Microsoft XML here, the extension namespace is "urn:schemas-microsoft-com:xslt". For other processors, you will probably have to use "http://exslt.org/common".
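If the same stylesheet has to run on more than one processor, one option is to declare both namespaces and test which function exists with the standard function-available(). This is only a sketch: the prefixes exsl and msxsl and the hard-coded content of the list variable are chosen for illustration. It relies on the XSLT 1.0 rule that a processor should not signal an error for an unavailable extension function unless the expression is actually evaluated.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    xmlns:msxsl="urn:schemas-microsoft-com:xslt"
    exclude-result-prefixes="exsl msxsl">
  <xsl:template match="/">
    <!-- stand-in for the generated list; content is a made-up sample -->
    <xsl:variable name="list">
      <ul><li><b>selected</b></li></ul>
    </xsl:variable>
    <xsl:choose>
      <!-- EXSLT-aware processors -->
      <xsl:when test="function-available('exsl:node-set')">
        <xsl:if test="exsl:node-set($list)//li[b]">
          <xsl:copy-of select="$list"/>
        </xsl:if>
      </xsl:when>
      <!-- Microsoft processors -->
      <xsl:when test="function-available('msxsl:node-set')">
        <xsl:if test="msxsl:node-set($list)//li[b]">
          <xsl:copy-of select="$list"/>
        </xsl:if>
      </xsl:when>
      <xsl:otherwise>
        <xsl:message terminate="yes">No node-set() extension function available.</xsl:message>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

The duplication of the test in both branches is the price for portability. If you only ever target one processor, declaring the single matching namespace is simpler.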

XSLT: Converting a structure of items based on attribute values

Can somebody help me with the following problem? Below are the input XML, the XSLT I'm using and the expected output. I know this XSLT fails to generate the desired output because a unique generate-id is not being generated, but I don't know where that code should be inserted.
XML:
<item id="N65537" text="catalog">
<item id="N65540" text="cd">
<item id="N65542" text="title">
<item id="N65543" img="VAL" text="Empire Burlesque" />
</item>
<item id="N65545" text="artist">
<item id="N65546" img="VAL" text="Bob Dylan" />
</item>
<item id="N65548" text="country">
<item id="N65549" text="attr1" img="ATTR">
<item id="N65549_N65549" text="primary" img="ATTRVAL" />
</item>
<item id="N65550" img="VAL" text="USA" />
</item>
<item id="N65552" text="company">
<item id="N65553" text="attr2" img="ATTR">
<item id="N65553_N65553" text="main" img="ATTRVAL" />
</item>
<item id="N65554" img="VAL" text="Columbia" />
</item>
<item id="N65556" text="price">
<item id="N65557" img="VAL" text="10.90" />
</item>
<item id="N65559" text="year">
<item id="N65560" img="VAL" text="1985" />
</item>
</item>
<item id="N65563" text="cd">
<item id="N65565" text="title">
<item id="N65566" img="VAL" text="Hide your heart" />
</item>
<item id="N65568" text="artist">
<item id="N65569" img="VAL" text="Bonnie Tyler" />
</item>
<item id="N65571" text="country">
<item id="N65572" img="VAL" text="UK" />
</item>
<item id="N65574" text="company">
<item id="N65575" img="VAL" text="CBS Records" />
</item>
<item id="N65577" text="price">
<item id="N65578" img="VAL" text="9.90" />
</item>
<item id="N65580" text="year">
<item id="N65581" img="VAL" text="1988" />
</item>
</item>
</item>
XSLT:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/">
<xsl:call-template name="dispatch">
<xsl:with-param name="nodes" select="node()"/>
</xsl:call-template>
</xsl:template>
<xsl:template name="dispatch">
<xsl:param name="nodes"/>
<xsl:choose>
<xsl:when test="text()">
<xsl:call-template name="apply" >
<xsl:with-param name="select" select="node()" />
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:call-template name="apply" />
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<xsl:template name="apply">
<xsl:param name="select" select="node()" />
<xsl:for-each select="$select">
<xsl:if test='local-name() !=""'>
<xsl:variable name="ename">
<xsl:for-each select="@*">
<xsl:if test='name()="img1"'>
<xsl:text><xsl:value-of select="." /></xsl:text>
</xsl:if>
</xsl:for-each>
</xsl:variable>
<xsl:variable name="aname">
<xsl:for-each select="@*">
<xsl:if test='name()="img"'>
<xsl:text><xsl:value-of select="." /></xsl:text>
</xsl:if>
</xsl:for-each>
</xsl:variable>
<xsl:for-each select="@*">
<xsl:variable name="tname">
<xsl:text><xsl:value-of select="." /></xsl:text>
</xsl:variable>
<xsl:choose>
<xsl:when test='name() ="text" and normalize-space($ename) = "VAL" and normalize-space($aname) != "ATTR"'>
<xsl:element name="{$tname}">
<xsl:for-each select="$select">
<xsl:call-template name="dispatch"/>
</xsl:for-each>
</xsl:element>
</xsl:when>
<xsl:when test='name() ="text" and normalize-space($ename) = "VAL" '>
<xsl:value-of select="$tname" />
</xsl:when>
<xsl:when test='name() ="text" and normalize-space($aname) = "ATTR"'>
<xsl:attribute name="id"><xsl:value-of select="$aname" /></xsl:attribute>
</xsl:when>
</xsl:choose>
</xsl:for-each>
</xsl:if>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
Expected output:
<catalog>
<cd>
<title>Empire Burlesque</title>
<artist>Bob Dylan</artist>
<country attr1="primary">USA</country>
<company attr2="main">Columbia</company>
<price>10.90</price>
<year>1985</year>
</cd>
<cd>
<title>Hide your heart</title>
<artist>Bonnie Tyler</artist>
<country>UK</country>
<company>CBS Records</company>
<price>9.90</price>
<year>1988</year>
</cd>
</catalog>
EDIT: Modified answer after detail was added to the question.
<xsl:stylesheet
version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>
<!-- normal items become an ordinary element -->
<xsl:template match="item">
<xsl:element name="{@text}">
<!-- attributes must be created before any other contents -->
<xsl:apply-templates select="item[@img='ATTR']" />
<!-- now process sub-elements and values (i.e. "anything else") -->
<xsl:apply-templates select="item[not(@img='ATTR')]" />
</xsl:element>
</xsl:template>
<!-- items with "ATTR" become an attribute -->
<xsl:template match="item[@img='ATTR']">
<xsl:attribute name="{@text}">
<xsl:value-of select="item[@img='ATTRVAL']/@text" />
</xsl:attribute>
</xsl:template>
<!-- items with "VAL" become a simple text -->
<xsl:template match="item[@img='VAL']">
<xsl:value-of select="@text" />
</xsl:template>
</xsl:stylesheet>
gives
<catalog>
<cd>
<title>Empire Burlesque</title>
<artist>Bob Dylan</artist>
<country attr1="primary">USA</country>
<company attr2="main">Columbia</company>
<price>10.90</price>
<year>1985</year>
</cd>
<cd>
<title>Hide your heart</title>
<artist>Bonnie Tyler</artist>
<country>UK</country>
<company>CBS Records</company>
<price>9.90</price>
<year>1988</year>
</cd>
</catalog>
The stylesheet works because the XSL processor chooses templates based on the specificity of their match expressions. match="item[@img='ATTR']" is more specific than match="item", so for each <item> processed (through <xsl:apply-templates select="item" />) the engine picks the right template automatically.
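To make the selection rule concrete: in XSLT 1.0 a pattern that is only an element name has a default priority of 0, while a pattern with a predicate has a default priority of 0.5. The sketch below is the same stylesheet with those defaults written out through the `priority` attribute. The explicit values are added here only for illustration; they restate the defaults, so the behavior should be unchanged.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- generic rule: a plain name test has default priority 0 -->
  <xsl:template match="item" priority="0">
    <xsl:element name="{@text}">
      <xsl:apply-templates select="item[@img='ATTR']" />
      <xsl:apply-templates select="item[not(@img='ATTR')]" />
    </xsl:element>
  </xsl:template>
  <!-- a pattern with a predicate has default priority 0.5, so it wins over "item" -->
  <xsl:template match="item[@img='ATTR']" priority="0.5">
    <xsl:attribute name="{@text}">
      <xsl:value-of select="item[@img='ATTRVAL']/@text" />
    </xsl:attribute>
  </xsl:template>
  <xsl:template match="item[@img='VAL']" priority="0.5">
    <xsl:value-of select="@text" />
  </xsl:template>
</xsl:stylesheet>
```

If two matching templates ever end up with the same priority, setting `priority` explicitly like this is the usual way to resolve the ambiguity.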
The main problem I see in your XSLT solution is that you use xsl:if and xsl:choose instead of 'select' to filter nodes. This makes your XSLT difficult to read and understand (at least for me).
Try this:
<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="ISO-8859-1" indent="yes"/>
<xsl:template match="/item[@text='catalog']">
<catalog>
<xsl:apply-templates select="item[@text='cd']"></xsl:apply-templates>
</catalog>
</xsl:template>
<xsl:template match="item[@text='cd']">
<cd>
<title><xsl:value-of select="item[@text='title']/item[@img='VAL']/@text"/></title>
<artist><xsl:value-of select="item[@text='artist']/item[@img='VAL']/@text"/></artist>
<country><xsl:value-of select="item[@text='country']/item[@img='VAL']/@text"/></country>
<company><xsl:value-of select="item[@text='company']/item[@img='VAL']/@text"/></company>
<price><xsl:value-of select="item[@text='price']/item[@img='VAL']/@text"/></price>
<year><xsl:value-of select="item[@text='year']/item[@img='VAL']/@text"/></year>
</cd>
</xsl:template>
</xsl:stylesheet>
This solution does not cover the ATTR nodes, since they are not part of the described result.
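The point about filtering in `select` rather than with xsl:if can be seen in isolation. The following is a minimal sketch, assuming the input XML from the question; it prints the text of every VAL item twice, once with each style, so the two loops can be compared. The text output method and the newline separators are choices made for this illustration only.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <!-- style 1: visit every item, then filter inside the loop with xsl:if -->
    <xsl:for-each select="//item">
      <xsl:if test="@img='VAL'">
        <xsl:value-of select="@text"/>
        <xsl:text>&#10;</xsl:text>
      </xsl:if>
    </xsl:for-each>
    <!-- style 2: the same nodes, filtered by a predicate in select -->
    <xsl:for-each select="//item[@img='VAL']">
      <xsl:value-of select="@text"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Both loops visit the same nodes in document order; the second states the condition where the nodes are chosen, which is usually easier to read.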
If you can possibly change the input XML, do so. XML is supposed to carry some meaning in the tag names and in its structure. Calling everything item just makes it unreadable.
Making such a change will also allow you to write readable XSLT that doesn't resort to node hierarchy selector tricks.
How about this:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/">
<catalog>
<xsl:for-each select="/item[@text='catalog']/item[@text='cd']">
<cd>
<xsl:for-each select="item">
<xsl:variable name="ename" select="string(@text)"/>
<xsl:variable name="value" select="item/@text"/>
<xsl:element name="{$ename}">
<xsl:value-of select="$value"/>
</xsl:element>
</xsl:for-each>
</cd>
</xsl:for-each>
</catalog>
</xsl:template>
</xsl:stylesheet>
Not as nice as Tomalak's solution - but maybe slightly clearer as to the intention.
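One caveat with the loop version, as far as I can tell: `item/@text` selects the text attributes of all child items, and xsl:value-of uses only the first in document order, so for the country and company entries that carry an ATTR child it would pick up "attr1" or "attr2" instead of the value. Below is a hedged sketch of the same nested for-each approach that restricts the value to the VAL child and also emits the attributes. It is a variant written for illustration, not the original answer's code.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:template match="/">
    <catalog>
      <xsl:for-each select="/item[@text='catalog']/item[@text='cd']">
        <cd>
          <xsl:for-each select="item">
            <xsl:element name="{@text}">
              <!-- attributes first: one per ATTR child, value taken from its ATTRVAL child -->
              <xsl:for-each select="item[@img='ATTR']">
                <xsl:attribute name="{@text}">
                  <xsl:value-of select="item[@img='ATTRVAL']/@text"/>
                </xsl:attribute>
              </xsl:for-each>
              <!-- then the element's text, taken from the VAL child only -->
              <xsl:value-of select="item[@img='VAL']/@text"/>
            </xsl:element>
          </xsl:for-each>
        </cd>
      </xsl:for-each>
    </catalog>
  </xsl:template>
</xsl:stylesheet>
```

The attributes are written before the text because xsl:attribute has to come before any child content of the element being built.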