XSL 2.0 output only first duplicate node - xslt

I searched the threads about duplicate nodes and almost got it working: my code removes the duplicates, but it outputs the last node of each set of duplicates instead of the first.
What am I doing wrong or missing here?
=====XML =====
<node id="j0dp1s8s">
<name key="">ABC</name>
<link type="page" target="">
<value>abc/index</value>
</link>
</node>
<node id="j0dp1s8se">
<name key="">DEF</name>
<link type="page" target="">
<value>def/index</value>
</link>
</node>
<node id="j0dp1s92">
<name key="">XYZ</name>
<link type="page" target="">
<value>abc/index</value>
</link>
</node>
=======XSL=============
<xsl:variable name="unique-list" select="link[not(value=following::link/value)]" />
<xsl:for-each select="$unique-list">
<li><xsl:value-of select="../name" /></li>
</xsl:for-each>
Output:
DEF
XYZ
Desired Output:
ABC
DEF

Preliminary notes
Your source XML is not well-formed as posted: a name element cannot be closed by a label end tag [deprecated]
I won't go through your mistakes; here is a much simpler, working stylesheet instead
I added a nodes root element to make your source XML well-formed
XSLT:
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
<xsl:output method="html"/>
<xsl:template match="nodes">
<ul>
<xsl:for-each-group select="node" group-by="link/value">
<li><xsl:value-of select="name" /></li>
</xsl:for-each-group>
</ul>
</xsl:template>
</xsl:transform>
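Since you can use XSLT 2.0, take advantage of its grouping support. Inside xsl:for-each-group the context item is the first item of the current group, so name selects the name of the first node with each link value. https://www.w3.org/TR/xslt20/#element-for-each-group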
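If grouping is not an option, the predicate approach from the question can also be repaired. The sketch below assumes the same nodes wrapper element that was added to the source XML above. Testing against following nodes keeps a node only when no later node has the same value, so the last duplicate survives. Testing against preceding siblings instead keeps the first one. The expression is plain XPath 1.0, so it should also work with an XSLT 1.0 processor.

```xml
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output method="html"/>
  <xsl:template match="nodes">
    <ul>
      <!-- Keep a node only if no earlier sibling node has the same link/value -->
      <xsl:for-each select="node[not(link/value = preceding-sibling::node/link/value)]">
        <li><xsl:value-of select="name"/></li>
      </xsl:for-each>
    </ul>
  </xsl:template>
</xsl:transform>
```

For the sample input this should list ABC and DEF. Each node is compared against all of its preceding siblings, so for large inputs the grouping version is likely to scale better.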

Related

Split an .xml-file with XSLT

I have written an XSL file that reads some filenames from the source file and uses these filenames to split another file (which is opened in the XSL file via the document() function). The filenames are used to create several output files, and certain parts of the loaded file are written to these output files.
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xsd="http://www.w3.org/2001/XMLSchema-instance">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="Root">
<xsl:apply-templates select="//Link"/>
</xsl:template>
<xsl:template match="Link">
<xsl:result-document href="{@url}" method="xml">
<xsl:apply-templates select="document('Input.xml')//Node"/>
</xsl:result-document>
</xsl:template>
<xsl:template match="Node">
<xsl:copy-of select="."/>
<xsl:if test="following-sibling::*[1][self::NextPart]">
<!-- write some test node -->
<xsl:element name="FoundNextPart"/>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
The sourcefile looks something like this
<Root>
<SomeNode>
<Link url="part_0.xml"/>
<Link url="part_1.xml"/>
<Link url="part_2.xml"/>
</SomeNode>
</Root>
The Input.xml file will have a structure like this
<Root>
<Node>
<PartContent>
<ImportantContent>0</ImportantContent>
</PartContent>
</Node>
<Node>
<PartContent>
<ImportantContent>0</ImportantContent>
</PartContent>
</Node>
<NextPart/>
<Node>
<PartContent>
<ImportantContent>1</ImportantContent>
</PartContent>
</Node>
<Node>
<PartContent>
<ImportantContent>1</ImportantContent>
</PartContent>
</Node>
<NextPart/>
</Root>
My problem is now with the
<xsl:template match="Node">
I want to copy the content of the Input.xml up to the first appearance of the
<NextPart/>
node. Then I want to somehow break out of the current nodeset (//Node of the Input.xml) and continue with the next //Link. But for this next Link (file) I want to copy the content of the Input.xml between the first and the second appearance of the
<NextPart/>
node.
I'm not sure if this is feasible in any way. Also I'm not sure if my approach can be used for this.
I've read something like using
<xsl:call-template name="copy">
to use the following-sibling of the current node as a parameter. But anyway I have to pass the current count of the
<NextPart/>
so that I know, which content to copy!?
How about processing and grouping that Input.xml once with e.g.
<xsl:variable name="groups">
<xsl:for-each-group select="document('Input.xml')/Root/*" group-ending-with="NextPart">
<group>
<xsl:copy-of select="current-group()[self::Node]"/>
</group>
</xsl:for-each-group>
</xsl:variable>
in a global variable, then in your template you do
<xsl:template match="Link">
<xsl:variable name="pos" select="position()"/>
<xsl:result-document href="{@url}" method="xml">
<xsl:copy-of select="$groups/group[$pos]/Node"/>
</xsl:result-document>
</xsl:template>
to output the Node elements grouped earlier.
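An alternative sketch that skips the intermediate variable, assuming that Node and NextPart are siblings directly under Root as in the sample Input.xml: the n-th Link takes the Node elements that are preceded by exactly n - 1 NextPart markers. The variable name pos is only illustrative, and position() gives the expected numbering only when the template is applied through select="//Link" as in the question.

```xml
<xsl:template match="Link">
  <!-- 1 for the first Link, 2 for the second, and so on -->
  <xsl:variable name="pos" select="position()"/>
  <xsl:result-document href="{@url}" method="xml">
    <!-- Node elements preceded by exactly $pos - 1 NextPart markers -->
    <xsl:copy-of select="document('Input.xml')/Root/Node[count(preceding-sibling::NextPart) = $pos - 1]"/>
  </xsl:result-document>
</xsl:template>
```

With the sample files, part_0.xml should receive the first two Node elements, part_1.xml the next two, and part_2.xml none. Each result document gets several top-level Node elements, so wrap them in a root element if the output files must be well-formed XML documents.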

With XSLT, how can I process normally, but hold some nodes until the end and then output them all at once (e.g. footnotes)?

I have an XSLT application which reads the internal format of Microsoft Word 2007/2010 zipped XML and translates it into HTML5 with XSLT. I am investigating how to add the ability to optionally read OpenOffice documents instead of MSWord.
Microsoft stores XML for footnote text separately from the XML of the document text, which happens to suit me because I want the footnotes in a block at the end of the output HTML page.
However, unfortunately for me, OpenOffice puts each footnote right next to its reference, inline with the text of the document. Here is a simple paragraph example:
<text:p text:style-name="Standard">The real breakthrough in aerial mapping
during World War II was trimetrogon
<text:note text:id="ftn0" text:note-class="footnote">
<text:note-citation>1</text:note-citation>
<text:note-body>
<text:p text:style-name="Footnote">Three separate cameras took three
photographs at once, a direct downward and an oblique on each side.</text:p>
</text:note-body>
</text:note>
photography, but the camera was large and heavy, so there were problems finding
the right aircraft to carry it.
</text:p>
My question is, can XSLT process the XML as normal, but hold each of the text:note items until the end of the document text, and then emit them all at one time?
You're thinking of your logic as being driven by the order of things in the input, but in XSLT you need to be driven by the order of things in the output. When you get to the point where you want to output the footnotes, go find the footnote text wherever it might be in the input. Admittedly that doesn't always play too well with the apply-templates recursive descent processing model, which is explicitly input-driven; but nevertheless, that's the way you have to do it.
Don't think of it as "holding" the text:note items. Instead, simply ignore them in the main pass, then gather them at the end with //text:note and process them there, e.g.
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
xmlns:text="whateveritshouldbe">
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()" />
</xsl:copy>
</xsl:template>
<!-- normal mode - replace text:note element by [reference] -->
<xsl:template match="text:note">
<xsl:value-of select="concat('[', text:note-citation, ']')" />
</xsl:template>
<xsl:template match="/">
<document>
<xsl:apply-templates select="*" />
<footnotes>
<xsl:apply-templates select="//text:note" mode="footnotes"/>
</footnotes>
</document>
</xsl:template>
<!-- special "footnotes" mode to de-activate the usual text:note template -->
<xsl:template match="@*|node()" mode="footnotes">
<xsl:copy>
<xsl:apply-templates select="@*|node()" mode="footnotes" />
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
You could use <xsl:apply-templates mode="..."/>. I'm not sure of the exact syntax for your use case, but the example below may give you a clue on how to approach the problem.
The basic idea is to process your nodes twice. The first pass is pretty much the same as what you do now; the second pass looks only for footnotes and outputs only those. You distinguish the two passes by setting the mode attribute.
Note that I used different tags than in your code to keep the example simpler.
XSLT sheet:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml" indent="yes" />
<xsl:template match="doc">
<xml>
<!-- First iteration - skip footnotes -->
<doc>
<xsl:apply-templates select="text" />
</doc>
<!-- Second iteration, extract all footnotes.
'mode' = footnotes -->
<footnotes>
<xsl:apply-templates select="text" mode="footnotes" />
</footnotes>
</xml>
</xsl:template>
<!-- Note: no mode attribute -->
<xsl:template match="text">
<text>
<xsl:for-each select="p">
<p>
<xsl:value-of select="text()" />
</p>
</xsl:for-each>
</text>
</xsl:template>
<!-- Note: mode = footnotes -->
<xsl:template match="text" mode="footnotes">
<xsl:for-each select=".//footnote">
<footnote>
<xsl:value-of select="text()" />
</footnote>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
Input XML:
<?xml version="1.0" encoding="UTF-8"?>
<doc>
<text>
<p>
some text
<footnote>footnote1</footnote>
</p>
<p>
other text
<footnote>footnote2</footnote>
</p>
</text>
<text>
<p>
some text2
<footnote>footnote3</footnote>
</p>
<p>
other text2
<footnote>footnote4</footnote>
</p>
</text>
</doc>
Output XML:
<?xml version="1.0" encoding="UTF-8"?>
<xml>
<!-- Output from first iteration -->
<doc>
<text>
<p>some text</p>
<p>other text</p>
</text>
<text>
<p>some text2</p>
<p>other text2</p>
</text>
</doc>
<!-- Output from second iteration -->
<footnotes>
<footnote>footnote1</footnote>
<footnote>footnote2</footnote>
<footnote>footnote3</footnote>
<footnote>footnote4</footnote>
</footnotes>
</xml>

Copy data from one XML doc to another using XSLT

I have to copy data of node element from file1.xml to file2.xml.
file1.xml
<?xml version="1.0" encoding="utf-8" ?>
<root>
<header>
<AsofDate>31-Dec-2012</AsofDate>
<FundName>This is Sample Fund</FundName>
<Description>This is test description</Description>
</header>
</root>
file2.xml
<?xml version="1.0" encoding="utf-8" ?>
<root id="1">
<header id="2">
<AsofDate id="3"/>
<FundName id="4" />
<Description id="5" />
</header>
</root>
After merging file1.xml into file2.xml, the result should look like this:
<?xml version="1.0" encoding="utf-8" ?>
<root id="1">
<header id="2">
<AsofDate id="3">31-Dec-2012</AsofDate>
<FundName id="4">This is Sample Fund</FundName>
<Description id="5">This is test description</Description>
</header>
</root>
I am using the XSLT below to transform the file.
<?xml version="1.0" encoding="utf-8" ?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()" />
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Below is the code used to perform the transformation:
XslCompiledTransform tf = new XslCompiledTransform();
tf.Load("TranFile.xsl");
tf.Transform("file1.xml", "file2.xml");
But the above code overwrites the content of file2.xml with the content of file1.xml. This is just a sample XML. In the real case we don't know the names of the nodes or the hierarchy of the XML file, but whatever the structure is, it will be the same in both files, and the scenario will be exactly the same. I am new to XSLT and not sure whether this is the right approach. Is it really possible to achieve this result through XSLT?
I wrote the solution below with the following in mind:
The only things that need to be merged are the attributes. Text and element nodes are copied as they appear in file1.xml.
The @id attributes are not numbered sequentially in file2.xml, so the @id values there could be (for example) 121 432 233 12 944 instead of 1 2 3 4 5. If they were sequential, you would not need file2.xml to generate the desired output.
The document() function can be used to access files other than the current one. If XslCompiledTransform gives an error when using the document() function, I would suggest following the question "using document() function in .NET XSLT generates error". I am using a different XSLT processor (xsltproc) and it works fine.
This solution is based on keeping a reference to the external file, so each time we process an element in file1.xml the reference is moved to point at the same element in file2.xml. This can be done because, according to the problem, both files have the same element hierarchy.
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="no"/>
<!-- Match the document node as an entry point for matching the files -->
<xsl:template match="/">
<xsl:apply-templates select="node()">
<xsl:with-param name="doc-context" select="document('file2.xml')/node()" />
</xsl:apply-templates>
</xsl:template>
<!-- In this template we copy the elements and text nodes from file1.xml and
we merge the attributes from file2.xml with the attributes in file1.xml -->
<xsl:template match="node()">
<!-- We use this parameter to keep track of where we are in file2.xml by
mimicking the operations that we do in the current file. So we are at
the same position in both files at the same time. -->
<xsl:param name="doc-context" />
<!-- Obtain current position in file1.xml so we know where to look in file2.xml -->
<xsl:variable name="position" select="position()" />
<!-- Copy the element node from the current file (file1.xml) -->
<xsl:copy>
<!-- Merge attributes from file1.xml with attributes from file2.xml -->
<xsl:copy-of select="@*|$doc-context[position() = $position]/@*" />
<!-- Copy text nodes and process children -->
<xsl:apply-templates select="node()">
<xsl:with-param name="doc-context" select="$doc-context/node()" />
</xsl:apply-templates>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>

Finding unique nodes with xslt

I have an xml document that contains some "Item" elements with ids. I want to make a list of the unique Item ids. The Item elements are not in a list though - they can be at any depth within the xml document - for example:
<Node>
<Node>
<Item id="1"/>
<Item id="2"/>
</Node>
<Node>
<Item id="1"/>
<Node>
<Item id="3"/>
</Node>
</Node>
<Item id="2"/>
</Node>
I would like the output 1,2,3 (or a similar representation). If this can be done with a single xpath then even better!
I have seen examples of this for lists of sibling elements, but not for a general xml tree structure. I'm also restricted to using xslt 1.0 methods. Thanks!
Selecting all unique items with a single XPath expression (without indexing, beware of performance issues):
//Item[not(@id = preceding::Item/@id)]
Try this (using Muenchian grouping):
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:key name="item-id" match="Item" use="@id" />
<xsl:template match="/Node">
<xsl:for-each select="//Item[count(. | key('item-id', @id)[1]) = 1]">
<xsl:value-of select="@id" />,
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
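The stylesheet above emits a comma and some whitespace after every id, including the last one. A small variant, offered as a sketch, writes the separator only between items and uses text output. It uses the generate-id() form of the Muenchian test, which is equivalent to the count() form above.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <!-- Index every Item element by its id attribute -->
  <xsl:key name="item-id" match="Item" use="@id"/>
  <xsl:template match="/">
    <!-- Keep an Item only if it is the first one in document order with its id -->
    <xsl:for-each select="//Item[generate-id() = generate-id(key('item-id', @id)[1])]">
      <xsl:value-of select="@id"/>
      <!-- Separator between items, but not after the last one -->
      <xsl:if test="position() != last()">,</xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

For the sample document this should produce 1,2,3.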
Not sure if this is what you mean, but just in case.
In the html
<xsl:apply-templates select="item"/>
The template.
<xsl:template match="id">
<p>
<xsl:value-of select="@id"/> -
<xsl:value-of select="."/>
</p>
</xsl:template>

How can I build a tree from a flat XML list using XSLT?

I use a minimalist MVC framework in which the PHP controller hands the DOM model to the XSLT view (cf. okapi).
To build a navigation tree, I used nested sets in MySQL. This way, I end up with a model XML that looks as follows:
<tree>
<node>
<name>root</name>
<depth>0</depth>
</node>
<node>
<name>TELEVISIONS</name>
<depth>1</depth>
</node>
<node>
<name>TUBE</name>
<depth>2</depth>
</node>
<node>
<name>LCD</name>
<depth>2</depth>
</node>
<node>
<name>PLASMA</name>
<depth>2</depth>
</node>
<node>
<name>PORTABLE ELECTRONICS</name>
<depth>1</depth>
</node>
<node>
<name>MP3 PLAYERS</name>
<depth>2</depth>
</node>
<node>
<name>FLASH</name>
<depth>3</depth>
</node>
<node>
<name>CD PLAYERS</name>
<depth>2</depth>
</node>
<node>
<name>2 WAY RADIOS</name>
<depth>2</depth>
</node>
</tree>
which represents the following structure:
root
  TELEVISIONS
    TUBE
    LCD
    PLASMA
  PORTABLE ELECTRONICS
    MP3 PLAYERS
    FLASH
    CD PLAYERS
    2 WAY RADIOS
How can I convert this flat XML list to a nested HTML list using XSLT?
PS: this is the example tree from the Managing Hierarchical Data in MySQL.
That form of flat list is very hard to work with in xslt, as you need to find the position of the next grouping, etc. Can you use different xml? For example, with the flat xml:
<?xml version="1.0" encoding="utf-8" ?>
<tree>
<node key="0">root</node>
<node key="1" parent="0">TELEVISIONS</node>
<node key="2" parent="1">TUBE</node>
<node key="3" parent="1">LCD</node>
<node key="4" parent="1">PLASMA</node>
<node key="5" parent="0">PORTABLE ELECTRONICS</node>
<node key="6" parent="5">MP3 PLAYERS</node>
<node key="7" parent="6">FLASH</node>
<node key="8" parent="5">CD PLAYERS</node>
<node key="9" parent="5">2 WAY RADIOS</node>
</tree>
It becomes trivial to do (very efficiently):
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:key name="nodeChildren" match="/tree/node" use="@parent"/>
<xsl:template match="tree">
<ul>
<xsl:apply-templates select="node[not(@parent)]"/>
</ul>
</xsl:template>
<xsl:template match="node">
<li>
<xsl:value-of select="."/>
<ul>
<xsl:apply-templates select="key('nodeChildren',@key)"/>
</ul>
</li>
</xsl:template>
</xsl:stylesheet>
Is that an option?
Of course, if you build the xml as a hierarchy it is even easier ;-p
In XSLT 2.0 it would be rather easy with the new grouping functions.
In XSLT 1.0 it's a little more complicated but this works:
<xsl:template match="/tree">
<xhtml>
<head/>
<body>
<ul>
<xsl:apply-templates select="node[depth='0']"/>
</ul>
</body>
</xhtml>
</xsl:template>
<xsl:template match="node">
<xsl:variable name="thisNodeId" select="generate-id(.)"/>
<xsl:variable name="depth" select="depth"/>
<xsl:variable name="descendants">
<xsl:apply-templates select="following-sibling::node[depth = $depth + 1][preceding-sibling::node[depth = $depth][1]/generate-id() = $thisNodeId]"/>
</xsl:variable>
<li>
<xsl:value-of select="name"/>
</li>
<xsl:if test="$descendants/*">
<ul>
<xsl:copy-of select="$descendants"/>
</ul>
</xsl:if>
</xsl:template>
The heart of the matter is the long and ugly "descendants" variable, which looks for nodes after the current node that have a "depth" child greater than the current depth, but are not after another node that would have the same depth as the current depth (because if they were, they would be children of that node instead of the current one).
BTW there is an error in your example result: "FLASH" should be a child of "MP3 PLAYERS" and not a sibling.
EDIT
In fact (as mentioned in the comments), in "pure" XSLT 1.0 this does not work for two reasons: the path expression uses generate-id() incorrectly, and one cannot use a "result tree fragment" in a path expression.
Here is a correct XSLT 1.0 version of the "node" template (successfully tested with Saxon 6.5) that does not use EXSLT nor XSLT 1.1:
<xsl:template match="node">
<xsl:variable name="thisNodeId" select="generate-id(.)"/>
<xsl:variable name="depth" select="depth"/>
<xsl:variable name="descendants">
<xsl:apply-templates select="following-sibling::node[depth = $depth + 1][generate-id(preceding-sibling::node[depth = $depth][1]) = $thisNodeId]"/>
</xsl:variable>
<xsl:variable name="descendantsNb">
<xsl:value-of select="count(following-sibling::node[depth = $depth + 1][generate-id(preceding-sibling::node[depth = $depth][1]) = $thisNodeId])"/>
</xsl:variable>
<li>
<xsl:value-of select="name"/>
</li>
<xsl:if test="$descendantsNb > 0">
<ul>
<xsl:copy-of select="$descendants"/>
</ul>
</xsl:if>
</xsl:template>
Of course, one should factor the path expression that is repeated, but without the ability to turn "result tree fragments" into XML that can actually be processed, I don't know if it's possible? (writing a custom function would do the trick of course, but then it's much simpler to use EXSLT)
Bottom line: use XSLT 1.1 or EXSLT if you can!
2nd Edit
To avoid repeating the path expression, you can also drop the test altogether, which will simply result in some empty <ul/> elements that you can either leave in the result or post-process to eliminate.
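Another way to avoid repeating the path expression in plain XSLT 1.0, as a sketch: bind the child nodes to a variable through its select attribute. That produces a node-set rather than a result tree fragment, so it can be tested and passed to xsl:apply-templates directly. The variable names are illustrative. Unlike the templates above, the nested list is placed inside its li.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" indent="yes"/>

  <xsl:template match="/tree">
    <ul>
      <xsl:apply-templates select="node[depth = 0]"/>
    </ul>
  </xsl:template>

  <xsl:template match="node">
    <xsl:variable name="id" select="generate-id()"/>
    <xsl:variable name="depth" select="number(depth)"/>
    <!-- Later siblings one level deeper whose nearest preceding sibling
         at the current depth is this node -->
    <xsl:variable name="children"
        select="following-sibling::node[depth = $depth + 1]
                [generate-id(preceding-sibling::node[depth = $depth][1]) = $id]"/>
    <li>
      <xsl:value-of select="name"/>
      <!-- A non-empty node-set is true, so no separate count is needed -->
      <xsl:if test="$children">
        <ul>
          <xsl:apply-templates select="$children"/>
        </ul>
      </xsl:if>
    </li>
  </xsl:template>
</xsl:stylesheet>
```

This assumes the flat list is in pre-order, as nested-set queries normally return it, so that the parent of a node is always the nearest preceding node that is one level shallower.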
Very helpful!
One suggestion: moving the <ul> inside the node template would remove the empty ul.
<xsl:template match="tree">
<xsl:apply-templates select="node[not(@parent)]"/>
</xsl:template>
<xsl:template match="node">
<ul>
<li>
<xsl:value-of select="."/>
<xsl:apply-templates select="key('nodeChildren',@key)"/>
</li>
</ul>
</xsl:template>
</xsl:stylesheet>
You haven't actually said what you'd like the html output to look like, but I can tell you that from an XSLT point of view going from a flat structure to a tree is going to be complex and expensive if you're also basing this on the position of items in the tree and their relation to siblings.
It would be far better to supply a <parent> attribute/node than the <depth>.