What is this XSLT code doing? - xslt

I'm new to XSLT. I have a block code that I don't understand.
In the following block what does '*','*[#class='vcard']' and '*[#class='fn']' mean?
<?xml version="1.0" encoding="utf-8"?> <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:output method="html" encoding="utf-8"/> <xsl:template match="/">
<script type="text/javascript">
<xsl:text><![CDATA[function show_hcard(info) {
win2 = window.open("about:blank", "HCARD", "width=300,height=200," + "scrollbars=no menubar=no, status=no, toolbar=no, scrollbars=no");
win2.document.write("<h1>HCARD</h1><hr/><p>" + info + "</p>"); win2.document.close();
}]]></xsl:text>
</script>
<xsl:apply-templates/> </xsl:template>
<xsl:template match="*">
<xsl:copy>
<xsl:copy-of select="#*"/>
<xsl:apply-templates/>
</xsl:copy> </xsl:template>
<xsl:template match="*[#class='vcard']">
<xsl:apply-templates/> </xsl:template>
<xsl:template match="*[#class='fn']">
<u>
<a>
<xsl:attribute name="onMouseDown">
<xsl:text>show_hcard('</xsl:text>
<xsl:value-of select="text()"/>
<xsl:text>')</xsl:text>
</xsl:attribute>
<xsl:value-of select="text()"/>
</a>
</u> </xsl:template> </xsl:stylesheet>

* matches all elements, *[#class='vcard'] pattern matches all elements with class attribute of vcard value. From that you can figure out what *[#class='fn'] may mean ;-)
I'd also suggest that you start here.

Your stylesheet has four template rules. In English these rules are:
(a) starting at the top (match="/"), first output a script element, then process the next level down (xsl:apply-templates) in the input.
(b) the default rule for elements (match="*") is to create a new element in the output with the same name and attributes as the original, and to construct its content by processing the next level down in the input.
(c) the rule for elements with the attribute class="vcard" is to do nothing with this element, other than to process the next level down in the input.
(d) the rule for elements with the attribute class="fn" is to output
<u><a onMouseDown="show_hcard('X')">X</a></u>
where X is the text content of the element being processed.
A more experienced XSLT user would have written the last rule as
<xsl:template match="*[#class='fn']">
<u>
<a onMouseDown="show_hcard('{.}')">
<xsl:value-of select="."/>
</a>
</u>
</xsl:template>

Related

XSLT List attributes in the order they appear in the xml file

I have a large number of xml files with a structure similar to the following, although they are far larger:
<?xml version="1.0" encoding="UTF-8"?>
<a a1="3.0" a2="ABC">
<b b1="P1" b2="123">first
</b>
<b b1="P2" b2="456" b3="xyz">second
</b>
</a>
I want to get the following output:
1|1|b1
1|2|b2
2|1|b1
2|2|b2
2|3|b3
where:
Field 1 is the sequence number for nodes /a/b
Field 2 is the sequence number of the attribute as it appears in the xml file
Field 3 is the attribute name (not value)
I don't quite know how to calculate field 2 correctly.
I've prepared the following xslt file:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="#*|node()">
<xsl:copy>
<xsl:apply-templates select="#*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="/">
<xsl:for-each select="a/b/#*">
<xsl:value-of select="count(../preceding-sibling::*)+1"/>
<xsl:text>|</xsl:text>
<!-- TODO: This is not correct -->
<xsl:value-of select="count(preceding-sibling::*)+1"/>
<xsl:text>|</xsl:text>
<xsl:value-of select="name()"/>
<xsl:text>
</xsl:text>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
but when I run the following command:
xsltproc a.xslt a.xml > a.csv
I get an incorrect output, as field 2 does not represent the attribute sequence number:
1|1|b1
1|1|b2
2|1|b1
2|1|b2
2|1|b3
Do you have any suggestions on how to get the correct output please?
Please notice that the answers provided in XSLT to order attributes do not provide a solution to this problem.
The order of attributes is irrelevant in XML. For instance, <a a1="3.0" a2="ABC"> and <a a1="3.0" a2="ABC"> are equivalent.
However this specific question is part of a larger application where it is essential to establish the order in which attributes appear in given xml files (and not in xml files that are equivalent to them).
Although, as kjhughes says in comments, attribute order is insignificant. However, you can still select them, and use the position() element to get the numbers you are after (You just can't be sure the order they are output will be the order they appear in the XML, although generally this will be the case).
Try this XSLT. Do note the nested use of xsl:for-each to select only b elements first, to get their position, before getting the attributes, which then have their own separate position.
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" />
<xsl:template match="/">
<xsl:for-each select="a/b">
<xsl:variable name="bPosition" select="position()"/>
<xsl:for-each select="#*">
<xsl:value-of select="$bPosition"/>
<xsl:text>|</xsl:text>
<xsl:value-of select="position()"/>
<xsl:text>|</xsl:text>
<xsl:value-of select="name()"/>
<xsl:text>
</xsl:text>
</xsl:for-each>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
You could use the position() of the items in the sequence of attributes that you are iterating over and combine with logic for the position of its parent element.
<xsl:template match="/">
<xsl:for-each select="a/b/#*">
<xsl:value-of select="count(../preceding-sibling::*)+1"/>
<xsl:text>|</xsl:text>
<!-- TODO: This is not correct -->
<xsl:value-of select="position() -
(if (count(../preceding-sibling::*)) then count(../preceding-sibling::*)+1 else 0)"/>
<xsl:text>|</xsl:text>
<xsl:value-of select="name()"/>
<xsl:text>
</xsl:text>
</xsl:for-each>
</xsl:template>
Which produces the following output:
1|1|b1
1|2|b2
2|1|b1
2|2|b2
2|3|b3

How to fold by a tag a group of selected (neighbor) tags with XSLT1?

I have a set of sequential nodes that must be enclosed into a new element. Example:
<root>
<c>cccc</c>
<a gr="g1">aaaa</a> <b gr="g1">1111</b>
<a gr="g2">bbbb</a> <b gr="g2">2222</b>
</root>
that must be enclosed by fold tags, resulting (after XSLT) in:
<root>
<c>cccc</c>
<fold><a gr="g1">aaaa</a> <b gr="g1">1111</b></fold>
<fold><a gr="g2">bbbb</a> <b gr="g2">2222</b></fold>
</root>
So, I have a "label for grouping" (#gr) but not imagine how to produce correct fold tags.
I am trying to use the clues of this question, or this other one... But I have a "label for grouping", so I understand that my solution not needs the use of key() function.
My non-general solution is:
<xsl:template match="/">
<root>
<xsl:copy-of select="root/c"/>
<fold><xsl:for-each select="//*[#gr='g1']">
<xsl:copy-of select="."/>
</xsl:for-each></fold>
<fold><xsl:for-each select="//*[#gr='g2']">
<xsl:copy-of select="."/>
</xsl:for-each></fold>
</root>
</xsl:template>
I need a general solution (!), looping by all #gr and coping (identity) all context that not have #gr... perhaps using identity transform.
Another (future) problem is to do this recursively, with fold of foldings.
In XSLT 1.0 the standard technique to handle this sort of thing is called Muenchian grouping, and involves the use of a key that defines how the nodes should be grouped and a trick using generate-id to extract just the first node in each group as a proxy for the group as a whole.
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:strip-space elements="*" />
<xsl:output indent="yes" />
<xsl:key name="elementsByGr" match="*[#gr]" use="#gr" />
<xsl:template match="#*|node()" name="identity">
<xsl:copy><xsl:apply-templates select="#*|node()"/></xsl:copy>
</xsl:template>
<!-- match the first element with each #gr value -->
<xsl:template match="*[#gr][generate-id() =
generate-id(key('elementsByGr', #gr)[1])]" priority="2">
<fold>
<xsl:for-each select="key('elementsByGr', #gr)">
<xsl:call-template name="identity" />
</xsl:for-each>
</fold>
</xsl:template>
<!-- ignore subsequent ones in template matching, they're handled within
the first element template -->
<xsl:template match="*[#gr]" priority="1" />
</xsl:stylesheet>
This achieves the grouping you're after, but just like your non-general solution it doesn't preserve the indentation and the whitespace text nodes between the a and b elements, i.e. it will give you
<root>
<c>cccc</c>
<fold>
<a gr="g1">aaaa</a>
<b gr="g1">1111</b>
</fold>
<fold>
<a gr="g2">bbbb</a>
<b gr="g2">2222</b>
</fold>
</root>
Note that if you were able to use XSLT 2.0 then the whole thing becomes one for-each-group:
<xsl:template match="root">
<xsl:for-each-group select="*" group-adjacent="#gr">
<xsl:choose>
<!-- wrap each group in a fold -->
<xsl:when test="#gr">
<fold><xsl:copy-of select="current-group()" /></fold>
</xsl:when>
<!-- or just copy as-is for elements that don't have a #gr -->
<xsl:otherwise>
<xsl:copy-of select="current-group()" />
</xsl:otherwise>
</xsl:choose>
</xsl:for-each-group>
</xsl:template>

XSLT: copy attributes from child element

Input:
<a q='r'>
<b x='1' y='2' z='3'/>
<!-- other a content -->
</a>
Desired output:
<A q='r' x='1' y='2' z='3'>
<!-- things derived from other a content, no b -->
</A>
Could someone kindly give me a recipe?
Easy.
<xsl:template match="a">
<A>
<xsl:copy-of select="#*|b/#*" />
<xsl:apply-templates /><!-- optional -->
</A>
</xsl:template>
The <xsl:apply-templates /> is not necessary if you have no further children of <a> you want to process.
Note
the use of <xsl:copy-of> to insert source nodes into the output unchanged
the use of the union operator | to select several unrelated nodes at once
that you can copy attribute nodes to a new element as long as it is the first thing you do - before you add any child elements.
EDIT: If you need to narrow down which attributes you copy, and which you leave alone, use this (or a variation of it):
<xsl:copy-of select="(#*|b/#*)[
name() = 'q' or name() = 'x' or name() = 'y' or name() = 'z'
]" />
or even
<xsl:copy-of select="(#*|b/#*)[
contains('|q|x|y|z|', concat('|', name(), '|'))
]" />
Note how the parentheses make the predicate apply to all matched nodes.
XSL
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="node()|#*">
<xsl:copy>
<xsl:apply-templates select="node()|#*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="a">
<A>
<xsl:apply-templates select="#*|b/#*|node()"/>
</A>
</xsl:template>
<xsl:template match="b"/>
</xsl:stylesheet>
output
<A q="r" x="1" y="2" z="3"><!-- other a content --></A>

XSLT: add node inner text

This is a slightly version of other question posted here:
XSLT: change node inner text
Imagine i use XSLT to transform the document:
<a>
<b/>
<c/>
</a>
into this:
<a>
<b/>
<c/>
Hello world
</a>
In this case i can't use neither the
<xsl:strip-space elements="*"/>
element or the [normalize-space() != ''] predicate since there is no text in the place where i need to put new text. Any ideas? Thanks.
Here is what I would do:
<xsl:stylesheet
version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>
<!-- identity template to copy everything unless otherwise noted -->
<xsl:template match="node()|#*">
<xsl:copy>
<xsl:apply-templates select="node()|#*"/>
</xsl:copy>
</xsl:template>
<!-- match the first text node directly following a <c> element -->
<xsl:template match="text()[preceding-sibling::node()[1][self::c]]">
<!-- ...and change its contents -->
<xsl:text>Hello world</xsl:text>
</xsl:template>
</xsl:stylesheet>
Note that text nodes contain "surrounding" whitespace - in the sample XML in the question the matched text node is whitespace only, which is why the above works. It will stop to work as soon as the input document looks like this:
<a><b/><c/></a>
because here is no text node following <c>. So if this is too brittle for your use case, an alternative would be:
<!-- <c> nodes get a new adjacent text node -->
<xsl:template match="c">
<xsl:copy-of select="." />
<xsl:text>Hello world</xsl:text>
</xsl:template>
<!-- make sure to remove the first text node directly following a <c> node-->
<xsl:template match="text()[preceding-sibling::node()[1][self::c]]" />
In any case, stuff like the above makes clear why intermixing of text nodes and element nodes is best avoided. This is not always possible (see XHTML). But when you have the chance and the XML is supposed to be purely a container for structural data, staying clear of mixed content makes your life easier.
This transformation inserts the desired text (for generality) after the element named a7:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="node()|#*" name="identity">
<xsl:copy>
<xsl:apply-templates select="node()|#*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="a7">
<xsl:call-template name="identity"/>
<xsl:text>Hello world</xsl:text>
</xsl:template>
</xsl:stylesheet>
when applied on this XML document:
<a>
<a1/>
<a2/>
.....
<a7/>
<a8/>
</a>
the desired result is produced:
<a>
<a1/>
<a2/>
.....
<a7/>Hello world
<a8/>
</a>
Do note:
The use of the identity rule for copying every node of the source XML document.
The overriding of the identity rule by a specific template that carries out the insertion of the new text.
How the identity rule is both applied (on every node) and called by name (for a specific need).
edit: fixed my fail to put proper syntax in.
<xsl:template match='a'>
<xsl:copy-of select="." />
<xsl:text>Hello World</xsl:text>
</xsl:template>
<xsl:template match="#*|node()">
<xsl:copy>
<xsl:apply-templates select="#*|node()"/>
</xsl:copy>
</xsl:template>

How to remove <b/> from a document

I'm trying to have an XSLT that copies most of the tags but removes empty "<b/>" tags. That is, it should copy as-is "<b> </b>" or "<b>toto</b>" but completely remove "<b/>".
I think the template would look like :
<xsl:template match="b">
<xsl:if test=".hasChildren()">
<xsl:element name="b">
<xsl:apply-templates/>
</xsl:element>
</xsl:if>
</xsl:template>
But of course, the "hasChildren()" part doesn't exist ... Any idea ?
dsteinweg put me on the right track ... I ended up doing :
<xsl:template match="b">
<xsl:if test="./* or ./text()">
<xsl:element name="b">
<xsl:apply-templates/>
</xsl:element>
</xsl:if>
</xsl:template>
This transformation ignores any <b> elements that do not have any node child. A node in this context means an element, text, comment or processing instruction node.
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes"/>
<xsl:template match="node()|#*">
<xsl:copy>
<xsl:apply-templates select="node()|#*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="b[not(node()]"/>
</xsl:stylesheet>
Notice that here we use one of the most fundamental XSLT design patterns -- using the identity transform and overriding it for specific nodes.
The overriding template will be selected only for nodes that are elements named "b" and do not have (any nodes as) children. This template is empty (does not have any contents), so the effect of its application is that the matching node is ignored/discarded and is not reproduced in the output.
This technique is very powerful and is widely used for such tasks and also for renaming, changing the contents or attributes, adding children or siblings to any specific node that can be matched (avery type of node with the exception of a namespace node can be used as a match pattern in the "match" attribute of <xsl:template/>
Hope this helped.
Cheers,
Dimitre Novatchev
I wonder if this will work?
<xsl:template match="b">
<xsl:if test="b/text()">
...
See if this will work.
<xsl:template match="b">
<xsl:if test=".!=''">
<xsl:element name="b">
<xsl:apply-templates/>
</xsl:element>
</xsl:if>
</xsl:template>
An alternative would be to do the following:
<xsl:template match="b[not(text())]" />
<xsl:template match="b">
<b>
<xsl:apply-templates/>
</b>
</xsl:template>
You could put all the logic in the predicate, and set up a template to match only what you want and delete it:
<xsl:template match="b[not(node())] />
This assumes that you have an identity template later on in the transform, which it sounds like you do. That will automatically copy any "b" tags with content, which is what you want:
<xsl:template match="#*|node()">
<xsl:copy>
<xsl:apply-templates select="#*|node()"/>
</xsl:copy>
</xsl:template>
Edit: Now uses node() like Dimitri, below.
If you have access to update the original XML, you could try using use xml:space=preserve on the root element
<html xml:space="preserve">
...
</html>
This way, the space in the empty <b> </b> tag is preserved, and so can be distinguished from <b /> in the XSLT.
<xsl:template match="b">
<xsl:if test="text() != ''">
....
</xsl:if>
</xsl:template>