I have a situation where I think I need to daisy chain my XSLT transformations (i.e. the output of one XSLT transform becomes the input of another). The first transform is rather complex, with lots of xsl:choose and ancestor XPaths. My thought is to transform the XML into XML that can then be easily transformed to HTML.
My question is 'Is this standard practice or am I missing something?'
Thanks in advance.
Stephen
Chaining transformations is common in XSLT applications, though doing it entirely within one XSLT 1.0 stylesheet requires a vendor-specific xxx:node-set() extension function. In XSLT 2.0 no such extension is needed, because the infamous RTF (result tree fragment) datatype has been eliminated there.
Here is an example (too-simple to be meaningful, but illustrating completely how this is done):
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:ext="http://exslt.org/common">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="/">
<xsl:variable name="vrtfPass1">
<xsl:apply-templates select="/*/*"/>
</xsl:variable>
<xsl:variable name="vPass1"
select="ext:node-set($vrtfPass1)"/>
<xsl:apply-templates mode="pass2"
select="$vPass1/*"/>
</xsl:template>
<xsl:template match="num[. mod 2 = 1]">
<xsl:copy-of select="."/>
</xsl:template>
<xsl:template match="num" mode="pass2">
<xsl:copy>
<xsl:value-of select=". *2"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
when this transformation is applied on the following XML document:
<nums>
<num>01</num>
<num>02</num>
<num>03</num>
<num>04</num>
<num>05</num>
<num>06</num>
<num>07</num>
<num>08</num>
<num>09</num>
<num>10</num>
</nums>
the wanted, correct result is produced:
<num>2</num>
<num>6</num>
<num>10</num>
<num>14</num>
<num>18</num>
Explanation:
In the first step the XML document is transformed and the result is defined as the value of the variable $vrtfPass1. This copies only the num elements that have odd value (not even).
The $vrtfPass1 variable, being of type RTF, is not directly usable for XPath expressions so we convert it to a normal tree, using the EXSLT (implemented by most XSLT 1.0 processors) function ext:node-set and defining another variable -- $vPass1 whose value is this tree.
We now perform the second transformation in our chain of transformations -- on the result of the first transformation, that is kept as the value of the variable $vPass1. Not to mess with the first-pass template, we specify that the new processing should be in a named mode, called "pass2". In this mode the value of any num element is multiplied by two.
XSLT 2.0 solution (no RTFs):
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="/">
<xsl:variable name="vPass1" >
<xsl:apply-templates select="/*/*"/>
</xsl:variable>
<xsl:apply-templates mode="pass2"
select="$vPass1/*"/>
</xsl:template>
<xsl:template match="num[. mod 2 = 1]">
<xsl:copy-of select="."/>
</xsl:template>
<xsl:template match="num" mode="pass2">
<xsl:copy>
<xsl:value-of select=". *2"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
If this is your situation (or may become your situation):
Transform initial xml to mediary xml.
Maybe transform mediary xml into final1_html.
Maybe transform mediary xml into final2_html (not at all like final1_html).
or
Transform initial xml into mediary xml. This is reasonably likely to change over time.
Transform mediary xml to final_html. This is not likely to change over time.
Then it makes sense to use a two step transformation.
If this is your situation:
Transform initial xml to mediary xml.
Transform mediary xml to final_html.
Then consider not two stepping. Instead just perform one transformation.
I wouldn't think it was standard practice, in particular since you can transform one XML dialect directly to another.
However, if the processing is complex, splitting it to several steps (applying a different transform in each step) can indeed simplify each step and make sense.
It really depends on the particular situation.
I have the following XML:
<?xml version="1.0" encoding="UTF-8"?>
<XmlTest>
<Pictures attr="Pic1">Picture 1</Pictures>
<Pictures attr="Pic2">Picture 2</Pictures>
<Pictures attr="Pic3">Picture 3</Pictures>
</XmlTest>
While this XSL does what is expected (output the attr of the first picture):
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/XmlTest">
<xsl:variable name="FirstPicture" select="Pictures[1]">
</xsl:variable>
<xsl:value-of select="$FirstPicture/@attr"/>
</xsl:template>
</xsl:stylesheet>
It seems to be not possible to do the same inside the variable declaration using xsl:copy-of:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >
<xsl:template match="/XmlTest">
<xsl:variable name="FirstPicture">
<xsl:copy-of select="Pictures[1]"/>
</xsl:variable>
<xsl:value-of select="$FirstPicture/@attr"/>
</xsl:template>
</xsl:stylesheet>
Curious:
If I just select "$FirstPicture" instead of "$FirstPicture/@attr" in the second example, it outputs the text node of Picture 1 as expected...
Before you all suggest me to rewrite the code:
This is just a simplified test, my real aim is to use a named template to select a node into the variable FirstPicture and reuse it for further selections.
I hope someone can help me understand this behavior, or suggest a proper way to select a node with code that can easily be reused (the decision about which node is the first one is complex in my real application). Thanks.
Edit (thanks to Martin Honnen):
This is my working solution example (which additionally uses a separate template to select the requested picture node), using the MS XSLT processor:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt"
version="1.0">
<xsl:template match="/XmlTest">
<xsl:variable name="FirstPictureResultTreeFragment">
<xsl:call-template name="SelectFirstPicture">
<xsl:with-param name="Pictures" select="Pictures" />
</xsl:call-template>
</xsl:variable>
<xsl:variable name="FirstPicture" select="msxsl:node-set($FirstPictureResultTreeFragment)/*"/>
<xsl:value-of select="$FirstPicture/@attr"/>
<!-- further operations on the $FirstPicture node -->
</xsl:template>
<xsl:template name="SelectFirstPicture">
<xsl:param name="Pictures"/>
<xsl:copy-of select="$Pictures[1]"/>
</xsl:template>
</xsl:stylesheet>
It is not nice that XSLT 1.0 cannot output a node directly from a template, but with the extra variable it is at least possible.
Well with an XSLT 1.0 processor if you do
<xsl:variable name="FirstPicture">
<xsl:copy-of select="Pictures[1]"/>
</xsl:variable>
the variable is a result tree fragment, and all you can do with that in pure XSLT 1.0 is output it with copy-of (or value-of). If you want to apply XPath, you first need to convert the result tree fragment into a node set. Most XSLT 1.0 processors support an extension function for that, so try
<xsl:variable name="FirstPictureRtf">
<xsl:copy-of select="Pictures[1]"/>
</xsl:variable>
<xsl:variable name="FirstPicture" select="exsl:node-set($FirstPictureRtf)/Pictures/@attr"/>
where you define xmlns:exsl="http://exslt.org/common" in your stylesheet.
Note that you will need to check whether your XSLT 1.0 processor supports the EXSLT extension function or a similar one in another namespace (as for instance the various MSXML versions do).
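The check mentioned in the note above can be done inside the stylesheet itself with the standard XSLT 1.0 function function-available(). The following is only a sketch under assumptions: it tries the EXSLT function first, then the MSXML one, and the text output method and the error message are chosen just for illustration.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    xmlns:msxsl="urn:schemas-microsoft-com:xslt"
    exclude-result-prefixes="exsl msxsl">
  <xsl:output method="text"/>

  <xsl:template match="/XmlTest">
    <!-- In XSLT 1.0 this variable is a result tree fragment -->
    <xsl:variable name="FirstPictureRtf">
      <xsl:copy-of select="Pictures[1]"/>
    </xsl:variable>

    <!-- Only the branch whose test succeeds is evaluated, so the
         unsupported extension function is never actually called -->
    <xsl:choose>
      <xsl:when test="function-available('exsl:node-set')">
        <xsl:value-of select="exsl:node-set($FirstPictureRtf)/Pictures/@attr"/>
      </xsl:when>
      <xsl:when test="function-available('msxsl:node-set')">
        <xsl:value-of select="msxsl:node-set($FirstPictureRtf)/Pictures/@attr"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:message terminate="yes">No node-set() extension function is available.</xsl:message>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

Applied to the XmlTest document from the question, this should output Pic1 on any processor that supports one of the two functions.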
I am trying to transform
<Address>
<Line>Some street1</Line>
<Line>Some street2</Line>
<Line>Some street3</Line>
...
</Address>
into
<Address1>Some street1</Address1>
<Address2>Some street2</Address2>
<Address3>Some street3</Address3>
<Address4></Address4>
<Address5></Address5>
The first XML is malleable and can be redefined if necessary; however, the second XML is part of a legacy system which cannot be changed.
Most of what I find, correctly, points me to using attributes, but unfortunately it's the element itself that I wish to edit.
Would anyone be able to assist or if not, point me in the right direction?
As easy as this, and probably the shortest solution:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="Line">
<xsl:element name="Address{position()}"><xsl:apply-templates/></xsl:element>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied on the provided XML document:
<Address>
<Line>Some street1</Line>
<Line>Some street2</Line>
<Line>Some street3</Line>
</Address>
the wanted, correct result is produced:
<Address1>Some street1</Address1>
<Address2>Some street2</Address2>
<Address3>Some street3</Address3>
Explanation:
Proper use of xsl:element and AVTs (Attribute Value Templates).
Have a look at the <xsl:element> element. In its name attribute, you can also supply an expression that is computed while running the XSLT:
<xsl:template match="Line">
<xsl:element name="{concat('Address', position())}"><xsl:value-of select="text()"/></xsl:element>
</xsl:template>
Update: position() is one-based.
It can be done by mangling a new element with the current position() :
<xsl:template match="/Address">
<Addresses>
<xsl:for-each select="Line">
<xsl:variable name="elename" select="concat('Address', string(position()))"></xsl:variable>
<xsl:element name="{$elename}">
<xsl:value-of select="text()"/>
</xsl:element>
</xsl:for-each >
</Addresses>
</xsl:template>
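None of the stylesheets above emits the empty Address4 and Address5 elements shown in the question. Here is a rough sketch of one way to pad the output in XSLT 1.0, assuming the legacy format always expects exactly five elements. The parameter name pTotal and the template name pad are made up for this illustration, and the Addresses wrapper is taken from the last answer.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <!-- Assumed number of AddressN elements the legacy system expects -->
  <xsl:param name="pTotal" select="5"/>

  <xsl:template match="/Address">
    <Addresses>
      <!-- One numbered element per existing Line -->
      <xsl:for-each select="Line">
        <xsl:element name="Address{position()}">
          <xsl:value-of select="."/>
        </xsl:element>
      </xsl:for-each>
      <!-- Fill the remaining slots with empty elements -->
      <xsl:call-template name="pad">
        <xsl:with-param name="n" select="count(Line) + 1"/>
      </xsl:call-template>
    </Addresses>
  </xsl:template>

  <!-- Recursively emits empty Address{n} elements up to $pTotal -->
  <xsl:template name="pad">
    <xsl:param name="n"/>
    <xsl:if test="$n &lt;= $pTotal">
      <xsl:element name="Address{$n}"/>
      <xsl:call-template name="pad">
        <xsl:with-param name="n" select="$n + 1"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

The empty elements will typically be serialized as <Address4/> rather than <Address4></Address4>; the two forms are equivalent XML.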
I have to write an XSLT without knowing the input XML. So I want to start by writing an XSLT that will simply return the input XML without any transformation. Can I do that?
Look at this:
http://mrhaki.blogspot.com/2008/07/copy-xml-as-is-with-xslt.html
<xsl:template match="/">
<xsl:copy-of select="."/>
</xsl:template>
What you want to do is known as the Identity Transform. To be general, you need to ensure that all attribute and non-attribute nodes are copied, recursively:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Note that the identity transform does not guarantee that the output is identical on the surface level (i.e. some hash calculation might yield a different result, for instance). E.g. attributes could be reordered - this has no impact on the infoset or validity.
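The usual reason to start from the identity transform rather than from a plain copy-of is that you can later add more specific templates that override it for individual nodes. A minimal sketch, assuming a hypothetical element named obsolete that should be dropped once the input format is known:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity rule: copies every attribute and node unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Hypothetical override: an empty template removes every
       "obsolete" element (and its subtree) from the output -->
  <xsl:template match="obsolete"/>
</xsl:stylesheet>
```

The more specific pattern has a higher default priority than @*|node(), so it wins for the elements it matches and everything else is still copied as is.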
Suppose I have the following XML (which is the embedding of TEI annotation scheme into HTML):
<p>(See, for example, <bibl type="journal" xmlns="http://www.tei-c.org/ns/1.0"><author>Greger IH, et al.</author> <date>2007</date>, <title>Trends Neurosci.</title> <biblScope type="vol">30</biblScope> (<biblScope type="issue">8</biblScope>): <biblScope type="pp">407-16</biblScope></bibl>).</p>
Now I want to copy all annotation nodes as is into resulting XHTML but only rename <title> to <bibTitle> (as <title> is only allowed in <head>), so I used the following transformation:
<xsl:template match="tei:bibl/descendant-or-self::*">
<xsl:variable name="nodeName">
<xsl:choose>
<xsl:when test="name() = 'title'">bibTitle</xsl:when>
<xsl:otherwise><xsl:value-of select="name()" /></xsl:otherwise>
</xsl:choose>
</xsl:variable>
<!-- Changing of the namespace occurs here, but we don't care -->
<xsl:element name="{$nodeName}">
<xsl:copy-of select="@*" />
<xsl:apply-templates />
</xsl:element>
</xsl:template>
<xsl:template match="p/text()|tei:bibl//text()">
<xsl:copy-of select="." />
</xsl:template>
However it does not compile and breaks with following error:
Only child:: and attribute:: axes are allowed in match patterns! Offending axes = descendant-or-self
When I change the match rule to <xsl:template match="tei:bibl|tei:bibl//*"> it starts working as intended. But that should be identical to descendant-or-self::*, right? Have I hit the transformer implementation limitation here?
First I've tested with Mozilla 3.5 internal transformer, then with Xalan 2.7.1 – same negative result.
This limitation applies only to the location steps of a template's match pattern. It is by design (mandated by the W3C XSLT 1.0 and XSLT 2.0 specifications) -- to ensure efficient XSLT processing.
Do note: One can freely use any axis (including descendant-or-self::) within the predicates that follow any location step.
Update:
Here is a short, complete example of using the descendant-or-self:: axis inside a predicate in the match attribute of xsl:template:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="num[descendant-or-self::num > 5]"/>
</xsl:stylesheet>
when this transformation is applied on the following XML document:
<nums>
<num>1</num>
<num>2</num>
<num>3</num>
<num>4</num>
<num>5</num>
<num>6</num>
<num>7</num>
<num>8</num>
<num>9</num>
<num>10</num>
</nums>
the wanted result is produced: any num elements with value > 5 are deleted:
<nums>
<num>1</num>
<num>2</num>
<num>3</num>
<num>4</num>
<num>5</num>
</nums>
It's a hard requirement by the spec:
Although patterns must not use the descendant-or-self axis, patterns may use the // operator as well as the / operator.
Here's a way you can rewrite the pattern into an equivalent one that doesn't repeat mention of tei:bibl:
<xsl:template match="*[ancestor-or-self::tei:bibl]">
As to why the limitation is there, the general answer, yes, is for performance. Perhaps the limitations are overly conservative, because, as you pointed out, the rewrite of descendant-or-self in this case is trivial.
I regularly get annoyed by this limitation (that you can use // but not descendant).
Here is a case where it is not enough:
<a>
<b>
<c>
<c>
<c/>
</c>
</c>
</b>
<c>
<c>
<c/>
</c>
</c>
</a>
now I want to match only:
*[self::a or self::b][p(.)]/c/descendant-or-self::c
i.e., if the predicate p(.) is true on a, I want a/c, a/c/c, a/c/c/c and if it is true on b, I want b/c, b/c/c, and b/c/c/c.
But I do not want a/b/c, a/b/c/c, etc. just because the predicate matches on a and not on b.
If I make a match pattern:
*[self::a or self::b][p(.)]//c
then I match all of them which I do not want.
So I have to do it backwards in the bracket:
c[ancestor-or-self::c/parent::*[self::a or self::b][p(.)]]
I think I just convinced myself that this restriction isn't really a logical restriction. However, I think the excuse for not allowing proper axis steps in match patterns is pretty lame: when I need this, I need it, and I don't care if it is not as fast as a simpler expression.
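For illustration, here is a sketch of the "backwards" pattern in a complete stylesheet. Since p(.) is only a placeholder, I assume a hypothetical attribute test, @flag = 'yes', in its place; the matched attribute added to the output is likewise made up so that the effect is visible.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <!-- Identity rule for everything that is not matched below -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Matches a c element if it, or one of its c ancestors, is a direct
       child of an a or b element that satisfies the (assumed) predicate -->
  <xsl:template match="c[ancestor-or-self::c/parent::*[self::a or self::b][@flag = 'yes']]">
    <xsl:copy>
      <xsl:attribute name="matched">yes</xsl:attribute>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

With flag="yes" set only on b in the document above, the three c elements below b get matched="yes", while the three c elements that hang directly off a are copied unchanged.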
Using pure XSLT 1.0, how can I conditionally assign a node to a variable? I am trying something like this, but it's not working.
<xsl:variable name="topcall" select="//topcall"/>
<xsl:variable name="focusedcall" select="//focusedcall" />
<xsl:variable name="firstcall" select="$topcall | $focusedcall"/>
For the variable firstcall, I want conditional node selection: if there is a topcall, assign it to firstcall; otherwise assign focusedcall to firstcall.
This should work:
<xsl:variable name="firstcall" select="$topcall[$topcall] |
$focusedcall[not($topcall)]" />
In other words, select $topcall if $topcall nodeset is non-empty; $focusedcall if $topcall nodeset is empty.
Re-Update regarding "it can be 5-6 nodes":
Given that there may be 5-6 alternatives, i.e. 3-4 more besides $topcall and $focusedcall...
The easiest solution is to use <xsl:choose>:
<xsl:variable name="firstcall">
<xsl:choose>
<xsl:when test="$topcall"> <xsl:copy-of select="$topcall" /></xsl:when>
<xsl:when test="$focusedcall"><xsl:copy-of select="$focusedcall" /></xsl:when>
<xsl:when test="$thiscall"> <xsl:copy-of select="$thiscall" /></xsl:when>
<xsl:otherwise> <xsl:copy-of select="$thatcall" /></xsl:otherwise>
</xsl:choose>
</xsl:variable>
However, in XSLT 1.0, this will convert the output of the chosen result to a result tree fragment (RTF: basically, a frozen XML subtree). After that, you won't be able to use any significant XPath expressions on $firstcall to select things from it. If you need to do XPath selections on $firstcall later, e.g. select="$firstcall[1]", you then have a few options...
Put those selections into the <xsl:when> or <xsl:otherwise> so that they happen before the data gets converted to an RTF. Or,
Consider the node-set() extension, which converts an RTF to a nodeset, so you can do normal XPath selections from it. This extension is available in most XSLT processors but not all. Or,
Consider using XSLT 2.0, where RTFs are not an issue at all. In fact, in XPath 2.0 you can put normal if/then/else conditionals inside the XPath expression if you want to.
Implement it in XPath 1.0, using nested predicates like:
select="$topcall[$topcall] |
($focusedcall[$focusedcall] | $thiscall[not($focusedcall)])[not($topcall)]"
and keep on nesting as deep as necessary. In other words, here I took the XPath expression for 2 alternatives above, and replaced $focusedcall with
($focusedcall[$focusedcall] | $thiscall[not($focusedcall)])
The next iteration, you would replace $thiscall with
($thiscall[$thiscall] | $thatcall[not($thiscall)])
etc.
Of course this becomes hard to read, and error-prone, so I would not choose this option unless the others aren't feasible.
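For comparison, the XSLT 2.0 option mentioned above could look roughly like this. It is a sketch that assumes the four variables select elements with the same names as the variables; only topcall and focusedcall appear in the question, the other two are placeholders.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <xsl:template match="/">
    <xsl:variable name="topcall" select="//topcall"/>
    <xsl:variable name="focusedcall" select="//focusedcall"/>
    <!-- Placeholder alternatives, assumed for illustration -->
    <xsl:variable name="thiscall" select="//thiscall"/>
    <xsl:variable name="thatcall" select="//thatcall"/>

    <!-- XPath 2.0 conditional: the first non-empty alternative wins -->
    <xsl:variable name="firstcall" select="
        if ($topcall) then $topcall
        else if ($focusedcall) then $focusedcall
        else if ($thiscall) then $thiscall
        else $thatcall"/>

    <xsl:copy-of select="$firstcall"/>
  </xsl:template>
</xsl:stylesheet>
```

Because the variable is defined with a select attribute, $firstcall holds the original nodes rather than copies, so further XPath selections from it work without any extension function.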
Does <xsl:variable name="firstcall" select="($topcall | $focusedcall)[1]"/> do what you want? That is usually the way to take the first node in document order of different types of nodes.
I. XSLT 1.0 solution:
This short (30 lines), simple and parameterized transformation works with any number of node types/names:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:param name="pRatedCalls">
<call type="topcall"/>
<call type="focusedcall"/>
<call type="normalcall"/>
</xsl:param>
<xsl:variable name="vRatedCalls" select=
"document('')/*/xsl:param[@name='pRatedCalls']/*"/>
<xsl:variable name="vDoc" select="/"/>
<xsl:variable name="vpresentCallNames">
<xsl:for-each select="$vRatedCalls">
<xsl:value-of select=
"name($vDoc//*[name()=current()/@type][1])"/>
<xsl:text> </xsl:text>
</xsl:for-each>
</xsl:variable>
<xsl:template match="/">
<xsl:copy-of select=
"//*[name()
=
substring-before(normalize-space($vpresentCallNames),' ')]"/>
</xsl:template>
</xsl:stylesheet>
When applied to this XML document (do note the document order doesn't coincide with the specified priorities in the pRatedCalls parameter):
<t>
<normalcall/>
<focusedcall/>
<topcall/>
</t>
produces exactly the wanted, correct result:
<topcall/>
when the same transformation is applied to the following XML document:
<t>
<normalcall/>
<focusedcall/>
</t>
again the wanted and correct result is produced:
<focusedcall/>
Explanation:
The names of the nodes that are to be searched for (as many as needed and in order of priority) are specified by the global (typically externally specified) parameter named $pRatedCalls.
Within the body of the variable $vpresentCallNames we generate a space-separated list of names of elements that are both specified as a value of the type attribute of a call element in the $pRatedCalls parameter and also are names of elements in the XML document.
Finally, we determine the first such name in this space-separated list and select all elements in the document, that have this name.
II. XSLT 2.0 solution:
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:param name="pRatedCalls" select=
"'topcall', 'focusedcall', 'normalcall'"/>
<xsl:template match="/">
<xsl:sequence select=
"//*
[name()=$pRatedCalls
[. = current()//*/name()]
[1]
]"/>
</xsl:template>
</xsl:stylesheet>