XSLT template match: recipe for moving disallowed axes to predicate - xslt

I understand that the XSLT 1.0 standard disallows most XPath axes in the StepPattern portion of a match expression. (See this question where the recommended alternative was using the desired axis in a Predicate.)
I have a complex XPath expression that returns a node set, node-set-expression. I would like to make a template matching node-set-expression/following-sibling::*. Is there a general way to rewrite this to use Predicates so that it can be used in the match attribute of an XSLT template element?
And equivalently, is there a general way to translate the following:
node-set-expression/preceding-sibling::*
node-set-expression/self-and-following-sibling::* (this is shorthand; I know it's not a valid axis)
If Predicates won't work, are there any other general approaches?

In XSLT 2.0 I tend to handle such cases by preselecting the matching nodes in a global variable:
<xsl:variable name="special-nodes" select="//something/preceding-sibling::*"/>
<xsl:template match="*[. intersect $special-nodes]"/>
In XSLT 3.0 this will simplify further to
<xsl:template match="$special-nodes"/>
An advantage of doing it this way is that searching for the "special nodes" once is likely to be a lot more efficient than testing every node against every such pattern when doing an apply-templates; it also makes the condition clearer, in my view.
The only general solution I know to your question for XSLT 1.0 is to write the pattern as
<xsl:template match="*[count(.|//something/preceding-sibling::*) =
count(//something/preceding-sibling::*)]">
but that really is too horribly inefficient to contemplate.
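When the last step of node-set-expression can be tested from the sibling's side, the axis can often be reversed inside a predicate instead. The following is a minimal sketch, assuming the expression is //something as in the pattern above; the output element names (result, after-something) are made up for illustration. For a more complex expression, each of its steps would have to be reversed in the same way, which may not always be practical.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <result>
      <xsl:apply-templates/>
    </result>
  </xsl:template>

  <!-- Equivalent of //something/following-sibling::* :
       any element that has a preceding sibling named "something". -->
  <xsl:template match="*[preceding-sibling::something]">
    <after-something name="{name()}"/>
  </xsl:template>

  <!-- The other two cases follow the same reversal:
       //something/preceding-sibling::*  becomes  *[following-sibling::something]
       "self and following siblings"     becomes  *[self::something or preceding-sibling::something] -->

  <!-- Elements not matched above are traversed without producing output. -->
  <xsl:template match="*">
    <xsl:apply-templates select="*"/>
  </xsl:template>

</xsl:stylesheet>
```

Only one of the reversed patterns is active here, because an element that has a "something" sibling on both sides would match two of them with the same default priority.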

Related

How do I match an element that has a certain sibling element in xslt/xpath

I'm trying to match all a/c elements that have a b sibling. I've tried:
<xsl:template match="a/b/../c">
but I get "Cannot convert the expression {..} to a pattern" from Saxon.
My XSLT/XPath skills are basic at best...
<xsl:template match="a[b]/c">
Explanation: match any c element that is a child of an a element that has a b child.
You should also be able to use .. in a predicate:
<xsl:template match="a/c[../b]">
which is similar to what you were trying.
The reason you can't use .. directly in a match pattern (i.e. outside of a predicate) is that patterns are not XPath expressions, even though they look and behave very similarly. In particular, only "downward" axes (child::, descendant::, and attribute::) are allowed directly in patterns. (The only explicit axes allowed by the spec are child:: and attribute::. descendant:: is implicitly allowed via the // between pattern steps. Saxon seems to bend the rules a bit here, allowing an explicit descendant:: axis, and even descendant-or-self::!)
The reason given for the restriction on axes is that (a) other axes are rarely needed in patterns, and (b) this allows XSLT processors to be much more efficient in testing for matches.
But predicates in patterns are defined with the same syntax rules as in XPath expressions (with some restrictions in XSLT 1.0, like not allowing variable references or current()). So inside a predicate you can use other axes, such as parent:: or its abbreviation, .. (two dots).
This XPath expression seems to do the job (though it may not be optimal):
//a|//c[following-sibling::b or preceding-sibling::b]
Edit:
In case LarsH is right, it should be //a/c instead of //a|//c.

What are the differences between 'call-template' and 'apply-templates' in XSL?

I am new in XSLT so I'm little bit confused about the two tags,
<xsl:apply-templates name="nodes">
and
<xsl:call-template select="nodes">
So can you list out the difference between them?
<xsl:call-template> is a close equivalent to calling a function in a traditional programming language.
You can define functions in XSLT, like this simple one that outputs a string.
<xsl:template name="dosomething">
<xsl:text>A function that does something</xsl:text>
</xsl:template>
This function can be called via <xsl:call-template name="dosomething">.
<xsl:apply-templates> is a little different, and in it lies the real power of XSLT: it takes any number of XML nodes (whatever you define in the select attribute), finds the best-matching template for each of them, and processes each node with that template. You could say that apply-templates works like a loop, but that is not exactly the case: the processor may evaluate the nodes in any order, even in parallel, although the results appear in the order of the selected nodes. For example:
<!-- sample XML snippet -->
<xml>
<foo /><bar /><baz />
</xml>
<!-- sample XSLT snippet -->
<xsl:template match="xml">
<xsl:apply-templates select="*" /> <!-- three nodes selected here -->
</xsl:template>
<xsl:template match="foo"> <!-- will be called once -->
<xsl:text>foo element encountered</xsl:text>
</xsl:template>
<xsl:template match="*"> <!-- will be called twice -->
<xsl:text>other element encountered</xsl:text>
</xsl:template>
This way you give up a little control to the XSLT processor: you no longer decide where the program flow goes; the processor does, by finding the most appropriate match for the node it is currently processing.
If multiple templates can match a node, the one with the more specific match expression wins. If more than one matching template with the same specificity exists, the one declared last wins.
You can concentrate more on developing templates and need less time to do "plumbing". Your programs will become more powerful and modularized, less deeply nested and faster (as XSLT processors are optimized for template matching).
A concept to understand with XSLT is that of the "current node". With <xsl:apply-templates> the current node moves on with every iteration, whereas <xsl:call-template> does not change the current node. I.e. the . within a called template refers to the same node as the . in the calling template. This is not the case with apply-templates.
This is the basic difference. There are some other aspects of templates that affect their behavior: Their mode and priority, the fact that templates can have both a name and a match. It also has an impact whether the template has been imported (<xsl:import>) or not. These are advanced uses and you can deal with them when you get there.
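The "current node" difference can be seen in a small stylesheet. This is only a sketch: the template name report is a made-up example, and the input is assumed to be the sample XML snippet above (an xml element with foo, bar and baz children).

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="xml">
    <!-- call-template: the current node stays the xml element -->
    <xsl:call-template name="report"/>
    <!-- apply-templates: each selected child becomes the current node in turn -->
    <xsl:apply-templates select="*"/>
  </xsl:template>

  <!-- A template may carry both a name and a match pattern. -->
  <xsl:template name="report" match="*">
    <xsl:value-of select="name()"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

Run against that sample, the first line of output is xml (the called template still sees the caller's node), followed by foo, bar and baz, one per line (the applied template sees each child).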
To add to the good answer by @Tomalak:
Here are some unmentioned and important differences:
xsl:apply-templates is much richer and deeper than xsl:call-template, and even than xsl:for-each, simply because we don't know what code will be applied to the nodes of the selection; in the general case this code will be different for different nodes of the node-list.
The code that will be applied can be written long after the xsl:apply-templates instruction was written, and by people who do not know the original author.
The FXSL library's implementation of higher-order functions (HOF) in XSLT wouldn't be possible if XSLT didn't have the <xsl:apply-templates> instruction.
Summary: Templates and the <xsl:apply-templates> instruction are how XSLT implements and deals with polymorphism.
Reference: See this whole thread: http://www.biglist.com/lists/lists.mulberrytech.com/xsl-list/archives/200411/msg00546.html
xsl:apply-templates is usually (but not necessarily) used to process all or a subset of children of the current node with all applicable templates. This supports the recursiveness of XSLT application which is matching the (possible) recursiveness of the processed XML.
xsl:call-template on the other hand is much more like a normal function call. You execute exactly one (named) template, usually with one or more parameters.
So I use xsl:apply-templates if I want to intercept the processing of an interesting node and (usually) inject something into the output stream. A typical (simplified) example would be
<xsl:template match="foo">
<bar>
<xsl:apply-templates/>
</bar>
</xsl:template>
whereas with xsl:call-template I typically solve problems like adding the text of some subnodes together, transforming select nodesets into text or other nodesets and the like - anything you would write a specialized, reusable function for.
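As an illustration of that "reusable function" style, here is a small sketch of a named template with parameters. The names join-text, nodes and separator, and the order/item input structure, are assumptions made up for this example.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Caller: passes a node-set to the named template. -->
  <xsl:template match="/order">
    <xsl:call-template name="join-text">
      <xsl:with-param name="nodes" select="item"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Named template: joins the string values of $nodes with $separator. -->
  <xsl:template name="join-text">
    <xsl:param name="nodes"/>
    <xsl:param name="separator" select="', '"/>
    <xsl:for-each select="$nodes">
      <xsl:value-of select="."/>
      <!-- no separator after the last node -->
      <xsl:if test="position() != last()">
        <xsl:value-of select="$separator"/>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

For an input such as <order><item>a</item><item>b</item></order> this produces a, b. The separator parameter has a default, so callers only need to pass it when they want something else.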
Edit:
As an additional remark to your specific question text:
<xsl:call-template name="nodes"/>
This calls a template which is named 'nodes':
<xsl:template name="nodes">...</xsl:template>
This is a different semantic than:
<xsl:apply-templates select="nodes"/>
...which applies all templates to all children of your current XML node whose name is 'nodes'.
The functionality is indeed similar (apart from the calling semantics, where call-template requires a name attribute and a corresponding named template).
However, the processor will not execute them the same way.
From MSDN:
Unlike <xsl:apply-templates>, <xsl:call-template> does not change the current node or the current node-list.

Boolean expressions in XSLT select statements

I have the following XSLT code that almost does what I want:
<xsl:variable name="scoredItems"
select=
".//item/attributes/scored[@value='true'] |
self::section[attributes/variable_name/@value='SCORE']/item |
.//item//variables//variable_name"/>
I want to change this to a more complicated boolean expression:
<xsl:variable name="scoredItems"
select=
".//item/attributes/scored[@value='true'] or
(self::section[variable_name/@value='SCORE']/item and
(not (.//item/attributes/scored[@value='false']))) or
.//item//variables//variable_name"/>
However, when I run this, I get the following error:
javax.xml.transform.TransformerConfigurationException: Could not compile stylesheet
at org.apache.xalan.xsltc.trax.TransformerFactoryImpl.newTemplates(TransformerFactoryImpl.java:832)
at org.apache.xalan.xsltc.trax.TransformerFactoryImpl.newTransformer(TransformerFactoryImpl.java:618)
How do I fix this? (Note that I'm using XSLT 1.0.)
In my experience, the default exception thrown by XSLT in Java is not very helpful. You'll need to implement an instance of ErrorListener and use its methods to capture and report the true XSLT problem. You can attach this ErrorListener using the setErrorListener method of your TransformerFactory.
I would strongly discourage anyone from writing complicated expressions -- in any language!
This is not an XSLT question at all. It is a general programming question and the answer is:
Never write overly complicated expressions, because they are hard to write, read, test, verify, prove, and change.
Split a complicated expression onto a number of simpler expressions and assign them to different variables. Then operate on these variables.
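Applied to the expression from the question, the splitting might look like the sketch below. It rests on one reading of the intent, namely that scoredItems should remain a node-set: in XPath 1.0 the or and and operators always return a boolean, so the pieces are combined with the union operator and the "not scored false" condition becomes a predicate. The helper variable names (flaggedTrue, sectionItems, namedVariables) are made up, and the paths are copied from the first version of the expression.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="section">
    <!-- scored elements explicitly flagged as true -->
    <xsl:variable name="flaggedTrue"
                  select=".//item/attributes/scored[@value='true']"/>
    <!-- items of a SCORE section, unless explicitly flagged as not scored -->
    <xsl:variable name="sectionItems"
                  select="self::section[attributes/variable_name/@value='SCORE']
                          /item[not(attributes/scored/@value='false')]"/>
    <!-- variable_name elements nested under an item's variables -->
    <xsl:variable name="namedVariables"
                  select=".//item//variables//variable_name"/>
    <!-- the union operator keeps the result a node-set -->
    <xsl:variable name="scoredItems"
                  select="$flaggedTrue | $sectionItems | $namedVariables"/>
    <xsl:value-of select="count($scoredItems)"/>
  </xsl:template>

</xsl:stylesheet>
```

Each intermediate variable can be inspected on its own (for example with count()), which makes it easier to see which part of the condition is misbehaving.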

When the same XML element matches two XSLT templates through different XPaths, which template executes and why?

Consider this XML:
<people>
<person>
<firstName>Deane</firstName>
<lastName>Barker</lastName>
</person>
</people>
What if two XSLT templates match an element through different XPaths? I know that if the "match" element on two templates is identical (which should never happen, I don't think), the last template will fire.
However, consider this XSL:
<xsl:template match="person/firstName">
Template #1
</xsl:template>
<xsl:template match="firstName">
Template #2
</xsl:template>
The "firstName" element will match on either of these templates -- the first one as a child of "person" and the second one standalone.
I have tested this, and Template #1 executes, while Template #2 does not. What is the operative principle behind this? I can think of three things:
Specificity of XPath (highly doubtful)
Location in the XSLT file (also doubtful)
Some pre-emption of Template #2 by Template #1. Something happens during the execution of Template #1 that tells Template #2 not to execute.
Your first point is actually correct: there is a defined order, described in https://www.w3.org/TR/1999/REC-xslt-19991116#conflict. According to the spec, person/firstName has a default priority of 0.5 while firstName has a default priority of 0. You can also specify the priority yourself using the priority attribute on xsl:template.
I know that if the "match" element on two templates is identical (which should never happen, I don't think)
This can happen, but there would not be much point in having two templates with identical match patterns.
From the spec:
It is an error if this leaves more than one matching template rule. An XSLT processor may signal the error; if it does not signal the error, it must recover by choosing, from amongst the matching template rules that are left, the one that occurs last in the stylesheet.
So in other words, you may get an error, or the processor will just use the last template in your XSLT, depending on how the processor you are using has been written to handle this situation.
Note that the value of the match attribute is not an XPath expression (though it uses a subset of XPath syntax). It's an XSLT pattern. Absent explicit priority attributes, the choice comes down to which pattern has the highest default priority:
person/firstName has a default priority of .5
firstName has a default priority of 0
Thus, person/firstName wins.
A complete explanation of how conflict resolution works can be found here (although I recommend you study the entire chapter, "How XSLT Works"): Conflict Resolution for Template Rules
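To see the priority mechanism at work, the default ordering can be overridden with an explicit priority attribute. The following sketch assumes the people/person sample from the question as input; the value 1 is an arbitrary choice that is simply higher than 0.5.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <!-- drop whitespace-only text nodes so only template output remains -->
  <xsl:strip-space elements="*"/>

  <!-- multi-step pattern: default priority 0.5 -->
  <xsl:template match="person/firstName">Template #1</xsl:template>

  <!-- single name test: default priority would be 0, but the explicit
       priority of 1 makes this rule win for firstName elements -->
  <xsl:template match="firstName" priority="1">Template #2</xsl:template>

  <!-- suppress lastName so its text is not copied by the built-in rules -->
  <xsl:template match="lastName"/>

</xsl:stylesheet>
```

With the priority attribute in place the output is Template #2; removing it brings back Template #1, as observed in the question.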
Context does not play a role here. For each node selected by xsl:apply-templates, the processor picks exactly one template rule (the one that wins conflict resolution) and ignores the others, so Template #2 is never tried for that node.
If you want both to execute, apply templates to the node twice and put the second template in its own mode (here the made-up mode name "plain"; Template #2 would then be declared as <xsl:template match="firstName" mode="plain">):
<xsl:template match="people">
<xsl:apply-templates select="person/firstName"/>
<xsl:apply-templates select="person/firstName" mode="plain"/>
</xsl:template>

XSL: Ignoring/stripping namespaces in secondary documents

I am writing an XSL template that pulls data from many secondary sources. An example secondary document looks like this:
<toplevel xmlns:foo1="http://foo1">
<path xmlns="http://foo1">
<mytag>bar</mytag>
</path>
</toplevel>
In the XSL, I am doing this:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:foo1="http://foo1"
exclude-result-prefixes="foo1">
<xsl:variable name="secondary1" select="document('secondary1.xml')/toplevel"/>
<xsl:template match="/">
<foo>
<xsl:value-of select="$secondary1//foo1:path/foo1:mytag"/>
</foo>
</xsl:template>
</xsl:stylesheet>
With a lot of secondary sources, each of which uses a different namespace, prefixing every single tag is tedious, and that much repetition can't be the right thing to do anyway. Is there a way to use document() such that the namespace of the imported node-set is stripped (or to achieve the same effect another way)?
In XPath/XSLT 1.0, to select a namespace-qualified element by name, you have to use a prefix. In XSLT 2.0, you can use the xpath-default-namespace feature, which allows you to set the default namespace for XPath expressions, so you don't have to use prefixes anymore. See XSLT 2.0: xpath-default-namespace for more details. You can use this attribute on any element in your stylesheet, and it takes effect for all descendant elements unless overridden. (Qualify it with xsl: when you want to put it on a non-XSLT element, i.e. a literal result element.)
In XPath 1.0, you can also select elements by local name rather clumsily using, for example, *[local-name() = 'path']/*[local-name() = 'mytag']. In XPath 2.0, for greater succinctness, you can use namespace wildcards, as in *:path/*:mytag, as described here. This was a somewhat controversial addition, since it seems to encourage and/or justify the same dubious use of namespaces that your system is apparently employing.
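For the sample document from the question, the xpath-default-namespace approach might look like the sketch below. It needs an XSLT 2.0 processor. The attribute is placed on the literal result element (hence the xsl: prefix) rather than on xsl:stylesheet, because in that sample toplevel is in no namespace while path and mytag are in http://foo1.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- toplevel is in no namespace, so it is selected outside the scope
       of the default namespace set below -->
  <xsl:variable name="secondary1" select="document('secondary1.xml')/toplevel"/>

  <xsl:template match="/">
    <!-- unprefixed element names in XPath expressions inside foo
         are looked up in http://foo1 -->
    <foo xsl:xpath-default-namespace="http://foo1">
      <xsl:value-of select="$secondary1//path/mytag"/>
    </foo>
  </xsl:template>

</xsl:stylesheet>
```

With several secondary documents, each block that reads from one of them could carry its own xpath-default-namespace value, which avoids declaring and repeating a prefix per source.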
In essence, a node with a namespace is an entirely different animal than a node with another namespace - even if they happen to share the same local name. (This is much the same way namespaces work everywhere else - there is really no easy way of "ignoring" namespaces. Think of ignoring namespaces when referring to classes in C#.)
The clean approach would be to mention each namespace you might encounter in the XSLT and work with prefixes, even if it seems repetitive.
The not-so-clean way is this:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>
<xsl:variable name="secondary1" select="document('secondary1.xml')"/>
<xsl:template match="/">
<foo source="1">
<xsl:value-of select="
$secondary1//*[local-name() = 'path']/*[local-name() = 'mytag']
"/>
</foo>
</xsl:template>
</xsl:stylesheet>
This is not really more pleasing to the eye than working with prefixes: it's longer and harder to read, it is ambiguous, and last but not least, it is slower because the engine must test a predicate at every step of the XPath. Take your pick.