I have several templates that match various nodes in an xml document. If I do just an
<xsl:apply-templates/>
it somehow recursively outputs the text of all the nodes beneath. I just want it to recursively match any template I have defined. How do I do that?
This is happening because of the
built-in templates in XSLT. XSLT has a
couple of built in templates, which
say:
when you apply templates to an element, process its child elements
when you apply templates to a text node, give its value
Together, it means that if you apply
templates to an element but don't have
an explicit template for that element,
then its content gets processed and
eventually you end up with the text
that the element contains.
Read the full explanation here: http://www.dpawson.co.uk/xsl/sect2/defaultrule.html
You can override the default templates for text nodes by defining your own template and have it do nothing.
<xsl:template match="text()" />
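As a minimal sketch of how this looks in a whole stylesheet (the element name `title` and the output element `heading` are made up for illustration, they are not from the question): the empty text() rule silences stray text, the built-in element rule keeps recursing, and explicit templates still emit whatever they choose.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Override the built-in text rule: text reached by
       apply-templates is dropped instead of copied. -->
  <xsl:template match="text()"/>

  <!-- Hypothetical element name. xsl:value-of reads the string
       value directly, so it is not affected by the rule above. -->
  <xsl:template match="title">
    <heading><xsl:value-of select="."/></heading>
  </xsl:template>

</xsl:stylesheet>
```

Elements without an explicit template are still walked by the built-in element rule, so a `title` at any depth should be reached.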
This is probably the most frequent problem even experienced XSLT programmers experience.
The observed behavior is exactly how an XSLT-compliant processor should behave.
Take into account that:
<xsl:apply-templates/>
is an abbreviation for:
<xsl:apply-templates select="child::node()"/>
and the existence of the built-in template rules. According to the XSLT 1.0 Spec.:
"5.8 Built-in Template Rules
There is a built-in template rule to allow recursive processing to continue in the absence of a successful pattern match by an explicit template rule in the stylesheet. This template rule applies to both element nodes and the root node. The following shows the equivalent of the built-in template rule:
<xsl:template match="*|/">
<xsl:apply-templates/>
</xsl:template>
There is also a built-in template rule for each mode, which allows recursive processing to continue in the same mode in the absence of a successful pattern match by an explicit template rule in the stylesheet. This template rule applies to both element nodes and the root node. The following shows the equivalent of the built-in template rule for mode m.
<xsl:template match="*|/" mode="m">
<xsl:apply-templates mode="m"/>
</xsl:template>
There is also a built-in template rule for text and attribute nodes that copies text through:
<xsl:template match="text()|@*">
<xsl:value-of select="."/>
</xsl:template>
The built-in template rule for processing instructions and comments is to do nothing.
<xsl:template match="processing-instruction()|comment()"/>
The built-in template rule for namespace nodes is also to do nothing. There is no pattern that can match a namespace node; so, the built-in template rule is the only template rule that is applied for namespace nodes.
The built-in template rules are treated as if they were imported implicitly before the stylesheet and so have lower import precedence than all other template rules. Thus, the author can override a built-in template rule by including an explicit template rule
"
--- End of XSLT Spec quote ---
So, if the author wants to be in full control of the XSLT processing, they should override all built-in templates.
For example, if we do not want text() nodes to be copied to the output, we can cause them to be ignored by overriding the built-in template in the following way:
<xsl:template match="text()" />
You could set a mode to apply only your own templates:
<xsl:template match="* | /" >
<xsl:apply-templates mode="myMode" />
</xsl:template>
<xsl:template match="somenode" mode="myMode">
<!-- do something here -->
</xsl:template>
Another option would be to override the built-in template rules (see e.g. http://unix.com.ua/orelly/xml/xmlnut/ch08_07.htm)
Related
Depending on a specific parameter value, I need to either omit matched elements from a transformed document or allow the matched elements to be transformed according to rules in other matching templates. I have a complex xml doc with dozens of element/attribute types, all handled in a huge variety of ways in dozens of templates. If any element has the attribute value omit_attr="true" and the stylesheet has the parameter omit_param="true", I need to omit them from the transformed document. If however my parameter is omit_param="false" then I need to apply whatever other rules exist for my omit_attr="true" elements. Here's my unacceptable template:
<xsl:template match="*[@omit_attr= 'true']">
<xsl:choose>
<xsl:when test="($omit_param= 'true')"/>
<xsl:otherwise>
<xsl:apply-templates/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
This omits content just fine when omit_param="true". The problem occurs when omit_param="false" and I need to process omit_attr="true" elements with other applicable templates. This code transforms the matched element's children, but not the matched element, which is unacceptable because the matched element will have its own set of transform rules elsewhere. And of course <xsl:apply-templates select="."/> attempts to process the matched element as desired, but creates an endless loop in the process.
So, how can I test for omit_attr="true" in all possible elements, and still apply the alternate templates to omit_attr="true" elements when omit_param="false"?
If I could wrap my xsl:template in a big <if test="omit_param='true'"> statement, that would do the job, but of course it's against the rules...
If that is an XSLT 2 or later processor you can just use
<xsl:template match="*[@omit_attr= 'true' and $omit_param= 'true']"/>
to block the element from being processed/copied/transformed if the two conditions hold, and have your other templates applied if not. XSLT 1 processors are not required to accept this: the XSLT 1.0 specification makes a variable reference in a match pattern an error.
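Another XSLT 2.0 route, offered here only as a sketch and not taken from the answer above, keeps the condition inside the template and hands the element on with xsl:next-match. The priority value and the identity template standing in for "the other rules" are assumptions for illustration.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:param name="omit_param" select="'false'"/>

  <!-- High priority (arbitrary value) so this rule is chosen first
       for every flagged element. -->
  <xsl:template match="*[@omit_attr = 'true']" priority="100">
    <xsl:if test="$omit_param != 'true'">
      <!-- Pass the same element to the next-best matching rule,
           which avoids the endless loop of select="." -->
      <xsl:next-match/>
    </xsl:if>
  </xsl:template>

  <!-- Stand-in for the existing rules: a plain identity copy. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```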
I looked at the spec and some other documents but I don't really get it. I get that now there are sequences instead of node-sets, okay, but what is a sequence constructor (or maybe more easily: what ISN'T a sequence constructor)?
The spec just says
[Definition: A sequence constructor is a sequence of zero or more
sibling nodes in the stylesheet that can be evaluated to return a
sequence of nodes and atomic values. The way that the resulting
sequence is used depends on the containing instruction.]
Many XSLT elements, and also literal result elements, are defined to
take a sequence constructor as their content.
I kinda get that xsl:sequence constructs a sequence, but why is the content of xsl:element a sequence constructor? And what isn't a sequence constructor then?
Thanks for clarifications!
See further down in the spec where it says
The term sequence constructor replaces template as used in XSLT 1.0.
The change is made partly for clarity (to avoid confusion with
template rules and named templates), but also to reflect a more formal
definition of the semantics. Whereas XSLT 1.0 described a template as
a sequence of instructions that write to the result tree, XSLT 2.0
describes a sequence constructor as something that can be evaluated to
return a sequence of items; what happens to these items depends on the
containing instruction.
So where the XSLT 1.0 grammar had for instance
<!-- Category: top-level-element -->
<xsl:template
match = pattern
name = qname
priority = number
mode = qname>
<!-- Content: (xsl:param*, template) -->
</xsl:template>
allowing as the contents of an xsl:template rule a number of xsl:param plus a template the XSLT 2.0 grammar now has
<!-- Category: declaration -->
<xsl:template
match? = pattern
name? = qname
priority? = number
mode? = tokens
as? = sequence-type>
<!-- Content: (xsl:param*, sequence-constructor) -->
</xsl:template>
defining the contents of an xsl:template rule as a number of xsl:param plus a sequence constructor.
That way it is also obvious that xsl:param, for instance, is not part of the sequence constructor, although its own content is of course a sequence constructor.
As for how to construct a sequence, well you can do that with literal result elements and of course with instructions like xsl:value-of, xsl:element, xsl:attribute and so on, as well as with xsl:sequence (which allows you to construct and return (sequences of) primitive values and not only nodes, which was all that was possible in XSLT 1.0).
Pretty much anywhere you can put a literal result element is a sequence constructor, e.g.
<xsl:template match="/">
<a></a>
</xsl:template>
...or
<xsl:variable name="a">
<a></a>
</xsl:variable>
...but not
<xsl:choose>
<a></a>
</xsl:choose>
...because the xsl:choose instruction doesn't allow it:
<xsl:choose>
<!-- Content: (xsl:when+, xsl:otherwise?) -->
</xsl:choose>
So, sequence constructors are places where values, and instructions that create values, go.
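To make the "not only nodes" point concrete, here is a small XSLT 2.0 sketch. The function name `f:squares` and its namespace URI are invented for the example. The body of xsl:function is a sequence constructor, and xsl:sequence lets it return atomic values.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:functions"
    exclude-result-prefixes="xs f">

  <!-- Hypothetical function: returns a sequence of integers,
       not nodes. -->
  <xsl:function name="f:squares" as="xs:integer*">
    <xsl:param name="n" as="xs:integer"/>
    <xsl:sequence select="for $i in 1 to $n return $i * $i"/>
  </xsl:function>

  <!-- The template body is a sequence constructor too: a literal
       result element whose content is another sequence constructor. -->
  <xsl:template match="/">
    <squares>
      <xsl:value-of select="f:squares(3)" separator=","/>
    </squares>
  </xsl:template>

</xsl:stylesheet>
```

Run against any input document, this should produce a `squares` element containing 1,4,9.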
I have an XSLT (1.0) style sheet. It works with no problem. I want to convert it to 2.0 and use xsl:for-each-group (and get high performance). Is it possible? How? Please explain.
I have many places like
<xsl:if test="test condition">
<xsl:for-each select="wo:tent">
<width aidwidth='{/wo:document/styles [@wo:name=current()/@wo:style-name]/@wo:width}'/>
</xsl:for-each>
</xsl:if>
ADDED
<xsl:template match="wo:country">
<xsl:for-each select="@*">
<xsl:copy/>
</xsl:for-each>
<xsl:variable name="states" select="wo:pages[@xil:style = 'topstates' or @xil:style = 'toppage-title']"/>
<xsl:variable name="provinces" select="wo:pages[@xil:style = 'topprovinces']"/>
<xsl:choose>
<xsl:when test="$states">
<xsl:apply-templates select="$states[2]/preceding-sibling::*"/>
<xsl:apply-templates select="$states[2]" mode="states">
<xsl:with-param name="states" select="$states[position() != 0]"/>
</xsl:apply-templates>
</xsl:when>
<xsl:when test="$provinces">
<xsl:apply-templates select="$provinces[2]/preceding-sibling::*"/>
<xsl:apply-templates select="$provinces[2]" mode="provinces">
<xsl:with-param name="provinces" select="$provinces[position() != 2]"/>
</xsl:apply-templates>
</xsl:when>
<xsl:otherwise>
<xsl:apply-templates/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
THE SOURCE
<?xml version="1.0" encoding="UTF-8"?>
<wo:country>
some stuff
</wo:country>
I have assumed that you want an in-depth description of xsl:for-each-group and how to use it. If this is not what you are asking for, then please let me know.
The instruction, new in XSLT 2.0, takes a set of items and groups them. The set of items is called "the population", and the groups are just called groups. The instruction processes each group in turn.
Possible attributes of the xsl:for-each-group instruction include:
select
group-by
group-adjacent
group-starting-with
group-ending-with
collation
@select is mandatory. The others are optional. The instruction can take any number of xsl:sort children (but they must come first), followed by a sequence constructor. A "sequence constructor" is the term for all the sequence-emitting instructions that go inside templates and the like.
@select
The select attribute specifies an XPath expression which evaluates to the population to be grouped.
@group-by
The group-by attribute specifies an XPath expression, which you use when the type of grouping is by common value. Every item in the population that evaluates to the same group-by value as another is in the same group as that other.
XSLT 1.0 Muenchian grouping is not too difficult when the type of grouping is group by common value. There are two more common forms of grouping: group adjacent items by similar value; and group an adjacent run of items whose group is demarcated either at the end or at the beginning by some test. While both these forms of grouping are still possible with Muenchian, it becomes relatively complex. Muenchian on these types will also be less efficient at scale, because of the use of sibling axes.
Another advantage of XSLT 2.0 that comes to mind is that Muenchian only works on node sets, whereas xsl:for-each-group is broader in application because it works on a sequence of items, not just nodes.
The result of the @group-by expression will be a sequence of items. This sequence is atomized and de-duped. The population item being tested will be a member of one group per value. It's a strange consequence that, with @group-by, an item may be a member of more than one group, or perhaps even none. Although I suspect that anything that you can do in XSLT 2.0, you can, by some tortuous path, do in XSLT 1.0, the ability to put an item into two groups is something that would be quite fiddly to do in XSLT 1.0 Muenchian.
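A minimal sketch of grouping by common value, assuming a made-up input of the form `<cities><city name="..." country="..."/>...</cities>` (these element and attribute names are illustrative only):

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="cities">
    <countries>
      <!-- Population: all city children. Cities sharing the same
           country attribute value land in the same group. -->
      <xsl:for-each-group select="city" group-by="@country">
        <country name="{current-grouping-key()}"
                 count="{count(current-group())}">
          <xsl:for-each select="current-group()">
            <city><xsl:value-of select="@name"/></city>
          </xsl:for-each>
        </country>
      </xsl:for-each-group>
    </countries>
  </xsl:template>

</xsl:stylesheet>
```

A city with no country attribute yields an empty grouping-key sequence and so falls into no group, which is the "perhaps even none" case described above.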
@group-adjacent
The attributes group-by, group-adjacent, group-starting-with and group-ending-with are mutually exclusive because they specify different kinds of grouping. With group-adjacent, items that have a common value and are adjacent in the population are grouped together. Unlike @group-by, @group-adjacent must evaluate to, after atomization, a single atomic value.
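As a sketch of adjacent grouping, assume a hypothetical `body` element whose children mix `item` elements with other elements. Each run of neighbouring `item` elements is wrapped in a `list`, and everything else is copied through unchanged:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="body">
    <body>
      <!-- The key is a single boolean per child; a new group starts
           whenever the key changes between neighbours. -->
      <xsl:for-each-group select="*"
                          group-adjacent="boolean(self::item)">
        <xsl:choose>
          <xsl:when test="current-grouping-key()">
            <list>
              <xsl:copy-of select="current-group()"/>
            </list>
          </xsl:when>
          <xsl:otherwise>
            <xsl:copy-of select="current-group()"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </body>
  </xsl:template>

</xsl:stylesheet>
```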
@group-starting-with
Unlike select, group-adjacent and group-by, this attribute does not specify an XPath select expression, but rather a pattern, in the same way that xsl:template/@match specifies a pattern, not a selection. If an item in the population passes the pattern test or is the first item in the population then it starts a new group. Otherwise the item continues the group from the previous item.
Martin mentioned the spec examples (w3.org/TR/xslt20/#grouping-example). From that reference, I am going to copy the example entitled "Identifying a Group by its Initial Element", but alter it slightly to emphasise the point about the initial item of the population.
So this is our input document (copied from the W3C spec; the inclusion of the orphaned line is mine) ...
<body>
<p>This is an orphaned paragraph.</p>
<h2>Introduction</h2>
<p>XSLT is used to write stylesheets.</p>
<p>XQuery is used to query XML databases.</p>
<h2>What is a stylesheet?</h2>
<p>A stylesheet is an XML document used to define a transformation.</p>
<p>Stylesheets may be written in XSLT.</p>
<p>XSLT 2.0 introduces new grouping constructs.</p>
</body>
... what we want to do is define groups as nodes starting with h2 and include all the following p up until the next h2. The example solution given by the W3C is to use @group-starting-with ...
<xsl:template match="body">
<chapter>
<xsl:for-each-group select="*" group-starting-with="h2" >
<section title="{self::h2}">
<xsl:for-each select="current-group()[self::p]">
<para><xsl:value-of select="."/></para>
</xsl:for-each>
</section>
</xsl:for-each-group>
</chapter>
</xsl:template>
In the spec example, when the input does not contain an orphan line, this produces the desired result ...
<chapter>
<section title="Introduction">
<para>XSLT is used to write stylesheets.</para>
<para>XQuery is used to query XML databases.</para>
</section>
<section title="What is a stylesheet?">
<para>A stylesheet is an XML document used to define a transformation.</para>
<para>Stylesheets may be written in XSLT.</para>
<para>XSLT 2.0 introduces new grouping constructs.</para>
</section>
</chapter>
Although in our particular case we get instead ...
<chapter>
<section title="">
<para>This is an orphaned paragraph.</para>
</section>
<section title="Introduction">
<para>XSLT is used to write stylesheets.</para>
<para>XQuery is used to query XML databases.</para>
</section>
<section title="What is a stylesheet?">
<para>A stylesheet is an XML document used to define a transformation.</para>
<para>Stylesheets may be written in XSLT.</para>
<para>XSLT 2.0 introduces new grouping constructs.</para>
</section>
</chapter>
If the initial section for the orphaned lines is undesired, there are easy solutions. I won't go into them now. My point is just to highlight the fact that the first group resulting from @group-starting-with can be an 'orphan' group. By 'orphan', I mean a group whose head node does not fit the specified pattern.
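One possible way to deal with the orphan group, sketched here as an assumption rather than as the spec's or the author's solution, is to test whether the first item of the current group is actually an h2. Inside xsl:for-each-group the context item is the initial item of the group, so `self::h2` is enough:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="body">
    <chapter>
      <xsl:for-each-group select="*" group-starting-with="h2">
        <xsl:choose>
          <!-- Normal group: it is headed by an h2. -->
          <xsl:when test="self::h2">
            <section title="{.}">
              <xsl:for-each select="current-group()[self::p]">
                <para><xsl:value-of select="."/></para>
              </xsl:for-each>
            </section>
          </xsl:when>
          <!-- Orphan group: emit the paragraphs without a
               section wrapper (one choice among several). -->
          <xsl:otherwise>
            <xsl:for-each select="current-group()[self::p]">
              <para><xsl:value-of select="."/></para>
            </xsl:for-each>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </chapter>
  </xsl:template>

</xsl:stylesheet>
```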
@collation
The collation attribute specifies a collation URI and identifies a collation used to compare strings for equality.
current-group()
Within the xsl:for-each-group the current-group() function returns the current group being processed as a sequence of items.
current-grouping-key()
Within the xsl:for-each-group the current-grouping-key() function returns the current grouping key. I am not sure, but I believe that this can only be an atomic type. Also not sure, but I believe that this function is only applicable to the @group-by and @group-adjacent types of grouping.
@group-by versus @group-adjacent
In some scenarios you will have a choice between these two grouping types with the same functional result. When this is the case @group-adjacent is to be preferred over @group-by, because it will likely be more efficient to process.
Pattern versus Select
Some XSLT 2.0 instruction attributes contain select expressions. Michael Kay calls these "XPath expressions". Personally, when juxtaposing against patterns, I feel a better description would be "select expression". Other attributes contain patterns or "match expressions". While these two share the same syntax, they are very different beasts. The similarity between the two often makes XSLT beginners think of xsl:template/@match not as a pattern, but as a select expression. The consequence has been a lot of confusion from beginners about the value of the position() function within a template's sequence constructor. As stated earlier, in xsl:for-each-group, @select, @group-by and @group-adjacent are select expressions, but @group-starting-with and @group-ending-with are patterns. So here is the difference:
Select expressions are like a function. The input is a context document, context sequence, context item, context position and of course the actual expression. The output is a sequence of items. Depending on where this is actually used, this could become the next context sequence. The default axis is child:: .
Unlike a select expression, the default axis for a pattern is self:: . The pattern is also like a function. Its inputs are as before, and its output is not a sequence, but a boolean. Some item is being tested to see if it matches the pattern or not. The item being tested is made the context item. The match expression is temporarily evaluated as if it were a select expression. Then the returned sequence is tested to see if the context item is a member or not. The returned sequence is then discarded. The result is true or 'match' if it was a member, and false otherwise.
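The position() confusion can be shown with a short sketch. The input shape is assumed for illustration: a `list` root with `item` children carrying a `keep` attribute. The match pattern only decides which rule fires; position() reflects the sequence chosen by the select expression of the caller.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/list">
    <out>
      <!-- Select expression: builds the sequence to be processed. -->
      <xsl:apply-templates select="item[@keep = 'yes']"/>
    </out>
  </xsl:template>

  <!-- Pattern: a yes/no test on a node. position() here counts
       within the selected sequence above, not among siblings. -->
  <xsl:template match="item">
    <item pos="{position()}"><xsl:value-of select="."/></item>
  </xsl:template>

</xsl:stylesheet>
```

With three `item` children where only the second and third have `keep="yes"`, those two should be reported at positions 1 and 2.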
Sean has provided a wonderful overview of xsl:for-each-group, which was very generous, but it doesn't really seem to be an answer to your question.
You've shown a fragment of XSLT code, and you've said you want faster performance. But the fragment you showed is not doing grouping, it is doing a join. There are two ways you can speed up a join. Either use an XSLT processor such as Saxon-EE that does automatic join optimization, or optimize it by hand using keys. For example, given this expression:
/wo:document/styles [@wo:name=current()/@wo:style-name]/@wo:width
you could define a key
<xsl:key name="style-name-key" match="styles" use="@wo:name"/>
and then replace the expression by
key('style-name-key', @wo:style-name)/@wo:width
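Put together, the key-based join might look like the following sketch. The namespace URI bound to `wo` is a placeholder, since the question does not show the real one, and the single template is a reduced stand-in for the original loop.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:wo="urn:example:wo"
    exclude-result-prefixes="wo">

  <!-- Index every styles element by its wo:name attribute.
       Unlike the original path, this matches styles at any depth. -->
  <xsl:key name="style-name-key" match="styles" use="@wo:name"/>

  <xsl:template match="wo:tent">
    <!-- Key lookup replaces the scan over /wo:document/styles. -->
    <width aidwidth="{key('style-name-key', @wo:style-name)/@wo:width}"/>
  </xsl:template>

</xsl:stylesheet>
```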
Given
An XSLT stylesheet with a global variable:
<xsl:variable name="lang" select="/response/str[@name='lang']"/>
Question
Where does the restriction come from that makes variables in predicates an error in an xsl:template match pattern, yet acceptable in an xsl:apply-templates select expression?
<!-- throws compilation error, at least in libxslt -->
<xsl:template match="list[@name='item_list'][$lang]"/>
<!-- works as expected -->
<xsl:template match="list[@name='item_list'][/response/str[@name='lang']]"/>
<!-- works as expected -->
<xsl:template match="response">
<xsl:apply-templates select="list[@name='item_list'][$lang]"/>
</xsl:template>
Variables are not allowed to be used in match expressions in XSLT 1.0.
From the XSLT 1.0 specification: Defining Template Rules
It is an error for the value of the match attribute to contain a
VariableReference.
Variables are allowed in match expressions in XSLT 2.0.
From the XSLT 2.0 specification: Syntax of Patterns
Patterns may start with an id or key function call, provided that
the value to be matched is supplied as either a literal or a reference
to a variable or parameter, and the key name (in the case of the key
function) is supplied as a string literal. These patterns will never
match a node in a tree whose root is not a document node.
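For comparison, a sketch of the failing pattern as it could be written for an XSLT 2.0 processor (libxslt implements XSLT 1.0 only, so it would still reject this). The identity template is added only so the example is complete.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:variable name="lang" select="/response/str[@name='lang']"/>

  <!-- XSLT 2.0 only: the predicate [$lang] is true when the
       variable holds at least one node, so the list is dropped. -->
  <xsl:template match="list[@name='item_list'][$lang]"/>

  <!-- Everything else is copied unchanged. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```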
My xsl has a parameter
<xsl:param name="halfPath" select="'halfPath'"/>
I want to use it inside match
<xsl:template match="Element[@at1='value1' and not(@at2='{$halfPath}/another/half/of/the/path')]"/>
But this doesn't work. I guess I cannot use parameters inside ''. How can I fix or work around that?
The XSLT 1.0 W3C Specification forbids referencing variables/parameters inside a match pattern:
"It is an error for the value of the
match attribute to contain a
VariableReference"
There is no such limitation in XSLT 2.0, so use XSLT 2.0.
If for insurmountable reasons using XSLT 2.0 isn't possible, put the complete body of the <xsl:template> instruction inside an <xsl:if> where the test in conjunction with the match pattern is equivalent to the XSLT 2.0 match pattern that contains the variable/parameter reference(s).
In a more complicated case where you have more than one template matching the same kind of node but with different predicates that reference variables/parameters, then a wrapping <xsl:choose> will need to be used instead of a wrapping <xsl:if>.
Well, you could use a conditional instruction inside the template:
<xsl:template match="Element[@at1='value1']">
<xsl:if test="not(@at2=concat($halfPath,'/another/half/of/the/path'))">
.. do something
</xsl:if>
</xsl:template>
You just need to be aware that this template will handle all elements that satisfy the first condition. If you have a different template that handles elements that match the first, but not the second, then use an <xsl:choose>, and put the other template's body in the <xsl:otherwise> block.
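The xsl:choose variant described above could be sketched like this for XSLT 1.0. The two branch bodies (`matched` and a plain copy) are placeholders standing in for the real template bodies.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:param name="halfPath" select="'halfPath'"/>

  <xsl:template match="Element[@at1 = 'value1']">
    <xsl:choose>
      <!-- Both conditions hold: body of the intended template. -->
      <xsl:when test="not(@at2 = concat($halfPath,
                          '/another/half/of/the/path'))">
        <matched/>
      </xsl:when>
      <!-- Only the first condition holds: body of the other
           template that used to match these elements. -->
      <xsl:otherwise>
        <xsl:copy-of select="."/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```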
Or, XSLT2 can handle it as is if you can switch to an XSLT2 processor.
This topic had the answer to my question, but the proposed solution by Flynn1179 was not quite correct for me (YMMV). So try it the way it is suggested by people more expert than me, but if it doesn't work for you, consider how I solved it. I am using xsltproc that only handles XSL version 1.0.
I needed to match <leadTime hour="0024">, but use a param: <xsl:param name="hour">0024</xsl:param>. I found that:
<xsl:if test="@hour='{$hour}'"> did not work, despite statements here and elsewhere that this is the required syntax for XSL v.1.0.
Instead, the simpler <xsl:if test="@hour=$hour"> did the job.
One other point: it is suggested above by Dimitre that you put template inside if statement. xsltproc complained about this: instead I put the if statement inside the template:
<xsl:template match="leadTime">
<xsl:if test="@hour=$hour">
<xsl:copy>
<xsl:apply-templates select="node() | @*"/>
</xsl:copy>
</xsl:if>
</xsl:template>
In XSLT 2.0 you can refer to global variables within a match pattern, but the syntax is simpler than your guess:
<xsl:template match="Element[@at1='value1' and
not(@at2=$halfPath/another/half/of/the/path)]"/>
rather than
<xsl:template match="Element[@at1='value1' and
not(@at2='{$halfPath}/another/half/of/the/path')]"/>
Also, the semantics are not what you appear to be expecting: a variable referenced on the lhs of "/" must contain a node-set, not a fragment of an XPath expression.
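If `$halfPath` holds a string prefix, as the parameter declaration in the question suggests, the comparison presumably intended is a string concatenation. A sketch for an XSLT 2.0 processor, with an identity template added only for completeness:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:param name="halfPath" select="'halfPath'"/>

  <!-- Build the full string with concat() and compare it to @at2;
       no curly braces are needed inside an XPath expression. -->
  <xsl:template match="Element[@at1 = 'value1' and
      not(@at2 = concat($halfPath, '/another/half/of/the/path'))]"/>

  <!-- Everything else is copied unchanged. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```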