XSLT style - pattern matching multiple templates - xslt

This is a question about XSLT 1.0 (but I've included the general xslt tag as it may apply more widely).
Let's say we want to take this XML:
<root>
<vegetable>
<carrot colour="orange"/>
<leek colour="green"/>
</vegetable>
</root>
and transform it to cook the vegetables if they are root vegetables, like so:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="carrot">
<xsl:copy>
<xsl:attribute name="cooked">true</xsl:attribute>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="leek">
</xsl:template>
</xsl:stylesheet>
So the XSLT recursively processes the data, and when it finds multiple matching templates (e.g. for leek and carrot), it takes the last one, effectively overriding the earlier ones.
Sometimes accepted answers in this site have this style,
e.g. XSLT copy-of but change values
other answers specifically about multiple matching templates
e.g.
XSLT Multiple templates with same match
state
Having two templates of the same priority that match the same node is an error according to the XSLT specification
It is an error if [the algorithm in section 5.5] leaves more than one
matching template rule. An XSLT processor may signal the error; if it
does not signal the error, it must recover by choosing, from amongst
the matching template rules that are left, the one that occurs last in
the stylesheet.
So we can avoid this either by using priority or by writing the match patterns to explicitly exclude the overlap, something like this:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="@* | node()[not(self::carrot or self::leek)]">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="carrot">
<xsl:copy>
<xsl:attribute name="cooked">true</xsl:attribute>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="leek">
</xsl:template>
</xsl:stylesheet>
I get the feeling that lots of devs simply rely on the default fallback behaviour and let the processor use the last match. This is similar in style to pattern matching in most functional languages, where the first match is used.
I'm also personally not a fan of priority; it feels a bit like magic numbers, where I have to scan and remember the priority of the pattern matches to work out what's going on.
The approach of explicitly excluding overlaps seems sensible, but in practice it requires complex logic and creates coupling between templates: if I extend or add a new match, I potentially have to amend or constrain another.
Is the above a correct summary?
Is there an accepted style (even if it contradicts the spec)?

I think you may be missing the fact that there is no error in the given example, therefore the rule of applying the template that occurs last in the stylesheet is not invoked. You can verify this by switching the order of the templates and observing that the result remains unchanged.
There is no error because the identity transform has a priority of -0.5 while the specific templates have a priority of 0.
Read the entire specification for conflict resolution:
https://www.w3.org/TR/1999/REC-xslt-19991116/#conflict
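To make the implicit ordering visible, the stylesheet from the question can be written with explicit priority attributes set to the XSLT 1.0 default values. This is only an illustrative sketch: the priority attributes are redundant here, and the result should be the same whatever the order of the templates.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Identity rule: @* and node() each have a default priority of -0.5 -->
  <xsl:template match="@* | node()" priority="-0.5">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- A plain element name has a default priority of 0, so it wins over the identity rule -->
  <xsl:template match="carrot" priority="0">
    <xsl:copy>
      <xsl:attribute name="cooked">true</xsl:attribute>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Also priority 0: leek is dropped from the output -->
  <xsl:template match="leek" priority="0"/>
</xsl:stylesheet>
```

Because no two matching templates share the same priority for any node, the "last template wins" recovery rule never comes into play.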

Related

How to insert a blank line in XSL-FO properly?

I am trying to figure out how to do that properly. I tried to use processing instructions in the code, but they seem to be ignored entirely.
In the text:
end of a paragraph.<?linebreak?></p>
As for templating, I tried:
<xsl:template match="processing-instruction('linebreak')">
<fo:block>
<xsl:apply-templates/>
<fo:leader/>
</fo:block>
</xsl:template>
Or simply for testing purposes:
<xsl:template match="processing-instruction('linebreak')">
<fo:block>XXXX</fo:block>
</xsl:template>
No matter what I do, the template is never used.
I use it inside an eXist-db app (3.0RC1) but I think this is not associated with eXist-db. There is FOP 1.1. I am not sure about the Saxon version.
Traditionally, you don't insert a line break at the end of a paragraph. Instead, you specify e.g. space-after="12pt" on the fo:block that contains the paragraph.
A line break is always inserted, even if you don't want it (e.g. when the paragraph is placed at the bottom of a page and the line break would wrap to the next page). The space-after can be made conditional, so that this space is collapsed if it appears at the bottom of a page. This results in a better-looking layout.
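A minimal XSL-FO sketch of this approach is shown here. The 12pt value comes from the suggestion above; the paragraph text is a placeholder, and the explicit conditionality setting is included only for illustration, since "discard" is already the default.

```xml
<!-- Paragraph block with trailing space instead of an explicit blank line -->
<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format"
          space-after="12pt"
          space-after.conditionality="discard">
  end of a paragraph.
</fo:block>
```

With "discard", the space is dropped when the block ends at the bottom of a page; "retain" would keep it.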
No matter what I do, the template is never used.
Concerning this part of the problem, a possible explanation is that the template matching the parent element (<p> in your examples) silently ignores processing instructions when applying templates.
For example, this quasi-identity stylesheet ignores processing instructions when elements are processed, so their matching template is never executed:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="* | @*">
<xsl:copy>
<!-- this only processes elements, attributes and text nodes! -->
<xsl:apply-templates select="* | @* | text()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="processing-instruction('linebreak')">
XXXXX
</xsl:template>
</xsl:stylesheet>
In order for the processing instructions to be taken into account, the template matching elements must explicitly apply templates to them too:
<xsl:template match="* | @*">
<xsl:copy>
<xsl:apply-templates select="* | @* | text() | processing-instruction()"/>
</xsl:copy>
</xsl:template>
Note that a bare <xsl:apply-templates/> would not be a full replacement either: it selects all child nodes (elements, text, comments and processing instructions), but not attributes.

XSLT templates' ambiguity clarification

When I run the following input XML
<root>
<value>false</value>
<value>true</value>
</root>
against the following XSLT:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="value">
<true_value/>
</xsl:template>
<xsl:template match="value[. = 'false']">
<false_value/>
</xsl:template>
</xsl:stylesheet>
I get the value element with 'false' as its content changed to false_value, and all other value elements turned into true_value.
Output:
<?xml version="1.0" encoding="utf-8"?>
<root>
<false_value/>
<true_value/>
</root>
But only when I change the template match to root/value do I get ambiguous template warning.
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="root/value">
<true_value/>
</xsl:template>
<xsl:template match="root/value[. = 'false']">
<false_value/>
</xsl:template>
</xsl:stylesheet>
Please help me by explaining what difference adding root to the XPath in xsl:template's @match makes, such that I get this warning (Ambiguous rule match for /root[1]/value[1]).
Your result is due to implicit template priorities. You can explicitly specify a priority on any template:
<xsl:template match="foo" priority="2"/>
But in most cases, you do not state explicitly what priority you would like a template to adopt - and that's where the default priorities step in. If there is conflict between templates, that is, if an input node matches several templates, XSLT defines a conflict resolution procedure that makes use of the default priorities.
The two templates that cause the processor to issue a warning:
<xsl:template match="root/value">
and
<xsl:template match="root/value[. = 'false']">
have the same default priority (0.5). You would think that the match pattern match="root/value[. = 'false']" is more specific than match="root/value", but as far as the specification is concerned, it is not - they have exactly the same priority.
And that is why an ambiguous rule match is reported. An ambiguous rule match is a situation where the conflict cannot be resolved with either the explicit or implicit priorities. As a last resort, the last template is chosen.
To complete this thought experiment, change the order of templates to
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="root/value[. = 'false']">
<false_value/>
</xsl:template>
<xsl:template match="root/value">
<true_value/>
</xsl:template>
</xsl:stylesheet>
And the result will be:
<?xml version="1.0" encoding="utf-8"?>
<root>
<true_value/>
<true_value/>
</root>
As you can see, for both value elements, the last template is chosen.
Why, then, does adding root/ to a template match result in a warning about template ambiguity?
The specific change you make is from
<xsl:template match="value">
to
<xsl:template match="root/value">
This changes the default priority (as discussed above) of the template. The default priority of value is 0, the default priority of root/value is 0.5. Only in the second case a conflict will arise, because the default priority of the other template is also 0.5.
Adding root/ to the second template:
<xsl:template match="root/value[. = 'false']">
does not change anything, the default priority remains 0.5.
See the relevant part of the XSLT specification. Caveat: the default priorities there are not exactly easy to read.
All priorities:
<xsl:template match="value"> 0
<xsl:template match="value[. = 'false']"> 0.5
<xsl:template match="root/value"> 0.5
<xsl:template match="root/value[. = 'false']"> 0.5
In general the default priorities are meant to indicate the specificity of the match pattern in the template rule. The match pattern "value" is less specific than "root/value" which only matches a value element with a root parent hence root/value has a higher default priority.
That default priority (0.5) happens to be the same as that of a match pattern that features a predicate (note that root/value can also be written as value[parent::root]) and that caused your template conflict.
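One way to resolve the conflict, sketched here under the assumption that the predicate template is meant to win, is to give it an explicit priority above 0.5. The value 1 is arbitrary; any number greater than 0.5 would do.

```xml
<!-- Default priority 0.5 (the pattern contains a "/") -->
<xsl:template match="root/value">
  <true_value/>
</xsl:template>

<!-- Explicit priority 1 beats 0.5, so template order no longer matters -->
<xsl:template match="root/value[. = 'false']" priority="1">
  <false_value/>
</xsl:template>
```

With this change there is at most one best-matching template for each value element, so no ambiguity should be reported.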
You are also vulnerable to a template conflict on your first template, the identity template, which (for example) will conflict with a template that matches *. Note that when such conflicts are found, an XSLT processor is permitted to fail rather than choose based on the position of the respective templates.
If the identity transform is imported from its own stylesheet, needless duplication is eliminated and conflicts are mitigated, because templates from imported stylesheets have lower import precedence than templates in the importing stylesheet.
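A sketch of that arrangement follows. The file name identity.xsl is hypothetical and is assumed to contain nothing but the identity template; the seen attribute is likewise only for illustration.

```xml
<!-- main.xsl (hypothetical): imports the identity rule instead of repeating it -->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- xsl:import must precede all other top-level elements -->
  <xsl:import href="identity.xsl"/>

  <!-- Same default priority (-0.5) as the imported identity rule, but no conflict:
       templates in the importing stylesheet have higher import precedence -->
  <xsl:template match="*">
    <xsl:copy>
      <xsl:attribute name="seen">true</xsl:attribute>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Import precedence is considered before priority, so the local template is chosen for elements while attributes and other nodes still fall through to the imported identity rule.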

xml:space='preserve' doesn't seem to get on with xsl:apply-templates select="node()"

Doing some work with xsl - first time I've done anything serious, and I've hit something which I can't explain. Easiest way to show it is with the identity transform:
This works:
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
This doesn't (says "Unable to apply transformation on current source"):
<xsl:template match="@*|node()" xml:space='preserve'>
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
This does:
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*"/>
<xsl:apply-templates select="node()" xml:space='preserve'/>
</xsl:copy>
</xsl:template>
OK, I can see what's happening. But I don't understand why. Why does xml:space not want to play nicely with attributes? Just curious.
BTW, this is using the xsl translator that's built into Notepad++. Perhaps I shouldn't trust it?
What are you trying to accomplish? xml:space="preserve" tells XML-consuming applications that you want to preserve whitespace-only text nodes that are descendants of the element that xml:space is an attribute of. In this example, you have xml:space as an attribute of <xsl:apply-templates>, but <xsl:apply-templates> has no whitespace-only text node descendants, so xml:space has no possible effect.
I think you wanted to preserve whitespace-only text nodes from the input XML document (not from the XSLT stylesheet). In that case, you need xml:space to be in the input XML document, not in the XSLT stylesheet. The stylesheet can have <xsl:preserve-space elements="*"/>, but that's already the default, unless you have <xsl:strip-space> set.
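For reference, these whitespace declarations are top-level elements of the stylesheet. The sketch below strips whitespace-only text nodes everywhere except inside two element names, pre and code, which are hypothetical and chosen only for illustration.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Strip whitespace-only text nodes from all source elements... -->
  <xsl:strip-space elements="*"/>
  <!-- ...except inside these (hypothetical) elements; a specific name beats "*" -->
  <xsl:preserve-space elements="pre code"/>

  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

These declarations apply to the source document only; an xml:space="preserve" attribute in the source still overrides stripping for the elements it covers.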
Yes, I would be inclined to wonder whether the XSLT processor used by Notepad++ (libxml) is doing something illegit. As a good diagnostic, try a respected processor like Saxon and see if you get any errors.
Either that, or just remove xml:space from your stylesheet, since it won't do you any good even if the processor doesn't throw an error.
Suggestion:
Just use
<xsl:output method="html" indent="yes"/>
as the first child of <xsl:stylesheet>.
The indent="yes" will prevent all the output elements from being crammed together on one line, so you can read the results.
According to the specification, whitespace is not preserved for attributes; this is highlighted in the posting "Preserving attribute whitespace in XSLT".

What does an XSL document look like if it mirrors the input data?

The typical XSL usage is:
XML1.xml -> *transformed using xsl* -> XML2.xml
What does an XSL document look like if I simply want to mirror the input data?
ex:
XML1.xml -> *transformed using xsl* -> XML1.xml
What does an XSL document look like if I simply want to mirror the input data?
There is more than one answer to this question; however, all of them could be named "Identity Transform":
1. <xsl:copy-of select="/"/> This is the shortest, simplest and most efficient identity transform, but also the most inflexible, non-extensible and least useful one.
2. The classical identity rule, which everybody knows (or should know):
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
This is still a very short, one-template transformation, but it is a much more extensible and useful identity transform, also known as the "identity rule". Using and overriding the identity transform is the most fundamental and powerful XSLT design pattern, allowing one to solve common copy and replace/rename/delete/add problems in just a few lines. Maybe 90%+ of all answers in the xslt tag use this form of the identity transform.
3. The fine-grained control identity rule, which everybody should know (and very few people know):
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="@*|node()[1]"/>
</xsl:copy>
<xsl:apply-templates select="following-sibling::node()[1]"/>
</xsl:template>
</xsl:stylesheet>
This is similar to the generally known identity rule defined at 2. above, but it provides a finer control over the XSLT processing.
Typically with 2. the <xsl:apply-templates select="@*|node()"> triggers a number of transformations (for all attributes and child nodes) that can be done in any order or even in parallel. There are tasks where we don't want certain types of nodes to be processed after some other nodes, so we have to plug the leaks of the identity rule by overriding it with empty templates matching the unwanted nodes and adding other templates in a specific mode to process these nodes "when the time comes"...
3. is more appropriate for tasks where we want more control and truly sequential processing.
Some tasks that are very difficult to solve with 2. are easy using 3.
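As a hedged illustration of such a task, the sketch below drops an element together with everything that follows it at the same level. The element name stop is hypothetical. Because each node is responsible for continuing to its next sibling, an empty override simply ends the chain.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Fine-grained identity rule: copy this node, descend into its first child,
       then move on to the next sibling -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()[1]"/>
    </xsl:copy>
    <xsl:apply-templates select="following-sibling::node()[1]"/>
  </xsl:template>

  <!-- Empty override: <stop> is not copied and does not continue to its
       following siblings, so they are omitted as well -->
  <xsl:template match="stop"/>
</xsl:stylesheet>
```

With the classical rule in 2., the same empty template would remove only the stop element itself, since the parent applies templates to all of its children independently.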
It would look like the identity transform:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
This is one of the most fundamental XSLT transforms. It matches any attribute or other node, copies what it matches, and then applies itself to all attributes and child nodes of the matched node.
This turns out to be quite powerful for other tasks, too. A common requirement is to copy most of a source file unchanged, while handling certain elements in a special way. This can be solved using the identity transform plus one template for the special nodes. It's a generic, flexible (and short) solution.
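A minimal sketch of that pattern, with hypothetical element names oldName and newName: everything is copied unchanged except oldName elements, which are renamed while keeping their attributes and content.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Identity rule: copies every node that has no more specific template -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Special case: rename <oldName> to <newName> -->
  <xsl:template match="oldName">
    <newName>
      <xsl:apply-templates select="@*|node()"/>
    </newName>
  </xsl:template>
</xsl:stylesheet>
```

The override wins without any priority attribute because an element name pattern has a higher default priority (0) than the identity rule (-0.5).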
This matches every element or attribute and recursively applies the template.
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="* | @*">
<xsl:copy>
<xsl:copy-of select="@*"/>
<xsl:apply-templates/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>

XSLT template overriding

I have a small question regarding XSLT template overriding.
For this segment of my XML:
<record>
<medication>
<medicine>
<name>penicillin G</name>
<strength>500 mg</strength>
</medicine>
</medication>
</record>
In my XSLT sheet, I have two templates in the following order:
<xsl:template match="medication">
<xsl:copy-of select="." />
</xsl:template>
<xsl:template match="medicine/name">
<text>!unauthorized information!</text>
</xsl:template>
What I want to do is to copy everything under the medication element to the output other than the "name" element (or any other element that I explicitly define). The final xml will be shown to the user in RAW XML form. In other words, the result I want is:
<record>
<medication>
<medicine>
<text>!unauthorized information!</text>
<strength>500 mg</strength>
</medicine>
</medication>
</record>
Whereas I am getting the same XML as input, i.e. without the element replaced by text. Any ideas why the second template match is not overriding the name element in the first one? Thanks in advance
--
Ali
Template order does not matter. The only case it possibly becomes considered (and this is processor-dependent) is when you have an un-resolvable conflict, i.e. an error condition. In that case, it's legal for the XSLT processor to recover from the error by picking the one that comes last. However, you should never write code that depends on this behavior.
In your case, template priority isn't even the issue. You have two different template rules, one matching <medication> elements and one matching <name> elements. These will never collide, so it's not a question of template priority or overriding. The issue is that your code never actually applies templates to the <name> element. When you say <xsl:copy-of select="."/> on <medication>, you're saying: "perform a deep copy of <medication>". The only way any of the template rules will fire for descendant nodes is if you explicitly apply templates (using <xsl:apply-templates/>).
The solution I have for you is basically the same as alamar's, except that it uses a separate processing "mode", which isolates the rules from all other rules in your stylesheet. The generic match="@* | node()" template causes template rules to be recursively applied to children (and attributes), which gives you the opportunity to override the behavior for certain nodes.
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<!-- ...placeholder for the rest of your code... -->
<xsl:template match="/record">
<record>
<xsl:apply-templates/>
</record>
</xsl:template>
<!-- end of placeholder -->
<xsl:template match="medication">
<!-- Instead of copy-of, whose behavior is to always perform
a deep copy and cannot be customized, define your own
processing mode. Rules with this mode name are isolated
from the rest of your code. -->
<xsl:apply-templates mode="copy-medication" select="."/>
</xsl:template>
<!-- By default, copy all nodes and their descendants -->
<xsl:template mode="copy-medication" match="@* | node()">
<xsl:copy>
<xsl:apply-templates mode="copy-medication" select="@* | node()"/>
</xsl:copy>
</xsl:template>
<!-- But replace <name> -->
<xsl:template mode="copy-medication" match="medicine/name">
<text>!unauthorized information!</text>
</xsl:template>
</xsl:stylesheet>
The rule for "medicine/name" overrides the rule for "@* | node()", because the format of the pattern (which contains a "/") makes its default priority (0.5) higher than the default priority of "node()" (-0.5).
A complete but concise description of how template priority works can be found in "How XSLT Works" on my website.
Finally, I noticed you mentioned you want to display "RAW XML" to the user. Does that mean you want to display, for example, the XML, with all the start and end tags, in a browser? In that case, you'd need to escape all markup (e.g., "&lt;" for "<"). Check out the XML-to-string utility on my website. Let me know if you need an example of how to use it.
Add
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
alongside your <xsl:template match="medicine/name">,
and remove <xsl:template match="medication"> altogether!
<?xml version="1.0" encoding="windows-1251"?>
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:template match="medicine/name">
<text>!unauthorized information!</text>
</xsl:template>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>