xsl 3.0: How to process certain child elements first in xsl:apply-templates, then the remainder (overriding document order) - xslt

Assume my xml input is a MFMATR element with a few child elements, such as: TRLIST, INTRO, and SBLIST -- in that document order. I am converting to HTML.
I have a template that matches on the MFMATR element, and wants to run xsl:apply-templates on the 3 child elements, but I want INTRO to be processed first (listed first in the HTML). The other two (TRLIST and SBLIST) should keep their relative document order, as long as INTRO comes before both of them.
So I'd like to run <xsl:apply-templates select="INTRO, *"> but not have INTRO matched twice. (Using this syntax with xsl 3.0 causes dupes for me.) I also don't want to explicitly list every tag in the select expression, so unknown tags will still be processed.
A 2nd real life example is this: <xsl:apply-templates select="TITLE, CHGDESC, *"/>. Again, right now that is causing dupes I don't want.
I am using Saxon.

So I'd like to run <xsl:apply-templates select="INTRO, *"> but not have INTRO matched twice
Try:
<xsl:apply-templates select="INTRO, * except INTRO">

This seems to work. If someone has a better answer, let me know and I will change it.
There is no DRY violation here -- no repeated element names or variable names. I want it to look clean at all the call sites I will have.
It seems idiomatic to me since the function was pulled from w3's own website!
<xsl:template match="MFMATR">
<!-- Process INTRO first, no matter where it appears -->
<xsl:variable name="nodes" select="INTRO, *"/>
<xsl:apply-templates select="kp:distinct_nodes_stable($nodes)"/>
</xsl:template>
<xsl:template match="INTRO">
<xsl:variable name="nodes" select="TITLE, CHGDESC, *"/>
<xsl:apply-templates select="kp:distinct_nodes_stable($nodes)"/>
</xsl:template>
<!-- Discard duplicate elements in $seq, but keep their ordering -->
<!-- Adapted from https://www.w3.org/TR/xpath-functions/#func-distinct-nodes-stable -->
<xsl:function name="kp:distinct_nodes_stable" as="node()*">
<xsl:param name="seq" as="node()*"/>
<xsl:sequence select="fold-left($seq, (),
function($foundSoFar as node()*, $this as node()) as node()* {
if ($foundSoFar intersect $this)
then $foundSoFar
else ($foundSoFar, $this)
}) "/>
</xsl:function>

Related

XSLT 2.0 / XPATH testing for parent element in specific position

In XSLT 2.0 and XPATH, within <xsl:template match="lb">, I am testing for a variety of different case where each case produces different HTML output (using xsl:choose/xsl:when).
I want to test for the following situation, where lb is the very first node of any sort inside seg element:
<seg><lb break="n"/>text</seg>
By contrast, these tests would fail:
<seg>text<lb break="n"/>text</seg>
<seg><foo/><lb break="n"/>text</seg>
I've tried combining parentand position() but it's not testing correctly.
Many thanks.
Having a template matching lb and then using xsl:choose/when in my view can be solved more elegantly and compact with precise match patterns e.g. xsl:template match="seg/node()[1][self::lb]" would match any first child node of a seg parent where the child is an lb element. For other conditions you would set up different templates with different match patterns.
But you can use . is parent::seg/node()[1] inside the xsl:template match="lb" to write it the other way around if needed.
/seg/child::node()[1]/name() = 'lb'
check if first child is named "lb"
You can simply test if there are no preceding siblings:
<xsl:when test="not(preceding-sibling::node())">
Do note that node() includes comments and processing instructions too, not just elements and text.
Alternatively, if you have a template matching seg where you do something like this...
<xsl:template match="seg">
<xsl:copy>
<xsl:apply-templates />
</xsl:copy>
</xsl:template>
Then, because <xsl:apply-templates /> is short for <xsl:apply-templates select="node()" /> you could use position() in your template
<xsl:when test="position() = 1">
This would not work if the "seg" template did <xsl:apply-templates select="lb" /> though.
See http://xsltfiddle.liberty-development.net/nc4NzRd for an example of the tests in action.

Change text of elements identified by dynamic XPath

I have an XML with 2 XML fragments, 1st one is a fragment where the new values must be applied (which can have pretty complex elements) like
... some static parents
<a:element1>
<a:subelement tag="someString">
<a:s1>a</a:s1>
</a:subelement>
</a:element1>
<a:element2>b</a:element2>
<a:element3>c</a:element3>
... lots of other elements like the above ones
and 2nd fragment that has XPaths generated from the first XML and a new value, like
<field>
<xpath>/Parent/element1/subelement[#tag="someString"]/s1</xpath>
<newValue>1</newValue>
</field>
<field>
<xpath>/Parent/element2</xpath>
<newValue>2</newValue>
</field>
We might not have new values to apply for all the elements in the first fragment.
I'm struggling to make an XSLT transformation that should apply the new values to the places indicated by the XPaths.
The output should be:
... some static parents
<a:element1>
<a:subelement tag="someString">
<a:s1>1</a:s1>
</a:subelement>
</a:element1>
<a:element2>2</a:element2>
... lots of other elements like the above ones
I have access to xalan:evaluate to evaluate the dynamic xpath. I'm trying different solutions, I will write them here when they will start to make sense.
Any ideas of approaches are well received. Thanks
Oki, I found out how, and I will write the answer here maybe someone sometime will need this:
<xsl:template match="/">
<!-- static parents -->
<a:Root>
<xsl:apply-templates select="/a:Root/a:Parent" />
</a:Root>
</xsl:template>
<xsl:template match="#*|*|text()">
<xsl:variable name="x" select="generate-id(../.)" />
<xsl:variable name="y" select="//field[generate-id(xalan:evaluate(xpath)) = $x]" />
<xsl:choose>
<xsl:when test="$y">
<xsl:value-of select="$y/newValue" />
</xsl:when>
<xsl:otherwise>
<xsl:copy>
<xsl:apply-templates select="#*|*|text()" />
</xsl:copy>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
And to explain the transformation:
I'm writing down part that is static and then call apply-templates on the fragment I'm interested in, that has a liquid structure.
Then I'm using a slightly modified identity transformation that copies everything from source to target (starting from the /a:Root/a:Parent fragment), except when we position ourselves on the text I'm interested in changing.
The text() I'm interested in will have as parent (../.) the element referred by an xpath string found in the second fragment. Variable x means, in the context of the when, this element.
Variable y finds a field element that has as child an xpath element that if evaluated using xalan will refer to the same element that the x variable relates to.
Now I used generate-id() in order to compare the physical elements, otherwise it would have compared by the toString of the element (which is wrong). If variable y doesn't exist, it means that I have no xpath element for this element that could have changed, and I'm leaving it alone. If the y variable exists, I can get from it the newValue and I'm currently positioned on the element which text I want to update.

Debug possible escaped-text issue in XSLT?

I have an XSL template that gives different results when used in two different contexts.
The template manifesting the defect is:
<xsl:template match="*" mode="blah">
<!-- snip irrelevant stuff -->
<xsl:if test="see">
<xsl:message>Contains a cross-ref. <xsl:value-of select="."/></xsl:message>
</xsl:if>
<xsl:apply-templates select="."/>
</xsl:template>
Given:
<el>This is a '<see cref="foo"/>' cross-referenced element.</el>
In one situation, I get the desired result:
Contains a cross-ref. This is a ' ' cross-referenced element.
(the <see/> is being dealt with as an XML element and is ultimately matched by another template.)
But in another situation, the xsl:if doesn't trigger and if I output the contents with <xsl:message><xsl:value-of select="."/>, I get:
This is a '<see cref="foo"/>' cross-referenced element.
It seems to me that in the latter improperly-behaving scenario, it's acting like it's been output-escaped. Does that make sense? Am I barking up the wrong tree? This is a typically complex XSL situation and trying to trace the call-stack is difficult; is there a particular XSLT processing command I should be looking for?

XSLT xsl:sequence. What is it good for..?

I know the following question is a little bit of beginners but I need your help to understand a basic concept.
I would like to say first that I'm a XSLT programmer for 3 years and yet there are some new and quite basics things I've been learning here I never knew (In my job anyone learns how to program alone, there is no course involved).
My question is:
What is the usage of xsl:sequence?
I have been using xsl:copy-of in order to copy node as is, xsl:apply-templates in order to modifiy nodes I selected and value-of for simple text.
I never had the necessity using xsl:sequence. I would appreciate if someone can show me an example of xsl:sequence usage which is preferred or cannot be achieved without the ones I noted above.
One more thing, I have read about the xsl:sequence definition of course, but I couldn't infer how it is useful.
<xsl:sequence> on an atomic value (or sequence of atomic values) is the same as <xsl:copy-of> both just return a copy of their input. The difference comes when you consider nodes.
If $n is a single element node, eg as defined by something like
<xsl:variable name="n" select="/html"/>
Then
<xsl:copy-of select="$n"/>
Returns a copy of the node, it has the same name and child structure but it is a new node with a new identity (and no parent).
<xsl:sequence select="$n"/>
Returns the node $n, The node returned has the same parent as $n and is equal to it by the is Xpath operator.
The difference is almost entirely masked in traditional (XSLT 1 style) template usage as you never get access to the result of either operation the result of the constructor is implicitly copied to the output tree so the fact that xsl:sequence doesn't make a copy is masked.
<xsl:template match="a">
<x>
<xsl:sequence select="$n"/>
</x>
</xsl:template>
is the same as
<xsl:template match="a">
<x>
<xsl:copy-of select="$n"/>
</x>
</xsl:template>
Both make a new element node and copy the result of the content as children of the new node x.
However the difference is quickly seen if you use functions.
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:f="data:,f">
<xsl:variable name="s">
<x>hello</x>
</xsl:variable>
<xsl:template name="main">
::
:: <xsl:value-of select="$s/x is f:s($s/x)"/>
:: <xsl:value-of select="$s/x is f:c($s/x)"/>
::
:: <xsl:value-of select="count(f:s($s/x)/..)"/>
:: <xsl:value-of select="count(f:c($s/x)/..)"/>
::
</xsl:template>
<xsl:function name="f:s">
<xsl:param name="x"/>
<xsl:sequence select="$x"/>
</xsl:function>
<xsl:function name="f:c">
<xsl:param name="x"/>
<xsl:copy-of select="$x"/>
</xsl:function>
</xsl:stylesheet>
Produces
$ saxon9 -it main seq.xsl
<?xml version="1.0" encoding="UTF-8"?>
::
:: true
:: false
::
:: 1
:: 0
::
Here the results of xsl:sequence and xsl:copy-of are radically different.
The most common use case for xsl:sequence is to return a result from xsl:function.
<xsl:function name="f:get-customers">
<xsl:sequence select="$input-doc//customer"/>
</xsl:function>
But it can also be handy in other contexts, for example
<xsl:variable name="x" as="element()*">
<xsl:choose>
<xsl:when test="$something">
<xsl:sequence select="//customer"/>
</xsl:when>
<xsl:otherwise>
<xsl:sequence select="//supplier"/>
</xsl:otherwise>
</xsl:choose>
</xsl:variable>
The key thing here is that it returns references to the original nodes, it doesn't make new copies.
Well to return a value of a certain type you use xsl:sequence as xsl:value-of despite its name always creates a text node (since XSLT 1.0).
So in a function body you use
<xsl:sequence select="42"/>
to return an xs:integer value, you would use
<xsl:sequence select="'foo'"/>
to return an xs:string value and
<xsl:sequence select="xs:date('2013-01-16')"/>
to return an xs:date value and so on. Of course you can also return sequences with e.g. <xsl:sequence select="1, 2, 3"/>.
You wouldn't want to create a text node or even an element node in these cases in my view as it is inefficient.
So that is my take, with the new schema based type system of XSLT and XPath 2.0 a way is needed to return or pass around values of these types and a new construct was needed.
[edit]Michael Kay says in his "XSLT 2.0 and XPath 2.0 programmer's reference" about xsl:sequence: "This innocent looking instruction introduced in XSLT 2.0 has far reaching effects on the capability of the XSLT language, because it means that XSLT instructions and sequence constructors (and hence functions and templates) become capable of returning any value allowed by the XPath data model. Without it, XSLT instructions could only be used to create new nodes in a result tree, but with it, they can also return atomic values and references to existing nodes.".
Another use is to create a tag only if it has a child. An example is required :
<a>
<b>node b</b>
<c>node c</c>
</a>
Somewhere in your XSLT :
<xsl:variable name="foo">
<xsl:if select="b"><d>Got a "b" node</d></xsl:if>
<xsl:if select="c"><d>Got a "c" node</d></xsl:if>
</xsl:variable>
<xsl:if test="$foo/node()">
<wrapper><xsl:sequence select="$foo"/></wrapper>
</xsl:if>
You may see the demo here : http://xsltransform.net/eiZQaFz
It is way better than testing each tag like this :
<xsl:if test="a|b">...</xsl:if>
Because you would end up editing it in two places. Also the processing speed would depend on which tags are in your imput. If it is the last one from your test, the engine will test the presence of everyone before. As $foo/node() is an idioms for "is there a child element ?", the engine can optimize it. Doing so, you ease the life of everyone.

XSLT number only counts instances in current file of a multi-file document

I've been tasked with putting together a book (using XSL FO) from a number of XML-files, now I'm trying to number the figures in this book (simple numbering, no resets at chapters or whatever), my naive approach was to do this
<xsl:template match="figure">
<fo:block xsl:use-attribute-sets="figure">
.. stuff to deal with images ..
<fo:block xsl:use-attribute-sets="figure-caption">
Figure <xsl:number level="any"/>: <xsl:apply-templates/>
</fo:block>
<fo:block xsl:use-attribute-sets="figure-caption">
</xsl:template>
I have an aggregate XML file which selects the files to use using the document() function like so:
<xsl:template match="include">
<xsl:apply-templates select="document(#src)"/>
</xsl:template>
Now, my problem is that number seems to always only count the instances in the current file, which is not what I want (currently, there's only one or two images per file, resulting in all figures being 'Figure 1' or 'Figure 2').
I've considered two approaches, both being essentially two-pass XSLT. First, the straightforward approach, generate an intermediary XML containing the entire book using an identity transform, which I'm reluctant to do for other reasons.
Second, using node-set() extension, which I tried like this
<xsl:template match="include">
<xsl:apply-templates select="ext:node-set(document(#src))"/>
</xsl:template>
but this produced the same result.
Any ideas? Perhaps something which isn't a two-pass transformation? Any help would be greatly appreciated.
The two -pass approach is the more logical and robust one.
One-pass approach is very challenging. One can provide an expression in the value attribute of <xsl:number> and this can be used to sum the "local number" with the maximum accumulated number so far from all previous documents.
However, this requires sequencing the documents (which is something bad in a functional language) and this only works for a flat numbering scheme. In case hierarchical numbering is used (3.4.2), I don't see an easy way to continue from the max number of a previous document.
Due to this considerations, I would definitely merge all documents into one before numbering.
I will also use a two phase transformation. But just for fun, with one include level and no repetition, this stylesheet:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:variable name="vIncludes" select="//include"/>
<xsl:template match="node()|#*">
<xsl:copy>
<xsl:apply-templates select="node()|#*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="include">
<xsl:apply-templates select="document(#src)"/>
</xsl:template>
<xsl:template match="picture">
<xsl:variable name="vRoot" select="generate-id(/)"/>
<xsl:variable name="vInclude"
select="$vIncludes[
$vRoot = generate-id(document(#src))
]"/>
<xsl:copy>
<xsl:value-of
select="count(
document(
(.|$vInclude)/preceding::include/#src
)//picture |
(.|$vInclude)/preceding::picture
) + 1"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
With this input:
<master>
<include src="child3.xml"/>
<block>
<include src="child1.xml"/>
<picture/>
</block>
<include src="child2.xml"/>
<picture/>
</master>
And 'child1.xml'
<child1>
<picture/>
</child1>
And 'child2.xml'
<child2>
<picture/>
</child2>
And 'child3.xml'
<child3>
<picture/>
</child3>
Output:
<master>
<child3>
<picture>1</picture>
</child3>
<block>
<child1>
<picture>2</picture>
</child1>
<picture>3</picture>
</block>
<child2>
<picture>4</picture>
</child2>
<picture>5</picture>
</master>
You could use an ancillary XML documentto keep track of the last figure number and load that file as a document from your stylesheet. Or, if you do not want to manage two output files from the same stylesheet (the "real" FOP output and the figure counter) you can simply load the previous chapter's FOP file and look for the MAX of the figure-caption.
Or you could pass the last figure number as a parameter with default zero and pass the parameter on the command line. The value of this parameter resulting from the parsing of the previous one in the ascending resulting document order.
All these alternatives suppose you are running the transformations in sequence in source document ascending order.
A more structured and robust solution would be to manage transverse document sections such as indices, table of contents and table of figures in as many separate FO documents that would be generated in a "second pass" run with their own XSLT.
I think I would do a prepass which outputs summary information about all the documents in a single XML file, and then use this as a secondary input to the number calculation. The summary information in your case might just be a count of how many figures each document contains, but in many cases it can be useful to hold other information as well such as the IDs of sections that will act as the target of hyperlinks.