Consuming the same node twice in Streaming XSLT - xslt

I am trying to convert some XML to the interstitial JSON representation specified in XSLT 3.0 (for later conversion to JSON via xml-to-json). My current XSLT works well as a non-streaming stylesheet, but I'm running into issues during the conversion to streaming. Specifically, there are situations where I need to consume the same node twice, especially in the case of repeating tags in XML, which I am converting to arrays in equivalent JSON representation.
<xsl:if test="boolean(cdf:AdjudicatorName)">
<array key="AdjudicatorName">
<xsl:for-each select="cdf:AdjudicatorName">
<string>
<xsl:value-of select="."/>
</string>
</xsl:for-each>
</array>
</xsl:if>
boolean(cdf:AdjudicatorName) tests for the existance of the tag, and creates an array if so.
This code fails in Oxygen (Saxon-EE) with the following message
Template rule is declared streamable but it does not satisfy the streamability rules. * There is more than one consuming operand: {fn:exists(...)} on line {X}, and {<string {(attr{key=...}, ...)}/>} on line {Y}
I am aware of the copy-of workaround, however, many items in the source file can repeat at the highest level, so use of this approach would yield minimal memory savings.

This looks like the perfect use case for xsl:where-populated:
<xsl:where-populated>
<array key="AdjudicatorName">
<xsl:for-each select="cdf:AdjudicatorName">
<string>
<xsl:value-of select="."/>
</string>
</xsl:for-each>
</array>
</xsl:where-populated>
The xsl:where-populated instruction (which was invented for exactly this purpose) logically evaluates its child instructions, and then eliminates any items in the resulting sequence that are "deemed empty", where an element is deemed empty if it has no children. In a streaming implementation the start tag (<array>) will be "held back" until its first child is generated, and when the corresponding end tag (</array>) is emitted, the pair of tags will be discarded if there were no intervening children emitted.

Related

In XSLT nested foreach loop always return first element

XLST receiving the date from apache-camel in below formate.
data format
<list>
<linked-hash-map>
<entry key="NAME">test1</entry>
</linked-hash-map>
<linked-hash-map>
<entry key="NAME">test2</entry>
</linked-hash-map>
</list>
My XSLT:
<xsl:stylesheet>
<xsl:template match="*">
<xsl:for-each select="//*[local-name()='linked-hash-map']">
<tag1>
<xsl:value-of select="string(//*[local-name()='entry'][#key='NAME'])"/>
</tag1t>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
OUTPUT always returns the first element.
<tag1>test1<tag1>
<tag1>test1<tag1>
What is wrong in above xslt and help generate xml with all elements.
Because path expressions starting with "//" select from the root of the document tree, you are selecting the same nodes every time in your xsl:value-of; and in XSLT 1.0, if you select multiple nodes, only the first one gets displayed.
Methinks you're using "//" because you've seen it in example code and don't actually understand what it means...
Within xsl:for-each, you normally want a relative path that selects from the node currently being processed by the for-each.
You've also probably picked up this *[local-name()='linked-hash-map'] habit from other people's code. With no namespaces involved, you can safely replace it with linked-hash-map.

Constructing, not selecting, XSL node set variable

I wish to construct an XSL node set variable using a contained for-each loop. It is important that the constructed node set is the original (a selected) node set, not a copy.
Here is a much simplified version of my problem (which could of course be solved with a select, but that's not the point of the question). I've used the <name> node to test that the constructed node set variable is in fact in the original tree and not a copy.
XSL version 1.0, processor is msxsl.
Non-working XSL:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" encoding="iso-8859-1" omit-xml-declaration="yes" />
<xsl:template match="/">
<xsl:variable name="entries">
<xsl:for-each select="//entry">
<xsl:copy-of select="."/>
</xsl:for-each>
</xsl:variable>
<xsl:variable name="entryNodes" select="msxsl:node-set($entries)"/>
<xsl:for-each select="$entryNodes">
<xsl:value-of select="/root/name"/>
<xsl:value-of select="."/>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
XML input:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<name>X</name>
<entry>1</entry>
<entry>2</entry>
</root>
Wanted output:
X1X2
Actual output:
12
Of course the (or a) problem is the copy-of, but I can't work out a way around this.
There isn't a "way around it" in XSLT 1.0 - it's exactly how this is supposed to work. When you have a variable that is declared with content rather than with a select then that content is a result tree fragment consisting of newly-created nodes (even if those nodes are a copy of nodes from the original tree). If you want to refer to the original nodes attached to the original tree then you must declare the variable using select. A better question would be to detail the actual problem and ask how you could write a suitable select expression to find the nodes you want without needing to use for-each - most uses of xsl:if or xsl:choose can be replaced with suitably constructed predicates, maybe involving judicious use of xsl:key, etc.
In XSLT 2.0 it's much more flexible. There's no distinction between node sets and result tree fragments, and the content of an xsl:variable is treated as a generic "sequence constructor" which can give you new nodes if you construct or copy them:
<xsl:variable name="example" as="node()*">
<xsl:copy-of select="//entry" />
</xsl:variable>
or the original nodes if you use xsl:sequence:
<xsl:variable name="example" as="node()*">
<xsl:sequence select="//entry" />
</xsl:variable>
I wish to construct an XSL node set variable using a contained
for-each loop.
I have no idea what that means.
It is important that the constructed node set is the original (a
selected) node set, not a copy.
This part I think I understand a little better. It seems you need to replace:
<xsl:variable name="entries">
<xsl:for-each select="//entry">
<xsl:copy-of select="."/>
</xsl:for-each>
</xsl:variable>
with:
<xsl:variable name="entries" select="//entry"/>
or, preferably:
<xsl:variable name="entries" select="root/entry"/>
The resulting variable is a node-set of the original entry nodes, so you can do simply:
<xsl:for-each select="$entries">
<xsl:value-of select="/root/name"/>
<xsl:value-of select="."/>
</xsl:for-each>
to get your expected result.
Of course, you could do the same thing by operating directly on the original nodes, in their original context - without requiring the variable.
In response to the comments you've made:
We obviously need a better example here, but I think I am getting a vague idea of where you want to go with this. But there are a few things you must understand first:
1.
In order to construct a variable which contains a node-set of nodes in their original context, you must use select. This does not place any limits whatsoever on what you can select. You can do your selection all at once, or in stages, or even in a loop (here I mean a real loop). You can combine the intermediate selections you have made in any way sets can be combined: union, intersection, or difference. But you must use select in all these steps, otherwise you will end up with a set of new nodes, no longer having the context they did in the source tree.
IOW, the only difference between using copy and select is that the former creates new nodes, which is precisely what you wish to avoid.
2.
xsl:for-each is not a loop. It has no hierarchy or chronology. All the nodes are processed in parallel, and there is no way to use the result of previous iteration in the current one - because no iteration is "previous" to another.
If you try to use xsl:for-each in order to add each of n processed nodes to a pre-existing node-set, you will end up with n results, each containing the pre-existing node-set joined with one of the processed nodes.
3.
I think you'll find the XPath language is quite powerful, and allows you to select the nodes you want without having to go through the complicated loops you hint at.
It might help if you showed us a problem that can't be trivially solved in XSLT 1.0. You can't solve your problem the way you are asking for: there is no equivalent of xsl:sequence in XSLT 1.0. But the problem you have shown us can be solved without such a construct. So please explain why you need what you are asking for.

Debug possible escaped-text issue in XSLT?

I have an XSL template that gives different results when used in two different contexts.
The template manifesting the defect is:
<xsl:template match="*" mode="blah">
<!-- snip irrelevant stuff -->
<xsl:if test="see">
<xsl:message>Contains a cross-ref. <xsl:value-of select="."/></xsl:message>
</xsl:if>
<xsl:apply-templates select="."/>
</xsl:template>
Given:
<el>This is a '<see cref="foo"/>' cross-referenced element.</el>
In one situation, I get the desired result:
Contains a cross-ref. This is a ' ' cross-referenced element.
(the <see/> is being dealt with as an XML element and is ultimately matched by another template.)
But in another situation, the xsl:if doesn't trigger and if I output the contents with <xsl:message><xsl:value-of select="."/>, I get:
This is a '<see cref="foo"/>' cross-referenced element.
It seems to me that in the latter improperly-behaving scenario, it's acting like it's been output-escaped. Does that make sense? Am I barking up the wrong tree? This is a typically complex XSL situation and trying to trace the call-stack is difficult; is there a particular XSLT processing command I should be looking for?

Break the XSLT for-each loop when first match is found

I am having trouble to display the first matching value, like
<test>
<p>30</p>
<p>30{1{{23{45<p>
<p>23{34</p>
<p>30{1{98</p>
</test>
<test2>
<p1>text</p1>
</test2>
So i want to loop through the <test></test> and find the value of <p> node whose string length is greater than 2 and that contains 30. I want only the first value.
so i tired the following code
<xsl:variable name="var_test">
<xsl:for-each select="*/*/test()>
<xsl:if string-length(p/text())>2 and contains(p/text(),'30'))
<xsl:value-of select="xpath">
</xsl:variable>
the problem is the var_test is being null always.
if i try directly with out any variable
<xsl:for-each select="*/*/test()>
<xsl:if string-length(p/text())>2 and contains(p/text(),'30'))
<xsl:value-of select="xpath">
I am getting the following output
<p>30{1{23{4530{1{98</p>
but the desired output is
<p>0{1{23{45</p>
so how can i achieve this?
Instead of the for-each, use
<xsl:copy-of select="(*/*/test/p[string-length() > 2 and
contains(.,'30'))] )[1]" />
The [1] selects only the first matching <p>. (Updated: I changed the XPath above in response to #markusk's comment.)
The above will output that <p> element as well as its text content, as shown in your "desired output". If you actually want only the value of the <p>, that is, its text content, use <xsl:value-of> instead of <xsl:copy-of>.
Addendum:
The idea of breaking out of a loop does not apply to XSLT, because it is not a procedural language. In a <xsl:for-each> loop, the "first" instantiation (speaking in terms of document order, or sorted order) of the loop is not necessarily evaluated at a time chronologically before the "last" instantiation. They may be evaluated in any order, or in parallel, because they do not depend on each other. So trying to "break out of the loop", which is intended to cause "subsequent" instantiations of the loop not to be evaluated, cannot work: if it did, the outcome of later instantiations would be dependent on earlier instantiations, and parallel evaluation would be ruled out.

XSL unique value key

Goal
(XSLT 1.0). My goal is to take a set of elements, S, and produce another set, T, where T contains the unique elements in S. And to do so as efficiently as possible. (Note: I don't have to create a variable containing the set, or anything like that. I just need to loop over the elements that are unique).
Example Input and Key
<!-- My actual input consists of a bunch of <Result> elements -->
<AllMyResults>
<Result>
<someElement>value</state>
<otherElement>value 2</state>
<subject>Get unique subjects!</state>
</Result>
</AllMyResults>
<xsl:key name="SubjectKey" match="AllMyResults/Result" use="subject"/>
I think the above works, but when I go to use my key, it is incredibly slow. Below is the code for how I use my key.
<xsl:for-each select="Result[count(. | key('SubjectKey', subject)[1]) = 1]">
<xsl:sort select="subject" />
<!-- Do something with the unique subject value -->
<xsl:value-of select="subject" />
</xsl:for-each>
Additional Info
I believe I am doing this wrong because it slowed down my XSL considerably. As some additional info, the code shown above is in a separate XSL file from my main XSL file. From the main XSL, I am calling a template that contains the xsl:key and the for-each shown above. The input to this template is an xsl:param containing my node-set (similar to the example input shown above).
I can't see any reason from the information given why the code should be slow. It might be worth seeing if the slowness is something that happens on all XSLT processors, or if it's peculiar to one.
Try substituting
count(. | key('SubjectKey', subject)[1]) = 1
with:
generate-id() = generate-id(key('SubjectKey', subject)[1])
In some XSLT processors the latter is much faster.