Variable for XPath nodes - xslt

I'm starting to learn XSLT/XPath, and I copied the following from a study guide, making some modifications:
<xsl:variable name="fname" select="'polist.xml'"/>
<xsl:variable name="thePath" select="'/collection/doc'"/>
...
<xsl:value-of select="count(doc($fname)/collection/doc)"/>
It reports the number of doc elements in the XML file. The doc() function accepts the file name variable 'fname'. But if I try to do the same with the 'thePath' variable in the count() function, using $thePath instead of the "/collection/doc" text, I get an error.
Suggestions on whether/how to use the 'thePath' variable in the count() function? Is it possible? Thanks!

Learning from examples leaves you very exposed to this kind of problem: it's easy to build a completely incorrect mental model of how the examples actually work. That's why I always advise people to start by reading a good book that explains the concepts first.
In your case you've made a common mistake, which is to assume that variables work like macros, that is, that they represent fragments of XPath text that can be substituted into an expression. That's not the case: variables represent values, the result of evaluating an expression, and you can only use a variable in places where a literal value (like a number or string) could appear.
(I suspect it's the use of the $ sign that leads to this false impression. $ is often used to represent variables in macro-like languages, for example shell scripts).
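To make the distinction concrete, here is a minimal sketch. It assumes an XSLT 2.0 processor (the question uses doc()) and that polist.xml can be resolved relative to the stylesheet. The variable name docs is made up for illustration: it holds the selected nodes themselves, not the text of a path.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:variable name="fname" select="'polist.xml'"/>

  <!-- A string value: it is never re-parsed as a path. -->
  <xsl:variable name="thePath" select="'/collection/doc'"/>

  <!-- A node sequence (hypothetical name): the path is evaluated here,
       once, and the resulting nodes are what the variable holds. -->
  <xsl:variable name="docs" select="doc($fname)/collection/doc"/>

  <xsl:template match="/">
    <!-- Counts the doc elements that were selected above. -->
    <xsl:value-of select="count($docs)"/>
  </xsl:template>
</xsl:stylesheet>
```

The variable works here because a node sequence is a value, whereas a piece of path syntax is not.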
In XPath 1.0 there's no direct way of achieving what you are trying to do. In practice people either use vendor extensions for this, or they construct a pipeline in which phase 1 generates an XSLT stylesheet and phase 2 executes it (that's easier in XSLT than in most other languages, because XSLT is XML and can therefore be easily manipulated using XSLT).
In XSLT 3.0 you can evaluate XPath expressions supplied in the form of a string using the xsl:evaluate instruction. But very often, the requirement can be met better using functions. We don't know what the real underlying requirement is here, so it's hard to know whether that's true in this case.
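As a rough illustration of the "functions" alternative, the sketch below stores the selection as an XPath 3.0 inline function instead of a string. It assumes an XSLT 3.0 processor; the name selectDocs is hypothetical and not part of the original question.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:variable name="fname" select="'polist.xml'"/>

  <!-- The "path" is held as a function item (a value), not as text. -->
  <xsl:variable name="selectDocs"
      select="function($d as document-node()) as element()* { $d/collection/doc }"/>

  <xsl:template match="/">
    <!-- Call the function on the loaded document and count the result. -->
    <xsl:value-of select="count($selectDocs(doc($fname)))"/>
  </xsl:template>
</xsl:stylesheet>
```

Unlike a string passed to xsl:evaluate, the function body is compiled with the stylesheet, so syntax errors in the path show up at compile time.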

An example of xsl:evaluate in XSLT 3.0:
<xsl:evaluate xpath="'count(' || $thePath || ')'" context-item="doc($fname)"/>

Related

What's the rationale behind result tree fragments?

XSLT 1.0 adds an additional data type to those provided by XPath 1.0: result tree fragments.
This additional data type is called result tree fragment. A variable may be bound to a result tree fragment instead of one of the four basic XPath data-types (string, number, boolean, node-set). A result tree fragment represents a fragment of the result tree. A result tree fragment is treated equivalently to a node-set that contains just a single root node. However, the operations permitted on a result tree fragment are a subset of those permitted on a node-set. An operation is permitted on a result tree fragment only if that operation would be permitted on a string (the operation on the string may involve first converting the string to a number or boolean). In particular, it is not permitted to use the /, //, and [] operators on result tree fragments.
— https://www.w3.org/TR/xslt-10/#section-Result-Tree-Fragments
To me, this seems pointless. I cannot understand why anybody would want to do this! Result tree fragments just seem like a rubbish version of node-sets, requiring two intermediate variables and a language extension to allow a programmer to work around this seemingly arbitrary limitation.
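For readers who have not hit this limitation, a minimal sketch of the workaround being described, assuming an XSLT 1.0 processor that supports the EXSLT common module (the item elements and variable names are invented for illustration):

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    exclude-result-prefixes="exsl">

  <xsl:template match="/">
    <!-- First variable: its content makes it a result tree fragment. -->
    <xsl:variable name="rtf">
      <item>a</item>
      <item>b</item>
    </xsl:variable>

    <!-- count($rtf/item) is not permitted in XSLT 1.0, because "/" may not
         be applied to a result tree fragment. A second variable and an
         extension function are needed to convert it to a node-set. -->
    <xsl:variable name="nodes" select="exsl:node-set($rtf)"/>

    <count><xsl:value-of select="count($nodes/item)"/></count>
  </xsl:template>
</xsl:stylesheet>
```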
To further pile on the uselessness of result tree fragments, here's the compatibility shim I put together to replicate exsl:node-set in MSXSL:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:exsl="http://exslt.org/common"
xmlns:msxsl="urn:schemas-microsoft-com:xslt"
exclude-result-prefixes="exsl msxsl">
<!-- exsl:node-set -->
<msxsl:script language="JScript" implements-prefix="exsl"><![CDATA[
this['node-set'] = function (x) {
return x;
}
]]></msxsl:script>
</xsl:stylesheet>
This literally just returns the result tree fragment unchanged, suggesting that MSXSL doesn't even bother with implementing result tree fragment as a different type and just treats it identically to a node-set, further suggesting that there's no real point to it in the first place!
Why do result tree fragments exist?
What is the use-case?
Why were they added?
Why not just use a node-set?
I wasn't on the Working Group at the time, but the following exchange might shed some light. In April 2001, during the development of XSLT 1.1, I asked the WG:
Can any one try to explain to me why there is a perceived problem with
the "result tree fragment as node-set" facility as defined in the XSLT
1.1 WD? I keep hearing that it won't work with the XPath 2.0 type system, but I can't see why.
I can see us wanting to change it so that the data type is "node"
rather than "node-set", but apart from that, I fail to see what the
problem is.
Is it perhaps that someone has in mind doing away with the root node
of the temporary tree, and making the value of the variable instead be
the sequence of nodes that are currently modelled as children of this
root? If so, why would that change be useful?
James Clark replied:
Is it perhaps that someone has in mind doing away with the root node of the temporary tree, and making the value of the variable instead be the sequence
of nodes that are currently modelled as children of this root?
Yes.
If so, why would that change be useful?
(a) So instructions can return nodes without copying them.
(b) So that you can use instructions to return things other than
nodes.
Explaining things more than this would require me to explain how I
hope to see XPath, XSLT and XQuery all fitting together. At this
point, let me just say that I think we need to harmonize element
construction in XSLT and XQuery. This will naturally lead to there
being much less of a gulf between expressions and instructions. I
think it will turn out to be just as awkward and inappropriate for
xsl:variable to automagically copy and wrap in a root node the value
produced by instantiating its content as it would be for it to do this
to the value produced by evaluating the expression specified in the
select attribute.
I think the WG invented the concept of "result tree fragments" because they wanted to keep options open for the future. They had ideas how the language would evolve, and they thought that making xsl:variable create a full blown node with full navigation capability would restrict the options for the future.
In retrospect I'm convinced it was a mistake, because it didn't actually achieve this objective. When we abolished RTFs in 2.0, we still found it necessary, for backwards compatibility reasons, to have the bizarre rule that xsl:variable always constructs a document node if there is no "as" attribute.
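The rule about the "as" attribute can be seen in a small sketch, assuming an XSLT 2.0 (or later) processor; the variable names wrapped and bare are made up for illustration:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- No "as" attribute: the content is wrapped in a new document node. -->
    <xsl:variable name="wrapped">
      <item>a</item>
      <item>b</item>
    </xsl:variable>

    <!-- With "as": the variable holds the two item elements directly. -->
    <xsl:variable name="bare" as="element(item)*">
      <item>a</item>
      <item>b</item>
    </xsl:variable>

    <!-- One document node, its two item children, and two bare elements. -->
    <xsl:value-of select="count($wrapped), count($wrapped/item), count($bare)"/>
  </xsl:template>
</xsl:stylesheet>
```

The three counts should come out as 1, 2 and 2: only the first variable has the extra wrapping node that result tree fragments left behind.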
It's worth noting that no-one in the WG ever imagined that people would still be using XSLT 1.0 twenty years later. 1.0 took about two years to develop and the WG fully expected that within two years, it would be completely superseded by a later version. They were therefore very willing to put restrictions in the language if they kept options open for the next version.

Regular Expression for whole word

First of all, I use C# 4.0 to parse the code of a VB6 application.
I have some old VB6 code and about 500+ copies of it. And I use a regular expression to grab all kinds of global variables from the code. The code is described as "Yuck" and some poor victim still has to support this. So I'm hoping to help this poor sucker a bit by generating overviews of specific constants. (And yes, it should be rewritten but it ain't broke, so...)
This is a sample of a code line I need to match, in this case all boolean constants:
Public Const gDemo = False 'Is this a demo version
And this is the regular expression I use at this moment:
Public\s+Const\s+g(?'Name'[a-zA-Z][a-zA-Z0-9]*)\s+=\s+(?'Value'[0-9]*)
I think this one is yucky too, because of the * at the end of the Value group. But if I leave it out, it only returns 'T' or 'F', and I want the whole word.
Is this the proper RegEx to use as solution or is there an even nicer-looking option?
FYI, I use similar regexs to find all string constants and all numeric constants. Those work just fine. And basically the same .BAS file is used for all 50 copies but with different values for all these variables. By parsing all files, we have a good overview of how every version is configured.
And again, yes, we need to rebuild the whole project from scratch since it becomes harder to maintain these days. But it works and we need the manpower for other tasks. It just needs the occasional tweaks...
You can use: Public\s+Const\s+g(?<Name>[a-zA-Z][a-zA-Z0-9]*)\s+=\s+(?<Value>False|True)

Xslt: <xsl:value-of select="MyPath/$MyVariable" failed

<xsl:value-of select="$MyVar"/>
works but
<xsl:value-of select="MyDataPfath/$MyVar"/>
does not work.
What is wrong in my code?
From the look of it, what you are trying to achieve is 'dynamic evaluation'. XSLT does not support the dynamic evaluation of XPath by default, so you will need to make use of an extension function.
Depending on your XSLT processor, you might want to look at the EXSLT extensions, in particular the dynamic module at http://www.exslt.org/dyn/index.html. This would allow you to build the path as a string and evaluate it, something like this:
<xsl:value-of select="dyn:evaluate(concat('MyDataPfath/', $MyVar))"/>
However, in your case, perhaps the $MyVar contains just a single element name. In which case you could change your command to the following, which would work without any extension functions
<xsl:value-of select="MyDataPfath/*[local-name() = $MyVar]"/>
Your code didn't fail, it did exactly what the specification says it should do. Which was different from what you were hoping/imagining that it might do.
Your hopes/imagination were based on a fundamental misunderstanding of the nature of variables in XPath. XPath variables are not macros. They don't work by textual substitution; they represent values. If the variable $E contains the string "X", then MyPath/$E means the same as MyPath/"X", which is illegal in XPath 1.0, and in XPath 2.0 returns as many instances of the string "X" as there are nodes in MyPath.
You probably intended MyPath/*[name()=$E]
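A minimal sketch of that suggestion as a complete XSLT 1.0 stylesheet. The parameter value 'Hi' and the assumption that MyPath is the document element are illustrative only:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical parameter: it holds an element name as a string,
       not a fragment of path syntax. -->
  <xsl:param name="MyVar" select="'Hi'"/>

  <xsl:template match="/">
    <!-- Select the children of MyPath whose name equals the string in $MyVar. -->
    <xsl:value-of select="MyPath/*[name() = $MyVar]"/>
  </xsl:template>
</xsl:stylesheet>
```

The variable appears inside a predicate, where a string literal could legally appear, so no extension function is needed.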
It is not possible to get the value with the syntax 'MyDataPfath/$MyVar' in a select expression; it will not be recognized as the path you intended.
Suppose $MyVar has the value 'Hi'. The expression is then treated as 'MyDataPfath/"Hi"', which is not a valid path to the node you want to retrieve from the XML.
To work around this limitation, you can use the name() or local-name() function in a predicate, that is, MyDataPfath/*[name() = $MyVar] or MyDataPfath/*[local-name() = $MyVar].

Boolean expressions in XSLT select statements

I have the following XSLT code that almost does what I want:
<xsl:variable name="scoredItems"
select=
".//item/attributes/scored[@value='true'] |
self::section[attributes/variable_name/@value='SCORE']/item |
.//item//variables//variable_name"/>
I want to change this to a more complicated boolean expression:
<xsl:variable name="scoredItems"
select=
".//item/attributes/scored[@value='true'] or
(self::section[variable_name/@value='SCORE']/item and
(not (.//item/attributes/scored[@value='false']))) or
.//item//variables//variable_name"/>
However, when I run this, I get the following error:
javax.xml.transform.TransformerConfigurationException: Could not compile stylesheet
at org.apache.xalan.xsltc.trax.TransformerFactoryImpl.newTemplates(TransformerFactoryImpl.java:832)
at org.apache.xalan.xsltc.trax.TransformerFactoryImpl.newTransformer(TransformerFactoryImpl.java:618)
How do I fix this? (Note that I'm using XSLT 1.0.)
In my experience, the default exception thrown by XSLT in Java is not very helpful. You'll need to implement an instance of ErrorListener and use its methods to capture and report the true XSLT problem. You can attach this ErrorListener using the setErrorListener method of your TransformerFactory.
I would strongly discourage anyone from writing complicated expressions, in any language!
This is not an XSLT question at all. It is a general programming question and the answer is:
Never write overly complicated expressions, because they are hard to write, read, test, verify, prove correct, and change.
Split a complicated expression onto a number of simpler expressions and assign them to different variables. Then operate on these variables.
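Applied to the expression in the question, the advice might look like the sketch below. It assumes a true/false result is what is wanted (an expression using "or" yields a boolean, not a node-set) and that the context node is a section element; all variable names are made up for illustration.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="section">
    <!-- Each sub-condition gets its own named boolean variable. -->
    <xsl:variable name="hasScoredTrue"
        select="boolean(.//item/attributes/scored[@value='true'])"/>
    <xsl:variable name="hasScoredFalse"
        select="boolean(.//item/attributes/scored[@value='false'])"/>
    <xsl:variable name="isScoreSection"
        select="boolean(self::section[variable_name/@value='SCORE']/item)"/>
    <xsl:variable name="hasVariableName"
        select="boolean(.//item//variables//variable_name)"/>

    <!-- The final condition is now short enough to read at a glance. -->
    <xsl:variable name="isScored"
        select="$hasScoredTrue
                or ($isScoreSection and not($hasScoredFalse))
                or $hasVariableName"/>

    <xsl:value-of select="$isScored"/>
  </xsl:template>
</xsl:stylesheet>
```

Each intermediate variable can be output on its own while debugging, which makes it easier to find the sub-expression that misbehaves.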

XSL/XPath Indentation

What conventions (if any) do you use for indenting XSL code?
How do you deal with really long, complicated XPaths?
Can you plug them into your XML editor of choice?
Is there some open source code that does the job well?
For some background, I use nxml-mode in Emacs. For the most part it's OK and you can configure the number of spaces that child elements should be indented. It's not very good, though, when it comes to complicated XPaths. If I have a long XPath in my code, I like to make its structure as transparent as possible by making it look something like this...
<xsl:for-each select="/some
/very[@test = 'whatever']
/long[@another-test = perhaps
/another
/long
/xpath[@goes='here']]
/xpath"
However, I currently have to do that manually as nxml will just align it all up with the "/some.."
Sometimes a longer xpath can't be avoided, even if you use templates instead of for-eaches (like you should, if you can). This is especially true in XSLT/XPath 2.0:
<xsl:attribute name="tablevel"
select="if (following::*[self::topic | self::part])
then (following::*[self::topic | self::part])[1]/@tablevel
else @tablevel"/>
I tend not to break a "simple" path across lines, but will break the "greater" path at operators or conditionals.
For editing, I use Oxygen (which is cross-platform) and it handles this kind of spacing pretty well. Sometimes it doesn't predict what you want exactly, but it will maintain the space once it's there, even if you re-indent your code.
In my opinion, long xpaths are hard to read and should be avoided. There are 2 ways to do it:
Simplify the source xml.
Split big templates into smaller ones.
Don't use long xpaths. Ditch the for-each and use match templates. Break down the xpath into several templates. It's much easier to read a bunch of trivial match templates than one of these.
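One way to read this suggestion, sketched against the long path from the question (XSLT 1.0; the result and found output elements are invented for illustration):

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <result>
      <!-- Only the first, simple part of the path is written here. -->
      <xsl:apply-templates select="some/very[@test = 'whatever']/long"/>
    </result>
  </xsl:template>

  <!-- A "long" element that passes the test: descend into its xpath children. -->
  <xsl:template match="long[@another-test = perhaps/another/long/xpath[@goes = 'here']]">
    <xsl:apply-templates select="xpath"/>
  </xsl:template>

  <!-- Any other "long" element: produce nothing (lower default priority). -->
  <xsl:template match="long"/>

  <!-- The actual output for each selected node. -->
  <xsl:template match="long/xpath">
    <found><xsl:value-of select="."/></found>
  </xsl:template>
</xsl:stylesheet>
```

Each template carries one step of the original path, so no single expression has to be broken across lines.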
I tend to break down the XSL differently if I'm having difficulty reading the xpath statements (which isn't very often, but it happens occasionally)... it's actually rather similar to my methods of breaking up syntax for other languages... So your example in the question might become something more like this:
<xsl:for-each select="/some/very[#test = 'whatever']/long">
<xsl:if test="@another-test = perhaps/another/long/xpath[@goes='here']">
<xsl:for-each select="xpath">
... result xml ....
</xsl:for-each>
</xsl:if>
</xsl:for-each>