not(@attribute) test not working in JDom XSL transform? - xslt

I have a piece of an XSLT stylesheet that works as expected using xsltproc but produces a different output in my actual application, where the transform is applied via org.jdom.transform.XSLTransformer (jdom 1.0), I believe using Xalan.
Stylesheet snippet (this is part of a larger template that starts like this: <xsl:template match="/dspace:dim[@dspaceType='ITEM']">):
<xsl:if test="//dspace:field[@mdschema='dc' and @element='rights']">
<rightsList>
<xsl:if test="//dspace:field[@mdschema='dc' and @element='rights' and not(@qualifier) and @language='*']">
<rights>
<xsl:if test="//dspace:field[@mdschema='dc' and @element='rights' and @qualifier='uri' and @language='*']">
<xsl:attribute name="rightsUri">
<xsl:value-of select="//dspace:field[@mdschema='dc' and @element='rights' and @qualifier='uri' and @language='*']"/>
</xsl:attribute>
</xsl:if>
<xsl:value-of select="//dspace:field[@mdschema='dc' and @element='rights' and not(@qualifier) and @language='*']" />
</rights>
</xsl:if>
<xsl:apply-templates select="//dspace:field[@mdschema='dc' and @element='rights' and not(@language='*')]" />
</rightsList>
</xsl:if>
and
<xsl:template match="//dspace:field[@mdschema='dc' and @element='rights' and not(@language='*')]">
<rights><xsl:value-of select="." /></rights>
</xsl:template>
XML snippet:
<dim:dim dspaceType="ITEM" xmlns:dim="http://www.dspace.org/xmlns/dspace/dim">
<dim:field element="rights" language="en_NZ" mdschema="dc">Actual text redacted</dim:field>
<dim:field element="rights" language="*" mdschema="dc">Attribution 3.0 New Zealand</dim:field>
<dim:field element="rights" qualifier="uri" language="*" mdschema="dc">http://creativecommons.org/licenses/by/3.0/nz/</dim:field>
</dim:dim>
With xsltproc, this produces
<rightsList>
<rights rightsUri="http://creativecommons.org/licenses/by/3.0/nz/">Attribution 3.0 New Zealand</rights>
<rights>Actual text redacted</rights>
</rightsList>
In my application, this produces
<rightsList>
<rights>Actual text redacted</rights>
<rights>Attribution 3.0 New Zealand</rights>
<rights>http://creativecommons.org/licenses/by/3.0/nz/</rights>
</rightsList>
So to me it looks like the not(@qualifier) bit doesn't work using jdom.
I'd appreciate any insight into what's going on here and how I might change the stylesheet to get the same result in my application that I currently get via xsltproc.
Edited to add: just in case it makes any difference, the stylesheet starts out as
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:dspace="http://www.dspace.org/xmlns/dspace/dim"
xmlns:exslt="http://exslt.org/common"
xmlns="http://datacite.org/schema/kernel-3"
extension-element-prefixes="exslt"
exclude-result-prefixes="exslt"
version="1.0">
and also includes this template:
<!-- Don't copy everything by default! -->
<xsl:template match="@* | text()" />
See my answer below: the XML structure is actually different from what I thought it was, so the problem wasn't in the XSL after all.

Apart from solving your original problem, let's have a quick look at how to reorganize your code.
You use a lot of //foo expressions. Starting an expression with //foo means "search the whole document, at any level, for the element with the name foo". Apart from this being a potentially expensive operation, this often has unwanted side effects and makes your code hard to read, because it requires you to specify each element uniquely, leading to a lot of duplicated code.
You also use a lot of xsl:if, but in XSLT, it is hardly ever necessary to use if-statements (an exception in XSLT 1.0 and 2.0 being when you deal with something other than nodes). In almost all cases, you can replace an xsl:if with a simple xsl:apply-templates.
That said, let's look at how we can rewrite your code to get the same effect with less chance of error:
<xsl:if test="//dspace:field[@mdschema='dc' and @element='rights']">
<rightsList>
.....
is similar to having a matching template as follows (assuming you have a throw-away template for uninteresting nodes):
<xsl:template match="dspace:dim[dspace:field[@mdschema='dc' and @element='rights']]">
<rightsList>
This says: if you encounter a dim element with any field element that has those attributes set, then output <rightsList>.
Then you have:
<xsl:if test="//dspace:field[@mdschema='dc' and @element='rights' and not(@qualifier) and @language='*']">
<rights>
which is precisely equivalent to the following apply-templates expression (assuming a matching template goes with it):
<xsl:apply-templates select="dspace:field[@mdschema='dc' and @element='rights' and not(@qualifier) and @language='*']" />
A little below that, we find an almost identical expression, this time with not(@language='*'). So let's see if we can get rid of those duplicate expressions altogether.
First, let's go back a bit and have a look at what you were doing:
If there is any "dc"/"rights" field anywhere, create a <rightsList>
If any of these has no qualifier but has language "*", create <rights>
Inside this, create an attribute rightsUri if any such field anywhere has qualifier "uri" and language "*", and set its value to the first one you find
After this <rights> element (there can be at most one of them in your current structure), create a <rights> for each field element whose language is not "*"
If this is correct, then this can be rewritten as follows:
<xsl:template match="dspace:dim[dspace:field[@mdschema='dc' and @element='rights']]">
<xsl:variable name="adjusted">
<xsl:copy-of select="dspace:field[@mdschema='dc' and @element='rights']"/>
</xsl:variable>
<rightsList>
<xsl:apply-templates select="exsl:node-set($adjusted)/*[not(@qualifier) and @language='*'][1]" mode="noquali"/>
<xsl:apply-templates select="exsl:node-set($adjusted)/*[not(@language='*')]" />
</rightsList>
</xsl:template>
<xsl:template match="dspace:field" mode="noquali">
<rights>
<xsl:apply-templates select="/dspace:field[@qualifier='uri' and @language='*'][1]" mode="uri"/>
<xsl:value-of select="."/>
</rights>
</xsl:template>
<xsl:template match="dspace:field" mode="uri">
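<!-- XSLT 1.0 has no select attribute on xsl:attribute, so the value goes in the body -->
<xsl:attribute name="rightsUri"><xsl:value-of select="."/></xsl:attribute>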
</xsl:template>
<!-- matching anything else -->
<xsl:template match="dspace:field">
<rights><xsl:value-of select="." /></rights>
</xsl:template>
The exsl:node-set function is supported by just about every XSLT 1.0 processor, just add the namespace xmlns:exsl="http://exslt.org/common" to your xsl:stylesheet declaration.
Note that I added [1] to a few of the select expressions. Your code doesn't do that, but it has the same effect, because in XSLT 1.0 xsl:value-of outputs only the first node of its selection. With apply-templates, every matching node is processed, so you have to state explicitly that you are only interested in the first match.
I think your code can be further simplified, but I wanted to make sure that the logic remains exactly the same. As you can see, the end result is without any //. However, you do see one /, which is now pointing to the root of the node-set, which conveniently only has the nodes you are interested in: the ones with schema "dc" and "rights" element attributes, so we do not have to repeat that expression over and over again.
You may try this rewrite and see if it helps with your current bug; otherwise I'll gladly help you further.
Edit
After your edit, your original context item will have been dspace:dim already. If you don't mind always outputting <rightsList> (even if it ends up empty), you can simply replace my first template match pattern above with your existing dspace:dim pattern.

Duh. Forest/trees indeed. Even though the language attribute is called "language" pretty much everywhere else in the application (see also, the XML snippet I gave), it is actually called "lang" in the XML that my stylesheet operates on - I finally gave in and used this answer to be sure what the XML structure is. Surprise!
Anyway, I followed the advice Abel gave in his answer in part and simplified the templates for this particular case quite a bit. I now just have
<xsl:if test="dspace:field[@mdschema='dc' and @element='rights']">
<rightsList>
<xsl:apply-templates select="dspace:field[@mdschema='dc' and @element='rights']"/>
</rightsList>
</xsl:if>
in the big template, plus a couple of custom ones:
<xsl:template match="dspace:field[@mdschema='dc' and @element='rights']">
<xsl:choose>
<xsl:when test="@qualifier='uri'"/>
<xsl:otherwise>
<rights>
<xsl:if test="@lang='*'">
<xsl:apply-templates select="//dspace:field[@mdschema='dc' and @element='rights' and @qualifier='uri' and @lang='*'][1]" mode="rightsURI"/>
</xsl:if>
<xsl:value-of select="."/>
</rights>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<xsl:template match="dspace:field[@mdschema='dc' and @element='rights' and @qualifier='uri' and @lang='*']" mode="rightsURI">
<xsl:attribute name="rightsURI"><xsl:value-of select="."/></xsl:attribute>
</xsl:template>
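For anyone hitting the same forest/trees problem: one way to see what the transformer actually receives is to run an identity transform over the input and inspect the serialized result. This is a generic XSLT 1.0 sketch, not the approach from the answer linked above, and how you capture the output inside your application is left open.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Identity template: copies every attribute and node unchanged,
       so the output shows the real element and attribute names -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

If the output shows lang="*" where you expected language="*", every predicate that tests @language silently selects nothing, which is consistent with the symptoms described in the question.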

Related

check if repeating node is empty in xslt

I have XML like that shown below. The EducationDetails node can repeat (unbounded).
<PersonalDetailsResponse>
<FirstName></FirstName>
<LastName></LastName>
<EducationDetails>
<Degree></Degree>
<Institution></Institution>
<Year></Year>
</EducationDetails>
<EducationDetails>
<Degree></Degree>
<Institution></Institution>
<Year></Year>
</EducationDetails>
</PersonalDetailsResponse>
I want to create another XML document from the one above using XSLT.
My requirement is: if there is no data in any of the EducationDetails child nodes, then the resulting XML has to get its data from another source.
My problem is that I am not able to check whether all the EducationDetails child nodes are empty.
Since a variable's value cannot be changed in XSLT, I tried using Saxon with the code below.
xmlns:saxon="http://saxon.sf.net/" extension-element-prefixes="saxon"
<xsl:variable name="emptyNode" saxon:assignable="yes" select="0" />
<xsl:when test="count(ss:education) > 0">
<xsl:for-each select="ss:education">
<xsl:if test="not(*[.=''])">
<saxon:assign name="emptyNode">
<xsl:value-of select="1" />
</saxon:assign>
</xsl:if>
</xsl:for-each>
<xsl:if test="$emptyNode = 0">
<!-- Do logic if all educationdetails node is empty-->
</xsl:if>
</xsl:when>
But it throws the exception "net.sf.saxon.trans.XPathException: Unknown extension instruction".
It looks like the Saxon 9 jar is required for it, which I am not able to get from my repository.
Is there a simpler way to check whether all the child nodes of EducationDetails are empty?
By empty I mean that the child nodes might be present, but with no value in them.
Well, if you use <xsl:template match="PersonalDetailsResponse[EducationDetails[*[normalize-space()]]]">...</xsl:template>, then you match only a PersonalDetailsResponse element that has at least one EducationDetails element with at least one child element containing non-whitespace data. As you seem to use an XSLT 2.0 processor, you can also use the perhaps clearer <xsl:template match="PersonalDetailsResponse[some $ed in EducationDetails/* satisfies normalize-space($ed)]">...</xsl:template>.
Or perhaps if you want a variable inside the template use
<xsl:template match="PersonalDetailsResponse">
<xsl:variable name="empty-details" select="not(EducationDetails/*[normalize-space()])"/>
<xsl:if test="$empty-details">...</xsl:if>
</xsl:template>
With XSLT 2.0 the use of some or every satisfies might be easier to understand e.g.
<xsl:template match="PersonalDetailsResponse">
<xsl:variable name="empty-details" select="every $dt in EducationDetails/* satisfies not(normalize-space($dt))"/>
<xsl:if test="$empty-details">...</xsl:if>
</xsl:template>
But usually writing templates with appropriate match conditions saves you from using xsl:if or xsl:choose inside of a template.
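The match-condition approach can be sketched as a pair of templates, shown here as a minimal XSLT 1.0 stylesheet. The output element Education and its source attribute are hypothetical names chosen for illustration, and the fallback branch only marks where the data from the other source would go.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Chosen when at least one EducationDetails child holds non-whitespace text.
       A pattern with a predicate has a higher default priority than a bare name. -->
  <xsl:template match="PersonalDetailsResponse[EducationDetails/*[normalize-space()]]">
    <Education source="input">
      <xsl:copy-of select="EducationDetails[*[normalize-space()]]"/>
    </Education>
  </xsl:template>

  <!-- Fallback: every EducationDetails child is empty, or there are none -->
  <xsl:template match="PersonalDetailsResponse">
    <Education source="other"/>
  </xsl:template>
</xsl:stylesheet>
```

For the sample input in the question, where all Degree, Institution and Year elements are empty, only the second template matches.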

How can I access an xsl:variable from outside the "scope" of an xsl:for-each block

I'm trying to access a variable from within a for-each. I've done a lot of reading today to try to figure this out, but nothing quite fits my scenario. Eventually, I will have multiple series like seen below and I will use the variables that I'm pulling out and make different condition. The for-each that I have below is bringing back data from 400 records. I'm sorry, I cannot provide an XML. I'm not able to expose GUIDs and such.
UPDATE
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output media-type="xml" indent="yes"/>
<xsl:template match="/">
<Records>
<xsl:for-each select="Records/Record/Record[@levelGuid = 'level1']">
<xsl:variable name="rocName1" select="Field[@guid = '123']"/>
<xsl:variable name ="rocName2" select="substring-before($rocName1, ' - ')"/>
</xsl:for-each>
<xsl:for-each select="Records/Record/Record[@levelGuid = 'levelA']">
<xsl:variable name ="findingName" select="Field[@guid = '123']"/>
<xsl:variable name="findingName1" select="substring-after($findingName, ': ')"/>
<xsl:variable name="findingName2" select="substring-after($findingName1, 'PCIDSSv3.1:')"/>
</xsl:for-each>
<xsl:if test="$findingName1 = $rocName1">
<Records>
<findingID>
<xsl:for-each select="Records/Record/Record[@levelGuid = '123']">
<xsl:value-of select ="Field[@guid = '123']"/>
</xsl:for-each>
</findingID>
</Records>
</xsl:if>
</Records>
</xsl:template>
</xsl:stylesheet>
The desired output is any findingID that has a $findingName1 that equals $rocName1. The GUIDS only appear once, but each level has hundreds of records.
I'm trying to access a variable from within a for-each.
The variable $rocRecord is in scope inside the for-each. You can simply reference it and use it.
But I think you are trying to do something else. I.e., defining a variable inside for-each and wanting to use it outside it.
Variables are scoped within their focus-setting containing block. So the short answer is: you cannot do it. The long answer however...
Use templates. The only reason to do what you seem to want to be doing is to need to access the data elsewhere:
<xsl:template match="/">
<!-- in fact, you don't need for-each at all, but I leave it in for clarity -->
<xsl:for-each select="records/record">
<!--
apply-templates means: call the declared xsl:template that
matches this node, it is somewhat similar to a function call in other
languages, except that it works the other way around, the processor will
magically find the "function" (i.e., template) for you
-->
<xsl:apply-templates select="Field[@guid='123']" />
</xsl:for-each>
</xsl:template>
<xsl:template match="Field">
<!-- the focus here is what is the contents of your variable $rocName1 -->
<rocName>
<xsl:value-of select="substring-before(., ' - ')" />
</rocName>
</xsl:template>
XSLT is a declarative, template-oriented, functional language with concepts that are quite unique compared to most other languages. It can take a few hours to get used to it.
You said you did a lot of reading, but perhaps it is time to check a little XSLT course? There are a few online, search for "XSLT fundamentals course". It will save you hours / days of frustration.
This is a good, short read to catch up on variables in XSLT.
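To make the scoping rule concrete, here is a minimal sketch in XSLT 1.0. It reuses the element and attribute names from the question (Records, Record, Field, levelGuid, guid); the variable $level and the output elements rocName and level are hypothetical and only there for illustration.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <!-- Declared at template level: in scope for the rest of this template -->
    <xsl:variable name="level" select="'level1'"/>
    <Records>
      <xsl:for-each select="Records/Record/Record[@levelGuid = $level]">
        <!-- Declared inside the loop body: exists only within this for-each -->
        <xsl:variable name="rocName1" select="Field[@guid = '123']"/>
        <rocName>
          <xsl:value-of select="substring-before($rocName1, ' - ')"/>
        </rocName>
      </xsl:for-each>
      <!-- $level is still usable here; referencing $rocName1 here would be an error -->
      <level><xsl:value-of select="$level"/></level>
    </Records>
  </xsl:template>
</xsl:stylesheet>
```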
Update
On second read, I think it looks like you are troubled by the fact that the loop goes on for 400 items and that you only want to output the value of $rocName1. The example I showed above, does exactly that, because apply-templates does nothing if the selection is empty, which is what happens if the guid is not found.
If the guid appears once, the code above will output it once. If it appears multiple times and you only want the first, append [1] to the select statement.
Update #2 (after your update with an example)
You have two loops:
<xsl:for-each select="Records/Record/Record[@levelGuid = 'level1']">
and
<xsl:for-each select="Records/Record/Record[@levelGuid = 'levelA']">
You then want to do something (create a findingId) when a record in the first loop matches a record in the second loop.
While you can solve this using (nested) loops, it is not necessary to do so; in fact, it is discouraged, as it will make your code hard to read. As I explained in my original answer, apply-templates is usually the easier way to get this to work.
Since the Record elements are siblings of one another, I would tackle this as follows:
<xsl:template match="/">
<Records>
<xsl:apply-templates select="Records/Record/Record[@levelGuid = 'level1']" />
</Records>
</xsl:template>
<xsl:template match="Record">
<xsl:variable name="rocName1" select="Field[@guid = '123']"/>
<xsl:variable name ="rocName2" select="substring-before($rocName1, ' - ')"/>
<xsl:variable name="findingNameBase" select="../Record[@levelGuid = 'levelA']" />
<xsl:variable name ="findingName" select="$findingNameBase/Field[@guid = '123']"/>
<xsl:variable name="findingName1" select="substring-after($findingName, ': ')"/>
<xsl:variable name="findingName2" select="substring-after($findingName1, 'PCIDSSv3.1:')"/>
<findingId rocName="{$rocName1}">
<xsl:value-of select="$findingName" />
</findingId>
</xsl:template>
While this can be simplified further, it is a good start to learn about applying templates, which is at the core of anything you do with XSLT. Learn about applying templates, because without it, XSLT will be very hard to understand.

optimization of XSLT code

While searching for some profiling tools for XSLT, I came across this post. Since a lot of people there suggested to just post the code and offered to give feedback on that, I was wondering if anyone could give me some feedback on mine. I tried this (http://www.saxonica.com/documentation/#!using-xsl/performanceanalysis), but the output html is not very detailed.
I'm new to XSLT and usually work with python/perl, where regex support is much better (however, I won't rule out the possibility that it's just my very basic understanding of XSLT). For the purpose of this project however, I had to work with XSLT. It could be that I'm forcing it to do things in a very unnatural way. Any comments -on performance in particular, but anything else is also welcome, as I'd like to learn- are welcome!
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:fo="http://www.w3.org/1999/XSL/Format" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:fn="http://www.w3.org/2005/xpath-functions">
<xsl:template name="my_terms">
<xsl:variable name="excludes" select="not (codeblock or draft-comment or filepath or shortdesc or uicontrol or varname)"/>
<!-- leftover example of how to work with excludes var -->
<!--<xsl:if test=".//*[$excludes]/text()[contains(.,'access management console')]"><li class="prodterm"><b>PB QA:access management console should be "AppCenter"</b></li></xsl:if>-->
<!-- Loop through all sentences and check for deprecated stuff -->
<xsl:for-each select=".//*[$excludes]/text()">
<xsl:variable name="sentenceList" select="tokenize(., '[\.!\?:;]\s+')"/>
<xsl:variable name="segment" select="."/>
<!-- main sentence loop -->
<xsl:for-each select="$sentenceList">
<xsl:variable name="sentence" select="."/>
<!-- very rudimentary sentence length check -->
<xsl:if test="count(tokenize(., '\W+')) > 30"> <li class="prodterm"><b>Sentence too long:</b> <xsl:value-of select="."/></li></xsl:if>
<!-- efforts to flag the shady case of the gerund -->
<xsl:if test="matches(., '\w+ \w+ing (the|a)')">
<!-- some extra checks to weed out the false positives -->
<xsl:if test="not(matches(., '\b(on|about|for|before|while|when|after|by|a|the|an|some|all|every) \w+ing (the|a)', '!i')) and not(matches(., 'during'))">
<li class="prodterm"><b>Possible unclear usage of gerund. If so, consider rewriting:</b> <xsl:value-of select="."/></li>
</xsl:if>
</xsl:if>
<!-- comma's after certain starting phrases -->
<xsl:if test="matches(., '^\s*Therefore[^,]')"><li class="prodterm"><b>Use a comma after starting a sentence with 'Therefore':</b> <xsl:value-of select="."/></li></xsl:if>
<xsl:if test="matches(., '^\s*(If you|Before|When)[^,]+$')"><li class="prodterm"><b>Use a comma after starting a sentence with 'Before', 'If you' or 'When':</b> <xsl:value-of select="."/></li></xsl:if>
<!-- experimenting with phrasal verbs (if there are a lot of verbs in phrasalVerbs.xml, it will be better to have this as the main loop (and do it outside the sentence loop)) -->
<xsl:for-each select="document('phrasalVerbs.xml')/verbs/verb[matches($sentence, concat('.* ', ./@text, ' .*'))]">
<xsl:variable name="verbPart" select="."/>
<xsl:for-each select="$verbPart/particles/particle/@text[matches($sentence, .) and not(matches($sentence, concat($verbPart/@text, ' ', .)))]">
<xsl:variable name="particle" select="."/>
<li class="prodterm"><b>Separated phrasal verb found in:</b> <xsl:value-of select="$sentence"/></li>
</xsl:for-each>
</xsl:for-each>
<!-- checking if conditionals (should be followed by then) -->
<xsl:if test="matches($sentence, '^\s*If\b', '!i') and not(matches($sentence, '\bthen\b', '!i'))"><li class="prodterm"><b>Conditional If found, but no then:</b> <xsl:value-of select="."/></li></xsl:if>
<!-- very dodgy way of detecting passive voice -->
<!--<xsl:if test="matches($sentence, '\b(are|can be|must be) \w+ed\b', '!i')"><li class="prodterm"><b>PB QA:Possible passive voice. If so, consider using active voice for:</b> <xsl:value-of select="."/></li></xsl:if>-->
<xsl:for-each select='document("generalDeprecatedTermsAndPhrases.xml")/terms/dt'>
<xsl:variable name="pattern" select="./@pattern"/>
<xsl:variable name="message" select="./@message"/>
<xsl:variable name="regexFlag" select="./@regexFlag"/>
<!-- <xsl:if test="matches($sentence, $pattern, $regexFlag)"> -->
<xsl:if test="matches($sentence, concat('(^|\W)', $pattern, '($|\W)'), $regexFlag)"> <!-- This is the work around for not being able to use \b when variable is passed on inside matches() -->
<li class="prodterm"><b><xsl:value-of select="$message"/> in: </b> <xsl:value-of select="$sentence"/> </li>
</xsl:if>
</xsl:for-each>
</xsl:for-each>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
To get an idea, the stripped down version of my "generalDeprecatedTermsAndPhrases.xml" looks like this:
<terms>
<dt pattern='to be able to' message="Use 'to' instead of 'to be able to'" regexFlag="i"></dt>
</terms>
The reason that Saxon's profile is not very detailed is that your code is so monolithic: it's all in one great big template rule.
However, being monolithic isn't by itself the cause of any performance problems.
First observation is a functionality problem: your variable
<xsl:variable name="excludes" select="not (codeblock or draft-comment or filepath or shortdesc or uicontrol or varname)"/>
doesn't do what you think. It's evaluated with the root document node as the context item, and its value is a boolean which is true if the outermost element has a name which is not one of those listed. So I think your xsl:for-each that uses [$excludes] as a predicate is applying to all elements, whereas I suspect you intended it to apply to selected elements. I don't know how much that affects the performance.
The main influence on performance will be the cost of evaluating the regular expressions. The best way to find out which ones are causing the problem is to measure the impact of removing them one-by-one. When you've isolated the problem, there may be a way of rewriting the regular expression to make it perform better (e.g. by making it avoid backtracking).
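If the intent was to skip text that sits directly inside the listed elements, the exclusion has to be evaluated per element inside the predicate rather than once in a variable. A sketch under that assumption (the match="/" wrapper, the ul element and the li class are hypothetical, added only so the example is complete):

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <ul>
      <xsl:call-template name="my_terms"/>
    </ul>
  </xsl:template>

  <xsl:template name="my_terms">
    <!-- self:: tests the element currently being filtered, so the
         predicate is re-evaluated for every candidate element -->
    <xsl:for-each select=".//*[not(self::codeblock or self::draft-comment
                               or self::filepath or self::shortdesc
                               or self::uicontrol or self::varname)]/text()">
      <li class="segment"><xsl:value-of select="."/></li>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Note that this only skips text whose parent is one of the listed elements. Text in elements nested deeper inside, say, a codeblock would still be selected, so a test on ancestor-or-self:: may be closer to what is wanted.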

Detecting if a node exists?

I have a set of data called <testData> with many nodes inside.
How do I detect if the node exists or not?
I've tried
<xsl:if test="/testData">
and
<xsl:if test="../testData">
Neither one works. I'm sure this is possible but I'm not sure how. :P
For context the XML file is laid out like this
<overall>
<body/>
<state/>
<data/>(the one I want access to)
</overall>
I'm currently in the <body> tag, though I'd like to access it globally. Shouldn't /overall/data work?
Edit 2:
Right now I have an index into data that I need to use at anytime when apply templates to the tags inside of body. How do I tell, while in body, that data exists? Sometimes it does, sometimes it doesn't. Can't really control that. :)
Try count(.//testData) > 0.
However, if your context node is testData and you want to test whether it has a somenode child or not, I would write:
<xsl:if test="somenode">
...
</xsl:if>
But I think that's not what you really want. I think you should read up on different techniques for writing XSLT stylesheets (push/pull processing, etc.). When you apply these, such expressions are usually not necessary and stylesheets become simpler.
This XSLT:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="text()"/> <!-- for clarity only -->
<xsl:template match="body">
<xsl:if test="following-sibling::data">
<xsl:text>Data occurs</xsl:text>
</xsl:if>
<xsl:if test="not(following-sibling::data)">
<xsl:text>No Data occurs</xsl:text>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
Applied to this sample:
<overall>
<body/>
<state/>
<data/>(the one I want access to)
</overall>
Will produce this correct result:
Data occurs
When applied to this sample:
<overall>
<body/>
<state/>
</overall>
Result will be:
No Data occurs
This will work with XSLT 1.0, if someone needs it:
<xsl:choose>
<xsl:when test="/testData">node exists</xsl:when>
<xsl:otherwise>node does not exist</xsl:otherwise>
</xsl:choose>
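Applied to the layout from the question, the absolute path /overall/data can be tested from inside the body template, because a path starting with / is resolved from the document root regardless of the current node. A minimal sketch, assuming overall is the document element as in the sample:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Suppress stray text so only the messages below are output -->
  <xsl:template match="text()"/>

  <xsl:template match="body">
    <xsl:choose>
      <!-- Absolute path: independent of where body sits relative to data -->
      <xsl:when test="/overall/data">Data occurs</xsl:when>
      <xsl:otherwise>No Data occurs</xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

Unlike following-sibling::data, this also finds data when it appears before body.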

XSL: How best to store a node in a variable and then use it in future xpath expressions?

I need to be able to store a node-set in a variable and then perform more filtering/sorting on it afterward. All the examples I've seen of this involve either using XSLT 2.0 or extensions, neither of which is really an option.
I've a list of hotels in my XML doc that can be sorted/filtered and then paged through 5 at a time. I'm finding, though, that I'm repeating a lot of the logic, as I haven't yet found a good way to store node-sets in an XSL variable and then use XPath on them for further filtering/sorting.
This is the sort of thing I'm after (excuse the code, written off the top of my head, so it might not be 100%):
<xsl:variable name="hotels" select="/results/hotels[active='true']" />
<xsl:variable name="3_star_or_less" select="/results/hotels[number(rating) &lt;= 3]" />
<xsl:for-each select="3_star_or_less">
<xsl:sort select="rating" />
</xsl:for-each>
Has anyone got an example of how best to do this sort of thing?
Try this example:
<xsl:variable name="hotels" select="/results/hotels[active='true']" />
<xsl:variable name="three_star_or_less"
select="$hotels[number(rating) &lt;= 3]" />
<xsl:for-each select="$three_star_or_less">
<xsl:sort select="rating" />
<xsl:value-of select="rating" />
</xsl:for-each>
There is no problem storing a node-set in a variable in XSLT 1.0, and no extensions are needed. If you just use an XPath expression in select attribute of xsl:variable, you'll end up doing just that.
The problem is only when you want to store the nodes that you yourself had generated in a variable, and even then only if you want to query over them later. The problem here is that nodes you output don't have type "node-set" - instead, they're what is called a "result tree fragment". You can store that to a variable, and you can use that variable to insert the fragment into output (or another variable) later on, but you cannot use XPath to query over it. That's when you need either EXSLT node-set() function (which converts a result tree fragment to a node-set), or XSLT 2.0 (in which there are no result tree fragments, only sequences of nodes, regardless of where they come from).
For your example as given, this doesn't seem to be a problem. Rubens' answer gives the exact syntax.
Another note: if you want to be able to use the variable as part of an XPath expression, you need to fill the variable with <xsl:copy-of select="."/> instead of <xsl:value-of select="."/>.
value-of will only take the text of the node, and you won't be able to use the node-set function to return anything meaningful.
<xsl:variable name="myStringVar">
<xsl:value-of select="."/>
</xsl:variable>
<!-- This won't work: -->
<Output>
<xsl:value-of select="node-set($myStringVar)/SubNode" />
</Output>
<xsl:variable name="myNodeSetVar">
<xsl:copy-of select="."/>
</xsl:variable>
<!-- This will work: -->
<Output>
<xsl:value-of select="node-set($myNodeSetVar)/SubNode" />
</Output>
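Putting the pieces of this thread together, here is a sketch of a complete XSLT 1.0 stylesheet that builds a sorted result tree fragment and then queries it. It assumes the /results/hotels structure with a rating child from the question and a processor that supports the EXSLT common namespace; the firstFive wrapper element is a hypothetical name.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    exclude-result-prefixes="exsl">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/results">
    <!-- Built from generated nodes, so in XSLT 1.0 this is a result tree fragment -->
    <xsl:variable name="cheap">
      <xsl:for-each select="hotels[number(rating) &lt;= 3]">
        <xsl:sort select="rating" data-type="number"/>
        <xsl:copy-of select="."/>
      </xsl:for-each>
    </xsl:variable>

    <firstFive>
      <!-- exsl:node-set() converts the fragment so XPath can be applied to it;
           position() here follows the sorted order inside the fragment -->
      <xsl:copy-of select="exsl:node-set($cheap)/hotels[position() &lt;= 5]"/>
    </firstFive>
  </xsl:template>
</xsl:stylesheet>
```

When no sorting or generated content is involved, selecting directly into the variable with a select attribute, as in the first answer, avoids the extension function entirely.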