Counting distinct items in XSLT independent of depth - xslt

If I run the following XSLT code:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:key name="kValueByVal" match="variable_name/@value"
use="."/>
<xsl:template match="assessment">
<xsl:for-each select="
/*/*/variable/attributes/variable_name/@value
[generate-id()
=
generate-id(key('kValueByVal', .)[1])
]
">
<xsl:value-of select=
"concat(., ' ', count(key('kValueByVal', .)), '&#xA;')"/>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
on the following XML:
<assessment>
<variables>
<variable>
<attributes>
<variable_name value="FRED"/>
</attributes>
</variable>
</variables>
<variables>
<variable>
<attributes>
<variable_name value="MORTIMER"/>
</attributes>
</variable>
</variables>
<variables>
<variable>
<attributes>
<variable_name value="FRED"/>
</attributes>
</variable>
</variables>
</assessment>
I get the desired output:
FRED 2
MORTIMER 1
(See my original question for more info, if you wish.)
However, if I run it on this input:
<ExamStore>
<assessment>
<variables>
<variable>
<attributes>
<variable_name value="FRED"/>
</attributes>
</variable>
</variables>
<variables>
<variable>
<attributes>
<variable_name value="MORTIMER"/>
</attributes>
</variable>
</variables>
<variables>
<variable>
<attributes>
<variable_name value="FRED"/>
</attributes>
</variable>
</variables>
</assessment>
</ExamStore>
I get nothing. (Note that I just wrapped the original input in an ExamStore tag.) I was expecting and hoping to get the same output.
Why don't I? How can I change the original XSLT code to get the same output?

Well your select xpath /*/*/variable/attributes/variable_name/... is no longer correct because you added another node higher in the node-tree.
If you want to have true independence you need to use something like:
//variable/attributes/variable_name/...
...(note the double slash at the start), but this is fairly dangerous because it will catch all occurrences of that structure, so be really sure that's what you mean.
Otherwise, just prepend your xpath with another /*

When you introduced yet another level in the XML document, this broke the absolute XPath expression used in the original solution (tailored exactly to your original XML file).
Therefore, in order to make the XPath expression work in the new situation, just do the following:
Replace:
/*/*/variable/attributes/variable_name/@value
with
/*/*/*/variable/attributes/variable_name/@value
and now you again get the wanted neat result:
FRED 2
MORTIMER 1
I would never give you an "independent" solution, because you haven't provided any properties/guarantees/constraints about the set of possible XML documents on which you want to apply the transformation.
In your original question you used:
.//variables/variable/attributes/variable_name
off assessment,
and this is why I used the absolute XPath expression in my solution. There was no guarantee that another XML document wouldn't contain variable_name elements whose chain of ancestors is not variables/variable/attributes. If that were the case, you would probably not be interested in the values of such "irregular" variable_name elements.
The lesson is that one should not be too specific in defining a question and then want general solutions. :)
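A middle ground between the absolute path and a full "//" search is to make the path relative to the matched assessment element. The following is only a sketch, assuming that assessment may be wrapped in any number of ancestors and that the variables/variable/attributes chain below it stays fixed. The empty text() template is an addition to suppress stray whitespace from wrapper elements. Note that the key is still document-wide, so with several assessment elements the counts would span all of them.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:key name="kValueByVal" match="variable_name/@value" use="."/>

  <!-- Assumption: text outside the matched structure is not wanted -->
  <xsl:template match="text()"/>

  <!-- Matches assessment at any depth; the built-in templates walk down to it -->
  <xsl:template match="assessment">
    <!-- Relative path: independent of what wraps assessment -->
    <xsl:for-each select="variables/variable/attributes/variable_name/@value
                          [generate-id() = generate-id(key('kValueByVal', .)[1])]">
      <xsl:value-of select="concat(., ' ', count(key('kValueByVal', .)), '&#xA;')"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

With either of the two inputs from the question this should print FRED 2 and MORTIMER 1, because the path no longer counts levels from the document root.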

For real structure independence you should use:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:key name="kValueByVal" match="variable_name/@value"
use="."/>
<xsl:template match="variable_name[@value
[generate-id()
=
generate-id(key('kValueByVal', .)[1])
]]">
<xsl:value-of select=
"concat(@value, ' ', count(key('kValueByVal', @value)), '&#xA;')"/>
</xsl:template>
</xsl:stylesheet>
Result with first input:
FRED 2
MORTIMER 1
Result with second input:
FRED 2
MORTIMER 1
Note: Never use // as the first XPath operator.

Related

How to find unmatched rows with XSLT

I have two large xml files, one of which has the following format:
<Persons>
<Person>
<ID>1</ID>
<LAST_NAME>London</LAST_NAME>
</Person>
<Person>
<ID>2</ID>
<LAST_NAME>Twain</LAST_NAME>
</Person>
<Person>
<ID>3</ID>
<LAST_NAME>Dikkens</LAST_NAME>
</Person>
</Persons>
The second file has the following format:
<SalesPersons>
<SalesPerson>
<ID>2</ID>
<LAST_NAME>London</LAST_NAME>
</SalesPerson>
<SalesPerson>
<ID>3</ID>
<LAST_NAME>Dikkens</LAST_NAME>
</SalesPerson>
</SalesPersons>
I need to find those records from file 1 that do not exist in file 2. Although I have done this using a for-each loop, that approach takes a substantial amount of time. Is it possible to make it run faster using a different approach?
Using a key can help to improve performance on lookups:
<xsl:key name="sales-person" match="SalesPerson" use="concat(ID, '|', LAST_NAME)"/>
<xsl:template match="/">
<xsl:for-each select="Persons/Person">
<xsl:variable name="person" select="."/>
<!-- need to change context document for key function use -->
<xsl:for-each select="$doc2">
<xsl:if test="not(key('sales-person', concat($person/ID, '|', $person/LAST_NAME)))">
<xsl:copy-of select="$person"/>
</xsl:if>
</xsl:for-each>
</xsl:for-each>
</xsl:template>
That assumes you have bound doc2 as a variable or parameter with e.g. <xsl:param name="doc2" select="document('sales-persons.xml')"/>.

grouping with complex "selection"

This is the source XML:
<root>
<!-- a and b have the same date entries, c is different -->
<variant name="a">
<booking>
<date from="2017-01-01" to="2017-01-02" />
<date from="2017-01-04" to="2017-01-06" />
</booking>
</variant>
<variant name="b">
<booking>
<date from="2017-01-01" to="2017-01-02" />
<date from="2017-01-04" to="2017-01-06" />
</booking>
</variant>
<variant name="c">
<booking>
<date from="2017-04-06" to="2017-04-07" />
<date from="2017-04-07" to="2017-04-09" />
</booking>
</variant>
</root>
I'd like to group the three variants so that variants with the same @from and @to in each date are grouped together.
My attempt is:
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"></xsl:output>
<xsl:template match="root">
<variants>
<xsl:for-each-group select="for $i in variant return $i" group-by="booking/date/@from">
<group>
<xsl:attribute name="cgk" select="current-grouping-key()"/>
<xsl:copy-of select="current-group()"></xsl:copy-of>
</group>
</xsl:for-each-group>
</variants>
</xsl:template>
</xsl:stylesheet>
But this gives too many groups. (How) is this possible to achieve?
Using a composite key and XSLT 3.0 you could use
<xsl:template match="root">
<variants>
<xsl:for-each-group select="variant" group-by="booking/date/(@from, @to)" composite="yes">
<group key="{current-grouping-key()}">
<xsl:copy-of select="current-group()"/>
</group>
</xsl:for-each-group>
</variants>
</xsl:template>
which should group any variant elements together which have the same descendant date element sequence.
XSLT 3.0 is supported by Saxon 9.8 (any edition) or 9.7 (PE and EE) or a 2017 release of Altova XMLSpy/Raptor.
Using XSLT 2.0 you could concatenate all those date values with string-join():
<xsl:template match="root">
<variants>
<xsl:for-each-group select="variant" group-by="string-join(booking/date/(@from, @to), '|')">
<group key="{current-grouping-key()}">
<xsl:copy-of select="current-group()"/>
</group>
</xsl:for-each-group>
</variants>
</xsl:template>
Like the XSLT 3.0 solution, it only groups variant elements with the same sequence of date descendants. I am not sure whether that suffices or whether you might want to sort the date descendants before computing the grouping key. In the XSLT 3 case you could do that easily with
<xsl:for-each-group select="variant" group-by="sort(booking/date, (), function($d) { xs:date($d/@from), xs:date($d/@to) })!(@from, @to)" composite="yes">
inline (although that leaves 9.8 HE behind as it does not support function expressions/higher order functions, so there you would need to move the sorting to your own user-defined xsl:function and in there use xsl:perform-sort).
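The user-defined function variant mentioned above could look roughly like the following. This is a sketch, not a tested solution: the mf prefix, its namespace URI and the function name mf:sorted-dates are made up for illustration. It uses the "!" operator rather than "/" after the function call, because "/" would put the attributes back into document order and undo the sort.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:mf="http://example.com/mf"
    exclude-result-prefixes="xs mf">

  <xsl:output indent="yes"/>

  <!-- Hypothetical helper: returns the date elements sorted by @from, then @to -->
  <xsl:function name="mf:sorted-dates" as="element(date)*">
    <xsl:param name="dates" as="element(date)*"/>
    <xsl:perform-sort select="$dates">
      <xsl:sort select="xs:date(@from)"/>
      <xsl:sort select="xs:date(@to)"/>
    </xsl:perform-sort>
  </xsl:function>

  <xsl:template match="root">
    <variants>
      <!-- "!" keeps the sorted order of the dates when picking the attributes -->
      <xsl:for-each-group select="variant"
          group-by="mf:sorted-dates(booking/date)!(@from, @to)"
          composite="yes">
        <group key="{current-grouping-key()}">
          <xsl:copy-of select="current-group()"/>
        </group>
      </xsl:for-each-group>
    </variants>
  </xsl:template>
</xsl:stylesheet>
```

For the sample input, variants a and b should end up in one group and c in another, regardless of the order in which their date elements appear.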

XSLT transformation of boolean expressions

I'm quite new to the XSLT world and need assistance with the following:
A program takes the following string:
cn = 'James Bond' and (sn='Bon*' or givenName='Jam*')
and generates the following XML which is my input XML that I need to process using a stylesheet.
Input XML:
<?xml version="1.0" encoding="UTF-8"?>
<queryString>
<parameters>
<parameter id = "1">
<name>cn</name>
<value>James Bond</value>
<comparativeOperator>=</comparativeOperator>
<parens>
<leftParen>((</leftParen>
<rightParen>)</rightParen>
</parens>
</parameter>
<parameter id = "25">
<name>sn</name>
<value>Bon*</value>
<comparativeOperator>=</comparativeOperator>
<parens>
<leftParen>((</leftParen>
<rightParen>)</rightParen>
</parens>
</parameter>
<parameter id = "50">
<name>givenName</name>
<value>Jam*</value>
<comparativeOperator>=</comparativeOperator>
<parens>
<leftParen>(</leftParen>
<rightParen>)))</rightParen>
</parens>
</parameter>
</parameters>
<logicalOperators>
<operator id = "20">
<value>and</value>
<precedingParameterId>1</precedingParameterId>
<followingParameterId>25</followingParameterId>
</operator>
<operator id = "46">
<value>or</value>
<precedingParameterId>25</precedingParameterId>
<followingParameterId>50</followingParameterId>
</operator>
</logicalOperators>
</queryString>
Desired Output:
<?xml version="1.0" encoding="UTF-8"?>
<ns0:filter>
<ns0:and>
<ns0:or>
<ns0:equalityMatch name="cn">
<ns0:value>James Bond</ns0:value>
</ns0:equalityMatch>
</ns0:or>
<ns0:or>
<ns0:approxMatch name="givenName">
<ns0:value>Jam*</ns0:value>
</ns0:approxMatch>
<ns0:approxMatch name="sn">
<ns0:value>Bon*</ns0:value>
</ns0:approxMatch>
</ns0:or>
</ns0:and>
</ns0:filter>
My existing xslt is as follows:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:ns0="urn:oasis:names:tc:DSML:2:0:core" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" exclude-result-prefixes="ns0">
<xsl:output method="xml" encoding="UTF-8" indent="yes"/>
<xsl:template match="/queryString/logicalOperators/operator">
<ns0:filter>
<MyOp>
<xsl:value-of select="value"/>
</MyOp>
<xsl:for-each select="../../parameters/parameter">
<xsl:if test="comparativeOperator = '='">
<ns0:equalityMatch name="{name}">
<value>
<xsl:value-of select="value"/>
</value>
</ns0:equalityMatch>
</xsl:if>
</xsl:for-each>
</ns0:filter>
<!--/xsl:element-->
</xsl:template>
<xsl:template match="parens">
<xsl:element name="leftParensoutput">
<xsl:value-of select="leftParen"/>
</xsl:element>
<xsl:element name="rightParensoutput">
<xsl:value-of select="rightParen"/>
</xsl:element>
</xsl:template>
<xsl:template match="parameters/parameter">
<xsl:element name="FilterParameters">
<xsl:apply-templates select="parens"/>
<xsl:element name="queryfilterParameterElement">
<xsl:element name="id">
<xsl:value-of select="@id"/>
</xsl:element>
<xsl:element name="name">
<xsl:value-of select="name"/>
</xsl:element>
<xsl:element name="value">
<xsl:value-of select="value"/>
</xsl:element>
</xsl:element>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
Current output:
<?xml version="1.0" encoding="UTF-8"?>
<FilterParameters>
<leftParensoutput>((</leftParensoutput>
<rightParensoutput>)</rightParensoutput>
<queryfilterParameterElement>
<id>1</id>
<name>cn</name>
<value>James Bon*</value>
</queryfilterParameterElement>
</FilterParameters>
<FilterParameters>
<leftParensoutput>((</leftParensoutput>
<rightParensoutput>)</rightParensoutput>
<queryfilterParameterElement>
<id>25</id>
<name>sn</name>
<value>Bon*</value>
</queryfilterParameterElement>
</FilterParameters>
<FilterParameters>
<leftParensoutput>(</leftParensoutput>
<rightParensoutput>)))</rightParensoutput>
<queryfilterParameterElement>
<id>50</id>
<name>givenName</name>
<value>Jam*</value>
</queryfilterParameterElement>
</FilterParameters>
<ns0:filter xmlns:ns0="urn:oasis:names:tc:DSML:2:0:core">
<MyOp>and</MyOp>
<ns0:equalityMatch name="cn">
<value>James Bon*</value>
</ns0:equalityMatch>
<ns0:equalityMatch name="sn">
<value>Bon*</value>
</ns0:equalityMatch>
<ns0:equalityMatch name="givenName">
<value>Jam*</value>
</ns0:equalityMatch>
</ns0:filter>
<ns0:filter xmlns:ns0="urn:oasis:names:tc:DSML:2:0:core">
<MyOp>or</MyOp>
<ns0:equalityMatch name="cn">
<value>James Bon*</value>
</ns0:equalityMatch>
<ns0:equalityMatch name="sn">
<value>Bon*</value>
</ns0:equalityMatch>
<ns0:equalityMatch name="givenName">
<value>Jam*</value>
</ns0:equalityMatch>
</ns0:filter>
I am dealing with multiple issues. Wasn't sure if I needed to split up the questions, but decided on presenting the whole issue.
Thanks in advance!
i) As I loop through the logicalOperators, how do I match the precedingParameterId to the under parameters.
ii) In the desired output, how do I create the node: - ie. dynamically add "ns0" AND get the "value" of parameter/operator
iii)) I'm not sure how to remove the extraneous FilterParameters element. If I remove the section , my output looks like:
<?xml version="1.0" encoding="UTF-8"?>**cnJames Bon*=(()snBon*=(()givenNameJam*=()))**<ns0:filter xmlns:ns0="urn:oasis:names:tc:DSML:2:0:core">
<MyOp>and</MyOp>
<ns0:equalityMatch name="cn">
<value>James Bon*</value>
</ns0:equalityMatch>
<ns0:equalityMatch name="sn">
<value>Bon*</value>
</ns0:equalityMatch>
<ns0:equalityMatch name="givenName">
<value>Jam*</value>
</ns0:equalityMatch>
</ns0:filter><ns0:filter xmlns:ns0="urn:oasis:names:tc:DSML:2:0:core">
<MyOp>or</MyOp>
<ns0:equalityMatch name="cn">
<value>James Bon*</value>
</ns0:equalityMatch>
<ns0:equalityMatch name="sn">
<value>Bon*</value>
</ns0:equalityMatch>
<ns0:equalityMatch name="givenName">
<value>Jam*</value>
</ns0:equalityMatch>
</ns0:filter>
Added logic:
The pretzel logic is as follows:
create a root node ;
for each queryString/logicalOperators/operator, create a node where operator is the value of logicalOperators/operator/value;
Inside this node, use logicalOperators/operator/precedingParameterId and followingParameterId to match them up with
queryString/parameters/parameter/parens/leftParen and rightParen;
if the pattern is the same, then get queryString/parameters/parameter/name and value and close with ns0:operator tag
This should get
<ns0:equalityMatch name="cn">
<ns0:value>James Bond</ns0:value>
</ns0:equalityMatch>
if they are not the same, create a node and for the logicalOperators/operator/precedingParameterId, get the corresponding
queryString/parameters/parameter @id/name and value and close; for the logicalOperators/operator/followingParameterId, get the corresponding
queryString/parameters/parameter @id/name and value and close;
This should give:
<ns0:or>
<ns0:approxMatch name="givenName">
<ns0:value>Jam*</ns0:value>
</ns0:approxMatch>
<ns0:approxMatch name="sn">
<ns0:value>Bon*</ns0:value>
</ns0:approxMatch>
if queryString/parameters/parameter/value does not contain star use node equalityMatch else use node approxMatch
Who designed this intermediate XML representation of the expression, and why did they do it this way? Is the syntax and semantics of this XML representation well defined, or is it just "specified by example"? Its method of indicating operator precedence by sequences of left- and right- parentheses is highly idiosyncratic. I would say it was designed by someone with very little knowledge or experience of parser design.
Given a free choice, I would personally start from the original free-form expression rather than from this intermediate XML representation. Of course, I would want to know the full grammar of the expression language - at the moment we only have one example expression to work from. It looks like a fairly simple grammar, and parsing simple grammars in XSLT is not difficult if you know the theory. There are good examples of parser-writing in XSLT from Dimitre Novatchev and from Gunther Rademacher.
Unfortunately it's fairly clear from your post that you don't know the theory, and so I'm in the difficult position of trying to suggest a way forward that's constrained by what we know about your level of knowledge and experience.
So, since the lexical analysis phase of parsing has already been done, let's take the problem as given and see what we can do with your intermediate XML representation. The only difficult part of the problem (for me: there might be other parts that are difficult for you) is to work out how to use the left- and right parens info to construct the hierarchical expression tree in your output.
If we take the three "parameters" in your expression (which would usually be called "terms") we can assign each of them a level number: 1, 2, 2 respectively. This represents the depth of the term as it appears in the final expression tree. The level of each term can be deduced from your intermediate input by counting parentheses: the level of a term is
the sum of the number of left parentheses in this and preceding terms
minus
the sum of the number of right parentheses in preceding terms
minus 1
Translating that computation into XSLT terms depends strongly on whether you are using XSLT 1.0 or 2.0, which you haven't actually said. If we assume 2.0, it's
sum((.|preceding-sibling::parameter)/parens/string-length(leftParen))
-
sum(preceding-sibling::parameter/parens/string-length(rightParen))
-1
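To see the level numbers for the input in the question, the expression can be dropped into a small XSLT 2.0 stylesheet. This is only an illustrative sketch; the output element names terms and term are invented here and are not part of the desired result format.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/queryString">
    <!-- "terms" and "term" are hypothetical names used only for this demo -->
    <terms>
      <xsl:apply-templates select="parameters/parameter"/>
    </terms>
  </xsl:template>

  <xsl:template match="parameter">
    <!-- level = left parens so far - right parens before this term - 1 -->
    <term name="{name}"
          level="{sum((. | preceding-sibling::parameter)/parens/string-length(leftParen))
                  - sum(preceding-sibling::parameter/parens/string-length(rightParen))
                  - 1}"/>
  </xsl:template>
</xsl:stylesheet>
```

For the three parameters in the question this gives 2 - 0 - 1 = 1 for cn, 4 - 1 - 1 = 2 for sn and 5 - 2 - 1 = 2 for givenName, which matches the levels 1, 2, 2 stated above.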
Now, once you've got the level numbers, constructing a tree is a "well-known" problem, a classic exercise in using grouping constructs. Unfortunately it's well beyond beginner XSLT level, but here goes.
Given a sequence of elements with level numbers:
<a level="1"/>
<b level="2"/>
<c level="3"/>
<d level="3"/>
<e level="2"/>
we can turn them into a tree structure
<a><b><c/><d/></b><e/></a>
using recursive grouping as follows. We write a template that does one level of grouping, and then calls itself recursively to do the next level:
<xsl:template name="grouping">
<xsl:param name="input" as="element()*"/>
<xsl:if test="exists($input)">
<xsl:variable name="level" select="$input[1]/@level"/>
<xsl:for-each-group select="$input"
group-starting-with="*[@level=$level]">
<xsl:copy>
<xsl:call-template name="grouping">
<xsl:with-param name="input"
select="current-group()[position() gt 1]"/>
</xsl:call-template>
</xsl:copy>
</xsl:for-each-group>
</xsl:if>
</xsl:template>
Again, that's using XSLT 2.0. A solution using XSLT 1.0 is going to be much, much harder.
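To try the grouping template on the five-element example, it needs a caller. The sketch below assumes the flat elements are wrapped in a root element called items (an invented name) and repeats the template so that the stylesheet is complete.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml"/>

  <!-- Assumed wrapper element "items" holding the flat sequence a, b, c, d, e -->
  <xsl:template match="/items">
    <xsl:call-template name="grouping">
      <xsl:with-param name="input" select="*"/>
    </xsl:call-template>
  </xsl:template>

  <xsl:template name="grouping">
    <xsl:param name="input" as="element()*"/>
    <xsl:if test="exists($input)">
      <!-- The first element of the input decides the level of this pass -->
      <xsl:variable name="level" select="$input[1]/@level"/>
      <xsl:for-each-group select="$input"
          group-starting-with="*[@level=$level]">
        <!-- Shallow-copy the group leader, then nest the rest of the group inside it -->
        <xsl:copy>
          <xsl:call-template name="grouping">
            <xsl:with-param name="input"
                select="current-group()[position() gt 1]"/>
          </xsl:call-template>
        </xsl:copy>
      </xsl:for-each-group>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

The first pass produces one group led by a; the recursive call on b, c, d, e splits into the groups b, c, d and e; and so on until a group has no members left after its leader. Because xsl:copy is a shallow copy, the level attributes are dropped from the result.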
I've given you an outline here and I appreciate that with your level of XSLT experience, fleshing it out is going to be quite hard work. However, I hope you now have a better understanding of the task ahead.

XSLT help - XSL for various similar XML, Template usage

I am very new to XSL and XPath. Apologies if this question shows some stupidity.
I have an XML something like
<root>
<widget name="status">
...
<component name="date">
<component name="day" label="Fri"/>
<component name="date" label="4"/>
</component>
<component name="time" label="11:23 AM"/>
....
</widget>
<widget name="foo">
</widget>
</root>
I need to create a DateTime tag which is composed of all three values, something like
Fri 4 11:23 AM
I am writing an XSL for it.
<DateTime>
<xsl:value-of select="(//widget[@name='status']/component[@name='date'])[1]/@label"/>
<xsl:text> </xsl:text>
<xsl:value-of select="(//widget[@name='status']/component[@name='date'])[2]/@label"/>
<xsl:text> </xsl:text>
<xsl:value-of select="//widget[@name='status']/component[@name='time']/@label"/>
</DateTime>
Question:
I am passing the "widget[@name="date"]" to each of the select statements. Is there a better way to shorten the XPath?
I need to move this into a template and call the template. Which one should I use: call-template or apply-templates?
We have a set of similar applications which generate these XML. The above XML is from applicationA. ApplicationB might show the detail in little bit different way, something like <component name="datetime">Fri 4 11:23 AM</component>. We have almost 3-4 such application where they display the details in little bit different way.
DateTime is just an example, there are some other details which I also need to capture from these various applications.
I am thinking to write a single XSL to deal with all the applications.
One way to do it with your XML would be this:
<xsl:template match="widget">
<!-- ... -->
<xsl:apply-templates select="." mode="create-date-time" />
<!-- ... -->
</xsl:template>
<xsl:template match="widget" mode="create-date-time">
<xsl:variable name="date" select="component[@name='date']" />
<xsl:variable name="time" select="component[@name='time']" />
<DateTime>
<xsl:value-of select="normalize-space(
concat(
$date/component[@name='day']/@label, ' ',
$date/component[@name='date']/@label, ' ',
$time/@label
)
)" />
</DateTime>
</xsl:template>
I am passing the widget[@name="date"] to each of the select statements. Is there a better way to shorten the XPath?
Use <xsl:template>/<xsl:apply-templates>, and relative paths. Store things you need more than once in an <xsl:variable>. See above.
I need to move this into a template and call the template. which one I should use call-template/apply-templates?
The latter. Always go for <xsl:apply-templates> unless there is good reason not to. As a rule of thumb: If you are unsure, then there is no good reason.
We have a set of similar applications which generate these XML. The above XML is from applicationA. ApplicationB might show the detail in little bit different way, something like <component name="datetime">Fri 4 11:23 AM</component> We have almost 3-4 such application where they display the details in little bit different way.
You could expand the create-date-time template to accommodate for this:
<xsl:template match="widget" mode="create-date-time">
<xsl:variable name="date" select="component[@name='date']" />
<xsl:variable name="time" select="component[@name='time']" />
<xsl:variable name="dt" select="component[@name='datetime']" />
<DateTime>
<xsl:value-of select="normalize-space(
concat(
$dt, ' ',
$date/component[@name='day']/@label, ' ',
$date/component[@name='date']/@label, ' ',
$time/@label
)
)" />
</DateTime>
</xsl:template>
There will be no error if certain components are missing. normalize-space() makes sure that there are no excess spaces for any combination of components.
The above may fail if the date+time and datetime components are not mutually exclusive (I've assumed they are). If they are not, or if more complicated cases occur, create additional specific templates, like this one:
<xsl:template match="widget[component[@name='datetime']]" mode="create-date-time">
<xsl:variable name="dt" select="component[@name='datetime']" />
<DateTime>
<xsl:value-of select="normalize-space($dt)" />
</DateTime>
</xsl:template>
The <xsl:apply-templates> will make sure the correct one is called. Just create specific match= expressions for each case that can occur.
This simple transformation (16 lines, single template, completely "push" style, no variables, no modes):
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match=
"component
[contains('|day|date|time|',
concat('|', @name, '|'))
]
">
<xsl:value-of select="concat(@label, ' ')"/>
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="text()"/>
</xsl:stylesheet>
when applied on the provided XML document:
<root>
<widget name="status">
...
<component name="date">
<component name="day" label="Fri"/>
<component name="date" label="4"/>
</component>
<component name="time" label="11:23 AM"/>
....
</widget>
<widget name="foo">
</widget>
</root>
produces exactly the wanted result:
Fri 4 11:23 AM

Produce context data for first and last occurrences of every value of an element

Given the following xml:
<container>
<val>2</val>
<id>1</id>
</container>
<container>
<val>2</val>
<id>2</id>
</container>
<container>
<val>2</val>
<id>3</id>
</container>
<container>
<val>4</val>
<id>1</id>
</container>
<container>
<val>4</val>
<id>2</id>
</container>
<container>
<val>4</val>
<id>3</id>
</container>
I'd like to return something like
2 - 1
2 - 3
4 - 1
4 - 3
Using a nodeset I've been able to get the last occurrence via:
exsl:node-set($list)/container[not(val = following::val)]
but I can't figure out how to get the first one.
To get the first and the last occurrence (document order) in each "<val>" group, you can use an <xsl:key> like this:
<xsl:stylesheet
version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>
<xsl:output method="text" />
<xsl:key name="ContainerGroupByVal" match="container" use="val" />
<xsl:variable name="ContainerGroupFirstLast" select="//container[
generate-id() = generate-id(key('ContainerGroupByVal', val)[1])
or
generate-id() = generate-id(key('ContainerGroupByVal', val)[last()])
]" />
<xsl:template match="/">
<xsl:for-each select="$ContainerGroupFirstLast">
<xsl:value-of select="val" />
<xsl:text> - </xsl:text>
<xsl:value-of select="id" />
<xsl:value-of select="'&#xA;'" /><!-- LF -->
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
EDIT #1: A bit of an explanation since this might not be obvious right away:
The <xsl:key> returns all <container> nodes having a given <val>. You use the key() function to query it.
The <xsl:variable> is where it all happens. It reads as:
for each of the <container> nodes in the document ("//container") check…
…if it has the same unique id (generate-id()) as the first node returned by key() or the last node returned by key()
where key('ContainerGroupByVal', val) returns the set of <container> nodes matching the current <val>
if the unique ids match, include the node in the selection
the <xsl:for-each> does the output. It could just as well be a <xsl:apply-templates>.
EDIT #2: As Dimitre Novatchev rightfully points out in the comments, you should be wary of using the "//" XPath shorthand. If you can avoid it, by all means, do so — partly because it potentially selects nodes you don't want, and mainly because it is slower than a more specific XPath expression. For example, if your document looks like:
<containers>
<container><!-- ... --></container>
<container><!-- ... --></container>
<container><!-- ... --></container>
</containers>
then you should use "/containers/container" or "/*/container" instead of "//container".
EDIT #3: An alternative syntax of the above would be:
<xsl:stylesheet
version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>
<xsl:output method="text" />
<xsl:key name="ContainerGroupByVal" match="container" use="val" />
<xsl:variable name="ContainerGroupFirstLast" select="//container[
count(
.
| key('ContainerGroupByVal', val)[1]
| key('ContainerGroupByVal', val)[last()]
) = 2
]" />
<xsl:template match="/">
<xsl:for-each select="$ContainerGroupFirstLast">
<xsl:value-of select="val" />
<xsl:text> - </xsl:text>
<xsl:value-of select="id" />
<xsl:value-of select="'&#xA;'" /><!-- LF -->
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
Explanation: The XPath union operator "|" combines its arguments into a node-set. By definition, a node-set cannot contain duplicate nodes; for example, ". | . | ." will create a node-set containing exactly one node (the current node).
This means that if we create a union node-set from the current node ("."), the "key(…)[1]" node and the "key(…)[last()]" node, its node count will be 2 if (and only if) the current node is one of the two other nodes and those two differ; otherwise the count will be 3. Note that for a group with a single member all three nodes coincide, so the count is 1 and that node is not selected by this variant.
Basic XPath:
//container[position() = 1] <- this is the first one
//container[position() = last()] <- this is the last one
Here's a set of XPath functions in more detail.
I. XSLT 1.0
Basically the same solution as the one by Tomalak, but more understandable. It is also complete, so you only need to copy and paste the XML document and the transformation and then just press the "Transform" button of your favourite XSLT IDE:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:key name="kContByVal" match="container"
use="val"/>
<xsl:template match="/*">
<xsl:for-each select=
"container[generate-id()
=
generate-id(key('kContByVal',val)[1])
]
">
<xsl:variable name="vthisvalGroup"
select="key('kContByVal', val)"/>
<xsl:value-of select=
"concat($vthisvalGroup[1]/val,
'-',
$vthisvalGroup[1]/id,
'&#xA;'
)
"/>
<xsl:value-of select=
"concat($vthisvalGroup[last()]/val,
'-',
$vthisvalGroup[last()]/id,
'&#xA;'
)
"/>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
when this transformation is applied on the originally-provided XML document (edited to be well-formed):
<t>
<container>
<val>2</val>
<id>1</id>
</container>
<container>
<val>2</val>
<id>2</id>
</container>
<container>
<val>2</val>
<id>3</id>
</container>
<container>
<val>4</val>
<id>1</id>
</container>
<container>
<val>4</val>
<id>2</id>
</container>
<container>
<val>4</val>
<id>3</id>
</container>
</t>
the wanted result is produced:
2-1
2-3
4-1
4-3
Do note:
We use the Muenchian method for grouping to find one container element for each set of such elements that have the same value for val.
From the whole node-list of container elements with the same val value, we output the required data for the first container element in the group and for the last container element in the group.
II. XSLT 2.0
This transformation:
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xsl:output method="text"/>
<xsl:template match="/*">
<xsl:for-each-group select="container"
group-by="val">
<xsl:for-each select="current-group()[1], current-group()[last()]">
<xsl:value-of select=
"concat(val, '-', id, '&#xA;')"/>
</xsl:for-each>
</xsl:for-each-group>
</xsl:template>
</xsl:stylesheet>
when applied on the same XML document as above, produces the wanted result:
2-1
2-3
4-1
4-3
Do note:
The use of the <xsl:for-each-group> XSLT instruction.
The use of the current-group() function.