Using set:intersection in XSLT

I am trying to use str:tokenize together with set:intersection from EXSLT in Xalan.
Sample Code:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:set="http://exslt.org/sets" xmlns:str="http://exslt.org/strings"
>
<xsl:template match="/">
Number of Values in Var1: <xsl:value-of select="count(str:tokenize(ROOT/Var1,'|'))" />
Number of Values in Var2: <xsl:value-of select="count(str:tokenize(ROOT/Var2,'|'))" />
Common Values: <xsl:value-of select="count(set:intersection(str:tokenize(ROOT/Var1,'|'),str:tokenize(ROOT/Var2,'|')))" />
</xsl:template>
</xsl:stylesheet>
Here is sample Input:
<ROOT>
<Var1>Hello|One|Two</Var1>
<Var2>Hello|One|Two|Three</Var2>
</ROOT>
I would expect the intersection to report a value of 3, but it always reports 0.
My current output:
<?xml version="1.0" encoding="UTF-8"?>
Number of Values in Var1: 3
Number of Values in Var2: 4
Common Values: 0
Is there something wrong with my approach?

Yes: the two calls to str:tokenize yield two distinct sets of token elements, and none of the nodes in the first set is present in the second set. The two node-sets contain element nodes, some of which have the same name and textual content as nodes in the other set, but each such pair consists of two different nodes that merely share the same properties; they are not the same node. set:intersection compares nodes by identity, so the result is empty.
What you need is an intersection based on textual equality rather than node identity.
You could try something like this:
<xsl:variable name="set1" select="str:tokenize(ROOT/Var1,'|')"/>
<xsl:variable name="set2" select="str:tokenize(ROOT/Var2,'|')"/>
<xsl:variable name="intersection" select="$set1[.=$set2]"/>
This will return the set of nodes from $set1 which are textually equal to a node in $set2.
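The difference between node identity and textual equality can be modelled outside XSLT. The following is only a rough Python sketch of the idea, not how Xalan implements anything; the Token class and all function names are hypothetical stand-ins.

```python
class Token:
    """Stand-in for a <token> element node; each instance has its own identity."""
    def __init__(self, text):
        self.text = text


def tokenize(value, sep="|"):
    # Like str:tokenize, every call builds brand-new nodes.
    return [Token(part) for part in value.split(sep)]


def intersection_by_identity(a, b):
    # Models set:intersection: keep nodes of a that are the very same object in b.
    return [n for n in a if any(n is m for m in b)]


def intersection_by_value(a, b):
    # Models $set1[. = $set2]: keep nodes of a whose text equals the text of some node in b.
    texts = {m.text for m in b}
    return [n for n in a if n.text in texts]


set1 = tokenize("Hello|One|Two")
set2 = tokenize("Hello|One|Two|Three")
print(len(intersection_by_identity(set1, set2)))  # 0
print(len(intersection_by_value(set1, set2)))     # 3
```

Two separate tokenize calls never share an object, which is why the identity-based version is always empty here.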

Here is the final solution, incorporating Conal's suggestion:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:set="http://exslt.org/sets" xmlns:str="http://exslt.org/strings" xmlns:common="http://exslt.org/common"
>
<xsl:template match="/">
Number of Values in Var1: <xsl:value-of select="count(str:tokenize(ROOT/Var1,'|'))" />
Number of Values in Var2: <xsl:value-of select="count(str:tokenize(ROOT/Var2,'|'))" />
Common Values: <xsl:value-of select="count(set:intersection(str:tokenize(ROOT/Var1,'|'),str:tokenize(ROOT/Var2,'|'))/token)" />
<xsl:variable name="set1" select="str:tokenize(ROOT/Var1,'|')"/>
<xsl:variable name="set2" select="str:tokenize(ROOT/Var2,'|')"/>
<xsl:variable name="intersection" select="$set1[.=$set2]"/>
<xsl:value-of select="count($intersection)"/>
<xsl:for-each select="$intersection">
<xsl:value-of select="node()"/>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>

Related

merging distinct valued nodes in a node stream using xsl

I want to merge distinct-valued nodes from a list of nodes using XSL. Whenever the distinctness of the values is violated, a new tag should be created.
Input:
<PAGE>
<T>1</T>
<T>2</T>
<T>3</T>
<T>2</T>
<T>1</T>
<T>3</T>
<T>3</T>
</PAGE>
Output:
<PAGE>
<T>1,2,3</T>
<T>2,1,3</T>
<T>3</T>
</PAGE>
Suppose the first three nodes have the values 1, 2 and 3 and the fourth node has the value 2; then the first three are merged, as all their values are unique. The next merge starts from the fourth node. Now suppose the fifth node also has the value 2; then the output contains the fourth node on its own, because the next node's value is the same. Likewise, I have to keep merging nodes as long as the values are unique. For example, if the node value stream is 1,2,2,3,4,1,1,3,3, then the output will be 12,2341,13,3. Within a merged node, only unique values can exist.
Please suggest an XSLT stylesheet to do that.
The following stylesheet:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="PAGE">
<xsl:copy>
<xsl:apply-templates select="T[1]"/>
</xsl:copy>
</xsl:template>
<xsl:template match="T">
<xsl:param name="prev-string" select="','"/>
<xsl:variable name="next-string" select="concat($prev-string, ., ',')" />
<xsl:variable name="next-node" select="following-sibling::T[1]" />
<xsl:choose>
<xsl:when test="contains($next-string, concat(',', $next-node, ',')) or not($next-node)">
<!-- End of chain: create an element and output the accumulated string ... -->
<xsl:copy>
<xsl:value-of select="substring($next-string, 2, string-length($next-string) - 2)"/>
</xsl:copy>
<!-- ... and start a new chain -->
<xsl:apply-templates select="$next-node"/>
</xsl:when>
<xsl:otherwise>
<!-- Continue the current chain: add the current value to the accumulated values -->
<xsl:apply-templates select="$next-node">
<xsl:with-param name="prev-string" select="$next-string"/>
</xsl:apply-templates>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
when applied to your example input, will return:
<?xml version="1.0" encoding="UTF-8"?>
<PAGE>
<T>1,2,3</T>
<T>2,1,3</T>
<T>3</T>
</PAGE>
Note:
It is assumed that values do not contain the separating character
(comma in this example);
Although this employs tail-recursion (partially, anyway), a large document could easily send your processor into stack overflow.
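For comparison, the same chaining rule can be written as a plain loop, which sidesteps the recursion-depth concern. This is a minimal Python sketch of the rule the stylesheet applies (close the current chain as soon as the next value already occurs in it); the function name merge_unique_runs is made up for illustration.

```python
def merge_unique_runs(values):
    """Group values into chains, closing a chain when the next value repeats."""
    chains = []
    current = []
    for value in values:
        if value in current:
            # The value already occurs in the open chain: emit it and start a new one.
            chains.append(current)
            current = []
        current.append(value)
    if current:
        # Emit the last chain, which is still open when the input ends.
        chains.append(current)
    return [",".join(chain) for chain in chains]


print(merge_unique_runs(["1", "2", "3", "2", "1", "3", "3"]))
# ['1,2,3', '2,1,3', '3']
```

As in the stylesheet, the values are assumed not to contain the separator character.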

Using XSLT 2.0 to parse the values of multiple attributes into an array-like structure

I'd like to be able to select all the attributes of a certain type in a document (for example, //@arch) and then take that node set and parse the values out into a second node set. When I say "parse", I specifically mean that I want to turn a node set like this:
arch="value1;value2;value3"
arch="value1:value4"
into a node set like this:
arch="value1"
arch="value2"
arch="value3"
arch="value1"
arch="value4"
or something like that; I want to get the individual values out of the attributes and into their own node.
If I can get it to that state, I've got plenty of methods for sorting and duplicate removal, after which I'd be using the finished node set for a publishing task.
I'm not so much looking for a tidy answer here as an approach. I know that XSLT cannot do dynamic arrays, but that's not the same as not being able to do something like dynamic arrays or something that mimics the important part of the functionality.
One thought that has occurred to me is that I could count the nodes in the first node set, and the number of delimiters, calculate the number of entries that the second node set would need and create it (somehow), and use the substring functions to parse out the first node set into the second node set.
There's usually a way to work around XSLT's issues; has anyone worked their way around this one before?
Thanks for any help,
Jeff.
I think what you're looking for is a sequence. A sequence can be either nodes or atomic values (see http://www.w3.org/TR/xslt20/#constructing-sequences).
Here's an example showing the construction of a sequence and then iterating over it. The sequence is the atomic values from @arch, but it could also be nodes.
XML Input
<doc>
<foo arch="value1;value2;value3"/>
<foo arch="value1:value4"/>
</doc>
XSLT 2.0
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:variable name="archSequence" as="item()*">
<xsl:for-each select="//@arch">
<xsl:for-each select="tokenize(.,'[;:]')">
<xsl:value-of select="."/>
</xsl:for-each>
</xsl:for-each>
</xsl:variable>
<xsl:template match="/*">
<sequence>
<xsl:for-each select="$archSequence">
<item><xsl:value-of select="."/></item>
</xsl:for-each>
</sequence>
</xsl:template>
</xsl:stylesheet>
XML Output
<sequence>
<item>value1</item>
<item>value2</item>
<item>value3</item>
<item>value1</item>
<item>value4</item>
</sequence>
Example of a sequence of elements (same output):
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:variable name="archSequence" as="element()*">
<xsl:for-each select="//@arch">
<xsl:for-each select="tokenize(.,'[;:]')">
<item><xsl:value-of select="."/></item>
</xsl:for-each>
</xsl:for-each>
</xsl:variable>
<xsl:template match="/*">
<sequence>
<xsl:for-each select="$archSequence">
<xsl:copy-of select="."/>
</xsl:for-each>
</sequence>
</xsl:template>
</xsl:stylesheet>
You can use the tokenize function in a for expression to get a sequence of the separate values, then create an attribute node for each one. However, since XSLT doesn't let you create a bare attribute node with no element parent, you'll have to use a trick like this:
<xsl:variable name="archElements">
<xsl:for-each select="for $attr in $initialNodeSet
return tokenize($attr, '[:;]')">
<dummy arch="{.}" />
</xsl:for-each>
</xsl:variable>
and then $archElements/dummy/@arch should be the set of separated arch attribute nodes that you require.
Complete example:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
<xsl:output indent="yes" />
<xsl:template match="/">
<xsl:variable name="inputData">
<a arch="value1;value2;value3" />
<a arch="value1:value4" />
</xsl:variable>
<!-- create an example node set containing the two arch attribute nodes -->
<xsl:variable name="initialNodeSet" select="$inputData/a/@arch" />
<!-- tokenize and generate one arch attribute node for each value -->
<xsl:variable name="archElements">
<xsl:for-each select="for $attr in $initialNodeSet
return tokenize($attr, '[:;]')">
<dummy arch="{.}" />
</xsl:for-each>
</xsl:variable>
<!-- output to verify -->
<r>
<xsl:for-each select="$archElements/dummy/@arch">
<c><xsl:copy-of select="."/></c>
</xsl:for-each>
</r>
</xsl:template>
</xsl:stylesheet>
When run over any input document (the content is ignored) this produces
<?xml version="1.0" encoding="UTF-8"?>
<r>
<c arch="value1"/>
<c arch="value2"/>
<c arch="value3"/>
<c arch="value1"/>
<c arch="value4"/>
</r>
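The tokenizing step, and the duplicate removal and sorting the question mentions as the next stage, can be sketched in a few lines of Python. This only illustrates the data flow under the same '[;:]' delimiter assumption used above; the function names are hypothetical.

```python
import re


def split_arch_values(attribute_values):
    """Flatten delimited attribute values into one list of tokens."""
    tokens = []
    for value in attribute_values:
        # Same character class as tokenize(., '[;:]') in the stylesheets above.
        tokens.extend(re.split(r"[;:]", value))
    return tokens


def unique_sorted(tokens):
    # Duplicate removal followed by an alphabetical sort.
    return sorted(set(tokens))


values = split_arch_values(["value1;value2;value3", "value1:value4"])
print(values)                 # ['value1', 'value2', 'value3', 'value1', 'value4']
print(unique_sorted(values))  # ['value1', 'value2', 'value3', 'value4']
```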

XSL condition to check if node exists

I want to check whether my XML contains a node whose type attribute contains the string type_attachment_.
Is this the correct way to check it?
<xsl:if test="count(*[contains(@Type, 'type_attachment_')]) > 0">
something
</xsl:if>
I don't know how deeply nested this node can be. For example, it can be as simple as this:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl"?>
<hello-world>
<greeter>
<dsdsds>An XSLT Programmer
<greeting type = 'type_attachment_'>Hello, World!
</greeting>
</dsdsds>
</greeter>
</hello-world>
but the node can also be nested inside various other elements.
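Expressions that match existing nodes are truthy. Expressions that do not match any nodes are falsy.
Therefore, you don't need to count the set of nodes returned. Simply test whether anything matches. Note that * only looks at children of the context node; since the node can be nested at any depth, use //* (or .//* relative to the context node). Attribute names are also case-sensitive, so the test must use the same case as the document (type in the sample above).
<xsl:if test="//*[contains(@type, 'type_attachment_')]">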
something
</xsl:if>
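The "does any element at any depth match" logic can be checked quickly outside XSLT. Below is a small Python sketch using the standard library's ElementTree; the SAMPLE document is a trimmed copy of the question's input and has_attachment_node is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<hello-world>
  <greeter>
    <dsdsds>An XSLT Programmer
      <greeting type="type_attachment_">Hello, World!</greeting>
    </dsdsds>
  </greeter>
</hello-world>"""


def has_attachment_node(xml_text, needle="type_attachment_"):
    """Return True if any element, at any depth, has a type attribute containing needle."""
    root = ET.fromstring(xml_text)
    # root.iter() visits the root and all of its descendants, much like //* in XPath.
    return any(needle in element.get("type", "") for element in root.iter())


print(has_attachment_node(SAMPLE))  # True
```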
Here is an example:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:param name="filt">
<filters>
<ritem type="type_attachment_" relateditemnumber="8901037"/>
<ritem relateditemnumber="8901038"/>
<ritem type="type_attachment_" relateditemnumber="8901039"/>
<ritem relateditemnumber="8901040"/>
</filters>
</xsl:param>
<xsl:template match="/">
<xsl:for-each select="$filt/filters/ritem[@type='type_attachment_']">
<xsl:copy-of select="."/>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
OUTPUT:
<ritem type="type_attachment_" relateditemnumber="8901037"/>
<ritem type="type_attachment_" relateditemnumber="8901039"/>

XSL associative sorting using a field substring

The transformation I am writing must compose a comma-separated string value from a given node set. The resulting string must be sorted according to an arbitrary (non-alphabetic) mapping for the first character in the input values.
I came up with this:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet
version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:tmp="http://tempuri.org"
exclude-result-prefixes="tmp"
>
<xsl:output method="xml" indent="yes"/>
<tmp:sorting-criterion>
<code value="A">5</code>
<code value="B">1</code>
<code value="C">3</code>
</tmp:sorting-criterion>
<xsl:template match="/InputValueParentNode">
<xsl:element name="OutputValues">
<xsl:for-each select="InputValue">
<xsl:sort select="document('')/*/tmp:sorting-criterion/code[@value=substring(.,1,1)]" data-type="number"/>
<xsl:value-of select="normalize-space(.)"/>
<xsl:if test="position() != last()">
<xsl:text>,</xsl:text>
</xsl:if>
</xsl:for-each>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
It doesn't work; the XPath document('')/*/tmp:sorting-criterion/code[@value=substring(.,1,1)] does not seem to evaluate as I expect. When I substitute a literal for substring(.,1,1), it evaluates to the proper value.
So, am I missing something that keeps the sorting XPath expression from evaluating as I expect, or is it simply impossible to do it this way?
If it is not possible to write an XPath expression that works, is there a workaround to achieve my purpose?
Note: I'm constrained to XSLT-1.0
Sample Input:
<?xml version="1.0" encoding="utf-8"?>
<InputValueParentNode>
<InputValue>A input value</InputValue>
<InputValue>B input value</InputValue>
<InputValue>C input value</InputValue>
</InputValueParentNode>
Expected output:
<?xml version="1.0" encoding="utf-8"?>
<OutputValues>B input value,C input value,A input value</OutputValues>
Replace the self::node() abbreviation (.) with the current() function. Inside the predicate, . refers to the code element being tested, not to the InputValue being sorted.
A better predicate would be: starts-with(normalize-space(current()),@value)
Besides changing the transformation according to Alejandro's answer, I found it better to use an XSL variable for the mapping data, which avoids declaring a dummy namespace (tmp), as seen in Dimitre's answer to another related question.
My final implementation:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet
version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/InputValueParentNode">
<xsl:variable name="sorting-map">
<i code="A" priority="5"/>
<i code="B" priority="1"/>
<i code="C" priority="3"/>
</xsl:variable>
<xsl:variable name="sorting-criterion" select="document('')//xsl:variable[@name='sorting-map']/*"/>
<xsl:element name="OutputValues">
<xsl:for-each select="InputValue">
<xsl:sort select="$sorting-criterion[@code=substring(normalize-space(current()),1,1)]/@priority" data-type="number"/>
<xsl:value-of select="normalize-space(current())"/>
<xsl:if test="position() != last()">
<xsl:text>,</xsl:text>
</xsl:if>
</xsl:for-each>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
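The sort itself is just "look up a priority by the first character, then order by that number". Here is a minimal Python sketch of that logic, assuming the same A/B/C priorities as the mapping above; PRIORITY and sort_by_first_char are illustrative names only.

```python
# Same mapping as the sorting-map variable: first character -> numeric priority.
PRIORITY = {"A": 5, "B": 1, "C": 3}


def sort_by_first_char(values):
    """Normalize whitespace, sort by the mapped priority of the first character, join with commas."""
    # " ".join(v.split()) mimics normalize-space().
    cleaned = [" ".join(value.split()) for value in values]
    # Values whose first character has no mapping sort last (an assumption of this sketch).
    ordered = sorted(cleaned, key=lambda v: PRIORITY.get(v[:1], float("inf")))
    return ",".join(ordered)


print(sort_by_first_char(["A input value", "B input value", "C input value"]))
# B input value,C input value,A input value
```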

How can I select nodes from a tree whose markup is stored in a variable?

Consider the following XSLT script:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" encoding="iso-8859-1"/>
<xsl:variable name="stringmap">
<map>
<entry><key>red</key><value>rot</value></entry>
<entry><key>green</key><value>gruen</value></entry>
<entry><key>blue</key><value>blau</value></entry>
</map>
</xsl:variable>
<xsl:template match="/">
<!-- IMPLEMENT ME -->
</xsl:template>
</xsl:stylesheet>
I'd like this script to print redgreenblue.
Is there any way to treat the XML markup which is stored in the stringmap variable as a document of its own which I can run XPath queries on? I'm basically looking for something like
<xsl:for-each select="document($stringmap)/map/entry">
<xsl:value-of select="key"/>
</xsl:for-each>
(except that the document() function expects a URI).
Motivation: I have various long <xsl:choose> elements which map a given string to another string. I'd like to replace all those with a single template which takes a 'map' argument (which is a simple XML document). My hope is that I can then replace the <xsl:choose> with a simple statement like <xsl:value-of select="$stringmap/map/entry/value[../key=$givenkey]"/>
I'm using XSLT 1.0 with xsltproc.
You're almost there. Using document('') lets you process node sets inside the current stylesheet:
<xsl:for-each select="document('')/xsl:stylesheet/xsl:variable[@name='stringmap']/map/entry">
<xsl:value-of select="key"/>
</xsl:for-each>
It's not necessary to define the map node set as a variable in this case:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet xmlns:data="some.uri" version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<data:map>
<entry><key>red</key><value>rot</value></entry>
<entry><key>green</key><value>gruen</value></entry>
<entry><key>blue</key><value>blau</value></entry>
</data:map>
<xsl:template match="/">
<xsl:for-each select="document('')/xsl:stylesheet/data:map/entry">
<xsl:value-of select="key"/>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
If you do not use xsl:variable as a wrapper, remember that top-level elements must have a non-null namespace URI.
In XSLT 2.0 it would've been possible to just iterate over the content in a variable:
<xsl:variable name="map">
<entry><key>red</key><value>rot</value></entry>
<entry><key>green</key><value>gruen</value></entry>
<entry><key>blue</key><value>blau</value></entry>
</xsl:variable>
<xsl:template match="/">
<xsl:for-each select="$map/entry">
<xsl:value-of select="key"/>
</xsl:for-each>
</xsl:template>
A posting by M. David Peterson just taught me how to make this work:
It's not necessary to have an <xsl:variable> for this case. Instead, I can embed the data document directly into the XSL stylesheet (putting it into a namespace for sanity) and then select elements from that. Here's the result:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:map="uri:map">
<xsl:output method="text" encoding="iso-8859-1"/>
<map:colors>
<entry><key>red</key><value>rot</value></entry>
<entry><key>green</key><value>gruen</value></entry>
<entry><key>blue</key><value>blau</value></entry>
</map:colors>
<xsl:template match="/">
<xsl:for-each select="document('')/*/map:colors/entry">
<xsl:value-of select="key"/>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
This generates the expected output redgreenblue.
The trick is to use document('') to get a handle on the XSLT document itself, then * to step into the top-level xsl:stylesheet element; from there I can access the color map.
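The motivation in the question, replacing long <xsl:choose> blocks with a data-driven lookup, amounts to turning the entry elements into a key/value table. As a rough illustration of that idea in Python (standard library only; MAP_XML and build_lookup are hypothetical names, and the XML is a copy of the map from the question):

```python
import xml.etree.ElementTree as ET

MAP_XML = """<map>
  <entry><key>red</key><value>rot</value></entry>
  <entry><key>green</key><value>gruen</value></entry>
  <entry><key>blue</key><value>blau</value></entry>
</map>"""


def build_lookup(xml_text):
    """Turn <entry><key/><value/></entry> children into a dict, in document order."""
    root = ET.fromstring(xml_text)
    return {entry.findtext("key"): entry.findtext("value")
            for entry in root.findall("entry")}


lookup = build_lookup(MAP_XML)
print("".join(lookup))   # redgreenblue
print(lookup["green"])   # gruen
```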