Is there a function that can be used to remove duplicate values from a string separated by |, in the simplest possible way? Here is an example of the string:
1111-1|1111-1|1111-3|1111-4|1111-5|1111-3
The output that I'm expecting is:
1111-1|1111-3|1111-4|1111-5
Thanks in advance.
All presented XSLT 1.0 solutions so far produce the wrong result:
1111-1|1111-4|1111-5|1111-3
whereas the wanted, correct result is:
1111-1|1111-3|1111-4|1111-5
Now, the following transformation (no extensions, pure XSLT 1.0):
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="text()" name="distinctSubstrings">
<xsl:param name="pText" select="."/>
<xsl:param name="poutDelim"/>
<xsl:param name="pFoundDistinctSubs" select="'|'"/>
<xsl:param name="pCountDistinct" select="0"/>
<xsl:if test="$pText">
<xsl:variable name="vnextSub" select="substring-before(concat($pText, '|'), '|')"/>
<xsl:variable name="vIsNewDistinct" select=
"not(contains(concat($pFoundDistinctSubs, '|'), concat('|', $vnextSub, '|')))"/>
<xsl:variable name="vnextDistinct" select=
"substring(concat($poutDelim,$vnextSub), 1 div $vIsNewDistinct)"/>
<xsl:value-of select="$vnextDistinct"/>
<xsl:variable name="vNewFoundDistinctSubs"
select="concat($pFoundDistinctSubs, $vnextDistinct)"/>
<xsl:variable name="vnextOutDelim"
select="substring('|', 2 - ($pCountDistinct > 0))"/>
<xsl:call-template name="distinctSubstrings">
<xsl:with-param name="pText" select="substring-after($pText, '|')"/>
<xsl:with-param name="pFoundDistinctSubs" select="$vNewFoundDistinctSubs"/>
<xsl:with-param name="pCountDistinct" select="$pCountDistinct + $vIsNewDistinct"/>
<xsl:with-param name="poutDelim" select="$vnextOutDelim"/>
</xsl:call-template>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
when applied to this XML document (whose string value is the string provided in the question):
<t>1111-1|1111-1|1111-3|1111-4|1111-5|1111-3</t>
produces the wanted, correct result:
1111-1|1111-3|1111-4|1111-5
Explanation:
All found distinct substrings are concatenated in the parameter $pFoundDistinctSubs -- whenever we get the next substring from the delimited input, we compare it to the distinct substrings passed in this parameter. This ensures that the first in order distinct substring will be output -- not the last as in the other two solutions.
We use conditionless value determination, based on the fact that XSLT 1.0 implicitly converts a Boolean false() to 0 and true() to 1 whenever it is used in a context that requires a numeric value. In particular, substring($x, 1 div true()) is equivalent to substring($x, 1 div 1), that is, substring($x, 1), which is the entire string $x. On the other hand, substring($x, 1 div false()) is equivalent to substring($x, 1 div 0), that is, substring($x, Infinity), which is the empty string.
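The two tricks described above can be tried in isolation. The following is only a minimal sketch, not part of the solution itself: the literal strings and the square brackets are made up for the demonstration, and the stylesheet should run against any XML document.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- true() is converted to 1: substring starts at position 1, the whole string is kept -->
    <xsl:value-of select="concat('[', substring('abc', 1 div true()), ']')"/>
    <!-- false() is converted to 0: 1 div 0 is Infinity, so no character is selected -->
    <xsl:value-of select="concat('[', substring('abc', 1 div false()), ']')"/>
    <!-- the same idea applied to the delimiter: 2 - 1 = 1 keeps '|', 2 - 0 = 2 yields '' -->
    <xsl:value-of select="concat('[', substring('|', 2 - (3 > 0)), ']')"/>
    <xsl:value-of select="concat('[', substring('|', 2 - (0 > 0)), ']')"/>
  </xsl:template>
</xsl:stylesheet>
```

The output should be [abc][][|][], which mirrors how $vnextDistinct and $vnextOutDelim are computed in the transformation.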
To know why avoiding conditionals is important: watch this Pluralsight course:
Tactical Design Patterns in .NET: Control Flow, by Zoran Horvat
To do this in pure XSLT 1.0, with no extension functions, you will need to use a recursive named template:
<xsl:template name="distinct-values-from-list">
<xsl:param name="list"/>
<xsl:param name="delimiter" select="'|'"/>
<xsl:choose>
<xsl:when test="contains($list, $delimiter)">
<xsl:variable name="token" select="substring-before($list, $delimiter)" />
<xsl:variable name="next-list" select="substring-after($list, $delimiter)" />
<!-- output token if it is unique -->
<xsl:if test="not(contains(concat($delimiter, $next-list, $delimiter), concat($delimiter, $token, $delimiter)))">
<xsl:value-of select="concat($token, $delimiter)"/>
</xsl:if>
<!-- recursive call -->
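<xsl:call-template name="distinct-values-from-list">
<xsl:with-param name="list" select="$next-list"/>
<!-- forward the delimiter so that a non-default value survives the recursion -->
<xsl:with-param name="delimiter" select="$delimiter"/>
</xsl:call-template>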
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$list"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
Full demo: http://xsltransform.net/ncdD7mM
Added:
The above method outputs the last occurrence of each value in the list, because that's the simplest way to remove the duplicates.
The side effect of this is that the original order of the values is not preserved. Or - more correctly - it is the reverse order that is being preserved.
I would not think preserving the original forward order is of any importance here. But in case you do need it, it could be done this way (which I believe is much easier to follow than the suggested alternative):
<xsl:template name="distinct-values-from-list">
<xsl:param name="list"/>
<xsl:param name="delimiter" select="'|'"/>
<xsl:param name="result"/>
<xsl:choose>
<xsl:when test="$list">
<xsl:variable name="token" select="substring-before(concat($list, $delimiter), $delimiter)" />
<!-- recursive call -->
<xsl:call-template name="distinct-values-from-list">
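<xsl:with-param name="list" select="substring-after($list, $delimiter)"/>
<!-- forward the delimiter so that a non-default value survives the recursion -->
<xsl:with-param name="delimiter" select="$delimiter"/>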
<xsl:with-param name="result">
<xsl:value-of select="$result"/>
<!-- add token if this is its first occurrence -->
<xsl:if test="not(contains(concat($delimiter, $result, $delimiter), concat($delimiter, $token, $delimiter)))">
<xsl:value-of select="concat($delimiter, $token)"/>
</xsl:if>
</xsl:with-param>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="substring($result, 2)"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
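Both templates above wrap the list and the token in delimiters before calling contains(). The reason can be seen in a minimal sketch; the sample values here are made up for illustration and the stylesheet should run against any XML document.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- made-up sample values; 'cat' is a prefix of 'catalog' -->
    <xsl:variable name="list" select="'catalog|red'"/>
    <xsl:variable name="token" select="'cat'"/>

    <!-- bare substring test: prints true, although 'cat' is not a token of the list -->
    <xsl:value-of select="contains($list, $token)"/>
    <xsl:text>&#10;</xsl:text>

    <!-- delimiter-wrapped test: prints false, because '|cat|' does not occur in '|catalog|red|' -->
    <xsl:value-of select="contains(concat('|', $list, '|'), concat('|', $token, '|'))"/>
  </xsl:template>
</xsl:stylesheet>
```

Without the wrapping, a value that happens to be a substring of a later value would be treated as a duplicate and dropped.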
Assuming that you can use XSLT 2.0, and assuming that the input looks like
<?xml version="1.0" encoding="UTF-8"?>
<root>1111-1|1111-1|1111-3|1111-4|1111-5|1111-3</root>
you could use the distinct-values and tokenize functions:
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
<xsl:output method="xml" encoding="UTF-8" indent="yes" />
<xsl:template match="/root">
<result>
<xsl:value-of separator="|" select="distinct-values(tokenize(.,'\|'))"/>
</result>
</xsl:template>
</xsl:transform>
And the result will be
<?xml version="1.0" encoding="UTF-8"?>
<result>1111-1|1111-3|1111-4|1111-5</result>
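One caveat: the specification leaves the order of the values returned by distinct-values implementation-dependent. If first-occurrence order must be guaranteed, a possible alternative (a sketch only, assuming the same <root> input and <result> output as above) is xsl:for-each-group, which processes groups in order of first appearance when no xsl:sort is present:

```xml
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

  <xsl:template match="/root">
    <result>
      <!-- one group per distinct token, visited in order of first appearance -->
      <xsl:for-each-group select="tokenize(., '\|')" group-by=".">
        <!-- emit the separator before every group except the first -->
        <xsl:if test="position() > 1">|</xsl:if>
        <xsl:value-of select="current-grouping-key()"/>
      </xsl:for-each-group>
    </result>
  </xsl:template>
</xsl:transform>
```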
I have adapted the stylesheet below from "XSLT 1.0 How to get distinct values":
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:output omit-xml-declaration="yes"/>
<xsl:template match="/">
<output>
<xsl:call-template name="distinctvalues">
<xsl:with-param name="values" select="root"/>
</xsl:call-template>
</output>
</xsl:template>
<xsl:template name="distinctvalues">
<xsl:param name="values"/>
<xsl:variable name="firstvalue" select="substring-before($values, '|')"/>
<xsl:variable name="restofvalue" select="substring-after($values, '|')"/>
<xsl:if test="not(contains($values, '|'))">
<xsl:value-of select="$values"/>
</xsl:if>
<xsl:if test="not(contains($restofvalue, $firstvalue))">
<xsl:value-of select="$firstvalue"/>
<xsl:text>|</xsl:text>
</xsl:if>
<xsl:if test="$restofvalue != ''">
<xsl:call-template name="distinctvalues">
<xsl:with-param name="values" select="$restofvalue" />
</xsl:call-template>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
with a sample input of:
<root>1111-1|1111-1|1111-3|1111-4|1111-5|1111-3</root>
and the output is
<output>1111-1|1111-4|1111-5|1111-3</output>
**** EDIT ****
Per Michael's comment below, here is the revised stylesheet, which uses a Saxon extension:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:saxon="http://icl.com/saxon"
exclude-result-prefixes="saxon"
version="1.1">
<xsl:output omit-xml-declaration="yes"/>
<xsl:variable name="aaa">
<xsl:call-template name="tokenizeString">
<xsl:with-param name="list" select="root"/>
<xsl:with-param name="delimiter" select="'|'"/>
</xsl:call-template>
</xsl:variable>
<xsl:template match="/">
<xsl:for-each select="saxon:node-set($aaa)/token[not(preceding::token/. = .)]">
<xsl:if test="position() > 1">
<xsl:text>|</xsl:text>
</xsl:if>
<xsl:value-of select="."/>
</xsl:for-each>
</xsl:template>
<xsl:template name="tokenizeString">
<!--passed template parameter -->
<xsl:param name="list"/>
<xsl:param name="delimiter"/>
<xsl:choose>
<xsl:when test="contains($list, $delimiter)">
<token>
<!-- get everything in front of the first delimiter -->
<xsl:value-of select="substring-before($list,$delimiter)"/>
</token>
<xsl:call-template name="tokenizeString">
<!-- store anything left in another variable -->
<xsl:with-param name="list" select="substring-after($list,$delimiter)"/>
<xsl:with-param name="delimiter" select="$delimiter"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:choose>
<xsl:when test="$list = ''">
<xsl:text/>
</xsl:when>
<xsl:otherwise>
<token>
<xsl:value-of select="$list"/>
</token>
</xsl:otherwise>
</xsl:choose>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
given an input of:
<root>cat|cat|catalog|catalog|red|red|wired|wired</root>
it outputs
cat|catalog|red|wired
and with this input:
<root>1111-1|1111-1|1111-3|1111-4|1111-5|1111-3</root>
the output is
1111-1|1111-3|1111-4|1111-5
I have started learning XSLT and got stuck while writing XSLT templates that convert from lower case to upper case and from upper case to lower case.
I have tried writing several different versions, but I think I am making a mistake somewhere in my code:
<xsl:template name="ConvertXmlStyleToCamelCase">
<xsl:param name="occupation"/>
<xsl:template match="node()"/>
<xsl:variable name="uppercase" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'" />
<xsl:variable name="lowercase" select="'abcdefghijklmnopqrstuvwxyz'" />
<xsl:param name="delimiter" select='/'/>
<xsl:param name="delimiter2" select= "' '"/>
<xsl:if test="not($occupation = '')" >
<xsl:choose>
<xsl:when test="contains($occupation, $delimiter)">
<xsl:variable name="word" select="substring-before(concat($occupation, $delimiter), $delimiter)"></xsl:variable>
<xsl:if test="$word">
<xsl:value-of select="translate(substring($word, 1, 1), $lowercase, $uppercase)"/>
<xsl:value-of select="translate(substring($word,2), $uppercase, $lowercase)"/>
</xsl:if>
</xsl:when>
<xsl:when test="contains( $occupation, $delimiter)">
<xsl:value-of select="$delimiter"/>
<!-- Recursive call to template to translate the text after delimeter -->
<xsl:call-template name="ConvertXmlStyleToCamelCase">
<xsl:with-param name="occupation" select="substring-after($occupation, $delimiter)"/>
</xsl:call-template>
</xsl:when>
<xsl:when test="contains($occupation, $delimiter2)">
<xsl:variable name="word2" select="substring-before(concat($occupation, $delimiter2), $delimiter2)"></xsl:variable>
<xsl:if test="$word2">
<xsl:value-of select="translate(substring($word2, 1, 1), $lowercase, $uppercase)"></xsl:value-of>
<xsl:value-of select="translate(substring($word2, 2), $uppercase, $lowercase)"/>
</xsl:if>
</xsl:when>
<xsl:when test="contains($occupation, $delimiter2)">
<xsl:value-of select="$delimiter2"/>
<!-- Recursive call to template to translate the text after delimeter2 -->
<xsl:call-template name="ConvertXmlStyleToCamelCase">
<xsl:with-param name="occupation" select="substring-after($occupation, $delimiter2)"/>
</xsl:call-template>
</xsl:when>
</xsl:choose>
</xsl:if>
<xsl:if test="not($occupation = $delimiter and $delimiter2)">
<xsl:value-of select="substring(occupation, 1, 1)"/>
<xsl:value-of select="translate(substring(occupation, 2), $uppercase, $lowercase)"/>
</xsl:if>
</xsl:template>
The input will be any one of the values below:
1.SELF/EMPLOYED
2.SKILL TRADE
3.GOVERNMENT
Expected output as below
Self/Employed
Skill Trade
Government
But the actual outcome is
SelfSelf employed
Skill/trade
Government
As I mentioned in a comment, your code is not reproducible. From the results you report it is clear that your 2nd delimiter is not applied. AFAICT, it is because you check first for the existence of the 1st delimiter - and if you find it, you do not bother to test if the 2nd delimiter exists before the 1st one.
Consider the following example (adapted from Converting first letter of a string to capital in xslt):
XML
<input>
<item>Self/Employed</item>
<item>Skill Trade</item>
<item>Government</item>
<item>a combi/na/tion of various de/limi/ters</item>
</input>
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/input">
<output>
<xsl:for-each select="item">
<caps>
<xsl:call-template name="capitalize">
<xsl:with-param name="text" select="."/>
</xsl:call-template>
</caps>
</xsl:for-each>
</output>
</xsl:template>
<xsl:template name="capitalize">
<xsl:param name="text"/>
<xsl:param name="delimiter" select="' '"/>
<xsl:variable name="upper-case" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>
<xsl:variable name="lower-case" select="'abcdefghijklmnopqrstuvwxyz'"/>
<xsl:variable name="word" select="substring-before(concat($text, $delimiter), $delimiter)" />
<xsl:choose>
<xsl:when test="$delimiter=' '">
<!-- tokenize word by 2nd delimiter -->
<xsl:call-template name="capitalize">
<xsl:with-param name="text" select="$word"/>
<xsl:with-param name="delimiter" select="'/'"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<!-- capitalize word -->
<xsl:value-of select="translate(substring($word, 1, 1), $lower-case, $upper-case)"/>
<xsl:value-of select="translate(substring($word, 2), $upper-case, $lower-case)"/>
</xsl:otherwise>
</xsl:choose>
<xsl:if test="contains($text, $delimiter)">
<xsl:value-of select="$delimiter"/>
<!-- recursive call -->
<xsl:call-template name="capitalize">
<xsl:with-param name="text" select="substring-after($text, $delimiter)"/>
<xsl:with-param name="delimiter" select="$delimiter"/>
</xsl:call-template>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
Result
<?xml version="1.0" encoding="UTF-8"?>
<output>
<caps>Self/Employed</caps>
<caps>Skill Trade</caps>
<caps>Government</caps>
<caps>A Combi/Na/Tion Of Various De/Limi/Ters</caps>
</output>
Could you have more than two delimiters in the future? If so, try this XSLT, which can be readily extended to have more (single-character) delimiters. Just change the delimiters parameter in the ConvertXmlStyleToCamelCase template.
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="item">
<item>
<xsl:call-template name="ConvertXmlStyleToCamelCase" />
</item>
</xsl:template>
<xsl:template name="ConvertXmlStyleToCamelCase">
<xsl:param name="text" select="."/>
<xsl:param name="delimiters" select="' /'"/>
<xsl:variable name="upper" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>
<xsl:variable name="lower" select="'abcdefghijklmnopqrstuvwxyz'"/>
<xsl:variable name="nextDelimiter" select="substring(translate($text, translate($text, $delimiters, ''), ''), 1, 1)" />
<xsl:variable name="string" select="substring-before(concat($text, ' '), substring(concat($nextDelimiter, ' '), 1, 1))" />
<xsl:message terminate="no">Next delimiter is <xsl:value-of select="$nextDelimiter" /></xsl:message>
<xsl:value-of select="translate(substring($string, 1, 1), $lower, $upper)"/>
<xsl:value-of select="translate(substring($string, 2), $upper, $lower)"/>
<xsl:if test="$nextDelimiter">
<xsl:value-of select="$nextDelimiter" />
<xsl:call-template name="ConvertXmlStyleToCamelCase">
<xsl:with-param name="text" select="substring-after($text, $nextDelimiter)"/>
<xsl:with-param name="delimiters" select="$delimiters"/>
</xsl:call-template>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
See it in action at http://xsltfiddle.liberty-development.net/gWvjQeR where I have used a third delimiter as an example.
With reference to the double-translate, the purpose of this is to find the next delimiter in the string. To do this (in XSLT 1.0) you need to remove all the characters that are not delimiters. Doing translate($text, $delimiters, '') removes all delimiters, and so returns all characters that are not delimiters. If you then use this result as the set of characters to remove from the original string (the outer translate), you are left with just the delimiters present, in their order of appearance. The first character will then be the next delimiter.
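The two steps of the double-translate can be followed with a minimal sketch. The sample text and the variable names nonDelimiters and delimitersPresent are made up for illustration; the stylesheet should run against any XML document.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:variable name="text" select="'skill trade/union'"/>
    <xsl:variable name="delimiters" select="' /'"/>

    <!-- step 1: delete the delimiters, leaving only the other characters -->
    <xsl:variable name="nonDelimiters" select="translate($text, $delimiters, '')"/>
    <!-- step 2: delete those characters from the original, leaving only the delimiters -->
    <xsl:variable name="delimitersPresent" select="translate($text, $nonDelimiters, '')"/>

    <xsl:value-of select="concat('[', $nonDelimiters, ']')"/>
    <xsl:value-of select="concat('[', $delimitersPresent, ']')"/>
    <!-- the first remaining character is the next delimiter in the text -->
    <xsl:value-of select="concat('[', substring($delimitersPresent, 1, 1), ']')"/>
  </xsl:template>
</xsl:stylesheet>
```

For this sample the output should be [skilltradeunion][ /][ ], so the next delimiter is the space.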
I am using XSLT 1.0.
Suppose I have a string similar to "apple-mango%also|there"
I am trying to replace all the non-alphanumeric characters with spaces.
I tried
<xsl:value-of select="translate(., translate(., '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ', ''), ' ')"/>
but it didn't work.
The trouble is with the outer translate.
As I understand it, in translate() the third string should be the same length as the second; otherwise each character of the second string that has no counterpart in the third is removed.
The inner translate works fine, since there I do want to remove the characters anyway.
But the outer translate replaces only the first character of the second argument with a space and removes the rest.
Since the list of non-alphanumeric characters in the second argument of the outer translate is dynamic, I can't hard-code the third argument.
Example:
My inner translate will return -%|. Which is correct.
Now my outer translate is translate(., '-%|', ' ').
Which returns apple mangoalsothere.
How can it be done short of writing something like this:
translate(., '`~!@#$%^&*()-_=+[]{}\|;:'",<.>/?', ' ')
Another way you could look at this is to use the result of the "inner translate" - i.e the string containing all the unwanted characters - as a parameter in a named recursive template that would replace them, one-by-one, by a space:
XML
<input>alpha-bravo/charlie#delta...echo?foxtrot%golf|hotel india-juliet</input>
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="utf-8" indent="yes"/>
<xsl:template match="/">
<output>
<xsl:call-template name="tokenize">
<xsl:with-param name="string" select="input"/>
<xsl:with-param name="delimiters" select="translate(input, '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ', '')"/>
</xsl:call-template>
</output>
</xsl:template>
<xsl:template name="tokenize">
<xsl:param name="string"/>
<xsl:param name="delimiters"/>
<xsl:choose>
<xsl:when test="$delimiters">
<xsl:variable name="delimiter" select="substring($delimiters, 1, 1)" />
<xsl:value-of select="substring-before($string, $delimiter)" />
<xsl:text> </xsl:text>
<!-- recursive call -->
<xsl:call-template name="tokenize">
<xsl:with-param name="string" select="substring-after($string, $delimiter)"/>
<xsl:with-param name="delimiters" select="substring($delimiters, 2)"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$string"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
Result
<?xml version="1.0" encoding="utf-8"?>
<output>alpha bravo charlie delta echo foxtrot golf hotel india juliet</output>
One way to do this would be to create a recursive template to create a string of nothing but spaces for a given length
<xsl:template name="AllSpaces">
<xsl:param name="spaces" />
<xsl:if test="$spaces > 0">
<xsl:text> </xsl:text>
<xsl:call-template name="AllSpaces">
<xsl:with-param name="spaces" select="$spaces - 1" />
</xsl:call-template>
</xsl:if>
</xsl:template>
Then, you can generate a string with the number of spaces equal to the length of the string you are working with.
<xsl:variable name="specialchars" select="translate(., '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ', '')" />
<xsl:variable name="spaces">
<xsl:call-template name="AllSpaces">
<xsl:with-param name="spaces" select="string-length($specialchars)" />
</xsl:call-template>
</xsl:variable>
You can then use this spaces variable in your translate. For example, try this XSLT
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="text" />
<xsl:template match="data">
<xsl:variable name="specialchars" select="translate(., '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ', '')" />
<xsl:variable name="spaces">
<xsl:call-template name="AllSpaces">
<xsl:with-param name="spaces" select="string-length($specialchars)" />
</xsl:call-template>
</xsl:variable>
<xsl:value-of select="translate(., $specialchars, $spaces)"/>
</xsl:template>
<xsl:template name="AllSpaces">
<xsl:param name="spaces" />
<xsl:if test="$spaces > 0">
<xsl:text> </xsl:text>
<xsl:call-template name="AllSpaces">
<xsl:with-param name="spaces" select="$spaces - 1" />
</xsl:call-template>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
Now, if you had multiple strings you wanted to replace in your XML, you could slightly improve things by having a global variable for spaces that was equal to the length of the longest string. This would give you more spaces than you needed, but that would not be a problem.
Try this XSLT too
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="text" />
<xsl:variable name="spaces">
<xsl:for-each select="//data">
<xsl:sort select="string-length(.)" order="descending" />
<xsl:if test="position() = 1">
<xsl:call-template name="AllSpaces">
<xsl:with-param name="spaces" select="string-length(.)" />
</xsl:call-template>
</xsl:if>
</xsl:for-each>
</xsl:variable>
<xsl:template match="data">
<xsl:value-of select="translate(., translate(., '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ', ''), $spaces)"/>
</xsl:template>
<xsl:template name="AllSpaces">
<xsl:param name="spaces" />
<xsl:if test="$spaces > 0">
<xsl:text> </xsl:text>
<xsl:call-template name="AllSpaces">
<xsl:with-param name="spaces" select="$spaces - 1" />
</xsl:call-template>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
When applied to this XML
<test>
<data>apple-mango%also|there</data>
<data>apple-mango%also|there!test</data>
</test>
The following is output
apple mango also there
apple mango also there test
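The remark above that surplus spaces are harmless follows from how translate() works: characters in the third argument beyond the length of the second are ignored. A minimal sketch of that behaviour follows; the fixed padding variable is hypothetical and not part of the answer, and the stylesheet should run against any XML document.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <!-- hypothetical fixed padding of ten spaces; sufficient while $specialchars has at most ten characters -->
  <xsl:variable name="padding" select="'          '"/>

  <xsl:template match="/">
    <xsl:variable name="s" select="'apple-mango%also|there'"/>
    <!-- the inner translate leaves only the non-alphanumeric characters: -%| -->
    <xsl:variable name="specialchars"
                  select="translate($s, '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ', '')"/>
    <!-- three characters are mapped to the first three spaces; the other seven are ignored -->
    <xsl:value-of select="translate($s, $specialchars, $padding)"/>
  </xsl:template>
</xsl:stylesheet>
```

This should print apple mango also there. A fixed padding only works while the string of special characters is no longer than the padding, which is why the answer generates the spaces to match the actual length.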
I have an input string that contains comma-separated values, e.g. 1,2,3.
I need to separate the values and assign each one to a target node in a for-each loop.
I found the template below, which splits the input string on a delimiter. How can I assign each of the delimited values to the target element in a for-each loop?
<xsl:template name="output-tokens">
<xsl:param name="list"/>
<xsl:param name="delimiter"/>
<xsl:variable name="newlist">
<xsl:choose>
<xsl:when test="contains($list, $delimiter)">
<xsl:value-of select="normalize-space($list)"/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="concat(normalize-space($list), $delimiter)"/>
</xsl:otherwise>
</xsl:choose>
</xsl:variable>
<xsl:variable name="first" select="substring-before($newlist, $delimiter)"/>
<xsl:variable name="remaining"
select="substring-after($newlist, $delimiter)"/>
<xsl:variable name="count" select="position()"/>
<num>
<xsl:value-of select="$first"/>
</num>
<xsl:if test="$remaining">
<xsl:call-template name="output-tokens">
<xsl:with-param name="list" select="$remaining"/>
<xsl:with-param name="delimiter">
<xsl:value-of select="$delimiter"/>
</xsl:with-param>
</xsl:call-template>
</xsl:if>
</xsl:template>
Input xml:
<out1:AvailableDates>
<out1:AvailableDate>15/12/2011,16/12/2011,19/12/2011,20/12/2011,21/12/2011</out1:AvailableDate>
</out1:AvailableDates>
Expected Output:
<tns:AvailableDates>
<tns:AvailableDate>15/12/2011</tns:AvailableDate>
<tns:AvailableDate>16/12/2011</tns:AvailableDate>
<tns:AvailableDate>20/12/2011</tns:AvailableDate>
</tns:AvailableDates>
Here is a complete and short, true XSLT 1.0 solution:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:out1="undefined" xmlns:tns="tns:tns"
exclude-result-prefixes="out1 tns">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="out1:AvailableDate">
<tns:AvailableDates>
<xsl:apply-templates/>
</tns:AvailableDates>
</xsl:template>
<xsl:template match="text()" name="split">
<xsl:param name="pText" select="."/>
<xsl:param name="pItemElementName" select="'tns:AvailableDate'"/>
<xsl:param name="pItemElementNamespace" select="'tns:tns'"/>
<xsl:if test="string-length($pText) > 0">
<xsl:variable name="vNextItem" select=
"substring-before(concat($pText, ','), ',')"/>
<xsl:element name="{$pItemElementName}"
namespace="{$pItemElementNamespace}">
<xsl:value-of select="$vNextItem"/>
</xsl:element>
<xsl:call-template name="split">
<xsl:with-param name="pText" select=
"substring-after($pText, ',')"/>
<xsl:with-param name="pItemElementName" select="$pItemElementName"/>
<xsl:with-param name="pItemElementNamespace" select="$pItemElementNamespace"/>
</xsl:call-template>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
when applied on the provided XML document (corrected to be made well-formed):
<out1:AvailableDates xmlns:out1="undefined">
<out1:AvailableDate>15/12/2011,16/12/2011,19/12/2011,20/12/2011,21/12/2011</out1:AvailableDate>
</out1:AvailableDates>
the wanted, correct result is produced:
<tns:AvailableDates xmlns:tns="tns:tns">
<tns:AvailableDate>15/12/2011</tns:AvailableDate>
<tns:AvailableDate>16/12/2011</tns:AvailableDate>
<tns:AvailableDate>19/12/2011</tns:AvailableDate>
<tns:AvailableDate>20/12/2011</tns:AvailableDate>
<tns:AvailableDate>21/12/2011</tns:AvailableDate>
</tns:AvailableDates>
With XSLT 2.0 you can use the tokenize(string, separator) function instead of a named template.
This stylesheet:
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:tns="http://tnsnamespace">
<xsl:template match="AvailableDate">
<tns:AvailableDates>
<xsl:for-each select="tokenize(current(), ',')">
<tns:AvailableDate>
<xsl:value-of select="."/>
</tns:AvailableDate>
</xsl:for-each>
</tns:AvailableDates>
</xsl:template>
</xsl:stylesheet>
gives following result:
<?xml version="1.0" encoding="UTF-8"?>
<tns:AvailableDates xmlns:tns="http://tnsnamespace">
<tns:AvailableDate>15/12/2011</tns:AvailableDate>
<tns:AvailableDate>16/12/2011</tns:AvailableDate>
<tns:AvailableDate>19/12/2011</tns:AvailableDate>
<tns:AvailableDate>20/12/2011</tns:AvailableDate>
<tns:AvailableDate>21/12/2011</tns:AvailableDate>
</tns:AvailableDates>
Update:
With an XSLT 2.0 processor in backward-compatibility mode, the following template, which calls the output-tokens template from the question, gives the same result:
<xsl:template match="AvailableDate">
<tns:AvailableDates>
<xsl:variable name="myValue">
<xsl:call-template name="output-tokens">
<xsl:with-param name="list" select="."/>
<xsl:with-param name="delimiter" select="','"/>
</xsl:call-template>
</xsl:variable>
<xsl:for-each select="$myValue/node()">
<tns:AvailableDate>
<xsl:value-of select="."/>
</tns:AvailableDate>
</xsl:for-each>
</tns:AvailableDates>
</xsl:template>
In XSLT 1.0, the standard functions alone cannot access the nodes held in such a variable (it is a result tree fragment); see @Dimitre Novatchev's answer to "XSLT 1.0 - Create node set and pass as a parameter".
For this purpose, XSLT 1.0 processors provide an extension function: node-set(...).
For Saxon 6.5, the node-set() function is defined in the http://icl.com/saxon namespace.
So for XSLT 1.0 processors that support the EXSLT node-set() function, the solution would be:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:exslt="http://exslt.org/common"
xmlns:out1="http://out1namespace"
xmlns:tns="http://tnsnamespace"
exclude-result-prefixes="out1 exslt">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="out1:AvailableDate">
<tns:AvailableDates>
<xsl:variable name="myValue">
<xsl:call-template name="output-tokens">
<xsl:with-param name="list" select="."/>
<xsl:with-param name="delimiter" select="','"/>
</xsl:call-template>
</xsl:variable>
<xsl:for-each select="exslt:node-set($myValue)/node()">
<tns:AvailableDate>
<xsl:value-of select="."/>
</tns:AvailableDate>
</xsl:for-each>
</tns:AvailableDates>
</xsl:template>
<xsl:template name="output-tokens">
<xsl:param name="list"/>
<xsl:param name="delimiter"/>
<xsl:variable name="newlist">
<xsl:choose>
<xsl:when test="contains($list, $delimiter)">
<xsl:value-of select="normalize-space($list)"/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="concat(normalize-space($list), $delimiter)"/>
</xsl:otherwise>
</xsl:choose>
</xsl:variable>
<xsl:variable name="first" select="substring-before($newlist, $delimiter)"/>
<xsl:variable name="remaining"
select="substring-after($newlist, $delimiter)"/>
<xsl:variable name="count" select="position()"/>
<num>
<xsl:value-of select="$first"/>
</num>
<xsl:if test="$remaining">
<xsl:call-template name="output-tokens">
<xsl:with-param name="list" select="$remaining"/>
<xsl:with-param name="delimiter">
<xsl:value-of select="$delimiter"/>
</xsl:with-param>
</xsl:call-template>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
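For Saxon 6.5, the same loop could be written with the processor's own node-set() function. This is only a minimal sketch, assuming the http://icl.com/saxon namespace mentioned above. The saxon prefix is my own choice, and the variable is filled with literal num elements as a stand-in for the output of the output-tokens template, just to keep the example short.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://icl.com/saxon"
    xmlns:tns="http://tnsnamespace"
    exclude-result-prefixes="saxon">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <!-- Stand-in for the tokens produced by output-tokens: a result tree fragment -->
  <xsl:variable name="myValue">
    <num>15/12/2011</num>
    <num>16/12/2011</num>
  </xsl:variable>

  <xsl:template match="/">
    <tns:AvailableDates>
      <!-- saxon:node-set() converts the fragment so that its children can be selected -->
      <xsl:for-each select="saxon:node-set($myValue)/num">
        <tns:AvailableDate>
          <xsl:value-of select="."/>
        </tns:AvailableDate>
      </xsl:for-each>
    </tns:AvailableDates>
  </xsl:template>
</xsl:stylesheet>
```

Only the namespace declaration and the function prefix differ from the EXSLT version.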
Thanks to @Dimitre Novatchev for correcting me and for his answer about accessing node sets from a variable.
Personally, I prefer this variant based on custom extension functions. The method is compact and clean, and works fine in XSLT 1.0 (at least with XALAN 2.7 as embedded in any recent JVM).
1) Declare a class with a static method returning an org.w3c.dom.Node:
package com.reverseXSL.util;
import org.w3c.dom.*;
import java.util.regex.*;
import javax.xml.parsers.DocumentBuilderFactory;
public class XslTools {
public static Node splitToNodes(String input, String regex) throws Exception {
Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
Element item, list = doc.createElement("List");
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(input);
while (m.find()) {
item = doc.createElement("Item");
StringBuffer sb = new StringBuffer();
for (int i=1; i<=m.groupCount(); ++i) if (m.start(i)>=0) sb.append(m.group(i));
Text txt = doc.createTextNode(sb.toString());
item.appendChild(txt);
list.appendChild(item);
}
return list;
}
}
This function splits an input string using a regex pattern and creates a document fragment of the kind <List><Item>A</Item><Item>B</Item><Item>C</Item></List>.
The regex is matched repeatedly, in sequence, and each match yields an Item element whose value is the concatenation of the capturing groups (some possibly empty) inside that match. This makes it possible to drop delimiters and other syntax characters.
For instance, to split a comma-separated list like " A, B ,, C", skip empty values, and trim extra spaces (hence get the above Node list), use a regex like '\s*([^,]+?)\s*(?:,|$)' - a mind twisting one! If instead you want to split the input text by a fixed size (here 10 chars) with the last Item taking whatever remains, use a regex like '(.{10}|.+)' - love it!
You can then use the function in XSLT 1.0 as follows (quite compact!):
<xsl:stylesheet version="1.0" xmlns:var="com.reverseXSL.util.XslTools" extension-element-prefixes="var" ...
...
<xsl:template ...
...
<xsl:for-each select="var:splitToNodes(Detail/CsvText,'\s*([^,]+?)\s*(?:,|$)')/Item">
<Loop><xsl:value-of select="."/></Loop>
</xsl:for-each>
...
Executed on a template match yielding the input fragment <Detail><CsvText>a, b ,c </CsvText></Detail> you'll generate <Loop>a</Loop><Loop>b</Loop><Loop>c</Loop>
The trick, as you will note, is to remember to follow the function call that generates the List node with the XPath step "/Item" (or "/*"), so that a sequence of Item nodes is returned to the for-each loop.
I have a string (in a variable) that has a list of numbers separated by space or comma.
I need to sum the numbers in the string.
example string "1,2,5,12,3"
or "1 2 5 12 3"
Is there a way to add the numbers within the string and return the total?
I. XSLT 1.0 solution:
This much shorter transformation:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="text()" name="sumStringList">
<xsl:param name="pText" select="."/>
<xsl:param name="pSum" select="0"/>
<xsl:param name="pDelim" select="','"/>
<xsl:choose>
<xsl:when test="not(string-length($pText) >0)">
<xsl:value-of select="$pSum"/>
</xsl:when>
<xsl:otherwise>
<xsl:variable name="vnewList"
select="concat($pText,$pDelim)"/>
<xsl:variable name="vHead" select=
"substring-before($vnewList, $pDelim)"/>
<xsl:call-template name="sumStringList">
<xsl:with-param name="pText" select=
"substring-after($pText, $pDelim)"/>
<xsl:with-param name="pSum" select="$pSum+$vHead"/>
<xsl:with-param name="pDelim" select="$pDelim"/>
</xsl:call-template>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
when applied on the following XML document:
<t>1,2,5,12,3</t>
produces the wanted, correct result:
23
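Explanation: A named template that also matches any text node calls itself recursively, carrying the running total in the pSum parameter. A sentinel (the appended delimiter) guarantees that substring-before() always finds a delimiter, so the last number needs no special handling.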
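The question also mentions space-separated lists. One way to cover both cases in XSLT 1.0 might be to normalize the separators before recursing. This is a minimal sketch, not part of the transformation above: the template name sumList is hypothetical, and it assumes the text contains only numbers, commas and whitespace.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="text()">
    <!-- Turn commas into spaces, then collapse whitespace runs to single spaces -->
    <xsl:call-template name="sumList">
      <xsl:with-param name="pText"
           select="normalize-space(translate(., ',', ' '))"/>
    </xsl:call-template>
  </xsl:template>

  <xsl:template name="sumList">
    <xsl:param name="pText"/>
    <xsl:param name="pSum" select="0"/>
    <xsl:choose>
      <!-- Nothing left to consume: emit the accumulated total -->
      <xsl:when test="not($pText)">
        <xsl:value-of select="$pSum"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:call-template name="sumList">
          <xsl:with-param name="pText" select="substring-after($pText, ' ')"/>
          <!-- The appended space is the sentinel for the last item -->
          <xsl:with-param name="pSum"
               select="$pSum + substring-before(concat($pText, ' '), ' ')"/>
        </xsl:call-template>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

Applied to <t>1 2 5 12 3</t> or to <t>1,2,5,12,3</t>, this should produce 23 in both cases.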
II. XSLT 2.0 solution:
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:param name="pDelim" select="','"/>
<xsl:template match="text()">
<xsl:sequence select=
"sum(for $s in tokenize(.,$pDelim)
return number($s)
)
"/>
</xsl:template>
</xsl:stylesheet>
When applied on the same XML document (above), this transformation produces the same wanted, correct answer:
23
Here we use the standard XPath 2.0 function tokenize(), and we must convert every resulting token to a number (using the number() function) before finally applying the standard XPath function sum().
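Because the second argument of tokenize() is a regular expression, the space-separated form from the question can be handled in the same pass. This is a minimal sketch under that assumption. The pattern '[,\s]+' and the [.] predicate (which drops empty tokens caused by leading or trailing separators) are my additions, not part of the solution above.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="text()">
    <!-- Split on any run of commas and/or whitespace, skip empty tokens, then sum -->
    <xsl:value-of select=
      "sum(for $s in tokenize(., '[,\s]+')[.]
             return number($s))"/>
  </xsl:template>
</xsl:stylesheet>
```

Applied to <t>1 2 5 12 3</t>, this should again give 23.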
I don't know XSLT, but generally you would split the string using spaces and commas as separators.
After a quick search I found that you can use tokenize(string, separator) as the split function if you are using XSLT 2.0. This page has an example on how to use tokenize.
Here is an XSLT 1.0 solution:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:template match="/">
<xsl:variable name="listOfValues" select="'1,2,5,12,3'" />
<xsl:call-template name="splitAndAdd">
<xsl:with-param name="list" select="$listOfValues"/>
</xsl:call-template>
</xsl:template>
<xsl:template name="splitAndAdd">
<xsl:param name="list" />
<xsl:param name="delimiter" select="','"/>
<xsl:param name="total" select="0" />
<xsl:variable name="newList">
<xsl:choose>
<xsl:when test="contains($list, $delimiter)">
<xsl:value-of select="normalize-space($list)"/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="concat(normalize-space($list),$delimiter)" />
</xsl:otherwise>
</xsl:choose>
</xsl:variable>
<xsl:variable name="token"
select="substring-before($newList, $delimiter)" />
<xsl:variable name="remaining"
select="normalize-space(substring-after($newList, $delimiter))" />
<xsl:variable name="newTotal" select="$total + number($token)" />
<xsl:choose>
<xsl:when test="$remaining">
<xsl:call-template name="splitAndAdd">
<xsl:with-param name="delimiter" select="$delimiter"/>
<xsl:with-param name="list" select="$remaining"/>
<xsl:with-param name="total" select="$newTotal" />
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$newTotal" />
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>