Using a Map in XSL for expanding abbreviations - xslt

I saw a similar question on creating a Map.
That answer has this code:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt">
<xsl:template match="/">
<xsl:variable name="map">
<map>
<entry key="key-1">value1</entry>
<entry key="key-2">value2</entry>
<entry key="key-3">value3</entry>
</map>
</xsl:variable>
<output>
<xsl:value-of select="msxsl:node-set($map)/map/entry[#key='key-1']"/>
</output>
</xsl:template>
I would like to replace the output command to use a value in my XML to see if it is a key in the map and then replace it with the value.
Is the best way to do a for-each select on the map and compare with contains?
Here is a snippet of the XML:
<document>
<content name="PART_DESC_SHORT" type="text" vse-streams="2" u="22" action="cluster" weight="1">
SCREW - ADJUST
</content>
</document>
The content node value may have a string containing an abbreviation that I want to replace with the full value.
Thanks,
Paul

No need to use a for-each - this XSLT:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:msxsl="urn:schemas-microsoft-com:xslt">
<xsl:output indent="yes" method="xml" />
<xsl:variable name="abbreviations">
<abbreviation key="Brkt Pivot R">Bracket Pivot R</abbreviation>
<abbreviation key="Foo">Expanded Foo</abbreviation>
<abbreviation key="Bar">Expanded Bar</abbreviation>
</xsl:variable>
<xsl:template match="#* | node()">
<xsl:copy>
<xsl:apply-templates select="#* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="text()">
<xsl:variable name="text" select="."/>
<xsl:variable name="abbreviation" select="msxsl:node-set($abbreviations)/*[#key=$text]"/>
<xsl:choose>
<xsl:when test="$abbreviation">
<xsl:value-of select="$abbreviation"/>
</xsl:when>
<xsl:otherwise>
<xsl:copy/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
will convert any XML in an exact copy expanding all the text matching an abbreviation defined in the abbreviations variable at the top.
If you want to expand the abbreviations only within specific elements you can modify the second template match="..." rule.
On the other hand if you want to expand ANY occurance of all the abbreviations within the text you need loops - that means recursion in XSLT:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:msxsl="urn:schemas-microsoft-com:xslt">
<xsl:output indent="yes" method="xml" />
<xsl:variable name="abbreviations">
<abbreviation key="Brkt">Bracket</abbreviation>
<abbreviation key="As">Assembly</abbreviation>
<abbreviation key="Foo">Expanded Foo</abbreviation>
<abbreviation key="Bar">Expanded Bar</abbreviation>
</xsl:variable>
<!-- Replaces all occurrences of a string with another within a text -->
<xsl:template name="replace">
<xsl:param name="text"/>
<xsl:param name="from"/>
<xsl:param name="to"/>
<xsl:choose>
<xsl:when test="contains($text,$from)">
<xsl:value-of select="concat(substring-before($text,$from),$to)"/>
<xsl:call-template name="replace">
<xsl:with-param name="text" select="substring-after($text,$from)"/>
<xsl:with-param name="from" select="$from"/>
<xsl:with-param name="to" select="$to"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$text"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<!-- Replace all occurences of a list of abbreviation with their expanded version -->
<xsl:template name="replaceAbbreviations">
<xsl:param name="text"/>
<xsl:param name="abbreviations"/>
<xsl:choose>
<xsl:when test="count($abbreviations)>0">
<xsl:call-template name="replaceAbbreviations">
<xsl:with-param name="text">
<xsl:call-template name="replace">
<xsl:with-param name="text" select="$text"/>
<xsl:with-param name="from" select="$abbreviations[1]/#key"/>
<xsl:with-param name="to" select="$abbreviations[1]"/>
</xsl:call-template>
</xsl:with-param>
<xsl:with-param name="abbreviations" select="$abbreviations[position()>1]"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$text"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<xsl:template match="#* | node()">
<xsl:copy>
<xsl:apply-templates select="#* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="text()">
<xsl:call-template name="replaceAbbreviations">
<xsl:with-param name="text" select="."/>
<xsl:with-param name="abbreviations" select="msxsl:node-set($abbreviations)/*"/>
</xsl:call-template>
</xsl:template>
</xsl:stylesheet>
Applying this second XSLT to
<document>
<content name="PART_DESC_SHORT" type="text" vse-streams="2" u="22" action="cluster" weight="1">
Brkt Pivot R
</content>
</document>
produces
<document>
<content name="PART_DESC_SHORT" type="text" vse-streams="2" u="22" action="cluster" weight="1">
Bracket Pivot R
</content>
</document>
Note that:
this solution assumes that no abbreviation ovelap (e.g. two separate abbreviations Brk and Brkt)
it uses XSLT 1.0 - a better solution is probably possible with XSLT 2.0
this kind of heavy string processing is likely quite inefficient in XSLT, it is probably better to write an extension function in some other language and call it from the XSLT.

Related

String Split to new Elements using XSL 1.0

Can any guide me to split the given xml element values into multiple child elements based on a token. Here is my sample input xml and desired output. I have a limitation to use xsl 1.0. Thank you.
Input XML:
<?xml version='1.0' encoding='UTF-8'?>
<SQLResults>
<SQLResult>
<ACTION1>Action1</ACTION1>
<ACTION2>Action2</ACTION2>
<Encrypt>Program=GPG;Code=23FCS;</Encrypt>
<SENDER>Program=WebPost;Protocol=WS;Path=/home/Inbound</SENDER>
</SQLResult>
</SQLResults>
Output XML:
<?xml version='1.0' encoding='UTF-8'?>
<SQLResults>
<SQLResult>
<ACTION1>Action1</ACTION1>
<ACTION2>Action2</ACTION2>
<Encrypt>
<Program>GPG</Program>
<Code>23FCS</Code>
</Encrypt>
<SENDER>
<Program>Action4</Program>
<Protocol>WS</Protocol>
<Path>/home/Inbound</Path>
</SENDER>
</SQLResult>
</SQLResults>
In XSLT 2 it would be easy, just with the following template:
<xsl:template match="Encrypt|SENDER">
<xsl:copy>
<xsl:analyze-string select="." regex="(\w+)=([\w/]+);?">
<xsl:matching-substring>
<element name="{regex-group(1)}">
<xsl:value-of select="regex-group(2)"/>
</element>
</xsl:matching-substring>
</xsl:analyze-string>
</xsl:copy>
</xsl:template>
Because you want to do it in XSLT 1, you have to express it another way.
Instead of analyze-string you have to:
Tokenize the content into non-empty tokens contained between ; chars.
You have to add tokenize template.
Each such token divide into 2 substrings, before and after = char.
Create an element with the name equal to the first substring.
Write the content of this element - the second substring.
XSLT 1 has also such limitation that the result of the tokenize template
is a result tree fragment (RTF) not the node set and thus it cannot be
used in XPath expressions.
To circumvent this limitation, you must use exsl:node-set function.
So the whole script looks like below:
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:transform version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:exsl="http://exslt.org/common">
<xsl:output method="xml" encoding="UTF-8" indent="yes" />
<xsl:strip-space elements="*"/>
<xsl:template match="Encrypt|SENDER">
<xsl:copy>
<xsl:variable name="tokens">
<xsl:call-template name="tokenize">
<xsl:with-param name="txt" select="."/>
<xsl:with-param name="delim" select="';'"/>
</xsl:call-template>
</xsl:variable>
<xsl:for-each select="exsl:node-set($tokens)/token">
<xsl:variable name="t1" select="substring-before(., '=')"/>
<xsl:variable name="t2" select="substring-after(., '=')"/>
<xsl:element name="{$t1}">
<xsl:value-of select="$t2" />
</xsl:element>
</xsl:for-each>
</xsl:copy>
</xsl:template>
<xsl:template name="tokenize">
<xsl:param name="txt" />
<xsl:param name="delim" select="' '" />
<xsl:choose>
<xsl:when test="$delim and contains($txt, $delim)">
<token>
<xsl:value-of select="substring-before($txt, $delim)" />
</token>
<xsl:call-template name="tokenize">
<xsl:with-param name="txt" select="substring-after($txt, $delim)" />
<xsl:with-param name="delim" select="$delim" />
</xsl:call-template>
</xsl:when>
<xsl:when test="$txt">
<token><xsl:value-of select="$txt" /></token>
</xsl:when>
</xsl:choose>
</xsl:template>
<xsl:template match="#*|node()">
<xsl:copy><xsl:apply-templates select="#*|node()"/></xsl:copy>
</xsl:template>
</xsl:transform>

Split a string with "|" Char to columns

I'm a bit new to this language so I have several doubts.
I'm working to process an xml to display some data on pdf form.
But there a few strings that have "|" so I can split the data to display properly.
Here is the example of the input data:
<root>
<reference>
<NroLinRef>12</NroLinRef>
<CodRef>I20</CodRef>
<RazonRef>Data1|Data2|Data3|Data4|Data5|Data6|Data7</RazonRef>
</reference>
</root>
In the output I need something like this so I can display in order in row with cells so data must be clear to read.
<root>
<Reference>
<NroLinRef>12</NroLinRef>
<CodRef>I20</CodRef>
<Data1>Data1</Data1>
<Data2>Data2</Data2>
<Data3>Data3</Data3>
<Data4>Data4</Data4>
<Data5>Data5</Data5>
<Data6>Data6</Data6>
<Data7>Data7</Data7>
</Reference>
</root>
To do this I have been using other code that is from another question but can't find how to get the name to be updated or customized.
And the output I get is actually like this:
<root>
<Reference>
<NroLinRef>12</NroLinRef>
<CodRef>I20</CodRef>
<Data>Data1</Data>
<Data>Data2</Data>
<Data>Data3</Data>
<Data>Data4</Data>
<Data>Data5</Data>
<Data>Data6</Data>
<Data>Data7</Data>
</Reference>
</root>
This is the XSL i'm using
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="#*|node()">
<xsl:copy>
<xsl:apply-templates select="#*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="Referencia/RazonRef" name="tokenize">
<xsl:param name="text" select="."/>
<xsl:param name="separator" select="'|'"/>
<xsl:choose>
<xsl:when test="not(contains($text, $separator))">
<Data>
<xsl:value-of select="normalize-space($text)"/>
</Data>
</xsl:when>
<xsl:otherwise>
<Data>
<xsl:value-of select="normalize-space(substring-before($text, $separator))"/>
</Data>
<xsl:call-template name="tokenize">
<xsl:with-param name="text" select="substring-after($text, $separator)"/>
</xsl:call-template>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
How can I get the output I want?
The expected result can be achieved by applying the following stylesheet:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<!-- identity transform -->
<xsl:template match="#*|node()">
<xsl:copy>
<xsl:apply-templates select="#*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="RazonRef" name="tokenize">
<xsl:param name="text" select="."/>
<xsl:param name="separator" select="'|'"/>
<xsl:param name="i" select="1"/>
<xsl:element name="Data{$i}">
<xsl:value-of select="substring-before(concat($text, $separator), $separator)"/>
</xsl:element>
<xsl:if test="contains($text, $separator)">
<xsl:call-template name="tokenize">
<xsl:with-param name="text" select="substring-after($text, $separator)"/>
<xsl:with-param name="i" select="$i + 1"/>
</xsl:call-template>
</xsl:if>
</xsl:template>
</xsl:stylesheet>

Using xsl:value-of in xsl:template 's

I am passing an xml file to my fo file which looks like:
<?xml version="1.0"?>
<activityExport>
<resourceKey>
<key>monthName</key>
<value>January</value>
</resourceKey>
So if I directly use:
<xsl:value-of select="activityExport/resourceKey[key='monthName']/value"/>
I can see "January" in my PDF file just fine.
However, if I use it like this in a template I have:
<xsl:template name="format-month">
<xsl:param name="date"/>
<xsl:param name="month" select="format-number(substring($date,6,2), '##')"/>
<xsl:param name="format" select="'m'"/>
<xsl:param name="month-word">
<xsl:choose>
<xsl:when test="$month = 1"><xsl:value-of select="activityExport/resourceKey[key='monthName']/value"/>
</xsl:when>
Then I do not see "January" when I call:
<xsl:variable name="monthName">
<xsl:call-template name="format-month">
<xsl:with-param name="format" select="'M'"/>
<xsl:with-param name="month" select="#monthValue"/>
</xsl:call-template>
</xsl:variable>
<xsl:value-of select="concat($monthName,' ',#yearValue)"/>
I know my template works because if I have a static string in:
<xsl:choose>
<xsl:when test="$month = 1">Januaryyy</xsl:when>
Then I can see Januaryyyy fine.
So the template works, the resource exists, but the value-of-select does not work inside of call-template or xsl:choose or xsl:when test
Any help?
Regards!
Your template is probably fine, except that you are calling it from an unsuitable position in your XML. Therefore, the XPath you are using to set month-word does not find anything - it is a path to nothing.
For example, the following XSLT stylesheet:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="#* | node()">
<xsl:copy>
<xsl:apply-templates select="#* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template name="format-month">
<xsl:param name="date"/>
<xsl:param name="month" select="format-number(substring($date,6,2), '##')"/>
<xsl:param name="format" select="'m'"/>
<xsl:param name="month-word">
<xsl:choose>
<xsl:when test="$month = 1">
<xsl:value-of select="activityExport/resourceKey[key='monthName']/value"/>
</xsl:when>
</xsl:choose>
</xsl:param>
<xsl:value-of select="$month-word"/>
</xsl:template>
<xsl:template match="/">
<xsl:variable name="monthName">
<xsl:call-template name="format-month">
<xsl:with-param name="month" select=" '1' "/>
<xsl:with-param name="format" select="'M'"/>
</xsl:call-template>
</xsl:variable>
<xsl:value-of select="concat($monthName,' ',#yearValue)"/>
</xsl:template>
</xsl:stylesheet>
applied to this XML:
<activityExport>
<resourceKey>
<key>monthName</key>
<value>January</value>
</resourceKey>
</activityExport>
produces this output:
January
Note that I have replaced the month parameter to your template with the value 1. No elements in this input XML have a #monthValue attribute (which is what lead me to believe you are calling the template from an unsuitable location) and so month-word would not get set either because of the xsl:choose.
To make things work with your real input XML, you could try replacing the XPath with "//activityExport/resourceKey[key='monthName']/value" where the double-slash defines a path to anywhere within the XML document. This should be fine if there is only one activityExport node. Otherwise, you will need to work out the suitable XPath.

Create a recursive string function for adding a sequence

I've a challenging problem and so far I wasn't able to solve.
Within my xlst I have variable which contains a string.
I need to add the following sequence [eol] to this string.
On a fix position namely every 65 characters
I thought to use a function or template to recursive add this charackter.
The reason is that the string length can variate in length.
<xsl:function name="funct:insert-eol" as="xs:string" >
<xsl:param name="originalString" as="xs:string?"/>
<xsl:variable name="length">
<xsl:value-of select="string-length($originalString)"/>
</xsl:variable>
<xsl:variable name="start" as="xs:integer">
<xsl:value-of select="1"/>
</xsl:variable>
<xsl:variable name="eol" as="xs:integer">
<xsl:value-of select="65"/>
</xsl:variable>
<xsl:variable name="newLines">
<xsl:value-of select="$length idiv number('65')"/>
</xsl:variable>
<xsl:for-each select="1 to $newLines">
<xsl:value-of select="substring($originalString, $start, $eol)" />
</xsl:for-each>
</xsl:function>
The more I write code the more variables I need to introduce. This is still my lack on understanding.
For example we want every 5 chars an [eol]
aaaaaaabbbbbbccccccccc
aaaaa[eol]aabbb[eol]bbbcc[eol]ccccc[eol]cc
Hope someone has a starting point for me..
Regards Dirk
Rather straight-forward and short -- no recursion is necessary (and can even be specified as a single XPath expression):
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:param name="pLLength" select="5"/>
<xsl:template match="/*">
<xsl:variable name="vText" select="string()"/>
<xsl:for-each select="1 to string-length($vText) idiv $pLLength +1">
<xsl:value-of select="substring($vText, $pLLength*(position()-1)+1, $pLLength)"/>
<xsl:if test=
"not(position() eq last()
or position() eq last() and string-length($vText) mod $pLLength)">[eol]</xsl:if>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied on this XML document:
<t>aaaaaaabbbbbbccccccccc</t>
the wanted, correct result is produced:
aaaaa[eol]aabbb[eol]bbbcc[eol]ccccc[eol]cc
When this XML document is processed:
<t>aaaaaaabbbbbbcccccccccddd</t>
again the wanted, correct result is produced:
aaaaa[eol]aabbb[eol]bbbcc[eol]ccccc[eol]ccddd[eol]
You can treat it as a grouping problem, using for-each-group:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:mf="http://example.com/mf"
exclude-result-prefixes="xs mf">
<xsl:function name="mf:insert-eol" as="xs:string">
<xsl:param name="input" as="xs:string"/>
<xsl:param name="chunk-size" as="xs:integer"/>
<xsl:value-of>
<xsl:for-each-group select="string-to-codepoints($input)" group-by="(position() - 1) idiv $chunk-size">
<xsl:if test="position() gt 1"><xsl:sequence select="'eol'"/></xsl:if>
<xsl:sequence select="codepoints-to-string(current-group())"/>
</xsl:for-each-group>
</xsl:value-of>
</xsl:function>
<xsl:template match="#* | node()">
<xsl:copy>
<xsl:apply-templates select="#* , node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="text">
<xsl:copy>
<xsl:sequence select="mf:insert-eol(., 5)"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
That stylesheet transforms
<root>
<text>aaaaaaabbbbbbccccccccc</text>
</root>
into
<root>
<text>aaaaaeolaabbbeolbbbcceolccccceolcc</text>
</root>
Try this one:
<?xml version='1.0' ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xsl:param name="TextToChange" select="'aaaaaaabbbbbbccccccccc'"/>
<xsl:param name="RequiredLength" select="xs:integer(5)"/>
<xsl:template match="/">
<xsl:call-template name="AddText"/>
</xsl:template>
<xsl:template name="AddText">
<xsl:param name="Text" select="$TextToChange"/>
<xsl:param name="TextLength" select="string-length($TextToChange)"/>
<xsl:param name="start" select="xs:integer(1)"/>
<xsl:param name="end" select="$RequiredLength"/>
<xsl:choose>
<xsl:when test="$TextLength gt $RequiredLength">
<xsl:value-of select="substring($Text,$start,$end)"/>
<xsl:text>[eol]</xsl:text>
<xsl:call-template name="AddText">
<xsl:with-param name="Text" select="substring-after($Text, substring($Text,$start,$end))"/>
<xsl:with-param name="TextLength"
select="string-length(substring-after($Text, substring($Text,$start,$end)))"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$Text"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>

Counting distinct items and parsing comma-delimited values using XSLT

Suppose I have XML like this:
<child_metadata>
<metadata>
<attributes>
<metadata_valuelist value="[SampleItem3]"/>
</attributes>
</metadata>
<metadata>
<attributes>
<metadata_valuelist value="[SampleItem1]"/>
</attributes>
</metadata>
<metadata>
<attributes>
<metadata_valuelist value="[SampleItem1, SampleItem2]"/>
</attributes>
</metadata>
</child_metadata>
What I want to do is count the number of distinct values that are in the metadata_valuelists. There are the following distinct values: SampleItem1, SampleItem2, and SampleItem3. So, I want to get a value of 3. (Although SampleItem1 occurs twice, I only count it once.)
How can I do this in XSLT?
I realize there are two problems here: First, separating the comma-delimited values in the lists, and, second, counting the number of unique values. However, I'm not certain that I could combine solutions to the two problems, which is why I'm asking it as one question.
Another way without extension:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:variable name="all-value" select="/*/*/*/*/#value"/>
<xsl:template match="/">
<xsl:variable name="count">
<xsl:apply-templates select="$all-value"/>
</xsl:variable>
<xsl:value-of select="string-length($count)"/>
</xsl:template>
<xsl:template match="#value" name="value">
<xsl:param name="meta" select="translate(.,'[] ','')"/>
<xsl:choose>
<xsl:when test="contains($meta,',')">
<xsl:call-template name="value">
<xsl:with-param name="meta" select="substring-before($meta,',')"/>
</xsl:call-template>
<xsl:call-template name="value">
<xsl:with-param name="meta" select="substring-after($meta,',')"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:if test="count(.|$all-value[contains(translate(.,'[] ','
'),
concat('
',$meta,'
'))][1])=1">
<xsl:value-of select="1"/>
</xsl:if>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
Note: maybe can be optimize with xsl:key instead of xsl:variable
Edit: Match tricky metadata.
This (note: just a single) transformation:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt"
>
<xsl:output method="text"/>
<xsl:strip-space elements="*"/>
<xsl:key name="kValue" match="value" use="."/>
<xsl:template match="/">
<xsl:variable name="vRTFPass1">
<values>
<xsl:apply-templates/>
</values>
</xsl:variable>
<xsl:variable name="vPass1"
select="msxsl:node-set($vRTFPass1)"/>
<xsl:for-each select="$vPass1">
<xsl:value-of select=
"count(*/value[generate-id()
=
generate-id(key('kValue', .)[1])
]
)
"/>
</xsl:for-each>
</xsl:template>
<xsl:template match="metadata_valuelist">
<xsl:call-template name="tokenize">
<xsl:with-param name="pText" select="translate(#value, '[],', '')"/>
</xsl:call-template>
</xsl:template>
<xsl:template name="tokenize">
<xsl:param name="pText" />
<xsl:choose>
<xsl:when test="not(contains($pText, ' '))">
<value><xsl:value-of select="$pText"/></value>
</xsl:when>
<xsl:otherwise>
<value>
<xsl:value-of select="substring-before($pText, ' ')"/>
</value>
<xsl:call-template name="tokenize">
<xsl:with-param name="pText" select=
"substring-after($pText, ' ')"/>
</xsl:call-template>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
when applied on the provided XML document:
<child_metadata>
<metadata>
<attributes>
<metadata_valuelist value="[SampleItem3]"/>
</attributes>
</metadata>
<metadata>
<attributes>
<metadata_valuelist value="[SampleItem1]"/>
</attributes>
</metadata>
<metadata>
<attributes>
<metadata_valuelist value="[SampleItem1, SampleItem2]"/>
</attributes>
</metadata>
</child_metadata>
produces the wanted, correct result:
3
Do note: Because this is an XSLT 1.0 solution, it is necessary to convert the results of the first pass from the infamous RTF type to a regular tree. This is done using your XSLT 1.0 processor's xxx:node-set() function -- in my case I used msxsl:node-set().
You probably want to think about doing this in two stages; first, do a transform that breaks down these value attributes, then it's fairly trivial to count them.
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="#value">
<xsl:call-template name="breakdown">
<xsl:with-param name="itemlist" select="substring-before(substring-after(.,'['),']')" />
</xsl:call-template>
</xsl:template>
<xsl:template name="breakdown">
<xsl:param name="itemlist" />
<xsl:choose>
<xsl:when test="contains($itemlist,',')">
<xsl:element name="value">
<xsl:value-of select="normalize-space(substring-before($itemlist,','))" />
</xsl:element>
<xsl:call-template name="breakdown">
<xsl:with-param name="itemlist" select="substring-after($itemlist,',')" />
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:element name="value">
<xsl:value-of select="normalize-space($itemlist)" />
</xsl:element>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<xsl:template match="#* | node()">
<xsl:copy>
<xsl:apply-templates select="#* | node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Aside from the 'catch all' template at the bottom, this picks up any value attributes in the format you gave, and breaks them down into separate elements (as sub-elements of the 'metadata_valuelist' element) like this:
...
<metadata_valuelist>
<value>SampleItem1</value>
<value>SampleItem2</value>
</metadata_valuelist>
...
The 'substring-before/substring-after select you see near the top strips off the '[' and ']' before passing it to the 'breakdown' template. This template will check if there's a comma in it's 'itemlist' parameter, and if there is it spits out the text before it as the content of a 'value' element, before recursively calling itself with the rest of the list. If there was no comma in the parameter, it just outputs the entire content of the parameter as a 'value' element.
Then just run this:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" />
<xsl:key name="itemvalue" match="value" use="text()" />
<xsl:template match="/">
<xsl:value-of select="count(//value[generate-id(.) = generate-id(key('itemvalue',.)[1])])" />
</xsl:template>
</xsl:stylesheet>
on the XML you get from the first transform, and it'll just spit out a single value as text output that tells you how many distinct values you have.
EDIT: I should probably point out, this solution makes a few assumptions about your input:
There are no attributes named 'value' anywhere else in the document; if there are, you can modify the #value match to pick out these ones specifically.
There are no elements named 'value' anywhere else in the document; as the first transform creates them, the second will not be able to distinguish between the two. If there are, you can replace the two <xsl:element name="value"> lines with an element name that's not already used.
The content of the #value attribute always begins with '[' and ends with ']', and there are no ']' characters within the list; if there are, the 'substring-before' function will drop everything after the first ']', rather than just the ']' at the end.
There are no commas in the names of the items you want to count, e.g. [SampleItem1, "Sample2,3"]. If there are, '"Sample2' and '3"' would be treated as separate items.