For-each loop with fn:tokenize - xslt

I want to take the value of variable, tokenize it, and loop through the different tokens.
My solution does not work as expected. I must be misunderstanding what tokenize() actually does.
<xsl:variable name="topicCode">1.2.3.4</xsl:variable>
<xsl:variable name="tokenizedTopicCode"><xsl:value-of select="tokenize($topicCode,'\.')"/></xsl:variable>
<mdcomplex name="Topic">
<xsl:for-each select="distinct-values($tokenizedTopicCode)">
<xsl:if test="position()=1">
<md name="chaptercode">
<xsl:attribute name="value"><xsl:value-of select="."/></xsl:attribute>
</md>
</xsl:if>
<xsl:if test="position()=2">
<md name="sectioncode">
<xsl:attribute name="value"><xsl:value-of select="concat($tokenizedTopicCode[position()=1],'.',.)"/></xsl:attribute>
</md>
</xsl:if>
<xsl:if test="position()=3">
<md name="subsectioncode">
<xsl:attribute name="value"><xsl:value-of select="concat($tokenizedTopicCode[position()=1],'.',$tokenizedTopicCode[position()=2],'.',.)"/></xsl:attribute>
</md>
</xsl:if>
<xsl:if test="position()=4">
<md name="topiccode">
<xsl:attribute name="value"><xsl:value-of select="concat($tokenizedTopicCode[position()=1],'.',$tokenizedTopicCode[position()=2],'.',$tokenizedTopicCode[position()=3],'.',.)"/></xsl:attribute>
</md>
</xsl:if>
</xsl:for-each>
</mdcomplex>
Expected:
<mdcomplex name="Topic">
<md name="chaptercode" value="1"/>
<md name="sectioncode" value="1.2"/>
<md name="subsectioncode" value="1.2.3"/>
<md name="topiccode" value="1.2.3.4"/>
</mdcomplex>
Actual:
<mdcomplex name="Topic">
<md name="chaptercode" value="1 2 3 4"/>
</mdcomplex>
I also added a <xsl:message> right after the start of the <xsl:for-each> loop:
<xsl:for-each select="distinct-values($tokenizedTopicCode)">
<xsl:message><xsl:value-of select="."/></xsl:message>
I expected it to output the value of the different tokens (1, 2, 3, 4). Instead, it outputs all tokens in one go: "1 2 3 4".
How can I split the variable into the different tokens and loop through them?
I am using Saxon 9.9.1.7 on Oxygen.

Key learning points here:
(1) xsl:value-of constructs a single text node. You're splitting a string into tokens using tokenize(), and then you're immediately stringing them back together using xsl:value-of.
(2) xsl:variable, with no select or as attribute, constructs an XML document tree. Again that's going to munge your tokens together.
If you want a variable to contain a sequence of strings, do
<xsl:variable name="t" select="tokenize(...)" as="xs:string*"/>
Technically the as attribute here is redundant, but it's generally good practice to include it, because it helps both the human reader and the XSLT compiler spot any mistakes in your code.
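A minimal sketch of the difference between the two declarations, assuming an XSLT 3.0 processor (such as Saxon 9.9) started from the initial template; the variable names asTree and asSeq are made up for illustration:

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <xsl:template name="xsl:initial-template">
    <!-- No select/as: the variable is a document node holding ONE text node "1 2 3 4" -->
    <xsl:variable name="asTree">
      <xsl:value-of select="tokenize('1.2.3.4', '\.')"/>
    </xsl:variable>

    <!-- select + as: the variable is a sequence of four strings -->
    <xsl:variable name="asSeq" select="tokenize('1.2.3.4', '\.')" as="xs:string*"/>

    <!-- Number of items in each variable, separated by a space -->
    <xsl:value-of select="count($asTree), count($asSeq)"/>
  </xsl:template>

</xsl:stylesheet>
```

Run this way, it should print 1 4: the first variable contains a single node, the second contains four separate tokens that for-each can iterate over.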

You at least need <xsl:variable name="tokenizedTopicCode" select="tokenize($topicCode,'\.')"/> instead of <xsl:variable name="tokenizedTopicCode"><xsl:value-of select="tokenize($topicCode,'\.')"/></xsl:variable>.
I don't see, however, how you expect to select anything in e.g. $topicCode with a positional predicate; that variable is not a sequence of items.

I think you want something like:
<xsl:variable name="topicCode">1.2.3.4</xsl:variable>
<xsl:variable name="tokenizedTopicCode" select="tokenize($topicCode,'\.')"/>
<xsl:variable name="names" select="('chapter', 'section', 'subsection', 'topic')"/>
<mdcomplex name="Topic">
<xsl:for-each select="1 to count($tokenizedTopicCode)">
<md name="{$names[current()]}code">
<xsl:attribute name="value">
<xsl:value-of select="$tokenizedTopicCode[position() le current()]" separator="."/>
</xsl:attribute>
</md>
</xsl:for-each>
</mdcomplex>
Not sure why you would want to use distinct-values() here; isn't 1.1.1.1 a valid topic code?

Related

Duplicates in a map

I currently have an XSLT function that loads key=value pairs from a text file into a map.
<xsl:function name="myns:loadMapping" as="map(*)">
<xsl:variable name="mapping" as="map(xs:string, xs:string)">
<xsl:map>
<xsl:for-each select="unparsed-text-lines($inputFile,$fileEncoding)">
<!-- Takes only lines which are in the form abc=xyz and are not comments (does not start with #) -->
<xsl:if test="contains(.,'=') and not(starts-with(.,'#'))">
<xsl:map-entry key="substring-before(.,'=')" select="substring-after(.,'=')"/>
</xsl:if>
</xsl:for-each>
</xsl:map>
</xsl:variable>
<xsl:sequence select="$mapping"/>
</xsl:function>
The function works fine unless the user tries to load a file containing duplicates, in which case the XSLT transform fails with an error (expected behaviour):
Error evaluating (map:merge(...)) on line xyz column xy of xyz.xsl:
XTDE3365: Duplicate key in constructed map: {keyInError}
Is there a way I could catch this case and keep the transformation from aborting, something like this :
<xsl:function name="myns:loadMapping" as="map(*)">
<xsl:variable name="mapping" as="map(xs:string, xs:string)">
<xsl:map>
<xsl:for-each select="unparsed-text-lines($inputFile,$fileEncoding)">
<!-- Takes only lines which are in the form abc=xyz and are not comments (does not start with #) -->
<xsl:if test="contains(.,'=') and not(starts-with(.,'#'))">
<xsl:choose>
<xsl:when test="...map contains key...">
<xsl:message>Map already contains key. Please check input file.</xsl:message>
</xsl:when>
<xsl:otherwise>
<xsl:map-entry key="substring-before(.,'=')" select="substring-after(.,'=')"/>
</xsl:otherwise>
</xsl:choose>
</xsl:if>
</xsl:for-each>
</xsl:map>
</xsl:variable>
<xsl:sequence select="$mapping"/>
</xsl:function>
I see that there is something implemented for a future XSLT 4.0 release (Saxon - Controlling duplicates on xsl:map) but I would like to stick to XSLT 3.0 for the time being.
Thanks.
To add to Martin Honnen's suggestions, you could use xsl:iterate instead of xsl:for-each, passing the map as a parameter, which would allow you to inspect the map before adding another entry to it.
<xsl:iterate select="...">
<xsl:param name="map" select="map{}"/>
<xsl:choose>
<xsl:when test="map:contains($map, ...)">...</xsl:when>
<xsl:otherwise>
<xsl:next-iteration>
<xsl:with-param name="map" select="map:put($map, ..., ...)"/>
</xsl:next-iteration>
</xsl:otherwise>
</xsl:choose>
</xsl:iterate>

Well, both map:merge in XPath 3.1 and, of course, grouping with e.g.
<xsl:for-each-group select="unparsed-text-lines($inputFile,$fileEncoding)[contains(.,'=') and not(starts-with(.,'#'))]" group-by="substring-before(., '=')">
<xsl:map-entry key="current-grouping-key()" select="substring-after(., '=')"/>
<xsl:if test="current-group()[2]">
<xsl:message>..</xsl:message>
</xsl:if>
</xsl:for-each-group>
allow you more control than your approach without having to wait for XSLT 4 or trying to use experimental extensions.
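For the map:merge route, a rough sketch could look like the following. It assumes an XSLT 3.0 processor with XPath 3.1 map support; the function name myns:mergeMapping, the namespace URI bound to myns, and the sample lines are all made up for illustration, and the lines are passed in as a parameter instead of being read with unparsed-text-lines so that the example runs on its own:

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:map="http://www.w3.org/2005/xpath-functions/map"
    xmlns:myns="http://example.com/myns"
    exclude-result-prefixes="#all">

  <xsl:output method="text"/>

  <!-- Hypothetical variant of myns:loadMapping: builds one single-entry map
       per usable line and lets map:merge decide what happens to duplicates -->
  <xsl:function name="myns:mergeMapping" as="map(xs:string, xs:string)">
    <xsl:param name="lines" as="xs:string*"/>
    <xsl:sequence select="
      map:merge(
        for $line in $lines[contains(., '=') and not(starts-with(., '#'))]
        return map:entry(substring-before($line, '='), substring-after($line, '=')),
        map { 'duplicates': 'use-first' })"/>
  </xsl:function>

  <xsl:template name="xsl:initial-template">
    <!-- Sample input with a comment line and a duplicate key 'a' -->
    <xsl:variable name="m"
        select="myns:mergeMapping(('a=1', '# comment', 'b=2', 'a=3'))"/>
    <!-- Map size, then the value kept for key 'a' -->
    <xsl:value-of select="map:size($m), $m('a')"/>
  </xsl:template>

</xsl:stylesheet>
```

With 'use-first' the first value for a duplicated key wins, so this should print 2 1. The other standard values of the duplicates option are 'use-last', 'use-any', 'combine' and 'reject'.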

How to add style attribute in xsl by @variable

I have a variable @expectedLength. I need to assign it to a style attribute.
<xsl:if test="@expectedLength">
<xsl:attribute name="style">
<xsl:value-of select="'width:200px'"/>
</xsl:attribute>
</xsl:if>
I need to replace 200 with the value of @expectedLength. How can I use the variable?
You could change your snippet to
<xsl:if test="@expectedLength">
<xsl:attribute name="style">width: <xsl:value-of select="@expectedLength"/>;</xsl:attribute>
</xsl:if>
That should work with any version of XSLT.
In XSLT 2 and later you can also use the select expression
<xsl:if test="@expectedLength">
<xsl:attribute name="style" select="concat('width: ', @expectedLength, ';')"/>
</xsl:if>
I would prefer, and suggest, setting up a template
<xsl:template match="@expectedLength">
<xsl:attribute name="style" select="concat('width: ', ., ';')"/>
</xsl:template>
and then to make sure higher up that any attribute nodes are processed.
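A sketch of what "processed higher up" might look like, assuming expectedLength is an attribute of the element being transformed. The element names field and input are hypothetical, and XSLT 2.0 or later is assumed because of the select attribute on xsl:attribute:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical source element that may carry an expectedLength attribute -->
  <xsl:template match="field">
    <input>
      <!-- Attributes must be created before any child content;
           if the attribute is absent, nothing is selected and no style is written -->
      <xsl:apply-templates select="@expectedLength"/>
    </input>
  </xsl:template>

  <!-- Inside this template the context item is the attribute itself, hence "." -->
  <xsl:template match="@expectedLength">
    <xsl:attribute name="style" select="concat('width: ', ., ';')"/>
  </xsl:template>

</xsl:stylesheet>
```

The explicit xsl:if test is no longer needed with this design, because apply-templates simply selects nothing when the attribute does not exist.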

XSLT 2.0 tokenising delimiters within delimiters

In XSLT 2.0 I have a long string (parameter) with a delimiter (;) inside a delimiter (~), more specifically a triplet inside a delimiter.
Data is organized like so:
<parameter>qrsbfs;qsvsv;tfgz~dknk;fvtea;gtvath~pksdi;ytbdi;oiunhu</parameter>
The first tokenize($mystring,'~') in a for-each produces:
qrsbfs;qsvsv;tfgz
dknk;fvtea;gtvath
pksdi;ytbdi;oiunhu
Within that tokenization, I need to treat it by looping again:
qrsbfs
qsvsv
tfgz
dknk
fvtea
gtvath
pksdi
ytbdi
oiunhu
I can do intensive string manipulation to get there using concat, string-length, and substring-before/substring-after, but I wondered whether there was a more elegant solution that my neophyte mind was overlooking.
EDIT, adding nested tokenize that returned incorrect results:
<xsl:for-each select="tokenize($myparameter,'~')">
<xsl:for-each select="tokenize(.,';')">
<xsl:if test="position()=1">
<xsl:value-of select="."/>
</xsl:if>
<xsl:if test="position()=2">
<xsl:value-of select="."/>
</xsl:if>
<xsl:if test="position()=3">
<xsl:value-of select="."/>
</xsl:if>
</xsl:for-each>
</xsl:for-each>
If you wanted a one line solution, you could do something like this, using nested for-in-return statements:
<xsl:sequence select="for $n in tokenize(.,'~') return concat(string-join(tokenize($n,';'),'&#xA;'),'&#xA;')"/>
If you don't need to tokenize them separately, you could replace the ~ with ; and tokenize all 9 elements at the same time:
tokenize(replace(parameter,'~',';'),';')
For what it's worth, the code in https://xsltfiddle.liberty-development.net/pPqsHUe uses
<xsl:template match="parameter">
<xsl:for-each select="tokenize(., '~')">
<xsl:value-of select="tokenize(., ';')" separator="&#10;"/>
<xsl:text>&#10;</xsl:text>
</xsl:for-each>
</xsl:template>
and with output method text produces
qrsbfs
qsvsv
tfgz
dknk
fvtea
gtvath
pksdi
ytbdi
oiunhu
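If the triplets need to stay grouped instead of being flattened into one list, the same nested for-each can wrap each level in an element. This is only a sketch for XSLT 2.0 or later; the output element names triplets, triplet and item are made up for illustration:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="parameter">
    <triplets>
      <!-- Outer loop: one iteration per '~'-separated triplet -->
      <xsl:for-each select="tokenize(., '~')">
        <triplet>
          <!-- Inner loop: "." is now the current triplet string -->
          <xsl:for-each select="tokenize(., ';')">
            <!-- position() counts within the current triplet (1, 2, 3) -->
            <item pos="{position()}">
              <xsl:value-of select="."/>
            </item>
          </xsl:for-each>
        </triplet>
      </xsl:for-each>
    </triplets>
  </xsl:template>

</xsl:stylesheet>
```

Inside each for-each the context item is the current token, which is why the inner tokenize(., ';') operates on one triplet at a time.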

Choose with for-each inside?

I have a parameter ignoreAttributes, which is a comma-separated list of things to look for. I want to set a variable copyAttrib to be equal to whether any of them are exactly matched by name().
If xsl were a procedural language where variables could be reassigned, I'd use something like this:
<xsl:variable name="copyAttrib" select="true()">
<xsl:for-each select="tokenize($ignoreAttributes,',')">
<xsl:if test="compare(., name()) != 0">
<xsl:variable name="copyAttrib" select="false()"/>
</xsl:if>
</xsl:for-each>
Unfortunately, I can't do that, because xsl is functional (so says this other answer). So variables can only be assigned once.
I think the solution would look something like:
<xsl:variable name="copyAttrib">
<xsl:choose>
<xsl:when>
<xsl:for-each select="tokenize($ignoreAttributes, ',')">
<xsl:if test="compare(., name()) != 0"/>
</xsl:for-each>
<xsl:otherwise>
<xsl:value-of select="false()"/>
</xsl:otherwise>
</xsl:choose>
</xsl:variable>
Obviously not exactly that (otherwise I wouldn't be asking.)
I know that I could bypass the tokenize and for-each loop by using replace() on ignoreAttributes to change every , to | and then using matches(). I'd like to avoid that if possible, because ignoreAttributes is provided by the user and might contain special characters that change the regex pattern, so I would have to escape them all.
I have a parameter ignoreAttributes, which is a comma-separated list of things to look for. I want to set a variable copyAttrib to be equal to whether any of them are exactly matched by name().
That sounds to me like
<xsl:variable name="copyAttrib" as="xs:boolean"
select="tokenize($ignoreAttributes, ',') = name()"/>
You say:
Unfortunately, I can't do that, because xsl is functional
when what you mean is: "Fortunately, I don't need to do that, because XSLT is functional".
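The = operator used here is a general comparison: it is true as soon as any item of the sequence on one side equals any item on the other side, so no explicit loop is needed. As a rough sketch of how that could be used to skip the listed attributes while copying everything else (XSLT 2.0 or later; the default value 'id,class' and the variable name ignored are made up for illustration):

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- Comma-separated attribute names supplied by the user (hypothetical default) -->
  <xsl:param name="ignoreAttributes" select="'id,class'"/>

  <!-- Copy elements and keep processing their attributes and children -->
  <xsl:template match="*">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="@*">
    <!-- True if ANY token is exactly equal to this attribute's name -->
    <xsl:variable name="ignored" as="xs:boolean"
        select="tokenize($ignoreAttributes, ',') = name()"/>
    <xsl:if test="not($ignored)">
      <xsl:copy/>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

Because the tokens are compared as plain strings, special regex characters in the user-supplied names need no escaping; only the ',' separator is interpreted as a pattern.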
An XSLT-1.0 way of doing this is by using a recursive, named template:
<xsl:template name="copyAttrib">
<xsl:param name="attribs" />
<xsl:choose>
<xsl:when test="normalize-space($attribs) = ''">
<xsl:value-of select="'false'" />
</xsl:when>
<!-- Append ',' so that the last (or only) token is compared as well -->
<xsl:when test="normalize-space(substring-before(concat($attribs, ','), ',')) = normalize-space(name(.))">
<xsl:value-of select="'true'" />
</xsl:when>
<xsl:otherwise>
<xsl:call-template name="copyAttrib">
<xsl:with-param name="attribs" select="substring-after($attribs,',')" />
</xsl:call-template>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
Call this template with the selected node as the current node and wrap the call in an <xsl:variable>:
<xsl:variable name="copyAttribResult">
<xsl:call-template name="copyAttrib">
<xsl:with-param name="attribs" select="'a,b,c,...commaSeparatedValues...'" />
</xsl:call-template>
</xsl:variable>
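to get either the string true or the string false as a result. Test it with $copyAttribResult = 'true'; in XSLT 1.0 the variable itself is a result tree fragment, which always counts as true in a boolean context.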

substring XSLT over multiple value lines for an attribute

I know the following xslt will work:
<xsl:attribute name="test">
<xsl:value-of select="substring(title, 1, 4000)"/>
</xsl:attribute>
But I am not sure what to do when the attribute is built up like the following and you want the substring applied to the whole attribute value, not just the title or the subtitle.
<xsl:attribute name="test">
<xsl:value-of select="title"/>
<xsl:if test="../../sub_title != ''">
<xsl:text> </xsl:text>
<xsl:value-of select="../sub_title"/>
</xsl:if>
</xsl:attribute>
Is it even possible to apply a substring function over multiple lines that define an attribute?
I think what you are saying is that you want to build up a long string, consisting of the values of a number of other elements, and then truncate the result.
What you could do, is use the concat function to build the attribute value, and then do a substring on that.
<xsl:attribute name="test">
<xsl:value-of select="substring(concat(title, ' ', ../sub_title), 1, 4000)" />
</xsl:attribute>
In this case, if sub_title was empty, you would end up with a space at the end of the test attribute, so you might want to add a normalize-space to this expression
<xsl:value-of select="normalize-space(substring(concat(title, ' ', ../sub_title), 1, 4000))" />
An alternate approach, if you did want to use a more complicated expression, is to do the string calculation in a variable first
<xsl:variable name="test">
<xsl:value-of select="title"/>
<xsl:if test="../../sub_title != ''">
<xsl:text> </xsl:text>
<xsl:value-of select="../sub_title"/>
</xsl:if>
</xsl:variable>
<xsl:attribute name="test">
<xsl:value-of select="substring($test, 1, 4000)" />
</xsl:attribute>
As an aside, you can simplify your code by using "Attribute Value Templates" here, instead of the more verbose xsl:attribute instruction. Simply do this:
<myElement test="{substring($test, 1, 4000)}">
Here, the curly braces indicate an expression to be evaluated, rather than output literally.
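If XSLT 2.0 or later is available, string-join is one more way to build the value without the trailing-space problem, since it only inserts the separator between items that are actually present. The sketch below assumes a hypothetical context element item with title and sub_title children, which differs from the relative paths used above:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- "item" is a made-up element name; adjust the paths to the real input -->
  <xsl:template match="item">
    <!-- Join the title and the non-empty sub_title with a space,
         then truncate the combined string to 4000 characters -->
    <myElement test="{substring(string-join((title, sub_title[. != '']), ' '), 1, 4000)}"/>
  </xsl:template>

</xsl:stylesheet>
```

When sub_title is missing or empty, the sequence passed to string-join contains only the title, so no separator is added.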