XSLT - Regular Expression Parsing

My current project revolves around translating a number of test cases in a document into a form of XML compatible with a test case management system. In many of these cases, the title is prefixed by a number of ticket identifiers, document location numbers and so on, which need to be removed before they can be uploaded to the system.
Given that many of these ticket identifiers could exist elsewhere in the title and be completely valid, I've written the translation in its current form so that only the start of the string is checked for the regular expression. I have written two approaches, with varying results.
Sample Input
1.
<case-name>3.1.6 (C0) TID#EIIY CHM-2213 BZ-7043 Client side Java Upgrade R8</case-name>
2.
<case-name>4.2.7 (C1) TID#F1DR – AIP - EHD-319087 - BZ6862 - Datalink builder res...</case-name>
Desired Output
1.
<tr:summary>Client side Java Upgrade R8</tr:summary>
2.
<tr:summary>Datalink builder res...</tr:summary>
First Approach
<xsl:template match="case-name">
<tr:summary>
<xsl:variable name="start">
<xsl:apply-templates/>
</xsl:variable>
<xsl:variable name="start" select="normalize-space($start)"/>
<xsl:variable name="noFloat" select="normalize-space(fn:remFirstRegEx($start, '^[0-9]+([.][0-9]+)*' ))"/>
<xsl:variable name="noFloatDash" select="normalize-space(fn:remFirstRegEx($noFloat, '^[\p{Pd}]' ))"/>
<xsl:variable name="noC" select="normalize-space(fn:remFirstRegEx($noFloatDash, '^\(C[0-2]\)' ))"/>
<xsl:variable name="noCDash" select="normalize-space(fn:remFirstRegEx($noC, '^[\p{Pd}]' ))"/>
<xsl:variable name="noTID" select="normalize-space(fn:remFirstRegEx($noCDash, '^(TID)(#|\p{Pd})(\w+)' ))"/>
<xsl:variable name="noTIDDash" select="normalize-space(fn:remFirstRegEx($noTID, '^[\p{Pd}]' ))"/>
<xsl:variable name="noAIP" select="normalize-space(fn:remFirstRegEx($noTIDDash, '^AIP' ))"/>
<xsl:variable name="noAIPDash" select="normalize-space(fn:remFirstRegEx($noAIP, '^[\p{Pd}]' ))"/>
<xsl:variable name="noCHM" select="normalize-space(fn:remFirstRegEx($noAIPDash, '^(CHM)[\p{Pd}]([0-9]+)' ))"/>
<xsl:variable name="noCHMDash" select="normalize-space(fn:remFirstRegEx($noCHM, '^[\p{Pd}]' ))"/>
<xsl:variable name="noEHD" select="normalize-space(fn:remFirstRegEx($noCHMDash, '^(EHD)[\p{Pd}]([0-9]+)' ))"/>
<xsl:variable name="noEHDDash" select="normalize-space(fn:remFirstRegEx($noEHD, '^[\p{Pd}]' ))"/>
<xsl:variable name="noBZ" select="normalize-space(fn:remFirstRegEx($noEHDDash, '^(BZ)(((#|\p{Pd})[0-9]+)|[0-9]+)' ))"/>
<xsl:variable name="noBZDash" select="normalize-space(fn:remFirstRegEx($noBZ, '^[\p{Pd}]' ))"/>
<xsl:variable name="noTT" select="normalize-space(fn:remFirstRegEx($noBZDash, '^(TT)[#](\w)+' ))"/>
<xsl:variable name="noTTDash" select="normalize-space(fn:remFirstRegEx($noTT, '^[\p{Pd}]' ))"/>
<xsl:variable name="nobrack" select="normalize-space(fn:remFirstRegEx($noTTDash, '^\[(.*?)\]' ))"/>
<xsl:variable name="noBrackDash" select="normalize-space(fn:remFirstRegEx($nobrack, '^[\p{Pd}]' ))"/>
<xsl:value-of select="normalize-space($noBrackDash)"/>
</tr:summary>
</xsl:template>
<xsl:function name="fn:remFirstRegEx">
<xsl:param name="inString"/>
<xsl:param name="regex"/>
<xsl:variable name="words" select="tokenize($inString, '\p{Z}')"/>
<xsl:variable name="outString">
<xsl:for-each select="$words">
<xsl:if test="not(matches(., $regex)) or index-of($words, .) > 1">
<xsl:value-of select="."/><xsl:text> </xsl:text>
</xsl:if>
</xsl:for-each>
</xsl:variable>
<xsl:value-of select="string-join($outString, '')"/>
</xsl:function>
Note: The namespace fn, for the purpose of this translation, is just "function/namespace", used to write my own functions.
First Results
1. Success
<tr:summary>Client side Java Upgrade R8</tr:summary>
2. Failure
<tr:summary>- EHD-319087 - BZ6862 - Datalink builder resolution selector may drop leading zeros on coordinate seconds</tr:summary>
Second Approach
<xsl:function name="fn:remFirstRegEx">
<xsl:param name="inString"/>
<xsl:param name="regex"/>
<xsl:analyze-string select="$inString" regex="$regex">
<xsl:non-matching-substring>
<xsl:value-of select="."/>
</xsl:non-matching-substring>
</xsl:analyze-string>
</xsl:function>
This approach fails completely; I'm including it here because it's the more obvious solution and did not work at all.
Note that there are a large number of regular expressions in the above solution; this is to account for all the possible IDs that might come through. Mercifully, the IDs seem to come in a consistent order.
The problem, as I have concluded, is with the dashes. I have noted that in every case in the documents where translation has failed, the failing ID has been both preceded and followed by a dash. If it only precedes, it'll go through fine. If it only follows, no issues. Both is where it falls down, and curiously, the dash still shows up, even though it has already been seemingly eliminated from the string.
There are two kinds of dashes at play here: an en dash (–) and a hyphen-minus (-).
Paradoxically: sorry for the long question, and let me know if I've missed anything out.
EDIT: Forgot to say, all regular expressions with the exception of the dashes have been tested elsewhere and are known to work on all input stuff.
EDIT II: Following @acheong87's solution, I tried to run the following:
<xsl:template match="case-name">
<tr:summary>
<xsl:variable name="regEx" select=
"'^[\s\p{Pd}]*(\d+([.]\d+)*)?[\s\p{Pd}]*(\(C[0-2]\))?([\s\p{Pd}]*(TID|AIP|CHM|EHD|BZ|TT)((#|\p{Pd}|)\w+|))*[\s\p{Pd}]*(\[.*?\])?'"/>
<xsl:analyze-string select="string(.)" regex="{$regEx}">
<xsl:non-matching-substring>
<xsl:value-of select="."/>
</xsl:non-matching-substring>
</xsl:analyze-string>
</tr:summary>
</xsl:template>
And Saxon gives me the following error:
Error at xsl:analyze-string at line (for our purposes, 5):
XTDE1150: The regular expression must not be one that matches a zero-length string
I can get why that would come up, given that everything is optional. Is there another way of running it that won't give me this error?
Thanks again.

Here are the main components that would go into a single regex. I've rewritten some of your expressions.
\d+([.]\d+)*
\(C[0-2]\)
TID(#|\p{Pd})\w+
AIP
CHM[\p{Pd}]\d+
EHD[\p{Pd}]\d+
BZ(#|\p{Pd}|)\d+
TT#\w+
\[.*?\]
Each component should be wrapped in (...)? to make it optional, and all components should be joined by the separator, [\s\p{Pd}]*. This produces:
^[\s\p{Pd}]*(\d+([.]\d+)*)?[\s\p{Pd}]*(\(C[0-2]\))?[\s\p{Pd}]*(TID(#|\p{Pd})\w+)?[\s\p{Pd}]*(AIP)?[\s\p{Pd}]*(CHM[\p{Pd}]\d+)?[\s\p{Pd}]*(EHD[\p{Pd}]\d+)?[\s\p{Pd}]*(BZ(#|\p{Pd}|)\d+)?[\s\p{Pd}]*(TT#\w+)?[\s\p{Pd}]*(\[.*?\])?
You can see in this Rubular demo that the above expression indeed matches your two examples.
There may be an elegant simplification you may be interested in.
\d+([.]\d+)*
\(C[0-2]\)
(TID|AIP|CHM|EHD|BZ|TT)((#|\p{Pd}|)\w+|)
\[.*?\]
Maybe some codes like AIP should be separate, but you can see the spirit of this version. That is, it's unlikely that valid titles would begin with such codes; in fact probably more likely that your examples could be missing a possible combination such as EHD#, which may appear in the future but your past-based formulation would miss. (Of course, my point is irrelevant if there is no future—and the data you have is the only data you'll need to process.) If there is a future though, IMO, it's better in this case to loosen the rigor of the expression to capture potential related combinations.
The above would become:
^[\s\p{Pd}]*(\d+([.]\d+)*)?[\s\p{Pd}]*(\(C[0-2]\))?([\s\p{Pd}]*(TID|AIP|CHM|EHD|BZ|TT)((#|\p{Pd}|)\w+|))*[\s\p{Pd}]*(\[.*?\])?
Here is the Rubular demo.
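On the XTDE1150 error from the question's second edit: xsl:analyze-string rejects the pattern because every part of it is optional, so it can match a zero-length string. One possible workaround, sketched here and not tested against the real documents, is to append a mandatory (.+)$ to the prefix pattern and keep only that group with replace(). This assumes an XSLT 2.0 processor, that every title has at least one character left after the prefix, and that the tr namespace URI below is a placeholder for the real one.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tr="urn:example:tr">
  <!-- urn:example:tr is a placeholder namespace URI (an assumption) -->

  <!-- Prefix pattern from the answer above; on its own it can match nothing at all -->
  <xsl:variable name="prefix" select=
    "'^[\s\p{Pd}]*(\d+([.]\d+)*)?[\s\p{Pd}]*(\(C[0-2]\))?([\s\p{Pd}]*(TID|AIP|CHM|EHD|BZ|TT)((#|\p{Pd}|)\w+|))*[\s\p{Pd}]*(\[.*?\])?'"/>

  <xsl:template match="case-name">
    <tr:summary>
      <!-- The appended (.+)$ is capture group 9: whatever remains after the prefix.
           Because it needs at least one character, the whole pattern can no
           longer match a zero-length string. -->
      <xsl:value-of
          select="replace(normalize-space(.), concat($prefix, '(.+)$'), '$9')"/>
    </tr:summary>
  </xsl:template>

</xsl:stylesheet>
```

If the pattern does not match at all, replace() returns the input unchanged, so an unexpected title passes through as-is instead of being dropped.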

One regex to rule them all looks like
^ # start of string
([0-9]\.[0-9.]+).*? # digits and dots
\((C[0-2])\).*? # C0, C1, C2
((TID#\S+).*?)? # TID...
((AIP).*?)? # AIP...
((CHM\S+).*?)? # CHM...
((EHD\S+).*?)? # EHD...
((BZ\S+).*?)? # BZ...
(\w.*)? # free text
$ # end of string
^([0-9]\.[0-9.]+).*?\((C[0-2])\).*?((TID#\S+).*?)?((AIP).*?)?((CHM\S+).*?)?((EHD\S+).*?)?((BZ\S+).*?)?(\w.*)?$
http://rubular.com/r/pPxKBVwJaE
The .*? eat any delimiter until the next match begins. Most of the matches are optional, possibly even more than you need to be optional. Remove the enclosing (...)? for any match that you want to make mandatory. Optional groups are counted but can be empty.
Putting it all together
<xsl:variable name="linePattern"> <!-- group|contents -->
<xsl:text>^</xsl:text> <!-- start of string -->
<xsl:text>([0-9]\.[0-9.]+).*?</xsl:text> <!-- 1 digits and dots -->
<xsl:text>\((C[0-2])\).*?</xsl:text> <!-- 2 C0, C1, C2 -->
<xsl:text>((TID#\S+).*?)?</xsl:text> <!-- 3, 4 TID... -->
<xsl:text>((AIP).*?)?</xsl:text> <!-- 5, 6 AIP... -->
<xsl:text>((CHM\S+).*?)?</xsl:text> <!-- 7, 8 CHM... -->
<xsl:text>((EHD\S+).*?)?</xsl:text> <!-- 9, 10 EHD... -->
<xsl:text>((BZ\S+).*?)?</xsl:text> <!-- 11, 12 BZ... -->
<xsl:text>(\w.*)?</xsl:text> <!-- 13 free text -->
<xsl:text>$</xsl:text> <!-- end of string -->
</xsl:variable>
<xsl:template match="case-name">
<xsl:analyze-string select="string(.)" regex="{$linePattern}">
<xsl:matching-substring>
<tr:summary>
<part><xsl:value-of select="regex-group(1)" /></part>
<part><xsl:value-of select="regex-group(2)" /></part>
<part><xsl:value-of select="regex-group(4)" /></part>
<part><xsl:value-of select="regex-group(6)" /></part>
<part><xsl:value-of select="regex-group(8)" /></part>
<part><xsl:value-of select="regex-group(10)" /></part>
<part><xsl:value-of select="regex-group(12)" /></part>
<part><xsl:value-of select="regex-group(13)" /></part>
</tr:summary>
</xsl:matching-substring>
<!--
possibly include <xsl:non-matching-substring>, <xsl:fallback>
-->
</xsl:analyze-string>
</xsl:template>
Of course you can deal with the individual match groups any way you like.

Related

Umbraco - Creating variable of nodes - how to check for missing value

I am using Umbraco 4.5 (yes, I know I should upgrade to 7 now!)
I have an XSLT transform which builds up a list of products which match user filters.
I am making an XSL:variable which is a collection of products from the CMS database.
Each product has several Yes/No properties (radio buttons). Some of these haven't been populated however.
As a result, the following code breaks occasionally if the dataset includes products which don't have one of the options populated with an answer.
The error I get when it transforms the XSLT is "Value was either too large or too small for an Int32". I assume this is the value being passed into the GetPreValueAsString method.
How do I check to see if ./option1 is empty and if so, use a specific integer, otherwise use ./option1
<xsl:variable name="nodes"
select="umbraco.library:GetXmlNodeById(1098)/*
[@isDoc and string(umbracoNaviHide) != '1' and
($option1= '' or $option1=umbraco.library:GetPreValueAsString(./option1)) and
($option2= '' or $option2=umbraco.library:GetPreValueAsString(./option2)) and
($option3= '' or $option3=umbraco.library:GetPreValueAsString(./option3)) and
($option4= '' or $option4=umbraco.library:GetPreValueAsString(./option4))
]" />

Note: you tagged your question as XSLT 2.0, but Umbraco does not use XSLT 2.0, it is (presently) stuck with XSLT 1.0.
$option1= '' or $option1=umbraco.library:GetPreValueAsString(./option1)
There can be multiple causes for your error. A processor is not required to process the or-expression left-to-right or right-to-left, and it is even allowed to always evaluate both expressions, even if the first is true (this is comparable with bit-wise operators (unordered) in other languages, whereas boolean operators (ordered) in those languages typically use early breakout).
Another cause can be that the option value in the context node is non-empty but not an integer, in which case your code will always return an error.
You could expand your expression by testing ./optionX, but then you still have the problem of order of evaluation.
That said, how can you resolve it and prevent the error from arising? In XSLT 1.0, this is a bit clumsy (i.e., you cannot define functions and cannot use sequences), but here's one way to do it:
<xsl:variable name="pre-default-option">
<default>1</default>
<default>2</default>
<default>3</default>
<default>4</default>
</xsl:variable>
<xsl:variable name="default-option"
select="exslt:node-set($pre-default-option)" />
<xsl:variable name="pre-selected-option">
<option><xsl:value-of select="$option1" /></option>
<option><xsl:value-of select="$option2" /></option>
<option><xsl:value-of select="$option3" /></option>
<option><xsl:value-of select="$option4" /></option>
</xsl:variable>
<xsl:variable name="selected-option" select="exslt:node-set($pre-selected-option)" />
<xsl:variable name="pre-process-nodes">
<xsl:variable name="selection">
<xsl:apply-templates
select="umbraco.library:GetXmlNodeById(1098)/*"
mode="pre">
<xsl:with-param name="opt-no" select="1" />
</xsl:apply-templates>
</xsl:variable>
<!-- your original code uses 'and', so is only true if all
conditions are met, hence there must be four found nodes,
otherwise it is false (i.e., this node set will be empty) -->
<xsl:if test="count($selection) = 4">
<xsl:copy-of select="$selection" />
</xsl:if>
</xsl:variable>
<!-- your original variable, should now contain correct set, no errors -->
<xsl:variable name="nodes" select="exslt:node-set($pre-process-nodes)"/>
<xsl:template match="*[@isDoc and string(umbracoNaviHide) != '1']" mode="pre">
<xsl:param name="opt-no" />
<!-- picks the Nth requested option value ($option1, $option2, ...) -->
<xsl:variable name="option"
select="$selected-option/option[$opt-no]" />
<!-- gets the child node 'option1', 'option2' etc -->
<xsl:variable
name="pre-ctx-option"
select="*[local-name() = concat('option', $opt-no)]" />
<xsl:variable name="ctx-option">
<xsl:choose>
<!-- empty option param always allowed -->
<xsl:when test="$option = ''">
<xsl:value-of select="$option"/>
</xsl:when>
<!-- number() is false for NaN or 0, so fall back to the default -->
<xsl:when test="not(number($pre-ctx-option))">
<xsl:value-of select="$default-option/default[$opt-no]"/>
</xsl:when>
<!-- valid number (though you could add a range check as well) -->
<xsl:otherwise>
<xsl:value-of select="umbraco.library:GetPreValueAsString($pre-ctx-option)"/>
</xsl:otherwise>
</xsl:choose>
</xsl:variable>
<!-- prevent eternal recursion -->
<xsl:if test="4 >= $opt-no">
<xsl:apply-templates select="self::*" mode="pre">
<xsl:with-param name="opt-no" select="$opt-no + 1" />
</xsl:apply-templates>
<!-- the predicate is now ctx-independent and just true/false
this copies nothing if the conditions are not met -->
<xsl:copy-of select="self::*[$option = $ctx-option]" />
</xsl:if>
</xsl:template>
<xsl:template match="*" mode="pre" />
Note (1): I have written the above code by hand, tested only for syntax errors, I couldn't test it because you didn't provide an input document to test it against. If you find errors, by all means, edit my response so that it becomes correct.
Note (2): the above code generalizes working with the numbered parameters. By generalizing it, the code becomes a bit more complicated, but it becomes easier to maintain and to extend, and less error-prone for copy/paste errors.
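For the narrower question of guarding a single lookup, a smaller sketch is possible with xsl:choose, which tests its branches in order and instantiates only the first one that matches. The template name prevalue-or-default and its parameters are made up for illustration, and the snippet assumes the stylesheet already declares the umbraco.library prefix as the question's does. It cannot be called from inside the predicate of the original select, so it only helps when the products are processed one at a time, for example in an xsl:for-each.

```xml
<!-- Hypothetical helper: returns the prevalue text for a populated option,
     or a caller-supplied default when the option is missing or non-numeric. -->
<xsl:template name="prevalue-or-default">
  <xsl:param name="id-node"/>
  <xsl:param name="default" select="''"/>
  <xsl:choose>
    <!-- number() yields NaN for an empty node-set, an empty string or text -->
    <xsl:when test="string(number($id-node)) = 'NaN'">
      <xsl:value-of select="$default"/>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="umbraco.library:GetPreValueAsString($id-node)"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

<!-- Usage, with a product node as the context node -->
<xsl:variable name="option1-text">
  <xsl:call-template name="prevalue-or-default">
    <xsl:with-param name="id-node" select="option1"/>
  </xsl:call-template>
</xsl:variable>
```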

converting straight quotes to smart quotes

I have the XML below:
<para>A number of the offences set out in the Companies Ordinance are expressed to apply to "officers" of the company. "Officer" includes directors, managers and the company secretary: Companies Ordinance, s.2(1).</para>
In the input the quotes are straight quotes ("), but I want to convert them to smart quotes. I used the XSLT below for this.
<xsl:template match="para/text()">
<xsl:when test="contains(.,$quot)">
<xsl:value-of select="translate(.,$quot,'“')"/>
<xsl:value-of select="translate(substring-after(.,$quot),$quot,'”')"/>
</xsl:when>
</xsl:template>
But I am getting the output below.
A number of the offences set out in the Companies Ordinance are expressed to apply to “officers“ of the company. “Officer“ includes directors, managers and the company secretary: Companies Ordinance, s.2(1).
but i want to get it as below.
A number of the offences set out in the Companies Ordinance are expressed to apply to “officers” of the company. “Officer” includes directors, managers and the company secretary: Companies Ordinance, s.2(1).
Please let me know how I can solve this.
Thanks

The main problem with your approach is that the translate function will replace all occurrences of the quotation mark with your smart-opening quote. Therefore, the subsequent substring-after function won't actually find anything because there won't be any more quotation marks to find.
What you really need in this case is a recursive template, together with a combination of both substring-before and substring-after. The named template could actually be combined with your existing template, with default parameters to be used in the initial match
<xsl:template match="para/text()" name="replace">
<xsl:param name="text" select="."/>
<xsl:param name="usequote" select="$openquote"/>
($openquote will be a variable containing the opening quote)
You would then check if the selected text contains a quotation mark
<xsl:when test="contains($text,$quot)">
If so, you would first output the text before the quotation mark, following by the new open quote
<xsl:value-of select="concat(substring-before($text, $quot), $usequote)"/>
The template would then be recursively called with the text after the quote, and with the close quote as a parameter, so the next quote find will be closed.
<xsl:call-template name="replace">
<xsl:with-param name="text" select="substring-after($text,$quot)"/>
<xsl:with-param name="usequote"
select="substring(concat($openquote, $closequote), 2 - ($usequote=$closequote), 1)"/>
</xsl:call-template>
Note: the selecting of the usequote will basically switch between open and close quotes. It takes advantage of the fact 'true' evaluates to 1 in a numeric expression, and 'false' to 0.
Try this XSLT. Note, I am using open and close brackets in this case, just to make the output clearer, but you can easily change the variables to your smart quotes as required. (You may have to specify an encoding for the output in this case though).
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="text"/>
<xsl:variable name="quot">"</xsl:variable>
<xsl:variable name="openquote">[</xsl:variable>
<xsl:variable name="closequote">]</xsl:variable>
<xsl:template match="para/text()" name="replace">
<xsl:param name="text" select="."/>
<xsl:param name="usequote" select="$openquote"/>
<xsl:choose>
<xsl:when test="contains($text,$quot)">
<xsl:value-of select="concat(substring-before($text, $quot), $usequote)"/>
<xsl:call-template name="replace">
<xsl:with-param name="text" select="substring-after($text,$quot)"/>
<xsl:with-param name="usequote"
select="substring(concat($openquote, $closequote), 2 - ($usequote=$closequote), 1)"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$text"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
When applied to your XML, the following is output
A number of the offences set out in the Companies Ordinance are expressed to apply to [officers] of the company. [Officer] includes directors, managers and the company secretary: Companies Ordinance, s.2(1).
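To get real smart quotes instead of brackets, the two variables and the output declaration in the stylesheet above could be swapped for something like the following. This is a sketch: the character references are U+201C and U+201D, and UTF-8 is an assumed choice of output encoding.

```xml
<!-- Replaces the xsl:output and the two bracket variables in the stylesheet above -->
<xsl:output method="text" encoding="UTF-8"/>
<xsl:variable name="openquote">&#8220;</xsl:variable>  <!-- left double quotation mark -->
<xsl:variable name="closequote">&#8221;</xsl:variable> <!-- right double quotation mark -->
```

Both quotes are single characters, so the substring() toggle in the recursive call keeps working unchanged.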

XSL Analyze-String -> Matching-Substring into multiple variables

I was wondering if it is possible to use analyze-string and set multiple groups within the RegEx and then store all of the matching groups in variables to use later on.
like so:
<xsl:analyze-string regex="^Blah\s+(\d+)\s+Bloo\s+(\d+)\s+Blee" select=".">
<xsl:matching-substring>
<xsl:variable name="varX">
<xsl:value-of select="regex-group(1)"/>
</xsl:variable>
<xsl:variable name="varY">
<xsl:value-of select="regex-group(2)"/>
</xsl:variable>
</xsl:matching-substring>
</xsl:analyze-string>
This doesn't actually work, but that's the sort of thing I'm after. I know I can wrap the analyze-string in a variable, but it seems daft to process the RegEx again for every group; that's not very efficient. I should be able to process the regex once and store all of the groups for use later on.
Any ideas?

Well does
<xsl:variable name="groups" as="element(group)*">
<xsl:analyze-string regex="^Blah\s+(\d+)\s+Bloo\s+(\d+)\s+Blee" select=".">
<xsl:matching-substring>
<group>
<x><xsl:value-of select="regex-group(1)"/></x>
<y><xsl:value-of select="regex-group(2)"/></y>
</group>
</xsl:matching-substring>
</xsl:analyze-string>
</xsl:variable>
help? That way you have a variable named groups which is a sequence of group elements with the captures.
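Reading the captures back out of that variable might then look like this. It is a sketch that assumes it sits in the same template as the groups variable above, and it reuses the names varX and varY from the question.

```xml
<!-- $groups holds one <group> element per match; take the first match -->
<xsl:variable name="varX" select="$groups[1]/x"/>
<xsl:variable name="varY" select="$groups[1]/y"/>
<xsl:value-of select="$varX, $varY" separator=" "/>
```

The regex is evaluated once when $groups is built; the later path expressions only navigate the temporary elements.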

This transformation shows that xsl:analyze-string isn't necessary to obtain the wanted results; a simpler and generic solution exists:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="*[matches(., '^Blah\s+(\d+)\s+Bloo\s+(\d+)\s+Blee')]">
<xsl:variable name="vTokens" select=
"tokenize(replace(., '^Blah\s+(\d+)\s+Bloo\s+(\d+)\s+Blee', '$1 $2'), ' ')"/>
<xsl:variable name="varX" select="$vTokens[1]"/>
<xsl:variable name="varY" select="$vTokens[2]"/>
<xsl:sequence select="$varX, $varY"/>
</xsl:template>
</xsl:stylesheet>
when applied on this XML document:
<t>Blah 123 Bloo 4567 Blee</t>
which produces the wanted, correct result:
123 4567
Here we don't rely on knowing the RegEx (can be supplied as parameter) and the string -- we just replace the string with a delimited string of the RegEx groups, which we then tokenize and every item in the sequence produced by tokenize() can readily be assigned to a corresponding variable.
We don't have to find the wanted results buried in a temp. tree -- we just get them all in a result sequence.

"Regular expression"-style replace in XSLT 1.0

I need to perform a find and replace using XSLT 1.0 which is really suited to regular expressions. Unfortunately these aren't available in 1.0 and I'm also unable to use any extension libraries such as EXSLT due to security settings I can't change.
The string I'm working with looks like:
19;#John Smith;#17;#Ben Reynolds;#1;#Terry Jackson
I need to replace the numbers and ; # characters with a ,. For example the above would change to:
John Smith, Ben Reynolds, Terry Jackson
I know a recursive string function is required, probably using substring and translate, but I'm not sure where to start with it.
Does anyone have some pointers on how to work this out? Here's what I've started with:
<xsl:template name="TrimMulti">
<xsl:param name="FullString" />
<xsl:variable name="NormalizedString">
<xsl:value-of select="normalize-space($FullString)" />
</xsl:variable>
<xsl:variable name="Hash">#</xsl:variable>
<xsl:choose>
<xsl:when test="contains($NormalizedString, $Hash)">
<!-- Do something and call TrimMulti -->
</xsl:when>
</xsl:choose>
</xsl:template>

I'm hoping you haven't simplified the problem too much for asking it on SO, because this shouldn't be that much of a problem.
You can define a template and recursively call it as long as you keep the input string's format consistent.
For example,
<xsl:template name="TrimMulti">
<xsl:param name="InputString"/>
<xsl:variable name="RemainingString"
select="substring-after($InputString,';#')"/>
<xsl:choose>
<xsl:when test="contains($RemainingString,';#')">
<xsl:value-of
select="substring-before($RemainingString,';#')"/>
<xsl:text>, </xsl:text>
<xsl:call-template name="TrimMulti">
<xsl:with-param
name="InputString"
select="substring-after($RemainingString,';#')"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$RemainingString"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
I tested this template out with the following call:
<xsl:template match="/">
<xsl:call-template name="TrimMulti">
<xsl:with-param name="InputString">19;#John Smith;#17;#Ben Reynolds;#1;#Terry Jackson</xsl:with-param>
</xsl:call-template>
</xsl:template>
And got the following output:
John Smith, Ben Reynolds, Terry Jackson
Which seems to be what you're after.
What it is doing is easy to explain if you're familiar with functional programming. The InputString parameter is always in the form [number];#[name];#[rest of string]. Each call of the TrimMulti template chops off the [number];# part and prints off the [name] part, then passes the remaining expression to itself recursively.
The base case is when InputString is in the form [number];#[name], in which case the RemainingString variable won't contain ;#. Since we know this is the end of the input, we don't output a comma this time.

If the ';' and '#' characters are not valid in the input because they are delimiters then why wouldn't the translate function work? It only needs the delimiters listed, because translate leaves any character that is not in its second argument unchanged, and it would be easier to debug. Note that it leaves the ID numbers in the output.
translate($InputString, ';#', ', ')

How to implement Carriage return in XSLT

I want to implement a carriage return within XSLT.
The problem is that I have a variable:
Step 1 = Value 1 breaktag Step 2 = Value 2 as a string, and I would like it to appear as
Step 1 = Value 1
Step 2 = Value 2
in the HTML form, but I am getting the br tag on the page. Any good ways of implementing a line feed/carriage return in XSL would be appreciated.

As an alternative to
<xsl:text>
</xsl:text>
you could use
<xsl:text>&#10;</xsl:text> <!-- newline character -->
or
<xsl:text>&#13;</xsl:text> <!-- carriage return character -->
in case you don't want to mess up your indentation

This works for me, as carriage return + line feed.
<xsl:text>&#13;&#10;</xsl:text>
The "&#10;" string does not work.

The cleanest way I've found is to insert !ENTITY declarations at the top of the stylesheet for newline, tab, and other common text constructs. When having to insert a slew of formatting elements into your output this makes the transform sheet look much cleaner.
For example:
<?xml version="1.0"?>
<!DOCTYPE xsl:stylesheet [
<!ENTITY nl "<xsl:text>
</xsl:text>">
]>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="step">
&nl;&nl;
<xsl:apply-templates />
</xsl:template>
...
</xsl:stylesheet>

use a simple carriage return in a xsl:text element
<xsl:text>
</xsl:text>

Try this at the end of the line where you want the carriage return. It worked for me.
<xsl:text><![CDATA[<br />]]></xsl:text>

I was looking for a nice solution to this, as many would prefer, without embedding escape sequences directly in the expressions, or having weird line breaks inside of a variable. I found that a hybrid of both of these approaches actually works, by embedding a text node inside a variable like this:
<xsl:variable name="newline"><xsl:text>
</xsl:text></xsl:variable>
<xsl:value-of select="concat(some_element, $newline)" />
Another nice side effect of this is that you can pass in whatever newline you want, be it just LF, CR, or both CRLF.
--Daniel

Here is an approach that uses a recursive template, which looks for a newline character (&#10;) in the string from the database and then outputs the substring before it.
If there is a substring remaining after the newline, the template calls itself until there is nothing left.
If no newline is present, the text is simply output.
Here is the template call (just replace @ActivityExtDescription with your database field):
<xsl:call-template name="MultilineTextOutput">
<xsl:with-param name="text" select="@ActivityExtDescription" />
</xsl:call-template>
and here is the code for the template itself:
<xsl:template name="MultilineTextOutput">
<xsl:param name="text"/>
<xsl:choose>
<xsl:when test="contains($text, '&#10;')">
<xsl:variable name="text-before-first-break">
<xsl:value-of select="substring-before($text, '&#10;')" />
</xsl:variable>
<xsl:variable name="text-after-first-break">
<xsl:value-of select="substring-after($text, '&#10;')" />
</xsl:variable>
</xsl:variable>
<xsl:if test="not($text-before-first-break = '')">
<xsl:value-of select="$text-before-first-break" /><br />
</xsl:if>
<xsl:if test="not($text-after-first-break = '')">
<xsl:call-template name="MultilineTextOutput">
<xsl:with-param name="text" select="$text-after-first-break" />
</xsl:call-template>
</xsl:if>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$text" /><br />
</xsl:otherwise>
</xsl:choose>
</xsl:template>
Works like a charm!!!

I believe that you can use the xsl:text tag for this, as in
<xsl:text>
</xsl:text>
Chances are that by putting the closing tag on a line of its own, the newline is part of the literal text and outputted as such.

I separated the values by Environment.NewLine and then used a pre tag in html to emulate the effect I was looking for

This is the only solution that worked for me. Except I was replacing
with \r\n
