I have a scenario where I want to extract the substring that matches a regular expression.
Below is the example:
<xsl:value-of select="matches('Process java(Application=JavaApplication_2) is not running in the system.', '.*AppName=Archiver_[0-9]{1,2}.*')"/>
But this gives me the boolean value as 'false'.
I tried with tokenize but it is becoming more complex.
Please help me on this.
See the xsl:analyze-string instruction (source and regex example below).
Input
<root>Process java(Application=JavaApplication_2) is not running in the system.</root>
Template
<xsl:template match="root">
<xsl:analyze-string select="." regex="Application=JavaApplication_[0-9]{{1,2}}">
<xsl:matching-substring>
<xsl:value-of select="."/>
</xsl:matching-substring>
<!-- optional -->
<xsl:non-matching-substring>
<!-- do sth -->
</xsl:non-matching-substring>
</xsl:analyze-string>
</xsl:template>
The matches() function returns either true or false.
To extract a matching substring, try using the replace() function instead. I am not sure which substring you are trying to extract, so I will not give an example here, but see: https://stackoverflow.com/a/39402132/3016153
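As a minimal sketch of the replace() approach, assuming the substring wanted is the Application=JavaApplication_NN token and an XSLT 2.0 processor is available: wrap the wanted part in a capturing group, pad it with .* on both sides so the pattern covers the whole string, and replace everything with $1. The matches() guard is there because replace() returns the input unchanged when the pattern does not match.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="root">
    <!-- Assumption: the token to extract is "Application=JavaApplication_NN".
         Braces are written once here because select is not an attribute value template. -->
    <xsl:variable name="re" select="'.*(Application=JavaApplication_[0-9]{1,2}).*'"/>
    <!-- Only emit something when the pattern is actually present. -->
    <xsl:if test="matches(., $re)">
      <xsl:value-of select="replace(., $re, '$1')"/>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

Applied to the <root> input shown above, this should output Application=JavaApplication_2.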
Related
I have some data where formatting commands have been created in the format
^B Makes the rest of the line bold
^I Makes the rest of the line italic
etc., and I'm trying to turn this into HTML <b>, <i>, etc.
These codes can be combined and occur anywhere in the line, but apply to the rest of the line only.
I've tokenised the data into lines, and am using analyze-string on each line to pick up the formatting marks. The problem is that I need to open the formatting instruction where I find it in the string, but then close it at the end of the string. What I have doesn't work, as it opens and closes the format where the marker is, as you would expect:
<xsl:analyze-string select="." regex="\^([BIU])">
<xsl:matching-substring>
<xsl:element name="{lower-case(regex-group(1))}"/>
</xsl:matching-substring>
<xsl:non-matching-substring>
<xsl:value-of select="." />
</xsl:non-matching-substring>
</xsl:analyze-string>
What I get from this is:
<b></b> Makes the rest of the line bold
<i></i> Makes the rest of the line italic
where what I want is obviously
<b> Makes the rest of the line bold </b>
<i> Makes the rest of the line italic </i>
I can't see an obvious way of using analyze-string to achieve this. The only way I can see of doing it is to use a recursive function that does substring-afters etc., which seems rather messy.
Anyone with a better idea? Thanks!
Screwtape.
You just need to add another group to your regex to capture the rest of the line after the symbol, which can then be output inside your newly created element.
Try this
<xsl:analyze-string select="." regex="\^([BIU])(.*)">
<xsl:matching-substring>
<xsl:element name="{lower-case(regex-group(1))}">
<xsl:value-of select="regex-group(2)" />
</xsl:element>
</xsl:matching-substring>
<xsl:non-matching-substring>
<xsl:value-of select="." />
</xsl:non-matching-substring>
</xsl:analyze-string>
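Since the question says the codes can be combined, note that with the pattern above a second marker later in the line ends up inside regex-group(2) as literal text. One possible way to handle that, sketched here under assumptions, is to put the same analyze-string in a function that calls itself on regex-group(2). The function name f:format, its namespace URI, the line element and the p wrapper are all hypothetical names chosen for illustration.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:format"
    exclude-result-prefixes="xs f">

  <!-- Hypothetical helper: turns ^B / ^I / ^U markers into nested b / i / u elements. -->
  <xsl:function name="f:format" as="node()*">
    <xsl:param name="line" as="xs:string"/>
    <xsl:analyze-string select="$line" regex="\^([BIU])(.*)">
      <xsl:matching-substring>
        <xsl:element name="{lower-case(regex-group(1))}">
          <!-- Recurse on the rest of the line so later markers nest inside this one. -->
          <xsl:sequence select="f:format(regex-group(2))"/>
        </xsl:element>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <!-- Text before the first marker is copied through unchanged. -->
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:function>

  <!-- Assumption: each tokenised line is available as a <line> element. -->
  <xsl:template match="line">
    <p><xsl:sequence select="f:format(string(.))"/></p>
  </xsl:template>
</xsl:stylesheet>
```

For a line such as ^BBold ^Iboth this should produce <p><b>Bold <i>both</i></b></p>, with every element closing at the end of the line.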
<handlingInstruction>
<handlingInstructionText>CTAC | MARTINE HOEYLAERTS</handlingInstructionText>
</handlingInstruction>
<handlingInstruction>
<handlingInstructionText>PHON | 02/7225235</handlingInstructionText>
</handlingInstruction>
I have the XML structure given above. I concatenate the two values, using a comma as the separator, with the code below:
<xsl:value-of
select="concat(handlingInstruction[1]/handlingInstructionText,
',',
handlingInstruction[2]/handlingInstructionText)"/>
What is the shortest way to make the comma separator appear only when the second element exists? Thanks in advance.
If you don't want to use xsl:for-each, try:
<xsl:template match="/root">
<xsl:apply-templates select="handlingInstruction/handlingInstructionText"/>
</xsl:template>
<xsl:template match="handlingInstructionText">
<xsl:value-of select="."/>
<xsl:if test="position()!=last()">
<xsl:text>,</xsl:text>
</xsl:if>
</xsl:template>
(continued from here: https://stackoverflow.com/a/34679465/3016153)
<xsl:for-each select="handlingInstruction">
<xsl:value-of select="handlingInstructionText"/>
<xsl:if test="position()!=last()">
<xsl:text>,</xsl:text>
</xsl:if>
</xsl:for-each>
This iterates over all handlingInstruction elements and outputs the value of each one's handlingInstructionText child. After each element except the last it adds a comma, so a lone element (which is both first and last) gets no comma.
In your example, you only used two handlingInstruction elements. If you want to only use two with this method, do
<xsl:for-each select="handlingInstruction[position()&lt;3]">
<xsl:value-of select="handlingInstructionText"/>
<xsl:if test="position()!=last()">
<xsl:text>,</xsl:text>
</xsl:if>
</xsl:for-each>
Note the &lt; there. That is actually a less-than sign (<), but we can't write that literally inside an XML attribute value, so we use the entity defined for it.
Here is a second way to do it, which avoids the for-each loop.
If you are using XSLT 2.0, there is a string-join function which could be used like:
<xsl:value-of select="string-join(//handlingInstruction/handlingInstructionText,',')"/>
The string-join method takes a sequence of strings (which the nodes selected will be converted to by taking their content) and concatenates them with the separator. If there is only one string, a separator will not be added.
Alternatively, XSLT 2.0 also provides a separator attribute on the value-of element. Thus
<xsl:value-of select="//handlingInstruction/handlingInstructionText" separator=","/>
produces the same result.
I was wondering if it is possible to use analyze-string and set multiple groups within the RegEx and then store all of the matching groups in variables to use later on.
like so:
<xsl:analyze-string regex="^Blah\s+(\d+)\s+Bloo\s+(\d+)\s+Blee" select=".">
<xsl:matching-substring>
<xsl:variable name="varX">
<xsl:value-of select="regex-group(1)"/>
</xsl:variable>
<xsl:variable name="varY">
<xsl:value-of select="regex-group(2)"/>
</xsl:variable>
</xsl:matching-substring>
</xsl:analyze-string>
This doesn't actually work, but it's the sort of thing I'm after. I know I can wrap the analyze-string in a variable, but it seems daft to run the regex again for every group, which is not very efficient. I should be able to process the regex once and store all of the groups for use later on.
Any ideas?
Well does
<xsl:variable name="groups" as="element(group)*">
<xsl:analyze-string regex="^Blah\s+(\d+)\s+Bloo\s+(\d+)\s+Blee" select=".">
<xsl:matching-substring>
<group>
<x><xsl:value-of select="regex-group(1)"/></x>
<y><xsl:value-of select="regex-group(2)"/></y>
</group>
</xsl:matching-substring>
</xsl:analyze-string>
</xsl:variable>
help? That way you have a variable named groups which is a sequence of group elements with the captures.
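As a hedged usage sketch for the groups variable above: once the captures are stored as elements, individual values can be picked out with ordinary path expressions. The stylesheet wrapper, the match on a t element and the variable names varX and varY (borrowed from the question) are assumptions for illustration.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Assumption: the text to analyse is the content of a <t> element. -->
  <xsl:template match="t">
    <!-- Run the regex once and keep every capture in a temporary element. -->
    <xsl:variable name="groups" as="element(group)*">
      <xsl:analyze-string regex="^Blah\s+(\d+)\s+Bloo\s+(\d+)\s+Blee" select=".">
        <xsl:matching-substring>
          <group>
            <x><xsl:value-of select="regex-group(1)"/></x>
            <y><xsl:value-of select="regex-group(2)"/></y>
          </group>
        </xsl:matching-substring>
      </xsl:analyze-string>
    </xsl:variable>

    <!-- The captures are now ordinary nodes that can be reused anywhere in this template. -->
    <xsl:variable name="varX" select="$groups[1]/x"/>
    <xsl:variable name="varY" select="$groups[1]/y"/>
    <xsl:value-of select="$varX, $varY"/>
  </xsl:template>
</xsl:stylesheet>
```

For an input of <t>Blah 123 Bloo 4567 Blee</t> this should output 123 4567 (value-of separates the two items with a single space by default in XSLT 2.0).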
This transformation shows that xsl:analyze-string isn't necessary to obtain the wanted results; a simpler, generic solution exists:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="*[matches(., '^Blah\s+(\d+)\s+Bloo\s+(\d+)\s+Blee')]">
<xsl:variable name="vTokens" select=
"tokenize(replace(., '^Blah\s+(\d+)\s+Bloo\s+(\d+)\s+Blee', '$1 $2'), ' ')"/>
<xsl:variable name="varX" select="$vTokens[1]"/>
<xsl:variable name="varY" select="$vTokens[2]"/>
<xsl:sequence select="$varX, $varY"/>
</xsl:template>
</xsl:stylesheet>
when applied on this XML document:
<t>Blah 123 Bloo 4567 Blee</t>
which produces the wanted, correct result:
123 4567
Here we don't rely on knowing the RegEx (can be supplied as parameter) and the string -- we just replace the string with a delimited string of the RegEx groups, which we then tokenize and every item in the sequence produced by tokenize() can readily be assigned to a corresponding variable.
We don't have to find the wanted results buried in a temp. tree -- we just get them all in a result sequence.
Ok, this one has been driving me up the wall...
I have an XSLT function that is supposed to split out the zip-code part from a Zip+City string depending on the country. I cannot get it to work! This is what I've got so far:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:exslt="http://exslt.org/functions" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xsl:function name="exslt:GetZip" as="xs:string">
<xsl:param name="zipandcity" as="xs:string"/>
<xsl:param name="countrycode" as="xs:string"/>
<xsl:choose>
<xsl:when test="$countrycode='DK'">
<xsl:analyze-string select="$zipandcity" regex="(\d{4}) ([A-Za-zÆØÅæøå]{3,24})">
<xsl:matching-substring>
<xsl:value-of select="regex-group(1)"/>
</xsl:matching-substring>
<xsl:non-matching-substring>
<xsl:text>fail</xsl:text>
</xsl:non-matching-substring>
</xsl:analyze-string>
</xsl:when>
<xsl:otherwise>
<xsl:text>error</xsl:text>
</xsl:otherwise>
</xsl:choose>
</xsl:function>
I am running it on a source XML where the following values are passed to the function:
zipandcity: "DK-2640 København SV"
countrycode: "DK"
...will output 'fail'!
I think there is something I am misunderstanding here...
Aside from the facts that regexes aren't supported until XSLT 2.0 and that braces have to be escaped (but backslashes don't), there's one more reason why that code won't work: xsl:analyze-string runs the non-matching-substring branch for every part of the input that the regex doesn't cover. Given the string DK-2640 København SV, your regex only matches 2640 København, so the leftover DK- and SV pieces each output fail. You need to "pad" the regex to make it consume the whole string:
regex=".*(\d{{4}}) ([A-Za-zÆØÅæøå]{{3,24}}).*"
.* is probably sufficient in this case, but sometimes you have to be more specific. For example, if there's more than one place where \d{4} could match, you might use \D* at the beginning to make sure the first capturing group matches the first bunch of digits.
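Putting the doubled braces and the .* padding together, a corrected version of the function might look like the sketch below. It assumes an XSLT 2.0 processor and a non-empty, single-line input string; the my prefix and its namespace URI are hypothetical stand-ins for the question's exslt prefix, and the match="/" template is only there to exercise the function.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="urn:example:zip"
    exclude-result-prefixes="xs my">
  <xsl:output method="text"/>

  <xsl:function name="my:GetZip" as="xs:string">
    <xsl:param name="zipandcity" as="xs:string"/>
    <xsl:param name="countrycode" as="xs:string"/>
    <xsl:choose>
      <xsl:when test="$countrycode = 'DK'">
        <!-- Braces are doubled because regex is an attribute value template.
             The leading and trailing .* make the pattern cover the whole string,
             so the non-matching branch only fires when there is no match at all. -->
        <xsl:analyze-string select="$zipandcity"
            regex=".*(\d{{4}}) [A-Za-zÆØÅæøå]{{3,24}}.*">
          <xsl:matching-substring>
            <xsl:sequence select="regex-group(1)"/>
          </xsl:matching-substring>
          <xsl:non-matching-substring>
            <xsl:sequence select="'fail'"/>
          </xsl:non-matching-substring>
        </xsl:analyze-string>
      </xsl:when>
      <xsl:otherwise>
        <xsl:sequence select="'error'"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:function>

  <!-- Hard-coded call with the values from the question, for illustration only. -->
  <xsl:template match="/">
    <xsl:value-of select="my:GetZip('DK-2640 København SV', 'DK')"/>
  </xsl:template>
</xsl:stylesheet>
```

Run against any input document, this should output 2640.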
The regex attribute is parsed as an attribute value template, where curly braces have a special meaning. If this is in fact an XSLT 2.0 stylesheet, you need to escape the curly braces in the regex attribute by doubling them: (\d{{4}}) ([A-Za-zÆØÅæøå]{{3,24}})
Alternatively you could define a variable containing your pattern like this:
<xsl:variable name="pattern">(\d{4}) ([A-Za-zÆØÅæøå]{3,24})</xsl:variable>
<xsl:analyze-string select="$zipandcity" regex="{$pattern}">
Regular expressions are only supported in XSLT 2.x -- not in XSLT 1.0.
I need to perform a find and replace using XSLT 1.0 which is really suited to regular expressions. Unfortunately these aren't available in 1.0 and I'm also unable to use any extension libraries such as EXSLT due to security settings I can't change.
The string I'm working with looks like:
19;#John Smith;#17;#Ben Reynolds;#1;#Terry Jackson
I need to replace the numbers and the ;# delimiters with a comma and a space. For example, the above would change to:
John Smith, Ben Reynolds, Terry Jackson
I know a recursive string function is required, probably using substring and translate, but I'm not sure where to start with it.
Does anyone have some pointers on how to work this out? Here's what I've started with:
<xsl:template name="TrimMulti">
<xsl:param name="FullString" />
<xsl:variable name="NormalizedString">
<xsl:value-of select="normalize-space($FullString)" />
</xsl:variable>
<xsl:variable name="Hash">#</xsl:variable>
<xsl:choose>
<xsl:when test="contains($NormalizedString, $Hash)">
<!-- Do something and call TrimMulti -->
</xsl:when>
</xsl:choose>
</xsl:template>
I'm hoping you haven't simplified the problem too much for asking it on SO, because this shouldn't be that much of a problem.
You can define a template and recursively call it as long as you keep the input string's format consistent.
For example,
<xsl:template name="TrimMulti">
<xsl:param name="InputString"/>
<xsl:variable name="RemainingString"
select="substring-after($InputString,';#')"/>
<xsl:choose>
<xsl:when test="contains($RemainingString,';#')">
<xsl:value-of
select="substring-before($RemainingString,';#')"/>
<xsl:text>, </xsl:text>
<xsl:call-template name="TrimMulti">
<xsl:with-param
name="InputString"
select="substring-after($RemainingString,';#')"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$RemainingString"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
I tested this template out with the following call:
<xsl:template match="/">
<xsl:call-template name="TrimMulti">
<xsl:with-param name="InputString">19;#John Smith;#17;#Ben Reynolds;#1;#Terry Jackson</xsl:with-param>
</xsl:call-template>
</xsl:template>
And got the following output:
John Smith, Ben Reynolds, Terry Jackson
Which seems to be what you're after.
What it does is easy to explain if you're familiar with functional programming. The InputString parameter is always in the form [number];#[name];#[rest of string]. Each call of the TrimMulti template chops off the [number];# part and prints the [name] part, then passes the remaining expression to itself recursively.
The base case is when InputString is in the form [number];#[name], in which case the RemainingString variable won't contain ;#. Since we know this is the end of the input, we don't output a comma this time.
If the ';' and '#' characters are not valid in the input because they are delimiters then why wouldn't the translate function work? It might be ugly (you have to specify all valid characters in the second argument and repeat them in the third argument) but it would be easier to debug.
translate($InputString, ';#abcdefghijklmnopqrstuvABCDEFGHIJKLMNOPQRSTUZ0123456789,- ', ', abcdefghijklmnopqrstuvABCDEFGHIJKLMNOPQRSTUZ0123456789,- ')