In XSLT 1.0, test whether a string value contains a number

I have a complex address string, or rather several possible formats, which I need to split into road name, house number, floor, position (left, right, middle door) or door/room number.
I have managed to do everything apart from the last bit, shown here with a pseudo "contains":
<xsl:choose>
<xsl:when test="contains($addressFarLeft, ANY_NUMERIC_VALUE_ANYWHERE)">
<door>NUMERICVALUE</door>
</xsl:when>
<xsl:otherwise></xsl:otherwise>
</xsl:choose>
I'm pretty sure I can't just use some form of contains, but what then?
Value is set dynamically, but here are a few possible values:
<xsl:variable name="addressFarLeftValue">.th.</xsl:variable> =>
no numeric value, do nothing
<xsl:variable name="addressFarLeftValue">.1.</xsl:variable> =>
produce: <door>1</door>
<xsl:variable name="addressFarLeftValue">, . tv </xsl:variable> =>
no numeric value, do nothing
<xsl:variable name="addressFarLeftValue">,th, 4.</xsl:variable> =>
produce: <door>4</door>
Any suggestions?

If you want to test whether a string contains a numeric value, one approach could be to use the 'translate' function to remove all numeric digits from the string, and if the resultant string doesn't match the initial string, you know it must have contained a number. If the string doesn't change, then it didn't.
<xsl:choose>
<xsl:when test="translate($addressFarLeft, '1234567890', '') != $addressFarLeft">
<door>1</door>
</xsl:when>
<xsl:otherwise/>
</xsl:choose>
So, <xsl:variable name="addressFarLeftValue">.1.</xsl:variable> outputs <door>1</door>, but <xsl:variable name="addressFarLeftValue">.th.</xsl:variable> doesn't output anything.
If you wanted to extract the actual number, then assuming there is only one occurrence of a number in the string, you could do this:
<xsl:value-of
select="translate(
$addressFarLeft,
translate($addressFarLeft, '1234567890', ''),
'')" />
So, <xsl:variable name="addressFarLeftValue">,th, 42.</xsl:variable> outputs 42
If you had multiple numbers present, such as ,th, 42, ab, 1 this approach would fail though: the digits of all the numbers would be run together, giving 421.
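Putting the test and the extraction together, a minimal sketch might look like the stylesheet below. It assumes the value is held in a variable named addressFarLeft (hard-coded here purely for illustration) and that the string contains at most one number. The variable name digits is introduced only for this example.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <!-- Illustrative input; in practice this value would be set dynamically -->
  <xsl:variable name="addressFarLeft" select="',th, 42.'"/>

  <!-- Inner translate() strips the digits, leaving only the non-digit characters.
       Outer translate() then strips those non-digit characters, leaving only digits. -->
  <xsl:variable name="digits"
    select="translate($addressFarLeft,
                      translate($addressFarLeft, '0123456789', ''),
                      '')"/>

  <xsl:template match="/">
    <!-- Emit <door> only when at least one digit was found -->
    <xsl:if test="$digits != ''">
      <door><xsl:value-of select="$digits"/></door>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

With the input shown, this should produce <door>42</door>; for a value such as .th. the digits variable is empty and nothing is output.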

The wanted numeric value can be obtained using the "double-translate" method, first shown by Michael Kay:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:variable name="addressFarLeftValue1" select="',th, 4.'"/>
<xsl:variable name="addressFarLeftValue2" select="', . tv'"/>
<xsl:variable name="vDoorNumber1" select=
"translate($addressFarLeftValue1,
translate($addressFarLeftValue1, '0123456789', ''),
'')"/>
<xsl:variable name="vDoorNumber2" select=
"translate($addressFarLeftValue2,
translate($addressFarLeftValue2, '0123456789', ''),
'')"/>
<xsl:template match="/">
"<xsl:value-of select="$vDoorNumber2"/>"
==========
"<xsl:value-of select="$vDoorNumber1"/>"
</xsl:template>
</xsl:stylesheet>
When this transformation is applied to any XML document (not used), the wanted, correct result is produced:
""
==========
"4"

Related

Extract parts of inputs strings

I am trying to extract equipment names from strings and would like if someone could help me find a good way to do this.
My input string can contain either 1 or 2 equipment names, each consisting of EQ followed by 1 to 3 digits, for example:
LocationEQ3Suffix
LocationEQ5EQ8Suffix
So in the first instance I would need 'EQ3' and in the second instance I would need 'EQ5' and 'EQ8'.
I need the output to be in a text format, for example :
SomeText.EQ3
SomeText.EQ5
SomeText.EQ8
I was thinking there might be a way to do this with xsl:analyze-string and a regex like EQ[0-9]{1,3}.
Any help is appreciated.
I started something like this, but I don't think it's the right approach.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="2.0">
<xsl:output method="text"/>
<xsl:template match="/">
<xsl:variable name="input" select="'LocationEQ3EQ4Funct'"/>
<xsl:choose>
<!-- Case with 2 EQ -->
<xsl:when test="matches($input, 'EQ[0-9]{1,3}EQ[0-9]{1,3}')">
<xsl:value-of select="$input"/>
</xsl:when>
<!-- Case with 1 EQ -->
<xsl:otherwise>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
You say you want to use xsl:analyze-string but you're not.
An implementation using it would look something like:
<xsl:analyze-string select="$input" regex="EQ\d{{1,3}}">
<xsl:matching-substring>
<xsl:text>SomeText.</xsl:text>
<xsl:value-of select="." />
<xsl:text>
</xsl:text>
</xsl:matching-substring>
</xsl:analyze-string>
Demo: https://xsltfiddle.liberty-development.net/a9Hk1a
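For completeness, here is a sketch of how that instruction might sit inside the stylesheet from the question. It assumes an XSLT 2.0 processor and reuses the asker's $input variable with one of the sample strings. The braces in the regex are doubled because the regex attribute is an attribute value template.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Sample input taken from the question -->
    <xsl:variable name="input" select="'LocationEQ5EQ8Suffix'"/>

    <!-- Each match of EQ followed by 1 to 3 digits is processed in turn;
         text between matches is ignored because there is no
         xsl:non-matching-substring branch -->
    <xsl:analyze-string select="$input" regex="EQ[0-9]{{1,3}}">
      <xsl:matching-substring>
        <xsl:value-of select="concat('SomeText.', ., '&#10;')"/>
      </xsl:matching-substring>
    </xsl:analyze-string>
  </xsl:template>
</xsl:stylesheet>
```

Because every match is handled the same way, the one-name and two-name cases need no separate xsl:choose branches.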

What encoding can be used in XSLT to support only basic Latin alphabet characters?

I am looking for the correct encoding to use in XSLT when processing my XML.
My requirements are:
The output text file must not contain any special characters or UTF-8.
Only the modern English alphabet is supported: the Latin-based alphabet of 26 letters, the same letters found in the basic modern Latin alphabet.
I tried encoding="ISO 8859-1" and encoding="ISO 8859-15".
Can someone tell me the correct encoding if the above are wrong?
Thanks,
Jagan
As @EiríkrÚtlendi suggested in the comments, sanitize/check your output in the XSLT.
You can create a function with a single parameter that checks for an invalid character...
XML Input
<elem>ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz</elem>
XSLT 2.0
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:so="StackOverflow Example">
<xsl:output method="text"/>
<xsl:strip-space elements="*"/>
<xsl:template match="elem">
<xsl:value-of select="so:out(.)"/>
</xsl:template>
<xsl:function name="so:out">
<xsl:param name="str"/>
<xsl:if test="matches($str,'[^\p{L}]')">
<xsl:message terminate="yes">
<xsl:value-of
select="
concat('Invalid character in &quot;',
$str, '&quot;.')"
/>
</xsl:message>
</xsl:if>
<xsl:value-of select="$str"/>
</xsl:function>
</xsl:stylesheet>
Text Output
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
If you add any other character to the elem element in the input, you'll get the following message (I added a space to make it fail):
Invalid character in "ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz".
You could also check it character by character...
<xsl:function name="so:out">
<xsl:param name="str"/>
<xsl:for-each select="string-to-codepoints($str)">
<xsl:if test="matches(codepoints-to-string(.),'[^\p{L}]')">
<xsl:message terminate="yes">
<xsl:value-of
select="
concat('Invalid character (&quot;',
codepoints-to-string(.),
'&quot;) in &quot;',
$str, '&quot;.')"
/>
</xsl:message>
</xsl:if>
</xsl:for-each>
<xsl:value-of select="$str"/>
</xsl:function>
which would produce the message:
Invalid character (" ") in "ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz".
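Note that \p{L} matches any Unicode letter, so accented letters such as é would pass the checks above. If only the 26 basic Latin letters should be accepted, a variant along these lines could be used. This is a sketch, not part of the original answer: it reuses the so:out function name and the placeholder namespace from above and swaps in the character class [^A-Za-z].

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:so="StackOverflow Example">
  <xsl:output method="text"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="elem">
    <xsl:value-of select="so:out(.)"/>
  </xsl:template>

  <xsl:function name="so:out">
    <xsl:param name="str"/>
    <!-- Assumption: anything outside A-Z and a-z counts as invalid,
         including digits, spaces, punctuation and accented letters -->
    <xsl:if test="matches($str, '[^A-Za-z]')">
      <xsl:message terminate="yes">
        <xsl:value-of
          select="concat('Invalid character in &quot;', $str, '&quot;.')"/>
      </xsl:message>
    </xsl:if>
    <xsl:value-of select="$str"/>
  </xsl:function>
</xsl:stylesheet>
```

If digits or spaces are legitimate in the output, they would need to be added to the character class.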

Split and concatenate a string in XSLT

Hi, I have the below line in my XML, and I also need a hyperlink for the number. I want this output to be shown in HTML format.
<main>
<alph>a b 2,3</alph>
</main>
I want an XSLT that gives output as:
a b 2, a b 3
I have tried the below XSLT:
<xsl:template match="alph">
<xsl:variable name="link" select="normalize-space(translate(
normalize-space(current()),abcdefghijklmnopqrstuvwxyz,''))"/>
<xsl:value-of select="substring-before(normalize-space(.),$link)"/>
<xsl:variable name="tex">
<xsl:value-of select="text()"/>
</xsl:variable>
<xsl:choose>
<xsl:when test="contains($link,',')">
<xsl:variable name="new">
<xsl:value-of select="tokenize($link,',')"/>
</xsl:variable>
<xsl:value-of select="concat($new,$tex)"/>
</xsl:when>
<xsl:when test="contains($link,'-')">
<xsl:value-of select="tokenize($link,'-')"/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$link"/>
</xsl:otherwise>
</xsl:choose>
But it is giving me output as:
a b 2 3a b 2,3
Thanks
One problem you have is with the variable link
<xsl:variable name="link" select="normalize-space(translate(
normalize-space(current()),abcdefghijklmnopqrstuvwxyz,''))"/>
It looks like you are trying to remove all alphabetic characters from the string, so that you are left with just 2,3. However, for this to work the abc...xyz needs to be enclosed in apostrophes, otherwise it will be looking for an element named abc...xyz. Having said that, you say you are using XSLT 2.0, so you can make use of the replace function here, which takes a regular expression as a parameter
<xsl:variable name="link" select="normalize-space(replace(current(),'[a-z]',''))"/>
Next, you can get the text before this link, like so
<xsl:variable name="text" select="normalize-space(substring-before(current(), $link))"/>
This will give you your a b
Finally, you can use the tokenize function to split up the 2,3. In your XSLT you seem to be looking for hyphens too, but the tokenize function also uses regular expressions, so this is not a problem. What you can do is just tokenize the string, and re-join it using the text variable as a separator
<xsl:value-of select="concat($text, ' ')"/>
<xsl:value-of select="tokenize($link,',|-')" separator="{concat(', ', $text, ' ')}"/>
Here is the full XSLT
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="alph">
<xsl:variable name="link" select="normalize-space(replace(current(),'[a-z]',''))"/>
<xsl:variable name="text" select="normalize-space(substring-before(current(), $link))"/>
<xsl:value-of select="concat($text, ' ')"/>
<xsl:value-of select="tokenize($link,',|-')" separator="{concat(', ', $text, ' ')}"/>
</xsl:template>
</xsl:stylesheet>
When applied on your XML, the following is output
a b 2, a b 3

How do I use a regular expression in XSLT 1.0?

I am using XSLT 1.0.
My input information may contain these values
<!--case 1-->
<attribute>123-00</attribute>
<!--case 2-->
<attribute>Abc-01</attribute>
<!--case 3-->
<attribute>--</attribute>
<!--case 4-->
<attribute>Z2-p01</attribute>
I want to find the strings that match these criteria:
if the string has at least 1 letter AND at least 1 digit,
then
do X processing
else
do Y processing
In example above, for case 1,2,4 I should be able to do X processing. For case 3, I should be able to do Y processing.
I aim to use a regular expression (in XSLT 1.0).
For all the cases, the attribute can take any value of any length.
I tried using match, but the processor returned an error.
I tried using the translate function, but I am not sure I used it the right way.
I am thinking of something like this:
if String matches [a-zA-Z0-9]*
then do X processing
else
do y processing.
How do I implement that using XSLT 1.0 syntax?
This solution really works in XSLT 1.0 (and is simpler, because it doesn't and needn't use the double-translate method.):
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:variable name="vUpper" select=
"'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>
<xsl:variable name="vLower" select=
"'abcdefghijklmnopqrstuvwxyz'"/>
<xsl:variable name="vAlpha" select="concat($vUpper, $vLower)"/>
<xsl:variable name="vDigits" select=
"'0123456789'"/>
<xsl:template match="attribute">
<xsl:choose>
<xsl:when test=
"string-length() != string-length(translate(.,$vAlpha,''))
and
string-length() != string-length(translate(.,$vDigits,''))">
Processing X
</xsl:when>
<xsl:otherwise>
Processing Y
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
When applied to the provided XML fragment, wrapped to make a well-formed XML document:
<t>
<!--case 1-->
<attribute>123-00</attribute>
<!--case 2-->
<attribute>Abc-01</attribute>
<!--case 3-->
<attribute>--</attribute>
<!--case 4-->
<attribute>Z2-p01</attribute>
</t>
the wanted, correct result is produced:
Processing Y
Processing X
Processing Y
Processing X
Do note: any attempt to run code like the following (borrowed from another answer to this question) with a true XSLT 1.0 processor will fail with an error:
<xsl:template match=
"attribute[
translate(.,
translate(.,
concat($upper, $lower),
''),
'')
and
translate(., translate(., $digit, ''), '')]
">
because in XSLT 1.0 it is forbidden for a match pattern to contain a variable reference.
If you found this question because you're looking for a way to use regular expressions in XSLT 1.0, and you're writing an application using Microsoft's XSLT processor, you can solve this problem by using an inline C# script.
I've written out an example and a few tips in this thread, where someone was seeking out similar functionality. It's super simple, though it may or may not be appropriate for your purposes.
XSLT 1.0 does not support regular expressions, but you can fake it.
The following stylesheet prints an X processing message for all attribute elements having a string value containing at least one number and at least one letter (and Y processing for those that do not):
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:variable name="lower" select="'abcdefghijklmnopqrstuvwxyz'"/>
<xsl:variable name="upper" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>
<xsl:variable name="digit" select="'0123456789'"/>
<xsl:template match="attribute">
<xsl:choose>
<xsl:when test="
translate(., translate(., concat($upper, $lower), ''), '') and
translate(., translate(., $digit, ''), '')">
<xsl:message>X processing</xsl:message>
</xsl:when>
<xsl:otherwise>
<xsl:message>Y processing</xsl:message>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
Note: You said this:
In example above, for case 1,2,4 I should be able to do X processing.
for case 3, I should be able to do Y processing.
But that conflicts with your requirement, because case 1 does not contain a letter. If, on the other hand, you really want to match the equivalent of [a-zA-Z0-9], then use this:
translate(., translate(., concat($upper, $lower, $digit), ''), '')
...which matches any attribute having at least one letter or number.
See the following question for more information on using translate in this way:
How to write xslt if element contains letters?

Formatting string (Removing leading zeros)

I am a newbie to XSLT. My requirement is to transform an XML file into a text file as per the business specifications. I am facing an issue with string formatting. Please help me out if you have any ideas.
Here is the part of input xml data:
"0001295"
Expected result to print into text file:
1295
My main issue is to remove leading Zeros. Please share if you have any logic/function.
Just use this simple expression:
number(.)
Here is a complete example:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="t">
<xsl:value-of select="number(.)"/>
</xsl:template>
</xsl:stylesheet>
When applied on this XML document:
<t>0001295</t>
the wanted, correct result is produced:
1295
II. Use format-number()
format-number(., '#')
There are a couple of ways you can do this. If the value is entirely numeric (for example, not a CSV line or part of a product code such as ASN0012345), you can convert from a string to a number and back to a string again:
string(number($value)).
Otherwise just replace the 0s at the start (the replace function requires XSLT 2.0):
replace( $value, '^0*', '' )
The '^' anchor is required (standard regexp syntax); without it, a value of 001201 would be replaced with 121 (all zeros removed).
Hope that helps.
Dave
Here is one way you could do it in XSLT 1.0.
First, find the first non-zero character, by removing all the zero characters from the value and taking the first character of what remains
<xsl:variable name="first" select="substring(translate(., '0', ''), 1, 1)" />
Then, you can find the substring-before this first character, and then use substring-after to get the non-zero part after this
<xsl:value-of select="substring-after(., substring-before(., $first))" />
Or, to combine the two statements into one
<xsl:value-of select="substring-after(., substring-before(., substring(translate(., '0', ''), 1, 1)))" />
So, given the following input
<a>00012095Kb</a>
Then using the following XSLT
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/a">
<xsl:value-of select="substring-after(., substring-before(., substring(translate(., '0', ''), 1, 1)))" />
</xsl:template>
</xsl:stylesheet>
The following will be output
12095Kb
As a simple alternative in XSLT 2.0 that can be used with numeric or alpha-numeric input, with or without leading zeros, you might try:
replace( $value, '^0*(..*)', '$1' )
This works because ^0* is greedy and (..*) captures the rest of the input after the last leading zero. $1 refers to the captured group.
Note that an input containing only zeros will output 0.
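To see how this expression behaves on different kinds of input, a small test stylesheet along the following lines could be used. It is only a sketch: the sample values are chosen for illustration, and it assumes an XSLT 2.0 processor, since it iterates over a sequence of strings.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Illustrative inputs: numeric, alphanumeric, no leading zeros, all zeros -->
    <xsl:for-each select="('0001295', '00012095Kb', '1201', '000')">
      <!-- ^0* consumes the leading zeros; (..*) must capture at least one
           character, so an all-zero input keeps its final 0 -->
      <xsl:value-of
        select="concat(., ' -> ', replace(., '^0*(..*)', '$1'), '&#10;')"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

The four inputs should map to 1295, 12095Kb, 1201 and 0 respectively. Zeros that are not at the start of the string are left untouched.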
XSLT 2.0
Remove leading zeros from STRING
<xsl:value-of select="replace( $value, '^0+', '')"/>
You could use a recursive template that will remove the leading zeros:
<xsl:template name="remove-leading-zeros">
<xsl:param name="text"/>
<xsl:choose>
<xsl:when test="starts-with($text,'0')">
<xsl:call-template name="remove-leading-zeros">
<xsl:with-param name="text"
select="substring-after($text,'0')"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$text"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
Invoke it like this:
<xsl:call-template name="remove-leading-zeros">
<xsl:with-param name="text" select="/path/to/node/with/leading/zeros"/>
</xsl:call-template>
<xsl:value-of select="number(.) * 1"/>
works for me
Many XSLT 1.0 processors, such as libxslt (the XSLT module built on libxml2), let the host language register extension functions, so we can make use of that. Suppose the language that calls XSLT is PHP: see this wikibook about registerPHPFunctions.
The built-in PHP function ltrim can be used as follows:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:fn="http://php.net/xsl">
<xsl:output method="xml" encoding="utf-8" indent="yes"/>
<xsl:template match="test">
show <xsl:value-of select="fn:function('ltrim',string(.),'0')" />",
</xsl:template>
</xsl:stylesheet>
Now imagine a slightly more complex problem: left-trimming zeros in a string that contains more than one number, e.g. hello 002 and 021, bye.
The solution is the same: use registerPHPFunctions, but replace the built-in function with a user-defined one,
function ltrim0_Multi($s) {
return preg_replace('/(^0+|(?<= )0+)(?=[1-9])/','',$s);
}
converts the example into hello 2 and 21, bye.