string-to-codepoints(string) equivalent in XSLT 1.0

string-to-codepoints(string) equivalent in XSLT 1.0 - xslt

I need to get ASCII value of character and Convert ASCII code back to character if it satisfies certain conditions.
So I came across these functions:
string-to-codepoints(string)
and
codepoints-to-string((int,int,...))
provided in XSLT 2.0 (Or Rather XPATH 2.0) But unfortunately I need to use XSLT 1.0 for this task.
So My question is
Is there any equivalent of these functions in XSLT 1.0? If not Can we design it?
Can experts here help me in that?
Thanks in advance

It is possible to replace all characters with codepoints above 255 by "?" using pure XSLT 1.0 without extensions.
Define a variable
<xsl:variable name="upto255">
!"#$%.../01234...ABC...abc...úûüýþÿ</xsl:variable>
whose value is a string containing all the characters in the range 0..255 that are legal in XML.
Then use the double-translate trick:
<xsl:variable name="above255" select="translate($input, $upto255, '')"/>
This variable is a string containing all the non-Latin-1 characters present in the input string. Then use the recursive template
<xsl:template name="pad">
<xsl:param name="char"/>
<xsl:param name="count"/>
<xsl:choose>
<xsl:when test="$count=0"/>
<xsl:otherwise>
<xsl:value-of select="$char"/>
<xsl:call-template name="pad">
<xsl:with-param name="char" select="$char"/>
<xsl:with-param name="count" select="$count - 1"/>
</xsl:call-template>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
to create a string of the right number of question marks:
<xsl:variable name="qqq">
<xsl:call-template name="pad">
<xsl:with-param name="char" select="'?'"/>
<xsl:with-param name="count" select="string-length($above255)"/>
</xsl:call-template>
</xsl:variable>
and then do the substitution:
<xsl:value-of select="translate($input, $above255, $qqq)"/>
But of course since you are in Java there is no excuse for writing all this XSLT 1.0 code which could be replaced by a single line of code if you switched to an XSLT 2.0 processor such as Saxon.

Based on your comments you want to perform a string replacement based on a regular expression. If you are using Java and Xalan then I think you can use e.g. java:replaceAll($inputString, $regExpPattern, $replacementString) to call the Java String method replaceAll, here is a simple example
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:java="http://xml.apache.org/xalan/java"
version="1.0"
exclude-result-prefixes="java">
<xsl:template match="/">
<xsl:value-of select="java:replaceAll('abc-123-def','\w+', '?')"/>
</xsl:template>
</xsl:stylesheet>
which outputs ?-?-? for me with Xalan.
On the other hand if you are using Java then you should consider moving to Saxon 9 and XSLT 2.0 as that way you can use the XPath 2.0 replace function (replace('abc-123-def', '\w+', '?')) without any need for extensions.
I am not sure what that has to do with your original question about string-to-codepoints and the ASCII code of characters.

Related

Difficulty formulating conditions for XSL choose or if

I am trying to use XSL choose condition. here I am trying to acheive is, when client sends below xml to Dp System. My XSLT should look for values matching Pv(a) or Pv(b) or Pv(c), if any of these are mactched then send to the backend url which is mentioned in the xsl
else
invoke another rule which is called "Do not call rule" (which is nothing but, picks the local file called error.xml
Thanks for the help
Input xml
<DownloadProfileChannels>
<DownloadProfileChannel>
<IntervalLength>60</IntervalLength>
<PulseMultiplier>0.025</PulseMultiplier>
<Category>Pv(a)</Category> <!-- for every Pv(a) or Pv(b) or Pv(c) -->
<TimeDataEnd>2014-02-20T08:00:00Z</TimeDataEnd>
<MedianValues>
<MedianValue>
<ChannelValue>9112</ChannelValue>
<ProfileStatuses i:nil="true" />
</MedianValue>
<MedianValue>
<ChannelValue>9096</ChannelValue>
<ProfileStatuses i:nil="true" />
</MedianValue>
<MedianValue>
<ChannelValue>9188</ChannelValue>
<ProfileStatuses i:nil="true" />
</MedianValue>
</MedianValue>
</MedianValues>
</DownloadProfileChannel>
</DownloadProfileChannels>
My XSL
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:dp="http://www.w3.org/1999/XSL/Format">
<xsl:template match="/">
<xsl:message dp:priority="debug"> Entered the XSL File </xsl:message>
</xsl:message>
<xsl:choose>
<xsl:when test="contains($Quantity,'Pv(a) or Pv(b)')">
<xsl:variable name="destURL"
select="http://backendurl.com"/>
<dp:set-variable name="'var://service/routing-url'"
value="$destURL"/>
</xsl:when>
<xsl:otherwise>
<xsl:variable name="destURL"
select="local:///clienterror.xml"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>

This line:
<xsl:when test="contains($Quantity,'Pv(a) or Pv(b)')">
Is checking if $Quantity contains the literal string 'Pv(a) or Pv(b)'. You need to separate these out into two check like so:
<xsl:when test="contains($Quantity,'Pv(a)') or contains($Quantity,'Pv(b)')">

Firstly, as #Lego Stormtroopr says, you need to separate out the two conditions. But also, I don't think you really want a "contains" test here, you want an "=" test. The contains function would match Pv(a)(b)(c) - anything that has Pv(a) as a substring, whereas I think you want to match the whole node. So it becomes
<xsl:when test="$Quantity = 'Pv(a)' or $Quantity ='Pv(b)'">
which, if you are using XSLT 2.0, can be further abbreviated to
<xsl:when test="$Quantity = ('Pv(a)', 'Pv(b)')">
Alternatively, in XSLT 2.0 you can use regular expression matching:
<xsl:when test="matches($Quantity, 'Pv([ab])')">

Strip prefix from attribute value

For a project, I'm stuck with XSLT-1.0/XPATH-1.0 and need a fast way to strip a lowercase prefix from attribute values.
Example attribute values are:
"cmdValue1", "gfValue2", "dTestCase3"
The values I need are:
"Value1", "Value2", "TestCase3"
I came up with this XPath expression but it is too slow for my application:
substring(#attr, 1 + string-length(substring-before(translate(#attr, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', '..........................'), '.')))
In essence the above does replace all uppercase chars to dots, then creates a substring from the original attribute value starting from the first found dot position (first uppercase char).
Does anyone know a shorter/faster way to do this in XSLT-1.0/XPATH-1.0?

There are not many functions in XSLT 1.0 which we could use instead, so I tried the following recursive template to avoid the use of the translate function.
Because it is 1.5 times slower, it does not answer your question. I can just avoid someone trying the same thing:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xml:space="default" exclude-result-prefixes="" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" omit-xml-declaration="no" indent="yes" />
<xsl:template match="/">
<out>
<xsl:call-template name="removePrefix">
<xsl:with-param name="prefixedName" select="xml/#attrib" />
</xsl:call-template>
</out>
</xsl:template>
<xsl:template name="removePrefix">
<xsl:param name="prefixedName" />
<xsl:choose>
<xsl:when test="substring-before('_abcdefghijklmnopqrstuvwxyz', substring($prefixedName, 1,1))">
<xsl:call-template name="removePrefix">
<xsl:with-param name="prefixedName" select="substring($prefixedName,2)" />
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$prefixedName" />
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>

You don't need to calculate the prefix's length and manually extract the substring. Instead, just directly ask for everything that comes after it:
substring-after(#attr,
substring-before(translate(#attr,
'ABCDEFGHIJKLMNOPQRSTUVWXYZ',
'..........................'),
'.'))
This isn't a huge improvement, but it might shave 7-8% (based on some really rough and quick tests).

parsing string in xslt

I have following xml
<xml>
<xref>
is determined “in prescribed manner”
</xref>
</xml>
I want to see if we can process xslt 2 and return the following result
<xml>
<xref>
is
</xref>
<xref>
determined
</xref>
<xref>
“in prescribed manner”
</xref>
</xml>
I tried few options like replace the space and entities and then using for-each loop but not able to work it out. May be we can use tokenize function of xslt 2.0 but don't know how to use it. Any hint will be helpful.

# JimGarrison: Sorry, I couldn't resist. :-) This XSLT is definitely not elegant but it does (I assume) most of the job:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes" />
<xsl:variable name="left_quote" select="'<'"/>
<xsl:variable name="right_quote" select="'>'"/>
<xsl:template name="protected_tokenize">
<xsl:param name="string"/>
<xsl:variable name="pattern" select="concat('^([^', $left_quote, ']+)(', $left_quote, '[^', $right_quote, ']*', $right_quote,')?(.*)')"/>
<xsl:analyze-string select="$string" regex="{$pattern}">
<xsl:matching-substring>
<!-- Handle the prefix of the string up to the first opening quote by "normal" tokenizing. -->
<xsl:variable name="prefix" select="concat(' ', normalize-space(regex-group(1)))"/>
<xsl:for-each select="tokenize(normalize-space($prefix), ' ')">
<xref>
<xsl:value-of select="."/>
</xref>
</xsl:for-each>
<!-- Handle the text between the quotes by simply passing it through. -->
<xsl:variable name="protected_token" select="normalize-space(regex-group(2))"/>
<xsl:if test="$protected_token != ''">
<xref>
<xsl:value-of select="$protected_token"/>
</xref>
</xsl:if>
<!-- Handle the suffix of the string. This part may contained protected tokens again. So we do it recursively. -->
<xsl:variable name="suffix" select="normalize-space(regex-group(3))"/>
<xsl:if test="$suffix != ''">
<xsl:call-template name="protected_tokenize">
<xsl:with-param name="string" select="$suffix"/>
</xsl:call-template>
</xsl:if>
</xsl:matching-substring>
</xsl:analyze-string>
</xsl:template>
<xsl:template match="*|#*">
<xsl:copy>
<xsl:apply-templates/>
</xsl:copy>
</xsl:template>
<xsl:template match="xref">
<xsl:call-template name="protected_tokenize">
<xsl:with-param name="string" select="text()"/>
</xsl:call-template>
</xsl:template>
</xsl:stylesheet>
Notes:
There is the general assumption that white space only serves as a token delimiter and need not be preserved.
“ and rdquo; seem to be invalid in XML although they are valid in HTML. In the XSLT there are variables defined holding the quote characters. They will have to be adapted once you find the right XML representation. You can also eliminate the variables and put the characters right into the regular expression pattern. It will be significantly simplified by this.
<xsl:analyze-string> does not allow a regular expression which may evaluate into an empty string. This comes as a little problem since either the prefix and/or the proteced token and/or the suffix may be empty. I take care of this by artificially adding a space at the beginning of the pattern which allows me to search for the prefix using + (at least one occurence) instead of * (zero or more occurences).

Formatting string (Removing leading zeros)

I am newbie to xslt. My requirement is to transform xml file into text file as per the business specifications. I am facing an issue with one of the string formatting issue. Please help me out if you have any idea.
Here is the part of input xml data:
"0001295"
Expected result to print into text file:
1295
My main issue is to remove leading Zeros. Please share if you have any logic/function.

Just use this simple expression:
number(.)
Here is a complete example:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="t">
<xsl:value-of select="number(.)"/>
</xsl:template>
</xsl:stylesheet>
When applied on this XML document:
<t>0001295</t>
the wanted, correct result is produced:
1295
II. Use format-number()
format-number(., '#')

There are a couple of ways you can do this. If the value is entirely numeric (for example not a CSV line or part of a product code such as ASN0012345) you can convert from a string to a number and back to a string again :
string(number($value)).
Otherwise just replace the 0's at the start :
replace( $value, '^0*', '' )
The '^' is required (standard regexp syntax) or a value of 001201 will be replaced with 121 (all zero's removed).
Hope that helps.
Dave

Here is one way you could do it in XSLT 1.0.
First, find the first non-zero element, by removing all the zero elements currently in the value
<xsl:variable name="first" select="substring(translate(., '0', ''), 1, 1)" />
Then, you can find the substring-before this first character, and then use substring-after to get the non-zero part after this
<xsl:value-of select="substring-after(., substring-before(., $first))" />
Or, to combine the two statements into one
<xsl:value-of select="substring-after(., substring-before(., substring(translate(., '0', ''), 1, 1)))" />
So, given the following input
<a>00012095Kb</a>
Then using the following XSLT
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/a">
<xsl:value-of select="substring-after(., substring-before(., substring(translate(., '0', ''), 1, 1)))" />
</xsl:template>
</xsl:stylesheet>
The following will be output
12095Kb

As a simple alternative in XSLT 2.0 that can be used with numeric or alpha-numeric input, with or without leading zeros, you might try:
replace( $value, '^0*(..*)', '$1' )
This works because ^0* is greedy and (..*) captures the rest of the input after the last leading zero. $1 refers to the captured group.
Note that an input containing only zeros will output 0.

XSLT 2.0
Remove leading zeros from STRING
<xsl:value-of select="replace( $value, '^0+', '')"/>

You could use a recursive template that will remove the leading zeros:
<xsl:template name="remove-leading-zeros">
<xsl:param name="text"/>
<xsl:choose>
<xsl:when test="starts-with($text,'0')">
<xsl:call-template name="remove-leading-zeros">
<xsl:with-param name="text"
select="substring-after($text,'0')"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$text"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
Invoke it like this:
<xsl:call-template name="remove-leading-zeros">
<xsl:with-param name="text" select="/path/to/node/with/leading/zeros"/>
</xsl:call-template>
</xsl:template>

<xsl:value-of select="number(.) * 1"/>
works for me

All XSLT1 parser, like the popular libXML2's module for XSLT, have the registered functions facility... So, we can suppose to use it. Suppose also that the language that call XSLT, is PHP: see this wikibook about registerPHPFunctions.
The build-in PHP function ltrim can be used in
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:fn="http://php.net/xsl">
<xsl:output method="xml" encoding="utf-8" indent="yes"/>
<xsl:template match="test">
show <xsl:value-of select="fn:function('ltrim',string(.),'0')" />",
</xsl:template>
</xsl:stylesheet>
Now imagine a little bit more complex problem, to ltrim a string with more than 1 number, ex. hello 002 and 021, bye.
The solution is the same: use registerPHPFunctions, except to change the build-in function to a user defined one,
function ltrim0_Multi($s) {
return preg_replace('/(^0+|(?<= )0+)(?=[1-9])/','',$s);
}
converts the example into hello 2 and 21, bye.

Complex XSLT split?

Is it possible to split a tag at lower to upper case boundaries i.e.
for example, tag 'UserLicenseCode' should be converted to 'User License Code'
so that the column headers look a little nicer.
I've done something like this in the past using Perl's regular expressions,
but XSLT is a whole new ball game for me.
Any pointers in creating such a template would be greatly appreciated!
Thanks
Krishna

Using recursion, it is possible to walk through a string in XSLT to evaluate every character. To do this, create a new template which accepts only one string parameter. Check the first character and if it's an uppercase character, write a space. Then write the character. Then call the template again with the remaining characters inside a single string. This would result in what you want to do.
That would be your pointer. I will need some time to work out the template. :-)
It took some testing, especially to get the space inside the whole thing. (I misused a character for this!) But this code should give you an idea...
I used this XML:
<?xml version="1.0" encoding="UTF-8"?>
<blah>UserLicenseCode</blah>
and then this stylesheet:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:fo="http://www.w3.org/1999/XSL/Format">
<xsl:output method="text"/>
<xsl:variable name="Space">*</xsl:variable>
<xsl:template match="blah">
<xsl:variable name="Split">
<xsl:call-template name="Split">
<xsl:with-param name="Value" select="."/>
<xsl:with-param name="First" select="true()"/>
</xsl:call-template></xsl:variable>
<xsl:value-of select="translate($Split, '*', ' ')" />
</xsl:template>
<xsl:template name="Split">
<xsl:param name="Value"/>
<xsl:param name="First" select="false()"/>
<xsl:if test="$Value!=''">
<xsl:variable name="FirstChar" select="substring($Value, 1, 1)"/>
<xsl:variable name="Rest" select="substring-after($Value, $FirstChar)"/>
<xsl:if test="not($First)">
<xsl:if test="translate($FirstChar, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', '..........................')= '.'">
<xsl:value-of select="$Space"/>
</xsl:if>
</xsl:if>
<xsl:value-of select="$FirstChar"/>
<xsl:call-template name="Split">
<xsl:with-param name="Value" select="$Rest"/>
</xsl:call-template>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
and I got this as result:
User License Code
Do keep in mind that spaces and other white-space characters do tend to be stripped away from XML, which is why I used an '*' instead, which I translated to a space.
Of course, this code could be improved. It's what I could come up with in 10 minutes of work. In other languages, it would take less lines of code but in XSLT it's still quite fast, considering the amount of code lines it contains.

An XSLT + FXSL solution (in XSLT 2.0, but almost the same code will work with XSLT 1.0 and FXSL 1.x:
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:f="http://fxsl.sf.net/"
xmlns:testmap="testmap"
exclude-result-prefixes="f testmap"
>
<xsl:import href="../f/func-str-dvc-map.xsl"/>
<testmap:testmap/>
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="/">
<xsl:variable name="vTestMap" select="document('')/*/testmap:*[1]"/>
'<xsl:value-of select="f:str-map($vTestMap, 'UserLicenseCode')"
/>'
</xsl:template>
<xsl:template name="mySplit" match="*[namespace-uri() = 'testmap']"
mode="f:FXSL">
<xsl:param name="arg1"/>
<xsl:value-of select=
"if(lower-case($arg1) ne $arg1)
then concat(' ', $arg1)
else $arg1
"/>
</xsl:template>
</xsl:stylesheet>
When the above transformation is applied on any source XML document (not used), the expected correct result is produced:
' User License Code'
Do note:
We are using the DVC version of the FXSL function/template str-map(). This is a Higher-order function (HOF) which takes two arguments: another function and a string. str-map() applies the function on every character of the string and returns the concatenation of the results.
Because the lower-case() function is used (in the XSLT 2.0 version), we are not constrained to only the Latin alphabet.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

string-to-codepoints(string) equivalent in XSLT 1.0 - xslt

Related

Difficulty formulating conditions for XSL choose or if

Strip prefix from attribute value

parsing string in xslt

Formatting string (Removing leading zeros)

Complex XSLT split?

Categories

Resources