How to craft an entity in XSLT - xslt

How can I create the entity '&#160;', if I have the part starting with the '#' in a variable?
When I try to do something like this:
concat('&', '#160;')
I get a syntax error in XMLSpy.

Does it have to be an entity (actually you mean a "character reference"), or will it do just to output a non-breaking space character?
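To do the latter, given that $var holds "#160", in XSLT 2.0 you can use the following. Note that codepoints-to-string expects xs:integer values, so the digits are cast with xs:integer rather than number(), which returns an xs:double and would raise a type error; the xs prefix is assumed to be bound to http://www.w3.org/2001/XMLSchema.
<xsl:value-of select="codepoints-to-string(xs:integer(substring($var, 2)))"/>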

The problem with your code is that, in XML, you cannot use a standalone &, so it should be like this:
concat('&amp;', '#160;')
which outputs &amp;#160; if the output method is xml and &#160; if text.
disable-output-escaping helps to force &#160; in xml output:
<xsl:value-of select="concat('&amp;', '#160;')" disable-output-escaping="yes"/>
Another way to replace a character by an arbitrary string is using character maps:
<xsl:output use-character-maps="foo"/>
<xsl:character-map name="foo">
<xsl:output-character character="&amp;" string="&amp;"/>
</xsl:character-map>
<xsl:template match="/">
<xsl:value-of select="concat('&amp;', '#160;')"/>
</xsl:template>
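If a literal character reference is wanted in the serialized output without disabling escaping, one option is to emit the real character and let the serializer encode it. The following is only a minimal sketch under assumptions: the variable name var and the wrapper element out are hypothetical, and it relies on the XML output method writing characters that the chosen encoding (here US-ASCII) cannot represent as numeric character references.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- US-ASCII cannot represent U+00A0, so the serializer has to
       write it as a numeric character reference -->
  <xsl:output method="xml" encoding="US-ASCII"/>

  <!-- Hypothetical variable holding the code point with a leading '#' -->
  <xsl:variable name="var" select="'#160'"/>

  <xsl:template match="/">
    <!-- 'out' is an illustrative wrapper element -->
    <out>
      <!-- Drop the '#', cast the digits to xs:integer (the type that
           codepoints-to-string expects) and build the character -->
      <xsl:value-of select="codepoints-to-string(xs:integer(substring($var, 2)))"/>
    </out>
  </xsl:template>
</xsl:stylesheet>
```

Whether the reference appears in decimal or hexadecimal form is up to the processor.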

Related

XSLT REGEX pattern match

Using Saxon 9.7, XSLT 3.0, I'm trying to select square bracketed terms from a string of text and then remove duplicate values of the terms.
So far I have found a template which selects the substrings I want and a function that tokenizes the string and then removes duplicate values.
However, I haven't been able to get the correct regex for the tokenizing of the string.
Here is my XML of the full text
<column>
<columnDerivationPrompt>Option 1: (No visit windowing)</columnDerivationPrompt>
<columnDerivationDescription>Set to collected visit name [EG.VISIT] Set to 'POST-BASELINE MINIMUM' for the new observation generated for derviation type minimum [ADEG.DTYPE] = 'MINIMUM'
Set to 'POST-BASELINE MAXIMUM' for the new observation generated for derviation type maximum [ADEG.DTYPE]= 'MAXIMUM'
</columnDerivationDescription>
<columnDerivationPrompt>Option 2: (User defined visit windows)</columnDerivationPrompt>
<columnDerivationDescription>Set to a re-defined visit range based on user-defined input, using formatting of Analysis Relative Day [ADEG.ADY] range in conjunction with Analysis Window Target [ADEG.AWTARGET] and Analysis Window Diff from Target [ADEG.AWTDIFF] to determine analysis visit.
Set to 'POST-BASELINE MINIMUM' for the new observation generated for derviation type minimum [ADEG.DTYPE] = 'MINIMUM'
Set to 'POST-BASELINE MAXIMUM' for the new observation generated for derviation type maximum [ADEG.DTYPE]= 'MAXIMUM'
</columnDerivationDescription>
</column>
The string of terms taken from the text that I need to remove duplicates from
EG.VISIT ADEG.DTYPE ADEG.DTYPE ADEG.ADY ADEG.AWTARGET ADEG.AWTDIFF ADEG.DTYPE ADEG.DTYPE
What I would like to see
EG.VISIT ADEG.DTYPE ADEG.ADY ADEG.AWTARGET ADEG.AWTDIFF
my XSLT template and function
<xsl:variable name="test">
<xsl:if test="contains($string,'[')">
<xsl:variable name="relevant-part" select="substring-before(substring-after($string,'['),']')"/>
<xsl:variable name="remainder" select="substring-after($string,']')"/>
<xsl:value-of select="$relevant-part"/>
<xsl:if test="contains($remainder,'[')">
<xsl:text disable-output-escaping="yes"> </xsl:text>
</xsl:if>
<xsl:call-template name="find-relevant-text">
<xsl:with-param name="string" select="$remainder"/>
</xsl:call-template>
</xsl:if>
</xsl:variable>
<xsl:value-of select="myfn:sortCSV($test)"/>
</xsl:template>
<xsl:function name="myfn:sortCSV" as="xs:string*">
<xsl:param name="csvString" as="xs:string"/>
<!-- Split up string and remove duplicates -->
<xsl:variable name="values" select="distinct-values(tokenize($csvString,'\W+\.\W+'))" as="xs:string*"/>
<!-- Return all elements, sorted -->
<xsl:for-each select="$values">
<xsl:sort/>
<!-- We don't return empty strings -->
<xsl:sequence select=".[.!='']"/>
</xsl:for-each>
</xsl:function>
\W+\.\W+ is the regex I have been using to identify e.g. EG.VISIT or ADEG.DTYPE. So any pattern including CC.CCCC to CCCC.CCCCCCCC (where C is a char [A-Z]).
The output I am getting is
EG.VISIT ADEG.DTYPE ADEG.DTYPE ADEG.ADY ADEG.AWTARGET ADEG.AWTDIFF ADEG.DTYPE ADEG.DTYPE
So no duplicates have been removed.
QUESTION:
Can anyone see where I am going wrong with my expression or code?
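As for your regular expression, note that \W matches a non-word character and so cannot match uppercase (or lowercase) letters; \w matches a word character. Also, tokenize() treats its pattern as a separator, and since \W+\.\W+ never occurs in your string, the whole string comes back as a single token, so distinct-values has nothing to remove.
However, it is best to restrict the match to [A-Z]+\.[A-Z]+, since you say the items you want to match follow the uppercase+.+uppercase pattern.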
I would use analyze-string, either the XSLT 2.0 instruction xsl:analyze-string or, in XSLT 3.0, the function of the same name. With the function it is a one-liner:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:fn="http://www.w3.org/2005/xpath-functions"
xmlns:math="http://www.w3.org/2005/xpath-functions/math"
exclude-result-prefixes="xs math fn"
version="3.0">
<xsl:template match="column">
<xsl:value-of select="distinct-values(analyze-string(., '\[([A-Z]+\.[A-Z]+)\]')//fn:match/fn:group[@nr = 1])"/>
</xsl:template>
</xsl:stylesheet>
Output is EG.VISIT ADEG.DTYPE ADEG.ADY ADEG.AWTARGET ADEG.AWTDIFF.
If you want to sort the extracted strings, then use <xsl:value-of select="sort(distinct-values(analyze-string(., '\[([A-Z]+\.[A-Z]+)\]')//fn:match/fn:group[@nr = 1]))"/>.
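If you prefer to keep the tokenize-based function from the question, a possible repair is to split on whitespace, because the extracted terms are separated by spaces. This is only a sketch: the namespace URI bound to myfn and the literal test string are assumptions made for illustration.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:myfn="http://example.com/myfn"
    exclude-result-prefixes="xs myfn">

  <xsl:output method="text"/>

  <!-- Hypothetical rewrite of myfn:sortCSV: split on whitespace,
       drop duplicates, return the remaining terms sorted -->
  <xsl:function name="myfn:sortCSV" as="xs:string*">
    <xsl:param name="csvString" as="xs:string"/>
    <!-- normalize-space trims the ends, so no empty tokens appear -->
    <xsl:perform-sort
        select="distinct-values(tokenize(normalize-space($csvString), '\s+'))">
      <xsl:sort select="."/>
    </xsl:perform-sort>
  </xsl:function>

  <xsl:template match="/">
    <!-- Illustrative input; in the question this would be $test -->
    <xsl:value-of
        select="myfn:sortCSV('EG.VISIT ADEG.DTYPE ADEG.DTYPE ADEG.ADY')"/>
  </xsl:template>
</xsl:stylesheet>
```

Here xsl:value-of joins the returned strings with single spaces, which is the XSLT 2.0 default separator for a sequence.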

Substring before throwing error

I have the XML below
<?xml version="1.0" encoding="UTF-8"?>
<body>
<p>Industrial drawing: Any creative composition</p>
<p>Industrial drawing: Any creative<fn>
<fnn>4</fnn>
<fnt>
<p>ftn1"</p>
</fnt>
</fn> composition
</p>
</body>
and the below XSL.
<xsl:template match="p">
<xsl:choose>
<xsl:when test="contains(substring-before(./text(),' '),'Article')">
<xsl:text>sect3</xsl:text>
<xsl:value-of select="./text()"/>
</xsl:when>
<xsl:when test="contains(substring-before(./b/text(),' '),'Section')">
<xsl:text> Sect 2</xsl:text>
<xsl:value-of select="./text()"/>
</xsl:when>
<xsl:when test="contains(substring-before(./b/text(),' '),'Chapter')">
<xsl:text> Sect 1</xsl:text>
<xsl:value-of select="./text()"/>
</xsl:when>
<xsl:otherwise>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
My XSL works fine for <p>Industrial drawing: Any creative composition</p>, but for the case below
<p>Industrial drawing: Any creative<fn>
<fnn>4</fnn>
<fnt>
<p>ftn1"</p>
</fnt>
</fn> composition
</p>
it is throwing me the below error.
XSLT 2.0 Debugging Error: Error: file:///C:/Users/u0138039/Desktop/Proview/ASAK/DIFC/XSLT/tabel.xslt:38: Wrong occurrence to match required sequence type - Details: - XPTY0004: The supplied sequence ('2' item(s)) has the wrong occurrence to match the sequence type xs:string ('zero or one')
Please let me know how I can fix this and grab the required text.
Thanks
The second p element in your example XML has two child text nodes, one containing "Industrial drawing: Any creative" and the other containing a space, "composition", a newline and another six spaces. In XSLT 1.0 it is legal to apply a function that expects a string to an argument that is a set of more than one node; the behaviour is to take the value of the first node and ignore all the others. But in 2.0 it is a type error to pass two nodes to a function that expects a single value for its parameter.
But in this case I doubt that you really need to use text() at all - if all you care about is seeing whether the string "Article" occurs anywhere within the first word under the p (including when this is nested inside another element) then you can simply use .:
<xsl:when test="contains(substring-before(.,' '),'Article')">
(or better still, use predicates to separate the different conditions into their own templates, with one template matching "Article" paragraphs, another matching "Section" paragraphs, etc.)
The p element in your example has several text nodes, so the expression ./text() creates a sequence of more than one item. You cannot pass such a sequence to a string function that expects a single string; you must convert it to one string first. Instead of:
test="contains(substring-before(./text(),' '),'Article')"
try:
test="contains(substring-before(string-join(text(), ''), ' '), 'Article')"
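The suggestion above to split the conditions into separate templates with predicates could look roughly like this. It is a sketch under assumptions: the output strings are copied from the question, b[1] is used so that only the first b child is tested, and value-of outputs the whole string value of p rather than only its direct text nodes.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Paragraphs whose first word contains "Article" -->
  <xsl:template match="p[contains(substring-before(., ' '), 'Article')]">
    <xsl:text>sect3</xsl:text>
    <xsl:value-of select="."/>
  </xsl:template>

  <!-- Paragraphs whose first b child starts with a word containing "Section" -->
  <xsl:template match="p[contains(substring-before(b[1], ' '), 'Section')]">
    <xsl:text> Sect 2</xsl:text>
    <xsl:value-of select="."/>
  </xsl:template>

  <!-- Paragraphs whose first b child starts with a word containing "Chapter" -->
  <xsl:template match="p[contains(substring-before(b[1], ' '), 'Chapter')]">
    <xsl:text> Sect 1</xsl:text>
    <xsl:value-of select="."/>
  </xsl:template>

  <!-- Fallback with lower default priority: all other paragraphs give no output -->
  <xsl:template match="p"/>
</xsl:stylesheet>
```

Because . and b[1] each supply at most one item, the XPTY0004 error cannot arise here. If a paragraph could satisfy more than one predicate, explicit priority attributes would be needed to make the choice deterministic.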

replace string in xslt 2.0 with replace function

I have a string like this
"My string"
Now I want to replace my with best so that the output will be like best string.
I have tried some thing like this
<xsl:value-of select="replace( 'my string',my,best)"/>
but the syntax is probably wrong.
I have googled a lot but found nothing; everywhere, only the XSLT 1.0 mechanism is explained. Can anyone tell me how to do it in XSLT 2.0, the easy way compared to 1.0?
Given:
<xsl:variable name="s1" select="'My string'"/>
Simply use:
<xsl:value-of select="replace($s1, 'My', 'best')"/>
Note that a regular expression is applied. Meaning:
<xsl:value-of select="replace('test.replace', '.', ':')"/>
Becomes:
::::::::::::
Be sure to escape the characters that have special meaning to the regular expression interpreter:
<xsl:value-of select="replace('test.replace', '\.', '::')"/>
Becomes:
test::replace
First check that your XSLT processor (Saxon) is the latest release. Then you have to set
<xsl:stylesheet version="2.0" in the head of your XSLT stylesheet. That's it.
Your code was fine, except that you forgot the apostrophes:
<xsl:value-of select="replace( 'my string',my,best)"/>
must be
<xsl:value-of select="replace('my string','my','best')"/>
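Since the original string is "My string" while the pattern in the question is lowercase, the optional fourth argument of replace may be useful: the flag 'i' makes the match case-insensitive. A minimal sketch, with the variable name s1 taken from the answer above:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:variable name="s1" select="'My string'"/>

  <xsl:template match="/">
    <!-- 'i' = case-insensitive, so the pattern 'my' also matches 'My' -->
    <xsl:value-of select="replace($s1, 'my', 'best', 'i')"/>
  </xsl:template>
</xsl:stylesheet>
```

This yields best string. Keep in mind that the replacement string has special characters of its own: $1, $2 and so on refer to captured groups, so a literal $ or \ in the replacement has to be written as \$ or \\.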

apostrophe in xsl:format-number

I transform an XML document with my XSLT and get the result as XML.
I need to format numbers with an apostrophe as the delimiter for thousands, millions, etc.
e.g.: 1234567 = 1'234'567
Now the problem is: how do I get these apostrophes in there?
<xsl:value-of select="format-number(/path/to/number, '###'###'###'###')" />
This doesn't work because the apostrophe itself already delimits the start of the format string.
Is there a simple solution to that (maybe escaping the apostrophe, as in C#)?
The answer depends on whether you are using 1.0 or 2.0.
In 2.0, you can escape the string delimiter by doubling it (for example 'it''s dark'), and you can escape the attribute delimiter by using an XML entity such as &quot;. So you could write:
<xsl:value-of select="format-number(/path/to/number, '###''###''###''###')" />
In 1.0, you can escape the attribute delimiter by using an XML entity, but there is no way of escaping the string delimiter. So you could switch your delimiters and use
<xsl:value-of select='format-number(/path/to/number, "###&apos;###&apos;###&apos;###")' />
The other way - probably easier - is to put the string in a variable:
<xsl:variable name="picture">###'###'###'###</xsl:variable>
<xsl:value-of select="format-number(/path/to/number, $picture)" />
After some research we came up with this solution:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:decimal-format name='ch' grouping-separator="'" />
<xsl:template match="/">
<xsl:value-of select='format-number(/the/path/of/the/number, "###&apos;###&apos;###", "ch")'/>
...
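The two ideas above, a picture held in a variable and a named decimal format that declares the apostrophe as grouping separator, can be combined. The sketch below is an illustration and not a completion of the truncated stylesheet above: it is written for XSLT 2.0, and the literal number and the short picture #'### are assumptions.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Declare the apostrophe as the grouping separator of format 'ch' -->
  <xsl:decimal-format name="ch" grouping-separator="'"/>

  <!-- Keeping the picture in element content avoids any quoting issue -->
  <xsl:variable name="picture">#'###</xsl:variable>

  <xsl:template match="/">
    <!-- Illustrative literal; normally a path into the source document -->
    <xsl:value-of select="format-number(1234567, $picture, 'ch')"/>
  </xsl:template>
</xsl:stylesheet>
```

With a single grouping separator in the picture, the grouping repeats every three digits, so a conforming processor should produce 1'234'567.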

XSL - Invalid Xpath Extension on Replace

I'm getting an error when I try to use the following:
<xsl:variable name="url" select="guid"/>
<xsl:variable name="vid" select="substring-after($url,'podcast/')"/>
<xsl:variable name="pre" select="substring-before($vid,'.mp4')"/>
<<xsl:variable name="p" select="replace($pre,'_','-')"/>
<xsl:variable name="p1" select="concat($p,'.embed_thumbnail.jpg')"/>
<xsl:variable name="p2" select="concat('http://images.ted.com/images/ted/tedindex/embed-posters/',$p1)"/>
Can anyone see a problem, it all looks good to me?
Are you using an XSLT 1 processor? The replace function appeared in XPath 2.0 and is therefore not available in XSLT 1.
In this case you could just use the translate function instead.
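For single-character substitutions such as '_' to '-', translate is available in XSLT 1.0. A sketch of the variable chain from the question rewritten that way follows; the match pattern item is an assumption about the input document, while the variable names and URL are taken from the question.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- 'item' is an assumed parent element of guid -->
  <xsl:template match="item">
    <xsl:variable name="url" select="guid"/>
    <xsl:variable name="vid" select="substring-after($url,'podcast/')"/>
    <xsl:variable name="pre" select="substring-before($vid,'.mp4')"/>
    <!-- translate maps each '_' to '-' one character at a time -->
    <xsl:variable name="p" select="translate($pre,'_','-')"/>
    <xsl:variable name="p1" select="concat($p,'.embed_thumbnail.jpg')"/>
    <xsl:value-of select="concat('http://images.ted.com/images/ted/tedindex/embed-posters/',$p1)"/>
  </xsl:template>

  <!-- Suppress text that the built-in rules would otherwise copy -->
  <xsl:template match="text()"/>
</xsl:stylesheet>
```

translate only substitutes individual characters, so it is not a general replacement for replace when the search string is longer than one character.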
You have an extra unescaped less-than sign before your p variable's definition:
<<xsl:variable name="p" select="replace($pre,'_','-')"/>
That's not valid syntax.
You should either remove it:
<xsl:variable name="p" select="replace($pre,'_','-')"/>
Or escape it:
&lt;<xsl:variable name="p" select="replace($pre,'_','-')"/>
I see a '<<' at the start of line 4, is that it?