XSLT - Check if pattern exists in an element string - regex

I have the following element as part of a larger XML document:
<MT N="NonEnglishAbstract" V="[DE] Deutsch Abstract text [FR] French Abstract text"/>
I need to format the value of the @V attribute, but only if it contains something like [DE] or [FR], that is, any two capital letters representing a country code within square brackets.
If no such pattern exists, I simply need to write the value of @V without any formatting.
I can use an XSLT 2.0 solution.
I was hoping that I could use the matches() function, something like:
<xsl:choose>
<xsl:when test="matches(@V,'\[([A-Z]{2})\]([^\[]+)')">
<!-- Do something -->
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="@V"/>
</xsl:otherwise>
</xsl:choose>

I think all you need is:
matches(@V,'\[[A-Z][A-Z]\]')
You don't have to match the entire string to get a true() ... I tell my students to write as short a reg-ex as possible.
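A minimal sketch of how the short test might sit next to the actual formatting, assuming XSLT 2.0. The output element abstract and its lang attribute are hypothetical names chosen only for illustration. One detail worth noting: the regex attribute of xsl:analyze-string is an attribute value template, so the quantifier braces are doubled there, whereas inside a test expression they are written once.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="MT[@N = 'NonEnglishAbstract']">
    <xsl:choose>
      <!-- Short test: is there at least one [XX] marker anywhere in @V? -->
      <xsl:when test="matches(@V, '\[[A-Z]{2}\]')">
        <!-- Split @V into (code, text) pairs; braces are doubled because regex is an AVT -->
        <xsl:analyze-string select="@V" regex="\[([A-Z]{{2}})\]([^\[]+)">
          <xsl:matching-substring>
            <!-- "abstract" and "lang" are illustrative output names -->
            <abstract lang="{regex-group(1)}">
              <xsl:value-of select="normalize-space(regex-group(2))"/>
            </abstract>
          </xsl:matching-substring>
        </xsl:analyze-string>
      </xsl:when>
      <xsl:otherwise>
        <!-- No marker found: write the value unchanged -->
        <xsl:value-of select="@V"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

For the sample element this should produce one abstract element per language marker; a value without any marker is written out as is.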

You have not posted anything about what you have tried. How about looking up the translate function and translating the string's capital letters to something like "X"? Then test the result for the existence of [XX]. That alone would tell you whether you need to process it. The replacement string needs one X per letter, 26 in all, because translate deletes any character that has no counterpart.
<xsl:variable name="result">
<xsl:value-of select="translate(@V,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','XXXXXXXXXXXXXXXXXXXXXXXXXX')"/>
</xsl:variable>
Then test that result:
contains($result, '[XX]')
No regex required, pure XSLT 1.0.

Related

how to do an if like statement or equivalent in XSLT

Is it possible to write out XML based on an "if" or "like" statement, or its equivalent, in XSLT?
I have an element named "cust_code".
If the element's value starts with "HE", I want to write it out; otherwise I want to skip to the next one.
Is it possible?
If statements exist in XSLT.
<xsl:if test="...">
...
</xsl:if>
But this is a simple if, with no alternative.
If you want an equivalent to if ... else ... or switch ... case ..., you need to use the following:
<xsl:choose>
<xsl:when test="...">
</xsl:when>
<xsl:otherwise>
</xsl:otherwise>
</xsl:choose>
You can have as many when cases as necessary.
Links: w3school - if and w3school - choose.
As to having an element starting with a specific string, look at the function starts-with. You can find a good example in this SO answer (just omit the not from the main answer, as their test was to find strings not starting with a particular string). You can also look at this answer for more information.
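Putting xsl:if and starts-with together, a minimal sketch for the cust_code case could look like this. It assumes XSLT 1.0 and that the element should be copied unchanged when it qualifies; the surrounding document structure is not known, so only the one template is shown.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="cust_code">
    <!-- Write the element out only when its value starts with "HE" -->
    <xsl:if test="starts-with(., 'HE')">
      <xsl:copy-of select="."/>
    </xsl:if>
    <!-- Otherwise nothing is written and processing moves on -->
  </xsl:template>

</xsl:stylesheet>
```

Note that starts-with is case-sensitive, so a value beginning with "he" would not be written.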

How to use a variable value

The variable behaviour here does not work as expected.
I have a variable named fonttag with a value that is an HTML line with both start and end tags and a divider.
<xsl:variable name="fonttag">
<font face="ANGSANA NEW" size="12">|</font>
</xsl:variable>
When I try to use it, to get part of the string back, I get an empty string:
<xsl:value-of select="substring-before($fonttag ,'|')"/>
Where I expected the substring :
<font face="ANGSANA NEW" size="12">
Similarly the
<xsl:value-of select="$fonttag"/>
returns nothing, although
<xsl:copy-of select="$fonttag"/>
returns the whole string. Is there another way to achieve the expected result?
A follow-up question: is it possible to nest xsl select tags like this? (I cannot get it to work either.)
<xsl:copy-of select="substring-before( <xsl:copy-of select="$fonttag"/>,'|')"/>
Thanks
I am afraid you misunderstand how XSLT works. Your variable does not contain the string "<font face="ANGSANA NEW" size="12">|</font>". It contains the element font, with two attributes, and the string value of "|". The xsl:value-of instruction, as well as any string function such as substring(), only address the string value of the given expression.
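If the markup really is wanted as text, one possible workaround is to escape it so that the variable holds a string rather than an element. This is only a sketch under two assumptions: the processor supports disable-output-escaping (it is an optional feature), and writing an unclosed tag into the output is acceptable, since the result is then no longer guaranteed to be well-formed.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- The markup is escaped, so the variable holds plain text, not a font element -->
  <xsl:variable name="fonttag">&lt;font face="ANGSANA NEW" size="12"&gt;|&lt;/font&gt;</xsl:variable>

  <xsl:template match="/">
    <!-- The string value now includes the tag text, so substring-before can find it -->
    <xsl:value-of select="substring-before($fonttag, '|')"
                  disable-output-escaping="yes"/>
  </xsl:template>

</xsl:stylesheet>
```

Building the font element directly in the template where it is needed is usually the cleaner design, because the processor then keeps the output well-formed.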

Substring before throwing error

I have the XML below
<?xml version="1.0" encoding="UTF-8"?>
<body>
<p>Industrial drawing: Any creative composition</p>
<p>Industrial drawing: Any creative<fn>
<fnn>4</fnn>
<fnt>
<p>ftn1"</p>
</fnt>
</fn> composition
</p>
</body>
and the XSL below.
<xsl:template match="p">
<xsl:choose>
<xsl:when test="contains(substring-before(./text(),' '),'Article')">
<xsl:text>sect3</xsl:text>
<xsl:value-of select="./text()"/>
</xsl:when>
<xsl:when test="contains(substring-before(./b/text(),' '),'Section')">
<xsl:text> Sect 2</xsl:text>
<xsl:value-of select="./text()"/>
</xsl:when>
<xsl:when test="contains(substring-before(./b/text(),' '),'Chapter')">
<xsl:text> Sect 1</xsl:text>
<xsl:value-of select="./text()"/>
</xsl:when>
<xsl:otherwise>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
Here my XSL is working fine for <p>Industrial drawing: Any creative composition</p> but for the below Case
<p>Industrial drawing: Any creative<fn>
<fnn>4</fnn>
<fnt>
<p>ftn1"</p>
</fnt>
</fn> composition
</p>
it throws the error below.
XSLT 2.0 Debugging Error: Error: file:///C:/Users/u0138039/Desktop/Proview/ASAK/DIFC/XSLT/tabel.xslt:38: Wrong occurrence to match required sequence type - Details: - XPTY0004: The supplied sequence ('2' item(s)) has the wrong occurrence to match the sequence type xs:string ('zero or one')
Please let me know how I can fix this and grab the required text.
Thanks
The second p element in your example XML has two child text nodes, one containing "Industrial drawing: Any creative" and the other containing a space, "composition", a newline and another six spaces. In XSLT 1.0 it is legal to apply a function that expects a string to an argument that is a set of more than one node; the behaviour is to take the value of the first node and ignore all the others. But in 2.0 it is a type error to pass two nodes to a function that expects a single value for its parameter.
But in this case I doubt that you really need to use text() at all - if all you care about is seeing whether the string "Article" occurs anywhere within the first word under the p (including when this is nested inside another element) then you can simply use .:
<xsl:when test="contains(substring-before(.,' '),'Article')">
(or better still, use predicates to separate the different conditions into their own templates, with one template matching "Article" paragraphs, another matching "Section" paragraphs, etc.)
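The predicate-based layout mentioned above might be sketched as follows, assuming XSLT 2.0. The b[1] step is an assumption that only the first b child matters; it also keeps the argument to a single node. Unlike xsl:choose, which takes the first branch that matches, a paragraph that satisfies more than one of these predicates would be an ambiguous match, so explicit priority attributes may be needed in that case.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- First word of the paragraph's string value contains "Article" -->
  <xsl:template match="p[contains(substring-before(., ' '), 'Article')]">
    <xsl:text>sect3</xsl:text>
    <xsl:value-of select="text()"/>
  </xsl:template>

  <!-- First word of the first b child contains "Section" -->
  <xsl:template match="p[contains(substring-before(b[1], ' '), 'Section')]">
    <xsl:text> Sect 2</xsl:text>
    <xsl:value-of select="text()"/>
  </xsl:template>

  <!-- First word of the first b child contains "Chapter" -->
  <xsl:template match="p[contains(substring-before(b[1], ' '), 'Chapter')]">
    <xsl:text> Sect 1</xsl:text>
    <xsl:value-of select="text()"/>
  </xsl:template>

  <!-- Any other paragraph produces no output, like the empty xsl:otherwise -->
  <xsl:template match="p"/>

</xsl:stylesheet>
```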
The p element in your example has several text nodes, so the expression ./text() creates a sequence. You cannot apply a string function to a sequence; you must convert it to a string first. Instead of:
test="contains(substring-before(./text(),' '),'Article')"
try:
test="contains(substring-before(string-join(text(), ''), ' '), 'Article')"

XSLT compound condition that uses ends-with

and I have some code now that says:
<xsl:choose>
<xsl:when test="$Admin = 2">
<img src="../Lists/Announcement/Attachments/1/Banner.jpg" style="height:189px; width:568px;" title="{@Title};" />
</xsl:when>
<xsl:otherwise>
<img src="../Lists/Announcement/Attachments/{@ID}/Banner.jpg" style="height:189px; width:568px;" title="{@Title};" />
</xsl:otherwise>
</xsl:choose>
I would like to compound the condition so that it also accepts an attachment called banner.png, and shows banner.png in that case.
If I use the substring function and give it a negative number will it count backwards from the end of the string?
No. (What made you think it would? Did you actually look for a specification? Did you try it? Guessing, and asking on SO whether your guess is correct, doesn't sound like a very efficient way of getting things done.)
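For reference, a start position below 1 in substring simply refers to positions before the start of the string; it does not count from the end. A suffix test can still be written by computing the start position from the string length. The sketch below assumes XSLT 1.0; the parameter file and the variable suffix are hypothetical names used only for illustration. In XSLT 2.0 the same test can be written as ends-with($file, '.png').

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- $file is a hypothetical attachment name used only for illustration -->
  <xsl:param name="file" select="'Banner.png'"/>
  <xsl:variable name="suffix" select="'.png'"/>

  <xsl:template match="/">
    <xsl:choose>
      <!-- Take the last string-length($suffix) characters and compare them -->
      <xsl:when test="substring($file, string-length($file) - string-length($suffix) + 1) = $suffix">
        <xsl:text>png</xsl:text>
      </xsl:when>
      <xsl:otherwise>
        <xsl:text>other</xsl:text>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

The comparison is case-sensitive, so '.PNG' would need its own test or a translate() call to lower-case the name first.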

How do I convert strings starting with numbers to numeric data in XSLT?

Given the following XML:
<table>
<col width="12pt"/>
<col width="24pt"/>
<col width="12pt"/>
<col width="48pt"/>
</table>
How can I convert the width attributes to numeric values that can be used in mathematical expressions? So far, I have used substring-before to do this. Here is an example template (XSLT 2.0 only) that shows how to sum the values:
<xsl:template match="table">
<xsl:text>Col sum: </xsl:text>
<xsl:value-of select="sum(
for $w
in col/@width
return number(substring-before($w, 'pt'))
)"/>
</xsl:template>
Now my questions:
Is there a more efficient way to do the conversion than substring-before?
What if I don't know the text after the numbers? Any way to do it without using regular expressions?
This is horrible, but depending on just how much you know about the potential set of non-numeric characters, you could strip them with translate():
translate("12jfksjkdfjskdfj", "abcdefghijklmnopqrstuvwxyz", "")
returns
"12"
which you can then pass to number() as currently.
(I said it was horrible. Note that translate() is case-sensitive, too.)
I found this answer from Dimitre Novatchev that provides a very clever XPath solution that doesn't use regex:
translate(., translate(.,'0123456789', ''), '')
It uses the nested translate to strip all the digits from the string, which yields all the other characters. The outer translate then uses those characters as the set to strip out, so only the digits are returned.
Applied to your template:
<xsl:template match="table">
<xsl:text>Col sum: </xsl:text>
<xsl:value-of select="sum(
for $w
in col/@width
return number(translate($w, translate($w,'0123456789', ''), ''))
)"/>
</xsl:template>
If you are using XSLT 2.0 is there a reason why you want to avoid using regex?
The simplest solution would probably be to use the replace function with a regex pattern that matches any non-numeric character and replaces it with an empty string:
replace($w,'[^0-9]','')
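Applied to the template from the question, a sketch assuming XSLT 2.0 could look like this. The pattern '[^0-9]' also removes decimal points, so for widths such as 12.5pt a pattern like '[^0-9.]' would be the safer assumption, and a width with no digits at all would turn into NaN.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="table">
    <xsl:text>Col sum: </xsl:text>
    <!-- Strip every non-digit from each width, then convert and add up -->
    <xsl:value-of select="sum(
        for $w in col/@width
        return number(replace($w, '[^0-9]', ''))
      )"/>
  </xsl:template>

</xsl:stylesheet>
```

For the four sample widths the sum is 12 + 24 + 12 + 48 = 96.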