XSL if test starts-with

Given this piece of XML
<dc:date>info:eu-repo/date/embargoEnd/2013-06-12</dc:date>
<dc:date>2012-07-04</dc:date>
I need XSL to output only the year from the string that does not start with info:eu-repo.
I'm trying it this way, but it doesn't work. Am I using the for-each incorrectly?
<xsl:if test="not(starts-with('dc:date', 'info:eu-repo'))">
<xsl:for-each select="dc:date">
<publicationYear>
<xsl:variable name="date" select="."/>
<xsl:value-of select="substring($date, 0, 5)"/>
</publicationYear>
</xsl:for-each>
</xsl:if>

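I guess the problem is the quotes around 'dc:date' in your starts-with() call: they make it a string literal, so the test compares the literal text dc:date rather than the element's value. The test also has to run once per date, so iterate over the dates slightly differently: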
<xsl:for-each select="dc:date">
<xsl:variable name="date" select="."/>
<xsl:if test="not(starts-with($date, 'info:eu-repo'))">
<publicationYear>
<xsl:value-of select="substring($date, 1, 4)"/>
</publicationYear>
</xsl:if>
</xsl:for-each>

I. Exactly one element with the desired property:
Assuming the elements in the provided XML fragment are children of the current node, use:
substring-before(*[not(starts-with(., 'info:eu-repo'))], '-')
XSLT - based verification:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/*">
<xsl:copy-of select=
"substring-before(*[not(starts-with(., 'info:eu-repo'))], '-') "/>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied to the following XML document (the provided fragment wrapped in a single top element and the namespace declared):
<t xmlns:dc="some:dc">
<dc:date>info:eu-repo/date/embargoEnd/2013-06-12</dc:date>
<dc:date>2012-07-04</dc:date>
</t>
the XPath expression is evaluated off the top element and the result of this evaluation is copied to the output:
2012
II. More than one element with the desired property:
In this case it isn't possible to produce the desired data with a single XPath 1.0 expression.
This XSLT transformation:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="*[not(starts-with(., 'info:eu-repo'))]/text()">
<xsl:copy-of select="substring-before(., '-') "/>
==============
</xsl:template>
<xsl:template match="text()"/>
</xsl:stylesheet>
when applied on this XML document:
<t xmlns:dc="some:dc">
<dc:date>info:eu-repo/date/embargoEnd/2013-06-12</dc:date>
<dc:date>2012-07-04</dc:date>
<dc:date>info:eu-repo/date/embargoEnd/2013-06-12</dc:date>
<dc:date>2011-07-05</dc:date>
</t>
produces the wanted, correct result:
2012
==============
2011
==============
III. XPath 2.0 one-liner
*[not(starts-with(., 'info:eu-repo'))]/substring-before(., '-')
When this XPath 2.0 expression is evaluated off the top element of the last XML document (nearest above), the wanted years are produced:
2012 2011
XSLT 2.0 - based verification:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/*">
<xsl:sequence select=
"*[not(starts-with(., 'info:eu-repo'))]/substring-before(., '-')"/>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied on the last XML document, the XPath expression is evaluated and the result of this evaluation is copied to the output:
2012 2011
IV. The Most General and Difficult case:
Now, let's have this XML document:
<t xmlns:dc="some:dc">
<dc:date>info:eu-repo/date/embargoEnd/2013-06-12</dc:date>
<dc:date>2012-07-04</dc:date>
<dc:date>info:eu-repo/date/embargoEnd/2013-06-12</dc:date>
<dc:date>2011-07-05</dc:date>
<dc:date>*/date/embargoEnd/2014-06-12</dc:date>
</t>
We still want to get the year part of all dc:date elements whose string value doesn't start with 'info:eu-repo'. However, none of the previous solutions works correctly with the last dc:date element above.
Remarkably, the wanted data can still be produced by a single XPath 2.0 expression:
for $s in
*[not(starts-with(., 'info:eu-repo'))]/tokenize(.,'/')[last()]
return
substring-before($s, '-')
When this expression is evaluated off the top element of the above XML document, the wanted, correct result is produced:
2012 2011 2014
And this is the XSLT 2.0 - based verification:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/*">
<xsl:sequence select=
"for $s in
*[not(starts-with(., 'info:eu-repo'))]/tokenize(.,'/')[last()]
return
substring-before($s, '-')
"/>
</xsl:template>
</xsl:stylesheet>

You could also use a template-match to get what you want, for example:
<xsl:template match="date[not(starts-with(.,'info:eu-repo'))]">
<xsl:value-of select="."/>
</xsl:template>
I have this input XML:
<?xml version="1.0" encoding="UTF-8"?>
<list>
<date>info:eu-repo/date/embargoEnd/2013-06-12</date>
<date>2012-07-04</date>
</list>
and apply this XSLT to it:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:template match="text() | @*"/>
<xsl:template match="/">
<xsl:apply-templates />
</xsl:template>
<xsl:template match="date[not(starts-with(.,'info:eu-repo'))]">
<xsl:value-of select="."/>
</xsl:template>
</xsl:stylesheet>
and I get this output:
<?xml version="1.0" encoding="UTF-8"?>2012-07-04
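The template above prints the whole date. To get only the year wrapped in publicationYear, as the question asks, a variation along these lines should work. This is a sketch that assumes the same un-namespaced date input; with dc:date the prefix would have to be declared in the stylesheet and used in the match pattern.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <!-- Suppress all text and attributes that no other template handles -->
  <xsl:template match="text() | @*"/>

  <!-- Only dates that do not start with the info:eu-repo prefix -->
  <xsl:template match="date[not(starts-with(., 'info:eu-repo'))]">
    <publicationYear>
      <!-- XPath positions are 1-based: take the first four characters -->
      <xsl:value-of select="substring(., 1, 4)"/>
    </publicationYear>
  </xsl:template>
</xsl:stylesheet>
```

The built-in element rule still walks down from the root to the date elements, so no explicit template for / is needed here.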

Related

Splitting a string based on the delimiter and moving them under new nodes using XSLT

I am using XSLT 1.0. I have the following xml input:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<groupLOB>M1 M2 M3 M4</groupLOB>
</root>
The <groupLOB> element has the value M1 M2 M3 M4. I want to split this value on the space delimiter (' ') and store each piece under a new element. The resulting XML should look like this:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<One>M1</One>
<Two>M2</Two>
<Three>M3</Three>
<Four>M4</Four>
</root>
I tried with the following XSLT, but it's not giving me the required output, i.e. I am not sure how to move the split values under the new tags.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="text" />
<xsl:template match="/*">
<xsl:value-of select="translate(., ' ', '&#10;')" />
</xsl:template>
</xsl:stylesheet>
Anybody has any idea on how to do that?

The XSLT 2.0 solution might be:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
<xsl:template match="/root">
<xsl:copy>
<xsl:for-each select="tokenize(groupLOB,' ')">
<xsl:variable name="elementName">
<xsl:number value="position()" format="Ww"/>
</xsl:variable>
<xsl:element name="{$elementName}">
<xsl:value-of select="."/>
</xsl:element>
</xsl:for-each>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
And in XSLT 3.0:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">
<xsl:template match="/root">
<xsl:copy>
<xsl:for-each select="tokenize(groupLOB,' ')">
<xsl:element name="{format-integer(position(),'Ww')}">
<xsl:value-of select="."/>
</xsl:element>
</xsl:for-each>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Both output:
<root>
<One>M1</One>
<Two>M2</Two>
<Three>M3</Three>
<Four>M4</Four>
</root>
In XSLT 1.0 you will need to tokenize by means of an extension function such as EXSLT tokenize(), or with a recursive template (like Jeni Tennison's XSLT implementation of EXSLT tokenize). The bigger task is the conversion from numbers to words. Luckily, Saxon is open source, so its Java implementation can be translated into an XSLT implementation. This might take time, but it is straightforward.
Check the English implementation shipped with Saxon at https://dev.saxonica.com/repos/archive/opensource/trunk/bj/net/sf/saxon/number/Numberer_en.java
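As a rough illustration of the recursive-template route, here is a minimal XSLT 1.0 sketch. The template name split, its parameters, and the hard-coded lookup that maps positions 1 to 4 onto One to Four (with an ItemN fallback) are assumptions made for this example; a general number-to-words conversion would still have to be ported as described above.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/root">
    <xsl:copy>
      <xsl:call-template name="split">
        <xsl:with-param name="text" select="normalize-space(groupLOB)"/>
      </xsl:call-template>
    </xsl:copy>
  </xsl:template>

  <!-- Hypothetical helper: emits one element per space-separated token -->
  <xsl:template name="split">
    <xsl:param name="text"/>
    <xsl:param name="pos" select="1"/>
    <xsl:if test="string-length($text) &gt; 0">
      <!-- Appending a space guarantees that substring-before finds a delimiter -->
      <xsl:variable name="token" select="substring-before(concat($text, ' '), ' ')"/>
      <!-- Assumption: only positions 1-4 have word names; others become ItemN -->
      <xsl:variable name="name">
        <xsl:choose>
          <xsl:when test="$pos = 1">One</xsl:when>
          <xsl:when test="$pos = 2">Two</xsl:when>
          <xsl:when test="$pos = 3">Three</xsl:when>
          <xsl:when test="$pos = 4">Four</xsl:when>
          <xsl:otherwise><xsl:value-of select="concat('Item', $pos)"/></xsl:otherwise>
        </xsl:choose>
      </xsl:variable>
      <xsl:element name="{$name}">
        <xsl:value-of select="$token"/>
      </xsl:element>
      <!-- Recurse on the remainder; an empty remainder ends the recursion -->
      <xsl:call-template name="split">
        <xsl:with-param name="text" select="substring-after($text, ' ')"/>
        <xsl:with-param name="pos" select="$pos + 1"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

normalize-space() collapses repeated spaces first, so the recursion never produces empty tokens.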

XSLT manipulate text scattered across various nodes

Input file as follows:
<?xml version="1.0" encoding="UTF-8"?>
<!-- lower UPPER case -->
<document>
<rubbish> rubbish </rubbish>
<span class='lower'>
lower
<span class='upper'> upper </span>
case
</span>
</document>
Wanted output:
lower UPPER case
I know how to get the text included in the outer span with value-of, but this also
includes the string "upper" unchanged which is not what I want. I do not know how
to manipulate the text in the inner span and insert it in the middle of
the other text.
Failed attempt:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:output method="text" indent="no"/>
<xsl:template match="/">
<xsl:for-each select="//span[@class = 'lower']">
<xsl:if test="span/@class = 'upper'">
<xsl:text>do something</xsl:text> <!--TO DO -->
</xsl:if>
<xsl:value-of select="."/>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>

You need to take a recursive approach here, for example:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xsl:output method="text" encoding="UTF-8"/>
<xsl:template match="text()[parent::span]">
<xsl:choose>
<xsl:when test="../@class='upper'">
<xsl:value-of select="translate(., 'abcdefghijklmnopqrstuvwxyz', 'ABCDEFGHIJKLMNOPQRSTUVWXYZ')" />
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="." />
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<xsl:template match="text()"/>
</xsl:stylesheet>
To understand how this works, read up on built-in template rules: http://www.w3.org/TR/xslt/#built-in-rule
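For reference, the XSLT 1.0 built-in rules for the default mode behave roughly as if the stylesheet contained the templates below (written out here only for illustration). Elements simply pass processing on to their children, which is why the text() templates above are reached inside nested span elements without any explicit recursion.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Built-in rule for the root node and elements: recurse into children -->
  <xsl:template match="*|/">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Built-in rule for text and attribute nodes: output the string value -->
  <xsl:template match="text()|@*">
    <xsl:value-of select="."/>
  </xsl:template>

  <!-- Built-in rule for comments and processing instructions: output nothing -->
  <xsl:template match="processing-instruction()|comment()"/>
</xsl:stylesheet>
```

Any template you write for text() overrides the second rule, which is how the answer suppresses the unwanted text.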

The following approach does away with the <choose> and completely pushes the problem down to the match expression:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes" />
<xsl:template match="text()"/>
<xsl:template match="text()[parent::span[@class = 'upper']]">
<xsl:value-of select="translate(., 'abcdefghijklmnopqrstuvwxyz', 'ABCDEFGHIJKLMNOPQRSTUVWXYZ')"/>
</xsl:template>
<xsl:template match="text()[parent::span[@class = 'lower']]">
<xsl:value-of select="translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz')"/>
</xsl:template>
</xsl:stylesheet>

How to escape the # character in XSLT

$binding-path contains something like Contact!ShowsInterest which should be converted to Contact/#ShowsInterest
This is what i tried so far:
<xsl:variable name="bindpath" select="translate($binding-path, '!','/#')" />
<xsl:value-of select="concat('{Binding XPath=',$bindpath,'}')"/>
or
<xsl:variable name="bindpath" select="translate($binding-path, '!','/#')" />
<xsl:value-of select="concat('{Binding XPath=',$bindpath,'}')"/>
But no matter what i try, the result is always Contact/ShowsInterest

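The translate() function can only replace every occurrence of a single character with a single character (or with nothing, thus deleting it). Here '!' is mapped to '/', and the extra character in the third argument is ignored because it has no counterpart in the second argument.
I. XSLT 1.0
Use: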
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:variable name="binding-path" select="'Contact!ShowsInterest'"/>
<xsl:template match="/">
<xsl:variable name="bindingpath">
<xsl:value-of select="substring-before($binding-path, '!')"/>
<xsl:text>/#</xsl:text>
<xsl:value-of select="substring-after($binding-path, '!')"/>
</xsl:variable>
<xsl:value-of select="$bindingpath"/>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied on any XML document (not used), the wanted, correct result is produced:
Contact/#ShowsInterest
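Note that the stylesheet above rewrites only the first '!'. If $binding-path may contain several, a recursive named template is one option in XSLT 1.0. This is a minimal sketch; the template name replace-all and its parameter names are hypothetical, and it assumes the from parameter is never the empty string.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:variable name="binding-path" select="'Contact!ShowsInterest'"/>

  <xsl:template match="/">
    <xsl:call-template name="replace-all">
      <xsl:with-param name="text" select="$binding-path"/>
      <xsl:with-param name="from" select="'!'"/>
      <xsl:with-param name="to" select="'/#'"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Hypothetical helper: replaces every occurrence of $from in $text with $to -->
  <xsl:template name="replace-all">
    <xsl:param name="text"/>
    <xsl:param name="from"/>
    <xsl:param name="to"/>
    <xsl:choose>
      <xsl:when test="contains($text, $from)">
        <!-- Emit the part before the match, then the replacement -->
        <xsl:value-of select="substring-before($text, $from)"/>
        <xsl:value-of select="$to"/>
        <!-- Recurse on whatever follows the match -->
        <xsl:call-template name="replace-all">
          <xsl:with-param name="text" select="substring-after($text, $from)"/>
          <xsl:with-param name="from" select="$from"/>
          <xsl:with-param name="to" select="$to"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <!-- No more matches: emit the remainder unchanged -->
        <xsl:value-of select="$text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```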
II. XSLT 2.0
Use the XPath 2.0 replace() function:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:variable name="binding-path" select="'Contact!ShowsInterest'"/>
<xsl:template match="/">
<xsl:variable name="bindingpath" select="replace($binding-path, '!', '/#')"/>
<xsl:value-of select="$bindingpath"/>
</xsl:template>
</xsl:stylesheet>
This transformation produces the same correct result:
Contact/#ShowsInterest

Reading the entries in a loop and removing the duplicate entries using XSL

I am new to XSL and I am not sure whether it is possible to read the data in an XML element, store it in an array or something similar, and then remove the duplicates with some kind of distinct option.
eg.
<local>
<ID>
<fruit>apple</fruit>
<fruit>orange</fruit>
</ID>
<ID>
<fruit>apple</fruit>
<fruit>mango</fruit>
</ID>
</local>
Here I loop over local and need to read all the ID elements underneath it and display the fruits. In this case there are 4 fruit entries and one of them is duplicated, so I want to display only the unique entries. Is there any way to get this done using XSLT?
<xsl:for-each select="Local">
<xsl:variable name="fruits">
<xsl:for-each select="set:distinct(ID/fruit)">
<xsl:copy-of select="."/>
</xsl:for-each>
</xsl:variable>
</xsl:for-each>

I. This XSLT 1.0 transformation:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:key name="kFruitByName" match="fruit" use="."/>
<xsl:template match="/">
<xsl:copy-of select=
"/*/*/fruit
[generate-id()
=
generate-id(key('kFruitByName', .)[1])
]"/>
</xsl:template>
</xsl:stylesheet>
when applied on the provided XML document:
<local>
<ID>
<fruit>apple</fruit>
<fruit>orange</fruit>
</ID>
<ID>
<fruit>apple</fruit>
<fruit>mango</fruit>
</ID>
</local>
produces the wanted, correct result:
<fruit>apple</fruit>
<fruit>orange</fruit>
<fruit>mango</fruit>
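Explanation: This uses the Muenchian method for grouping. The key indexes every fruit by its string value, and a fruit is copied only if it is the first node the key returns for that value.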
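To make the role of the key more visible, here is a sketch that lists each distinct fruit together with how often it occurs. The name: count output format is an assumption for illustration; the key definition is the same as in the transformation above.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:key name="kFruitByName" match="fruit" use="."/>

  <xsl:template match="/">
    <!-- Keep only the first fruit of each group of equal string values -->
    <xsl:for-each select="/*/*/fruit[generate-id() = generate-id(key('kFruitByName', .)[1])]">
      <xsl:value-of select="."/>
      <xsl:text>: </xsl:text>
      <!-- key() returns every fruit in the group, so count() gives the group size -->
      <xsl:value-of select="count(key('kFruitByName', .))"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

For the provided document this should report apple twice and orange and mango once each.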
II. XSLT 2.0 Solution:
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="/*">
<xsl:for-each-group select="*/fruit" group-by=".">
<xsl:sequence select="."/>
</xsl:for-each-group>
</xsl:template>
</xsl:stylesheet>
when this transformation is applied on the same XML document (above), again the same correct result is produced:
<fruit>apple</fruit>
<fruit>orange</fruit>
<fruit>mango</fruit>

Replace special characters in XSLT

I want to remove all characters other than letters from a string in XSLT. For example:
<Name>O'Niel</Name> = <Name>ONiel</Name>
<Name>St Peter</Name> = <Name>StPeter</Name>
<Name>A.David</Name> = <Name>ADavid</Name>
Can we use Regular Expression in XSLT to do this? Which is right way to implement this?
EDIT: This needs to be done in XSLT 1.0.

There is a pure XSLT way to do this.
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>
<xsl:variable name="vAllowedSymbols"
select="'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'"/>
<xsl:template match="node() | @*">
<xsl:copy>
<xsl:apply-templates select="node() | @*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="text()">
<xsl:value-of select="
translate(
.,
translate(., $vAllowedSymbols, ''),
''
)
"/>
</xsl:template>
</xsl:stylesheet>
Result against this sample:
<t>
<Name>O'Niel</Name>
<Name>St Peter</Name>
<Name>A.David</Name>
</t>
Will be:
<t>
<Name>ONiel</Name>
<Name>StPeter</Name>
<Name>ADavid</Name>
</t>

Here's a 2.0 option:
EDIT: Sorry...the 1.0 requirement was added after I started on my answer.
XML
<?xml version="1.0" encoding="UTF-8"?>
<doc>
<Name>O'Niel</Name>
<Name>St Peter</Name>
<Name>A.David</Name>
</doc>
XSLT 2.0
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="*|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="text()">
<xsl:value-of select="replace(.,'[^a-zA-Z]','')"/>
</xsl:template>
</xsl:stylesheet>
Output
<?xml version="1.0" encoding="UTF-8"?>
<doc>
<Name>ONiel</Name>
<Name>StPeter</Name>
<Name>ADavid</Name>
</doc>
Here are a couple more ways of using replace()...
Using "i" (case-insensitive mode) flag:
replace(.,'[^A-Z]','','i')
Using category escapes:
replace(.,'\P{L}','')

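I just created a function based on the code in this example. Note that xsl:function requires XSLT 2.0, and the lancet prefix must be bound to a namespace in the stylesheet: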
<xsl:function name="lancet:stripSpecialChars">
<xsl:param name="string" />
<xsl:variable name="AllowedSymbols" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789()*%$#@!~&lt;>,.?[]=- + /\ '"/>
<xsl:value-of select="
translate(
$string,
translate($string, $AllowedSymbols, ''),
' ')
"/>
</xsl:function>
and an example of the usage would be as follows:
<xsl:value-of select="lancet:stripSpecialChars($string)"/>

The quickest way is <xsl:value-of select="translate(Name,translate(Name,'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ',''),'')" />
The inner translate removes the letters (the characters we want to keep), so its result contains only the unwanted characters. The outer translate then removes those unwanted characters from the original string.
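The two steps of this double translate can be made explicit with intermediate variables. This is a sketch for a single hard-coded string; the variable names are chosen for this example only.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:variable name="letters"
                select="'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>
  <!-- Sample input taken from the question -->
  <xsl:variable name="input" select="'St Peter'"/>

  <xsl:template match="/">
    <!-- Step 1: delete every letter, leaving only the unwanted characters -->
    <xsl:variable name="unwanted" select="translate($input, $letters, '')"/>
    <!-- Step 2: delete those unwanted characters from the original string -->
    <xsl:value-of select="translate($input, $unwanted, '')"/>
  </xsl:template>
</xsl:stylesheet>
```

For 'St Peter' the intermediate value is a single space, so the second call strips that space and leaves StPeter.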
