XSLT regular expression to remove sequences of text

I have an XML, something like this:
<?xml version="1.0" encoding="UTF-8"?>
<earth>
<computer>
<parts>;;remove;;This should stay;;remove too;;This stay;;yeah also remove;;this stay </parts>
</computer>
</earth>
I want to create an XSLT 2.0 transform that removes all text which starts and ends with ;; so that the result is:
<?xml version="1.0" encoding="utf-8"?>
<earth>
<computer>
<parts>This should stay This stay this stay </parts>
</computer>
</earth>
I tried something like this, but had no luck:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:fn="http://www.w3.org/2005/xpath-functions"
exclude-result-prefixes="fn">
<xsl:output encoding="utf-8" method="xml" indent="yes" />
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()" />
</xsl:copy>
</xsl:template>
<xsl:template match="parts">
<xsl:element name="parts" >
<xsl:value-of select="replace(., ';;.*;;','')" />
</xsl:element>
</xsl:template>
</xsl:stylesheet>

Wow, what a dumb way to mark up text. You have XML at your disposal, why not use it? And even if marking it this way, why not use different symbols for opening and closing the marked parts?
Anyway, I believe this returns the expected result:
XSLT 2.0
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<!-- identity transform -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="parts">
<xsl:copy>
<xsl:value-of select="replace(., ';;.+?;;', '')" />
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
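The only real change from the attempt in the question is the quantifier: .* is greedy and runs from the first ;; to the last one, while .+? is reluctant and stops at the nearest closing ;;. A minimal sketch to see the difference, assuming an XSLT 2.0 processor; the variable name s and the shortened sample string are only for illustration:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- hypothetical sample string, shortened from the question's input -->
  <xsl:variable name="s"
      select="';;remove;;This should stay;;remove too;;This stay'"/>

  <xsl:template match="/">
    <!-- greedy: one match spanning the first ;; to the last ;;
         leaves only: This stay -->
    <xsl:value-of select="replace($s, ';;.*;;', '')"/>
    <xsl:text>&#10;</xsl:text>
    <!-- reluctant: each ;;...;; run is matched separately
         leaves: This should stayThis stay -->
    <xsl:value-of select="replace($s, ';;.+?;;', '')"/>
  </xsl:template>
</xsl:stylesheet>
```

Note that the kept pieces are joined without any separator, so the result does not contain the spaces shown in the question's expected output.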

Another approach would be to tokenize using ";;" as the separator, then drop all even-numbered tokens:
<xsl:template match="parts">
<parts>
<xsl:value-of select="tokenize(.,';;')[position() mod 2 = 1]"
separator=""/>
</parts>
</xsl:template>

XSLT 1.0
For this kind of thing I'd use recursion. Using substring-before and substring-after you can get what comes before and after a certain character (or set of characters). All you need to do is keep recursing over the string until there are no more occurrences of the delimiter, as follows:
<xsl:template name="string-remove-between">
<xsl:param name="text" />
<xsl:param name="remove" />
<xsl:choose>
<xsl:when test="contains($text, $remove)">
<xsl:value-of select="substring-before($text,$remove)" />
<xsl:call-template name="string-remove-between">
<xsl:with-param name="text" select="substring-after(substring-after($text,$remove), $remove)" />
<xsl:with-param name="remove" select="$remove" />
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$text"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
Then you'd just call the template with your text and the section you want to remove:
<xsl:call-template name="string-remove-between">
<xsl:with-param name="text" select="parts"/>
<xsl:with-param name="remove">;;</xsl:with-param>
</xsl:call-template>
Note that there are two substring-after calls. This makes sure we skip past the second instance of the delimiter ';;', so we aren't pulling in the text in between.
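For completeness, here is a sketch of how the call could be wired into a full XSLT 1.0 stylesheet for the input in the question. It assumes the parts element has no attributes or child elements worth keeping, since only its string value is processed; the recursive template is the same one shown above.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

  <!-- identity transform: copy everything not matched elsewhere -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- rebuild parts from its string value with the ;;...;; runs removed -->
  <xsl:template match="parts">
    <xsl:copy>
      <xsl:call-template name="string-remove-between">
        <xsl:with-param name="text" select="."/>
        <xsl:with-param name="remove" select="';;'"/>
      </xsl:call-template>
    </xsl:copy>
  </xsl:template>

  <!-- output the text before each delimiter, then skip to after the next one -->
  <xsl:template name="string-remove-between">
    <xsl:param name="text"/>
    <xsl:param name="remove"/>
    <xsl:choose>
      <xsl:when test="contains($text, $remove)">
        <xsl:value-of select="substring-before($text, $remove)"/>
        <xsl:call-template name="string-remove-between">
          <xsl:with-param name="text"
              select="substring-after(substring-after($text, $remove), $remove)"/>
          <xsl:with-param name="remove" select="$remove"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

One thing to be aware of: if a delimiter is left unpaired, the second substring-after returns an empty string, so everything after that last delimiter is dropped.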

Related

XSLT mapping to remove double quotes which has PIPE delimited symbol inside

Experts, I need to write XSLT 1.0 code that removes the pipe delimiter symbol inside double quotes and also removes those double quotes.
Input:
<?xml version="1.0" encoding="utf-8"?>
<ns:MT_FILE>
<LN>
<LD>EXTRACT|"28|53"|1308026.7500|1176</LD>
</LN>
<LN>
<LD>DETAIL|1176|"LOS LE|OS PARRILLA"|Y|R||||</LD>
</LN>
</ns:MT_FILE>
Desired output:
<?xml version="1.0" encoding="utf-8"?>
<ns:MT_FILE>
<LN>
<LD>EXTRACT|2853|1308026.7500|1176</LD>
</LN>
<LN>
<LD>DETAIL|1176|LOS LE OS PARRILLA|Y|R||||</LD>
</LN>
</ns:MT_FILE>
The XSLT I used is below:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="*/text()">
<xsl:value-of select="translate(., '\&quot;', '')"/>
</xsl:template>
</xsl:stylesheet>
This XSLT removes all the double quotes from my input field, but not the pipe inside them. Please assist.
If it can be assumed that quotes will always come in pairs, you could do:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<!-- identity transform -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="text()">
<xsl:call-template name="process">
<xsl:with-param name="text" select="."/>
</xsl:call-template>
</xsl:template>
<xsl:template name="process">
<xsl:param name="text"/>
<xsl:choose>
<xsl:when test="contains($text, '&quot;')">
<xsl:value-of select="substring-before($text, '&quot;')"/>
<xsl:value-of select="translate(substring-before(substring-after($text, '&quot;'), '&quot;'), '|', '')"/>
<xsl:call-template name="process">
<xsl:with-param name="text" select="substring-after(substring-after($text, '&quot;'), '&quot;')"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$text"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
Since you tagged the question EXSLT:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0"
xmlns:regexp="http://exslt.org/regular-expressions"
exclude-result-prefixes="regexp">
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="LD/text()">
<xsl:value-of select="regexp:replace(., '(&quot;)([^|]+)\|([^&quot;]+)(&quot;)', 'g', '$2$3')"/>
</xsl:template>
</xsl:stylesheet>

XSLT: replace values with values found in another XML

I have a file called ori.xml:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<container>
<elA>
<el1>value1</el1>
<el2>value2</el2>
</elA>
<elB>
<el3>value3</el3>
<el4>value4</el4>
<el5>value5</el5>
</elB>
<elC>
<el6>value5</el6>
</elC>
</container>
</root>
and another one called modifs.xml:
<?xml version="1.0" encoding="UTF-8"?>
<els>
<el2>newvalue2</el2>
<el5>newvalue5</el5>
</els>
and I would like to obtain result.xml:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<container>
<elA>
<el1>value1</el1>
<el2>newvalue2</el2>
</elA>
<elB>
<el3>value3</el3>
<el4>value4</el4>
<el5>newvalue5</el5>
</elB>
<elC>
<el6>value5</el6>
</elC>
</container>
</root>
I'm a beginner in XSLT.
So I started to write a stylesheet with which I'm able to change value2 into newvalue2:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:param name="fileName" select="'modifs.xml'" />
<xsl:param name="modifs" select="document($fileName)" />
<xsl:param name="updateEl" >
<xsl:value-of select="$modifs/els/el2" />
</xsl:param>
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="//elA/el2">
<xsl:copy>
<xsl:apply-templates select="$updateEl" />
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
But now I have to modify this stylesheet so that it works out which elements are in modifs.xml and finds them in ori.xml. I don't know how to do that. Could you help, please?
I would use a key:
<xsl:key name="ref-change" match="els/*" use="local-name()"/>
<xsl:template match="*[key('ref-change', local-name(), $modifs)]">
<xsl:copy-of select="key('ref-change', local-name(), $modifs)"/>
</xsl:template>
However, the third argument of the key function is only supported in XSLT 2 and later. If you use an XSLT 1 processor, you need to move the logic into the template, which requires using for-each to "switch" the context document:
<xsl:template match="*">
<xsl:variable name="this" select="."/>
<xsl:for-each select="$modifs">
<xsl:choose>
<xsl:when test="key('ref-change', local-name($this))">
<xsl:copy-of select="key('ref-change', local-name($this))"/>
</xsl:when>
<xsl:otherwise>
<xsl:for-each select="$this">
<xsl:call-template name="identity"/>
</xsl:for-each>
</xsl:otherwise>
</xsl:choose>
</xsl:for-each>
</xsl:template>
Put name="identity" on your identity transformation template.
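Put together, the XSLT 1.0 variant might look like the sketch below. The parameter names fileName and modifs are taken from the question. The priority attribute on the second template is an addition that is not in the answer above: the patterns * and node() have the same default priority, so without it the result would depend on how the processor resolves the conflict.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:param name="fileName" select="'modifs.xml'"/>
  <xsl:param name="modifs" select="document($fileName)"/>

  <!-- index the replacement elements in modifs.xml by element name -->
  <xsl:key name="ref-change" match="els/*" use="local-name()"/>

  <!-- identity transform, named so it can also be called directly -->
  <xsl:template match="@* | node()" name="identity">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- explicit priority (an assumption of this sketch) so this rule
       wins over the identity template for elements -->
  <xsl:template match="*" priority="1">
    <xsl:variable name="this" select="."/>
    <!-- switch the context document so key() searches modifs.xml -->
    <xsl:for-each select="$modifs">
      <xsl:choose>
        <xsl:when test="key('ref-change', local-name($this))">
          <xsl:copy-of select="key('ref-change', local-name($this))"/>
        </xsl:when>
        <xsl:otherwise>
          <!-- switch back to the original element and copy it as usual -->
          <xsl:for-each select="$this">
            <xsl:call-template name="identity"/>
          </xsl:for-each>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```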

Resolving variables in XSLT

I'm having problems resolving variables in XSLT. I have a working XSL file with fixed values that I now want to make dynamic (the variable declarations will be moved outside the XSL file once it works). My current problem is using the variable $beginning in the starts-with function. This is how all my googling has led me to believe it should look, but it will not compile. It works the way I use it in the substring-after function. How should this be done?
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:variable name="oldRoot" select="'top'" />
<xsl:variable name="beginning" select="concat('$.',$oldRoot)" />
<xsl:variable name="newRoot" select="'newRoot'" />
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="bind/@ref[starts-with(., $beginning)]">
<xsl:attribute name="ref">
<xsl:text>$.newRoot.</xsl:text><xsl:value-of select="$oldRoot"></xsl:value-of>
<xsl:value-of select="substring-after(., $beginning)" />
</xsl:attribute>
</xsl:template>
</xsl:stylesheet>
In XSLT 1.0 it is an error for a template match pattern to contain a variable reference (see https://www.w3.org/TR/xslt#section-Defining-Template-Rules), so this line fails:
<xsl:template match="bind/@ref[starts-with(., $beginning)]">
(I believe some processors allow it, but if they followed the spec, they shouldn't. It is allowed in XSLT 2.0, though.)
What you can do is move the condition inside the template and handle it with an xsl:choose.
Try this XSLT:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:variable name="oldRoot" select="'top'" />
<xsl:variable name="beginning" select="concat('$.',$oldRoot)" />
<xsl:variable name="newRoot" select="'newRoot'" />
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="bind/@ref">
<xsl:choose>
<xsl:when test="starts-with(., $beginning)">
<xsl:attribute name="ref">
<xsl:text>$.newRoot.</xsl:text><xsl:value-of select="$oldRoot"></xsl:value-of>
<xsl:value-of select="substring-after(., $beginning)" />
</xsl:attribute>
</xsl:when>
<xsl:otherwise>
<xsl:copy-of select="." />
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>

Parsing text as is for selected nodes in XSLT

Input XML is this:
<input>
<foo>John&apos;s bar</foo>
<bar>test</bar>
<foobar>testing</foobar>
</input>
After XSL transformation:
<input>
<foo>John's bar</foo>
<bar>this_test</bar>
</input>
But the legacy system expects:
<foo>John&apos;s bar</foo>
not <foo>John's bar</foo>
So I want to retain the value under <foo> as is rather than let XSLT parse it.
I tried using <xsl:output method="text"/>, but with no luck.
I think the XML itself gets parsed when it is loaded, and XSLT just outputs the result as is.
If that's true, I at least want to escape it and produce &apos; regardless of whether it was &apos; or ' in the input XML.
XSLT that I tried is this:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:fo="http://www.w3.org/1999/XSL/Format">
<xsl:output method="xml"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="bar">
<xsl:copy>
<xsl:text>this_</xsl:text>
<xsl:apply-templates/>
</xsl:copy>
</xsl:template>
<xsl:template match="foobar"/>
</xsl:stylesheet>
If you are limited to XSLT 1.0, use disable-output-escaping="yes". This attribute can be used on xsl:text and xsl:value-of elements and it is deprecated in XSLT 2.0.
Stylesheet (XSLT 1.0)
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:variable name="vApos">'</xsl:variable>
<xsl:variable name="vAmp">&amp;</xsl:variable>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="bar">
<xsl:copy>
<xsl:text>this_</xsl:text>
<xsl:apply-templates/>
</xsl:copy>
</xsl:template>
<xsl:template match="foobar"/>
<xsl:template match="foo">
<xsl:variable name="rep">
<xsl:call-template name="replace-string">
<xsl:with-param name="text" select="."/>
<xsl:with-param name="replace" select="$vApos" />
<xsl:with-param name="with" select="concat($vAmp,'apos;')"/>
</xsl:call-template>
</xsl:variable>
<xsl:copy>
<xsl:value-of select="$rep" disable-output-escaping="yes"/>
</xsl:copy>
</xsl:template>
<xsl:template name="replace-string">
<xsl:param name="text"/>
<xsl:param name="replace"/>
<xsl:param name="with"/>
<xsl:choose>
<xsl:when test="contains($text,$replace)">
<xsl:value-of select="substring-before($text,$replace)"/>
<xsl:value-of select="$with"/>
<xsl:call-template name="replace-string">
<xsl:with-param name="text" select="substring-after($text,$replace)"/>
<xsl:with-param name="replace" select="$replace"/>
<xsl:with-param name="with" select="$with"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$text"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
The XSLT 1.0 solution makes use of advice given by Dimitre Novatchev and of an answer by Mads Hansen.
The XSLT 2.0 solution is more elegant: use a character map to control the serialization of output. Make sure you escape the ampersand character as well (&amp;apos; instead of &apos;).
Stylesheet (XSLT 2.0)
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes" use-character-maps="apo"/>
<xsl:strip-space elements="*"/>
<xsl:character-map name="apo">
<xsl:output-character character="&apos;" string="&amp;apos;"/>
</xsl:character-map>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="bar">
<xsl:copy>
<xsl:text>this_</xsl:text>
<xsl:apply-templates/>
</xsl:copy>
</xsl:template>
<xsl:template match="foobar"/>
</xsl:stylesheet>
Output (Saxon 9.5 for 2.0, Xalan 2.7.1 for 1.0)
<?xml version="1.0" encoding="UTF-8"?>
<input>
<foo>John&apos;s bar</foo>
<bar>this_test</bar>
</input>

replace a String with XSLT

I have a WSDL (that I get from a web service) in which I have to replace the current address string with something else. The idea was to use XSLT to do that. There is just one problem: I have never done anything with XSLT, so I have no idea how to do it. I have found a simple example, but I don't get how to pull the old string out of the WSDL so I can replace it.
Here is the example:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:inm="http://www.inmagic.com/webpublisher/query" version='1.0'>
<xsl:output method="text" encoding="UTF-8"/>
<xsl:preserve-space elements="*"/>
<xsl:template match="text()"></xsl:template>
<xsl:template match="test">
<xsl:apply-templates/>
<xsl:for-each select="testObj">
'Notes or subject' <xsl:call-template name="rem-html"><xsl:with-param name="text" select="SBS_ABSTRACT"/></xsl:call-template>
</xsl:for-each>
</xsl:template>
<xsl:template name="rem-html">
<xsl:param name="text"/>
<xsl:variable name="newtext" select="translate($text,'a','b')"/>
</xsl:template>
</xsl:stylesheet>
UPDATE:
This is what I have now:
<soap:address location="http://localhost:4434/miniwebservice"/>
This is what I want to get:
<soap:address location="http://localhost:4433/miniwebservice"/>
I just changed the port number from 4434 to 4433.
<xsl:template match="soap:address/@location">
<xsl:attribute name="location">
<xsl:call-template name="string-replace">
<xsl:with-param name="haystack" select="current()"/>
<xsl:with-param name="search">:4434/</xsl:with-param>
<xsl:with-param name="replace">:4433/</xsl:with-param>
</xsl:call-template>
</xsl:attribute>
</xsl:template>
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
Note that there is no built-in string replace function in XSLT 1.0, so you'll need to take one from somewhere else (e.g. http://symphony-cms.com/download/xslt-utilities/view/26418/ was used when writing this stylesheet).
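If you would rather not pull in an external utility, a recursive named template along the following lines should do the job. This is a sketch written for this answer, not the code behind the link above; it only assumes the same template and parameter names (string-replace, haystack, search, replace) that the call above uses. Place it as a top-level child of xsl:stylesheet.

```xml
<xsl:template name="string-replace">
  <xsl:param name="haystack"/>
  <xsl:param name="search"/>
  <xsl:param name="replace"/>
  <xsl:choose>
    <!-- guard against an empty search string, which contains() always
         reports as found and which would otherwise recurse forever -->
    <xsl:when test="$search != '' and contains($haystack, $search)">
      <!-- emit the text before the first match, then the replacement -->
      <xsl:value-of select="substring-before($haystack, $search)"/>
      <xsl:value-of select="$replace"/>
      <!-- recurse on whatever follows the match -->
      <xsl:call-template name="string-replace">
        <xsl:with-param name="haystack"
            select="substring-after($haystack, $search)"/>
        <xsl:with-param name="search" select="$search"/>
        <xsl:with-param name="replace" select="$replace"/>
      </xsl:call-template>
    </xsl:when>
    <xsl:otherwise>
      <!-- no (further) match: emit the remaining text unchanged -->
      <xsl:value-of select="$haystack"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
```

This replaces every occurrence of the search string, which is fine here because :4434/ appears only once in the location value.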
Note that with XSLT 2.0 you have an easier way to proceed, using regular expressions:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:soap="..."
version="2.0">
<xsl:param name="newPort">4433</xsl:param>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="soap:address/@location">
<xsl:attribute name="location">
<xsl:value-of select="replace(.,
'^(http://[^/]*:)[0-9]{4}/',
concat('$1',$newPort,'/'))"/>
</xsl:attribute>
</xsl:template>
</xsl:stylesheet>
To make it work, you just have to change the namespace URI in xmlns:soap="..." to the SOAP namespace URI (I'm not sure of it) and use an XSLT 2.0 processor (e.g. Saxon).