XSLT 2.0 regular expression replace - regex

I have the following XML:
<t>a_35345_0_234_345_666_888</t>
I would like to replace the first occurrence of a number after "_" with the fixed number 234. The result should look like:
<t>a_234_0_234_345_666_888</t>
I have tried using the following but it does not work:
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xsl:template match="/">
<xsl:value-of select='replace(., "(.*)_\d+_(.*)", "$1_234_$2")'/>
</xsl:template>
</xsl:stylesheet>
UPDATE
The following works for me (thanks @Chris85). Just remove the trailing underscore and add a "?" after ".*" to make it non-greedy.
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xsl:template match="/">
<xsl:value-of select='replace(., "(.*?)_\d+(.*)", "$1_234$2")'/>
</xsl:template>
</xsl:stylesheet>

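Your regex is greedy: the .* consumes as much as it can and only backtracks as far as needed for the rest of the pattern to match, so _\d+_ ends up matching its last possible occurrence rather than the first.
So
(.*)_\d+_(.*)
was putting
a_35345_0_234_345
into $1. Then _666_ was being replaced with _234_ and 888 went into $2, giving a_35345_0_234_345_234_888.
To make it non-greedy, add a ? after the .*. This tells the * to match as few characters as possible, so $1 stops before the first occurrence of _\d+.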
Functional example:
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xsl:template match="/">
<xsl:value-of select='replace(., "(.*?)_\d+(.*)", "$1_234$2")'/>
</xsl:template>
</xsl:stylesheet>
Here's some more documentation on repetition and greediness, http://www.regular-expressions.info/repeat.html.
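As an alternative sketch that avoids lazy quantifiers altogether, the pattern can be anchored at the start of the string, with the first group restricted to non-underscore characters. This assumes the value always begins with a run of non-underscore characters followed by _ and digits, as in the sample. The match="t" template is my own choice for illustration, not part of the original stylesheet.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="t">
    <t>
      <!-- ^ anchors at the start of the string; [^_]* cannot run past the
           first underscore, so only the first _digits run is replaced -->
      <xsl:value-of select="replace(., '^([^_]*)_\d+', '$1_234')"/>
    </t>
  </xsl:template>
</xsl:stylesheet>
```

For the sample input this should produce <t>a_234_0_234_345_666_888</t>.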

Related

XSLT - Get String between commas

How can I get the value 'four' in XSLT?
<root>
<entry>(one,two,three,four,five,six)</entry>
</root>
Thanks in advance.
You didn't specify the XSLT version, so I assume version 2.0.
I also assume that the word four is only a "marker" indicating which part
of the string to return (the text between the 3rd and 4th commas).
To get the fragment you want, you can:
Use the tokenize function to cut the whole content of entry
into pieces, using a comma as the separator.
Take the fourth item of the resulting sequence.
This expression can be used e.g. in a template matching entry.
So the example script can look like this:
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:transform version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" encoding="UTF-8" indent="yes" />
<xsl:template match="entry">
<xsl:copy>
<xsl:value-of select="tokenize(., ',')[4]"/>
</xsl:copy>
</xsl:template>
<xsl:template match="@*|node()">
<xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
</xsl:template>
</xsl:transform>
For your input XML it gives:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<entry>four</entry>
</root>
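One thing to watch: the first and last tokens still carry the parentheses ("(one" and "six)"). A minimal sketch, assuming you want clean tokens at every position, strips them with translate before tokenizing. The position [1] is only an illustration.

```xml
<xsl:transform version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>
  <xsl:template match="entry">
    <xsl:copy>
      <!-- translate() deletes "(" and ")" because the replacement string is empty -->
      <xsl:value-of select="tokenize(translate(., '()', ''), ',')[1]"/>
    </xsl:copy>
  </xsl:template>
  <!-- identity template: copies everything else unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
  </xsl:template>
</xsl:transform>
```

With the input above, the entry element should then contain one rather than (one.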

How to remove a space in XML using stylesheet using RegEx

I have an XML document and want to find a particular element (in this case "FirstName") and remove a space in its value only if there is a - character before the space.
In other words, I want to keep spaces that have no - in front of them. I want to do this using an XSL stylesheet with regex matching and the replace function.
The expected result is Sam-Louise, with the space between "Sam-" and "Louise" removed.
<?xml version="1.0" encoding="utf-8"?>
<NCV Version="1.14">
<Invoice>
<customer>
<customerId>12785</customerId>
<FirstName>Sam- Louise</FirstName>
<LastName>Jones</LastName>
</customer>
</Invoice>
</NCV>
This is one possible XSLT :
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
<xsl:output method="html" encoding="UTF-8" indent="yes" />
<xsl:strip-space elements="*"/>
<xsl:template match="FirstName">
<FirstName>
<xsl:value-of select="replace(., '-\s+', '-')"/>
</FirstName>
</xsl:template>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:transform>
output :
<NCV Version="1.14">
<Invoice>
<customer>
<customerId>12785</customerId>
<FirstName>Sam-Louise</FirstName>
<LastName>Jones</LastName>
</customer>
</Invoice>
</NCV>
You can use the following regex as the match pattern:
(\<FirstName\>.*?-)\s+
and replace the match with the first captured group, $1.
The regex (\<FirstName\>.*?-)\s+ matches:
\<FirstName\>.*?-: the literal <FirstName> followed by any characters, non-greedily, up to the first hyphen. This part is captured in group 1.
\s+: one or more whitespace characters.
Replacing the match with $1 removes the whitespace after the hyphen.
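Note that a pattern containing the literal <FirstName> tag only applies to the serialized XML text, since the string value of a node never contains its own tags. Inside XSLT the element is already selected by the template, so only the hyphen needs capturing. A minimal sketch of the same capture-group idea, assuming the FirstName template and identity template from the earlier stylesheet:

```xml
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:template match="FirstName">
    <FirstName>
      <!-- group 1 holds the hyphen; the whitespace after it is dropped -->
      <xsl:value-of select="replace(., '(-)\s+', '$1')"/>
    </FirstName>
  </xsl:template>
  <!-- identity template: copies everything else unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:transform>
```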

Is it allowed to use an XPath function in an attribute value

Is it possible to substitute the value of an element's attribute with the result of an XPath expression?
Namely:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:import href="../Product/templates.xsl"/>
<xsl:output method="xml"/>
<xsl:template name="root" match="/">
<test-report start="substring(abcd, 1, 2)" stop="2015-10-07 16:54.103">
<xsl:call-template name="temp"/>
</test-report>
</xsl:template>
<xsl:template name="temp">
<xsl:value-of select="'TEXT_RIKO'"/>
</xsl:template>
</xsl:stylesheet>
The output is
<?xml version="1.0" encoding="UTF-8"?>
<test-report start="substring(abcd, 1, 2)" stop="2015-10-07 16:54.103">
TEXT_RIKO
</test-report>
What I want is (in the output file) the value of the start attribute to be the output of the function substring namely ab.
Thank you!
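The answer is "Yes", but you need to use an Attribute Value Template to do this.
Simply write this:
<test-report start="{substring('abcd', 1, 2)}" stop="2015-10-07 16:54.103">
The curly braces indicate an expression to be evaluated rather than output literally. Note the quotes around 'abcd': without them, abcd is read as a path selecting a child element named abcd rather than as the literal string.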
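If the value needs more logic than fits comfortably in curly braces, xsl:attribute is the longer equivalent. A minimal sketch, assuming the literal string 'abcd' is what should be truncated. The xsl:import from the question is left out so the example stands alone.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml"/>
  <xsl:template name="root" match="/">
    <test-report stop="2015-10-07 16:54.103">
      <!-- xsl:attribute must come before any child content of the element -->
      <xsl:attribute name="start">
        <!-- 'abcd' is quoted so it is a string literal, not an element name -->
        <xsl:value-of select="substring('abcd', 1, 2)"/>
      </xsl:attribute>
      <xsl:call-template name="temp"/>
    </test-report>
  </xsl:template>
  <xsl:template name="temp">
    <xsl:value-of select="'TEXT_RIKO'"/>
  </xsl:template>
</xsl:stylesheet>
```

The start attribute should come out as ab. The order of attributes in the serialized output may differ between processors.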

XSLT not matching element - namespace declarations

I'm sure this is a very simple fix, but I'm stumped. I've got input XML with the following root element, and repeating child elements:
<modsCollection
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="http://www.loc.gov/mods/v3"
xsi:schemaLocation="
http://www.loc.gov/mods/v3
http://www.loc.gov/standards/mods/v3/mods-3-4.xsd">
<mods version="3.4">
...
I've got an XSLT sheet with the following to match each <mods> node, and kick it out as a separate file named by an <identifier type="local"> element.
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns="http://www.loc.gov/mods/v3">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/modsCollection">
<xsl:for-each select="mods">
<xsl:variable name="filename"
select="concat(normalize-space(
identifier[@type='local']),
'.xml')" />
<xsl:result-document href="{$filename}">
<xsl:copy-of select="."></xsl:copy-of>
</xsl:result-document>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
This works if the XML input does not have the xmlns:xsi, xmlns, or xsi:schemaLocation attributes in the root element. So, for example, it works on the following:
<modsCollection>
<mods version="3.4">
...
I know that some of our MODS files have had the prefix included but I'm unclear why this won't work without the prefix if our XSLT matching is not looking for the prefix. Any thoughts or advice would be greatly appreciated.
<xsl:template match="/modsCollection">
matches modsCollection in no namespace. You want
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns="http://www.loc.gov/mods/v3"
xmlns:m="http://www.loc.gov/mods/v3">
then
<xsl:template match="/m:modsCollection">
This matches modsCollection in the MODS namespace. Similarly, use the m: prefix in all XSLT patterns and XPath expressions in the stylesheet.
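Since the stylesheet is XSLT 2.0, another option is the xpath-default-namespace attribute, which makes unprefixed element names in patterns and XPath expressions refer to the given namespace. A minimal sketch based on the stylesheet from the question. The [1] on identifier is my own addition, on the assumption that a record might carry more than one local identifier.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xpath-default-namespace="http://www.loc.gov/mods/v3">
  <xsl:output method="xml" indent="yes"/>
  <!-- unprefixed names below now match elements in the MODS namespace -->
  <xsl:template match="/modsCollection">
    <xsl:for-each select="mods">
      <!-- [1] guards against a type error if several local identifiers exist -->
      <xsl:variable name="filename"
          select="concat(normalize-space(identifier[@type='local'][1]), '.xml')"/>
      <xsl:result-document href="{$filename}">
        <xsl:copy-of select="."/>
      </xsl:result-document>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Note that this stylesheet would no longer match the input without namespace declarations. Pick whichever form your MODS files actually use.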

not adding new line in my XSLT

I am not certain why my xslt won't put a new line in my output...
This is my xslt....
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl"
>
<xsl:output method="text" encoding="iso-8859-1"/>
<xsl:variable name="newline"></xsl:variable>
<xsl:template name="FairWarningTransform" match="/"> <!--@* | node()">-->
<xsl:for-each select="//SelectFairWarningInformationResult">
<xsl:value-of select="ApplicationID"/>,<xsl:value-of select="USERID"/>
</xsl:for-each>
* Note. This report outlines Fair warning entries into reported for the above time frame.
</xsl:template>
</xsl:stylesheet>
Here is my output...
1,TEST1,test2,
I want it to look like...
1,TEST
1,test2,
Why isn't this character (&#xA;) creating a newline?
Try replacing
&#xA;
with
<xsl:text>&#xA;</xsl:text>
That helps XSLT distinguish it from other whitespace in your stylesheet that is part of the stylesheet formatting (not part of the desired output).
XSLT's default behavior is to ignore any text nodes in the stylesheet that are entirely whitespace (this is true even if some of the whitespace is encoded as entities like &#xA;), except for text inside <xsl:text>, which is preserved.
I suggest replacing these lines:
<xsl:value-of select="ApplicationID"/>,<xsl:value-of select="USERID"/>
with this:
<xsl:value-of select="concat(ApplicationID, ',', USERID, '&#xA;')"/>
That way the newline is guaranteed to be part of the output.
Try using this as your newline instead of the escaped character:
<xsl:text>
</xsl:text>
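Putting the suggestions together, a minimal sketch of the loop with an explicit newline after each record could look like this. The element names are taken from the question; the trailing note text and the unused newline variable are left out for brevity.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" encoding="iso-8859-1"/>
  <xsl:template match="/">
    <xsl:for-each select="//SelectFairWarningInformationResult">
      <xsl:value-of select="ApplicationID"/>
      <xsl:text>,</xsl:text>
      <xsl:value-of select="USERID"/>
      <!-- whitespace inside xsl:text is never stripped from the stylesheet -->
      <xsl:text>&#xA;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Wrapping the comma in xsl:text as well keeps the indentation of the stylesheet from leaking into the text output.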