How to use case replacement pattern with XPath replace function

I have this demo of a regexp and substitution pattern and need to use it in an XPath context with the fn:replace function, but I can't figure out how to write the replacement string correctly. Is it possible?
My naive test was:
replace ("dsfjkljsdfjlsjdfABCDdfsfsdff",
"(\p{Lu})(\p{Lu}+)",
"$1\L$2")
but it complains with FORX0004 : Invalid replacement string in replace() : \ character must be followed by \ or $

I think you want e.g.
<xsl:function name="mf:lower-case-match">
<xsl:param name="input" as="xs:string"/>
<xsl:param name="regex" as="xs:string"/>
<xsl:analyze-string select="$input" regex="{$regex}">
<xsl:matching-substring>
<xsl:value-of select="concat(regex-group(1), lower-case(regex-group(2)))"/>
</xsl:matching-substring>
<xsl:non-matching-substring>
<xsl:value-of select="."/>
</xsl:non-matching-substring>
</xsl:analyze-string>
</xsl:function>
mf:lower-case-match("dsfjkljsdfjlsjdfABCDdfsfsdff", "(\p{Lu})(\p{Lu}+)")
or, to declare xs:string as the function's return type with as="xs:string":
<xsl:function name="mf:lower-case-match" as="xs:string">
<xsl:param name="input" as="xs:string"/>
<xsl:param name="regex" as="xs:string"/>
<xsl:value-of>
<xsl:analyze-string select="$input" regex="{$regex}">
<xsl:matching-substring>
<xsl:value-of select="concat(regex-group(1), lower-case(regex-group(2)))"/>
</xsl:matching-substring>
<xsl:non-matching-substring>
<xsl:value-of select="."/>
</xsl:non-matching-substring>
</xsl:analyze-string>
</xsl:value-of>
</xsl:function>
You need to declare a namespace for any user-defined function e.g. xmlns:mf="http://example.com/mf" on the xsl:stylesheet or xsl:transform root.
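As a minimal sketch of how these pieces fit together, the stylesheet below declares the mf namespace on the root element and calls the function from a named template. The template name main and the text output method are assumptions made for illustration; the namespace URI is the example one mentioned above.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:mf="http://example.com/mf"
    exclude-result-prefixes="xs mf">

  <xsl:output method="text"/>

  <!-- keeps group 1 as is and lower-cases group 2 of every match -->
  <xsl:function name="mf:lower-case-match" as="xs:string">
    <xsl:param name="input" as="xs:string"/>
    <xsl:param name="regex" as="xs:string"/>
    <xsl:value-of>
      <xsl:analyze-string select="$input" regex="{$regex}">
        <xsl:matching-substring>
          <xsl:value-of select="concat(regex-group(1), lower-case(regex-group(2)))"/>
        </xsl:matching-substring>
        <xsl:non-matching-substring>
          <xsl:value-of select="."/>
        </xsl:non-matching-substring>
      </xsl:analyze-string>
    </xsl:value-of>
  </xsl:function>

  <!-- hypothetical entry point, only here to show the call -->
  <xsl:template name="main">
    <xsl:value-of select="mf:lower-case-match('dsfjkljsdfjlsjdfABCDdfsfsdff', '(\p{Lu})(\p{Lu}+)')"/>
  </xsl:template>

</xsl:stylesheet>
```

Started at the named template main with an XSLT 2.0 processor, this should output dsfjkljsdfjlsjdfAbcddfsfsdff.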
In XSLT 3 you could also simply push the result of the analyze-string function through a mode that then performs any transformation on the groups you want:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="#all"
version="3.0">
<xsl:mode on-no-match="shallow-copy"/>
<xsl:template match="text">
<xsl:copy>
<xsl:apply-templates select="analyze-string(., '(\p{Lu})(\p{Lu}+)')" mode="lower-case"/>
</xsl:copy>
</xsl:template>
<xsl:template match="*:group[@nr = 2]" mode="lower-case">
<xsl:value-of select="lower-case(.)"/>
</xsl:template>
</xsl:stylesheet>

I don't think the \L case-conversion escape is supported in XPath regular expressions. @Martin Honnen's answer is probably the best, but here's a pure XPath 2.0 solution (it assumes the input contains only one match and no ___ of its own):
With :
dsfjkljsdfjlsjdfABCDdfsfsdff
XPath :
replace(replace("dsfjkljsdfjlsjdfABCDdfsfsdff","(\p{Lu})(\p{Lu}+)","$1___$2___"),"_{3}.+_{3}",lower-case(substring-before(substring-after(replace("dsfjkljsdfjlsjdfABCDdfsfsdff","(\p{Lu})(\p{Lu}+)","$1___$2___"),"___"),"___")))
Description :
P1 : We add ___ to identify the lower-case part with :
replace("dsfjkljsdfjlsjdfABCDdfsfsdff","(\p{Lu})(\p{Lu}+)","$1___$2___")
P2 : We generate the lower case part with :
lower-case(substring-before(substring-after(resultofP1,"___"),"___"))
We join the two preceding expressions with :
replace(resultofP1,"_{3}.+_{3}",resultofP2)
Output :
dsfjkljsdfjlsjdfAbcddfsfsdff
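
The nested replace() above relies on there being only one match in the input. If an XPath 3.0 processor is available, one possible sketch that copes with several matches uses fn:analyze-string and rebuilds the string from its match and non-match elements. The sample string and the wrapping stylesheet are assumptions for illustration; *:match and *:group are used so that no extra namespace prefix has to be declared.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- xsl:initial-template is the default entry point in XSLT 3.0 -->
  <xsl:template name="xsl:initial-template">
    <!-- hypothetical input with two runs of upper-case letters -->
    <xsl:variable name="s" select="'fooABCDbarEFGbaz'"/>
    <!-- for each match: group 1 unchanged, group 2 lower-cased;
         for each non-match: the text as is -->
    <xsl:value-of select="
      string-join(
        analyze-string($s, '(\p{Lu})(\p{Lu}+)')/*
          ! (if (self::*:match)
             then concat(*:group[@nr = 1], lower-case(*:group[@nr = 2]))
             else string(.)),
        '')"/>
  </xsl:template>

</xsl:stylesheet>
```

For the sample string this should yield fooAbcdbarEfgbaz.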

Related

XSLT replace char with an element

I'm trying to replace a single char with an element (containing more elements).
Using XSL 2.0.
Example:
<element1>
<element2>some text and the char - I want to replace </element2>
...
</element1>
The - (dash) should now be replaced with a new element:
<element1>
<element2>some text and the char <newElement/> I want to replace </element2>
...
</element1>
I tried already:
<xsl:template match="element1">
<xsl:analyze-string select="." regex="-">
<xsl:matching-substring>
<newElement/>
</xsl:matching-substring>
<xsl:non-matching-substring>
<xsl:value-of select="."/>
</xsl:non-matching-substring>
</xsl:analyze-string>
</xsl:template>
But this removed all the other elements in between (because only strings are "returned").
And with the replace() function you can only insert strings (no < possible).
Any further ideas?
Your template matches an element(), but replaces text(). If you match text() and replace text() instead while copying the rest, it will work as expected:
<!-- modified identity template matching no text() nodes -->
<xsl:template match="element() | comment() | processing-instruction()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*" />
</xsl:copy>
</xsl:template>
<xsl:template match="text()">
<xsl:analyze-string select="." regex="-">
<xsl:matching-substring>
<newElement/>
</xsl:matching-substring>
<xsl:non-matching-substring>
<xsl:copy-of select="."/>
</xsl:non-matching-substring>
</xsl:analyze-string>
</xsl:template>
Two corrections are needed:
Your template should match element2, not element1.
At the beginning and end of your template you should add
opening / closing tags for element2 (something like in
the identity template).
So your template should look like this:
<xsl:template match="element2">
<element2>
<xsl:analyze-string select="." regex="-">
<xsl:matching-substring>
<newElement/>
</xsl:matching-substring>
<xsl:non-matching-substring>
<xsl:value-of select="."/>
</xsl:non-matching-substring>
</xsl:analyze-string>
</element2>
</xsl:template>
Of course, your stylesheet should also include the identity template.

XSL transform on text to XML with unparsed-text: need more depth

My rather well-formed input (I don't want to copy all data):
StartThing
Size Big
Colour Blue
coords 42, 42
foo bar
EndThing
StartThing
Size Small
Colour Red
coords 29, 51
machin bidule
EndThing
<!-- repeat a few thousand times-->
I have the below XSL which I modified from Parse text file with XSLT
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="xs">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:param name="text-encoding" as="xs:string" select="'iso-8859-1'"/>
<xsl:param name="text-uri" as="xs:string" select="'unparsed-text.txt'"/>
<xsl:template name="text2xml">
<xsl:variable name="text" select="unparsed-text($text-uri, $text-encoding)"/>
<xsl:analyze-string select="$text" regex="(Size|Colour|coords) (.+)">
<xsl:matching-substring>
<xsl:element name="{(regex-group(1))}">
<xsl:value-of select="(regex-group(2))"/>
</xsl:element>
</xsl:matching-substring>
</xsl:analyze-string>
</xsl:template>
<xsl:template match="/">
<xsl:call-template name="text2xml"/>
</xsl:template>
</xsl:stylesheet>
and it works fine on parsing the pairs into elements and values. It gives me this output:
<?xml version="1.0" encoding="UTF-8"?>
<Size>Big</Size>
<Colour>Blue</Colour>
<coords>42, 42</coords>
But I'd also like to wrap the values in the Thing tag so that my output looks like this:
<Thing>
<Size>Big</Size>
<Colour>Blue</Colour>
<coords>42, 42</coords>
</Thing>
One solution might be a regex that matches each group of lines between "StartThing" and "EndThing", and then matches the substrings inside it as I'm already doing. Or is there some other way to parse the tree?
I would use two nested analyze-string levels, an outer one to extract everything between StartThing and EndThing, and then an inner one that operates on the strings matched by the outer one.
<xsl:template name="text2xml">
<xsl:variable name="text" select="unparsed-text($text-uri, $text-encoding)"/>
<!-- flags="s" allows .*? to match across newlines -->
<xsl:analyze-string select="$text" regex="StartThing.*?EndThing" flags="s">
<xsl:matching-substring>
<Thing>
<!-- "." here is the matching substring from the outer regex -->
<xsl:analyze-string select="." regex="(Size|Colour|coords) (.+)">
<xsl:matching-substring>
<xsl:element name="{(regex-group(1))}">
<xsl:value-of select="(regex-group(2))"/>
</xsl:element>
</xsl:matching-substring>
</xsl:analyze-string>
</Thing>
</xsl:matching-substring>
</xsl:analyze-string>
</xsl:template>

Duplicate fn:analyze-string() output using xsl:analyze-string?

Is it possible to generate output identical to the fn:analyze-string (XPath 3.0) using xsl:analyze-string (XSLT 2.0)?
Some examples for input string abcdefg:
regex="^a((b(c))d)(efg)$"
<s:analyze-string-result xmlns:s="http://www.w3.org/2009/xpath-functions/analyze-string">
<s:match>a<s:group nr="1">
<s:group nr="2">b<s:group nr="3">c</s:group>
</s:group>d</s:group>
<s:group nr="4">efg</s:group>
</s:match>
</s:analyze-string-result>
regex="^((a(bc)d)(.*))$"
<s:analyze-string-result xmlns:s="http://www.w3.org/2009/xpath-functions/analyze-string">
<s:match>
<s:group nr="1">
<s:group nr="2">a<s:group nr="3">bc</s:group>d</s:group>
<s:group nr="4">efg</s:group>
</s:group>
</s:match>
</s:analyze-string-result>
regex="^(((a)(b)(cde)(.*)))$"
<s:analyze-string-result xmlns:s="http://www.w3.org/2009/xpath-functions/analyze-string">
<s:match>
<s:group nr="1">
<s:group nr="2">
<s:group nr="3">a</s:group>
<s:group nr="4">b</s:group>
<s:group nr="5">cde</s:group>
<s:group nr="6">fg</s:group>
</s:group>
</s:group>
</s:match>
</s:analyze-string-result>
I suspect it's not possible because xsl:analyze-string does not provide methods to: 1) know how many groups there, or 2) discover parent/child relationships of groups to facilitate recursion. But I'm curious if there is something I have overlooked.
You can make it a bit easier by changing the syntax of the regex, using <g> </g> for grouping rather than () (it would be possible, but tiresome, to keep the normal syntax and instead analyse the regex to determine the groups).
Once you have the group structure you can generate the normal regex using () to pass to xsl:analyze-string, adding extra groups so that every text run is grouped and can be retrieved later with regex-group().
Not extensively tested so there may be bugs but something like this, and it seems to work on your examples.
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:f="data:,f"
exclude-result-prefixes="xs"
>
<xsl:output omit-xml-declaration="yes"/>
<xsl:function name="f:analyze-string">
<xsl:param name="s"/>
<xsl:param name="r"/>
<xsl:variable name="rr">
<xsl:apply-templates mode="a-s" select="$r"/>
</xsl:variable>
<xsl:text>
</xsl:text>
<f:analyze-string-result>
<xsl:text>
</xsl:text>
<xsl:analyze-string select="$s" regex="{$rr}">
<xsl:matching-substring>
<f:match>
<xsl:variable name="m" select="."/>
<xsl:apply-templates mode="g" select="$r"/>
</f:match>
<xsl:text>
</xsl:text>
</xsl:matching-substring>
<xsl:non-matching-substring>
<f:non-match>
<xsl:value-of select="."/>
</f:non-match>
</xsl:non-matching-substring>
</xsl:analyze-string>
<xsl:text>
</xsl:text>
</f:analyze-string-result>
<xsl:text>
</xsl:text>
</xsl:function>
<xsl:template mode="a-s" match="g">
<xsl:text>(</xsl:text>
<xsl:apply-templates mode="a-s"/>
<xsl:text>)</xsl:text>
</xsl:template>
<xsl:template mode="a-s" match="text()[../g]">
<xsl:text>(</xsl:text>
<xsl:value-of select="."/>
<xsl:text>)</xsl:text>
</xsl:template>
<xsl:template mode="g" match="g">
<f:group>
<xsl:attribute name="nr">
<xsl:number level="any"/>
</xsl:attribute>
<xsl:apply-templates mode="g"/>
</f:group>
</xsl:template>
<xsl:template mode="g" match="text()">
<xsl:variable name="n">
<xsl:number count="g|text()[../g]" level="any"/>
</xsl:variable>
<xsl:value-of select="regex-group(xs:integer($n))"/>
</xsl:template>
<xsl:template name="main">
<!-- regex="^a((b(c))d)(efg)$" -->
<xsl:variable name="r">a<g><g>b<g>c</g></g>d</g><g>efg</g>$</xsl:variable>
<xsl:sequence select="f:analyze-string('abcdefg',$r)"/>
<!-- regex="^((a(bc)d)(.*))$" -->
<xsl:variable name="r"><g><g>a<g>bc</g>d</g><g>.*</g></g>$</xsl:variable>
<xsl:sequence select="f:analyze-string('abcdefg',$r)"/>
<!-- regex="^(((a)(b)(cde)(.*)))$" -->
<xsl:variable name="r"><g><g><g>a</g><g>b</g><g>cde</g><g>.*</g></g></g>$</xsl:variable>
<xsl:sequence select="f:analyze-string('abcdefg',$r)"/>
</xsl:template>
</xsl:stylesheet>
Produces
$ saxon9 -it main analyse.xsl
<f:analyze-string-result xmlns:f="data:,f">
<f:match>a<f:group nr="1"><f:group nr="2">b<f:group nr="3">c</f:group></f:group>d</f:group><f:group nr="4">efg</f:group></f:match>
</f:analyze-string-result>
<f:analyze-string-result xmlns:f="data:,f">
<f:match><f:group nr="1"><f:group nr="2">a<f:group nr="3">bc</f:group>d</f:group><f:group nr="4">efg</f:group></f:group></f:match>
</f:analyze-string-result>
<f:analyze-string-result xmlns:f="data:,f">
<f:match><f:group nr="1"><f:group nr="2"><f:group nr="3">a</f:group><f:group nr="4">b</f:group><f:group nr="5">cde</f:group><f:group nr="6">fg</f:group></f:group></f:group></f:match>
</f:analyze-string-result>

How can I pass HTML-element as parameter to XSLT function?

I have the following XSLT-function that I use in a XSLT file to generate XHTML output:
<xsl:function name="local:if-not-empty">
<xsl:param name="prefix"/>
<xsl:param name="str"/>
<xsl:param name="suffix"/>
<xsl:if test="$str != ''"><xsl:value-of select="concat($prefix, $str, $suffix)"/></xsl:if>
</xsl:function>
It simply checks whether a string str is not empty and, if so, returns the string, concatenated with a prefix and a suffix.
The function works fine as long as I only pass simple strings. But when I try to pass HTML elements as prefix or suffix, e.g.:
<xsl:value-of select="local:if-not-empty('', /some/xpath/expression, '<br/>')"/>
I get the following error message:
SXXP0003: Error reported by XML parser: The value of attribute "select"
associated with an element type "null" must not contain the '<' character.
The next thing I tried was to define a variable:
<xsl:variable name="br"><br/></xsl:variable>
and pass it to the function:
<xsl:value-of select="local:if-not-empty('', /some/xpath/expression, $br)"/>
but here, of course, I get an empty string, as the value of the element is extracted, and not the element itself copied.
My final hopeless attempt was to define a text element in the variable:
<xsl:variable name="br">
<xsl:text disable-output-escaping="yes"><br/></xsl:text>
</xsl:variable>
and pass this to the function, but this wasn't permitted, either.
XTSE0010: xsl:text must not contain child elements
I probably don't understand the intricate inner workings of XSLT, but in my opinion adding a <br/> element within a XSLT-transformation through a generic function seems legitimate...
Anyways... I'd appreciate if anyone could give me an alternative solution. I'd also like to understand why this doesn't work...
PS: I'm using Saxon-HE 9.4.0.1J, Java version 1.6.0_24
Try this:
<xsl:value-of select="local:if-not-empty('', /some/xpath/expression, '&lt;br/&gt;')" disable-output-escaping="yes"/>
Instead of concat, use <xsl:copy-of> and pass items, not strings, as parameters:
<xsl:copy-of select="$pPrefix"/>
<xsl:copy-of select="$pStr"/>
<xsl:copy-of select="$pSuffix"/>
Here is a complete example:
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:local="my:local" exclude-result-prefixes="local">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:variable name="vBr"><br/></xsl:variable>
<xsl:template match="/">
<xsl:sequence select="local:if-not-empty('a', 'b', $vBr/*)"/>
</xsl:template>
<xsl:function name="local:if-not-empty">
<xsl:param name="pPrefix"/>
<xsl:param name="pStr"/>
<xsl:param name="pSuffix"/>
<xsl:if test="$pStr != ''">
<xsl:copy-of select="$pPrefix"/>
<xsl:copy-of select="$pStr"/>
<xsl:copy-of select="$pSuffix"/>
</xsl:if>
</xsl:function>
</xsl:stylesheet>
When this transformation is applied on any XML document (not used), the wanted, correct result is produced:
a b<br/>
The problem is that <br/> is not a string - it is an XML element, so it cannot be manipulated using string functions. You need a separate function like this:
<xsl:function name="local:br-if-not-empty">
<xsl:param name="prefix"/>
<xsl:param name="str"/>
<xsl:if test="$str != ''">
<xsl:value-of select="concat($prefix, $str)"/>
<br/>
</xsl:if>
</xsl:function>
or a 'trick' like this where you handle <br/> as a separate case:
<xsl:function name="local:if-not-empty">
<xsl:param name="prefix"/>
<xsl:param name="str"/>
<xsl:param name="suffix"/>
<xsl:if test="$str != ''">
<xsl:value-of select="concat($prefix, $str)"/>
<xsl:choose>
<xsl:when test="$suffix = '&lt;br/&gt;'">
<br/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$suffix"/>
</xsl:otherwise>
</xsl:choose>
</xsl:if>
</xsl:function>

In XSLT, how come I can't set the select-attribute of a value-of using xsl:attribute, and what's a good alternative?

I have a constant and a variable that I want to mash together to select a specific node. This is what I want to do:
<xsl:attribute name="value">
<xsl:value-of>
<xsl:attribute name="select">
<xsl:text>/root/meta/url_params/</xsl:text>
<xsl:value-of select="$inputid" />
</xsl:attribute>
</xsl:value-of>
</xsl:attribute>
How come it doesn't work, and what could I do instead?
While @Alejandro is right that in the general case dynamic evaluation will be needed (and this may be provided in XSLT 2.1+), there are manageable simpler cases.
For example, if $inputid contains just a name, you probably want this:
<xsl:value-of select="/root/meta/url_params/*[name()=$inputid]"/>
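As a small sketch of how this could be used to fill the value attribute from the question, the predicate can go straight into an attribute value template. The input element and the default parameter value 'page' are hypothetical, chosen only for illustration:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <!-- hypothetical parameter: the name of the child element to read -->
  <xsl:param name="inputid" select="'page'"/>

  <xsl:template match="/">
    <!-- the braces make the processor evaluate the XPath at run time;
         the predicate picks the child whose name equals $inputid -->
    <input value="{/root/meta/url_params/*[name()=$inputid]}"/>
  </xsl:template>

</xsl:stylesheet>
```

Applied to an assumed input such as <root><meta><url_params><page>3</page></url_params></meta></root>, this should produce <input value="3"/>.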
We can implement a rather general dynamic XPath evaluator if we restrict each location step to be just an element name:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:param name="inputId" select="'param/yyy/value'"/>
<xsl:variable name="vXpathExpression"
select="concat('root/meta/url_params/', $inputId)"/>
<xsl:template match="/">
<xsl:value-of select="$vXpathExpression"/>: <xsl:text/>
<xsl:call-template name="getNodeValue">
<xsl:with-param name="pExpression"
select="$vXpathExpression"/>
</xsl:call-template>
</xsl:template>
<xsl:template name="getNodeValue">
<xsl:param name="pExpression"/>
<xsl:param name="pCurrentNode" select="."/>
<xsl:choose>
<xsl:when test="not(contains($pExpression, '/'))">
<xsl:value-of select="$pCurrentNode/*[name()=$pExpression]"/>
</xsl:when>
<xsl:otherwise>
<xsl:call-template name="getNodeValue">
<xsl:with-param name="pExpression"
select="substring-after($pExpression, '/')"/>
<xsl:with-param name="pCurrentNode" select=
"$pCurrentNode/*[name()=substring-before($pExpression, '/')]"/>
</xsl:call-template>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
when this transformation is applied on this XML document:
<root>
<meta>
<url_params>
<param>
<xxx>
<value>5</value>
</xxx>
</param>
<param>
<yyy>
<value>8</value>
</yyy>
</param>
</url_params>
</meta>
</root>
the wanted, correct result is produced:
root/meta/url_params/param/yyy/value: 8
There is no runtime evaluation of XPath expressions in standard XSLT 1.0.
So, depending on what $inputid is, you could have different solutions.
But /root/meta/url_params/$inputid is wrong, because the right-hand side of / must be a relative location path in XPath 1.0 (in XPath 2.0 it can also be a function call).
For this particular case you can use:
/root/meta/url_params/*[name()=$inputid]
or
/root/meta/url_params/*[@id=$inputid]
For the general case, I would go with a walker pattern like the one in Dimitre's answer.