How to find word with hyphen? - xslt

Is it possible to find words separated by a hyphen and surround them with some tag?
input
<root>
text text text-with-hyphen text text
</root>
required output
<outroot>
text text <sometag>text-with-hyphen</sometag> text text
</outroot>

This XSLT 2.0 transformation:
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="root/text()">
<xsl:analyze-string select="." regex="([^ ]*\-[^ ]*)+">
<xsl:matching-substring>
<sometag><xsl:value-of select="."/></sometag>
</xsl:matching-substring>
<xsl:non-matching-substring>
<xsl:value-of select="."/>
</xsl:non-matching-substring>
</xsl:analyze-string>
</xsl:template>
</xsl:stylesheet>
when applied on the provided XML document:
<root>
text text text-with-hyphen text text
</root>
produces the wanted, correct result:
<root>
text text <sometag>text-with-hyphen</sometag> text text
</root>
Explanation:
Proper use of the XSLT 2.0 <xsl:analyze-string> instruction and its allowed child instructions, <xsl:matching-substring> and <xsl:non-matching-substring>.

Just checked; it works. The idea is to iterate recursively over the entire text. In each recursion step, the XPath function contains() detects whether the current word (see the usage of $word) contains a hyphen:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/root">
<outroot>
<xsl:call-template name="split-by-space">
<xsl:with-param name="str" select="text()"/>
</xsl:call-template>
</outroot>
</xsl:template>
<xsl:template name="split-by-space">
<xsl:param name="str"/>
<xsl:if test="string-length($str)"><!-- recursion ends when the string is empty -->
<!-- select the next word; the appended space makes sure the last word is found too -->
<xsl:variable name="word" select="substring-before(concat($str, ' '), ' ')"/>
<xsl:choose>
<xsl:when test="contains($word, '-')"> <!-- the word contains a hyphen -->
<sometag><xsl:value-of select="$word"/></sometag>
</xsl:when>
<xsl:otherwise>
<!-- produce normal output -->
<xsl:value-of select="$word"/>
</xsl:otherwise>
</xsl:choose>
<!-- restore the space that separated this word from the rest -->
<xsl:if test="contains($str, ' ')"><xsl:text> </xsl:text></xsl:if>
<!-- recurse to process the rest of str -->
<xsl:call-template name="split-by-space">
<xsl:with-param name="str" select="substring-after($str, ' ')"/>
</xsl:call-template>
</xsl:if>
</xsl:template>
</xsl:stylesheet>

Related

XSLT check node availability in

I am using XSLT 1.0 in my project. In my XSLT transformation I have to check for a specific element and if that exists - I have to perform some concatenation or else some other concatenation operation.
However, I am not finding an option here, like some built-in function.
Requirement is like
<Root>
<a></a>
<b></b>
<c></c>
</Root>
Here, if element <a> comes in the request payload, we need to concatenate <b> and <c>; otherwise we need to concatenate <c> and <b>.
You can do that with template matching:
<xsl:template match="Root[not(a)]">
<xsl:value-of select="concat(c, b)"/>
</xsl:template>
<xsl:template match="Root[a]">
<xsl:value-of select="concat(b, c)"/>
</xsl:template>
Try it along these lines:
<xsl:template match="/Root">
<xsl:choose>
<xsl:when test="a">
<!-- do something -->
</xsl:when>
<xsl:otherwise>
<!-- do something else -->
</xsl:otherwise>
</xsl:choose>
</xsl:template>
Explanation: the test returns the Boolean value of the node-set selected by the expression a. If the node-set is non-empty, the result is true.
Test for the presence of the element, using xsl:choose
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:template match="/Root">
<xsl:choose>
<xsl:when test="a">
<xsl:value-of select="concat(b, c)"/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="concat(c, b)"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
Or in a predicate for template matches:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:template match="/Root[a]">
<xsl:value-of select="concat(b, c)"/>
</xsl:template>
<xsl:template match="/Root[not(a)]">
<xsl:value-of select="concat(c, b)"/>
</xsl:template>
</xsl:stylesheet>
In your case, use xsl:choose and test for the presence of a using boolean() on the corresponding XPath expression.
<xsl:template match="Root">
<xsl:choose>
<xsl:when test="boolean(./a)">
<xsl:value-of select="concat(./b, ./c)" />
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="concat(./c, ./b)" />
</xsl:otherwise>
</xsl:choose>
</xsl:template>

How to find the SPACE and replace with required text within COMMENT text only

Please suggest how to find each ' ' (space) within comment text and replace it with 'SPACETEXT' (XSLT version 2).
Input XML:
<root>
<para>First Text is <ceitalic>O</ceitalic><!--Text1 Text2 Text3 Text4--><!--Text6 Text7 Text8--><cesup>2</cesup></para>
<para>Second text is <ceitalic>H</ceitalic> <!--Text9--><!--Text10--><cesup>2</cesup></para>
</root>
XSLT:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="comment()/text()">
<xsl:analyze-string select="." regex="' '">
<xsl:matching-substring>
<xsl:choose>
<xsl:when test="regex-group(1)">SPACETEXT</xsl:when>
</xsl:choose>
</xsl:matching-substring>
<xsl:non-matching-substring>
<xsl:value-of select="."/>
</xsl:non-matching-substring>
</xsl:analyze-string>
</xsl:template>
</xsl:stylesheet>
Required OUTPUT:
<root>
<para>First Text is <ceitalic>O</ceitalic><!--Text1SPACETEXTText2SPACETEXTText3SPACETEXTText4--><!--Text6SPACETEXTText7SPACETEXTText8--><cesup>2</cesup></para>
<para>Second text is <ceitalic>H</ceitalic> <!--Text9--><!--Text10--><cesup>2</cesup></para>
</root>
Comment nodes cannot contain text nodes. So, first of all, the match expression should look like:
<xsl:template match="comment()">
Also, the code can be simplified by replacing xsl:analyze-string with XPath replace() function as follows:
<xsl:template match="comment()">
<xsl:comment>
<xsl:value-of select="replace(., ' ', 'SPACETEXT')"/>
</xsl:comment>
</xsl:template>
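Putting the two pieces together, a complete stylesheet might look like the sketch below. It assumes an XSLT 2.0 processor, since replace() is not available in XSLT 1.0. The priority attribute is an addition of this sketch and not part of the answer above: comment() and node() have the same default priority, so stating it explicitly avoids relying on the processor's conflict recovery.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Identity template: copies every node that has no more specific rule -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Rebuild each comment with every space replaced by SPACETEXT.
       priority="1" (an assumption of this sketch) makes this rule win
       over the identity template without ambiguity. -->
  <xsl:template match="comment()" priority="1">
    <xsl:comment>
      <xsl:value-of select="replace(., ' ', 'SPACETEXT')"/>
    </xsl:comment>
  </xsl:template>
</xsl:stylesheet>
```

Text nodes outside comments, such as the space between </ceitalic> and <!--Text9--> in the second para, go through the identity template and stay unchanged.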

XSLT Strip All Tabs In Text Output

Silly, simple question. When I output text, I still get the tabs from my formatted/indented XSL structure. How do I instruct the transformer to ignore the spacing in the stylesheet while still keeping it neatly formatted?
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="/">
<xsl:apply-templates select="Foo/Bar"></xsl:apply-templates>
</xsl:template>
<xsl:template match="Bar">
<xsl:for-each select="AAA"><xsl:for-each select="BBB"><xsl:value-of select="Label"/>|<xsl:value-of select="Value"/><xsl:text>
</xsl:text></xsl:for-each></xsl:for-each>
</xsl:template>
</xsl:stylesheet>
Produces output line by line with no tabs:
SomeLabel|SomeValue
SomeLabel|SomeValue
SomeLabel|SomeValue
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="/">
<xsl:apply-templates select="Foo/Bar"></xsl:apply-templates>
</xsl:template>
<xsl:template match="Bar">
<xsl:for-each select="AAA">
<xsl:for-each select="BBB">
<xsl:value-of select="Label"/>|<xsl:value-of select="Value"/>
<xsl:text>
</xsl:text>
</xsl:for-each>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
Produces output with tabs:
SomeLabel|SomeValue
SomeLabel|SomeValue
SomeLabel|SomeValue
Update:
Adding this does not fix it:
<xsl:output method="text" indent="no"/>
<xsl:strip-space elements="*"></xsl:strip-space>
This is contrived, but you can imagine the XML looks like this:
<Foo>
<Bar>
<AAA>
<BBB>
<Label>SomeLabel1</Label>
<Value>SomeValue1</Value>
</BBB>
<BBB>
<Label>SomeLabel2</Label>
<Value>SomeValue2</Value>
</BBB>
<BBB>
<Label>SomeLabel3</Label>
<Value>SomeValue3</Value>
</BBB>
</AAA>
</Bar>
</Foo>
What you could try is wrapping all your current text nodes in xsl:text. For example, try this
<xsl:for-each select="BBB">
<xsl:value-of select="Label"/>
<xsl:text>|</xsl:text>
<xsl:value-of select="Value"/>
<xsl:text>|</xsl:text>
</xsl:for-each>
Alternatively, you could make use of the concat function.
<xsl:for-each select="BBB">
<xsl:value-of select="concat(Label, '|')"/>
<xsl:value-of select="concat(Value, '|')"/>
</xsl:for-each>
You could even combine the two statements into one if you wanted
<xsl:for-each select="BBB">
<xsl:value-of select="concat(Label, '|', Value, '|')"/>
</xsl:for-each>
EDIT: If you prefer not to enter the separator | so many times, you can make use of template matching to output the fields. First, replace the value-of with apply-templates like so
<xsl:for-each select="BBB">
<xsl:apply-templates select="Label"/>
<xsl:apply-templates select="Value"/>
<xsl:text>
</xsl:text>
</xsl:for-each>
Then you would have one specific template to match Label, where you wouldn't need to output the separator, and another more generic template matching any child of BBB
<xsl:template match="BBB/Label" priority="1">
<xsl:value-of select="." />
</xsl:template>
<xsl:template match="BBB/*">
<xsl:text>|</xsl:text><xsl:value-of select="." />
</xsl:template>
(The priority here is needed to ensure Label is matched by the first template, and not the general one). Of course, you could also not do apply-templates on Label in this case, and just do xsl:value-of for that one.
Furthermore, if the fields were being output in the order they appear in the XML, you could simplify the for-each to just this
<xsl:for-each select="BBB">
<xsl:apply-templates />
<xsl:text>
</xsl:text>
</xsl:for-each>
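For reference, here is a minimal sketch of the whole stylesheet written so that the indentation cannot leak into the output. It relies on one rule of XSLT 1.0: whitespace-only text nodes in the stylesheet are stripped, except inside xsl:text, where everything is kept verbatim. So a line break typed inside xsl:text, followed by the indentation of the closing tag, carries that indentation into the result. Writing the line break as the character reference &#10; avoids this. Merging the two nested for-each loops into one select="AAA/BBB" is a simplification of this sketch, not something from the question.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:apply-templates select="Foo/Bar"/>
  </xsl:template>

  <xsl:template match="Bar">
    <xsl:for-each select="AAA/BBB">
      <xsl:value-of select="Label"/>
      <!-- Literal text goes into xsl:text so it never merges with indentation -->
      <xsl:text>|</xsl:text>
      <xsl:value-of select="Value"/>
      <!-- &#10; is a line feed; nothing else is inside this xsl:text -->
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Applied to the sample XML above, this should give one Label|Value pair per line with no leading tabs, however deeply the stylesheet is indented.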

Insert XSLT node-set B in another node-set A at detected position inside text node of node-set A

I'm using XSLT to transform complex XML output of a content management system into XHTML. I'm using <xsl:apply-templates/> to get an XHTML fragment of whatever is described by XML input. That XML input comes with a very complex structure that may describe lots of different cases to be handled by several XSLT template elements. And that structure may change quite often in the future.
Previously, the resulting fragment of that transformation was sent directly to the XSLT output. Now the requirements have changed: I need to capture the result and occasionally modify it by inserting another well-formed XHTML fragment at a certain position in the value of the fragment.
For the sake of demonstration, consider <xsl:apply-templates/> having created some opaque XHTML fragment captured in variable container.
<xsl:variable name="container">
<xsl:apply-templates />
</xsl:variable>
Next there is a second XHTML fragment in variable snippet:
<xsl:variable name="snippet">
<xsl:call-template name="get-snippet" />
</xsl:variable>
The requirement is to insert the node-set in $snippet before the period, if there is one, at the end of the value of $container. This would be easy, except that the XHTML fragments in both variables have to be kept as fragments, so one can't operate on the string value of either variable.
Is there any opportunity to achieve that requirement in XSLT without losing the power and flexibility of <xsl:apply-templates/> on retrieving XHTML fragment in $container?
BTW: I already know about accessing the last text node in $container using:
node-set($container)//child::text()[last()]
But I haven't managed to insert anything in the middle of that text node, and I suspect XSLT fails to provide proper support for what I want to do.
I. XSLT 1.0 solution:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:ext="http://exslt.org/common"
exclude-result-prefixes="ext">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:variable name="vrtfContainer">
<html>
<p>Hello, world.</p>
<p> This is <b>just</b> a <i>demo.</i></p>
<p> Of some text</p>
</html>
</xsl:variable>
<xsl:variable name="vContainer" select=
"ext:node-set($vrtfContainer)/*"/>
<xsl:variable name="vrtfSnippet">
<p>Snippet</p>
</xsl:variable>
<xsl:variable name="vSnippet" select=
"ext:node-set($vrtfSnippet)/*"/>
<xsl:variable name="vText" select=
"($vContainer//text()[contains(.,'.')])[last()]"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="/">
<xsl:apply-templates select="$vContainer"/>
</xsl:template>
<xsl:template match="text()">
<xsl:choose>
<xsl:when test="not(generate-id() = generate-id($vText))">
<xsl:value-of select="."/>
</xsl:when>
<xsl:otherwise>
<xsl:call-template name="insertSnippet">
<xsl:with-param name="pText" select="$vText"/>
</xsl:call-template>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<xsl:template name="insertSnippet">
<xsl:param name="pText"/>
<xsl:copy-of select="substring-before($pText, '.')"/>
<xsl:variable name="vTail" select=
"substring-after($pText, '.')"/>
<xsl:choose>
<xsl:when test="not(contains(substring($vTail,2), '.'))">
<xsl:copy-of select="$vSnippet"/>
<xsl:value-of select="concat('.', $vTail)"/>
</xsl:when>
<xsl:otherwise>
<xsl:text>.</xsl:text>
<xsl:call-template name="insertSnippet">
<xsl:with-param name="pText"
select="substring-after($pText, '.')"/>
</xsl:call-template>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied (to any XML document -- not used), the wanted, correct result is produced:
<html>
<p>Hello, world.</p>
<p> This is <b>just</b> a <i>demo
<p>Snippet</p>.</i></p>
<p> Of some text</p>
</html>
Explanation: The identity rule is overridden for the text node in question, and a recursive named template finds the last '.' in its string value.
II. XSLT 2.0 solution:
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:variable name="vContainer">
<html>
<p>Hello, world.</p>
<p> This is <b>just</b> a <i>demo.</i></p>
<p> Of some text</p>
</html>
</xsl:variable>
<xsl:variable name="vSnippet">
<p>Snippet</p>
</xsl:variable>
<xsl:variable name="vText" select=
"($vContainer//text()[contains(.,'.')])[last()]"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="/">
<xsl:apply-templates select="$vContainer/*"/>
</xsl:template>
<xsl:template match="text()[. is $vText]">
<xsl:variable name="vInd" select=
"index-of(string-to-codepoints(.), string-to-codepoints('.'))[last()]"/>
<xsl:sequence select="substring(., 1, $vInd -1)"/>
<xsl:sequence select="$vSnippet/*"/>
<xsl:sequence select="substring(., $vInd)"></xsl:sequence>
</xsl:template>
</xsl:stylesheet>
When this XSLT 2.0 transformation is performed, again the wanted correct result is produced:
<html>
<p>Hello, world.</p>
<p> This is <b>just</b> a <i>demo
<p>Snippet</p>.</i></p>
<p> Of some text</p>
</html>
Explanation: Use of the standard XPath 2.0 functions string-to-codepoints(), index-of(), substring() and operator is.
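To see the index computation in isolation, here is a small sketch that splits a string at its last period. The sample string 'a.b.c' and the variable names vStr and vInd are made up for this illustration; the functions are the same standard XPath 2.0 functions used in the solution above.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Hypothetical sample value, not taken from the question -->
    <xsl:variable name="vStr" select="'a.b.c'"/>
    <!-- index-of() returns the positions of every '.' codepoint;
         [last()] keeps only the final one -->
    <xsl:variable name="vInd" select=
      "index-of(string-to-codepoints($vStr), string-to-codepoints('.'))[last()]"/>
    <!-- Head (before the last period) and tail (from the period on),
         separated by | to make the split point visible -->
    <xsl:value-of select="substring($vStr, 1, $vInd - 1), substring($vStr, $vInd)"
                  separator="|"/>
  </xsl:template>
</xsl:stylesheet>
```

For 'a.b.c' the period positions are 2 and 4, so the head is 'a.b' and the tail is '.c'. The snippet is inserted between those two pieces.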

Counting distinct items and parsing comma-delimited values using XSLT

Suppose I have XML like this:
<child_metadata>
<metadata>
<attributes>
<metadata_valuelist value="[SampleItem3]"/>
</attributes>
</metadata>
<metadata>
<attributes>
<metadata_valuelist value="[SampleItem1]"/>
</attributes>
</metadata>
<metadata>
<attributes>
<metadata_valuelist value="[SampleItem1, SampleItem2]"/>
</attributes>
</metadata>
</child_metadata>
What I want to do is count the number of distinct values that are in the metadata_valuelists. There are the following distinct values: SampleItem1, SampleItem2, and SampleItem3. So, I want to get a value of 3. (Although SampleItem1 occurs twice, I only count it once.)
How can I do this in XSLT?
I realize there are two problems here: First, separating the comma-delimited values in the lists, and, second, counting the number of unique values. However, I'm not certain that I could combine solutions to the two problems, which is why I'm asking it as one question.
Another way without extension:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:variable name="all-value" select="/*/*/*/*/@value"/>
<xsl:template match="/">
<xsl:variable name="count">
<xsl:apply-templates select="$all-value"/>
</xsl:variable>
<xsl:value-of select="string-length($count)"/>
</xsl:template>
<xsl:template match="@value" name="value">
<xsl:param name="meta" select="translate(.,'[] ','')"/>
<xsl:choose>
<xsl:when test="contains($meta,',')">
<xsl:call-template name="value">
<xsl:with-param name="meta" select="substring-before($meta,',')"/>
</xsl:call-template>
<xsl:call-template name="value">
<xsl:with-param name="meta" select="substring-after($meta,',')"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:if test="count(.|$all-value[contains(translate(.,'[], ','&#xA;&#xA;&#xA;'),
concat('&#xA;',$meta,'&#xA;'))][1])=1">
<xsl:value-of select="1"/>
</xsl:if>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
Note: this could probably be optimized by using xsl:key instead of xsl:variable.
Edit: Match tricky metadata.
This (note: just a single) transformation:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt"
>
<xsl:output method="text"/>
<xsl:strip-space elements="*"/>
<xsl:key name="kValue" match="value" use="."/>
<xsl:template match="/">
<xsl:variable name="vRTFPass1">
<values>
<xsl:apply-templates/>
</values>
</xsl:variable>
<xsl:variable name="vPass1"
select="msxsl:node-set($vRTFPass1)"/>
<xsl:for-each select="$vPass1">
<xsl:value-of select=
"count(*/value[generate-id()
=
generate-id(key('kValue', .)[1])
]
)
"/>
</xsl:for-each>
</xsl:template>
<xsl:template match="metadata_valuelist">
<xsl:call-template name="tokenize">
<xsl:with-param name="pText" select="translate(@value, '[],', '')"/>
</xsl:call-template>
</xsl:template>
<xsl:template name="tokenize">
<xsl:param name="pText" />
<xsl:choose>
<xsl:when test="not(contains($pText, ' '))">
<value><xsl:value-of select="$pText"/></value>
</xsl:when>
<xsl:otherwise>
<value>
<xsl:value-of select="substring-before($pText, ' ')"/>
</value>
<xsl:call-template name="tokenize">
<xsl:with-param name="pText" select=
"substring-after($pText, ' ')"/>
</xsl:call-template>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
when applied on the provided XML document:
<child_metadata>
<metadata>
<attributes>
<metadata_valuelist value="[SampleItem3]"/>
</attributes>
</metadata>
<metadata>
<attributes>
<metadata_valuelist value="[SampleItem1]"/>
</attributes>
</metadata>
<metadata>
<attributes>
<metadata_valuelist value="[SampleItem1, SampleItem2]"/>
</attributes>
</metadata>
</child_metadata>
produces the wanted, correct result:
3
Do note: Because this is an XSLT 1.0 solution, it is necessary to convert the results of the first pass from the infamous RTF type to a regular tree. This is done using your XSLT 1.0 processor's xxx:node-set() function -- in my case I used msxsl:node-set().
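If your processor is not MSXML, the same conversion is commonly available through the EXSLT node-set() function in the http://exslt.org/common namespace (the XSLT 1.0 solution to the snippet-insertion question above uses it under the prefix ext). Whether a given processor supports it is something to check in its documentation. Here is a minimal sketch of the conversion on its own, with a made-up result tree fragment:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ext="http://exslt.org/common"
    exclude-result-prefixes="ext">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- A result tree fragment (RTF) built in a variable; sample data only -->
    <xsl:variable name="vRTF">
      <values><value>a</value><value>b</value><value>a</value></values>
    </xsl:variable>
    <!-- Convert the RTF to a node-set so XPath can navigate into it -->
    <xsl:variable name="vNodes" select="ext:node-set($vRTF)"/>
    <!-- Counts all value elements, duplicates included -->
    <xsl:value-of select="count($vNodes/values/value)"/>
  </xsl:template>
</xsl:stylesheet>
```

In the transformation above, the only changes needed would be the namespace declaration and the prefix in the select of $vPass1.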
You probably want to think about doing this in two stages; first, do a transform that breaks down these value attributes, then it's fairly trivial to count them.
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="@value">
<xsl:call-template name="breakdown">
<xsl:with-param name="itemlist" select="substring-before(substring-after(.,'['),']')" />
</xsl:call-template>
</xsl:template>
<xsl:template name="breakdown">
<xsl:param name="itemlist" />
<xsl:choose>
<xsl:when test="contains($itemlist,',')">
<xsl:element name="value">
<xsl:value-of select="normalize-space(substring-before($itemlist,','))" />
</xsl:element>
<xsl:call-template name="breakdown">
<xsl:with-param name="itemlist" select="substring-after($itemlist,',')" />
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:element name="value">
<xsl:value-of select="normalize-space($itemlist)" />
</xsl:element>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Aside from the 'catch all' template at the bottom, this picks up any value attributes in the format you gave, and breaks them down into separate elements (as sub-elements of the 'metadata_valuelist' element) like this:
...
<metadata_valuelist>
<value>SampleItem1</value>
<value>SampleItem2</value>
</metadata_valuelist>
...
The 'substring-before'/'substring-after' select you see near the top strips off the '[' and ']' before passing the value to the 'breakdown' template. This template checks whether there's a comma in its 'itemlist' parameter. If there is, it outputs the text before the comma as the content of a 'value' element, then recursively calls itself with the rest of the list. If there is no comma in the parameter, it just outputs the entire content of the parameter as a 'value' element.
Then just run this:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" />
<xsl:key name="itemvalue" match="value" use="text()" />
<xsl:template match="/">
<xsl:value-of select="count(//value[generate-id(.) = generate-id(key('itemvalue',.)[1])])" />
</xsl:template>
</xsl:stylesheet>
on the XML you get from the first transform, and it'll just spit out a single value as text output that tells you how many distinct values you have.
EDIT: I should probably point out, this solution makes a few assumptions about your input:
There are no attributes named 'value' anywhere else in the document; if there are, you can modify the @value match to pick out these ones specifically.
There are no elements named 'value' anywhere else in the document; as the first transform creates them, the second will not be able to distinguish between the two. If there are, you can replace the two <xsl:element name="value"> lines with an element name that's not already used.
The content of the @value attribute always begins with '[' and ends with ']', and there are no ']' characters within the list; if there are, the 'substring-before' function will drop everything after the first ']', rather than just the ']' at the end.
There are no commas in the names of the items you want to count, e.g. [SampleItem1, "Sample2,3"]. If there are, '"Sample2' and '3"' would be treated as separate items.
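If an XSLT 2.0 processor is an option, the splitting and the counting can be combined in a single XPath expression. This is not one of the answers above, just a minimal sketch under the same assumptions about the input. It also assumes that item names contain no spaces, because translate() removes every space along with the brackets.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- For each value attribute: drop '[', ']' and spaces, split on commas,
         then count the distinct tokens across all attributes -->
    <xsl:value-of select="count(distinct-values(
        for $v in //metadata_valuelist/@value
        return tokenize(translate($v, '[] ', ''), ',')))"/>
  </xsl:template>
</xsl:stylesheet>
```

For the sample document in the question the tokens are SampleItem3, SampleItem1, SampleItem1 and SampleItem2, so the result is 3.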