If I have a text node with a trailing full stop (or period in US English), what expression can I use to strip the full stop and leave the remainder?
E.g. input:
<ol>
<li>This is the first item.</li>
<li>This is the second</li>
<li>This is the 3rd. </li>
</ol>
required output
<ol>
<li>This is the first item</li>
<li>This is the second</li>
<li>This is the 3rd</li>
</ol>
I have this but it seems unnecessarily cumbersome
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:template match="/">
<xsl:apply-templates select="ol"/>
</xsl:template>
<xsl:template match="ol">
<ol>
<xsl:apply-templates select="li"/>
</ol>
</xsl:template>
<xsl:template match="li">
<li><xsl:apply-templates select="text()" mode="clean-text"/></li>
</xsl:template>
<xsl:template match="text()" mode="clean-text">
<xsl:variable name="normal-text" select="normalize-space(.)"/>
<xsl:choose>
<xsl:when test="substring($normal-text,string-length($normal-text),1) = '.'"><xsl:value-of select="normalize-space(substring($normal-text,1,string-length($normal-text)-1))"/></xsl:when>
<xsl:otherwise><xsl:value-of select="$normal-text"/></xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
Is there a cleverer way to achieve the same thing?
BTW I am using v1.0 as this may be instantiated in a Microsoft environment.
But a v2.0 solution would be of interest too
TIA
In XSLT 2 or 3 you have the replace function:
<xsl:template match="ol/li[ends-with(normalize-space(.), '.')]">
<xsl:copy>
<xsl:value-of select="replace(., '\.\s*$', '')"/>
</xsl:copy>
</xsl:template>
In XSLT 1 environments you can often call into the underlying platform (e.g. Java, .NET, PHP or Python) to use similar string functions that support regular expressions such as \.\s*$, which matches a single full stop followed by zero or more whitespace characters at the end of the string.
Or try to do it all with pure XPath 1 string functions:
<xsl:template match="ol/li[substring(normalize-space(), string-length(normalize-space())) = '.']">
<xsl:copy>
<xsl:value-of select="substring(normalize-space(), 1, string-length(normalize-space()) - 1)"/>
</xsl:copy>
</xsl:template>
In all cases, copy everything else through unchanged with the identity transformation template: https://xsltfiddle.liberty-development.net/jxNakAW
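As a minimal sketch of the identity-template approach mentioned above, here is a complete XSLT 1.0 stylesheet that pairs the identity template with the pure XPath 1 template. It assumes the li elements carry no attributes or child elements that need preserving, because xsl:value-of replaces their content with plain text.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- Identity template: copies every node and attribute unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <!-- Overrides the identity template only for li elements whose
       normalized text ends in a full stop -->
  <xsl:template match="ol/li[substring(normalize-space(), string-length(normalize-space())) = '.']">
    <xsl:copy>
      <xsl:value-of select="substring(normalize-space(), 1, string-length(normalize-space()) - 1)"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

List items that do not end in a full stop fall through to the identity template and are copied as they are, including any trailing whitespace.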
Kindly help me to wrap the img.inline element with the following sibling text comma (if comma exists):
text <img id="1" class="inline" src="1.jpg"/> another text.
text <img id="2" class="inline" src="2.jpg"/>, another text.
Should be changed to:
text <img id="1" class="inline" src="1.jpg"/> another text.
text <span class="img-wrap"><img id="2" class="inline" src="2.jpg"/>,</span> another text.
Currently, my XSLT will wrap the img.inline element and add comma inside the span, now I want to remove the following comma.
text <span class="img-wrap"><img id="2" class="inline" src="2.jpg"/>,</span>
, <!--remove this extra comma--> another text.
My XSLT:
<xsl:template match="//img[@class='inline']">
<xsl:copy>
<xsl:choose>
<xsl:when test="starts-with(following-sibling::text(), ',')">
<span class="img-wrap">
<xsl:apply-templates select="node()|@*"/>
<xsl:text>,</xsl:text>
</span>
</xsl:when>
<xsl:otherwise>
<xsl:apply-templates select="node()|@*"/>
</xsl:otherwise>
</xsl:choose>
</xsl:copy>
<!-- checking following-sibling::text() -->
<xsl:apply-templates select="following-sibling::text()" mode="commatext"/>
</xsl:template>
<!-- here I want to match the following text, if comma, then remove it -->
<xsl:template match="the following comma" mode="commatext">
<!-- remove comma -->
</xsl:template>
Is my approach correct, or should this be handled differently? Please suggest.
Currently you are copying the img and then embedding the span within it. Also, you do <xsl:apply-templates select="node()|@*"/>, which selects the child nodes of img (of which there are none) and its attributes, which end up being added to the span.
You don't actually need the xsl:choose here as you can add the condition to the match attribute.
<xsl:template match="//img[@class='inline'][starts-with(following-sibling::node()[1][self::text()], ',')]">
Note I have changed the condition, as following-sibling::text() selects ALL text nodes that follow the img node. You only want the node immediately after the img node, and only if it is a text node.
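To see the difference between the two conditions, a small XSLT 1.0 sketch like the following can help. The test input is an assumption made up for illustration: `<p>text <img class="inline"/><b>x</b>, more</p>`, where an element sits between the img and the comma.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:for-each select="//img">
      <!-- Uses the first following text sibling, even if other nodes come before it -->
      <xsl:value-of select="starts-with(following-sibling::text(), ',')"/>
      <xsl:text> vs </xsl:text>
      <!-- Uses the immediately following node, and only if it is a text node -->
      <xsl:value-of select="starts-with(following-sibling::node()[1][self::text()], ',')"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

For that input the first test is true (the text ", more" is the first following text sibling) while the second is false (the node right after the img is the b element), so the line printed is "true vs false".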
Also, trying to select the following text node with xsl:apply-templates is probably not the right approach, assuming you have a template that matches the parent node which selects all child nodes (not just img ones). I am assuming you were using the identity template here.
Anyway, try this XSLT instead
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="html" indent="no" />
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()" />
</xsl:copy>
</xsl:template>
<xsl:template match="//img[@class='inline'][starts-with(following-sibling::node()[1][self::text()], ',')]">
<span class="img-wrap">
<xsl:copy-of select="." />
<xsl:text>,</xsl:text>
</span>
</xsl:template>
<xsl:template match="text()[starts-with(., ',')][preceding-sibling::node()[1][self::img]/@class='inline']">
<xsl:value-of select="substring(., 2)" />
</xsl:template>
</xsl:stylesheet>
I've the below two XML cases.
Case1:
<para>Rent is the sum of money paid by the Tenant to the Landlord for the exclusive use of premises. The Landlord and Tenant signs a <page num="4"/>tenancy agreement which has to be stamped with the tax authorities as required under the Stamp Duties Act. The stamping of a tenancy agreement gives it validity but if the tenancy agreement is not stamped that does not mean</para>
Case2:
<para><page num="5"/>The Writ of Distress proceedings is an effective way to recover arrears in rent but regard must be had to the Landlord/Tenant relationship and the effect of publicity of such proceedings to the image of the building amongst other things.</para>
and the below XSLT
<xsl:template match="para">
<xsl:apply-templates select="child::node()[(self::page)]"/>
<li class="item">
<div class="para">
<span class="item-num">
<xsl:value-of select="../@num"></xsl:value-of>
</span>
<xsl:apply-templates select="child::node()[not(self::page)]"/>
</div>
</li>
</xsl:template>
<xsl:template match="page">
<xsl:processing-instruction name="pb">
<xsl:text>label='</xsl:text>
<xsl:value-of select="./@num"/>
<xsl:text>'</xsl:text>
<xsl:text>?</xsl:text>
</xsl:processing-instruction>
<a name="{concat('pg_',./@num)}"/>
<xsl:apply-templates/>
</xsl:template>
What I'm trying to do is check if page is the immediate(first) child of para and print that value first and then do the rest. But in both the cases, the page is printed first.
In the cases above, for case 1 the page should be processed just like any other template in para, since it is not the first child of para. In case 2, the page has to be printed first and then the rest of the template applied, as page num="5" is the first child of para. Please let me know how I can do this.
I think what you mean is that you want to perform extra processing when page is the first child node under para. Your apply-templates need to look like this
<xsl:apply-templates select="node()[1][self::page]" />
However, it also sounds like you want to perform other processing on page elements regardless. You probably need two templates matching page here, but one with a "mode" to distinguish it from your normal processing.
Call it like this
<xsl:apply-templates select="node()[1][self::page]" mode="first"/>
And match it like this
<xsl:template match="page" mode="first">
This would contain the code to output your processing instruction.
For "normal" processing of the page element, just have another template matching page without the mode
<xsl:template match="page">
<a name="{concat('pg_',./@num)}"/>
<xsl:apply-templates/>
</xsl:template>
Try this XSLT
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*" />
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="para">
<xsl:apply-templates select="node()[1][self::page]" mode="first"/>
<li class="item">
<div class="para">
<span class="item-num">
<xsl:value-of select="../@num"></xsl:value-of>
</span>
<xsl:apply-templates />
</div>
</li>
</xsl:template>
<xsl:template match="page">
<a name="{concat('pg_',./@num)}"/>
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="page" mode="first">
<xsl:processing-instruction name="pb">
<xsl:text>label='</xsl:text>
<xsl:value-of select="./@num"/>
<xsl:text>'</xsl:text>
</xsl:processing-instruction>
</xsl:template>
</xsl:stylesheet>
EDIT: If you don't want both "page" templates to apply to the first page element, then add the following template to ignore it
<xsl:template match="page[not(preceding-sibling::node())]" />
Note, this will only work if you have <xsl:strip-space elements="*" /> present in your stylesheet, to strip out whitespace-only text nodes. Alternatively, you could write this
<xsl:template match="page[not(preceding-sibling::node()[not(self::text()) or normalize-space()])]" />
EDIT 2: The reason you need the extra templates is because of this line
<xsl:apply-templates />
This will look for templates that match all the child nodes under the current para element. So, for a page element, the following template will be matched
<xsl:template match="page">
But you say you don't want the very first page element to be matched in this case. Therefore, you need a more 'specific' template to match it. For example
<xsl:template match="page[not(preceding-sibling::node())]" />
This template matches page elements with no preceding siblings; i.e. a page that is the very first node under para.
XSLT has the concept of priority for templates. A match pattern with a predicate, such as page[...], has a default priority of 0.5, whereas a pattern that is just an element name, such as page, has a default priority of 0, so the more specific template wins when both match. In this case, the specific template simply ignores the page element, to ensure it doesn't get output.
For other page elements, the other template will be used as normal.
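As a sketch of how the two default-mode page templates interact, the stylesheet below repeats them side by side and spells the priority out. The priority="1" value is an illustrative assumption; the default priorities defined by XSLT 1.0 already make the predicate template win.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- Pattern is a bare element name: default priority 0 -->
  <xsl:template match="page">
    <a name="{concat('pg_', @num)}"/>
    <xsl:apply-templates/>
  </xsl:template>
  <!-- Pattern has a predicate: default priority 0.5, so it already wins.
       priority="1" only makes that choice explicit (illustrative value). -->
  <xsl:template match="page[not(preceding-sibling::node())]" priority="1"/>
</xsl:stylesheet>
```

An explicit priority mainly becomes necessary when two matching patterns would otherwise have the same default priority.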
I have an XML document which contains the following example extract:
<p>
Some text <GlossaryTermRef href="123">term 1</GlossaryTermRef><GlossaryTermRef href="345">term 2</GlossaryTermRef>.
</p>
I am using XSLT to transform this to XHTML using the following template:
<xsl:template match="GlossaryTermRef">
<a href="#{@href}" class="glossary">
<xsl:apply-templates select="node()|text()"/>
</a>
</xsl:template>
This works quite well, however I need to insert a space between the two GlossaryTermRef elements if they appear next to each other?
Is there a way to detect whether there is either space or text between the current node and the following-sibling? I can't always insert a space GlossaryTermRef item, as it may be followed by a punctuation mark.
I managed to solve this myself by modifying the template as follows:
<xsl:template match="GlossaryTermRef">
<a href="#{@href}" class="glossary">
<xsl:apply-templates select="node()|text()"/>
</a>
<xsl:if test="following-sibling::node()[1][self::GlossaryTermRef]">
<xsl:text> </xsl:text>
</xsl:if>
</xsl:template>
Can anyone suggest a better way, or see any problems with this solution?
Firstly, "node()|text()" is a longwinded equivalent of "node()". Perhaps you meant "*|text()", which would select the element and text children but not the comments or PIs.
Your solution is probably as good as any. Another would be to use grouping:
<xsl:for-each-group select="node()" group-adjacent="boolean(self::GlossaryTermRef)">
<xsl:choose>
<xsl:when test="current-grouping-key()">
<xsl:for-each select="current-group()">
<xsl:if test="position() gt 1"><xsl:text> </xsl:text></xsl:if>
<xsl:apply-templates select="."/>
</xsl:for-each>
</xsl:when>
<xsl:otherwise>
<xsl:apply-templates select="current-group()"/>
</xsl:otherwise>
</xsl:choose>
</xsl:for-each-group>
Naah, that's not pretty at all.
My next attempt would be to use sibling recursion (where the parent does apply-templates on the first child, and each child does apply-templates on the immediately following sibling), but I don't think that's going to be an improvement either.
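For comparison, a sibling-recursion version might look roughly like the sketch below (XSLT 1.0). The mode name "walk" is a hypothetical choice, and the sketch assumes every other node can simply be copied through. It still needs the same following-sibling test, which supports the view that it is no real improvement.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- The parent starts the walk with its first child only -->
  <xsl:template match="p">
    <p>
      <xsl:apply-templates select="node()[1]" mode="walk"/>
    </p>
  </xsl:template>

  <!-- GlossaryTermRef: emit the link, an optional separator, then step right -->
  <xsl:template match="GlossaryTermRef" mode="walk">
    <a href="#{@href}" class="glossary">
      <xsl:apply-templates select="node()[1]" mode="walk"/>
    </a>
    <xsl:if test="following-sibling::node()[1][self::GlossaryTermRef]">
      <xsl:text> </xsl:text>
    </xsl:if>
    <xsl:apply-templates select="following-sibling::node()[1]" mode="walk"/>
  </xsl:template>

  <!-- Any other node: copy it, then continue with the next sibling -->
  <xsl:template match="node()" mode="walk">
    <xsl:copy-of select="."/>
    <xsl:apply-templates select="following-sibling::node()[1]" mode="walk"/>
  </xsl:template>
</xsl:stylesheet>
```

Each node hands control to its immediate right-hand sibling, so the walk stops on its own when there is no following sibling left.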
What about this one? what do you feel?
<xsl:template match="GlossaryTermRef">
<a href="#{@href}" class="glossary">
<xsl:apply-templates select="node()|text()"/>
</a>
</xsl:template>
I'd like to trim the leading whitespace inside p tags in XML, so this:
<p> Hey, <em>italics</em> and <em>italics</em>!</p>
Becomes this:
<p>Hey, <em>italics</em> and <em>italics</em>!</p>
(Trimming trailing whitespace won't hurt, but it's not mandatory.)
Now, I know normalize-space() is supposed to do this, but if I try to apply it to the text nodes...
<xsl:template match="text()">
<xsl:text>[</xsl:text>
<xsl:value-of select="normalize-space(.)"/>
<xsl:text>]</xsl:text>
</xsl:template>
...it's applied to each text node (in brackets) individually and sucks them dry:
[Hey,]<em>[italics]</em>[and]<em>[italics]</em>[!]
My XSLT looks basically like this:
<xsl:template match="p">
<xsl:apply-templates/>
</xsl:template>
So is there any way I can let apply-templates complete and then run normalize-space on the output, which should do the right thing?
This stylesheet:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="p//text()[1][generate-id()=
generate-id(ancestor::p[1]
/descendant::text()[1])]">
<xsl:variable name="vFirstNotSpace"
select="substring(normalize-space(),1,1)"/>
<xsl:value-of select="concat($vFirstNotSpace,
substring-after(.,$vFirstNotSpace))"/>
</xsl:template>
</xsl:stylesheet>
Output:
<p>Hey, <em>italics</em> and <em>italics</em>!</p>
Edit 2: Better expression (now only three function calls).
Edit 3: Matching the first descendant text node (not just the first node if it's a text node). Thanks to @Dimitre's comment.
Now, with this input:
<p><b> Hey, </b><em>italics</em> and <em>italics</em>!</p>
Output:
<p><b>Hey, </b><em>italics</em> and <em>italics</em>!</p>
I would do something like this:
<xsl:template match="p">
<xsl:apply-templates/>
</xsl:template>
<!-- strip leading whitespace -->
<xsl:template match="p/node()[1][self::text()]">
<xsl:call-template name="left-trim">
<xsl:with-param name="s" select="."/>
</xsl:call-template>
</xsl:template>
This will strip left space from the initial node child of a <p> element, if it is a text node. It will not strip space from the first text node child, if it is not the first node child. E.g. in
<p><em>Hey</em> there</p>
I intentionally avoid stripping the space from the front of 'there', because that would make the words run together when rendered in a browser. If you did want to strip that space, change the match pattern to
match="p/text()[1]"
If you also want to strip trailing whitespace, as your title possibly implies, add these two templates:
<!-- strip trailing whitespace -->
<xsl:template match="p/node()[last()][self::text()]">
<xsl:call-template name="right-trim">
<xsl:with-param name="s" select="."/>
</xsl:call-template>
</xsl:template>
<!-- strip leading/trailing whitespace on sole text node -->
<xsl:template match="p/node()[position() = 1 and
position() = last()][self::text()]"
priority="2">
<xsl:value-of select="normalize-space(.)"/>
</xsl:template>
The definitions of the left-trim and right-trim templates are at Trim Template for XSLT (untested). They might be slow for documents with lots of <p>s. If you can use XSLT 2.0, you can replace the call-templates with
<xsl:value-of select="replace(.,'^\s+','')" />
and
<xsl:value-of select="replace(.,'\s+$','')" />
(Thanks to Priscilla Walmsley.)
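The left-trim and right-trim templates called above are defined at the linked page and not reproduced here. As a stand-in, a minimal recursive XSLT 1.0 sketch might look like this. It is an assumption about how such templates could be written, not the linked implementation, and it strips one character per call, so it may be slow or hit recursion limits on very long runs of whitespace.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- Hypothetical left-trim: drops leading whitespace one character at a time -->
  <xsl:template name="left-trim">
    <xsl:param name="s"/>
    <xsl:choose>
      <!-- normalize-space() of a single whitespace character is the empty string -->
      <xsl:when test="string($s) != '' and normalize-space(substring($s, 1, 1)) = ''">
        <xsl:call-template name="left-trim">
          <xsl:with-param name="s" select="substring($s, 2)"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$s"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Hypothetical right-trim: drops trailing whitespace one character at a time -->
  <xsl:template name="right-trim">
    <xsl:param name="s"/>
    <xsl:choose>
      <xsl:when test="string($s) != '' and normalize-space(substring($s, string-length($s))) = ''">
        <xsl:call-template name="right-trim">
          <xsl:with-param name="s" select="substring($s, 1, string-length($s) - 1)"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$s"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

The two named templates would be merged into the stylesheet that contains the matching templates; the parameter name s matches the xsl:with-param used in the calls.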
You want:
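<xsl:template match="text()">
<xsl:value-of select=
"substring(
normalize-space(concat('[',.,']')),
2,
string-length(normalize-space(concat('[',.,']'))) - 2
)"/>
</xsl:template>
This wraps the string in "[]" so that one leading or trailing space survives, then applies normalize-space(), and finally removes the two wrapping characters. The length is taken from the normalized string, so the closing "]" is dropped even when runs of whitespace have been collapsed.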
How do you write element attributes in a specific order without writing it explicitly?
Consider:
<xsl:template match="Element/@1|@2|@3|@4">
<xsl:if test="string(.)">
<span>
<xsl:value-of select="."/><br/>
</span>
</xsl:if>
</xsl:template>
The attributes should appear in the order 1, 2, 3, 4. Unfortunately, you can't guarantee the order of attributes in XML, it could be <Element 2="2" 4="4" 3="3" 1="1">
So the template above will produce the following:
<span>2</span>
<span>4</span>
<span>3</span>
<span>1</span>
Ideally I don't want to test each attribute if it has got a value. I was wondering if I can somehow set an order of my display? Or will I need to do it explicitly and repeating the if test as in:
<xsl:template match="Element">
<xsl:if test="string(./@1)">
<span>
<xsl:value-of select="./@1"/><br/>
</span>
</xsl:if>
...
<xsl:if test="string(./@4)">
<span>
<xsl:value-of select="./@4"/><br/>
</span>
</xsl:if>
</xsl:template>
What can be done in this case?
In an earlier question you seemed to use XSLT 2.0 so I hope this time too an XSLT 2.0 solution is possible.
The order is not determined in the match pattern of a template, rather it is determined when you do xsl:apply-templates. So (with XSLT 2.0) you can simply write a sequence of the attributes in the order you want e.g. <xsl:apply-templates select="@att2, @att1, @att3"/> will process the attributes in that order.
XSLT 1.0 doesn't have sequences, only node-sets. To produce the same result, use xsl:apply-templates in the required order, such as:
<xsl:apply-templates select="@att2"/>
<xsl:apply-templates select="@att1"/>
<xsl:apply-templates select="@att3"/>
Do not produce XML that relies on the order of the attributes. This is very brittle and I would consider it bad style, to say the least. XML was not designed in that way; <elem a="1" b="2" /> and <elem b="2" a="1" /> are explicitly equivalent.
If you want ordered output, order your output (instead of relying on ordered input).
Furthermore, match="Element/@1|@2|@3|@4" is not equivalent to match="Element/@1|Element/@2|Element/@3|Element/@4", but I'm sure you mean the latter.
That being said, you can do:
<xsl:template match="Element/@1|Element/@2|Element/@3|Element/@4">
<xsl:if test="string(.)">
<span>
<xsl:value-of select="."/><br/>
</span>
</xsl:if>
</xsl:template>
<xsl:template match="Element">
<xsl:apply-templates select="@1|@2|@3|@4">
<!-- order your output... -->
<xsl:sort select="name()" />
</xsl:apply-templates>
</xsl:template>
EDIT: I'll take it as read that @1 etc are just examples, because names cannot actually start with a number in XML.
I'd use xsl:sort on the local-name of the attribute to get the result you want. I'd also use a different mode so the results don't get called by accident somewhere else.
<xsl:stylesheet version="1.1" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="Element">
<xsl:apply-templates select="@*" mode="sorted">
<xsl:sort select="local-name()" />
</xsl:apply-templates>
</xsl:template>
<xsl:template match="Element/@a|@b|@c|@d" mode="sorted">
<xsl:if test="string(.)">
<span>
<xsl:value-of select="."/><br/>
</span>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
The clue was in the answer by Martin Honnen.
The following copies the attributes and conditionally adds a new attribute at the end of the list of attributes.
It adds rel="noopener noreferrer" to all external links.
<xsl:template match="a">
<xsl:copy>
<xsl:if test="starts-with(./@href,'http')">
<xsl:apply-templates select="./@*"/>
<!-- Insert rel as last node -->
<xsl:attribute name="rel">noopener noreferrer</xsl:attribute>
</xsl:if>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="a/@href|a/@target|a/@rel">
<!--
Allowed attribute on anchor
-->
<xsl:attribute name="{name()}">
<xsl:value-of select="."></xsl:value-of>
</xsl:attribute>
</xsl:template>
You can also specify the attribute sequence by calling apply templates with each select in the order you want.
<xsl:template match="a">
<xsl:copy>
<xsl:if test="starts-with(./@href,'http')">
<xsl:apply-templates select="./@id"/>
<xsl:apply-templates select="./@href"/>
<xsl:apply-templates select="./@target"/>
<!-- Insert rel as last node -->
<xsl:attribute name="rel">noopener noreferrer</xsl:attribute>
</xsl:if>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>