XSLT: how to match an exact text value within an element and replace it

I need to find a specific text value, 'method', within a document and replace each instance of it with the following:
element to replace method
This 'method' value can appear several times throughout the document. The issue is that I also need to retain the remaining text within the element, apart from 'method' which will be replaced.
<section id="1">
<title>Methods</title>
<p>The test method blah has 6 types of methods available</p>
<p>With the exception of a specific method</p>
</section>
<section id="2">
<title>Organisations</title>
<p>The organisation has a method</p>
</section>
I'm not sure whether fn:replace would be the best approach, or whether I also need to use regular expressions (something I'm not currently familiar with). Any advice on an approach here would be greatly appreciated.
Expected output only replaces the exact text 'method' with the content element, but retains 'methods':
<section id="1">
<title>Methods</title>
<p>The test <content type="description" xlink:href="linktodescription">method</content> blah has 6 types of methods available</p>
</section>
<section id="2">
<title>Organisations</title>
<p>The organisation has a <content type="description" xlink:href="linktodescription">method</content></p>
</section>

Assuming Saxon 9 as the XSLT processor, you can use xsl:analyze-string (based on http://saxonica.com/html/documentation/xsl-elements/analyze-string.html):
<xsl:template match="section/p">
<xsl:copy>
<xsl:analyze-string select="." regex="\bmethod\b" flags=";j">
<xsl:matching-substring>
<content type="description" xlink:href="linktodescription">
<xsl:value-of select="."/>
</content>
</xsl:matching-substring>
<xsl:non-matching-substring>
<xsl:value-of select="."/>
</xsl:non-matching-substring>
</xsl:analyze-string>
</xsl:copy>
</xsl:template>
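For context, here is a minimal sketch of how that template could sit in a complete stylesheet. It rests on a few assumptions not stated above: the xlink prefix is bound to the standard XLink namespace, an identity template copies everything else, and the match is moved from section/p to the text nodes inside it so that any inline child elements of p survive. In Saxon 9 the ";j" suffix in flags requests the Java regex engine, which is what makes \b available; the standard XPath 2.0 regex syntax has no word-boundary escape, so this part is processor-specific.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xlink="http://www.w3.org/1999/xlink">

  <!-- Identity template: copy everything that is not handled below -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Assumption: work on text nodes rather than on p itself,
       so inline markup inside p is not flattened to a string -->
  <xsl:template match="section/p//text()">
    <!-- ";j" is Saxon-specific: it switches to Java regex syntax so \b works -->
    <xsl:analyze-string select="." regex="\bmethod\b" flags=";j">
      <xsl:matching-substring>
        <content type="description" xlink:href="linktodescription">
          <xsl:value-of select="."/>
        </content>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

Because the regex is anchored with word boundaries on both sides, "methods" and "Methods" are left untouched and only the standalone word "method" is wrapped.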

Related

Exclude first element of a certain type when doing apply-templates

This is my source XML:
<DEFINITION>
<DEFINEDTERM>criminal proceeding</DEFINEDTERM>
<TEXT> means a prosecution for an offence and includes –</TEXT>
<PARAGRAPH>
<TEXT>a proceeding for the committal of a person for trial or sentence for an offence; and</TEXT>
</PARAGRAPH>
<PARAGRAPH>
<TEXT>a proceeding relating to bail –</TEXT>
</PARAGRAPH>
<TEXT>but does not include a prosecution that is a prescribed taxation offence within the meaning of Part III of the Taxation Administration Act 1953 of the Commonwealth;</TEXT>
</DEFINITION>
This is my XSL:
<xsl:template name="DEFINITION" match="DEFINITION">
<xsl:element name="body">
<xsl:attribute name="break">before</xsl:attribute>
<xsl:element name="defn">
<xsl:attribute name="id" />
<xsl:attribute name="scope" />
<xsl:value-of select="DEFINEDTERM" />
</xsl:element>
<xsl:element name="text">
<xsl:value-of select="replace(TEXT[1],'–','--')" />
</xsl:element>
</xsl:element>
<xsl:apply-templates select="*[not(self::TEXT[1])]" />
</xsl:template>
As per my XSL, I want to do something with the DEFINEDTERM element and the TEXT element that immediately follows it.
Then I want to apply-templates to the rest of the elements, except for the DEFINEDTERM and TEXT element that have already been dealt with. Most importantly, I don't want to apply templates to the first TEXT element.
How do I achieve this? My XSL above does not work.
I have other templates for TEXT and PARAGRAPH, but not DEFINEDTERM. I have <xsl:template match="*|@*" /> at the top of the XSL.
You did not post the expected result nor a minimal reproducible example, so I can only guess you want to do:
<xsl:template match="DEFINITION">
<body break="before">
<defn id="" scope="">
<xsl:value-of select="DEFINEDTERM" />
</defn>
<text>
<xsl:value-of select="replace(DEFINEDTERM/following-sibling::TEXT[1],'–','--')" />
</text>
</body>
<xsl:apply-templates select="* except (DEFINEDTERM | DEFINEDTERM/following-sibling::TEXT[1])" />
</xsl:template>
At least that's how I understand:
"I want to do something with the DEFINEDTERM element and the TEXT element that immediately follows it."
This is assuming you are using XSLT 2.0 or higher (otherwise you would not be able to use the replace() function).
--
P.S. You might want to make this a bit more efficient by defining DEFINEDTERM/following-sibling::TEXT[1] as a variable first, then referring to the variable instead.
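A minimal sketch of that variable-based variant follows. The variable name first-text is hypothetical; everything else is taken from the template above. As a side note on why the original select did not work: in *[not(self::TEXT[1])] the [1] applies to the self axis, which never holds more than one node, so every TEXT child satisfies self::TEXT[1] and all of them are excluded, not just the first.

```xml
<xsl:template match="DEFINITION">
  <!-- Hypothetical variable name: the TEXT element right after DEFINEDTERM -->
  <xsl:variable name="first-text"
                select="DEFINEDTERM/following-sibling::TEXT[1]"/>
  <body break="before">
    <defn id="" scope="">
      <xsl:value-of select="DEFINEDTERM"/>
    </defn>
    <text>
      <xsl:value-of select="replace($first-text, '–', '--')"/>
    </text>
  </body>
  <!-- Process every child except the two that were handled above -->
  <xsl:apply-templates select="* except (DEFINEDTERM | $first-text)"/>
</xsl:template>
```

Like the template above, this needs XSLT 2.0 or higher, since both replace() and the except operator are unavailable in 1.0.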

Change attribute value to position of another element with corresponding attribute value

I have a single XHTML document that contains span and div elements that refer to page breaks of a print version using id and epub:type attributes. For example: <div epub:type="pagebreak" id="page-3"/>. The document also has links to those elements, for example: <a href="#page-3">3</a>.
This single XHTML document will be split into multiple XHTML documents to form an EPUB package. For this reason, the href attributes need to be updated to match the new location of the corresponding id. For example: <a href="02.xhtml#page-3">3</a>. The name of the new XHTML file is equal to the position of the body/section elements. So in the last example, the page break with id="page-3" is apparently in the second body/section element.
I'm using the following XSLT 2.0 stylesheet:
<!--identity transform-->
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<!--variable to match id of elements with pagebreak values-->
<xsl:variable name="page-id" select="//*[@epub:type = 'pagebreak']/@id"/>
<!--update href attributes to match new filenames-->
<xsl:template match="a/@href">
<xsl:choose>
<xsl:when test="tokenize(., '#')[last()] = $page-id">
<xsl:attribute name="href">
<xsl:number count="//body/section[$page-id = tokenize(., '#')[last()]]" format="01"/>
<xsl:value-of select="concat('.xhtml', .)"/>
</xsl:attribute>
</xsl:when>
<xsl:otherwise>
<xsl:copy-of select="."/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
It checks for href attributes that have a corresponding id using the $page-id variable. If there is a match, the href attribute should be updated using xsl:number with a count attribute. Otherwise, the href should remain unchanged. The test seems to work; however, I'm not getting the result I want. This is the input:
<body>
<section>
<p>Link to page 3: <a href="#page-3">3</a></p>
</section>
<section>
<div epub:type="pagebreak" id="page-3"/>
</section>
</body>
This is the output I get:
<body>
<section>
<p>Link to page 3: 3</p>
</section>
<section>
<div epub:type="pagebreak" id="page-3"/>
</section>
</body>
This is the output I want:
<body>
<section>
<p>Link to page 3: <a href="02.xhtml#page-3">3</a></p>
</section>
<section>
<div epub:type="pagebreak" id="page-3"/>
</section>
</body>
It seems as if the XPath expression within xsl:number doesn't return a result, but I can't figure out why. Can anyone help me with this please?
I think you want e.g.
<xsl:template match="body/section" mode="number">
<xsl:number format="01"/>
</xsl:template>
and then instead of
<xsl:number count="//body/section[$page-id = tokenize(., '#')[last()]]" format="01"/>
use
<xsl:apply-templates select="key('page-id', substring-after(., '#'))" mode="number"/>
plus a key declaration
<xsl:key name="page-id" match="body/section" use=".//*[@epub:type = 'pagebreak']/@id"/>
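Put together, the pieces above might look like the following sketch. It assumes, as in the sample input, that the elements are in no namespace (for real XHTML you would probably need xpath-default-namespace on the stylesheet) and that every internal href has the form "#id". Instead of xsl:choose it uses a predicate on the match pattern, which is a design choice of this sketch and not part of the answer above. One likely reason the original attempt produced no number: the count attribute of xsl:number is a pattern describing which nodes to count, and inside its predicate "." is the section being tested, not the href attribute.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:epub="http://www.idpf.org/2007/ops">

  <!-- Index each body/section by the ids of the pagebreaks it contains -->
  <xsl:key name="page-id" match="body/section"
           use=".//*[@epub:type = 'pagebreak']/@id"/>

  <!-- Identity transform -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Output the position of a section among its sibling sections: 01, 02, ... -->
  <xsl:template match="body/section" mode="number">
    <xsl:number format="01"/>
  </xsl:template>

  <!-- Only rewrite hrefs whose fragment points at a known pagebreak id;
       all other hrefs fall through to the identity transform -->
  <xsl:template match="a/@href[key('page-id', substring-after(., '#'))]">
    <xsl:attribute name="href">
      <xsl:apply-templates select="key('page-id', substring-after(., '#'))"
                           mode="number"/>
      <xsl:value-of select="concat('.xhtml', .)"/>
    </xsl:attribute>
  </xsl:template>

</xsl:stylesheet>
```

The key lookup goes from the id in the href to the section that contains the matching pagebreak, and xsl:number then runs with that section as its context node, so the default counting (same-named siblings) gives the file number.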

XSL flatten with inherit

I want to flatten an XML document so that every element inherits the attributes of its ancestors and each <span/> becomes a <text/>.
Input:
<el value="
<span bold="true">
one
<span italics="true">
two
<span superscript="true">
three
</span>
</span>
</span>
<span subscript="true">
four
</span>
"/>
Output:
<text bold="true">one</text>
<text bold="true" italics="true">two</text>
<text bold="true" italics="true" superscript="true">three</text>
<text subscript="true">four</text>
I've tried using copy-of with .. but that obviously only copies one level up from the input. I presume I need a variable but I am unsure of how to operate on it - it doesn't seem like I can do <xsl:value-of select="$text-element"><!--call template--></xsl:value-of>. The fact that this is a string inside an attribute doesn't help either...
Something like this might help, once you make the XML well-formed:
<xsl:template match="text()">
<text>
<xsl:copy-of select="ancestor::*/@*"/>
<xsl:value-of select="normalize-space()"/>
</text>
</xsl:template>
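As a rough sketch of how this could be run end to end, assume the markup has first been extracted from the attribute into a well-formed document, with a hypothetical wrapper such as <el><span bold="true">one<span italics="true">two ...</span></span>...</el>. The xsl:strip-space declaration is an addition of this sketch: without it, whitespace-only text nodes between the spans would also match text() and produce empty <text/> elements.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>

  <!-- Drop whitespace-only text nodes so they do not become empty <text/> -->
  <xsl:strip-space elements="*"/>

  <!-- Elements are walked by the built-in rules; only text nodes produce output -->
  <xsl:template match="text()">
    <text>
      <!-- Attributes of all ancestors, outermost first; an inner attribute
           with the same name replaces an outer one -->
      <xsl:copy-of select="ancestor::*/@*"/>
      <xsl:value-of select="normalize-space()"/>
    </text>
  </xsl:template>

</xsl:stylesheet>
```

Note that the result is a sequence of sibling <text/> elements without a single root, so it is an XML fragment and not a well-formed document on its own.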

Get a specific processing instruction

I have the XML below.
<?xpp /MAIN?>
<?xpp MAIN;1;0;0;0;619;0;0?>
<section>
<title>Introduction</title>
<para>
para<superscript>1</superscript>
<?xpp foot;art6_ft1;suppress?>
<?xpp FOOT;art6_ft1;1?>
<footnote label="1" id="art6_ft1">
<para>
data
</para>
</footnote>
<?xpp /FOOT?>
The data
</para>
</section>
Here I want to get the processing instruction containing MAIN, but I don't know how to select it.
I'm trying the XSLT below.
<xsl:template match="/">
<html>
<head>
</head>
<body>
<xsl:if test="//footnote">
<xsl:apply-templates select="//processing-instruction('xpp')[not(ancestor::toc)]| //footnote" mode="footnote"/>
</xsl:if>
</body>
</html>
</xsl:template>
...
<xsl:template match="processing-instruction('xpp')" mode="footnote">
<xsl:if test="following::footnote[1][preceding::processing-instruction('xpp')[1] = current()]">
<xsl:variable name="pb" select="."/>
<xsl:processing-instruction name="pb">
<xsl:text>label='</xsl:text>
<xsl:value-of select="$pb"/>
<xsl:text>'</xsl:text>
<xsl:text>?</xsl:text>
</xsl:processing-instruction>
</xsl:if>
</xsl:template>
Running this picks <?xpp FOOT;art6_ft1;1?>, but I want <?xpp MAIN;1;0;0;0;619;0;0?> to be picked. Please let me know how I can do this.
Thanks
"Here I want to get the processing instruction containing MAIN in it, but i'm unable to know how to get it."
You can use the following XPath expression to match processing instructions named xpp whose data contains the text "MAIN":
processing-instruction('xpp')[contains(.,'MAIN')]
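One caveat with contains(): in the sample document it also matches <?xpp /MAIN?>, because that instruction's data contains "MAIN" too. If only <?xpp MAIN;1;0;0;0;619;0;0?> is wanted, a stricter test such as starts-with(., 'MAIN;') may be a better fit. The following is a minimal sketch under that assumption; it reuses the footnote mode and the pb instruction from the question, and drops the preceding/following check since the selection is already done in the select expression.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <html>
      <head/>
      <body>
        <xsl:if test="//footnote">
          <!-- Assumption: the wanted instruction's data starts with "MAIN;",
               which excludes the closing <?xpp /MAIN?> marker -->
          <xsl:apply-templates
              select="//processing-instruction('xpp')[starts-with(., 'MAIN;')]"
              mode="footnote"/>
        </xsl:if>
      </body>
    </html>
  </xsl:template>

  <!-- Emit a pb instruction carrying the data of the matched xpp instruction -->
  <xsl:template match="processing-instruction('xpp')" mode="footnote">
    <xsl:processing-instruction name="pb">
      <xsl:text>label='</xsl:text>
      <xsl:value-of select="."/>
      <xsl:text>'</xsl:text>
    </xsl:processing-instruction>
  </xsl:template>

</xsl:stylesheet>
```

The string value of a processing instruction is its data without the target name, so "." here is "MAIN;1;0;0;0;619;0;0" and not "xpp MAIN;...".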

xsl:copy-of excluding parent

What code could I use in place of <xsl:copy-of select="tag"/> that, when applied to the following XML..
<tag>
content
<a>
b
</a>
</tag>
..would give the following result?
content
<a>
b
</a>
I want to output all the content inside the element, but without the parent tag.
Basically, I have several sections of content in my XML file, formatted as HTML and grouped in XML tags.
I want to access them conditionally and output them.
For example: <xsl:copy-of select="description"/>
The extra parent tags generated do not affect browser rendering, but they are invalid tags, and I would prefer to remove them.
Am I going about this in totally the wrong way?
Since you want to include the text content as well, you'll need the node() node test, not the * wildcard, which selects only elements:
<xsl:copy-of select="tag/node()"/>
I've tested this on the input example and the result is the example result:
content
<a>
b
</a>
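To make the difference between the two selections concrete, here is a small sketch that applies both to the <tag> input above. The wrapper element names result, elements-only and all-nodes are hypothetical and exist only to keep the two copies apart.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <xsl:template match="/">
    <result>
      <!-- tag/* selects child elements only: <a> is copied, the text "content" is lost -->
      <elements-only>
        <xsl:copy-of select="tag/*"/>
      </elements-only>
      <!-- tag/node() selects every child node: text, elements, comments, PIs -->
      <all-nodes>
        <xsl:copy-of select="tag/node()"/>
      </all-nodes>
    </result>
  </xsl:template>

</xsl:stylesheet>
```

Neither selection includes the <tag> element itself or its attributes; if attributes were needed as well, they would have to be selected separately with tag/@*.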
Without hard-coding the root node name, this can be:
<xsl:copy-of select="./node()" />
This is useful in situations when you are already processing the root node and want an exact copy of all elements inside, excluding the root node. For example:
<xsl:variable name="head">
<xsl:copy-of select="document('head.html')" />
</xsl:variable>
<xsl:apply-templates select="$head" mode="head" />
<!-- ... later ... -->
<xsl:template match="head" mode="head">
<head>
<title>Title Tag</title>
<xsl:copy-of select="./node()" />
</head>
</xsl:template>
Complementing Welbog's answer, which has my vote, I recommend writing separate templates, along the lines of this:
<xsl:template match="/">
<body>
<xsl:apply-templates select="description" />
</body>
</xsl:template>
<xsl:template match="description">
<div class="description">
<xsl:copy-of select="node()" />
</div>
</xsl:template>