Recursively replacing elements in XSLT - xslt

I need to replace the <tref> element with other tags from elsewhere in my document. For example, I have:
<tref id="57236"/>
and
<Topic>
<ID>57236</ID>
<Text>
<p id="4">
<cs id="56792">1090-189-01 </cs>
<href id="57237">
<cs id="56792">Document Name</cs>
</href>
</p>
</Text>
</Topic>
Obtaining the following is not a problem:
<p id="4">
<cs id="56792">1090-189-01 </cs>
<href id="57237">
<cs id="56792">Document Name</cs>
</href>
</p>
With this stylesheet:
<xsl:template match="#*|node()">
<xsl:copy>
<xsl:apply-templates select="#*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="tref">
<xsl:variable name="NodeID"><xsl:value-of select="#id"/></xsl:variable>
<xsl:copy-of select="//Topic[ID = $NodeID]/Text/p/node()"/>
</xsl:template>
What I cannot do is replacing trefs nested into other trefs. For example, consider the following:
<tref id="57236"/>
and:
<Topic>
<ID>57236</ID>
<Text>
<p id="251">
<tref id="37287"/>
</p>
</Text>
</Topic>
My stylesheet duly replaces the tref with the content of the tag - which also contains a tref:
<p id="251">
<tref id="37287"/>
</p>
My current solution is to call <xsl:template match="tref"> from two different stylesheets. It does the job, but it is not very elegant, and what if trefs are nested at an even deeper level? And recursion is the bread and butter of XSLT.
Is there a solution to recursively replace all trefs as in XSLT?

Instead of using xsl:copy-of, use xsl:apply-templates
<xsl:apply-templates select="//Topic[ID = $NodeID]/Text/p/node()"/>
Or, to eliminate the use of the varianle
<xsl:apply-templates select="//Topic[ID = current()/#id]/Text/p/node()"/>
Note you can make use of an xsl:key to look-up the Topic elements
<xsl:key name="topic" match="Topic" use="ID" />
Then you can write this
<xsl:apply-templates select="key('topic', #id)/Text/p/node()"/>
Be wary of infinite recursion if you have a tref referring to a Topic that is an ancestor of it.

Related

XSLT wrap element and following-sibling text

Kindly help me to wrap the img.inline element with the following sibling text comma (if comma exists):
text <img id="1" class="inline" src="1.jpg"/> another text.
text <img id="2" class="inline" src="2.jpg"/>, another text.
Should be changed to:
text <img id="1" class="inline" src="1.jpg"/> another text.
text <span class="img-wrap"><img id="2" class="inline" src="2.jpg"/>,</span> another text.
Currently, my XSLT will wrap the img.inline element and add comma inside the span, now I want to remove the following comma.
text <span class="img-wrap"><img id="2" class="inline" src="2.jpg"/>,</span>
, <!--remove this extra comma--> another text.
My XSLT:
<xsl:template match="//img[#class='inline']">
<xsl:copy>
<xsl:choose>
<xsl:when test="starts-with(following-sibling::text(), ',')">
<span class="img-wrap">
<xsl:apply-templates select="node()|#*"/>
<xsl:text>,</xsl:text>
</span>
</xsl:when>
<xsl:otherwise>
<xsl:apply-templates select="node()|#*"/>
</xsl:otherwise>
</xsl:choose>
</xsl:copy>
<!-- checking following-sibling::text() -->
<xsl:apply-templates select="following-sibling::text()" mode="commatext"/>
</xsl:template>
<!-- here I want to match the following text, if comma, then remove it -->
<xsl:template match="the following comma" mode="commatext">
<!-- remove comma -->
</xsl:template>
Is my approach is correct? or is this something should be handled differently? pls suggest?
Currently you are copying the img and the embedding the span within that. Also, you do <xsl:apply-templates select="node()|#*"/> which will select child nodes of img (or which there are none). And for the attributes it will end add them to the span.
You don't actually need the xsl:choose here as you can add the condition to the match attribute.
<xsl:template match="//img[#class='inline'][starts-with(following-sibling::node()[1][self::text()], ',')]">
Note I have changed the condition as following-sibling::text() selects ALL text elements that follow the img node. You only want to get the node immediately after the img node, but only if it is a text node.
Also, trying to select the following text node with xsl:apply-templates is probably not the right approach, assuming you have a template that matches the parent node which selects all child nodes (not just img ones). I am assuming you were using the identity template here.
Anyway, try this XSLT instead
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="html" indent="no" />
<xsl:template match="#*|node()">
<xsl:copy>
<xsl:apply-templates select="#*|node()" />
</xsl:copy>
</xsl:template>
<xsl:template match="//img[#class='inline'][starts-with(following-sibling::node()[1][self::text()], ',')]">
<span class="img-wrap">
<xsl:copy-of select="." />
<xsl:text>,</xsl:text>
</span>
</xsl:template>
<xsl:template match="text()[starts-with(., ',')][preceding-sibling::node()[1][self::img]/#class='inline']">
<xsl:value-of select="substring(., 2)" />
</xsl:template>
</xsl:stylesheet>

Attempts to use following-sibling to convert

I try to convert my old html by xslt-script to my new xml stucture.
I have a Problem to converting the folowing source to my needed xml structure.
Source
<p>
<a class="DropDown">Example Text</a>
</p>
<div class="collapsed">
<table>..</table>
<p>..</p>
</div>
xml structure
<lq>
<p>Example Text</p>
<table>..</table>
<p>..</p>
</lp>
I tried the following xls, but the div class="collapsed" is not adopted into the lp tag.
<xsl:template match="p/a[#class='DropDown']">
<lp>
<p><xsl:apply-templates select="text()"/></p>
<xsl:if test="/p/a/following-sibling::*[1][self::div]">
<xsl:apply-templates select="*|text()"/>
</xsl:if>
</lp>
</xsl:template>
Can anyone tell me what I did wrong ore where the mistake is?
Thanks much
IMHO, you want to do:
<!-- identity transform -->
<xsl:template match="#*|node()">
<xsl:copy>
<xsl:apply-templates select="#*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="p[a/#class='DropDown']">
<lp>
<p>
<xsl:value-of select="a"/>
</p>
<xsl:copy-of select="following-sibling::*[1][self::div]/node()"/>
</lp>
</xsl:template>
<xsl:template match="div[preceding-sibling::*[1][self::p/a/#class='DropDown']]"/>
As for your mistake:
You are testing the existence of some p that is the root element
and contains an a whose following sibling is div. None of these are true in the given example;
xsl:if does not change the context: your <xsl:apply-templates
select="*|text()"/> applies templates to the child nodes of the
current a;
Presumably you don't want the div to appear again in the original place -
so if you have another template to suppress it, you cannot use
<xsl:apply-templates> to insert it at the place you do want it -
at least not without using another mode.

Template is called where it should not to be

I've the below two XML cases.
Case1:
<para>Rent is the sum of money paid by the Tenant to the Landlord for the exclusive use of premises. The Landlord and Tenant signs a <page num="4"/>tenancy agreement which has to be stamped with the tax authorities as required under the Stamp Duties Act. The stamping of a tenancy agreement gives it validity but if the tenancy agreement is not stamped that does not mean</para>
Case2:
<para><page num="5"/>The Writ of Distress proceedings is an effective way to recover arrears in rent but regard must be had to the Landlord/Tenant relationship and the effect of publicity of such proceedings to the image of the building amongst other things.</para>
and the below XSLT
<xsl:template match="para">
<xsl:apply-templates select="child::node()[(self::page)]"/>
<li class="item">
<div class="para">
<span class="item-num">
<xsl:value-of select="../#num"></xsl:value-of>
</span>
<xsl:apply-templates select="child::node()[not(self::page)]"/>
</div>
</li>
</xsl:template>
<xsl:template match="page">
<xsl:processing-instruction name="pb">
<xsl:text>label='</xsl:text>
<xsl:value-of select="./#num"/>
<xsl:text>'</xsl:text>
<xsl:text>?</xsl:text>
</xsl:processing-instruction>
<a name="{concat('pg_',./#num)}"/>
<xsl:apply-templates/>
</xsl:template>
What I'm trying to do is check if page is the immediate(first) child of para and print that value first and then do the rest. But in both the cases, the page is printed first.
In the above cases provided, for case1, the page should be called just like any other template in para, since it is not the immediate child of para, but in case2, first the page has to be printed and next the template is to be called, as page num="5" is the immediate child of para Please let me know how I can do this.
A demo is here
I think what you mean is that you want to perform extra processing when page is the first child node under para. Your apply-templates need to look like this
<xsl:apply-templates select="node()[1][self::page]" />
However, it also sounds like you want to perform other processing on page elements regardless. You probably need two templates matching page here, but one with a "mode" to distinguish it from your normal processing.
Call it like this
<xsl:apply-templates select="node()[1][self::page]" mode="first"/>
And match it like this
<xsl:template match="page" mode="first">
This would contain the code to output your processing instruction.
For "normal" processing of the page element, just have another template matching page without the mode
<xsl:template match="page">
<a name="{concat('pg_',./#num)}"/>
<xsl:apply-templates/>
</xsl:template>
Try this XSLT
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*" />
<xsl:template match="#*|node()">
<xsl:copy>
<xsl:apply-templates select="#*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="para">
<xsl:apply-templates select="node()[1][self::page]" mode="first"/>
<li class="item">
<div class="para">
<span class="item-num">
<xsl:value-of select="../#num"></xsl:value-of>
</span>
<xsl:apply-templates />
</div>
</li>
</xsl:template>
<xsl:template match="page">
<a name="{concat('pg_',./#num)}"/>
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="page" mode="first">
<xsl:processing-instruction name="pb">
<xsl:text>label='</xsl:text>
<xsl:value-of select="./#num"/>
<xsl:text>'</xsl:text>
</xsl:processing-instruction>
</xsl:template>
</xsl:stylesheet>
EDIT: If you don't want both "page" templates to apply to apply to the first page element, then add the following template to ignore it
<xsl:template match="page[not(preceding-sibling::node())]" />
Note, this will only work if you have <xsl:strip-space elements="*" /> present in your document, to strip out white-space only text nodes. Alternatively, you could write this
<xsl:template match="page[not(preceding-sibling::node()[not(self::text()) or normalize-space()])]" />
EDIT 2: The reason you need the extra templates is because of this line
<xsl:apply-templates />
This is will look for templates that match all the child elements under the current para element. So, for a page element, the following template will be matched
<xsl:template match="page">
But you say you don't want the very first page element to be matched in this case. Therefore, you are a more 'specific' template to match it. For example
<xsl:template match="page[not(preceding-sibling::node())]" />
This template matches page elements with no preceding siblings; i.e. the very first element under para.
XSLT has the concept of priority for templates. Where a template matching an element with a condition specified, that template will always be given priority. In this case, the specific template simply ignores the page element, to ensure it doesn't get output.
For other page elements, the other template will be used as normal.

With XSLT, how can I process normally, but hold some nodes until the end and then output them all at once (e.g. footnotes)?

I have an XSLT application which reads the internal format of Microsoft Word 2007/2010 zipped XML and translates it into HTML5 with XSLT. I am investigating how to add the ability to optionally read OpenOffice documents instead of MSWord.
Microsoft stores XML for footnote text separately from the XML of the document text, which happens to suit me because I want the footnotes in a block at the end of the output HTML page.
However, unfortunately for me, OpenOffice puts each footnote right next to its reference, inline with the text of the document. Here is a simple paragraph example:
<text:p text:style-name="Standard">The real breakthrough in aerial mapping
during World War II was trimetrogon
<text:note text:id="ftn0" text:note-class="footnote">
<text:note-citation>1</text:note-citation>
<text:note-body>
<text:p text:style-name="Footnote">Three separate cameras took three
photographs at once, a direct downward and an oblique on each side.</text:p>
</text:note-body>
</text:note>
photography, but the camera was large and heavy, so there were problems finding
the right aircraft to carry it.
</text:p>
My question is, can XSLT process the XML as normal, but hold each of the text:note items until the end of the document text, and then emit them all at one time?
You're thinking of your logic as being driven by the order of things in the input, but in XSLT you need to be driven by the order of things in the output. When you get to the point where you want to output the footnotes, go find the footnote text wherever it might be in the input. Admittedly that doesn't always play too well with the apply-templates recursive descent processing model, which is explicitly input-driven; but nevertheless, that's the way you have to do it.
Don't think of it as "holding" the text:note items, instead simply ignore them in the main pass and then gather them at the end with a //text:note and process them there, e.g.
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
xmlns:text="whateveritshouldbe">
<xsl:template match="#*|node()">
<xsl:copy>
<xsl:apply-templates select="#*|node()" />
</xsl:copy>
</xsl:template>
<!-- normal mode - replace text:note element by [reference] -->
<xsl:template match="text:note">
<xsl:value-of select="concat('[', text:note-citation, ']')" />
</xsl:template>
<xsl:template match="/">
<document>
<xsl:apply-templates select="*" />
<footnotes>
<xsl:apply-templates select="//text:note" mode="footnotes"/>
</footnotes>
</document>
</xsl:template>
<!-- special "footnotes" mode to de-activate the usual text:node template -->
<xsl:template match="#*|node()" mode="footnotes">
<xsl:copy>
<xsl:apply-templates select="#*|node()" mode="footnotes" />
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
You could use <xsl:apply-templates mode="..."/>. I'm not sure on the exact syntax and your use case, but maybe the example below will give you a clue on how to approach your problem.
Basic idea is to process your nodes twice. First iteration would be pretty much the same as now, and the second iteration only looks for footnotes and only outputs those. You differentiate those iteration by setting "mode" parameter.
Maybe this example will give you a clue how to approach your problem. Note that I used different tags that in your code, so the example would be simpler.
XSLT sheet:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml" indent="yes" />
<xsl:template match="doc">
<xml>
<!-- First iteration - skip footnotes -->
<doc>
<xsl:apply-templates select="text" />
</doc>
<!-- Second iteration, extract all footnotes.
'mode' = footnotes -->
<footnotes>
<xsl:apply-templates select="text" mode="footnotes" />
</footnotes>
</xml>
</xsl:template>
<!-- Note: no mode attribute -->
<xsl:template match="text">
<text>
<xsl:for-each select="p">
<p>
<xsl:value-of select="text()" />
</p>
</xsl:for-each>
</text>
</xsl:template>
<!-- Note: mode = footnotes -->
<xsl:template match="text" mode="footnotes">
<xsl:for-each select=".//footnote">
<footnote>
<xsl:value-of select="text()" />
</footnote>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
Input XML:
<?xml version="1.0" encoding="UTF-8"?>
<doc>
<text>
<p>
some text
<footnote>footnote1</footnote>
</p>
<p>
other text
<footnote>footnote2</footnote>
</p>
</text>
<text>
<p>
some text2
<footnote>footnote3</footnote>
</p>
<p>
other text2
<footnote>footnote4</footnote>
</p>
</text>
</doc>
Output XML:
<?xml version="1.0" encoding="UTF-8"?>
<xml>
<!-- Output from first iteration -->
<doc>
<text>
<p>some text</p>
<p>other text</p>
</text>
<text>
<p>some text2</p>
<p>other text2</p>
</text>
</doc>
<!-- Output from second iteration -->
<footnotes>
<footnote>footnote1</footnote>
<footnote>footnote2</footnote>
<footnote>footnote3</footnote>
<footnote>footnote4</footnote>
</footnotes>
</xml>

XSLT matching PAGEID to an element ID

How would I match two separate numbers in an XML document? There are multiple <PgIndexElementInfo> elements in my XML document, each representing a different navigation element, each with a unique <ID>. Later in the document a <PageID> specifies a number that sometimes matches an <ID> used above. How could I go about matching the <PageID> to the <ID> specified above?
<Element>
<Content>
<PgIndexElementInfo>
<ElementData>
<Items>
<PgIndexElementItem>
<ID>1455917</ID>
</PgIndexElementItem>
</Items>
</ElementData>
</PgIndexElementInfo>
</Content>
</Element>
<Element>
<Content>
<CustomElementInfo>
<PageID>1455917</PageID>
</CustomElementInfo>
</Content>
</Element>
EDIT:
I added the solution below to my code. The xsl:apply-templates that is present is used to recreate the nested lists that are lost between HTML and XML. What I now need to do is match the PageID to the ID of a <PgIndexElementItem> and add a CSS class to the <ul> it is a part of. I hope that makes sense.
<xsl:key name="kIDByValue" match="ID" use="."/>
<xsl:template match="PageID[key('kIDByValue',.)]">
<xsl:apply-templates select="//PgIndexElementItem[not(contains(Description, '.'))]" />
</xsl:template>
<xsl:template match="PgIndexElementItem">
<li>
<xsl:value-of select="Title"/>
<xsl:variable name="prefix" select="concat(Description, '.')"/>
<xsl:variable name="childOptions"
select="../PgIndexElementItem[starts-with(Description, $prefix)
and not(contains(substring-after(Description, $prefix), '.'))]"/>
<xsl:if test="$childOptions">
<ul>
<xsl:apply-templates select="$childOptions" />
</ul>
</xsl:if>
</li>
</xsl:template>
The XSLT way for dealing with cross references is with keys.
Matching: A rule matching every PageID element that it has been referenced by an ID element.
<xsl:key name="kIDByValue" match="ID" use="."/>
<xsl:template match="PageID[key('kIDByValue',.)]">
<!-- Template content -->
</xsl:template>
Selecting: A expression selecting every PageID element with specific value.
<xsl:key name="kPageIDByValue" match="PageID" use="."/>
<xsl:template match="ID">
<xsl:apply-templates select="key('kPageIDByValue',.)"/>
</xsl:template>