check for the following node - xslt

I have a small question about XSLT. I need an XPath expression to validate a condition. Below is my XML.
<root>
<para>
Erlanger and several associates formed a syndicate to acquire the lease of an island in the West Indies for £55,000. The idea was to mine the <page num="44"/>island for phosphates.
</para>
<para>
<content-style font-style="bold">2.25</content-style> A commission or payment that a promoter receives upon transfer of property to a company must also be disclosed.
<para>
board were all nominees of Green and Smith; <page num="45"/>accordingly, disclosure </para>
</para>
<para>
<content-style font-style="bold">2.26</content-style> If a promoter contracts with the company whether as vendor<footnote num="57" id="fn57">
<para>
<case>
<casename>
<content-style font-style="italic">Re Leeds &amp; Hanley Theatres of Varieties Ltd</content-style>
</casename> [1902] Ch 809 (Court of Appeal, England)
</case>.
</para>
</footnote> or purchaser,<footnote num="58" id="fn58">
<para>
<case>
<casename>
<content-style font-style="italic">Habib Abdul Rahman v Abdul Cader</content-style>
</casename> (1886) 4 Ky 193 (High Court of the Straits Settlements)
</case>.
</para>
</footnote> the fact that he is a contractor must be disclosed.
</para>
</root>
and XSL is as below.
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:fn="http://www.w3.org/2005/xpath-functions"
xmlns:ntw="Number2Word.uri"
exclude-result-prefixes="ntw">
<xsl:output method="html"/>
<xsl:strip-space elements="*"/>
<xsl:variable name="ThisDocument" select="document('')"/>
<xsl:template match="/">
<xsl:text disable-output-escaping="yes"><![CDATA[<!DOCTYPE html>]]></xsl:text>
<html>
<head>
<xsl:text disable-output-escaping="yes"><![CDATA[</meta>]]></xsl:text>
<title>
<xsl:value-of select="chapter/title[1]/*"/>
</title>
<link rel="stylesheet" href="C:\Users\u0138039\Desktop\Proview\SG\Commentary_SG_XML-03032014\SG-Business Guide to Competition Law\05192014\XSLT\main.css" type="text/css"/>
<xsl:text disable-output-escaping="yes"><![CDATA[</link>]]></xsl:text>
</head>
<body>
<xsl:apply-templates/>
<xsl:if test="//footnote">
<section class="tr_footnotes">
<hr/>
<xsl:apply-templates select="//page[not(ancestor::toc)]| //footnote" mode="footnote"/>
</section>
</xsl:if>
</body>
</html>
</xsl:template>
<xsl:template match="footnote">
<xsl:variable name="varHeaderNote" select='concat("f",@num)'/>
<xsl:variable name="varFootNote" select='concat("#ftn.",@num)'/>
<sup>
<a name="{$varHeaderNote}" href="{$varFootNote}" class="tr_ftn">
<xsl:value-of select="@num"/>
</a>
</sup>
</xsl:template>
<xsl:template match="page" mode="footnote">
<xsl:processing-instruction name="pb">
<xsl:text>label='</xsl:text>
<xsl:value-of select="./@num"/>
<xsl:text>'</xsl:text>
<xsl:text>?</xsl:text>
</xsl:processing-instruction>
</xsl:template>
<xsl:template match="footnote" mode="footnote">
<div class="tr_footnote">
<div class="footnote">
<sup>
<a>
<xsl:attribute name="name">
<xsl:text>ftn.</xsl:text>
<xsl:value-of select="@num"/>
</xsl:attribute>
<xsl:attribute name="href">
<xsl:text>#f</xsl:text>
<xsl:value-of select="@num"/>
</xsl:attribute>
<xsl:attribute name="class">
<xsl:text>tr_ftn</xsl:text>
</xsl:attribute>
<xsl:value-of select="@num"/>
</a>
</sup>
<xsl:apply-templates/>
</div>
</div>
</xsl:template>
</xsl:stylesheet>
When I run this, I get both <?pb label='44'?><?pb label='45'?>,
whereas I need the following condition to hold:
a `page` should be captured only if it is followed by a `footnote`, with no other `page` between that `page` and the `footnote`.
Put simply, the example above contains two page elements. Ignoring all other tags and considering only page and footnote, the structure looks like this:
page num='44'
page num='45'
footnote
Here I want only page num='45' to be captured and page num='44' to be left out, since page num='44' is followed by another page rather than directly by a footnote. This is pretty confusing; please let me know how I can do this.
Thanks

To capture only pages that contain at least one footnote, you could use a test like
(following::page | following::footnote)[1][self::footnote]
That is, take all the following page and footnote elements in document order and check whether the first of them is a footnote. If it isn't, then either there is an intervening page or there are no more page or footnote elements at all; either way, we know there are no footnotes on this page.
<xsl:template match="page[(following::page | following::footnote)[1][self::footnote]]" mode="footnote">
<xsl:processing-instruction name="pb">
<xsl:text>label='</xsl:text>
<xsl:value-of select="./@num"/>
<xsl:text>'</xsl:text>
<xsl:text>?</xsl:text>
</xsl:processing-instruction>
</xsl:template>
<xsl:template match="page" mode="footnote" />

in simple, by taking the above example, there are two page, by
ignoring all other tags and considering only page the structure
looks like below.
page num='44'
page num='45'
footnote
here i want only page num='45' to be captured and leave page
num='44' since page num='44' is followed by another page but not
directly footnote
To select pages that are immediately followed by a footnote, use:
page[following-sibling::*[1][self::footnote]]
If a footnote is always preceded by a page, you could also use:
footnote/preceding-sibling::page[1]
Edit:
In your real example, where pages and footnotes are not siblings, you should use Ian's answer, i.e.:
page[(following::page | following::footnote)[1][self::footnote]]
or (assuming that there is only one block of footnotes):
footnote[1]/preceding::page[1]

When you match a page you can check whether the next footnote has a preceding page which is the current page. If it's not, then you don't print out its processing instruction since it's a page without a footnote.
<xsl:template match="page" mode="footnote">
<xsl:if test="following::footnote[1][preceding::page[1]/@num = current()/@num]">
<xsl:processing-instruction name="pb">
<xsl:text>label='</xsl:text>
<xsl:value-of select="./@num"/>
<xsl:text>'</xsl:text>
<xsl:text>?</xsl:text>
</xsl:processing-instruction>
</xsl:if>
</xsl:template>
See: http://xsltransform.net/eiQZDbt/3
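Comparing @num values works as long as page numbers are unique. As a hedged variant, assuming an XSLT 2.0 processor, the same check can compare node identity with the `is` operator instead. The standalone stylesheet below is only a sketch: the text output and the one-number-per-line layout are chosen for illustration and are not part of the answers above.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Keep a page only when the nearest following footnote's
         nearest preceding page is that very same page node. -->
    <xsl:for-each select="//page[following::footnote[1]/preceding::page[1] is .]">
      <xsl:value-of select="@num"/>
      <!-- one page number per line (illustrative output format) -->
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

If a page has no following footnote at all, the left operand of `is` is empty, so the predicate is false and the page is skipped.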

Related

How to test XSLT having multiple mode with XSpec?

I need to write an XSpec test case for an XSLT stylesheet that uses multiple modes for the transformation.
With the test case below, XSpec only tests the output after the default mode has been applied.
I wonder if there is a way to test the final output of the transformation.
<!-- input.xml -->
<body>
<div>
<p class="Title"><span>My first title</span></p>
<p class="BodyText"><span style="font-weight:bold">AAAAAAA</span><span>2 Jan 2020</span></p>
</div>
</body>
<!-- conv.xsl -->
<xsl:template match="/">
<xsl:apply-templates/>
</xsl:template>
<!-- default mode : adding text-align attribute where @class=Title -->
<xsl:template match="*[ancestor::body]">
<xsl:choose>
<xsl:when test="@class = 'Title'">
<xsl:element name="{local-name()}">
<xsl:copy-of select="@* except @style"/>
<xsl:attribute name="text-align" select="'center'"/>
<xsl:apply-templates/>
</xsl:element>
</xsl:when>
<xsl:otherwise>
<xsl:element name="{local-name()}">
<xsl:copy-of select="@*"/>
<xsl:apply-templates/>
</xsl:element>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<!-- bodytext mode : changing element name to <title> where p[@class=Title] -->
<xsl:template match="p[@class]" mode="bodytext">
<xsl:choose>
<xsl:when test="@class = 'Title'">
<title>
<xsl:copy-of select="@* except @class"/>
<xsl:apply-templates mode="bodytext"/>
</title>
</xsl:when>
<xsl:otherwise>
<para>
<xsl:apply-templates mode="bodytext"/>
</para>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<xsl:template match="body">
<xsl:variable name="data">
<body>
<xsl:copy-of select="@*"/>
<xsl:apply-templates/>
</body>
</xsl:variable>
<xsl:apply-templates select="$data" mode="bodytext"/>
</xsl:template>
<xsl:template match="node() | @*" mode="#all">
<xsl:copy>
<xsl:apply-templates select="node() | @*" mode="#current"/>
</xsl:copy>
</xsl:template>
Output for the first <p>:
-- after the default mode is applied: <p class="Title" text-align="center">. [The XSpec test below checks this output.]
-- final: <title text-align="center">. [This is the output I want to test.]
<!-- test.xspec -->
<x:description xmlns:x="http://www.jenitennison.com/xslt/xspec" stylesheet="conv.xsl">
<x:scenario label="XSS00001: Testing 'p[@class=Title]' converts to 'title'">
<x:context href="input.xml" select="/body/div[1]/p[1]"/>
<x:expect label="Testing 'p' converts to 'title'">
<title text-align="center">
<span>My first title</span>
</title>
</x:expect>
</x:scenario>
</x:description>
Any suggestion in this regard would be a great help. Thanks...
I don't think it is solely the use of the modes that doesn't give you the result you want. However, the way you have set up the modes in your XSLT, if you match on that /body/div[1]/p[1] in the XSpec test scenario, you will get the stylesheet applied to only that p element. And obviously for that p there is the match on *[ancestor::body] in the unnamed mode and processing stops in that mode as the other mode is never used from that template.
So you might need to make the body element the context and use a scenario like the following:
<x:scenario label="XSS00002: Testing 'p[@class=Title]' converts to 'title'">
<x:context>
<body>
<div>
<p class="Title">...</p>
<p class="BodyText">...</p>
</div>
</body>
</x:context>
<x:expect label="Testing 'p' converts to 'title'">
<body>
<div>
<title text-align="center">...</title>
<para>...</para>
</div>
</body>
</x:expect>
</x:scenario>
Martin is quite right.
Another way of writing would be:
<x:scenario label="When a document contains 'body//p[@class=Title]'">
<x:context href="input.xml" />
<x:expect label="'p' is converted to 'title[@text-align]'"
test="body/div/title">
<title text-align="center">
<span>My first title</span>
</title>
</x:expect>
</x:scenario>
That is:
Remove @select from x:context, because you and/or conv.xsl seem to assume that the transformation always starts from the document node (/).
Add @test to x:expect, because you seem to be interested only in the title element in the transformation result.

XSLT - To move parent attribute value into child first para

The input looks like this:
<section counter="yes" level="5">
<title><target id="page92"/></title>
<section counter="yes" level="6">
<title>Standard 12-lead ECG at Rest</title>
<para>The standard ECG is recorded at rest using 12 leads in order to collect as much information as possible:</para>
<listing type="dash">
<litem><para>Standard limb leads according to Einthoven (I, II, III)</para></litem>
The output should be:
<section counter="yes" level="5">
<title><target /></title>
<section counter="yes" level="6">
<title>Standard 12-lead ECG at Rest</title>
<para id="page92">The standard ECG is recorded at rest using 12 leads in order to collect as much information as possible:</para>
<listing type="dash">
<litem><para>Standard limb leads according to Einthoven (I, II, III)</para></litem>
We wrote the XSLT shown below:
<xsl:template match="para[1][parent::section[parent::section[not(normalize-space(title))]]]">
<xsl:choose>
<xsl:when test="position() = 1">
<para>
<xsl:attribute name="id" select="ancestor::section[not(normalize-space(title))]/title/target/@id"/>
<xsl:apply-templates select="@*"/>
<xsl:apply-templates/>
</para>
</xsl:when>
<xsl:otherwise>
<para>
<xsl:apply-templates select="@*|node()"/>
</para>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
With the above XSLT we do not get the expected output. Instead we get:
<section counter="yes" level="5">
<title><target /></title>
<section counter="yes" level="6">
<title>Standard 12-lead ECG at Rest</title>
<para id="page92">The standard ECG is recorded at rest using 12 leads in order to collect as much information as possible:</para>
<listing type="dash">
<litem><para id="page92">Standard limb leads according to Einthoven (I, II, III)</para></litem>
The "page ID" value is repeated on the following paragraphs, which we do not want. We need the page ID only on the first paragraph.
Could you please guide us?
As for getting the result you want, it should be as simple as
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="section[not(normalize-space(title))]/section/para[1]">
<xsl:copy>
<xsl:attribute name="id" select="ancestor::section[not(normalize-space(title))]/title/target/@id"/>
<xsl:apply-templates select="@*"/>
<xsl:apply-templates/>
</xsl:copy>
</xsl:template>
<xsl:template match="section/title[not(normalize-space())]/target/@id"/>
</xsl:transform>

Get a specific processing instruction

I've the below XML.
<?xpp /MAIN?>
<?xpp MAIN;1;0;0;0;619;0;0?>
<section>
<title>Introduction</title>
<para>
para<superscript>1</superscript>
<?xpp foot;art6_ft1;suppress?>
<?xpp FOOT;art6_ft1;1?>
<footnote label="1" id="art6_ft1">
<para>
data
</para>
</footnote>
<?xpp /FOOT?>
The data
</para>
</section>
Here I want to get the processing instruction containing MAIN, but I don't know how to get it.
I'm trying the XSLT below.
<xsl:template match="/">
<html>
<head>
</head>
<body>
<xsl:if test="//footnote">
<xsl:apply-templates select="//processing-instruction('xpp')[not(ancestor::toc)]| //footnote" mode="footnote"/>
</xsl:if>
</body>
</html>
</xsl:template>
.
.
.
.
.
.
.
<xsl:template match="processing-instruction('xpp')" mode="footnote">
<xsl:if test="following::footnote[1][preceding::processing-instruction('xpp')[1] = current()]">
<xsl:variable name="pb" select="."/>
<xsl:processing-instruction name="pb">
<xsl:text>label='</xsl:text>
<xsl:value-of select="$pb"/>
<xsl:text>'</xsl:text>
<xsl:text>?</xsl:text>
</xsl:processing-instruction>
</xsl:if>
</xsl:template>
Running this, I get <?xpp FOOT;art6_ft1;1?> picked, but I want <?xpp MAIN;1;0;0;0;619;0;0?> to be picked. Please let me know how I can do this.
Thanks
"Here I want to get the processing instruction containing MAIN in it, but i'm unable to know how to get it."
You can use the following XPath expression to match a processing instruction named xpp whose data contains the text "MAIN":
processing-instruction('xpp')[contains(.,'MAIN')]
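One caveat worth noting: in the sample input, the closing instruction <?xpp /MAIN?> also contains "MAIN", so the contains() test selects it as well. As a minimal sketch, assuming the wanted instruction always begins with "MAIN" (an assumption about the data, not something stated in the question), starts-with() narrows the match. The text output used here is only for illustration.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Visit only xpp instructions whose data begins with "MAIN";
       this skips "/MAIN" as well as the foot/FOOT instructions. -->
  <xsl:template match="/">
    <xsl:apply-templates
        select="//processing-instruction('xpp')[starts-with(., 'MAIN')]"/>
  </xsl:template>

  <!-- The string value of a processing instruction is its data,
       i.e. everything after the target name and the whitespace. -->
  <xsl:template match="processing-instruction('xpp')">
    <xsl:value-of select="."/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

The same predicate can be placed in the match pattern of the mode="footnote" template if only that instruction should produce a pb label.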

applying templates of same type but working on different type of declaration

I've below XMLs.
<emphasis type="italic">
varying from ti,e to time<star.page>58</star.page>Starch
</emphasis>
and
<para indent="no">
<star.page>18</star.page> Further to same.
</para>
Here I'm trying to apply templates to star.page, but I'm confused. If I use <xsl:apply-templates select="./*[1][self::star.page]" mode="first"/>, it works fine for the first case, but in the second case the star.page is duplicated. If I use <xsl:apply-templates select="./node()[1][self::star.page]" mode="first"/>, then in case 2 the star.page that is supposed to appear before the div ends up inside the div, and in case 1 the value is duplicated.
The expected outputs are as below.
Case 1:
<span class="font-style-italic">
varying from ti,e to time<?pb label='58'?><a name="pg_58"></a></span>2<span class="font-style-italic">Starch
</span>
Case 2:
<?pb label='18'?><a name="pg_18"></a>
<div class="para">
Further to same.
</div>
Here the condition is as below.
If star.page is the first child of its parent node (whether that is para or emphasis), the pb label has to be created first, followed by the tag (Case 2 output).
If star.page occurs in the middle of text, the content should be output with the pb label inside it (Case 1 output).
Please let me know a common solution to fix these issues.
I am not clear what the bulk of your XSLT is doing, but I would first make use of a named template to avoid repeated code. There is no issue with giving a matched template a name too.
<xsl:template match="star.page" name="page">
<xsl:processing-instruction name="pb">
<xsl:text>label='</xsl:text>
<xsl:value-of select="."/>
<xsl:text>'</xsl:text>
<xsl:text>?</xsl:text>
</xsl:processing-instruction>
<a name="{concat('pg_',.)}"/>
</xsl:template>
Then the template with the mode "first" becomes this
<xsl:template match="star.page" mode="first">
<xsl:call-template name="page" />
</xsl:template>
You can still call this in the same way, or maybe a slightly different condition will also work
<xsl:apply-templates select="star.page[not(preceding-sibling::node())]" mode="first"/>
Then, all you need is a template to ignore "star.page" that are the first child (because they have already been explicitly selected)
<xsl:template match="star.page[not(preceding-sibling::node())]" />
As a simplified example, try this for starters
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml" indent="yes" />
<xsl:strip-space elements="*" />
<xsl:template match="para|emphasis">
<xsl:apply-templates select="star.page[not(preceding-sibling::node())]" mode="first"/>
<div>
<xsl:choose>
<xsl:when test="./@align">
<xsl:attribute name="class"><xsl:text>para align-</xsl:text><xsl:value-of select="./@align"/></xsl:attribute>
</xsl:when>
<xsl:otherwise>
<xsl:attribute name="class"><xsl:text>para</xsl:text></xsl:attribute>
</xsl:otherwise>
</xsl:choose>
<xsl:apply-templates/>
</div>
</xsl:template>
<xsl:template match="star.page[not(preceding-sibling::node())]" />
<xsl:template match="star.page" name="page">
<xsl:processing-instruction name="pb">
<xsl:text>label='</xsl:text>
<xsl:value-of select="."/>
<xsl:text>'</xsl:text>
<xsl:text>?</xsl:text>
</xsl:processing-instruction>
<a name="{concat('pg_',.)}"/>
</xsl:template>
<xsl:template match="star.page" mode="first">
<xsl:call-template name="page" />
</xsl:template>
</xsl:stylesheet>
Do note the use of strip-space here because strictly speaking, the star.page is not the first node in each example, there is a white-space node before it.
When applied to this XML
<root>
<emphasis type="italic">
varying from ti,e to time<star.page>58</star.page>Starch
</emphasis>
<para indent="no">
<star.page>18</star.page> Further to same.
</para>
</root>
The following is output
<div class="para">
varying from ti,e to time<?pb label='58'??><a name="pg_58"/>Starch
</div>
<?pb label='18'??>
<a name="pg_18"/>
<div class="para"> Further to same.
</div>
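The note about strip-space can also be handled in the predicate itself. As a hedged sketch, assuming you would rather not strip whitespace globally, a star.page can be treated as "leading" when no element and no non-blank text node precedes it inside its parent. The stylesheet below reuses the template names from the answer; the longer predicate is an illustrative alternative, not part of the original answer, and it does not account for preceding comments or processing instructions.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <xsl:template match="para|emphasis">
    <!-- Emit a leading star.page before the wrapper element. -->
    <xsl:apply-templates mode="first"
        select="star.page[not(preceding-sibling::*)
                          and not(preceding-sibling::text()[normalize-space()])]"/>
    <div class="para">
      <xsl:apply-templates/>
    </div>
  </xsl:template>

  <!-- Suppress the leading star.page inside the wrapper, because it
       has already been written out above. -->
  <xsl:template match="star.page[not(preceding-sibling::*)
                                 and not(preceding-sibling::text()[normalize-space()])]"/>

  <xsl:template match="star.page" name="page">
    <xsl:processing-instruction name="pb">
      <!-- With the xml output method the serializer adds the closing
           "?>" itself, so no trailing "?" is written here. -->
      <xsl:text>label='</xsl:text>
      <xsl:value-of select="."/>
      <xsl:text>'</xsl:text>
    </xsl:processing-instruction>
    <a name="pg_{.}"/>
  </xsl:template>

  <xsl:template match="star.page" mode="first">
    <xsl:call-template name="page"/>
  </xsl:template>
</xsl:stylesheet>
```

Because the suppressing template has a predicate, it gets a higher default priority than the plain star.page template, so no explicit priority attribute is needed.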

Extract text from "para" tag with embedded "para" children?

I'm using Altova's command-line xml processor on Windows to process a Help & Manual xml file. Help & Manual is help authoring software.
I'm extracting the text content from it using the following xslt. Specifically, I'm having an issue with the final para rule:
<?xml version='1.0'?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" />
<xsl:strip-space elements="*" />
<xsl:template match="para[@styleclass='Heading1']">
<xsl:text>====== </xsl:text>
<xsl:value-of select="." />
<xsl:text> ======
</xsl:text>
</xsl:template>
<xsl:template match="para[@styleclass='Heading2']">
<xsl:text>===== </xsl:text>
<xsl:value-of select="." />
<xsl:text> =====
</xsl:text>
</xsl:template>
<xsl:template match="para">
<xsl:value-of select="." />
<xsl:text>
</xsl:text>
</xsl:template>
<xsl:template match="toggle">
<xsl:text>**</xsl:text>
<xsl:apply-templates />
<xsl:text>**
</xsl:text>
</xsl:template>
<xsl:template match="title" />
<xsl:template match="topic">
<xsl:apply-templates select="body" />
</xsl:template>
<xsl:template match="body">
<xsl:text>Content-Type: text/x-zim-wiki
Wiki-Format: zim 0.4
</xsl:text>
<xsl:apply-templates />
</xsl:template>
</xsl:stylesheet>
I've run into an issue with the extraction of text from certain paragraph elements. Take for example this xml:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="../helpproject.xsl" ?>
<topic template="Default" lasteditedby="tlilley" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="../helpproject.xsd">
<title translate="true">New Installs</title>
<keywords>
<keyword translate="true">Regional and Language Options</keyword>
</keywords>
<body>
<header>
<para styleclass="Heading1"><text styleclass="Heading1" translate="true">New Installs</text></para>
</header>
<para styleclass="Normal"><table rowcount="1" colcount="2" style="width:100%; cell-padding:6px; cell-spacing:0px; page-break-inside:auto; border-width:1px; border-spacing:0px; cell-border-width:0px; border-color:#000000; border-style:solid; background-color:#fffff0; head-row-background-color:none; alt-row-background-color:none;">
<tr style="vertical-align:top">
<td style="vertical-align:middle; width:96px; height:103px;">
<para styleclass="Normal" style="text-align:center;"><image src="books.png" scale="100.00%" styleclass="Image Caption"></image></para>
</td>
<td style="vertical-align:middle; width:1189px; height:103px;">
<para styleclass="Callouts"><text styleclass="Callouts" style="font-weight:bold;" translate="true">Documentation Convention</text></para>
<para styleclass="Callouts"><text styleclass="Callouts" translate="true">To make the examples concrete, we refer to the </text><var styleclass="Callouts">Add2Exchange</var><text styleclass="Callouts" translate="true"> Service Account as "zAdd2Exchange" throughout this document.  If your Service Account name is different, substitute that value for "zAdd2Exchange" in all commands and examples.  If you have named your account according to the recommended "zAdd2Exchange", then you may cut and paste any given commands as is.</text></para>
</td>
</tr>
</table></para>
</body>
</topic>
When the xslt is run on that paragraph, it pulls the text out but does so at the top paragraph element. The transform is supposed to add a pair of newlines to all extracted paragraphs, but doesn't have a chance to do so on the embedded <para> elements because the text is extracted at the parent para element.
Note that I don't care about the table tags, I just want to strip those.
Is there a way to construct the para rule so that it properly extracts the directly-owned text of a para element, as well as the text of any child para elements, such that each extracted chunk gets the rule's newlines in the output text?
I think I've found the answer. Instead of using value-of in the last para rule, I'm now using apply-templates, and that seems to catch them all.
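A minimal sketch of that change, assuming the rest of the stylesheet stays as posted (the heading, toggle, title and body rules are left out here for brevity): with apply-templates, nested para elements are matched by the same rule, so each one emits its own trailing newlines, while table, tr and td fall through to the built-in rules and contribute no markup.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:strip-space elements="*"/>

  <!-- Process only the body, as in the original stylesheet. -->
  <xsl:template match="topic">
    <xsl:apply-templates select="body"/>
  </xsl:template>

  <!-- apply-templates (rather than value-of) recurses into children,
       so a para nested inside a table cell hits this rule again and
       gets its own pair of newlines. -->
  <xsl:template match="para">
    <xsl:apply-templates/>
    <xsl:text>&#10;&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Note that the outer wrapper para and any para holding only an image still emit their newlines, so a few extra blank lines can appear in the output.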