Using multiple regex in XSLT: how to reduce processing time

I am quite new to programming and XSLT: I try to improve the way I ask questions and explain problems, but I still have a long way to go. Sorry if there is something unclear.
I need to detect various alphabets in my XML document, which looks like this, with a lot more different language options.
<text>
<p>Some text. dise´mbər Some text. Some text.</p> <!-- text in International Phonetic Alphabet + English -->
<p>Some text. dise´mbər Some text. Издательство Академии Наук СССР Some text.</p> <!-- text in International Phonetic Alphabet + English + Cyrillic alphabet -->
<p>Some text. Издательство Академии Наук СССР dise´mbər Some text. Some text.</p>
<p>Some text. Some text. Издательство Академии Наук СССР Some text.</p> <!-- text in English + Cyrillic alphabet -->
</text>
What I started to do in XSLT is this:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="xs"
version="2.0">
<xsl:output method="xml" indent="no" encoding="UTF-8" omit-xml-declaration="no" />
<xsl:template match="*">
<xsl:element name="{local-name()}">
<xsl:for-each select="@*">
<xsl:attribute name="{local-name()}">
<xsl:value-of select="."/>
</xsl:attribute>
</xsl:for-each>
<xsl:apply-templates/>
</xsl:element>
</xsl:template>
<xsl:template match="processing-instruction()">
<xsl:processing-instruction name="{local-name()}"><xsl:apply-templates></xsl:apply-templates></xsl:processing-instruction>
</xsl:template>
<xsl:template name="IPA">
<xsl:variable name="text" ><xsl:copy-of select="."/></xsl:variable>
<xsl:analyze-string select="$text" regex="((\p{{IsIPAExtensions}}|\p{{IsPhoneticExtensions}})+)" >
<xsl:matching-substring>
<IPA><xsl:value-of select="regex-group(1)"/></IPA>
</xsl:matching-substring>
<xsl:non-matching-substring><xsl:copy-of select="."></xsl:copy-of></xsl:non-matching-substring>
</xsl:analyze-string>
</xsl:template>
<xsl:template name="Cyrillic">
<xsl:variable name="texte" ><xsl:call-template name="IPA"></xsl:call-template></xsl:variable>
<xsl:analyze-string select="$texte" regex="(\p{{IsCyrillic}}+)" >
<xsl:matching-substring>
<Cyrillic><xsl:apply-templates select="regex-group(1)"/></Cyrillic>
</xsl:matching-substring>
<xsl:non-matching-substring><xsl:call-template name="IPA"></xsl:call-template></xsl:non-matching-substring>
</xsl:analyze-string>
</xsl:template>
<xsl:template match="text()">
<xsl:call-template name="Cyrillic"></xsl:call-template>
</xsl:template>
</xsl:stylesheet>
So that I could get an XML like this:
<?xml version="1.0" encoding="UTF-8"?><text>
<p>Some text. dise´mb<IPA>ə</IPA>r Some text. Some text.</p>
<p>Some text. dise´mb<IPA>ə</IPA>r Some text. <Cyrillic>Издательство</Cyrillic> <Cyrillic>Академии</Cyrillic> <Cyrillic>Наук</Cyrillic> <Cyrillic>СССР</Cyrillic> Some text.</p>
<p>Some text. <Cyrillic>Издательство</Cyrillic> <Cyrillic>Академии</Cyrillic>
<Cyrillic>Наук</Cyrillic> <Cyrillic>СССР</Cyrillic> dise´mb<IPA>ə</IPA>r Some text. Some text.</p>
<p>Some text. Some text. <Cyrillic>Издательство</Cyrillic> <Cyrillic>Академии</Cyrillic>
<Cyrillic>Наук</Cyrillic> <Cyrillic>СССР</Cyrillic> Some text.</p>
</text>
This is what I need. However, I use ten or so regex blocks, and the processing time will be quite long with this method. What would you do instead? Do you think XSLT is appropriate for this?
Thank you !
Maria
(XSLT 2, Saxon-HE 9.8.0.8)
Edit: here's the profile:
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Analysis of Stylesheet Execution Time</title>
</head>
<body>
<h1>Analysis of Stylesheet Execution Time</h1>
<p>Total time: 72128.065 milliseconds</p>
<h2>Time spent in each template, function or global variable:</h2>
<p>The table below is ordered by the total net time spent in the template, function
or global variable. Gross time means the time including called templates and functions
(recursive calls only count from the original entry); net time means time excluding
time spent in called templates and functions.
</p>
<table border="border" cellpadding="10">
<thead>
<tr>
<th>file</th>
<th>line</th>
<th>instruction</th>
<th>count</th>
<th>average time (gross/ms)</th>
<th>total time (gross/ms)</th>
<th>average time (net/ms)</th>
<th>total time (net/ms)</th>
</tr>
</thead>
<tbody>
<tr>
<td> "*code/unicode.xsl" </td>
<td>21</td>
<td>template Greek</td>
<td align="right">2,755,968</td>
<td align="right">0.017</td>
<td align="right">46,854.785</td>
<td align="right">0.017</td>
<td align="right">46,854.785</td>
</tr>
<tr>
<td> "*code/unicode.xsl" </td>
<td>32</td>
<td>template Hebrew</td>
<td align="right">1,329,696</td>
<td align="right">0.043</td>
<td align="right">57,529.163</td>
<td align="right">0.008</td>
<td align="right">10,674.378</td>
</tr>
<tr>
<td> "*code/unicode.xsl" </td>
<td>54</td>
<td>template IPA</td>
<td align="right">333,984</td>
<td align="right">0.206</td>
<td align="right">68,964.076</td>
<td align="right">0.019</td>
<td align="right">6,381.186</td>
</tr>
<tr>
<td> "*code/unicode.xsl" </td>
<td>43</td>
<td>template Cyrillic</td>
<td align="right">665,392</td>
<td align="right">0.094</td>
<td align="right">62,582.890</td>
<td align="right">0.008</td>
<td align="right">5,053.727</td>
</tr>
<tr>
<td> "*code/unicode.xsl" </td>
<td>65</td>
<td>template Arabic</td>
<td align="right">167,068</td>
<td align="right">0.421</td>
<td align="right">70,284.800</td>
<td align="right">0.008</td>
<td align="right">1,320.724</td>
</tr>
<tr>
<td> "*code/unicode.xsl" </td>
<td>76</td>
<td>template Arrows</td>
<td align="right">83,536</td>
<td align="right">0.849</td>
<td align="right">70,945.946</td>
<td align="right">0.008</td>
<td align="right">661.146</td>
</tr>
<tr>
<td> "*code/unicode.xsl" </td>
<td>8</td>
<td>template *</td>
<td align="right">12,122</td>
<td align="right">5.959</td>
<td align="right">72,238.100</td>
<td align="right">0.034</td>
<td align="right">413.937</td>
</tr>
<tr>
<td> "*code/unicode.xsl" </td>
<td>87</td>
<td>template Dingbats</td>
<td align="right">41,768</td>
<td align="right">1.708</td>
<td align="right">71,323.074</td>
<td align="right">0.009</td>
<td align="right">377.128</td>
</tr>
<tr>
<td> "*code/unicode.xsl" </td>
<td>98</td>
<td>template Private</td>
<td align="right">20,884</td>
<td align="right">3.427</td>
<td align="right">71,576.916</td>
<td align="right">0.012</td>
<td align="right">253.842</td>
</tr>
<tr>
<td> "*code/unicode.xsl" </td>
<td>18</td>
<td>template processing-instruction()</td>
<td align="right">6,907</td>
<td align="right">0.014</td>
<td align="right">98.490</td>
<td align="right">0.014</td>
<td align="right">98.490</td>
</tr>
<tr>
<td> "*code/unicode.xsl" </td>
<td>121</td>
<td>template text()</td>
<td align="right">20,884</td>
<td align="right">3.429</td>
<td align="right">71,600.976</td>
<td align="right">0.001</td>
<td align="right">24.060</td>
</tr>
</tbody>
</table>
</body>
</html>
The profile of Martin Honnen's code:
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Analysis of Stylesheet Execution Time</title>
</head>
<body>
<h1>Analysis of Stylesheet Execution Time</h1>
<p>Total time: 2900.594 milliseconds</p>
<h2>Time spent in each template, function or global variable:</h2>
<p>The table below is ordered by the total net time spent in the template, function
or global variable. Gross time means the time including called templates and functions
(recursive calls only count from the original entry); net time means time excluding
time spent in called templates and functions.
</p>
<table border="border" cellpadding="10">
<thead>
<tr>
<th>file</th>
<th>line</th>
<th>instruction</th>
<th>count</th>
<th>average time (gross/ms)</th>
<th>total time (gross/ms)</th>
<th>average time (net/ms)</th>
<th>total time (net/ms)</th>
</tr>
</thead>
<tbody>
<tr>
<td> "*code/unicode.xsl" </td>
<td>44</td>
<td>template text()</td>
<td align="right">222,968</td>
<td align="right">0.009</td>
<td align="right">1,949.720</td>
<td align="right">0.009</td>
<td align="right">1,949.720</td>
</tr>
<tr>
<td> "*code/unicode.xsl" </td>
<td>26</td>
<td>template text()</td>
<td align="right">20,884</td>
<td align="right">0.135</td>
<td align="right">2,823.597</td>
<td align="right">0.042</td>
<td align="right">873.877</td>
</tr>
</tbody>
</table>
</body>
</html>

Regular expressions such as \p{IsIPAExtensions} should be reasonably efficient: most of the blocks are a single consecutive range of codepoints and testing a character should simply check whether it is in that range. The cost, I suspect, arises not so much from the cost of checking one character against one Unicode block, but from the number of characters and the number of blocks.
It might be worth getting a profile at the Java level to see where it is spending its time. I can guess, but a profile would reveal if my guess is right.
The thing that can kill performance with regular expressions is backtracking, but I don't immediately see any risk of backtracking with this code.
The only other approach that comes to mind is to generate an enormous translate() call that classifies characters into groups (so all Latin characters become "1", all Cyrillic characters become "2", etc.) and then to process the result using <xsl:for-each-group select="string-to-codepoints(.)" group-adjacent=".">. But there's no guarantee that would perform any better, and it's a lot of work to do the experiments to find out.
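As a rough sketch of the grouping half of that idea, and not measured for speed, the code points can be classified directly by a function instead of going through translate() first. The function name f:script, its namespace URI, and the hard-coded block ranges (Cyrillic U+0400 to U+04FF, IPA Extensions U+0250 to U+02AF, Phonetic Extensions U+1D00 to U+1D7F) are assumptions made for illustration; further scripts would need their own ranges.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:script"
    exclude-result-prefixes="xs f"
    version="2.0">

  <!-- Hypothetical helper: maps one code point to a script label,
       or to the empty string when the character needs no wrapper. -->
  <xsl:function name="f:script" as="xs:string">
    <xsl:param name="cp" as="xs:integer"/>
    <xsl:sequence select="
      if ($cp ge 1024 and $cp le 1279) then 'Cyrillic'
      else if (($cp ge 592 and $cp le 687) or ($cp ge 7424 and $cp le 7551)) then 'IPA'
      else ''"/>
  </xsl:function>

  <!-- Identity template: copies everything that is not handled below. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- One pass per text node: adjacent code points with the same
       label form a group, and labelled groups get a wrapper element. -->
  <xsl:template match="text()">
    <xsl:for-each-group select="string-to-codepoints(.)" group-adjacent="f:script(.)">
      <xsl:choose>
        <xsl:when test="current-grouping-key() = ''">
          <xsl:value-of select="codepoints-to-string(current-group())"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:element name="{current-grouping-key()}">
            <xsl:value-of select="codepoints-to-string(current-group())"/>
          </xsl:element>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each-group>
  </xsl:template>
</xsl:stylesheet>
```

Each text node is scanned once regardless of how many scripts are listed, which is the property the suggestion is after; whether that beats the regex version would still have to be profiled.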

In XSLT 3, I would consider the following approach:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:fn="http://www.w3.org/2005/xpath-functions"
xmlns:map="http://www.w3.org/2005/xpath-functions/map"
exclude-result-prefixes="#all"
version="3.0">
<xsl:param name="scripts"
as="map(xs:string, xs:string)*"
select="map { 'Cyrillic' : '\p{IsCyrillic}+'},
map { 'IPA' : '[\p{IsIPAExtensions}\p{IsPhoneticExtensions}]+' }"/>
<xsl:mode on-no-match="shallow-copy"/>
<xsl:template match="text()">
<xsl:iterate select="$scripts">
<xsl:param name="input" select="."/>
<xsl:on-completion>
<xsl:sequence select="$input"/>
</xsl:on-completion>
<xsl:next-iteration>
<xsl:with-param name="input">
<xsl:apply-templates select="$input" mode="wrap">
<xsl:with-param name="script-map" tunnel="yes" select="."/>
</xsl:apply-templates>
</xsl:with-param>
</xsl:next-iteration>
</xsl:iterate>
</xsl:template>
<xsl:mode name="wrap" on-no-match="shallow-copy"/>
<xsl:template match="text()" mode="wrap">
<xsl:param name="script-map" tunnel="yes"/>
<xsl:analyze-string select="." regex="{$script-map?*}">
<xsl:matching-substring>
<xsl:element name="{map:keys($script-map)}">
<xsl:value-of select="."/>
</xsl:element>
</xsl:matching-substring>
<xsl:non-matching-substring>
<xsl:value-of select="."/>
</xsl:non-matching-substring>
</xsl:analyze-string>
</xsl:template>
</xsl:stylesheet>
I haven't measured whether it performs better, but for the regular expressions I would consider [\p{IsIPAExtensions}\p{IsPhoneticExtensions}]+ to be easier than (\p{IsIPAExtensions}|\p{IsPhoneticExtensions})+.
The other improvements are to rely on the xsl:mode based identity transformation and xsl:iterate.
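A different way to cut the number of passes, also not measured here, is to put all scripts into one regular expression with one capturing group per script and then test which group matched. The following is only a sketch for XSLT 2.0 with the two scripts from the question; the element names are taken from the desired output, and extending it to ten scripts would mean ten groups and ten xsl:when branches.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <!-- Identity template for everything except text nodes. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Single analyze-string per text node. The regex attribute is an
       attribute value template, so literal braces are doubled. -->
  <xsl:template match="text()">
    <xsl:analyze-string select="."
        regex="(\p{{IsCyrillic}}+)|([\p{{IsIPAExtensions}}\p{{IsPhoneticExtensions}}]+)">
      <xsl:matching-substring>
        <xsl:choose>
          <!-- Group 1 is non-empty only when the Cyrillic branch matched. -->
          <xsl:when test="regex-group(1) != ''">
            <Cyrillic><xsl:value-of select="."/></Cyrillic>
          </xsl:when>
          <xsl:otherwise>
            <IPA><xsl:value-of select="."/></IPA>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>
</xsl:stylesheet>
```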

Related

xsl counting and xpath syntax

I have a table of projects created in SharePoint 2010. I am trying to create reporting for management and need to get counts of various fields. I have used xsl to get the values of fields when viewing single items, so I am fairly familiar with the language. However, I cannot find a good explanation of the syntax for counting multiple items.
I have a table like this:
<table>
<tr>
<th class="ms-vb2">
Project Title
</th>
<th class="ms-vb2">
Project Leader
</th>
<th class="ms-vb2">
Project Status
</th>
</tr>
<tr>
<td class="ms-vb2">
Project Title 1
</td>
<td class="ms-vb2">
Project Leader 1
</td>
<td class="ms-vb2">
Completed
</td>
</tr>
<tr>
<td class="ms-vb2">
Project Title 2
</td>
<td class="ms-vb2">
Project Leader 1
</td>
<td class="ms-vb2">
Withdrawn
</td>
</tr>
<tr>
<td class="ms-vb2">
Project Title 3
</td>
<td class="ms-vb2">
Project Leader 2
</td>
<td class="ms-vb2">
Completed
</td>
</tr>
<!--About 100 more rows-->
</table>
There is a lot of nesting going on, so I am having difficulty targeting specific areas and I have very little control over the html due to this being generated by SharePoint.
Here is the reporting table I am trying to create using XSL:
<table id="FourBlockerHead" class="ClearBlockFloat">
<tr>
<th>Completed Count</th>
<th>Withdrawn Count</th>
<th>On Hold Count</th>
</tr>
<tr>
<td>
<xsl:value-of select="count(../td[@ms-vb2='Completed'])" /><!--Should be 2-->
</td>
<td>
<xsl:value-of select="count(../td[@ms-vb2='Withdrawn'])" /><!--Should be 1-->
</td>
<td>
<xsl:value-of select="count(../td[@ms-vb2='On Hold'])" /><!--Should be 0-->
</td>
</tr>
</table>
I know there is a problem with my XPATH syntax, but I cannot figure it out.
A complete XSLT would look like this:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" indent="yes" />
<xsl:template match="/table">
<html>
<body>
<table id="FourBlockerHead" class="ClearBlockFloat" border="1">
<tr>
<th>Completed Count</th>
<th>Withdrawn Count</th>
<th>On Hold Count</th>
</tr>
<!-- Count across all rows of the table, not once per row -->
<tr>
<td>
<xsl:value-of select="count(tr/td[@class='ms-vb2' and normalize-space(text())='Completed'])" /><!--Should be 2-->
</td>
<td>
<xsl:value-of select="count(tr/td[@class='ms-vb2' and normalize-space(text())='Withdrawn'])" /><!--Should be 1-->
</td>
<td>
<xsl:value-of select="count(tr/td[@class='ms-vb2' and normalize-space(text())='On Hold'])" /><!--Should be 0-->
</td>
</tr>
</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
You can reference this stylesheet from your source XML file by adding
<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="source.xslt"?>
at the top of the file to get well-formatted HTML output.
Instead of:
<xsl:value-of select="count(../td[@ms-vb2='Completed'])" />
try:
<xsl:value-of select="count(//td[@class='ms-vb2'][normalize-space()='Completed'])" />
and likewise for the other two.
Notes:
You did not provide the context, so I have changed the path to an absolute one that counts all matching nodes in the entire document;
You don't have an attribute named ms-vb2;
You need to trim the whitespace in the data cell before comparing it to a string with no whitespace.
The following stylesheet:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" omit-xml-declaration="yes" version="1.0" encoding="utf-8" indent="yes"/>
<xsl:template match="table">
<table id="FourBlockerHead" class="ClearBlockFloat">
<tr>
<th>Completed Count</th>
<th>Withdrawn Count</th>
<th>On Hold Count</th>
</tr>
<tr>
<td>
<xsl:value-of select="count(//td[@class='ms-vb2'][normalize-space()='Completed'])" />
</td>
<td>
<xsl:value-of select="count(//td[@class='ms-vb2'][normalize-space()='Withdrawn'])" />
</td>
<td>
<xsl:value-of select="count(//td[@class='ms-vb2'][normalize-space()='On Hold'])" />
</td>
</tr>
</table>
</xsl:template>
</xsl:stylesheet>
applied to your input example, will return:
<table id="FourBlockerHead" class="ClearBlockFloat">
<tr>
<th>Completed Count</th>
<th>Withdrawn Count</th>
<th>On Hold Count</th>
</tr>
<tr>
<td>2</td>
<td>1</td>
<td>0</td>
</tr>
</table>

XSL Repeat Template Object containing List<Object>

I have a Library object that contains a collection of Books... The Library object has properties like Name, Address, Phone... While the Book object has properties like ISDN, Title, Author, and Price.
XML looks something like this...
<Library>
<Name>Metro Library</Name>
<Address>1 Post Rd. Brooklyn, NY 11218</Address>
<Phone>800 976-7070</Phone>
<Books>
<Book>
<ISDN>123456789</ISDN>
<Title>Fishing with Luke</Title>
<Author>Luke Miller</Author>
<Price>18.99</Price>
</Book>
<Book>
<ISDN>234567890</ISDN>
<Title>Hunting with Paul</Title>
<Author>Paul Worthington</Author>
<Price>28.99</Price>
</Book>
...
And more books
...
</Books>
</Library>
I have a template with space for only 10 per page for example. There can be hundreds of books in the list of Books... So I need to limit the number of books and repeat the template every 10 books.
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html>
<body>
<div>
<table>
<tr>
<td>NAME</td>
<td><xsl:value-of select="/Library/Name"/></td>
</tr>
<tr>
<td>ADDRESS</td>
<td><xsl:value-of select="/Library/Address"/></td>
</tr>
<tr>
<td>PHONE</td>
<td><xsl:value-of select="/Library/Phone"/></td>
</tr>
</table>
<table>
<xsl:for-each select="/Library/Books/Book">
<tr>
<td><xsl:value-of select="position()"/></td>
<td><xsl:value-of select="ISDN"/></td>
<td><xsl:value-of select="Title"/></td>
<td><xsl:value-of select="Author"/></td>
<td><xsl:value-of select="Price"/></td>
</tr>
</xsl:for-each>
</table>
</div>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
How can I get the Library information to appear on all repeating pages and add 10 books per page?... First page has Library info with Books 1 thru 10, Second page has Library info with Books 11 thru 20, and so on??
Thanks
For starters, try not to use for-each; apply-templates allows the engine to optimise the order in which events are processed.
It appears that you are calling this stylesheet from some other system, so the approach I've taken is to define a pagination parameter, page. In the host language, just change that top-level parameter when you call the stylesheet. This then allows you to select the required page in this line:
Books/Book[($page - 1)*10 < position() and position() <= ($page)*10]
This should do the trick.
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:param name="page" select="1"/>
<xsl:template match="/Library">
<html>
<body>
<div>
<table>
<tr>
<td>NAME</td>
<td>
<xsl:value-of select="Name"/>
</td>
</tr>
<tr>
<td>ADDRESS</td>
<td>
<xsl:value-of select="Address"/>
</td>
</tr>
<tr>
<td>PHONE</td>
<td>
<xsl:value-of select="Phone"/>
</td>
</tr>
</table>
<table>
<xsl:apply-templates select="Books/Book[($page - 1)*10 &lt; position() and position() &lt;= ($page)*10]"/>
</table>
</div>
</body>
</html>
</xsl:template>
<xsl:template match="Book">
<tr>
<td>
<xsl:value-of select="position()"/>
</td>
<td>
<xsl:value-of select="ISDN"/>
</td>
<td>
<xsl:value-of select="Title"/>
</td>
<td>
<xsl:value-of select="Author"/>
</td>
<td>
<xsl:value-of select="Price"/>
</td>
</tr>
</xsl:template>
</xsl:stylesheet>
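If all pages are wanted from a single run rather than one page per call, a common XSLT 1.0 idiom is to start a page at every tenth Book and pull in its following siblings. The sketch below assumes the Library structure from the question; the parameter name size and the mode name page are made up for illustration, and the mod test as written only works for a page size greater than 1.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- Hypothetical parameter: number of books per page (must be > 1). -->
  <xsl:param name="size" select="10"/>

  <xsl:template match="/Library">
    <html>
      <body>
        <!-- Books 1, 11, 21, ... each start a new page. -->
        <xsl:apply-templates select="Books/Book[position() mod $size = 1]" mode="page"/>
      </body>
    </html>
  </xsl:template>

  <!-- One page: the library header, then this book and the next size - 1 books. -->
  <xsl:template match="Book" mode="page">
    <div>
      <table>
        <tr><td>NAME</td><td><xsl:value-of select="/Library/Name"/></td></tr>
        <tr><td>ADDRESS</td><td><xsl:value-of select="/Library/Address"/></td></tr>
        <tr><td>PHONE</td><td><xsl:value-of select="/Library/Phone"/></td></tr>
      </table>
      <table>
        <xsl:apply-templates select=". | following-sibling::Book[position() &lt; $size]"/>
      </table>
    </div>
  </xsl:template>

  <!-- One row per book; xsl:number gives the position among all Book siblings. -->
  <xsl:template match="Book">
    <tr>
      <td><xsl:number/></td>
      <td><xsl:value-of select="ISDN"/></td>
      <td><xsl:value-of select="Title"/></td>
      <td><xsl:value-of select="Author"/></td>
      <td><xsl:value-of select="Price"/></td>
    </tr>
  </xsl:template>
</xsl:stylesheet>
```

Using xsl:number instead of position() keeps the running book number continuous across pages, since position() would restart at 1 within each selected group.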

Move a string to a header element from child only if the string exists in all the child elements using XSLT

My XML code & XSLT code
explanation :
I'm trying to move the string '2' from the elements with
(tr[@type='detail'] and td[@column='1'])
to the category header
(tr[@type='categoryhead' and level='2'])
Any help on this is greatly appreciated
Thanks a ton
<!--=============My XML=============-->
<tbody xmlns="http://mynamespace.com">
<tr layoutcode="" type="categoryhead" level="1" categorykey="2789" hierarchykey="4921">
<td colname="1">Bonds</td>
</tr>
<tr layoutcode="" type="categoryhead" level="2" categorykey="3255" hierarchykey="4922">
<td colname="1">Beverages</td>
</tr>
<tr layoutcode="" type="detail" level="3" securitymasterkey="41164">
<td colname="1">Security_1(1,2)</td>
<td colname="2">500</td>
<td colname="3">330</td>
</tr>
<tr layoutcode="" type="detail" level="3" securitymasterkey="41167">
<td colname="1">Security_4(1,2,3,4)</td>
<td colname="2">10</td>
<td colname="3">265</td>
</tr>
<tr layoutcode="" type="categorytotal" level="2" categorykey="3255" hierarchykey="4922">
<td colname="1">Beverages</td>
<td colname="2">530</td>
<td colname="3">1,045</td>
</tr>
<tr layoutcode="" type="categorytotal" level="1" categorykey="2789" hierarchykey="4921">
<td colname="1">TOTAL Bonds</td>
<td colname="2">530</td>
<td colname="3">1,045</td>
</tr>
<tr layoutcode="" type="categoryhead" level="1" categorykey="2936" hierarchykey="4921">
<td colname="1">Options</td>
</tr>
<tr layoutcode="" type="categoryhead" level="2" categorykey="3248" hierarchykey="4922">
<td colname="1">Agriculture</td>
</tr>
<tr layoutcode="" type="detail" level="3" securitymasterkey="41168">
<td colname="1">Security_5(#,1)</td>
<td colname="2">10</td>
<td colname="3">890</td>
</tr>
<tr layoutcode="" type="detail" level="3" securitymasterkey="41168">
<td colname="1">Security_5(#,2)</td>
<td colname="2">10</td>
<td colname="3">890</td>
</tr>
<tr layoutcode="" type="categorytotal" level="2" categorykey="3248" hierarchykey="4922">
<td colname="1">Agriculture</td>
<td colname="2">10</td>
<td colname="3">890</td>
</tr>
</tbody>
XSLT where I'm trying to move the string '2' from the elements with (tr[@type='detail'] and td[@column='1']) to the category header (tr[@type='categoryhead' and level='2'])
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:a="http://mynamespace.com" version="2.0">
<!-- Global Variable -->
<xsl:variable name="arg1" select="'2'"></xsl:variable>
<!-- This identity template copies the document -->
<xsl:template match="node() | @*">
<xsl:copy>
<xsl:apply-templates select="node() | @* "/>
</xsl:copy>
</xsl:template>
<xsl:template match="a:tbody/a:tr[@type='categoryhead' and @level='2']/a:td">
<xsl:for-each select="//a:tbody/a:tr[@type='detail']/a:td[@colname='1'][contains(.,$arg1)]">
<xsl:variable name="IsFooted" select="contains(.,$arg1)"></xsl:variable>
<xsl:value-of select="count(//a:tbody/a:tr[@type='detail']/a:td[@colname='1'][contains(.,$arg1)])"/>
<xsl:choose>
<xsl:when test="$IsFooted='true'">
<xsl:value-of select="."/>
<xsl:value-of select="concat('(',concat($arg1,')'))"/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="."/>
</xsl:otherwise>
</xsl:choose>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
Desired XML Output:
<tbody xmlns="http://mynamespace.com">
<tr layoutcode="" type="categoryhead" level="1" categorykey="2789" hierarchykey="4921">
<td colname="1">Bonds</td>
</tr>
<tr layoutcode="" type="categoryhead" level="2" categorykey="3255" hierarchykey="4922">
<td colname="1">Beverages (2)</td>
</tr>
<tr layoutcode="" type="detail" level="3" securitymasterkey="41164">
<td colname="1">Security_1(1)</td>
<td colname="2">500</td>
<td colname="3">330</td>
</tr>
<tr layoutcode="" type="detail" level="3" securitymasterkey="41167">
<td colname="1">Security_4(1,3,4)</td>
<td colname="2">10</td>
<td colname="3">265</td>
</tr>
<tr layoutcode="" type="categorytotal" level="2" categorykey="3255" hierarchykey="4922">
<td colname="1">Beverages</td>
<td colname="2">530</td>
<td colname="3">1,045</td>
</tr>
<tr layoutcode="" type="categorytotal" level="1" categorykey="2789" hierarchykey="4921">
<td colname="1">TOTAL Bonds</td>
<td colname="2">530</td>
<td colname="3">1,045</td>
</tr>
<tr layoutcode="" type="categoryhead" level="1" categorykey="2936" hierarchykey="4921">
<td colname="1">Options</td>
</tr>
<tr layoutcode="" type="categoryhead" level="2" categorykey="3248" hierarchykey="4922">
<td colname="1">Agriculture</td>
</tr>
<tr layoutcode="" type="detail" level="3" securitymasterkey="41168">
<td colname="1">Security_5(#,1)</td>
<td colname="2">10</td>
<td colname="3">890</td>
</tr>
<tr layoutcode="" type="detail" level="3" securitymasterkey="41168">
<td colname="1">Security_5(#,2)</td>
<td colname="2">10</td>
<td colname="3">890</td>
</tr>
<tr layoutcode="" type="categorytotal" level="2" categorykey="3248" hierarchykey="4922">
<td colname="1">Agriculture</td>
<td colname="2">10</td>
<td colname="3">890</td>
</tr>
</tbody>
The terminology is not quite clear here: looking at the output sample, all that is happening is that "(2)" is being appended to a particular table cell, so nothing is really being moved.
Also, your question title mentions child elements, but looking at the XML structure, the "detail" rows are actually siblings of the "categoryhead" rows. This is probably where the problem lies: how do you correlate the "detail" rows with the associated "categoryhead" row? One way to do this is with a key:
<xsl:key name="row"
match="a:tr[@level != '1']"
use="generate-id(preceding-sibling::a:tr[@level=current()/@level - 1][1])" />
This gets the rows (other than those at level 1) and groups them by the first preceding row whose @level attribute is one lower.
Now, for your template match, because you are changing "td" elements, I would have the template match such an element rather than the "tr" element:
<xsl:template match="a:tbody/a:tr[@type='categoryhead' and @level='2']/a:td[@colname='1']">
You can then use the key to get the 'child' elements, but instead of thinking in terms of whether all child elements contain a '2', reverse the logic and check whether any child element doesn't contain a '2'.
Try this XSLT
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:a="http://composition.bowne.com/2010/v4" version="1.0">
<!-- Global Variable -->
<xsl:variable name="arg1" select="'2'"></xsl:variable>
<xsl:key name="row" match="a:tr[@level != '1']" use="generate-id(preceding-sibling::a:tr[@level=current()/@level - 1][1])" />
<!-- This identity template copies the document -->
<xsl:template match="node() | @*">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="a:tbody/a:tr[@type='categoryhead' and @level='2']/a:td[@colname='1']">
<xsl:variable name="IsMissing" select="key('row', generate-id(..))/a:td[@colname='1'][not(contains(text(), $arg1))]" />
<xsl:choose>
<xsl:when test="not($IsMissing)">
<xsl:value-of select="."/>
<xsl:value-of select="concat('(',concat($arg1,')'))"/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="."/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
As an aside, the namespace in your XSLT does not match the namespace in your XML. You would need these to match for this to work, but I assume this is a typo.
EDIT: To remove the '2' from the 'detail' rows, try adding the following template. The 'isMissing' variable is a bit messier, because it has to find the associated 'categoryhead' first. Also note it uses 'replace' which is only available in XSLT 2.0.
<xsl:template match="a:tbody/a:tr[@type='detail']/a:td[@colname='1']">
<xsl:variable name="parentLevel" select="../@level - 1" />
<xsl:variable name="IsMissing" select="key('row', generate-id(../preceding-sibling::a:tr[@level=$parentLevel][1]))/a:td[@colname='1'][not(contains(text(), $arg1))]" />
<xsl:choose>
<xsl:when test="not($IsMissing)">
<xsl:value-of select="replace(., concat(',', $arg1), '')"/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="."/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>

XSLT: my transform add an unselected element. What am I missing?

Ok, I'm working through some simple tutorials from here:
http://www.cch.kcl.ac.uk/legacy/teaching/7aavdh06/xslt/html/module_06.html
The first exercise involves creating a transformation that produces a certain output. Unfortunately, although I'm close, I get an unwanted element at the start. i.e.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
<xsl:output method="xhtml"
doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" />
<xsl:template match="/div/placeName">
<html>
<head />
<body>
<Table>
<tr>
<td>Place Name</td>
<td>
<xsl:value-of select="name" />
</td>
</tr>
<tr>
<td>Place Name (regularised)</td>
<td>
<xsl:value-of select="@reg" />
</td>
</tr>
<tr>
<td>National Grid Reference</td>
<td>
<xsl:value-of select="@key" />
</td>
</tr>
<tr>
<td>Type of building/monument</td>
<td>
<xsl:value-of select="settlement/@type" />
</td>
</tr>
</Table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
but the output I'm getting is:
Location
Place Name Old Warden
Place Name (regularised) Old Warden, St Leonard
National Grid Reference TL 137 443
Type of building/monument Parish church
The rest is fine but the 'Location' is unwanted. The source XML is at the link above. Any idea how I stop the unwanted text appearing? Or, better still, tell me where I'm going wrong! :)
Edit: Here is the output
<?xml version="1.0" encoding="utf-8" ?>
Location
<!DOCTYPE html SYSTEM "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<body>
<table>
<tr>
<td>Place Name</td>
<td>Old Warden</td>
</tr>
<tr>
<td>Place Name (regularised)</td>
<td>Old Warden, St Leonard</td>
</tr>
<tr>
<td>National Grid Reference</td>
<td>TL 137 443</td>
</tr>
<tr>
<td>Type of building/monument</td>
<td>Parish church</td>
</tr>
</table>
</body>
</html>
As Stivel mentions, the "Location" text does come from the head element in your XML.
<div type="location">
<head n="I">Location</head>
<placeName reg="Old Warden, St Leonard" key="TL 137 443">
It appears because of XSLT's built-in templates, which the processor uses when your stylesheet does not provide a matching template for a node it is processing.
You can read up on built-in templates at the W3C page, but in short: if XSLT can't find a match, it will either continue processing the element's children (without copying the element) or, in the case of text or attributes, output the value.
XSLT will start by looking for a template that matches the document node, and if you have not provided one, it will continue looking for a template for the root element, and then its children, and so on.
In your case, you have not provided a template to match anything until /div/placeName, which means XSLT will use the built-in template for the div element. This has two children: head and placeName. You have a template it can use for placeName, but not for head, so the built-in templates end up outputting the text of head because you have not told it to do otherwise.
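For reference, the built-in rules behave roughly like the stylesheet below, as described in the XSLT 1.0 specification (XSLT 2.0 additionally passes template parameters through, which makes no difference here). A stylesheet containing only these templates outputs the concatenated text of the document, which is why "Location" leaks into the result.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Document node and elements: output nothing, just process the children. -->
  <xsl:template match="*|/">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Text and attribute nodes: output their string value. -->
  <xsl:template match="text()|@*">
    <xsl:value-of select="."/>
  </xsl:template>

  <!-- Processing instructions and comments: output nothing. -->
  <xsl:template match="processing-instruction()|comment()"/>

</xsl:stylesheet>
```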
The solution is simply to add a template to ignore the head element
<xsl:template match="/div/head" />
Here is the full XSLT
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
<xsl:output method="xhtml"
doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" indent="yes" />
<xsl:template match="/div/head" />
<xsl:template match="/div/placeName">
<html>
<head />
<body>
<Table>
<tr>
<td>Place Name</td>
<td>
<xsl:value-of select="name" />
</td>
</tr>
<tr>
<td>Place Name (regularised)</td>
<td>
<xsl:value-of select="@reg" />
</td>
</tr>
<tr>
<td>National Grid Reference</td>
<td>
<xsl:value-of select="@key" />
</td>
</tr>
<tr>
<td>Type of building/monument</td>
<td>
<xsl:value-of select="settlement/@type" />
</td>
</tr>
</Table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
When you use this, this should give the output you need.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xhtml" doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"/>
<xsl:template match="div">
<xsl:apply-templates select="placeName"/>
</xsl:template>
<xsl:template match="placeName">
<html>
<head />
<body>
<Table>
<tr>
<td>Place Name</td>
<td>
<xsl:value-of select="name" />
</td>
</tr>
<tr>
<td>Place Name (regularised)</td>
<td>
<xsl:value-of select="@reg" />
</td>
</tr>
<tr>
<td>National Grid Reference</td>
<td>
<xsl:value-of select="@key" />
</td>
</tr>
<tr>
<td>Type of building/monument</td>
<td>
<xsl:value-of select="settlement/@type" />
</td>
</tr>
</Table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
Your <head/> probably refers to
<head n="I">Location</head>
Remove <head/> from the XSL and check again.

How to transform an XML with default namespace?

I need some help generating the XSL file for my XML data.
Here is my XML Data
<?xml-stylesheet href="C:\Style.xsl" type="text/xsl" ?>
<xml>
<ApproverRoles OperationType="RemovedUser" xmlns="http://tempuri.org/">
<UserName>Bhupathiraju, Venkata</UserName><UserRole>IT Owner</UserRole><RoleDescription>Role Owner
</RoleDescription><UserRoleID>138</UserRoleID></ApproverRoles>
<ApproverRoles OperationType="RemovedUser" xmlns="http://tempuri.org/">
<UserName>Bhupathiraju, Venkata</UserName><UserRole>Business Owner</UserRole>
<RoleDescription>Role Owner</RoleDescription><UserRoleID>136</UserRoleID></ApproverRoles>
<ApproverRoles OperationType="RemovedUser" xmlns="http://tempuri.org/"><UserName>Amperayeni, Kiran K</UserName>
<UserRole>IT Owner</UserRole><RoleDescription>asdasdasd</RoleDescription><UserRoleID>97</UserRoleID>
</ApproverRoles>
<ApproverRoles OperationType="RemovedUser" xmlns="http://tempuri.org/"><UserName>Amperayeni, Kiran K</UserName>
<UserRole>IT Owner</UserRole><RoleDescription>i</RoleDescription><UserRoleID>135</UserRoleID></ApproverRoles>
</xml>
My XSL file is below
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match ="/" >
<html>
<head>
<title>User Management</title>
</head>
<body>
<table width="600" border="1" style='font-family:Calibri;font-size:10pt;background-color:#FFFFFF;border-color:#ccccff'>
<tr bgcolor = "#ccccff" style='font-weight:bold;'>
<td colspan="3">Proposed Users :</td>
</tr>
<tr bgcolor = "#cccccc" style='font-weight:bold;'>
<td>User Name</td>
<td>Role</td>
<td>Role Qualifier</td>
</tr>
<xsl:for-each select="//ns1:ApproverRoles" >
<tr>
<td>
<xsl:value-of select="UserName" />
</td>
<td>
<xsl:value-of select="UserRole" />
</td>
<td>
<xsl:value-of select="RoleDescription" />
</td>
</tr>
</xsl:for-each>
<tr bgcolor = "#ccccff" style='font-weight:bold;'>
<td colspan="3">Removed Users :</td>
</tr>
<tr bgcolor = "#cccccc" style='font-weight:bold;'>
<td>User Name</td>
<td>Role</td>
<td>Role Qualifier</td>
</tr>
</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet >
You are not correctly dealing with the default namespace present in the input document. If you do not associate a prefix with the corresponding namespace URI, the XSLT processor will search for elements in no namespace. However, the ApproverRoles elements in your input document, and their children, are all in the namespace http://tempuri.org/.
So, you need first to declare the namespace prefix in the transform:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:ns1="http://tempuri.org/">
Then, you have to use the prefix accordingly. For instance:
<xsl:for-each select="//ns1:ApproverRoles" >
<tr>
<td>
<xsl:value-of select="ns1:UserName" />
</td>
<td>
<xsl:value-of select="ns1:UserRole" />
</td>
<td>
<xsl:value-of select="ns1:RoleDescription" />
</td>
</tr>
</xsl:for-each>
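Putting the two pieces together, a minimal complete XSLT 1.0 stylesheet could look like the sketch below. The table layout is cut down from the question, and exclude-result-prefixes and normalize-space() are additions of mine: the first keeps the ns1 declaration out of the HTML output, the second trims the line break inside one of the RoleDescription values.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ns1="http://tempuri.org/"
    exclude-result-prefixes="ns1">

  <xsl:output method="html"/>

  <xsl:template match="/">
    <html>
      <body>
        <table border="1">
          <tr>
            <td>User Name</td>
            <td>Role</td>
            <td>Role Qualifier</td>
          </tr>
          <!-- The prefix ns1 stands for the input's default namespace. -->
          <xsl:apply-templates select="//ns1:ApproverRoles"/>
        </table>
      </body>
    </html>
  </xsl:template>

  <!-- Child elements inherit the default namespace, so they need the prefix too. -->
  <xsl:template match="ns1:ApproverRoles">
    <tr>
      <td><xsl:value-of select="ns1:UserName"/></td>
      <td><xsl:value-of select="ns1:UserRole"/></td>
      <td><xsl:value-of select="normalize-space(ns1:RoleDescription)"/></td>
    </tr>
  </xsl:template>
</xsl:stylesheet>
```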