XSLT disable-output-escaping but allow some entities

I have the following xslt to output comments but it is vulnerable to XSS if I disable all output escaping.
<xsl:value-of disable-output-escaping="yes" select="//datafor:field[datafor:name='comments']/datafor:value" />
How can I allow only the following HTML elements and attributes so that I can retain some formatting in the rendered output?
'b,strong,i,ul,ol,li,p,br,p[style],div,div[class]'

Using SaxonJS in the browser, you could call into JavaScript to parse the HTML fragment and then process the parsed nodes with templates, e.g.
function parseHTML(html) {
return new DOMParser().parseFromString(html, 'text/html');
}
const xml = `<orders>
<order>
<id>o1</id>
<date>2022-06-12</date>
<comments><![CDATA[I want the following extras: <ol>
<li>32 GB RAM</li>
<li>1000 GB SSD</li>
]]></comments>
</order>
</orders>`;
const xslt = `<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns="http://www.w3.org/1999/xhtml"
version="3.0"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="#all"
xmlns:js="http://saxonica.com/ns/globalJS"
expand-text="yes">
<xsl:mode name="html"/>
<xsl:template mode="html" match="b | strong | i | ul | ol | li | p | br | p | div | p/@style | div/@class" xpath-default-namespace="http://www.w3.org/1999/xhtml">
<xsl:copy>
<xsl:apply-templates select="@* | node()" mode="#current"/>
</xsl:copy>
</xsl:template>
<xsl:template match="/">
<html>
<head>
<title>Test</title>
</head>
<body>
<xsl:apply-templates/>
</body>
</html>
</xsl:template>
<xsl:template match="orders">
<h1>Orders</h1>
<xsl:where-populated>
<ol>
<xsl:apply-templates/>
</ol>
</xsl:where-populated>
</xsl:template>
<xsl:template match="order">
<li>Order {id} from {format-date(date, '[D] [M] [Y0000]')}
<div>
<h2>Comments</h2>
<div>
<xsl:apply-templates select="js:parseHTML(string(comments))" mode="html"/>
</div>
</div></li>
</xsl:template>
</xsl:stylesheet>`;
const result = SaxonJS.XPath.evaluate(`transform(map {
'stylesheet-text' : $xslt,
'source-node' : parse-xml($xml)
}
)?output/*/*/node()`,
[],
{ params : {
xslt: xslt,
xml: xml
}
}
);
document.body.append(...result);
<script src="https://martin-honnen.github.io/xslt3fiddle/js/SaxonJS2.js"></script>
With the XSLT implementation of an HTML tag soup parser by David Carlisle it would be
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns="http://www.w3.org/1999/xhtml"
version="3.0"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="#all"
xmlns:d="data:,dpc"
xmlns:js="http://saxonica.com/ns/globalJS"
expand-text="yes">
<xsl:import href="https://github.com/davidcarlisle/web-xslt/raw/main/htmlparse/htmlparse.xsl"/>
<xsl:mode name="html"/>
<xsl:template mode="html" match="b | strong | i | ul | ol | li | p | br | p | div | p/@style | div/@class" xpath-default-namespace="http://www.w3.org/1999/xhtml">
<xsl:copy>
<xsl:apply-templates select="@* | node()" mode="#current"/>
</xsl:copy>
</xsl:template>
<xsl:template match="/">
<html>
<head>
<title>Test</title>
</head>
<body>
<xsl:apply-templates/>
</body>
</html>
</xsl:template>
<xsl:template match="orders">
<h1>Orders</h1>
<xsl:where-populated>
<ol>
<xsl:apply-templates/>
</ol>
</xsl:where-populated>
</xsl:template>
<xsl:template match="order">
<li>Order {id} from {format-date(date, '[D] [M] [Y0000]')}
<div>
<h2>Comments</h2>
<div>
<xsl:apply-templates select="d:htmlparse(comments)" mode="html"/>
</div>
</div></li>
</xsl:template>
</xsl:stylesheet>
Online sample.
The current samples copy only the listed elements but let any text node through; if elements such as script need to be stripped completely, including their content, add an empty template <xsl:template mode="html" match="script" xpath-default-namespace="http://www.w3.org/1999/xhtml"/>.
Of course, consider downloading the HTML parser module and using xsl:import on a local file in your own application instead of pulling it from GitHub.
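The allow-list idea above can be tightened a little. The following is only a sketch, not part of the original answer: it assumes the parsed HTML nodes are in the XHTML namespace, as in the samples, and the deny-list (script, style, iframe, object) is my own illustrative choice. It also adds an empty template for attributes, because the built-in rule of a text-only-copy mode outputs the string value of any unmatched attribute as text.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="3.0"
  xpath-default-namespace="http://www.w3.org/1999/xhtml">

  <!-- Unlisted elements lose their tags but keep their text (this is also the default) -->
  <xsl:mode name="html" on-no-match="text-only-copy"/>

  <!-- Allow-list: these elements and attributes are copied -->
  <xsl:template mode="html"
    match="b | strong | i | ul | ol | li | p | br | div | p/@style | div/@class">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()" mode="#current"/>
    </xsl:copy>
  </xsl:template>

  <!-- Any other attribute is dropped; without this template the built-in rule
       would emit its value as a text node (default priority -0.5 keeps it
       below p/@style and div/@class, which have priority 0.5) -->
  <xsl:template mode="html" match="@*"/>

  <!-- Deny-list (illustrative): dropped together with their content -->
  <xsl:template mode="html" match="script | style | iframe | object"/>

</xsl:stylesheet>
```

These declarations are meant to replace the xsl:mode and the single mode="html" template in either sample; the calls to js:parseHTML or d:htmlparse stay as they are. Note that a copied style attribute can still carry unwanted CSS, so you may want to check its value as well.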

Related

copy-of with search and replace relative paths

I want to insert an HTML snippet from an external file into my output document with copy-of, as described here: https://stackoverflow.com/a/5976762/18427492
The HTML snippet is a navigation bar and is also used by other (Python) scripts to generate other HTML files.
I need to replace the path in "href" with a relative path that I have in an XSLT variable.
Full file content (Template file to be copied):
<ul class="nav">
<li class="fineprint">MyNiceGame Developer Mode Documentation</li>
<li class="switchlang"><img src="/deco/dco_en_sml.gif" alt="English" border="0"></img></li>
<li>Introduction</li>
<li>Contents</li>
<li>Search</li>
<li>Engine</li>
<li>Command Line</li>
<li>Game Data</li>
<li>Script</li>
</ul>
So how can I insert this snippet into my XSL document and replace ../../sdk/ (it is possible to change this string to something like {replace-me}/sdk/...) with a relative path that I already have in an XSLT variable?
My XSLT document (I want to replace the <xsl:call-template name="nav"/> with the processing of the template file):
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" version="3.0" xpath-default-namespace="https://clonkspot.org" exclude-result-prefixes="xs">
<xsl:output method="html" encoding="ISO-8859-1" doctype-public="-//W3C//DTD HTML 4.01//EN"
doctype-system="http://www.w3.org/TR/html4/strict.dtd"/>
<xsl:template match="/clonkDoc">
<html>
<body>
<xsl:call-template name="nav"/>
<xsl:apply-templates select="func"/>
<!-- other possible nodes under /clonkDoc -->
<xsl:call-template name="nav"/>
</body>
</html>
</xsl:template>
<xsl:template name="nav">
<xsl:param name="relpath" tunnel="yes"/>
<ul class="nav">
<li class="fineprint">
<xsl:choose>
<xsl:when test='lang("en")'>MyNiceGame Developer Mode Documentation</xsl:when>
</xsl:choose>
</li>
<!-- Other li elements -->
</ul>
</xsl:template>
</xsl:stylesheet>
Example source file:
<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
<?xml-stylesheet type="text/xsl" href="../../../clonk.xsl"?>
<clonkDoc xmlns="https://clonkspot.org"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="https://clonkspot.org ../../../clonk.xsd" xml:lang="de">
<func>
<!-- other nodes -->
</func>
</clonkDoc>
Desired target file:
<!DOCTYPE html
PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<!-- stuff -->
</head>
<body>
<ul class="nav">
<!-- The corrected li elements with modified a href link -->
</ul>
<!-- Other stuff from source file (<func>) -->
<ul class="nav">
<!-- The corrected li elements with modified a href link -->
</ul>
</body>
</html>
Martin Honnen's solution for my specific case with the xpath-default-namespace:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" version="3.0" xpath-default-namespace="https://clonkspot.org" exclude-result-prefixes="xs">
<xsl:output method="html" encoding="ISO-8859-1" doctype-public="-//W3C//DTD HTML 4.01//EN"
doctype-system="http://www.w3.org/TR/html4/strict.dtd"/>
<xsl:template match="/clonkDoc">
<html>
<body>
<xsl:apply-templates select="doc('file.html')//ul[@class = 'nav']" xpath-default-namespace="" mode="fix-links"/>
<xsl:apply-templates select="func"/>
<!-- other possible nodes under /clonkDoc -->
<xsl:apply-templates select="doc('file.html')//ul[@class = 'nav']" xpath-default-namespace="" mode="fix-links"/>
</body>
</html>
</xsl:template>
<xsl:mode name="fix-links" on-no-match="shallow-copy"/>
<xsl:template mode="fix-links" match="ul/li/a/@href" xpath-default-namespace="">
<xsl:message>Value href: <xsl:value-of select="."/></xsl:message>
<xsl:attribute name="{name()}" select="replace(., '../../sdk', 'foobar')"/>
</xsl:template>
</xsl:stylesheet>
copy-of makes a deep copy; if you want to transform input nodes (even only their attribute values), you write templates to do so, e.g. <xsl:apply-templates select="doc('file.xml')//ul[@class = 'nav']" mode="fix-links"/> or, since the edit says the snippet with the ul is all there is in the file, simply <xsl:apply-templates select="doc('file.xml')" mode="fix-links"/>, together with
<xsl:mode name="fix-links" on-no-match="shallow-copy"/>
<xsl:template mode="fix-links" match="ul/li/a/@href">
<xsl:attribute name="{name()}" select="replace(., '../../sdk', $varname)"/>
</xsl:template>
The xsl:mode declaration is XSLT 3 only; in earlier versions, declare the identity transformation for that mode, e.g.
<xsl:template mode="fix-links" match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()" mode="fix-links"/>
</xsl:copy>
</xsl:template>
in XSLT 1 or
<xsl:template mode="fix-links" match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()" mode="#current"/>
</xsl:copy>
</xsl:template>
in XSLT 2.
XSLT 3 sample (slightly adapted so that the demonstration works with the primary input) outputs
<ul class="nav">
<li class="fineprint">MyNiceGame Developer Mode Documentation</li>
<li class="switchlang"><img src="/deco/dco_de_sml.gif" alt="German" border="0"/></li>
<li>Introduction</li>
<li>Contents</li>
<li>Search</li>
<li>Engine</li>
<li>Command Line</li>
<li>Game Data</li>
<li>Script</li>
</ul>
As for the information in the latest edit: the secondary input document you want to process has elements in no namespace, while your primary input has elements in a namespace that your XSLT declares as the xpath-default-namespace. In that case you need to override that setting for any selections in the secondary input, e.g.
<xsl:mode name="fix-links" on-no-match="shallow-copy"/>
<xsl:template mode="fix-links" match="ul/li/a/@href" xpath-default-namespace="">
<xsl:attribute name="{name()}" select="replace(., '../../sdk', $varname)"/>
</xsl:template>
and, if you continue to use the apply-templates with an element selector, there as well, e.g. <xsl:apply-templates select="doc('file.xml')//ul[@class = 'nav']" xpath-default-namespace="" mode="fix-links"/>.
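One detail worth keeping in mind with the replace() calls above: the second argument is a regular expression, so each unescaped dot in '../../sdk' matches any character. A minimal sketch with the dots escaped follows; the parameter name relpath and its default value are placeholders of mine, the ^ anchor is an assumption that the prefix only needs replacing at the start of the href, and the stylesheet processes the primary input rather than doc('file.html') to stay self-contained.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">

  <!-- Hypothetical parameter standing in for the relative path variable -->
  <xsl:param name="relpath" select="'../..'"/>

  <!-- Copy everything unchanged unless a template in this mode says otherwise -->
  <xsl:mode name="fix-links" on-no-match="shallow-copy"/>

  <!-- Entry point: run the primary input through the fix-links mode -->
  <xsl:template match="/">
    <xsl:apply-templates select="node()" mode="fix-links"/>
  </xsl:template>

  <!-- Escaped dots match literal dots only; ^ anchors the match at the start -->
  <xsl:template mode="fix-links" match="a/@href">
    <xsl:attribute name="{name()}"
      select="replace(., '^\.\./\.\./sdk', $relpath)"/>
  </xsl:template>

</xsl:stylesheet>
```

The third argument of replace() has its own special characters ($ and \), so a replacement value containing either of them would need escaping; a plain relative path does not.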

Can we use multiple elements with the except operator in XSLT 1?

I want to use multiple element names with the except operator.
Example:
<root>
<div>
<span style="font-family:'Work Sans', sans-serif;">
</span>
<h3>hi</h3>
<ol><li>hello</li></ol>
<ul><li>hello</li></ul>
<p>name</p>
</div>
</root>
Expected result: convert the div tag to a p tag and then move only the h3, ol, ul and p tags out of it, if present.
<root>
<p>
<span style="font-family:'Work Sans', sans-serif;">
</span>
</p>
<h3>hi</h3>
<ol><li>hello</li></ol>
<ul><li>hello</li></ul>
<p>name</p>
</root>
I tried this:
<xsl:template match="div">
<p>
<xsl:copy-of select="* except h3"/>
</p>
<xsl:copy-of select="h3"/>
</xsl:template>
The above XSLT results in:
<root>
<p>
<span style="font-family:'Work Sans', sans-serif;">
</span>
<ol><li>hello</li></ol>
<ul><li>hello</li></ul>
<p>name</p>
</p>
<h3>hi</h3>
</root>
Is there any way to use multiple element names with the except operator, like <xsl:copy-of select="* except h3 and ol"/>?
Another approach I tried:
<xsl:template match="div[p | ol | ul | h3 | h2]">
<xsl:apply-templates/>
</xsl:template>
<!-- convert <div> to <p> if none of its direct children is one of p, ol, ul, h3, h2 -->
<xsl:template match="div[not(p | ol | ul | h3 | h2)]">
<p>
<xsl:apply-templates/>
</p>
</xsl:template>
And this results in the following, which is wrong, as I still want the span tag to be under a p tag:
<root><span style="font-family:'Work Sans', sans-serif;">
</span>
<h3>hi</h3>
<ol><li>hello</li></ol>
<ul><li>hello</li></ul>
<p>name</p></root>
Instead of using xsl:copy-of you could use the xsl:apply-templates approach, like this:
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:strip-space elements="*"/>
<xsl:output indent="yes"/>
<xsl:template match="node()|@*" name="copy">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="div">
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="div/span">
<p>
<xsl:call-template name="copy"/>
</p>
</xsl:template>
</xsl:stylesheet>
The except operator was introduced in XPath 2.0, so it is not available in XSLT 1.0, which uses XPath 1.0.
As for its use where supported, you can of course do e.g. * except (h3, ol).
In XPath 1.0 you can try e.g. *[not(self::h3|self::ol)].
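Applied to the div example from the question, the XPath 1.0 form could look like the sketch below. It assumes, as in the sample input, that the block-level children (h3, ol, ul, p) follow the inline ones, since everything else is collected into a single leading p.

```xml
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy everything that has no more specific rule -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="div">
    <!-- XPath 1.0 stand-in for "* except (h3, ol, ul, p)" -->
    <p>
      <xsl:apply-templates
        select="*[not(self::h3 | self::ol | self::ul | self::p)]"/>
    </p>
    <!-- The excluded elements are output after the p, in document order -->
    <xsl:apply-templates select="h3 | ol | ul | p"/>
  </xsl:template>

</xsl:stylesheet>
```

With XSLT 2.0 or later, the predicate can be written as * except (h3, ol, ul, p) instead.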

How to get data between headers?

I am new to XSLT. I want the input below to be converted into the output shown after it:
Input:
<ATTRIBUTE-VALUE>
<THE-VALUE>
<div xmlns="http://www.w3.org/1999/xhtml">
<h1 dir="ltr" id="_1536217498885">Main Description</h1>
Line1 The main description text goes here.
<p>Line2 The main description text goes here.</p>
<p>Line3 The main description text goes here.</p>
<p><img alt="Embedded Image" class="embeddedImageLink" id="_1536739954166" src="_9c3778a0-d596-4eef-85fa-052a5e1b2166.jpg"/></p>
<h1 dir="ltr" id="_1536217498886">Key Consideration</h1>
<p>Line1 The key consideration text goes here.</p>
<p>Line2 The key consideration text goes here.</p>
<h1 dir="ltr" id="_1536217498887">Skills</h1>
<p>Line1 The Skills text goes here.</p>
<p>Line2 The Skills text goes here.</p>
<p>Line3 The Skills text goes here.</p>
<h1 dir="ltr" id="_1536217498888">Synonyms</h1>
<p>The Synonyms text goes here.</p>
</div>
</THE-VALUE>
</ATTRIBUTE-VALUE>
Output should be:
<MainDescription>
<![CDATA[
<p>Line1 The main description text goes here.</p>
<p>Line2 The main description text goes here.</p>
<p>Line3 The main description text goes here.</p>
<p><img alt="Embedded Image" class="embeddedImageLink" id="_1536739954166" src="_9c3778a0-d596-4eef-85fa-052a5e1b2166.jpg"/></p>
]]>
</MainDescription>
<KeyConsiderations>
<![CDATA[
<p>Line1 The key consideration text goes here.</p>
<p>Line2 The key consideration text goes here.</p>
]]>
</KeyConsiderations>
<Skills>
<p>Line1 The Skills text goes here.</p>
<p>Line2 The Skills text goes here.</p>
<p>Line3 The Skills text goes here.</p>
</Skills>
<Synonyms>
<p>The Synonyms text goes here.</p>
</Synonyms>
I want the data between the <h1> elements; it can contain any HTML tags, which should be reproduced in the output. I tried the code at https://xsltfiddle.liberty-development.net/bdxtqy/2, but it only returns data that is wrapped in HTML tags. Please provide pointers on how to achieve the required output.
XSL code:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xhtml="http://www.w3.org/1999/xhtml"
xmlns:exsl="http://exslt.org/common"
exclude-result-prefixes="xhtml exsl"
version="1.0">
<xsl:import href="http://lenzconsulting.com/xml-to-string/xml-to-string.xsl"/>
<xsl:output method="xml" indent="yes"
cdata-section-elements="MainDescription KeyConsideration"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/">
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:key name="h1-group" match="xhtml:div/*[not(self::xhtml:h1)]" use="generate-id(preceding-sibling::xhtml:h1[1])"/>
<xsl:template match="xhtml:div[xhtml:h1]">
<xsl:apply-templates select="xhtml:h1"/>
</xsl:template>
<xsl:template match="xhtml:h1">
<xsl:element name="{translate(., ' ', '')}">
<xsl:variable name="rtf-with-xhtml-ns-stripped">
<xsl:apply-templates select="key('h1-group', generate-id())"/>
</xsl:variable>
<xsl:apply-templates select="exsl:node-set($rtf-with-xhtml-ns-stripped)/node()" mode="xml-to-string"/>
</xsl:element>
</xsl:template>
<xsl:template match="xhtml:p">
<p>
<xsl:apply-templates/>
</p>
</xsl:template>
</xsl:stylesheet>
I am getting output as:
<ATTRIBUTE-VALUE>
<THE-VALUE>
<MainDescription><![CDATA[<p>Line2 The main description text goes here.</p><p><img alt="Embedded Image" class="embeddedImageLink" id="_1536739954166" src="_9c3778a0-d596-4eef-85fa-052a5e1b2166.jpg" xmlns="http://www.w3.org/1999/xhtml"/></p>]]></MainDescription>
<KeyConsideration><![CDATA[<p>Line1 The key consideration text goes here.</p><p>Line2 The key consideration text goes here.</p>]]></KeyConsideration>
<Skills><p>Line1 The Skills text goes here.</p><p>Line2 The Skills text goes here.</p><p>Line3 The Skills text goes here.</p></Skills>
<Synonyms />
</THE-VALUE>
</ATTRIBUTE-VALUE>
If you change the code to match on node() instead of * for elements you get the text nodes included:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xhtml="http://www.w3.org/1999/xhtml"
xmlns:exsl="http://exslt.org/common"
exclude-result-prefixes="xhtml exsl"
version="1.0">
<xsl:import href="http://lenzconsulting.com/xml-to-string/xml-to-string.xsl"/>
<xsl:output method="xml" indent="yes"
cdata-section-elements="MainDescription KeyConsideration"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/">
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:key name="h1-group" match="xhtml:div/node()[not(self::xhtml:h1)]" use="generate-id(preceding-sibling::xhtml:h1[1])"/>
<xsl:template match="xhtml:div[xhtml:h1]">
<xsl:apply-templates select="xhtml:h1"/>
</xsl:template>
<xsl:template match="xhtml:h1[. = 'Main Description' or . = 'Key Consideration']">
<xsl:element name="{translate(., ' ', '')}">
<xsl:variable name="rtf-with-xhtml-ns-stripped">
<xsl:apply-templates select="key('h1-group', generate-id())"/>
</xsl:variable>
<xsl:apply-templates select="exsl:node-set($rtf-with-xhtml-ns-stripped)/node()" mode="xml-to-string"/>
</xsl:element>
</xsl:template>
<xsl:template match="xhtml:h1">
<xsl:element name="{translate(., ' ', '')}">
<xsl:variable name="rtf-with-xhtml-ns-stripped">
<xsl:apply-templates select="key('h1-group', generate-id())"/>
</xsl:variable>
<xsl:apply-templates select="exsl:node-set($rtf-with-xhtml-ns-stripped)/node()"/>
</xsl:element>
</xsl:template>
<xsl:template match="text()">
<xsl:value-of select="." disable-output-escaping="yes"/>
</xsl:template>
<xsl:template match="xhtml:p">
<p>
<xsl:apply-templates/>
</p>
</xsl:template>
</xsl:stylesheet>
https://xsltfiddle.liberty-development.net/bdxtqy/41
It is not clear when/where you want to wrap plain text like Line1 The main description text goes here. into a p element.
For the CDATA section use of disable-output-escaping I think you need to override the template for text() nodes of the imported xml-to-string stylesheet:
<xsl:template match="text()" mode="xml-to-string">
<xsl:value-of select="." disable-output-escaping="yes"/>
</xsl:template>
https://xsltfiddle.liberty-development.net/bdxtqy/42
I haven't tested whether that breaks anything.
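If an XSLT 2.0 or later processor is an option (the question uses XSLT 1.0, so this is an assumption), the grouping between headers can be sketched without a key, using xsl:for-each-group with group-starting-with. The sketch only shows the grouping: it copies the grouped nodes as they are, in the XHTML namespace, and does not reproduce the CDATA serialization or the namespace stripping of the stylesheets above.

```xml
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xhtml="http://www.w3.org/1999/xhtml"
  exclude-result-prefixes="xhtml">

  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity template for everything outside the div -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="xhtml:div[xhtml:h1]">
    <!-- node() rather than * so that plain text between headers is grouped too -->
    <xsl:for-each-group select="node()" group-starting-with="xhtml:h1">
      <!-- Skip a possible leading group that does not start with an h1 -->
      <xsl:if test="self::xhtml:h1">
        <xsl:element name="{translate(., ' ', '')}">
          <!-- Everything in the group except the h1 itself -->
          <xsl:copy-of select="current-group() except ."/>
        </xsl:element>
      </xsl:if>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```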

xslt string parameter does not show

I have the following a.html file:
<html>
<body>
<div class="a">aaa
<div class="b">bbb</div>
<div class="c">ccc
<div class="d">ddd</div>
</div>
</div>
</body>
</html>
I am using the following bash script:
#!/bin/bash
pid="a"
yyy=123
xsltproc --param pid ${pid} --param yyy ${yyy} ${pid}.xslt ${pid}.html > ${pid}_${yyy}.html
One parameter is an integer, the other is a string.
My a.xslt file is trying to insert both parameters in the html structure as follows:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:param name="pid"/>
<xsl:param name="yyy"/>
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="div[@class='a']">
<xsl:copy>
<xsl:apply-templates select="@* | text()" />
<div class="pid"><xsl:value-of select="$pid"/></div>
<div class="yyy"><xsl:value-of select="$yyy"/></div>
<xsl:apply-templates select="node()" />
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
And my output a_123.html is the following:
<html>
<body>
<div class="a">aaa
<div class="pid"></div>
<div class="yyy">123</div>aaa
<div class="b">bbb</div>
<div class="c">ccc1
<div class="d">ddd11</div>
</div>
</div>
</body>
</html>
This contains 2 mistakes:
aaa appears again after div class="yyy"
div class="pid" does not contain the value of the string parameter
What am I doing wrong?
Change the <xsl:apply-templates select="node()" /> to <xsl:apply-templates select="*"/> so that only the element nodes are processed there, not all child nodes including the text nodes, which you have already output earlier.
As for the parameter, --param treats its value as an XPath expression, so the bare value a is evaluated as a path selecting a child element named a and yields an empty result, while 123 is a valid number literal. I am not familiar with bash, but try xsltproc --param pid "'${pid}'" ..., so that the XPath expression is a string literal, or use --stringparam pid ${pid} for that parameter.
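Putting both suggestions together, the stylesheet from the question might look like the sketch below. The empty-string defaults on the two parameters are my addition, so that the stylesheet also runs when no value is passed.

```xml
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Empty-string defaults (an assumption) for runs without parameters -->
  <xsl:param name="pid" select="''"/>
  <xsl:param name="yyy" select="''"/>

  <!-- Identity template -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="div[@class='a']">
    <xsl:copy>
      <!-- Attributes and the div's own text come first -->
      <xsl:apply-templates select="@* | text()"/>
      <div class="pid"><xsl:value-of select="$pid"/></div>
      <div class="yyy"><xsl:value-of select="$yyy"/></div>
      <!-- Only child elements here, so the text is not output a second time -->
      <xsl:apply-templates select="*"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

It would be called with the string passed via --stringparam, for example xsltproc --stringparam pid "${pid}" --param yyy "${yyy}" "${pid}.xslt" "${pid}.html".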

How to get a transformed XML file with all child tags of every occurrence of the parent tag under only one parent tag

I am using this input XML file:
<Content>
<body><text>xxx</text></body>
<body><text>yy</text></body>
<body><text>zz</text></body>
<body><text>kk</text></body>
<body><text>mmm</text></body>
</Content>
After the XSLT transformation the output should be:
<Content>
<body><text>xxx</text>
<text>yy</text>
<text>zz</text>
<text>kk</text>
<text>mmm</text></body>
</Content>
Can anyone please provide the relevant XSLT file?
This complete transformation:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="body"/>
<xsl:template match="body[1]">
<body>
<xsl:apply-templates select="../body/node()"/>
</body>
</xsl:template>
</xsl:stylesheet>
when applied on the provided XML document:
<Content>
<body>
<text>xxx</text>
</body>
<body>
<text>yy</text>
</body>
<body>
<text>zz</text>
</body>
<body>
<text>kk</text>
</body>
<body>
<text>mmm</text>
</body>
</Content>
produces the wanted, correct result:
<Content>
<body>
<text>xxx</text>
<text>yy</text>
<text>zz</text>
<text>kk</text>
<text>mmm</text>
</body>
</Content>
Explanation:
The identity rule copies every node "as-is".
It is overridden by two templates. The first ignores/deletes every body element.
The second template overriding the identity template also overrides the first such template (the one that deletes every body element) for any body element that is the first body child of its parent. For this first body child only, a body element is generated, and inside it all nodes that are children of any body child of its parent (the current body element and all of its body siblings) are processed.
Alternatively, handle the Content element directly:
<xsl:template match="Content">
<xsl:copy>
<body>
<xsl:apply-templates select="body/text"/>
</body>
</xsl:copy>
</xsl:template>
<xsl:template match="body/text">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>