I have the following a.html file:
<html>
<body>
<div class="a">aaa
<div class="b">bbb</div>
<div class="c">ccc
<div class="d">ddd</div>
</div>
</div>
</body>
</html>
I am using the following bash script:
#!/bin/bash
pid="a"
yyy=123
xsltproc --param pid ${pid} --param yyy ${yyy} ${pid}.xslt ${pid}.html > ${pid}_${yyy}.html
One parameter is an integer, the other is a string.
My a.xslt file is trying to insert both parameters in the html structure as follows:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:param name="pid"/>
<xsl:param name="yyy"/>
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="div[@class='a']">
<xsl:copy>
<xsl:apply-templates select="@* | text()" />
<div class="pid"><xsl:value-of select="$pid"/></div>
<div class="yyy"><xsl:value-of select="$yyy"/></div>
<xsl:apply-templates select="node()" />
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
And my output a_123.html is the following:
<html>
<body>
<div class="a">aaa
<div class="pid"></div>
<div class="yyy">123</div>aaa
<div class="b">bbb</div>
<div class="c">ccc1
<div class="d">ddd11</div>
</div>
</div>
</body>
</html>
This contains 2 mistakes:
aaa appears again after div class="yyy"
div class="pid" does not contain the value of the string parameter
What am I doing wrong?
Change the <xsl:apply-templates select="node()" /> to <xsl:apply-templates select="*"/> so that only the element children are processed at that point, not all child nodes. The text nodes were already output by the earlier apply-templates that selects text(), which is why aaa appears twice.
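As for the parameter: --param evaluates its value as an XPath expression, so a bare a is read as a path expression selecting elements named a, which finds nothing and gives an empty value. I am not familiar with bash, but either wrap the value in XPath string quotes, e.g. xsltproc --param pid "'${pid}'" ... (double quotes on the outside so that bash still expands the variable), or use --stringparam pid ${pid} for that parameter.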
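To make the difference between the two options concrete, here is a minimal Python sketch that only assembles the argument list for the xsltproc call from the question; it does not run xsltproc. The helper name xsltproc_command is made up for illustration, while --param and --stringparam are the real xsltproc options.

```python
import shlex


def xsltproc_command(pid: str, yyy: int) -> list[str]:
    """Build the xsltproc argument list for the a.xslt / a.html example.

    Hypothetical helper: it only assembles arguments, it runs nothing.
    """
    return [
        "xsltproc",
        # --stringparam passes the value as a literal string, no XPath quoting needed.
        "--stringparam", "pid", pid,
        # --param evaluates the value as an XPath expression; 123 is a number literal.
        "--param", "yyy", str(yyy),
        f"{pid}.xslt",
        f"{pid}.html",
    ]


cmd = xsltproc_command("a", 123)
# shlex.join quotes each argument for a POSIX shell where needed.
print(shlex.join(cmd))
```

Passing the arguments as a list (for example to subprocess.run) sidesteps shell quoting altogether, which is the part that goes wrong most easily with --param.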
I want to insert an HTML snippet from an external file into my output document with copy-of, as described here: https://stackoverflow.com/a/5976762/18427492
The HTML snippet is a navigation bar that is also used by other (Python) scripts to generate other HTML files.
I need to replace the path in "href" so that it matches a relative path that I have in an XSLT variable.
Full file content (Template file to be copied):
<ul class="nav">
<li class="fineprint">MyNiceGame Developer Mode Documentation</li>
<li class="switchlang"><img src="/deco/dco_en_sml.gif" alt="English" border="0"></img></li>
<li>Introduction</li>
<li>Contents</li>
<li>Search</li>
<li>Engine</li>
<li>Command Line</li>
<li>Game Data</li>
<li>Script</li>
</ul>
So how can I insert this snippet into my XSL document and replace ../../sdk/ (it is possible to change this string to something like {replace-me}/sdk/...) with a relative path that I already have in an XSLT variable?
My XSLT document (I want to replace the <xsl:call-template name="nav"/> with the template file processing):
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" version="3.0" xpath-default-namespace="https://clonkspot.org" exclude-result-prefixes="xs">
<xsl:output method="html" encoding="ISO-8859-1" doctype-public="-//W3C//DTD HTML 4.01//EN"
doctype-system="http://www.w3.org/TR/html4/strict.dtd"/>
<xsl:template match="/clonkDoc">
<html>
<body>
<xsl:call-template name="nav"/>
<xsl:apply-templates select="func"/>
<!-- other possible nodes under /clonkDoc -->
<xsl:call-template name="nav"/>
</body>
</html>
</xsl:template>
<xsl:template name="nav">
<xsl:param name="relpath" tunnel="yes"/>
<ul class="nav">
<li class="fineprint">
<xsl:when test='lang("en")'>>MyNiceGame Developer Mode Documentation</xsl:when>
</li>
<!-- Other li elements -->
</xsl:template>
Example source file:
<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
<?xml-stylesheet type="text/xsl" href="../../../clonk.xsl"?>
<clonkDoc xmlns="https://clonkspot.org"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="https://clonkspot.org ../../../clonk.xsd" xml:lang="de">
<func>
<!-- other nodes -->
</func>
</clonkDoc>
Desired target file:
<!DOCTYPE html
PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<!-- stuff -->
</head>
<body>
<ul class="nav">
<!-- The corrected li elements with modified a href link -->
</ul>
<!-- Other stuff from source file (<func>) -->
<ul class="nav">
<!-- The corrected li elements with modified a href link -->
</ul>
</body>
</html>
Martin Honnen's solution for my specific case with the xpath-default-namespace:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" version="3.0" xpath-default-namespace="https://clonkspot.org" exclude-result-prefixes="xs">
<xsl:output method="html" encoding="ISO-8859-1" doctype-public="-//W3C//DTD HTML 4.01//EN"
doctype-system="http://www.w3.org/TR/html4/strict.dtd"/>
<xsl:template match="/clonkDoc">
<html>
<body>
<xsl:apply-templates select="doc('file.html')//ul[@class = 'nav']" xpath-default-namespace="" mode="fix-links"/>
<xsl:apply-templates select="func"/>
<!-- other possible nodes under /clonkDoc -->
<xsl:apply-templates select="doc('file.html')//ul[@class = 'nav']" xpath-default-namespace="" mode="fix-links"/>
</body>
</html>
</xsl:template>
<xsl:mode name="fix-links" on-no-match="shallow-copy"/>
<xsl:template mode="fix-links" match="ul/li/a/@href" xpath-default-namespace="">
<xsl:message>Value href: <xsl:value-of select="."></xsl:value-of></xsl:message>
<xsl:attribute name="{name()}" select="replace(., '../../sdk', 'foobar')"/>
</xsl:template>
copy-of makes a deep copy. If you want to transform input nodes (even if only their attribute values), write templates to do so, e.g. <xsl:apply-templates select="doc('file.xml')//ul[@class = 'nav']" mode="fix-links"/>, or, since the edit says the file contains nothing but the ul snippet, simply <xsl:apply-templates select="doc('file.xml')" mode="fix-links"/>, and
<xsl:mode name="fix-links" on-no-match="shallow-copy"/>
<xsl:template mode="fix-links" match="ul/li/a/@href">
<xsl:attribute name="{name()}" select="replace(., '../../sdk', $varname)"/>
</xsl:template>
The xsl:mode declaration is XSLT 3 only; in earlier versions, declare the identity transformation for that mode, e.g.
<xsl:template mode="fix-links" match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()" mode="fix-links"/>
</xsl:copy>
</xsl:template>
in XSLT 1 or
<xsl:template mode="fix-links" match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()" mode="#current"/>
</xsl:copy>
</xsl:template>
in XSLT 2.
XSLT 3 sample (slightly adapted for the demonstration to work with the primary input) outputs
<ul class="nav">
<li class="fineprint">MyNiceGame Developer Mode Documentation</li>
<li class="switchlang"><img src="/deco/dco_de_sml.gif" alt="German" border="0"/></li>
<li>Introduction</li>
<li>Contents</li>
<li>Search</li>
<li>Engine</li>
<li>Command Line</li>
<li>Game Data</li>
<li>Script</li>
</ul>
As for the information in the latest edit: the secondary input document you want to process has its elements in no namespace, while the elements of your primary input are in a namespace that your XSLT declares as the xpath-default-namespace. In that case you need to override that setting for any selections in the secondary input, e.g.
<xsl:mode name="fix-links" on-no-match="shallow-copy"/>
<xsl:template mode="fix-links" match="ul/li/a/@href" xpath-default-namespace="">
<xsl:attribute name="{name()}" select="replace(., '../../sdk', $varname)"/>
</xsl:template>
and, if you continue to use the apply-templates with an element selector, there as well, e.g. <xsl:apply-templates select="doc('file.xml')//ul[@class = 'nav']" xpath-default-namespace="" mode="fix-links"/>.
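One caveat about the replace() calls in this answer: the second argument of the XPath replace() function is a regular expression, so each unescaped dot in '../../sdk' matches any character, not just a literal dot. The sketch below uses Python's re module purely as a stand-in to show the effect; the function names and sample hrefs are invented for the demonstration. In XPath the corresponding fix would be to escape the dots ('\.\./\.\./sdk') or, as far as I know, to pass the q flag as a fourth argument in XPath 3.0 and later.

```python
import re

PLAIN = "../../sdk"  # the pattern as written in the stylesheet


def replace_unescaped(href: str, new_prefix: str) -> str:
    # Dots are regex metacharacters here, just as in the XPath call.
    return re.sub(PLAIN, lambda match: new_prefix, href)


def replace_escaped(href: str, new_prefix: str) -> str:
    # re.escape turns every dot into a literal dot before matching.
    return re.sub(re.escape(PLAIN), lambda match: new_prefix, href)


intended = "../../sdk/index.html"
lookalike = "ab/cd/sdk/index.html"  # no dots, but the same shape

print(replace_unescaped(intended, "foobar"))
print(replace_unescaped(lookalike, "foobar"))  # rewritten as well
print(replace_escaped(lookalike, "foobar"))    # left alone
```

For the navigation snippet this probably never bites, because every href starts with the literal ../../sdk prefix, but it is worth knowing if the pattern is reused on less uniform links.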
I have the following xslt to output comments but it is vulnerable to XSS if I disable all output escaping.
<xsl:value-of disable-output-escaping="yes" select="//datafor:field[datafor:name='comments']/datafor:value" />
How can I allow only the following HTML elements and attributes, so that I can retain some formatting in the rendered output?
'b,strong,i,ul,ol,li,p,br,p[style],div,div[class]'
Or, using SaxonJS, in the browser, you could call into JavaScript to parse the HTML fragment and process it e.g.
function parseHTML(html) {
return new DOMParser().parseFromString(html, 'text/html');
}
const xml = `<orders>
<order>
<id>o1</id>
<date>2022-06-12</date>
<comments><![CDATA[I want the following extras: <ol>
<li>32 GB RAM</li>
<li>1000 GB SSD</li>
]]></comments>
</order>
</orders>`;
const xslt = `<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns="http://www.w3.org/1999/xhtml"
version="3.0"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="#all"
xmlns:js="http://saxonica.com/ns/globalJS"
expand-text="yes">
<xsl:mode name="html"/>
<xsl:template mode="html" match="b | strong | i | ul | ol | li | p | br | p | div | p/@style | div/@class" xpath-default-namespace="http://www.w3.org/1999/xhtml">
<xsl:copy>
<xsl:apply-templates select="@* | node()" mode="#current"/>
</xsl:copy>
</xsl:template>
<xsl:template match="/">
<html>
<head>
<title>Test</title>
</head>
<body>
<xsl:apply-templates/>
</body>
</html>
</xsl:template>
<xsl:template match="orders">
<h1>Orders</h1>
<xsl:where-populated>
<ol>
<xsl:apply-templates/>
</ol>
</xsl:where-populated>
</xsl:template>
<xsl:template match="order">
<li>Order {id} from {format-date(date, '[D] [M] [Y0000]')}
<div>
<h2>Comments</h2>
<div>
<xsl:apply-templates select="js:parseHTML(string(comments))" mode="html"/>
</div>
</div></li>
</xsl:template>
</xsl:stylesheet>`;
const result = SaxonJS.XPath.evaluate(`transform(map {
'stylesheet-text' : $xslt,
'source-node' : parse-xml($xml)
}
)?output/*/*/node()`,
[],
{ params : {
xslt: xslt,
xml: xml
}
}
);
document.body.append(...result);
<script src="https://martin-honnen.github.io/xslt3fiddle/js/SaxonJS2.js"></script>
With the XSLT implementation of an HTML tag soup parser by David Carlisle it would be
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns="http://www.w3.org/1999/xhtml"
version="3.0"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="#all"
xmlns:d="data:,dpc"
xmlns:js="http://saxonica.com/ns/globalJS"
expand-text="yes">
<xsl:import href="https://github.com/davidcarlisle/web-xslt/raw/main/htmlparse/htmlparse.xsl"/>
<xsl:mode name="html"/>
<xsl:template mode="html" match="b | strong | i | ul | ol | li | p | br | p | div | p/@style | div/@class" xpath-default-namespace="http://www.w3.org/1999/xhtml">
<xsl:copy>
<xsl:apply-templates select="@* | node()" mode="#current"/>
</xsl:copy>
</xsl:template>
<xsl:template match="/">
<html>
<head>
<title>Test</title>
</head>
<body>
<xsl:apply-templates/>
</body>
</html>
</xsl:template>
<xsl:template match="orders">
<h1>Orders</h1>
<xsl:where-populated>
<ol>
<xsl:apply-templates/>
</ol>
</xsl:where-populated>
</xsl:template>
<xsl:template match="order">
<li>Order {id} from {format-date(date, '[D] [M] [Y0000]')}
<div>
<h2>Comments</h2>
<div>
<xsl:apply-templates select="d:htmlparse(comments)" mode="html"/>
</div>
</div></li>
</xsl:template>
</xsl:stylesheet>
The current samples copy only the listed elements but pass every text node through. If elements such as script need to be stripped completely, including their content, add an empty template <xsl:template mode="html" match="script" xpath-default-namespace="http://www.w3.org/1999/xhtml"/>.
Of course, consider downloading the HTML parser module and using xsl:import on a local file for your own application instead of pulling it from GitHub.
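The allowlist idea itself is independent of XSLT. As a rough illustration, here is a minimal sketch in Python using the standard library's html.parser, which is a different technique from the SaxonJS and htmlparse.xsl samples above. It mirrors their behaviour: listed elements and attributes are copied, other tags are dropped while their text is kept, and script is removed together with its content. The names ALLOWED, DROP_WITH_CONTENT, AllowlistFilter and sanitize are made up for this sketch, and it is not a vetted sanitizer (for instance, style values are not inspected).

```python
from html import escape
from html.parser import HTMLParser

# Element name -> attribute names that may be kept on it.
ALLOWED = {
    "b": set(), "strong": set(), "i": set(), "ul": set(), "ol": set(),
    "li": set(), "br": set(), "p": {"style"}, "div": {"class"},
}
# Elements whose text content is dropped as well.
DROP_WITH_CONTENT = {"script"}


class AllowlistFilter(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a dropped element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED:
            return  # tag is dropped, its text still passes through
        kept = "".join(
            f' {name}="{escape(value or "", quote=True)}"'
            for name, value in attrs
            if name in ALLOWED[tag]
        )
        self.out.append(f"<{tag}{kept}>")

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag not in ALLOWED or tag == "br":
            return  # br is a void element and gets no end tag
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def sanitize(fragment: str) -> str:
    parser = AllowlistFilter()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.out)


print(sanitize('<p style="color:red" onclick="x()">Hi <b>there</b>'
               '<script>alert(1)</script></p>'))
```

Note that text is re-escaped on output, so nothing the parser treated as text can turn back into markup.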
I want to add multiple matches in except operator.
Example:
<root>
<div>
<span style="font-family:'Work Sans', sans-serif;">
</span>
<h3>hi</h3>
<ol><li>hello</li></ol>
<ul><li>hello</li></ul>
<p>name</p>
</div>
</root>
Expected result: I want to convert the div tag to a p tag and then move only the h3, ol, ul and p tags out of it, if present.
<root>
<p>
<span style="font-family:'Work Sans', sans-serif;">
</span>
</p>
<h3>hi</h3>
<ol><li>hello</li></ol>
<ul><li>hello</li></ul>
<p>name</p>
</root>
I tried this:
<xsl:template match="div">
<p>
<xsl:copy-of select="* except h3"/>
</p>
<xsl:copy-of select="h3"/>
</xsl:template>
The above XSLT results in:
<root>
<p>
<span style="font-family:'Work Sans', sans-serif;">
</span>
<ol><li>hello</li></ol>
<ul><li>hello</li></ul>
<p>name</p>
</p>
<h3>hi</h3>
</root>
Is there any way to list multiple element names in the except operator, like <xsl:copy-of select="* except h3 and ol"/>?
Another approach I tried:
<xsl:template match="div[p | ol | ul | h3 | h2]">
<xsl:apply-templates/>
</xsl:template>
<!-- convert <div> to <p> if its direct children do not include any of p, ol, ul, h3, h2 -->
<xsl:template match="div[not(p | ol | ul | h3 | h2)]">
<p>
<xsl:apply-templates/>
</p>
</xsl:template>
This produces the following, which is wrong because I still want the span tag to be inside a p tag:
<root><span style="font-family:'Work Sans', sans-serif;">
</span>
<h3>hi</h3>
<ol><li>hello</li></ol>
<ul><li>hello</li></ul>
<p>name</p></root>
Instead of using xsl:copy-of, you could just use the xsl:apply-templates approach like this:
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:strip-space elements="*"/>
<xsl:output indent="yes"/>
<xsl:template match="node()|@*" name="copy">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="div">
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="div/span">
<p>
<xsl:call-template name="copy"/>
</p>
</xsl:template>
</xsl:stylesheet>
The except operator was introduced in XPath 2.0, so it cannot be used with XSLT 1.0, which uses XPath 1.0.
As for its use where supported, you can of course do e.g. * except (h3, ol).
In XPath 1.0 you can try e.g. *[not(self::h3|self::ol)].
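The partition that * except (h3, ol, ul, p) expresses can also be spelled out procedurally. The following is a minimal sketch with Python's xml.etree.ElementTree as a stand-in for the XSLT template, not a replacement for it; MOVE_OUT and split_div are names invented here, and the input is a whitespace-free version of the example from the question.

```python
import xml.etree.ElementTree as ET

MOVE_OUT = {"h3", "ol", "ul", "p"}  # elements that leave the div


def split_div(div):
    """Return the elements that replace div: a <p> holding the children
    that stay, followed by the children that are moved out."""
    p = ET.Element("p")
    p.text = div.text
    kept = [child for child in div if child.tag not in MOVE_OUT]   # * except (h3, ol, ul, p)
    moved = [child for child in div if child.tag in MOVE_OUT]      # h3 | ol | ul | p
    p.extend(kept)
    return [p] + moved


root = ET.fromstring(
    "<root><div><span/><h3>hi</h3><ol><li>hello</li></ol>"
    "<ul><li>hello</li></ul><p>name</p></div></root>"
)
div = root.find("div")
position = list(root).index(div)
root.remove(div)
for offset, element in enumerate(split_div(div)):
    root.insert(position + offset, element)

print(ET.tostring(root, encoding="unicode"))
```

Both list comprehensions keep the children in document order, which matches what the XPath expressions return.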
I have researched this problem, but the suggestions that I have found seem rather convoluted and aimed at a more general scenario. Perhaps there is a more concise solution for this more specific scenario.
I have a large number of html files like the following:
<html>
<head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type" />
<title>t</title>
</head>
<body>
<div class="a">
<div class="f">f1</div>
<div class="e">e1</div>
<div class="e">e2</div>
<div class="g">g</div>
<div class="c">c1</div>
<div class="b">
<div class="ba">ba</div>
<div class="bb">bb</div>
</div>
<div class="c">c2</div>
<div class="f">f2</div>
<div class="d">d</div>
<div class="c">c3</div>
</div>
...
</body>
</html>
Rule # 1
I want to order the divs inside div class="a" in a specific order of their class attribute that is neither alphabetic nor numeric. For the purpose of this example, let the final order be the following:
g
f
b
c
e
d
In my real examples, the list is much longer.
Rule # 2
If for a given class attribute there is more than one node, then they should be left in the same order as in the original file, for instance:
c1
c2
c3
Please notice that in my real examples these values would not be in alphanumerical order.
Rule # 3
The order of child nodes must not be affected, for instance:
ba
bb
Please notice that in my real examples these values would not be in alphanumerical order either.
The final output should be like the following:
<html>
<head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type" />
<title>t</title>
</head>
<body>
<div class="a">
<div class="g">g</div>
<div class="f">f1</div>
<div class="f">f2</div>
<div class="b">
<div class="ba">ba</div>
<div class="bb">bb</div>
</div>
<div class="c">c1</div>
<div class="c">c2</div>
<div class="c">c3</div>
<div class="e">e1</div>
<div class="e">e2</div>
<div class="d">d</div>
</div>
...
</body>
</html>
I have thought at first to:
Prepend a number to the class attribute value, for instance rename class="g" to class="01g", etc
Order the classes in alphanumerical order
Remove the number, for instance rename class="01g" to class = "g", etc
However I dislike this solution because it requires too many transformations.
What I would really like is to come up with a more elegant solution. Perhaps I could define an ordered list of class values, and a clever index would somehow put the nodes in that defined order?
Do you have any suggestions to add to my xslt template?
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
AFAICT, you want to do something like:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" omit-xml-declaration="yes" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<!-- identity transform -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="div[@class='a']">
<xsl:variable name="sort-order">gfbced</xsl:variable>
<xsl:copy>
<xsl:apply-templates select="@*|node()">
<xsl:sort select="string-length(substring-before($sort-order, @class))" data-type="number" order="ascending"/>
</xsl:apply-templates>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
To accommodate class values that are not single characters, you can use:
<xsl:template match="div[@class='a']">
<xsl:variable name="sort-order">|g|f|b|c|e|d|</xsl:variable>
<xsl:copy>
<xsl:apply-templates select="@*|node()">
<xsl:sort select="string-length(substring-before($sort-order, concat('|', @class, '|')))" data-type="number" order="ascending"/>
</xsl:apply-templates>
</xsl:copy>
</xsl:template>
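To see why this sort key works, here is a small Python sketch of the same computation, under the assumption that the divs are reduced to (class, label) pairs. substring_before imitates the XPath function of the same name, and sort_key and divs are illustrative names only. The key is the number of characters that precede |class| in the order string, and because the sort is stable, divs with the same class keep their document order.

```python
SORT_ORDER = "|g|f|b|c|e|d|"


def substring_before(haystack: str, needle: str) -> str:
    """Imitate XPath substring-before(): empty string if needle is absent."""
    head, separator, _ = haystack.partition(needle)
    return head if separator else ""


def sort_key(class_name: str) -> int:
    # Number of characters in front of "|class|" in the order string.
    return len(substring_before(SORT_ORDER, f"|{class_name}|"))


# (class, label) pairs in the document order of the question's input.
divs = [
    ("f", "f1"), ("e", "e1"), ("e", "e2"), ("g", "g"), ("c", "c1"),
    ("b", "b"), ("c", "c2"), ("f", "f2"), ("d", "d"), ("c", "c3"),
]

# sorted() is stable, so equal keys keep their original relative order.
ordered = sorted(divs, key=lambda pair: sort_key(pair[0]))
print([label for _, label in ordered])
```

One consequence worth knowing: a class that is missing from the order string gets the key 0, the same as the first listed class, so such divs end up at the front rather than at the end.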
I have to remove a div (menu) with a ul tag in it. All the data is stored in a variable $data. I have to remove that div from that variable through XSLT.
Before:
<div id="container">
<div id="menu">
<ul>
</ul>
</div>
</div>
After
<div id="container">
</div>
If you know that the div id="menu" is the only content of the container div, you could simply make a shallow copy of the container div. In general, with XSLT 1.0, a variable holding constructed content is a result tree fragment; to process it further with XSLT/XPath (other than outputting it with value-of or copy-of) you need to apply exsl:node-set to the variable. You can then process the elements with the identity transformation plus an empty template for div[@id = 'menu'], which deletes that element by not processing it (online at http://xsltransform.net/bFN1y9C):
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
xmlns:exsl="http://exslt.org/common" exclude-result-prefixes="exsl">
<xsl:output method="html" indent="yes"/>
<xsl:variable name="data">
<div id="container">
<div id="menu">
<ul>
</ul>
</div>
</div>
</xsl:variable>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:variable name="data2">
<xsl:apply-templates select="exsl:node-set($data)/node()"/>
</xsl:variable>
<xsl:template match="div[@id = 'menu']"/>
<xsl:template match="/">
<xsl:copy-of select="$data2"/>
</xsl:template>
</xsl:transform>
If you need to perform other transformation steps you might need to separate the different steps by using modes.
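For comparison, the same "copy everything except the menu div" step can be sketched outside XSLT. This is a minimal Python example with xml.etree.ElementTree, assuming the fragment is available as a well-formed string; remove_menu is a name invented for the sketch. It edits the tree in place, whereas the stylesheet above builds a filtered copy.

```python
import xml.etree.ElementTree as ET


def remove_menu(root):
    """Remove every <div id="menu"> below root, in place."""
    # Materialise the traversal first so that removing children
    # does not interfere with the iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == "div" and child.get("id") == "menu":
                parent.remove(child)
    return root


container = ET.fromstring(
    '<div id="container"><div id="menu"><ul></ul></div></div>'
)
remove_menu(container)
print(ET.tostring(container, encoding="unicode"))
```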