In my source XML, the less-than sign is represented as &lt;, but in the output (HTML, as alt text) it is written as a literal < sign, which causes problems in post-processing.
I'm using saxon655 with this command line:
java -cp saxon655/saxon.jar com.icl.saxon.StyleSheet test.xml test.xsl
This really doesn't make sense to me. Here are the details:
The DocBook XML:
<chapter xmlns="http://docbook.org/ns/docbook">
<info><title>The Chapter</title></info>
<para>
<informalequation>
<mediaobject>
<imageobject>
<imagedata fileref="images/g0589.png" />
</imageobject>
<textobject role="tex"><phrase>|z_ s-z_ t|&lt;r</phrase></textobject>
</mediaobject>
</informalequation>
</para>
</chapter>
The XSLT. If you copy this, change the path to the DocBook stylesheets.
<xsl:stylesheet version="1.0"
xmlns:d="http://docbook.org/ns/docbook"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:import href="/path/to/docbook/xsl-1.78.1/html/docbook.xsl" />
<xsl:template match="d:mediaobject/d:imageobject/d:imagedata">
<xsl:element name="img">
<xsl:attribute name="alt">
<xsl:value-of select="../../d:textobject[@role='tex']/d:phrase" />
</xsl:attribute>
<xsl:attribute name="src">
<xsl:value-of select="@fileref" />
</xsl:attribute>
</xsl:element>
<xsl:apply-templates />
</xsl:template>
</xsl:stylesheet>
And the resulting HTML portion:
<div class="informalequation">
<div class="mediaobject">
<img alt="|z_ s-z_ t|<r" src="images/g0589.png"></div>
</div>
Am I doing something wrong?
According to the W3C HTML validator, the output is fine for text/html. I created a minimal HTML 4.01 document with your markup at http://home.arcor.de/martin.honnen/html/test2015040301.html. It has the content
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>img alt attribute test</title>
</head>
<body>
<div class="informalequation">
<div class="mediaobject">
<img alt="|z_ s-z_ t|<r" src="images/g0589.png"></div>
</div>
</body>
</html>
and the validator says (http://validator.w3.org/check?uri=http%3A%2F%2Fhome.arcor.de%2Fmartin.honnen%2Fhtml%2Ftest2015040301.html&charset=%28detect+automatically%29&doctype=Inline&group=0) "This document was successfully checked as HTML 4.01 Strict!". So I think Saxon is creating correct HTML. I don't know how you post-process the result of the XSLT transformation, but an HTML or SGML parser should handle it fine.
With XML output (method="xml"), Saxon does escape the less-than sign in the attribute value.
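The same contrast between XML and HTML serialization can be reproduced outside Saxon. The following is only a sketch using Python's standard library, not the asker's toolchain: it serializes the img element both ways, then feeds the HTML form to an HTML parser to check that the literal < inside the attribute value is read back intact. The class name AltCollector is made up for this illustration.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# The img element from the question, with a literal "<" in the alt text.
img = ET.Element("img", {"alt": "|z_ s-z_ t|<r", "src": "images/g0589.png"})

# XML serialization must escape "<" inside attribute values.
xml_out = ET.tostring(img, encoding="unicode", method="xml")
# HTML serialization follows HTML rules instead (CPython's ElementTree
# leaves "<" in attribute values alone, as far as I can tell).
html_out = ET.tostring(img, encoding="unicode", method="html")
print(xml_out)
print(html_out)


class AltCollector(HTMLParser):
    """Hypothetical helper: records the alt attribute of every img tag."""

    def __init__(self):
        super().__init__()
        self.alts = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.alts.append(dict(attrs).get("alt"))


# An HTML parser recovers the original attribute value either way.
collector = AltCollector()
collector.feed(html_out)
collector.close()
print(collector.alts)
```

If the post-processing step is an XML parser rather than an HTML parser, switching the stylesheet to method="xml" is probably the simpler fix.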
I'm trying to transform some xml with content encoded as html entities. I want to output the entity content as valid html.
The xml is like this ..
<?xml version="1.0" encoding="UTF-8"?><memo Version="1.0">
<header>
<meta title="==PROGRAMMING=="/>
<meta favourite="false"/>
<meta uuid="85f94ab2-77a8-XXXXXXXXXXXXXXX"/>
<meta createdTime="1551038092051"/>
</header>
<contents>
<content>&lt;p value="memo2" &gt;=====&lt;/p&gt;&lt;p&gt;https://medium.freecodecamp.org/&lt;/p&gt;&lt;p&gt;=====&lt;/p&gt;
</content>
</contents>
</memo>
I have some xslt as so..
xslt_src = '''
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html>
<head>
<xsl:apply-templates select="memo/header/meta"/>
</head>
<body>
<xsl:apply-templates select="memo/contents"/>
</body>
</html>
</xsl:template>
<xsl:template match="memo/header/meta">
<xsl:apply-templates select="@title"/>
</xsl:template>
<xsl:template match="memo/contents">
<div class='content'>
<xsl:value-of select="content/text()"/>
</div>
</xsl:template>
<xsl:template match="@title">
<span id='title'>
<xsl:value-of select="."/>
</span>
</xsl:template>
</xsl:stylesheet>
'''
I process it with lxml in Python...
_________________________________________________________________python
from lxml import etree
xslt = etree.XML(xslt_src)
transform = etree.XSLT(xslt)
src = open('simple.xml').read()
xml = etree.XML(str.encode(src))
result = transform(xml)
root = result.getroot()
print('-----------------------out 1')
print(etree.tostring(root, pretty_print=True).decode('utf-8'))
print('-----------------------out 2')
content = root.xpath('/html/body/div/text()')
print(content[0])
==============================================================
etree.tostring(root) prints the structured document but leaves the html entities as encoded in the original xml.
-----------------------out 1
<html>
<head>
<span id="title">==PROGRAMMING==</span>
</head>
<body>
<div class="content">&lt;p value="memo2" &gt;=====&lt;/p&gt;&lt;p&gt;https://medium.freecodecamp.org/&lt;/p&gt;&lt;p&gt;=====&lt;/p&gt;
</div>
</body>
</html>
but if I print root.xpath('/html/body/div/text()')[0] (the node with the html content) I get what I want...
-----------------------out 2
<p value="memo2" >=====</p><p>https://medium.freecodecamp.org/</p><p>=====</p>
=======================================================================
My question is: how can I make etree.tostring(root) replace the html entities with valid html, as is printed when I use the text attribute directly?
Cheers!
bitrat
Instead of:
<xsl:value-of select="content/text()"/>
try:
<xsl:value-of select="content" disable-output-escaping="yes"/>
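disable-output-escaping is an optional feature that only takes effect when the processor itself serializes the result. As a different technique (not the one suggested above), the escaped text can be parsed as markup and attached to the tree as real elements. This is a minimal sketch using the standard library's ElementTree rather than lxml, and it assumes the escaped content is a well-formed XML fragment, as it is in the sample memo.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the memo from the question; the content is escaped markup.
memo_src = """<memo Version="1.0">
  <contents>
    <content>&lt;p value="memo2" &gt;=====&lt;/p&gt;&lt;p&gt;https://medium.freecodecamp.org/&lt;/p&gt;&lt;p&gt;=====&lt;/p&gt;</content>
  </contents>
</memo>"""

memo = ET.fromstring(memo_src)

# The parser has already decoded the entities, so this is a plain string
# that starts with '<p value="memo2" >'.
escaped_markup = memo.findtext("contents/content")

# Wrap the fragment in the target div and parse it as real markup.
div = ET.fromstring('<div class="content">' + escaped_markup.strip() + "</div>")

serialized = ET.tostring(div, encoding="unicode")
print(serialized)
```

If the stored fragment is not well-formed XML (unclosed tags, HTML-only entities), this parse will fail and an HTML parser would be needed instead.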
I am transforming XML files to HTML files with XSLT. Is it possible to embed the original XML file in the HTML output? If so, how?
Update 1: To clarify what I need: in my HTML file, I want a form from which the original XML file can be downloaded. To do that, I have to embed the original XML file in the HTML file (e.g. in a hidden input field).
Thanks
If you want to copy the nodes through, you can simply use <xsl:copy-of select="/"/> where you want to insert them. However, putting arbitrary XML nodes into HTML usually does not make sense. If you want to serialize an XML document to plain text in order to display it, you can use a solution such as http://lenzconsulting.com/xml-to-string/. For instance, this stylesheet:
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:import href="http://lenzconsulting.com/xml-to-string/xml-to-string.xsl"/>
<xsl:output method="html" doctype-public="XSLT-compat" omit-xml-declaration="yes" encoding="UTF-8" indent="yes" />
<xsl:template match="/">
<html>
<head>
<title>Test</title>
</head>
<body>
<section>
<h1>Test</h1>
<xsl:apply-templates/>
<section>
<h2>Source</h2>
<pre>
<xsl:apply-templates mode="xml-to-string"/>
</pre>
</section>
</section>
</body>
</html>
</xsl:template>
<xsl:template match="data">
<ul>
<xsl:apply-templates/>
</ul>
</xsl:template>
<xsl:template match="item">
<li>
<xsl:apply-templates/>
</li>
</xsl:template>
</xsl:transform>
transforms an XML input like
<data>
<item att="value">
<!-- comment -->
<foo>bar</foo>
</item>
</data>
into the HTML
<!DOCTYPE html
PUBLIC "XSLT-compat">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Test</title>
</head>
<body>
<section>
<h1>Test</h1>
<ul>
<li>
bar
</li>
</ul>
<section>
<h2>Source</h2><pre>&lt;data&gt;
&lt;item att="value"&gt;
&lt;!-- comment --&gt;
&lt;foo&gt;bar&lt;/foo&gt;
&lt;/item&gt;
&lt;/data&gt;</pre></section>
</section>
</body>
</html>
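The key step in the approach above is that the XML source ends up in the HTML as escaped text, not as nodes. Outside XSLT, the same idea can be sketched with Python's standard library. This is only an illustration: the field name "source" is a made-up placeholder, and the sample document is the one from this answer.

```python
import html

# The sample input document, held as a string.
xml_source = """<data>
  <item att="value">
    <!-- comment -->
    <foo>bar</foo>
  </item>
</data>"""

# Escape markup characters (and quotes) so a browser shows or stores the
# source as text instead of interpreting it as elements.
escaped = html.escape(xml_source, quote=True)

# Visible rendering, as in the <pre> block of the answer.
pre_block = "<pre>" + escaped + "</pre>"

# Hidden form field carrying the same source; "source" is a hypothetical name.
hidden_input = '<input type="hidden" name="source" value="' + escaped + '">'

print(pre_block)
print(hidden_input)
```

Unescaping the field value on the receiving side (html.unescape in Python) gives back the original document text.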
I have created an .xsl file to transform an Atom feed for friendly viewing in a browser. It works great in Chrome, but in IE (9.0.811) the stylesheet is ignored.
My question is: what can I change so that Internet Explorer processes the stylesheet the way Chrome does?
The sample atom file:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="/css/atom.xsl"?>
<feed xmlns="http://www.w3.org/2005/Atom">
<title><![CDATA[A title here]]></title>
<subtitle><![CDATA[A CDATA description here.]]></subtitle>
<link href="http://www.site.com/channel" title="a title" rel="alternate" type="text/html" hreflang="en" />
<link href="http://www.site.com/channel/atom" rel="self" type="application/rss+xml" />
<id>http://www.site.com/channel</id>
<rights>All content © 2006-2013 Site Name unless otherwise noted. All rights reserved.</rights>
<updated>2011-12-18T20:34:31Z</updated>
<category term="Domain: Cat/Subcat" label="Heirarchy Label" />
<author>
<name>Editor's Desk</name>
<email>editors@site.com</email>
<uri>http://www.site.com/members/username</uri>
</author>
<logo>http://www.site.com/img/logo.gif</logo>
<entry>
<title><![CDATA[Some CDATA here]]></title>
<link href="http://www.site.com/channel/article-name" title="Article Name" rel="alternate" type="text/html" hreflang="en" />
<id>http://www.site.com/channel/article-name</id>
<content type="html"><![CDATA[<h3>New Post - Title</h3><p>A new post has been published titled "Article Title".</p><p>Click the link above to go to the full biography and to view other user's ratings and comments.</p><p>Article posted in: <i>section -> subsection -> subcat</i>.</p>]]></content>
<author>
<name>Editor's Desk</name>
<email>editors@site.com</email>
<uri>http://www.site.com/members/username</uri>
</author>
<category term="Domain: Cat/Subcat" label="text label here" />
<rights>All content © 2006-2013 Site Name unless otherwise noted. All rights reserved.</rights>
<published>2011-12-18T20:34:31Z</published>
<updated>2011-12-18T20:34:31Z</updated>
</entry>
And the relevant excerpt from the .xsl file:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0" xmlns:html="http://www.w3.org/TR/REC-html40" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:atom="http://www.w3.org/2005/Atom">
<xsl:output method="html" version="1.0" encoding="UTF-8" indent="yes" />
<xsl:template match="/">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>
<xsl:value-of select="atom:feed/atom:title"/>
</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<link rel="stylesheet" href="/css/style.css" />
</head>
<body>
<div id="container">
<xsl:variable name="feed-title">
<xsl:value-of select="atom:feed/atom:title" />
</xsl:variable>
<xsl:variable name="feed-uri">
<xsl:value-of select="atom:feed/atom:link[2]/@href" />
</xsl:variable>
<xsl:variable name="web-uri">
<xsl:value-of select="atom:feed/atom:link[1]/@href" />
</xsl:variable>
<h1>Atom Feed - <xsl:value-of select="atom:feed/atom:title" /></h1>
<p><xsl:value-of select="atom:feed/atom:subtitle" /></p>
<p><img src="/img/i/atom-feed.png" alt="{$feed-title}" /> - <xsl:value-of select="atom:feed/atom:title" /><br /><xsl:value-of select="feed/link[1]/@href" /></p>
<p>To subscribe to this RSS feed:</p>
<ul>
<li>Drag the orange RSS button or text link above into your news reader</li>
<li>Cut and paste this page's URL into your news reader</li>
</ul>
<p>Alternatively, you can <a title="{$feed-title}" href="{$web-uri}">view the actual online version</a> of this content.</p>
</div>
<div id="footer">
<xsl:value-of select="atom:feed/atom:rights" />
</div>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
This may be because your stylesheet declares version 2.0. XSLT 2.0 is not supported by Microsoft's processor, and so not by IE. I can't see anything in your XSLT sample that requires XSLT 2.0, so you could try setting the version of your stylesheet to 1.0 instead of 2.0.
<xsl:stylesheet version="1.0"
xmlns:html="http://www.w3.org/TR/REC-html40"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:atom="http://www.w3.org/2005/Atom">
Also, you might want to try changing the "version" attribute of your xsl:output element to 4.0, instead of 1.0.
<xsl:output method="html" version="4.0" encoding="UTF-8" indent="yes" />
EDIT: Actually, this may be because of default behaviour in IE 7.0 and above. Go into Tools -> Internet Options -> Content -> Settings (for Feeds and Web Slices), and you should see the option 'Turn on feed reading view'. Try unticking this and restarting IE to see if that works.
XML:
<?xml version="1.0" encoding="UTF-8" ?>
<mailAndMessageSettings>
<settings>
<add key="Url" value=""/>
<add key="UserName" value=""/>
<add key="Password" value=""/>
</settings>
<mail>
<subject>
Mp3 Submission
</subject>
<body>
<![CDATA[
<meta http-equiv="Content-Type" content="text/html; charset="utf-8""/>
<head></head>
<body>
<p>Hi,</p>
<p>Please find the attached mp3... :-)</p>
<p>here</p>
<p>Regards,</br>
Pete</p>
</body>
</html>
]]>
</body>
</mail>
</mailAndMessageSettings>
XSLT:
<xsl:template match="/">
<xsl:value-of select="/mailAndMessageSettings/mail" disable-output-escaping="yes"/>
</xsl:template>
Expected output:
<mail>
<subject>
Mp3 Submission
</subject>
<body>
<![CDATA[
<meta http-equiv="Content-Type" content="text/html; charset="utf-8""/>
<head></head>
<body>
<p>Hi,</p>
<p>Please find the attached mp3... :-)</p>
<p>here</p>
<p>Regards,</br>
Pete</p>
</body>
</html>
]]>
</body>
</mail>
I want to add an "onclick" attribute to the "here" link inside the CDATA section and still output the whole "mail" node. Is that possible? Can anyone help me with this? Thanks in advance.
Your help would be greatly appreciated :)
There are no nodes or tags inside a CDATA section. CDATA means "character data". The only reason for putting stuff inside CDATA is to say "The stuff in here might look like markup, but I don't want it treated as markup; just treat it as text". So if you want to treat it as markup, don't put it in CDATA.
You'll have to resort to string manipulation, like so:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="body[contains(.,'&gt;here&lt;/a&gt;')]">
<xsl:value-of disable-output-escaping="yes" select="concat(
'&lt;![CDATA[',
substring-before(.,'&gt;here&lt;/a&gt;'),
' onclick=&quot;myfunction();&quot;&gt;here&lt;/a&gt;',
substring-after(.,'&gt;here&lt;/a&gt;'),
']]&gt;'
)"/>
</xsl:template>
</xsl:stylesheet>
But is there a reason why the mail body has to be CDATA in the first place?
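The string-manipulation idea can also be sketched outside XSLT. The snippet below is an illustration in Python's standard library, not part of the answer. Because the mail body in the question does not show the link markup, the sample body here uses a hypothetical <a href="#">here</a> link; the marker and the myfunction() handler follow the stylesheet above.

```python
import xml.etree.ElementTree as ET

# Cut-down mail document. The <a href="#"> link is a hypothetical stand-in,
# since the question's CDATA body does not show the actual link markup.
doc_src = """<mail>
  <subject>Mp3 Submission</subject>
  <body><![CDATA[<p>Please find the attached mp3 <a href="#">here</a></p>]]></body>
</mail>"""

mail = ET.fromstring(doc_src)
body = mail.find("body")

# To the parser the CDATA content is plain text, so edit it as a string:
# insert the attribute just before the ">" that closes the <a> start tag.
marker = ">here</a>"
if marker in body.text:
    body.text = body.text.replace(marker, ' onclick="myfunction();"' + marker, 1)

print(body.text)

# ElementTree writes the text back with &lt; escapes, not as a CDATA section.
serialized = ET.tostring(mail, encoding="unicode")
print(serialized)
```

The escaped form and the CDATA form are equivalent to any XML parser, so this only matters if a downstream consumer insists on seeing a literal CDATA section.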
I'm getting this error when I try to validate my XSLT:
javax.xml.transform.TransformerConfigurationException:
javax.xml.transform.TransformerException:
javax.xml.transform.TransformerException:
A node test that matches either NCName:* or QName was expected.
This is my XSLT:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"><xsl:output method="html" />
<xsl:template match="\Apps">
<html>
<head> <title>Apps List</title>
<link rel="StyleSheet" href="table_style.css" type="text/css"/>
<style type="text/css">
body {font-family: Helvetca;}
h1 { color : Grey;}
h2 {color : Blue;}</style>
</head>
<body>
<h1> Apps List: <xsl:value-of select="\@List_Type" /></h1>
<p>This is a list of all currently hot apps:</p>
<xsl:for-each select="\App">
<xsl:if test="\App\@installed == true">
<h2 style="color:Green;"><xsl:value-of select="\App\app_name" />(instaled)</h2>
</xsl:if>
<xsl:otherwise>
<h2><xsl:value-of select="\App\app_name" /></h2>
</xsl:otherwise>
<p style="font-style:bold;">App info:</p>
<table id="#gradient-style">
<tr><th>Category:</th><td><xsl:value-of select="\App\catogry" /></td></tr>
<tr><th>Verdion:</th><td><xsl:value-of select="\App\version" /></td></tr>
<tr><th>Description:</th><td><xsl:value-of select="\App\description" /></td></tr>
<tr><th>App Reviews:</th><td><xsl:for-each select="\App\reviews\review">
<span style="font-style:bold;"><xsl:value-of select="\App\reviews\review\reviewer_name" /></span>
| <xsl:value-of select="\App\reviews\review\review_date" />
| <xsl:value-of select="\App\reviews\review\review_Time" /><br/>
<span style="font-style:bold;">Rating:</span>
<xsl:value-of select="string(\App\reviews\review\rating" /> <br/>
<xsl:value-of select="\App\reviews\review\ontent" /><br/>
----------------------------------------------------------
</xsl:for-each>
</td></tr>
</table>
</xsl:for-each>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
This is the XML that I tried it with:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="ShdenXSLT.xsl"?>
<Apps xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance" List_Type="new releases" >
<App device_type="tablet" app_id="120">
<app_name>Meeting Manager</app_name>
<catogry>LifeStyle </catogry>
<catogry>Bussnisse </catogry>
<version>1.0</version>
<description>This app is about managing the bussnisse meeting</description>
<reviews>
<review>
<reviewer_name>Shaden</reviewer_name>
<review_date>2012-02-13</review_date>
<review_time>11:35:02</review_time>
<content>it was a useful app</content>
<rating>4.5</rating>
</review>
<review>
<reviewer_name>Mohamed</reviewer_name>
<review_date>2012-03-01</review_date>
<review_time>12:15:00</review_time>
<content>i really loved this app</content>
<rating>5.0</rating>
</review>
</reviews>
</App>
<App device_type="tablet" app_id="100">
<app_name>ToDoList</app_name>
<catogry>LifeStyle </catogry>
<version>3.4.2</version>
<description>a simple To Do List applecation</description>
<reviews>
<review>
<reviewer_name>Fahad</reviewer_name>
<review_date>2010-02-05</review_date>
<review_time>09:40:55</review_time>
<content>nice app</content>
<rating>4.0</rating>
</review>
</reviews>
</App>
</Apps>
You are using a backslash (\) as your XPath separator (e.g. <xsl:value-of select="\@List_Type" />), which is incorrect. It should be a forward slash (/).
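To see forward-slash paths at work against this data, here is a small sketch using Python's standard library (ElementTree supports a limited XPath subset). It is not the asker's Java setup, and the document is trimmed to the fields used. The paths inside the loop are relative to the current App element, which is also how paths behave inside xsl:for-each.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the Apps document from the question.
apps_src = """<Apps List_Type="new releases">
  <App device_type="tablet" app_id="120">
    <app_name>Meeting Manager</app_name>
    <reviews>
      <review><reviewer_name>Shaden</reviewer_name><rating>4.5</rating></review>
      <review><reviewer_name>Mohamed</reviewer_name><rating>5.0</rating></review>
    </reviews>
  </App>
  <App device_type="tablet" app_id="100">
    <app_name>ToDoList</app_name>
    <reviews>
      <review><reviewer_name>Fahad</reviewer_name><rating>4.0</rating></review>
    </reviews>
  </App>
</Apps>"""

apps = ET.fromstring(apps_src)

# Attribute on the root element (what @List_Type selects in XPath).
list_type = apps.get("List_Type")

summary = []
for app in apps.findall("App"):
    # Relative paths with forward slashes, evaluated from the current App.
    name = app.findtext("app_name")
    reviewers = [r.findtext("reviewer_name") for r in app.findall("reviews/review")]
    summary.append((name, reviewers))

print(list_type)
print(summary)
```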