How do I find and replace text using XSLT 2.0?

Using XSLT 2.0, I need to replace:
<section class="ktp-explanation-section jasper-exclude">
with:
<section class="ktp-explanation-section atom-exclude">
But in all instances EXCEPT when the span tag below exists:
<span property="atom:tag" class="ktp-meta">behavioral_sciences</span>
I'm not very good with XSLT. What would I need to include in my XSLT script to make that happen?
Here's an example of the HTML with the behavioral_sciences tag:
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" class="ktp-question-set"
data-uuid="90dcafa425ef42dca522211db2db1f1f">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta charset="utf-8" />
<link type="text/css" rel="stylesheet" title="default" href="../../assets/css/main.css" />
<title>mbeh01001</title>
</head>
<body>
<ol class="ktp-question-set" data-uuid="0b866e6990f940e8b22d8083bff94248">
<li id="mbeh01001" property="ktp:question" typeof="ktp:Question"
data-uuid="3e34491dfadd46b58de842471aafd503" class="ktp-question">
<section class="ktp-question-meta" data-uuid="01e5e879ddda4f4889f1378655a4a3bd">
<section property="ktp:metadata" class="ktp-meta"
data-uuid="e7e38b9d85e045b4a7e6492b2f286cdb">
<span property="atom:content-item-name" class="ktp-meta"
data-value="mbeh01001"></span>
<span property="atom:tag" class="ktp-meta">behavioral_sciences</span>
</section>
</section>
<section property="ktp:explanation-section" typeof="ktp:step"
data-title="Step-by-Step" class="ktp-explanation-section jasper-exclude"
data-uuid="3a797b95cdc746bdbe73cd69790c21e7">
<ol class="list-step-stacked" data-uuid="78b16f9f99a84c00b142739badcc6d13">
<li data-uuid="c801e37e13304bbabd807a7aeb06e0ae"><span
class="step-title">Simplify the question</span>
<p data-uuid="1a69def59a1448ba973f4953f6cdadf4">The most important
keyword in the question stem that hints at the correct answer is
<i>anthropologist</i>—one who studies the fate of human
beings and thus their reproduction and survival. The
anthropologist correlates the development of a standard set of
emotions with better communication, thereby suggesting that
standard emotions influenced human survival and
reproduction.  Another way of wording the question would
therefore be: <i>Which term best illustrates a positive
influence on human survival and reproduction?</i></p>
</li>
</ol>
</section>
</li>
</ol>
</body>
</html>
Here's an example without the behavioral_sciences tag:
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" class="ktp-question-set"
data-uuid="90dcafa425ef42dca522211db2db1f1f">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta charset="utf-8" />
<link type="text/css" rel="stylesheet" title="default" href="../../assets/css/main.css" />
<title>mbeh01001</title>
</head>
<body>
<ol class="ktp-question-set" data-uuid="0b866e6990f940e8b22d8083bff94248">
<li id="mbeh01001" property="ktp:question" typeof="ktp:Question"
data-uuid="3e34491dfadd46b58de842471aafd503" class="ktp-question">
<section class="ktp-question-meta" data-uuid="01e5e879ddda4f4889f1378655a4a3bd">
<section property="ktp:metadata" class="ktp-meta"
data-uuid="e7e38b9d85e045b4a7e6492b2f286cdb">
<span property="atom:content-item-name" class="ktp-meta"
data-value="mbeh01001"></span>
<span property="atom:tag" class="ktp-meta">biology</span>
</section>
</section>
<section property="ktp:explanation-section" typeof="ktp:step"
data-title="Step-by-Step" class="ktp-explanation-section jasper-exclude"
data-uuid="3a797b95cdc746bdbe73cd69790c21e7">
<ol class="list-step-stacked" data-uuid="78b16f9f99a84c00b142739badcc6d13">
<li data-uuid="c801e37e13304bbabd807a7aeb06e0ae"><span
class="step-title">Simplify the question</span>
<p data-uuid="1a69def59a1448ba973f4953f6cdadf4">The most important
keyword in the question stem that hints at the correct answer is
<i>anthropologist</i>—one who studies the fate of human
beings and thus their reproduction and survival. The
anthropologist correlates the development of a standard set of
emotions with better communication, thereby suggesting that
standard emotions influenced human survival and
reproduction.  Another way of wording the question would
therefore be: <i>Which term best illustrates a positive
influence on human survival and reproduction?</i></p>
</li>
</ol>
</section>
</li>
</ol>
</body>
</html>

See if this works for you:
XSLT 2.0
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xpath-default-namespace="http://www.w3.org/1999/xhtml">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<!-- identity transform -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="section/@class[.='ktp-explanation-section jasper-exclude' and not(//span[@property='atom:tag' and @class='ktp-meta' and .='behavioral_sciences'])]">
<xsl:attribute name="class">ktp-explanation-section atom-exclude</xsl:attribute>
</xsl:template>
</xsl:stylesheet>
Note that this tests for the presence of <span property="atom:tag" class="ktp-meta">behavioral_sciences</span> in the entire document, unrelated to the matched class attribute.
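If a single file can contain several questions, the document-wide test may be too broad. The following is a minimal sketch of a narrower variant, assuming each question is wrapped in an li element with property="ktp:question" (as in the sample above); it only looks for the behavioral_sciences tag inside the question that contains the matched section.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xpath-default-namespace="http://www.w3.org/1999/xhtml">

  <!-- identity transform: copy everything unchanged by default -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Assumption: the nearest li[@property='ktp:question'] ancestor delimits one question.
       The class is rewritten only when that question has no behavioral_sciences tag. -->
  <xsl:template match="section/@class[. = 'ktp-explanation-section jasper-exclude'
      and not(ancestor::li[@property = 'ktp:question'][1]
              //span[@property = 'atom:tag'][. = 'behavioral_sciences'])]">
    <xsl:attribute name="class">ktp-explanation-section atom-exclude</xsl:attribute>
  </xsl:template>

</xsl:stylesheet>
```

If your files always hold exactly one question, the document-wide test and this scoped test should give the same result.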

Related

How do I add unit and chapter tags from my source files to my output file using XSLT?

I'm working on an XSLT script to create an ePub TOC from a source file. I am able to get the content I need for the HTML files themselves, but I also need to include the <unit> and <chapter> tags. I'm stuck on how to get those into my output file.
This is my input file:
<?xml version="1.0" encoding="UTF-8"?>
<toc xmlns="http://www.standardnine.com/s9ml" data-uuid="c667450f8f7d45888630f13533a20e14">
<metadata thumbnailpath="../img/toc_thumbs/.crops/9781506250953_7c4cbc8d37244424a9f2a00ab56b5db7.jpg">
<remarks path="remarks.s9ml"/>
<edition/>
<title>SHSAT Course 2021</title>
<author/>
<publisher/>
<shortname>sn_a9645</shortname>
<pubdate>Publish Date</pubdate>
<version>Version</version>
<revision>Revision</revision>
<s9version>s9version</s9version>
<productid>Product ID</productid>
<bundleconfig path="config.s9ml"/>
<reflowable>true</reflowable>
<subtitle/>
</metadata>
<spine>
<unit designation="" enumeration="" data-uuid="f2aab3845c9d4e1988de8c6a22429af8">
<title/>
<chapter thumbnailpath="../img/toc_thumbs/.crops/9781506250953_56d8078f9660471799cb1741a3a45ade.jpg" designation="" enumeration="" data-uuid="c848cd9de0b646e1b3590904031e41f8" sandbox="true">
<title>Front Matter</title>
<exhibit path="frontmatter/fm_cover.html"/>
<exhibit path="frontmatter/fm_titlepage.html"/>
<exhibit path="frontmatter/copyright.html"/>
</chapter>
</unit>
<unit designation="Section" enumeration="1" data-uuid="d1ff46bfa6fe47dda6f27a9d871eb727">
<title>Getting Started</title>
<chapter thumbnailpath="../img/toc_thumbs/ch01_thumb.png" designation="Chapter" enumeration="1" data-uuid="6f33bec0993d4133b75e4bd9b3902ef8" sandbox="false">
<title>SHSAT Basics</title>
<exhibit path="chapter01/ch01_part.html"/>
<exhibit path="chapter01/ch01_reader_0.html"/>
</chapter>
<chapter thumbnailpath="../img/toc_thumbs/ch02_thumb.png" designation="Chapter" enumeration="2" data-uuid="2bf06864e6264270843990fe377ecb41" sandbox="false">
<title>Inside the SHSAT</title>
<exhibit path="chapter02/ch02_reader_0.html"/>
<exhibit path="chapter02/ch02_reader_1.html"/>
</chapter>
</unit>
</spine>
</toc>
This is my XSLT script:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:math="http://www.w3.org/2005/xpath-functions/math"
xmlns:xd="http://www.oxygenxml.com/ns/doc/xsl"
xmlns:xhtml="http://www.w3.org/1999/xhtml"
xmlns:s9ml="http://www.standardnine.com/s9ml" exclude-result-prefixes="xs math xd xhtml s9ml"
xmlns:epub="http://www.idpf.org/2007/ops"
version="3.0">
<xsl:output method="xhtml"/>
<xsl:param name="topicPrefix"/>
<xsl:template match="/">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>EPUB 3 Specifications - Table of Contents</title>
<link rel="stylesheet" type="text/css" href="../css/epub-spec.css" />
</head>
<body>
<nav epub:type="toc" id="toc">
<h1 class="title">Table of Contents</h1>
<ol>
<xsl:apply-templates select="//s9ml:exhibit" />
</ol>
</nav>
</body>
</html>
</xsl:template>
<!-- process exhibits referenced from toc.s9ml -->
<xsl:template match="s9ml:exhibit">
<xsl:element name="li" namespace="http://www.w3.org/1999/xhtml">
<xsl:variable name="count" select="position()"/>
<xsl:attribute name="id">
<xsl:value-of select="$topicPrefix"/>
<xsl:number format="0000" level="any"/>
</xsl:attribute>
<xsl:element name="a" namespace="http://www.w3.org/1999/xhtml">
<xsl:attribute name="href">
<xsl:value-of select="@path" />
</xsl:attribute>
<xsl:apply-templates select="document(@path)//xhtml:title">
</xsl:apply-templates>
</xsl:element>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
This is the output I'm getting:
<?xml version="1.0" encoding="UTF-8"?><html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>EPUB 3 Specifications - Table of Contents</title>
<link rel="stylesheet" type="text/css" href="../css/epub-spec.css" />
</head>
<body>
<nav epub:type="toc" id="toc">
<h1 class="title">Table of Contents</h1>
<ol>
<li id="0001">Cover</li>
<li id="0002">Title Page</li>
<li id="0003">Copyright</li>
<li id="0004">Section 1: Getting Started</li>
<li id="0005">SHSAT Basics</li>
<li id="0006">Inside the SHSAT</li>
<li id="0007">Structure of the Test</li>
</ol>
</nav>
</body>
</html>
This is the output I need:
<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>EPUB 3 Specifications - Table of Contents</title>
<link rel="stylesheet" type="text/css" href="../css/epub-spec.css" />
</head>
<body>
<nav epub:type="toc" id="toc">
<h1 class="title">Table of Contents</h1>
<ol>
<li>Front Matter <ol>
<li id="0001">Cover</li>
<li id="0002">Title Page</li>
<li id="0003">Copyright</li>
</ol>
</li>
<li>Section 1 Getting Started <ol>
<li>Chapter 1 SHSAT Basics <ol>
<li id="0004"><a href="chapter01/ch01_part.html">Section 1: Getting
Started</a></li>
<li id="0005"><a href="chapter01/ch01_reader_0.html">SHSAT
Basics</a></li>
</ol>
</li>
<li>Chapter 2 Inside the SHSAT <ol>
<li id="0006"><a href="chapter02/ch02_reader_0.html">Inside the
SHSAT</a></li>
<li id="0007"><a href="chapter02/ch02_reader_1.html">Structure of
the Test</a></li>
</ol>
</li>
</ol>
</li>
</ol>
</nav>
</body>
</html>
It sounds as if you want to start processing <xsl:apply-templates select="//s9ml:unit"/> instead of <xsl:apply-templates select="//s9ml:exhibit" />, then write a template
<xsl:template match="s9ml:unit | s9ml:chapter">
<li>
<xsl:value-of select="s9ml:title"/>
<ol>
<xsl:apply-templates select="node() except s9ml:title"/>
</ol>
</li>
</xsl:template>
to make sure you get a hierarchy of nested, ordered lists.
The whole snippet assumes you want to output XHTML result elements and have therefore declared xmlns="http://www.w3.org/1999/xhtml" on the xsl:stylesheet root element.
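Put together, the suggestion might look like the sketch below. It is not a tested drop-in: the label format (designation, enumeration, title joined by spaces) and the rule that skips units with an empty title are guesses based on the desired output in the question, and the exhibit template simply reuses the question's own numbering and document() lookup.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:s9ml="http://www.standardnine.com/s9ml"
    xmlns:xhtml="http://www.w3.org/1999/xhtml"
    xmlns:epub="http://www.idpf.org/2007/ops"
    xmlns="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="s9ml xhtml">

  <xsl:output method="xhtml"/>
  <xsl:param name="topicPrefix"/>

  <xsl:template match="/">
    <html>
      <head><title>Table of Contents</title></head>
      <body>
        <nav epub:type="toc" id="toc">
          <h1 class="title">Table of Contents</h1>
          <ol>
            <!-- start from the units instead of the exhibits -->
            <xsl:apply-templates select="s9ml:toc/s9ml:spine/s9ml:unit"/>
          </ol>
        </nav>
      </body>
    </html>
  </xsl:template>

  <!-- Guess: a unit without a title is only a wrapper, so emit its chapters directly -->
  <xsl:template match="s9ml:unit[not(normalize-space(s9ml:title))]">
    <xsl:apply-templates select="s9ml:chapter"/>
  </xsl:template>

  <!-- one list item per unit or chapter, with a nested list for its children -->
  <xsl:template match="s9ml:unit | s9ml:chapter">
    <li>
      <!-- join the non-empty parts, e.g. "Chapter 1 SHSAT Basics" -->
      <xsl:value-of select="(@designation, @enumeration, s9ml:title)[normalize-space()]"
                    separator=" "/>
      <ol>
        <xsl:apply-templates select="s9ml:chapter | s9ml:exhibit"/>
      </ol>
    </li>
  </xsl:template>

  <!-- exhibits: same numbering and title lookup as in the question -->
  <xsl:template match="s9ml:exhibit">
    <li>
      <xsl:attribute name="id">
        <xsl:value-of select="$topicPrefix"/>
        <xsl:number format="0000" level="any"/>
      </xsl:attribute>
      <a href="{@path}">
        <xsl:value-of select="document(@path)//xhtml:title"/>
      </a>
    </li>
  </xsl:template>

</xsl:stylesheet>
```

The pattern with a predicate has a higher default priority than the plain s9ml:unit | s9ml:chapter pattern, so untitled units are handled by the wrapper rule without an explicit priority attribute.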

apply-templates outputs content more times than expected

I'm new to XSLT and I can't understand why the root gets processed twice (at least this is my interpretation of this output).
EDIT: I'm using Saxon-HE with XSLT 2.0, but I also tested with several online processors and always got the same result.
XSLT file
<?xml version="1.0" encoding="UTF-8"?>
<!-- XResume.xsl: resume.xml ==> resume.xhtml -->
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xpath-default-namespace="https://github.com/IME-SE8/XResume">
<xsl:output method="html"/>
<xsl:template match="/">
<html>
<head>
<meta charset="utf-8" />
<meta lang="en" />
<meta name="description" content="Personal Resume and Portfolio" />
<title><xsl:value-of select="resume/personalInformation/name/attribute::shortForm" /> Website</title>
</head>
<body>
<xsl:apply-templates select="resume"/>
</body>
</html>
</xsl:template>
<xsl:template match="resume">
<div class="resume">
<div class="header">
<div class="name"><xsl:value-of select="personalInformation/name" /></div>
<div class="contacts">
<xsl:for-each select="personalInformation/contact">
<div class="contactInformation">
<p><xsl:value-of select="organization" /></p>
<p><xsl:value-of select="address" /></p>
<p><xsl:value-of select="phoneNumber" /></p>
<p><xsl:value-of select="email" /></p>
</div>
</xsl:for-each>
</div>
</div>
<div class="sections">
<xsl:apply-templates />
</div>
</div>
</xsl:template>
<xsl:template match="interests"></xsl:template>
<xsl:template match="education"></xsl:template>
<xsl:template match="skills"></xsl:template>
<xsl:template match="experiences"></xsl:template>
<xsl:template match="projects"></xsl:template>
<xsl:template match="awards"></xsl:template>
</xsl:stylesheet>
XML file
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl"
href="https://github.com/IME-SE8/XResume/master/XResume.xsl"?>
<resume
xmlns="https://github.com/IME-SE8/XResume"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="https://github.com/IME-SE8/XResume XResume.xsd">
<personalInformation>
<name first="John" last="Doe" shortForm="JD">John Doe</name>
<contact type="institutional">
<organization>StackOverflow Institute of Technology</organization>
<address>Internet</address>
<phoneNumber>+1 (666) 666-9999</phoneNumber>
<email>john#d.oe</email>
</contact>
</personalInformation>
<interests>
<interest>Q and A</interest>
<interest>XSLT</interest>
</interests>
<education></education>
<skills></skills>
<experiences></experiences>
<projects></projects>
<awards></awards>
</resume>
HTML output
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta charset="utf-8">
<meta lang="en">
<meta name="description" content="Personal Resume and Portfolio">
<title>JD Website</title>
</head>
<body>
<div class="resume">
<div class="header">
<div class="name">John Doe</div>
<div class="contacts">
<div class="contactInformation">
<p>StackOverflow Institute of Technology</p>
<p>Internet</p>
<p>+1 (666) 666-9999</p>
<p>john#d.oe</p>
</div>
</div>
</div>
<div class="sections">
John Doe
StackOverflow Institute of Technology
Internet
+1 (666) 666-9999
john#d.oe
</div>
</div>
</body>
</html>
(yes, with that amount of blank lines)
The output header div is perfectly fine, but inside the sections div that apply-templates renders all the information in the div header again but without the HTML tags.
Is there an XSLT processing detail I am missing? Does the template match set the context in a way that the matched element is now considered a root, or something like that?
The problem is here:
<div class="sections">
<xsl:apply-templates />
</div>
This applies templates to all child nodes of the current node (resume), including the personalInformation element.
As there is no matching template specified for personalInformation, the XSLT processor uses the built-in templates, and applying them outputs the concatenation of all descendant text nodes of the personalInformation element.
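For reference, the built-in rules behave roughly as if the following templates had been written explicitly (paraphrased from the XSLT specification; this is an illustration, not something you need to add to your stylesheet):

```xslt
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- elements and the document node: output nothing themselves, recurse into children -->
  <xsl:template match="*|/">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- text and attribute nodes: output their string value -->
  <xsl:template match="text()|@*">
    <xsl:value-of select="."/>
  </xsl:template>

  <!-- comments and processing instructions: output nothing -->
  <xsl:template match="processing-instruction()|comment()"/>

</xsl:stylesheet>
```

Because the recursion only visits child nodes, attribute values such as shortForm do not appear in the stray text, while element text (including whitespace-only text nodes) does. An empty template such as <xsl:template match="personalInformation"/> would also suppress the unwanted output.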
Solution:
Replace:
<div class="sections">
<xsl:apply-templates />
</div>
with:
<div class="sections">
<xsl:apply-templates select="*[not(self::personalInformation)]" />
</div>
The result of the transformation now doesn't contain the noted problematic output:
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta charset="utf-8">
<meta lang="en">
<meta name="description" content="Personal Resume and Portfolio">
<title>JD Website</title>
</head>
<body>
<div class="resume">
<div class="header">
<div class="name">John Doe</div>
<div class="contacts">
<div class="contactInformation">
<p>StackOverflow Institute of Technology</p>
<p>Internet</p>
<p>+1 (666) 666-9999</p>
<p>john#d.oe</p>
</div>
</div>
</div>
<div class="sections"></div>
</div>
</body>
</html>
You haven't supplied an expected output, so I will guess at your intended outcome. Here is an XSLT 2.0 solution. If you need XSLT 1.0, please comment and I can add one. Just remember that even if your transform engine is the browser, you have no excuse not to use XSLT 2.0 (see Saxon-CE).
XSLT 2.0 Solution
This XSLT 2.0 stylesheet ...
<xsl:transform
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:r="https://github.com/IME-SE8/XResume"
exclude-result-prefixes="r"
version="2.0">
<xsl:output method="html" version="5" encoding="UTF-8" indent="yes" />
<xsl:strip-space elements="*" />
<xsl:template match="/">
<html>
<head>
<meta lang="en" />
<meta name="description" content="Personal Resume and Portfolio" />
<title><xsl:value-of select="r:resume/r:personalInformation/r:name/@shortForm" /> Website</title>
</head>
<body>
<xsl:apply-templates select="r:resume"/>
</body>
</html>
</xsl:template>
<xsl:template match="r:resume">
<div class="resume">
<div class="header">
<div class="name"><xsl:value-of select="r:personalInformation/r:name" /></div>
<div class="contacts">
<xsl:apply-templates select="r:personalInformation/r:contact" />
</div>
</div>
<div class="sections">
<xsl:apply-templates select="* except r:personalInformation" />
</div>
</div>
</xsl:template>
<xsl:template match="r:contact">
<div class="contactInformation">
<xsl:apply-templates />
</div>
</xsl:template>
<xsl:template match="r:organization|r:address|r:phoneNumber|r:email">
<p><xsl:value-of select="." /></p>
</xsl:template>
<xsl:template match="r:education|r:skills|r:experiences|r:projects|r:awards">
<h2><xsl:value-of select="local-name()" /></h2>
<p><xsl:value-of select="." /></p>
</xsl:template>
<xsl:template match="r:interests">
<h2>interests</h2>
<ul>
<xsl:apply-templates />
</ul>
</xsl:template>
<xsl:template match="r:interest">
<li>
<xsl:value-of select="." />
</li>
</xsl:template>
<xsl:template match="*" />
</xsl:transform>
... when applied to this input document ...
<resume
xmlns="https://github.com/IME-SE8/XResume"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="https://github.com/IME-SE8/XResume XResume.xsd">
<personalInformation>
<name first="John" last="Doe" shortForm="JD">John Doe</name>
<contact type="institutional">
<organization>StackOverflow Institute of Technology</organization>
<address>Internet</address>
<phoneNumber>+1 (666) 666-9999</phoneNumber>
<email>john#d.oe</email>
</contact>
</personalInformation>
<interests>
<interest>Q and A</interest>
<interest>XSLT</interest>
</interests>
<education></education>
<skills></skills>
<experiences></experiences>
<projects></projects>
<awards></awards>
</resume>
... will yield this output html page ....
<!DOCTYPE HTML>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta lang="en">
<meta name="description" content="Personal Resume and Portfolio">
<title>JD Website</title>
</head>
<body>
<div class="resume">
<div class="header">
<div class="name">John Doe</div>
<div class="contacts">
<div class="contactInformation">
<p>StackOverflow Institute of Technology</p>
<p>Internet</p>
<p>+1 (666) 666-9999</p>
<p>john#d.oe</p>
</div>
</div>
</div>
<div class="sections">
<h2>interests</h2>
<ul>
<li>Q and A</li>
<li>XSLT</li>
</ul>
<h2>education</h2>
<p></p>
<h2>skills</h2>
<p></p>
<h2>experiences</h2>
<p></p>
<h2>projects</h2>
<p></p>
<h2>awards</h2>
<p></p>
</div>
</div>
</body>
</html>
Explanation: Why were you getting root processed twice
In short, because your <div class="sections"><xsl:apply-templates /></div> instruction did not specify a select attribute. The default selection applied, which at that point was every child node of the resume element, including personalInformation, so the built-in templates output its text a second time.
All the elements in your source document are in a namespace, but your stylesheet is written to process elements in no namespace. Welcome to the club, and join the 10m other people who have fallen into this trap. Essentially, your resume elements don't match the match="resume" template, so the default template kicks in, and this outputs the raw text with no tags. For the solution, search on "XSLT default namespace" and choose any one of about 1000 answers.
On re-reading, I see that you've used xpath-default-namespace="https://github.com/IME-SE8/XResume", which should fix the problem if you are using an XSLT 2.0 processor, or trigger an error if you're using an XSLT 1.0 processor. So it might be useful (actually, it's always useful) to tell us what processor you are using and how you are running it.

output escaping in alt text with xslt1

In my source XML, the less-than sign is represented as &lt;, but in the output (HTML, as alt text) it is represented as a literal < sign, which causes problems in post-processing.
I'm using saxon655 with this command line:
java -cp saxon655/saxon.jar com.icl.saxon.StyleSheet test.xml test.xsl
This really doesn't make sense to me. Here are the details:
The DocBook XML:
<chapter xmlns="http://docbook.org/ns/docbook">
<info><title>The Chapter</title></info>
<para>
<informalequation>
<mediaobject>
<imageobject>
<imagedata fileref="images/g0589.png" />
</imageobject>
<textobject role="tex"><phrase>|z_ s-z_ t|&lt;r</phrase></textobject>
</mediaobject>
</informalequation>
</para>
</chapter>
The XSLT. If you copy this, change the path to the DocBook stylesheets.
<xsl:stylesheet version="1.0"
xmlns:d="http://docbook.org/ns/docbook"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:import href="/path/to/docbook/xsl-1.78.1/html/docbook.xsl" />
<xsl:template match="d:mediaobject/d:imageobject/d:imagedata">
<xsl:element name="img">
<xsl:attribute name="alt">
<xsl:value-of select="../../d:textobject[@role='tex']/d:phrase" />
</xsl:attribute>
<xsl:attribute name="src">
<xsl:value-of select="@fileref" />
</xsl:attribute>
</xsl:element>
<xsl:apply-templates />
</xsl:template>
</xsl:stylesheet>
And the resulting HTML portion:
<div class="informalequation">
<div class="mediaobject">
<img alt="|z_ s-z_ t|<r" src="images/g0589.png"></div>
</div>
Am I doing something wrong?
As far as the W3C HTML validator is concerned, the output is fine for text/html. I created a minimal HTML 4.01 document with your markup at http://home.arcor.de/martin.honnen/html/test2015040301.html; it has the content
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>img alt attribute test</title>
</head>
<body>
<div class="informalequation">
<div class="mediaobject">
<img alt="|z_ s-z_ t|<r" src="images/g0589.png"></div>
</div>
</body>
</html>
and the validator says (http://validator.w3.org/check?uri=http%3A%2F%2Fhome.arcor.de%2Fmartin.honnen%2Fhtml%2Ftest2015040301.html&charset=%28detect+automatically%29&doctype=Inline&group=0) "This document was successfully checked as HTML 4.01 Strict!". So I think Saxon is creating correct HTML. I don't know how you post-process the result of the XSLT transformation, but an HTML or SGML parser should handle it fine.
With an XML output (method="xml") Saxon does escape the less than in the attribute value.
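If the post-processing step expects XML-style escaping, one option is to override the output method in the customization layer. The sketch below assumes the same import path as in the question and that XML serialization of the DocBook HTML output is acceptable for your pipeline (empty elements will then be written as <br/> rather than <br>, for example).

```xslt
<xsl:stylesheet version="1.0"
    xmlns:d="http://docbook.org/ns/docbook"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:import href="/path/to/docbook/xsl-1.78.1/html/docbook.xsl"/>

  <!-- This declaration has higher import precedence than the xsl:output
       in the imported stylesheet, so its method attribute takes effect. -->
  <xsl:output method="xml" encoding="UTF-8" indent="no"/>

  <!-- same template as in the question, written with attribute value templates -->
  <xsl:template match="d:mediaobject/d:imageobject/d:imagedata">
    <img alt="{../../d:textobject[@role='tex']/d:phrase}" src="{@fileref}"/>
  </xsl:template>

</xsl:stylesheet>
```

With the xml method, the serializer has to escape < inside attribute values, so the alt text should come out as |z_ s-z_ t|&lt;r.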

How to index feed in Google Search Appliance?

I have my Atom feed (a list of continents as XML) at the URL .../continent/search?view=atom, like this:
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">
<title>List of all continents</title>
<opensearch:totalResults>{{ continents_length }}</opensearch:totalResults>
<opensearch:startIndex>{{ continents.start_index }}</opensearch:startIndex>
<opensearch:itemsPerPage>{{ count }}</opensearch:itemsPerPage>
<opensearch:Query continent="request" searchTerms="" startPage="{{ continents.start_index }}" />
<author><name>My_site</name></author>
<id>urn:domain-id:mysite.com:continent</id>
<link rel="self" href="{{ url }}" />
{% for continent in continents %}
<entry>
<span class="continent_id">{{ continent.continent_id }}</span>
<span class="continent_name">{{ continent.continent_name }}</span>
<span class="list_countries">{{ continent.list_countries }}</span>
</entry>
{% endfor %}
</feed>
To PUT and index my feed through the GSA interface, I used this:
<?xml version="1.0" encoding="utf-8"?>
<entry xmlns="http://www.w3.org/2005/Atom">
<title>continent</title>
<id>urn:domain-id:mysite.com:continent</id>
<author>
<name>admin user</name>
</author>
<link rel="self" href=".../feed/continent"/>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<span id="refresh-each">15 12,14,18 * * *</span>
<span id="gsa-datasource">continent</span>
<span id="gsa-feedtype">full</span>
<span id="url">...continent/search?view=atom</span>
<span id="opensearch-pattern">&amp;count=100&amp;startPage=%STARTPAGE%</span>
<ul class="connection">
<li id="userid">user</li>
<li id="password">pass</li>
</ul>
<ul id="metadata">
<li id="continent_id">atom:entry/xhtml:span[@class='continent_id']</li>
<li id="continent_name">atom:entry/xhtml:span[@class='continent_name']</li>
<li id="list_countries">atom:entry/xhtml:span[@class='list_countries']</li>
</ul>
<div id="xsl-content">
<![CDATA[
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:xhtml="http://www.w3.org/1999/xhtml"
exclude-result-prefixes="atom xhtml">
<xsl:template name="FormatDescription">
<xsl:param name="name"/>
<xsl:value-of select="$name"/>
</xsl:template>
<xsl:template match="atom:entry">
<html>
<body>
<xsl:apply-templates select="atom:entry/xhtml:span" />
</body>
</html>
</xsl:template>
<xsl:template match="atom:entry/xhtml:span">
<xsl:copy-of select="*"/>
</xsl:template>
</xsl:stylesheet>
]]>
</div>
</div>
</content>
</entry>
But when I check the log of transferred files, it returns 0 files with this error:
ProcessNode: Missing required attribute url. skipping element., skipping record
For the second and third indexing runs there is no error, but no files either:
200 OK Feed continent has been pushed successfully to the Google Search Appliance.
Any suggestions or recommendations?
Following up on this, the word "feed" is confusing here. A GSA content feed is not like any RSS or Atom feeds. Here's a simplified example (from the documentation):
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE gsafeed PUBLIC "-//Google//DTD GSA Feeds//EN" "">
<gsafeed>
<header>
<datasource>hello</datasource>
<feedtype>incremental</feedtype>
</header>
<group>
<record url="http://www.corp.enterprise.com/hello02" mimetype="text/plain">
<content>UPDATED - This is hello02</content>
</record>
</group>
</gsafeed>
As you can see, that's a very specific XML format, not shared with web site update feed formats. The documentation for this is good: http://www.google.com/support/enterprise/static/gsa/docs/admin/70/gsa_doc_set/feedsguide/feedsguide.html
Check out the feed documentation
You need to pass the GSA a feed XML file, not an atom feed.

XSLT to insert HTML snippet into resulting HTML document

I have a DocBook 4.4 XML file which is a user guide. I can use the (Maven) tools to convert it to HTML and PDF with no problem. The problem I have is inserting a small HTML code snippet into the resulting HTML file.
I would like to add the following HTML snippet:
<xsl:template name="xxxxxxx">
<img src="images/pdfdoc.gif">PDF</img>
</xsl:template>
The resulting HTML code looks like this:
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>Title</title>
<link rel="stylesheet" type="text/css" href="./hilfeKMV.css">
<meta name="generator" content="DocBook XSL Stylesheets V1.76.0">
<meta name="date" content="10/12/2011">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084"
alink="#0000FF">
<div lang="de" class="book"
title="Title">
<div class="titlepage">
<div>
<div>
<h1 class="title">
<a name="d0e1"></a>title
</h1>
</div>
</div>
<hr>
</div>
<div class="toc">
<p>
<b>TOC</b>
</p>
I would like to insert the HTML snippet before the <div class="toc">. So the question is how to do this. I'm using the DocBook XSL stylesheets 1.76.0.
I think something like the following is needed, but I don't know how to set up the call-template etc.:
<xsl:template name="xxxxxxx">
<xsl:variable name="top-anchor">
<xsl:call-template name="object.id">
<xsl:with-param name="object" select="/*[1]"/>
</xsl:call-template>
</xsl:variable>
<img src="images/pdfdoc.gif">PDF</img>
</xsl:template>
The <div class="toc"> element is generated by the template named "make.toc", which is called by "division.toc" (in autotoc.xsl). In order to output something immediately before this <div>, you can override the "division.toc" template in your customization layer. Just copy the original template and add your code, like this:
<xsl:template name="division.toc">
<xsl:param name="toc-context" select="."/>
<xsl:param name="toc.title.p" select="true()"/>
<img src="images/pdfdoc.gif" alt="PDF"/> <!-- Your stuff here -->
<xsl:call-template name="make.toc">
...
...
</xsl:call-template>
</xsl:template>
I have found the correct location to insert the code I need.
I found the solution in the titlepage.templates.xsl file. I took the snippet from titlepage.templates.xsl and extended it as follows:
<xsl:template name="book.titlepage.separator">
<div class="subsubtile">
<div class="pdflink">
<a href="./xxxx.pdf" title="Hilfeseite als PDF-Dokument">
<img src="images/pdfdoc.gif" border="0" alt="Hilfeseite als PDF-Dokument" />
<br />
PDF
</a>
</div>
</div>
<hr/>
</xsl:template>
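For anyone wondering where such an override lives: it normally goes into a small customization layer that imports the stock stylesheet. The following is a minimal sketch; the import path is a placeholder, and how the customization file is passed to the Maven tooling depends on the plugin in use.

```xslt
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Placeholder path: point this at your local DocBook XSL html/docbook.xsl -->
  <xsl:import href="/path/to/docbook-xsl/html/docbook.xsl"/>

  <!-- A named template defined here has higher import precedence than the
       one with the same name in the imported stylesheets, so it replaces it. -->
  <xsl:template name="book.titlepage.separator">
    <div class="pdflink">
      <a href="./xxxx.pdf" title="Hilfeseite als PDF-Dokument">
        <img src="images/pdfdoc.gif" border="0" alt="Hilfeseite als PDF-Dokument"/>
      </a>
    </div>
    <hr/>
  </xsl:template>

</xsl:stylesheet>
```

Keeping the override in a separate file means the stock DocBook XSL files stay untouched and can be upgraded independently.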