How do I understand linebreaks in XSLT? - xslt

I have a piece of XML that looks like:
<bunch of other things>
<bunch of other things>
<errorLog> error1 \n error2 \n error3 </errorLog>
I want to modify the XSLT that this XML runs through so that a newline is applied after each of error1 through error3.
I can completely control the output of errorLog or the contents of the XSLT file, but I'm not sure how to craft either the XML or the XSLT to make the output HTML show line breaks. Is it easier to change the XML output into some special character that will cause a newline, or do I modify the XSLT to interpret \n as newlines?
There is an example on this site that contains something akin to what I want, but my <errorLog> XSLT is nested in another template, and I'm not sure how templates inside templates can work.

Backslash is used as an escape character in a number of languages including C and Java, but not in XML or XSLT. If you put \n in your stylesheet, that's not a newline, it's two characters: a backslash followed by "n". The XML way of writing a newline is &#10;. However, if you send a newline to the browser in HTML, it displays it as a space. If you want a newline displayed by the browser, you need to send a <br/> element.

If you have control over your errorLog element then you may as well use a literal LF character in there. It is no different from any other character as far as XSLT is concerned.
As for creating HTML that displays with line breaks, you will want to add a <br/> element in place of whatever marker you have in your XML source. It would be easiest of all if you could put each error within a separate element, like this
<errorLog>
<error>error1</error>
<error>error2</error>
<error>error3</error>
</errorLog>
then the XSLT doesn't have to go through the rather clumsy process of splitting up the text itself.
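With one element per error, the stylesheet side could be as small as the following. This is only a sketch under the assumption that the log really is restructured into error children as shown above; the output method and the surrounding <p> are illustrative choices, not something taken from the question.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Wrap the whole log in one paragraph and visit only the error children,
       so the whitespace between them is never copied to the output -->
  <xsl:template match="errorLog">
    <p>
      <xsl:apply-templates select="error"/>
    </p>
  </xsl:template>

  <!-- Each error becomes its text followed by an HTML line break -->
  <xsl:template match="error">
    <xsl:value-of select="."/>
    <br/>
  </xsl:template>
</xsl:stylesheet>
```

No string splitting is needed here: the structure of the input already says where each line break belongs.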
With this XML data taken from your question
<document>
<bunch-of-other-things/>
<bunch-of-other-things/>
<errorLog>error1 \n error2 \n error3</errorLog>
</document>
this stylesheet
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:strip-space elements="*"/>
<xsl:output method="xml" indent="yes" omit-xml-declaration="yes" />
<xsl:template match="/document">
<html>
<head>
<title>Error Log</title>
</head>
<body>
<xsl:apply-templates select="*"/>
</body>
</html>
</xsl:template>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="errorLog">
<p>
<xsl:call-template name="split-on-newline">
<xsl:with-param name="string" select="."/>
</xsl:call-template>
</p>
</xsl:template>
<xsl:template name="split-on-newline">
<xsl:param name="string"/>
<xsl:choose>
<xsl:when test="contains($string, '\n')">
<xsl:value-of select="substring-before($string, '\n')"/>
<br/>
<xsl:call-template name="split-on-newline">
<xsl:with-param name="string" select="substring-after($string, '\n')"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$string"/>
<br/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
will produce this output
<html>
<head>
<title>Error Log</title>
</head>
<body>
<bunch-of-other-things/>
<bunch-of-other-things/>
<p>error1 <br/> error2 <br/> error3<br/>
</p>
</body>
</html>
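
If errorLog carries real LF characters instead of the two-character \n marker, the same recursive approach should work with &#10; as the separator. The following is a minimal sketch under that assumption; the template name split-on-lf is made up for illustration, and unlike the stylesheet above it does not emit a <br/> after the last line.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <xsl:template match="errorLog">
    <p>
      <xsl:call-template name="split-on-lf">
        <xsl:with-param name="string" select="string(.)"/>
      </xsl:call-template>
    </p>
  </xsl:template>

  <!-- Hypothetical helper: emits the text before the first LF, a <br/>,
       then recurses on the remainder of the string -->
  <xsl:template name="split-on-lf">
    <xsl:param name="string"/>
    <xsl:choose>
      <!-- &#10; is a character reference, so the XPath literal holds a real LF -->
      <xsl:when test="contains($string, '&#10;')">
        <xsl:value-of select="substring-before($string, '&#10;')"/>
        <br/>
        <xsl:call-template name="split-on-lf">
          <xsl:with-param name="string" select="substring-after($string, '&#10;')"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <!-- No LF left: output whatever remains and stop -->
        <xsl:value-of select="$string"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```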

Related

XSLT: in text element, how to replace line break (<br/>) with blank space?

NOTE: I am using xsltproc on OS X Yosemite.
The source content for an XSLT transformation is HTML. Some
text nodes contain line breaks (<br/>). In the transformed
content (an XML file), I wish to convert the line breaks to spaces.
For example, I have:
<div class="location">London<br />Hyde Park<br /></div>
I want to transform this element like so:
<xsl:element name="location">
<xsl:variable name="location" select="div[@class='location']"/>
<xsl:value-of select="$location"/>
</xsl:element>
What happens is that the <br /> elements are simply removed from the output:
<location>LondonHyde Park</location>
I do have other templates that are involved:
<xsl:template match="node()|script"/>
<xsl:template match="*">
<xsl:apply-templates/>
</xsl:template>
What XSLT operations are required to transform the <br />'s here
to a single space?
I would use xsl:apply-templates instead of xsl:value-of and add a template to handle <br/>.
You would also need to modify <xsl:template match="node()|script"/> because node() also selects text nodes. You can replace node() with processing-instruction()|comment() if you need to, but they would not be output by default anyway.
Here's a working example:
Input
<div class="location">London<br />Hyde Park<br /></div>
XSLT 1.0
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="script"/>
<xsl:template match="*">
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="div[@class='location']">
<location><xsl:apply-templates/></location>
</xsl:template>
<xsl:template match="br">
<xsl:text> </xsl:text>
</xsl:template>
</xsl:stylesheet>
Output
<location>London Hyde Park </location>
If you don't want the trailing space, you could either...
put the xsl:apply-templates in a variable ($var) and use normalize-space() in an xsl:value-of. Like: <xsl:value-of select="normalize-space($var)"/>
update the match for the br element. Like: br[not(position()=last())]
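The first option could look roughly like this. It is a sketch that reuses the templates from the working example; the variable name var is just the placeholder mentioned in the list above.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="div[@class='location']">
    <!-- Capture the processed content (text plus one space per br) -->
    <xsl:variable name="var">
      <xsl:apply-templates/>
    </xsl:variable>
    <!-- normalize-space() trims the trailing space left by the final br -->
    <location><xsl:value-of select="normalize-space($var)"/></location>
  </xsl:template>

  <xsl:template match="br">
    <xsl:text> </xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

For the input above, the location element should then contain London Hyde Park with no trailing space. Note that normalize-space() also collapses any runs of whitespace inside the text, which may or may not be wanted.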

XSLT to fetch content from a file, where that filename ends with 'demo.xml' only

I have a group of XML files and want to fetch a particular element or piece of content from only those files whose names end with 'demo.xml'. Please suggest how to do this. Here I used the collection() function with the pattern '*demo.xml' to find the files whose names end with the word demo.
XML:
2830_demo.xml
<article>
<head>
<title>Space Technologies</title>
<author-group>
<au>Rudramuni</au>
</author-group>
</head>
<body><p>The first para</p></body></article>
XSLT:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:variable name="varCollection">
<xsl:copy-of select="collection('file:///D:/Rudramuni/XSLT/Samples/Documents/JVIR?select=*demo.xml;recurse=yes')"/>
</xsl:variable>
<xsl:template match="*"><xsl:apply-templates/></xsl:template>
<xsl:template match="article1">
<xsl:for-each select="$varCollection">
<xsl:variable name="a" select="."/>
<aug><xsl:apply-templates select="$a/article/head/ceauthor-group"/></aug>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>

How to read Text values of node XSLT

How can I read the text value of a node/element while ignoring a few of its child tags?
I have a list of tags that need to be ignored.
Example :
a)
OUTPUT :Title Txt a
<Title>
<Comment>Comment code</Comment>Title Txt a
</Title>
b)
OUTPUT :Title Txt b
<Title>
<Ignore1>Comment code</Ignore1>Title Txt b
</Title>
c)
OUTPUT :Comment code Title Txt c
<Title>
<includethis>Comment code</includethis>Title Txt c
</Title>
You simply match for the Title element:
<xsl:template match="Title">
and output its text content:
<xsl:value-of select="."/>
Then, process the child nodes in turn:
<xsl:template match="*[parent::Title and starts-with(name(),'Ignore')]"/>
<xsl:template match="includethis">
<xsl:value-of select="."/>
</xsl:template>
Above, the first template matches elements whose name starts with "Ignore". This is because I assume there could be other elements named Ignore2, Ignore3 and so on.
Finally, the includethis elements are matched and their text content is output, same as for the Title elements.
Now, to sum up:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/Title">
<xsl:value-of select="."/>
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="*[parent::Title and starts-with(name(),'Ignore')]"/>
<xsl:template match="includethis">
<xsl:value-of select="."/>
</xsl:template>
</xsl:stylesheet>
Thanks for your help. But I forgot to mention the important condition that started the problem:
I need to check that 'Title' is not empty before starting any processing.
For example, the tag below is considered empty, because its only child tag is one from the ignored list.
<Title>
<Comment>Comment code</Comment>
</Title>
Sample Code :
<xsl:choose>
<xsl:when test="Title and normalize-space(Title) != ''">
<xsl:apply-templates select="Title" mode="xyz"/>
</xsl:when>
<xsl:otherwise>
<xsl:call-template name="getalternative_label"/>
</xsl:otherwise>
</xsl:choose>
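One possible way to make that emptiness check respect the ignore list is to run Title through the filtering templates first and test the result rather than the raw string value. The following XSLT 1.0 sketch rests on several assumptions: item is a hypothetical parent element of Title, the ignore list consists of Comment plus any element whose name starts with Ignore, and the body of getalternative_label is a placeholder.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- "item" is a hypothetical element that contains the Title -->
  <xsl:template match="item">
    <!-- Collect what is left of Title once ignored children are dropped -->
    <xsl:variable name="kept">
      <xsl:apply-templates select="Title" mode="xyz"/>
    </xsl:variable>
    <xsl:choose>
      <xsl:when test="normalize-space($kept) != ''">
        <xsl:value-of select="normalize-space($kept)"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:call-template name="getalternative_label"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Elements on the (assumed) ignore list contribute nothing in this mode;
       everything else falls through to the built-in rules, which copy text -->
  <xsl:template match="Comment | *[starts-with(name(), 'Ignore')]" mode="xyz"/>

  <!-- Placeholder body; the real fallback is not shown in the question -->
  <xsl:template name="getalternative_label">
    <xsl:text>(no title)</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

With this arrangement a Title that holds only a Comment child yields an empty $kept, so the xsl:otherwise branch is taken.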

With XSLT, how can I process normally, but hold some nodes until the end and then output them all at once (e.g. footnotes)?

I have an XSLT application which reads the internal format of Microsoft Word 2007/2010 zipped XML and translates it into HTML5 with XSLT. I am investigating how to add the ability to optionally read OpenOffice documents instead of MSWord.
Microsoft stores XML for footnote text separately from the XML of the document text, which happens to suit me because I want the footnotes in a block at the end of the output HTML page.
However, unfortunately for me, OpenOffice puts each footnote right next to its reference, inline with the text of the document. Here is a simple paragraph example:
<text:p text:style-name="Standard">The real breakthrough in aerial mapping
during World War II was trimetrogon
<text:note text:id="ftn0" text:note-class="footnote">
<text:note-citation>1</text:note-citation>
<text:note-body>
<text:p text:style-name="Footnote">Three separate cameras took three
photographs at once, a direct downward and an oblique on each side.</text:p>
</text:note-body>
</text:note>
photography, but the camera was large and heavy, so there were problems finding
the right aircraft to carry it.
</text:p>
My question is, can XSLT process the XML as normal, but hold each of the text:note items until the end of the document text, and then emit them all at one time?
You're thinking of your logic as being driven by the order of things in the input, but in XSLT you need to be driven by the order of things in the output. When you get to the point where you want to output the footnotes, go find the footnote text wherever it might be in the input. Admittedly that doesn't always play too well with the apply-templates recursive descent processing model, which is explicitly input-driven; but nevertheless, that's the way you have to do it.
Don't think of it as "holding" the text:note items, instead simply ignore them in the main pass and then gather them at the end with a //text:note and process them there, e.g.
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
xmlns:text="whateveritshouldbe">
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()" />
</xsl:copy>
</xsl:template>
<!-- normal mode - replace text:note element by [reference] -->
<xsl:template match="text:note">
<xsl:value-of select="concat('[', text:note-citation, ']')" />
</xsl:template>
<xsl:template match="/">
<document>
<xsl:apply-templates select="*" />
<footnotes>
<xsl:apply-templates select="//text:note" mode="footnotes"/>
</footnotes>
</document>
</xsl:template>
<!-- special "footnotes" mode to de-activate the usual text:note template -->
<xsl:template match="@*|node()" mode="footnotes">
<xsl:copy>
<xsl:apply-templates select="@*|node()" mode="footnotes" />
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
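Since the target here is HTML, the same two-pass idea might be adapted along the following lines. This is a sketch only: it assumes the text prefix is bound to the OpenDocument text namespace, and the choice of sup, a, ol and li for the output is illustrative rather than taken from the question.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
    exclude-result-prefixes="text">
  <xsl:output method="html"/>

  <xsl:template match="/">
    <html>
      <body>
        <!-- Main pass: document text with footnotes reduced to references -->
        <xsl:apply-templates/>
        <!-- Second pass: gather every footnote, wherever it sits in the input -->
        <ol class="footnotes">
          <xsl:apply-templates select="//text:note" mode="footnotes"/>
        </ol>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="text:p">
    <p><xsl:apply-templates/></p>
  </xsl:template>

  <!-- Main pass: emit only a linked citation, never the footnote body -->
  <xsl:template match="text:note">
    <sup><a href="#{@text:id}"><xsl:value-of select="text:note-citation"/></a></sup>
  </xsl:template>

  <!-- Footnotes mode: emit the body, using the note id as the link target -->
  <xsl:template match="text:note" mode="footnotes">
    <li id="{@text:id}">
      <xsl:apply-templates select="text:note-body/text:p/node()"/>
    </li>
  </xsl:template>
</xsl:stylesheet>
```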
You could use <xsl:apply-templates mode="..."/>. I'm not sure of the exact syntax for your use case, but the example below may give you a clue on how to approach your problem.
The basic idea is to process your nodes twice. The first iteration would be pretty much the same as now, and the second iteration looks only for footnotes and outputs only those. You differentiate the iterations by setting the "mode" attribute.
Note that I used different tags than in your code, so that the example would be simpler.
XSLT sheet:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml" indent="yes" />
<xsl:template match="doc">
<xml>
<!-- First iteration - skip footnotes -->
<doc>
<xsl:apply-templates select="text" />
</doc>
<!-- Second iteration, extract all footnotes.
'mode' = footnotes -->
<footnotes>
<xsl:apply-templates select="text" mode="footnotes" />
</footnotes>
</xml>
</xsl:template>
<!-- Note: no mode attribute -->
<xsl:template match="text">
<text>
<xsl:for-each select="p">
<p>
<xsl:value-of select="text()" />
</p>
</xsl:for-each>
</text>
</xsl:template>
<!-- Note: mode = footnotes -->
<xsl:template match="text" mode="footnotes">
<xsl:for-each select=".//footnote">
<footnote>
<xsl:value-of select="text()" />
</footnote>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
Input XML:
<?xml version="1.0" encoding="UTF-8"?>
<doc>
<text>
<p>
some text
<footnote>footnote1</footnote>
</p>
<p>
other text
<footnote>footnote2</footnote>
</p>
</text>
<text>
<p>
some text2
<footnote>footnote3</footnote>
</p>
<p>
other text2
<footnote>footnote4</footnote>
</p>
</text>
</doc>
Output XML:
<?xml version="1.0" encoding="UTF-8"?>
<xml>
<!-- Output from first iteration -->
<doc>
<text>
<p>some text</p>
<p>other text</p>
</text>
<text>
<p>some text2</p>
<p>other text2</p>
</text>
</doc>
<!-- Output from second iteration -->
<footnotes>
<footnote>footnote1</footnote>
<footnote>footnote2</footnote>
<footnote>footnote3</footnote>
<footnote>footnote4</footnote>
</footnotes>
</xml>

Input contains a paragraph character that needs to be removed

I have been attempting to modify the text of the parent element from within the xsl. How can I delete the element from the XSL code ( I do not control the input ). I only want to delete the preceding line break not all line breaks in the body. The preceding 'some text here' may take the form of multiple paragraphs.
Xsl
<xsl:template match="element">
<!-- attempting to add fix here -->
<xsl:apply-templates />
</xsl:template>
Input
<body>
<p>
some text here
</p>
<element>
some more text
</element>
</body>
Output
some text here
some more text
Desired Output
some text here some more text
Does
<xsl:template match="p[following-sibling::*[1][self::element]]//text() | element[preceding-sibling::*[1][self::p]]//text()">
<xsl:value-of select="normalize-space()"/>
</xsl:template>
do what you want?
You don't need the <xsl:template match="element"><xsl:apply-templates/></xsl:template> as the built-in template will do that anyway.
I found some time to test code, now I have
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="p[following-sibling::*[1][self::element]]//text() |
element[preceding-sibling::*[1][self::p]]//text()">
<xsl:value-of select="normalize-space()"/>
</xsl:template>
<xsl:template match="text()[preceding-sibling::*[1][self::p] and following-sibling::*[1][self::element] and not(normalize-space())]">
<xsl:text> </xsl:text>
</xsl:template>
</xsl:stylesheet>
transforms
<body>
<p>
some text here
</p>
<element>
some more text
</element>
</body>
into
some text here some more text