Is it possible to replace & with & in XSLT? - xslt

I tried to do that with replace($val, 'amp;', ''), but seems like & is atomic entity to the parser. Any other ideas?
I need it to get rid of double escaping, so I have constructions like ᾰ in input file.
UPD:
Also one important notice: I have to make this substitution only inside of specific tags, not inside of every tag.

If you serialize there is always (if supported) the disable-output-escaping hack, see http://xsltransform.hikmatu.com/nbUY4kh which transforms
<root>
<foo>a & b</foo>
<bar>a & b</bar>
</root>
selectively into
<root>
<foo>a & b</foo>
<bar>a & b</bar>
</root>
by using <xsl:value-of select="." disable-output-escaping="yes"/> in the template matching foo/text():
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
<xsl:template match="#*|node()">
<xsl:copy>
<xsl:apply-templates select="#*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="foo/text()">
<xsl:value-of select="." disable-output-escaping="yes"/>
</xsl:template>
</xsl:transform>
To achieve the same selective replacement with a character map you could replace the ampersand in foo text() children (or descendants if necessary) with a character not used elsewhere in your document and then use the map to map it to an unescaped ampersand:
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
<xsl:output use-character-maps="doe"/>
<xsl:character-map name="doe">
<xsl:output-character character="«" string="&"/>
</xsl:character-map>
<xsl:template match="#*|node()">
<xsl:copy>
<xsl:apply-templates select="#*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="foo/text()">
<xsl:value-of select="replace(., '&', '«')"/>
</xsl:template>
</xsl:transform>
That way
<root>
<foo>a & b</foo>
<bar>a & b</bar>
</root>
is also transformed to
<root>
<foo>a & b</foo>
<bar>a & b</bar>
</root>
see http://xsltransform.hikmatu.com/pPgCcoj for a sample.

If your XML contains ᾰ and you believe that this is a double-escaped representation of the character with codepoint 8112, then you can convert it to this character using the XPath expression
codepoints-to-string(xs:integer(replace($input, '&#([0-9]+);', $1)))
remembering that if you write this XPath expression in XSLT, then the & must be written as &.

Related

XSLT 2.0 3.0 for-each context error when tokenizing attributes

Given this XML
<dmodule>
<content>
<warningsAndCautionsRef>
<warningRef id="w001" warningIdentNumber="warning-001">
</warningRef>
<warningRef id="w002" warningIdentNumber="warning-002">
</warningRef>
<cautionRef id="c001" cautionIdentNumber="caution-001">
</cautionRef>
<cautionRef id="c002" cautionIdentNumber="caution-002">
</cautionRef>
</warningsAndCautionsRef>
<faultReporting>
<preliminaryRqmts>
<reqSafety>
<safetyRqmts cautionRefs="c001 c002" warningRefs="w001 w002"/>
</reqSafety>
</preliminaryRqmts>
</faultReporting>
</content>
</dmodule>
I would like to tokenize the attributes #cautionRefs (and #warningRefs) and then find the cautionRef element that matches its #id to the tokenized value:
<xsl:template match="#cautionRefs">
<xsl:for-each select="tokenize(.,'\s')">
<xsl:apply-templates select="//*[#id=.]"/>
</xsl:for-each>
</xsl:template>
but the apply-templates fails: Fatal error during transformation Leading '/' selects nothing: the context item is not a node. It works if I don't tokenize and use string functions instead but that is not desirable.
Desired result:
Tokenize #cautionRefs="c001 c002" (which has multiple parent elements)
So each value is passed to the <cautionRef>template that will retrieve the caution and warning statements, to be displayed in a PDF:
<xsl:apply-templates select="//*[#id='c001']"/>
<xsl:apply-templates select="//*[#id='c002']"/>
I tried using <xsl:key name="id" match="*" use="#id"/> with
<xsl:for-each select="key('id',tokenize(.,'\s'))">
but the for-each is blank.
The above apply-templates will match with this <cautionRef> template, which retrieves the caution and warning statements correctly. I just need help with the context of the #cautionRefs template:
<xsl:template match="cautionRef">
<xsl:variable name="IdentNumber" select="#cautionIdentNumber"/>
<xsl:apply-templates select="//cautionSpec[cautionIdent/#cautionIdentNumber=$IdentNumber]"/>
</xsl:template>
You could use a variable and use that for context:
<xsl:template match="#cautionRefs|#warningRefs">
<xsl:variable name="ctx" select="/"/>
<xsl:for-each select="tokenize(.,'\s')">
<xsl:apply-templates select="$ctx//*[#id=.]"/>
</xsl:for-each>
</xsl:template>
but I would use a key like you hinted at (updated to include context based on comments)...
<xsl:key name="by_id" match="*[#id]" use="#id"/>
<xsl:variable name="root" select="/"/>
<xsl:template match="#cautionRefs|#warningRefs">
<xsl:for-each select="tokenize(.,'\s')">
<xsl:apply-templates select="key('by_id',.,$root)"/>
</xsl:for-each>
</xsl:template>
Here's a full working example. NB it's best to have this level of detail in the actual question; i.e. a sample input file, the XSLT, and output, along with an example of what you want the output to look like.
Input:
<test>
<safetyRqmts cautionRefs="c001 c002" warningRefs="w001"/>
<cautionRef id="c001" cautionIdentNumber="caution-001"/>
<cautionRef id="c002" cautionIdentNumber="caution-001"/>
</test>
Stylesheet:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="2.0">
<xsl:template match="*|#*">
<xsl:copy>
<xsl:apply-templates select="#*"/>
<xsl:apply-templates/>
</xsl:copy>
</xsl:template>
<xsl:key name="by_id" match="*[#id]" use="#id"/>
<xsl:template match="#cautionRefs|#warningRefs">
<xsl:apply-templates select="key('by_id', tokenize(.))"/>
</xsl:template>
</xsl:stylesheet>
Output:
<test>
<safetyRqmts><cautionRef id="c001" cautionIdentNumber="caution-001"/><cautionRef id="c002" cautionIdentNumber="caution-001"/></safetyRqmts>
<cautionRef id="c001" cautionIdentNumber="caution-001"/>
<cautionRef id="c002" cautionIdentNumber="caution-001"/>
</test>

Adding special characters to xml output

We have a piece of code which returns XML in a format like:
Source XML:
<Root>
<Book>
<BookId>a</BookId>
<Description>aDescription</Description>
</Book>
<Book>
<BookId>b</BookId>
<Description>bDescription</Description>
</Book>
</Root>
I want to replace the special characters with the literal characters...
<
will be < etc.
I know I can use:
<xsl:character-map name="escapeMapper">
<xsl:output-character character="<" string="<"/>
<xsl:output-character character=">" string=">"/>
</xsl:character-map>
However here is the twist, I want to convert special characters first, then run the resulting XML through other templates. So, I want to run the source XML through a template replacing the special characters, putting the result in to a variable:
<xsl:variable name="vrtfPass1">
Now I can use the multi-pass technique and apply other templates using the variable as the source.
How can I convert the special characters into the literal characters?
It sounds like what you want to do is not convert < and > to < and >, but to parse the XML, yes? Are you able to use Saxon extension functions in what you're doing? If so, I think you could do something like this:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:saxon="http://saxon.sf.net/">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/">
<xsl:variable name="loaded">
<xsl:apply-templates />
</xsl:variable>
<xsl:apply-templates select="$loaded/*" />
</xsl:template>
<xsl:template match="#* | node()">
<xsl:copy>
<xsl:apply-templates select="#* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="Book/text()[normalize-space()]">
<xsl:copy-of select="saxon:parse(.)"/>
</xsl:template>
</xsl:stylesheet>
Though this assumes that the text children of Book would always be well-formed XML in its own right.

Doing a double-pass in XSL?

Is it possible to store the output of an XSL transformation in some sort of variable and then perform an additional transformation on the variable's contents? (All in one XSL file)
(XSLT-2.0 Preferred)
XSLT 2.0 Solution :
<xsl:variable name="firstPassResult">
<xsl:apply-templates select="/" mode="firstPass"/>
</xsl:variable>
<xsl:template match="/">
<xsl:apply-templates select="$firstPassResult" mode="secondPass"/>
</xsl:template>
The trick here, is to use mode="firstPassResult" for the first pass while all the templates for the sedond pass should have mode="secondPass".
Edit:
Example :
<root>
<a>Init</a>
</root>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:variable name="firstPassResult">
<xsl:apply-templates select="/" mode="firstPass"/>
</xsl:variable>
<xsl:template match="/" mode="firstPass">
<test>
<firstPass>
<xsl:value-of select="root/a"/>
</firstPass>
</test>
</xsl:template>
<xsl:template match="/">
<xsl:apply-templates select="$firstPassResult" mode="secondPass"/>
</xsl:template>
<xsl:template match="/" mode="secondPass">
<xsl:message terminate="no">
<xsl:copy-of select="."/>
</xsl:message>
</xsl:template>
</xsl:stylesheet>
Output :
[xslt] <test><firstPass>Init</firstPass></test>
So the first pass creates some elements with the content of root/a and the second one prints the created elements to std out. Hopefully this is enough to get you going.
Yes, with XSLT 2.0 it is easy. With XSLT 1.0 you can of course also use modes and store a temporary result in a variable the same way as in XSLT 2.0 but the variable is then a result tree fragment, to be able to process it further with apply-templates you need to use an extension function like exsl:node-set on the variable.

XSLT: Add namespace to root element

I need to change namespaces in the root element as follows:
input document:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<foo xsi:schemaLocation="urn:isbn:1-931666-22-9 http://www.loc.gov/ead/ead.xsd"
xmlns:ns2="http://www.w3.org/1999/xlink" xmlns="urn:isbn:1-931666-22-9"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
desired output:
<foo audience="external" xsi:schemaLocation="urn:isbn:1-931666-22-9
http://www.loc.gov/ead/ead.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-
instance" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns="urn:isbn:1-931666-22-9">
I was trying to do it as I copy over the whole document and before I give any other transformation instructions, but the following doesn't work:
<xsl:template match="* | processing-instruction() | comment()">
<xsl:copy copy-namespaces="no">
<xsl:for-each select=".">
<xsl:attribute name="audience" select="'external'"/>
<xsl:namespace name="xlink" select="'http://www.w3.org/1999/xlink'"/>
</xsl:for-each>
<xsl:copy-of select="#*"/>
<xsl:apply-templates/>
</xsl:copy>
</xsl:template>
Thanks for any advice!
XSLT 2.0 isn't necessary to solve this problem.
Here is an XSLT 1.0 solution, which works equally well as XSLT 2.0 (just change the version attribute to 2.0):
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xlink="http://www.w3.org/1999/xlink"
exclude-result-prefixes="xlink"
>
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="node()|#*">
<xsl:copy>
<xsl:apply-templates select="node()|#*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="/*">
<xsl:element name="{name()}" namespace="{namespace-uri()}">
<xsl:copy-of select=
"namespace::*
[not(name()='ns2')
and
not(name()='')
]"/>
<xsl:copy-of select=
"document('')/*/namespace::*[name()='xlink']"/>
<xsl:copy-of select="#*"/>
<xsl:attribute name="audience">external</xsl:attribute>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
When the above transformation is applied on this XML document:
<foo
xsi:schemaLocation="urn:isbn:1-931666-22-9 http://www.loc.gov/ead/ead.xsd"
xmlns:ns2="http://www.w3.org/1999/xlink"
xmlns="urn:isbn:1-931666-22-9"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"/>
the wanted result is produced:
<foo xmlns="urn:isbn:1-931666-22-9"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xlink="http://www.w3.org/1999/xlink"
xsi:schemaLocation="urn:isbn:1-931666-22-9 http://www.loc.gov/ead/ead.xsd"
audience="external"/>
You should really be using the "identity template" for this, and you should always have it on hand. Create an XSLT with that template, call it "identity.xslt", then into the current XSLT. Assume the prefix "bad" for the namespace you want to replace, and "good" for the one you want to replace it with, then all you need is a template like this (I'm at work, so forgive the formatting; I'll get back to this when I'm at home): ... If that doesn't work in XSLT 1.0, use a match expression like "*[namespace-uri() = 'urn:bad-namespace'", and follow Dimitre's instructions for creating a new element programmatically. Within , you really need to just apply-template recursively...but really, read up on the identity template.

Can an XSLT processor preserve empty CDATA sections?

I'm processing an XML document (an InstallAnywhere .iap_xml installer) before handing it off to another tool (InstallAnywhere itself) to update some values. However, it appears that the XSLT transform I am using is stripping CDATA sections (which appear to be significant to InstallAnywhere) from the document.
I'm using Ant 1.7.0, JDK 1.6.0_16, and a stylesheet based on the identity:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml" encoding="UTF-8" cdata-section-elements="string" />
<xsl:template match="#*|node()">
<xsl:copy>
<xsl:apply-templates select="#*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Basically, "string" nodes that look like:
<string><![CDATA[]]></string>
are being processed into:
<string/>
From reading XSLT FAQs, I can see that what is happening is legal as far as the XSLT spec is concerned. Is there any way I can prevent this from happening and convince the XSLT processor to emit the CDATA section?
Found a solution:
<xsl:template match="string">
<xsl:element name="string">
<xsl:text disable-output-escaping="yes"><![CDATA[</xsl:text><xsl:value-of select="text()" disable-output-escaping="yes" /><xsl:text disable-output-escaping="yes">]]></xsl:text>
</xsl:element>
</xsl:template>
I also removed the cdata-section-elements attribute from the <xsl:output> element.
Basically, since the CDATA sections are significant to the next tool in the chain, I take output them manually.
To do this, you'll need to add a special case for empty string elements and use disable-output-escaping. I don't have a copy of Ant to test with, but the following template worked for me with libxml's xsltproc, which exhibits the same behavior you describe:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes" omit-xml-declaration="yes" cdata-section-elements="string"/>
<xsl:template match="string">
<xsl:choose>
<xsl:when test=". = ''">
<string>
<xsl:text disable-output-escaping="yes"><![CDATA[]]></xsl:text>
</string>
</xsl:when>
<xsl:otherwise>
<xsl:copy-of select="."/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<xsl:template match="#*|node()">
<xsl:copy>
<xsl:apply-templates select="#*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Input:
<input>
<string><![CDATA[foo]]></string>
<string><![CDATA[]]></string>
</input>
Output:
<input>
<string><![CDATA[foo]]></string>
<string><![CDATA[]]></string>
</input>
Once the XML parser has finished with the XML, there is absolutely no difference between <![CDATA[abc]]> and abc. And the same is true for an empty string - <![CDATA[]]> resolves to nothing at all, and is silently ignored. It has no representation in the XML model. In fact, there is no way to tell the difference from CDATA and regular strings, and neither has any representation in the XML model.
Sorry.
Now, why would you want this? Perhaps there is another solution which can help you?