Using an HTML entity in XSLT (e.g. ) - xslt

What is the best way to include an html entity in XSLT?
<xsl:template match="/a/node">
<xsl:value-of select="."/>
<xsl:text> </xsl:text>
</xsl:template>
this one returns a XsltParseError

You can use CDATA section
<xsl:text disable-output-escaping="yes"><![CDATA[ ]]></xsl:text>
or you can describe &nbsp in local DTD:
<!DOCTYPE xsl:stylesheet [ <!ENTITY nbsp " "> ]>
or just use   instead of

It is also possible to extend the approach from 2nd part of aku's answer and get all known character references available, like this:
<!DOCTYPE stylesheet [
<!ENTITY % w3centities-f PUBLIC "-//W3C//ENTITIES Combined Set//EN//XML"
"http://www.w3.org/2003/entities/2007/w3centities-f.ent">
%w3centities-f;
]>
...
<xsl:text> −30°</xsl:text>
There is certain difference in the result as compared to <xsl:text disable-output-escaping="yes"> approach. The latter one is going to produce string literals like for all kinds of output, even for <xsl:output method="text">, and this may happen to be different from what you might wish... On the contrary, getting entities defined for XSLT template via <!DOCTYPE ... <!ENTITY ... will always produce output consistent with your xsl:output settings.
It may be wise then to use a local entity resolver to keep the XSLT engine from fetching character entity definitions from the Internet. JAXP or explicit Xalan-J users may need a patch for Xalan-J to use the resolver correctly. See my blog XSLT, entities, Java, Xalan... for patch download and comments.

one other possibility to use html entities from within xslt is the following one:
<xsl:text disable-output-escaping="yes">&nbsp;</xsl:text>

this one returns a XsltParseError
Yes, and the reason for that is that is not a predefined entity in XML or XSLT as it is in HTML.
You could just use the unicode character which stands for:  

XSLT only handles the five basic entities by default: lt, gt, apos, quot, and amp. All others need to be defined as #Aku mentions.

Now that there's Unicode, it's generally counter-productive to use named character entities. I would recommend using the Unicode character for a non-breaking space instead of an entity, just for that reason. Alternatively, you could use the entity  , instead of the named entity. Using named entities makes your XML dependent on an inline or external DTD.

I found all of these solutions produced a  character in the blank space.
Using <xsl:text> </xsl:text> solved the problem for me; but <xsl:text>#x20;</xsl:text> might work as well.

Thank you for your information. I have written a short blog post based on what worked for me as I was doing XSLT transformation in a template of the Dynamicweb CMS.
The blog post is here: How to add entities to XSLT templates.
/Sten Hougaard

It is necessary to use the entity #x160;

I had no luck with the DOCTYPE approach from Aku.
What worked for me in MSXML transforms on an Windows 2003 server, was
<xsl:text disable-output-escaping="yes">&#160;</xsl:text>
Sort of a hybrid of the above. Thanks Stackoverflow contributors!

One space character between text tags should be enough.

Related

Passing a node as parameter to a XSL stylesheet

I need to pass a node as a parameter to an XSL stylesheet. The issue is that the parameter gets sent as a string. I have seen the several SO questions regarding this topic, and I know that the solution (in XSLT 1.0) is to use an external node-set() function to transform the string to a node set.
My issue is that I am using eXist DB I cannot seem to be able to get its XSLT processor to locate any such function. I have tried the EXSLT node-set() from the namespace http://exslt.org/common as well as both the Saxon and Xalan version (I think eXist used to use Xalan but now it might be Saxon).
Are these extensions even allowed in the XSLT processor used by eXist? If not, is there something else I can do?
To reference or transform documents from the database, you should pass the path as a parameter to the transformation, and then refer to it using a parameter and variable
(: xquery :)
let $path-to-document := "/db/test/testa.xml"
let $stylesheet :=
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:param name="source" required="no"/>
<xsl:variable name="error"><error>doc not available</error></xsl:variable>
<xsl:variable name="theDoc" select="if (doc-available($source)) then doc($source) else $error"/>
<xsl:template match="/">
<result><xsl:value-of select="$source"/> - <xsl:value-of select="node-name($theDoc/*)"/></result>
</xsl:template>
</xsl:stylesheet>
return transform:transform(<dummy/>,$stylesheet, <parameters><param name="source" value="xmldb:exist://{$path-to-document}"/></parameters>)
As per Martin Honnen's comments I don't think it is possible to pass an XML node via the <parameters> structure of the transform:transform() function in eXist. The function seems to strip away any XML tags passed to it as a value.
As a workaround I will wrap both my input XML and my parameter XML into a root element and pass that as input to the transform function.

Exclude certain child nodes when data structure is unknown

EDIT -
I've figured out the solution to my problem and posted a Q&A here.
I'm looking to process XML conforming to the Library of Congress EAD standard (found here). Unfortunately, the standard is very loose regarding the structure of the XML.
For example the <bioghist> tag can exist within the <archdesc> tag, or within a <descgrp> tag, or nested within another <bioghist> tag, or a combination of the above, or can be left out entirely. I've found it to be very difficult to select just the bioghist tag I'm looking for without also selecting others.
Below are a few different possible EAD XML documents my XSLT might have to process:
First example
<ead>
<eadheader>
<archdesc>
<bioghist>one</bioghist>
<dsc>
<c01>
<descgrp>
<bioghist>two</bioghist>
</descgrp>
<c02>
<descgrp>
<bioghist>
<bioghist>three</bioghist>
</bioghist>
</descgrp>
</c02>
</c01>
</dsc>
</archdesc>
</eadheader>
</ead>
Second example
<ead>
<eadheader>
<archdesc>
<descgrp>
<bioghist>
<bioghist>one</bioghist>
</bioghist>
</descgrp>
<dsc>
<c01>
<c02>
<descgrp>
<bioghist>three</bioghist>
</descgrp>
</c02>
<bioghist>two</bioghist>
</c01>
</dsc>
</archdesc>
</eadheader>
</ead>
Third example
<ead>
<eadheader>
<archdesc>
<descgrp>
<bioghist>one</bioghist>
</descgrp>
<dsc>
<c01>
<c02>
<bioghist>three</bioghist>
</c02>
</c01>
</dsc>
</archdesc>
</eadheader>
</ead>
As you can see, an EAD XML file might have a <bioghist> tag almost anywhere. The actual output I'm suppose to produce is too complicated to post here. A simplified example of the output for the above three EAD examples might be like:
Output for First example
<records>
<primary_record>
<biography_history>first</biography_history>
</primary_record>
<child_record>
<biography_history>second</biography_history>
</child_record>
<granchild_record>
<biography_history>third</biography_history>
</granchild_record>
</records>
Output for Second example
<records>
<primary_record>
<biography_history>first</biography_history>
</primary_record>
<child_record>
<biography_history>second</biography_history>
</child_record>
<granchild_record>
<biography_history>third</biography_history>
</granchild_record>
</records>
Output for Third example
<records>
<primary_record>
<biography_history>first</biography_history>
</primary_record>
<child_record>
<biography_history></biography_history>
</child_record>
<granchild_record>
<biography_history>third</biography_history>
</granchild_record>
</records>
If I want to pull the "first" bioghist value and put that in the <primary_record>, I can't simply <xsl:apply-templates select="/ead/eadheader/archdesc/bioghist", as that tag might not be a direct descendant of the <archdesc> tag. It might be wrapped by a <descgrp> or a <bioghist> or a combination thereof. And I can't select="//bioghist", because that will pull all the <bioghist> tags. I can't even select="//bioghist[1]" because there might not actually be a <bioghist> tag there and then I'll be pulling the value below the <c01>, which is "Second" and should be processed later.
This is already a long post, but one other wrinkle is that there can be an unlimited number of <cxx> nodes, nested up to twelve levels deep. I'm currently processing them recursively. I've tried saving the node I'm currently processing (<c01> for example) as a variable called 'RN', then running <xsl:apply-templates select=".//bioghist [name(..)=name($RN) or name(../..)=name($RN)]">. This works for some forms of EAD, where the <bioghist> tag isn't nested too deeply, but it will fail if it ever has to process an EAD file created by someone who loves wrapping tags in other tags (which is totally fine according to the EAD Standard).
What I'd love is someway of saying
Get any <bioghist> tag anywhere below the current node but
don't dig deeper if you hit a <c??> tag
I hope that I've made the situation clear. Please let me know if I've left anything ambiguous. Any assistance you can provide would be greatly appreciated. Thanks.
As the requirements are rather vague, any answer only reflects the guesses its author has made.
Here is mine:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:my="my:my" exclude-result-prefixes="my">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<my:names>
<n>primary_record</n>
<n>child_record</n>
<n>grandchild_record</n>
</my:names>
<xsl:variable name="vNames" select="document('')/*/my:names/*"/>
<xsl:template match="/">
<xsl:apply-templates select=
"//bioghist[following-sibling::node()[1]
[self::descgrp]
]"/>
</xsl:template>
<xsl:template match="bioghist">
<xsl:variable name="vPos" select="position()"/>
<xsl:element name="{$vNames[position() = $vPos]}">
<xsl:value-of select="."/>
</xsl:element>
</xsl:template>
<xsl:template match="text()"/>
</xsl:stylesheet>
When this transformation is applied on the provided XML document:
<ead>
<eadheader>
<archdesc>
<bioghist>first</bioghist>
<descgrp>
<bioghist>first</bioghist>
<bioghist>
<bioghist>first</bioghist></bioghist>
</descgrp>
<dsc>
<c01>
<bioghist>second</bioghist>
<descgrp>
<bioghist>second</bioghist>
<bioghist>
<bioghist>second</bioghist></bioghist>
</descgrp>
<c02>
<bioghist>third</bioghist>
<descgrp>
<bioghist>third</bioghist>
<bioghist>
<bioghist>third</bioghist></bioghist>
</descgrp>
</c02>
</c01>
</dsc>
</archdesc>
</eadheader>
</ead>
the wanted result is produced:
<primary_record>first</primary_record>
<child_record>second</child_record>
<grandchild_record>third</grandchild_record>
I worked out a solution on my own and posted it at this Q&A because the solution is quite specific to a certain XML standard and seemed out of the scope of this question. If people feel it would be best to post it here as well, I can update this answer with a copy.

Getting the value of an attribute in an XML document using XSL

I'm trying to get the value of iWantToGetThis.jpg and put it into an <img> during my XSL transformation. This is how my xml is structures:
<article>
<info>
<mediaobject>
<imageobject>
<imagedata fileref='iWantToGetThis.jpg'>
Here's what I've come up with for the XSL:
<xsl:template name="user.header.content">
<xsl:stylesheet xmlns:d='http://docbook.org/ns/docbook'>
<img><xsl:attribute name="src">../icons/<xsl:value-of select='ancestor-or-self::d:article/info/mediaobject/imageobject/imagedata/#fileref' /></xsl:attribute></img>
</xsl:stylesheet>
</xsl:template>
The image is being added to the output, but the src attribute is set to "../icons/", so I'm assuming it's not finding the fileref attribute in the XML. This looks perfectly valid to me, so I'm not sure what I'm missing.
I am not sure how you can get anything back at all, because that does not look like a valid XSLT document (I would expect the error "Keyword xsl:stylesheet may not contain img.").
However, it may be you are just showing a fragment of the code. If this is the case, your issue may be that you have only specified the namespace for the article element, when you really need to specify it for all elements in your xpath. Try this
<xsl:value-of
select="ancestor-or-self::d:article/d:info/d:mediaobject/d:imageobject/d:imagedata/#fileref"/>
Another possible problem may be because you are using the 'ancestor-or-self' xpath axis to find the attribute. This would only work if your current context was already on the article element, or one of its descendants.
As a side note, you can simplify the code by making use of Attribute Value Templates here
<img src="../icons/{ancestor-or-self::d:article/d:info/d:mediaobject/d:imageobject/d:imagedata/#fileref}" />

YQL XSLT implementation limitations

For some reason, YQL's XSLT table can't parse my stylesheet. I have used the stylesheet successfully with the W3C's XSLT service. Here's an example of the problem in YQL Console. Why does this not work in YQL?
Also, I have yet to figure out how to pass the results of a YQL query to the XSLT table as the XML to be transformed while also specifying a stylesheet url. Current workaround is to abuse the W3C's service.
Your stylesheet is defined as 1.0 but you're using replace() and tokenize() which is part of the 2.0 standard. However it is a fully valid XSLT/XPath 2.0 stylesheet.
As an addition to Per T answer, change this:
<xsl:variable name="r">
<xsl:value-of select="replace(tr/td/p/a/following-sibling::text(),
'\s*-\s*(\d+)\.(\d+)\.(\d+)\s*',
'$1,$2,$3')" />
</xsl:variable>
With this:
<xsl:variable name="r"
select="translate(tr/td/p/a/following-sibling::text(),'. -',',')">
These:
tokenize($r,',')[1]
tokenize($r,',')[2]
tokenize($r,',')[3]
With these:
substring-before($r,',')
substring-before(substring-after($r,','),',')
substring-after(substring-after($r,','),',')
Note: This is just in case you don't know the amount of digit in advance, otherwise you could do:
substring($r,1,2)
substring($r,4,2)
substring($r,7)
Also, this
replace(tr/td/p[#class='t11bold']/a,'\s+',' ')
It should be just this:
normalize-space(tr/td/p[#class='t11bold']/a)
And finaly this:
replace($d,'^[^\[]*\[\s*(\d+:\d{2})?\s*-?\s*([^\]]*)\]\s*$','$2')
Could be:
normalize-space(substring-after(substring-before(substring-after($d,'['),']'),'-'))

Xslt transform on special characters

I have an XML document that needs to pass text inside an element with an '&' in it.
This is called from .NET to a Web Service and comes over the wire with the correct encoding &
e.g.
T&O
I then need to use XSLT to create a transform but need to query SQL server through a SP without the encoding on the Ampersand e.g T&O would go to the DB.
(Note this all has to be done through XSLT, I do have the choice to use .NET encoding at this point)
Anyone have any idea how to do this from XSLT?
Note my XSLT knowledge isn’t the best to say the least!
Cheers
<xsl:text disable-output-escaping="yes">&<!--&--></xsl:text>
More info at: http://www.w3schools.com/xsl/el_text.asp
If you have the choice to use .NET you can convert between an HTML-encoded and regular string using (this code requires a reference to System.Web):
string htmlEncodedText = System.Web.HttpUtility.HtmlEncode("T&O");
string text = System.Web.HttpUtility.HtmlDecode(htmlEncodedText);
Update
Since you need to do this in plain XSLT you can use xsl:value-of to decode the HTML encoding:
<xsl:variable name="test">
<xsl:value-of select="'T&O'"/>
</xsl:variable>
The variable string($test) will have the value T&O. You can pass this variable as an argument to your extension function then.
Supposing your XML looks like this:
<root>T&O</root>
you can use this XSLT snippet to get the text out of it:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" />
<xsl:template match="root"> <!-- Select the root element... -->
<xsl:value-of select="." /> <!-- ...and extract all text from it -->
</xsl:template>
</xsl:stylesheet>
Output (from Saxon 9, that is):
T&O
The point is the <xsl:output/> element. The defauklt would be to output XML, where the ampersand would still be encoded.