xsltproc doesn't select elements by name

I am trying to transform XHTML using an XSLT stylesheet, but I can't even get a basic stylesheet to match anything. I'm sure I'm missing something simple.
Here's my XHTML source document (no big surprises):
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="generator" content=
"HTML Tidy for Windows (vers 25 March 2009), see www.w3.org" />
...
</body>
</html>
The actual contents don't matter too much, as I'll demonstrate below. By the way, I'm pretty sure the document is well-formed since it was created via tidy -asxml.
My more complex XPath expressions were not returning any results, so as a sanity test, I'm trying to transform it very simply using the following stylesheet:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" omit-xml-declaration="yes" indent="no"/>
<xsl:template match="/">
<xsl:text>---[</xsl:text>
<xsl:for-each select="html">
<xsl:text>Found HTML element.</xsl:text>
</xsl:for-each>
<xsl:text>]---</xsl:text>
</xsl:template>
</xsl:stylesheet>
The transform is done via xsltproc --nonet stylesheet.xsl input.html, and the output is: "---[]---" (i.e., it didn't find a child element named html). However, if I change the for-each section to:
<xsl:for-each select="*">
<xsl:value-of select="name()"/>
</xsl:for-each>
Then I get "---[html]---". And similarly, if I use for-each select="*/*" I get "---[headbody]---" as I would expect.
Why can it find the child element via * (with name() giving the correct name) but it won't find it using the element name directly?

The html element in your source XML defines a namespace. You have to include it in your match expression and reference it in your xsl:stylesheet element:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:html="http://www.w3.org/1999/xhtml">
<xsl:output method="text" omit-xml-declaration="yes" indent="no"/>
<xsl:template match="/">
<xsl:text>---[</xsl:text>
<xsl:for-each select="html:html">
<xsl:text>Found HTML element.</xsl:text>
</xsl:for-each>
<xsl:text>]---</xsl:text>
</xsl:template>
</xsl:stylesheet>

Change your stylesheet from:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" omit-xml-declaration="yes" indent="no"/>
<xsl:template match="/">
<xsl:text>---[</xsl:text>
<xsl:for-each select="html">
<xsl:text>Found HTML element.</xsl:text>
</xsl:for-each>
<xsl:text>]---</xsl:text>
</xsl:template>
</xsl:stylesheet>
to:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:x="http://www.w3.org/1999/xhtml"
>
<xsl:output method="text" omit-xml-declaration="yes" indent="no"/>
<xsl:template match="/">
<xsl:text>---[</xsl:text>
<xsl:for-each select="x:html">
<xsl:text>Found HTML element.</xsl:text>
</xsl:for-each>
<xsl:text>]---</xsl:text>
</xsl:template>
</xsl:stylesheet>
Explanation:
The XML document declares a default namespace, "http://www.w3.org/1999/xhtml", and every unprefixed element that descends from the top element declaring it belongs to this namespace.
XPath, on the other hand, treats any unprefixed name as belonging to "no namespace".
Therefore, the <xsl:for-each select="html"> instruction selects, and applies its body to, all html elements that belong to "no namespace". The document contains none: its only html element belongs to the XHTML namespace.
Solution:
Names that belong to a default namespace cannot be referenced unprefixed in XPath 1.0. Therefore, we need to bind a prefix to the namespace such an element belongs to. If this prefix is "x", then we can reference any such element by prefixing its name with "x:", for example x:html.
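The same rule can be observed outside XSLT. As a minimal sketch, assuming Python's standard-library xml.etree.ElementTree rather than xsltproc, and a cut-down stand-in for the XHTML document, an unprefixed path finds nothing while a prefix bound to the XHTML namespace URI does. The prefix x and the sample markup are illustrative choices, not part of the original question.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Cut-down stand-in for the tidy output: a default namespace on the root.
root = ET.fromstring(
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>t</title></head><body/></html>"
)

# ElementTree stores the expanded name, namespace URI included.
root_tag = root.tag

# An unprefixed step means "head in no namespace", so nothing matches.
unprefixed = root.find("head")

# Binding any prefix to the same URI makes the step match.
prefixed = root.find("x:head", {"x": XHTML_NS})

print(root_tag)
print(unprefixed)
print(prefixed.tag)
```

The prefix exists only in the query side's mapping; the source document never mentions it.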

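A workaround that avoids declaring the namespace, so that the stylesheet accepts an html element whatever its namespace (this works only while the source element is unprefixed, because name() returns the qualified name as written; local-name() avoids that limitation):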
<xsl:template match="*[name()='html']" >

Related

Extract text inside cdata tag using XSLT

I have the following XML with a CDATA section that I would like to extract the text from.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<cus:TestData xmlns:cus="http://test.namespace.com/data">
<![CDATA[testValue]]></cus:TestData >
How can I achieve this in XSLT?
I was briefly trying with the following
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="/">
"<xsl:value-of select="/*/Name"/>"
</xsl:template>
</xsl:stylesheet>
But it doesn't seem to be working.
Also, the XML doesn't always have the same prefix or namespace; it changes.
This is not really an issue with CData. Your XSLT is currently looking for an element called Name, under the root element, which does not exist in your XML. If your XML source is the one you are actually using, you can just do this...
<xsl:value-of select="/*"/>
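To see that CDATA needs no special handling, here is a small sketch using Python's standard-library xml.etree.ElementTree (an assumption for illustration; it is not the XSLT processor from the question). The parser hands the CDATA section over as ordinary element text, which is why selecting the string value of the root element is enough.

```python
import xml.etree.ElementTree as ET

# Same shape as the question's document, written on one line so the
# element text contains only the CDATA content.
root = ET.fromstring(
    '<cus:TestData xmlns:cus="http://test.namespace.com/data">'
    "<![CDATA[testValue]]></cus:TestData>"
)

# The CDATA markers are gone after parsing; only the characters remain.
text = root.text
print(text)
```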
But supposing your XML looked like this...
<cus:TestData xmlns:cus="http://test.namespace.com/data">
<cus:Name><![CDATA[testValue]]></cus:Name>
</cus:TestData>
Then, you would need to account for the namespace in your XSLT, as Name is in a namespace in your XML, but your XSLT is currently looking for a Name element in no namespace.
Something like this would do:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:c="http://test.namespace.com/data">
<xsl:output method="text"/>
<xsl:template match="/">
"<xsl:value-of select="/*/c:Name"/>"
</xsl:template>
</xsl:stylesheet>
Note, the prefixes don't need to match, but the namespace URI does.
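A quick way to convince yourself of this, sketched with Python's standard-library xml.etree.ElementTree rather than an XSLT processor: the document uses the prefix cus, the query uses c, and the lookup succeeds because both are bound to the same URI. The second URI, http://example.com/other, is a made-up value used only to show the failing case.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<cus:TestData xmlns:cus="http://test.namespace.com/data">'
    "<cus:Name><![CDATA[testValue]]></cus:Name>"
    "</cus:TestData>"
)

# Different prefix, same namespace URI: the element is found.
same_uri = root.find("c:Name", {"c": "http://test.namespace.com/data"})

# Same prefix as the document, different (hypothetical) URI: no match.
other_uri = root.find("cus:Name", {"cus": "http://example.com/other"})

print(same_uri.text)
print(other_uri)
```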
If the namespace URI could actually vary, you could do something like this instead...
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="/">
"<xsl:value-of select="/*/*[local-name() = 'Name']"/>"
</xsl:template>
</xsl:stylesheet>
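The local-name() test above compares only the part of the name after the prefix and ignores the namespace entirely. A rough Python equivalent, assuming the standard-library xml.etree.ElementTree, is sketched below; local_name and first_child_named are hypothetical helpers, and the two urn:example URIs are invented to stand for a namespace that changes between documents.

```python
import xml.etree.ElementTree as ET


def local_name(element):
    """Return the tag without its {namespace-uri} part, like XPath local-name()."""
    return element.tag.rsplit("}", 1)[-1]


def first_child_named(parent, name):
    """Return the first child whose local name equals `name`, in any namespace."""
    for child in parent:
        if isinstance(child.tag, str) and local_name(child) == name:
            return child
    return None


# Two documents with different prefixes and different namespace URIs.
doc_a = ET.fromstring(
    '<a:TestData xmlns:a="urn:example:one"><a:Name>first</a:Name></a:TestData>'
)
doc_b = ET.fromstring(
    '<TestData xmlns="urn:example:two"><Name>second</Name></TestData>'
)

values = [first_child_named(doc, "Name").text for doc in (doc_a, doc_b)]
print(values)
```

The trade-off is the same as in XSLT: the match is convenient when the namespace really varies, but it will also accept a Name element from an unrelated vocabulary.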

XSL Transform, select namespace

I'm new to XSL and doing fine so far, but this is the first time I need to do something with namespaces, and I'm completely lost. Can someone explain how to do this:
I have an XML document like this:
<?xml version="1.0" encoding="ISO-8859-1" ?>
<?xml-stylesheet type="text/xsl" href="5C.xslt"?>
<!DOCTYPE rdf:RDF SYSTEM "http://purl.org/dc/schemas/dcmes-xml20000714.dtd">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:dcterms="http://purl.org/dc/terms/">
<rdf:Description rdf:about="MyJPeg.jpg">
<dc:title>Find Info</dc:title>
<dc:contributor>Myself</dc:contributor>
<dcterms:created>2013-12-11</dcterms:created>
<dcterms:issued>2013-12-23</dcterms:issued>
</rdf:Description>
</rdf:RDF>
I need to check whether the issued date is equal to 2013-10-10 (the answer should be no).
My XSLT is:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xhtml="http://www.w3.org/1999/xhtml">
<xsl:output method="html" version="1.0" encoding="ISO-8859-1" indent="yes"/>
<xsl:template match="*">
<html><body><pre>
<xsl:value-of select="rdf/issued"/>
<xsl:if test="xxx = '2013-10-10' ">
</xsl:if>
</pre></body>
</html>
</xsl:template>
</xsl:stylesheet>
So I try to get the value with this line:
<xsl:value-of select="rdf/issued"/>
(to see if I got it)
And to validate with this one :
<xsl:if test="xxx = '2013-10-10' ">
But I'm new to namespaces and I can't figure out how to get my value.
Can someone help me?
thanks
Question #2: the solution works, but:
If I want to check whether the date is later than a given date instead of equal to it, how can I do that? I replaced = with > and tried both earlier and later dates, and it never works:
<xsl:if test="rdf:Description/dcterms:issued > '2001-01-01' ">
Good job
</xsl:if>
What's wrong ?
thanks
In XML, an element with a namespace is different from an element with no namespace. For example, despite having the same "local" name of "RDF", the following two elements are different.
<RDF>Test</RDF>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">Test</rdf:RDF>
To access elements within a namespace in XSLT, you first have to declare the relevant namespaces in your XSLT:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xhtml="http://www.w3.org/1999/xhtml"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dcterms="http://purl.org/dc/terms/">
Then, where you have an xpath expression that refers to elements, you need to add in the prefix
<xsl:value-of select="rdf:Description/dcterms:issued"/>
(I took it as a typo in your question, but "issued" is a child of "Description" in your XML sample!).
Try this XSLT
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xhtml="http://www.w3.org/1999/xhtml"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dcterms="http://purl.org/dc/terms/">
<xsl:output method="html" version="1.0" encoding="ISO-8859-1" indent="yes"/>
<xsl:template match="rdf:RDF">
<html><body><pre>
<xsl:value-of select="rdf:Description/dcterms:issued"/>
<xsl:if test="rdf:Description/dcterms:issued = '2013-10-10' ">
</xsl:if>
</pre></body>
</html>
</xsl:template>
</xsl:stylesheet>
It is worth mentioning that the namespace prefix ("rdf:" in this case), does not have to be the same in the XML as it is in the XSLT. It is the namespace URI ("http://www.w3.org/1999/02/22-rdf-syntax-ns#") that has to match.
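The follow-up about > is not addressed above. In XPath 1.0 the relational operators (<, <=, >, >=) convert both operands to numbers, and a string such as '2013-12-23' converts to NaN, so the test is false whichever date is used. One common workaround is to strip the hyphens first, for example translate(rdf:Description/dcterms:issued, '-', '') > 20010101, which works for dates written as YYYY-MM-DD. The Python below only simulates that conversion rule to show why the original test fails; xpath_number is a hypothetical helper and only a rough stand-in for the XPath number() function.

```python
import math


def xpath_number(value):
    """Rough stand-in for XPath 1.0 number(): unparsable strings become NaN."""
    try:
        return float(value.strip())
    except ValueError:
        return math.nan


issued = "2013-12-23"

# What the original test does: both sides become NaN, and any
# comparison involving NaN is false.
naive = xpath_number(issued) > xpath_number("2001-01-01")

# Workaround: drop the hyphens (XPath: translate(., '-', '')) so that
# both sides are plain numbers, then compare.
fixed = xpath_number(issued.replace("-", "")) > 20010101

print(naive)
print(fixed)
```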

Use XSLT to copy XML without the xml declaration

I have the following xml and want the output to not contain the xml declaration
i.e.
FROM
<?xml version="1.0" encoding="UTF-8"?>
<tns:MFTRNS xmlns:tns="MFTRNS" recordState="New" msgVersion="13.0">
<OSCONO>100</OSCONO>
<OSINOU>1</OSINOU>
<OSDLIX>155379</OSDLIX>
<OSPANR>AAG44780</OSPANR>
<OSWHLO>AAG</OSWHLO>
</tns:MFTRNS>
TO
<tns:MFTRNS xmlns:tns="MFTRNS" recordState="New" msgVersion="13.0">
<OSCONO>100</OSCONO>
<OSINOU>1</OSINOU>
<OSDLIX>155379</OSDLIX>
<OSPANR>AAG44780</OSPANR>
<OSWHLO>AAG</OSWHLO>
</tns:MFTRNS>
Can you get an xslt to do this and if so how?
The reason for doing this is that I want to wrap the xml in an envelope which cannot be done if the declaration is a part of the XML as it does not create a valid xml file
Thanks
If you only want to remove the declaration then a stylesheet as simple as this will do it:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml" omit-xml-declaration="yes" />
<xsl:template match="/">
<xsl:copy-of select="node()" />
</xsl:template>
</xsl:stylesheet>
But if your ultimate aim is to "wrap the xml in an envelope" then you might be better doing that directly in your XSLT, for example:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml" />
<xsl:template match="/">
<soap:Envelope xmlns:soap="http://www.w3.org/2001/12/soap-envelope">
<xsl:copy-of select="node()" />
</soap:Envelope>
</xsl:template>
</xsl:stylesheet>
which will be safer than trying to combine the two files using non-XML-aware textual operations. For example, if your envelope declares a default namespace xmlns="http://example.com" then simply inserting the text of another XML document inside the envelope would change the semantics as it would move the non-prefixed elements like OSCONO into the envelope's default namespace when they should really be in no namespace. XSLT will spot this case and add the necessary xmlns="" overrides.
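The namespace capture described above can be reproduced with a few lines of Python, assuming the standard-library xml.etree.ElementTree; the Envelope element name is an illustrative stand-in for whatever wrapper is actually used. Pasting the text inside an envelope with a default namespace silently moves OSCONO into that namespace, while an explicit xmlns="" (the override an XSLT serializer adds) keeps it in no namespace.

```python
import xml.etree.ElementTree as ET

payload = "<OSCONO>100</OSCONO>"

# Parsed on its own, the element is in no namespace.
alone = ET.fromstring(payload).tag

# Textual wrapping: the envelope's default namespace now applies to it.
pasted = ET.fromstring(
    '<Envelope xmlns="http://example.com">' + payload + "</Envelope>"
)
captured = pasted[0].tag

# With the xmlns="" override the element stays in no namespace.
guarded = ET.fromstring(
    '<Envelope xmlns="http://example.com">'
    '<OSCONO xmlns="">100</OSCONO>'
    "</Envelope>"
)
preserved = guarded[0].tag

print(alone, captured, preserved)
```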
It's simple: you have to set the omit-xml-declaration attribute of your xsl:output element to yes.

Processing a GPX file with xsl (probably a namespace issue)

This question looks like a duplicate of XPath query for GPX files with namespaces?, but I must be missing something because I can't seem to get a fairly simple style sheet to work. I have this input:
<?xml version="1.0" encoding="utf-8"?>
<gpx xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="1.0" creator="Groundspeak Pocket Query" xsi:schemaLocation="http://www.topografix.com/GPX/1/0 http://www.topografix.com/GPX/1/0/gpx.xsd http://www.groundspeak.com/cache/1/0 http://www.groundspeak.com/cache/1/0/cache.xsd" xmlns="http://www.topografix.com/GPX/1/0">
<name>Ottawa Pocket Query</name>
<wpt lat="45.348517" lon="-75.825933">
<name>GC3HXAZ</name>
<desc>Craft maker box by FishDetective, Traditional Cache (2/2.5)</desc>
<url>http://www.geocaching.com/seek/cache_details.aspx?guid=e86ce3f5-9e75-48a6-b47e-9415101fc658</url>
<groundspeak:cache id="2893138" available="True" archived="False" xmlns:groundspeak="http://www.groundspeak.com/cache/1/0">
<groundspeak:name>Craft maker box</groundspeak:name>
<groundspeak:difficulty>2</groundspeak:difficulty>
<groundspeak:terrain>2.5</groundspeak:terrain>
</groundspeak:cache>
</wpt>
</gpx>
And a stylesheet that looks like this:
<?xml version="1.0"?>
<!-- -->
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:groundspeak="http://www.groundspeak.com/cache/1/0"
>
<xsl:output method="html"/>
<xsl:template match="/">
Cache names:
<xsl:apply-templates select="//wpt">
</xsl:apply-templates>
</xsl:template>
<xsl:template match="wpt">
<li><xsl:value-of select="groundspeak:cache/groundspeak:name"/></li>
</xsl:template>
</xsl:stylesheet>
And what I would expect is a list with one element on it, "Craft Maker Box", but what I get is an empty list.
What am I missing?
The default namespace is http://www.topografix.com/GPX/1/0. You should add that and use a prefix to match wpt.
Something like this: (untested)
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:groundspeak="http://www.groundspeak.com/cache/1/0"
xmlns:gpx="http://www.topografix.com/GPX/1/0"
>
<xsl:output method="html"/>
<xsl:template match="/">
Cache names:
<xsl:apply-templates select="//gpx:wpt">
</xsl:apply-templates>
</xsl:template>
<xsl:template match="gpx:wpt">
<li><xsl:value-of select="groundspeak:cache/groundspeak:name"/></li>
</xsl:template>
</xsl:stylesheet>
It is indeed a namespace issue. You have
xmlns="http://www.topografix.com/GPX/1/0"
in the XML so unprefixed element names are in this namespace. You need to bind the same uri to a prefix in your stylesheet, e.g.
xmlns:g="http://www.topografix.com/GPX/1/0"
and then use g:wpt in the match and select expressions.
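For comparison, here is the same lookup sketched with Python's standard-library xml.etree.ElementTree instead of XSLT, on a trimmed copy of the GPX input. Both namespaces need a prefix on the query side: one for the default GPX namespace and one for the Groundspeak extension. The prefixes g and gs are arbitrary choices.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's input, keeping both namespaces.
gpx = ET.fromstring(
    '<gpx xmlns="http://www.topografix.com/GPX/1/0" version="1.0">'
    '<wpt lat="45.348517" lon="-75.825933">'
    "<name>GC3HXAZ</name>"
    '<groundspeak:cache xmlns:groundspeak="http://www.groundspeak.com/cache/1/0">'
    "<groundspeak:name>Craft maker box</groundspeak:name>"
    "</groundspeak:cache>"
    "</wpt></gpx>"
)

ns = {
    "g": "http://www.topografix.com/GPX/1/0",
    "gs": "http://www.groundspeak.com/cache/1/0",
}

# Unprefixed wpt means "wpt in no namespace": nothing is selected.
no_prefix = gpx.findall("wpt")

# Prefixed steps select the waypoint and then the cache name inside it.
names = [w.find("gs:cache/gs:name", ns).text for w in gpx.findall("g:wpt", ns)]

print(no_prefix)
print(names)
```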

Processing an XML file with public doctype

I'm trying to process an SVG file with XSLT. I'm seeing behavior I don't understand that involves the doctype declaration.
Here are two tests I've done. The first one gives me the expected result and the second gives me a result I don't understand. (tested with saxon and xalan).
Stylesheet used for the two tests :
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
<xsl:output method="text" encoding="UTF-8"/>
<xsl:template match="text()" >
</xsl:template>
<xsl:template match="/">
<xsl:text>/</xsl:text>
<xsl:apply-templates />
</xsl:template>
<xsl:template match="svg">
<xsl:text>svg</xsl:text>
<xsl:apply-templates />
</xsl:template>
</xsl:stylesheet>
Test n°1
source file :
<?xml version="1.0"?>
<svg width="768" height="430">
</svg>
result :
/svg
Test n°2
source file :
<?xml version="1.0"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 20001102//EN"
"http://www.w3.org/TR/2000/CR-SVG-20001102/DTD/svg-20001102.dtd">
<svg width="768" height="430">
</svg>
result :
/
Why does the doctype declaration modify the behavior of the processing?
The SVG elements are in the SVG namespace.
The DTD defines this, so:
<xsl:template match="svg">
is matching an element with the name of svg, but in no namespace. All the elements in the XML document are in the SVG namespace and this template doesn't match any node.
This explains the output.
Solution: Replace the template matching svg with one that matches svg in the SVG namespace, as in the following transformation:
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:s="http://www.w3.org/2000/svg"
>
<xsl:output method="text" encoding="UTF-8"/>
<xsl:template match="text()" >
</xsl:template>
<xsl:template match="/">
<xsl:text>/</xsl:text>
<xsl:apply-templates />
</xsl:template>
<xsl:template match="s:svg">
<xsl:text >svg</xsl:text>
<xsl:apply-templates />
</xsl:template>
</xsl:stylesheet>
When this transformation is applied on the provided XML document:
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 20001102//EN"
"http://www.w3.org/TR/2000/CR-SVG-20001102/DTD/svg-20001102.dtd">
<svg width="768" height="430" >
</svg>
the wanted result is produced:
/svg
Update:
Several people asked me, "How can a DTD set a (default) namespace?"
Here is an answer: XML, and DTDs with it, became a W3C Recommendation before namespaces did. In pre-namespace XML, a namespace declaration is simply an attribute.
DTDs can specify "default attributes": attributes that may be omitted from an instance but will be added automatically with a default value.
So, one way to define a default namespace in a DTD is to define an xmlns default attribute for the top element of the document.
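The mechanism can be reproduced without the real SVG DTD. The sketch below assumes Python's standard-library xml.etree.ElementTree, whose underlying parser applies attribute defaults declared in the internal DTD subset; the one-line ATTLIST is a simplified stand-in for what the external SVG DTD is described as doing, not a copy of it. With an external DTD, as in the question, the processor has to fetch and read the DTD for the same effect.

```python
import xml.etree.ElementTree as ET

# No DTD: the element is in no namespace.
plain = ET.fromstring('<svg width="768" height="430"/>')

# Internal subset declaring a fixed default for the xmlns attribute.
# The instance itself still carries no namespace declaration.
with_dtd = ET.fromstring(
    "<!DOCTYPE svg ["
    '<!ATTLIST svg xmlns CDATA #FIXED "http://www.w3.org/2000/svg">'
    "]>"
    '<svg width="768" height="430"/>'
)

print(plain.tag)
print(with_dtd.tag)
```

The two instances are identical apart from the doctype, yet only the second ends up in the SVG namespace, which matches the difference between Test n°1 and Test n°2.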