XSLT help in declaring namespace - xslt

I need some help resolving the following issue. I need to transform the input XML below into the output XML shown after it.
<Header>
<End_Date xsi:nil="true"/>
</Header>
To the following format.
<Header>
<End_Date xsi:nil="true" xmlns:xsi="http://www.w3.org/2001/XMLSchema"/>
</Header>
This is the stylesheet:
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="xs">
<xsl:output method="xml" encoding="UTF-8" indent="yes"/>
<xsl:template match="/">
<HEADER>
<xsl:for-each select="HEADER">
<xsl:sequence select="(./@node(), ./node())"/>
</xsl:for-each>
</HEADER>
</xsl:template>
</xsl:stylesheet>
Thanks in advance.
Gabriel

Am I right in thinking you want to reproduce a nearly exact copy of the input XML, with the addition of the xsi namespace declaration that is lacking from the input?
First, as it stands, your input is not well-formed XML, precisely because the xsi namespace declaration is missing. Hence, there's no way to use XSLT to add it: any XSLT processor will choke on the ill-formed input.
Second, you have to check case sensitivity: currently, no input nodes are matched by the <xsl:for-each select="HEADER"> select expression. If you change it to "Header", the template rule will indeed replace the <Header> input with <HEADER>, whose content is copied identically. But... only if you have the namespace declarations in the input right...
So, if the purpose is indeed to 'upgrade' non-well-formed XML to a well-formed version, I'd suggest to look for other tools, such as Perl, Awk, or any other simple search/replace solution that operates on plain text and could just add the missing namespace declaration to the document element:
<Header xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<End_Date xsi:nil="true"/>
</Header>
(Of course, you could also make use of XSLT 2.0's unparsed-text($href) function that lets you read any file as unparsed text, which you could then further process with <xsl:analyze-string>. See Michael Kay's article Up-conversion using XSLT 2.0 for further inspiration. Since this is a rather awkward way to process non-XML with an XML tool, I give this merely for completeness -- if adding the namespace declaration is the only problem to be solved, I'd definitely go for a cheaper search/replace option.)
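As a rough illustration of the plain-text route, here is a minimal Python sketch (Python is my choice here, standing in for the Perl or Awk mentioned above). It assumes the document element is called Header, as in the question, and that the declaration is simply absent; the function names are made up for this example.

```python
import re
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"


def add_xsi_declaration(text, root_name="Header"):
    """Insert xmlns:xsi into the first <root_name> start tag, working on plain text."""
    if "xmlns:xsi=" in text:
        return text  # already declared, nothing to do
    # "<Header" followed by whitespace or ">" (so "</Header>" is never matched)
    start_tag = re.compile("<" + re.escape(root_name) + r"(?=[\s>])")
    return start_tag.sub('<%s xmlns:xsi="%s"' % (root_name, XSI_NS), text, count=1)


def is_well_formed(text):
    """Return True if a namespace-aware XML parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


broken = '<Header>\n<End_Date xsi:nil="true"/>\n</Header>'
fixed = add_xsi_declaration(broken)
print(is_well_formed(broken), is_well_formed(fixed))
```

Once the text has been repaired this way, it can be handed to an XSLT processor like any other XML document.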
Hope this helps,
Ron

Related

Namespace result in xsl:result-document appears on children instead of parent element

I, like many, am having trouble understanding how to control some XSLT namespace declarations in XSLT output. I'm using a recent version of the Saxon XSLT 2.0 processor in Java. I've been able to find solutions to most of the issues I was having with <xsl:output> namespace declarations, but I'm having trouble with a <xsl:result-document> namespace declaration. I'm using XSL to create portions of epub3 files.
Following is the pertinent part of my XSLT 2.0 file
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
xmlns="http://www.w3.org/1999/xhtml"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:epub="http://www.idpf.org/2007/opf"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:my="my:functions"
exclude-result-prefixes="xs xsl dc epub my"
version="2.0">
<xsl:output name="xhtml" method="xhtml" indent="yes"/>
<xsl:output name="xml" method="xml" indent="no"/>
<!--content.opf -->
<xsl:result-document href="{concat('ePubs/ePub',project/bookAbbrev,'/OEBPS/content.opf')}" format="xml" indent="yes">
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" xml:lang="en" unique-identifier="pub-identifier">
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
<dc:identifier id="pub-identifier">temporary-<xsl:value-of select="/project/bookAbbrev"/>1</dc:identifier>
<dc:title><xsl:value-of select="/project/bookTitle"/>-v1</dc:title>
<dc:language>en</dc:language>
<dc:creator id="creator">MDB</dc:creator>
<dc:subject>history</dc:subject>
<dc:date>2017-06-26</dc:date>
<meta name="cover" content="cover-image"/>
<meta property="dcterms:modified"><xsl:value-of select="format-dateTime(current-dateTime(),'[Y0001]-[M01]-[D01]T[H01]:[m01]:[s01]Z')"/> </meta>
</metadata>
</package>
</xsl:result-document>
<!-- end content.opf -->
</xsl:stylesheet>
I've not included the XML file, since I don't think it is needed in this example. I've also removed other result-document sections.
The desired output of this .opf file would be the following:
<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf"
version="3.0"
xml:lang="en"
unique-identifier="pub-identifier">
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
<dc:identifier id="pub-identifier">temporary-WB-A1</dc:identifier>
<dc:title>Christian County Kentucky Will Book A-v1</dc:title>
<dc:language>en</dc:language>
<dc:creator id="creator">MDB</dc:creator>
<dc:subject>history</dc:subject>
<dc:date>2017-06-26</dc:date>
<meta name="cover" content="cover-image"/>
<meta property="dcterms:modified">2017-07-17T16:44:57Z</meta>
</metadata>
</package>
But instead of the <metadata> element holding the xmlns:dc namespace declaration, all the children have the xmlns:dc declaration and the parent no declaration as follows:
<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf"
version="3.0"
xml:lang="en"
unique-identifier="pub-identifier">
<metadata>
<dc:identifier xmlns:dc="http://purl.org/dc/elements/1.1/" id="pub-identifier">temporary-WB-A1</dc:identifier>
<dc:title xmlns:dc="http://purl.org/dc/elements/1.1/">Christian County Kentucky Will Book A-v1</dc:title>
<dc:language xmlns:dc="http://purl.org/dc/elements/1.1/">en</dc:language>
<dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/" id="creator">MDB</dc:creator>
<dc:subject xmlns:dc="http://purl.org/dc/elements/1.1/">history</dc:subject>
<dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2017-06-26</dc:date>
<meta name="cover" content="cover-image"/>
<meta property="dcterms:modified">2017-07-17T16:44:57Z</meta>
</metadata>
</package>
Any help or explanation would be appreciated. I've searched the forum multiple times, but I don't think the solutions I've found in other posts are quite identical to this problem. I hope the info posted in my question is enough to make the problem clear. I'm novice to intermediate with XSL experience so may have used some inappropriate terminology.
Thank you - Michael
Your desired output has a namespace declaration on an element where it is not actually needed. That's fine; but to achieve this you need to understand the rules for how namespaces are implicitly added to elements.
The element is constructed using a literal result element. The rule for literal result elements is that they have copies of all the namespaces that are in scope for the LRE in the stylesheet, other than excluded namespaces. In your case dc is an excluded namespace because it appears in the value of the exclude-result-prefixes attribute.
If you want to exclude this namespace for the package element but not for the metadata element, there are several options available:
(a) avoid declaring the namespace at the xsl:stylesheet level; declare it only where it is needed
(b) use the xsl:exclude-result-prefixes attribute locally on the literal result elements rather than (or as well as) on the xsl:stylesheet element. However, this would require a bit of reorganisation of your code, because the value is cumulative: if a prefix is excluded on one stylesheet element, then it is automatically excluded for its children and descendants. You would have to move the construction of the metadata element into a named template so it is not lexically contained within the package literal result element.
(c) use the xsl:namespace instruction to explicitly add the namespace to the metadata element.
(d) construct the metadata element using xsl:element rather than using an LRE, and then remove 'dc' from the list of excluded result prefixes (exclude-result-prefixes applies only to elements created using an LRE).
I think the simplest solution is (a). Unless there are things in your real stylesheet that aren't exposed by this sample, the declaration of the dc namespace at stylesheet level is unnecessary.
Note that none of this has anything to do with the use of xsl:result-document.

XSLT Namespace troubles

In the next step of a project I am working on I am having a problem with namespace statements in an xslt file. I admit that the problem is likely identical to that found in this question: Filemaker XSL Importing blank fields. However, I'm not able to understand the solution there and feel that perhaps the answer may be a bit more simplistic, i.e. I've mucked up the syntax somehow.
The xml I'm working with is:
<?xml version="1.0" encoding="utf-8" ?>
<ledesxml xmlns="http://www.ledes.org/ledes20.xsd">
<firm>
<lf_vendor_id>test</lf_vendor_id>
</firm>
</ledesxml>
The xslt I'm currently using is:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
xmlns="http://www.ledes.org/ledes2000.xsd"
xmlns:t="http://www.ledes.org/ledes2000.xsd"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<FMPXMLRESULT xmlns="http://www.filemaker.com/fmpxmlresult">
<METADATA>
<FIELD NAME="lf_vendor_id" TYPE="TEXT"/>
</METADATA>
<RESULTSET>
<ROW>
<COL><DATA><xsl:value-of select="/t:ledesxml/t:firm/t:lf_vendor_id"/></DATA></COL>
</ROW>
</RESULTSET>
</FMPXMLRESULT>
</xsl:template>
</xsl:stylesheet>
The import to Filemaker results in a new record without any data. The xml input here is an industry standard and doesn't change (at least for present purposes).
The use of name spaces here is a bit confusing and is based almost entirely on the namespaces used in the question linked above. Using a wild card in the "value-of select" statement does work, but as you might expect, grabs all the text in the xml sample and not just the data in which I am interested.
Since the import seems to work and the name space convention seems to have worked for another poster, I'm at a bit of a loss. Does anyone have some pointers as to where I've gone wrong?
The XML document declares xmlns="http://www.ledes.org/ledes20.xsd", while the XSLT declares xmlns:t="http://www.ledes.org/ledes2000.xsd", with ledes2000 instead of ledes20. Namespaces are matched by URI, not by prefix, so the t: steps in your XPath select nothing. You will need to use the same namespace URI in both documents.
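The effect of the mismatch can be checked outside FileMaker. The following is only a sketch using Python's standard ElementTree module rather than an XSLT processor, but it resolves prefixed names to namespace URIs in the same way; the vendor_id helper is a name invented for this example.

```python
import xml.etree.ElementTree as ET

# The sample input from the question (XML declaration omitted for brevity)
source = """<ledesxml xmlns="http://www.ledes.org/ledes20.xsd">
  <firm>
    <lf_vendor_id>test</lf_vendor_id>
  </firm>
</ledesxml>"""

root = ET.fromstring(source)  # root is the ledesxml element


def vendor_id(namespace_uri):
    """Look up firm/lf_vendor_id with the prefix t bound to the given URI."""
    node = root.find("t:firm/t:lf_vendor_id", {"t": namespace_uri})
    return None if node is None else node.text


# URI used in the stylesheet: nothing is selected
wrong = vendor_id("http://www.ledes.org/ledes2000.xsd")
# URI actually used in the input document: the element is found
right = vendor_id("http://www.ledes.org/ledes20.xsd")
print(wrong, right)
```

The prefix t is the same in both calls; only the URI it is bound to decides whether anything matches.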

Problems Trying to Pretty Print XSLT Output

This is my first post, so please let me know if I can make it more constructive in any way. I have read the forum guidelines, so if I inadvertently break them in any way it will be nothing more than an innocent mistake.
The Question
Is a simple one:
How do I pretty print the output of an XSL file?
But with some criteria:
Using only native XSL functionality.
Without having to use a second XSL file to do a 'second pass'.
It must also work for elements with mixed content.
I have googled this reasonably thoroughly but have not found a clear answer to this question. I have only used XSL for about a week so go easy if I have somehow missed the answer elsewhere.
An Example
This XML...
<email>
<attachedItem>priceless photograph.jpg</attachedItem>
<attachedItem>important document.doc</attachedItem>
<attachedItem>access codes.pdf</attachedItem>
</email>
...Transformed by this XSL...
<!-- Pretty Print Output -->
<xsl:strip-space elements="*"/>
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/">
<email>
"Please find attached the stuff."
<xsl:apply-templates/>
</email>
</xsl:template>
<xsl:template match="attachedItem">
<xsl:copy/>
</xsl:template>
...Produces this result...
<?xml version="1.0" encoding="utf-8"?>
<email>
"Please find attached the stuff."
<attachedItem>priceless photograph.jpg</attachedItem>
<attachedItem>important document.doc</attachedItem>
<attachedItem>access codes.pdf</attachedItem>
</email>
Using the Saxon6.5.5 engine
Desired Output
<?xml version="1.0" encoding="utf-8"?>
<email>
"Please find attached the stuff."
<attachedItem>priceless photograph.jpg</attachedItem>
<attachedItem>important document.doc</attachedItem>
<attachedItem>access codes.pdf</attachedItem>
</email>
My Own Progress on the Problem
From the XSL above you will see I have discovered the use of <xsl:strip-space> and <xsl:output>. This meets the first 2 criteria but not the 3rd. In other words, it produces nicely pretty-printed XML without the mixed content, but with it I receive the undesired output you can see above.
I know that the reason I get this output is because of the way whitespace is preserved in the source XML. White space is always preserved if it is part of a text node that contains other non-whitespace characters, regardless of the <xsl:strip-space> instructions. However despite my understanding I still cannot think of a solution.
Although I have addressed the first 2 criteria myself I would still like to know if this is the best way to achieve a pretty printed result.
Thanks in advance!
The following stylesheet produces exactly the output you request. The transformation was performed with Saxon 6.5.5. The correct indentation can only be achieved by meticulously typing all the line feed (&#10;) and space (&#32;) characters.
Note that pretty printing XML has no meaning when text content is concerned. The indentation of element tags can be easily controlled, but text nodes of elements with mixed content are always a problem. An application that takes XML as input should never rely on the exact indentation or whitespace handling of text content in XML.
In general, it is considered a bad idea to directly output literal text in an XSLT stylesheet. Always put text content inside xsl:text. xsl:strip-space has an effect only on whitespace-only text nodes of elements that belong to the input XML document (as suggested by @TobiasKlevenz already).
Stylesheet
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<!-- Pretty Print Output -->
<xsl:strip-space elements="*"/>
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/">
<email>
<xsl:text>
"Please find attached the stuff."
</xsl:text>
<xsl:apply-templates/>
</email>
</xsl:template>
<xsl:template match="attachedItem|text()">
<xsl:copy>
<xsl:apply-templates/>
</xsl:copy>
</xsl:template>
</xsl:transform>
Output
<?xml version="1.0" encoding="utf-8"?>
<email>
"Please find attached the stuff."
<attachedItem>priceless photograph.jpg</attachedItem>
<attachedItem>important document.doc</attachedItem>
<attachedItem>access codes.pdf</attachedItem>
</email>
You can wrap "Please find attached the stuff." in an <xsl:text> element, which should produce what I assume is your desired result. If not, please post a 'desired output' example.

XSLT - Extract and manipulate portion of XML data

The input XML:
<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Description><![CDATA[Audience: Andrew Reed, Senior Training Specialist, Microsoft Corporation<br/>This session is for individuals who spend significant time writing and creating documents and have some familiarity with Microsoft Word.<br/>Thanks.]]></Description>
</root>
The XSLT:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl">
<xsl:output method="html" indent="yes"/>
<xsl:template match="/root">
<div>
<xsl:value-of disable-output-escaping="yes" select="Description"/>
</div>
</xsl:template>
</xsl:stylesheet>
I need to add a couple of more BR tags after first occurrence of BR, that's after Audience line and before other description starts.
Can you please modify my XSLT to get the desired output?
So I want output like below:
Audience: Andrew Reed, Senior Training Specialist, Microsoft Corporation
This session is for individuals who spend significant time writing and creating documents and have some familiarity with Microsoft Word.
Thanks.
It would be nice if your input data had the <br/> elements as actual elements, instead of being escaped, so that they could be selected directly using XPath.
But since they are as they are, you can use regexp replace, relying on the assumption that they will always conform to a limited range of patterns. You will often be warned not to parse XML or HTML in general using regexps, and rightly so, because regexps aren't a general solution. But for limited uses they can be sufficient.
If I've guessed your requirements correctly, you could use something like
<xsl:value-of select="replace(Description, '&lt;[Bb][Rr] ?/?>', '&#10;')"/>
That will give you the sample output you showed, as opposed to adding a couple of more BR tags after first occurrence of BR. It will tolerate some variation, e.g. <br> or <BR />.
This is assuming you can use XSLT 2.0, because replace() isn't available in 1.0. If you're limited to 1.0, please let me know.
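To see which spellings of the tag the pattern accepts, the same regular expression can be tried outside XSLT. This is only a sketch in Python; its regex dialect differs from XPath 2.0's in general, though this simple pattern should behave the same in both. The sample string is a shortened version of the Description text.

```python
import re

# Same pattern as in the replace() call: "<br" in any case, optional space, optional slash
BR_PATTERN = re.compile(r"<[Bb][Rr] ?/?>")


def br_to_newline(text):
    """Replace every <br>-style tag matched by the pattern with a line feed."""
    return BR_PATTERN.sub("\n", text)


sample = "Audience: Andrew Reed<br/>This session is for individuals.<BR />Thanks."
result = br_to_newline(sample)
print(result)
```

Variants the pattern does not cover, such as a tag with two spaces before the slash or with attributes, are left untouched, which is the limitation mentioned above.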

XSL for-each loop is not working

I'm using Java to transform an XML document to text:
Transformer transformer = tFactory.newTransformer(stylesource);
transformer.transform(source, result);
This seems to work except when there are colons in the XML document. I tried this example:
XML file:
<?xml version="1.0" encoding="UTF-8"?>
<test:TEST >
<one.two:three id="my id" name="my name" description="my description" >
</one.two:three>
<one.two:three id="some id" name="some name" description="some description" />
</test:TEST>
XSL file:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xmi="http://www.omg.org/XMI"
xmlns:one.two="http://www.one.two/one.two:three" >
<xsl:output method="text" indent="yes" omit-xml-declaration="yes"/>
<xsl:variable name="myVariable">one.two:three</xsl:variable>
<xsl:template match="/">
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="*[substring(name(),1,9)='test:TEST']" >
<xsl:for-each select="./$myVariable">
inFirstLoop
</xsl:for-each>
<xsl:for-each select="./one.two:three">
inSecondLoop
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
The result of the transformation I'm getting is a single line:
inFirstLoop
I'm expecting 4 lines of output
inFirstLoop
inFirstLoop
inSecondLoop
inSecondLoop
How do I fix this? Any help is greatly appreciated. Thanks.
There are multiple things wrong here. I'm surprised your transformation managed to run at all, instead of failing on parse errors and other errors.
One big problem is that your input XML uses namespace prefixes (that's what the colons are for) without declaring them. Declarations like
xmlns:one.two="http://www.one.two/one.two:three"
need to be in the source XML, as well as in the XSL. Otherwise your source XML is not well-formed (according to namespace rules).
A second problem is the XPath expression
./$myVariable
which should have thrown an error. I think what you wanted was
*[name() = $myVariable]
The third change I would make is not an error in the XSLT, but just a poor way of doing things... If you want to match <test:TEST>, you should use namespace tools to refer to namespaces. Therefore, instead of
<xsl:template match="*[substring(name(),1,9)='test:TEST']" >
use
<xsl:template match="test:TEST">
Much cleaner. Then you need to put in a namespace declaration on the outermost element of the stylesheet, as you already have to do in the input XML document:
xmlns:test="...test..."
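The effect of the missing declarations can be checked with any namespace-aware parser. Here is a small sketch using Python's standard ElementTree module (not the Java/XSLT setup from the question). The URI bound to the test prefix is purely hypothetical, since the question does not give one; the one.two URI is copied from the stylesheet.

```python
import xml.etree.ElementTree as ET

TEST_NS = "http://example.com/test"  # hypothetical URI, not from the question
ONE_TWO_NS = "http://www.one.two/one.two:three"  # as declared in the stylesheet

undeclared = """<test:TEST>
  <one.two:three id="my id"/>
  <one.two:three id="some id"/>
</test:TEST>"""

# Same document with both prefixes declared on the outermost element
declared = undeclared.replace(
    "<test:TEST>",
    '<test:TEST xmlns:test="%s" xmlns:one.two="%s">' % (TEST_NS, ONE_TWO_NS),
    1,
)


def is_well_formed(text):
    """Return True if a namespace-aware XML parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


root = ET.fromstring(declared)
# ElementTree stores names as {namespace-uri}local-name
three_elements = [child for child in root if child.tag == "{%s}three" % ONE_TWO_NS]
print(is_well_formed(undeclared), is_well_formed(declared), len(three_elements))
```

With the declarations in place, elements are identified by namespace URI plus local name, which is what match="test:TEST" and select="one.two:three" rely on in the stylesheet.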
XML namespaces, like driving a car, are a topic better learned from a little training than by trial-and-error. Reading a brief article like this will help you avoid a lot of confusion and pain down the road.