Multiple XSLT files in a single pipeline with ant - xslt

I have multiple XSLT files that I'm using to process my source XML in a pipeline. I know about the trick with exsl:node-set but after having some issues with this workflow, I took the decision to split the various passes into separate XSL files. I'm much happier with the structure of the files now and the workflow works fine in Eclipse. Our release system works with ant. I can process the files like this:
<xslt basedir="src-xml" style="src-xml/preprocess_1.xsl" in="src-xml/original.xml" out="src-xml/temp_1.xml" />
<xslt basedir="src-xml" style="src-xml/preprocess_2.xsl" in="src-xml/temp_1.xml" out="src-xml/temp_2.xml" />
<xslt basedir="src-xml" style="src-xml/preprocess_3.xsl" in="src-xml/temp_2.xml" out="src-xml/temp_3.xml" />
<xslt basedir="src-xml" style="src-xml/finaloutput.xsl" in="src-xml/temp_3.xml" out="${finaloutput}" />
But this method, going via multiple files on disk, seems inefficient. Is there a better way of doing this with ant?
Update following Dimitre's suggestion
I've created myself a wrapper around the various other XSLs, as follows:
<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform' xmlns:fn='http://www.w3.org/2005/xpath-functions' xmlns:exslt="http://exslt.org/common">
<xsl:import href="preprocess_1.xsl"/>
<xsl:import href="preprocess_2.xsl"/>
<xsl:import href="preprocess_3.xsl"/>
<xsl:import href="finaloutput.xsl"/>
<xsl:output method="text" />
<xsl:template match="/">
<xsl:apply-imports />
</xsl:template>
</xsl:stylesheet>
This has... not worked well. It looks like the document had not been preprocessed before the final output XSL ran. I should perhaps have been clearer here: the preprocess XSL files modify the document, adding attributes and the like. preprocess_3 works on the output of preprocess_2, which in turn works on the output of preprocess_1. Is this import solution still appropriate? If so, what am I missing?

The more efficient method is to perform a single, multipass transformation.
The files can remain as they are -- they will be imported using xsl:import instructions.
The savings are obvious:
Just one initiation (loading of the XSLT processor).
Just one termination.
Eliminates the intermediate files (three in your build: temp_1.xml, temp_2.xml and temp_3.xml) and the cost of creating, writing, closing and deleting them.
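On the Ant side, the four xslt calls from the question would then collapse into one. A minimal sketch, assuming a wrapper stylesheet named pipeline.xsl (a hypothetical name, not from the question) that imports the pass stylesheets and chains them itself; the property value is also just a placeholder:

```xml
<project name="xslt-pipeline" default="transform">
  <!-- Placeholder value; the real build presumably defines ${finaloutput} elsewhere -->
  <property name="finaloutput" value="build/final.txt"/>

  <target name="transform">
    <!-- Single processor run: pipeline.xsl (assumed name) drives all passes in memory -->
    <xslt style="src-xml/pipeline.xsl"
          in="src-xml/original.xml"
          out="${finaloutput}"/>
  </target>
</project>
```

Note that importing alone does not chain the passes; the wrapper still has to feed each pass's result into the next one.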

Hmm, you say "I know about the trick with exsl:node-set", but you don't use it in your attempt ("Update following Dimitre's suggestion"). In case you don't know it, or for others (like me) who don't know how to perform a multipass transformation, here is a nice article: Multipass processing.
The drawback of this approach is that it requires engine-specific XSL code. If you know the engine, you could try this. If you don't, you could try the solutions from "result tree fragment to node-set: generic approach for all xsl engines".
Looking at these sources one conclusion is sure: your current solution is more readable. But you are seeking efficiency, so some readability may be sacrificed.
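For readers who have not seen the node-set trick, here is a minimal two-pass sketch in XSLT 1.0. It assumes a processor that supports the EXSLT common extension (xsltproc, Xalan and Saxon 6.5 do, as far as I know). The element names item and entry, the attribute seen, and the mode names pass1 and pass2 are made up for illustration. With separate files, each pass would have to use its own mode like this for xsl:import to be of any help.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    exclude-result-prefixes="exsl">

  <xsl:output method="xml" indent="yes"/>

  <!-- Driver: capture the result of pass 1, then run pass 2 over it -->
  <xsl:template match="/">
    <xsl:variable name="pass1">
      <xsl:apply-templates select="node()" mode="pass1"/>
    </xsl:variable>
    <!-- In XSLT 1.0 $pass1 is a result tree fragment; exsl:node-set() makes it navigable -->
    <xsl:apply-templates select="exsl:node-set($pass1)/node()" mode="pass2"/>
  </xsl:template>

  <!-- Pass 1: identity copy ... -->
  <xsl:template match="@*|node()" mode="pass1">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" mode="pass1"/>
    </xsl:copy>
  </xsl:template>

  <!-- ... except that every (hypothetical) item element gets a marker attribute -->
  <xsl:template match="item" mode="pass1">
    <xsl:copy>
      <xsl:attribute name="seen">yes</xsl:attribute>
      <xsl:apply-templates select="@*|node()" mode="pass1"/>
    </xsl:copy>
  </xsl:template>

  <!-- Pass 2: identity copy ... -->
  <xsl:template match="@*|node()" mode="pass2">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" mode="pass2"/>
    </xsl:copy>
  </xsl:template>

  <!-- ... except that marked items are renamed, relying on the attribute added in pass 1 -->
  <xsl:template match="item[@seen='yes']" mode="pass2">
    <entry>
      <xsl:apply-templates select="@*[name() != 'seen']|node()" mode="pass2"/>
    </entry>
  </xsl:template>
</xsl:stylesheet>
```

The second pass only matches because the first pass added the attribute, which is exactly the dependency between passes described in the question.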

Related

Running eXist-db XQuery in Saxon

What is the recommended way in Saxon to load in an XML document from eXist-db via XQuery GET/POST within an XSL stylesheet? I want to run an XQL query in eXist-db, which should be simple enough to do as a GET with <xsl:variable name="test" select="doc('xmldb:exist:///db/test.xql')"/> or <xsl:variable name="test" select="doc('http://localhost:8080/exist/rest/db/test.xql')"/>. But the former doesn't execute the query and tries to return the XQL source as XML, and the latter doesn't have the basic authentication to execute. Also, I really want to send an XML fragment using POST, and have the XQL use that posted XML fragment.
I can't find anything in the Saxon documentation about this. I did find an old EXPath article at http://expath.org/modules/http-client/samples, but the downloads there are 7 years old, and may not work with modern Saxon. So looking for the best known method to do this.
The first thing that comes to mind is the EXPath HTTP Client module. There's no way to persuade the doc() or document() functions to do POST instead of GET, AFAIK.
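A rough sketch of what a POST with basic authentication might look like with that module. It assumes an implementation of the EXPath HTTP Client is installed and registered with your Saxon setup (not shown here, and it depends on the Saxon version and edition). USER, PASSWORD and the fragment element are placeholders.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:http="http://expath.org/ns/http-client"
    exclude-result-prefixes="http">

  <xsl:template match="/">
    <!-- The request is described as an http:request element -->
    <xsl:variable name="request" as="element(http:request)">
      <http:request method="post"
                    href="http://localhost:8080/exist/rest/db/test.xql"
                    username="USER" password="PASSWORD"
                    auth-method="basic" send-authorization="true">
        <!-- Placeholder XML fragment sent as the POST body -->
        <http:body media-type="application/xml">
          <fragment>payload</fragment>
        </http:body>
      </http:request>
    </xsl:variable>

    <!-- Item 1 of the result is the http:response element, later items are the body parts -->
    <xsl:variable name="response" select="http:send-request($request)"/>
    <xsl:copy-of select="$response[2]"/>
  </xsl:template>
</xsl:stylesheet>
```

How the query reads the posted fragment on the eXist-db side is a separate matter and is not covered by this sketch.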

Controlling JRebel package scope

I am trying to speed up execution of code being debugged with JRebel. In particular, I notice that framework code is slow. I am wondering whether I can tell JRebel to ignore certain packages, in much the same way that we can set up JProfiler to ignore certain packages and patterns.
You most definitely can.
Use a system property (or add to jrebel.properties) meant just for that purpose. More information at JRebel agent properties.
-Drebel.exclude_packages=PACKAGE1,PACKAGE2,...
Specify the excluded packages in rebel.xml using Ant-styled patterns. More information at rebel.xml configuration.
<?xml version="1.0" encoding="UTF-8"?>
<application xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.zeroturnaround.com" xsi:schemaLocation="http://www.zeroturnaround.com http://update.zeroturnaround.com/jrebel/rebel-2_1.xsd">
<classpath>
<dir name="/path/to/module/build/directory/root">
<exclude name="com/yourapp/package1/internal/**"/>
</dir>
</classpath>
</application>
Both ways work similarly, but since the second one lets you customize each module individually, it is generally preferred.

xslt: from base xml to intermediate xml to final output

This is a follow up on my earlier question xslt split mp3 tag into artist and title
I'll try to phrase it in generic terms because I think that will give me a better understanding of XSLT: what to use it for and how to use it with the appropriate XSLT idioms.
This is what I want:
input XML -> intermediate XML -> ... -> final transformation
Or in other words: how can I pipeline various XML transformations in one XSLT document?
My command-line analogy would be to have multiple command line tools that perform parts of the solution, then have them execute in succeeding order using pipes.
In this specific case:
input XML (with element) -> intermediate XML (with separate and element) -> final XML sorted by ,
I'm limited to one XSLT document as the web-tool at hand does not even allow xsl:include or xsl:import to succeed.
Three approaches that come readily to mind are:
Use operating-system pipelines:
xsltproc ss1.xsl input.xml \
| xsltproc ss2.xsl - \
| xsltproc ss3.xsl - \
> output.xml
The primary downside I'm aware of here is that not all processors have command-line interfaces that make it easy to read the main input tree on stdin. So when I do this, I sometimes end up writing temporary files; fortunately, disk space is cheap. Upside: you probably already know how to do this.
Use XProc pipelines.
Primary downside: you have to learn a new technology. Primary upside: you get to learn a new technology, which is actually quite cool.
Define different modes for the different operations and use XSLT 2.0 (or an XSLT 1.0 processor with some form of the node-set extension) to process the data:
<xsl:template match="/">
<xsl:variable name="tree1">
<xsl:apply-templates mode="mode1"/>
</xsl:variable>
<xsl:variable name="tree2">
<xsl:apply-templates mode="mode2" select="$tree1"/>
</xsl:variable>
<xsl:apply-templates mode="mode3" select="$tree2"/>
</xsl:template>
Upside: it's all in a single stylesheet, so you never have to puzzle out how to run the process, when you come back to it six months later. (And the phrasing of your question says that this is the answer you really want.) Downside: it's all in a single stylesheet, so you have to work harder to achieve modularity and separation of concerns.
There are doubtless other approaches as well.
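To give an idea of the XProc option, the same three-stylesheet chain could be sketched as an XProc 1.0 pipeline. The stylesheet names ss1.xsl to ss3.xsl are taken from the shell example above; you would need an XProc processor (XML Calabash, for instance) to run it, and the exact invocation depends on the processor.

```xml
<p:pipeline xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
  <!-- p:pipeline implicitly declares the source, result and parameters ports. -->
  <!-- Each p:xslt step reads the previous step's result on its primary source port. -->
  <p:xslt>
    <p:input port="stylesheet">
      <p:document href="ss1.xsl"/>
    </p:input>
  </p:xslt>
  <p:xslt>
    <p:input port="stylesheet">
      <p:document href="ss2.xsl"/>
    </p:input>
  </p:xslt>
  <p:xslt>
    <p:input port="stylesheet">
      <p:document href="ss3.xsl"/>
    </p:input>
  </p:xslt>
</p:pipeline>
```

No intermediate files are written; the documents flow from step to step inside the processor.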

Is there a tool to check if an XSLT stylesheet follows coding standards?

How can we check whether an XSLT stylesheet is following all the coding standards? Is there a tool where we can specify our own rules and find out if the stylesheet conforms?
First of all, there is no such thing as "all coding standards"...
That said, take a look at XSLT Lint, developed by Mukul Gandhi and published in Dec. 2008:
http://lists.xml.org/archives/xml-dev/200812/msg00178.html
There is also one published recently by Andriy Gerasika:
http://www.biglist.com/lists/lists.mulberrytech.com/xsl-list/archives/201102/msg00103.html
In case you are interested in functional programming with XSLT, take a look at FXSL.
Finally, if by "all coding standards" you mean "style", you may look at my answers in the xslt tag of SO, to learn a little bit more about "push style" and programming without explicit logical instructions.
Well, if the stylesheet works, it should be valid, right?
Other than that, I think all the big XML IDEs like Altova's XMLSpy provide some sort of schema validation, if that's what you're looking for.
An XSL stylesheet is an XML document, so the validation tools available for XML still apply here. A DTD (or a schema for the XSLT namespace) can be used to define the rules to check. Reference the location of the DTD or schema from the XSL sheet; tools like Xerces can then be used to validate the document.
If you are using Ant, the xmlvalidate task will do this automatically by invoking the Xerces SAXParser.
<target name="validate" if="perform-validation-dtd">
<xmlvalidate file="${input-xml}"
classname="org.apache.xerces.parsers.SAXParser"/>
</target>
The xmllint program validates an XML file against its DTD (Document Type Definition) and reports any differences.
Well, an XSLT stylesheet is XML itself of course; you could easily write an XSLT that looks for patterns.
For example:
<xsl:template match="xsl:for-each">
<xsl:text>Inappropriate use of xsl:for-each; should be using templates instead</xsl:text>
</xsl:template>
if your policy includes not using xsl:for-each.
Or, you could write a schema that expands on the xslt one.
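Fleshed out a little, the pattern-matching idea could look like the sketch below: a complete stylesheet that you run with the stylesheet under review as its input document. The rule (no xsl:for-each) and the message wording are only examples; add one template per rule of your own policy.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- The report is plain text, one line per finding -->
  <xsl:output method="text"/>

  <!-- Example rule: flag every xsl:for-each in the stylesheet being checked -->
  <xsl:template match="xsl:for-each">
    <xsl:text>Inappropriate use of xsl:for-each (select="</xsl:text>
    <xsl:value-of select="@select"/>
    <xsl:text>"); should be using templates instead&#10;</xsl:text>
    <!-- Keep descending so nested occurrences are reported too -->
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Suppress the text of the checked stylesheet so only findings are printed -->
  <xsl:template match="text()"/>
</xsl:stylesheet>
```

An empty output then means the checked stylesheet passed every rule.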

XSLT Unit testing

Does anyone know of a way to write unit tests for the XSLT transformation?
I've a lot of XSLT files and it's getting harder to test them manually. We have an example XML and can compare it to the resulting output XML from the XSL transformation. However, I'm looking for a better test method.
I am currently looking for some good options to do this as well. As a result, I came across this question, and a few other potential candidate solutions. Admittedly, I haven't tried any of them yet, so I can't speak to their quality, but at least they are some other avenues potentially worthy of researching.
Jeni Tennison's Unit Testing Package
UTF-X Unit Testing Framework
Juxy
XTC
Additionally, I found the following article to be informative in terms of a general methodology for unit testing XSLT.
Unit test XSL transformations
Try XSpec, a testing framework for XSLT. It allows you to write tests declaratively, and test templates and functions.
Looks like Oxygen editor has Unit Testing available as well. It "provides XSLT Unit Test support based on XSpec".
I haven't tried it myself, but will soon.
Here are a few simple solutions:
Use xsltproc with a mock XML file:
xsltproc test.xsl mock.xml
XSLT Cookbook - Chapter 13
Create a document() placeholder variable and comment/uncomment it manually:
<xsl:variable name="Data" select="descendant-or-self::node()"/>
<!--
<xsl:variable name="Data" select="document('foo.xml')" />
-->
<xsl:if test="$Data/pagename='foo'">
<p>hi</p>
</xsl:if>
Create a condition to swap the comment programmatically:
<xsl:variable name="Data">
<xsl:choose>
<!-- If source XML is inline -->
<xsl:when test="descendant-or-self::node()/pageName='foo'">
<!-- copy-of keeps the nodes; value-of would flatten them to a string -->
<xsl:copy-of select="descendant-or-self::node()"/>
</xsl:when>
<!-- If source XML is external -->
<xsl:otherwise>
<xsl:copy-of select="document('foo.xml')" />
</xsl:otherwise>
</xsl:choose>
</xsl:variable>
Use a shell script to inline the data programmatically in the build to automate the tests completely.
We have been using Java-based unit test cases, in which we provide the input XML string to be transformed with some XSL and the XML string expected after the transformation.
Refer to the following classes if you want to explore more.
org.custommonkey.xmlunit.Transform
org.custommonkey.xmlunit.Diff
org.custommonkey.xmlunit.DetailedDiff
I'm using this tool: jxsltunit.
The test is defined by an XML file which is then passed to the tool. This is an example of the test configuration:
<xsltTestsuite xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="jxsltunit jxslttestsuite.xsd" xmlns="jxsltunit"
description="Testsuite Test"
xml="min-test.xml"
xslt="min-test.xslt"
path="pa > ch">
<xsltTestcase match_number="0">
<![CDATA[<ch>child 1</ch>]]>
</xsltTestcase>
<xsltTestcase match_number="1">
<![CDATA[<ch>child 2</ch>]]>
</xsltTestcase>
</xsltTestsuite>
It takes the XML, the XSL and a path into the transformed XML, which is what gets tested. The path can match a list of elements, which are then identified by their index.
One benefit of this tool is that it can output the results as a JUnit XML file. This file can be picked up by your Jenkins to show the XSLT tests in your test results. Just add the call to the tool as a build step.
Try Jeni Tennison's Unit Testing Package (XSpec), which is a unit test and behaviour-driven development (BDD) framework for XSLT, XQuery, and Schematron. It is based on RSpec, a BDD framework for Ruby.
With XSpec you can test template by template or XPath by XPath, as you need.
For an overview of how to install, write and run tests, see https://github.com/xspec/xspec/wiki/What-is-XSpec
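To give a flavour of it, a minimal XSpec test might look like the sketch below. The stylesheet name rename-items.xsl and the item and entry elements are invented for the example; the test assumes the stylesheet turns each item into an entry with the same content.

```xml
<x:description xmlns:x="http://www.jenitennison.com/xslt/xspec"
               stylesheet="rename-items.xsl">
  <!-- rename-items.xsl is a hypothetical stylesheet under test -->
  <x:scenario label="when processing an item element">
    <!-- Input fragment the templates are applied to -->
    <x:context>
      <item>child 1</item>
    </x:context>
    <!-- Expected result of the transformation -->
    <x:expect label="it is renamed to entry">
      <entry>child 1</entry>
    </x:expect>
  </x:scenario>
</x:description>
```

Each scenario pairs an input with an expectation, so a failing template shows up as a named failing scenario in the report.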