XSLT: Is there a way to "inherit" canned functionality?

i am once again having to cobble together a bit of XSLT in order to turn generated XML into HTML (rather than simply generating HTML).
i'm having huge déjà vu again. i'm once again having to solve basic problems, e.g.:
how to convert characters into valid html entity references
how to preserve whitespace/carriage returns when converting to html
how to convert to HTML as opposed to xhtml
how to convert dates from xml format into presentable format
how to tear apart strings with substring
This is all stuff that i've solved many times before. But every time i come back to XSLT i have to start from scratch, re-inventing the wheel every time.
If it were a programming language i would have a library of canned functions and procedures i can call. i would have subroutines to perform the commonly repeated tasks. i would inherit from a base class that already implements the ugly boilerplate stuff.
Is there any way in XSLT to grow, expand and improve the ecosystem with canned code?

> This is all stuff that i've solved many times before. But every time i come back to XSLT i have to start from scratch, re-inventing the wheel every time.
This isn't necessary, of course.
> If it were a programming language
Yes, XSLT is a programming language.
> i would have a library of canned functions and procedures i can call. i would have subroutines to perform the commonly repeated tasks.
Yes, you can do this in XSLT.
> i would inherit from a base class that already implements the ugly boilerplate stuff.
Yes, there is something quite similar in XSLT.
> Is there any way in XSLT to grow, expand and improve the ecosystem with canned code?
Even in XSLT 1.0 there are powerful, standard features that support reusability:
<xsl:import>
<xsl:include>
<xsl:apply-templates>
<xsl:call-template>
<xsl:apply-imports>
XSLT 2.0 adds a few even more powerful features:
<xsl:function>
Parameters for <xsl:apply-imports>
<xsl:next-match>
There have been several XSLT libraries for quite some time:
FXSL (1.x and 2.x) implements Higher-Order Functions in XSLT 1.0/2.0
FunctX -- a library of useful XSLT 2.0 and XQuery functions.
XPath 2.1 and XSLT 2.1 (working drafts that were later published as XPath 3.0 and XSLT 3.0) add higher-order functions as standard: functions become first-class values.
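As a minimal sketch of the reuse features listed above, a library stylesheet can hold named templates that any other stylesheet imports. The file names common.xsl and report.xsl, the template name format-date, and the order element with its date attribute are all hypothetical, chosen only for illustration; the date is assumed to be in the XML form YYYY-MM-DD.

```xml
<!-- common.xsl: a hypothetical library of reusable named templates -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Turns a date such as 2010-03-15 into 15/03/2010 -->
  <xsl:template name="format-date">
    <xsl:param name="date"/>
    <xsl:value-of select="concat(substring($date, 9, 2), '/',
                                 substring($date, 6, 2), '/',
                                 substring($date, 1, 4))"/>
  </xsl:template>

</xsl:stylesheet>
```

```xml
<!-- report.xsl: imports the library and calls the canned template -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- xsl:import must come before any other top-level element -->
  <xsl:import href="common.xsl"/>
  <xsl:output method="html"/>

  <xsl:template match="order">
    <p>
      <xsl:text>Ordered on </xsl:text>
      <xsl:call-template name="format-date">
        <xsl:with-param name="date" select="@date"/>
      </xsl:call-template>
    </p>
  </xsl:template>

</xsl:stylesheet>
```

Templates in the importing stylesheet take precedence over imported ones, so report.xsl can override anything in common.xsl and still reach the overridden version with <xsl:apply-imports>. That is the closest XSLT 1.0 comes to inheriting from a base class.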

Related

Upgrading XSLT 1.0 to XSLT 2.0

What is involved in upgrading from XSLT 1.0 to 2.0?
1 - What are the possible reasons for upgrading?
2 - What are the possible reasons for NOT upgrading?
3 - And finally, what are the steps to upgrading?
I'm hoping for an executive summary--the short version :)
What is involved in upgrading from XSLT 1.0 to 2.0?
1 - What are the possible reasons for upgrading?
If you are an XSLT programmer, you'll benefit greatly from the more convenient and expressive XSLT 2.0 language, together with XPath 2.0 and the new XDM (XPath Data Model).
You may want to watch this XSLT 2.0 Pluralsight course to get a firm and systematic understanding of the power of XSLT 2.0.
You have:
Strong typing and all XSD types available.
The ability to define your own (schema) types.
The XPath 2.0 sequence type, which has no counterpart in XPath 1.0.
The ability to define and write functions in pure XSLT -- the xsl:function instruction.
Range variables in XPath expressions (the for clause).
Much better and more powerful string processing -- XPath 2.0 supports regular expressions in its tokenize(), matches() and replace() functions.
Much better and more powerful string processing -- XSLT 2.0 support for regular expressions -- the xsl:analyze-string, xsl:matching-substring and xsl:non-matching-substring new XSLT instructions.
More convenient, powerful and expressive grouping: the xsl:for-each-group instruction.
A lot of new, very powerful XPath 2.0 functions -- such as the functions on date, time and duration, just to name a few.
The new XPath operators intersect, except, is, >>, <<, some, every, instance of, castable as, ..., etc.
The general XPath operators >, <, etc. now work on any ordered value type (not only on numbers as in XPath 1.0).
New, safer value comparison operators: lt, le, eq, gt, ge, ne.
The XPath 2.0 to operator, which allows constructs such as xsl:for-each select="1 to $N".
These, and many other improvements/new features significantly increase the productivity of any XSLT programmer, which allows XSLT 2.0 development to be finished in a small fraction of the time necessary for developing the same modules with XSLT 1.0.
Strong typing allows many errors to be caught at compile time and to be corrected immediately. For me this strong type-safety is the biggest advantage of using XSLT 2.0.
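To make a few of the listed features concrete, here is a small sketch combining xsl:function, declared types, a for expression, tokenize() and xsl:for-each-group. The my: namespace, the function name my:initials and the people/person input with name and city attributes are assumptions made up for this example.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.com/my-functions"
    exclude-result-prefixes="xs my">

  <!-- A typed, reusable function written in pure XSLT -->
  <xsl:function name="my:initials" as="xs:string">
    <xsl:param name="name" as="xs:string"/>
    <!-- Take the first letter of every whitespace-separated word -->
    <xsl:sequence select="string-join(
        for $word in tokenize(normalize-space($name), ' ')
        return upper-case(substring($word, 1, 1)), '')"/>
  </xsl:function>

  <xsl:template match="/people">
    <groups>
      <!-- Grouping without the key-based workaround needed in 1.0 -->
      <xsl:for-each-group select="person" group-by="@city">
        <city name="{current-grouping-key()}"
              count="{count(current-group())}">
          <xsl:for-each select="current-group()">
            <person initials="{my:initials(@name)}"/>
          </xsl:for-each>
        </city>
      </xsl:for-each-group>
    </groups>
  </xsl:template>
</xsl:stylesheet>
```

Because the parameter is declared as xs:string, calling my:initials() on a person without a name attribute is reported as a type error instead of silently producing an empty result.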
2 - What are the possible reasons for NOT upgrading?
It is often possible, reasonable and cost-efficient to leave existing, legacy XSLT 1.0 applications untouched and to continue using them with XSLT 1.0, while at the same time developing only new applications using XSLT 2.0.
Your management + any other non-technical reasons.
Having a lot of legacy XSLT 1.0 applications written in a poor style (e.g. using DOE, i.e. disable-output-escaping, or extension functions), so that the code would now need to be rewritten and refactored.
Not having available an XSLT 2.0 processor.
3 - And finally, what are the steps to upgrading?
Change the version attribute of the xsl:stylesheet or xsl:transform element from "1.0" to "2.0".
Remove any calls to xxx:node-set() extension functions.
Remove any use of DOE (disable-output-escaping).
Be ready for the surprise that xsl:value-of now outputs not just the first, but all items of a sequence.
Try to use the new xsl:sequence instruction as much as possible -- use it to replace any xsl:copy-of instructions; use it instead of xsl:value-of any time when the type of the output isn't string or text node.
Test extensively.
When the testing has verified that the code works as expected, start refactoring (if deemed necessary). It is a good idea to declare types for any variables, parameters, templates and functions. Doing so may reveal new, hidden errors and fixing them increases the quality of your code.
Optionally, decide which named templates to rewrite as xsl:function.
Decide if you still need some extension functions that are used in the old version, or you can rewrite them easily using the new, powerful capabilities of XSLT.
Final remarks: Not all of the above steps are necessary; one can stop and declare the migration successful as soon as testing reveals no bugs. It is much cleaner to start using all XSLT 2.0/XPath 2.0 features in new projects.
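Two of the steps above, dropping node-set() and the changed xsl:value-of behaviour, can be seen in one short sketch. The variable name colours and its content are invented for the example.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- A temporary tree built inside the stylesheet -->
  <xsl:variable name="colours">
    <colour>red</colour>
    <colour>green</colour>
  </xsl:variable>

  <xsl:template match="/">
    <list>
      <!-- XSLT 1.0 needed something like
           select="exsl:node-set($colours)/colour" here -->
      <xsl:for-each select="$colours/colour">
        <item><xsl:value-of select="."/></item>
      </xsl:for-each>
      <!-- value-of now emits every selected item; under 1.0 rules
           only the first selected node would be output -->
      <all><xsl:value-of select="$colours/colour" separator=", "/></all>
    </list>
  </xsl:template>
</xsl:stylesheet>
```

The separator attribute is new in 2.0; without it the items are joined with a single space.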
Dimitre's answer is very comprehensive and 100% accurate (as always) but there is one point I would add. When upgrading to a 2.0 processor, you have a choice of leaving the version attribute set to "1.0" and running in "backwards compatibility mode", or changing the version attribute to "2.0". People often ask which approach is recommended.
My advice is, if you have a good set of tests for your stylesheets, take the plunge: set version="2.0", run the tests, and if there are any problems, fix them. Usually the problems will be code that was never quite right in the first place and only worked by accident. But if you don't have a good set of tests and are concerned about the reliability of your workload, then leaving version="1.0" is a lower-risk approach: the processor will then emulate all the quirks of XSLT 1.0, such as xsl:value-of ignoring all but the first item, and the strange rules for comparing numbers with strings.

When to use for-each and when to use apply-templates in xslt?

I've heard that most of the time it's usually possible (and better) to use apply-templates rather than for-each when writing an XSLT. Is this true? If so, what are the benefits of using apply-templates?
Using <xsl:for-each> is in no way harmful if one knows exactly how an <xsl:for-each> is processed.
The trouble is that many newcomers to XSLT who have experience in imperative programming take <xsl:for-each> as a substitute for a "loop" in their favorite programming language and think that it allows them to do the impossible, such as incrementing a counter or otherwise modifying an already defined <xsl:variable>.
One indispensable use of <xsl:for-each> in XSLT 1.0 is to change the current document. This is often needed in order to use the key() function on a document other than the current source XML document, for example to efficiently access a lookup table that resides in its own XML document.
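A sketch of that lookup idiom, assuming a hypothetical countries.xml that contains country elements with code and name attributes, and customer elements in the main input:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Indexes the country elements of whichever document is current -->
  <xsl:key name="country-by-code" match="country" use="@code"/>

  <xsl:variable name="lookup" select="document('countries.xml')"/>

  <xsl:template match="customer">
    <!-- Capture the code before the context changes -->
    <xsl:variable name="code" select="@country"/>
    <customer name="{@name}">
      <!-- for-each over a single node: its only job is to make the
           lookup document current, so that key() searches it -->
      <xsl:for-each select="$lookup">
        <xsl:value-of select="key('country-by-code', $code)/@name"/>
      </xsl:for-each>
    </customer>
  </xsl:template>
</xsl:stylesheet>
```

Nothing is being looped over here; in XSLT 1.0 key() always searches the document that contains the context node, and <xsl:for-each> is simply the way to switch that node.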
On the other hand, using <xsl:template> and <xsl:apply-templates> is much more powerful and elegant.
Here are some of the most important differences between the two approaches:
xsl:apply-templates is much richer and deeper than xsl:for-each, if only because we don't know what code will be applied to the nodes of the selection: in the general case this code will be different for different nodes of the node-list.
The code that will be applied can be written long after the xsl:apply-templates was written, and by people who do not know the original author.
The FXSL library's implementation of higher-order functions (HOF) in XSLT wouldn't be possible if XSLT didn't have the <xsl:apply-templates> instruction.
Summary: Templates and the <xsl:apply-templates> instruction are how XSLT implements and deals with polymorphism.
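A small sketch of that polymorphism, using made-up element names (doc, para, note, warning): the template for doc never says how its children are rendered, and a new element type is supported by adding a template, not by editing a loop.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="doc">
    <div>
      <!-- No decision is made here about how each child is rendered -->
      <xsl:apply-templates/>
    </div>
  </xsl:template>

  <xsl:template match="para">
    <p><xsl:apply-templates/></p>
  </xsl:template>

  <xsl:template match="note">
    <p class="note"><xsl:apply-templates/></p>
  </xsl:template>

  <!-- A later author can support a new element by adding a template
       such as this one, without touching the code above -->
  <xsl:template match="warning">
    <p class="warning"><xsl:apply-templates/></p>
  </xsl:template>
</xsl:stylesheet>
```

With <xsl:for-each select="*"> in the doc template, the same behaviour would need an <xsl:choose> that has to be edited for every new element type.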
Reference: See this whole thread: http://www.stylusstudio.com/xsllist/200411/post60540.html

Does LINQ to XML replace XSLT?

Is there anything you can do in XSLT that can't be done in LINQ to XML? Is it still important to learn XSLT? When would you choose one over the other?
Is there anything you can do in XSLT that can't be done in Linq to XML?
No, since LINQ to XML is an API used by Turing-complete programming languages, and covers more of XML Infoset than XSLT document model does (e.g. you can fully control the difference between text and CDATA nodes in L2X).
Is it still important to learn XSLT?
Depends on what you're doing. Broadly speaking, yes.
When would you choose one over the other?
XSLT is generally better when you need to do a transformation - i.e. both input and output are XML. There are a number of reasons for that. First of all, XSLT pattern matching is usually more concise than nested ?: in L2X queries, and far more readable. You can also use * to great effect to set up a default rule (like "copy everything", or "process children but do not generate output"), and then add rules for specific nodes you need to process in a special way - thus you do not need to write explicit loops/comprehensions for each node level in the document, as you often do in L2X. Finally, XPath is also more concise than L2X queries (at least in C#), so if you do a lot of non-trivial querying, it's likely to be far shorter and more readable in XSLT.
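The "copy everything" default rule mentioned above is usually written as the identity template, with special cases added beside it. The element names b and internal-note below are just placeholders.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Default rule: copy every node and attribute unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Special case: rename b to strong, keeping its content -->
  <xsl:template match="b">
    <strong><xsl:apply-templates select="@*|node()"/></strong>
  </xsl:template>

  <!-- Special case: drop internal-note elements altogether -->
  <xsl:template match="internal-note"/>
</xsl:stylesheet>
```

No loop over document levels appears anywhere; the more specific templates win over the default rule wherever they match, at any depth.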
L2X is generally better when you need to quickly query a document for some value or node. The main advantage here is that there's less runtime overhead (XPath needs to be parsed, L2X query does not), and you don't need to mess with XmlNamespaceManager and other cruft - the API is streamlined for writing single-expression queries. As well, having nested from loops and let brings it closer to XQuery territory.
L2X is also the only choice when you need an in-place update of the document, and may be better when you only need to replace a few values in the document, and in-place update is an option - since XSLT doesn't let you touch the input in any way.
It is definitely still important to learn XSLT. LINQ to XML is great, but its use is limited to .NET apps.
XSLT can be applied across languages and platforms...even browsers can take XML and apply an XSLT to generate an output.
Don't forget that some .NET application APIs (CMS systems, for example) still require you to supply XSLT to transform internal XML into an output. Ignoring the technology altogether would be, in my opinion, a real mistake.
Not for anyone not using .NET

Two concepts from XSLT in other languages: apply-templates and xpath

Background: Having given up on the practical daily use of XSLT as a part of my programming toolkit, I was wondering if there were any implementations in other languages of the (only) two things I miss about that tool:
the ability to traverse data structures using "path" style statements via xpath
the ability to traverse template transformations using apply-templates instead of via an iterative or "looping" approach.
According to Google there are a couple of efforts out there to add "xpath-style" support to Javascript, but these have not apparently caught on very much. So far I haven't found anything where someone uses an "apply-templates" approach in another language.
Question: Does anyone out there know of a programming language (hopefully one that is main-stream) that steals these two good ideas from XSLT, or applies the same or similar concepts using a different method?
the ability to traverse data structures using "path" style statements via xpath
I'm not aware of any other language that embeds XPath, but LINQ to XML is somewhat similar, particularly in its VB syntactic sugar incarnation. You could implement it in Common Lisp macros, or D templates, however.
the ability to traverse template transformations using apply-templates instead of via an iterative or "looping" approach.
No mainstream languages that I know of. Indeed, this feature is probably the main reason to use XSLT (and not e.g. XQuery, looking at closely related languages).
It's effectively extensible dynamic dispatch on receiver on arbitrary conditions - as such, I think you could probably do it in Common Lisp (CLOS, to be specific) - if I remember correctly, its multimethods can match arbitrary conditions, so if you have an XPath pattern evaluator, you could use it to emulate apply-templates, and even more - since apply-templates only dispatches on a single argument, while CLOS multimethods dispatch on multiple arguments.
XPath, while essential to making XSLT work, is independent of it; libraries like libxml give you it for free. The style of template application you describe is a little trickier; that's what you would normally use XSLT for.
Any programming language that does this should be functional. You could try writing your own, less-verbose, XSLT dialect; Perl also may give you enough rope to emulate this feature convincingly (although the performance implications are unclear).
The tough answer, though, is that this doesn't really exist, except as libraries for already existing languages.
For XPath, definitely. For C++, there's Xalan-C++, for Java javax.xml.xpath (with multiple implementations), and C# has XPathNavigator and SelectNodes. If you want to use XPath for object hierarchies, look at JXPath.
For the template transformations, you should look at C#'s LINQ if you haven't already. It's not exactly the same thing, but it allows processing objects without explicit looping.
I have found nothing like that. But why would anybody use anything else to transform XML ? XSLT does a perfect job once you understand the non procedural way of developing solutions. Our applications are largely XSLT based and it is a really powerful tool.
A comment on your first requirement:
the ability to traverse data structures using "path" style statements via xpath
XPath makes a lot of assumptions on the data structure. If you're going to use it, you might as well convert your structure to XML because it's going to look like it anyway once you make it traversable via some XPath-like language unless you severely limit your XPath subset.
Also, keep in mind that the "only two things" that you are missing, XPath and template processing, are in fact a huge part of what makes up XSLT. I'm curious why you decided to take it off your tool belt.
In spite of the fact that you wanted an XSLT alternative, I would still recommend XSLT, and XSLT 2.0 in particular. With the addition of unparsed-text() and xsl:analyze-string you have a powerful text processing language. For example, take a look at a CSV to XML stylesheet. Even though JSON isn't regular, you'd still be able to write a simple JSON to XML translator using recursive templates and transform the result at will.
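As a rough sketch of the text-processing side (not the CSV stylesheet referred to above), the following reads a plain text file with unparsed-text() and splits it with tokenize(). The parameter name csv-uri, the file name data.csv and the entry template name main are assumptions; quoted fields containing commas are not handled, which is where xsl:analyze-string would come in.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- Hypothetical input location, resolved relative to the stylesheet -->
  <xsl:param name="csv-uri" as="xs:string" select="'data.csv'"/>

  <!-- Run with "main" as the initial template; no XML input is needed -->
  <xsl:template name="main">
    <rows>
      <!-- Split into lines and skip the blank ones -->
      <xsl:for-each
          select="tokenize(unparsed-text($csv-uri), '\r?\n')[normalize-space()]">
        <row>
          <!-- Naive split: assumes no commas inside quoted fields -->
          <xsl:for-each select="tokenize(., ',')">
            <cell><xsl:value-of select="."/></cell>
          </xsl:for-each>
        </row>
      </xsl:for-each>
    </rows>
  </xsl:template>
</xsl:stylesheet>
```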

Is XSLT a functional programming language?

Several questions about functional programming languages have got me thinking about whether XSLT is a functional programming language. If not, what features are missing? Has XSLT 2.0 shortened or closed the gap?
XSLT is declarative as opposed to stateful.
Although XSLT is based on functional programming ideas, it is not a full functional programming language: it lacks the ability to treat functions as a first-class data type. It does have elements such as lazy evaluation, which avoids unneeded evaluation, and the absence of explicit loops.
Like a functional language though, I would think that it can be nicely parallelized with automatic safe multi threading across several processors.
From Wikipedia on XSLT:
> As a language, XSLT is influenced by functional languages, and by text-based pattern matching languages like SNOBOL and awk. Its most direct predecessor was DSSSL, a language that performed the same function for SGML that XSLT performs for XML. XSLT can also be considered as a template processor.
Here is a great site on using XSLT as a functional language with the help of FXSL. FXSL is a library that implements support for higher-order functions.
Because of FXSL I don't think that XSLT has a need to be fully functional itself. Perhaps FXSL will be included as a W3C standard in the future, but I have no evidence of this.
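The general idea behind passing "functions" around in XSLT 1.0 can be sketched as follows: a node stands for the function, and <xsl:apply-templates> on that node runs its body. This only illustrates the technique and is not FXSL's actual code; the f: namespace, the element names, the mode name call and the numbers/n input are all invented for the example.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:f="http://example.com/functions"
    exclude-result-prefixes="f">

  <!-- Each empty element stands for one "function" -->
  <f:double/>
  <f:square/>

  <!-- The body of each function is the template matching its element -->
  <xsl:template match="f:double" mode="call">
    <xsl:param name="arg"/>
    <xsl:value-of select="2 * $arg"/>
  </xsl:template>

  <xsl:template match="f:square" mode="call">
    <xsl:param name="arg"/>
    <xsl:value-of select="$arg * $arg"/>
  </xsl:template>

  <!-- A generic "map": applies whatever function node it is given -->
  <xsl:template name="map">
    <xsl:param name="fun"/>
    <xsl:param name="items"/>
    <xsl:for-each select="$items">
      <item>
        <xsl:apply-templates select="$fun" mode="call">
          <xsl:with-param name="arg" select="."/>
        </xsl:apply-templates>
      </item>
    </xsl:for-each>
  </xsl:template>

  <xsl:template match="/numbers">
    <squares>
      <xsl:call-template name="map">
        <!-- document('') is the stylesheet itself -->
        <xsl:with-param name="fun" select="document('')/*/f:square"/>
        <xsl:with-param name="items" select="n"/>
      </xsl:call-template>
    </squares>
  </xsl:template>
</xsl:stylesheet>
```

Swapping f:square for f:double in the call changes the behaviour of map without changing map itself, which is what a higher-order function is for.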
I am sure you guys have found this link by now :-) http://fxsl.sourceforge.net/articles/FuncProg/Functional%20Programming.html .
Well, functions in XSLT are first-class citizens with some workarounds after all :-)
That is sort of how it feels when I am programming it.
XSLT is entirely based on defining functions and applying them to selected events that come down the input stream.
XSLT lets you set a variable. Functional programming does not allow functions to have side effects - and that is a biggie.
Still, when writing XSLT, one has the same "feel" as when working in an FP fashion. You are working with input - you are not changing it - to create output.
This is a very, very different programming model from that used when working with the DOM API. DOM does not separate input and output at all. You are handed a data structure - and you mangle it how you see fit - without hesitation, restriction, or remorse.
Suffice it to say if you like FP and the principles behind it, you will probably feel comfortable working in it. Just like experience with event driven programming - and XML itself - will make you comfortable with it as well.
If your only experience is with top-down, non event driven programs - then XSLT will be very unfamiliar, alien landscape indeed. At least at first. Growing a little experience and then coming back to XSLT when XPath expressions and event-handling are really comfortable to you will pay off handsomely.
For the most part, what makes XSLT not a 100% functional programming language is its inability to treat functions as a first-class data type.
There may be some others -- but that's the obvious answer.
Good luck!
Saxon-SA has introduced some extension functions which make XSLT functional. You can use saxon:function() to create a function value (actually a {http://net.sf.saxon/java-type}net.sf.saxon.expr.UserFunctionCall value) which you then call with saxon:call().
Saxon-B has similar functionality with the pairing of saxon:expression() and saxon:eval(). The difference is that saxon:expression() takes any XPath expression, and saxon:eval() evaluates it, whereas saxon:function() takes the name of a function which saxon:call() calls.
That is not really an argument, since you can only declare variables, not change their values after declaration. In that sense it is declarative not imperative style, as stated in Mr Novatchev's article.
Functional programming languages like Scheme or Erlang enable you to declare variables as well, and in Haskell you can also do that:
-- 'test' takes a list xs and adds x to every element of it
test :: [Int] -> [Int]
test xs = map (+ x) xs
  where x = 2
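For comparison, here is a rough XSLT counterpart of the Haskell snippet. It assumes an XSLT 3.0 processor that supports higher-order functions (an optional feature in that version), so it does not apply to XSLT 1.0 or 2.0. The template name main and the variable names are made up for the example.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- Declared once, never reassigned: the counterpart of "where x = 2" -->
  <xsl:variable name="x" as="xs:integer" select="2"/>

  <!-- Run with the named template "main" as the entry point -->
  <xsl:template name="main">
    <!-- An inline function is an ordinary value that closes over $x -->
    <xsl:variable name="add-x"
        select="function($n as xs:integer) as xs:integer { $n + $x }"/>
    <result>
      <!-- fn:for-each applies the function to every item: 3 4 5 -->
      <xsl:value-of select="for-each((1, 2, 3), $add-x)"/>
    </result>
  </xsl:template>
</xsl:stylesheet>
```

As in the Haskell version, the variable is bound once and only read afterwards.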