XSLT fn:transform() with stylesheet in eXist-db having xsl:import - xslt

I'm trying to use fn:transform() within a XSL stylesheet for the first time, where the stylesheet is stored in eXist-db:
transform(map{'stylesheet-location':'xmldb:exist:///db/sample.xsl', 'source-node':$xml})
sample.xml contains imports, e.g. <xsl:import href="functions.xsl"/>. It works when running the main XSL in oXygen. But when I run it in eXist-db, I get the following error:
exerr:ERROR Exception while transforming node: I/O error reported by XML parser processing file:/Applications/eXist-db.app/Contents/Resources/functions.xsl: /Applications/eXist-db.app/Contents/Resources/functions.xsl (No such file or directory) [at line 127, column 30]
So it looks like Saxon is looking for the imported XSL in the filesystem where the app is intalled, rather than in eXist-db, where sample.xsl is located. How can I get fn:transform() to call an XSL in eXist-db, and have the imports also come from eXist-db?
I tried adding 'stylesheet-base-uri':'xmldb:exist:///db/' to the map parameter of fn:transform(), but that did not resolve it.

This is just a guess, but "xmldb:exist:" isn't a URI scheme that's defined outside of eXist. I imagine that whoever setup Saxon to run inside eXist configured some sort of URI resolver for Saxon that knows how to resolve that scheme and return data.
My further guess is that they failed to configure the module URI resolver in the same way. The fact that you get a file: URI instead of an error makes me wonder if the resolver that has been installed for "xmldb:exist:" scheme URIs is also failing to set the base URI of the returned resource correctly.
My final guess is that it works if you do it from Oxygen because Oxygen does configure both the entity and module URI resolvers.
I suspect your only recourse is to report this back to the folks who maintain the Saxon integration in eXist.

Related

Can I force the XSLT collection function to treat all files as XSL?

I'm currently building an XSLT stylesheet used to document other XSLT stylesheets in a series of folders and sub-folders. My code pulls out specific details about variables, functions, etc and renders it in an output format. The sheets being read are created by a 3rd party product. Most of them have an XSL extension but some of them are proprietary extensions. I have some files with a DTCBS extension but they are just XSL stylesheets.
I'm currently loading the content of these files into a variable using the XSLT function "collection" as follows:
<xsl:variable name="Collection" select="collection(concat('file:///', encode-for-uri(replace($filePath, '\\', '/')),'?select=*.(xsl|dtcbs|xml);recurse=yes'))" as="node()*"/>
The variable works just fine if I use XSL|XML. But if I include the DTCBS extension, the variable blows up citing "the supplied value is xs:base64Binary".
If I manually put the xml declaration line at the top of my DTCBS file, the variable works fine. Those DTCBS files are auto-generated without the declaration line so I can't fix that, nor can I manually edit them each time I want to run my documenter code.
From what I can tell, because it's not an XSL extension, and the XML declaration line isn't present, the XSLT parser thinks it's base64 when it isn't.
I'm using Saxon as my XSLT parser and the Saxon documentation says it uses file extensions and http headers to detect the file type.
Does anyone know if there is a way to force collection() to treat every file as an XSL?
Tried adding the XML declaration line in the DTCBS file. This did correct the issue but I can't do this in all cases as I am trying to automate the entire thing.
I also renamed the DTCBS extension to XSL and the problem went away as well.
As well as Martin's suggestion, you can register content types with the Saxon configuration:
processor.getUnderlyingConfiguration()
.registerFileExtension("dtcbs", "application/xml");
This has been available since Saxon 9.7.
Try to add e.g. content-type=application/xml e.g. '?select=*.(xsl|dtcbs|xml);recurse=yes;content-type=application/xml'.

eXist-DB transformation failure with XSLT - where to find error log?

Environment: eXist 4.2.1 - xquery 3.1 - xslt 3.0 - TEI-XML document
Using the eXide interface, I am attempting to do a transformation of a TEI-XML document with an XSL file, with an output of HTML.
Until now I have been developing XML documents and their XSL transformations in Oxygen. Firing off the transformations in Oxygen or using a terminal, both have been working error free. Now I am preparing a web application using eXist (which will contains the thousands of TEI-XML docs).
I am trying to simply fire off the same transformation in eXist with the following xquery test:
let $result := transform:transform(doc("xmldb:exist://db/apps/deheresi/resources/documents/ms609_0001.xml"), doc("xmldb:exist://db/apps/deheresi/resources/documents/document_style.xsl"), ())
return $result?output
eXide returns me only this:
exerr:ERROR Unable to set up transformer: Stylesheet compilation failed: 62 errors reported [at line 3, column 16]
I'm new at eXist DB and have not been able to figure out how to get the reasons for errors.
How do I access the error details (detail log?) in eXist? (I have searched without success my books and online documentation; for example https://exist-db.org/exist/apps/doc/xsl-transform doesn't help at all on errors).
For Oxygen and terminal transformations, I use Saxon 9he. I understand that eXist uses the same?
NB: my documents are all organized in an eXist collection identical to the setup on my computer, thus all relative locations should function correctly?
First - when using doc and collection functions for the paths in the database you don't need the XML:DB URI, instead you can just use:
transform:transform(doc("/db/apps/deheresi/resources/documents/ms609_0001.xml"),
doc("/db/apps/deheresi/resources/documents/document_style.xsl"), ())
The errors should be in exist.log the default location for that is $EXIST_HOME/webapp/WEB-INF/logs. You might otherwise find them on the "Standard Out" of the terminal session which is running eXist-db.
If you are using the YAJSW (Service Wrapper) to run eXist-db you might also need to check $EXIST_HOME/tools/yajsw/logs.

How to break caching on exist-db of included XSLs in Transform

I have a large set of XSLs that we recently went through and implemented a shared XSL template with common bits. We included an xsl:include in all the main XSLs now to pull these in. We had no issues at first until we started to make changes to the shared XSL.
For information, the whole system is web based, calling queries to dynamically format documents in the database given different XSLs through XSL FO and RenderX.
The main transform is:
let $fo := util:expand(transform:transform($articles, doc("/db/Customer/data/edit/xsl/Custbatch.xsl"), $parameters))
That XSL (Custbatch.xsl) has:
<xsl:include href="Custshared.v1.xsl"/>
If we make an edit to "Custshared.v1.xsl" is not reflected in the result because it is obvious that "Custshared.v1.xsl" is being cached and used. We know this because as you can see the name now includes "v1". If we make a change and change all the references say from v1 to v2, it all works. But this seems a bit ridiculous as that means we have to change the 18 XSLs that include this XSL or do something silly like restart the database.
So, what am I missing in the setup or controller.xql (which has the following on all not matched paths), to get things not to cache. I assume that is all internal so this setting likely does not matter. Is there some other setting in the config that does?
<dispatch xmlns="http://exist.sourceforge.net/NS/exist">
<cache-control cache="no"/>
</dispatch>
In reading the document here: http://exist-db.org/exist/apps/doc/xsl-transform.xml, it states:
"The stylesheet will be compiled into a template using the standard Java APIs (javax.xml.transform). The template is shared between all instances of the function and will only be reloaded if modified since its last invocation."
However, if I change an included XSL, it is not being used.
Update #1
I even went as far as creating a query that returns the XSL that is included, then I use:
<xsl:include href="http://localhost/get-include-xsl.xq"/>
This does work as formatting is not broken, but changing the underlying XSL yields the same result. So even that Xquery result is cached.
Update #2
And yes, through some simple test all is proven.
If I make any change to the root template (like add a meaningless space) and run, it does include the changes made in the include. If I only change the included XSL, no changes happen.
So lacking anything else, we could always write a Xquery that basically touches all the main templates after a change is made to the include template. Seems so wrong as a workaround.
Update #3
So the workaround we are currently using is that we have an unused "variable" in the XSL (version) and when we update the shared template, we execute that query which basically updates the value in that variable. At least it's only one XQuery and maybe we should attach to a trigger.
There is a setting in $exist-db-root$/conf.xml for the XSL transformer where you can turn off caching: <transformer class="net.sf.saxon.TransformerFactoryImpl" caching="no"> (The default is 'yes')

Cannot include sub xsl files within main xsl file based on if statement [duplicate]

The <xsl:import> and <xsl:include> elements seem to behave quite specific.
What I am trying to do:
<xsl:import href="{$base}/themes/{/settings/active_theme}/styles.xsl" />
I want to allow loading different themes for my application. I have a settings in my App which stores the "currently active theme" folder name in a xml node.
Unfortunately the code above won't work.
Does anybody know about a workaround to achieve what I want to do?
edit:
just confirmed with a XSLT guru via Twitter... there's no nice way of doing this. Easiest solution in my case will probably be to seperate frontend and backend stylesheets and load them individually to the XSLTProcessor...
xsl:import assembles the stylesheet prior to execution. The stylesheet can't modify itself while it is executing, which is what you are trying to achieve.
If you have three variants of a stylesheet for use in different circumstances, represented by three modules A.xsl, B.xsl, and C.xsl, then instead of trying to import one of these into the module common.xsl that contains all the common code, you need to invert the structure: each of A.xsl, B.xsl, and C.xsl should import common.xsl, and you should select A.xsl, B.xsl, or C.xsl as the principal stylesheet module when initiating the transformation.
What I am trying to do:
<xsl:import href="{$base}/themes/{/settings/active_theme}/styles.xsl" />
This isn't allowed in any version (1.0, 2.0, or 3.0) of XSLT.
In XSLT 2.0 (and up) one may use the use-when attribute, but the conditions that may be specified are very limited.
One non-XSLT solution is to load the importing XSLT stylesheet as an XmlDocument and use the DOM API to set href attribute to the really wanted value -- only then invoke the transformation.

what to do when XSL transformation namespace page is offline?

XSL requires this at the top of every stylesheet:
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
and throws an error if the url in the namespace is not exactly right.
Today, "http://www.w3.org/1999/XSL/Transform" is offline. I cannot run any transformations. The transformation hangs and then returns "unexpected end of file" when the net request times out. If I change the URL in the namespace declaration to a random URL, the transformation fails with an error telling me that "http://www.w3.org/1999/XSL/Transform" is the required xsl namespace.
So how do I work around W3's site being down?
Using xmlns:something="..." declares an XML namespace. Such a namespace is merely a string, something that will help to attribute a unique meaning to element names like template or href, making sure multiple XML-based languages can be used in a single document without creating confusion as to its meaning.
Some of those namespaces are reserved for use by the W3C. The XSLT namespace is one of those. A proper XSLT processor will check if a stylesheet declares the correct namespace to make sure there can be no incorrect interpretation. The root element of the stylesheet should be in that XSLT namespace.
For an actual namespace value, you'd usually have a URI (and most often a URL) since that's normally a good unique identifier. However, this should never be used to actually resolve to any online resources during XML processing. Whereas HTTP URLs are normally treated in a case-insensitive manner and may make use of URL encoding for characters (e.g. space becomes %20), such resolution or equality of URLs is not checked in XML namespace processing. A namespace in XML is nothing but a string that's always checked in its exact form, casing and everything.
So if an XSLT processor complains that some resource at a URL cannot be found, then either it's doing something it shouldn't do, or the problem has nothing to do with namespace processing.
You're using Saxon, which most definitely isn't a processor that doesn't understand the concept of a namespace. Its father is Michael Kay who is also responsible for the XSLT 2.0 spec. But Saxon does support schema-aware XSLT processing. If a document specifies a schema location, then a processor using this for validation would actually use that location to get the schema. That's the difference with a namespace. DTDs and XML Schema locations can definitely result in network activity.
So I advise you to check if...
the XML uses a DTD with external definitions and whether those are available;
the XML specifies a schema location and whether that location can be reached;
the stylesheet makes use of a schema or some other external resource and whether that's available.
Once you've found the cause, look into the use of XML catalogs in conjunction with the processor. An XML catalog will allow you to use local resources if they can't be resolved from their URIs.
Simple answer: The http://www.w3.org/1999/XSL/Transform isn't a URL, it's just a string. If W3C had decided, there's no reason it couldn't have been 'ThisIsAnXsltStylesheet'. By convention, they usually resemble URL's, but this isn't required.
So, the fact that there's nothing at that URL isn't relevant to why your stylesheet is failing, and certainly won't be the cause. Logically speaking, if that were the case, then nobody without an internet connection would ever be able to use XSLT, and w3c's servers would be seriously overworked.
I'd recommend adding the first few lines of your XSLT into your question; it might shed some light on where your problem really is.