Reading Environment Variables in an XSLT Stylesheet with Saxon - xslt

I'm trying to generate an XML file with my machine's hostname in some arbitrary element or attribute, e.g.
<hostname>myHostname</hostname>
I'm using Saxon 9.2. I can think of three ways to do this:
Read and parse /etc/sysconfig/network (I'm using Fedora)
Read the environment variable (as in $ echo $HOSTNAME)
Pass the hostname to Saxon and then somehow dereference a variable (not sure if this is possible)
Are any of these possible? I think the first option is most likely to work, but I think the other two options will produce less verbose XSLT.
I also have a related question:
Currently, I have an XSLT and a source XML file that generate a bunch of XML files, and this works as I expect. Is there any way I can selectively generate one file per host? That is, I want to say 'if the hostname is myHostName then generate the XML file for myHostName, if the hostname is myOtherHostName then generate the XML file for myOtherHostName'.
I ask because I'm trying to configure a large number of machines. If I could drop an XSLT and XML file on each, call the same command on every machine, and then get the right XML on each, it would be really convenient.

You should pass a parameter to your xslt when "calling" it. I think this is the most robust solution.
So at the top of your stylesheet you would have something like :
<xsl:param name="hostName"/>
You can then reference it in the stylesheet with the usual notation, $hostName.
All that remains is to pass the parameter when you invoke the XSLT processor. How you do that depends on how you run it (command line, Java API, and so on).
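A minimal sketch of the calling side, assuming the transformation is driven from Java through the standard javax.xml.transform (JAXP) API rather than from the Saxon command line. The class name HostParamDemo, the inline stylesheet, and the <config/> input are made up for illustration; only the hostName parameter comes from the answer above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class HostParamDemo {
    // Hypothetical stylesheet: declares the hostName parameter and writes it out.
    static final String XSL =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:param name='hostName'/>"
        + "<xsl:template match='/'>"
        + "<hostname><xsl:value-of select='$hostName'/></hostname>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Runs the stylesheet against a dummy input, supplying hostName as a parameter.
    static String render(String hostName) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSL)));
            t.setParameter("hostName", hostName);
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader("<config/>")),
                        new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render(args.length > 0 ? args[0] : "myHostname"));
    }
}
```

From the Saxon command line, stylesheet parameters are given as name=value pairs after the options, so the equivalent call would look something like this (jar and file names are placeholders):
java -jar saxon9he.jar -s:source.xml -xsl:style.xsl hostName=$HOSTNAME
Inside the stylesheet, the same parameter can then decide which output to produce, for example in an xsl:if test or a predicate that compares against $hostName.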

You can generate an XML file containing all needed parameters, then you can either pass it as parameter to the transformation (refer to the code samples to see examples of how this is done with Saxon).
Here is a link that can help: https://www.saxonica.com/html/documentation/javadoc/net/sf/saxon/expr/instruct/GlobalParameterSet.html
Or, more simply, save this XML file to the file system and pass only its path and name as a parameter to the transformation.
Then inside the transformation, use the standard XSLT function document() to load the XML document that contains the parameters.
Even further simplification is possible, if this file can be stored at a location that has exactly the same path on all machines. Then this avoids the need to pass this filepath as parameter to the transformation.

There are many possible ways of doing this: passing in parameters, reading the configuration file using the unparsed-text() function, calling an extension function.
But perhaps the most direct way is that Saxon 9.3 implements the new XPath 3.0 function get-environment-variable(). Support for XPath 3.0 requires Saxon-PE or higher.
(XPath 3.0 is of course still a draft and subject to change. In fact it has changed since Saxon 9.3 was released - the function has been renamed environment-variable()).
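Where XPath 3.0 support is not available, one alternative is to read the variable in the calling program and hand the value to the stylesheet as a parameter. The sketch below assumes a Java caller. The class name HostNameResolver and the fallback order are illustrative choices, not part of any answer here. Note that some shells set HOSTNAME without exporting it to child processes, which is why the sketch falls back to asking the network stack.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Map;

class HostNameResolver {
    // Picks a hostname from the given environment, falling back to the JVM's view.
    static String resolve(Map<String, String> env) {
        String fromEnv = env.get("HOSTNAME");          // typical on Linux shells
        if (fromEnv != null && !fromEnv.isEmpty()) {
            return fromEnv;
        }
        String fromWindows = env.get("COMPUTERNAME");  // typical on Windows
        if (fromWindows != null && !fromWindows.isEmpty()) {
            return fromWindows;
        }
        try {
            return InetAddress.getLocalHost().getHostName();
        } catch (UnknownHostException e) {
            return "localhost";                        // last-resort placeholder
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve(System.getenv()));
    }
}
```

The resolved value can then be supplied to the transformation as the hostName parameter, so the stylesheet itself stays independent of how the name was obtained.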

Related

Can you disable output of the main output file in Saxon/XSLT3?

I am using XSLT 3.0 as provided in Saxon to convert a single input XML file to a set of output files. All these output files are conceptually equivalent. Each output file is declared via the xsl:result-document instruction, as explained on https://www.w3.org/TR/2007/REC-xslt20-20070123/#element-result-document.
In this case, I don't need any main output, but Saxon is still creating such output file. Is it somehow possible to disable the main output in XSLT3 or Saxon?
I could use result-document for all desired output files, except the last one, and just use the main output for that one - but that feels odd.
It's a good question. XSLT 2.0 had a very complicated rule (in §2.4): "An implicit result tree is also created when the result sequence is empty, provided that no xsl:result-document instruction has been evaluated during the course of the transformation. In this situation the implicit result tree will consist of a document node with no children."
Interpreting "creating a result tree" as meaning that the corresponding serialised output file is written to filestore, this means that when the principal result tree is empty, the corresponding output file is written if and only if there are no secondary output files. Which is (I think) the effect that you are asking for.
This rule became increasingly unwieldy in XSLT 3.0 because of the greatly extended ways of invoking a stylesheet (e.g. by calling an initial public function), and it was therefore dropped; Saxon followed suit.
You can certainly avoid the file being written by supplying a result or destination that discards it rather than writing it to filestore (for example, in s9api, an XdmDestination). Achieving the same thing from the command line is not so easy; in fact, I'm not sure it can be done without writing some Java or C# code somewhere.
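To illustrate the general idea of "supplying a result that discards it", here is a rough sketch using the JDK's built-in JAXP API as a stand-in. It is not the s9api XdmDestination approach named above, and the JDK's built-in processor does not support xsl:result-document, so the sketch only shows the principal result being thrown away. The class and method names are hypothetical.

```java
import java.io.OutputStream;
import java.io.StringReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class DiscardPrincipalOutput {
    /** Sink that drops every byte it receives, keeping only a count. */
    static final class NullSink extends OutputStream {
        long discarded;
        @Override public void write(int b) { discarded++; }
        @Override public void write(byte[] buf, int off, int len) { discarded += len; }
    }

    // Runs an identity transformation and sends the principal result nowhere.
    // Returns how many bytes were thrown away, purely for demonstration.
    static long transformAndDiscard(String xml) {
        NullSink sink = new NullSink();
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(sink));
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
        return sink.discarded;
    }

    public static void main(String[] args) {
        System.out.println(transformAndDiscard("<root/>") + " bytes discarded");
    }
}
```

The point of the design is that the caller, not the stylesheet, decides where the principal result goes, so no file is ever created for it.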

document() function for a file on another computer/server

I understand the use of document() as follows.
<xsl:value-of select="document('path\to\document.xml')/RootElement/Element"/>
And this has to be a path relative to the parent XSL file. But what if I need to reference a file which is hosted on another server on the local network? I've tried such things as:
<xsl:value-of select="document('\\servername\path\to\document.xml')/RootElement/Element"/>
But this throws an error, because it looks in
C:\path\to\xsl\\servername\path\to\document.xml
Which of course doesn't exist.
This solution only relates to the Saxon-HE 9.4.0.3N XSLT processor, in the console application form, on Windows 7.
In my experimentation, I found that the document() function will accept file names or URIs. However I would avoid filenames because they need to be short-form. If you use long-form, the file-name will be rejected.
Suppose your document is ...
c:\path\to\document.xml
on server 'servername' which is mapped to drive 'j'.
To form a URI from this use as the document() parameter value...
file:///j:/path/to/document.xml
In relation to the URI, I was mistaken about Saxon not accepting long-form. This only applies to filenames. However, there are a number of gotchas...
Note the forward slashes. Backslashes will not work.
I have not found a way to build a workable file: URI with just UNC names. You need to make a drive mapping to a letter.
Any failure to open the document for any reason will be reported as the same error. With file system, there are so many things that can go wrong, that if you can't open the file, it is not safe to assume that the URI is wrong. There could be many mundane reasons why a file cannot be opened at a particular time.
Beware of firewall issues. These play a role.
Many text editors, such as Notepad++, assume that a text file with no BOM that is not in one of the two UTF-16 encodings is encoded in the system code page. Saxon defaults to UTF-8, so if the file contains a character such as ä saved in my code page, Saxon will spit the dummy and report that it is unable to open the file. (Aside: I'm not sure what my code page is. My o/s is Win7 and the current system locale is English (Australia). It is the system locale that determines the system code page.) The reason Saxon will not open the document is that ä encoded in such a code page results in a sequence of bytes which is not a valid UTF-8 sequence.
URI paths which are not URL paths are not supported by the underlying operating system. Saxon may well truthfully say that it supports URIs in relation to the document() function, but that doesn't boil any cabbages, because in practice you can't use them, at least not on the Windows family of operating systems.
Please ignore the MSDN page on the file protocol. The form of URL suggested on that page (with the | character etc) is not accepted by the Saxon document() function. Use the form that I have suggested above. I have tested it and it works.
Your understanding of document() is incorrect. It expects a URI, not a filename.
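The drive-letter-to-URI conversion described above can be sketched in Java as follows. This is an illustration only: the class and method names are hypothetical, and it deliberately handles just drive-letter paths, since the answer above reports no luck with UNC names.

```java
import java.net.URI;
import java.net.URISyntaxException;

class FileUriDemo {
    // Converts a Windows drive-letter path such as j:\path\to\document.xml
    // into a file: URI with forward slashes and percent-encoded spaces.
    static String toFileUri(String windowsPath) {
        String path = "/" + windowsPath.replace('\\', '/');
        try {
            // An empty host yields the file:///... form.
            return new URI("file", "", path, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Not a usable path: " + windowsPath, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toFileUri("j:\\path\\to\\document.xml"));
    }
}
```

The resulting string is what would be passed to document(), either written directly in the stylesheet or supplied as a stylesheet parameter.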

How can I identify line numbers in XSLT input or from XPath?

I am processing XML files using an XSLT stylesheet and wish to report the input line number when a given template is triggered. I can use a DOM (e.g. XOM in Java) which supports a SAX parser, so maybe I can use a Locator.
Alternatively, the XSLT could generate an XPath which could be applied to the original document and so, at least for a human, lead to the particular line.
(The actual application is to detect error conditions in the XML, which are searched for using XSLT)
Saxon has an extension for this. You can set an option when building the source tree to maintain line number information (e.g. -l on the command line), and if this was set, you can use the extension function saxon:line-number() to get the line number associated with an element node in the tree.
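If the check has to happen outside Saxon, the SAX Locator route mentioned in the question can be sketched with the JDK's built-in parser. The class name LineNumberDemo and the sample element names are made up. The locator reports the line on which each start tag ends, which is good enough for pointing a human at the right place.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

class LineNumberDemo {
    // Returns, for each element name, the line of its first occurrence.
    static Map<String, Integer> elementLines(String xml) {
        final Map<String, Integer> lines = new LinkedHashMap<>();
        DefaultHandler handler = new DefaultHandler() {
            private Locator locator;

            @Override
            public void setDocumentLocator(Locator l) {
                locator = l;  // supplied by the parser before parsing starts
            }

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                lines.putIfAbsent(qName, locator.getLineNumber());
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(elementLines("<doc>\n  <item/>\n  <bad/>\n</doc>"));
    }
}
```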

XSLT to convert an XML element containing RTF data to HTML?

OK, so here's the background:
We have a third-party piece of software that does a lot of complicated stuff to generate an XML file from a lot of tables based on a wide array of business rules. The software allows you to apply an XSL transformation by supplying an XSLT file as part of its workflow, before continuing on in the process, which is usually an upload to one or more servers, based on more business rules.
Here's the problem:
One of the elements (with more on the way) this application is processing contains RTF text, and needs to be converted into formatted HTML before being uploaded. There are no means of transforming the XML inside the application other than through an XSLT file, and once we output the file, we cannot resume the workflow. My original thought was, "Easy! someone must have written a few XSL transforms for converting RTF to formatted HTML!" Hours of searching later, I must conclude I either suck at searching or it's awfully obscure.
Disclaimers:
I know the software is pretty darned limited; I'm stuck with it.
I know there are a lot of third-party tools to do this; they are not available to me because I would need to run them externally.
I know that this is not a pretty or efficient thing to do with XSLT. Changing that is not an option for me at this point.
If I cannot find a means to do this through pure XSL transforms, I will need to output the files locally, run the extra process, and take the destination routing on through a custom process. I really don't want to do that.
Does anyone have access to an XSL transformation function/ scheme that will allow me to do this natively in the application? Perhaps a series of regular expressions I could use or something?
So it turns out that external scripts can be invoked from the XSLT. It seems I will be using another scripting language to get this to work. I'm a little bummed there was no other answer available.

XSLT getting character count of transformed XML

I am creating some XML from an XSLT
the XML after transformation looks a little like...
<root><one><two>dfd</two></one></root>
I need to get a character count for the output (in this case would be 38).
I tried putting the whole lot in a variable then doing a string-length($vVariable) but this only brings back 3 (for the 'dfd' it excludes the characters of the tags)
This is going to be very difficult to do in straight XSLT, since its internal data model doesn't see XML elements as strings. Although your particular example is very simple, there are multiple valid ways to serialize the same XML into text, especially when you get into namespaces.
Your best bet may be to send the result of your transformation to another tool. If you're running the XSLT processor from the command line, you could use a tool like the Linux command "wc". If you're calling XSLT from within a larger program, you could use that language's built-in string-length functionality.
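As a sketch of the "larger program" route, assuming the transformation is called from Java: serialize the result into a string and measure that. The identity transformation below stands in for the real stylesheet, and the class and method names are hypothetical. The count is in Java chars (UTF-16 code units), not bytes, and it depends on serialization choices such as whether the XML declaration is emitted.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class OutputLengthDemo {
    // Serializes the transformation result to a string and returns its length.
    static int serializedLength(String xml) {
        try {
            // Identity transformer used here in place of a real stylesheet.
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString().length();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serializedLength("<root><one><two>dfd</two></one></root>"));
    }
}
```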