XSLT: passing the input document as a String to a Java extension

I am trying to write two XSL files, each with its own goal:
1) The first is supposed to encrypt the input document.
2) The second is supposed to binary-encode the XML document.
Example output of 1)
<Response>
<encryptedData>e070dee5cb4688c608ee</encryptedData>
</Response>
Example output of 2)
<Response>
<compressedData>ASCDee5cb4688c608ee</compressedData>
</Response>
For functionality #1, I have a Java extension function that takes a string input and returns an encrypted string, but I don't know how to pass the input document as a string to the extension function.
For functionality #2, I am not sure how to convert the input to binary XML.

XSLT cannot exactly reproduce the original string representing an XML document. Various lexical details (and the substitution of entity references) cannot be reconstructed from the document tree produced by the XML parser, and that tree is the only input an XSLT processor sees.
You can pass the document node (/) to the extension function, and the Java function can then serialize it to get one possible representation of the XML document (in .NET, the OuterXml or InnerXml property of the node does this).
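A minimal sketch of the Java side, assuming the processor hands the extension function an org.w3c.dom.Node (what it actually passes is processor-dependent, and Saxon in particular may pass its own node type). The class and method names here are hypothetical. It serializes the node with the JDK's identity Transformer, which yields one possible string form, not necessarily the original text:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper; an extension function could call toXmlString(node)
// and then hand the resulting string to the encryption routine.
final class NodeSerializer {

    /** Serializes a DOM node to one possible lexical form of its XML. */
    static String toXmlString(Node node) {
        try {
            // A Transformer created without a stylesheet performs an identity copy.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new DOMSource(node), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize node", e);
        }
    }

    /** Demo helper: parses a string into a DOM document node. */
    static Node parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toXmlString(parse("<Request><id>42</id></Request>")));
    }
}
```

In the stylesheet, the call would then pass / as the argument, in the same way as any other extension function argument.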

I can only answer your first question, on how to call a Java function from an XSLT stylesheet.
In your stylesheet declaration you have to define a namespace, e.g. xmlns:filecounter="mappings.GenerateSequenceNumber":
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:filecounter="mappings.GenerateSequenceNumber"
exclude-result-prefixes="filecounter" version="1.0">
<xsl:output indent="yes"/>
In this case the Java function is in the package "mappings" and the Java class is called "GenerateSequenceNumber".
When calling the Java function in your stylesheet you write, for example:
<xsl:value-of select="filecounter:getSequenceNumber('countit',3)"/>
So you call the method "getSequenceNumber" in your Java class and pass any arguments that the Java function needs in the parentheses.
Unfortunately I can't help you with your second question.
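For illustration only, here is what the Java side of such a call might look like. The answer does not show the real class, so the behaviour below (a named counter returned as a zero-padded string) is purely an assumption; only the class and method names come from the stylesheet above. How the XSLT number 3 is converted to an int parameter depends on the processor.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical implementation. For use from a stylesheet the class would be
// declared public, in its own file, inside the package "mappings".
final class GenerateSequenceNumber {

    // One independent counter per name; safe for concurrent transformations.
    private static final Map<String, AtomicLong> COUNTERS = new ConcurrentHashMap<>();

    /** Returns the next value of the named counter, zero-padded to the given width (assumed >= 1). */
    public static String getSequenceNumber(String counterName, int digits) {
        long next = COUNTERS
                .computeIfAbsent(counterName, k -> new AtomicLong())
                .incrementAndGet();
        return String.format("%0" + digits + "d", next);
    }

    public static void main(String[] args) {
        System.out.println(getSequenceNumber("countit", 3));
    }
}
```

Extension functions of this kind are typically static methods, so the processor can call them without creating an instance.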

Related

How to get ampersand "&" in output of Transform xml activity of TIBCO

Could anyone please help me get an ampersand "&" in the output of the Transform XML activity in TIBCO?
My requirement is that the XML string from the Transform XML activity is mapped to Parse XML, which gives the final output, for example Maitree&Sons. What should be passed in the XSLT so that when the output from Transform XML goes to Parse XML, the final result contains "&"?
I tried using CDATA and also disable-output-escaping in the XSLT, but Parse XML fails.
Please help.
Generally XSLT won't allow you to produce invalid output. The correct representation in XML is Maitree&amp;Sons and this is what it produces. If it produced Maitree&Sons, this would not be well-formed XML and would be thrown out by an XML parser trying to read the document.
Having said that, it's possible using disable-output-escaping to produce an unescaped ampersand if your XSLT processor supports this option. If it's not working for you we need to know exactly what you did and how it failed.
(General rule: on SO, always tell us exactly what you did and exactly how it failed. Saying in general terms that you tried lots of things and none of them worked doesn't get us any nearer to a solution.)
LATER
I'm reading the question again. You want to produce output from the transformer that will go into an XML parser, such that the output of the parser is Maitree&Sons. Well, in that case the lexical XML must be Maitree&amp;Sons, which it will be if you generate the string Maitree&Sons in XSLT. But XSLT is XML, so if you want to write this as a literal string in your stylesheet, it will be written Maitree&amp;Sons.
I guess we need a much clearer picture of what you are doing and where it is going wrong.
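The escaping round trip described above can be checked with a few lines of Java. This is only a sketch using the JDK's DOM parser, not the TIBCO activities themselves, and the class name is made up for the example:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Demonstrates that the escaped form in the lexical XML becomes a plain
// ampersand once the document has been parsed.
final class AmpersandDemo {

    /** Parses the XML and returns the text content of its root element. */
    static String parseText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            // A bare "&" in the input ends up here: it is not well-formed XML.
            throw new IllegalArgumentException("Not well-formed XML: " + xml, e);
        }
    }

    public static void main(String[] args) {
        // Prints Maitree&Sons
        System.out.println(parseText("<name>Maitree&amp;Sons</name>"));
    }
}
```

Passing the unescaped string <name>Maitree&Sons</name> to parseText throws instead, which matches the failure seen when disable-output-escaping is used upstream of a parser.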

Encoding Issues in XSL Transformation

I have encoding issues similar to those discussed here: cross-encoding XSL transformations
No clean answer was given to those questions; that's why I'm asking again.
I have an XML input file encoded in UTF-8.
I have an XSL transformation to apply to these files, which should generate an XML output encoded in Windows-1252.
I have the two declarations below in my XSLT file:
<?xml version="1.0" encoding='Windows-1252'?>
<xsl:output method="text" indent="yes" encoding="Windows-1252"/>
I use Saxon as the XSL processor.
Despite all of that, I still get a fatal error each time a UTF-8 character with no Windows-1252 equivalent is encountered.
Actually, I don't really care about these characters and my transformation could drop all of them. I just want the transformation to go on and not crash because of them.
What am I missing? Why do I still get these fatal errors (Fatal Error! Output character not available in this encoding)?
Thanks in advance for your help.
The message you describe is produced only with the text output method (with XML or HTML, the serializer would use numeric character references). This error is required by the specification (see http://www.w3.org/TR/xslt-xquery-serialization/#TEXT_ENCODING), though I can understand why you might want a gentler fallback, e.g. outputting a substitute character.
If you don't mind a bit of Java coding, it would be easy to substitute your own version of Saxon's TEXTEmitter that does things differently (you only need to override one method); alternatively, you could send the XSLT output to a Java Writer (the encoding would then be ignored), and use the Java I/O framework to convert characters to the required encoding, with whatever handling of invalid characters your application requires.
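A rough sketch of the second option, using only the JDK's java.nio.charset API. It builds a Writer that silently drops characters with no Windows-1252 equivalent; the class and method names are invented for the example, and dropping (rather than substituting) is just the policy the question asked for:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;

final class LossyWindows1252 {

    /** Returns a Writer that encodes to Windows-1252 and skips unmappable characters. */
    static Writer droppingWriter(OutputStream out) {
        CharsetEncoder encoder = Charset.forName("windows-1252").newEncoder()
                .onUnmappableCharacter(CodingErrorAction.IGNORE)  // drop, do not fail
                .onMalformedInput(CodingErrorAction.IGNORE);      // e.g. lone surrogates
        return new OutputStreamWriter(out, encoder);
    }

    /** Demo helper: encodes a string through the dropping writer. */
    static byte[] encode(String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = droppingWriter(bytes)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        // The CJK character has no Windows-1252 equivalent and is dropped.
        System.out.println(encode("caf\u00e9 \u4e2d!").length + " bytes written");
    }
}
```

To use it with a transformation, the Writer would be wrapped in a javax.xml.transform.stream.StreamResult and passed as the result; using CodingErrorAction.REPLACE instead of IGNORE would emit a substitute byte rather than dropping the character.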
UTF-8 can encode every Unicode character, a far larger repertoire than Windows-1252 covers.
This means some characters in UTF-8 input cannot be translated to Windows-1252.
Ask yourself why you need to transform between encodings.

XSLT Identity Transformation without change to the output

Is it possible to do an XSLT identity transformation where absolutely nothing is changed from the source?
When I use the following template, indentation and line breaks are changed in the output, and I don't want to make any changes to the source XML.
XSLT
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
INPUT
<S:Envelope
xmlns:S="http://www.w3.org/2003/05/soap-envelope"
xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/08/addressing"
xmlns:f123="http://www.fabrikam123.example/svc53">
<S:Header>
<wsa:MessageID>
uuid:aaaabbbb-cccc-dddd-eeee-wwwwwwwwwww
</wsa:MessageID>
<wsa:RelatesTo>
uuid:aaaabbbb-cccc-dddd-eeee-ffffffffffff
</wsa:RelatesTo>
<wsa:To S:mustUnderstand="1">
http://business456.example/client1
</wsa:To>
<wsa:Action>http://fabrikam123.example/mail/DeleteAck</wsa:Action>
</S:Header>
<S:Body>
<f123:DeleteAck/>
</S:Body>
</S:Envelope>
OUTPUT
<?xml version="1.0" encoding="UTF-8"?><S:Envelope xmlns:S="http://www.w3.org/2003/05/soap-envelope" xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/08/addressing" xmlns:f123="http://www.fabrikam123.example/svc53">
<S:Header>
<wsa:MessageID>
uuid:aaaabbbb-cccc-dddd-eeee-wwwwwwwwwww
</wsa:MessageID>
<wsa:RelatesTo>
uuid:aaaabbbb-cccc-dddd-eeee-ffffffffffff
</wsa:RelatesTo>
<wsa:To S:mustUnderstand="1">
http://business456.example/client1
</wsa:To>
<wsa:Action>http://fabrikam123.example/mail/DeleteAck</wsa:Action>
</S:Header>
<S:Body>
<f123:DeleteAck/>
</S:Body>
</S:Envelope>
No, you cannot. The input and output XML will be the "same" in the sense that they produce the same XML Infoset, but they will not necessarily be byte-for-byte identical and this is not something that XSLT can control.
Why do you need this? If you are trying to compare XML documents easily, consider using XML Canonicalization. Many XML libraries have a method of producing canonical XML, and the xmllint command line tool can produce it easily from files.
The default behavior of XSLT processors is to preserve whitespace in the input, and the behavior of the processors I've just tested is consistent with the spec.
But the whitespace in question is whitespace in the text nodes of the input.
The whitespace between attribute-value specifications in start-tags, and the whitespace between items (e.g. comments and processing instructions) in the prolog and epilog of the document are not text nodes, and are not affected by the preserve-space settings. That white space is also, in fact, not part of the XPath data model, so there is very little the processor can legitimately do to preserve it.
If the whitespace in question carries information, you will want to revisit the design of the vocabulary (it's really a bad idea for that whitespace to be significant); if it's just that you would prefer that there be newlines between attribute-value specifications, you may want to write a custom serializer to insert such newlines and indentation on output. (If your motive is to avoid confusing a diff program with whitespace differences, my experience is that your choices are to normalize whitespace before diffing or to get a diff program that's a bit more robust in the face of whitespace variation.) Good luck.
In general it's not possible to be 100% confident that you'll get exactly everything unchanged, because the XSLT data model simply doesn't preserve all the information from the parse. For example, if the input contains a numeric character reference such as &#60;, the output might contain &lt; instead. Similarly, CDATA sections aren't preserved: adjacent text nodes (CDATA sections and normal text nodes) are merged into one at parse time, and while you can configure the processor to use CDATA for the content of certain elements, you can't simply preserve them as they were.
There are other issues such as the fact that the data model doesn't distinguish between <foo></foo>, <foo/> and <foo /> - they all represent the same empty element and any of them from the input could be represented by any of them in the output. And as in your example white space between attributes within a start tag is not preserved.
But of course these differences are all things that an XML tool shouldn't care about as they're different ways to represent exactly the same infoset.
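The point about lexical variants can be seen with a small Java sketch that runs the JDK's identity Transformer over two spellings of the same element. The class name is made up, and the exact output form depends on the serializer, so the example only checks that both spellings come out identical:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class InfosetRoundTrip {

    /** Parses and re-serializes the XML with an identity transformation. */
    static String roundTrip(String xml) {
        try {
            // No stylesheet argument: the Transformer copies input to output.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Round trip failed", e);
        }
    }

    public static void main(String[] args) {
        // Different bytes on input, same element once parsed.
        String a = roundTrip("<foo a=\"1\"    b=\"2\"></foo>");
        String b = roundTrip("<foo a=\"1\" b=\"2\" />");
        System.out.println(a);
        System.out.println(a.equals(b));
    }
}
```

Whichever form the serializer picks, the extra whitespace between attributes and the choice between start/end tags and an empty-element tag cannot survive, because neither is visible after parsing.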

Including a plain text file with XSLT 1.0

How can I include the content of a plain text file in a result document from within an XSLT 1.0 stylesheet? I.e., just like document(), but without parsing it:
<xsl:value-of select="magic-method-to-include-plaintext(@xlink_href)" />
I am almost sure that this doesn't work without an extension, because:
there is a special XPath function defined for this in XSLT/XPath 2.0:
<xsl:value-of select="unparsed-text(@xlink:href, 'UTF-8')"/>
the XSLT FAQ only lists a Java extension to achieve this via EXSLT
However, perhaps I missed something?
No, XSLT 1.0 cannot access the content of a non-XML text file without using an extension function.
One way around this is to pass the string as a global parameter to the transformation.
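A minimal sketch of the global-parameter approach with the standard JAXP API. The parameter name fileContent, the inline stylesheet and the class name are all assumptions made for the example; the calling code reads the file and the stylesheet only sees a string:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class TextParamDemo {

    // Hypothetical XSLT 1.0 stylesheet that simply emits the parameter value.
    private static final String XSLT =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:output method='text'/>"
          + "<xsl:param name='fileContent'/>"
          + "<xsl:template match='/'><xsl:value-of select='$fileContent'/></xsl:template>"
          + "</xsl:stylesheet>";

    /** Runs the stylesheet with the given text bound to the global parameter. */
    static String transformWithText(String text) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            t.setParameter("fileContent", text);
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader("<doc/>")), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    /** Reads a UTF-8 text file and passes its content to the stylesheet. */
    static String includeFile(Path file) {
        try {
            return transformWithText(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transformWithText("plain <text> & more"));
    }
}
```

The limitation of this design is that the file must be known before the transformation starts; a path found in the source document, such as an xlink:href attribute, cannot be resolved this way from inside the stylesheet.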

XSLT encoding problem, questionmarks in result

I'm trying to run an XSLT transformation, but characters like ëöï are replaced by a literal '?' in the output (I checked with a hex editor). The source file has the correct characters, and the stylesheet has:
<xsl:output encoding="UTF-8" indent="yes" method="xml"/>
What else am I missing?
I'm using Saxon as the transformer, if that matters.
The problem is most likely in the way you call the transformer. My guess is it will work fine if you call it using java -jar saxon.jar ...
In general, when you use XML tools which take InputStream/OutputStream, then the tools will make sure that the encoding is correct.
When you use a mixture of Streams and Writers, you will have to make sure that the encoding when going from one to the other matches what you told the XSLT processor to produce. Always set encodings explicitly. There may be defaults, but when it comes to encodings, they are wrong more often than not.
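As a sketch of the stream-based approach, the following uses the JDK's built-in transformer rather than Saxon, with made-up class and method names. The serializer writes bytes to an OutputStream in the declared encoding, and the caller decodes them with that same encoding stated explicitly:

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class EncodingDemo {

    /** Serializes to an OutputStream so the serializer itself applies the encoding. */
    static byte[] transformToBytes(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(bytes));
            return bytes.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    /** Decodes with the same encoding the serializer was told to produce. */
    static String transformAndDecode(String xml) {
        return new String(transformToBytes(xml), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file independent of its own encoding.
        System.out.println(transformAndDecode("<a>\u00eb\u00f6\u00ef</a>"));
    }
}
```

Decoding the same bytes with the platform default charset, or wrapping the stream in a Writer with a different encoding, is the kind of mismatch that typically produces '?' or garbled characters.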