How to transform XML document with CDATA using JDOM2? - xslt

Source document:
<content><![CDATA[>&< test]]></content>
XSLT document (cdata-transformation.xslt):
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.1" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" cdata-section-elements="transformed" />
<xsl:template match="/content">
<transformed>
<xsl:value-of select="." />
</transformed>
</xsl:template>
</xsl:stylesheet>
Wanted result:
<?xml version="1.0" encoding="UTF-8"?>
<transformed><![CDATA[>&< test]]></transformed>
Actual result:
<?xml version="1.0" encoding="UTF-8"?>
<transformed>>&< test</transformed>
Code used to test using JDOM2:
import java.io.IOException;
import java.io.InputStream;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import org.jdom2.CDATA;
import org.jdom2.Document;
import org.jdom2.Element;
import org.jdom2.JDOMException;
import org.jdom2.input.SAXBuilder;
import org.jdom2.output.XMLOutputter;
import org.jdom2.transform.JDOMResult;
import org.jdom2.transform.JDOMSource;
import org.junit.Test;
public class CdataTransformationTest {
@Test
public void learning_cdataTransformationWithJdom() throws Exception {
Document xslt = loadResource("xslt/cdata-transformation.xslt");
Document source = new Document(new Element("content")
.addContent(new CDATA(">&< test")));
Document transformed = transform(source, xslt);
XMLOutputter outputter = new XMLOutputter();
System.out.println(outputter.outputString(transformed));
}
private static Document transform(Document sourceDoc, Document xsltDoc) throws TransformerException {
JDOMSource source = new JDOMSource(sourceDoc);
JDOMResult result = new JDOMResult();
Transformer transformer = TransformerFactory.newInstance()
.newTransformer(new JDOMSource(xsltDoc));
transformer.transform(source, result);
return result.getDocument();
}
private static Document loadResource(String resource) throws IOException, JDOMException {
ClassLoader classloader = Thread.currentThread().getContextClassLoader();
InputStream inputStream = classloader.getResourceAsStream(resource);
if (inputStream != null) {
try {
SAXBuilder builder = new SAXBuilder();
return builder.build(inputStream);
} finally {
inputStream.close();
}
} else {
return null;
}
}
}
JDOM version used:
<dependency>
<groupId>org.jdom</groupId>
<artifactId>jdom2</artifactId>
<version>2.0.6</version>
</dependency>
XSLT processor used:
<dependency>
<groupId>xalan</groupId>
<artifactId>xalan</artifactId>
<version>2.7.1</version>
</dependency>
I have searched around for ways to do this, and the best answers say that all that is needed to wrap content in CDATA is to list the element name in the cdata-section-elements attribute. I cannot get this to work with JDOM, nor with the Free online XSL Transformer. I have also tried Saxon instead of Xalan, with the same result.
Why doesn't this work? What am I missing/doing wrong here? Is JDOM ignoring the cdata-section-elements attribute?
I have also tried wrapping the content like this:
<xsl:text disable-output-escaping="yes"><![CDATA[</xsl:text>
<xsl:value-of select="." />
<xsl:text disable-output-escaping="yes">]]></xsl:text>
But this produces an unwanted result in JDOM that is difficult to work with. It becomes visible when you call outputter.getFormat().setIgnoreTrAXEscapingPIs(true), and it looks really ugly with the pretty format:
<?xml version="1.0" encoding="UTF-8"?>
<transformed>
<?javax.xml.transform.disable-output-escaping?>
<![CDATA[
<?javax.xml.transform.enable-output-escaping?>
>&< test
<?javax.xml.transform.disable-output-escaping?>
]]>
<?javax.xml.transform.enable-output-escaping?>
</transformed>

You are transforming to a JDOMResult, that is, to an in-memory tree, not to a stream or file. Output directives such as cdata-section-elements are applied only when the XSLT processor serializes the result to a stream or file, not when it builds a result tree in memory. So if you want cdata-section-elements to produce CDATA sections, write the result to a file, a stream, or at least a StringWriter; you can then load the JDOM document from that file, stream, or String.
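Rewrite the transform method so that it serializes the result first and then parses it back into JDOM (this additionally needs imports for java.io.StringReader, java.io.StringWriter, javax.xml.transform.Result, and javax.xml.transform.stream.StreamResult):
private static Document transform(Document sourceDoc, Document xsltDoc) throws JDOMException, IOException, TransformerException {
// Serialize to a String so that the xsl:output settings are applied.
StringWriter writer = new StringWriter();
JDOMSource source = new JDOMSource(sourceDoc);
Result result = new StreamResult(writer);
Transformer transformer = TransformerFactory.newInstance()
.newTransformer(new JDOMSource(xsltDoc));
transformer.transform(source, result);
// Parse the serialized result back into a JDOM document.
SAXBuilder builder = new SAXBuilder();
return builder.build(new StringReader(writer.toString()));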
}
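The same effect can be reproduced without JDOM. The following is a minimal sketch that uses only the JAXP classes shipped with the JDK; the class name CdataSerializationDemo and its helper method are made up for illustration, and the stylesheet version is set to 1.0 here as an assumption to keep the sketch processor-neutral. Because the result is a StreamResult, the processor serializes its output, and the serialized string should contain the CDATA section from the wanted result.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class, not part of the original question or answer.
final class CdataSerializationDemo {

    // Stylesheet from the question, inlined (version changed to 1.0 for this sketch).
    static final String XSLT =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"xml\" cdata-section-elements=\"transformed\"/>"
      + "<xsl:template match=\"/content\">"
      + "<transformed><xsl:value-of select=\".\"/></transformed>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Source document from the question.
    static final String SOURCE = "<content><![CDATA[>&< test]]></content>";

    /** Runs the transformation and returns the serialized result as a String. */
    static String transformToString(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter writer = new StringWriter();
            // A StreamResult forces serialization, so the xsl:output settings apply.
            transformer.transform(new StreamSource(new StringReader(xml)),
                    new StreamResult(writer));
            return writer.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transformToString(SOURCE, XSLT));
    }
}
```

Swapping the StreamResult for a tree result such as JDOMResult or DOMResult skips the serializer, which is why the attribute appears to be ignored in the original test.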

Related

How to access principalResult of the transformation? [saxonJS]

I am migrating from Saxon-CE to SaxonJS (v1.2.0).
The output of the XSLT transformation needs to be captured as an XML Document object, as it was in Saxon-CE:
var xslPath = './*.xsl';
var xsl = Saxon.requestXML(xslPath);
var proc = Saxon.newXSLT20Processor(xsl);
var xmlDoc;
var xmlDocTransformed;
var xmlStr;
xmlDoc = Saxon.parseXML(app.getLoadedMEI());
xmlDocTransformed = proc.transformToDocument(xmlDoc);
I tried to apply SaxonJS this way:
var result;
result = SaxonJS.transform({
stylesheetLocation: "./*.sef.xml",
sourceLocation: "./*.xml",
destination: "application"
});
I expected to get a transformation result object whose principalResult property I could access, as described in the official documentation (see the destination option) and in this presentation.
When running the code I obtain the following:
There is no problem with transformation itself: when destination is set to replaceBody it works as expected.
Eventually I solved my task with the following code (Saxon JS 2.3):
var options = {
stylesheetLocation: xslPath,
sourceText: inputXmlStr,
stylesheetParams: params,
destination: "document"
};
var result = SaxonJS.transform(options);
var transformedXmlStr = SaxonJS.serialize(result.principalResult);
The SEF can be produced with the xslt3 tool.
Note that you might use this alternative command (Windows 10 PowerShell):
node node_modules/xslt3/xslt3.js "-t" "-xsl:stylesheet.xsl" "-export:stylesheet.sef.json" "-nogo"
One way is, of course, to use the fn:transform function with "normal" XSLT code:
const xml = `<root>
<item>a</item>
</root>`;
const xslt = `<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="3.0"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="#all"
expand-text="yes">
<xsl:output method="xml"/>
<xsl:mode on-no-match="shallow-copy"/>
<xsl:template match="/" name="xsl:initial-template">
<xsl:next-match/>
<xsl:comment>Run with {system-property('xsl:product-name')} {system-property('xsl:product-version')} {system-property('Q{http://saxon.sf.net/}platform')}</xsl:comment>
</xsl:template>
</xsl:stylesheet>`;
const resultString = SaxonJS.XPath.evaluate(`transform(map { 'source-node' : parse-xml($xml), 'stylesheet-text' : $xslt, 'delivery-format' : 'serialized' })?output`, null, { params : { 'xml' : xml, 'xslt' : xslt } });
console.log(resultString);
<script src="https://martin-honnen.github.io/Saxon-JS-2.3/SaxonJS2.js"></script>
This is not recommended for performance reasons, but it may serve as a workaround that avoids the hassle of creating an SEF. If you do want to create an SEF, note that the Node.js xslt3 tool from Saxonica can also do that; you don't need a current version of Saxon EE, just xslt3 -t -export:my-sheet.sef.json -nogo -xsl:my-sheet.xsl.

How to set language data with Saxon HE 10.2?

How can I set language data correctly with Saxon HE 10.2? I need the XSLT Processor to output the current date with a month name written out in German, like 21. Oktober 2020. Unfortunately, the processor outputs
[Language: en]21. October 2020.
Saxon PE gives the desired output out of the box.
This is my XSLT code:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" version="3.0">
<xsl:output method="text"/>
<xsl:template match="/">
<xsl:value-of select="format-dateTime(current-dateTime(), '[D]. [MNn] [Y]', 'de', (), ())"/>
</xsl:template>
</xsl:stylesheet>
The test XML source code file is like this:
<?xml version="1.0" encoding="UTF-8"?>
<root/>
In Linux, I run java -cp $xsltProc $class -s:source.xml -xsl:stylesheet.xslt -o:result.
$xsltProc is the path to the file saxon-he-10.2.jar.
$class is net.sf.saxon.Transform.
Any help would be greatly appreciated.
To support German date formats "out of the box", you need Saxon-PE or higher.
See https://saxonica.com/documentation/index.html#!extensibility/config-extend/localizing/other-numberings
If you want this with Saxon-HE, you can compile the open source code for class net.sf.saxon.option.local.Numberer_de and register it with the Configuration:
configuration.setLocalizerFactory(new LocalizerFactory() {
public Numberer getNumberer(String language, String country) {
if (language.equals("de")) {
return new Numberer_de();
} else {
...
}
}
});
The Numberer code is available at https://saxonica.plan.io/projects/saxon/repository/he/revisions/master/entry/latest10/hej/net/sf/saxon/option/local/Numberer_de.java
I tried the following with the example files from the original question, but the Configuration was never invoked.
I presume the Configuration needs to be registered somehow?
final Configuration config = new Configuration();
/**/ config.setLocalizerFactory(new LocalizerFactory() {
public Numberer getNumberer(final String language, final String country) {
if (language.equals("de")) {
return Numberer_de.getInstance();
} else {
return null;
}
}
});
Transform.main(new String[] {
"-s:source.xml",
"-xsl:stylesheet.xslt",
"-o:result.txt"
});

JAXP pipeline ignores xsl:comment's (using Saxon HE 9.5.1-4)

When using a JAXP pipeline (with Saxon HE), the comments created with xsl:comment don't appear in the resulting .xml.
First I set the system properties to get info and use Saxon and define the input/output:
System.setProperty("jaxp.debug", "1");
System.setProperty("javax.xml.transform.TransformerFactory", "net.sf.saxon.TransformerFactoryImpl");
StreamSource xsl = ...
StreamResult output = ...
InputSource input = ...
Then I have the following construction with dummy Pre and Post filters:
TransformerFactory factory = TransformerFactory.newInstance();
SAXTransformerFactory saxFactory = (SAXTransformerFactory) factory;
SAXParserFactory parserFactory = SAXParserFactory.newInstance();
parserFactory.setNamespaceAware(true);
XMLReader parser = parserFactory.newSAXParser().getXMLReader();
XMLFilter pre = new XMLFilterImpl(parser);
XMLFilter xslFilter = saxFactory.newXMLFilter(xsl);
xslFilter.setParent(pre);
XMLFilter post = new XMLFilterImpl(xslFilter);
TransformerHandler serializer = saxFactory.newTransformerHandler();
serializer.setResult(output);
Transformer trans = serializer.getTransformer();
trans.setOutputProperty(OutputKeys.METHOD, "xml");
post.setContentHandler(serializer);
post.parse(input);
When run with the following stylesheet:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<xsl:comment>Nice comment</xsl:comment>
<test>[<xsl:value-of select="system-property('xsl:vendor')" />]
(<xsl:value-of select="system-property('xsl:version')" />
)[<xsl:value-of select="system-property('xsl:vendor-url')" />]</test>
</xsl:template>
</xsl:stylesheet>
I get the following output.xml without the comment:
<?xml version="1.0" encoding="UTF-8"?>
<test>[Saxonica]
(2.0
)[http://www.saxonica.com/]
</test>
And the following console log:
JAXP: find factoryId =javax.xml.transform.TransformerFactory
JAXP: found system property, value=net.sf.saxon.TransformerFactoryImpl
JAXP: created new instance of class net.sf.saxon.TransformerFactoryImpl using ClassLoader: null
JAXP: find factoryId =javax.xml.parsers.SAXParserFactory
JAXP: loaded from fallback value: com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl
JAXP: created new instance of class com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl using ClassLoader: null
JAXP: find factoryId =javax.xml.parsers.SAXParserFactory
JAXP: loaded from fallback value: com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl
JAXP: created new instance of class com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl using ClassLoader: null
When run without the whole pipeline, I do get the comment:
javax.xml.transform.TransformerFactory tFactory = javax.xml.transform.TransformerFactory.newInstance();
javax.xml.transform.Transformer transformer = tFactory.newTransformer(xsl);
transformer.transform(input, output);
results in
<?xml version="1.0" encoding="UTF-8"?><!--Nice comment -->
<test>[Saxonica]
(2.0
)[http://www.saxonica.com/]
</test>
Does anyone know why the JAXP pipeline omits the comments?
The SAX2 ContentHandler interface does not receive notification of comments. For that you need a LexicalHandler. But the SAX2 helper class XMLFilterImpl does not implement LexicalHandler, so it effectively drops the comments.
Switch to s9api in place of JAXP - it does these things much better.
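If you have to stay with SAX filters, one option is a filter that also implements LexicalHandler and passes the lexical events on. The following is a minimal sketch using only JDK classes; LexicalForwardingFilter and CommentPipelineDemo are hypothetical names, and the sketch feeds the parser straight into an identity serializer (no stylesheet stage) to isolate the effect.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.LexicalHandler;
import org.xml.sax.helpers.XMLFilterImpl;

/** Hypothetical filter: XMLFilterImpl plus forwarding of lexical events. */
class LexicalForwardingFilter extends XMLFilterImpl implements LexicalHandler {
    static final String LEXICAL_HANDLER = "http://xml.org/sax/properties/lexical-handler";
    private final LexicalHandler target;

    LexicalForwardingFilter(XMLReader parent, LexicalHandler target) {
        super(parent);
        this.target = target;
    }

    @Override
    public void parse(InputSource input) throws SAXException, IOException {
        // Ask the upstream reader to report comments, CDATA, and DTD events to this filter.
        getParent().setProperty(LEXICAL_HANDLER, this);
        super.parse(input);
    }

    // Pass every lexical event through unchanged.
    @Override public void comment(char[] ch, int start, int length) throws SAXException {
        target.comment(ch, start, length);
    }
    @Override public void startCDATA() throws SAXException { target.startCDATA(); }
    @Override public void endCDATA() throws SAXException { target.endCDATA(); }
    @Override public void startDTD(String name, String publicId, String systemId) throws SAXException {
        target.startDTD(name, publicId, systemId);
    }
    @Override public void endDTD() throws SAXException { target.endDTD(); }
    @Override public void startEntity(String name) throws SAXException { target.startEntity(name); }
    @Override public void endEntity(String name) throws SAXException { target.endEntity(name); }
}

final class CommentPipelineDemo {

    /** Parses xml through one filter into an identity serializer and returns the output. */
    static String serialize(String xml, boolean forwardLexicalEvents) {
        try {
            SAXParserFactory parserFactory = SAXParserFactory.newInstance();
            parserFactory.setNamespaceAware(true);
            XMLReader parser = parserFactory.newSAXParser().getXMLReader();

            SAXTransformerFactory saxFactory =
                    (SAXTransformerFactory) TransformerFactory.newInstance();
            TransformerHandler serializer = saxFactory.newTransformerHandler();
            StringWriter out = new StringWriter();
            serializer.setResult(new StreamResult(out));

            // TransformerHandler is both a ContentHandler and a LexicalHandler.
            XMLFilterImpl filter = forwardLexicalEvents
                    ? new LexicalForwardingFilter(parser, serializer)
                    : new XMLFilterImpl(parser);
            filter.setContentHandler(serializer);
            filter.parse(new InputSource(new StringReader(xml)));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Pipeline failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<test><!--Nice comment--></test>";
        System.out.println(serialize(xml, false));
        System.out.println(serialize(xml, true));
    }
}
```

In the pipeline from the question there are three stages (pre, xslFilter, post), so each plain XMLFilterImpl stage would need the same treatment; whether the filter returned by newXMLFilter exposes the lexical-handler property is something to check against your Saxon version.";
String forwarded = CommentPipelineDemo.serialize(xml, true);
String plain = CommentPipelineDemo.serialize(xml, false);
if (!forwarded.contains("<!--Nice comment-->")) throw new AssertionError("comment should be forwarded: " + forwarded);
if (plain.contains("<!--")) throw new AssertionError("plain XMLFilterImpl should drop the comment: " + plain);
if (!plain.contains("<test")) throw new AssertionError("element should survive: " + plain);
</test>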

Getting 'WrappedRuntimeException' when using XSL-FO in an Xpage

I am trying to create a PDF from the contents of an XPage. I am following the format that Paul Calhoun used in Notes in 9 #102. I am able to create PDFs for views, but I am having trouble creating one for a document. I do not think the error is in Paul's code, so I am not including it here, although I can if need be.
To generate the XML to display, I use the generateXML() method of the Document class in Java. I get a handle to the backend document and then return the XML. The XML appears well formed; the top-level tag is <document>. I pass this XML to the transformer, which uses Apache FOP. All of my code is contained in the beforeRenderResponse of an xAgent.
The XSL stylesheet that I am using is a stripped down version to just get a proof of concept to work. I am going to include it, because the problem likely resides with this code. I am totally new to XSL.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
version="1.1"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:fo="http://www.w3.org/1999/XSL/Format"
exclude-result-prefixes="fo">
<xsl:output
method="xml"
version="1.0"
omit-xml-declaration="no"
indent="yes" />
<xsl:param
name="versionParam"
select="'1.0'" />
<xsl:template match="document">
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
<fo:layout-master-set>
<fo:simple-page-master
master-name="outpage"
page-height="11in"
page-width="8.5in"
margin-top="1in"
margin-bottom="1in"
margin-left="1in"
margin-right="1in">
<fo:region-body />
</fo:simple-page-master>
</fo:layout-master-set>
<fo:page-sequence master-reference="A4">
<fo:block
font-size="16pt"
font-weight="bold"
space-after="5mm">
Apache FOP Proof of Concept.
</fo:block>
</fo:page-sequence>
</fo:root>
</xsl:template>
</xsl:stylesheet>
In the log file I get the message:
FATAL ERROR: 'com.ibm.xtq.common.utils.WrappedRuntimeException: D:\Program Files\IBM\Domino\<document form='PO'>
The error echoes the entire XML that it is trying to transform into the log and ends with:
(The filename, directory name, or volume label syntax is incorrect.)'
Notice that Domino is trying to include the XML in a path. I know this is wrong, but don't know what to do to fix it.
EDIT: This is the Java class that runs the transformation. This code is from Paul Calhoun's demo.
import java.io.OutputStream;
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.stream.StreamSource;
import org.apache.fop.apps.FOUserAgent;
import org.apache.fop.apps.Fop;
import org.apache.fop.apps.FopFactory;
public class DominoXMLFO2PDF {
public static void getPDF(OutputStream pdfout,String xml,String xslt, Boolean authReq, String usernamepass) {
try {
//System.out.println("Transforming...");
Source xmlSource, xsltSource;
// Source xmlSource = new StreamSource("http://localhost/APCC.nsf/Main?ReadViewEntries&count=999&ResortAscending=2");
xmlSource = new StreamSource(xml);
// Source xsltSource = new StreamSource("http://localhost/APCC.nsf/viewdata.xsl");
xsltSource = new StreamSource(xslt);
// configure fopFactory as desired
final FopFactory fopFactory = FopFactory.newInstance();
FOUserAgent foUserAgent = fopFactory.newFOUserAgent();
// configure foUserAgent as desired
// Setup output
// OutputStream out = pdfout;
// out = new java.io.BufferedOutputStream(out);
try {
// Construct fop with desired output format
Fop fop = fopFactory.newFop(org.apache.xmlgraphics.util.MimeConstants.MIME_PDF,foUserAgent, pdfout);
// Setup XSLT
TransformerFactory factory = TransformerFactory.newInstance();
Transformer transformer = factory.newTransformer(xsltSource);
//transformer.setParameter("versionParam", "Company List"); // Set the value of a <param> in the stylesheet
Source src = xmlSource; // Setup input for XSLT transformation
Result res = new SAXResult(fop.getDefaultHandler()); // Resulting SAX events (the generated FO) must be piped through to FOP
transformer.transform(src, res); // Start XSLT transformation and FOP processing
} catch (Exception e) {
} finally {
}
} catch (Exception e) {
e.printStackTrace(System.err);
}
}
}
This code is called in the xAgent using this line:
var retOutput = jce.getPDF(pageOutput, xmlsource, xsltsource, authReq, usernamepass);
The xmlsource is set with this line where the method returns XML using Document.generateXML():
var xmlsource = statusBean.generateXML(POdata, sessionScope.unidPDF);
Your problem is the XMLSource! When you look at Paul's code:
Source xmlSource = new StreamSource("http://localhost/APCC.nsf/Main?ReadViewEntries");
This points to a URL from which the XML is retrieved. Your code, on the other hand:
xmlsource = statusBean.generateXML(POdata, sessionScope.unidPDF);
contains the XML itself. So you need to change it to:
String xmlstring = statusBean.generateXML(POdata, sessionScope.unidPDF);
Source xmlsource = new StreamSource(new java.io.StringReader(xmlstring));
I strongly suggest you try to keep all the Java in a Java class, so you don't need to wrap/unwrap the objects in SSJS. Have a look at my series on FO too.
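The difference between the two StreamSource constructors can be seen in isolation. This is a minimal sketch using only JDK classes; StreamSourceDemo and its copy helper are hypothetical names, and the one-element document stands in for the output of generateXML().

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class, not part of Paul Calhoun's code.
final class StreamSourceDemo {

    /** Identity-transforms XML held in a String; the StringReader supplies the content. */
    static String copy(String xmlContent) {
        try {
            StringWriter out = new StringWriter();
            TransformerFactory.newInstance().newTransformer()
                    .transform(new StreamSource(new StringReader(xmlContent)),
                            new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<document form='PO'/>"; // stand-in for the generated XML

        // The String constructor stores its argument as a system ID (a URL or file path),
        // so the transformer later tries to open it as a location.
        StreamSource asLocation = new StreamSource(xml);
        System.out.println("systemId = " + asLocation.getSystemId());
        System.out.println("reader   = " + asLocation.getReader());

        // Wrapping the String in a Reader makes it the document content instead.
        System.out.println(copy(xml));
    }
}
```

That would be consistent with the log message in the question, where the whole XML text shows up as part of a file path.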

MissingResourceException - Java 5, JBoss 5.0, XSLT

I have written a sample program that uses XSLT to generate an HTML response. See the files below.
welcome.xsl
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:java="http://xml.apache.org/xalan/java" version="1.0">
<xsl:output method="html" indent="yes" />
<xsl:variable name="myResource" select="java:java.util.ResourceBundle.getBundle('curiousmind.web.xslt.AppResources')" />
<xsl:template match="/">
<html>
<body>
<xsl:apply-templates />
</body>
</html>
</xsl:template>
<xsl:template match="first">
<h2>
<xsl:value-of select="java:getString($myResource,'hi')" />
</h2>
</xsl:template>
</xsl:stylesheet>
PageTransformer.java
package curiousmind.web.xslt;
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Result;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
public class PageTransformer extends HttpServlet {
private static final long serialVersionUID = 1L;
protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
TransformerFactory transFactory = TransformerFactory.newInstance();
try {
DOMSource domSource = createDOMSource();
Transformer transformer = transFactory.newTransformer(new StreamSource(this.getClass().getResourceAsStream("welcome.xsl")));
Result result = new javax.xml.transform.stream.StreamResult(response.getWriter());
transformer.transform(domSource, result);
} catch (Exception e) {
throw new ServletException(e);
}
}
private DOMSource createDOMSource() throws Exception {
String xmlString = "<?xml version=\"1.0\" ?>\n<first><second>Hello World</second></first>";
byte[] buf = xmlString.getBytes("UTF-8");
BufferedInputStream is = new BufferedInputStream(new ByteArrayInputStream(buf));
DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder domBuilder = domFactory.newDocumentBuilder();
Document dom = domBuilder.parse(is);
DOMSource domSource = new DOMSource(dom);
is.close();
return domSource;
}
}
When I run the above code, I get the following error message in the console:
ERROR [STDERR] SystemId Unknown; Line #7; Column #95; java.util.MissingResourceException: Can't find bundle for base name curiousmind.web.xslt.AppResources, locale en_US
Here is the properties file kept inside curiousmind.web.xslt
AppResources.properties
hi=Hello World
Can anyone please tell me what the problem could be?
I tried to access the resource bundle from the same servlet, PageTransformer, by instantiating java.util.ResourceBundle, and it worked. This led to more confusion as to why a transformer instantiated from the same class is not able to get the ResourceBundle instance.
I added the xalan.jar file, but it gave the same result.
Finally I enabled "-verbose" mode for JBoss to find the actual cause. This gave me the hint that when the servlet is invoked and instantiates the Transformer, xalan.jar is loaded from JBOSS_DIR/lib/endorsed/xalan.jar. In the end I had to remove the "xalan.jar" and "serializer.jar" files from JBoss, and then my page worked.
Though this solved the problem, I think a better approach would be to use "jboss-classloading.xml" to customize the class-loading behavior, but I couldn't find an appropriate configuration for that.
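One plausible reading of the "-verbose" finding, stated here as an assumption rather than something confirmed above, is that the Xalan copy in the endorsed directory is loaded by a class loader that cannot see the web application's resources, so a bundle lookup made from inside it fails. The following minimal sketch shows only the general mechanism with JDK classes: a bundle is found or not depending on which class loader is asked. BundleLoaderDemo, the base name DemoAppResources, and the temporary directory are all hypothetical.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;
import java.util.MissingResourceException;
import java.util.ResourceBundle;

// Hypothetical demo class illustrating loader-dependent bundle lookup.
final class BundleLoaderDemo {

    static final String BASE_NAME = "DemoAppResources"; // hypothetical bundle name

    /** Writes DemoAppResources.properties to a temp directory and returns a loader for it. */
    static URLClassLoader loaderWithBundle() {
        try {
            Path dir = Files.createTempDirectory("bundle-demo");
            Files.write(dir.resolve(BASE_NAME + ".properties"),
                    "hi=Hello World\n".getBytes(StandardCharsets.ISO_8859_1));
            return new URLClassLoader(new URL[] { dir.toUri().toURL() });
        } catch (IOException e) {
            throw new IllegalStateException("Could not create demo bundle", e);
        }
    }

    /** Looks the key up through this class's own loader; returns null if the bundle is not visible. */
    static String lookupWithDefaultLoader(String key) {
        try {
            return ResourceBundle.getBundle(BASE_NAME, Locale.ROOT).getString(key);
        } catch (MissingResourceException e) {
            return null;
        }
    }

    /** Looks the key up through an explicitly supplied class loader. */
    static String lookupWithLoader(String key, ClassLoader loader) {
        return ResourceBundle.getBundle(BASE_NAME, Locale.ROOT, loader).getString(key);
    }

    public static void main(String[] args) throws IOException {
        try (URLClassLoader loader = loaderWithBundle()) {
            System.out.println("default loader:  " + lookupWithDefaultLoader("hi"));
            System.out.println("explicit loader: " + lookupWithLoader("hi", loader));
        }
    }
}
```

In a stylesheet there is no obvious place to pass a class loader, which may be why removing the endorsed jars (so that Xalan is resolved closer to the application) changed the outcome.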