XSLT 2 (Saxon): How to read multiple files into memory - xslt

How do I read multiple xml files into memory/stream?
Using <xsl:result-document> I am able to split an XML document into multiple XML files in a directory.
I want to read the multiple result files into memory.
XSL:
<xsl:template match="/testdata">
<xsl:for-each select="trd">
<xsl:result-document href="result_{position()}.xml">
<abc>
<xyz>
<xsl:copy-of select="*"/>
</xyz>
</abc>
</xsl:result-document>
</xsl:for-each>
</xsl:template>
With the code below I am able to read one resulting XML document into memory (after removing <xsl:result-document>). I want to read multiple output XML documents into memory.
System.setProperty("javax.xml.transform.TransformerFactory", "net.sf.saxon.TransformerFactoryImpl");
TransformerFactory tFactory = TransformerFactory.newInstance();
Source xslt = new StreamSource(new File("testxsl.xsl"));
Transformer transformer = null;
transformer = tFactory.newTransformer(xslt);
Source xmlInput = new StreamSource(new File("test.xml"));
StreamResult standardResult = new StreamResult(new ByteArrayOutputStream());
transformer.transform(xmlInput, standardResult);

This can't be done using the standard JAXP API (which was designed for XSLT 1.0 and has never been upgraded). Use Saxon's s9api API, and call Xslt30Transformer.setResultDocumentHandler() to supply a destination for result documents. This can be an XdmDestination if you want the result as an XdmNode object, or it can be a Serializer writing to an in-memory OutputStream or StringWriter if you want to capture serialized results in memory.
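If switching to s9api is not an option, one possible workaround that stays within plain JAXP is to drop xsl:result-document and run the transformation once per trd element, capturing each result in memory. The sketch below is only an illustration under assumptions: the class name InMemorySplit, the index parameter and the XSLT 1.0 rewrite of the stylesheet are hypothetical and not part of the original question, and the input is parsed once into a DOM so it can be reused.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class InMemorySplit {
    // Hypothetical XSLT 1.0 variant: emits <abc> for the single trd selected by $index.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output omit-xml-declaration='yes'/>"
      + "<xsl:param name='index' select='1'/>"
      + "<xsl:template match='/testdata'>"
      + "<xsl:for-each select='trd[position() = number($index)]'>"
      + "<abc><xyz><xsl:copy-of select='*'/></xyz></abc>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Returns one serialized result per trd child, all held in memory.
    static List<String> split(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document input = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // Count the trd children of the document element.
            int count = 0;
            for (Node n = input.getDocumentElement().getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n.getNodeType() == Node.ELEMENT_NODE && "trd".equals(n.getLocalName())) {
                    count++;
                }
            }

            // Compile the stylesheet once, then run it once per trd.
            Templates templates = TransformerFactory.newInstance()
                    .newTemplates(new StreamSource(new StringReader(XSLT)));
            List<String> results = new ArrayList<>();
            for (int i = 1; i <= count; i++) {
                Transformer t = templates.newTransformer();
                t.setParameter("index", String.valueOf(i));
                StringWriter out = new StringWriter();
                t.transform(new DOMSource(input), new StreamResult(out));
                results.add(out.toString());
            }
            return results;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<testdata><trd><a>1</a></trd><trd><a>2</a></trd></testdata>";
        for (String result : split(xml)) {
            System.out.println(result);
        }
    }
}
```

This runs the stylesheet once per trd, so it is likely to be slower than a single Saxon run with a result document handler on large inputs.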

Related

XSLT to Concatenate Empty Values

We have a requirement that requires us to list out all the empty values from the incoming xml. I have searched but all I could find was listing non-null values, trying to use that for our xml is not returning the required results.
Here is the xml that we will receive and I want to be able to concatenate all the null values from this xml and print. Kindly assist.
<?xml version = "1.0" encoding = "UTF-8"?>
<Output>
<Rows>
<ns0:I2NA xmlns:ns0 = "http://www.example.com/schemas/Schema.xsd">
<ns0:Organization>108</ns0:Organization>
<ns0:AccountNumber>1231231231231231233 </ns0:AccountNumber>
<ns0:Status>0</ns0:Status>
<ns0:VipStatus>0</ns0:VipStatus>
<ns0:TypeOfIdNo>1</ns0:TypeOfIdNo>
<ns0:IdNo>2303111450 </ns0:IdNo>
<ns0:HomePhone>123456 </ns0:HomePhone>
<ns0:Employer> </ns0:Employer>
<ns0:EmployersPhone>123456 </ns0:EmployersPhone>
<ns0:FaxPhone>123456 </ns0:FaxPhone>
<ns0:MobileNo>0568520421 </ns0:MobileNo>
<ns0:CountryCode> </ns0:CountryCode>
<ns0:PostalCode> </ns0:PostalCode>
<ns0:Position> </ns0:Position>
<ns0:MaritalStatus>0</ns0:MaritalStatus>
<ns0:DateOfBirth>00000000</ns0:DateOfBirth>
<ns0:EmailAddrs> </ns0:EmailAddrs>
<ns0:UserCode1> </ns0:UserCode1>
<ns0:NationalityCode> </ns0:NationalityCode>
<ns0:NameLine1>ABC </ns0:NameLine1>
<ns0:NameLine2> </ns0:NameLine2>
<ns0:NameLine3> </ns0:NameLine3>
<ns0:ChDob> </ns0:ChDob>
<ns0:AddressLine1>USA </ns0:AddressLine1>
<ns0:AddressLine2>USA </ns0:AddressLine2>
<ns0:AddressLine3>USA </ns0:AddressLine3>
<ns0:AddressLine4>USA </ns0:AddressLine4>
<ns0:City> </ns0:City>
<ns0:State> </ns0:State>
<ns0:GenderCode>0</ns0:GenderCode>
<ns0:StatementNotifIndi> </ns0:StatementNotifIndi>
<ns0:Nationality> </ns0:Nationality>
<ns0:County> </ns0:County>
<ns0:LastName>John </ns0:LastName>
<ns0:MiddleName> </ns0:MiddleName>
<ns0:FirstName>SHAN MATHEW </ns0:FirstName>
<ns0:LangPref> </ns0:LangPref>
</ns0:I2NA>
</Rows>
<EOF>true</EOF>
</Output>
When the XSLT is applied, we would like to receive the below string as output.
Status,Employer,CountryCode,PostalCode,Position,EmailAddrs,UserCode1,NationalityCode,NameLine2,NameLine3,ChDob,City,State,StatementNotifIndi,Nationality,County,MiddleName,LangPref
Thanks!
Use <xsl:value-of select="//*[not(*) and not(normalize-space())]/local-name()" separator=","/>. But I don't understand why your sample string starts with Status while the XML has a value <ns0:Status>0</ns0:Status> for that field.
With XSLT 1.0 you need a bit more code:
<xsl:for-each select="//*[not(*) and not(normalize-space())]">
<xsl:if test="position() > 1"><xsl:text>,</xsl:text></xsl:if>
<xsl:value-of select="local-name()"/>
</xsl:for-each>
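To try the XSLT 1.0 variant quickly, it can be run with the XSLT processor built into the JDK. The following is a minimal sketch, assuming a comma as separator and a shortened version of the sample input; the class name EmptyElementNames and the method listEmpty are made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class EmptyElementNames {
    // XSLT 1.0: local names of leaf elements whose text is empty or whitespace only.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'>"
      + "<xsl:for-each select='//*[not(*) and not(normalize-space())]'>"
      + "<xsl:if test='position() &gt; 1'><xsl:text>,</xsl:text></xsl:if>"
      + "<xsl:value-of select='local-name()'/>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String listEmpty(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Shortened sample: Status has the value 0, so it is not reported as empty.
        String xml = "<Rows><ns0:I2NA xmlns:ns0='http://www.example.com/schemas/Schema.xsd'>"
            + "<ns0:Status>0</ns0:Status><ns0:Employer> </ns0:Employer>"
            + "<ns0:City> </ns0:City><ns0:LastName>John </ns0:LastName>"
            + "</ns0:I2NA></Rows>";
        System.out.println(listEmpty(xml)); // prints Employer,City
    }
}
```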

XSLT transformation passing parameters

I am trying to pass parameters during an XSLT transformation. Here is the xsl stylesheet.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:param name="param1" select="'defaultval1'" />
<xsl:param name="param2" select="'defaultval2'" />
<xsl:template match="/">
<xslttest>
<tagg param1="{$param1}"><xsl:value-of select="$param2" /></tagg>
</xslttest>
</xsl:template>
</xsl:stylesheet>
The following is the Java code.
File xsltFile = new File("template.xsl");
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document stylesheet = builder.parse("template.xsl");
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer xsltTransformer = transformerFactory.newTransformer(new DOMSource(stylesheet));
//Transformer xsltTransformer = transformerFactory.newTransformer(new StreamSource(xsltFile));
xsltTransformer.setParameter("param1", "value1");
xsltTransformer.setParameter("param2", "value2");
StreamResult result = new StreamResult(System.out);
xsltTransformer.transform(new DOMSource(builder.newDocument()), result);
I get the following errors:
ERROR: 'Variable or parameter 'param1' is undefined.'
FATAL ERROR: 'Could not compile stylesheet'
However, if I use the following line to create the transformer, everything works fine.
Transformer xsltTransformer = transformerFactory.newTransformer(new StreamSource(xsltFile));
Q1. I just wanted to know what's wrong with using a DOMSource to create a Transformer.
Q2. Is this one of the ideal ways to substitute values for placeholders in an xml document? If my placeholders were in a source xml document is there any (straightforward) way to substitute them using style sheets (and passing parameters)?
Q1: This is a namespace awareness problem. You need to make the DocumentBuilderFactory namespace aware:
factory.setNamespaceAware(true);
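For illustration, here is a small self-contained sketch of the fix, with the stylesheet from the question inlined as a string so that no files are needed. The class name DomStylesheetParams and the method run are hypothetical, and an xsl:output element was added only to keep the result short.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class DomStylesheetParams {
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output omit-xml-declaration='yes'/>"
      + "<xsl:param name='param1' select=\"'defaultval1'\"/>"
      + "<xsl:param name='param2' select=\"'defaultval2'\"/>"
      + "<xsl:template match='/'>"
      + "<xslttest><tagg param1='{$param1}'><xsl:value-of select='$param2'/></tagg></xslttest>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String run(String param1, String param2) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Without this, the DOM carries no namespace information for the xsl: elements.
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document stylesheet = builder.parse(new InputSource(new StringReader(XSLT)));

            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new DOMSource(stylesheet));
            t.setParameter("param1", param1);
            t.setParameter("param2", param2);

            StringWriter out = new StringWriter();
            // An empty document is enough as input, since the template matches "/".
            t.transform(new DOMSource(builder.newDocument()), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run("value1", "value2"));
    }
}
```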
Q2: There are several ways to get the values from an external xml file. One way to do this is with the document function and a top level variable in the document:
<!-- Loads a map relative to the template. -->
<xsl:variable name="map" select="document('map.xml')"/>
Then you can select the values out of the map. For instance, if map.xml was defined as:
<?xml version="1.0" encoding="UTF-8"?>
<mappings>
<mapping key="value1">value2</mapping>
</mappings>
You could remove the second parameter from your template, then look up the value using this line:
<tagg param1="{$param1}"><xsl:value-of select="$map/mappings/mapping[@key=$param1]"/></tagg>
Be aware that using relative document URIs will require that the stylesheet has a system id specified, so you will need to update the way you create your DOMSource:
DOMSource source = new DOMSource();
source.setNode(stylesheet);
source.setSystemId(xsltFile.toURI().toString());
In general, I suggest looking at all of the options that are available in Java's XML APIs. Assume that all of the features available are set wrong for what you are trying to do. I also suggest reading the XML Information Set. That specification will give you all of the definitions that the API authors are using.

how to add root node tag in a XML document with XSLT

I am parsing an XML document in SSIS through the XML Source. It does not have a root tag, so I am trying to add one to the document through XSLT, but I get this error:
[XML Task] Error: An error occurred with the following error message: "There are multiple root elements. Line 11, position 2.".
What XSL should be used to add the root element? Please help, this is very urgent.
Please find the XML source below.
<organizational_unit>
<box_id>898</box_id>
<hierarchy_id>22</hierarchy_id>
<parent_box_id>0</parent_box_id>
<code>Team</code>
<description />
<name>CAPS Teams</name>
<manager_title />
<level>0</level>
</organizational_unit>
<organizational_unit>
<box_id>967</box_id>
<hierarchy_id>31</hierarchy_id>
<parent_box_id>0</parent_box_id>
<code>main</code>
<description />
<name>Protegent</name>
<manager_title />
<level>0</level>
<organizational_unit>
<box_id>968</box_id>
<hierarchy_id>31</hierarchy_id>
<parent_box_id>967</parent_box_id>
<code>19L</code>
<description>19L</description>
<name>19L</name>
<level>1</level>
<managers>
<manager>
<hierarchy_mgr_id>243</hierarchy_mgr_id>
<hierarchy_id>31</hierarchy_id>
<box_id>968</box_id>
<rep_id>19499</rep_id>
<unique_rep_id>100613948</unique_rep_id>
<first_name>Ed</first_name>
<last_name>Kill</last_name>
</manager>
</managers>
</organizational_unit>
<organizational_unit>
<box_id>1152</box_id>
<hierarchy_id>31</hierarchy_id>
<parent_box_id>967</parent_box_id>
<code>UNKNOWN_m</code>
<description>Unknown Reps</description>
<name>Unknown Reps</name>
<level>1</level>
</organizational_unit>
</organizational_unit>
Which XSLT processor do you use, and how do you use it? I usually don't suggest using string processing to construct XML, but if you have a fragment without a root element, then string concatenation such as "<root>" + fragment + "</root>" is perhaps the easiest way to get a well-formed document. XSLT can work with fragments, but how you do that depends on the XSLT processor or XML parser you use. For instance, .NET can use an XmlReader created with XmlReaderSettings whose ConformanceLevel is set to Fragment; the reader can then be loaded into an XPathDocument (for processing with XSLT 1.0 and XslCompiledTransform), and probably also into Saxon's XdmNode (although I am not sure I remember that correctly).
The stylesheet would then simply do
<xsl:template match="/">
<root>
<xsl:copy-of select="node()"/>
</root>
</xsl:template>
to wrap all top level nodes into a root element.
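The string concatenation route mentioned above could look roughly like this in Java. This is only a sketch: the class name FragmentWrapper is made up, the wrapper element is assumed to be called root, and the fragment is assumed not to start with an XML declaration (that would have to be stripped first).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class FragmentWrapper {
    // Wraps a rootless fragment in <root>...</root> and parses the result.
    static Document wrap(String fragment) {
        try {
            String xml = "<root>" + fragment + "</root>";
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String fragment =
              "<organizational_unit><box_id>898</box_id></organizational_unit>"
            + "<organizational_unit><box_id>967</box_id></organizational_unit>";
        Document doc = wrap(fragment);
        // Both top-level units are now children of the single root element.
        System.out.println(doc.getDocumentElement().getChildNodes().getLength());
    }
}
```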

XSLT; parse escaped text to a node-set and extract subelements

I've been fighting with this problem all day and am just about at my wit's end.
I have an XML file in which certain portions of data are stored as escaped text but are themselves well-formed XML. I want to convert the whole hierarchy in this text node to a node-set and extract the data therein. No combination of variables and functions I can think of works.
The way I'd expect it to work would be:
<xsl:variable name="a" select="InnerXML"/>
<xsl:for-each select="exsl:node-set($a)/*">
<!-- do something -->
</xsl:for-each>
The input element InnerXML contains text of the form
<root><elementa>text</elementa><elementb><elementc/><elementd>text</elementd></elementb></root>
but that doesn't really matter. I just want to navigate the xml like a normal node-set.
Where am I going wrong?
In case you can use Saxon 9.x, it provides the saxon:parse() extension function exactly for solving this task.
What I've done is put an msxsl script in the XSLT (this is in a Windows .NET environment):
<msxsl:script implements-prefix="cs" language="C#" >
<![CDATA[
public XPathNodeIterator parse(String strXML)
{
System.IO.StringReader rdr = new System.IO.StringReader(strXML);
XPathDocument doc = new XPathDocument(rdr);
XPathNavigator nav = doc.CreateNavigator();
XPathExpression expr;
expr = nav.Compile("/");
XPathNodeIterator iterator = nav.Select(expr);
return iterator;
}
]]>
</msxsl:script>
Then you can call it like this:
<xsl:variable name="itemHtml" select="cs:parse(EscapedNode)" />
That variable now contains XML you can iterate through.
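If the transformation is driven from Java rather than .NET, the same idea (parse the escaped string a second time) can be applied outside the stylesheet. The following is a rough sketch using only JDK classes; the class name EscapedXmlParser and the method parseInner are hypothetical, and it assumes the first matching element holds one well-formed escaped document.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class EscapedXmlParser {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Reads the text of the first <elementName> and parses that text as XML.
    static Document parseInner(String outerXml, String elementName) {
        Document outer = parse(outerXml);
        // getTextContent() returns the unescaped string, i.e. the inner markup.
        String inner = outer.getElementsByTagName(elementName).item(0).getTextContent();
        return parse(inner);
    }

    public static void main(String[] args) {
        String outer = "<data><InnerXML>"
            + "&lt;root&gt;&lt;elementa&gt;text&lt;/elementa&gt;&lt;/root&gt;"
            + "</InnerXML></data>";
        Document inner = parseInner(outer, "InnerXML");
        System.out.println(inner.getDocumentElement().getNodeName()); // prints root
    }
}
```

The resulting Document can then be passed to the stylesheet, for example as a DOMSource.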

Xslt transform on special characters

I have an XML document that needs to pass text inside an element with an '&' in it.
This is called from .NET to a Web Service and comes over the wire with the correct encoding, &amp;
e.g.
T&amp;O
I then need to use XSLT to create a transform, but I need to query SQL Server through a stored procedure without the encoding on the ampersand, e.g. T&O would go to the DB.
(Note this all has to be done through XSLT, I do have the choice to use .NET encoding at this point)
Anyone have any idea how to do this from XSLT?
Note my XSLT knowledge isn’t the best to say the least!
Cheers
<xsl:text disable-output-escaping="yes">&amp;<!--&--></xsl:text>
More info at: http://www.w3schools.com/xsl/el_text.asp
If you have the choice to use .NET you can convert between an HTML-encoded and regular string using (this code requires a reference to System.Web):
string htmlEncodedText = System.Web.HttpUtility.HtmlEncode("T&O");
string text = System.Web.HttpUtility.HtmlDecode(htmlEncodedText);
Update
Since you need to do this in plain XSLT you can use xsl:value-of to decode the HTML encoding:
<xsl:variable name="test">
<xsl:value-of select="'T&amp;O'"/>
</xsl:variable>
The variable string($test) will have the value T&O. You can pass this variable as an argument to your extension function then.
Supposing your XML looks like this:
<root>T&amp;O</root>
you can use this XSLT snippet to get the text out of it:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" />
<xsl:template match="root"> <!-- Select the root element... -->
<xsl:value-of select="." /> <!-- ...and extract all text from it -->
</xsl:template>
</xsl:stylesheet>
Output (from Saxon 9, that is):
T&O
The point is the <xsl:output/> element. The default would be to output XML, where the ampersand would still be encoded.
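The effect of the output method can be checked with a few lines of Java and the JDK's built-in XSLT processor. This is a minimal sketch; the class name AmpersandOutput and the method transform are made up, and the stylesheet is the one above with the output method made configurable.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class AmpersandOutput {
    // Runs the value-of stylesheet with the given output method ("text" or "xml").
    static String transform(String method, String xml) {
        String xslt =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:output method='" + method + "' omit-xml-declaration='yes'/>"
          + "<xsl:template match='root'><xsl:value-of select='.'/></xsl:template>"
          + "</xsl:stylesheet>";
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root>T&amp;O</root>";
        System.out.println(transform("text", xml)); // prints T&O
        System.out.println(transform("xml", xml));  // the ampersand stays escaped here
    }
}
```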