Performance of XSL while creating CSV from XML - xslt

I am trying to process 8 XML files in parallel using PROC XSL.
Each XML file has a different structure and a complex hierarchy. I am trying to create CSV files from each one.
The XSLT stylesheet I created is based on the XSD that was provided for the XML, but the actual XML files don't have values for all the columns specified in the XSD, and some columns don't appear in the XML at all.
For example, based on the XSD we expect the XML to look like this:
<Master2>
<Child1>
<tag1>name1</tag1>
<tag2>name2</tag2>
<TypeArray>
<value>1</value>
<value>13</value>
</TypeArray>
<TypeArray1>
<Quanutityvalue>1</Quanutityvalue>
<Qualityvalue>13</Qualityvalue>
</TypeArray1>
</Child1>
</Master2>
But what we receive looks like this:
<Master2>
<Child1>
<tag1>name1</tag1>
<TypeArray>
<value>1</value>
<value>13</value>
</TypeArray>
</Child1>
</Master2>
When we create the XSLT, we include all columns. The XML that we receive has data for multiple tables in it. We create 4 CSV files from each XML file, and each CSV has around 270 columns.
The total time taken to process the 8 files is 3 minutes 50 seconds.
Does XSL take more time to process the missing (null) tags as well? Or is this time reasonable?
I am using the PROC XSL option in SAS.
Sample XSLT
<xsl:template match="/">
<xsl:text>tag1,tag2,TypeArray,Quanutityvalue,Qualityvalue</xsl:text>
<xsl:text>
</xsl:text>
<xsl:for-each select="Master1/Master2/Child1">
<xsl:call-template name="CsvEscape"><xsl:with-param name="value" select="normalize-space(tag1)"/></xsl:call-template>
<xsl:text>,</xsl:text>
<xsl:call-template name="CsvEscape"><xsl:with-param name="value" select="normalize-space(tag2)"/></xsl:call-template>
<xsl:text>,</xsl:text>
<xsl:call-template name="CsvEscape"><xsl:with-param name="value" select="TypeArray/value"/></xsl:call-template>
<xsl:text>,</xsl:text>
<xsl:call-template name="CsvEscape"><xsl:with-param name="value" select="TypeArray1/Quanutityvalue"/></xsl:call-template>
<xsl:text>,</xsl:text>
<xsl:call-template name="CsvEscape"><xsl:with-param name="value" select="TypeArray1/Qualityvalue"/></xsl:call-template>
<xsl:text>
</xsl:text>
</xsl:for-each>
</xsl:template>
Thanks
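The CsvEscape template called in the sample is not shown in the question. For reference, here is a minimal XSLT 1.0 sketch of what such a template might look like. The helper name DoubleQuotes and the quoting rules (quote a field when it contains a comma, a double quote, or a line feed, and double any embedded quotes) are assumptions, not the asker's actual code. A select that matches no element passes an empty node-set, whose string value is the empty string, so a missing tag simply produces an empty field.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical CsvEscape: wraps the value in quotes only when needed -->
  <xsl:template name="CsvEscape">
    <xsl:param name="value"/>
    <xsl:choose>
      <xsl:when test="contains($value, ',') or contains($value, '&quot;') or contains($value, '&#10;')">
        <xsl:text>"</xsl:text>
        <xsl:call-template name="DoubleQuotes">
          <xsl:with-param name="text" select="$value"/>
        </xsl:call-template>
        <xsl:text>"</xsl:text>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$value"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Hypothetical helper: replaces each " with "" by recursing on the remainder -->
  <xsl:template name="DoubleQuotes">
    <xsl:param name="text"/>
    <xsl:choose>
      <xsl:when test="contains($text, '&quot;')">
        <xsl:value-of select="substring-before($text, '&quot;')"/>
        <xsl:text>""</xsl:text>
        <xsl:call-template name="DoubleQuotes">
          <xsl:with-param name="text" select="substring-after($text, '&quot;')"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

Note that when the parameter is a node-set with several nodes (such as TypeArray/value in the sample), XSLT 1.0 string functions only see the first node.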

Related

save data to map while processing xslt dynamically

I want to create a dynamic map in XSLT and populate it based on some conditions. How can I do that?
I can see examples of hardcoded maps stored in a variable, like the one below, but that's not what I want:
<xsl:variable name="map">
<map>
<entry key="key-1">value1</entry>
<entry key="key-2">value2</entry>
<entry key="key-3">value3</entry>
</map>
</xsl:variable>
Straight from the XSLT 3 spec in https://www.w3.org/TR/xslt-30/#element-map is the example
<xsl:variable name="index" as="map(xs:string, element(employee))">
<xsl:map>
<xsl:for-each select="//employee">
<xsl:map-entry key="@empNr" select="."/>
</xsl:for-each>
</xsl:map>
</xsl:variable>
It seems, however, that this gives a type error unless the input is schema-validated, so for untyped XML or a non-schema-aware XSLT processor you would need to either redefine the map type or change the code a bit. The second option is shown below:
<xsl:variable name="index" as="map(xs:string, element(employee))">
<xsl:map>
<xsl:for-each select="//employee">
<xsl:map-entry key="string(@empNr)" select="."/>
</xsl:for-each>
</xsl:map>
</xsl:variable>
Example fiddle is at https://xsltfiddle.liberty-development.net/3MP42MS.
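Since the question asks about populating the map based on conditions, here is a hedged sketch of one way to do that: filter the nodes that feed xsl:map-entry with a predicate. The active attribute and the wanted parameter with its default value E2 are made up for illustration; only employee and empNr come from the spec example.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical lookup key, overridable when the transformation is invoked -->
  <xsl:param name="wanted" as="xs:string" select="'E2'"/>

  <!-- Only employees passing the (assumed) condition get an entry -->
  <xsl:variable name="index" as="map(xs:string, element(employee))">
    <xsl:map>
      <xsl:for-each select="//employee[@active = 'true']">
        <xsl:map-entry key="string(@empNr)" select="."/>
      </xsl:for-each>
    </xsl:map>
  </xsl:variable>

  <xsl:template match="/">
    <!-- A map can be called like a function; an absent key yields an empty sequence -->
    <xsl:copy-of select="$index($wanted)"/>
  </xsl:template>
</xsl:stylesheet>
```

Keep in mind that xsl:map raises an error on duplicate keys, so the key expression should be unique across the selected nodes.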

XSL:for-each - Trouble with duplicate nodes in XML, needs to be output to separate lines in a CSV file

Apologies if this has been answered elsewhere, but I have been unable to find a solution. I'm preparing a data set for Microsoft School Data Sync, where I generate a number of CSV files based on an XML file. In the file I have a number of carers, each with one or several carerrelationship entries (children). In the output CSV I need each child to be on a separate line, each with the contact info (email) of their carer. However, I just can't get the xsl:for-each to work right. Any help or feedback will be much appreciated!
XML -> This is a carer with two children:
<person recstatus="1">
<sourcedid>
<id>e763fb61-2086-40af-9eb6-d5355d5922bc</id>
</sourcedid>
<name>
<fn>Smith,John</fn>
<n>
<family>Smith</family>
<given>John</given>
</n>
</name>
<email>john.smith@email.com</email>
<institutionrole institutionroletype="Carer" primaryrole="Yes" />
<extension>
<carerrelationship recstatus="1">
<sourcedid>
<id>04ba28e9-0934-41c9-aa42-31c0b66f36ad</id>
</sourcedid>
<sourcedid>
<id>300c42ca-c78a-4ec9-9d81-9acb2382bbca</id>
</sourcedid>
</carerrelationship>
</extension>
</person>
Here's my XSL
<xsl:result-document href="Guardianrelationship.csv" method="text">
<xsl:text>SIS ID,Email,Role</xsl:text><xsl:value-of select="$break"/>
<xsl:for-each select="person[institutionrole/@institutionroletype = 'Carer']/extension/carerrelationship">
<xsl:variable name="person" select="."/>
<xsl:value-of select="sourcedid/id"/><xsl:value-of select="$delimiter"/>
<xsl:value-of select="ancestor::person/email"/><xsl:value-of select="$delimiter"/>
<xsl:text>Guardian</xsl:text><xsl:value-of select="$break"/>
</xsl:for-each>
</xsl:result-document>
Expected result:
SIS ID,Email,Role
04ba28e9-0934-41c9-aa42-31c0b66f36ad,john.smith@email.com.no,Guardian
300c42ca-c78a-4ec9-9d81-9acb2382bbca,john.smith@email.no,Guardian
Actual result:
SIS ID,Email,Role
04ba28e9-0934-41c9-aa42-31c0b66f36ad 300c42ca-c78a-4ec9-9d81-9acb2382bbca,rw347@kirken.no,Guardian
You're currently outputting one line for each carerrelationship; you need to output one line for each sourcedid.
For example:
<xsl:for-each select="person[institutionrole/@institutionroletype = 'Carer']/extension/carerrelationship/sourcedid">
<xsl:value-of select="id"/><xsl:value-of select="$delimiter"/>
<xsl:value-of select="ancestor::person/email"/><xsl:value-of select="$delimiter"/>
<xsl:text>Guardian</xsl:text><xsl:value-of select="$break"/>
</xsl:for-each>
You can simplify it to
<xsl:for-each select="person[institutionrole/@institutionroletype = 'Carer']/extension/carerrelationship/sourcedid">
<xsl:value-of select="id || $delimiter || ancestor::person/email || $delimiter || 'Guardian' || $break"/>
</xsl:for-each>
or use concat(x,y,z) in place of || if using 2.0 rather than 3.0.
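As a sketch of the concat() variant for XSLT 2.0, the stylesheet below wraps the loop in a complete transformation. The definitions of $delimiter and $break, and the assumption that the person elements are children of the document element, are guesses, since the question does not show them.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed definitions; the question uses these variables without showing them -->
  <xsl:variable name="delimiter" select="','"/>
  <xsl:variable name="break" select="'&#10;'"/>

  <!-- Assumption: person elements sit directly under the document element -->
  <xsl:template match="/*">
    <xsl:result-document href="Guardianrelationship.csv" method="text">
      <xsl:value-of select="concat('SIS ID,Email,Role', $break)"/>
      <!-- One output line per sourcedid, not per carerrelationship -->
      <xsl:for-each select="person[institutionrole/@institutionroletype = 'Carer']/extension/carerrelationship/sourcedid">
        <xsl:value-of select="concat(id, $delimiter, ancestor::person/email, $delimiter, 'Guardian', $break)"/>
      </xsl:for-each>
    </xsl:result-document>
  </xsl:template>
</xsl:stylesheet>
```

In XSLT 2.0 each concat() argument must be at most one item, so a person with more than one email element would cause a type error here.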

How can I loop and generate keys for maps with XSLT 3.0?

I tried to construct a new map. In my source XML I have many products (product data and IDs). How can I generate one key per product?
The goal is an XML-to-XML transformation with XSLT. The idea was to create a map and then, in a later step, use the keys to address the specific product data I need. Is this possible using maps, or is there another solution?
Example for the source XML
<?xml version="1.0" encoding="UTF-8"?>
<root>
<row>
<id>102</id>
<product>Lenovo 1234</product>
<productfamily>laptop</productfamily>
</row>
<row>
.....
XSLT
<xsl:variable name="val" as="map(xs:integer, xs:integer)">
<xsl:map>
<xsl:for-each select="//id">
<xsl:map-entry key="" select="."/>
</xsl:map>
</xsl:variable>
<xsl:template match="/">
<xsl:value-of select="map:get($val , 102)"/>
</xsl:template>
To create a map based on a simple functional relationship in the data you can do
<xsl:variable name="index" as="map(*)">
<xsl:map>
<xsl:for-each select="//x">
<xsl:map-entry key=".//@id" select="."/>
</xsl:for-each>
</xsl:map>
</xsl:variable>
or if you prefer
<xsl:variable name="index" as="map(*)"
select="map:merge(//x ! map:entry(.//@id, .))"/>
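Applied to the sample XML from the question, the same idea might look like the sketch below. The variable name products and the choice to store whole row elements (rather than just the id) are assumptions made for illustration.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <!-- One entry per row, keyed by the integer value of its id child -->
  <xsl:variable name="products" as="map(xs:integer, element(row))">
    <xsl:map>
      <xsl:for-each select="/root/row">
        <xsl:map-entry key="xs:integer(id)" select="."/>
      </xsl:for-each>
    </xsl:map>
  </xsl:variable>

  <xsl:template match="/">
    <!-- Look up the row with id 102 and output its product name -->
    <xsl:value-of select="$products(102)/product"/>
  </xsl:template>
</xsl:stylesheet>
```

For the sample row shown in the question this should output Lenovo 1234. If all you need is lookup by value, xsl:key together with the key() function is a long-standing alternative that also works in XSLT 1.0 and 2.0.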

XSLT Transformation (help)

I'm new to XSLT and am having some trouble solving this problem.
The input comes from an Excel XML document and has this format:
<Row>
<Cell><Data ss:Type="String">ToE.3</Data></Cell>
<Cell ss:Index="15"><Data ss:Type="String">Maintain</Data></Cell>
<Cell><Data ss:Type="Number">3</Data></Cell>
<Cell><Data ss:Type="String">Other</Data></Cell>
<Cell ss:Index="131"><Data ss:Type="String">Windows 2003</Data></Cell>
<Cell><Data >Microsoft SQL Server 2005</Data></Cell>
</Row>
..more rows (note that the Excel sheet has 132 columns)
I need to convert this to a standard text file, something like this (with the right column separator):
Col1 Col2 Col3 ..To.. Col15 Col16 ..To.. Col131
ToE.3 Maintain 3 Windows 2003
The problem is how to insert the empty values for the cells that are skipped via the Index attribute.
The transformation without the empty-cell/index handling looks like this:
<xsl:for-each select="Row">
<xsl:for-each select="Cell/Data">
<xsl:value-of select="current()"/>
<xsl:text>\</xsl:text>
</xsl:for-each>
<xsl:text>
</xsl:text>
</xsl:for-each>
Some help would be warmly appreciated
Step 1: declare the output format as "text", not "xml".
Step 2: get rid of additional whitespace. Use xsl:strip-space with elements="*", which means all elements.
Step 3: write the header row first, i.e. Col1, Col2, etc.
To do so, use a template match that selects the first Row element in your XML. Assuming that all rows have the same number of columns, write "Col" + number, where the number of columns is the number of cells in the first row.
Step 4: after the last cell, insert a newline character.
Step 5: call the generic template.
Step 6: the generic template explained:
It copies the data under each cell, separated by \. It is called manually only for the first row; for the other rows, template matching takes care of it.
Here is the code:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:strip-space elements="*"/>
<xsl:template name="Header" match="Row[not(preceding-sibling::Row)]">
<xsl:for-each select="Cell">
<xsl:value-of select="'Col'"/>
<xsl:value-of select="position()"/>
<xsl:if test="position()!=last()">
<xsl:value-of select="'\'"/>
</xsl:if>
</xsl:for-each>
<xsl:text>
</xsl:text>
<xsl:call-template name="CopyData"/>
</xsl:template>
<xsl:template name="CopyData" match="Row">
<xsl:for-each select="Cell">
<xsl:for-each select="Data">
<xsl:apply-templates select="."/>
</xsl:for-each>
<xsl:if test="position()!=last()">
<xsl:value-of select="'\'"/>
</xsl:if>
</xsl:for-each>
<xsl:text>
</xsl:text>
</xsl:template>
</xsl:stylesheet>
corresponding sample output:
Col1\Col2\Col3\Col4\Col5\Col6
ToE.3\Maintain\3\Other\Windows 2003\Microsoft SQL Server 2005
ToE.3\Maintain\3\Other\Windows 2003\Microsoft SQL Server 2005
This is tricky because, as you are seeing, Excel skips columns in which no data appears and then provides an ss:Index attribute on the next non-blank cell. You have to reconstruct the "missing" cell positions on your own if you wish to retain the original column positions, like "15" or "131" in your example, with the intervening blanks.
I agree with InfantProgrammer above, but I'd suggest adding some logic to the "CopyData" template to (a) determine the number of missing cells, then (b) call a recursive named template to write them to the output.
<xsl:template name="WriteBlanks">
<xsl:param name="Count" select="0"/>
<xsl:if test="$Count > 0">
<xsl:value-of select="'\'"/>
<xsl:call-template name="WriteBlanks">
<xsl:with-param name="Count" select="$Count - 1"/>
</xsl:call-template>
</xsl:if>
</xsl:template>
You could do something similar to generate the first row of column headers.
Given that you only need to write backslash characters as column separators, a more succinct approach could be to create one long string of them and take however many are needed with the XPath substring() function. A recursive template may be better suited to more complex output, however.
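To make the gap-filling concrete, here is a hedged XSLT 1.0 sketch that combines the WriteBlanks template with a walk along the cells of each row. It assumes, as the answer above does, that Row, Cell and Data are in no namespace and that the ss prefix is bound to the SpreadsheetML namespace; if your file declares that namespace as the default, the element names need the prefix too. The parameter name prev is made up for illustration.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
  <xsl:output method="text"/>
  <xsl:strip-space elements="*"/>

  <!-- Start each row at its first cell; the Cell template walks the rest -->
  <xsl:template match="Row">
    <xsl:apply-templates select="Cell[1]"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <xsl:template match="Cell">
    <!-- Column number of the preceding cell (0 before the first cell) -->
    <xsl:param name="prev" select="0"/>
    <!-- Real column: ss:Index if present, otherwise previous column + 1 -->
    <xsl:variable name="col">
      <xsl:choose>
        <xsl:when test="@ss:Index"><xsl:value-of select="@ss:Index"/></xsl:when>
        <xsl:otherwise><xsl:value-of select="$prev + 1"/></xsl:otherwise>
      </xsl:choose>
    </xsl:variable>
    <!-- Separator that closes the previous cell -->
    <xsl:if test="$prev &gt; 0"><xsl:text>\</xsl:text></xsl:if>
    <!-- One extra separator per skipped column -->
    <xsl:call-template name="WriteBlanks">
      <xsl:with-param name="Count" select="$col - $prev - 1"/>
    </xsl:call-template>
    <xsl:value-of select="Data"/>
    <!-- Continue with the next cell, passing this cell's column number -->
    <xsl:apply-templates select="following-sibling::Cell[1]">
      <xsl:with-param name="prev" select="number($col)"/>
    </xsl:apply-templates>
  </xsl:template>

  <xsl:template name="WriteBlanks">
    <xsl:param name="Count" select="0"/>
    <xsl:if test="$Count &gt; 0">
      <xsl:text>\</xsl:text>
      <xsl:call-template name="WriteBlanks">
        <xsl:with-param name="Count" select="$Count - 1"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

With the sample row, this should put "Maintain" in column 15 and "Windows 2003" in column 131. The sketch does not pad rows whose last cell falls short of the final column, and it writes no header row.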

Nested for-each loops, accessing outer element with variable from the inner loop

I'm trying to write an XSL that will output a certain subset of fields from the source XML. This subset will be determined at transformation time, by using an external XML configuration document containing the field names, and other specific information (such as the padding length).
So, this is two for-each loops:
The outer one iterates over the records to access their fields record by record.
The inner one iterates over the configuration XML document to grab the configured fields from the current record.
I've seen in "In XSLT how do I access elements from the outer loop from within nested loops?" that the current element of the outer loop can be stored in an xsl:variable. But then I have to define a new variable inside the inner loop to get the field name, which leads to the question: is it possible to evaluate a path that combines two variables?
For instance, the source XML document looks like:
<data>
<dataset>
<record>
<field1>value1</field1>
...
<fieldN>valueN</fieldN>
</record>
</dataset>
<dataset>
<record>
<field1>value1</field1>
...
<fieldN>valueN</fieldN>
</record>
</dataset>
</data>
I'd like to have an external XML file looking like:
<configuration>
<outputField order="1">
<fieldName>field1</fieldName>
<fieldPadding>25</fieldPadding>
</outputField>
...
<outputField order="N">
<fieldName>fieldN</fieldName>
<fieldPadding>10</fieldPadding>
</outputField>
</configuration>
The XSL I've got so far:
<xsl:variable name="config" select="document('./configuration.xml')"/>
<xsl:for-each select="data/dataset/record">
<!-- Store the current record in a variable -->
<xsl:variable name="rec" select="."/>
<xsl:for-each select="$config/configuration/outputField">
<xsl:variable name="field" select="fieldName"/>
<xsl:variable name="padding" select="fieldPadding"/>
<!-- Here's trouble -->
<xsl:variable name="value" select="$rec/$field"/>
<xsl:call-template name="append-pad">
<xsl:with-param name="padChar" select="$padChar"/>
<xsl:with-param name="padVar" select="$value"/>
<xsl:with-param name="length" select="$padding"/>
</xsl:call-template>
</xsl:for-each>
<xsl:value-of select="$newline"/>
</xsl:for-each>
I'm quite new to XSL, so this might well be a ridiculous question, and the approach may also be plain wrong (e.g. repeating the inner loop for a task that could be done once at the beginning). I'd appreciate any tips on how to select the field value from the outer loop element, and I'm of course open to better ways to approach this task.
Your stylesheet looks almost fine. Just the expression $rec/$field doesn't make sense because you can't combine two node sets/sequences this way. Instead, you should compare the names of the elements using the name() function. If I understood your problem correctly, something like this should work:
<xsl:variable name="config" select="document('./configuration.xml')"/>
<xsl:for-each select="data/dataset/record">
<xsl:variable name="rec" select="."/>
<xsl:for-each select="$config/configuration/outputField">
<xsl:variable name="field" select="fieldName"/>
...
<xsl:variable name="value" select="$rec/*[name(.)=$field]"/>
...
</xsl:for-each>
<xsl:value-of select="$newline"/>
</xsl:for-each>
The field variable is not strictly required. You can also use the current() function to access the current context node of the inner loop:
<xsl:variable name="value" select="$rec/*[name(.)=current()/fieldName]"/>
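Putting the pieces together, a complete XSLT 1.0 stylesheet might look like the sketch below. The append-pad template is not shown in the question, so the implementation here (right-pad by recursion, then truncate to the configured length) is only a guess at its behaviour, as are the values of $padChar and $newline.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:variable name="config" select="document('./configuration.xml')"/>
  <!-- Assumed values; the question uses these variables without showing them -->
  <xsl:variable name="padChar" select="' '"/>
  <xsl:variable name="newline" select="'&#10;'"/>

  <xsl:template match="/">
    <xsl:for-each select="data/dataset/record">
      <xsl:variable name="rec" select="."/>
      <xsl:for-each select="$config/configuration/outputField">
        <xsl:call-template name="append-pad">
          <xsl:with-param name="padChar" select="$padChar"/>
          <!-- current() is the outputField of the inner loop -->
          <xsl:with-param name="padVar" select="$rec/*[name() = current()/fieldName]"/>
          <xsl:with-param name="length" select="fieldPadding"/>
        </xsl:call-template>
      </xsl:for-each>
      <xsl:value-of select="$newline"/>
    </xsl:for-each>
  </xsl:template>

  <!-- Hypothetical append-pad: pads on the right, then cuts to $length.
       $padChar must not be empty, or the recursion would never end. -->
  <xsl:template name="append-pad">
    <xsl:param name="padChar" select="' '"/>
    <xsl:param name="padVar"/>
    <xsl:param name="length"/>
    <xsl:choose>
      <xsl:when test="string-length($padVar) &lt; $length">
        <xsl:call-template name="append-pad">
          <xsl:with-param name="padChar" select="$padChar"/>
          <xsl:with-param name="padVar" select="concat($padVar, $padChar)"/>
          <xsl:with-param name="length" select="$length"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="substring($padVar, 1, $length)"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

A configured field that is missing from a record yields an empty node-set, so it is written as padding only. If the source elements are namespaced, comparing with local-name() instead of name() may be more robust.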