XSLT: sort the output XML

I'm trying to find a solution to the following problem.
I'm developing an XSLT transformation (now about 40 KB) that transforms quite complex XML documents into a fairly simple structure that looks like this:
<Records>
<Record key="XX">
</Record>
<Record key="XX1">
</Record>
<Record key="XX2">
</Record>
<Record key="XX3">
</Record>
</Records>
I would like this output XML sorted by the Records/Record/@key values.
The problem is that my XSLT produces this output unsorted, and because of its complexity I am unable to sort it there.
Is it possible to apply xsl:sort to the output XML? I know that I could prepare a second XSLT transformation, but that is not an option in my case, as I'm limited to a single XSLT. Please help!

Is it possible to apply xsl:sort on the output XML?
Yes, multi-pass processing is possible. In XSLT 2.0 you don't even need to apply an xxx:node-set() extension function to the result, because the infamous RTF (result tree fragment) type no longer exists:
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="/">
<xsl:variable name="vPass1">
<!--
Put/Invoke your current code here
to generate the following
-->
<Records>
<Record key="XX3">
</Record>
<Record key="XX2">
</Record>
<Record key="XX4">
</Record>
<Record key="XX1">
</Record>
</Records>
</xsl:variable>
<xsl:apply-templates select="$vPass1/*"/>
</xsl:template>
<xsl:template match="Records">
<Records>
<xsl:perform-sort select="*">
<xsl:sort select="@key"/>
</xsl:perform-sort>
</Records>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied to any XML document (the input is ignored here), the wanted, correctly sorted result is produced:
<Records>
<Record key="XX1"/>
<Record key="XX2"/>
<Record key="XX3"/>
<Record key="XX4"/>
</Records>
In XSLT 1.0 it is almost the same, with the additional conversion of the result from the RTF type to a regular node-set:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:ext="http://exslt.org/common"
exclude-result-prefixes="ext">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="/">
<xsl:variable name="vrtfPass1">
<!--
Put/Invoke your current code here
to generate the following
-->
<Records>
<Record key="XX3">
</Record>
<Record key="XX2">
</Record>
<Record key="XX4">
</Record>
<Record key="XX1">
</Record>
</Records>
</xsl:variable>
<xsl:variable name="vPass1"
select="ext:node-set($vrtfPass1)"/>
<xsl:apply-templates select="$vPass1/*"/>
</xsl:template>
<xsl:template match="Records">
<Records>
<xsl:for-each select="*">
<xsl:sort select="@key"/>
<xsl:copy-of select="."/>
</xsl:for-each>
</Records>
</xsl:template>
</xsl:stylesheet>

40 KB is a lot of code for one stylesheet. When things get to this kind of scale, it's usually best to split a transformation into a pipeline of smaller transformations. With such a pipeline architecture, adding a sort step at the end is trivial. There are plenty of technologies for managing a pipeline of transformations (XProc, Orbeon, xmlsh, Ant, Cocoon), depending on your requirements. The benefit of pipelining is that it keeps your code modular and reusable.
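To make the pipeline idea concrete, here is a minimal sketch in Python (standard library only) rather than in one of the pipeline tools named above. The stage names build_records and sort_records, the helper run_pipeline, and the tiny <in>/<item> input are all hypothetical stand-ins; the first stage merely imitates an existing transformation that emits unsorted records.

```python
import xml.etree.ElementTree as ET


def build_records(source):
    """Stage 1: hypothetical stand-in for the existing transformation (unsorted output)."""
    records = ET.Element("Records")
    for item in source.iter("item"):
        ET.SubElement(records, "Record", key=item.get("id"))
    return records


def sort_records(records):
    """Stage 2: the final step, ordering Record children by their key attribute (as text)."""
    records[:] = sorted(records, key=lambda rec: rec.get("key"))
    return records


def run_pipeline(source, stages):
    """Feed the result of each stage into the next one."""
    result = source
    for stage in stages:
        result = stage(result)
    return result


source = ET.fromstring(
    '<in><item id="XX3"/><item id="XX2"/><item id="XX4"/><item id="XX1"/></in>'
)
result = run_pipeline(source, [build_records, sort_records])
sorted_keys = [rec.get("key") for rec in result]
print(sorted_keys)
```

Because the sort is its own stage, the first stage does not need to know about ordering at all, which is the modularity argument made above.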

As an addendum to Dimitre's excellent solution above, if you're using an XSLT 1.0 processor (for example, .NET), the following can give you a pointer as to how to use node-set:
http://www.xml.com/pub/a/2003/07/16/nodeset.html#tab.namespaces
In my case, I was in .NET 1.1 (i.e. MSXML) and the solution looked something like:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt">
<xsl:template match="/">
<xsl:variable name="vrtfPass1">
<Records xmlns="">
<xsl:apply-templates />
</Records >
</xsl:variable>
<xsl:variable name="vPass1" select="msxsl:node-set($vrtfPass1)"/>
<xsl:apply-templates select="$vPass1/*" mode="sorting"/>
</xsl:template>
<xsl:template match="Records" mode="sorting">
<Records>
<xsl:for-each select="Record">
<xsl:sort select="@key"/>
<xsl:copy-of select="."/>
</xsl:for-each>
</Records>
</xsl:template>
</xsl:stylesheet>

Related

Compare two xml tree nodes and find if a node with a value exists in another using xslt

I have an XML input that is the merged form of two XML documents:
<DATA>
<RECORDS1>
<RECORD>
<id>11</id>
<value>123</value>
</RECORD>
<RECORD>
<id>33</id>
<value>321</value>
</RECORD>
<RECORD>
<id>55</id>
<value>121113</value>
</RECORD>
...
</RECORDS1>
<RECORDS2>
<RECORD>
<id>11</id>
<value>123</value>
</RECORD>
<RECORD>
<id>33</id>
<value>323</value>
</RECORD>
<RECORD>
<id>44</id>
<value>12333</value>
</RECORD>
...
</RECORDS2>
</DATA>
I need to copy to the output the records in RECORDS1 that satisfy either condition:
The record in RECORDS1 does not exist in RECORDS2
The record in RECORDS1 exists in RECORDS2 but its value is different
In addition, the output should be extended with an extra field whose value is New (when the record does not exist in RECORDS2) or Change (when it exists but the value is different).
Output
<DATA>
<RECORDS>
<RECORD>
<id>33</id>
<value>321</value>
<kind>Change</kind>
</RECORD>
<RECORD>
<id>55</id>
<value>121113</value>
<kind>New</kind>
</RECORD>
...
</RECORDS>
</DATA>
I tried a for-each loop, but since a variable in XSLT cannot be reassigned, it does not work.
Any ideas?
Perhaps
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="#all"
version="2.0">
<xsl:output indent="yes"/>
<xsl:key name="rec2-complete" match="RECORDS2/RECORD" use="concat(id, '|', value)"/>
<xsl:key name="rec2-id" match="RECORDS2/RECORD" use="id"/>
<xsl:template match="DATA">
<xsl:apply-templates select="RECORDS1/RECORD[not(key('rec2-complete', concat(id, '|', value)))]"/>
</xsl:template>
<xsl:template match="RECORDS1/RECORD">
<xsl:copy>
<xsl:copy-of select="node()"/>
<merged>
<xsl:value-of select="if (key('rec2-id', id)/value != value) then 'changed' else 'new'"/>
</merged>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
implements the requirements.
Or, to construct the complete result you have shown now, use
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="#all"
version="2.0">
<xsl:output indent="yes"/>
<xsl:key name="rec2-complete" match="RECORDS2/RECORD" use="concat(id, '|', value)"/>
<xsl:key name="rec2-id" match="RECORDS2/RECORD" use="id"/>
<xsl:template match="DATA">
<xsl:copy>
<RECORDS>
<xsl:apply-templates select="RECORDS1/RECORD[not(key('rec2-complete', concat(id, '|', value)))]"/>
</RECORDS>
</xsl:copy>
</xsl:template>
<xsl:template match="RECORDS1/RECORD">
<xsl:copy>
<xsl:copy-of select="node()"/>
<kind>
<xsl:value-of select="if (key('rec2-id', id)/value != value) then 'Change' else 'New'"/>
</kind>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
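As a cross-check of the expected result outside XSLT, the same comparison can be sketched in Python with the standard library. This is only an illustration under assumptions: the function name classify is hypothetical, the "..." placeholders of the sample are dropped, and the dictionary plays the role of the rec2-id key in the stylesheet.

```python
import xml.etree.ElementTree as ET

DATA = """<DATA>
  <RECORDS1>
    <RECORD><id>11</id><value>123</value></RECORD>
    <RECORD><id>33</id><value>321</value></RECORD>
    <RECORD><id>55</id><value>121113</value></RECORD>
  </RECORDS1>
  <RECORDS2>
    <RECORD><id>11</id><value>123</value></RECORD>
    <RECORD><id>33</id><value>323</value></RECORD>
    <RECORD><id>44</id><value>12333</value></RECORD>
  </RECORDS2>
</DATA>"""


def classify(data_xml):
    """Return (id, value, kind) for RECORDS1 entries that are new or changed."""
    root = ET.fromstring(data_xml)
    # Index RECORDS2 values by id, similar to the 'rec2-id' key.
    known = {}
    for rec in root.findall("RECORDS2/RECORD"):
        known.setdefault(rec.findtext("id"), set()).add(rec.findtext("value"))
    result = []
    for rec in root.findall("RECORDS1/RECORD"):
        rec_id, value = rec.findtext("id"), rec.findtext("value")
        if rec_id not in known:
            result.append((rec_id, value, "New"))
        elif value not in known[rec_id]:
            result.append((rec_id, value, "Change"))
        # Identical id/value pairs are skipped, as in the stylesheet's filter.
    return result


classified = classify(DATA)
print(classified)
```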

XSLT logic to add a sequence for the combination of elements with same dates and ID field

I am struggling to create the logic for a transformation.
Logic: "seq" is a sequential number used to make a unique key when the ID and date fields are equal.
<root>
<Record>
<ID>11</ID>
<date>2020-03-11-07:00</date>
<quantity>10</quantity>
</Record>
<Record>
<ID>13</ID>
<date>2020-03-12-07:00</date>
<quantity>20</quantity>
</Record>
<Record>
<ID>15</ID>
<date>2020-03-13-07:00</date>
<quantity>40</quantity>
</Record>
<Record>
<ID>11</ID>
<date>2020-03-11-07:00</date>
<quantity>5</quantity>
</Record>
<Record>
<ID>13</ID>
<date>2020-03-17-07:00</date>
<quantity>100</quantity>
</Record>
</root>
Expected output:
ID,seq,Date,quantity
11,1,2020-03-11-07:00,10
11,2,2020-03-11-07:00,5
13,1,2020-03-12-07:00,20
15,1,2020-03-13-07:00,40
13,1,2020-03-17-07:00,100
In XSLT 3 it is a simple grouping problem with a composite key:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="#all"
version="3.0">
<xsl:output method="text" />
<xsl:template match="root">
<xsl:text>ID,seq,Date,quantity
</xsl:text>
<xsl:for-each-group select="Record" composite="yes" group-by="ID, date">
<xsl:apply-templates select="current-group()"/>
</xsl:for-each-group>
</xsl:template>
<xsl:template match="Record">
<xsl:value-of select="ID, position(), date, quantity" separator=","/>
<xsl:text>
</xsl:text>
</xsl:template>
</xsl:stylesheet>
https://xsltfiddle.liberty-development.net/93dFepy
XSLT 3 can be used with Saxon 9.8 and later or AltovaXML 2017 R3 or later.
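For readers who want to verify the numbering logic independently of an XSLT 3 processor, here is a small Python sketch of the same idea: group by the composite key (ID, date) in order of first appearance and number the members of each group. The function name with_sequence and the tuple representation of the records are assumptions made for the illustration.

```python
RECORDS = [
    ("11", "2020-03-11-07:00", "10"),
    ("13", "2020-03-12-07:00", "20"),
    ("15", "2020-03-13-07:00", "40"),
    ("11", "2020-03-11-07:00", "5"),
    ("13", "2020-03-17-07:00", "100"),
]


def with_sequence(records):
    """Build CSV lines with a per-(ID, date) sequence number."""
    groups = {}  # dicts keep insertion order, so groups appear in first-seen order
    for rec_id, date, quantity in records:
        groups.setdefault((rec_id, date), []).append(quantity)
    lines = ["ID,seq,Date,quantity"]
    for (rec_id, date), quantities in groups.items():
        for seq, quantity in enumerate(quantities, start=1):
            lines.append(f"{rec_id},{seq},{date},{quantity}")
    return lines


csv_lines = with_sequence(RECORDS)
print("\n".join(csv_lines))
```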

Flatten XML after copy

I have a requirement to create a copy of an XML record for each occurrence of a repeating field, which I am able to do; however, I need the result to be flattened.
I have tried using variables and copying them to the output:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/">
<Hire>
<xsl:for-each select="Hire/Record">
<xsl:variable name="var_record" select="./*
[not(name()='Sweldo')]" />
<xsl:for-each select="Sweldo">
<xsl:variable name="var_SWELDO" select=".">
</xsl:variable>
<Record>
<xsl:copy-of select="$var_record" />
<xsl:copy-of select="$var_SWELDO" />
</Record>
</xsl:for-each>
</xsl:for-each>
</Hire>
</xsl:template>
</xsl:stylesheet>
The input is
<?xml version="1.0" encoding="UTF-8"?>
<Hire>
<Record>
<XRefCode>XX</XRefCode>
<EmployeeNumber>161</EmployeeNumber>
<BirthDate>1985-04-09</BirthDate>
<SocialSecurityNumber>XXXXXXX</SocialSecurityNumber>
<FirstName>XX</FirstName>
<LastName>XX</LastName>
<MiddleName>D</MiddleName>
<Sweldo>
<sahod>ONE MILLION</sahod>
</Sweldo>
<Sweldo>
<sahod>1 BILLION</sahod>
</Sweldo>
</Record>
</Hire>
The output I am getting is
<?xml version="1.0"?>
<Hire>
<Record>
<XRefCode>161</XRefCode>
<EmployeeNumber>161</EmployeeNumber>
<BirthDate>1985-04-09</BirthDate>
<SocialSecurityNumber>999-81-9462</SocialSecurityNumber>
<FirstName>XXX</FirstName>
<LastName>XXXXX</LastName>
<MiddleName>D</MiddleName>
<Sweldo>
<sahod>ONE MILLION</sahod>
</Sweldo>
</Record>
<Record>
<XRefCode>161</XRefCode>
<EmployeeNumber>161</EmployeeNumber>
<BirthDate>1985-04-09</BirthDate>
<SocialSecurityNumber>999-81-9462</SocialSecurityNumber>
<FirstName>XXX</FirstName>
<LastName>XXXX</LastName>
<MiddleName>D</MiddleName>
<Sweldo>
<sahod>1 BILLION</sahod>
</Sweldo>
</Record>
</Hire>
However I need the following format
<?xml version="1.0"?>
<Hire>
<Record>
<XRefCode>161</XRefCode>
<EmployeeNumber>161</EmployeeNumber>
<BirthDate>1985-04-09</BirthDate>
<SocialSecurityNumber>999-81-9462</SocialSecurityNumber>
<FirstName>XXXX</FirstName>
<LastName>XXXXXX</LastName>
<MiddleName>D</MiddleName>
<sahod>ONE MILLION</sahod>
</Record>
<Record>
<XRefCode>161</XRefCode>
<EmployeeNumber>161</EmployeeNumber>
<BirthDate>1985-04-09</BirthDate>
<SocialSecurityNumber>999-81-9462</SocialSecurityNumber>
<FirstName>XXXX</FirstName>
<LastName>XXXX</LastName>
<MiddleName>D</MiddleName>
<sahod>1 BILLION</sahod>
</Record>
</Hire>
Is there a way to completely remove the Sweldo wrapper element?
Change the inner loop from
<xsl:for-each select="Sweldo">
to
<xsl:for-each select="Sweldo/sahod">
so that the node copied through $var_SWELDO is the sahod element itself rather than its Sweldo wrapper.
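The effect of iterating over Sweldo/sahod instead of Sweldo can be checked with a short Python sketch using the standard library. It is an illustration only: the function name flatten is hypothetical and the input is a shortened version of the sample record.

```python
import copy
import xml.etree.ElementTree as ET

HIRE = """<Hire>
  <Record>
    <XRefCode>XX</XRefCode>
    <EmployeeNumber>161</EmployeeNumber>
    <Sweldo><sahod>ONE MILLION</sahod></Sweldo>
    <Sweldo><sahod>1 BILLION</sahod></Sweldo>
  </Record>
</Hire>"""


def flatten(hire_xml):
    """Emit one Record per sahod, copying the shared fields and dropping the Sweldo wrapper."""
    source = ET.fromstring(hire_xml)
    out = ET.Element("Hire")
    for record in source.findall("Record"):
        shared = [child for child in record if child.tag != "Sweldo"]
        # Iterate over the inner element, which mirrors select="Sweldo/sahod".
        for sahod in record.findall("Sweldo/sahod"):
            new_record = ET.SubElement(out, "Record")
            for child in shared:
                new_record.append(copy.deepcopy(child))
            new_record.append(copy.deepcopy(sahod))
    return out


flat = flatten(HIRE)
print(ET.tostring(flat, encoding="unicode"))
```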

Verticalize XML using XSLT

I am trying to implement a transformation that looked simple at first, but everything I have tried has failed.
The XML is generated from a fixed-length record and has the format below.
<?xml version="1.0" encoding="UTF-8"?>
<record>
<no_of_records>30</no_of_records>
<cust_lastname_1>Smith</cust_lastname_1>
<cust_name_1>John</cust_name_1>
<cust_id_1>X45</cust_id_1>
<cust_lastname_2>George</cust_lastname_2>
<cust_name_2>Michael</cust_name_2>
<cust_id_2>X76</cust_id_2>
<cust_lastname_3>Ria</cust_lastname_3>
<cust_name_3>Chris</cust_name_3>
<cust_id_3>C87</cust_id_3>
...
</record>
The no_of_records element indicates how many groups of _X-suffixed elements each record contains; because of the fixed-length origin, there is a defined maximum.
I want to transform it into a "verticalized" form resembling the one below.
<record>
<customer num="1">
<lastname>Smith</lastname>
<name>John</name>
<id>X45</id>
</customer>
<customer num="2">
<lastname>George</lastname>
<name>Michael</name>
<id>X76</id>
</customer>
<customer num="3">
<lastname>Ria</lastname>
<name>Chris</name>
<id>C87</id>
...
</customer>
</record>
Any help would be greatly appreciated.
In XSLT 2.0, you want something like
<xsl:for-each-group select="*" group-starting-with="*[starts-with(local-name(), 'cust_lastname')]">
<customer num="{position()}">
<xsl:apply-templates select="current-group()"/>
</customer>
</xsl:for-each-group>
....
<xsl:template match="*[starts-with(local-name(), 'cust')]">
<xsl:element name="{replace(local-name(), 'cust_(.*?)_[0-9]+', '$1')}">
<xsl:value-of select="."/>
</xsl:element>
</xsl:template>
The solution from @Michael Kay works fine. Thank you!
XML
<?xml version="1.0" encoding="UTF-8"?>
<record>
<no_of_records>3</no_of_records>
<cust_lastname_1>Smith</cust_lastname_1>
<cust_name_1>John</cust_name_1>
<cust_id_1>X45</cust_id_1>
<cust_lastname_2>George</cust_lastname_2>
<cust_name_2>Michael</cust_name_2>
<cust_id_2>X76</cust_id_2>
<cust_lastname_3>Ria</cust_lastname_3>
<cust_name_3>Chris</cust_name_3>
<cust_id_3>C87</cust_id_3>
</record>
XSLT
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="record">
<records>
<xsl:for-each-group select="*[starts-with(local-name(), 'cust_')]"
group-starting-with="*[starts-with(local-name(), 'cust_lastname')]">
<customer num="{position()}">
<xsl:apply-templates select="current-group()"/>
</customer>
</xsl:for-each-group>
</records>
</xsl:template>
<xsl:template match="*[starts-with(local-name(), 'cust')]">
<xsl:element name="{replace(local-name(), 'cust_(.*?)_[0-9]+', '$1')}">
<xsl:value-of select="."/>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
Result
<?xml version="1.0" encoding="UTF-8"?>
<records>
<customer num="1">
<lastname>Smith</lastname>
<name>John</name>
<id>X45</id>
</customer>
<customer num="2">
<lastname>George</lastname>
<name>Michael</name>
<id>X76</id>
</customer>
<customer num="3">
<lastname>Ria</lastname>
<name>Chris</name>
<id>C87</id>
</customer>
</records>
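The two ideas used by the stylesheet above (start a new group at every cust_lastname element, then strip the cust_ prefix and numeric suffix) can also be sketched in Python for a quick sanity check. This is a rough equivalent, not a port: the function name verticalize is hypothetical, and elements whose names do not match the pattern are simply skipped.

```python
import re
import xml.etree.ElementTree as ET

RECORD = """<record>
  <no_of_records>3</no_of_records>
  <cust_lastname_1>Smith</cust_lastname_1>
  <cust_name_1>John</cust_name_1>
  <cust_id_1>X45</cust_id_1>
  <cust_lastname_2>George</cust_lastname_2>
  <cust_name_2>Michael</cust_name_2>
  <cust_id_2>X76</cust_id_2>
</record>"""

FIELD = re.compile(r"cust_(.*?)_[0-9]+")


def verticalize(record_xml):
    """Group flat cust_* fields into numbered customer elements."""
    source = ET.fromstring(record_xml)
    out = ET.Element("records")
    customer = None
    for child in source:
        match = FIELD.fullmatch(child.tag)
        if match is None:
            continue  # e.g. no_of_records
        # A cust_lastname element opens a new group (like group-starting-with).
        if customer is None or child.tag.startswith("cust_lastname"):
            customer = ET.SubElement(out, "customer", num=str(len(out) + 1))
        field = ET.SubElement(customer, match.group(1))
        field.text = child.text
    return out


vertical = verticalize(RECORD)
print(ET.tostring(vertical, encoding="unicode"))
```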

XSLT To filter out records with letters

We have a requirement to filter out records that contain non-digit characters in numeric fields and to report them separately. I came across the following question, which has already been answered:
XPATH To filter out records with letters
However, is there a way to mark these records with a flag or collect them in a variable? We need to report them as invalid records. If we delete them completely, we have no clue which of them were invalid.
Please suggest.
Thank You!
Input:
<?xml version="1.0" encoding="UTF-8"?>
<payload>
<records>
<record>
<number>123</number>
</record>
<record>
<number>456</number>
</record>
<record>
<number>78A</number>
</record>
</records>
</payload>
Output:
<?xml version="1.0" encoding="UTF-8"?>
<payload>
<records>
<record>
<number>123</number>
</record>
<record>
<number>456</number>
</record>
</records>
</payload>
XSLT solution from the link above:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="record[translate(number, '0123456789', '')]"/>
</xsl:stylesheet>
After the match, output the original element with whatever "flag" you want (attribute, comment, processing instruction, etc.).
Example:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="record[string(number(number))='NaN']">
<record invalid="true">
<xsl:apply-templates select="@*|node()"/>
</record>
</xsl:template>
</xsl:stylesheet>
Output
<payload>
<records>
<record>
<number>123</number>
</record>
<record>
<number>456</number>
</record>
<record invalid="true">
<number>78A</number>
</record>
</records>
</payload>
You can still use your original match if you'd like.
Edit: to handle multiple number fields and identify the specific invalid fields (columns) at the record level.
Modified XML input example:
<payload>
<records>
<record>
<number>123</number>
</record>
<record>
<number>456</number>
</record>
<record>
<number>321</number>
<number>78A</number>
<number>654</number>
<number>abc</number>
</record>
</records>
</payload>
Updated XSLT 1.0
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="record">
<xsl:variable name="invalidCols">
<xsl:apply-templates select="*" mode="invalid"/>
</xsl:variable>
<record>
<xsl:if test="string($invalidCols)">
<xsl:attribute name="invalidCols">
<xsl:value-of select="normalize-space($invalidCols)"/>
</xsl:attribute>
</xsl:if>
<xsl:apply-templates select="@*|node()"/>
</record>
</xsl:template>
<xsl:template match="number[string(number(.))='NaN']" mode="invalid">
<xsl:number/>
<xsl:text> </xsl:text>
</xsl:template>
<xsl:template match="*" mode="invalid"/>
</xsl:stylesheet>
Output
<payload>
<records>
<record>
<number>123</number>
</record>
<record>
<number>456</number>
</record>
<record invalidCols="2 4">
<number>321</number>
<number>78A</number>
<number>654</number>
<number>abc</number>
</record>
</records>
</payload>
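The invalidCols value can be reproduced outside XSLT with a few lines of Python, which may help when testing other inputs. The sketch assumes that a field is invalid exactly when XPath 1.0 number() would yield NaN, and approximates that rule with a regular expression (optional whitespace, optional minus sign, digits with an optional fraction); the names XPATH_NUMBER and invalid_columns are hypothetical.

```python
import re

# Approximation of the strings that XPath 1.0 number() converts to a real number.
XPATH_NUMBER = re.compile(r"[ \t\r\n]*-?([0-9]+(\.[0-9]*)?|\.[0-9]+)[ \t\r\n]*")


def invalid_columns(values):
    """Return the 1-based positions of values that are not valid numbers."""
    return [
        position
        for position, value in enumerate(values, start=1)
        if XPATH_NUMBER.fullmatch(value) is None
    ]


bad_columns = invalid_columns(["321", "78A", "654", "abc"])
print(" ".join(str(position) for position in bad_columns))
```

An empty result means the record would be emitted without the invalidCols attribute.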