Get max date from each specific user ID without repeating IDs using XSLT - xslt

I have the following XML
<?xml version='1.0' encoding='UTF-8'?>
<root>
<Data>
<Record>
<User>1</User>
<LastModified>1/1/2023</LastModified>
<UniversityDegree>University of Texas Bachelors</UniversityDegree>
</Record>
<Record>
<User>1</User>
<LastModified>1/11/2023</LastModified>
<UniversityDegree>University of Missouri Masters</UniversityDegree>
</Record>
<Record>
<User>2</User>
<LastModified>1/1/2024</LastModified>
<UniversityDegree>University of Texas Bachelors</UniversityDegree>
</Record>
<Record>
<User>2</User>
<LastModified>1/12/2023</LastModified>
<UniversityDegree>University of Missouri Masters</UniversityDegree>
</Record>
<Record>
<User>3</User>
<LastModified>5/7/2023</LastModified>
<UniversityDegree>University of Texas Bachelors</UniversityDegree>
</Record>
<Record>
<User>3</User>
<LastModified>9/8/2023</LastModified>
<UniversityDegree>University of Missouri Masters</UniversityDegree>
</Record>
<Record>
<User>4</User>
<LastModified>24/1/2023</LastModified>
<UniversityDegree>University of Texas Bachelors</UniversityDegree>
</Record>
<Record>
<User>4</User>
<LastModified>28/9/2023</LastModified>
<UniversityDegree>University of Missouri Masters</UniversityDegree>
</Record>
<Record>
<User>5</User>
<LastModified>15/3/2023</LastModified>
<UniversityDegree>University of Texas Bachelors</UniversityDegree>
</Record>
<Record>
<User>5</User>
<LastModified>10/3/2023</LastModified>
<UniversityDegree>University of Missouri Masters</UniversityDegree>
</Record>
</Data>
</root>
And I need to extract the max date for each user. For example, for user 5 the max of 15/3/2023 and 10/3/2023 is 15/3/2023, and it should be shown like this:
<?xml version="1.0" encoding="UTF-8"?>
<LastModified>15/3/2023</LastModified>
<User>5</User>
I've done the following,
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="//root">
<xsl:for-each select="//Record">
<xsl:sort select="number(substring(LastModified, 7, 4))" order="descending"/>
<xsl:sort select="number(substring(LastModified, 3, 2))" order="descending"/>
<xsl:sort select="number(substring(LastModified, 1, 2))" order="descending"/>
<xsl:if test="position() = 1">
<xsl:copy-of select="LastModified"/>
<xsl:copy-of select="User"/>
<Source>SF</Source>
</xsl:if>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
Which returns,
<?xml version="1.0" encoding="UTF-8"?>
<LastModified>1/1/2024</LastModified>
<User>2</User>
<Source>SF</Source>
But it returns only the first sorted record, because of the position() = 1 test. I need the max date for each user, without duplicate users. If I remove the xsl:if, I get everything sorted, but users are repeated:
<?xml version="1.0" encoding="UTF-8"?>
<LastModified>1/1/2024</LastModified>
<User>2</User>
<Source>SF</Source>
<LastModified>1/12/2023</LastModified>
<User>2</User>
<Source>SF</Source>
<LastModified>1/11/2023</LastModified>
<User>1</User>
<Source>SF</Source>
<LastModified>28/9/2023</LastModified>
<User>4</User>
<Source>SF</Source>
<LastModified>24/1/2023</LastModified>
<User>4</User>
<Source>SF</Source>
<LastModified>15/3/2023</LastModified>
<User>5</User>
<Source>SF</Source>
<LastModified>10/3/2023</LastModified>
<User>5</User>
<Source>SF</Source>
<LastModified>1/1/2023</LastModified>
<User>1</User>
<Source>SF</Source>
<LastModified>5/7/2023</LastModified>
<User>3</User>
<Source>SF</Source>
<LastModified>9/8/2023</LastModified>
<User>3</User>
<Source>SF</Source>

Try perhaps something like:
XSLT 2.0
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/root">
<result>
<xsl:for-each-group select="Data/Record" group-by="User">
<user>
<xsl:for-each select="current-group()">
<xsl:sort select="tokenize(LastModified, '/')[3]" data-type="number" order="descending"/>
<xsl:sort select="tokenize(LastModified, '/')[2]" data-type="number" order="descending"/>
<xsl:sort select="tokenize(LastModified, '/')[1]" data-type="number" order="descending"/>
<xsl:if test="position() = 1">
<xsl:copy-of select="LastModified, User"/>
<Source>SF</Source>
</xsl:if>
</xsl:for-each>
</user>
</xsl:for-each-group>
</result>
</xsl:template>
</xsl:stylesheet>
Caveat: not tested very thoroughly.
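The stylesheet in the question declares version="1.0", so in case only an XSLT 1.0 processor is available, here is a minimal sketch of the same idea using Muenchian grouping. The key name rec-by-user is an arbitrary choice, and the result/user wrapper elements are borrowed from the answer above rather than from the question. Note that the date parts are split on '/' with substring-before/substring-after instead of fixed offsets, because the sample dates are not zero-padded (in 1/1/2024, substring(LastModified, 7, 4) yields 24, not 2024).

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

  <!-- Index every Record by its User value (key name is an assumption) -->
  <xsl:key name="rec-by-user" match="Record" use="User"/>

  <xsl:template match="/root">
    <result>
      <!-- Visit only the first Record of each distinct User -->
      <xsl:for-each select="Data/Record[generate-id() = generate-id(key('rec-by-user', User)[1])]">
        <user>
          <!-- Sort this user's records by year, then month, then day, newest first -->
          <xsl:for-each select="key('rec-by-user', User)">
            <xsl:sort select="substring-after(substring-after(LastModified, '/'), '/')"
                      data-type="number" order="descending"/>
            <xsl:sort select="substring-before(substring-after(LastModified, '/'), '/')"
                      data-type="number" order="descending"/>
            <xsl:sort select="substring-before(LastModified, '/')"
                      data-type="number" order="descending"/>
            <!-- Keep only the newest record of the group -->
            <xsl:if test="position() = 1">
              <xsl:copy-of select="LastModified"/>
              <xsl:copy-of select="User"/>
              <Source>SF</Source>
            </xsl:if>
          </xsl:for-each>
        </user>
      </xsl:for-each>
    </result>
  </xsl:template>
</xsl:stylesheet>
```

This assumes every LastModified value has the d/m/y shape shown in the question; anything else would need extra validation.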

Here is a somewhat different approach that's not necessarily better, but it takes advantage of more XSLT 2.0 features:
XSLT 2.0
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:mf="http://www.example.com/mf"
exclude-result-prefixes="xs mf">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/root">
<result>
<xsl:for-each-group select="Data/Record" group-by="User">
<user>
<LastModified>
<xsl:value-of select="format-date(max(current-group()/LastModified/mf:reformat-date(.)), '[D]/[M]/[Y]')"/>
</LastModified>
<xsl:copy-of select="User"/>
<Source>SF</Source>
</user>
</xsl:for-each-group>
</result>
</xsl:template>
<xsl:function name="mf:reformat-date" as="xs:date">
<!-- expected input format: d/m/y -->
<xsl:param name="date"/>
<xsl:variable name="components" select="for $t in tokenize($date, '/') return xs:integer($t)" />
<xsl:value-of select="format-number($components[3], '0000'), format-number($components[2], '00'), format-number($components[1], '00')" separator="-"/>
</xsl:function>
</xsl:stylesheet>

Related

XSLT logic to add a sequence for the combination of elements with same dates and ID field

I am struggling to create logic for transformation.
Logic: "Seq" A sequential number used to make a unique key when the ID and Date fields are equal.
<root>
<Record>
<ID>11</ID>
<date>2020-03-11-07:00</date>
<quantity>10</quantity>
</Record>
<Record>
<ID>13</ID>
<date>2020-03-12-07:00</date>
<quantity>20</quantity>
</Record>
<Record>
<ID>15</ID>
<date>2020-03-13-07:00</date>
<quantity>40</quantity>
</Record>
<Record>
<ID>11</ID>
<date>2020-03-11-07:00</date>
<quantity>5</quantity>
</Record>
<Record>
<ID>13</ID>
<date>2020-03-17-07:00</date>
<quantity>100</quantity>
</Record>
</root>
to this output:
ID,seq,Date,quantity
11,1,2020-03-11-07:00,10
11,2,2020-03-11-07:00,5
13,1,2020-03-12-07:00,20
15,1,2020-03-13-07:00,40
13,1,2020-03-17-07:00,100
In XSLT 3 it is a simple grouping problem with a composite key:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="#all"
version="3.0">
<xsl:output method="text" />
<xsl:template match="root">
<xsl:text>ID,seq,Date,quantity
</xsl:text>
<xsl:for-each-group select="Record" composite="yes" group-by="ID, date">
<xsl:apply-templates select="current-group()"/>
</xsl:for-each-group>
</xsl:template>
<xsl:template match="Record">
<xsl:value-of select="ID, position(), date, quantity" separator=","/>
<xsl:text>
</xsl:text>
</xsl:template>
</xsl:stylesheet>
https://xsltfiddle.liberty-development.net/93dFepy
XSLT 3 can be used with Saxon 9.8 and later or AltovaXML 2017 R3 or later.
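If you are limited to an XSLT 2.0 processor, the composite attribute is not available, but a concatenated grouping key is a common workaround. The following is a sketch under the assumption that the '|' separator never occurs inside ID or date values:

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="root">
    <xsl:text>ID,seq,Date,quantity&#10;</xsl:text>
    <!-- Emulate a composite key by joining ID and date with a separator
         that is assumed not to appear in the data -->
    <xsl:for-each-group select="Record" group-by="concat(ID, '|', date)">
      <xsl:for-each select="current-group()">
        <!-- position() is the sequence number within the ID/date group -->
        <xsl:value-of select="ID, position(), date, quantity" separator=","/>
        <xsl:text>&#10;</xsl:text>
      </xsl:for-each>
    </xsl:for-each-group>
  </xsl:template>
</xsl:stylesheet>
```

Groups are processed in order of first appearance, so both records for ID 11 on 2020-03-11 come out together, as in the requested output.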

Multiply 2 filtered numbers and sum

I need to sum the multiplication of 2 numbers based on this example
<test>
<stop>
<id>1</id>
<unit_id>1</unit_id>
<unit_id>2</unit_id>
</stop>
<stop>
<id>2</id>
<unit_id>1</unit_id>
<unit_id>3</unit_id>
</stop>
<unit>
<id>1</id>
<count>2</count>
<value>1</value>
</unit>
<unit>
<id>2</id>
<count>4</count>
<value>1</value>
</unit>
<unit>
<id>3</id>
<count>2</count>
<value>3</value>
</unit>
</test>
The result i want to get is the one below
<test>
<stop>
<id>1</id>
<sum>6</sum>
</stop>
<stop>
<id>2</id>
<sum>10</sum>
</stop>
</test>
Any tips how to get it?
I tried with this example, but the sum of the multiplication doesn't work. It is fine for only the sum or only the multiplication, but not both:
<xsl:template match="stop">
<xsl:variable name="ship_unit" select="id"/>
<xsl:value-of select="sum(following-sibling::unit[id=$ship_unit]/count*following-sibling::unit[id=$ship_unit]/value)"/>
If I am guessing correctly, you want to do something like:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="utf-8" indent="yes"/>
<xsl:key name="unit" match="unit" use="id" />
<xsl:template match="/test">
<xsl:copy>
<xsl:for-each select="stop">
<xsl:variable name="unit1" select="key('unit', unit_id[1])" />
<xsl:variable name="unit2" select="key('unit', unit_id[2])" />
<xsl:copy>
<xsl:copy-of select="id"/>
<sum>
<xsl:value-of select="$unit1/count * $unit1/value + $unit2/count * $unit2/value" />
</sum>
</xsl:copy>
</xsl:for-each>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
However, the result of applying this to your input example will be:
<?xml version="1.0" encoding="utf-8"?>
<test>
<stop>
<id>1</id>
<sum>6</sum>
</stop>
<stop>
<id>2</id>
<sum>8</sum>
</stop>
</test>
and not what you posted.
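The stylesheet above is hard-wired to exactly two unit_id elements per stop. If XSLT 2.0 is available, a for expression inside sum() handles any number of them. This is only a sketch of that variation, reusing the same unit key:

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="utf-8" indent="yes"/>

  <!-- Look up unit elements by their id -->
  <xsl:key name="unit" match="unit" use="id"/>

  <xsl:template match="/test">
    <xsl:copy>
      <xsl:for-each select="stop">
        <xsl:copy>
          <xsl:copy-of select="id"/>
          <sum>
            <!-- key() accepts the whole sequence of unit_id values;
                 multiply count by value per unit, then add the products -->
            <xsl:value-of select="sum(for $u in key('unit', unit_id) return $u/count * $u/value)"/>
          </sum>
        </xsl:copy>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

One caveat: key() returns each matching unit once, so a unit_id repeated within the same stop would be counted only once.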

Verticalize XML using XSLT

I am trying to implement a transformation that looked simple at first, but everything I have tried has failed.
The XML is generated from a fixed-length record and has the format below.
<?xml version="1.0" encoding="UTF-8"?>
<record>
<no_of_records>30</no_of_records>
<cust_lastname_1>Smith</cust_lastname_1>
<cust_name_1>John</cust_name_1>
<cust_id_1>X45</cust_id_1>
<cust_lastname_2>George</cust_lastname_2>
<cust_name_2>Michael</cust_name_2>
<cust_id_2>X76</cust_id_2>
<cust_lastname_3>Ria</cust_lastname_3>
<cust_name_3>Chris</cust_name_3>
<cust_id_3>C87</cust_id_3>
...
</record>
The no_of_records element indicates how many groups of _X-suffixed elements each record contains; because the data comes from a fixed-length record, that number has a defined maximum.
I want to transform it to a "verticalized" form resembling the one below.
<record>
<customer num="1">
<lastname>Smith</lastname>
<name>John</name>
<id>X45</id>
</customer>
<customer num="2">
<lastname>George</lastname>
<name>Michael</name>
<id>X76</id>
</customer>
<customer num="3">
<lastname>Ria</lastname>
<name>Chris</name>
<id>C87</id>
...
</customer>
</record>
Any help would be greatly appreciated.
In XSLT 2.0, you want something like
<xsl:for-each-group select="*" group-starting-with="*[starts-with(local-name(), 'cust_lastname')]">
<customer num="{position()}">
<xsl:apply-templates select="current-group()"/>
</customer>
</xsl:for-each-group>
....
<xsl:template match="*[starts-with(local-name(), 'cust')]">
<xsl:element name="{replace(local-name(), 'cust_(.*?)_[0-9]+', '$1')}">
<xsl:value-of select="."/>
</xsl:element>
</xsl:template>
The solution from @Michael Kay works fine. Thank you!
XML
<?xml version="1.0" encoding="UTF-8"?>
<record>
<no_of_records>3</no_of_records>
<cust_lastname_1>Smith</cust_lastname_1>
<cust_name_1>John</cust_name_1>
<cust_id_1>X45</cust_id_1>
<cust_lastname_2>George</cust_lastname_2>
<cust_name_2>Michael</cust_name_2>
<cust_id_2>X76</cust_id_2>
<cust_lastname_3>Ria</cust_lastname_3>
<cust_name_3>Chris</cust_name_3>
<cust_id_3>C87</cust_id_3>
</record>
XSLT
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="record">
<records>
<xsl:for-each-group select="*[starts-with(local-name(), 'cust_')]"
group-starting-with="*[starts-with(local-name(), 'cust_lastname')]">
<customer num="{position()}">
<xsl:apply-templates select="current-group()"/>
</customer>
</xsl:for-each-group>
</records>
</xsl:template>
<xsl:template match="*[starts-with(local-name(), 'cust')]">
<xsl:element name="{replace(local-name(), 'cust_(.*?)_[0-9]+', '$1')}">
<xsl:value-of select="."/>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
Result
<?xml version="1.0" encoding="UTF-8"?>
<records>
<customer num="1">
<lastname>Smith</lastname>
<name>John</name>
<id>X45</id>
</customer>
<customer num="2">
<lastname>George</lastname>
<name>Michael</name>
<id>X76</id>
</customer>
<customer num="3">
<lastname>Ria</lastname>
<name>Chris</name>
<id>C87</id>
</customer>
</records>
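For anyone restricted to XSLT 1.0, where xsl:for-each-group and replace() are unavailable, a rough equivalent is to iterate over the cust_lastname_N elements and look up the siblings that share the same numeric suffix. This sketch assumes the three field names shown in the question (lastname, name, id) are the only ones; the variable name n is arbitrary.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="record">
    <records>
      <!-- One customer per cust_lastname_N element -->
      <xsl:for-each select="*[starts-with(local-name(), 'cust_lastname_')]">
        <!-- The numeric suffix N ties the three fields together -->
        <xsl:variable name="n" select="substring-after(local-name(), 'cust_lastname_')"/>
        <customer num="{$n}">
          <lastname><xsl:value-of select="."/></lastname>
          <name><xsl:value-of select="../*[local-name() = concat('cust_name_', $n)]"/></name>
          <id><xsl:value-of select="../*[local-name() = concat('cust_id_', $n)]"/></id>
        </customer>
      </xsl:for-each>
    </records>
  </xsl:template>
</xsl:stylesheet>
```

Unlike the XSLT 2.0 version, this does not depend on the fields of one customer being adjacent in the input, but it has to be extended by hand if more cust_ fields are added.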

XSLT code to pass a value in output XML based on a condition in input XML

Input XML:
<?xml version="1.0" encoding="UTF-8"?>
<DATA>
<RECORDS>
<Group>
<Name>12345</Name>
<Grp>MANAGER</Grp>
<FName>Alex</FName>
<LName>Johnson</LName>
<String1>abcd</String1>
/Group>
<Group>
<Name>67891</Name>
<Grp>PROJECT MANAGER</Grp>
<FName>JAMES</FName>
<LName>HARPER</LName>
<String1></String1>
</Group> </RECORDS> <LOGIN>
<User>
<Name>12345</UserName>
<Last>14/02/2013</Last>
</User>
<User>
<Name>67891</Name>
<Last>14/01/2013/Last>
</User> </LOGIN> </DATA>
Requirement:
In output XML
If String1 has a value then Type tag should have value as "axbx" and
if String1 is blank then Type tag should have value as "dydy"
<?xml version="1.0" encoding="UTF-8"?>
<DATA>
<RECORDS>
<Group>
<Name>12345</Name>
<Grp>MANAGER</Grp>
<FName>Alex</FName>
<LName>Johnson</LName>
<Type>axbx</Type>
</Group>
<Group>
<Name>67891</Name>
<Grp>PROJECT MANAGER</Grp>
<FName>JAMES</FName>
<LName>HARPER</LName>
<Type>dydy</Type>
</Group> </RECORDS> </DATA>
Please suggest.
I can't edit your question so I copy the corrected XML:
<?xml version="1.0" encoding="UTF-8"?>
<DATA>
<RECORDS>
<Group>
<Name>12345</Name>
<Grp>MANAGER</Grp>
<FName>Alex</FName>
<LName>Johnson</LName>
<String1>abcd</String1>
</Group>
<Group>
<Name>67891</Name>
<Grp>PROJECT MANAGER</Grp>
<FName>JAMES</FName>
<LName>HARPER</LName>
<String1></String1>
</Group>
</RECORDS>
<LOGIN>
<User>
<Name>12345</Name>
<Last>14/02/2013</Last>
</User>
<User>
<Name>67891</Name>
<Last>14/01/2013</Last>
</User>
</LOGIN>
</DATA>
and the XSL
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" encoding="utf-8" indent="yes"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="String1">
<Type>
<xsl:choose>
<xsl:when test="string-length(.) > 0">axbx</xsl:when>
<xsl:otherwise>dydy</xsl:otherwise>
</xsl:choose>
</Type>
</xsl:template>
</xsl:stylesheet>
I'm not very experienced so there might be a better way.
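One possible alternative, offered as a sketch rather than a definitive answer: the condition can be moved into the match patterns, so no xsl:choose is needed. The requested output also has no LOGIN section, so an empty template for LOGIN is added here on that assumption. Note that normalize-space() treats a whitespace-only String1 as blank, which differs slightly from the string-length() test above.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="utf-8" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity template: copy everything not matched more specifically -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- String1 with non-blank content; the predicate gives this pattern
       a higher default priority than the plain String1 pattern below -->
  <xsl:template match="String1[normalize-space()]">
    <Type>axbx</Type>
  </xsl:template>

  <!-- Any other String1, i.e. empty or whitespace only -->
  <xsl:template match="String1">
    <Type>dydy</Type>
  </xsl:template>

  <!-- Assumption: LOGIN is dropped because the requested output omits it -->
  <xsl:template match="LOGIN"/>
</xsl:stylesheet>
```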

efficient xslt conditional increment

In this question I asked how to perform a conditional increment. The provided answer worked, but it does not scale well on huge data sets.
The Input:
<Users>
<User>
<id>1</id>
<username>jack</username>
</User>
<User>
<id>2</id>
<username>bob</username>
</User>
<User>
<id>3</id>
<username>bob</username>
</User>
<User>
<id>4</id>
<username>jack</username>
</User>
</Users>
The desired output (in optimal time-complexity):
<Users>
<User>
<id>1</id>
<username>jack01</username>
</User>
<User>
<id>2</id>
<username>bob01</username>
</User>
<User>
<id>3</id>
<username>bob02</username>
</User>
<User>
<id>4</id>
<username>jack02</username>
</User>
</Users>
For this purpose it would be nice to
sort input by username
for each user
when previous username is equals current username
increment counter and
set username to '$username$counter'
otherwise
set counter to 1
(sort by id again - no requirement)
Any thoughts?
This is kind of ugly and I'm not fond of using xsl:for-each, but it should be faster than using the preceding-sibling axis, and it doesn't need a 2-pass approach:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >
<xsl:key name="count" match="User" use="username" />
<xsl:template match="Users">
<Users>
<xsl:for-each select="User[generate-id()=generate-id(key('count',username)[1])]">
<xsl:for-each select="key('count',username)">
<User>
<xsl:copy-of select="id" />
<username>
<xsl:value-of select="username" />
<xsl:number value="position()" format="01"/>
</username>
</User>
</xsl:for-each>
</xsl:for-each>
</Users>
</xsl:template>
</xsl:stylesheet>
If you really need it sorted by ID afterwards, you can wrap it into a two-pass template:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt">
<xsl:key name="count" match="User" use="username" />
<xsl:template match="Users">
<xsl:variable name="pass1">
<xsl:for-each select="User[generate-id()=generate-id(key('count',username)[1])]">
<xsl:for-each select="key('count',username)">
<User>
<xsl:copy-of select="id" />
<username>
<xsl:value-of select="username" />
<xsl:number value="position()" format="01"/>
</username>
</User>
</xsl:for-each>
</xsl:for-each>
</xsl:variable>
<xsl:variable name="pass1Nodes" select="msxsl:node-set($pass1)" />
<Users>
<xsl:for-each select="$pass1Nodes/*">
<xsl:sort select="id" data-type="number" />
<xsl:copy-of select="." />
</xsl:for-each>
</Users>
</xsl:template>
</xsl:stylesheet>
This transformation produces exactly the specified wanted result and is efficient (O(N)):
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:ext="http://exslt.org/common" exclude-result-prefixes="ext">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:key name="kUserByName" match="User" use="username"/>
<xsl:key name="kUByGid" match="u" use="@gid"/>
<xsl:variable name="vOrderedByName">
<xsl:for-each select=
"/*/User[generate-id()=generate-id(key('kUserByName',username)[1])]">
<xsl:for-each select="key('kUserByName',username)">
<u gid="{generate-id()}" pos="{position()}"/>
</xsl:for-each>
</xsl:for-each>
</xsl:variable>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="username/text()">
<xsl:value-of select="."/>
<xsl:variable name="vGid" select="generate-id(../..)"/>
<xsl:for-each select="ext:node-set($vOrderedByName)[1]">
<xsl:value-of select="format-number(key('kUByGid', $vGid)/@pos, '00')"/>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
When applied on the provided XML document:
<Users>
<User>
<id>1</id>
<username>jack</username>
</User>
<User>
<id>2</id>
<username>bob</username>
</User>
<User>
<id>3</id>
<username>bob</username>
</User>
<User>
<id>4</id>
<username>jack</username>
</User>
</Users>
the wanted, correct result is produced:
<Users>
<User>
<id>1</id>
<username>jack01</username>
</User>
<User>
<id>2</id>
<username>bob01</username>
</User>
<User>
<id>3</id>
<username>bob02</username>
</User>
<User>
<id>4</id>
<username>jack02</username>
</User>
</Users>
Here's a slight variation, though possibly not a great increase in efficiency:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >
<xsl:output method="xml" indent="yes"/>
<xsl:key name="User" match="User" use="username" />
<xsl:template match="username/text()">
<xsl:value-of select="." />
<xsl:variable name="id" select="generate-id(..)" />
<xsl:for-each select="key('User', .)">
<xsl:if test="generate-id(username) = $id">
<xsl:number value="position()" format="01"/>
</xsl:if>
</xsl:for-each>
</xsl:template>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
What this is doing is defining a key to group Users by username. Then, for each username element, you look through the elements in the key for that username, and output the position when you find a match.
One slight advantage of this approach is that you are only looking at user records with the same name. This may be more efficient if you don't have huge numbers of the same name.
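If an XSLT 2.0 processor is an option (the question does not say), the grouping and the optional re-sort by id can be written without keys or a node-set extension. This is a sketch under that assumption; the variable name numbered is arbitrary.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="Users">
    <!-- Pass 1: number the users within each username group -->
    <xsl:variable name="numbered">
      <xsl:for-each-group select="User" group-by="username">
        <xsl:for-each select="current-group()">
          <User>
            <xsl:copy-of select="id"/>
            <username>
              <xsl:value-of select="username"/>
              <!-- position() is the index inside the current group -->
              <xsl:number value="position()" format="01"/>
            </username>
          </User>
        </xsl:for-each>
      </xsl:for-each-group>
    </xsl:variable>
    <!-- Pass 2: restore the order by numeric id -->
    <Users>
      <xsl:perform-sort select="$numbered/User">
        <xsl:sort select="id" data-type="number"/>
      </xsl:perform-sort>
    </Users>
  </xsl:template>
</xsl:stylesheet>
```

How well this scales depends on how the processor implements grouping and sorting, so it is worth measuring on real data before relying on it.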