Grouping XML Values Based On XML Attributes (XSLT) - xslt

Below are three rows of tab-separated data (plus a header row) that are to be transformed into XML using XSLT.
Column_1 Column_2 Column_3 Column_4
A B C D
A B A F
C B D C
The expected output is:
<firstTag Column_1 ='A' Column_2='B'>
<secondTag Column_3='C' Column_4='D'/>
<secondTag Column_3='A' Column_4='F'/>
</firstTag>
<firstTag Column_1 ='C' Column_2='B'>
<secondTag Column_3='D' Column_4='C'/>
</firstTag>
How can these rows be grouped on one or more attribute values (here Column_1 and Column_2) using XSLT?

You can read a non-XML text file like the one you seem to have with unparsed-text (XSLT 2.0) or, in XSLT 3.0 (supported by Saxon 9.8 and later, so since 2017), with unparsed-text-lines. To split the tab-delimited data into fields you have the tokenize function plus the xsl:analyze-string element or, in XSLT 3.0, the analyze-string function. The structured result, either some XML or (in XSLT 3.0) some mixture of arrays and sequences of strings, is then something you can feed to xsl:for-each-group.
Grouping in XSLT 2 and 3 is covered in https://stackoverflow.com/tags/xslt-grouping/info.
Here is an example using XSLT 3 and for-each-group over a grouping population that is a sequence of arrays of strings:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:math="http://www.w3.org/2005/xpath-functions/math"
xmlns:map="http://www.w3.org/2005/xpath-functions/map"
xmlns:array="http://www.w3.org/2005/xpath-functions/array"
exclude-result-prefixes="#all"
version="3.0">
<xsl:param name="text" as="xs:string">Column_1 Column_2 Column_3 Column_4
A B C D
A B A F
C B D C</xsl:param>
<xsl:param name="el1" as="xs:string">firstElement</xsl:param>
<xsl:param name="el2" as="xs:string">secondElement</xsl:param>
<xsl:variable name="rows" as="array(xs:string)*" select="($text => tokenize('\r?\n')) ! array { tokenize(., '\s+') }"/>
<xsl:variable name="data-rows" select="$rows => tail()"/>
<xsl:variable name="column-names" select="$rows[1]?*"/>
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/" name="xsl:initial-template">
<result>
<xsl:for-each-group select="$data-rows" composite="yes" group-by="?1, ?2">
<xsl:element name="{$el1}">
<xsl:attribute name="{$column-names[1]}" select="current-grouping-key()[1]"/>
<xsl:attribute name="{$column-names[2]}" select="current-grouping-key()[2]"/>
<xsl:apply-templates select="current-group()"/>
</xsl:element>
</xsl:for-each-group>
</result>
</xsl:template>
<xsl:template match=".[. instance of array(xs:string)]">
<xsl:element name="{$el2}">
<xsl:for-each select="?(3 to array:size(current()))">
<xsl:attribute name="{subsequence($column-names, position() + 2, 1)}" select="."/>
</xsl:for-each>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
When I copied the input sample from your question it didn't seem to contain tab characters so I tokenized on whitespace instead.
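If the data lives in an actual tab-separated file, the same grouping can be driven by unparsed-text-lines. The following is only a sketch under assumptions: the file name input.tsv is hypothetical, the columns are taken to be separated by real tab characters, and the element names firstTag and secondTag are hard-coded to match the expected output in the question.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                exclude-result-prefixes="#all"
                version="3.0">

  <!-- Assumption: hypothetical file name; a relative URI is resolved
       against the base URI of the stylesheet. -->
  <xsl:param name="input-uri" as="xs:string" select="'input.tsv'"/>

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/" name="xsl:initial-template">
    <!-- One string per line, skipping blank lines. -->
    <xsl:variable name="lines" as="xs:string*"
                  select="unparsed-text-lines($input-uri)[normalize-space()]"/>
    <!-- The first line holds the column names. -->
    <xsl:variable name="names" as="xs:string*"
                  select="tokenize(head($lines), '\t')"/>
    <result>
      <!-- Composite key: the first two fields of each data line. -->
      <xsl:for-each-group select="tail($lines)" composite="yes"
                          group-by="tokenize(., '\t')[position() le 2]">
        <firstTag>
          <xsl:attribute name="{$names[1]}" select="current-grouping-key()[1]"/>
          <xsl:attribute name="{$names[2]}" select="current-grouping-key()[2]"/>
          <xsl:for-each select="current-group()">
            <xsl:variable name="cells" select="tokenize(., '\t')"/>
            <secondTag>
              <xsl:attribute name="{$names[3]}" select="$cells[3]"/>
              <xsl:attribute name="{$names[4]}" select="$cells[4]"/>
            </secondTag>
          </xsl:for-each>
        </firstTag>
      </xsl:for-each-group>
    </result>
  </xsl:template>

</xsl:stylesheet>
```

With Saxon this can be started from the named template xsl:initial-template, so no source XML document is needed.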

Related

XSLT Need to Limit Return of Multiple Instances in XML File to 18 Characters

I currently have the following code to combine multiple instances of Ustrd into one returned value:
<Ustrd>
<xsl:value-of select="a:RmtInf/a:Ustrd"/>
</Ustrd>
This returns:
<Ustrd>Item-1 Item-2 Item-3</Ustrd>
The problem is that I need to limit this to 18 characters, and the substring function does not work with a sequence of items.
Tried:
<Ustrd>
<xsl:value-of select="substring(a:RmtInf/a:Ustrd, 1, 18)"/>
</Ustrd>
Expected Result:
<Ustrd>Item-1 Item-2 Item</Ustrd>
Use string-join first e.g. substring(string-join(a:RmtInf/a:Ustrd, ' '), 1, 18). In XPath 3.1 you can also write that as a:RmtInf/a:Ustrd => string-join(' ') => substring(1, 18).
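In context, the string-join approach might look like the following XSLT 2.0 sketch. The namespace URI urn:example:placeholder is an assumption standing in for whatever is really bound to the a: prefix (the question does not show it), and the template is meant to be merged into an existing stylesheet rather than used on its own.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:a="urn:example:placeholder"
                exclude-result-prefixes="a">

  <!-- Assumption: "urn:example:placeholder" replaces the real namespace
       of the a: prefix, which is not given in the question. -->

  <!-- Match any element that has an a:RmtInf child, whatever its name. -->
  <xsl:template match="*[a:RmtInf]">
    <Ustrd>
      <!-- Join all a:Ustrd values with a space into one string first,
           then keep only the first 18 characters of that string. -->
      <xsl:value-of
          select="substring(string-join(a:RmtInf/a:Ustrd, ' '), 1, 18)"/>
    </Ustrd>
  </xsl:template>

</xsl:stylesheet>
```

The order matters: substring expects a single string as its first argument, so the sequence has to be collapsed with string-join before it is truncated.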
Here's a way this could be done in XSLT 1.0.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/">
<Ustrd>
<xsl:variable name="temp">
<xsl:for-each select="RmtInf/Ustrd">
<xsl:value-of select="."/>
<xsl:if test="position()!=last()">
<xsl:value-of select="' '"/>
</xsl:if>
</xsl:for-each>
</xsl:variable>
<xsl:value-of select="substring($temp,1,18)"/>
</Ustrd>
</xsl:template>
</xsl:stylesheet>
(You only need to add your namespace.)
See it working here: https://xsltfiddle.liberty-development.net/pPgzCL4

How are sequences spliced, and why is my variable's value a document node?

Look at the code below:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="xs"
version="3.0">
<xsl:output indent="yes"/>
<xsl:template match="/">
<root>
<xsl:variable name="v1">
<xsl:variable name="a1" select="137"/>
<xsl:variable name="a2" select="(1, 3, 'abc')"/>
<xsl:variable name="a3" select="823"/>
<xsl:sequence select="$a1"/>
<xsl:sequence select="$a2"/>
<xsl:sequence select="$a3"/>
</xsl:variable>
<xsl:variable name="v2" as="item()+">
<xsl:variable name="b1" select="137"/>
<xsl:variable name="b2" select="(1, 3)"/>
<xsl:variable name="b3" select="823"/>
<xsl:variable name="b4" select="'abc'"/>
<xsl:sequence select="$b1"/>
<xsl:sequence select="$b2"/>
<xsl:sequence select="$b3"/>
<xsl:sequence select="$b4"/>
</xsl:variable>
<count>
<xsl:text>v1 count is: </xsl:text>
<xsl:value-of select="count($v1)"/>
</count>
<count>
<xsl:text>v2 count is: </xsl:text>
<xsl:value-of select="count($v2)"/>
</count>
<count>
<xsl:text>a2 count is: </xsl:text>
<xsl:value-of select="count((1, 3, 'abc'))"/>
</count>
</root>
</xsl:template>
</xsl:stylesheet>
The resulting output is:
<root>
<count>v1 count is: 1</count>
<count>v2 count is: 5</count>
<count>a2 count is: 3</count>
</root>
Why is the v2 count different from the v1 count? They seem to contain the same items. How are the sequences spliced together?
Why is v1 treated as a document-node?
Well, you have different variable declarations: one uses the as attribute and the other does not.
Without an as attribute, an xsl:variable whose value comes from its content (a sequence constructor) builds a temporary tree, so $v1 is a single document node containing the constructed content; that is why count($v1) is 1. With as="item()+" the value is the sequence itself, and because nested sequences are flattened, 137, (1, 3), 823 and 'abc' become five items.
As for the gory details, the spec treats your first case in https://www.w3.org/TR/xslt-30/#temporary-trees and outlines how as, select and the content of xsl:variable interact in https://www.w3.org/TR/xslt-30/#variable-values.
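The difference can be checked directly with instance of and count. This is a minimal XSLT 3.0 sketch; the variable names tree and seq and the result element names are made up for the illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                exclude-result-prefixes="xs"
                version="3.0">

  <xsl:output indent="yes"/>

  <xsl:template match="/" name="xsl:initial-template">
    <!-- No "as" attribute: the content is used to build a temporary tree,
         so the variable holds one document node. -->
    <xsl:variable name="tree">
      <xsl:sequence select="137, (1, 3, 'abc'), 823"/>
    </xsl:variable>

    <!-- With "as": the variable holds the flattened sequence itself. -->
    <xsl:variable name="seq" as="item()*">
      <xsl:sequence select="137, (1, 3, 'abc'), 823"/>
    </xsl:variable>

    <root>
      <tree is-document-node="{$tree instance of document-node()}"
            count="{count($tree)}"/>
      <seq is-document-node="{$seq instance of document-node()}"
           count="{count($seq)}"/>
    </root>
  </xsl:template>

</xsl:stylesheet>
```

The tree element should report a document node with a count of 1, while the seq element should report a count of 5 and no document node.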

check for successively numbered attributes

I have a situation where I need to check for attribute values that may be successively numbered and input a dash between the start and end values.
<root>
<ref id="value00008 value00009 value00010 value00011 value00020"/>
</root>
The ideal output would be...
8-11, 20
I can tokenize the attribute into separate values, but I'm unsure how to check if the number at the end of "valueXXXXX" is successive to the previous value.
I'm using XSLT 2.0
You can use xsl:for-each-group with @group-adjacent, using the number() value minus position() as the grouping key: within a run of successive numbers that difference stays the same.
This trick was apparently invented by David Carlisle, according to Michael Kay.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="2.0">
<xsl:output indent="yes"/>
<xsl:template match="/">
<xsl:variable name="vals"
select="tokenize(root/ref/@id, '\s?value0*')[normalize-space()]"/>
<xsl:variable name="condensed-values" as="item()*">
<xsl:for-each-group select="$vals"
group-adjacent="number(.) - position()">
<xsl:choose>
<xsl:when test="count(current-group()) > 1">
<!--a sequence of successive numbers,
grab the first and last one and join with '-' -->
<xsl:sequence select="
string-join(current-group()[position()=1
or position()=last()]
,'-')"/>
</xsl:when>
<xsl:otherwise>
<!--single value group-->
<xsl:sequence select="current-group()"/>
</xsl:otherwise>
</xsl:choose>
</xsl:for-each-group>
</xsl:variable>
<xsl:value-of select="string-join($condensed-values, ', ')"/>
</xsl:template>
</xsl:stylesheet>
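To see why the key in the stylesheet above stays constant over a run of successive numbers, it can help to print the key for each value. A small XSLT 2.0 sketch, using the five numbers from the sample attribute as a literal sequence (the template name main is arbitrary):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">

  <xsl:output method="text"/>

  <xsl:template match="/" name="main">
    <!-- The numeric parts of the sample @id value. -->
    <xsl:for-each select="(8, 9, 10, 11, 20)">
      <!-- value - position = grouping key -->
      <xsl:value-of
          select="concat(., ' - ', position(), ' = ', . - position(), '&#10;')"/>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

This prints:

```
8 - 1 = 7
9 - 2 = 7
10 - 3 = 7
11 - 4 = 7
20 - 5 = 15
```

The first four values share the key 7 and so form one adjacent group (8-11), while 20 gets the key 15 and forms a group of its own.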

How to get distinct/unique attributes in xsl variable

<xsl:variable name="Rows" select=" .. some stmt .." />
<xsl:for-each select="$Rows">
<xsl:value-of select="@ATTRNAME"/>
</xsl:for-each>
I would like to know how to find the 'Rows' with a unique/distinct 'ATTRNAME' attribute (in XSLT 1.0).
Grouping in XSLT 1.0 is done using xsl:key (Muenchian grouping). The following prints only the first attribute of the root element for each distinct value:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" omit-xml-declaration="yes"/>
<xsl:key name="attrByVal" match="/*/@*" use="."/>
<xsl:template match="/">
<xsl:apply-templates select="/*/@*"/>
</xsl:template>
<xsl:template match="@*[generate-id()=generate-id(key('attrByVal', .)[1])]">
<xsl:value-of select="concat(name(), ': ', ., '&#10;')"/>
</xsl:template>
<xsl:template match="@*"/>
</xsl:stylesheet>
Explanation: First, we group all attributes of the root element by value:
<xsl:key name="attrByVal" match="/*/@*" use="."/>
Then we create a template that matches only the first attribute in each group (the first node the key returns for that value):
<xsl:template match="@*[generate-id()=generate-id(key('attrByVal', .)[1])]">
And ignore all the others:
<xsl:template match="@*"/>
Example input:
<root one="apple" two="apple" three="orange" four="apple"/>
Output:
one: apple
three: orange
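Applied to the rows from the question, the same key technique could look like this sketch. It assumes the rows are elements named Row (a hypothetical name, since the question only shows "some stmt") and that they all live in the same source document.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Assumption: the rows are elements called Row. -->
  <xsl:key name="rowByAttr" match="Row" use="@ATTRNAME"/>

  <xsl:template match="/">
    <!-- Keep a Row only if it is the first Row the key returns
         for its own ATTRNAME value. -->
    <xsl:for-each
        select="//Row[generate-id() =
                      generate-id(key('rowByAttr', @ATTRNAME)[1])]">
      <xsl:value-of select="@ATTRNAME"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Note that a key indexes the whole document. If $Rows is only a subset of the Row elements, the first node returned by the key may lie outside that subset, and the key's match or use expression would need to be narrowed accordingly.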
XSLT 2.0 solution:
<xsl:for-each-group select="$Rows" group-by="@ATTRNAME">
<!-- Do something with the rows in the group. These rows
can be accessed by the current-group() function. -->
</xsl:for-each-group>

How to sort the elements (columns) in xslt to transform the xml file to csv format

<?xml version="1.0" encoding="utf-8"?>
<Report p1:schemaLocation="Customer details http://reportserver?%2fCustomer details&rs%3aFormat=XML&rc%3aSchema=True" Name="Customer details" xmlns:p1="http://www.w3.org/2001/XMLSchema-instance" xmlns="Customer details">
<table2>
<Detail_Collection>
<Detail Col1="aaa" col1_SeqID="2" col1_Include="1"
Col2="aaa" col2_SeqID="1" col2_Include="1"
Col3="aaa" col3_SeqID="" col3_Include="0"
Col4="aaa" col4_SeqID="4" col4_Include="1"
Col5="aaa" col5_SeqID="" col5_Include="0"
... ... ...
... ... ...
... ... ...
Col50="aaa" col50_SeqID="3" col50_Include="1"
/>
<Detail_Collection>
</table2>
</Report>
The above XML is produced by SSRS for the RDL file. I want to transform it to CSV using XSLT (in a customized format).
The RDL file (SSRS report) is very simple, with 50 columns, and displays the data for all the columns depending on the user's selections in the user interface.
The user interface has parameter selections for all 50 columns (i.e. users can select the order of the columns, whether a particular column is included on the report, the font style, etc.). As mentioned, each column has two main functions: it can be sorted and it can be ordered based on the selections.
For example, in the report output (the XML given above) you will see that all 50 columns exist, but I am also including the extra fields, which are generally hidden on the report.
The col1 is included on the report and is ordered (seqID) as the 2nd column on the csv file.
The col2 is also included on the report and is ordered as the 1st column on the csv file.
The col3 is not included on the report and the order selection is empty, so this is not included on the csv file.
...
...
Likewise, col50 is included on the report but is ordered as the 3rd column in the CSV file.
My main challenge is to create the XSLT file for the CSV output and to put the columns in the order selected by each user.
The output in the CSV file after transformation will look as follows:
Col2 Col1 Col50 Col4
... ... ... ....
Any good ideas on how to create this kind of XSL file are much appreciated. Thank you for taking the time to understand my question and for trying to help.
I. This XSLT 1.0 transformation:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:c="Customer details">
<xsl:output method="text"/>
<xsl:strip-space elements="*"/>
<xsl:template match="c:Detail">
<xsl:apply-templates select=
"@*[substring(name(), string-length(name())-5)
= '_SeqID'
and
number(.) = number(.)
]
">
<xsl:sort data-type="number"/>
</xsl:apply-templates>
</xsl:template>
<xsl:template match="@*">
<xsl:if test="not(position()=1)">,</xsl:if>
<xsl:value-of select=
"../@*
[name()
=
concat('Col',substring-before(substring(name(current()),4),'_'))
]"/>
</xsl:template>
</xsl:stylesheet>
when applied on this XML document (the provided one, made well-formed and unambiguous):
<Report
p1:schemaLocation="Customer details http://reportserver?%2fCustomer details&amp;rs%3aFormat=XML&amp;rc%3aSchema=True"
Name="Customer details"
xmlns:p1="http://www.w3.org/2001/XMLSchema-instance"
xmlns="Customer details">
<table2>
<Detail_Collection>
<Detail Col1="aaa1" col1_SeqID="2" col1_Include="1"
Col2="aaa2" col2_SeqID="1" col2_Include="1"
Col3="aaa3" col3_SeqID="" col3_Include="0"
Col4="aaa4" col4_SeqID="4" col4_Include="1"
Col5="aaa5" col5_SeqID="" col5_Include="0"
Col50="aaa50" col50_SeqID="3" col50_Include="1"
/>
</Detail_Collection>
</table2>
</Report>
produces the wanted, correct result:
aaa2,aaa1,aaa50,aaa4
Explanation:
1. We use the fact that the XPath 1.0 expression:
substring($s1, string-length($s1) - string-length($s2) + 1) = $s2
is equivalent to the XPath 2.0 expression:
ends-with($s1, $s2)
2. Appropriate use of <xsl:sort>, substring(), name() and current().
3. We use the fact that a string $s is castable to a number if and only if:
number($s) = number($s)
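Both idioms can be tried in isolation. The following XSLT 1.0 sketch uses literal strings taken from the sample document; the labels in the output are made up for the illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:variable name="s1" select="'col50_SeqID'"/>
    <xsl:variable name="s2" select="'_SeqID'"/>

    <!-- XPath 1.0 replacement for ends-with($s1, $s2). -->
    <xsl:text>ends with _SeqID: </xsl:text>
    <xsl:value-of
        select="substring($s1, string-length($s1) - string-length($s2) + 1) = $s2"/>

    <!-- number('3') is 3, and 3 = 3 is true. -->
    <xsl:text>&#10;'3' is a number: </xsl:text>
    <xsl:value-of select="number('3') = number('3')"/>

    <!-- number('') is NaN, and NaN is never equal to itself. -->
    <xsl:text>&#10;'' is a number: </xsl:text>
    <xsl:value-of select="number('') = number('')"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

Run against any XML document, this prints true, true and false for the three tests. The last case is what filters out the empty col3_SeqID and col5_SeqID attributes.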
II. XSLT 2.0 solution:
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:c="Customer details">
<xsl:output method="text"/>
<xsl:strip-space elements="*"/>
<xsl:template match="c:Detail">
<xsl:apply-templates select=
"@*[ends-with(name(),'_SeqID')
and . castable as xs:integer]">
<xsl:sort select="xs:integer(.)"/>
</xsl:apply-templates>
</xsl:template>
<xsl:template match="@*">
<xsl:if test="not(position()=1)">,</xsl:if>
<xsl:value-of select=
"../@*
[name()
eq
concat('Col',translate(name(current()),'col_SeqID',''))]"/>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied on the same XML document (above), the same correct result is produced:
aaa2,aaa1,aaa50,aaa4
Update: @desi has asked that the heading should also be generated.
Here is the updated XSLT 1.0 transformation (as indicated, @desi is limited to XSLT 1.0) that does this:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:c="Customer details">
<xsl:output method="text"/>
<xsl:strip-space elements="*"/>
<xsl:template match="c:Detail">
<xsl:for-each select=
"@*[substring(name(), string-length(name())-5)
= '_SeqID'
and
number(.) = number(.)
]
">
<xsl:sort data-type="number"/>
<xsl:value-of select=
"concat('Col',
substring-before(substring(name(current()),4),
'_')
)
"/>
<xsl:text> </xsl:text>
</xsl:for-each>
<xsl:text>
</xsl:text>
<xsl:apply-templates select=
"@*[substring(name(), string-length(name())-5)
= '_SeqID'
and
number(.) = number(.)
]
">
<xsl:sort data-type="number"/>
</xsl:apply-templates>
</xsl:template>
<xsl:template match="@*">
<xsl:if test="not(position()=1)">,</xsl:if>
<xsl:value-of select=
"../@*
[name()
=
concat('Col',substring-before(substring(name(current()),4),'_'))
]"/>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied on the same XML document (above), the wanted, correct result is produced:
Col2 Col1 Col50 Col4
aaa2,aaa1,aaa50,aaa4