XSLT to split text data into groups of multiple lines
I am trying to write an XSLT stylesheet that splits multi-line text data and produces XML in which each element holds a fixed number of lines from that text.
For example, if my input XML looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<csv>
<data>Id,Name,Address,Location,Extid,contact
1,raagu1,hosakote1,bangalore1,123,contact1
2,raagu2,hosakote2,bangalore2,123,contact2
3,raagu3,hosakote3,bangalore3,123,contact3
4,raag4,hosakote4,bangalore4,123,contact4
5,raagu5,hosakote5,bangalore5,123,contact5
6,raagu6,hosakote6,bangalore6,123,contact6
7,raagu7,hosakote7,bangalore7,123,contact7
</data>
</csv>
Here the text under the data element is CSV: the first line (Id,Name,Address,Location,Extid,contact) is the header, and the remaining lines are records corresponding to the header fields.
If the fixed group size is 4, i.e. groups of 4 records each,
then my output XML should look like this:
<?xml version="1.0" encoding="UTF-8"?>
<csv>
<data>
Id,Name,Address,Location,Extid,contact
1,raagu1,hosakote1,bangalore1,123,contact1
2,raagu2,hosakote2,bangalore2,123,contact2
3,raagu3,hosakote3,bangalore3,123,contact3
4,raag4,hosakote4,bangalore4,123,contact4
</data>
<data>
Id,Name,Address,Location,Extid,contact
5,raagu5,hosakote5,bangalore5,123,contact5
6,raagu6,hosakote6,bangalore6,123,contact6
7,raagu7,hosakote7,bangalore7,123,contact7
</data>
</csv>
To achieve this I looked into XSLT and tried the following stylesheet:
<xsl:stylesheet version = "2.0" xmlns:xsl = "http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes" method="xml" encoding="UTF-8"/>
<xsl:template match = "/csv/data">
<xsl:variable name="header" select="substring-before(.,'&#10;')"/>
<xsl:variable name="data" select="substring-after(.,'&#10;')"/>
<csv>
<xsl:for-each select = "tokenize($data, '\n')">
<xsl:variable name="count" select="position()"/>
<data>
<xsl:value-of select="$header"/>
<xsl:text>
</xsl:text>
<xsl:sequence select = "."/>
</data>
</xsl:for-each>
</csv>
</xsl:template>
</xsl:stylesheet>
With this, the output I got was
<?xml version="1.0" encoding="UTF-8"?>
<csv>
<data>
Id,Name,Address,Location,Extid,contact
1,raagu1,hosakote1,bangalore1,123,contact1
</data>
<data>
Id,Name,Address,Location,Extid,contact
2,raagu2,hosakote2,bangalore2,123,contact2
</data>
<data>
Id,Name,Address,Location,Extid,contact
3,raagu3,hosakote3,bangalore3,123,contact3
</data>
<data>
Id,Name,Address,Location,Extid,contact
4,raag4,hosakote4,bangalore4,123,contact4
</data>
<data>
Id,Name,Address,Location,Extid,contact
5,raagu5,hosakote5,bangalore5,123,contact5
</data>
<data>
Id,Name,Address,Location,Extid,contact
6,raagu6,hosakote6,bangalore6,123,contact6
</data>
<data>
Id,Name,Address,Location,Extid,contact
7,raagu7,hosakote7,bangalore7,123,contact7
</data>
</csv>
I could not get it right, because it creates a group for every single line. I think I am missing something to do with concatenation. Are there any functions in XSLT that can split the text into groups of several lines and create a single element for each group, with good performance? XSLT 2.0 functions are fine. The code should work even for 100,000+ records.
Thanks
Raagu
Do you really want to create an XML result that still contains comma-separated and line-separated data? I would consider cleaning up the data and marking it up properly with XML.
But as for the grouping, here is an example:
<xsl:stylesheet version = "2.0" xmlns:xsl = "http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="xs">
<xsl:param name="chunk-size" select="4" as="xs:integer"/>
<xsl:output indent="yes" method="xml" encoding="UTF-8"/>
<xsl:template match = "/csv/data">
<xsl:variable name="header" select="substring-before(.,'&#10;')"/>
<xsl:variable name="data" select="substring-after(.,'&#10;')"/>
<csv>
<xsl:for-each-group select = "tokenize($data, '\n')" group-adjacent="(position() - 1) idiv $chunk-size">
<data>
<xsl:value-of select="$header"/>
<xsl:text>
</xsl:text>
<xsl:value-of select = "current-group()" separator="&#10;"/>
</data>
</xsl:for-each-group>
</csv>
</xsl:template>
</xsl:stylesheet>
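Following the suggestion above to mark the data up properly, here is a minimal sketch of the same grouping that also turns each record into elements. It assumes XSLT 2.0, that every header field is a valid XML element name, and that no record has more fields than the header. The element names group and row are made up for illustration. Blank lines (such as the one before </data>) are dropped before grouping.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:param name="chunk-size" select="4" as="xs:integer"/>
  <xsl:output indent="yes" method="xml" encoding="UTF-8"/>

  <xsl:template match="/csv/data">
    <!-- Split on LF or CRLF and discard blank lines -->
    <xsl:variable name="lines"
        select="tokenize(., '\r?\n')[normalize-space(.) != '']"/>
    <!-- Assumption: header fields are usable as element names -->
    <xsl:variable name="names" select="tokenize($lines[1], ',')"/>
    <csv>
      <!-- Same key as above: records 1-4 get key 0, 5-8 get key 1, ... -->
      <xsl:for-each-group select="$lines[position() gt 1]"
          group-adjacent="(position() - 1) idiv $chunk-size">
        <group>
          <xsl:for-each select="current-group()">
            <row>
              <xsl:for-each select="tokenize(., ',')">
                <xsl:variable name="i" select="position()"/>
                <xsl:element name="{$names[$i]}">
                  <xsl:value-of select="."/>
                </xsl:element>
              </xsl:for-each>
            </row>
          </xsl:for-each>
        </group>
      </xsl:for-each-group>
    </csv>
  </xsl:template>
</xsl:stylesheet>
```

This simple tokenize on ',' does not handle quoted fields that contain commas; real CSV with quoting would need a more careful parser.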
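This is a basic solution (without group-adjacent) that writes the element tags by hand. It is not very pretty, but it works and is easy to follow. Note that it relies on disable-output-escaping, which a processor is allowed to ignore and which only takes effect when the processor serializes the result itself.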
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes" method="xml" encoding="UTF-8"/>
<xsl:template match="/csv/data">
<xsl:variable name="header" select="substring-before(.,'&#10;')"/>
<xsl:variable name="data" select="substring-after(.,'&#10;')"/>
<xsl:variable name="numberOfRows" select="4"/>
<csv>
<xsl:for-each select="tokenize($data, '\n')">
<xsl:variable name="count" select="position()-1"/>
<xsl:variable name="modulo" select="$count mod $numberOfRows"/>
<xsl:if test="$modulo = 0">
<xsl:text disable-output-escaping="yes">&lt;data&gt;</xsl:text>
<xsl:value-of select="$header"/>
<xsl:text>
</xsl:text>
</xsl:if>
<xsl:sequence select="."/>
<xsl:text>
</xsl:text>
<xsl:if test="$modulo = ($numberOfRows - 1) or position() = last()">
<xsl:text disable-output-escaping="yes">&lt;/data&gt;</xsl:text>
</xsl:if>
</xsl:for-each>
</csv>
</xsl:template>
</xsl:stylesheet>
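If you would rather not depend on disable-output-escaping, the same "every N rows" idea can be sketched with subsequence(): compute the start index of each chunk and emit a real data element per chunk. This is only a sketch, assuming XSLT 2.0; numberOfRows is turned into a parameter here, and blank lines are filtered out first.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:param name="numberOfRows" select="4" as="xs:integer"/>
  <xsl:output indent="yes" method="xml" encoding="UTF-8"/>

  <xsl:template match="/csv/data">
    <xsl:variable name="lines"
        select="tokenize(., '\r?\n')[normalize-space(.) != '']"/>
    <xsl:variable name="header" select="$lines[1]"/>
    <xsl:variable name="rows" select="$lines[position() gt 1]"/>
    <csv>
      <!-- Start indexes 1, 1+N, 1+2N, ...; empty when there are no rows -->
      <xsl:for-each
          select="(1 to count($rows))[(. - 1) mod $numberOfRows = 0]">
        <data>
          <!-- Header line followed by up to N rows starting at this index -->
          <xsl:value-of
              select="$header, subsequence($rows, ., $numberOfRows)"
              separator="&#10;"/>
        </data>
      </xsl:for-each>
    </csv>
  </xsl:template>
</xsl:stylesheet>
```

With the 7-record sample and numberOfRows set to 4, the start indexes are 1 and 5, so the second data element holds the remaining three records.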
Related
How to partition dates with XSLT
I have a group of dates and I'd like to create partitions with a criterion such as "exactly 7 days apart".
For example, this is my source XML:
<root>
<entry date="2019-05-12" />
<entry date="2019-05-19" />
<entry date="2019-05-26" />
<entry date="2019-06-16" />
<entry date="2019-06-23" />
</root>
The result should be like this:
<root>
<group>
<val>12.5.</val>
<val>19.5.</val>
<val>26.5.</val>
</group>
<group>
<val>16.6.</val>
<val>23.6.</val>
</group>
</root>
since the first three and the last two dates are all on a Sunday without a gap.
What I have so far is this:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:sd="urn:someprefix" xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="#all">
<xsl:output indent="yes"/>
<xsl:template match="root">
<root>
<xsl:copy-of select="sd:partition(distinct-values(for $i in entry/@date return $i cast as xs:date))"/>
</root>
</xsl:template>
<xsl:function name="sd:partition">
<xsl:param name="dates" as="xs:date*"/>
<xsl:for-each-group select="$dates" group-adjacent="format-date(., '[F]')">
<group>
<xsl:for-each select="current-group()">
<val>
<xsl:value-of select="format-date(.,'[D].[M].')"/>
</val>
</xsl:for-each>
</group>
</xsl:for-each-group>
</xsl:function>
</xsl:stylesheet>
This only generates one group. How can I ask for the previous element to be 7 days apart? I know of duration (xs:dayTimeDuration('P1D')), but I don't know how to compare it to a previous value. I use Saxon 9.8 HE.
I think you can also do it using group-adjacent:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="#all" expand-text="yes" version="3.0">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="root">
<xsl:copy>
<xsl:for-each-group select="entry/@date/xs:date(.)" group-adjacent=". - (position() - 1) * xs:dayTimeDuration('P7D')">
<group>
<xsl:apply-templates select="current-group()"/>
</group>
</xsl:for-each-group>
</xsl:copy>
</xsl:template>
<xsl:template match=".[. instance of xs:date]">
<val>{format-date(.,'[D].[M].')}</val>
</xsl:template>
</xsl:stylesheet>
https://xsltfiddle.liberty-development.net/ncdD7mM
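To see why that grouping key works, it can help to print the key for each date. The sketch below is a debugging aid only, assuming XSLT 2.0 and entries already in ascending date order; the keys and key elements are made up for illustration. Each date is shifted back by 7 days per preceding entry, so dates in an unbroken weekly run all collapse to the same value.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="root">
    <keys>
      <xsl:for-each select="entry/@date/xs:date(.)">
        <!-- value = date minus (position - 1) weeks -->
        <key date="{.}"
             value="{. - (position() - 1) * xs:dayTimeDuration('P7D')}"/>
      </xsl:for-each>
    </keys>
  </xsl:template>
</xsl:stylesheet>
```

For the sample input, the first three dates all map to 2019-05-12 and the last two both map to 2019-05-26, which is exactly the two groups that group-adjacent then produces.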
To do your grouping, you really need to know the difference in days with the previous element, then you can group starting with dates where the difference is not 7 days. So, you can declare a variable where you build up some new XML with the dates and differences, and then use that to group.
Try this function in your XSLT instead.
<xsl:function name="sd:partition">
<xsl:param name="dates" as="xs:date*"/>
<xsl:variable name="datesWithDiff" as="element()*">
<xsl:for-each select="$dates">
<xsl:variable name="pos" select="position()" />
<date diff="{(. - $dates[$pos - 1]) div xs:dayTimeDuration('P1D')}">
<xsl:value-of select="." />
</date>
</xsl:for-each>
</xsl:variable>
<xsl:for-each-group select="$datesWithDiff" group-starting-with="date[@diff = '' or xs:int(@diff) gt 7]">
<group>
<xsl:for-each select="current-group()">
<val>
<xsl:value-of select="format-date(.,'[D].[M].')"/>
</val>
</xsl:for-each>
</group>
</xsl:for-each-group>
</xsl:function>
Add mandatory nodes with XSLT
I am facing an XSLT/XPath problem and hope someone can help. In a few words, here is what I am trying to achieve: I have to transform an XML document where some nodes may be missing, and these missing nodes are mandatory in the final result. I have the set of mandatory node names available in an xsl:param.
The base document is:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="TRANSFORM.xslt"?>
<BEGIN>
<CLIENT>
<NUMBER>0021732561</NUMBER>
<NAME1>John</NAME1>
<NAME2>Connor</NAME2>
</CLIENT>
<PRODUCTS>
<PRODUCT_ID>12</PRODUCT_ID>
<DESCRIPTION>blah blah</DESCRIPTION>
</PRODUCTS>
<PRODUCTS>
<PRODUCT_ID>13</PRODUCT_ID>
<DESCRIPTION>description ...</DESCRIPTION>
</PRODUCTS>
<OPTIONS>
<OPTION_ID>1</OPTION_ID>
<DESCRIPTION>blah blah blah ...</DESCRIPTION>
</OPTIONS>
<PROMOTIONS>
<PROMOTION_ID>1</PROMOTION_ID>
<DESCRIPTION>blah blah blah ...</DESCRIPTION>
</PROMOTIONS>
</BEGIN>
Here is the stylesheet so far:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:fn="http://www.w3.org/2005/xpath-functions">
<xsl:output method="xml" encoding="UTF-8" indent="yes"/>
<xsl:param name="mandatoryNodes" as="xs:string*" select=" 'PRODUCTS', 'OPTIONS', 'PROMOTIONS' "/>
<xsl:template match="/">
<xsl:apply-templates select="child::node()"/>
</xsl:template>
<xsl:template match="node()">
<xsl:copy>
<xsl:apply-templates/>
</xsl:copy>
</xsl:template>
<xsl:template match="BEGIN">
<xsl:element name="BEGIN">
<xsl:for-each select="$mandatoryNodes">
<!-- If there is no node with this name -->
<xsl:if test="count(*[name() = 'current()']) = 0">
<xsl:element name="{current()}" />
</xsl:if>
</xsl:for-each>
<xsl:apply-templates select="child::node()"/>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
I tried the transformation in XML Spy; the xsl:if test failed, saying that the "current item is PRODUCTS of type xs:string".
I've tried the same xsl:if outside of a for-each and it seems to work ... what am I missing?
Inside of <xsl:for-each select="$mandatoryNodes"> the context item is a string, but you want to access the primary input document and its nodes. So you need to store that document, or the template's context node, in a variable and use that, e.g.
<xsl:template match="BEGIN">
<xsl:variable name="this" select="."/>
<xsl:element name="BEGIN">
<xsl:for-each select="$mandatoryNodes">
<!-- If there is no child node of `BEGIN` with this name -->
<xsl:if test="count($this/*[name() = current()]) = 0">
<xsl:element name="{current()}" />
</xsl:if>
</xsl:for-each>
<xsl:apply-templates select="child::node()"/>
</xsl:element>
</xsl:template>
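As a variation on the same idea, the missing names can be computed up front so that the loop only visits names that actually need a new element. This is a minimal sketch assuming XSLT 2.0; the variable name present is made up for illustration, and the identity template here also copies attributes, which the stylesheet in the question does not.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

  <xsl:param name="mandatoryNodes" as="xs:string*"
      select="'PRODUCTS', 'OPTIONS', 'PROMOTIONS'"/>

  <!-- Identity template: copy anything without a more specific rule -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="BEGIN">
    <!-- Names of the children that exist; evaluated while the
         context item is still the BEGIN element -->
    <xsl:variable name="present" select="*/name()"/>
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <!-- Only the mandatory names that are not present yet -->
      <xsl:for-each select="$mandatoryNodes[not(. = $present)]">
        <xsl:element name="{.}"/>
      </xsl:for-each>
      <xsl:apply-templates select="node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Because $present is bound before the xsl:for-each starts, the change of context item inside the loop no longer matters.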
Filemaker xml output via xslt with column names
I am new to XSLT programming and XML. I have created the code below. It works fine, except that instead of the field name for each column it just shows "colno". How do I get the column names into the output?
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:fmp="http://www.filemaker.com/fmpxmlresult" exclude-result-prefixes="fmp">
<xsl:output method="xml" version="1.0" encoding="utf-8" indent="yes"/>
<xsl:variable name="kMetaData" select="fmp:METADATA/fmp:FIELD"/>
<xsl:variable name="colno" select="count($kMetaData[following-sibling::fmp:FIELD/@NAME]) + 1" />
<xsl:template match="/fmp:FMPXMLRESULT">
<PERSON>
<xsl:apply-templates select="fmp:RESULTSET/fmp:ROW" />
</PERSON>
</xsl:template>
<xsl:template match="fmp:ROW">
<ELEMENTS>
<xsl:apply-templates select="fmp:COL" />
</ELEMENTS>
</xsl:template>
<xsl:template match="fmp:COL">
<xsl:element name="colno">
<xsl:value-of select="fmp:DATA" />
</xsl:element>
</xsl:template>
</xsl:stylesheet>
It is hard to make a suggestion without the input XML. But at first sight, <xsl:element name="colno"> says "output an element <colno>". I think you should use something like <xsl:element name="{xpath/to/columnName}">.
Edit: According to your input XML, your template for the COL element should look like
<xsl:template match="COL">
<xsl:variable name="colPosition" select="position()" />
<!-- Prevent spaces in NAME attribute of FIELD element -->
<xsl:variable name="colName" select="translate($kMetaData[$colPosition]/@NAME, ' ', '_')" />
<xsl:element name="{$colName}">
<xsl:value-of select="DATA"/>
</xsl:element>
</xsl:template>
Then the output looks like
<?xml version="1.0" encoding="utf-8"?>
<PERSON>
<ELEMENTS>
<FIRSTNAME>Richard</FIRSTNAME>
<LASTNAME>Katz</LASTNAME>
<MIDDLENAME>David</MIDDLENAME>
<REQUESTDT>1/1/2001</REQUESTDT>
<salutation>Mr</salutation>
<Bargaining_Unit>CSEA (02,03,04)</Bargaining_Unit>
<Field_134>b</Field_134>
</ELEMENTS>
</PERSON>
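Putting the position lookup together with the namespace prefix used in the question gives roughly the following. This is a sketch, not a tested answer for the asker's file: it assumes the input really is in the http://www.filemaker.com/fmpxmlresult namespace with METADATA and RESULTSET as children of FMPXMLRESULT, and that each field name becomes a valid element name once spaces are replaced.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fmp="http://www.filemaker.com/fmpxmlresult"
    exclude-result-prefixes="fmp">

  <xsl:output method="xml" version="1.0" encoding="utf-8" indent="yes"/>

  <!-- Absolute path, because a top-level variable is evaluated
       with the document root as context node -->
  <xsl:variable name="kMetaData"
      select="/fmp:FMPXMLRESULT/fmp:METADATA/fmp:FIELD"/>

  <xsl:template match="/fmp:FMPXMLRESULT">
    <PERSON>
      <xsl:apply-templates select="fmp:RESULTSET/fmp:ROW"/>
    </PERSON>
  </xsl:template>

  <xsl:template match="fmp:ROW">
    <ELEMENTS>
      <xsl:apply-templates select="fmp:COL"/>
    </ELEMENTS>
  </xsl:template>

  <xsl:template match="fmp:COL">
    <!-- position() is the index of this COL among the COLs selected
         above, which lines up with the order of the FIELD elements -->
    <xsl:variable name="colPosition" select="position()"/>
    <xsl:element
        name="{translate($kMetaData[$colPosition]/@NAME, ' ', '_')}">
      <xsl:value-of select="fmp:DATA"/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
```

Field names that start with a digit or contain characters such as parentheses would still be rejected as element names, so a real stylesheet may need a more thorough clean-up than translate().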
XSL apply more than one template
I'm transforming an XML document with PHP/XSL. I'm searching for a keyword value. I want to add paging, so I'm not returning all the search results. I can do this with separate xsl files, but I'd like to join them if I can. How can I return the search results and then apply the paging?
E.g. Paging
...
<xsl:if test="position() > $start and position() &lt; $end">
...
Search.xsl
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" method="xml" version="1.0" encoding="UTF-8"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/*">
<items>
<xsl:attribute name="count"><xsl:value-of select="count(//item)"/></xsl:attribute>
<xsl:apply-templates select="//item">
<xsl:sort select="*[name()=$sortBy]" order="{$order}" data-type="{$type}" />
</xsl:apply-templates>
</items>
</xsl:template>
<xsl:template match="//item">
<xsl:choose>
<xsl:when test="contains(translate(title, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), $keyword) or contains(translate(content, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), $keyword)">
<item>
<title><xsl:value-of select="title"/></title>
<content><xsl:value-of select="content"/></content>
<date><xsl:value-of select="date"/></date>
<author><xsl:value-of select="author"/></author>
<uri><xsl:value-of select="uri"/></uri>
<division><xsl:value-of select="division"/></division>
</item>
</xsl:when>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
Final solution using xsl:variable and node-set(). I need to do some more checks, but I'm pretty sure this works OK.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:exslt="http://exslt.org/common">
<xsl:output omit-xml-declaration="yes" method="xml" version="1.0" encoding="UTF-8"/>
<xsl:strip-space elements="*"/>
<xsl:variable name="searchResults">
<xsl:apply-templates select="//item">
<xsl:sort select="*[name()=$sortBy]" order="{$order}" data-type="{$type}" />
</xsl:apply-templates>
</xsl:variable>
<xsl:template match="//item">
<xsl:choose>
<xsl:when test="contains(translate(title, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), $keyword) or contains(translate(content, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), $keyword)">
<item>
<title><xsl:value-of select="title"/></title>
<content><xsl:value-of select="content"/></content>
<date><xsl:value-of select="date"/></date>
<author><xsl:value-of select="author"/></author>
<uri><xsl:value-of select="uri"/></uri>
<division><xsl:value-of select="division"/></division>
</item>
</xsl:when>
</xsl:choose>
</xsl:template>
<xsl:template match="//item" mode="paging">
<xsl:choose>
<xsl:when test="position() > $start and position() &lt; $end">
<item>
<title><xsl:value-of select="title"/></title>
<content><xsl:value-of select="content"/></content>
<date><xsl:value-of select="date"/></date>
<author><xsl:value-of select="author"/></author>
<uri><xsl:value-of select="uri"/></uri>
<division><xsl:value-of select="division"/></division>
</item>
</xsl:when>
</xsl:choose>
</xsl:template>
<xsl:template match="/*">
<items>
<xsl:attribute name="count"><xsl:value-of select="count(//item)"/></xsl:attribute>
<xsl:apply-templates select="exslt:node-set($searchResults)/*" mode="paging" />
</items>
</xsl:template>
</xsl:stylesheet>
Read about modes in XSLT. Then use them in these two cases:
<xsl:apply-templates mode="search" select="someExpression">
<!-- <xsl:with-param> children if necessary -->
<!-- <xsl:sort> children if necessary -->
</xsl:apply-templates>
and also:
<xsl:apply-templates mode="paging" select="someExpression">
<!-- <xsl:with-param> children if necessary -->
<!-- <xsl:sort> children if necessary -->
</xsl:apply-templates>
Of course, you must have templates in each of the above modes.
I don't think there is a way to apply different templates consecutively in a single transformation. So, try adding the paging condition to the search condition:
<xsl:when test="(contains(translate(title, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), $keyword) or contains(translate(content, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), $keyword)) and position() > $start and position() &lt; $end">
This feels to me like a case where you should split the transformation into two phases, one to do searching and one to do paging. You can either write a pipeline that executes two transformations written as separate stylesheets, or (with a bit of help from exslt:node-set()) you can do it in a single stylesheet, saving the results in a temporary variable. I'd recommend two stylesheets - it makes the code more readable and reusable.
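For comparison, here is a sketch of a different single-stylesheet variant that avoids the temporary tree altogether. It is not the two-stylesheet pipeline recommended above: the search is done in the select predicate, so position() inside the loop counts only the sorted matches, and the paging test can be applied directly. It assumes XSLT 1.0, that $keyword is already lower-case, and that the parameter defaults shown are placeholders the PHP caller would override.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output omit-xml-declaration="yes" method="xml" encoding="UTF-8"/>

  <!-- Hypothetical defaults; normally set from PHP -->
  <xsl:param name="keyword" select="''"/>
  <xsl:param name="start" select="0"/>
  <xsl:param name="end" select="11"/>
  <xsl:param name="sortBy" select="'title'"/>
  <xsl:param name="order" select="'ascending'"/>
  <xsl:param name="type" select="'text'"/>

  <xsl:variable name="upper" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>
  <xsl:variable name="lower" select="'abcdefghijklmnopqrstuvwxyz'"/>

  <xsl:template match="/">
    <!-- Phase 1: keep only the items that match the keyword -->
    <xsl:variable name="hits"
        select="//item[contains(translate(title, $upper, $lower), $keyword)
                or contains(translate(content, $upper, $lower), $keyword)]"/>
    <items count="{count($hits)}">
      <xsl:for-each select="$hits">
        <xsl:sort select="*[name() = $sortBy]"
            order="{$order}" data-type="{$type}"/>
        <!-- Phase 2: position() is relative to the sorted hits -->
        <xsl:if test="position() &gt; $start and position() &lt; $end">
          <xsl:copy-of select="."/>
        </xsl:if>
      </xsl:for-each>
    </items>
  </xsl:template>
</xsl:stylesheet>
```

Here xsl:copy-of copies each matching item with all of its children, and the count attribute reports the number of matches rather than the number of items in the document.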
XSLT: How to reverse output without sorting by content
I have a list of items:
<item>a</item>
<item>x</item>
<item>c</item>
<item>z</item>
and I want as output
z c x a
I have no order information in the file and I just want to reverse the lines. The last line in the source file should be the first line in the output. How can I solve this problem with XSLT without sorting by the content of the items, which would give the wrong result?
I will present two XSLT solutions:
I. XSLT 1.0 with recursion
Note that this solution works for any node-set, not only in the case when the nodes are siblings.
This transformation:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="/*">
<xsl:call-template name="reverse">
<xsl:with-param name="pList" select="*"/>
</xsl:call-template>
</xsl:template>
<xsl:template name="reverse">
<xsl:param name="pList"/>
<xsl:if test="$pList">
<xsl:value-of select="concat($pList[last()], ' ')"/>
<xsl:call-template name="reverse">
<xsl:with-param name="pList" select="$pList[not(position() = last())]"/>
</xsl:call-template>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
when applied on this XML document:
<t>
<item>a</item>
<item>x</item>
<item>c</item>
<item>z</item>
</t>
produces the wanted result:
z c x a
II. XSLT 2.0 solution:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xsl:output method="text"/>
<xsl:template match="/*">
<xsl:value-of select="reverse(*)/string(.)" separator=" "/>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied on the same XML document, the same correct result is produced.
XML CODE:
<?xml version="1.0" encoding="ISO-8859-1"?>
<!-- Edited by XMLSpy® -->
<device>
<element>a</element>
<element>x</element>
<element>c</element>
<element>z</element>
</device>
XSLT CODE:
<?xml version="1.0" encoding="ISO-8859-1"?>
<!-- Edited by XMLSpy® -->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="//device">
<xsl:for-each select="element">
<xsl:sort select="position()" data-type="number" order="descending"/>
<xsl:text> </xsl:text>
<xsl:value-of select="."/>
<xsl:text> </xsl:text>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
Note: if you're using data-type="number" and any of the values aren't numbers, those non-numeric values will sort before the numeric values. That means if you're using order="ascending", the non-numeric values appear first; if you use order="descending", the non-numeric values appear last. Notice that the non-numeric values are not sorted; they simply appear in the output document in the order in which they were encountered.
Also, you may find it useful to read this: http://docstore.mik.ua/orelly/xml/xslt/ch06_01.htm
Not sure what the full XML looks like, so I wrapped it in a <doc> element to make it well formed:
<doc>
<item>a</item>
<item>x</item>
<item>c</item>
<item>z</item>
</doc>
Running that example XML against this stylesheet:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" encoding="UTF-8" omit-xml-declaration="yes"/>
<xsl:template match="/">
<xsl:call-template name="reverse">
<xsl:with-param name="item" select="doc/item[position()=last()]" />
</xsl:call-template>
</xsl:template>
<xsl:template name="reverse">
<xsl:param name="item" />
<xsl:value-of select="$item" />
<!--Adds a line feed-->
<xsl:text> </xsl:text>
<!--Move on to the next item, if we aren't at the first-->
<xsl:if test="$item/preceding-sibling::item">
<xsl:call-template name="reverse">
<xsl:with-param name="item" select="$item/preceding-sibling::item[1]" />
</xsl:call-template>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
Produces the requested output:
z c x a
You may need to adjust the XPath to match your actual XML.
Consider this XML input:
<?xml version="1.0" encoding="utf-8" ?>
<items>
<item>a</item>
<item>x</item>
<item>c</item>
<item>z</item>
</items>
The XSLT:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" />
<xsl:template match="/items[1]">
<xsl:variable name="items-list" select="." />
<xsl:variable name="items-count" select="count($items-list/*)" />
<xsl:for-each select="item">
<xsl:variable name="index" select="$items-count+1 - position()"/>
<xsl:value-of select="$items-list/item[$index]"/>
<xsl:value-of select="' '"/>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
And the result:
z c x a