I am trying to take a CSV file as input and transform it into XML. I'm new to XSLT, and I've found a way to convert CSV into XML (using an example from Andrew Welch) like so:
Input CSV file:
car manufacturer,model,color,price,inventory
subaru,outback,blue,23195,54
subaru,forester,silver,20495,23
And my output XML would be:
<?xml version="1.0" encoding="UTF-8"?>
<rows>
<row>
<column name="car manufacturer">subaru</column>
<column name="model">outback</column>
<column name="color">blue</column>
<column name="price">23195</column>
<column name="inventory">54</column>
</row>
<row>
<column name="car manufacturer">subaru</column>
<column name="model">forester</column>
<column name="color">silver</column>
<column name="price">20495</column>
<column name="inventory">23</column>
</row>
</rows>
My desired output is actually something similar to:
<stock>
<model>
<car>subaru outback</car>
<color>blue</color>
<price>23195</price>
<inventory>54</inventory>
</model>
<model>
<car>subaru forester</car>
<color>silver</color>
<price>20495</price>
<inventory>23</inventory>
</model>
</stock>
From what I have read, this is best done with a two-phase transformation. The CSV-to-XML step uses XSLT 2.0, so I thought the two-phase transformation could also be done in XSLT 2.0, without using the node-set function.
The first phase would take the original CSV file as input and output the intermediate XML shown above. The second phase would take that intermediate XML and transform it into the desired output.
Can anyone help with how the two-phase transformation can be done? I'm having trouble passing the output of phase one as the input of phase two.
I have something like this so far:
<xsl:import href="csv2xml.xsl"/>
<xsl:output method="xml" indent="yes" />
<xsl:variable name="intermediate">
<xsl:apply-templates select="/" mode="csv2xml"/>
</xsl:variable>
<xsl:template match="rows" name="main">
**[This is what I'm having trouble with]**
</xsl:template>
I don't see any reason why this transformation needs two phases - except perhaps to allow you to reuse existing code for one of the phases.
However, when you do need two phases, the general model is:
<xsl:template match="/">
<xsl:variable name="phase-1-result">
<xsl:apply-templates select="/" mode="phase-1"/>
</xsl:variable>
<xsl:apply-templates select="$phase-1-result" mode="phase-2"/>
</xsl:template>
with the template rules for phase 1 and phase 2 (and their apply-templates calls) all being in mode phase-1 or phase-2 respectively.
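Applied to the CSV case from the question, the general model might look roughly like the sketch below. It is not the asker's csv2xml.xsl: the `csv-uri` parameter, the `csv-to-rows` template and the file name `cars.csv` are made up for illustration, and the CSV parsing is deliberately naive (no quoted fields). Phase 1 builds the intermediate rows/row/column tree in a variable; phase 2 applies templates to that variable directly, which XSLT 2.0 allows without node-set().

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical parameter: location of the CSV file -->
  <xsl:param name="csv-uri" select="'cars.csv'"/>

  <xsl:template name="main" match="/">
    <!-- Phase 1: capture the intermediate tree in a variable -->
    <xsl:variable name="phase-1-result">
      <xsl:call-template name="csv-to-rows"/>
    </xsl:variable>
    <!-- Phase 2: the variable is a temporary tree, so it can be processed directly -->
    <xsl:apply-templates select="$phase-1-result/rows" mode="phase-2"/>
  </xsl:template>

  <!-- Naive CSV parsing: one row per line, fields split on commas -->
  <xsl:template name="csv-to-rows">
    <xsl:variable name="lines"
        select="tokenize(unparsed-text($csv-uri), '\r?\n')[normalize-space()]"/>
    <xsl:variable name="headers" select="tokenize($lines[1], ',')"/>
    <rows>
      <xsl:for-each select="$lines[position() gt 1]">
        <xsl:variable name="cells" select="tokenize(., ',')"/>
        <row>
          <xsl:for-each select="$headers">
            <xsl:variable name="i" select="position()"/>
            <column name="{.}">
              <xsl:value-of select="$cells[$i]"/>
            </column>
          </xsl:for-each>
        </row>
      </xsl:for-each>
    </rows>
  </xsl:template>

  <xsl:template match="rows" mode="phase-2">
    <stock>
      <xsl:apply-templates select="row" mode="phase-2"/>
    </stock>
  </xsl:template>

  <xsl:template match="row" mode="phase-2">
    <model>
      <!-- A two-item sequence is joined with a single space in XSLT 2.0 -->
      <car>
        <xsl:value-of select="column[@name = 'car manufacturer'], column[@name = 'model']"/>
      </car>
      <xsl:apply-templates mode="phase-2"
          select="column[not(@name = ('car manufacturer', 'model'))]"/>
    </model>
  </xsl:template>

  <!-- Assumes the remaining header names are valid element names -->
  <xsl:template match="column" mode="phase-2">
    <xsl:element name="{@name}">
      <xsl:value-of select="."/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
```

The stylesheet can be started either at the named template `main` or against any dummy input document, since it only reads the CSV file. A header containing a space (other than `car manufacturer`, which is handled explicitly) would make `xsl:element` fail, so real data may need a name-cleaning step.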
This XSLT 2.0 stylesheet:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<xsl:variable name="vLines"
select="tokenize(unparsed-text('test.txt'),'(&#xD;)?&#xA;')"/>
<xsl:variable name="vHeaders"
select="tokenize($vLines[1],',')"/>
<stock>
<xsl:for-each select="$vLines[position()!=1]">
<model>
<xsl:variable name="vColumns" select="tokenize(.,',')"/>
<xsl:for-each select="$vColumns">
<xsl:variable name="vPosition" select="position()"/>
<xsl:variable name="vHeader"
select="$vHeaders[$vPosition]"/>
<xsl:choose>
<xsl:when test="$vHeader = 'car manufacturer'">
<column name="car">
<xsl:value-of
select="(.,$vColumns[
index-of($vHeaders,'model')
])"/>
</column>
</xsl:when>
<xsl:when test="$vHeader = 'model'"/>
<xsl:otherwise>
<column name="{$vHeader}">
<xsl:value-of select="."/>
</column>
</xsl:otherwise>
</xsl:choose>
</xsl:for-each>
</model>
</xsl:for-each>
</stock>
</xsl:template>
</xsl:stylesheet>
Output:
<stock>
<model>
<column name="car">subaru outback</column>
<column name="color">blue</column>
<column name="price">23195</column>
<column name="inventory">54</column>
</model>
<model>
<column name="car">subaru forester</column>
<column name="color">silver</column>
<column name="price">20495</column>
<column name="inventory">23</column>
</model>
</stock>
Note: In XSLT 3.0 you will be able to apply templates to items in general.
EDIT: Correct names.
You can find here an example of how to do this with XSLT 3.0 :
http://www.stylusstudio.com/tutorials/intro-xslt-3.html
And see under "Text Manipulations".
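As a rough idea of what an XSLT 3.0 version could look like (this is a sketch, not taken from the linked tutorial), the lines of the file can be read with unparsed-text-lines() and templates applied to the strings themselves. The file name `test.txt` is reused from the answer above, and the column positions are hard-coded on the assumption that the CSV has exactly the five columns shown in the question.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">
  <xsl:output method="xml" indent="yes"/>

  <!-- Default entry point when no source document is supplied -->
  <xsl:template name="xsl:initial-template">
    <stock>
      <!-- tail() drops the header line; templates are applied to plain strings -->
      <xsl:apply-templates select="tail(unparsed-text-lines('test.txt'))"/>
    </stock>
  </xsl:template>

  <!-- Matches atomic string items rather than nodes -->
  <xsl:template match=".[. instance of xs:string]">
    <!-- Assumed column order: manufacturer, model, color, price, inventory -->
    <xsl:variable name="f" select="tokenize(., ',')"/>
    <model>
      <car><xsl:value-of select="$f[1], $f[2]"/></car>
      <color><xsl:value-of select="$f[3]"/></color>
      <price><xsl:value-of select="$f[4]"/></price>
      <inventory><xsl:value-of select="$f[5]"/></inventory>
    </model>
  </xsl:template>
</xsl:stylesheet>
```

This needs an XSLT 3.0 processor and does not handle quoted fields or embedded commas.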
Using XSLT 1.0, I need to transform the input XML below into the output XML that follows.
--input xml
<Row>
<Column name="NUMBER" sqltype="int">123</Column>
<Column name="DEPT1" sqltype="int">A</Column>
<Column name="CUST_EMPTYPE" sqltype="int">1</Column>
<Column name="CUST_TIJD" sqltype="int">31</Column>
</Row>
--output xml
<EMPLOYEE xmlns="http://xmlns.oracle.com/Employee">
<NUMBER>123</NUMBER>
<DEPT1>IHC</DEPT1>
<CUST_EMPTYPE>1</CUST_EMPTYPE>
<CUST_TIJD>31</CUST_TIJD>
</EMPLOYEE>
The Column names in the input XML are not known at design time, and the number of Column elements can grow.
Can anyone let me know how to achieve this?
Thank you very much,
Yes, it is quite simple. One minor point is the difference between your DEPT1 source and destination values (I really don't know where 'IHC' came from). I harmonized it in the following XSLT code, which creates one child element of EMPLOYEE per Column, named after the Column/@name attribute and containing the Column's text() content:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" />
<xsl:template match="/Row">
<EMPLOYEE xmlns="http://xmlns.oracle.com/Employee">
<xsl:for-each select="Column">
<xsl:element name="{@name}">
<xsl:value-of select="text()" />
</xsl:element>
</xsl:for-each>
</EMPLOYEE>
</xsl:template>
</xsl:stylesheet>
The result of this is:
<EMPLOYEE xmlns="http://xmlns.oracle.com/Employee">
<NUMBER>123</NUMBER>
<DEPT1>A</DEPT1>
<CUST_EMPTYPE>1</CUST_EMPTYPE>
<CUST_TIJD>31</CUST_TIJD>
</EMPLOYEE>
I'm new to XSLT and trying to transform from XML to pipe-delimited format. The problem that I'm having is that in the output, each claim has to be duplicated for each service line.
Expected Output:
EP030315706890704|TESTSUBMITTER|FAMILY HEALTHCARE|1122334455|1|99214|179.00
EP030315706890704|TESTSUBMITTER|FAMILY HEALTHCARE|1122334455|2|2000F|0.00
EP030315706890705|TESTSUBMITTER2|FAMILY HEALTHCARE|1122334455|1|99214|179.00
EP030315706890705|TESTSUBMITTER2|FAMILY HEALTHCARE|1122334455|2|2000F|0.00
Input XML looks as follows:
<payloadContainer>
<afile>
<clm>
<hdr>
<corn>EP030315706890704</corn>
<idSend>112233445</idSend>
<nmSend>TESTSUBMITTER</nmSend>
</hdr>
<provBill>
<name>
<nmOrg>FAMILY HEALTHCARE</nmOrg>
</name>
<id T="XX" P="P">1122334455</id>
</provBill>
<serv S="1">
<numLine>1</numLine>
<prof>
<px L="S">
<cdPx T="HC">99214</cdPx>
</px>
<amtChrg>179.00</amtChrg>
</prof>
</serv>
<serv S="2">
<numLine>2</numLine>
<prof>
<px L="S">
<cdPx T="HC">2000F</cdPx>
</px>
<amtChrg>0.00</amtChrg>
</prof>
</serv>
</clm>
<clm>
<hdr>
<corn>EP030315706890705</corn>
<idSend>112233445</idSend>
<nmSend>TESTSUBMITTER2</nmSend>
</hdr>
<provBill>
<name>
<nmOrg>FAMILY HEALTHCARE</nmOrg>
</name>
<id T="XX" P="P">1122334455</id>
</provBill>
<serv S="1">
<numLine>1</numLine>
<prof>
<px L="S">
<cdPx T="HC">99214</cdPx>
</px>
<amtChrg>179.00</amtChrg>
</prof>
</serv>
<serv S="2">
<numLine>2</numLine>
<prof>
<px L="S">
<cdPx T="HC">2000F</cdPx>
</px>
<amtChrg>0.00</amtChrg>
</prof>
</serv>
</clm>
</afile>
</payloadContainer>
Desired output XML:
<Table>
<row>
.... All the fields represented here.
</row>
</Table>
Possible solution: https://www.dropbox.com/s/wzvtzw7ihtgxx9o/claimtoRedshift.xsl
This approach creates two rows dynamically. However, I'm still stuck on how to duplicate the claim fields for each service line.
I don't see the connection of the linked XSLT to the question presented here. AFAICT, the following stylesheet will return the expected pipe-delimited output:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" encoding="UTF-8"/>
<xsl:template match="/payloadContainer">
<xsl:for-each select="afile/clm">
<xsl:variable name="common">
<xsl:value-of select="hdr/corn"/>
<xsl:text>|</xsl:text>
<xsl:value-of select="hdr/nmSend"/>
<xsl:text>|</xsl:text>
<xsl:value-of select="provBill/name/nmOrg"/>
<xsl:text>|</xsl:text>
<xsl:value-of select="provBill/id"/>
<xsl:text>|</xsl:text>
</xsl:variable>
<xsl:for-each select="serv">
<xsl:value-of select="$common"/>
<xsl:value-of select="numLine"/>
<xsl:text>|</xsl:text>
<xsl:value-of select="prof/px/cdPx"/>
<xsl:text>|</xsl:text>
<xsl:value-of select="prof/amtChrg"/>
<xsl:text>&#10;</xsl:text>
</xsl:for-each>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
I have an XSLT template that is working fine.
<xsl:template match="Row[contains(BenefitType, 'MyBenefit')]">
<value>
<xsl:value-of select="BenefitList/Row/Premium* 12" />
</value>
</xsl:template>
The output is
<value>100</value>
<value>110</value>
What I would prefer is if it would just output 220. So, basically in the template I would need to use some sort of variable or looping to do this and then output the final summed value?
XSLT 1 compliance is required.
The template is being used as follows:
<xsl:apply-templates select="Root/Row[contains(BenefitType, 'MyBenefit')]" />
For some reason, when I use contains here it only sums the first structure that matches and not all of them. If the parent of the values didn't depend on having a sibling element that matches a specific value, then a 'sum' approach would work.
The direct solution to the problem was already mentioned in the comments, but assuming you really want to do the same with some variables, this might be interesting for you:
XML:
<Root>
<Row>
<BenefitType>MyBenefit</BenefitType>
<BenefitList>
<Premium>100</Premium>
</BenefitList>
</Row>
<Row>
<BenefitType>MyBenefit, OtherBenefit</BenefitType>
<BenefitList>
<Premium>100</Premium>
</BenefitList>
</Row>
<Row>
<BenefitType>OtherBenefit</BenefitType>
<BenefitList>
<Premium>1000</Premium>
</BenefitList>
</Row>
<Row>
<BenefitType>OtherBenefit</BenefitType>
<BenefitList>
<Premium>1000</Premium>
</BenefitList>
</Row>
</Root>
XSLT:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:exsl="http://exslt.org/common"
exclude-result-prefixes="exsl">
<xsl:template match="/">
<total>
<xsl:variable name="valuesXml">
<values>
<xsl:apply-templates select="Root/Row[contains(BenefitType, 'MyBenefit')]" />
</values>
</xsl:variable>
<xsl:variable name="values" select="exsl:node-set($valuesXml)/values/value" />
<xsl:value-of select="sum($values)" />
</total>
</xsl:template>
<xsl:template match="Row[contains(BenefitType, 'MyBenefit')]">
<value>
<xsl:value-of select="BenefitList/Premium * 12" />
</value>
</xsl:template>
</xsl:stylesheet>
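Here the same result fragment that the template in your question generates is captured in a variable, converted to a node-set with exsl:node-set(), and then processed again, in this case summed. For the sample XML above, the two matching rows yield 1200 each, so the total element contains 2400.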
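The comments are not reproduced here, but the direct solution was presumably along the following lines: because every premium is multiplied by the same constant, the matching Premium elements can be summed first and the total multiplied once, with no intermediate variable or extension function. The sketch assumes the same sample XML as above.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <total>
      <!-- Sum the premiums of all rows whose BenefitType contains 'MyBenefit', then annualise -->
      <xsl:value-of
          select="sum(Root/Row[contains(BenefitType, 'MyBenefit')]/BenefitList/Premium) * 12"/>
    </total>
  </xsl:template>
</xsl:stylesheet>
```

For the sample document this also gives 2400 inside the total element. The node-set approach remains useful when the per-row calculation is not a simple constant factor.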
I am trying to write a loop using XSLT so that it automatically groups all items with the same ID but in a case insensitive way. Unfortunately the data that I am trying to parse through is client driven so I cannot change it prior to load.
Regardless, here is the XML structure:
<Document>
<Row>
<Cell>ID</Cell>
</Row>
<Row>
<Cell>hi</Cell>
</Row>
<Row>
<Cell>Hi</Cell>
</Row>
<Row>
<Cell>Hello</Cell>
</Row>
<Row>
<Cell>Hello</Cell>
</Row>
<Row>
<Cell>Hola</Cell>
</Row>
</Document>
This is the XSLT I am currently using...
<xsl:template match="Document">
<NewDocument xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<xsl:for-each select="//Row[position() > 1]/Cell[1][not(.=preceding::Row/Cell[1])]">
<xsl:variable name="currentOrderID" select="." />
<xsl:variable name="currentOrderGroup" select="//Row[Cell[1] = $currentOrderID]" />
<MainID>
<xsl:value-of select="$currentOrderGroup[1]/Cell[1]"/>
</MainID>
<IDs>
<xsl:for-each select="$currentOrderGroup">
<id>
<xsl:value-of select="Cell[1]"/>
</id>
</xsl:for-each>
</IDs>
</xsl:for-each>
</NewDocument>
</xsl:template>
This is just wrapping up things as expected in a CaSe SeNSiTiVe way...
I've been trying to use a translate in there in order to make everything uppercase, however I can't seem to get the syntax just right.
The result I am trying to achieve here is this:
<NewDocument>
<MainID>hi</MainID>
<IDs>
<id>hi</id>
<id>Hi</id>
</IDs>
<MainID>Hello</MainID>
<IDs>
<id>Hello</id>
<id>Hello</id>
</IDs>
<MainID>Hola</MainID>
<IDs>
<id>Hola</id>
</IDs>
</NewDocument>
Can't seem to find anything specifically for what I need.
Thanks!
In XSLT 1.0, to convert strings to lower case you need to use the rather cumbersome translate() function in XPath:
translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz')
Furthermore, your problem is one of grouping, and in XSLT 1.0 that usually means a technique known as Muenchian grouping. To do this, you first define a key to look up the items in the groups you require:
<xsl:key
name="Cell"
match="Cell"
use="translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz')"/>
Here we are looking up cells based on their (lower-case) text content.
To find the first element in each group, you look for Cell elements in the XML which also happen to be the first element occurring in your look-up key
<xsl:apply-templates
select="Row/Cell
[generate-id()
= generate-id(
key('Cell',
translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'))[1])]"/>
Then, when you match the first element, you can then match all elements within the group by looking at the key.
Here is the full XSLT
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:key name="Cell" match="Cell" use="translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz')"/>
<xsl:template match="Document">
<NewDocument>
<xsl:apply-templates select="Row/Cell[generate-id() = generate-id(key('Cell', translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'))[1])]"/>
</NewDocument>
</xsl:template>
<xsl:template match="Cell">
<MainID>
<xsl:value-of select="."/>
</MainID>
<IDs>
<xsl:apply-templates select="key('Cell', translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'))" mode="group"/>
</IDs>
</xsl:template>
<xsl:template match="Cell" mode="group">
<id>
<xsl:value-of select="."/>
</id>
</xsl:template>
</xsl:stylesheet>
Note the use of the mode attribute, to distinguish between the two templates matching Cell elements.
When applied to your XML, the following is output:
<NewDocument>
<MainID>ID</MainID>
<IDs>
<id>ID</id>
</IDs>
<MainID>hi</MainID>
<IDs>
<id>hi</id>
<id>Hi</id>
</IDs>
<MainID>Hello</MainID>
<IDs>
<id>Hello</id>
<id>Hello</id>
</IDs>
<MainID>Hola</MainID>
<IDs>
<id>Hola</id>
</IDs>
</NewDocument>
Note, I wasn't sure what to do with the Cell with ID as a value, so I left it in. If you do want to exclude it, just add this line to the XSLT:
<xsl:template match="Cell[. = 'ID']" />
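If an XSLT 2.0 processor happens to be available (the question does not say which version is in use, so this is only an aside), the same grouping can be sketched without keys or translate(), using xsl:for-each-group and lower-case(). The header row is skipped here by position, as in the question's own attempt.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="Document">
    <NewDocument>
      <!-- Group the first Cell of every non-header Row by its lower-cased value -->
      <xsl:for-each-group select="Row[position() gt 1]/Cell[1]"
                          group-by="lower-case(.)">
        <!-- The context item is the first Cell of the current group -->
        <MainID>
          <xsl:value-of select="."/>
        </MainID>
        <IDs>
          <xsl:for-each select="current-group()">
            <id>
              <xsl:value-of select="."/>
            </id>
          </xsl:for-each>
        </IDs>
      </xsl:for-each-group>
    </NewDocument>
  </xsl:template>
</xsl:stylesheet>
```

Groups are emitted in order of first appearance, and each id keeps its original casing because only the grouping key is lower-cased.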
Is it possible to perform conditional totalling in xsl?
I have the following xml sample:
<?xml version="1.0" encoding="utf-8"?>
<export>
<stats set="1">
<columns>
<column id="0">
<sum>100</sum>
</column>
<column id="1">
<sum>102</sum>
</column>
<column id="2">
<sum>12</sum>
</column>
</columns>
</stats>
<stats set="2">
<columns>
<column id="0">
<sum>100</sum>
</column>
<column id="1">
<sum>101</sum>
</column>
<column id="2">
<sum>19</sum>
</column>
</columns>
</stats>
</export>
Is it possible to compute the total of all columns in each stat set where they are not equal to one another? So it would output the following:
Set 1 Set 2 Diff(Set 1 - Set 2)
Total (Diff) 114 120 -6
column 2 102 101 1
column 3 12 19 -7
So in the output column 1 would be omitted as the sum in the two stat sets is the same.
I can get my xsl to output the columns that are different but unsure how to total these up and put in the total row.
Many thanks,
Andez
This transformation (64 lines):
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:key name="kColByPosAndVal" match="column"
use="concat(count(preceding-sibling::*),
'+',
sum)"/>
<xsl:key name="kIdByVal" match="column/@id"
use="."/>
<xsl:template match="/*">
<xsl:variable name="vsum1" select=
"sum(stats[@set=1]/*/column
[not(key('kColByPosAndVal',
concat(count(preceding-sibling::*),
'+',
sum)
)[2]
)])"/>
<xsl:variable name="vsum2" select=
"sum(stats[@set=2]/*/column
[not(key('kColByPosAndVal',
concat(count(preceding-sibling::*),
'+',
sum)
)[2]
)])"/>
Set 1 Set 2 Diff(Set 1 - Set 2)
Total (Diff) <xsl:text/>
<xsl:value-of select="$vsum1"/>
<xsl:text> </xsl:text>
<xsl:value-of select="$vsum2"/>
<xsl:text> </xsl:text>
<xsl:value-of select="$vsum1 -$vsum2"/>
<xsl:text>&#xA;</xsl:text>
<xsl:for-each select=
"*/*/column/@id
[generate-id()
= generate-id(key('kIdByVal',.)[1])
]
[not(key('kColByPosAndVal',
concat(count(../preceding-sibling::*),
'+',
../sum)
)[2]
)]">
<xsl:variable name="vcolSet1" select=
"/*/stats[@set=1]/*/column[@id=current()]/sum"/>
<xsl:variable name="vcolSet2" select=
"/*/stats[@set=2]/*/column[@id=current()]/sum"/>
Column <xsl:value-of select=".+1"/><xsl:text/>
<xsl:text> </xsl:text>
<xsl:value-of select="$vcolSet1"/>
<xsl:text> </xsl:text>
<xsl:value-of select="$vcolSet2"/>
<xsl:text> </xsl:text>
<xsl:value-of select="$vcolSet1 -$vcolSet2"/>
<xsl:text>&#xA;</xsl:text>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
when applied on the provided XML document:
<export>
<stats set="1">
<columns>
<column id="0">
<sum>100</sum>
</column>
<column id="1">
<sum>102</sum>
</column>
<column id="2">
<sum>12</sum>
</column>
</columns>
</stats>
<stats set="2">
<columns>
<column id="0">
<sum>100</sum>
</column>
<column id="1">
<sum>101</sum>
</column>
<column id="2">
<sum>19</sum>
</column>
</columns>
</stats>
</export>
produces the wanted, correct result:
Set 1 Set 2 Diff(Set 1 - Set 2)
Total (Diff) 114 120 -6
Column 2 102 101 1
Column 3 12 19 -7
Explanation:
The key named kColByPosAndVal is used to select all columns that have a given position (among all column siblings) and a given value for their sum child-element.
The key named kIdByVal is used in Muenchian grouping to find all different values for the id attribute.
The two totals are calculated summing only those columns, whose kColByPosAndVal key selects only one column element (if it selects two column elements, they both are at the same position and have the same sum).
The rest should be easy to understand.