I am trying to convert xml dumps similar to this one
<?xml version="1.0" encoding="UTF-8"?>
<report>
<report_header>
<c1>desc</c1>
<c2>prname</c2>
<c3>prnum</c3>
<c4>cdate</c4>
<c5>phase</c5>
<c6>stype</c6>
<c7>status</c7>
<c8>parent</c8>
<c9>location</c9>
</report_header>
<report_row>
<c1></c1>
<c2>IT Project Message Validation</c2>
<c3>IT-0000021</c3>
<c4>12/14/2010 09:56 AM</c4>
<c5>Preparation</c5>
<c6>IT Projects</c6>
<c7>Active</c7>
<c8>IT</c8>
<c9>/IT/BIOMED</c9>
</report_row>
<report_row>
<c1></c1>
<c2>David, Michael John Morning QA Test</c2>
<c3>IT-0000020</c3>
<c4>12/14/2010 08:12 AM</c4>
<c5>Preparation</c5>
<c6>IT Projects</c6>
<c7>Active</c7>
<c8>IT</c8>
<c9>/IT/BIOMED</c9>
</report_row>
</report>
with the xslt below, to csv. Unfortunately the contains function does not work.
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/">
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="report">
<xsl:apply-templates select="report_header"/>
<xsl:apply-templates select="report_row"/>
</xsl:template>
<xsl:template match="report_header">
<xsl:for-each select="*">
<xsl:value-of select="."/>
<xsl:if test="position() != last()">
<xsl:value-of select="','"/>
</xsl:if>
</xsl:for-each>
<xsl:text>
</xsl:text>
</xsl:template>
<xsl:template match="report_row">
<xsl:param name="value" />
<xsl:for-each select="*">
<xsl:value-of select="$value" />
<xsl:if test="(contains($value,','))">
<xsl:text>"</xsl:text><xsl:value-of select="."/><xsl:text>"</xsl:text>
</xsl:if>
<xsl:if test="not(contains($value,','))">
<xsl:value-of select="."/>
</xsl:if>
<xsl:if test="position() != last()">
<xsl:value-of select="','"/>
</xsl:if>
</xsl:for-each>
<xsl:if test="position() != last()">
<xsl:text>
</xsl:text>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
I get the following dump. I expected the qualifiers around the prname column on the second row.
desc,prname,prnum,cdate,phase,stype,status,parent,location
,IT Project Message Validation,IT-0000021,12/14/2010 09:56 AM,Preparation,IT Projects,Active,IT,/IT/BIOMED
,David, Michael John Morning QA Test,IT-0000020,12/14/2010 08:12 AM,Preparation,IT Projects,Active,IT,/IT/BIOMED
I have only tested it with the ColdFusion XmlTransform function.
The provided code has issues, some of them reported in the answer of Mads Hansen.
The main problem is that the code is unnecessarily complicated.
Below is a simple solution that produces what seems to be wanted:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:strip-space elements="*"/>
<xsl:template match="report_header/*">
<xsl:value-of select="."/>
<xsl:call-template name="processEnd"/>
</xsl:template>
<xsl:template match="report_row/*[contains(., ',')]">
<xsl:text>"</xsl:text>
<xsl:value-of select="."/>
<xsl:text>"</xsl:text>
<xsl:call-template name="processEnd"/>
</xsl:template>
<xsl:template match="report_row/*[not(contains(., ','))]">
<xsl:value-of select="."/>
<xsl:call-template name="processEnd"/>
</xsl:template>
<xsl:template name="processEnd">
<xsl:choose>
<xsl:when test="position() != last()">,</xsl:when>
<xsl:otherwise><xsl:text>&#xA;</xsl:text></xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
when applied on the provided XML document:
<report>
<report_header>
<c1>desc</c1>
<c2>prname</c2>
<c3>prnum</c3>
<c4>cdate</c4>
<c5>phase</c5>
<c6>stype</c6>
<c7>status</c7>
<c8>parent</c8>
<c9>location</c9>
</report_header>
<report_row>
<c1></c1>
<c2>IT Project Message Validation</c2>
<c3>IT-0000021</c3>
<c4>12/14/2010 09:56 AM</c4>
<c5>Preparation</c5>
<c6>IT Projects</c6>
<c7>Active</c7>
<c8>IT</c8>
<c9>/IT/BIOMED</c9>
</report_row>
<report_row>
<c1></c1>
<c2>David, Michael John Morning QA Test</c2>
<c3>IT-0000020</c3>
<c4>12/14/2010 08:12 AM</c4>
<c5>Preparation</c5>
<c6>IT Projects</c6>
<c7>Active</c7>
<c8>IT</c8>
<c9>/IT/BIOMED</c9>
</report_row>
</report>
the wanted, correct result is produced:
desc,prname,prnum,cdate,phase,stype,status,parent,location
,IT Project Message Validation,IT-0000021,12/14/2010 09:56 AM,Preparation,IT Projects,Active,IT,/IT/BIOMED
,"David, Michael John Morning QA Test",IT-0000020,12/14/2010 08:12 AM,Preparation,IT Projects,Active,IT,/IT/BIOMED
I don't think that contains() is your issue.
The issue is that your report_row template has an <xsl:param name="value"/> that is never assigned a value. The logic driven by that param therefore never fires: because $value is empty, it can never contain a comma or any other character.
Adding a select attribute to the xsl:param gives it a default value, but note that . in that position is the whole report_row rather than an individual cell, so on its own this does not give per-cell quoting:
<xsl:template match="report_row">
<xsl:param name="value" select="." />
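One way to make the contains() test apply to each cell is to drop the param and test the current node inside the for-each. The following is only a sketch of a replacement for the question's report_row template, not a tested drop-in; the use of xsl:choose and of the &#xA; character reference for the line break are choices made here, not something taken from the original post.

```xml
<xsl:template match="report_row">
  <xsl:for-each select="*">
    <xsl:choose>
      <!-- quote only the cells whose own text contains a comma -->
      <xsl:when test="contains(., ',')">
        <xsl:text>"</xsl:text>
        <xsl:value-of select="."/>
        <xsl:text>"</xsl:text>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="."/>
      </xsl:otherwise>
    </xsl:choose>
    <!-- comma between cells, but not after the last one -->
    <xsl:if test="position() != last()">
      <xsl:text>,</xsl:text>
    </xsl:if>
  </xsl:for-each>
  <!-- end each row with a line feed -->
  <xsl:text>&#xA;</xsl:text>
</xsl:template>
```

Inside the for-each the context node is a single cell (c1, c2, ...), so contains(., ',') looks at that cell's text only.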
You could simplify your stylesheet and logic by making more of a "push" style, which can be easier to debug and maintain than "pull" style stylesheets that attempt to implement procedural logic in XSLT.
Something like the following stylesheet achieves the same thing:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/">
<xsl:apply-templates select="*/report_header/*"/>
<xsl:apply-templates select="*/report_row/*"/>
</xsl:template>
<!-- For all but the last item, apply templates for the content, then add a comma -->
<xsl:template match="*[following-sibling::*]">
<xsl:apply-templates/>
<xsl:text>,</xsl:text>
</xsl:template>
<!-- If it's the last element in a group, add a newline char -->
<xsl:template match="*[not(following-sibling::*)]">
<xsl:apply-templates />
<!--Line break-->
<xsl:text>
</xsl:text>
</xsl:template>
<!-- If any values contains a comma, wrap it in quotes -->
<xsl:template match="text()[contains(.,',')]">
<xsl:text>"</xsl:text>
<xsl:value-of select="."/>
<xsl:text>"</xsl:text>
</xsl:template>
</xsl:stylesheet>
Produces the following output:
desc,prname,prnum,cdate,phase,stype,status,parent,location
,IT Project Message Validation,IT-0000021,12/14/2010 09:56 AM,Preparation,IT Projects,Active,IT,/IT/BIOMED
,"David, Michael John Morning QA Test",IT-0000020,12/14/2010 08:12 AM,Preparation,IT Projects,Active,IT,/IT/BIOMED
Related
I started learning Linux a couple of days ago and currently I'm stuck at XSLT. Sorry if it's a stupid question or already answered elsewhere, I'm quite new here.
My XML looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<lac xmlns:t="http://smth.de">
<header>
<order id="20210323346730329408"/>
<adress id="IZ0009"/>
</header>
<items>
<item id="1"><material><code>IS-0001-BT-1</code><lotcode/></material><qty>10,000000</qty><expiry/></item>
<item id="2"><material><code>IS-0001-BT-2</code><lotcode/></material><qty>20,000000</qty><expiry/></item>
<item id="3"><material><code>IS-0001-AZ-1</code><lotcode/></material><qty>30,000000</qty><expiry/></item>
<item id="4"><material><code>IS-0001-AZ-2</code><lotcode/></material><qty>40,000000</qty><expiry/></item>
</items>
</lac>
I want to get this output:
Order ID,Adress ID,Item ID,MaterialCode,MaterialLotCode,MaterialQty,Expiry
20210323346730329408,IZ0009,1,IS-0001-BT-1,,10,000000,
20210323346730329408,IZ0009,2,IS-0001-BT-2,,20,000000,
20210323346730329408,IZ0009,3,IS-0001-AZ-1,,30,000000,
20210323346730329408,IZ0009,4,IS-0001-AZ-2,,40,000000,
This is what I have so far. Your help is very welcome:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" encoding="UTF-8"/>
<xsl:strip-space elements="*" />
<xsl:template match="/">
<xsl:text>Order ID,Adress ID,Item ID,MaterialCode,MaterialLotCode,MaterialQty,Expiry
</xsl:text>
<xsl:apply-templates mode="runHeader"/>
<xsl:apply-templates mode="runItems"/>
</xsl:template>
<xsl:template match="header" mode="runHeader">
<xsl:apply-templates mode="runOrder"/>
<xsl:apply-templates mode="runAdress"/>
</xsl:template>
<xsl:template match="order" mode="runOrder">
<xsl:value-of select="./@id"/>
<xsl:text>,</xsl:text>
</xsl:template>
<xsl:template match="adress" mode="runAdress">
<xsl:value-of select="./@id"/>
<xsl:text>,</xsl:text>
</xsl:template>
<xsl:template match="items" mode="runItems">
<xsl:for-each select="//item">
<xsl:value-of select="./@id"/>
<xsl:text>,</xsl:text>
<xsl:value-of select="./material"/>
<xsl:text>,</xsl:text>
<xsl:value-of select="./lotcode"/>
<xsl:text>,</xsl:text>
<xsl:value-of select="./qty"/>
<xsl:text>,</xsl:text>
<xsl:value-of select="./expiry"/>
<xsl:text>
</xsl:text>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
Many thanks in advance.
I would suggest a different approach:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" encoding="UTF-8"/>
<xsl:template match="/lac">
<!-- header row -->
<xsl:text>Order ID,Adress ID,Item ID,MaterialCode,MaterialLotCode,MaterialQty,Expiry
</xsl:text>
<!-- common data -->
<xsl:variable name="common">
<xsl:value-of select="header/order/@id"/>
<xsl:text>,</xsl:text>
<xsl:value-of select="header/adress/@id"/>
<xsl:text>,</xsl:text>
</xsl:variable>
<!-- data rows -->
<xsl:for-each select="items/item">
<xsl:copy-of select="$common"/>
<xsl:value-of select="@id"/>
<xsl:text>,</xsl:text>
<xsl:value-of select="material/code"/>
<xsl:text>,</xsl:text>
<xsl:value-of select="material/lotcode"/>
<xsl:text>,</xsl:text>
<xsl:value-of select="qty"/>
<xsl:text>,</xsl:text>
<xsl:value-of select="expiry"/>
<xsl:text>
</xsl:text>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
Note that this assumes that each item has at most one material (your XML structure allows for more).
--
P.S. That's not how you spell address.
New to XSLT and I've been experimenting but want to know if this would be possible.
I want to transform some XML to .csv
The crux of the problem is that I want to create a numeric id for each selected element and then re-use that id for said element to link back
Given the following XML:
<root>
<executables>
<executable name="foo">
<executables>
<executable name="bar"></executable>
</executables>
</executable>
</executables>
<constraints>
<constraint name="baz" from="foo" to="bar"></constraint>
</constraints>
</root>
I'd like the result to be something along the lines of:
id,type,name,from,to
1,executable,foo,,
2,executable,bar,,
3,constraint,baz,1,2
Is this even possible?
Here is my starting XSL:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" encoding="utf-8" indent="no"/>
<xsl:template match="text()" />
<xsl:template match="/">
<xsl:text>id,type,name,from,to
</xsl:text>
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="executables">
<xsl:apply-templates />
</xsl:template>
<xsl:template match="constraints">
<xsl:apply-templates />
</xsl:template>
<xsl:template match="executable">
<xsl:number format="1" level="any"/>,executable,<xsl:value-of select="@name" /><xsl:text>,,
</xsl:text>
<xsl:apply-templates />
</xsl:template>
<xsl:template match="constraint">
<xsl:number format="1" level="any"/>,constraint,<xsl:value-of select="@name" />,<xsl:value-of select="@from" />,<xsl:value-of select="@to" /><xsl:text>
</xsl:text>
<xsl:apply-templates />
</xsl:template>
</xsl:stylesheet>
which gives this result:
id,type,name,from,to
1,executable,foo,,
2,executable,bar,,
1,constraint,baz,foo,baz
So I basically need to use the <xsl:number> matched by the attribute @name, which will be unique. Also the number isn't quite right; it counted from 1 again for the constraint match.
For the two <xsl:number format="1" level="any"/> I think you want <xsl:number count="executable | constraint" format="1" level="any"/>.
For the references, set up a key <xsl:key name="ref" match="executable" use="@name"/> and then instead of the <xsl:value-of select="@from" /> use e.g. <xsl:apply-templates select="key('ref', @from)" mode="number"/> and set up
<xsl:template match="executable" mode="number">
<xsl:number level="any"/>
</xsl:template>
If the constraint elements can also be referenced then use match="executable | constraint" in the key declaration and also <xsl:number count="executable | constraint" level="any"/> in that template.
And for the <xsl:value-of select="@to" /> you use <xsl:apply-templates select="key('ref', @to)" mode="number"/>.
https://xsltfiddle.liberty-development.net/gWvjQgk
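Put together, the pieces described above might look like the sketch below. It assumes the sample input from the question, where every from and to value names an existing executable; the template layout and the &#xA; line breaks are choices made for this sketch and are not copied from the linked fiddle.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" encoding="utf-8"/>
  <xsl:strip-space elements="*"/>

  <!-- look up executables by their name attribute -->
  <xsl:key name="ref" match="executable" use="@name"/>

  <xsl:template match="/">
    <xsl:text>id,type,name,from,to&#xA;</xsl:text>
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="executable">
    <!-- one shared counter across executables and constraints -->
    <xsl:number count="executable | constraint" level="any"/>
    <xsl:text>,executable,</xsl:text>
    <xsl:value-of select="@name"/>
    <xsl:text>,,&#xA;</xsl:text>
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="constraint">
    <xsl:number count="executable | constraint" level="any"/>
    <xsl:text>,constraint,</xsl:text>
    <xsl:value-of select="@name"/>
    <xsl:text>,</xsl:text>
    <xsl:apply-templates select="key('ref', @from)" mode="number"/>
    <xsl:text>,</xsl:text>
    <xsl:apply-templates select="key('ref', @to)" mode="number"/>
    <xsl:text>&#xA;</xsl:text>
  </xsl:template>

  <!-- emit the number of a referenced executable, using the same counter -->
  <xsl:template match="executable" mode="number">
    <xsl:number count="executable | constraint" level="any"/>
  </xsl:template>
</xsl:stylesheet>
```

Using the same count pattern in the mode="number" template keeps the referenced numbers in step with the ids written in the first column.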
I would use actual generated ids, as mentioned in your title, instead of trying to produce sequential numbering:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" encoding="utf-8" indent="no"/>
<xsl:strip-space elements="*"/>
<xsl:key name="exe-by-name" match="executable" use="@name" />
<xsl:template match="/root">
<xsl:text>id,type,name,from,to
</xsl:text>
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="executable">
<xsl:value-of select="generate-id()" />
<xsl:text>,executable,</xsl:text>
<xsl:value-of select="@name" />
<xsl:text>,,
</xsl:text>
<xsl:apply-templates />
</xsl:template>
<xsl:template match="constraint">
<xsl:value-of select="generate-id()" />
<xsl:text>,constraint,</xsl:text>
<xsl:value-of select="@name" />
<xsl:text>,</xsl:text>
<xsl:value-of select="generate-id(key('exe-by-name', @from))" />
<xsl:text>,</xsl:text>
<xsl:value-of select="generate-id(key('exe-by-name', @to))" />
<xsl:text>
</xsl:text>
<xsl:apply-templates />
</xsl:template>
</xsl:stylesheet>
Demo (using corrected XML): https://xsltfiddle.liberty-development.net/gWvjQgk/1
With XSLT 2.0, I am trying to create a list of relations between all children of given elements, in a document such as:
<doc>
<part1>
<name>John</name>
<name>Paul</name>
<name>George</name>
<name>Ringo</name>
<place>Liverpool</place>
</part1>
<part2>
<name>Romeo</name>
<name>Romeo</name>
<name>Juliet</name>
<fam>Montague</fam>
<fam>Capulet</fam>
</part2>
</doc>
The result I would like to obtain, ideally by conflating and weighing the identical relations, would be (in whatever order) something like:
<doc>
<part1>
<rel><name>John</name><name>Paul</name></rel>
<rel><name>John</name><name>George</name></rel>
<rel><name>John</name><name>Ringo</name></rel>
<rel><name>Paul</name><name>George</name></rel>
<rel><name>Paul</name><name>Ringo</name></rel>
<rel><name>George</name><name>Ringo</name></rel>
<rel><name>John</name><place>Liverpool</place></rel>
<rel><name>Paul</name><place>Liverpool</place></rel>
<rel><name>George</name><place>Liverpool</place></rel>
<rel><name>Ringo</name><place>Liverpool</place></rel>
</part1>
<part2>
<rel weight="2"><name>Romeo</name><name>Juliet</name></rel>
<rel weight="2"><name>Romeo</name><fam>Montague</fam></rel>
<rel weight="2"><name>Romeo</name><fam>Capulet</fam></rel>
<rel><name>Juliet</name><fam>Montague</fam></rel>
<rel><name>Juliet</name><fam>Capulet</fam></rel>
<rel><fam>Montague</fam><fam>Capulet</fam></rel>
</part2>
</doc>
—but I'm not sure how to proceed. Many thanks in advance for your help.
You still haven't explained the logic that needs to be applied here, so this is based largely on a guess:
XSLT 2.0
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<!-- identity transform -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="doc/*">
<!-- first pass-->
<xsl:variable name="unique-items">
<xsl:for-each-group select="*" group-by="concat(name(), '|', .)">
<item name="{name()}" count="{count(current-group())}" value="{.}"/>
</xsl:for-each-group>
</xsl:variable>
<!-- output -->
<xsl:copy>
<xsl:for-each select="$unique-items/item">
<xsl:variable name="left" select="."/>
<xsl:for-each select="following-sibling::item">
<xsl:variable name="weight" select="$left/@count * @count" />
<rel>
<xsl:if test="$weight gt 1">
<xsl:attribute name="weight" select="$weight"/>
</xsl:if>
<xsl:apply-templates select="$left | ." />
</rel>
</xsl:for-each>
</xsl:for-each>
</xsl:copy>
</xsl:template>
<xsl:template match="item">
<xsl:element name="{@name}">
<xsl:value-of select="@value"/>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
The idea here is to remove duplicates in the first pass, then enumerate all combinations in the second (final) pass. The weight is computed by multiplying the number of occurrences of each member of a combination pair and shown only when it exceeds 1.
At least the combinatoric part of your problem could be solved with the following XSLT script. It does not solve the elimination of duplicates, but that could possibly be done in a second transformation.
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<!-- standard copy template -->
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*" />
</xsl:copy>
</xsl:template>
<xsl:template match="doc/*">
<xsl:copy>
<xsl:variable name="l" select="./*"/>
<xsl:for-each select="$l">
<xsl:variable name="a" select="."/>
<xsl:variable name="posa" select="position()"/>
<xsl:variable name="namea" select="name()"/>
<xsl:for-each select="$l">
<xsl:if test="position() > $posa and (. != $a or name() != $namea)">
<rel>
<xsl:copy-of select="$a"/>
<xsl:copy-of select="."/>
</rel>
</xsl:if>
</xsl:for-each>
</xsl:for-each>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
When applied to the first part of your example, this produces:
<part1>
<rel><name>John</name><name>Paul</name></rel>
<rel><name>John</name><name>George</name></rel>
<rel><name>John</name><name>Ringo</name></rel>
<rel><name>John</name><place>Liverpool</place></rel>
<rel><name>Paul</name><name>George</name></rel>
<rel><name>Paul</name><name>Ringo</name></rel>
<rel><name>Paul</name><place>Liverpool</place></rel>
<rel><name>George</name><name>Ringo</name></rel>
<rel><name>George</name><place>Liverpool</place></rel>
<rel><name>Ringo</name><place>Liverpool</place></rel>
</part1>
Which seems about correct. I have no idea if the duplicate elimination (or weighting, as you call it) could be done in the same transformation.
Silly, simple question. When I output text, I still get the tabs based on my formatted/indented XSL structure. How do I instruct the transformer to ignore the spacing in the stylesheet while still keeping it neatly formatted?
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="/">
<xsl:apply-templates select="Foo/Bar"></xsl:apply-templates>
</xsl:template>
<xsl:template match="Bar">
<xsl:for-each select="AAA"><xsl:for-each select="BBB"><xsl:value-of select="Label"/>|<xsl:value-of select="Value"/><xsl:text>
</xsl:text></xsl:for-each></xsl:for-each>
</xsl:template>
</xsl:stylesheet>
Produces output line by line with no tabs:
SomeLabel|SomeValue
SomeLabel|SomeValue
SomeLabel|SomeValue
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="/">
<xsl:apply-templates select="Foo/Bar"></xsl:apply-templates>
</xsl:template>
<xsl:template match="Bar">
<xsl:for-each select="AAA">
<xsl:for-each select="BBB">
<xsl:value-of select="Label"/>|<xsl:value-of select="Value"/>
<xsl:text>
</xsl:text>
</xsl:for-each>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
Produces output with tabs:
SomeLabel|SomeValue
SomeLabel|SomeValue
SomeLabel|SomeValue
Update:
Adding this does not fix it:
<xsl:output method="text" indent="no"/>
<xsl:strip-space elements="*"></xsl:strip-space>
This is contrived, but you can imagine the XML looks like this:
<Foo>
<Bar>
<AAA>
<BBB>
<Label>SomeLabel1</Label>
<Value>SomeValue1</Value>
</BBB>
<BBB>
<Label>SomeLabel2</Label>
<Value>SomeValue2</Value>
</BBB>
<BBB>
<Label>SomeLabel3</Label>
<Value>SomeValue3</Value>
</BBB>
</AAA>
</Bar>
</Foo>
What you could try is wrapping all your current text nodes in xsl:text. For example, try this
<xsl:for-each select="BBB">
<xsl:value-of select="Label"/>
<xsl:text>|</xsl:text>
<xsl:value-of select="Value"/>
<xsl:text>|</xsl:text>
</xsl:for-each>
Alternatively, you could make use of the concat function.
<xsl:for-each select="BBB">
<xsl:value-of select="concat(Label, '|')"/>
<xsl:value-of select="concat(Value, '|')"/>
</xsl:for-each>
You could even combine the two statements into one if you wanted
<xsl:for-each select="BBB">
<xsl:value-of select="concat(Label, '|', Value, '|')"/>
</xsl:for-each>
EDIT: If you prefer not to enter the separator | so many times, you can make use of template matching to output the fields. First, replace the value-of with apply-templates like so
<xsl:for-each select="BBB">
<xsl:apply-templates select="Label"/>
<xsl:apply-templates select="Value"/>
<xsl:text>
</xsl:text>
</xsl:for-each>
Then you would have one specific template to match Label, where you wouldn't need to output the separator, and another more generic template matching any child of BBB
<xsl:template match="BBB/Label" priority="1">
<xsl:value-of select="." />
</xsl:template>
<xsl:template match="BBB/*">
<xsl:text>|</xsl:text><xsl:value-of select="." />
</xsl:template>
(The priority here is needed to ensure Label is matched by the first template, and not the general one). Of course, you could also not do apply-templates on Label in this case, and just do xsl:value-of for that one.
Furthermore, if the fields were being output in the order they appear in the XML, you could simplify the for-each to just this
<xsl:for-each select="BBB">
<xsl:apply-templates />
<xsl:text>
</xsl:text>
</xsl:for-each>
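A likely source of the tabs in the question's indented stylesheet is the multi-line xsl:text: whitespace-only text between instructions is stripped from a stylesheet, but everything inside xsl:text is kept, including the indentation in front of the closing tag. A sketch that keeps the formatting and writes the line break as a character reference instead (assuming a plain line feed is the wanted row terminator):

```xml
<xsl:template match="Bar">
  <xsl:for-each select="AAA">
    <xsl:for-each select="BBB">
      <xsl:value-of select="Label"/>
      <xsl:text>|</xsl:text>
      <xsl:value-of select="Value"/>
      <!-- character reference: a line feed with no stray indentation -->
      <xsl:text>&#xA;</xsl:text>
    </xsl:for-each>
  </xsl:for-each>
</xsl:template>
```

With every literal wrapped in xsl:text, the stylesheet can be indented freely without the indentation reaching the output.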
Is it possible to do the following in XSL? I'm trying to split the contents of an element and create sub-elements based on the split. To make things trickier, there is the occasional exception (i.e. node-4 doesn't get split). I'm wondering if there is a way I can do this without explicit splits hardcoded for each element. Again, not sure if this is possible. Thanks for the help!
original XML:
<document>
<node>
<node-1>hello world1</node-1>
<node-2>hello^world2</node-2>
<node-3>hello^world3</node-3>
<node-4>hello^world4</node-4>
</node>
</document>
transformed XML
<document>
<node>
<node-1>hello world1</node-1>
<node-2>
<node2-1>hello</node2-1>
<node2-2>world2</node2-2>
</node-2>
<node-3>
<node3-1>hello</node3-1>
<node3-2>world3</node3-2>
</node-3>
<node-4>hello^world4</node-4>
</node>
</document>
To make things trickier there are the
occasional exception (ie node-4
doesn't get split). I'm wondering if
there is a way i can do this without
explicit splits hardcoded for each
element.
This more semantic stylesheet uses pattern matching on text nodes to tokenize them:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="text()[contains(.,'^')]" name="tokenize">
<xsl:param name="pString" select="concat(.,'^')"/>
<xsl:param name="pCount" select="1"/>
<xsl:if test="$pString">
<xsl:element name="{translate(name(..),'-','')}-{$pCount}">
<xsl:value-of select="substring-before($pString,'^')"/>
</xsl:element>
<xsl:call-template name="tokenize">
<xsl:with-param name="pString"
select="substring-after($pString,'^')"/>
<xsl:with-param name="pCount" select="$pCount + 1"/>
</xsl:call-template>
</xsl:if>
</xsl:template>
<xsl:template match="node-4/text()">
<xsl:value-of select="."/>
</xsl:template>
</xsl:stylesheet>
Output:
<document>
<node>
<node-1>hello world1</node-1>
<node-2>
<node2-1>hello</node2-1>
<node2-2>world2</node2-2>
</node-2>
<node-3>
<node3-1>hello</node3-1>
<node3-2>world3</node3-2>
</node-3>
<node-4>hello^world4</node-4>
</node>
</document>
Note: This is a classic tokenizer (in fact, it uses a normalized string, which allows empty items in the sequence), combined with pattern matching and overriding rules (preserving the node-4 text node).
Here's an XSL 1.0 solution. I presume that the inconsistency in node-4 in your sample output was just a typo. Otherwise you'll have to define why node3 was split and node4 wasn't.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema" version="1.0">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/">
<document>
<node>
<xsl:apply-templates select="document/node/*"/>
</node>
</document>
</xsl:template>
<xsl:template match="*">
<xsl:variable name="tag" select="name()"/>
<xsl:choose>
<xsl:when test="contains(text(),'^')">
<xsl:element name="{$tag}">
<xsl:element name="{concat($tag,'-1')}">
<xsl:value-of select="substring-before(text(),'^')"/>
</xsl:element>
<xsl:element name="{concat($tag,'-2')}">
<xsl:value-of select="substring-after(text(),'^')"/>
</xsl:element>
</xsl:element>
</xsl:when>
<xsl:otherwise>
<xsl:copy-of select="."/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
This works as long as all the nodes you want split are at the same level, under /document/node. If the real document structure is different you will have to tweak the solution to match.
Can you use XSLT 2.0? If so, it sounds like <xsl:analyze-string> is right up your alley. You can split based on a regexp.
If you need further details, ask...
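As a rough illustration of the XSLT 2.0 route, the sketch below uses tokenize() rather than xsl:analyze-string, since the delimiter is a single literal character. It assumes the sample document from the question and treats node-4 as the only exception; the $prefix variable name is made up for this sketch.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- identity: copy everything not handled below -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- split children of node that contain '^', except node-4 -->
  <xsl:template match="node/*[contains(., '^')][not(self::node-4)]">
    <!-- node-2 becomes the prefix node2; computed here because the
         tokens below are strings and have no element name -->
    <xsl:variable name="prefix" select="translate(name(), '-', '')"/>
    <xsl:copy>
      <xsl:for-each select="tokenize(., '\^')">
        <xsl:element name="{$prefix}-{position()}">
          <xsl:value-of select="."/>
        </xsl:element>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Further exceptions could be added to the not(self::node-4) predicate, so the splitting rule itself stays generic.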
Solution I used:
<xsl:output omit-xml-declaration="yes" method="xml" indent="yes"/>
<xsl:preserve-space elements="*"/>
<xsl:template match="node()|@*" name="identity">
<xsl:copy>
<xsl:apply-templates select="node()[1]|@*"/>
</xsl:copy>
<xsl:apply-templates select="following-sibling::node()[1]"/>
</xsl:template>
<xsl:template match="node()" mode="copy">
<xsl:call-template name="identity"/>
</xsl:template>
<xsl:template match="node-2 | node-3" name="subFieldCarrotSplitter">
<xsl:variable name="tag" select="name()"/>
<xsl:element name="{$tag}">
<xsl:for-each select="str:split(text(),'^')">
<xsl:element name="{concat($tag,'-',position())}">
<xsl:value-of select="text()"/>
</xsl:element>
</xsl:for-each>
</xsl:element>
<xsl:apply-templates select="following-sibling::node()[1]"/>
</xsl:template>