For the grouping function, is there a way to group by a dynamic key that is passed in the input data? For example, in the input XML below, I want to group <Trans> by the node name passed in <key1>, which is currently "id". Thank you!
<xsl:for-each-group select="Trans" group-by="[this key node name is from the input]">
Input xml:
<File>
<key1>id</key1>
<Trans>
<id>1</id>
<name>jane</name>
<location>ga</location>
<value>1.11</value>
</Trans>
<Trans>
<id>2</id>
<name>jane</name>
<location>ma</location>
<value>2.22</value>
</Trans>
<Trans>
<id>1</id>
<name>john</name>
<location>al</location>
<value>3.33</value>
</Trans>
<Trans>
<id>3</id>
<name>jj</name>
<location>ga</location>
<value>4.44</value>
</Trans> </File>
group-by="*[local-name() = ../../key1]"
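A fuller sketch of this approach, assuming XSLT 2.0 and the input shown above. Binding the key name to a variable (the name `key-name` is only illustrative) avoids having to navigate from each child of `<Trans>` back up to `<key1>` inside the predicate. The `<groups>` and `<group>` wrapper elements are hypothetical and only there to make the grouping visible.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/File">
    <!-- Name of the child element to group by, read from the input ("id" here) -->
    <xsl:variable name="key-name" select="key1"/>
    <groups>
      <!-- For each Trans, the grouping key is the value of the child
           whose local name equals the text of key1 -->
      <xsl:for-each-group select="Trans"
                          group-by="*[local-name() = $key-name]">
        <group key="{current-grouping-key()}">
          <xsl:copy-of select="current-group()"/>
        </group>
      </xsl:for-each-group>
    </groups>
  </xsl:template>
</xsl:stylesheet>
```

Changing `<key1>` to `name` or `location` in the input should regroup the same `<Trans>` elements by that child without touching the stylesheet.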
I read expressions like this
<xsl:variable name="myVar" select="$data[not(key('myKey',@myRef))]"/>
in legacy code. Most likely it is code from experts ;-). I'm wondering what it does, how it works, and how I could re-engineer it to make it more readable. Thank you.
Keys are an important aspect of XSLT. Instead of re-engineering them, it's better to learn the concept.
Keys can be understood as tables with nodes stored under specific keys. They are defined like this:
<xsl:key name="addressByStreet" match="address" use="street"/>
The name attribute is just a QName (similar to a variable name). The match attribute holds an XPath expression that works similarly to the match attribute of <xsl:template>. When the processor finds a node that matches the expression, it evaluates the XPath expression of the use attribute in the context of the matched element. If this expression returns values, they will be used to create new entries in the "key table" for the matched element.
To illustrate that: The above key creates a table with all the <address> elements in the processed document, keyed by the value of their <street> child. This means, if you have these elements:
<address>
<street>Main Street</street>
<number>123</number>
</address>
<address>
<street>Main Street</street>
<number>456</number>
</address>
<address>
<street>Country Road</street>
<street>Country Rd.</street>
<number>789</number>
</address>
… you could then use key('addressByStreet', 'Main Street') to retrieve all the listed addresses in Main Street.
You can use both key('addressByStreet', 'Country Road') and key('addressByStreet', 'Country Rd.') to retrieve the last address.
Why use keys here? The above expression could be re-implemented like //address[street='Main Street'], but now every time this expression is called, the XSLT processor likely goes through the entire document again. That's a problem if a template or loop is called often. Keys can have huge performance benefits (e.g. reduce complexity from O(n²) to O(n)) because the results are "cached".
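A minimal sketch of the key lookup described above, assuming XSLT 1.0 and that the three `<address>` elements sit under a single root element of any name. The `<main-street-numbers>` output element is hypothetical.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Index every <address> under the value of each of its <street> children -->
  <xsl:key name="addressByStreet" match="address" use="street"/>

  <xsl:template match="/">
    <main-street-numbers>
      <!-- Look the addresses up in the key table instead of scanning with
           //address[street='Main Street'] -->
      <xsl:for-each select="key('addressByStreet', 'Main Street')">
        <number>
          <xsl:value-of select="number"/>
        </number>
      </xsl:for-each>
    </main-street-numbers>
  </xsl:template>
</xsl:stylesheet>
```

With the sample addresses, this should list the numbers 123 and 456. The third address is stored under both of its street names, so it is found by either one.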
There are many applications and patterns in which keys are used. For example if you have this XML:
<street-list>
<street>Main Street</street>
<street>Bumpy Road</street>
</street-list>
The expression street-list/street[not(key('addressByStreet', .))] will filter the list of streets and only return streets for which there is no address in the above list – i.e. only "Bumpy Road" in this case because for "Main Street", a key entry exists.
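The same filter as a small stylesheet, sketched under the assumption that the `<address>` elements and the `<street-list>` live in one document under a common root element. The `<streets-without-address>` output element is hypothetical.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Same key as before: addresses indexed by street name -->
  <xsl:key name="addressByStreet" match="address" use="street"/>

  <xsl:template match="/">
    <streets-without-address>
      <!-- Keep only the listed streets for which the key lookup is empty -->
      <xsl:copy-of
          select="//street-list/street[not(key('addressByStreet', .))]"/>
    </streets-without-address>
  </xsl:template>
</xsl:stylesheet>
```

Inside the predicate, `.` is the current `<street>`, and its string value is used as the lookup value. An empty result from `key()` converts to false, so `not(...)` keeps exactly the streets that have no matching address.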
A typical application of keys in XSLT 1 is Muenchian grouping.
I've got the use case now. No, it is not legacy code. This is clear from context and definition of key and data.
If we have data like this:
<xsl:variable name="dict">
<ITEMS>
<ITEM id="1" content="it1">
<ITEM-REF ref="3"/>
</ITEM>
<ITEM id="2" content="it2">
<ITEM-REF ref="1"/>
</ITEM>
<ITEM id="3" content="it3">
<ITEM-REF ref="6"/>
</ITEM>
<ITEM id="4" content="it4">
<ITEM-REF ref="3"/>
</ITEM>
<ITEM id="5" content="it5">
<ITEM-REF ref="5"/>
</ITEM>
<ITEM id="6" content="it6">
<ITEM-REF ref="8"/>
</ITEM>
<ITEM id="7" content="it7">
<ITEM-REF ref="9"/>
</ITEM>
</ITEMS>
</xsl:variable>
And we want to get all ITEM-REF elements with @ref values for which there is no ITEM with the same @id value (broken links), the expression can help out:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl">
<xsl:output method="xml" encoding="utf-8" indent="yes"/>
<xsl:variable name="dict">
<ITEMS>
<ITEM id="1" content="it1">
<ITEM-REF ref="3"/>
</ITEM>
<ITEM id="2" content="it2">
<ITEM-REF ref="1"/>
</ITEM>
<ITEM id="3" content="it3">
<ITEM-REF ref="6"/>
</ITEM>
<ITEM id="4" content="it4">
<ITEM-REF ref="3"/>
</ITEM>
<ITEM id="5" content="it5">
<ITEM-REF ref="5"/>
</ITEM>
<ITEM id="6" content="it6">
<ITEM-REF ref="8"/>
</ITEM>
<ITEM id="7" content="it7">
<ITEM-REF ref="9"/>
</ITEM>
</ITEMS>
</xsl:variable>
<xsl:key name="itemkey" match="ITEM" use="@id"/>
<xsl:template match="START">
<xsl:variable name="allItems" select="msxsl:node-set($dict)//ITEM"/>
<xsl:variable name="allItemRefs" select="msxsl:node-set($dict)//ITEM-REF"/>
<xsl:variable name="itemRefsNotReferencingOtherItems" select="$allItemRefs[not(key('itemkey',@ref))]"/>
<REFERENCED-NOT-EXISTING>
<xsl:for-each select="msxsl:node-set($itemRefsNotReferencingOtherItems)">
<ITEM>
<xsl:attribute name="id">
<xsl:value-of select="@ref"/>
</xsl:attribute>
</ITEM>
</xsl:for-each>
</REFERENCED-NOT-EXISTING>
</xsl:template>
</xsl:stylesheet>
Output:
<?xml version="1.0" encoding="utf-8"?>
<REFERENCED-NOT-EXISTING>
<ITEM id="8" />
<ITEM id="9" />
</REFERENCED-NOT-EXISTING>
Input file:
<?xml version="1.0" encoding="utf-8"?>
<START/>
I have two large xml files, one of which has the following format:
<Persons>
<Person>
<ID>1</ID>
<LAST_NAME>London</LAST_NAME>
</Person>
<Person>
<ID>2</ID>
<LAST_NAME>Twain</LAST_NAME>
</Person>
<Person>
<ID>3</ID>
<LAST_NAME>Dikkens</LAST_NAME>
</Person>
</Persons>
The second file has the following format:
<SalesPersons>
<SalesPerson>
<ID>2</ID>
<LAST_NAME>London</LAST_NAME>
</SalesPerson>
<SalesPerson>
<ID>3</ID>
<LAST_NAME>Dikkens</LAST_NAME>
</SalesPerson>
</SalesPersons>
I need to find the records in file 1 that do not exist in file 2. Although I have done this using a for-each loop, that approach takes a substantial amount of time. Is it possible to make it run faster using a different approach?
Using a key can help to improve performance on lookups:
<xsl:key name="sales-person" match="SalesPerson" use="concat(ID, '|', LAST_NAME)"/>
<xsl:template match="/">
<xsl:for-each select="Persons/Person">
<xsl:variable name="person" select="."/>
<!-- need to change context document for key function use -->
<xsl:for-each select="$doc2">
<xsl:if test="not(key('sales-person', concat($person/ID, '|', $person/LAST_NAME)))">
<xsl:copy-of select="$person"/>
</xsl:if>
</xsl:for-each>
</xsl:for-each>
</xsl:template>
That assumes you have bound doc2 as a variable or parameter with e.g. <xsl:param name="doc2" select="document('sales-persons.xml')"/>.
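If an XSLT 2.0 processor is available (an assumption, since the question does not state the version), the three-argument form of `key()` can search the second document directly, so the inner `xsl:for-each` used to switch context is not needed. A sketch under that assumption, with the file name taken from the example above and the `<Persons>` output wrapper chosen only for illustration:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Second file, bound as a parameter -->
  <xsl:param name="doc2" select="document('sales-persons.xml')"/>

  <!-- Index sales persons by a combined ID|LAST_NAME string -->
  <xsl:key name="sales-person" match="SalesPerson"
           use="concat(ID, '|', LAST_NAME)"/>

  <xsl:template match="/">
    <Persons>
      <!-- The third argument of key() restricts the lookup to the tree
           rooted at $doc2 -->
      <xsl:copy-of select="Persons/Person[
          not(key('sales-person', concat(ID, '|', LAST_NAME), $doc2))]"/>
    </Persons>
  </xsl:template>
</xsl:stylesheet>
```

Because the key combines both fields, a person counts as present in file 2 only when both ID and LAST_NAME match.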
I have an XML file with this structure:
<DetailTxt>
<Text>
<span>Some Text</span>
</Text>
<TextComplement Kind="Owner" MarkLbl="1">
<ComplCaption>
Caption 1
</ComplCaption>
<ComplBody>
Body 1
</ComplBody>
</TextComplement>
<Text>
<span>More Text</span>
</Text>
</DetailTxt>
Here is the part of the XSLT that is relevant here:
<xsl:template match="*[local-name() = 'DetailTxt']">
<xsl:apply-templates select="*[local-name() = 'Text']"/>
</xsl:template>
<xsl:template match="*[local-name() = 'Text']">
<item name="{local-name()}">
<richtext>
<par>
<run>
<xsl:text disable-output-escaping="yes">&lt;![CDATA[</xsl:text>
<xsl:apply-templates/>
<xsl:text disable-output-escaping="yes">]]&gt;</xsl:text>
</run>
</par>
</richtext>
</item>
<item name="{local-name()}">
<richtext>
<par>
<run>
<xsl:text disable-output-escaping="yes">&lt;![CDATA[</xsl:text>
<xsl:value-of select="concat('[', ../TextComplement/@Kind, ../TextComplement/@MarkLbl,']')" />
<xsl:text disable-output-escaping="yes">]]&gt;</xsl:text>
</run>
</par>
</richtext>
</item>
</xsl:template>
I expect the output to look like this:
<item name="Text">
<richtext>
<par>
<run><![CDATA[
<span>Some Text</span>
</p>]]></run>
</par>
</richtext>
</item>
<item name="Text">
<richtext>
<par>
<run><![CDATA[[Owner1]]]></run>
</par>
</richtext>
</item>
But the line using the TextComplement XPath looks like this:
<run><![CDATA[[]]]></run>
All values from TextComplement are missing. What's wrong with the XPath here?
EDIT: I completely reworked my question and put in a CONCRETE question resulting from the first answer. That kind of invalidates the first answer but IMHO improves the question.
I'm not sure what your XSLT looks like, but you can try adding the following template, which uses the concat() function to build the output.
<xsl:template match="Text">
<document version="9.0" form="Form1">
<item name="{local-name()}">
<xsl:copy-of select="span" />
</item>
<item name="{local-name()}">
<span>
<xsl:value-of select="concat('[', ../TextComplement/@Kind, ../TextComplement/@MarkLbl, ']')" />
</span>
</item>
</document>
</xsl:template>
This template is applied to the <Text> node and the ../ is used to go up one level and then access the attributes of <TextComplement> using the XPath.
The output of the template when applied to your XML will look like this:
<document form="Form1" version="9.0">
<item name="Text">
<span>Some Text</span>
</item>
<item name="Text">
<span>[Owner1]</span>
</item>
</document>
The same template will also get applied to the <Text> node having More Text content and produce similar output.
I found a solution myself for the concrete question. I guess this is an IBM Notes / LotusScript specific issue.
When using the selector
../TextComplement/@Kind
the parser returned an empty string. I changed to
../*[local-name() = 'TextComplement']/@Kind
and later (more concrete) to:
./following-sibling::*[local-name() = 'TextComplement']/@Kind
And that worked. I personally see no difference between these notations, but it seems that internally they are handled differently.
I have the following xml:
<Details>
<Head>
<pageid>123</pageid> <!-- Needs to be sequential starting with 0000000001 -->
</Head>
<Start>
<pageid>124</pageid>
<value>Details of Minerals</value>
</Start>
<Item>
<pageid>12</pageid>
<name>Coal</name>
</Item>
<Quantity>
<pageid>45</pageid>
<value>3</value>
<comments>NONE MENTIONED</comments>
</Quantity>
<Item>
<pageid>459</pageid>
<name>MICA</name>
</Item>
<Quantity>
<pageid>65</pageid>
<value>2</value>
<comments>NONE MENTIONED</comments>
</Quantity>
<END>
<pageid>78</pageid>
</END>
</Details>
I want the pageid values to be sequential, padded to 10 digits.
Sample output:
<Details>
<Head>
<pageid>0000000001</pageid>
</Head>
<Start>
<pageid>0000000002</pageid>
<value>Details of Minerals</value>
</Start>
<Item>
<pageid>0000000003</pageid>
<name>Coal</name>
</Item>
<Quantity>
<pageid>0000000004</pageid>
<value>3</value>
<comments>NONE MENTIONED</comments>
</Quantity>
<Item>
<pageid>0000000005</pageid>
<name>MICA</name>
</Item>
<Quantity>
<pageid>0000000006</pageid>
<value>2</value>
<comments>NONE MENTIONED</comments>
</Quantity>
<END>
<pageid>0000000007</pageid>
</END>
</Details>
I tried using the following construct:
<xsl:variable name="counter" select="0000000000" saxon:assignable="yes"/>
<xsl:template match="//*[local-name()='pageid']">
<saxon:assign name="counter" select="$counter+0000000001"/>
<imp1:Line_id>
<xsl:value-of select="$counter"></xsl:value-of>
</imp1:Line_id>
But this wasn't helpful. Can you suggest an easier way to do it?
Instead of trying to use a variable counter, you could just make use of the xsl:number element here:
<xsl:template match="//*[local-name()='pageid']">
<imp1:Line_id>
<xsl:number level="any" format="0000000001" />
</imp1:Line_id>
</xsl:template>
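A sketch of how this could fit into a complete stylesheet that keeps the `<pageid>` element name from the sample output, assuming XSLT 1.0 and an input without namespaces. The identity template is an addition for illustration and is not part of the answer above. The format token `0000000001` (nine zeros followed by a 1) asks for decimal numbers at least ten digits long.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Identity template: copy everything that is not handled below -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Replace the content of every pageid with its position among all
       pageid elements in document order, zero-padded to 10 digits -->
  <xsl:template match="pageid">
    <xsl:copy>
      <xsl:number level="any" format="0000000001"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

With `level="any"` and no `count` attribute, `xsl:number` counts the nodes with the same name as the current node up to and including it, so the original pageid values are ignored.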
I'm processing a source HTML file that holds tabular data in an unstructured way. Basically it's a bunch of absolutely positioned divs. My goal is to rebuild some sort of structured XML data. So far, using XSLT 2.0 I was able to produce an XML looking like this:
<data>
<line top="44">
<item left="294">Some heading text</item>
</line>
<line top="47">
<item left="718">A</item> <!-- this item is a section-start -->
<item left="764">Section heading</item>
</line>
<line top="78">
<item left="92">Data</item>
<item left="144">Data</item>
<item left="540">Data</item>
<item left="588">Data</item>
</line>
<line top="101">
<item left="61">B</item> <!-- this item is a section-start -->
<item left="144">Section heading</item>
</line>
<line top="123">
<item left="92">Data</item>
<item left="144">Data</item>
</line>
</data>
However, what I need to do next is group lines into sections. Each section starts with a line whose first item's value consists of a single letter A – Z. My approach is to hold all the <line> elements in a $lines variable and then use xsl:for-each-group with group-starting-with attribute to identify the element starting a new section.
The respective XSLT fragment looks like this:
<xsl:for-each-group select="$lines/line" group-starting-with="...pattern here...">
<section>
<xsl:copy-of select="current-group()"/>
</section>
</xsl:for-each-group>
The problem is I can't figure out a working pattern to identify section starts. The best I could do was ensuring that //line/item[1]/text()[matches(., '^[A-Z]$')] works when used separately in an XPath evaluator. However, I can't seem to derive a working version to be used with group-starting-with.
Update: the wanted result should look like this:
<data>
<section> <!-- this section started automatically because of being at the beginning -->
<line top="44">
<item left="294">Some heading text</item>
</line>
</section>
<section>
<line top="47">
<item left="718">A</item> <!-- this item is a section-start -->
<item left="764">Section heading</item>
</line>
<line top="78">
<item left="92">Data</item>
<item left="144">Data</item>
<item left="540">Data</item>
<item left="588">Data</item>
</line>
</section>
<section>
<line top="101">
<item left="61">B</item> <!-- this item is a section-start -->
<item left="144">Section heading</item>
</line>
<line top="123">
<item left="92">Data</item>
<item left="144">Data</item>
</line>
</section>
</data>
The solution:
<xsl:for-each-group select="$lines/line" group-starting-with="line[matches(child::item[1], '^[A-Z]$')]">
<section name="{current-group()[1]/item[1]}">
<xsl:copy-of select="current-group()"/>
</section>
</xsl:for-each-group>
The trick is understanding that group-starting-with must be a match pattern, not a boolean condition.
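For reference, a self-contained sketch that applies the same pattern directly to the `<data>` document from the question instead of a `$lines` variable, assuming XSLT 2.0. The `name` attribute is left out so that the result matches the wanted output.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/data">
    <data>
      <!-- A new group starts at every <line> whose first <item> is a single
           letter A-Z; lines before the first such line form the initial group -->
      <xsl:for-each-group select="line"
          group-starting-with="line[matches(item[1], '^[A-Z]$')]">
        <section>
          <xsl:copy-of select="current-group()"/>
        </section>
      </xsl:for-each-group>
    </data>
  </xsl:template>
</xsl:stylesheet>
```

The pattern names the element that opens a group (`line`) and puts the test in a predicate. This is why an expression that selects the text node, such as `//line/item[1]/text()[...]`, does not work here: the population consists of `<line>` elements, and none of them matches a pattern for text nodes.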