How to check for duplicate element values using XSLT

<BookList>
<Book>
<History>
<Type>history</Type>
<Prize>123</Prize>
<Publication>
<Name>YEAP1</Name>
</Publication>
<RNumber Type="VolumeNumber">11111</RNumber>
<RNumber Type="SupplementNumber">123456</RNumber>
</History>
<chemistry>
<Type>chemistry</Type>
<Prize>333</Prize>
<Publication>
<Name>YEAP</Name>
</Publication>
<RNumber Type="VolumeNumber">11111</RNumber>
<RNumber Type="SupplementNumber">45454</RNumber>
</chemistry>
......
</Book>
</BookList>
The VolumeNumber 11111 appears twice. How can I check for duplicate VolumeNumber values in the BookList XML using XSLT? Please help.

I. This can be found using a single XPath expression:
false()
or
/*/*/*/RNumber
[@Type='VolumeNumber'
and
. = ../preceding-sibling::*
/RNumber[@Type='VolumeNumber']
]
Here is a complete XSLT transformation that evaluates this XPath expression and outputs the result of this evaluation:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="/">
Duplicate volume numbers exist: <xsl:text/>
<xsl:value-of select=
"false()
or
/*/*/*/RNumber
[@Type='VolumeNumber'
and
. = ../preceding-sibling::*
/RNumber[@Type='VolumeNumber']
]
"/>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied on the provided XML document:
<BookList>
<Book>
<History>
<Type>history</Type>
<Prize>123</Prize>
<Publication>
<Name>YEAP1</Name>
</Publication>
<RNumber Type="VolumeNumber">11111</RNumber>
<RNumber Type="SupplementNumber">123456</RNumber>
</History>
<chemistry>
<Type>chemistry</Type>
<Prize>333</Prize>
<Publication>
<Name>YEAP</Name>
</Publication>
<RNumber Type="VolumeNumber">11111</RNumber>
<RNumber Type="SupplementNumber">45454</RNumber>
</chemistry>
......
</Book>
</BookList>
the wanted, correct result is produced:
Duplicate volume numbers exist: true
II. Solution using keys (generally more efficient):
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:key name="kVolNum" match="RNumber[@Type='VolumeNumber']"
use="."/>
<xsl:template match="/">
Duplicate volume numbers exist: <xsl:text/>
<xsl:value-of select=
"false()
or
/*/*/*/RNumber
[@Type='VolumeNumber'
and
key('kVolNum',.)[2]
]"/>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied on the same XML document (above), the same correct result is produced:
Duplicate volume numbers exist: true
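If the duplicated values themselves are needed rather than a true/false answer, the same key can be reused. The following is only a sketch, not part of the original answer: it reports each duplicated volume number once by selecting the second element that carries it. Note that the key is document-wide, so it finds duplicates anywhere in the document, not only among siblings under the same Book.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <!-- Same key as above: index VolumeNumber elements by their string value -->
  <xsl:key name="kVolNum" match="RNumber[@Type='VolumeNumber']" use="."/>
  <xsl:template match="/">
    <!-- Keep only the element that is the second one carrying its value;
         a value that occurs once has no second element and is skipped -->
    <xsl:for-each select="//RNumber[@Type='VolumeNumber']
                            [generate-id() = generate-id(key('kVolNum', .)[2])]">
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Applied to the document above, this should print 11111 once.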

Related

XSLT 2 - find first missing element in source list

I have a problem with XSLT and/or XPath. Let's say I have this XML input:
<context>
<pdpid-set>
<list>
<item>1</item>
<item>2</item>
<item>4</item>
<item>6</item>
<item>7</item>
<item>8</item>
</list>
</pdpid-set>
</context>
The task is to find the FIRST missing element in the array pdpid-set/list. In the example above the answer is 3.
I tried to use <xsl:for-each> to find the missing element, but there is no way to break out of such a loop, so my XSL produces more than one element in the output:
<xsl:variable name="list" select="context/pdpid-set/list"/>
<xsl:variable name="length" select="count(context/pdpid-set/list/item)"/>
<xsl:for-each select="1 to ($length)">
<xsl:variable name="position" select="position()"/>
<xsl:if test="$list/item[$position] > $position">
<missing-value>
<xsl:value-of select="$position"/>
</missing-value>
</xsl:if>
</xsl:for-each>
With the code above the output will be:
<missing-value>3</missing-value><missing-value>4</missing-value><missing-value>5</missing-value>...
I don't want to have more than one missing-value. Any suggestions?
This is possible even in XPath 1.0:
/context
/pdpid-set
/list
/item[not(position()=.)][1]
Do note: this selects the first item that is not aligned with the ascending order (the item element itself, not the missing number). I still think that position() is better than the following-sibling axis, both performance-wise and for code clarity. It also lets you easily change the starting number and step, as in:
/context
/pdpid-set
/list
/item[not((position() - 1) * $step + $start = .)][1]
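To turn the selected item into the missing value itself, the expected value at that position can be recomputed from the number of preceding siblings. The stylesheet below is a minimal sketch under that assumption; the parameter names start and step mirror the variables in the expression above, and the variable name firstOff is made up for illustration.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes"/>
  <!-- Assumed defaults: the sequence starts at 1 and increases by 1 -->
  <xsl:param name="start" select="1"/>
  <xsl:param name="step" select="1"/>
  <xsl:template match="/">
    <!-- First item whose value differs from the value expected at its position -->
    <xsl:variable name="firstOff"
        select="/context/pdpid-set/list/item
                  [not((position() - 1) * $step + $start = .)][1]"/>
    <!-- Emit nothing when every item is aligned -->
    <xsl:if test="$firstOff">
      <missing-value>
        <!-- Expected value at that position = the first missing number -->
        <xsl:value-of
            select="count($firstOff/preceding-sibling::item) * $step + $start"/>
      </missing-value>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

For the input in the question, the first misaligned item is 4 at position 3, so this should produce <missing-value>3</missing-value>.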
Task is: find FIRST missing element in array pdpid-set/list. In
example above answer is 3
Here is a correct XPath 1.0 expression that evaluates to the wanted result (3):
/*/*/*/item[not(. +1 = following-sibling::*[1])][1] + 1
The XPath expression in the currently selected answer, on the other hand, selects this element:
<item>4</item>
And the complete correct XSLT 1.0 transformation is:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="/">
<missing-value>
<xsl:copy-of select="/*/*/*/item[not(. +1 = following-sibling::*[1])][1] + 1"/>
</missing-value>
</xsl:template>
</xsl:stylesheet>
When applied on the provided XML document, the wanted, correct result is produced:
<missing-value>3</missing-value>
Finally, if the task is to find all missing elements:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="xs">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match=
"item[following-sibling::* and not(number(.) +1 = following-sibling::*[1]/number())]">
<xsl:for-each select="xs:integer(.) + 1 to following-sibling::*[1]/xs:integer(.) -1">
<missing-value><xsl:copy-of select="."/></missing-value>
</xsl:for-each>
</xsl:template>
<xsl:template match="text()"/>
</xsl:stylesheet>
when this XSLT 2.0 transformation is applied on the following XML document (missing 3, 5, and 6):
<context>
<pdpid-set>
<list>
<item>1</item>
<item>2</item>
<item>4</item>
<item>7</item>
<item>8</item>
</list>
</pdpid-set>
</context>
the wanted, correct result is produced:
<missing-value>3</missing-value>
<missing-value>5</missing-value>
<missing-value>6</missing-value>

XSLT 3.0 Streaming with Grouping and Sum/Accumulator

I'm trying to figure out how to use XSLT streaming (to reduce memory usage) in a scenario that requires grouping (with an arbitrary number of groups) and summing each group. So far I haven't been able to find any examples. Here's an example XML:
<?xml version='1.0' encoding='UTF-8'?>
<Data>
<Entry>
<Genre>Fantasy</Genre>
<Condition>New</Condition>
<Format>Hardback</Format>
<Title>Birds</Title>
<Count>3</Count>
</Entry>
<Entry>
<Genre>Fantasy</Genre>
<Condition>New</Condition>
<Format>Hardback</Format>
<Title>Cats</Title>
<Count>2</Count>
</Entry>
<Entry>
<Genre>Non-Fiction</Genre>
<Condition>New</Condition>
<Format>Paperback</Format>
<Title>Dogs</Title>
<Count>4</Count>
</Entry>
</Data>
In XSLT 2.0 I would use this to group by Genre, Condition and Format and Sum the counts.
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" indent="yes" />
<xsl:template match="/">
<xsl:call-template name="body"/>
</xsl:template>
<xsl:template name="body">
<xsl:for-each-group select="Data/Entry" group-by="concat(Genre,Condition,Format)">
<xsl:value-of select="Genre"/>
<xsl:value-of select="Condition"/>
<xsl:value-of select="Format"/>
<xsl:value-of select="sum(current-group()/Count)"/>
</xsl:for-each-group>
</xsl:template>
</xsl:stylesheet>
For output I would get two lines, a sum of 5 for Fantasy, New, Hardback and a sum of 4 for Non-Fiction, New, Paperback.
Obviously this won't work with streaming because the sum accesses the whole group. I think I need to iterate through the document twice. The first time I could build a map of the groups (creating a new group if one doesn't exist yet). The problem is that for the second pass I would also need an accumulator for each group, with a rule that matches the group, and it doesn't seem you can create dynamic accumulators.
Is there a way to create accumulators on the fly? Is there another/easier way to do this with Streaming?
To be able to use streamed grouping with XSLT 3.0 one option that I see is to first transform the element based data you have into attribute based data using a stylesheet like
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:math="http://www.w3.org/2005/xpath-functions/math"
exclude-result-prefixes="xs math"
version="3.0">
<xsl:mode streamable="yes" on-no-match="shallow-copy"/>
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="Entry/*">
<xsl:attribute name="{name()}" namespace="{namespace-uri()}" select="."/>
</xsl:template>
</xsl:stylesheet>
You can then use streamed grouping (as far as a streamed group-by is possible at all; as far as I understand, some buffering will be necessary) as follows:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:math="http://www.w3.org/2005/xpath-functions/math"
exclude-result-prefixes="xs math"
version="3.0">
<xsl:mode streamable="yes"/>
<xsl:output method="text"/>
<xsl:template match="/">
<xsl:fork>
<xsl:for-each-group select="Data/Entry" composite="yes" group-by="@Genre, @Condition, @Format">
<xsl:value-of select="current-grouping-key(), sum(current-group()/@Count)"/>
<xsl:text>&#10;</xsl:text>
</xsl:for-each-group>
</xsl:fork>
</xsl:template>
</xsl:stylesheet>
I don't know whether first creating an attribute-centric document is an option, but I think it is better to share suggestions with code in an answer than to put them into a comment. The answer in XSLT Streaming Chained Transform shows how to use Saxon 9 with Java or Scala to chain two streaming transformations without writing a temporary output file for the first transformation step.
As for doing it with copy-of on the original input format, Saxon 9.7 EE assesses the following as streamable and executes it with the right result:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:math="http://www.w3.org/2005/xpath-functions/math" exclude-result-prefixes="xs math"
version="3.0">
<xsl:mode streamable="yes"/>
<xsl:output method="text"/>
<xsl:template match="/">
<xsl:for-each-group select="copy-of(Data/Entry)" composite="yes"
group-by="Genre, Condition, Format">
<xsl:value-of select="current-grouping-key(), sum(current-group()/Count)"/>
<xsl:text>&#10;</xsl:text>
</xsl:for-each-group>
</xsl:template>
</xsl:stylesheet>
I am not sure it consumes less memory however than normal, tree based grouping. Perhaps you can measure with your real input data.
As a third alternative, to use a map as you seemed to want to do, here is an xsl:iterate example that iterates through the Entry elements, collecting the accumulated Count value in a map:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:math="http://www.w3.org/2005/xpath-functions/math"
xmlns:map="http://www.w3.org/2005/xpath-functions/map" exclude-result-prefixes="xs math map"
version="3.0">
<xsl:mode streamable="yes"/>
<xsl:output method="text"/>
<xsl:template match="/">
<xsl:iterate select="Data/Entry">
<xsl:param name="groups" as="map(xs:string, xs:integer)" select="map{}"/>
<xsl:on-completion>
<xsl:value-of select="map:keys($groups)!(. || ' ' || $groups(.))" separator="&#10;"/>
</xsl:on-completion>
<xsl:variable name="current-entry" select="copy-of()"/>
<xsl:variable name="key"
select="string-join($current-entry/(Genre, Condition, Format), '|')"/>
<xsl:next-iteration>
<xsl:with-param name="groups"
select="
if (map:contains($groups, $key)) then
map:put($groups, $key, map:get($groups, $key) + xs:integer($current-entry/Count))
else
map:put($groups, $key, xs:integer($current-entry/Count))"
/>
</xsl:next-iteration>
</xsl:iterate>
</xsl:template>
</xsl:stylesheet>

Extract node value by calculating absolute position

My source XML looks like this:
<test>
<text1>Test</text1>
<text2>Test</text2>
<text2>Test</text2>
<section>
<text1>Test<bold>content</bold></text1>
<text1>Test</text1>
<text2>Test</text2>
<text2>Test</text2>
</section>
</test>
I want to extract the value of the 6th node, based on the absolute number of the element (overall count). The absolute number of the element has been identified using <xsl:number level="any" from="/" count="*"/>.
The XPath expression /descendant::*[6] should give you the element you need.
<xsl:template match="/">
<xsl:copy-of select="/descendant::*[6]" />
</xsl:template>
outputs
<text1>Test<bold>content</bold></text1>
Note that this is an example of the difference between descendant:: and // - //*[6] would give you all elements that are the sixth child element of their respective parent, rather than simply the sixth element in the document in depth-first order.
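The difference can be checked directly against the source document from the question. The stylesheet below is only an illustrative sketch: it counts what //*[6] selects and names the element that the parenthesized form (//*)[6] selects, which picks the sixth element in document order just like /descendant::*[6].

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <!-- //*[6]: the predicate is applied per parent, so this counts the
         elements that are the sixth child element of their own parent -->
    <xsl:text>//*[6] selects: </xsl:text>
    <xsl:value-of select="count(//*[6])"/>
    <xsl:text>&#10;</xsl:text>
    <!-- (//*)[6]: the parentheses build the whole node-set first, then the
         predicate picks the sixth node in document order -->
    <xsl:text>(//*)[6] is: </xsl:text>
    <xsl:value-of select="name((//*)[6])"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

In the question's document no element has six child elements, so the first line should report 0, while the second should report text1 (the first child of section).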
This XSLT 2.0 stylesheet
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:fn="http://www.w3.org/2005/xpath-functions">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:variable name="allElements" select="//element()" />
<xsl:template match="/">
<output>
<xsl:value-of select="$allElements[6]" />
</output>
</xsl:template>
</xsl:stylesheet>
will result in
<?xml version="1.0" encoding="UTF-8"?>
<output>Testcontent</output>

Can I check condition in template match in XSLT?

I want to check a variable in a template match pattern. Is that possible?
For example:
<xsl:template match="*:Item and $MODE='PURCHASE'">
So the template should apply only when $MODE='PURCHASE' also holds.
Not in XSLT 1.0, where a variable reference in a match pattern is an error.
In XSLT 2.0, variable references are allowed in the predicates of a template match pattern.
For example:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:param name="MODE" select="'PURCHASE'"/>
<xsl:template match="*:Item[$MODE='PURCHASE']">
<xsl:value-of select="."/>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied on this XML document:
<t xmlns:x="some:x">
<x:Item>someText</x:Item>
</t>
the wanted, correct result is produced:
someText
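For XSLT 1.0, one common workaround is to match every Item and move the test into the template body. The following is a sketch of that idea rather than an exact equivalent: when the test fails the template still matches and simply produces nothing, whereas in the XSLT 2.0 version another template (or the built-in rule) would handle the element. Since *:Item is XPath 2.0 syntax, the sketch uses local-name() instead.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:strip-space elements="*"/>
  <xsl:param name="MODE" select="'PURCHASE'"/>
  <!-- XSLT 1.0 forbids variable references in match patterns, so the
       pattern matches any Item and the condition is tested in the body -->
  <xsl:template match="*[local-name()='Item']">
    <xsl:if test="$MODE='PURCHASE'">
      <xsl:value-of select="."/>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

Applied to the same XML document with the default parameter value, this should also output someText.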

XSL: How to concatenate nodes with conditions?

I have the following code (eg):
<response>
<parameter>
<cottage>
<cot>
<res>
<hab desc="Lakeside">
<reg cod="OB" prr="600.84>
<lwz>TR#2#AB#200.26#0#QB#OK#20120829#20120830#EU#3-0#</lwz>
<lwz>TR#2#AB#200.26#0#QB#OK#20120830#20120831#EU#3-0#</lwz>
<lwz>TR#2#AB#200.26#0#QB#OK#20120831#20120901#EU#3-0#</lwz>
I need to create a concatenated string that includes the whole of the first 'lwz' line and then the price (200.26, but it can be different in each line) for each corresponding line.
So the output, separating each line with | would be:
TR#2#AB#200.26#0#QB#OK#20120829#20120830#EU#3-0#|200.26|200.26
Thanks
I. XSLT 1.0 solution:
This transformation:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:strip-space elements="*"/>
<xsl:template match="lwz[1]">
<xsl:value-of select="."/>
</xsl:template>
<xsl:template match="lwz[position() >1]">
<xsl:value-of select=
"concat('&#xA;',
substring-before(substring-after(substring-after(substring-after(.,'#'),'#'),'#'),'#')
)
"/>
</xsl:template>
<xsl:template match="text()"/>
</xsl:stylesheet>
when applied on the provided text (converted to a well-formed XML document !!!):
<response>
<parameter>
<cottage>
<cot>
<res>
<hab desc="Lakeside">
<reg cod="OB" prr="600.84">
<lwz>TR#2#AB#200.26#0#QB#OK#20120829#20120830#EU#3-0#</lwz>
<lwz>TR#2#AB#200.26#0#QB#OK#20120830#20120831#EU#3-0#</lwz>
<lwz>TR#2#AB#200.26#0#QB#OK#20120831#20120901#EU#3-0#</lwz>
</reg>
</hab>
</res>
</cot>
</cottage>
</parameter>
</response>
produces the wanted, correct result:
TR#2#AB#200.26#0#QB#OK#20120829#20120830#EU#3-0#
200.26
200.26
II. XSLT 2.0 solution:
This transformation:
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:strip-space elements="*"/>
<xsl:template match="lwz[1]">
<xsl:value-of select="."/>
</xsl:template>
<xsl:template match="lwz[position() >1]">
<xsl:value-of select=
"concat('&#xA;', tokenize(.,'#')[4])"/>
</xsl:template>
<xsl:template match="text()"/>
</xsl:stylesheet>
when applied on the above XML document, again produces the wanted, correct result. Note the use of the standard XPath 2.0 function tokenize():
TR#2#AB#200.26#0#QB#OK#20120829#20120830#EU#3-0#
200.26
200.26
You can use the XPath substring functions to select substrings from your lwz node data. You don't give much detail about your problem; if you want a more detailed answer, please provide the full XML document and your best-guess XSLT.
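As a minimal sketch of that substring approach, producing the single '|'-separated line the question asks for: the stylesheet below copies the first lwz whole and, for each later one, takes the fourth '#'-separated field. It assumes the document contains a single group of lwz elements, since //lwz collects all of them.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:for-each select="//lwz">
      <xsl:choose>
        <!-- First line: copy it whole -->
        <xsl:when test="position() = 1">
          <xsl:value-of select="."/>
        </xsl:when>
        <!-- Later lines: emit '|' followed by the fourth '#'-separated field -->
        <xsl:otherwise>
          <xsl:value-of select="concat('|',
            substring-before(
              substring-after(substring-after(substring-after(., '#'), '#'), '#'),
              '#'))"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

On the well-formed version of the sample document this should yield TR#2#AB#200.26#0#QB#OK#20120829#20120830#EU#3-0#|200.26|200.26.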