Get position of specific word - xslt

I am new to XSLT. Is it possible to get the position of a specific word? For example, I have data like this:
<Data>The quick brown fox jumps over the lazy dog!</Data>
I want to get the positions of "brown", "over", "dog" and "!", and store each one in a differently named output element. For example, the position of brown is <foo>3</foo>, the position of over is <boo>6</boo>, dog is <hop>9</hop> and ! is <po_df>10</po_df>. Is it possible?

If you were only looking for words, you could use the XSLT 2.0 function tokenize(., '\s+|\p{P}'):
<xsl:template match="Data">
<xsl:copy>
<xsl:variable name="words" select="tokenize(., '\s+|\p{P}')"/>
<xsl:for-each select="'brown', 'over', 'dog'">
<matched item="{.}" at-pos="{index-of($words, .)}"/>
</xsl:for-each>
</xsl:copy>
</xsl:template>
which gives
<Data>
<matched item="brown" at-pos="3"/>
<matched item="over" at-pos="6"/>
<matched item="dog" at-pos="9"/>
</Data>
so it has the right positions. (I am not sure where the element names you posted, such as hop, are supposed to come from, so I have not tried to implement that.)
As you also want to identify a punctuation character, I am not sure tokenize suffices, and even with analyze-string it is not straightforward to match and collect the positions. Maybe someone else has a better idea.
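One way to cover the punctuation case, offered as a minimal sketch assuming XSLT 2.0: collect every run of word characters and every single punctuation character as a token with xsl:analyze-string, then look the items up with index-of as before. The variable name tokens is illustrative. Note that the regex attribute is an attribute value template, so the braces in \p{P} have to be doubled.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:template match="Data">
    <xsl:copy>
      <!-- Each word and each punctuation character becomes one token -->
      <xsl:variable name="tokens" as="xs:string*">
        <xsl:analyze-string select="." regex="\w+|\p{{P}}">
          <xsl:matching-substring>
            <xsl:sequence select="."/>
          </xsl:matching-substring>
        </xsl:analyze-string>
      </xsl:variable>
      <!-- index-of returns every position at which the item occurs -->
      <xsl:for-each select="'brown', 'over', 'dog', '!'">
        <matched item="{.}" at-pos="{index-of($tokens, .)}"/>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

For the sample sentence this should report 3, 6, 9 and 10. The comparison is case-sensitive, and a token that occurs more than once would yield several space-separated positions in at-pos.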

Related

Find dynamic length number from string using XSLT

I have input XML like below
<?xml version = "1.0" encoding = "UTF-8"?>
<root>
<param>:22:/ABC/GID:50749612002 BOOK USER REF: 12311111112222 XYZ: DEF BK ID:3333 3333 JKL:MNN - VZXVXHFHF DETA ABC:DEF ORDERID:989796123456789.GKLT C:0006789 FASDFSF.FYRTY 53546475</param>
</root>
Need help to extract ORDERID using XSLT.
Consider extracting the node value to a variable $param:
<xsl:variable name="param">
<xsl:value-of select="/root/param"/>
</xsl:variable>
Since XSLT 2.0 you can use the replace function to extract the number:
<xsl:value-of select="replace($param, '.*?ORDERID:(\d+)\.\w{4} .*', '$1')"/>
The regex .*?ORDERID:(\d+)\.\w{4} .* is demonstrated at Regex101.
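If only XSLT 1.0 is available, a rough alternative that avoids regular expressions is to chain substring-after and substring-before. This sketch assumes, as in the sample, that the ID is always introduced by ORDERID: and is followed by a dot; the variable name tail is illustrative.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Everything after the first 'ORDERID:' marker -->
    <xsl:variable name="tail" select="substring-after(/root/param, 'ORDERID:')"/>
    <!-- Keep only the part before the first dot that follows the number -->
    <xsl:value-of select="substring-before($tail, '.')"/>
  </xsl:template>

</xsl:stylesheet>
```

With the sample input this should yield 989796123456789. If the marker or the dot is missing the result is an empty string, whereas replace returns its input unchanged when the regex does not match.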

Copy an element and truncate it with xslt

I am trying to select all the table cells in a particular column and truncate them down to 10 characters.
I really can't fault this syntax.
<xsl:template match="table[contains(@class, 'listing')]//tr/td[contains(@class, 'listing-body-start')]">
<td>
<xsl:value-of select="substring(./text(), 1, 10)"/>
</td>
</xsl:template>
I can replace the contents of the cell with static text, or replicate the same text by using select=".", but as soon as I try this substring method, the cells disappear.
Try to avoid using text(). It's generally bad practice.
Use
<xsl:value-of select="substring(., 1, 10)"/>
This works on the string value of the td element, rather than on its child text nodes. In many content models it would be normal for a td element to contain other elements as children, e.g. a p or para element, and in that case looking for the child text nodes will fail.
This answer is a guess. If you don't tell us what's in your input, we have no choice but to guess.
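For context, here is a minimal sketch of how the suggested expression might sit in a complete stylesheet. The identity template is an assumption, since the original stylesheet was not posted. One possible cause of the vanishing text in XSLT 1.0 is that substring(./text(), 1, 10) converts only the first child text node to a string, and that node may be pure whitespace when the cell content is wrapped in another element.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template (assumed): copy everything not matched below -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Truncate the string value of the selected cells to 10 characters -->
  <xsl:template match="table[contains(@class, 'listing')]//tr/td[contains(@class, 'listing-body-start')]">
    <xsl:copy>
      <!-- Keep the cell's attributes, e.g. class -->
      <xsl:copy-of select="@*"/>
      <xsl:value-of select="substring(., 1, 10)"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

If the input is XHTML in the XHTML namespace, the unprefixed names table, tr and td would not match, and the match pattern would need a namespace prefix.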

xslt 1.0 substring-after to ignore case

I have 2 xml nodes like this, for example:
<Model>GRAND MODUS</Model>
<QualifiedDescription>2008 58 Reg Renault Grand Modus 1.2 TCE Dynamique 5drMetallic Flame Red</QualifiedDescription>
I'm trying to use substring-after to split the QualifiedDescription after the Grand Modus like this:
<xsl:variable name="something"><xsl:value-of select='substring-after(QualifiedDescription, Model)' /></xsl:variable>
But obviously it's not working because substring-after is case sensitive. Is it possible to get substring-after to work case-insensitively, but still return the output with its case preserved, e.g.
1.2 TCE Dynamique 5drMetallic Flame Red
Thanks.
You could convert the two strings to the same case using translate in order to work out the character offset of the first within the second, then take a substring of the original QualifiedDescription from that position.
<xsl:variable name="uc" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'" />
<xsl:variable name="lc" select="'abcdefghijklmnopqrstuvwxyz'" />
<xsl:variable name="substrStart" select="
string-length(substring-before(translate(QualifiedDescription, $uc, $lc),
translate(Model, $uc, $lc)))
+ string-length(Model)
+ 1" /><!-- +1 because string indexes in XPath are 1-based -->
<xsl:variable name="something"
select="substring(QualifiedDescription, $substrStart)" />
You'd need slightly more complex logic to take account of cases where the QualifiedDescription does not include the Model (since in this case both substring-before and substring-after return the empty string) but you get the idea.
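The extra logic for the missing-model case might look like the following XSLT 1.0 sketch. The match pattern and the remainder output element are hypothetical, since the surrounding document structure was not given; the guard simply tests contains() on the lower-cased strings before computing the offset.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:variable name="uc" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>
  <xsl:variable name="lc" select="'abcdefghijklmnopqrstuvwxyz'"/>

  <!-- Hypothetical match: any element that has both child elements -->
  <xsl:template match="*[Model and QualifiedDescription]">
    <xsl:variable name="lcDesc" select="translate(QualifiedDescription, $uc, $lc)"/>
    <xsl:variable name="lcModel" select="translate(Model, $uc, $lc)"/>
    <xsl:variable name="something">
      <!-- Only compute the offset when the model occurs in the description -->
      <xsl:if test="contains($lcDesc, $lcModel)">
        <xsl:value-of select="substring(QualifiedDescription,
            string-length(substring-before($lcDesc, $lcModel))
            + string-length(Model) + 1)"/>
      </xsl:if>
    </xsl:variable>
    <!-- Empty when the model is not found -->
    <remainder><xsl:value-of select="$something"/></remainder>
  </xsl:template>

</xsl:stylesheet>
```

Because translate maps each letter to exactly one letter, the offsets found in the lower-cased copy line up with the original string.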
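You can do a case-insensitive match if you upper-case both strings first and take the substring of the upper-cased text:
substring-after(upper-case(QualifiedDescription), upper-case(Model))
Note that upper-case() requires XPath 2.0, and the result comes back in upper case, so the original casing is not preserved.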

how to parse the value from xml through xsl

<block4>
<tag>
<name>50K</name>
<value>/001/002/300060000120135670
CREDIT AGRICOLE ASSET MANAGEMENT</value>
</tag>
</block4>
I need to get output that looks like:
/001/002,/300060000120135670,CREDIT AGRICOLE ASSET MANAGEMENT
I have done like this in XSL, but I didn't get the output I wanted. Can anyone please give me some idea how I could get that output?
<xsl:for-each select ="block4/tag[name = '50K']">
<xsl:value-of select="
concat(
substring(value,1,8),
(concat(substring(value,9,'
'),',')),
substring-after(value,'
')
)
" />,<xsl:text/>
</xsl:for-each>
concat takes any number of arguments, no need to nest those calls. Besides, substring takes a beginning and an optional length, not a terminating character. Try something like this instead:
<xsl:for-each select ="block4/tag[name = '50K']">
<xsl:value-of select="
concat(
substring(value, 1, 8), ',',
substring(substring-before(value,'
'),9), ',',
substring-after(value,'
')
)
" />,<xsl:text/>
</xsl:for-each>
I've kept the final comma in, which is one of the many things you did not really specify.
Why not use the XSLT 2.0 tokenize() function?
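A minimal sketch of the tokenize() idea, assuming XSLT 2.0 and assuming, as in the answer above, that the first part is always the first 8 characters of the first line. The variable name lines is illustrative.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:for-each select="block4/tag[name = '50K']">
      <!-- Split the value into its lines -->
      <xsl:variable name="lines" select="tokenize(value, '\n')"/>
      <!-- Join the three parts with commas -->
      <xsl:value-of separator=","
          select="substring($lines[1], 1, 8),
                  substring($lines[1], 9),
                  normalize-space($lines[2])"/>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

For the sample input this should produce /001/002,/300060000120135670,CREDIT AGRICOLE ASSET MANAGEMENT. The normalize-space call is there in case the second line is indented in the real data.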

Find value in sequence using XSL

I want to check if a value exists in a sequence defined as
<xsl:variable name="some_seq" select="/root/word[@optional='no']/text()"/>
In the past, I've had success with Priscilla Walmsley's function. For clarity, I reproduce it here as follows:
<xsl:function name="functx:is-value-in-sequence" as="xs:boolean">
<xsl:param name="value" as="xs:anyAtomicType?"/>
<xsl:param name="seq" as="xs:anyAtomicType*"/>
<xsl:sequence select="$value=$seq"/>
</xsl:function>
However, this time I need to make a case-insensitive comparison, and so I tried to wrap both $value and $seq with a lower-case(). Obviously, that didn't help much, as $seq is a sequence and lower-case() takes only strings.
Question: what is the best way to either 1) construct a sequence of lower-case strings, or 2) make a case-insensitive comparison analogous to $value=$seq above? TIA!
"Question: what is the best way to either 1) construct a sequence of lower-case strings"
Not many people realize that you can use a function as the last location step in an XPath 2.0 expression.
You can create a sequence of lower-case() string values with this expression:
/root/word[@optional='no']/text()/lower-case(.)
"or 2) make a case-insensitive comparison analogous to $value=$seq above?"
Using that strategy, you can define a custom function that compares the lower-case() value of the $value and each string value in the $seq:
<xsl:function name="functx:is-value-in-sequence" as="xs:boolean">
<xsl:param name="value" as="xs:anyAtomicType?"/>
<xsl:param name="seq" as="xs:anyAtomicType*"/>
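<!-- The parameters hold atomic values, so lower-case() is called on each item directly; the / operator only applies to nodes -->
<xsl:sequence select="some $word in $seq
satisfies lower-case(string($word)) = lower-case(string($value))"/>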
</xsl:function>
Use a "for-expression" inside the function to prepare a lower-case version of the sequence
<xsl:variable name="lcseq" select="for $i in $seq return lower-case($i)"/>
See Michael Kay's "XSLT 2.0 and XPath 2.0, 4th ed", p. 640
(I haven't tested this)
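Put together, the for-expression idea could look like the sketch below. The my prefix, its namespace URI, the function name and the lookup value 'FOX' are all hypothetical; the parameters are declared as xs:string so that lower-case() can be applied directly.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.com/my-functions"
    exclude-result-prefixes="xs my">

  <!-- Case-insensitive variant of is-value-in-sequence (hypothetical name) -->
  <xsl:function name="my:is-value-in-sequence-ci" as="xs:boolean">
    <xsl:param name="value" as="xs:string?"/>
    <xsl:param name="seq" as="xs:string*"/>
    <!-- Lower-cased copy of the sequence -->
    <xsl:variable name="lcseq" select="for $i in $seq return lower-case($i)"/>
    <!-- General comparison: true if any item of $lcseq equals the value -->
    <xsl:sequence select="lower-case($value) = $lcseq"/>
  </xsl:function>

  <xsl:template match="/">
    <xsl:variable name="some_seq" select="/root/word[@optional='no']/text()"/>
    <result>
      <xsl:value-of select="my:is-value-in-sequence-ci('FOX', $some_seq)"/>
    </result>
  </xsl:template>

</xsl:stylesheet>
```

The text nodes in $some_seq are atomized and converted to strings when they are passed to the function, so no explicit conversion is needed at the call site.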