Get file name where the content is present in - xslt

I have the XML below.
<?xml version="1.0" encoding="UTF-8"?>
<entry>
<file name="FILE_ORD_01.xml"/>
<file name="FILE_ORD_02.xml"/>
<file name="FILE_ORD_03.xml"/>
<file name="FILE_ORD_04.xml"/>
<file name="FILE_ORD_05.xml"/>
</entry>
Basically, this is a list of the files in my folder.
Every XML file in the list contains a phrase.
I also have another XML file from which I need to get the phrase values, compare them with the phrase values in the files of this list, and report which file each phrase is present in.
<xsl:variable name="prent">
<xsl:for-each select="document('C:\Users\u0138039\Desktop\Proview\MY\2015\title.xml')/entry/file">
<xsl:value-of select="normalize-space(document(concat('C:\Users\u0138039\Desktop\Proview\MY\2015\',./@name))/chapter[//page/@num=regex-group(1)])"/>
</xsl:for-each>
</xsl:variable>
Using this code, I am able to find where the phrase matches. For example, if the match is found in FILE_ORD_03.xml, I want to print FILE_ORD_03.
Basically, I want to get the name of the file, from the list above, in which the phrase is present and print it, either by using base-uri() or by reading the attribute value directly.
Thanks

I don't think you can extract the value from your variable, as it contains only text nodes with the normalized string values. I think you want something along the lines of
<xsl:variable name="file-names" as="xs:string*"
select="document(document('file:///C:/Users/u0138039/Desktop/Proview/MY/2015/title.xml')/entry/file/@name)/chapter[//page/@num = regex-group(1)]/substring-before(tokenize(document-uri(/), '/')[last()], '.')"/>
to extract the file name(s) of matching file(s) as a sequence of strings into a second variable.

Related

xsltproc add text before and after multiple files

I'm using the xsltproc utility to transform multiple xml test results into pretty printed console output using a command like the following.
xsltproc stylesheet.xslt testresults/*
Where stylesheet.xslt looks something like this:
<!-- One testsuite per xml test report file -->
<xsl:template match="/testsuite">
<xsl:text>begin</xsl:text>
...
<xsl:text>end</xsl:text>
</xsl:template>
This gives me an output similar to this:
begin
TestSuite: 1
end
begin
TestSuite: 2
end
begin
TestSuite: 3
end
What I want is the following:
begin
TestSuite: 1
TestSuite: 2
TestSuite: 3
end
Googling is turning up empty. I suspect I might be able to merge the xml files somehow before I give them to xsltproc, but I was hoping for a simpler solution.
xsltproc transforms each specified XML document separately, as indeed is the only sensible thing for it to do because XSLT operates on a single source tree, and xsltproc doesn't have enough information to compose multiple documents into a single tree. Since your template emits text nodes with the "begin" and "end" text, those nodes are emitted for each input document.
There are several ways you could arrange to have just one "begin" and one "end". All of the reasonable ones start with lifting the text nodes out of your template for <testsuite> elements. If each "TestSuite:" line in the output should correspond to one <testsuite> element, then you'll need to do that even if you physically merge the input documents.
One solution would be to remove the responsibility for the "begin" and "end" lines from XSLT altogether. For example, remove the xsl:text elements from the stylesheet and write a simple script such as this:
echo begin
xsltproc stylesheet.xslt testresults/*
echo end
Alternatively, if the individual XML files do not start with XML declarations, then you might merge them dynamically, by running xsltproc with a command such as this:
{ echo "<suites>"; cat testresults/*; echo "</suites>"; } \
| xsltproc stylesheet.xslt -
The corresponding stylesheet might then take a form along these lines:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="/suites">
<!-- the transform of the root element produces the "begin" and "end" -->
<xsl:text>begin
</xsl:text>
<xsl:apply-templates select="testsuite"/>
<xsl:text>
end</xsl:text>
</xsl:template>
<xsl:template match="testsuite">
...
</xsl:template>
</xsl:stylesheet>
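The shell merge above breaks as soon as a report starts with an XML declaration. As a minimal sketch of one way around that (in Python rather than shell, and assuming each report is small enough to hold in memory), the documents can be parsed and re-serialized under a single wrapper element. The function name merge_reports and the default wrapper name are hypothetical, chosen only to match the /suites template above.

```python
import xml.etree.ElementTree as ET


def merge_reports(documents, wrapper="suites"):
    """Return one XML string whose root <wrapper> holds each document's root element.

    `documents` is an iterable of XML strings, with or without a leading
    XML declaration; the declaration is dropped when the text is parsed.
    """
    root = ET.Element(wrapper)
    for text in documents:
        # Parsing yields the document's root element (e.g. <testsuite>).
        root.append(ET.fromstring(text))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    reports = [
        '<?xml version="1.0"?><testsuite name="1"/>',
        '<testsuite name="2"/>',
    ]
    print(merge_reports(reports))
```

The printed document could then be piped to xsltproc stylesheet.xslt - in the same way as the output of the cat pipeline.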

XSLT REGEX pattern match

Using Saxon 9.7, XSLT 3.0, I'm trying to select square bracketed terms from a string of text and then remove duplicate values of the terms.
So far I have found a template which selects the substrings I want and a function that tokenizes the string and then removes duplicate values.
However, I haven't been able to get the correct regex for the tokenizing of the string.
Here is my XML of the full text
<column>
<columnDerivationPrompt>Option 1: (No visit windowing)</columnDerivationPrompt>
<columnDerivationDescription>Set to collected visit name [EG.VISIT] Set to 'POST-BASELINE MINIMUM' for the new observation generated for derviation type minimum [ADEG.DTYPE] = 'MINIMUM'
Set to 'POST-BASELINE MAXIMUM' for the new observation generated for derviation type maximum [ADEG.DTYPE]= 'MAXIMUM'
</columnDerivationDescription>
<columnDerivationPrompt>Option 2: (User defined visit windows)</columnDerivationPrompt>
<columnDerivationDescription>Set to a re-defined visit range based on user-defined input, using formatting of Analysis Relative Day [ADEG.ADY] range in conjunction with Analysis Window Target [ADEG.AWTARGET] and Analysis Window Diff from Target [ADEG.AWTDIFF] to determine analysis visit.
Set to 'POST-BASELINE MINIMUM' for the new observation generated for derviation type minimum [ADEG.DTYPE] = 'MINIMUM'
Set to 'POST-BASELINE MAXIMUM' for the new observation generated for derviation type maximum [ADEG.DTYPE]= 'MAXIMUM'
</columnDerivationDescription>
</column>
The string of terms taken from the text that I need to remove duplicates from
EG.VISIT ADEG.DTYPE ADEG.DTYPE ADEG.ADY ADEG.AWTARGET ADEG.AWTDIFF ADEG.DTYPE ADEG.DTYPE
What I would like to see
EG.VISIT ADEG.DTYPE ADEG.ADY ADEG.AWTARGET ADEG.AWTDIFF
my XSLT template and function
<xsl:template name="find-relevant-text">
<xsl:param name="string"/>
<xsl:variable name="test">
<xsl:if test="contains($string,'[')">
<xsl:variable name="relevant-part" select="substring-before(substring-after($string,'['),']')"/>
<xsl:variable name="remainder" select="substring-after($string,']')"/>
<xsl:value-of select="$relevant-part"/>
<xsl:if test="contains($remainder,'[')">
<xsl:text disable-output-escaping="yes"> </xsl:text>
</xsl:if>
<xsl:call-template name="find-relevant-text">
<xsl:with-param name="string" select="$remainder"/>
</xsl:call-template>
</xsl:if>
</xsl:variable>
<xsl:value-of select="myfn:sortCSV($test)"/>
</xsl:template>
<xsl:function name="myfn:sortCSV" as="xs:string*">
<xsl:param name="csvString" as="xs:string"/>
<!-- Split up string and remove duplicates -->
<xsl:variable name="values" select="distinct-values(tokenize($csvString,'\W+\.\W+'))" as="xs:string*"/>
<!-- Return all elements, sorted -->
<xsl:for-each select="$values">
<xsl:sort/>
<!-- We don't return empty strings -->
<xsl:sequence select=".[.!='']"/>
</xsl:for-each>
</xsl:function>
\W+\.\W+ is the regex I have been using to identify e.g. EG.VISIT or ADEG.DTYPE. So any pattern including CC.CCCC to CCCC.CCCCCCCC (where C is a char [A-Z]).
The output I am getting is
EG.VISIT ADEG.DTYPE ADEG.DTYPE ADEG.ADY ADEG.AWTARGET ADEG.AWTDIFF ADEG.DTYPE ADEG.DTYPE
So no duplicates have been removed.
QUESTION:
Can anyone see where I am going wrong with my expression or code?
As for your regular expression, note that a \W matches a non-word char and cannot match uppercase (nor lowercase) letters. \w matches a word char.
However, best is to restrict it to [A-Z]+\.[A-Z]+ since you say the items you want to match follow the uppercase+.+uppercase pattern.
See the regex demo
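As a quick check outside XSLT, the difference between the two patterns can be reproduced with Python's re module. This is only a sketch: it assumes that \W and [A-Z] behave the same way on this ASCII input in Python as in the XPath regex flavour, and the variable names are illustrative.

```python
import re

terms = ("EG.VISIT ADEG.DTYPE ADEG.DTYPE ADEG.ADY "
         "ADEG.AWTARGET ADEG.AWTDIFF ADEG.DTYPE ADEG.DTYPE")

# \W+\.\W+ needs non-word characters on both sides of the dot. Every dot here
# sits between letters, so the separator never matches and nothing is split.
unsplit = re.split(r"\W+\.\W+", terms)

# Matching the terms themselves works: uppercase letters, a dot, uppercase letters.
found = re.findall(r"[A-Z]+\.[A-Z]+", terms)

# dict.fromkeys drops duplicates while keeping first-seen order.
unique = list(dict.fromkeys(found))

print(len(unsplit))      # 1: the whole string came back as a single token
print(" ".join(unique))
```

Because the tokenizer returned the whole string as one token, distinct-values had only one value to compare, which is why no duplicates were removed.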
I would use analyze-string, either the xsl:analyze-string instruction in XSLT 2.0 or the function of the same name in XSLT 3.0. With the function, it is a one-liner:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:fn="http://www.w3.org/2005/xpath-functions"
xmlns:math="http://www.w3.org/2005/xpath-functions/math"
exclude-result-prefixes="xs math fn"
version="3.0">
<xsl:template match="column">
<xsl:value-of select="distinct-values(analyze-string(., '\[([A-Z]+\.[A-Z]+)\]')//fn:match/fn:group[@nr = 1])"/>
</xsl:template>
</xsl:stylesheet>
Output is EG.VISIT ADEG.DTYPE ADEG.ADY ADEG.AWTARGET ADEG.AWTDIFF.
If you want to sort the extracted strings then use <xsl:value-of select="sort(distinct-values(analyze-string(., '\[([A-Z]+\.[A-Z]+)\]')//fn:match/fn:group[@nr = 1]))"/>.

Apply-templates within analyze-string

I have the XML below.
<?xml version="1.0" encoding="UTF-8"?>
<para align="center">
<content-style font-style="bold">A.1 This is the first text</content-style> (This is second text)
</para>
Below are my two questions.
Here I've declared a regex to match the content-style. But when I run this, the second template is the one that gets applied: the output should be <div class="para">, but I get <div class="para align-center">. Please let me know where I am going wrong.
Is there a way I can use apply-templates within the match? When I tried, it threw an error. I want something like the below.
if (para)
xsl:apply-templates select child::node()[not(self::text)]
else
xsl:apply-templates
Working Example
Thanks
If you want to use apply-templates inside the analyze-string then you need to store the context node outside of analyze-string in a variable <xsl:variable name="context-node" select="."/>, then you can use <xsl:apply-templates select="$context-node/node()"/> for instance to process the child nodes.
Whether you need that approach I am not sure; I wonder whether you could simply use the matches function in a pattern, e.g. <xsl:template match="para[content-style[matches(., '(\w+)\.(\w+)')]]">...</xsl:template>.

XSLT- Get substring and pass it as parameters to java function

In the code snippet below, I am trying to get a substring of my @imageMeta node, append some more path information, and pass it as a parameter to my Java method through XSLT.
<xsl:variable name="imagePathFrom" select="/config/assets/images/{substring-after(@imageMeta,'/')}" />
<xsl:variable name="imagePathTo" select="'/dev/svn_root/platform/system'" />
<xsl:value-of select="filecopy:copyFile($imagePathFrom, $imagePathTo)"/>
My @imageMeta node data looks like Images/common/dialog/dialogue_black.png.
I have to convert the above path to images/common/dialog/dialogue_black.png (note the change of capital 'I' to small 'i') and append some more path data.
So my final path entry should look like "/config/assets/images/common/dialog/dialogue_black.png". When I run my code snippet, I get an error stating:
line 51: Error parsing XPath expression '/config/assets/images/{substring-after(@imageMeta,'/')}'.'
Please help.
<xsl:variable name="imagePathFrom" select="/config/assets/images/{substring-after(@imageMeta,'/')}" />
There are two problems here:
A syntax error: the select attribute holds an XPath expression, not an attribute value template (AVT), so the {...} syntax is not allowed in it.
Even without the AVT, this would attempt to select all /config/assets/images nodes, whereas the intent is for the variable to contain a string beginning with "/config/assets/images".
Solution to both problems:
<xsl:variable name="imagePathFrom" select=
"concat('/config/assets/images/', substring-after(@imageMeta,'/'))" />
Alternative solution:
<xsl:variable name="imagePathFrom" select=
"concat('/config/assets/',
translate(substring(@imageMeta, 1, 1),
$vUpper,
$vLower
),
substring(@imageMeta, 2)
)" />
where $vLower and $vUpper are defined, respectively, as:
'abcdefghijklmnopqrstuvwxyz'
and
'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
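To check what the alternative expression produces for the sample value from the question, here is a minimal Python sketch that mirrors it step by step. It is not XSLT, and the names image_path_from, UPPER and LOWER are illustrative stand-ins for the variable and the $vUpper and $vLower definitions above.

```python
UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
LOWER = "abcdefghijklmnopqrstuvwxyz"

# Character-by-character mapping, like translate(..., $vUpper, $vLower).
TO_LOWER = str.maketrans(UPPER, LOWER)


def image_path_from(image_meta):
    """Lower-case only the first character, then prefix '/config/assets/'."""
    first = image_meta[:1].translate(TO_LOWER)   # substring(@imageMeta, 1, 1)
    rest = image_meta[1:]                        # substring(@imageMeta, 2)
    return "/config/assets/" + first + rest      # concat(...)


if __name__ == "__main__":
    print(image_path_from("Images/common/dialog/dialogue_black.png"))
```

Only the first character is translated, so any capital letters later in the path are left unchanged.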
There is one problem in your code:
<xsl:variable name="imagePathFrom" select="/config/assets/images/{substring-after(@imageMeta,'/')}" />
It is supposed to be:
<xsl:variable name="imagePathFrom" select="substring-after(/config/assets/images/@imageMeta,'/')" />
The suggestion from infant programmer 'Aravind' will solve your parse error.
You also mentioned you wanted to lower-case the capital i. Two options here:
Using XSLT 1.0, this StackOverflow answer explains how to lower-case the first character of a string. It won't work for Unicode characters such as 'Í' but you probably don't need it.
XSLT 2.0 has a lower-case function, which will lower-case your entire string, and may not be what you're looking for.

Concatenate with escaped chars in xslt?

I'm writing xslt code which concatenates some string:
<xsl:attribute name='src'>
<xsl:value-of select="concat('url(&apos;', $imgSrc, '&apos;)')" />
</xsl:attribute>
For some reason I can't use it, I keep getting this error:
Unknown function - Name and number of arguments do not match any function signature in the static context - 'http://www.w3.org/2005/xpath-functions:concat'
while evaluating the expression:
select="concat('url(&apos;', $imgSrc, '&apos;)')"
Any idea?
thx
====================
EDIT
I'm trying to get:
url('some_path')
Was having trouble with the apostrophes, but now it just doesn't work.
The &apos; references are resolved by the XML parser that parses your XSLT. Your XSLT processor never sees them. What your XSLT processor sees is:
concat('url('', $imgSrc, '')')
Which is not valid because the commas don't end up in the right place to separate the arguments. However, this might work for you, depending on the serializer your XSLT processor uses:
concat(&quot;url('&quot;, $imgSrc, &quot;')&quot;)
This surrounds the arguments in double-quotes, so that your single-quotes do not conflict. The XSLT processor should see this:
concat("url('", $imgSrc, "')")
Another option is to define a variable:
<xsl:variable name="apos" select='"&apos;"'/>
Which can be used like this:
concat('url(', $apos, $imgSrc, $apos, ')')
More here:
When you apply an XSLT stylesheet to a
document, if entities are declared and
referenced in that document, your XSLT
processor won't even know about them.
An XSLT processor leaves the job of
parsing the input document (reading it
and figuring out what's what) to an
XML parser; that's why the
installation of some XSLT processors
requires you to identify the XML
parser you want them to use. (Others
include an XML parser as part of their
installation.) An important part of an
XML parser's job is to resolve all
entity references, so that if the
input document's DTD declares a cpdate
entity as having the value "2001" and
the document has the line "copyright
&cpdate; all rights reserved", the XML
parser will pass along the text node
"copyright 2001 all rights reserved"
to put on the XSLT source tree.
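The parser behaviour described in the quoted passage can be observed directly. The sketch below uses Python's standard XML parser as a stand-in for whichever parser an XSLT processor delegates to (an assumption made for illustration). It parses the xsl:value-of element from the question and prints the select value as the parser hands it over.

```python
import xml.etree.ElementTree as ET

# The element from the question, with the xsl namespace declared so that the
# fragment is a well-formed document on its own.
XSLT_FRAGMENT = """<xsl:value-of
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    select="concat('url(&apos;', $imgSrc, '&apos;)')"/>"""

element = ET.fromstring(XSLT_FRAGMENT)

# The attribute value after parsing: each &apos; has already become a plain '.
seen_by_processor = element.get("select")
print(seen_by_processor)
```

The printed expression contains adjacent single quotes where the entity references used to be, which is the string the XPath parser then has to make sense of.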
From http://www.w3.org/TR/xpath/#NT-Literal
[29] Literal ::= '"' [^"]* '"' | "'" [^']* "'"
Meaning that an XPath literal string value can't have the delimiter as also part of the content.
For this you should use the host language. In XSLT:
<xsl:variable name="vPrefix">url('</xsl:variable>
<xsl:variable name="vSufix">')</xsl:variable>
<xsl:attribute name="src">
<xsl:value-of select="concat($vPrefix, $imgSrc, $vSufix)" />
</xsl:attribute>
Or, more properly:
<xsl:attribute name="src">
<xsl:text>url('</xsl:text>
<xsl:value-of select="$imgSrc"/>
<xsl:text>')</xsl:text>
</xsl:attribute>