xsltproc add text before and after multiple files - xslt

I'm using the xsltproc utility to transform multiple xml test results into pretty printed console output using a command like the following.
xsltproc stylesheet.xslt testresults/*
Where stylesheet.xslt looks something like this:
<!-- One testsuite per xml test report file -->
<xsl:template match="/testsuite">
<xsl:text>begin</xsl:text>
...
<xsl:text>end</xsl:text>
</xsl:template>
This gives me an output similar to this:
begin
TestSuite: 1
end
begin
TestSuite: 2
end
begin
TestSuite: 3
end
What I want is the following:
begin
TestSuite: 1
TestSuite: 2
TestSuite: 3
end
Googling is turning up empty. I suspect I might be able to merge the xml files somehow before I give them to xsltproc, but I was hoping for a simpler solution.

xsltproc transforms each specified XML document separately, as indeed is the only sensible thing for it to do because XSLT operates on a single source tree, and xsltproc doesn't have enough information to compose multiple documents into a single tree. Since your template emits text nodes with the "begin" and "end" text, those nodes are emitted for each input document.
There are several ways you could arrange to have just one "begin" and one "end". All of the reasonable ones start with lifting the text nodes out of your template for <testsuite> elements. If each "TestSuite:" line in the output should correspond to one <testsuite> element then you'll need to do that even if you physically merge the input documents.
One solution would be to remove the responsibility for the "begin" and "end" lines from XSLT altogether. For example, remove the xsl:text elements from the stylesheet and write a simple script such as this:
echo begin
xsltproc stylesheet.xslt testresults/*
echo end
Alternatively, if the individual XML files do not start with XML declarations, then you might merge them dynamically, by running xsltproc with a command such as this:
{ echo "<suites>"; cat testresults/*; echo "</suites>"; } \
| xsltproc stylesheet.xslt -
The corresponding stylesheet might then take a form along these lines:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="/suites">
<!-- the transform of the root element produces the "begin" and "end" -->
<xsl:text>begin
</xsl:text>
<xsl:apply-templates select="testsuite"/>
<xsl:text>
end</xsl:text>
</xsl:template>
<xsl:template match="testsuite">
...
</xsl:template>
</xsl:stylesheet>
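If the report files do start with XML declarations, concatenating them as-is would leave declarations in the middle of the merged document, which is not well-formed. A minimal shell sketch that strips them first, assuming each declaration sits at the very start of the file's first line. merge_suites is a hypothetical helper name, not something xsltproc provides:

```shell
# merge_suites DIR: hypothetical helper. Wraps every file in DIR in a single
# <suites> root element, dropping a leading XML declaration from each file.
merge_suites() {
    echo "<suites>"
    for f in "$1"/*; do
        # Assumption: the declaration, if present, starts line 1 of the file.
        sed '1s/^<?xml[^>]*?>//' "$f"
    done
    echo "</suites>"
}
```

It could then be used as merge_suites testresults | xsltproc stylesheet.xslt - with the /suites stylesheet above.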

Related

Get file name where the content is present in

I've the below XML.
<?xml version="1.0" encoding="UTF-8"?>
<entry>
<file name="FILE_ORD_01.xml"/>
<file name="FILE_ORD_02.xml"/>
<file name="FILE_ORD_03.xml"/>
<file name="FILE_ORD_04.xml"/>
<file name="FILE_ORD_05.xml"/>
</entry>
Basically, this is a list of the files in my folder.
Every XML file contains a phrase.
I have another XML file from which I need to take the phrase values, compare them with the phrase values in the files in this list, and report which file each phrase is present in.
<xsl:variable name="prent">
<xsl:for-each select="document('C:\Users\u0138039\Desktop\Proview\MY\2015\title.xml')/entry/file">
<xsl:value-of select="normalize-space(document(concat('C:\Users\u0138039\Desktop\Proview\MY\2015\',./@name))/chapter[//page/@num=regex-group(1)])"/>
</xsl:for-each>
</xsl:variable>
Using this code I can tell which file the phrase match is found in. For example, if the match is found in FILE_ORD_03.xml, I want to print FILE_ORD_03.
Basically, I want to get the name of the file from the above list in which the phrase is present and print it, either by using base-uri() or by reading the attribute value directly.
Thanks
I don't think you can extract the value from your variable as there you have simply text nodes with the normalized string value, but I think you want something along the lines of
<xsl:variable name="file-names" as="xs:string*"
select="document(document('file:///C:/Users/u0138039/Desktop/Proview/MY/2015/title.xml')/entry/file/@name)/chapter[//page/@name = regex-group(1)]/substring-before(tokenize(document-uri(/), '/')[last()], '.')"/>
to extract the file name(s) of matching file(s) as a sequence of strings into a second variable.

Apply-templates with in Analyze string

I've the below XML.
<?xml version="1.0" encoding="UTF-8"?>
<para align="center">
<content-style font-style="bold">A.1 This is the first text</content-style> (This is second text)
</para>
Below are my 2 Questions.
Here I've declared a regex to match the content-style, but when I run it the second case is the one that gets caught. The output should be <div class="para">, but I get <div class="para align-center"> instead. Please let me know where I am going wrong.
Is there a way I can use apply-templates within the match? When I tried, it threw an error. I want something like the below.
if (para)
xsl:apply-templates select child::node()[not(self::text)]
else
xsl:apply-templates
Working Example
Thanks
If you want to use apply-templates inside the analyze-string then you need to store the context node outside of analyze-string in a variable <xsl:variable name="context-node" select="."/>, then you can use <xsl:apply-templates select="$context-node/node()"/> for instance to process the child nodes.
Whether you need that approach I am not sure. I wonder whether you could simply use the matches function in a pattern, e.g. <xsl:template match="para[content-style[matches(., '(\w+)\.(\w+)')]]">...</xsl:template>.

How to avoid Open xml tag on its own line

I have an XML structure as below:
<?xml version="1.0" encoding="utf-8"?>
<cl:doc identifier="ISBN" xsi:schemaLocation="http://xml.cengage-learning.com/cendoc-core cendoc.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:cl="http://xml.cengage-learning.com/cendoc-core" xmlns:m="http://www.w3.org/1998/Math/MathML">
<cl:chapter identifier="ch01">
<cl:opener identifier="ch06_opn">
<cl:introduction identifier="ch06_int">
<cl:list identifier="tu_1" list-style="Unformatted" item-length="long">
<cl:item identifier="tu_2"><cl:para identifier="ch01_dum_2">Solubility</cl:para></cl:item>
<cl:item identifier="tu_3"><cl:para identifier="ch01_dum_3">Polarity</cl:para></cl:item>
</cl:list></cl:introduction></cl:opener></cl:chapter></cl:doc>
When I transform this above xml using XSLT, I got the below output:
<?xml version="1.0" encoding="utf-8"?>
<cl:doc identifier="ISBN" xsi:schemaLocation="http://xml.cengage-learning.com/cendoc-core cendoc.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:cl="http://xml.cengage-learning.com/cendoc-core" xmlns:m="http://www.w3.org/1998/Math/MathML"><cl:chapter identifier="ch01">
<cl:opener identifier="ch06_opn">
<cl:introduction identifier="ch06_int"><cl:list identifier="tu_1" list-style="Unformatted" item-length="long">
<cl:item identifier="tu_2"><cl:para identifier="ch01_dum_2">Solubility</cl:para></cl:item>
<cl:item identifier="tu_3"><cl:para identifier="ch01_dum_3">Polarity</cl:para></cl:item></cl:list></cl:introduction></cl:opener></cl:chapter></cl:doc>
Here, only the opening tag <cl:opener identifier="ch06_opn"> ends up on a separate line. This leaves me with a blank line after the conversion.
I need the <cl:opener identifier="ch06_opn"> tag to run on with either the previous line or the next line.
Can anybody tell me how this can be achieved through XSLT?
Thanks,
Gopal
Without seeing your XSLT it's difficult to be certain, but it sounds like your XSLT is copying over the whitespace in the source into the output.
The quickest way to prevent that is to put the following at the top level of your stylesheet:
<xsl:strip-space elements="*"/>
or alternatively
<xsl:template match="text()[not(normalize-space())]"/>
This removes all whitespace-only text nodes, but you can of course be more specific about the whitespace you're removing, such as
<xsl:template match="cl:opener/text()[1][not(normalize-space())]"/>
to remove just the whitespace after that opening element tag. This matches the first text node within cl:opener if it is whitespace-only, and outputs nothing in its place.
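To see the first option in context, here is a minimal shell sketch that runs xsltproc with an identity transform plus xsl:strip-space. The helper name strip_ws and the temporary stylesheet are assumptions for illustration only; a real stylesheet would keep its own templates and just gain the xsl:strip-space line.

```shell
# strip_ws FILE: hypothetical helper. Copies FILE through an identity
# transform that drops whitespace-only text nodes from the input.
strip_ws() {
    sheet=$(mktemp) || return 1
    cat > "$sheet" <<'EOF'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- discard whitespace-only text nodes when the source tree is built -->
  <xsl:strip-space elements="*"/>
  <!-- identity template: copy everything else unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
EOF
    xsltproc "$sheet" "$1"
    status=$?
    rm -f "$sheet"
    return "$status"
}
```

With no indent requested on xsl:output, the copied elements should come out run together rather than on separate lines.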

how to get text with xpath from bad xml?

I have a "bad xml structure" file:
<cars>
<car>Toyota
<country>Japan</country>
....
</car>
</cars>
How do I correctly get the right word (Toyota) using XPath?
I tried:
<xsl:value-of select = "cars/car/text()"/>.
It works, but I think there are more appropriate methods.
Thanks.
Use:
/cars/car/text()[1]
or if you want to discard most of the white space in the text node selected above, use:
normalize-space(/cars/car/text()[1])
Do note that while in XSLT 1.0 <xsl:value-of> outputs the string value only of the first node of the node-set selected by the expression in the select attribute, <xsl:copy-of> will output all the nodes in the node-set. In XSLT 2.0 even <xsl:value-of> outputs all the nodes in the node-set.
Therefore, for purposes of portability, upgradability and simply for avoiding errors, it is better to specify exactly which node from the node-set is to be output, even when using <xsl:value-of>.
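As a quick way to try the second expression, here is a minimal shell sketch around xsltproc. The helper name first_text and the throwaway stylesheet are assumptions for illustration; only the XPath expression comes from the answer above.

```shell
# first_text FILE: hypothetical helper. Prints the whitespace-normalized
# first text node of /cars/car in FILE.
first_text() {
    sheet=$(mktemp) || return 1
    cat > "$sheet" <<'EOF'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <!-- text()[1] pins the selection to the first text child of car -->
    <xsl:value-of select="normalize-space(/cars/car/text()[1])"/>
  </xsl:template>
</xsl:stylesheet>
EOF
    xsltproc "$sheet" "$1"
    status=$?
    rm -f "$sheet"
    return "$status"
}
```

Because the position is explicit, the result does not depend on how a given XSLT version treats a multi-node selection in xsl:value-of.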

XSLT shorter version of OR conditional statement

I was wondering if someone remembers how to write a shorter OR statements in XSLT. I'm sure there was a way but I can't remember.
So instead of
test="$var = 'text1' or $var = 'text2'"
I'd like to use a shorter version like test="$var =['text1','text2']". However, I can't remember or find the right shorthand syntax for such cases.
Would really appreciate if someone could help with that!
Many thanks
With XSLT 2.0 (but not with XSLT 1.0) you can do
<xsl:if test="$var = ('text1','text2')">
Maybe that is the syntax you are looking for.
For string values as you appear to be using you can use a concat trick:-
test="contains('__text1____text2__', concat('__', $var, '__'))"
Not shorter for just two items but given 5 or more it starts to look better.
Having said that, you can probably spread the expression over multiple lines when using or, so it may be better just to use a series of or conditions:
test = "
$var = 'text1'
or $var = 'text2'
or $var = 'text3'
or $var = 'text4'"
More text but clearer solution.
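The concat trick can be tried from the command line with a small shell sketch. The helper name in_list, the <dummy/> input document and the use of a stylesheet parameter named var are assumptions for illustration; --stringparam is the xsltproc option for passing a string parameter.

```shell
# in_list VALUE: hypothetical helper. Prints "match" if VALUE is one of the
# delimited entries in the fixed string, "no match" otherwise.
in_list() {
    sheet=$(mktemp) || return 1
    cat > "$sheet" <<'EOF'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:param name="var"/>
  <xsl:template match="/">
    <xsl:choose>
      <!-- wrap $var in the same delimiters used in the list string -->
      <xsl:when test="contains('__text1____text2__', concat('__', $var, '__'))">match</xsl:when>
      <xsl:otherwise>no match</xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
EOF
    echo '<dummy/>' | xsltproc --stringparam var "$1" "$sheet" -
    status=$?
    rm -f "$sheet"
    return "$status"
}
```

One caveat with this exact list string: an empty $var also tests true, because '____' occurs between the two entries, so the trick is safest when $var is known to be non-empty and cannot contain the delimiter.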
If you find that you do many comparisons against a fixed set of values, you can also do this:
<xsl:stylesheet
version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:cfg="http://tempuri.org/config"
exclude-result-prefixes="cfg"
>
<xsl:output method="text" />
<!-- prepare a fixed list of possible values; note the namespace -->
<config xmlns="http://tempuri.org/config">
<val>text1</val>
<val>text2</val>
<!-- ... -->
</config>
<!-- document('') lets you access the stylesheet itself -->
<xsl:variable name="cfg" select="document('')/*/cfg:config/cfg:val" />
<xsl:template match="/">
<xsl:variable name="var" select="'text2'" />
<!-- check against all possible values in one step -->
<xsl:if test="$cfg[.=$var]">
<xsl:text>Match!</xsl:text>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
The above would print
Match!
The [] operator only works on a nodeset. Maybe you're thinking of when you say something like [a|b] to select nodes from your nodeset that have a child element a or a child element b. But for string comparison I don't know of any way other than using "or".
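There is no 'contains' function for sequences, but in XPath 2.0 you could use index-of, or intersect when both operands are node sequences (intersect does not accept atomic values such as strings, so $nodes below stands for a hypothetical node sequence):
fn:exists($nodes intersect $var)
or
fn:exists(fn:index-of(('test1', 'test2'), $var))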
With only two strings, your original solution is shorter though.