How to count output lines with xslt? - xslt

I'm outputting text with XSLT. I need to count the number of lines in each section and write it to my output file. How could this be done?
My output is like this:
HDR+aaa:bbb'
AAA+78901234567890+String1:String2'
BBB+123+String'
CCC+321:1212'
DDD+112211'
DDD+121122'
XXX+number_of_records+78901234567890'
AAA+1234567890+String1:String2'
BBB+123+String'
CCC+321:1212'
DDD+1212:2121'
BBB+123+String'
BBB+122+String'
CCC+String'
XXX+number_of_records+1234567890'
The number_of_records field should contain the number of lines from AAA to XXX, including both lines. In the first section the number of lines should be 6, and in the second section it should be 8. The first and last lines of each section share the same unique ID number.
The number of lines cannot be counted from source since there is so much processing inside XSLT.

A conceptually simple way to do this would be to use a second-stage process. Take the output of your initial transformation (what you posted) and run it through a template (or a stylesheet like @Alejandro's) that parses it into lines and groups the lines starting with AAA... and ending with XXX. See Up-conversion using XSLT 2.0 for a very clear and practical tutorial on doing this, using tokenize(), xsl:analyze-string, and xsl:for-each-group. Then count the lines in each group and re-output each line, plugging the line count into the XXX record.
But that's inefficient, and somewhat error-prone, since you would be parsing the initial output. Why parse a serialization of information that the stylesheet already had internally? You could avoid the inefficiency by changing your initial output to XML, something like
<hdr>
<section id="78901234567890">
<!-- It sounds like AAA's ID actually applies to the section? -->
<AAA String1="..." String2="..."/>
<BBB .../>
<!-- no need to include XXX at this stage AFAICT -->
</section>
<section id="1234567890">
...
</section>
</hdr>
Then the second-stage template (or a separate stylesheet) could take this XML as input and very easily serialize it as you have done above, counting the lines as it goes.
In XSLT 1.0, you would have to use a separate stylesheet to process the output XML, or else use the extension function node-set(). (But even with a separate stylesheet processor, you could still avoid the cost of re-parsing the intermediate XML, if you can pipeline the two stylesheet processors together using SAX.) In XSLT 2.0, you can process the XML output of one template with another template, without restriction.
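As a minimal sketch of the two-stage idea in a single XSLT 2.0 stylesheet: stage 1 builds the records as a temporary tree, and stage 2 serializes that tree while counting. The element names `section` and `line`, the mode name `serialize`, and the hard-coded intermediate content are assumptions for illustration only; a real stylesheet would build `$stage1` from its own processing.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Stage 1: build the records as a temporary tree instead of text.
         Hard-coded here; normally produced by the existing processing. -->
    <xsl:variable name="stage1">
      <section id="78901234567890">
        <line>AAA+78901234567890+String1:String2'</line>
        <line>BBB+123+String'</line>
        <line>CCC+321:1212'</line>
      </section>
    </xsl:variable>
    <!-- Stage 2: serialize the temporary tree, counting as we go. -->
    <xsl:apply-templates select="$stage1/section" mode="serialize"/>
  </xsl:template>

  <xsl:template match="section" mode="serialize">
    <xsl:for-each select="line">
      <xsl:value-of select="."/>
      <xsl:text>&#xA;</xsl:text>
    </xsl:for-each>
    <!-- The + 1 includes the XXX trailer line itself in the count. -->
    <xsl:text>XXX+</xsl:text>
    <xsl:value-of select="count(line) + 1"/>
    <xsl:text>+</xsl:text>
    <xsl:value-of select="@id"/>
    <xsl:text>'&#xA;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

With the three hard-coded lines above, the trailer would read XXX+4+78901234567890'. Because the count is taken from the tree, nothing has to be parsed back out of the text output.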

Just for fun, until you post your input sample and stylesheet building that text output, this stylesheet:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="result" name="result">
<xsl:param name="pString" select="."/>
<xsl:variable name="vAfter" select="substring-after($pString, 'AAA+')"/>
<xsl:choose>
<xsl:when test="$vAfter!=''">
<xsl:variable name="vId"
select="substring-before($vAfter, '+')"/>
<xsl:variable name="vEnd"
select='concat("XXX+number_of_records+",$vId,"&apos;&#xA;")'/>
<xsl:variable name="vInto"
select="substring-before($vAfter,$vEnd)"/>
<xsl:value-of
select='concat(substring-before($pString,"AAA+"),
"AAA+",
$vInto,
"XXX+",
string-length(translate($vInto,
translate($vInto,
"&#xA;",
""),
"")) + 1,
"+",$vId,"&apos;&#xA;")'/>
<xsl:call-template name="result">
<xsl:with-param name="pString"
select="substring-after($vAfter,$vEnd)"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$pString"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
With this input:
<result>
HDR+aaa:bbb'
AAA+78901234567890+String1:String2'
BBB+123+String'
CCC+321:1212'
DDD+112211'
DDD+121122'
XXX+number_of_records+78901234567890'
AAA+1234567890+String1:String2'
BBB+123+String'
CCC+321:1212'
DDD+1212:2121'
BBB+123+String'
BBB+122+String'
CCC+String'
XXX+number_of_records+1234567890'
</result>
Output:
HDR+aaa:bbb'
AAA+78901234567890+String1:String2'
BBB+123+String'
CCC+321:1212'
DDD+112211'
DDD+121122'
XXX+6+78901234567890'
AAA+1234567890+String1:String2'
BBB+123+String'
CCC+321:1212'
DDD+1212:2121'
BBB+123+String'
BBB+122+String'
CCC+String'
XXX+8+1234567890'

My solution: I created an extension function that increments number_of_records by one each time I call it. I use xsl:comment to suppress the output until I really need to output the number. I reset the number_of_records after each XXX+ -line.
Doing this in two steps would have caused too much hassle.

Related

How can I replace text with angle bracket without parsing the replace value?

I have this:
replace("Both cruciate ligaments are well visualized and are intact.",
".",
".<br>")
But I do not want to output the escaped angle brackets; I want the actual brackets. When I run the code I get:
Both cruciate ligaments are well visualized and are intact.&lt;br&gt;
I want:
Both cruciate ligaments are well visualized and are intact.<br>
How can I achieve that? I cannot use the angle bracket directly as replace value since I get an error.
EDIT
I have a stylesheet that takes in a text file that is injected into a HTML file (coming from the stylesheet). I take an XML (Clinical document) and a text file and merge them together with the stylesheet. So for example I have:
RADIOLOGY REPORT
NAME: JOHN, DOE
DoB: 1982-02-25
Injected text goes here
The text has to wrap on carriage return and has to wrap at a word level. I did manage to do the latter, but I did not find a way to handle the line breaks. I thought of finding 'LF' in the file and replacing it with <BR> so that once the page is rendered I get to see the line breaks.
You need to use xsl:analyze-string if you want to output nodes and not simply strings. Here is an example:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html"/>
<xsl:template match="text">
<xsl:analyze-string select="." regex="\.">
<xsl:matching-substring>
<xsl:value-of select="."/><br/>
</xsl:matching-substring>
<xsl:non-matching-substring>
<xsl:value-of select="."/>
</xsl:non-matching-substring>
</xsl:analyze-string>
</xsl:template>
</xsl:stylesheet>
With the input being
<text>Both cruciate ligaments are well visualized and are intact.</text>
the transformation result is
Both cruciate ligaments are well visualized and are intact.<br>
Martin Honnen's answer is a perfectly good way to do this.
Using a simple template to find the text in question is another way:
<xsl:variable name="magic-string"
select='"Both cruciate ligaments are well visualized and are intact."'/>
...
<xsl:template match="text()
[contains(.,$magic-string)]">
<xsl:value-of select="substring-before(.,$magic-string)"/>
<xsl:value-of select="$magic-string"/>
<br/>
<xsl:value-of select="substring-after(.,$magic-string)"/>
</xsl:template>
In either case, use the HTML output method to serialize the empty br element as <br> instead of as <br/>.
Note: I'm assuming here that you want a br after this particular sentence, not that you want one after each occurrence of full stop, which is how Martin Honnen appears to have interpreted the question.
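For the line-break case described in the question's edit, a similar XSLT 2.0 approach can split on line feeds instead of full stops. This is only a sketch: the element name `report` is a hypothetical holder for the injected text, and the `p` wrapper is an assumption about the surrounding HTML.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- "report" is a hypothetical element holding the injected text. -->
  <xsl:template match="report">
    <p>
      <xsl:for-each select="tokenize(., '\r?\n')">
        <!-- Emit a br element between lines, not after the last one. -->
        <xsl:if test="position() &gt; 1"><br/></xsl:if>
        <xsl:value-of select="."/>
      </xsl:for-each>
    </p>
  </xsl:template>
</xsl:stylesheet>
```

Because br is created as an element node rather than as a string, the html output method serializes it as real markup and no escaping is involved.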

Using an xsl param as argument to XPath function

I've been trying to figure out a way to use a param/variable as an argument to a function.
At the very least, I'd like to be able to use basic string parameters as arguments as follows:
<xsl:param name="stringValue" default="'abcdef'"/>
<xsl:value-of select="substring(string($stringValue),1,3)"/>
The above code generates no output.
I feel like I'm missing a simple way of doing this. I'm happy to use exslt or some other extension if an xslt 1.0 processor does not allow this.
Edit:
I am using XSLT 1.0 and transforming using Nokogiri, which supports XPath 1.0. Here is a more complete snippet of what I am trying to do:
I want to pass column numbers as parameters using nokogiri as follows
document = Nokogiri::XML(File.read('table.xml'))
template = Nokogiri::XSLT(File.read('extractTableData.xsl'))
transformed_document = template.transform(document,
["tableName","'Problems'", #Table Heading
"tablePath","'Table'", #Takes an absolute XPATH String
"nameColumnIndex","2", #column number
"valueColumnIndex","3"]) #column number
File.open('FormattedOutput.xml', 'w').write(transformed_document)
My XSL then wants to access every TD[valueColumnIndex] and retrieve the first 3 characters at that position, which is why I am using a substring function. So I want to do something like:
<xsl:value-of select="substring(string(TD[$valueColumnIndex]),1,3)"/>
Since I was unable to do that, I tried to extract TD[$valueColumnIndex] to another param valueCode and then do substring(string(valueCode),1,3)
That did not work either (which is to say, no text was output, whereas <xsl:value-of select="$valueCode"/> gave me the expected output).
As a result, I decided to understand how to use parameters better by just using a hard-coded string, as mentioned in my earlier question.
Things I have tried:
using single quotes around abcdef (and not) while
using string() around the param name (and not)
Based on the comments below, it seems I am handicapped in my ability to understand the error because Nokogiri does not report an error for these situations. I am in the process of installing xsltproc right now and seeing if I receive any errors.
Finally, here is my entire xsl. I use a separate template forLoop because of the valueCode param I am creating. The lines of interest are the last 5 or so. I cannot include the xml as there are data use issues involved.
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:ext="http://exslt.org/common"
xmlns:dyn="http://exslt.org/dynamic"
exclude-result-prefixes="ext dyn">
<xsl:param name="tableName" />
<xsl:param name="tablePath" />
<xsl:param name= "nameColumnIndex" />
<xsl:param name= "valueColumnIndex"/>
<xsl:template match="/">
<xsl:param name="tableRowPath">
<xsl:value-of select="$tablePath"/><xsl:text>/TR</xsl:text>
</xsl:param>
<!-- Problems -->
<section>
<name>
<xsl:value-of select="$tableName" />
</name>
<!-- <xsl:for-each select="concat($tablePath,'/TR')"> -->
<xsl:for-each select="dyn:evaluate($tableRowPath)">
<!-- Encode record section -->
<xsl:call-template name="forLoop"/>
</xsl:for-each>
</section>
</xsl:template>
<xsl:template name="forLoop">
<xsl:param name="valueCode">
<xsl:value-of select="./TD[number($valueColumnIndex)][text()]"/>
</xsl:param>
<xsl:param name="RandomString" select="'Try123'"/>
<section>
<name>
<xsl:value-of select="./TD[number($nameColumnIndex)]"/>
</name>
<code>
<short>
<xsl:value-of select="substring(string($valueCode),1,3)"/>
</short>
<long>
<xsl:value-of select="$valueCode"/>
</long>
</code>
</section>
</xsl:template>
</xsl:stylesheet>
xsl:param has no default attribute; the default value is set with select. Use it this way:
<xsl:param name="stringValue" select="'abcdef'"/>
<xsl:value-of select="substring($stringValue,1,3)"/>
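Put together as a complete stylesheet, this might look like the sketch below. The TR/TD input layout follows the question; the numeric default of 3 for `valueColumnIndex` and the `//TR` selection are assumptions for illustration.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- The default comes from select; the inner quotes make it a string. -->
  <xsl:param name="stringValue" select="'abcdef'"/>
  <!-- A numeric default (assumed here); a caller may override it. -->
  <xsl:param name="valueColumnIndex" select="3"/>

  <xsl:template match="/">
    <xsl:value-of select="substring($stringValue, 1, 3)"/>
    <xsl:text>&#xA;</xsl:text>
    <!-- number() guards against the parameter arriving as a string,
         in which case a bare [$valueColumnIndex] would act as a
         boolean test rather than a positional one. -->
    <xsl:for-each select="//TR">
      <xsl:value-of select="substring(TD[number($valueColumnIndex)], 1, 3)"/>
      <xsl:text>&#xA;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

With the default parameter value, the first line of output is abc, followed by the first three characters of the third TD in each row.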

Split xml to several output files

Put simply, I have XML that contains 120 nodes. How can I create 3 XML files that have 50 nodes in each? I've marked the output as dynamic. Then I tried to apply the auto-number function, but I can't work out when it fires or how to create a condition on it. What I need is something like a trigger that would cause a new file to be created. My strategy:
P.S. I'm noob at MapForce.
Assuming your input is
<root>
<elt>...</elt>
...
</root>
then simplistically, you could do something like:
<xsl:template match="/">
<xsl:document href="1-50.xml">
<root>
<xsl:for-each select="root/elt[position() &lt;= 50]">
<xsl:copy-of select="."/>
</xsl:for-each>
</root>
</xsl:document>
<xsl:document href="51-100.xml">
<root>
<xsl:for-each select="root/elt[position() >= 51 and position() &lt;= 100]">
<xsl:copy-of select="."/>
</xsl:for-each>
</root>
</xsl:document>
<!-- repeat for other portions of input -->
</xsl:template>
In practice, you'd want to be a little bit smarter to handle arbitrary numbers of nodes in the input.
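One way to handle an arbitrary number of nodes, sketched here on the assumption that an XSLT 2.0 processor is available, is to group by position and write each group with xsl:result-document. The parameter name `size` and the `chunk-N.xml` file names are hypothetical.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Chunk size; 50 matches the question, but any positive integer works. -->
  <xsl:param name="size" select="50"/>

  <xsl:template match="/root">
    <!-- Elements whose (position - 1) idiv $size is equal share a group. -->
    <xsl:for-each-group select="elt"
        group-adjacent="(position() - 1) idiv $size">
      <!-- Hypothetical file names: chunk-1.xml, chunk-2.xml, ... -->
      <xsl:result-document href="chunk-{position()}.xml">
        <root>
          <xsl:copy-of select="current-group()"/>
        </root>
      </xsl:result-document>
    </xsl:for-each-group>
  </xsl:template>
</xsl:stylesheet>
```

For an input with 120 elt elements this yields three files holding 50, 50, and 20 elements, and the same stylesheet copes with any other count without further changes.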

How to prevent self closing tags as well empty tags after transforming

I have in an input file:
<a></a>
<b/>
<c>text</c>
I need to convert this to a string. Using a transformer I get the output below:
<a/> <!-- Empty tags should not collapse-->
<b/>
<c>text</c>
If I use xslt and output method is "HTML", I get the below output:
<a></a> <!-- This is as expected-->
<b></b> <!-- This is not expected-->
<c>text</c>
I want the same structure as in the input file. My application requires this because I need to calculate an index, and it would be very difficult to change the index calculation logic.
What would be the correct XSLT to use?
What XSLT processor? XSLT is merely a language to transform xml so "html output" is dependent on the processor.
I'm going to guess this first solution is too simple for you, but I've had to use this to avoid processing raw HTML:
<xsl:copy-of select="child::node()" />
as this should clone the raw input.
In my case, I have used the following to extract all nodes that had the raw attribute:
<xsl:for-each select="xmlData//node()[@raw]">
<xsl:copy-of select="child::node()" />
</xsl:for-each>
Other options:
2) Add an attribute to each empty node depending on what you want it to do later ie role="long", role="short-hand".
3)
Loop through each node (xsl:for-each)
<xsl:choose>
<xsl:when test="string-length(.)=0"> <!-- There is no child-->
<xsl:copy-of select="node()" />
</xsl:when>
<xsl:otherwise>
...whatever normal processing you have
</xsl:otherwise>
</xsl:choose>
4) Redefine your problem. Both are valid XHTML/XML, so perhaps your problem can be reframed or fixed elsewhere.
Either way, you may want to add more information in your question so that we can reproduce your problem and test it locally.
P.S. Too much text/code to put in a comment, but that's where this would belong.
A possible alternative is to use disable-output-escaping like this:
<xsl:text disable-output-escaping="yes">&lt;a&gt;&lt;/a&gt;</xsl:text>
But I understand that this is a dirty solution...

How to comment in XSLT and not HTML

I'm writing XSL and I want to make comments throughout the code that will be stripped when it's processed, like PHP, however I'm not sure how.
I'm aware of the comment object, but it prints out an HTML comment when processed. :\
<xsl:comment>comment</xsl:comment>
You use standard XML comments:
<!-- Comment -->
These are not processed by the XSLT transformer.
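A small sketch of the difference, with a made-up `p` element and text purely for illustration: the XML comment stays in the stylesheet, while xsl:comment creates a comment node in the result.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <xsl:template match="/">
    <!-- Stylesheet-only note: never reaches the result. -->
    <p>
      <xsl:comment>This one is written to the output</xsl:comment>
      <xsl:text>Hello</xsl:text>
    </p>
  </xsl:template>
</xsl:stylesheet>
```

Only the text inside xsl:comment appears in the generated HTML, as an HTML comment ahead of Hello.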
Just make sure that you put your <!-- comments --> AFTER the opening XML declaration (if you use one, which you really don't need):
BREAKS:
<!-- a comment -->
<?xml version="1.0"?>
WORKS:
<?xml version="1.0"?>
<!-- a comment -->
I scratched my head on this same issue for a bit while debugging someone else's XSLT... seems obvious, but easily overlooked.
Note that white space on either side of the comments can end up in the output stream, depending on your XSLT processor and its settings for handling white-space. If this is an issue for your output, make sure the comment is bracketed by xslt tags.
EG
<xsl:for-each select="someTag">
<xsl:text>"</xsl:text>
<!-- output the id -->
<xsl:value-of select="@id"/>
<xsl:text>"</xsl:text>
</xsl:for-each>
Will output " someTagID" (the indent tab/spaces in front of the comment tag are output).
To remove, either unindent it flush with left margin, or bracket it like
<xsl:text>"</xsl:text><!-- output the id --><xsl:value-of select="@id"/>
This is the way to do it in order to create a comment node that won't be displayed in html
<xsl:comment>
<!-- Content:template -->
</xsl:comment>
Sure. Read http://www.w3.org/TR/xslt#built-in-rule and then it should be apparent why this simple stylesheet will (well, should) do what you want:
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:template match="comment()">
<xsl:copy/>
</xsl:template>
<xsl:template match="text()|@*"/>
</xsl:stylesheet>
Try :
<xsl:template match="/">
<xsl:for-each select="//comment()">
<SRC_COMMENT>
<xsl:value-of select="."/>
</SRC_COMMENT>
</xsl:for-each>
</xsl:template>
or use a <xsl:comment ...> instruction for a more literal duplication of the source document content in place of my <SRC_COMMENT> tag.
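The xsl:comment variant could be sketched as follows. The wrapper element `comments` is a hypothetical name, added only so that the result has a single root element.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <!-- "comments" is a hypothetical wrapper so the result has one root. -->
    <comments>
      <xsl:for-each select="//comment()">
        <!-- Re-create each source comment as a comment in the result. -->
        <xsl:comment><xsl:value-of select="."/></xsl:comment>
      </xsl:for-each>
    </comments>
  </xsl:template>
</xsl:stylesheet>
```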