How to read text values of a node in XSLT

How can I read the value of a node/element while ignoring a few of its child tags?
I have a list of tags that need to be ignored.
Example:
a)
OUTPUT :Title Txt a
<Title>
<Comment>Comment code</Comment>Title Txt a
</Title>
b)
OUTPUT :Title Txt b
<Title>
<Ignore1>Comment code</Ignore1>Title Txt b
</Title>
c)
OUTPUT :Comment code Title Txt c
<Title>
<includethis>Comment code</includethis>Title Txt c
</Title>

You simply match the Title element:
<xsl:template match="Title">
and process its child nodes in turn:
<xsl:apply-templates/>
The built-in template rule copies the Title's own text nodes to the output, so no explicit xsl:value-of is needed there. (<xsl:value-of select="."/> on Title would output the text of every descendant, including the elements you want to ignore.)
Then, handle the child elements:
<xsl:template match="*[parent::Title and (self::Comment or starts-with(name(),'Ignore'))]"/>
<xsl:template match="includethis">
<xsl:value-of select="."/>
</xsl:template>
Above, the first template suppresses Comment and any element whose name starts with "Ignore". The name is tested with starts-with() because I assume there could be other elements named Ignore2, Ignore3 and so on.
Finally, the includethis elements are matched and their text content is output.
Now, to sum up:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/Title">
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="*[parent::Title and (self::Comment or starts-with(name(),'Ignore'))]"/>
<xsl:template match="includethis">
<xsl:value-of select="."/>
</xsl:template>
</xsl:stylesheet>

Thanks for your help. But I forgot to mention the important condition that started the problem:
I need to check that 'Title' is not empty before starting any processing.
For example, the tag below is considered empty when its only child tag is one from the ignored list.
<Title>
<Comment>Comment code</Comment>
</Title>
Sample Code :
<xsl:choose>
<xsl:when test="Title and normalize-space(Title) != ''">
<xsl:apply-templates select="Title" mode="xyz"/>
</xsl:when>
<xsl:otherwise>
<xsl:call-template name="getalternative_label"/>
</xsl:otherwise>
</xsl:choose>
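One way to make the emptiness check skip the ignored children is to test only the text nodes that do not sit inside an ignored element, instead of normalize-space(Title), which also sees the ignored text. The following is a minimal XSLT 1.0 sketch, not a definitive solution. It assumes that Title is a child of a hypothetical Item element, that the ignore list is Comment plus any element whose name starts with "Ignore", and that getalternative_label is the named template from the snippet above (stubbed here for illustration).

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- "Item" is a hypothetical parent element; adjust it to your document -->
  <xsl:template match="Item">
    <xsl:choose>
      <!-- True when at least one non-whitespace text node remains
           after discarding text that sits inside an ignored element -->
      <xsl:when test="Title//text()[normalize-space()]
                      [not(ancestor::Comment or ancestor::*[starts-with(name(), 'Ignore')])]">
        <xsl:apply-templates select="Title" mode="xyz"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:call-template name="getalternative_label"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Process the children of Title; built-in rules copy the text nodes -->
  <xsl:template match="Title" mode="xyz">
    <xsl:apply-templates mode="xyz"/>
  </xsl:template>

  <!-- Ignored children produce no output -->
  <xsl:template match="Comment | *[starts-with(name(), 'Ignore')]" mode="xyz"/>

  <!-- Stub standing in for the real named template (assumption) -->
  <xsl:template name="getalternative_label">
    <xsl:text>(alternative label)</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

With this test, a Title whose only content is a Comment child falls through to the xsl:otherwise branch, because its remaining text nodes are whitespace only.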

Related

XSLT List attributes in the order they appear in the xml file

I have a large number of xml files with a structure similar to the following, although they are far larger:
<?xml version="1.0" encoding="UTF-8"?>
<a a1="3.0" a2="ABC">
<b b1="P1" b2="123">first
</b>
<b b1="P2" b2="456" b3="xyz">second
</b>
</a>
I want to get the following output:
1|1|b1
1|2|b2
2|1|b1
2|2|b2
2|3|b3
where:
Field 1 is the sequence number for nodes /a/b
Field 2 is the sequence number of the attribute as it appears in the xml file
Field 3 is the attribute name (not value)
I don't quite know how to calculate field 2 correctly.
I've prepared the following xslt file:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="/">
<xsl:for-each select="a/b/@*">
<xsl:value-of select="count(../preceding-sibling::*)+1"/>
<xsl:text>|</xsl:text>
<!-- TODO: This is not correct -->
<xsl:value-of select="count(preceding-sibling::*)+1"/>
<xsl:text>|</xsl:text>
<xsl:value-of select="name()"/>
<xsl:text>
</xsl:text>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
but when I run the following command:
xsltproc a.xslt a.xml > a.csv
I get an incorrect output, as field 2 does not represent the attribute sequence number:
1|1|b1
1|1|b2
2|1|b1
2|1|b2
2|1|b3
Do you have any suggestions on how to get the correct output please?
Please notice that the answers provided in XSLT to order attributes do not provide a solution to this problem.
The order of attributes is irrelevant in XML. For instance, <a a1="3.0" a2="ABC"> and <a a2="ABC" a1="3.0"> are equivalent.
However this specific question is part of a larger application where it is essential to establish the order in which attributes appear in given xml files (and not in xml files that are equivalent to them).
As kjhughes says in the comments, attribute order is insignificant. However, you can still select the attributes and use the position() function to get the numbers you are after (you just can't be sure that the order in which they are output is the order in which they appear in the XML, although in practice this will generally be the case).
Try this XSLT. Do note the nested use of xsl:for-each to select only b elements first, to get their position, before getting the attributes, which then have their own separate position.
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" />
<xsl:template match="/">
<xsl:for-each select="a/b">
<xsl:variable name="bPosition" select="position()"/>
<xsl:for-each select="@*">
<xsl:value-of select="$bPosition"/>
<xsl:text>|</xsl:text>
<xsl:value-of select="position()"/>
<xsl:text>|</xsl:text>
<xsl:value-of select="name()"/>
<xsl:text>
</xsl:text>
</xsl:for-each>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
You could use the position() of each attribute in the sequence you are iterating over and subtract the number of attributes on the preceding sibling elements of its parent. This also works in XSLT 1.0, so it runs under xsltproc.
<xsl:template match="/">
<xsl:for-each select="a/b/@*">
<xsl:value-of select="count(../preceding-sibling::*)+1"/>
<xsl:text>|</xsl:text>
<xsl:value-of select="position() - count(../preceding-sibling::*/@*)"/>
<xsl:text>|</xsl:text>
<xsl:value-of select="name()"/>
<xsl:text>
</xsl:text>
</xsl:for-each>
</xsl:template>
Which produces the following output:
1|1|b1
1|2|b2
2|1|b1
2|2|b2
2|3|b3

xsl: when two nodes are equal, display child of first node

I'm using XML Editor 19.1, Saxon P.E 9.7.
For each selected div, I'm looking to display a graphic/@url, following each <surface> if surface/@xml:id = div/@facs.
XSL
<xsl:for-each select="descendant-or-self::div3[@type='col']/div4[@n]">
<xsl:variable name="div4tablet" select="@facs"/>
<xsl:choose>
<xsl:when test="translate(.[@n]/$div4tablet, '#', '') = preceding::facsimile/surfaceGrp[@type='tablet']/surface[@n]/@xml:id">
<xsl:value-of select=""/> <!-- DISPLAY graphic/@url that follows facsimile/surfaceGrp/surface -->
</xsl:when>
<xsl:otherwise/>
</xsl:choose>
[....]
</xsl:for-each>
TEI example
<facsimile>
<surfaceGrp n="1" type="tablet">
<surface n="1.1" xml:id="ktu1-2_i_1_to_10_img">
<graphic url="../img/KTU-1-2-1-10-recto.jpg"/>
<zone xml:id=""/>
<zone xml:id=""/>
</surface>
<surface n="1.2" xml:id="ktu1-2_i_10_to_30_img">
<graphic url="../img/KTU-1-2-10-30-recto.jpg"/>
<zone xml:id=""/>
</surface>
[...]
</surfaceGrp>
<surfaceGrp n="2">
[...]
</surfaceGrp>
</facsimile>
<text>
[...]
<div3 type="col">
<div4 n="1.2.1-10" xml:id="ktu1-2_i_1_to_10" facs="#ktu1-2_i_1_to_10_img">
[...]
</div4>
<div4 n="1.2.10-30" xml:id="ktu1-2_i_10_to_30" facs="#ktu1-2_i_10_to_30_img">
[...]
</div4>
</div3>
</text>
I have tried <xsl:value-of select="preceding::facsimile/surfaceGrp[@type='tablet']/surface[@n, @xml:id]/graphic/@url"/>, but it displays all graphic/@url values and not only the one that follows facsimile/surfaceGrp/surface.
So my question: how to display only surface/graphic/@url for each div3[@type='col']/div4[@n]?
In advance, thank you for your kind help.
As you use XSLT 2 or 3 and the elements have the xml:id attribute you do not even need a key but can use the id function:
<xsl:template match="div4">
<div>
<xsl:value-of select="id(substring(@facs, 2))/graphic/@url"/>
</div>
</xsl:template>
I put the use of id into a template matching the div4 element but you can of course use it the same way inside of your for-each selecting those elements.
See a minimal but complete sample at https://xsltfiddle.liberty-development.net/bdxtpR.
You should use xsl:key for this type of problem.
First, declare a key for the target node:
<xsl:key name="kSurface" match="surface" use="concat('#', @xml:id)"/>
Notice the concat function used here: a # is prepended to the xml:id so that the key values look like this:
#ktu1-2_i_1_to_10_img
#ktu1-2_i_10_to_30_img
Now, in this loop:
<xsl:for-each select="descendant-or-self::div3[@type='col']/div4[@n]">
we can look up the surface that matches the @facs attribute with:
<xsl:value-of select="key('kSurface', @facs)/graphic/@url"/>
The whole stylesheet is below:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="xs"
version="1.0">
<xsl:output omit-xml-declaration="yes"/>
<xsl:key name="kSurface" match="surface" use="concat('#', @xml:id)"/>
<xsl:template match="/">
<xsl:for-each select="descendant-or-self::div3[@type='col']/div4[@n]">
<xsl:value-of select="key('kSurface', @facs)/graphic/@url"/>
<xsl:text>
</xsl:text>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
see it in action here.

XSLT select all text except one child node text

I have xml like this:
<article>
<title> Test title - <literal> Compulsory - </literal> <fn> ABC </fn>
<comments> a comment</comments>
</title>
</article>
I want to get the text of all child nodes plus the title's own text in a variable,
e.g.
$full_title = "Test title - Compulsory - ABC"
excluding the text of the comments node.
The following is my unsuccessful attempt, which misses the title node's own text.
<xsl:template name="test">
<xsl:variable name="full_title" select="article/title/*[not(self::comments)][1]" />
<xsl:variable name="width" select="45" />
<xsl:choose>
<xsl:when test="string-length($full_title) > $width">
<xsl:value-of select="concat(substring($full_title,1,$width),'..')"/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$full_title"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
Change * to node(). That will select both elements and text nodes that are children of the <title> element. Then take out the [1] since you want all children of <title>:
<xsl:variable name="full_title"
select="string-join(article/title/node()[not(self::comments)], '')" />
A more reliable way to do it, so that you won't get tripped up if you have multiple levels under <title> and <comments> elements occur as grandchildren, would be this:
<xsl:variable name="full_title"
select="string-join(article/title//text()[not(ancestor::comments)], '')" />
Update:
Since you want the variable to hold a string value, and since you're passing it to functions like concat() and string-length() which cannot take a sequence of multiple nodes as a first argument, using string-join(..., '') around the sequence converts it to a string by concatenating the string values of each node.
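Note that string-join() is an XPath 2.0 function. If you are limited to an XSLT 1.0 processor, the same string can be built by iterating over the text nodes inside the variable. The following is a minimal sketch under that assumption; the name raw_title is introduced here for illustration only, and normalize-space() is added to trim the result and collapse runs of whitespace.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Concatenate every text node under title that is not inside comments -->
    <xsl:variable name="raw_title">
      <xsl:for-each select="article/title//text()[not(ancestor::comments)]">
        <xsl:value-of select="."/>
      </xsl:for-each>
    </xsl:variable>
    <!-- Convert the result tree fragment to a string and tidy the whitespace -->
    <xsl:variable name="full_title" select="normalize-space($raw_title)"/>
    <xsl:value-of select="$full_title"/>
  </xsl:template>
</xsl:stylesheet>
```

Applied to the sample article above, $full_title holds "Test title - Compulsory - ABC", which can then be passed to string-length() and substring() as in the original template.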
Try this:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>
<xsl:template match="/">
<xsl:variable name="full-text">
<xsl:apply-templates select="//*[not(self::comments)]"
mode="no-comments"/>
</xsl:variable>
<xsl:value-of select="$full-text"/><!-- just for debug-->
</xsl:template >
<xsl:template match="*" mode="no-comments">
<xsl:value-of select="text()"/>
</xsl:template>
</xsl:stylesheet>
The mode attribute is used only for clarity.

What is this XSLT code doing?

I'm new to XSLT. I have a block code that I don't understand.
In the following block, what do '*', '*[@class='vcard']' and '*[@class='fn']' mean?
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" encoding="utf-8"/>
<xsl:template match="/">
<script type="text/javascript">
<xsl:text><![CDATA[function show_hcard(info) {
win2 = window.open("about:blank", "HCARD", "width=300,height=200," + "scrollbars=no menubar=no, status=no, toolbar=no, scrollbars=no");
win2.document.write("<h1>HCARD</h1><hr/><p>" + info + "</p>"); win2.document.close();
}]]></xsl:text>
</script>
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="*">
<xsl:copy>
<xsl:copy-of select="@*"/>
<xsl:apply-templates/>
</xsl:copy>
</xsl:template>
<xsl:template match="*[@class='vcard']">
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="*[@class='fn']">
<u>
<a>
<xsl:attribute name="onMouseDown">
<xsl:text>show_hcard('</xsl:text>
<xsl:value-of select="text()"/>
<xsl:text>')</xsl:text>
</xsl:attribute>
<xsl:value-of select="text()"/>
</a>
</u>
</xsl:template>
</xsl:stylesheet>
* matches all elements; the *[@class='vcard'] pattern matches all elements whose class attribute has the value vcard. From that you can figure out what *[@class='fn'] may mean ;-)
I'd also suggest that you start here.
Your stylesheet has four template rules. In English these rules are:
(a) starting at the top (match="/"), first output a script element, then process the next level down (xsl:apply-templates) in the input.
(b) the default rule for elements (match="*") is to create a new element in the output with the same name and attributes as the original, and to construct its content by processing the next level down in the input.
(c) the rule for elements with the attribute class="vcard" is to do nothing with this element, other than to process the next level down in the input.
(d) the rule for elements with the attribute class="fn" is to output
<u><a onMouseDown="show_hcard('X')">X</a></u>
where X is the text content of the element being processed.
A more experienced XSLT user would have written the last rule as
<xsl:template match="*[@class='fn']">
<u>
<a onMouseDown="show_hcard('{.}')">
<xsl:value-of select="."/>
</a>
</u>
</xsl:template>

Get N characters introduction text with XSLT 1.0 from XHTML

How I can get first n characters with XSLT 1.0 from XHTML? I'm trying to create introduction text for news.
Everything is UTF-8
HTML entity aware (&amp;), one entity = one character
HTML tag aware (adds missing end tags)
Input HTML is always valid
If input text is over n chars add '...' to end output
Input tags are restricted to: a, img, p, div, span, b, strong
Example input HTML:
<img src="image.jpg" alt="">text link here
Example output with 9 characters:
<img src="image.jpg" alt="">text link...
Example input HTML:
<p>link here text</p>
Example output with 4 characters:
<p>link...</p>
Here is a starting point, although it currently doesn't contain any code to handle the requirement "Input tags are restricted to: a, img, p, div, span, b, strong"
It works by looping through the child nodes of a node and totalling the length of the preceding siblings up to that point. Note that the code to get the length of the preceding siblings requires the node-set function, which is an extension function in XSLT 1.0. In my example I am using the Microsoft extension function (msxsl:node-set).
Where a node is not a text node, the total length of characters up to that point will be the sum of the lengths of the preceding siblings, plus the sum of the lengths of the preceding siblings of the parent node (which is passed as a parameter to the template).
Here is the XSLT
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl">
<xsl:param name="MAXCHARS">9</xsl:param>
<xsl:template match="/body">
<xsl:apply-templates select="child::node()"/>
</xsl:template>
<xsl:template match="node()">
<xsl:param name="LengthToParent">0</xsl:param>
<!-- Get length of previous siblings -->
<xsl:variable name="previousSizes">
<xsl:for-each select="preceding-sibling::node()">
<length>
<xsl:value-of select="string-length(.)"/>
</length>
</xsl:for-each>
</xsl:variable>
<xsl:variable name="LengthToNode" select="sum(msxsl:node-set($previousSizes)/length)"/>
<!-- Total amount of characters processed so far -->
<xsl:variable name="LengthSoFar" select="$LengthToNode + number($LengthToParent)"/>
<!-- Check limit is not exceeded -->
<xsl:if test="$LengthSoFar &lt; number($MAXCHARS)">
<xsl:choose>
<xsl:when test="self::text()">
<!-- Output text node with ... if required -->
<xsl:value-of select="substring(., 1, number($MAXCHARS) - $LengthSoFar)"/>
<xsl:if test="string-length(.) > number($MAXCHARS) - $LengthSoFar">...</xsl:if>
</xsl:when>
<xsl:otherwise>
<!-- Output copy of node and recursively call template on its children -->
<xsl:copy>
<xsl:copy-of select="@*"/>
<xsl:apply-templates select="child::node()">
<xsl:with-param name="LengthToParent" select="$LengthSoFar"/>
</xsl:apply-templates>
</xsl:copy>
</xsl:otherwise>
</xsl:choose>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
When applied to this input
<body>
<img src="image.jpg" alt="" />text link here
</body>
The output is:
<body>
<img src="image.jpg" alt="" />text link...
</body>
When applied to this input (and changing the parameter to 4 in the XSLT)
<p>link here text</p>
The output is:
<p>link...</p>
This stylesheet:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:param name="pMaxLength" select="4"/>
<xsl:template match="node()">
<xsl:param name="pPrecedingLength" select="0"/>
<xsl:variable name="vContent">
<xsl:copy>
<xsl:copy-of select="@*"/>
<xsl:apply-templates select="node()[1]">
<xsl:with-param name="pPrecedingLength"
select="$pPrecedingLength"/>
</xsl:apply-templates>
</xsl:copy>
</xsl:variable>
<xsl:variable name="vLength"
select="$pPrecedingLength + string-length($vContent)"/>
<xsl:if test="$pMaxLength + 3 >= $vLength and
(string-length($vContent) or not(node()))">
<xsl:copy-of select="$vContent"/>
<xsl:apply-templates select="following-sibling::node()[1]">
<xsl:with-param name="pPrecedingLength" select="$vLength"/>
</xsl:apply-templates>
</xsl:if>
</xsl:template>
<xsl:template match="text()" priority="1">
<xsl:param name="pPrecedingLength" select="0"/>
<xsl:variable name="vOutput"
select="substring(.,1,$pMaxLength - $pPrecedingLength)"/>
<xsl:variable name="vSumLength"
select="$pPrecedingLength + string-length($vOutput)"/>
<xsl:value-of select="concat($vOutput,
substring('...',
1 div ($pMaxLength
= $vSumLength)))"/>
<xsl:apply-templates select="following-sibling::node()[1]">
<xsl:with-param name="pPrecedingLength"
select="$vSumLength"/>
</xsl:apply-templates>
</xsl:template>
</xsl:stylesheet>
With this input and 9 as pMaxLength:
<html><img src="image.jpg" alt=""/>text link here</html>
Output:
<html><img src="image.jpg" alt="">text link...</html>
And this input with 4 as pMaxLength:
<html><p>link here text</p></html>
Output:
<html><p>link...</p></html>
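The expression substring('...', 1 div (condition)) in the text() template above is a compact XPath 1.0 way to emit a string only when a condition holds, since XPath 1.0 has no if/then/else. The following minimal sketch isolates that idiom; the parameter name pLimitReached is hypothetical and stands in for the comparison $pMaxLength = $vSumLength.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical flag; pass false() to suppress the ellipsis -->
  <xsl:param name="pLimitReached" select="true()"/>

  <xsl:template match="/">
    <!-- true() converts to 1, so 1 div 1 = 1 and substring('...', 1) is '...'.
         false() converts to 0, so 1 div 0 = Infinity and
         substring('...', Infinity) is the empty string. -->
    <xsl:value-of select="concat('text', substring('...', 1 div $pLimitReached))"/>
  </xsl:template>
</xsl:stylesheet>
```

The trick relies on the boolean being converted to a number by the div operator, so it only needs core XPath 1.0 functions and no extension functions.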
As indicated by many: this gets very messy very fast. So I just added another field to DB which has the introduction text.