XSL copy without values is it possible? - xslt

I want to compare two xmls.
1. First compare XML strucutre/schema.
2. Compare values.
I am using beyond compare tool to compare. Since these two xmls are different values, there are lot many differences in comparison report, for which I am not interested. Since, my focus now is to only compare structure/schema.
I tried to copy the xmls by following template, and other as well. But every time it is with values.
I surfed on google, xsl-copy command itself copies everything for selected node/element..
Is there any ways with which I can filter out values and only schema is copied ?
My Data :
<root>
<Child1>xxxx</Child1>
<Child2>yyy</Child2>
<Child3>
<GrandChild1>dddd<GrandChild1>
<GrandChild2>erer<GrandChild2>
</Child3>
</root>
Template used :
<xsl:template match="/">
<xsl:apply-templates/>
</xsl:template>
<!-- for all elements (tags) -->
<xsl:template match="*">
<!-- create a copy of the tag (without attributes and children) in the output -->
<xsl:copy>
<!-- For all attributes of the current tag -->
<xsl:for-each select="#*">
<xsl:sort select="name( . )" order="ascending" case-order="lower-first" />
<xsl:copy/>
</xsl:for-each>
<!-- recurse through all child tags -->
<xsl:apply-templates/>
</xsl:copy>
</xsl:template>
<xsl:template match="text()|comment()|processing-instruction()">
<xsl:copy/>
</xsl:template>
OutPut Required :
Something like..
<root>
<Child1></Child1>
<Child2></Child2>
<Child3>
<GrandChild1><GrandChild1>
<GrandChild2><GrandChild2>
</Child3>
</root>

At the moment, you have a template matching text() to copy it. What you need to do is remove this match from that template, and have a separate template match, that matches only non-whitespace text, and remove it.
<xsl:template match="comment()|processing-instruction()">
<xsl:copy/>
</xsl:template>
<xsl:template match="text()[normalize-space()]" />
For white-space only text (as used in indentation), these will be matched by XSLT'S built-in templates.
For attributes, use xsl:attribute to create a new attribute, without a value, rather than using xsl:copy which will copy the whole attribute.
<xsl:attribute name="{name()}" />
Note the use of Attribute Value Templates (the curly braces) to indicate the expression is to be evaluated to get the string to use.
Try this XSLT
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<!-- for all elements (tags) -->
<xsl:template match="*">
<!-- create a copy of the tag (without attributes and children) in the output -->
<xsl:copy>
<!-- For all attributes of the current tag -->
<xsl:for-each select="#*">
<xsl:sort select="name( . )" order="ascending" case-order="lower-first" />
<xsl:attribute name="{name()}" />
</xsl:for-each>
<!-- recurse through all child tags -->
<xsl:apply-templates/>
</xsl:copy>
</xsl:template>
<xsl:template match="comment()|processing-instruction()">
<xsl:copy/>
</xsl:template>
<xsl:template match="text()[normalize-space()]" />
</xsl:stylesheet>
Also note that attributes are considered to be unordered in XML, so although you have code to sort the attributes, and they probably will appear in the right order, you can't guarantee it.

Related

XSLT wrap element and following-sibling text

Kindly help me to wrap the img.inline element with the following sibling text comma (if comma exists):
text <img id="1" class="inline" src="1.jpg"/> another text.
text <img id="2" class="inline" src="2.jpg"/>, another text.
Should be changed to:
text <img id="1" class="inline" src="1.jpg"/> another text.
text <span class="img-wrap"><img id="2" class="inline" src="2.jpg"/>,</span> another text.
Currently, my XSLT will wrap the img.inline element and add comma inside the span, now I want to remove the following comma.
text <span class="img-wrap"><img id="2" class="inline" src="2.jpg"/>,</span>
, <!--remove this extra comma--> another text.
My XSLT:
<xsl:template match="//img[#class='inline']">
<xsl:copy>
<xsl:choose>
<xsl:when test="starts-with(following-sibling::text(), ',')">
<span class="img-wrap">
<xsl:apply-templates select="node()|#*"/>
<xsl:text>,</xsl:text>
</span>
</xsl:when>
<xsl:otherwise>
<xsl:apply-templates select="node()|#*"/>
</xsl:otherwise>
</xsl:choose>
</xsl:copy>
<!-- checking following-sibling::text() -->
<xsl:apply-templates select="following-sibling::text()" mode="commatext"/>
</xsl:template>
<!-- here I want to match the following text, if comma, then remove it -->
<xsl:template match="the following comma" mode="commatext">
<!-- remove comma -->
</xsl:template>
Is my approach is correct? or is this something should be handled differently? pls suggest?
Currently you are copying the img and the embedding the span within that. Also, you do <xsl:apply-templates select="node()|#*"/> which will select child nodes of img (or which there are none). And for the attributes it will end add them to the span.
You don't actually need the xsl:choose here as you can add the condition to the match attribute.
<xsl:template match="//img[#class='inline'][starts-with(following-sibling::node()[1][self::text()], ',')]">
Note I have changed the condition as following-sibling::text() selects ALL text elements that follow the img node. You only want to get the node immediately after the img node, but only if it is a text node.
Also, trying to select the following text node with xsl:apply-templates is probably not the right approach, assuming you have a template that matches the parent node which selects all child nodes (not just img ones). I am assuming you were using the identity template here.
Anyway, try this XSLT instead
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="html" indent="no" />
<xsl:template match="#*|node()">
<xsl:copy>
<xsl:apply-templates select="#*|node()" />
</xsl:copy>
</xsl:template>
<xsl:template match="//img[#class='inline'][starts-with(following-sibling::node()[1][self::text()], ',')]">
<span class="img-wrap">
<xsl:copy-of select="." />
<xsl:text>,</xsl:text>
</span>
</xsl:template>
<xsl:template match="text()[starts-with(., ',')][preceding-sibling::node()[1][self::img]/#class='inline']">
<xsl:value-of select="substring(., 2)" />
</xsl:template>
</xsl:stylesheet>

Is it possible to preprocess xml source within the same XSLT Stylesheet?

Within the same XSLT (2.0) Stylesheet and transformation I would like to:
1) first preprocess the whole XML Datasource (Add a attribute
with a specific calculation to certain elements)
and then
2: transform the changed XML Datasource with the sylesheet's templates.
How can I achieve this? A code example would be nice?
Yes it is possible. One possibility would be to to proceed as follows:
<xsl:template match="/">
<!-- store the modified content in a variable -->
<xsl:variable name="preprocessed.doc">
<xsl:apply-templates mode="preprocess" />
</xsl:variable>
<!-- process the modified contents -->
<xsl:apply-templates select="$preprocessed.doc/*" />
</xsl:template>
<!-- first pass: sample process to add an attribute named "q" -->
<xsl:template match="*" mode="preprocess">
<xsl:copy>
<xsl:copy-of select="#*"/>
<xsl:attribute name="q"><xsl:number count="*" level="any" /></xsl:attribute>
<xsl:apply-templates mode="preprocess" />
</xsl:copy>
</xsl:template>
<!-- "normal" processing of the modified content. It is able to used the newly processed attribute. -->
<xsl:template match="*[#q <= 5]">
<xsl:copy>
<xsl:apply-templates />
</xsl:copy>
</xsl:template>
<xsl:template match="*"/>
In the first pass, an attribute is added, counting the element in the input XML.
In the processing, we only retain the elements having the value of the q attribute set a number less or equals to 5.

How to fold by a tag a group of selected (neighbor) tags with XSLT1?

I have a set of sequential nodes that must be enclosed into a new element. Example:
<root>
<c>cccc</c>
<a gr="g1">aaaa</a> <b gr="g1">1111</b>
<a gr="g2">bbbb</a> <b gr="g2">2222</b>
</root>
that must be enclosed by fold tags, resulting (after XSLT) in:
<root>
<c>cccc</c>
<fold><a gr="g1">aaaa</a> <b gr="g1">1111</b></fold>
<fold><a gr="g2">bbbb</a> <b gr="g2">2222</b></fold>
</root>
So, I have a "label for grouping" (#gr) but not imagine how to produce correct fold tags.
I am trying to use the clues of this question, or this other one... But I have a "label for grouping", so I understand that my solution not needs the use of key() function.
My non-general solution is:
<xsl:template match="/">
<root>
<xsl:copy-of select="root/c"/>
<fold><xsl:for-each select="//*[#gr='g1']">
<xsl:copy-of select="."/>
</xsl:for-each></fold>
<fold><xsl:for-each select="//*[#gr='g2']">
<xsl:copy-of select="."/>
</xsl:for-each></fold>
</root>
</xsl:template>
I need a general solution (!), looping by all #gr and coping (identity) all context that not have #gr... perhaps using identity transform.
Another (future) problem is to do this recursively, with fold of foldings.
In XSLT 1.0 the standard technique to handle this sort of thing is called Muenchian grouping, and involves the use of a key that defines how the nodes should be grouped and a trick using generate-id to extract just the first node in each group as a proxy for the group as a whole.
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:strip-space elements="*" />
<xsl:output indent="yes" />
<xsl:key name="elementsByGr" match="*[#gr]" use="#gr" />
<xsl:template match="#*|node()" name="identity">
<xsl:copy><xsl:apply-templates select="#*|node()"/></xsl:copy>
</xsl:template>
<!-- match the first element with each #gr value -->
<xsl:template match="*[#gr][generate-id() =
generate-id(key('elementsByGr', #gr)[1])]" priority="2">
<fold>
<xsl:for-each select="key('elementsByGr', #gr)">
<xsl:call-template name="identity" />
</xsl:for-each>
</fold>
</xsl:template>
<!-- ignore subsequent ones in template matching, they're handled within
the first element template -->
<xsl:template match="*[#gr]" priority="1" />
</xsl:stylesheet>
This achieves the grouping you're after, but just like your non-general solution it doesn't preserve the indentation and the whitespace text nodes between the a and b elements, i.e. it will give you
<root>
<c>cccc</c>
<fold>
<a gr="g1">aaaa</a>
<b gr="g1">1111</b>
</fold>
<fold>
<a gr="g2">bbbb</a>
<b gr="g2">2222</b>
</fold>
</root>
Note that if you were able to use XSLT 2.0 then the whole thing becomes one for-each-group:
<xsl:template match="root">
<xsl:for-each-group select="*" group-adjacent="#gr">
<xsl:choose>
<!-- wrap each group in a fold -->
<xsl:when test="#gr">
<fold><xsl:copy-of select="current-group()" /></fold>
</xsl:when>
<!-- or just copy as-is for elements that don't have a #gr -->
<xsl:otherwise>
<xsl:copy-of select="current-group()" />
</xsl:otherwise>
</xsl:choose>
</xsl:for-each-group>
</xsl:template>

orbeon: applying xslt transform in xbl on bound document

In Orbeon Forms I need to create a component (using XBL) that when bound to an instance like
<OCinstructionitem>
<OCp>paragraph 1</OCp>
<OCp>paragraph 2 <OCem>with italics part</OCem> rest of paragraph 2 </OCp>
</OCinstructionitem>
creates an editable div like this:
<div contentEditable="true">
<p>paragraph 1</p>
<p>paragraph 2 <i>with italics part</i> rest of paragraph 2 </p>
</div>
My thought was that I need to do this using XSLT. I get this working when the to-be-transformed XML is inside the xforms document:
<oc:instructionitem>
<OCinstructionitem>
<!-- here the xml of above -->
...
</OCinstructionitem>
</oc:instructionitem>
but I want to let the XSLT operate on the bound node as in:
<oc:instructionitem ref="OCinstructionitem"/>
However, I canot access the bound node from the XSLT.
My question: is that just not possible? Or do I have to modify my XBL?
My XBL code:
<xbl:binding id="oc-instructionitem" element="oc|instructionitem">
<xbl:template xxbl:transform="oxf:xslt">
<xsl:transform version="2.0">
<xsl:template match="#*|node()" priority="-100">
<xsl:apply-templates select="#*|node()"/>
</xsl:template>
<xsl:template match="text()" priority="-100" mode="in-paragraph">
<xsl:copy>
<xsl:value-of select="."/>
</xsl:copy>
</xsl:template>
<xsl:template match="OCem" mode="in-paragraph">
<x:i><xsl:value-of select="."/></x:i>
</xsl:template>
<xsl:template match="OCp">
<x:div>
<xsl:apply-templates mode="in-paragraph"/>
</x:div>
</xsl:template>
<xsl:template match="/*">
<x:div contentEditable="true">
<xsl:apply-templates/>
</x:div>
</xsl:template>
</xsl:transform>
</xbl:template>
</xbl:binding>
</xbl:xbl>
Any help would be greatly appreciated,
edit: after helpfull comment of tohuwawohu (below)
It seems that you need to define a variable which is bound to the instance data. Like this:
<xsl:template match="oc:instructionitem">
<xforms:group xbl:attr="model context ref bind" xxbl:scope="outer">
<xxforms:variable name="binding" as="node()?" xxbl:scope="inner" >
<xxforms:sequence select="." xxbl:scope="outer"/>
</xxforms:variable>
<xforms:group xxbl:scope="inner">
<!-- Variable pointing to external single-node binding -->
<xforms:group ref="$binding">
<xsl:call-template name="main"/>
</xforms:group>
</xforms:group>
</xforms:group>
</xsl:template>
<xsl:template name="main">
<x:div contentEditable="true">
<xforms:repeat nodeset="*">
<xsl:call-template name="paragraph"/>
</xforms:repeat>
</x:div>
</xsl:template>
However, the XSLT elements still cannot act on the data. It can only generate XFORMS elements, which can act on the data. This means that something like <xsl:template match="OCp"> will never be selected. This is why the above code uses named templates.
So the question still stands: can the bound data be made available to the xslt code?
Martijn
I didn't test it, but i think it should be possible by breaking the encapsulation. Without this, your XBL will only have access to the content of the bound node, as in your third code snippet. If you want to make the XBL access the XForms instance data, you will have to declare a variable inside the XBL that's pointing to the single-node binding of the bound node.
Here's the code snippet from the Wiki example:
<xforms:group xbl:attr="model context ref bind" xxbl:scope="outer">
<xforms:group xxbl:scope="inner">
<!-- Variable pointing to external single-node binding -->
<xxforms:variable name="binding" as="node()?">
<xxforms:sequence select="." xxbl:scope="outer"/>
</xxforms:variable>
...
</xforms:group>
</xforms:group>
Having defined the single-node binding like this, it should be possible to reference that variable and use it for xslt transformation. Just replace the XSLT template element matching the root node and point it to the content of the variable $binding:
<xsl:transform version="2.0">
<xsl:template match="#*|node()" priority="-100">
<xsl:apply-templates select="#*|node()"/>
</xsl:template>
<xsl:template match="text()" priority="-100" mode="in-paragraph">
<xsl:copy>
<xsl:value-of select="."/>
</xsl:copy>
</xsl:template>
<xsl:template match="OCem" mode="in-paragraph">
<x:i><xsl:value-of select="."/></x:i>
</xsl:template>
<xsl:template match="OCp">
<x:div>
<xsl:apply-templates mode="in-paragraph"/>
</x:div>
</xsl:template>
<xsl:template match="oc:instructionitem">
<xforms:group ref="$binding">
<x:div contentEditable="true">
<xsl:apply-templates/>
</x:div>
</xforms:group>
</xsl:template>
Hope this works for you. Maybe this requires making provisions if the bound node (in your example: oc:instructionitem) isn't empty, so the xslt may process both that content as well as the content of the $binding variable.

How to remove <b/> from a document

I'm trying to have an XSLT that copies most of the tags but removes empty "<b/>" tags. That is, it should copy as-is "<b> </b>" or "<b>toto</b>" but completely remove "<b/>".
I think the template would look like :
<xsl:template match="b">
<xsl:if test=".hasChildren()">
<xsl:element name="b">
<xsl:apply-templates/>
</xsl:element>
</xsl:if>
</xsl:template>
But of course, the "hasChildren()" part doesn't exist ... Any idea ?
dsteinweg put me on the right track ... I ended up doing :
<xsl:template match="b">
<xsl:if test="./* or ./text()">
<xsl:element name="b">
<xsl:apply-templates/>
</xsl:element>
</xsl:if>
</xsl:template>
This transformation ignores any <b> elements that do not have any node child. A node in this context means an element, text, comment or processing instruction node.
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes"/>
<xsl:template match="node()|#*">
<xsl:copy>
<xsl:apply-templates select="node()|#*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="b[not(node()]"/>
</xsl:stylesheet>
Notice that here we use one of the most fundamental XSLT design patterns -- using the identity transform and overriding it for specific nodes.
The overriding template will be selected only for nodes that are elements named "b" and do not have (any nodes as) children. This template is empty (does not have any contents), so the effect of its application is that the matching node is ignored/discarded and is not reproduced in the output.
This technique is very powerful and is widely used for such tasks and also for renaming, changing the contents or attributes, adding children or siblings to any specific node that can be matched (avery type of node with the exception of a namespace node can be used as a match pattern in the "match" attribute of <xsl:template/>
Hope this helped.
Cheers,
Dimitre Novatchev
I wonder if this will work?
<xsl:template match="b">
<xsl:if test="b/text()">
...
See if this will work.
<xsl:template match="b">
<xsl:if test=".!=''">
<xsl:element name="b">
<xsl:apply-templates/>
</xsl:element>
</xsl:if>
</xsl:template>
An alternative would be to do the following:
<xsl:template match="b[not(text())]" />
<xsl:template match="b">
<b>
<xsl:apply-templates/>
</b>
</xsl:template>
You could put all the logic in the predicate, and set up a template to match only what you want and delete it:
<xsl:template match="b[not(node())] />
This assumes that you have an identity template later on in the transform, which it sounds like you do. That will automatically copy any "b" tags with content, which is what you want:
<xsl:template match="#*|node()">
<xsl:copy>
<xsl:apply-templates select="#*|node()"/>
</xsl:copy>
</xsl:template>
Edit: Now uses node() like Dimitri, below.
If you have access to update the original XML, you could try using use xml:space=preserve on the root element
<html xml:space="preserve">
...
</html>
This way, the space in the empty <b> </b> tag is preserved, and so can be distinguished from <b /> in the XSLT.
<xsl:template match="b">
<xsl:if test="text() != ''">
....
</xsl:if>
</xsl:template>