Selectively copy and update xml nodes using XSLT - templates

I'm working with an xml that I need to copy and update to pass on for further processing. The issue I'm having is that I have not figured out an efficient method to do this. Essentially, I want to update some data, conditionally, then copy all the nodes that were not updated. Why this is challenging is due to the volume and variance in the number and name of nodes to be copied. I also want to NOT copy nodes that have no text value. Here is an example:
INPUT XML
<root>
<PersonProfile xmlns:'namespace'>
<ID>0001</ID>
<Name>
<FirstName>Jonathan</FirstName>
<PreferredName>John</PreferredName>
<MiddleName>A</MiddleName>
<LastName>Doe</LastName>
</Name>
<Country>US</Country>
<Biirthdate>01-01-1980</Birthdate>
<BirthPlace>
<City>Townsville</City>
<State>OR</State>
<Country>US</Country>
</Birthplace>
<Gender>Male</Gender>
<HomeState>OR</HomeState>
...
<nodeN>text</nodeN>
</PersonProfile>
</root>
The "PersonProfile" node is just one of several node sets within the "root" element, each with their own subset of data. Such as mailing address, emergency contact info, etc. What I am attempting to do is update nodes if the variable has a new value for them then copy all the nodes that were not updated.
Here is my current XSLT
<xsl:stylesheet version="2.0" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<xsl:variable name='updateData' select='document("report")'/>
<!-- Identity Transform -->
<xsl:template match='#* | node()'>
<xsl:if test'. != ""'>
<xsl:copy>
<xsl:apply-templates select='#* | node()'/>
</xsl:copy>
</xsl:if>
</xsl:template>
<!-- Template to update Person Profile -->
<xsl:template match='PersonProfile'>
<xsl:copy>
<xsl:apply-templates select='*'/>
<xsl:element name='Name'>
<xsl:if test='exists($updateData/Preferred)'>
<xsl:element name='FirstName'>
<xsl:value-of select='$reportData/FirstName'/>
</xsl:element>
</xsl:if>
<xsl:if test='exists($updateData/Preferred)'>
<xsl:element name='PreferredName'>
<xsl:value-of select='$updateData/Preferred'/>
</xsl:element>
</xsl:if>
<xsl:if test='exists($updateData/Middle)'>
<xsl:element name='MiddleName'>
<xsl:value-of select='$updateData/Middle'/>
</xsl:element>
</xsl:if>
<xsl:if test='exists($updateData/LastName)'>
<xsl:element name='LastName'>
<xsl:value-of select='$updateData/wd:LastName'/>
</xsl:element>
</xsl:if>
</xsl:element>
<xsl:if test='exists($updateData/Country)'>
<xsl:element name='Country'>
<xsl:value-of select='$updateData/Country'/>
</xsl:element>
</xsl:if>
....
<!-- follows same structure until end of template -->
</xsl:copy>
</xsl:template>
<!-- More Templates to Update other Node sets -->
</xsl:stylesheet>
What's happening right now, is that it's copying ALL the nodes and then adding the updates values. Using Saxon-PE 9.3.0.5, I'll get an output similar to this:
Sample Output
<root>
<PersonProfile xmlns:'namespace'>
<ID>0001</ID>
<Name>
<FirstName>Jonathan</FirstName>
<PreferredName>John</PreferredName>
<MiddleName>A</MiddleName>
<LastName>Doe</LastName>
</Name>
<Country>US</Country>
<Biirthdate>01-01-1980</Birthdate>
<BirthPlace>
<City>Townsville</City>
<State>OR</State>
<Country>US</Country>
</Birthplace>
<Gender>Male</Gender>
<HomeState>OR</HomeState>
...
<nodeN>text</nodeN>
<PreferredName>Jonathan</PreferredName>
<HomeState>WA</HomeState>
</PersonProfile>
</root>
I realize this is happening because I am applying the templates to all the nodes in PersonProfile and that I could specify which nodes to exclude, but I feel like this is a very poor solution as the volume of nodes could be upwards of 30 or more and that would require a written value for each one. I trust XML has a more elegant solution than to explicitly list each of these nodes. I would like to have an out like this:
Desired Output
<root>
<PersonProfile xmlns:'namespace'>
<ID>0001</ID>
<Name>
<FirstName>Jonathan</FirstName>
<PreferredName>Jonathan</PreferredName>
<MiddleName>A</MiddleName>
<LastName>Doe</LastName>
</Name>
<Country>US</Country>
<Biirthdate>01-01-1980</Birthdate>
<BirthPlace>
<City>Townsville</City>
<State>OR</State>
<Country>US</Country>
</Birthplace>
<Gender>Male</Gender>
<HomeState>WA</HomeState>
...
<nodeN>text</nodeN>
</PersonProfile>
</root>
If anyone could help me create a template structure that would work for the xml structure, I would GREATLY appreciate it. It would need to be "reusable" as there are similar node structures like Person Profile I would have to apply it to, but with different node names and number of elements, etc.
Thanks in advance for any help!
J

This should work for your original question:
<xsl:stylesheet version="2.0"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<xsl:variable name='updateData' select='document("report")'/>
<!-- Identity Transform -->
<xsl:template match='#* | node()' name='copy'>
<xsl:copy>
<xsl:apply-templates select='#* | node()'/>
</xsl:copy>
</xsl:template>
<xsl:template match='*[not(*)]'>
<xsl:variable name='matchingValue'
select='$updateData/*[name() = name(current())]'/>
<xsl:choose>
<xsl:when test='$matchingValue'>
<xsl:copy>
<xsl:apply-templates select='#*' />
<xsl:value-of select='$matchingValue'/>
</xsl:copy>
</xsl:when>
<xsl:when test='normalize-space()'>
<xsl:call-template name='copy' />
</xsl:when>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
As far as inserting new elements that are not present in the source XML, that's trickier. Could you open a separate question for that? I may have some ideas for how to approach that.

Related

Retrieving nodes having (or not) a child to apply a conditional template

I read lot of articles but did not find a conclusive help to my problem.
I have an XML document to which I apply an xslt to get a csv file as output.
I send a parameter to my xsl transformation to filter the target nodes to apply the templates.
The xml document looks like that (I removed some unuseful nodes for comprehension):
<GetMOTransactionsResponse xmlns="http://www.exane.com/pott" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.exane.com/pott PoTTMOTransaction.xsd">
<MOTransaction>
<Transaction VersionNumber="2" TradeDate="2013-11-20">
<TransactionId Type="Risque">32164597</TransactionId>
<InternalTransaction Type="Switch">
<BookCounterparty>
<Id Type="Risque">94</Id>
</BookCounterparty>
</InternalTransaction>
<SalesPerson>
<Id Type="Risque">-1</Id>
</SalesPerson>
</Transaction>
<GrossPrice>58.92</GrossPrice>
<MOAccount Account="TO1E" />
<Entity>0021</Entity>
</MOTransaction>
<MOTransaction>
<Transaction VersionNumber="1" TradeDate="2013-11-20">
<TransactionId Type="Risque">32164598</TransactionId>
<SalesPerson>
<Id Type="Risque">-1</Id>
</SalesPerson>
</Transaction>
<GrossPrice>58.92</GrossPrice>
<MOAccount Account="TO3E" />
<Entity>0021</Entity>
</MOTransaction>
</GetMOTransactionsResponse>
My xslt is below (sorry it's quite long, and I write it more simple than it really is):
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:pott="http://www.exane.com/pott">
<xsl:output method="text" omit-xml-declaration="no" indent="no" />
<xsl:param name="instrumentalSystem"></xsl:param>
<xsl:template name="abs">
<xsl:param name="n" />
<xsl:choose>
<xsl:when test="$n = 0">
<xsl:text>0</xsl:text>
</xsl:when>
<xsl:when test="$n > 0">
<xsl:value-of select="format-number($n, '#')" />
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="format-number(0 - $n, '#')" />
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<xsl:template name="outputFormat">
<!--Declaration of variables-->
<xsl:variable name="GrossPrice" select="pott:GrossPrice" />
<xsl:variable name="TransactionId" select="pott:Transaction/pott:TransactionId[#Type='Risque']" />
<xsl:variable name="VersionNumber" select="pott:Transaction/#VersionNumber" />
<!--Set tags values-->
<xsl:value-of select="$Entity" />
<xsl:text>;</xsl:text>
<xsl:value-of select="concat('0000000', pott:MOAccount/#Account) "/>
<xsl:text>;</xsl:text>
<xsl:text>;</xsl:text>
<xsl:value-of select="$TransactionId" />
<xsl:text>;</xsl:text>
<xsl:value-of select="$VersionNumber" />
<xsl:text>
</xsl:text>
</xsl:template>
<xsl:template match="/">
<xsl:choose>
<!-- BB -->
<xsl:when test="$instrumentalSystem = 'BB'">
<!--xsl:for-each select="pott:GetMOTransactionsResponse/pott:MOTransaction/pott:Transaction[pott:InternalTransaction]"-->
<xsl:for-each select="pott:GetMOTransactionsResponse/pott:MOTransaction/pott:Transaction[pott:InternalTransaction]">
<xsl:call-template name="outputFormat"></xsl:call-template>
</xsl:for-each>
</xsl:when>
<!-- CP -->
<xsl:when test="$instrumentalSystem = 'CP'">
<xsl:for-each select="pott:GetMOTransactionsResponse/pott:MOTransaction/pott:Transaction[not(pott:InternalTransaction)]">
<xsl:call-template name="outputFormat"></xsl:call-template>
</xsl:for-each>
</xsl:when>
<xsl:otherwise>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
If parameter = BB, I want to select MOTransaction nodes that have a child Transaction that contains a InternalTransaction node.
If parameter = CP, I want to select MOTransaction nodes that don't have a child Transaction that contains a InternalTransaction node
When I write
pott:GetMOTransactionsResponse/pott:MOTransaction/pott:Transaction[pott:InternalTransaction], I get the Transaction nodes and not the MOTransaction nodes
I think I am not very far from the expected result, but despite all my attempts, I fail.
If anyone can help me.
I hope being clear, otherwise I can give more information.
Looking at one of xsl:for-each statements, you are doing this
<xsl:for-each select="pott:GetMOTransactionsResponse/pott:MOTransaction/pott:Transaction[pott:InternalTransaction]">
You say you want to select MOTransaction elements, but it is actually selecting the child Transaction elements. To match the logic you require, it should be this
<xsl:for-each select="pott:GetMOTransactionsResponse/pott:MOTransaction[pott:Transaction[pott:InternalTransaction]]">
In fact, this should also work
<xsl:for-each select="pott:GetMOTransactionsResponse/pott:MOTransaction[pott:Transaction/pott:InternalTransaction]">
Similarly, for the second statement (in the case of the parameter being "CP"), it could look like this
<xsl:for-each select="pott:GetMOTransactionsResponse/pott:MOTransaction[pott:Transaction[not(pott:InternalTransaction)]]">
Alternatively, it could look like this
<xsl:for-each select="pott:GetMOTransactionsResponse/pott:MOTransaction[not(pott:Transaction/pott:InternalTransaction)]">
They are not quite the same though, as the first will only include MOTransaction elements that have Transaction child elements, whereas the second will include MOTransaction that don't have any Transaction childs at all.
As a slight aside, you don't really need to use an xsl:for-each and xsl:call-template here. It might be better to use template matching.
Firstly, try changing the named template <xsl:template name="outputFormat"> to this
<xsl:template match="pott:MOTransaction">
Then, you can re-write you merge the xsl:for-each and xsl:call-template into a single xsl:apply-templates call.
<xsl:apply-template select="pott:GetMOTransactionsResponse/pott:MOTransaction[pott:Transaction/pott:InternalTransaction]" />

how to set Click event on Xslt output

This is my XSLT file:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html"/>
<xsl:template match="/">
<xsl:for-each select="//child_4331">
<xsl:value-of select="*"/>
<xsl:value-of select="#value" />
<xsl:attribute name="onclick">
<xsl:call-template name="GetOnClickJavaScript" />
</xsl:attribute>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
How can I set the click event on child_4331 value?
You didn't say, but I'm assuming you want to copy the child_4331 element and add the onclick attribute.
I would get rid of the template matching '/' and create one to match 'child_4331'. Use xsl:copy to create a copy of the element and add the attribute inside it. If the child_4331 element has attributes or child elements you will want to use xsl:apply-templates to pick them up.
Here is a sample snippet. Your solution may vary depending on your desired output. I can't give you more without knowing what your source XML looks like and what you expect to see in the result.
<xsl:template match="child_4331">
<xsl:copy>
<xsl:attribute name="onclick">
<xsl:call-template name="GetOnClickJavaScript" />
</xsl:attribute>
</xsl:copy>
</xsl:template>

Testing whether a node with particular content exists using xslt

I am trying to merge the elements from two separate web.xml files using XSLT. For example, if web-1.xml and web-2.xml are being merged, and I'm processing web-1.xml, I want all elements in web-2.xml to be added into the result, except any that already exist in web-1.xml.
In the XSLT sheet, I have loaded the document whose servlet's are to be merged into the other document using:
<xsl:variable name="jandy" select="document('web-2.xml')"/>
I then have the following rule:
<xsl:template match="webapp:web-app">
<xsl:copy>
<!-- Copy all of the existing content from the document being processed -->
<xsl:apply-templates/>
<!-- Merge any <servlet> elements that don't already exist into the result -->
<xsl:for-each select="$jandy/webapp:web-app/webapp:servlet">
<xsl:variable name="servlet-name"><xsl:value-of select="webapp:servlet-name"/></xsl:variable>
<xsl:if test="not(/webapp:web-app/webapp:servlet/webapp:servlet-name[text() = $servlet-name])">
<xsl:copy-of select="."/>
</xsl:if>
</xsl:for-each>
</xsl:copy>
</xsl:template>
The problem I'm having is getting the test in the if correct. With the above code, the test always evaluates to false, whether a servlet-name element with the given node exists or not. I have tried all kinds of different tests but with no luck.
The relevant files are available at http://www.cs.hope.edu/~mcfall/stackoverflow/web-1.xml, and http://www.cs.hope.edu/~mcfall/stackoverflow/transform.xslt (the second web-2.xml is there as well, but StackOverflow won't let me post three links).
provide an anchor for the first document, just before the for-each loop:
<xsl:variable name="var" select="."/>
then, use it in your if:
<xsl:if test="not($var/webapp:servlet/webapp:servlet-name[text() = $servlet-name])">
Your template matches XPATH webapp:webapp from web-1.xml and
you are refencing absolute XPATH if your xsl:if condition: /webapp:web-app/webapp:servlet/webapp:servlet-name[text() = $servlet-name]. Try to do it using relative XPATH:
<xsl:if test="not(webapp:servlet/webapp:servlet-name[text() = $servlet-name])">
<xsl:copy-of select="."/>
</xsl:if>
I haven't checked it, so you have to give it a try.
Also, it would be easier if you could provide web-1.xml and web-2.xml files.
EDIT
The following XSLT merges two files - the only problem appears when there are sections of the same type (like listener) in two places of the input XML.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:webapp="http://java.sun.com/xml/ns/javaee"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="http://java.sun.com/xml/ns/javaee"
xpath-default-namespace="http://java.sun.com/xml/ns/javaee">
<xsl:output indent="yes"/>
<xsl:variable name="jandy" select="document('web-2.xml')"/>
<xsl:template match="/">
<xsl:element name="web-app">
<xsl:for-each select="webapp:web-app/*[(name() != preceding-sibling::node()[1]/name()) or (position() = 1)]">
<xsl:variable name="nodeName" select="./name()"/>
<xsl:variable name="web1" as="node()*">
<xsl:sequence select="/webapp:web-app/*[name()=$nodeName]"/>
</xsl:variable>
<xsl:variable name="web2" as="node()*">
<xsl:sequence select="$jandy/webapp:web-app/*[name() = $nodeName]"/>
</xsl:variable>
<xsl:copy-of select="$web1" copy-namespaces="no"/>
<xsl:for-each select="$web2">
<xsl:variable name="text" select="./*[1]/text()"/>
<xsl:if test="count($web1[*[1]/text() = $text]) = 0">
<xsl:copy-of select="." copy-namespaces="no"/>
</xsl:if>
</xsl:for-each>
</xsl:for-each>
</xsl:element>
</xsl:template>
</xsl:stylesheet>

Use variable to store and output attributes

If have a big 'xsl:choose' chunk in which I need to set a number of defined sets of attributes on different elements.
I really do not like to repeat the definition of sets of attributes inside every branch of the 'choose'.
So I would like to work with a variable that contains those attributes.
A lot easier to maintain and less room for error...
So far I have not been able to call the attribute node out?
I thought they are just a node-set, so copy-of would do the trick.
But that gives me nothing on output.
Is this because attribute nodes are not really children?
But XSLT 1.O does not allow me to address them directly...<xsl:copy-of select="$attributes_body/#*/> returns an error
Here is the stylesheet fragment (reduced from original)
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="list">
<xsl:for-each select="figure">
<xsl:variable name="attributes_body">
<xsl:attribute name="chapter"><xsl:value-of select="#chapter"/></xsl:attribute>
<xsl:attribute name="id"><xsl:value-of select="#id"/></xsl:attribute>
</xsl:variable>
<xsl:variable name="attributes_other">
<xsl:attribute name="chapter"><xsl:value-of select="#book"/></xsl:attribute>
<xsl:attribute name="id"><xsl:value-of select="#id"/></xsl:attribute>
</xsl:variable>
<xsl:choose>
<xsl:when test="ancestor::body">
<xsl:element name="entry">
<xsl:copy-of select="$attributes_body"/>
<xsl:text>Body fig</xsl:text>
</xsl:element>
</xsl:when>
<xsl:otherwise>
<xsl:element name="entry">
<xsl:copy-of select="$attributes_other"/>
<xsl:text>other fig</xsl:text>
</xsl:element>
</xsl:otherwise>
</xsl:choose>
</xsl:for-each>
</xsl:template>
If this can not be done in XLST 1.0 would 2.0 be able to do this?
<xsl:variable name="attributes_body">
<xsl:attribute name="chapter"><xsl:value-of select="#chapter"/></xsl:attribute>
<xsl:attribute name="id"><xsl:value-of select="#id"/></xsl:attribute>
</xsl:variable>
You need to select the wanted attributes -- not to copy their contents in the body of the variable.
Remember: Whenever possible, try always to specify an XPath expression in the select attribute of xsl:variable -- avoid copying content in its body.
Solution:
Just use:
<xsl:variable name="attributes_body" select="#chapter | #id">
Here is a complete example:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="x/a">
<xsl:variable name="vAttribs" select="#m | #n"/>
<newEntry>
<xsl:copy-of select="$vAttribs"/>
<xsl:value-of select="."/>
</newEntry>
</xsl:template>
</xsl:stylesheet>
when applied on this:
<x>
<a m="1" n="2" p="3">zzz</a>
</x>
produces:
<newEntry m="1" n="2">zzz</newEntry>
in my case, I was trying to store a tag attribute into a variable
to do so, use this syntax tag-name/#attribute-inside-tag-name
here is an example
<xsl:variable name="articleLanguage" select="/article/#language"/><!--the tricky part -->
//<!--now you can use this this varialbe as you like -->
<xsl:apply-templates select="front/article-meta/kwd-group[#language=$articleLanguage]"/>
and the xml was
<article article-type="research-article" language="es" explicit-lang="es" dtd-version="1.0">
.....
hope this help you

How to remove <b/> from a document

I'm trying to have an XSLT that copies most of the tags but removes empty "<b/>" tags. That is, it should copy as-is "<b> </b>" or "<b>toto</b>" but completely remove "<b/>".
I think the template would look like :
<xsl:template match="b">
<xsl:if test=".hasChildren()">
<xsl:element name="b">
<xsl:apply-templates/>
</xsl:element>
</xsl:if>
</xsl:template>
But of course, the "hasChildren()" part doesn't exist ... Any idea ?
dsteinweg put me on the right track ... I ended up doing :
<xsl:template match="b">
<xsl:if test="./* or ./text()">
<xsl:element name="b">
<xsl:apply-templates/>
</xsl:element>
</xsl:if>
</xsl:template>
This transformation ignores any <b> elements that do not have any node child. A node in this context means an element, text, comment or processing instruction node.
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes"/>
<xsl:template match="node()|#*">
<xsl:copy>
<xsl:apply-templates select="node()|#*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="b[not(node()]"/>
</xsl:stylesheet>
Notice that here we use one of the most fundamental XSLT design patterns -- using the identity transform and overriding it for specific nodes.
The overriding template will be selected only for nodes that are elements named "b" and do not have (any nodes as) children. This template is empty (does not have any contents), so the effect of its application is that the matching node is ignored/discarded and is not reproduced in the output.
This technique is very powerful and is widely used for such tasks and also for renaming, changing the contents or attributes, adding children or siblings to any specific node that can be matched (avery type of node with the exception of a namespace node can be used as a match pattern in the "match" attribute of <xsl:template/>
Hope this helped.
Cheers,
Dimitre Novatchev
I wonder if this will work?
<xsl:template match="b">
<xsl:if test="b/text()">
...
See if this will work.
<xsl:template match="b">
<xsl:if test=".!=''">
<xsl:element name="b">
<xsl:apply-templates/>
</xsl:element>
</xsl:if>
</xsl:template>
An alternative would be to do the following:
<xsl:template match="b[not(text())]" />
<xsl:template match="b">
<b>
<xsl:apply-templates/>
</b>
</xsl:template>
You could put all the logic in the predicate, and set up a template to match only what you want and delete it:
<xsl:template match="b[not(node())] />
This assumes that you have an identity template later on in the transform, which it sounds like you do. That will automatically copy any "b" tags with content, which is what you want:
<xsl:template match="#*|node()">
<xsl:copy>
<xsl:apply-templates select="#*|node()"/>
</xsl:copy>
</xsl:template>
Edit: Now uses node() like Dimitri, below.
If you have access to update the original XML, you could try using use xml:space=preserve on the root element
<html xml:space="preserve">
...
</html>
This way, the space in the empty <b> </b> tag is preserved, and so can be distinguished from <b /> in the XSLT.
<xsl:template match="b">
<xsl:if test="text() != ''">
....
</xsl:if>
</xsl:template>