XSLT: How to remove the self-closed element - xslt

I have a large xml file which contents a lot of self-closed tags. How could remove all them by using XSLT.
eg.
<?xml version="1.0" encoding="utf-8" ?>
<Persons>
<Person>
<Name>user1</Name>
<Tel />
<Mobile>123</Mobile>
</Person>
<Person>
<Name>user2</Name>
<Tel>456</Tel>
<Mobile />
</Person>
<Person>
<Name />
<Tel>123</Tel>
<Mobile />
</Person>
<Person>
<Name>user4</Name>
<Tel />
<Mobile />
</Person>
</Persons>
I'm expecting the result:
<?xml version="1.0" encoding="utf-8" ?>
<Persons>
<Person>
<Name>user1</Name>
<Mobile>123</Mobile>
</Person>
<Person>
<Name>user2</Name>
<Tel>456</Tel>
</Person>
<Person>
<Tel>123</Tel>
</Person>
<Person>
<Name>user4</Name>
</Person>
</Persons>
Note: there are thousands of different elements, how can I programmatically remove all the self-closed tags. Another question is how to remove the empty element such as <name></name> as well.
Can anyone help me on this? Many thanks.

The self-closed tags are equivalent to empty tags. You can remove all empty tags, but you have no way of knowing whether they were self-closed in the input XML or not (<tag/> and <tag></tag> are indistinguishable).
<!-- the identity template copies everything that has no special handler -->
<xsl:template match="node()|#*">
<xsl:copy>
<xsl:apply-templates select="node()|#*" />
</xsl:copy>
</xsl:template>
<!-- special handler for elements that have no child nodes:
they are removed by this empty template -->
<xsl:template match="*[not(node())]" />
If elements that contain whitespace only are "empty" by your definition as well, then replace the second template with:
<xsl:template match="*[normalize-space() = '']" />

From the XML point of view, there is no difference between "self-closed" element like and empty element like (see spec).
Here is a transformation to strip all empty elements:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes" encoding="utf-8" />
<xsl:strip-space elements="*" />
<xsl:template match="#*|node()">
<xsl:if test=".!=''">
<xsl:copy>
<xsl:apply-templates select="#*|node()"/>
</xsl:copy>
</xsl:if>
</xsl:template>
</xsl:stylesheet>

You might want to check if they are required. It should look something like this if they are: use="required". Also check if they are: type="nonEmptyString".

You can remove all empty elements - ones that do not have nested elements and attributes declared. If this solution works for you you can do following:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="*">
<xsl:if test="string(.) != '' or descendant-or-self::*/#*[string(.)]">
<xsl:element name="{name()}" >
<xsl:copy-of select="#*[string(.)]"/>
<xsl:apply-templates select="* | text()" />
</xsl:element>
</xsl:if>
</xsl:template>
<xsl:template match="text()">
<xsl:value-of select="."/>
</xsl:template>
</xsl:stylesheet>

The reason to post this answer is,
that you haven't accepted any of the
existing answers yet.
Well. This is very simple XSLT challenge. Just match a node with text data as null and close the template tag, so that, the node will not appear in the output.like this, <xsl:template match=*[.='']/> add it along with your identity template. Similar to the way Tomolak has nailed.
The problem with this approach is, it deletes even your parent node (<Person/> tag for example) if it null.
If this is your xml:
<Persons>
<Person>
<data>text</data>
<data2>text</data2>
<data3/>
</Person>
<Person/>
</Persons>
From the above xml even the tag is removed. So the output xml will be:
<Persons>
<Person>
<data>text</data>
<data2>text</data2>
</Person>
</Persons>
If you want to avoid that, then add an exception.
<xsl:template match="*[name()!='Person' and not(node())]"/>
Add it your identity template. Your XSLT will be:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl"
>
<xsl:output method="xml" indent="yes"/>
<xsl:template match="#* | node()">
<xsl:copy>
<xsl:apply-templates select="#* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="*[name()!='Person' and not(node())]"/>
</xsl:stylesheet>
And the output xml will be:
<Persons>
<Person>
<data>text</data>
<data2>text</data2>
</Person>
<Person/>
</Persons>

Related

current-group() except for one element in xslt

The below is my input xml. I'm trying group by using current-group() function but it is not meeting my requirement, below I have provided the details.
<UsrTimeCardEntry>
<Code>1<Code>
<Name>TC1</Name>
<Person>
<Code>074</Code>
</Person>
</UsrTimeCardEntry>
<UsrTimeCardEntry>
<Code>2<Code>
<Name>TC2</Name>
<Person>
<Code>074</Code>
</Person>
</UsrTimeCardEntry>
I want to group it by Person/Code so that it looks like this
<Person Code="074">
<UsrTimeCardEntry>
<Code>1</Code>
<Name>TC1</Name>
</UsrTimeCardEntry>
<UsrTimeCardEntry>
<Code>2</Code>
<Name>TC2</Name>
</UsrTimeCardEntry>
</Person>
For which I'm using the below xslt, but it is again copying the Person which I don't want, what it that I'm missing here, I tried using current-group() except and not[child::Person] but that too did not work.
<xsl:template match="businessobjects">
<xsl:for-each-group select="UsrTimeCardEntry" group-by="Person/Code">
<Person Code="{current-grouping-key()}">
<xsl:copy-of select="current-group()"></xsl:copy-of>
</Person>
</xsl:for-each-group>
</xsl:template>
Instead of using xsl:copy-of here, use xsl:apply-templates, then you can add a template to ignore the Person node
<xsl:template match="Person" />
This assumes you are also using the identity template to copy all other nodes normally.
<xsl:template match="#*|node()">
<xsl:copy>
<xsl:apply-templates select="#*|node()"/>
</xsl:copy>
</xsl:template>
Try this XSLT
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
<xsl:output method="xml" indent="yes" />
<xsl:strip-space elements="*" />
<xsl:template match="businessobjects">
<xsl:for-each-group select="UsrTimeCardEntry" group-by="Person/Code">
<Person Code="{current-grouping-key()}">
<xsl:apply-templates select="current-group()" />
</Person>
</xsl:for-each-group>
</xsl:template>
<xsl:template match="Person" />
<xsl:template match="#*|node()">
<xsl:copy>
<xsl:apply-templates select="#*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>

XML to XML using XSLT remove duplicates

Below is an input xml file:
<assets>
<item>
<file_name>file123</file_name>
<description>testing</description>
<created>date</created>
<metadata>
<guest>name</guest>
<webinfo>test</webinfo>
<albumorder>3</albumorder>
<albumorder>3</albumorder>
</metadata>
</item>
</assets>
From the above xml metadata/albumorder is having duplicate. I want to keep only one albumorder element. How to remove the duplicate elements.
Result file should be like:
<assets>
<item>
<file_name>file123</file_name>
<description>testing</description>
<created>date</created>
<metadata>
<guest>name</guest>
<webinfo>test</webinfo>
<albumorder>3</albumorder>
</metadata>
</item>
</assets>
The following copies only elements, which don't have a same named preceeding sibling element:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="*">
<xsl:if test="not(preceding-sibling::*[name(.) = name(current())])">
<xsl:copy><xsl:apply-templates/></xsl:copy>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
There is pattern called "XSL identity template" or "XSL identity transformation", that will keep all XML except for rules you define to change certain elements.
kiwiwings already pointed out the solution, I would just integrate it in the identity template, otherwise you would loose XML comments, attributes, and processing instructions (none of them are present in your XML, but someone else may have them).
<xsl:output method="xml" indent="yes" encoding="UTF-8"/>
<xsl:template match="comment()|processing-instruction()|text()">
<xsl:copy/>
</xsl:template>
<xsl:template match="*">
<xsl:if test="not(preceding-sibling::*[name(.) = name(current())])">
<xsl:copy>
<xsl:copy-of select="#*"/>
<xsl:apply-templates select="comment()|processing-instruction()|text()|*"/>
</xsl:copy>
</xsl:if>
</xsl:template>

Insert list of elements inside XML tags

I am trying to obtain a list of all the elements with values that aren't in the (Line 1, Line2), and then insert them into the tags similar to the test.
Right now I can retrieve all the elements, but I'm having trouble restricting this to just my desired values. And then I'm unsure how to match and do a for each on elements outside my match criteria. Any advice would be greatly appreciated!
Given the Following XML:
<?xml version="1.0" encoding="UTF-8"?>
<Request>
<Header>
<Line1>Element1</Line1>
<Line2>Element2</Line2>
</Header>
<ElementControl>
<Update>
<Element>test</Element>
</Update>
</ElementControl>
<Member>
<Identifier>123456789</Identifier>
<Contact>
<Person>
<Gender>MALE</Gender>
<Title>Mr</Title>
<Name>JOHN DOE</Name>
</Person>
<HomePhone/>
<eMailAddress/>
<ContactAddresses>
<Address>
<AddressType>POS</AddressType>
<Line1>100 Fake Street</Line1>
<Line2/>
<Line3/>
<Line4/>
<Suburb>Jupiter</Suburb>
<State>OTH</State>
<PostCode>9999</PostCode>
<Country>AUS</Country>
</Address>
</ContactAddresses>
</Contact>
</Member>
</Request>
Current XSL for getting elements
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="text"/>
<xsl:strip-space elements="*"/>
<xsl:template match="node()">
<xsl:for-each select="node()[text() != '']">
<xsl:value-of select="local-name()"/>
<xsl:text>
</xsl:text>
</xsl:for-each>
<xsl:apply-templates select="node()"/>
</xsl:template>
</xsl:stylesheet>
My WIP xml for inserting the result xml tags is below. I'm unsure how to insert the results of the above xsl into this,
<xsl:template match="Element">
<xsl:copy-of select="."/>
<Element>Value1</Element>
</xsl:template>
And ultimate desired output:
<?xml version="1.0" encoding="UTF-8"?>
<Request>
<Header>
<Line1>Element1</Line1>
<Line2>Element2</Line2>
</Header>
<ElementControl>
<Update>
<Element>Identifier</Element>
<Element>Gender</Element>
<Element>Title</Element>
<Element>Name</Element>
<Element>AddressType</Element>
<Element>Line1</Element>
<Element>Suburb</Element>
<Element>State</Element>
<Element>PostCode</Element>
<Element>Country</Element>
</Update>
</ElementControl>
<Member>
<Identifier>123456789</Identifier>
<Contact>
<Person>
<Gender>MALE</Gender>
<Title>Mr</Title>
<Name>JOHN DOE</Name>
</Person>
<HomePhone/>
<eMailAddress/>
<ContactAddresses>
<Address>
<AddressType>POS</AddressType>
<Line1>100 Fake Street</Line1>
<Line2/>
<Line3/>
<Line4/>
<Suburb>Jupiter</Suburb>
<State>OTH</State>
<PostCode>9999</PostCode>
<Country>AUS</Country>
</Address>
</ContactAddresses>
</Contact>
</Member>
</Request>
I would change the current template to use mode attribute, so it is only used in specific cases, rather than matching all elements. You should also change it to output elements, not text, like so:
<xsl:template match="node()" mode="copy">
<xsl:for-each select=".//node()[text() != '']">
<Element>
<xsl:value-of select="local-name()"/>
</Element>
</xsl:for-each>
</xsl:template>
Then you can call it like this....
<xsl:template match="ElementControl/Update">
<xsl:apply-templates select="../../Member" mode="copy" />
</xsl:template>
Try this XSLT. Note the use of the identity template to copy all other existing elements unchanged
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="node()" mode="copy">
<xsl:for-each select=".//node()[text() != '']">
<Element>
<xsl:value-of select="local-name()"/>
</Element>
</xsl:for-each>
</xsl:template>
<xsl:template match="ElementControl/Update">
<xsl:apply-templates select="../../Member" mode="copy" />
</xsl:template>
<xsl:template match="#*|node()">
<xsl:copy>
<xsl:apply-templates select="#*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>

XSLT: remove empty elements under specific node

input
<person>
<address>
<city>NY</city>
<state></state>
<country>US</country>
</address>
<other>
<gender></gender>
<age>22</age>
<weight/>
</other>
</person>
i only want to remove empty elements from the 'other' node, also the tags under 'other' are not fixed.
output
<person>
<address>
<city>NY</city>
<state></state>
<country>US</country>
</address>
<other>
<age>22</age>
</other>
</person>
I'm new to xslt so pls help..
This transformation:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="node()|#*">
<xsl:copy>
<xsl:apply-templates select="node()|#*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="other/*[not(node())]"/>
</xsl:stylesheet>
when applied on the provided XML document:
<person>
<address>
<city>NY</city>
<state></state>
<country>US</country>
</address>
<other>
<gender></gender>
<age>22</age>
<weight/>
</other>
</person>
produces the wanted, correct result:
<person>
<address>
<city>NY</city>
<state/>
<country>US</country>
</address>
<other>
<age>22</age>
</other>
</person>
Explanation:
The identity rule copies "as-is" every matched node, for which it is selected for execution.
The only template that overrides the identity templates matches any element that is a child of other and has no children nodes (is empty). As this template has no body, this effectively "deletes" the matched element.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml"/>
<xsl:template match="/">
<xsl:apply-templates select="person"/>
</xsl:template>
<xsl:template match="person">
<person>
<xsl:copy-of select="address"/>
<xsl:apply-templates select="other"/>
</person>
</xsl:template>
<xsl:template match="other">
<xsl:for-each select="child::*">
<xsl:if test=".!=''">
<xsl:copy-of select="."/>
</xsl:if>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>

XSLT: Adding a node!

How do I encapsulate nodes around my XML blocks using XSLT?
For example, I have the following XML file.
<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes" omit-xml-declaration="yes" />
<xsl:template match="/">
<Root>
<VOBaseCollection>
<xsl:apply-templates select="Root/Location" />
</VOBaseCollection>
</Root>
</xsl:template>
<xsl:template match="#*|node()">
<xsl:copy>
<xsl:apply-templates select="#*|node()" />
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
My input XML file looks like this.
<Root>
<Location><Name>Pennsylvania</Name><Type>State</Type></Location>
</Root>
I wish the output to look like this.
<Root><Container>
<Location><Name>Pennsylvania</Name><Type>State</Type></Location>
</Container>
</Root>
I wish to make sure that a node called <CONTAINER> gets applied every time, it copies over information from Root/Location. What changes do I need to do to my XSLT file?
Summarizing all the answers in comments, this:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="#*|node()" name="identity">
<xsl:copy>
<xsl:apply-templates select="#*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="Location">
<Container>
<xsl:call-template name="identity"/>
</Container>
</xsl:template>
</xsl:stylesheet>
Result:
<Root>
<Container>
<Location>
<Name>Pennsylvania</Name>
<Type>State</Type>
</Location>
</Container>
</Root>
I am just guessing, and in guess mode it seems that you want this:
EDIT: helped by another guess by Mads Hansen...
Add this to the identity template you already have:
<xsl:template match="Location">
<CONTAINER><xsl:apply-templates/></CONTAINER>
</xsl:template>