XSLT: output into multiple XML files based on grouping

Assume you have the XML below. The goal is to group the Person elements by FirstName and export them into separate XML files. Each output file should contain at most X distinct FirstName values.
Below is an example of the desired transformation with X = 3.
XML input:
<People>
<Person>
<FirstName>John</FirstName>
<LastName>Doe</LastName>
</Person>
<Person>
<FirstName>Jack</FirstName>
<LastName>White</LastName>
</Person>
<Person>
<FirstName>Mark</FirstName>
<LastName>Wall</LastName>
</Person>
<Person>
<FirstName>John</FirstName>
<LastName>Ding</LastName>
</Person>
<Person>
<FirstName>Cyrus</FirstName>
<LastName>Ding</LastName>
</Person>
<Person>
<FirstName>Megan</FirstName>
<LastName>Boing</LastName>
</Person>
</People>
XML output 1, with 3 distinct FirstName values:
<People>
<Person>
<FirstName>John</FirstName>
<LastName>Doe</LastName>
</Person>
<Person>
<FirstName>John</FirstName>
<LastName>Ding</LastName>
</Person>
<Person>
<FirstName>Jack</FirstName>
<LastName>White</LastName>
</Person>
<Person>
<FirstName>Mark</FirstName>
<LastName>Wall</LastName>
</Person>
</People>
XML output 2, with the 2 remaining FirstName values:
<People>
<Person>
<FirstName>Cyrus</FirstName>
<LastName>Ding</LastName>
</Person>
<Person>
<FirstName>Megan</FirstName>
<LastName>Boing</LastName>
</Person>
</People>
It seems to me that Muenchian grouping can be used along with a multiple-document output instruction to produce the output files. However, the core question is: where can we set a threshold on the number of distinct FirstName values before switching to a new file?

Here is an example of doing it in two steps with XSLT 2.0:
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="xs"
version="2.0">
<xsl:param name="n" as="xs:integer" select="3"/>
<xsl:output method="xml" indent="yes"/>
<xsl:template match="People">
<xsl:variable name="groups" as="element(group)*">
<xsl:for-each-group select="Person" group-by="FirstName">
<group>
<xsl:copy-of select="current-group()"/>
</group>
</xsl:for-each-group>
</xsl:variable>
<xsl:for-each-group select="$groups" group-by="(position() - 1) idiv $n">
<xsl:result-document href="group{position()}.xml">
<People>
<xsl:copy-of select="current-group()"/>
</People>
</xsl:result-document>
</xsl:for-each-group>
</xsl:template>
</xsl:stylesheet>
I might try to convert to XSLT 1.0 and EXSLT later.
[edit]
Here is an attempt to translate into XSLT 1.0 and EXSLT:
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:exsl="http://exslt.org/common"
extension-element-prefixes="exsl"
exclude-result-prefixes="exsl"
version="1.0">
<xsl:param name="n" select="3"/>
<xsl:output method="xml" indent="yes"/>
<xsl:key name="person-by-firstname"
match="Person"
use="FirstName"/>
<xsl:template match="People">
<xsl:variable name="groups">
<xsl:for-each select="Person[generate-id() = generate-id(key('person-by-firstname', FirstName)[1])]">
<group>
<xsl:copy-of select="key('person-by-firstname', FirstName)"/>
</group>
</xsl:for-each>
</xsl:variable>
<xsl:for-each select="exsl:node-set($groups)/group[(position() - 1) mod $n = 0]">
<exsl:document href="groupTest{position()}.xml">
<People>
<xsl:copy-of select="Person | following-sibling::group[position() &lt; $n]/Person"/>
</People>
</exsl:document>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
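To see where the threshold comes from, it can help to print the file index that each distinct FirstName would be assigned to. The following is a minimal sketch, assuming the same People/Person/FirstName input and the same parameter n as above. It writes no files and only lists the mapping as text; the "file" label in the output is purely illustrative.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- n: maximum number of distinct FirstName values per file -->
  <xsl:param name="n" select="3"/>
  <xsl:output method="text"/>
  <xsl:key name="person-by-firstname" match="Person" use="FirstName"/>
  <xsl:template match="People">
    <!-- Muenchian grouping: visit the first Person of each FirstName -->
    <xsl:for-each select="Person[generate-id() = generate-id(key('person-by-firstname', FirstName)[1])]">
      <xsl:value-of select="FirstName"/>
      <xsl:text> -> file </xsl:text>
      <!-- 1-based chunk index: groups 1..n go to file 1, n+1..2n to file 2 -->
      <xsl:value-of select="floor((position() - 1) div $n) + 1"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

With the sample input and n = 3, this should list John, Jack and Mark under file 1, and Cyrus and Megan under file 2, which matches the desired split.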

Related

Is it possible to match attribute from another element and retrieve its content?

When I'm in <xsl:template match="listOfPerson/person">, for the person with id "A", is it possible to retrieve the information stored in another element (here, inside the data element)?
xml :
<root>
<data>
<person id="A">
<name> Anna </name>
<age> 1 </age>
</person>
<person id="B">
<name> Banana </name>
<age> 1 </age>
</person>
</data>
<listOfPerson>
<person>
<id>A</id>
</person>
<person>
<id>B</id>
</person>
</listOfPerson>
</root>
my current xsl :
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" indent="yes" />
<xsl:template match="root">
<xsl:apply-templates select="listOfPerson/person"/>
</xsl:template>
<xsl:template match="listOfPerson/person">
<xsl:value-of select="."/>
</xsl:template>
</xsl:stylesheet>
current output :
A
B
desired output :
Anna 1
Banana 1
XSLT has a built-in key mechanism for resolving cross-references. Consider the following example:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" encoding="UTF-8"/>
<xsl:key name="person" match="data/person" use="@id" />
<xsl:template match="/root">
<xsl:for-each select="listOfPerson/person">
<xsl:variable name="data" select="key('person', id)" />
<xsl:value-of select="$data/name" />
<xsl:text> </xsl:text>
<xsl:value-of select="$data/age" />
<xsl:text>
</xsl:text>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
Applied to your input example, the result will be:
Anna 1
Banana 1
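Since the question starts from a template matching listOfPerson/person, the same key lookup can also be done inside that template. This is a sketch under the assumption that the input looks exactly like the sample; normalize-space() is added here because the sample name and age values are padded with spaces, and it is not part of the answer above.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" encoding="UTF-8"/>
  <!-- Index the data records by their id attribute -->
  <xsl:key name="person" match="data/person" use="@id"/>
  <xsl:template match="/root">
    <xsl:apply-templates select="listOfPerson/person"/>
  </xsl:template>
  <xsl:template match="listOfPerson/person">
    <!-- Look up the data record whose @id equals this person's id child -->
    <xsl:variable name="data" select="key('person', id)"/>
    <xsl:value-of select="normalize-space($data/name)"/>
    <xsl:text> </xsl:text>
    <xsl:value-of select="normalize-space($data/age)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```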

How to find node value with longest string length

Here is my XML:
<persons>
<person>
<name>Jason</name>
</person>
<person>
<name>John</name>
</person>
<person>
<name>Mary</name>
</person>
<person>
<name>Jennifer</name>
</person>
</persons>
Using XSLT 1.0 I need to find the person with the longest name. What is the best way to do this?
Try:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/persons">
<xsl:for-each select="person">
<xsl:sort select="string-length(name)" data-type="number" order="ascending"/>
<xsl:if test="position()=last()">
<xsl:copy-of select="name"/>
</xsl:if>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
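The stylesheet above returns a single name, and when several names share the maximum length it picks the last of them in document order. If all of the longest names are wanted, one possible variation is to compute the maximum length first and then select every match. This is only a sketch; the longest wrapper element is a hypothetical name chosen for illustration.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes" indent="yes"/>
  <xsl:template match="/persons">
    <!-- Longest length: sort descending and keep the first value -->
    <xsl:variable name="max">
      <xsl:for-each select="person">
        <xsl:sort select="string-length(name)" data-type="number" order="descending"/>
        <xsl:if test="position() = 1">
          <xsl:value-of select="string-length(name)"/>
        </xsl:if>
      </xsl:for-each>
    </xsl:variable>
    <!-- "longest" is a hypothetical wrapper element -->
    <longest>
      <xsl:copy-of select="person[string-length(name) = number($max)]/name"/>
    </longest>
  </xsl:template>
</xsl:stylesheet>
```

For the sample input this should copy only the Jennifer name element, since no other name has eight characters.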

XSLT: remove empty elements under specific node

input
<person>
<address>
<city>NY</city>
<state></state>
<country>US</country>
</address>
<other>
<gender></gender>
<age>22</age>
<weight/>
</other>
</person>
I only want to remove empty elements from the 'other' node. Also, the tags under 'other' are not fixed.
output
<person>
<address>
<city>NY</city>
<state></state>
<country>US</country>
</address>
<other>
<age>22</age>
</other>
</person>
I'm new to XSLT, so please help.
This transformation:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="other/*[not(node())]"/>
</xsl:stylesheet>
when applied on the provided XML document:
<person>
<address>
<city>NY</city>
<state></state>
<country>US</country>
</address>
<other>
<gender></gender>
<age>22</age>
<weight/>
</other>
</person>
produces the wanted, correct result:
<person>
<address>
<city>NY</city>
<state/>
<country>US</country>
</address>
<other>
<age>22</age>
</other>
</person>
Explanation:
The identity rule copies every node it is applied to "as-is".
The only template that overrides the identity rule matches any element that is a child of other and has no child nodes (that is, it is empty). Because this template has no body, it effectively "deletes" the matched element.
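If the empty elements can sit deeper than the direct children of other, or contain only whitespace, the overriding pattern can be widened. The following is a sketch under the assumption that "empty" means no child elements and no non-whitespace text; note that it would also drop elements that carry only attributes.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>
  <!-- Identity rule: copy everything unless a more specific template matches -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>
  <!-- Assumption: empty = no child elements and no non-whitespace text;
       matches descendants of other at any depth -->
  <xsl:template match="other//*[not(*) and not(normalize-space())]"/>
</xsl:stylesheet>
```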
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml"/>
<xsl:template match="/">
<xsl:apply-templates select="person"/>
</xsl:template>
<xsl:template match="person">
<person>
<xsl:copy-of select="address"/>
<xsl:apply-templates select="other"/>
</person>
</xsl:template>
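<xsl:template match="other">
<!-- Keep the other wrapper so the output matches the desired structure -->
<other>
<xsl:for-each select="child::*">
<!-- Copy only children whose string value is non-empty -->
<xsl:if test=".!=''">
<xsl:copy-of select="."/>
</xsl:if>
</xsl:for-each>
</other>
</xsl:template>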
</xsl:stylesheet>

xslt following group

Old Source XML:
<Employees>
<Person>
<FirstName>Joy</FirstName>
<IsManager>N</IsManager>
</Person>
<Person>
<FirstName>Joyce</FirstName>
<IsManager>N</IsManager>
</Person>
<Person>
<FirstName>Joe</FirstName>
<IsManager>Y</IsManager>
</Person>...
</Employees>
New Source XML:
<Employees>
<Person>
<FirstName>Joy</FirstName>
<DetailsArray>
<Details1>
<IsManager>N</IsManager>
<IsSuperviser>N</IsSuperviser>
</Details1>
<Details2>
<IsManager>N</IsManager>
<IsSuperviser>N</IsSuperviser>
</Details2>
</DetailsArray>
</Person>
<Person>
<FirstName>Joyce</FirstName>
<DetailsArray>
<Details1>
<IsManager>N</IsManager>
<IsSuperviser>N</IsSuperviser>
</Details1>
<Details2>
<IsManager>N</IsManager>
<IsSuperviser>N</IsSuperviser>
</Details2>
</DetailsArray>
</Person>
<Person>
<FirstName>Joe</FirstName>
<DetailsArray>
<Details1>
<IsManager>N</IsManager>
<IsSuperviser>N</IsSuperviser>
</Details1>
<Details2>
<IsManager>Y</IsManager>
<IsSuperviser>N</IsSuperviser>
</Details2>
</DetailsArray>
</Person>...
</Employees>
output should be:
<Names>
<Name num='1'>Joe</Name>
<Name num='2'>Joy</Name>
<Name num='3'>Joyce</Name>
....
</Names>
This source XML differs from the previous one. The new condition is that a person may be linked to two projects or two tasks, so the output must start from the person with IsManager='Y', even when that value appears only in the Details2 element of DetailsArray. The output must not contain duplicate names; if we simply sort, the names will be duplicated.
Thanks for the previous answers.
EDIT. As lwburk points out, the original solution of this answer just sorts the nodes by IsManager.
Here is a solution that finds the first manager, prints it out, then cycles through the remaining people (cycling back to the beginning, if needed).
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="Employees">
<xsl:variable name="position" select="count(Person) - count(Person/IsManager[. = 'Y'][1]/../following-sibling::*)" />
<xsl:call-template name="person">
<xsl:with-param name="name" select="Person/IsManager[. = 'Y'][1]/../FirstName" />
<xsl:with-param name="position" select="'1'" />
</xsl:call-template>
<xsl:for-each select="Person[position() > $position]">
<xsl:call-template name="person" />
</xsl:for-each>
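<xsl:for-each select="Person[position() &lt; $position]">
<xsl:call-template name="person">
<!-- Continue numbering after the manager and the people that follow it -->
<xsl:with-param name="position" select="position() + 1 + count(../Person) - $position" />
</xsl:call-template>
</xsl:for-each>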
</xsl:template>
<xsl:template name="person">
<xsl:param name="name" select="FirstName" />
<xsl:param name="position" select="position() + 1" />
<Name>
<xsl:attribute name="num"><xsl:value-of select="$position" /></xsl:attribute>
<xsl:value-of select="$name" />
</Name>
</xsl:template>
</xsl:stylesheet>
Old answer.
I'm not sure about your question, but I think you want to get all the names starting from the person with IsManager = Y. You can use <xsl:sort> by the IsManager value. Don't forget to specify "descending" in the attribute "order" (otherwise, the person with IsManager = Y will be the last one).
I wrote an example that works with your input data:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="Employees">
<xsl:for-each select="Person">
<xsl:sort select="IsManager" order="descending" />
<Name>
<xsl:attribute name="num">
<xsl:value-of select="position()" />
</xsl:attribute>
<xsl:value-of select="FirstName" />
</Name>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
This short and simple transformation (no modes, no variables, and only three templates):
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="/">
<Names>
<xsl:apply-templates select="*/Person[IsManager='Y'][1]"/>
</Names>
</xsl:template>
<xsl:template match="Person[IsManager='Y']">
<xsl:apply-templates select=
"FirstName |../Person[not(generate-id()=generate-id(current()))]
/FirstName
">
<xsl:sort select=
"generate-id(..) = generate-id(/*/*[IsManager = 'Y'][1])"
order="descending"/>
<xsl:sort select=
"boolean(../preceding-sibling::Person[IsManager='Y'])"
order="descending"/>
</xsl:apply-templates>
</xsl:template>
<xsl:template match="FirstName">
<Name num="{position()}"><xsl:value-of select="."/></Name>
</xsl:template>
</xsl:stylesheet>
when applied on the following XML (the same one as provided by @lwburk):
<Employees>
<Person>
<FirstName>Joy</FirstName>
<IsManager>N</IsManager>
</Person>
<Person>
<FirstName>Joyce</FirstName>
<IsManager>N</IsManager>
</Person>
<Person>
<FirstName>Joe</FirstName>
<IsManager>Y</IsManager>
</Person>
<Person>
<FirstName>Professor X</FirstName>
<IsManager>N</IsManager>
</Person>
<Person>
<FirstName>Songey</FirstName>
<IsManager>Y</IsManager>
</Person>
</Employees>
produces the wanted, correct result:
<Names>
<Name num="1">Joe</Name>
<Name num="2">Professor X</Name>
<Name num="3">Songey</Name>
<Name num="4">Joy</Name>
<Name num="5">Joyce</Name>
</Names>
Explanation:
This is a typical case of sorting using multiple keys.
The highest-priority sort criterion is whether the parent Person is the first manager.
The second-priority sort criterion is whether the parent Person follows a manager.
Both keys are booleans, and false sorts before true, so both sorts use descending order to put the true cases first.
It sounds like you're trying to start at the first manager and then process all Person elements in order, cycling back around to the beginning to pick up the elements before the partition element.
The following stylesheet achieves the desired result:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<xsl:apply-templates select="Employees/Person"/>
</xsl:template>
<xsl:template match="Person[IsManager='Y'][1]">
<Name num="1">
<xsl:apply-templates select="FirstName"/>
</Name>
<!-- partition -->
<xsl:apply-templates select="following-sibling::Person" mode="after"/>
<xsl:apply-templates select="../Person" mode="before">
<xsl:with-param name="pos" select="last() - position() + 1"/>
</xsl:apply-templates>
</xsl:template>
<xsl:template match="Person" mode="after">
<Name num="{position() + 1}">
<xsl:apply-templates select="FirstName"/>
</Name>
</xsl:template>
<xsl:template match="Person[not(IsManager='Y') and
not(preceding-sibling::Person[IsManager='Y'])]" mode="before">
<xsl:param name="pos" select="0"/>
<Name num="{position() + $pos}">
<xsl:apply-templates select="FirstName"/>
</Name>
</xsl:template>
<xsl:template match="Person"/>
<xsl:template match="Person" mode="before"/>
</xsl:stylesheet>
Note: 1) This solution requires at least one manager in the source; 2) it might not be very efficient, because it makes multiple passes and uses preceding-sibling to test group membership (for elements before the partition element).
Example input:
<Employees>
<Person>
<FirstName>Joy</FirstName>
<IsManager>N</IsManager>
</Person>
<Person>
<FirstName>Joyce</FirstName>
<IsManager>N</IsManager>
</Person>
<Person>
<FirstName>Joe</FirstName>
<IsManager>Y</IsManager>
</Person>
<Person>
<FirstName>Professor X</FirstName>
<IsManager>N</IsManager>
</Person>
<Person>
<FirstName>Songey</FirstName>
<IsManager>Y</IsManager>
</Person>
</Employees>
Output:
<Name num="1">Joe</Name>
<Name num="2">Professor X</Name>
<Name num="3">Songey</Name>
<Name num="4">Joy</Name>
<Name num="5">Joyce</Name>
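The stylesheets above test IsManager as a direct child of Person, which is the old layout. For the new layout, where IsManager sits inside DetailsArray/Details1, DetailsArray/Details2 and so on, the same partition idea could be sketched as follows. This assumes that the elided parts of the new source follow the same pattern and that any Details child with IsManager = 'Y' makes the person a manager. Each Person is visited once, so names are not duplicated.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:template match="/Employees">
    <!-- First Person with IsManager = 'Y' in any child of DetailsArray -->
    <xsl:variable name="first" select="Person[DetailsArray/*/IsManager = 'Y'][1]"/>
    <!-- The first manager plus everyone after it, in document order -->
    <xsl:variable name="tail" select="$first | $first/following-sibling::Person"/>
    <Names>
      <xsl:for-each select="$tail">
        <Name num="{position()}"><xsl:value-of select="FirstName"/></Name>
      </xsl:for-each>
      <!-- Wrap around: the people before the first manager continue the numbering -->
      <xsl:for-each select="$first/preceding-sibling::Person">
        <Name num="{position() + count($tail)}"><xsl:value-of select="FirstName"/></Name>
      </xsl:for-each>
    </Names>
  </xsl:template>
</xsl:stylesheet>
```

If no Person qualifies as a manager, this sketch emits an empty Names element; whether that is acceptable depends on the real data.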

How to use XSLT to convert a simple piece of XML

How can you convert
<person>
<personFirstName>FirstName</personFirstName>
<personLastName>LastName</personLastName>
<personAge>40</personAge>
</person>
to
<person>
<name>
<first>FirstName</first>
<last>LastName</last>
</name>
<age>40</age>
</person>
using XSLT? It should also work when the input XML is a collection of person nodes, like so:
<persons>
<person>
...
</person>
</persons>
It should be very easy. You can try to:
match person then open name, apply templates, close name, open age, get value from personAge, close age
match personFirstName, open first, get value, close first
same as personFirstName for personLastName
I think 3 templates without loops should be enough. Try it!
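The steps described above could be sketched roughly like this. It is one possible reading of the suggestion, not the only one. The extra persons template is an addition for the collection case, so that the wrapper element is kept; with a single person document it is simply never matched.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>
  <!-- Addition for the collection case: keep the persons wrapper -->
  <xsl:template match="persons">
    <persons><xsl:apply-templates select="person"/></persons>
  </xsl:template>
  <!-- Open name, apply templates, close name, then emit age -->
  <xsl:template match="person">
    <person>
      <name>
        <xsl:apply-templates select="personFirstName | personLastName"/>
      </name>
      <age><xsl:value-of select="personAge"/></age>
    </person>
  </xsl:template>
  <xsl:template match="personFirstName">
    <first><xsl:value-of select="."/></first>
  </xsl:template>
  <xsl:template match="personLastName">
    <last><xsl:value-of select="."/></last>
  </xsl:template>
</xsl:stylesheet>
```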
The key is the identity transform and overriding it when needed.
Sample XML
<persons>
<person>
<personFirstName>FirstName</personFirstName>
<personLastName>LastName</personLastName>
<personAge>40</personAge>
</person>
<person>
<personFirstName>FirstName2</personFirstName>
<personLastName>LastName2</personLastName>
<personAge>100</personAge>
</person>
</persons>
Sample XSLT
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<!--Identity Transform-->
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="person">
<person>
<name>
<first><xsl:apply-templates select="personFirstName"/></first>
<last><xsl:apply-templates select="personLastName"/></last>
</name>
<age><xsl:apply-templates select="personAge"/></age>
</person>
</xsl:template>
<xsl:template match="personFirstName|personLastName|personAge">
<xsl:apply-templates/>
</xsl:template>
</xsl:stylesheet>
OUTPUT
<persons>
<person>
<name>
<first>FirstName</first>
<last>LastName</last>
</name>
<age>40</age>
</person>
<person>
<name>
<first>FirstName2</first>
<last>LastName2</last>
</name>
<age>100</age>
</person>
</persons>
A "push-style" solution:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="personFirstName">
<name>
<xsl:apply-templates mode="renameWrapped"
select=".|../personLastName"/>
</name>
</xsl:template>
<xsl:template match="personFirstName" mode="renameWrapped">
<first><xsl:apply-templates/></first>
</xsl:template>
<xsl:template match="personLastName" mode="renameWrapped">
<last><xsl:apply-templates/></last>
</xsl:template>
<xsl:template match="personAge">
<age><xsl:apply-templates/></age>
</xsl:template>
<xsl:template match="personLastName"/>
</xsl:stylesheet>
when applied on this XML document:
<persons>
<person>
<personFirstName>FirstName</personFirstName>
<personLastName>LastName</personLastName>
<personAge>40</personAge>
</person>
<person>
<personFirstName>FirstName2</personFirstName>
<personLastName>LastName2</personLastName>
<personAge>100</personAge>
</person>
</persons>
the wanted, correct result is produced:
<persons>
<person>
<name>
<first>FirstName</first>
<last>LastName</last>
</name>
<age>40</age>
</person>
<person>
<name>
<first>FirstName2</first>
<last>LastName2</last>
</name>
<age>100</age>
</person>
</persons>
Explanation:
Using and overriding the identity rule/template for wrapping and renaming of elements.
The elements to be wrapped are renamed in mode renameWrapped.
The personAge element is renamed in a non-moded template that overrides the identity rule for elements named personAge.