Remove unused elements from XML schema using XSLT

I'm looking for a way (if it's even possible) of using an XSL transform of an XSD document to remove unused elements. This comes up a lot in my job where a company will define an XSD with absolutely everything in it, but then they will want to create a cut-down version for a single root element within it.
To explain further, I might have an XSD like the following:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" attributeFormDefault="unqualified">
<xs:element name="RootElement">
<xs:complexType>
<xs:sequence>
<xs:element ref="ChildElement"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="ChildElement"/>
<xs:element name="UnusedElement"/>
</xs:schema>
What I would like to be able to do is to set up an XSL where I provide the starting element (in this case RootElement) and it will copy over all dependent elements but omit the unused ones. In the above example, if I passed in RootElement I'd expect to see RootElement and ChildElement included but UnusedElement omitted.
(When I say "provide the starting element", I'm quite happy to crack open the stylesheet and type xsl:template match="RootElement" where required.)
This would obviously have to be recursive, so would navigate the entire structure defined below the starting element, and any element in that schema that was not used would be discarded.
(Of course, it would be even better if it could do the same in any imported schemas!)
I've searched Google extensively and can't find anything on this, so I'm not sure whether it's even possible.
Thanks!
Edit: Actually I probably should clarify and say that I would like to remove unused elements AND types, so it would follow both ref="childElement" and type="someType" links.

This transformation:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema" >
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<!-- name of the top-level element to start from -->
<xsl:param name="ptopElementName" select="'RootElement'"/>
<xsl:variable name="vTop" select=
"/*/xs:element[@name=$ptopElementName]"/>
<!-- every name, ref and type attribute found inside the starting element -->
<xsl:variable name="vNames"
select="$vTop/descendant-or-self::*/@name"/>
<xsl:variable name="vRefs"
select="$vTop/descendant-or-self::*/@ref"/>
<xsl:variable name="vTypes"
select="$vTop/descendant-or-self::*/@type"/>
<!-- identity rule: copy everything as is -->
<xsl:template match="node()|@*" name="identity">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<!-- copy an element declaration only if it is named or referenced there -->
<xsl:template match="xs:element">
<xsl:if test=
"@name=$vNames
or
@name=$vRefs
or
ancestor-or-self::*[@name=$ptopElementName]">
<xsl:call-template name="identity"/>
</xsl:if>
</xsl:template>
<!-- copy a type definition only if it is used there -->
<xsl:template match="xs:complexType|xs:simpleType">
<xsl:if test=
"@name=$vTypes
or
ancestor-or-self::*[@name=$ptopElementName]">
<xsl:call-template name="identity"/>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
when applied on the provided XML document:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" attributeFormDefault="unqualified">
<xs:element name="RootElement">
<xs:complexType>
<xs:sequence>
<xs:element ref="ChildElement"/>
</xs:sequence>
</xs:complexType></xs:element>
<xs:element name="ChildElement"/>
<xs:element name="UnusedElement"/>
</xs:schema>
produces the wanted, correct result:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" attributeFormDefault="unqualified">
<xs:element name="RootElement">
<xs:complexType>
<xs:sequence>
<xs:element ref="ChildElement"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="ChildElement"/>
</xs:schema>
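As far as I can tell, the stylesheet above only follows the ref and type attributes found directly inside the starting element, so a type used only by ChildElement would not be kept. For the fully recursive behaviour the question asks for, the idea is a transitive closure over ref/type links. A minimal sketch of that idea, written in Python rather than XSLT so it is easy to run, might look like the following. The function name prune_schema is hypothetical, prefixes are simply stripped from QNames, elements and types are looked up in one shared name table (XSD actually keeps them in separate symbol spaces), and imported or included schemas are not followed.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
# Top-level components that this sketch is willing to drop.
NAMED = {f"{{{XS}}}element", f"{{{XS}}}complexType", f"{{{XS}}}simpleType"}


def local(qname):
    """Drop an optional prefix: 'exa:t_a' -> 't_a' (assumption: one namespace)."""
    return qname.split(":")[-1]


def prune_schema(xsd_text, root_name):
    """Return the schema element with only components reachable from root_name."""
    schema = ET.fromstring(xsd_text)

    # Index the top-level named components by name.
    top = {}
    for child in schema:
        if child.tag in NAMED and child.get("name"):
            top.setdefault(child.get("name"), []).append(child)

    # Worklist traversal: follow ref, type and base links until nothing new appears.
    keep, todo = set(), [root_name]
    while todo:
        name = todo.pop()
        if name in keep or name not in top:
            continue  # already handled, or a built-in such as xs:string
        keep.add(name)
        for component in top[name]:
            for node in component.iter():
                for attr in ("ref", "type", "base"):
                    if node.get(attr):
                        todo.append(local(node.get(attr)))

    # Remove every top-level named component that was never reached.
    for child in list(schema):
        if child.tag in NAMED and child.get("name") not in keep:
            schema.remove(child)
    return schema
```

Calling prune_schema(xsd_text, "RootElement") on the schema from the question should keep RootElement and ChildElement and drop UnusedElement; ET.tostring() can then serialize the result. The same worklist could be expressed in XSLT with a recursive named template that carries the set of names visited so far.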

Related

mapping dynamic xml nodes to static nodes

I am looking for someone to assist in solving a seemingly simple problem.
I want to map a node of /fields[x]/message_id to a static node /MessageID0x for 5 entries in a list.
The source node is optional and may not exist.
The schema is below
I am just not seeing the obvious, I hope.
The source is defined as:
<xs:element name="fields">
<xs:complexType>
<xs:sequence>
<xs:element name="tenant_id" type="xs:normalizedString" minOccurs="0"/>
<xs:element name="message_id" type="xs:normalizedString" minOccurs="0"/>
Target is defined as:
<xs:element name="MessageID01" type="xs:normalizedString" minOccurs="0"/>
<xs:element name="MessageID02" type="xs:normalizedString" minOccurs="0"/>
<xs:element name="MessageID03" type="xs:normalizedString" minOccurs="0"/>
<xs:element name="MessageID04" type="xs:normalizedString" minOccurs="0"/>
<xs:element name="MessageID05" type="xs:normalizedString" minOccurs="0"/>
=== FROM ===========
<root>
<ID>2019Nov12_17</ID>
<PingResult>OK</PingResult>
<StartDateTime>2019-11-12T16:16:01</StartDateTime>
<EndDateTime>2019-11-12T17:16:01.771Z</EndDateTime>
<start>0</start>
<numFound>1</numFound>
<fields>
<tenant_id>KOCHIND_AX2</tenant_id>
<message_id>lid://infor.landmark.lmrkmt/15d8f834-7680-541e-0000-001d5dae3e7b</message_id>
</fields>
<fields>
<tenant_id>KOCHIND_AX2</tenant_id>
<message_id>lid://infor.landmark.lmrkmt/0535a86a-7680-1868-0000-07625db833c1</message_id>
</fields>
<fields>
<tenant_id>KOCHIND_AX2</tenant_id>
<message_id>lid://infor.landmark.lmrkmt/0535a86a-7680-1864-0000-03445db849c8</message_id>
</fields>
<fields>
<tenant_id>KOCHIND_AX2</tenant_id>
<message_id>lid://infor.landmark.lmrkmt/0535a86a-7680-1867-0000-01151db125c8</message_id>
</fields>
</root>
TO ===================
<root>
<ID>2019Nov12_17</ID>
<PingResult>OK</PingResult>
<StartDateTime>2019-11-12T16:16:01</StartDateTime>
<EndDateTime>2019-11-12T17:16:01.771Z</EndDateTime>
<start>0</start>
<numFound>1</numFound>
<MessageID01>lid://infor.landmark.lmrkmt/15d8f834-7680-541e-0000-001d5dae3e7b</MessageID01>
<MessageID02>lid://infor.landmark.lmrkmt/0535a86a-7680-1868-0000-07625db833c1</MessageID02>
<MessageID03>lid://infor.landmark.lmrkmt/0535a86a-7680-1868-0000-07625db833c8</MessageID03>
<MessageID04>lid://infor.landmark.lmrkmt/15d8f834-1864—3322-0000-03445db125c8</MessageID04>
<MessageID05>lid://infor.landmark.lmrkmt/15d8f834-7680-1867-0000-01151db125g4</MessageID05>
</root>
I believe the requested result can be achieved rather simply by doing:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<!-- identity transform -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="fields">
<xsl:variable name="n">
<xsl:number format="01"/>
</xsl:variable>
<xsl:element name="MessageID{$n}">
<xsl:value-of select="message_id"/>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
The result of transforming the example input will be:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<ID>2019Nov12_17</ID>
<PingResult>OK</PingResult>
<StartDateTime>2019-11-12T16:16:01</StartDateTime>
<EndDateTime>2019-11-12T17:16:01.771Z</EndDateTime>
<start>0</start>
<numFound>1</numFound>
<MessageID01>lid://infor.landmark.lmrkmt/15d8f834-7680-541e-0000-001d5dae3e7b</MessageID01>
<MessageID02>lid://infor.landmark.lmrkmt/0535a86a-7680-1868-0000-07625db833c1</MessageID02>
<MessageID03>lid://infor.landmark.lmrkmt/0535a86a-7680-1864-0000-03445db849c8</MessageID03>
<MessageID04>lid://infor.landmark.lmrkmt/0535a86a-7680-1867-0000-01151db125c8</MessageID04>
</root>
which is different from the result you show - nevertheless, I suspect it is the correct one.
Do note that numbering sibling nodes by name is bad practice. It makes subsequent transformations much more difficult. If you need a number (although I don't see why you should), use an attribute.
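To make the numbering rule explicit: each fields element is replaced by an element whose name carries its position among the fields siblings, zero-padded to two digits, which is what xsl:number with format="01" does here. A rough Python equivalent, given only as an illustration (the function name flatten_fields is made up, and it assumes all fields elements are direct children of the root), could be:

```python
import xml.etree.ElementTree as ET


def flatten_fields(xml_text):
    """Replace each <fields> child of the root with <MessageIDnn>, numbered by position."""
    root = ET.fromstring(xml_text)
    count = 0
    for index, child in enumerate(list(root)):
        if child.tag != "fields":
            continue  # every other child is kept unchanged
        count += 1
        replacement = ET.Element(f"MessageID{count:02d}")
        # message_id is optional; fall back to an empty string when it is missing
        replacement.text = child.findtext("message_id", "")
        root[index] = replacement
    return root
```

With four fields elements in the input this yields MessageID01 through MessageID04 and nothing for a fifth, matching the result shown above.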

How to prevent double elements when merging include files in xsd?

I am working on a stylesheet that processes some XSDs.
The main XSD file includes 2 others. Of those 2 one also includes the other.
All XSDs have the same attributes and namespaces in the root element. The files are only separate for maintenance purposes.
The stylesheet:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet
version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:exa="http://www.example.com/"
exclude-result-prefixes="xsl xs exa">
<xsl:output method="xml" indent="yes" encoding="iso-8859-1" />
<!-- global variable to merge schema with its includes
to be used for further processing the schema -->
<xsl:variable name="with_includes">
<xsl:apply-templates select="/xs:schema" mode="include"/>
</xsl:variable>
<!-- copy the main schema root element including attributes
then process all nodes in it -->
<xsl:template match="xs:schema" mode="include">
<xsl:copy>
<xsl:copy-of select="@*"/>
<xsl:apply-templates select="node()" mode="include"/>
</xsl:copy>
</xsl:template>
<!-- all schemas have the same namespaces and targetnamespace defined
so do not copy namespaces -->
<xsl:template match="node()" mode="include">
<xsl:copy-of select="." copy-namespaces="no"/>
</xsl:template>
<!-- when matching an include, process all the nodes in the schema -->
<xsl:template match="xs:include" mode="include">
<xsl:apply-templates select="doc(@schemaLocation)/xs:schema/node()" mode="include"/>
</xsl:template>
<!-- only here to show the result -->
<xsl:template match="/xs:schema">
<xsl:copy-of select="$with_includes"/>
</xsl:template>
</xsl:stylesheet>
Some very basic example schemas to demonstrate the problem:
Schema A.xsd:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xs:schema
attributeFormDefault="unqualified"
elementFormDefault="qualified"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.example.com/example"
xmlns:exa="http://www.example.com/example">
<xs:include schemaLocation="B.xsd"/>
<xs:include schemaLocation="C.xsd"/>
<xs:element name="a" type="exa:t_a"/>
<xs:complexType name="t_a">
<xs:sequence>
<xs:element name="b" type="exa:t_b"/>
<xs:element name="c" type="exa:t_c"/>
</xs:sequence>
</xs:complexType>
</xs:schema>
Schema B.xsd:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xs:schema
attributeFormDefault="unqualified"
elementFormDefault="qualified"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.example.com/example"
xmlns:exa="http://www.example.com/example">
<xs:include schemaLocation="C.xsd"/>
<xs:complexType name="t_b">
<xs:sequence>
<xs:element name="c1" type="exa:t_c"/>
<xs:element name="c2" type="exa:t_c"/>
</xs:sequence>
</xs:complexType>
</xs:schema>
Schema C.xsd
<?xml version="1.0" encoding="ISO-8859-1"?>
<xs:schema
attributeFormDefault="unqualified"
elementFormDefault="qualified"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.example.com/example"
xmlns:exa="http://www.example.com/example">
<xs:simpleType name="t_c">
<xs:restriction base="xs:string">
<xs:minLength value="1"/>
<xs:maxLength value="20"/>
</xs:restriction>
</xs:simpleType>
</xs:schema>
If I use the stylesheet above, I get simpleType t_c twice in the result. I am looking for a way to prevent that.
Btw, I use Saxon.
Deduplicating the includes is relatively straightforward, but it becomes more complex if you need to handle cycles as well. Cycles of xs:include are permitted in XSD, though the 1.0 spec isn't entirely clear on the point. If you're not concerned about cycles, just build a list of all the includes by transitive expansion using a recursive function, preferably calling resolve-uri() to resolve each @schemaLocation against its base URI, then remove duplicates from the list using distinct-values(). If you need to eliminate cycles, you'll need to pass a parameter to your recursive function indicating the route by which the document was reached, and ignore a document if it's already on the list. If you've got a copy of my book, there's an example of cycle detection in the section on xsl:call-template. But you may find the book too expensive ;-(
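The shape of that recursion can be sketched outside XSLT as well. The Python below is only an illustration of the approach described above, not the book's example: collect_includes and the load callback are hypothetical names, urljoin stands in for resolve-uri(), and a single "already seen" list handles both duplicates and cycles.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XS_INCLUDE = "{http://www.w3.org/2001/XMLSchema}include"


def collect_includes(uri, load, seen=None):
    """Return uri plus every schema it includes, transitively, each exactly once.

    load is a caller-supplied function mapping an absolute URI to XSD text.
    """
    if seen is None:
        seen = []
    if uri in seen:
        return seen  # already visited: covers duplicates and cycles alike
    seen.append(uri)
    schema = ET.fromstring(load(uri))
    for inc in schema.findall(XS_INCLUDE):
        location = inc.get("schemaLocation")
        if location:
            # Resolve relative to the including document, as resolve-uri() would
            collect_includes(urljoin(uri, location), load, seen)
    return seen
```

For the A.xsd, B.xsd and C.xsd example, the returned list would contain each of the three documents once, so the components of each could then be copied a single time. In XSLT 2.0 the same thing would be a recursive xsl:function that takes the list of URIs visited so far as a parameter.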

Extracting data from XSD in XSL

I am transforming an XML file that needs to generate some elements based on the valid enumeration options defined in the XSD.
Suppose I have an XSD that declares a type and an element something like this:
<xs:simpleType name="optionType" nillable="true">
<xs:restriction base="xs:string">
<xs:maxLength value="50"/>
<xs:enumeration value="USERCHOICE">
</xs:enumeration>
<xs:enumeration value="DEFAULT">
</xs:enumeration>
</xs:restriction>
</xs:simpleType>
...
<xs:element name="chosenOption" type='optionType'/>
...
<xs:element name="availableOption" type='optionType'/>
The input will only contain the chosen option, so you can imagine it looks like this:
<options>
<chosenOption>USERCHOICE</chosenOption>
</options>
I need to have an output that looks like this:
<options>
<chosenOption>USERCHOICE</chosenOption> <!-- This comes from incoming XML -->
<!-- This must be a list of ALL possible values for this element, as defined in XSD -->
<availableOptions>
<availableOption>USERCHOICE</availableOption>
<availableOption>DEFAULT</availableOption>
</availableOptions>
</options>
Is there a way to have the XSL extract the enumeration values USERCHOICE and DEFAULT from the XSD and produce them in the output?
This will run on WebSphere 6 and will be used by an XSLT 1.0 engine. :(
(The schema file does not change often but it will change now and then and I'd rather only have to update the schema file instead of update the schema file and XSLT)
Here's a prototype that assumes that your input XML and XSD are as simple as the samples above. To be tweaked according to ways in which they may vary. If you need help with that tweaking, let me know.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="xs"
version="1.0">
<xsl:variable name="xsd" select="document('mySchema.xsd')"/>
<xsl:template match="/options">
<xsl:copy>
<xsl:for-each select="*">
<xsl:variable name="eltName" select="local-name()"/>
<xsl:copy-of select="." />
<availableOptions>
<xsl:variable name="optionType"
select="$xsd//xs:element[@name = $eltName]/@type"/>
<xsl:apply-templates
select="$xsd//xs:simpleType[@name = $optionType]/
xs:restriction/xs:enumeration"/>
</availableOptions>
</xsl:for-each>
</xsl:copy>
</xsl:template>
<xsl:template match="xs:enumeration">
<availableOption><xsl:value-of select="@value" /></availableOption>
</xsl:template>
</xsl:stylesheet>
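The lookup the prototype performs is a two-step one: find the element declaration by name, read its type attribute, then collect the xs:enumeration values under the simple type of that name. As a cross-check of that logic, here is a small Python sketch of the same lookup. The function name enumeration_values is an assumption for illustration, and it takes the same simplifying view as the prototype (named simple types, no namespace handling beyond stripping a prefix).

```python
import xml.etree.ElementTree as ET

NS = {"xs": "http://www.w3.org/2001/XMLSchema"}


def enumeration_values(xsd_text, element_name):
    """List the xs:enumeration values of the named simple type used by element_name."""
    schema = ET.fromstring(xsd_text)
    # Step 1: find the element declaration and read its type attribute.
    element = schema.find(f".//xs:element[@name='{element_name}']", NS)
    if element is None or element.get("type") is None:
        return []
    type_name = element.get("type").split(":")[-1]  # drop an optional prefix
    # Step 2: collect the enumeration facets of the simple type with that name.
    path = f".//xs:simpleType[@name='{type_name}']/xs:restriction/xs:enumeration"
    return [e.get("value") for e in schema.findall(path, NS)]
```

For the schema in the question, looking up chosenOption should give USERCHOICE and DEFAULT in document order, which is the list the stylesheet writes out as availableOption elements.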

XSLT change page

I am trying to figure out how to change the page with my xslt transformation, by clicking on an element in the table.
I have looked around and couldn't find an answer. Here is some simple code; I would like to know how to change the page and display something:
I don't think you need to read any of the code except the final part, the XSLT. I have included the other sections in case they also need changing.
XSD:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.w3schools.com"
xmlns="http://www.w3schools.com">
<xs:element name="Table">
<xs:complexType>
<xs:sequence maxOccurs="unbounded">
<xs:element ref="Person"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="Person">
<xs:complexType>
<xs:all maxOccurs="1">
<xs:element ref="Name"/>
<xs:element minOccurs="1" ref="Age"/>
</xs:all>
</xs:complexType>
</xs:element>
<xs:element name="Name">
<xs:complexType>
<xs:sequence>
<xs:element maxOccurs="1" ref="theName"/>
<xs:element maxOccurs="1" ref="Description"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="theName" type="xs:string"/>
<xs:element name="Description" type="xs:string"/>
<xs:element abstract="false" name="Age" type="xs:positiveInteger"/>
</xs:schema>
So I created two elements, Name and Age. Name is composed of the strings theName and Description. Age is just a positive integer. I then created the XML file, which simply lists two different people:
XML:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="try.xsl"?>
<Table xmlns="http://www.w3schools.com"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.w3schools.com try.xsd">
<Person>
<Name>
<theName>Thomas</theName>
<Description>He is a nice guy</Description>
</Name>
<Age>10</Age>
</Person>
<Person>
<Name>
<theName>Peter</theName>
<Description>He is good at swimming</Description>
</Name>
<Age>12</Age>
</Person>
</Table>
My transformation currently builds a table of the people's names and their ages:
XSLT:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:ws="http://www.w3schools.com"
version="1.0">
<xsl:template match="/">
<html>
<table border="2">
<tr bgcolor="red">
<th>Name</th>
<th>Age</th>
</tr>
<xsl:for-each select="ws:Table/ws:Person">
<tr>
<td><xsl:value-of select="ws:Name/ws:theName"/></td>
<td><xsl:value-of select="ws:Age"/></td>
</tr>
</xsl:for-each>
</table>
This is where I need help, how do I change the page that I am on and go on a page which simply displays the description of the person, I want to do this by clicking on the name of the person. I don't want to open a new window, just change the page.
</html>
</xsl:template>
</xsl:stylesheet>
In your XSLT code, generate HTML that contains a hyperlink to the relevant page. This could be a traditional <a href="..."> hyperlink, or any element with an onclick attribute.

transforming with xsl an xml schema template to an other xml schema template

can anyone give me a paradigm of transforming an xml schema template like
<xs:element name="carareWrap">
<xs:annotation>
<xs:documentation xml:lang="en">The CARARE wrapper element. It wraps CARARE elements.</xs:documentation>
</xs:annotation>
<xs:complexType>
<xs:sequence>
<xs:element minOccurs="1" maxOccurs="unbounded" ref="carare"/>
</xs:sequence>
</xs:complexType>
</xs:element>
to an other xml schema template with xsl?
the other xml schema could be anything you can do.. I just need to have something to start with...
can anyone give me a paradigm of transforming an xml schema template like ... to an other xml schema template?
An XML Schema is just an XML document with the declared namespace URI http://www.w3.org/2001/XMLSchema. Therefore, you can apply XSLT as usual.
For instance, you have a source schema like this:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="carareWrap">
<xs:annotation>
<xs:documentation xml:lang="en">The CARARE wrapper element. It wraps CARARE elements.</xs:documentation>
</xs:annotation>
<xs:complexType>
<xs:sequence>
<xs:element minOccurs="1" maxOccurs="unbounded" ref="carare"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
And (for example) you want to remove the attributes of reference elements only. You can apply the following transform:
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
version="1.0">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="xs:element[@ref]">
<xsl:copy>
<xsl:copy-of select="@ref"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
The result will be:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="carareWrap">
<xs:annotation>
<xs:documentation xml:lang="en">The CARARE wrapper element. It wraps CARARE elements.</xs:documentation>
</xs:annotation>
<xs:complexType>
<xs:sequence>
<xs:element ref="carare" />
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
Notice:
The namespace of the input document must be declared in the XSLT.
The identity transform copies the input document as is, and the more specific templates override only the elements that need to change.
There is nothing special about treating XSD documents. They are just XML.
Since you do not specify what changes you want to make, here is a sample XSLT stylesheet that changes a random detail (the value of minOccurs in this case)
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
>
<!-- the identity template copies everything as-is -->
<xsl:template match="node() | @*">
<xsl:copy>
<xsl:apply-templates select="node() | @*" />
</xsl:copy>
</xsl:template>
<!-- ...unless there is a more specific template available -->
<xsl:template match="
xs:element[@name = 'carareWrap']//xs:element[@ref = 'carare' and @minOccurs = 1]/@minOccurs
">
<xsl:attribute name="{name()}">2</xsl:attribute>
</xsl:template>
</xsl:stylesheet>
Output
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="carareWrap">
<xs:annotation>
<xs:documentation xml:lang="en">The CARARE wrapper element. It wraps CARARE elements.</xs:documentation>
</xs:annotation>
<xs:complexType>
<xs:sequence>
<xs:element minOccurs="2" maxOccurs="unbounded" ref="carare"></xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
A few things to notice:
the namespace declaration xmlns:xs="http://www.w3.org/2001/XMLSchema" so the xs prefix is available in the XSLT stylesheet
the use of the identity template to copy everything that is not handled otherwise
the use of a complex match expression to pick a specific node
the use of an attribute value template and the name() function to copy the attribute name: name="{name()}"