How to prevent double elements when merging include files in xsd? - xslt

I am working on a stylesheet that processes some XSDs.
The main XSD file includes 2 others. Of those 2 one also includes the other.
All XSDs have the same attributes and namespaces in the root element. The files are only separate for maintenance purposes.
The stylesheet:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet
version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:exa="http://www.example.com/"
exclude-result-prefixes="xsl xs exa">
<xsl:output method="xml" indent="yes" encoding="iso-8859-1" />
<!-- global variable to merge schema with its includes
to be used for further processing the schema -->
<xsl:variable name="with_includes">
<xsl:apply-templates select="/xs:schema" mode="include"/>
</xsl:variable>
<!-- copy the main schema root element including attributes
then process all nodes in it -->
<xsl:template match="xs:schema" mode="include">
<xsl:copy>
<xsl:copy-of select="@*"/>
<xsl:apply-templates select="node()" mode="include"/>
</xsl:copy>
</xsl:template>
<!-- all schemas have the same namespaces and targetnamespace defined
so do not copy namespaces -->
<xsl:template match="node()" mode="include">
<xsl:copy-of select="." copy-namespaces="no"/>
</xsl:template>
<!-- when matching an include, process all the nodes in the schema -->
<xsl:template match="xs:include" mode="include">
<xsl:apply-templates select="doc(@schemaLocation)/xs:schema/node()" mode="include"/>
</xsl:template>
<!-- only here to show the result -->
<xsl:template match="/xs:schema">
<xsl:copy-of select="$with_includes"/>
</xsl:template>
</xsl:stylesheet>
Some very basic example schemas to demonstrate the problem:
Schema A.xsd:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xs:schema
attributeFormDefault="unqualified"
elementFormDefault="qualified"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.example.com/example"
xmlns:exa="http://www.example.com/example">
<xs:include schemaLocation="B.xsd"/>
<xs:include schemaLocation="C.xsd"/>
<xs:element name="a" type="exa:t_a"/>
<xs:complexType name="t_a">
<xs:sequence>
<xs:element name="b" type="exa:t_b"/>
<xs:element name="c" type="exa:t_c"/>
</xs:sequence>
</xs:complexType>
</xs:schema>
Schema B.xsd:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xs:schema
attributeFormDefault="unqualified"
elementFormDefault="qualified"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.example.com/example"
xmlns:exa="http://www.example.com/example">
<xs:include schemaLocation="C.xsd"/>
<xs:complexType name="t_b">
<xs:sequence>
<xs:element name="c1" type="exa:t_c"/>
<xs:element name="c2" type="exa:t_c"/>
</xs:sequence>
</xs:complexType>
</xs:schema>
Schema C.xsd
<?xml version="1.0" encoding="ISO-8859-1"?>
<xs:schema
attributeFormDefault="unqualified"
elementFormDefault="qualified"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.example.com/example"
xmlns:exa="http://www.example.com/example">
<xs:simpleType name="t_c">
<xs:restriction base="xs:string">
<xs:minLength value="1"/>
<xs:maxLength value="20"/>
</xs:restriction>
</xs:simpleType>
</xs:schema>
If I use the stylesheet above, I get simpleType t_c twice in the result. I am looking for a way to prevent that.
Btw, I use Saxon.

Deduplicating the includes is relatively straightforward, but it becomes more complex if you need to handle cycles as well. Cycles of xs:include are permitted in XSD, though the 1.0 spec isn't entirely clear on the point.
If you're not concerned about cycles, just build a list of all the includes by transitive expansion using a recursive function, preferably calling resolve-uri() to resolve each @schemaLocation against its base URI, then remove duplicates from the list using distinct-values().
If you need to eliminate cycles, you'll need to pass a parameter to your recursive function indicating the route by which the document was reached, and ignore a document if it's already on the list. If you've got a copy of my book, there's an example of cycle detection in the section on xsl:call-template. But you may find the book too expensive too ;-(
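A minimal sketch of that approach in XSLT 2.0, not a definitive implementation. The function name f:includes and its namespace urn:example:local-functions are made up for illustration. It assumes every schema document has a known base URI (so resolve-uri() can work) and that all files share the same target namespace, as in the question. The route parameter stops cycles, and distinct-values() removes documents reached by more than one route.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:local-functions"
    exclude-result-prefixes="f">

  <xsl:output method="xml" indent="yes"/>

  <!-- f:includes is a hypothetical helper, not a built-in.
       It returns the absolute URI of every schema reachable from $schema
       through xs:include. $route holds the URIs on the path that led here;
       a document already on the route is skipped, which breaks cycles. -->
  <xsl:function name="f:includes" as="xs:anyURI*">
    <xsl:param name="schema" as="element(xs:schema)"/>
    <xsl:param name="route" as="xs:anyURI*"/>
    <xsl:for-each select="$schema/xs:include">
      <xsl:variable name="uri" as="xs:anyURI"
          select="resolve-uri(@schemaLocation, base-uri(.))"/>
      <xsl:if test="not($uri = $route)">
        <xsl:sequence
            select="$uri, f:includes(doc($uri)/xs:schema, ($route, $uri))"/>
      </xsl:if>
    </xsl:for-each>
  </xsl:function>

  <xsl:template match="/xs:schema">
    <!-- The same URI can still be reached by two different routes
         (A -> C and A -> B -> C), so deduplicate the flat list. -->
    <xsl:variable name="uris"
        select="distinct-values(f:includes(., base-uri(.)))"/>
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <!-- top-level elements of each included schema, once per document -->
      <xsl:for-each select="$uris">
        <xsl:copy-of select="doc(.)/xs:schema/(* except xs:include)"
            copy-namespaces="no"/>
      </xsl:for-each>
      <!-- then the main schema's own top-level elements -->
      <xsl:copy-of select="* except xs:include" copy-namespaces="no"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Applied to A.xsd from the question, the URI list should contain B.xsd and C.xsd once each, so t_c is copied only once. Note that this sketch copies only element children of xs:schema; top-level comments and processing instructions are dropped.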

Related

XSL - Make element mandatory based on other element availability

I am creating an XSD based on the XML below using an XSL file. I want to make the Element2 block mandatory (minOccurs=1) only when an Element1 block is present in the same node (Sample, Sample1, etc.). If Element1 is absent, Element2 should become optional (minOccurs=0). I tried ancestor::[], match, etc., but nothing works. Please help.
Xml
<Root>
<Sample>
<Element1>
<SubElement1>0</SubElement1>
</Element1>
<Element2>
<X>1</X>
<Y>2</Y>
</Element2>
</Sample>
<Sample1>
<Element2>
<X>1</X>
<Y>2</Y>
</Element2>
</Sample1>
</Root>
XSL
<xsl:choose>
<xsl:when test="(local-name() = 'Element2' and //*[matches(local-name(), 'Element1')])">
<xs:element name="{local-name()}" minOccurs="1">
<xsl:call-template name="renderChildElements" />
</xs:element>
</xsl:when>
<xsl:otherwise>
<xs:element name="{local-name()}" minOccurs="0">
<xsl:call-template name="renderChildElements" />
</xs:element>
</xsl:otherwise>
</xsl:choose>
<xsl:template name="renderChildElements">
<xs:complexType>
<xs:all minOccurs="0">
<xsl:apply-templates select="*"/>
</xs:all>
<xsl:apply-templates select="@*"/>
</xs:complexType>
</xsl:template>
It sounds as if you only want to make Element2 have minOccurs="1" when it has an Element1 sibling?
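The XPath //*[matches(local-name(), 'Element1')] starts at the document root and uses the descendant axis, so it searches the entire document for an Element1 element, regardless of where the current Element2 sits. Because matches() does an unanchored regex search, it also accepts any element whose name merely contains 'Element1'.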
Instead, you want to constrain it to just the elements under the Element2 parent: ..//*[local-name() eq 'Element1'] or if you just care about the children of the parent of Element2 you could use: ../*[local-name() eq 'Element1']
I think you can further simplify and consolidate, and move the conditional logic inside of the minOccurs attribute, using an attribute value template to compute the value of either 1 or 0:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xsl:output indent="yes"/>
<xsl:template match="*">
<xs:element name="{local-name()}"
minOccurs="{if (local-name() = 'Element2' and ..//*[local-name() eq 'Element1']) then 1 else 0}">
<xsl:call-template name="renderChildElements" />
</xs:element>
</xsl:template>
<xsl:template name="renderChildElements">
<xs:complexType>
<xs:all minOccurs="0">
<xsl:apply-templates select="*"/>
</xs:all>
<xsl:apply-templates select="@*"/>
</xs:complexType>
</xsl:template>
</xsl:stylesheet>

mapping dynamic xml nodes to static nodes

I am looking for someone to assist in solving a seemingly simple problem.
I want to map a node of /fields[x]/message_id to a static node /MessageID0x for 5 entries in a list.
The source node is optional and may not exist.
The schema is below.
I am just not seeing the obvious, I hope.
The source is defined as:
<xs:element name="fields">
<xs:complexType>
<xs:sequence>
<xs:element name="tenant_id" type="xs:normalizedString" minOccurs="0"/>
<xs:element name="message_id" type="xs:normalizedString" minOccurs="0"/>
Target is defined as:
<xs:element name="MessageID01" type="xs:normalizedString" minOccurs="0"/>
<xs:element name="MessageID02" type="xs:normalizedString" minOccurs="0"/>
<xs:element name="MessageID03" type="xs:normalizedString" minOccurs="0"/>
<xs:element name="MessageID04" type="xs:normalizedString" minOccurs="0"/>
<xs:element name="MessageID05" type="xs:normalizedString" minOccurs="0"/>
=== FROM ===========
<root>
<ID>2019Nov12_17</ID>
<PingResult>OK</PingResult>
<StartDateTime>2019-11-12T16:16:01</StartDateTime>
<EndDateTime>2019-11-12T17:16:01.771Z</EndDateTime>
<start>0</start>
<numFound>1</numFound>
<fields>
<tenant_id>KOCHIND_AX2</tenant_id>
<message_id>lid://infor.landmark.lmrkmt/15d8f834-7680-541e-0000-001d5dae3e7b</message_id>
</fields>
<fields>
<tenant_id>KOCHIND_AX2</tenant_id>
<message_id>lid://infor.landmark.lmrkmt/0535a86a-7680-1868-0000-07625db833c1</message_id>
</fields>
<fields>
<tenant_id>KOCHIND_AX2</tenant_id>
<message_id>lid://infor.landmark.lmrkmt/0535a86a-7680-1864-0000-03445db849c8</message_id>
</fields>
<fields>
<tenant_id>KOCHIND_AX2</tenant_id>
<message_id>lid://infor.landmark.lmrkmt/0535a86a-7680-1867-0000-01151db125c8</message_id>
</fields>
</root>
TO ===================
<root>
<ID>2019Nov12_17</ID>
<PingResult>OK</PingResult>
<StartDateTime>2019-11-12T16:16:01</StartDateTime>
<EndDateTime>2019-11-12T17:16:01.771Z</EndDateTime>
<start>0</start>
<numFound>1</numFound>
<MessageID01>lid://infor.landmark.lmrkmt/15d8f834-7680-541e-0000-001d5dae3e7b</MessageID01>
<MessageID02>lid://infor.landmark.lmrkmt/0535a86a-7680-1868-0000-07625db833c1</MessageID02>
<MessageID03>lid://infor.landmark.lmrkmt/0535a86a-7680-1868-0000-07625db833c8</MessageID03>
<MessageID04>lid://infor.landmark.lmrkmt/15d8f834-1864—3322-0000-03445db125c8</MessageID04>
<MessageID05>lid://infor.landmark.lmrkmt/15d8f834-7680-1867-0000-01151db125g4</MessageID05>
</root>
I believe the requested result can be achieved rather simply by doing:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<!-- identity transform -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="fields">
<xsl:variable name="n">
<xsl:number format="01"/>
</xsl:variable>
<xsl:element name="MessageID{$n}">
<xsl:value-of select="message_id"/>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
The result of transforming the example input will be:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<ID>2019Nov12_17</ID>
<PingResult>OK</PingResult>
<StartDateTime>2019-11-12T16:16:01</StartDateTime>
<EndDateTime>2019-11-12T17:16:01.771Z</EndDateTime>
<start>0</start>
<numFound>1</numFound>
<MessageID01>lid://infor.landmark.lmrkmt/15d8f834-7680-541e-0000-001d5dae3e7b</MessageID01>
<MessageID02>lid://infor.landmark.lmrkmt/0535a86a-7680-1868-0000-07625db833c1</MessageID02>
<MessageID03>lid://infor.landmark.lmrkmt/0535a86a-7680-1864-0000-03445db849c8</MessageID03>
<MessageID04>lid://infor.landmark.lmrkmt/0535a86a-7680-1867-0000-01151db125c8</MessageID04>
</root>
which is different from the result you show - nevertheless, I suspect it is the correct one.
Do note that numbering sibling nodes by name is bad practice. It makes subsequent transformations much more difficult. If you need a number (although I don't see why you should), use an attribute.
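For illustration, here is a sketch of the attribute-based variant, assuming you are free to change the target format. The element name MessageID and the attribute name n are hypothetical and do not match the target schema shown in the question.

```xslt
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- identity transform -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- MessageID and its n attribute are hypothetical names -->
  <xsl:template match="fields">
    <MessageID>
      <!-- xsl:number counts this fields element among its fields siblings -->
      <xsl:attribute name="n">
        <xsl:number/>
      </xsl:attribute>
      <xsl:value-of select="message_id"/>
    </MessageID>
  </xsl:template>
</xsl:stylesheet>
```

Every entry then has the same element name, so a later transformation can select them all with a single MessageID step and filter on the n attribute when the position matters.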

Extracting data from XSD in XSL

I am transforming an XML file that needs to generate some elements based on the valid enumeration options defined in the XSD.
Suppose I have an XSD that declares a type and an element something like this:
<xs:simpleType name="optionType" nillable="true">
<xs:restriction base="xs:string">
<xs:maxLength value="50"/>
<xs:enumeration value="USERCHOICE">
</xs:enumeration>
<xs:enumeration value="DEFAULT">
</xs:enumeration>
</xs:restriction>
</xs:simpleType>
...
<xs:element name="chosenOption" type='optionType'/>
...
<xs:element name="availableOption" type='optionType'/>
The input will only contain the chosen option, so you can imagine it looks like this:
<options>
<chosenOption>USERCHOICE</chosenOption>
</options>
I need to have an output that looks like this:
<options>
<chosenOption>USERCHOICE</chosenOption> <!-- This comes from incoming XML -->
<!-- This must be a list of ALL possible values for this element, as defined in XSD -->
<availableOptions>
<availableOption>USERCHOICE</availableOption>
<availableOption>DEFAULT</availableOption>
</availableOptions>
</options>
Is there a way to have the XSL extract the enumeration values USERCHOICE and DEFAULT from the XSD and produce them in the output?
This will run on WebSphere 6 and will be used by an XSLT 1.0 engine. :(
(The schema file does not change often but it will change now and then and I'd rather only have to update the schema file instead of update the schema file and XSLT)
Here's a prototype that assumes that your input XML and XSD are as simple as the samples above. To be tweaked according to ways in which they may vary. If you need help with that tweaking, let me know.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="xs"
version="1.0">
<xsl:variable name="xsd" select="document('mySchema.xsd')"/>
<xsl:template match="/options">
<xsl:copy>
<xsl:for-each select="*">
<xsl:variable name="eltName" select="local-name()"/>
<xsl:copy-of select="." />
<availableOptions>
<xsl:variable name="optionType"
select="$xsd//xs:element[@name = $eltName]/@type"/>
<xsl:apply-templates
select="$xsd//xs:simpleType[@name = $optionType]/
xs:restriction/xs:enumeration"/>
</availableOptions>
</xsl:for-each>
</xsl:copy>
</xsl:template>
<xsl:template match="xs:enumeration">
<availableOption><xsl:value-of select="@value" /></availableOption>
</xsl:template>
</xsl:stylesheet>

transforming with xsl an xml schema template to an other xml schema template

can anyone give me a paradigm of transforming an xml schema template like
<xs:element name="carareWrap">
<xs:annotation>
<xs:documentation xml:lang="en">The CARARE wrapper element. It wraps CARARE elements.</xs:documentation>
</xs:annotation>
<xs:complexType>
<xs:sequence>
<xs:element minOccurs="1" maxOccurs="unbounded" ref="carare"/>
</xs:sequence>
</xs:complexType>
</xs:element>
to an other xml schema template with xsl?
The other XML schema could be anything you like. I just need something to start with.
can anyone give me a paradigm of transforming an xml schema template like ... to an other xml schema template?
An XML Schema is just an XML document whose elements are in the namespace http://www.w3.org/2001/XMLSchema. Therefore, you can apply XSLT as usual.
For instance, you have a source schema like this:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="carareWrap">
<xs:annotation>
<xs:documentation xml:lang="en">The CARARE wrapper element. It wraps CARARE elements.</xs:documentation>
</xs:annotation>
<xs:complexType>
<xs:sequence>
<xs:element minOccurs="1" maxOccurs="unbounded" ref="carare"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
And (for example) you want to remove the attributes of reference elements only. You can apply the following transform:
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
version="1.0">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="xs:element[@ref]">
<xsl:copy>
<xsl:copy-of select="@ref"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
The result will be:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="carareWrap">
<xs:annotation>
<xs:documentation xml:lang="en">The CARARE wrapper element. It wraps CARARE elements.</xs:documentation>
</xs:annotation>
<xs:complexType>
<xs:sequence>
<xs:element ref="carare" />
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
Notice
The need to declare the namespace of the input document in the XSLT
The use of the identity transform to copy the input document as is, overriding only the elements that have to change.
There is nothing special about treating XSD documents. They are just XML.
Since you do not specify what changes you want to make, here is a sample XSLT stylesheet that changes a random detail (the value of minOccurs in this case)
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
>
<!-- the identity template copies everything as-is -->
<xsl:template match="node() | @*">
<xsl:copy>
<xsl:apply-templates select="node() | @*" />
</xsl:copy>
</xsl:template>
<!-- ...unless there is a more specific template available -->
<xsl:template match="
xs:element[@name = 'carareWrap']//xs:element[@ref = 'carare' and @minOccurs = 1]/@minOccurs
">
<xsl:attribute name="{name()}">2</xsl:attribute>
</xsl:template>
</xsl:stylesheet>
Output
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="carareWrap">
<xs:annotation>
<xs:documentation xml:lang="en">The CARARE wrapper element. It wraps CARARE elements.</xs:documentation>
</xs:annotation>
<xs:complexType>
<xs:sequence>
<xs:element minOccurs="2" maxOccurs="unbounded" ref="carare"></xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
A few things to notice:
the namespace declaration xmlns:xs="http://www.w3.org/2001/XMLSchema" so the xs prefix is available in the XSLT stylesheet
the use of the identity template to copy everything that is not handled otherwise
the use of a complex match expression to pick a specific node
the use of an attribute value template and the name() function to copy the attribute name: name="{name()}"

Remove unused elements from XML schema using XSLT

I'm looking for a way (if it's even possible) of using an XSL transform of an XSD document to remove unused elements. This comes up a lot in my job where a company will define an XSD with absolutely everything in it, but then they will want to create a cut-down version for a single root element within it.
To explain further, I might have an XSD like the following:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" attributeFormDefault="unqualified">
<xs:element name="RootElement">
<xs:complexType>
<xs:sequence>
<xs:element ref="ChildElement"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="ChildElement"/>
<xs:element name="UnusedElement"/>
</xs:schema>
What I would like to be able to do is to set up an XSL where I provide the starting element (in this case RootElement) and it will copy over all dependent elements but omit the unused ones. In the above example, if I passed in RootElement I'd expect to see RootElement and ChildElement included but UnusedElement omitted.
(When I say "provide the starting element", I'm quite happy to crack open the stylesheet and type xsl:template match="RootElement" where required.)
This would obviously have to be recursive, so would navigate the entire structure defined below the starting element, and any element in that schema that was not used would be discarded.
(Of course, it would be even better if it could do the same in any imported schemas!)
I've searched Google extensively and can't find anything on this - I'm not sure if that means it's not possible or not.
Thanks!
Edit: Actually I probably should clarify and say that I would like to remove unused elements AND types, so it would follow both ref="childElement" and type="someType" links.
This transformation:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema" >
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:param name="ptopElementName" select="'RootElement'"/>
<xsl:variable name="vTop" select=
"/*/xs:element[@name=$ptopElementName]"/>
<xsl:variable name="vNames"
select="$vTop/descendant-or-self::*/@name"/>
<xsl:variable name="vRefs"
select="$vTop/descendant-or-self::*/@ref"/>
<xsl:variable name="vTypes"
select="$vTop/descendant-or-self::*/@type"/>
<xsl:template match="node()|@*" name="identity">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="xs:element">
<xsl:if test=
"@name=$vNames
or
@name=$vRefs
or
ancestor-or-self::*[@name=$ptopElementName]">
<xsl:call-template name="identity"/>
</xsl:if>
</xsl:template>
<xsl:template match="xs:complexType|xs:simpleType">
<xsl:if test=
"@name=$vTypes
or
ancestor-or-self::*[@name=$ptopElementName]">
<xsl:call-template name="identity"/>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
when applied on the provided XML document:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" attributeFormDefault="unqualified">
<xs:element name="RootElement">
<xs:complexType>
<xs:sequence>
<xs:element ref="ChildElement"/>
</xs:sequence>
</xs:complexType></xs:element>
<xs:element name="ChildElement"/>
<xs:element name="UnusedElement"/>
</xs:schema>
produces the wanted, correct result:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" attributeFormDefault="unqualified">
<xs:element name="RootElement">
<xs:complexType>
<xs:sequence>
<xs:element ref="ChildElement"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="ChildElement"/>
</xs:schema>