XSLT: How to remove synonymous namespaces - xslt

I have a large collection of XML files which I need to transform using XSLT. The problem is that many of these files were hand-written by different people and they do not use consistent names to refer to the schemas. For example, one file might use:
xmlns:itemType="http://example.com/ItemType/XSD"
where another might use the prefix "it" instead of "itemType":
xmlns:it="http://example.com/ItemType/XSD"
If that's not bad enough, there are several files which use two or three synonyms for the same thing!
<?xml version="1.0"?>
<Document
xmlns:it="http://example.com/ItemType/XSD"
xmlns:itemType="http://example.com/ItemType/XSD"
xmlns:ItemType="http://example.com/ItemType/XSD"
...
(there's clearly been a lot of cutting and pasting going on)
Now, because the pattern matching in the XSLT file appears to work on the namespace prefix (as opposed to the schema it relates to) the pattern only matches one of the variants. So if I write something like:
<xsl:template match="SomeNode[@xsi:type='itemType:SomeType']">
...
</xsl:template>
Then it only matches a subset of the cases that I want it to.
Question 1: Is there any way to get the XSLT to match all the variants?
Question 2: Is there any way to remove the duplicates so all the output files use consistent naming?
I naïvely tried using "namespace-alias" but I guess I've misunderstood what that does because I can't get it to do anything at all - either match all the variants or affect the output XML.
<xsl:stylesheet
version="1.0"
...
xmlns:it="http://example.com/ItemType/XSD"
xmlns:itemType="http://example.com/ItemType/XSD"
xmlns:ItemType="http://example.com/ItemType/XSD"
...
<xsl:output method="xml" indent="yes"/>
<xsl:namespace-alias stylesheet-prefix="it" result-prefix="ItemType"/>
<xsl:namespace-alias stylesheet-prefix="itemType" result-prefix="ItemType"/>

Attribute values and text nodes are not treated as QNames unless you explicitly cast them, and that cast is only possible in XSLT/XPath 2.0.
In XSLT/XPath 1.0 you must do this "manually":
<xsl:template match="SomeNode">
<xsl:variable name="vPrefix" select="substring-before(@xsi:type,':')"/>
<xsl:variable name="vNCName"
select="translate(substring-after(@xsi:type,$vPrefix),':','')"/>
<xsl:if test="namespace::*[
name()=$vPrefix
] = 'http://example.com/ItemType/XSD'
and
$vNCName = 'SomeType'">
<!-- Content Template -->
</xsl:if>
</xsl:template>
Edit: All in one pattern (less readable, maybe):
<xsl:template match="SomeNode[
namespace::*[
name()=substring-before(../@xsi:type,':')
] = 'http://example.com/ItemType/XSD'
and
substring(
concat(':',@xsi:type),
string-length(@xsi:type) - 7
) = ':SomeType'
]">
<!-- Content Template -->
</xsl:template>

In XSLT 2.0 (whether or not you use schema-awareness) you can write the predicate as [@xsi:type=xs:QName('it:SomeType')] where "it" is the prefix declared in the stylesheet for this namespace. It doesn't have to be the same as the prefix used in the source document.
Of course matching of element and attribute names (as distinct from QName-valued content) uses namespace URIs rather than prefixes in both XSLT 1.0 and XSLT 2.0.
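A minimal sketch of a related XSLT 2.0 approach, assuming the same SomeNode element and namespace URI as in the question: the standard resolve-QName() function expands the prefix found in the attribute value using the namespaces in scope on the source element, so the comparison no longer depends on which synonym a given file happens to use.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

  <!-- resolve-QName() looks up the prefix of @xsi:type in the namespaces that
       are in scope on the SomeNode element itself, so "it:SomeType",
       "itemType:SomeType" and "ItemType:SomeType" all expand to the same QName. -->
  <xsl:template match="SomeNode[resolve-QName(@xsi:type, .)
                         = QName('http://example.com/ItemType/XSD', 'SomeType')]">
    <!-- Content Template -->
  </xsl:template>

</xsl:stylesheet>
```

Because the lookup uses the source document's own declarations, the stylesheet does not need to declare any of the it/itemType/ItemType prefixes for this template to match.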

Allowing an attribute-set to set a namespace required by an attribute

I understand that xsl:attribute-set exists to allow a set of XML attributes to be grouped under a single name, which can then easily be applied to several similar elements at a later date.
I understand that namespaces are not attributes and cannot be set using this.
However in Saxon9.8EE I note this works and I was wondering if this is safe to use:
<xsl:attribute-set name="swbml.ir" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<xsl:attribute name="version">4-2</xsl:attribute>
<xsl:attribute name="xsi:schemaLocation">http://www.fpml.org/2005/FpML-4-2 /path/to/swbml-ird-main-4-2.xsd</xsl:attribute>
</xsl:attribute-set>
By adding the xsi namespace to the xsl:attribute-set itself, it applies this namespace to any element using the swbml.ir attribute set.
(of course it has to because one of the attributes sits in the xsi namespace)
So this:
<SWBML xmlns="http://www.fpml.org/2005/FpML-4-2" xsl:use-attribute-sets="swbml.ir">
Results in:
<SWBML xmlns="http://www.fpml.org/2005/FpML-4-2"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
version="4-2"
xsi:schemaLocation="http://www.fpml.org/2005/FpML-4-2 /path/to/swbml-ird-main-4-2.xsd">
This is exactly what I want. But it feels like I might be stretching the intended use-case for attribute sets?
Specifically if I try to go one step further and add xmlns="http://www.fpml.org/2005/FpML-4-2" like so:
<xsl:attribute-set name="swbml.ir" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.fpml.org/2005/FpML-4-2">
The default xmlns is not applied to <SWBML> - which is kinda what I expect.
So - is the rule that attribute sets will add any namespace that is required in order to qualify any attribute the set contains, BUT will not add any other namespace? Or have I strayed into undefined territory?
Your understanding is basically correct, in that if there is content bound to a namespace, and you include it in your output, then the namespace will come along for the ride. However, the fact that you happen to have declared it on the attribute-set is not critical. It could be declared in other places in the stylesheet, such as on the xsl:stylesheet element, to be in-scope and referenced in the attribute-set.
Building upon the examples that you posed, you could move the declaration of the xsi namespace prefix out of the xsl:attribute-set and up to the xsl:stylesheet element, and it would still appear in your output if the attribute-set were applied to the element:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
exclude-result-prefixes="xsi">
<xsl:output method="xml"/>
<xsl:attribute-set name="swbml.ir">
<xsl:attribute name="version">4-2</xsl:attribute>
<xsl:attribute name="xsi:schemaLocation">http://www.fpml.org/2005/FpML-4-2 /path/to/swbml-ird-main-4-2.xsd</xsl:attribute>
</xsl:attribute-set>
<xsl:template match="/">
<SWBML xmlns="http://www.fpml.org/2005/FpML-4-2" xsl:use-attribute-sets="swbml.ir"/>
</xsl:template>
</xsl:stylesheet>
And it would not appear in the output if the attribute-set is not applied to the content:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
exclude-result-prefixes="xsi">
<xsl:output method="xml"/>
<xsl:attribute-set name="swbml.ir">
<xsl:attribute name="version">4-2</xsl:attribute>
<xsl:attribute name="xsi:schemaLocation">http://www.fpml.org/2005/FpML-4-2 /path/to/swbml-ird-main-4-2.xsd</xsl:attribute>
</xsl:attribute-set>
<xsl:template match="/">
<SWBML xmlns="http://www.fpml.org/2005/FpML-4-2" />
</xsl:template>
</xsl:stylesheet>
Note that I used exclude-result-prefixes in both examples to ensure that the xsi namespace is pruned from the output if unused. Otherwise, the in-scope namespace might come along for the ride in the output, even if it were not applied to any content.
Yes, this will work: as @MadsHansen points out, when you use <xsl:attribute name="p:u"/> the only thing that really matters is that the prefix p is declared somewhere - on the xsl:attribute element itself, or on one of its ancestors. If it's convenient to declare it at the level of the xsl:attribute-set itself, then fine, do that.
A thing to watch out for here is that this doesn't apply to QName-valued attributes. If you want to do
<xsl:attribute name="xsi:type">xs:date</xsl:attribute>
then you can get the prefix xsi declared in the result document simply by having it in-scope for the xsl:attribute instruction, but for the xs prefix you need to work a bit harder (because the XSLT processor doesn't know that the attribute value xs:date is a QName). In this case you need to explicitly ensure that some containing element in the result tree declares the xs namespace.
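One way to do that can be sketched as follows, assuming a hypothetical result element named value: a literal result element copies the namespaces that are in scope for it in the stylesheet, so declaring xs on xsl:stylesheet (and not listing it in exclude-result-prefixes) should be enough to carry the declaration into the result tree.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xsl:template match="/">
    <!-- "value" is a hypothetical element name. As a literal result element
         it copies the in-scope xsi and xs namespace nodes, so the prefix used
         inside the attribute value is declared in the output. -->
    <value>
      <xsl:attribute name="xsi:type">xs:date</xsl:attribute>
      <xsl:text>2004-02-23</xsl:text>
    </value>
  </xsl:template>

</xsl:stylesheet>
```

If xs were added to exclude-result-prefixes, the processor would be free to drop the declaration, because nothing in the result tree visibly uses that prefix as an element or attribute name.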

Counting elements that are generated in XSLT1

I'm trying to count the elements my transformation generates (must use XSLT 1.0). For example, my transformation creates:
<Parent>
<ElementX Att1="2"/>
<ElementY Att1="1"/>
<ElementZ Att1="6"/>
</Parent>
I need to print 3 within the same transformation, because there are 3 child elements.
Can this be done?
Thanks.
It would help a lot if you provided an extract of your XSLT.
I can't give you XSLT code without it, but I'll try to point the way to an answer:
One solution could be to store the output in a variable, convert it to a node-set (using the node-set() extension function that most XSLT 1.0 processors provide) and apply the XPath count() function to it. After that, output the variable with xsl:copy-of and the count with xsl:value-of.
Here is a demo how to do this:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:ext="http://exslt.org/common" exclude-result-prefixes="ext">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="/">
<xsl:variable name="vrtfPass1">
<xsl:apply-templates/>
</xsl:variable>
<xsl:value-of select="count(ext:node-set($vrtfPass1)/*/*)"/>
</xsl:template>
<xsl:template match="/*">
<Parent>
<ElementX Att1="2"/>
<ElementY Att1="1"/>
<ElementZ Att1="6"/>
</Parent>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied on any XML document (not used in this Demo), the wanted, correct result is produced:
3
Explanation:
A general way to process the result of the transformation (in a single transformation), is to organize it in two passes where we save the result of the first pass in a variable.
In the second pass we access the result and do the additional processing.
Do note that in XSLT 1.0 the variable that captures the result of the first pass is of the infamous RTF (Result Tree Fragment) type and needs to be converted to a regular tree before any nodes inside it can be accessed (xsl:copy-of and string() are still allowed on an RTF).
This conversion to a regular tree is done by an extension function, which most often has the name node-set and always belongs to a vendor-defined namespace. In this demo we are using the node-set() extension function that belongs to the EXSLT namespace -- because most XSLT 1.0 processors implement EXSLT.
For more information on multi-pass processing, see this: Two phase processing: Do not output empty tags from phase-1 XSLT 2.0 processing

adding multiple filters

As I have mentioned in this post:
dynamic multiple filters in xsl
Basically, I want to apply multiple filters to my XML using a for loop, and these filters are dynamic: they come from another XML document.
Something like this:
foreach(list/field[@ProgramCategory=$Country][not(contains(@Program,$State1))][not(contains(@Program,$State2))][not(contains(@Program,$State3))][not(contains(@Program,$StateN))])
The problem is that there can be any number (n) of states, which I get by looping over the other XML.
I cannot use document() function as suggested by Dimitre so I was thinking of achieving it by:
<xsl:variable name="allprograms">
<xsl:for-each select="/list2/field2">
<xsl:text disable-output-escaping="yes">[not(contains(@Program,'</xsl:text><xsl:value-of select="@ProgramID"></xsl:value-of><xsl:text disable-output-escaping="yes">'))]</xsl:text>
</xsl:for-each>
</xsl:variable>
gives me something like this:
[not(contains(@Program,'Virginia'))][not(contains(@Program,'Texas'))][not(contains(@Program,'Florida'))]
I want to use the value above as a filter in the for loop below, and I am not sure how to achieve that:
<xsl:for-each select="list/field[not(contains(@Program,'Virginia'))][not(contains(@Program,'Texas'))][not(contains(@Program,'Florida'))]">
Before this I also have a for loop to filter United States
<xsl:for-each select="list/field $allprograms">
<xsl:value-of select="@ows_ID" />
</xsl:for-each>
I want my answer to be 1082, 1088..
I can add the xml here too if there is any confusion..
Jack,
From the previous solution you just need to add to this:
<xsl:param name="pFilteredStates">
<state>Virginia</state>
<state>Texas</state>
<state>Florida</state>
</xsl:param>
the following (changing the current variable definition that relies on the document() function):
<xsl:variable name="vFiltered" select=
"ext:node-set($pFilteredStates)/*
"/>
Where the "ext:" prefix needs to be bound to this namespace (this is the EXSLT namespace -- if your XSLT processor doesn't implement exslt:node-set() then you need to find what xxx:node-set() extension it implements, or tell us what is your XSLT processor and people will provide this information):
"http://exslt.org/common"
So, your <xsl:stylesheet> may look like the following:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:ext="http://exslt.org/common" exclude-result-prefixes="ext">
I still recommend that the $pFilteredStates parameter should be passed by the initiator of the transformation -- in which case you can delete the definition of $vFiltered and replace every reference to it with $pFilteredStates and the transformation should work OK.
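Since the previous solution is not reproduced here, the pieces above can be put together in a rough sketch. It assumes the input has a list root with field children carrying the @Program and @ows_ID attributes mentioned in the question, and that the processor supports the EXSLT node-set() function; the $Country filter is left out for brevity.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ext="http://exslt.org/common" exclude-result-prefixes="ext">
  <xsl:output method="text"/>

  <!-- Default value; the caller of the transformation may override it -->
  <xsl:param name="pFilteredStates">
    <state>Virginia</state>
    <state>Texas</state>
    <state>Florida</state>
  </xsl:param>

  <!-- Convert the result tree fragment into a node-set of <state> elements -->
  <xsl:variable name="vFiltered" select="ext:node-set($pFilteredStates)/*"/>

  <xsl:template match="/">
    <!-- Assumed input shape: list/field with @Program and @ows_ID -->
    <xsl:for-each select="list/field">
      <!-- current() is the field being tested; "." is one <state> element.
           The field is kept only if no state name occurs in its @Program. -->
      <xsl:if test="not($vFiltered[contains(current()/@Program, .)])">
        <xsl:value-of select="@ows_ID"/>
        <xsl:text>&#10;</xsl:text>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

The test sits in an xsl:if rather than in the select predicate because current() inside the select expression would refer to the context node of the xsl:for-each instruction itself, not to the field being filtered.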

XSL Reuse? YES! But: Element must not contain an xsl:import element! :-(

I am using a heavy stylesheet with a lot of recurring transformations, so I thought it would be smart to reuse the same chunks of code, so that I would not need to make the same changes in a bunch of different places. So I discovered <xsl:import>, but alas, it won't allow me to do it. When trying to run it in Sonic Workbench I get the following error:
An xsl:for-each element must not contain an xsl:import element
This is my stylesheet code:
<xsl:template match="/">
<InboundFargoMessage>
<EdiSender>
<xsl:value-of select="TransportInformationMessage/SenderId"/>
</EdiSender>
<EdiReceiver>
<xsl:value-of select="TransportInformationMessage/RecipientId"/>
</EdiReceiver>
<EdiSource>PORLOGIS</EdiSource>
<EdiDestination>FARGO</EdiDestination>
<Transportations>
<xsl:for-each select="TransportInformationMessage/TransportUnits/TransportUnit">
<xsl:import href="TransportCDMtoFDM_V0.6.xsl"/>
</xsl:for-each>
<xsl:for-each select="TransportInformationMessage/Waybill/TransportUnits/TransportUnit">
<xsl:import href="TransportCDMtoFDM_V0.6.xsl"/>
</xsl:for-each>
</Transportations>
</InboundFargoMessage>
</xsl:template>
</xsl:stylesheet>
I will leave out the child xsl-sheets for now, as the problem appears to be happening at the base.
If I cannot use xsl:import, is there any option of reuse?
"If I cannot use xsl:import, is there any option of reuse?"
You can use <xsl:import>.
All <xsl:import> elements must be the first element children of <xsl:stylesheet>
As an alternative, an <xsl:include> element has to be globally defined (a child of <xsl:stylesheet>) but can be preceded by any other xslt instruction that can be placed globally.
You need to be aware of and understand well the rules of using these two XSLT instructions. I'd recommend reading a good book on XSLT.
The main unit of reusability in XSLT is the template (<xsl:template>).
The importing stylesheet can use (via <xsl:call-template> or <xsl:apply-templates>) any template that is defined in any imported stylesheet.
Each of the included XSL files should contain one or more templates.
The main file includes the others in the beginning and then calls the templates with call-template or apply-templates from various places.
Thanks for all the suggestions, which were somewhat helpful, but allow me to formulate a complete answer. As suggested, the answer to the question of reuse lies in templates. Templates can be defined by enclosing them within an <xsl:template> element. Then, wherever necessary, they can be summoned by adding an <xsl:call-template> element. Also, they can be put into separate XSL sheets, as long as these are imported at the top of the parent XSL sheet.
Thus, the solution to my questions looks as follows:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:far="http://www.itella.com/fargo/fargogate/" xmlns:a="http://tempuri.org/XMLSchema.xsd" xmlns:p="http://tempuri.org/XMLSchema.xsd">
<xsl:import href="TransportCDMtoFDM_V0.6.xsl"/>
<xsl:template match="/">
<InboundFargoMessage>
<EdiSender>
<xsl:value-of select="TransportInformationMessage/SenderId"/>
</EdiSender>
<EdiReceiver>
<xsl:value-of select="TransportInformationMessage/RecipientId"/>
</EdiReceiver>
<EdiSource>PORLOGIS</EdiSource>
<EdiDestination>FARGO</EdiDestination>
<Transportations>
<xsl:for-each select="TransportInformationMessage/TransportUnits/TransportUnit">
<xsl:call-template name="transport"/>
</xsl:for-each>
<xsl:for-each select="TransportInformationMessage/Waybill/TransportUnits/TransportUnit">
<xsl:call-template name="transport"/>
</xsl:for-each>
</Transportations>
</InboundFargoMessage>
</xsl:template>
</xsl:stylesheet>
Where the file 'TransportCDMtoFDM_V0.6.xsl' contains the template called "transport".
There is just one problem left: Using templates, all the nodes mentioned within the template are used, even if they are empty. So the remaining question is how to leave out the empty nodes?
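One common way to leave out empty nodes is to wrap each output element in an xsl:if that tests whether the source value has any non-whitespace content. The sketch below shows the idea for the named template "transport". The real content of TransportCDMtoFDM_V0.6.xsl is not shown in this thread, so Transportation, Weight and GrossWeight are hypothetical names used only for illustration.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Named template, called with a TransportUnit as the context node.
       Transportation, Weight and GrossWeight are hypothetical names. -->
  <xsl:template name="transport">
    <Transportation>
      <!-- normalize-space() returns an empty string when GrossWeight is
           missing, empty or whitespace-only, and an empty string is false -->
      <xsl:if test="normalize-space(GrossWeight)">
        <Weight>
          <xsl:value-of select="GrossWeight"/>
        </Weight>
      </xsl:if>
    </Transportation>
  </xsl:template>

</xsl:stylesheet>
```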

Ignore name space with t: prefix

We have XML file like below...
<?xml version='1.0'?>
<T0020 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.safersys.org/namespaces/T0020V1 T0020V1.xsd"
xmlns="http://www.safersys.org/namespaces/T0020V1">
<IRP_ACCOUNT>
<IRP_CARRIER_ID_NUMBER>1213561</IRP_CARRIER_ID_NUMBER>
<IRP_BASE_COUNTRY>US</IRP_BASE_COUNTRY>
<IRP_BASE_STATE>AL</IRP_BASE_STATE>
<IRP_ACCOUNT_NUMBER>15485</IRP_ACCOUNT_NUMBER>
<IRP_ACCOUNT_TYPE>I</IRP_ACCOUNT_TYPE>
<IRP_STATUS_CODE>0</IRP_STATUS_CODE>
<IRP_STATUS_DATE>2004-02-23</IRP_STATUS_DATE>
<IRP_UPDATE_DATE>2007-03-09</IRP_UPDATE_DATE>
<IRP_NAME>
<NAME_TYPE>LG</NAME_TYPE>
<NAME>WILLIAMS TODD</NAME>
<IRP_ADDRESS>
<ADDRESS_TYPE>MA</ADDRESS_TYPE>
<STREET_LINE_1>P O BOX 1210</STREET_LINE_1>
<STREET_LINE_2/>
<CITY>MARION</CITY>
<STATE>AL</STATE>
<ZIP_CODE>36756</ZIP_CODE>
<COUNTY/>
<COLONIA/>
<COUNTRY>US</COUNTRY>
</IRP_ADDRESS>
</IRP_NAME>
</IRP_ACCOUNT>
</T0020>
In order to insert this XML data into a database, we used two XSLTs.
The first XSLT removes the namespace from the XML file and converts it to an intermediate
XML file (say Process.xml) in a temporary location.
Then we took that intermediate XML (without the namespace lines) and applied another XSL
to map the XML fields to the database.
Then we found a solution that uses only one XSLT, which does both: [1] removes the namespace and [2] maps the XML fields to the database to insert the data.
Our final style sheet contain following lines
xmlns:t="http://www.safersys.org/namespaces/T0020V1">
and we used following to map field to Database
<xsl:template match="/">
<xsl:element name="T0020">
<xsl:apply-templates select="t:T0020/t:IRP_ACCOUNT" />
</xsl:element>
</xsl:template>
How did this approach solve our problem? Are there any consequences of using it?
I have searched for an explanation but have not found how this works.
Thanks in Advance..
I don't see any problems with your approach.
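XSLT matches elements by their expanded name (namespace URI plus local name), so binding a prefix such as t: to the namespace URI in your stylesheet is the right solution. The prefix does not have to match anything in the source document, which declares the same URI as its default namespace; this is why it solved your problem.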
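To make the mechanism concrete, here is a minimal sketch of a single stylesheet that matches the namespaced source elements through the t: prefix and re-creates them in no namespace. The mapping to database fields is not shown in the question, so this sketch only copies the element structure and text.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:t="http://www.safersys.org/namespaces/T0020V1"
    exclude-result-prefixes="t">

  <xsl:template match="/">
    <T0020>
      <!-- t: is bound to the URI that the source uses as its default namespace -->
      <xsl:apply-templates select="t:T0020/t:IRP_ACCOUNT"/>
    </T0020>
  </xsl:template>

  <!-- Re-create every source element under its local name, in no namespace;
       text nodes are copied by the built-in template rules -->
  <xsl:template match="t:*">
    <xsl:element name="{local-name()}">
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
```

An unprefixed path such as T0020/IRP_ACCOUNT would select nothing here, because in XPath 1.0 an unprefixed name always refers to an element in no namespace.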