XML with empty prefix transformation with XSLT

I want to use XSLT to transform XML files into other XML files by removing the TextLine elements. However, the elements are not removed from the output as I expect. I imagine I will have to modify the XSLT file, but I don't know how. What should be changed?
I suspect the root cause is that the elements in the XML files are in a default (unprefixed) namespace.
Details follow.
The XML file test-01.xml, whose elements are in a default namespace:
<?xml version="1.0" encoding="UTF-8"?>
<alto xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="http://www.loc.gov/standards/alto/ns-v4#"
xsi:schemaLocation="http://www.loc.gov/standards/alto/ns-v4# http://www.loc.gov/standards/alto/v4/alto-4-2.xsd">
<TextLine TAGREFS="LT9"/>
<TextLine TAGREFS="LT10"/>
<TextLine TAGREFS="LT9"/>
<TextLine TAGREFS="LT8"/>
</alto>
And I'm using the following date.xslt file:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="TextLine"/>
</xsl:stylesheet>
Note: I'm using Python's lxml to perform the transformation. However, this shouldn't affect the result, as I could use any other XSLT processor, such as xsltproc.

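Yes. Your assumption that the default namespace was the cause of your XSLT not functioning as desired was correct. In XSLT 1.0, an unprefixed name in a match pattern refers to an element in no namespace, so you need to bind the namespace URI to a prefix in the stylesheet and use that prefix in the pattern. Try this XSLT 1.0 stylesheet instead: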
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:loc="http://www.loc.gov/standards/alto/ns-v4#">
<xsl:output method="xml" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="loc:TextLine"/>
</xsl:stylesheet>
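The same naming rule can be seen outside XSLT. The following is a minimal sketch using Python's standard-library ElementTree rather than lxml or an XSLT processor. The two-line sample document and the variable names are illustrative assumptions, and the `loc` prefix is arbitrary, as it is in the stylesheet above.

```python
import xml.etree.ElementTree as ET

ALTO_NS = "http://www.loc.gov/standards/alto/ns-v4#"

# Shortened stand-in for test-01.xml (assumption: two TextLine elements are enough).
doc = f"""<alto xmlns="{ALTO_NS}">
  <TextLine TAGREFS="LT9"/>
  <TextLine TAGREFS="LT10"/>
</alto>"""

root = ET.fromstring(doc)

# The element's real name includes the namespace URI, not just "TextLine".
first_tag = root[0].tag

# An unprefixed name means "no namespace" here, so nothing is found.
unprefixed = root.findall("TextLine")

# Binding any prefix to the same URI makes the name match.
prefixed = root.findall("loc:TextLine", {"loc": ALTO_NS})

print(first_tag, len(unprefixed), len(prefixed))
```

The prefix itself carries no meaning; only the URI it is bound to has to equal the one declared in the source document.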

Related

xsl:apply-templates returns nothing − what am I missing?

I have a simple XML response, like
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/">
<numberOfRecords>1</numberOfRecords>
<records>
<record>
<recordData>
<kitodo xmlns="http://meta.kitodo.org/v1/">
<metadata name="key1">value1</metadata>
<metadata name="key2">value2</metadata>
<metadata name="key3">value3</metadata>
</kitodo>
</recordData>
</record>
</records>
</searchRetrieveResponse>
which I want to transform to this by XSLT
<?xml version="1.0" encoding="utf-8"?>
<mets:mdWrap xmlns:kitodo="http://meta.kitodo.org/v1/"
xmlns:mets="http://www.loc.gov/METS/"
xmlns:srw="http://www.loc.gov/zing/srw/"
MDTYPE="OTHER"
OTHERMDTYPE="Kitodo">
<mets:xmlData>
<kitodo:kitodo>
<kitodo:metadata name="key1">value1</kitodo:metadata>
<kitodo:metadata name="key2">value2</kitodo:metadata>
<kitodo:metadata name="key3">value3</kitodo:metadata>
</kitodo:kitodo>
</mets:xmlData>
</mets:mdWrap>
That is, I want to remove the outside tree searchRetrieveResponse/records/record/recordData, replace it with mdWrap/xmlData and move the contained data node there.
I have a quite short XSLT for it:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:kitodo="http://meta.kitodo.org/v1/" xmlns:mets="http://www.loc.gov/METS/" xmlns:srw="http://www.loc.gov/zing/srw/">
<xsl:output method="xml" indent="yes" encoding="utf-8"/>
<xsl:strip-space elements="*"/>
<xsl:template match="srw:recordData">
<mets:mdWrap MDTYPE="OTHER" OTHERMDTYPE="Kitodo">
<mets:xmlData>
<xsl:apply-templates select="@*|node()"/>
</mets:xmlData>
</mets:mdWrap>
</xsl:template>
<!-- pass-through rule -->
<xsl:template match="@*|node()">
<xsl:apply-templates select="@*|node()"/>
</xsl:template>
</xsl:stylesheet>
However, what I get is:
<?xml version="1.0" encoding="utf-8"?>
<mets:mdWrap xmlns:kitodo="http://meta.kitodo.org/v1/"
xmlns:mets="http://www.loc.gov/METS/"
xmlns:srw="http://www.loc.gov/zing/srw/"
MDTYPE="OTHER"
OTHERMDTYPE="Kitodo">
<mets:xmlData/>
</mets:mdWrap>
Obviously, the template match="srw:recordData" does match, otherwise I would get an empty result. However, the contained apply-templates doesn’t output anything. (I also tried an <xsl:apply-templates/> without a select="" attribute, but it doesn’t output anything either.) What am I missing?
XSLT processor is net.sf.saxon.TransformerFactoryImpl (Java)
I think nothing happens when you are applying templates inside xmlData. There are no templates that would match descendant nodes.
Try using copy-of:
<xsl:template match="srw:recordData">
<mets:mdWrap MDTYPE="OTHER" OTHERMDTYPE="Kitodo">
<mets:xmlData>
<xsl:copy-of select="kitodo:kitodo"/>
</mets:xmlData>
</mets:mdWrap>
</xsl:template>
The problem is not with the xsl:apply-templates instruction. It is with the template being applied. Your "pass-through rule" does not write anything to the output. You probably meant to have the identity transform template in that place - which goes like this:
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>

Disable xsl:comment in XSLT transformation

I have an XSLT file littered with comments such as the following:
<xsl:comment>Entering shipping block</xsl:comment>
Is there a way to explicitly disable these comments so that they aren't output at runtime in production? The output of the XSLT file is shown in a public API, so while the comments are useful for debugging, I would rather be able to switch them off.
The only way I can think of is to have a flag that is set in development mode to turn on the comments:
<xsl:if test="$enableDebug='true'">
<xsl:comment>Entering shipping block</xsl:comment>
</xsl:if>
Is there another way?
(I'm using XSLT 2.0.)
Just include this transformation step in your deployment to production:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="comment()"/>
</xsl:stylesheet>
You may even omit the indent attribute on <xsl:output> if readability is not a goal.
You could declare a variable verbose at the very top of your XSL:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output encoding="UTF-8" indent="yes" method="xml"/>
<xsl:variable name="verbose">1</xsl:variable>
<xsl:template match="@*|node()">
<xsl:copy>
...
and wrap your comments in <xsl:if> like this:
<xsl:if test="$verbose">
<xsl:comment>Applying the template...</xsl:comment>
</xsl:if>
If you then want to temporarily deactivate the comments, just remove the 1 from the verbose variable:
<xsl:variable name="verbose"></xsl:variable>

Handling &lt; &gt; in XSLT 1.0

I have a problem when trying to read a structure that contains escaped markup (&lt; and &gt;) in the source XML.
Input Structure -
<?xml version="1.0" encoding="UTF-8"?>
<RecordsData>
<RecordsData>
<UID>&lt;RecordsData xmlns=""&gt;&lt;RecordsData&gt;&lt;UID&gt;200&lt;/UID&gt;&lt;RID&gt;Test-1&lt;/RID&gt;&lt;Date&gt;20142812&lt;/Date&gt;&lt;Status&gt;N&lt;/Status&gt;&lt;/RecordsData&gt;&lt;/RecordsData&gt;</UID>
</RecordsData>
</RecordsData>
Expected Output Structure (there are two requirements) -
The first is just conversion of &lt; and &gt; into well-formed XML tags.
<?xml version="1.0" encoding="UTF-8"?>
<RecordsData>
<RecordsData>
<UID><RecordsData xmlns=""><RecordsData><UID>200</UID><RID>Test-1</RID><Date>20142812</Date><Status>N</Status></RecordsData></RecordsData></UID>
</RecordsData>
</RecordsData>
The second is extraction of the whole data inside the UID tag, with only the following as output -
<RecordsData xmlns=""><RecordsData><UID>200</UID><RID>Test-1</RID><Date>20142812</Date><Status>N</Status></RecordsData></RecordsData>
I am able to get the second output if I have the first one in hand. However, being very new to XSLT, I have been struggling for the last few days to get the first output from the input, even after searching the forum extensively.
If the second output can be obtained directly from the input source, that is the solution I am actually after. I only broke the problem down into steps above.
Can any of the experts please help?
Thanks.
Conversion is easy, extraction is not.
To convert the escaped markup to real markup, simply disable the escaping when writing the node to the result tree, for example:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<!-- identity transform -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="UID">
<xsl:copy>
<xsl:value-of select="." disable-output-escaping="yes"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Ideally, you would use the resulting XML file to extract any data from the escaped portion. Otherwise you would have to apply string functions for this purpose, since the escaped text is not XML.
However, in your example, you don't want to extract anything particular from the data, just isolate it and convert it to a stand-alone markup document. This can be easily accomplished by:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/">
<xsl:value-of select="RecordsData/RecordsData/UID" disable-output-escaping="yes"/>
</xsl:template>
</xsl:stylesheet>
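If the extraction can happen outside XSLT, a second parse makes the escaped portion queryable. This is only a sketch using Python's standard-library ElementTree, not the XSLT approach above. The shortened sample input and the variable names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the input (assumption: Date and Status omitted for brevity).
source = (
    "<RecordsData><RecordsData>"
    "<UID>&lt;RecordsData xmlns=\"\"&gt;&lt;RecordsData&gt;"
    "&lt;UID&gt;200&lt;/UID&gt;&lt;RID&gt;Test-1&lt;/RID&gt;"
    "&lt;/RecordsData&gt;&lt;/RecordsData&gt;</UID>"
    "</RecordsData></RecordsData>"
)

outer = ET.fromstring(source)

# The parser decodes &lt; and &gt;, so .text holds the markup as a plain string.
inner_markup = outer.find("RecordsData/UID").text

# Parsing that string a second time yields a real element tree.
inner = ET.fromstring(inner_markup)
rid = inner.find("RecordsData/RID").text

print(inner_markup)
print(rid)
```

Here `inner_markup` corresponds to the second expected output, and `inner` can be queried like any other document.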

Removing an XML tag that is named like xfdf:field (with a namespace)

I want to remove an XML element from an XML file. The tag that I want to remove is named xfdf:field.
How do I specify this in my XSLT? I tried the stylesheet below and I am getting an error saying "org.apache.xpath.domapi.XPathStylesheetDOM3Exception: Prefix must resolve to a namespace: xfdf".
Here is my XSLT.
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" />
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*" />
</xsl:copy>
</xsl:template>
<xsl:template match="xfdf:field"></xsl:template>
</xsl:stylesheet>
Here is my xml.
<xfa:datasets
xmlns:xfa="http://www.xfa.org/schema/xfa-data/1.0/"
xmlns:dd="http://ns.adobe.com/data-description/" xmlns:tns="http://hostname"
xmlns:xfdf="http://ns.adobe.com/xfdf/">
<xfa:data>
<tns:form xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<tns:formHeader>
<tns:formId>formid</tns:formId>
<tns:revId>Rev123</tns:revId>
<tns:agencyId>agency</tns:agencyId>
<tns:progId>program</tns:progId>
<tns:serviceId>service</tns:serviceId>
</tns:formHeader>
<tns:formFields>
<tns:date>08-13-1967</tns:date>
<tns:agreementBetween></tns:agreementBetween>
<tns:ncr>xxxx</tns:ncr>
<tns:formConfirmationInfo>
<tns:confNbrLbl>nbrlabel</tns:confNbrLbl>
<tns:confNbrData>1231</tns:confNbrData>
<tns:custNameLbl>3332</tns:custNameLbl>
<tns:custNameData>dasdas</tns:custNameData>
<tns:dateLbl>date</tns:dateLbl>
<tns:dateData>01012001</tns:dateData>
</tns:formConfirmationInfo>
</tns:formFields>
<xfdf:field xmlns:xfdfi="http://ns.adobe.com/xfdf-transition/"
xfdfi:original="FSAPPLICATIONDATA_">
<!--irrelevant data omitted-->
</xfdf:field>
</tns:form>
</xfa:data>
</xfa:datasets>
You need to specify the namespace in your XSLT file:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xfdf="http://ns.adobe.com/xfdf/">

How to copy everything as is and only remove a specific element

<?xml version="1.0" encoding="UTF-8"?>
<Emp:Employee xmlns:Emp="http://Emp.com">
<Emp:EmpName>XYZ</Emp:EmpName>
<Emp:EmpAddres>AAAA</Emp:EmpAddres>
<Det:EmpDetails xmlns:Det="http://Det.com">
<Det:EmpDesignation>SE</Det:EmpDesignation>
<Det:EmpExperience>4</Det:EmpExperience>
</Det:EmpDetails>
</Emp:Employee>
I am just trying to copy all the elements including the namespace but without <Det:EmpExperience>4</Det:EmpExperience>
so the final output should be :
<?xml version="1.0" encoding="UTF-8"?>
<Emp:Employee xmlns:Emp="http://Emp.com">
<Emp:EmpName>XYZ</Emp:EmpName>
<Emp:EmpAddres>AAAA</Emp:EmpAddres>
<Det:EmpDetails xmlns:Det="http://Det.com">
<Det:EmpDesignation>SE</Det:EmpDesignation>
</Det:EmpDetails>
</Emp:Employee>
I used
<xsl:template match='/'>
<xsl:copy-of select='@*[not(Det:EmpExperience)]'/>
</xsl:template>
It's not working. Is there a solution for this?
How do I remove only the <Det:EmpExperience> element and copy the rest of the elements, including the namespaces?
Try this (adapted from here):
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:Det="http://Det.com">
<xsl:output omit-xml-declaration="yes"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="Det:EmpExperience"/>
</xsl:stylesheet>
The second template overrides the identity transformation and the empty template uses your matching logic (selecting Det:EmpExperience nodes).
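For comparison, the same "copy everything, drop one element" idea can be sketched without XSLT. The example below uses Python's standard-library ElementTree and is an illustration only. The trimmed sample document and the variable names are assumptions, and the approach edits the tree in place rather than producing a copy as the identity transform does.

```python
import xml.etree.ElementTree as ET

DET_NS = "http://Det.com"

# Trimmed stand-in for the Employee document (assumption: EmpAddres omitted).
doc = """<Emp:Employee xmlns:Emp="http://Emp.com">
  <Emp:EmpName>XYZ</Emp:EmpName>
  <Det:EmpDetails xmlns:Det="http://Det.com">
    <Det:EmpDesignation>SE</Det:EmpDesignation>
    <Det:EmpExperience>4</Det:EmpExperience>
  </Det:EmpDetails>
</Emp:Employee>"""

root = ET.fromstring(doc)

# As in the stylesheet, the element is identified by namespace URI plus local name.
target = f"{{{DET_NS}}}EmpExperience"

# ElementTree has no parent pointers, so visit each parent and drop matching children.
for parent in list(root.iter()):
    for child in list(parent):
        if child.tag == target:
            parent.remove(child)

# Local names of the elements that remain, in document order.
remaining = [el.tag.split("}")[1] for el in root.iter()]
print(remaining)
```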