XPath 3.0 Serialize without Namespaces in Scope - xslt

While answering this question, it occurred to me that I know how to use the XSLT 3.0 (XPath 3.0) serialize() function, but that I do not know how to avoid serialization of namespaces that are in scope. Here is a minimal example:
XML Input
<?xml version="1.0" encoding="UTF-8" ?>
<ci:cichlids xmlns:ci="http://www.cichlids.com">
<cichlid id="1">
<name>Zeus</name>
<color>gold</color>
<teeth>molariform</teeth>
<breeding-type>lekking</breeding-type>
</cichlid>
</ci:cichlids>
XSLT 3.0 Stylesheet
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:output="http://www.w3.org/2010/xslt-xquery-serialization"
xmlns:ci="http://www.cichlids.com">
<xsl:output method="xml" encoding="UTF-8" indent="yes" />
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="/ci:cichlids/cichlid">
<xsl:variable name="serial-params">
<output:serialization-parameters>
<output:omit-xml-declaration value="yes"/>
</output:serialization-parameters>
</xsl:variable>
<xsl:value-of select="serialize(., $serial-params/*)"/>
</xsl:template>
</xsl:stylesheet>
Actual Output
<?xml version="1.0" encoding="UTF-8"?>
<ci:cichlids xmlns:ci="http://www.cichlids.com">
<cichlid xmlns:ci="http://www.cichlids.com" id="1">
<name>Zeus</name>
<color>gold</color>
<teeth>molariform</teeth>
<breeding-type>lekking</breeding-type>
</cichlid>
</ci:cichlids>
The serialization process included the namespace declaration that is in scope for the cichlid element, although it is not used on this element. I would like to remove this declaration and make the output look like
Expected Output
<?xml version="1.0" encoding="UTF-8"?>
<ci:cichlids xmlns:ci="http://www.cichlids.com">
<cichlid id="1">
<name>Zeus</name>
<color>gold</color>
<teeth>molariform</teeth>
<breeding-type>lekking</breeding-type>
</cichlid>
</ci:cichlids>
I know how to modify the cichlid element, removing the namespaces in scope, and serialize this modified element instead. But this seems a rather cumbersome solution. My question is:
What is a canonical way to serialize an XML element using the serialize() function without also serializing unused namespace declarations that are in scope?
Testing with Saxon-EE 9.6.0.7 from within Oxygen.

Serialization will always give you a faithful representation of the data model that you are serializing. If you want to modify the data model, that's called transformation. Run a transformation to remove the unwanted namespaces, then serialize the result.

Michael Kay already gave the correct answer and I have accepted it. This is just to flesh out his comments. By
Run a transformation to remove the unwanted namespaces, then serialize the result.
he means applying a transformation like the following before calling serialize():
XSLT Stylesheet
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:output="http://www.w3.org/2010/xslt-xquery-serialization"
xmlns:ci="http://www.cichlids.com">
<xsl:output method="xml" encoding="UTF-8" indent="yes" />
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:variable name="cichlid-without-namespace">
<xsl:copy-of copy-namespaces="no" select="/ci:cichlids/cichlid"/>
</xsl:variable>
<xsl:template match="/ci:cichlids/cichlid">
<xsl:variable name="serial-params">
<output:serialization-parameters>
<output:omit-xml-declaration value="yes"/>
</output:serialization-parameters>
</xsl:variable>
<xsl:value-of select="serialize($cichlid-without-namespace, $serial-params/*)"/>
</xsl:template>
</xsl:stylesheet>
XML Output
<?xml version="1.0" encoding="UTF-8"?>
<ci:cichlids xmlns:ci="http://www.cichlids.com">
<cichlid id="1">
<name>Zeus</name>
<color>gold</color>
<teeth>molariform</teeth>
<breeding-type>lekking</breeding-type>
</cichlid>
</ci:cichlids>
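The global variable above selects /ci:cichlids/cichlid once, so every match would serialize the same sequence. A minimal sketch of a variant that builds the namespace-free copy inside the template, from the current element only, might look like this. The variable name clean is hypothetical, and the sketch assumes an XSLT 3.0 processor in which serialize() is available:

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:output="http://www.w3.org/2010/xslt-xquery-serialization"
    xmlns:ci="http://www.cichlids.com">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

  <!-- Identity template for everything that is not serialized. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="/ci:cichlids/cichlid">
    <xsl:variable name="serial-params">
      <output:serialization-parameters>
        <output:omit-xml-declaration value="yes"/>
      </output:serialization-parameters>
    </xsl:variable>
    <!-- Hypothetical name: a copy of the current element without unused namespaces. -->
    <xsl:variable name="clean">
      <xsl:copy-of select="." copy-namespaces="no"/>
    </xsl:variable>
    <xsl:value-of select="serialize($clean, $serial-params/*)"/>
  </xsl:template>
</xsl:stylesheet>
```

This is still a transformation followed by serialization, as the accepted answer describes; it only moves the transformation step next to the serialize() call.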

Related

how to get 'excel' new lines in spreadsheetML and the behaviour of nodeset() on disable-output-escaping (Saxon xslt 1.0)

This is a follow-up question to
how to get 'excel' new lines in spreadsheetML (MSXSLT)
but asked as a new question to keep the issues separate, as the behaviour seems to differ between engines (I'll leave the specific context in the other question; this one is purely about how to achieve the functional result).
This XSLT (in Saxon HE) produces what I want:
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/">
<root>
<bar>
<xsl:text disable-output-escaping="yes">&#10;</xsl:text>
</bar>
</root>
</xsl:template>
</xsl:stylesheet>
and gives the output
<root>
<bar>
</bar>
</root>
This one won't:
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:exsl="http://exslt.org/common"
version="1.0">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/">
<xsl:variable name="foo">
<bar>
<xsl:text disable-output-escaping="yes">&#10;</xsl:text>
</bar>
</xsl:variable>
<root>
<xsl:copy-of select="exsl:node-set($foo)"/>
</root>
</xsl:template>
</xsl:stylesheet>
it gives
<bar>&#10;</bar>
(The question is about XSLT 1.0, but interestingly XSLT 3.0 can be made to work like this:
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="3.0">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/">
<xsl:variable name="foo">
<bar>
<xsl:text disable-output-escaping="yes">&#10;</xsl:text>
</bar>
</xsl:variable>
<root>
<xsl:sequence select="$foo"/>
</root>
</xsl:template>
</xsl:stylesheet>
whilst
<xsl:copy-of select="$foo"/>
doesn't. Even following the 'sequence' pattern, I don't seem to be able to preserve the disabled escaping in anything but a trivial XSLT. I've got a complex transformation using call-template/apply-templates etc., and understanding how nodes are interpreted and serialised is not trivial.)
There's actually a long history to this question, which was known in the working group as the "sticky d-o-e problem" (d-o-e being disable-output-escaping). The question is, does d-o-e have any effect when writing to a temporary tree (an xsl:variable), or is it only effective when writing to serialized output?
The XSLT 1.0 specification is pretty clear on the matter:
It is an error for output escaping to be disabled for a text node that
is used for something other than a text node in the result tree. Thus,
it is an error to disable output escaping for an xsl:value-of or
xsl:text element that is used to generate the string-value of a
comment, processing instruction or attribute node; it is also an error
to convert a result tree fragment to a number or a string if the
result tree fragment contains a text node for which escaping was
disabled. In both cases, an XSLT processor may signal the error; if it
does not signal the error, it must recover by ignoring the
disable-output-escaping attribute.
XSLT 2.0 deprecated d-o-e, but retained the rule in a slightly different form:
This [property], however, can be set only within a final result tree
that is being passed to the serializer.
But in between those two versions, the working group dithered. The XSLT 1.1 working draft (which never became a recommendation, but was popularised by the first version of my XSLT book) says:
When a root node is copied using an xsl:copy-of element ... and
escaping was disabled for a text node descendant of that root node,
then escaping should also be disabled for the resulting copy of that
text node. For example
<xsl:variable name="x">
<xsl:text disable-output-escaping="yes">&lt;</xsl:text>
</xsl:variable>
<xsl:copy-of select="$x"/>
This is the "sticky d-o-e" - the d-o-e property is attached to the text node in the temporary tree and springs into life when the text node is eventually serialized. So this behaviour was endorsed at some stage in the life of XSLT, and you may be using a processor that implements this version of the spec.
Generally, though, try to forget that d-o-e exists. Whatever the problem, it's not the best solution. It's an incredibly messy feature because it requires a breaking of the architectural boundary between the transformation processor and the serializer, and breaking this boundary leads to close coupling of the transformation and serialization, and prevents you reusing the same code in a different pipeline configuration.
I'm afraid that researching the history of the W3C spec on this is rather easier than researching exactly what was implemented in early versions of Saxon (which are now nearly a quarter of a century old).
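As one possible alternative to d-o-e, XSLT 2.0 and later offer character maps, which are applied by the serializer and therefore do not depend on any property attached to text nodes in a temporary tree. The following is only a sketch under stated assumptions: it requires an XSLT 2.0+ processor (so it does not help with XSLT 1.0), the map name line-feed is hypothetical, and the private-use character U+E000 is an arbitrary placeholder that is assumed not to occur in the real data.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="UTF-8" use-character-maps="line-feed"/>

  <!-- Hypothetical map: the serializer replaces U+E000 with the literal string &#10; -->
  <xsl:character-map name="line-feed">
    <xsl:output-character character="&#xE000;" string="&amp;#10;"/>
  </xsl:character-map>

  <xsl:template match="/">
    <!-- The placeholder is an ordinary character, so it survives the temporary tree. -->
    <xsl:variable name="foo">
      <bar>Alpha&#xE000;Bravo</bar>
    </xsl:variable>
    <root>
      <xsl:copy-of select="$foo"/>
    </root>
  </xsl:template>
</xsl:stylesheet>
```

Because the substitution happens only at serialization time, the transformation itself stays independent of the serializer, which is the architectural boundary that d-o-e breaks.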
Taking the information from Michael Kay's answer, which explains how the XSLT 1.0 specification handles this, we CAN implement a solution.
First, a recap of the underlying issue.
Excel spreadsheetML requires data to be formatted with the specific chars "&#10;" to interpret a line feed in a cell (but this solution applies generally).
<Cell>Alpha
Bravo
Charlie</Cell>
If we try to write an XSLT to generate this, let's say naively:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/">
<Cell>
<xsl:text>Alpha</xsl:text>
<xsl:text>&#10;</xsl:text>
<xsl:text>Bravo</xsl:text>
<xsl:text>&#10;</xsl:text>
<xsl:text>Charlie</xsl:text>
</Cell>
</xsl:template>
</xsl:stylesheet>
the line feed gets escaped and we get this:
<Cell>Alpha&#10;Bravo&#10;Charlie</Cell>
this (thanks to the answer on how to get 'excel' new lines in spreadsheetML (MSXSLT)) can be fixed by using
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/">
<Cell>
<xsl:text>Alpha</xsl:text>
<xsl:text disable-output-escaping="yes">&#10;</xsl:text>
<xsl:text>Bravo</xsl:text>
<xsl:text disable-output-escaping="yes">&#10;</xsl:text>
<xsl:text>Charlie</xsl:text>
</Cell>
</xsl:template>
</xsl:stylesheet>
which produces this:
<Cell>Alpha
Bravo
Charlie</Cell>
Unfortunately, this 'breaks' if you process your output via some intermediate internal document. For example, even this:
<xsl:stylesheet version="1.0"
xmlns:msxsl="urn:schemas-microsoft-com:xslt"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
exclude-result-prefixes="msxsl">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/">
<xsl:variable name="output">
<Cell>
<xsl:text>Alpha</xsl:text>
<xsl:text disable-output-escaping="yes">&#10;</xsl:text>
<xsl:text>Bravo</xsl:text>
<xsl:text disable-output-escaping="yes">&#10;</xsl:text>
<xsl:text>Charlie</xsl:text>
</Cell>
</xsl:variable>
<xsl:copy-of select="msxsl:node-set($output)"/>
</xsl:template>
</xsl:stylesheet>
reverts to:
<Cell>Alpha&#10;Bravo&#10;Charlie</Cell>
because (see Michael Kay's answer) the disable-output-escaping attribute gets ignored if it's passed through some internal document (i.e. the variable).
So...how can you get around this?
If you generate a token for the LF, you can construct your pseudo-Excel output almost in its entirety, using a custom element to flag the LF char. You can then process that DIRECTLY into the result tree and interpret the custom element as an unescaped "&#10;"
so this:
<xsl:stylesheet version="1.0"
xmlns:msxsl="urn:schemas-microsoft-com:xslt"
xmlns:kookerella="kookerella.com"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
exclude-result-prefixes="msxsl kookerella">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/">
<xsl:variable name="output">
<Cell>
<xsl:text>Alpha</xsl:text>
<kookerella:LF/>
<xsl:text>Bravo</xsl:text>
<kookerella:LF/>
<xsl:text>Charlie</xsl:text>
</Cell>
</xsl:variable>
<!-- process data directly into the result tree only -->
<xsl:apply-templates select="msxsl:node-set($output)" mode="injectLF"/>
</xsl:template>
<!-- Inject LF -->
<xsl:template match="@* | node()" mode="injectLF">
<xsl:copy>
<xsl:apply-templates select="@* | node()" mode="injectLF"/>
</xsl:copy>
</xsl:template>
<xsl:template match="kookerella:LF" mode="injectLF">
<xsl:text disable-output-escaping="yes">&#10;</xsl:text>
<xsl:apply-templates select="@* | node()" mode="injectLF"/>
</xsl:template>
</xsl:stylesheet>
now results in:
<Cell>Alpha
Bravo
Charlie</Cell>
P.S.
As an aside, this seems to work for me in both the various MSXSLT engines and Saxon HE, but I have had one instance with an MSXSLT engine where even this doesn't work, presumably due to some configuration or output serialisation issue.

xsl:apply-templates returns nothing − what am I missing?

I have a simple XML response, like
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/">
<numberOfRecords>1</numberOfRecords>
<records>
<record>
<recordData>
<kitodo xmlns="http://meta.kitodo.org/v1/">
<metadata name="key1">value1</metadata>
<metadata name="key2">value2</metadata>
<metadata name="key3">value3</metadata>
</kitodo>
</recordData>
</record>
</records>
</searchRetrieveResponse>
which I want to transform to this by XSLT
<?xml version="1.0" encoding="utf-8"?>
<mets:mdWrap xmlns:kitodo="http://meta.kitodo.org/v1/"
xmlns:mets="http://www.loc.gov/METS/"
xmlns:srw="http://www.loc.gov/zing/srw/"
MDTYPE="OTHER"
OTHERMDTYPE="Kitodo">
<mets:xmlData>
<kitodo:kitodo>
<kitodo:metadata name="key1">value1</kitodo:metadata>
<kitodo:metadata name="key2">value2</kitodo:metadata>
<kitodo:metadata name="key3">value3</kitodo:metadata>
</kitodo:kitodo>
</mets:xmlData>
</mets:mdWrap>
That is, I want to remove the outside tree searchRetrieveResponse/records/record/recordData, replace it with mdWrap/xmlData and move the contained data node there.
I have a quite short XSLT for it:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:kitodo="http://meta.kitodo.org/v1/" xmlns:mets="http://www.loc.gov/METS/" xmlns:srw="http://www.loc.gov/zing/srw/">
<xsl:output method="xml" indent="yes" encoding="utf-8"/>
<xsl:strip-space elements="*"/>
<xsl:template match="srw:recordData">
<mets:mdWrap MDTYPE="OTHER" OTHERMDTYPE="Kitodo">
<mets:xmlData>
<xsl:apply-templates select="@*|node()"/>
</mets:xmlData>
</mets:mdWrap>
</xsl:template>
<!-- pass-through rule -->
<xsl:template match="@*|node()">
<xsl:apply-templates select="@*|node()"/>
</xsl:template>
</xsl:stylesheet>
However, what I get is:
<?xml version="1.0" encoding="utf-8"?>
<mets:mdWrap xmlns:kitodo="http://meta.kitodo.org/v1/"
xmlns:mets="http://www.loc.gov/METS/"
xmlns:srw="http://www.loc.gov/zing/srw/"
MDTYPE="OTHER"
OTHERMDTYPE="Kitodo">
<mets:xmlData/>
</mets:mdWrap>
Obviously, the template match="srw:recordData" does match, otherwise I would get an empty result. However, the contained apply-templates doesn’t output anything. (I also tried an <xsl:apply-templates/> without a select="" attribute, but it doesn’t output anything either.) What am I missing?
XSLT processor is net.sf.saxon.TransformerFactoryImpl (Java)
I think nothing happens when you are applying templates inside xmlData. There are no templates that would match descendant nodes.
Try using copy-of:
<xsl:template match="srw:recordData">
<mets:mdWrap MDTYPE="OTHER" OTHERMDTYPE="Kitodo">
<mets:xmlData>
<xsl:copy-of select="kitodo:kitodo"/>
</mets:xmlData>
</mets:mdWrap>
</xsl:template>
The problem is not with the xsl:apply-templates instruction. It is with the template being applied. Your "pass-through rule" does not write anything to the output. You probably meant to have the identity transform template in that place - which goes like this:
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
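Swapping in the identity template on its own would also copy the SRW wrapper elements (searchRetrieveResponse, numberOfRecords and so on) into the result. A minimal sketch that combines both answers, assuming a single record and that starting directly at srw:recordData is acceptable, could look like this:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:mets="http://www.loc.gov/METS/"
    xmlns:srw="http://www.loc.gov/zing/srw/"
    exclude-result-prefixes="srw">
  <xsl:output method="xml" indent="yes" encoding="utf-8"/>
  <xsl:strip-space elements="*"/>

  <!-- Start at the record payload and skip the SRW wrapper elements. -->
  <xsl:template match="/">
    <xsl:apply-templates select="//srw:recordData"/>
  </xsl:template>

  <xsl:template match="srw:recordData">
    <mets:mdWrap MDTYPE="OTHER" OTHERMDTYPE="Kitodo">
      <mets:xmlData>
        <xsl:apply-templates select="node()"/>
      </mets:xmlData>
    </mets:mdWrap>
  </xsl:template>

  <!-- Identity template: copies everything below recordData unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

The copied kitodo elements keep the default namespace declaration from the input rather than a kitodo: prefix; the two forms are equivalent to a namespace-aware consumer.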

Can't read the namespaces and attribute in xslt

I know that this is a simple problem. I'm still learning and familiarizing myself with XSLT coding. I have a problem in my XSLT and I don't know if I did it correctly. I need to get the value from the input file and store it in a new element, without carrying over the namespaces and attributes of the root element. I researched this and found many references, but I can't apply them. The XSLT (v02) that I made (copied from the references) works fine if the root element doesn't have any namespaces or attributes. But when I add namespaces and attributes, no output is produced.
Input file
<Root xmlns="http://abcd.com" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" releaseID="9.2" versionID="2.12.3" xsi:schemaLocation="abcd.com abcd.xsd">
<Element>
<Field>AAAAA</Field>
</Element>
<Element>
<Field>BBBBB</Field>
</Element>
<Element>
<Field>CCCCC</Field>
</Element>
</Root>
xslt file
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/">
<NewRecord>
<xsl:for-each select="Root/Element">
<NewTransaction>
<Position>
<xsl:value-of select="position()"/>
</Position>
<TransactionID>
<xsl:value-of select="Field"/>
</TransactionID>
</NewTransaction>
</xsl:for-each>
</NewRecord>
</xsl:template>
</xsl:stylesheet>
output generated
<NewRecord/>
My expected output should look like this:
<NewRecord>
<NewTransaction>
<Position>1</Position>
<TransactionID>AAAAA</TransactionID>
</NewTransaction>
<NewTransaction>
<Position>2</Position>
<TransactionID>BBBBB</TransactionID>
</NewTransaction>
<NewTransaction>
<Position>3</Position>
<TransactionID>CCCCC</TransactionID>
</NewTransaction>
</NewRecord>
I think the problem is in the <xsl:template match="/">, I'm still confused on the nodes that I need to put. Thank you for your help.
If you are really using XSLT 2.0, you only need to add:
xpath-default-namespace="http://abcd.com"
to the stylesheet tag, and leave everything else as is.
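For illustration, a sketch of the complete XSLT 2.0 variant, assuming a 2.0-capable processor such as Saxon. The only differences from the stylesheet in the question are the version and the added attribute:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xpath-default-namespace="http://abcd.com">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

  <xsl:template match="/">
    <NewRecord>
      <!-- Unprefixed names in XPath now refer to the http://abcd.com namespace. -->
      <xsl:for-each select="Root/Element">
        <NewTransaction>
          <Position>
            <xsl:value-of select="position()"/>
          </Position>
          <TransactionID>
            <xsl:value-of select="Field"/>
          </TransactionID>
        </NewTransaction>
      </xsl:for-each>
    </NewRecord>
  </xsl:template>
</xsl:stylesheet>
```

The attribute affects only names in XPath expressions and match patterns, so literal result elements such as NewRecord stay in no namespace.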
If you're using XSLT 1.0, you'll have to declare the same namespace in the stylesheet and use the prefix you map to it to qualify the element names.
The prefix can be whatever you want. I picked abcd to match your example, but it could be any legal identifier:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:abcd="http://abcd.com">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/">
<NewRecord>
<xsl:for-each select="abcd:Root/abcd:Element">
<NewTransaction>
<Position>
<xsl:value-of select="position()"/>
</Position>
<TransactionID>
<xsl:value-of select="abcd:Field"/>
</TransactionID>
</NewTransaction>
</xsl:for-each>
</NewRecord>
</xsl:template>
</xsl:stylesheet>

Copying an entire xml in a Variable using xslt

How can I copy an entire XML document, as is, into a variable?
Below is the sample xml:
<?xml version="1.0" encoding="UTF-8"?>
<products author="Jesper">
<product id="p1">
<name>Delta</name>
<price>800</price>
<stock>4</stock>
</product>
</products>
I have tried the XSLT below, but it is not working.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:fo="http://www.w3.org/1999/XSL/Format" version="1.0">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="@*|node()">
<xsl:variable name="reqMsg">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:variable>
<xsl:copy-of select="$reqMsg"/>
</xsl:template>
</xsl:stylesheet>
Regards,
Rahul
Your transformation fails because at a certain point, it tries to create a variable (result tree fragment) containing an attribute node. This is not allowed.
It's not really clear what you mean by "copying an entire XML to a variable". But you probably want to simply use the select attribute on the root node:
<xsl:variable name="reqMsg" select="/"/>
This will actually create a variable with a node-set containing the root node of the document. Using this variable with xsl:copy-of will output the whole document.
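Put together, a minimal sketch of a full XSLT 1.0 stylesheet using that variable might look as follows. It assumes the variable is declared globally, where "/" refers to the root of the source document:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Global variable: a node-set holding the root node of the input document. -->
  <xsl:variable name="reqMsg" select="/"/>

  <xsl:template match="/">
    <!-- Deep-copies the whole document, attributes included. -->
    <xsl:copy-of select="$reqMsg"/>
  </xsl:template>
</xsl:stylesheet>
```

Because the variable is bound with select rather than with content, it refers to the original nodes instead of building a result tree fragment.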
<xsl:copy-of select="document('path/to/file.xml')" />
Or if you need it more than once, to avoid repeating the doc name:
<xsl:variable name="filepath" select="'path/to/file.xml'" />
…
<xsl:copy-of select="document($filepath)" />
The result of document() should be cached IIRC, so don't worry about calling it repeatedly.

How do I make the value of one element a new attribute to the root?

I want to transform a markup like <root><a><c>CA</c></a></root> to <root juris="CA"><a><c>CA</c></a></root>
If you could specify more about the schema that is allowed, something more specific can be written. With what was given, something like this should work though:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:template match="/root">
<xsl:copy>
<xsl:attribute name="juris"><xsl:value-of select="./a/c"/></xsl:attribute>
<xsl:copy-of select="*"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
As it stands, I would not be surprised if this led to issues with more complex inputs.
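One way to make this somewhat more tolerant of richer inputs, offered only as a sketch, is to pair the /root template with the identity template so that existing attributes on root and any text, comment or processing-instruction children are kept as well. It still assumes the value comes from a/c; if there are several matches, xsl:value-of in XSLT 1.0 uses the first one.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copies every node and attribute unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="/root">
    <xsl:copy>
      <!-- Keep existing attributes, then add juris before any child content. -->
      <xsl:apply-templates select="@*"/>
      <xsl:attribute name="juris">
        <xsl:value-of select="a/c"/>
      </xsl:attribute>
      <xsl:apply-templates select="node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Attributes have to be written before child nodes, which is why the attribute instruction comes ahead of the second apply-templates.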