Use XSLT to add newlines after attributes

I am trying to transform XML that looks like:
<item attr1="value1" attr2="value2"><nestedItem attr1="value1" attr="value2"/></item>
To XML that looks like:
<item
attr1="value1"
attr2="value2">
<nestedItem
attr1="value1"
attr="value2"/>
</item>
I am working with a stylesheet:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" encoding="UTF-8" indent="yes" />
<xsl:strip-space elements="*"/>
<xsl:template name="newline">
<xsl:text disable-output-escaping="yes">
</xsl:text>
</xsl:template>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="text()|@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="@*">
<xsl:attribute name="{name(.)}">
<xsl:value-of select="."/>
</xsl:attribute>
</xsl:template>
</xsl:stylesheet>
I've tried calling my newline template from a few different places, but can't get newlines inserted between my attributes.
Thanks!

There is no support for the wanted serialization in XSLT 1.0 and 2.0 (and, to my knowledge, also in the forthcoming XSLT 3.0).
If your XSLT processor allows serialization via a user-provided XmlWriter class, you can implement such serialization yourself.
For example, several overloads of the .NET XslCompiledTransform.Transform() method accept an XmlWriter as one of their arguments. Pass an instance of your own class that derives from XmlWriter.

The following template may not be pretty and may not be best practice, but it works for me. Give it a try. It will not output xmlns 'attributes' if there are any on the element.
<xsl:template match="*">
<xsl:text disable-output-escaping="yes">&lt;</xsl:text><xsl:value-of select="name()"/><xsl:text>
</xsl:text>
<xsl:for-each select="@*">
<xsl:text> </xsl:text><xsl:value-of select="concat(name(),'=' ,'&quot;', . ,'&quot;')" /><xsl:text>
</xsl:text>
</xsl:for-each>
<xsl:text disable-output-escaping="yes">&gt;
</xsl:text>
<xsl:apply-templates/>
<xsl:text disable-output-escaping="yes">&lt;/</xsl:text><xsl:value-of select="name()"/><xsl:text disable-output-escaping="yes">&gt;
</xsl:text>
</xsl:template>

When you use the Saxon serializer with indentation you can get output close to what you are looking for, but it might not be exactly what you want. If you're really fussy about the format then you will have to write your own serializer or adapt an existing one by tweaking the code. The usual philosophy in XML circles is that you shouldn't really care about distinctions that will be ignored once the data is parsed, and that includes things like the choice of quote character, the order of attributes, and the whitespace that separates attributes.
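To illustrate the "write your own serializer" route outside XSLT, here is a minimal Python sketch that uses only the standard library. The function names serialize and attributes_on_lines are hypothetical, and the sketch assumes a document without namespaces or significant whitespace (text is stripped and placed on its own line). It is one possible approach under those assumptions, not a general-purpose serializer.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def serialize(elem, lines=None):
    """Hypothetical serializer: emit each attribute on its own line."""
    if lines is None:
        lines = []
    children = list(elem)
    text = (elem.text or "").strip()
    has_content = bool(children or text)

    # Start tag: the element name first, then one line per attribute.
    head = ["<" + elem.tag]
    head += [name + "=" + quoteattr(value) for name, value in elem.attrib.items()]
    head[-1] += ">" if has_content else "/>"
    lines.extend(head)

    if text:
        lines.append(escape(text))
    for child in children:
        serialize(child, lines)
        tail = (child.tail or "").strip()
        if tail:
            lines.append(escape(tail))
    if has_content:
        lines.append("</" + elem.tag + ">")
    return lines


def attributes_on_lines(xml_text):
    """Parse xml_text and return it with every attribute on a separate line."""
    return "\n".join(serialize(ET.fromstring(xml_text)))


if __name__ == "__main__":
    print(attributes_on_lines(
        '<item attr1="value1" attr2="value2">'
        '<nestedItem attr1="value1" attr="value2"/></item>'
    ))
```

Running it on the input from the question prints the layout that was asked for. The result is still equivalent XML to any parser, which is the point made above about distinctions that disappear once the data is parsed.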

Related

XSLT List attributes in the order they appear in the xml file

I have a large number of xml files with a structure similar to the following, although they are far larger:
<?xml version="1.0" encoding="UTF-8"?>
<a a1="3.0" a2="ABC">
<b b1="P1" b2="123">first
</b>
<b b1="P2" b2="456" b3="xyz">second
</b>
</a>
I want to get the following output:
1|1|b1
1|2|b2
2|1|b1
2|2|b2
2|3|b3
where:
Field 1 is the sequence number for nodes /a/b
Field 2 is the sequence number of the attribute as it appears in the xml file
Field 3 is the attribute name (not value)
I don't quite know how to calculate field 2 correctly.
I've prepared the following xslt file:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="/">
<xsl:for-each select="a/b/@*">
<xsl:value-of select="count(../preceding-sibling::*)+1"/>
<xsl:text>|</xsl:text>
<!-- TODO: This is not correct -->
<xsl:value-of select="count(preceding-sibling::*)+1"/>
<xsl:text>|</xsl:text>
<xsl:value-of select="name()"/>
<xsl:text>
</xsl:text>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
but when I run the following command:
xsltproc a.xslt a.xml > a.csv
I get an incorrect output, as field 2 does not represent the attribute sequence number:
1|1|b1
1|1|b2
2|1|b1
2|1|b2
2|1|b3
Do you have any suggestions on how to get the correct output please?
Please notice that the answers provided in XSLT to order attributes do not provide a solution to this problem.
The order of attributes is irrelevant in XML. For instance, <a a1="3.0" a2="ABC"> and <a a2="ABC" a1="3.0"> are equivalent.
However this specific question is part of a larger application where it is essential to establish the order in which attributes appear in given xml files (and not in xml files that are equivalent to them).
As kjhughes says in the comments, attribute order is insignificant. However, you can still select the attributes and use the position() function to get the numbers you are after. (You just can't be sure that the order in which they are returned is the order in which they appear in the XML, although generally this will be the case.)
Try this XSLT. Do note the nested use of xsl:for-each to select only b elements first, to get their position, before getting the attributes, which then have their own separate position.
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" />
<xsl:template match="/">
<xsl:for-each select="a/b">
<xsl:variable name="bPosition" select="position()"/>
<xsl:for-each select="@*">
<xsl:value-of select="$bPosition"/>
<xsl:text>|</xsl:text>
<xsl:value-of select="position()"/>
<xsl:text>|</xsl:text>
<xsl:value-of select="name()"/>
<xsl:text>
</xsl:text>
</xsl:for-each>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
You could use the position() of each attribute in the sequence you are iterating over and subtract the number of attributes that belong to the preceding sibling b elements.
<xsl:template match="/">
<xsl:for-each select="a/b/@*">
<xsl:value-of select="count(../preceding-sibling::*)+1"/>
<xsl:text>|</xsl:text>
<!-- position() counts across all selected attributes, so subtract those on earlier b elements -->
<xsl:value-of select="position() - count(../preceding-sibling::b/@*)"/>
<xsl:text>|</xsl:text>
<xsl:value-of select="name()"/>
<xsl:text>
</xsl:text>
</xsl:for-each>
</xsl:template>
Which produces the following output:
1|1|b1
1|2|b2
2|1|b1
2|2|b2
2|3|b3
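For comparison, the same three fields can be produced outside XSLT. The Python sketch below uses the standard-library ElementTree parser. The function name attribute_rows is hypothetical, and the sketch assumes the parser hands back attributes in document order, which ElementTree does in practice on current Python versions even though XML itself attaches no meaning to that order.

```python
import xml.etree.ElementTree as ET

SAMPLE = (
    '<a a1="3.0" a2="ABC">'
    '<b b1="P1" b2="123">first</b>'
    '<b b1="P2" b2="456" b3="xyz">second</b>'
    '</a>'
)


def attribute_rows(xml_text):
    """Return 'b index|attribute index|attribute name' rows for /a/b."""
    root = ET.fromstring(xml_text)
    rows = []
    for b_index, b in enumerate(root.findall("b"), start=1):
        # attrib is a dict that keeps the order in which the parser saw the attributes
        for attr_index, name in enumerate(b.attrib, start=1):
            rows.append(f"{b_index}|{attr_index}|{name}")
    return rows


if __name__ == "__main__":
    print("\n".join(attribute_rows(SAMPLE)))
```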

Setting disable-output-escaping="yes" for every xsl:text tag in the xml

Say I have the following XSLT:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/*">
<display>
<xsl:for-each select="logline_t">
<xsl:text disable-output-escaping="yes">&lt;</xsl:text> <xsl:value-of select="./line_1"/> <xsl:text disable-output-escaping="yes">&gt;</xsl:text>
<xsl:text disable-output-escaping="yes">&lt;</xsl:text> <xsl:value-of select="./line_2"/> <xsl:text disable-output-escaping="yes">&gt;</xsl:text>
<xsl:text disable-output-escaping="yes">&lt;</xsl:text> <xsl:value-of select="./line_3"/> <xsl:text disable-output-escaping="yes">&gt;</xsl:text>
</xsl:for-each>
</display>
</xsl:template>
</xsl:stylesheet>
Is there a way to set disable-output-escaping="yes" to all of the xsl:text that appear in the document?
I know there is an option to put
<xsl:output method="text"/>
and every time something like
&lt;
appears, a < will appear, but sometimes the values of line_1, line_2 or line_3 contain a "&lt;" that I don't want changed (that is, I only need whatever is between the xsl:text tags to be changed).
This is what I'm trying to accomplish. I have this xml:
<readlog_l>
<logline_t>
<hora>16:01:09</hora>
<texto>Call-ID: 663903&lt;hola&gt;396@127.0.0.1</texto>
</logline_t>
</readlog_l>
And this translation:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/*">
<display>
<screen name="<xsl:value-of select="name(.)"/>">
<xsl:for-each select="logline_t">
< field name="<xsl:for-each select="*"><xsl:value-of select="."/></xsl:for-each>" value="" type="label"/>
</xsl:for-each>
</screen>
</display>
</xsl:template>
</xsl:stylesheet>
I want this to be the output:
<?xml version="1.0"?>
<display>
<screen name="readlog_l">
<field name="16:01:09 Call-ID: 663903<hola>396@127.0.0.1 " value="" type="label">
</screen>
</display>
Note that I need the "<" inside the field name not to be escaped, this is why I can't use output method text.
Also, note that this is an example and the translations are much bigger, so this is why I'm trying to find out how not to write disable-output-escaping for every '<' or '>' I need.
Thanks!
Thanks for clarifying the question. In this case, I'm fairly sure there's no need to disable output escaping. XSLT was designed to accomplish what you're doing:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/*">
<display>
<screen name="{name(.)}">
<xsl:for-each select="logline_t">
<xsl:variable name="nameContent">
<xsl:for-each select="*">
<xsl:if test="position() > 1"><xsl:text> </xsl:text></xsl:if>
<xsl:value-of select="."/>
</xsl:for-each>
</xsl:variable>
<field name="{$nameContent}" value="" type="label" />
</xsl:for-each>
</screen>
</display>
</xsl:template>
</xsl:stylesheet>
I'm a bit unclear on this point:
Note that I need the "<" inside the field name not to be escaped, this is why I can't use output method text.
Which < are you referring to? Is it the < and > around "hola"? If you left those unescaped you would wind up with invalid XML. It also looks like the name attribute in your sample output has a lot of values that aren't in the input XML. Where did those come from?
Given your expected output you don't need d-o-e at all for this. Here is a possible solution that doesn't use d-o-e, and is based on templates rather than for-each:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes" />
<xsl:template match="/*">
<display>
<screen name="{name(.)}">
<xsl:apply-templates select="logline_t"/>
</screen>
</display>
</xsl:template>
<xsl:template match="logline_t">
<field value="" type="label">
<xsl:attribute name="name">
<xsl:apply-templates select="*" mode="fieldvalue"/>
</xsl:attribute>
</field>
</xsl:template>
<xsl:template match="*[last()]" mode="fieldvalue">
<xsl:value-of select="." />
</xsl:template>
<xsl:template match="*" mode="fieldvalue">
<xsl:value-of select="." />
<xsl:text> </xsl:text>
</xsl:template>
</xsl:stylesheet>
If you want to set d-o-e on everything, that suggests you are trying to generate markup "by hand". I don't think that's a particularly good idea (in fact, I think it's a lousy idea), but if it's what you want to do, I would suggest using the text output method instead of the xml output method. That way, no escaping of special characters takes place, and therefore it doesn't need to be disabled.
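The point that escaping is the serializer's job, and that a parser undoes it, can be checked outside XSLT too. The following Python sketch uses the standard-library ElementTree. The function name build_field and the sample value are made up for illustration. It shows that a "<" placed in an attribute value is written as &lt; and comes back unchanged when the output is parsed.

```python
import xml.etree.ElementTree as ET


def build_field(parts):
    """Build a <field> element whose name attribute joins parts with spaces."""
    field = ET.Element(
        "field", {"name": " ".join(parts), "value": "", "type": "label"}
    )
    # The serializer escapes markup characters in attribute values by itself.
    return ET.tostring(field, encoding="unicode")


serialized = build_field(["16:01:09", "Call-ID: 663903<hola>396"])
parsed_name = ET.fromstring(serialized).get("name")

if __name__ == "__main__":
    print(serialized)   # the attribute value contains &lt;hola&gt;
    print(parsed_name)  # the parser restores the literal < and >
```

Any consumer that reads the result with an XML parser sees the literal characters, so there is nothing to gain from writing them unescaped.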

XSL FO inline alignment on an existing sort/conditional XSL

I need to get right-align and left-align working in the same line. Looking over similar responses, I found the below recommendation,
<fo:block text-align-last="justify">
LEFT TEXT (want this to be the Contacts element from the below)
<fo:leader leader-pattern="space" />
RIGHT TEXT (want this to be the Address1 element from the below)
</fo:block>
But when I try to apply it to my existing XSL code (see below) I can’t make it work – I don’t know enough about how to edit it to accommodate/merge both the sort/conditionals and the FO. Can someone help me get this right?
Existing/working code:
<?xml version="1.0"?><!-- DWXMLSource="XML - Builder Members.xml" -->
<!DOCTYPE xsl:stylesheet [<!ENTITY nbsp "&#160;">]>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml" indent="no"/>
<xsl:template match="/">
<memberdata>
<xsl:for-each select="memberdata/memberinfo">
<xsl:sort select="SortKey"/>
<memberdata>
<xsl:if test="Contacts[.!='']">
<Contacts><xsl:value-of select="Contacts" /></Contacts>
<xsl:text>
</xsl:text>
</xsl:if>
<xsl:if test="Address1[.!='']">
<Address1><xsl:value-of select="Address1" /></Address1>
<xsl:text>
</xsl:text>
</xsl:if>
</memberdata>
</xsl:for-each>
</memberdata>
</xsl:template>
</xsl:stylesheet>
Independently of the actual answer to your question (which is impossible to give in the current form the question is in), I'd like to suggest a few improvements to your general approach to XSLT:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml" indent="no"/>
<xsl:template match="memberdata">
<xsl:copy>
<xsl:apply-templates select="memberinfo">
<xsl:sort select="SortKey" />
</xsl:apply-templates>
</xsl:copy>
</xsl:template>
<xsl:template match="memberinfo">
<memberdata>
<xsl:apply-templates select="Contacts" />
<xsl:apply-templates select="Address1" />
</memberdata>
</xsl:template>
<xsl:template match="Contacts|Address1">
<xsl:if test="normalize-space() != ''">
<xsl:copy-of select="." />
<xsl:text>&#10;</xsl:text>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
Avoid <xsl:for-each>, use distinct templates and <xsl:apply-templates> instead. This results in cleaner, less duplicated and less deeply nested code. It also could result in more efficient processing of your stylesheet, as XSLT processors are optimized for template matching and can parallelize template execution.
Note that you can use the same template for multiple elements, see third template above.
Avoid adding line-breaks via such a construct: <xsl:text>
</xsl:text>. Doing this destroys source code readability and is prone to errors as soon as the source code is formatted (I've already done this in your question to be able to indent your code properly in the first place). Use character references (&#10;) instead to separate source code layout and output layout.
Note that you can use <xsl:copy-of> to make a copy of an element, no need to do <foo><xsl:value-of select="foo" /></foo>.
Taking your request at face value, this seems to be what you're asking for, which merges the sort, the conditionals and the FO.
<?xml version="1.0"?><!-- DWXMLSource="XML - Builder Members.xml" -->
<!DOCTYPE xsl:stylesheet [<!ENTITY nbsp "&#160;">]>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:fo="http://www.w3.org/1999/XSL/Format" version="1.0">
<xsl:output method="xml" indent="no"/>
<xsl:template match="/">
<memberdata>
<xsl:for-each select="memberdata/memberinfo">
<xsl:sort select="SortKey"/>
<memberdata>
<fo:block text-align-last="justify">
<xsl:if test="Contacts[.!='']">
<Contacts><xsl:value-of select="Contacts" /></Contacts>
<xsl:text>
</xsl:text>
</xsl:if>
<fo:leader leader-pattern="space" />
<xsl:if test="Address1[.!='']">
<Address1><xsl:value-of select="Address1" /></Address1>
<xsl:text>
</xsl:text>
</xsl:if>
</fo:block>
</memberdata>
</xsl:for-each>
</memberdata>
</xsl:template>
</xsl:stylesheet>
However it seems unlikely that you really want to mix <fo:*> elements and other elements (<memberdata>) in your output... unless you plan to process them later to produce a full FO document. So the above may not be quite the solution you need.
(See also @Tomalak's good points about how to improve the XSLT. I would differ with him only on the question of for-each vs. apply-templates... it really depends on several factors and what your priorities are.)

XSLT: how to write redundant xmlns?

I need the following output for a bizarre system that expects the same xmlns to be declared in both parent and child and refuses to work otherwise. That is, this is what is expected:
<root xmlns="http://something">
<element xmlns="http://something" />
</root>
I can create xmlns in root with
<?xml version="1.0" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<xsl:element name="root" namespace="http://something">
<xsl:element name="node" namespace="http://something" />
</xsl:element>
</xsl:template>
</xsl:stylesheet>
However, it doesn't add xmlns to the child node because the node's parent has the same xmlns. How can I force XSLT to write xmlns regardless of the parent?
Namespace declarations are not attributes in the XPath/XSLT data model, so an XSLT stylesheet cannot create an attribute named xmlns directly using <xsl:attribute>. I can only see two options for you...
One option is to create dummy attributes using a different name (e.g. xmlnsx):
<xsl:template match="/">
<xsl:element name="root">
<xsl:attribute name="xmlnsx">http://something</xsl:attribute>
<xsl:element name="node">
<xsl:attribute name="xmlnsx">http://something</xsl:attribute>
</xsl:element>
</xsl:element>
</xsl:template>
... and then replace all occurrences of the attribute xmlnsx with xmlns in some post-processing step (such as a SAX filter or other stream editor). However, this solution involves inserting a non-XSLT step into the pipeline.
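As a rough idea of what such a post-processing step could look like, here is a deliberately naive Python sketch. The function name restore_xmlns and the regular expression are assumptions for illustration. It rewrites the placeholder wherever it appears as " xmlnsx=", so it would also touch identical text inside element content or attribute values, and a SAX filter would be more robust.

```python
import re
import xml.etree.ElementTree as ET


def restore_xmlns(xml_text, placeholder="xmlnsx"):
    """Rename the placeholder attribute to xmlns in serialized XML text."""
    # Match the placeholder only where it looks like an attribute name:
    # preceded by whitespace and followed by an equals sign.
    pattern = re.compile(r"(\s)" + re.escape(placeholder) + r"(\s*=)")
    return pattern.sub(r"\g<1>xmlns\g<2>", xml_text)


xslt_output = '<root xmlnsx="http://something"><node xmlnsx="http://something"/></root>'
fixed = restore_xmlns(xslt_output)

if __name__ == "__main__":
    print(fixed)
    # Both elements now sit in the namespace, each with its own declaration.
    print(ET.fromstring(fixed).tag)
```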
The other option is pure, if ugly, XSLT. You could generate the required XML directly, using xsl:text and disable-output-escaping, like this:
<xsl:template match="/">
<xsl:text disable-output-escaping="yes">&lt;root xmlns="http://something"&gt;</xsl:text>
<xsl:text disable-output-escaping="yes">&lt;node xmlns="http://something"/&gt;</xsl:text>
<xsl:text disable-output-escaping="yes">&lt;/root&gt;</xsl:text>
</xsl:template>
Note that the XSLT 1.0 specification is pretty loose when it comes to serialization, so a particular XSLT processor could still conceivably strip the redundant namespace declarations from this second solution. However, it worked in the four processors that I tried (namely Saxon, MSXML, MSXML.NET and LIBXML).

Can an XSLT processor preserve empty CDATA sections?

I'm processing an XML document (an InstallAnywhere .iap_xml installer) before handing it off to another tool (InstallAnywhere itself) to update some values. However, it appears that the XSLT transform I am using is stripping CDATA sections (which appear to be significant to InstallAnywhere) from the document.
I'm using Ant 1.7.0, JDK 1.6.0_16, and a stylesheet based on the identity:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml" encoding="UTF-8" cdata-section-elements="string" />
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Basically, "string" nodes that look like:
<string><![CDATA[]]></string>
are being processed into:
<string/>
From reading XSLT FAQs, I can see that what is happening is legal as far as the XSLT spec is concerned. Is there any way I can prevent this from happening and convince the XSLT processor to emit the CDATA section?
Found a solution:
<xsl:template match="string">
<xsl:element name="string">
<xsl:text disable-output-escaping="yes">&lt;![CDATA[</xsl:text><xsl:value-of select="text()" disable-output-escaping="yes" /><xsl:text disable-output-escaping="yes">]]&gt;</xsl:text>
</xsl:element>
</xsl:template>
I also removed the cdata-section-elements attribute from the <xsl:output> element.
Basically, since the CDATA sections are significant to the next tool in the chain, I output them manually.
To do this, you'll need to add a special case for empty string elements and use disable-output-escaping. I don't have a copy of Ant to test with, but the following template worked for me with libxml's xsltproc, which exhibits the same behavior you describe:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes" omit-xml-declaration="yes" cdata-section-elements="string"/>
<xsl:template match="string">
<xsl:choose>
<xsl:when test=". = ''">
<string>
<xsl:text disable-output-escaping="yes">&lt;![CDATA[]]&gt;</xsl:text>
</string>
</xsl:when>
<xsl:otherwise>
<xsl:copy-of select="."/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Input:
<input>
<string><![CDATA[foo]]></string>
<string><![CDATA[]]></string>
</input>
Output:
<input>
<string><![CDATA[foo]]></string>
<string><![CDATA[]]></string>
</input>
Once the XML parser has finished with the XML, there is absolutely no difference between <![CDATA[abc]]> and abc. And the same is true for an empty string: <![CDATA[]]> resolves to nothing at all, and is silently ignored. It has no representation in the XML model. In fact, there is no way to tell the difference between CDATA and regular strings, because the distinction has no representation in the XML model.
Sorry.
Now, why would you want this? Perhaps there is another solution which can help you?
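The claim that a parsed document no longer distinguishes CDATA from ordinary text is easy to check. The Python sketch below uses the standard-library ElementTree as a stand-in for the XSLT processor's input parser. The function name string_texts is hypothetical.

```python
import xml.etree.ElementTree as ET


def string_texts(xml_text):
    """Return the text of every <string> element, using "" when there is none."""
    root = ET.fromstring(xml_text)
    return [el.text or "" for el in root.iter("string")]


with_cdata = (
    "<input><string><![CDATA[abc]]></string>"
    "<string><![CDATA[]]></string></input>"
)
without_cdata = "<input><string>abc</string><string/></input>"

if __name__ == "__main__":
    # Both documents yield the same text once parsed.
    print(string_texts(with_cdata))
    print(string_texts(without_cdata))
```

Because both inputs look identical after parsing, a stylesheet has nothing to match on. That is why the answers above fall back to writing the CDATA markers by hand with disable-output-escaping.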