XSLT output format - xslt

I am using XSLT to generate an .sql file from an .xml input file.
I have some problems with the indentation.
The way the stylesheet is formatted (how many line feeds, carriage returns and tabs it contains) directly affects the output file, i.e. if I include a few line feeds and CRs in my stylesheet to make it more readable, they show up in the output file as well (this would not be that bad if the tabs didn't affect the formatting of the output file too).
It looks like this:
SQLStatement1<CR><LF>
<CR><LF>
<CR><LF>
SQLStatement2<CR><LF>
.... (tabs are also outputted)
I use an ant task to create the .sql file. The target looks like this:
<xslt in="input.xml"
out="queries.sql"
style="createQueries.xls">
</xslt>
I am using XSLT 1.0 and cannot use XSLT 2.0.
I thought about modifying some output parameters. However, changing the method attribute to e.g. 'html' has no effect (I guess that the method is set to 'text' since the type of the output file (sql) is not known).
Any ideas on how to fix this issue?
Cheers

You would make it much easier on us if you showed a small but complete XML input sample, an XSLT sample, the output you get and the output you want.
If you use xsl:output method="text" and want to control the white space then make sure you use xsl:text to output literal text and xsl:value-of to output computed text. That way you should be able to control the white space exactly.

Related

How to get ampersand "&" in output of Transform xml activity of TIBCO

Could anyone please help me get an ampersand "&" in the output of the Transform XML activity in TIBCO?
My requirement: the xmlstring from the Transform XML activity is mapped to Parse XML (which gives the final output), e.g. Maitree&Sons. What should be passed in the XSLT so that when the output from Transform XML goes to Parse XML, the final result contains "&"?
I also tried using CDATA and disable-output-escaping in the XSLT, but Parse XML fails.
Please help.
Generally XSLT won't allow you to produce invalid output. The correct representation in XML is Maitree&amp;Sons and this is what it produces. If it produced Maitree&Sons, this would be invalid XML and would be thrown out by an XML parser trying to read the document.
Having said that, it's possible using disable-output-escaping to produce an unescaped ampersand if your XSLT processor supports this option. If it's not working for you we need to know exactly what you did and how it failed.
(General rule: on SO, always tell us exactly what you did and exactly how it failed. Saying in general terms that you tried lots of things and none of them worked doesn't get us any nearer to a solution.)
LATER
I'm reading the question again. You want to produce output from the transformer that will go into an XML parser, such that the output of the parser is Maitree&Sons. Well, in that case the lexical XML must be Maitree&amp;Sons, which it will be if you generate the string Maitree&Sons in XSLT. But XSLT is XML, so if you want to write this as a literal string in your stylesheet, it will be written Maitree&amp;Sons.
I guess we need a much clearer picture of what you are doing and where it is going wrong.
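The distinction between the lexical form (Maitree&amp;Sons) and the parsed value (Maitree&Sons) can be checked outside TIBCO. The following is only a sketch using Python's standard xml.etree.ElementTree module, not the TIBCO activities themselves; the element name "name" is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Lexical XML as a serializer (such as an XSLT processor) would write it.
# The element name "name" is a hypothetical placeholder.
lexical = "<name>Maitree&amp;Sons</name>"

# Parsing turns the escaped form back into a plain ampersand.
parsed = ET.fromstring(lexical)
value = parsed.text

# Serializing the parsed tree escapes the ampersand again.
reserialized = ET.tostring(parsed, encoding="unicode")

# A raw, unescaped ampersand is not well-formed XML and is rejected.
try:
    ET.fromstring("<name>Maitree&Sons</name>")
    raw_accepted = True
except ET.ParseError:
    raw_accepted = False

print(value)
print(reserialized)
print(raw_accepted)
```

So if Parse XML fails, one likely reason is that disable-output-escaping or CDATA tricks produced a raw ampersand that the parser cannot accept.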

XSLT missing linebreaks when selecting nodes by path expression

I am copying some nodes according to XSLT: Copy child elements of a complex type only once by using a path expression within a copy-of tag:
<xsl:copy-of select="/xs:schema/xs:complexType[@name=current()/xs:element/@type]"/>
In the output, all line breaks are missing between the elements processed by this statement. (Elsewhere they are shown.) It looks like this:
...</xs:complexType><xs:complexType....
I can only add line breaks before and after the copied elements, but not between them. How can I achieve this? Thanks for your help!
You provided too little data to attempt any testing. E.g. it is not clear which output method your script uses.
Quite often an XSLT script contains an xsl:strip-space instruction, which removes
whitespace-only text nodes from the source tree, including the line breaks
between elements.
Maybe this is the cause.
Also take a look at the xsl:output instruction in your script.
Does it contain indent="yes" attribute?
If it doesn't, the output contains no line breaks between output elements.
Maybe your script outputs explicit line breaks in some places
(e.g. <xsl:text>&#xA;</xsl:text>), so these line breaks are rendered.
But if you have no indent="yes" attribute, then no line breaks are inserted
"automatically" between consecutive elements.
Your XPath expression only selects the xs:complexType elements, not the whitespace that separates them.
When you're working with a vocabulary such as XSD that doesn't use mixed content (except perhaps in annotations) it's probably best to remove all whitespace text nodes from the input using xsl:strip-space and then to generate new whitespace in the output using xsl:output indent='yes'.

Replacing strings in xml file using batch script

I'm new to batch scripting. I want to replace a string in a particular file.
I'm getting an error in the script below.
@echo off
$standalone = Get-Content 'C:\wildfly\standalone\configuration\standalone.xml'
$standalone -replace '<wsdl-host>${jboss.bind.address:127.0.0.1}</wsdl-host>','<wsdl-host>${jboss.bind.address:0.0.0.0}</wsdl-host>' |
Set-Content 'C:\wildfly\standalone\configuration\standalone.xml'
The proper way to edit XML is to process it as an XML document, not as a string. That's because the XML file is not guaranteed to maintain specific formatting. Any edits should be context-aware, and string replacement isn't. Consider these three equivalent XML fragments:
<wsdl-host>${jboss.bind.address:127.0.0.1}</wsdl-host>
<wsdl-host>${jboss.bind.address:127.0.0.1}</wsdl-host >
<wsdl-host >${jboss.bind.address:127.0.0.1}</wsdl-host >
Note that the whitespace inside the tags differs, and it's legal to add some after the element name. What's more, in practice, a lot of implementations simply discard line breaks in element values, so the following two are likely to give the same result to a config parser:
<wsdl-host>${jboss.bind.address:127.0.0.1}</wsdl-host>
<wsdl-host>${jboss.bind.address:127.0.0.1}
</wsdl-host>
It really doesn't make much sense to process XML as string, does it?
Fortunately, PowerShell has built-in support for XML files. A simple approach looks like this:
# Mock XML config
[xml]$x = @'
<root>
<wsdl-host>${jboss.bind.address:127.0.0.1}</wsdl-host>
</root>
'@
# Let's change the wsdl-host element's contents
$x.root.'wsdl-host' = '${jboss.bind.address:0.0.0.0}'
# Save the modified document to console to see the change
$x.save([console]::out)
<?xml version="1.0" encoding="ibm850"?>
<root>
<wsdl-host>${jboss.bind.address:0.0.0.0}</wsdl-host>
</root>
If you can't use PowerShell and are stuck with batch scripts, you really need to use a third-party XML manipulation program.
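For readers who have neither option, the same context-aware edit can be sketched in another language. The example below uses Python's standard xml.etree.ElementTree module purely as an illustration (it is not part of the answer above); the root wrapper mirrors the mock config, and a real standalone.xml uses XML namespaces that would need extra handling.

```python
import xml.etree.ElementTree as ET

# The three fragments from above differ only in whitespace inside the tags.
fragments = [
    "<wsdl-host>${jboss.bind.address:127.0.0.1}</wsdl-host>",
    "<wsdl-host>${jboss.bind.address:127.0.0.1}</wsdl-host >",
    "<wsdl-host >${jboss.bind.address:127.0.0.1}</wsdl-host >",
]

# After parsing, all three yield the same tag name and the same text,
# which is why a literal string match on the raw file is fragile.
parsed = {(ET.fromstring(f).tag, ET.fromstring(f).text) for f in fragments}
all_equal = len(parsed) == 1

# Mock config, analogous to the PowerShell example (hypothetical layout).
doc = ET.fromstring(
    "<root>\n  <wsdl-host >${jboss.bind.address:127.0.0.1}</wsdl-host >\n</root>"
)

# Edit the element's content, regardless of how the tags were formatted.
host = doc.find("wsdl-host")
host.text = host.text.replace("127.0.0.1", "0.0.0.0")

result = ET.tostring(doc, encoding="unicode")
print(all_equal)
print(result)
```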

Prevent Narrow Non-Breaking Space (n-nbsp) in XSLT output

I have an XSLT transform that puts &#x202F; into my output. That is a narrow non-breaking space (U+202F). Here is one section that results in nnbsp:
<span>
<xsl:text>§ </xsl:text>
<xsl:value-of select="$firstsection"/>
<xsl:text> to </xsl:text>
<xsl:value-of select="$lastsection"/>
</span>
In this case, the nnbsp comes in after the § and after the text "to".
<span>§ 1 to 8</span>
(interestingly, the space before the to turns out to be a regular full size space)
This occurs in my UTF-8 encoded output, as well as iso-8859-1 (latin1).
How can I avoid the nnbsp? While the narrow space is visually more appropriate, it doesn't work for all the devices that will read this document. I need a plain vanilla blank space.
Is there a transform setting? I use Saxon 9 at the command line.
Should I do another transform, using a replace template to replace the nnbsp?
Should I re-do my templates like the one above? Example, if I did a concat() would that be a better coding practice?
UPDATE: For those who may find this question someday... as suggested by Michael Kay, I researched the issue further. Indeed, it turns out narrow NBSPs were in the source XML files (and bled into my templates via cut/paste). I did not know this, and it was hard to discover (hat tip to gVim hex view). The narrow spaces don't exactly jump out at you in a GUI editor. I have no control over production of the source XML, so I had to find a way to 'deal with it.' Eric's answer below turned out to be my preferred way to scrub the narrow NBSP. Editing with sed was (and is) another option to consider, but I like keeping my production in XSLT when possible. So Eric's suggestion has worked well for me.
You could use the translate() function to replace your nnbsp with something else, but since you are using Saxon 9 you can rely on XSLT 2.0 features and use a character map, which will do that kind of thing automatically for you. For instance (assuming that you want to replace them with a non-breaking space):
<xsl:output use-character-maps="nnbsp"/>
<xsl:character-map name="nnbsp">
<xsl:output-character character="&#x202F;" string="&#xA0;"/>
</xsl:character-map>
Eric
The narrow non-breaking space is coming from somewhere: either the source document or the stylesheet. It's not being magically injected by the XSLT processor. If it's in the stylesheet, then get rid of it. If it's in the source document, then transform it away, for example by use of the translate() function.
In fact, pasting your code fragment into a text editor and looking at it in hex, I see that the 202F characters are right there in your code. I don't know how you got them into your stylesheet, but you should (a) remove them, and (b) work out how it happened so it doesn't happen again.
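If no hex editor is at hand, a short script can locate the U+202F characters before you remove them. This is only a sketch in Python, not part of the XSLT toolchain; the sample string is an assumption modelled on the output shown in the question, and str.translate plays the same role here that translate() would play in XSLT.

```python
# U+202F NARROW NO-BREAK SPACE
NNBSP = "\u202f"

# Hypothetical sample modelled on the question's output:
# nnbsp after the section sign and after "to", a regular space before "to".
text = "\u00a7\u202f1 to\u202f8"

# Report where the narrow no-break spaces sit in the string.
positions = [i for i, ch in enumerate(text) if ch == NNBSP]

# Replace each one with a plain space.
cleaned = text.translate({ord(NNBSP): " "})

print(positions)
print(cleaned)
```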

Parsing with fscanf() ignoring spaces or missing values?

I'm trying to scan a text file with XML, the XML has a number of items with this structure:
<enemy>
<type> 0 </type>
<x> 273 </x>
<y> 275 </y>
<event> </event>
</enemy>
The problem is that the xml may have spaces between tags or inside them. I created a loop and I'm trying to do a single scan in each iteration to get int type, x, y and event into a variable each. However I don't know how to ignore whitespaces nor how to handle missing values since some tags may or may not have a value (like event).
How can I scan this "enemy" regardless of spacing and missing values?
That's an easy one: you do not parse XML using fscanf(). Use a real XML parser, otherwise you will end up with very complicated code that fails most of the time, either returning wrong data or crashing.
XML format (despite seeming simplicity) is complicated even in most innocuous cases and existing XML parsers are there for a reason. See libxml or a lot of others.
Still, if you are hell-bent on parsing XML yourself, the right way to do it is to first tokenize the input and then ensure that your token sequences result in correct forms. That's way more complicated than using simple fscanf().
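To show how little code a real parser needs for this case, here is a minimal sketch. It is written in Python with the standard xml.etree.ElementTree module rather than in C with libxml, purely for illustration; the enemies wrapper element and the helper names read_int and parse_enemies are assumptions, not taken from the question.

```python
import xml.etree.ElementTree as ET

# Sample input; the <enemies> root element is an assumed wrapper.
SAMPLE = """
<enemies>
  <enemy>
    <type> 0 </type>
    <x> 273 </x>
    <y> 275 </y>
    <event> </event>
  </enemy>
  <enemy><type>1</type><x>10</x><y>20</y><event>7</event></enemy>
</enemies>
"""


def read_int(enemy, tag, default=None):
    """Return the integer inside <tag>, or default if it is absent or blank."""
    text = enemy.findtext(tag)
    if text is None or not text.strip():
        return default
    return int(text.strip())


def parse_enemies(xml_text):
    """Parse every <enemy> element into a dict of its four integer fields."""
    root = ET.fromstring(xml_text)
    return [
        {tag: read_int(enemy, tag) for tag in ("type", "x", "y", "event")}
        for enemy in root.iter("enemy")
    ]


enemies = parse_enemies(SAMPLE)
for enemy in enemies:
    print(enemy)
```

Whitespace around the values and between the tags is handled by the parser and by strip(), and an empty event simply becomes None instead of derailing the scan.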