I'm trying to simulate XSLT's xsl:copy-of in FreeMarker: I want everything inside a node to be written to the output.
I'm using this template:
<#ftl ns_prefixes={"D": "http://milyn.codehaus.org/Smooks"} output_format="XML">
${Order.orderitem.@@markup}
I'm facing two issues here:
1. The < and > of the XML tags are escaped in the output as well. I still need the XML output format so that characters such as & are escaped.
2. How can I remove the namespace declarations that appear on every node?
My response is
<orderitem xmlns="http://milyn.codehaus.org/Smooks"><position>1</position><quantity>1</quantity><productid>364</productid><title>The 40YearOld</title><price>29.98</price></orderitem><orderitem xmlns="http://milyn.codehaus.org/Smooks"><position>2</position><quantity>1</quantity><productid>299</productid><title>Pulp Fiction</title><price>29.99</price></orderitem>
Input being
<Order xmlns="http://milyn.codehaus.org/Smooks" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<header>
<orderid>1</orderid>
<statuscode>0</statuscode>
<netamount>59.97</netamount>
<totalamount>64.92</totalamount>
<tax>4.95</tax>
<date>Wed Nov 15 13:45:28 EST 2006</date>
</header>
<customerdetails>
<username>user1</username>
<name>
<firstname>Harry</firstname>
<lastname>Fletcher</lastname>
</name>
<state>South Dakota</state>
</customerdetails>
<orderitem>
<position>1</position>
<quantity>1</quantity>
<productid>364</productid>
<title>The 40YearOld</title>
<price>29.98</price>
</orderitem>
<orderitem>
<position>2</position>
<quantity>1</quantity>
<productid>299</productid>
<title>Pulp Fiction</title>
<price>29.99</price>
</orderitem>
</Order>
To prevent auto-escaping, use ${Order.orderitem.@@markup?no_esc}. (Unfortunately, XML wrapping predates auto-escaping by a long way, so it has stayed like this.)
As for preventing the repeated xmlns declarations: you can't. As far as @@markup can tell, the orderitem elements have no common ancestor on which a single shared xmlns declaration could be placed, so it does the safest thing and declares the namespace on each element.
Related
I have a fairly nested XML file that I'd like to transform with an XSL template to something a little simpler to make bulk loading the data into SQL more efficient. I wanted to do it in C++ (Codeblocks with gcc) but I'm having a bit of trouble just being able to load the document with any of the libraries I've come across, including MSXML. If anyone has any experience using MSXML in Codeblocks with gcc let me know!
I have a stylesheet that transforms the XML in Excel VBA with a DOMDocument but I don't want to depend on Excel. I figured the next best thing would be a VBScript.
The data are one or two text values held in <DATAVALUE> nodes, which are descendants of 100 <LOCATION> nodes. The first child of each <LOCATION> node, called <LOCATIONNAME>, holds a unique name for that <LOCATION> node (i.e., NAME1 through NAME100). The third and fourth children of the <LOCATION> node (if there is a fourth child) are <DATA> nodes, each holding a <DATAVALUE> node. The file can have upwards of 1 million <SAMPLE> nodes. Here is the XML:
<?xml version="1.0" encoding="utf-8"?>
<MYImportFile xmlns="urn:ohLookHEREaNamespacedeclaration">
<HEADERVERSION>1.10</HEADERVERSION>
<MESSAGE>Import</MESSAGE>
<MYBED>QUEEN</MYBED>
<SOURCE>SPRING </SOURCE>
<USERID>MMOUSE</USERID>
<DATETIME>2019-11-25T12:31:00</DATETIME>
<SAMPLE TYPE="No" APPLE="false">
<SAMPLEID>0000565</SAMPLEID>
<SAMPLECATEGORY>CLASS5</SAMPLECATEGORY>
<LOCATION APPLE="false">
<LOCATIONNAME>NAME1</LOCATIONNAME>
<READBY>MMOUSE</READBY>
<TIME>12:31:00</TIME>
<DATA>
<DATAVALUE>aaaa</DATAVALUE>
</DATA>
<DATA>
<DATAVALUE>bbbb</DATAVALUE>
</DATA>
</LOCATION>
'''''''''''''''''there are 100 LOCATION entries''''''''''''''''''''''''
<LOCATION APPLE="false">
<LOCATIONNAME>NAME100</LOCATIONNAME>
<READBY>MMOUSE</READBY>
<TIME>12:31:00</TIME>
<DATA>
<DATAVALUE>zzzz</DATAVALUE>
</DATA>
</LOCATION>
</SAMPLE>
'''''''''''''''''repeat for however many SAMPLES there are''''''''''''''''''''''
</MYImportFile>
I want to point out one thing to make it clearer what's going on. In the transformed XML document, I need to account for a <LOCATION> that has only one <DATA> node. This is done by copying the first <DATAVALUE> into a second element in the new document. For example, the <DATAVALUE> "zzzz" appears twice in the transformed document but only once in the original XML. Here is what I want the transformed XML to look like:
<?xml version="1.0" encoding="UTF-8"?>
<MYImportFile>
<SAMPLE>
<SAMPLEID>0000565</SAMPLEID>
<NAME1_1>aaaa</NAME1_1>
<NAME1_2>bbbb</NAME1_2>
<NAME2_1>cccc</NAME2_1>
<NAME2_2>dddd</NAME2_2>
'''''''''''''''''there are 100 LOCATION entries transformed to NAME1-NAME100''''''''''''''''''''''''
<NAME100_1>zzzz</NAME100_1>
<NAME100_2>zzzz</NAME100_2>
</SAMPLE>
'''''''''''''''''repeat for however many SAMPLES there are''''''''''''''''''''''
</MYImportFile>
My StyleSheet (that works with VBA code):
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:b="urn:ohLookHEREaNamespacedeclaration" exclude-result-prefixes="b">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/b:MYImportFile">
<MYImportFile>
<xsl:for-each select="b:SAMPLE">
<SAMPLE>
<SAMPLEID>
<xsl:value-of select="b:SAMPLEID"/>
</SAMPLEID>
<NAME1_1>
<xsl:value-of select="b:LOCATION/b:LOCATIONNAME[text() = 'NAME1']/../b:DATA[1]/b:DATAVALUE"/>
</NAME1_1>
<xsl:choose>
<xsl:when test="b:LOCATION/b:LOCATIONNAME[text() = 'NAME1']/../b:DATA[2]/b:DATAVALUE">
<NAME1_2>
<xsl:value-of select="b:LOCATION/b:LOCATIONNAME[text() = 'NAME1']/../b:DATA[2]/b:DATAVALUE"/>
</NAME1_2>
</xsl:when>
<xsl:otherwise>
<NAME1_2>
<xsl:value-of select="b:LOCATION/b:LOCATIONNAME[text() = 'NAME1']/../b:DATA[1]/b:DATAVALUE"/>
</NAME1_2>
</xsl:otherwise>
</xsl:choose>
'''''''''''''''''''there are 100 NAME entries to receive the 100 locations
</SAMPLE>
</xsl:for-each>
</MYImportFile>
</xsl:template>
</xsl:stylesheet>
My Script:
Option Explicit
Const strInputFile = "C:\Path\fileName.xml"
Const strTemplateFile = "C:\Path\convFileName.xsl"
Const strOutputFile = "C:\Path\newFileName.xml"
Dim objXMLDoc : Set objXMLDoc = WScript.CreateObject("Msxml2.DOMDocument")
objXMLDoc.async = False
objXMLDoc.loadXML(strInputFile)
objXMLDoc.SetProperty "SelectionNamespaces", "xmlns='urn:myNamespace'"
Dim objXSLDoc : Set objXSLDoc = WScript.CreateObject("Msxml2.DOMDocument")
objXSLDoc.async = False
objXSLDoc.loadXML(strTemplateFile)
Dim objNewXMLDoc : Set objNewXMLDoc = WScript.CreateObject("Msxml2.DOMDocument")
objXMLDoc.transformNodeToObject objXSLDoc, objNewXMLDoc
objNewXMLDoc.save strOutputFile
The error:
Line: 19
Char: 1
Error: The stylesheet does not contain a document element. The
stylesheet may be empty, or it may not be a well-formed XML document.
Code: 80004005
Source: msxml3.dll
I'm guessing either my script isn't quite right or there's a setting I'm missing, causing mismatching objects and libraries, because my VBA macro transforms the xml with that stylesheet. Anyone have any ideas? Suggestions to make this thing run?
As far as I remember, loadXML takes a string containing the XML markup itself, not a file path. If you have a file or URL to parse, use the load method instead, for both the input document and the stylesheet.
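The same markup-versus-path distinction exists in most XML APIs. As a minimal sketch outside MSXML (Python's standard xml.etree.ElementTree is used purely as an illustration; the helper names and the stylesheet content are made up for the demo), handing a file path to the string parser fails because the path is not well-formed XML, while the file-loading call succeeds:

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical stylesheet content, written to a temporary file for the demo.
XSL_TEXT = (
    '<xsl:stylesheet version="1.0" '
    'xmlns:xsl="http://www.w3.org/1999/XSL/Transform"/>'
)


def parse_as_markup(text):
    """String parser: like loadXML, it expects the XML markup itself."""
    try:
        return ET.fromstring(text)
    except ET.ParseError:
        return None  # e.g. a file path, which is not well-formed XML


def parse_as_file(path):
    """File parser: like load, it expects a location to read from."""
    return ET.parse(path).getroot()


with tempfile.TemporaryDirectory() as tmp:
    xsl_path = os.path.join(tmp, "convFileName.xsl")
    with open(xsl_path, "w", encoding="utf-8") as f:
        f.write(XSL_TEXT)

    from_path_string = parse_as_markup(xsl_path)  # the path itself is parsed as XML
    from_file = parse_as_file(xsl_path)           # the file is opened and parsed
    from_markup = parse_as_markup(XSL_TEXT)       # real markup parses fine
```

Here from_path_string ends up as None, which mirrors the "does not contain a document element" error in the question.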
Is there a regex to check for empty elements in the XML below? I want to check whether everything below the <ClientRequest> tag is populated.
<Response xmlns="http://Test/Types">
<ClientRequest>
<Name>TEST</Name>
<Id><222/Id>
<Parameters>
<SID>123456</SID>
</RequestParams>
<StartDate>2017-10-13T23:00:01.000+01:00</StartDate>
<EndDate>2017-10-14T22:59:59.000+01:00</EndDate>
<URL></URL>
</ClientRequest>
<Install/>
<Types/>
<LR/>
<Package/>
<Services/>
<Issues/>
<Complaints/>
</Response>
Use an XML parser or XPath, not regex, to check or parse XML.
This XPath,
//*[not(text()) and not(*)]
will select all elements that have no text or element children.
This XPath,
//*[not(node())]
will select all empty elements (also disallowing comment and PI children).
Note that your XML is not well-formed. Here it is with corrections:
<Response xmlns="http://Test/Types">
<ClientRequest>
<Name>TEST</Name>
<Id>222</Id>
<Parameters>
<SID>123456</SID>
</Parameters>
<StartDate>2017-10-13T23:00:01.000+01:00</StartDate>
<EndDate>2017-10-14T22:59:59.000+01:00</EndDate>
<URL></URL>
</ClientRequest>
<Install/>
<Types/>
<LR/>
<Package/>
<Services/>
<Issues/>
<Complaints/>
</Response>
Note also that you could wrap either of the above XPaths in boolean() or count() to return an indicator or a count of such empty elements.
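For readers who want to try the parser-based check without an XPath 1.0 engine, here is a rough sketch using Python's standard xml.etree.ElementTree. Its XPath subset has no not(), so the test is written as a plain loop that approximates //*[not(text()) and not(*)]; the helper names are made up for illustration. ElementTree drops comments and processing instructions by default, so it cannot tell the two XPaths above apart.

```python
import xml.etree.ElementTree as ET

NS = "{http://Test/Types}"

# The corrected XML from above.
XML = """<Response xmlns="http://Test/Types">
  <ClientRequest>
    <Name>TEST</Name>
    <Id>222</Id>
    <Parameters>
      <SID>123456</SID>
    </Parameters>
    <StartDate>2017-10-13T23:00:01.000+01:00</StartDate>
    <EndDate>2017-10-14T22:59:59.000+01:00</EndDate>
    <URL></URL>
  </ClientRequest>
  <Install/>
  <Types/>
  <LR/>
  <Package/>
  <Services/>
  <Issues/>
  <Complaints/>
</Response>"""


def local_name(elem):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return elem.tag.rsplit("}", 1)[-1]


def empty_elements(start):
    """Elements at or below 'start' with no child elements and no text."""
    return [e for e in start.iter() if len(e) == 0 and not e.text]


root = ET.fromstring(XML)
all_empty = [local_name(e) for e in empty_elements(root)]

# Restrict the check to the ClientRequest subtree, as the question asks.
request = root.find(NS + "ClientRequest")
empty_in_request = [local_name(e) for e in empty_elements(request)]
is_fully_populated = not empty_in_request
```

With the corrected XML, empty_in_request contains only URL, so is_fully_populated is False.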
I'm attempting to create an XSLT mapping that converts a fairly large integer value, arriving in a text field, into the appropriate integer value. The problem is that XSLT 1.0 only supports converting to the number type, so I get a value like 1.234567890E9 back for an input of "1234567890".
I'm using Altova MapForce with XSLT 1.0 as the coding platform. XSLT 2.0 doesn't appear to be an option, as the XSLT has to be processed by a pre-existing routine that only supports XSLT 1.0.
By default MapForce generates
<xsl:value-of select="string(floor(number(string(.))))"/>
and I've tried every combination of functions I can think of, but always get a float for large values.
Further testing shows the problem lies in MapForce, which insists on using the number() function when mapping from text to int.
Let me try and move this forward by answering a question that you did not ask, but perhaps should have. Suppose you have the following input:
XML
<input>
<value>1234567890000000.9</value>
<value>9876543210000000</value>
</input>
and you want to make sure that the input values (which are all numbers, but some of them are not integers) are converted to integers at the output, you could apply the following transformation:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/">
<output>
<xsl:for-each select="input/value">
<value><xsl:value-of select="format-number(., '#')"/></value>
</xsl:for-each>
</output>
</xsl:template>
</xsl:stylesheet>
to obtain the following output:
<?xml version="1.0" encoding="UTF-8"?>
<output>
<value>1234567890000001</value>
<value>9876543210000000</value>
</output>
Note that the results here are rounded, not floored.
Are you sure that MapForce isn't using XSLT 2.0?
If I do this in XSLT 1.0 (with either Saxon or Altova's processor):
<xsl:value-of select="number('1234567890')"/>
I get -> 1234567890
If I use XSLT 2.0 I get -> 1.23456789E9
So I think it is very strange that an XSLT 1.0 transformation supposedly returns the floating-point representation of the number.
Formatting the number with format-number(1.23456789E9,'#') will always give you 1234567890 in both XSLT 1.0 and 2.0. Edit: Saxon will not convert 1.23456789E9 to a number in XSLT 1.0; Altova's processor, however, will.
The problem lies within MapForce, so I've decided to let MapForce generate its code and then overwrite it for the one field that's causing all the trouble.
@Tobias @Michael Thanks to you both for your help. I've +1'ed both your answers and a few comments, since your help led to the answer.
I am fairly new to XML development and have a few questions about XML parsing with XPath and libxml.
I have an XML document structured as:
<resultset>
<result count=1>
<row>
<name> He-Man! </name>
<home> Greyskull </home>
<row>
</result>
<result count=2>
<row>
<name> Spider-Man</name>
<home> Some downtown apartment </home>
<row>
<row>
<name> Disco-Man!</name>
<home> The 70's dance floor </home>
<row>
</result>
<resultset>
I need to pick out the names from this XML, but where the count is 2, I need the name only from the first record. I ran through a few tutorials, but I am unable to come up with an XPath query that serves this purpose.
/name will select all name elements.
/result[@count > 1 ]/row[1]/name | /result[@count =1 ]/row/name
Can this be done with XPath? Is it better done via XPath or by walking the XML tree?
Can someone point me to some examples of complex searches through XML documents?
Edit: The actual scenario requires selecting a subset of the XML rows, which are sometimes nested two levels deep. It sounds like I need to OR ('|') many paths to select the nodes I require, and I am not sure whether that would be efficient compared with walking the tree. The example above was typed up to replicate the problem :)
Thanks!
Try this XPath:
/resultset/result[@count=2]/row/name
This will give a list of all nodes matching this XPath. From this list, just take the first element (as you need only the first record).
I'd probably keep my xpath simpler and just extract both cases, then loop over both node sets.
If you do need to go down the single-XPath route, you should try out your XPath expressions in something that lets you enter them live, rather than having to recompile C/C++ code. You should be able to do that by loading your XML into Firefox and using Firebug; for example, typing $x('//name') in the Firebug console gives three nodes.
NOTE, however, that your XML is not well-formed: several "<row>" tags should be "</row>", the final "<resultset>" should be "</resultset>", and the count values need to be written as
<result count="1">
i.e. with quote marks around the value.
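As a rough sketch of the "keep the XPath simple and loop" approach, here is the same selection done with Python's standard xml.etree.ElementTree rather than libxml (the language and the helper name wanted_names are assumptions for illustration only). The XML is the well-formed version of the sample from the question.

```python
import xml.etree.ElementTree as ET

XML = """<resultset>
  <result count="1">
    <row><name> He-Man! </name><home> Greyskull </home></row>
  </result>
  <result count="2">
    <row><name> Spider-Man</name><home> Some downtown apartment </home></row>
    <row><name> Disco-Man!</name><home> The 70's dance floor </home></row>
  </result>
</resultset>"""


def wanted_names(root):
    """All names where count is 1; only the first row's name where count > 1."""
    names = []
    for result in root.findall("result"):
        rows = result.findall("row")
        if int(result.get("count", "0")) > 1:
            rows = rows[:1]  # keep only the first record
        names.extend(row.findtext("name", default="").strip() for row in rows)
    return names


root = ET.fromstring(XML)
names = wanted_names(root)

# If count always equals the number of rows, taking the first row of every
# result gives the same answer with a single path expression.
first_row_names = [n.text.strip() for n in root.findall("result/row[1]/name")]
```

Both variants walk the tree once, so for a structure like this there is little to choose between them on efficiency; the loop is easier to extend when the real rules get more involved.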
I know that if I have an XML file like this:
<persons>
<class name="English">
<person name="Tarzan" id="050676"/>
<person name="Donald" id="070754"/>
<person name="Dolly" id="231256"/>
</class>
<class name="Math">
<person name="Winston" id="050677"/>
<person name="Donald" id="070754"/>
<person name="Fred" id="231257"/>
</class>
</persons>
I can define a key in an XSL file like this:
<xsl:key name="preg" match="person" use="@id"/>
where I'm using id as the key. However, Donald is listed twice, but is only in one place in preg.
Suppose I want him listed twice in preg. That is, I want to make the class name be part of the identifier. Basically, I want preg to have keys that are equivalent to ordered pairs: (class-name, id). How do I do that (using XSLT 1.0)?
Concatenate the keys? How about
use="concat(../@name, @id)"
This would serve to keep them separate in the index. You'd of course have to use the same key to retrieve them. To avoid any ambiguity I'd also include a delimiter that won't occur in either subkey, as in
use="concat(../@name, '|', @id)"
This is the recommended approach in Michael Kay's XSLT2 reference.
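To see why the delimiter matters and what the resulting index looks like, here is a small sketch that builds the same kind of lookup table in Python (standard library only; build_key is a made-up helper, not part of any XSLT API). Without a separator, the pairs ("ab", "c") and ("a", "bc") would collapse into the same key "abc"; with "|" they stay distinct as long as "|" never occurs in either part.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

XML = """<persons>
  <class name="English">
    <person name="Tarzan" id="050676"/>
    <person name="Donald" id="070754"/>
    <person name="Dolly" id="231256"/>
  </class>
  <class name="Math">
    <person name="Winston" id="050677"/>
    <person name="Donald" id="070754"/>
    <person name="Fred" id="231257"/>
  </class>
</persons>"""


def build_key(root, sep="|"):
    """Index person elements by 'class-name|id', like concat(../@name, '|', @id)."""
    index = defaultdict(list)
    for cls in root.findall("class"):
        for person in cls.findall("person"):
            index[cls.get("name") + sep + person.get("id")].append(person)
    return index


root = ET.fromstring(XML)
preg = build_key(root)
key_count = len(preg)  # one entry per (class, id) pair

# Lookups must build the key the same way, as with key('preg', ...) in XSLT.
donald_english = [p.get("name") for p in preg["English|070754"]]
donald_math = [p.get("name") for p in preg["Math|070754"]]
```

Donald now appears under two separate keys, one per class, instead of sharing a single entry keyed on the id alone.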