Xmlstarlet select String+number for update

Xmlstarlet select String+number for update - regex

I want to update elements in my XML file that have dynamic names.
The name (paramName) is always GLOBPARA followed by a number, as you can see in the following example. These are the elements whose value I want to change.
The file also contains elements whose names start with GLOBPARA but are not followed by a plain number. I don't want to change these.
With the following command I can change every element that has GLOBPARA in its name, including the unwanted ones.
xmlstarlet ed --update "//globPara[contains(paramName, 'GLOBPARA')]/paramValue" -v "100" test.xml
Question:
How do I change only the ones whose name consists of the string GLOBPARA followed by some digits?
Before:
<?xml version="1.0" encoding="UTF-8"?>
<container>
<dataList>
<globPara>
<paramName>GLOBPARA260</paramName>
<paramValue>0</paramValue>
</globPara>
<globPara>
<paramName>GLOBPARAMON_BAD_TEST_18_1_SGB_IV</paramName>
<paramValue>2555</paramValue>
</globPara>
</dataList>
</container>
Wanted result:
<?xml version="1.0" encoding="UTF-8"?>
<container>
<dataList>
<globPara>
<paramName>GLOBPARA260</paramName>
<paramValue>100</paramValue>
</globPara>
<globPara>
<paramName>GLOBPARAMON_BAD_TEST_18_1_SGB_IV</paramName>
<paramValue>2555</paramValue>
</globPara>
</dataList>
</container>
I tried it with the regex \d+, but it didn't work.
xmlstarlet ed --update "//globPara[contains(paramName, 'GLOBPARA[\d+]')]/paramValue" -v "100" test.xml

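You can try the XPath below:
//globPara[number(substring-after(paramName, 'GLOBPARA'))>=0]/paramValue
This selects the paramValue of every globPara node whose paramName child has text of the form GLOBPARAXXX, where XXX is any non-negative number. For names such as GLOBPARAMON_BAD_TEST_18_1_SGB_IV, number() yields NaN, the comparison is false, and the node is skipped.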
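For readers who want to see the selection rule spelled out step by step, here is a minimal sketch in Python using only the standard library. It is not what xmlstarlet does internally, and it is deliberately stricter than the XPath above: it accepts only GLOBPARA followed by one or more digits, whereas number() would also accept a suffix such as 1.5. The helper names is_numbered and update_numbered are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Sample data copied from the question (XML declaration omitted).
SAMPLE = """<container>
  <dataList>
    <globPara>
      <paramName>GLOBPARA260</paramName>
      <paramValue>0</paramValue>
    </globPara>
    <globPara>
      <paramName>GLOBPARAMON_BAD_TEST_18_1_SGB_IV</paramName>
      <paramValue>2555</paramValue>
    </globPara>
  </dataList>
</container>"""

PREFIX = "GLOBPARA"


def is_numbered(name: str) -> bool:
    """True when name is PREFIX followed by one or more ASCII digits only."""
    suffix = name[len(PREFIX):]
    return name.startswith(PREFIX) and suffix.isascii() and suffix.isdigit()


def update_numbered(root, new_value: str) -> int:
    """Set paramValue on every globPara whose paramName passes is_numbered."""
    changed = 0
    for para in root.iter("globPara"):
        name = (para.findtext("paramName") or "").strip()
        value = para.find("paramValue")
        if value is not None and is_numbered(name):
            value.text = new_value
            changed += 1
    return changed


root = ET.fromstring(SAMPLE)
count = update_numbered(root, "100")
print(count, [v.text for v in root.iter("paramValue")])  # 1 ['100', '2555']
```

Only the GLOBPARA260 entry is changed; the GLOBPARAMON_... entry keeps its value of 2555, which matches the wanted result in the question.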

Related

XML to XML XSLT transformation. MSXML in VBScript

I have a fairly nested XML file that I'd like to transform with an XSL template to something a little simpler to make bulk loading the data into SQL more efficient. I wanted to do it in C++ (Codeblocks with gcc) but I'm having a bit of trouble just being able to load the document with any of the libraries I've come across, including MSXML. If anyone has any experience using MSXML in Codeblocks with gcc let me know!
I have a stylesheet that transforms the XML in Excel VBA with a DOMDocument but I don't want to depend on Excel. I figured the next best thing would be a VBScript.
The data are one or two text values that are held in <DATAVALUE> nodes, descendants of 100 <LOCATION> nodes. The first child of each <LOCATION> node, called <LOCATIONNAME>, holds a unique name for each <LOCATION> node (i.e., NAME1-NAME100). The third and fourth children of the <LOCATION> node (if there is a fourth child) are <DATA> nodes, each holding a <DATAVALUE> node. The file can have upwards of 1 million <SAMPLE> nodes. Here is the XML:
<?xml version="1.0" encoding="utf-8"?>
<MYImportFile xmlns="urn:ohLookHEREaNamespacedeclaration">
<HEADERVERSION>1.10</HEADERVERSION>
<MESSAGE>Import</MESSAGE>
<MYBED>QUEEN</MYBED>
<SOURCE>SPRING </SOURCE>
<USERID>MMOUSE</USERID>
<DATETIME>2019-11-25T12:31:00</DATETIME>
<SAMPLE TYPE="No" APPLE="false">
<SAMPLEID>0000565</SAMPLEID>
<SAMPLECATEGORY>CLASS5</SAMPLECATEGORY>
<LOCATION APPLE="false">
<LOCATIONNAME>NAME1</LOCATIONNAME>
<READBY>MMOUSE</READBY>
<TIME>12:31:00</TIME>
<DATA>
<DATAVALUE>aaaa</DATAVALUE>
</DATA>
<DATA>
<DATAVALUE>bbbb</DATAVALUE>
</DATA>
</LOCATION>
'''''''''''''''''there are 100 LOCATION entries''''''''''''''''''''''''
<LOCATION APPLE="false">
<LOCATIONNAME>NAME100</LOCATIONNAME>
<READBY>MMOUSE</READBY>
<TIME>12:31:00</TIME>
<DATA>
<DATAVALUE>zzzz</DATAVALUE>
</DATA>
</LOCATION>
</SAMPLE>
'''''''''''''''''repeat for however many SAMPLES there are''''''''''''''''''''''
</MYImportFile>
I want to point something out so it's a little more clear what's going on. In the transformed xml document, one of the things I need to account for is when there is only one <DATA> node in a <LOCATION>. This is done by copying the first <DATAVALUE> node into a second <DATAVALUE> node in the new document. For example, the <DATAVALUE>, "zzzz" that appears twice in the transformed sheet only appears in the initial XML once. Here is what I want the transformed XML to look like:
<?xml version="1.0" encoding="UTF-8"?>
<MYImportFile>
<SAMPLE>
<SAMPLEID>0000565</SAMPLEID>
<NAME1_1>aaaa</NAME1_1>
<NAME1_2>bbbb</NAME1_2>
<NAME2_1>cccc</NAME2_1>
<NAME2_2>dddd</NAME2_2>
'''''''''''''''''there are 100 LOCATION entries transformed to NAME1-NAME100''''''''''''''''''''''''
<NAME100_1>zzzz</NAME100_1>
<NAME100_2>zzzz</NAME100_2>
</SAMPLE>
'''''''''''''''''repeat for however many SAMPLES there are''''''''''''''''''''''
</MYImportFile>
My StyleSheet (that works with VBA code):
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:b="urn:ohLookHEREaNamespacedeclaration" exclude-result-prefixes="b">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/b:MYImportFile">
<MYImportFile>
<xsl:for-each select="b:SAMPLE">
<SAMPLE>
<SAMPLEID>
<xsl:value-of select="b:SAMPLEID"/>
</SAMPLEID>
<NAME1_1>
<xsl:value-of select="b:LOCATION/b:LOCATIONNAME[text() = 'NAME1']/../b:DATA[1]/b:DATAVALUE"/>
</NAME1_1>
<xsl:choose>
<xsl:when test="b:LOCATION/b:LOCATIONNAME[text() = 'NAME1']/../b:DATA[2]/b:DATAVALUE">
<NAME1_2>
<xsl:value-of select="b:LOCATION/b:LOCATIONNAME[text() = 'NAME1']/../b:DATA[2]/b:DATAVALUE"/>
</NAME1_2>
</xsl:when>
<xsl:otherwise>
<NAME1_2>
<xsl:value-of select="b:LOCATION/b:LOCATIONNAME[text() = 'NAME1']/../b:DATA[1]/b:DATAVALUE"/>
</NAME1_2>
</xsl:otherwise>
</xsl:choose>
'''''''''''''''''''there are 100 NAME entries to receive the 100 locations
</SAMPLE>
</xsl:for-each>
</MYImportFile>
</xsl:template>
</xsl:stylesheet>
My Script:
Option Explicit
Const strInputFile = "C:\Path\fileName.xml"
Const strTemplateFile = "C:\Path\convFileName.xsl"
Const strOutputFile = "C:\Path\newFileName.xml"
Dim objXMLDoc : Set objXMLDoc = WScript.CreateObject("Msxml2.DOMDocument")
objXMLDoc.async = False
objXMLDoc.loadXML(strInputFile)
objXMLDoc.SetProperty "SelectionNamespaces", "xmlns='urn:myNamespace'"
Dim objXSLDoc : Set objXSLDoc = WScript.CreateObject("Msxml2.DOMDocument")
objXSLDoc.async = False
objXSLDoc.loadXML(strTemplateFile)
Dim objNewXMLDoc : Set objNewXMLDoc = WScript.CreateObject("Msxml2.DOMDocument")
objXMLDoc.transformNodeToObject objXSLDoc, objNewXMLDoc
objNewXMLDoc.save strOutputFile
The error:
Line: 19
Char: 1
Error: The stylesheet does not contain a document element. The
stylesheet may be empty, or it may not be a well-formed XML document.
Code: 80004005
Source: msxml3.dll
I'm guessing either my script isn't quite right or there's a setting I'm missing, causing mismatching objects and libraries, because my VBA macro transforms the xml with that stylesheet. Anyone have any ideas? Suggestions to make this thing run?

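As far as I remember, loadXML takes a string containing the XML markup itself. If you have a file or URL to parse, use the load method instead. In your script both loadXML calls are given a file path, so the path text is parsed as if it were XML, and no document element is found.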
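The same string-versus-file distinction exists in most XML APIs. As an analogy only (this is Python's standard library, not MSXML, and the one-line stylesheet is a stand-in I made up), the sketch below shows a string parser succeeding on markup, a file parser succeeding on a path, and the string parser failing when it is handed a path.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Stand-in stylesheet text, invented for this illustration.
XSL_TEXT = (
    '<xsl:stylesheet version="1.0" '
    'xmlns:xsl="http://www.w3.org/1999/XSL/Transform"/>'
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "convFileName.xsl")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(XSL_TEXT)

    # String parser given markup: works (comparable to loadXML with XML text).
    from_text = ET.fromstring(XSL_TEXT)

    # File parser given a path: works (comparable to load with a file name).
    from_file = ET.parse(path).getroot()

    # String parser given a path: the path text itself is parsed and rejected.
    try:
        ET.fromstring(path)
        path_as_markup_ok = True
    except ET.ParseError:
        path_as_markup_ok = False

print(from_text.tag == from_file.tag, path_as_markup_ok)  # True False
```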

Issue with xslt processing

I get the following error after my xslt has been processed:
There are 1 schema validation error(s):
1. Error Msg:The element 'BusinessObjectList' has incomplete content. List of possible elements expected: 'BusinessObject'. Line Number: 1, Line Position: 40, Severity:Error
I am trying to troubleshoot this issue and need some clarification. As I understand the error, an element called BusinessObject is missing. I am not sure whether I have to add this missing element or replace an existing element with it. Another question: how do I find Line 1, Line Position 40 in my XSLT file?
Below is what the beginning of the XSLT looks like:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:msxsl="urn:schemas-microsoft-com:xslt">
<xsl:template match="/">
<BusinessObjectList SchemaVersion="1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="HierarchicalObjects-1.0.xsd">

You haven't told us how you are running the transformation, but it seems to be configured so that on completion the result document is validated against the schema at HierarchicalObjects-1.0.xsd. Presumably that schema says that the BusinessObjectList must contain at least N BusinessObject elements (perhaps N is 1, we don't know), and your transformation output includes fewer than N.
You either need to produce output that conforms to the schema, or you need to change the way you are processing to avoid the validation step.
The line/column number probably refers to a line/column in the result document, not in the stylesheet.

XSLT Filter on property

I need a little help with filtering my XML based on a property.
I have the XML in the following format:
<?xml version="1.0" encoding="utf-8" ?>
<root id="-1">
<LandingPage id="1067" parentID="1050" level="2"
writerID="0" creatorID="0" nodeType="1066" template="1073"
sortOrder="0" createDate="2013-02-04T14:29:39"
updateDate="2013-02-07T11:08:27" nodeName="About"
urlName="about" writerName="Pete" creatorName="Pete"
path="-1,1050,1067" isDoc="">
<hideInNavigation>0</hideInNavigation>
</LandingPage>
</root>
What I need to do is filter these elements so that only those where hideInNavigation = 0 remain.
I have tried the following:
[@isDoc and @hideInNavigation ='0']
(I need the @isDoc attribute too), but realised this would only work if hideInNavigation were an attribute of the LandingPage tag, so I tried
value['hideInNavigation'='0']
but this didn't seem to do anything either. After much searching I haven't found an answer, so I was wondering whether this is possible.

Supposing the current context was the <root> element, you could select the LandingPages with hideInNavigation = 0 with:
LandingPage[hideInNavigation = '0']
If you share your XSLT, I can give you more specific guidance on how to amend it for this particular case.
And was the @isDoc test in your first example something you wanted? Do you want to filter LandingPages that have an isDoc attribute and a hideInNavigation value of 0?

'hideInNavigation'='0' compares the two strings 'hideInNavigation' and '0', which are guaranteed to be different.
In the context of root, LandingPage[hideInNavigation=0] would match the LandingPage element in your example.

This XPath returns all LandingPage elements whose isDoc attribute is empty and whose hideInNavigation element contains '0':
//LandingPage[@isDoc="" and hideInNavigation='0']
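To try the child-element predicate outside an XSLT processor, here is a small sketch using the limited XPath subset in Python's standard library. The XML is a trimmed copy of the sample from the question, and the second LandingPage (id 1068, nodeName "Hidden") is a hypothetical entry added only so the filter has something to exclude. ElementTree has no "and" operator, so the two conditions are written as two chained predicates.

```python
import xml.etree.ElementTree as ET

# Trimmed sample; the second LandingPage is invented for illustration.
SAMPLE = """<root id="-1">
  <LandingPage id="1067" nodeName="About" isDoc="">
    <hideInNavigation>0</hideInNavigation>
  </LandingPage>
  <LandingPage id="1068" nodeName="Hidden" isDoc="">
    <hideInNavigation>1</hideInNavigation>
  </LandingPage>
</root>"""

root = ET.fromstring(SAMPLE)

# [@isDoc] requires the attribute to be present;
# [hideInNavigation='0'] requires a child element with that exact text.
visible = root.findall("LandingPage[@isDoc][hideInNavigation='0']")
names = [page.get("nodeName") for page in visible]
print(names)  # ['About']
```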

BaseX XQuery replace

I have the following problem. I want to replace the value of an element in my XML file with an XQuery, using BaseX as the database. The XQuery code is as follows:
let $db := doc('update.xml')
replace value of node $db//elem with 'haha'
return <result> {$db//elem/text()} </result>
The xml document contains the following elements:
<?xml version="1.0" encoding="ISO-8859-1"?>
<root xmlns:xs="http://www.w3.org/2001/XMLSchema-instance">
<check>
<ok>
<elem>test</elem>
<help></help>
</ok>
</check>
</root>
Every time I execute this XQuery, an error like this is thrown:
Expecting 'where', 'order' or 'return' expression
So what should I do or change to just replace the text "test" with "haha" in the element?
If I use only the "replace ..." line it works, but I have to read a value from a URL parameter, so I need more lines of code than just that one.

let starts a FLWOR expression, which may not directly contain update statements. You will have to put a return between the two:
let $db := doc('update.xml')
return
replace value of node $db//elem with 'haha'
You can also do arbitrary calculations, but make sure your query does not return any output.
In standard XQuery Update, an updating query cannot also return a result.

sgml to xml conversion

I have the following sample SGML data from my .sgm file and I want to convert it into XML:
<?dtd name="viewed">
<?XMLDOC>
<viewed >xyz
<cite>
<yr>2010
<pno cite="2010 abc 1188">10
<?/XMLDOC>
<?XMLDOC>
<viewed>abc.
<cite>
<yr>2010
<pno cite="2010 xyz 5133">9
<?/XMLDOC>
Output should be like this:
<index1>
<num viewed="xyz"/>
<heading>xyz</heading>
<index-refs>
<link caseno="2010 abc 1188</link>
</index-refs>
</index1>
<index1>
<num viewed="abc"/>
<heading>abc</heading>
<index-refs>
<link caseno="2010 xyz 5133</link>
</index-refs>
</index1>
Can this be done in c# or can we use xslt 2.0 to do this kind of conversion?

Others have already given some good advice. Here's one way of putting it all together by first converting the input SGML to well-formed XML and then using XSLT to transform that to the exact format you need.
Converting your SGML to well-formed XML
The osx tool from the OpenSP package suggested by mzjn is a good tool for this. Since your SGML markup omits end tags, you need to have a DTD from which the correct nesting of elements can be determined. If you don't have a DTD, you need to create one. For your example input, it could be as simple as this:
<!ELEMENT toplevel o o (viewed)+>
<!ELEMENT viewed - o (#PCDATA,cite)>
<!ELEMENT cite - o (yr,pno)>
<!ELEMENT yr - o (#PCDATA)>
<!ELEMENT pno - o (#PCDATA)>
<!ATTLIST pno cite CDATA #REQUIRED>
You also need to add a proper doctype declaration to the beginning of your SGML file. Assuming you have your DTD in the file viewed.dtd, it would be:
<!DOCTYPE toplevel SYSTEM "viewed.dtd" >
With this addition, you should now be able to use osx to convert the SGML to XML. (It won't be able to convert the processing instructions that start with a /, as those are not allowed in XML, and it will emit a warning about them.)
osx input.sgm > input.xml
Transforming the resulting XML to your desired format
For the above case, you could use something like the following XSLT stylesheet:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="VIEWED">
<index1>
<num viewed="{normalize-space(text())}"/>
<heading>
<xsl:value-of select="normalize-space(text())"/>
</heading>
<index-refs>
<xsl:apply-templates select="CITE"/>
</index-refs>
</index1>
</xsl:template>
<xsl:template match="CITE">
<link caseno="{PNO/@CITE}"/>
</xsl:template>
</xsl:stylesheet>

Maybe you can use the osx SGML to XML converter. It is part of the OpenSP package (based on SP, originally written by James Clark).
http://openjade.sourceforge.net/doc/index.htm
http://www.jclark.com/sp/index.htm

Can the SGML-Reader, originally developed by Chris Lovett help in solving this problem?

Why XSLT? I doubt you can map SGML to XML Infoset or XDM...
I think that you should better use the language made for this task: DSSSL (Document Style Semantics and Specification Language)
This is the predecessor of XSLT. The author is James Clark. And this is his site.
