RegEx: Match only one element in xml structure

I have this XML structure in which I would like to match the individual data elements that contain the ###DoNotUse### string.
In the example, it should match the data elements C and E.
But my RegEx also matches all the data elements that precede the ones I want, meaning A+B+C instead of just C, and D+E instead of just E.
I appreciate your help very much.
My RegEx is:
<data(.*?)###DoNotUse###(.*?)</data>
The example data is:
<data name="A">
<value>A</value>
<comment>Bla Bla</comment>
</data>
<data name="B">
<value>B</value>
</data>
<data name="C">
<value>###DoNotUse###</value>
<comment>Bla Bla</comment>
</data>
<data name="D">
<value>D</value>
<comment>Bla Bla</comment>
</data>
<data name="E">
<value>###DoNotUse###</value>
</data>

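You are looking for:
(?s)<data name="([^"]*)">(?:(?!data).)*DoNotUse
The (?s) turns on single-line (dot-all) mode, so that . also matches line breaks. The (?:(?!data).)* part consumes one character at a time but stops as soon as the text "data" comes next, so a match cannot run from one data element into the following one.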
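As a quick check of the pattern above, here is a minimal Python sketch run against the sample data from the question. The names SAMPLE, NAME_PATTERN, ELEMENT_PATTERN, flagged_names and flagged_elements are made up for this illustration, and ELEMENT_PATTERN is a variant that is not part of the answer: it is one possible way to capture the whole element rather than just its name.

```python
import re

# Sample data from the question.
SAMPLE = """<data name="A">
<value>A</value>
<comment>Bla Bla</comment>
</data>
<data name="B">
<value>B</value>
</data>
<data name="C">
<value>###DoNotUse###</value>
<comment>Bla Bla</comment>
</data>
<data name="D">
<value>D</value>
<comment>Bla Bla</comment>
</data>
<data name="E">
<value>###DoNotUse###</value>
</data>
"""

# Pattern from the answer. The tempered dot (?:(?!data).)* never steps onto
# the text "data", so it cannot cross the closing tag of the current element.
NAME_PATTERN = re.compile(r'(?s)<data name="([^"]*)">(?:(?!data).)*DoNotUse')

# Hypothetical variant (an assumption, not from the answer): match the whole
# element by forbidding a new "<data" opening tag before the marker.
ELEMENT_PATTERN = re.compile(
    r'<data\b(?:(?!<data\b).)*?###DoNotUse###.*?</data>', re.DOTALL
)


def flagged_names(text):
    """Return the name attribute of every data element holding the marker."""
    return NAME_PATTERN.findall(text)


def flagged_elements(text):
    """Return the full text of every data element holding the marker."""
    return ELEMENT_PATTERN.findall(text)
```

Calling flagged_names(SAMPLE) should list only C and E. For anything beyond a one-off search, an XML parser is usually the more robust choice.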

Related

Select Node by Variable in Grouping

For the grouping function, is there a way to group by dynamic key, which is passed in the input data? For example, in the input xml below, I want to group <Trans> by the node name passed in <key1>, which is currently "id". Thank you!
<xsl:for-each-group select="Trans" group-by="[this key node name is from the input]">
Input xml:
<File>
<key1>id</key1>
<Trans>
<id>1</id>
<name>jane</name>
<location>ga</location>
<value>1.11</value>
</Trans>
<Trans>
<id>2</id>
<name>jane</name>
<location>ma</location>
<value>2.22</value>
</Trans>
<Trans>
<id>1</id>
<name>john</name>
<location>al</location>
<value>3.33</value>
</Trans>
<Trans>
<id>3</id>
<name>jj</name>
<location>ga</location>
<value>4.44</value>
</Trans>
</File>

group-by="*[local-name() = ../key1]"
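To illustrate what that expression does (pick the child of each Trans whose name equals the text of key1), here is a rough Python sketch of the same lookup outside XSLT. The function name group_by_dynamic_key and the choice to collect the name values per group are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Input document from the question.
INPUT = """<File>
<key1>id</key1>
<Trans><id>1</id><name>jane</name><location>ga</location><value>1.11</value></Trans>
<Trans><id>2</id><name>jane</name><location>ma</location><value>2.22</value></Trans>
<Trans><id>1</id><name>john</name><location>al</location><value>3.33</value></Trans>
<Trans><id>3</id><name>jj</name><location>ga</location><value>4.44</value></Trans>
</File>"""


def group_by_dynamic_key(xml_text):
    """Group Trans elements by the child element named in <key1>."""
    root = ET.fromstring(xml_text)
    key_name = root.findtext("key1")  # "id" in the sample input
    groups = defaultdict(list)
    for trans in root.findall("Trans"):
        # Same idea as *[local-name() = ../key1]: look the child up by name.
        groups[trans.findtext(key_name)].append(trans.findtext("name"))
    return dict(groups)
```

With the sample input, the two transactions with id 1 end up in one group and the others in groups of their own.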

Replace Particular XML tag with NULL value in Oracle SQL

I have a column VALUE of type XMLTYPE in the table DUMMY.
It contains:
<?xml version="1.0"?>
<ROWSET>
<Value>
<Data>802
</Data>
</Value>
<Value>
<Data>902
</Data>
</Value>
</ROWSET>
I need to remove the Value element that contains 802 (that is, replace it with NULL).
The output should be:
<?xml version="1.0"?>
<ROWSET>
<Value>
<Data>902
</Data>
</Value>
</ROWSET>
The Value element containing 802 should be removed entirely.
I tried updateXml():
update Dummy set VALUE=updatexml(VALUE,'ROWSET/Value/Data/text()','');
But this only sets the text 802 to null; the surrounding tags remain.
Second approach:
update Dummy set Value=updatexml(Value,'ROWSET','');
But this deletes everything inside the ROWSET tag, which then contains only:
<?xml version="1.0"?>
<ROWSET/>
I tried Replace() too.
update Dummy set emps=replace('
<Value><Data>802
</Data></Value>',null);
This discards the other values in the VALUE column and keeps only the fragment passed to replace().
After this replace(), the column contains:
<Value><Data>802
</Data></Value>
Any suggestions would be appreciated.

You need deleteXml() instead of updateXml(), and a correct XPath for what is to be deleted.
Let's test our XPath first:
with input$ as (
select --+ no_merge
xmltype(q'{<?xml version="1.0"?>
<ROWSET>
<Value>
<Data>802
</Data>
</Value>
<Value>
<Data>902
</Data>
</Value>
</ROWSET>}') as xx
from dual
)
select
xmlserialize(document X.xx indent) as original,
xmlserialize(document
deletexml(X.xx, '/ROWSET/Value[normalize-space(Data)="802"]')
indent
) as with_802_removed
from input$ X;
... yields...
ORIGINAL                WITH_802_REMOVED
----------------------  ----------------------
<?xml version="1.0"?>   <?xml version="1.0"?>
<ROWSET>                <ROWSET>
<Value>                 <Value>
<Data>802               <Data>902
</Data>                 </Data>
</Value>                </Value>
<Value>                 </ROWSET>
<Data>902
</Data>
</Value>
</ROWSET>
Note: The xmlserialize() function is used here only for pretty-printing.
Now your update
update Dummy X
set X.value = deletexml(X.value, '/ROWSET/Value[normalize-space(Data)="802"]');
Note for Oracle 12c: There should be a more elegant solution to this using XQuery, but I was not able to get a full grasp on the XQuery language yet, hence I present you the deleteXml() solution only.
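For readers who want to try the predicate outside the database, here is a small Python sketch of the same idea: drop every Value whose whitespace-normalized Data equals a given string. It only mimics what the XPath /ROWSET/Value[normalize-space(Data)="802"] selects; it is not Oracle code, and the names DOC, remove_values and remaining_data are invented for this example.

```python
import xml.etree.ElementTree as ET

# Document from the question (XML declaration omitted for brevity).
DOC = """<ROWSET>
<Value>
<Data>802
</Data>
</Value>
<Value>
<Data>902
</Data>
</Value>
</ROWSET>"""


def normalize_space(text):
    """Rough equivalent of XPath normalize-space(): trim and collapse whitespace."""
    return " ".join((text or "").split())


def remove_values(xml_text, unwanted):
    """Return the parsed tree with matching <Value> elements removed."""
    root = ET.fromstring(xml_text)
    # Iterate over a copy of the list, because we remove children while looping.
    for value in list(root.findall("Value")):
        if normalize_space(value.findtext("Data")) == unwanted:
            root.remove(value)
    return root


def remaining_data(root):
    """List the normalized Data text of every remaining <Value>."""
    return [normalize_space(v.findtext("Data")) for v in root.findall("Value")]
```

The normalization step matters here because the Data text in the sample ends with a line break, so a plain comparison against "802" would not match.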

How to match nodes which have ALL nodes in another sequence

We're using XSLT2. Wondering if this is possible.
We have a tag filter, where a customer can choose to see all the themes which match ALL of their selections.
Here's an idea of the XML structure:
<themes>
<theme>
<name>Apple</name>
<tags>
<tag id="1"/>
<tag id="2"/>
</tags>
</theme>
<theme>
<name>Banana</name>
<tags>
<tag id="2"/>
<tag id="3"/>
</tags>
</theme>
<theme>
<name>Kiwifruit</name>
<tags>
<tag id="2"/>
<tag id="3"/>
</tags>
</theme>
</themes>
The customer chooses tags 2 and 3. The result we want is to show only Banana and Kiwifruit, as they have all the tags the user selected.
We can't use the AND operator as the list of tags is long and unknown. We currently have this list passed into the XSLT and then tokenised:
<xsl:param name="tag_ids"/>
<xsl:variable name="tag_id_list" select="tokenize($tag_ids,',')"/>
This statement selects the tags of any theme that has any of the IDs in tag_id_list:
<xsl:for-each select="themes/theme/tags/tag[@id=$tag_id_list]">
But we're trying to find an XPath statement that makes sure the <theme> has ALL the <tag>s in $tag_id_list.
Any ideas?! Thanks in advance.

You want this if the tags have to be in the right order:
themes/theme/tags[deep-equal($tag_id_list, tag/@id)]
or this if they can be in any order:
themes/theme/tags[
(every $tag in $tag_id_list satisfies $tag = tag/@id)
and
(every $tag in tag/@id satisfies $tag = $tag_id_list)]

You could count the number of tags that match, and see if it equals the number of tags in tag_id_list. For example:
<xsl:variable name="tagcount" select="count($tag_id_list)" />
<xsl:for-each select="themes/theme[count(tags/tag[@id=$tag_id_list]) = $tagcount]">
If the customer could enter duplicate tags (like '2,2,3') then you might have to change tagcount to this
<xsl:variable name="tagcount" select="count(distinct-values($tag_id_list))" />
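The "has all selected tags" test in the answers above is a subset check. As a language-neutral illustration, here is a minimal Python sketch of that check on the sample data; it is not XSLT, and the names THEMES and themes_with_all_tags are assumptions made for the example. Using sets also takes care of duplicate selections such as '2,2,3'.

```python
import xml.etree.ElementTree as ET

# Sample structure from the question, with the tag elements self-closed.
THEMES = """<themes>
<theme><name>Apple</name><tags><tag id="1"/><tag id="2"/></tags></theme>
<theme><name>Banana</name><tags><tag id="2"/><tag id="3"/></tags></theme>
<theme><name>Kiwifruit</name><tags><tag id="2"/><tag id="3"/></tags></theme>
</themes>"""


def themes_with_all_tags(xml_text, tag_ids):
    """Return the names of themes that carry every id in the comma-separated list."""
    # A set removes duplicates, mirroring distinct-values() in the answer.
    wanted = {t.strip() for t in tag_ids.split(",") if t.strip()}
    root = ET.fromstring(xml_text)
    names = []
    for theme in root.findall("theme"):
        present = {tag.get("id") for tag in theme.findall("tags/tag")}
        if wanted <= present:  # every wanted id is present on this theme
            names.append(theme.findtext("name"))
    return names
```

For the selection "2,3" this yields Banana and Kiwifruit, matching the result the question asks for.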

How do I selectively include elements from one or the other xml file based on the contents of a plain text file?

I have two source xml files and I need to construct a new xml file which contains elements chosen from one or the other of the files, depending on whether their 'name' is listed in a plain text file.
xml file a:
<data name="name1">
<value>abc1</value>
</data>
<data name="name2">
<value>abc2</value>
</data>
<data name="name3">
<value>abc3</value>
</data>
xml file b:
<data name="name1">
<value>xyz1</value>
</data>
<data name="name2">
<value>xyz2</value>
</data>
<data name="name3">
<value>xyz3</value>
</data>
text file:
name1
name3
desired output:
<data name="name1">
<value>abc1</value>
</data>
<data name="name2">
<value>xyz2</value> <---- note this element is from file 'b'
</data>
<data name="name3">
<value>abc3</value>
</data>
So the elements with names 'name1' and 'name3' come from 'xml file a' because they are listed in the text file, but 'name2' comes from 'xml file b' because it isn't.
The actual names aren't 'name1' etc, but arbitrary string identifiers, but they are unique within the files.
Is it possible to do this with XSLT?

This transformation:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:variable name="vNames" select=
"tokenize(unparsed-text('file:///c:/temp/delete/Names.txt'), '\s')"/>
<xsl:variable name="vDoc1" select="document('file:///c:/temp/delete/FileA.xml')"/>
<xsl:variable name="vDoc2" select="document('file:///c:/temp/delete/FileB.xml')"/>
<xsl:template match="/">
<t>
<xsl:sequence select=
"$vDoc1/*/*[@name = $vNames],
$vDoc2/*/*[not(@name = $vNames)]
"/>
</t>
</xsl:template>
</xsl:stylesheet>
when applied on any XML document (not used) and having these two files:
c:/temp/delete/FileA.xml:
<t>
<data name="name1">
<value>abc1</value>
</data>
<data name="name2">
<value>abc2</value>
</data>
<data name="name3">
<value>abc3</value>
</data>
</t>
c:/temp/delete/FileB.xml:
<t>
<data name="name1">
<value>xyz1</value>
</data>
<data name="name2">
<value>xyz2</value>
</data>
<data name="name3">
<value>xyz3</value>
</data>
</t>
c:/temp/delete/Names.txt:
name1
name3
produces the wanted, correct result:
<t>
<data name="name1">
<value>abc1</value>
</data>
<data name="name3">
<value>abc3</value>
</data>
<data name="name2">
<value>xyz2</value>
</data>
</t>
Explanation:
Proper use of the standard XSLT functions: unparsed-text() (2.0 and up only) and document() and the standard XPath 2.0 function tokenize()
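The selection logic of the transformation (take the listed names from file A, everything else from file B) can be sketched in a few lines of Python. This is only an illustration of the logic, not a replacement for the stylesheet; the inputs are inlined as strings instead of being read from the c:/temp/delete files, and the name merge_by_names is made up for the example.

```python
import xml.etree.ElementTree as ET

FILE_A = """<t>
<data name="name1"><value>abc1</value></data>
<data name="name2"><value>abc2</value></data>
<data name="name3"><value>abc3</value></data>
</t>"""

FILE_B = """<t>
<data name="name1"><value>xyz1</value></data>
<data name="name2"><value>xyz2</value></data>
<data name="name3"><value>xyz3</value></data>
</t>"""

NAMES_TXT = "name1\nname3\n"


def merge_by_names(xml_a, xml_b, names_text):
    """Pick listed elements from A, unlisted ones from B, as (name, value) pairs."""
    names = set(names_text.split())  # split on any whitespace, like tokenize(..., '\s')
    doc_a = ET.fromstring(xml_a)
    doc_b = ET.fromstring(xml_b)
    picked = [d for d in doc_a.findall("data") if d.get("name") in names]
    picked += [d for d in doc_b.findall("data") if d.get("name") not in names]
    return [(d.get("name"), d.findtext("value")) for d in picked]
```

As in the XSLT result, the elements taken from file A come first, followed by those taken from file B.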

While XSLT can output plain text, the input document is expected to be XML. However, it's possible to mix in Java in your transformation. Personally, I'd add some java functions to open the file and process it in a grep-like fashion.
This general tutorial on adding Java functions to XSLT stylesheets would be a good start. It has the advantage of mentioning how-to's for some of the more common XSLT processing engines.
Here's a related discussion on SO

Selecting the first and second node in XSLT

Given the following structure, how to copy the first and the second nodes with all their elements from the document based on the predicate in XSLT:
<list>
<slot>xx</slot>
<data>
<name>xxx</name>
<age>xxx</age>
</data>
<data>
<name>xxx</name>
<age>xxx</age>
</data>
<data>
<name>xxx</name>
<age>xxx</age>
</data>
</list>
<list>
<slot>xx</slot>
<data>
<name>xxx</name>
<age>xxx</age>
</data>
<data>
<name>xxx</name>
<age>xxx</age>
</data>
<data>
<name>xxx</name>
<age>xxx</age>
</data>
</list>
How can I select the first and the second occurrence of data (without the data element itself, only name and age) from the list whose slot equals a given variable? For example, the first list has slot=02, but I need the data from the second list, where slot=01. The order of the lists does not matter, as long as slot=$slotvariable.
I tried the following statement, but it did not produce any results:
<xsl:element name="{'Lastdata'}">
<xsl:copy-of select="list/data[position()=1 and slot = $slotvariable]" />
</xsl:element>
<xsl:element name="{'prevdata'}">
<xsl:copy-of select="list/data[position()=2 and slot = $slotvariable]" />
</xsl:element>
Any working suggestions would be appreciated

If I understood your question correctly, then:
<Lastdata>
<xsl:copy-of select="list[slot=$slotvariable]/data[1]/*" />
</Lastdata>
<prevdata>
<xsl:copy-of select="list[slot=$slotvariable]/data[2]/*" />
</prevdata>
Hints:
Don't use <xsl:element> unless you have a dynamic name based on an expression.
[1] is a shorthand for [position() = 1]

The following stylesheet:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:variable name="slot" select="'slot1'"/>
<xsl:template match="/lists/list">
<xsl:copy-of select="data[../slot=$slot][position() &lt; 3]/*"/>
</xsl:template>
</xsl:stylesheet>
Applied to this source:
<lists>
<list>
<slot>slot1</slot>
<data>
<name>George</name>
<age>7</age>
</data>
<data>
<name>Bob</name>
<age>22</age>
</data>
<data>
<name>James</name>
<age>77</age>
</data>
</list>
<list>
<slot>slot2</slot>
<data>
<name>Wendy</name>
<age>25</age>
</data>
</list>
</lists>
Produces the following result:
<name>George</name>
<age>7</age>
<name>Bob</name>
<age>22</age>
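To make the selection order explicit (first filter the list by slot, then take the first two data elements, then copy their children), here is a small Python sketch of the same steps on the source above. It is an illustration only; the names SOURCE and first_two_data_children are assumptions for this example.

```python
import xml.etree.ElementTree as ET

# Source document from the answer above.
SOURCE = """<lists>
<list>
<slot>slot1</slot>
<data><name>George</name><age>7</age></data>
<data><name>Bob</name><age>22</age></data>
<data><name>James</name><age>77</age></data>
</list>
<list>
<slot>slot2</slot>
<data><name>Wendy</name><age>25</age></data>
</list>
</lists>"""


def first_two_data_children(xml_text, slot):
    """Return (tag, text) pairs for the children of the first two <data> of the matching list."""
    root = ET.fromstring(xml_text)
    children = []
    for lst in root.findall("list"):
        if lst.findtext("slot") != slot:
            continue  # the slot test belongs to the list, not to data
        for data in lst.findall("data")[:2]:  # positions 1 and 2 only
            children.extend((child.tag, child.text) for child in data)
    return children
```

The original attempt failed because slot is a child of list, not of data; filtering on the list first, as here and in both answers, avoids that.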
