Regex: Exclude a substring in fn:matches - regex

I have a config XML like this:
<cfg:property name="Gallery.*.*">GalleryPort</cfg:property>
<cfg:property name="Office.*.*">OfficePort</cfg:property>
<cfg:property name="Home.Living.Closet$">LivingCloset</cfg:property>
<cfg:property name="Home.Living.Lights$">LivingLights</cfg:property>
I use XQuery to get the value for the property using
$dynamicURN := $config//cfg:property[matches($key, @name)]/text()
$key is received from the database, and the URN is then fetched from the matching property.
What should be the fourth property name so that it can catch any string other than Closet or Lights after Home.Living.?
Example: I tried
<cfg:property name=" Home.Living.[A-Z-[LIGHTS]]">LivingMisc</cfg:property>
<cfg:property name=" Home.Living.[^'(LIGHTS)']">LivingMisc</cfg:property>
Possible values for key are:
Home.Living.Lights
Home.Living.Light
Home.Living.Closed
Home.Living.Closet
Home.Living.Table
The respective output should be
LivingLights
LivingMisc
LivingMisc
LivingCloset
LivingMisc

The code below will get you what you need, but there are other approaches that are more performant at scale. Bear in mind that the regex is evaluated for every property the path visits, so a single lookup can mean several regex evaluations.
declare namespace cfg = "somthinghere";
let $key := ("Home.Living.Lights","Home.Living.Closet","Home.Living.anyOtherString","Home.Living.anotherString")[1]
let $config :=
<config>
<cfg:property name="Gallery.*.*">GalleryPort</cfg:property>
<cfg:property name="Office.*.*">OfficePort</cfg:property>
<cfg:property name="Home.Living.Closet$">LivingCloset</cfg:property>
<cfg:property name="Home.Living.Lights$">LivingLights</cfg:property>
<cfg:property name="Home.Living.[^Closet|Lights]">LivingMisc</cfg:property>
</config>
let $option := $config//cfg:property[fn:matches($key, @name)]/text()
return $option
Update:
Note that [^Closet|Lights] is a negated character class, not a negated alternation: it only rejects a single character from that set, so keys such as Home.Living.Light or Home.Living.Closed do not match it. Since you can't use negative lookahead in XQuery, you could try something like this instead. Don't use name for the catch-all Home.Living match; put it in a separate fallback attribute. The code first looks at @name, and if it finds a match it uses that. If it doesn't, it looks at @fallback and matches against that.
This means the keys that aren't found in the first match run a second match, so it's just more expressions for those items.
declare namespace cfg = "somthinghere";
let $key := (
"Home.Living.Lights",
"Home.Living.Closet",
"Home.Living.anyOtherString",
"Home.Living.anotherString",
"Home.Living.Close",
"Gallery"
)[2]
let $config :=
<config>
<cfg:property name="Gallery.*.*">GalleryPort</cfg:property>
<cfg:property name="Office.*.*">OfficePort</cfg:property>
<cfg:property name="Home.Living.Closet$">LivingCloset</cfg:property>
<cfg:property name="Home.Living.Lights$">LivingLights</cfg:property>
<cfg:property fallback="Home.Living.*">LivingMisc</cfg:property>
</config>
return ($config//cfg:property[fn:matches($key, @name)]/text()[1], $config//cfg:property[fn:matches($key, @fallback)]/text()[1])[1]
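The "specific patterns first, catch-all last" idea from the update can be sketched outside XQuery as well. The following Ruby snippet is only an illustration of that lookup order, not the XQuery solution itself; the names RULES and lookup are hypothetical, and the patterns are copied from the config above.

```ruby
# Hypothetical rule table: specific patterns first, the catch-all last.
RULES = [
  ["Gallery.*.*",         "GalleryPort"],
  ["Office.*.*",          "OfficePort"],
  ["Home.Living.Closet$", "LivingCloset"],
  ["Home.Living.Lights$", "LivingLights"],
  ["Home.Living.*",       "LivingMisc"] # fallback, must stay last
].freeze

# Returns the value of the first rule whose pattern matches the key, or nil.
def lookup(key)
  rule = RULES.find { |pattern, _value| Regexp.new(pattern).match?(key) }
  rule && rule[1]
end

%w[Home.Living.Lights Home.Living.Light Home.Living.Closed
   Home.Living.Closet Home.Living.Table].each do |key|
  puts "#{key} -> #{lookup(key)}"
end
```

Because the first match wins, no negative lookahead is needed: the fallback only gets a chance when none of the specific patterns matched.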

Related

Ruby regx for xml attributes

I am trying to create a regex for a Fluent Bit parser and am not sure how to drop specific characters from a string.
<testsuite name="Activity moved" tests="1" errors="0" failures="0" skipped="0" time="151.109" timestamp="2022-09-05T16:22:53.184000">
Above is the input, which I have as a string, and I want to make multiple keys out of it.
expected output:
name: Activity moved
tests: 1
errors: 0
failures: 0
skipped: 0
timestamp: 2022-09-05T16:22:53.184000
How can i achieve this please?
try this:
str = "<testsuite name=\"Activity moved\" tests=\"1\" errors=\"0\" failures=\"0\" skipped=\"0\" time=\"151.109\" timestamp=\"2022-09-05T16:22:53.184000\">"
regexp = /(\w*)="(.*?)"/ # there's your regexp
str.scan(regexp).to_h # and this is how you make the requested hash
# => {"name"=>"Activity moved", "tests"=>"1", "errors"=>"0", "failures"=>"0", "skipped"=>"0", "time"=>"151.109", "timestamp"=>"2022-09-05T16:22:53.184000"}
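Since the question targets a Fluent Bit parser, which maps named capture groups to keys, a single-line pattern with named groups may be closer to what is needed there. This is a sketch only: it assumes the attributes always appear in exactly this order and are double-quoted, and the constant name TESTSUITE is made up for the example.

```ruby
line = '<testsuite name="Activity moved" tests="1" errors="0" failures="0" skipped="0" time="151.109" timestamp="2022-09-05T16:22:53.184000">'

# Assumption: fixed attribute order, double-quoted values.
TESTSUITE = /<testsuite\s+name="(?<name>[^"]*)"\s+tests="(?<tests>\d+)"\s+errors="(?<errors>\d+)"\s+failures="(?<failures>\d+)"\s+skipped="(?<skipped>\d+)"\s+time="(?<time>[^"]*)"\s+timestamp="(?<timestamp>[^"]*)"/

# named_captures returns a Hash keyed by the group names.
fields = TESTSUITE.match(line).named_captures
fields.each { |key, value| puts "#{key}: #{value}" }
```

Testing the pattern in Ruby first is convenient, but it should still be verified inside the actual parser configuration.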
Of course you can write your own parser, but maybe it's more comfortable to use Nokogiri?
require 'nokogiri'
doc = Nokogiri::XML(File.open("your.file", &:read))
puts doc.at("testsuite").attributes.map { |name, value| "#{name}: #{value}" }

Unable to figure out how to replace a tag in xml

Need help to figure out correct regex for replacing xml tag with contents of a file.
Tried basic things like escaping special characters but no luck. Open to using something else other than sed.
config.txt
<localReplications/>
replace-with-config.txt
<localReplications>
<localReplication>
<enabled>true</enabled>
<cronExp>0 0 /5 * * ?</cronExp>
<syncDeletes>true</syncDeletes>
<syncProperties>true</syncProperties>
<repoKey>some-repo-key</repoKey>
<url>https://foo.bar/random</url>
<socketTimeoutMillis>15000</socketTimeoutMillis>
<username>foo</username>
<password>bar</password>
<enableEventReplication>true</enableEventReplication>
<syncStatistics>false</syncStatistics>
</localReplication>
</localReplications>
<localReplications/> tag is part of really complicated xml file. I expect <localReplications/> to be replaced with contents in replace-with-config.txt
use XML::LibXML qw();
my $config = XML::LibXML
->load_xml(location => 'config.txt');
my $replace = (XML::LibXML
->load_xml(location => 'replace-with-config.txt')
->findnodes('//localReplications')
)[0];
for my $local_replications (
$config->findnodes('//localReplications')
) {
# $local_replications->replaceNode($replace);
# this fails with HIERARCHY_REQUEST_ERR,
# so do it in two steps instead
$local_replications->addSibling($replace);
$local_replications->unbindNode;
}
print $config->toString;
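If a parser is not available and the placeholder always appears literally as <localReplications/>, a plain string substitution is a possible stopgap. The sketch below is in Ruby rather than sed or Perl, and the helper name replace_placeholder is hypothetical; it will not cope with variants such as <localReplications /> or <localReplications></localReplications>, which is exactly where the parser-based approach above is more robust.

```ruby
# Hypothetical helper: swaps the literal placeholder for the replacement XML.
def replace_placeholder(config_xml, replacement_xml, placeholder = "<localReplications/>")
  # A String pattern is matched literally; the block form inserts the
  # replacement verbatim, so backslashes in it are not read as backreferences.
  config_xml.sub(placeholder) { replacement_xml.strip }
end

config = "<config>\n<localReplications/>\n</config>\n"
replacement = "<localReplications>\n<localReplication>\n<enabled>true</enabled>\n</localReplication>\n</localReplications>\n"

result = replace_placeholder(config, replacement)
puts result
```

With real files, the two strings would come from File.read("config.txt") and File.read("replace-with-config.txt").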

Generic solution for removing xml declaration using perl

Hi, I want to remove the declaration from my XML file; the problem is that the declaration is sometimes on the same line as the root element.
XML looks as follows
Case1:
<?xml version="1.0" encoding="UTF-8"?> <document> This is a document root
<child>----</child>
</document>
Case 2:
<?xml version="1.0" encoding="UTF-8"?>
<document> This is a document root
<child>----</child>
</document>
Function should also work for the case when root node is in next line.
My function works only for case 2..
sub getXMLData {
my ($xml) = @_;
my @data = ();
open(FILE,"<$xml");
while(<FILE>) {
chomp;
if(/\<\?xml\sversion/) {next;}
push(@data, $_);
}
close(FILE);
return join("\n",@data);
}
*** Please note that encoding is not constant always.
OK, so the problem here is - you're trying to parse XML line based, and that DOESN'T WORK. You should avoid doing it, because it makes brittle code, which will one day break - as you've noted - thanks to perfectly valid changes to the source XML. Both your documents are semantically identical, so the fact your code handles one and not the other is an example of exactly why doing XML this way is a bad idea.
More importantly though - why are you trying to remove the XML declaration from your XML? What are you trying to accomplish?
Generically reformatting XML can be done like this:
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;
my $twig = XML::Twig->new(
pretty_print => 'indented',
);
$twig->parsefile('your_xml_file');
$twig->print;
This will parse your XML and reformat it in one of the valid ways XML may be formatted. However I would strongly urge you not to just discard your XML declaration, and instead carry on with something like XML::Twig to process it. (Open a new question with what you're trying to accomplish, and I'll happily give you a solution that doesn't trip up with different valid formats of XML).
When it comes to merging XML documents, XML::Twig can do this too - and still check and validate your XML as it goes.
So you might do something like (extending from the above):
foreach my $file ( @file_list ) {
my $child = XML::Twig -> new ();
$child -> parsefile ( $file );
my $child_doc = $child -> root -> cut;
$child_doc -> paste ( $twig -> root );
}
$twig -> print;
Exactly what you'd need to do depends a little on your desired output structure - you'd need to wrap the merged documents in a root element anyway. Open a new question with some sample input and desired output, and I'll happily take a crack at it.
As an example - if you feed the above your sample input twice, you get:
<?xml version="1.0" encoding="UTF-8"?>
<document><document> This is a document root
<child>----</child></document> This is a document root
<child>----</child></document>
Which I know isn't likely to be what you want, but hopefully illustrates a parser based way of XML restructuring.
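To make the line-based failure concrete: the declaration is a prefix of the document, not a line of its own, so anything that drops whole lines also drops the root element in Case 1. If the declaration really must be removed without a parser, treating the file as one string handles both cases. The following Ruby sketch only illustrates that point; the names DECLARATION and strip_declaration are made up, and it assumes the declaration, if present, is the first thing in the input.

```ruby
# Matches an XML declaration at the very start of the input, whatever its
# encoding, plus any whitespace (including a newline) that follows it.
DECLARATION = /\A\s*<\?xml\b.*?\?>\s*/m

def strip_declaration(xml)
  xml.sub(DECLARATION, "")
end

case1 = %(<?xml version="1.0" encoding="UTF-8"?> <document> This is a document root\n<child>----</child>\n</document>)
case2 = %(<?xml version="1.0" encoding="ISO-8859-1"?>\n<document> This is a document root\n<child>----</child>\n</document>)

puts strip_declaration(case1)
puts strip_declaration(case1) == strip_declaration(case2)
```

Both inputs reduce to the same text because the match ends at the first ?> rather than at a line break.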

Assigning a variable with Dynamic data in Xquery

I have a requirement where I have to validate the incoming data against data present in a constant.xml.
Say below is my constant file:
<Constant>
<data>
<Nation>India</Nation>
<EndPointURL>customers/{$custID}/Resource</EndPointURL>
</data>
<data>
<Nation>China</Nation>
<EndPointURL>customers/{$custID}/Resource</EndPointURL>
</data>
<data>
<Nation>Russia</Nation>
<EndPointURL>customers/Resource</EndPointURL>
</data>
</Constant>
and $body is as follows:
<body>
<custID>1234</custID>
<Country>India</Country>
<ServiceURL>customers/1234/Resource</ServiceURL>
</body>
Here I have to check whether $body/ServiceURL = $Constant/data/EndPointURL.
The cardinality of data is 1 to infinity.
Is there a way to substitute the original custID from the input into customers/{$custID}/Resource and then make the validation check?
Presently, I am using below code to make a check.
let $ServiceURL :=$body/ServiceURL/text()
let $Country :=$body/Country/text()
for $service in ($Constant/data) where
($service/Nation/text() = $Country)
and ($service/EndPointURL/text() = $ServiceURL)
return
<ServiceURL>{fn:concat('/REALTime/',$service/EndPointURL/text())}</ServiceURL>
};
Please, let me know, how can I change the data of constant.xml in xquery
The following query shows you how to expand the embedded templates in your EndPointURLs and only return the result if there is a match.
(: find the `data` for our `Country` :)
let $data := $Constant/data[Nation eq $body/Country]
return
(: expand the {$var} parts of the EndPointURL :)
let $parts := tokenize($data/EndPointURL, "/")
return
let $expanded-parts :=
for $part in $parts
return
if(matches($part, "\{\$[^}]+\}"))then
let $src := replace($part, "\{\$([^}]+)\}", "$1")
return
$body/element()[local-name(.) eq $src]
else
$part
return
let $expanded-endpoint-url := string-join($expanded-parts, "/")
return
(: the body if the ServiceURL matches the expanded EndPointURL :)
$body[ServiceURL eq $expanded-endpoint-url]
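The placeholder expansion itself is easy to try in isolation. The Ruby sketch below mirrors the same idea (replace each {$name} with the value of the like-named field from the body, then compare); it is not XQuery, and the helper expand_template and the hash standing in for $body are assumptions made for the example.

```ruby
# Hypothetical helper: replaces every {$name} with values[name].
# fetch raises KeyError if the body has no such field.
def expand_template(template, values)
  template.gsub(/\{\$([^}]+)\}/) { values.fetch(Regexp.last_match(1)) }
end

# Stand-in for the <body> element from the question.
body = {
  "custID"     => "1234",
  "Country"    => "India",
  "ServiceURL" => "customers/1234/Resource"
}

expanded = expand_template('customers/{$custID}/Resource', body)
puts expanded
puts expanded == body["ServiceURL"]
```

A template without placeholders, such as customers/Resource, passes through unchanged.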

regular expression to replace an xml attribute

I have an xml file of the form:
<property name="foo" value="this is a long value">stuff</property>
There are many properties but I want to match the one with name foo and then replace its value attribute with something else as so:
<property name="foo" value="yet another long value">stuff</property>
I was thinking to write a regular expression to match everything after "foo" to the end of the tag ( ">" ) and replace that, but I can't seem to get the syntax right.
I'm trying to do this using sed, if that's any help.
/property name=\"foo\" value=\"([^\"]*)\"/
Then just replace the first submatch with the new value of your choosing.
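As a rough illustration of that substitution, here is the same idea in Ruby rather than sed. It assumes the name attribute comes directly before value and that both use double quotes; the helper name replace_value is hypothetical.

```ruby
# Hypothetical helper: rewrites the value attribute of the property
# whose name attribute equals `name`, leaving other properties alone.
def replace_value(xml, name, new_value)
  pattern = /(<property\s+name="#{Regexp.escape(name)}"\s+value=")[^"]*(")/
  xml.gsub(pattern) { "#{Regexp.last_match(1)}#{new_value}#{Regexp.last_match(2)}" }
end

xml = <<~XML
  <property name="foo" value="this is a long value">stuff</property>
  <property name="bar" value="leave me alone">stuff</property>
XML

updated = replace_value(xml, "foo", "yet another long value")
puts updated
```

Capturing the text around the value and re-emitting it keeps the rest of the tag untouched, which is the part that tends to go wrong when matching through to the closing ">".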
You probably don't want to use a regex for manipulating an xml file. Please instead consider xslt, which is aware of xml rules and won't cause your transformed document to become malformed.
If you're doing this in the context of a browser, you could create a throwaway DOM node containing the XML and just walk that to replace attribute values.
This function will call a callback on every child node:
const walkDOM = (node, callback) => {
callback(node);
[...node.children].forEach(child => {
walkDOM(child, callback)
});
}
You can then use this to update any attributes matching the conditions you'd like (here, any attribute named foo that holds the old value), assuming you have an XML string called svgXml:
const containerEl = document.createElement('div');
containerEl.innerHTML = svgXml;
walkDOM(containerEl, el => {
const attributes = [...el.attributes];
attributes.forEach(attr => {
if (attr.name === 'foo' && attr.value === 'this is a long value') {
attr.value = 'yet another long value';
}
});
});
const outputSvgXml = containerEl.innerHTML;
Of course you could further optimize this by using querySelectorAll('property') to only walk <property> nodes, etc.
I found this useful for updating an SVG while taking advantage of the browser's robustness.