Ruby regex for XML attributes - regex

I am trying to create a regex for a Fluent Bit parser and am not sure how to drop specific characters from a string.
<testsuite name="Activity moved" tests="1" errors="0" failures="0" skipped="0" time="151.109" timestamp="2022-09-05T16:22:53.184000">
Above is the input, which I have as a string, and I want to make multiple keys out of it.
Expected output:
name: Activity moved
tests: 1
errors: 0
failures: 0
skipped: 0
timestamp: 2022-09-05T16:22:53.184000
How can I achieve this, please?

Try this:
str = "<testsuite name=\"Activity moved\" tests=\"1\" errors=\"0\" failures=\"0\" skipped=\"0\" time=\"151.109\" timestamp=\"2022-09-05T16:22:53.184000\">"
regexp = /(\w*)="(.*?)"/ # there's your regexp
str.scan(regexp).to_h # and this is how you make the requested hash
# => {"name"=>"Activity moved", "tests"=>"1", "errors"=>"0", "failures"=>"0", "skipped"=>"0", "time"=>"151.109", "timestamp"=>"2022-09-05T16:22:53.184000"}
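The expected output in the question lists `key: value` lines and leaves out `time`, so a small follow-up step can filter and format the hash. The following is a minimal sketch of that step, not a Fluent Bit configuration; the `wanted` list is an assumption taken from the question's expected output, and the pattern uses `\w+` and `[^"]*` as a slightly stricter variant of the regexp above.

```ruby
# Sketch: extract the attributes, keep only the requested keys, print "key: value".
line = '<testsuite name="Activity moved" tests="1" errors="0" failures="0" ' \
       'skipped="0" time="151.109" timestamp="2022-09-05T16:22:53.184000">'

# \w+ requires a non-empty attribute name;
# [^"]* stops at the closing quote without relying on lazy matching.
attribute = /(\w+)="([^"]*)"/

# Assumed list of keys to keep, copied from the expected output in the question.
wanted = %w[name tests errors failures skipped timestamp]

attrs    = line.scan(attribute).to_h   # all attributes as a hash of strings
selected = attrs.slice(*wanted)        # drops "time", keeps the insertion order
output   = selected.map { |key, value| "#{key}: #{value}" }

puts output
```

`Hash#slice` needs Ruby 2.5 or newer; on older versions `attrs.select { |k, _| wanted.include?(k) }` would do the same job.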

Of course you can write your own parser, but maybe it is more convenient to use Nokogiri?
require 'nokogiri'
doc = Nokogiri::XML(File.open("your.file", &:read))
puts doc.at("testsuite").attributes.map { |name, value| "#{name}: #{value}" }

Related

Cannot get Maven Archetype requiredProperty validationRegex to work

I have an archetype and I am trying to add a new requiredProperty which should only allow one of two possible values: "hibernate" and "hibernate-reactive". In the archetype-metadata.xml, I have defined the property as shown below:
<requiredProperty key="quarkus_orm_selection">
<validationRegex><![CDATA[^(hibernate|hibernate-reactive)$]]></validationRegex>
</requiredProperty>
In jshell and in other Java programs, I have verified that the regular expression works properly, but in the archetype when I test using a value like hibernate-ree the archetype proceeds without an error!
I proved out the regex as follows in JShell:
jshell> String[] examples = {"hibernate", "hibernate-reactive", "hibernate-re", "hibernate-ree", "testing", "reactive"}
examples ==> String[6] { "hibernate", "hibernate-reactive", "h ... ", "testing", "reactive" }
jshell> Pattern regex = Pattern.compile("^(hibernate|hibernate-reactive)$")
regex ==> ^(hibernate|hibernate-reactive)$
jshell> Arrays.asList(examples).stream().filter(i -> regex.matcher(i).matches()).forEach(System.out::println)
hibernate
hibernate-reactive
Can anyone suggest what I might be doing wrong?
I am using Maven Archetype Plugin version 3.2.0
As far as I can tell, the Maven Archetype Plugin only validates the regex if you pass in the property in interactive mode.
I created an archetype-post-generate.groovy script to do the validation instead (see below).
src/main/resources/META-INF/archetype-post-generate.groovy:
String ormSelector = request.getProperties().get("quarkus_orm_selection")
def pattern = "^(hibernate|hibernate-reactive)\$" // the \$ is important!
final match = ormSelector.matches(pattern)
if (!match) {
println "[ERROR] ormSelector: $ormSelector is not valid"
println "[ERROR] please provide an ormSelector that follows this pattern: '$pattern'"
throw new RuntimeException("OrmSelector: $ormSelector is not valid")
}

Jmeter - find with regex and replace part of XML body using groovy

I'm trying to replace part of an XML response with something else.
Here is an example:
<?xml version="1.0" encoding="UTF-8"?>
<trustedDevices><trustedDevice><id>1942</id><name>BksQ9LKwWuNOHpn</name></trustedDevice><trustedDevice><id>1944</id><name>6f4srs4PkJk1j36</name></trustedDevice><trustedDevice><id>1943</id><name>7cGYVAlmQoXaVrf</name></trustedDevice></trustedDevices>
I'm trying to get all the <name>(.+?)<\/name> data and replace it with something else (a timestamp or a random string).
So far, my Groovy post-processor code looks like this:
String trustedDevices = prev.getResponseDataAsString()
log.info('Response: ' + trustedDevices)
def nameFind = "/<name>(.+?)<\/name>/"
def newTrustedDevices = trustedDevices.replaceAll(nameFind, "test")
log.info('New response: ' + newTrustedDevices)
Unfortunately, it seems that replaceAll requires a String or Long to work, and won't work with a regex.
Your regex just needs correct escaping:
def nameFind = "<name>(.+?)<\\/name>"
However, replacing values in XML using regular expressions is not the best option, as it is fragile and very sensitive to any markup change.
I would suggest using Groovy's XML parsing capabilities instead.
Example code:
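import groovy.xml.StreamingMarkupBuilder

// Parse the response body into a navigable tree
def trustedDevices = new XmlSlurper().parseText(prev.getResponseDataAsString())
// Overwrite the <name> of every <trustedDevice>
trustedDevices.trustedDevice.findAll().each {
it.name = 'test'
}
// Serialize the modified tree back to a string
def newTrustedDevices = new StreamingMarkupBuilder().bind { mkp.yield trustedDevices }.toString()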
More information on Groovy scripting in JMeter: Apache Groovy - Why and How You Should Use It
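For comparison, the regex route can be sketched in Ruby, the language of the main question. One detail worth noting: replacing the whole match with a bare string such as "test" also removes the surrounding <name> tags, so the replacement should re-emit them. The XML sample below is a shortened copy of the response in the question, and the `device-N` values are a hypothetical stand-in for a timestamp or random string.

```ruby
# Shortened copy of the response body from the question.
xml = '<trustedDevices>' \
      '<trustedDevice><id>1942</id><name>BksQ9LKwWuNOHpn</name></trustedDevice>' \
      '<trustedDevice><id>1944</id><name>6f4srs4PkJk1j36</name></trustedDevice>' \
      '</trustedDevices>'

# The lazy quantifier keeps each match inside a single <name> element.
# %r{...} avoids having to escape the forward slash in the closing tag.
name_pattern = %r{<name>.+?</name>}

# The block form of gsub lets every replacement differ.
# The counter-based value is hypothetical; any generated string would work.
counter = 0
replaced = xml.gsub(name_pattern) do
  counter += 1
  "<name>device-#{counter}</name>"
end

puts replaced
```

This carries the same fragility as any regex-based XML edit, so the parser-based approach above remains the safer choice when the markup may change.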

Unable to figure out how to replace a tag in xml

I need help figuring out the correct regex for replacing an XML tag with the contents of a file.
I tried basic things like escaping special characters, but no luck. I am open to using something other than sed.
config.txt
<localReplications/>
replace-with-config.txt
<localReplications>
<localReplication>
<enabled>true</enabled>
<cronExp>0 0 /5 * * ?</cronExp>
<syncDeletes>true</syncDeletes>
<syncProperties>true</syncProperties>
<repoKey>some-repo-key</repoKey>
<url>https://foo.bar/random</url>
<socketTimeoutMillis>15000</socketTimeoutMillis>
<username>foo</username>
<password>bar</password>
<enableEventReplication>true</enableEventReplication>
<syncStatistics>false</syncStatistics>
</localReplication>
</localReplications>
The <localReplications/> tag is part of a really complicated XML file. I expect <localReplications/> to be replaced with the contents of replace-with-config.txt.
use XML::LibXML qw();
my $config = XML::LibXML
->load_xml(location => 'config.txt');
my $replace = (XML::LibXML
->load_xml(location => 'replace-with-config.txt')
->findnodes('//localReplications')
)[0];
for my $local_replications (
$config->findnodes('//localReplications')
) {
# $local_replications->replaceNode($replace);
# this fails with HIERARCHY_REQUEST_ERR,
# so do it in two steps instead
$local_replications->addSibling($replace);
$local_replications->unbindNode;
}
print $config->toString;

Regex on io.Text RDD using Scala

I have a problem. I need to extract some data from a file like this:
(3269,
<page>
<title>Anarchism</title>
<ns>0</ns>
<id>12</id>
<revision>...
)
(194712,
<page>
<title>AssistiveTechnology</title>
<ns>0</ns>
<id>23</id>..
) etc...
This file was generated using:
val conf = new Configuration
conf.set("textinputformat.record.delimiter", "</page>")
val rdd=sc.newAPIHadoopFile("sample.bz2", classOf[TextInputFormat], classOf[LongWritable], classOf[Text], conf)
rdd.map{case (k,v) => (k.get(), new String(v.copyBytes()))}
I need to obtain the title content. I'm using a regex, but the output file remains empty. My code is like this:
val xx = rdd.map(x => x._2).filter(x => x.matches(".*<title>([A-Za-z]+)<\\/title>.*"))
I also tried this pattern:
".*<title>([A-Za-z]+)</title>.*"
And this variant:
val reg = ".*<title>([\\w]+)</title>.*".r
val xx = rdd.map(x => x._2).filter(x => reg.pattern.matcher(x).matches)
I create the .jar using sbt and run it with spark-submit.
BTW, using spark-shell it works.
I need your help, please. Thanks.
You could use built-in Scala support for XML. Something like
import scala.xml._
rdd.map(x => (XML.loadString(x._2) \ "title").text)

Regex: Exclude a substring in fn:matches

I have a config XML as follows:
<cfg:property name="Gallery.*.*">GalleryPort</cfg:property>
<cfg:property name="Office.*.*">OfficePort</cfg:property>
<cfg:property name="Home.Living.Closet$">LivingCloset</cfg:property>
<cfg:property name="Home.Living.Lights$">LivingLights</cfg:property>
I use XQuery to get the value for the property using
$dynamicURN := $config//cfg:property[matches($key, @name)]/text()
$key is received from the database, and the URN is then fetched.
What should be the fourth property name so that it can catch any string other than Closet or Lights after Home.Living.?
Example: I tried
<cfg:property name=" Home.Living.[A-Z-[LIGHTS]]">LivingMisc</cfg:property>
<cfg:property name=" Home.Living.[^'(LIGHTS)']">LivingMisc</cfg:property>
Possible values for key are:
Home.Living.Lights
Home.Living.Light
Home.Living.Closed
Home.Living.Closet
Home.Living.Table
The respective output should be
LivingLights
LivingMisc
LivingMisc
LivingCloset
LivingMisc
The code below will get you what you need, but there are other approaches that perform better at scale. Keep in mind that the regex is evaluated for each item it finds, so every lookup costs one regex match per property node.
declare namespace cfg = "somthinghere";
let $key := ("Home.Living.Lights","Home.Living.Closet","Home.Living.anyOtherString","Home.Living.anotherString")[1]
let $config :=
<config>
<cfg:property name="Gallery.*.*">GalleryPort</cfg:property>
<cfg:property name="Office.*.*">OfficePort</cfg:property>
<cfg:property name="Home.Living.Closet$">LivingCloset</cfg:property>
<cfg:property name="Home.Living.Lights$">LivingLights</cfg:property>
<cfg:property name="Home.Living.[^Closet|Lights]">LivingMisc</cfg:property>
</config>
let $option := $config//cfg:property[fn:matches($key, @name )]/text()
return $option
Update:
Since you can't use negative lookahead in XQuery, you could try something like this: don't use a name attribute for the catch-all Home.Living. rule. The code here first looks at @name and, if it finds a match, stops there. If it finds nothing, it then looks at @fallback and matches on that.
This just means that keys not found in the first pass run a second match, so those items cost a few more expressions.
declare namespace cfg = "somthinghere";
let $key := (
"Home.Living.Lights",
"Home.Living.Closet",
"Home.Living.anyOtherString",
"Home.Living.anotherString",
"Home.Living.Close",
"Gallery"
)[2]
let $config :=
<config>
<cfg:property name="Gallery.*.*">GalleryPort</cfg:property>
<cfg:property name="Office.*.*">OfficePort</cfg:property>
<cfg:property name="Home.Living.Closet$">LivingCloset</cfg:property>
<cfg:property name="Home.Living.Lights$">LivingLights</cfg:property>
<cfg:property fallback="Home.Living.*">LivingMisc</cfg:property>
</config>
return ($config//cfg:property[fn:matches($key, @name )]/text()[1], $config//cfg:property[fn:matches($key, @fallback )]/text()[1])[1]
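The two-pass lookup described above (specific rules first, catch-all second) is not tied to XQuery. As a rough illustration only, here is the same logic sketched in Ruby, using the patterns from the config and the sample keys from the question; the `primary`, `fallback`, `first_match`, and `lookup` names are hypothetical.

```ruby
# Rules that correspond to the name attributes in the config.
primary = {
  "Gallery.*.*"         => "GalleryPort",
  "Office.*.*"          => "OfficePort",
  "Home.Living.Closet$" => "LivingCloset",
  "Home.Living.Lights$" => "LivingLights"
}

# Catch-all rule that corresponds to the fallback attribute.
fallback = { "Home.Living.*" => "LivingMisc" }

# Return the value of the first rule whose pattern matches the key, or nil.
first_match = lambda do |rules, key|
  hit = rules.find { |pattern, _value| Regexp.new(pattern).match?(key) }
  hit && hit.last
end

# Try the specific rules first; consult the fallback only when nothing matched.
lookup = ->(key) { first_match.call(primary, key) || first_match.call(fallback, key) }

keys = %w[
  Home.Living.Lights Home.Living.Light Home.Living.Closed
  Home.Living.Closet Home.Living.Table
]
results = keys.map { |key| lookup.call(key) }

puts results
```

For the five sample keys this yields LivingLights, LivingMisc, LivingMisc, LivingCloset, LivingMisc, which is the output the question asks for.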