Hello, I am trying to extract each group of data (the items are separated by a comma followed by a space) from a string like this:
MyString=[XXXXXX:XX XX XX XX, XXXXX:332.83, XXXXX:XXX-XX-XX XX:XX:XX, XXXX:0.0, XXXX:2, XXXX:0, XXXX:-256, counter_tipeee:5, XXXX:136935, XXXX:0, XXXX:XX XXX XXX, XXXX:0.5, XXXXX:true, XXXX:0.509375, XXX:0.0, XXXX:[2022-06-14 06:45:00], 2022-09-17 XXXXX:1]
With this regex, I can match all characters except ,
([^,]*)
https://regex101.com/r/lCN2YK/1
But what I actually want is to treat only comma+space as the separator.
The problem is that if I remove whitespace with \s, it also removes the spaces inside some of the values in my string. I want to extract every item, splitting only on exactly comma+space.
Another problem with my regex is that it does not exclude the first [ and the last ] of my string. I can't exclude every [ and ] because some values contain brackets themselves.
I found the regex ^.(.*).$ to exclude the first and last character, but I don't know how to combine my two regexes.
https://regex101.com/r/CAsKHE/1
The output that I expect is
List<String> My_goal= [
XXXXXX:XX XX XX XX
XXXXX:332.83
XXXXX:XXX-XX-XX XX:XX:XX
XXXX:0.0, XXXX:2
....
2022-09-17,XXXXX:1
]
Try this:
(?<=(?<!: *)\[).*?(?=,)|(?<=, *(?=[^ \r\n]))(?:.*?(?=,)|[^,\r\n\[\]]+?(?=\])|[^,\r\n]+\](?= *\]))
See regex demo.
I have a regex that works in 99% of my cases, but there is one case where it fails.
My input looks like this:
MyString=[XXXXXX:XX XX XX XX, XXXXX:332.83, XXXXX:XXX-XX-XX XX:XX:XX, XXXX:0.0, XXXX:2, XXXX:0, XXXX:-256, XXXXX:5, XXXX:136935, XXXX:0, XXXX:XX XXX XXX, XXXX:0.5, XXXXX:true, XXXX:0.509375, XXX:0.0, [XXXX:2022-06-14 06:45:00], 2022-09-17,XXXXX:1]
This regex matches every key:value pair between the first [ and the last ]
(?<=(?<!: *)\[).*?(?=,)|(?<=, *(?=[^ \r\n]))(?:.*?(?=,)|[^,\r\n\[\]]+?(?=\])|[^,\r\n]+\](?= *\]))
It splits the key:value pairs wherever a comma is found, but some of my data contains a comma with no space after it. For example, the last item of my example is split at its comma; I want to prevent that split so that 2022-09-17,XXXXX:1 is matched as a whole.
I am looking for a regex that treats only comma+space as a separator, not a bare comma.
Here is the example showing the split of the last item that I want to prevent:
https://regex101.com/r/8fd7Xv/1
You can add a space, or one or more spaces, at the positions where you assert a comma.
(?<=(?<!: *)\[).*?(?=, )|(?<=, +(?=[^ \r\n]))(?:.*?(?=, )|[^,\r\n\[\]]+?(?=\])|[^,\r\n]+\](?= *\]))
See the updated pattern https://regex101.com/r/Dlx8Xi/1
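If a single regex is not a hard requirement, the same idea (split only on a comma followed by spaces) can be sketched in JavaScript without lookarounds. This is only a minimal sketch and not the pattern from the answer: the function name splitFields and the shortened sample string are made up for illustration, and it assumes the data sits between the first [ and the last ] of the input.

```javascript
// Hypothetical helper: take everything between the first "[" and the last "]",
// then split on a comma followed by one or more spaces. A bare comma (no space
// after it) is left alone, and inner brackets are kept as part of the value.
function splitFields(input) {
  const start = input.indexOf('[');
  const end = input.lastIndexOf(']');
  if (start === -1 || end <= start) {
    return [];
  }
  const inner = input.slice(start + 1, end);
  return inner.split(/, +/);
}

// Shortened, made-up sample in the same shape as the question's input.
const sampleInput =
  'MyString=[A:1 2, B:332.83, [C:2022-06-14 06:45:00], 2022-09-17,D:1]';

const splitResult = splitFields(sampleInput);
console.log(splitResult);
```

The last item keeps its inner comma because the split pattern requires at least one space after the comma.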
What I'm trying to achieve is changing square brackets [] to curly/brace brackets {}.
There are two conditions, some start with [", the others end with "]
There will not be any occurrences where both exist in same string. Haven't run across any yet.
BEFORE:
[Strained breathing]
["Wanna Give My Love"
by The Sons of Rainier]
[Mavrick blows a fart]
["Hallelujah"
by The Sons of Rainer]
[Victor over the phone]
[The Korgi's "Everybody's
Got To Learn Sometime"]
[Lola chuckles]
["It's Good"
by Jack Hammer]
[Uno Hype's "Leave"]
Here's what I would like as the end result:
AFTER:
[Strained breathing]
{"Wanna Give My Love"
by The Sons of Rainier playing}
[Mavrick blows a fart]
{"Hallelujah"
by The Sons of Rainer}
[Victor over the phone]
{The Korgi's "Everybody's
Got To Learn Sometime"}
[Lola chuckles]
{"It's Good"
by Jack Hammer}
{Uno Hype's "Leave"}
Here are my attempts:
Find: (?=\[")([\S\s]+?)\]
Replace: \{$1\}
Find: (?=\[[A-Z])([\S\s]+?)\"]
Replace: \{$1\}
Find: \["([A-Z][\S\s]+?)\]
Replace: \{$1\}
So frustrated that my light bulb is still so dim when it comes to regex.
Thanks in Advance
You could use this regex:
\[("[^]]+|[^]]+")\]
which matches a [ followed by either
a " and some number of non-] characters; or
some number of non-] characters followed by a "
and then followed by a ], and replace it with {\1}.
Regex demo on regex101
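As a rough illustration, here is how that replacement might look in JavaScript. The question does not say which tool or language is used, so JavaScript is an assumption here, and the function name toBraces is made up. Two details differ from the pattern as written above: in a JavaScript regex literal the ] inside the character class has to be escaped (an unescaped [^] is a complete class on its own in JavaScript), and the replacement string uses $1 rather than \1.

```javascript
// JavaScript form of \[("[^]]+|[^]]+")\] with the inner "]" escaped.
// Group 1 is either: a quote followed by non-"]" characters,
// or non-"]" characters followed by a quote.
const bracketRe = /\[("[^\]]+|[^\]]+")\]/g;

// Hypothetical helper: swap [ ] for { } only on blocks that start or end with a quote.
function toBraces(text) {
  return text.replace(bracketRe, '{$1}');
}

const beforeText = [
  '[Strained breathing]',
  '["Hallelujah"',
  'by The Sons of Rainer]',
  '[The Korgi\'s "Everybody\'s',
  'Got To Learn Sometime"]',
  '[Uno Hype\'s "Leave"]',
].join('\n');

const afterText = toBraces(beforeText);
console.log(afterText);
```

Because [^\]] also matches line breaks, blocks that span two lines are handled without any extra flag.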
You can use
\[([^]["]*"[^][]*)]
Explanation
\[ Match [
( Capture group 1
[^]["]* Optionally match any char except ] [ "
" Then match a single "
[^][]* Optionally match any char except ] [
) Close group 1
] Match ]
Regex demo
In the replacement use {\1}
I'm trying to remove the bracketed blocks that contain a certain marker ('DeleteMe') while keeping the bracketed blocks that contain other words ('DontDeleteMe').
I thought it was simple, but it proved difficult because of the repeating brackets; see below.
[
aljdsfjfldsa DeleteMe aldsjflajdf
]
[
aldskjfal DontDeleteMe asdlkjflasdj
]
[
aljdsfjfldsa DeleteMe aldsjflajdf
]
[
aldskjfal DontDeleteMe asdlkjflasdj
]
Desired output
[
aldskjfal DontDeleteMe asdlkjflasdj
]
[
aldskjfal DontDeleteMe asdlkjflasdj
]
I tried the following but the problem is the second line will be deleted with the third line.
(?s)\[.*?'DeleteMe'.*?\]
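You can use a word boundary in combination with a negated character class [^][]* which matches any characters except [ and ], so a match cannot run past the end of its own block: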
\[[^][]*\bDontDeleteMe\b[^][]*\]
Regex demo
If the word is DeleteMe, you can match it using word boundaries and replace the match with an empty string.
\[[^][]*\bDeleteMe\b[^][]*\]
Regex demo
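A small sketch of that replacement, assuming JavaScript (the question does not name a language or tool). In a JavaScript regex literal the ] inside the class must be escaped, so [^][] is written as [^\][]. The trailing \n? is an addition of this sketch, not part of the answer; it removes the line break after a deleted block so no empty line is left behind. The name removeMarkedBlocks is hypothetical.

```javascript
// JavaScript form of \[[^][]*\bDeleteMe\b[^][]*\] plus an optional trailing newline.
// \bDeleteMe\b does not match inside "DontDeleteMe" because there is no
// word boundary between "t" and "D".
const deleteBlockRe = /\[[^\][]*\bDeleteMe\b[^\][]*\]\n?/g;

// Hypothetical helper: drop every bracketed block that contains the word DeleteMe.
function removeMarkedBlocks(text) {
  return text.replace(deleteBlockRe, '');
}

const blocksInput = [
  '[', 'aljdsfjfldsa DeleteMe aldsjflajdf', ']',
  '[', 'aldskjfal DontDeleteMe asdlkjflasdj', ']',
  '[', 'aljdsfjfldsa DeleteMe aldsjflajdf', ']',
  '[', 'aldskjfal DontDeleteMe asdlkjflasdj', ']',
].join('\n');

const blocksOutput = removeMarkedBlocks(blocksInput);
console.log(blocksOutput);
```

Since the class excludes both brackets, a match can never start in one block and end in the next, which is what went wrong with the lazy .*? attempt.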
For a JavaScript application, I'm trying to come up with a regex that will match key/value pairs in a string. It's working pretty well, but there is one last thing that I need to implement and I'm not sure how.
The syntax is very similar to what you'll find in a .env file. So key/value pairs look like KEY=value.
A few rules that I have already implemented:
The key
alphanumeric string.
can't be empty and can't be a number.
may contain an underscore
The value
can be string
may be surrounded by single or double quotes, or none at all.
Now I'm trying to add comments with # in there. It works, except when # is between the quotes. Any idea how to fix that? Thanks!
Here is my code sample:
// This is my regex
const regex = /^\s*(?![0-9_]*\s*=\s*([\W\w\s.]*)\s*$)[A-Z0-9_]+\s*=\s*(.*)?\s*(?<!#.*)/gi;
// Outputs [ "KEY=value " ] --> OK
const str = `KEY=value # Comment`;
console.log(str.match(regex));
// Outputs [ "KEY2=val" ] --> OK
const str2 = `KEY2=val#ue # Comment`;
console.log(str2.match(regex));
// Outputs [ "key3='value3' " ] --> OK
const str3 = `key3='value3' # Comment`;
console.log(str3.match(regex));
// Outputs [ "key_4='val" ] --> NOT OK
// Expecting [ "key_4='val#ue4' " ]
const str4 = `key_4='val#ue4' # Comment`;
console.log(str4.match(regex));
EDIT:
Here is another sample for testing:
# The following are matching
ONE = This is ONE
TWO=This is TWO
THREE="This is 'THREE'"
FOUR = "This is \"FOUR\""
fi_ve = 'This is \'FIVE\''
six='This is "SIX"'
NUMBER7="This is SEVEN" # Comment for SEVEN
number8="This is EIGHT"#Comment for EIGHT
NINE="This is #9"
TEN=This is #10
ELEVEN=
TWELVE=10
THIRTEEN=TRUE
FOURTEEN="true"
FIFTEEN=false
SIXTEEN='FALSE'
# The following are not matching(incl. empty line)
17="Is not valid because the key is a number"
="Is also not valid because the key is missing"
You may use
([A-Za-z_]\w*)[ \t]*=[ \t]*('[^'\\]*(?:\\.[^'\\]*)*'|"[^"\\]*(?:\\.[^"\\]*)*"|[^\r\n#]*)
See the regex demo
([A-Za-z_]\w*) - Group 1: a letter or underscore followed by 0 or more word chars (the key)
[ \t]*=[ \t]* - a = enclosed with 0 or more spaces or tabs
('[^'\\]*(?:\\.[^'\\]*)*'|"[^"\\]*(?:\\.[^"\\]*)*"|[^\r\n#]*) - Group 2:
'[^'\\]*(?:\\.[^'\\]*)*'| - a '...' like substring that may contain any string escape sequence, or
"[^"\\]*(?:\\.[^"\\]*)*"| - a "..." like substring that may contain any string escape sequence, or
[^\r\n#]* - 0 or more chars other than #, CR and LF
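To see the pattern at work, here is a minimal JavaScript sketch that runs it over a few of the lines from the question. The helper name parsePairs is made up, and trimming the unquoted value is an assumption of this sketch (the pattern itself keeps trailing spaces before a comment). Note that the pattern is not anchored to the start of a line, so it would also pick up a KEY=value pair that appears later in a line.

```javascript
// The pattern from the answer, with the g flag so matchAll can be used.
const pairRe =
  /([A-Za-z_]\w*)[ \t]*=[ \t]*('[^'\\]*(?:\\.[^'\\]*)*'|"[^"\\]*(?:\\.[^"\\]*)*"|[^\r\n#]*)/g;

// Hypothetical helper: collect key -> raw value (quotes kept, outer spaces trimmed).
function parsePairs(text) {
  const result = {};
  for (const m of text.matchAll(pairRe)) {
    result[m[1]] = m[2].trim();
  }
  return result;
}

const envSample = [
  'KEY=value # Comment',
  "key_4='val#ue4' # Comment",
  'NINE="This is #9"',
  String.raw`FOUR = "This is \"FOUR\""`,
  '17="Is not valid because the key is a number"',
].join('\n');

const envPairs = parsePairs(envSample);
console.log(envPairs);
```

The quoted alternatives are tried first, which is why a # inside quotes stays part of the value while a # after an unquoted value ends it.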
I have an input line that looks like this:
localhost_9999.kafka.server:type=SessionExpireListener,name=ZooKeeperSyncConnectsPerSec.OneMinuteRate
and I can use this pattern to parse it:
%{DATA:kafka_node}:type=%{DATA:kafka_metric_type},name=%{JAVACLASS:kafka_metric_name}
which gives me this:
{
"kafka_node": [
[
"localhost_9999.kafka.server"
]
],
"kafka_metric_type": [
[
"SessionExpireListener"
]
],
"kafka_metric_name": [
[
"ZooKeeperSyncConnectsPerSec.OneMinuteRate"
]
]
}
I want to split the OneMinuteRate into a separate field but can't seem to get it to work. I've tried this:
%{DATA:kafka_node}:type=%{DATA:kafka_metric_type},name=%{WORD:kafka_metric_name}.%{WORD:attr_type}"
but get nothing back then.
I'm also using https://grokdebug.herokuapp.com/ to test these out...
You can either use your last regex with an escaped . (note that a . matches any char but newline and a \. will match a literal dot char), or use DATA type for the last but one field and a GREEDYDATA for the last field:
%{DATA:kafka_node}:type=%{DATA:kafka_metric_type},name=%{DATA:kafka_metric_name}\.%{GREEDYDATA:attr_type}
Since %{DATA:name} translates to (?<name>.*?) and %{GREEDYDATA:name} translates to (?<name>.*), the name part will match any chars, 0 or more occurrences, as few as possible, up to the first ., and attr_type .* pattern will greedily "eat up" the rest of the line up to its end.
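The lazy/greedy behaviour described above can be checked outside of grok with an ordinary regex. The following is only a sketch in JavaScript that hand-translates the pattern using the two expansions mentioned (.*? for DATA, .* for GREEDYDATA); it does not run grok itself, and the ^ and $ anchors are an assumption added for the illustration.

```javascript
// Hand-written equivalent of the grok pattern, using named capture groups.
const kafkaLineRe =
  /^(?<kafka_node>.*?):type=(?<kafka_metric_type>.*?),name=(?<kafka_metric_name>.*?)\.(?<attr_type>.*)$/;

const kafkaLine =
  'localhost_9999.kafka.server:type=SessionExpireListener,name=ZooKeeperSyncConnectsPerSec.OneMinuteRate';

const kafkaMatch = kafkaLine.match(kafkaLineRe);
// Copy the named groups into a plain object (null if the line did not match).
const kafkaFields = kafkaMatch ? { ...kafkaMatch.groups } : null;
console.log(kafkaFields);
```

With the dot escaped, kafka_metric_name stops at the first literal . after name= and attr_type receives the rest of the line.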