Regex match strings with different values - regex

for i,v in array
for i , v in array
for i , v in array
for i, v in array
for i,v in array
for i, v in array
for[\s+,.](.+)
https://regex101.com/r/Vd3w7C/2
How could I match everything after the v?
Note that i, v, and "in array" will all have different values. I mean something like:
for ppp,gflgkf heekd gfvb

You could use
\bfor\s+[^\s,]+(?:\s*,\s*[^\s,]+)*\s+(.+)
The pattern matches:
\bfor\s+ Match for and 1+ whitespace chars
[^\s,]+ Match 1+ times any char except a whitespace char or ,
(?: Non capture group
\s*,\s*[^\s,]+ Match a comma between optional whitespace chars, and match at least a single char other than a comma or whitespace chars
)*\s+ Close the group and optionally repeat it followed by 1+ whitespace chars
(.+) Capture 1+ times any char except a newline in group 1
See a regex demo.
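As a quick check, here is a minimal Python sketch that runs the pattern above against the sample lines from the question. The helper name rest_after_variables is only an illustration, and Python's re module is assumed to behave like the regex101 flavor for this pattern.

```python
import re

# Pattern from the answer above, unchanged.
FOR_PATTERN = re.compile(r"\bfor\s+[^\s,]+(?:\s*,\s*[^\s,]+)*\s+(.+)")

def rest_after_variables(line):
    """Return the text after the comma-separated names, or None if there is no match."""
    match = FOR_PATTERN.search(line)
    return match.group(1) if match else None

samples = [
    "for i,v in array",
    "for i , v in array",
    "for i, v in array",
    "for ppp,gflgkf heekd gfvb",
]
for sample in samples:
    print(repr(sample), "->", repr(rest_after_variables(sample)))
```

Group 1 holds everything after the last name in the comma-separated list, whatever the names are.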

Related

How to clear all commas except for commas in even position in sheet?

I have multiple rows of strings that are malformed. Here is one row as an example of the geometry and the expected output:
id: 1
The geometry is the first line below; the expected output is the second:
POLYGON (( 106.812271, -6.361551, 106.812111, -6.361339, 106.81205, -6.361177, 106.81206, -6.360905, 106.812055, -6.360582, 106.812065, -6.360218, 106.812293, -6.359295, 106.812593, -6.358644, 106.812436, -6.358406, 106.8121515, -6.3582051, 106.8123, -6.357823, 106.81244, -6.357407, 106.812612, -6.356842, 106.812719, -6.356544, 106.81274, -6.356384, 106.812864, -6.356148, 106.813019, -6.356021, 106.813287, -6.355797, 106.813781, -6.355286, 106.814076, -6.354751, 106.814277, -6.354393, 106.814403, -6.354027, 106.814553, -6.353814, 106.814736, -6.353526, 106.814993, -6.353302, 106.81516, -6.353024, 106.815358, -6.35279, 106.815509, -6.352588, 106.815675, -6.352331, 106.8153007, -6.3521138, 106.8151398, -6.3520137, 106.8149789, -6.3518005, 106.8147643, -6.3516939, 106.8144639, -6.3516245, 106.8141527, -6.3515392, 106.8135734, -6.351342, 106.813171, -6.3512034, 106.8123284, -6.3509219, 106.8122418, -6.3511298, 106.8118164, -6.3521534, 106.8116597, -6.3525047, 106.8111849, -6.3535692, 106.8102245, -6.3554942, 106.8093545, -6.3568947, 106.8085097, -6.3580518, 106.80795, -6.358832, 106.8077793, -6.3590429, 106.807668, -6.359441, 106.807499, -6.360346, 106.8072531, -6.3616378, 106.8071476, -6.3622599, 106.8070637, -6.3626798, 106.8070823, -6.3629367, 106.8071207, -6.3634531, 106.8078269, -6.363831, 106.809448, -6.364124, 106.810574, -6.364198, 106.81066, -6.362993, 106.811175, -6.36277, 106.812087, -6.361703, 106.812271, -6.361551))
POLYGON (( 106.812271 -6.361551, 106.812111 -6.361339, 106.81205 -6.361177, 106.81206 -6.360905, 106.812055 -6.360582, 106.812065 -6.360218, 106.812293 -6.359295, 106.812593 -6.358644, 106.812436 -6.358406, 106.8121515 -6.3582051, 106.8123 -6.357823, 106.81244 -6.357407, 106.812612 -6.356842, 106.812719 -6.356544, 106.81274 -6.356384, 106.812864 -6.356148, 106.813019 -6.356021, 106.813287 -6.355797, 106.813781 -6.355286, 106.814076 -6.354751, 106.814277 -6.354393, 106.814403 -6.354027, 106.814553 -6.353814, 106.814736 -6.353526, 106.814993 -6.353302, 106.81516 -6.353024, 106.815358 -6.35279, 106.815509 -6.352588, 106.815675, -6.352331, 106.8153007, -6.3521138, 106.8151398 -6.3520137, 106.8149789 -6.3518005, 106.8147643 -6.3516939, 106.8144639 -6.3516245, 106.8141527 -6.3515392, 106.8135734 -6.351342, 106.813171 -6.3512034, 106.8123284 -6.3509219, 106.8122418 -6.3511298, 106.8118164 -6.3521534, 106.8116597 -6.3525047, 106.8111849 -6.3535692, 106.8102245 -6.3554942, 106.8093545 -6.3568947, 106.8085097 -6.3580518, 106.80795 -6.358832, 106.8077793 -6.3590429, 106.807668 -6.359441, 106.807499 -6.360346, 106.8072531 -6.3616378, 106.8071476 -6.3622599, 106.8070637 -6.3626798, 106.8070823 -6.3629367, 106.8071207 -6.3634531, 106.8078269 -6.363831, 106.809448 -6.364124, 106.810574 -6.364198, 106.81066 -6.362993, 106.811175 -6.36277, 106.812087 -6.361703, 106.812271 -6.361551))
The row above is one example. I need to remove every comma in an odd position and keep only the commas in even positions, so that the geometry becomes the output.
I tried SPLIT(text, ",") followed by concatenation, but when some columns are blank it returns xxx,,,, which is not what I had in mind.
Since some rows have more than 200 commas, I would need more than 200 columns. Is there a simpler way, such as using regex? Please help.
If the second number is always negative, this is as simple as replacing ", -" (comma, space, dash) with " -" (space, dash), so that the minus sign is kept.
=REGEXREPLACE(B2,", -"," -")
If not,
=REGEXREPLACE(B2,"(-??\d+\.?\d*),(\s*-?\d+\.?\d*)","$1$2")
Capture group #1: (-??\d+\.?\d*)
-?? zero or one literal dash (lazy), followed by
\d+ one or more digits, followed by
\.? zero or one literal ., followed by
\d* zero or more digits
, a literal comma
Capture group #2: (\s*-?\d+\.?\d*)
\s* zero or more whitespace characters, followed by
-? zero or one literal dash, followed by
\d+ one or more digits, followed by
\.? zero or one literal ., followed by
\d* zero or more digits
Replace with capture groups only: $1$2
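To see the replacement at work outside of Sheets, here is a small Python sketch using the same pattern. It is only an approximation of REGEXREPLACE: Python writes the backreferences as \1\2 instead of $1$2, the function name drop_odd_commas is made up for illustration, and the sample is a shortened version of the geometry from the question.

```python
import re

# Same pattern as in the formula above.
PAIR_PATTERN = re.compile(r"(-??\d+\.?\d*),(\s*-?\d+\.?\d*)")

def drop_odd_commas(geometry):
    """Remove the comma inside each number pair and keep the comma between pairs."""
    return PAIR_PATTERN.sub(r"\1\2", geometry)

sample = "POLYGON (( 106.812271, -6.361551, 106.812111, -6.361339, 106.81205, -6.361177))"
print(drop_odd_commas(sample))
```

Each match consumes two numbers and the comma between them, so the comma that follows a pair is never part of a match and survives.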
try:
=INDEX(REGEXREPLACE(QUERY(FLATTEN(SPLIT(A1, ",")&IF(ISODD(
SEQUENCE(1, COLUMNS(SPLIT(A1, ",")))),, ",")),,9^9), ",$", ))
for array:
=INDEX(IFERROR(BYROW(A1:A3, LAMBDA(x, REGEXREPLACE(QUERY(FLATTEN(SPLIT(x, ",")&
IF(ISODD(SEQUENCE(1, COLUMNS(SPLIT(x, ",")))),, ",")),,9^9), ",$", )))))
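The idea behind these formulas (split on commas, put a comma back only after every second part, then trim the trailing comma) can be sketched in Python as follows. This is a loose analogue, not a translation: the function name is hypothetical, and unlike QUERY it does not insert spaces when joining, it simply keeps the whitespace already present in each part.

```python
def drop_odd_commas_by_split(geometry):
    """Rejoin comma-separated parts, keeping a comma only after every second part."""
    parts = geometry.split(",")
    pieces = []
    for position, part in enumerate(parts, start=1):
        # Odd positions (1st, 3rd, ...) get nothing appended; even ones get a comma back.
        pieces.append(part if position % 2 else part + ",")
    joined = "".join(pieces)
    # Mirror the trailing ",$" cleanup from the formula.
    return joined[:-1] if joined.endswith(",") else joined

sample = "POLYGON (( 106.812271, -6.361551, 106.812111, -6.361339))"
print(drop_odd_commas_by_split(sample))
```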
An idea to match , and capture any non-commas with an optional comma after:
=REGEXREPLACE(A1; ",([^,]*,?)"; "$1")
Replace with $1 what was captured by the first group - See this demo at regex101
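The same replacement can be tried in Python, assuming re.sub behaves like REGEXREPLACE for this pattern (the backreference is written \1 instead of $1, and the function name is only illustrative):

```python
import re

# Match a comma, then capture the following non-comma run plus an optional comma.
COMMA_PATTERN = re.compile(r",([^,]*,?)")

def keep_even_commas(text):
    """Drop the 1st, 3rd, 5th, ... comma and keep the 2nd, 4th, ... comma."""
    return COMMA_PATTERN.sub(r"\1", text)

sample = "POLYGON (( 106.812271, -6.361551, 106.812111, -6.361339))"
print(keep_even_commas(sample))
```

The matched comma is always an odd-position one, because the optional comma inside the group consumes the even-position comma that follows it.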

Regex that does not accept sub strings of more than two 'b'

I need a regex that accepts all strings consisting only of the characters a and b, except those with two or more 'b' in a row.
For example, these should not match:
abb
ababbb
bba
bbbaa
bbb
bb
I came up with this, but it's not working
[a-b]+b{2,}[a-b]*
Here is my code:
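#include <iostream>
#include <regex>
#include <string>
using namespace std;

int main() {
    string input;
    // Raw string literal, so \b reaches the regex engine as a word boundary
    // instead of being read by the compiler as a backspace escape.
    regex validator_regex(R"(\b(?:b(?:a+b?)*|(?:a+b?)+)\b)");
    cout << "Hello, " << endl;
    // Keep prompting until the whole input matches the pattern.
    while (regex_match(input, validator_regex) == false) {
        cout << "please enter your choice of regEx :" << endl;
        if (!(cin >> input)) break;  // stop on EOF or a read error
        if (regex_match(input, validator_regex) == false)
            cout << input + " is not a valid input" << endl;
        else
            cout << input + " is valid " << endl;
    }
}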
Your pattern [a-b]+b{2,}[a-b]* matches 1 or more a or b chars followed by bb, which is exactly what you don't want. Also note that it requires the string to be at least 3 characters long because of this part: [a-b]+b{2,}
To avoid matching 2 b chars in a row, you can rule those strings out with a negative lookahead that matches optional a or b chars until it encounters bb.
Note that [a-b] is the same as [ab]
\b(?![ab]*?bb)[ab]+\b
\b A word boundary
(?![ab]*?bb) Negative lookahead, assert not 0+ times a or b followed by bb to the right
[ab]+ Match 1+ occurrences of a or b
\b A word boundary
Regex demo
Without using lookarounds, you can match the strings that you don't want by matching a string that contains bb, and capture in group 1 the strings that you want to keep:
\b[ab]*bb[ab]*\b|\b([ab]+)\b
Regex demo
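A short Python sketch of this "match what you don't want, capture what you do" technique. The function name and the sample sentence are made up for illustration. With a single capture group, re.findall returns group 1 for every match, and that is an empty string whenever the first alternative (the one containing bb) was the one that matched.

```python
import re

# First alternative consumes words containing bb; second captures the words to keep.
KEEP_PATTERN = re.compile(r"\b[ab]*bb[ab]*\b|\b([ab]+)\b")

def words_without_bb(text):
    """Return the a/b words in text that do not contain bb."""
    return [word for word in KEEP_PATTERN.findall(text) if word]

print(words_without_bb("ab abb ba bbb a"))
```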
Or use an alternation that matches either a b followed by optional repetitions of 1+ a chars and an optional b, or 1+ repetitions of 1+ a chars followed by an optional b:
\b(?:b(?:a+b?)*|(?:a+b?)+)\b
Regex demo
The simplest regex is:
^(?!.*bb)[ab]+$
See live demo.
This regex works by adding a negative look ahead (anchored to start) for bb appearing anywhere within input consisting of a or b.
If zero length input should match, change [ab]+ to [ab]*.
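To compare the lookahead pattern with the alternation pattern given earlier, here is a minimal Python sketch that runs both against the strings from the question. It assumes Python's re.fullmatch is a fair stand-in for C++ regex_match (both require the whole input to match); the accepted list and the helper name are my own additions for illustration.

```python
import re

LOOKAHEAD = re.compile(r"^(?!.*bb)[ab]+$")
ALTERNATION = re.compile(r"\b(?:b(?:a+b?)*|(?:a+b?)+)\b")

def is_valid(text, pattern=LOOKAHEAD):
    """True if the whole text consists of a and b only and never contains bb."""
    return pattern.fullmatch(text) is not None

rejected = ["abb", "ababbb", "bba", "bbbaa", "bbb", "bb"]   # from the question
accepted = ["a", "b", "ab", "ba", "abab", "babab"]          # illustrative extras

for name, pattern in (("lookahead", LOOKAHEAD), ("alternation", ALTERNATION)):
    for text in rejected + accepted:
        print(name, repr(text), is_valid(text, pattern))
```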

Find if either followed by non number or end of file

I want to match the string b5 with an optional $ in front of the b or the 5:
=b5
b$5
= $b$5
($b5)
But the 5 can't be followed by another digit, and the b can't be preceded by a letter. So these should return false:
b55
ab5
I tried this :
\W\$*b\$*5\W
It works fine: it will match X=($b$5). The problem is that it no longer matches when the 5 is the last character in the line.
You can use either of these patterns:
(?:\W|^)\$*b\$*5(?:\W|$)
(?:\W|^)\$*b\$*5\b
See the RE2 regex demo.
Details
(?:\W|^) - a non-capturing group matching either a non-word char or start of string
\$* - zero or more $ chars
b - a b char
\$* - zero or more $ chars
5 - a 5 char
(?:\W|$) - a non-capturing group matching either a non-word char or end of string or
\b - a word boundary.
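A quick way to check the first pattern against the examples from the question is the Python sketch below. It assumes Python's re module treats this pattern the same way RE2 does, and the function name mentions_b5 is only illustrative.

```python
import re

# First pattern from the answer: a non-word char or string start before,
# a non-word char or string end after.
CELL_PATTERN = re.compile(r"(?:\W|^)\$*b\$*5(?:\W|$)")

def mentions_b5(text):
    """True if text contains b5 (with optional $ signs) as a standalone reference."""
    return CELL_PATTERN.search(text) is not None

for text in ["=b5", "b$5", "= $b$5", "($b5)", "X=($b$5)", "b55", "ab5"]:
    print(repr(text), mentions_b5(text))
```

The ^ and $ alternatives are what allow a match when b5 sits at the very start or end of the input, which the original \W...\W pattern could not handle.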

Regex: Deal \r\n as normal word

I'm working on a small project that counts the functions in C++ files (.cpp).
I used the following Regex as "function pattern":
/[a-z|A-Z]+\s*::\s*~?[a-z|A-Z]+\(.*\)/gm
It works for most cases, but fails when there are new line breaks in ().
void CXYZRScanPanel::OnPrepareScanning()
{
//This one is ok.
}
void CXYZRScanPanel::OnPrepareScanning(int k)
{
//This one is ok.
}
void CXYZRScanPanel::OnPrepareScanning(int k,
int j)
{
//This one fails.
}
I'm wondering whether there is anything "stronger" than .* that can also skip over \r\n.
If there is no such thing, I will probably remove all \r\n within () before doing the search.
Thanks for any help.
You could write the pattern using a negated character class [^()], which matches any char except ( and ), including a newline.
Note that you can omit the | in the character class: inside [...] it is a literal pipe char, not an alternation.
[a-zA-Z]+\s*::\s*~?[a-zA-Z]+(\([^()]*\))
The pattern matches:
[a-zA-Z]+ Match 1+ times chars a-zA-Z
\s*::\s* Match :: between optional whitespace chars
~? Match an optional ~ char
[a-zA-Z]+ Match 1+ times chars a-zA-Z
( Capture group 1
\([^()]*\) Match (, then 0+ times any char except ( and ), then match )
) Close group 1
See a regex demo
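Here is a small Python sketch that counts the matches in the sample from the question, assuming Python's re module handles the pattern the same way as the regex demo. One caveat worth keeping in mind: the pattern is purely textual, so it would also count a qualified call such as Foo::bar(x) inside a function body.

```python
import re

# Pattern from the answer above; [^()]* also crosses line breaks.
FUNCTION_PATTERN = re.compile(r"[a-zA-Z]+\s*::\s*~?[a-zA-Z]+(\([^()]*\))")

source = """void CXYZRScanPanel::OnPrepareScanning()
{
}
void CXYZRScanPanel::OnPrepareScanning(int k)
{
}
void CXYZRScanPanel::OnPrepareScanning(int k,
    int j)
{
}
"""

matches = list(FUNCTION_PATTERN.finditer(source))
print(len(matches))
for match in matches:
    print(repr(match.group(1)))  # the parameter list, including any line break
```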

How to get lines until an empty newline

I want to get each block of lines that starts with a line containing a < or > operator and runs up to an empty line.
I tried this regex: .*[<>][^,\r\n]+?\(.*\S.*,.*\S.*\).*(?:(\n).*)
You can find my example here: https://regex101.com/r/UQYLB5/1/
Expected Result :
MATCH 1 :
BAR18>17M(3,5.2)V
MATCH 2 :
BAR19>1.243037M(3,5.2)V
INFORMATION PROCESS
TAKE B/F: 19V[1]
LIGHT PC CARD:
MATCH 3 :
TEFAL17>1.262259M(4.5,5.5)V
SISS17 : 1789-ID
LIGHT 19/17
MAPPING NICE :
MATCH 4 :
MASCARPONE19>493.818969M(3,5.2)V
BATA17 : CDER78945 -- 1875
LEFT ERREUR - CAME BACK
MATCH 5 :
REPAR_178>748.515487M(4.5,5.5)V
CHAN1 / STEREO MIX
If you don't want to match lines that consist of spaces only, you could match either < or > in the first line and require at least one non whitespace char \S in each of the following lines:
^[^<>\r\n]*[<>].*(?:\r?\n[^\r\n\S]*\S.*)*
The pattern will match:
^ Start of string (or start of a line when the multiline flag is enabled)
[^<>\r\n]* Match 0+ times any char except <, > or a newline
[<>].* Match either < or > and the rest of the line
(?: Non capture group
\r?\n Match a newline
[^\r\n\S]* Match 0+ whitespace chars except newlines
\S.* Match a non whitespace char and the rest of the line
)* Close the group and repeat 0+ times
Regex demo
If the first line should also contain a , after matching < or >:
^[^<>\r\n]*[<>][^\r\n,]*,.*(?:\r?\n[^\r\n\S]*\S.*)*
Regex demo
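Both patterns can be tried in Python with the multiline flag, as in the sketch below. The first lines of the sample are taken from the expected result in the question; the "NOTE A>B" block is a made-up addition to show where the two patterns differ, and the function name find_blocks is hypothetical.

```python
import re

# re.MULTILINE makes ^ match at the start of every line, not just the string.
ANY_OPERATOR = re.compile(r"^[^<>\r\n]*[<>].*(?:\r?\n[^\r\n\S]*\S.*)*", re.MULTILINE)
WITH_COMMA = re.compile(r"^[^<>\r\n]*[<>][^\r\n,]*,.*(?:\r?\n[^\r\n\S]*\S.*)*", re.MULTILINE)

def find_blocks(text, pattern=ANY_OPERATOR):
    """Return each block that starts on a line with < or > and ends before a blank line."""
    return [match.group(0) for match in pattern.finditer(text)]

text = (
    "BAR18>17M(3,5.2)V\n"
    "\n"
    "BAR19>1.243037M(3,5.2)V\n"
    "INFORMATION PROCESS\n"
    "TAKE B/F: 19V[1]\n"
    "\n"
    "NOTE A>B\n"          # hypothetical line: has > but no comma after it
    "no comma here\n"
)

for block in find_blocks(text):
    print(block)
    print("---")
```

With the second pattern, find_blocks(text, WITH_COMMA) skips the block that starts with NOTE A>B, because no comma follows the operator on its first line.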