I'm looking for a regex that finds prices in a comment line. The prices can be formatted differently, depending on the person who entered them.
REG / SZ / 236,30 SUMMER
should match 236.30 (rather easy)
WB / SZ / 187.75 EBS
should match 187.75 (could have done that by myself so far)
here are the tricky ones
FS / EBS / 1*145.80 + 231.30
FS / EBS / 1x 145,80 + 231
FS / EBS / 3x 145.80 + 4x231
FS / EBS / 3* 145.80 + 4x231
First should match 145.80 and 231.30
Second should match 145.80 and 231.00
Third should match 145.80 and 231 and possibly "4x" and "3x"
Fourth as third with * AND x
Is there any way to do that with a regex?
//EDIT (clarification)
I want to have a total sum in the end. So third and fourth case would be (3*145.80) + (4*231). Second case is intentionally 145,80 instead of 145.80.
What I got so far
(([0-9])*?\.([0-9])*)|(([0-9])*?\,([0-9])*)
Which will give me 236,30, 187.75, 145.80, 145,80
from re import findall

examples = ('REG / SZ / 236,30 SUMMER',
            'WB / SZ / 187.75 EBS',
            'FS / EBS / 1*145.80 + 231.30',
            'FS / EBS / 1x 145,80 + 231',
            'FS / EBS / 3x 145.80 + 4x231',
            'FS / EBS / 3* 145.80 + 4x231')

for line in examples:
    # Each match is (optional multiplier, value) and must follow a '/' or '+'.
    numbers = findall(r'[/+]\s*(?:(\d+[.,]?\d*)[*x ]\s*)?(\d+[.,]?\d*)', line)
    result = 0.0
    for multiplier, value in numbers:
        # Normalise the decimal comma before converting to float.
        amount = float(value.replace(',', '.'))
        if not multiplier:
            result += amount
        else:
            result += float(multiplier.replace(',', '.')) * amount
    print('%s\nAfter regex: %s\nResult: %.2f\n' % (line, numbers, result))
Produces the result:
REG / SZ / 236,30 SUMMER
After regex: [('', '236,30')]
Result: 236.30
WB / SZ / 187.75 EBS
After regex: [('', '187.75')]
Result: 187.75
FS / EBS / 1*145.80 + 231.30
After regex: [('1', '145.80'), ('', '231.30')]
Result: 377.10
FS / EBS / 1x 145,80 + 231
After regex: [('1', '145,80'), ('', '231')]
Result: 376.80
FS / EBS / 3x 145.80 + 4x231
After regex: [('3', '145.80'), ('4', '231')]
Result: 1361.40
FS / EBS / 3* 145.80 + 4x231
After regex: [('3', '145.80'), ('4', '231')]
Result: 1361.40
Assuming an input such as 1*[VALUE] is fine by you, I believe the following will catch all numeric statements:
(\d[x*] ?)?\d+([.,]\d+)?(?![*x])
Here's the breakdown:
(\d[x*] ?)?
Catches optional multipliers
\d+([.,]\d+)?
Requires the numeric value, with an optional decimal value
(?![*x])
Is a negative lookahead that keeps a standalone multiplier from being accepted as a value (e.g. it stops 1x from being matched as the value 1).
Hope I didn't miss anything.
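To get the total sum the question asks for, the pattern can be given capturing groups. The following is only a sketch: the regrouped pattern, the name PRICE and the helper total are assumptions made for illustration, not part of the answer above. Like the original pattern, it only recognises single-digit multipliers.

```python
import re

# Variant of the pattern above with capturing groups added (an assumption,
# so that the multiplier and the value can be read separately).
PRICE = re.compile(r'(?:(\d)[x*] ?)?(\d+(?:[.,]\d+)?)(?![*x])')

def total(line):
    """Sum all prices in a comment line, applying optional multipliers."""
    result = 0.0
    for multiplier, value in PRICE.findall(line):
        amount = float(value.replace(',', '.'))  # decimal comma -> dot
        result += amount * (int(multiplier) if multiplier else 1)
    return result

for line in ('FS / EBS / 1x 145,80 + 231', 'FS / EBS / 3* 145.80 + 4x231'):
    print('%s -> %.2f' % (line, total(line)))
```

For the two sample lines this prints totals of 376.80 and 1361.40.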
Why not simplify it and have a regex capture the part after the last /?
(?<=\/)\s[\dx\*\+\.\,\s]+
It would give all the numeric parts and then you have to evaluate the expression.
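The evaluation step could look roughly like this. It is a sketch under assumptions: the names TAIL and evaluate are made up here, the captured tail is assumed to be a well-formed sum of products, and x is treated as a multiplication sign.

```python
import re

# Pattern from the answer above, minus the redundant escapes.
TAIL = re.compile(r'(?<=/)\s[\dx*+.,\s]+')

def evaluate(line):
    """Evaluate the numeric tail of a comment line as a sum of products."""
    match = TAIL.search(line)
    if match is None:
        return None
    # Normalise: decimal comma -> dot, 'x' multiplier -> '*'.
    expression = match.group(0).replace(',', '.').replace('x', '*')
    total = 0.0
    for term in expression.split('+'):
        product = 1.0
        for factor in term.split('*'):
            product *= float(factor)  # float() ignores surrounding spaces
        total += product
    return total

print('%.2f' % evaluate('FS / EBS / 3* 145.80 + 4x231'))
```

Splitting on + and * by hand avoids calling eval on text that users typed into a comment field.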
I am converting some code of mine from MATLAB to Julia, so I need to replace the parentheses used for indexing: MATLAB uses () and Julia uses []. Function-call parentheses are the same in both, i.e. ().
I thought the fastest way to do this was to use Notepad++: find all of the parentheses and replace them with brackets where needed.
However, it does not work as expected.
I won't copy all of the function I am converting now, but some parts as example:
x= coord(:,1);
y= coord(:,2);
natG_coord(1,1)= sqrt(1/3);
natG_coord(2,1)= -sqrt(1/3);
natG_coord(3,1)= -sqrt(1/3);
natG_coord(4,1)= sqrt(1/3);
for i=1:4
dNG(1,i)= (1+etaG(i))/4 + csiG(i)*(1+etaG(i))/2 - (1-etaG(i)^2)/4 - 2*csiG(i)*(1-etaG(i)^2)/4;
dNG(2,i)= -(1+etaG(i))/4 + csiG(i)*(1+etaG(i))/2 + (1-etaG(i)^2)/4 - 2*csiG(i)*(1-etaG(i)^2)/4;
dNG(3,i)= -(1-etaG(i))/4 + csiG(i)*(1-etaG(i))/2 + (1-etaG(i)^2)/4 - 2*csiG(i)*(1-etaG(i)^2)/4;
dNG(4,i)= (1-etaG(i))/4 + csiG(i)*(1-etaG(i))/2 - (1-etaG(i)^2)/4 - 2*csiG(i)*(1-etaG(i)^2)/4;
end
I tried finding \((.*)\) and replacing with [$1], but it does not get all of the parentheses. For instance, it gets the ones in the declarations of x and y and the sqrt argument, but it does not get the natG_coord indexes. In the for loop, it only gets the last expression of each line, i.e. (1-etaG(i)^2), and there it takes the outer parentheses, not the etaG index (which is actually what I need to replace).
I cannot see a pattern in the choice and thus cannot come up with a solution.
Other solutions that save me from going mad doing this parenthesis by parenthesis are welcome!
Thank you all for your help.
Edit
@stribizhev: the final result should be this:
x= coord[:,1]
y= coord[:,2]
natG_coord[1,1]= sqrt(1/3)
natG_coord[2,1]= -sqrt(1/3)
natG_coord[3,1]= -sqrt(1/3)
natG_coord[4,1]= sqrt(1/3)
for i=1:4
dNG[1,i]= (1+etaG[i])/4 + csiG[i]*(1+etaG[i])/2 - (1-etaG[i]^2)/4 - 2*csiG[i]*(1-etaG[i]^2)/4
dNG[2,i]= -(1+etaG[i])/4 + csiG[i]*(1+etaG[i])/2 + (1-etaG[i]^2)/4 - 2*csiG[i]*(1-etaG[i]^2)/4
dNG[3,i]= -(1-etaG[i])/4 + csiG[i]*(1-etaG[i])/2 + (1-etaG[i]^2)/4 - 2*csiG[i]*(1-etaG[i]^2)/4
dNG[4,i]= (1-etaG[i])/4 + csiG[i]*(1-etaG[i])/2 - (1-etaG[i]^2)/4 - 2*csiG[i]*(1-etaG[i]^2)/4
end
What I get finding \((.*)\) and replacing with [$1] one time is:
x= coord[:,1];
y= coord[:,2];
natG_coord[1,1)= sqrt(1/3];
natG_coord[2,1)= -sqrt(1/3];
natG_coord[3,1)= -sqrt(1/3];
natG_coord[4,1)= sqrt(1/3];
for i=1:4
dNG[1,i)= (1+etaG(i))/4 + csiG(i)*(1+etaG(i))/2 - (1-etaG(i)^2)/4 - 2*csiG(i)*(1-etaG(i)^2]/4;
dNG[2,i)= -(1+etaG(i))/4 + csiG(i)*(1+etaG(i))/2 + (1-etaG(i)^2)/4 - 2*csiG(i)*(1-etaG(i)^2]/4;
dNG[3,i)= -(1-etaG(i))/4 + csiG(i)*(1-etaG(i))/2 + (1-etaG(i)^2)/4 - 2*csiG(i)*(1-etaG(i)^2]/4;
dNG[4,i)= (1-etaG(i))/4 + csiG(i)*(1-etaG(i))/2 - (1-etaG(i)^2)/4 - 2*csiG(i)*(1-etaG(i)^2]/4;
end
What I get finding \(((?>[^()]|(?R))*)\) and replacing all with [$1] one time is (I know you said to run it several times; if I do, it ends up replacing every matching pair of parentheses):
x= coord[:,1];
y= coord[:,2];
natG_coord[1,1]= sqrt[1/3];
natG_coord[2,1]= -sqrt[1/3];
natG_coord[3,1]= -sqrt[1/3];
natG_coord[4,1]= sqrt[1/3];
for i=1:4
dNG[1,i]= [1+etaG(i)]/4 + csiG[i]*[1+etaG(i)]/2 - [1-etaG(i)^2]/4 - 2*csiG[i]*[1-etaG(i)^2]/4;
dNG[2,i]= -[1+etaG(i)]/4 + csiG[i]*[1+etaG(i)]/2 + [1-etaG(i)^2]/4 - 2*csiG[i]*[1-etaG(i)^2]/4;
dNG[3,i]= -[1-etaG(i)]/4 + csiG[i]*[1-etaG(i)]/2 + [1-etaG(i)^2]/4 - 2*csiG[i]*[1-etaG(i)^2]/4;
dNG[4,i]= [1-etaG(i)]/4 + csiG[i]*[1-etaG(i)]/2 - [1-etaG(i)^2]/4 - 2*csiG[i]*[1-etaG(i)^2]/4;
end
What I get finding \(([^()]*)\) and replacing all with [$1] one time is:
x= coord[:,1];
y= coord[:,2];
natG_coord[1,1]= sqrt[1/3];
natG_coord[2,1]= -sqrt[1/3];
natG_coord[3,1]= -sqrt[1/3];
natG_coord[4,1]= sqrt[1/3];
for i=1:4
dNG[1,i]= (1+etaG[i])/4 + csiG[i]*(1+etaG[i])/2 - (1-etaG[i]^2)/4 - 2*csiG[i]*(1-etaG[i]^2)/4;
dNG[2,i]= -(1+etaG[i])/4 + csiG[i]*(1+etaG[i])/2 + (1-etaG[i]^2)/4 - 2*csiG[i]*(1-etaG[i]^2)/4;
dNG[3,i]= -(1-etaG[i])/4 + csiG[i]*(1-etaG[i])/2 + (1-etaG[i]^2)/4 - 2*csiG[i]*(1-etaG[i]^2)/4;
dNG[4,i]= (1-etaG[i])/4 + csiG[i]*(1-etaG[i])/2 - (1-etaG[i]^2)/4 - 2*csiG[i]*(1-etaG[i]^2)/4;
end
So the last one is exactly what I was looking for. If I step through the matches with the "find next" command, I can decide whether each pair is indexing parentheses or not and substitute accordingly (skipping the sqrt function argument, for instance).
Thank you very much for your help.
Since the \(([^()]*)\) (to replace with [$1]) worked for you, here is the explanation:
\(([^()]*)\)
Matches:
\( - an opening round bracket
([^()]*) - Capture group 1 matches zero or more characters other than ( and ) (with [^()]*)
\) - a closing round bracket
This regex matches all innermost parentheses, i.e. pairs that do not have any parentheses inside them.
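The same innermost-only behaviour can be reproduced outside Notepad++. A minimal sketch with Python's re module (the names INNERMOST and bracket_innermost are made up for illustration):

```python
import re

# Matches a pair of parentheses with no parentheses inside it.
INNERMOST = re.compile(r'\(([^()]*)\)')

def bracket_innermost(line):
    """Replace every innermost (...) with [...] in a single pass."""
    return INNERMOST.sub(r'[\1]', line)

print(bracket_innermost('dNG(1,i)= (1+etaG(i))/4'))
print(bracket_innermost('natG_coord(1,1)= sqrt(1/3);'))
```

The first line keeps its grouping parentheses and only the indexes change; the second also turns sqrt(1/3) into sqrt[1/3], which is why such matches still need a manual check.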
Answering Aaron's remark about replacing the parentheses inside the quoted strings, it is great that Notepad++ supports Boost conditional replacement patterns. We can match what we do not need to modify and replace with self, and use another replacement for the other matches.
(?<o1>"[^"\\]*(?:\\.[^"\\]*)*")|(?<o2>\(([^()]*)\))
And replace with (?{o1}$+{o1}:[$3]).
Note that "[^"\\]*(?:\\.[^"\\]*)*" matches C strings with escaped entities correctly and efficiently. The replacement pattern means: replace with the quoted string itself (if the o1 group matched), or with the Group 3 value wrapped in square brackets (if the other group matched).
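Python's re module has no conditional replacement syntax, but the same "match what you want to skip, then put it back" idea can be sketched with a replacement function. The names below are assumptions for illustration, and numbered groups are used instead of the named groups o1 and o2.

```python
import re

# Group 1: a double-quoted string (kept as is).
# Group 2: the contents of an innermost pair of parentheses.
STRING_OR_PARENS = re.compile(r'("[^"\\]*(?:\\.[^"\\]*)*")|\(([^()]*)\)')

def bracket_outside_strings(line):
    """Turn innermost (...) into [...] except inside quoted strings."""
    def repl(match):
        if match.group(1) is not None:
            return match.group(1)          # quoted string: put it back
        return '[' + match.group(2) + ']'  # parentheses: swap for brackets
    return STRING_OR_PARENS.sub(repl, line)

print(bracket_outside_strings('d = "r(string here)"; y = coord(:,2)'))
```

Because the string alternative comes first and the scan runs left to right, parentheses inside a string are consumed as part of the string before the second alternative can see them.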
If you need to replace outer balanced parentheses, use
\(((?>[^()]|(?R))*)\)
And replace with [$1] (see demo). If you need to replace the overlapping parenthetical substrings, you will need to hit Replace All several times.
Regex explanation:
\( # an outer literal opening round bracket
( # start group 1
(?> # start of atomic group
[^()] # any character other than ( and )
| # OR
(?R) # recursively match the whole pattern
)* # end atomic group and repeat zero or more times
) # end of group 1
\) # match a literal closing round bracket
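Python's built-in re module does not support (?R), so the sketch below swaps in a different technique: a plain depth counter that rewrites only the outermost pair of each balanced group. The function name is made up, and the input is assumed to have balanced parentheses.

```python
def replace_outer_parens(text):
    """Replace the outermost ( ) of each balanced group with [ ]."""
    out = []
    depth = 0
    for ch in text:
        if ch == '(':
            # Only a parenthesis opened at depth 0 is an outer one.
            out.append('[' if depth == 0 else ch)
            depth += 1
        elif ch == ')' and depth > 0:
            depth -= 1
            out.append(']' if depth == 0 else ch)
        else:
            out.append(ch)
    return ''.join(out)

print(replace_outer_parens('dNG(1,i)= (1+etaG(i))/4'))
```

This corresponds to one Replace All pass with the recursive pattern: nested pairs such as etaG(i) are left untouched.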
If the parentheses you need to replace must be preceded by word characters, use
(\w+)(\(((?>[^()]|(?2))*)\))
And replace with $1[$3]. See demo
This regex uses a (?2) subroutine that just repeats the second capture group subpattern.
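Requiring a preceding name also makes it possible to skip real function calls such as sqrt. The following is a rough sketch under stated assumptions: FUNCTIONS is a hypothetical, hand-maintained list of function names, and only pairs without nested parentheses are matched, with the substitution repeated until nothing changes.

```python
import re

# Hypothetical list of names that are functions, not arrays.
FUNCTIONS = {'sqrt', 'sin', 'cos'}

# An identifier directly followed by an innermost pair of parentheses.
CALL = re.compile(r'\b([A-Za-z_]\w*)\(([^()]*)\)')

def index_to_brackets(line, functions=FUNCTIONS):
    """Turn name(...) into name[...] unless name is a known function."""
    def repl(match):
        name, inner = match.group(1), match.group(2)
        if name in functions:
            return match.group(0)  # function call: leave untouched
        return '%s[%s]' % (name, inner)
    previous = None
    while previous != line:        # repeat so outer indexes are reached
        previous = line
        line = CALL.sub(repl, line)
    return line

print(index_to_brackets('natG_coord(1,1)= sqrt(1/3);'))
print(index_to_brackets('dNG(1,i)= (1+etaG(i))/4 + csiG(i)*(1+etaG(i))/2'))
```

Grouping parentheses are never touched because no name precedes them. One limitation: an index that contains a function call, such as a(sqrt(2)), is left unchanged.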
Now, to avoid matching these inside quoted strings: assume we have var d = "r(string here)" and we do not want to turn the () into [] here. Instead of (\w+)(\(((?>[^()]|(?2))*)\)) (with $1[$3] replacement), use
(?<o1>"[^"\\]*(?:\\.[^"\\]*)*")|(?<o2>(\w+)(\(((?>[^()]|(?4))*)\)))
And (?{o1}$+{o1}:$3[$5]) as the replacement. This will keep var d = "r(string here)" string intact, and will turn var f = a(fg()g) into var f = a[fg()g].
I want to match expressions of the form var * var, var * num, num * var and num * num separately, i.e. using four different regular expressions.
My var could be s1,s2,...,S1,S2,...,v1,v2,...V1,V2....
My num could be any float number.
for var*var, I use:
[vVsS][0-9]+\s*[*/]\s*[vVsS][0-9]+
and it works well
for var*num and num*var, I use:
[vVsS][0-9]+\s*[*/]\s*[0-9]+[.]?[0-9]*
and
[0-9]+[.]?[0-9]*\s*[*/]\s*(vVsS)[0-9]+
but it returns nothing when I try the input:
2*4 + s1* 7 + v3 * 2 + s3 * V2 + 5*v1
UPDATE: I got it working.
For example, for the case of var * num
[vVsS][0-9]+\s*[*/]\s*[0-9]+(?:[.][0-9]+)? works well, as Wiktor Stribiżew suggests in a comment.
But I could not find an explanation of (?:) online. Does anyone have an idea about that?
You may use
[vVsS][0-9]+\s*[*/]\s*[0-9]+(?:[.][0-9]+)?
The pattern matches:
[vVsS][0-9]+ - a letter from the character class (either v, V, s or S) followed with one or more digits
\s*[*/]\s* - a / or * enclosed with zero or more whitespaces
[0-9]+ - one or more digits
(?:[.][0-9]+)? - an optional non-capturing group matching a dot and one or more digits.
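The practical difference between (?:...) and (...) shows up as soon as a function returns groups. A small Python sketch on the input from the question (the variable names are made up for illustration):

```python
import re

text = '2*4 + s1* 7 + v3 * 2 + s3 * V2 + 5*v1'

# (?:...) only groups, so findall returns the whole match.
non_capturing = r'[vVsS][0-9]+\s*[*/]\s*[0-9]+(?:[.][0-9]+)?'
# (...) captures, so findall returns the group content instead.
capturing = r'[vVsS][0-9]+\s*[*/]\s*[0-9]+([.][0-9]+)?'

var_times_num = re.findall(non_capturing, text)
fraction_groups = re.findall(capturing, text)

print(var_times_num)    # ['s1* 7', 'v3 * 2']
print(fraction_groups)  # ['', '']
```

Both patterns match the same two substrings; with the capturing version, findall reports only the (empty) fractional part.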
I have a group of variable var:
> var
[1] "a1" "a2" "a3" "a4"
Here is what I want to achieve: use a regex to change strings such as this:
3*a1 + a1*a2 + 4*a3*a4 + a1*a3
to
3a1 + a1*a2 + 4a3*a4 + a1*a3
Basically, I want to remove any "*" that is not between two values in var. Thank you in advance.
You can find (?<![\da-z])(\d+)\* and replace with $1
(?<! [\da-z] )
( \d+ ) # (1)
\*
Or, ((?:[^\da-z]|^)\d+)\* for engines that do not support lookbehind assertions
( # (1 start)
(?: [^\da-z] | ^ )
\d+
) # (1 end)
\*
Leading assertions are bad anyways.
Benchmark
Regex1: (?<![\da-z])(\d+)\*
Options: < none >
Completed iterations: 100 / 100 ( x 1000 )
Matches found per iteration: 2
Elapsed Time: 1.09 s, 1087.84 ms, 1087844 µs
Regex2: ((?:[^\da-z]|^)\d+)\*
Options: < none >
Completed iterations: 100 / 100 ( x 1000 )
Matches found per iteration: 2
Elapsed Time: 0.77 s, 767.04 ms, 767042 µs
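Both patterns should give the same replacement on the sample string. A quick Python check, as a sketch (Python writes the $1 replacement as \1; timings are not reproduced here since they depend on the engine):

```python
import re

s = '3*a1 + a1*a2 + 4*a3*a4 + a1*a3'

# Regex1: lookbehind keeps the preceding character out of the match.
with_lookbehind = re.sub(r'(?<![\da-z])(\d+)\*', r'\1', s)
# Regex2: the preceding character is consumed and restored via group 1.
without_lookbehind = re.sub(r'((?:[^\da-z]|^)\d+)\*', r'\1', s)

print(with_lookbehind)
print(without_lookbehind)
```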
You can create a dynamic regex out of the var to match and capture *s that are inside your variables, and reinsert them back with a backreference in gsub, and remove all other asterisks:
var <- c("a1","a2","a3","a4")
s = "3*a1 + a1*a2 + 4*a3*a4 + a1*a3"
block = paste(var, collapse="|")
pat = paste0("\\b((?:", block, ")\\*)(?=\\b(?:", block, ")\\b)|\\*")
gsub(pat, "\\1", s, perl=T)
## "3a1 + a1*a2 + 4a3*a4 + a1*a3"
See the IDEONE demo
Here is the regex:
\b((?:a1|a2|a3|a4)\*)(?=\b(?:a1|a2|a3|a4)\b)|\*
Details:
\b - leading word boundary
((?:a1|a2|a3|a4)\*) - Group 1 matching
(?:a1|a2|a3|a4) - either one of your variables
\* - asterisk
(?=\b(?:a1|a2|a3|a4)\b) - a lookahead check that there must be one of your variables (otherwise, no match is returned, the * is matched with the second branch of the alternation)
| - or
\* - a "wild" literal asterisk to be removed.
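For comparison, the same dynamic pattern can be built in Python. This is a sketch rather than a translation of the R call: the variable names are made up, re.escape is added as a precaution for names containing special characters, and a replacement function returns an empty string when Group 1 did not take part in the match.

```python
import re

variables = ['a1', 'a2', 'a3', 'a4']
s = '3*a1 + a1*a2 + 4*a3*a4 + a1*a3'

# Build the alternation a1|a2|a3|a4 from the variable list.
block = '|'.join(re.escape(v) for v in variables)
pattern = re.compile(r'\b((?:%s)\*)(?=\b(?:%s)\b)|\*' % (block, block))

# Keep "var*" when another variable follows; drop every other asterisk.
result = pattern.sub(lambda m: m.group(1) or '', s)
print(result)
```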
Taking the equation as a string, one option is
gsub('((?:^| )\\d)\\*(\\w)', '\\1\\2', '3*a1 + a1*a2 + 4*a3*a4 + a1*a3')
# [1] "3a1 + a1*a2 + 4a3*a4 + a1*a3"
which looks for
a captured group of characters, ( ... )
containing a non-capturing group, (?: ... )
containing the beginning of the line ^
or, |
a space (or \\s)
followed by a digit 0-9, \\d.
The capturing group is followed by an asterisk, \\*,
followed by another capturing group ( ... )
containing an alphanumeric character \\w.
It replaces the above with
the first captured group, \\1,
followed by the second captured group, \\2.
Adjust as necessary.
Thanks to @alistaire for offering a solution with a non-capturing group. However, that solution relies on there being a space between the coefficient and the "+" in front of it. Here's my modified solution based on his suggestion:
> ss <- "3*a1 + a1*a2+4*a3*a4 +2*a1*a3+ 4*a2*a3"
# my modified version
> gsub('((?:^|\\s|\\+|\\-)\\d)\\*(\\w)', '\\1\\2', ss)
[1] "3a1 + a1*a2+4a3*a4 +2a1*a3+ 4a2*a3"
# alistaire's
> gsub('((?:^| )\\d)\\*(\\w)', '\\1\\2', ss)
[1] "3a1 + a1*a2+4*a3*a4 +2*a1*a3+ 4a2*a3"
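The difference between the two variants can be checked in Python as well. A minimal sketch (the variable names are made up; the patterns are the ones from the R calls above with single backslashes):

```python
import re

ss = '3*a1 + a1*a2+4*a3*a4 +2*a1*a3+ 4*a2*a3'

# Coefficient must follow the line start or a space.
space_only = re.sub(r'((?:^| )\d)\*(\w)', r'\1\2', ss)
# Coefficient may also follow whitespace, '+' or '-'.
with_signs = re.sub(r'((?:^|\s|\+|-)\d)\*(\w)', r'\1\2', ss)

print(space_only)
print(with_signs)
```

Note that both patterns use a single \d, so the digit directly before the asterisk must itself follow a separator; a multi-digit coefficient such as 12*a1 would need \d+.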
I have this regular expression to test
(\&TRUNC)[\(]{1,}(.+)[\)]{1,}
And I have this test string
((((&TRUNC((1800,000 / 510)) * 510) * 920) + (2 * (510 * 700)) + ((&TRUNC((1800,000 / 510)) - 1) * 2 * 510 * 80)) / 1000000) * 85,715
The value I expect (the contents of my custom command "&TRUNC(command)") is
(1800,000 / 510)
But I get this value
1800,000 / 510)) * 510) * 920) + (2 * (510 * 700)) + ((&TRUNC((1800,000 / 510)) - 1) * 2 * 510 * 80)) / 1000000
How can I get only the expected value in a separate group?
PS: The expression inside the "&TRUNC(command)" command varies.
In your regex
(\&TRUNC)[\(]{1,}(.+)[\)]{1,}
change .+ to the non-greedy .+?
(\&TRUNC)[\(]{1,}(.+?)[\)]{1,}
You can also simplify a bit
&TRUNC\(+(.+?)\)+
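The effect of the ? can be seen by running both versions on the test string. A sketch in Python rather than .NET (the variable names are made up; the lazy quantifier syntax is the same in both engines):

```python
import re

tester = ('((((&TRUNC((1800,000 / 510)) * 510) * 920) + (2 * (510 * 700)) + '
          '((&TRUNC((1800,000 / 510)) - 1) * 2 * 510 * 80)) / 1000000) * 85,715')

# Greedy: .+ runs to the end and backs off only to the last ')'.
greedy = re.search(r'&TRUNC\(+(.+)\)+', tester).group(1)
# Lazy: .+? stops at the first ')' after each &TRUNC.
lazy = re.findall(r'&TRUNC\(+(.+?)\)+', tester)

print(greedy)
print(lazy)
```

The lazy version returns 1800,000 / 510 for each of the two &TRUNC calls, without the surrounding parentheses.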
With sed, you can use a back reference to match the text you are looking for:
[jaypal~/Temp]$ cat input_file
((((&TRUNC((1800,000 / 510)) * 510) * 920) + (2 * (510 * 700)) + ((&TRUNC((1800,000 / 510)) - 1) * 2 * 510 * 80)) / 1000000) * 85,715
[jaypal~/Temp]$ sed 's/.[^(&TRUNC)]*(*\&TRUNC((\(.[^*)]*\)))* \* .*/\1/' input_file
1800,000 / 510
Sorry, I don't know .NET, but how about this one:
([\(]{1}[0-9,/ ]+[\)]{1})
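This last pattern does not look for &TRUNC at all; it matches any parenthesised run of digits, commas, slashes and spaces. A quick Python check on the test string, as a sketch ({1} and the character-class wrappers are dropped since they change nothing):

```python
import re

tester = ('((((&TRUNC((1800,000 / 510)) * 510) * 920) + (2 * (510 * 700)) + '
          '((&TRUNC((1800,000 / 510)) - 1) * 2 * 510 * 80)) / 1000000) * 85,715')

# Equivalent to ([\(]{1}[0-9,/ ]+[\)]{1}) without the redundant parts.
matches = re.findall(r'\([0-9,/ ]+\)', tester)
print(matches)
```

On this input it finds (1800,000 / 510) twice, parentheses included. It only works as long as the expression inside &TRUNC contains no other operators.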