Modify Go regex so it doesn't pick up the last character - regex

I have this regex, which behaves as shown at this link: https://regex101.com/r/HVKfYU/1
This is my regex string: (\d+[-–]\(?\d+([+\-*/^]\d+ ?[+\-*/^] ?\d+)?\)?)
These are my test strings:
(0–(2^63 - 1))
(1-(2^16 - 2))
(1-29999984)
(3-32)
This is what the regex matches in the first two cases:
0–(2^63 - 1)
1-(2^16 - 2)
// works, it doesn't match the first pair of brackets
And this is what it matches in the last two:
1-29999984)
3-32)
// doesn't work, it matches the closing bracket
I'd like it to not match the last closing bracket in any of the test strings. At the moment I'm stripping the bracket if necessary, but I would like to avoid that. How could I modify the regex, so it works as I would like?

Try (\d+[-–](?:\d+|\(\d+([+\-*/^]\d+[ ]?[+\-*/^][ ]?\d+)?\)))
It matches either plain digits or a parenthesized block after the dash.
An expanded version with comments:
(
\d+ [-–]
(?: # non capture for alternation
\d+ # dd-dd form
| # or
\( \d+ # dd-(dd + dd) form
(
[+\-*/^]
\d+
[ ]?
[+\-*/^]
[ ]?
\d+
)?
\)
)
)
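As a quick check of the suggested pattern, here is a minimal sketch in Python rather than Go. The helper name extract_range is hypothetical. The pattern uses no lookarounds or backreferences, so it should also be accepted by Go's regexp package, though that is an assumption and is not tested here.

```python
import re

# Pattern from the answer above: digits, a hyphen or en dash, then either
# plain digits or a parenthesized expression.
PATTERN = re.compile(r"(\d+[-–](?:\d+|\(\d+([+\-*/^]\d+[ ]?[+\-*/^][ ]?\d+)?\)))")

# The four test strings from the question.
TESTS = ["(0–(2^63 - 1))", "(1-(2^16 - 2))", "(1-29999984)", "(3-32)"]


def extract_range(text):
    """Return the range expression without the outer brackets, or None."""
    m = PATTERN.search(text)
    return m.group(1) if m else None


results = [extract_range(t) for t in TESTS]
for r in results:
    print(r)
```

None of the four results ends with the outer closing bracket, so no stripping is needed afterwards.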

Related

Exclude curly brace matches

I have the following strings:
logger.debug('123', 123)
logger.debug(`123`,123)
logger.debug('1bc','test')
logger.debug('1bc', `test`)
logger.debug('1bc', test)
logger.debug('1bc', {})
logger.debug('1bc',{})
logger.debug('1bc',{test})
logger.debug('1bc',{ test })
logger.debug('1bc',{ test})
logger.debug('1bc',{test })
Instead of debug there can be other calls like warn, fatal etc.
All quote pairs can be "", '' or ``.
I need to create a regular expression which matches cases 1-5 but not 6-11.
That's what I've come up with:
logger.*\(['`].*['`],\s*.([^{.*}])
This also matches 8-11, so I suspect the part ([^{.*}]) is wrong, but I don't understand why.
You can try this
logger\.[^(]+\((?:"(?:\\"|[^"])*"|'(?:\\'|[^'])*'|`(?:\\`|[^`])*`),[^{}]*?\)
P.S. This pattern can be shortened if we are sure the quotes are never mismatched and the strings never contain escaped quotes.
If there are no escaped quotes:
logger\.[^(]+\((?:"[^"]*"|'[^']*'|`[^`]*`),[^{}]*?\)
If there are no quotes inside the strings, i.e. no strings like "mr's jhon
logger\.[^(]+\(([`"'])[^"'`]*\1,[^{}]*?\)
If there are no quotes between the quoted parts, you could make use of a capturing group to match one of the quote types (['`"]) and use a backreference \1 to match the closing quote type.
The \r\n in the negated character class is to not cross newline boundaries.
The pattern will match either the quoted parts or 1+ times a word character for the first part.
The second part matches any char except { or } or ) using a negated character class.
logger\.[^(\r\n]+\((?:(['`"])[^'`"]+\1|\w+),[^{})\r\n]+\)
That will match
logger\. Match logger.
[^(\r\n]+ Match 1+ times any char except ( or a newline
\( Match (
(?: Non capture group
(['`"]) Capture group 1
[^'`"]+\1 Match 1+ times any char except the quote types, then a backreference to the quote captured in group 1
| or
\w+ Match 1+ word chars
), Close non capture group and match ,
[^{})\r\n]+ Match 1+ times any char except { } ) or a newline
\) Match )
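To see how the backreference pattern behaves on the eleven lines from the question, here is a small sketch using Python's re module. The names SAMPLES and matched are only for illustration.

```python
import re

# Backreference-based pattern from the answer above.
PATTERN = re.compile(r"""logger\.[^(\r\n]+\((?:(['`"])[^'`"]+\1|\w+),[^{})\r\n]+\)""")

# The eleven sample lines from the question, in order.
SAMPLES = """\
logger.debug('123', 123)
logger.debug(`123`,123)
logger.debug('1bc','test')
logger.debug('1bc', `test`)
logger.debug('1bc', test)
logger.debug('1bc', {})
logger.debug('1bc',{})
logger.debug('1bc',{test})
logger.debug('1bc',{ test })
logger.debug('1bc',{ test})
logger.debug('1bc',{test })
""".splitlines()

# True where the pattern finds a match on that line.
matched = [bool(PATTERN.search(line)) for line in SAMPLES]
for number, (line, ok) in enumerate(zip(SAMPLES, matched), start=1):
    print(number, "match" if ok else "no match", line)
```

Cases 1 to 5 match and cases 6 to 11 do not, because the second argument may not contain a curly brace.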

Match parenthesis that doesn't contain digit + % only

I'm struggling with this one. I want to capture the content of parentheses unless it consists only of digits and an optional %. This means I want to capture (essiccato, ricco di flavonoidi) or (ricco di 23% pollo, in parte essiccato, in parte idrolizzato) but not (23 %), (23) or (23 %).
Here is an example: https://regex101.com/r/yW4aZ3/896
So far I have: \([^()][^()]*\)
You may use
r'\((?!\s*\d+(?:[.,]\d+)?\s*)[^()]+\)'
Details
\( - a ( char
(?!\s*\d+(?:[.,]\d+)?\s*) - a negative lookahead that matches a location not immediately followed with
\s* - 0+ whitespaces
\d+ - 1+ digits
(?:[.,]\d+)? - an optional occurrence of . or , and 1+ digits
\s* - 0+ whitespaces
[^()]+ - 1+ chars other than ( and )
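\) - a ) char.
Note that this lookahead is not anchored at the closing parenthesis, so as written it also rejects any group that merely starts with a number, e.g. (23% pollo). Appending %?\s*\) inside the lookahead restricts it to groups that contain only a number and an optional %.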
You might use a negative lookahead to assert that what follows the opening parenthesis is not just digits followed by an optional percent sign:
\((?!\s*\d+\s*%?\s*\))[^)]+\)
Explanation
\( Match (
(?! Negative lookahead, assert what is on the right is not
\s*\d+\s*%?\s*\) Match optional whitespace, 1+ digits, an optional %, then the closing )
) Close lookahead
[^)]+\) Match 1+ times any char except ), then match )
Assuming that (...) are all balanced and there is no escaping of parentheses inside, you may use this regex with a character class and 2 negated character classes:
\([\d%]*[^%\d()][^()]*\)
RegEx Details
\(: Match opening (
[\d%]*: Match 0 or more characters that are either a digit or %
[^%\d()]: Match one character that is not (, ), % or a digit
[^()]*: Match 0 or more characters that are neither ( nor )
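\): Match closing )
Note that a space counts as a character that is neither a digit nor %, so (23 %) with a space is still matched by this pattern. If that form must be excluded, one option is to add \s to both classes: \([\d%\s]*[^%\d()\s][^()]*\)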
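The anchored lookahead variant, \((?!\s*\d+\s*%?\s*\))[^)]+\), can be tried out with Python's re module as in the sketch below. The text outside the parentheses in the sample string is made up for illustration; only the parenthesized groups come from the question.

```python
import re

# Lookahead-based pattern from the second answer above.
PATTERN = re.compile(r"\((?!\s*\d+\s*%?\s*\))[^)]+\)")

# Hypothetical sample line; the parenthesized groups are taken from the question.
text = (
    "pollo (essiccato, ricco di flavonoidi), riso (23 %), mais (23), "
    "carne (ricco di 23% pollo, in parte essiccato, in parte idrolizzato)"
)

# findall returns whole matches because the pattern has no capturing group.
groups = PATTERN.findall(text)
for g in groups:
    print(g)
```

Only the two descriptive groups are returned; (23 %) and (23) are rejected by the lookahead.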

Search / and replace it with ; in xml tag with sublime text 3

I am working on an .xml file with this tag
<Categories><![CDATA[Test/Test1-Test2-Test3|Test4/Test5-Test6|Test7/Test8]]></Categories>
and I am trying to replace / with ; by using regular expressions in Sublime Text 3.
The output should be
<Categories><![CDATA[Test;Test1-Test2-Test3|Test4;Test5-Test6|Test7;Test8]]></Categories>
When I use (<Categories>\S+)\/(.+</Categories>) it matches the whole line, and of course if I use \/ it matches every / in the .xml file.
Could you please help?
For your example string, you could use \G to assert the position at the end of the previous match, and \K to forget what has been matched so far, and then match a forward slash.
In the replacement use a ;
Use a positive lookahead to assert what is on the right is ]]></Categories>
(?:<Categories><!\[CDATA\[|\G(?!^))[^/]*\K/(?=[^][]*]]></Categories>)
Explanation
(?: Non capturing group
<Categories><!\[CDATA\[ Match <Categories><![CDATA[
| Or
\G(?!^) Assert the position at the end of the previous match, not at the start
) Close non capturing group
[^/]* Match 0+ times not / using a negated character class
\K/ Forget what was matched, then match /
(?= Positive lookahead, assert what is on the right is
[^][]*]]></Categories> Match 0+ times not [ or ], then match ]]></Categories>
) Close positive lookahead
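Outside Sublime Text, the same replacement can be scripted. Python's re module supports neither \G nor \K, so the sketch below swaps in a different technique: it matches the whole CDATA payload and rewrites it in a callback. The function name replace_slashes is hypothetical, and unlike the lookahead above, this version takes everything up to the first ]]></Categories> as the payload.

```python
import re

# Group 1: opening tag and CDATA start, group 2: payload, group 3: closing part.
CATEGORIES = re.compile(
    r"(<Categories><!\[CDATA\[)(.*?)(\]\]></Categories>)", re.DOTALL
)


def replace_slashes(xml):
    """Replace '/' with ';' only inside <Categories> CDATA payloads."""
    return CATEGORIES.sub(
        lambda m: m.group(1) + m.group(2).replace("/", ";") + m.group(3),
        xml,
    )


line = (
    "<Categories><![CDATA[Test/Test1-Test2-Test3|Test4/Test5-Test6|Test7/Test8]]>"
    "</Categories>"
)
result = replace_slashes(line)
print(result)
```

Slashes outside the payload, including the one in the closing tag, are left untouched.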

match character not enclosed by braces recursively

I'm trying to split a string on pipes when they are not enclosed by parentheses.
I've got a regex that works unless the parentheses are nested:
~\([^)]*\)(*SKIP)(*F)|\|~
test(test(test|tester)|test)|test
Both the pipe after tester) and the last pipe are matched; only the last one should match.
You may use the following regex based on a subroutine:
(\((?:[^()]++|(?1))*\))(*SKIP)(*F)|\|
Details
(\((?:[^()]++|(?1))*\)) - Group 1 that matches
\( - a (
(?:[^()]++|(?1))* - 0 or more occurrences of:
[^()]++ - any 1+ chars other than ( and )
| - or
(?1) - the whole Group 1 pattern is recursed (note that (?R) would not work here since it would recurse the whole regex pattern)
\) - a ) char
(*SKIP)(*F) - PCRE verb sequence that omits the currently matched text and makes the regex engine search for the next match beginning from the end of the current match
| - or
\| - a literal |
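The subroutine call and the (*SKIP)(*F) verbs are PCRE features that Python's built-in re module does not provide. Where such an engine is not available, the same split can be done without a regex by tracking the nesting depth. This is a swapped-in technique, not the answer's method, and split_top_level is a hypothetical helper name.

```python
def split_top_level(text, sep="|"):
    """Split text on sep, ignoring separators nested inside parentheses."""
    parts, current, depth = [], [], 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(depth - 1, 0)  # tolerate a stray ')'
        if ch == sep and depth == 0:
            # Separator outside all parentheses: close the current part.
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


result = split_top_level("test(test(test|tester)|test)|test")
print(result)
```

Only the last pipe in the sample string sits at depth 0, so the string is split into two parts.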

Find an item in the text with exceptions[Regular Expression]

Please help me create a regular expression that matches the "|" character everywhere except inside parentheses.
example|example (example(example))|example|example|example(example|example|example(example|example))|example
The selection should then contain the 5 "|" characters that lie outside the parentheses. Note that the contents within the parentheses should remain unchanged, including the "|" characters within them.
Considering you want to match pipes that are outside any set of parentheses, with nested sets, here's the pattern to achieve what you want:
Regex:
(?x) # Allow comments in regex (ignore whitespace)
(?: # Repeat *
[^(|)]*+ # Match every char except ( ) or |
( # 1. Group 1
\( # Opening paren
(?: # chars inside:
[^()]++ # a. everything inside parens except nested parens
| # or
(?1) # b. nested parens (recurse group 1)
)*+ # repeated any number of times
\) # Until closing paren.
)?+ # (end of group 1)
)*+ #
\K # Keep text out of match
\| # Match a pipe
One-liner:
(?:[^(|)]*+(\((?:[^()]++|(?1))*+\))?+)*+\K\|
This pattern uses some advanced features:
Possessive quantifiers
Recursion
Resetting the match start
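Possessive quantifiers, recursion and \K all need a PCRE-style engine. To cross-check which pipes such a pattern is expected to select, here is a plain depth-counting sketch in Python. It is an alternative technique, not a translation of the regex, and top_level_pipe_positions is a hypothetical name.

```python
def top_level_pipe_positions(text):
    """Return the indices of every '|' that is outside all parentheses."""
    positions, depth = [], 0
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "|" and depth == 0:
            positions.append(i)
    return positions


# Example string from the question.
sample = (
    "example|example (example(example))|example|example|"
    "example(example|example|example(example|example))|example"
)
positions = top_level_pipe_positions(sample)
print(len(positions), positions)
```

For the example string this finds the 5 pipes outside the parentheses and skips the ones nested inside them.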