Exclude an escaped character from a range - regex

I need to extract an expression between brackets that can include anything except a non-escaped closing bracket.
For example, applying the regexp to [aaa\]bbbbbb] should give the result: aaa\]bbbbbb.
I tried this: \[([^(?<!\\)\]]*)\] but it fails.
Any hints?

You may use
\[([^\]\[\\]*(?:\\.[^\]\[\\]*)*)]
Or - if there may be any non-escaped [ in-between non-escaped [ and ] (e.g. [a[\[aa\]bbbbbba\[aabbbbbb]), take out the \[:
\[([^\]\\]*(?:\\.[^\]\\]*)*)]
See the regex demo 1 and regex demo 2. It is an unrolled variant of a \[((?:[^][\\]|\\.)*)] regex.
Details:
\[ - a [
([^\]\[\\]*(?:\\.[^\]\[\\]*)*) - Group 1 capturing:
[^\]\[\\]* - zero or more chars other than [, ] and \ (in some regex flavors, you may write it without escapes - [^][\\]*)
(?:\\.[^\]\[\\]*)* - zero or more sequences of:
\\. - any escaped sequence (\ and any char other than line break chars)
[^\]\[\\]* - zero or more chars other than [, ] and \
] - a closing ].
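As a quick check of the two patterns above, here is a minimal sketch using Python's re module (my choice for illustration; the question does not name a regex flavor). The names STRICT, LENIENT and extract are hypothetical helpers, not part of the answer.

```python
import re

# Unrolled pattern from the answer: an unescaped "[" is not allowed inside.
STRICT = re.compile(r'\[([^\]\[\\]*(?:\\.[^\]\[\\]*)*)]')
# Variant with the \[ taken out of the classes: an unescaped "[" is allowed inside.
LENIENT = re.compile(r'\[([^\]\\]*(?:\\.[^\]\\]*)*)]')


def extract(pattern, text):
    """Return Group 1 of the first match, or None when nothing matches."""
    m = pattern.search(text)
    return m.group(1) if m else None


print(extract(STRICT, r'[aaa\]bbbbbb]'))                # aaa\]bbbbbb
print(extract(LENIENT, r'[a[\[aa\]bbbbbba\[aabbbbbb]'))  # a[\[aa\]bbbbbba\[aabbbbbb
```

The character classes are written with every bracket escaped, which Python accepts without warnings; other flavors may allow the shorter [^][\\] form mentioned above.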

This is the simplest regex that (I think) works:
\[(.*?)(?<!\\)\]
which captures the bracketed text as group 1.
See live demo.
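A small sketch of the lookbehind version in Python's re module (an assumed flavor; bracket_content is a hypothetical helper name). It also tries one input the question does not mention, a closing bracket preceded by an escaped backslash, to show where this simpler pattern behaves differently.

```python
import re

# Lazy match up to the first "]" that is not directly preceded by a backslash.
LOOKBEHIND = re.compile(r'\[(.*?)(?<!\\)\]')


def bracket_content(text):
    """Return Group 1 of the first match, or None when nothing matches."""
    m = LOOKBEHIND.search(text)
    return m.group(1) if m else None


print(bracket_content(r'[aaa\]bbbbbb]'))  # aaa\]bbbbbb
print(bracket_content(r'[aaa\\]'))        # None
```

In the second call the "]" follows an escaped backslash (\\), so it arguably closes the bracket, but the lookbehind only sees the single backslash right before it and rejects the match. Whether that case matters depends on the input, which the question does not specify.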

Related

Match all strings preceding string in brackets

I'm trying to retrieve preceding TableNames before brackets:
=IFERROR(INDEX(RepositoriesQ[ContentRepository];MATCH(1&2;RepositoriesQ[Url]&RepositoriesQ[Credentials];0));"something")
I found the way of getting all strings between brackets:
\[(.*?)\]
but what I want is each table name together with the bracketed column name that follows it.
So, as a result, I should get 3 matches here:
RepositoriesQ[ContentRepository]
RepositoriesQ[Url]
RepositoriesQ[Credentials]
You can use
\w+\[[^\][]*]
See the regex demo. Details:
\w+ - one or more word chars
\[ - a [ char
[^\][]* - zero or more chars other than [ and ]
] - a ] char.
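To see the pattern applied to the formula from the question, here is a minimal sketch in Python's re module (the flavor is my assumption; table_references is a hypothetical helper name).

```python
import re

# One or more word chars, then a [...] block with no nested brackets.
TABLE_COLUMN = re.compile(r'\w+\[[^\][]*]')

formula = ('=IFERROR(INDEX(RepositoriesQ[ContentRepository];'
           'MATCH(1&2;RepositoriesQ[Url]&RepositoriesQ[Credentials];0));"something")')


def table_references(text):
    """Return every TableName[Column] occurrence as a list of strings."""
    return TABLE_COLUMN.findall(text)


for ref in table_references(formula):
    print(ref)
```

Since the pattern has no capturing groups, findall returns the full matches, so the list holds the three references in the order they appear.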

Options matching in a command

I'm creating a Discord bot and trying to match some command options, but I have a problem getting the value between the square brackets (if there is one).
I've already tried adding a ? quantifier, but it's not working. I also searched for how to match text between two characters, but found nothing that helped.
Here is the pattern I've got so far : https://regexr.com/4icgi
and here it is in text : /[+|-](.+)(\[(.+)\])?/g
What I expect it to do is from an option like that : +user[someRandomPeople]
to extract the parameter user and the value someRandomPeople and if there is no square brackets, it will only extract the parameter.
You may use
^[+-](.*?)(?:\[(.*?)\])?$
Or, if there should be no square brackets inside the optional [...] substring at the end:
^[+-](.*?)(?:\[([^\][]*)\])?$
Or, if the matches are searched for on different lines:
^[+-](.*?)(?:\[([^\][\r\n]*)\])?$
See the regex demo and the regex graph:
Details
^ - start of string
[+-] - + or - (note that | inside square brackets matches a literal | char)
(.*?) - Group 1: any 0 or more chars other than line break chars as few as possible
(?:\[(.*?)\])? - an optional sequence of
\[ - a [ char
(.*?) - Group 2: any 0 or more chars other than line break chars as few as possible ([^\][]* matches 0 or more chars other than [ and ])
\] - a ] char
$ - end of string.
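The question uses a JavaScript-style regex literal; purely as an illustration, here is the second pattern above in Python's re module. parse_option is a hypothetical helper name, and returning a (parameter, value) tuple is my own choice.

```python
import re

# Group 1: the parameter name; Group 2: the optional value inside [...].
OPTION = re.compile(r'^[+-](.*?)(?:\[([^\][]*)\])?$')


def parse_option(text):
    """Return (parameter, value); value is None when there are no brackets."""
    m = OPTION.match(text)
    if m is None:
        return None
    return m.group(1), m.group(2)


print(parse_option('+user[someRandomPeople]'))  # ('user', 'someRandomPeople')
print(parse_option('-verbose'))                 # ('verbose', None)
```

Because Group 1 is lazy and the pattern is anchored with $, the parameter name stops right before the bracketed part when one is present and runs to the end of the string otherwise.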

Regex in Notepad++ with Negative Lookahead with Question Mark Operator

I'm trying to find a way to replace square brackets into apostrophes for some subtitle files, but only for cases when these square brackets do not contain a whole sentence having the square brackets at the beginning & end of the line.
These lines would have square brackets changed into apostrophes:
[que] vão levar [vocês]
ao [limite].
While these would not:
[Vamos começar]
[com algo simples.]
I came up with the following regex command
(?!^\[.*?\]$)(\[.*?\])
That uses negative lookahead to find lines starting with [ and ending with ], while using the inside question mark character ? as an operator to prevent selection of line with extra square brackets.
Unfortunately, this does not seem to work. What am I doing wrong in here?
You may match whole lines that start with [, end with ] and contain no other [ or ] in between, capturing them into Group 1, and otherwise match any remaining [ or ]. Then replace using a conditional replacement pattern:
Find what: ^(\[[^][\r\n]*\])$|[][]
Replace with: (?1$1:')
Search pattern details:
^ - start of line
(\[[^][\r\n]*\]) - Group 1 capturing a [, then 0 or more characters other than ], [, \r or \n and then ] at the...
$ - end of line
| - or
[][] - a [ or ]
Replacement pattern details:
(?1 - Did the Group 1 match? If yes,
$1 - use the Group 1 contents
: - or
' - a single apostrophe
) - end of the conditional pattern.
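The (?1$1:') replacement syntax is specific to Notepad++ (Boost). Outside the editor, the same idea can be emulated with a replacement function; below is a rough sketch in Python's re module, where replace_inline_brackets and keep_or_quote are hypothetical names and the character classes are written with escaped brackets.

```python
import re

# Alternative 1: a whole line that is a single [...] block (captured in Group 1).
# Alternative 2: any other single "[" or "]".
BRACKETS = re.compile(r"^(\[[^\]\[\r\n]*\])$|[\]\[]", re.MULTILINE)


def replace_inline_brackets(text):
    """Turn inline brackets into apostrophes, keeping fully bracketed lines."""
    def keep_or_quote(match):
        # Mirrors (?1$1:'): keep Group 1 if it matched, else emit an apostrophe.
        return match.group(1) if match.group(1) is not None else "'"
    return BRACKETS.sub(keep_or_quote, text)


sample = "ao [limite].\n[com algo simples.]"
print(replace_inline_brackets(sample))
```

The sketch assumes lines are separated by \n; in Python, $ with re.MULTILINE does not match before a \r, so CRLF files would need the line endings normalized first.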

Regex lookahead/lookbehind match for SQL script

I'm trying to analyse some SQLCMD scripts for code quality tests. I have a regex not working as expected:
^(\s*)USE (\[?)(?<![master|\$])(.)+(\]?)
I'm trying to match:
Strings that start with USE (ignore whitespace)
Followed by optional square bracket
Followed by 1 or more non-whitespace characters.
EXCEPT where that text is "master" (case insensitive)
OR EXCEPT where that text is a $ symbol
Expected results:
USE [master] - don't match
USE [$(CompiledDatabaseName)] - don't match
USE [anything_else.01234] - match
Also, the same patterns above without the [ and ] characters.
I'm using Sublime Text 2 as my RegEx search tool and referencing this cheatsheet
Your pattern - ^(\s*)USE (\[?)(?<![master|\$])(.)+(\]?) - has several problems. The [master|\$] inside the lookbehind is a character class, not a list of alternatives; you meant (...), i.e. either $ or the character sequence master. Once that is fixed, the lookbehind becomes variable-width (its length is not known beforehand) and is thus invalid in a Boost regex. The (.)+ group is wrong since a repeated capturing group only keeps the last character captured (you could use (.+)), and it also matches spaces (while you need 1 or more non-whitespace characters). Finally, ? is the one or zero times quantifier, but you say you might have 2 opening and closing brackets (so, you need a limiting quantifier {0,2}).
You can use
^\h*USE(?!\h*\[{0,2}[^]\s]*(?:\$|(?i:master)))\h*\[{0,2}[^]\s]*]{0,2}
See regex demo
Explanation:
^ - start of a line in Sublime Text
\h* - optional horizontal whitespace (if you need to match newlines, use \s*)
USE - a literal case-sensitive character sequence USE
(?!\h*\[{0,2}[^]\s]*(?:\$|(?i:master))) - a negative lookahead that makes sure the USE is NOT followed with:
\h* - zero or more horizontal whitespace
\[{0,2} - zero, one or two [ brackets
[^]\s]* - zero or more characters other than ] and whitespace
(?:\$|(?i:master)) - either a $ or a case-insensitive master (we turn off case sensitivity with the (?i:...) construct)
\h* - go on matching zero or more horizontal whitespace
\[{0,2} - zero, one or two [ brackets
[^]\s]* - zero or more characters other than ] and whitespace (when ] is the first character in a character class, it does not have to be escaped in Boost/PCRE regexps)
]{0,2} - zero, one or two ] brackets (outside of character class, the closing square bracket does not need escaping)
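For testing the idea outside Sublime Text, here is an approximate port to Python's re module. Python has no \h, so the sketch substitutes [ \t] (spaces and tabs only, which is narrower than \h), and it needs Python 3.6+ for the scoped (?i:...) group. USE_STATEMENT and flagged_use are hypothetical names.

```python
import re

USE_STATEMENT = re.compile(
    r'^[ \t]*USE'
    # Reject when "$" or a case-insensitive "master" appears in the name.
    r'(?![ \t]*\[{0,2}[^\]\s]*(?:\$|(?i:master)))'
    # Otherwise consume optional brackets around a run of non-space chars.
    r'[ \t]*\[{0,2}[^\]\s]*\]{0,2}',
    re.MULTILINE,
)


def flagged_use(line):
    """Return the matched USE statement, or None when the line is excluded."""
    m = USE_STATEMENT.search(line)
    return m.group(0) if m else None


for line in ('USE [master]',
             'USE [$(CompiledDatabaseName)]',
             'USE [anything_else.01234]'):
    print(line, '->', flagged_use(line))
```

Only the third line produces a match; the first two are rejected by the negative lookahead.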

Trying to work out why this regex is not working? Regex should be less restrictive

The Text :
[prc:tl:plfl]
is matched by:
\[prc:tl:[^]]*plfl\]
However I need to also match:
[prc:tl:plfl,tr]
Basically "plfl" can appear anywhere in the string after "tl:" and before next "]"
So all of the following should match
[prc:tl:plfl,tr]
[prc:tl:tr, plfl]
[prc:tl:tr, plfl,sr]
[prc:tl:plfl,tr, sr, mr]
What is missing from my regex?
Many thanks in advance.
You may match any text other than ] after plfl with a negated character class [^\]] (you are actually already using it in the regex):
\[prc:tl:[^\]]*?plfl[^\]]*\]
See the regex demo
Details
\[prc:tl: - a [prc:tl: substring
[^\]]*? - a negated character class matching any 0+ chars other than ] as few as possible
plfl - a plfl substring
[^\]]* - any 0+ chars other than ] as many as possible
\] - a ] char.
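A short sketch that runs the pattern against the sample strings from the question, using Python's re module (an assumed flavor; has_plfl is a hypothetical helper name).

```python
import re

# "plfl" may appear anywhere after "tl:" and before the next "]".
PLFL_TAG = re.compile(r'\[prc:tl:[^\]]*?plfl[^\]]*\]')


def has_plfl(text):
    """Return True when text contains a [prc:tl:...] tag that includes plfl."""
    return PLFL_TAG.search(text) is not None


samples = [
    '[prc:tl:plfl]',
    '[prc:tl:plfl,tr]',
    '[prc:tl:tr, plfl]',
    '[prc:tl:tr, plfl,sr]',
    '[prc:tl:plfl,tr, sr, mr]',
]
for s in samples:
    print(s, has_plfl(s))
```

Since both character classes exclude ], the match cannot run past the closing bracket of the tag, so a plfl that only appears after the tag is not counted.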