How do I use more than nine backreferences in Notepad++ regexp?

If I use a long regular expression in Notepad++, for example:
^([^ ]+) ([^ ]+) ([^ ]+) (\[.*?\]) (".*?") (".*?") (".*?") (".*?") (\d+) (\d+) (\d+)$
(this turns Apache log lines from space-separated into tab-separated)
then I can't use more than nine backreferences in the replacement, because \10 yields the content of the first captured group followed by a literal "0".
I tried $10, but that gives the same result.

You can use curly braces for this:
${10}
For reference, Notepad++ uses boost::regex, and you can find its substitution pattern docs here: Boost-Extended Format String Syntax. This replacement mode allows for more complex expressions (like conditionals and common Perl placeholders) in the replacement pattern.

Just use the curly braces:
${10}
This ensures that the replacement refers to the 10th capturing group, and not to the 1st group followed by a literal zero.
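The same ambiguity can be sketched outside Notepad++ with Python's re module. This is only an analogy and not Notepad++ or Boost behaviour: Python writes the unambiguous reference as \g<10> rather than ${10}. The sample log line and the function names are made up for illustration.

```python
import re

# Pattern from the question: eleven capture groups for one Apache log line.
LOG_PATTERN = re.compile(
    r'^([^ ]+) ([^ ]+) ([^ ]+) (\[.*?\]) (".*?") (".*?") (".*?") (".*?") (\d+) (\d+) (\d+)$'
)

# Hypothetical sample line, invented for this sketch.
SAMPLE = (
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /a HTTP/1.0" "ref" "UA" "x" 200 2326 15'
)

# Build "\g<1><TAB>\g<2><TAB>...<TAB>\g<11>" so every group number is explicit.
REPLACEMENT = "\t".join(r"\g<%d>" % n for n in range(1, 12))


def to_tab_separated(line):
    """Rewrite one space-separated log line as tab-separated fields."""
    return LOG_PATTERN.sub(REPLACEMENT, line)


def group_one_then_zero(line):
    """Explicitly request group 1 followed by a literal 0."""
    return LOG_PATTERN.sub(r"\g<1>0", line)


if __name__ == "__main__":
    print(to_tab_separated(SAMPLE))
    print(group_one_then_zero(SAMPLE))
```

Delimiting the group number, whether with braces in Notepad++ or with \g<...> in Python, removes any doubt about where the number ends.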

Related

Extracting days from a string Regex

I am trying to extract the days using regex groups in C# from the following string,
"RRULE:FREQ=MONTHLY;UNTIL=20211126T143000Z;INTERVAL=1;BYDAY=MO,TU,WE,TH,FR;BYSETPOS=-1"
I am new to regular expressions and have looked at various websites to try to write one. The expression I have so far is the following:
(?:BYDAY=)([A-Z,]*);
Which matches
MO,TU,WE,TH,FR;
as a whole. I can then split that on ',' to get what I want, but I wanted to know whether there is a way of doing this purely in regex.
If a quantifier in the lookbehind is supported, you might use:
(?<=BYDAY=[A-Z,]*)[A-Z]+
Explanation
(?<= Positive lookbehind, assert what is on the left is
BYDAY=[A-Z,]* match BYDAY= followed by 0 or more times A-Z or ,
) Close lookbehind
[A-Z]+ Match 1+ chars A-Z
Alternatively you can make use of the \G anchor to get iterative matches and capture the value in group 1
(?:\G(?!^)|BYDAY=)([A-Z]+),?
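For comparison, here is a minimal sketch in Python. Python's built-in re module accepts neither variable-width lookbehind nor the \G anchor, so this swaps in a plain two-step approach (capture the BYDAY value, then pull out each code) instead of either pattern above. The function name extract_days is hypothetical.

```python
import re
from typing import List

RRULE = (
    "RRULE:FREQ=MONTHLY;UNTIL=20211126T143000Z;"
    "INTERVAL=1;BYDAY=MO,TU,WE,TH,FR;BYSETPOS=-1"
)


def extract_days(rule: str) -> List[str]:
    """Return the BYDAY codes, or an empty list when BYDAY is absent."""
    # Step 1: isolate the comma-separated value that follows BYDAY=.
    found = re.search(r"BYDAY=([A-Z,]*)", rule)
    if found is None:
        return []
    # Step 2: collect each run of capital letters from that value.
    return re.findall(r"[A-Z]+", found.group(1))


if __name__ == "__main__":
    print(extract_days(RRULE))
```

In .NET the single-pattern versions above avoid the second step; the sketch only shows what the fallback looks like in an engine without those features.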

Changing Parentheses to Square Brackets With Regular Expressions

In Visual Studio 2019 I am trying to replace parentheses with square brackets.
For example: fld(126) to fld[126]
Using this regular expression
fld[\(][0-9]*[\)]
correctly matches what I am looking for in the code.
But \2 replaces everything between the parentheses with a literal '\2' instead of keeping what was there before.
Any help would be appreciated...
Your example RegEx shows you only want to change the brackets for the function fld.
Use
Find: fld\((\d*)\)
Replace: fld[$1]
If the content between the brackets can be more than digits, use
Find: fld\(([^)]*)\)
Replace: fld[$1]
Try the following find and replace, in regex mode:
Find: ([^(\s]+)\(([^)]+)\)(?!\S)
Replace: $1[$2]
Here is an explanation of the regex pattern:
([^(\s]+) match AND capture the leading 'fld' term, in $1
this is given by all non '(' and whitespace characters
\( match a literal (
([^)]+) match AND capture the content inside parentheses, in $2
\) match a literal )
(?!\S) assert that what follows the closing ) is whitespace or the end of the text
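Both find patterns can be tried outside Visual Studio with a short Python sketch. Note the assumptions: Python writes replacement backreferences as \1 and \2 rather than $1 and $2, and the function names and sample inputs are invented for illustration.

```python
import re

# Specific form: only rewrites calls to fld with digits inside.
FLD_ONLY = re.compile(r"fld\((\d*)\)")

# General form: any name, and the closing parenthesis must be followed
# by whitespace or the end of the text.
ANY_NAME = re.compile(r"([^(\s]+)\(([^)]+)\)(?!\S)")


def fld_to_brackets(text):
    """Turn fld(123) into fld[123], leaving other calls alone."""
    return FLD_ONLY.sub(r"fld[\1]", text)


def any_to_brackets(text):
    """Turn name(args) into name[args] when whitespace or the end follows."""
    return ANY_NAME.sub(r"\1[\2]", text)


if __name__ == "__main__":
    line = "x = fld(126) + foo(3)"
    print(fld_to_brackets(line))
    print(any_to_brackets(line))
    # The trailing semicolon blocks the general pattern's lookahead.
    print(any_to_brackets("y = fld(126);"))
```

The last call shows the cost of the (?!\S) check: a call followed directly by punctuation is left unchanged.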

Match any characters, but not empty content and not only whitespace

I have this regex:
\[tag\](.*?)\[\/tag\]
It matches any characters between the two tags. The problem is that it also matches empty content or just whitespace inside the tags, for example:
[tag][/tag]
[tag] [/tag]
How can I avoid this? I want it to match at least one character, and not only whitespace. Thanks!
Use
\[tag\](?!\s*\[\/tag\])(.*?)\[\/tag\]
       ^^^^^^^^^^^^^^^^
The (?!\s*\[\/tag\]) is a negative lookahead that fails the match if, immediately to the right of the current location, there are zero or more whitespace characters followed by [/tag].
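A minimal Python sketch of the lookahead version, run against the empty and whitespace-only cases from the question. The function name tag_contents and the sample strings are assumptions for illustration; the forward slash needs no escaping in Python, so it is written plainly.

```python
import re

# Opening tag, then a lookahead that rejects "optional whitespace + closing
# tag", then the lazily captured content and the closing tag.
NON_BLANK_TAG = re.compile(r"\[tag\](?!\s*\[/tag\])(.*?)\[/tag\]")


def tag_contents(text):
    """Return the content of every tag pair that is not empty or blank."""
    return NON_BLANK_TAG.findall(text)


if __name__ == "__main__":
    print(tag_contents("[tag][/tag] [tag] [/tag] [tag]hello[/tag]"))
    print(tag_contents("[tag] a b [/tag]"))
```

Surrounding whitespace is kept in the capture; only content that is entirely whitespace is rejected.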
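You might change your expression to something like this, which requires at least one character between the tags (note that it still accepts whitespace-only content and, being greedy, runs to the last [/tag]):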
\[tag\]([\s\S]+)\[\/tag\]
You can also set a minimum number of characters with a bounded quantifier, for example at least three:
\[tag\]([\s\S]{3,})\[\/tag\]
Or you could do the same with your original expression as this expression:
Try this regex:
\[(tag)\](?!\s*\[\/\1\])(.*?)\[\/\1\]
This regex matches tag only if it has at least one non-whitespace char.
If this is a PCRE (or php) or NP++ or Perl, use this
(?s)(?:\[tag\]\s*\[/tag\](*SKIP)(?!)|\[tag\]\s*(.+?)\s*\[/tag\])
https://regex101.com/r/aCsOoQ/1
If not, you're stuck with using Stribnetz regex, which works because of
an odd condition of your requirements.
Readable
(?s)
(?:
     \[tag\]
     \s*
     \[/tag\]
     (*SKIP)
     (?!)
  |
     \[tag\]
     \s*
     ( .+? )     # (1)
     \s*
     \[/tag\]
)

Regular expressions in Sublime Text 3

I am trying to make a regular expression that replaces the content of the texts in parentheses.
I have used the following regular expression:
"([A-Za-z ]*)"
But it does not work as expected.
Thank you and greetings.
Remove the double quotes from your expression and escape the parentheses:
\([A-Za-z ]*\)
Details:
\( - a literal (
[A-Za-z ]* - zero or more ASCII letters or spaces
\) - a literal ).
The unescaped (...) form a capturing group that stores a submatch in the memory buffer that can be used later during matching or replacement via backreferences.
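The difference between the two forms can be checked with a short Python sketch. Sublime Text uses its own regex engine, so this only illustrates the general escaping rule; the sample text and function names are made up.

```python
import re

TEXT = "call (hello world) now"


def first_match_unescaped(text):
    """Parentheses act as a capture group, so only letters/spaces match."""
    return re.search(r"([A-Za-z ]*)", text).group(0)


def first_match_escaped(text):
    """Escaped parentheses are matched as literal characters."""
    return re.search(r"\([A-Za-z ]*\)", text).group(0)


def replace_parenthesised(text, new_content):
    """Replace each parenthesised run of letters/spaces, keeping the ( )."""
    return re.sub(r"\([A-Za-z ]*\)", "(" + new_content + ")", text)


if __name__ == "__main__":
    print(repr(first_match_unescaped(TEXT)))
    print(repr(first_match_escaped(TEXT)))
    print(replace_parenthesised(TEXT, "replaced"))
```

The unescaped pattern stops at the first "(" because that character is not in the class, which is why it never selects the parenthesised text.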

~m, s and () in perl regexp

I am trying to get hold of regular expressions in Perl. Can anyone please provide any examples of what matches and what doesn't for the below regular expression?
$sentence =~m/.+\/(.+)/s
=~ is the binding operator; it makes the regex match be performed on $sentence instead of the default $_. m is the match operator; it is optional (e.g. $foo =~ /bar/) when the regex is delimited by / characters but required if you want to use a different delimiter.
s is a regex flag that makes . in the regex match any character, including newlines; by default . does not match newlines.
The actual regex is .+\/(.+); this will match one or more characters, then a literal / character, then one or more other characters. Because the initial .+ consumes as much as possible while still allowing the regex to succeed, it will match up to the last / in the string that has at least one character after it; then the (.+) will capture the characters that follow that / and make them available as $1.
So it is essentially capturing the final component of a filepath. Of foo/bar it will capture the bar, of foo/bar/ it will capture the bar/. Strings with only one component, like /foo or bar/ or baz will not match.
It matches any string, including multi-line strings, that contains a slash character somewhere in the middle, with at least one character on each side of it.
Matches:
foo/bar
asdf\nwrqwer/wrqwerqw # /s modifier allows '.' to match newlines
Doesn't match:
asdfasfdasf # no slash character
/asdfasdf # no characters before the slash
asdfasf/ # no characters after the slash
In addition, the entire substring that follows the last slash in the string will be captured and assigned to the variable $1.
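The cases above can be reproduced with a rough Python equivalent. This is a sketch, not Perl: re.DOTALL stands in for the /s flag, re.search stands in for the unanchored =~ m// match, and last_component is a hypothetical helper name.

```python
import re

# Same pattern as the Perl example; DOTALL lets "." match newlines like /s.
LAST_COMPONENT = re.compile(r".+/(.+)", re.DOTALL)


def last_component(sentence):
    """Return what Perl would put in $1, or None when there is no match."""
    found = LAST_COMPONENT.search(sentence)
    return found.group(1) if found else None


if __name__ == "__main__":
    for candidate in ["foo/bar", "foo/bar/", "/foo", "bar/", "baz", "a/b\nc"]:
        print(repr(candidate), "->", repr(last_component(candidate)))
```

The "a/b\nc" case shows where the flag matters: with it, the capture runs past the newline; without it, the capture would stop at "b".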
Breakdown:
$sentence =~ — match $sentence with
m/ — the pattern consisting of
. — any character
+ — one or more times
\/ — then a forward-slash
( — and, saving in the $1 capture group,
.+ — any character one or more times
)
/s — allowing . to match newlines
See perldoc perlop for information about operators such as =~ and quote-like operators such as m//, and perldoc perlre about regular expressions and their options such as /s.