RegEx for replacing alphanumeric chars in Notepad++ [duplicate] - regex

I want to upgrade a project from one version to another, so I need to change thousands of lines that follow the same pattern.
Example:
From this
$this->returnData['status']
To this
$this->{returnData['status']}
Using the following regex I found all the matches, but I am unable to replace them with the braced version.
->[a-zA-z]{5,15}\['[a-zA-z]{5,15}'\]
I used the following as the replacement:
->\{[a-zA-z]{5,15}\['[a-zA-z]{5,15}'\]\}

Try using the following find and replace, in regex mode:
Find: \$this->([A-Za-z]{5,15}\['[A-Za-z]{5,15}'\])
Replace: $this->{$1}
The regex says to:
\$this-> match "$this->"
( then match and capture
[A-Za-z]{5,15} 5-15 letters
\[ [
'[A-Za-z]{5,15}' 5-15 letters in single quotes
\] ]
) stop capture group
The replacement is just $this-> followed by the first capture group, but now wrapped in curly braces.
If you want to just match the arrow operator, regardless of what precedes it, then just use this pattern, and keep everything else the same:
->([A-Za-z]{5,15}\['[A-Za-z]{5,15}'\])
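The same capture-group replacement can be tried outside Notepad++. Below is a minimal sketch using Python's re module, which writes the group reference as \1 where Notepad++ accepts $1. The helper name wrap_property is made up for illustration and is not part of the original answer.

```python
import re

# Same pattern as the Find field above; group 1 is the property access.
PATTERN = re.compile(r"\$this->([A-Za-z]{5,15}\['[A-Za-z]{5,15}'\])")


def wrap_property(line: str) -> str:
    """Wrap the captured property access in curly braces (hypothetical helper)."""
    # Python spells the back-reference \1 where Notepad++ uses $1.
    return PATTERN.sub(r"$this->{\1}", line)


print(wrap_property("echo $this->returnData['status'];"))
```

Lines that do not contain the pattern are returned unchanged, so the function can be mapped over a whole file.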

You could also use negated character classes and use \K to forget the match. In the replacement use $0 to insert the full match surrounded by { and }
Match:
->\K[^[\n]+\['[^\n']+']
That will match
-> match ->
\K Forget what was matched
[^[\n]+ Match 1+ characters other than [ or a newline (use [a-zA-Z]{5,15} here for an exact match)
\[ Match [
'[^\n']+' Match ', then 1+ characters other than ' or a newline, then ' (use [a-zA-Z]{5,15} between the quotes for an exact match)
] Match ] literally
Replace with:
{$0}
Result:
$this->{returnData['status']}
A stricter version would look like:
->\K[a-zA-Z]{5,15}\['[a-zA-Z]{5,15}']
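Note that [a-zA-z] matches more than [a-zA-Z]: the range A-z also covers the ASCII characters that sit between Z and a, namely [, \, ], ^, _ and `.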
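To see which extra characters the A-z range lets through, here is a small sketch in Python (the variable names are illustrative). It tests every ASCII character against both classes and keeps those accepted only by the looser one.

```python
import re

loose = re.compile(r"[a-zA-z]")   # range A-z, as written in the question
strict = re.compile(r"[a-zA-Z]")  # letters only

# ASCII characters accepted by the loose class but rejected by the strict one.
extra = [
    chr(code)
    for code in range(128)
    if loose.fullmatch(chr(code)) and not strict.fullmatch(chr(code))
]
print(extra)
```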

Related

RegEx matching only within a match / restrict matching to part of string

Is there a way to use a single regular expression to match only within another match? For example, if I want to remove spaces from a string, but only within parentheses:
source : "foobar baz blah (some sample text in here) and some more"
desired: "foobar baz blah (somesampletextinhere) and some more"
In other words, is it possible to restrict matching to a specific part of the string?
In PCRE a combination of \G and \K can be used:
(?:\G(?!^)|\()[^)\s]*\K\s+
\G continues where the previous match ended
\K resets beginning of the reported match
[^)\s] matches any character not in the set
See demo at regex101
The idea is to chain matches to an opening parenthesis. The chain-links are either [^)\s]* or \s+. To only get spaces \K is used to reset before. This solution does not require a closing ).
In other regex flavors that support \G but not \K, capturing groups can help out. Eg Search for
(\G(?!^)|\()([^)\s]*)\s+
and replace with captures of the 2 groups (depending on lang: $1$2 or \1\2) - Regex101 demo
Further there is (*SKIP)(*F), a PCRE feature for skipping over certain parts. It is often used together with The Trick. The idea is simple: skip this(*SKIP)(*F)|match that - Regex101 demo. This can also be worked around with capture groups, e.g. replace ([^)(]*\(|\)[^)(]*)|\s with $1
One idea is to replace any space between parentheses using a lookahead pattern (the pattern begins with a literal space, which is the only character actually matched):
 (?=([^\s\(]+ )*\S*\))(?!\S*\s*\()
The lookahead will attempt to match the last space before the closed parenthesis (\S*\)) and any optional space before ([^\s\(]+ )* (if found).
Detailed Regex Explanation:
(leading space): the literal space to be matched
(?=([^\s\(]+ )*\S*\)): lookahead non-capturing group
([^\s\(]+ )*: any combination characters not including the open parenthesis and the space characters + space (this group is optional)
\S*\): any non-space character + closed parenthesis
(?!\S*\s*\(): what lookahead should not be
\S*: any non space character (optional), followed by
\s*: any space character (optional), followed by
\(: the open parenthesis
Check the demo here.
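Lookaheads are available in Python's re module, so this pattern can be tried directly. The sketch below writes the leading space as a quoted string fragment so it is visible; the function name is illustrative, and the check covers only the sample string from the question.

```python
import re

# A literal space, followed by the two lookaheads from the answer above.
SPACE_IN_PARENS = re.compile(r" " r"(?=([^\s\(]+ )*\S*\))" r"(?!\S*\s*\()")


def remove_spaces_in_parens(text: str) -> str:
    # Only the space itself is consumed; the lookaheads match zero characters.
    return SPACE_IN_PARENS.sub("", text)


sample = "foobar baz blah (some sample text in here) and some more"
print(remove_spaces_in_parens(sample))
```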

Match any character but no empty and not only white spaces

I have this regex:
\[tag\](.*?)\[\/tag\]
It matches any characters between the two tags. The problem is that it also matches empty content or just whitespace inside the tags, for example:
[tag][/tag]
[tag] [/tag]
How can I avoid this, so that it matches at least one character and not only whitespace? Thanks!
Use
\[tag\](?!\s*\[\/tag\])(.*?)\[\/tag\]
       ^^^^^^^^^^^^^^^^
See the regex demo and the Regulex graph:
The (?!\s*\[\/tag\]) is a negative lookahead that fails the match if, immediately to the right of the current location, there is 0+ whitespaces, [/tag].
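A quick way to check the negative lookahead is to run it over the problem cases. This is a minimal sketch in Python; the helper name non_blank_contents and the sample strings beyond those in the question are assumptions for illustration.

```python
import re

# [tag], not followed by optional whitespace and [/tag], then the lazy content.
PATTERN = re.compile(r"\[tag\](?!\s*\[/tag\])(.*?)\[/tag\]")


def non_blank_contents(text: str) -> list[str]:
    """Return the contents of every [tag]...[/tag] pair that is not blank."""
    return PATTERN.findall(text)


print(non_blank_contents("[tag][/tag] [tag] [/tag] [tag]hello[/tag]"))
```

Surrounding whitespace inside a non-blank tag is kept in the captured group; only tags that are empty or whitespace-only are rejected.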
You might change your expression to something similar to this:
\[tag\]([\s\S]+)\[\/tag\]
and you might add a quantifier to it, and bound it with number of chars, similar to this expression:
\[tag\]([\s\S]{3,})\[\/tag\]
Or you could do the same with your original expression as this expression:
Try this regex:
\[(tag)\](?!\s*\[\/\1\])(.*?)\[\/\1\]
This regex matches tag only if it has at least one non-whitespace char.
If this is a PCRE (or php) or NP++ or Perl, use this
(?s)(?:\[tag\]\s*\[/tag\](*SKIP)(?!)|\[tag\]\s*(.+?)\s*\[/tag\])
https://regex101.com/r/aCsOoQ/1
If not, you're stuck with using Stribnetz regex, which works because of
an odd condition of your requirements.
Readable
(?s)
(?:
\[tag\]
\s*
\[/tag\]
(*SKIP)
(?!)
|
\[tag\]
\s*
( .+? ) # (1)
\s*
\[/tag\]
)

RegEx: don't capture match, but capture after match

There are a thousand regular expression questions on SO, so I apologize if this is already covered. I did look first.
I have string:
Name Subname 11X22 88X620 AB33(20) YA5619 77,66
I need to capture this string: YA5619
What I am doing is just finding AB33(20) and after this I am capturing until first white space. But AB33(20) can be AB-33(20) or AB33(-20) or AB33(-1).
My preg_match regex is: (?<=\bAB\d{2}\(\d{2}\)\s).+?(?=\s)
Why am I getting an error when I change \d{2} to \d+?
For the final result I thought this regex would work, but it does not:
(?<=\bAB-?\d+\(-?\d+\)\s).+?(?=\s)
Any ideas what I am doing wrong?
With most regex flavors, lookbehind needs to evaluate to a fixed-length sequence, so you can't use variable quantifiers like * or + or even {1,2}.
Instead of using lookaround, you can simply match your marker pattern and then forget it with \K.
AB-?\d+(?:\(-?\d+\))? \K[^ ]+
demo: https://regex101.com/r/8XXngH/1
It depends on the language. In .NET, for example, the pattern matches because variable-length lookbehind is supported.
Another solution might be to use a character class and add the characters you would allow to match. Then match a whitespace character, reset the match start with \K, and match \S+, which matches 1+ non-whitespace characters.
\bAB[()\d-]+\s\K\S+
Explanation
\bAB Match literally prepended with word boundary to prevent AB being part of a larger match.
[()\d-]+ Match 1+ times any of the listed character in the character class
\s Match a whitespace char (or \s+ to match 1 or more)
\K Reset the starting point of the reported match (forget what was matched)
\S+ Match 1+ non-whitespace characters
Regex demo | Php demo
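Python's built-in re module is one of the flavors with the fixed-width lookbehind restriction, and it has no \K either. A minimal sketch of the usual fallback there is to match the marker and read the wanted token from a capture group; the function name token_after_marker is made up for illustration.

```python
import re

text = "Name Subname 11X22 88X620 AB33(20) YA5619 77,66"

# The variable-width lookbehind from the question is rejected at compile time.
try:
    re.compile(r"(?<=\bAB-?\d+\(-?\d+\)\s).+?(?=\s)")
    lookbehind_rejected = False
except re.error:
    lookbehind_rejected = True

# No \K in Python's re, so the token after the marker goes into group 1.
MARKER_THEN_TOKEN = re.compile(r"\bAB[()\d-]+\s(\S+)")


def token_after_marker(s: str):
    m = MARKER_THEN_TOKEN.search(s)
    return m.group(1) if m else None


print(lookbehind_rejected, token_after_marker(text))
```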

Regex expression for 2 identical strings in a row

So I am trying to create a regex expression for the following template.
"[alphaNumeric]String/String.xcl"
So
[a1B2c3]Hello/Hello.xcl would pass
a1B2c3]hello/hello.xcl fails
[a1B2c3]Hello/hello.xcl fails
[a1B2c3]hello/hello.xc fails
I have tried the following so far:
\[[\da-zA-Z]+\][a-z]+\/[a-z]+\.xcl$
How do I check if the middle strings are identical?
Use a backreference:
\[[a-zA-Z0-9]+\]([^/]+)/\1\.xcl
The term in parentheses captures the first part of your path. We may then refer to it later in the regex using \1.
Depending on how you plan to use this regex, you might need optional starting and closing anchors (^ and $).
You may capture the part after brackets and use a backreference after /:
^\[[\da-zA-Z]+]([A-Za-z]+)\/\1\.xcl$
^^^^^^^^^^ ^^
See the regex demo
Details
^ - start of the string
\[ - a [
[\da-zA-Z]+ - 1+ alphanumeric chars
] - a ] char
([A-Za-z]+) - Capturing group 1: one or more letters
\/ - a slash
\1 - a backreference to capturing group 1 value
\.xcl - .xcl substring
$ - end of string.
NOTE: If you do not care about what kind of chars there can be inside brackets, you may replace [\da-zA-Z]+ with [^\]]+.
NOTE2: If you want to match any chars on both ends of /, replace ([A-Za-z]+) with ([^\/]+).
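The backreference can be checked against the four examples from the question. A minimal sketch in Python, where fullmatch takes the place of the ^ and $ anchors; the function name is_valid is illustrative.

```python
import re

# Group 1 captures the name before the slash; \1 requires the same text after it.
PATTERN = re.compile(r"\[[\da-zA-Z]+]([A-Za-z]+)/\1\.xcl")


def is_valid(name: str) -> bool:
    # fullmatch succeeds only if the whole string fits the pattern.
    return PATTERN.fullmatch(name) is not None


for candidate in (
    "[a1B2c3]Hello/Hello.xcl",
    "a1B2c3]hello/hello.xcl",
    "[a1B2c3]Hello/hello.xcl",
    "[a1B2c3]hello/hello.xc",
):
    print(candidate, is_valid(candidate))
```

The comparison done by \1 is case-sensitive by default, which is why Hello/hello is rejected.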

Can notepad++ regex find a string and replace with a new string that contains the found string

I have the following text:
.clk,
.rst_b,
.phase_in
I want the following:
.clk(clk),
.rst_b(rst_b),
.phase_in(phase_in)
That is-
find a string starting with a period and excluding a trailing comma that may or may not be present.
Append the string, excluding the period, to the found string inside parentheses.
Notepad++ has the function "find and replace", not "find and append", so step two could be written as follows: replace the string with a copy of itself followed by a copy of itself enclosed in parentheses.
Step one is completed by \.\w+. Any thoughts on step 2?
Thanks!
Yes you can! Use capture groups to refer to the matched word in your replacement.
Replace \.(\w+) with .\1\(\1\).
The parentheses mean that we want to keep a reference to \w+. We access that reference with \1 in the replacement.
You can use the following regular expression. This captures word characters following the dot.
Find: ^\.(\w+)
Replace: .\1\(\1\)
Explanation:
^ # the beginning of the string
\. # '.'
( # group and capture to \1:
\w+ # word characters (a-z, A-Z, 0-9, _) (1 or more times)
) # end of \1
In the replacement, we use \1 to reference what was matched and captured by capturing group #1. Note: You need to escape the parentheses in the replacement to actually display them.
Use \1: find \.(\w+) and replace with \.\1\(\1\)
You can use a simple regex like this:
Find what: \.([^,]+)
Replacement: .$1\($1\)
Make sure the "Regular expression" radio button is selected and ". matches newline" is unchecked.
Pattern explanation:
\. '.'
( group and capture to \1:
[^,]+ any character except: ',' (1 or more times)
) end of \1
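For comparison, the same find-and-append idea can be sketched in Python, using the \.(\w+) pattern from the earlier answers. Unlike Notepad++, Python's replacement template treats parentheses as ordinary characters, so they are not escaped there. The variable names are illustrative.

```python
import re

ports = ".clk,\n.rst_b,\n.phase_in"

# Group 1 is the port name without the leading period; it is written twice:
# once after the period and once inside the parentheses.
result = re.sub(r"\.(\w+)", r".\1(\1)", ports)
print(result)
```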