Regex notepad++ and groups - regex

I have the following data in my file:
234xt_
yad42_
23ft3_
45gdw_
...
Where the _ means a space.
Using Notepad++ I want to rewrite it to be:
'234xt',
'yad42',
'23ft3',
'45gdw'
I am using the following regex in the "Find what" field: (^\w+)\s*\n
And in the "Replace with" field: $0,
But it is not working as expected.

You may use
^(\w+) $
or
^(\w+)\h$
And replace with '$1',.
^ will match the start of a line, (\w+) will place one or more letters, digits or underscores into Group 1 (that you may access via $1 or \1 backreference in the replacement pattern), and then a space or \h will match a space or any horizontal whitespace, and then $ will assert the position at the end of the line.
If the trailing whitespace may be missing, add the appropriate quantifier after the space or \h: \h* will match 0 or more horizontal whitespace characters and \h? will match 1 or 0.
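The same substitution can be tried outside Notepad++. The following is only a sketch in Python, whose re module has no \h, so [ \t] stands in for horizontal whitespace here. The function name quote_lines is made up for illustration.

```python
import re

def quote_lines(text: str) -> str:
    # re.MULTILINE makes ^ and $ anchor at every line, as Notepad++ does.
    # [ \t]* is a stand-in for \h* (zero or more spaces or tabs).
    return re.sub(r"^(\w+)[ \t]*$", r"'\1',", text, flags=re.MULTILINE)

sample = "234xt \nyad42 \n23ft3 \n45gdw "
print(quote_lines(sample))
```

Like the Notepad++ replacement, this also puts a comma after the last line.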

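You should use \1 instead of $0: $0 inserts the whole match, including the trailing whitespace and the line break, whereas \1 inserts only Group 1. See the example in the docs.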

Related

Add Find Special Character at beginning and ADD to END of string with regex

I have a string that starts with a # symbol and would like to add the same # symbol to the end.
The string could contain any mix of lowercase/uppercase letters, numbers, commas, periods, etc. Each string is on its own line.
Here is what I have tried with no success:
Find: (?=#)([\S\s]+) # www.admitme.app
Find: (?=#\B)([\S\s]+) # Carlo Baxter
Find: (?=#\B)([A-Za-z0-9]+) # resumes in 15 minutes
Replace: $1 # # resumes in 15 minutes #
Yes I'm a noob with regex...
Thanks in advance
Hank K
The following pattern is working in your demo:
(?=#\B)(.*)
This works because .* does not match across newlines as long as the "dot matches newline" option is off. You were using [\S\s]+, which matches across newlines regardless of that option. Here is the updated demo.
You can do the same replacement without lookarounds or capture groups using one of these patterns. The point is to match any character except a newline with .* (and not to set a flag that makes the dot match newlines).
#\B.*
# .*
In the replacement use the matched text followed by a space and #
$0 #
See a regex demo.
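As a rough cross-check, the capture-free variant can be reproduced in Python. This is a sketch only: Python writes the whole match as \g<0> rather than $0, and close_hash is a hypothetical helper name.

```python
import re

def close_hash(text: str) -> str:
    # '#' followed by a non-word boundary, then the rest of the line.
    # re.DOTALL is not set, so the dot stops at each newline.
    # \g<0> is the whole match (what $0 is in Notepad++).
    return re.sub(r"#\B.*", r"\g<0> #", text)

print(close_hash("# Carlo Baxter\n# resumes in 15 minutes"))
```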

Extend string between strings

startABCend
->
startABC123end
I seek to capture text between start and end, and extend it, as shown. I tried:
find = start.*end, replace = \1 123: will capture start and end and between, but replace them all
find = (?s)(?<=start).+?(?=end), replace = \1 123: will keep start and end but replace captured
How to accomplish this with regex in N++?
The exact use case is
func_name(a, b=1) -> func_name(a, b=1, c=2)
# can also be
func_name(g=5, k=7) -> func_name(g=5, k=7, c=2)
# so capture between `func_name(` and `)` and extend with `, c=2`
You could do this without capture groups, and match what you want to replace.
\bstart\K.*?(?=end\b)
The pattern matches:
\bstart Match start preceded by a word boundary
\K Forget what is matched until now
.*? Match as few characters as possible
(?=end\b) Positive lookahead, assert end to the right followed by a word boundary
In the replacement use the full match followed by 123
$&123
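Python's re module supports neither \K nor $&, so the sketch below swaps in a capture group for the part that \K would discard and keeps the lookahead. It illustrates the idea rather than reproducing the Notepad++ pattern exactly; extend_between is a made-up name.

```python
import re

def extend_between(text: str) -> str:
    # Group 1 holds 'start' plus the shortest run of characters before 'end'.
    # The lookahead only asserts 'end', so 'end' is not consumed.
    # \g<1> keeps the digits 123 from being read as part of the group number.
    return re.sub(r"(\bstart.*?)(?=end\b)", r"\g<1>123", text)

print(extend_between("startABCend"))
```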
For the updated example data, you could match the format of key with an optional =value, and optionally repeat that asserting a ) to the right.
\bfunc_name\([^\s,=]+(?:=[^\s,=]+)?(?:,\h*[^\s,=]+(?:=[^\s,=]+)?)*(?=\))
And replace with
$&, c=2
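For a quick check of the longer pattern, here is a hedged Python sketch. It assumes [ \t] is an acceptable substitute for \h and uses \g<0> where Notepad++ uses $&. The names ARG, PATTERN and add_kwarg are illustrative only.

```python
import re

# One argument: a key with an optional =value part.
ARG = r"[^\s,=]+(?:=[^\s,=]+)?"
# func_name( + first argument + any further comma-separated arguments,
# with a lookahead asserting the closing parenthesis.
PATTERN = re.compile(rf"\bfunc_name\({ARG}(?:,[ \t]*{ARG})*(?=\))")

def add_kwarg(line: str) -> str:
    # Append ', c=2' after the matched argument list; ')' is left untouched.
    return PATTERN.sub(r"\g<0>, c=2", line)

print(add_kwarg("func_name(a, b=1)"))
print(add_kwarg("func_name(g=5, k=7)"))
```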
Your example target does not include the whitespace you have in your replacement string. To use the group and append digits directly after it, you can wrap the backreference in parentheses.
Basically:
Find: (?<=start)(.+?)(?=end)
Replace: (\1)123
or just
Find: start(.+?)end
Replace: start(\1)123end

How to encase words with quotations?

I am currently trying to convert a list of 1000 words into this format:
'known', 'buss', 'hello',
and so on.
The list I have is currently in this format:
known
worry
claim
tenuous
porter
I am trying to use notepad++ to do this, if anybody could point me in the correct direction, that would be great!
Use this if you want a comma delimited list but no extra comma at the end.
Ctrl+H
Find what: (\S+)(\s+)?
Replace with: '$1'(?2,:)
CHECK Wrap around
CHECK Regular expression
Replace all
Explanation:
(\S+) # group 1, 1 or more non spaces
(\s+)? # group 2, 1 or more spaces, optional
Replacement:
'$1' # content of group 1 enclosed in quotes
(?2,:) # if group 2 exists, add a comma, else, do nothing
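The conditional replacement (?2,:) is a feature of the engine Notepad++ uses. Python's re has no equivalent, so the sketch below uses a replacement function to make the same decision. The name quote_words is hypothetical.

```python
import re

def quote_words(text: str) -> str:
    def repl(m: re.Match) -> str:
        # Group 2 is None when no whitespace follows the word,
        # which happens only for the last word in the text.
        return f"'{m.group(1)}'" + ("," if m.group(2) else "")
    return re.sub(r"(\S+)(\s+)?", repl, text)

print(quote_words("known\nworry\nclaim\ntenuous\nporter"))
```

If the input ends with a line break, the last word is followed by whitespace too, so it still gets a comma.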
How about replacing (\S+) with '$1'? Make sure your Regular Expression button is selected in the Find and Replace tool inside Notepad++
Explanation
(\S+) is regex for one or more non-whitespace characters. Wrapping it in parentheses puts it in a capture group, which can be accessed by its number using a dollar sign ($1).
'$1' will take that found text from the Find above and replace it with capture group #1 ($1) wrapped in single quotes '.
Sample
Input: known worry claim tenuous porter
Output: 'known' 'worry' 'claim' 'tenuous' 'porter'

How to replace specific character one time

I want to replace character - using regular expression in my text so it would work like this:
Original text: abcd-efg-hijk-lmno
Text after replacing: abcd-efg-hijk/lmno
As you can see I want to replace character - starting from the end just one time with character /.
Thanks in advance for any tips
Find what: -([^-]*)$
Replace with: /$1
Search Mode: Regular Expression
Explanation:
- : a dash
([^-]*) : text with no dash,
zero or more times,
put in the $1 variable
$ : the end of the line
/$1 : literal "/", contents of $1
Good resource: http://www.grymoire.com/Unix/Regular.html
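The pattern carries over to Python almost unchanged, which makes it easy to test. This is a minimal sketch, and last_dash_to_slash is a made-up name.

```python
import re

def last_dash_to_slash(text: str) -> str:
    # A dash followed only by non-dash characters up to the end of the line
    # can only be the last dash on that line.
    # re.MULTILINE lets $ match at the end of every line.
    return re.sub(r"-([^-]*)$", r"/\1", text, flags=re.MULTILINE)

print(last_dash_to_slash("abcd-efg-hijk-lmno"))
```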
To replace characters in Notepad++, you can open the Replace window using Ctrl+H, or under the "Search" menu. Once open, enter the following regular expression:
(.{4}-.{3}-.{4})(-)(.{4})
This will find:
a group of four characters (the "." being any character, the "{4}" being the quantity),
a dash,
a group of three characters,
another dash,
a group of four characters,
again another dash,
then a group of four characters.
The parentheses group this search into captured groups, which we will use for the replacement part. See https://www.regular-expressions.info/brackets.html for more info.
If you want to restrict the search to lowercase letters as in your example, you would replace the "." with "[a-z]", or for upper and lower "[a-zA-Z]".
Now for the replacement. The groups from earlier are referenced by the dollar sign then the number, e.g. $1 would be the first. So we will replace the characters found with the first group ($1), disregard the second group containing the dash and insert the "/" instead, then include the third group ($3):
$1/$3
The settings in the replace window need to have "Regular expression" and "Wrap around" checked, and ". matches newline" unchecked.
You can then click Replace all to replace all occurrences, or go through using Replace individually.
Since the beginning and end of line characters are not included, you can find multiple occurrences of this pattern on a single line.
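A short Python sketch of the fixed-width approach, assuming the data always follows the 4-3-4-4 layout described above (swap_fixed is an illustrative name). Python writes the group references as \1 and \3.

```python
import re

def swap_fixed(text: str) -> str:
    # Group 1: 4 chars, dash, 3 chars, dash, 4 chars.
    # Group 2: the dash to be replaced. Group 3: the final 4 chars.
    return re.sub(r"(.{4}-.{3}-.{4})(-)(.{4})", r"\1/\3", text)

# Two occurrences on one line are both rewritten.
print(swap_fixed("abcd-efg-hijk-lmno wxyz-abc-defg-hijk"))
```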
Note: This answer follows the same procedure as Toto's, but uses a different regular expression.
Ctrl+H
Find what: ^(.+)-([^-]+)$
Replace with: $1/$2
check Wrap around
check Regular expression
DO NOT CHECK . matches newline
Replace all
Explanation:
^ : beginning of line
(.+) : 1 or more of any character, captured in group 1
- : a dash
([^-]+) : 1 or more of any character except a dash, captured in group 2
$ : end of line
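To see how the greedy (.+) settles on the last dash, the pattern can be run in Python. This is a sketch and swap_last_dash is a hypothetical name. Because [^-]+ needs at least one character, a line ending in a dash is left unchanged here, whereas -([^-]*)$ would rewrite it.

```python
import re

def swap_last_dash(text: str) -> str:
    # (.+) first takes the whole line, then backs off until a dash follows
    # that has only non-dash characters between it and the end of the line.
    return re.sub(r"^(.+)-([^-]+)$", r"\1/\2", text, flags=re.MULTILINE)

print(swap_last_dash("abcd-efg-hijk-lmno"))
print(swap_last_dash("abcd-"))
```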

Remove all characters after a certain match

I am using Notepad++ to remove some unwanted strings from the end of a pattern and this for the life of me has got me.
I have the following sets of strings:
myApp.ComboPlaceHolderLabel,
myApp.GridTitleLabel);
myApp.SummaryLabel + '</b></div>');
myApp.NoneLabel + ')') + '</label></div>';
I would like to leave just myApp.[variable] and get rid of, e.g. ,, );, + '...', etc.
Using Notepad++, I can match the strings themselves using ^myApp.[a-zA-Z0-9].*?\b (it's a bit messy, but it works for what I need).
But in reality, I need negate that regex, to match everything at the end, so I can replace it with a blank.
You don't need negation. Just put your regex inside a capturing group and add an extra .*$ at the end. $ matches the end of a line. The whole line is matched and replaced by the text captured in the first group. Also, . matches any character, so you need to escape the dot to match a literal dot.
^(myApp\.[a-zA-Z0-9].*?\b).*$
Replacement string:
\1
OR
Match only the following characters and then replace it with an empty string.
\b[,); +]+.*$
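Both approaches can be compared on the sample lines from the question. The Python sketch below is for checking only; keep_prefix and drop_suffix are made-up names, and \1 works the same way in Python's replacement strings.

```python
import re

LINES = [
    "myApp.ComboPlaceHolderLabel,",
    "myApp.GridTitleLabel);",
    "myApp.SummaryLabel + '</b></div>');",
    "myApp.NoneLabel + ')') + '</label></div>';",
]

def keep_prefix(line: str) -> str:
    # Match the whole line, keep only what group 1 captured.
    return re.sub(r"^(myApp\.[a-zA-Z0-9].*?\b).*$", r"\1", line)

def drop_suffix(line: str) -> str:
    # Match from the first word boundary followed by one of , ) ; space +
    # to the end of the line, and delete it.
    return re.sub(r"\b[,); +]+.*$", "", line)

for line in LINES:
    print(keep_prefix(line), "|", drop_suffix(line))
```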
I think this works equally well:
^(myApp\.\w+).*$
Replacement string:
\1
From difference between \w and \b regular expression meta characters:
\w stands for "word character", usually [A-Za-z0-9_]. Notice the inclusion of the underscore and digits.
(^.*?\.[a-zA-Z]+)(.*)$
Use this. Replace with
$1
See demo.
http://regex101.com/r/lU7jH1/5