Replacing text in Notepad++ with regex

I'm trying to translate a subtitle file with Google Translate, and everything works except for one problem: it removes the comma from the timestamps. I pasted the result into Notepad++ and tried to fix it with a regex replace. The time format is:
00:00:44927 -->
and should be
00:00:44,927 -->
So I tried this regex in the Find what field: :(\d){2}(\d){3}( -->)
And this in the Replace with field: :$1,$2 -->
The search works, but the replace results in this: 00:00:4,7 -->. It seems that $1 holds a single digit from the first group (\d){2}, which is 4, and $2 a single digit from the second group (\d){3}, which is 7.
Why?

You need to place the range quantifier {n} inside your capturing groups. By placing it outside, you're telling the regex engine to repeat the group n times instead of the token \d, and a repeated capturing group keeps only the text of its last iteration, so each of your groups ends up holding a single digit.
Find: :(\d{2})(\d{3})( -->)
Replace: :$1,$2$3
If you wanted to, you could also use \K together with a lookahead assertion to achieve this.
Find: :\d\d\K(?=\d\d\d)
Replace: ,
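To see the difference outside Notepad++, here is a minimal sketch in Python (an assumption for illustration only; Python's re module writes backreferences as \1 rather than $1). It runs the question's pattern and the corrected pattern against the sample timestamp from the question.

```python
import re

line = "00:00:44927 -->"

# Quantifier outside the group: the group is repeated and keeps only its
# last iteration, so group 1 is "4" and group 2 is "7".
broken = re.sub(r":(\d){2}(\d){3}( -->)", r":\1,\2 -->", line)

# Quantifier inside the group: each group captures the whole run of digits.
fixed = re.sub(r":(\d{2})(\d{3})( -->)", r":\1,\2\3", line)

print(broken)  # 00:00:4,7 -->
print(fixed)   # 00:00:44,927 -->
```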

You can just search for
(?<=\d\d)(\d{3})(?= -->)
and replace with ,$1
You were capturing just \d rather than \d{3}, which is why your regex didn't work as expected.

Related

Notepad++: Can I use regex to find some values and remove only one character instead of the whole pattern?

I want to use regex in notepad to find this pattern: "[0-9]+[\.][0-9]+[,][0-9]+" e.g. 1.010,80260
However, from these kinds of numbers I just want to remove the '.', so the new value should be 1010,80260.
So far I can only replace the whole pattern. Is there a way to do it?
Thank you in advance!

You can make use of the \K meta escape, since PCRE doesn't support variable-width lookbehinds:
regex:
[0-9]+\K[\.](?=[0-9]+[,][0-9]+)
[0-9]+ - match one or more digits
\K - forget what has been matched so far
[\.] - match a period; just \. can be used, no need for the character class brackets
(?=[0-9]+[,][0-9]+) - what follows must be digits, then a comma, then digits
replace:
Nothing (leave the field empty)
\K is buggy in Notepad++, so you could use this regex instead, since you only care that at least one digit precedes the period:
(?<=\d)\.(?=[0-9]+[,][0-9]+)
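The lookaround version can be tried outside the editor too. The sketch below uses Python as an illustration (not Notepad++ itself); Python's built-in re module does not support \K, so only the fixed-width lookbehind form is shown. The second input, 3.14, is a made-up value to show that a period without a following comma is left alone.

```python
import re

# One digit behind the period, and "digits,digits" ahead of it.
pattern = r"(?<=\d)\.(?=[0-9]+,[0-9]+)"

cleaned = re.sub(pattern, "", "1.010,80260")
untouched = re.sub(pattern, "", "3.14")  # hypothetical input without a comma

print(cleaned)    # 1010,80260
print(untouched)  # 3.14
```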

You can use \K, which basically says "throw away whatever was matched up until that point", then add a lookahead, like so:
[0-9]+\K\.(?=[0-9]+[,][0-9]+)

Change the regular expression to: ([0-9]+)[\.]([0-9]+[,][0-9]+)
The () pieces are groups which you can refer to in the replace with \1 for the first group, and \2 for the second group.
The docs also explain this here: https://npp-user-manual.org/docs/searching/#substitution-grouping (even better, and in more detail, than my usage in this answer...)
EDIT: I just wanted to share an animated GIF showing that 'Replace' in Notepad++ 7.9.5 does not seem to work.
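For comparison, the capturing-group approach from this answer could look like this in Python (used here only as a stand-in for the editor; the surrounding words in the sample text are invented for illustration). Both groups are written back and the period between them is simply left out.

```python
import re

text = "total 1.010,80260 and 2.500,10"  # hypothetical sample text

# Group 1: digits before the period; group 2: digits, comma, digits after it.
cleaned_by_groups = re.sub(r"([0-9]+)\.([0-9]+,[0-9]+)", r"\1\2", text)

print(cleaned_by_groups)  # total 1010,80260 and 2500,10
```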

regex to remove dash from phone numbers

I have lines like this
TEL;TYPE=CELL:343-454-1212
TEL;TYPE=CELL:34345-121212
TEL;TYPE=CELL:123456789
I need to remove dashes from them using VSCode.
So far I came up with:
Search With : (TEL;TYPE=CELL:\d*)-(\d*)
Replace With : $1$2
I have to search and replace multiple times (in this case, twice) to get the expected output, mainly because I do not know how many dashes there are.
Is there a regex I can use to do this in a single pass?

Wrap the digit after the dash in parentheses so you can restore it while removing the -:
-([0-9])
replace with $1
Or
just repeat this find and replace until they are all removed:
find: (TEL;TYPE=CELL:[^-]*)-
replace: $1
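The "repeat until nothing is left" idea can be sketched as a loop. This is an illustration in Python rather than VS Code, using the three sample lines from the question; the character class is written as [^-\n]* (an adaptation for this sketch) so that a match never runs across a line break.

```python
import re

text = (
    "TEL;TYPE=CELL:343-454-1212\n"
    "TEL;TYPE=CELL:34345-121212\n"
    "TEL;TYPE=CELL:123456789"
)

# Each pass removes the first remaining dash on every TEL line.
pattern = re.compile(r"(TEL;TYPE=CELL:[^-\n]*)-")

passes = 0
while True:
    text, replaced = pattern.subn(r"\1", text)
    if replaced == 0:
        break  # no dashes left after a TEL prefix
    passes += 1

print(text)
print(passes)  # 2
```

The number of passes equals the largest number of dashes on any one line, which matches the two manual rounds described in the question.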

I tried this limited solution, which only handles numbers with one or two dashes.
Search with: (\d+)-(\d+)(?:-(\d+))?
Replace with: $1$2$3
The suggestion in the comments to simply replace dashes with an empty string is also a good solution.

Visual Studio Code's search and replace in the current document (not the feature that replaces in files) supports regexes with variable-width lookbehind patterns, so you can use
(?<=TEL;TYPE=CELL:[\d-]*)-
Details:
(?<=TEL;TYPE=CELL:[\d-]*) - a position that is preceded with TEL;TYPE=CELL: and then zero or more digits or hyphens
- - a hyphen.
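Engines without variable-width lookbehind need a different route. Python's built-in re module is one of them, so the sketch below swaps in another technique: match the prefix and the whole number, then strip the dashes in a replacement function. The function name strip_dashes and the NOTE line are hypothetical, chosen only for this example.

```python
import re

def strip_dashes(text: str) -> str:
    """Remove dashes from the number part of TEL;TYPE=CELL: lines only."""
    return re.sub(
        r"(TEL;TYPE=CELL:)([\d-]+)",
        # Keep the prefix as-is and drop every dash from the number.
        lambda m: m.group(1) + m.group(2).replace("-", ""),
        text,
    )

sample = "TEL;TYPE=CELL:343-454-1212\nNOTE:keep-this-dash"
result = strip_dashes(sample)
print(result)
```

This handles any number of dashes in one pass and leaves dashes on other lines untouched.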

When replacing with regexes, how do I append digits to the end of a match group?

This is for Visual Studio code, but I'm not sure how relevant that is.
Here's my regex: cost:(\d\d)
I'm trying to multiply that number by 100. e.g., change cost:25 to cost:2500
But if I try this replacement, it thinks I'm asking for group #100, instead of group #1 and two zeros: cost:$100 changes cost:25 to cost:$100. If I put a space after the two zeros, it replaces it with this: cost:25 00. Is there a way to tell the regex engine where the group name stops?
I've tried replacing with cost:${1}00, but that doesn't work. It changes it to cost:${1}00.

On VS Code 1.51.1, this solution works for Replace (current file) and Replace in Files:
Use $0 to refer to the whole match:
Search: cost:\d\d
Replace: $000
As mentioned in the comment, these two methods only work for Replace (current file) but not Replace in Files:
Use $& to refer to the whole match:
Search: cost:\d\d
Replace: $&00
Use $1 to refer to the first capturing group:
Search: cost:(\d\d)
Replace: cost:$100
For the regular Replace, the replacement string parsing rule is the same as JavaScript. Since we don't have 10 or 100 capturing groups in the regex, the replacement string is parsed as $1 (whatever matched in group 1) followed by literal 00.
However, for Replace in Files, the replacement only works when there is only 1 digit following the $1 (cost:$10). Once there are two or more digits, it's treated as a literal string replacement.
I played around with named capturing groups: while they are recognized in the search field, I can't refer to them in the replacement string. Common syntax such as ${group} and $<group> doesn't work.
After testing a bit more, I found some weird bugs in how Replace in Files is implemented.
This straight-forward method with named capturing group doesn't work:
Search: cost:(?<n>\d\d)
Replace: cost:$<n>00
But we can coax it to work by mixing named capturing group replacement with numbered group replacement:
Search: cost:(?<n>\d\d)()
Replace: cost:$2$<n>00
We can also apply this trick to get the numbered group replacement to work:
Search: cost:(\d\d)()
Replace: cost:$2$100
Seems that there is a bug when the replacement string only includes a single replacement token.
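The same ambiguity exists in other engines, and some of them offer an explicit delimiter for the group number. As an illustration outside VS Code, Python's re module provides \g<1> for group 1 and \g<0> for the whole match, which marks exactly where the group reference ends (a bare \100 would not be read as group 1 followed by two zeros).

```python
import re

# \g<1> ends the group reference before the literal "00".
by_group = re.sub(r"cost:(\d\d)", r"cost:\g<1>00", "cost:25")

# \g<0> refers to the whole match, so no capturing group is needed.
by_whole_match = re.sub(r"cost:\d\d", r"\g<0>00", "cost:25")

print(by_group)        # cost:2500
print(by_whole_match)  # cost:2500
```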

Instead of using groups, why not match the zero-width character using lookbehind?
(?<=cost:\d\d)
replacement:
00
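A quick way to check the zero-width approach is to run it in an engine with fixed-width lookbehind. The sketch below uses Python for that (an assumption for illustration, not VS Code); the second cost value is made up to show several matches in one string.

```python
import re

# The pattern matches the empty position right after "cost:" and two digits,
# so the replacement is inserted there and nothing is removed.
updated = re.sub(r"(?<=cost:\d\d)", "00", "cost:25 cost:30")

print(updated)  # cost:2500 cost:3000
```

Note that the lookbehind does not check what follows, so a value with three digits would get the zeros inserted after its second digit.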

You have run into this bug discussed here: VSCode Regex Find/Replace In Files: can't get a numbered capturing group followed by numbers to work out
and the filed issue: https://github.com/microsoft/vscode/issues/102221
which was unfortunately closed with an "ask on SO" response despite having originated here. Although there are workarounds (also noted in the SO link above), you can still comment on the issue and ask for it to be reopened.
The weird thing is that if you add just one digit, like cost:$10, it works just fine; that is why I consider this a bug.

remove repeated character between words

I am trying out the quiz from Regex 101
In Task 6, the question is
Oh no! It seems my friends spilled beer all over my keyboard last night and my keys are super sticky now. Some of the time when I press a key, I get two duplicates. Can you pppllleaaaseee help me fix this? Content in bold should be removed.
I have tried this regex
([a-z])(\1{2})
But couldn't get the solution.

The solution for the riddle on that website is:
/(.)\1{2}/g
Since any key on the keyboard can get stuck, we need to use . to match any character.
\1 in the regex means match whatever the 1st capturing group (.) matched.
Replacement is $1 or \1.
The rest of your regex is correct; it just has an unnecessary capturing group.

Your regex is correct if you want to match exactly three characters. If you want to match at least three, that is
([a-z])(\1{2,})
or
([a-z])(\1\1+)
Since you don't need to capture anything but the first occurrence, these are slightly better:
([a-z])\1{2} # your original regex (exactly three occurrences)
([a-z])\1{2,}
([a-z])\1\1+
Now, the replacement should be exactly one occurrence of the character, and nothing more:
\1
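The difference between "exactly three" and "at least three" shows up as soon as a key sticks more than three times. A small Python sketch (Python chosen only for illustration; the string heeeeelp is an invented test input):

```python
import re

sticky = "pppllleaaaseee"
very_sticky = "heeeeelp"  # hypothetical input with five repeated letters

# Exactly three: each run of three collapses to one character.
exactly_three = re.sub(r"([a-z])\1{2}", r"\1", sticky)

# On a run of five, only the first three collapse; two extras remain.
leftover = re.sub(r"([a-z])\1{2}", r"\1", very_sticky)

# At least three: the whole run collapses.
at_least_three = re.sub(r"([a-z])\1{2,}", r"\1", very_sticky)

print(exactly_three)   # please
print(leftover)        # heeelp
print(at_least_three)  # help
```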

Replace:
(.)\1+
with:
\1
This of course requires that your regex engine supports backreferences. Also, depending on the regex engine, \1 in the replacement part may have to be written as $1.
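One thing to keep in mind with (.)\1+ is that it collapses any repeated character, including legitimate double letters. A short Python sketch of that trade-off (Python is used here only as a convenient engine with backreferences; hello is an invented example word):

```python
import re

# Every run of two or more identical characters becomes one character.
fixed = re.sub(r"(.)\1+", r"\1", "pppllleaaaseee")
overcorrected = re.sub(r"(.)\1+", r"\1", "hello")

print(fixed)          # please
print(overcorrected)  # helo
```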

I'd do it with (\w)(\1+)? but can't find out how to "remove" within the given site.
The best way would be to replace the contents of the second group with an empty string.

How to replace only part of found text?

I have a file with some comma-separated names and some comma-separated account numbers.
Names will always be something like Dow, John and numbers like 012394,19862.
Using Notepad++'s "Regex Find" feature, I'd like to replace commas between numbers with pipes |.
Basically:
turn:          into:
Dow,John       Dow,John
12345,09876    12345|09876
13568,08642    13568|08642
I've been using [0-9], to find the commas, but I can't get it to properly leave the number's last digit and replace just the comma.
Any ideas?

Search for ([0-9]), and replace it with \1|. Does that work?

use this regex
(\d),(\d)
and replace it with
$1|$2
OR
\1|\2

(?<=\d), should work. Oddly enough, this only works if I use replace all, but not if I use replace single. As an alternative, you can use (\d), and replace with $1|

General thoughts about replacing only part of a match
In order to replace a part of a match, you need to use either 1) capturing groups in the regex pattern and backreferences to the kept group values in the replacement pattern, or 2) lookarounds, or 3) a \K operator to discard left-hand context.
So, if you have a string like a = 10, and you want to replace the number after a = with, say, 500, you can
find (a = )\d+ and replace with \1500 / ${1}500 (if you use $n backreference syntax and it is followed by a digit, you should wrap the group number in braces)
find (?<=a = )\d+ and replace with 500 (since (?<=...) is a non-consuming positive lookbehind pattern, the text it matches is not added to the match value and hence is not replaced)
find a = \K\d+ and replace with 500 (where \K makes the regex engine "forget" the text it has matched up to the \K position, making it similar to the lookbehind solution but allowing any quantifiers, e.g. a\h*=\h*\K\d+ will match the number even if there are zero or more horizontal whitespace characters around =).
Current problem solution
In order to replace any comma in between two digits, you should use lookarounds:
Find What: (?<=\d),(?=\d)
Replace With: |
Details:
(?<=\d) - a positive lookbehind that requires a digit immediately to the left of the current location
, - a comma
(?=\d) - a positive lookahead that requires a digit immediately to the right of the current location.
Variations:
Find What: (\d),(?=\d)
Replace With: \1|
Find What: \d\K,(?=\d)
Replace With: |
Note: if there are comma-separated single digits, e.g. 1,2,3,4, you can't use (\d),(\d), since the second digit is consumed by each match and only every other comma gets replaced.
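The note about single digits is easy to verify. The sketch below uses Python instead of Notepad++ (an assumption for illustration; backreferences are written \1 and \2 there) and compares the consuming pattern with the lookaround pattern on the inputs mentioned above.

```python
import re

digits = "1,2,3,4"

# (\d),(\d) consumes the digit after the comma, so the next match cannot
# start at it and every other comma is skipped.
consuming = re.sub(r"(\d),(\d)", r"\1|\2", digits)

# Lookarounds only check the neighbouring digits without consuming them.
lookaround = re.sub(r"(?<=\d),(?=\d)", "|", digits)

# Commas between letters are left alone.
name = re.sub(r"(?<=\d),(?=\d)", "|", "Dow,John")

print(consuming)   # 1|2,3|4
print(lookaround)  # 1|2|3|4
print(name)        # Dow,John
```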
