regex to remove dash from phone numbers

I have lines like this
TEL;TYPE=CELL:343-454-1212
TEL;TYPE=CELL:34345-121212
TEL;TYPE=CELL:123456789
I need to remove dashes from them using VSCode.
So far I have come up with:
Search With : (TEL;TYPE=CELL:\d*)-(\d*)
Replace With : $1$2
I have to search and replace multiple times (in this case, twice) to get the expected output, mainly because I do not know how many dashes there are.
Is there a regex that does this in a single pass?

Capture the digit that follows each dash in parentheses so you can restore it while removing the dash:
find: -([0-9])
replace: $1
Or just repeat this find and replace until all the dashes are removed:
find: (TEL;TYPE=CELL:[^-]*)-
replace: $1
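Outside the editor, the two suggestions above can be sketched in JavaScript, assuming a Node.js runtime. The function names are hypothetical and exist only for this illustration; the patterns are the ones from the answer.

```javascript
// Hypothetical helper names; the regexes are taken from the answer above.

// Variant 1: capture the digit after each dash and put it back.
// The g flag lets one call remove any number of dashes.
function removeDashesGlobal(line) {
  return line.replace(/-([0-9])/g, "$1");
}

// Variant 2: remove one dash per pass and repeat until the line stops changing.
function removeDashesByRepeating(line) {
  let previous;
  do {
    previous = line;
    line = line.replace(/(TEL;TYPE=CELL:[^-]*)-/, "$1");
  } while (line !== previous);
  return line;
}

console.log(removeDashesGlobal("TEL;TYPE=CELL:343-454-1212"));
console.log(removeDashesByRepeating("TEL;TYPE=CELL:34345-121212"));
```

Note that variant 1 removes a dash before a digit anywhere in the text, not only on TEL lines, while variant 2 only touches lines containing TEL;TYPE=CELL:.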

I tried this limited solution, which only handles numbers with one or two dashes (two or three groups of digits).
Search with: (\d+)-(\d+)(?:-(\d+))?
Replace with: $1$2$3
The suggestion in the comments to simply replace dashes with an empty string is also a good solution.

The Visual Studio Code search-and-replace feature for the current document (not the one that replaces across files) supports regexes with variable-width lookbehind patterns, so you can use
(?<=TEL;TYPE=CELL:[\d-]*)-
Leave the replacement field empty. Details:
(?<=TEL;TYPE=CELL:[\d-]*) - a position that is preceded with TEL;TYPE=CELL: and then zero or more digits or hyphens
- - a hyphen.
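The same pattern can be tried in JavaScript, whose regex engine in current Node.js versions also accepts variable-width lookbehinds. This is only a sketch: the variable names and the NOTE line in the sample are assumptions added for illustration.

```javascript
// Match a dash only when everything before it on the line is
// "TEL;TYPE=CELL:" followed by digits or dashes.
const telDash = /(?<=TEL;TYPE=CELL:[\d-]*)-/g;

const vcard = [
  "TEL;TYPE=CELL:343-454-1212",
  "TEL;TYPE=CELL:34345-121212",
  "TEL;TYPE=CELL:123456789",
  "NOTE:keep-this-dash", // hypothetical line that must stay untouched
].join("\n");

// A single pass removes every dash inside the phone numbers.
const cleanedVcard = vcard.replace(telDash, "");
console.log(cleanedVcard);
```

Because [\d-]* cannot cross a newline or any other character, dashes on lines that do not start the number with TEL;TYPE=CELL: are left alone.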

Related

REGEX - How to find two hyphens in a filename?

I'd like to search for filenames that contain exactly two hyphens. Some filenames have one hyphen; I just want the ones with two hyphens in the name:
THIS: some text - more text - yet more.txt
NOT THIS: some text - more text.txt
The hyphens are always surrounded by a space, FWIW.
I tried using (.*) - (.*) - (.*) and a couple variants, but the results aren't what I am looking for. I either get nothing or filenames with just one hyphen when I try various combinations.
I know this is an obvious one, but I have tried wading through regex tutorials concerning greedy, look aheads, etc. but can't for the life of me solve this. Can anyone help? I'm not looking for just the solution--I'd like to understand what I'm doing wrong in the regex syntax.

You can use this regex:
^[^-]*(?:-[^-]*){2}$
Written in expanded form, it looks like this:
^[^-]*-[^-]*-[^-]*$
This is what you wanted, but I've compacted it by using a quantifier to restrict the number of hyphens to exactly two.
If you want to keep your own regex, just change .* to [^-]*; otherwise .* will match additional hyphens too, leading to unexpected matches:
^([^-]*) - ([^-]*) - ([^-]*)$
Note that you should use the start ^ and end $ anchors so that the whole filename has to match the regex.
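A short JavaScript sketch of the difference, assuming Node.js; the third filename and all variable names are made up for the illustration.

```javascript
// Anchored pattern from the answer: exactly two hyphens in the whole name.
const exactlyTwoHyphens = /^[^-]*(?:-[^-]*){2}$/;

// Unanchored pattern from the question: .* can swallow extra hyphens.
const looseTwoHyphens = /(.*) - (.*) - (.*)/;

const fileNames = [
  "some text - more text - yet more.txt",
  "some text - more text.txt",
  "a - b - c - d.txt", // hypothetical name with three hyphens
];

const strictMatches = fileNames.filter((name) => exactlyTwoHyphens.test(name));
const looseMatches = fileNames.filter((name) => looseTwoHyphens.test(name));

console.log(strictMatches); // only the name with two hyphens
console.log(looseMatches);  // also includes the name with three hyphens
```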

RegEx help for NotePad++

I need help with a regex I just can't figure out. I need to search for broken hashtags that contain a space.
The strings look like this, for example:
#ThisIsaHashtagWith Space
But the words "With Space" can also appear on their own, and those I don't want to replace.
So the important part is that the string starts with "#", followed by any characters and then the words "With Space", which I want to replace with "WithSpace" to repair the hashtag.
I have a document with 10k of these broken hashtags and have been trying all day without success.
I have tried on regex101.com
with following RegEx:
^#+(?:.*?)+(With Space)
Although it seems to work on regex101.com, it doesn't work in Notepad++.
Any help is appreciated.
Thanks a lot.
BR

In your current regex you match one or more #, then any characters, and capture With Space in a capturing group.
You could change the capturing group to capture the first part of the match instead:
(#+.*?)With Space
Then you could use that group in the replacement:
$1WithSpace
As an alternative, you could first match one or more # followed by any character zero or more times, non-greedily (.*?), and then use \K to reset the starting point of the reported match.
Then match With Space:
#+(?:.*?)\KWith Space
In the replacement, use WithSpace.
The + quantifier after # lets the pattern match one or more # characters. If the match should start at the beginning of the line, you could add an anchor ^ at the start of the regex.
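The capturing-group variant can be checked with a few lines of JavaScript (a sketch assuming Node.js; the sample lines and variable names are illustrative). JavaScript's regex engine has no \K, so only the first variant is shown.

```javascript
// Keep everything from the first # up to "With Space" in group 1,
// then write it back followed by the repaired text.
const brokenHashtag = /(#+.*?)With Space/;

const hashtagLines = [
  "#ThisIsaHashtagWith Space",
  "With Space",              // no leading #, must stay unchanged
  "#hello###$aWith Space",
];

const repairedLines = hashtagLines.map((line) =>
  line.replace(brokenHashtag, "$1WithSpace")
);

console.log(repairedLines);
```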

Try using ^(#.+?)(With\s+Space) for your regex, as it also matches multiple spaces and tab characters between the two words. If you want to affect multiple rows, use the gmi flags. I just tried it with the following two strings, each on a separate line in Notepad++:
#blablaWith Space
#hello###$aWith Space
The Replace with value is set to $1WithSpace, and I've tried both Replace All and replacing one by one. Both seem to result in the following:
#blablaWithSpace
#hello###$aWithSpace
Feel free to comment with other strings you want replaced. Also be sure that you have selected the Regular expression search mode in Notepad++.

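Try this: (#.*)( )
I tried this in Notepad++, and you should be able to just Replace All with $1. Make sure you set the search mode to regular expressions first. The same replacement in JavaScript:
// The greedy .* runs to the last space on the line; replacing with $1 drops that space.
const str = "#ThisIsAHashtagWith Space";
console.log(str.replace(/(#.*)( )/g, "$1")); // #ThisIsAHashtagWithSpace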

Regex Select groups not found in a pattern

I have been looking at the various topics on regex on SO, and they all say that to find the inverse (select everything that doesn't fit the criteria) you simply use the [^] syntax or a negative lookahead.
I have tried both of these methods on my regex, but the results are not adequate; the [^] especially seems to take all its contents literally (even when escaped).
What I need this for:
I have a massive single line from an SQL dump, and I'm trying to remove all characters that are not the line id or the numerical value of one column.
My regex matches exactly what I'm looking for; what I need to do is invert this match so I can remove all non-matching parts in my IDE.
My regex:
/(\),\(\d{1,4},)|(,\d{10},)/
This matches a "),(<number of up to 4 digits>," or a ",<number of ten digits>,".
The subject
My subject is a 500Kb line of an SQL dump looking something like this (I have already removed a-z and other unwanted characters in previous simple find/replaces):
),(39,' ',1,'01761472100','#','9 ','20',1237213277,0,1237215419,''),(40,' ',3,'01445731203','#',' ','-','22 2','210410//816',1237225423,0,1484651768,''),(4270,' /
My aim is to use a regex to achieve the following output:
),(39,,1237213277,,1237215419,),(40,,1237225423,,1484651768,),(4270,
Which I can then go over again and easily remove repetitions such as commas.
I have read that negation in regex is tricky. So, what is the syntax to make my regex work inverted, so that all non-matching parts are removed? What can you recommend as a way of solving this without spending hours manually reading the lines?

You may use the really helpful (*SKIP)(?!) construct in PCRE (equivalent to (*SKIP)(*F) or (*SKIP)(*FAIL)) to match the texts you want to keep, skip them, and then match all the other text for removal:
/(?:\),\(\d{1,4},|,\d{10},)(*SKIP)(?!)|./s
Details:
(?:\),\(\d{1,4},|,\d{10},) - match 1 of the 2 alternatives:
    \),\(\d{1,4}, - ),(, then 1 to 4 digits and then ,
    | - or
    ,\d{10}, - a comma, 10 digits, a comma
(*SKIP)(?!) - omit the matched text and proceed to the next match
| - or
. - any char (since /s DOTALL modifier is passed to the regex)
The same can be done with
/(\),\(\d{1,4},|,\d{10},)?./s
and replacing with the $1 backreference (since we need to put back the text captured with the patterns we need to keep).
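The second approach carries over to engines without (*SKIP), for example JavaScript. The sketch below assumes Node.js; the shortened sample string and the variable names are made up for illustration and are not the real dump.

```javascript
// Optional group 1 captures a fragment to keep; the trailing dot consumes
// one more character. Replacing with $1 writes back only the kept fragment.
// The s flag lets the dot match newlines as well.
const keepPattern = /(\),\(\d{1,4},|,\d{10},)?./gs;

// Hypothetical, shortened sample in the shape of the dump from the question.
const dumpSample =
  "),(39,' ',1,'01761472100','#',1237213277,0,1237215419,''),(40,' '";

const stripped = dumpSample.replace(keepPattern, "$1");
console.log(stripped);
```

Two properties of this pattern are worth keeping in mind: the character directly after each kept fragment is always dropped, and a kept fragment at the very end of the input would be lost, because the trailing dot needs one more character to consume.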

Replacing text in Notepad++ with regex

I'm trying to translate a subtitle in Google Translate and everything goes ok with just one problem, it removes the comma , from the times. Well, nice. I pasted it on Notepad++ and tried to replace with regex. The time format is:
00:00:44927 -->
and should be
00:00:44,927 -->
So I tried this regex on the Find what field: :(\d){2}(\d){3}( -->)
And this on the replace with field: :$1,$2 -->
The search works, but the replace results in this: 00:00:47. It seems that $1 stands for a single digit of the first group (\d){2}, that is 4, and $2 for a single digit of the second group (\d){3}, that is 7.
Why?

You need to place the range quantifier {n} inside your capturing groups. By placing it outside, you're telling the regex engine to repeat the whole group n times rather than the token \d inside it, and a repeated group keeps only the digit from its last repetition.
Find: :(\d{2})(\d{3})( -->)
Replace: :$1,$2$3
If you wanted to, you could also use lookaround assertions to achieve this.
Find: :\d\d\K(?=\d\d\d)
Replace: ,
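The effect of the quantifier position can be reproduced in JavaScript, where a repeated capturing group likewise keeps only its last repetition. This is a sketch assuming Node.js, with illustrative variable names.

```javascript
const cue = "00:00:44927 -->";

// Quantifier outside the group: each group is repeated and ends up
// holding only the last digit it matched.
const quantifierOutside = cue.replace(/:(\d){2}(\d){3}( -->)/, ":$1,$2 -->");

// Quantifier inside the group: each group captures all of its digits.
const quantifierInside = cue.replace(/:(\d{2})(\d{3})( -->)/, ":$1,$2$3");

console.log(quantifierOutside);
console.log(quantifierInside);
```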

You can just do
(?<=\d\d)(\d{3})(?= -->)
and replace with ,$1
You were not capturing \d{3} but just \d which is why your regex didn't work as expected.

How to replace only part of found text?

I have a file with some comma-separated names and some comma-separated account numbers.
Names will always be something like Dow, John and numbers like 012394,19862.
Using Notepad++'s "Regex Find" feature, I'd like to replace commas between numbers with pipes |.
Basically :
turn:          into:
Dow,John       Dow,John
12345,09876    12345|09876
13568,08642    13568|08642
I've been using [0-9], to find the commas, but I can't get it to properly leave the number's last digit and replace just the comma.
Any ideas?

Search for ([0-9]), and replace it with \1|. Does that work?

use this regex
(\d),(\d)
and replace it with
$1|$2
OR
\1|\2

(?<=\d), should work. Oddly enough, this only works if I use replace all, but not if I use replace single. As an alternative, you can use (\d), and replace with $1|

General thoughts about replacing only part of a match
In order to replace a part of a match, you need to either 1) use capturing groups in the regex pattern and backreferences to the kept group values in the replacement pattern, or 2) lookarounds, or 3) a \K operator to discard left-hand context.
So, if you have a string like a = 10, and you want to replace the number after a = with, say, 500, you can
find (a = )\d+ and replace with \1500 / ${1}500 (if you use $n backreference syntax and it is followed with a digit, you should wrap it with braces)
find (?<=a = )\d+ and replace with 500 (since (?<=...) is a non-consuming positive lookbehind pattern and the text it matches is not added to the match value, and hence is not replaced)
find a = \K\d+ and replace with 500 (where \K makes the regex engine "forget" the text it has matched up to the \K position, making it similar to the lookbehind solution, but allowing any quantifiers, e.g. a\h*=\h*\K\d+ will match a = even if there are zero or more horizontal whitespace characters around the =).
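The first two options can be sketched in JavaScript, assuming Node.js. The patterns here include the space after = so that they match the sample a = 10; the variable names are illustrative. JavaScript has no \K, so that option is omitted, and a replacer function is used to sidestep the question of how a group reference followed by digits is parsed.

```javascript
const assignment = "a = 10";

// Option 1: capture the part to keep and put it back in the replacement.
const viaGroup = assignment.replace(
  /(a = )\d+/,
  (match, prefix) => prefix + "500"
);

// Option 2: a lookbehind checks the prefix without making it part of the match.
const viaLookbehind = assignment.replace(/(?<=a = )\d+/, "500");

console.log(viaGroup);
console.log(viaLookbehind);
```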
Current problem solution
In order to replace any comma in between two digits, you should use lookarounds:
Find What: (?<=\d),(?=\d)
Replace With: |
Details:
(?<=\d) - a positive lookbehind that requires a digit immediately to the left of the current location
, - a comma
(?=\d) - a positive lookahead that requires a digit immediately to the right of the current location.
Variations:
Find What: (\d),(?=\d)
Replace With: \1|
Find What: \d\K,(?=\d)
Replace With: |
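Note: if there are comma-separated single digits, e.g. 1,2,3,4, you can't use (\d),(\d), since this will only match every other comma: the digit after each matched comma is consumed by the match and cannot serve as the left-hand digit of the next one.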
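The difference between the lookaround pattern and the two-group pattern shows up on single digits. Below is a JavaScript sketch (assuming Node.js; variable names are illustrative) that runs both over the sample lines from the question plus 1,2,3,4.

```javascript
const accountLines = ["Dow,John", "12345,09876", "13568,08642", "1,2,3,4"];

// Lookarounds check the neighbouring digits without consuming them,
// so every comma between two digits is replaced.
const withLookarounds = accountLines.map((line) =>
  line.replace(/(?<=\d),(?=\d)/g, "|")
);

// Two capturing groups consume both digits, so the digit after a matched
// comma cannot start the next match.
const withTwoGroups = accountLines.map((line) =>
  line.replace(/(\d),(\d)/g, "$1|$2")
);

console.log(withLookarounds);
console.log(withTwoGroups);
```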
