Copy multiple lines Notepad++ - regex

How can I duplicate each line 6 times in Notepad++?
For example I have:
a 01
a 02
a 03
And I want to make like this:
a 01
a 01
a 01
a 01
a 01
a 01
a 02
a 02
a 02
a 02
a 02
a 02
a 03
a 03
a 03
a 03
a 03
a 03
I tried running this regex replacement 3 times:
Find what : ^(.*)$
Replace with : $1\n$1
But I got 8 copies of each line, not 6.

You get that result because each run doubles every line, so three runs take each line from 1 -> 2 -> 4 -> 8 copies.
If you want 6 copies, you can either match with ^.*$ (you don't need the capture group, since $0 refers to the whole match) and write out the full repetition in the replacement: $0\n$0\n$0\n$0\n$0\n$0
Or you can run your pattern ^(.*)$ with the replacement $1\n$1 just once, so that every line is doubled.
Then match each pair of identical consecutive lines, using the capture group and the backreference \1, so that $0 refers to both lines of the pair:
^(.*)\R\1
Replace with 3 times the whole match:
$0\n$0\n$0
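To check that both approaches give the same result outside Notepad++, here is a minimal JavaScript sketch. It rests on two assumptions about the syntax: JavaScript writes the whole match as $& instead of $0, and it has no \R, so a plain \n stands in for the line break.

```javascript
// Sample input from the question.
const input = "a 01\na 02\na 03";

// One pass: replace every line with six newline-separated copies of itself.
// "$&" is JavaScript's spelling of the whole match ($0 in Notepad++).
const sixCopies = Array(6).fill("$&").join("\n");
const onePass = input.replace(/^.*$/gm, sixCopies);

// Two passes: double each line, then triple each pair of identical lines.
// "\n" is used here because JavaScript does not support \R.
const doubled = input.replace(/^(.*)$/gm, "$1\n$1");
const twoPass = doubled.replace(/^(.*)\n\1/gm, "$&\n$&\n$&");

console.log(onePass === twoPass);        // true
console.log(onePass.split("\n").length); // 18
```

Either way each of the 3 input lines ends up 6 times in a row, 18 lines in total.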

Related

Regex with Named Capture Group

I am trying to update a regex pattern to include a Named Capture Group. Currently, this regex pattern:
\b\d(?!(?:\d{0,3}([-\/\\.])\d{1,2}\1\d{1,4})\b(?!\S))(?:[^\n\d\$\.\%]*\d){14}\b
correctly returns 4 matches from this sample text:
AAA
43 42 040 012 036 00
43 42 090 037 124 00
53 07 010 005 124 00
06-14 301-830-081-49
BBB
When I revised the pattern to add a Named Capture Group it only returns 3 matches and misses the last one.
(?<myPattern>\b\d(?!(?:\d{0,3}([-\/\\.])\d{1,2}\1\d{1,4})\b(?!\S))(?:[^\n\d\$\.\%]*\d){14}\b)
How can I keep the Named Capture Group but still return 4 matches?
See example here.
Thanks.
Named capturing groups are still assigned numeric IDs. That is, (?<myPattern>[a-z]+)(\d+) contains two groups: the first one, with ID 1, is the "myPattern" group matching one or more lowercase letters, and the second one, with ID 2, is (\d+).
In your case, the problem is the \1 backreference later in the pattern. After you wrapped everything in "myPattern", \1 refers to that named group instead of the separator group ([-\/\\.]), so the lookahead no longer checks what it was written to check.
To fix the issue, replace \1 with \2, the number the separator group has now:
(?<myPattern>\b\d(?!(?:\d{0,3}([-\/\\.])\d{1,2}\2\d{1,4})\b(?!\S))(?:[^\n\d\$\.\%]*\d){14}\b)
See the regex demo.
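A quick way to see the renumbering at work is to run both variants over the sample text. This is only a sketch in JavaScript, which happens to accept the same (?<name>...) syntax; other engines may treat a backreference to a still-open group differently, so the count for the \1 variant is specific to this engine.

```javascript
const sample = [
  "AAA",
  "43 42 040 012 036 00",
  "43 42 090 037 124 00",
  "53 07 010 005 124 00",
  "06-14 301-830-081-49",
  "BBB",
].join("\n");

// \1 now points at the (still open) named group, not at the separator group.
const withBackref1 =
  /(?<myPattern>\b\d(?!(?:\d{0,3}([-\/\\.])\d{1,2}\1\d{1,4})\b(?!\S))(?:[^\n\d\$\.\%]*\d){14}\b)/g;

// \2 points at ([-\/\\.]), which is what \1 meant before the named group was added.
const withBackref2 =
  /(?<myPattern>\b\d(?!(?:\d{0,3}([-\/\\.])\d{1,2}\2\d{1,4})\b(?!\S))(?:[^\n\d\$\.\%]*\d){14}\b)/g;

const broken = [...sample.matchAll(withBackref1)].map((m) => m.groups.myPattern);
const fixed = [...sample.matchAll(withBackref2)].map((m) => m.groups.myPattern);

console.log(broken.length); // 3
console.log(fixed.length);  // 4
console.log(fixed[3]);      // 06-14 301-830-081-49
```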

Regex to match phone numbers, but in text paragraph

I would like to highlight phone numbers in a text with the PHP function preg_replace().
This works pretty well:
(?=(?:\D*\d\D*){8,14}$)[- \d()+]*
It almost matches those different formats:
01 02 03 04 05
0102030405
+33102030405 01-02-03-04-05
01.02.03.04.05
+33 1 02 03 04 053
(33)102030405
But now I would like to make it work on this test text:
blabla 01 02 03 04 05 blabla 0102030405 blabla +33102030405 blabla
01-02-03-04-05 blabla 01.02.03.04.05 blabla +33 1 02 03 04 05 blabla
(33)102030405
I am not fluent in regex. I've tried many things but failed.
Thanks for your help.
You can use
[+(]?\d(?:[-()+\s.]*\d){8,14}(?![-()+\s.]*\d)
Details:
[+(]? - an optional + or (
\d - a digit
(?:[-()+\s.]*\d){8,14} - eight to 14 occurrences of zero or more -, (, ), +, whitespace or . chars followed by a digit
(?![-()+\s.]*\d) - not immediately followed by zero or more -, (, ), +, whitespace or . chars and then a digit.
See the regex demo.
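To try the pattern on the test text from the question, here is a small JavaScript sketch. The <mark> wrapper is just an illustrative choice for the highlighting. With PHP's preg_replace() the same idea would use $0 in the replacement where JavaScript uses $&.

```javascript
const phone = /[+(]?\d(?:[-()+\s.]*\d){8,14}(?![-()+\s.]*\d)/g;

const text =
  "blabla 01 02 03 04 05 blabla 0102030405 blabla +33102030405 blabla\n" +
  "01-02-03-04-05 blabla 01.02.03.04.05 blabla +33 1 02 03 04 05 blabla\n" +
  "(33)102030405";

// All phone-like runs found in the paragraph.
const found = text.match(phone);
console.log(found.length); // 7

// Wrap every match; "$&" inserts the whole match.
const highlighted = text.replace(phone, "<mark>$&</mark>");
console.log(highlighted);
```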

How to remove everything except "01 - 10" pattern and vice-versa?

I want to remove everything including digits, characters, special characters from a file except 01 - 10 pattern. For example:
01 - 10
This is the first one :1
02 - 20
This is the second one -2
03 - 30
This is the third one "3
04 - 40
This is the forth one ;4
05 - 50
This is the fifth one .5
The regex I used is [^\d\d\s-\s\d\d]. The idea was to match everything except the "01 - 10" pattern by putting \d\d\s-\s\d\d inside a negated character set (using the caret ^).
But using this I'm getting:
01 - 10
1
02 - 20
-2
03 - 30
3
04 - 40
4
05 - 50
5
But I want the result to be:
01 - 10
02 - 20
03 - 30
04 - 40
05 - 50
That is, I want to keep only the 01 - 10 pattern lines, without any of the individual :1, -2, "3, ;4 and .5 fragments from the example at the top.
And vice versa, i.e. to select every line except the "01 - 10" pattern lines,
e.g.:
This is the first one :1
This is the second one -2
This is the third one "3
This is the forth one ;4
This is the fifth one .5
I want to know the regex pattern for the 01 - 10 case and for the reverse case, so that I can save the two results in separate files.
You may use this regex:
^(?!\d{2}\h*[:-]\h*\d{2}\h*$).*[\r\n]+
RegEx Demo
RegEx Details:
^: Start
(?!: Start negative lookahead
\d{2}: Match 2 digits
\h*[:-]\h*: Match 0 or more horizontal whitespaces followed by : or - followed by 0 or more horizontal whitespaces
\d{2}: Match 2 digits
\h*: Match 0 or more horizontal whitespaces
$: End of the line
): End negative lookahead
.*: Match anything
[\r\n]: Match 1+ of line breaks
Replacement is an empty string to remove all matching lines
Reverse Removal
To remove digit pair lines you can use:
^(?=\d{2}\h*[:-]\h*\d{2}\h*$).*[\r\n]+
RegEx Demo 2
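Both directions can be tried on the sample from the question with the JavaScript sketch below. JavaScript has no \h, so [ \t] is swapped in as an approximation of horizontal whitespace. A trailing line break is added to the input so that the last line can be removed too.

```javascript
const input = [
  "01 - 10", "This is the first one :1",
  "02 - 20", "This is the second one -2",
  "03 - 30", 'This is the third one "3',
  "04 - 40", "This is the forth one ;4",
  "05 - 50", "This is the fifth one .5",
].join("\n") + "\n";

// Removes every line that is NOT a digit pair such as "01 - 10".
// [ \t] stands in for \h, which JavaScript does not support.
const dropText = /^(?!\d{2}[ \t]*[:-][ \t]*\d{2}[ \t]*$).*[\r\n]+/gm;

// Removes every line that IS a digit pair.
const dropPairs = /^(?=\d{2}[ \t]*[:-][ \t]*\d{2}[ \t]*$).*[\r\n]+/gm;

const pairsOnly = input.replace(dropText, "");
const textOnly = input.replace(dropPairs, "");

console.log(pairsOnly); // the five "NN - NN" lines
console.log(textOnly);  // the five "This is the ..." lines
```

The two resulting strings can then be written to separate files.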
Use this:
^(?!\d\d\s-\s\d\d).*
Demo: https://regex101.com/r/t4qrh0/1

Regex beginner question - Number combination not found

I'm using RegexPal to crosscheck my Regex.
I'm trying to extract phone numbers from text. German Phone numbers typically have one of the following formats:
0 0000 000000
+49 0000 000000
00000 000000
+490000 000000
00000/000000
+490000/000000
0000 - 00 00 00 00
+49000 - 00 00 00 00
0000 - 00000000
+49000 - 00000000
I have constructed the following RegEx to test the phone numbers
/([+]??\d{2}|[0])[\s/-]??\d{3,4}([\s/-]|(\s-\s))??(\d{2}\s??){3,4}/g
The last two layouts get detected, while the two before them do not. Could anyone explain this to me? Specifically, the last space removes the last pair for some reason.
Edit:
00 00 00 00 vs
00000000
with this RegEx:
(\d{2}\s??){3,4}
The last one gets detected, the first one does not.
Edit 2: With (+49|0) I meant +49 OR 0. Replaced for clarity.
Your version with corrections:
(\+\d{2}|\d)[ \/-]?\d{3,4}([ \/-]|( - ))?(\d{2} ?){3,4}
1) Don't use \s. It also matches a newline.
2) One ? is enough.
3) / may need to be escaped as \/ inside [], depending on the engine. Browsers do not require it.
4) No need to use [] around a single symbol.
All your variants:
console.log(`0 0000 000000
00000 000000
00000/000000
0000 - 00 00 00 00
+49000 - 00000000
+49 0000 000000
+490000 000000
+49000 - 00 00 00 00
+49000 - 00000000`.match(/(\+\d{2}|\d)[ \/-]?\d{3,4}([ \/-]|( - ))?(\d{2} ?){3,4}/g))
The reason is the ?? syntax, which says: match if you can, but prefer not to.
A good regex engine takes that literally: it only needs to stop once it is
inside the quantified range, so it never has to match a space because of the ??.
You'll notice that if there is a space after the third repetition, the
engine stops there, because it has met the minimum (3) and it doesn't
want to match that space.
You can see it in this example, where (\d{2}\s??){3,4} matches only the first three pairs of
00 00 00 00
or only the leading 000000 of
000000 00
And the reason it matches all of 00000000 is that there is no space
just before the last 00.
It would match 00 00 0000 for that very reason too.
This ?? preference for not matching almost always results in no
match when it is the last sub-expression in a regex.
So the engine really sees \d{2}\s?? as the sub-expression that is
quantified. It matches only 3 times because ?? lets it stop
after seeing the space before the last 00 in 000000 00, having met the minimum of
3 in {3,4}.
Steer clear of this if possible.
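The behaviour described above is easy to reproduce. The following JavaScript sketch compares the lazy \s?? from the question with a greedy \s? on the same strings.

```javascript
const lazy = /(\d{2}\s??){3,4}/;   // pattern from the question
const greedy = /(\d{2}\s?){3,4}/;  // same pattern with a greedy optional space

// Lazy: the third repetition declines the space, the fourth cannot start on a
// space, and three repetitions already satisfy {3,4}.
console.log("00 00 00 00".match(lazy)[0]);   // "00 00 00"
console.log("000000 00".match(lazy)[0]);     // "000000"

// No space right before the last pair, so the fourth repetition succeeds.
console.log("00000000".match(lazy)[0]);      // "00000000"
console.log("00 00 0000".match(lazy)[0]);    // "00 00 0000"

// Greedy: the third repetition takes the space, so the fourth pair is reached.
console.log("00 00 00 00".match(greedy)[0]); // "00 00 00 00"
```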
You could use the following regular expression for verifying the telephone numbers.
(?m)^(?:\(\+49\|\d\)(?: ?\d{4} \d{6}|\d{3} - (?:\d{2}(?: \d{2}){3}|\d{8}))|\d{5}\/\d{6})$
Demo
I used the PCRE regex engine for testing. There's nothing fancy about the regex, so it should work with most engines.
The regex engine performs the following operations. (I've put each space in a character class to make them more apparent.)
(?m) multiline mode
^ match beginning of line
(?: begin non-capture group
\(\+49\|\d\) match '(+49|', 1 digit, ')'
(?: begin non-capture group
[ ]?\d{4}[ ]\d{6} match optional ' ', 4 digits, ' ', 6 digits
| or
\d{3}[ ]-[ ] match 3 digits, ' - '
(?: begin non-capture group
\d{2} match 2 digits
(?:[ ]\d{2}) match ' ', 2 digits in non-capture group
{3} execute non-capture group 3 times
| or
\d{8} match 8 digits
) end non-capture group
) end non-capture group
| or
\d{5}\/\d{6} match 5 digits, '/', 6 digits
) end non-capture group
$ match end-of-line

Regex capture consecutive numbers in a pattern

Trying to extract the numbers from a string of pattern:
<Some Alphanumeric> <numbers> X <numbers> <Some Alphanumeric>
e.g.
I 00 Crazy 060 X 0140 08 Dance 47
should extract the numbers 060 and 0140 and the text I 00 Crazy and 08 Dance 47
I'm using the following Regex:
(.*)(\d{1,3})\s*(x|X)\s*(\d{1,4})(.*)
However this isn't working on the first number preceding the X, it's only capturing 0 instead of 060 but captures the second number 0140 correctly.
\d{1,3} should be a greedy capture of digits between 1 and 3 in length - so what am I missing here?
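This should work:
(.*)\b(\d{1,3})\s*(x|X)\s*(\d{1,4})(.*)
Here, \b asserts position at a word boundary (^\w|\w$|\W\w|\w\W). Without it, the greedy (.*) at the start takes as many characters as it can, including the leading 06, and leaves \d{1,3} only the single digit it needs in order to match. \d{1,3} is greedy too, but the (.*) before it gets first pick.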
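The difference between the two patterns shows up directly in the captured groups. Here is a small JavaScript sketch using the example string from the question.

```javascript
const input = "I 00 Crazy 060 X 0140 08 Dance 47";

// Original pattern: the greedy (.*) swallows "06" and leaves a single digit.
const without = input.match(/(.*)(\d{1,3})\s*(x|X)\s*(\d{1,4})(.*)/);
console.log(without[1]); // "I 00 Crazy 06"
console.log(without[2]); // "0"

// With \b the number must start at a word boundary, so (.*) has to back off
// to just before "060".
const withBoundary = input.match(/(.*)\b(\d{1,3})\s*(x|X)\s*(\d{1,4})(.*)/);
console.log(withBoundary[1]); // "I 00 Crazy "
console.log(withBoundary[2]); // "060"
console.log(withBoundary[4]); // "0140"
console.log(withBoundary[5]); // " 08 Dance 47"
```

The surrounding text groups keep their adjacent spaces, so a trim() on groups 1 and 5 may be wanted afterwards.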