While replacing using regex, how to keep a part of the matched string?

I have
12.hello.mp3
21.true.mp3
35.good.mp3
.
.
.
and so on, as file names listed in a text file.
I need to replace only the dot (.) that follows the leading number with a space (e.g. 12.hello.mp3 => 12 hello.mp3).
If I use the regex "[0-9].", it replaces the number as well.
Please help me.

Replace
^(\d+)\.(.*mp3)$
with
\1 \2
Also, recent versions of Notepad++ accept the following, which is also accepted by other IDEs/editors (e.g. JetBrains products like IntelliJ IDEA):
$1 $2
This assumes that the Notepad++ regex engine supports groups. What the regex basically means is: match the digits in front of the first dot as group 1 and everything after that dot as group 2 (but only if the line ends with mp3).
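For readers who want to try the same replacement outside an editor, here is a minimal Python sketch of it. The helper name dot_to_space is made up for illustration, and the MULTILINE flag is an assumption so that ^ and $ apply to each line of the list, as they do in the editor:

```python
import re

# Same pattern as in the answer; MULTILINE makes ^ and $ work per line.
PATTERN = re.compile(r"^(\d+)\.(.*mp3)$", re.MULTILINE)

def dot_to_space(text: str) -> str:
    # \1 is the leading number, \2 is the rest of the file name.
    # Only the dot between the two groups is dropped.
    return PATTERN.sub(r"\1 \2", text)

names = "12.hello.mp3\n21.true.mp3\n35.good.mp3"
print(dot_to_space(names))
```

Lines that do not start with digits followed by a dot, or that do not end in mp3, are left untouched.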

I tested with VS Code. You must use groups, written with parentheses (regex capture groups).
Practical example
Start with sample data:
1 a text
2 another text
3 yet more text
Use a regex to find a digit followed by whitespace. The group here is the digit, since it is surrounded by parentheses:
(\d)\s
Run a regex replace. It replaces the whitespace with a dash but keeps the digit on each line:
$1-
Output:
1-a text
2-another text
3-yet more text
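The same find/replace can be reproduced as a short Python sketch. Note that Python's re module writes the backreference as \1 in the replacement rather than $1; the variable names are only illustrative:

```python
import re

sample = "1 a text\n2 another text\n3 yet more text"

# (\d) captures the digit, \s matches the whitespace after it.
# The replacement puts the captured digit back and adds a dash.
result = re.sub(r"(\d)\s", r"\1-", sample)
print(result)
```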

Using the basic pattern, well described in the accepted answer, here is an example that adds class="odd" and class="even" to alternating <tr> elements in Notepad++ or any other regex-compatible editor:
Find what: (<tr><td>)(.*?\r\n)(<tr><td>)(.*?\r\n)
Replace with: <tr class="odd"><td>\2<tr class="even"><td>\4
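As a rough check of how this pair-wise pattern behaves, here is a small Python sketch. The table rows are invented sample data, and it assumes Windows line endings (\r\n) and an even number of rows, since each match consumes two rows at a time:

```python
import re

# Invented sample rows with Windows line endings.
html = (
    "<tr><td>a</td></tr>\r\n"
    "<tr><td>b</td></tr>\r\n"
    "<tr><td>c</td></tr>\r\n"
    "<tr><td>d</td></tr>\r\n"
)

# Groups 2 and 4 hold the remainder of the first and second row of each pair.
pair = re.compile(r"(<tr><td>)(.*?\r\n)(<tr><td>)(.*?\r\n)")
striped = pair.sub(r'<tr class="odd"><td>\2<tr class="even"><td>\4', html)
print(striped)
```

With an odd number of rows, the last row is not part of any pair and stays unchanged.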

Related

REGEX - How can I select/mark 3 words delimited by tabs on consecutive lines?

Happy New Year !
I have a problem. I don't know how to mark/select some words delimited by tabs on consecutive lines: Recent, Coments and Tags.
Please see this screenshot:
I can easily use the | sign, like Recent|Comments|Tags, but this will select every occurrence of those words in the file, and I want only those 3 on those lines.
What I want is to make a regex, to remove all text before those 3 words, and another regex to remove everything after those 3 words.
I tried something like ((?s)((^.*)^.*Recente.*$|^.*Coments.*$|^.*Tags.*^))(.*$) but it does not work very well. I also have to pay attention, because those words can be repeated in the text files, so I have to select/mark exactly those 3, on those 3 consecutive lines (which don't have any other words on them).
Since you mentioned in a comment that you want to do this in Notepad++ (a fact that should have been mentioned in the question text), and since the screenshot shows a single space after the first two words, you might try this regular expression:
.*\n([ \t]+Recente\s+Coments\s+Tags).*
It will match everything, but capture the 3 words, including the whitespace between them and the whitespace preceding the first word on the same line.
If you then replace with $1, everything not in the capture group will be removed.
Actually, the spaces after the first two words don't matter to this regex.
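To see the effect without an editor, here is a Python sketch of the same idea. It assumes that "." is allowed to match newlines (the DOTALL flag here, which would correspond to an equivalent option in the editor), and the surrounding text is invented sample data:

```python
import re

# Invented file content; the three words sit on consecutive, tab-indented lines.
text = (
    "header line\n"
    "some menu text\n"
    "\tRecente \n"
    "\tComents \n"
    "\tTags\n"
    "footer text\n"
)

# DOTALL lets the leading and trailing .* swallow everything around the capture.
pattern = re.compile(r".*\n([ \t]+Recente\s+Coments\s+Tags).*", re.DOTALL)
kept = pattern.sub(r"\1", text)
print(repr(kept))
```

Only the captured block survives the replacement; everything before and after it is dropped.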
Could you please try this in perl:
perl -0777 -ne 'while(m/((\s|\t)+)Recent\n\1Comments\n\1Tags/g){print "$&\n";}' /path/to/file
To breakdown:
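Start with 1 or more whitespace characters, tabs included (first capture group)
Then "Recent" followed by a new line
The same text as the first capture group (\1), then "Comments" and a new line
The same text as the first capture group (\1), then "Tags"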
By the way, is "tab" really tab or multiple consecutive whitespaces (\s+)?
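The breakdown above can also be sketched in Python. This version narrows the indentation group to [ \t]+ (spaces or tabs only), which is an assumption on my part and not exactly what the perl one-liner uses; the sample text is invented:

```python
import re

# \1 forces the second and third line to start with the same indentation
# that was captured in front of "Recent".
block = re.compile(r"([ \t]+)Recent\n\1Comments\n\1Tags")

# Invented sample: the words also appear elsewhere and must not be matched there.
text = "Recent posts\n\tRecent\n\tComments\n\tTags\nTags: none\n"

matches = [m.group(0) for m in block.finditer(text)]
print(matches)
```

Only the indented three-line block is reported; "Recent posts" and "Tags: none" are ignored.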

How to format the content of the file in notepad++ using regular expression?

I have a file abc.txt which has data like this when I opened up in notepad++
10.114.128.196, 10.149.53.72, 40.169.74.47
Is there any way I can make it like this using regular expressions in notepad++?
10.114.128.196,abc
10.149.53.72,abc
40.169.74.47,abc
Search for
(\d{1,3}(\.\d{1,3}){3})[,\s]*
and replace with
$1,abc\n
(\d{1,3}(\.\d{1,3}){3}) matches 1 to 3 digits followed by 3 more such groups, each starting with a ".". Because of the round brackets around it, the matched text is stored in capturing group 1, and you can reuse it in the replacement by inserting $1.
[,\s]* matches zero or more commas and whitespace characters.
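A quick way to check this search/replace pair is a Python sketch like the one below. Python writes the backreference as \1 instead of $1, and the variable names are only illustrative:

```python
import re

data = "10.114.128.196, 10.149.53.72, 40.169.74.47"

# Group 1 is the address; the trailing commas/whitespace are matched but not kept.
ip_and_separator = re.compile(r"(\d{1,3}(\.\d{1,3}){3})[,\s]*")
formatted = ip_and_separator.sub(r"\1,abc\n", data)
print(formatted, end="")
```

Because [,\s]* may match nothing, the last address (which has no comma after it) is handled the same way as the others.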
Global replace ", " with ",abc\n"?
In the search field put: ((\d+\.?){4}(.)( ?))
In the replace field put: $1abc\r\n
The last line will not have a comma, so I think it is ok to have just this one to fix ;)

How can I replace this data in between certain delimiter with Notepad ++?

I have a list of data in this format
0000000000000000|000|000|00000|000000|CITY|GA|123456|8001234567
I need to replace the last piece of data with the word N/A so there is no phone number in the list.
0000000000000000|000|000|00000|000000|CITY|GA|123456|N/A
Thank you for the assistance, much appreciated.
The simplest and fastest solution for that would be to search for
[^|\r\n]+$
and replace all matches with N/A.
Explanation:
[^|\r\n]+ matches one or more characters except | or newlines, and $ makes sure that the match only occurs at the end of a line.
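The same replacement can be tried in Python as a small sketch. The MULTILINE flag is an assumption so that $ matches at the end of every line, and the second record is invented to show the per-line behaviour:

```python
import re

# First record is from the question; the second one is invented.
records = (
    "0000000000000000|000|000|00000|000000|CITY|GA|123456|8001234567\n"
    "1111111111111111|111|111|11111|111111|TOWN|FL|654321|800-555-0100"
)

# One or more characters that are not | or a line break, right before end of line.
last_field = re.compile(r"[^|\r\n]+$", re.MULTILINE)
cleaned = last_field.sub("N/A", records)
print(cleaned)
```

Since the character class excludes only | and line breaks, phone numbers with hyphens or spaces are replaced as well.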
Do a find/replace, with the mode set to "Regular expression".
Find:
(.*)\|[0-9]*
Replace:
\1|N/A
If your phone numbers contain any non-numeric characters (such as periods, hyphens, spaces, etc.), then I would recommend the following adjustment to the regex given by @Bitwise:
(.*)\|(.*)$
Also, in Notepad++, the backreference can be written not only as
\1
but also as
$1
which means your replace string can equally be
$1|N/A
You can use
(?!.*\|)(.+)
to match the last field, i.e. everything after the final | on the line.
In Notepad++ you can use the search and replace (regex) function.

Remove dashes surrounded by numbers on both sides

I'm trying to search and replace using regex in TextWrangler (https://gist.github.com/ccstone/5385334, http://www.barebones.com/products/textwrangler/textwranglerpower.html)
I have rows like this
56-84 29 STRINGOFLETTERS -2.54
I'd like to replace the dash in "56-84" with a tab, so I get
56 84 29 STRINGOFLETTERS -2.54
But without replacing the dash in "-2.54"
How do I specifically only remove dashes surrounded by numbers on both sides?
My regex knowledge is extremely small. I tried to find [0-9]-[0-9] and replace with [0-9][0-9], but that didn't work.
Your link says "The PCRE engine (Perl Compatible Regular Expressions) is what BBEdit and TextWrangler use". So hopefully you can use lookaround with your regex.
replace regex:
(?<=\d)-(?=\d)
Replace with a tab (\t).
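Here is a minimal Python sketch of the lookaround replacement, using the row from the question. The variable names are only illustrative:

```python
import re

row = "56-84 29 STRINGOFLETTERS -2.54"

# The lookbehind and lookahead require a digit on each side of the dash
# but do not consume those digits, so only the dash itself is replaced.
fixed = re.sub(r"(?<=\d)-(?=\d)", "\t", row)
print(repr(fixed))
```

The dash in "-2.54" is preceded by a space, not a digit, so it is left alone.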
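If it's plain text, I'm not sure you need TextWrangler. You can just use the Unix "sed" command (sed has no \d, so use [0-9], and capture the digits so they can be put back; \t in the replacement is understood by GNU sed):
$ sed -E 's/([0-9])-([0-9])/\1\t\2/g' a.txt > b.txt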
You actually need to capture the numbers you want. So the regex would be:
^([0-9]+)-([0-9]+)
I'm assuming here that the numbers start at the beginning of the line. If not, you can remove the ^.
Based on your link, the flavor of regex is PCRE, so backreferences look like \1, and \2 in the replacement pattern. So your replacement pattern simply becomes:
\1\t\2
Here \1 refers to the first group (so the first number) and \2 refers to the second group (so the second number).
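A short Python sketch of the capture-group approach, written without the ^ anchor, next to the lookaround pattern (?<=\d)-(?=\d) from the earlier answer. The string "1-2-3" is an invented edge case added to show where the two differ:

```python
import re

row = "56-84 29 STRINGOFLETTERS -2.54"

# Capture the digits on both sides and write them back around a tab.
with_groups = re.sub(r"([0-9]+)-([0-9]+)", r"\1\t\2", row)

# Invented edge case: chained dashes. The capture version consumes the "2",
# so the second dash no longer has a digit available in front of it.
chained_groups = re.sub(r"([0-9]+)-([0-9]+)", r"\1\t\2", "1-2-3")
chained_lookaround = re.sub(r"(?<=\d)-(?=\d)", "\t", "1-2-3")

print(repr(with_groups), repr(chained_groups), repr(chained_lookaround))
```

For rows like the one in the question both approaches give the same result; they only diverge when one number sits between two dashes.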

How to find and replace contents of a bracket inside notepad++

I have a large file with content inside every bracket. This is not at the beginning of the line.
1. Atmos-phere (7800)
2. Atmospheric composition (90100)
3.Air quality (10110)
4. Atmospheric chemistry and composition (889s120)
5.Atmospheric particulates (10678130)
I need to do the following
Replace the entire content, get rid of line numbers
1.Atmosphere (10000) to plain Atmosphere
Delete the line numbers as well
1.Atmosphere (10000) to plain Atmosphere
make it a hyperlink
1.Atmosphere (10000) to plain linky study
[I added/Edit] Extract the words into a new file, where we get a simple list of key words. Can you also please explain the numbers in replace the \1\2, and escape on some characters
Each set of key words is a new line
Atmospheric
Atmospheric composition
Air quality
Each set is on one line, separated by a comma and one space
Atmospheric, Atmospheric composition, Air quality
I tried a regex find like \(*\); it finds the brackets, but I don't know how to replace this, where to put the replacement, or what variable holds the replacement value.
Here is my expression for Notepad++: ([0-9(). ]*)(.*)(\s\()(.*)
You need to split your search into groups:
([0-9. ]*) any combination of digits, spaces and dots, 0 or more times
(.*) everything till next expression
(\s\() space and opening parenthesis
(.*) everything else
In the replace box, for practice, you can put:
\1\2\3\4 this changes nothing :) it just prints all the groups above, from group 1 to group 4
\2 this way you get only group 2
new_thing\2new_thing adds your text before and after the group
<a href=blah.com/\2.html>linky study</a> so now your text is added. Spaces between words can be problematic when creating a link, so another expression needs to be made to replace all spaces in the link with e.g. _
If you need to add a backslash as text (or another special character used by regex), it must be escaped, so you put \\ for a backslash or \$ for a dollar sign
Want more tuning? <a href=blah.com/\2.html>\2</a> adds group 2 again, or use whichever group you want
On the screenshot you can see how I use it (I had found and replaced one line)
OK, and then we have case 4.2 with a comma at the end, so simply add a comma after the extracted section:
change the replacement from \2 to \2,
Now you need to join the lines, so the simplest way is Edit->Line Operations->Join Lines
but if you want to be a real pro, switch to Extended mode (just above Regular expression mode in the Replace window), find \r\n and replace it with a space.
Removing line endings can differ in some cases, but that is another story. For now I assume that you are using Windows, since Notepad++ is a Windows tool and the line endings are in Windows style :)
The following regex should do the job: \d+\.\s*(.*?)\s*\(.*?\)
And the replacement: <a href=example.com\\\1.htm>\1</a>
Explanation:
\d+ : Match a digit 1 or more times.
\. : Match a dot.
\s* : Match spaces 0 or more times.
(.*?) : Group and match everything until ( found.
\s* : Match spaces 0 or more times.
\(.*?\) : Match parenthesis and what's between it.
The replacement part is simple since \1 is referring to the matching group.
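As a sketch of how this pattern could be used for the keyword-list part of the question, here is a small Python example. It replaces each entry with just \1 instead of building a link, and joining with ", " is my reading of the requested output; the names ENTRY, keywords and joined are hypothetical:

```python
import re

# Same search pattern as above (without the sentence-ending period).
ENTRY = re.compile(r"\d+\.\s*(.*?)\s*\(.*?\)")

# Lines taken from the question's sample data.
lines = [
    "2. Atmospheric composition (90100)",
    "3.Air quality (10110)",
    "5.Atmospheric particulates (10678130)",
]

# Replacing the whole match with \1 keeps only the words.
keywords = [ENTRY.sub(r"\1", line) for line in lines]
joined = ", ".join(keywords)

print("\n".join(keywords))
print(joined)
```

The first print gives one keyword set per line, the second gives a single comma-separated line.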
Try replacing ^\d+\.(.*) \(\w+\)$ with <a href=blah.com\\\1.htm>linky study</a>.
The ^\d+\. matches the leading number and dot, so they are removed. The (.*) collects the words. Then there is a single space. The \(\w+\)$ matches the final number in brackets.
Update for the added Q4.
Regular expressions capture things written between round brackets ( and ). Brackets that are to be found in the text being searched must be escaped as \( and \). In the replacement expression the \1 and \2 etc. are replaced by the text of the corresponding capture. So a search expression such as Z(\d+)X([aeiou]+)Y might match Z29XeieiY, and then the replacement expression P\2Q\1R would insert PeieiQ29R. In the search at the top of this answer there is one capture: the (.*) captures or collects the words, and then the \1 inserts the captured words into the replacement text.
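The Z/X/Y example can be checked directly with a few lines of Python; the variable names are only illustrative:

```python
import re

# Group 1 is the number, group 2 is the run of vowels.
search = r"Z(\d+)X([aeiou]+)Y"

# The replacement swaps the order of the two captures.
swapped = re.sub(search, r"P\2Q\1R", "Z29XeieiY")
print(swapped)
```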