finding pattern between two "CERRADO}" strings using negative look-ahead - regex

I have a text file containing lines like these:
CERRADO}165856}TICKET}DESCRIPTION}some random text here\r\n
other random text here}158277747\r\n
CERRADO}165856}TICKET}FR2CODE}more random text also here}1587269339\r\n
My ultimate goal is to concatenate those lines not beginnning with "CERRADO}" string with their preceding line. There might be an arbitrary number of lines not beginning with that string on the file. This is the end result:
CERRADO}165856}TICKET}DESCRIPTION}some random text here other random text here}158277747\r\n
CERRADO}165856}TICKET}FR2CODE}more random text also here}1587269339\r\n
My first attempt was to create a simple regex to match those lines.
CERRADO\}.+\r\n(?!CERRADO\})(.+\r\n)+
After having that regex right, to create a matching group and replace it getting rid of the \r\n patterns, here is what I have so far:
The proposed regex matches all the lines in the file and not just the wanted ones.
Any ideas would be appreciated

You may use
\R(?!CERRADO\})
and replace with a space.
The regex matches:
\R - a line break sequence that is...
(?!CERRADO\}) - not followed with CERRADO}.
Or,
^(CERRADO\}.*)\R(?!CERRADO\})
and replace with \1 . This regex matches:
^ - start of a line
(CERRADO\}.*) - Capturing group 1 (later referred to with \1 backreference from the replacement pattern): CERRADO} substring and then the rest of the line
\R - a line break sequence
(?!CERRADO\}) - not followed with CERRADO}.
To make multiple replacements with this one, you will need to hit Replace All several times.

Related

vscode regex get lines with duplicate strings separated by comma

I have a vscode file with the following text:
"070230107121","46969","petcarerx","petcarerx"
"070230107121","46970","petcarerx","petcarerx"
"070230107121","47332","petcarerx","petcarerx"
"070230107121","47333","petcarerx","petcarerx"
"070230107121","47333","petcarerx","petcarerx"
"070230107121","46968","petcarerx","petcarerx"
"07087","46968","petcarerx","petcarerx"
"07087","46968","petcarerx","petcarerx"
If I do ctrl+f regex expression ^(.*)(\n\1)+$ it will find the identical lines, so in this case it finds two cases of identical lines:
I am trying to create a regular expression to find all lines where the first column is identical. so in this case; find all rows where the string that comes before the first comma is identical.
This regex expression gets everything before the first comma; ^(.+?),, is there someway I can combine that with my first regex expression to get all lines that are identical before the first comma?
You may use
^(.*?),.*(?:\n\1,.*)+$
Details
^ - start of a line
(.*?) - Capturing group 1 (\1 inline backreference can refer to it from the regex pattern, $1 if you need to refer to it from the replacement pattern)
, - a comma
.* - the rest of the line
(?:\n\1,.*)+ - 1 or more repetitions of a line break, then the same value as in Group 1 and then a comma and the rest of the line
$ - end of a line.
See the regex demo online.
Tested in VS Code:

Eclipse Add text to first line of all files

I need to add text to first line of all my JSP's in eclipse, this is the regex I a using \A.* but some how it selects the first line, I just want to prepend text to the start of the file. any help will be very much appreciated.
The .* pattern matches any 0+ chars other than line break characters, so it matches the first line.
It seems that Eclipse Find/Replace regex feature does not match entirely zero-width patterns (e.g. (?=,) will not find and insert a text before commas).
A workaround is to match and capture some text with (...) (where ... stand for a consuming pattern) capturing group and use $1 in the replacement pattern to reinsert the matched text.
Use
\A(.*)
Replace with MY_NEW_TEXT_HERE_AT_THE_START_OF_FILE$1.

Add to end of line that contains a specific word and starts with x

I would like to add some custom text to the end of all lines in my document opened in Notepad++ that start with 10 and contain a specific word (for example "frog").
So far, I managed to solve the first part.
Search: ^(10)$
Replace: \1;Batteries (to add ;Batteries to the end of the line)
What I need now is to edit this regex pattern to recognize only those lines that also contain a specific word.
For example:
Before: 1050;There is this frog in the lake
After: 1050;There is this frog in the lake;Batteries
You can use the regex to match your wanted lines:
(^(10).*?(frog).*)
the .*? is a lazy quantifier to get the minimum until frog
and replace by :
$1;Battery
Hope it helps,
You should allow any characters between the number and the end of line:
^10.*frog.*
And replacement will be $0;Batteries. You do not even need a $ anchor as .* matches till the end of a line since . matches any character but a line break char.
NOTE: There is no need to wrap the whole pattern with capturing parentheses, the $0 placeholder refers to the whole match value.
More details:
^ - start of a line
10 - a literal 10 text
.* - zero or more chars other than line break chars as many as possible
frog - a literal string
.* - zero or more chars other than line break chars as many as possible
try this
find with: (^(10).*(frog).*)
replace with: $1;Battery
Use ^(10.*frog.*)$ as regex. Replace it with something like $1;Batteries

Regular Expressions in Notepad++ starting with text and ending with incremental number

I'm trying to understand this as I'm reading tutorials and apply this to what I'm doing.
I have a file with lines of text like:
line1blahblahblahblah
line2blahblahblahblah
...
line10blahblahblahblah
I want to go in and remove the line and the number after it (which is incremented 1-1000 for each line) and replace it with new text leaving all the text after in tact.
Can someone explain how and explain the regex expression?
Search for
^line\d+
And replace with an empty string.
Explanation: The ^ matches the begining of the line, the line matches a literal character sequence, and the \d matches any digit character. The + after the \d makes it match one or more digits characters.
Your Notepad++ search panel should look like this:

How to find and replace contents of a bracket inside notepad++

I have a large file with content inside every bracket. This is not at the beginning of the line.
1. Atmos-phere (7800)
2. Atmospheric composition (90100)
3.Air quality (10110)
4. Atmospheric chemistry and composition (889s120)
5.Atmospheric particulates (10678130)
I need to do the following
Replace the entire content, get rid of line numbers
1.Atmosphere (10000) to plain Atmosphere
Delete the line numbers as well
1.Atmosphere (10000) to plain Atmosphere
make it a hyperlink
1.Atmosphere (10000) to plain linky study
[I added/Edit] Extract the words into a new file, where we get a simple list of key words. Can you also please explain the numbers in replace the \1\2, and escape on some characters
Each set of key words is a new line
Atmospheric
Atmospheric composition
Air quality
Each set is a on one line separated by one space and commas
Atmospheric, Atmospheric composition, Air quality
I tried find with regex like so, \(*\) it finds the brackets, but dont know how to replace this, and where to put the replace, and what variable holds the replacement value.
Here is mine exression for notepad ([0-9(). ]*)(.*)(\s\()(.*)
You need split your search in groups
([0-9. ]*) numbers, spaces and dots combination in 0 or more times
(.*) everything till next expression
(\s\() space and opening parenthesis
(.*) everything else
In replace box - for practicing if you place
\1\2\3\4 this do nothing :) just print all groups from above from 1.1 to 1.4
\2 this way you get only 1.2 group
new_thing\2new_thing adds your text before and after group
<a href=blah.com/\2.html>linky study</a> so now your text is added - spaces between words can be problematic when creating link - so another expression need to be made to replace all spaces in link to i.e. _
If you need add backslash as text (or other special sign used by regex) it must be escaped so you put \\ for backslash or \$ for dolar sign
Want more tune - <a href=blah.com/\2.html>\2</a> add again 1.2 group - or use whichever you want
On the screenshot you can see how I use it (I had found and replaced one line)
Ok and then we have case 4.2 with colon at the end so simply add colon after extracted section:
change replace from \2 to \2,
Now you need join it so simplest way is to Edit->Line Operations->Join Lines
but if you want to be real pro switch to Extended mode (just above Regular expression mode in Replace window) and Find \r\n and replace with space.
Removing line endings can differ in some cases but this is another story - for now I assume that you using windows since Notepad++ is windows tool and line endings are in windows style :)
The following regex should do the job: \d+\.\s*(.*?)\s*\(.*?\).
And the replacement: <a href=example.com\\\1.htm>\1</a>.
Explanation:
\d+ : Match a digit 0 or more times.
\. : Match a dot.
\s* : Match spaces 0 or more times.
(.*?) : Group and match everything until ( found.
\s* : Match spaces 0 or more times.
\(.*?\) : Match parenthesis and what's between it.
The replacement part is simple since \1 is referring to the matching group.
Online demo.
Try replacing ^\d+\.(.*) \(\w+\)$ with <a href=blah.com\\\1.htm>linky study</a>.
The ^\d+. removes the leading number and dot. The (.*) collects the words. Then there is a single space. The \(\w+\)$ matches the final number in brackets.
Update for the added Q4.
Regular expressions capture things written between round brackets ( and ). Brackets that are to be found in the text being searched must be escaped as \( and \). In the replacement expression the \1 and \2 etc are replaced by the corresponding capture expression. So a search expression such as Z(\d+)X([aeiou]+)Y might match Z29XeieiY then the replacement expression P\2Q\1R would insert PeieiQ29R. In the search at the top of this answer there is one capture, the (.) captures or collects the words and then the \1 inserts the captured words into the replacement text.