RegEx Notepad++ - regex

I have a series of lines in the following format:
LINE: 5190 STNO: 22669 SI: VOICE
CCT LINE STNO SI BUS TYPE
003 6269 OPTI ONLY
MULTLINE 8. . . . . . . . . . . . . . .
001 SUBUNIT . . . . . DIGITE MAIN DEFIL/TRS
(ALT_ROUT: N) (OPTIIP )
LINE: 5291 STNO: 29956 SI: VOICE
What I need to find through regex (Notepad++) are the numbers just after "STNO:".
There are approximately 100 such matches.
I tried STNO:\s+\d{4,5}, but the match also includes STNO:, which I do not want. Please help.
I need to keep only the matched numbers: either delete everything else, or copy the matched items to a new file, whichever is easier.

I suggest a two step approach. First get all the lines with the STNO and a number. Second remove everything except the number.
Select the Mark tab in the find dialogue. Ensure Bookmark line is ticked. In the Find what box enter STNO:\s*\d+ and then click Mark all.
Access Menu => Search => Bookmark => Copy bookmarked lines. Then paste into another buffer. Alternatively, to work in the same file use Menu => Search => Bookmark => Remove unmarked lines. Now you should have all the wanted lines in a buffer.
Do a regular expression search and replace setting Find what to be ^.*STNO:\s*(\d+).*$ and Replace with to \1. Then click Replace all.
The above assumes that there is only one number to be found per line.
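If working outside the editor is acceptable, the same capture-group idea can be sketched in Python. This is only an illustration, not part of the Notepad++ steps; the names sample and extract_stno are made up for the example, and the sample text is a shortened copy of the lines in the question.

```python
import re

# Shortened copy of the input shown in the question.
sample = """LINE: 5190 STNO: 22669 SI: VOICE
CCT LINE STNO SI BUS TYPE
003 6269 OPTI ONLY
LINE: 5291 STNO: 29956 SI: VOICE
"""

def extract_stno(text):
    """Return only the digits that follow 'STNO:' (the group, not the label)."""
    return re.findall(r"STNO:\s*(\d+)", text)

if __name__ == "__main__":
    # One number per output line, ready to be saved to a new file.
    print("\n".join(extract_stno(sample)))
```

The header line containing STNO without a colon is skipped, because the pattern requires the colon.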
=========================
As only the numbers are wanted, another method would be to put line breaks plus a marker around the wanted numbers, then delete any lines without the marker, finally delete the markers.
Let the marker be keep. First make sure the marker does not already occur in the text: do a search and replace setting Find what to be keep and Replace with to a single space character, make sure Match case is not selected; then click Replace all. Next, do a regular expression search and replace setting Find what to be STNO:\s*(\d+) and Replace with to \r\nkeep\1\r\n. You might want Match case ticked; then click Replace all. Next, mark lines (as described above) with Find what set to keep, followed by Menu => Search => Bookmark => Remove unmarked lines. Finally, do a search and replace setting Find what to be keep and Replace with to be empty.
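The marker method can be traced step by step in a short Python sketch. It assumes the marker word keep from the description; the function name numbers_via_marker and the sample text are invented for illustration, and the code only mirrors the editor steps rather than replacing them.

```python
import re

MARKER = "keep"  # marker word from the description; any otherwise unused word works

def numbers_via_marker(text):
    # Step 1: blank out existing occurrences of the marker (case-insensitive).
    text = re.sub(MARKER, " ", text, flags=re.IGNORECASE)
    # Step 2: put each wanted number on its own line, prefixed with the marker.
    text = re.sub(r"STNO:\s*(\d+)",
                  lambda m: "\n" + MARKER + m.group(1) + "\n",
                  text)
    # Step 3: drop every line without the marker ("Remove unmarked lines").
    kept = [line for line in text.splitlines() if MARKER in line]
    # Step 4: delete the marker itself, leaving just the numbers.
    return [line.replace(MARKER, "") for line in kept]

# Invented sample; the second line contains the marker word on purpose.
sample = ("LINE: 5190 STNO: 22669 SI: VOICE\n"
          "KEEP OUT 003 6269\n"
          "LINE: 5291 STNO: 29956 SI: VOICE\n")

if __name__ == "__main__":
    print(numbers_via_marker(sample))
```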

Related

Mass regex search-and-replace BETWEEN patterns

I have a directory with a bunch of text files, all of which follow this structure:
...
- Some random number of list items of random text
- And even more of it
PATTERN_A (surrounded by empty lines)
- Again, some list items of random text
- Which does look similar as the first batch
PATTERN_B (surrounded by empty lines)
- And even more some random text
....
And I need to run a replace operation (let's say, I need to prepend CCC at the beginning of the line, just after the dash) on only those "list items", which are between PATTERN_A and PATTERN_B. The problem is they aren't really much different from the text above PATTERN_A, or below PATTERN_B, so an ordinary regex can't really catch them without also affecting the remaining text.
So, my question would be, what tool and what regex should I use to perform that replacement?
(Just in case, I'm fine with Vim, and I can collect those files in a QuickFix for a further :cdo, for example. I'm not that good with awk, unfortunately, and absolutely bad with Perl :))
Thanks!
If I have understood your question, you can do this quite easily with a pattern-range address and the general substitution form in sed (stream editor). For example, in your case:
$ sed '/PATTERN_A/,/PATTERN_B/s/^\([ ]*-\)/\1CCC/' file
- Some random number of list items of random text
- And even more of it
PATTERN_A (surrounded by empty lines)
-CCC Again, some list items of random text
-CCC Which does look similar as the first batch
PATTERN_B (surrounded by empty lines)
- And even more some random text
(note: to substitute in place within the file add the -i option, and to create a backup of the original add -i.bak which will save the original file as file.bak)
Explanation
/PATTERN_A/,/PATTERN_B/ - select lines between PATTERN_A and PATTERN_B
s/^\([ ]*-\)/\1CCC/ - substitute (general form 's/find/replace/') where find is from beginning of line ^ capturing text between \(...\) that contains [ ]*- (any number of spaces and a hyphen) and then replace with \1 (called a backreference that contains all characters you captured with the capture group \(...\)) and appending CCC to its end.
Look things over and let me know if you have questions or if I misinterpreted your question.
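For anyone who wants to see the range logic spelled out, here is a rough Python equivalent of the sed command. It is a sketch under simplifying assumptions: the range boundaries are found by plain substring tests rather than regex matches, and prepend_in_range and its parameters are hypothetical names.

```python
import re

def prepend_in_range(lines, start="PATTERN_A", end="PATTERN_B", tag="CCC"):
    """Edit only the lines from the start marker through the end marker."""
    def edit(line):
        # Same substitution as the sed command: keep leading spaces and dash, add tag.
        return re.sub(r"^( *-)", r"\g<1>" + tag, line)

    out, inside = [], False
    for line in lines:
        if inside:
            out.append(edit(line))
            if end in line:        # the end marker closes the range (inclusive)
                inside = False
        elif start in line:
            inside = True          # the start marker opens the range
            out.append(edit(line))
        else:
            out.append(line)       # outside the range: leave untouched
    return out

sample = ["- a", "PATTERN_A", "- b", "- c", "PATTERN_B", "- d"]

if __name__ == "__main__":
    print("\n".join(prepend_in_range(sample)))
```

As in sed, the end marker is only looked for on lines after the one that opened the range.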
You can get the same result with Perl:
> perl -pe ' { s/^(\s*-)/\1CCC/g if /PATTERN_A/../PATTERN_B/ } ' mass_replace.txt
...
- Some random number of list items of random text
- And even more of it
PATTERN_A (surrounded by empty lines)
-CCC Again, some list items of random text
-CCC Which does look similar as the first batch
PATTERN_B (surrounded by empty lines)
- And even more some random text
....
>

Format a text file by regex match and replace

I have a text file that looks like the following:
Chanelle
Jettie
Winnie
Jen
Shella
Krysta
Tish
Monika
Lynwood
Danae
2649
2466
2890
2224
2829
2427
2816
2648
2833
2453
I need to make it look like this
Chanelle 2649
Jettie 2466
... ...
I tried a lot on sublime editor but couldn't figure out the regex to do that. Can somebody demonstrate if it can be done.
I tested the following in Notepad++ but it should work universally.
Use this as the search string:
(?:(\s*[A-Za-z]+)(\r?\n))((?:\s*[A-Za-z]*\r?\n)+)\s*(\d+)
and this as the replacement:
$1 $4$2$3
Running the replace once handles one line at a time. If you run it repeatedly, it keeps pairing lines until no matching lines are left.
Alternatively, you can use this as the replacement if you want to have the values aligned by tabs, but it's not going to match in all cases:
$1\t\t$4$2$3
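If a script is an option, the pairing can also be done without any regex. The following Python sketch assumes the file holds all names first and then the same number of numeric lines; pair_columns is a made-up name. Note that zip stops silently at the shorter list, so mismatched counts are not reported.

```python
def pair_columns(text):
    """Pair the n-th name with the n-th number, one pair per output line."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    names = [ln for ln in lines if not ln.isdigit()]
    numbers = [ln for ln in lines if ln.isdigit()]
    # zip() truncates to the shorter list if the counts differ.
    return [f"{name} {number}" for name, number in zip(names, numbers)]

sample = "Chanelle\nJettie\nWinnie\n2649\n2466\n2890\n"

if __name__ == "__main__":
    print("\n".join(pair_columns(sample)))
```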
While the regex answer by SeinopSys will work, you don't need a regex to do this - instead, you can take advantage of Sublime's multiple cursors.
1. Place your cursor at the beginning of line 1, then hold down Shift+↓ to select all the names.
2. Hit Ctrl+Shift+L (Selection -> Split into Lines) to split the selection into lines.
3. Ctrl+C to copy.
4. Place your cursor on line 11 (the first number line) and press Ctrl+Alt+↓ (Windows), Ctrl+Shift+↓ (OS X) or Alt+Shift+↓ (Linux) to place a cursor at the beginning of each number line.
5. Hit Ctrl+V to paste the names before the numbers.
You can now delete the names at the top and you're all set. Alternatively, you could use Ctrl+X to cut the names in step 3.

Enumerate existing text in Vim (make numbered list out of existing text)

I have a source document with the following text
Here is a bunch of text
...
Collect underpants
???
Profit!
...
More text
I would like to visually select the middle three lines and insert numbers in front of them:
Here is a bunch of text
...
1. Collect underpants
2. ???
3. Profit!
...
More text
All the solutions I found either put the numbers on their own new lines or prepended the actual line of the file.
How can I prepend a range of numbers to existing lines, starting with 1?
It makes for a good macro.
Add the first number to your line, and put your cursor back at the beginning.
Start a macro with qq (or q<any letter>)
Copy the number with yf<space> (yank find )
Move down a line with j
Paste your yank with P
Move back to the beginning of the line with 0
Increment the number with Ctrl-a
Back to the beginning again with 0 (incrementing positions you at the end of the number)
End the macro by typing q again
Play the macro with @q (or @<the letter you picked>)
Replay the macro as many times as you want with <number>@@ (@@ replays the last macro)
Profit!
To summarize the fun way, the full key sequence is i1. <Esc>0qqyf jP0^a0q10@q, where ^a stands for Ctrl-a.
To apply enumeration for all lines:
:let i=1 | g/^/s//\=i.'. '/ | let i=i+1
To enumerate only selected lines:
:let i=1 | '<,'>g/^/s//\=i.'. '/ | let i=i+1
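The counter-based command does the same thing as this small Python sketch, shown only to make the logic explicit. The function enumerate_lines and its 1-based first/last arguments are hypothetical stand-ins for the visual selection.

```python
def enumerate_lines(lines, first, last):
    """Prefix lines first..last (1-based, inclusive) with '1. ', '2. ', ..."""
    out = list(lines)
    for n, idx in enumerate(range(first - 1, last), start=1):
        out[idx] = f"{n}. {out[idx]}"   # same effect as s//\=i.'. '/ with i counting up
    return out

sample = ["Here is a bunch of text", "...", "Collect underpants",
          "???", "Profit!", "...", "More text"]

if __name__ == "__main__":
    print("\n".join(enumerate_lines(sample, 3, 5)))
```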
Set a non-recursive mapping with the following command, then type ,enum in Normal mode while the cursor is inside the lines you want to enumerate.
:nn ,enum {j<C-v>}kI0. <Esc>vipg<C-a>
TL;DR
You can type :help CTRL-A to see the answer to your question.
{Visual}g CTRL-A Add [count] to the number or alphabetic character in
the highlighted text. If several lines are
highlighted, each one will be incremented by an
additional [count] (so effectively creating a
[count] incrementing sequence).
For Example, if you have this list of numbers:
1.
1.
1.
1.
Move to the second "1." and Visually select three
lines, pressing g CTRL-A results in:
1.
2.
3.
4.
If you have a paragraph (:help paragraph) you can select it (look at :help object-select). Suppose each new line in the paragraph needs to be enumerated.
{ jump to the beginning of current paragraph
j skip blank line, move one line down
<C-v> emulates Ctrl-v, turns on Visual block mode
} jump to the end of current paragraph
k skip blank line, move one line up
required region selected, we can make multi row edit:
I go into Insert mode and place cursor in the beginning of each line
0. is added in the beginning of each line
<Esc> to change mode back to Normal
You should get list prepended with zeros. If you already have such, you can omit this part.
vip select inner paragraph (list prepended with "0. ")
g<C-a> does the magic
I have found it easier to prepend zeroes to every line than to leave the first line of the list out of the selection, as the documentation suggests.
Note: personally I have no mappings. It is easier to remember what g<C-a> does and use it directly. The answer above describes using plain <C-a>, which requires you to count manually; g<C-a>, on the other hand, can increment numbers by a given value (a step) and keeps its own internal counter.
Create a map for @DmitrySandalov's solution:
vnoremap <silent> <Leader>n :<C-U>let i=1 \| '<,'>g/^/s//\=i.'. '/ \| let i=i+1 \| nohl<CR>

Sublime Text 2: Regex to remove block containing certain characters

My raw text file is like
>
item1
>
item{2}
>
item3
>
item4}
I would like to remove/match all items containing { or }. In the example above, it would be removing item 2 and 4. The result would be:
>
item1
>
item3
I want to do a non-greedy match so it matches the minimum block. It also has to match over multiple lines, like:
(?s)>(.+?)[\{|\}](.+?)>
But it's not working properly for me.
This regex does exactly what you asked for. It assumes that exact type of input, with nothing else on the line above the one which contains { or }.
>\n.*?[{}]+.*?$
If that line could have text on it, the following one works.
>.*\n.*?[{}]+.*?$
Both these replaces will leave a blank line. To avoid this, add \n either in front or the back of the regex, depending on what fits your document.
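To check the idea outside Sublime, here is a Python sketch of the same removal. It assumes each block is a line holding only > followed by exactly one item line, as in the sample; drop_brace_items is an invented name. The pattern also consumes the trailing newline, so no blank line is left behind.

```python
import re

# A '>' line, then one item line containing '{' or '}', plus its line break.
BLOCK_WITH_BRACE = re.compile(r"^>\n[^\n]*[{}][^\n]*\n?", flags=re.MULTILINE)

def drop_brace_items(text):
    """Remove every two-line block whose item line contains a brace."""
    return BLOCK_WITH_BRACE.sub("", text)

sample = ">\nitem1\n>\nitem{2}\n>\nitem3\n>\nitem4}\n"

if __name__ == "__main__":
    print(drop_brace_items(sample), end="")
```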
Try this one:
\n>\s(.*)(\{|\})

Vim: Delete the text matching a pattern IF submatch(1) is empty

This command line parses a contact list document that may or may not have either a phone, email or web listed. If it has all three then everything works great - appending the return from the FormatContact() at the end of the line for data uploading:
silent!/^\d/+1|ki|/\n^\d\|\%$/-1|kj|'i,'jd|let @a = substitute(@",'\s*Phone: \([^,]*\)\_.*','\1',"")|let @b = substitute(@",'^\_.*E-mail:\s\[\d*\]\([-_@.0-9a-zA-Z]*\)\_.*','\1',"")|let @c = substitute(@",'^\_.*Web site:\s*\[\d*\]\([-_.:/0-9a-zA-Z]*\)\_.*','\1',"")|?^\d\+?s/$/\=','.FormatContact(@a,@b,@c)
or, broken down:
silent!/^\d/+1|ki|/\n^\d\|\%$/-1|kj|'i,'jd
let @a = substitute(@",'\s*Phone: \([^,]*\)\_.*','\1',"")
let @b = substitute(@",'^\_.*E-mail:\s\[\d*\]\([-_@.0-9a-zA-Z]*\)\_.*','\1',"")
let @c = substitute(@",'^\_.*Web site:\s*\[\d*\]\([-_.:/0-9a-zA-Z]*\)\_.*','\1',"")
?^\d\+?s/$/\=','.FormatContact(@a,@b,@c)
I created three separate searches so as not to make any ONE search fail if one atom failed to match because - again - the contact info may or may not exist per contact.
The problem with that solution is that when the pattern does not match, I get the whole @" in @a. Instead, I need it to be empty when the match does not occur. I need each variable represented (phone, email, web) whether it is empty or not.
I see no flags that can be set in the substitute() function that will do this.
Is there a way to return "" if \1 is empty?
Is there a way to create an optional atom so the search query(ies) could still account for an empty match so as to properly record it as empty?
Instead of using substitutions that replace the whole captured text
with its part of interest, one can match only that target part. Unlike
substitution routines, matching ones either locate the text conforming
to the given pattern, or report that there is no such text. Thus,
using the matchstr() function in preference to substitute(), the
parsing code listed in the question can be changed as follows:
let @a = matchstr(@", '\<Phone:\s*\zs[^,]*')
let @b = matchstr(@", '\<E-mail:\s*\[\d*\]\zs[-_@.0-9a-zA-Z]*')
let @c = matchstr(@", '\<Web site:\s*\[\d*\]\zs[-_.:/0-9a-zA-Z]*')
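The difference between the two approaches is not specific to Vim. As a rough analogy only, the Python sketch below contrasts a match-based lookup, which yields an empty string when the field is missing, with a substitution, which returns its input unchanged. The record text and the helper name field are made up for the example.

```python
import re

def field(text, pattern):
    """Return group 1 of the first match, or '' when the pattern is absent."""
    m = re.search(pattern, text)
    return m.group(1) if m else ""

# Made-up contact record with a phone and a web site but no e-mail.
record = "Phone: 555-0100, Web site: [2]example.org"

phone = field(record, r"\bPhone:\s*([^,]*)")
email = field(record, r"\bE-mail:\s*\[\d*\]([-_@.0-9a-zA-Z]*)")

# A substitution leaves the text untouched when nothing matches,
# which is the behaviour the question ran into with substitute().
unchanged = re.sub(r"^.*E-mail:\s*\[\d*\]([-_@.0-9a-zA-Z]*).*$", r"\1",
                   record, flags=re.DOTALL)

if __name__ == "__main__":
    print(repr(phone), repr(email), unchanged == record)
```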
Just in case you want linewise processing, consider using in combination with :global, e.g.
let @a=""
g/text to match/let @A=substitute(getline("."), '^.*\(' . @/ . '\).*$', '\1\r', '')
This collects the matched text from every line that contains it into register a, separated by newlines. To print it:
echo @a
The beautiful thing here is that you can make it work with the last-used search pattern very easily:
g//let @A=substitute(getline("."), '^.*\(' . @/ . '\).*$', '\1\r', '')